comparison README.md @ 46:4d2a8f98a502 draft

Uploaded
author davidvanzessen
date Mon, 28 Jan 2019 09:37:41 -0500
parents
children e35b82f31ec7
45:942eea8359fe 46:4d2a8f98a502
1 # ARGalaxy Immune Repertoire
2 This is the GitHub repository for the ARGalaxy Immune Repertoire pipeline.
3 The Galaxy tool version can be found [here](https://toolshed.g2.bx.psu.edu/repository/browse_repositories_i_own?sort=name&operation=view_or_manage_repository&id=2e457d63170a4b1c).
4
5 ## Overview
6
7 In execution order:
8
9 #### imgt_loader or igblast
10
11 ###### imgt_loader (Recommended)
12 Start the analysis with [IMGT/HighV-QUEST](https://www.imgt.org/HighV-QUEST/) archives.
13 An IMGT archive file holds [multiple tabular files](http://www.imgt.org/IMGT_vquest/share/textes/imgtvquest.html#output3); this script extracts the columns relevant to the analysis from several of these files.
14
15 `Rscript imgt_loader.r 1_Summary.txt 3_Nt-sequences.txt 5_AA-sequences.txt 6_Junction.txt 4_IMGT-gapped-AA-sequences.txt /path/to/output.txt`
16
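As a rough sketch of how the five input files might be checked before calling the loader, the helper below assumes the IMGT archive has already been extracted into a directory and that the extracted files carry the names used in the usage line above. `imgt_loader_cmd` is a hypothetical name, not part of this repository. It only prints the command, and it does not handle paths that contain spaces.

```shell
# Hypothetical helper (not part of the pipeline): print the imgt_loader.r
# command for an extracted IMGT archive directory, or fail with a message
# if one of the expected files is missing.
imgt_loader_cmd() {
    dir=$1
    out=$2
    cmd="Rscript imgt_loader.r"
    # The file order follows the usage line above.
    for f in 1_Summary.txt 3_Nt-sequences.txt 5_AA-sequences.txt \
             6_Junction.txt 4_IMGT-gapped-AA-sequences.txt; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f" >&2
            return 1
        fi
        cmd="$cmd $dir/$f"
    done
    echo "$cmd $out"
}
```

Calling `imgt_loader_cmd /path/to/extracted /path/to/output.txt` prints the full command, which can be inspected before it is run.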
17
18 ###### igblast
19 Start the analysis with FASTA files, which this script aligns with [IgBLAST](https://www.ncbi.nlm.nih.gov/igblast/).
20 Note that this method provides less information than an IMGT archive.
21
22 `sh igblast.sh /path/to/sequences.fasta species locus /path/to/output.txt`
23
24 #### experimental_design
25 This script merges multiple result files (from the previous step) into a single file and adds ID and Replicate columns, which distinguish the individual samples during the analysis and allow analysis across samples.
26
27 `Rscript experimental_design.r /path/to/input_1 id_1 [/path/to/input_2 id_2] [/path/to/input_n id_n] /path/to/output`
28
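The alternating path/ID arguments can be tedious to type for many samples. The sketch below builds the argument list from a set of input files, assuming (purely for illustration) that each sample ID is the file name without its last extension. `design_cmd` is a hypothetical helper, it only prints the command, and it does not handle paths that contain spaces.

```shell
# Hypothetical helper (not part of the pipeline): print an
# experimental_design.r command for a list of result files.
# Assumption: the sample ID is the file's base name minus its extension.
design_cmd() {
    output=$1
    shift
    cmd="Rscript experimental_design.r"
    for input in "$@"; do
        id=$(basename "$input")
        id=${id%.*}              # strip the last extension
        cmd="$cmd $input $id"    # path followed by its ID, as in the usage line
    done
    echo "$cmd $output"
}

design_cmd merged.txt results/patientA.txt results/patientB.txt
# prints: Rscript experimental_design.r results/patientA.txt patientA results/patientB.txt patientB merged.txt
```

If IDs need to encode more than the file name (for example a replicate number), pass them explicitly instead of deriving them.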
29 #### report_clonality
30 The R script that creates the analysis result.
31
32 `sh r_wrapper.sh /path/to/experimental_design/output.txt /path/to/output_dir/output.html /path/to/output_dir "clonaltype" "species" "locus" "filter_productive" "clonality_method"`
33
34 ###### parameters
35 Clonaltype:
36 - none
37 - Top.V.Gene,CDR3.Seq
38 - Top.V.Gene,CDR3.Seq.DNA
39 - Top.V.Gene,Top.J.Gene,CDR3.Seq
40 - Top.V.Gene,Top.J.Gene,CDR3.Seq.DNA
41 - Top.V.Gene,Top.D.Gene,Top.J.Gene,CDR3.Seq.DNA
42
43 Species:
44 - Homo sapiens functional
45 - Homo sapiens
46 - Homo sapiens non-functional
47 - Bos taurus
48 - Bos taurus functional
49 - Bos taurus non-functional
50 - Camelus dromedarius
51 - Camelus dromedarius functional
52 - Camelus dromedarius non-functional
53 - Canis lupus familiaris
54 - Canis lupus familiaris functional
55 - Canis lupus familiaris non-functional
56 - Danio rerio
57 - Danio rerio functional
58 - Danio rerio non-functional
59 - Macaca mulatta
60 - Macaca mulatta functional
61 - Macaca mulatta non-functional
62 - Mus musculus
63 - Mus musculus functional
64 - Mus musculus non-functional
65 - Mus spretus
66 - Mus spretus functional
67 - Mus spretus non-functional
68 - Oncorhynchus mykiss
69 - Oncorhynchus mykiss functional
70 - Oncorhynchus mykiss non-functional
71 - Ornithorhynchus anatinus
72 - Ornithorhynchus anatinus functional
73 - Ornithorhynchus anatinus non-functional
74 - Oryctolagus cuniculus
75 - Oryctolagus cuniculus functional
76 - Oryctolagus cuniculus non-functional
77 - Rattus norvegicus
78 - Rattus norvegicus functional
79 - Rattus norvegicus non-functional
80 - Sus scrofa
81 - Sus scrofa functional
82 - Sus scrofa non-functional
83
84 Locus:
85 - TRA
86 - TRD
87 - TRG
88 - TRB
89 - IGH
90 - IGI
91 - IGK
92 - IGL
93
94 Filter productive:
95 - yes
96 - no
97
98 Clonality Method:
99 - none
100 - old
101 - boyd
102
103 ## complete.sh
104 This script runs all of the above steps. It detects whether the input is FASTA files or IMGT archives and uses the appropriate tools.
105
106 `sh complete.sh /path/to/input_1 id_1 [/path/to/input_n id_n] /path/to/out_dir/out.html /path/to/out_dir clonaltype species locus filter_productive clonality_method`
107 See "report_clonality" for the parameter options.
108
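Because the trailing parameters are positional, a mistyped value is easy to miss. The following sketch checks three of them against the option lists in "report_clonality" before anything is run. `check_param` and `check_report_params` are hypothetical helpers, and whether the scripts themselves validate these values is not stated here.

```shell
# Hypothetical helpers (not part of the pipeline): validate parameter
# values against the options listed under "report_clonality".
check_param() {
    name=$1
    value=$2
    shift 2
    # The remaining arguments are the allowed values.
    for allowed in "$@"; do
        [ "$value" = "$allowed" ] && return 0
    done
    echo "invalid $name: $value" >&2
    return 1
}

check_report_params() {
    locus=$1
    filter=$2
    method=$3
    check_param locus "$locus" TRA TRD TRG TRB IGH IGI IGK IGL &&
    check_param filter_productive "$filter" yes no &&
    check_param clonality_method "$method" none old boyd
}
```

For example, `check_report_params IGH yes boyd && sh complete.sh ...` would only start the pipeline when all three values are among the documented options.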
109 ## Dependencies
110 - Linux
111 - R
112 - gridExtra
113 - ggplot2
114 - plyr
115 - data.table
116 - reshape2
117 - lymphclon
118
119 #### optional
120 - Circos
121 - IgBlast
122 - igblastwrp