annotate README.md @ 49:df3645fb1bc0 draft

Uploaded
author davidvanzessen
date Wed, 10 Jul 2019 03:37:52 -0400
# ARGalaxy Immune Repertoire
This is the GitHub repository for the ARGalaxy Immune repertoire pipeline.
The Galaxy tool version can be found [here](https://toolshed.g2.bx.psu.edu/repository/browse_repositories_i_own?sort=name&operation=view_or_manage_repository&id=2e457d63170a4b1c).

## Overview

In execution order:

#### imgt_loader or igblast

###### imgt_loader (Recommended)
Start the analysis with [IMGT/HighV-QUEST](https://www.imgt.org/HighV-QUEST/) archives.
An IMGT archive file holds [multiple tabular files](http://www.imgt.org/IMGT_vquest/share/textes/imgtvquest.html#output3); this script extracts the columns relevant to the analysis from several of these files.

`Rscript imgt_loader.r 1_Summary.txt 3_Nt-sequences.txt 5_AA-sequences.txt 6_Junction.txt 4_IMGT-gapped-AA-sequences.txt /path/to/output.txt`

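The command above takes five files from the archive followed by the output path. A minimal sketch of assembling that call, assuming the archive has already been unpacked into one directory that contains the numbered files under exactly these names; the helper `imgt_loader_cmd` and the directory paths are hypothetical and not part of the repository:

```shell
# Hypothetical helper (not part of the repository): print the imgt_loader.r
# command for a directory that holds the unpacked IMGT archive.
imgt_loader_cmd() {
    dir=$1   # directory with the unpacked tabular files (assumed layout)
    out=$2   # path of the output file written by imgt_loader.r
    printf 'Rscript imgt_loader.r %s %s %s %s %s %s\n' \
        "$dir/1_Summary.txt" \
        "$dir/3_Nt-sequences.txt" \
        "$dir/5_AA-sequences.txt" \
        "$dir/6_Junction.txt" \
        "$dir/4_IMGT-gapped-AA-sequences.txt" \
        "$out"
}

# Unpacking step, assuming the archive is a tar file (left commented out):
# tar -xf /path/to/archive.txz -C /path/to/unpacked_archive

# Dry run: print the command rather than executing it.
imgt_loader_cmd /path/to/unpacked_archive /path/to/output.txt
```

The file order matches the command shown above; the helper only prints the call, so it can be inspected before running it for real.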

###### igblast
Start the analysis with FASTA files that are aligned with [igblast](https://www.ncbi.nlm.nih.gov/igblast/).
Note that this method will provide less information than the IMGT archive.

`sh igblast.sh /path/to/sequences.fasta species locus /path/to/output.txt`

#### experimental_design
This script merges multiple result files (from the previous step) into a single file with additional ID and Replicate columns, which distinguish the individual samples during the analysis and allow analysis across samples.

`Rscript experimental_design.r /path/to/input_1 id_1 [/path/to/input_2 id_2] [/path/to/input_n id_n] /path/to/output`

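Because the inputs are passed as alternating path and id arguments, a missing id shifts every later argument. A small sketch that checks the pairing before building the call; the helper `experimental_design_cmd` is hypothetical and only prints the command:

```shell
# Hypothetical helper (not part of the repository): print the
# experimental_design.r command for a list of "input id" pairs.
# Usage: experimental_design_cmd OUTPUT INPUT_1 ID_1 [INPUT_N ID_N ...]
experimental_design_cmd() {
    out=$1
    shift
    # Every sample needs a path and an id, so the remaining count
    # must be even and non-zero.
    if [ "$#" -eq 0 ] || [ $(( $# % 2 )) -ne 0 ]; then
        echo "usage: experimental_design_cmd OUTPUT INPUT_1 ID_1 [INPUT_N ID_N ...]" >&2
        return 1
    fi
    printf 'Rscript experimental_design.r'
    printf ' %s' "$@"      # the format is reused for each remaining argument
    printf ' %s\n' "$out"  # the output path goes last, as in the usage line
}

# Dry run with two samples.
experimental_design_cmd /path/to/output /path/to/input_1 id_1 /path/to/input_2 id_2
```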
#### report_clonality
The R script that creates the analysis result.

`sh r_wrapper.sh /path/to/experimental_design/output.txt /path/to/output_dir/output.html /path/to/output_dir "clonaltype" "species" "locus" "filter_productive" "clonality_method"`

###### parameters
Clonaltype:
- none
- Top.V.Gene,CDR3.Seq
- Top.V.Gene,CDR3.Seq.DNA
- Top.V.Gene,Top.J.Gene,CDR3.Seq
- Top.V.Gene,Top.J.Gene,CDR3.Seq.DNA
- Top.V.Gene,Top.D.Gene,Top.J.Gene,CDR3.Seq.DNA

Species:
- Homo sapiens functional
- Homo sapiens
- Homo sapiens non-functional
- Bos taurus
- Bos taurus functional
- Bos taurus non-functional
- Camelus dromedarius
- Camelus dromedarius functional
- Camelus dromedarius non-functional
- Canis lupus familiaris
- Canis lupus familiaris functional
- Canis lupus familiaris non-functional
- Danio rerio
- Danio rerio functional
- Danio rerio non-functional
- Macaca mulatta
- Macaca mulatta functional
- Macaca mulatta non-functional
- Mus musculus
- Mus musculus functional
- Mus musculus non-functional
- Mus spretus
- Mus spretus functional
- Mus spretus non-functional
- Oncorhynchus mykiss
- Oncorhynchus mykiss functional
- Oncorhynchus mykiss non-functional
- Ornithorhynchus anatinus
- Ornithorhynchus anatinus functional
- Ornithorhynchus anatinus non-functional
- Oryctolagus cuniculus
- Oryctolagus cuniculus functional
- Oryctolagus cuniculus non-functional
- Rattus norvegicus
- Rattus norvegicus functional
- Rattus norvegicus non-functional
- Sus scrofa
- Sus scrofa functional
- Sus scrofa non-functional

Locus:
- TRA
- TRD
- TRG
- TRB
- IGH
- IGI
- IGK
- IGL

Filter productive:
- yes
- no

Clonality Method:
- none
- old
- boyd

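The shorter option lists above can be checked before a run starts. The sketch below validates the locus, filter and clonality method values; it assumes the scripts expect the values spelled exactly as listed, and the function names are hypothetical:

```shell
# Hypothetical validators (not part of the repository). Each returns 0
# when the value appears in the corresponding list above, 1 otherwise.
valid_locus() {
    case $1 in
        TRA|TRD|TRG|TRB|IGH|IGI|IGK|IGL) return 0 ;;
        *) return 1 ;;
    esac
}

valid_filter_productive() {
    case $1 in
        yes|no) return 0 ;;
        *) return 1 ;;
    esac
}

valid_clonality_method() {
    case $1 in
        none|old|boyd) return 0 ;;
        *) return 1 ;;
    esac
}

# Check all three values and report the first one that is not listed.
check_params() {
    valid_locus "$1" || { echo "unknown locus: $1" >&2; return 1; }
    valid_filter_productive "$2" || { echo "unknown filter_productive: $2" >&2; return 1; }
    valid_clonality_method "$3" || { echo "unknown clonality_method: $3" >&2; return 1; }
}

check_params IGH yes boyd && echo "parameters ok"
```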
## complete.sh
This script runs all of the steps above. It detects whether the inputs are FASTA files or IMGT archives and uses the appropriate tools.

`sh complete.sh /path/to/input_1 id_1 [/path/to/input_n id_n] /path/to/out_dir/out.html /path/to/out_dir clonaltype species locus filter_productive clonality_method`
See "report_clonality" for the parameter options.

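As an illustration of the argument order, here is a sketch of a call for two samples. The parameter values are picked from the option lists under "report_clonality", the paths are placeholders, and the helper `complete_cmd` is hypothetical; it prints the call with every argument quoted instead of running it:

```shell
# Example values chosen from the option lists; adjust them to the data.
clonaltype="Top.V.Gene,CDR3.Seq"
species="Homo sapiens functional"
locus="IGH"
filter_productive="yes"
clonality_method="none"
out_dir=/path/to/out_dir

# Hypothetical helper (not part of the repository): print a complete.sh
# call with each argument in single quotes, so values that contain
# spaces are visibly kept together.
complete_cmd() {
    printf 'sh complete.sh'
    printf " '%s'" "$@"
    printf '\n'
}

# Dry run for two samples.
complete_cmd /path/to/input_1 id_1 /path/to/input_2 id_2 \
    "$out_dir/out.html" "$out_dir" \
    "$clonaltype" "$species" "$locus" "$filter_productive" "$clonality_method"
```

Species names contain spaces, so they need quoting when passed on the command line; otherwise the shell splits them into several arguments.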
## Dependencies
- Linux
- R
    - gridExtra
    - ggplot2
    - plyr
    - data.table
    - reshape2
    - lymphclon

#### optional
- Circos
- IgBlast
- igblastwrp
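
A quick way to see which of these dependencies are present, as a sketch. It assumes the executables are named `Rscript`, `circos` and `igblastn` on the local system, which may differ per installation; the helper `missing_commands` is hypothetical:

```shell
# Hypothetical helper (not part of the repository): print every command
# from the argument list that cannot be found on PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# Executable names are assumptions; circos and igblastn are optional.
missing_commands Rscript circos igblastn

# If R is available, list the required R packages that are not installed.
if command -v Rscript >/dev/null 2>&1; then
    Rscript -e 'pkgs <- c("gridExtra", "ggplot2", "plyr", "data.table", "reshape2", "lymphclon")' \
            -e 'missing <- pkgs[!vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)]' \
            -e 'if (length(missing) > 0) cat("missing R packages:", missing, "\n")'
fi
```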