annotate tools/fastq/fastq_stats.xml @ 0:9071e359b9a3

Uploaded
author xuebing
date Fri, 09 Mar 2012 19:37:19 -0500
<tool id="fastq_stats" name="FASTQ Summary Statistics" version="1.0.0">
  <description>by column</description>
  <command interpreter="python">fastq_stats.py '$input_file' '$output_file' '${input_file.extension[len( 'fastq' ):]}'</command>
  <inputs>
    <param name="input_file" type="data" format="fastqsanger,fastqillumina,fastqsolexa,fastqcssanger" label="FASTQ File"/>
  </inputs>
  <outputs>
    <data name="output_file" format="tabular" />
  </outputs>
  <tests>
    <test>
      <param name="input_file" value="fastq_stats1.fastq" ftype="fastqsanger" />
      <output name="output_file" file="fastq_stats_1_out.tabular" />
    </test>
  </tests>
  <help>
This tool computes summary statistics for a FASTQ file, one row per read position (column).

.. class:: infomark

**TIP:** This statistics report can be used as input for the **Boxplot** and **Nucleotides Distribution** tools.

-----

**The output file will contain the following fields:**

* column = Column number (1 to 36 for a 36-cycle Solexa read).
* count = Number of bases found in this column.
* min = Lowest quality score value found in this column.
* max = Highest quality score value found in this column.
* sum = Sum of quality score values for this column.
* mean = Mean quality score value for this column.
* Q1 = 1st quartile quality score.
* med = Median quality score.
* Q3 = 3rd quartile quality score.
* IQR = Interquartile range (Q3 - Q1).
* lW = 'Left-Whisker' value (for boxplotting).
* rW = 'Right-Whisker' value (for boxplotting).
* outliers = Scores falling beyond the left and right whiskers (comma-separated list).
* A_Count = Count of 'A' nucleotides found in this column.
* C_Count = Count of 'C' nucleotides found in this column.
* G_Count = Count of 'G' nucleotides found in this column.
* T_Count = Count of 'T' nucleotides found in this column.
* N_Count = Count of 'N' nucleotides found in this column.
* other_bases = Comma-separated list of other nucleotides found in this column.
* other_base_count = Comma-separated counts of the other nucleotides found in this column.

For example::

  #column count min max sum mean Q1 med Q3 IQR lW rW outliers A_Count C_Count G_Count T_Count N_Count other_bases other_base_count
  1 14336356 2 33 450600675 31.4306281875 32.0 33.0 33.0 1.0 31 33 2,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 4482314 2199633 4425957 3208745 19707
  2 14336356 2 34 441135033 30.7703737965 30.0 33.0 33.0 3.0 26 34 2,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25 4419184 2170537 4627987 3118567 81
  3 14336356 2 34 433659182 30.2489127642 29.0 32.0 33.0 4.0 23 34 2,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22 4310988 2941988 3437467 3645784 129
  4 14336356 2 34 433635331 30.2472490917 29.0 32.0 33.0 4.0 23 34 2,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22 4110637 3007028 3671749 3546839 103
  5 14336356 2 34 432498583 30.167957813 29.0 32.0 33.0 4.0 23 34 2,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22 4348275 2935903 3293025 3759029 124

-----

.. class:: warningmark

Adapter bases in color space reads are excluded from statistics.

------

**Citation**

If you use this tool, please cite `Blankenberg D, Gordon A, Von Kuster G, Coraor N, Taylor J, Nekrutenko A; Galaxy Team. Manipulation of FASTQ data with Galaxy. Bioinformatics. 2010 Jul 15;26(14):1783-5. &lt;http://www.ncbi.nlm.nih.gov/pubmed/20562416&gt;`_


  </help>
</tool>
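
The fields listed in the tool's help can be illustrated with a short Python sketch. This is not the code of fastq_stats.py, which is not shown on this page. The function names (percentile, column_stats), the linear-interpolation quartile method, and the whisker rule (the most extreme observed scores within 1.5 * IQR of the quartiles) are all assumptions made for illustration. The whisker rule is at least consistent with the sample rows in the help, for example Q1 = 30, IQR = 3 and lW = 26 in column 2. The sketch takes already-decoded integer quality scores and skips FASTQ parsing and the other_bases fields.

```python
from collections import Counter


def percentile(sorted_vals, fraction):
    """Linearly interpolated percentile of a sorted list (assumed method)."""
    pos = (len(sorted_vals) - 1) * fraction
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def column_stats(reads):
    """Summarize (sequence, quality_scores) pairs, one dict per read position."""
    quals, bases = {}, {}
    for seq, scores in reads:
        for col, (base, score) in enumerate(zip(seq, scores), start=1):
            quals.setdefault(col, []).append(score)
            bases.setdefault(col, Counter())[base.upper()] += 1

    rows = []
    for col in sorted(quals):
        vals = sorted(quals[col])
        q1, med, q3 = (percentile(vals, f) for f in (0.25, 0.5, 0.75))
        iqr = q3 - q1
        # Assumed Tukey-style whiskers: the most extreme observed scores
        # that lie within 1.5 * IQR of the quartiles.
        lw = min(v for v in vals if v >= q1 - 1.5 * iqr)
        rw = max(v for v in vals if v <= q3 + 1.5 * iqr)
        row = {
            "column": col, "count": len(vals), "min": vals[0], "max": vals[-1],
            "sum": sum(vals), "mean": sum(vals) / len(vals),
            "Q1": q1, "med": med, "Q3": q3, "IQR": iqr, "lW": lw, "rW": rw,
            # Distinct scores beyond either whisker, as in the outliers field.
            "outliers": sorted({v for v in vals if v < lw or v > rw}),
        }
        for nuc in "ACGTN":
            row[nuc + "_Count"] = bases[col][nuc]
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Made-up single-base reads, used only to exercise the function.
    demo = [("A", [10]), ("C", [30]), ("G", [31]), ("T", [32]), ("N", [33])]
    for row in column_stats(demo):
        print(row)
```

Keeping one list of scores per column is simple but memory-hungry. For a file the size of the example (over 14 million reads), a per-column histogram of score counts would give the same quartiles and whiskers with far less memory.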