<!--
# =====================================================
# $Id: FastaStats.xml 90 2011-01-19 13:20:31Z pieter.neerincx@gmail.com $
# $URL: https://trac.nbic.nl/svn/galaxytools/trunk/tools/general/FastaTools/FastaStats.xml $
# $LastChangedDate: 2011-01-19 07:20:31 -0600 (Wed, 19 Jan 2011) $
# $LastChangedRevision: 90 $
# $LastChangedBy: pieter.neerincx@gmail.com $
# =====================================================
-->
<tool id="FastaStats1" name="FastaStats">
  <description>List statistics for sequences in a FASTA file</description>
  <command interpreter="perl">FastaStats.pl $get_positional_composition_stats -i $input -o $output -l WARN</command>
  <inputs>
    <param format="fasta" name="input" type="data" label="FASTA sequences"/>
    <param name="get_positional_composition_stats" type="boolean" truevalue="-p" falsevalue="" optional="true" label="Calculate positional acid frequencies"/>
  </inputs>
  <outputs>
    <data format="txt" name="output" label="FASTA Statistics for ${input.name}"/>
  </outputs>
  <tests>
    <test>
      <param name="input" value="fasta_2_proteins.fasta" ftype="fasta"/>
      <output name="output" file="FastaStats_example_output.txt"/>
    </test>
  </tests>
  <help>

.. class:: infomark

**What it does**

This tool analyzes a collection of sequences in FASTA format and reports:

- The total number of sequences.
- The total number of nucleotides or amino acids.
- The overall frequency of each nucleotide or amino acid.
- The frequency of each nucleotide or amino acid per sequence position (optional).

-----

**Example**

If the FASTA sequence collection contains these two sequences::

    >UniProtKB:Q42593 L-ascorbate peroxidase T, chloroplastic;
    MSVSLSAASHLLCSSTRVSLSPAVTSSSSSPVVALSSSTSPHSLGSVASSSLFPHSSFVL
    QKKHPINGTSTRMISPKCAASDAAQLISAKEDIKVLLRTKFCHPILVRLGWHDAGTYNKN
    IEEWPLRGGANGSLRFEAELKHAANAGLLNALKLIQPLKDKYPNISYADLFQLASATAIE
    EAGGPDIPMKYGRVDVVAPEQCPEEGRLPDAGPPSPADHLRDVFYRMGLDDKEIVALSGA
    HTLGRARPDRSGWGKPETKYTKTGPGEAGGQSWTVKWLKFDNSYFKDIKEKRDDDLLVLP
    TDAALFEDPSFKNYAEKYAEDVAAFFKDYAEAHAKLSNLGAKFDPPEGIVIENVPEKFVA
    AKYSTGKKELSDSMKKKIRAEYEAIGGSPDKPLPTNYFLNIIIAIGVLVLLSTLFGGNNN
    SDFSGF
    >UniProtKB:A0MQ79 Ascorbate peroxidase;
    MVKNYPVVSEEYLIAVDKAKKKLRGFIAEKNCAPLMLRLAWHSAGTFDQCSRTGGPFGTM
    RFKAEQAHSANNGIDIAIRLLEPIKEQFPILSYADFYQLAGVVAVEVTGGPEVPFHPGRP
    DKEEPPVEGRLPDAYKGSDHLRDVFIKQMGLSDQDIVALSGGHTLGRCHKERSGFEGPWT
    ENPLIFDNSYFKELVCGERDGLLQLPSDKALLADPVFHPLVEKYAADEDAFFADYAEAHL
    KLSELGFADA

The reported statistics (without the optional positional acid frequencies) will be::

    Sequences 2
    Acid A 69
    Acid C 8
    Acid D 44
    Acid E 44
    Acid F 33
    Acid G 52
    Acid H 18
    Acid I 30
    Acid K 50
    Acid L 67
    Acid M 9
    Acid N 22
    Acid P 46
    Acid Q 13
    Acid R 26
    Acid S 57
    Acid T 23
    Acid V 37
    Acid W 7
    Acid Y 21
    Total acids 676

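For illustration only, the counting behind such a report can be sketched in Python. This is not the FastaStats.pl implementation: the function names `fasta_stats` and `format_report` are hypothetical, and the single-space report layout is an assumption taken from the example output, which may differ from the tool's real delimiter.

```python
from collections import Counter


def fasta_stats(lines):
    """Count sequences and residues in an iterable of FASTA lines."""
    sequences = 0
    residues = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            sequences += 1  # each header line starts a new record
        else:
            residues.update(line.upper())  # count every residue character
    return sequences, residues


def format_report(sequences, residues):
    """Render the counts in the (assumed) layout of the example report."""
    rows = ["Sequences %d" % sequences]
    for acid in sorted(residues):
        rows.append("Acid %s %d" % (acid, residues[acid]))
    rows.append("Total acids %d" % sum(residues.values()))
    return "\n".join(rows)
```

Calling `fasta_stats` on an open FASTA file handle and passing the result to `format_report` should yield one `Acid` row per residue that occurs, sorted alphabetically, followed by the total.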
  </help>
</tool>