comparison FastaStats.xml @ 0:163892325845 draft default tip

Initial commit.
author galaxyp
date Fri, 10 May 2013 17:15:08 -0400
-1:000000000000 0:163892325845
1 <!--
2 # =====================================================
3 # $Id: FastaStats.xml 90 2011-01-19 13:20:31Z pieter.neerincx@gmail.com $
4 # $URL: https://trac.nbic.nl/svn/galaxytools/trunk/tools/general/FastaTools/FastaStats.xml $
5 # $LastChangedDate: 2011-01-19 07:20:31 -0600 (Wed, 19 Jan 2011) $
6 # $LastChangedRevision: 90 $
7 # $LastChangedBy: pieter.neerincx@gmail.com $
8 # =====================================================
9 -->
10 <tool id="FastaStats1" name="FastaStats">
11 <description>List statistics for sequences in a FASTA file</description>
12 <command interpreter="perl">FastaStats.pl $get_positional_composition_stats -i $input -o $output -l WARN</command>
13 <inputs>
14 <param format="fasta" name="input" type="data" label="FASTA sequences"/>
15 <param name="get_positional_composition_stats" type="boolean" truevalue="-p" falsevalue="" optional="true" label="Calculate positional acid frequencies"/>
16 </inputs>
17 <outputs>
18 <data format="txt" name="output" label="FASTA Statistics for ${input.name}"/>
19 </outputs>
20 <tests>
21 <test>
22 <param name="input" value="fasta_2_proteins.fasta" ftype="fasta"/>
23 <output name="output" file="FastaStats_example_output.txt"/>
24 </test>
25 </tests>
26 <help>
27
28 .. class:: infomark
29
30 **What it does**
31
32 This tool analyzes a collection of sequences in FASTA format and reports:
33
34 - The total number of sequences.
35 - The total number of nucleotides or amino acids.
36 - The overall frequency of each nucleotide or amino acid.
37 - The frequency of each nucleotide or amino acid at each position (optional).
38
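The statistics listed above can be sketched in a few lines of Python. This is only an illustration, not the FastaStats.pl implementation: the function name ``fasta_stats`` and its ``positional`` flag are hypothetical, residues are upper-cased as an assumption, and "positional frequency" is assumed to mean a count per 1-based position within each sequence.

```python
from collections import Counter


def fasta_stats(lines, positional=False):
    """Count sequences and residues in an iterable of FASTA lines.

    Hypothetical helper for illustration; the real tool may differ.
    """
    n_sequences = 0
    totals = Counter()   # residue -> count over all sequences
    by_position = {}     # 1-based position -> Counter of residues
    position = 0
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A header line starts a new sequence and resets the position.
            n_sequences += 1
            position = 0
            continue
        # Assumption: residues are compared case-insensitively.
        for residue in line.upper():
            position += 1
            totals[residue] += 1
            if positional:
                by_position.setdefault(position, Counter())[residue] += 1
    return n_sequences, totals, by_position


example = [">seq1", "MSV", "SL", ">seq2", "MVK"]
n, totals, by_pos = fasta_stats(example, positional=True)
print(n, sum(totals.values()), totals["M"], dict(by_pos[1]))
# prints: 2 8 2 {'M': 2}
```

Sequence lines are concatenated implicitly by carrying the position counter across lines, so a sequence wrapped over several lines is counted as one.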
39 -----
40
41 **Example**
42
43 If the FASTA sequence collection contains these two sequences::
44
45 &gt;UniProtKB:Q42593 L-ascorbate peroxidase T, chloroplastic;
46 MSVSLSAASHLLCSSTRVSLSPAVTSSSSSPVVALSSSTSPHSLGSVASSSLFPHSSFVL
47 QKKHPINGTSTRMISPKCAASDAAQLISAKEDIKVLLRTKFCHPILVRLGWHDAGTYNKN
48 IEEWPLRGGANGSLRFEAELKHAANAGLLNALKLIQPLKDKYPNISYADLFQLASATAIE
49 EAGGPDIPMKYGRVDVVAPEQCPEEGRLPDAGPPSPADHLRDVFYRMGLDDKEIVALSGA
50 HTLGRARPDRSGWGKPETKYTKTGPGEAGGQSWTVKWLKFDNSYFKDIKEKRDDDLLVLP
51 TDAALFEDPSFKNYAEKYAEDVAAFFKDYAEAHAKLSNLGAKFDPPEGIVIENVPEKFVA
52 AKYSTGKKELSDSMKKKIRAEYEAIGGSPDKPLPTNYFLNIIIAIGVLVLLSTLFGGNNN
53 SDFSGF
54 &gt;UniProtKB:A0MQ79 Ascorbate peroxidase;
55 MVKNYPVVSEEYLIAVDKAKKKLRGFIAEKNCAPLMLRLAWHSAGTFDQCSRTGGPFGTM
56 RFKAEQAHSANNGIDIAIRLLEPIKEQFPILSYADFYQLAGVVAVEVTGGPEVPFHPGRP
57 DKEEPPVEGRLPDAYKGSDHLRDVFIKQMGLSDQDIVALSGGHTLGRCHKERSGFEGPWT
58 ENPLIFDNSYFKELVCGERDGLLQLPSDKALLADPVFHPLVEKYAADEDAFFADYAEAHL
59 KLSELGFADA
60
61 The reported statistics (without the optional positional acid frequencies) will be::
62
63 Sequences 2
64 Acid A 69
65 Acid C 8
66 Acid D 44
67 Acid E 44
68 Acid F 33
69 Acid G 52
70 Acid H 18
71 Acid I 30
72 Acid K 50
73 Acid L 67
74 Acid M 9
75 Acid N 22
76 Acid P 46
77 Acid Q 13
78 Acid R 26
79 Acid S 57
80 Acid T 23
81 Acid V 37
82 Acid W 7
83 Acid Y 21
84 Total acids 676
85
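As a rough sketch of how a report in this layout could be produced from residue counts: the function name ``format_report`` is hypothetical, and the single-space separator is an assumption (the actual tool output may use tabs or other spacing).

```python
def format_report(n_sequences, totals):
    """Render counts in the layout of the example report.

    Hypothetical helper; the separator is assumed to be one space.
    """
    lines = ["Sequences %d" % n_sequences]
    # Residues are listed alphabetically, as in the example report.
    for residue in sorted(totals):
        lines.append("Acid %s %d" % (residue, totals[residue]))
    lines.append("Total acids %d" % sum(totals.values()))
    return "\n".join(lines)


print(format_report(2, {"M": 2, "A": 1}))
```

The final line is the sum of the per-residue counts, which matches the example report, where the twenty "Acid" rows add up to 676.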
86 </help>
87 </tool>