view mytools/fastashuffle2.xml @ 3:6449b4a15b88

Uploaded
author xuebing
date Fri, 16 Mar 2012 14:00:03 -0400
parents 39217fa39ff2

<tool id="seqshuffle2" name="shuffle sequence">
  <description>preserving dinucleotide frequency</description>
  <command interpreter="python">fasta-dinucleotide-shuffle.py -f $input -t $tag -c $n -s $seed > $output </command>
  <inputs>
    <param name="input" format="fasta" type="data" label="Original sequence file"/>
    <param name="tag" type="text" size="40" value="-shuffled" label="tag added to shuffled sequence name"/>
    <param name="n" type="integer" value="1" label="number of shuffled copies for each sequence"/>
    <param name="seed" type="integer" value="1" label="random seed" help="the same seed gives the same random sequences"/>
  </inputs>
  <outputs>
    <data format="fasta" name="output" />
  </outputs>
  <help>

**What it does**

This tool shuffles each sequence in the input file while preserving that sequence's dinucleotide frequencies, i.e. the counts of every pair of adjacent nucleotides.

The code implements the Altschul-Erickson dinucleotide shuffle algorithm, described in "Significance of nucleotide sequence alignments: A method for random sequence permutation that preserves dinucleotide and codon usage", S.F. Altschul and B.W. Erickson, Mol. Biol. Evol., 2(6):526-538, 1985.

Code adapted from http://bioinformatics.bc.edu/clotelab/RNAdinucleotideShuffle/dinucleotideShuffle.html
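
**Illustrative sketch**

The following is a minimal Python 3 sketch of the general idea behind a dinucleotide-preserving shuffle. It is not the script this tool runs. The function names (dinucleotide_counts, dinucleotide_shuffle) and the use of random.Random for seeding are assumptions made for illustration. The sequence is treated as a walk through a graph whose edges are adjacent nucleotide pairs; a random "last exit" is chosen for each nucleotide so that every nucleotide still leads to the final one, the remaining exits are shuffled, and the walk is replayed.

```python
import random
from collections import Counter


def dinucleotide_counts(seq):
    """Count every pair of adjacent characters in seq."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def _reaches(vertex, target, last_edge):
    """Follow the chosen last exits from vertex; True if they lead to target."""
    seen = set()
    while vertex != target:
        if vertex in seen:
            return False  # cycle that never reaches the final nucleotide
        seen.add(vertex)
        vertex = last_edge[vertex]
    return True


def dinucleotide_shuffle(seq, rng):
    """Return a permutation of seq with identical dinucleotide counts.

    Hypothetical helper for illustration; rng is a random.Random instance.
    """
    if not seq:
        return seq
    last = seq[-1]

    # successors[x] lists, in order, every character that follows x in seq.
    successors = {}
    for a, b in zip(seq, seq[1:]):
        successors.setdefault(a, []).append(b)

    # Pick a last exit for every nucleotide except the final one, retrying
    # until all of them lead to the final nucleotide.
    others = [v for v in successors if v != last]
    while True:
        last_edge = {v: rng.choice(successors[v]) for v in others}
        if all(_reaches(v, last, last_edge) for v in others):
            break

    # Shuffle the remaining exits and put the chosen last exit at the end.
    order = {}
    for v, succ in successors.items():
        rest = list(succ)
        if v != last:
            rest.remove(last_edge[v])
        rng.shuffle(rest)
        if v != last:
            rest.append(last_edge[v])
        order[v] = iter(rest)

    # Replay the walk from the first nucleotide, using each exit once.
    out = [seq[0]]
    for _ in range(len(seq) - 1):
        out.append(next(order[out[-1]]))
    return "".join(out)
```

For example, calling dinucleotide_shuffle("AGTAGCA", random.Random(1)) returns a sequence of the same length that starts and ends with the same nucleotides as the input, and dinucleotide_counts gives the same result for both. Reusing the same seed reproduces the same output, which mirrors the behaviour described for the random seed parameter.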

  </help>
</tool>