view tools/mytools/fastashuffle2.xml @ 1:cdcb0ce84a1b

author xuebing
date Fri, 09 Mar 2012 19:45:15 -0500
parents 9071e359b9a3

<tool id="seqshuffle2" name="shuffle sequence">
  <description>preserving dinucleotide frequency</description>
  <command interpreter="python"> -f $input -t $tag -c $n -s $seed > $output </command>
  <inputs>
    <param name="input" format="fasta" type="data" label="Original sequence file"/>
    <param name="tag" type="text" size="40" value="-shuffled" label="tag added to shuffled sequence name"/>
    <param name="n" type="integer" value="1" label="number of shuffled copies for each sequence"/>
    <param name="seed" type="integer" value="1" label="random seed" help="the same seed gives the same random sequences"/>
  </inputs>
  <outputs>
    <data format="fasta" name="output" />
  </outputs>

**What it does**

This tool shuffles each sequence in the input file while preserving that sequence's dinucleotide frequencies.

The code implements the Altschul-Erickson dinucleotide shuffle algorithm, described in "Significance of nucleotide sequence alignments: A method for random sequence permutation that preserves dinucleotide and codon usage", S.F. Altschul and B.W. Erickson, Mol. Biol. Evol., 2(6):526-538, 1985.
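As a rough illustration of the idea, the following is a minimal Python 3 sketch of a dinucleotide-preserving shuffle in the style of Altschul and Erickson. It is not the script this tool runs: the function names (`dinucleotide_shuffle`, `dinucleotide_counts`) and the example sequence are hypothetical, and FASTA parsing, the name tag, and multiple copies per sequence are left out. The sketch treats the sequence as a walk over a graph whose edges are its dinucleotides, picks a random "final" outgoing edge for every nucleotide except the last one until those edges all lead to the last nucleotide, shuffles the remaining edges, and then replays the walk.

```python
import random
from collections import Counter


def _reaches(start, root, final_edge):
    """Follow the chosen final edges from start; True if the walk hits root."""
    seen = set()
    node = start
    while node != root:
        if node in seen:  # ran into a cycle that never reaches root
            return False
        seen.add(node)
        node = final_edge[node]
    return True


def dinucleotide_shuffle(seq, rng):
    """Return a shuffled copy of seq with the same dinucleotide counts.

    Illustrative sketch only (hypothetical helper, not the tool's script).
    rng is a random.Random instance, so a fixed seed gives a fixed result.
    """
    if len(seq) < 3:
        return seq
    last = seq[-1]

    # successors[x] lists every character that directly follows x in seq.
    successors = {}
    for a, b in zip(seq, seq[1:]):
        successors.setdefault(a, []).append(b)

    # Choose one final edge per character (except the last one) until
    # following those edges from any character leads to the last character.
    while True:
        final_edge = {x: rng.choice(s) for x, s in successors.items() if x != last}
        if all(_reaches(x, last, final_edge) for x in final_edge):
            break

    # Shuffle each successor list, keeping the chosen final edge at the end.
    for x, succ in successors.items():
        keep_last = final_edge.get(x)
        if keep_last is not None:
            succ.remove(keep_last)
        rng.shuffle(succ)
        if keep_last is not None:
            succ.append(keep_last)

    # Replay the walk from the first character, consuming edges in order.
    out = [seq[0]]
    used = Counter()
    cur = seq[0]
    while used[cur] < len(successors.get(cur, ())):
        nxt = successors[cur][used[cur]]
        used[cur] += 1
        out.append(nxt)
        cur = nxt
    return "".join(out)


def dinucleotide_counts(seq):
    """Count each adjacent pair of characters in seq."""
    return Counter(zip(seq, seq[1:]))


original = "ACGTACGGTTAACC"  # made-up example sequence
shuffled = dinucleotide_shuffle(original, random.Random(1))
print(shuffled)
print(dinucleotide_counts(shuffled) == dinucleotide_counts(original))
```

Passing a seeded `random.Random` mirrors the tool's seed parameter: the same seed and input give the same shuffled sequence. The first and last characters of the sequence are always preserved, since the walk starts and ends where the original does.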

Code adapted from