annotate repeat_annotate_custom.xml @ 0:ea6a3059a6af draft

Uploaded
author petr-novak
date Mon, 18 Oct 2021 11:01:20 +0000
parents
children 7f1032da7a0a
<tool id="repeat_annotate" name="RepeatExplorer Based Assembly Annotation" version="0.1.1" python_template_version="3.5">
    <requirements>
        <requirement type="package">repeatmasker</requirement>
        <requirement type="package">bioconductor-rtracklayer</requirement>
    </requirements>
    <command detect_errors="exit_code"><![CDATA[
        RepeatMasker -dir \$(pwd) '$input' -pa 32 -lib '$repeat_library' -xsmall -nolow -no_is -e ncbi -s
        &&
        ls -l * >&2 &&
        cp `basename '$input'`.out '$output2'
        &&
        Rscript '${__tool_directory__}/clean_rm_output.R' '$output2' '$output1'
    ]]></command>
    <inputs>
        <param type="data" name="input" format="fasta" label="Genome/assembly to annotate" />
        <param type="data" name="repeat_library" format="fasta" label="RepeatExplorer-based library of repetitive sequences"
               help="A custom database of repetitive sequences must be provided in FASTA format. Each sequence header should specify the repeat class: &gt;sequence_id#classification_level1/classification_level2/..." />
    </inputs>
    <outputs>
        <data name="output1" format="gff3" label="Repeat Annotation on ${on_string}, cleaned gff"/>
        <data name="output2" format="tabular" label="Raw output from RepeatMasker on ${on_string}" />
    </outputs>
    <help><![CDATA[
This tool uses RepeatMasker to annotate repetitive sequences in a genome assembly using a custom library of repeats created from RepeatExplorer output.
The library of repeats can be created from RepeatExplorer output, from contigs and TAREAN consensus sequences. The FASTA-formatted library must contain headers that give the classification of each repeat as **>sequence_id#classification_level1/classification_level2/...**

Classification in a RepeatExplorer-based library follows predetermined classification levels. Users can, however, specify additional classification levels or completely custom classifications. Conflicts in annotations are resolved based on the classification hierarchy.
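
As an illustration only, headers in such a library could look like the lines below. The sequence identifiers and classification names are hypothetical placeholders, not taken from an actual RepeatExplorer run::

    >repeat_001#Class_I/LTR/Ty1_copia
    >repeat_002#Class_II/Subclass_1/TIR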
    ]]></help>
</tool>