comparison README.org @ 0:cf3cea0a3039 draft

Uploaded
author petr-novak
date Thu, 07 Oct 2021 06:07:34 +0000
parents
children e955b40ad3a4
-1:000000000000 0:cf3cea0a3039
1 #+TITLE: RepeatExplorer based Assembly Annotation Pipeline
2
3 * Tools in repository
4 ** Extract Repeat Library from RepeatExplorer Archive
5 (=extract_re_contigs.xml=)
6
7 This tool extracts a library of repeats from a RepeatExplorer2 analysis. The library is provided as a FASTA file.
8
9 ** Format repeat library
10 (=format_repeat_library.xml=)
11
12 This tool appends the classification of each repeat to the repeat library. The repeat type then becomes part of the sequence name, in the format:
13
14 ~>sequence_id#classification_level1/classification_level2/...~, which makes it possible to specify the classification hierarchy.
15 The classification of sequences in the library is taken from =CLUSTER_TABLE.csv= (part of the RepeatExplorer2 output).
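
The naming format above can be illustrated with a short Python sketch. It is not part of the pipeline: the function names and the example header (~contig_1~ with a three-level classification) are hypothetical and only show how such a name splits into an identifier and a list of classification levels.

```python
def parse_repeat_header(header):
    """Split a FASTA header of the form '>id#level1/level2/...'.

    Returns (sequence_id, levels); levels is empty when no '#' is present.
    """
    name = header.lstrip(">").strip()
    seq_id, sep, classification = name.partition("#")
    levels = [lvl for lvl in classification.split("/") if lvl] if sep else []
    return seq_id, levels


def format_repeat_header(seq_id, levels):
    """Build a header in the '>id#level1/level2/...' format."""
    return ">" + seq_id + "#" + "/".join(levels)


# Hypothetical example header, made up for illustration only.
seq_id, levels = parse_repeat_header(">contig_1#Class_I/LTR/Ty1_copia")
print(seq_id, levels)
```

Keeping the levels as a list preserves the hierarchy, so a more general category can be obtained by dropping trailing levels.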
16
17 The formatted library can then be used to annotate repeats in your assembly.
18 ** Repeat Annotation
19 (=repeat_annotate_custom.xml=)
20
21 Internally, annotation is performed with a RepeatMasker search. The RepeatMasker output is parsed to remove duplicated and overlapping annotations. Conflicts between annotations are resolved using the hierarchical classification of repeats provided in the custom database.
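
This README does not state which rule the tool applies when overlapping annotations disagree. As an illustration only, the Python sketch below shows one plausible way a hierarchical classification can settle such a conflict: keep the deepest levels that both labels share. The function name and the example labels are assumptions, not taken from the tool.

```python
def shared_classification(label_a, label_b):
    """Return the longest common leading part of two '/'-separated labels.

    Illustrative rule only; the pipeline's actual conflict handling may differ.
    """
    shared = []
    for a, b in zip(label_a.split("/"), label_b.split("/")):
        if a != b:
            break
        shared.append(a)
    return "/".join(shared)


# Hypothetical labels for two overlapping hits on the same region.
print(shared_classification("Class_I/LTR/Ty1_copia", "Class_I/LTR/Ty3_gypsy"))
```

An empty result means the two labels share no classification level, so this rule alone cannot decide between them.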
22
23 * Test data
24
25 - ~test_assembly_1.fasta~ with ~test_db_1_satellites.fasta~ (includes CLASS followed by a double underscore, syntax 1)
26 - ~test_assembly_2.fasta~ with ~test_db_2_RE_repeats.fasta~ (includes the full hierarchical classification)
27
28
29
30 #+begin_comment
31 create tarball for toolshed:
32 tar -czvf ../repeat_annotation_pipeline.tar.gz --exclude test_data --exclude .git .
33
34 #+end_comment