# HG changeset patch
# User petr-novak
# Date 1580471723 18000
# Node ID c2c69c6090f06bfcb56d4d835ba5a09801249646
# Parent 99569eccc58386209cc8407b7ea74d049ef81607
Uploaded
diff -r 99569eccc583 -r c2c69c6090f0 ChipSeqRatioDef.xml
--- a/ChipSeqRatioDef.xml Mon Dec 09 04:14:48 2019 -0500
+++ b/ChipSeqRatioDef.xml Fri Jan 31 06:55:23 2020 -0500
@@ -22,12 +22,12 @@
-
-
-
-
-
+
+
+
+
+
**What it does**
-Analysis of NGS sequences from Chromatin Imunoprecipitation. ChiP
-and Input reads are mapped to contigs obtained from graph based
-repetitive sequence clustering(`Novak et al. 2013`__) to enriched repeats. Reads from input
-and ChIP should be ideally short illumina reads with uniform length
-above 80 nt. It is sufficiant to use about 1 milion of reads for both Input and Chip.
+The ChIP-seq Mapper evaluates the enrichment of repetitive sequences in sequencing data from chromatin
+immunoprecipitation experiments, using repeats identified by RepeatExplorer as the reference. The tool
+performs a BLASTN similarity search of the read sequences against the reference,
+and reads producing hits that pass the user-specified similarity threshold are assigned to the
+repeat clusters. Each read is assigned to a single cluster, the one that produced its best similarity
+hit. Following read mapping, the numbers of reads from the
+INPUT and ChIP samples are evaluated, and ChIP/INPUT ratios of the normalized read counts are reported
+for individual clusters.
+ChIP and INPUT reads should be of uniform length, at least 40 nt. The bit score threshold should be
+adjusted to the length of the analyzed reads (a value equal to the read length is recommended as a starting point).
This method was first used in (`Neumann et al. 2012`__) for
-identification of repetitive sequences associated with cetromeric
-region. If you use this method, reference:
+identification of repetitive sequences associated with centromeres:
`PLoS Genet. Epub 2012 Jun 21. Stretching the rules: monocentric chromosomes with multiple centromere domains. Neumann P, Navrátilová A, Schroeder-Reiter E, Koblížková A, Steinbauerová V, Chocholová E, Novák P, Wanner G, Macas J.`__.
-.. __: http://bioinformatics.oxfordjournals.org/content/29/6/792.full
-
.. __: http://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1002777
.. __: http://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1002777
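The read assignment and ratio calculation described in the ChIP-seq Mapper help text can be sketched in Python as follows. This is a minimal illustration, not the tool's actual code: the function names `assign_reads` and `chip_input_ratios` are hypothetical, hits are assumed to be already parsed into `(read_id, cluster_id, bitscore)` tuples, "passing" the threshold is assumed to mean a bit score greater than or equal to it, and normalization is assumed to mean dividing by the total number of reads in each sample (the help text does not say how counts are normalized).

```python
from collections import Counter


def assign_reads(hits, min_bitscore):
    """Count reads per cluster, using only each read's best hit.

    hits: iterable of (read_id, cluster_id, bitscore) tuples
    (assumed input shape, e.g. parsed from tabular BLASTN output).
    """
    best = {}
    for read_id, cluster_id, bitscore in hits:
        if bitscore < min_bitscore:
            continue  # hit does not pass the user-specified threshold
        if read_id not in best or bitscore > best[read_id][1]:
            best[read_id] = (cluster_id, bitscore)
    # Every read contributes to exactly one cluster.
    return Counter(cluster for cluster, _ in best.values())


def chip_input_ratios(chip_hits, input_hits, n_chip, n_input, min_bitscore):
    """Return {cluster: ChIP/INPUT ratio of normalized read counts}.

    n_chip and n_input are the total read counts of the two samples;
    dividing by them is an assumed normalization.
    """
    chip_counts = assign_reads(chip_hits, min_bitscore)
    input_counts = assign_reads(input_hits, min_bitscore)
    ratios = {}
    for cluster in sorted(set(chip_counts) | set(input_counts)):
        chip_norm = chip_counts[cluster] / n_chip
        input_norm = input_counts[cluster] / n_input
        # Clusters with no INPUT reads get an infinite ratio (assumption).
        ratios[cluster] = chip_norm / input_norm if input_norm > 0 else float("inf")
    return ratios


# Made-up example data: read r1 hits two clusters, r3 falls below the threshold.
chip_hits = [("r1", "CL1", 100), ("r1", "CL2", 90), ("r2", "CL1", 80), ("r3", "CL2", 30)]
input_hits = [("i1", "CL1", 100), ("i2", "CL2", 100)]
ratios = chip_input_ratios(chip_hits, input_hits, n_chip=4, n_input=4, min_bitscore=40)
print(ratios)
```

In this toy data, read `r1` counts only toward `CL1` because that hit has the higher bit score, which mirrors the rule that every read is assigned to a single cluster.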
diff -r 99569eccc583 -r c2c69c6090f0 RM_custom_search.xml
--- a/RM_custom_search.xml Mon Dec 09 04:14:48 2019 -0500
+++ b/RM_custom_search.xml Fri Jan 31 06:55:23 2020 -0500
@@ -10,7 +10,7 @@
-
+
diff -r 99569eccc583 -r c2c69c6090f0 extract_contigs_from_archive.xml
--- a/extract_contigs_from_archive.xml Mon Dec 09 04:14:48 2019 -0500
+++ b/extract_contigs_from_archive.xml Fri Jan 31 06:55:23 2020 -0500
@@ -10,7 +10,7 @@
-
-
+
+
diff -r 99569eccc583 -r c2c69c6090f0 fasta_affixer.xml
--- a/fasta_affixer.xml Mon Dec 09 04:14:48 2019 -0500
+++ b/fasta_affixer.xml Fri Jan 31 06:55:23 2020 -0500
@@ -1,19 +1,19 @@
- Tool appending suffix and prefix to sequences names
+ Appending suffix and prefix to the read names
fasta_affixer.py -f $input -p "$prefix" -s "$suffix" -n $nspace -o $output
-
-
-
-
+
+
+
+
-
+
diff -r 99569eccc583 -r c2c69c6090f0 fasta_interlacer.xml
--- a/fasta_interlacer.xml Mon Dec 09 04:14:48 2019 -0500
+++ b/fasta_interlacer.xml Fri Jan 31 06:55:23 2020 -0500
@@ -12,8 +12,8 @@
-
-
+
+
diff -r 99569eccc583 -r c2c69c6090f0 fasta_manual_input.xml
--- a/fasta_manual_input.xml Mon Dec 09 04:14:48 2019 -0500
+++ b/fasta_manual_input.xml Fri Jan 31 06:55:23 2020 -0500
@@ -7,12 +7,12 @@
-
+
-
+
diff -r 99569eccc583 -r c2c69c6090f0 fastq_name_affixer.xml
--- a/fastq_name_affixer.xml Mon Dec 09 04:14:48 2019 -0500
+++ b/fastq_name_affixer.xml Fri Jan 31 06:55:23 2020 -0500
@@ -5,15 +5,15 @@
-
+
-
+
-
+
diff -r 99569eccc583 -r c2c69c6090f0 pairScan.xml
--- a/pairScan.xml Mon Dec 09 04:14:48 2019 -0500
+++ b/pairScan.xml Fri Jan 31 06:55:23 2020 -0500
@@ -1,6 +1,6 @@
-
- Scan paired reads for overlap
+
+ Scan paired-end reads for overlap python-levenshtein
@@ -9,8 +9,8 @@
-
-
+
+
@@ -38,8 +38,8 @@
-
-
+
+
diff -r 99569eccc583 -r c2c69c6090f0 paired_fastq_filtering.xml
--- a/paired_fastq_filtering.xml Mon Dec 09 04:14:48 2019 -0500
+++ b/paired_fastq_filtering.xml Fri Jan 31 06:55:23 2020 -0500
@@ -1,9 +1,9 @@
-
+
- Preprocessing of paired-end reads fastq files
+ Preprocessing of paired-end reads in FASTQ format
including trimming, quality filtering, cutadapt filtering and interlacing. Broken
pairs are discarded.
@@ -40,41 +40,41 @@
-
+
-
+
-
-
+
+
-
+
-
-
+
+
-
+
-
+
-
+
>
@@ -87,17 +87,17 @@
-
+
-
+
- "
+ "
diff -r 99569eccc583 -r c2c69c6090f0 renameSequences.xml
--- a/renameSequences.xml Mon Dec 09 04:14:48 2019 -0500
+++ b/renameSequences.xml Fri Jan 31 06:55:23 2020 -0500
@@ -5,21 +5,21 @@
-
-
-
+
+
+
-
+
**What it does**
Use this tool to rename your sequences with a numerical counter while keeping the sequence name prefix as part of the name.
-If paired sequences are used, last character in sequence name is used to distinguish pairs.
+If paired-end reads are used, the last character of each sequence name is used to distinguish pairs.
diff -r 99569eccc583 -r c2c69c6090f0 sampleFasta.xml
--- a/sampleFasta.xml Mon Dec 09 04:14:48 2019 -0500
+++ b/sampleFasta.xml Fri Jan 31 06:55:23 2020 -0500
@@ -1,5 +1,5 @@
-
- Tool for creating samples of sequences from larger dataset
+
+ Tool for random sampling of read subsets from a larger dataset seqkit
@@ -7,24 +7,26 @@
+
+ ]]>
+
-
-
-
-
+
+
+
+
diff -r 99569eccc583 -r c2c69c6090f0 single_fastq_filtering.xml
--- a/single_fastq_filtering.xml Mon Dec 09 04:14:48 2019 -0500
+++ b/single_fastq_filtering.xml Fri Jan 31 06:55:23 2020 -0500
@@ -1,9 +1,9 @@
-
+
- Preprocessing of fastq files
+ Preprocessing of FASTQ read files
including trimming, quality filtering, cutadapt filtering and sampling
@@ -35,43 +35,43 @@
-
+
-
+
-
+
-
-
+
+
-
+
-
-
+
+
-
+
-
+
-
+
>
@@ -84,7 +84,7 @@
-
+
@@ -92,8 +92,8 @@
-
- "
+
+ "