diff README.org @ 0:dd46956ff61f draft
Uploaded
| author | petr-novak |
|---|---|
| date | Fri, 08 Dec 2017 09:57:17 -0500 |
| parents | |
| children | |
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/README.org Fri Dec 08 09:57:17 2017 -0500
@@ -0,0 +1,73 @@
+#+TITLE: Sequence Read Simulator
+#+AUTHOR: Petr Novak
+
+Create pseudo short reads from long reads (Illumina-like).
+
+* Requirements
+- python version > 3.4
+- biopython
+
+* Available tools
+** long_reads_sampling
+#+BEGIN_EXAMPLE
+
+usage: long_reads_sampling.py [-h] [-i INPUT] [-o OUTPUT] [-l TOTAL_LENGTH]
+                              [-s SEED]
+
+Create sample of long reads, instead of setting number of reads to be sampled,
+total length of all sampled sequences is defined
+
+optional arguments:
+  -h, --help            show this help message and exit
+  -i INPUT, --input INPUT
+                        file with long reads in fasta format (default: None)
+  -o OUTPUT, --output OUTPUT
+                        Output file name (default: None)
+  -l TOTAL_LENGTH, --total_length TOTAL_LENGTH
+                        total length of sampled output (default: None)
+  -s SEED, --seed SEED  random number generator seed (default: 123)
+#+END_EXAMPLE
+
+** long2short
+#+BEGIN_EXAMPLE
+usage: long2short.py [-h] [-i INPUT] [-o OUTPUT] [-cov COVERAGE]
+                     [-L INSERT_LENGTH] [-l READ_LENGTH]
+
+Creates pseudo short reads from long oxford nanopore reads
+
+optional arguments:
+  -h, --help            show this help message and exit
+  -i INPUT, --input INPUT
+                        file with long reads in fasta format (default: None)
+  -o OUTPUT, --output OUTPUT
+                        Output file name (default: None)
+  -cov COVERAGE, --coverage COVERAGE
+                        samplig coverage (default: 0.1)
+  -L INSERT_LENGTH, --insert_length INSERT_LENGTH
+                        length of insert, must be longer than read length
+                        (default: 600)
+  -l READ_LENGTH, --read_length READ_LENGTH
+                        read length (default: 100)
+
+#+END_EXAMPLE
+Resulting reads in fasta format have names that include the following information:
+ - index of the original long read
+ - position of the pseudo forward read in the long read
+Forward and reverse reads are interlaced, and reverse reads are the reverse complement of the original long sequence.
+Example output:
+#+BEGIN_EXAMPLE
+>1_1_101_f
+TGGTACTTGCGGTTACGTATTGCTAGCTAGTCTCCATTTGTCCGTTGGTCTTAGGTGATT
+TTCCAAGCTTTGTGTGTAAATGTAAGGATCCTCATTTGTA
+>1_1_101_r
+GTTTTGTTATCGTGATCCACAGATCAGAAGATATCGCCGCTCACCTGTCAATTAATCTTA
+ACTTAATGTACACTAGGGTTTTGGTTTTAACTGCTATCTT
+>1_2001_2101_f
+CTGAGTTGGGCAACATAGCCGACAAATTTGAACAATAAGCCGGTCCAGCCTTCTTTCTCA
+GCTGATACATGAAACAAATCAAAGGAGCATTGTAAAGGCG
+>1_2001_2101_r
+TTTTGAATGATGGCACTACCGTGATCAAGGACGATGGTCTCCGTTCACTCGCTTTTGTTG
+TACGTTCTCTATGAACTTGGTTTCTTTGCATTCGGTTCTT
+>1_4001_4101_f
+GAAGTTGAAGGAACATTTGGAAAGGTGTGTGAAGACTAATTTGGTCT
+#+END_EXAMPLE
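The README describes what the two tools do but not how. The sketch below is one possible reading of that description and is not the code of long_reads_sampling.py or long2short.py. The function names are hypothetical. It uses plain strings instead of Biopython so that it runs with the standard library alone. Three points are assumptions inferred from the help text and the example output: the sampler keeps whole reads until the target length is reached, the spacing between pairs is `2 * read_length / coverage` (2000 bases at the defaults, which matches the names in the example), and names use a 1-based start followed by `start + read_length`.

```python
import random

# Hypothetical sketch: names and the spacing rule are assumptions,
# not the implementation shipped in this repository.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse-complement a DNA string; symbols other than ACGT (e.g. N) are kept."""
    return seq.translate(_COMPLEMENT)[::-1]


def sample_by_total_length(records, total_length, seed=123):
    """Take (name, seq) records in random order until their summed length
    reaches total_length. Assumption: the last read is kept whole, so the
    result may overshoot the target."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    picked, total = [], 0
    for name, seq in shuffled:
        if total >= total_length:
            break
        picked.append((name, seq))
        total += len(seq)
    return picked


def pseudo_pairs(seq, index, insert_length=600, read_length=100, coverage=0.1):
    """Yield (name, sequence) for interlaced forward/reverse pseudo reads."""
    if insert_length <= read_length:
        raise ValueError("insert_length must be longer than read_length")
    # Assumption: one pair (2 * read_length bases) is emitted every `step` bases.
    step = int(round(2 * read_length / coverage))
    # Inserts that do not fit entirely inside the long read are skipped here.
    for start in range(0, len(seq) - insert_length + 1, step):
        end = start + insert_length
        label = "{}_{}_{}".format(index, start + 1, start + 1 + read_length)
        # Forward read: first read_length bases of the insert.
        yield label + "_f", seq[start:start + read_length]
        # Reverse read: last read_length bases of the insert, reverse-complemented.
        yield label + "_r", reverse_complement(seq[end - read_length:end])
```

For example, `for name, read in pseudo_pairs(long_read, 1): print(">" + name, read, sep="\n")` would print interlaced FASTA records for a string `long_read`, with headers in the same `index_start_end_f` / `index_start_end_r` pattern as the example output.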