author | petr-novak |
---|---|
date | Fri, 08 Dec 2017 09:57:17 -0500 |
#+TITLE: Sequence Read Simulator
#+AUTHOR: Petr Novak

Create pseudo short reads (Illumina-like) from long reads.

* Requirements
- Python version > 3.4
- Biopython

* Available tools
** long_reads_sampling
#+BEGIN_EXAMPLE
usage: long_reads_sampling.py [-h] [-i INPUT] [-o OUTPUT] [-l TOTAL_LENGTH]
                              [-s SEED]

Create sample of long reads, instead of setting number of reads to be sampled,
total length of all sampled sequences is defined

optional arguments:
  -h, --help            show this help message and exit
  -i INPUT, --input INPUT
                        file with long reads in fasta format (default: None)
  -o OUTPUT, --output OUTPUT
                        Output file name (default: None)
  -l TOTAL_LENGTH, --total_length TOTAL_LENGTH
                        total length of sampled output (default: None)
  -s SEED, --seed SEED  random number generator seed (default: 123)
#+END_EXAMPLE

** long2short
#+BEGIN_EXAMPLE
usage: long2short.py [-h] [-i INPUT] [-o OUTPUT] [-cov COVERAGE]
                     [-L INSERT_LENGTH] [-l READ_LENGTH]

Creates pseudo short reads from long oxford nanopore reads

optional arguments:
  -h, --help            show this help message and exit
  -i INPUT, --input INPUT
                        file with long reads in fasta format (default: None)
  -o OUTPUT, --output OUTPUT
                        Output file name (default: None)
  -cov COVERAGE, --coverage COVERAGE
                        samplig coverage (default: 0.1)
  -L INSERT_LENGTH, --insert_length INSERT_LENGTH
                        length of insert, must be longer than read length
                        (default: 600)
  -l READ_LENGTH, --read_length READ_LENGTH
                        read length (default: 100)
#+END_EXAMPLE

The resulting reads are written in FASTA format. Each read name includes the
following information:
- index of the original long read
- position of the pseudo forward read in the long read

Forward and reverse reads are interlaced, and each reverse read is the reverse
complement of the original long read sequence.

Example output:
#+BEGIN_EXAMPLE
>1_1_101_f
TGGTACTTGCGGTTACGTATTGCTAGCTAGTCTCCATTTGTCCGTTGGTCTTAGGTGATT
TTCCAAGCTTTGTGTGTAAATGTAAGGATCCTCATTTGTA
>1_1_101_r
GTTTTGTTATCGTGATCCACAGATCAGAAGATATCGCCGCTCACCTGTCAATTAATCTTA
ACTTAATGTACACTAGGGTTTTGGTTTTAACTGCTATCTT
>1_2001_2101_f
CTGAGTTGGGCAACATAGCCGACAAATTTGAACAATAAGCCGGTCCAGCCTTCTTTCTCA
GCTGATACATGAAACAAATCAAAGGAGCATTGTAAAGGCG
>1_2001_2101_r
TTTTGAATGATGGCACTACCGTGATCAAGGACGATGGTCTCCGTTCACTCGCTTTTGTTG
TACGTTCTCTATGAACTTGGTTTCTTTGCATTCGGTTCTT
>1_4001_4101_f
GAAGTTGAAGGAACATTTGGAAAGGTGTGTGAAGACTAATTTGGTCT
#+END_EXAMPLE
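
The behaviour of the two tools described above can be illustrated with a
minimal sketch. It is not the code of =long_reads_sampling.py= or
=long2short.py=; it only uses the Python standard library, and the function
names =sample_by_total_length=, =long2short= and =reverse_complement= are
hypothetical. Two assumptions are made: sampling keeps whole reads until the
requested total length is reached, and inserts are taken at a fixed step of
=2 * read_length / coverage= bases. With the default options that step is 2000
bases, which is consistent with the start positions 1, 2001 and 4001 in the
example output, but the actual scripts may work differently.

#+BEGIN_SRC python
import random

# Translation table for complementing DNA; other symbols (e.g. N) pass through.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def sample_by_total_length(reads, total_length, seed=123):
    """Pick reads in random order until their summed length reaches total_length.

    reads is a list of (name, sequence) tuples. Assumption: the last read is
    kept whole, so the result may slightly overshoot the target length.
    """
    rng = random.Random(seed)
    order = list(range(len(reads)))
    rng.shuffle(order)
    sampled, current = [], 0
    for i in order:
        if current >= total_length:
            break
        sampled.append(reads[i])
        current += len(reads[i][1])
    return sampled


def long2short(index, seq, coverage=0.1, insert_length=600, read_length=100):
    """Yield interlaced (name, sequence) pseudo read pairs from one long read.

    Assumptions: inserts start at a fixed step, and names follow the
    <index>_<start>_<end>_<f|r> pattern seen in the example output.
    """
    if insert_length <= read_length:
        raise ValueError("insert length must be longer than read length")
    # Each pair contributes 2 * read_length bases per step of the long read.
    step = round(2 * read_length / coverage)
    for start in range(0, len(seq) - insert_length + 1, step):
        insert = seq[start:start + insert_length]
        forward = insert[:read_length]
        # The reverse read comes from the far end of the insert.
        reverse = reverse_complement(insert[-read_length:])
        label = f"{index}_{start + 1}_{start + read_length + 1}"
        yield f"{label}_f", forward
        yield f"{label}_r", reverse
#+END_SRC

Calling =list(long2short(1, sequence))= on a single long read returns the
forward and reverse reads in alternating order, which mirrors the interlaced
layout of the FASTA output described above.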