#+TITLE: Sequence Read Simulator
#+AUTHOR: Petr Novak

Create pseudo short reads (Illumina-like) from long reads.

* Requirements
- Python version > 3.4
- Biopython

* Available tools
** long_reads_sampling
#+BEGIN_EXAMPLE
usage: long_reads_sampling.py [-h] [-i INPUT] [-o OUTPUT] [-l TOTAL_LENGTH]
                              [-s SEED]

Create sample of long reads, instead of setting number of reads to be sampled,
total length of all sampled sequences is defined

optional arguments:
  -h, --help            show this help message and exit
  -i INPUT, --input INPUT
                        file with long reads in fasta format (default: None)
  -o OUTPUT, --output OUTPUT
                        Output file name (default: None)
  -l TOTAL_LENGTH, --total_length TOTAL_LENGTH
                        total length of sampled output (default: None)
  -s SEED, --seed SEED  random number generator seed (default: 123)
#+END_EXAMPLE
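
The idea of sampling by total length rather than by read count can be sketched as follows. This is a minimal illustration, not the code of ~long_reads_sampling.py~: the function name ~sample_by_total_length~ is hypothetical, reads are plain ~(name, sequence)~ tuples instead of Biopython records, and it assumes that reads are drawn in seeded random order until the requested total length is reached (the last read may overshoot the target).

#+BEGIN_SRC python
import random


def sample_by_total_length(reads, total_length, seed=123):
    """Pick reads in random order until their summed length reaches total_length.

    reads: list of (name, sequence) tuples (hypothetical in-memory format).
    Assumption: the read that crosses the threshold is kept whole.
    """
    rng = random.Random(seed)          # seeded, so the sample is reproducible
    order = list(range(len(reads)))
    rng.shuffle(order)

    sampled = []
    current_length = 0
    for i in order:
        if current_length >= total_length:
            break
        name, seq = reads[i]
        sampled.append((name, seq))
        current_length += len(seq)
    return sampled


if __name__ == "__main__":
    demo_reads = [("read%d" % i, "A" * 100) for i in range(10)]
    subset = sample_by_total_length(demo_reads, total_length=250)
    print(len(subset), sum(len(seq) for _, seq in subset))
#+END_SRC

Using a fixed seed, as the ~-s~ option does, makes repeated runs return the same subset.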

** long2short
#+BEGIN_EXAMPLE
usage: long2short.py [-h] [-i INPUT] [-o OUTPUT] [-cov COVERAGE]
                     [-L INSERT_LENGTH] [-l READ_LENGTH]

Creates pseudo short reads from long oxford nanopore reads

optional arguments:
  -h, --help            show this help message and exit
  -i INPUT, --input INPUT
                        file with long reads in fasta format (default: None)
  -o OUTPUT, --output OUTPUT
                        Output file name (default: None)
  -cov COVERAGE, --coverage COVERAGE
                        samplig coverage (default: 0.1)
  -L INSERT_LENGTH, --insert_length INSERT_LENGTH
                        length of insert, must be longer than read length
                        (default: 600)
  -l READ_LENGTH, --read_length READ_LENGTH
                        read length (default: 100)
#+END_EXAMPLE
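
How the three parameters might interact can be sketched as below. This is an illustration under stated assumptions, not the code of ~long2short.py~: the names ~pseudo_pairs~ and ~reverse_complement~ are hypothetical, and the step between inserts (~2 * read_length / coverage~, i.e. 2000 for the defaults) and the ~index_start_end_f/r~ naming are inferred from the example output in this README rather than confirmed by the source.

#+BEGIN_SRC python
# Translation table for complementing DNA; N stays N.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def pseudo_pairs(index, seq, coverage=0.1, insert_length=600, read_length=100):
    """Yield (name, sequence) for interleaved forward/reverse pseudo reads.

    Assumption: inserts start every 2 * read_length / coverage bases, so the
    two mates of each insert together cover `coverage` of the long read.
    """
    if insert_length <= read_length:
        raise ValueError("insert length must be longer than read length")
    step = int(round(2 * read_length / coverage))
    for start in range(0, len(seq) - insert_length + 1, step):
        insert = seq[start:start + insert_length]
        # Both mates carry the position of the forward read (assumed scheme).
        name = "%s_%d_%d" % (index, start + 1, start + read_length + 1)
        yield name + "_f", insert[:read_length]
        yield name + "_r", reverse_complement(insert[-read_length:])


if __name__ == "__main__":
    long_read = "ACGT" * 1250          # 5000 bp toy read
    for name, read in pseudo_pairs(1, long_read):
        print(">" + name)
        print(read)
#+END_SRC

With the defaults, a 5000 bp read yields three pairs starting at positions 1, 2001 and 4001.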
The resulting reads in FASTA format have names that include the following information:
- index of the original long read
- position of the pseudo forward read in the long read

Forward and reverse reads are interleaved, and reverse reads are the reverse complement of the original long sequence.

Example output:
#+BEGIN_EXAMPLE
>1_1_101_f
TGGTACTTGCGGTTACGTATTGCTAGCTAGTCTCCATTTGTCCGTTGGTCTTAGGTGATT
TTCCAAGCTTTGTGTGTAAATGTAAGGATCCTCATTTGTA
>1_1_101_r
GTTTTGTTATCGTGATCCACAGATCAGAAGATATCGCCGCTCACCTGTCAATTAATCTTA
ACTTAATGTACACTAGGGTTTTGGTTTTAACTGCTATCTT
>1_2001_2101_f
CTGAGTTGGGCAACATAGCCGACAAATTTGAACAATAAGCCGGTCCAGCCTTCTTTCTCA
GCTGATACATGAAACAAATCAAAGGAGCATTGTAAAGGCG
>1_2001_2101_r
TTTTGAATGATGGCACTACCGTGATCAAGGACGATGGTCTCCGTTCACTCGCTTTTGTTG
TACGTTCTCTATGAACTTGGTTTCTTTGCATTCGGTTCTT
>1_4001_4101_f
GAAGTTGAAGGAACATTTGGAAAGGTGTGTGAAGACTAATTTGGTCT
#+END_EXAMPLE
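
Downstream scripts may want to recover the long read index, the position and the mate from such names. The helper below is a hypothetical sketch (~parse_read_name~ is not part of this repository). It assumes the ~index_start_end_f~ / ~index_start_end_r~ layout seen in the example output, with ~start~ and ~end~ describing the forward read for both mates.

#+BEGIN_SRC python
def parse_read_name(name):
    """Split a pseudo read name such as '1_2001_2101_r' into its fields.

    Assumed layout (inferred from the example output):
    <long read index>_<start>_<end>_<f|r>
    """
    # rsplit keeps any underscores inside the index part intact.
    index, start, end, mate = name.lstrip(">").rsplit("_", 3)
    if mate not in ("f", "r"):
        raise ValueError("unexpected mate suffix in %r" % name)
    return {
        "read_index": index,
        "start": int(start),
        "end": int(end),
        "mate": "forward" if mate == "f" else "reverse",
    }


if __name__ == "__main__":
    print(parse_read_name(">1_2001_2101_r"))
#+END_SRC

Because the output is interleaved, two consecutive records that parse to the same index and position form one pair.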