author | petr-novak |
---|---|
date | Fri, 08 Dec 2017 09:57:17 -0500 |
#+TITLE: Sequence Read Simulator
#+AUTHOR: Petr Novak

Create pseudo short reads (Illumina-like) from long reads.

* Requirements
- Python version > 3.4
- Biopython

* Available tools
** long_reads_sampling
#+BEGIN_EXAMPLE

usage: long_reads_sampling.py [-h] [-i INPUT] [-o OUTPUT] [-l TOTAL_LENGTH]
                              [-s SEED]

Create sample of long reads, instead of setting number of reads to be sampled,
total length of all sampled sequences is defined

optional arguments:
  -h, --help            show this help message and exit
  -i INPUT, --input INPUT
                        file with long reads in fasta format (default: None)
  -o OUTPUT, --output OUTPUT
                        Output file name (default: None)
  -l TOTAL_LENGTH, --total_length TOTAL_LENGTH
                        total length of sampled output (default: None)
  -s SEED, --seed SEED  random number generator seed (default: 123)
#+END_EXAMPLE
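
The help text above says that the sample size is set by total sequence length rather than by read count. A minimal sketch of one way to do this is shown below. It is an illustration only, not the code of =long_reads_sampling.py=: the function name =sample_by_total_length=, the in-memory list of =(name, sequence)= tuples, and the choice to stop once the target length is reached or exceeded are all assumptions.

```python
import random


def sample_by_total_length(reads, total_length, seed=123):
    """Pick reads in random order until their summed length reaches total_length.

    `reads` is a list of (name, sequence) tuples. Hypothetical helper:
    the real tool may select or truncate reads differently.
    """
    rng = random.Random(seed)  # a fixed seed makes the sample reproducible
    order = list(range(len(reads)))
    rng.shuffle(order)

    sampled = []
    running_total = 0
    for i in order:
        if running_total >= total_length:
            break  # target reached, stop adding reads
        name, seq = reads[i]
        sampled.append((name, seq))
        running_total += len(seq)
    return sampled


if __name__ == "__main__":
    # Ten toy reads of 4, 8, ..., 40 bases
    reads = [("read%d" % i, "ACGT" * (i + 1)) for i in range(10)]
    sample = sample_by_total_length(reads, total_length=60, seed=123)
    print(len(sample), sum(len(seq) for _, seq in sample))
```

With this approach the last read added may push the total slightly above the requested length. If the input holds fewer bases than requested, all reads are returned.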

** long2short
#+BEGIN_EXAMPLE
usage: long2short.py [-h] [-i INPUT] [-o OUTPUT] [-cov COVERAGE]
                     [-L INSERT_LENGTH] [-l READ_LENGTH]

Creates pseudo short reads from long oxford nanopore reads

optional arguments:
  -h, --help            show this help message and exit
  -i INPUT, --input INPUT
                        file with long reads in fasta format (default: None)
  -o OUTPUT, --output OUTPUT
                        Output file name (default: None)
  -cov COVERAGE, --coverage COVERAGE
                        samplig coverage (default: 0.1)
  -L INSERT_LENGTH, --insert_length INSERT_LENGTH
                        length of insert, must be longer than read length
                        (default: 600)
  -l READ_LENGTH, --read_length READ_LENGTH
                        read length (default: 100)

#+END_EXAMPLE
The resulting reads are in FASTA format, and their names include the following information:
- index of the original long read
- position of the pseudo forward read in the long read
Forward and reverse reads are interlaced, and reverse reads are the reverse complement of the original long sequence.
Example output:
#+BEGIN_EXAMPLE
>1_1_101_f
TGGTACTTGCGGTTACGTATTGCTAGCTAGTCTCCATTTGTCCGTTGGTCTTAGGTGATT
TTCCAAGCTTTGTGTGTAAATGTAAGGATCCTCATTTGTA
>1_1_101_r
GTTTTGTTATCGTGATCCACAGATCAGAAGATATCGCCGCTCACCTGTCAATTAATCTTA
ACTTAATGTACACTAGGGTTTTGGTTTTAACTGCTATCTT
>1_2001_2101_f
CTGAGTTGGGCAACATAGCCGACAAATTTGAACAATAAGCCGGTCCAGCCTTCTTTCTCA
GCTGATACATGAAACAAATCAAAGGAGCATTGTAAAGGCG
>1_2001_2101_r
TTTTGAATGATGGCACTACCGTGATCAAGGACGATGGTCTCCGTTCACTCGCTTTTGTTG
TACGTTCTCTATGAACTTGGTTTCTTTGCATTCGGTTCTT
>1_4001_4101_f
GAAGTTGAAGGAACATTTGGAAAGGTGTGTGAAGACTAATTTGGTCT
#+END_EXAMPLE
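
The read names and layout in the example output can be reproduced with a short sketch like the one below. It is not the code of =long2short.py=, and several details are assumptions inferred from the example and the default options: the step between inserts is taken as =2 * read_length / coverage= (2000 bases with the defaults), the reverse read is taken from the far end of the insert, positions in the name are 1-based, and only inserts that fit completely inside the long read are emitted. The names =reverse_complement= and =pseudo_paired_reads= are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    complement = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(complement)[::-1]


def pseudo_paired_reads(seq, index, coverage=0.1, insert_length=600,
                        read_length=100):
    """Yield interlaced (name, sequence) pairs cut from one long read.

    Assumed layout: each insert contributes a forward read from its start
    and a reverse-complemented read from its end.
    """
    if insert_length <= read_length:
        raise ValueError("insert length must be longer than read length")
    # Assumption: two reads per insert, so this spacing gives the coverage
    step = round(2 * read_length / coverage)
    for start in range(0, len(seq) - insert_length + 1, step):
        end = start + read_length
        insert_end = start + insert_length
        # Both mates carry the position of the forward read (1-based)
        prefix = "%d_%d_%d" % (index, start + 1, end + 1)
        yield prefix + "_f", seq[start:end]
        yield prefix + "_r", reverse_complement(
            seq[insert_end - read_length:insert_end])


if __name__ == "__main__":
    long_read = "ACGT" * 1200  # toy 4800-base long read
    for name, read in pseudo_paired_reads(long_read, index=1):
        print(">" + name)
        print(read)
```

Because forward and reverse reads are yielded alternately, writing the pairs in order produces an interlaced FASTA file like the example above.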