Mercurial > repos > mmonot > phageterm
diff README.txt @ 24:c8f88ae512f3 draft default tip
Uploaded
| author | mmonot |
|---|---|
| date | Tue, 17 Sep 2024 13:35:16 +0000 |
| parents | |
| children |
line wrap: on
line diff
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/README.txt Tue Sep 17 13:35:16 2024 +0000 @@ -0,0 +1,139 @@ +PROGRAM +======= + +PhageTerm.py - run as command line in a shell + + +UPDATES +======= + +Bug fix: +When the repeat region (of DTR phages) wraps around the reference contig ends, the first and last bases of the repeat region were missing in the reported sequence. +Bug fixed thanks to Matthew Lueder (Naval Medical Research Center-Frederick). + +Bug fix: cohesive sequence of COS 3' phages are now correctly determined. +Thanks to Dr Wei Shen. + +VERSION +======= + +Version 1.0.12 + + +INTRODUCTION +============ + +PhageTerm software is a tool to determine phage termini and packaging mode +from high throughput sequences that rely on the random fragmentation of DNA (e.g. +Illumina TruSeq). Phage sequencing reads from a fastq file are aligned to the phage +reference genome in order to calculate two types of coverage values (whole genome coverage +and the starting position coverage). The starting position coverage is used to perform a +detailed termini analysis. If the user provides the host sequence, reads that does not +match the phage genome are tested on the host using the same mapping function. + +The PhageTerm program and information is available at https://sourceforge.net/projects/phageterm/ + +A Galaxy wrapper version is also available at https://galaxy.pasteur.fr + + +PREREQUISITES +============= + +Unix/Linux + +- Python 2.7 +- matplotlib 2.0.2 +- numpy 1.11 +- pandas 0.19.1 +- scikit-learn 0.18.1 +- scipy 0.19.0 +- statsmodels 0.8.0 +- reportlab 3.4.0 + + +COMMAND LINE +============ + + + ./PhageTerm.py -f reads.fastq -r phage_sequence.fasta [-n phage_name -p reads_paired + -s seed_lenght -d surrounding -t installation_test -c nbr_core -g host.fasta + (warning increase process time)] + + + Help: + + ./PhageTerm.py -h + ./PhageTerm.py --help + + Options: + + Raw reads file in fastq format: + -f INPUT_FILE, --fastq=INPUT_FILE + Fastq reads + (NGS sequences from random fragmentation DNA only, + e.g. Illumina TruSeq) + + Raw reads file in fastq format: + -p INPUT_FILE, --paired=INPUT_FILE + Paired fastq reads + (NGS sequences from random fragmentation DNA only, + e.g. Illumina TruSeq) + + Phage genome in fasta format: + -r INPUT_FILE, --ref=INPUT_FILE + Reference phage genome as unique contig in fasta format + + Name of the phage being analyzed by the user: + -n PHAGE_NAME, --phagename=PHAGE_NAME + Manually enter the name of the phage being analyzed. + Used as prefix for output files. + + Lenght of the seed used for reads in the mapping process: + -s SEED_LENGHT, --seed=SEED_LENGHT + Manually enter the lenght of the seed used for reads + in the mapping process (Default: 20). + + Lenght of the seed used for reads in the mapping process: + -d SUROUNDING_LENGHT, --surrounding=SUROUNDING_LENGHT + Manually enter the lenght of the surrounding used to + merge close peaks in the analysis process (Default: 20). + + Host genome in fasta format: + -g INPUT_FILE, --host=INPUT_FILE + Reference host genome as unique contig in fasta format + Warning: increase drastically process time + + Core processor number to use: + -c CORE_NBR, --core=CORE_NBR + Number of core processor to use (Default: 1). + + Define phage mean coverage: + -m MEAN_NBR, --mean=MEAN_NBR + Phage mean coverage to use (Default: 250). + + Software run test: + -t TEST_VALUE, --test=TEST_VALUE + TEST_VALUE=C5 : Test run for a 5' cohesive end (e.g. Lambda) + TEST_VALUE=C3 : Test run for a 3' cohesive end (e.g. HK97) + TEST_VALUE=DS : Test run for a short Direct Terminal Repeats end (e.g. T7) + TEST_VALUE=DL : Test run for a long Direct Terminal Repeats end (e.g. T5) + TEST_VALUE=H : Test run for a Headful packaging (e.g. P1) + TEST_VALUE=M : Test run for a Mu-like packaging (e.g. Mu) + + +OUTPUT FILES +========== + + (i) Report (.pdf) + + (ii) Statistical table (.csv) + + (iii) Sequence files (.fasta) + + +CONTACT +======= + +Julian Garneau <julian.garneau@usherbrooke.ca> +Marc Monot <marc.monot@pasteur.fr> +David Bikard <david.bikard@pasteur.fr>
