changeset 21:b575af79e250 draft
Uploaded
author | martasampaio |
---|---|
date | Sat, 20 Apr 2019 11:06:06 -0400 |
parents | b680802b13cc |
children | 5acc4fa8b62d |
files | README.rst |
diffstat | 1 files changed, 29 insertions(+), 0 deletions(-) [+] |
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/README.rst Sat Apr 20 11:06:06 2019 -0400
@@ -0,0 +1,29 @@
+===============
+PhagePromoter
+===============
+
+Get promoters of phage genomes
+
+PhagePromoter is a Python script that predicts promoter sequences in phage genomes, using a machine learning SVM model. This model was built from a training dataset with 19 features and 3200 examples (800 positives and 2400 negatives), each representing a 65 bp sequence of a phage genome. The positive cases represent the phage sequences that are already identified as promoters.
+
+**Inputs:**
+
+* genome format: FASTA or GenBank;
+* genome file: accepts both GenBank and FASTA formats;
+* both strands (yes or no): allows the search in both DNA strands;
+* threshold: the probability of the test sequence being a promoter (float between 0 and 1);
+* family: the family of the testing phage - Podoviridae, Siphoviridae or Myoviridae;
+* Bacteria: the host of the phage. The training dataset includes the following hosts: Bacillus, EColi, Salmonella, Pseudomonas, Yersinia, Klebsiella, Pectobacterium, Morganella, Cronobacter, Staphylococcus, Streptococcus, Streptomyces, Lactococcus. If the testing phage has a different host, select the option 'other'; a higher threshold value is then recommended for more accurate results;
+* phage type: the type of the phage, according to its lifecycle: virulent or temperate.
+
+**Outputs:**
+This tool outputs two files: a FASTA file and a table in HTML, with the locations, sequence, score and type (recognized by host or phage RNAP) of the predicted promoters.
+
+**Requirements:**
+
+* Biopython
+* scikit-learn
+* Numpy
+* Pandas
+
+
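Reviewer note: the README describes the "both strands" and "threshold" inputs and the 65 bp sequence length, but not how they interact. The sketch below is one plausible reading, not the script's actual code. It slides a 65 bp window over the forward strand (and optionally the reverse complement) and keeps windows whose score reaches the threshold. The names `scan`, `windows` and `toy_score` are hypothetical, the step size of 1 bp and the `>=` comparison are assumptions, and `toy_score` is only a stand-in for the SVM's predicted probability.

```python
from typing import Callable, Iterator, List, Tuple

WINDOW = 65  # window length in bp, as stated in the README

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def windows(seq: str, size: int = WINDOW) -> Iterator[Tuple[int, str]]:
    """Yield (0-based start, subsequence) for every full-length window."""
    for start in range(len(seq) - size + 1):
        yield start, seq[start:start + size]


def toy_score(window: str) -> float:
    """Placeholder score in [0, 1]: the A/T fraction of the window.

    This is NOT the PhagePromoter model; it only stands in for a
    classifier's predicted promoter probability.
    """
    return sum(base in "ATat" for base in window) / len(window)


def scan(
    seq: str,
    threshold: float,
    both_strands: bool = True,
    score: Callable[[str], float] = toy_score,
) -> List[Tuple[str, int, float]]:
    """Return (strand, start, score) for windows scoring >= threshold.

    Starts on the "-" strand are relative to the reverse-complemented
    sequence, not to forward-strand coordinates.
    """
    strands = [("+", seq)]
    if both_strands:
        strands.append(("-", reverse_complement(seq)))
    hits = []
    for strand, strand_seq in strands:
        for start, window in windows(strand_seq):
            probability = score(window)
            if probability >= threshold:
                hits.append((strand, start, probability))
    return hits
```

Calling `scan(genome, threshold=0.5, both_strands=False)` would restrict the search to the forward strand. Raising the threshold can only shrink the hit list, which matches the README's advice to use a higher value for hosts outside the training set.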