Galaxy | Tool Preview

pyMotif (version 1.0.0)
File of type .gtf
Tab file containing genomic reference sequence
GTF file containing gene ID co-ordinates

pyMotif

pyMotif is part of the pyCRAC package. It looks for enriched sequence motifs in high-throughput sequencing data and produces a GTF-format output file with coordinates and Z-scores for the enriched motifs. The GTF file can be visualised in genome browsers.


Parameter list

File input options:

-f intervals.gtf, --input_file=intervals.gtf
                    Path to an interval GTF file. By default, data is
                    read from standard input.
-o OUTPUT_FILE, --output_file=OUTPUT_FILE
                    Use this flag to override the standard file names. Do
                    NOT add an extension.
--gtf=annotation_file.gtf
                    Path to the GTF annotation file that you want to
                    use.
--tab=tab_file.tab
                    Path to the tab file that contains the genomic
                    reference sequence.
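
As an illustration of how the file input options fit together, the following minimal Python sketch assembles an argument list from them. The executable name `pyMotif.py`, the helper name `build_pymotif_command`, and the output name `motif_run` are assumptions made for this example; only the flags themselves come from the list above. The sketch builds and prints the command and does not run it.

```python
import shlex


def build_pymotif_command(input_gtf, annotation_gtf, tab_file, output_name,
                          script="pyMotif.py"):
    """Assemble an argument list from the file input options.

    `script` is an assumed executable name; adjust it to your installation.
    """
    return [
        script,
        "-f", input_gtf,            # interval GTF file (otherwise read from stdin)
        "-o", output_name,          # output name, given WITHOUT an extension
        f"--gtf={annotation_gtf}",  # GTF annotation file
        f"--tab={tab_file}",        # tab file with the genomic reference sequence
    ]


cmd = build_pymotif_command(
    "intervals.gtf", "annotation_file.gtf", "tab_file.tab", "motif_run"
)
# Show the command as it would be typed in a shell.
print(shlex.join(cmd))
```

Keeping the arguments in a list rather than a single string avoids quoting problems if a path contains spaces.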

pyMotif specific options:

--k_min=4
                    Sets the shortest k-mer length to search for.
                    Default = 4.
--k_max=6
                    Sets the longest k-mer length to search for.
                    Default = 8.
-n 100, --numberofkmers=100
                    Maximum number of enriched k-mer sequences to report
                    in the output files. Default = 1000.
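
To make the three options above concrete, here is a minimal Python sketch of what a k-mer length range and a report limit mean in practice. It is not pyMotif's implementation: it only counts raw k-mer occurrences and does not compute Z-scores. The function names `count_kmers` and `top_kmers` and the example sequences are hypothetical.

```python
from collections import Counter


def count_kmers(sequences, k_min=4, k_max=8):
    """Count every k-mer with k_min <= k <= k_max across all sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for k in range(k_min, k_max + 1):
            # A sequence shorter than k yields an empty range and is skipped.
            for i in range(len(seq) - k + 1):
                counts[seq[i:i + k]] += 1
    return counts


def top_kmers(counts, n=1000):
    """Return at most n (k-mer, count) pairs, most frequent first."""
    return counts.most_common(n)


# Hypothetical example sequences, analogous to --k_min=4 --k_max=6 -n 3.
example_sequences = ["ACGTACGT", "TTACGTAA"]
counts = count_kmers(example_sequences, k_min=4, k_max=6)
for kmer, count in top_kmers(counts, n=3):
    print(kmer, count)
```

Widening the range between the shortest and longest k-mer length increases the number of candidate motifs, which is why a cap on the number of reported k-mers is useful.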

pyCRAC common options:

-a protein_coding, --annotation=protein_coding
                    Select which annotation (e.g. protein_coding, ncRNA,
                    sRNA, rRNA, snoRNA, snRNA, depending on the source of
                    your GTF file) you would like to focus your search on.
                    Default = all annotations.
-r 100, --range=100
                    allows you to add regions flanking the genomic
                    feature. If you set '-r 50' or '--range=50', then the
                    program will add 50 nucleotides to each feature on
                    each side regardless of whether the GTF file has genes
                    with annotated UTRs.
--overlap=1
                    Sets the number of nucleotides a motif has to overlap
                    with a genomic feature before it is considered a hit.
                    Default = 1 nucleotide.
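
The interaction between `--range` and `--overlap` can be sketched in Python as follows. This is an illustration under stated assumptions, not pyCRAC code: coordinates are treated as 1-based and inclusive (as in GTF), the extended start is clamped at position 1, and the function names are hypothetical.

```python
def extend_feature(start, end, flank):
    """Add `flank` nucleotides to each side of a feature (as with --range).

    Assumption: the start is clamped so it never drops below position 1.
    """
    return max(1, start - flank), end + flank


def overlap_length(a_start, a_end, b_start, b_end):
    """Number of nucleotides shared by two 1-based inclusive intervals."""
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)


def is_hit(motif, feature, min_overlap=1):
    """True if the motif overlaps the feature by at least min_overlap nt."""
    return overlap_length(motif[0], motif[1], feature[0], feature[1]) >= min_overlap


# A feature at 100..200 extended by 50 nt on each side becomes 50..250.
feature = extend_feature(100, 200, 50)
motif = (248, 253)  # hypothetical 6-nt motif straddling the feature end

print(feature)
print(is_hit(motif, feature, min_overlap=1))
print(is_hit(motif, feature, min_overlap=4))
```

In this example the motif shares three nucleotides (248 to 250) with the extended feature, so it counts as a hit with the default overlap of 1 but not when at least 4 overlapping nucleotides are required.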