Overview of StructureFold
- StructureFold is a series of software packages that automates the prediction of RNA secondary structure for a single transcript or an entire transcriptome, with or without structural constraints derived from wet-bench experiments. The workflow has four steps: mapping the raw reads of the RNA structural data to the transcriptome, counting reverse transcriptase (RT) stops at each nucleotide, calculating the structural reactivity of each nucleotide, and predicting the RNA structures. RNA structure is predicted using the RNAstructure algorithm (http://rna.urmc.rochester.edu/RNAstructure.html) or the ViennaRNA package (http://www.tbi.univie.ac.at/RNA/). Please cite: Tang Y, Bouvier E, Kwok CK, Ding Y, Nekrutenko A, Bevilacqua PC, Assmann SM. StructureFold: Genome-wide RNA secondary structure mapping and reconstruction in vivo. Bioinformatics, in press.
Function
- Iterative Mapping maps the raw reads of RNA structural data to the reference transcriptome using Bowtie (v0.12.8). Users can trim each read from either end, and the trimmed reads are mapped to the reference transcriptome iteratively.
Input:
- Sequence file type (FASTA/FASTQ)
- Sequence file (FASTA/FASTQ format)
- Reference file (FASTA format) to which the reads are mapped
- “Shift” (The number of nucleotides trimmed from the 3' end of the reads before each round of mapping)
- “Length” (The minimum read length, after trimming, that is still submitted for mapping)
- [Optional]
- Bowtie mapping flags (options) [Default: -v 0 -a --best --strata] (The -v flag sets the number of allowed mismatches. Use the -5/-3 flags to trim nucleotides from the 5'/3' end of the reads)
Output:
- A sorted .bam file containing all mapped reads
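
How “Shift” and “Length” interact can be illustrated with a minimal Python sketch. It is not the tool's implementation. It assumes that a read is retried only when it fails to map, that trimming happens at the 3' end, and that mapping stops once the read is shorter than “Length”. The function names iterative_map and make_exact_mapper are hypothetical, and an exact substring search stands in for Bowtie.

```python
from typing import Callable, Dict, Optional, Tuple


def make_exact_mapper(transcripts: Dict[str, str]) -> Callable[[str], Optional[str]]:
    """Toy stand-in for Bowtie: report the first transcript containing the read exactly."""
    def try_map(seq: str) -> Optional[str]:
        for name, reference in transcripts.items():
            if seq in reference:
                return name
        return None
    return try_map


def iterative_map(
    read: str,
    shift: int,
    min_length: int,
    try_map: Callable[[str], Optional[str]],
) -> Optional[Tuple[str, str]]:
    """Return (transcript name, mapped sequence), or None if the read never maps.

    Assumption: an unmapped read loses `shift` nucleotides from its 3' end
    before each new round, and rounds stop below `min_length`.
    """
    if shift < 1 or min_length < 1:
        raise ValueError("shift and min_length must be at least 1")
    current = read
    while len(current) >= min_length:
        hit = try_map(current)
        if hit is not None:
            return hit, current
        current = current[:-shift]  # trim the 3' end, then try again
    return None


# Hypothetical reference and read, for illustration only.
mapper = make_exact_mapper({"tx1": "ACGTACGTTTGACCA"})
result = iterative_map("ACGTACGTGG", shift=2, min_length=6, try_map=mapper)
print(result)
```

In this toy example the full 10-nt read does not match, so 2 nt are removed from the 3' end and the remaining 8 nt are mapped in the second round. A read that still fails to map when it drops below the minimum length is discarded.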