Mercurial > repos > devteam > data_manager_hisat_index_builder
diff data_manager/hisat_index_builder.xml @ 0:ba11fef120cd draft default tip
planemo upload for repository https://github.com/galaxyproject/tools-devteam/tree/master/data_managers/data_manager_hisat_index_builder commit 5a7365750648c26206f05ac7956936c243c2b980
| author | devteam |
|---|---|
| date | Sat, 13 Jun 2015 08:27:38 -0400 |
| parents | |
| children |
line wrap: on
line diff
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/data_manager/hisat_index_builder.xml Sat Jun 13 08:27:38 2015 -0400 @@ -0,0 +1,67 @@ +<tool id="hisat_index_builder_data_manager" name="HISAT index" tool_type="manage_data" version="1.0.0"> + <description>builder</description> + <requirements> + <requirement type="package" version="0.1.6">hisat</requirement> + </requirements> + <stdio> + <exit_code range=":-1" /> + <exit_code range="1:" /> + </stdio> + <command interpreter="python">hisat_index_builder.py "${out_file}" --fasta_filename "${all_fasta_source.fields.path}" --fasta_dbkey "${all_fasta_source.fields.dbkey}" --fasta_description "${all_fasta_source.fields.name}" --data_table_name "hisat_indexes"</command> + <inputs> + <param label="Source FASTA Sequence" name="all_fasta_source" type="select"> + <options from_data_table="all_fasta" /> + </param> + <param label="Name of sequence" name="sequence_name" type="text" value="" /> + <param label="ID for sequence" name="sequence_id" type="text" value="" /> + </inputs> + <outputs> + <data format="data_manager_json" name="out_file" /> + </outputs> + <help> +<![CDATA[ +.. class:: infomark + +**Notice:** If you leave name, description, or id blank, it will be generated automatically. + +What is HISAT? +-------------- + +`HISAT <http://ccb.jhu.edu/software/hisat>`__ is a fast and sensitive +spliced alignment program. As part of HISAT, we have developed a new +indexing scheme based on the Burrows-Wheeler transform +(`BWT <http://en.wikipedia.org/wiki/Burrows-Wheeler_transform>`__) and +the `FM index <http://en.wikipedia.org/wiki/FM-index>`__, called +hierarchical indexing, that employs two types of indexes: (1) one global +FM index representing the whole genome, and (2) many separate local FM +indexes for small regions collectively covering the genome. Our +hierarchical index for the human genome (about 3 billion bp) includes +~48,000 local FM indexes, each representing a genomic region of +~64,000bp. As the basis for non-gapped alignment, the FM index is +extremely fast with a low memory footprint, as demonstrated by +`Bowtie <http://bowtie-bio.sf.net>`__. In addition, HISAT provides +several alignment strategies specifically designed for mapping different +types of RNA-seq reads. All these together, HISAT enables extremely fast +and sensitive alignment of reads, in particular those spanning two exons +or more. As a result, HISAT is much faster >50 times than +`TopHat2 <http://ccb.jhu.edu/software/tophat>`__ with better alignment +quality. Although it uses a large number of indexes, the memory +requirement of HISAT is still modest, approximately 4.3 GB for human. +HISAT uses the `Bowtie2 <http://bowtie-bio.sf.net/bowtie2>`__ +implementation to handle most of the operations on the FM index. In +addition to spliced alignment, HISAT handles reads involving indels and +supports a paired-end alignment mode. Multiple processors can be used +simultaneously to achieve greater alignment speed. HISAT outputs +alignments in `SAM <http://samtools.sourceforge.net/SAM1.pdf>`__ format, +enabling interoperation with a large number of other tools (e.g. +`SAMtools <http://samtools.sourceforge.net>`__, +`GATK <http://www.broadinstitute.org/gsa/wiki/index.php/The_Genome_Analysis_Toolkit>`__) +that use SAM. HISAT is distributed under the `GPLv3 +license <http://www.gnu.org/licenses/gpl-3.0.html>`__, and it runs on +the command line under Linux, Mac OS X and Windows. +]]> + </help> + <citations> + <citation type="doi">10.1038/nmeth.3317</citation> + </citations> +</tool>
