Extract features (version 1.0.0)

What it does

This tool extracts selected features from GFF data.


Example

Selecting the "promoter" feature from the following GFF data:

chr22  GeneA  enhancer  10000000  10001000  500  +  .  TGA
chr22  GeneA  promoter  10010000  10010100  900  +  .  TGA
chr22  GeneB  promoter  10020000  10025000  400  -  .  TGB
chr22  GeneB  CCDS2220  10030000  10065000  800  -  .  TGB

will produce the following output:

chr22  GeneA  promoter  10010000  10010100  900  +  .  TGA
chr22  GeneB  promoter  10020000  10025000  400  -  .  TGB
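
The selection above can be sketched in a few lines of Python. This is a minimal illustration only, not the tool's actual implementation: the function name `extract_features` is hypothetical, and the sketch assumes tab-separated input with the feature type in the third column.

```python
def extract_features(lines, wanted):
    """Yield GFF lines whose third column (feature) is in `wanted`.

    Hypothetical helper for illustration; assumes tab-separated fields.
    """
    wanted = set(wanted)
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment/header lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # The feature type is the third field (index 2).
        if len(fields) >= 3 and fields[2] in wanted:
            yield line


gff = [
    "chr22\tGeneA\tenhancer\t10000000\t10001000\t500\t+\t.\tTGA",
    "chr22\tGeneA\tpromoter\t10010000\t10010100\t900\t+\t.\tTGA",
    "chr22\tGeneB\tpromoter\t10020000\t10025000\t400\t-\t.\tTGB",
    "chr22\tGeneB\tCCDS2220\t10030000\t10065000\t800\t-\t.\tTGB",
]

selected = list(extract_features(gff, ["promoter"]))
for line in selected:
    print(line)
```

Passing several feature names, for example `["promoter", "enhancer"]`, keeps every line that matches any of them, which mirrors selecting multiple features in the tool.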

About formats

GFF (General Feature Format) is a format for describing genes and other features associated with DNA, RNA, and protein sequences. GFF lines have nine tab-separated fields:

1. seqname - Must be a chromosome or scaffold.
2. source - The program that generated this feature.
3. feature - The name of this type of feature. Some examples of standard feature types are "CDS", "start_codon", "stop_codon", and "exon".
4. start - The starting position of the feature in the sequence. The first base is numbered 1.
5. end - The ending position of the feature (inclusive).
6. score - A score between 0 and 1000. If there is no score value, enter ".".
7. strand - Valid entries include '+', '-', or '.' (for don't know/care).
8. frame - If the feature is a coding exon, frame should be a number from 0 to 2 that represents the reading frame of the first base. If the feature is not a coding exon, the value should be '.'.
9. group - All lines with the same group are linked together into a single item.
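
As a rough illustration of the nine fields listed above, the following Python sketch parses one GFF line into a named record. The names `GffRecord`, `parse_gff_line`, and `feature_length` are hypothetical and not part of Galaxy. The sketch assumes tab-separated input and treats "." in the score and frame fields as a missing value.

```python
from typing import NamedTuple, Optional


class GffRecord(NamedTuple):
    seqname: str
    source: str
    feature: str
    start: int               # 1-based
    end: int                 # inclusive
    score: Optional[float]   # None when the field is "."
    strand: str              # "+", "-", or "."
    frame: Optional[int]     # 0, 1, or 2; None when the field is "."
    group: str


def parse_gff_line(line: str) -> GffRecord:
    """Split one tab-separated GFF line into its nine fields."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 fields, got {len(fields)}")
    seqname, source, feature, start, end, score, strand, frame, group = fields
    return GffRecord(
        seqname,
        source,
        feature,
        int(start),
        int(end),
        None if score == "." else float(score),
        strand,
        None if frame == "." else int(frame),
        group,
    )


def feature_length(rec: GffRecord) -> int:
    """Length in bases; start is 1-based and end is inclusive."""
    return rec.end - rec.start + 1


record = parse_gff_line(
    "chr22\tGeneA\tpromoter\t10010000\t10010100\t900\t+\t.\tTGA"
)
print(record.feature, feature_length(record))
```

Because positions are 1-based and the end is inclusive, the length is `end - start + 1`, so the promoter in this record spans 101 bases.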