Galaxy | Tool Preview

Count sequences in fastq files (version 0.1.1)
Columns to keep as identifiers for the summary report

Report fastq reads that contain sequences

This tool searches fastq reads for given nucleic acid query sequences. A typical use would be to compare the relative occurrence of two sequences.

NOTE: Only complete matches to the sequences are reported. Reads that match a sequence only partially at their ends are not counted.
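
The matching rule described above can be sketched in Python. This is a minimal illustration, not the tool's actual implementation: the function names are hypothetical, the fastq input is assumed to use plain 4-line records, and the search of the reverse-complement strand is an assumption based on the "strand that matched" column of the count details output.

```python
from typing import Dict, Iterable, Iterator

# Translation table for complementing nucleotides (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a nucleic acid sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def read_sequences(fastq_lines: Iterable[str]) -> Iterator[str]:
    """Yield the sequence line (2nd of every 4 lines) of each fastq record."""
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:
            yield line.strip()


def count_matches(fastq_lines: Iterable[str], query: str) -> Dict[str, int]:
    """Count reads that contain the complete query on either strand.

    A read that only overlaps part of the query at its end is not counted,
    because the whole query must occur as a substring of the read.
    """
    query = query.upper()
    rc_query = reverse_complement(query)
    counts = {"+": 0, "-": 0}
    for seq in read_sequences(fastq_lines):
        seq = seq.upper()
        if query in seq:
            counts["+"] += 1
        elif rc_query in seq:  # assumption: the opposite strand is also searched
            counts["-"] += 1
    return counts
```

In this sketch a read ending in `GTT` would not be counted for the query `GTTG`, which mirrors the note about partial matches at the ends of reads.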

INPUTS

fastq files
  • the sequence files to search
query file - a tabular file
  • that contains a column of "query" sequences to match in fastq reads
  • it may contain a second "comparison" sequence column to match
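
As an illustration of the query file layout, the sketch below reads a tab-separated file with Python's `csv` module. The function name, the column positions, and the example rows are hypothetical; in the tool itself the columns are chosen through the form parameters.

```python
import csv
import io


def read_queries(handle, query_col=0, comparison_col=None, id_cols=()):
    """Read query rows from a tabular (tab-separated) file handle.

    Column indexes are 0-based and purely illustrative. Each returned row
    keeps the 1-based line number from the query file, matching the first
    column of the summary report.
    """
    rows = []
    for line_number, fields in enumerate(csv.reader(handle, delimiter="\t"), start=1):
        if not fields:
            continue  # skip blank lines but keep the line numbering
        rows.append({
            "line": line_number,
            "ids": [fields[i] for i in id_cols],
            "query": fields[query_col],
            "comparison": fields[comparison_col] if comparison_col is not None else None,
        })
    return rows


# Hypothetical two-line query file: identifier, query, comparison.
example = io.StringIO("geneA\tACGT\tACGA\ngeneB\tGGCC\tGGCA\n")
queries = read_queries(example, query_col=1, comparison_col=2, id_cols=(0,))
```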

OUTPUTS

summary report - a tabular file
  • the first column is the line number from the query file
  • columns from the query file selected as identifiers
  • the count of fastq entries for the query sequence
  • the count of fastq entries for the comparison sequence (if selected)
  • the fraction of query sequence matches relative to the total of query and comparison matches
count details - an optional tabular file of match counts
  • the fastq name
  • the line number from the query file
  • the sequence that matched
  • the label of the sequence that matched
  • the strand that matched
  • the number of reads that matched
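
The fraction column of the summary report is the query count divided by the sum of the query and comparison counts. A minimal sketch follows; the function name is hypothetical, and returning `None` when neither sequence matched is an assumption, since the help text does not say how that case is reported.

```python
from typing import Optional


def query_fraction(query_count: int, comparison_count: int) -> Optional[float]:
    """Fraction of query matches among all query and comparison matches."""
    total = query_count + comparison_count
    if total == 0:
        return None  # assumption: the fraction is undefined with no matches
    return query_count / total
```

For example, 30 query matches and 10 comparison matches give a fraction of 30 / (30 + 10) = 0.75.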