Galaxy | Tool Preview

Aggregate datapoints (version 1.1.3)

This tool currently has cached data only for genome builds hg16, hg17 and hg18. However, you may use your own data point (wiggle) datasets, such as those available from UCSC. If your own data point file does not appear as an option, make sure that your history items are all assigned the same genome build.

This tool assumes that the input dataset is in interval format and contains at least a chrom column, a start column and an end column. These three columns can be interspersed among any number of other data columns.


TIP: Computing summary information may throw exceptions if the data type (e.g., string, integer) of a column value is not appropriate for the computation (e.g., attempting numerical calculations on strings). If an exception is thrown when computing summary information for a line, that line is skipped as invalid for the computation. The number of skipped invalid lines is reported in the resulting history item as a "Data issue".
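
The skip-and-count behaviour described in the tip can be sketched roughly as follows. This is only an illustration, not the tool's actual code: the function name `parse_intervals`, the tab-separated input, and the zero-based column indices are assumptions made for the example.

```python
def parse_intervals(lines, chrom_col=1, start_col=2, end_col=3):
    """Parse interval lines; return (valid_rows, skipped_count).

    Hypothetical helper: column indices are zero-based and lines are
    assumed to be tab-separated.
    """
    valid = []
    skipped = 0
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        try:
            chrom = fields[chrom_col]
            start = int(fields[start_col])  # raises ValueError on non-numeric text
            end = int(fields[end_col])
        except (IndexError, ValueError):
            # The line is invalid for the computation: skip it and count it.
            skipped += 1
            continue
        valid.append((chrom, start, end, fields))
    return valid, skipped
```

In this sketch the returned count plays the role of the number reported as a "Data issue".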


Syntax

This tool appends columns of summary information for each interval matched against a selected dataset. For each interval, the average, minimum and maximum of the data falling within the interval are computed.
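
A minimal sketch of the per-interval computation, under stated assumptions: scores are held in a hypothetical in-memory mapping `scores[chrom][position]`, intervals are treated as zero-based and half-open (start included, end excluded), and an interval with no data gets NaN in all three columns. None of these details are confirmed by the tool itself, and the name `aggregate_interval` is illustrative only.

```python
def aggregate_interval(fields, chrom_col, start_col, end_col, scores):
    """Return the row's fields with avg, min and max appended.

    `scores` is an assumed structure: {chrom: {position: value}}.
    """
    chrom = fields[chrom_col]
    start = int(fields[start_col])
    end = int(fields[end_col])
    chrom_scores = scores.get(chrom, {})
    # Collect every data point that falls inside [start, end).
    values = [chrom_scores[p] for p in range(start, end) if p in chrom_scores]
    if values:
        summary = [sum(values) / len(values), min(values), max(values)]
    else:
        # Assumption: no data in the interval yields NaN for each statistic.
        summary = [float("nan")] * 3
    return fields + summary
```

For example, with `scores = {"chr1": {10: 0.5, 11: 1.0, 12: 0.0}}`, the row `["x", "chr1", "10", "13", "y"]` would gain the three values 0.5, 0.0 and 1.0.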


Example

If your original data has the following format:

other1 chrom start end other2

and you choose to aggregate phastCons scores, your output will look like this:

other1 chrom start end other2 avg min max

where: