Dataset formats
The input dataset is in the MasterVar format provided by the Complete Genomics analysis process (Galaxy considers this to be tabular, but it must have the columns specified for MasterVar). The output dataset is in pgSnp format.
What it does
This converts a Complete Genomics MasterVar file to pgSnp format, so it can be viewed in browsers or used with the phenotype association and interval operations tools. Positions homozygous for the reference are skipped.
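The per-row logic can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code. It assumes each MasterVar row has already been parsed into a dictionary, and the key names (`varType`, `allele1Seq`, `allele1ReadCount`, `allele1Score`, and so on) are assumptions made for this example. The handling of homozygous versus heterozygous calls is inferred from the sample input and output in the Examples section; no-calls and half-calls are not handled.

```python
from typing import Dict, Optional


def mastervar_row_to_pgsnp(row: Dict[str, str]) -> Optional[str]:
    """Return one tab-separated pgSnp line, or None if the row is skipped.

    The dictionary keys are hypothetical names chosen for this sketch.
    """
    # Positions homozygous for the reference are skipped.
    if row["varType"] == "ref":
        return None

    a1, a2 = row["allele1Seq"], row["allele2Seq"]
    if a1 == a2:
        # Homozygous variant: report a single allele (inferred from the example).
        alleles = [a1]
        counts = [row["allele1ReadCount"]]
        scores = [row["allele1Score"]]
    else:
        # Heterozygous: report both alleles with their counts and scores.
        alleles = [a1, a2]
        counts = [row["allele1ReadCount"], row["allele2ReadCount"]]
        scores = [row["allele1Score"], row["allele2Score"]]

    fields = [
        row["chromosome"],
        row["begin"],
        row["end"],
        "/".join(alleles),
        str(len(alleles)),
        ",".join(counts),
        ",".join(scores),
    ]
    return "\t".join(fields)


# Rows built by hand from the sample input shown in the Examples section.
hom_snp = {
    "chromosome": "chr1", "begin": "41980", "end": "41981", "varType": "snp",
    "allele1Seq": "G", "allele2Seq": "G",
    "allele1Score": "76", "allele2Score": "97",
    "allele1ReadCount": "1", "allele2ReadCount": "1",
}
hom_ref = {
    "chromosome": "chr1", "begin": "41981", "end": "42198", "varType": "ref",
    "allele1Seq": "=", "allele2Seq": "=",
}
het_snp = {
    "chromosome": "chr1", "begin": "53205", "end": "53206", "varType": "snp",
    "allele1Seq": "C", "allele2Seq": "G",
    "allele1Score": "93", "allele2Score": "127",
    "allele1ReadCount": "7", "allele2ReadCount": "30",
}

for r in (hom_snp, hom_ref, het_snp):
    print(mastervar_row_to_pgsnp(r))
```

Working on named fields rather than column positions keeps the sketch independent of the exact MasterVar column layout, which may differ between versions of the Complete Genomics pipeline.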
Examples
input MasterVar file:
934 2 chr1 41980 41981 hom snp A G G 76 97 dbsnp.86:rs806721 425 1 1 1 2 -170 ERVL-E-int:ERVL:47.4 2 1.17 N
935 2 chr1 41981 42198 hom ref = = = -170 1.17 N
1102 2 chr1 53205 53206 het-ref snp G C G 93 127 dbsnp.100:rs2854676 477 7 30 0 37 -127 2 1.17 N
etc.
output:
chr1 41980 41981 G 1 1 76
chr1 51672 51673 C 1 1 53
chr1 52237 52238 G 1 7 63
chr1 53205 53206 C/G 2 7,30 93,127
etc.
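To make the meaning of the output columns concrete, here is a small sketch that reads one pgSnp line into named fields. It follows the usual pgSnp column order (chromosome, start, end, alleles separated by `/`, allele count, comma-separated allele frequencies or read counts, comma-separated allele scores). The `PgSnpRecord` class and `parse_pgsnp_line` function are hypothetical names introduced only for this illustration.

```python
from typing import List, NamedTuple


class PgSnpRecord(NamedTuple):
    """One pgSnp line; the field names are chosen for this sketch."""
    chrom: str
    start: int
    end: int
    alleles: List[str]
    allele_count: int
    allele_freqs: List[float]
    allele_scores: List[float]


def parse_pgsnp_line(line: str) -> PgSnpRecord:
    # Split on whitespace; a pgSnp line has seven columns.
    chrom, start, end, name, count, freqs, scores = line.split()[:7]
    return PgSnpRecord(
        chrom=chrom,
        start=int(start),
        end=int(end),
        alleles=name.split("/"),
        allele_count=int(count),
        allele_freqs=[float(x) for x in freqs.split(",")],
        allele_scores=[float(x) for x in scores.split(",")],
    )


record = parse_pgsnp_line("chr1 53205 53206 C/G 2 7,30 93,127")
print(record.alleles, record.allele_freqs, record.allele_scores)
```

For a heterozygous position such as the last example line, the alleles, frequencies, and scores lists each have two entries, one per allele, in the same order.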