Contra Copy number analysis (version 1.0.0)

Reference
http://contra-cnv.sourceforge.net/

What it does

CONTRA is a tool for detecting copy number variation (CNV) in targeted resequencing data, such as whole-exome capture data. CONTRA calls copy number gains and losses for each target region. Its key strategies include the use of base-level log-ratios to remove GC-content bias, correction for an imbalanced library size effect on log-ratios, and estimation of log-ratio variation via binning and interpolation. It takes standard alignment formats (BAM/SAM) as input and writes its output in variant call format (VCF 4.0) for easy integration with other next-generation sequencing analysis packages.


Required Parameters

-t, --target         Target region definition file [BED format]

-s, --test           Alignment file for the test sample [BAM/SAM]

-c, --control        Alignment file for the control sample
                     [BAM/SAM/BED – baseline file]

--bed                Must be supplied when the control is a
                     baseline file in BED format.

-f, --fasta          Reference genome [FASTA]

-o, --outFolder      Name (and path) of the folder in which to store the
                     analysis output. This folder will be newly created;
                     an error occurs if the folder already exists.
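
As an illustration of how the required parameters fit together, the following minimal sketch assembles a command line in Python. The script name `contra.py`, the helper `build_contra_command`, and all file names are assumptions made for this example; only the option names come from the list above.

```python
import shlex


def build_contra_command(target, test, control, fasta, out_folder,
                         control_is_baseline=False, script="contra.py"):
    """Assemble an argument list for one run from the required parameters.

    `script` is an assumed entry-point name; adjust it to your installation.
    """
    cmd = [
        "python", script,
        "--target", target,        # target regions [BED]
        "--test", test,            # test sample alignment [BAM/SAM]
        "--control", control,      # control alignment or baseline file
        "--fasta", fasta,          # reference genome [FASTA]
        "--outFolder", out_folder, # must not exist yet
    ]
    # --bed has to accompany a control given as a BED baseline file.
    if control_is_baseline:
        cmd.append("--bed")
    return cmd


# Hypothetical file names, for illustration only.
cmd = build_contra_command("targets.bed", "test_sample.bam", "baseline.bed",
                           "reference.fa", "contra_out",
                           control_is_baseline=True)
print(shlex.join(cmd))
```

Building the command as a list rather than a single string keeps paths with spaces intact if the list is later passed to `subprocess.run`.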

Optional Parameters

--numBin              Number of bins used to group the regions. Users can
                      specify multiple experiments with different numbers
                      of bins (comma-separated). [Default: 20]

--minReadDepth        Threshold for the minimum read depth of each base
                      (see Step 2 in CONTRA workflow) [Default: 10]

--minNBases           Threshold for the minimum number of bases in each
                      target region (see Step 2 in CONTRA workflow)
                      [Default: 10]
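
The interaction of the two thresholds above can be sketched as follows. This is an illustrative reading, not CONTRA's actual code: it assumes that bases below `--minReadDepth` are discarded first and that a region is kept only if at least `--minNBases` bases remain. The function name and the use of `>=` at the boundaries are assumptions.

```python
def filter_region(base_depths, min_read_depth=10, min_n_bases=10):
    """Return the per-base depths that pass, or None if too few bases remain.

    Assumed semantics: a base passes when depth >= min_read_depth, and a
    region passes when at least min_n_bases bases pass.
    """
    kept = [d for d in base_depths if d >= min_read_depth]
    return kept if len(kept) >= min_n_bases else None


well_covered = [30] * 12 + [4] * 3    # 12 bases reach depth 10
poorly_covered = [30] * 6 + [4] * 9   # only 6 bases reach depth 10

print(filter_region(well_covered) is not None)    # region kept
print(filter_region(poorly_covered) is not None)  # region dropped
```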

--sam                 Specify if the test and control samples are in
                      SAM format. [Default: False] (Samples are treated
                      as BAM by default)

--bed                 If specified, control will be a baseline file in
                      BED format. [Default: False]
                      Please refer to the Baseline Script section for
                      instructions on how to create baseline files from a
                      set of BAM files. A set of baseline files from
                      different platforms is also provided on the CONTRA
                      download page.

--pval                The p-value threshold for filtering, based on adjusted
                      p-values. Only regions that pass this threshold will
                      be included in the VCF file. [Default: 0.05]

--sampleName          The name to be prepended to the default output name.
                      By default, nothing is prepended.

--nomultimapped       Remove multi-mapped reads
                      (using SAMtools with mapping quality > 0).
                      [Default: False]

-p, --plot            If specified, plots of the log-ratio distribution for each
                      bin will be included in the output folder [Default: False]

--minExon             Minimum number of exons in one bin (a bin containing
                      fewer exons than this number will be merged into the
                      adjacent bins) [Default: 2000]
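
The merging behaviour described for `--minExon` can be pictured with the sketch below. It is one possible interpretation, not CONTRA's implementation: it assumes the bins are ordered, that an undersized bin absorbs the bin that follows it, and that an undersized final bin joins the one before it. The function name and the toy data are hypothetical.

```python
def merge_small_bins(bins, min_exon=2000):
    """Merge bins holding fewer than min_exon exons into a neighbouring bin."""
    merged = []
    for exons in bins:
        if merged and len(merged[-1]) < min_exon:
            # The previous bin is still too small: absorb this bin into it.
            merged[-1] = merged[-1] + list(exons)
        else:
            merged.append(list(exons))
    # A trailing bin that is still too small joins the bin before it.
    if len(merged) > 1 and len(merged[-1]) < min_exon:
        last = merged.pop()
        merged[-1] = merged[-1] + last
    return merged


# Toy bins with 3, 1, 4 and 1 exons; a small threshold keeps the example short.
bins = [list(range(3)), list(range(1)), list(range(4)), list(range(1))]
merged_sizes = [len(b) for b in merge_small_bins(bins, min_exon=3)]
print(merged_sizes)
```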

--minControlRdForCall Minimum Control ReadDepth for call [Default: 5]

--minTestRdForCall    Minimum Test ReadDepth for call [Default: 0]

--minAvgForCall       Minimum average coverage for call [Default: 20]

--maxRegionSize       Maximum size of a target region; larger regions are
                      broken into smaller regions. The default,
                      maxRegionSize=0, means no breakdown. [Default: 0]

--targetRegionSize    Target region size for breakdown (if maxRegionSize
                      is non-zero) [Default: 200]
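
The relationship between `--maxRegionSize` and `--targetRegionSize` can be sketched as below. This is a hedged illustration only: it assumes a region is split when it is longer than `maxRegionSize`, and that the pieces are made roughly equal in length and close to `targetRegionSize`. CONTRA's exact splitting rule may differ, and the function name and coordinates are hypothetical.

```python
import math


def split_region(chrom, start, end, max_region_size=0, target_region_size=200):
    """Split the interval [start, end) into roughly equal pieces.

    max_region_size = 0 disables the breakdown, mirroring the default.
    """
    length = end - start
    if max_region_size <= 0 or length <= max_region_size:
        return [(chrom, start, end)]
    n_pieces = math.ceil(length / target_region_size)
    # Integer boundaries spread evenly across the region.
    bounds = [start + (length * i) // n_pieces for i in range(n_pieces + 1)]
    return [(chrom, bounds[i], bounds[i + 1]) for i in range(n_pieces)]


# A 500 bp region: untouched by default, split once maxRegionSize is set.
print(split_region("chr1", 1000, 1500))
print(split_region("chr1", 1000, 1500, max_region_size=300))
```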

-l, --largeDeletion   If specified, CONTRA will run large deletion analysis (CBS).
                      Users must have the DNAcopy R library installed to run
                      the analysis. [Default: False]

--smallSegment        CBS segment size for calling large variations [Default : 1]

--largeSegment        CBS segment size for calling large variations [Default : 25]

--lrCallStart         Start of the log-ratio range that will be used to call CNV
                      [Default: -0.3]

--lrCallEnd           End of the log-ratio range that will be used to call CNV
                      [Default: 0.3]

--passSize            Size of the exons that passed the p-value threshold,
                      compared with the original exon size [Default: 0.5]