CNVkit pipeline for detecting copy-number changes and allelic imbalances
.cnn, .cnr and .cns are tabular file extensions:
The reference .cnn file has the columns: chromosome, start, end, gene, GC content of the sequence region (gc), RepeatMasker-masked proportion of the sequence region (rmask), statistical spread or dispersion (spread), robust average of coverage depths (log2) and robust average of absolute-scale coverage depths without any bias corrections (depth).
Target and antitarget bin-level coverages (.cnn): chromosome, start, end, gene, log2 and depth.
Bin-level log2 ratios (.cnr): chromosome, start, end, gene, log2, depth and proportional weight or reliability (weight).
Segmented log2 ratios (.cns): chromosome, start, end, gene, log2, depth, weight and number of bins covered by the segment (probes).
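The column lists above can be turned into a small header check. This is only an illustrative sketch: `EXPECTED_COLUMNS`, `missing_columns` and the dictionary keys are hypothetical names rather than part of CNVkit, and the code assumes the files are tab-separated with a header row.

```python
import csv
import io

# Column names copied from the lists above. The dictionary keys are
# hypothetical labels chosen for this sketch, not CNVkit identifiers.
EXPECTED_COLUMNS = {
    "reference_cnn": ["chromosome", "start", "end", "gene",
                      "gc", "rmask", "spread", "log2", "depth"],
    "coverage_cnn": ["chromosome", "start", "end", "gene", "log2", "depth"],
    "cnr": ["chromosome", "start", "end", "gene", "log2", "depth", "weight"],
    "cns": ["chromosome", "start", "end", "gene",
            "log2", "depth", "weight", "probes"],
}


def missing_columns(handle, kind):
    """Return the expected columns that are absent from the header row."""
    header = next(csv.reader(handle, delimiter="\t"))
    return [name for name in EXPECTED_COLUMNS[kind] if name not in header]


# Made-up header for demonstration only: it lacks the 'weight' column.
demo = io.StringIO("chromosome\tstart\tend\tgene\tlog2\tdepth\n")
print(missing_columns(demo, "cnr"))  # prints ['weight']
```

The check only looks at column names, not their order, so extra columns in a file do not count as errors.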
Bin-level log2 ratios (.cnr)
Tabular file containing normalized log2 ratios for small genomic bins (divided regions of the genome). Used to detect raw copy number variations (CNVs) before segmentation.
| Column | Description |
| --- | --- |
| chromosome | Genomic chromosome (e.g., chr1, chrX). |
| start | Start position of the bin. |
| end | End position of the bin. |
| gene | Gene name(s) overlapping the bin (if applicable). |
| log2 | Normalized log2 ratio (sample coverage / reference coverage). |
| depth | Average read depth in the bin. |
| weight | Reliability weight of the bin (higher = more reliable). |
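As a rough illustration of how these columns might be used before segmentation, the sketch below reads .cnr-style rows and keeps bins whose log2 ratio is far from zero and whose weight is not too low. The function name `flag_bins`, the two cutoffs and the sample rows are assumptions made up for this example; they are not CNVkit defaults or real data.

```python
import csv
import io


def flag_bins(handle, log2_cutoff=0.5, min_weight=0.1):
    """Yield (chromosome, start, end, log2) for reliable bins with large |log2|.

    Both thresholds are arbitrary illustration values, not CNVkit settings.
    """
    for row in csv.DictReader(handle, delimiter="\t"):
        log2 = float(row["log2"])
        weight = float(row["weight"])
        if weight >= min_weight and abs(log2) >= log2_cutoff:
            yield row["chromosome"], int(row["start"]), int(row["end"]), log2


# Made-up rows for demonstration only.
demo = io.StringIO(
    "chromosome\tstart\tend\tgene\tlog2\tdepth\tweight\n"
    "chr1\t1000\t1500\tGENE_A\t0.02\t120.5\t0.9\n"
    "chr1\t1500\t2000\tGENE_A\t0.85\t210.0\t0.8\n"
    "chr1\t2000\t2500\tGENE_B\t-1.2\t40.0\t0.05\n"
)
for hit in flag_bins(demo):
    print(hit)
```

Here the first bin is dropped because its log2 ratio is near zero, and the third because its weight falls below the chosen minimum.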
Segmented log2 ratios (.cns)
Tabular file with smoothed, merged segments of stable copy number, derived from the .cnr file. Represents final CNV calls.
| Column | Description |
| --- | --- |
| chromosome | Genomic chromosome (e.g., chr1, chrX). |
| start | Start position of the segment. |
| end | End position of the segment. |
| gene | Gene(s) overlapping the segment. |
| log2 | Mean log2 ratio of the segment. |
| probes | Number of bins covered by the segment. |
| depth | Average read depth. |
| weight | Reliability weight. |
| p_value | Statistical confidence (lower = more significant). |
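A segment's log2 ratio can be translated into an approximate absolute copy number. The sketch below uses the simple relation copy number ≈ ploidy × 2^log2, which assumes a pure sample with no normal-cell contamination and no subclonal populations. The function name `estimate_copy_number` is hypothetical, and this is a simplification for illustration, not a description of how CNVkit itself calls copy number.

```python
def estimate_copy_number(log2_ratio, ploidy=2):
    """Approximate absolute copy number from a segment's log2 ratio.

    Assumes a pure sample: no correction for tumor purity,
    subclonality, or sex chromosomes is applied.
    """
    return round(ploidy * 2 ** log2_ratio)


# Illustrative values: neutral, single-copy loss, and two gains.
for value in (0.0, -1.0, 0.58, 1.0):
    print(value, estimate_copy_number(value))
```

With a diploid baseline, a log2 ratio of 0 corresponds to two copies, -1 to one copy, and +1 to four copies.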
Copy Number Reference Profile (.cnn)
Tabular file defining the reference baseline built from control samples (e.g., normal samples). Used to normalize test samples.
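| Column | Description |
| --- | --- |
| chromosome | Genomic chromosome (e.g., chr1, chrX). |
| start | Start position of the bin. |
| end | End position of the bin. |
| gene | Gene name(s) (if applicable). |
| gc | GC content of the sequence region. |
| rmask | RepeatMasker-masked proportion of the sequence region. |
| log2 | Robust average of coverage depths on a log2 scale, computed across control samples. |
| depth | Robust average of absolute-scale coverage depths across control samples, without bias corrections. |
| spread | Statistical spread (dispersion) of coverage across control samples. |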
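The normalization idea can be sketched in a few lines: the log2 ratio of a bin is the log2 of sample coverage divided by reference coverage, which is the same as subtracting the reference log2 value from the sample log2 value. The function name `log2_ratio` and the depth values are made up for this example, and the sketch deliberately leaves out any bias corrections or centering that a real pipeline would apply.

```python
import math


def log2_ratio(sample_depth, reference_depth):
    """Return log2(sample / reference) for one bin.

    Simplified: no bias correction or centering is applied.
    """
    return math.log2(sample_depth / reference_depth)


# Made-up depths: the sample has twice the reference coverage in this bin.
sample_depth, reference_depth = 200.0, 100.0
ratio = log2_ratio(sample_depth, reference_depth)

# The same quantity expressed as a difference of log2 coverages.
difference = math.log2(sample_depth) - math.log2(reference_depth)

print(ratio)
print(math.isclose(ratio, difference))
```

A bin with the same coverage as the reference gives a ratio of 0, which is why copy-neutral regions cluster around zero.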
Target and Antitarget Bin-level Coverages (.cnn)
Two intermediate tabular files containing raw coverage counts for target regions (captured regions) and antitarget regions (background).
The Target Coverage File (e.g., sample.targetcoverage.cnn) and the Antitarget Coverage File (e.g., sample.antitargetcoverage.cnn) share the same columns: chromosome, start, end, gene, log2 and depth.