What CopyRighter does
The genomes of Bacteria and Archaea often contain several copies of the 16S rRNA gene. This can significantly bias estimates of microbial community composition obtained from 16S rRNA amplicons or microarrays, as well as estimates of total abundance obtained from 16S rRNA quantitative PCR, since species with many copies contribute disproportionately more 16S amplicons than species with a single copy. Fortunately, the copy number of unsequenced microbial species can be inferred from that of close relatives that have been fully sequenced. Using this information, CopyRighter corrects microbial relative abundance by weighting each species by the inverse of its estimated copy number.
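The inverse-copy-number weighting described above can be illustrated with a minimal Python sketch. This is not CopyRighter's own code: the function name, the species labels and the copy numbers are hypothetical, and the sketch assumes the weighted values are renormalized to sum to 1.

```python
def correct_relative_abundance(raw_abundance, copy_number):
    """Weight each species by 1 / copy number, then renormalize to sum to 1.

    Hypothetical helper for illustration only, not part of CopyRighter.
    """
    weighted = {sp: raw_abundance[sp] / copy_number[sp] for sp in raw_abundance}
    total = sum(weighted.values())
    return {sp: value / total for sp, value in weighted.items()}


# Made-up example: two species that each yield half of the 16S amplicons.
raw = {"species_A": 0.5, "species_B": 0.5}
copies = {"species_A": 1, "species_B": 4}  # assumed 16S rRNA gene copies per genome

corrected = correct_relative_abundance(raw, copies)
for species, abundance in corrected.items():
    print(species, round(abundance, 3))
```

In this made-up case, species_B carries four times as many copies as species_A, so its share of the community drops well below its share of the amplicons once the weights are applied.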
In metagenomic surveys, a similar bias arises because genome length varies between species. CopyRighter can correct for this as well.
In all cases, a community file is used as input, and a corrected community file is generated in which relative abundances have been corrected for the chosen trait (16S rRNA gene copy number or genome length). Total abundance can optionally be provided; it is then corrected and combined with the relative abundance estimates to give the absolute abundance of each species. The average trait value in each community is also reported on standard output.
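One way to picture how total abundance and relative abundance could be combined is sketched below in Python. The formulas are assumptions made for illustration, not a description of CopyRighter's internals: the average trait value is taken to be the mean of the trait weighted by corrected relative abundance, and the measured total (for example, 16S copies from quantitative PCR) is divided by that average. All names and numbers are hypothetical.

```python
def average_trait(corrected_rel, trait):
    """Community-average trait value, weighted by corrected relative abundance.

    Assumed definition, for illustration only.
    """
    return sum(corrected_rel[sp] * trait[sp] for sp in corrected_rel)


def absolute_abundance(corrected_rel, trait, measured_total):
    """Correct the measured total, then split it among species."""
    corrected_total = measured_total / average_trait(corrected_rel, trait)
    return {sp: rel * corrected_total for sp, rel in corrected_rel.items()}


# Made-up inputs: corrected relative abundances and 16S copy numbers.
corrected_rel = {"species_A": 0.8, "species_B": 0.2}
copies = {"species_A": 1, "species_B": 4}
measured_16s_copies = 1600.0  # hypothetical qPCR measurement

mean_copies = average_trait(corrected_rel, copies)
cells = absolute_abundance(corrected_rel, copies, measured_16s_copies)
print("average copy number:", round(mean_copies, 3))
for species, count in cells.items():
    print(species, round(count, 1))
```

As a consistency check on these made-up numbers, multiplying each species' resulting cell count by its copy number and summing gives back the 1600 measured copies.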