1.1.2221.01umi_tools10.1101/gr.209601.116
@misc{githubUMI-tools,
title = {UMI-tools},
publisher = {GitHub},
journal = {GitHub repository},
url = {https://github.com/CGATOxford/UMI-tools},
}
fastqsanger,fastqsanger.gz,fastqillumina,fastqillumina.gz,fastqsolexa,fastqsolexa.gzumi-tools 'input.bam' &&
samtools index -b 'input.bam' &&
#set $input_file = 'input.bam'
#else:
ln -sf '${input}' 'input.bam' &&
ln -sf '$input.metadata.bam_index' 'input.bam.bai' &&
#set $input_file = 'input.bam'
#end if
]]>
(?P<cell_1>.{8,12})(?P<discard_1>GAGTGATTGCTTGTGACGCCTT)(?P<cell_2>.{8})(?P<umi_1>.{6})T{3}.*
With this pattern, only reads that have a 3' T-tail and
``GAGTGATTGCTTGTGACGCCTT`` in the correct position are retained,
yielding two cell barcodes of 8-12 bp and 8 bp, respectively, and a
6 bp UMI.
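
As a rough illustration of how such a pattern partitions a read, the sketch below applies it with Python's standard ``re`` module. The group names ``cell_1``, ``cell_2`` and ``umi_1`` follow the UMI-tools naming convention, while the helper ``parse_read``, the example read, and the choice to concatenate the two cell barcode parts are assumptions made for this example only.

```python
import re

# The pattern described above: two cell barcode parts, a fixed linker
# that is discarded, a 6 bp UMI, and a 3' T-tail.
PATTERN = re.compile(
    r"(?P<cell_1>.{8,12})"
    r"(?P<discard_1>GAGTGATTGCTTGTGACGCCTT)"
    r"(?P<cell_2>.{8})"
    r"(?P<umi_1>.{6})"
    r"T{3}.*"
)


def parse_read(sequence):
    """Return (cell_barcode, umi), or None if the read does not match.

    Hypothetical helper: joining cell_1 and cell_2 into one barcode is an
    assumption of this sketch.
    """
    match = PATTERN.match(sequence)
    if match is None:
        return None
    cell_barcode = match.group("cell_1") + match.group("cell_2")
    return cell_barcode, match.group("umi_1")


# A made-up read: 8 bp barcode, linker, 8 bp barcode, 6 bp UMI, T-tail, insert.
read = "AAAACCCC" + "GAGTGATTGCTTGTGACGCCTT" + "GGGGTTTT" + "ACGTAC" + "TTT" + "GATTACA"
print(parse_read(read))  # ('AAAACCCCGGGGTTTT', 'ACGTAC')
```

A read lacking the T-tail at the expected position makes ``parse_read`` return ``None``, which corresponds to the read being discarded.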
You can also specify fuzzy matching to allow errors. For example, if
the discard group above is specified as below, matches with up to
2 errors (substitutions, in this syntax) are allowed in the
discard_1 group.
::
(?P<discard_1>GAGTGATTGCTTGTGACGCCTT){s<=2}
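
Fuzzy syntax such as ``{s<=2}`` is not understood by Python's standard ``re`` module; it belongs to the third-party ``regex`` package. To show what "up to 2 substitutions" means without that dependency, the sketch below swaps in a plain mismatch count. The helpers ``substitutions`` and ``linker_matches`` are hypothetical names for this example, and the sketch only compares an already-located, equal-length candidate against the linker.

```python
LINKER = "GAGTGATTGCTTGTGACGCCTT"


def substitutions(observed, expected):
    """Count positions at which two equal-length sequences differ."""
    if len(observed) != len(expected):
        raise ValueError("sequences must have the same length")
    return sum(1 for a, b in zip(observed, expected) if a != b)


def linker_matches(observed, max_substitutions=2):
    """True if `observed` differs from the linker by at most the allowed
    number of substitutions (illustrative stand-in for {s<=2})."""
    return substitutions(observed, LINKER) <= max_substitutions


two_errors = LINKER[:-2] + "AA"     # last two bases substituted
three_errors = LINKER[:-3] + "AAA"  # last three bases substituted

print(linker_matches(LINKER))        # True
print(linker_matches(two_errors))    # True
print(linker_matches(three_errors))  # False
```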
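Note that all UMIs must be the same length for downstream
processing with the dedup, group or count commands.
]]>
If the UMI is not separated from the read name by the default "_",
specify the separator with ``--umi-separator=<sep>``,
replacing ``<sep>`` with e.g. ":".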
Alternatively, if your UMIs are encoded in a tag, you can specify this
by setting the option ``--extract-umi-method=tag`` and giving the tag
name with the ``--umi-tag`` option. For example, if your UMIs are
encoded in the 'UM' tag, provide the following options:
``--extract-umi-method=tag`` ``--umi-tag=UM``
Finally, if you have used the umis tool to extract the UMI (with or
without a cell barcode), you can specify ``--extract-umi-method=umis``.
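
To make the read-name method concrete, the sketch below pulls the UMI from the last separator-delimited field of a read name. It assumes the UMI sits at the very end of the name and that the default separator is "_"; the function ``umi_from_read_name`` and the read names are hypothetical and are not part of UMI-tools.

```python
def umi_from_read_name(read_name, separator="_"):
    """Return the text after the last `separator` in the read name."""
    _prefix, found, umi = read_name.rpartition(separator)
    if not found or not umi:
        raise ValueError(
            "no UMI found after separator %r in %r" % (separator, read_name)
        )
    return umi


print(umi_from_read_name("READ1_ACGTAC"))                 # ACGTAC
print(umi_from_read_name("READ1:ACGTAC", separator=":"))  # ACGTAC
```

Using the last field rather than the first keeps the sketch working when the read name itself contains the separator character.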
The start position of a read is considered to be the start of its alignment
minus any soft clipped bases. A read aligned at position 500 with
CIGAR 2S98M will be assumed to start at position 498.
]]>
umi A counts >= (2* umi B counts) - 1. Each
network is a read group.
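
The two rules above can be sketched in a few lines. This is a minimal illustration, not the UMI-tools implementation: it handles only the forward-strand case for the start position, it assumes the left-hand side of the count relation is the count of UMI A, and the function names are hypothetical.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def leading_soft_clip(cigar):
    """Number of soft-clipped bases at the start of the CIGAR string."""
    operations = CIGAR_OP.findall(cigar)
    if not operations:
        raise ValueError("could not parse CIGAR string %r" % cigar)
    for length, op in operations:
        if op == "H":  # hard clips may precede soft clips; skip them
            continue
        return int(length) if op == "S" else 0
    return 0


def read_start(alignment_position, cigar):
    """Alignment start minus any leading soft-clipped bases."""
    return alignment_position - leading_soft_clip(cigar)


def passes_count_threshold(count_a, count_b):
    """Count relation quoted above: A >= (2 * B) - 1."""
    return count_a >= (2 * count_b) - 1


print(read_start(500, "2S98M"))       # 498
print(read_start(500, "100M"))        # 500
print(passes_count_threshold(10, 5))  # True
print(passes_count_threshold(5, 5))   # False
```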
]]>log