
Changeset 1:9f2665b32c45 (2021-10-08)

Previous changeset 0:5ebf2354cc9b (2021-10-07) Next changeset 2:7420753b0671 (2021-10-08)

Commit message:
"planemo upload for repository https://github.com/jj-umn/tools-iuc/tree/arriba/tools/arriba commit 933ae7dfba10b1b31c30a90216d76cdad6dda685"

modified:
arriba.xml

added:
arriba_download_reference.xml
test-data/Aligned.out.sam

removed:
arriba.help

diff -r 5ebf2354cc9b -r 9f2665b32c45 arriba.help
--- a/arriba.help Thu Oct 07 11:47:02 2021 +0000
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000

@@ -1,191 +0,0 @@
-% arriba -h
-[2021-10-06T19:04:33] Launching Arriba 2.1.0
-
-Arriba gene fusion detector
----------------------------
-Version: 2.1.0
-
-Arriba is a fast tool to search for aberrant transcripts such as gene fusions.
-It is based on chimeric alignments found by the STAR RNA-Seq aligner.
-
-Usage: arriba [-c Chimeric.out.sam] -x Aligned.out.bam \
- -g annotation.gtf -a assembly.fa [-b blacklists.tsv] [-k known_fusions.tsv] \
- [-t tags.tsv] [-p protein_domains.gff3] [-d structural_variants_from_WGS.tsv] \
- -o fusions.tsv [-O fusions.discarded.tsv] \
- [OPTIONS]
-
- -c FILE File in SAM/BAM/CRAM format with chimeric alignments as generated by STAR
- (Chimeric.out.sam). This parameter is only required, if STAR was run with the
- parameter '--chimOutType SeparateSAMold'. When STAR was run with the parameter
- '--chimOutType WithinBAM', it suffices to pass the parameter -x to Arriba and -c
- can be omitted.
-
- -x FILE File in SAM/BAM/CRAM format with main alignments as generated by STAR
- (Aligned.out.sam). Arriba extracts candidate reads from this file.
-
- -g FILE GTF file with gene annotation. The file may be gzip-compressed.
-
- -G GTF_FEATURES Comma-/space-separated list of names of GTF features.
- Default: gene_name=gene_name|gene_id gene_id=gene_id
- transcript_id=transcript_id feature_exon=exon feature_CDS=CDS
-
- -a FILE FastA file with genome sequence (assembly). The file may be gzip-compressed. An
- index with the file extension .fai must exist only if CRAM files are processed.
-
- -b FILE File containing blacklisted events (recurrent artifacts and transcripts
- observed in healthy tissue).
-
- -k FILE File containing known/recurrent fusions. Some cancer entities are often
- characterized by fusions between the same pair of genes. In order to boost
- sensitivity, a list of known fusions can be supplied using this parameter. The list
- must contain two columns with the names of the fused genes, separated by tabs.
-
- -o FILE Output file with fusions that have passed all filters.
-
- -O FILE Output file with fusions that were discarded due to filtering.
-
- -t FILE Tab-separated file containing fusions to annotate with tags in the 'tags' column.
- The first two columns specify the genes; the third column specifies the tag. The
- file may be gzip-compressed.
-
- -p FILE File in GFF3 format containing coordinates of the protein domains of genes. The
- protein domains retained in a fusion are listed in the column
- 'retained_protein_domains'. The file may be gzip-compressed.
-
- -d FILE Tab-separated file with coordinates of structural variants found using
- whole-genome sequencing data. These coordinates serve to increase sensitivity
- towards weakly expressed fusions and to eliminate fusions with low evidence.
-
- -D MAX_GENOMIC_BREAKPOINT_DISTANCE When a file with genomic breakpoints obtained via
- whole-genome sequencing is supplied via the -d
- parameter, this parameter determines how far a
- genomic breakpoint may be away from a
- transcriptomic breakpoint to consider it as a
- related event. For events inside genes, the
- distance is added to the end of the gene; for
- intergenic events, the distance threshold is
- applied as is. Default: 100000
-
- -s STRANDEDNESS Whether a strand-specific protocol was used for library preparation,
- and if so, the type of strandedness (auto/yes/no/reverse). When
- unstranded data is processed, the strand can
..
 to a short stretch in one of the genes. The
- 'short_anchor' filter removes these fusions. This parameter sets
- the threshold in bp for what the filter considers short. Default: 23
-
- -M MANY_SPLICED_EVENTS The 'many_spliced' filter recovers fusions between genes that
- have at least this many spliced breakpoints. Default: 4
-
- -K MAX_KMER_CONTENT The 'low_entropy' filter removes reads with repetitive 3-mers. If
- the 3-mers make up more than the given fraction of the sequence, then
- the read is discarded. Default: 0.600000
-
- -V MAX_MISMATCH_PVALUE The 'mismatches' filter uses a binomial model to calculate a
- p-value for observing a given number of mismatches in a read. If
- the number of mismatches is too high, the read is discarded.
- Default: 0.010000
-
- -F FRAGMENT_LENGTH When paired-end data is given, the fragment length is estimated
- automatically and this parameter has no effect. But when single-end
- data is given, the mean fragment length should be specified to
- effectively filter fusions that arise from hairpin structures.
- Default: 200
-
- -U MAX_READS Subsample fusions with more than the given number of supporting reads. This
- improves performance without compromising sensitivity, as long as the
- threshold is high. Counting of supporting reads beyond the threshold is
- inaccurate, obviously. Default: 300
-
- -Q QUANTILE Highly expressed genes are prone to produce artifacts during library
- preparation. Genes with an expression above the given quantile are eligible
- for filtering by the 'in_vitro' filter. Default: 0.998000
-
- -e EXONIC_FRACTION The breakpoints of false-positive predictions of intragenic events
- are often both in exons. True predictions are more likely to have at
- least one breakpoint in an intron, because introns are larger. If the
- fraction of exonic sequence between two breakpoints is smaller than
- the given fraction, the 'intragenic_exonic' filter discards the
- event. Default: 0.330000
-
- -T TOP_N Only report viral integration sites of the top N most highly expressed viral
- contigs. Default: 5
-
- -C COVERED_FRACTION Ignore virally associated events if the virus is not fully
- expressed, i.e., less than the given fraction of the viral contig is
- transcribed. Default: 0.150000
-
- -l MAX_ITD_LENGTH Maximum length of internal tandem duplications. Note: Increasing
- this value beyond the default can impair performance and lead to many
- false positives. Default: 100
-
- -u Instead of performing duplicate marking itself, Arriba relies on duplicate marking by a
- preceding program using the BAM_FDUP flag. This makes sense when unique molecular
- identifiers (UMI) are used.
-
- -X To reduce the runtime and file size, by default, the columns 'fusion_transcript',
- 'peptide_sequence', and 'read_identifiers' are left empty in the file containing
- discarded fusion candidates (see parameter -O). When this flag is set, this extra
- information is reported in the discarded fusions file.
-
- -I If assembly of the fusion transcript sequence from the supporting reads is incomplete
- (denoted as '...'), fill the gaps using the assembly sequence wherever possible.
-
- -h Print help and exit.
-
- Code repository: https://github.com/suhrig/arriba
- Get help/report bugs: https://github.com/suhrig/arriba/issues
- User manual: https://arriba.readthedocs.io/
- Please cite: https://doi.org/10.1101/gr.257246.119
-

diff -r 5ebf2354cc9b -r 9f2665b32c45 arriba.xml
--- a/arriba.xml Thu Oct 07 11:47:02 2021 +0000
+++ b/arriba.xml Fri Oct 08 11:16:21 2021 +0000

@@ -6,31 +6,95 @@
 <expand macro="requirements" />
 <expand macro="version_command" />
 <command detect_errors="exit_code"><![CDATA[
+#if str($input_params.input_source) == "use_fastq"
+ #if $input_params.left_fq.is_of_type("fastq.gz"):
+ #set read1 = 'input_1.fastq.gz'
+ #else:
+ #set read1 = 'input_1.fastq'
+ #end if
+ ln -f -s '${input_params.left_fq}' ${read1} &&
+ #if $input_params.right_fq.is_of_type("fastq.gz"):
+ #set read2 = 'input_2.fastq.gz'
+ #else:
+ #set read2 = 'input_2.fastq'
+ #end if
+ ln -f -s '${input_params.right_fq}' ${read2} &&
+ STAR
+ --runThreadN \${GALAXY_SLOTS:-1}
+ --genomeDir /path/to/STAR_index
+ --genomeLoad NoSharedMemory
+ --readFilesIn $read1 $read2
+ --readFilesCommand zcat
+ --outStd BAM_Unsorted
+ --outSAMtype BAM Unsorted
+ --outSAMunmapped Within
+ --outBAMcompression 0
+ --outFilterMultimapNmax 50
+ --peOverlapNbasesMin 10
+ --alignSplicedMateMapLminOverLmate 0.5
+ --alignSJstitchMismatchNmax 5 -1 5 5
+ --chimSegmentMin 10
+ --chimOutType WithinBAM HardClip
+ --chimJunctionOverhangMin 10
+ --chimScoreDropMax 30
+ --chimScoreJunctionNonGTAG 0
+ --chimScoreSeparation 1
+ --chimSegmentReadGapMax 3
+ --chimMultimapNmax 50
+ | tee Aligned.out.bam |
 arriba
- -x '$input'
- #if $chimeric
- -c '$chimeric'
- #endif
+ -x '/dev/stdin'
+#else
+ arriba
+ -x '$input_params.input'
+ #if $input_params.chimeric
+ -c '$input_params.chimeric'
+ #end if
+#end if
 -a '$genome_assembly'
 -g '$gtf'
- -b '$blacklist'
+ #if '$blacklist'
+ -b '$blacklist'
+ #end if
 #if '$protein_domains'
 -p '$protein_domains'
- #endif
+ #end if
 #if '$known_fusions'
 -k '$known_fusions'
- #endif
+ #end if
 #if '$tags'
 -t '$tags'
- #endif
+ #end if
 -o fusions.tsv
 -O fusions.discarded.tsv
 ]]></command>
 <inputs>
- <param name="input" argument="-x" type="data" format="sam,bam,cram" label="STAR Aligned.out.sam"/>
- <param name="chimeric" argument="-c" type="data" format="sam,bam,cram" optional="true" label="STAR Chimeric.out.sam">
- <help><![CDATA[ only required, if STAR was run with the parameter '--chimOutType SeparateSAMold' ]]></help>
- </param>
+ <conditional name="input_params">
+ <param name="input_source"
+ type="select"
+ label="Use output from earlier STAR run or let Arriba running STAR">
+ <option value="use_star">Use output from earlier STAR</option>
+ <option value="use_fastq">Let Arriba control running STAR</option>
+ </param>
+ <when value="use_star">
+ <param name="input" argument="-x" type="data" format="sam,bam,cram" label="STAR Aligned.out.sam"/>
+ <param name="chimeric" argument="-c" type="data" format="sam,bam,cram" optional="true" label="STAR Chimeric.out.sam">
+ <help><![CDATA[ only required, if STAR was run with the parameter '--chimOutType SeparateSAMold' ]]></help>
+ </param>
+ </when>
+ <when value="use_fastq">
+ <param name="left_fq"
+ type="data"
+ format="fastqsanger,fastqsanger.gz"
+ argument="--left_fq"
+ label="left.fq file"/>
+ <param name="right_fq"
+ type="data"
+ format="fastqsanger,fastqsanger.gz"
+ argument="--right_fq"
+ label="right.fq file"/>
+ </when>
+ </conditional>
 <param name="genome_assembly" argument="-a" type="data" format="fasta" label="genome assembly fasta"/>
 <param name="gtf" argument="-g" type="data" format="gtf" label="GTF f
..
orting reads and assembly sequence, the start of the fusion transcript is marked by a caret sign (^) and the end by a dollar sign ($). If the full sequence could not be constructed, these signs are missing.
 
- -l MAX_ITD_LENGTH Maximum length of internal tandem duplications. Note: Increasing
- this value beyond the default can impair performance and lead to many
- false positives. Default: 100
+ * peptide_sequence : This column contains the fusion peptide sequence. The sequence is translated from the fusion transcript given in the column fusion_transcript and determines the reading frame of the fused genes according to the transcript isoforms given in the columns transcript_id1 and transcript_id2. Translation starts at the start of the assembled fusion transcript or when the start codon is encountered in the 5' gene. Translation ends when either the end of the assembled fusion transcript is reached or when a stop codon is encountered. If the fusion transcript contains an ellipsis (...), the sequence beyond the ellipsis is trimmed before translation, because the reading frame cannot be determined reliably. The column contains a dot (.), when the transcript sequence could not be predicted or when the precise breakpoints are unknown due to lack of split reads or when the fusion transcript does not overlap any coding exons in the 5' gene or when no start codon could be found in the 5' gene or when there is a stop codon prior to the fusion junction (in which case the column reading_frame contains the value stop-codon). The breakpoint is represented as a pipe symbol (|). If a codon spans the breakpoint, the amino acid is placed on the side of the breakpoint where two of the three bases reside. Codons resulting from non-template bases are flanked by two pipes. Amino acids are written as lowercase characters in the following situations: non-silent SNVs/SNPs, insertions, frameshifts, codons spanning the breakpoint, non-coding regions (introns/intergenic regions/UTRs), and non-template bases. Codons which cannot be translated to amino acids, such as those having invalid characters, are represented as ?.
+
+ * read_identifiers : This column contains the names of the supporting reads separated by commas.
 
- -u Instead of performing duplicate marking itself, Arriba relies on duplicate marking by a
- preceding program using the BAM_FDUP flag. This makes sense when unique molecular
- identifiers (UMI) are used.
+ - fusions.discarded.tsv
+
+ The file fusions.discarded.tsv (as specified by the parameter -O) contains all events that Arriba classified as an artifact or that are also observed in healthy tissue. It has the same format as the file fusions.tsv.
 
- -X To reduce the runtime and file size, by default, the columns 'fusion_transcript',
- 'peptide_sequence', and 'read_identifiers' are left empty in the file containing
- discarded fusion candidates (see parameter -O). When this flag is set, this extra
- information is reported in the discarded fusions file.
+
+
+
 
- -I If assembly of the fusion transcript sequence from the supporting reads is incomplete
- (denoted as '...'), fill the gaps using the assembly sequence wherever possible.
-
- -h Print help and exit.
+Code repository: https://github.com/suhrig/arriba
+Get help/report bugs: https://github.com/suhrig/arriba/issues
+User manual: https://arriba.readthedocs.io/
+Please cite: https://doi.org/10.1101/gr.257246.119
 
- Code repository: https://github.com/suhrig/arriba
- Get help/report bugs: https://github.com/suhrig/arriba/issues
- User manual: https://arriba.readthedocs.io/
- Please cite: https://doi.org/10.1101/gr.257246.119
+
+.. _Arriba: https://arriba.readthedocs.io/en/latest/
+.. _INPUTS: https://arriba.readthedocs.io/en/latest/input-files/
+.. _OUTPUTS: https://arriba.readthedocs.io/en/latest/output-files/
 
 ]]></help>
 <expand macro="citations" />
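
The optional-argument handling in the command block above (a flag such as -c, -b, -p, -k or -t is passed only when the corresponding dataset is supplied) can be sketched outside Galaxy in a few lines of Python. This is only an illustration, not the wrapper's implementation: the function name `build_arriba_args` and its parameter names are hypothetical, and the flags are limited to those listed in the arriba help text removed in this changeset.

```python
from typing import List, Optional


def build_arriba_args(
    alignments: str,
    assembly: str,
    annotation: str,
    chimeric: Optional[str] = None,
    blacklist: Optional[str] = None,
    protein_domains: Optional[str] = None,
    known_fusions: Optional[str] = None,
    tags: Optional[str] = None,
) -> List[str]:
    """Return an arriba argument list; optional flags appear only when a path is given."""
    # -x is always required; -c is only needed for separate chimeric alignments.
    args = ["arriba", "-x", alignments]
    if chimeric:
        args += ["-c", chimeric]
    args += ["-a", assembly, "-g", annotation]
    # Each optional file maps to one flag; None or an empty string skips it.
    for flag, path in (
        ("-b", blacklist),
        ("-p", protein_domains),
        ("-k", known_fusions),
        ("-t", tags),
    ):
        if path:
            args += [flag, path]
    # Output file names match the ones hard-coded in the wrapper.
    args += ["-o", "fusions.tsv", "-O", "fusions.discarded.tsv"]
    return args


if __name__ == "__main__":
    print(" ".join(build_arriba_args(
        "Aligned.out.bam", "assembly.fa", "annotation.gtf",
        blacklist="blacklists.tsv",
    )))
```

Returning a list rather than a single string keeps each path as one argument, so it can be handed to `subprocess.run` without shell quoting.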

diff -r 5ebf2354cc9b -r 9f2665b32c45 arriba_download_reference.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/arriba_download_reference.xml Fri Oct 08 11:16:21 2021 +0000

@@ -0,0 +1,71 @@
+<tool id="arriba_download_reference" name="Arriba Download Reference" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" python_template_version="3.5">
+    <description></description>
+    <macros>
+        <import>macros.xml</import>
+    </macros>
+    <expand macro="requirements" />
+    <expand macro="version_command" />
+    <command detect_errors="exit_code"><![CDATA[
+    echo '$arriba_reference_name' > '$arriba_reference' &&
+    mkdir -p '$arriba_reference.files_path' &&
+    cd '$arriba_reference.files_path' &&
+    BASE_DIR=\$(dirname \$(dirname `which arriba`)) &&
+    REF_SCRIPT=`find \$BASE_DIR -name 'download_references.sh'` &&
+    \$REF_SCRIPT '$arriba_reference_name'
+    ]]></command>
+    <inputs>
+        <param name="arriba_reference_name" type="select" label="Select reference">
+            <option value="GRCh37+ENSEMBL87">GRCh37+ENSEMBL87</option>
+            <option value="GRCh37+GENCODE19">GRCh37+GENCODE19</option>
+            <option value="GRCh37+RefSeq">GRCh37+RefSeq</option>
+            <option value="GRCh37viral+ENSEMBL87">GRCh37viral+ENSEMBL87</option>
+            <option value="GRCh37viral+GENCODE19">GRCh37viral+GENCODE19</option>
+            <option value="GRCh37viral+RefSeq">GRCh37viral+RefSeq</option>
+            <option value="GRCh38+ENSEMBL93">GRCh38+ENSEMBL93</option>
+            <option value="GRCh38+GENCODE28">GRCh38+GENCODE28</option>
+            <option value="GRCh38+RefSeq">GRCh38+RefSeq</option>
+            <option value="GRCh38viral+ENSEMBL93">GRCh38viral+ENSEMBL93</option>
+            <option value="GRCh38viral+GENCODE28">GRCh38viral+GENCODE28</option>
+            <option value="GRCh38viral+RefSeq">GRCh38viral+RefSeq</option>
+            <option value="GRCm38+GENCODEM25">GRCm38+GENCODEM25</option>
+            <option value="GRCm38+RefSeq">GRCm38+RefSeq</option>
+            <option value="GRCm38viral+GENCODEM25">GRCm38viral+GENCODEM25</option>
+            <option value="GRCm38viral+RefSeq">GRCm38viral+RefSeq</option>
+            <option value="hg19+ENSEMBL87">hg19+ENSEMBL87</option>
+            <option value="hg19+GENCODE19">hg19+GENCODE19</option>
+            <option value="hg19+RefSeq">hg19+RefSeq</option>
+            <option value="hg19viral+ENSEMBL87">hg19viral+ENSEMBL87</option>
+            <option value="hg19viral+GENCODE19">hg19viral+GENCODE19</option>
+            <option value="hg19viral+RefSeq">hg19viral+RefSeq</option>
+            <option value="hg38+ENSEMBL93">hg38+ENSEMBL93</option>
+            <option value="hg38+GENCODE28">hg38+GENCODE28</option>
+            <option value="hg38+RefSeq">hg38+RefSeq</option>
+            <option value="hg38viral+ENSEMBL93">hg38viral+ENSEMBL93</option>
+            <option value="hg38viral+GENCODE28">hg38viral+GENCODE28</option>
+            <option value="hg38viral+RefSeq">hg38viral+RefSeq</option>
+            <option value="hs37d5+ENSEMBL87">hs37d5+ENSEMBL87</option>
+            <option value="hs37d5+GENCODE19">hs37d5+GENCODE19</option>
+            <option value="hs37d5+RefSeq">hs37d5+RefSeq</option>
+            <option value="hs37d5viral+ENSEMBL87">hs37d5viral+ENSEMBL87</option>
+            <option value="hs37d5viral+GENCODE19">hs37d5viral+GENCODE19</option>
+            <option value="hs37d5viral+RefSeq">hs37d5viral+RefSeq</option>
+            <option value="mm10+GENCODEM25">mm10+GENCODEM25</option>
+            <option value="mm10+RefSeq">mm10+RefSeq</option>
+            <option value="mm10viral+GENCODEM25">mm10viral+GENCODEM25</option>
+            <option value="mm10viral+RefSeq">mm10viral+RefSeq</option>
+        </param>
+    </inputs>
+    <outputs>
+        <data name="arriba_reference" format="txt" label="$arriba_reference_name"/>
+    </outputs>
+    <help><![CDATA[
+**Arriba**
+
+Arriba_ is a fast tool to search for aberrant transcripts such as gene fusions.
+It is based on chimeric alignments found by the STAR RNA-Seq aligner.
+
+.. _Arriba: https://arriba.readthedocs.io/en/latest/
+
+]]></help>
+    <expand macro="citations" />
+</tool>
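
The option values in the select list above all follow the pattern ASSEMBLY[viral]+ANNOTATION (for example GRCh38viral+GENCODE28). A minimal Python sketch of splitting such a name into its parts follows. The helper `parse_reference_name` and the `ArribaReference` type are hypothetical, and reading the `viral` suffix as "assembly bundled with viral sequences" is an assumption drawn from the naming alone, not from the wrapper.

```python
from typing import NamedTuple


class ArribaReference(NamedTuple):
    assembly: str    # e.g. "GRCh38", "hg19", "mm10"
    viral: bool      # True when the assembly name carries the "viral" suffix
    annotation: str  # e.g. "GENCODE28", "ENSEMBL93", "RefSeq"


def parse_reference_name(name: str) -> ArribaReference:
    """Split a name like 'GRCh38viral+GENCODE28' into assembly, viral flag, annotation."""
    assembly, sep, annotation = name.partition("+")
    if not sep or not assembly or not annotation:
        raise ValueError(f"expected ASSEMBLY+ANNOTATION, got {name!r}")
    viral = assembly.endswith("viral")
    if viral:
        # Strip the suffix so the plain assembly name remains.
        assembly = assembly[: -len("viral")]
    return ArribaReference(assembly, viral, annotation)


if __name__ == "__main__":
    for value in ("GRCh38viral+GENCODE28", "mm10+RefSeq", "hs37d5+ENSEMBL87"):
        print(value, "->", parse_reference_name(value))
```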

diff -r 5ebf2354cc9b -r 9f2665b32c45 test-data/Aligned.out.sam
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/Aligned.out.sam Fri Oct 08 11:16:21 2021 +0000

b'@@ -0,0 +1,111 @@\n+@HD\tVN:1.4\tSO:coordinate\n+@SQ\tSN:1\tLN:248956422\n+@SQ\tSN:2\tLN:242193529\n+@SQ\tSN:3\tLN:198295559\n+@SQ\tSN:4\tLN:190214555\n+@SQ\tSN:5\tLN:181538259\n+@SQ\tSN:6\tLN:170805979\n+@SQ\tSN:7\tLN:159345973\n+@SQ\tSN:8\tLN:145138636\n+@SQ\tSN:9\tLN:138394717\n+@SQ\tSN:10\tLN:133797422\n+@SQ\tSN:11\tLN:135086622\n+@SQ\tSN:12\tLN:133275309\n+@SQ\tSN:13\tLN:114364328\n+@SQ\tSN:14\tLN:107043718\n+@SQ\tSN:15\tLN:101991189\n+@SQ\tSN:16\tLN:90338345\n+@SQ\tSN:17\tLN:83257441\n+@SQ\tSN:18\tLN:80373285\n+@SQ\tSN:19\tLN:58617616\n+@SQ\tSN:20\tLN:64444167\n+@SQ\tSN:21\tLN:46709983\n+@SQ\tSN:22\tLN:50818468\n+@SQ\tSN:X\tLN:156040895\n+@SQ\tSN:Y\tLN:57227415\n+@SQ\tSN:MT\tLN:16569\n+@PG\tID:STAR\tPN:STAR\tVN:2.7.8a\tCL:STAR --runThreadN 12 --genomeDir /panfs/roc/website/galaxy.msi.umn.edu/galaxy/tool-data/rnastar/2.7.4a/GRCh38_canon/GRCh38_canon/dataset_1367616_files --genomeLoad NoSharedMemory --readFilesIn /panfs/roc/galaxy/PRODUCTION/database/files/001/368/dataset_1368710.dat /panfs/roc/galaxy/PRODUCTION/database/files/001/368/dataset_1368711.dat --readFilesCommand zcat --limitBAMsortRAM 122880000000 --outSAMtype BAM SortedByCoordinate --outSAMstrandField intronMotif --outSAMattributes NH HI AS nM ch --outSAMunmapped Within --outSAMprimaryFlag OneBestScore --outSAMmapqUnique 60 --outBAMsortingThreadN 12 --outBAMsortingBinsN 50 --outSAMattrIHstart 1 --winAnchorMultimapNmax 50 --chimSegmentMin 12 --chimOutType WithinBAM Junctions --chimOutJunctionFormat 1 --quantMode GeneCounts --twopass1readsN 50000 --twopassMode Basic\n+@CO\tuser command line: STAR --runThreadN 12 --genomeLoad NoSharedMemory --genomeDir /panfs/roc/website/galaxy.msi.umn.edu/galaxy/tool-data/rnastar/2.7.4a/GRCh38_canon/GRCh38_canon/dataset_1367616_files --readFilesIn /panfs/roc/galaxy/PRODUCTION/database/files/001/368/dataset_1368710.dat /panfs/roc/galaxy/PRODUCTION/database/files/001/368/dataset_1368711.dat --readFilesCommand zcat --outSAMtype BAM SortedByCoordinate --twopassMode 
Basic --twopass1readsN 50000 --quantMode GeneCounts --outSAMstrandField intronMotif --outSAMattrIHstart 1 --outSAMattributes NH HI AS nM ch --outSAMprimaryFlag OneBestScore --outSAMmapqUnique 60 --outSAMunmapped Within --chimSegmentMin 12 --outBAMsortingThreadN 12 --outBAMsortingBinsN 50 --winAnchorMultimapNmax 50 --limitBAMsortRAM 122880000000 --chimOutType WithinBAM Junctions --chimOutJunctionFormat 1\n+BCR-ABL1-76\t99\t9\t130854061\t60\t24S126M\t=\t130854103\t755\tCAGCCACTGGATTTAAGCAGAGTTCAAAAGCCCTTCAGCGGCCAGTAGCATCTGACTTTGAGCCTCAGGGTCTGAGTGAAGCCGCTCGTTGGAACTCCAAGGAAAACCTTCTCGCTGGACCCAGTGAAAATGACCCCAACCTTTTCGTTG\tCCCGGGGCGGGCGJJJJJGJJJGJJJJJJJJJGJ1JCJJGJGGJJJGJJGGJJJ8GGJJGGGJJ=GGCGGGGGG=GGCCGGG8GC=GGGG=GCGGCGGGGJGG=GGGG=GGGGGGGGCGGGGCCGGGCG=GG(G=GCGCCG1CCGGCGGG\tNH:i:1\tHI:i:1\tAS:i:274\tnM:i:1\tXS:A:+\tNM:i:1\n+BCR-ABL1-64\t99\t9\t130854061\t60\t6S144M\t=\t130854104\t756\tAGAGTTCAAAAGCCCTTCAGCGGCCAGTAGCATCTGACTTTGAGCCTCAGGGTCTGAGTGAAGCCGCTCGTTGGAACTCCAAGGAAAACCTTCTCGCTGGACCCAGTGAAAATGACCCCAACCTTTTCGTTGCACTGTATGATTTTGTGG\tCCCGGGGGCGGGGGJJJGJJJGGJJJJJCJJJJGGJJJGJJGJGJG=GGJG=JJJJCGCCC==JGGCGGGCJG1CCCCGG8CGGGGGGGGCCGC=CGCGGJGGGGCGCGGGGGGGGCCGCGGGG=GCGGGGGGG=GGGGCGGGGGGCCGG\tNH:i:1\tHI:i:1\tAS:i:290\tnM:i:2\tXS:A:+\tNM:i:1\n+BCR-ABL1-54\t99\t9\t130854061\t60\t61S89M\t=\t130854061\t140\tCCGGGGCTCTATGGGTTTCTGAATGTCATCGTCCACTCAGCCACTGGATTTAAGCAGAGTTCAAAAGCCCTTGAGCGGCCAGTAGCATCTGACTTTGAGCCTCAGGGTCTGAGTGAAGCCGCTCGTTGGAACTCCAAGGAAAACCTTCTC\tCCCGGGGGGGGGGGJJJJJGJ=JJJJJJGJJJGGJJJJJJJCJJG8JJJGJJGJ=GG=JJJGGCGGCGGJGC(GGGGGCGC8CGGCGCCGGC=GGGCGGGJG1GGGGGG1CG=GGGGC=1G1CGGGGGCCGGGGCGG=CC=C=CGGGGG8\tNH:i:1\tHI:i:1\tAS:i:219\tnM:i:4\tNM:i:2\n+BCR-ABL1-54\t147\t9\t130854061\t60\t10S140M\t=\t130854061\t-140\tAAGCAGAGTTCAAAAGCCCTTCAGCGGCCAGTAGCATCTGACTTTGAGCCTCAGGGTCTGAGTGAAGCCGCTCGTTGGAACTCCAAGGAAAACGTTCTCGCTGGACCCAGTGAAAATGACCCCAACCTTTTCGTTGCACTGTATGATTTT\t=GGGGGGGCCCCGCCGGG=G(GGGG=CGCGGCGCCGG=GGGGGCJJJ=GC8C1GGGGGCG8GCCGC=GGG1GCCGGJC8GCGGCGCGJGJJJG1CGJGG=CJJJGGGGJG=CJGJJJJCJCJJGGJJJJJ
GGJGGJJCGGGGGGGGG=CC\tNH:i:1\tHI:i:1\tAS:i:219\tnM:i:4\tNM:i:2\n+BCR-ABL1-4'..b'JGGJJJGJGGGGCJJJGJJJGGJGJJJGJJCCJJGGG1GGGGGGG=CC\tNH:i:1\tHI:i:1\tAS:i:227\tnM:i:0\tXS:A:+\tNM:i:0\n+BCR-ABL1-60\t2145\t22\t23290375\t60\t39M111H\t9\t130854074\t0\tTCATCGTCCACTCAGCCACTGGATTTAAGCAGAGTTCAA\t=CCGGCGGGGG=GJJGJGGGCJJCJJGJCGJG(J(JJJG\tNH:i:1\tHI:i:1\tAS:i:38\tnM:i:0\tch:A:1\tNM:i:0\tSA:Z:9,130854064,+,39S111M,60,0;\n+BCR-ABL1-74\t77\t*\t0\t0\t*\t*\t0\t0\tTCATTTTCACTGGGTCCAGCGAGAAGGTTTTCCTTGGAGTTCCAACGAGCGGCTTCACTCAGACCCTGAGGCTCAAAGTCAGATGCTACTGGCCGCTGAAGGGCTTTTGAACTCTGCTTAAATCCAGTGGCTGAGTGGACGATGACATTC\tCC11GGGGGGGGGGCCJJJGCGJJGJJJJJGGGGGGJJJGGJG==GCJCJ=GGJJGGJJGGCJGG=GGGGGJGGJGC=GC=GGGCGGGCGGGGCCGCGGGJCGC=GGC8CGCGCGGGGGGCGCC1GGCGCC=GCCGCGGC8GCGGGCCCG\tNH:i:0\tHI:i:0\tAS:i:155\tnM:i:2\tuT:A:1\n+BCR-ABL1-74\t141\t*\t0\t0\t*\t*\t0\t0\tCATTCCGCTGACCATCAATAAGGAAGATGATGAGTCTCCGGGGCTCTATGGGTTTCTGAATGTCATCGTCCACTCAGCCACTGGATTTAGGCAGAGTTCAAAAGCCCTTCAGCGGCCAGTAGCATCTGACTTTGAGCCTCAGGGTCTGAG\tCCCGGGGGGCGCGJGGJJGGJGJJJGJGGJJGGJGJJ1=JCJJGGGJJJJGGGJGCCJGGJGG=J1JG8JGCGGGJG=GC1CGCCGGCG(GGCGGCGGGGGCJC1CCGC==CCGGGGCGGCGGGCCGGCGCGC8CCCCGGG=GGGC=GGG\tNH:i:0\tHI:i:0\tAS:i:155\tnM:i:2\tuT:A:1\n+BCR-ABL1-66\t77\t*\t0\t0\t*\t*\t0\t0\tTCCAGCGAGAAGGTTTTCCTTGGAGTTCCAACGAGCGGCTTCACTCAGACCCTGAGGCTCAAAGTCAGATGCTACTGGCCGCTGAAGGGCTTTTGAACTCTGCTTAAATCCAGTGGCTGAGTGGACGATGACATTCAGAAACCCATAGAG\tCCC=GGGGCGGGGJJJJJGJJJJ=JJJGJJ1GJJGJJJJJGJJJJJGGGGCGJJGGGJJJGGCGGGGJGCGG1JCGGG=GCCGCG=GC=G=GCCGGGGG8JGGGGGGGGGGGG=GGCGGC8GGCCGGGC=GGGGGGGGG=CGG=8GGCCG\tNH:i:0\tHI:i:0\tAS:i:159\tnM:i:0\tuT:A:1\n+BCR-ABL1-66\t141\t*\t0\t0\t*\t*\t0\t0\tCATTCCGCTGACCATCAATAAGGAAGATGATGAGTCTCCGGGGCTCTATGGGTTTCTGAATGTCATCGTCCACTCAGCCACTGGATTTAAGCAGAGTTCAAAAGCCCTTCAGCGGCCAGTAGCATCTGACTTTGAGCCTCAGGGTCTGAG\tCCCGGGGGGGGGGGGJ=JGJJJJJJJGGJJCCCJGJJ1JJJGCJGGGGJJJJ=GGGJGJGC(GGGGJGGGJG1=GGGGGGGG=G=C=GG8CC8GGGGGCCCCJCCCJGCG=GGCCGGCGGCGGCG==1GCCGGC1GGGGGCGGGGGGCGG\tNH:i:0\tHI:i:0\tAS:i:159\tnM:i:0\tuT:A:1\n+BCR-ABL1-58\t77\t*\t0\t0\t*\t*\t0\t0\tATGATGAGTC
TCCGGGGCTCTATGGGTTTCTGAATGTCATCGTCCACTCAGCCACTGGATTTAAGCAGAGTTCAAAAGCCCTTCAGCGGCCAGTAGCATCTGACTTTGAGCCTCAGGGTCTGAGTGAAGCCGCTCGTTGGAACTCCAAGG\tCCCGGCGGGGGGGGJJJJJGJJGJGJGJGJJJJJJJJJCJGJJJJGCG=8GGGJGJGGCGGJGCGJJJCJGGG=CGCCGGCCGGGCGCGGGCGCG1GGGCCCGGGGCG8GCCC=C8CGCGG=CCCGCCCCGGG=CCGGCGGGCGGGGGCG\tNH:i:0\tHI:i:0\tAS:i:185\tnM:i:3\tuT:A:1\n+BCR-ABL1-58\t141\t*\t0\t0\t*\t*\t0\t0\tTTGGGGTCATTTTCACTGGGTCCAGCGAGAAGGTTTTCCTTGGAGTTCCAACGAGCGGCTTCACTCAGACCCTGAGGCTCAAAGTCAGATTCTACTGGCCGCTGAAGGGCTTTTGAACTCTGCTTAAATCCAGTGGCTGAGTGGACGATG\tCCCGGGGGGGGGGJJJJJJGJGJJJGGJ=JJJJJJJJGC=GJJGGJJGJJGG1GCJGGGG=JGGG8C=GCCGC==GGGCGGGGGG=GGG=(G=CCGCCGGGGCJJJJGGGC8GCGCGCG8CGGCCGGGCGCGCGG8CCGG8CGGGGGGGG\tNH:i:0\tHI:i:0\tAS:i:185\tnM:i:3\tuT:A:1\n+BCR-ABL1-24\t77\t*\t0\t0\t*\t*\t0\t0\tCGCAGACCATCAATAAGGAAGATGATGAGTCTCCGGGGCTCTATGGGTTTCTGAATGTCATCGTCCACTCAGCCACTGGATTTAAGCAGAGTTCAAAAGCCCTTCAGCGGCCAGTAGCATCTGACTTTGAGCCTCAGGGGCTGAGTGAAG\tCC11GCGGGGGGGJCGJGJJCCJJJJGJJJJGJJGGJJJCJJJG8JJJ1GJ=JGGGGJJJCG=8GGCGCCGGGCCGGGCGGGGCGGGGCCGCGGCCGGG=J1GCCC1(CCGGCGGGCCGCGGGCGGGGC=GGCGCCGCC1GCGGGGGCGG\tNH:i:0\tHI:i:0\tAS:i:154\tnM:i:3\tuT:A:1\n+BCR-ABL1-24\t141\t*\t0\t0\t*\t*\t0\t0\tTTTCACTGGGTCCAGCGAGAAGGTTTTCCTTGGAGTTCCAACGAGCGGCTTCACTCAGACCCTGAGGCTCAAAGTCAGATGCTACTGGCCGCTGAAGGGCTTTTGAACTCTGCTTAAATCCAGTGGCTGAGTGGACGATGACATTCAGAA\tC=CCGGGGGGGGCJ1GGJJJJ1JJJJJGJJ=GJJG8GGJ=GJGJJGJJGGGCGJGCGGGCGGG8GG=GJJGCG1GCGGJGCCGGCGGGCCGGGCG8GGGGG8C1==CGGCCCGCGGGGC8GCGGG8GGGCGCCGCCGCGGGCGGGGGGCG\tNH:i:0\tHI:i:0\tAS:i:154\tnM:i:3\tuT:A:1\n+BCR-ABL1-10\t77\t*\t0\t0\t*\t*\t0\t0\tAGGTTGGGGTCATTTTCACTGGGTCCAGCGAGAAGGTTTTCCTTGGAGTTCCAACGAGCGGCTTCACTCAGACCCTGAGGCTCAAAGTCAGATGCTACTGGCCGCTGAAGGGCTTTTGAACTCTGCTTAAATCCAGTGGCTGAGTGGACG\tCC=GGGGGGGGGG1GJJJJJCJJJJJJJJJJJGJ=GJJJGCJJJJCJGJGCJGJJJGGJJJGGCCGGJGC=GGJ1C8GGGGGGCGCCGGGGGGCGGGCGCCCG1GGCGCGCGGGCC8GCGCGCGC8CCCGCGCGGGGGCGGGGGCGGCGG\tNH:i:0\tHI:i:0\tAS:i:181\tnM:i:2\tuT:A:1\n+BCR-ABL1-10\t141\t*\t0\t0\t*\t*\t0\t0\tATAAGGAAGATGATGAGTCTCCGGGGCTCTATGGGTTTCTGAATGTCATCGTCCACTCAGCCACTGGATTTAAGCAGAGTTC
AAAAGCCCTTCAGCGGCCAGTAGCATCTGACTTTGAGCCTCAGGGTCTGAGTGAAGCCGCTCGTTGGA\t1CCGGCGGGGGG1GGJJJGCC1JJJJCCG=JGGJJGJJJ=GGGGGJJGGGGGGC1J=CJGCGGGGCGC(CGGGGG=GGGGG(G=CGGCGGGGCCCGC=CCCCJJCC8G1GGGGCGGGGGGCGCGGGGGGGCG=GGCCGCCGCC1G=GGGG\tNH:i:0\tHI:i:0\tAS:i:181\tnM:i:2\tuT:A:1\n'
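
The second column of each alignment record in the test data above is the SAM FLAG field (values such as 99, 147, 77, 141 and 2145). As a reading aid, the sketch below decodes a flag into named bits using the bit values defined in the SAM specification; the function name `decode_sam_flag` and the short labels are illustrative choices, not part of the wrapper or of Arriba.

```python
from typing import List

# Bit values from the SAM specification; the labels are shorthand for this sketch.
SAM_FLAG_BITS = [
    (0x1, "paired"),
    (0x2, "proper_pair"),
    (0x4, "unmapped"),
    (0x8, "mate_unmapped"),
    (0x10, "reverse"),
    (0x20, "mate_reverse"),
    (0x40, "first_in_pair"),
    (0x80, "second_in_pair"),
    (0x100, "secondary"),
    (0x200, "qc_fail"),
    (0x400, "duplicate"),
    (0x800, "supplementary"),
]


def decode_sam_flag(flag: int) -> List[str]:
    """Return the labels of all bits set in a SAM FLAG value, lowest bit first."""
    return [label for bit, label in SAM_FLAG_BITS if flag & bit]


if __name__ == "__main__":
    for value in (99, 147, 77, 141, 2145):
        print(value, decode_sam_flag(value))
```

Under this decoding, the records with flag 2145 are supplementary alignments, which is how chimeric segments are stored when STAR is run with `--chimOutType WithinBAM`, and the pairs flagged 77 and 141 are unmapped reads kept by `--outSAMunmapped Within`.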