Repository 'graphprot_predict_profile'
hg clone https://toolshed.g2.bx.psu.edu/repos/rnateam/graphprot_predict_profile

Changeset 1:20429f4c1b95 (2020-01-22)
Previous changeset 0:215925e588c4 (2018-05-25) Next changeset 2:7bbb7bf6304f (2020-01-27)
Commit message:
"planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/rna_tools/graphprot commit f3fb925b83a4982e0cf9a0c11ff93ecbb8e4e6d5"
modified:
test-data/test.fa
test-data/test.model
test-data/test.params
added:
gplib.py
graphprot_predict_wrapper.py
graphprot_train_predict.xml
graphprot_train_wrapper.py
test-data/empty_file
test-data/file1
test-data/file2
test-data/test.ensembl.fa
test-data/test.peaks.bed
test-data/test.peaks_genomic.bed
test-data/test.predictions
test-data/test.profile
test-data/test.sequence_motif
test-data/test1.bed
test-data/test2.bed
test-data/test2.fa
test-data/test2.profile
test-data/test2_1.avg_profile
test-data/test2_2.avg_profile
test-data/test2_3.avg_profile
test-data/test3.fa
test-data/test3_added_ids_exp.avg_profile
test-data/test3_added_ids_out.avg_profile
test-data/test4.avg_profile
test-data/test4.bed
test-data/test4_out.peaks.bed
test-data/test4_out_exp.peaks.bed
test-data/test4_out_exp2.peaks.bed
test-data/test_exp.peaks.bed
test-data/test_negatives.parop.fa
test-data/test_negatives.train.fa
test-data/test_out.peaks.bed
test-data/test_positives.parop.fa
test-data/test_positives.train.fa
test-data/test_predict.avg_profile
test-data/test_predict.avg_profile.genomic_peaks.bed
test-data/test_predict.avg_profile.p50.genomic_peaks.bed
test-data/test_predict.avg_profile.p50.peaks.bed
test-data/test_predict.avg_profile.peaks.bed
test-data/test_predict.bed
test-data/test_predict.fa
test-data/test_predict.p50.predictions
test-data/test_predict.predictions
removed:
data/EWSR1_eCLIP_K562_ENCSR887LPK.tar.gz
data/FMR1_eCLIP_K562_ENCSR331VNX.tar.gz
data/HNRNPC_eCLIP_HepG2_ENCSR550DVK.tar.gz
data/HUR_PAR-CLIP_HEK293_Mukherjee.tar.gz
data/IGF2BP1-3_PAR-CLIP_HEK293_Hafner.tar.gz
data/IGF2BP1-3_PAR-CLIP_HEK293_Hafner_structure.tar.gz
data/KHDRBS1_eCLIP_K562_ENCSR628IDK.tar.gz
data/KHDRBS1_eCLIP_K562_ENCSR628IDK_structure.tar.gz
data/PUM2_PAR-CLIP_HEK293_Hafner.tar.gz
data/PUM2_eCLIP_K562_exonized_3utr.tar.gz
data/QKI_PAR-CLIP_HEK293_Hafner.tar.gz
data/QKI_eCLIP_HepG2_ENCSR570WLM.tar.gz
data/QKI_eCLIP_HepG2_ENCSR570WLM_structure.tar.gz
graphprot_predict_profile.xml
graphprot_predict_profile_wrapper.pl
test-data/GraphProt_predict_profile_test_out1.average_profile
test-data/GraphProt_predict_profile_test_out1.peak_regions.bed
test-data/GraphProt_predict_profile_test_out1.peak_regions_p50.bed
test-data/GraphProt_predict_profile_test_out2.average_profile
test-data/GraphProt_predict_profile_test_out2.peak_regions.bed
test-data/GraphProt_predict_profile_test_out3.average_profile
test-data/GraphProt_predict_profile_test_out3.peak_regions.bed
test-data/GraphProt_predict_profile_test_out4.average_profile
test-data/GraphProt_predict_profile_test_out4.peak_regions.bed
test-data/GraphProt_predict_profile_test_out4.peak_regions_p50.bed
test-data/structure_test.model
test-data/structure_test.params
diff -r 215925e588c4 -r 20429f4c1b95 data/EWSR1_eCLIP_K562_ENCSR887LPK.tar.gz
Binary file data/EWSR1_eCLIP_K562_ENCSR887LPK.tar.gz has changed
diff -r 215925e588c4 -r 20429f4c1b95 data/FMR1_eCLIP_K562_ENCSR331VNX.tar.gz
Binary file data/FMR1_eCLIP_K562_ENCSR331VNX.tar.gz has changed
diff -r 215925e588c4 -r 20429f4c1b95 data/HNRNPC_eCLIP_HepG2_ENCSR550DVK.tar.gz
Binary file data/HNRNPC_eCLIP_HepG2_ENCSR550DVK.tar.gz has changed
diff -r 215925e588c4 -r 20429f4c1b95 data/HUR_PAR-CLIP_HEK293_Mukherjee.tar.gz
Binary file data/HUR_PAR-CLIP_HEK293_Mukherjee.tar.gz has changed
diff -r 215925e588c4 -r 20429f4c1b95 data/IGF2BP1-3_PAR-CLIP_HEK293_Hafner.tar.gz
Binary file data/IGF2BP1-3_PAR-CLIP_HEK293_Hafner.tar.gz has changed
diff -r 215925e588c4 -r 20429f4c1b95 data/IGF2BP1-3_PAR-CLIP_HEK293_Hafner_structure.tar.gz
Binary file data/IGF2BP1-3_PAR-CLIP_HEK293_Hafner_structure.tar.gz has changed
diff -r 215925e588c4 -r 20429f4c1b95 data/KHDRBS1_eCLIP_K562_ENCSR628IDK.tar.gz
Binary file data/KHDRBS1_eCLIP_K562_ENCSR628IDK.tar.gz has changed
diff -r 215925e588c4 -r 20429f4c1b95 data/KHDRBS1_eCLIP_K562_ENCSR628IDK_structure.tar.gz
Binary file data/KHDRBS1_eCLIP_K562_ENCSR628IDK_structure.tar.gz has changed
diff -r 215925e588c4 -r 20429f4c1b95 data/PUM2_PAR-CLIP_HEK293_Hafner.tar.gz
Binary file data/PUM2_PAR-CLIP_HEK293_Hafner.tar.gz has changed
diff -r 215925e588c4 -r 20429f4c1b95 data/PUM2_eCLIP_K562_exonized_3utr.tar.gz
Binary file data/PUM2_eCLIP_K562_exonized_3utr.tar.gz has changed
diff -r 215925e588c4 -r 20429f4c1b95 data/QKI_PAR-CLIP_HEK293_Hafner.tar.gz
Binary file data/QKI_PAR-CLIP_HEK293_Hafner.tar.gz has changed
diff -r 215925e588c4 -r 20429f4c1b95 data/QKI_eCLIP_HepG2_ENCSR570WLM.tar.gz
Binary file data/QKI_eCLIP_HepG2_ENCSR570WLM.tar.gz has changed
diff -r 215925e588c4 -r 20429f4c1b95 data/QKI_eCLIP_HepG2_ENCSR570WLM_structure.tar.gz
Binary file data/QKI_eCLIP_HepG2_ENCSR570WLM_structure.tar.gz has changed
diff -r 215925e588c4 -r 20429f4c1b95 gplib.py
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/gplib.py Wed Jan 22 10:14:41 2020 -0500
b'@@ -0,0 +1,1011 @@\n+\n+from distutils.spawn import find_executable\n+import subprocess\n+import statistics\n+import random\n+import gzip\n+import uuid\n+import sys\n+import re\n+import os\n+\n+"""\n+\n+Run doctests:\n+\n+python3 -m doctest gplib.py\n+\n+\n+"""\n+\n+\n+################################################################################\n+\n+def graphprot_predictions_get_median(predictions_file):\n+    """\n+    Given a GraphProt .predictions file, read in site scores and return \n+    the median value.\n+\n+    >>> test_file = "test-data/test.predictions"\n+    >>> graphprot_predictions_get_median(test_file)\n+    0.571673\n+\n+    """\n+    # Site scores list.\n+    sc_list = []\n+    with open(predictions_file) as f:\n+        for line in f:\n+            cols = line.strip().split("\\t")\n+            score = float(cols[2])\n+            sc_list.append(score)\n+    f.close()\n+    # Return the median.\n+    return statistics.median(sc_list)\n+\n+\n+################################################################################\n+\n+def graphprot_profile_get_top_scores_median(profile_file,\n+                                            profile_type="profile",\n+                                            avg_profile_extlr=5):\n+\n+    """\n+    Given a GraphProt .profile file, extract for each site (identified by \n+    column 1 ID) the top (= highest) score. 
Then return the median of these \n+    top scores.\n+    \n+    profile_type can be either "profile" or "avg_profile".\n+    "avg_profile means that the position-wise scores will first get smoothed \n+    out by calculating for each position a new score through taking a \n+    sequence window -avg_profile_extlr to +avg_profile_extlr of the position \n+    and calculate the mean score over this window and assign it to the position.\n+    After that, the maximum score of each site is chosen, and the median over \n+    all maximum scores is returned.\n+    "profile" leaves the position-wise scores as they are, directly extracting \n+    the maximum for each site and then reporting the median.\n+    \n+    >>> test_file = "test-data/test.profile"\n+    >>> graphprot_profile_get_top_scores_median(test_file)\n+    3.2\n+\n+    """\n+    # Dictionary of lists, with list of scores (value) for each site (key).\n+    lists_dic = {}\n+    with open(profile_file) as f:\n+        for line in f:\n+            cols = line.strip().split("\\t")\n+            seq_id = cols[0]\n+            score = float(cols[2])\n+            if seq_id in lists_dic:\n+                lists_dic[seq_id].append(score)\n+            else:\n+                lists_dic[seq_id] = []\n+                lists_dic[seq_id].append(score)\n+    f.close()\n+    # For each site, extract maximum and store in new list.\n+    max_list = []\n+    for seq_id in lists_dic:\n+        if profile_type == "profile":\n+            max_sc = max(lists_dic[seq_id])\n+            max_list.append(max_sc)\n+        elif profile_type == "avg_profile":\n+            # Convert profile score list to average profile scores list.\n+            aps_list = list_moving_window_average_values(lists_dic[seq_id],\n+                                                         win_extlr=avg_profile_extlr)\n+            max_sc = max(aps_list)\n+            max_list.append(max_sc)\n+        else:\n+            assert 0, "invalid profile_type argument 
given: \\"%s\\"" %(profile_type)\n+    # Return the median.\n+    return statistics.median(max_list)\n+\n+\n+################################################################################\n+\n+def list_moving_window_average_values(in_list, \n+                                      win_extlr=5,\n+                                      method=1):\n+    """\n+    Take a list of numeric values, and calculate for each position a new value, \n+    by taking the mean value of the window of positions -win_extlr and \n+    +win_extlr. If full extension is not possible (at list ends), it just \n+    takes what it gets.\n+    Two implementations of the task are given, chose by method=1 or method=2.\n+\n+    >>> test_list = [2,'..b'_peak:\n+                    merged_peak_list.append(new_peak)\n+                    added_peaks_dic[i] = 1\n+                    added_peaks_dic[j] = 1\n+                else:\n+                    merged_peak_list.append(peak_list[i])\n+                    added_peaks_dic[i] = 1\n+            if not peaks_merged:\n+                iterate = False\n+            peak_list = merged_peak_list\n+            peaks_merged = False\n+    # If peak coordinates should be in .bed format, make peak ends 1-based.\n+    if coords == "bed":\n+        for i in range(len(peak_list)):\n+            peak_list[i][1] += 1\n+            peak_list[i][2] += 1 # 1-base best score position too.\n+    return peak_list\n+\n+\n+################################################################################\n+\n+def bed_peaks_to_genomic_peaks(peak_file, genomic_peak_file, genomic_sites_bed, print_rows=False):\n+    """\n+    Given a .bed file of sequence peak regions (possible coordinates from \n+    0 to length of s), convert peak coordinates to genomic coordinates.\n+    Do this by taking genomic regions of sequences as input.\n+\n+    >>> test_in = "test-data/test.peaks.bed"\n+    >>> test_exp = "test-data/test_exp.peaks.bed"\n+    >>> test_out = 
"test-data/test_out.peaks.bed"\n+    >>> gen_in = "test-data/test.peaks_genomic.bed"\n+    >>> bed_peaks_to_genomic_peaks(test_in, test_out, gen_in)\n+    >>> diff_two_files_identical(test_out, test_exp)\n+    True\n+\n+    """\n+    # Read in genomic region info.\n+    id2row_dic = {}\n+\n+    with open(genomic_sites_bed) as f:\n+        for line in f:\n+            row = line.strip()\n+            cols = line.strip().split("\\t")\n+            site_id = cols[3]\n+            assert site_id not in id2row_dic, "column 4 IDs not unique in given .bed file \\"%s\\"" %(args.genomic_sites_bed)\n+            id2row_dic[site_id] = row\n+    f.close()\n+\n+    # Read in peaks file and convert coordinates.\n+    OUTPEAKS = open(genomic_peak_file, "w")\n+    with open(peak_file) as f:\n+        for line in f:\n+            cols = line.strip().split("\\t")\n+            site_id = cols[0]\n+            site_s = int(cols[1])\n+            site_e = int(cols[2])\n+            site_id2 = cols[3]\n+            site_sc = float(cols[4])\n+            assert re.search(".+,.+", site_id2), "regular expression failed for ID \\"%s\\"" %(site_id2)\n+            m = re.search(".+,(\\d+)", site_id2)\n+            sc_pos = int(m.group(1)) # 1-based.\n+            assert site_id in id2row_dic, "site ID \\"%s\\" not found in genomic sites dictionary" %(site_id)\n+            row = id2row_dic[site_id]\n+            rowl = row.split("\\t")\n+            gen_chr = rowl[0]\n+            gen_s = int(rowl[1])\n+            gen_e = int(rowl[2])\n+            gen_pol = rowl[5]\n+            new_s = site_s + gen_s\n+            new_e = site_e + gen_s\n+            new_sc_pos = sc_pos + gen_s\n+            if gen_pol == "-":\n+                new_s = gen_e - site_e\n+                new_e = gen_e - site_s\n+                new_sc_pos = gen_e - sc_pos + 1 # keep 1-based.\n+            new_row = "%s\\t%i\\t%i\\t%s,%i\\t%f\\t%s" %(gen_chr, new_s, new_e, site_id, new_sc_pos, site_sc, gen_pol)\n+            
OUTPEAKS.write("%s\\n" %(new_row))\n+            if print_rows:\n+                print(new_row)\n+    OUTPEAKS.close()\n+\n+\n+################################################################################\n+\n+def diff_two_files_identical(file1, file2):\n+    """\n+    Check whether two files are identical. Return true if diff reports no \n+    differences.\n+    \n+    >>> file1 = "test-data/file1"\n+    >>> file2 = "test-data/file2"\n+    >>> diff_two_files_identical(file1, file2)\n+    True\n+    >>> file1 = "test-data/test1.bed"\n+    >>> diff_two_files_identical(file1, file2)\n+    False\n+\n+    """\n+    same = True\n+    check_cmd = "diff " + file1 + " " + file2\n+    output = subprocess.getoutput(check_cmd)\n+    if output:\n+        same = False\n+    return same\n+\n+\n+################################################################################\n+\n+\n'
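The coordinate arithmetic in `bed_peaks_to_genomic_peaks` above can be isolated as a small pure function. The following is a minimal sketch, not code from the commit: the name `peak_to_genomic` and its argument order are hypothetical, and it assumes the same conventions the diff shows (0-based, end-exclusive peak bounds on the site; a 1-based best-score position; BED-style genomic bounds for the site).

```python
def peak_to_genomic(site_s, site_e, sc_pos, gen_s, gen_e, strand):
    """Map a peak on a site sequence to genomic coordinates.

    Hypothetical helper mirroring the arithmetic shown in the diff.
    site_s, site_e: 0-based start and end-exclusive peak bounds on the site.
    sc_pos:         1-based position of the best score on the site.
    gen_s, gen_e:   BED-style bounds of the site on the genome.
    strand:         "+" or "-" (column 6 of the genomic sites BED file).
    Returns (new_start, new_end, new_score_position).
    """
    if strand == "-":
        # On the minus strand the site runs right to left, so the peak
        # is mirrored against the genomic end of the site.
        return gen_e - site_e, gen_e - site_s, gen_e - sc_pos + 1  # stays 1-based
    # Plus strand: shift everything by the genomic start of the site.
    return gen_s + site_s, gen_s + site_e, gen_s + sc_pos


# A peak at 10-20 (best score at position 15) on a site spanning 1000-1100.
print(peak_to_genomic(10, 20, 15, 1000, 1100, "+"))
print(peak_to_genomic(10, 20, 15, 1000, 1100, "-"))
```

Keeping the arithmetic separate from file handling makes the strand logic testable without fixture files. One observation on the diffed function itself: its uniqueness assertion formats the message with `args.genomic_sites_bed`, but no `args` is visible in that scope, so a failing assertion would likely raise a `NameError` rather than the intended message; using the `genomic_sites_bed` parameter appears to be what was meant.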
diff -r 215925e588c4 -r 20429f4c1b95 graphprot_predict_profile.xml
--- a/graphprot_predict_profile.xml Fri May 25 12:17:44 2018 -0400
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
b'@@ -1,227 +0,0 @@\n-<tool id="graphprot_predict_profile" name="GraphProt predict profile" version="1.1.7">\n-    <description>- Predict RBP binding profiles</description>\n-    <requirements>\n-        <requirement type="package" version="1.1.7">graphprot</requirement>\n-    </requirements>\n-    <command><![CDATA[\n-    #if $select_model.model_selector == \'select_model_from_repo\':\n-        mkdir -p ./model &&\n-        tar -zxvf \'$__tool_directory__/data/${select_model.repo_model}.tar.gz\' -C ./model &&\n-    #end if\n-    perl \'$__tool_directory__/graphprot_predict_profile_wrapper.pl\' \n-        -fasta \'$fasta_file\'\n-        #if $select_model.model_selector == \'select_model_from_history\':\n-            -model \'$select_model.model_file\'\n-            #if $select_model.set_params.set_params_selector == \'supply_params_file\':\n-                -params \'$select_model.set_params.params_file\'\n-            #elif $select_model.set_params.set_params_selector == \'manual_params_setting\':\n-                #if $select_model.set_params.model_type.model_type_selector == \'sequence\':\n-                    -onlyseq\n-                #elif $select_model.set_params.model_type.model_type_selector == \'structure\':\n-                    -abstraction $select_model.set_params.model_type.gp_abstraction\n-                #end if\n-                -R $select_model.set_params.gp_r\n-                -D $select_model.set_params.gp_d\n-                -bitsize $select_model.set_params.gp_bitsize\n-                -lambda $select_model.set_params.gp_lambda\n-                -epochs $select_model.set_params.gp_epochs\n-                #if $select_model.set_params.gev_options.distr_my\n-                    -distr-my $select_model.set_params.gev_options.distr_my\n-                #end if\n-                #if $select_model.set_params.gev_options.distr_sigma\n-                    -distr-sigma $select_model.set_params.gev_options.distr_sigma\n-                #end if\n-       
         #if $select_model.set_params.gev_options.distr_xi\n-                    -distr-xi $select_model.set_params.gev_options.distr_xi\n-                #end if\n-            #end if\n-        #elif $select_model.model_selector == \'select_model_from_repo\':\n-            -model \'./model/${select_model.repo_model}.model\'\n-            -params \'./model/${select_model.repo_model}.params\'\n-        #end if\n-        $peak_region_options.p50_output\n-        #if $peak_region_options.merge_dist\n-            -merge-dist $peak_region_options.merge_dist\n-        #end if\n-        #if $peak_region_options.p_val_thr\n-            -thr-p $peak_region_options.p_val_thr\n-        #end if\n-        #if $peak_region_options.score_thr\n-            -thr-sc $peak_region_options.score_thr\n-        #end if\n-    ]]></command>\n-    <inputs>\n-        <param name="fasta_file" type="data" format="fasta" label="Input FASTA file" argument="-fasta"\n-               help="FASTA file containing sequences to predict binding profiles on"/>\n-\n-        <conditional name="select_model">\n-            <param name="model_selector" type="select" label="Select GraphProt model" \n-                   help="Select GraphProt model for binding profile prediction">\n-                <option value="select_model_from_history" selected="true">Select model from history</option>\n-                <option value="select_model_from_repo">Select model from repository</option>\n-            </param>\n-            <when value="select_model_from_history">\n-                <param name="model_file" type="data" format="data" label="GraphProt model file" argument="-model"\n-                       help="Predict binding profile for the given GraphProt RBP model"/>\n-                <conditional name="set_params">\n-                    <param name="set_params_selector" type="select" label="Set model parameters">\n-                        <option value="supply_params_file" selected="true">Select parameter file 
from history</option>\n-                        <option value="manual_params_setting">Manually set mo'..b'            <param name="fasta_file" value="test.fa" ftype="fasta"/>\n-            <param name="model_selector" value="select_model_from_history"/>\n-            <param name="model_file" value="structure_test.model"/>\n-            <param name="set_params_selector" value="supply_params_file"/>\n-            <param name="params_file" value="structure_test.params"/>\n-            <param name="model_type_selector" value="structure"/>\n-            <param name="gp_abstraction" value="3"/>\n-            <param name="score_thr" value="2"/>\n-            <output name="average_profile_outfile" file="GraphProt_predict_profile_test_out3.average_profile"/>\n-            <output name="peak_regions_outfile" file="GraphProt_predict_profile_test_out3.peak_regions.bed"/>\n-        </test>\n-        <test>\n-            <param name="fasta_file" value="test.fa" ftype="fasta"/>\n-            <param name="model_selector" value="select_model_from_repo"/>\n-            <param name="repo_model" value="FMR1_eCLIP_K562_ENCSR331VNX"/>\n-            <param name="p50_output" value="True"/>\n-            <output name="average_profile_outfile" file="GraphProt_predict_profile_test_out4.average_profile"/>\n-            <output name="peak_regions_outfile" file="GraphProt_predict_profile_test_out4.peak_regions.bed"/>\n-            <output name="peak_regions_p50_outfile" file="GraphProt_predict_profile_test_out4.peak_regions_p50.bed"/>\n-        </test>\n-\n-    </tests>\n-    <help>\n-\n-Use GraphProt (-action predict_profile) to predict binding profiles for a given RBP model (supplied as .model and .params file) on a given set of FASTA sequences. 
After predicting position-wise scores, the scores are averaged over small windows (11 nt with averaged score position in center) to smooth out the profiles and peak regions are extracted based on the set thresholds (p-value or score) and merge distance.\n-\n-**Output files**\n-\n-The procedure has three output files (third is optional):\n-\n-1) An average_profile file containing averaged position-scores over all supplied sequences\n-\n-2) A peak regions BED file which contains peak-scoring regions above the supplied threshold (p-value default: 0.05, score default: 0)\n-\n-3) A peak regions BED file using the best average score found in at least 50 % of the positive training sites (p50). NOTE that this requires the p50 score to be given in the .params file, otherwise if set an empty file will be output.\n-\n-**Model selection**\n-\n-The GraphProt model used for profile prediction can either be uploaded to history or chosen from an example collection of models (Select model from repository). For the repository models, the corresponding parameter file is selected automatically, providing all model parameters necessary for prediction and p-value calculation. If you choose to upload a model to the history, it is recommended to use the corresponding .params file for automatically setting the model parameters. Otherwise the model parameters have to be entered manually.\n-\n-**p-value calculation**\n-\n-Signifying the GraphProt scores is done by fitting a generalized extreme value (GEV) distribution on a set of scores derived from 10000 transcript sequences for each GraphProt model. The GEV distribution has three parameters: my (location), sigma (scale), and xi (shape). The fitted parameter values usually are read in from the .params file, but can also be entered manually. Parameter fitting was done in R using the minpack.lm_ package, using the probability density function (PDF) described here_. 
If no GEV parameter values are specified (either in .params file or manually), p-value calculation for the scores will be skipped and the peak regions will be extracted based on the set threshold score.\n-\n-.. _here: https://en.wikipedia.org/wiki/Generalized_extreme_value_distribution\n-.. _minpack.lm: https://cran.r-project.org/web/packages/minpack.lm\n-\n-    </help>\n-    <citations>\n-        <citation type="doi">10.1186/gb-2014-15-1-r17</citation>\n-    </citations>\n-</tool>\n'
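The removed help text describes turning scores into p-values through a fitted generalized extreme value (GEV) distribution with parameters my (location), sigma (scale), and xi (shape). The sketch below shows one textbook way to do that, assuming the p-value is the upper-tail probability 1 - F(score) of the standard GEV cumulative distribution function. This is an assumption: the wrapper's actual formula is not visible in this diff, so the function names `gev_cdf` and `gev_p_value` are hypothetical and the results may not reproduce the values stored in a `.params` file.

```python
import math


def gev_cdf(x, my, sigma, xi):
    """Standard GEV cumulative distribution function F(x).

    my: location, sigma: scale (> 0), xi: shape.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - my) / sigma
    if abs(xi) < 1e-12:
        # Gumbel limit for xi -> 0.
        return math.exp(-math.exp(-z))
    t = 1.0 + xi * z
    if t <= 0.0:
        # x lies outside the support: below the lower bound (xi > 0)
        # or above the upper bound (xi < 0).
        return 0.0 if xi > 0 else 1.0
    return math.exp(-(t ** (-1.0 / xi)))


def gev_p_value(score, my, sigma, xi):
    """Upper-tail probability of observing a score at least this high."""
    return 1.0 - gev_cdf(score, my, sigma, xi)


# Higher scores give smaller p-values under the same fitted parameters.
for score in (0.0, 2.0, 4.0):
    print(score, gev_p_value(score, my=0.0, sigma=1.0, xi=0.1))
```

With a p-value function of this kind, a threshold such as the default 0.05 can be applied to averaged profile scores; when no GEV parameters are available, the help text states that peak extraction falls back to the score threshold instead.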
diff -r 215925e588c4 -r 20429f4c1b95 graphprot_predict_profile_wrapper.pl
--- a/graphprot_predict_profile_wrapper.pl Fri May 25 12:17:44 2018 -0400
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
b'@@ -1,764 +0,0 @@\n-#!/usr/bin/perl\n-\n-use strict;\n-use warnings;\n-use Getopt::Long;\n-use Pod::Usage;\n-use Cwd qw(getcwd abs_path);\n-use List::Util qw(sum);\n-\n-=head1 NAME\n-\n-=head1 SYNOPSIS\n-\n-Galaxy wrapper script for GraphProt (-action predict_profile) to compute \n-the binding profile for a given model on a given set of sequences provided \n-in FASTA format. After profile prediction, average profiles get computed, \n-scores signified and binding peak regions extracted. The score signification \n-is done using the provided fitted GEV parameters, either from .params file \n-or manually set. If score threshold is set (-thr-sc), p-value assignment \n-will be skipped and set score threshold will be used to extract peak \n-regions. NOTE: Additional lines .params file are used to store and get \n-GEV parameters, as well as type of model (model_type: sequence|structure).\n-Also, this wrapper currently works for classification mode only.\n-\n-PARAMETERS:\n-\n-    -help|h         display help page\n-    -fasta          Input FASTA file (option -fasta)\n-    -model          Input .model file (option -model)\n-    -params         Input .params file\n-                    NOTE: uses .params file with additional\n-                    parameters\n-                    Manually set parameters (below) will override \n-                    found settings in .params file\n-    -data-id        Data ID (option -prefix)\n-\n-GraphProt model parameters (by default get from .params file):\n-\n-    -onlyseq        Set if model is a sequence model\n-    -R              GraphProt model R parameter\n-    -D              GraphProt model D parameter\n-    -epochs         GraphProt model epochs parameter\n-    -lambda         GraphProt model lambda parameter\n-    -bitsize        GraphProt model bitsize parameter\n-    -abstraction    GraphProt model RNAshapes abstraction level \n-                    parameter (set for structure models)\n-\n-Peak region extraction 
parameters:\n-\n-    -thr-sc         Score threshold for extracting peak regions\n-                    By default p-value of 0.05 is used. If no p-value \n-                    calculation possible, -thr-sc is used with default: 0\n-    -thr-p          p-value threshold for extracting peak regions\n-                    By default, peak regions with p = 0.05 are extracted,\n-                    as well as p50 score peak regions (if info given)\n-                    Default: 0.05\n-    -merge-dist     Maximum merge distance for nearby peak regions\n-                    Default: report all non-overlapping regions\n-    -p50-out        Output p50 score filtered peak regions BED file\n-                    default: false\n-\n-GEV distribution parameters:\n-\n-    -distr-my       GEV distribution my parameter for calculating p-values\n-                    from scores\n-    -distr-sigma    GEV distrubution sigma parameter for calculating \n-                    p-values from scores\n-    -distr-xi       GEV distribution xi parameter for calculating p-values\n-                    from scores\n-    -ap-extlr       Used average profile left right extension for \n-                    averaging scores, which were used for distribution \n-                    fitting. NOTE: usually a value of 5 was used for \n-                    for getting GEV distribution and parameters. 
If you \n-                    choose a different value here, calculated p-values \n-                    will be wrong!\n-                    default : 5\n-\n-\n-=head1 DISCRIPTION\n-\n-5) Write manual\n-6) add output p50 file with NOTE\n-\n-6) put GP into rna_tools\n-\n-NOTE:\n-Additional lines .params file used to store and get gev parameters, as well \n-as type of model (model_type: sequence|structure).\n-\n-Example .params content:\n-epochs: 20\n-lambda: 0.001\n-R: 1\n-D: 4\n-bitsize: 14\n-model_type: sequence\n-#ADDITIONAL MODEL PARAMETERS\n-ap_extlr: 5\n-gev_my: -2.5408\n-gev_sigma: 1.6444\n-gev_xi: -0.1383\n-p50_score: 6.51534 \n-p50_p_val: 0.0009059744 \n-\n-=cut\n-\n-############################\n-# COMMAND LINE CHECKING.\n-####'..b'f, $s, $sc) = (split /\\t/)[0,1,2];\n-        # If file has zero-based positions.\n-        if ($s == 0) {\n-            $zero_pos = 1;\n-        }\n-        # If positions are one-based, make them zero-based.\n-        unless ($zero_pos) {\n-            $s -= 1;\n-        }\n-        # At transcript ends, if in positive region, write and reset.\n-        if ($old_ref ne $ref) {\n-            if ($in eq "Y") {\n-                print OUT "$ref_id\\t$region_s\\t$region_e\\t$end;$best_sc\\t0\\t+\\n";\n-                $in = "N";\n-            }\n-        }\n-        $old_ref = $ref;\n-        # Deal with positive regions.\n-        if ($sc > $min_sc) {\n-            # Start of a positive cluster.\n-            if ($in eq "N") {\n-                $start = $s;\n-                $region_s = $s;\n-                $region_e = $s + 1;\n-                $end = $s + 1;\n-                $best_sc = $sc;\n-                $ref_id = $ref;\n-                $in = "Y";\n-                next;\n-            # Inside a positive cluster.\n-            } elsif ($in eq "Y") {\n-                if ($sc > $best_sc) {\n-                    $start = $s;\n-                    $end = $s + 1;\n-                    $best_sc = $sc;\n-                    
$ref_id = $ref;\n-                }\n-                $region_e++;\n-                next;\n-            }\n-        } else {\n-            # If we were in positive cluster before.\n-            if ($in eq "Y") {\n-                print OUT "$ref_id\\t$region_s\\t$region_e\\t$end;$best_sc\\t0\\t+\\n";\n-                $in = "N";\n-            }\n-        }\n-    }\n-    # After last line processed.\n-    if ($in eq "Y") {\n-      print OUT "$ref_id\\t$region_s\\t$region_e\\t$end;$best_sc\\t0\\t+\\n";\n-      $in = "N";\n-    }\n-    close IN;\n-    close OUT;\n-    # If merge distance zero (i.e. end of one block is -1 from start of next block).\n-    if ($max_merge_dist == 0) {\n-        qx/cat $temp_bed_file > $peak_regions_bed_file/;\n-    } else {\n-        # Merge nearby regions.\n-        open(IN, $temp_bed_file) or die "Cannot open $temp_bed_file: $!";\n-        open(OUT, \'>\', $peak_regions_bed_file) or die "Cannot open $peak_regions_bed_file: $!";\n-        # For storing current block stats.\n-        my $block_chr = 0;\n-        my ($block_s, $block_e, $block_best_pos, $block_best_sc);\n-        while (<IN>) {\n-            chomp;\n-            my ($chr, $s, $e, $id) = (split /\\t/)[0,1,2,3];\n-            my ($best_pos, $best_sc) = (split /;/, $id);\n-            if ($chr eq $block_chr) {\n-                # If $block_e, $s within merge merge.\n-                if ( ($s - $block_e) <= $max_merge_dist ) {\n-                    # Update block stats.\n-                    $block_e = $e;\n-                    if ($block_best_sc < $best_sc) {\n-                        $block_best_sc = $best_sc;\n-                        $block_best_pos = $best_pos;\n-                    }\n-                } else {\n-                    # If $e outside merge range, print block.\n-                    print OUT "$block_chr\\t$block_s\\t$block_e\\t$block_best_pos;$block_best_sc\\t0\\t+\\n";\n-                    # Store new block.\n-                    ($block_chr, $block_s, 
$block_e, $block_best_pos, $block_best_sc) = ($chr, $s, $e, $best_pos, $best_sc);\n-                }\n-\n-            } else {\n-                # If new chromosome, print last block, otherwise it is the first block.\n-                if ($block_chr) {\n-                    print OUT "$block_chr\\t$block_s\\t$block_e\\t$block_best_pos;$block_best_sc\\t0\\t+\\n";\n-                }\n-                ($block_chr, $block_s, $block_e, $block_best_pos, $block_best_sc) = ($chr, $s, $e, $best_pos, $best_sc);\n-            }\n-        \n-        }\n-        # Print last block.\n-        if ($block_chr) {\n-            print OUT "$block_chr\\t$block_s\\t$block_e\\t$block_best_pos;$block_best_sc\\t0\\t+\\n";\n-        }\n-        close OUT;\n-        close IN;\n-    }\n-    qx/rm -f $temp_bed_file/;\n-}\n-\n-\n-################################################################################\n-\n-\n-\n'
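The merge step at the end of the removed Perl wrapper joins peak regions on the same reference when the gap between them is at most the merge distance, keeping the best score and its position. The following is a minimal Python sketch of that idea, not a port of the wrapper: the function name `merge_peak_regions` and the tuple layout are hypothetical, and the input is assumed to be sorted by reference ID and start position, as the Perl loop also requires.

```python
def merge_peak_regions(regions, max_merge_dist):
    """Merge nearby peak regions on the same reference.

    regions: list of (ref_id, start, end, best_pos, best_score) tuples,
             sorted by ref_id and start (hypothetical layout).
    max_merge_dist: largest gap (next start - current end) that is merged.
    """
    merged = []
    for ref_id, start, end, best_pos, best_sc in regions:
        prev = merged[-1] if merged else None
        if prev and prev[0] == ref_id and start - prev[2] <= max_merge_dist:
            # Extend the current block; keep whichever best score is higher.
            if best_sc > prev[4]:
                merged[-1] = (ref_id, prev[1], end, best_pos, best_sc)
            else:
                merged[-1] = (ref_id, prev[1], end, prev[3], prev[4])
        else:
            # New reference or gap too large: start a new block.
            merged.append((ref_id, start, end, best_pos, best_sc))
    return merged


peaks = [
    ("t1", 0, 5, 3, 1.0),
    ("t1", 7, 10, 8, 2.0),
    ("t1", 20, 25, 22, 0.5),
    ("t2", 1, 4, 2, 1.5),
]
for row in merge_peak_regions(peaks, max_merge_dist=2):
    print(row)
```

Unlike the Perl code, which copies the unmerged file when the merge distance is zero, this sketch always runs the loop; for regions produced by thresholding, neighbouring regions are separated by at least one below-threshold position, so the two behaviours should coincide in that case.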
diff -r 215925e588c4 -r 20429f4c1b95 graphprot_predict_wrapper.py
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/graphprot_predict_wrapper.py Wed Jan 22 10:14:41 2020 -0500
b'@@ -0,0 +1,298 @@\n+#!/usr/bin/env python3\n+\n+import subprocess\n+import argparse\n+import shutil\n+import gplib\n+import gzip\n+import sys\n+import os\n+\n+\n+"""\n+\n+TOOL DEPENDENCIES\n+=================\n+\n+GraphProt 1.1.7\n+Best install via:\n+https://anaconda.org/bioconda/graphprot\n+Tested with: miniconda3, conda 4.7.12\n+\n+\n+Script: What\'s my job this time, master?\n+Author: It\'ll be a though one.\n+Script: I take this as a given.\n+Author: Oh yeah?\n+Script: ... I\'m ready.\n+\n+\n+OUTPUT FILES\n+============\n+\n+    data_id.avg_profile\n+    data_id.avg_profile.peaks.bed\n+--conf-out\n+    data_id.avg_profile.p50.peaks.bed\n+--gen-site-bed\n+    data_id.avg_profile.genomic_peaks.bed\n+--conf-out --gen-site-bed\n+    data_id.avg_profile.p50.genomic_peaks.bed\n+--ws-pred\n+    data_id.predictions\n+--ws-pred --conf-out\n+    data_id.predictions\n+    data_id.p50.predictions\n+\n+\n+EXAMPLE CALLS\n+=============\n+\n+python graphprot_predict_wrapper.py --model test2.model --params test2.params --fasta gp_data/test10_predict.fa --data-id test2pred --gp-output\n+python graphprot_predict_wrapper.py --model test2.model --params test2.params --fasta gp_data/test10_predict.fa --data-id test2pred --gen-site-bed gp_data/test10_predict.bed\n+python graphprot_predict_wrapper.py --model test2.model --params test2.params --fasta gp_data/test10_predict.fa --data-id test2pred --gen-site-bed gp_data/test10_predict.bed --conf-out\n+python graphprot_predict_wrapper.py --model test2.model --params test2.params --fasta gp_data/test10_predict.fa --data-id test2pred --conf-out --ws-pred\n+\n+python graphprot_predict_wrapper.py --model test-data/test.model --params test-data/test.params --fasta test-data/test_predict.fa --data-id predtest\n+\n+python graphprot_predict_wrapper.py --model test-data/test.model --params test-data/test.params --fasta test-data/test_predict.fa --data-id predtest --gen-site-bed test-data/test_predict.bed --sc-thr 0.0 --max-merge-dist 0 
--conf-out  --ap-extlr 5\n+\n+python graphprot_predict_wrapper.py --data-id GraphProt --fasta test-data/test_predict.fa --model test-data/test.model --params test-data/test.params --gen-site-bed test-data/test_predict.bed --sc-thr 0.0 --max-merge-dist 0 --conf-out  --ap-extlr 5\n+\n+\n+pwd && python \'/home/uhlm/Dokumente/Projekte/GraphProt_galaxy_new/galaxytools/tools/rna_tools/graphprot/graphprot_predict_wrapper.py\' --data-id GraphProt --fasta /tmp/tmpmuslpc1h/files/0/8/c/dataset_08c48d88-e3b5-423b-acf6-bf89b8c60660.dat --model /tmp/tmpmuslpc1h/files/e/6/4/dataset_e6471bb4-e74c-4372-bc49-656f900e7191.dat --params /tmp/tmpmuslpc1h/files/b/6/5/dataset_b65e8cf4-d3e6-429e-8d57-1d401adf4b3c.dat --gen-site-bed /tmp/tmpmuslpc1h/files/5/1/a/dataset_51a38b65-5943-472d-853e-5d845fa8ac3e.dat --sc-thr 0.0 --max-merge-dist 0 --conf-out  --ap-extlr 5\n+\n+\n+"""\n+\n+################################################################################\n+\n+def setup_argument_parser():\n+    """Setup argparse parser."""\n+    help_description = """\n+    Galaxy wrapper script for GraphProt (-action predict and -action \n+    predict_profile) to compute whole site or position-wise scores for input \n+    FASTA sequences.\n+    By default, profile predictions are calculated, followed by average \n+    profiles computions and peak regions extraction from average profiles.\n+    If --ws-pred is set, whole site score predictions on input sequences\n+    will be run instead.\n+    If --conf-out is set, sites or peak regions with a score >= the median \n+    score of positive training sites will be output.\n+    If --gen-site-bed .bed file is provided, peak regions will be output \n+    with genomic coordinates too.\n+\n+    """\n+    # Define argument parser.\n+    p = argparse.ArgumentParser(add_help=False,\n+                                prog="graphprot_predict_wrapper.py",\n+                                description=help_description,\n+                                
formatter_class=argparse.MetavarTypeHelpFormatter)\n+\n+    # Argument groups.\n+    p_man = p.add_argument_group("REQUIR'..b't_ws_predictions_file,\n+                                                      sc_thr=pos_train_ws_pred_median)\n+    else:\n+        # Do profile prediction.\n+        print("Starting profile predictions on on input .fa file (-action predict_profile) ... ")\n+        check_cmd = "GraphProt.pl -action predict_profile -prefix " + args.data_id + " -fasta " + args.in_fa + " " + param_string + " -model " + args.in_model\n+        output = subprocess.getoutput(check_cmd)\n+        assert output, "the following call of GraphProt.pl produced no output:\\n%s" %(check_cmd)\n+        if args.gp_output:\n+            print(output)\n+        profile_predictions_file = args.data_id + ".profile"\n+        assert os.path.exists(profile_predictions_file), "Profile prediction output .profile file \\"%s\\" not found" %(profile_predictions_file)\n+\n+        # Profile prediction output files.\n+        avg_prof_file = args.data_id + ".avg_profile"\n+        avg_prof_peaks_file = args.data_id + ".avg_profile.peaks.bed"\n+        avg_prof_gen_peaks_file = args.data_id + ".avg_profile.genomic_peaks.bed"\n+        avg_prof_peaks_p50_file = args.data_id + ".avg_profile.p50.peaks.bed"\n+        avg_prof_gen_peaks_p50_file = args.data_id + ".avg_profile.p50.genomic_peaks.bed"\n+\n+        # Get sequence IDs in order from input .fa file.\n+        seq_ids_list = gplib.fasta_read_in_ids(args.in_fa)\n+        # Calculate average profiles.\n+        print("Getting average profile from profile (extlr for smoothing: %i) ... 
" %(args.ap_extlr))\n+        gplib.graphprot_profile_calculate_avg_profile(profile_predictions_file,\n+                                                      avg_prof_file,\n+                                                      ap_extlr=args.ap_extlr,\n+                                                      seq_ids_list=seq_ids_list,\n+                                                      method=2)\n+        # Extract peak regions on sequences with threshold score 0.\n+        print("Extracting peak regions from average profile (score threshold = 0) ... ")\n+        gplib.graphprot_profile_extract_peak_regions(avg_prof_file, avg_prof_peaks_file,\n+                                               max_merge_dist=args.max_merge_dist,\n+                                               sc_thr=args.score_thr)\n+        # Convert peaks to genomic coordinates.\n+        if args.genomic_sites_bed:\n+            print("Converting peak regions to genomic coordinates ... ")\n+            gplib.bed_peaks_to_genomic_peaks(avg_prof_peaks_file, avg_prof_gen_peaks_file,\n+                                             print_rows=False,\n+                                             genomic_sites_bed=args.genomic_sites_bed)\n+            # gplib.make_file_copy(avg_prof_gen_peaks_file, avg_prof_peaks_file)\n+        # Extract peak regions with threshold score p50.\n+        if args.conf_out:\n+            sc_id = "pos_train_avg_profile_median_%i" %(args.ap_extlr)\n+            # Filter by pos_train_ws_pred_median median.\n+            assert sc_id in param_dic, "average profile extlr %i median information missing in .params file" %(args.ap_extlr)\n+            p50_sc_thr = float(param_dic[sc_id])\n+            print("Extracting p50 peak regions from average profile (score threshold = %f) ... 
" %(p50_sc_thr))\n+            gplib.graphprot_profile_extract_peak_regions(avg_prof_file, avg_prof_peaks_p50_file,\n+                                                         max_merge_dist=args.max_merge_dist,\n+                                                         sc_thr=p50_sc_thr)\n+            # Convert peaks to genomic coordinates.\n+            if args.genomic_sites_bed:\n+                print("Converting p50 peak regions to genomic coordinates ... ")\n+                gplib.bed_peaks_to_genomic_peaks(avg_prof_peaks_p50_file, avg_prof_gen_peaks_p50_file,\n+                                                 genomic_sites_bed=args.genomic_sites_bed)\n+    # Done.\n+    print("Script: I\'m done.")\n+    print("Author: ... ")\n+\n+\n'
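Note on the average profile step in graphprot_predict_wrapper.py: the smoothing is delegated to gplib.graphprot_profile_calculate_avg_profile, whose body is not part of the excerpt shown here. The following is only a minimal sketch of what such smoothing might look like, assuming a plain mean over a window of ap_extlr positions on each side, truncated at the sequence ends. The function name average_profile is hypothetical, and the actual behaviour of method=2 in gplib may differ.

```python
def average_profile(scores, extlr=5):
    """Smooth per-nucleotide scores with a centered window of +/- extlr positions.

    Assumption: a plain arithmetic mean, with the window cut off at the
    sequence ends. This is an illustration, not the gplib implementation.
    """
    n = len(scores)
    smoothed = []
    for i in range(n):
        start = max(0, i - extlr)          # clip window at sequence start
        end = min(n, i + extlr + 1)        # clip window at sequence end
        window = scores[start:end]
        smoothed.append(sum(window) / len(window))
    return smoothed


if __name__ == "__main__":
    # With extlr=0 the window holds one position, so the profile is unchanged.
    print(average_profile([0.0, 3.0, 6.0], extlr=0))
    print(average_profile([0.0, 3.0, 6.0], extlr=1))
```

Under this assumption an extension of 0 reproduces the input profile, which matches the tool help statement that the lowest value 0 equals the initial profile.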
diff -r 215925e588c4 -r 20429f4c1b95 graphprot_train_predict.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/graphprot_train_predict.xml Wed Jan 22 10:14:41 2020 -0500
b'@@ -0,0 +1,280 @@\n+<tool id="graphprot_predict_profile" name="GraphProt" version="1.1.7+galaxy1">\n+    <description>- Train models and predict RBP binding profiles</description>\n+    <requirements>\n+        <requirement type="package" version="1.1.7">graphprot</requirement>\n+    </requirements>\n+\n+    <command detect_errors="exit_code"><![CDATA[\n+        #if $action_type.action_type_selector == \'train\':\n+            python \'$__tool_directory__/graphprot_train_wrapper.py\'\n+                --data-id GraphProt\n+                --pos \'$action_type.pos_fasta_file\'\n+                --neg \'$action_type.neg_fasta_file\'\n+                $action_type.train_str_model\n+                #if $action_type.hpo_options.hpo_mode_type.hpo_mode_type_selector == \'take\':\n+                    --opt-set-size $action_type.hpo_options.hpo_mode_type.opt_set_size\n+                #elif $action_type.hpo_options.hpo_mode_type.hpo_mode_type_selector == \'supply\':\n+                    --opt-pos \'$action_type.hpo_options.hpo_mode_type.pos_parop_fasta\'\n+                    --opt-neg \'$action_type.hpo_options.hpo_mode_type.neg_parop_fasta\'\n+                #end if\n+                $action_type.training_options.disable_cv\n+                $action_type.training_options.disable_motifs\n+                --min-train $action_type.training_options.min_train\n+\n+        #elif $action_type.action_type_selector == \'predict\':\n+            python \'$__tool_directory__/graphprot_predict_wrapper.py\'\n+                --data-id GraphProt\n+                --fasta \'$action_type.input_fasta_file\'\n+                --model \'$action_type.model_file\'\n+                --params $action_type.params_file\n+                #if $action_type.genomic_sites_bed_file:\n+                    --gen-site-bed \'$action_type.genomic_sites_bed_file\'\n+                #end if\n+                --sc-thr $action_type.prediction_options.score_thr\n+                --max-merge-dist 
$action_type.prediction_options.max_merge_dist\n+                --ap-extlr $action_type.prediction_options.ap_extlr\n+                $action_type.prediction_options.conf_out\n+                $action_type.prediction_options.ws_pred_out\n+        #end if\n+\n+\n+    ]]></command>\n+\n+    <inputs>\n+        <conditional name="action_type">\n+        \n+            <param name="action_type_selector" type="select" label="Select an action">\n+                <option value="train" selected="true">Train a model</option>\n+                <option value="predict">Predict on input sequences</option>\n+            </param>\n+\n+            <when value="train">\n+                <param name="pos_fasta_file" type="data" format="fasta"\n+                       label="Positive sequences FASTA file" argument="-fasta"\n+                       help="Positive sequences (== RBP binding sites) FASTA file for model training"/>\n+                <param name="neg_fasta_file" type="data" format="fasta"\n+                       label="Negative sequences FASTA file" argument="-negfasta"\n+                       help="Negative sequences FASTA file for model training"/>\n+                <param name="train_str_model" label="Train a structure model" type="boolean"\n+                       truevalue="--str-model" falsevalue="" checked="False"\n+                       help="Train a structure model (default: train a sequence model)"/>\n+\n+                <section name="hpo_options" title="Hyperparameter optimization settings">\n+\n+                    <conditional name="hpo_mode_type">\n+                        <param name="hpo_mode_type_selector" type="select" label="Select strategy">\n+                            <option value="take" selected="true">Take sequences for optimization from input</option>\n+                            <option value="supply">Supply sequences for optimization</option>\n+                        </param>\n+                        <when value="take">\n+                
            <param name="opt_set_size" type="integer" value="500"\n+                                   label="'..b'ks.bed"/>\n+        </test>\n+\n+        <test expect_num_outputs="2">\n+            <param name="action_type_selector" value="predict"/>\n+            <param name="input_fasta_file" value="test_predict.fa" ftype="fasta"/>\n+            <param name="model_file" value="test.model" ftype="txt"/>\n+            <param name="params_file" value="test.params" ftype="txt"/>\n+            <param name="ws_pred_out" value="True"/>\n+            <param name="conf_out" value="True"/>\n+            <output name="predictions_out_file" file="test_predict.predictions"/>\n+            <output name="p50_predictions_out_file" file="test_predict.p50.predictions"/>\n+        </test>\n+\n+    </tests>\n+    <help>\n+\n+Use GraphProt to train a model or to predict RBP binding profiles using a pretrained RBP model.\n+\n+\n+**Model training**\n+\n+To train a GraphProt model, a FASTA file with positive sequences (= RBP binding sites, usually determined by CLIP-seq) and a FASTA file with negative sequences (non-binding, e.g. randomly selected genomic sites) needs to be supplied. By default a sequence model is trained, since they often show similar performance compared to structure models while taking considerably less time to train. For hyperparameter optimization, a portion of the input FASTA sequences (usually n = 500) is taken away, but you can also provide separate optimization sets. After hyperparameter optimization, a model is trained using the input training sequences (minus the optimization set if not specified otherwise) with the determined optimized parameters. After that, a 10-fold cross validation is run on the training sequences to estimate the generalization performance of the model. Sequence and structure motifs (if structure model training enabled) are also output. Both cross validation and motif output can be disabled to further decrease the runtime. 
\n+\n+By default, the model training output files are:\n+\n+1) a .model file storing the model parameters\n+\n+2) a .params file storing model hyperparameters and additional information\n+\n+3) a .cv_results file containing the cross validation results\n+\n+4) _motif and motif.png files (sequence and / or structure)\n+\n+\n+**Profile prediction**\n+\n+This mode computes whole site or position-wise (= profile) binding scores for a given set input FASTA sequences. \n+\n+By default, binding profiles are calculated, followed by average profile computation and extraction of peak regions from the average profiles. The average binding profile is more smooth regarding the position-wise (per nucleotide) scores than the initial profile GraphProt outputs and is the recommended way to extract peaks. Note that the amount of smoothness can be controlled in the prediction options (with the lowest value 0 equaling the initial profile). A peak is defined as a contiguous region in the average profile with scores >= the set score threshold (by default 0, can be changed). In addition, a set of high confidence peak regions (p50) can be output. Here the threshold gets set to the median of the scores obtained from the positive training set during model training (information stored in parameters file). Moreover, the peak regions can be converted to genomic regions, if the genomic regions for the input FASTA sequences are supplied. \n+\n+Apart from predicting binding profiles, whole site predictions can be output as well. 
Here the output files are the scores for each input sequence, and optionally the p50 filtered set just like with the average profile peaks.\n+\n+\n+Summing up, the profile predictions output files are:\n+\n+1) an avg_profile file containing the position-wise (per nucleotide) binding profile scores\n+\n+2) one or several BED files containing the peak regions (all peaks, p50 peaks, all genomic peaks, p50 genomic peaks)\n+\n+3) if whole site prediction is enabled, a .predictions file and optionally a .p50.predictions file\n+\n+    </help>\n+    <citations>\n+        <citation type="doi">10.1186/gb-2014-15-1-r17</citation>\n+    </citations>\n+</tool>\n+\n+\n'
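Note on the peak definition in the tool help: a peak is described as a contiguous region of the average profile with scores >= the score threshold, and --max-merge-dist controls merging of nearby peaks. The sketch below illustrates that definition under stated assumptions: regions are reported with a 0-based start and an exclusive end, the top position is 1-based, and the merge distance is taken as the number of nucleotides between the end of one peak and the start of the next. The function name extract_peak_regions is hypothetical and this is not the gplib code.

```python
def extract_peak_regions(scores, sc_thr=0.0, max_merge_dist=0):
    """Return (start, end, top_pos, top_score) tuples for one sequence.

    Assumptions (illustration only): start is 0-based, end is exclusive,
    top_pos is 1-based, and two peaks are merged when the gap between
    them is <= max_merge_dist nucleotides.
    """
    # 1) Collect contiguous runs of positions with score >= threshold.
    regions = []
    start = None
    for i, sc in enumerate(scores):
        if sc >= sc_thr:
            if start is None:
                start = i
        elif start is not None:
            regions.append([start, i])
            start = None
    if start is not None:
        regions.append([start, len(scores)])

    # 2) Merge runs that lie within max_merge_dist of each other.
    merged = []
    for s, e in regions:
        if merged and s - merged[-1][1] <= max_merge_dist:
            merged[-1][1] = e
        else:
            merged.append([s, e])

    # 3) Annotate each region with its highest-scoring position.
    peaks = []
    for s, e in merged:
        top = max(range(s, e), key=lambda i: scores[i])
        peaks.append((s, e, top + 1, scores[top]))
    return peaks


if __name__ == "__main__":
    profile = [-1.0, 0.5, 2.0, -0.3, 1.0]
    print(extract_peak_regions(profile))
    print(extract_peak_regions(profile, max_merge_dist=1))
```

The p50 output described in the help would correspond to calling the same routine with sc_thr set to the median value read from the .params file instead of the default threshold.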
diff -r 215925e588c4 -r 20429f4c1b95 graphprot_train_wrapper.py
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/graphprot_train_wrapper.py Wed Jan 22 10:14:41 2020 -0500
b'@@ -0,0 +1,431 @@\n+#!/usr/bin/env python3\n+\n+import subprocess\n+import argparse\n+import shutil\n+import gplib\n+import gzip\n+import sys\n+import os\n+\n+\n+"""\n+\n+TOOL DEPENDENCIES\n+=================\n+\n+GraphProt 1.1.7\n+Best install via:\n+https://anaconda.org/bioconda/graphprot\n+Tested with: miniconda3, conda 4.7.12\n+\n+\n+OUTPUT FILES\n+============\n+\n+    data_id.model\n+    data_id.params\n+if not --disable-cv:\n+    data_id.cv_results\n+if not --disable-motifs:\n+    data_id.sequence_motif\n+    data_id.sequence_motif.png\n+    if --str-model:\n+        data_id.structure_motif\n+        data_id.structure_motif.png\n+Temporary:\n+    data_id.predictions\n+    data_id.profile\n+\n+\n+  --opt-set-size int  Hyperparameter optimization set size (taken away from both --pos and --neg) (default: 500)\n+  --opt-pos str       Positive (= binding site) sequences .fa file for hyperparameter optimization (default: take\n+                      --opt-set-size from --pos)\n+  --opt-neg str       Negative sequences .fa file for hyperparameter optimization (default: take --opt-set-size\n+                      from --neg)\n+  --min-train int     Minimum amount of training sites demanded (default: 500)\n+  --disable-cv        Disable cross validation step (default: false)\n+  --disable-motifs    Disable motif generation step (default: false)\n+  --gp-output         Print output produced by GraphProt (default: false)\n+  --str-model         Train a structure model (default: train a sequence model)\n+\n+\n+EXAMPLE CALLS\n+=============\n+\n+python graphprot_train_wrapper.py --pos gp_data/SERBP1_positives.train.fa --neg gp_data/SERBP1_negatives.train.fa --data-id test2 --disable-cv --gp-output --opt-set-size 200 --min-train 400\n+\n+python graphprot_train_wrapper.py --pos gp_data/SERBP1_positives.train.fa --neg gp_data/SERBP1_negatives.train.fa --data-id test2 --disable-cv --opt-set-size 100 --min-train 200\n+\n+python graphprot_train_wrapper.py --pos 
test-data/test_positives.train.fa --neg test-data/test_negatives.train.fa --data-id gptest2 --disable-cv --opt-pos test-data/test_positives.parop.fa --opt-neg test-data/test_negatives.parop.fa\n+\n+python graphprot_train_wrapper.py --pos test-data/test_positives.train.fa --neg test-data/test_negatives.train.fa --data-id gptest2 --disable-cv --disable-motifs --opt-pos test-data/test_positives.parop.fa --opt-neg test-data/test_negatives.parop.fa\n+\n+\n+"""\n+\n+################################################################################\n+\n+def setup_argument_parser():\n+    """Setup argparse parser."""\n+    help_description = """\n+    Galaxy wrapper script for GraphProt to train a GraphProt model on \n+    a given set of input sequences (positives and negatives .fa). By \n+    default a sequence model is trained (due to structure models \n+    being much slower to train). Also by default take a portion of \n+    the input sequences for hyperparameter optimization (HPO) prior to \n+    model training, and run a 10-fold cross validation and motif \n+    generation after model training. Thus the following output \n+    files are produced: \n+    .model model file, .params model parameter file, .png motif files \n+    (sequence, or sequence+structure), .cv_results CV results file.\n+    After model training, predict on positives to get highest whole \n+    site and profile scores found in binding sites. 
Take the median\n+    score out of these to store in .params file, using it later\n+    for outputting binding sites or peaks with higher confidence.\n+\n+    """\n+    # Define argument parser.\n+    p = argparse.ArgumentParser(add_help=False,\n+                                prog="graphprot_train_wrapper.py",\n+                                description=help_description,\n+                                formatter_class=argparse.MetavarTypeHelpFormatter)\n+\n+    # Argument groups.\n+    p_man = p.add_argument_group("REQUIRED ARGUMENTS")\n+    p_opt = p.add_argument_group("OPTIONAL ARGUMENTS")\n+\n+    # Required arguments.\n+    p_opt.add_argument("-h", "--help",\n+      '..b'file file.\n+    For .profile, first extract for each site the maximum score, and then \n+    from the list of maximum site scores get the median.\n+    For whole site .predictions, get the median from the site scores list.\n+    \n+    """\n+    print("Getting .profile and .predictions median scores ... ")\n+\n+    # Whole site scores median.\n+    ws_pred_median = gplib.graphprot_predictions_get_median(ws_predictions_file)\n+    # Profile top site scores median.\n+    profile_median = gplib.graphprot_profile_get_top_scores_median(profile_predictions_file, \n+                                                                     profile_type="profile")\n+    ws_pred_string = "pos_train_ws_pred_median: %f" %(ws_pred_median)\n+    profile_string = "pos_train_profile_median: %f" %(profile_median)\n+    gplib.echo_add_to_file(ws_pred_string, params_file)\n+    gplib.echo_add_to_file(profile_string, params_file)\n+    # Average profile top site scores median for extlr 1 to 10.\n+    for i in range(10):\n+        i += 1\n+        avg_profile_median = gplib.graphprot_profile_get_top_scores_median(profile_predictions_file,\n+                                                                             profile_type="avg_profile",\n+                                                                     
        avg_profile_extlr=i)\n+                                                                        \n+        avg_profile_string = "pos_train_avg_profile_median_%i: %f" %(i, avg_profile_median)\n+        gplib.echo_add_to_file(avg_profile_string, params_file)\n+\n+    print("Script: I\'m done.")\n+    print("Author: Good. Now go back to your file system directory.")\n+    print("Script: Ok.")\n+\n+\n+"""\n+\n+OLD CODE ...\n+\n+    p.add_argument("--ap-extlr",\n+                   dest="ap_extlr",\n+                   type = int,\n+                   default = 5,\n+                   help = "Define average profile up- and downstream extension for averaging scores to produce the average profile. This is used to get the median average profile score, which will be stored in the .params file to later be used in a prediction setting as a second filter value to get more confident peak regions. NOTE that you have to use the same value in model training and prediction! (default: 5)")\n+\n+\n+    p.add_argument("--disable-opt",\n+                   dest = "disable_opt",\n+                   default = False,\n+                   action = "store_true",\n+                   help = "Disable hyperparameter optimization (HPO) (default: optimize hyperparameters)")\n+    p.add_argument("--R",\n+                   dest = "param_r",\n+                   type = int,\n+                   default = False,\n+                   help = "GraphProt model R parameter (default: determined by HPO)")\n+    p.add_argument("--D",\n+                   dest = "param_d",\n+                   type = int,\n+                   default = False,\n+                   help = "GraphProt model D parameter (default: determined by HPO)")\n+    p.add_argument("--epochs",\n+                   dest = "param_epochs",\n+                   type = int,\n+                   default = False,\n+                   help = "GraphProt model epochs parameter (default: determined by HPO)")\n+    
p.add_argument("--lambda",\n+                   dest = "param_lambda",\n+                   type = float,\n+                   default = False,\n+                   help = "GraphProt model lambda parameter (default: determined by HPO)")\n+    p.add_argument("--bitsize",\n+                   dest = "param_bitsize",\n+                   type = int,\n+                   default = False,\n+                   help = "GraphProt model bitsize parameter (default: determined by HPO)")\n+    p.add_argument("--abstraction",\n+                   dest = "param_abstraction",\n+                   type = int,\n+                   default = False,\n+                   help = "GraphProt model RNAshapes abstraction level parameter for training structure models (default: determined by HPO)")\n+\n+"""\n+\n+\n+\n'
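Note on the median computation in graphprot_train_wrapper.py: the docstring states that for a .profile file the maximum score per site is extracted first and the median is then taken over those maxima. A minimal sketch of that two-step calculation follows. It assumes the profile has already been parsed into (site_id, score) pairs; the function name top_scores_median and that input shape are hypothetical, and gplib.graphprot_profile_get_top_scores_median may read and process the file differently.

```python
import statistics


def top_scores_median(profile_rows):
    """Median of the per-site maximum scores.

    profile_rows: iterable of (site_id, score) pairs, one per position.
    Assumption: the .profile file was parsed into this shape beforehand.
    """
    top_scores = {}
    for site_id, score in profile_rows:
        # Keep only the highest score seen for each site.
        if site_id not in top_scores or score > top_scores[site_id]:
            top_scores[site_id] = score
    return statistics.median(top_scores.values())


if __name__ == "__main__":
    rows = [("a", 1.0), ("a", 3.0), ("b", -1.0), ("b", 0.5), ("c", 2.0)]
    median = top_scores_median(rows)
    # Same "key: value" line format the wrapper appends to the .params file.
    print("pos_train_profile_median: %f" % median)
```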
diff -r 215925e588c4 -r 20429f4c1b95 test-data/GraphProt_predict_profile_test_out1.average_profile
--- a/test-data/GraphProt_predict_profile_test_out1.average_profile Fri May 25 12:17:44 2018 -0400
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
b'@@ -1,4532 +0,0 @@\n-ENST00000620398\t1\t0.88223\t0.08228291\n-ENST00000620398\t2\t0.76226\t0.09068277\n-ENST00000620398\t3\t0.70164\t0.09518219\n-ENST00000620398\t4\t0.04769\t0.15586177\n-ENST00000620398\t5\t0.03856\t0.15687939\n-ENST00000620398\t6\t0.00464\t0.16070393\n-ENST00000620398\t7\t0.17725\t0.14195340\n-ENST00000620398\t8\t-0.08646\t0.17132122\n-ENST00000620398\t9\t-0.61662\t0.24347692\n-ENST00000620398\t10\t-1.24997\t0.35314501\n-ENST00000620398\t11\t-1.87072\t0.48181594\n-ENST00000620398\t12\t-2.09189\t0.53096821\n-ENST00000620398\t13\t-2.27211\t0.57158527\n-ENST00000620398\t14\t-2.74742\t0.67782581\n-ENST00000620398\t15\t-2.57531\t0.63982927\n-ENST00000620398\t16\t-2.63088\t0.65218689\n-ENST00000620398\t17\t-2.73678\t0.67550300\n-ENST00000620398\t18\t-3.17049\t0.76596383\n-ENST00000620398\t19\t-3.16859\t0.76559075\n-ENST00000620398\t20\t-2.77097\t0.68295264\n-ENST00000620398\t21\t-2.80243\t0.68976911\n-ENST00000620398\t22\t-2.42409\t0.60590345\n-ENST00000620398\t23\t-2.58751\t0.64254848\n-ENST00000620398\t24\t-2.88080\t0.70657350\n-ENST00000620398\t25\t-2.32299\t0.58308238\n-ENST00000620398\t26\t-1.83523\t0.47404309\n-ENST00000620398\t27\t-1.78105\t0.46225287\n-ENST00000620398\t28\t-1.55606\t0.41445150\n-ENST00000620398\t29\t-0.77370\t0.26833083\n-ENST00000620398\t30\t-1.08246\t0.32174818\n-ENST00000620398\t31\t-1.67262\t0.43896511\n-ENST00000620398\t32\t-1.58116\t0.41968105\n-ENST00000620398\t33\t-1.28700\t0.36030150\n-ENST00000620398\t34\t-0.87017\t0.28437928\n-ENST00000620398\t35\t-0.75886\t0.26591482\n-ENST00000620398\t36\t-0.45220\t0.21917110\n-ENST00000620398\t37\t-0.19484\t0.18461904\n-ENST00000620398\t38\t0.42151\t0.11834288\n-ENST00000620398\t39\t0.86845\t0.08321446\n-ENST00000620398\t40\t1.05424\t0.07135026\n-ENST00000620398\t41\t1.64441\t0.04248276\n-ENST00000620398\t42\t2.53989\t0.01761073\n-ENST00000620398\t43\t3.74771\t0.00432059\n-ENST00000620398\t44\t3.85191\t0.00377309\n-ENST00000620398\t45\t4.06806\t0.00282450\n-ENST00000620398\t46\t
5.04742\t0.00064169\n-ENST00000620398\t47\t5.27414\t0.00043386\n-ENST00000620398\t48\t5.28355\t0.00042667\n-ENST00000620398\t49\t4.31953\t0.00198574\n-ENST00000620398\t50\t4.04489\t0.00291520\n-ENST00000620398\t51\t3.73702\t0.00438043\n-ENST00000620398\t52\t3.59050\t0.00527536\n-ENST00000620398\t53\t3.50371\t0.00587617\n-ENST00000620398\t54\t2.16195\t0.02591594\n-ENST00000620398\t55\t1.52851\t0.04720970\n-ENST00000620398\t56\t0.75477\t0.09122927\n-ENST00000620398\t57\t0.27929\t0.13168447\n-ENST00000620398\t58\t-0.40566\t0.21260864\n-ENST00000620398\t59\t-1.04326\t0.31463965\n-ENST00000620398\t60\t-0.66184\t0.25046860\n-ENST00000620398\t61\t-1.05862\t0.31741399\n-ENST00000620398\t62\t-1.14323\t0.33294968\n-ENST00000620398\t63\t-1.20640\t0.34482279\n-ENST00000620398\t64\t-1.54031\t0.41118443\n-ENST00000620398\t65\t-0.97978\t0.30332632\n-ENST00000620398\t66\t-0.69353\t0.25544713\n-ENST00000620398\t67\t-0.21147\t0.18672444\n-ENST00000620398\t68\t-0.43692\t0.21700111\n-ENST00000620398\t69\t-0.58806\t0.23912932\n-ENST00000620398\t70\t-0.91130\t0.29140084\n-ENST00000620398\t71\t-1.38461\t0.37952176\n-ENST00000620398\t72\t-1.43470\t0.38957794\n-ENST00000620398\t73\t-1.66021\t0.43632827\n-ENST00000620398\t74\t-2.16178\t0.54668495\n-ENST00000620398\t75\t-2.35417\t0.59012625\n-ENST00000620398\t76\t-2.78170\t0.68528178\n-ENST00000620398\t77\t-2.90396\t0.71148736\n-ENST00000620398\t78\t-3.28006\t0.78705586\n-ENST00000620398\t79\t-3.38188\t0.80586242\n-ENST00000620398\t80\t-3.69356\t0.85807473\n-ENST00000620398\t81\t-3.62730\t0.84769347\n-ENST00000620398\t82\t-3.31309\t0.79324367\n-ENST00000620398\t83\t-3.09453\t0.75086489\n-ENST00000620398\t84\t-2.91889\t0.71464158\n-ENST00000620398\t85\t-2.47843\t0.61813397\n-ENST00000620398\t86\t-2.69840\t0.66709253\n-ENST00000620398\t87\t-2.31198\t0.58059456\n-ENST00000620398\t88\t-2.65231\t0.65693177\n-ENST00000620398\t89\t-2.55476\t0.63524177\n-ENST00000620398\t90\t-2.49715\t0.62233851\n-ENST00000620398\t91\t-2.10495\t0.53390032\n-ENST00000
620398\t92\t-1.87496\t0.48274705\n-ENST00000620398\t93\t-1.66216\t0.43674219\n-ENST00000620398\t94\t-1.74058\t0.45351070\n-ENST00000620398\t95\t-2.31594\t0.58148938\n-ENST00000620398\t96\t-2.15655\t0.54550686\n-ENST00000620398\t97\t-1.36825\t0.37626516\n-ENST00000620398\t98\t-1.68358\t0.44129894\n-ENST00000620398\t99\t-1.33847\t0.37037316\n-ENST00000620398\t100\t-1.06634\t0.31881377\n-ENST0000062039'..b'622300\t2088\t-2.71949\t0.67172014\n-ENST00000622300\t2089\t-2.81494\t0.69246888\n-ENST00000622300\t2090\t-2.87752\t0.70587557\n-ENST00000622300\t2091\t-2.56178\t0.63680987\n-ENST00000622300\t2092\t-2.84411\t0.69873905\n-ENST00000622300\t2093\t-3.01108\t0.73386785\n-ENST00000622300\t2094\t-3.03936\t0.73967348\n-ENST00000622300\t2095\t-2.59397\t0.64398697\n-ENST00000622300\t2096\t-2.34811\t0.58875748\n-ENST00000622300\t2097\t-1.77121\t0.46012202\n-ENST00000622300\t2098\t-0.99290\t0.30564429\n-ENST00000622300\t2099\t-0.39795\t0.21153496\n-ENST00000622300\t2100\t-0.43876\t0.21726162\n-ENST00000622300\t2101\t-0.10996\t0.17414262\n-ENST00000622300\t2102\t0.31793\t0.12794945\n-ENST00000622300\t2103\t0.39982\t0.12030626\n-ENST00000622300\t2104\t0.75355\t0.09131853\n-ENST00000622300\t2105\t0.52498\t0.10932052\n-ENST00000622300\t2106\t0.36439\t0.12356809\n-ENST00000622300\t2107\t0.40343\t0.11997772\n-ENST00000622300\t2108\t0.42899\t0.11767165\n-ENST00000622300\t2109\t0.41707\t0.11874273\n-ENST00000622300\t2110\t-0.26861\t0.19409148\n-ENST00000622300\t2111\t0.67260\t0.09740011\n-ENST00000622300\t2112\t1.49601\t0.04861124\n-ENST00000622300\t2113\t1.63321\t0.04292154\n-ENST00000622300\t2114\t2.33281\t0.02182335\n-ENST00000622300\t2115\t2.65346\t0.01560997\n-ENST00000622300\t2116\t3.30196\t0.00750395\n-ENST00000622300\t2117\t3.39975\t0.00667232\n-ENST00000622300\t2118\t3.19936\t0.00847051\n-ENST00000622300\t2119\t2.35702\t0.02129047\n-ENST00000622300\t2120\t1.44925\t0.05068836\n-ENST00000622300\t2121\t1.98523\t0.03081446\n-ENST00000622300\t2122\t1.43825\t0.05118756\n-ENST00000622
300\t2123\t0.18456\t0.14119793\n-ENST00000622300\t2124\t-0.72567\t0.26056253\n-ENST00000622300\t2125\t-1.29136\t0.36114909\n-ENST00000622300\t2126\t-2.18548\t0.55202684\n-ENST00000622300\t2127\t-2.98199\t0.72784947\n-ENST00000622300\t2128\t-3.69606\t0.85845849\n-ENST00000622300\t2129\t-3.97524\t0.89757733\n-ENST00000622300\t2130\t-3.68742\t0.85712975\n-ENST00000622300\t2131\t-3.66125\t0.85306297\n-ENST00000622300\t2132\t-4.19446\t0.92300520\n-ENST00000622300\t2133\t-4.68997\t0.96401806\n-ENST00000622300\t2134\t-4.53507\t0.95350636\n-ENST00000622300\t2135\t-4.15817\t0.91911720\n-ENST00000622300\t2136\t-3.90888\t0.88895875\n-ENST00000622300\t2137\t-3.39831\t0.80882112\n-ENST00000622300\t2138\t-2.83610\t0.69702084\n-ENST00000622300\t2139\t-2.41116\t0.60298860\n-ENST00000622300\t2140\t-2.62798\t0.65154386\n-ENST00000622300\t2141\t-2.98976\t0.72946151\n-ENST00000622300\t2142\t-3.00556\t0.73272940\n-ENST00000622300\t2143\t-3.22937\t0.77740350\n-ENST00000622300\t2144\t-3.23289\t0.77807975\n-ENST00000622300\t2145\t-3.44534\t0.81716905\n-ENST00000622300\t2146\t-3.84926\t0.88085133\n-ENST00000622300\t2147\t-4.19945\t0.92352993\n-ENST00000622300\t2148\t-4.54425\t0.95418480\n-ENST00000622300\t2149\t-5.04029\t0.98118914\n-ENST00000622300\t2150\t-4.93965\t0.97710214\n-ENST00000622300\t2151\t-5.18390\t0.98600589\n-ENST00000622300\t2152\t-5.21604\t0.98693550\n-ENST00000622300\t2153\t-4.98767\t0.97913018\n-ENST00000622300\t2154\t-4.71730\t0.96567293\n-ENST00000622300\t2155\t-4.81406\t0.97107835\n-ENST00000622300\t2156\t-5.08142\t0.98268487\n-ENST00000622300\t2157\t-5.10147\t0.98337921\n-ENST00000622300\t2158\t-5.21966\t0.98703703\n-ENST00000622300\t2159\t-5.17477\t0.98573241\n-ENST00000622300\t2160\t-4.97602\t0.97865149\n-ENST00000622300\t2161\t-5.46703\t0.99261461\n-ENST00000622300\t2162\t-4.97108\t0.97844596\n-ENST00000622300\t2163\t-4.91174\t0.97585542\n-ENST00000622300\t2164\t-4.64458\t0.96114013\n-ENST00000622300\t2165\t-4.39322\t0.94208939\n-ENST00000622300\t2166\t-4.16857\t0.
92024436\n-ENST00000622300\t2167\t-3.70834\t0.86033506\n-ENST00000622300\t2168\t-2.93697\t0.71844671\n-ENST00000622300\t2169\t-3.00413\t0.73243419\n-ENST00000622300\t2170\t-3.24968\t0.78129322\n-ENST00000622300\t2171\t-3.39781\t0.80873140\n-ENST00000622300\t2172\t-3.38260\t0.80599253\n-ENST00000622300\t2173\t-3.32110\t0.79473185\n-ENST00000622300\t2174\t-2.49284\t0.62137094\n-ENST00000622300\t2175\t-2.24537\t0.56554438\n-ENST00000622300\t2176\t-2.43997\t0.60948112\n-ENST00000622300\t2177\t-1.75779\t0.45722130\n-ENST00000622300\t2178\t-1.38582\t0.37976317\n-ENST00000622300\t2179\t-1.73345\t0.45197653\n-ENST00000622300\t2180\t-1.20136\t0.34386706\n-ENST00000622300\t2181\t-0.56489\t0.23564104\n-ENST00000622300\t2182\t0.05943\t0.15456058\n-ENST00000622300\t2183\t0.93245\t0.07895921\n'
diff -r 215925e588c4 -r 20429f4c1b95 test-data/GraphProt_predict_profile_test_out1.peak_regions.bed
--- a/test-data/GraphProt_predict_profile_test_out1.peak_regions.bed Fri May 25 12:17:44 2018 -0400
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,41 +0,0 @@
-ENST00000620398 40 55 48;5.28355;0.00042667 0 +
-ENST00000620398 345 350 348;1.96970;0.03127996 0 +
-ENST00000620398 366 374 371;2.99552;0.01071062 0 +
-ENST00000620398 399 409 406;3.74594;0.00433045 0 +
-ENST00000620398 500 502 502;1.84967;0.03508185 0 +
-ENST00000620398 565 574 570;4.55760;0.00139889 0 +
-ENST00000620398 634 642 637;3.61150;0.00513819 0 +
-ENST00000620398 771 777 776;2.10449;0.02743062 0 +
-ENST00000550775 28 47 39;5.51074;0.00028155 0 +
-ENST00000550775 164 168 166;1.87947;0.03410347 0 +
-ENST00000550775 193 201 198;3.51905;0.00576589 0 +
-ENST00000550775 295 301 298;2.24890;0.02375878 0 +
-ENST00000550775 348 354 352;2.37461;0.02091027 0 +
-ENST00000550775 466 478 475;2.84867;0.01262256 0 +
-ENST00000550775 698 706 700;3.10360;0.00946697 0 +
-ENST00000550775 717 724 720;2.52530;0.01788293 0 +
-ENST00000550775 751 759 754;3.37203;0.00689960 0 +
-ENST00000550775 764 775 771;5.09839;0.00058872 0 +
-ENST00000550775 824 828 825;2.06144;0.02861391 0 +
-ENST00000550775 861 872 868;4.04621;0.00290997 0 +
-ENST00000550775 1032 1053 1045;5.84555;0.00014555 0 +
-ENST00000550775 1119 1124 1122;2.51525;0.01807250 0 +
-ENST00000550775 1202 1208 1207;2.70965;0.01469401 0 +
-ENST00000550775 1273 1281 1280;2.26778;0.02331108 0 +
-ENST00000550775 1351 1358 1355;3.29173;0.00759587 0 +
-ENST00000550775 1377 1378 1378;1.89003;0.03376233 0 +
-ENST00000550775 1396 1400 1398;2.26149;0.02345943 0 +
-ENST00000622300 6 14 10;3.01318;0.01049835 0 +
-ENST00000622300 15 16 16;2.09397;0.02771589 0 +
-ENST00000622300 68 79 76;4.32748;0.00196318 0 +
-ENST00000622300 113 127 119;6.64918;0.00002212 0 +
-ENST00000622300 157 162 161;2.39282;0.02052276 0 +
-ENST00000622300 195 209 202;5.46547;0.00030646 0 +
-ENST00000622300 218 228 223;3.51626;0.00578582 0 +
-ENST00000622300 311 314 313;1.82067;0.03605652 0 +
-ENST00000622300 522 528 527;2.46351;0.01907576 0 +
-ENST00000622300 1035 1038 1036;1.95475;0.03173367 0 +
-ENST00000622300 1177 1188 1183;3.26978;0.00779636 0 +
-ENST00000622300 1949 1950 1950;1.55761;0.04598351 0 +
-ENST00000622300 2111 2119 2117;3.39975;0.00667232 0 +
-ENST00000622300 2120 2121 2121;1.98523;0.03081446 0 +
diff -r 215925e588c4 -r 20429f4c1b95 test-data/GraphProt_predict_profile_test_out1.peak_regions_p50.bed
--- a/test-data/GraphProt_predict_profile_test_out1.peak_regions_p50.bed Fri May 25 12:17:44 2018 -0400
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,6 +0,0 @@
-ENST00000620398 45 48 48;5.28355;0.00042667 0 +
-ENST00000550775 37 41 39;5.51074;0.00028155 0 +
-ENST00000550775 769 771 771;5.09839;0.00058872 0 +
-ENST00000550775 1039 1049 1045;5.84555;0.00014555 0 +
-ENST00000622300 116 124 119;6.64918;0.00002212 0 +
-ENST00000622300 199 203 202;5.46547;0.00030646 0 +
diff -r 215925e588c4 -r 20429f4c1b95 test-data/GraphProt_predict_profile_test_out2.average_profile
--- a/test-data/GraphProt_predict_profile_test_out2.average_profile Fri May 25 12:17:44 2018 -0400
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
b'@@ -1,4532 +0,0 @@\n-ENST00000620398\t1\t0.88223\n-ENST00000620398\t2\t0.76226\n-ENST00000620398\t3\t0.70164\n-ENST00000620398\t4\t0.04769\n-ENST00000620398\t5\t0.03856\n-ENST00000620398\t6\t0.00464\n-ENST00000620398\t7\t0.17725\n-ENST00000620398\t8\t-0.08646\n-ENST00000620398\t9\t-0.61662\n-ENST00000620398\t10\t-1.24997\n-ENST00000620398\t11\t-1.87072\n-ENST00000620398\t12\t-2.09189\n-ENST00000620398\t13\t-2.27211\n-ENST00000620398\t14\t-2.74742\n-ENST00000620398\t15\t-2.57531\n-ENST00000620398\t16\t-2.63088\n-ENST00000620398\t17\t-2.73678\n-ENST00000620398\t18\t-3.17049\n-ENST00000620398\t19\t-3.16859\n-ENST00000620398\t20\t-2.77097\n-ENST00000620398\t21\t-2.80243\n-ENST00000620398\t22\t-2.42409\n-ENST00000620398\t23\t-2.58751\n-ENST00000620398\t24\t-2.88080\n-ENST00000620398\t25\t-2.32299\n-ENST00000620398\t26\t-1.83523\n-ENST00000620398\t27\t-1.78105\n-ENST00000620398\t28\t-1.55606\n-ENST00000620398\t29\t-0.77370\n-ENST00000620398\t30\t-1.08246\n-ENST00000620398\t31\t-1.67262\n-ENST00000620398\t32\t-1.58116\n-ENST00000620398\t33\t-1.28700\n-ENST00000620398\t34\t-0.87017\n-ENST00000620398\t35\t-0.75886\n-ENST00000620398\t36\t-0.45220\n-ENST00000620398\t37\t-0.19484\n-ENST00000620398\t38\t0.42151\n-ENST00000620398\t39\t0.86845\n-ENST00000620398\t40\t1.05424\n-ENST00000620398\t41\t1.64441\n-ENST00000620398\t42\t2.53989\n-ENST00000620398\t43\t3.74771\n-ENST00000620398\t44\t3.85191\n-ENST00000620398\t45\t4.06806\n-ENST00000620398\t46\t5.04742\n-ENST00000620398\t47\t5.27414\n-ENST00000620398\t48\t5.28355\n-ENST00000620398\t49\t4.31953\n-ENST00000620398\t50\t4.04489\n-ENST00000620398\t51\t3.73702\n-ENST00000620398\t52\t3.59050\n-ENST00000620398\t53\t3.50371\n-ENST00000620398\t54\t2.16195\n-ENST00000620398\t55\t1.52851\n-ENST00000620398\t56\t0.75477\n-ENST00000620398\t57\t0.27929\n-ENST00000620398\t58\t-0.40566\n-ENST00000620398\t59\t-1.04326\n-ENST00000620398\t60\t-0.66184\n-ENST00000620398\t61\t-1.05862\n-ENST00000620398\t62\t-1.14323\n-ENST00000620398\t63\t-1.20640
\n-ENST00000620398\t64\t-1.54031\n-ENST00000620398\t65\t-0.97978\n-ENST00000620398\t66\t-0.69353\n-ENST00000620398\t67\t-0.21147\n-ENST00000620398\t68\t-0.43692\n-ENST00000620398\t69\t-0.58806\n-ENST00000620398\t70\t-0.91130\n-ENST00000620398\t71\t-1.38461\n-ENST00000620398\t72\t-1.43470\n-ENST00000620398\t73\t-1.66021\n-ENST00000620398\t74\t-2.16178\n-ENST00000620398\t75\t-2.35417\n-ENST00000620398\t76\t-2.78170\n-ENST00000620398\t77\t-2.90396\n-ENST00000620398\t78\t-3.28006\n-ENST00000620398\t79\t-3.38188\n-ENST00000620398\t80\t-3.69356\n-ENST00000620398\t81\t-3.62730\n-ENST00000620398\t82\t-3.31309\n-ENST00000620398\t83\t-3.09453\n-ENST00000620398\t84\t-2.91889\n-ENST00000620398\t85\t-2.47843\n-ENST00000620398\t86\t-2.69840\n-ENST00000620398\t87\t-2.31198\n-ENST00000620398\t88\t-2.65231\n-ENST00000620398\t89\t-2.55476\n-ENST00000620398\t90\t-2.49715\n-ENST00000620398\t91\t-2.10495\n-ENST00000620398\t92\t-1.87496\n-ENST00000620398\t93\t-1.66216\n-ENST00000620398\t94\t-1.74058\n-ENST00000620398\t95\t-2.31594\n-ENST00000620398\t96\t-2.15655\n-ENST00000620398\t97\t-1.36825\n-ENST00000620398\t98\t-1.68358\n-ENST00000620398\t99\t-1.33847\n-ENST00000620398\t100\t-1.06634\n-ENST00000620398\t101\t-1.01138\n-ENST00000620398\t102\t-1.41211\n-ENST00000620398\t103\t-1.57895\n-ENST00000620398\t104\t-1.98716\n-ENST00000620398\t105\t-2.00906\n-ENST00000620398\t106\t-1.49886\n-ENST00000620398\t107\t-1.87947\n-ENST00000620398\t108\t-1.83210\n-ENST00000620398\t109\t-1.33697\n-ENST00000620398\t110\t-1.16768\n-ENST00000620398\t111\t-1.02615\n-ENST00000620398\t112\t-0.56497\n-ENST00000620398\t113\t0.05008\n-ENST00000620398\t114\t-0.17721\n-ENST00000620398\t115\t-0.22401\n-ENST00000620398\t116\t-0.09074\n-ENST00000620398\t117\t-0.30371\n-ENST00000620398\t118\t-0.16716\n-ENST00000620398\t119\t-0.46805\n-ENST00000620398\t120\t-0.43787\n-ENST00000620398\t121\t-0.52024\n-ENST00000620398\t122\t-0.85052\n-ENST00000620398\t123\t-1.37447\n-ENST00000620398\t124\t-1.50466\n-ENST00000620398\t125\
t-0.99556\n-ENST00000620398\t126\t-1.12375\n-ENST00000620398\t127\t-1.52044\n-ENST00000620398\t128\t-1.26301\n-ENST00000620398\t129\t-1.02761\n-ENST00000620398\t130\t-0.99542\n-ENST00000620398\t131\t-1.22844\n-ENST00000620398\t132\t-1.24777\n-ENST00000620398\t133\t-1.03441\n-ENST00000620398\t134\t-0.99168\n-ENST00000620398\t135\t-1.03185\n-ENST00000620398\t136\t-0.76297\n-ENST00000620398\t137\t-0.18651\n-ENST0'..b'00622300\t2054\t-0.24500\n-ENST00000622300\t2055\t-0.18461\n-ENST00000622300\t2056\t-0.09015\n-ENST00000622300\t2057\t-0.26557\n-ENST00000622300\t2058\t-0.79349\n-ENST00000622300\t2059\t-0.87373\n-ENST00000622300\t2060\t-0.68443\n-ENST00000622300\t2061\t-1.03542\n-ENST00000622300\t2062\t-1.39051\n-ENST00000622300\t2063\t-1.89599\n-ENST00000622300\t2064\t-2.08290\n-ENST00000622300\t2065\t-2.44637\n-ENST00000622300\t2066\t-2.69409\n-ENST00000622300\t2067\t-3.13123\n-ENST00000622300\t2068\t-3.04258\n-ENST00000622300\t2069\t-2.67950\n-ENST00000622300\t2070\t-2.93903\n-ENST00000622300\t2071\t-2.91323\n-ENST00000622300\t2072\t-2.53409\n-ENST00000622300\t2073\t-2.62860\n-ENST00000622300\t2074\t-2.69925\n-ENST00000622300\t2075\t-2.54953\n-ENST00000622300\t2076\t-2.23005\n-ENST00000622300\t2077\t-2.51097\n-ENST00000622300\t2078\t-2.43155\n-ENST00000622300\t2079\t-2.53753\n-ENST00000622300\t2080\t-2.92249\n-ENST00000622300\t2081\t-2.65876\n-ENST00000622300\t2082\t-2.73476\n-ENST00000622300\t2083\t-2.86314\n-ENST00000622300\t2084\t-2.78574\n-ENST00000622300\t2085\t-2.51670\n-ENST00000622300\t2086\t-2.76286\n-ENST00000622300\t2087\t-3.05269\n-ENST00000622300\t2088\t-2.71949\n-ENST00000622300\t2089\t-2.81494\n-ENST00000622300\t2090\t-2.87752\n-ENST00000622300\t2091\t-2.56178\n-ENST00000622300\t2092\t-2.84411\n-ENST00000622300\t2093\t-3.01108\n-ENST00000622300\t2094\t-3.03936\n-ENST00000622300\t2095\t-2.59397\n-ENST00000622300\t2096\t-2.34811\n-ENST00000622300\t2097\t-1.77121\n-ENST00000622300\t2098\t-0.99290\n-ENST00000622300\t2099\t-0.39795\n-ENST00000622300\t2100\t-0
.43876\n-ENST00000622300\t2101\t-0.10996\n-ENST00000622300\t2102\t0.31793\n-ENST00000622300\t2103\t0.39982\n-ENST00000622300\t2104\t0.75355\n-ENST00000622300\t2105\t0.52498\n-ENST00000622300\t2106\t0.36439\n-ENST00000622300\t2107\t0.40343\n-ENST00000622300\t2108\t0.42899\n-ENST00000622300\t2109\t0.41707\n-ENST00000622300\t2110\t-0.26861\n-ENST00000622300\t2111\t0.67260\n-ENST00000622300\t2112\t1.49601\n-ENST00000622300\t2113\t1.63321\n-ENST00000622300\t2114\t2.33281\n-ENST00000622300\t2115\t2.65346\n-ENST00000622300\t2116\t3.30196\n-ENST00000622300\t2117\t3.39975\n-ENST00000622300\t2118\t3.19936\n-ENST00000622300\t2119\t2.35702\n-ENST00000622300\t2120\t1.44925\n-ENST00000622300\t2121\t1.98523\n-ENST00000622300\t2122\t1.43825\n-ENST00000622300\t2123\t0.18456\n-ENST00000622300\t2124\t-0.72567\n-ENST00000622300\t2125\t-1.29136\n-ENST00000622300\t2126\t-2.18548\n-ENST00000622300\t2127\t-2.98199\n-ENST00000622300\t2128\t-3.69606\n-ENST00000622300\t2129\t-3.97524\n-ENST00000622300\t2130\t-3.68742\n-ENST00000622300\t2131\t-3.66125\n-ENST00000622300\t2132\t-4.19446\n-ENST00000622300\t2133\t-4.68997\n-ENST00000622300\t2134\t-4.53507\n-ENST00000622300\t2135\t-4.15817\n-ENST00000622300\t2136\t-3.90888\n-ENST00000622300\t2137\t-3.39831\n-ENST00000622300\t2138\t-2.83610\n-ENST00000622300\t2139\t-2.41116\n-ENST00000622300\t2140\t-2.62798\n-ENST00000622300\t2141\t-2.98976\n-ENST00000622300\t2142\t-3.00556\n-ENST00000622300\t2143\t-3.22937\n-ENST00000622300\t2144\t-3.23289\n-ENST00000622300\t2145\t-3.44534\n-ENST00000622300\t2146\t-3.84926\n-ENST00000622300\t2147\t-4.19945\n-ENST00000622300\t2148\t-4.54425\n-ENST00000622300\t2149\t-5.04029\n-ENST00000622300\t2150\t-4.93965\n-ENST00000622300\t2151\t-5.18390\n-ENST00000622300\t2152\t-5.21604\n-ENST00000622300\t2153\t-4.98767\n-ENST00000622300\t2154\t-4.71730\n-ENST00000622300\t2155\t-4.81406\n-ENST00000622300\t2156\t-5.08142\n-ENST00000622300\t2157\t-5.10147\n-ENST00000622300\t2158\t-5.21966\n-ENST00000622300\t2159\t-5.17477\n-ENST00
000622300\t2160\t-4.97602\n-ENST00000622300\t2161\t-5.46703\n-ENST00000622300\t2162\t-4.97108\n-ENST00000622300\t2163\t-4.91174\n-ENST00000622300\t2164\t-4.64458\n-ENST00000622300\t2165\t-4.39322\n-ENST00000622300\t2166\t-4.16857\n-ENST00000622300\t2167\t-3.70834\n-ENST00000622300\t2168\t-2.93697\n-ENST00000622300\t2169\t-3.00413\n-ENST00000622300\t2170\t-3.24968\n-ENST00000622300\t2171\t-3.39781\n-ENST00000622300\t2172\t-3.38260\n-ENST00000622300\t2173\t-3.32110\n-ENST00000622300\t2174\t-2.49284\n-ENST00000622300\t2175\t-2.24537\n-ENST00000622300\t2176\t-2.43997\n-ENST00000622300\t2177\t-1.75779\n-ENST00000622300\t2178\t-1.38582\n-ENST00000622300\t2179\t-1.73345\n-ENST00000622300\t2180\t-1.20136\n-ENST00000622300\t2181\t-0.56489\n-ENST00000622300\t2182\t0.05943\n-ENST00000622300\t2183\t0.93245\n'
diff -r 215925e588c4 -r 20429f4c1b95 test-data/GraphProt_predict_profile_test_out2.peak_regions.bed
--- a/test-data/GraphProt_predict_profile_test_out2.peak_regions.bed Fri May 25 12:17:44 2018 -0400
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,46 +0,0 @@
-ENST00000620398 39 55 48;5.28355 0 +
-ENST00000620398 199 201 200;1.16279 0 +
-ENST00000620398 343 351 348;1.96970 0 +
-ENST00000620398 365 375 371;2.99552 0 +
-ENST00000620398 399 410 406;3.74594 0 +
-ENST00000620398 497 503 502;1.84967 0 +
-ENST00000620398 565 575 570;4.55760 0 +
-ENST00000620398 633 643 637;3.61150 0 +
-ENST00000620398 713 714 714;1.06658 0 +
-ENST00000620398 770 778 776;2.10449 0 +
-ENST00000550775 27 49 39;5.51074 0 +
-ENST00000550775 163 169 166;1.87947 0 +
-ENST00000550775 192 201 198;3.51905 0 +
-ENST00000550775 294 301 298;2.24890 0 +
-ENST00000550775 347 354 352;2.37461 0 +
-ENST00000550775 449 452 451;1.36497 0 +
-ENST00000550775 466 478 475;2.84867 0 +
-ENST00000550775 697 706 700;3.10360 0 +
-ENST00000550775 716 725 720;2.52530 0 +
-ENST00000550775 751 776 771;5.09839 0 +
-ENST00000550775 823 828 825;2.06144 0 +
-ENST00000550775 860 873 868;4.04621 0 +
-ENST00000550775 972 977 975;1.45895 0 +
-ENST00000550775 1030 1053 1045;5.84555 0 +
-ENST00000550775 1118 1125 1122;2.51525 0 +
-ENST00000550775 1190 1191 1191;1.07250 0 +
-ENST00000550775 1201 1209 1207;2.70965 0 +
-ENST00000550775 1251 1252 1252;1.00154 0 +
-ENST00000550775 1273 1281 1280;2.26778 0 +
-ENST00000550775 1351 1358 1355;3.29173 0 +
-ENST00000550775 1376 1379 1378;1.89003 0 +
-ENST00000550775 1395 1401 1398;2.26149 0 +
-ENST00000550775 1422 1423 1423;1.15116 0 +
-ENST00000622300 3 24 10;3.01318 0 +
-ENST00000622300 68 80 76;4.32748 0 +
-ENST00000622300 112 128 119;6.64918 0 +
-ENST00000622300 155 163 161;2.39282 0 +
-ENST00000622300 194 210 202;5.46547 0 +
-ENST00000622300 217 229 223;3.51626 0 +
-ENST00000622300 310 314 313;1.82067 0 +
-ENST00000622300 521 528 527;2.46351 0 +
-ENST00000622300 1032 1038 1036;1.95475 0 +
-ENST00000622300 1176 1189 1183;3.26978 0 +
-ENST00000622300 1948 1951 1950;1.55761 0 +
-ENST00000622300 1953 1954 1954;1.14118 0 +
-ENST00000622300 2111 2122 2117;3.39975 0 +
diff -r 215925e588c4 -r 20429f4c1b95 test-data/GraphProt_predict_profile_test_out3.average_profile
--- a/test-data/GraphProt_predict_profile_test_out3.average_profile Fri May 25 12:17:44 2018 -0400
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
b'@@ -1,4532 +0,0 @@\n-ENST00000620398\t1\t2.56516\n-ENST00000620398\t2\t1.47085\n-ENST00000620398\t3\t0.38619\n-ENST00000620398\t4\t0.49097\n-ENST00000620398\t5\t-0.42994\n-ENST00000620398\t6\t-0.99409\n-ENST00000620398\t7\t-1.99338\n-ENST00000620398\t8\t-2.22462\n-ENST00000620398\t9\t-2.38102\n-ENST00000620398\t10\t-2.92963\n-ENST00000620398\t11\t-3.03152\n-ENST00000620398\t12\t-3.11306\n-ENST00000620398\t13\t-3.47216\n-ENST00000620398\t14\t-2.73835\n-ENST00000620398\t15\t-3.74191\n-ENST00000620398\t16\t-2.75715\n-ENST00000620398\t17\t-1.32424\n-ENST00000620398\t18\t0.04953\n-ENST00000620398\t19\t0.07717\n-ENST00000620398\t20\t0.84086\n-ENST00000620398\t21\t1.31089\n-ENST00000620398\t22\t1.30952\n-ENST00000620398\t23\t0.74493\n-ENST00000620398\t24\t1.86706\n-ENST00000620398\t25\t1.80127\n-ENST00000620398\t26\t3.01397\n-ENST00000620398\t27\t2.93897\n-ENST00000620398\t28\t2.27538\n-ENST00000620398\t29\t1.50969\n-ENST00000620398\t30\t1.39582\n-ENST00000620398\t31\t0.45618\n-ENST00000620398\t32\t-0.65174\n-ENST00000620398\t33\t0.02657\n-ENST00000620398\t34\t0.92490\n-ENST00000620398\t35\t-0.19374\n-ENST00000620398\t36\t0.30699\n-ENST00000620398\t37\t0.02892\n-ENST00000620398\t38\t-0.12226\n-ENST00000620398\t39\t-0.28373\n-ENST00000620398\t40\t0.03676\n-ENST00000620398\t41\t-0.30100\n-ENST00000620398\t42\t-0.01903\n-ENST00000620398\t43\t0.62274\n-ENST00000620398\t44\t0.32629\n-ENST00000620398\t45\t1.05229\n-ENST00000620398\t46\t1.79972\n-ENST00000620398\t47\t1.87942\n-ENST00000620398\t48\t1.55871\n-ENST00000620398\t49\t0.62813\n-ENST00000620398\t50\t0.47475\n-ENST00000620398\t51\t0.14349\n-ENST00000620398\t52\t0.08271\n-ENST00000620398\t53\t0.56814\n-ENST00000620398\t54\t0.70684\n-ENST00000620398\t55\t-0.37982\n-ENST00000620398\t56\t-0.67582\n-ENST00000620398\t57\t0.07603\n-ENST00000620398\t58\t-0.32582\n-ENST00000620398\t59\t-0.71153\n-ENST00000620398\t60\t-0.12967\n-ENST00000620398\t61\t-0.87651\n-ENST00000620398\t62\t-0.53489\n-ENST00000620398\t63\t-1.19587\n-ENST00
000620398\t64\t-2.08688\n-ENST00000620398\t65\t-3.08311\n-ENST00000620398\t66\t-2.72471\n-ENST00000620398\t67\t-3.63876\n-ENST00000620398\t68\t-4.10674\n-ENST00000620398\t69\t-3.77128\n-ENST00000620398\t70\t-2.73552\n-ENST00000620398\t71\t-2.27111\n-ENST00000620398\t72\t-2.17613\n-ENST00000620398\t73\t-3.25148\n-ENST00000620398\t74\t-2.67752\n-ENST00000620398\t75\t-3.58821\n-ENST00000620398\t76\t-2.69258\n-ENST00000620398\t77\t-1.51679\n-ENST00000620398\t78\t-0.25095\n-ENST00000620398\t79\t0.32794\n-ENST00000620398\t80\t0.25935\n-ENST00000620398\t81\t0.05256\n-ENST00000620398\t82\t-0.72131\n-ENST00000620398\t83\t-0.69812\n-ENST00000620398\t84\t-0.43956\n-ENST00000620398\t85\t-1.14537\n-ENST00000620398\t86\t0.31315\n-ENST00000620398\t87\t-1.19156\n-ENST00000620398\t88\t-1.68829\n-ENST00000620398\t89\t-2.23242\n-ENST00000620398\t90\t-3.67432\n-ENST00000620398\t91\t-4.74495\n-ENST00000620398\t92\t-5.71206\n-ENST00000620398\t93\t-5.12539\n-ENST00000620398\t94\t-3.61975\n-ENST00000620398\t95\t-2.18159\n-ENST00000620398\t96\t-0.52548\n-ENST00000620398\t97\t-1.01779\n-ENST00000620398\t98\t-0.47946\n-ENST00000620398\t99\t-0.68045\n-ENST00000620398\t100\t-0.50964\n-ENST00000620398\t101\t0.63404\n-ENST00000620398\t102\t1.26261\n-ENST00000620398\t103\t2.04343\n-ENST00000620398\t104\t0.98933\n-ENST00000620398\t105\t0.52928\n-ENST00000620398\t106\t0.19569\n-ENST00000620398\t107\t-1.38590\n-ENST00000620398\t108\t-1.33265\n-ENST00000620398\t109\t-0.30212\n-ENST00000620398\t110\t-0.13749\n-ENST00000620398\t111\t0.14231\n-ENST00000620398\t112\t-0.27605\n-ENST00000620398\t113\t0.46538\n-ENST00000620398\t114\t0.65539\n-ENST00000620398\t115\t0.95242\n-ENST00000620398\t116\t0.17844\n-ENST00000620398\t117\t-1.34910\n-ENST00000620398\t118\t-0.38371\n-ENST00000620398\t119\t0.15339\n-ENST00000620398\t120\t-0.12188\n-ENST00000620398\t121\t-0.83684\n-ENST00000620398\t122\t-2.12156\n-ENST00000620398\t123\t-2.00991\n-ENST00000620398\t124\t-2.53150\n-ENST00000620398\t125\t-2.35649\n-ENST00000620
398\t126\t-1.38437\n-ENST00000620398\t127\t-1.75027\n-ENST00000620398\t128\t-0.43524\n-ENST00000620398\t129\t-0.49254\n-ENST00000620398\t130\t-0.48982\n-ENST00000620398\t131\t-0.82314\n-ENST00000620398\t132\t-0.88437\n-ENST00000620398\t133\t-0.06217\n-ENST00000620398\t134\t-0.98988\n-ENST00000620398\t135\t-2.00321\n-ENST00000620398\t136\t-2.58189\n-ENST00000620398\t137\t-2.44256\n-ENST00000620398\t138\t-1.31558\n'..b'7\n-ENST00000622300\t2053\t-1.69738\n-ENST00000622300\t2054\t-1.21575\n-ENST00000622300\t2055\t-0.27639\n-ENST00000622300\t2056\t-0.34647\n-ENST00000622300\t2057\t-0.67862\n-ENST00000622300\t2058\t-0.32993\n-ENST00000622300\t2059\t-0.21508\n-ENST00000622300\t2060\t-0.00400\n-ENST00000622300\t2061\t0.47251\n-ENST00000622300\t2062\t0.90543\n-ENST00000622300\t2063\t1.37224\n-ENST00000622300\t2064\t0.61163\n-ENST00000622300\t2065\t0.33938\n-ENST00000622300\t2066\t-1.15698\n-ENST00000622300\t2067\t-0.80739\n-ENST00000622300\t2068\t-0.89228\n-ENST00000622300\t2069\t-0.08761\n-ENST00000622300\t2070\t1.10467\n-ENST00000622300\t2071\t2.31444\n-ENST00000622300\t2072\t1.96929\n-ENST00000622300\t2073\t0.45519\n-ENST00000622300\t2074\t0.12675\n-ENST00000622300\t2075\t0.90297\n-ENST00000622300\t2076\t1.61292\n-ENST00000622300\t2077\t2.90204\n-ENST00000622300\t2078\t1.71803\n-ENST00000622300\t2079\t1.33055\n-ENST00000622300\t2080\t0.44781\n-ENST00000622300\t2081\t-0.46316\n-ENST00000622300\t2082\t-1.32687\n-ENST00000622300\t2083\t-1.92382\n-ENST00000622300\t2084\t-1.94670\n-ENST00000622300\t2085\t-1.75430\n-ENST00000622300\t2086\t-2.27367\n-ENST00000622300\t2087\t-3.07146\n-ENST00000622300\t2088\t-3.41189\n-ENST00000622300\t2089\t-2.68607\n-ENST00000622300\t2090\t-3.42752\n-ENST00000622300\t2091\t-2.78351\n-ENST00000622300\t2092\t-2.57115\n-ENST00000622300\t2093\t-2.75022\n-ENST00000622300\t2094\t-3.15666\n-ENST00000622300\t2095\t-2.33554\n-ENST00000622300\t2096\t-2.08184\n-ENST00000622300\t2097\t-1.75798\n-ENST00000622300\t2098\t-1.37435\n-ENST00000622300\t2099\t-0.0
2806\n-ENST00000622300\t2100\t0.67528\n-ENST00000622300\t2101\t2.10009\n-ENST00000622300\t2102\t2.37966\n-ENST00000622300\t2103\t2.89520\n-ENST00000622300\t2104\t3.43958\n-ENST00000622300\t2105\t3.27959\n-ENST00000622300\t2106\t3.69399\n-ENST00000622300\t2107\t3.50467\n-ENST00000622300\t2108\t3.31908\n-ENST00000622300\t2109\t2.90868\n-ENST00000622300\t2110\t1.20261\n-ENST00000622300\t2111\t0.28391\n-ENST00000622300\t2112\t0.26187\n-ENST00000622300\t2113\t-0.75628\n-ENST00000622300\t2114\t-1.61381\n-ENST00000622300\t2115\t-1.76332\n-ENST00000622300\t2116\t-1.08130\n-ENST00000622300\t2117\t-1.82573\n-ENST00000622300\t2118\t-2.49651\n-ENST00000622300\t2119\t-1.83851\n-ENST00000622300\t2120\t-1.19241\n-ENST00000622300\t2121\t0.50961\n-ENST00000622300\t2122\t0.50387\n-ENST00000622300\t2123\t-0.61322\n-ENST00000622300\t2124\t-0.85289\n-ENST00000622300\t2125\t-1.78512\n-ENST00000622300\t2126\t-1.59370\n-ENST00000622300\t2127\t-1.15550\n-ENST00000622300\t2128\t-0.89010\n-ENST00000622300\t2129\t-0.83522\n-ENST00000622300\t2130\t-1.86188\n-ENST00000622300\t2131\t-2.27601\n-ENST00000622300\t2132\t-4.20973\n-ENST00000622300\t2133\t-3.75082\n-ENST00000622300\t2134\t-3.26125\n-ENST00000622300\t2135\t-3.08040\n-ENST00000622300\t2136\t-2.23328\n-ENST00000622300\t2137\t-2.55391\n-ENST00000622300\t2138\t-3.22127\n-ENST00000622300\t2139\t-3.67459\n-ENST00000622300\t2140\t-4.65018\n-ENST00000622300\t2141\t-3.96539\n-ENST00000622300\t2142\t-3.85773\n-ENST00000622300\t2143\t-2.65299\n-ENST00000622300\t2144\t-2.89788\n-ENST00000622300\t2145\t-2.78248\n-ENST00000622300\t2146\t-2.51466\n-ENST00000622300\t2147\t-2.41492\n-ENST00000622300\t2148\t-2.01146\n-ENST00000622300\t2149\t-1.23700\n-ENST00000622300\t2150\t-0.43137\n-ENST00000622300\t2151\t0.95085\n-ENST00000622300\t2152\t1.04076\n-ENST00000622300\t2153\t1.08142\n-ENST00000622300\t2154\t1.06088\n-ENST00000622300\t2155\t1.26009\n-ENST00000622300\t2156\t1.41933\n-ENST00000622300\t2157\t1.48770\n-ENST00000622300\t2158\t1.58522\n-ENST000006
22300\t2159\t1.47276\n-ENST00000622300\t2160\t1.23520\n-ENST00000622300\t2161\t1.06708\n-ENST00000622300\t2162\t1.07770\n-ENST00000622300\t2163\t0.99188\n-ENST00000622300\t2164\t0.99234\n-ENST00000622300\t2165\t0.95904\n-ENST00000622300\t2166\t0.69535\n-ENST00000622300\t2167\t0.72062\n-ENST00000622300\t2168\t0.83034\n-ENST00000622300\t2169\t1.11241\n-ENST00000622300\t2170\t1.19252\n-ENST00000622300\t2171\t0.97175\n-ENST00000622300\t2172\t0.93098\n-ENST00000622300\t2173\t0.81422\n-ENST00000622300\t2174\t0.94765\n-ENST00000622300\t2175\t0.92670\n-ENST00000622300\t2176\t0.76345\n-ENST00000622300\t2177\t0.88368\n-ENST00000622300\t2178\t0.75804\n-ENST00000622300\t2179\t0.54148\n-ENST00000622300\t2180\t0.31293\n-ENST00000622300\t2181\t0.26211\n-ENST00000622300\t2182\t0.66059\n-ENST00000622300\t2183\t0.61648\n'
diff -r 215925e588c4 -r 20429f4c1b95 test-data/GraphProt_predict_profile_test_out3.peak_regions.bed
--- a/test-data/GraphProt_predict_profile_test_out3.peak_regions.bed Fri May 25 12:17:44 2018 -0400
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,126 +0,0 @@
-ENST00000620398 0 1 1;2.56516 0 +
-ENST00000620398 25 28 26;3.01397 0 +
-ENST00000620398 102 103 103;2.04343 0 +
-ENST00000620398 146 147 147;2.05340 0 +
-ENST00000620398 150 152 152;2.77163 0 +
-ENST00000620398 271 283 277;5.06419 0 +
-ENST00000620398 337 349 348;4.33434 0 +
-ENST00000620398 350 351 351;2.16441 0 +
-ENST00000620398 367 369 369;3.08158 0 +
-ENST00000620398 387 389 389;2.57464 0 +
-ENST00000620398 390 405 393;3.80325 0 +
-ENST00000620398 409 419 417;4.43283 0 +
-ENST00000620398 521 532 527;5.07885 0 +
-ENST00000620398 568 569 569;2.40792 0 +
-ENST00000620398 601 602 602;2.08678 0 +
-ENST00000620398 604 609 607;2.78010 0 +
-ENST00000620398 617 618 618;2.44909 0 +
-ENST00000620398 621 625 623;3.67567 0 +
-ENST00000620398 696 714 711;4.85499 0 +
-ENST00000620398 725 730 729;3.02635 0 +
-ENST00000620398 828 831 830;3.16417 0 +
-ENST00000620398 889 890 890;2.00231 0 +
-ENST00000620398 893 895 894;2.34843 0 +
-ENST00000550775 3 13 7;4.26114 0 +
-ENST00000550775 64 65 65;2.25937 0 +
-ENST00000550775 118 120 120;2.35058 0 +
-ENST00000550775 122 123 123;2.26492 0 +
-ENST00000550775 138 146 142;3.67485 0 +
-ENST00000550775 169 170 170;2.04263 0 +
-ENST00000550775 188 202 195;4.50428 0 +
-ENST00000550775 212 217 216;3.29288 0 +
-ENST00000550775 236 237 237;2.05203 0 +
-ENST00000550775 240 241 241;2.10220 0 +
-ENST00000550775 248 255 251;3.09673 0 +
-ENST00000550775 257 258 258;2.02293 0 +
-ENST00000550775 303 310 305;3.45804 0 +
-ENST00000550775 473 477 476;4.02566 0 +
-ENST00000550775 495 508 500;5.12153 0 +
-ENST00000550775 533 535 535;2.88714 0 +
-ENST00000550775 627 628 628;2.24319 0 +
-ENST00000550775 630 633 632;3.32068 0 +
-ENST00000550775 652 663 656;4.76018 0 +
-ENST00000550775 694 708 703;5.56137 0 +
-ENST00000550775 709 710 710;2.32723 0 +
-ENST00000550775 723 724 724;2.62097 0 +
-ENST00000550775 728 732 731;2.51311 0 +
-ENST00000550775 748 773 764;7.26597 0 +
-ENST00000550775 798 812 806;5.00685 0 +
-ENST00000550775 820 837 830;6.27673 0 +
-ENST00000550775 860 862 862;2.70617 0 +
-ENST00000550775 913 918 915;2.74364 0 +
-ENST00000550775 944 957 951;5.50073 0 +
-ENST00000550775 968 982 975;5.17041 0 +
-ENST00000550775 998 1007 1003;6.13518 0 +
-ENST00000550775 1015 1016 1016;2.15212 0 +
-ENST00000550775 1017 1026 1021;3.01020 0 +
-ENST00000550775 1028 1032 1031;3.89497 0 +
-ENST00000550775 1033 1051 1048;4.40980 0 +
-ENST00000550775 1083 1093 1086;5.75158 0 +
-ENST00000550775 1102 1113 1108;5.27592 0 +
-ENST00000550775 1140 1142 1141;2.71049 0 +
-ENST00000550775 1143 1151 1148;3.61343 0 +
-ENST00000550775 1169 1170 1170;2.13793 0 +
-ENST00000550775 1171 1183 1179;4.34923 0 +
-ENST00000550775 1186 1192 1187;2.53224 0 +
-ENST00000550775 1195 1202 1199;2.98945 0 +
-ENST00000550775 1225 1227 1226;2.16242 0 +
-ENST00000550775 1247 1248 1248;2.13326 0 +
-ENST00000550775 1249 1251 1250;3.11820 0 +
-ENST00000550775 1265 1277 1273;4.77230 0 +
-ENST00000550775 1278 1282 1279;2.30700 0 +
-ENST00000550775 1376 1388 1383;4.92928 0 +
-ENST00000550775 1395 1404 1399;3.93408 0 +
-ENST00000550775 1422 1431 1429;3.41741 0 +
-ENST00000622300 39 43 42;2.68360 0 +
-ENST00000622300 45 49 48;2.35632 0 +
-ENST00000622300 84 85 85;2.20936 0 +
-ENST00000622300 86 89 89;2.52062 0 +
-ENST00000622300 93 95 95;2.66642 0 +
-ENST00000622300 112 115 114;2.92246 0 +
-ENST00000622300 176 177 177;3.04359 0 +
-ENST00000622300 178 179 179;2.07534 0 +
-ENST00000622300 200 201 201;2.06920 0 +
-ENST00000622300 202 203 203;2.29769 0 +
-ENST00000622300 217 223 221;3.36809 0 +
-ENST00000622300 224 228 226;3.33227 0 +
-ENST00000622300 275 276 276;2.22853 0 +
-ENST00000622300 293 294 294;2.58778 0 +
-ENST00000622300 295 296 296;2.00697 0 +
-ENST00000622300 317 318 318;2.61031 0 +
-ENST00000622300 543 544 544;2.41285 0 +
-ENST00000622300 548 549 549;2.88195 0 +
-ENST00000622300 666 671 670;3.45755 0 +
-ENST00000622300 684 692 688;3.37274 0 +
-ENST00000622300 717 728 723;7.14369 0 +
-ENST00000622300 730 731 731;2.12353 0 +
-ENST00000622300 732 734 733;2.13951 0 +
-ENST00000622300 747 750 748;2.53218 0 +
-ENST00000622300 787 789 788;2.45550 0 +
-ENST00000622300 795 806 804;4.57226 0 +
-ENST00000622300 869 890 886;5.11897 0 +
-ENST00000622300 900 901 901;2.50531 0 +
-ENST00000622300 904 917 914;4.45438 0 +
-ENST00000622300 996 998 998;2.61714 0 +
-ENST00000622300 1000 1001 1001;2.12846 0 +
-ENST00000622300 1002 1008 1006;3.73726 0 +
-ENST00000622300 1035 1036 1036;2.00132 0 +
-ENST00000622300 1037 1062 1060;5.09661 0 +
-ENST00000622300 1074 1075 1075;2.85025 0 +
-ENST00000622300 1111 1122 1117;5.89963 0 +
-ENST00000622300 1135 1146 1139;5.01713 0 +
-ENST00000622300 1183 1191 1188;3.53619 0 +
-ENST00000622300 1202 1223 1219;6.99392 0 +
-ENST00000622300 1224 1226 1226;2.27412 0 +
-ENST00000622300 1249 1251 1251;2.69197 0 +
-ENST00000622300 1378 1380 1379;2.65475 0 +
-ENST00000622300 1629 1630 1630;2.29608 0 +
-ENST00000622300 1644 1645 1645;2.06150 0 +
-ENST00000622300 1658 1659 1659;2.72726 0 +
-ENST00000622300 1712 1721 1717;3.57623 0 +
-ENST00000622300 1728 1729 1729;2.07451 0 +
-ENST00000622300 1741 1742 1742;2.08846 0 +
-ENST00000622300 1758 1765 1763;4.01232 0 +
-ENST00000622300 2070 2071 2071;2.31444 0 +
-ENST00000622300 2076 2077 2077;2.90204 0 +
-ENST00000622300 2100 2109 2106;3.69399 0 +
diff -r 215925e588c4 -r 20429f4c1b95 test-data/GraphProt_predict_profile_test_out4.average_profile
--- a/test-data/GraphProt_predict_profile_test_out4.average_profile Fri May 25 12:17:44 2018 -0400
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
b'@@ -1,4532 +0,0 @@\n-ENST00000620398\t1\t-0.76858\t0.59735533\n-ENST00000620398\t2\t-0.03468\t0.47807645\n-ENST00000620398\t3\t0.04965\t0.46435201\n-ENST00000620398\t4\t-0.14617\t0.49627273\n-ENST00000620398\t5\t-0.25772\t0.51450557\n-ENST00000620398\t6\t-0.18232\t0.50218021\n-ENST00000620398\t7\t0.10892\t0.45473513\n-ENST00000620398\t8\t0.38469\t0.41046250\n-ENST00000620398\t9\t0.19660\t0.44056482\n-ENST00000620398\t10\t-0.09320\t0.48762201\n-ENST00000620398\t11\t-0.02488\t0.47647943\n-ENST00000620398\t12\t0.05823\t0.46295820\n-ENST00000620398\t13\t-0.16295\t0.49901457\n-ENST00000620398\t14\t-0.21293\t0.50718368\n-ENST00000620398\t15\t0.18155\t0.44299176\n-ENST00000620398\t16\t0.65843\t0.36763803\n-ENST00000620398\t17\t0.88382\t0.33352930\n-ENST00000620398\t18\t0.66959\t0.36592239\n-ENST00000620398\t19\t0.09734\t0.45661186\n-ENST00000620398\t20\t0.58534\t0.37893718\n-ENST00000620398\t21\t0.68692\t0.36326343\n-ENST00000620398\t22\t0.49868\t0.39246740\n-ENST00000620398\t23\t0.58223\t0.37942032\n-ENST00000620398\t24\t0.19797\t0.44034402\n-ENST00000620398\t25\t-0.05177\t0.48086257\n-ENST00000620398\t26\t-0.06202\t0.48253424\n-ENST00000620398\t27\t-0.21926\t0.50821845\n-ENST00000620398\t28\t-0.44244\t0.54466219\n-ENST00000620398\t29\t-0.51934\t0.55717134\n-ENST00000620398\t30\t-0.16263\t0.49896228\n-ENST00000620398\t31\t-0.58048\t0.56708627\n-ENST00000620398\t32\t-0.24010\t0.51162524\n-ENST00000620398\t33\t0.42518\t0.40404651\n-ENST00000620398\t34\t-0.01615\t0.47505721\n-ENST00000620398\t35\t0.54511\t0.38520105\n-ENST00000620398\t36\t1.13737\t0.29669349\n-ENST00000620398\t37\t0.60369\t0.37609038\n-ENST00000620398\t38\t0.22487\t0.43601261\n-ENST00000620398\t39\t-0.09225\t0.48746694\n-ENST00000620398\t40\t0.44833\t0.40038981\n-ENST00000620398\t41\t0.42720\t0.40372710\n-ENST00000620398\t42\t0.62121\t0.37337854\n-ENST00000620398\t43\t0.33090\t0.41902372\n-ENST00000620398\t44\t-0.03877\t0.47874310\n-ENST00000620398\t45\t0.06941\t0.46114284\n-ENST00000620398\t46\t-0.48474\t
0.55154774\n-ENST00000620398\t47\t-0.75991\t0.59596951\n-ENST00000620398\t48\t-0.57806\t0.56669442\n-ENST00000620398\t49\t-0.24653\t0.51267636\n-ENST00000620398\t50\t-0.21702\t0.50785228\n-ENST00000620398\t51\t-0.88653\t0.61609977\n-ENST00000620398\t52\t-1.22847\t0.66900042\n-ENST00000620398\t53\t-0.83630\t0.60814313\n-ENST00000620398\t54\t-1.19545\t0.66400228\n-ENST00000620398\t55\t-0.80753\t0.60356822\n-ENST00000620398\t56\t-0.84350\t0.60928609\n-ENST00000620398\t57\t-0.27054\t0.51660112\n-ENST00000620398\t58\t-0.45957\t0.54745187\n-ENST00000620398\t59\t0.05163\t0.46403032\n-ENST00000620398\t60\t0.01026\t0.47075727\n-ENST00000620398\t61\t0.56852\t0.38155235\n-ENST00000620398\t62\t1.43453\t0.25593506\n-ENST00000620398\t63\t2.19115\t0.16596379\n-ENST00000620398\t64\t1.85583\t0.20324108\n-ENST00000620398\t65\t2.56079\t0.12988739\n-ENST00000620398\t66\t2.66895\t0.12033421\n-ENST00000620398\t67\t2.95360\t0.09734005\n-ENST00000620398\t68\t2.95875\t0.09695236\n-ENST00000620398\t69\t3.66654\t0.05277897\n-ENST00000620398\t70\t3.37299\t0.06896249\n-ENST00000620398\t71\t3.21617\t0.07882724\n-ENST00000620398\t72\t3.36483\t0.06945437\n-ENST00000620398\t73\t3.03707\t0.09117873\n-ENST00000620398\t74\t2.60779\t0.12568050\n-ENST00000620398\t75\t2.85639\t0.10484563\n-ENST00000620398\t76\t2.31767\t0.15301845\n-ENST00000620398\t77\t1.94231\t0.19321993\n-ENST00000620398\t78\t1.69364\t0.22277963\n-ENST00000620398\t79\t1.55675\t0.24000599\n-ENST00000620398\t80\t0.93058\t0.32660604\n-ENST00000620398\t81\t0.61986\t0.37358728\n-ENST00000620398\t82\t1.05410\t0.30859556\n-ENST00000620398\t83\t0.64566\t0.36960437\n-ENST00000620398\t84\t0.67873\t0.36451923\n-ENST00000620398\t85\t0.83084\t0.34143984\n-ENST00000620398\t86\t0.35702\t0.41486126\n-ENST00000620398\t87\t0.73115\t0.35650659\n-ENST00000620398\t88\t0.64236\t0.37011305\n-ENST00000620398\t89\t0.54931\t0.38454568\n-ENST00000620398\t90\t0.68638\t0.36334619\n-ENST00000620398\t91\t0.97906\t0.31948812\n-ENST00000620398\t92\t1.23112\t0.28353736
\n-ENST00000620398\t93\t0.45082\t0.39999701\n-ENST00000620398\t94\t0.41468\t0.40570788\n-ENST00000620398\t95\t-0.10423\t0.48942268\n-ENST00000620398\t96\t-0.12172\t0.49227874\n-ENST00000620398\t97\t-0.08904\t0.48694300\n-ENST00000620398\t98\t0.28265\t0.42673716\n-ENST00000620398\t99\t0.51222\t0.39034438\n-ENST00000620398\t100\t0.96265\t0.32189051\n-ENST00000620398\t101\t0.87339\t0.33508112\n-ENST0000062'..b'2088\t-0.38885\t0.53592518\n-ENST00000622300\t2089\t-0.80579\t0.60329113\n-ENST00000622300\t2090\t-0.59000\t0.56862724\n-ENST00000622300\t2091\t-0.08360\t0.48605514\n-ENST00000622300\t2092\t-0.03087\t0.47745551\n-ENST00000622300\t2093\t-0.04403\t0.47960058\n-ENST00000622300\t2094\t-0.23911\t0.51146340\n-ENST00000622300\t2095\t-0.32369\t0.52528662\n-ENST00000622300\t2096\t-0.29859\t0.52118550\n-ENST00000622300\t2097\t-0.10127\t0.48893941\n-ENST00000622300\t2098\t0.04468\t0.46515962\n-ENST00000622300\t2099\t0.19115\t0.44144341\n-ENST00000622300\t2100\t0.05755\t0.46306865\n-ENST00000622300\t2101\t-0.31068\t0.52316106\n-ENST00000622300\t2102\t-0.52232\t0.55765529\n-ENST00000622300\t2103\t-0.49955\t0.55395585\n-ENST00000622300\t2104\t-0.82162\t0.60581031\n-ENST00000622300\t2105\t-0.70734\t0.58754561\n-ENST00000622300\t2106\t-0.33008\t0.52633046\n-ENST00000622300\t2107\t-0.38568\t0.53540796\n-ENST00000622300\t2108\t-0.36474\t0.53199041\n-ENST00000622300\t2109\t-0.13423\t0.49432211\n-ENST00000622300\t2110\t-0.20524\t0.50592662\n-ENST00000622300\t2111\t0.00181\t0.47213261\n-ENST00000622300\t2112\t0.57341\t0.38079150\n-ENST00000622300\t2113\t0.12068\t0.45283040\n-ENST00000622300\t2114\t-0.19236\t0.50382125\n-ENST00000622300\t2115\t-0.46403\t0.54817792\n-ENST00000622300\t2116\t-1.14300\t0.65601027\n-ENST00000622300\t2117\t-1.91000\t0.76517726\n-ENST00000622300\t2118\t-2.63527\t0.84912978\n-ENST00000622300\t2119\t-2.83474\t0.86842491\n-ENST00000622300\t2120\t-3.23425\t0.90204802\n-ENST00000622300\t2121\t-3.24377\t0.90276867\n-ENST00000622300\t2122\t-3.54207\t0.92351327\n-EN
ST00000622300\t2123\t-3.49120\t0.92022316\n-ENST00000622300\t2124\t-3.37913\t0.91261908\n-ENST00000622300\t2125\t-2.86945\t0.87161106\n-ENST00000622300\t2126\t-2.92146\t0.87629024\n-ENST00000622300\t2127\t-2.51600\t0.83679293\n-ENST00000622300\t2128\t-2.34713\t0.81831367\n-ENST00000622300\t2129\t-1.62587\t0.72685216\n-ENST00000622300\t2130\t-0.95580\t0.62700328\n-ENST00000622300\t2131\t-0.61513\t0.57269091\n-ENST00000622300\t2132\t-0.72322\t0.59009395\n-ENST00000622300\t2133\t-0.60173\t0.57052480\n-ENST00000622300\t2134\t-1.40123\t0.69469914\n-ENST00000622300\t2135\t-1.88554\t0.76198911\n-ENST00000622300\t2136\t-2.63829\t0.84943441\n-ENST00000622300\t2137\t-2.71734\t0.85727172\n-ENST00000622300\t2138\t-3.15382\t0.89581171\n-ENST00000622300\t2139\t-3.19192\t0.89879892\n-ENST00000622300\t2140\t-3.30942\t0.90763820\n-ENST00000622300\t2141\t-4.08497\t0.95272875\n-ENST00000622300\t2142\t-4.50253\t0.96867342\n-ENST00000622300\t2143\t-4.95542\t0.98079734\n-ENST00000622300\t2144\t-5.17351\t0.98508152\n-ENST00000622300\t2145\t-4.71217\t0.97488033\n-ENST00000622300\t2146\t-4.38306\t0.96462554\n-ENST00000622300\t2147\t-3.72728\t0.93466395\n-ENST00000622300\t2148\t-3.68919\t0.93247499\n-ENST00000622300\t2149\t-3.60091\t0.92719526\n-ENST00000622300\t2150\t-2.93785\t0.87774122\n-ENST00000622300\t2151\t-3.61480\t0.92804530\n-ENST00000622300\t2152\t-3.49113\t0.92021857\n-ENST00000622300\t2153\t-3.69243\t0.93266326\n-ENST00000622300\t2154\t-4.01617\t0.94959080\n-ENST00000622300\t2155\t-4.32021\t0.96233680\n-ENST00000622300\t2156\t-4.82328\t0.97774351\n-ENST00000622300\t2157\t-4.80534\t0.97730005\n-ENST00000622300\t2158\t-5.13810\t0.98444550\n-ENST00000622300\t2159\t-4.42529\t0.96610091\n-ENST00000622300\t2160\t-3.69276\t0.93268242\n-ENST00000622300\t2161\t-3.99025\t0.94836834\n-ENST00000622300\t2162\t-2.89683\t0.87408855\n-ENST00000622300\t2163\t-2.60825\t0.84638718\n-ENST00000622300\t2164\t-2.33405\t0.81683342\n-ENST00000622300\t2165\t-1.63861\t0.72862894\n-ENST00000622300\t2166\t-
0.49683\t0.55351369\n-ENST00000622300\t2167\t-0.15199\t0.49722365\n-ENST00000622300\t2168\t0.55740\t0.38328424\n-ENST00000622300\t2169\t0.76656\t0.35112850\n-ENST00000622300\t2170\t0.28343\t0.42661223\n-ENST00000622300\t2171\t0.13769\t0.45007750\n-ENST00000622300\t2172\t0.03253\t0.46713466\n-ENST00000622300\t2173\t-0.79852\t0.60213296\n-ENST00000622300\t2174\t-0.36399\t0.53186797\n-ENST00000622300\t2175\t-0.04905\t0.48041905\n-ENST00000622300\t2176\t0.27069\t0.42865379\n-ENST00000622300\t2177\t-0.14691\t0.49639363\n-ENST00000622300\t2178\t0.21709\t0.43726452\n-ENST00000622300\t2179\t-0.17530\t0.50103287\n-ENST00000622300\t2180\t-0.03731\t0.47850512\n-ENST00000622300\t2181\t0.30761\t0.42274311\n-ENST00000622300\t2182\t0.00110\t0.47224819\n-ENST00000622300\t2183\t0.24841\t0.43222892\n'
diff -r 215925e588c4 -r 20429f4c1b95 test-data/GraphProt_predict_profile_test_out4.peak_regions.bed
--- a/test-data/GraphProt_predict_profile_test_out4.peak_regions.bed Fri May 25 12:17:44 2018 -0400
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,41 +0,0 @@
-ENST00000620398 162 163 163;4.10141;0.03386102 0 +
-ENST00000620398 235 236 236;3.88117;0.04272571 0 +
-ENST00000620398 258 263 261;4.77792;0.01463037 0 +
-ENST00000620398 439 440 440;3.94044;0.04019990 0 +
-ENST00000620398 575 595 578;4.60065;0.01862025 0 +
-ENST00000620398 741 745 744;4.23242;0.02924040 0 +
-ENST00000620398 857 859 859;3.77513;0.04751189 0 +
-ENST00000550775 1412 1414 1414;3.83592;0.04472568 0 +
-ENST00000622300 130 146 141;5.32023;0.00615285 0 +
-ENST00000622300 147 149 148;4.13454;0.03264802 0 +
-ENST00000622300 184 189 186;4.12509;0.03299090 0 +
-ENST00000622300 284 289 286;4.77712;0.01464689 0 +
-ENST00000622300 375 376 376;3.77414;0.04755822 0 +
-ENST00000622300 694 696 695;4.08808;0.03435773 0 +
-ENST00000622300 948 956 953;4.62972;0.01791917 0 +
-ENST00000622300 972 974 973;3.84084;0.04450518 0 +
-ENST00000622300 979 980 980;3.82283;0.04531595 0 +
-ENST00000622300 1029 1030 1030;4.21258;0.02991017 0 +
-ENST00000622300 1076 1079 1078;4.44140;0.02280568 0 +
-ENST00000622300 1080 1081 1081;3.86707;0.04334219 0 +
-ENST00000622300 1096 1107 1103;5.37082;0.00560540 0 +
-ENST00000622300 1153 1154 1154;4.10699;0.03365457 0 +
-ENST00000622300 1156 1159 1158;4.01594;0.03713346 0 +
-ENST00000622300 1171 1172 1172;3.78529;0.04703821 0 +
-ENST00000622300 1173 1174 1174;3.98787;0.03825401 0 +
-ENST00000622300 1253 1283 1273;5.88435;0.00182390 0 +
-ENST00000622300 1300 1310 1304;5.12597;0.00861035 0 +
-ENST00000622300 1396 1398 1397;3.81809;0.04553100 0 +
-ENST00000622300 1399 1400 1400;3.92285;0.04093851 0 +
-ENST00000622300 1401 1402 1402;3.72768;0.04976717 0 +
-ENST00000622300 1423 1447 1434;4.67309;0.01690797 0 +
-ENST00000622300 1450 1452 1452;4.05094;0.03576819 0 +
-ENST00000622300 1458 1469 1461;4.77693;0.01465081 0 +
-ENST00000622300 1486 1487 1487;3.95055;0.03977955 0 +
-ENST00000622300 1489 1490 1490;4.02828;0.03664808 0 +
-ENST00000622300 1492 1493 1493;3.98876;0.03821813 0 +
-ENST00000622300 1498 1499 1499;3.80562;0.04610007 0 +
-ENST00000622300 1549 1550 1550;3.74875;0.04875693 0 +
-ENST00000622300 1666 1679 1673;5.20234;0.00757419 0 +
-ENST00000622300 1720 1726 1724;4.85525;0.01309527 0 +
-ENST00000622300 2050 2051 2051;3.77574;0.04748336 0 +
diff -r 215925e588c4 -r 20429f4c1b95 test-data/GraphProt_predict_profile_test_out4.peak_regions_p50.bed
--- a/test-data/GraphProt_predict_profile_test_out4.peak_regions_p50.bed Fri May 25 12:17:44 2018 -0400
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,35 +0,0 @@
-ENST00000620398 162 163 163;4.10141;0.03386102 0 +
-ENST00000620398 235 236 236;3.88117;0.04272571 0 +
-ENST00000620398 258 263 261;4.77792;0.01463037 0 +
-ENST00000620398 439 440 440;3.94044;0.04019990 0 +
-ENST00000620398 575 595 578;4.60065;0.01862025 0 +
-ENST00000620398 742 745 744;4.23242;0.02924040 0 +
-ENST00000550775 1412 1414 1414;3.83592;0.04472568 0 +
-ENST00000622300 131 146 141;5.32023;0.00615285 0 +
-ENST00000622300 147 149 148;4.13454;0.03264802 0 +
-ENST00000622300 184 188 186;4.12509;0.03299090 0 +
-ENST00000622300 284 289 286;4.77712;0.01464689 0 +
-ENST00000622300 694 695 695;4.08808;0.03435773 0 +
-ENST00000622300 948 956 953;4.62972;0.01791917 0 +
-ENST00000622300 972 973 973;3.84084;0.04450518 0 +
-ENST00000622300 979 980 980;3.82283;0.04531595 0 +
-ENST00000622300 1029 1030 1030;4.21258;0.02991017 0 +
-ENST00000622300 1076 1079 1078;4.44140;0.02280568 0 +
-ENST00000622300 1080 1081 1081;3.86707;0.04334219 0 +
-ENST00000622300 1096 1107 1103;5.37082;0.00560540 0 +
-ENST00000622300 1153 1154 1154;4.10699;0.03365457 0 +
-ENST00000622300 1156 1159 1158;4.01594;0.03713346 0 +
-ENST00000622300 1173 1174 1174;3.98787;0.03825401 0 +
-ENST00000622300 1253 1283 1273;5.88435;0.00182390 0 +
-ENST00000622300 1300 1310 1304;5.12597;0.00861035 0 +
-ENST00000622300 1396 1397 1397;3.81809;0.04553100 0 +
-ENST00000622300 1399 1400 1400;3.92285;0.04093851 0 +
-ENST00000622300 1423 1447 1434;4.67309;0.01690797 0 +
-ENST00000622300 1450 1452 1452;4.05094;0.03576819 0 +
-ENST00000622300 1458 1468 1461;4.77693;0.01465081 0 +
-ENST00000622300 1486 1487 1487;3.95055;0.03977955 0 +
-ENST00000622300 1489 1490 1490;4.02828;0.03664808 0 +
-ENST00000622300 1492 1493 1493;3.98876;0.03821813 0 +
-ENST00000622300 1498 1499 1499;3.80562;0.04610007 0 +
-ENST00000622300 1666 1678 1673;5.20234;0.00757419 0 +
-ENST00000622300 1720 1726 1724;4.85525;0.01309527 0 +
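Note on the removed peak-region files above: each row is a six-column BED line whose fourth column packs three semicolon-separated values. A minimal Python sketch for unpacking such a row follows. It assumes the three values are the peak top position, the score at that position, and a p-value; the diff itself does not label them, so the field names `top_pos`, `score`, and `p_value`, and the helper `parse_peak_line`, are hypothetical and not taken from the repository. The leading `-` diff marker is dropped before parsing.

```python
from typing import NamedTuple


class Peak(NamedTuple):
    seq_id: str
    start: int
    end: int
    top_pos: int    # assumed: position of the highest score in the region
    score: float    # assumed: score at top_pos
    p_value: float  # assumed: p-value belonging to that score
    strand: str


def parse_peak_line(line: str) -> Peak:
    """Split one whitespace-separated BED row and unpack its name column."""
    seq_id, start, end, name, _bed_score, strand = line.split()
    top_pos, score, p_value = name.split(";")
    return Peak(seq_id, int(start), int(end),
                int(top_pos), float(score), float(p_value), strand)


peak = parse_peak_line("ENST00000620398 162 163 163;4.10141;0.03386102 0 +")
print(peak.seq_id, peak.top_pos, peak.score, peak.p_value)
```

Using `str.split()` without an argument keeps the sketch indifferent to whether the columns are separated by tabs or by spaces.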
diff -r 215925e588c4 -r 20429f4c1b95 test-data/file1
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/file1 Wed Jan 22 10:14:41 2020 -0500
@@ -0,0 +1,9 @@
+Open for business
+Pain don't hurt
+I love the smell of napalm in the morning
+What are you gonna do? Bleed on me?
+There's no crying in baseball
+Nice night for a walk
+I'll beat you like a bad stepchild
+There's always barber college
+
diff -r 215925e588c4 -r 20429f4c1b95 test-data/file2
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/file2 Wed Jan 22 10:14:41 2020 -0500
@@ -0,0 +1,9 @@
+Open for business
+Pain don't hurt
+I love the smell of napalm in the morning
+What are you gonna do? Bleed on me?
+There's no crying in baseball
+Nice night for a walk
+I'll beat you like a bad stepchild
+There's always barber college
+
diff -r 215925e588c4 -r 20429f4c1b95 test-data/structure_test.model
--- a/test-data/structure_test.model Fri May 25 12:17:44 2018 -0400
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
b'@@ -1,3 +0,0 @@\n-bias -0.553788\n-wscale 0.00116856\n-w  0:-4.0135382e+02 1:1.9925323e+02 2:5.0541817e+01 3:1.8746223e+01 4:4.7938229e+01 5:-1.3184420e+00 6:1.7999487e+02 7:2.6940073e+02 8:1.6889235e+01 9:6.0267132e+01 10:-1.1699441e+02 11:-1.5380527e+02 12:1.0863457e+02 13:2.2859363e+02 14:-1.1410797e+02 15:-2.8184515e+01 16:-4.8148053e+02 17:-2.0947812e+02 18:-1.8556664e+02 19:1.1845671e+02 20:-1.1883872e+02 21:-2.1624365e+03 22:-2.0481577e+01 23:1.2171440e+01 24:1.7009158e+02 25:-1.4347870e+02 26:-7.0251373e+02 27:1.6310406e+02 28:-5.2023670e+01 29:7.8187976e+02 30:-1.2321568e+02 31:-2.3541357e+01 32:3.9875206e+01 33:-2.2114096e+02 34:2.2337152e+02 35:1.5582236e+01 36:-1.0316161e+02 37:-1.9433070e+02 38:2.5308902e+00 39:-6.3811334e+02 40:2.2934598e+02 41:-2.3333070e+02 42:2.4731702e+02 43:-7.7959183e+01 44:-1.0662012e+02 45:-8.0184425e+01 46:4.3575726e+02 47:7.8634888e+02 48:-1.2234956e+02 49:4.4087677e+02 50:3.7999119e+01 51:2.4422890e+01 52:1.3852211e+02 53:7.2109711e+01 54:1.6307535e+02 55:1.5398748e+01 56:-1.0704750e+02 57:4.3501698e+01 58:-3.2439218e+02 59:2.6403030e+02 60:1.4380971e+02 61:9.9229485e+01 62:-1.5779138e+02 63:-2.1399815e+02 64:-7.1359901e+01 65:-3.7181134e+02 66:1.1802785e+02 67:-4.9734962e+01 68:-1.5117850e+02 69:-9.9662048e+01 70:1.8721564e+02 71:7.1901344e+01 72:-3.3247124e+01 73:-2.3725397e+02 74:-1.3587536e+01 75:-2.3066275e+02 76:-4.6695225e+01 77:5.6201302e+01 78:-7.0159348e+01 79:1.5901779e+02 80:1.7312286e+02 81:-9.0538742e+01 82:1.8173250e+02 83:1.0174693e+02 84:2.5895911e+02 85:-3.0658699e+01 86:5.6864807e+01 87:-1.2459058e+03 88:5.1981255e+01 89:4.0778357e+02 90:-1.6180670e+02 91:-3.2745903e+01 92:-9.9358269e+01 93:1.3428168e+02 94:1.5140968e+02 95:-1.5273450e+02 96:-5.1702404e+01 97:-8.5754309e+00 98:-5.1872887e+01 99:5.1746212e+01 100:-1.6516598e+00 101:-4.1036270e+01 102:3.3574640e+02 103:1.4686101e+02 104:8.2546585e+01 105:1.3819209e+02 106:1.9993883e+02 107:1.6171097e+02 108:1.3761618e+02 109:1.7962082e+02 
110:-3.0848163e+02 111:-1.5339017e+02 112:1.1032555e+02 113:-5.8718643e+01 114:1.1818639e+02 115:2.5432645e+02 116:4.1417239e+02 117:4.1049896e+02 118:-2.8750034e+01 119:-2.9362457e+02 120:-1.7050313e+02 121:-4.1131779e+01 122:-2.7520865e+02 123:-5.4357043e+02 124:-4.2175812e+02 125:-1.9338382e+02 126:-3.4975491e+01 127:1.6096809e+01 128:3.4437717e+02 129:-1.5509314e+02 130:2.2814128e+02 131:4.9259900e+02 132:3.4401554e+01 133:-1.3594273e+02 134:-7.7588699e+01 135:3.2713943e+01 136:-1.2804221e+02 137:4.2454010e+01 138:-7.3735056e+00 139:2.5791605e+02 140:-1.5359850e+02 141:-2.9143546e+02 142:-3.3002301e+02 143:3.5335257e+02 144:3.1006140e+02 145:3.0519751e+02 146:-4.5439674e+01 147:-2.2654128e+02 148:1.7059235e+02 149:-1.0492175e+02 150:-3.7783438e+02 151:2.3708656e+02 152:3.5881171e+02 153:-6.9475006e+01 154:2.9254553e+01 155:3.1344040e+02 156:-2.4214804e+02 157:-2.2470264e+01 158:3.9350140e+02 159:-5.5949933e+02 160:-3.6170116e+01 161:3.3426294e+02 162:3.0067478e+01 163:2.1396907e+02 164:-1.6210661e+02 165:1.1993384e+02 166:-4.0366492e+02 167:1.5573065e+02 168:-3.3898026e+01 169:3.5186874e+02 170:1.8208464e+01 171:-3.9697757e+02 172:1.9563231e+01 173:-1.2167740e+00 174:1.7675720e+02 175:1.7692696e+02 176:7.3209160e+01 177:-1.0547773e+01 178:-1.3880156e+02 179:1.8516713e+02 180:-3.1930441e+01 181:2.2678960e+02 182:4.5164528e+01 183:-8.3163429e+01 184:1.8298541e+02 185:-2.3758388e+02 186:-5.5641804e+01 187:2.5421872e+02 188:-4.9556686e+01 189:-6.2510323e+01 190:1.2664469e+02 191:-1.4095432e+02 192:-1.3316272e+02 193:2.0159065e+02 194:2.6047335e+01 195:-8.9309021e+01 196:1.9047055e+02 197:-8.8010956e+01 198:-2.4942900e+02 199:-1.2512206e+02 200:2.1736819e+02 201:1.7136061e+02 202:-5.2537888e+01 203:3.9739487e+01 204:-2.2097937e+02 205:-6.8877087e+02 206:5.3756123e+01 207:1.1081714e+02 208:2.0510245e+02 209:5.4652096e+01 210:6.0420296e+01 211:3.3406414e+01 212:-2.2127650e+02 213:2.3668584e+02 214:-2.0177042e+02 215:-1.0935802e+02 216:1.5345189e+02 217:3.9307452e+02 
218:-1.1375249e+02 219:'..b' 16189:-3.8923160e+02 16190:1.9454342e-01 16191:2.2205391e+02 16192:-3.0325488e+02 16193:2.0289177e+01 16194:1.3297751e+02 16195:-5.0635857e+01 16196:-1.5914134e+02 16197:7.6227020e+02 16198:-1.8649309e+02 16199:-1.0087193e+02 16200:2.7067329e+01 16201:5.6069350e+00 16202:7.1741508e+01 16203:1.0398770e+02 16204:1.5457289e+02 16205:1.4142047e+02 16206:4.3580337e+00 16207:8.2620613e+01 16208:2.2853886e+02 16209:-6.7359276e+01 16210:3.1518135e+01 16211:1.0528749e+02 16212:-1.3292932e+02 16213:-3.6427322e+02 16214:2.0843103e+02 16215:1.2301257e+02 16216:4.6732077e+02 16217:-1.5728828e+02 16218:-3.8228699e+01 16219:-6.2153629e+01 16220:9.3722038e+01 16221:1.0082492e+02 16222:-1.9527151e+02 16223:-1.0604891e+02 16224:1.4530508e+02 16225:1.9284476e+02 16226:-1.8522855e+00 16227:-3.5758786e+02 16228:1.3658163e+02 16229:-3.8850092e+02 16230:1.5093387e+02 16231:2.1205777e+02 16232:-6.3815804e+01 16233:7.6644470e+01 16234:1.1010138e+02 16235:-2.1285620e+02 16236:-1.1988432e+01 16237:9.2312897e+01 16238:-1.9465490e-01 16239:5.8619392e+01 16240:3.9814266e+01 16241:-2.7054823e+01 16242:-1.4608742e+02 16243:3.6123584e+02 16244:-7.6456200e+01 16245:-2.1640871e+01 16246:1.2341808e+02 16247:2.2152608e+02 16248:-1.1504914e+02 16249:5.1228771e+01 16250:-7.6773621e+01 16251:1.5680762e+01 16252:4.0762158e+02 16253:-1.5262030e+02 16254:-4.2093719e+01 16255:-1.7519968e+02 16256:-1.3588974e+01 16257:-3.0629393e+01 16258:-5.8851315e+01 16259:-6.8850327e+01 16260:3.7673389e+01 16261:-4.2068552e+02 16262:-4.1952283e+02 16263:1.9986372e+02 16264:6.0857750e+01 16265:1.8097498e+02 16266:-1.6868561e+02 16267:3.2376590e+02 16268:7.2399437e+01 16269:-6.0803680e+01 16270:3.8516025e+02 16271:-1.2054656e+02 16272:2.1084202e+01 16273:-1.4712164e+02 16274:4.7898384e+01 16275:-2.4151979e+02 16276:-1.7836485e+02 16277:-1.7327222e+02 16278:3.8005402e+01 16279:3.9724173e+02 16280:1.0622913e+02 16281:2.4333986e+01 16282:-3.9843573e+02 16283:-4.1193810e+00 16284:-1.5672760e+02 
16285:5.9834400e+01 16286:-7.1720650e+01 16287:-1.7512376e+02 16288:5.2579895e+02 16289:-8.5459995e+00 16290:2.8361855e+01 16291:1.3838816e+03 16292:-2.1508041e+02 16293:8.3770226e+01 16294:3.9106674e+01 16295:-3.1140341e+02 16296:-3.9065884e+02 16297:-1.1651790e+02 16298:2.3443216e+02 16299:-1.7045984e+02 16300:1.5198416e+01 16301:-1.9520554e+02 16302:3.9955811e+02 16303:-1.3044139e+02 16304:1.3230621e+02 16305:-1.5916510e+02 16306:-3.9482751e+02 16307:1.1108967e+02 16308:-1.7378566e+02 16309:-2.5667422e+02 16310:-1.0399294e+02 16311:-3.4011646e+01 16312:1.2511178e+02 16313:2.1757544e+02 16314:2.9530016e+01 16315:2.9884661e+01 16316:-5.2185001e+01 16317:-1.3335194e+02 16318:-4.6273777e+01 16319:-6.9023880e+01 16320:-1.4519902e+02 16321:8.7691116e+01 16322:-1.5744434e+02 16323:9.7839882e+01 16324:-4.1649670e+01 16325:2.0033319e+02 16326:5.5544514e+01 16327:7.1370956e+01 16328:1.5337119e+02 16329:6.7460289e+01 16330:-6.6159019e+01 16331:-2.4336459e+02 16332:-3.7881564e+02 16333:9.9232010e+01 16334:-4.3925926e+01 16335:2.2937117e+02 16336:2.3818100e+02 16337:3.7972620e+02 16338:-3.0985400e+02 16339:-1.0420200e+02 16340:-3.1460129e+01 16341:-1.9390654e+02 16342:2.4413045e+02 16343:-1.0303799e+02 16344:-9.4112808e+01 16345:-9.2702232e+01 16346:2.3670500e+02 16347:-1.5132767e+02 16348:3.1461002e+01 16349:-2.5901172e+02 16350:5.3989368e+01 16351:3.5555991e+02 16352:2.4456305e+02 16353:-1.7409081e+02 16354:7.3725153e+02 16355:-4.1369320e+01 16356:6.4980652e+01 16357:-2.8018726e+02 16358:3.3691666e+01 16359:3.0508701e+02 16360:-6.2882713e+01 16361:-2.2321880e+02 16362:-3.9429558e+01 16363:9.6953926e+01 16364:1.2708233e+02 16365:-5.4174554e+02 16366:-1.6106104e+02 16367:-2.3512967e+02 16368:-1.2259144e+02 16369:-7.5829514e+01 16370:1.4410321e+02 16371:2.0485582e+02 16372:4.0015186e+01 16373:3.8507840e+02 16374:4.2588135e+01 16375:-1.4845244e+02 16376:-1.6613632e+02 16377:1.8827328e+02 16378:-3.2113272e+02 16379:8.6524193e+01 16380:-4.6628574e+01 16381:2.9751631e+01 
16382:4.3969955e+00 16383:-4.1308667e+02\n'
diff -r 215925e588c4 -r 20429f4c1b95 test-data/structure_test.params
--- a/test-data/structure_test.params Fri May 25 12:17:44 2018 -0400
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,6 +0,0 @@
-epochs: 50
-lambda: 0.0001
-R: 4
-D: 0
-bitsize: 14
-abstraction: 3
diff -r 215925e588c4 -r 20429f4c1b95 test-data/test.ensembl.fa
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/test.ensembl.fa Wed Jan 22 10:14:41 2020 -0500
@@ -0,0 +1,5 @@
+>ENST00000415118.1 cdna chromosome:GRCh38:14:22438547:22438554:1 gene:ENSG00000223997.1 gene_biotype:TR_D_gene transcript_biotype:TR_D_gene gene_symbol:TRDD1 description:T cell receptor delta diversity 1 [Source:HGNC Symbol;Acc:HGNC:12254]
+GAAATAGT
+>ENST00000448914.1 cdna chromosome:GRCh38:14:22449113:22449125:1 gene:ENSG00000228985.1 gene_biotype:TR_D_gene transcript_biotype:TR_D_gene gene_symbol:TRDD3 description:T cell receptor delta diversity 3 [Source:HGNC Symbol;Acc:HGNC:12256]
+ACTGGGGGATACG
+AAAA
diff -r 215925e588c4 -r 20429f4c1b95 test-data/test.fa
--- a/test-data/test.fa Fri May 25 12:17:44 2018 -0400
+++ b/test-data/test.fa Wed Jan 22 10:14:41 2020 -0500
@@ -1,80 +1,5 @@
->ENST00000620398
-ACTCTGGGAGGGCTAAGGAGCCATCATGATCCCTAAGCTGCTTTCCCTCCTCTGTTTCAG
-ACTGTGCGTGGGCCAAGGAGACACAAGGGGAGATGGGTCACTGCCCAAGCCGTCCCTCAG
-TGCCTGGCCCAGCTCGGTGGTCCCTGCCAACAGCAATGTGACGCTGCGATGTTGGACTCC
-TGCCAGAGGTGTGAGCTTTGTTCTCAGGAAGGGAGGAATTATTCTGGAGTCCCCGAAGCC
-CCTTGATTCTACAGAGGGCGCGGCCGAATTTCACCTCAATAATCTAAAAGTCAGAAATGC
-TGGAGAGTACACCTGTGAATACTACAGAAAAGCATCCCCCCACATCCTTTCACAGCGCAG
-TGACGTCCTTCTACTGTTGGTGACAGGACATTTATCTAAACCTTTCCTCCGAACCTACCA
-AAGGGGTACAGTGACCGCAGGTGGAAGGGTGACTCTGCAGTGCCAGAAGCGAGACCAATT
-GTTTGTGCCTATCATGTTCGCTCTACTGAAGGCAGGGACGCCATCACCCATCCAGCTGCA
-GAGTCCAGCGGGGAAGGAGATAGACTTCTCTCTGGTGGACGTGACAGCCGGCGATGCTGG
-GAACTACAGCTGCATGTACTACCAGACAAAGTCTCCCTTCTGGGCCTCAGAACCCAGTGA
-TCAGCTTGAGATATTGGTGACAGTTCCCCCAGGTACCACATCGAGCAACTACTCCCTGGG
-TAACTTCGTACGACTGGGTCTGGCTGCCGTAATTGTGGTTATCATGGGAGCTTTCCTGGT
-GGAGGCCTGGTACAGCCGGAATGTGTCTCCAGGTGAATCAGAGGCCTTCAAACCAGAGTG
-ACTCCATCTTGAACCGGGGCTGGGTAAACTGAGGCTGCAACCTGCTGGACTGCATTC
->ENST00000550775
-TCATGACTACACAGTTAAGGGTCGTCCATCTGCTTCCCCTTCTCCTAGCCTGCTTTGTGC
-AAACAAGTCCCAAGCAGGAGAAGATGAAGATGGATTGCCACAAAGACGAGAAAGGCACCA
-TCTATGACTATGAGGCCATCGCACTTAATAAGAATGAATATGTTTCCTTCAAGCAGTATG
-TGGGCAAGCACATCCTCTTCGTCAACGTGGCCACCTACTGTGGTCTGACAGCGCAATATC
-CTGAACTAAATGCACTCCAGGAGGAGCTGAAGCCCTATGGTCTAGTTGTGTTGGGCTTTC
-CCTGCAACCAATTTGGAAAGCAAGAACCAGGAGATAACAAAGAGATTCTTCCTGGGCTCA
-AGTATGTCCGTCCAGGGGGAGGATTTGTACCTAGTTTCCAGCTTTTTGAGAAAGGGGATG
-TGAATGGTGAAAAAGAACAGAAAGTCTTCAGTTTCTTGAAGCACTCCTGTCCTCATCCCT
-CTGAGATTTTGGGCACATTCAAATCTATATCCTGGGACCCTGTAAAGGTCCATGACATCC
-GTTGGAACTTTGAAAAGTTCCTGGTGGGGCCTGATGGAATCCCTGTCATGCGCTGGTCCC
-ACCGGGCTACGGTCAGCTCAGTCAAGACAGACATCCTGGCGTACTTGAAGCAATTCAAAA
-CCAAATAGGAAGGTGGAGTTAAGGGCAGGAGCAACCCTACTTCTCACCTAATGACTTGCT
-CTCCCCACCCCTCCAAAAAAAAGGAATACCCATCTTCTCACCACACTCTCTTCCTGCATG
-GGCTCCACCTGAGTAAATCACCGCCACATACTGCCAGAATTCCCACTCTCCACAAACTAG
-ATTTATATTTGGAAGGCTACTGCTCTTTTGCCTCTCAAGAGTATGTGGGTGAAGACTGAA
-ACAAATGGAAACCTAAAATCCCCAGACCTCTGTTACAGGTTGAATCATTCATATCCACCA
-GAGGGAAATCATCCTTCCACGACAATGGTTCAGTCGGCCATCACATCCTGAAGAACATTC
-CTAGACATTCTGACTCTTCCATCTCTCTCTACCCTGGAGGTGTAGAAATAGCAATGGGGT
-CACAGTCACACAATTTAGGTTCCACTTCATAACATTTTTTGTCTCCTAGGACAAACGTGT
-ATCATCAGTTTCCAACTGTTTTGGCTCAGTTTTCATCCATGACACCTCCCCCTACCAGCC
-ATTCTCCTGTGGGAGCAAGAACATTGCTTCAAAAGAAGAGAGGGCATCTCCATGCTTGTG
-GGACCCAAAACCTATCTCTGGCCCTACAAAAGTTTTCCTAATTGTCTGATCTTTAGTGCA
-TTCAGGTTATGGCACCTGGAGAGGAATGCCCTTTATCTTTTGAAGGATGGGATTTCCCAT
-CTCAACCCTGGATTCCTCACCTTCAGAACGAGCCCTGCCACTGTCTCCAATAAAATGTTT
-TCTGCAGCATCG
->ENST00000622300
-TAGTTTCCTGTTTCCGGCTTCGCTTCGGCCCACCCCCACGTCCACCCCGAATCCCTGCTT
-AAAGGCCTTGCTTTCTTGTCTAACGCCGCAACCAGTCCTCTGAGTTGCCAACGTCTTTCT
-TCTTGTCTCGACGCCCCGTCGTCCGGCCACAGCGATTCTCTGCTTAGCAGGATCGGTCCA
-CAGCGGGACGTGAGTCCCTTTCCTCCTCGCGGCTTACCGCCTCTCTCCGCCTAGTGCCAG
-GTGCTAATAAAGTTGTTGTTTCAAATGCGGCCAGGAACATCGCGAGCGGGGACCAATCAG
-AGAGTAGCTTTGCCTCTATAACGGCGCGAGAGTGAGACGTCATCGGTGAGCGACTAACGC
-TAGAAACAGTGGTGCGCGGAGAGGAGAGGCCTCGGGATGTCTCTGGCAGATGAGCTCTTA
-GCTGATCTCGAAGAGGCAGCAGAAGAGGAGGAAGGAGGAAGCTATGGGGAGGAAGAAGAG
-GAGCCAGCGATCGAGGATGTGCAGGAGGAGACACAGCTGGATCTTTCCGGGGATTCAGTC
-AAGACCATCGCCAAGCTATGGGATAGTAAGATGTTTGCTGAGATTATGATGAAGATTGAG
-GAGTATATCAGCAAGCAAGCCAAAGCTTCAGAAGTGATGGGACCAGTGGAGGCCGCGCCT
-GAATACCGCGTCATCGTGGATGCCAACAACCTGACCGTGGAGATCGAAAACGAGCTGAAC
-ATCATCCATAAGTTCATCCGGGATAAGTACTCAAAGAGATTCCCTGAACTGGAGTCCTTG
-GTCCCCAATGCACTGGATTACATCCGCACGGTCAAGGAGCTGGGCAACAGCCTGGACAAG
-TGCAAGAACAATGAGAACCTGCAGCAGATCCTCACCAATGCCACCATCATGGTCGTCAGC
-GTCACCGCCTCCACCACCCAGGGGCAGCAGCTGTCGGAGGAGGAGCTGGAGCGGCTGGAG
-GAGGCCTGCGACATGGCGCTGGAGCTGAACGCCTCCAAGCACCGCATCTACGAGTATGTG
-GAGTCCCGGATGTCCTTCATCGCACCCAACCTGTCCATCATTATCGGGGCATCCACGGCC
-GCCAAGATCATGGGTGTGGCCGGCGGCCTGACCAACCTCTCCAAGATGCCCGCCTGCAAC
-ATCATGCTGCTCGGGGCCCAGCGCAAGACGCTGTCGGGCTTCTCGTCTACCTCAGTGCTG
-CCCCACACCGGCTACATCTACCACAGTGACATCGTGCAGTCCCTGCCACCGGATCTGCGG
-CGGAAAGCGGCCCGGCTGGTGGCCGCCAAGTGCACACTGGCAGCCCGTGTGGACAGTTTC
-CACGAGAGCACAGAAGGGAAGGTGGGCTACGAACTGAAGGATGAGATCGAGCGCAAATTC
-GACAAGTGGCAGGAGCCGCCGCCTGTGAAGCAGGTGAAGCCGCTGCCTGCGCCCCTGGAT
-GGACAGCGGAAGAAGCGAGGCGGCCGCAGGTACCGCAAGATGAAGGAGCGGCTGGGGCTG
-ACGGAGATCCGGAAGCAGGCCAACCGTATGAGCTTCGGAGAGATCGAGGAGGACGCCTAC
-CAGGAGGACCTGGGATTCAGCCTGGGCCACCTGGGCAAGTCGGGCAGTGGGCGTGTGCGG
-CAGACACAGGTAAACGAGGCCACCAAGGCCAGGATCTCCAAGACGCTGCAGCGGACCCTG
-CAGAAGCAGAGCGTCGTATATGGCGGGAAGTCCACCATCCGCGACCGCTCCTCGGGCACG
-GCCTCCAGCGTGGCCTTCACCCCACTCCAGGGCCTGGAGATTGTGAACCCACAGGCGGCA
-GAGAAGAAGGTGGCTGAGGCCAACCAGAAGTATTTCTCCAGCATGGCTGAGTTCCTCAAG
-GTCAAGGGCGAGAAGAGTGGCCTTATGTCCACCTGAATGACTGCGTGTGTCCAAGGTGGC
-TTCCCACTGAAGGGACACAGAGGTCCAGTCCTTCTGAAGGGCTAGGATCGGGTTCTGGCA
-GGGAGAACCTGCCCTGCCACTGGCCCCATTGCTGGGACTGCCCAGGGAGGAGGCCTTGGA
-AGAGTCCGGCCTGGCCTCCCCCAGGACCGAGATCACCGCCCAGTATGGGCTAGAGCAGGT
-CTTCATCATGCCTTGTCTTTTTTAACTGAGAAAGGAGATTTTTTGAAAAGAGTACAATTA
-AAAGGACATTGTCAAGATCTGTC
+>seq1
+acgtACGTacgt
+>seq2
+tgcaTGCAtgca
+ACGTacgt
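Note on the new test.fa above: the replacement file is a small FASTA input with mixed-case nucleotides and a second record (`seq2`) whose sequence spans two lines. A minimal Python sketch of a reader that handles both cases follows. The function name `read_fasta` is hypothetical, and whether the tool's wrapper scripts preserve or normalize letter case is not visible in this diff, so the sketch leaves case untouched.

```python
def read_fasta(text):
    """Return {sequence_id: sequence}, joining multi-line sequences."""
    chunks = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the ID.
            header = line[1:].split()
            current = header[0] if header else ""
            chunks[current] = []
        elif current is not None:
            chunks[current].append(line)
    return {seq_id: "".join(parts) for seq_id, parts in chunks.items()}


test_fa = """>seq1
acgtACGTacgt
>seq2
tgcaTGCAtgca
ACGTacgt
"""
seqs = read_fasta(test_fa)
print(seqs)
```

Taking only the first header token also suits Ensembl-style headers such as those in test.ensembl.fa, where the ID is followed by a long description.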
diff -r 215925e588c4 -r 20429f4c1b95 test-data/test.model
--- a/test-data/test.model Fri May 25 12:17:44 2018 -0400
+++ b/test-data/test.model Wed Jan 22 10:14:41 2020 -0500
b'@@ -1,3 +1,3 @@\n-bias 0.0990594\n-wscale 0.000419047\n-w  3:-2.6323795e+02 7:-1.5498299e+02 39:3.2710247e+01 90:-9.6892609e+01 92:-6.2059918e+02 102:2.7361217e+01 108:-1.8683072e+02 111:9.2224388e+01 121:4.4261597e+01 127:-2.9686353e+02 148:-2.4798018e+02 155:1.0929895e+02 212:-3.0733829e+02 217:-3.6330551e+02 219:4.3392648e+02 220:-4.3969490e+01 223:-1.2370902e+03 235:-2.5857092e+02 286:-4.4877792e+01 306:-3.2496075e+02 308:-2.0045505e+02 375:-2.3125467e+02 396:-6.7614136e+02 400:1.1828072e+02 408:4.3950644e+00 414:2.0196147e+03 421:-3.3002304e+01 431:-4.0940579e+02 432:-5.0269598e+02 434:-1.4379294e+02 469:-4.2782242e+02 486:1.9116473e+02 492:-1.6351070e+02 496:-3.4013925e+02 502:1.6938168e+02 524:9.0154556e+01 541:-3.2447818e+02 542:-3.8114508e+02 546:1.8973650e+02 605:2.8628244e+02 609:-2.4498466e+02 615:4.9888785e+02 617:-7.0967567e+01 626:-7.4189880e+01 635:-4.8810840e+02 638:-3.8662479e+02 703:-6.1965290e+01 718:-2.5008917e+02 752:4.7394373e+02 778:3.6091281e+02 780:-7.1200995e+02 800:-7.9289467e+01 802:1.0135097e+03 805:6.8798347e+01 862:2.0908080e+02 890:-1.3723445e+03 892:-3.4540729e+02 902:3.8364291e+02 904:7.1494586e+02 919:-3.6798563e+02 921:-4.8533035e+02 932:-4.9616232e+00 934:7.6746864e+01 947:-3.0528450e+01 962:2.0308733e+02 982:-8.8417072e+02 1010:2.7702023e+02 1012:-6.7064438e+01 1042:1.5783516e+03 1043:7.6573596e+02 1052:-3.1112509e+02 1053:5.5852393e+02 1057:-1.7441716e+02 1061:-8.1911877e+02 1072:-8.0545190e+02 1097:3.0343884e+01 1101:-9.3145149e+01 1146:-1.4208945e+02 1170:-1.5208934e+02 1185:-5.1659613e+02 1188:7.9852242e+00 1191:2.7740912e+02 1226:-6.5887558e+01 1233:-7.5957849e+02 1240:6.7184441e+01 1268:-1.6966364e+01 1270:7.1152705e-01 1278:-2.6539410e+02 1281:-5.3602417e+02 1291:-7.0432999e+01 1293:-3.1627484e+02 1324:5.9499172e+01 1327:-2.5026250e+02 1338:-4.3128751e+02 1341:1.2852692e+02 1400:-1.8205722e+02 1445:8.7823769e+01 1448:-2.2847488e+02 1477:-1.8806841e+01 1478:-2.2553993e+02 1519:-4.3346246e+02 1521:1.9021851e+03 
1525:9.8176186e+01 1532:1.8415245e+03 1540:-2.8561923e+02 1569:-7.1449944e+01 1581:-7.6894516e+01 1611:5.2454468e+02 1620:1.1161332e+02 1623:-5.0059381e+02 1625:2.7589495e+01 1629:2.8995767e+02 1634:-5.4618298e+02 1669:1.2852551e+02 1670:-2.0161009e+02 1696:-3.4001123e+02 1717:2.4858997e+02 1723:-1.1573253e+03 1726:-3.9857892e+02 1737:5.7238712e+01 1788:-1.2982139e+03 1809:-1.7930902e+01 1841:2.5816199e+02 1848:-1.1458385e+01 1864:2.0670494e+02 1894:-1.4508528e+01 1902:-1.7916452e+02 1913:-3.4403421e+02 1933:-5.1005667e+02 1941:6.2259216e+02 1947:4.8797308e+02 2020:-4.9923500e+01 2024:3.5331131e+02 2027:8.5347278e+02 2055:-4.3892822e+02 2056:-2.8189532e+02 2109:-1.4149394e+01 2112:3.3885446e+02 2129:-9.2347603e+01 2143:9.3486115e+01 2153:-1.3226047e+02 2157:-3.9544894e+02 2193:3.6964410e+02 2226:-1.7645128e+02 2238:-2.4203748e+02 2275:-7.3826553e+01 2282:-1.5893994e+02 2298:-5.3783575e+02 2300:-2.3985342e+01 2313:-1.0684984e+03 2318:6.7666736e+02 2331:1.5789617e+03 2346:-6.8256458e+02 2357:6.2338234e+01 2411:1.1225830e+03 2417:-2.0702786e+02 2433:-1.8449684e+02 2437:-3.5654678e+02 2440:-7.8585243e+01 2481:-2.5452950e+02 2546:-3.2265039e+02 2569:-1.6009232e+02 2588:1.5431109e+01 2672:-5.7787793e+02 2678:4.8135571e+02 2679:-4.8376819e+02 2687:1.7756122e+02 2697:-8.3875990e+00 2718:1.0383005e+00 2727:2.6974607e+01 2729:-2.9415070e+02 2730:1.5908240e+01 2740:-7.3905846e+01 2784:-8.9664299e+01 2796:-3.3169681e+02 2807:-1.7641209e+02 2809:-8.3617859e+01 2814:1.6530400e+02 2829:-3.5813651e+02 2831:-4.1616337e+01 2849:9.3477470e+01 2850:-1.2522018e+02 2882:-7.5252911e+02 2884:-1.3699104e+03 2929:-1.6485271e+02 2931:-3.9677682e+02 2981:-6.4187706e+01 2984:1.1583134e+02 2992:-5.2120243e+01 2994:-5.4991040e+02 3009:4.4423065e+02 3063:-1.1850669e+02 3073:-2.4370335e+02 3095:-9.1676208e+01 3105:-3.4315512e+02 3153:-3.6349112e+02 3156:-3.8696735e+02 3185:-1.3079045e+02 3187:1.5799861e+03 3195:-6.9351700e+01 3216:-3.7491626e+02 3221:-5.8995862e+02 3225:-6.0393408e+02 
3232:4.4445654e+02 3235:-2'..b'252:-2.0254858e+02 13264:-1.1531734e+01 13302:-4.7238293e+01 13319:1.2391662e+01 13327:1.1617863e+01 13330:-8.2290775e-01 13347:1.6414478e+02 13350:1.2953079e+02 13360:-6.0334217e+01 13395:4.4279895e+02 13419:-4.1163483e+01 13431:-5.2169199e+00 13507:3.6257912e+01 13550:6.4378176e+00 13567:2.1246001e+02 13592:-6.4766258e+01 13594:-2.3700809e+01 13600:-5.9889587e+01 13615:-4.7472698e+01 13626:-4.4612346e+00 13635:1.6622469e+01 13641:-2.7826408e+01 13647:1.3331202e+01 13676:1.5068726e-01 13677:2.8233619e+00 13690:1.0518385e+01 13695:-4.5910373e+00 13739:-8.4214989e+01 13778:6.8273186e+01 13783:-4.6545403e+01 13786:8.0396671e+00 13791:-7.7837963e+00 13803:-3.3170278e+00 13840:3.9482586e+01 13845:-1.8135855e+01 13847:-2.2709573e+02 13859:-1.4240252e+00 13872:2.7615047e+00 13874:1.1435952e+02 13877:-3.1248505e+01 13888:5.7587227e+01 13902:1.9792150e+01 13926:1.5141506e+00 13927:1.3879480e+01 13932:-5.1508450e+01 13969:1.1864651e+01 13984:-1.6172890e+02 13991:7.2319870e+01 14006:1.1226241e+02 14040:-2.5487202e+01 14049:-1.0384438e+00 14052:-7.4646835e+01 14058:4.0271423e+01 14062:-4.1556347e+01 14083:4.6118198e+01 14092:-8.7803871e+01 14094:-5.9766785e+01 14117:1.0685455e+01 14118:-7.3690910e+00 14123:-6.9610977e+01 14163:8.5742172e+01 14167:1.6526582e+02 14173:-1.1325525e+02 14174:-3.4626419e+01 14187:-2.2488592e+00 14197:2.4679620e+00 14223:-1.0526658e+01 14278:1.3331645e+02 14291:3.6822250e+00 14292:3.2110714e+01 14296:-1.2694633e+02 14308:1.0793807e+01 14313:-8.7313774e+01 14317:1.9364951e+00 14326:-1.8529860e+02 14376:-8.2232529e+01 14387:-6.8386750e+00 14396:-1.6331833e+02 14423:2.1117811e+00 14424:-1.2668558e+02 14430:-3.0565704e+02 14435:2.9715757e+01 14441:8.1956110e+00 14463:1.0869225e+02 14466:-9.6319208e+00 14470:2.5520477e+01 14518:-1.4537252e+01 14523:9.0272186e+01 14576:5.4091573e+00 14588:5.3161712e+00 14621:-1.1478651e+02 14674:-7.4689651e+00 14683:-4.1556347e+01 14691:9.8472237e+01 14747:-1.4537252e+01 14775:-9.6623940e+01 
14799:1.6414478e+02 14803:-9.3636444e+01 14805:2.6981489e+01 14823:3.1839367e+01 14844:-6.5088425e+01 14846:-5.5476101e+01 14854:2.5520477e+01 14861:-2.1458652e+01 14889:-1.0362775e+01 14892:3.1700230e+00 14910:-3.4689407e+01 14925:1.0793807e+01 14957:-3.4123978e+01 14959:4.7573200e+01 14965:-3.5696285e+01 15007:-1.2092074e+01 15009:-5.9523015e+00 15014:-7.2863660e+00 15100:3.7091448e+00 15102:1.1175143e+01 15107:-4.1719559e+01 15113:-2.6870892e+00 15122:5.1621204e+01 15130:-4.8556625e+01 15132:-1.4876675e+02 15139:2.9541386e+01 15152:1.5604091e+02 15188:-1.6110566e+00 15227:-4.4174480e+00 15252:7.9477722e+01 15264:8.5215324e+01 15265:-1.0231249e+00 15286:-5.9766785e+01 15300:2.3322140e+01 15342:8.4808182e+01 15345:-9.9806623e+00 15348:6.8029373e+01 15353:2.7945242e+02 15359:-3.0804034e+02 15402:8.0619331e+01 15411:-3.4757214e+01 15430:-2.7826408e+01 15482:5.1641598e+01 15490:-9.9806623e+00 15498:8.1937599e+01 15510:-7.7122536e+01 15526:2.4129581e+02 15576:3.0118001e+00 15589:-1.6190125e+00 15598:2.0896330e+01 15662:-1.1632268e+02 15701:3.3914945e+00 15703:-4.7596230e+01 15711:6.3197392e-01 15737:-4.6562599e+01 15739:1.5088092e+02 15751:-8.4263329e+01 15773:1.0869225e+02 15775:-5.0094734e+01 15781:-2.1458652e+01 15789:-4.4172158e+00 15806:-1.8612957e+01 15807:-1.0362775e+01 15818:-1.8529860e+02 15820:1.2059388e+01 15828:-4.1163483e+01 15848:-1.4871371e+00 15871:-4.4842747e+01 15872:3.6658928e+01 15910:-8.6631308e+00 15923:1.7133567e+00 15931:2.2193642e+01 15945:1.6352751e+02 15949:6.8273186e+01 15965:2.4077680e+00 15969:-5.5159649e+01 15983:-2.1066098e+02 16016:-4.6355352e+00 16018:-3.7312720e+02 16021:1.0150909e+02 16048:-3.1836771e+01 16051:4.8161917e+00 16069:8.5550613e+01 16088:1.0602772e+02 16092:1.0110319e+02 16098:3.7920661e+00 16103:-8.3919807e+01 16105:-3.1301073e+01 16118:1.7366637e+00 16158:-5.1722870e+01 16191:1.3222687e+00 16195:9.7939520e+00 16209:1.6877335e+02 16243:1.3331202e+01 16262:2.5047557e+02 16265:-8.5170990e+01 16267:-5.0046478e+01 
16293:6.4322891e+00 16313:7.3501124e+00\n'
diff -r 215925e588c4 -r 20429f4c1b95 test-data/test.params
--- a/test-data/test.params Fri May 25 12:17:44 2018 -0400
+++ b/test-data/test.params Wed Jan 22 10:14:41 2020 -0500
@@ -1,19 +1,18 @@
 epochs: 20
-lambda: 0.001
+lambda: 0.01
 R: 1
-D: 4
+D: 3
 bitsize: 14
 model_type: sequence
-#ADDITIONAL MODEL PARAMETERS
-ap_extlr: 5
-gev_my: -2.5408
-gev_sigma: 1.6444
-gev_xi: -0.1383
-p50_score: 6.51534 
-p50_p_val: 0.0009059744 
-# ADDITIONAL MODEL INFO
-organism: hsa
-rbp_ens_gene_id: ENSG00000011304
-rbp_gene_name: PTBP1
-clip_method: eCLIP
-pubmed_id: 27018577
+pos_train_ws_pred_median: 0.760321
+pos_train_profile_median: 5.039610
+pos_train_avg_profile_median_1: 4.236340
+pos_train_avg_profile_median_2: 3.868431
+pos_train_avg_profile_median_3: 3.331277
+pos_train_avg_profile_median_4: 2.998667
+pos_train_avg_profile_median_5: 2.829782
+pos_train_avg_profile_median_6: 2.626623
+pos_train_avg_profile_median_7: 2.447083
+pos_train_avg_profile_median_8: 2.349919
+pos_train_avg_profile_median_9: 2.239829
+pos_train_avg_profile_median_10: 2.161676
diff -r 215925e588c4 -r 20429f4c1b95 test-data/test.peaks.bed
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/test.peaks.bed Wed Jan 22 10:14:41 2020 -0500
@@ -0,0 +1,2 @@
+s1 5 8 s1,6 6.500000 +
+s2 3 6 s2,4 4.000000 +
diff -r 215925e588c4 -r 20429f4c1b95 test-data/test.peaks_genomic.bed
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/test.peaks_genomic.bed Wed Jan 22 10:14:41 2020 -0500
@@ -0,0 +1,2 @@
+chr1 1000 2000 s1 0 +
+chr2 1000 2000 s2 0 -
diff -r 215925e588c4 -r 20429f4c1b95 test-data/test.predictions
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/test.predictions Wed Jan 22 10:14:41 2020 -0500
@@ -0,0 +1,3 @@
+SERBP1_K562_rep01_2175 1 0.0193945
+SERBP1_K562_rep01_544 1 0.571673
+SERBP1_K562_rep01_316 1 0.978629
diff -r 215925e588c4 -r 20429f4c1b95 test-data/test.profile
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/test.profile Wed Jan 22 10:14:41 2020 -0500
@@ -0,0 +1,9 @@
+0 0 -2.1
+0 1 1.9
+0 2 1.1
+1 0 -2.1
+1 1 3.2
+1 2 1.1
+2 0 -2.1
+2 1 4.1
+2 2 1.1
diff -r 215925e588c4 -r 20429f4c1b95 test-data/test.sequence_motif
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/test.sequence_motif Wed Jan 22 10:14:41 2020 -0500
@@ -0,0 +1,437 @@
+ACACCCCAGCAG
+CGCCGCCGCCGC
+GGAGGCUGCCCG
+GCCCCGGAUUGC
+CGCCAACCGCCG
+GCCGAGCUGCAC
+GCCGAGCCAGAG
+GCGCAGGGGCCC
+GAGCCACCUGGA
+CCUCGACGGGAG
+CGCAGGGACCCC
+GCCCGGGCCGCC
+CCGCCGCGCGCC
+CCUCCGCUCCCG
+ACCAGCGCCAGG
+CGCCCGUGCCAG
+GCCGCCGCCGCC
+GCCGCCCUCCGC
+CGCCUGCCCGCC
+CCAGCAGCAGCC
+CGAGCGCUGCCG
+CCGCCGCUGCGC
+CAGCGGCCCCGC
+CAGGGCUGCUGG
+GAACCGUGCUGC
+GCGGCCGCCGCG
+CGCGGCGCUCGC
+CUGGCCAGGGCC
+AGCCAGGCUGGA
+CGGGGCCGGGGC
+CGGCCAGCGGGC
+GGACGGCAGGCC
+CGGCCGCCAGAA
+GCGCCUCCCGCG
+CCUGCCGUCGCG
+GCUGCCACAGUU
+GGCGGAUGCCCC
+CCAGCCAUCCCC
+GGGCCCCCACCG
+CUGGACGCGUCC
+CCGGCGUUCAAG
+GAGCAGGAGCGU
+CGAGAGCGGCCC
+CCGAAACCGAAC
+CCACUCCCCGCC
+CGGCGGCGGCGC
+GAGGGGAAGCUU
+CGCAUGCGGAGG
+CGCGGAGGAGCC
+CCUGGCGCGAGA
+CCGAGCAGCCGC
+GCCCUGGAGGGC
+CGGCGCCCCGCC
+CCGCACAGCAGC
+GCGGCGACGACC
+GCCCCUUGCCGA
+GGAUCCACGGGA
+GGCAGGGCCGGG
+GGAGCCGCCACC
+CGAGCCCUGCCC
+GAGGCCGGACCC
+CCGGAGCUGCCG
+GCAGCCGAGAGC
+AGCCAGCAGCUG
+ACGGCUGGAAGC
+GCCAGAGCCGAG
+GGAGCAGGGCAC
+CGCCUCGGGGCC
+CCCCGCGCCUCG
+GGCGGCGGCGGC
+GGCGGCGGCGGG
+ACCGCCCCUGCC
+CCCUGGCCGCCC
+GCCGGAAGACCC
+GCCGGCGUGCCC
+CGCCGCCGCCGC
+GCAGGCCCAGCG
+GAGCGGUGGCCA
+GCCUGGACCCCC
+CCGUCCCGCUCC
+CCUACAAGCCCG
+CUGCGCCGACCG
+CCGCCGCCGCCG
+CUGCGCCCCCGG
+CCCGCAGGAGAA
+GCGCUGCAGACC
+GCGGGGCAAGCC
+GCGGCGCGGACG
+CGCCAUGACCCG
+GCCGUCAUGUCU
+GGGGGCCGCCGC
+CGCAUGCAGCAC
+GGCUGGGCCGCA
+GCCCUUCCUGUG
+GCCGCCCCCGCC
+GCGCCGCCCCGC
+CGGGAGCCGCGG
+GCUGCCGCCGCC
+CGCAGCGGCACC
+CGCCCGCCACCA
+GCCAGCCCCAGC
+AGAUCCGGCUCA
+CGGUCACCGCCC
+CGCGCCGCCGCC
+UACCACCAGUUG
+GAACCCUGCGCC
+CCAGCGGCGGCA
+CCCCCGGUGUGG
+GCGGAGCGGGCG
+GGCCCGGCGAGC
+GGAGGCGGCGGC
+GCCAGCCGGACG
+CACCCCACCCCG
+CCCCCCAGCCCC
+GCAGGCCGCACC
+CCCGCCCGCCGC
+CGGAGCCGUGGG
+CGGGGGUGCCCG
+GCGCCAGGCUGC
+CAGCCGCCCGCC
+CCUGAGCGGCAG
+GCCAACCUGCAC
+GCCGCAGCCGCC
+CCCCGGAGCCCC
+CCCUGGCCCAGC
+CCGGAGCCCCGC
+ACCGCCGCCGCC
+GCUGGCAGGCGC
+CCGAGGGCCGCC
+CAGCCGAGAAGC
+AGGCUCCAACUC
+GCGCCGGGCCCG
+ACCCCCGCGCCC
+GCGGGACGGCAG
+CCGCCGCCGCCC
+GACCCGGGAGCC
+CCGGCCCCGCCG
+GCACGGGCGGCC
+GCCGCCCCGGCG
+CCCGCCGGCCGU
+CGCCAGCAAGCC
+GCGCCUGCCCGG
+CCGCUGCGGCCG
+GCCGCGGCGCGC
+CAGAGCCAGAGC
+GGAGGACGGGGA
+AGCAGGCCCAGC
+CGCUGCCAUGAA
+CCCCCCUCCCAG
+GACCCCUCAGCC
+CCGCCAGCAUCC
+GCCUGGCGGGCA
+AGGCGGCGGCGG
+GAGGAGAUGACC
+ACGGGCGCCGGC
+CGGCGCGCCCGG
+ACCCCCCGCCCC
+GAGCCAACGGGC
+ACACCAGCCCCC
+CGAGAAGCCCGG
+CGGAGCCCGUCC
+CGAGCAGCUGCG
+CGCGGCGCCGCC
+CCUGGGCCAGCC
+CCGCGCCCCGCC
+GCGCCCCACGCC
+GCCAGAGAAAGG
+CCUGGCCGCAGC
+UACCGGGCCUGC
+CGCCUCCGCCAG
+GGCCCCCGCCGC
+CCUCCAGCUGGC
+CCCCGCCGCCUC
+CGCCGCCGGAGC
+GCGCCGAGCACC
+ACCGGCAGGCCG
+GGAGCCGCCGCC
+CGGCAGCCGCCG
+GCCGCCCCGGCC
+CGGGAAGCCCAC
+CCCCGCAAGCCA
+CCCGAGGCCCCU
+CAGCUCCCGGAC
+AGGAGGCUGCUG
+GCCCCCGGCCUG
+GCGGGGACCUGG
+GAGGCUGGCAAG
+GGCAGACCGAGA
+GGGUGCCAGACC
+CGCCGCCCGGCG
+GCAGAAGCGGGG
+AGGGCGGCGUCG
+GGCGGCCCUGGC
+GCCGCCGGCGGC
+AGCGCUGCAAAC
+CCAUCAGGACGA
+CGCUGCACGGCC
+UGCGGGAGCCGC
+GCAGCACCGAGG
+GACCUGGAGGCC
+GCGCCGUCGGCC
+CCACCCGCCUGG
+GGCCCGGGUGCA
+AGCGGGGCCGUG
+CCGGGAAGCCGC
+CGCCGCCUGCAA
+CCCGUCCGGAAC
+CCACGGGGAACC
+GAGAGGAGCGCG
+CCGCCACUGCCC
+GCCGGCACUGGC
+UCGGCCGGUUUU
+CGCCCUCGCCCG
+CGCCGGAGCCGC
+CCCCAACCAGCC
+AGCGGCUGGCCA
+CCGAGAAGCUAA
+CCGCCGCCGCCG
+GCCCCGCGGAGC
+CCCGCAGAGACC
+CCAAUGUGCCGG
+GCCGCCGCGCCG
+CGGCAGCGCCAG
+CAGCGGCCGCGG
+AAAGCCAGCGAG
+CCCGGCGCCUGC
+GGCGGCGGCGGC
+GAGGGCCCGUCC
+CGUGCCCAGAAC
+GACGCGGCGUUG
+CAAGAAGAGGAG
+GCCGCCGCAGCU
+GCACCCCUGGCC
+CGGAGGUGGCGC
+CCUCACCCAUGA
+CGGCGACGGCAC
+GGAGGGGCGGGC
+AGGGCCCCCCUA
+CGCGCCAGCGGC
+CGGGCCCGGGAG
+GGGCGACGGCCC
+CCAGGCGCUGGA
+GUGGCCGGGCCA
+GCCAAAGGAAAC
+GCCCGGGCGAUC
+CCAACACCGCCC
+UGCCAGCGCCGC
+GCCGCCCCCUGC
+UGCUGGCCGCUG
+CCCGCGCCCGCC
+CAGCCGCCCUGC
+GCCCUGCCGGGA
+CGCCGGACAGGU
+GCCGUGCCCUGC
+CGGCGGCCGGCG
+CGCUGCACCCUC
+CCGGGCCGCGGC
+CAGCGACCCCUA
+GGCUGCUGGCCG
+CGACAGGCCCGC
+CCACCAUCUCGG
+GCUGCCCGGGGA
+GCGCUCCACGCC
+CGGCGGCGCCGA
+GCCGCCGGGACC
+UCAGCAGGGGGA
+CGCCCCCCAGCC
+CCGCGCCGCCCG
+GUGGCCCCCGAG
+CGCGGCGCCCGG
+GCCGCAGGGCCG
+CCGCCCUCGGCC
+CGACCCCGCCCC
+GCCGCCACCGCA
+GACGGCCCCCGC
+CGGCGCCAAGUU
+ACCAGAUGGUGA
+CGGAGGGCCUGC
+ACCUGCUGAUGA
+GCGCCGCGCGCC
+GGAGGAGCAGAU
+ACCCCUGCAGCC
+GCCGGCUUACCA
+CAACGCCCUGUG
+GCCGACGCCGCG
+GCCGCCUGGACA
+CAAGGACGAGCC
+GCUGGAGCCCCA
+CGCGGCGGCGGC
+AUCGCCGAAGCU
+CCCCGUUCACAA
+ACCUCCCAGGGC
+GCCGGCCACCUG
+ACCGGGGAGGGC
+CUGCCGCCCCCG
+CCGUGGCGGCUU
+CCGGCCCGGCGC
+CGGCGGCGGCGC
+CAAGAAGAACCC
+GGAGACGGCGGC
+CCACCAAAGGAG
+GCUCCGAGAAGC
+GAGACCCGACGC
+GCAGCCACCGCC
+GCAGCCCCCCGU
+ACCAGAGCUGCC
+CCGCUGCGCGGC
+GCCGCCGCCGCC
+AGGAGGCGACCG
+CGGACCGCGCGG
+GCCGGGGCUUCU
+CCCCGGGGAGGA
+GCCACCCCCGGG
+GCCCCAGCAGCA
+CAGCCCCCAAGA
+CCGCCGCAACCA
+CUGCCCCUGCCC
+GCUUGCCCAGCA
+GCCCGGCGCCGC
+CGGCCCCAGCCG
+GCUCCUGAAGCC
+CCGCUGCGGCCG
+GCAUCAAGGAGC
+GCGAGCAGAAGG
+CCAGAACCAGGG
+ACCAACAACCGC
+GCCCCCCGACCA
+CAGCCGCCCCGC
+UGCUGGCCGGGG
+AGCACCUCCGCC
+CAACUGGCAGCU
+CGCCACCGCCGC
+CCCGGGCGGAGC
+CCAACCGGCCGC
+GACCGGCGCAGC
+GGCCCUGGCCAC
+GACCCCUGACGG
+CUGCUGGCAGGC
+CUGCCGCCCAAC
+GCGGCGGCGGCG
+CGCCCAGCACGC
+GCCCAGGCCCCG
+GCGCAGCCCGGC
+AGGCGUCGGGCG
+CCACCGCCACCG
+CUGGCAAGCAGC
+GGCGGCGGCGGC
+GCCCCCUCGGCC
+GCCCCAGCCCCC
+GCCAUGCACGCC
+GGACGAGCGCGA
+CCGUCGUCGCCG
+GACCACUCAGCC
+CACGCCAGGCCA
+CGGGACGCGGGC
+GGCAGGAGCGCG
+CCCAGCAAGUGC
+CGUGGCGUGGCC
+GCCGGCCCCCGG
+GAGCAGCGGAAG
+GCCCGCCUGUCC
+GGCCCGACGGGA
+CCCGGACUCUCC
+GCCCCCAUCCCC
+AAGAACUGGGAG
+CGCCGAGCUGUC
+GAGCAAGCCAUU
+GGAAGCUGGGCC
+CUGGAGCCCAUG
+CGCCCAAGAAGC
+GGAGCUGGACCC
+CCGCGCCGGCGC
+GUCCCAGCAGAC
+GGCGAGCCUGGC
+CAGCUGCGGGCG
+GCAGCAUGCUGC
+CCAGCCCGCCCC
+CGCCGCCGCCGC
+CCGCCGACCCCC
+GCCCCAGCCCAG
+GCGGCGGAGGCC
+CGGGCCCGGGCG
+CGCCGCCUCCGC
+GCGCGGCGGCCG
+CUCCCCCAACUA
+ACGAGCGCUGCA
+CGGCGCCGCGCG
+GGCCGCGCAGCA
+GGCCCCGCGGGC
+CGCCCACCACGG
+GGGCGGCCAGGC
+CCUGGAAGCCCC
+CCGCCAUCAGAA
+GACAGCGCCACA
+CGAGCGCGCCGC
+CCCAGGGCCCCA
+CCGCAGGGGCCC
+GCGGGCGGCGGC
+CCCGUGCCAGAG
+CCUGCCGCCGCG
+ACAGACAAGCCC
+GCAGCACAGGCC
+CCAGCUCUGCCC
+GCCAAGGAGGCG
+GGCAAGCCCACC
+GCGGCGGGCCAG
+CCUGGCCCCCAC
+GCGCCCGGCGAG
+CCACCCUAUUCC
+CGCGCCCCGCGU
+CGCCCCCGCCCG
+UGACCAACCGAC
+CCCGGCGGCCGC
+GGCCACGCGGGC
+CCGCGCCGCCGC
+CAGCGCCGCACC
+GCCGCCAGAACA
+CGACCACCACCA
+GGGCACUGGACC
+CCCCAGCCCCGC
+CCACCAAGGCAC
+CCGAGGCGGCGC
+CCGAGCGAUGGG
+GCCCCAGGACAC
+GCGGCGGCCCCG
+CCGCCGCCGCCG
+GAAGGGCCGGCG
+GCUGCCCCUGGC
+CGCCUCGGAGGC
+CCCAUCCCCACC
+GAAGACCGGCGA
+GCAGACCCCUGC
+GCGGAAGGUGGA
+GGUGCGGGAGGC
+CGAGCAGUGCUA
+GACGCCAGCGGC
+GGGCGCCGCCGC
diff -r 215925e588c4 -r 20429f4c1b95 test-data/test1.bed
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/test1.bed Wed Jan 22 10:14:41 2020 -0500
@@ -0,0 +1,7 @@
+chr1 1000 2500 CLIP1 1 +
+chr1 3000 4000 CLIP2 0 +
+chr1 5000 6000 CLIP3 2 +
+chr1 7000 8000 CLIP4 3 +
+chr1 9000 10000 CLIP5 1 +
+chr1 11000 12000 CLIP6 0 +
+chr1 13000 13500 CLIP7 3 +
diff -r 215925e588c4 -r 20429f4c1b95 test-data/test2.bed
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/test2.bed Wed Jan 22 10:14:41 2020 -0500
@@ -0,0 +1,4 @@
+chr1 1000 2000 CLIP1 0 +
+chr1 3000 4000 CLIP2 0 +
+chr1 3000 4000 CLIP2 0 +
+chr1 7000 8000 CLIP3 0 +
diff -r 215925e588c4 -r 20429f4c1b95 test-data/test2.profile
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/test2.profile Wed Jan 22 10:14:41 2020 -0500
@@ -0,0 +1,19 @@
+0 0 2
+0 1 3
+0 2 5
+0 3 8
+0 4 4
+0 5 3
+0 6 7
+0 7 1
+1 0 3
+2 0 2
+2 1 3
+2 2 5
+2 3 8
+2 4 4
+2 5 3
+2 6 7
+2 7 1
+3 0 2
+3 1 4
diff -r 215925e588c4 -r 20429f4c1b95 test-data/test2_1.avg_profile
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/test2_1.avg_profile Wed Jan 22 10:14:41 2020 -0500
@@ -0,0 +1,19 @@
+0 1 3.333333
+0 2 4.500000
+0 3 4.400000
+0 4 4.600000
+0 5 5.400000
+0 6 4.600000
+0 7 3.750000
+0 8 3.666667
+1 1 3.000000
+2 1 3.333333
+2 2 4.500000
+2 3 4.400000
+2 4 4.600000
+2 5 5.400000
+2 6 4.600000
+2 7 3.750000
+2 8 3.666667
+3 1 3.000000
+3 2 3.000000
diff -r 215925e588c4 -r 20429f4c1b95 test-data/test2_2.avg_profile
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/test2_2.avg_profile Wed Jan 22 10:14:41 2020 -0500
@@ -0,0 +1,19 @@
+0 1 3.333333
+0 2 4.500000
+0 3 4.400000
+0 4 4.600000
+0 5 5.400000
+0 6 4.600000
+0 7 3.750000
+0 8 3.666667
+1 1 3.000000
+2 1 3.333333
+2 2 4.500000
+2 3 4.400000
+2 4 4.600000
+2 5 5.400000
+2 6 4.600000
+2 7 3.750000
+2 8 3.666667
+3 1 3.000000
+3 2 3.000000
diff -r 215925e588c4 -r 20429f4c1b95 test-data/test2_3.avg_profile
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/test2_3.avg_profile Wed Jan 22 10:14:41 2020 -0500
@@ -0,0 +1,19 @@
+0 0 2
+0 1 3
+0 2 5
+0 3 8
+0 4 4
+0 5 3
+0 6 7
+0 7 1
+1 0 3
+2 0 2
+2 1 3
+2 2 5
+2 3 8
+2 4 4
+2 5 3
+2 6 7
+2 7 1
+3 0 2
+3 1 4
diff -r 215925e588c4 -r 20429f4c1b95 test-data/test3.fa
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/test3.fa Wed Jan 22 10:14:41 2020 -0500
@@ -0,0 +1,6 @@
+>SERBP1_K562_rep01_544
+CAGGTCCCTCCTCCGGCATGTGTCAGCGCAACCCCCAGGTCTGCGGCCCAG
+>SERBP1_K562_rep02_709
+CTCCAAGGGCGGGCTTGTGCTCTGGTGCTTC
+>SERBP1_K562_rep01_316
+GGCTCCGCTGTACAAGAACGTGGATGTGCGAGGTATCCAGG
diff -r 215925e588c4 -r 20429f4c1b95 test-data/test3_added_ids_exp.avg_profile
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/test3_added_ids_exp.avg_profile Wed Jan 22 10:14:41 2020 -0500
@@ -0,0 +1,19 @@
+s1 1 3.333333
+s1 2 4.500000
+s1 3 4.400000
+s1 4 4.600000
+s1 5 5.400000
+s1 6 4.600000
+s1 7 3.750000
+s1 8 3.666667
+s2 1 3.000000
+s3 1 3.333333
+s3 2 4.500000
+s3 3 4.400000
+s3 4 4.600000
+s3 5 5.400000
+s3 6 4.600000
+s3 7 3.750000
+s3 8 3.666667
+s4 1 3.000000
+s4 2 3.000000
diff -r 215925e588c4 -r 20429f4c1b95 test-data/test3_added_ids_out.avg_profile
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/test3_added_ids_out.avg_profile Wed Jan 22 10:14:41 2020 -0500
@@ -0,0 +1,19 @@
+s1 1 3.333333
+s1 2 4.500000
+s1 3 4.400000
+s1 4 4.600000
+s1 5 5.400000
+s1 6 4.600000
+s1 7 3.750000
+s1 8 3.666667
+s2 1 3.000000
+s3 1 3.333333
+s3 2 4.500000
+s3 3 4.400000
+s3 4 4.600000
+s3 5 5.400000
+s3 6 4.600000
+s3 7 3.750000
+s3 8 3.666667
+s4 1 3.000000
+s4 2 3.000000
diff -r 215925e588c4 -r 20429f4c1b95 test-data/test4.avg_profile
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/test4.avg_profile Wed Jan 22 10:14:41 2020 -0500
@@ -0,0 +1,18 @@
+s1 1 -1
+s1 2 0
+s1 3 2
+s1 4 4.5
+s1 5 1
+s1 6 -1
+s1 7 5
+s1 8 6.5
+s2 1 -1
+s3 1 -1
+s3 2 0
+s3 3 2
+s3 4 4.5
+s3 5 1
+s3 6 -1
+s3 7 5
+s3 8 6.5
+s4 1 4
diff -r 215925e588c4 -r 20429f4c1b95 test-data/test4.bed
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/test4.bed Wed Jan 22 10:14:41 2020 -0500
@@ -0,0 +1,2 @@
+chr1 10 20 CLIP1 0 +
+chr1 30 40 CLIP2 0 +
diff -r 215925e588c4 -r 20429f4c1b95 test-data/test4_out.peaks.bed
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/test4_out.peaks.bed Wed Jan 22 10:14:41 2020 -0500
@@ -0,0 +1,3 @@
+s1 1 8 s1,8 6.500000 +
+s3 1 8 s3,8 6.500000 +
+s4 0 1 s4,1 4.000000 +
diff -r 215925e588c4 -r 20429f4c1b95 test-data/test4_out_exp.peaks.bed
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/test4_out_exp.peaks.bed Wed Jan 22 10:14:41 2020 -0500
@@ -0,0 +1,5 @@
+s1 1 5 s1,4 4.500000 +
+s1 6 8 s1,8 6.500000 +
+s3 1 5 s3,4 4.500000 +
+s3 6 8 s3,8 6.500000 +
+s4 0 1 s4,1 4.000000 +
diff -r 215925e588c4 -r 20429f4c1b95 test-data/test4_out_exp2.peaks.bed
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/test4_out_exp2.peaks.bed Wed Jan 22 10:14:41 2020 -0500
@@ -0,0 +1,3 @@
+s1 1 8 s1,8 6.500000 +
+s3 1 8 s3,8 6.500000 +
+s4 0 1 s4,1 4.000000 +
diff -r 215925e588c4 -r 20429f4c1b95 test-data/test_exp.peaks.bed
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/test_exp.peaks.bed Wed Jan 22 10:14:41 2020 -0500
@@ -0,0 +1,2 @@
+chr1 1005 1008 s1,1006 6.500000 +
+chr2 1994 1997 s2,1997 4.000000 -
diff -r 215925e588c4 -r 20429f4c1b95 test-data/test_negatives.parop.fa
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/test_negatives.parop.fa Wed Jan 22 10:14:41 2020 -0500
@@ -0,0 +1,50 @@
+>random_negative_9901
+GCTAGCTATAATTTGCCCTCTGAGGATTGTAATACCGAGAAACAGATGGTTTCTCACAGTA
+>random_negative_19543
+ACCTCAGCCTCCCAAAGTGTTAGGATTTACAGGCATGAGCCACTATGCCTGGTCATTTTAT
+>random_negative_3739
+TGCAGGCGTACATCTTCCTTCTGCAGTGTCCTTCCAGTGCCTTTTACAGAGAAAGGTTAAT
+>random_negative_21923
+TGAGGCACCCGGCTTCCGGTGCCTCTGGCAGCCCAGAGAAGTGACTCTGTCTGTCTTAAGG
+>random_negative_134
+CAAAGTGCTGGGATTACAGGCGTGAGCCACTGCGCCTGGCCTGTTTTTGGTCTTTTCTGTT
+>random_negative_5629
+TCTGTGACCCAGGCTGGAGTGCAGCAGCACAATCTTGGCTCACTGCAACCTCCGCCTCCCA
+>random_negative_18032
+GTGCAGCTTCCAGACAAGTCTCTTAGACCTCATCCTCAGTCTCACCCTGACTGACAAGTTT
+>random_negative_17947
+CGTCATTTACATTAGGTATATCTCCTAATGCTATCCCTCCCCCAGCGCCCCACCCCTCGAC
+>random_negative_4666
+GAGGTGCGCCTGTGTGGAGATCTGAGGGTGACGTAAGTAGAGGCACTGGCGTGAGGGCCCC
+>random_negative_3271
+TCTTTGTAGGGAGCAGGGAGAGTGATCTCAGCTGGTCTCTGGGATTGGGAAAAAGTGTGAT
+>random_negative_10185
+CCTGCATGCCTGCCTTCTGCCCCTGCCGAGGCTGGGTCCTAGGGGAAGCACTTACTTCCCT
+>random_negative_15116
+CTACTGATGTGGAAAGTGGCATCTATGGGAAGGTTATCTCCTAACTGGCAGAGCAAGGTCA
+>random_negative_19846
+GTCTGAGGGGAGAGGTGCATTTCATAGACCCCCATAGTGCTCCTTATGTCGGGAGCTTACA
+>random_negative_14690
+TTCCTCGCCTGGCGTGTCCGCTCCCTCCCTCCCTCTCTGCTCTGGTCGCGCCCGCCCACTT
+>random_negative_4674
+TTGGCCCCTTGGTTTTTCCCCAGGTAGCTGTGGTAGTGCAGGGGCAGCTCCCTGCCCCTTC
+>random_negative_18030
+CCTGGTAGTGCCAGCTGGTGTCTTGCCAGCCAGAGCACAGCCCCGCTTGGTCAGGAATGCA
+>random_negative_3289
+TTTTTCAAAGTCAAAAAAAGAAATGTGACAAAGTGCTCTGGAAATTTACTGGAATTCTTGA
+>random_negative_6275
+TAACCTTTCTGTGCTATTTCCCTATCTGTAAAAAGAAGAAGAGATGGTACCGATCTCCCAG
+>random_negative_11324
+ACTCCATTAACTCCACCCACCTCCTGCACCCCTCCCCACACACACAAAATGAACCACGTTC
+>random_negative_8258
+AACCCGGGAGGCAGAGGTTGCAGTGAGCCAAGGTCACACCACTGCACTCCAGCCTGGCCAA
+>random_negative_16662
+TCAGGCTACATTAGAACACTTGCTGTTTCCCCAATATATCATGTGCTCTTGGGCCTCTGTG
+>random_negative_6703
+TCTTGGGAAACCAAGAGCAAGGAAATGCCAGCAGATCTCATGAAAAAACTGCAACCAGGGT
+>random_negative_10199
+CACCTGGGGCTGCGACTGGGACAGTGTCCCGTGTGCGGTCCCTCCTGCGGGTGACACAGGA
+>random_negative_13581
+ATGTTATGTGTAAAGGCGGCGTATCCTTTGGGAGGCTGAGGTGGGAGGATCGCTTGAAGTT
+>random_negative_17232
+GATGGGTGGGTAGATAGCTGGTGGATAAGTGGATGGATGGGAGGATAACCAGGTGGATGTA
diff -r 215925e588c4 -r 20429f4c1b95 test-data/test_negatives.train.fa
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/test_negatives.train.fa Wed Jan 22 10:14:41 2020 -0500
@@ -0,0 +1,1000 @@
+>random_negative_21583
+CCCAGGAATTTCATTGTGCATATCCCTCAATTCAGCACATACTAGAAGTGGGAATACAGCC
+>random_negative_22306
+CGTGTGGAAGAACAAAGAGCTAACTTTTCAAAGCAGGTTTCTCCAATGGAGGGTGGGGTTG
+>random_negative_7272
+ACAGAAGGGGTCCATGTAGAATTTCGTTTCTTCCTGGAAGTCCCAAGGAAACCTGCAAGTA
+>random_negative_16884
+GTACACCGTTCTTCCTGTTCTAATAAATAGCTTTCAGAGTCATACCTCAGCATCTTGTCTT
+>random_negative_20197
+TGCTGAGCGGCTTGGCCTGCAGCGAGGCCCCCTCCCACCTGCCATGCTCTCTTGCTGGCCG
+>random_negative_10981
+ATATAAAAATGAAATTAAGAAAACAATTCCTGGCTGGGTGTGGTGGTGAGTGGGTCACCTG
+>random_negative_15764
+TAGGGGCTGTGCTCTTGGATCTGGACCAGCAGACCCTGCCCGGAGTGGCCCACCAGGTGGT
+>random_negative_15445
+AGGGAAGGCAGTTTGATGCCTGGAGAGGGACCCGTGGCTTCCAGCTGCCAGGCCAGCCCCA
+>random_negative_11194
+CTCCCATGTGACCCCTGCATTCCCTCTCCTGCTGTGCCTTTGGAGGGCCCCCATCTCTCAT
+>random_negative_20933
+TCCCCCAAAATTGAAATGATGCTTCTTTCTACTCTCAGCCACTACACCATCCCCCCCGAAT
+>random_negative_6890
+AAAGAAAATGAGAGGTGGAATGTTATTTCTAAATCACAGTTGAAGAACATTAAAAAGATGT
+>random_negative_7130
+CTGCAACCTCCGCCTCCCCGGTTCATGCGCCTCGGCCTCCCAAAATGCTGGGATTGCGGAT
+>random_negative_20989
+GCTGTAGTCCAAGGGGAAGGGTCCCTCTCGCTTCACCCTGTCCTGGGTCTCCCCAGGTGAG
+>random_negative_9619
+AGGTGTGTGCCACCACATTTGGCTAATTTTTATATTTTTAGTAGAGATGGGGTTTTACCAT
+>random_negative_20105
+CAGGCATGCGCCACCATGCCCGGCTAATTTTGTATTTTTAGTAGAGACGGGGTTTCTCCAT
+>random_negative_7706
+ACAGGCATGTGCCACCACACCTGGCTAATTTTTTTGTATTTTTAGTATAGACAGGGTTTCA
+>random_negative_3555
+GATAACATAATCACCAATTTGAGATTCTGTGTTCCTTTTGCTTTTTGTATTCTTTATGGAA
+>random_negative_6700
+GCCCCCCAGAAAGCATGCAATGTCCCTTAAAGTACACTGTCCCTTCTACTCACAAAGCAAG
+>random_negative_6780
+CTCGACCCGTCCTTCACTCACGCCATGCAGCTGCTGACGGCAGGTAAGGGGGCCTCCCGTG
+>random_negative_22232
+GCCGGGCCGCGGGAGGGCGGAGGAGTTGGCGCGCCGAGCGCTCGGCCCGGGGAGCGGTTTT
+>random_negative_14067
+TTGGGGAATTGTGGTCCCTAAATAGGAGGGTTTGGGGGCCCCGGGCCCCCGCGAAGGATCC
+>random_negative_9526
+GGTTTAAGCAGTGTGGCATTGGCTTATTTTGGCTAACCCTAGACTTCCTAGATTTTACAAG
+>random_negative_18033
+GTCCTCACTGCTACTCTCCCACCAGCACTATGACTGTTTAGAAATGCCATGGCAACATCAG
+>random_negative_19284
+TGGAAAAGGGGACACGAAGGCCAGTGGCTTCCTGTGTCGCTGGGCAGCAGGCCGCCTCCAG
+>random_negative_7687
+ATGACTTTTTTCATAGCAAAGATGTAGATCAGATATGAAATTAGGATAAAACAAGTTTTTT
+>random_negative_6511
+AAGGGGCCACCAGGCTGGGCTCTTCTGGGGATGATGTGGAAGCTTGCTGGCTGGGCCGGGG
+>random_negative_3242
+ACAGTACAGCAAATTATGAAGTTCCTTTCAAGTCAGAAGTTTGTTCCTGGGCCCTTAAGAC
+>random_negative_19073
+TTGGAATATATTTTGATCTGTCATCTAAGTATGAATTTGGCTGTAACTCTGCTGTTAAATG
+>random_negative_16680
+TTACATAGTGGTAAAGGGATCAACACAACAAGAAGAGCTAACTGTCCTAAATATATATGCA
+>random_negative_22939
+GGGTATGGGGTTTCTTTTTGCAGTGACAGAATGTTCTAAAATTAGGTAGGGGTGCTGGTTA
+>random_negative_21512
+TGGCACAAATGGAAGGGAGGGAGCTAGAAGTCCTCGGTGGCATGGAAGGGCCCGGAGGCTC
+>random_negative_3569
+CCACCTAGGCTGGAGTGCAGTGGCACAGTCATGGCTCACTGCAGCCTCAACTTCCCAGGCT
+>random_negative_5606
+ATGTAGCACAGTTTCTTTGTCCATTCATTTGTGGATGGATACTTAGGTCAATTCCGTATCT
+>random_negative_4927
+CTCATCAGTTTGTATTCAAACGCTTCCAGCCATGGAAAGTCTCAGGGCAGGGCCAGCAGGG
+>random_negative_16654
+ACATTTGGACTGGTTTCATGAAACATTAATATCTCATTAATTGTACAGTTATTATTTTAAC
+>random_negative_11557
+CAGGAAGCTCCAGGACTCCAAGGTCTCCAGGCTGAATGTCCCTGCATCCCAAGGATAGGCC
+>random_negative_1151
+CTCACTGCAAGCTCTGCCTCCTGGGTTCACGCCATTCTCCTGCCTCAGCCTCCCAAGTAGC
+>random_negative_729
+CATGGGCGTCCCTGCCTCCCAGAGCGTTAGCGATTCCTTTTCCCCCAATACGTGCTCTTCC
+>random_negative_395
+TGACAAGTTAGCCATTTAGAGAGCTGAAACATTCAAGTAGTTTTTAAACAAAAGCAATCAT
+>random_negative_7892
+ATACAAAATTAGCCGGGCATGGTGGCATATGCCTGTAATCCCAGCTATTTGGGAGACTGAG
+>random_negative_3722
+GAATTTATATAATTAACTTTCACAACAGCCGTTTCTGTTAGTCAGCAGTAAACTGAAGGGA
+>random_negative_12331
+TCAAGAGAGCAGAACTTGGTAGGCTCCTGGCCTTGGGTAGCTAAGGTTTGGAGGAGGTGGA
+>random_negative_18482
+CAGTAATGGTGGATGCCCCTTCCCTAGCCTCGCTGCCGCCTTGCAGTTTGATCTCAGACTT
+>random_negative_14968
+CTCAAAGTCATTACTGACCACACTTCAGATTTAGCCATGTATAATATTCACTGATATGTAG
+>random_negative_21401
+CTGCAAGCTCCGCCTCTCGGGTTCACGCCATTCTCCTGCCTCAGCCTCCCAAGTAGCTCGG
+>random_negative_3422
+TTTGAGCCCGAAGGACTCACAGAGGTCATCTACTCCAATTATTTTCTTTTCCCTCTGCAGA
+'..b'GACCCAAAAGTTGAA
+>random_negative_2880
+TTTCAAAGGGCATTGATTTTTTTCATAAAACTTTTTAAATTAAGATCTGTGGGCCTGGTGC
+>random_negative_22609
+CTGCCAGAAAGGGGCTTGTAGGAAGACACAAGGCTGTACCCTATCTCTCCAGGGGGCAGGC
+>random_negative_18254
+TTTATTTTAAGCATGATGGTAAGATGTCTTTGGGGTGTTTTATGGGGTGAAATGATATAAT
+>random_negative_16597
+GCATAATTTACACATTTCTAAATGCTCCCTCGGGGGAAAACAGGAAGAAATTGGTTGTAGA
+>random_negative_5413
+AAAAGTTACTCCTTTGAGGGAGTTTGGCTGCTTTTGAGTGGAGGTGACTTCAGGCTTATTC
+>random_negative_4412
+GTGTCCAAGACAAGGCATAGGTTTTTTGTTTGTTTGTTTTGTGACAGAGTCTCGCTCTGTC
+>random_negative_16412
+CTCGATCATCCCCTCTCAGGCCAGCCAGGGAGTCTCAGCTCCTGCCCAGGACCTGGCTGGA
+>random_negative_18599
+AGGGCTCCACCTTTGACCTCACTGAACCTTAATTACTCCTTAGAGGTCACGTCTCTAAATA
+>random_negative_15871
+AATGCCAAGCCCGCCCTGGTCCCAAGAAAAAACACACCACACACACAAACATTAAAAAAAA
+>random_negative_2755
+TTTCACAGATTGTCTTTTTTCTGTTTACCATGCCGTTGCTTTTCGCTGTCATCTTTATCAT
+>random_negative_1215
+ACGCAGCTCGTGACAGCAGGGAGTGGAACCCACCAGGCAGCCAAATGGGCCTCCAAGGAGA
+>random_negative_18664
+TAGTTAGGGGTATGAGCTCCCTTAGTGTGTGGGCCTCTCCTGTTCATCTTTTCGTTTTAAG
+>random_negative_8547
+AGTTGATGCAGCTGAGCTCTTTCCATCCTGTCCTGGGTTGCTGTGTGCATTGCATTCTCTC
+>random_negative_5845
+CCTCTACCTCATGGGTTCCAGGGATTCTCCTGCCTTAGCCTCCCGAGTAGCTGGGATTACA
+>random_negative_8769
+TATAGATTTGAAAATAAACTTAAAAGCCCAGGGTGAAAACGTGTCATTGCTCTCCGCAGGC
+>random_negative_13660
+CCAGCTCCATCCCTTCAGGCACCCTGGCCAAGTATCAGCTTCTCTAAGACTTGGTTTCCTC
+>random_negative_9916
+TCAGATTCTTTCAATGCTAGGTTTAGTATCGTTCCTTTTTCTCCCATTTGGTGAAGTGGTG
+>random_negative_21042
+AATTTAAAATAAAAGTTGAGGCCAGGTAGGGTGGCTCATGCCTGTAATCACAGCACTTTGA
+>random_negative_20689
+TGTAATGTGTAAAGGGGAGGGAGAATGTGAGGTGCTGGGAAGCACCGGGCCCTTCGGGGGA
+>random_negative_22534
+ATGGGGTTTCACTATGTTGGTCAGGCGCATCTCCAACTCCTGACCTCAGGTGTGATCCACC
+>random_negative_21620
+AAGGGCTTTTCAAGGTTGAATAGGAGTTTGCAATGAGGAAAAAAAGTCACATGATCTTTCT
+>random_negative_9401
+TAGACTCCCTTTTCCTAGTTGCCCACTCCCACCCCAGTCTGGTGAATGTTAGCCTTTTGGG
+>random_negative_19360
+AGCTTTGCTGAGAATAGACCACAGGAGAGCAAGAGTCGAAGGCAGAAGACAGGTCAGGAAG
+>random_negative_19844
+TCTTTGGTGTTTTCAGATACAAGCCTTTGGGGCAGCATCCCCTCTTGAGTATTTTTTCTGG
+>random_negative_1724
+GCCACTGCACTCCAGCCTGGGCGACAGAGCGAGACTCTGTCTCTAAATAAATAAATAAATA
+>random_negative_19807
+ATATCTGCTGCCATATTTTTCCAGGGGCTAGTTTGACTCTTCTACTTTGGAGGGCTTTGCA
+>random_negative_11214
+ATATGTATGTTGTTAGATTGTTTTTTGTTGTTGTTTTTTGAGACAGAGTCCGGCTCTGTTA
+>random_negative_23177
+TATAGTAATTTTTTATTGATGTGTGGGAAATTATTCCAAAATTTAGCAACTAGAAGCAACA
+>random_negative_9875
+AAAGCTAAAGCAAAAACACTGGCATATGACCATGCAAGACTGTCAGTGCCAACAAAGACAA
+>random_negative_14367
+TCTACCCACTAAGCACTCATAGCTCTCCTTCCCTAGCTGACAATAAAAAATGTCCCCAGAC
+>random_negative_4027
+ACAAAGACGTTTCTCAGTGTTGAATACACTCTTCCTTAAAAACCTAAGGCTAGCCTCTGTT
+>random_negative_22646
+TCCACAACACCCTGCCTATATGTTACCGGCCATTTTGTGTGTCCTTCTGATAGGATGCTAG
+>random_negative_1645
+TTCTGACTGCCAGGTGTTCTTCACTCTATCCATTCTCTTTTGCCAGTAGATATATAAAAAG
+>random_negative_20207
+GCCTCCTGGGTTCAAGCGATTCTCCTGCTTCAGTCCCCTGAGTAGCTGGGATCACAGGCAT
+>random_negative_8009
+AGCTACATGATTTAACCAGAAACACCAGAAACTGTCCTGGTGGAAAATTGTTCTTTATCCA
+>random_negative_18195
+CCGTGAAGCAGTGCACCGACCAGTTTGGGATGGACACAGTACTGGTGGAAGACAGGATGCC
+>random_negative_8527
+CTGTAATGGGAACCCCTCCCCCATTTACTTCTCCACCTCCCGTCCTCCCCATCATTGGTTT
+>random_negative_15810
+AAAAAAAAAAAGAAAAAAAATTGGCAGGGTGCGGTGGCTCACACCTATAATCTCAGCACTT
+>random_negative_10743
+GTAGAAAAACGTAGTGGTTGTGTATTAACAGAAAATGCTTCTGAGAAGTTTATATCTAGTA
+>random_negative_21468
+TTCTAAGGACAGCTGGTGTCCTCTGTAAGCAAACCAGGCTTGGCTAGGCCTGGGCCCTCTG
+>random_negative_4787
+GCCAGTGGCTCCCCTGCCTCCTGCCGATTCGGCCTCCCTGATCTGACTGCATCCTAATGTA
+>random_negative_16607
+GGTGTCCATTTGTGCAATGACAATGTTCATCTCACCTCCTGAGGCTGCTGGGGTTCAAGAC
+>random_negative_407
+GGTAGGGGGATAGAGTCTCACTGCGCCACCCAGGCTGGAATGCAGTGGTGCGATCTCAGCT
+>random_negative_19757
+GGGACTATAGGCGCATGCCACCACGCCCGGCTAATTTTTGTATTTTTAGTAGAGACAGGGG
+>random_negative_13405
+GCTTGAACCCGGGAGGCAGAGGATGTTGCAGTAGCTGAGATCGCACCACTGCACTCCAGCC
+>random_negative_14806
+GATGTGTGTCCATACTTCCATTCTCTTCCCCTGAACTCTTCTCCAGGTCCTGGAATCGCCT
diff -r 215925e588c4 -r 20429f4c1b95 test-data/test_out.peaks.bed
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/test_out.peaks.bed Wed Jan 22 10:14:41 2020 -0500
@@ -0,0 +1,2 @@
+chr1 1005 1008 s1,1006 6.500000 +
+chr2 1994 1997 s2,1997 4.000000 -
diff -r 215925e588c4 -r 20429f4c1b95 test-data/test_positives.parop.fa
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/test_positives.parop.fa Wed Jan 22 10:14:41 2020 -0500
@@ -0,0 +1,50 @@
+>SERBP1_K562_rep01_125
+GCGAGGCCTGACATCACTTACCAGGAGCCTCAGGGAACCCAGCCAGCACAGCAGCAGCAGC
+>SERBP1_K562_rep01_2383
+CTCTACGCGGTCTGGGCTGAAGCTGGTTTCCTGGCCGAGGCGGAGCTGCTGAACCTGAGGA
+>SERBP1_K562_rep01_473
+TCAAGTACACGGAGAAGCTGCAGGAGCCCCCGGCCAGTGCTGTCCGTGAGGCTGCCGACAA
+>SERBP1_K562_rep01_1962
+CCAGACGGGAAAACCATCTATGAGAACTCCTCTCCGTGAACTTACCCTGCAGCCCGGTGCC
+>SERBP1_K562_rep01_58
+AGGCACAGGGCTGGAGCGAGGTGTGGCCGGCGTGCCAGCCGAGTTCAGCATCTGGACCCGG
+>SERBP1_K562_rep01_1844
+CTTGTCCTGGCTGGACGCTACTCCGGACGCAAAGCTGTCATCGTGAAGAACATTGATGATG
+>SERBP1_K562_rep01_1556
+CAAAATGACTATGAAGTTGGGCAGCGGCACGGGCTGGAGGCCATCAGCATCATGGACTCCC
+>SERBP1_K562_rep01_3134
+CAGACAATGCCTGTGGAAGACAAGTCAGACCCCCCAGAGGGGTCTGAGGAAGCCGCAGAGC
+>SERBP1_K562_rep02_1008
+TGCCTGCTGGTCACCCTGGCCGCGCGCTTCCCCGCCGACTTCACGGCCGAGGCCCACGCCG
+>SERBP1_K562_rep01_571
+TGGAGAAGCCGGGGGTGGACGAGGAGCCGCAGCATGTCCTCCTGCGGTACGAGGACGCCTA
+>SERBP1_K562_rep01_283
+CTGTCCTCGCCGTCACCGGGCCAGCAGGTGCAGACCCCGCAGTCGATGCCCCCTCCCCCCC
+>SERBP1_K562_rep01_3147
+GCCTTTTACTTCGGCCCGCTTCTTCTGGTCACTCCGCCACCGTAGAATCGCCTACCATTTG
+>SERBP1_K562_rep01_264
+TGATGAAGATGACCCTACTGCTGATGATACCAGTGCTGCTGTAACTGAAGAAATGCCACCC
+>SERBP1_K562_rep01_1186
+GTTGCCAGCCAGGCTCCCCACACCATCACCTGGTATAAGCGTGGAGGCAGCTTACCCAGCC
+>SERBP1_K562_rep02_155
+GTGTGATCAACGGCTCCCCCTGCCAGCACGGAGGCACCTGCGTGGATGATGAGGGCCGGGC
+>SERBP1_K562_rep01_521
+GTGAGGATAATGCCCCAGCCACCAGCTACTGTGTGGAGTGCTCGGAGCCTCTGTGTGAGAC
+>SERBP1_K562_rep01_941
+TAGACACGCAGCCCAAGAAGGTCCGGAAGGTCCCGCCGGGTCTTCCATCCTCGGTGTACCC
+>SERBP1_K562_rep02_898
+GGAGGCTCTTCAGGAAAGCCTTCCCCCTGCTTTGCATGGCTCCCGGGGCGCTGACTGCGTG
+>SERBP1_K562_rep01_3279
+GGCGAAGACGAGAAGAAGAGGAGCGTTGGAGAATGGAAATGAGACGTTATGAAGAGGACAT
+>SERBP1_K562_rep01_237
+GCAGAGCTCCAGAGGATGGAACAAGAGGCTGAGAGGCGCAGGCAGCCACAAATAAAGCAAG
+>SERBP1_K562_rep02_1199
+AACATTGGGAGCCTCATCTGCAATGTAGGGGCCGGTGGACCTGCTCCAGCAGCTGGTGCTG
+>SERBP1_K562_rep01_309
+AGCTTCCTGCCCAGCCCCTCACCGCAGCCCTCCCAGAGCCCAGTGACGGCGCGGACCCCAC
+>SERBP1_K562_rep01_2709
+AAAAAGGTGGAAGAGGTGCTGCAGCTGGTGGACGAGGTGGTACGAGGGGTCGTGGCCGAGG
+>SERBP1_K562_rep01_2694
+TATCAGACATGAAATGACTCCAGTAAACCCTGGTGTTGGCCAGTGCTGCACTTCTTCATAT
+>SERBP1_K562_rep01_2013
+CTGATGGGCCACAGGGACCTCCCGGCCTGCCGGGACTTAAGGGGGATCCTGGCGTGCCTGG
diff -r 215925e588c4 -r 20429f4c1b95 test-data/test_positives.train.fa
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/test_positives.train.fa Wed Jan 22 10:14:41 2020 -0500
b'@@ -0,0 +1,1000 @@\n+>SERBP1_K562_rep01_1163\n+ACACCCCAGCAGTCAAACCGTACAAGAAAAAGTCGTGTTTCTGTGTCTCCAGGGAGAACTT\n+>SERBP1_K562_rep01_290\n+GGTGGGATTTCCTCCGCCGCCGCCGCCGCCGCCGCCGCGGGTCCTGCGCCGCGTCCAGCCC\n+>SERBP1_K562_rep02_1341\n+AATATGAGCAGCTGTCCTCTGAAGCCCTGGAGGCTGCCCGAATTTGTGCCAATAAGTACAT\n+>SERBP1_K562_rep01_1961\n+CAAAGGAGTGTGCAGACTTGTGGCCCCGGATTGCATCCAATGCTGGCAGCATTGCATGATT\n+>SERBP1_K562_rep02_453\n+CTTCGGTCGCTACCGCTCCCGCTCTGCCACCCCCGCCAACCGCCGCTCGGGCCTCCGTCGC\n+>SERBP1_K562_rep02_29\n+ACCGATGCCCAGATTGCACGACCACTTCTGGAGCTGCTCCTGTGCGCACAGCGCGAGGCGC\n+>SERBP1_K562_rep01_210\n+CAACAGCAGCAGATGGCCGAGCTGCACAGCAAGTTACAGTCCTCCGAGGCGGAGGTGCGCA\n+>SERBP1_K562_rep01_2868\n+CAGGTTTGCCAGCTGAGGAGCCAAAGGAGAAGTCCTCTCGGAAGGTAGCCGAGCCAGAGCT\n+>SERBP1_K562_rep02_1223\n+GCTGGCAAAGGCAAAATGAGAAACCGTCGCCGTATCCAGCGCAGGGGCCCGTGCATCATCT\n+>SERBP1_K562_rep02_991\n+GGATTGCCAGGGAATCAGGGCCCTCCAGGACCCAAGGGCGCCAAAGGAGAAGTGGGCCCCC\n+>SERBP1_K562_rep02_107\n+AGAAAATGGGTCCTGTCGGGAATCCTGGCTAGCGAGGAGACTTACCTGAGCCACCTGGAGG\n+>SERBP1_K562_rep01_2186\n+GAAGAAAGATAGAGGACGCAGGTCAGAGAGCAGCTCTCCTCGACGGGAGAGAAAGAAAAGC\n+>SERBP1_K562_rep01_27\n+GGGCAGCGGGGGTCGGGGGACTGGCCAGCTCAACCGCTTCGTGCAACTCTCCGGGCGGCCG\n+>SERBP1_K562_rep01_2014\n+TGAGCTGAAGGTGCGCAGGGACCCCCAGGTGAGCCCCATGCACTGCCTGGACGAGGAAGGC\n+>SERBP1_K562_rep01_955\n+GCGGCGCGGATTGAATGAGCCCGCCGAGCCCGGGCCGCCGTCGGGAGCAGCGCAGGCCGCG\n+>SERBP1_K562_rep02_1397\n+TTTGCATTGTTCCTCATCCGCCTCCTTGCTCGCCGCAGCCGCCTCCGCCGCGCGCCTCCTC\n+>SERBP1_K562_rep02_1446\n+CGATTGGCATCCTCAGCCGCTTTTCTGCCTTCAGGATCCTCCGCTCCCGAGGTTATATATG\n+>SERBP1_K562_rep01_1271\n+CAACCTGGTGCTCTGAACCAGCGCCAGGTCCAGTTCTCTGAGGAGCACTGGGTCCATGAGT\n+>SERBP1_K562_rep01_1542\n+CGCTCCTATGCCTGGAAACTCCTGCCCGCCTGAAGTGGATGCAAAGCTGCTGAAGCGGCAG\n+>SERBP1_K562_rep01_173\n+GGCTCTTCGACTACCGGGGCCGTCTGTCGCCCGTGCCAGTGCCCAGGGCGGTCCCTGTGAA\n+>SERBP1_K562_rep01_2934\n+CAGTCCCTCCTGTAGCCGCCGCCGCCGCCGCCCGCCGCCCCTCTGCCAGCAGCTCCGGCGC\n+>SERBP1_K562_rep02_224\n+CCCTCGGGGGTCGCGGCCGCCCTCCGCGGCCCCTCGTGGTGCGCGCCGTCCGCTCGCGCTC\n+>SERBP1_K
562_rep01_1161\n+CCGCAACGCTCGACCCCAGGATTCCCCCGGCTCGCCTGCCCGCCATGGCCGACAAGGAAGC\n+>SERBP1_K562_rep01_1370\n+ACAAAAGTGCATTTTTATGTGGAGTTATGAAGACCTACAGGCAGAGAGAGAAACAGGGGAG\n+>SERBP1_K562_rep01_1830\n+CGGTTTACCAGCAGCCCCAGCAGCAGCCGGTGGCCCAGTCCTATGGTGGCTACAAGGAGCC\n+>SERBP1_K562_rep02_1530\n+ATGAGCCGTCCTGGGGAGGTGTGCTCTGCTCTCAACCTCTGCGAGTCTCTCCAGAAGCACC\n+>SERBP1_K562_rep01_761\n+CAAAGGTCGTGGATCCAAGCACCTCCTGGAGTGCGAGCGCTGCCGCCATGCATACCACCCG\n+>SERBP1_K562_rep02_1207\n+CGCCATCGCCGTCATGCTGGGCGCCGCTCTCCGCCGCTGCGCTGTGGCCGCAACCACCCGG\n+>SERBP1_K562_rep01_1911\n+CCTGATTGATGCCGTCACGGGGCTCAGTGGCAGCGGCCCCGCCTACGCATTCACAGCCCTG\n+>SERBP1_K562_rep01_1333\n+CACGGTGATCATGTCTCGGATGATTCTGGGGGAGTACACAGGGCTGCTGGTCAACCTCTCC\n+>SERBP1_K562_rep02_215\n+CTTCAGGTTACTTCATGGCAGCTATCCCACAGACTCAGAACCGTGCTGCATACTATCCTCC\n+>SERBP1_K562_rep01_3203\n+GGCGGCCGCCGCGCTGTGGCTGCTGCTGCTGCTGCTGCCCCGGACCCGGGCGGACGAGCAC\n+>SERBP1_K562_rep01_2285\n+CATGGACATGGTGGAGAAGACGCGGCGCTCGCTCACGGTGCTGCGCAGGTGCCAGGAGGCC\n+>SERBP1_K562_rep01_1761\n+TGGAACGCCCCTTTGTCCTGGCCAGGGCCTTCTTCGCTGGCTCCCAGCGCTTTGGAGCCGT\n+>SERBP1_K562_rep01_3110\n+AGCCAGGCTGGAGAAAAGAAGCTGCCACCATGGTTGCACTTTCACTGAAGATCAGCATTGG\n+>SERBP1_K562_rep01_893\n+AGAGGGGCCTGTCTGGGCCCTCGGGGCCGGGGCACATGGCAAGCCGCGGTGGAGTGGCGGG\n+>SERBP1_K562_rep01_1917\n+ACCGGTGGGAGCTAGGCGCGAGGCTCGGAGTGCGGCCAGCGGGCGGAGGCGGTCTCGCATC\n+>SERBP1_K562_rep02_626\n+GTGTACCTGCTGGTGGGGCTGTAGCCGTCTCTGCTGCCCCAGGCTCTGCAGCCCCTGCTGC\n+>SERBP1_K562_rep02_1471\n+GAAGGACGGCAGGCCAAAGTGGAACAGTTGGGACCCTAGGAGGCAGCGGCAGTTGTCAATG\n+>SERBP1_K562_rep01_213\n+TCCAGCAGAAGGAGATCACACAGAGCCCATCCACGTCCACCATCACCCTGGTGACCAGCAC\n+>SERBP1_K562_rep01_2393\n+AGCCCAAGCGAAGACTGTCGGCCGCCAGAAGAGCCGGCACCTGTTGTGCAAATTGTCAGAC\n+>SERBP1_K562_rep01_2222\n+GCACGTGTCCCTGCCTGGCACGCGCCTCCCGCGCTGGGCTCACAGGCAGGATGCAGTGAGT\n+>SERBP1_K562_rep02_768\n+TTCCTGCCGTCGCGTTTGCACCTCGCTGCTCCAGCCTCTGGGGCGCATTCCAACCTTCCAG\n+>SERBP1_K562_rep02_1277\n+GCTGCCACAGTTGGGGTGGCTGGTTCTGGGGCTGGGATTGGAACTGTGTTTGGGAGCCTCA\n+>SERBP1_K562_rep01_3186\n+CTGACCTCG
CTGGTGATCGAGAACGAGGCTGGGGATGAGCGCATGCTGGCGGATGCCCCAC\n+>SERBP1_K562_rep01_476\n+TTTTGTGCCGAAG'..b'TC\n+>SERBP1_K562_rep02_608\n+GCACCCCTTTGGTTGGCCTTCCAACGGCTTCCCAGGGCCCCAGGGTCCATATTACTGTGGT\n+>SERBP1_K562_rep01_657\n+AGTGGAGACAGAGCTTAAAATGTGGGACCCTCACAATGATCCCAATGCTCAGGGGGATGCC\n+>SERBP1_K562_rep01_1166\n+GACCAGTGAGGCAGAATATGTATCGGGGATATAGACCACGATTCCGCAGGGGCCCTCCTCG\n+>SERBP1_K562_rep02_1497\n+CTCTTTCCCTCGGAGCGGGCGGCGGCGTTGGCGGCTTGTGCAGCAATGGCCAAG\n+>SERBP1_K562_rep01_1946\n+ATGACTTTGAGAAGAAGTTTAATGCGCTGAAGGTTCCCGTGCCAGAGGATAAATATACTGC\n+>SERBP1_K562_rep02_76\n+CCGGTTCGCCGTCTGCGTCTCCCCCACGCCGCCTCGCCTGCCGCCGCGCTCGTCCCTCCGG\n+>SERBP1_K562_rep01_204\n+CAAAACCCATAGTCAAGCCACAGACAAGCCCAGAATATGGCCAGGGGATCAATCCGATTAG\n+>SERBP1_K562_rep01_2128\n+AACTCGACCTAGCCCCTCTCCGGAAAGGAGCAGCACAGGCCCAGAACCACCTGCTCCCACT\n+>SERBP1_K562_rep01_2335\n+GACAAGGTGCCCAGTGTCTCCAGCTCTGCCCTCGTGTCTTCCTTGCACCTGCTGAAGTGCA\n+>SERBP1_K562_rep01_1136\n+GCTCAGCCAGCTCCAGAAGCAGCTGGCAGCCAAGGAGGCGAAGCTTCGAGACCTGGAGGAC\n+>SERBP1_K562_rep01_2783\n+GGCAAGCCCACCCACTTCACAGTAAATGCCAAAGCTGCTGGCAAAGGCAAGCTGGACGTCC\n+>SERBP1_K562_rep01_1319\n+AGAGAGGTGTGCAGTGGCATGGAAGGGCCAGCGGGGTATCTGCGGCGGGCCAGTGTGGCCC\n+>SERBP1_K562_rep01_2182\n+CAGATCCTGGAGTGGGCTCCATTTCTCCAGCTTCTCCAAAGATCTCCCTGGCCCCCACAGA\n+>SERBP1_K562_rep01_75\n+CGTGTCCCGCTGCTGCTCCTGTGAGCGCCCGGCGAGTCCGTCCCGTCCACCGTCCGCAGCT\n+>SERBP1_K562_rep01_383\n+CCACCCTATTCCCAGGGTCTTCCAAAACCGCTTCTCCACACAGTACCGCTGCTTCTCTGTG\n+>SERBP1_K562_rep02_1023\n+CGCGCCCCGCGTCGGGTCCCATCCGGCCCATCGTGCGCTGCCCCACGGTTCGGTACCACAC\n+>SERBP1_K562_rep01_241\n+TTCGCCTGCGTCGCTCCGGGAGCTGCCGACGGACGGAGCGCCCCCGCCCCCGCCCGGCCGC\n+>SERBP1_K562_rep02_709\n+CTCCAAGGGCGGGCTTGTGCTCTGGTGCTTCCAGGGCGTTAGCGACTCATGCACCGGACCC\n+>SERBP1_K562_rep02_1521\n+GACACCGTAACTATCCGCACTAGAAAGTTCATGACCAACCGACTACTTCAGAGGAAACAAA\n+>SERBP1_K562_rep01_1657\n+CACGGCCGCATAGGCAAGCACCGGAAGCACCCCGGCGGCCGCGGTAATGCTGGTGGTCTGC\n+>SERBP1_K562_rep01_537\n+AAACTTGGAGAAGGCCCAGGCGGAGCTGGTGGGGACAGCTGACGAGGCCACGCGGGCAGAG\n+>SERBP1_K562_rep01_1589\n
+ACCTCAGCCTCGGTGCTCGGGCCGCCCCGCCTCTGCCGGAAAGTCCGCGCCGCCGCTGCCG\n+>SERBP1_K562_rep01_2357\n+CAGCGCCGCACCCGGAAGATGAGGCTCGCCGTGGGAGCCCTGCTGGTCTGCGCCGTCCTGG\n+>SERBP1_K562_rep02_851\n+GTTTGCCGCCAGAACACAGGTAAGTGCCGTGTGTGGTT\n+>SERBP1_K562_rep01_2316\n+CTCCTGCTGTATCGGCTCCGCTGGATACCGCACAACGACCACCACCACAGGGCTGCCCACA\n+>SERBP1_K562_rep01_482\n+CTTCCTGAGAACCTTTATAATGACAGGATGTTTCGCATTAAGAGGGCACTGGACCTGAACT\n+>SERBP1_K562_rep01_424\n+CGTGGTGAAGAGTGTGGCCTGGAACCCCAGCCCCGCTGTCTGCCTGGTGGCTGCAGCCGTG\n+>SERBP1_K562_rep02_946\n+CGGTGCTGGTTTTTCGCTCGTCGACTGCGGCTCTTCCTCGGGCAGCGGAAGCGGCGCGGCG\n+>SERBP1_K562_rep02_299\n+ACCAATGGCCAGCATGCTTTTTACCAGCTCATCCACCAAGGCACCAAGATGATACCCTGTG\n+>SERBP1_K562_rep02_910\n+GCAACATCAGCTCACTGGAGGGGGCCCGGGGCCTCATTGCCGAGGCGGCGCAGCTTGGGCC\n+>SERBP1_K562_rep01_1047\n+CCGAGCGATGGGCATCTCTCGGGACAACTGGCACAAGCGCCGCAAAACCGGGGGCAAGAGA\n+>SERBP1_K562_rep02_835\n+GACTATCATTGATGCCCCAGGACACAGAGACTTTATCAAAAACATGATTACAGGGACATCT\n+>SERBP1_K562_rep01_2140\n+GTCGCAGCGGCGGAGACCCCTGTGCGGTGCGGAGGGGGCGGCGGCCCCGACTCTGACCCGC\n+>SERBP1_K562_rep01_1040\n+CCGCCGCCGCCGTTTCAGACGCAGACCCCACCGCAGAGTCTGCAGCAGCCCGCCCCGCCCG\n+>SERBP1_K562_rep01_1595\n+AGAATGTTCTTGGTGAGAAGGGCCGGCGGATTCGGGAACTGACTGCTGTAGTTCAGAAGAG\n+>SERBP1_K562_rep02_1462\n+GGGGAGCCAAAGGCATCCAGGCCAGGGCTCCTGTGGCAGCTGCCCCTGGCACCTGTGTTCC\n+>SERBP1_K562_rep01_3126\n+GCGGAAGTTGGCCCTCTTTTCCGTGGCGCCTCGGAGGCGTTCAGCTGCTTCAAGATGAAGC\n+>SERBP1_K562_rep01_3060\n+AAGTTAACCTCAATAATATCCGGAATATCCCCATCCCCACCCTCAAGGCATATGCAGAAGC\n+>SERBP1_K562_rep01_3300\n+AAATGCAGAGAAATATGCTGAAGAAGACCGGCGAAAGAAGGAACGAGTTGAAGCAGTTAAT\n+>SERBP1_K562_rep02_1284\n+TTGCCATTATGCAGACCCCTGCTGGGGAGCTGTATGACAAATCCATCATTCAGAGTGCCCA\n+>SERBP1_K562_rep01_2507\n+TGGCAAGCATGTGGTGTTTGGCAAAGTTCTAGAGGGCATGGAGGTGGTGCGGAAGGTGGAG\n+>SERBP1_K562_rep01_22\n+CACAGAGGAGCCAGTGAAGGTGCGGGAGGCTGGGGATGGTGTGTTCGAGTGCGAGTACTAC\n+>SERBP1_K562_rep01_1305\n+CGAGCAGTGCTACTACGTCTTCGGGGATCTCTGCAGCAATCTCGCCACCCTGAACCTCAAC\n+>SERBP1_K562_rep02_1139\n+CTGCGTATCAGTCCTCACCAGCAGGAGGACATGCACCAACTCCTCCAACTC
CAGCGCCAAG\n+>SERBP1_K562_rep01_1878\n+TGGAGTTCTCGGGCCGAGACGCCAGCGGCAAGCGTGTGATGGGACTGGTGCCTGCCAAGGG\n+>SERBP1_K562_rep01_25\n+GCAGCCGAACAAAGGAGCAGGGGCGCCGCCGCAGGGACCCGCCACCCACCTCCCGGGGCCG\n'
diff -r 215925e588c4 -r 20429f4c1b95 test-data/test_predict.avg_profile
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/test_predict.avg_profile Wed Jan 22 10:14:41 2020 -0500
b'@@ -0,0 +1,3610 @@\n+CAPRIN1_pos_1\t1\t-1.379676\n+CAPRIN1_pos_1\t2\t-1.655219\n+CAPRIN1_pos_1\t3\t-1.263697\n+CAPRIN1_pos_1\t4\t-0.881248\n+CAPRIN1_pos_1\t5\t-0.380379\n+CAPRIN1_pos_1\t6\t0.023880\n+CAPRIN1_pos_1\t7\t0.322847\n+CAPRIN1_pos_1\t8\t-0.050280\n+CAPRIN1_pos_1\t9\t0.131240\n+CAPRIN1_pos_1\t10\t-0.062237\n+CAPRIN1_pos_1\t11\t-0.235974\n+CAPRIN1_pos_1\t12\t-0.501274\n+CAPRIN1_pos_1\t13\t-0.259123\n+CAPRIN1_pos_1\t14\t-0.886778\n+CAPRIN1_pos_1\t15\t-1.552642\n+CAPRIN1_pos_1\t16\t-2.285816\n+CAPRIN1_pos_1\t17\t-2.742947\n+CAPRIN1_pos_1\t18\t-2.954979\n+CAPRIN1_pos_1\t19\t-2.588990\n+CAPRIN1_pos_1\t20\t-2.911601\n+CAPRIN1_pos_1\t21\t-2.434019\n+CAPRIN1_pos_1\t22\t-2.158966\n+CAPRIN1_pos_1\t23\t-1.646564\n+CAPRIN1_pos_1\t24\t-1.663726\n+CAPRIN1_pos_1\t25\t-1.071446\n+CAPRIN1_pos_1\t26\t-0.910116\n+CAPRIN1_pos_1\t27\t-0.488377\n+CAPRIN1_pos_1\t28\t-0.197663\n+CAPRIN1_pos_1\t29\t-0.563750\n+CAPRIN1_pos_1\t30\t-0.753612\n+CAPRIN1_pos_1\t31\t-0.396876\n+CAPRIN1_pos_1\t32\t-0.658542\n+CAPRIN1_pos_1\t33\t-0.589384\n+CAPRIN1_pos_1\t34\t-0.952024\n+CAPRIN1_pos_1\t35\t-0.832475\n+CAPRIN1_pos_1\t36\t-0.774138\n+CAPRIN1_pos_1\t37\t-0.848855\n+CAPRIN1_pos_1\t38\t-1.261698\n+CAPRIN1_pos_1\t39\t-1.367489\n+CAPRIN1_pos_1\t40\t-1.142024\n+CAPRIN1_pos_1\t41\t-0.835390\n+CAPRIN1_pos_1\t42\t-0.918258\n+CAPRIN1_pos_1\t43\t-0.819230\n+CAPRIN1_pos_1\t44\t-0.519664\n+CAPRIN1_pos_1\t45\t-0.255625\n+CAPRIN1_pos_1\t46\t-0.131319\n+CAPRIN1_pos_1\t47\t-0.366882\n+CAPRIN1_pos_1\t48\t0.081570\n+CAPRIN1_pos_1\t49\t0.235923\n+CAPRIN1_pos_1\t50\t-0.075025\n+CAPRIN1_pos_1\t51\t-0.439231\n+CAPRIN1_pos_1\t52\t-1.198908\n+CAPRIN1_pos_1\t53\t-1.686417\n+CAPRIN1_pos_1\t54\t-2.147714\n+CAPRIN1_pos_1\t55\t-2.920973\n+CAPRIN1_pos_1\t56\t-3.416452\n+CAPRIN1_pos_1\t57\t-3.995050\n+CAPRIN1_pos_1\t58\t-3.995280\n+CAPRIN1_pos_1\t59\t-3.999470\n+CAPRIN1_pos_1\t60\t-3.820544\n+CAPRIN1_pos_1\t61\t-3.669137\n+CAPRIN1_pos_1\t62\t-2.972909\n+CAPRIN1_pos_1\t63\t-2.436096\n+CAPRIN1_pos_1\t64\t-1.963858\n+CAPRIN1
_pos_1\t65\t-1.168126\n+CAPRIN1_pos_1\t66\t-0.531896\n+CAPRIN1_pos_1\t67\t-0.259057\n+CAPRIN1_pos_1\t68\t0.280927\n+CAPRIN1_pos_1\t69\t0.190114\n+CAPRIN1_pos_1\t70\t-0.143584\n+CAPRIN1_pos_1\t71\t-0.063020\n+CAPRIN1_pos_1\t72\t-0.332623\n+CAPRIN1_pos_1\t73\t-0.573936\n+CAPRIN1_pos_1\t74\t-0.731437\n+CAPRIN1_pos_1\t75\t-0.639886\n+CAPRIN1_pos_1\t76\t-1.304309\n+CAPRIN1_pos_1\t77\t-1.540262\n+CAPRIN1_pos_1\t78\t-1.636970\n+CAPRIN1_pos_1\t79\t-2.051662\n+CAPRIN1_pos_1\t80\t-1.997273\n+CAPRIN1_pos_1\t81\t-2.133966\n+CAPRIN1_pos_1\t82\t-2.592269\n+CAPRIN1_pos_1\t83\t-2.535969\n+CAPRIN1_pos_1\t84\t-2.541698\n+CAPRIN1_pos_1\t85\t-2.834214\n+CAPRIN1_pos_1\t86\t-3.343224\n+CAPRIN1_pos_1\t87\t-3.445146\n+CAPRIN1_pos_1\t88\t-3.802046\n+CAPRIN1_pos_1\t89\t-3.900305\n+CAPRIN1_pos_1\t90\t-4.030203\n+CAPRIN1_pos_1\t91\t-3.925011\n+CAPRIN1_pos_1\t92\t-3.753758\n+CAPRIN1_pos_1\t93\t-3.384476\n+CAPRIN1_pos_1\t94\t-3.559231\n+CAPRIN1_pos_1\t95\t-4.051939\n+CAPRIN1_pos_1\t96\t-4.084883\n+CAPRIN1_pos_1\t97\t-4.024745\n+CAPRIN1_pos_1\t98\t-3.692752\n+CAPRIN1_pos_1\t99\t-3.611979\n+CAPRIN1_pos_1\t100\t-3.592387\n+CAPRIN1_pos_1\t101\t-3.473540\n+CAPRIN1_pos_1\t102\t-3.847526\n+CAPRIN1_pos_1\t103\t-3.912690\n+CAPRIN1_pos_1\t104\t-4.274216\n+CAPRIN1_pos_1\t105\t-4.060786\n+CAPRIN1_pos_1\t106\t-3.883028\n+CAPRIN1_pos_1\t107\t-3.464518\n+CAPRIN1_pos_1\t108\t-3.402433\n+CAPRIN1_pos_1\t109\t-3.536028\n+CAPRIN1_pos_1\t110\t-3.550916\n+CAPRIN1_pos_1\t111\t-3.641621\n+CAPRIN1_pos_1\t112\t-3.821016\n+CAPRIN1_pos_1\t113\t-3.773154\n+CAPRIN1_pos_1\t114\t-3.929856\n+CAPRIN1_pos_1\t115\t-4.027817\n+CAPRIN1_pos_1\t116\t-3.970292\n+CAPRIN1_pos_1\t117\t-3.598182\n+CAPRIN1_pos_1\t118\t-3.545027\n+CAPRIN1_pos_1\t119\t-3.075370\n+CAPRIN1_pos_1\t120\t-2.539913\n+CAPRIN1_pos_1\t121\t-2.252535\n+CAPRIN1_pos_1\t122\t-2.136166\n+CAPRIN1_pos_1\t123\t-2.155064\n+CAPRIN1_pos_1\t124\t-2.261176\n+CAPRIN1_pos_1\t125\t-1.886312\n+CAPRIN1_pos_1\t126\t-1.633039\n+CAPRIN1_pos_1\t127\t-1.676508\n+CAPRIN1_pos_1\t128\t-2.11576
4\n+CAPRIN1_pos_1\t129\t-2.382775\n+CAPRIN1_pos_1\t130\t-2.936075\n+CAPRIN1_pos_1\t131\t-3.584606\n+CAPRIN1_pos_1\t132\t-3.519645\n+CAPRIN1_pos_1\t133\t-3.131874\n+CAPRIN1_pos_1\t134\t-2.905418\n+CAPRIN1_pos_1\t135\t-2.597431\n+CAPRIN1_pos_1\t136\t-2.349151\n+CAPRIN1_pos_1\t137\t-2.038789\n+CAPRIN1_pos_1\t138\t-2.217021\n+CAPRIN1_pos_1\t139\t-2.272697\n+CAPRIN1_pos_1\t140\t-2.468045\n+CAPRIN1_pos_1\t141\t-2.361197\n+CAPRI'..b'IN1_pos_10\t225\t-0.150782\n+CAPRIN1_pos_10\t226\t-0.530851\n+CAPRIN1_pos_10\t227\t-0.625072\n+CAPRIN1_pos_10\t228\t-0.170871\n+CAPRIN1_pos_10\t229\t-0.521842\n+CAPRIN1_pos_10\t230\t-0.439000\n+CAPRIN1_pos_10\t231\t0.039451\n+CAPRIN1_pos_10\t232\t0.192503\n+CAPRIN1_pos_10\t233\t0.027147\n+CAPRIN1_pos_10\t234\t0.308137\n+CAPRIN1_pos_10\t235\t0.632263\n+CAPRIN1_pos_10\t236\t0.600408\n+CAPRIN1_pos_10\t237\t0.952704\n+CAPRIN1_pos_10\t238\t0.682124\n+CAPRIN1_pos_10\t239\t0.550874\n+CAPRIN1_pos_10\t240\t0.463092\n+CAPRIN1_pos_10\t241\t0.044839\n+CAPRIN1_pos_10\t242\t0.054625\n+CAPRIN1_pos_10\t243\t0.023761\n+CAPRIN1_pos_10\t244\t0.581227\n+CAPRIN1_pos_10\t245\t0.615569\n+CAPRIN1_pos_10\t246\t0.404520\n+CAPRIN1_pos_10\t247\t0.495847\n+CAPRIN1_pos_10\t248\t0.392350\n+CAPRIN1_pos_10\t249\t0.539801\n+CAPRIN1_pos_10\t250\t0.798014\n+CAPRIN1_pos_10\t251\t1.486443\n+CAPRIN1_pos_10\t252\t2.028157\n+CAPRIN1_pos_10\t253\t2.077956\n+CAPRIN1_pos_10\t254\t2.109173\n+CAPRIN1_pos_10\t255\t1.596623\n+CAPRIN1_pos_10\t256\t1.033621\n+CAPRIN1_pos_10\t257\t1.590057\n+CAPRIN1_pos_10\t258\t1.476771\n+CAPRIN1_pos_10\t259\t1.429161\n+CAPRIN1_pos_10\t260\t1.558993\n+CAPRIN1_pos_10\t261\t1.280549\n+CAPRIN1_pos_10\t262\t1.140090\n+CAPRIN1_pos_10\t263\t1.249710\n+CAPRIN1_pos_10\t264\t1.001820\n+CAPRIN1_pos_10\t265\t0.947169\n+CAPRIN1_pos_10\t266\t1.299137\n+CAPRIN1_pos_10\t267\t1.773234\n+CAPRIN1_pos_10\t268\t1.696727\n+CAPRIN1_pos_10\t269\t1.276074\n+CAPRIN1_pos_10\t270\t1.171405\n+CAPRIN1_pos_10\t271\t0.738794\n+CAPRIN1_pos_10\t272\t0.684006\n+CAPRIN1_pos_10\t273\t0.199272\n+
CAPRIN1_pos_10\t274\t-0.026699\n+CAPRIN1_pos_10\t275\t0.306582\n+CAPRIN1_pos_10\t276\t0.320538\n+CAPRIN1_pos_10\t277\t0.173961\n+CAPRIN1_pos_10\t278\t0.207101\n+CAPRIN1_pos_10\t279\t-0.113596\n+CAPRIN1_pos_10\t280\t-0.112646\n+CAPRIN1_pos_10\t281\t0.049174\n+CAPRIN1_pos_10\t282\t0.423899\n+CAPRIN1_pos_10\t283\t0.530393\n+CAPRIN1_pos_10\t284\t1.188898\n+CAPRIN1_pos_10\t285\t1.378855\n+CAPRIN1_pos_10\t286\t1.132298\n+CAPRIN1_pos_10\t287\t1.341804\n+CAPRIN1_pos_10\t288\t1.654624\n+CAPRIN1_pos_10\t289\t1.714961\n+CAPRIN1_pos_10\t290\t2.032433\n+CAPRIN1_pos_10\t291\t2.040082\n+CAPRIN1_pos_10\t292\t2.092991\n+CAPRIN1_pos_10\t293\t1.916263\n+CAPRIN1_pos_10\t294\t1.875287\n+CAPRIN1_pos_10\t295\t1.758779\n+CAPRIN1_pos_10\t296\t1.643259\n+CAPRIN1_pos_10\t297\t1.207320\n+CAPRIN1_pos_10\t298\t0.476141\n+CAPRIN1_pos_10\t299\t0.396906\n+CAPRIN1_pos_10\t300\t0.004435\n+CAPRIN1_pos_10\t301\t-0.239439\n+CAPRIN1_pos_10\t302\t0.013079\n+CAPRIN1_pos_10\t303\t-0.129500\n+CAPRIN1_pos_10\t304\t-0.404122\n+CAPRIN1_pos_10\t305\t-0.600306\n+CAPRIN1_pos_10\t306\t-0.552530\n+CAPRIN1_pos_10\t307\t-0.542922\n+CAPRIN1_pos_10\t308\t-0.366593\n+CAPRIN1_pos_10\t309\t0.260696\n+CAPRIN1_pos_10\t310\t0.368463\n+CAPRIN1_pos_10\t311\t0.337125\n+CAPRIN1_pos_10\t312\t0.546437\n+CAPRIN1_pos_10\t313\t0.952587\n+CAPRIN1_pos_10\t314\t1.125232\n+CAPRIN1_pos_10\t315\t1.215464\n+CAPRIN1_pos_10\t316\t1.456625\n+CAPRIN1_pos_10\t317\t1.331600\n+CAPRIN1_pos_10\t318\t1.317941\n+CAPRIN1_pos_10\t319\t1.361951\n+CAPRIN1_pos_10\t320\t1.133940\n+CAPRIN1_pos_10\t321\t0.474719\n+CAPRIN1_pos_10\t322\t0.273075\n+CAPRIN1_pos_10\t323\t-0.180674\n+CAPRIN1_pos_10\t324\t-0.400962\n+CAPRIN1_pos_10\t325\t-0.996007\n+CAPRIN1_pos_10\t326\t-0.675398\n+CAPRIN1_pos_10\t327\t-0.699037\n+CAPRIN1_pos_10\t328\t-1.140867\n+CAPRIN1_pos_10\t329\t-1.166334\n+CAPRIN1_pos_10\t330\t-0.921350\n+CAPRIN1_pos_10\t331\t-1.305438\n+CAPRIN1_pos_10\t332\t-1.190402\n+CAPRIN1_pos_10\t333\t-0.929885\n+CAPRIN1_pos_10\t334\t-0.503833\n+CAPRIN1_pos_10\t335\t-0.42
9189\n+CAPRIN1_pos_10\t336\t-0.090868\n+CAPRIN1_pos_10\t337\t0.008666\n+CAPRIN1_pos_10\t338\t-0.142423\n+CAPRIN1_pos_10\t339\t0.273499\n+CAPRIN1_pos_10\t340\t0.099133\n+CAPRIN1_pos_10\t341\t0.338603\n+CAPRIN1_pos_10\t342\t1.138634\n+CAPRIN1_pos_10\t343\t1.672221\n+CAPRIN1_pos_10\t344\t2.181895\n+CAPRIN1_pos_10\t345\t2.305009\n+CAPRIN1_pos_10\t346\t2.115692\n+CAPRIN1_pos_10\t347\t2.493816\n+CAPRIN1_pos_10\t348\t2.701728\n+CAPRIN1_pos_10\t349\t2.941207\n+CAPRIN1_pos_10\t350\t3.043532\n+CAPRIN1_pos_10\t351\t3.314954\n+CAPRIN1_pos_10\t352\t2.890958\n+CAPRIN1_pos_10\t353\t2.266202\n+CAPRIN1_pos_10\t354\t2.232321\n+CAPRIN1_pos_10\t355\t1.949431\n+CAPRIN1_pos_10\t356\t1.706469\n+CAPRIN1_pos_10\t357\t1.872770\n+CAPRIN1_pos_10\t358\t1.617284\n+CAPRIN1_pos_10\t359\t1.286155\n+CAPRIN1_pos_10\t360\t1.141618\n+CAPRIN1_pos_10\t361\t0.797198\n'
diff -r 215925e588c4 -r 20429f4c1b95 test-data/test_predict.avg_profile.genomic_peaks.bed
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/test_predict.avg_profile.genomic_peaks.bed Wed Jan 22 10:14:41 2020 -0500
@@ -0,0 +1,100 @@
+chr4 53462672 53462674 CAPRIN1_pos_1,53462673 0.322847 -
+chr4 53462670 53462671 CAPRIN1_pos_1,53462671 0.131240 -
+chr4 53462630 53462632 CAPRIN1_pos_1,53462631 0.235923 -
+chr4 53462610 53462612 CAPRIN1_pos_1,53462612 0.280927 -
+chr4 53462406 53462408 CAPRIN1_pos_1,53462407 0.219724 -
+chr4 53462329 53462330 CAPRIN1_pos_1,53462330 0.098335 -
+chr4 53462326 53462327 CAPRIN1_pos_1,53462327 0.046050 -
+chr3 196082839 196082841 CAPRIN1_pos_2,196082841 0.129521 -
+chr3 196082803 196082805 CAPRIN1_pos_2,196082805 0.428493 -
+chr3 196082801 196082802 CAPRIN1_pos_2,196082802 0.264655 -
+chr3 196082794 196082796 CAPRIN1_pos_2,196082795 0.172488 -
+chr3 196082728 196082736 CAPRIN1_pos_2,196082733 1.943816 -
+chr3 196082711 196082712 CAPRIN1_pos_2,196082712 0.040976 -
+chr3 196082704 196082710 CAPRIN1_pos_2,196082708 1.255144 -
+chr3 196082665 196082675 CAPRIN1_pos_2,196082673 1.114239 -
+chr3 196082571 196082575 CAPRIN1_pos_2,196082574 0.318163 -
+chr3 196082493 196082495 CAPRIN1_pos_2,196082495 0.259428 -
+chr18 659480 659499 CAPRIN1_pos_3,659492 1.592080 +
+chr18 659503 659509 CAPRIN1_pos_3,659507 0.602029 +
+chr18 659510 659514 CAPRIN1_pos_3,659512 0.299957 +
+chr18 659520 659521 CAPRIN1_pos_3,659521 0.078966 +
+chr18 659537 659562 CAPRIN1_pos_3,659558 1.790984 +
+chr18 659576 659578 CAPRIN1_pos_3,659578 0.126206 +
+chr18 659620 659622 CAPRIN1_pos_3,659621 0.143977 +
+chr18 659652 659665 CAPRIN1_pos_3,659659 0.757031 +
+chr18 659721 659727 CAPRIN1_pos_3,659725 0.911435 +
+chr18 659757 659771 CAPRIN1_pos_3,659763 1.753765 +
+chr18 659790 659800 CAPRIN1_pos_3,659795 1.424040 +
+chr18 659808 659816 CAPRIN1_pos_3,659813 0.966312 +
+chr18 659817 659818 CAPRIN1_pos_3,659818 0.024165 +
+chr18 659822 659823 CAPRIN1_pos_3,659823 0.014006 +
+chr18 659824 659826 CAPRIN1_pos_3,659825 0.263710 +
+chr18 659839 659841 CAPRIN1_pos_3,659841 0.208979 +
+chr14 45585296 45585297 CAPRIN1_pos_4,45585297 0.043554 -
+chr14 45585250 45585252 CAPRIN1_pos_4,45585251 0.054336 -
+chr14 45585171 45585177 CAPRIN1_pos_4,45585175 0.461447 -
+chr14 45585042 45585043 CAPRIN1_pos_4,45585043 0.075030 -
+chr14 45585037 45585040 CAPRIN1_pos_4,45585040 0.252767 -
+chr14 45585035 45585036 CAPRIN1_pos_4,45585036 0.025491 -
+chr14 45585024 45585033 CAPRIN1_pos_4,45585026 1.156303 -
+chr19 58596235 58596415 CAPRIN1_pos_5,58596272 3.468620 -
+chr19 58596229 58596230 CAPRIN1_pos_5,58596230 0.413088 -
+chr19 58596185 58596228 CAPRIN1_pos_5,58596210 2.698865 -
+chr19 58596054 58596182 CAPRIN1_pos_5,58596161 4.101235 -
+chr4 109571760 109571776 CAPRIN1_pos_6,109571766 2.833152 +
+chr4 109571778 109571814 CAPRIN1_pos_6,109571790 2.728851 +
+chr4 109571827 109571833 CAPRIN1_pos_6,109571830 1.351761 +
+chr4 109571849 109571884 CAPRIN1_pos_6,109571875 2.903897 +
+chr4 109571888 109571904 CAPRIN1_pos_6,109571897 2.748033 +
+chr4 109571940 109571950 CAPRIN1_pos_6,109571945 1.185336 +
+chr4 109571957 109572019 CAPRIN1_pos_6,109572002 3.821982 +
+chr4 109572020 109572021 CAPRIN1_pos_6,109572021 0.282644 +
+chr4 109572022 109572027 CAPRIN1_pos_6,109572023 0.374493 +
+chr4 109572030 109572047 CAPRIN1_pos_6,109572039 2.644698 +
+chr4 109572076 109572081 CAPRIN1_pos_6,109572081 0.287939 +
+chr4 109572092 109572098 CAPRIN1_pos_6,109572096 0.716058 +
+chr8 42798383 42798389 CAPRIN1_pos_7,42798389 0.200465 +
+chr8 42798465 42798466 CAPRIN1_pos_7,42798466 0.077973 +
+chr8 42798520 42798531 CAPRIN1_pos_7,42798526 1.392449 +
+chr8 42798578 42798586 CAPRIN1_pos_7,42798584 0.872296 +
+chr8 42798668 42798676 CAPRIN1_pos_7,42798671 0.966690 +
+chr8 42798678 42798679 CAPRIN1_pos_7,42798679 0.088335 +
+chr8 42798697 42798709 CAPRIN1_pos_7,42798703 0.958345 +
+chr1 11115017 11115041 CAPRIN1_pos_8,11115038 2.212693 -
+chr1 11114998 11115015 CAPRIN1_pos_8,11115006 2.396456 -
+chr1 11114958 11114992 CAPRIN1_pos_8,11114975 2.872385 -
+chr1 11114956 11114957 CAPRIN1_pos_8,11114957 0.234594 -
+chr1 11114952 11114955 CAPRIN1_pos_8,11114954 0.387833 -
+chr1 11114936 11114950 CAPRIN1_pos_8,11114944 1.964977 -
+chr1 11114810 11114925 CAPRIN1_pos_8,11114845 4.077125 -
+chr1 11114760 11114801 CAPRIN1_pos_8,11114781 3.376777 -
+chr1 11114716 11114738 CAPRIN1_pos_8,11114728 1.203373 -
+chr1 11114710 11114715 CAPRIN1_pos_8,11114714 0.159993 -
+chr1 11114697 11114709 CAPRIN1_pos_8,11114702 1.695643 -
+chr1 117530511 117530531 CAPRIN1_pos_9,117530518 1.910565 +
+chr1 117530536 117530548 CAPRIN1_pos_9,117530544 1.168142 +
+chr1 117530561 117530571 CAPRIN1_pos_9,117530568 1.618727 +
+chr1 117530599 117530605 CAPRIN1_pos_9,117530604 0.724791 +
+chr1 117530617 117530619 CAPRIN1_pos_9,117530619 0.249549 +
+chr1 117530620 117530621 CAPRIN1_pos_9,117530621 0.330394 +
+chr1 117530630 117530642 CAPRIN1_pos_9,117530634 1.607942 +
+chr1 117530651 117530654 CAPRIN1_pos_9,117530652 0.234019 +
+chr1 117530668 117530669 CAPRIN1_pos_9,117530669 0.069593 +
+chr1 117530684 117530687 CAPRIN1_pos_9,117530687 0.261641 +
+chr1 117530707 117530708 CAPRIN1_pos_9,117530708 0.178237 +
+chr1 117530740 117530747 CAPRIN1_pos_9,117530743 0.940394 +
+chr1 117530794 117530802 CAPRIN1_pos_9,117530799 1.004641 +
+chr1 117530857 117530862 CAPRIN1_pos_9,117530859 0.213619 +
+chr22 43015872 43015879 CAPRIN1_pos_10,43015879 1.409827 -
+chr22 43015846 43015864 CAPRIN1_pos_10,43015856 2.611975 -
+chr22 43015789 43015844 CAPRIN1_pos_10,43015799 2.810422 -
+chr22 43015727 43015780 CAPRIN1_pos_10,43015748 4.458784 -
+chr22 43015656 43015723 CAPRIN1_pos_10,43015666 2.699599 -
+chr22 43015606 43015649 CAPRIN1_pos_10,43015626 2.109173 -
+chr22 43015601 43015605 CAPRIN1_pos_10,43015604 0.320538 -
+chr22 43015579 43015599 CAPRIN1_pos_10,43015588 2.092991 -
+chr22 43015577 43015578 CAPRIN1_pos_10,43015578 0.013079 -
+chr22 43015557 43015571 CAPRIN1_pos_10,43015564 1.456625 -
+chr22 43015542 43015543 CAPRIN1_pos_10,43015543 0.008666 -
+chr22 43015518 43015541 CAPRIN1_pos_10,43015529 3.314954 -
diff -r 215925e588c4 -r 20429f4c1b95 test-data/test_predict.avg_profile.p50.genomic_peaks.bed
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/test_predict.avg_profile.p50.genomic_peaks.bed Wed Jan 22 10:14:41 2020 -0500
@@ -0,0 +1,21 @@
+chr19 58596414 58596415 CAPRIN1_pos_5,58596415 2.940065 -
+chr19 58596391 58596394 CAPRIN1_pos_5,58596393 2.897963 -
+chr19 58596371 58596373 CAPRIN1_pos_5,58596373 2.854940 -
+chr19 58596355 58596357 CAPRIN1_pos_5,58596356 2.979284 -
+chr19 58596269 58596278 CAPRIN1_pos_5,58596272 3.468620 -
+chr19 58596156 58596167 CAPRIN1_pos_5,58596161 4.101235 -
+chr19 58596142 58596144 CAPRIN1_pos_5,58596143 2.884787 -
+chr19 58596136 58596140 CAPRIN1_pos_5,58596137 3.165810 -
+chr19 58596110 58596112 CAPRIN1_pos_5,58596112 2.917658 -
+chr19 58596072 58596076 CAPRIN1_pos_5,58596074 3.078847 -
+chr4 109571765 109571766 CAPRIN1_pos_6,109571766 2.833152 +
+chr4 109571874 109571875 CAPRIN1_pos_6,109571875 2.903897 +
+chr4 109571998 109572009 CAPRIN1_pos_6,109572002 3.821982 +
+chr1 11114974 11114975 CAPRIN1_pos_8,11114975 2.872385 -
+chr1 11114913 11114916 CAPRIN1_pos_8,11114916 3.124103 -
+chr1 11114839 11114849 CAPRIN1_pos_8,11114845 4.077125 -
+chr1 11114783 11114784 CAPRIN1_pos_8,11114784 3.139926 -
+chr1 11114778 11114782 CAPRIN1_pos_8,11114781 3.376777 -
+chr22 43015770 43015773 CAPRIN1_pos_10,43015772 3.033856 -
+chr22 43015739 43015756 CAPRIN1_pos_10,43015748 4.458784 -
+chr22 43015527 43015531 CAPRIN1_pos_10,43015529 3.314954 -
diff -r 215925e588c4 -r 20429f4c1b95 test-data/test_predict.avg_profile.p50.peaks.bed
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/test_predict.avg_profile.p50.peaks.bed Wed Jan 22 10:14:41 2020 -0500
@@ -0,0 +1,21 @@
+CAPRIN1_pos_5 0 1 CAPRIN1_pos_5,1 2.940065 +
+CAPRIN1_pos_5 21 24 CAPRIN1_pos_5,23 2.897963 +
+CAPRIN1_pos_5 42 44 CAPRIN1_pos_5,43 2.854940 +
+CAPRIN1_pos_5 58 60 CAPRIN1_pos_5,60 2.979284 +
+CAPRIN1_pos_5 137 146 CAPRIN1_pos_5,144 3.468620 +
+CAPRIN1_pos_5 248 259 CAPRIN1_pos_5,255 4.101235 +
+CAPRIN1_pos_5 271 273 CAPRIN1_pos_5,273 2.884787 +
+CAPRIN1_pos_5 275 279 CAPRIN1_pos_5,279 3.165810 +
+CAPRIN1_pos_5 303 305 CAPRIN1_pos_5,304 2.917658 +
+CAPRIN1_pos_5 339 343 CAPRIN1_pos_5,342 3.078847 +
+CAPRIN1_pos_6 5 6 CAPRIN1_pos_6,6 2.833152 +
+CAPRIN1_pos_6 114 115 CAPRIN1_pos_6,115 2.903897 +
+CAPRIN1_pos_6 238 249 CAPRIN1_pos_6,242 3.821982 +
+CAPRIN1_pos_8 66 67 CAPRIN1_pos_8,67 2.872385 +
+CAPRIN1_pos_8 125 128 CAPRIN1_pos_8,126 3.124103 +
+CAPRIN1_pos_8 192 202 CAPRIN1_pos_8,197 4.077125 +
+CAPRIN1_pos_8 257 258 CAPRIN1_pos_8,258 3.139926 +
+CAPRIN1_pos_8 259 263 CAPRIN1_pos_8,261 3.376777 +
+CAPRIN1_pos_10 106 109 CAPRIN1_pos_10,108 3.033856 +
+CAPRIN1_pos_10 123 140 CAPRIN1_pos_10,132 4.458784 +
+CAPRIN1_pos_10 348 352 CAPRIN1_pos_10,351 3.314954 +
diff -r 215925e588c4 -r 20429f4c1b95 test-data/test_predict.avg_profile.peaks.bed
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/test_predict.avg_profile.peaks.bed Wed Jan 22 10:14:41 2020 -0500
@@ -0,0 +1,100 @@
+CAPRIN1_pos_1 5 7 CAPRIN1_pos_1,7 0.322847 +
+CAPRIN1_pos_1 8 9 CAPRIN1_pos_1,9 0.131240 +
+CAPRIN1_pos_1 47 49 CAPRIN1_pos_1,49 0.235923 +
+CAPRIN1_pos_1 67 69 CAPRIN1_pos_1,68 0.280927 +
+CAPRIN1_pos_1 271 273 CAPRIN1_pos_1,273 0.219724 +
+CAPRIN1_pos_1 349 350 CAPRIN1_pos_1,350 0.098335 +
+CAPRIN1_pos_1 352 353 CAPRIN1_pos_1,353 0.046050 +
+CAPRIN1_pos_2 0 2 CAPRIN1_pos_2,1 0.129521 +
+CAPRIN1_pos_2 36 38 CAPRIN1_pos_2,37 0.428493 +
+CAPRIN1_pos_2 39 40 CAPRIN1_pos_2,40 0.264655 +
+CAPRIN1_pos_2 45 47 CAPRIN1_pos_2,47 0.172488 +
+CAPRIN1_pos_2 105 113 CAPRIN1_pos_2,109 1.943816 +
+CAPRIN1_pos_2 129 130 CAPRIN1_pos_2,130 0.040976 +
+CAPRIN1_pos_2 131 137 CAPRIN1_pos_2,134 1.255144 +
+CAPRIN1_pos_2 166 176 CAPRIN1_pos_2,169 1.114239 +
+CAPRIN1_pos_2 266 270 CAPRIN1_pos_2,268 0.318163 +
+CAPRIN1_pos_2 346 348 CAPRIN1_pos_2,347 0.259428 +
+CAPRIN1_pos_3 0 19 CAPRIN1_pos_3,12 1.592080 +
+CAPRIN1_pos_3 23 29 CAPRIN1_pos_3,27 0.602029 +
+CAPRIN1_pos_3 30 34 CAPRIN1_pos_3,32 0.299957 +
+CAPRIN1_pos_3 40 41 CAPRIN1_pos_3,41 0.078966 +
+CAPRIN1_pos_3 57 82 CAPRIN1_pos_3,78 1.790984 +
+CAPRIN1_pos_3 96 98 CAPRIN1_pos_3,98 0.126206 +
+CAPRIN1_pos_3 140 142 CAPRIN1_pos_3,141 0.143977 +
+CAPRIN1_pos_3 172 185 CAPRIN1_pos_3,179 0.757031 +
+CAPRIN1_pos_3 241 247 CAPRIN1_pos_3,245 0.911435 +
+CAPRIN1_pos_3 277 291 CAPRIN1_pos_3,283 1.753765 +
+CAPRIN1_pos_3 310 320 CAPRIN1_pos_3,315 1.424040 +
+CAPRIN1_pos_3 328 336 CAPRIN1_pos_3,333 0.966312 +
+CAPRIN1_pos_3 337 338 CAPRIN1_pos_3,338 0.024165 +
+CAPRIN1_pos_3 342 343 CAPRIN1_pos_3,343 0.014006 +
+CAPRIN1_pos_3 344 346 CAPRIN1_pos_3,345 0.263710 +
+CAPRIN1_pos_3 359 361 CAPRIN1_pos_3,361 0.208979 +
+CAPRIN1_pos_4 88 89 CAPRIN1_pos_4,89 0.043554 +
+CAPRIN1_pos_4 133 135 CAPRIN1_pos_4,135 0.054336 +
+CAPRIN1_pos_4 208 214 CAPRIN1_pos_4,211 0.461447 +
+CAPRIN1_pos_4 342 343 CAPRIN1_pos_4,343 0.075030 +
+CAPRIN1_pos_4 345 348 CAPRIN1_pos_4,346 0.252767 +
+CAPRIN1_pos_4 349 350 CAPRIN1_pos_4,350 0.025491 +
+CAPRIN1_pos_4 352 361 CAPRIN1_pos_4,360 1.156303 +
+CAPRIN1_pos_5 0 180 CAPRIN1_pos_5,144 3.468620 +
+CAPRIN1_pos_5 185 186 CAPRIN1_pos_5,186 0.413088 +
+CAPRIN1_pos_5 187 230 CAPRIN1_pos_5,206 2.698865 +
+CAPRIN1_pos_5 233 361 CAPRIN1_pos_5,255 4.101235 +
+CAPRIN1_pos_6 0 16 CAPRIN1_pos_6,6 2.833152 +
+CAPRIN1_pos_6 18 54 CAPRIN1_pos_6,30 2.728851 +
+CAPRIN1_pos_6 67 73 CAPRIN1_pos_6,70 1.351761 +
+CAPRIN1_pos_6 89 124 CAPRIN1_pos_6,115 2.903897 +
+CAPRIN1_pos_6 128 144 CAPRIN1_pos_6,137 2.748033 +
+CAPRIN1_pos_6 180 190 CAPRIN1_pos_6,185 1.185336 +
+CAPRIN1_pos_6 197 259 CAPRIN1_pos_6,242 3.821982 +
+CAPRIN1_pos_6 260 261 CAPRIN1_pos_6,261 0.282644 +
+CAPRIN1_pos_6 262 267 CAPRIN1_pos_6,263 0.374493 +
+CAPRIN1_pos_6 270 287 CAPRIN1_pos_6,279 2.644698 +
+CAPRIN1_pos_6 316 321 CAPRIN1_pos_6,321 0.287939 +
+CAPRIN1_pos_6 332 338 CAPRIN1_pos_6,336 0.716058 +
+CAPRIN1_pos_7 26 32 CAPRIN1_pos_7,32 0.200465 +
+CAPRIN1_pos_7 108 109 CAPRIN1_pos_7,109 0.077973 +
+CAPRIN1_pos_7 163 174 CAPRIN1_pos_7,169 1.392449 +
+CAPRIN1_pos_7 221 229 CAPRIN1_pos_7,227 0.872296 +
+CAPRIN1_pos_7 311 319 CAPRIN1_pos_7,314 0.966690 +
+CAPRIN1_pos_7 321 322 CAPRIN1_pos_7,322 0.088335 +
+CAPRIN1_pos_7 340 352 CAPRIN1_pos_7,346 0.958345 +
+CAPRIN1_pos_8 0 24 CAPRIN1_pos_8,4 2.212693 +
+CAPRIN1_pos_8 26 43 CAPRIN1_pos_8,36 2.396456 +
+CAPRIN1_pos_8 49 83 CAPRIN1_pos_8,67 2.872385 +
+CAPRIN1_pos_8 84 85 CAPRIN1_pos_8,85 0.234594 +
+CAPRIN1_pos_8 86 89 CAPRIN1_pos_8,88 0.387833 +
+CAPRIN1_pos_8 91 105 CAPRIN1_pos_8,98 1.964977 +
+CAPRIN1_pos_8 116 231 CAPRIN1_pos_8,197 4.077125 +
+CAPRIN1_pos_8 240 281 CAPRIN1_pos_8,261 3.376777 +
+CAPRIN1_pos_8 303 325 CAPRIN1_pos_8,314 1.203373 +
+CAPRIN1_pos_8 326 331 CAPRIN1_pos_8,328 0.159993 +
+CAPRIN1_pos_8 332 344 CAPRIN1_pos_8,340 1.695643 +
+CAPRIN1_pos_9 9 29 CAPRIN1_pos_9,16 1.910565 +
+CAPRIN1_pos_9 34 46 CAPRIN1_pos_9,42 1.168142 +
+CAPRIN1_pos_9 59 69 CAPRIN1_pos_9,66 1.618727 +
+CAPRIN1_pos_9 97 103 CAPRIN1_pos_9,102 0.724791 +
+CAPRIN1_pos_9 115 117 CAPRIN1_pos_9,117 0.249549 +
+CAPRIN1_pos_9 118 119 CAPRIN1_pos_9,119 0.330394 +
+CAPRIN1_pos_9 128 140 CAPRIN1_pos_9,132 1.607942 +
+CAPRIN1_pos_9 149 152 CAPRIN1_pos_9,150 0.234019 +
+CAPRIN1_pos_9 166 167 CAPRIN1_pos_9,167 0.069593 +
+CAPRIN1_pos_9 182 185 CAPRIN1_pos_9,185 0.261641 +
+CAPRIN1_pos_9 205 206 CAPRIN1_pos_9,206 0.178237 +
+CAPRIN1_pos_9 238 245 CAPRIN1_pos_9,241 0.940394 +
+CAPRIN1_pos_9 292 300 CAPRIN1_pos_9,297 1.004641 +
+CAPRIN1_pos_9 355 360 CAPRIN1_pos_9,357 0.213619 +
+CAPRIN1_pos_10 0 7 CAPRIN1_pos_10,1 1.409827 +
+CAPRIN1_pos_10 15 33 CAPRIN1_pos_10,24 2.611975 +
+CAPRIN1_pos_10 35 90 CAPRIN1_pos_10,81 2.810422 +
+CAPRIN1_pos_10 99 152 CAPRIN1_pos_10,132 4.458784 +
+CAPRIN1_pos_10 156 223 CAPRIN1_pos_10,214 2.699599 +
+CAPRIN1_pos_10 230 273 CAPRIN1_pos_10,254 2.109173 +
+CAPRIN1_pos_10 274 278 CAPRIN1_pos_10,276 0.320538 +
+CAPRIN1_pos_10 280 300 CAPRIN1_pos_10,292 2.092991 +
+CAPRIN1_pos_10 301 302 CAPRIN1_pos_10,302 0.013079 +
+CAPRIN1_pos_10 308 322 CAPRIN1_pos_10,316 1.456625 +
+CAPRIN1_pos_10 336 337 CAPRIN1_pos_10,337 0.008666 +
+CAPRIN1_pos_10 338 361 CAPRIN1_pos_10,351 3.314954 +
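The expected peaks above look consistent with taking each maximal run of positions whose score in test_predict.avg_profile is above zero, reporting a 0-based start, a 1-based end, and the 1-based position and score of the highest value in the run. The following is a minimal Python sketch of that reading, not the code from gplib.py: the function name `extract_peaks` and the strict `> threshold` comparison are assumptions made for illustration.

```python
def extract_peaks(scores, threshold=0.0):
    """Return (start, end, top_pos, top_score) for each run of scores > threshold.

    scores[0] is taken to belong to position 1 (assumption).
    start is 0-based; end and top_pos are 1-based, as in the BED rows above.
    """
    peaks = []
    start = None
    top_i = top_sc = None
    for i, sc in enumerate(scores):
        if sc > threshold:
            if start is None:
                # A new run begins here.
                start, top_i, top_sc = i, i, sc
            elif sc > top_sc:
                # Track the highest-scoring position inside the run.
                top_i, top_sc = i, sc
        elif start is not None:
            # The run ended at the previous position.
            peaks.append((start, i, top_i + 1, top_sc))
            start = None
    if start is not None:
        # A run that reaches the end of the profile.
        peaks.append((start, len(scores), top_i + 1, top_sc))
    return peaks


# Positions 5..10 of CAPRIN1_pos_1, copied from test_predict.avg_profile.
window = [-0.380379, 0.023880, 0.322847, -0.050280, 0.131240, -0.062237]
offset = 4  # window[0] is position 5, i.e. 0-based index 4
for s, e, top, sc in extract_peaks(window):
    print(s + offset, e + offset, top + offset, sc)
```

With the offset applied, the two runs correspond to the first two CAPRIN1_pos_1 rows of test_predict.avg_profile.peaks.bed (5 to 7 with top position 7, and 8 to 9 with top position 9). The threshold behind the p50 variants cannot be read off the test data, so it is left as a parameter here.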
diff -r 215925e588c4 -r 20429f4c1b95 test-data/test_predict.bed
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/test_predict.bed Wed Jan 22 10:14:41 2020 -0500
@@ -0,0 +1,10 @@
+chr4 53462318 53462679 CAPRIN1_pos_1 0 -
+chr3 196082480 196082841 CAPRIN1_pos_2 0 -
+chr18 659480 659841 CAPRIN1_pos_3 0 +
+chr14 45585024 45585385 CAPRIN1_pos_4 0 -
+chr19 58596054 58596415 CAPRIN1_pos_5 0 -
+chr4 109571760 109572121 CAPRIN1_pos_6 0 +
+chr8 42798357 42798718 CAPRIN1_pos_7 0 +
+chr1 11114680 11115041 CAPRIN1_pos_8 0 -
+chr1 117530502 117530863 CAPRIN1_pos_9 0 +
+chr22 43015518 43015879 CAPRIN1_pos_10 0 -
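The rows of test_predict.avg_profile.genomic_peaks.bed can be reproduced from the sequence-relative peaks in test_predict.avg_profile.peaks.bed and the site coordinates in test_predict.bed. The sketch below shows one mapping that matches the test data on both strands. The helper name `peak_to_genomic` is hypothetical and the formula is inferred from the files in this changeset, not taken from the tool's source.

```python
def peak_to_genomic(site_start, site_end, strand, pk_start, pk_end, pk_top):
    """Map a sequence-relative peak onto its genomic site.

    site_start and pk_start are 0-based; site_end, pk_end and pk_top are 1-based.
    Returns (genomic_start, genomic_end, genomic_top_position).
    """
    if strand == "+":
        # Plus strand: offsets are counted from the site start.
        return site_start + pk_start, site_start + pk_end, site_start + pk_top
    # Minus strand: the sequence runs from site_end downwards, so the
    # peak interval is mirrored around the site end.
    return site_end - pk_end, site_end - pk_start, site_end - pk_top + 1


# CAPRIN1_pos_3 (chr18, + strand), peak "0 19", top position 12
print(peak_to_genomic(659480, 659841, "+", 0, 19, 12))
# CAPRIN1_pos_1 (chr4, - strand), peak "5 7", top position 7
print(peak_to_genomic(53462318, 53462679, "-", 5, 7, 7))
```

The two calls give (659480, 659499, 659492) and (53462672, 53462674, 53462673), which agree with the corresponding chr18 and chr4 rows in test_predict.avg_profile.genomic_peaks.bed.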
diff -r 215925e588c4 -r 20429f4c1b95 test-data/test_predict.fa
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/test_predict.fa Wed Jan 22 10:14:41 2020 -0500
@@ -0,0 +1,20 @@
+>CAPRIN1_pos_1
+ACAAAATGGCGCTGTTTCTTTGGCTCAGACTCCTAGAATGCTTGACAAGACAGAATTTTTTTGGAAGAACCTCATCTCACTATAGTTACTTTTTTCACTTTTGTTATATATGTATTTATTAGAGCATTTGAATATTGGTACCTTTAAAAGGGTCATTTGGTGTTTTGCTGTTGAGCTGGTTTTTGAGTCATAGATCTTGGCTTCCTTTAGAAGCCACTTAACTTCCATACACTATAATAAACTGTGAACATATTTTTGTTACCTAATGCATCCACTGATGAACATGCAAACTTTGGGCATAATGTGAACTAAAATTGAAATGGAAAATGTTAGTGGCCATTTTGCAACAATGAAGAGGATA
+>CAPRIN1_pos_2
+GAGACAGTTTTCCTTATTACTCTGTATTGATCCTGCTAGTCCAAGAATGGACATGAAGTGAACCTATCGTGGTGACTGGGATAGGAAGGTGCTTGCTATTTTTGCCAGCACAGCATATTAGTTCCTTTGGAGCCCTCCATTGTCTGAGTCTGCAGTGATCTGTAGGAAGGCAGCTGGTCAATAATCATGTAGTACAATGGCTTGGAATTGTAACCACTATGGTTATTGATTGTCCTGTGTTGTTTCAGGCATACTTAGGTATGTCCCTGGGGAAAAAGAAAACCATTCAGCTGAGAGTTGCTAACCATGTTCTTTTGGTTAGAAATAATGGTTCATTTTTTGCCCCTGGTTGGAATAGTCT
+>CAPRIN1_pos_3
+GCCTGGGAGCTCCACCGTGATCTCTGGCCCACTTTGCGGGAGTCTAGGCTTTCTGGATGCTCCAGGGCCTCACGTCCCAGGGCAGTTTTCTTCCCTGAAGAAAGTTGGATGGCATGATCTGTCTTCCCATCTTGAAACCGTATGGCAAATTGTTTTTCAGATGAATTCCCTCTGCTGACAACCAAACGTGTGTTCTGGAAGGGTGTTTTGGAGGAGTTGCTGTGGTTTATCAAGGTAAAGAAGTCGCTGCTATTAGAAGTCAGTAGTCTGTTCTCAACACAGCAGCCAGTGAGATCCTTTCAAAACTCAAAGCAGCCAGGTGTGGTGGCTCACGCCTGTAATCCCACCACTTTGGGAGGCT
+>CAPRIN1_pos_4
+GCTCTAAGGATATTAGCAACAATGATAAAACTTGGCCTTGAAGAAATTTACACAACTAGTTAGAACTTGTTACTATTGTAAAGGAAGAGTCAACTGGAAAATTCAAGGAGTTAATAAAATTTGTTTACTTGGTCCCAGCTTTTGAGAGATAAATCCCTTATGAATCCCTGGTCTAAAATACTTTCCTACAGCTGTGTAAAATACTGGTCAAGGAGAACTTTTTCCTTTTACCTCATGTTGTAAACTTAAGTGGCTCAATAAAAATTGATCCACTGTCTTGATCTGACTGTGATTTGTTTGGTGTTAATATATTATTGTTCTGATTAGGGAATTGGAACTAAGGACCTCATCAAACCAGGGA
+>CAPRIN1_pos_5
+CAGCGCAGGGCTGGAGGCCGGGCAGGGCCCTGGGGCTGACGAGCCGGGCTTGTCCCGCGGGAAGCCCTATGCCTGCGGCGAGTGCGGGGAGGCCTTCGCGTGGCTCTCGCACCTGATGGAGCACCACAGCAGCCATGGCGGCCGGAAGCGCTACGCCTGTCAGGGCTGCTGGAAGACCTTCCACTTCAGCCTGGCCCTAGCCGAGCACCAGAAGACCCACGAGAAGGAGAAAAGCTACGCGCTGGGGGGCGCCCGGGGCCCCCAACCGTCCACCCGCGAAGCCCAGGCGGGGGCTAGGGCGGGCGGTCCCCCAGAGAGCGTGGAGGGCGAGGCTCCCCCCGCACCCCCAGAGGCGCAGAGG
+>CAPRIN1_pos_6
+CCGGGAGGCGCGTGGGGCTTGAGGCCGAGAACGGCCCTTGCTGCCACCAACATGGAGACTTTGTACCGTGTCCCGTTCTTAGTGCTCGAATGTCCCAACCTGAAGCTGAAGAAGCCGCCCTGGTTGCACATGCCGTCGGCCATGACTGTGTATGCTCTGGTGGTGGTGTCTTACTTCCTCATCACCGGAGGTAACTCGGGCTGTCGGGCCCGAGAGGCTGAGGAGCGGAGAACTGACCCGCCCCGGGAGGCGCGTCTGTTCCGCTGACTCTCAGCCCCGGGGGAAGCATAGCTCTGCTTTGGATCTTTTCTGAGGGTGGAGGGGAGTTCTGGGGTCCGAAGTGTTAACGTCCAAGTTTATT
+>CAPRIN1_pos_7
+GGCAATGAGTTTCTGGTAATAACAAACCTCGATCAAGTTTTTTTTTTTTTTGGTCATATAAATGTAAATGCAATTATAATCCTTTTATCTCCATTCAGATTTTAGGACAGCAAATTAATGACTTTACCCTTCCTGATGTGAACCTTATTGGGGAGCATTCTGATGCAGCAGAGCTTGGAAGGATGCTTCAGCTCATCTTAGGCTGTGCTGTGAACTGTGAACAGAAGCAAGGTAATTTGTTTCAAGTTAGTGTTTGCATTTAAAATAATGAAGGAAAGTAGTGTATATTTTATATTTTATATATAAATCACCAGTACCAACAGTAACATCTGTTTAAGGTCACCCCCAAAGAATGCCATTT
+>CAPRIN1_pos_8
+TGCTGCCCGAGTTTGCCCGCAAGGTGGGTGGCCTGCGGGGCTGGGTGGTGGGACCCAGGGACCCAGAGCGCCCTCCTGACTGGCCTCATGTCCCTCCAGGCCCTGAATGATGTGAGCTGAGCCCAGGCGCCACCACTGATGCCACCCAGGACCTCGGACCTTGGAGCCTGCGGGGTGCCTCGGCCCCTCCAGCCCCGGGCCGGACCTCCTGCTGGCTCTCGCCCACCAACCAAGTGTTACAAGCCCCAGAATGCTGCCCGGCCTGCCCTGCTGGGCGGACTGTCTGTGTGTCTGTCTCTCTGGCGTTCCACCTCCAAGCCTATACCAGCTGTGTACAGCGCCATCTCTCTGCCTTCTGTTG
+>CAPRIN1_pos_9
+TATTCTTACGCAGCTGGGGCCAGGAGGGTCAGAGTGGTGCCAGGTGCAAGTTAGGCTAAAGAAGCCACCACTATTCCTCTCTCTTGCCCATTGTGGGGGGCAAAGGCATTGGTCACCAAGAGTCTTGCAGGGGGACCCACAGATATGCCATGTCCTTCACACGTGCTTGGGCTCCTTAACCTGAAGGCAAATTGCTACTTGCAAGACTGACTGACTTCAAGGAATCAGAAATTACCTAGAAGCACCATGTTTTTTCTATGACCTTTTCAGTCCTTCAGGTCATTTTAAGGTCCACTGCAGGGGGTTAGTGAGAAAGGGTATACTTTGTGGTATGTTTTGCTTTCCTAATAGGGACATGAAG
+>CAPRIN1_pos_10
+CGCTGGTGCTGATGTGTGGCCCCCCACCCATGATCCAGTACGCCTGCCTTCCCAACCTGGACCACGTGGGCCACCCCACGGAGCGCTGCTTCGTCTTCTGAGGGCCGGGCACGGTCACACGGCCACCCGCCCCGCGCACCCCACGCCCTGTTCACGCTCACCCAGTCACCTCCCCACATCGCACACTGGGGCCCCGGGTTCAGCCTGGCCTGCCCGTGCCCTGGTGAATCACCTGGCTGAGCAGTTCCCCTGGAGCCCCTTCGGGAGCAGGGCTGTGTCCCAGATGGGCCACGGCTGAGCCTTCAGAGTACGTCCTGCCTGGCACTTACTGGTCCTTACCAGAGACGCCCAGCCCCATCCC
diff -r 215925e588c4 -r 20429f4c1b95 test-data/test_predict.p50.predictions
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/test_predict.p50.predictions Wed Jan 22 10:14:41 2020 -0500
@@ -0,0 +1,3 @@
+CAPRIN1_pos_5 1 1.47084
+CAPRIN1_pos_8 1 0.79418
+CAPRIN1_pos_10 1 0.959388
diff -r 215925e588c4 -r 20429f4c1b95 test-data/test_predict.predictions
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/test_predict.predictions Wed Jan 22 10:14:41 2020 -0500
@@ -0,0 +1,10 @@
+CAPRIN1_pos_1 -1 -1.61295
+CAPRIN1_pos_2 -1 -1.16685
+CAPRIN1_pos_3 -1 -0.565276
+CAPRIN1_pos_4 -1 -1.53879
+CAPRIN1_pos_5 1 1.47084
+CAPRIN1_pos_6 1 0.393163
+CAPRIN1_pos_7 -1 -1.43986
+CAPRIN1_pos_8 1 0.79418
+CAPRIN1_pos_9 -1 -0.74858
+CAPRIN1_pos_10 1 0.959388