changeset 0:28d1a6f8143f draft

planemo upload for repository https://github.com/portiahollyoak/Tools commit 132bb96bba8e7aed66a102ed93b7744f36d10d37-dirty
author portiahollyoak
date Mon, 25 Apr 2016 13:08:56 -0400
parents
children 39cbc0965e07
files Manual scripts/TEMP_Absence.sh scripts/TEMP_Insertion.sh scripts/cmd.total.sh scripts/excision.clustering.pl scripts/filterFalsePositive.ex.pl scripts/filterFalsePositive.in.pl scripts/generate_density_json.pl scripts/get_class.pl scripts/make.bp.bed.pl scripts/mergeTagsWithGap.pl scripts/mergeTagsWithoutGap.pl scripts/pickClippedFastq.pl scripts/pickOverlapPair.ex.pl scripts/pickOverlapPair.in.pl scripts/pickSoftClipping.over.pl scripts/pickUniqIntervalPos.pl scripts/pickUniqMate.pl scripts/pickUniqPairFastq.pl scripts/pickUniqPairFastq_MEM.pl scripts/pickUniqPos.pl scripts/pickUniqPos_MEM.pl scripts/refine_breakpoint.ex.pl scripts/refine_breakpoint.in.pl scripts/summarize_excision.pl temp.xml test-data/README test-data/dm3_chr2L.2bit test-data/test_TE_annotation.bed test-data/test_chromosome.sites test-data/test_chromosome.sorted.bam test-data/test_chromosome.sorted.bam.bai test-data/test_concensus.fa
diffstat 33 files changed, 3703 insertions(+), 0 deletions(-)
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/Manual	Mon Apr 25 13:08:56 2016 -0400
@@ -0,0 +1,186 @@
+TEMP (Transposable Element Movement in Population) Manual
+
+
+2015.01.09
+
+
+TEMP is a software package designed to 1) detect transposable element (TE) insertions and absences relative to the reference genome, 2) define the genome-TE junctions at up to base-pair resolution when possible, and 3) estimate the population frequency of the detected insertions and absences.
+This document explains how to run TEMP, which options to use, and how to interpret the outputs. If you have any questions or find any bugs, please contact Jiali Zhuang at jiali.zhuang@umassmed.edu.
+
+
+
+Requirements and installation
+
+
+TEMP runs on Linux x86_64 systems. 
+The following software is required by TEMP and should be on the PATH:
+Samtools (http://samtools.sourceforge.net/),
+bedtools (http://code.google.com/p/bedtools/),
+bwa (http://sourceforge.net/projects/bio-bwa/),
+twoBitToFa (http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/twoBitToFa).
+The Perl package BioPerl is also required for running TEMP (http://www.bioperl.org/wiki/Main_Page).
+
+To install TEMP, just unzip and untar the archive.
+The directory TEMP_v1.01/ contains two bash scripts, TEMP_Insertion.sh and TEMP_Absence.sh, for TE insertion and absence analysis, respectively.
+
+
+
+
+Options
+
+
+For TEMP_Insertion.sh the arguments to the options are explained below:
+
+
+       -i    Input file in bam format with full path. Users need to map the reads to the reference genome with a mapping tool such as BWA (http://bio-bwa.sourceforge.net/). Please sort and index the bam file before calling TEMP. Sorting and indexing can be done with 'samtools sort' and 'samtools index'.
+
+
+       -s    The full path to the scripts in directory TEMP_v1.0/.
+
+
+       -o    The full path to output directory. Default is current directory.
+
+
+       -r    Transposon consensus sequences in fasta format with full path. Such files can be downloaded from Repbase (http://www.girinst.org/repbase/).
+
+
+       -t    Annotated transposon positions in the genome (e.g., RepeatMasker) in bed6 format with full path.
+       
+       
+       -u    Families of transposable elements in tab-delimited format (the first column is the name of the element and the second column is its family). Only use together with -t.
+
+
+       -x    The minimum score difference between the best hit and the second best hit for considering a read as uniquely mapped. The higher the score, the more stringent the criterion. Use this option with BWA mem, which does not produce the XT:A: tag.
+       
+       
+       -m    Number of mismatches allowed when mapping to TE consensus sequences. Default is 3.
+
+
+       -f    An integer specifying the length of the fragments (inserts) of the library. Default is 500.
+
+
+       -c    An integer specifying the number of CPUs used. Default is 8.
+
+
+       -h    Show help message.
+
+
+
+
+For TEMP_Absence.sh the arguments to the options are explained below:
+
+
+       -i    Input file in bam format with full path. Users need to map the reads to the reference genome with a mapping tool such as BWA (http://bio-bwa.sourceforge.net/). Please sort and index the bam file before calling TEMP. Sorting and indexing can be done with 'samtools sort' and 'samtools index'.
+
+
+       -s    The full path to the scripts in directory TEMP_v1.0/.
+
+
+       -o    Path to output directory. Default is current directory.
+
+
+       -r    Annotated transposon positions in the genome (e.g., RepeatMasker) in bed6 format with full path. For major model organisms, such a file can be downloaded from the UCSC Genome Browser. On the Table Browser page, choose “variation and repeats” in the group tab and “RepeatMasker” in the track tab.
+
+
+       -t    2bit file for the reference genome. Such a file can be downloaded from the UCSC Genome Browser. On the Downloads page, choose the right genome, click on the “Full data set” link and download the *.2bit file.
+
+
+       -f    An integer specifying the length of the fragments (inserts) of the library. Default is 500.
+
+
+       -c    An integer specifying the number of CPUs used. Default is 4.
+
+
+       -h    Show help message.
+
+
+
+
+Output files
+
+
+For TE insertion analysis, the summary output file has the suffix: .insertion.refined.bp.summary.
+
+
+There are 14 columns in the summary file and their meanings are listed below:
+Column 1: The chromosome where the detected insertion happens.
+Column 2: The coordinate of the start position of the detected insertion.
+Column 3: The coordinate of the end position of the detected insertion.
+Column 4: The TE family that the detected insertion belongs to. 
+Column 5: The direction of the insertion. “Plus” means that the TE is integrated with the plus strand of the genome while “minus” means the TE is integrated with the minus strand.
+Column 6: The class of the insertion. “1p1” means that the detected insertion is supported by reads at both sides. “2p” means the detected insertion is supported by more than 1 read at only 1 side. “Singleton” means the detected insertion is supported by only 1 read at 1 side. 
+Column 7: The total number of read pairs that support the detected insertion. 
+Column 8: The estimated population frequency of the detected insertion.
+Columns 9 & 10: The coordinate of a junction and the number of the reads supporting it. If the junction is not found column 9 will be the arithmetic mean of the start and end coordinates and column 10 will have the value 0.
+Columns 11 & 12: Same as Columns 9 & 10 except for the junction on the other strand. 
+Column 13: The number of reads supporting the detected insertion at the 5’ end of the TE (not including junction spanning reads).
+Column 14: The number of reads supporting the detected insertion at the 3’ end of the TE (not including junction spanning reads).
+
+
+
+
+For TE absence analysis, the summary output file has the suffix: .absence.refined.bp.summary.
+
+
+There are 9 columns in the summary file and their meanings are listed below:
+Column 1: The chromosome where the detected absence happens.
+Column 2: The coordinate of the start position of the detected absence.
+Column 3: The coordinate of the end position of the detected absence.
+Column 4: The TE family that the detected absence belongs to.
+Column 5: Junctions at 5’ of the excised TE. The two numbers are the coordinates of the junctions on the two strands.
+Column 6: Junctions at 3’ of the excised TE. The two numbers are the coordinates of the junctions on the two strands. 
+Column 7: The number of reads supporting the absence.
+Column 8: The number of reads supporting the reference (no absence).
+Column 9: Estimated population frequency of the detected absence event.
+
+
+
+
+
+Visualization
+
+Since v1.01, TEMP can visualize the distribution of predicted TE insertions across the genome using Dr. Xiaopeng Zhu's visualization tool "circosjs".
+
+The procedure involves two steps:
+1) Generate the JSON objects file from the TEMP detected TE insertions. 
+This can be done easily by running the script "generate_density_json.pl": e.g.
+perl generate_density_json.pl input.insertion.bp.summary ref.chromInfo 500000
+
+This script takes 3 parameters: (1) the TE insertions predicted by TEMP (i.e., the output file produced by TEMP_Insertion.sh);
+                                (2) a file containing the sizes of all the chromosomes in the reference genome (the chromInfo files for model organism genomes can be downloaded from the UCSC Genome Browser);
+                                (3) the size of the genomic bins (500kb in the above example); the total number of predicted TE insertions in each bin will be calculated and plotted later.
+                                
+2) Visualization of the distribution of TE insertions across the genome.
+Dr. Xiaopeng Zhu (https://twitter.com/nimezhu) at UMass Medical School developed a powerful web-based visualization tool that is available at: http://circos.zhu.land/
+The user only needs to upload the JSON file generated in step 1 in the "read local file" section.
+
+Please forward any questions and suggestions about the website to Dr. Zhu: xiaopeng.zhu@umassmed.edu
+
+
+
+
+
+
+Test datasets
+
+We put together two datasets for testing TEMP. 
+
+One is a simulated set generated using Drosophila melanogaster chromosome 2L as the template. It is distributed along with this package.
+
+The recommended command-line invocation for this test set is:
+git clone https://github.com/JialiUMassWengLab/TEMP.git
+cd TEMP
+tar -xvzf test_dataset.tar.gz
+cd test_dataset/
+bash ../scripts/TEMP_Insertion.sh -i ./test_chromosome.sorted.bam -s ../scripts -r ./test_concensus.fa -t ./test_TE_annotation.bed -m 3 -f 500 -c 8
+bash ../scripts/TEMP_Absence.sh -i ./test_chromosome.sorted.bam -s ../scripts -r ./test_TE_annotation.bed -t ./dm3_chr2L.2bit -f 500 -c 4
+
+The other one is derived from chromosome 11 of 8 individuals from the 1000 Genomes Project. It's available at http://zlab.umassmed.edu/~zhuangj/TEMP_resources/Human_test_dataset.tar.gz.
+The recommended command-line invocation for this test set is:
+git clone https://github.com/JialiUMassWengLab/TEMP.git
+cd TEMP
+wget http://bib.umassmed.edu/~zhuangj/TEMP_resources/Human_test_dataset.tar.gz
+tar -zxvf Human_test_dataset.tar.gz
+cd Human_test_dataset
+bash ../scripts/TEMP_Insertion.sh -i ./chrom11.test.sorted.bam -s ../scripts -r ./HomoSapienRepbaseTEConcensus.fa -t ./hg19_rpmk.bed -m 3 -f 500 -c 8
+bash ../scripts/TEMP_Absence.sh -i ./chrom11.test.sorted.bam -s ../scripts -r ./hg19_rpmk.bed -t ./hg19.2bit -f 500 -c 4
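Note on the Manual's output columns: the layout described for the insertion summary lends itself to simple post-filtering. What follows is a minimal sketch, assuming the summary is tab-delimited with the columns in the documented order, that the class column holds the literal string 1p1, and that the frequency in column 8 is a plain decimal number. The function name filter_insertions and the 0.1 threshold in the usage line are illustrative and not part of TEMP.

```shell
#!/bin/bash
# Hypothetical helper (not part of TEMP): keep insertions supported on both
# sides (class "1p1", column 6) whose estimated population frequency
# (column 8) is at least the given threshold.
# Assumes a tab-delimited summary laid out as described in the Manual.
filter_insertions() {
    local summary=$1 min_freq=$2
    awk -F '\t' -v min="$min_freq" \
        '$6 == "1p1" && ($8 + 0) >= (min + 0)' "$summary"
}
```

Usage would look like `filter_insertions sample.insertion.refined.bp.summary 0.1`. A header line, if the file has one, is dropped automatically because its sixth field is not 1p1.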
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/scripts/TEMP_Absence.sh	Mon Apr 25 13:08:56 2016 -0400
@@ -0,0 +1,138 @@
+#!/bin/bash -x
+# TEMP (Transposable Element Movement present in a Population)
+# 2013-06-14
+# Jiali Zhuang(jiali.zhuang@umassmed.edu)
+# Zhiping Weng Lab
+# Programs in Bioinformatics and Integrative Biology
+# University of Massachusetts Medical School
+
+#usage function
+usage() {
+echo -en "\e[1;36m"
+cat <<EOF
+
+usage: $0 -i input_file.sorted.bam -s scripts_directory -o output_directory -r transposon_rpmk.bed -t reference.2bit -f fragment_size -c CPUs -h 
+
+TEMP is a software package for detecting transposable element (TE) 
+insertions and excisions from pooled high-throughput sequencing data. 
+Please send questions, suggestions and bug reports to:
+jiali.zhuang@umassmed.edu
+
+Options:
+        -i     Input file in bam format with full path. Please sort and index the file before calling this program. 
+               Sorting and indexing can be done by 'samtools sort' and 'samtools index'
+        -s     Directory where all the scripts are
+        -o     Path to output directory. Default is current directory
+        -r     Annotated transposon positions in the genome (e.g., RepeatMasker) in bed6 format with full path
+        -t     2bit file for the reference genome (can be downloaded from UCSC Genome Browser)
+        -f     An integer specifying the length of the fragments (inserts) of the library. Default is 500
+        -c     An integer specifying the number of CPUs used. Default is 4
+        -h     Show help message
+
+EOF
+echo -en "\e[0m"
+}
+
+# taking options
+while getopts "hi:c:f:o:r:s:t:" OPTION
+do
+        case $OPTION in
+                h)
+                        usage && exit 1
+		;;
+                i)
+                        BAM=$OPTARG
+		;;
+	        f)
+		        INSERT=$OPTARG
+		;;
+                o)
+                        OUTDIR=$OPTARG
+                ;;
+                c)
+                        CPU=$OPTARG
+                ;;
+                s)
+                        BINDIR=$OPTARG
+                ;;
+	        r)
+		        TEBED=$OPTARG
+		;;
+	        t)
+		        REF=$OPTARG
+		;;
+                ?)
+                        usage && exit 1
+                ;;
+        esac
+done
+
+if [[ -z $BAM ]] || [[ -z $BINDIR ]] || [[ -z $TEBED ]] || [[ -z $REF ]]
+then
+        usage && exit 1
+fi
+[ ! -z "${CPU##*[!0-9]*}" ] || CPU=4
+[ ! -z "${INSERT##*[!0-9]*}" ] || INSERT=500
+[ -n "$OUTDIR" ] || OUTDIR=$PWD
+
+mkdir -p "${OUTDIR}" || echo -e "\e[1;31mWarning: Cannot create directory ${OUTDIR}. Using the directory of input fastq file\e[0m"
+cd "${OUTDIR}" || { echo -e "\e[1;31mError: Cannot access directory ${OUTDIR}... Exiting...\e[0m"; exit 1; }
+touch "${OUTDIR}/.writting_permission" && rm -rf "${OUTDIR}/.writting_permission" || { echo -e "\e[1;31mError: Cannot write in directory ${OUTDIR}... Exiting...\e[0m"; exit 1; }
+
+function checkExist {
+        echo -ne "\e[1;32m\"${1}\" is using: \e[0m" && which "$1"
+        [[ $? != 0 ]] && echo -e "\e[1;36mError: cannot find software/function ${1}! Please make sure that you have installed the pipeline correctly.\nExiting...\e[0m" && exit 1
+}
+echo -e "\e[1;35mTesting required softwares/scripts:\e[0m"
+checkExist "echo"
+checkExist "rm"
+checkExist "mkdir"
+checkExist "date"
+checkExist "mv"
+checkExist "sort"
+checkExist "touch"
+checkExist "awk"
+checkExist "grep"
+checkExist "bwa"
+checkExist "samtools"
+echo -e "\e[1;35mDone with testing required softwares/scripts, starting pipeline...\e[0m"
+
+name=`basename $BAM`
+i=${name/.sorted.bam/}
+echo $name
+echo $i
+if [[ ! -s $name ]]
+then
+    cp $BAM ./
+fi
+if [[ ! -s $name.bai ]]
+then cp $BAM.bai ./
+fi
+
+#Detect excision sites
+samtools view -XF 0x2 $name > $i.unpair.sam
+awk -F "\t" '{OFS="\t"; if ($9 != 0) print $0}' $i.unpair.sam > temp1.sam
+perl $BINDIR/pickUniqIntervalPos.pl temp1.sam $INSERT > $i.unproper.uniq.interval.bed
+
+rm temp1.sam $i.unpair.sam
+
+# Sometimes $i.unproper.uniq.interval.bed contains malformed bed entries 
+# These must be removed to prevent the script failing
+awk '{if ($3>=$2 && $3 > 0 && $2 > 0) print $0}' $i.unproper.uniq.interval.bed > $i.unproper.uniq.interval.fixed.bed
+mv $i.unproper.uniq.interval.fixed.bed $i.unproper.uniq.interval.bed
+
+# Map to transposons
+bedtools intersect -a $TEBED -b $i.unproper.uniq.interval.bed -f 1.0 -wo > temp
+perl $BINDIR/filterFalsePositive.ex.pl temp $INSERT $i.final.pairs.rpmk.bed
+bedtools intersect -a $TEBED -b $i.final.pairs.rpmk.bed -f 1.0 -wo > temp2
+
+perl $BINDIR/excision.clustering.pl temp2 $i.excision.cluster.rpmk
+rm temp temp2 $i.unproper.uniq.interval.bed $i.final.pairs.rpmk.bed
+
+# Identify breakpoints using soft-clipping information
+perl $BINDIR/pickSoftClipping.over.pl $i.excision.cluster.rpmk $REF > $i.excision.cluster.rpmk.sfcp
+perl $BINDIR/refine_breakpoint.ex.pl
+
+# Estimate excision sites frequencies
+perl $BINDIR/pickOverlapPair.ex.pl $i.excision.cluster.rpmk.refined.bp > $i.excision.cluster.rpmk.refined.bp.refsup
+perl $BINDIR/summarize_excision.pl
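Note on the option defaults in TEMP_Absence.sh: the script relies on a compact parameter-expansion test. `${VAR##*[!0-9]*}` deletes the whole value if it contains any non-digit character, so the result is non-empty only for a purely numeric string. Below is a small standalone sketch of that idiom. The function name int_or_default is made up for illustration and does not appear in the pipeline.

```shell
#!/bin/bash
# Illustrative helper (hypothetical name): echo the value if it is a
# non-empty string of digits, otherwise echo the fallback.
int_or_default() {
    local value=$1 fallback=$2
    # "##*[!0-9]*" removes the longest prefix matching
    # "anything, one non-digit, anything"; if any non-digit is present,
    # the entire string is removed and the test below fails.
    if [ -n "${value##*[!0-9]*}" ]; then
        echo "$value"
    else
        echo "$fallback"
    fi
}
```

For example, `CPU=$(int_or_default "$CPU" 4)` would keep a user-supplied integer and fall back to 4 for an empty or malformed value, which is the behaviour the one-line tests in the script aim for.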
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/scripts/TEMP_Insertion.sh	Mon Apr 25 13:08:56 2016 -0400
@@ -0,0 +1,176 @@
+#!/bin/bash -x
+# TEMP (Transposable Element Movement present in a Population)
+# 2013-06-14
+# Jiali Zhuang(jiali.zhuang@umassmed.edu)
+# Zhiping Weng Lab
+# Programs in Bioinformatics and Integrative Biology
+# University of Massachusetts Medical School
+
+#usage function
+usage() {
+echo -en "\e[1;36m"
+cat <<EOF
+
+usage: $0 -i input_file.sorted.bam -s scripts_directory -o output_directory -r transposon_database.fa -t annotated_TEs.bed -m MISMATCH -f fragment_size -c CPUs -h 
+
+TEMP is a software package for detecting transposable element (TE) 
+insertions and excisions from pooled high-throughput sequencing data. 
+Please send questions, suggestions and bug reports to:
+jiali.zhuang@umassmed.edu
+
+Options:
+        -i     Input file in bam format with full path. Please sort and index the file before calling this program. 
+               Sorting and indexing can be done by 'samtools sort' and 'samtools index'
+        -s     Directory where all the scripts are
+        -o     Path to output directory. Default is current directory
+        -r     Transposon sequence database in fasta format with full path
+        -t     Annotated TEs in BED6 format with full path. Detected insertions that overlap with annotated TEs will be filtered. 
+        -u     TE family annotations. If supplied, detected insertions that overlap with an annotated TE of the same family will be filtered. Only use with -t.
+        -m     Number of mismatches allowed when mapping to TE consensus sequences. Default is 3
+        -x     The minimum score difference between the best hit and the second best hit for considering a read as uniquely mapped. For BWA mem. 
+        -f     An integer specifying the length of the fragments (inserts) of the library. Default is 500
+        -c     An integer specifying the number of CPUs used. Default is 8
+        -h     Show help message
+
+EOF
+echo -en "\e[0m"
+}
+
+# taking options
+while getopts "hi:c:f:m:o:r:s:t:u:x:" OPTION
+do
+        case $OPTION in
+                h)
+                        usage && exit 1
+		;;
+                i)
+                        BAM=$OPTARG
+		;;
+	        f)
+		        INSERT=$OPTARG
+		;;
+	        m)
+		        MM=$OPTARG
+		;;
+                o)
+                        OUTDIR=$OPTARG
+                ;;
+                c)
+                        CPU=$OPTARG
+                ;;
+                s)
+                        BINDIR=$OPTARG
+                ;;
+	        r)
+		        TESEQ=$OPTARG
+		;;
+	        t)
+                        ANNO=$OPTARG
+                ;;
+                u)
+                        FAMI=$OPTARG
+                ;;
+	        x)
+		        SCORE=$OPTARG
+		;;
+                ?)
+                        usage && exit 1
+                ;;
+        esac
+done
+
+if [[ -z $BAM ]] || [[ -z $BINDIR ]] || [[ -z $TESEQ ]]
+then
+        usage && exit 1
+fi
+[ ! -z "${CPU##*[!0-9]*}" ] || CPU=8
+[ ! -z "${INSERT##*[!0-9]*}" ] || INSERT=500
+[ ! -z "${MM##*[!0-9]*}" ] || MM=3
+[ ! -z "${SCORE##*[!0-9]*}" ] || SCORE=0
+[ -n "$OUTDIR" ] || OUTDIR=$PWD
+
+mkdir -p "${OUTDIR}" || echo -e "\e[1;31mWarning: Cannot create directory ${OUTDIR}. Using the directory of input fastq file\e[0m"
+cd "${OUTDIR}" || { echo -e "\e[1;31mError: Cannot access directory ${OUTDIR}... Exiting...\e[0m"; exit 1; }
+touch "${OUTDIR}/.writting_permission" && rm -rf "${OUTDIR}/.writting_permission" || { echo -e "\e[1;31mError: Cannot write in directory ${OUTDIR}... Exiting...\e[0m"; exit 1; }
+
+function checkExist {
+        echo -ne "\e[1;32m\"${1}\" is using: \e[0m" && which "$1"
+        [[ $? != 0 ]] && echo -e "\e[1;36mError: cannot find software/function ${1}! Please make sure that you have installed the pipeline correctly.\nExiting...\e[0m" && exit 1
+}
+echo -e "\e[1;35mTesting required softwares/scripts:\e[0m"
+checkExist "echo"
+checkExist "rm"
+checkExist "mkdir"
+checkExist "date"
+checkExist "mv"
+checkExist "sort"
+checkExist "touch"
+checkExist "awk"
+checkExist "grep"
+checkExist "bwa"
+checkExist "samtools"
+echo -e "\e[1;35mDone with testing required softwares/scripts, starting pipeline...\e[0m"
+
+cp $TESEQ ./
+name=`basename $BAM`
+te=`basename $TESEQ`
+i=${name/.sorted.bam/}
+echo $name
+echo $i
+if [[ ! -s $name ]]
+then
+    cp $BAM ./
+fi
+if [[ ! -s $name.bai ]]
+then cp $BAM.bai ./
+fi
+
+# Get the mate seq of the uniq-unpaired reads
+samtools view -XF 0x2  $name > $i.unpair.sam
+if [[ $SCORE -eq 0 ]]
+then
+    perl $BINDIR/pickUniqPairFastq.pl $i.unpair.sam $i.unpair.uniq
+    perl $BINDIR/pickUniqPos.pl $i.unpair.sam > $i.unpair.uniq.bed
+else
+    perl $BINDIR/pickUniqPairFastq_MEM.pl $i.unpair.sam $i.unpair.uniq $SCORE
+    perl $BINDIR/pickUniqPos_MEM.pl $i.unpair.sam $SCORE > $i.unpair.uniq.bed 
+fi
+
+# Map to transposons
+bwa index -a is $te
+bwa aln -t $CPU -n $MM -l 100 -R 1000 $te $i.unpair.uniq.1.fastq > $i.unpair.uniq.1.sai
+bwa aln -t $CPU -n $MM -l 100 -R 1000 $te $i.unpair.uniq.2.fastq > $i.unpair.uniq.2.sai
+bwa sampe -P $te $i.unpair.uniq.1.sai $i.unpair.uniq.2.sai $i.unpair.uniq.1.fastq $i.unpair.uniq.2.fastq > $i.unpair.uniq.transposons.sam
+
+
+#Summary
+samtools view -hSXF 0x2 $i.unpair.uniq.transposons.sam > $i.unpair.uniq.transposons.unpair.sam
+perl $BINDIR/pickUniqMate.pl $i.unpair.uniq.transposons.unpair.sam $i.unpair.uniq.bed > $i.unpair.uniq.transposons.bed
+cp $i.unpair.uniq.transposons.bed $i.unpair.uniq.transposons.filtered.bed
+
+
+#Prepare for insertion breakpoints identification
+awk -F "\t" -v sample=$i '{OFS="\t"; print $1,$2,$3,sample,$5,$6}' $i.unpair.uniq.transposons.filtered.bed >> tmp
+perl $BINDIR/mergeTagsWithoutGap.pl tmp > $i.uniq.transposons.filtered.woGap.bed
+perl $BINDIR/mergeTagsWithGap.pl $i.uniq.transposons.filtered.woGap.bed $INSERT > $i.uniq.transposons.filtered.wGap.bed
+rm tmp
+perl $BINDIR/get_class.pl $i.uniq.transposons.filtered.wGap.bed $i > $i.uniq.transposons.filtered.wGap.class.bed
+perl $BINDIR/make.bp.bed.pl $i.uniq.transposons.filtered.wGap.class.bed $ANNO $FAMI
+
+#rm $i.unpair.sam $i.unpair.uniq.bed $i.unpair.uniq.?.fastq $i.unpair.uniq.?.sai 
+rm $i.unpair.uniq.transposons.sam $i.unpair.uniq.transposons.unpair.sam $i.uniq.transposons.filtered.woGap.bed $i.uniq.transposons.filtered.wGap.bed
+
+
+#Detect insertion breakpoints using soft-clipping information
+perl $BINDIR/pickClippedFastq.pl $i $te
+perl $BINDIR/refine_breakpoint.in.pl
+
+
+#Estimate insertion frequencies
+perl $BINDIR/pickOverlapPair.in.pl $i.insertion.refined.bp $INSERT > $i.insertion.refined.bp.summary
+
+
+################################
+##End of processing insertions##
+################################
+
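Note on sample naming in the two wrapper scripts: both derive the sample prefix with `${name/.sorted.bam/}`, which removes the first occurrence of `.sorted.bam` anywhere in the file name and leaves any other name unchanged. A hedged sketch of a suffix-only alternative follows. sample_prefix is a hypothetical name, and stripping a plain .bam suffix as a fallback is an assumption rather than the pipeline's behaviour.

```shell
#!/bin/bash
# Hypothetical helper: derive a sample prefix from a BAM path by stripping
# the directory and a trailing ".sorted.bam" (or, as an assumed fallback,
# a trailing ".bam"). Other names are returned unchanged.
sample_prefix() {
    local name
    name=$(basename "$1")
    case $name in
        *.sorted.bam) echo "${name%.sorted.bam}" ;;
        *.bam)        echo "${name%.bam}" ;;
        *)            echo "$name" ;;
    esac
}
```

With this, `sample_prefix ./test_chromosome.sorted.bam` yields the same prefix the scripts use for the bundled test data.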
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/scripts/cmd.total.sh	Mon Apr 25 13:08:56 2016 -0400
@@ -0,0 +1,26 @@
+rm tmp
+#for i in flamBGFM flamKGFM flamEmbryo flamHets flamTranshets
+#for i in harwich wXh24 wXh14 wXh21 whXw14 whXw21
+#for i in armiTranshetsOvary armiTranshetsSoma armiHetsOvary armiHetsSoma rhinoTranshetsOvary rhinoTranshetsSoma rhinoHetsOvary rhinoHetsSoma qinTranshetsOvary qinHetsOvary w1118Ovary w1118Soma orerOvary orerSoma orerEmbryo
+#for i in w1118Ovary w1118Soma qinTranshetsOvary qinHetsOvary qintestTranshetsOvary qintestHetsOvary 
+for i in harwich.ovary harwichG20.ovary W1.ovary W1G20.ovary wXh1g.ovary whXh3g.ovary whXh5g.ovary whXh7g.ovary whXw3g.ovary whXw5g.ovary whXw7g.ovary
+#for i in W1.ovary wXh1g.ovary whXh3g.ovary whXh5g.ovary whXw3g.ovary whXw5g.ovary armiHets.ovary armiHets.carcass armiTranshets.ovary armiTranshets.carcass flamBGFM.ovary flam.embryo flamHets.ovary flamKGFM.ovary flamTranshets.ovary harwich.ovary introgression2.ovary introgression2X3.ovary introgression3.ovary orer.embryo orer.ovary orer.carcass qinDf.ovary qinHets.ovary qinTMB.ovary qinTranshets.ovary rhinoHets.ovary rhinoHets.carcass rhinoTranshets.ovary rhinoTranshets.carcass w1.ovary w1.carcass whXw14d.ovary whXw21d.ovary wXh14d.ovary wXh21d.ovary wXh2_4d.ovary
+
+do
+	
+	awk -F "\t" -v sample=$i '{OFS="\t"; print $1,$2,$3,sample,$5,$6}' $i.downsample.bam.unpair.uniq.transposons.filtered.bed >> tmp
+	
+	## Filter BS A{36}
+	grep FBgn0000224_BS tmp | egrep "\+51|\-51" > tmp.BS
+
+	## Merge Stalker
+	ediff tmp diff tmp.BS > tmp2 	
+
+done
+
+perl /home/wangj2/jpp_findTransposonJumping/mergeTagsWithoutGap.pl tmp2 > dysgenic.uniq.transposons.filtered.woGap.bed
+perl /home/wangj2/jpp_findTransposonJumping/mergeTagsWithGap.pl dysgenic.uniq.transposons.filtered.woGap.bed 500 > dysgenic.uniq.transposons.filtered.wGap.bed
+
+rm tmp2 tmp.BS tmp
+
+perl get_class.pl dysgenic.uniq.transposons.filtered.wGap.bed > dysgenic.uniq.transposons.filtered.wGap.class.bed
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/scripts/excision.clustering.pl	Mon Apr 25 13:08:56 2016 -0400
@@ -0,0 +1,40 @@
+#! /usr/bin/perl
+
+use strict;
+
+my %position=();
+my %names=();
+open (input, "<$ARGV[0]") or die "Can't open $ARGV[0] since $!\n";
+while (my $line=<input>) {
+    chomp($line);
+    my @a=split(/\t/, $line);
+    my @b=split(/\#/, $a[8]);
+
+    if (defined $position{$b[0]}) {
+	my @c=split(/\:/, $position{$b[0]});
+	if (($c[0] eq $a[0])&&($a[1] < $c[1])) {
+	    $c[1]=$a[1]; $position{$b[0]}=join(":", @c);   # rebuild the record; a regex substitution could also hit the chromosome name
+	}
+	if (($c[0] eq $a[0])&&($a[2] > $c[2])) {
+	    $c[2]=$a[2]; $position{$b[0]}=join(":", @c);
+	}
+	my $transposon=$a[3];
+	if ($names{$b[0]} !~ /$transposon/) {$names{$b[0]}=$names{$b[0]}.",$transposon";}
+    }
+    else {
+	$position{$b[0]}="$a[0]\:$a[1]\:$a[2]";
+	$names{$b[0]}=$a[3];
+    }
+}
+close input;
+
+open (output, ">>temp_for_sort") or die "Can't open temp_for_sort since $!\n";
+while ((my $key, my $value) = each (%position)) {
+    my @z=split(/\:/, $value);
+    print output "$z[0]\t$z[1]\t$z[2]\t$names{$key}\n";
+}
+close output;
+
+system("sort -k1,1 -k2,2n -k3,3n temp_for_sort > sorted");
+system("uniq -c sorted > $ARGV[1]");
+system("rm sorted temp_for_sort");
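Note on the last step of excision.clustering.pl: it sorts the merged intervals by chromosome, start and end, then collapses identical lines into one line with a count. The same operation can be sketched directly in shell with `-k` key syntax. The function name cluster_counts is a placeholder, not something the pipeline defines.

```shell
#!/bin/bash
# Hypothetical helper: sort a BED-like file by chromosome (text), then by
# start and end (numeric), and prefix each distinct line with the number
# of times it occurs.
cluster_counts() {
    sort -k1,1 -k2,2n -k3,3n "$1" | uniq -c
}
```

The leading count added by `uniq -c` is what shifts the chromosome into the second whitespace-separated field of the resulting cluster file.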
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/scripts/filterFalsePositive.ex.pl	Mon Apr 25 13:08:56 2016 -0400
@@ -0,0 +1,31 @@
+#! /usr/bin/perl
+
+use strict;
+
+open (input, "<$ARGV[0]") or die "Can't open $ARGV[0] since $!\n";
+my %leng=();
+my %trans=();
+my %coordinate=();
+while (my $line=<input>) {
+    chomp($line);
+    my @a=split(/\t/, $line);
+    if (defined $leng{$a[9]}) {
+	$trans{$a[9]} += $a[2]-$a[1];
+    }
+    else {
+	$leng{$a[9]}=$a[8]-$a[7]-10;
+	$trans{$a[9]}=$a[2]-$a[1];
+	$coordinate{$a[9]}="$a[6]\:$a[7]\:$a[8]";
+    }
+}
+close input;
+
+open (output, ">>$ARGV[2]") or die "Can't open $ARGV[2] since $!\n";
+while ((my $key, my $value) = each (%coordinate)) {
+    if ((($leng{$key}-$trans{$key}) <= $ARGV[1])&&(($leng{$key}-$trans{$key}) >= 0)) {
+#    if (($leng{$key}-$trans{$key}) <= 500) {
+	my @b=split(/\:/, $value);
+	print output "$b[0]\t$b[1]\t$b[2]\t$key\n";
+    }
+}
+close output;
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/scripts/filterFalsePositive.in.pl	Mon Apr 25 13:08:56 2016 -0400
@@ -0,0 +1,39 @@
+#!/share/bin/perl
+use List::Util qw(max min);
+#system("windowBed -a $ARGV[0] -b /home/wangj2/flycommon/all_transposons.dml.rmskCrossmatch.bed -sw -r 1000 -l 0 > tmp");
+#system("windowBed -a $ARGV[0] -b /home/wangj2/flycommon/soo.trnalnpos.map2.sort.bed -sw -r 1000 -l 0 > tmp");
+system("bedtools window -a $ARGV[0] -b $ARGV[1] -sw -r 1000 -l 0 > tmp");
+
+open in,"tmp";
+my %read;
+while(<in>)
+{
+	chomp;
+	@_=split/\t/;	# explicit assignment: newer Perl versions no longer split into @_ implicitly
+
+	## if the same type of transposons
+	my @loc=map { [/(.*?),(\+|-)(.*)/] } split/;/,$_[4];
+	foreach my $l (@loc)
+	{
+		if($$l[0] eq $_[9])
+		{	
+			## if the same strand of transposons
+			if((($_[5] eq $$l[1]) && ($_[11] eq "-")) || (($_[5] ne $$l[1]) && ($_[11] eq "+")))
+			{
+				## if the fragments of the exists transposons
+				{
+					my $s=max($$l[2],$_[12]);
+					my $e=min(($$l[2]+$_[2]-$_[1]),$_[13]);
+					if($s<$e)
+					{
+						print join("\t",@_[0..5]),"\n" if not exists $read{$_[3]};
+						$read{$_[3]}=1;
+					}
+				}
+			}
+		}
+	}
+}
+close in;
+
+system("rm tmp");
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/scripts/generate_density_json.pl	Mon Apr 25 13:08:56 2016 -0400
@@ -0,0 +1,80 @@
+#! /usr/bin/perl
+
+use strict;
+die "perl $0 <input.insertion.bp.summary> <chromInfo file> <genomic bin size>\n" if @ARGV<3;
+
+my @colors=("blue","green","red","yellow","grey","orange","purple","brown", "black");
+
+my $op_title=$ARGV[0];
+$op_title =~ s/summary/json/;
+
+my %chrs=();
+system("cut -f1 $ARGV[0] | uniq > chr");
+open (input, "<chr") or die "Can't open chr since $!\n";
+while (my $line=<input>) {
+    chomp($line);
+    $chrs{$line}=1;
+}
+close input;
+system("rm chr");
+
+open (output, ">>$op_title") or die "Can't open $op_title since $!\n";
+print output "{\"ideograms\":[\n";
+
+my $i=0;
+open (input, "<$ARGV[1]") or die "Can't open $ARGV[1] since $!\n";
+while (my $line=<input>) {
+    chomp($line);
+    my @a=split(/\t/, $line);
+    if ($chrs{$a[0]}==1) {
+        my $len=int($a[1]/$ARGV[2])+1;
+        if ($len < 5) {
+            $chrs{$a[0]}=0;
+            next;
+        }
+        if ($i > 0) {print output ",\n";}
+        print output "{\"id\":\"$a[0]\",\"length\":$len,\"color\":\"$colors[$i % 9]\"}";
+        $i++;
+    }
+}
+close input;
+
+print output "\n],\n\"tracks\":[\n{\n";
+print output "\"name\": \"Density\",\n";
+print output "\"type\": \"plot\",\n";
+print output "\"values\":\n[\n";
+
+my @hist=();
+my $last_chr="";
+my $i=0;
+my $k=0;
+open (input, "<$ARGV[0]") or die "Can't open $ARGV[0] since $!\n";
+#my $header=<input>;
+while (my $line=<input>) {
+    chomp($line);
+    my @a=split(/\t/, $line);
+    if ($a[0] eq $last_chr) {
+        my $mid=int(($a[1]+$a[2])/2);
+        if (int($mid/$ARGV[2]) > $i) {
+            $i++;
+            $hist[$i]=1;
+        }
+        else {$hist[$i]++;}
+    }
+    else {
+        if (($last_chr ne "") && ($chrs{$last_chr} == 1)) {
+            if ($k > 0) {print output ",\n";}
+            print output "{\"color\":\"$colors[$k % 9]\",\"chr\":\"$last_chr\",\"values\":[";
+            for my $j (0..$i-1) {print output "$hist[$j],";}
+            print output "$hist[$i]]}";
+            $k++;
+        }
+        $i=0;
+        $hist[0]=1;
+        $last_chr=$a[0];
+    }
+}
+close input;
+
+print output "\n]}\n]\n}\n";
+close output;
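Note on generate_density_json.pl: it counts insertions per fixed-size genomic bin, using the midpoint of each interval to pick the bin. Below is a rough awk-based sketch of that binning step alone, assuming a tab-delimited input with chromosome, start and end in the first three columns. It prints one `chromosome, bin index, count` line per non-empty bin and does not produce the JSON output. bin_counts is a made-up name.

```shell
#!/bin/bash
# Hypothetical helper: count intervals per genomic bin, where the bin is
# chosen by the interval midpoint. Output: chromosome, bin index, count.
bin_counts() {
    local summary=$1 bin_size=$2
    awk -F '\t' -v bin="$bin_size" '
        # Skip lines whose start or end is not numeric (for example a header).
        $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ {
            mid = int(($2 + $3) / 2)
            count[$1 "\t" int(mid / bin)]++
        }
        END {
            for (key in count) print key "\t" count[key]
        }
    ' "$summary" | sort -k1,1 -k2,2n
}
```

Called as `bin_counts input.insertion.bp.summary 500000`, this gives a quick way to cross-check the per-bin values that end up in the JSON file.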
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/scripts/get_class.pl	Mon Apr 25 13:08:56 2016 -0400
@@ -0,0 +1,45 @@
+#!/share/bin/perl
+
+# chr2L   19384   20049   FBgn0001283_jockey      wXh24,-1;harwich,-8,+5;whXw21,-4,+1;whXw14,-5;wXh14,+2; +-      sense   chr2L:19562.19645
+
+my @sample=("$ARGV[1]");
+
+print "chr\tstart\tend\ttransposonName\tstrand\ttransposonStrand\tbreak\tclass";
+print "\t$_\_class\t$_\_plus\t$_\_minus" foreach @sample;
+print "\n";
+open in,$ARGV[0];
+while(<in>)
+{
+	chomp;
+	my($chrom,$start,$end,$transposonName,$class,$strand,$transposonStrand,$break)=split/\t/;
+	my %classCounts;
+	my ($tcplus,$tcminus)=(0,0);
+	foreach $s (split/;/,$class)
+	{
+		my ($name,@counts)=split/,/,$s;
+		foreach my $c (@counts)
+		{
+			my $strand=($c>0)?"+":"-";
+			$classCounts{$name}{$strand}=$c;
+			$tcplus+=$c if $c>0;
+			$tcminus+=$c if $c<0;
+		}
+	}
+	print "$chrom\t$start\t$end\t$transposonName\t$strand\t$transposonStrand\t$break";
+	print "\t1p1" if $tcplus>0 && $tcminus<0;
+	print "\t2p" if ($tcplus>1 && $tcminus==0) || ($tcplus==0 && $tcminus<-1);
+	print "\tsingleton" if ($tcplus<=1 && $tcminus==0 && $tcplus>0) || ($tcplus==0 && $tcminus>=-1 && $tcminus<0);
+	print "\tNone" if ($tcminus==0 && $tcplus==0);
+	foreach my $s (@sample)
+	{
+		$classCounts{$s}{"+"}=0 if not exists $classCounts{$s}{"+"};
+		$classCounts{$s}{"-"}=0 if not exists $classCounts{$s}{"-"};
+		print "\t1p1" if $classCounts{$s}{"+"}>0 && $classCounts{$s}{"-"}<0;
+		print "\t2p" if ($classCounts{$s}{"+"}>1 && $classCounts{$s}{"-"}==0) || ($classCounts{$s}{"+"}==0 && $classCounts{$s}{"-"}<-1);
+		print "\tsingleton" if ($classCounts{$s}{"+"}<=1 && $classCounts{$s}{"-"}==0 && $classCounts{$s}{"+"}>0) || ($classCounts{$s}{"+"}==0 && $classCounts{$s}{"-"}>=-1 && $classCounts{$s}{"-"}<0);
+		print "\tNone" if $classCounts{$s}{"+"}==0 && $classCounts{$s}{"-"}==0;
+		print "\t",$classCounts{$s}{"+"},"\t",$classCounts{$s}{"-"};
+	}
+	print "\n";
+}
+close in;
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/scripts/make.bp.bed.pl	Mon Apr 25 13:08:56 2016 -0400
@@ -0,0 +1,110 @@
+#! /usr/bin/perl
+
+use strict;
+
+my @sample=();
+open (in, "<$ARGV[0]") or die "Can't open $ARGV[0] since $!\n";
+my $line=<in>;
+close in;
+
+my %chrs=();
+my @a=split(/\t/, $line);
+for my $i (0..$#a) {
+    if ($a[$i] =~ /_class$/) {
+	my $name=$a[$i];
+	$name =~ s/_class//;
+	my $j=$i+1;
+	my $k=$i+2;
+	my $l=$i+3;
+	system("cut -f7,4,6,$j,$k,$l $ARGV[0] > temp");
+	open (input, "<temp") or die "Can't open temp since $!\n";
+	open (output, ">>$name.insertion.bp.bed") or die "Can't open $name.insertion.bp.bed since $!\n";
+	my $header=<input>;
+	while (my $line=<input>) {
+	    chomp($line);
+	    my @b=split(/\t/, $line);
+	    if (($b[4] ne "0")||($b[5] ne "0")) {
+		my @c=split(/\:/, $b[2]);
+		my @d=split(/\./, $c[1]);
+		if ($d[0] > $d[1]) {
+		    my $temp=$d[0];
+		    $d[0]=$d[1];
+		    $d[1]=$temp;
+		}
+		my $lower=$d[0];
+		my $upper=$d[1];
+	        if (($lower >= 0) && ($upper >= 0)) {
+		   print output "$c[0]\t$lower\t$upper\t$b[0]\t$b[1]\t$b[3]\t$b[4]\t$b[5]\n";
+	        }
+		$chrs{$c[0]}=1;
+	    }
+	}
+	close input;
+	close output;
+	system("rm temp");
+
+	if ($ARGV[1] ne "") {
+	    open (input, "<$name.insertion.bp.bed") or die "Can't open $name.insertion.bp.bed since $!\n";
+	    open (output, ">tmp") or die "Can't open tmp since $!\n";
+	    while (my $line=<input>) {
+		chomp($line);
+		my @a=split(/\t/, $line);
+		if (($a[0] =~ /^\d{1,2}$/) || ($a[0] eq "X") || ($a[0] eq "Y")) {$a[0]="chr$a[0]";}
+		my $strand="+";
+		if ($a[4] eq "antisense") {$strand="-";}
+		print output "$a[0]\t$a[1]\t$a[2]\t$a[3]\t\.\t$strand\t$a[5]\t$a[6]\t$a[7]\n";
+	    }
+	    close input;
+	    close output;
+
+	    system("bedtools intersect -a tmp -b $ARGV[1] -f 0.1 -wo -s > tmp1");
+	    if ($ARGV[2] eq "") {
+		system("awk -F \"\\t\" '{OFS=\"\\t\"; if ((\$4==\$13)&&(\$6==\$15)) print \$1,\$2,\$3,\$4,\$5,\$6}' tmp1 > tmp2");
+	    }
+	    else {
+		my %family=();
+		open (input, "<$ARGV[2]") or die "Can't open $ARGV[2] since $!\n";
+		while (my $line=<input>) {
+		    chomp($line);
+		    my @a=split(/\t/, $line);
+		    $family{$a[0]}=$a[1];
+		}
+		close input;
+
+		open (input, "<tmp1") or die "Can't open tmp1 since $!\n";
+		open (output, ">>tmp2") or die "Can't open tmp2 since $!\n";
+		while (my $line=<input>) {
+		    chomp($line);
+		    my @a=split(/\t/, $line);
+		    if (($family{$a[3]} eq $family{$a[12]}) && ($a[5] eq $a[14])) {
+			print output "$a[0]\t$a[1]\t$a[2]\t$a[3]\t$a[4]\t$a[5]\n";
+		    }
+		}
+		close input;
+		close output;
+	    }
+	    
+	    if (-s "tmp2") {
+		system("bedtools subtract -a tmp -b tmp2 -f 1.0 > tmp3");
+		open (input, "<tmp3") or die "Can't open tmp3 since $!\n";
+		open (output, ">$name.insertion.bp.bed") or die "Can't open $name.insertion.bp.bed since $!\n";
+		while (my $line=<input>) {
+		    chomp($line);
+		    my @a=split(/\t/, $line);
+		    my $direction="sense";
+		    if ($a[5] eq "-") {$direction="antisense";}
+		    my $chr_num=$a[0];
+		    $chr_num =~ s/chr//;
+		    if (($chrs{$a[0]} == 1) && (! defined $chrs{$chr_num})) {$chr_num=$a[0];}
+		    print output "$chr_num\t$a[1]\t$a[2]\t$a[3]\t$direction\t$a[6]\t$a[7]\t$a[8]\n";
+		}
+		close input;
+		close output;
+	    }
+	}
+
+	system("rm tmp*");
+
+    }
+}
+	
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/scripts/mergeTagsWithGap.pl	Mon Apr 25 13:08:56 2016 -0400
@@ -0,0 +1,196 @@
+#!/share/bin/perl
+#chr2L   114333  114409  FBgn0003055_P-element   harwich,1;      +       antisense
+#chr2L   114443  114567  FBgn0003055_P-element   harwich,3;      +       antisense
+#chr2L   114636  114712  FBgn0003055_P-element   harwich,1;      -       antisense
+#chr2L   131640  131929  FBgn0003055_P-element   harwich,42;     +       sense
+#chr2L   131948  132274  FBgn0003055_P-element   harwich,18;     -       sense
+#chr2L   132027  132103  FBgn0003055_P-element   harwich,1;      -       antisense
+
+use warnings;
+use strict;
+use List::Util qw(max min);
+
+if(scalar(@ARGV)<2 || grep {/^-h/} @ARGV)
+{
+	die "
+usage: mergeTagsWithGap.pl inputFile maxGap
+Expects 7 tab-separated fields.  For each {chr,name,transposonStrand} group,
+merges nearby ranges (gap threshold maxGap) and prints the sorted result to stdout.
+inputFile can be - or stdin to read from stdin.
+";
+}
+
+my $input=shift @ARGV;
+my $maxgap=shift @ARGV;
+$input =~ s/^stdin$/-/i;
+
+my %item2coords;
+open IN,$input;
+while (<IN>)
+{
+	chomp;
+	my ($chrom,$start,$end,$transposonName,$count,$strand,$transposonStrand)=split/\t/;
+	push @{$item2coords{"$chrom;$transposonName;$transposonStrand"}},[$start,$end,$count,$strand]; 
+}
+close IN;
+
+my @results;
+foreach my $item (keys %item2coords)
+{
+	my @sortedCoords=sort{ $a->[0]<=>$b->[0] } @{$item2coords{$item}};
+	my ($chrom,$tName,$tStrand)=split(/;/,$item);
+	my ($mergeStart,$mergeEnd,$mergeCounts,$mergeStrand)=@{shift @sortedCoords};
+	my %sampleCounts=();
+	my ($breakStart,$breakEnd)=(0,0);
+	foreach my $sa (split/;/,$mergeCounts)
+	{
+		my ($s,$c)=split/,/,$sa;
+		$sampleCounts{$s}{$mergeStrand}=$c;
+	}
+	foreach my $rangeRef (@sortedCoords) 
+	{
+    		my ($rangeStart,$rangeEnd,$rangeCounts,$rangeStrand)=@{$rangeRef};
+		if($mergeStrand=~/\Q$rangeStrand\E$/)
+		{
+			if($rangeStart>=$mergeEnd+$maxgap)
+			{
+				$mergeCounts="";
+				foreach my $s (keys %sampleCounts)
+				{
+					$mergeCounts.=$s;
+					$mergeCounts.=",".$_.$sampleCounts{$s}{$_} foreach keys %{$sampleCounts{$s}};
+					$mergeCounts.=";";
+				}
+				if($mergeStrand eq "+")
+				{
+					$breakStart=$mergeEnd;
+					$breakEnd=$mergeEnd+$maxgap;
+				}
+				if($mergeStrand eq "-")
+				{
+					$breakStart=$mergeStart-$maxgap;
+					$breakEnd=$mergeStart;
+				}
+				push @results,[$chrom,$mergeStart,$mergeEnd,$tName,$mergeCounts,$mergeStrand,$tStrand,"$chrom:$breakStart.$breakEnd"];
+				($mergeStart,$mergeEnd,$mergeStrand)=($rangeStart,$rangeEnd,$rangeStrand);
+				%sampleCounts=();
+				foreach my $sa (split/;/,$rangeCounts)
+				{
+					my ($s,$c)=split/,/,$sa;
+					$sampleCounts{$s}{$rangeStrand}=$c;
+				}
+			}
+			else
+			{
+				$mergeEnd=max($rangeEnd,$mergeEnd);
+				foreach my $sa (split/;/,$rangeCounts)
+				{
+					my ($s,$c)=split/,/,$sa;
+					$sampleCounts{$s}{$rangeStrand}+=$c;
+				}
+			}
+		}
+		elsif($rangeStrand eq "+")
+		{
+			$mergeCounts="";
+			foreach my $s (keys %sampleCounts)
+			{
+				$mergeCounts.=$s;
+				$mergeCounts.=",".$_.$sampleCounts{$s}{$_} foreach keys %{$sampleCounts{$s}};
+				$mergeCounts.=";";
+			}
+			if($mergeStrand eq "+")
+			{
+				$breakStart=$mergeEnd;
+				$breakEnd=$mergeEnd+$maxgap;
+			}
+			if($mergeStrand eq "-")
+			{
+				$breakStart=$mergeStart-$maxgap;
+				$breakEnd=$mergeStart;
+			}
+			push @results,[$chrom,$mergeStart,$mergeEnd,$tName,$mergeCounts,$mergeStrand,$tStrand,"$chrom:$breakStart.$breakEnd"];
+			($mergeStart,$mergeEnd,$mergeStrand)=($rangeStart,$rangeEnd,$rangeStrand);
+			%sampleCounts=();
+			foreach my $sa (split/;/,$rangeCounts)
+			{
+				my ($s,$c)=split/,/,$sa;
+				$sampleCounts{$s}{$rangeStrand}=$c;
+			}
+		}
+		else
+		{
+			if($rangeStart>=$mergeEnd+$maxgap*2)
+			{
+				$mergeCounts="";
+				foreach my $s (keys %sampleCounts)
+				{
+					$mergeCounts.=$s;
+					$mergeCounts.=",".$_.$sampleCounts{$s}{$_} foreach keys %{$sampleCounts{$s}};
+					$mergeCounts.=";";
+				}
+				if($mergeStrand eq "+")
+				{
+					$breakStart=$mergeEnd;
+					$breakEnd=$mergeEnd+$maxgap;
+				}
+				if($mergeStrand eq "-")
+				{
+					$breakStart=$mergeStart-$maxgap;
+					$breakEnd=$mergeStart;
+				}
+				push @results,[$chrom,$mergeStart,$mergeEnd,$tName,$mergeCounts,$mergeStrand,$tStrand,"$chrom:$breakStart.$breakEnd"];
+				($mergeStart,$mergeEnd,$mergeStrand)=($rangeStart,$rangeEnd,$rangeStrand);
+				%sampleCounts=();
+				foreach my $sa (split/;/,$rangeCounts)
+				{
+					my ($s,$c)=split/,/,$sa;
+					$sampleCounts{$s}{$rangeStrand}=$c;
+				}
+			}
+			else
+			{
+				$breakStart=$mergeEnd;
+				$mergeEnd=max($rangeEnd,$mergeEnd);
+				$breakEnd=$rangeStart;
+				foreach my $sa (split/;/,$rangeCounts)
+				{
+					my ($s,$c)=split/,/,$sa;
+					$sampleCounts{$s}{$rangeStrand}+=$c;
+				}
+				$mergeStrand.=$rangeStrand;
+			}			
+		}
+	}
+	$mergeCounts="";
+	foreach my $s (keys %sampleCounts)
+	{
+		$mergeCounts.=$s;
+		$mergeCounts.=",".$_.$sampleCounts{$s}{$_} foreach keys %{$sampleCounts{$s}};
+		$mergeCounts.=";";
+	}
+	if($mergeStrand eq "+")
+	{
+		$breakStart=$mergeEnd;
+		$breakEnd=$mergeEnd+$maxgap;
+	}
+	if($mergeStrand eq "-")
+	{
+		$breakStart=$mergeStart-$maxgap;
+		$breakEnd=$mergeStart;
+	}
+	push @results,[$chrom,$mergeStart,$mergeEnd,$tName,$mergeCounts,$mergeStrand,$tStrand,"$chrom:$breakStart.$breakEnd"] if $mergeEnd;
+}
+
+sub bed4Cmp
+{
+  # For sorting by chrom, chromStart, and names -- reverse order for names
+	return $a->[0] cmp $b->[0] ||
+	$a->[1] <=> $b->[1] ||
+	$b->[3] cmp $a->[3];
+}
+
+foreach my $r (sort bed4Cmp @results)
+{
+	print join("\t",@{$r}),"\n";
+}
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/scripts/mergeTagsWithoutGap.pl	Mon Apr 25 13:08:56 2016 -0400
@@ -0,0 +1,91 @@
+#!/share/bin/perl
+#chr2L   735929  736005  HWUSI-EAS1533_0002:1:73:4665:12371#0/2  FBgn0000155_roo,-58;FBgn0000155_roo,-8722;      -
+use warnings;
+use strict;
+
+if(scalar(@ARGV)<1 || grep {/^-h/} @ARGV)
+{
+	die "
+usage: mergeTagsWithoutGap.pl inputFile
+Expects 6 tab-separated fields.  For each {chr,strand,name} group,
+merges overlapping ranges and prints the sorted result to stdout.
+inputFile can be - or stdin to read from stdin.
+";
+}
+
+my $input=shift @ARGV;
+$input =~ s/^stdin$/-/i;
+
+my %item2coords;
+open IN,$input;
+while (<IN>)
+{
+	chomp;
+	my ($chrom,$start,$end,$sample,$class,$strand)=split/\t/;
+  	die "Sorry, input must have at least 5 tab-separated fields.\n" if ! $class;
+	# randomly choose one (disabled)
+#	my @loc=$class=~/(.*?),(\+|-)(.*)/;
+#	my $transposonStrand=($strand eq $loc[1])?"antisense":"sense";
+#	push @{$item2coords{"$chrom;$strand;$loc[0];$transposonStrand"}},[$start,$end,$sample] 
+
+	# norm by class
+	my @loc=map { [/(.*?),(\+|-)(.*)/] } split/;/,$class;
+	my %transposonName;
+	foreach my $l (@loc)
+	{
+		my $transposonStrand=($strand eq $$l[1])?"antisense":"sense";
+		$transposonName{$$l[0]}=$transposonStrand;
+	}
+	my $c=1/scalar(keys %transposonName);
+	push @{$item2coords{"$chrom;$strand;$_;$transposonName{$_}"}},[$start,$end,$sample,$c] foreach keys %transposonName; 
+}
+close IN;
+
+my @results;
+foreach my $item (keys %item2coords)
+{
+	my @sortedCoords=sort{ $a->[0]<=>$b->[0] } @{$item2coords{$item}};
+	my ($chrom,$strand,$tName,$tStrand)=split(/;/,$item);
+	my ($mergeStart,$mergeEnd,$mergeSample,$mergeCounts)=@{shift @sortedCoords};
+	my %sampleCounts;
+	$sampleCounts{$mergeSample}=$mergeCounts;
+	foreach my $rangeRef (@sortedCoords) 
+	{
+    		my ($rangeStart,$rangeEnd,$rangeSample,$rangeCounts)=@{$rangeRef};
+    		if($rangeEnd<=$mergeEnd)
+		{
+			$sampleCounts{$rangeSample}+=$rangeCounts;
+			next;
+		}
+		if($rangeStart>=$mergeEnd)
+		{
+			my $count="";
+			$count.=$_.",".$sampleCounts{$_}.";" foreach keys %sampleCounts;
+			push @results,[$chrom,$mergeStart,$mergeEnd,$tName,$count,$strand,$tStrand];
+			($mergeStart,$mergeEnd,$mergeSample,$mergeCounts)=($rangeStart,$rangeEnd,$rangeSample,$rangeCounts);
+			%sampleCounts=();
+			$sampleCounts{$mergeSample}=$mergeCounts;
+		}
+		else
+		{
+			$mergeEnd=$rangeEnd;
+			$sampleCounts{$rangeSample}+=$rangeCounts;
+		}
+	}
+	my $count="";
+	$count.=$_.",".$sampleCounts{$_}.";" foreach keys %sampleCounts;
+	push @results,[$chrom,$mergeStart,$mergeEnd,$tName,$count,$strand,$tStrand] if $mergeEnd;
+}
+
+sub bed4Cmp
+{
+  # For sorting by chrom, chromStart, and names -- reverse order for names
+	return $a->[0] cmp $b->[0] ||
+	$a->[1] <=> $b->[1] ||
+	$b->[3] cmp $a->[3];
+}
+
+foreach my $r (sort bed4Cmp @results)
+{
+	print join("\t",@{$r}),"\n";
+}
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/scripts/pickClippedFastq.pl	Mon Apr 25 13:08:56 2016 -0400
@@ -0,0 +1,193 @@
+#!/share/bin/perl
+use List::Util qw(sum);
+use Bio::Seq;
+
+die "perl $0 <input_prefix> <TE sequence database>\n" if @ARGV<2;
+
+my %transposon_seq=();
+my %transposon_revcom_seq=();
+my $curr_seq="";
+my $curr_transposon="";
+open (input, "<$ARGV[1]") or die "Can't open $ARGV[1] since $!\n";
+while (my $line=<input>) {
+    chomp($line);
+    if ($line =~ /^\>/) {
+	# Header: the name is the first whitespace-delimited token after ">".
+	my @a=split(/\s+/, $line);
+	$a[0] =~ s/\>//;
+	$curr_transposon=$a[0];
+	$transposon_seq{$curr_transposon}="";
+    }
+    elsif ($curr_transposon ne "") {$transposon_seq{$curr_transposon}.=uc($line);}
+}
+close input;
+# Reverse-complement after reading so that the last record is included too.
+foreach my $name (keys %transposon_seq) {
+    my $seq=Bio::Seq->new(-seq=>$transposon_seq{$name}, -alphabet => 'dna');
+    $transposon_revcom_seq{$name}=uc($seq->revcom->seq);
+}
+
+open m1,">>$ARGV[0].clipped.reads.aln";
+
+open (input, "<$ARGV[0].insertion.bp.bed") or die "Can't open $ARGV[0].insertion.bp.bed since $!\n";
+while (my $line=<input>) {
+    chomp($line);
+    my @a=split(/\t/, $line);
+
+    my $lower=$a[1]-15;
+    my $upper=$a[2]+15;
+    if (($lower > 0)&&($upper > 0))
+    {
+	system("samtools view -hXf 0x2 $ARGV[0].sorted.bam $a[0]\:$lower\-$upper > temp.sam");
+	
+	open in,"temp.sam";
+	my %pe1;
+	my %pe2;
+	while(<in>)
+	{
+	    chomp;
+	    my @f=split/\t/,$_,12;
+	    ## read number 1 or 2
+	    my ($rnum)=$f[1]=~/(\d)$/;
+	    
+	    ## XT:A:* 
+	    my ($xt)=$f[11]=~/XT:A:(.)/;
+	    
+	    if ($f[5]=~/S/) {
+		
+		## Coordinate
+		my $coor=-10;
+		my $strand="";
+		my $final="";
+		my $clipseq="";
+		my @z=split(/M/, $f[5]);
+		
+		if (($f[5]=~/S$/)&&($f[1]=~/r/))
+		{
+		    my (@cigar_m)=$f[5]=~/(\d+)M/g;
+		    my (@cigar_d)=$f[5]=~/(\d+)D/g;
+		    my (@cigar_s)=$f[5]=~/(\d+)S/g;
+		    my (@cigar_i)=$f[5]=~/(\d+)I/g;
+		    my $aln_ln=sum(@cigar_m,@cigar_d);
+		    $coor=$f[3]+$aln_ln-1;
+		    $strand="-";
+		    
+		    my (@clipped)=$z[1]=~/(\d+)S/g;
+		    my $cliplen=sum(@clipped);
+		    if ($cliplen >= 15) {
+			$clipseq=substr($f[9], length($f[9])-$cliplen, $cliplen);
+		    }
+		}
+
+                elsif (($f[1]=~/R/)&&($z[0]=~/S/))
+                {
+                    $coor=$f[3]; $strand="+";
+
+                    my (@clipped)=$z[0]=~/(\d+)S/g;
+                    my $cliplen=sum(@clipped);
+                    if ($cliplen >= 15) {
+                        $clipseq=substr($f[9], 0, $cliplen);
+		    }
+		}
+
+		if ($clipseq ne "") {
+		    my $flag=0;
+		    foreach my $key (keys %transposon_seq) { # foreach, not each: "last" must not leave a stale hash iterator
+			my $seq=$transposon_seq{$key};
+			if ($a[4] eq "antisense") {
+			    $seq=$transposon_revcom_seq{$key};
+			}
+			if (($seq =~ /$clipseq/)&&($a[3] eq $key)&&($coor >= $lower)&&($coor <= $upper)) {
+#			    print "$clipseq\n";
+			    $final=$coor."\($strand\)";
+			    if (defined $pe1{$final}) {
+				if (length($clipseq) > length($pe1{$final})) {
+				    $pe1{$final}=$clipseq;
+				}
+			    }
+			    else {$pe1{$final}=$clipseq; $pe2{$final}=0;}
+			    $flag=1;
+			    last;
+			}
+		    }		    
+		}
+		
+	    }	    
+	}#while;
+	close in;
+
+        open in,"temp.sam";
+        while(<in>)
+        {
+            chomp;
+            my @f=split/\t/,$_,12;
+            my ($rnum)=$f[1]=~/(\d)$/;
+            my ($xt)=$f[11]=~/XT:A:(.)/;
+
+            if ($f[5]=~/S/) {
+
+                my $coor=-10;
+                my $strand="";
+		my $final="";
+                my $clipseq="";
+                my @z=split(/M/, $f[5]);
+
+                if (($f[5]=~/S$/)&&($f[1]=~/r/))
+                {
+                    my (@cigar_m)=$f[5]=~/(\d+)M/g;
+                    my (@cigar_d)=$f[5]=~/(\d+)D/g;
+                    my (@cigar_s)=$f[5]=~/(\d+)S/g;
+                    my (@cigar_i)=$f[5]=~/(\d+)I/g;
+                    my $aln_ln=sum(@cigar_m,@cigar_d);
+                    $coor=$f[3]+$aln_ln-1;
+                    $strand="-";
+
+                    my (@clipped)=$z[1]=~/(\d+)S/g;
+                    my $cliplen=sum(@clipped);
+                    if ($cliplen >= 6) {
+                        $clipseq=substr($f[9], length($f[9])-$cliplen, $cliplen);
+                    }
+                }
+
+                elsif (($f[1]=~/R/)&&($z[0]=~/S/))
+                {
+                    $coor=$f[3]; $strand="+";
+
+                    my (@clipped)=$z[0]=~/(\d+)S/g;
+                    my $cliplen=sum(@clipped);
+                    if ($cliplen >= 6) {
+                        $clipseq=substr($f[9], 0, $cliplen);
+                    }
+                }
+
+                if ($clipseq ne "") {
+		    foreach my $coor (keys %pe1) {
+			if (($coor =~ /\+/) && (substr($pe1{$coor}, length($pe1{$coor})-length($clipseq), length($clipseq)) eq $clipseq)) {
+			    $pe2{$coor}++;
+			}
+			elsif (($coor =~ /\-/) && (substr($pe1{$coor}, 0, length($clipseq)) eq $clipseq)) {
+			    $pe2{$coor}++;
+			}
+		    }
+		}
+	    }
+	}
+	close in;
+
+	my $clip_site="";
+	
+	foreach my $coor (keys %pe2)
+	{
+	    $clip_site=$clip_site."$coor\:$pe2{$coor}\;";
+	}
+	chop($clip_site);
+	print m1 "$a[0]\t$lower\t$upper\t$a[3]\t$clip_site\n";
+	system("rm temp.sam");
+    }
+    else 
+    {
+	print m1 "$line\n";
+    }
+}
+close input;
+close m1;
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/scripts/pickOverlapPair.ex.pl	Mon Apr 25 13:08:56 2016 -0400
@@ -0,0 +1,120 @@
+#!/share/bin/perl
+use Bio::Seq;
+use List::Util qw(sum);
+
+die "perl $0 <*.excision.cluster.rpmk.refined.bp>\n" if @ARGV<1;
+
+my $title=$ARGV[0];
+if ($title =~ /annotation/) {
+    $title =~ s/excision.cluster.annotation.refined.bp/sorted.bam/;
+}
+else {$title =~ s/excision.cluster.rpmk.refined.bp/sorted.bam/;}
+#system("samtools index /home/wangj2/scratch/bill/bill_genomic/$title");
+
+my %chrs=();
+system("samtools view -H $title > header");
+open (input, "<header") or die "Can't open header since $!\n";
+while (my $line=<input>) {
+    if ($line =~ /^\@SQ/) {
+	my @a=split(/\t/, $line);
+	for my $j (0..$#a) {
+            if ($a[$j] =~ /^SN:/) {
+		$a[$j] =~ s/^SN://;
+		$chrs{$a[$j]}=1;
+            }
+	}
+    }
+}
+close input;
+system("rm header");
+
+open (input, "<$ARGV[0]") or die "Can't open $ARGV[0] since $!\n";
+my $header=<input>;
+while (my $line=<input>) {
+    chomp($line);
+    my @a=split(/\t/, $line);
+
+    my $left=0;
+    my $right=0;
+    if ($a[4] eq "") {$left=$a[1];}
+    else {
+	my @t=split(/\,/, $a[4]);
+	my @p=split(/\(/, $t[$#t]);
+	$left=$p[0];
+    }
+    if ($a[5] eq "") {$right=$a[2];}
+    else {
+	my @t=split(/\,/, $a[5]);
+	my @p=split(/\(/, $t[0]);
+	$right=$p[0];
+    }
+
+    my $leftlower=$left-500;
+    my $leftupper=$left+500;
+    my $rightlower=$right-500;
+    my $rightupper=$right+500;
+    my $chr_num=$a[0];
+    $chr_num =~ s/chr//;
+    if (($chrs{$a[0]} == 1) && (! defined $chrs{$chr_num})) {$chr_num=$a[0];}
+    system("samtools view -Xf 0x2 $title $chr_num\:$leftlower\-$leftupper $chr_num\:$rightlower\-$rightupper > temp.sam");
+    
+    open in,"temp.sam";
+    my %ps=();
+    my %me=();
+    my %uniqp=();
+    my %uniqm=();
+    my $ref_sup=0;
+
+    while(<in>)
+    {
+	chomp;
+	my @f=split/\t/,$_,12;
+	## read number 1 or 2
+	my ($rnum)=$f[1]=~/(\d)$/;
+	
+	## XT:A:* 
+	my ($xt)=$f[11]=~/XT:A:(.)/;
+	
+	## Coordinate
+	my $coor=$f[3];
+	if ($f[1]=~/r/)
+	{
+	    if ($xt eq "U") {$uniqm{$f[0]}=1;}
+	    my (@cigar_m)=$f[5]=~/(\d+)M/g;
+	    my (@cigar_d)=$f[5]=~/(\d+)D/g;
+	    my (@cigar_s)=$f[5]=~/(\d+)S/g;
+	    my (@cigar_i)=$f[5]=~/(\d+)I/g;
+	    my $aln_ln=sum(@cigar_m,@cigar_d);
+	    $me{$f[0]}=$f[3]+$aln_ln-1;
+	}
+	elsif ($f[1]=~/R/) {
+	    $ps{$f[0]}=$f[3];
+	    if ($xt eq "U") {$uniqp{$f[0]}=1;}
+	}
+	
+#	${$pe{$f[0]}}[$rnum-1]=[$xt,$coor];
+    }
+    close in;
+
+    foreach my $id (keys %ps)
+    {
+#	my @rid=@{$pe{$id}};
+	
+#	if(($rid[0][0] eq "U" && $rid[1][0] eq "M") || ($rid[0][0] eq "M" && $rid[1][0] eq "U"))
+#	{
+#	    $soft_clip++;
+#	    print "$id\n";
+#	}
+	
+	if ((defined $me{$id})&&((defined $uniqp{$id})||(defined $uniqm{$id})))
+	{	    
+            if (((($ps{$id}+5)<=$right)&&($me{$id}>$right)&&($uniqm{$id}==1)) || ((($me{$id}-5)>=$left)&&($ps{$id}<$left)&&($uniqp{$id}==1))) {	    
+		$ref_sup++;
+	    }
+	}
+    }
+
+    print "$a[0]\t$a[1]\t$a[2]\t$a[3]\t$left\t$right\t$ref_sup\n";
+    system("rm temp.sam");
+}
+
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/scripts/pickOverlapPair.in.pl	Mon Apr 25 13:08:56 2016 -0400
@@ -0,0 +1,92 @@
+#!/share/bin/perl
+use Bio::Seq;
+use List::Util qw(sum);
+
+die "perl $0 <input.insertion.refined.bp> <fragment size>\n" if @ARGV<2;
+
+my $title=$ARGV[0];
+$title =~ s/insertion.refined.bp/sorted.bam/;
+my $frag=$ARGV[1];
+
+open (input, "<$ARGV[0]") or die "Can't open $ARGV[0] since $!\n";
+print "Chr\tStart\tEnd\tTransposonName\tTransposonDirection\tClass\tVariantSupport\tFrequency\tJunction1\tJunction1Support\tJunction2\tJunction2Support\t5\'_Support\t3\'_Support\n";
+while (my $line=<input>) {
+    chomp($line);
+    my @a=split(/\t/, $line);
+
+    my @b=split(/\(/, $a[6]);
+    my @c=split(/\(/, $a[7]);
+    my $terminal="5\'";
+    my $reverse=0;
+    my $positive=$b[0];
+    my $negative=$c[0];
+    if ($b[0] > $c[0]) {
+	$terminal="3\'";
+	$reverse=1;
+	my $swap=$b[0];
+	$b[0]=$c[0];
+	$c[0]=$swap;
+    }
+    my $lower=$b[0]-$frag;
+    my $upper=$c[0]+$frag;
+    system("samtools view -Xf 0x2 $title $a[0]\:$lower\-$upper > temp.sam");
+    
+    open in,"temp.sam";
+    my %ps=();
+    my %me=();
+    my $ref_sup=0;
+    my $soft_clip=0;
+    while(<in>)
+    {
+	chomp;
+	my @f=split/\t/,$_,12;
+	## read number 1 or 2
+	my ($rnum)=$f[1]=~/(\d)$/;
+	
+	## XT:A:* 
+	my ($xt)=$f[11]=~/XT:A:(.)/;
+	
+	## Coordinate
+	if ($f[1]=~/r/)
+	{
+	    my (@cigar_m)=$f[5]=~/(\d+)M/g;
+	    my (@cigar_d)=$f[5]=~/(\d+)D/g;
+	    my (@cigar_s)=$f[5]=~/(\d+)S/g;
+	    my (@cigar_i)=$f[5]=~/(\d+)I/g;
+	    my $aln_ln=sum(@cigar_m,@cigar_d);
+	    $me{$f[0]}=$f[3]+$aln_ln-1;
+	}
+	else
+	{
+	    $ps{$f[0]}=$f[3];
+	}	    
+	
+#	${$pe{$f[0]}}[$rnum-1]=[$xt,$coor];
+    }
+    close in;
+
+    
+    foreach my $id (keys %ps)
+    {
+	if (defined $me{$id})
+	{	    
+	    if (((($ps{$id}+5)<=$positive)&&($me{$id}>$negative)) || ((($me{$id}-5)>=$negative)&&($ps{$id}<$positive))) {
+#	    if (((($ps{$id}+5)<=$b[0])&&($me{$id}>$c[0])) || ((($me{$id}-5)>=$c[0])&&($ps{$id}<$b[0]))) {
+		$ref_sup++;
+#		print "$id\n";
+	    }
+	}
+    }
+
+    my $variant=$a[8]+$a[9]+$a[10]+$a[11];
+    my $ratio=sprintf("%.4f", ($variant+$ref_sup) > 0 ? $variant/($variant+$ref_sup) : 0);
+    if (($a[0] =~ /^\d{1,2}$/) || ($a[0] eq "X") || ($a[0] eq "Y")) {$a[0]="chr$a[0]";}
+    if ($reverse == 0) {
+	print "$a[0]\t$a[1]\t$a[2]\t$a[3]\t$a[4]\t$a[5]\t$variant\t$ratio\t$b[0]\t$a[8]\t$c[0]\t$a[9]\t$a[10]\t$a[11]\n";
+    }
+    elsif ($reverse == 1) {
+	print "$a[0]\t$a[1]\t$a[2]\t$a[3]\t$a[4]\t$a[5]\t$variant\t$ratio\t$b[0]\t$a[9]\t$c[0]\t$a[8]\t$a[10]\t$a[11]\n";
+    }
+    system("rm temp.sam");
+}
+
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/scripts/pickSoftClipping.over.pl	Mon Apr 25 13:08:56 2016 -0400
@@ -0,0 +1,163 @@
+#!/share/bin/perl
+use Bio::Seq;
+use List::Util qw(sum);
+
+die "perl $0 <*.excision.cluster.rpmk> <Reference.2bit>\n" if @ARGV<2;
+
+my $title=$ARGV[0];
+if ($title =~ /annotation/) {
+    $title =~ s/excision.cluster.annotation/sorted.bam/;
+}
+else {$title =~ s/excision.cluster.rpmk/sorted.bam/;}
+
+my %chrs=();
+system("samtools view -H $title > header");
+open (input, "<header") or die "Can't open header since $!\n";
+while (my $line=<input>) {
+    if ($line =~ /^\@SQ/) {
+	my @a=split(/\t/, $line);
+	for my $j (0..$#a) {
+	    if ($a[$j] =~ /^SN:/) {
+		$a[$j] =~ s/^SN://;
+		$chrs{$a[$j]}=1;
+	    }
+	}
+    }
+}
+close input;
+system("rm header");
+
+open (input, "<$ARGV[0]") or die "Can't open $ARGV[0] since $!\n";
+while (my $line=<input>) {
+    chomp($line);
+    my @a=split(/\s+/, $line);
+
+    my $lower=$a[3]-100;
+    my $upper=$a[4]+100;
+    my $chr_num=$a[2];
+    $chr_num =~ s/chr//;
+    if (($chrs{$a[2]} == 1) && (! defined $chrs{$chr_num})) {$chr_num=$a[2];}
+    system("samtools view -bu $title $chr_num\:$lower\-$upper > temp.bam");
+    system("samtools view -Xf 0x2 temp.bam > temp.sam");
+
+    my $leftseq="";
+    my $rightseq="";
+
+    my $ll=$a[3]-150;
+    my $lu=$a[3]+150;
+    system("twoBitToFa $ARGV[1] -seq=$a[2] -start=$ll -end=$lu left.fa");
+    open (seq, "<left.fa") or die "Can't open left.fa since $!\n";
+    my $head=<seq>;
+    for my $k (0..5) {
+	$head=<seq>;
+	chomp($head);
+	$leftseq=$leftseq."$head";
+    }
+    $leftseq=uc($leftseq);
+    close seq;
+    system("rm left.fa");
+
+    my $rl=$a[4]-150;
+    my $ru=$a[4]+150;
+    system("twoBitToFa $ARGV[1] -seq=$a[2] -start=$rl -end=$ru right.fa");
+    open (seq, "<right.fa") or die "Can't open right.fa since $!\n";
+    $head=<seq>;
+    for my $k (0..5) {
+	$head=<seq>;
+	chomp($head);
+	$rightseq=$rightseq."$head";
+    }
+    $rightseq=uc($rightseq);
+    close seq;
+    system("rm right.fa");
+    
+
+    open in,"temp.sam";
+    my %pe=();
+    while(<in>)
+    {
+	chomp;
+	my @f=split/\t/,$_,12;
+	## read number 1 or 2
+	my ($rnum)=$f[1]=~/(\d)$/;
+	
+	## XT:A:* 
+	my ($xt)=$f[11]=~/XT:A:(.)/;
+
+	my $CIGAR=$f[5];
+	$CIGAR =~ s/S//g;
+	if ($f[5]=~/S/) {
+	
+	    ## Coordinate
+            my $coor=-10;
+	    my $overcoor=-10;
+            my $strand="";
+	    my @z=split(/M/, $f[5]);
+
+            if (($f[5]=~/S$/)&&($f[1]=~/r/))
+            {
+		my (@cigar_m)=$f[5]=~/(\d+)M/g;
+                my (@cigar_d)=$f[5]=~/(\d+)D/g;
+                my (@cigar_s)=$f[5]=~/(\d+)S/g;
+                my (@cigar_i)=$f[5]=~/(\d+)I/g;
+                my $aln_ln=sum(@cigar_m,@cigar_d);
+		$coor=$f[3]+$aln_ln-1;
+                $strand="-";
+
+		my (@clipped)=$z[1]=~/(\d+)S/g;
+		my $cliplen=sum(@clipped);
+#		print "$f[0]\n";
+#		print "$cliplen\t";
+		if ($cliplen >= 10) {
+		    my $clipseq=substr($f[9], length($f[9])-$cliplen, $cliplen);
+		    $overcoor = index($rightseq, $clipseq);
+#		    print "$clipseq\t$rightseq\t$overcoor\t";
+		    if ($overcoor > -1) {$overcoor += ($a[4] - 149);}
+		}
+#		print "\n";
+            }
+            elsif (($f[1]=~/R/)&&($z[0]=~/S/)) {
+		$coor=$f[3]; $strand="+";
+		my (@clipped)=$z[0]=~/(\d+)S/g;
+                my $cliplen=sum(@clipped);
+#		print "$f[0]\n";
+#		print "$cliplen\t";
+		if ($cliplen >= 10) {
+		    my $clipseq=substr($f[9], 0, $cliplen);
+		    $overcoor = index($leftseq, $clipseq);
+#		    print "$clipseq\t$leftseq\t$overcoor\t";
+		    if ($overcoor > -1) {$overcoor += ($a[3] - 150 + $cliplen);}
+		}
+#		print "\n";
+	    }
+
+            if ($coor > 0) {
+                my $final="";
+		if ($overcoor > 0) {
+		    if ($strand eq "-") {$final="$coor\-$overcoor"."\($strand\)";}
+		    else {$final="$overcoor\-$coor"."\($strand\)";}
+		}
+		else {$final=$coor."\($strand\)";}
+		if (defined $pe{$final}) {$pe{$final}++;}
+		else {$pe{$final}=1;}
+            }
+
+	}
+    }
+    close in;
+
+    my $clip_site="";
+    
+    foreach my $coor (keys %pe)
+    {
+	if ($pe{$coor} >= 2) {
+	    $clip_site=$clip_site."$coor\:$pe{$coor}\;";
+	}
+    }
+
+    chop($clip_site);
+    print "$a[2]\t$a[3]\t$a[4]\t$a[5]\t$clip_site\n";
+    system("rm temp.sam temp.bam");
+#    last;
+}
+
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/scripts/pickUniqIntervalPos.pl	Mon Apr 25 13:08:56 2016 -0400
@@ -0,0 +1,36 @@
+#!/share/bin/perl
+use Bio::Seq;
+use List::Util qw(sum);
+
+die "perl $0 <sam> <fragment_size>\n" if @ARGV<2;
+open in,$ARGV[0];
+my %pe;
+while(<in>)
+{
+	chomp;
+	my @f=split/\t/,$_,12;
+	## read number 1 or 2
+	my ($rnum)=$f[1]=~/(\d)$/;
+
+	## XT:A:* 
+	my ($xt)=$f[11]=~/XT:A:(.)/;
+
+	my $strand="+";
+
+	## parse CIGAR
+	if(($f[1]=~/R/)&&($f[8] > $ARGV[1])&&($f[8] <= 10000))
+        {
+                # CIGAR
+                my (@cigar_m)=$f[5]=~/(\d+)M/g;
+                my (@cigar_d)=$f[5]=~/(\d+)D/g;
+                my (@cigar_s)=$f[5]=~/(\d+)S/g;
+                my (@cigar_i)=$f[5]=~/(\d+)I/g;
+                my $aln_ln=sum(@cigar_m,@cigar_d);
+		
+#		print $f[2],"\t",$f[3]-1+$aln_ln,"\t",$f[3]+$f[8],"\t$f[0]/$rnum\t","\n";
+		if ($f[2] =~ /^\d{1,2}$/) {$f[2]="chr$f[2]";}
+		print $f[2],"\t",$f[3]-6+$aln_ln,"\t",$f[7]+5,"\t$f[0]/$rnum\t","\n";
+	}
+}
+close in;
+
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/scripts/pickUniqMate.pl	Mon Apr 25 13:08:56 2016 -0400
@@ -0,0 +1,94 @@
+#!/share/bin/perl
+use List::Util qw(sum);
+use Bio::Seq;
+
+die "perl $0 <mate sam with header> <uniq bed>\n" if @ARGV<2;
+
+open in,$ARGV[1];
+my %uniq;
+while(<in>)
+{
+	chomp;
+	my @f=split;
+	$uniq{$f[3]}=[@f];
+}
+close in;
+
+open in,$ARGV[0];
+my (%te,@ref,%ref);
+while(<in>)
+{
+	chomp;
+	my @f=split/\t/,$_,12;
+	# headers
+	if(/^\@SQ/)
+	{
+		my ($sn,$ln)=/SN:(.*?)\tLN:(\d+)/;
+		push @ref,[$sn,$ln];
+		$ref{$sn}=$#ref;
+		next;
+	}
+
+	# unmapped
+	next if $f[2] eq "*";
+	
+	# alignments
+	if($f[11]=~/XT:A:/)
+	{
+		my ($rnum)=$f[1]=~/(\d)$/;
+		# CIGAR
+		my (@cigar_m)=$f[5]=~/(\d+)M/g;
+		my (@cigar_d)=$f[5]=~/(\d+)D/g;
+		my (@cigar_s)=$f[5]=~/(\d+)S/g;
+		my (@cigar_i)=$f[5]=~/(\d+)I/g;
+		my $aln_ln=sum(@cigar_m,@cigar_d);
+		
+		my $strand="+";
+        	if($f[1]=~/r/)
+        	{
+                	my $seq=Bio::Seq->new(-seq=>$f[9]);
+                	$f[9]=$seq->revcom->seq;
+                	$strand="-";
+        	}
+
+		# align to the junctions
+		if(($f[3]+$aln_ln-1)>${$ref[$ref{$f[2]}]}[1])
+		{
+			if(($f[3]+($aln_ln-1)/2)>${$ref[$ref{$f[2]}]}[1])
+			{
+				$f[2]=${$ref[$ref{$f[2]}+1]}[0];
+				$f[3]=1;
+				$aln_ln=$aln_ln-(${$ref[$ref{$f[2]}]}[1]-$f[3]+1);
+			}
+			else
+			{
+				$aln_ln=${$ref[$ref{$f[2]}]}[1]-$f[3]+1;
+			}
+		}
+
+		$pe{$f[0]}{$rnum}=$f[2].",".$strand."$f[3]".";";
+
+		# XA tag
+		if($f[11]=~/XA:Z:/)
+		{
+			my ($xa)=$f[11]=~/XA:Z:(.*);$/; 
+			my @xa=split(";",$xa);
+			$pe{$f[0]}{$rnum}.=join(",",(split/,/)[0,1]).";" foreach @xa;
+		}
+	}
+}
+close in;
+
+foreach my $id (keys %pe)
+{
+	next if exists $pe{$id}{1} && exists $pe{$id}{2} && exists $uniq{$id."/1"} && exists $uniq{$id."/2"};
+	foreach my $rid (keys %{$pe{$id}})
+	{
+		my $mate_id=($rid==1)?2:1;
+		if(exists $uniq{$id."/".$mate_id})
+		{
+			${$uniq{$id."/".$mate_id}}[4]=$pe{$id}{$rid};
+			print join("\t",@{$uniq{$id."/".$mate_id}}),"\n";
+		}
+	}
+}
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/scripts/pickUniqPairFastq.pl	Mon Apr 25 13:08:56 2016 -0400
@@ -0,0 +1,45 @@
+#!/share/bin/perl
+use Bio::Seq;
+
+die "perl $0 <sam> <output prefix>\n" if @ARGV<2;
+
+open m1,">$ARGV[1].1.fastq";
+open m2,">$ARGV[1].2.fastq";
+
+open in,$ARGV[0];
+my %pe;
+while(<in>)
+{
+	chomp;
+	my @f=split/\t/,$_,12;
+	## read number 1 or 2
+	my ($rnum)=$f[1]=~/(\d)$/;
+
+	## XT:A:* 
+	my ($xt)=$f[11]=~/XT:A:(.)/;
+
+	## revcom the read mapped to the reverse strand
+	if($f[1]=~/r/)
+	{
+		my $seq=Bio::Seq->new(-seq=>$f[9]);
+		$f[9]=$seq->revcom->seq;
+		$f[10]=reverse $f[10];
+	}
+	if (($rnum == 1) || ($rnum == 2))
+	{
+	    ${$pe{$f[0]}}[$rnum-1]=[$xt,$f[9],$f[10]];
+	}
+}
+close in;
+
+foreach my $id (keys %pe)
+{
+	my @rid=@{$pe{$id}};
+	if (($rid[0][1] ne "") && ($rid[1][1] ne "") && (($rid[0][0] eq "U" || $rid[1][0] eq "U")))
+	{
+		print m2 "@"."$id/2","\n",$rid[1][1],"\n","+$id/2","\n",$rid[1][2],"\n";
+		print m1 "@"."$id/1","\n",$rid[0][1],"\n","+$id/1","\n",$rid[0][2],"\n";
+	}
+}
+close m1;
+close m2;
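pickUniqPairFastq.pl restores each read to its sequenced orientation before writing FASTQ: reads flagged as reverse-strand are reverse-complemented and their quality string is reversed. A minimal Python sketch of that step (the script itself is Perl), assuming the flag column holds a textual flag string in which the letter `r` marks the reverse strand, as the script's own test suggests; the helper names are hypothetical:

```python
# Translation table for complementing DNA, preserving case and N.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def original_orientation(seq, qual, flag):
    """Undo the aligner's reverse-complement for reverse-strand reads."""
    if "r" in flag:
        return seq.translate(_COMPLEMENT)[::-1], qual[::-1]
    return seq, qual


def fastq_record(read_id, mate, seq, qual):
    """Format one four-line FASTQ record the way the script prints it."""
    return f"@{read_id}/{mate}\n{seq}\n+{read_id}/{mate}\n{qual}\n"


seq, qual = original_orientation("AACG", "IIH#", "pPr1")
print(fastq_record("read7", 1, seq, qual), end="")
```

Reversing the quality string together with the sequence keeps each base paired with its own quality score.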
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/scripts/pickUniqPairFastq_MEM.pl	Mon Apr 25 13:08:56 2016 -0400
@@ -0,0 +1,60 @@
+#!/share/bin/perl
+use Bio::Seq;
+
+die "perl $0 <sam> <output prefix> [AS-XS threshold]\n" if @ARGV<2;
+
+open m1,">$ARGV[1].1.fastq";
+open m2,">$ARGV[1].2.fastq";
+
+open in,$ARGV[0];
+my %pe;
+while(<in>)
+{
+	chomp;
+	my @f=split/\t/,$_,12;
+	## read number 1 or 2
+	my ($rnum)=$f[1]=~/(\d)$/;
+
+	## emulate XT:A:* from AS/XS: "R" (repeat) if XS>0 and AS-XS <= $ARGV[2], otherwise "U" (unique)
+	my $xt="";
+	my @a=split(/\s+/, $_);
+	my $as=0;
+	my $xs=0;
+	for my $i (11..$#a) {
+	    if ($a[$i] =~ /^AS:i:/) {
+		$a[$i] =~ s/AS:i://;
+		$as=$a[$i];
+	    }
+	    elsif ($a[$i] =~ /^XS:i:/) {
+		$a[$i] =~ s/XS:i://;
+		$xs=$a[$i];
+	    }
+	    if (($xs > 0) && ($as-$xs <= $ARGV[2])) {$xt="R";}
+	    else {$xt="U";}
+	}
+
+	## revcom the read mapped to the reverse strand
+	if($f[1]=~/r/)
+	{
+		my $seq=Bio::Seq->new(-seq=>$f[9]);
+		$f[9]=$seq->revcom->seq;
+		$f[10]=reverse $f[10];
+	}
+	if (($rnum == 1) || ($rnum == 2))
+	{
+	    ${$pe{$f[0]}}[$rnum-1]=[$xt,$f[9],$f[10]];
+	}
+}
+close in;
+
+foreach my $id (keys %pe)
+{
+	my @rid=@{$pe{$id}};
+	if (($rid[0][1] ne "") && ($rid[1][1] ne "") && (($rid[0][0] eq "U" || $rid[1][0] eq "U")))
+	{
+		print m2 "@"."$id/2","\n",$rid[1][1],"\n","+$id/2","\n",$rid[1][2],"\n";
+		print m1 "@"."$id/1","\n",$rid[0][1],"\n","+$id/1","\n",$rid[0][2],"\n";
+	}
+}
+close m1;
+close m2;
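The `_MEM` variant derives a unique/repeat label from the `AS` (best alignment score) and `XS` (suboptimal score) tags instead of reading an `XT` tag. A minimal Python sketch of that rule (the script itself is Perl); the function name `classify` and the parameter `max_gap` are hypothetical stand-ins for the script's third command-line argument:

```python
def classify(tags, max_gap):
    """Label a read 'R' (repeat) or 'U' (unique) from its optional SAM tags."""
    as_score = 0
    xs_score = 0
    for tag in tags:
        if tag.startswith("AS:i:"):
            as_score = int(tag[5:])
        elif tag.startswith("XS:i:"):
            xs_score = int(tag[5:])
    # A read counts as a repeat when a secondary hit exists and scores
    # within max_gap of the best hit.
    if xs_score > 0 and as_score - xs_score <= max_gap:
        return "R"
    return "U"


print(classify(["NM:i:0", "AS:i:90", "XS:i:88"], 5))
```

One difference from the Perl code: for a line with no optional tags at all the script leaves the label empty, whereas this sketch returns `U`.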
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/scripts/pickUniqPos.pl	Mon Apr 25 13:08:56 2016 -0400
@@ -0,0 +1,41 @@
+#!/share/bin/perl
+use Bio::Seq;
+use List::Util qw(sum);
+
+die "perl $0 <sam>\n" if @ARGV<1;
+open in,$ARGV[0];
+my %pe;
+while(<in>)
+{
+	chomp;
+	my @f=split/\t/,$_,12;
+	## read number 1 or 2
+	my ($rnum)=$f[1]=~/(\d)$/;
+
+	## XT:A:* 
+	my ($xt)=$f[11]=~/XT:A:(.)/;
+
+	my $strand="+";
+	## revcomp
+	if($f[1]=~/r/)
+        {
+                my $seq=Bio::Seq->new(-seq=>$f[9]);
+                $f[9]=$seq->revcom->seq;
+		$strand="-";
+        }
+
+	## parse CIGAR
+	if($xt eq "U")
+        {
+                # CIGAR
+                my (@cigar_m)=$f[5]=~/(\d+)M/g;
+                my (@cigar_d)=$f[5]=~/(\d+)D/g;
+                my (@cigar_s)=$f[5]=~/(\d+)S/g;
+                my (@cigar_i)=$f[5]=~/(\d+)I/g;
+                my $aln_ln=sum(@cigar_m,@cigar_d);
+		
+		print $f[2],"\t",$f[3]-1,"\t",$f[3]-1+$aln_ln,"\t$f[0]/$rnum\t",$f[9],"\t",$strand,"\n";
+	}
+}
+close in;
+
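pickUniqPos.pl turns each uniquely mapped read into a BED-style interval: the reference span is the sum of the `M` and `D` CIGAR operations, and the 1-based SAM position becomes a 0-based start. A minimal Python sketch of that arithmetic (the script itself is Perl); the function names are hypothetical:

```python
import re


def aligned_length(cigar):
    """Reference bases covered by an alignment: M and D operations only."""
    ops = re.findall(r"(\d+)([MIDNSHP=X])", cigar)
    return sum(int(n) for n, op in ops if op in "MD")


def bed_interval(chrom, pos, cigar):
    """Convert a 1-based SAM position plus CIGAR into a 0-based half-open interval."""
    start = pos - 1
    return chrom, start, start + aligned_length(cigar)


print(bed_interval("chr2L", 101, "5S80M2D5M3I"))
```

Soft clips (`S`) and insertions (`I`) consume read bases but no reference bases, which is why the script parses them but leaves them out of the sum.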
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/scripts/pickUniqPos_MEM.pl	Mon Apr 25 13:08:56 2016 -0400
@@ -0,0 +1,56 @@
+#!/share/bin/perl
+use Bio::Seq;
+use List::Util qw(sum);
+
+die "perl $0 <sam> [AS-XS threshold]\n" if @ARGV<1;
+open in,$ARGV[0];
+my %pe;
+while(<in>)
+{
+	chomp;
+	my @f=split/\t/,$_,12;
+	## read number 1 or 2
+	my ($rnum)=$f[1]=~/(\d)$/;
+
+	## emulate XT:A:* from AS/XS: "R" (repeat) if XS>0 and AS-XS <= $ARGV[1], otherwise "U" (unique)
+        my $xt="";
+        my @a=split(/\s+/, $_);
+        my $as=0;
+        my $xs=0;
+        for my $i (11..$#a) {
+            if ($a[$i] =~ /^AS:i:/) {
+                $a[$i] =~ s/AS:i://;
+                $as=$a[$i];
+            }
+            elsif ($a[$i] =~ /^XS:i:/) {
+                $a[$i] =~ s/XS:i://;
+                $xs=$a[$i];
+            }
+            if (($xs > 0) && ($as-$xs <= $ARGV[1])) {$xt="R";}
+            else {$xt="U";}
+        }
+
+	my $strand="+";
+	## revcomp
+	if($f[1]=~/r/)
+        {
+                my $seq=Bio::Seq->new(-seq=>$f[9]);
+                $f[9]=$seq->revcom->seq;
+		$strand="-";
+        }
+
+	## parse CIGAR
+	if($xt eq "U")
+        {
+                # CIGAR
+                my (@cigar_m)=$f[5]=~/(\d+)M/g;
+                my (@cigar_d)=$f[5]=~/(\d+)D/g;
+                my (@cigar_s)=$f[5]=~/(\d+)S/g;
+                my (@cigar_i)=$f[5]=~/(\d+)I/g;
+                my $aln_ln=sum(@cigar_m,@cigar_d);
+		
+		print $f[2],"\t",$f[3]-1,"\t",$f[3]-1+$aln_ln,"\t$f[0]/$rnum\t",$f[9],"\t",$strand,"\n";
+	}
+}
+close in;
+
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/scripts/refine_breakpoint.ex.pl	Mon Apr 25 13:08:56 2016 -0400
@@ -0,0 +1,168 @@
+#! /usr/bin/perl
+
+use strict;
+
+my @files=<*.excision.cluster.*>;
+foreach my $file (@files) {
+    if (($file !~ /sfcp/)&&($file !~ /refsup/)) {
+	my $sfcp=$file.".sfcp";
+	my $title=$file.".refined.bp";
+	
+	open (input, "<$file") or die "Can't open $file since $!\n";
+	open (input1, "<$sfcp") or die "Can't open $sfcp since $!\n";
+	open (output, ">>$title") or die "Can't open $title since $!\n";
+	print output "Chr\tStart\tEnd\tTransposonName\t5\'_Junction\t3\'_Junction\n";
+	while (my $line=<input>) {
+	    chomp($line);
+	    my @a=split(/\s+/, $line);
+	    my $line1=<input1>;
+	    chomp($line1);
+	    my @b=split(/\t/, $line1);
+	    my @pos=split(/\;/, $b[4]);
+	    my $plusnext=""; my $minusnext="";
+	    my $plusover=0; my $minusover=0;
+	    my $lpcoor=""; my $lmcoor=""; my $rpcoor=""; my $rmcoor="";
+	    my $lp=0; my $lm=0; my $rp=0; my $rm=0;
+	    my %plus=(); my %minus=();
+	    foreach my $site (@pos) {
+		my @x=split(/\:/, $site);
+		my @y=split(/\(/, $x[0]);
+		chop($y[1]);
+		if (($y[0] =~ /\-/)&&($y[1] eq "+")&&($x[1] >= $plusover)) {
+		    if ($plusover >= 2) {$plusnext="$lpcoor\-$rpcoor\:$plusover";}
+		    $plusover=$x[1]; 
+		    my @z=split(/\-/, $y[0]);
+		    $lpcoor=$z[0]; $lp=$x[1];
+		    $rpcoor=$z[1]; $rp=$x[1];
+		}
+		elsif (($y[0] =~ /\-/)&&($y[1] eq "-")&&($x[1] >= $minusover)) {
+		    if ($minusover >= 2) {$minusnext="$lmcoor\-$rmcoor\:$minusover";}
+		    $minusover=$x[1];
+		    my @z=split(/\-/, $y[0]);
+		    $lmcoor=$z[0]; $lm=$x[1];
+		    $rmcoor=$z[1]; $rm=$x[1];
+		}
+		elsif (($y[0] !~ /\-/)&&($y[1] eq "+")) {
+		    $plus{$y[0]}=$x[1];
+		}
+		elsif (($y[0] !~ /\-/)&&($y[1] eq "-")) {
+		    $minus{$y[0]}=$x[1];
+		}
+	    }
+	    
+	    if (($plusnext ne "")&&($minusover == 0)) {
+		my @m=split(/\:/, $plusnext);
+		if (($m[1] >= 2)&&($m[1] == $plusover)) {
+		    my $count1=$m[1]; my $count2=$plusover;
+		    foreach my $id (keys %plus) {
+			if ($m[0] =~ /$id/) {$count1 += $plus{$id};}
+			elsif ($rpcoor == $id) {$count2 += $plus{$id};}
+		    }
+		    if ($count1 > $count2) {
+			my @n=split(/\-/, $m[0]);
+			$lpcoor=$n[0]; $lp=$m[1];
+			$rpcoor=$n[1]; $rp=$m[1];
+		    }
+		}
+	    }
+
+	    if (($minusnext ne "")&&($plusover == 0)) {
+		my @m=split(/\:/, $minusnext);
+		if (($m[1] >= 2)&&($m[1] == $minusover)) {
+		    my $count1=$m[1]; my $count2=$minusover;
+		    foreach my $id (keys %minus) {
+			if ($m[0] =~ /$id/) {$count1 += $minus{$id};}
+			elsif ($lmcoor == $id) {$count2 += $minus{$id};}
+		    }
+		    if ($count1 > $count2) {
+			my @n=split(/\-/, $m[0]);
+			$lmcoor=$n[0]; $lm=$m[1];
+			$rmcoor=$n[1]; $rm=$m[1];
+		    }
+		}
+	    }			    
+
+	    if (($plusover >= 2)&&($minusover >= 2)&&(($lpcoor-$rpcoor) != ($lmcoor-$rmcoor))) {
+		if ($plusnext ne "") {
+		    my @m=split(/\:/, $plusnext);
+		    my @n=split(/\-/, $m[0]);
+		    if ((($n[1]-$n[0]) == ($rmcoor-$lmcoor))&&($m[1] >= 2)) {
+			$rpcoor=$n[1];
+			$lpcoor=$n[0];
+			$plusover=$m[1];
+			$lp=$m[1];
+			$rp=$m[1];
+		    }
+		}
+		if ($minusnext ne "") {
+                    my @m=split(/\:/, $minusnext);
+                    my @n=split(/\-/, $m[0]);
+                    if ((($n[1]-$n[0]) == ($rpcoor-$lpcoor))&&($m[1] >= 2)) {
+			$rmcoor=$n[1];
+			$lmcoor=$n[0];
+			$minusover=$m[1];
+			$lm=$m[1];
+			$rm=$m[1];
+                    }
+		}
+	    }
+
+	    my $plusc=0; my $pluscoor="";
+	    my $minusc=0; my $minuscoor="";
+	    foreach my $id (keys %plus) {
+		if ($id eq $rpcoor) {
+		    $rp=$plusover+$plus{$id};
+		}
+		if ($plus{$id} > $plusc) {
+		    $plusc=$plus{$id};
+		    $pluscoor=$id;
+		}
+		elsif (($plus{$id} == $plusc)&&(abs($id-$b[2]) < abs($pluscoor-$b[2]))) {
+                    $plusc=$plus{$id};
+                    $pluscoor=$id;
+		}
+	    }
+	    foreach my $id (keys %minus) {
+		if ($id eq $lmcoor) {
+		    $lm=$minusover+$minus{$id};
+		}
+		if ($minus{$id} > $minusc) {
+		    $minusc=$minus{$id};
+		    $minuscoor=$id;
+		}
+		elsif (($minus{$id} == $minusc)&&(abs($id-$b[1]) < abs($minuscoor-$b[1]))) {
+                    $minusc=$minus{$id};
+                    $minuscoor=$id;
+		}
+	    }
+	    if ($plusover < 2) {
+		$lpcoor="";
+		if ($plusc >= 3) {$rpcoor=$pluscoor; $rp=$plusc;}
+		else {$rpcoor="";}
+	    }
+	    if ($minusover < 2) {
+		$rmcoor="";
+		if ($minusc >= 3) {$lmcoor=$minuscoor; $lm=$minusc;}
+		else {$lmcoor="";}
+	    }	    
+		
+	    my $bp1=""; my $bp2="";
+	    if (($lpcoor ne "")&&($lmcoor ne "")) {
+		$bp1="$lpcoor\(\+\)\:$lp,$lmcoor\(\-\)\:$lm";
+	    }
+	    elsif ($lpcoor ne "") {$bp1="$lpcoor\(\+\)\:$lp";}
+	    elsif ($lmcoor ne "") {$bp1="$lmcoor\(\-\)\:$lm";}
+	    if (($rpcoor ne "")&&($rmcoor ne "")) {
+		$bp2="$rpcoor\(\+\)\:$rp,$rmcoor\(\-\)\:$rm";
+	    }
+	    elsif ($rpcoor ne "") {$bp2="$rpcoor\(\+\)\:$rp";}
+	    elsif ($rmcoor ne "") {$bp2="$rmcoor\(\-\)\:$rm";}
+
+	    print output "$a[2]\t$a[3]\t$a[4]\t$a[5]\t$bp1\t$bp2\n";
+	}
+	
+	close input;
+	close input1;
+	close output;
+    }
+}
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/scripts/refine_breakpoint.in.pl	Mon Apr 25 13:08:56 2016 -0400
@@ -0,0 +1,73 @@
+#! /usr/bin/perl
+
+use strict;
+
+#my @files=<*.bp.bed.sfcp>;
+my @files=<*.clipped.reads.aln>;
+for my $file (@files) {
+
+    my $title=$file;
+    $title =~ s/clipped\.reads\.aln/insertion.refined.bp/;
+    my $title2=$file;
+    $title2 =~ s/clipped\.reads\.aln/insertion.bp.bed/;
+
+    open (input, "<$file") or die "Can't open $file since $!\n";
+    open (input2, "<$title2") or die "Can't open $title2 since $!\n";
+    open (output, ">>$title") or die "Can't open $title since $!\n";
+    while (my $line=<input>) {
+	chomp($line);
+	my @a=split(/\t/, $line);
+	my @b=split(/\;/, $a[4]);
+	my $plusmax="";
+	my $minusmax="";
+	my $plus=0;
+	my $minus=0;
+	my $bp="";
+
+	my $line2=<input2>;
+	my @z=split(/\t/, $line2);
+	my $psup=abs($z[6]);
+	my $msup=abs($z[7]);
+	my $strand=$z[4];
+	my $class=$z[5];
+
+	for my $element (@b) {
+	    my @c=split(/\:/, $element);
+	    chop($c[0]);
+	    my @d=split(/\(/, $c[0]);
+	    if (($d[1] eq "+") && ($c[1] > $plus)) {
+		$plusmax=$d[0];
+		$plus=$c[1];
+	    }
+	    elsif (($d[1] eq "-") && ($c[1] > $minus)) {
+		$minusmax=$d[0];
+		$minus=$c[1];
+	    }
+	}
+
+        if ($a[1] > 0) {
+	    $a[1] += 15;
+	    $a[2] -= 15;
+	    print output "$a[0]\t$a[1]\t$a[2]\t$a[3]\t$strand\t$class\t";
+	    if (($minus >= 1)&&($plus >= 1)&&(abs($plusmax-$minusmax) <= 25)) {
+		print output "$plusmax\(\+\)\t$minusmax\(\-\)\t$plus\t$minus\t";
+	    }
+	    elsif (($plus >= $minus)&&($plus >= 2)&&($plusmax >= $a[1])&&($plusmax <= $a[2])) {
+		print output "$plusmax\(\+\)\t$plusmax\(\-\)\t$plus\t0\t";
+	    }
+	    elsif (($minus >= 2)&&($minusmax >= $a[1])&&($minusmax <= $a[2])) {
+		print output "$minusmax\(\+\)\t$minusmax\(\-\)\t0\t$minus\t";
+	    }
+	    else {
+		my $mid=int(($a[1] + $a[2])/2);
+		print output "$mid\(\+\)\t$mid\(\-\)\t0\t0\t";
+	    }
+	    print output "$psup\t$msup\n";
+	}
+    }
+    close input;
+    close input2;
+    close output;
+    system("uniq $title > temp");
+    system("mv temp $title");
+}
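refine_breakpoint.in.pl reads a `;`-separated list of soft-clipping sites of the form `coord(strand):count` and keeps, for each strand, the coordinate with the highest read count. A minimal Python sketch of that selection (the script itself is Perl); the function name is hypothetical and the field layout is inferred from how the script splits the string:

```python
def best_breakpoints(field):
    """Pick the best-supported clipping coordinate on each strand."""
    best = {"+": ("", 0), "-": ("", 0)}
    for item in field.split(";"):
        if not item:
            continue
        site, count = item.split(":")
        # site looks like '2003871(+)': drop the ')' and split at '('.
        coord, strand = site[:-1].split("(")
        count = int(count)
        # Strictly greater, so the first site wins a tie (as in the script).
        if strand in best and count > best[strand][1]:
            best[strand] = (coord, count)
    return best


print(best_breakpoints("2003871(+):4;2003880(-):2;2003875(+):1;"))
```

The script then compares the two winners: in its first branch, when both strands have support and the coordinates lie within 25 bp of each other, it reports them as the two junctions of the insertion.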
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/scripts/summarize_excision.pl	Mon Apr 25 13:08:56 2016 -0400
@@ -0,0 +1,55 @@
+#! /usr/bin/perl
+
+use strict;
+
+my @files=<*.excision.cluster.*.refined.bp>;
+foreach my $file (@files) {
+    my $rfsp=$file.".refsup";
+    my $count=$file;
+    my $title=$file;
+    $count =~ s/\.refined\.bp//;
+    $title =~ s/excision/absence/;
+    $title =~ s/\.cluster\.rpmk//;
+    $title .= ".summary";
+	
+    open (input, "<$file") or die "Can't open $file since $!\n";
+    open (input1, "<$count") or die "Can't open $count since $!\n";
+    open (input2, "<$rfsp") or die "Can't open $rfsp since $!\n";
+    open (output, ">>$title") or die "Can't open $title since $!\n";
+    my $header=<input>; 
+    chomp($header);
+    print output "$header\tVariant\tReference\tFrequency\n";
+    while (my $line=<input>) {
+	chomp($line);
+	my @a=split(/\t/, $line);
+	my $line1=<input1>;
+	my $line2=<input2>;
+	chomp($line1);
+	chomp($line2);
+	my @b=split(/\s+/, $line1);
+	my @c=split(/\t/, $line2);
+
+	my $variant=$b[1];
+	my @x=split(/\:/, $a[4]);
+	my @y=split(/\:/, $a[5]);
+	if ($a[4] =~ /\,/) {
+	    my @m=split(/\,/, $x[1]);
+	    $variant += $m[0]+$x[2];
+	}
+	else {$variant += $x[1];}
+	if ($a[5] =~ /\,/) {
+	    my @n=split(/\,/, $y[1]);
+	    $variant += $n[0]+$y[2];
+	}
+	else {$variant += $y[1];}
+	my $ratio=sprintf("%.4f", ($variant*2)/($variant*2+$c[6]));
+	
+	$line =~ s/:\d+//g;
+	print output "$line\t$variant\t$c[6]\t$ratio\n";
+    }
+    
+    close input;
+    close input1;
+    close input2;
+    close output;
+}
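summarize_excision.pl reports a frequency for each absence as `2*variant / (2*variant + reference)`, formatted to four decimals. A minimal Python sketch of that calculation (the script itself is Perl); the function name is hypothetical, and the zero-denominator guard is an addition that the original script does not have:

```python
def absence_frequency(variant_reads, reference_reads):
    """Frequency as computed by the script: 2v / (2v + r), four decimals."""
    denominator = 2 * variant_reads + reference_reads
    if denominator == 0:
        # Added guard: the Perl code would fail with a division by zero here.
        return None
    return f"{2 * variant_reads / denominator:.4f}"


print(absence_frequency(3, 6))
```

The code does not state why the variant count is doubled, so the sketch reproduces the weighting without interpreting it.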
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/temp.xml	Mon Apr 25 13:08:56 2016 -0400
@@ -0,0 +1,69 @@
+<tool id="run_TEMP" name="Run TEMP" version="0.1.0">
+    <description></description>
+    <requirements>
+        <requirement type="package" version="1.6.924">perl-bioperl</requirement>
+        <requirement type="package" version="0.7.13">bwa</requirement>
+        <requirement type="package" version="2.25.0">bedtools</requirement>
+        <requirement type="package" version="0.1.19">samtools</requirement>
+        <requirement type="package" version="324">ucsc-twobittofa</requirement>
+    </requirements>
+    <stdio>
+        <exit_code range="1:" />
+    </stdio>
+    <command><![CDATA[
+
+        ln -f -s "${alignment.metadata.bam_index}" "${alignment.element_identifier}.bai" &&
+        ln -f -s "${alignment}" "${alignment.element_identifier}.bam" &&
+        bash $__tool_directory__/scripts/TEMP_Insertion.sh -i "$alignment" -s $__tool_directory__/scripts -r "$consensus_te_seqs" -t "$bed_te_locations" -m 3 -f "$median_insertsize" -c \${GALAXY_SLOTS:-2} &&
+        bash $__tool_directory__/scripts/TEMP_Absence.sh -i "$alignment" -s $__tool_directory__/scripts -r "$bed_te_locations" -t "$reference2bit" -f 500 -c \${GALAXY_SLOTS:-2} &&
+        mv ${alignment.element_identifier}.insertion.bp.bed $insertion_bed &&
+        mv ${alignment.element_identifier}.insertion.refined.bp $insertion_bed_refined &&
+        mv ${alignment.element_identifier}.insertion.refined.bp.summary $insertion_summary &&
+        mv ${alignment.element_identifier}.absence.refined.bp.summary $absence_summary &&
+        zip $archive  *insertion* *excision* *absence*
+    ]]></command>
+    <inputs>
+        <param format="bam" name="alignment" type="data" label="Alignment bam file"/>
+        <param format="twobit" name="reference2bit" type="data" label="Reference twobit file"/>
+        <param format="fasta" name="consensus_te_seqs" type="data" label="Consensus TE Seqs fasta file"/>
+        <param format="bed" name="bed_te_locations" type="data" label="TE Locations bed file"/>
+        <!--
+        <param format="tabular" name="te_families" type="data" label="TE Families"/>
+        <param format="gff" name="gff_te_locations" type="data" label="Reference TE insertion Locations with Family ID names GFF file"/>
+        -->
+        <param format="txt" name="median_insertsize" type="data" label="Median Insert Length"/>
+    </inputs>
+    <outputs>
+        <data format="bed" type="data" name="insertion_bed" label="Insertion BED file" />
+        <data format="bed" type="data" name="insertion_bed_refined" label="Insertion BED file (refined)" />
+        <data format="bed" type="data" name="insertion_summary" label="Insertion summary file" />
+        <data format="bed" type="data" name="absence_summary" label="Absence summary file" />
+        <data format="zip" type="data" name="archive" label="Compressed output files" />
+    </outputs>
+    <tests>
+        <test>
+            <param name="alignment" value="test_chromosome.sorted.bam" ftype="bam"/>
+            <param name="reference2bit" value="dm3_chr2L.2bit" ftype="twobit"/>
+            <param name="consensus_te_seqs" value="test_concensus.fa" ftype="fasta"/>
+            <param name="bed_te_locations" value="test_TE_annotation.bed" ftype="bed"/>
+            <output name="insertion_bed" file="test_chromosome.insertion.bp.bed" ftype="bed" />
+            <output name="insertion_bed_refined" file="test_chromosome.insertion.refined.bp" ftype="bed"/>
+            <output name="insertion_summary" file="test_chromosome.insertion.refined.bp.summary" ftype="bed"/>
+            <output name="absence_summary" file="test_chromosome.absence.refined.bp.summary" ftype="bed"/>
+        </test>
+    </tests>
+    <help> <![CDATA[
+
+
+TEMP is a software package for detecting transposable element (TE) insertions and absences from pooled high-throughput sequencing data.
+
+Current version v1.04
+
+Authors: Jiali Zhuang (jiali.zhuang@umassmed.edu) and Jie Wang (jie.wangj@umassmed.edu), Weng Lab, University of Massachusetts Medical School, Worcester, MA, USA
+
+For TE insertion analysis, run TEMP_Insertion.sh in the scripts directory.
+For TE absence analysis, run TEMP_Absence.sh in the scripts directory.
+
+
+    ]]> </help>
+</tool>
\ No newline at end of file
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/README	Mon Apr 25 13:08:56 2016 -0400
@@ -0,0 +1,7 @@
+This is a small simulated dataset for testing whether TEMP is properly installed and for demonstrating how it works.
+
+10 TE insertions and 5 TE excisions were generated in the chr2L:2000000-3000000 region of the Drosophila melanogaster reference genome (dm3), and paired-end Illumina reads were simulated with an insert size of 500, a read length of 90, and an error rate of 0.0005. The reads were mapped to the dm3 reference genome; the alignments are in the file "test_chromosome.sorted.bam".
+
+The 10 simulated insertions and 5 excisions are listed in the file "test_chromosome.sites".
+The consensus sequences for those 10 TEs are in the file "test_concensus.fa".
+The annotated TE insertions in the reference genome are listed in the file "test_TE_annotation.bed".
Binary file test-data/dm3_chr2L.2bit has changed
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/test_TE_annotation.bed	Mon Apr 25 13:08:56 2016 -0400
@@ -0,0 +1,115 @@
+chr2L	1301606	1302488	FBgn0001167_gypsy	.	-
+chr2L	2094501	2094580	FBgn0000155_roo	.	-
+chr2L	2100429	2109522	FBgn0000155_roo	.	-
+chr2L	2112167	2118361	FBgn0003007_opus	.	-
+chr2L	2118446	2119772	FBgn0003007_opus	.	-
+chr2L	2159453	2159556	FBgn0000155_roo	.	-
+chr2L	2267349	2267457	FBgn0000155_roo	.	+
+chr2L	2294096	2299243	FBgn0000349_copia	.	-
+chr2L	2378805	2378893	FBgn0000155_roo	.	+
+chr2L	2530303	2530389	FBgn0000155_roo	.	+
+chr2L	2565592	2569028	FBgn0000005_297	.	+
+chr2L	2565667	2565886	FBgn0000004_17.6	.	+
+chr2L	2565869	2566006	FBgn0063450_Tom1	.	+
+chr2L	2565871	2566024	FBgn0061485_rover	.	+
+chr2L	2565920	2566024	FBgn0063917_McClintock	.	+
+chr2L	2566158	2569026	FBgn0000004_17.6	.	+
+chr2L	2566674	2566848	FBgn0061485_rover	.	+
+chr2L	2567367	2569022	FBgn0061485_rover	.	+
+chr2L	2567598	2567816	FBgn0063917_McClintock	.	+
+chr2L	2567665	2569027	FBgn0044355_Quasimodo	.	+
+chr2L	2568060	2569027	FBgn0026065_Idefix	.	+
+chr2L	2568062	2569018	FBgn0063917_McClintock	.	+
+chr2L	2568070	2569004	FBgn0063447_accord	.	+
+chr2L	2568121	2568988	FBgn0004082_Tirant	.	+
+chr2L	2568137	2569006	FBgn0063432_gypsy5	.	+
+chr2L	2568137	2568942	FBgn0040267_Transpac	.	+
+chr2L	2568153	2569001	FBgn0063782_accord2	.	-
+chr2L	2568154	2568990	FBgn0023131_ZAM	.	+
+chr2L	2568193	2568993	FBgn0003007_opus	.	+
+chr2L	2568251	2568697	FBgn0000006_412	.	+
+chr2L	2568264	2568985	FBgn0063434_gypsy3	.	+
+chr2L	2568264	2568985	FBgn0003490_springer	.	+
+chr2L	2568308	2568520	FBgn0067387_gypsy10	.	+
+chr2L	2568308	2568517	FBgn0067384_gypsy7	.	+
+chr2L	2568308	2568878	FBgn0063431_gypsy6	.	+
+chr2L	2568308	2568703	FBgn0001167_gypsy	.	+
+chr2L	2568313	2568828	FBgn0002697_mdg1	.	+
+chr2L	2568313	2568526	FBgn0000199_blood	.	+
+chr2L	2568329	2568982	FBgnnnnnnnn_HMS-Beagle2	.	+
+chr2L	2568329	2568982	FBgn0001207_HMS-Beagle	.	+
+chr2L	2568378	2568648	FBgn0063897_Stalker4	.	+
+chr2L	2568378	2568878	FBgn0063433_gypsy4	.	+
+chr2L	2568384	2568646	FBgn0063435_gypsy2	.	+
+chr2L	2568384	2568796	FBgn0002698_mdg3	.	+
+chr2L	2569006	2569756	FBgn0063917_McClintock	.	+
+chr2L	2569007	2571200	FBgn0000004_17.6	.	+
+chr2L	2569010	2571603	FBgn0000005_297	.	+
+chr2L	2569018	2569804	FBgn0061485_rover	.	+
+chr2L	2569064	2570806	FBgn0044355_Quasimodo	.	+
+chr2L	2569064	2569752	FBgn0026065_Idefix	.	+
+chr2L	2569859	2571024	FBgn0061485_rover	.	+
+chr2L	2569987	2570809	FBgn0026065_Idefix	.	+
+chr2L	2570511	2570703	FBgn0063917_McClintock	.	+
+chr2L	2571048	2571200	FBgn0063917_McClintock	.	+
+chr2L	2571264	2571483	FBgn0000004_17.6	.	+
+chr2L	2571466	2571603	FBgn0063450_Tom1	.	+
+chr2L	2571468	2571592	FBgn0061485_rover	.	+
+chr2L	2661257	2663012	FBgn0001249_I-element	.	+
+chr2L	2713413	2713444	FBgn0063371_transib2	.	-
+chr2L	2772652	2776969	FBgn0000005_297	.	+
+chr2L	2772727	2772946	FBgn0000004_17.6	.	+
+chr2L	2772929	2773066	FBgn0063450_Tom1	.	+
+chr2L	2772931	2773084	FBgn0061485_rover	.	+
+chr2L	2772980	2773084	FBgn0063917_McClintock	.	+
+chr2L	2773736	2773910	FBgn0061485_rover	.	+
+chr2L	2774429	2776968	FBgn0061485_rover	.	+
+chr2L	2774429	2776968	FBgn0000004_17.6	.	+
+chr2L	2774660	2774878	FBgn0063917_McClintock	.	+
+chr2L	2774727	2776980	FBgn0044355_Quasimodo	.	+
+chr2L	2775122	2776985	FBgn0026065_Idefix	.	+
+chr2L	2775124	2776969	FBgn0063917_McClintock	.	+
+chr2L	2775132	2776531	FBgn0063447_accord	.	+
+chr2L	2775183	2776509	FBgn0004082_Tirant	.	+
+chr2L	2775199	2776553	FBgn0063432_gypsy5	.	+
+chr2L	2775199	2776494	FBgn0040267_Transpac	.	+
+chr2L	2775215	2776321	FBgn0063782_accord2	.	-
+chr2L	2775216	2776513	FBgn0023131_ZAM	.	+
+chr2L	2775255	2776055	FBgn0003007_opus	.	+
+chr2L	2775313	2775759	FBgn0000006_412	.	+
+chr2L	2775326	2776047	FBgn0063434_gypsy3	.	+
+chr2L	2775326	2776047	FBgn0003490_springer	.	+
+chr2L	2775370	2775579	FBgn0067384_gypsy7	.	+
+chr2L	2775370	2775765	FBgn0001167_gypsy	.	+
+chr2L	2775375	2775890	FBgn0002697_mdg1	.	+
+chr2L	2775375	2775588	FBgn0000199_blood	.	+
+chr2L	2775391	2776044	FBgnnnnnnnn_HMS-Beagle2	.	+
+chr2L	2775391	2776044	FBgn0001207_HMS-Beagle	.	+
+chr2L	2775429	2775767	FBgn0010302_Burdock	.	+
+chr2L	2775440	2775710	FBgn0063897_Stalker4	.	+
+chr2L	2775440	2776515	FBgn0063433_gypsy4	.	+
+chr2L	2775442	2775582	FBgn0067387_gypsy10	.	+
+chr2L	2775446	2775858	FBgn0002698_mdg3	.	+
+chr2L	2776093	2776340	FBgn0000199_blood	.	+
+chr2L	2776099	2776519	FBgnnnnnnnn_HMS-Beagle2	.	+
+chr2L	2776156	2776324	FBgn0063436_gtwin	.	+
+chr2L	2776156	2776516	FBgn0063431_gypsy6	.	+
+chr2L	2776179	2776389	FBgn0003007_opus	.	+
+chr2L	2776938	2777318	FBgn0063917_McClintock	.	+
+chr2L	2776958	2777320	FBgn0000004_17.6	.	+
+chr2L	2776962	2777324	FBgn0061485_rover	.	+
+chr2L	2776962	2777324	FBgn0000005_297	.	+
+chr2L	2776975	2777315	FBgn0044355_Quasimodo	.	+
+chr2L	2777321	2779175	FBgn0000005_297	.	+
+chr2L	2777323	2778772	FBgn0000004_17.6	.	+
+chr2L	2777510	2778596	FBgn0061485_rover	.	+
+chr2L	2777559	2778381	FBgn0026065_Idefix	.	+
+chr2L	2777565	2778378	FBgn0044355_Quasimodo	.	+
+chr2L	2778083	2778275	FBgn0063917_McClintock	.	+
+chr2L	2778620	2778772	FBgn0063917_McClintock	.	+
+chr2L	2778836	2779055	FBgn0000004_17.6	.	+
+chr2L	2779038	2779175	FBgn0063450_Tom1	.	+
+chr2L	2779040	2779164	FBgn0061485_rover	.	+
+chr2L	2933353	2935475	FBgn0003122_pogo	.	-
+chr2L	2945631	2945785	FBgn0000155_roo	.	+
+chr2L	2963474	2963538	FBgn0000155_roo	.	+
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/test_chromosome.sites	Mon Apr 25 13:08:56 2016 -0400
@@ -0,0 +1,15 @@
+chr2L	2966910	2966911	FBgn0000481_Doc	Insertion
+chr2L	2965226	2965227	FBgn0001167_gypsy	Insertion
+chr2L	2933354	2935475	FBgn0003122_pogo	Excision
+chr2L	2920517	2920518	FBgn0010302_Burdock	Insertion
+chr2L	2763540	2763541	FBgn0000199_blood	Insertion
+chr2L	2714436	2714437	FBgn0000349_copia	Insertion
+chr2L	2661258	2663012	FBgn0001249_I-element	Excision
+chr2L	2569343	2569344	FBgn0004141_HeT-A	Insertion
+chr2L	2412907	2412908	FBgn0003055_P-element	Insertion
+chr2L	2397941	2397942	FBgn0000155_roo	Insertion
+chr2L	2294097	2299243	FBgn0000349_copia	Excision
+chr2L	2131306	2131307	FBgn0001283_jockey	Insertion
+chr2L	2112168	2119772	FBgn0003007_opus	Excision
+chr2L	2100430	2109522	FBgn0000155_roo	Excision
+chr2L	2003871	2003872	FBgn0003122_pogo	Insertion
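The sites file above has five columns: chromosome, start, end, TE name and event type. A minimal Python sketch for tallying the event types, useful as a sanity check against the 10 insertions and 5 excisions that the README describes; the function name is hypothetical, and fields are split on any run of whitespace so that a stray extra tab does not shift the columns:

```python
from collections import Counter


def tally_sites(lines):
    """Count event types (fifth column) in a TEMP test sites listing."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) >= 5:
            counts[fields[4]] += 1
    return counts


example = [
    "chr2L\t2966910\t2966911\tFBgn0000481_Doc\tInsertion",
    "chr2L\t2933354\t2935475\tFBgn0003122_pogo\tExcision",
    "chr2L\t2397941\t2397942\tFBgn0000155_roo\t\tInsertion",
]
print(tally_sites(example))
```

To check the real file, pass an open file handle for `test_chromosome.sites` in place of the three example lines.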
Binary file test-data/test_chromosome.sorted.bam has changed
Binary file test-data/test_chromosome.sorted.bam.bai has changed
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/test_concensus.fa	Mon Apr 25 13:08:56 2016 -0400
@@ -0,0 +1,1143 @@
+>FBgn0010302_Burdock
+AGTTAACACAATCACAAAACACCCGAAATATAGTCGTAAGCCTCAAGTGC
+TTTTCCCATCTATAGATCGAGCTTTACCTATAAGAAACTGTAACTTGTTA
+AGCTTTAGAGATAAGAACTCTTGCTATACTTAAGTCAGTCGATTTTGGAA
+GATTAGAAGCGTCGGTCATCGCCACGTACTTACTATTCGTCTCATTAAGT
+GCAGACCGCGCAAGCCTATTGTAATTAATAAACTTACGCTAATAAATATA
+TGGAAAATCTACTAAAATGATAATTGGCGCCCAAACGGATATAAAAACCT
+ACGATAACTGAATAATTATAAATAAATAACAAAAGGAGGATCCGGAGACA
+AAACCAGCGGCTTTGGCTAATTAACTCTAACCTAAGAAATAAAAATTTGC
+TGATTACATAAAATATAATATTAATTACTAAGACCATCTACCTTAAAATT
+GTTTGTTAATCACTATTATTATATTGTAAGTATAACGCTTATTGAACGAA
+TTAAAAATATTATTATTATTATTATATTATAACCTATGCAAAGAGTATTG
+ATAATAAAAATACATGAGTGACAGTGATAACCTTTTAGACAACCTAGTGT
+CAAGCTTAAATAAATGGTCAGCGCACCAGGCAAGTAGGCAAAACAGTGCA
+GAAAAAAATAATAAGTCATCAGATAATTGGTGGTCAAAAACAAAGACAAC
+TAGCGAAATGGAATTTGAAGCTCAGTTAAAAGCGATCGTAGAGAGTGCTG
+TTGCCGGTGCGCTCGCAGTCCAAAAACAATCATTTGAAAAGCAATTGCAG
+GAGATGAATGAGCGAATCGGGAAATTAACAGTGAACACCCCAGAGGTGGA
+AACTTATGTAGATGCTGAAATTAGACCAGGTGTTGTCTGTAGCGAGCCTC
+TAGATATACTTAAATCTCTGCCAGATTTTGATGGCAAAAGTGAAACATAT
+GTGTCGTGGAGAAAAGCGGCTCATGTCGCTTTTAAAGTTTTCAAAGATTA
+CGAGGGAAGTTCAACATTTTACCAAGCTCTTGGTATTATGCGAAATAAAA
+TAAAAGGTCCAGCGAATACAGTATTGGCTTCTTTTAATACTCCGTTACAT
+TTCAAAGCAATGATCAGCCGTCTTGATTTCACATATTCTGACAAAAGGCC
+GATCTATCTAATCGAACAAGAGCTATCAACTTTGCGACAGGGAGACATGA
+CTCTTACTGAATTCTACGATGAAGTCGAGAAAAAACTGACCCTACTTACC
+AACAAGACAATAATGACATTTGATAGTGCCTTGGCGATGTCACTGAATGA
+AAAGTACAGGACGGACGCGTTACGTGTATTTGTAACCGGAGCTAAGAAAT
+CGTTGAGCGACATTCTTTTTGCAAAAGGTCCAAAAGATTTACCAACTGCT
+CTCGCTTTAGCGCAAGAGGTCGAGTCGAACCATGAGCGTTACCAATTCGC
+CCTTATTTATTCTAAAAATATTGGAGACAGGGGTCAGAAAATCGAACAAA
+GGCACAGCGATAAGGATAGAAACTCAATCATGCCCATGCAAACTAAAAAC
+CCATATTTTAGCAAGCGTCAGGTGCATACTTATGATAACCAGGAAAGACA
+AGATCCAGTCCAGTTAACAAATCCTGATGTATCCATGCGATCTAGAAGAA
+CTGGAAATTTTGGACAAACTCCATTTCCGACTCAGGGAAATATTTGGCCA
+TCCCAACAGCAAAATTCTTGGCCATCTCAACAACAATATTCTTGGCCATC
+CCAACAACAAAATTCATTTCGAACACAAAATCAATTCGCATCGCAACCCC
+AACAGCAAAACACAAGTCAGGCTCAGGGACATTTTGGGTATGCGCAAGCA
+TCAAAAAGACCAACGAGTGGCAGTGCAAGGTTTACAGGGCCAAAACAGCA
+GAGGATCAACTACTTACCTCATGAGAAAGGTCAATGTGAGGAAGATACAG
+ACGGTTATCAAAAGGAGGCAGAAGCGGAGGTTGATGATTATGAGGACGAA
+CTAGTGAATTACGATCATGTTCATTTTTTAGCCACAAATCCCTGCTACCG
+TACATAGAAAGAGAGATAGCAGGGAGAACCATAAAACTTTTGATTGACAC
+CGGGGCTTCGAAAAATTACATACAGCCCCTCCCTGAATTAAAAAACATAA
+TGCCGGTACAAAATAAATTCACGGTAAAATCGCTTCATGGTTGCAACACC
+GTCAAACAGAAATGCTTTATTAAGCTATTTAACACATCTGTTCAATTCTT
+TATTCTTCCAAGTCTCTCTAGTTTTGACGCAATAATAGGACTTGACCTTT
+TGAAACAGGGAAATGCAACGTTAGATTTTAAGAACAAAACGTTGAATATC
+AACAATGAAGTGGAATCTATTCAGTTTTTGAGATGTGACAGCGTAAATTT
+CGCCAACATAGAGAATATTGTGGTTCCAAATCAGATATCTAATAAATTCC
+ATACAATGCTTCGAAACCGATTGGCCGTCTTTGCGGAACCGGAAGAAGCA
+CTGCCGTATAATACCAACATTGTTGCCACAATACGTACTGAGGACGACCA
+ACCCATTTACTCAAAACTCTATCCGTACCCCATGGGCGTATCGGATTTTG
+TGAATAAGGAGACACATGCTTTGTTAAAGGACGGAATTATCAGGCCCTCG
+TCGTCACCTTACAACAATCCGGTTTGGGTAGTCGATAAAAAAGGTACAGA
+TGAAGAGGGAAATACTAAGAAAAGGTTGGTTATAGATTTTAGAAAACTAA
+ATTTAAAAACAATCGACGACAAGTACCCTATACCAAACGTAGTATGGATC
+TTGTCAAATTTGGGAAAAGCCAGATTCTTTACAACCCTTGACCTTAAATC
+GGCGTTTCACCAAATTCTGCTCGCAGAAAAGGATAGAGCGAAAACTGCCT
+TTTCAGTAGGAAATGGAAAATACGAGTTTTGCCGTTTGCCGTTTGGCTTG
+AAAAATGCCCCAAGTATTTTTCAACGTGCTATTGATGATGTTGTTAGGGA
+CCGTATAGGAAAGTCATGTTACGTTTACGTTGACGACGTAATAATATTTT
+CAAACGGAATTGAGGACCACGTAAACGACGTTGCTTGGGTACTAGACAGA
+CTGTCTGGGGCAAACATGAGGGTTTCTAAAGAGAAATCGTTTTTCTTCAA
+GGAAAGCGTCGAGTATCTCGGATTCATGGTGTCAAGTGGAGGTATCACAA
+CCAGTCCTAGCAAAGTAGAGGCTATTCAGAAATATAATCAACCTACTAAT
+CTGTTTAGTGTTCGATCGTTTTTAGGGCTAGCAAGTTATTACCGCTGCTT
+TATTAAGGACTTCGCCTCTATTGCTAGACCACTCACTGACATTCTGAAGG
+GTGAAAACGGAAAGGTTTCCGCAAGCCAGTCTAAAAAGATACCAATTTCT
+TTCGATGAAAGACAATGTTCTGCTTTTGAGAAGCTTAAAAATGTTCTTGT
+CTCCGAAAATGTAATGTTATTGTATCCCGATTATAGAAAAGCCTTTGACT
+TAACAACAGACGCTTCGGCTTTTGGCCTGGGGGCAGTCTTATCACAGGAT
+GGCAAGCCTGTTACAATGATTTCGAGAACTTTACAGGATAGAGAACTTAA
+TTTCGCAACAAATGAACGAGAACTTTTGGCCATCGTTTGGGCTTTAAAGT
+CTCTTAGGAACTATCTATATGGTGTCAAAAACTTAAACATTTTTACAGAT
+CACCAGCCGTTAACATACGCCGTGTCAGATAGGAATCCAAATGCAAAAAT
+CAAGAGATGGAAGGCGTTTATAGACGAACATAATGCTAAAATTTTCTATA
+AACCTGGCAAGGAGACCTATGTTGCCGATGCACTATCCAGGCAGGCTATT
+CATGTCCTAGAGGACGAACCCCAGTCAGACATTGCAACAATACATAGCGA
+AATTTCATTGACTTTTACAATCGAAACTATCGACAAGCCGGTTAACTGTT
+TTAGAAACCAAATTGTGATAGATGAGGGCACCGCAGACTCAACTCGAACT
+TTTGTTATTTTCGGAAGCAAGACAAGGCATCTAATACAGTTTCTAGACAA
+AGAGACCTTAATCGGAAGAATTCGTGATGTGGTTAAGCCGGATGTAGTGA
+ATGCGATACACTGCGAATTACCTGTACTAGCTTTCATTCAAAACAGTCTT
+GTAAATGACTTTCCAGCAACAACCTTCCGACACACTATGAAAATGGTCAG
+CGACATTTTTAATCAAACTGAGCAACGGGAAATAGTGTCTTTGGAGCACA
+ACAGAGCGCATAGGGCAGCACAGGAGAATGTAAAACAAATTCTTCAATAC
+TACTTTTTCCCTAAAATGTCACAAATAGCCGCTACCTTTGTTTCTAACTG
+CTTGGTTTGTCAAAAAGCCAAATACGACCGCCATCCGCAAAAGCAAATCC
+TCGGGAGAACACCTATTCCGTCACATGTAGGCGAGACATTGCATATTGAT
+ATATTTTCTACGGGCAGGAATTACTTTTTGACATGTATTGACAAATTTTC
+CAAATTCGCTATTGTGCAACCAATCGGCTCTCGAACGATAACTGATTTAG
+AACCTGCAATTATGCAACTAATGAACTTTTTTCCCCATTCAAAGACAATA
+TTTTGTGACAATGAACCGTCCATAAATTCCGAGTCAATCAAGTCACTTTT
+GAAAAATCGTTTTAATGTTGACATAGCGAACGCACCTCCACTTCATAGTA
+CCTCAAACGGACAGGTTGAAAGGTTTCACAGCACGCTTTTAGAAATAGCT
+CGATGCCTGAAACTTGACAGTGGAATGAATGATACAGTCAACCTTATTCT
+TCAGGCAACAATAGAATACAATAAGACGGTGCACTCAGTCACCAATAGAA
+GACCGATCGACATTATTCATTCAACTCCTCCCGAATTGGCTAACGAGATA
+GTAGAAATGGTTAACGAAGCTCAGGAAAAACAGCTAAGAAGAGAAAATGT
+AACAAGACGAGACAGAACCTTTGAGGTGGGAGAAACCGTCATGGTAAAAC
+AAAACAATCGCTTGGGAAATAAACTAACCCCACGGTATAGGGAAGAACTA
+ATCGAAGCAGACCTCGGGACAACGGTCCTCATAAAAGGGAGGGTCGTTCA
+TAAAGATAATCTACGCTAGGTTTAGTATTTCTTTTCCTTTTGTGACCATC
+GCCAAGTTAGCAAAATACAAACGTGAAATCTGAACACTAGTAAAAGAGTT
+TGCAAACATTTTTCAATTAAATATTTGTCAAATCCTTCTTATTTAATCTT
+TAAACATTTTGTATTATTTCCGCTTCATCCTCTTTAGAAAATTTTAAAGG
+TATGTGATGAAATGCTAGACCCGAATGATTTGAAAACTTAAAGTCCACGC
+AACCACAAATATTTCCTGAAACTACCATAGAAAATAAATGCATTACCAAA
+ACGGCATAATAACAGTATAGCGCACTCACTCTAATTAGATTTCAAATTCC
+CGATTAAAAAAAAAATAAAACACTAATGTTATCAATACCCTTTCCTGATT
+CTGTTCAACTAAAATAGGAAAATCAATACTTGCAATCAATAAGCGTTTTA
+CTACATACTTTAATATCAAAATATCTGAATGAACTTTATTATAAAATTAT
+AATTGTTATACTTAATTATTGTCAAAACTTTAGTATTAAAACTGTAACTA
+CCTCTTAAGTAGATGAGAAGAGTAGAAGAGGGAATTAAGATCTATCAACG
+TAGTATCTGCTAAAGACGTAAAGATGCGGCAACTATTTCTGCGCCTGGGT
+ACTGAAACGACGAACTGAATAATATCTGCCATCAGACGCCAACCAGAGTG
+CGTTCAACACATACGTTTTGATGGTCAACTAGTTCAACCAACATCAGCAT
+CATCGTCGTCAACAAGTCGACGGTTACAATAAAGATTTTTTCCAAGTTCG
+CTACGATCATCTCCAGAACCTTGTTGCGAACCCATGACATGGAGAATCAG
+CAGCATTTACGAACTTCTCGGATCATCCAGACACGCAGAGCTGCCTTCCC
+TTCGATGGTTTAACGCAGTACCAGGTTGGCAGTATGGGAACTTAGTGCAC
+AACCAATGTTACCCGTAAGATCCGCTTTCAAATAGATTTGCCAATTGTAA
+AAAGTCTGTGGACAGCCTTCGTCTTAGAAGGGGAGGAGTTAACACAATCA
+CAAAACACCCGAAATATAGTCGTAAGCCTCAAGTGCTTTTCCCATCTATA
+GATCGAGCTTTACCTATAAGAAACTGTAACTTGTTAAGCTTTAGAGATAA
+GAACTCTTGCTATACTTAAGTCAGTCGATTTTGGAAGATTAGAAGCGTCG
+GTCATCGCCACGTACTTACTATTCGTCTCATTAAGTGCAGACCGCGCAAG
+CCTATTGTAATTAATAAACTTACGCTAATAAATATATGGAAAATCTACTA
+AAATGATAATT
+>FBgn0000349_copia
+TGTTGGAATATACTATTCAACCTACAAAAATAACGTTAAACAACACTACT
+TTATATTTGATATGAATGGCCACACCTTTTATGCCATAAAACATATTGTA
+AGAGAATACCACTCTTTTTATTCCTTCTTTCCTTCTTGTACGTTTTTTGC
+TGTGAGTAGGTCGTGGTGCTGGTGTTGCAGTTGAAATAACTTAAAATATA
+AATCATAAAACTCAAACATAAACTTGACTATTTATTTATTTATTAAGAAA
+GGAAATATAAATTATAAATTACAACAGGTTATGGGCCCAGTCCATGCCTA
+ATAAACAATTAAATTGTGAATTAAAGATTGTGAAAATAAATTGTGAAATA
+GCATTTTTTCACATTCTTGTGAAATAGCTTTTTTTTTCACATTCTTGTGA
+AATTATTTCCTTCTCAGAATTTGAGTGAAAAATGGACAAGGCTAAACGTA
+ATATTAAGCCGTTTGATGGCGAGAAGTACGCGATTTGGAAATTTAGAATT
+AGGGCTCTTTTAGCCGAGCAAGATGTGCTTAAAGTAGTTGATGGTTTAAT
+GCCTAACGAGGTAGATGACTCCTGGAAAAAGGCAGAGCGTTGTGCAAAAA
+GTACAATAATAGAGTACCTAAGCGACTCGTTTTTAAATTTCGCAACAAGC
+GACATTACGGCGCGTCAGATTCTTGAGAATTTGGACGCCGTTTATGAACG
+AAAAAGTTTGGCGTCGCAACTGGCGCTGCGAAAACGTTTGCTTTCTCTGA
+AGCTATCGAGTGAGATGTCACTATTAAGCCATTTTCATATTTTTGACGAA
+CTTATAAGTGAATTGTTGGCAGCTGGTGCAAAAATAGAAGAGATGGATAA
+AATTTCTCATCTACTGATCACATTGCCTTCGTGTTACGATGGAATTATTA
+CAGCGATAGAGACATTATCTGAAGAAAATTTGACATTGGCGTTTGTGAAA
+AATAGATTGCTGGATCAAGAAATTAAAATTAAAAATGACCACAACGATAC
+AAGCAAGAAAGTTATGAACGCGATCGTGCACAACAATAATAACACTTATA
+AAAATAATTTGTTTAAAAATCGGGTAACTAAACCAAAGAAAATATTCAAG
+GGAAATTCAAAGTATAAAGTCAAGTGTCACCACTGTGGCAGAGAAGGCCA
+CATTAAAAAAGATTGTTTCCATTATAAAAGAATATTAAATAATAAAAATA
+AAGAAAATGAAAAACAAGTTCAAACTGCAACATCACACGGCATTGCGTTT
+ATGGTAAAAGAAGTGAATAATACTTCAGTGATGGACAACTGCGGGTTTGT
+CCTTGATTCTGGTGCTAGTGACCATCTTATAAATGATGAGTCGCTGTATA
+CCGACAGTGTGGAGGTTGTGCCTCCACTTAAGATTGCAGTGGCCAAGCAA
+GGCGAATTTATTTATGCCACTAAGCGTGGTATTGTCCGACTACGGAATGA
+CCATGAGATTACACTGGAGGATGTACTCTTTTGTAAGGAAGCTGCTGGTA
+ATTTGATGTCCGTAAAGCGTCTCCAAGAGGCAGGAATGTCGATCGAATTT
+GACAAAAGCGGTGTAACCATTTCGAAAAATGGGTTAATGGTTGTCAAAAA
+TTCAGGTATGTTAAACAATGTACCTGTGATCAATTTTCAAGCATATTCTA
+TAAATGCTAAGCATAAAAATAATTTTCGTTTATGGCATGAGAGGTTTGGC
+CATATAAGCGATGGCAAATTATTAGAAATAAAACGAAAGAATATGTTTAG
+TGATCAAAGTCTTCTAAACAACTTAGAGTTATCATGTGAAATTTGTGAAC
+CCTGTTTAAATGGTAAACAGGCAAGACTTCCTTTTAAACAATTGAAAGAT
+AAGACCCATATTAAAAGACCACTTTTTGTAGTACACTCAGATGTCTGTGG
+GCCTATTACTCCAGTTACTTTAGATGATAAAAATTATTTTGTGATCTTTG
+TTGATCAGTTTACACATTATTGTGTAACTTATTTAATTAAATATAAATCT
+GATGTGTTTAGCATGTTTCAAGATTTTGTAGCCAAGAGTGAAGCTCATTT
+TAATTTAAAGGTTGTGTACTTATACATTGACAATGGTAGAGAATACTTGT
+CAAATGAGATGAGACAATTTTGTGTTAAGAAAGGAATTTCTTATCACTTA
+ACAGTGCCACATACACCTCAGTTAAATGGTGTTTCTGAGAGAATGATAAG
+AACCATTACGGAAAAAGCTCGAACCATGGTTAGTGGTGCAAAGCTAGATA
+AAAGCTTTTGGGGCGAAGCAGTATTAACTGCTACTTATTTAATCAACAGA
+ATTCCTAGTAGAGCACTTGTTGATAGTTCAAAGACCCCATATGAGATGTG
+GCACAATAAGAAGCCATACTTAAAACATTTGAGAGTGTTTGGTGCAACTG
+TTTATGTGCATATTAAAAACAAACAAGGAAAGTTTGATGATAAATCATTT
+AAAAGTATTTTTGTGGGCTATGAACCCAATGGTTTTAAGTTGTGGGATGC
+TGTAAATGAAAAATTTATTGTCGCAAGAGATGTTGTTGTCGATGAAACCA
+ATATGGTTAATTCTAGAGCTGTTAAATTTGAAACAGTGTTCCTGAAAGAT
+AGTAAGGAAAGTGAAAATAAAAATTTTCCGAATGACAGTAGGAAAATAAT
+ACAAACAGAATTCCCGAATGAGAGTAAGGAATGCGACAACATACAATTCC
+TGAAAGATAGTAAGGAAAGTGAAAATAAAAATTTTCCGAATGACAGTAGG
+AAAATAATACAAACAGAATTCCCGAATGAGAGTAAGGAATGCGACAACAT
+ACAATTCCTGAAAGATAGTAAGGAAAGTAATAAATATTTTCTGAATGAGA
+GTAAGAAAAGAAAGCGAGATGATCACCTGAATGAAAGTAAGGGATCAGGC
+AACCCGAATGAGAGTAGGGAAAGTGAAACAGCAGAGCACTTAAAAGAAAT
+TGGAATTGATAATCCAACTAAAAATGATGGCATAGAAATTATTAATAGAA
+GAAGTGAGAGATTAAAGACTAAGCCTCAGATATCCTATAATGAAGAGGAT
+AATAGTCTAAATAAAGTTGTTCTAAATGCTCACACTATATTTAACGATGT
+CCCAAATTCATTTGATGAAATTCAATATAGGGATGATAAATCTTCTTGGG
+AAGAAGCCATCAATACAGAGTTAAATGCTCATAAAATTAATAATACTTGG
+ACAATTACAAAAAGGCCTGAAAACAAAAATATTGTAGATAGCAGATGGGT
+ATTTTCTGTTAAATATAATGAACTTGGAAATCCAATTAGATACAAAGCTA
+GATTGGTTGCACGAGGATTCACTCAAAAATACCAAATAGACTATGAAGAG
+ACATTTGCTCCTGTAGCTAGAATTTCAAGTTTCCGATTTATATTGTCATT
+AGTAATACAGTATAACTTGAAAGTCCATCAAATGGATGTAAAAACAGCTT
+TCTTAAATGGCACGTTAAAAGAGGAAATTTATATGAGACTTCCTCAAGGT
+ATATCGTGTAATAGTGACAATGTGTGTAAATTGAATAAGGCAATTTACGG
+ACTCAAGCAAGCGGCTAGATGCTGGTTTGAAGTATTTGAGCAAGCATTGA
+AAGAGTGTGAGTTTGTAAACTCTTCAGTTGATCGCTGTATATATATTTTA
+GACAAAGGTAACATCAATGAAAACATATATGTATTATTATATGTAGATGA
+TGTGGTTATAGCTACAGGAGATATGACAAGAATGAATAACTTCAAAAGGT
+ATTTAATGGAAAAGTTTAGGATGACTGACCTAAATGAAATAAAACATTTT
+ATTGGAATTAGGATAGAGATGCAGGAAGATAAAATCTATTTAAGCCAATC
+TGCATATGTTAAAAAAATTTTAAGTAAATTTAACATGGAAAATTGTAATG
+CAGTTAGTACTCCTTTACCTAGTAAAATAAATTATGAATTACTTAATTCA
+GATGAAGACTGCAATACCCCATGCCGTAGCCTCATAGGATGTTTAATGTA
+CATAATGCTTTGTACACGCCCAGATTTAACTACTGCAGTAAATATCTTGA
+GCAGATATAGTAGCAAAAATAACTCCGAATTATGGCAGAACTTAAAAAGA
+GTTCTTAGATATTTGAAGGGCACTATCGATATGAAATTGATTTTTAAAAA
+GAACTTGGCATTTGAAAATAAAATTATTGGTTATGTGGATTCTGATTGGG
+CTGGTAGTGAAATTGATAGAAAAAGTACAACAGGGTATTTATTCAAAATG
+TTTGATTTTAATCTCATTTGTTGGAATACAAAGAGACAGAACTCAGTAGC
+AGCCTCATCAACTGAAGCTGAGTATATGGCCCTATTTGAAGCCGTGAGAG
+AAGCTCTATGGCTTAAATTTTTATTAACTAGTATTAACATTAAACTAGAA
+AACCCCATTAAAATTTACGAAGACAATCAAGGCTGTATTAGCATAGCAAA
+CAACCCCTCATGTCATAAACGAGCTAAACATATTGATATTAAATATCATT
+TTGCCAGAGAGCAAGTTCAGAATAATGTGATTTGTCTTGAGTATATTCCT
+ACAGAGAATCAACTGGCTGACATATTTACAAAACCGTTGCCTGCTGCGAG
+ATTTGTGGAGTTACGAGACAAATTGGGTTTGCTGCAAGACGACCAATCGA
+ATGCTGAATGAAATTTTTATATATATTTTTCAAATTTAAATTCCTGTAAA
+CATATTTTGTTACAATGATCTGATCGGGTTTTTCTGGGTTTTCCCCGTAT
+CCTCGCAGCAAATGCTGGATCAGTTAACACTTCCCAGAATGCACACCACC
+CACATTTGATAGTTACTAATGAATATTATTGTTATGTTTTTAATTATAGA
+CGTTATTTTTGAGGGGGCGTGTTGGAATATACTATTCAACCTACAAAAAT
+AACGTTAAACAACACTACTTTATATTTGATATGAATGGCCACACCTTTTA
+TGCCATAAAACATATTGTAAGAGAATACCACTCTTTTTATTCCTTCTTTC
+CTTCTTGTACGTTTTTTGCTGTGAGTAGGTCGTGGTGCTGGTGTTGCAGT
+TGAAATAACTTAAAATATAAATCATAAAACTCAAACATAAACTTGACTAT
+TTATTTATTATTAAGAAAGGAAAATAAATTATAAATTACAACA
+>FBgn0000481_Doc
+GACATTCGGCATTCCACAGTCTTCGGGTGGAGACGTGTTTCTTTCAAGCT
+ACGAATAGCAAGTTCTAAAAACTACAACAGTATAGTGAAAGTTAAACACA
+AAGTGTAAAGTGCAGTTTGCACAACTAACAATTATTGACTATAGTAATTA
+TTTACTAAAATAAATAATTATTCCATATTGTTCTGGTAATTGTTATATGT
+GGACTTAGAACAATGAATCAAAACGACATACGTTCTCAGCGACAATGTGA
+ACAAGACGAGCGCCGGCTCTCTTTACAACGCAACAATGCATACTTTTCTT
+TCGTCTCACCGCAAATCGGTGATCGAGCACCCTCACCTTCAACTAACTCG
+AAACTTTTGCCCTCAGCGAACGACAGACCGCGTTCTTGCTCTCCCTCTCT
+GCCTGCTTCGGCTCACAAGTCGTGGAGCGAAGAGACCGCCTCTCCTACCC
+CGCTCCTCTCGCAGCGCCAAACGACCGTCCCGGGTAACTGTAACACTGCA
+ATAACGAGTGCAGTGACCTCACTGGCAACTGCCACAACATCAACTTCGTC
+AGCGGCCCAACTAATTATCGCTGTGCCAGCTGTAAATAATTCAGCAGCAC
+TGACCGTTTGCAACAACAATAATGCACGTAAAGAAGAATCAAAACAAAAG
+CAGAAGTCGATTTCGACTGTGCAGACTGGCATGGATCGCTACATCCAAAT
+CAAGAGAAAGCTCAGCCCTCAAAACAATAAGGCAGGTAATCAACCCAAAA
+TCAATCGAACCAACAACGGCAATGAAAACTCTGCAGTAAATAATTCAAAC
+CGATATGCTATCTTGGCTGATTCTGCGACCGAACAACCCAACGAAAAAAC
+GGTAGGGGAACCAAAAAAGACCAGGCCTCCACCAATTTTCATACGAGAAC
+AAAGTACAAATGCACTTGTAAATAAACTCGTTGCTTTGATTGGTGACAGC
+AAGTTCCACATTATCCCACTTAAAAAAGGAAATATTCATGAAATAAAACT
+ACAGATCCAAACAGAAGCAGACCACCGTATAGTGACTAAATACCTAAATG
+ATGCTGGTAAAAACTACTACACATACCAATTAAAAAGTTGCAAAGGGCTA
+CAGGTAGTACTTAAGGGCATTGAAGCAACAGTGACACCAGCTGAGATAAT
+TGAGGCTCTGAAGGCCAAAAACTTTTCTGCAAAGACAGCTATTAATATTT
+TAAACAAAGACAAAGTTCCGCAGCCACTATTCAAAATAGAACTCGAACCA
+GAGCTCCAGGCACTAAAGAAAAACGAAGTGCACCCAATATACAATTTACA
+GTACTTGCTACATCGGAGGATCACCGTGGAGGAGCCGCACAAACGTATCA
+ATCCAGTTCAATGTACTAATTGCCAAGAATACGGCCACACCAAGGCATAC
+TGCACCCTTAAGTCCGTATGTGTTGTCTGTAGCGAACCTCATACTACCGC
+AAACTGCCCCAAAAACAAGGACGATAAGTCTGTGAAGAAATGCAGTAACT
+GCGGGGAAAAACATACTGCAAACTACAGAGGCTGTGTGGTGTACAAAGAA
+TTGAAGAGCCGCCTAAACAAACGTATTGCCACAGCACATACATACAACAA
+AGTCAATTTCTACTCTCCGCAACCGATTTTTCAACCACCCCTAACTGTCC
+CAAGCACTACTCCAACAATTTCTTTCGCTAGCGCCCTAAAATCCGGACTA
+GAAGTGCCCGCCCCACCGACAAGAACTGCTCATTCCGAACATACACCGAC
+AAACATCCAACAAACACAACAAAGTGGCATCGAAGCTATGATGCTATCCC
+TACAGCAAAGCATGAAAGACTTCATGACGTTCATGCAAAATACTTTGCAA
+GAGCTCATGAAAAACCAAAATATCCTGATTCAACTTCTTGTATCTTCAAA
+ATCCCCATAATGGCTTCCCTACGGATATCTCTGTGGAACGCAAATGGCGT
+TTCACGGCATACACAAGAGCTCACACAGTTCATTTACGAAAAAAACATCG
+ACGTAATGCTACTATCAGAAACGCACCTCACAAATAAAAACAATTTTCAT
+ATACCAGGATACTTGTTCTATGGTACAAATCATCCAGATGGTAAAGCTCA
+TGGAGGCACTGGAATACTCATCAGAAATCGCATAAAACACCACCACTTAA
+ACAATTTTGACAAAAACTACTTACAATCTACGTCCATAGCCTTACAACTC
+AACAATGGTTCAACGACTCTAGCCGCAGTCTACTGCCCACCGCGCTTTCC
+AATCTCTGAGGATCAATTCATGGAATTCTTTAACACACTAGGTGACAGGT
+TCATCGCAGCGGGTGACTATAACGCCAAGCACACCCATTGGGGATCTCGA
+CTTGTGTCGCCAAAGGGTAAGCAATTGTACAATGCGCTTACGAAGCCAGA
+AAACAAGCTAGACTATGTATCCCCGGGTAAGCCTACATACTGGCCAGCAG
+ACCCAAGAAAAATCCCAGACCTGATCGATTTTGCAATTACTAAACATGTC
+CCCCGCAACATGGTCACCGCCGAAGCACTAGCAGATTTATCATCAGATCA
+CTCACCTGTTTTTCTAAATATGCTAACTCGCCCCCACATCGTCGACCCAC
+CGTATAGACTCACAAATTTTAGAACAAACTGGCCAAGGTATCAAAAGTAT
+GTCTGTTCACACATAGAACTAACGACGGCATTATCTACAAAGGAGGATAT
+AGACAAGTCAACGGAAACTCTTGAAAACATTTTAGTTTCGGCTGCAAAGG
+CTTCAACCCCGCCAGTGACGTATGCAAAACCAAACTACATCAAAACTAAT
+CGCGAAATCGAGCGGCTGGTATTAGATAAACGACGCCTACGAAGGGATTG
+GCAGTCTAATAGATCACCAATTACTAAGCACATGCTTAAGATAGCCACAC
+GCAGGCTTACCAATGCTCTCAAACAAGAGGAAAAAAACAGCCAACGTTCA
+TATATCGAGCAACTCTCTCCCACCAGCACTAAGTACCCTCTTTGGAGAGC
+TCACAGAAACCTAAAGACTCCAATAGCGCCAATTATGCCACTCCGAAGTC
+CCTCTGGCACCTGGTTTCGAAGTGATGAAGAAAGAGCCAGTGCTTTCGCT
+GACCATTTACAAAATGTATTCCGACCAAATCCCTCTACCAACACATTTAT
+TCTCCCTCCTTTAATAGCAGCCAATCTAGATCCTCAAGAACCCTTTGAAT
+TCCGACCATGTGAACTAGCAAAGGTTATCAAAGAGCAACTGAACCCAAGA
+AAATCGCCTGGCTACGACCTAATAACTCCAAGAATGCTCATTGAACTCCC
+AAAGTGTGCTATTCTTCACATCTGCCTGTTGTTCAACGCAATCGCCAAGC
+TTGGATACTTCCCTCAAAAATGGAAAAAGTCGACCATAGTAATGATTCCA
+AAGCCAGGAAAAGATAAAACGCAGCCATCATCATATAGACCGATAAGCTT
+ACTAACATGTCTTTCAAAGCTGTTTGAAAAAATGCTACTCCTTCGGATTA
+GCCCTCATCTTAGAATAAACAACACACTTCCAACACATCAATTTGGCTTT
+AGAGAAAAACATGGAACCATCGAACAGGTCAACCGAATCACGTCAGAAAT
+TCGTACTGCTTTTGAACATCGAGAATACTGCACAGCCATTTTTCTAGACG
+TCGCGCAGGCATTTGACAGAGTGTGGCTCGATGGACTTTTGTTTAAAATA
+ATCAAGCTGTTGCCCCAAAACACACATAAGCTACTGAAGTCATACCTATA
+TAACAGAGTGTTTGCAATAAGATGCGATACAAGCACTTCACGCGATTGCG
+CAATCGAAGCTGGAGTGCCGCAAGGCAGTGTACTGGGTCCAATCTTATAC
+ACCCTGTATACGGCGGATTTCCCCATAGACTACAATCTAACAACCTCCAC
+GTTCGCTGATGATACCGCGATACTCAGTCGCTCGAAATGCCCAATAAAAG
+CCACGGCACTCCTATCCCGACACTTAACATCTGTAGAACGATGGCTTGCC
+GACTGGAGAATTTCAATAAATGTTCAAAAATGCAAGCAGGTTACCTTTAC
+CTTAAACAAACAAACATGCCCACCACTGGTCTTGAATAACATATGCATTC
+CACAAGCCGACGAGGTAACATATCTGGGAGTTCATCTGGACAGGCGGCTC
+ACTTGGCGCAAACATATAGAAGCCAAATCGAAACATCTTAAACTTAAAGC
+AAGGAACCTCCACTGGCTCATAAATGCTCGCTCTCCACTTAGTCTGGAGT
+TCAAAGCTCTTCTATACAACTCCGTCTTAAAACCTATCTGGACTTATGGC
+TCCGAGCTGTGGGGCAACGCATCCAGAAGTAACATAGACATTATTCAGCG
+AGCACAGTCAAGAATTCTGAGAATTATCACTGGAGCGCCGTGGTACCTTC
+GAAACGAAAACATACACAGAGACCTAAAAATCAAATTAGTAATCGAAGTA
+ATAGCTGAGAAAAAAACGAAGTATAACGAAAAGCTGACCACCCATACAAA
+TCCCCTCGCAAGAAAACTAATCCGAGTATGCAGTCAAAGCCGGCTGCACC
+GCAACGACCTCCCAGCCCAGCAATAAACTTATTAGGGCATTAATGAAAAA
+AAAAAACTATCACTAAGTGAAAGTTAATTAAGTTAGATTAAGATTTGAAC
+ACTTATTGTTAGTCTCTTAACACAAAGGGAAGATTCAATAAATAATAAAA
+ATTAAAAAAAAAAAAAAAAAAAAAA
+>FBgn0001167_gypsy
+AGTTAACAACTAACAATGTATTGCTTCGTAGCAACTAAGTAGCTTTGTAT
+GAACAATGCTGACGCGCCAGAATTGGGTTCAACGCTCCACGCGAAGAATG
+CCTGGCAGCGGAAAGCTGACACTTCCTACCGGGAGTGTTGCTTCACGCTG
+CAAGAAATGCTGAGTCGGCTTGCCGACTTGTGGCGGCGCGATGCATTGCT
+CGAGGGTAAACTTAGTTTTCAATATTGTCTTCTACTCAGTTCAAATCTTG
+TGTCGAAATAAACCACAGCTTGCTCCGGCTCATTGCCGTTAAACATCATT
+GTTCTTATTTACAATCAAATCGCTATCGCCACAAGGCTAGTGATAATAAC
+TAAGGGGGCGAAGTCAAGCCCTCCAACCTAATCTCCATAAACAGTGTCTA
+AGACGAACCTCAGCGAAAGAAGGAAGATCTCTAGACCTACTGGAAATAAC
+ATAACTCTGGACCTATTGGAACTTATATAATTGGCGCCCAACCAACAATC
+TGAACCCACCAATCTAATTTAACACACTTTGTCAGGCGACAAACAGGGTA
+GTTAAGTTAGAAAAGCATGTAAGTTTTACAAGACACTTCTTTGACGCAAT
+CAAGAAATTTACGAGTGAAAAAAAAAAAAAAAAAAAGTTGTGTATCTGGC
+CACGTAATAAGTGTGCGTTGAATTTATTCGCAAAAACATTGCATATTTTC
+GGCAAAGTAAAATTTTGTTGCATACCTTATCAAAAAATAAGTGCTGCATA
+CTTTTTAGAGAAACCAAATAATTTTTTATTGCATACCCGTTTTTAATAAA
+ATACATTGCATACCCTCTTTTAATAAAAAATATTGCATACTTTGACGAAA
+CAAATTTTCGTTGCATACCCAATAAAAGATTATTATATTGCATACCCGTT
+TTTAATAAAATACATTGCATACCCTCTTTTAATAAAAAATATTGCATACG
+TTGACGAAACAAATTTTCGTTGCATACCCAATAAAAGATTATTATATTGC
+ATACCTTTTCTTGCCATACCATTTAGCCGATCAATTGTGCTCGGCAACAG
+TATATTTGTGGTGTGCCAACCAACAACCAATGAGTTGGGCACATAACTAC
+AGAAAGGTTAAGGTCGAATACGAAAGCGAGGATAGCTGGGAGGAGGAGCA
+AGTAGGCCAAGCATTAGGTCGGCCGTTAGATAGTGCCACGGTAGATATTA
+CCATGGACCCCAATCAGATTCAAGCTCTTATCGACAATGCTGTCAGACAG
+GCATTGTCGCAACAGCAATCCCAATTTCAGACACAACTCAATTCCCTAGC
+TGCGCGGGTACAGAGTTTGCAGGTGGAAGCACCGCAAATCAAGATTTACG
+AAAAAGTCTCTGTTAACCCCGATGTTAGGTGCGACATTCCCCTTGACATA
+ATAAAGTCTGTACCAGAGTTCTCCGGTACCCAAGACGAGTATGTGGCCTG
+GAGACAATCGGCCATATACGCCTACGAGCTCTTCAAACCATACAATGGCA
+GCAGTGCCCATTATCAGGCTGTTGCCATATTAAGGAATAAAATCCGTGGC
+GCAGCCGGGGCTTTACTGGTCTCCCACAATACGGTATTGAACTTCGATGC
+TATTTTGGCCAGACTAGACTGCACGTACTCGGACAAAACATCCTTACGCC
+TGTTGAGGCAAGGATTGGAAATGGTTAGGCAAGGAGACCTACCACTAATG
+CAATACTACGATGAAGTTGAAAAGAAGCTAACGCTTGTCACTAACAAAAT
+CGTAATGACGCATGAACAAGAGGGTGCTGACCTGCTTAACGCTGAGGTCA
+GAGCCGACGCCCTGCATGCTTTTATTTCGGGGCTCAAAAAGGCCCTCAGA
+GCTGTGGTCTTCCCGGCCCAACCAAAAGACCTGCCATCTGCACTGGCTTT
+AGCTAGAGAAGCAGAGGCAAGCATAGAGAGAAGCATGTTCGCTAACTCCT
+ACGCCAAGGCCGTAGAGGAGCGAGCGCATTCGGGGGCAAACGGCAAGAGC
+CGTTTCCAGGGGAAGCCAAATAAAGAAGAACAGGGACAGGACAGGAATCC
+CCACTTCACCAAACGCCCCAAAAATAACGGACAAACCAACAAGGACACTC
+AGGCGCAAGCACCCCAGCCAATGGAGGTCGATTCATCCTCCAGGTTTAGG
+CAGCGTACTGAACATTATCAGAATCATCCTAACGAGTCGAACGCGTTTAA
+GAGGAGAAATTCCTCAGAACGCTCAACAGGACCGAGACGACAACGTCTGA
+ATAACGTTGTCCAAGAGGCCCCTAAACAAAAGGACCCCAAAGAAGAGTAT
+GAAAAAACAGCAAAGGCTGCAGTCGAGGAAATCGACAGCGAAAATGAGTA
+CGCTCCCAGTGACGACTCGTTGAATTTTTTAGGGGGCGCTCCCGGTTGCC
+GTTCATTGAACGACGGCTGGCTGGGAGAACCTTAAAGATGCTAATCGATA
+CCGACGCGGCAAAAAACTACATTAGGCCCGTAAAGGAGCTGAAAAATGTA
+ATGCCGGTCGCCAGCCCTTTCTCGGTGAGCTCAATACACGGCTCCACCGA
+AATCAAACACAAATGCTTGATGAAAGTCTTCAAGCACATCTCCCCATTTT
+TTCTTTTGGATTCTCTCAATGCGTTCGACGCTATCATAGGCTTGGACCTG
+TTAACACAGGCCGGGGTAAAACTCAACCTTGCAGAGGACTCCTTAGAATA
+CCAGGGCATCGCTGAAAAGCTTCATTATTTCAGCTGCCCCAGTGTAAATT
+TCACTGATGTAAACGATATTGTTGTACCTGACTCCGTTAAAAAGGAGTTC
+AAGGACACAATAATAAGGAGGAAGAAAGCTTTCTCCACAACAAATGAAGC
+TCTTCCTTTTAACACCGCTGTCACTGCCACAATTCGGACAGTTGACAATG
+AACCGGTGTACTCAAGAGCGTACCCAACTCTTATGGGTGTCTCCGACTTT
+GTGAACAACGAGGTCAAACAACTGCTGAAAGACGGCATTATCAGGCCCTC
+AAGGTCTCCCTATAACAGCCCGACCTGGGTTGTTGACAAAAAGGGGACCG
+ACGCCTTCGGGAACCCAAACAAGAGGTTGGTCATTGACTTCAGGAAGCTA
+AATGAGAAAACTATTCCTGACCGGTACCCGATGCCTAGCATTCCCATGAT
+TCTAGCGAATCTGGGCAAGGCAAAGTTCTTCACTACCCTTGATCTTAAGT
+CAGGGTATCATCAAATTTACCTCGCGGAACACGACCGCGAGAAGACATCG
+TTCTCGGTGAATGGTGGTAAATACGAGTTTTGCCGTCTACCGTTCGGCTT
+GAGAAATGCAAGCAGCATTTTTCAAAGAGCCCTAGACGATGTGCTTAGAG
+AGCAAATCGGGAAGATATGTTACGTCTATGTAGATGACGTCATAATTTTC
+TCTGAAAACGAGTCCGACCATGTCCGCCACATCGATACAGTACTAAAATG
+CCTGATCGATGCCAACATGAGAGTAAGCCAGGAGAAAACTAGATTCTTTA
+AAGAGAGTGTAGAATACCTCGGCTTTATTGTCAGTAAGGACGGAACTAAA
+TCCGATCCAGAGAAGGTGAAGGCCATTCAGGAGTACCCTGAACCAGACTG
+CGTTTACAAGGTTAGGTCCTTCCTTGGTTTAGCCAGCTACTACAGAGTCT
+TCATCAAAGACTTTGCTGCCATAGCCCGCCCGATCACCGATATCCTAAAA
+GGGGAAAATGGTTCGGTGAGCAAACACATGTCTAAAAAAATTCCTGTTGA
+GTTTAATGAAACTCAACGCAACGCGTTCCAAAGACTGCGAAACATACTAG
+CATCCGAGGATGTCATACTCAAATACCCCGACTTTAAAAAGCCTTTTGAC
+CTTACTACAGATGCTTCGGCAAGTGGTATCGGTGCAGTCCTATCCCAGGA
+GGGCAGGCCAATCACCATGATATCGCGTACCCTTAAACAGCCCGAGCAGA
+ACTACGCCACAAACGAAAGGGAATTGCTGGCGATTGTATGGGCCCTAGGT
+AAGTTGCAGAACTTCCTGTATGGCTCTAGGGAGATTAATATATTTACCGA
+CCATCAACCCCTCACTTTCGCTGTTGCCGACAGGAACACGAATGCCAAGA
+TAAAGAGGTGGAAATCTTACATAGACCAGCATAATGCCAAGGTTTTCTAC
+AAACCTGGCAAAGAAAATTTCGTGGCAGACGCCCTCTCTAGGCAGAATCT
+GAATGCCTTACAAAACGAACCCCAATCAGACGCTGCGACCATTCACAGTG
+AGCTCTCCCTGACCTACACGGTCGAGACAACAGACAAACCGTTAAATTGC
+TTCAGGAACCAGATCATTCTGGAGGCAGCACGTTTTCCGCTCAAACGAAA
+CCTGGTGCTCTTTCGAAGCAAATCTCGCCACTTAATCAGCTTTACTGATA
+AAAGTTGGCTATTAAAAACACTTAAGGAGGTGGTAAACCCTGACGTCGTG
+AACGCTATTCACTGCGACCTGCCCACTCTGGCAAGCTTCCAACACGACCT
+CATTGCCCACTTTCCAGCCACCCAATTTCGTCACTGTAAGAATGTCGTGT
+TAGACATAACCGACAAAAACGAACAGATCGAAATCGTCACTGCCGAGCAC
+AACCGCGCTCACAGAGCCGCACAAGAAAACATTAAACAAGTCCTTCGGGA
+TTATTACTTTCCCAAAATGGGCAGTTTAGCTAAAGAAGTAGTAGCTAATT
+GTAGGGTCTGCACCCAAGCAAAGTATGACAGGCACCCGAAAAAGCAAGAG
+CTCGGGGAAACGCCCATACCCAGCTATACAGGTGAGATGGTGCATATTGA
+CATATTCTCAACCGACAGGAAGCTATTCCTGACGTGTATTGACAAATTTT
+CTAAATATGCAATAGTGCAACCAGTGGTGTCTAGAACAATAGTGGACATC
+ACAGCACCCCTGTTGCAGATCATTAACCTGTTCCCCAATATCAAAACGGT
+CTATTGTGACAATGAGCCCGCATTTAACTCAGAAACTGTCACCTCAATGC
+TCAAGAACAGCTTCGGCATTGACATAGTAAATGCGCCCCCACTCCACAGC
+TCATCCAATGGCCAAGTTGAACGGTTCCACAGCACATTGGCAGAAATCGC
+CAGGTGCCTGAAGTTGGACAAAAAAACGAATGACACAGTAGAACTAATCT
+TGAGGGCGACGATAGAATATAACAAAACCGTGCACTCAGTTACTCGTGAG
+AGACCAATTGAGGTGGTTCACCCAGGGGCCCACGAGCGCTGCCTAGAAAT
+CAAGGCAAGATTAGTAAAGGCTCAGCAAGACAGCATCGGAAGAAACAACC
+CTTCCCGACAAAACCGCGTGTTTGAGGTGGGAGAACGCGTGTTTGTAAAA
+AACAACAAGAGGTTAGGAAATAAGCTAACTCCACTATGCACCGAGCAAAA
+AGTGCAGGCAGACTTGGGAACGTCTGTTCTTATTAAGGGGAGGGTGGTCC
+ACAAGGACAACCTCAAGTAGACATTCCCTCTACAGTTAGGTAGTAAGTTA
+TGTCAAGGAAAATCCGAGCACTGTAGTATCACCTTGTCTTTAATTTCCAG
+GTTCACCCTCATGATGTTCATACCCTTGGTAGTAGCGAATGCTCGGATCA
+CCGACTTTTCGCATGCCAACTACATTCCTGTGTTAGATGGGGATGTGCTG
+GTGTTTGAACAGCGTGACCTCTTGAAACATTCGAGTAACCTTTCCGAGTA
+CGCTAGTATGATAGATGAAACACAGAAACTGTCCGAGTCCTTTCCCCACT
+CACATATGCGTAAGTTGCTAGAGGTCGATACTGACCATCTTAGAACCTTG
+TTGTCCGTTCTCAAAGTCCACCATAGGATAGCTAGGAGTCTAGATTTCTT
+AGGTACAGCCTTAAAGGTTGTGGCGGGTACTCCCGATGCCACGGACCTCT
+TTAAAATTAAGATCACAGAGGCCCAACTAGTAGAATCTAATTCCAGGCAG
+ATAGCTATAAACTCCGAAACCCAGAAACAGATAAATAAGTTAACTGACAC
+CATCAATAAGGTGATCAATGCCCGTAAAGGCGACTTGGTTGACACTCCAC
+ACTTATATGAAGCACTACTAGCAAGAAATAGGATGCTGTCTACAGAAATT
+CAAAATTTAATTCTCACTATTACTTTGGTCAAATCAAACATTATAAATCC
+CACAATTCTTGATCATGCCGACTTGAAGCCTCTTGTAGAACAGGATACCC
+CAATTGTCAGCTTAATAGAAGCATCTAAGATCAGGGTCCTCCAGTCCGAG
+AATAGCATTCATATTTTAATTGCCTATCCTAGAGTCAAGTTCAGTTGCAA
+GAAAGTCGCCGTCTACCCTGTATCTCACCAACACACCATCTTGCGCCTCG
+ACGAAGACACTTTGGCCGAATGCGAACATGACACCTTTGCGGTCACCGGA
+TGCACAGACACCACACACTTCACGTTCTGCGAGCGGTCTCGGCGCGAAAC
+TTGCGTGCGCTCACTCCATGCTGGAAACGCTGCTCAATGCCACACTCAAC
+CCAGCCACTTGCGAGAAATAAACCCCGTAGATGATGGCGTTGTGATTATC
+AACGAAGCCGCAGCTCACGTTAGCACTGATGGCAGCCCCGAAACACTGAT
+AGAGGGAACCTACCTGGTAACCTTCGAGCGAACGGCAACCATCAACGGCT
+CTGAATTCGTAAATCTAAGGAAAACACTAAGCAAGCAGCCAGGCATCGTG
+CGTTCACCACTACTTAACATCGTCGGCCACGACCCTGTGCTCAGTATACC
+TCTGCTACACCGGATGAGTAACGAAAACCTACATTCCATCCAAAACCTTA
+TGGATGACGTGGAATCTGAAGGCTCGCCCAGACTCTGGTTCGTGGCTGGT
+GTGGTCCTAAACTTCGGCTTGATTGGCTCTCTCGCCCTTTATCTGGCATT
+AAGGAGAAGACGAGCCTCTAGGGAGATACAGCGCACCATCGATACTTTCA
+ACATGACCGAGGACGGTCATAAACTTGAGGGGGGAGTAGTTAACAACTAA
+CAATGTATTGCTTCGTAGCAACTAAGTAGCTTTGTATGAACAATGCTGAC
+GCGCCAGAATTGGGTTCAACGCTCCACGCGAAGAATGCCTGGCAGCGGAA
+AGCTGACACTTCCTACCGGGAGTGTTGCTTCACGCTGCAAGAAATGCTGA
+GTCGGCTTGCCGACTTGTGGCGGCGCGATGCATTGCTCGAGGGTAAACTT
+AGTTTTCAATATTGTCTTCTACTCAGTTCAAATCTTGTGTCGAAATAAAC
+CACAGCTTGCTCCGGCTCATTGCCGTTAAACATCATTGTTCTTATTTACA
+ATCAAATCGCTATCGCCACAAGGCTAGTGATAATAACTAAGGGGGCGAAG
+TCAAGCCCTCCAACCTAATCTCCATAAACAGTGTCTAAGACGAACCTCAG
+CGAAAGAAGGAAGATCTCTAGACCTACTGGAAATAACATAACTCTGGACC
+TATTGGAACTTATATAATT
+>FBgn0004141_HeT-A
+TAAATAAATAAAATAAATTAAACAATTAACTAAATAATTAAATAACTAAA
+ATTAATAATATAATCCGTTCGCTTGCCAAAGACTCTCACGCGCATAACTA
+ATTAAAATCGATTTTCAAGTTGACAAATAAATGGTTTAAAATTGTCCTCA
+GGCTGCAAAGAAAAGCCGCGGCAACAATAAACATTTAGTGACACGCGAAA
+AGCGAACATTTGATTAGTGTAATACTTGTGCAAACCGACAAGCTGCCGCC
+ATAACAAAACGGAGACGAAGAATCATAAAGAACAAAAGCTAAATCCACCA
+GCATAGCAAAAATAAATTAACAAATAAAATAAAAGCAAATTTAAATAACA
+TAATAAATTAAACTTATTTAATAAACCAATTAATTTTAATTAATTCAATT
+AAACGCTAAATCTACATAATACTCCACGCGCAAATTAATTGAAATCGTCT
+TTCTAGTTAATAAATTAAAAGTTTAAAAATTGTCTCCGGCCGCAAAATTT
+GAACCGCGACGATAAAAACATTTAATTGACAAACAAAAAGCGAACAATTA
+TTCAGTGAACTATTTGTGCAAAATTGACAAGCAGACGCCATAATTAAAAG
+GAGAAGAAGCCAAAAGACGAAGAGAAGAAAGCAACCAGAAGAACTCAAAG
+AAGAAAAGGAGGAAAGCCCAATTAAAGAAAGCCAGGGTATTTATACCTTA
+CACTTATCGTTTAATATAACAAAAACCCAACATGTCCATGTCCGACAACC
+TTTTTTCTGACGATGAGGTACTTTCAATTTCCTCAAGCCCAGAACAGCGA
+TCTTCTCCGTTCTACCTCAATATATCGCCCATGTCCCACGGATCAGACAA
+TTCTCAGATTAATACAGTCATCATTAATTCGAAGAAATTGCCCTCAAATC
+AAGCAGACATAAGTTTAAAAAACTCTTCTGGGGCTGCTATAAAAATTGTT
+AATTCCCTTTCACACAAGAAGAAAGAGAACACAAACGTTAATAATGCCCA
+AAAAGACCCCCTCTCACTCACCAATACTACTGCAAGCACTTGTGGCGCCA
+AAAGCAGCATCTCAGAGGGGAAATTGTCTTCTCCTCCGTCCACCTCACAC
+ACATATGAGGGGAAATTACTCACAAAACTTACTCACACACACACAGACTT
+TAGAGGCGCCAAAACGAGCGATGCAATGGGAAGTTTCCCCTCTCTCTCGC
+ACAGCGACAATAGCATAGAGAAAAATCTGAGTTCTTCCACCAAAATTGGA
+CCAAACGCTTCTTCCCCTCCTTCTCATGCACACACTCACACTAGCAAATC
+CACTGATATAAGCTTAGAAAGCCGCTCAAAACATCCCGCGCTTGCCAATA
+CGGACGCACGCTCTATAAAAGCCAATGCTAATGACAATGGGGAAATTTTC
+TCCTCACTTATACAAATTGACGAACGCAAGCAAGAGGAAAGGCCTTGCAC
+AACTATCAACGCTTTTTGGTCTATTTTTAAACCCAAGCCGGACGTTACTA
+AACTAAGTCTAAAGAGGAAACCCACCAATCCCACTAAAAACACTGGGAAA
+AAATGCATCTCCCCTCATAAAAAGAGCGCTTATTTATGCCCTTCCGCTCA
+GGATGATTTAAATTTAAATTTAAACCCCAAATCTAGCGCCAAGCCCACTG
+TGGTGAATTTACCAGCTGCCCGCATCCTAAGCCGGCCTGCAGCCAAGCGG
+GATTTATTTAAATCATCATCCTCCCGAAGCCCAGACGAGCAGCCTATGAG
+TTTTTCGGAAGTGGTCGCTGGCACGGGTTCAATTTTTGCGGCACCCTGTG
+TCCCGGCACCTTTAACGAAAACTCCAGGCAAGCGGACAAACGACGATCTG
+GACTGCTCCAACTTTAAGACGCCCAATAAAAAATTATGCGCGACTTCCAA
+CTTTGTAACTCCCAGCATTTTTCCGCCGCTCATCACTCCCGTTTTCAAGA
+GCAAGGCAGCTCAATCTGTTTACGAGGAATCCAAAGCCAGAAATGGACCC
+CCCCCGCCGGCCCTCGCCTGCAGCATCAATGCCTCTGCTCGCAGCGCAGC
+GGCGCCACCCGGGATCGCCCCCCTACCCCCTCATAATACAGATGCAGAGC
+TGCCTCCATGGAAAATCGTGCCCCAGAGCCGTAGAGCACCTCCTATACTC
+GTCAATGATGTAAAGGAAATTGTACCTCTACTGGAAAAGCTGAACTACAC
+AGCAGGAGTCTCCAGCTATACTACTAGGGCTATAGAAGGAAACGGGGTCA
+GGATACAGGCAAAGGACATGACCGCCTATAACAAAATTAAAGAAGTCCTG
+GTGGCCAACGGACTTCCTTTATTCACCAACCAGCCCAAGTCCGAGAGAGG
+CTTCCGAGTCATCATCAGACATCTCCACCACTCCACACCATGCTCGTGGA
+TAGTCGAGGAACTGCTGAAGCTCGGATTCCAAGCGCGATTCGTCAGAAAT
+ATGACGAATCCGGCTACAGGTGGCCCCATGCGAATGTTTGAAGTGGAGAT
+CGTCATGGCCAAAGACGGCAGTCATGACAAAATACTCTCACTCAAACAAA
+TCGGTGGGCAAAGGGTGGACATTGAAAGGAAAAACAGGACACGGGAGCCA
+GTCCAGTGCTACAGATGCCAAGGCTTCAGGCATGCCAAAAACTCTTGCAT
+GAGGCCGCCAAGATGCATGAAATGCGCTGGCGAACACCTGTCTTCCTGTT
+GCACCAAACCAAGAACCACCCCCGCCACCTGCGTAAATTGCTCTGGGCAG
+CATATTAGCGCGTACAAAGGATGCCCTGCATATAAGGCGGAAAAACAAAA
+GCTGGCGGCAAACAACGTTGACATAAACAAAATAAGAACAATCAAAGACG
+CAACAAATAACTTTTATAAACGTCAAGGCCCCCCTCTACGCAACAACACC
+CCTCGGCTACCGCACAGCTCAGCAATCCTGAGCAAATCAATTGCCGAAGC
+TCGCCAGGAGGCAGCCAGAAAGTCGATGTTAAATCCATTCCGACAAAATA
+TAAACGACAGAAGACCACGATTCTCCTCCCACGACACGGCCATTCAGAAG
+CGTCTGAATAAATGGCGCCGAAACACCAACAAAATACCCAAAAAGGGTAG
+GATAGCCTTAAAGGATAATGCAAAGCCACGACCGGCACATAGGACAAGTA
+ACCCAGCGCAAAGACATCTGGAGGACTACCAGGACATGCTCCGAAGGGAA
+AGGAGTGAAGAAAACGACCAGGAATCTGAGAAGGGCACCCCCAATACCAA
+GCAGGTCGGCAATGACAGCCCTCCGACCACGAGCAGAGCAGCCAGAGCCA
+GCTTTAAGCCAAGAATCATTGACGATACCACGCCATCGCCAAAAATCTGC
+AATCCCAACTCACAAAAAGGCCTCTTGGACGACCCCACAACAAGCTTAGC
+TAATAGAGTCGACAATTTAGAAAAGAAAATTGACATTTTAATGGCCTTAA
+TCATACAAGGAAGAAATAACAATCTTGACATGGATACATCCAATTAATCT
+TACAACTACTTATATATTCTTTAATAAATATATCCAATAGAAAAGCGCAC
+GTCGGTCTGCTTTTAAAATCCTTCACCGTCATCACCTTCCTCGACGGAGC
+CTAATTTATTGGAAAAATAAATCAATTATATGTTGGCACAAAAATGTAAA
+CACACACTCACCTAAACGCACCCGGACGAACAAGCCTATGACAACGCACT
+CCAGCTGATCTGTAAGAAACAAAAAATATGAATAGATAGATCGATATGAA
+AAGGATATGTGCGGCAGAAACATGATGAGCAAAAGGCGACTCGCTGCAGC
+AACTTATGCACAACGTCACTTACCTGAAATTTCTTGCCGTACGATCTCCT
+GTAGTATCCCTTATCACAGCTGCAATCTACTTGCAATGCTGCACTGCAAT
+AAACGTACTACAAAAGCTGCATACGTTTTGATCAGGACACCTCGTGCGGA
+CGTGCTAAAAAAAATTTCCTTTCTGCTGCTCTTATTGACGCTAAAACCTT
+AAAACCTACAAACAAAACAATTAAATAATAACAAATCAAATAAGACAACC
+AAATAATACACTTACCTCATTGACTGCAGCTAAATCGCTGACCCACATTC
+AGTGCAGCCGACAGCAGGAGACGGGCCCGCAAAAGCAAAACAAAATCGCC
+AATTTTGCGATTATAAACACGAAAAATTGACAATTTTGCGATGCCGTCTC
+CGCCTCCTGATGCCACTGCATTGACAAGCATCACTAGCGAGGAGCTGACA
+CCACACCAAAAAGCTGTAAAATCCGTCCACAAATTGTATATTTTGCCTCA
+GTGTCGTATCTGCAATGTTTTTCCGATAACCTGTAAGGAAAGAAAAATTA
+ATAAGAAAATTATACAAAATTAATTAAGGACGACAGAAAATAGCAAACCA
+GACAGGCAAATTAACAGATACAAATATGAGACTCCATCCTGCTGCCGACA
+CACAAGTAAATCCTTCAACTCGACAACAGGAGACGGGCCTTGCAAAAGCA
+AAACAAAATCGCCAACTTTTGCGATTATAAATACAAAAAATTGACAATTT
+TGCAACGCCGTCTCCACCTCCTGTTGCCACTGCATTAATAAGGATCACCA
+GCGCGGCGTGACGCCACACTAAAAGGCTGCAAAATCCGTCCACAAAATGT
+ATACTTTTCCTCAGTACAATACTTTCTAATGAACTTCCGCCAACCTGCAA
+TGAAAAGAAAAGAAATAGGTATATAAAACAAAACAAACAAAAGGACAACC
+TAAAATTAGCAAACCAGACAGGCATACTAGTAGATGCTAATATGCAGCTC
+CATCCTACTGACGACAACCACGCAACTCCTTTCTCCAAGACCGCAAATAC
+TGAAACAAGGAAGCACAAGCTAATACTGGGAATTATTTATTTAAACAAAA
+ATACTTATCTAATTGCCAATTCGACGACTCCAAATCCGCGGCTAACCGGC
+GGCGATGGCCCATAAATAAAGGGCCTCCTAATTAATTACAAAATGTACCT
+GAAAAACATAAAATTAACGCAACTATAATTAACGCAATTAATAAATCAAA
+TAAATACAAGTATAATACTTACCTCCAAGCAAACGTACCTGAAAAACAAA
+ACCAAAAAAAAAATTAATGCAATAAATAAATCAAATAAATACAAACATAA
+TACTTACCTCCAATTTACCTCCCAGCCAATCTACCTGAAAAACATAATCT
+AATACAATCTCAAAAACAAATAACAAATGTAATACTTACCAAATTTTAAT
+TTTGTATTCATTTCCATGACCCCAACGCTGCAACTGTCCTCGGCAACAAT
+TCCTGTTCCGGCGGCTCCATGCTGCCAATCCTGACGCACTGGCCACAAGA
+CGCGGCGCTGCTGGCAATCTCTCGATGAACAACCGATCTACAATTTCCAT
+GACGACTCCTCTGTCACGATGAGACAGAAGACACCACCAACGCCAGCAGC
+TCCAAAACAATACAACAACGGCCGCGCGGAACCCATCTTCAGAATTCCCT
+CTTCCTGACGACCGGCGAACGAGTTCTGGAATAAACAATGTATTAATTGC
+AAACATCTACCGATGAGGGTAGAAGAGATACTCACCAAACGACTGCGGCG
+CGGGAACAAACTAACTGCAACGCCGGCCGGACCTATTTGTTGCAAGTGGC
+GCGCATCCAGCGCCTGCAACATGCCCCAGCCCAAGTACACAACTACTTAC
+CTGCAACGTCGCCAGAGGCTCCCAGCGAATCGGTGCTTCCGTCCTTCTGG
+CGGGGGTACCTGAAAAGAAACAAATTAAACAATATTAATCCTAAATTTCA
+ATGTTTTTTGTAAAATAATTTAAATTGTTAAATGTAAACAAGCCTTGCAA
+TATGTTAATGTTACCAGTCCATGCTACTGTCTAAAAGCCAAGAATACAAA
+AAATACTAATTATAAACTAACTCACCACGCCCAACCCCCAAACTCACCCC
+ATGCAATGTTAAACCTATAAATTCAAATAATTGTACCTATATATTGCACA
+TACTGTAATCAAAGGCAAAATAAATCGTGGATGCGGAACAGAATTTACTC
+TGTCTCCGTACCTCCACCAGCAAAGTTAAAAAA
+>FBgn0001283_jockey
+AAAAATCATTCACATGGGAGATGAGCAATCGAGTGGACGTGTTCACAGAA
+GTCGCGAGATAAAACAAAAACGTAATTGTGATCCATCACAAACATCTGCG
+CAGATCGTGTGCTTATCTCACAAACAAAATCTATTTTTAGTCACTGCATA
+ACGGTGACGGCTTCGGTTCGCGAAACTTATCAGCAACTAGCAATTTCTAA
+GCTGTGTTGTTTTTGCCCCTCGCCCTGCGCGCTGCGCAAGCGGGAGGTTG
+TTACAATTTACCTTACAAGTAAACCGGTAAATCTTATCGTGTTTAGTAAA
+TATCAATTGCATTATACGGCATAAGTATAAAGACAATTGATATAATGGAG
+AATTCATTTGCTCAATCGCGACCTAGCAATGGGTGCGATAAATTTGAGAA
+AATGAGGAAAGTAGCAGGTGTTGAGCCAGGAGAATTACGCTCCCAACTCC
+GCGCCAGCTGTGCAGTTGTTTCCCCTAACCTGGAAGGTATGCCAACTCAA
+TCTGCGGTCTCCAGCTTAATGGTGACAATCAGCAGCAACACCAATGCAAG
+TGTTACCTGCACTATTTCTAACGTACAGGCCAACATGATCTGTACTCCTA
+CATACACTGATTGCACAACCGTGACCACTAGCATTTGCCCAACTACGCCT
+TATGACAATGGACTGCCGACACCTCTGTCATCACTGCCCAATAAGCCATC
+TAAAGCGAATTGCCCCTTTCAAGCACATGATCGTACTGTCAACAGGAAAC
+GAAAAGGCGTGTCTCAGCCCCCATTACCTATCCTCACCCCTTCTCCAAGC
+CGTAAAACTAAAAGGCAGGCCACTATGCCACTCAATGAGGAGGCCTCTAC
+CTCCACTGCAGCAGCATTAAATAACAATCGCTTCGCGCTTTTGTCCGCTG
+AAGCGGAGAATATGGAGCAAGACGTGTCGGATGCTGATTCTGACATTGAA
+GACTCTGCTGCCCGAGATGGTGGTGGACAATCCGCTAAATATAGCAAACC
+CCCAGCCATATGCGTACCAAGTGTAAGCGATCCGGTCACCTTGGAACGGG
+CTCTCAATCTGAGCACCGGCTCCTCAAACTACTACATCCGCATTTCTAGA
+TTTGGTGTATCCAGAATCTATACAGCCAACCCTGATGCTTTCCGCACCGC
+TGTAAAAGAACTAAATAAGTTAAATTGTCAATTCTGGCATCACCAACTTA
+AAGAAGAAAAACCCTACAGAGTAGTGCTTAAAGGAATCCATGCTAATGTT
+CCTAGTTCGCAGATAGAACAAGCATTTAGTGATCACGGCTATGAGGTCCT
+TAATATCTATTGCCCCAGAAAGTCTGACTGGAAGAACATTCAGGTAAACG
+AAGATGATAATGAAGCTACAAAAAACTTCAAAACTAGACAAAATTTGTTT
+TATATTAATCTTAAACAAGGCCCGAATGTTAAAGAGTCTCTTAAGATAAC
+TCGACTTGGCAGATACAGAGTCACTGTTGAGCGCGCTACACGTAGAAAAG
+AACTGCTACAATGTCAAAGATGCCAAATTTTTGGACACTCTAAGAACTAT
+TGCGCCCAGGATCCTATTTGTGGTAAATGTAGTGGTCCCCATATGACCGG
+GTTCGCTTTGTGCATAAGTGACGTATGTCTGTGTATAAATTGTGGTGGTG
+ATCATGTCTCGACAGACAAAAGCTGCCCTGTCAGAGCAGAGAAAGCCAAG
+AAGCTAAAACCAAGGTCCAGGCTACCGATGACTAATAATATTGCCACACT
+CAAACCTCCACAACGTTCTTCAAGCGGTTACATACCAGCTGAGGCATTAA
+GAACCAACATCTCTTATGCTGATATTGCTCGACGCAACACGACTCAATCT
+AGGGCTCGTGCTACTGTGCAGGCTGAAGTTATACCAACGTCGGACAATAG
+CCTTAACAATAAATTTATGACGTTAGACAACTCCATTCGGGCCATCAATA
+CGAGAATGGACGAACTATTTAAGCTTATACACGAAACTGTAGAGGCTAAT
+AAAGCTTTCAGAGAACTGGTTCAGGTTCTAATTACACGTATTCCTAAATG
+ACTCAACCAACCTTAAAAATCGGATTGTGGAACGCTCGCGGATTAACAAG
+GGGCTCTGAGGAGCTTCGGATATTCCTCAGCGATCACGATATAGACGTAA
+TGCTTACCACGGAAACACACATGCGAGTTGGTCAGCGCATCTATCTCCCA
+GGGTATCTTATGTATCACGCCCACCACCCCAGTGGTAACAGTAGAGGTGG
+CTCTGCAGTCATCATAAAATCTAGACTTTGTCACAGCCCTCTGACACCTA
+TCTCTACTAATGACAGGCAGATAGCGAGAGTGCACCTGCAAACATCGGTT
+GGGACCGTCACTGTAGCTGCTGTTTATCTACCTCCAGCAGAAAGATGGAT
+AGTAGATGACTTCAAATCCATGTTTGCTGCGTTAGGCAACAAATTTATTG
+CTGGTGGTGATTACAATGCCAAACATGCATGGTGGGGGAACCCAAGATCC
+TGTCCTAGAGGTAAAATGTTGCAAGAAGTCATTGCACATGGGCAATACCA
+AGTTCTGGCTACGGGCGAACCCACTTTCTACTCTTACAACCCTTTGTTAA
+CACCATCAGCCCTTGATTTTTTTATAACCTGTGGGTACGGCATGGGCAGG
+CTAGATGTACAAACTCTCCAGGAACTCTCGTCGGACCATCTTCCTATTCT
+GGCTGTATTGCACGCTACGCCGTTAAAGAAACCACAACGCGTACGACTAC
+TTGCCCATAATGCTGACATAAACATATTCAAAACCCATCTTGAACAGCTG
+AGTGAGGTAAATATGCAAATTCTGGAGGCGGTGGACATTGATAATGCCAC
+AAGCCTTTTCATGAGCAAACTAAGTGAGGCTGCTCAGCTTGCTGCACCGA
+GAAATCGGCATGAAGTAGAGGCCTTCAGACCACTTCAACTTCCTTCCAGT
+ATATTGGCACTGCTCAGGCTAAAACGAAGAGTTCGAAAAGAATATGCTAG
+AACAGGTGATCCCCGCATGCAACAGATCCACAGTAGACTGGCCAACTGCC
+TGCATAAGGCCCTTGCTCGAAGAAAGCAGGCCCAAATAGATACCTTCTTG
+GATAACTTGGGTGCTGACGCGAGCACAAATTACTCACTGTGGCGTATCAC
+GAAACGGTTCAAAGCTCAGCCCACCCCAAAATCAGCAATCAAAAATCCGT
+CTGGTGGCTGGTGTCGCACTAGCTTGGAAAAAACTGAAGTGTTCGCTAAC
+AACCTTGAGCAACGTTTTACACCCTATAACTATGCACCGGAAAGTCTCTG
+TCGTCAGGTTGAAGAATACTTGGAATCGCCCTTTCAAATGAGCCTGCCTC
+TGAGTGCTGTCACACTGGAAGAAGTGAAGAATTTAATAGCCAAGCTGCCA
+CTTAAGAAAGCTCCTGGAGAAGATCTTCTTGATAATAGAACCATTAGACT
+TCTCCCAGATCAAGCATTGCAGTTCCTTGCCTTAATATTCAACAGCGTTC
+TTGATGTTGGCTACTTTCCGAAAGCTTGGAAATCGGCGAGCATAATTATG
+ATCCATAAGACTGGAAAAACACCGACAGACGTTGACTCGTACAGGCCCAC
+CAGCTTACTCCCATCTCTGGGTAAAATTATGGAGAGGCTGATCCTAAACA
+GGCTGCTCACATGCAAGGATGTTACCAAAGCGATTCCCAAATTTCAGTTT
+GGCTTCCGGTTGCAGCACGGTACTCCTGAGCAACTACATAGAGTAGTGAA
+CTTTGCTCTGGAAGCTATGGAAAACAAGGAGTATGCAGTAGGTGCCTTTC
+TTGATATTCAACAGGCATTTGACAGAGTCTGGCACCCTGGGCTCCTGTAC
+AAAGCGAAGAGGCTGTTCCCGCCGCAGCTATATTTGGTTGTTAAAAGTTT
+CCTGGAAGAACGCACATTCCACGTCTCTGTTGATGGGTACAAATCATCAA
+TCAAGCCAATTGCAGCTGGAGTTCCTCAAGGAAGCGTTCTTGGCCCAACC
+CTATACTCAGTTTTTGCTTCGGACATGCCTACTCACACACCAGTCACAGA
+GGTAGACGAAGAAGATGTGCTCATAGCCACCTACGCTGACGATACTGCTG
+TGCTCACGAAAAGTAAAAGTATCCTGGCTGCCACTTCTGGTCTACAGGAA
+TACCTGGATGCATTCCAGCAATGGGCTGAGAACTGGAATGTGCGCATCAA
+CGCTGAGAAGTGTGCCAATGTGACGTTCGCCAACCGAACAGGTAGCTGTC
+CGGGTGTCAGTCTGAATGGAAGACTGATCAGACACCATCAGGCTTATAAA
+TACCTTGGTATTACCCTCGATAGGAAGCTCACCTTCAGCAGGCACATCAC
+AAATATTCAGCAAGCGTTCAGGACCAAGGTTGCTCGGATGTCTTGGCTCA
+TTGCACCACGCAACAAACTGTCGCTTGGCTGCAAGGTCAATATTTACAAG
+TCCATATTGGCCCCCTGCCTGTTCTACGGCCTGCAGGTATACGGCATTGC
+TGCGAAGAGTCACCTTAATAAGATCCGGATTTTACAGGCGAAGACCTTAA
+GAAGAATTTCGGGGGCTCCTTGGTATATGAGAACAAGAGACATCGAACGC
+GACCTCAAGGTGCCCAAATTAGGAGACAAGCTCCAGAACATCGCCCAAAA
+ATATATGGAAAGGCTTAATGTACACCCCAACAGCCTAGCAAGGAAGCTAG
+GAACTGCAGCTGTGGTCAATGCTGACCCTCGGACTAGAGTCAAAAGAAGA
+CTCAAGCGACACCACCCTCATGACCTCCCTAACCTGGTTTTGACCTAGAA
+AGTCTTAGTTTTAAAATTCATTAGAATAATCAAATAAATAATAATTACTA
+TGTTATATCAACTATTATAATTCTCCCTATCATTTTTAGATTAAAAATCT
+GTTAGTCTTAAGTAACCAAGACACATTGTAAAATAAAATAATTTAAGCAG
+ATCAAATTAAGTTGCCGCATGGGTAACAGTGCGTTGATCAAATAATAAAA
+ACATCATAAAAAAAAAAAAA
+>FBgn0003055_P-element
+CATGATGAAATAACATAAGGTGGTCCCGTCGAAAGCCGAAGCTTACCGAA
+GTATACACTTAAATTCAGTGCACGTTTGCTTGTTGAGAGGAAAGGTTGTG
+TGCGGACGAATTTTTTTTTGAAAACATTAACCCTTACGTGGAATAAAAAA
+AAATGAAATATTGCAAATTTTGCTGCAAAGCTGTGACTGGAGTAAAATTA
+ATTCACGTGCCGAAGTGTGCTATTAAGAGAAAATTGTGGGAGCAGAGCCT
+TGGGTGCAGCCTTGGTGAAAACTCCCAAATTTGTGATACCCACTTTAATG
+ATTCGCAGTGGAAGGCTGCACCTGCAAAAGGTCAGACATTTAAAAGGAGG
+CGACTCAACGCAGATGCCGTACCTAGTAAAGTGATAGAGCCTGAACCAGA
+AAAGATAAAAGAAGGCTATACCAGTGGGAGTACACAAACAGAGTAAGTTT
+GAATAGTAAAAAAAATCATTTATGTAAACAATAACGTGACTGTGCGTTAG
+GTCCTGTTCATTGTTTAATGAAAATAAGAGCTTGAGGGAAAAAATTCGTA
+CTTTGGAGTACGAAATGCGTCGTTTAGAGCAGCAGCTGAGGGAGTCTCAA
+CAGTTGGAGGAGTCTCTACGCAAAATCTTCACGGACACGCAGATACGGAT
+ACTGAAGAATGGTGGACAAAGAGCTACGTTCAATTCCGACGACATTTCTA
+CAGCTATTTGTCTCCACACCGCAGGCCCTCGAGCGTATAACCATCTGTAC
+AAAAAAGGATTTCCTTTGCCCAGTCGTACGACTTTGTACAGATGGTTATC
+AGATGTGGACATAAAAAGAGGATGTTTGGATGTGGTCATAGACCTAATGG
+ACAGTGATGGAGTTGATGACGCCGACAAGCTTTGCGTACTCGCTTTCGAC
+GAGATGAAGGTCGCTGCTGCCTTCGAGTATGACAGCTCTGCTGATATTGT
+TTACGAGCCAAGCGACTATGTCCAACTGGCTATTGTTCGTGGTCTAAAAA
+AATCGTGGAAGCAGCCAGTTTTTTTCGATTTTAATACCCGAATGGACCCG
+GATACTCTTAACAATATATTAAGGAAACTGCATAGGAAAGGATATTTAGT
+AGTTGCTATTGTATCCGATTTAGGTACCGGAAACCAAAAGCTATGGACAG
+AGCTCGGTATATCAGAATGTAAGTTTCGTATATTACAAAAATCAGATAAT
+CCTTGAAATTCCATTTTTTAGCAAAAACCTGGTTTAGCCATCCTGCAGAT
+GACCATTTAAAGATTTTCGTTTTTTCGGATACGCCACATTTAATTAAGTT
+AGTCCGTAACCACTATGTGGATTCCGGATTAACAATAAATGGGAAAAAAT
+TAACAAAAAAAACAATTCAGGAGGCACTTCATCTTTGCAACAAGTCCGAT
+CTGTCTATCCTCTTTAAAATTAATGAAAATCACATTAATGTTCGATCGCT
+CGCAAAACAGAAGGTTAAATTGGCTACCCAGCTGTTTTCGAATACCACCG
+CTAGCTCGATCAGACGCTGCTATTCATTGGGGTATGACATTGAAAATGCC
+ACCGAAACTGCGGACTTCTTCAAATTGATGAATGATTGGTTCGACATTTT
+TAATTCTAAATTGTCCACATCCAATTGCATTGAGTGCTCGCAACCTTATG
+GCAAGCAGTTGGATATACAGAATGATATTTTGAATCGAATGTCGGAAATT
+ATGCGAACAGGAATTCTGGATAAACCCAAAAGGCTCCCATTTCAAAAAGG
+TATCATTGTGAATAATGCTTCGCTTGATGGCTTGTATAAATATTTGCAAG
+AAAACTTCAGTATGCAATACATATTAACAAGCCGTCTCAACCAAGACATT
+GTGGAGCATTTTTTTGGCAGCATGCGATCGAGAGGTGGACAATTCGACCA
+TCCCACTCCACTGCAGTTTAAGTATAGGTTAAGAAAATATATAATAGGTA
+TGACAAATTTAAAAGAATGCGTAAACAAAAATGTAATTCCATGATTTATA
+ATTGTTTAATGTTTAGCTATATGTTTCAGGAAAGTTTCAGTTGAGAATGT
+AGGTAGTTATGTGCTGTCTATTGTGTTTTGTCTTTTATCTGTTTCTTTTC
+ATTTTATTATTTAATCATTATCCTTTTGCTTATCCAGCCAGGAATACAGA
+AATGTTAAGAAATTCGGGAAATATCGAAGAGGACAACTCTGAAAGCTGGC
+TTAATTTAGATTTCAGTTCTAAAGAAAACGAAAATAAAAGTAAAGATGAT
+GAGCCTGTCGATGATGAGCCTGTCGATGAGATGTTAAGCAATATAGATTT
+CACCGAAATGGATGAGTTGACGGAGGATGCGATGGAATATATCGCGGGCT
+ATGTCATTAAAAAATTGAGAATCAGTGACAAAGTAAAAGAAAATTTGACA
+TTTACATACGTCGACGAGGTGTCTCACGGCGGACTTATTAAGCCGTCCGA
+AAAATTTCAAGAGAAGTTAAAAGAGCTAGAATGTATTTTTTTGCATTATA
+CAAATAATAATAATTTTGAAATTACAAATAATGTAAAGGAAAAATTAATA
+TTAGCAGCGCGAAACGTCGATGTTGATAAACAAGTAAAATCTTTTTATTT
+TAAAATTAGAATATATTTTAGAATTAAGTACTTCAACAAAAAAATTGAAA
+TTAAAAATCAAAAACAAAAGTTAATTGGAAACTCCAAATTATTAAAAATA
+AAACTTTAAAAATAATTTCGTCTAATTAATATTATGAGTTAATTCAAACC
+CCACGGACATGCTAAGGGTTAATCAACAATCATATCGCTGTCTCACTCAG
+ACTCAATACGACACTCAGAATACTATTCCTTTCACTCGCACTTATTGCAA
+GCATACGTTAAGTGGATGTCTCTTGCCGACGGGACCACCTTATGTTATTT
+CATCATG
+>FBgn0003122_pogo
+CAGTATAATTCGCTTAGCTGCATCGATAGTTAGCTGCATCGGCAAGATAT
+CTGCATTATTTTTCCATTTTTTTGTGTGAATAGAAAATTTGTACGAAAAT
+TCATACGTTTGCTGCATCGCAGATAACAGCCTTTTTAACTTAAGTGCATC
+ATATCAGCTGTTTTTTTTGCCAATTTCAATGAATATCATCAAAGTTAGCT
+GCGCCATCTATGAATCATTTTTGCATATCTAAAAGATGCAAGAATGCCAA
+CTCGTTTCAGTATCTGCGCATGTCCGTTTTTGTTTTTGCTTTGATCGTGA
+TTTTTGTGTTTTTGTTTCTTATGGCACAAAGTTATTAAAATGGGTAAAAC
+AAAGCGTGTCGTTGGACTAACACTAAAGGAAAAGCTTCAAATAATCGAGT
+TAGTGACCAACAAAGTGGACAAAAAGGAAATTTGTGCCAAGTTCAAATGC
+GACAGATCCACAGTCAACCGCATTTTACAAAAAACAAATGAAATTCATGA
+AGCTGTGGCCGCGTCAGGTTTAAAAAGAAAGCGTCAAAGAAAAGGAGCGC
+ACGACTTAGTAGAAGAAGCCTTATACATTTGGTTCGGACAGCAGGAATCA
+AAGAACGTAATTCTTGACCGGCACGTCATATTAGCAAAAGCGAAAGAATT
+TTGCCAAAAATTTAACGACGCCTTTGAACCTGACGCCAGCTGGCTTTGGC
+GCTGGCGCAAGCGCCACAATATAAAGTATGGCAAAATACACGGCGAAACT
+GCTACAAATGATTCCGTATCAGCAAATGAGTACAAAAATGATATTTTGCC
+AGGATTGCTTAAAGGTTATAACCCAGAAGACATTTTTAACGCTGACGAAA
+CTGCACTCTTTTATAAAGCAATGCCGAATGCGACATTTTTTACTTGTGGA
+AAGCAATTAAATGGCCAGAAATCTCAGAGAGTGAGACTTACTTTGCTGTT
+TATATGCAATGCAACTGGGACATACAAAAAAACTTTTGTAATCGGCAGAT
+CTAAATCGCCACGATGCTTCAAGAATGCTAATGTGCCCATTCCGTACTAT
+GCAAATAAGAAGGCCTGGATGACTAAGGATCTCTGGCGAAAAATAATGAC
+AGGATTTGACGAAGAAATGAAAAAGCAAAATCGAAAGATTTTACTCTTCA
+TCGACAATGCAACTAGTCACACGACTGTCAAGGACTTCGAAAACATAAAA
+TTGTGCTTCATGCCACCAAACGCAACGGCTCTACTTCAACCTCTGGACCA
+AGGTATTATCCACTCATTCAAATTAGAGTATAGGCGTATTTTGGTCAAAC
+AGCAGCTCATTGCTGTTAATTGTGGTAAATCTACTGTGGAATTTTTAAAA
+TCATTATCGTTATTGGATGCTCTATATTTTGTCAACCAAGGATGGAAGAA
+TGTTAAAATGTTAACTATTCAGAATTGTTTTAAAAAGGTAAGATGGGATT
+ATTATTGATATGTATCTCAAATAACGAATTTATTATTTTCAGGCTGGATT
+TAAGTTCAGTTTTGAAAATGAAGACACCATTGCTGAAAAAGACAAACAAT
+GCGTAGAAGTTGACATTGTATCGAATATTAATTGGAATGAATATGCCAAT
+GTTGATGCAGATGAGGCTTGCCATGGTCAATTAGATGATGATGAAATCGT
+GCGCTCTTTAGTTCAAGATGCAAAAACCAGCGATAACGAAGAAAGCCATA
+GTGATGAAGATGTGGACGATACTGAGCGTCCTACTTTTAAGGATGGGTTT
+GCAGCAATTAAGGCTTTAAAGTCCATTTTTATGCGAAACAATAATGATGA
+GTTTTTGCAAAACTTGAATTCTATGGAAGACAAGCTGTTTAATTTACATA
+TAAACTCAGCTGTATTGCAAAAAAAAATTACTGACTATTTTTAAGTTAGT
+TTTAAAAAGTGTTTTAATCAATTCACCATCACTTAAATTTATATGTCGAT
+CTTACTTATCATTAAGAATGAAATTATCAGTTCCTTTTATGTTTAACATT
+GTTATAAAGAAATAAATTCTTTATTTTTCCTTAAAAAAAAAAATTAAGTT
+AGCTGCATTTTTAAGTTACCTGCATCGAGGCATTGTGCAAAGTACTCGAG
+GCAGCTAAGCGAATTATACTG
+>FBgn0000155_roo
+TGTTCACACATGAACACGAATATATTTAAAGACTTACAATTTTGGGCTCC
+GTTCATATCTTATGTAAATGAATCGAGAGCGATAAATTATATTTAGGATT
+TTGTTATCTAAGGCGACATGGGTGCATTGCTCAAAAACATGTAATTTAAG
+TGCACACTACATGAGTCAGTCACTTGAGATCGTTCCCCGCCTCCTAAAAT
+AGTCCCTTAGTGGGAGACCACAGATAAGGTCCTCGCCGCTCAAGATAGGC
+AGATGTGCCCGAGCGTGGGACCTCGATAAGGCGGGGACTATTTACGTAGG
+CCTCTGCGTAGGCCATTTACTTTAAGATGCGATTCTCATGTCACCTATTT
+AAACCGAAGATATTTCCAAATAAAATCAGTTTTTTTACAAAAACTCAACG
+AGTAAAGTCTTCTTATTTGGGATTTTACATTTGGTCAATCGAGCCTTTAA
+TCGACTCTGCAGTTTCCCCCTACCAAAGGTAAGGAACTCAGAGAAAGGCC
+AGCTCCTTTAAGCATCTTACAGCTAAAGGTAGCAAAAATAAGTGACTCTT
+GTTTCCCCCTACCAAAGGTAAGGAACAGAGTATAAATATAAAAAGCAAAA
+GATACAAAAGAATCTTTTATGTTTTAAAACAAGCACCTTATAGTCTATAG
+CTAAAGGTTGCTTTGTGTACCATTATAAATTGTGGTAAGGCGTGCTTGAG
+GCCATACATCAGCAATTGTGAAATTAAAAAGTGCATAACAAAAGTGCCTT
+ATAAATGCTCTAATAGCATTAAATCAGCTCATAAATAGAGTGCAGTGTAT
+ATGCCATAAGAGCATAAATTAAATAAAAAGTGCCTGAAAACAGTGCCTTA
+TAAATGCTCTAATAGCATTAAATCAGCTCATAAATAGAGTGCAGTGTATA
+TGCCAAAAGAGCATAAATGCCGAAATAAATGGCTAAAAAACAAAAAATCT
+GACTGGACTACAAAAATAATAAAACGTGCCAAAAAAAAAAAAAAAATCAT
+CTTTAAACATCGACGGAGCCTTAAAGAAGAGAAGGAAGTCAAATTCAAAG
+GAGCCTCTACCAGCAGCAGAAGCAGCAACAACAGCAGCAGCAGAAGCAGC
+AACAGCAGTAGCAACAGCAGCAACAACAGCAGCAACAGCAGCAGCAACAA
+CAACGACATCAGCTAAGTCAAAACAAGAATTTTCTGTTTATCCAAACACA
+CATATATATATAAATACATATAAAATACATATACACGTACTATATATATT
+AAGAAATTACAAAAAATTTTCAAAATGATGTCAGAAAAGACTATTCAATT
+CCTTAAGAAGCAGTCCGAAATTATTTTGGAAATTAGAAAGTTGGAAGTAA
+AACCAACATTAACAGATGTAGAAATTCTAAAATTAAATGAGCTTCAAAAA
+TGTTTCATTGCTAATCATAGCAATTTGTTAAAGATCGGCGTTGTCGATCA
+TGAATATTTTAACGCGAAGCAGTATGATTTAATAATGATGGTGTTAGAAA
+AAATTAAAAATAAAAATGAAAAAATTAAGGGCGAGTCGGTAGAAAACACT
+TTCCCTAAATCAAACACTGTCCCTAAATCAAACCCTCCCCCTACATTAAA
+CCTTGAAATGCGTGGTCACCCTGAAAAAGAGGGTATAGCACAAAACAACG
+CTTTAAAAGTAGAGCAGGCATTTCGTAATAATGTTGGCCAATTTCGAGTA
+TATCTAGAAGATACGTCTAAACTAATAGACAGTAGTCCAGATTTCCTTAA
+AATAAGGAAAAATAAAATTGAATTTTTATGGCATAAAATAGATAACCTGA
+TTGAACAGGTGAATAGTCGTTTTGAGAGTTCGCTATTCGAAGAAGAAATT
+AGCGAACTTGAATTTGACAAACAAAATATTCTTACAGCCATTAATAGTCG
+ACTCAGTGGCACAATAAATAAAGCTGAAATGTCGACGGTTGTTAAGGCGG
+AGGAGTTACCAACCCTGCCTAAAATACAGATTCCCACCTTCTTTGGTGAT
+TCCAAAGAATGGGATCTTTTTAATGAACTCTTTACAGAGCTCATACATGT
+GAGAGAGGATCTCAGTCCTTCTCTCAAATTTAATTATCTAAAGTCAGCAT
+TAAAAGGAGAAGCCAGAAATGTGGTTACTCATTTACTGCTCGGCTCTGGA
+GAAAATTATGAAGCCACTTGGGAGTTTTTGACCAAGCGATATGAGAATAA
+AAGAAACATATTCTCAGATCATATGAATAGGCTTATGGATATGCCAAATT
+TAAATTTAGAATCCAATAAGCAAATAAAGACATTTATTGACACGATTAAC
+GAGTCAATTTATATTATAAAATTAAAGGCACAATTACCAGAAGATGTGGA
+TGCAATTTTCGCTCACATAATTCTTCGGAAATTCAATAAAGAATCACTCA
+ATTTATATGAAAGCCATGTTAAAAAGACAAAAGAAATACAGGCACTTTCT
+GATGTCATGGACTTTTTAGAGCAAAGGCTCAATTCTATATCATCATTCTC
+ACAGGAAGTAAAACCTGTAAAGAAAATGATTAATAATAACAAGAATAAAA
+ATTATAGTGACAATTGTGCATATTGCAAACTACCAGGGCATTATTTAATT
+CAATGCCATAAATTTAAAATAATGAATCCAGCAGAACGGTCTGACTGGGT
+AAGAAAAAATGGGATTTGCCTAAGATGTCTGAGGCATCCGTTTGGTAAAA
+AATGTATAAGCGAGCAGCTTTGTTCGACTTGTCGTAAACCTCACCACACG
+TTACTTCACTTTGCAGGTCATAATCCAGAAAAAGTGAATACGTGTAGAAC
+AACAGGTCAAGCCTTGTTGGCCACGGCCTTGATTCAAGTAAAGTCGAGGT
+ATGGAGGCTTTGAACAATTAAGAGCATTGATTGATAGTGGCTCTCAAAGC
+ACAATTATTTCAGAAGAGTCTGCACAGATTCTAAAATTGAAAAAATTTCG
+GTCTCATACTGAAATAAGTGGAGTATCTTCCACAGGAACGTGCATCTCCA
+AGCACAAAGCGGTTATTTCGATAAGAAATTCTCCGAAAAATTTAGAAATT
+GAAGCAATTATTCTCCCAAAACTTATGAAGGCACTTCCAGTCAACACGAT
+TAATGTTGATCAGAAAAAATGGAAGAACTTTAAATTAGCCGACCCCGATT
+TTAATAAACCGGGTCGCATTGATTTAATCATTGGAGCAGACGTATATACT
+CACATTCTGCAAAATGGAGTTATAAAAATAGACGGTCTCCTTGGGCAAAA
+AACTGATTTCGGGTGGATAGTTTCTGGATGTAAAAAATCCAAAGGAAAAG
+AAACCATTGTAGCCACAACAATAGAAATAAAAGAGTTAGATCGCTACTGG
+GAAGTGGAAGAAGAAGAAAAAGATGATATCGAGTCTGAAATCTGTGAAAA
+TAAATTTATCAAAACGACAAAAAAAGATTCAGATGGGCGATACATTGTGT
+CAATTCCATTCAAGGAGGATGTCACCTTAGGAGATTCAAAGAAACAAGCG
+ATAGCTCGTTACATGAATCTGGAGAAAAAACTAAAAAGAAATGAAAAACT
+TAAGGTTGACTACACTAAATTCATGAATGAATACATGGATTTAGGACACA
+TGATTGAAGTGAGTGATGAAGGCAAATATTTTTTACCGCACCAGGCAGTG
+ATTAGAGATTCAAGCCTTACGACCAAATTGAGAGTAGTTTTTGATGCTTC
+AGCAAAAACTACGAATAACAAAAGTTTGAACGACATAATGTGGGTTGGGC
+CACGAGTTCAAAAAGATATTTTTGACATTATTATTAAATGGAGAAAATGG
+GAATTTGTTGTTTCGGCAGACATTGAAAAGATGTACCGACAAATTAAAAT
+AGATAATAATGATCAAAAATATCAATATATTTTATGGAGAAATTCTCCAA
+AAGAAAAAATTAAAACATATAAATTAACCACAGTCACTTACGGAACTGCA
+TCTGCACCATATTTGGCTACCAGGGTTCTGGTAGATATTGCAGATAAATG
+TAAAAACCAAGTTATTAGTGCAATAATTAGGAATGATTTCTATATGGATG
+ACCTAATGACTGGAGCTGATTCGGTAGAAGAAGCTAATAAATTAATAACA
+TTAATTCCCCATGAATTGCAGAAAGTTGGATTCAACTTAAGGAAATGGAT
+TTCCAACAATTCCAAAATATTAACCACTGTGGAGGACACAGGGGACAATA
+AGGTTCTCAATATTATCGAAAATGAATGTGTTAAAACTTTAGGACTAAAA
+TGGGAACCTCAAAAGGATTTATTTAAGTTCAGCGTAAATTGTAATGATGA
+ATCAAAAAATATAAATAAGCGCGTTGTGTTATCAACGCTAGCAAAAATAT
+TTGATCCGTTAGGATGGTTGGCACCAGTCACGGTTTCAGGAAAACTTTTT
+ATTCAAAAACTTTGGATAAATAAAAGTGAATGGGATCAGGAATTATCCAT
+AGAAGATAAAAATTATTGGGAAAAATATAAAGAAAATTTATTATTGTTAG
+AGAATATTCGAATCCCAAGGTGGATTAATTCAAACAGTTCTTCAGTCATT
+CAGATTCACGGATTTGCGGACGCCTCCGAAAAAGCATATGCTGCAGTAGT
+CTATGCTAAAGTAGGACCTCATGTTAATATAATAGCTAGCAAAAGTAGAG
+TCAACCCTATAAAAAATAGGAAGACAATTCCCAAACTCGAGCTGTGTGCA
+GCTCACCTGCTTAGTGAATTAATCCAAAGACTAAAAGGATCAATTGACAA
+TATAATGGAGATCTATGCTTGGAGTGATTCCACGATTACCTTAGCATGGA
+TTAACAGTGGTCAAAGTAAGATCAAATTTATAAAAAGAAGAACGGATGAC
+ATTCGGAAATTAAAAAATACTGAATGGAATCATGTTAAGTCAGAGGATAA
+TCCAGCAGATTTAGCATCCAGGGGAGTGGATTCTAACCAGTTGATCAACT
+GTGATTTTTGGTGGAAAGGTCCGAAATGGCTAGCAGACCCAAAAGAACTT
+TGGCCTCGGCAGCAGTCTGTAGAAGAACCTGTCTTAATAAATACGGTATT
+AAATGACAAAATAGATGATCCTATTTACGAATTAATAGAAAGGTATTCCA
+GTATAGAAAAACTTATACGTATAATAGCATACATAAATAGATTCGTGCAG
+ATGAAAACAAGAAATAAAGCCTATTCATCAATTATTTCAGTAAAGGAGAT
+AAGAATAGCGGAAACAGTTGTTATTAAGAAACAACAAGAATACCAGTTTA
+GGCAAGAGATAAAGTGCCTTAAAATCAAAAAGGAAATCAAGACAAATAAT
+AAAATATTGTCATTGAATCCATTTTTGGACAAGGGTGGGGTTCTAAGAGT
+TGGAGGAAGATTGCAAAATTCCAATGCAGAATTTAATGTTAAACATCCAA
+TCATTTTAGAAAAATGCCACCTAACAAGCTTATTAATAAAAAATGCTCAT
+AAGGAAACATTGCATGGAGGGATAAACCTAATGCGAAACTATATCCAAAG
+AAAGTATTGGATTTTCGGGTTGAAAAATTCGTTGAAAAAGTATTTAAGAG
+AATGTGTAACGTGTGCAAGGTATAAACAAAATACAGCTCAGCAAATAATG
+GGTAACTTGCCAAAATATAGAGTGACGATGACATTCCCGTTTCTTAATAC
+TGGAATAGATTACGCAGGTCCTTATTATGTTAAATGTTCAAAAAATCGTG
+GCCAAAAAACATTTAAAGGATACGTTGCTGTATTCGTTTGCATGGCCACC
+AAAGCCATACACTTAGAAATGGTAAGCGATCTAACTTCAGACGCATTTTT
+AGCAGCACTCAGAAGATTTATTGCTAGACGGGGAAAATGTTCCAATATCT
+ATTCAGACAACGGAACAAATTTTGTAGGAGCTGCAAGAAAATTAGATCAA
+GAGTTATTTAATGCAATACAAGAAAATATAACGATTGCAGCGCAACTTGA
+AAAGGACAGGATTGATTGGCATTTTATTCCCCCGGCAGGACCTCACTTCG
+GAGGTATTTGGGAAGCTGGAGTTAAGTCAATGAAATACCATTTAAAGCGT
+ATAATCGGCGACACTATTTTTACTTATGAAGAAATGTCAACTCTTTTATG
+TCAAATAGAAGCATGCTTAAATTCAAGGCCATTATACACTATAGTTAGTG
+AGAAGGACCAACAAGAAGTTTTAACACCAGGTCATTTTTTAATTGGAAGA
+CCACCTTTAGAAATAGTCGAACCAATGGAAGATGAAAAAATCGGAAATTT
+GGATAGGTGGAGACTTATCCAAAAAATAAAGAAAGATTTCTGGGTTAAGT
+GGAAAAGTGAATATTTGCATACGCTCCAGCAAAGGAATAAATGGAAAAAG
+GAAATTCCTAATATAGAAGAAGGGCAAATAGTTTTATTAAAGGATGAGAA
+TTGTCATCCTGCAAGATGGCCTTTAGGAAAGGTGGAAAAGGTGCATAAGG
+GGAATGATGATAAGGTCCGAGTGGCTAAAGTAAAGATGCAGGAAGGATAT
+ATCACTAGACCCATTACTAAAATTTGTCCCTTGGAAGGAATAAAGTCTGT
+TGACAAAAATGAGGCTGACCAAGAGCCAAAAAGACGAACTAGAGCGACAT
+CGGGAATGTCCAAGATCGGAATCATTATGGCAATGTTGTTGTTTGTGTTA
+AGTTGTCAAGTTTCTAGCGCATTACCTAAAGATATAGCACCAAGATATTC
+TATAGACAAAATAAATAAAACCTCAGCAATATATCTAGACCCGCTAGGAG
+ATGTTGAGATTGTGAGTACTTCTTGGAATTTGGTTATCTATTATAAAATG
+GATCCATATTTTAAAATGTTAACAAAGGGTAATGCGCTTATACAAAGTAT
+GAGGAAAGTTTGCGAAAGACTTCATAGCTTTGAAGAGCAATGTAGTCTAG
+TCTTAGATAATATGCAAAGTCAGTTATCGGAACTTGAAGAAAACAATAAA
+TTGTTTATGATGCAGTCTAGATCTAGAAGCAAGCGTGCTCCTTTCGAATT
+TATGGGTTCCTTGTATCATATTTTATTTGGTATAATGGATGAAGATGATA
+GAGAGCAATTAGAAGAAAATATGAAGAATTTGTTAGATAACCAGAACAAC
+CTTGATAAACTAATTCAAAAACAAACATCTGTGGTTGATTCAACTTCTAA
+TCTATTAAAGAGAACAACAGAAGATGTTAACTCCAATTTTAGAAGTATGC
+AAATAAGAATTGAGAACATGACAGAAGTTCTTAAAGAAAATTATTATGTT
+TATAAGGAATCAATAAAATTCTTTATGATTACGAAACAGCTACACTCATT
+GATTGAAGAAGGCGAAAAAATTCAAGCAGGCATTATAAGCCTGTTGATTG
+ATATTAATCACGGTAGGCTAAATACAAATATTCTCAGGCCAAATCAGCTT
+AAAAAAGAAATTGCCAAAATTCAGCAGAGTCTTTCAGAGAACCTAGTAAT
+TCCAGGAAAACGGTCAGGTACGGAACTTAAGGAGGTGTATACACTGTTAA
+CAGCCAGGGGTTTATTCATCGACGATAAATTGATCATTAGTGCAAAAGTG
+CCTCTGTTTAGCAGGCATCCATCCAAATTGTTCAGGCTTATTCCGGTGCC
+AATTCGAAATGAAGATCGGATAATAATGGTGCATACAACGTCCGAATATT
+TAATTTATAATTTTGAGATAGATTCCTATCACATAATGACGGAAGCCACA
+TTAAATCAATGTCAGAAATGGCAACTAAATAAGAGAATATGCAAAGGAAG
+TTGGCCCTGGAATTCAGCGAATGATAATGCATGTGAGATTCAGCCTCTAA
+AGCCAGATAAAGCGGCGAACTGCATCTATAAAACAGTAGTCGACTCTAAA
+AGTTACTGGGTAGAGTTAGAAAAGAAAAGTAGTTGGTTGTTTAAGGTTCC
+TGCGAATTCAAAAGTCCGTCTGCAATGTACTGGCTCTCAAATTGAATTGT
+TTGATTTGCCTCAGCAAGGAGTTTTAAGCATTGCGCCATATTGTACGGCA
+AGAACCGACGATAAAATTCTAGTTGCCCACCATAACATTCAGTCCGAAAG
+TGAAGAATTATTATCAACACCTTATATAGGAGAAGTTAGTGGAGTGCCGA
+AGATTATTTGGGATCCGCTGAAACTATCAATATTAAATCATACTGAGGAA
+TTTGAACGATTGAATAATGAAATTAAATTTATGAAAGAGAACCATCAAAA
+ATTGAAAGATTTACATTTCCATCATATTTCCGGACATGCTGGATTAATTA
+TTGCTTTAATACTAATGATAGTATTAATAATATATTTCATACGGAAATGT
+GCTGTGCAACAAAGAATGCAAGCAATAACCTTTGCAGGTCCGTTGCCAGT
+ACTATAAATATCAATAGTAAATAAACAATAAAATAATATAACAAATAAAA
+ATATACAGTCCACTAATAGAAAATGTACTTCTACATAGAAAAAGCAAAAT
+GTTTAAAATAAGTTAATTAAGTACAAATTGTTGAATTAAAAATAATATAA
+ACCATAATTGTAATCCAATAAAATTAAAAGCCAGAAAAACTAGGCCCATT
+GAAATCTTAGTTGCAAAATAAATGAACATATATCAAATAAATACAGTCCA
+CTACTGTTATAAATGCAACTAATATACTAATGTACATCTCAGCTTTGCTG
+GCCCTTTGGCAGAATGTTCACACATGAACACGAATATATTTAAAGACTTA
+CAATTTTGGGCTCCGTTCATATCTTATGTAAATGAATCGAGAGCGATAAA
+TTATATTTAGGATTTTGTTATCTAAGGCGACATGGGTGCATTGCTCAAAA
+ACATGTAATTTAAGTGCACACTACATGAGTCAGTCACTTGAGATCGTTCC
+CCGCCTCCTAAAATAGTCCCTTAGTGGGAGACCACAGATAAGGTCCTCGC
+CGCTCAAGATAGGCAGATGTGCCCGAGCGTGGGACCTCGATAAGGCGGGG
+ACTATTTACGTAGGCCTCTGCGTAGGCCATTTACTTTAAGATGCGATTCT
+CATGTCACCTATTTAAACCGAAGATATTTCCAAATAAAATCAGTTTCTTA
+CAAAAACTCAACGAGTAAAGTCTTCTCATTTGGGATTTTACA
+>FBgn0000199_blood
+TGTAGTATGTGCATATATCGAGGGTACACTGTACCTATAAGTACACAGCA
+ACACTTAGTTGCATTGCATAAATAAATGTCTCAAGTGAGCGTGATATAAG
+ATCACCCATTTATGCTTTAAGCTAAGTCAGCATCCCCACGCTGGCCGCTG
+GCCATATATGCGCATAAGCTCTCTCTCTCTCTCTCTTATACATATATATA
+TACGCTGCTCTTCTGCCGCTGTCGACGGCGGCGCAGTCGCAGTATTTAGG
+TAAGATTAGACACTCTGTAGAGGTTAAGCGGGCAGAACCGTTTCTGCTAC
+TCGAAGAGATAAGAAGAAATAAAAAGGTGCCTGACGGCTGCACCCAACTG
+CAAGGAAAACACGTGTTCTCAATTGGTGGCATATATTGGTTTATTACATG
+GCGACCGTGAGGCAGGAGCCTGCGATCTGAGGACTACTGAGGAAATGCTG
+CTAATATTGCCGATTTGATTTGGGAATTCTAAACAGCGACAACAGGTGTG
+AGAAGCAGGCCGCCCCTTACACCAGTGCGGGAGACCTAGAGACGGGACAC
+TGATGAAAAAAAAAGAAACAAAAATACTGAGTGAGTAGAGTGTGGTAATG
+GGCAAACGCGGATGTCAGGAAATCAAAAATAAAGGTATAGCACATATTAA
+GTGGCTATGATATACAAATAAAACACCGCCCCCATGGGCAACGGCACAGA
+AATTAACTGCCGAATTAGACTTTCTGAAAGAAAACCTCCAGCAAAGAAAG
+CCGAATACCACAACTCACTCAGCAAAAATAGAAATAATCAATGAAGAAAT
+AACTGAAAATTCAACATCACCCAAGCCGAAAAGACCCGACGTCTGCATGA
+AAGACTGCCCTCGACCATTGTAAGCCGCAACAGCAATTAGCACGGCATCC
+TGCGAGGGTAGGATTAGGATAAAGGATAAAGGATTCCACCGGCGCGCCGC
+ACATGACAACAGCGAATGTCTACCAAGCAGACGTTCGAACACCCTGCTCC
+TGTCGAGCAAAGGGATCTGCCAAGTATCAAAGAGGTAATAGAGGTAGATC
+CGTCCGCGGGACCAAAGCCCTTGACCATACAAGAGTACAAGGCACGGACT
+GCAGCGAGGGAGCAGCCACCTAAAAAGAAGAGGGGTGGCCGCCGGATTAA
+GTTGCTCAGCGCCCGGAGGCTCAACATCGAACTACTGAAGACGGCAACTA
+ATGAGGAAGACCGGCAGCGCTACAAAGAGCGCCTTGCAGCCATCAATCAA
+CAACTTCGTGGTGCGAAGTAAAGCGGCGGGCTGCGTTATACGCCATAGCC
+TCAACCGCCCAAATATTATATTAATGTTGTCGATGCGGTTTCCGCTGCAA
+CAAAATTACTAACTTATCAGGGACCCATTTCATAACTAACACATTATACT
+CAGTCCTAAACTTAAAATAAGTAATAATATTGTAAAATTGCAAATTGCAA
+CCGATGTAAACTGAGTATAATGAATTCATCTATCAAGTAAAAATATGTTT
+AACAACAGTTTAGACCTATTAAAATTTCGAGCTATATTTATATCTGATCG
+AGATAACAATAATTGACCAATTCTCAAAGTTAAAATTCTATTTGTACTTT
+TGATATACAAATAAAGACTAATTTTCCCCATATCAAAATGGGACATAAGT
+CGTGGATACAACCCCACAGTTAAATTCAATGTACTTACTATTTTTGATTT
+TAGTTATCCTATCAGCCTTTTTACCTTGGCCTTAAAACTTTATCAGTTTC
+ACACAAGATCGTTGAAAAGACTTACATGAGTCGAGCCAATGATTTAGACA
+AAATCTAATAGAAACTACACCAAAAAGGTACAAGGTCGATTACATCGCTA
+AAAGGTACATACATGGAATGGCTAAACTTAACCATATCCATAAACAATAT
+TAGAGATGCTTTTGATAAATCCTATAAATGTATTAATAAAACCGCGCTGA
+TCAAAACTCAGACGCTTATTTTTCACATAAAGGTATTGATAACACAATAC
+AACACATTACAAAACCTAATAGTAACAAACAAAAGCAAACTCACTGAAGA
+ACATAAAGTCCAATGCTTCAAAGTTCTCAGTTCATTTGGTAAAAGACTAC
+ATAATACCAGCGTTAGACACAGTATTATAATAGAAGTCCCAACAGAACTA
+ACCAAAATAGCAGAATTCGACGAAAGCCAGTTAAGAGACTTGGACGAGTC
+GCAGCCGTTAGAAGATTTAGATATCGAAAGCGATATCGAATCAATAGAAG
+AATTAAAATTTAATACCGTACAACCAAATACAAGAAACATGGCCAACGCA
+TTAGAAGCTCAGAGAGCATACGTTAAACAGGTATCTGCCACAGTACCTGA
+TTTCGATGGTAAGAAACTCCATTTAAACAGGTTTGTGACAGCACTTAAGT
+TGACGGATCTAACTAAAGGAGATCAAGAAACTTTAGCAGTAGAGGTCATA
+AAGACCAAAATTATTGGCCCATTAAACTATAAAGTAGAACATGCGACAAC
+GATACAGGCAATAATTACCATATTGCAGGCAAACGTAAAAGGCGAATCGC
+CTGACGTTATAAAGGCCAAATTAATAAATGCCCAACAAAGAGGCAAGACC
+GCGTCTCAGTATGTTACAGAAATAGACAGTATGCGTAAGCAGCTCGAGGC
+AGCTTACATAGACGGCGGATTAGACGCCGATAATGCTGACAAATTCGCGA
+CTAAAGAGTCGATATCAGCAATGACCAAAAACTGTGCCAACGAGGCACTT
+AAAATGATCTTAACTGCAGGTACATTTAGTACATTCAACGACGCAATGGA
+AAAATACCTACATTGCAGTACAGAAATAACCGGCAATTCAAATACAGTCT
+TATTCTATAATGGGAATAATAGACGTGGTAATTATAATGCCTACTATCGT
+GGTAGAGGCAGAAATAATTATAACCATAATTATAACCAGAATTATAACCA
+AGGTTATAATAATAACAACAGAGGTCGCGGAGGCTACCGCGGCCACGGTA
+ATAACAGAGACGGAGGTAACCGAAGGGGTAACCAAAGTCAGAATAATAAT
+AACAACCGAAATGTGCGTAACGTACAATCGGAAAACAGCCAGACCCCCTT
+AAGCGATCAACAGTAAAAGTGTTTAAAGTAAACCTAAATCTGAGTATTTT
+CATTAAGACAAAAAACCATGAAACAAACACAGTTCTTACATTACTAATAG
+ACACAGGTGCAGAAATTTCATTGCTAAAAGCCAAAGCAAAGGAATATAAT
+AATATAAATTTCAGTAATATATCAAATATTACAGGTATTGGGCAAGGAAC
+CATACAGTCTATAGGTACAGTAGATCTTGACATACGCATTCAGGATGTTC
+TAGTGCCACATGAATTTCATGTAGTACCTGAGAATTTTCCGATACCATGC
+GATGGCATAATCGGAATAGATTTTATCAAGAAATACAATTGCGTATTAGA
+GTTTCAAAATAACAAAGACTGGTTCACAATAAGACCCAATAACTTCAGTA
+GACAGATTAGTGTACCAATTACACATAACTTAGACTCCAACACACTCTTA
+TTGCCAGCTAGATGCGAAGTAATCAGACAAGTCAAATTACTCACTAACGA
+AAAAACGGTGGTAGTACCAAATCAGGAGCTGCAACCAGGTATAATAGTAG
+CAAGCACCATTGCCGATAGCAAAAACGCATTGATTCGCATTATAAATACA
+AATAATAAAGACGCCATAATAGATAGCGCGAAGATCAAATGCGAATCAAT
+GAAAGACTATGACATTTTTACAACACCAGTAGAAAAGGAAAATAGAACTG
+AAGAAATTTTAAAACAATTAAGATTCCCTAAACAATTCAATAATGAACTA
+ACTAAGTTATGCACCGAGTTTAGCGATATTTTTGGTCTAGAAACAGAACC
+AATATCGGCTAACAATTTCTACAAACAAAAACTCAGATTAGGGGAAAAAA
+CACCGGTCTATATAAAAAACTATCGCATGGCAGATAGCCAAAAACCAGAA
+ATCGCCAGACAGGTAAAAAAATTAATAGATGATGGAATAGTTGAACCATC
+AATGTCTGAATATAATAGTCCATTACTTTTGGTTCCAAAGAAACCACTTC
+CGAATTCCACGGAAAAAAGATGGCGATTAGCAGTTGACTATCGTCAAATA
+AATAAGAAACTATTATCAGACAAATTTCCACTTCCAAGAATAGAAGATAT
+TCTTGATCAATTAGGAAGAGCAAAGTATTTTTCATGTCTCGACCTAATGT
+CTGGATTCCACCAGATAGAACTAGAAAAAAGGTATAGAGATATAACGTCA
+TTTTCAACAGCCAATGGCTCATATCGCTTCACGCGATTACCATACGGACT
+GAAAGTAGCACCAAACTCCTTCCAACGTAGGATGACACTTGCATTTTCTG
+GTCTTGAACCATCGCAAGCATTTCTATATATGGATGACTTAGTAGTAATA
+GGTTGTTCAGAAAAACATATGCTCAAAAATTTGACTAACGTATTCGAGCT
+ATGTAGACGACATAATTTGAAACTACATCCAGGGAAATGTTCTTTCTTTA
+TGAAAGAAGTAACATATTTGGGTCACAAATGTACCGATAAAGGTATACTC
+CCAGATGACACCAAATATGAAGTTATAGAAAAATATCCTATACCAACAGA
+TGCCGACAGTGCTAGGCGTTTCGTAGCCTTCTGTAATTATTACAGACGTT
+TCATTAAAAATTTTTCTGATCATTCACGCCACTTAACGAGGCTTTGTAAA
+AAGAATGTTCAATTCGAATGGACAGCAGAATGCAATGATGCATTCGAATA
+CCTTAAAACAGAATTAATGAAACCAACATTACTACAGTACCCAGATTTCG
+GTAAAGAATTTTGCATAACAACCGATGCTAGTAAACAGGCATGCGGAGCG
+GTACTTACACAAGATCACAATGGTCAACAACTTCCAGTGGCATACGCTTC
+AAGAATGTTCACTCAAGGTGAAAGTAATAAGTCCACTACAGAACAAGAAT
+TAACGGCCATTCATTGGGCCATAAATCATTTTCGACCATACATATATGGC
+AAGCATTTCATGGTAAAAAGCGATCATAGACCATTGTCATACCTATTCTC
+TATGAAAAATCCAAGTTCAAAACTCACTCGTATGAGGCTGGATTTAGAAG
+AGTATGACTTTACTGTAGAATATCTTAAGGGGAAAGATAACCATATTGCG
+GACGCCTTGTCTCGCATAACAATAAAAGATCTGAAAACAATCAACAGAGA
+AATATTAAAAGTTACCACCAGATCAAAAGCTAAACAGGAAAATTCCTGTA
+AGGACGAAGCAATAGTCAAAATACAAGAGGAAAAAGAGCAAACAATAGAA
+AAGCCCAAAGTCTATGAAGTTGTCAATAATAATGACACAAAGAAATATGT
+TTTAATCAAAATAGATAAACACAAGTGTTTATTAAAACGAGGAAAAACAA
+TTGTTTCACGCTTTGATGTTGATGACTTGTATTCTAATGAAACATTTGAT
+CTAAATCAATTCTTTCAAAGGCTTATTTCAAAAGCCGGAATGCATAAAAT
+AACAAAAATGCGAATATCACCAAGCGAACAGATGTTCCAATTTGTATCAC
+TAAATGAATTTAAAATAAAGGGCAACCGAGTACTCGAAAAAGTAGAACTA
+GCTATTCTACAAAAGGTGATAATTATAGACAAAAATGACGAAGCTCAGAT
+TAAAGAAATTTTGACAAAATTCCATGATGATCCTATAGAAGGAGGCCACA
+CTGGTATTTCGCGAACCCAGTCAAAAATCAAAAGATTTTATTATTGGCCC
+CAGATGACCAAGACAATCTCAAAGTATGTAAAGACTTGTTTGAAATGTCA
+ACAAGCCAAAATTACAACACATACGAAAACTCCATTAACATTGATGCCAA
+CGCCAGCAACAGCATTTGATACTGTTTTAATTGATACCATTGGTCCACTA
+CCGAAATCGGAAGACGGAAATGAGTATGCAGTTACAATCATATGCGATCT
+AACCAAGTTTTTAGTAACTATTCCAACACCAAATAAAAGTGCTAAAACAG
+TTGCAAAGGCTATATTTGAATTATTTGTACTGAAGTACGGTCCAATGAAG
+ACGTTCATTACAGATCAAGGTACGGAATACAAAAATTCACTTATGAATGA
+ATTATGCAAATATATGCATATAGAAAATCTAACATCTAGCGCTCACCATC
+ATCAAACTTTAGGAACAATAGAAAGAAGCCACCGAACTTTTAATGAATAT
+ATACGTTCATACATATCGGTTAACAAAAGTGATTGGGACATTTGGTTACC
+ATATTTCACTTATTGCTTCAATACAACACCCTCAATAGTCCATGACTATT
+GCCCATACGAACTAGTATTTGGCAGACTACCCAGACAATTCAAAGATTTC
+AGTAAGATAAACAAAATAGACCCAATATACAACTTAGACGACTACTCTAA
+AGAGCTTAAATGCAGACTAGAATTGTCGTACAACAGAGCAAGAAGAATGT
+TAGAAAAAGCAAAAGCGGATAGAAAATTAAGATATGATAGGAATACAAAT
+AATTTCGAATTAAAAATAGGAGATAAAGTATTACTTAGAAAAGAAACAGG
+TCATAAGTTAGATAAAAGATATGAAGGTCCTTATGACGTAGTAGATATAG
+GAATAAATGACAATATAACCATTAAAACAGGAAGTAAGAAACAACAAATA
+GTACATAAAGATAGGCTAAAAAAGCACAAATAGAATGAAAAAAAAAAAGG
+GCAATCAATGCCAAACCTTTCATAATAAAACTTAAATAACGGCCTGATCA
+GCCAAAACAATATAACAAAGACATAGACATAATCGAATTTTTATTAATTC
+AAAATACATACATATTTTTTCTTTATTCATTTAAAAATTCTATATCATAA
+ATAATGTTAATTCATTAAAAATAATATTTAAGTAATTTTTATTTTATAAT
+GGTAATATAGTTGATAGAAAATAACTTCATTTCTTTACGTTATTTTAAAA
+AAGAGGGGAGGTGTAGTATGTGCATATATCGAGGGTACACTGTACCTATA
+AGTACACAGCAACACTTAGTTGCATTGCATAAATAAATGTCTCAAGTGAG
+CGTGATATAAGATCACCCATTTATGCTTTAAGCTAAGTCAGCATCCCCAC
+GCTGGCCGCTGGCCATATATGCGCATAAGCTCTCTCTCTCTCTCTCTTAT
+ACATATATATATACGCTGCTCTTCTGCCGCTGTCGACGGCGGCGCAGTCG
+CAGTATTTAGGTAAGATTAGACACTCTGTAGAGGTTAAGCGGGCAGAACC
+GTTTCTGCTACTCGAAGAGATAAGAAGAAATAAAAAGGTGGCCTGACGGC
+TGCACCCAACTGCAAGGAAAACACGTGTTCTCAATTGGTGGCATATATTG
+GTTTATTACA