| Previous changeset 0:7a813e633d1c (2019-02-01) Next changeset 2:000dbfafe31d (2025-06-30) |
|
Commit message:
planemo upload for repository https://github.com/abims-sbr/adaptsearch commit 68979144b9949c27bcc3340a9e8375de1391526c |
|
modified:
filter_assembly.xml macros.xml scripts/S01_script_to_choose.py test-data/trinity_and_velvet_up.output test-data/trinity_out/AcAcaud_trinity.fasta test-data/trinity_out/AmAmphi_trinity.fasta test-data/trinity_out/ApApomp_trinity.fasta test-data/trinity_out/PfPfiji_trinity.fasta test-data/trinity_up.output test-data/velvet_out/AcAc_transcriptome_25591.fasta test-data/velvet_out/ApAp_transcriptome_35099.fasta test-data/velvet_out/PgPg_transcriptome_90109.fasta test-data/velvet_up.output |
|
removed:
scripts/S02a_remove_redondancy_from_velvet_oases.py scripts/S02b_format_fasta_name_trinity.py scripts/S03_choose_one_variants_per_locus_trinity.py scripts/S04_find_orf.py scripts/S05_filter.py |
| diff -r 7a813e633d1c -r a83562c0719f filter_assembly.xml --- a/filter_assembly.xml Fri Feb 01 10:22:32 2019 -0500 +++ b/filter_assembly.xml Mon Feb 03 14:37:31 2025 +0000 |
| @@ -1,4 +1,4 @@ -<tool name="Filter assemblies" id="filter_assemblies" version="2.0.3"> +<tool name="Filter assemblies" id="filter_assemblies" version="2.0.4"> <description> Filter the outputs of Velvet or Trinity assemblies @@ -9,8 +9,7 @@ </macros> <requirements> - <expand macro="python_required" /> - <requirement type="package" version="0.0.14">fastx_toolkit</requirement> + <expand macro="python3_required" /> <requirement type="package" version="10.2011">cap3</requirement> </requirements> @@ -23,19 +22,13 @@ #end for #set $infiles = $infiles[:-1] - ln -s '$__tool_directory__/scripts/S02a_remove_redondancy_from_velvet_oases.py' . && - ln -s '$__tool_directory__/scripts/S02b_format_fasta_name_trinity.py' . && - ln -s '$__tool_directory__/scripts/S03_choose_one_variants_per_locus_trinity.py' . && - ln -s '$__tool_directory__/scripts/S04_find_orf.py' . && - ln -s '$__tool_directory__/scripts/S05_filter.py' . && - python '$__tool_directory__/scripts/S01_script_to_choose.py' '$infiles' $length_seq_max $percent_identity $overlap_length - > ${log} + > '${log}' ]]> </command> @@ -106,13 +99,13 @@ **Description** -This tool reformats Velvet Oases or Trinity assemblies for the AdaptSearch galaxy suite and selects only one variant per gene according to its length and quality check. +This tool runs the CAP3 software on assembly FASTA data, merge singlets and contigs and then reformat headers to allow any assembly tools. --------- **Input format** -(1) Sequences are in the sequential format: +Sequences are in the FASTA format: | >seqname1 | AAAGAGAGACCACATGTCAGTAGC -on one or several lines - @@ -121,18 +114,6 @@ | etc ... | -2) The file name should begin with a two letter abbreviation of the species name (for isntance, 'Ap' if the species is Alvinella pompejana). 
- -**For Velvet Oases assemblies input** - - The headers must be as follow : *>Locus_i_Transcript_i/j_Confidence_x.xxx_Length_N* where i is the locus number, j the transcript variant among all versions of the transcript, x.xxx the confidence value and N the length. - -**For Trinity assemblies inputs** - - The headers must be as follow : *>cj_gj_ij Len=j path=[j:0-j]* where all the j are integers (locus number, transcript variant, length, position...) - -**The tool handles the case if input files come from both assemblers (there is no need for input files to be exclusively from one or another assembler).** - --------- **Parameters** @@ -150,11 +131,9 @@ **Steps**: The tool: - 1) Modifies the sequence name to add the species abbreviation using the 2 first letters of the name of the transcriptome file : note that each species abbreviation must be unique - 2) Selects one allelic sequence from each transcript (c or locus) using the length of the sequence and its level of confidence - 3) Selects the best ORF from the sequence between two stop codons - 4) Performs a CAP3 from the full set of ORFs to minimize redundancy - 5) Retrieves the initial transcript sequences from the remaining set of proceeded ORF sequences + 1) Performs a CAP3 from the full set of ORFs to minimize redundancy + 2) Merges singlets and contigs identified by CAP3 + 3) Reformats headers of the FASTA records by adding a specified prefix (defined from the original filename) and ensures that sequences are on a single line **Outputs** @@ -172,6 +151,11 @@ Changelog --------- + +**Version 2.2 - 07/10/2024** + + - Input files can be from any assembly tools + **Version 2.1 - 15/01/2018** - Input files can be a mix from files coming either from Trinity or Velvet Oases assemblers |
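The revised help text above lists three steps: run CAP3, merge the singlets and contigs it reports, then write the records with each sequence on a single line. The merge and single-line steps can be sketched as follows. This is a minimal illustration, not the tool's code: the function names `merge_cap3_outputs` and `single_line_fasta` are hypothetical, and the sketch assumes CAP3 has already written `<input>.cap.contigs` and `<input>.cap.singlets`, each ending with a newline.

```python
import os


def merge_cap3_outputs(contigs_path, singlets_path, merged_path):
    """Concatenate the CAP3 contigs file, then the singlets file.

    A missing file is skipped, so an assembly that produced no contigs
    still yields a merged file containing the singlets only.
    """
    with open(merged_path, "w") as out:
        for path in (contigs_path, singlets_path):
            if os.path.exists(path):
                with open(path) as handle:
                    out.write(handle.read())


def single_line_fasta(input_path, output_path):
    """Rewrite a FASTA file so that every sequence sits on one line."""
    with open(input_path) as infile, open(output_path, "w") as outfile:
        chunks = []  # pieces of the sequence currently being read
        for line in infile:
            line = line.strip()
            if line.startswith(">"):
                # Flush the previous record before starting a new one.
                if chunks:
                    outfile.write("".join(chunks) + "\n")
                    chunks = []
                outfile.write(line + "\n")
            elif line:
                chunks.append(line)
        if chunks:
            outfile.write("".join(chunks) + "\n")
```

Writing contigs before singlets mirrors the order used in the updated script; the later renumbering of headers depends on that order.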
| diff -r 7a813e633d1c -r a83562c0719f macros.xml --- a/macros.xml Fri Feb 01 10:22:32 2019 -0500 +++ b/macros.xml Mon Feb 03 14:37:31 2025 +0000 |
| @@ -1,9 +1,13 @@ <macros> <xml name="python_required"> - <requirement type="package" version="2.7">python</requirement> + <requirement type="package" version="3.10">python</requirement> </xml> + <xml name="python3_required"> + <requirement type="package" version="1.79">biopython</requirement> + </xml> + <token name="@HELP_AUTHORS@"> .. class:: infomark |
| diff -r 7a813e633d1c -r a83562c0719f scripts/S01_script_to_choose.py --- a/scripts/S01_script_to_choose.py Fri Feb 01 10:22:32 2019 -0500 +++ b/scripts/S01_script_to_choose.py Mon Feb 03 14:37:31 2025 +0000 |
| b'@@ -1,54 +1,157 @@\n #!/usr/bin/env python\n-#coding: utf-8\n+import os\n+import subprocess\n+import sys\n \n-## AUTHOR: Eric Fontanillas\n-## LAST VERSION: 10.2017 by Victor Mataigne\n+from Bio import SeqIO\n+\n \n-import glob, sys, string, os\n- \n-def nameFormatting(name, script_path, prefix):\n- f = open(name, "r")\n- f1 = f.readline() # Only need to check first line to know the assembler which has been used\n- f.close()\n- name_find_orf_input = ""\n+def fasta_formatter(input_file, output_file):\n+ """\n+ Reformats the input FASTA file to ensure that sequences\n+ are on a single line.\n+ """\n+ os.makedirs(os.path.dirname(output_file), exist_ok=True)\n+ with open(input_file, \'r\') as infile, open(output_file, \'w\') as outfile:\n+ sequence = \'\'\n+ header = \'\'\n+ for line in infile:\n+ if line.startswith(\'>\'):\n+ if sequence:\n+ outfile.write(sequence + \'\\n\')\n+ header = line.strip()\n+ outfile.write(header + \'\\n\')\n+ sequence = \'\'\n+ else:\n+ sequence += line.strip()\n+ if sequence:\n+ outfile.write(sequence + \'\\n\')\n+\n \n- if f1.startswith(">Locus"):\n- name_remove_redondancy = "02_%s" %name\n- os.system("python S02a_remove_redondancy_from_velvet_oases.py %s %s" %(name, name_remove_redondancy))\n- name_find_orf_input = "%s%s" %(prefix, name)\n- os.system("sed -e \'s/Locus_/%s/g\' -e \'s/_Confidence_/_/g\' -e \'s/_Transcript_/_/g\' -e \'s/_Length_/_/g\' %s > %s" % (prefix, name_remove_redondancy, name_find_orf_input))\n- elif f1.startswith(">c"): \n- #Format the name of the sequences with good name\n- name_format_fasta = "03%s" %name\n- os.system("python S02b_format_fasta_name_trinity.py %s %s %s" %(name, name_format_fasta, prefix))\n- #Apply first script to avoid reductant sequences\n- name_find_orf_input = "04%s" %name\n- os.system("python S03_choose_one_variants_per_locus_trinity.py %s %s" %(name_format_fasta, name_find_orf_input))\n+def reformat_headers(input_file, output_file, prefix):\n+ """\n+ Reformats the headers of the FASTA 
records by adding a specified prefix\n+ and ensures that sequences are on a single line.\n+ """\n+ with open(input_file, \'r\') as infile, open(output_file, \'w\') as outfile:\n+ sequence = \'\'\n+ for line in infile:\n+ if line.startswith(\'>\'):\n+ if sequence:\n+ outfile.write(sequence + \'\\n\')\n+ # Process header line\n+ original_id = line[1:].strip()\n+ header_parts = original_id.split(\'/\')\n+ numeric_part = header_parts[0].replace(\'ou\', \'\')\n+ rest = \'/\'.join(header_parts[1:]) \\\n+ if len(header_parts) > 1 else ""\n+ if rest:\n+ new_header = ">{}/{}".format(prefix +\n+ str(numeric_part), rest)\n+ else:\n+ new_header = ">{}".format(prefix + str(numeric_part))\n+ outfile.write(new_header + \'\\n\')\n+ sequence = \'\'\n+ else:\n+ sequence += line.strip()\n+ if sequence:\n+ outfile.write(sequence + \'\\n\')\n \n- return name_find_orf_input\n+\n+def rename_fasta_headers(input_fasta, output_fasta):\n+ # Extract the base name of the file (without .fasta extension)\n+ base_name_dir = input_fasta.split(\'.\')[0]\n+ base_name = base_name_dir.split(\'/\')[1]\n+ # The first two letters of the file name\n+ prefix = base_name[3:5]\n+ # List to store new sequences\n+ modified_sequences = []\n+\n+ # Read the file and edit the headers\n+ for index, record in enumerate(SeqIO.parse(input_fasta, "fasta"), start=1):\n+ seq_length = len(record.seq)\n+ new_header = ">{}{}_1/1_1.000_{}".format(prefix, index, seq_length)\n+ record.id = new_header[1:] # [1:] to remove '..b' sys.exit(1)\n+\n+ output_dir = "outputs" # Define the output directory\n+ os.makedirs(output_dir, exist_ok=True)\n percent_identity = sys.argv[3]\n overlap_length = sys.argv[4]\n \n- for name in str.split(sys.argv[1], ","): \n- prefix=name[0:2]\n- name_fasta_formatter = "01%s" %name\n- os.system("cat \'%s\' | fasta_formatter -w 0 -o \'%s\'" % (name, name_fasta_formatter))\n- name_find_orf_input = nameFormatting(name_fasta_formatter, script_path, prefix)\n- #Pierre guillaume find_orf script for keeping 
the longuest ORF\n- name_find_orf = "05%s"% name\n- os.system("python S04_find_orf.py %s %s" %(name_find_orf_input, name_find_orf))\n- #Apply cap3\n- os.system("cap3 %s -p %s -o %s"%(name_find_orf, percent_identity, overlap_length))\n- #Il faudrait faire un merge des singlets et contigs! TODO\n- os.system("zcat -f < \'%s.cap.singlets\' | fasta_formatter -w 0 -o \'%s\'" % (name_find_orf, prefix))\n- #Apply pgbrun script filter script TODO length parameter\n- name_filter = "%s%s"%(prefix, name)\n- os.system("python S05_filter.py %s %s outputs/%s" %(prefix, length_seq_max, name_filter))\n+ for name in sys.argv[1].split(","):\n+ if not os.path.isfile(name):\n+ print("Error: Input file {} does not exist.".format(name))\n+ continue\n+\n+ # Apply CAP3\n+ # Get the base file name\n+ file_name = os.path.basename(name)\n+ # Define the output file path in the output directory\n+ output_file_path = os.path.join(output_dir, file_name)\n+ # Create a symbolic link for the input file in the output directory\n+ symlink_path = os.path.join(output_dir, file_name)\n+ if not os.path.exists(symlink_path):\n+ os.symlink(os.path.abspath(name), symlink_path)\n+\n+ # Print and run the CAP3 command\n+ print(\n+ "cap3 {} -p {} -o {}".format(output_file_path,\n+ percent_identity, overlap_length)\n+ )\n+ subprocess.run([\n+ "cap3", output_file_path, "-p", percent_identity,\n+ "-o", overlap_length], check=True)\n+\n+ # Format file to have sequence in one line\n+ name_fasta_formatter = os.path.join(\n+ output_dir, "02_{}".format(os.path.basename(name)))\n+ fasta_formatter(\n+ "{}.cap.singlets".format(output_file_path), name_fasta_formatter)\n+\n+ # Merge singlets and contigs\n+ merged_file = os.path.join(output_dir,\n+ "03_{}_merged.fasta".format(file_name))\n+ # Define paths for CAP3 output files\n+ cap_singlets_file = os.path.join(output_dir,\n+ "{}.cap.singlets".format(file_name))\n+ cap_contigs_file = os.path.join(output_dir,\n+ "{}.cap.contigs".format(file_name))\n+ print("{} and 
{}".format(cap_singlets_file, cap_contigs_file))\n+\n+ with open(merged_file, \'w\') as outfile:\n+ # Write the contents of the contigs file first\n+ if os.path.exists(cap_contigs_file):\n+ with open(cap_contigs_file, \'r\') as contigs:\n+ outfile.write(contigs.read())\n+ # Append the contents of the singlets file\n+ if os.path.exists(cap_singlets_file):\n+ with open(cap_singlets_file, \'r\') as singlets:\n+ outfile.write(singlets.read())\n+\n+ # Reformat headers\n+ name_fasta_final = os.path.join(\n+ output_dir, "04_{}".format(os.path.basename(name)))\n+ rename_fasta_headers(merged_file, name_fasta_final)\n+\n+ # Format final file to have sequence in one line\n+ prefix = file_name[:2]\n+ tmp = prefix + os.path.basename(name)\n+ name_final_file = os.path.join(output_dir, tmp)\n+ fasta_formatter(name_fasta_final, name_final_file)\n+\n \n if __name__ == "__main__":\n main()\n' |
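The body of `rename_fasta_headers` is truncated in this view, but the visible part shows the header pattern it builds: `>{prefix}{index}_1/1_1.000_{length}`, with records numbered from 1. The sketch below reproduces that pattern only. It is an assumption-laden illustration: the script itself reads records with Biopython's `SeqIO.parse`, whereas this version uses a small hand-written parser so that it runs without dependencies, and the names `parse_fasta` and `rename_records` are hypothetical.

```python
def parse_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def rename_records(records, prefix):
    """Yield (new_header, sequence) pairs numbered from 1.

    The new header follows the pattern visible in the diff:
    <prefix><index>_1/1_1.000_<sequence length>.
    """
    for index, (_old_header, seq) in enumerate(records, start=1):
        yield "{}{}_1/1_1.000_{}".format(prefix, index, len(seq)), seq


if __name__ == "__main__":
    fasta = [">Contig1", "ACGT", "AC", ">read7", "GGTT"]
    for new_header, seq in rename_records(parse_fasta(fasta), "Ac"):
        print(">" + new_header)
        print(seq)
```

Because the index is assigned in file order, the same input sequences can receive different numbers when the merge order changes, which is consistent with the renumbered headers in the updated test data.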
| diff -r 7a813e633d1c -r a83562c0719f scripts/S02a_remove_redondancy_from_velvet_oases.py --- a/scripts/S02a_remove_redondancy_from_velvet_oases.py Fri Feb 01 10:22:32 2019 -0500 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 |
| @@ -1,122 +0,0 @@ -#!/usr/bin/env python -## AUTHOR: Eric Fontanillas -## LAST VERSION: 06.12.2011 - -## DESCRIPTION: Remove redondant transcripts (i.e. transcript from the same locus) from Oases output on the basis of two recursive criterias (see in DEF1): - ## 1. [CRITERIA 1] Keep in priority seq with BEST "confidence_oases_criteria" present in the fasta name - ## 2. [CRITERIA 2] Second choice (if same coverage) : choose the longuest sequence (once any "N" have been removed => effective_length = length - N number -## => criticize of this approach: the transcripts may come from a same locus but may be not redundant (non-overlapping) ==> SEE "DEF2" for an alternative - -################### -###### DEF 1 ###### -################### -def dico_filtering_redundancy(path_in): - f_in = open(path_in, "r") - bash = {} - bash_unredundant = {} - file_read = f_in.read() - S1 = file_read.split(">") - k = 0 - - ## 1 ## Extract each transcript and group them in same locus if they share the same "short_fasta_name" - for element in S1: - if element != "": - S2 = element.split("\n") - fasta_name = S2[0] - fasta_seq = S2[1:-1] # that line was unindented - fasta_seq = "".join(fasta_seq) # that line was unindented - L = fasta_name.split("_") - short_fasta_name = L[0] + L[1] - - ## Used later for [CRITERIA 1] (see below) - confidence_oases_criteria = L[-3] - countN = fasta_seq.count("N") - length = len(fasta_seq) - effective_length = length - countN - - if short_fasta_name not in list(bash.keys()): - bash[short_fasta_name] = [[fasta_name, fasta_seq, confidence_oases_criteria, effective_length]] - else: - bash[short_fasta_name].append([fasta_name, fasta_seq, confidence_oases_criteria, effective_length]) - k = k+1 - f_in.close() - - for key in list(bash.keys()): - ## 2 ## IF ONE TRANSCRIPT PER LOCUS: - ## In this case => we record directly - if len(bash[key]) == 1: - entry = bash[key][0] - name = entry[0] - seq = entry[1] - bash_unredundant[name] = seq - - ## 3 ## IF MORE THAN ONE 
TRANSCRIPTS PER LOCUS: - ## In this case: - ## 1. [CRITERIA 1] Keep in priority seq with BEST "confidence_oases_criteria" present in the fasta name - ## 2. [CRITERIA 2] Second choice (if same coverage) : choose the longuest sequence (once any "N" have been removed => effective_length = length - N numb - elif len(bash[key]) > 1: ### means there are more than 1 seq - MAX_CONFIDENCE = {} - MAX_LENGTH = {} - for entry in bash[key]: ## KEY = short fasta name || VALUE = list of list, e.g. : [[fasta_name1, fasta_seq1],[fasta_name2, fasta_seq2][fasta_name3, fasta_seq3]] - name = entry[0] - seq = entry[1] - effective_length = entry[3] - confidence_oases_criteria = entry[2] - - ## Bash for [CRITERIA 2] - MAX_LENGTH[effective_length] = entry - - ## Bash for [CRITERIA 1] - # confidence_oases_criteria = string.atof(confidence_oases_criteria) - confidence_oases_criteria = float(confidence_oases_criteria) - if confidence_oases_criteria not in list(MAX_CONFIDENCE.keys()): - MAX_CONFIDENCE[confidence_oases_criteria] = entry - else: ## IF SEVERAL SEQUENCES WITH THE SAME CONFIDENCE INTERVAL => RECORD ONLY THE LONGUEST ONE [CRITERIA 2] - current_seq_length = effective_length - yet_recorded_seq_length = MAX_CONFIDENCE[confidence_oases_criteria][3] - if current_seq_length > yet_recorded_seq_length: - MAX_CONFIDENCE[confidence_oases_criteria] = entry ## Replace the previous recorded entry with the same confidence interval but lower length - - ## Sort keys() for MAX_CONFIDENCE bash - KC = list(MAX_CONFIDENCE.keys()) - KC.sort() - - ## Select the best entry - MAX_CONFIDENCE_KEY = KC[-1] ## [CRITERIA 1] - BEST_ENTRY = MAX_CONFIDENCE[MAX_CONFIDENCE_KEY] - - BEST_fasta_name = BEST_ENTRY[0] - BEST_seq = BEST_ENTRY[1] - bash_unredundant[BEST_fasta_name] = BEST_seq - - return bash_unredundant -#~#~#~#~#~#~#~#~#~# - -################### -### RUN RUN RUN ### -################### -import string, os, sys, re - -path_IN = sys.argv[1] -path_OUT = sys.argv[2] -file_OUT = open(path_OUT, "w") -dico = 
dico_filtering_redundancy(path_IN) ### DEF1 ### -KB = list(dico.keys()) - -## Sort the fasta_name depending their number XX : ApXX -BASH_KB = {} -for name in KB: - L = name.split("_") - nb = int(L[1]) - BASH_KB[nb] = name -NEW_KB = [] -KKB = list(BASH_KB.keys()) -KKB.sort() - -for nb in KKB: - fasta_name = BASH_KB[nb] - seq = dico[fasta_name] - file_OUT.write(">%s\n" %fasta_name) - file_OUT.write("%s\n" %seq) - -file_OUT.close() \ No newline at end of file |
| diff -r 7a813e633d1c -r a83562c0719f scripts/S02b_format_fasta_name_trinity.py --- a/scripts/S02b_format_fasta_name_trinity.py Fri Feb 01 10:22:32 2019 -0500 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 |
| @@ -1,66 +0,0 @@ -#!/usr/bin/env python -## AUTHOR: Eric Fontanillas -## LAST VERSION: 06.12.2011 -## DESCRIPTION: format fasta name in TRINITY output - -from os import listdir -import re - -################### -###### DEF 1 ###### -################### -def dico_format_fasta_name(path_in, SUFFIX): - f_in = open(path_in, "r") - bash = {} - file_read = f_in.read() - S1 = file_read.split(">") - k = 0 - - for element in S1: - if element != "": - S2 = element.split("\n") - fasta_name = S2[0] - fasta_seq = S2[1] - L = fasta_name.split("_") - match=re.search('(\D+)(\d+)', L[0]) - short_fasta_name= SUFFIX + match.group(2) + "_" + L[1] + "_" + L[2] - bash[short_fasta_name] = fasta_seq - - return bash -#~#~#~#~#~#~#~#~#~# - -################### -### RUN RUN RUN ### -################### -import string, os, sys, re - -path_IN = sys.argv[1] -path_OUT = sys.argv[2] -suffix= sys.argv[3] -file_OUT = open(path_OUT, "w") -#Extract suffix info - -dico = dico_format_fasta_name(path_IN, suffix) ### DEF1 ### - -print((len(list(dico.keys())))) - -KB = list(dico.keys()) - -## Sort the fasta_name depending their number XX : ApXX -BASH_KB = {} -for name in KB: - L = name.split("_") - nb = L[0][2:] - nb = int(nb) - BASH_KB[nb] = name - -KKB = list(BASH_KB.keys()) -KKB.sort() - -for nb in KKB: - fasta_name = BASH_KB[nb] - seq = dico[fasta_name] - file_OUT.write(">%s\n" %fasta_name) - file_OUT.write("%s\n" %seq) - -file_OUT.close() |
| diff -r 7a813e633d1c -r a83562c0719f scripts/S03_choose_one_variants_per_locus_trinity.py --- a/scripts/S03_choose_one_variants_per_locus_trinity.py Fri Feb 01 10:22:32 2019 -0500 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 |
| @@ -1,111 +0,0 @@ -#!/usr/bin/env python -## AUTHOR: Eric Fontanillas -## LAST VERSION: 06.12.2011 - -## DESCRIPTION: Remove redondant transcripts (i.e. transcript from the same locus) from TRINITY on the basis of 1 criteria: - ## 1. [CRITERIA 1] choose the longuest sequence (once any "N" have been removed => effective_length = length - N number - - - -################### -###### DEF 1 ###### -################### -def dico_filtering_redundancy(path_in): - f_in = open(path_in, "r") - bash = {} - bash_unredundant = {} - file_read = f_in.read() - S1 = file_read.split(">") - k = 0 - - ## 1 ## Extract each transcript and group them in same locus if they share the same "short_fasta_name" - for element in S1: - if element != "": - S2 = element.split("\n") - fasta_name = S2[0] - fasta_seq = S2[1] - - L = fasta_name.split("_") - short_fasta_name = L[0] + L[1] ## 1.1. ## Extract short fasta name - - ## Used later for [CRITERIA 1] (see below) - - countN = fasta_seq.count("N") - length = len(fasta_seq) - effective_length = length - countN - - if short_fasta_name not in list(bash.keys()): - bash[short_fasta_name] = [[fasta_name, fasta_seq, effective_length]] - else: - bash[short_fasta_name].append([fasta_name, fasta_seq, effective_length]) - k = k+1 - if k%1000 == 0: - print (k) - f_in.close() - - for key in list(bash.keys()): - ## 2 ## IF ONE TRANSCRIPT PER LOCUS: - ## In this case => we record directly - if len(bash[key]) == 1: - entry = bash[key][0] - name = entry[0] - seq = entry[1] - bash_unredundant[name] = seq - - ## 3 ## IF MORE THAN ONE TRANSCRIPTS PER LOCUS: - ## In this case: - ## [CRITERIA 1]: Choose the longuest sequence (once any "N" have been removed => effective_length = length - N numb - elif len(bash[key]) > 1: ### means there are more than 1 seq - MAX_LENGTH = {} - for entry in bash[key]: ## KEY = short fasta name || VALUE = list of list, e.g. 
: [[fasta_name1, fasta_seq1],[fasta_name2, fasta_seq2][fasta_name3, fasta_seq3]] - name = entry[0] - seq = entry[1] - effective_length = entry[2] - - ## Bash for [CRITERIA 1] - MAX_LENGTH[effective_length] = entry - - ## Sort keys() for MAX_LENGTH bash - KC = list(MAX_LENGTH.keys()) - KC.sort() - - ## Select the best entry - MAX_LENGTH_KEY = KC[-1] ## [CRITERIA 1] - BEST_ENTRY = MAX_LENGTH[MAX_LENGTH_KEY] - - BEST_fasta_name = BEST_ENTRY[0] - BEST_seq = BEST_ENTRY[1] - bash_unredundant[BEST_fasta_name] = BEST_seq - - return bash_unredundant -#~#~#~#~#~#~#~#~#~# - -################### -### RUN RUN RUN ### -################### -import string, os, sys, re - -path_IN = sys.argv[1] -path_OUT = sys.argv[2] -file_OUT = open(path_OUT, "w") -dico = dico_filtering_redundancy(path_IN) ### DEF1 ### -KB = list(dico.keys()) - -## Sort the fasta_name depending their number XX : ApXX -BASH_KB = {} -for name in KB: - L = name.split("_") - nb = L[0][2:] - nb = int(nb) - BASH_KB[nb] = name - -KKB = list(BASH_KB.keys()) -KKB.sort() - -for nb in KKB: - fasta_name = BASH_KB[nb] - seq = dico[fasta_name] - file_OUT.write(">%s\n" %fasta_name) - file_OUT.write("%s\n" %seq) - -file_OUT.close() \ No newline at end of file |
| diff -r 7a813e633d1c -r a83562c0719f scripts/S04_find_orf.py --- a/scripts/S04_find_orf.py Fri Feb 01 10:22:32 2019 -0500 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 |
| [ |
| @@ -1,64 +0,0 @@ -#!/usr/bin/env python -#keeps the longest ORF found in the 6 possible ORF alltogether -#python find_ORF.py file output - -def find_orf(entry): - orf={} - orf_length={} - stop=['TAA','TAG','TGA'] - for i in range(0,3): - pos=i - orf[i]=[0] - while pos<len(entry): - if entry[pos:pos+3] in stop: - orf[i].append(pos-1) - orf[i].append(pos+3) - pos+=3 - orf[i].append(len(entry)-1) - orf_length[i]=[] - for u in range(1,len(orf[i])): - orf_length[i].append(orf[i][u]-orf[i][u-1]+1) - orf[i]=[orf[i][orf_length[i].index(max(orf_length[i]))],orf[i][orf_length[i].index(max(orf_length[i]))+1]] - orf_max={0:max(orf_length[0]),1:max(orf_length[1]),2:max(orf_length[2])} - orf=orf[max(list(orf_max.keys()), key=(lambda k: orf_max[k]))] - if orf[0]==0: - orf[0]=orf[0]+max(list(orf_max.keys()), key=(lambda k: orf_max[k])) - return orf - - -def reverse_seq(entry): - nt={'A':'T','T':'A','G':'C','C':'G', 'N':'N'} - seqlist=[] - for i in range(len(entry)-1,-1,-1): - seqlist.append(nt[entry[i]]) - seq=''.join(seqlist) - return seq - -# RUN - -import string, os, sys, re, itertools - -path_IN = sys.argv[1] -file_OUT = open(sys.argv[2], "w") -inc=1 -threshold=0 #minimal length of the ORF - -with open (path_IN, "r") as f_in: - for ignored, line in itertools.izip_longest(*[f_in]*2): - name=">"+path_IN[:2]+str(inc)+"_1/1_1.000_" - high_plus=find_orf(line[:-1]) - reverse=reverse_seq(line[:-1]) - high_minus=find_orf(reverse) - if high_plus[1]-high_plus[0]>threshold or high_minus[1]-high_minus[0]>threshold: - inc+=1 - if high_plus[1]-high_plus[0]>high_minus[1]-high_minus[0]: - file_OUT.write("%s" %name) - file_OUT.write(str(high_plus[1]-high_plus[0]+1)+"\n") - file_OUT.write("%s" %line[high_plus[0]:high_plus[1]+1]) - file_OUT.write("\n") - else: - file_OUT.write("%s" %name) - file_OUT.write(str(high_minus[1]-high_minus[0]+1)+"\n") - file_OUT.write("%s" %reverse[high_minus[0]:high_minus[1]+1]) - file_OUT.write("\n") -file_OUT.close() |
| diff -r 7a813e633d1c -r a83562c0719f scripts/S05_filter.py --- a/scripts/S05_filter.py Fri Feb 01 10:22:32 2019 -0500 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 |
| @@ -1,21 +0,0 @@ -#!/usr/bin/env python -#filters the sequences depending on their length after cap3, makes the sequences names compatible with the phylogeny workflow -#python filter.py file length_threshold_nucleotides output - -import string, os, sys, re, itertools - -path_IN = sys.argv[1] -threshold = int(sys.argv[2]) #minimum number of nucleotides for one sequence -file_OUT = open(sys.argv[3], "w") -inc = 1 -with open(path_IN, "r") as f_in: - for ignored, sequence in itertools.izip_longest(*[f_in]*2): - name=">"+path_IN[:2]+str(inc)+"_1/1_1.000_" - if len(sequence)-1>threshold-1: - inc+=1 - file_OUT.write("%s" %name) - file_OUT.write(str(len(sequence)-1)+"\n") - file_OUT.write("%s" %sequence) -file_OUT.close() - -#filtre eventuel sur les petits transcrits \ No newline at end of file |
| diff -r 7a813e633d1c -r a83562c0719f test-data/trinity_and_velvet_up.output --- a/test-data/trinity_and_velvet_up.output Fri Feb 01 10:22:32 2019 -0500 +++ b/test-data/trinity_and_velvet_up.output Mon Feb 03 14:37:31 2025 +0000 |
| @@ -1,4 +1,3 @@ -20 Number of segment pairs = 380; number of pairwise comparisons = 3 '+' means given segment; '-' means reverse complement @@ -6,15 +5,13 @@ DETAILED DISPLAY OF CONTIGS -21 -Number of segment pairs = 380; number of pairwise comparisons = 3 +Number of segment pairs = 420; number of pairwise comparisons = 4 '+' means given segment; '-' means reverse complement Overlaps Containments No. of Constraints Supporting Overlap DETAILED DISPLAY OF CONTIGS -20 Number of segment pairs = 380; number of pairwise comparisons = 3 '+' means given segment; '-' means reverse complement @@ -22,32 +19,45 @@ DETAILED DISPLAY OF CONTIGS -22 -Number of segment pairs = 342; number of pairwise comparisons = 2 +Number of segment pairs = 462; number of pairwise comparisons = 4 '+' means given segment; '-' means reverse complement Overlaps Containments No. of Constraints Supporting Overlap DETAILED DISPLAY OF CONTIGS -Number of segment pairs = 4032; number of pairwise comparisons = 0 +Number of segment pairs = 39402; number of pairwise comparisons = 402 +'+' means given segment; '-' means reverse complement + +Overlaps Containments No. of Constraints Supporting Overlap + + +DETAILED DISPLAY OF CONTIGS +Number of segment pairs = 39402; number of pairwise comparisons = 343 '+' means given segment; '-' means reverse complement Overlaps Containments No. of Constraints Supporting Overlap DETAILED DISPLAY OF CONTIGS -Number of segment pairs = 4160; number of pairwise comparisons = 0 +Number of segment pairs = 39402; number of pairwise comparisons = 352 '+' means given segment; '-' means reverse complement Overlaps Containments No. of Constraints Supporting Overlap DETAILED DISPLAY OF CONTIGS -Number of segment pairs = 4422; number of pairwise comparisons = 1 -'+' means given segment; '-' means reverse complement - -Overlaps Containments No. 
of Constraints Supporting Overlap - - -DETAILED DISPLAY OF CONTIGS +cap3 outputs/Pfiji_trinity.fasta -p 100 -o 60 +outputs/Pfiji_trinity.fasta.cap.singlets and outputs/Pfiji_trinity.fasta.cap.contigs +cap3 outputs/Apomp_trinity.fasta -p 100 -o 60 +outputs/Apomp_trinity.fasta.cap.singlets and outputs/Apomp_trinity.fasta.cap.contigs +cap3 outputs/Amphi_trinity.fasta -p 100 -o 60 +outputs/Amphi_trinity.fasta.cap.singlets and outputs/Amphi_trinity.fasta.cap.contigs +cap3 outputs/Acaud_trinity.fasta -p 100 -o 60 +outputs/Acaud_trinity.fasta.cap.singlets and outputs/Acaud_trinity.fasta.cap.contigs +cap3 outputs/Pg_transcriptome_90109.fasta -p 100 -o 60 +outputs/Pg_transcriptome_90109.fasta.cap.singlets and outputs/Pg_transcriptome_90109.fasta.cap.contigs +cap3 outputs/Ap_transcriptome_35099.fasta -p 100 -o 60 +outputs/Ap_transcriptome_35099.fasta.cap.singlets and outputs/Ap_transcriptome_35099.fasta.cap.contigs +cap3 outputs/Ac_transcriptome_25591.fasta -p 100 -o 60 +outputs/Ac_transcriptome_25591.fasta.cap.singlets and outputs/Ac_transcriptome_25591.fasta.cap.contigs |
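The `cap3 ... -p 100 -o 60` lines added to the expected log are the commands the updated script prints before calling CAP3. A minimal sketch of how such a command can be built as an argument list and echoed is given below. The helper name `build_cap3_command` is hypothetical; only the `-p` (percent identity) and `-o` (overlap length) options shown in the diff are used, and the sketch does not run CAP3 itself.

```python
import shlex


def build_cap3_command(fasta_path, percent_identity, overlap_length):
    """Return the CAP3 invocation as a list of arguments.

    Passing a list to subprocess avoids shell quoting problems with
    unusual file names, unlike the os.system string used previously.
    """
    return [
        "cap3", fasta_path,
        "-p", str(percent_identity),
        "-o", str(overlap_length),
    ]


if __name__ == "__main__":
    cmd = build_cap3_command("outputs/Pfiji_trinity.fasta", 100, 60)
    # Echo the command in the same form as the log lines above.
    print(shlex.join(cmd))
```

With CAP3 installed, the list can be handed to `subprocess.run(cmd, check=True)`, which is the call pattern the updated script uses.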
| diff -r 7a813e633d1c -r a83562c0719f test-data/trinity_out/AcAcaud_trinity.fasta --- a/test-data/trinity_out/AcAcaud_trinity.fasta Fri Feb 01 10:22:32 2019 -0500 +++ b/test-data/trinity_out/AcAcaud_trinity.fasta Mon Feb 03 14:37:31 2025 +0000 |
| @@ -1,38 +1,44 @@ ->Ac1_1/1_1.000_151 -TCGCTCTCCTCCGCCTTTTCTCTAAGCTTAAAAATTATGAAGAGTCTGCACCAGAGACAACTCCTTAGCACATCGACCGACCAGCTGGCCATAAAATGTCTTATTTATACCTTTGTCATGAGCCTTGATAATCCTTTGTGCCAGTGGTGGC +>Ac1_1/1_1.000_160 +GCCAGTGAACTCATTAGGCTCTGTCTGCGCCGATATGATAAGAACGAGTTTGACAGCGATGATGACAGCATGAACAGCGAGCTCGCCTACGATGATGACTCTGAAATGCCTGATGACCTAATTGACCGTTTGGAGATGTGCGACTTTTATGAAATGGAAG >Ac2_1/1_1.000_160 +GCCACCACTGGCACAAAGGATTATCAAGGCTCATGACAAAGGTATAAATAAGACATTTTATGGCCAGCTGGTCGGTCGATGTGCTAAGGAGTTGTCTCTGGTGCAGACTCTTCATAATTTTTAAGCTTAGAGAAAAGGCGGAGGAGAGCGATTATAGCGA +>Ac3_1/1_1.000_160 ATGTTAGTAAAAGAGATTAAAGAGTACCGAGAGATAAAAGAGAAGGCTAGAACCTATCTATGTTATATTATAAGTAGTAACCTATCTTATGGTTCAAGCATAAATGAGGAGACTCTTCAAGAGAGTATGGAGATGTTAAAGAGGGCAATCCCAAAGAGTG ->Ac3_1/1_1.000_160 +>Ac4_1/1_1.000_160 ATCTGTAATGTCGTTTACCACACACTGGACACTGATATTTCCGCTCGCCAGTGTGTGGTAAACGATATTACAGATCAGATGTGCTGGCAATCCATATCAGTACACACAGCAGTGAAAAAAATCATAAATGTGACATCTGTGGCAAGGCTTTCTCAAATGC ->Ac4_1/1_1.000_160 -AAACAATGCAATCCTCTACCATTGCCAAGATATGAAGAACAAGTAAATGGCACATCAACAACAATGATAATAATAAGTGCTAATAACAATAAGAGTAATACAATTACCACAATATCTGAGAACAAGGGGCTTAAGCATAGCTATCATTATTTGGGAGGGG >Ac5_1/1_1.000_160 -GCACCGGGATGCGGATTTGCTGACGATATGGCAAAAGCATTGTCAGCGTGCGGAACCTGTTTATGTCACACCACTGGCATCTTCCTGGCCGTCGCAGCCTTCGTTCTGACGGCACTCGGTATTGTCTGCGTCACGCGATCAGCTGACCCGAGCCTTTGGT +AAACAATGCAATCCTCTACCATTGCCAAGATATGAAGAACAAGTAAATGGCACATCAACAACAATGATAATAATAAGTGCTAATAACAATAAGAGTAATACAATTACCACAATATCTGAGAACAAGGGGCTTAAGCATAGCTATCATTATTTGGGAGGGG >Ac6_1/1_1.000_160 +GCACCGGGATGCGGATTTGCTGACGATATGGCAAAAGCATTGTCAGCGTGCGGAACCTGTTTATGTCACACCACTGGCATCTTCCTGGCCGTCGCAGCCTTCGTTCTGACGGCACTCGGTATTGTCTGCGTCACGCGATCAGCTGACCCGAGCCTTTGGT +>Ac7_1/1_1.000_160 CAGCCTACCACTGAGAAGAGATACTTCAACATGTCTTACTGGGGTAGAAGTGGTGGTCGTACAGCGGGTGGTAATGCAGGACGTGGTCGTGGCGGCGGCAGCGGCAGTGGCAGTAGTCAAAGTGGTGGTGGCAGCTTTCTACAGGAACGTATCAAAGAGA ->Ac7_1/1_1.000_160 +>Ac8_1/1_1.000_160 
GCACCTAGAATTACCCGAAGTTGCTTGGCAATAGCGACACCTAACGGTCGCCATGATATTTGCAGGAAGAAGGCATGTGGTACCATTGGGAACCGTCAAGCGTTTCCTCAGCCCTGTGGCAGCTGCCCGTCTGCGCCCGTGTTTGACCTTGAGCACCAAG ->Ac8_1/1_1.000_160 +>Ac9_1/1_1.000_160 ATCAAAGAAGAGCAACATCGAGCTACTGGCACTGGCAATGGAATCCTAATTATAGCAGAAACAAGCACTGGTTGCCTGTTGTCTGGGTCAGCAATTGGTAGTAGAGGTGTTCCTGCTGAAGAAGTTGGGGTCAAAGCAGGACAGATGCTTTTGGATAACT ->Ac9_1/1_1.000_160 -GCCATTCGTCTTAGGAGAAGTTTGTCGTCAGGAAAGATACATGAGGCCTGGATTCTTTCTGACACCGACTCGACGATGTCATTACCTTGTCCACCTGGAACCAACCCCTCATCGACTTCAGCGGATCCATATCTGGTGATCACCAGAAAAACGAACACTA >Ac10_1/1_1.000_160 -GCCTGGGTATTATTTACCACAGTAACCTTTCATCAGTTTGTGGTGAAAGTACGTGACGTTATGCATTGGCAAGATTGGACATTTTGGTTCGCCCTGTTTTGTACGCATAATAATGTATGTAGTTGTATTTTCCAAAATAATTGTTATATTAGCTATCCAA +GCCATTCGTCTTAGGAGAAGTTTGTCGTCAGGAAAGATACATGAGGCCTGGATTCTTTCTGACACCGACTCGACGATGTCATTACCTTGTCCACCTGGAACCAACCCCTCATCGACTTCAGCGGATCCATATCTGGTGATCACCAGAAAAACGAACACTA >Ac11_1/1_1.000_160 -ACAATTACACAGGTATCAACAAATGTTCACTGCACCTGTCAGTTCCACAAACATAAAGATTACACACATGTACACATCTTTACAAAATATTTACAATTTTGTATTCTTAATTCTATCCACTTGGCTCTGGAAGGCCTTCAGCCATCAGATGATGTGTTTA +GCCTGGGTATTATTTACCACAGTAACCTTTCATCAGTTTGTGGTGAAAGTACGTGACGTTATGCATTGGCAAGATTGGACATTTTGGTTCGCCCTGTTTTGTACGCATAATAATGTATGTAGTTGTATTTTCCAAAATAATTGTTATATTAGCTATCCAA >Ac12_1/1_1.000_160 +CTGGGCATGGTGGCTACCAAAACGGAGTATAGAACACTGTGTGACTTTTATGTTGATAATAGAAAATATATTCTCTATATAGACGAAGACTGCAGGGTGTCTAGAATATCGCCTAATAAAATTAGAATAGGGAGTCTGCAGTTGATCCGGAAACTACCAG +>Ac13_1/1_1.000_160 +ACAATTACACAGGTATCAACAAATGTTCACTGCACCTGTCAGTTCCACAAACATAAAGATTACACACATGTACACATCTTTACAAAATATTTACAATTTTGTATTCTTAATTCTATCCACTTGGCTCTGGAAGGCCTTCAGCCATCAGATGATGTGTTTA +>Ac14_1/1_1.000_160 CAATCCAGCACTAGCAGGAGTGTTGGCCGGAAGGTTGATGATATTTTTCAGTCAAAGAATCTGCATGCTCCAGATGATCGCCTATCAGACAAGGATAACCGTGACAAGTCCAAGAACCCTTTACTTAACAATGAGATGACTCCTCAGTCATTTTCTCGAG ->Ac13_1/1_1.000_141 -GATTACATGCAAAACATAATAGAAATGTTTGTCCCAAGGTCTTACCAGTTTATAGTTTTACATTCGTGTCTTGAAATAAGAAAATGCCTTTATGAGAGTGTATTATTACTCAGTAGATGGAAATTAGCTTACCGGGGGATA ->Ac14_1/1_1.000_160 +>Ac15_1/1_1.000_160 
+GATTACATGCAAAACATAATAGAAATGTTTGTCCCAAGGTCTTACCAGTTTATAGTTTTACATTCGTGTCTTGAAATAAGAAAATGCCTTTATGAGAGTGTATTATTACTCAGTAGATGGAAATTAGCTTACCGGGGGATATAATTTAGGCCGGAAACCC +>Ac16_1/1_1.000_160 +TGTTTGTCCCAAGGTCTTACCAGTTTATAGTTTTACATTCGTGTCTTGAAATTCAGTAGATGGAAATTAGTGCTTATAGTGGGTTTGGGCAATCGATTTTTTTTTTTTTTTTTTTAAAAAAAAAGGCGAGGCCGAGAGAAGATTCCTAGCGAACAGCCTA +>Ac17_1/1_1.000_160 CCTGTTGTGACTCGTTCCCTGACGTCGTGCACGCAAGCGCACGCGCGTGCGCGCCGGGTTAGGCACACATACGCGGCACAGGTGCGCAGTATTAGACAGACGCAGACGCAGGCGTCCAGACACGCCAGCCAGCACGGTTACAATGTCCATATCACAATGA ->Ac15_1/1_1.000_147 -CTGAATGTCAACCAGTCACTGACCATCAGCTACATGTCTCTAATGGTCACTAGCATGAAACATGAAATGCCTGCTTATAGTGGGTCTGTAACTGGTAGGATACTGATTACATGTGGAGGCTTATTAAAGGGGTATCCTATTATTTTT ->Ac16_1/1_1.000_160 +>Ac18_1/1_1.000_160 +CTGAATGTCAACCAGTCACTGACCATCAGCTACATGTCTCTAATGGTCACTAGCATGAAACATGAAATGCCTGCTTATAGTGGGTCTGTAACTGGTAGGATACTGATTACATGTGGAGGCTTATTAAAGGGGTATCCTATTATTTTTTAAAACCCCCCCC +>Ac19_1/1_1.000_160 CTATGTTGGCTACTGCTAAGGATGTGCTACTTGCCTGATGTAAACAATTCCCAGAATGAATATAAACCAATCATAAGGAGAACTATGGAACCATCCTTAAATGTATTAATCTTATTTAAAATTATGTGCACATCTTGTTTGGCAGAAGGTACATTAAAGC ->Ac17_1/1_1.000_160 +>Ac20_1/1_1.000_160 ATCTGTAATGTCGTTTACCACACACTGGACACTGATATTTCCGCTCGCCAGTGTGTGGTAAACGATATTACAGATCAGATGTGCTGGCAATCCATATCAGTACACACAGCAGTGAAAAAAATCATAAATGTGACATCTGTGGCAAGGCTTTCTCAAATGC ->Ac18_1/1_1.000_160 +>Ac21_1/1_1.000_160 ATGTTAGTAAAAGAGATTAAAGAGTACCGAGAGATAAAAGAGAAGGCTAGAACCTATCTATGTTATATTATAAGTAGTAACCTATCTTATGGTTCAAGCATAAATGAGGAGACTCTTCAAGAGAGTATGGAGATGTTAAAGAGGGCAATCCCAAAGAGTG ->Ac19_1/1_1.000_160 +>Ac22_1/1_1.000_160 GCCAGTGAACTCATTAGGCTCTGTCTGCGCCGATATGATAAGAACGAGTTTGACAGCGATGGTTATATTATGAACAGCGAGCTCGCCTACGATGATGACTCTGAAATGCCTGATGACCTAATTGACCGTTTAGAAGCTGGAAATATTACAAGCTTTGTGC |
| diff -r 7a813e633d1c -r a83562c0719f test-data/trinity_out/AmAmphi_trinity.fasta --- a/test-data/trinity_out/AmAmphi_trinity.fasta Fri Feb 01 10:22:32 2019 -0500 +++ b/test-data/trinity_out/AmAmphi_trinity.fasta Mon Feb 03 14:37:31 2025 +0000 |
| @@ -4,8 +4,8 @@ CAGCCTACCACTGAGAAGAGATACTTCAACATGTCTTACTGGGGTAGAAGTGGTGGTCGTACAGCGGGTGGTAATGCAGGACGTGGTCGTGGCGGCGGCAGCGGCAGTGGCAGTAGTCAAAGTGGTGGTGGCAGCTTTCTACAGGAACGTATCAAAGAGA >Am3_1/1_1.000_160 GCACCTAGAATTACCCGAAGTTGCTTGGCAATAGCGACACCTAACGGTCGCCATGATATTTGCAGGAAGAAGGCATGTGGTACCATTGGGAACCGTCAAGCGTTTCCTCAGCCCTGTGGCAGCTGCCCGTCTGCGCCCGTGTTTGACCTTGAGCACCAAG ->Am4_1/1_1.000_147 -ACAGTTCTAAATAGCATTCGCCAAATGATATTGAAAGCATATTTTATGAATAGTGGTTCACAGATGAAAGATCATTATTGGGAACCCGTTCCAGCTTTTGTAGATCATTTTGTTCTTGCTATAGATCATCGACCCAGAATACAAGTT +>Am4_1/1_1.000_160 +ACAGTTCTAAATAGCATTCGCCAAATGATATTGAAAGCATATTTTATGAATAGTGGTTCACAGATGAAAGATCATTATTGGGAACCCGTTCCAGCTTTTGTAGATCATTTTGTTCTTGCTATAGATCATCGACCCAGAATACAAGTTTAGCACACAAGGA >Am5_1/1_1.000_160 TACTGCTGTCGAAGTGATGGCGCTTGGAACAGTGTGATAACCTTGCCTACAAATAGGCCATTCTATTTGCTCAGATACACAAGTCAATGTCAGCTGGTGAAAGGAATGAAAGTCAGAAGGGAAGTATTCTACTGGGATAATGAAGATATTAACAATATTG >Am6_1/1_1.000_160 |
| diff -r 7a813e633d1c -r a83562c0719f test-data/trinity_out/ApApomp_trinity.fasta --- a/test-data/trinity_out/ApApomp_trinity.fasta Fri Feb 01 10:22:32 2019 -0500 +++ b/test-data/trinity_out/ApApomp_trinity.fasta Mon Feb 03 14:37:31 2025 +0000 |
| @@ -4,37 +4,39 @@ ATACTCAGGCACACAGCATTTGTCGTACTAGGCGAGAGAGAGAGAGGAACGACTAATTGCAACCACGATTACGTTACATTTGTTTACAAACCAAACGTACTGGCGTCGAAGATAATTAAGAGGAAGCTGACTGAATGCGATTGGCGTTGGTCTACGGGTT >Ap3_1/1_1.000_160 GCCATGCAGTACACTGGACTTCTGTTATTCTGTTTGTTTGCCTTGACGGCAGCCAAACCCGCGGAAGACCTTCAAATGCTCATCCGAGCCCTGCTCCATGAAATAGAAGAGGAAGGTGAACTCCAAGAGCGAGGCATTGGCGCCGTGAAGTATGGTGGAA ->Ap4_1/1_1.000_135 -CGTTTAACCAGGCCCTGCTACCCTCCAATCTCGTCCAATCGGTCTCTACGCATCCACTCAATAATTATTGACATATTACAATTGATTCGGATTAAAAAAATGGCGCTAGGCTTAAAACACAGACAGTTCGCTAGC +>Ap4_1/1_1.000_160 +CGGCCGCGGCGCGTCGTTCTCAGCCAAGCTGACTTCGACTTGAGCCGTCCATTCGCTTATTTACACGACGACTGCTCGACCCTTTACGACTTAGTCACACTTCCGTTTAACCAGGCCCTGCTACCCTCCAATCTCGTCCAATCGGTCTCTACGCATCCGA >Ap5_1/1_1.000_160 +CCGTTTAACCAGGCCCTGCTACCCTCCAATCTCGTCCAATCGGTCTCTACGCATCCACTCAATAATTATTGACATATTACAATTGATTCGGATTAAAAAAATGGCGCTAGGCTTAAAACACAGACAGTTCGCTAGCTGATTAGGCTCTTTTTAAGGCGAA +>Ap6_1/1_1.000_160 AATCTACTGACAGATACCTGGAACGAGATGCAGGTCAAGTGGTCGTGTTGTGGTGTGGATGGCTACTCCGACTGGACGCAAGCTGAAGGTCTGGCCACGGGTCACTACGTGCCGCAGTCCTGCTGTCAGAACACGATGAGTACAAGCTGCACGTCACAGA ->Ap6_1/1_1.000_160 +>Ap7_1/1_1.000_160 TGGCGAAATGTAGTGGTCATTGATGGATTTTATTGCAATCAGTGTTACATATTACAAGCATTTCTTAATAAACAAAAAGTTGCACGAGATATTTTTTACTTAAAGGTTTTATGGGATGAACACAGTCAATTATATTCATGTAAAAGGCCTTATCCGAGAA ->Ap7_1/1_1.000_160 +>Ap8_1/1_1.000_160 TTCGTAATGAATCTTTTTGACTGGTATTCCGCAGGATACTCAATAATTATTGTCGCATTCTTCGAAGTTATCGCCATTTCTTGGATATACGGTCTCCAACGGTTCAAGAAGGACATTCAGATGATGGTTGGCAAGGGGCGATGGATCAATGCTAGTTTCT ->Ap8_1/1_1.000_160 +>Ap9_1/1_1.000_160 GCGAAAACTGGTTTTAACACAAATAATTGTTACAGTACCAGGTTTCGGAACACGTTTGCATATAACCAGCGAGAGTGGTGCTCAGTTCTGTTATGTATGACAGTCCTTCTCCTCAACATGCAACGGAAGCGAGCACTTCCATCATCACATTTGTCAATAA ->Ap9_1/1_1.000_160 -TGTCTTTACTTCTATCCTTCTCATCATGTTTTACATCATTTTTATTGCTGCCTCTCTTCTCAGCCCTTTCCACACTTTCATGTTTATCTTTTGATTTTTCAACTTCAACTCCATCTTCATCATCATTCTCATGCATTAATTCTTCTATTTCTTCTTCCAA >Ap10_1/1_1.000_160 
-GCAGTGGTGGGAAGTTGTTCACCCTGGCTTGGTGTCCCATGTTTCTCTGTAATTCCTGTTCCTTTCTCTGTAGTTCCTCAGCCTTCCTCTCCAGTTCTTCCTGACGTCTCTTCAGGTCATCTGTGGCAGCCTGGGCCGTGGTCTTGGCGGCTGAGTATGG +TTGGAAGAAGAAATAGAAGAATTAATGCATGAGAATGATGATGAAGATGGAGTTGAAGTTGAAAAATCAAAAGATAAACATGAAAGTGTGGAAAGGGCTGAGAAGAGAGGCAGCAATAAAAATGATGTAAAACATGATGAGAAGGATAGAAGTAAAGACA >Ap11_1/1_1.000_160 -ACGACAGAGGTCCTCTGCTTGATGAATATGGTTACACCAGAGGATTTGGAAGATGAAGAGGAATATGAAGAAATTTTGGAGGATGTCAAAGAAGAGTGCAGCAAATATGGTTATGTGAAGAGTATAGAGATCCCACGGCCCATTAAGGGTGTGGAAGTGC +CCATACTCAGCCGCCAAGACCACGGCCCAGGCTGCCACAGATGACCTGAAGAGACGTCAGGAAGAACTGGAGAGGAAGGCTGAGGAACTACAGAGAAAGGAACAGGAATTACAGAGAAACATGGGACACCAAGCCAGGGTGAACAACTTCCCACCACTGC >Ap12_1/1_1.000_160 -TGTCTTTACTTCTATCCTTCTCATCATGTTTTACATCATTTTTATTGCTGCCTCTCTTCTCAGCCCTTTCCACACTTTCATGTTTATCTTTTGATTTTTCAACTTCAACTCCATCTTCATCATCATTCTCATGCATTAATTCTTCTATTTCTTCTTCCAA +ACGACAGAGGTCCTCTGCTTGATGAATATGGTTACACCAGAGGATTTGGAAGATGAAGAGGAATATGAAGAAATTTTGGAGGATGTCAAAGAAGAGTGCAGCAAATATGGTTATGTGAAGAGTATAGAGATCCCACGGCCCATTAAGGGTGTGGAAGTGC >Ap13_1/1_1.000_160 -GCGAAAACTGGTTTTAACACAAATAATTGTTACAGTACCAGGTTTCGGAACACGTTTGCATATAACCAGCGAGAGTGGTGCTCAGTTCTGTTATGTATGACAGTCCTTCTCCTCAACATGCAACGGAAGCGAGCACTTCCATCATCACATTTGTCAATAA +TTGGAAGAAGAAATAGAAGAATTAATGCATGAGAATGATGATGAAGATGGAGTTGAAGTTGAAAAATCAAAAGATAAACATGAAAGTGTGGAAAGGGCTGAGAAGAGAGGCAGCAATAAAAATGATGTAAAACATGATGAGAAGGATAGAAGTAAAGACA >Ap14_1/1_1.000_160 -TTCGTAATGAATCTTTTTGACTGGTATTCCGCAGGATACTCAATAATTATTGTCGCATTCTTCGAAGTTATCGCCATTTCTTGGATATACGGTCTCCAACGGTTCAAGAAGGACATTCAGATGATGGTTGGCAAGGGGCGATGGATCAATGCTAGTTTCT +GCGAAAACTGGTTTTAACACAAATAATTGTTACAGTACCAGGTTTCGGAACACGTTTGCATATAACCAGCGAGAGTGGTGCTCAGTTCTGTTATGTATGACAGTCCTTCTCCTCAACATGCAACGGAAGCGAGCACTTCCATCATCACATTTGTCAATAA >Ap15_1/1_1.000_160 +TTCGTAATGAATCTTTTTGACTGGTATTCCGCAGGATACTCAATAATTATTGTCGCATTCTTCGAAGTTATCGCCATTTCTTGGATATACGGTCTCCAACGGTTCAAGAAGGACATTCAGATGATGGTTGGCAAGGGGCGATGGATCAATGCTAGTTTCT +>Ap16_1/1_1.000_160 
GTTGTCAGTGGATCTCGTGATGCAACACTGAGGCTATGGAATGTCGATACTGGCCAGTGTCTGCATGTTCTGATGGGACATATGGCAGCTGTACGGTGTGTGCAGTATGATGGCAAGCGTGTTGTTAGTGGTGCCTATGATTATACAGTTAGAGTGTGGG ->Ap16_1/1_1.000_160 +>Ap17_1/1_1.000_160 AGATTTATATTTGAGAATGTTTTGAGTACGACTTCTGTACAGACACACAGCAGAATGACCCTTGTATTGTTTAACAACGTTCAAAATTTCCTGATTCTTCTACCGAAAAAAATACATAAGAAGAGCCACCAAGACGATCAGATCACGGAGGTACTGGCAT ->Ap17_1/1_1.000_160 +>Ap18_1/1_1.000_160 GCAGACTCGGCTGGCACGGCCACCGCCTTCCTCTGTGGAGTGAAGGCTCGCTACGGAACGCTGGGTCTGGGACCGAGAGCCACACGATCTGACTGTAGACAGAGTCACATCAACAAACTGAAGTGTATAGGAGACATGGCACAACAAGCAGGTATGAGGA ->Ap18_1/1_1.000_160 +>Ap19_1/1_1.000_160 CCGGCCTGCAAGACGCCATTTTACTTCGTCTGTCAATCGAGGTCAAAGGTCACTACCGTTGTCTCCGAGAAGCACACAGACGCCGAGCTGGTTCACACGCTGTGTATTCGGCACAGATCTACTGTTGCTTGGGATATTTTAGCCGGCGAACGAGCGAAAT ->Ap19_1/1_1.000_160 +>Ap20_1/1_1.000_160 CCGGCGATCGTTCAGAGGGCCAGCGGTCTGGCCATGTCAGAGATCTATCACCTGCGCTTCTGCGATGGGGATCGGCTGAACGTCAGCTGCCCGGACAACTGGCAGATCCACATCTCGTCCAGCTACTTCGTCTACGTCAGCGGCGTCGACGGCCGCGGCG ->Ap20_1/1_1.000_160 +>Ap21_1/1_1.000_160 CATGAAGGACCTGTGTGGCAGGTGGCTTGGGCACATCCAATGTTTGGTAATCTGATAGCATCATGTAGTTATGACAGAAAGGTGATTATTTGGAAGGAGACTGGAGGGACATGGGCAAAGCTTTATGAATACAACAATCATGATTCCTCAGTTAATTCAG |
| diff -r 7a813e633d1c -r a83562c0719f test-data/trinity_out/PfPfiji_trinity.fasta --- a/test-data/trinity_out/PfPfiji_trinity.fasta Fri Feb 01 10:22:32 2019 -0500 +++ b/test-data/trinity_out/PfPfiji_trinity.fasta Mon Feb 03 14:37:31 2025 +0000 |
| @@ -21,7 +21,7 @@ >Pf11_1/1_1.000_160 AGCATTGTCCGTGTTGCGCGGGTCGTCGACGTAACCTCGGTACACCTCAGCGTGCCCGGCCATCTGGTGCGTGAGCCGCTTGAAGACGACTCTCGCCGGCGGCTGCTTGCTGTCCAGCGTCATGCTCTCGAACGCCTTGATCCAGGACTCCTTGACACTG >Pf12_1/1_1.000_160 -GCCCTCGGCCACCAAGCCCAAGAGTCCCAACGTGATGCCCAACCTGCCCAAGCACGTGCTGCAGGCCATCGAAGAGAACATGATCTACTACAACAAAATGTACAGTCTCCGAGTCAAGCCGGACCTGCTCCAGGTTCACTAGAGGGCGCTGTGGTGTTCG +CGAACACCACAGCGCCCTCTAGTGAACCTGGAGCAGGTCCGGCTTGACTCGGAGACTGTACATTTTGTTGTAGTAGATCATGTTCTCTTCGATGGCCTGCAGCACGTGCTTGGGCAGGTTGGGCATCACGTTGGGACTCTTGGGCTTGGTGGCCGAGGGC >Pf13_1/1_1.000_160 CGCGTCCACGACCGCCACGCGCACCGAGGTCTACGACAAACTCGCGCCGCAGGAGGCTCCTCTCAACCTGCACAAGCCTCGCGCCGACAGCGTCCCGACCGACGGCAACGGCTGACGGCAGACACTCGAGCCTTGACTACGTGTATGCACAAAGCTACCC >Pf14_1/1_1.000_160 |
| diff -r 7a813e633d1c -r a83562c0719f test-data/trinity_up.output --- a/test-data/trinity_up.output Fri Feb 01 10:22:32 2019 -0500 +++ b/test-data/trinity_up.output Mon Feb 03 14:37:31 2025 +0000 |
| @@ -1,12 +1,3 @@ -20 -Number of segment pairs = 380; number of pairwise comparisons = 3 -'+' means given segment; '-' means reverse complement - -Overlaps Containments No. of Constraints Supporting Overlap - - -DETAILED DISPLAY OF CONTIGS -21 Number of segment pairs = 380; number of pairwise comparisons = 3 '+' means given segment; '-' means reverse complement @@ -14,7 +5,13 @@ DETAILED DISPLAY OF CONTIGS -20 +Number of segment pairs = 420; number of pairwise comparisons = 4 +'+' means given segment; '-' means reverse complement + +Overlaps Containments No. of Constraints Supporting Overlap + + +DETAILED DISPLAY OF CONTIGS Number of segment pairs = 380; number of pairwise comparisons = 3 '+' means given segment; '-' means reverse complement @@ -22,11 +19,18 @@ DETAILED DISPLAY OF CONTIGS -22 -Number of segment pairs = 342; number of pairwise comparisons = 2 +Number of segment pairs = 462; number of pairwise comparisons = 4 '+' means given segment; '-' means reverse complement Overlaps Containments No. of Constraints Supporting Overlap DETAILED DISPLAY OF CONTIGS +cap3 outputs/Pfiji_trinity.fasta -p 100 -o 60 +outputs/Pfiji_trinity.fasta.cap.singlets and outputs/Pfiji_trinity.fasta.cap.contigs +cap3 outputs/Apomp_trinity.fasta -p 100 -o 60 +outputs/Apomp_trinity.fasta.cap.singlets and outputs/Apomp_trinity.fasta.cap.contigs +cap3 outputs/Amphi_trinity.fasta -p 100 -o 60 +outputs/Amphi_trinity.fasta.cap.singlets and outputs/Amphi_trinity.fasta.cap.contigs +cap3 outputs/Acaud_trinity.fasta -p 100 -o 60 +outputs/Acaud_trinity.fasta.cap.singlets and outputs/Acaud_trinity.fasta.cap.contigs |
| diff -r 7a813e633d1c -r a83562c0719f test-data/velvet_out/AcAc_transcriptome_25591.fasta --- a/test-data/velvet_out/AcAc_transcriptome_25591.fasta Fri Feb 01 10:22:32 2019 -0500 +++ b/test-data/velvet_out/AcAc_transcriptome_25591.fasta Mon Feb 03 14:37:31 2025 +0000 |
| b'@@ -1,132 +1,398 @@\n->Ac1_1/1_1.000_2580\n-AAGACAACTGCTCTTGATAGTTGTCTCGGAAGAGACTTGAAAACATCCAAGATGGTGAACTTTACGGTAGACGAGATCCGTGCGATCATGGACAAGAAGAAGAACATACGTAACATGTCCGTGATTGCTCATGTGGATCATGGCAAGTCGACGCTGACTGATTCGTTGGTGAGCAAGGCTGGCATTATTGCTGGCTCCAAGGCTGGCGAGACCCGCTTCACAGACACAAGGAAGGATGAGCAGGAAAGATGTATTACCATCAAATCAACAGCAATTTCACTCTTTTACCAGCTGCCAGAAAAAGATTTGAAGTTGATCGAGCAGCCAAGAGAGGAGGGAGAGACTGCTTTCCTGATCAACTTGATTGACTCACCTGGTCACGTGGATTTCTCCTCGGAGGTGACTGCTGCCCTTCGTGTTACAGATGGTGCTCTGGTTGTTGTCGACTGTGTGTCGGGCGTGTGTGTACAAACAGAGACTGTGCTGCGTCAGGCCATTGCTGAGCGTATCAAGCCAGTACTGTTCATGAACAAGATGGACTTGGCTCTGCTGACCCTACAGCTTGGTGCTGAGGACCTCTACCAGACCTTCTCCCGTATCATTGAAAGCATCAATGTAATCATTGCCACTTATGCTGACGACGAGGGACCGATGGGTAACATCCATGTTGATCCATCCAAGGGTACAGTTGGCTTTGGATCTGGACTCCATGGCTGGGCATTCACACTGAAGCAGTTTGCCGAGATGTATGCAGACAAGTTCAAGATTGAGGAACCAAAACTGATGAAGAGGCTGTGGGGAGACCAGTTCTACAACCCAAAGGAGAAGAGATGGGGCAAAGAAATGCAGAAGGGCTATTGTCGTGGTTTCACACAATACATCCTTGACCCCATTTACAAGATGTTTGAGTTCTGCATGAAGAAGCCAAAGGAAGAGACACTGAAGCTGGTTGAGAAACTTGGCATCAAACTGACAAGTGATGACAAGGACCTCATAGACAAACAACTGTTGAAGGTTGTCATGCGTAAATGGCTGCCAGCTGGTGATGCTTTGCTTCAGATGATAACCATCCATCTGCCGTCACCAGTAGCGGCTCAGAGGTACCGTATGGAGATGCTGTATGAGGGGCCACATGACGATGAGGCTGCTCTGGGAATCAAGAACTGTGACCCCAATGGACCACTGATGATGTACATCTCCAAGATGGTACCAACATCAGACAAGGGTAGATTCTATGCATTTGGTCGTGTGTTCTCTGGTGTTGTGTCAACAGGTATGAAGGCTAGGATCATGGGTCCCAACTTTATCCCTGGGAAGAAGGAAGATCTCTATGTGAAGGCCATCCAGAGAACAATCCTTATGATGGGTCGTTACATAGAGCCAATTGAAGATGTGCCCTGTGGTAATGTTTGTGGTCTGGTTGGTGTTGACCAGTACATTCTGAAGACTGGAACCATCAGCACGTACGAGCATGCCCACAACTTGAAAGTGATGAAGTTCAGTGTCAGTCCAGTTGTGCGTGTGGCTGTTGAGTGTAAAAACCCAGCTGATCTGCCCAAGCTTGTTGAAGGATTGAAACGTCTGTCAAAATCTGATCCCCTGGTGCAGTGTTCCATTGAGGAATCTGGAGAGCACATTGTTGCTGGAGCTGGTGAACTTCATCTGGAAATCTGCCTCAAGGACTTGGAAGAAGATCATGCCTGCATCCCAATCAAGAAATCTGACCCTGTTGTCTCATATAGAGAGACTGTCAGTAACACATCTGACAGAACCTGCTTGTCAAAATCACCAAACAAGCACAATCGTCTCTTCATGGTTGCTGCACCACTGCCAGATGGCTTACCTGAAGAGATTGATAGGGGAGAGAAGGTCAGTGCTCGTCAGGATCAGAAGGAGAGAGCTAGATACCTGGCCGACACATACGAGTTTGATGTTACTGAGGCTCGTAAGAT
CTGGTGCTTTGGACCTGATGGCACAGGACCAAACCTGGTCATTGACTGCACAAAGGGTGTCCAGTACCTGAATGAAATCAAAGACAGTGTTGTGGCTGGCTTCCAGTGGGCTAGCAAGGAGGGTGTACTCTGTGAAGAGAACATGAGAGGAATCCGCTTCAACATTCTTGATGTCACACTGCATGCTGATGCTATTCACCGTGGTGGTGGCCAGATCATCCCAACAACAAGAAGATGTCTCTATGCATGTGTGCTGACAGCTGAACCAAGGTTGATGGAACCAATATACCTGGTTGAGATCCAGTGCCCTGAGCAAGCTGTTGGTGGCATTTATGGTGTGCTGAACAGAAGACGAGGTGTTGTCATTGAGGAGAACCAAGTGGTGGGAACCCCGATGTTCCAGGTCAAGGCATACCTTCCTGTAAACGAATCATTTGGTTTCACTGCCGACCTGAGGTCCAACACTGGTGGCCAGGCATTCCCACAGTGTGTGTTTGATCACTGGCAGATCCTCCCAGGCGATCCGTTTGTGGACAACTCCAAGCCTAACATAATCGTCCAAGAGACGAGAAAACGCAAAGGGCTGAAGGAGGGCGTTCCTCCACTGGACAACTTCCTGGACAAGTTG\n->Ac2_1/1_1.000_5295\n-GAATTTTGGCCGAGATATCAGCTGATGACTGTAGCTTTGGTCTGGGCACTGGCCATTGTTCCCCAGGTGCTTTGTCAACTGATGATGACCACGACACCACCACCAACTCCAATAGCGTGTAGAGAAAATATGTGGGGTTGTGCCGACGGCAAGCAGTGTATACGTGAACTGTATCGTTGTGATGGTGATTACGACTGTGAGGACCGCTCTGACGAGGCCTTTCTTTTGTGTGCCCTCATTGTTTGCGATGAAAACAGCCAGTTTGAGTGTACTGCCAACAGGTTTACTAATAACACTAAGATCTGCATACCTGTTTCTTATTTGTGTGATGAGGACAATGATTGCGGAGATAACTCAGATGAAGATCCAGCCAACTGTCCTACCACATTCCGTCCTCCGACGACTCCACCGCCTTGTGTTCCTGGTTTCGAGTTCTTCTGTCCAGCTAGTCGTGACAGGGGCTGTATACCAATTGGTTTGAAATGTGACACTAAGCATGACTGTATGAATGGTGAAGATGAACAAGGCTGCACCTACAGAAATTGTTCTGATACAACGGAGTTTCAGTGTCATTCTAAGCAATGTATTGATAGCCGTCTGAAGTGTAATGGTTATGCCGACTGTAGGGATGGAAGTGATGAAACACCAGATATATGTGATGTTGCCCCTTTGCAGTGTGCAAAACATGAGTTTCAGTGTAACAATGGAAAGTGTATGGTTTGGTATGAAGTTCTCTGTAACGGAATAGACGACTGTGGTGATAATTCCGATGAAGATATCTGTAACACACTCCACATAAATGAATGTAACAATAAGACATTGCATCAATGTTCTGATAACTGCGAGGAGATGACTTTTGGCTACAGGTGTACTTGTAATCCTGGATACAGCTTAGCAAAAGATGGAAAAACATGCATCAATTCCAATGAATGCCTAGATTCACCAGGTGTGTGTCCACAGATTTGTATGGACACACCAGGAAGTTACAAATGTCAGTGTGCTACAGGCTACAGGGATATAAATGGAGATGGAACAAAGTGTGTTCGTACAGACAAAACCGAACCATATTTGATTTTCGGCAACAAGTACTATATACGCCGCATGGACATTGATGGTAGTAACTATGTCAGTATGTCCAGTGAACATACCTACACACATGTTTTGGACTTTGATTACCGCAATAAGAAGATATACTATGCCGATGCTCCAAATATGAAACAGGCAATAAAGAGAATGAACTTTGATGGCTCTGGGAAGGAGATTATTGAAAAGCATCATGCCACAGGCATCGAAGGAATTGCTGTTGACTGGGTTGGAGACAATATATACTGGACTAGCAACAAACAA
TGGGGGA'..b'GTGGGTGGACCTTGACAACAGTTGTTGGATACTCAGGGAATGTCCTGCCAACGGCGTGGGCACCCTGCATGTTGGTACGCTCAGACAGACCGACAAAGATTTCTCTACCTGTCCAAAGGACATCGCCACCTTCCAGCTTCGTTTCCTCATCTCCTTTGTTCTCCACTTCAACAACTTTAAGTCCGAGTTCCTTTCTTAGCACCTGTCGGACGACAGCCAGCTCTCCCTCCCTGGAGGGCTTGTTCGGCGGCGACTGCGGTCGGCATATGAGTGCCGTTCCGTTGATGACGACGGCTATGTCGTCGACGAACAAGCCGTCCGGATGCTTCTCATCGCACGGCAGTTCTATCACGTCGAGGCTGATGCGTCTCAGGGCGTCGACCAGCTGCTCGTGCTCGGCGCGGGCCTTCTCGATGTTGATCGGCGACGCGCCCGGCTTCAGATCGAAGCTCGACGTCTCGGCGAACGAGTTCGCGATCCGGCTGACCAGAGCGAAGTTGTACTTGAAACAATTAGATCCAGCCATTTTCTCCGGCATGATGAATCCTTCTCCGACCGAGCTACTCTGCGCCGAGACGACAACGAGGCTCGCCGTGACCGCTAC\n+>Ac197_1/1_1.000_794\n+TCCGATCTTGTCTCGTGTTTATTTCTTGTTACACATCACACAGAATGATGCTAGCAGGTCACTTTCTATTGTAATCCATGGTTTACTGGGATTTTGCCGCGATCCTCTTGTCACGTTCCTCTGCTATCGTCAACCTCAGTCCCTTGCCAGCTGGTAATGACACCCACGGCTTGTTACCTTTACCAATAATGAACACGTTGTTCAAACGTGTGGCAAATGAGTGGCCCATGCTGTCCTTGATGTGAACAATATCAAAGCCACCAGGATGACGTTCTCTGTGTGTCACAAGGCCAACACGACCCAAGTTGTGCCCACCAGTGATCATGCACAAATTTCCTGATTCAAACTTGATGAAATCCTTTATCTTGCCTGTGGCAATGTCAACCTGAACTGTGTCATTGACCTTGATCATTGGATCTGGATAGCGAATGGTACGAGCATCATGAGTGACCAGATGTGGAACTCCCTTCAGTCCAATGATTATCTTCTTTACCTTACACAGTTTGTACTTGGCTTCTTGGGAAGTGATACGGTGAATGGTGAAACGACCTTTGACATCATAGATGAGACGGAAGTTTTCAGCCGTCTTCTCTATTGTGATCACATCCATAAAGCCAGCAGGGTATGTCTTGTCAGTTCTCACTTTGCCATCAACCTTGATCAGACGTTGGTTTACAATCTTCTTCACCTCATCATATGTCAAGGCATACTTCAGGCGATTTCTCAAGAACACCACCGGACGAGATGGGTGTTCGTGGTCCAAGGAAGCATTTGAAGAGGCTTCATGCCCCTAA\n+>Ac198_1/1_1.000_1615\n+TCCGATCTTGTCTCGTGTTTATTTCTTGTTACACATCACACAGAATGATGCTAGCAGGTCACTTTCTATTGTAATCCATGGTTTACTGGGATTTTGCCGCGATCCTCTTGTCACGTTCCTCTGCTATCGTCAACCTCAGTCCCTTGCCAGCTGGTAATGACACCCACGGCTTGTTACCTTTACCAATAATGAACACGTTGTTCAAACGTGTGGCAAATGAGTGGCCCATGCTGTCCTTGATGTGAACAATATCAAAGCCACCAGGATGACGTTCTCTGTGTGTCACAAGGCCAACACGACCCAAGTTGTGCCCACCAGTGATCATGCACAAATTTCCTGATTCAAACTTGATGAAATCCTTTATCTTGCCTGTGGCAATGTCAACCTGAACTGTGTCATTGACCTTGATCATTGGATCTGGATAGCGAATGGTACGAGCATCATGAGTGACCAGATGTGGAACTCCCTTCAGTCCAATGATTATCTTCTTTACCTTACACAGTTTGTACTTGGCTTCTTGGGAAGTGATACGGTGAA
TGGTGAAACGACCTTTGACATCATAGATGAGACGGAAGTTTTCAGCCGTCTTCTCTATTGTGATCACATCCATAAAGCCAGCAGGGTATGTCTTGTCAGTTCTCACTTTGCCATCAACCTTGATCAGACGTTGGTTTACAATCTTCTTCACCTCATCATATGTCAAGGCATACTTCAGGCGATTTCTCAAGAACACCACCGGAGGGAGACACTCTCGCATCTTGTGTGGACCAGTGCTTGGGCGTGGGGCAAAAACACCCCCAAGCTTGTCCAACATCCAGTGTTTAGGGGCATGAAGCCTCTTCAAATGCTTCCTTGGACCACGAACACCCATCTCGTCCGGTGGTGTTCTTGAGAAATCGCCTGAAGTATGCCTTGACATATGATGAGGTGAAGAAGATTGTAAACCAACGTCTGATCAAGGTTGATGGCAAAGTGAGAACTGACAAGACATACCCTGCTGGCTTTATGGATGTGATCACAATAGAGAAGACGGCTGAAAACTTCCGTCTCATCTATGATGTCAAAGGTCGTTTCACCATTCACCGTATCACTTCCCAAGAAGCCAAGTACAAACTGTGTAAGGTAAAGAAGATAATCATTGGACTGAAGGGAGTTCCACATCTGGTCACTCATGATGCTCGTACCATTCGCTATCCAGATCCAATGATCAAGGTCAATGACACAGTTCAGGTTGACATTGCCACAGGCAAGATAAAGGATTTCATCAAGTTTGAATCAGGAAATTTGTGCATGATCACTGGTGGGCACAACTTGGGTCGTGTTGGCCTTGTGACACACAGAGAACGTCATCCTGGTGGCTTTGATATTGTTCACATCAAGGACAGCATGGGCCACTCATTTGCCACACGTTTGAACAACGTGTTCATTATTGGTAAAGGTAACAAGCCGTGGGTGTCATTACCAGCTGGCAAGGGACTGAGGTTGACGATAGCAGAGGAACGTGACAAGAGGATCGCGGCAAAATCCCAGTAAACCATGGATTACAATAGAAAGTGACCTGCTAGCATCATTCTGTGTGATGTGTAACAAGAAATAAACACGAGACAAGATCGGA\n+>Ac199_1/1_1.000_912\n+TTAGGGGCATGAAGCCTCTTTTTCCCGTACCACCGGACGAGATGGGTGTTCGTGGTCCAAGGAAGCATTTGAAGAGGCTTCATGCCCCTAAACACTGGATGTTGGACAAGCTTGGGGGTGTTTTTGCCCCACGCCCAAGCACTGGTCCACACAAGATGCGAGAGTGTCTCCCTCTGGTGGTGTTCTTGAGAAATCGCCTGAAGTATGCCTTGACATATGATGAGGTGAAGAAGATTGTAAACCAACGTCTGATCAAGGTTGATGGCAAAGTGAGAACTGACAAGACATACCCTGCTGGCTTTATGGATGTGATCACAATAGAGAAGACGGCTGAAAACTTCCGTCTCATCTATGATGTCAAAGGTCGTTTCACCATTCACCGTATCACTTCCCAAGAAGCCAAGTACAAACTGTGTAAGGTAAAGAAGATAATCATTGGACTGAAGGGAGTTCCACATCTGGTCACTCATGATGCTCGTACCATTCGCTATCCAGATCCAATGATCAAGGTCAATGACACAGTTCAGGTTGACATTGCCACAGGCAAGATAAAGGATTTCATCAAGTTTGAATCAGGAAATTTGTGCATGATCACTGGTGGGCACAACTTGGGTCGTGTTGGCCTTGTGACACACAGAGAACGTCATCCTGGTGGCTTTGATATTGTTCACATCAAGGACAGCATGGGCCACTCATTTGCCACACGTTTGAACAACGTGTTCATTATTGGTAAAGGTAACAAGCCGTGGGTGTCATTACCAGCTGGCAAGGGACTGAGGTTGACGATAGCAGAGGAACGTGACAAGAGGATCGCGGCAAAATCCCAGTAAACCATGGATTACAATAGAAAGTGACCTGCTAGCATCATTCTGTGTGATGTGTAACAAGAAATAAAC
ACGAGACAAGATCGGA\n' |
| diff -r 7a813e633d1c -r a83562c0719f test-data/velvet_out/ApAp_transcriptome_35099.fasta --- a/test-data/velvet_out/ApAp_transcriptome_35099.fasta Fri Feb 01 10:22:32 2019 -0500 +++ b/test-data/velvet_out/ApAp_transcriptome_35099.fasta Mon Feb 03 14:37:31 2025 +0000 |
| b'@@ -1,130 +1,398 @@\n->Ap1_1/1_1.000_256\n-AGCAACAAGACATTCCTTTTTGGTGCTAACTTCTCAATGGATGGCAGATATATCGTGGCCGGCTCACACGAAAACCTGCACCTGTGGAGCACGGAGAACTGCAAGCTGGTCACAACAATCAGACTGCACACCAACGACCACTTCCCAATGGCCGTCTGCTCAGACAGTAACTACATAGCCACCGGCTCAAACATCCACACGGCCATCAAAGTCTGGGACTTGACCAACGTCCAAATGTCCGAGCCGGGCTCCTTGA\n->Ap2_1/1_1.000_225\n-TTTGTCCAACACCCAGGCATACTTGAAGGAGCCTTTACCCATCTCCTGGGCCTCCTTCTCGAACTTTTCAATGGTTCTCTTGTCGATGCCACCACACTTGTAGATCAGATGGCCAGTGGTGGTAGACTTCCCGGAGTCTACGTGGCCAATAACCACGATGTTGATGTGTTTCTTTTCTGCTCCCATGCTGTGTTTCTACGTGCAACTTCTAGAGAATCAAAATAC\n->Ap3_1/1_1.000_189\n-TTGCTTTTGCTTAATGTAATAATATGGCTGGTTCCATTGAACTCCCTAGGTGTCAGCACTGTTTTATTGCCTAGACCGTCTTCTCCTTATAACCCCTCTTGCCAGTGCCTAGTGCCGGAGCAAGTTAAAAACTGTTTGAATGCTGGATCAATAGGAAATATCCAGTACATCACATACTGTGTGGAAACA\n->Ap4_1/1_1.000_330\n-CTCCGTATGGATTGGTGGATCGATCCTCGCCTCCCTGTCCACCTTCCAGCAGATGAGGATCAGCAAGCAGGAGTACGACGAGTCTGGACCATCCATCGTTCACAGGAAGTGCTTCTAAAGATAGTTGTGACCATCCCAACTGCCGTGACCACTCACAACAACAAACAACATTCTGTCTGCTCAGTGGCCCGTGGGCGACCTTTGTTCAGTGCCAGGGAACTCGTCATGACAAAGTCTAAAGAAAGTGCTGATTCCACCGTCAGAAGTTTGCTATACGAAATCCAGTCACACCATTCTGCTCTTCAAACATTACACAAACCAATCTTTTCC\n->Ap5_1/1_1.000_229\n-AGAATGATCTTCACAAAGGGATTAATGTTACACATTACTGTTCAAAGGACACACAGTGAAACAAGCACCAAACCAAGCTTGCTGGTCACCAGACCACACCAGTTTACATCAACATGCATGGACTGTGAATTCTTTGAAGAGCAATCGAGGGATTTGCTGTCATGTAAACAAAGTAGCAGGCTTCTGAACGTCTACTGTTTCATCCACGCCAACATGAGATGGACTGTTA\n->Ap6_1/1_1.000_374\n-AAATATACAGATTTAATAACAGCAATTGATAAACACTTTGAAAGCAAATTGTCATCTGTTGAGAAAGATGACATATTATTATCAGAATCTCACTTTGCACAATGTGGTTCTTTGTCTGACACAGTTAGCAGCTATCTTCACACTGTGATGACAGTCTCCAAGTGGAAGAAGAAAGTCAAAGCAATGCAGCAGACACCAGGACAACCACTACTTCTAATTATTTGTAGTGCTGCAAGTCGAGCAGTAGACTTGATAAGAGATTTGAGGTCCTTTTCTCAAGATAATTGTAAAGTTGCAAAGCTATTTGCAAAACACATGAAGCTGGAAGAACAAGTAAAATTCTTGAAGAAAAATGTAATACAGGCAGGAGTTGG\n->Ap7_1/1_1.000_291\n-CAGAAGGAAAGGCCCAAAGGGTTGTCCCGATGTCCGTCCCGCGGGCCAGACTCGACCCTCTCCGGCAGATCGGCAGCCCCAGTACCACCCTGCCAGATCGTGTTCGGGTGGGTTTTTTATCGACCTGCCGGGGACTGGCCGAGTAGTGACCGATCACGGGCGAACCGGAAACCGACACGACAACCCCGGACATCAGAAGACGGGAC
GACACACACACGCACGAACGGAGAGATAGACGCAAGACGACTACATCAGCACAGACGTCCGCCGCACACGGACTCGGACGCGGAC\n->Ap8_1/1_1.000_147\n-GCGCTGATTGTCATATTGTTATATAGTTCACGGCCTGGTCTGTCGGACAATGGCTCATACAACGCAGACTGCCACCTGAGAATATCTGAAAACTGGCAAATGCGTCATTTTACAAATCGCAGCTATCCAATTAATTATTTAGCTGGG\n->Ap9_1/1_1.000_1956\n-CATTTTCCTTCACGCATTTCATCTGACTTCAAAGTCGCAGAAATGGTAACAACAACAGTGGCATCAGCTCAGGCCGCTGACTCGGACGCCATGGCCCGGTCATACGTGTACGACTTCAAGAACAACACGTTCTCTGTCTGGGATTACGTGGTGTTTGGTGGCGTGCTGGCAGTGTCTGCTGGGATCGGGATATACTACGGCTGTACGGGCGGCAGGCAGAGGACAACATCTGAGTTCCTTATGGCTGACAGAAAGATGCATGTCCTTCCCGTTACCTTGTCACTGCTAGCCAGCTTCATGTCTGCCATTACCTTACTAGGTACCCCAGCTGAAATCTACATGTTTGGCACTCAGTATTGGATGATATGGATTGGATATGTTATTATGATTCCACTAGCTACACACGTTTTCATTCCTGTCTTCTACAATCTACAATTGACAAGTGTATTTGAGTATCTACAAATGAGGTTCGGTACCCACGTCAGGATCTTTGCCTGTCTCTGCTTCATCGTACAAATGATATTATACATGGCCATAGTTTTGTATACACCCTGTTTGGCTCTCTCGGTCGTTACTGGCTTTAATAAGTGGATATCCGTGTGTTTGGTTGGCGTCGTCTGTACCTTTTATACAACAATAGGAGGAATGAAAGCCGTCATGTGGACAGACTCGTTCCAGATCTGCATGATGTTCGCCGGGTTGATAGCTGTGCTCGTTAAAGGATCCATTGACGAAGGAGGCTTCGGTAACATCTGGAGATACATGGAGGAAGGAGACAGGATACAGTTCTGGGACATCGACCCAAGCCCCTTAAAGAGACATTCGCTGTGGGCCTTGATCTTCGGCGGTTGTTTCACGTGGCTTGCCGTCTACGGTGTAAACCAAGCCATGGTACAGAGAGCTTTATGTTGTCCCAGAAAGAAAGACGGACAAATAGCCATGTGGCTCAATCTTCCTGGTTTGACGGCTCTGCTCACTGTATGTGCCCTGTGTGGTATGGTTGTCTACGCAGAGTACAGATACTGTGATCCCTTAATTACCAATAGGATTGAGGCTAAAGATCAGTTACTGCCCCAGTATGTCATGGATCAGTTGTTTTATCCTGGTTTACCTGGTCTATTTACTGCATGTCTCTTCAGCGGAGCTCTAAGTACGATATCCTCAGGACTAAACTCTCTGGCTGCCGTTACACTTCAGGACCTAATCATTGACCGATGTTGTTCAAAAATATCTGAGACCAAGGCCGCCCGCATATCTAAGGCATTAGCCTTCAGCTATGGTCTGCTGATGATTGCTTTGTCGTACGTGGCTTCAAAACTTGGAGGGGTTCTGCAGGCTGCACTTGGTTTGTTTGGGATGATAGGTGGTCCAGTCCTCGGGCTGTTTATTTTGGGAATCATCTACCCTTGGGCTAATCACGTGGGGTCGTTCGTTGGGACGTTTGTCAGTTTGGTTATCACTCTGTGGATTGGCTTTGGTGCTCAGATATACAAGCCTTCAGTGTACAGGCCTCCAGTAAATATAACGGGTTGTCCGCTGAAGGAAGTCAACGAATCCTTCAGTTTCACGACGCTAGCCGCGAATTTCTCTACGACAGTGGCTCCATCTATCCCAGCTAGAGAACGACCAGGATATCTCGTCATTTATGAAGTGTCCTACATGTGGTACAGTCCCATCGCTGTTTTTATCGCAGTCGTCGTAGGCTTGTTGGTTAGCGCA
TGCACAGGGTTTAACAAACCT'..b'TTTAGGCTTAGAACGCCCTACCTTAACAATTACTATCTCACATGGTTTTATTTTATGGGGTTCAATTTAGAGACCCCTCTCGTCACTCATGAAAAATACCGAGAAGAAGAATTAGTACAAATAAAAAAGCTGGAAGAATGGCCTGAGAAATAATAGTTGAGGATGGGAGGAGTGGGATCGGTATTAGTAGGGCAATTTCTACGTCGAAGATTAGAAAGATTACTGCAAGTAAAAAAAATCGTAAAGAAAATGGGGTTCGGGCCGAGTG\n+>Ap194_1/1_1.000_914\n+CAGATAGTAGTGCGGAGATAAATAAAATGTTGGTTGGTGTTGCTTTGACTAGGGTGTTGTAGCATATCATAGAGATAGGGAAGGTGTTTAAGCTTCCTGGTTTTAGGCTTAGAACGCCCTACCTTAACAATTACTATCTCCGCACTACTATCTGGAGTGGCAATAGCTCTATCGGCCCACTCCATGCTATCGATATGAATGGGCCTTGAACTCAACCTATTCGGCTTCATTCCCCTTATTATAGCAACAAGATCTAATCAAGAAAAAGAGGCAGCCTGTAAATACTTTCTAGCCCAAGCCATCCCATCCGCTATCTTTCTACTAGCCCTAGTATTAATACCAGACATCCCTACAACCTCTGCCGTAATTCTCGTCGCACTATTTATAAAAATAGGAATCGCCCCATGTCACCAATGATTTCCTTCTGTTATAAACGCACTAGCTTGGCCGCAAGCATGGACCCTCATTACTGTACAAAAAATTGCACCATTCTTCATAATTCTCCACATAGTTGGTAACACGACCATTCTCACTTTCCATAGCAGCCGCTATTTCATCTATTATTGGCGGACTAGGCGGCATAAATCAAACACAACTACGCCCACTATTGGCCTACTCATCTATCGGGCACATAGGCTGAATACTAGGAGCAGTTTTAGTTTCAAATAGCGCTGCCACGCTCTATTTCTCTTCTTATCTCTTTATTGTATCAACAACAATTCTAAGCGCCGTTCTATTAAAAACTAACTCCTTGTTTTCCCTACCACTATTTAAATCATCAACAACTCTATCAACCATTCTATTCCTCTCCTTCATAAACATAGGGGGCCTTCCTCCATTCTTCGGTTTCTTTATTAAAGCTTTCGTAATACTTAACTTACTTTCCAGCAATCTGGCCCCCCTCACCTTCTTCT\n+>Ap195_1/1_1.000_302\n+CAGATAGTAGTGCGGAGATAAATAAAATGTTGGTTGGTGTTGCTTTGACTAGGGTGTTGTAGCATATCATAGAGATAGGGAAGGTGTTTAAGCTTCCTGGTTTTAGGCTTAGAACGCCCTACCTTAACAATTACTATCTCACATGGTTTTATTTTATGGGGTTCAATTTAGAGACCCCTCTCGTCACTCATGAAAAATACCGAGAAGAAGAATTAGTACAAATAAAAAAGCTGGAAGAATGGCCTGAGAAATAATAGTTGAGGATGGGAGGAGTGGGATCGGTATTAGTAGGGCAATTTCTA\n+>Ap196_1/1_1.000_374\n+CACTCGGCCCGAACCCCATTTTCTTTACGATTTTTTTTACTTGCAGTAATATTTCTAATCTTCGACGTAGAAATTGCCCTACTAATACCGATCCCACTCCTCCCATCCTCAACTATTATTTCTCAGGCCATTCTTCCAGCTTTTTTATTTGTACTAATTCTTCTTCTCGGTATTTTTCATGAGTGACGAGAGGGGTCTCTAAATTGAACCCCATAAAATAAAACCATGTGAGATAGTAATTGTTAAGGTAGGGCGTTCTAAGCCTAAAACCAGGAAGCTTAAACACCTTCCCTATCTCTATGATAAGCTACAACACCCTAGTCAAAGCAACACCAACCAACATTTTATTTATCTCCGCACTACTATCTGGAGTG\n+>Ap197_1/1_1.000_821\n+CAGATAGTAGTG
CGGAGATAAATAAAATGTTGGTTGGTGTTGCTTTGACTAGGGTGTTGTAGCATATCATAGAGATAGGGAAGGTGTTTAAGCTTCCTGGTTTTAGGCTTAGAACGCCCTACCTTAACAATTACTATCTCCGCACTACTATCTGGAGTGGCAATAGCTCTATCGGCCCACTCCATGCTATCGATATGAATGGGCCTTGAACTCAACCTATTCGGCTTCATTCCCCTTATTATAGCAACAAGATCTAATCAAGAAAAAGAGGCAGCCTGTAAATACTTTCTAGCCCAAGCCATCCCATCCGCTATCTTTCTACTAGCCCTAGTATTAATACCAGACATCCCTACAACCTCTGCCGTAATTCTCGTCGCACTATTTATAAAAATAGGAATCGCCCCATGTCACCAATGATTTCCTTCTGTTATAAACGCACTAGCTTGGCCGCAAGCATGGACCCTCATTACTGTACAAAAAATTGCACCATTCTTCATAATTCTCCACATAGTTGGTAACACGACCATTCTCACTTTCCATAGCAGCCGCTATTTCATCTATTATTGGCGGACTAGGCGGCATAAATCAAACACAACTACGCCCACTATTGGCCTACTCATCTATCGGGCACATAGGCTGAATACTAGGAGCAGTTTTAGTTTCAAATAGCGCTGCCACGCTCTATTTCTCTTCTTATCTCTTTATTGTATCAACAACAATTCTAAGCGCCGTTCTATTAAAAACTAACTCCTTGTTTTCCCTACCACTATTTAAATCATCAACAACTCTATCAACCATTCTATTCCTCTCCTTCATAAACA\n+>Ap198_1/1_1.000_775\n+AGAAGAAGGTGAGGGGGGCCAGATTGCTGGAAAGTAAGTTAAGTATTACGAAAGCTTTAATAAAGAAACCGAAGAATGGAGGAAGGCCCCCTATGTTTATGAAGGAGAGGAATAGAATGGTTGATAGAGTTGTTGATGATTTAAATAGTGGTAGGGAAAACAAGGAGTTAGTTTTTAATAGAACGGCGCTTAGAATTGTTGTTGATACAATAAAGAGATAAGAAGAGAAATAGAGCGTGGCAGCGCTATTTGAAACTAAAACTGCTCCTAGTATTCAGCCTATGTGCCCGATAGATGAGTAGGCCAATAGTGGGCGTAGTTGTGTTTGATTTATGCCGCCTAGTCCGCCAATAATAGATGAAATAGCGGCTGCTATGGAAGTGAGAATGGTCGTGTTAACCAACTATGTGGAGAATTATGAAGAATGGTGCAATTTTTTGTACAGTAATGAGGGTCCATGCTTGCGGCCAAGCTAGTGCGTTTATAACAGAAGGAAATCATTGGTGACATGGGGCGATTCCTATTTTTATAAATAGTGCGACGAGAATTACGGCAGAGGTTGTAGGGATGTCTGGTATTAATACTAGGGCTAGTAGAAAGATAGCGGATGGGATGGCTTGGGCTAGAAAGTATTTACAGGCTGCCTCTTTTTCTTGATTAGATCTTGTTGCTATAATAAGGGGAATGAAGCCGAATAGGTTGAGTTCAAGGCCCATTCATATCGATAGCATGGAGTGGGCCGATAGAGCTATTGCCACTCCAGATAGTAGTGCGG\n+>Ap199_1/1_1.000_400\n+TGATCGTCTTATAAACCTAACTTGAAAAACCTTCCTACCATTTAGGGCTAGCAGCCCTATTAATTATCACACCTATCGCAGCGCTCTCACTATAATTATAAGTATTGCGCCGGGTTTGAACGGATAGCTCTGATGCTGCTAATTACGGGACCTAATAATCCCCAATACTTTATCCTTAGAGAGCTGTACCTCTTAGCACCAGTCTTTTAAACTGGCGAAAGCACACTTTATGCTTCTAAGGAATGAAACTAATTCTTATAATCCTACTAATCTCTTTTATCATCCCCGCCATTCTATTTTTACTCTCGATCTTTACTACTATGCGCATGCCAGAGAGCCGTGAAAAATTTAGGCCCTACGAGTG
CGGGTTTGACCCCAATCACTCGGCCCGAACCCCATT\n' |
| diff -r 7a813e633d1c -r a83562c0719f test-data/velvet_out/PgPg_transcriptome_90109.fasta --- a/test-data/velvet_out/PgPg_transcriptome_90109.fasta Fri Feb 01 10:22:32 2019 -0500 +++ b/test-data/velvet_out/PgPg_transcriptome_90109.fasta Mon Feb 03 14:37:31 2025 +0000 |
| b'@@ -1,128 +1,398 @@\n->Pg1_1/1_1.000_474\n-GAGGTATGTTCGGGTTATAGGTGTGGTCCGACAATGGGTCGAGTAATCAGGAGTCAGCGTAAGGGTGCTGGCAGTGTATTCAAGGCACACACGAAACACAGGAAAGGTGCTGCAAGACTTCGAGCATTTGATTTTTCTGAAAGACATGGCTATATCAAAGGTGTTATAAGGGACATCATTCATGATCCAGGACGTGGCGCTCCATTGGCACGTGTCGTTTTCCGTGATCCATATAGGTACAAGCTGAGACATGAGAACTTCATCGCCTGCGAGGGCATGTACACCGGACAGTTCATTTACTGCGGCAAAAAGGCCACACTCCAGATAGGAAACATCCTTCCCGTCGGTGTGATGCCTGAGGGTACAGTCGTGTGCTCACTGGAAGAGAAGACTGGAGATCGTGGACGACTGGCCAAGTGCTTGGTAACTATGCCACTGTCATCTCCCACAATCCGGAAACAAAAAGGACTAGGG\n->Pg2_1/1_1.000_1300\n-GCTGTAGGCAACTGTGACAGAACAACAGGAGAGTGTAAGAAATGTATATATAATACAGCTGGCTTCTATTGTGAAAGATGTCTTCCTGGTTACTATGGTGATGCCTTAGCTGAACCGAAAGGGCAATGTAAAGCATGTAATTGTTACCCACCTGGTACTAATGACAGAGCCAGACAAGAAGGCTCCCTGACTTGTGATGAGAGATCTGGCCAGTGTCCGTGTAAACGTCAAGTTATTGGTAAAATGTGTGATACTTGTGAAGATGGCTTCTGGAACATAGACAGTGGACGAGGTTGTGAAGCATGTTTATGTAACCCAACTGGATCACACAACAGGAACTGTGACCTGCGTACTGGACAATGTCAGTGTAAGTCTGGTGTTACTGGCAGAAAATGTGATCAGTGTCTGCCTGACCACTGGGGATTCTCTCGTGATGGATGTAAAGCTTGTAACTGTAATATGGAAGGAGCTGTTAATACTCAGTGTGATTTGAGGACTGGGCAATGTATCTGTAGGCCAAGCATAGAGGGAGAGAAATGTGACAGATGTGTGGAGAATAAGTTTAACATCACTGCAGGATGTATTGATTGTCCTCCATGTTACTCACTTGTCCAAGATCAGGTACACATCCTCAGATCAAAAATAAATGAACTTCGTGAAATTATCCATAACATTGGTGACAACCCACATAAGGTTGATGATGCTGACTTCCGTAGGAAGTTAAGGGCTGTAAATGACTCTGTAAATGATCTGTGGAGAGACGCCATGCATGCTGGTGGTGGTGGAGACTCCTCTCTTGGCCAACAGATGGAGGCACTGCAGCAGGCTATCAGTGACATCATGACTGAATGTGGCCAGATAACAATTGATATAACATATGCCACATCATCATCGGACAGCAGTAAGGTAGATATTACTTATGCCGAGGAGGCAATTGATGAAGCAGAGAAGGCATTACTGGCTGCTGAAACCTACCTTCGTACAGAAGGTAGGAAGGCACTTCATGATGCCATTGAAGAGTTGAGGGATCTTGACGAGAAGTCACAACAACTGACTAAAATAGCAAGAGAAGCTCGGGAGGAGGCTGAAAAACAAGAAAAAGAAGCCAGTATGATAGATGAAACAGCTAACAAGGCATTGAATACATCAAAAGAAGCCTTAGTGTTAATTAATGAGGTACTGAGGAAGCCTGACGATATTGCCGATCAGATAGAAGATCTTAGACGAGAGGTTCTAGACACAGAAGTGGAATATCAGGCCACAAAAGCTGAAGCAGAACGAGCAGAGAAGTTGGCTACTG\n->Pg3_1/1_1.000_1813\n-TCTACCTGGAGTCTATCTAGGGGGGTGAGGCTTGTCTCTCCGGTGTTTTGTCACCGTGATACTATCTCGCCTTATCACATCCGATCGCCGTCGAGTGACGTCAGCAAAATGAGTGTCAGCGACTTAGAG
GCTCGTCGAGAGGCCAGGAGGCGGGCCAGAGAAGAAAAACAATCGGTGCTACTCGGCTCTCCGGCCACCCAATCTGCCATAACGACTGACCACGACGACGAAGATATCGTGGAACGAATTGCGAGGAGGCGAAGAGAACGTCAGGAACGACTCGCAAAACTATCTGCTGATACCAGCTCTGTCGATTACGACATCGAGAAACGTCGCCAGGAAAGACGAGCAGCACGCGAAAACATAATGAGGGGTGAGATCAAGGACGGGTCAAACGACCATGAGAAGGAGAGCTCTTATGCAGAAGAGAAGTCGGCAGACGACACGAAAGAAGAATGGCCAACAGCGCGCGATGAAGAAGAAAATGGAAAGGACGAAGAGAAAGACAAAGGACGAGAAGAAGACGAACAGAGACAGAAAGAAGAGGAGGAAACGGTTAGAATAGAAACACAACAACAGGAAGGTGAGATAAATGAGGAAGAGCAAAATGAAAAGCGAGAATGGAAGATGGGTGGAAAATCTACAGTAGAAGAACAGGAAAACGGTCTGAGTGGTGAAGAAGGTGAGGAAGAGGAGGAAGAAGAGGAAGAAGGAGAAGAGGAGGAGGAGGAGGAAGAAGAAGAAGAAGGGGAAAATTATCAGCAGAGAGAGGATGATTTGGCGGAGGAGGAAAGAAAGATCCAGGAGGAAGAGGATCTCATCAGGGAGGAGGAGCAATTGAAAAGAGAAGAAGAACAACGGTGGAGAGAGATGGAAGAGAAGAGGCAGCAACAAGAGAAGGACGAAATGGAGTTTGAAGAAGACGAGAAGAGACGTAGAAAAGAAGAAGCGGAAATGGACGTGGACGTCGGGGAGAAGAACGAAGATCAGAGTTCTCCCGAAGAGAAGGACAAGGGAAGGACGATTGACGAGAAACGTCAGCAGCTGATGAAACACATGAATGGCTCGATCGATGGAACGACCCCGACACGGCCGAAAAATGACTCCCCTGATACGCCACCTAGCGAGACTAAAAGACGACAGAAGGCGATGGAATGGGAACAACGTCTCAAGAGAACGCCCTCGCAAAGCGAACCAAACGACAGAATCAAACAGATTGAGGAACAACGGGCCGCCGAGCGGAAGGAGCTACAGCGCGCGCGCGAGGCTCGCTGGAAAGAACGGGAAGAGAAGCTCAGACAAGAAGCCGAAACTCGGAAGAAGCGGGAAGAAGACTTGGCTGAGAGACGCCGAAAGGCGGCCGAGGAGAGGAAAACGTTGCGCAAACAATCAGATATCGCTCATAACGAACCACAACCTGAAGGTGGAGAGAATGACGATGGCTTGGCAGATATGGAGAAAGGCAAAAGGAAAGGGTTGGGAGGCCTCTCCCCTGAGAAGAAGAAGCTACTAAAGCAACTGATCATGCAAAAGGCAGCGGAAGACTTAAAGAAGCAACAAGAGGCCGAAGCTGAAGCGAAGAGGAAGATCATCCAGCAGCGTGTTCCAAAACTAGAGATTGATGGATTAGATCAAGGTCGTCTGGAGAAAATCGTACGTGACCTTTATAAAAAGGTCGTTGCCCTTGAGGAGGATAAATATGATTGGGAGGTGAAGCTTAGGAAACAGGACCAAGAGATGAATGAGCTTAACATCAAGGTTAACGATATTAAGGGCAAATTTGTTAAGCCCGTCCTGAAGAAGGTATCGAAGA\n->Pg4_1/1_1.000_231\n-ATCACGTTATATTTTTATCGCCTTAAATTCGAAGCCAAATCTTATCGAGTGAACTGTGCGTGTATCCCGACTGATACGTGTATCCCAACTGAATGTGTAAGTCGTCTGCCCGCGTGTTGTGTTGTTGCCGCCGTCGTCATCATCCCAGCGCGACTTAACTCCAGCATTCAGCTGACCTGTATGAAATGTGTGCTATTTTTCCAATGTTGCGGTTTGTGTGCGTGTGTGTGT\n->Pg5_1/1_1.000_1440\n-GTACTCATAGTTGGTGTAGAGACAATGGGCTGTATT
TTGGGATCGCTGG'..b'ACATTAACACTTTGTGATGTGCTACCAACAGAGTTAGTTGCCGTTACTGTGTACTCAGCCGTGTCCTCAACAAGAGACTCTTTCACCTTCAGAGTGTAAGAGTTGTCCTTCGAGATCATCTTGTAATGACTGTCCTTGTCAGTTATAGGTTTTTTGTTCTTTGACCATGTGACCTCCGGCTCTGGATAACCTGTTGTTTTACAGGTGAGCTTGAAGCCTTCACCTGCCTCAACATTAACTGGCTGTAACTTGCCCTTAATATGTGGTGCAACCTTCTTCTTCTCGACACTAACAGTAACCTTCACGATGGTGGTACCTTTGCCAGACTTGGCACTGACTCTGTACTCACCAGCATCATCAGCCGTAGCATCGACGACCTGTAAGTAATACACATCACTGTCCATATCCCAGTCAACTTTGTACTTCTTGTCACCCTTTTTCTTCGGCTTAATCTTATTCGTGTCCTTGAACCAGGTAACCTCTGCCTCCTCTTCAATTTGACAAGTTAACTTAAATGTTTCTCCTTCTGTCACAACCAATGGCTGAGGTTCTTTTGTAAACTTTGGACCTTCTGGCTTCTCCTTTGACTTCTTATCCTTTTCAACTTTTTCTTGTTTTTCTTCTTCTTTCGGTGGTTCCTCTTCCTTTAGCTTTTCTTCTTCCTTAGGTTTTTCTGCTTCTTTTACTTTTTCATCTTCTTTGGGTTTTTCCTCTACGGGTTTTTTTTCTGCCTCCTGCTCTACAGCCTTTTCGGCTGCTTCCTTTGTTACTTCCTTTGTTATCTTCTCTTCCTTTTCAGTCTCTTTCTTCTTTTTTTCCTTTTTCTCTTTCGGTTTTTCTTCTTTTGGCTCTTCCTTTTTTTCCTTCTTCTTTTTCTTCTCTTCCTTTGGCTTATCTTCTGGTTTCTGCTCTTCTTCCTTATCAATTTTGTCCTCCACCACTTCCTTCTTTTTCTTCTTCTCCTTTGGTTTCTTTTCCTTTGGACTTTCTTCCACCTTTTCCACGGCTTCTTCAACAACCTCTTGTGGCTTTTCTTCCTCTCTTATTTCTTCTTTCTGTTTGTCTTCCTTGGGCTTCTCTTCCTTTGACCTTTCCTCTTCAACCCTTTCTTCTTCCATCTTCTCTTCTATCGGCCCTTCCTCTTTAGACTTTTCTCCCTCGGATCTTGGTTCTTTAGGTTTTTCTTTGTTGAGGATTTCTTCAGATGTTTCAATTATTTCTTCCTCTATTACCTTCACTTCTTCAGGAACAGAGACAGCAACATTAACAGTTGCTGAGACGGTTCCGCCGTCATTGCTGGCGACAATGGTGTACTGGCCGGCATCATCTGGCGTGGTATCGCTGATGAGCAACAGATGGACATCATCATTTGAGTCGAAGTCAAGTTTGACACGGTCAGTTGACTTTTCAAACTGCTTACTCCCTTTGAACCAGGTGACCTCAGGCTTTGGTTGTCCTAAGACGCGTGCACCGAGCGTAATGGTCTTTCCCGACTTCACGACGACGGGTTCGGGGAAGAGATCGAACCGAGGCAAAGAGACCGTTTTGTCGGTCACGACTTCCTCCAGTGTGATGTCGCCGAATGCTTCGCTGTCGACGTCCACTGTGGAGTATATTGTCTCCTCAATGATGACGGTCTCTTTCTTTACCACCGTCTCATTTGTCAGCTCCTCATTCTCCTCAAGTGACGTTTTTCCCTCATCCGTCTTATTTTCTTTAATTAATTTGGCTTCCTCTTTCATGGCTTCATCCACCGCAGACGGATTATTCACAGCTAATTTGGTTGCCATTTTGTCATCTTCTTCATCAGCTGCATTGCCTCTTTCTTCTTCATCACCAGTAATGGGTGCCGCCATTTCTTTGACATCCGGCACGTCTTTCTTGCTTTCTTTCATTGCCTCTGATATTTCCGAACTATTTTCTGCAGATGTACCATCCACCAGCGGGTTTGATTTCTCCTCAGCCGTTTCTTTTCTCGATTT
CATCAGGATCAGCTGTGGCTTACTGTCCGCCTCTTTTGACTCGATTTCTTCCAACAGCTCCTTGGTGTCTGCGATGTCTATTTCTATGTTCAGATCCTGATCTGCGGAGATCTTCAGTGCCCTAGCAGCTTCTTCCTGTACTATGGCCACTTCCGTTAGACCACCATCCCTGCCATCGGCACCCTGGCGCAGCCTGTTCTCACCGGTATAGGAAATCGTGATGATGTCTTGTTTATCTTTATCTTTGTCCTCGACGATCACCGTCACGGTGACGGTCATCGAGACCTCGCCGTGAATGTTCGACGCTTTGACGGTGTAATCATCGGCATCATCAATGGTGCACTCTTTGATGATGAGCGTATACAAATCGGATGCAACATCCCAGTGGATCTCTATGTGCTGATCCTGTTTCTTGGGCTTTAGCTCTTTATCGCCCTTATACCAAGTGACCTGAGGTTGAGGGTCACCTGCTACTTTGCAGCTAAGTTTTACGGTTTCTCCTTCTTTCACAGTGACAGGCTCGGGGATGACTTCGAATCTTGGTCTTGGCAAATCTTCTACTCTCTCTTCTTGCTCTGCGGCCAGCACTTCTTCCACGGATTCTTTTGGCTTCTGGTCCATTTCCACTTCCTTTGGGGCGGTTTCTTTTATTGGTGGTTTTTCCTTCTTTTCAATTGCTTCTTCTCCATCTTCAGATTCTTCATTTACTTCTATCCCTTTCACTTTCTGTTTTTTAACCTTCTTAATTTTCTTTCCTGATATGTCCACTTCTTCTTTCATCTTAATAATAATCTTTTTCTTCTTCTTCTTTTTCACATCTTTCCCCTCAGATTCTCCTTCACTTGATTCTGACTCACTCGATGTTACTTCAGAGTCAACAAGCTCAGACTCAATTTCTGTTGTCTCTTCTTCAGTAGTAGTAAACTCAACCTCAACTTTATCAATCTTTTCGGATGTCTCCATCACATCAACTTCTGCCTTTTCTATTCTTATAGATTCACTCAATGAACTAATATTCACTGACACTGTACAAGATTCGGAGCCACAATCATTTGTAGCTTTCACTATGTAGTCACCAGAATCTTCAACTGTTGCATTTTTAATTATCAACATGTTTAGGTCATTAGAAGTATCCCAGTCAACCTTCACACGACCTTCTTTCTTTGGTTTGATTTTCTTCTCATCTTTATACCAAGTAATTGATGGATCTGGATTTCCTGTGACCCGACACGTTAAACGGATTGTCTCTCCTTCGTTTACAGTGACTGGCTGCGGTTCTGCGGTGATGACAGGGGCACTAGAGGCACGTGTCGCATCCAGCTGGTCATCAACTGGCTCTTGCTTCACTCTTTCCGTTGGCTCCTTGGCTGGGTATCTTCACAGCGTCGTCCAGCTCAGCCATGTCTTCACTGATGCCGACCTTGTTTTCCGCATAAACCCGAAAGAAATACTGATGGCCTTCCTTAACTTTGGTCACAGTCAAAGTTAAGGTTGTGCCATTTGTCTGTCCGACCTTCTTAAATTTATTTTTCTCGGCTTCTCTCATCACGATCAAGTAAGATGTTAGGTCCGTGTTGCCGGCATCAATCGGTGCATCCCAGCTCAGCGTCACCGAACTACTGTTTACTTCCTTAACAATTAGATTTGTGGGAGCCGATGGCACGGTCTGAGGTTTCTCAGCAGATGGCTGTTGATGCTTTTCAGAAGTGATTTCTTCTTCAGATGTCTGTTCGGGTTTGGGGGCTGTCTGCGCCTCGGTTGTGATTTTGGCTTCAGACTGAGGCTCGGCTTCTGTGACTTTGCTCTCCGTGACCTCTACTTGCTCTTCTACTATTTCCACTTCTTCATCCTTCTTGATCACTTTTGATGTTATTTTTACCGCAGAATCTATCTCAGCAGCTGATTCACTGATGCCAGCACTGTTCTCGGCATAAACCCTGATAAAGTATTCTTTGGCTGGCTCGATGTTAGAAGTAATGGAATATTTCAACGTACCGCCA
GACACTTTACCAACTT\n' |
diff -r 7a813e633d1c -r a83562c0719f test-data/velvet_up.output
--- a/test-data/velvet_up.output Fri Feb 01 10:22:32 2019 -0500
+++ b/test-data/velvet_up.output Mon Feb 03 14:37:31 2025 +0000
@@ -1,21 +1,27 @@
-Number of segment pairs = 4032; number of pairwise comparisons = 0
+Number of segment pairs = 39402; number of pairwise comparisons = 402
+'+' means given segment; '-' means reverse complement
+
+Overlaps Containments No. of Constraints Supporting Overlap
+
+
+DETAILED DISPLAY OF CONTIGS
+Number of segment pairs = 39402; number of pairwise comparisons = 343
 '+' means given segment; '-' means reverse complement
 
 Overlaps Containments No. of Constraints Supporting Overlap
 
 
 DETAILED DISPLAY OF CONTIGS
-Number of segment pairs = 4160; number of pairwise comparisons = 0
+Number of segment pairs = 39402; number of pairwise comparisons = 352
 '+' means given segment; '-' means reverse complement
 
 Overlaps Containments No. of Constraints Supporting Overlap
 
 
 DETAILED DISPLAY OF CONTIGS
-Number of segment pairs = 4422; number of pairwise comparisons = 1
-'+' means given segment; '-' means reverse complement
-
-Overlaps Containments No. of Constraints Supporting Overlap
-
-
-DETAILED DISPLAY OF CONTIGS
+cap3 outputs/Pg_transcriptome_90109.fasta -p 100 -o 60
+outputs/Pg_transcriptome_90109.fasta.cap.singlets and outputs/Pg_transcriptome_90109.fasta.cap.contigs
+cap3 outputs/Ap_transcriptome_35099.fasta -p 100 -o 60
+outputs/Ap_transcriptome_35099.fasta.cap.singlets and outputs/Ap_transcriptome_35099.fasta.cap.contigs
+cap3 outputs/Ac_transcriptome_25591.fasta -p 100 -o 60
+outputs/Ac_transcriptome_25591.fasta.cap.singlets and outputs/Ac_transcriptome_25591.fasta.cap.contigs