changeset 2:28d52478ace9 draft

Uploaded v0.0.4 which adds a unit test.
author peterjc
date Mon, 15 Apr 2013 12:28:51 -0400
parents 50a8a6917a9c
children 19e26966ed3e
files test-data/k12_hypothetical.fasta test-data/k12_hypothetical.tabular test-data/k12_ten_proteins.fasta tools/filters/seq_select_by_id.py tools/filters/seq_select_by_id.txt tools/filters/seq_select_by_id.xml
diffstat 6 files changed, 88 insertions(+), 6 deletions(-) [+]
line wrap: on
line diff
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/k12_hypothetical.fasta	Mon Apr 15 12:28:51 2013 -0400
@@ -0,0 +1,3 @@
+>gi|16127999|ref|NP_414546.1| hypothetical protein b0005 [Escherichia coli str. K-12 substr. MG1655]
+MKKMQSIVLALSLVLVAPMAAQAAEITLVPSVKLQIGDRDNRGYYWDGGHWRDHGWWKQHYEWRGNRWHL
+HGPPPPPRHHKKAPHDHHGGHGPGKHHR
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/k12_hypothetical.tabular	Mon Apr 15 12:28:51 2013 -0400
@@ -0,0 +1,2 @@
+#ID	Description
+gi|16127999|ref|NP_414546.1|	hypothetical protein b0005 [Escherichia coli str. K-12 substr. MG1655]
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/k12_ten_proteins.fasta	Mon Apr 15 12:28:51 2013 -0400
@@ -0,0 +1,60 @@
+>gi|16127995|ref|NP_414542.1| thr operon leader peptide [Escherichia coli str. K-12 substr. MG1655]
+MKRISTTITTTITITTGNGAG
+>gi|16127996|ref|NP_414543.1| fused aspartokinase I and homoserine dehydrogenase I [Escherichia coli str. K-12 substr. MG1655]
+MRVLKFGGTSVANAERFLRVADILESNARQGQVATVLSAPAKITNHLVAMIEKTISGQDALPNISDAERI
+FAELLTGLAAAQPGFPLAQLKTFVDQEFAQIKHVLHGISLLGQCPDSINAALICRGEKMSIAIMAGVLEA
+RGHNVTVIDPVEKLLAVGHYLESTVDIAESTRRIAASRIPADHMVLMAGFTAGNEKGELVVLGRNGSDYS
+AAVLAACLRADCCEIWTDVDGVYTCDPRQVPDARLLKSMSYQEAMELSYFGAKVLHPRTITPIAQFQIPC
+LIKNTGNPQAPGTLIGASRDEDELPVKGISNLNNMAMFSVSGPGMKGMVGMAARVFAAMSRARISVVLIT
+QSSSEYSISFCVPQSDCVRAERAMQEEFYLELKEGLLEPLAVTERLAIISVVGDGMRTLRGISAKFFAAL
+ARANINIVAIAQGSSERSISVVVNNDDATTGVRVTHQMLFNTDQVIEVFVIGVGGVGGALLEQLKRQQSW
+LKNKHIDLRVCGVANSKALLTNVHGLNLENWQEELAQAKEPFNLGRLIRLVKEYHLLNPVIVDCTSSQAV
+ADQYADFLREGFHVVTPNKKANTSSMDYYHQLRYAAEKSRRKFLYDTNVGAGLPVIENLQNLLNAGDELM
+KFSGILSGSLSYIFGKLDEGMSFSEATTLAREMGYTEPDPRDDLSGMDVARKLLILARETGRELELADIE
+IEPVLPAEFNAEGDVAAFMANLSQLDDLFAARVAKARDEGKVLRYVGNIDEDGVCRVKIAEVDGNDPLFK
+VKNGENALAFYSHYYQPLPLVLRGYGAGNDVTAAGVFADLLRTLSWKLGV
+>gi|16127997|ref|NP_414544.1| homoserine kinase [Escherichia coli str. K-12 substr. MG1655]
+MVKVYAPASSANMSVGFDVLGAAVTPVDGALLGDVVTVEAAETFSLNNLGRFADKLPSEPRENIVYQCWE
+RFCQELGKQIPVAMTLEKNMPIGSGLGSSACSVVAALMAMNEHCGKPLNDTRLLALMGELEGRISGSIHY
+DNVAPCFLGGMQLMIEENDIISQQVPGFDEWLWVLAYPGIKVSTAEARAILPAQYRRQDCIAHGRHLAGF
+IHACYSRQPELAAKLMKDVIAEPYRERLLPGFRQARQAVAEIGAVASGISGSGPTLFALCDKPETAQRVA
+DWLGKNYLQNQEGFVHICRLDTAGARVLEN
+>gi|16127998|ref|NP_414545.1| threonine synthase [Escherichia coli str. K-12 substr. MG1655]
+MKLYNLKDHNEQVSFAQAVTQGLGKNQGLFFPHDLPEFSLTEIDEMLKLDFVTRSAKILSAFIGDEIPQE
+ILEERVRAAFAFPAPVANVESDVGCLELFHGPTLAFKDFGGRFMAQMLTHIAGDKPVTILTATSGDTGAA
+VAHAFYGLPNVKVVILYPRGKISPLQEKLFCTLGGNIETVAIDGDFDACQALVKQAFDDEELKVALGLNS
+ANSINISRLLAQICYYFEAVAQLPQETRNQLVVSVPSGNFGDLTAGLLAKSLGLPVKRFIAATNVNDTVP
+RFLHDGQWSPKATQATLSNAMDVSQPNNWPRVEELFRRKIWQLKELGYAAVDDETTQQTMRELKELGYTS
+EPHAAVAYRALRDQLNPGEYGLFLGTAHPAKFKESVEAILGETLDLPKELAERADLPLLSHNLPADFAAL
+RKLMMNHQ
+>gi|16127999|ref|NP_414546.1| hypothetical protein b0005 [Escherichia coli str. K-12 substr. MG1655]
+MKKMQSIVLALSLVLVAPMAAQAAEITLVPSVKLQIGDRDNRGYYWDGGHWRDHGWWKQHYEWRGNRWHL
+HGPPPPPRHHKKAPHDHHGGHGPGKHHR
+>gi|16128000|ref|NP_414547.1| peroxide resistance protein, lowers intracellular iron [Escherichia coli str. K-12 substr. MG1655]
+MLILISPAKTLDYQSPLTTTRYTLPELLDNSQQLIHEARKLTPPQISTLMRISDKLAGINAARFHDWQPD
+FTPANARQAILAFKGDVYTGLQAETFSEDDFDFAQQHLRMLSGLYGVLRPLDLMQPYRLEMGIRLENARG
+KDLYQFWGDIITNKLNEALAAQGDNVVINLASDEYFKSVKPKKLNAEIIKPVFLDEKNGKFKIISFYAKK
+ARGLMSRFIIENRLTKPEQLTGFNSEGYFFDEDSSSNGELVFKRYEQR
+>gi|16128001|ref|NP_414548.1| putative transporter [Escherichia coli str. K-12 substr. MG1655]
+MPDFFSFINSVLWGSVMIYLLFGAGCWFTFRTGFVQFRYIRQFGKSLKNSIHPQPGGLTSFQSLCTSLAA
+RVGSGNLAGVALAITAGGPGAVFWMWVAAFIGMATSFAECSLAQLYKERDVNGQFRGGPAWYMARGLGMR
+WMGVLFAVFLLIAYGIIFSGVQANAVARALSFSFDFPPLVTGIILAVFTLLAITRGLHGVARLMQGFVPL
+MAIIWVLTSLVICVMNIGQLPHVIWSIFESAFGWQEAAGGAAGYTLSQAITNGFQRSMFSNEAGMGSTPN
+AAAAAASWPPHPAAQGIVQMIGIFIDTLVICTASAMLILLAGNGTTYMPLEGIQLIQKAMRVLMGSWGAE
+FVTLVVILFAFSSIVANYIYAENNLFFLRLNNPKAIWCLRICTFATVIGGTLLSLPLMWQLADIIMACMA
+ITNLTAILLLSPVVHTIASDYLRQRKLGVRPVFDPLRYPDIGRQLSPDAWDDVSQE
+>gi|16128002|ref|NP_414549.1| transaldolase B [Escherichia coli str. K-12 substr. MG1655]
+MTDKLTSLRQYTTVVADTGDIAAMKLYQPQDATTNPSLILNAAQIPEYRKLIDDAVAWAKQQSNDRAQQI
+VDATDKLAVNIGLEILKLVPGRISTEVDARLSYDTEASIAKAKRLIKLYNDAGISNDRILIKLASTWQGI
+RAAEQLEKEGINCNLTLLFSFAQARACAEAGVFLISPFVGRILDWYKANTDKKEYAPAEDPGVVSVSEIY
+QYYKEHGYETVVMGASFRNIGEILELAGCDRLTIAPALLKELAESEGAIERKLSYTGEVKARPARITESE
+FLWQHNQDPMAVDKLAEGIRKFAIDQEKLEKMIGDLL
+>gi|16128003|ref|NP_414550.1| molybdochelatase incorporating molybdenum into molybdopterin [Escherichia coli str. K-12 substr. MG1655]
+MNTLRIGLVSISDRASSGVYQDKGIPALEEWLTSALTTPFELETRLIPDEQAIIEQTLCELVDEMSCHLV
+LTTGGTGPARRDVTPDATLAVADREMPGFGEQMRQISLHFVPTAILSRQVGVIRKQALILNLPGQPKSIK
+ETLEGVKDAEGNVVVHGIFASVPYCIQLLEGPYVETAPEVVAAFRPKSARRDVSE
+>gi|16128004|ref|NP_414551.1| inner membrane protein, Grp1_Fun34_YaaH family [Escherichia coli str. K-12 substr. MG1655]
+MGNTKLANPAPLGLMGFGMTTILLNLHNVGYFALDGIILAMGIFYGGIAQIFAGLLEYKKGNTFGLTAFT
+SYGSFWLTLVAILLMPKLGLTDAPNAQFLGVYLGLWGVFTLFMFFGTLKGARVLQFVFFSLTVLFALLAI
+GNIAGNAAIIHFAGWIGLICGASAIYLAMGEVLNEQFGRTVLPIGESH
+
--- a/tools/filters/seq_select_by_id.py	Fri May 18 12:25:12 2012 -0400
+++ b/tools/filters/seq_select_by_id.py	Mon Apr 15 12:28:51 2013 -0400
@@ -16,11 +16,11 @@
 molecular biology and bioinformatics. Bioinformatics 25(11) 1422-3.
 http://dx.doi.org/10.1093/bioinformatics/btp163 pmid:19304878.
 
-This script is copyright 2011-2012 by Peter Cock, The James Hutton Institute UK.
+This script is copyright 2011-2013 by Peter Cock, The James Hutton Institute UK.
 All rights reserved. See accompanying text file for licence details (MIT/BSD
 style).
 
-This is version 0.0.3 of the script.
+This is version 0.0.4 of the script.
 """
 import sys
 
@@ -28,6 +28,10 @@
     sys.stderr.write(msg.rstrip() + "\n")
     sys.exit(err)
 
+if "-v" in sys.argv or "--version" in sys.argv:
+    print "v0.0.4"
+    sys.exit(0)
+
 #Parse Command Line
 try:
     tabular_file, col_arg, in_file, seq_format, out_file = sys.argv[1:]
--- a/tools/filters/seq_select_by_id.txt	Fri May 18 12:25:12 2012 -0400
+++ b/tools/filters/seq_select_by_id.txt	Mon Apr 15 12:28:51 2013 -0400
@@ -35,7 +35,10 @@
 =======
 
 v0.0.1 - Initial version.
-v0.0.3 - Ignore blank lines in input
+v0.0.3 - Ignore blank lines in input.
+v0.0.4 - Record script version when run from Galaxy.
+       - Basic unit test included.
+
 
 Developers
 ==========
@@ -43,10 +46,10 @@
 This script and related tools are being developed on the following hg branch:
 http://bitbucket.org/peterjc/galaxy-central/src/tools
 
-For making the "Galaxy Tool Shed" http://community.g2.bx.psu.edu/ tarball use
+For making the "Galaxy Tool Shed" http://toolshed.g2.bx.psu.edu/ tarball use
 the following command from the Galaxy root folder:
 
-tar -czf seq_select_by_id.tar.gz tools/filters/seq_select_by_id.*
+$ tar -czf seq_select_by_id.tar.gz tools/filters/seq_select_by_id.* test-data/k12_ten_proteins.fasta test-data/k12_hypothetical.fasta test-data/k12_hypothetical.tabular
 
 Check this worked:
 
@@ -54,6 +57,9 @@
 filter/seq_select_by_id.py
 filter/seq_select_by_id.txt
 filter/seq_select_by_id.xml
+test-data/k12_ten_proteins.fasta
+test-data/k12_hypothetical.fasta
+test-data/k12_hypothetical.tabular
 
 
 Licence (MIT/BSD style)
--- a/tools/filters/seq_select_by_id.xml	Fri May 18 12:25:12 2012 -0400
+++ b/tools/filters/seq_select_by_id.xml	Mon Apr 15 12:28:51 2013 -0400
@@ -1,5 +1,6 @@
-<tool id="seq_select_by_id" name="Select sequences by ID" version="0.0.3">
+<tool id="seq_select_by_id" name="Select sequences by ID" version="0.0.4">
 	<description>from a tabular file</description>
+	<version_command interpreter="python">seq_select_by_id.py --version</version_command>
 	<command interpreter="python">
 seq_select_by_id.py $input_tabular $column $input_file $input_file.ext $output_file
 	</command>
@@ -22,6 +23,12 @@
 		</data>
 	</outputs>
 	<tests>
+		<test>
+			<param name="input_file" value="k12_ten_proteins.fasta" ftype="fasta" />
+			<param name="input_tabular" value="k12_hypothetical.tabular" ftype="tabular" />
+			<param name="column" value="1" />
+			<output name="output_file" file="k12_hypothetical.fasta" ftype="fasta" />
+		</test>
 	</tests>
 	<requirements>
 		<requirement type="python-module">Bio</requirement>