# HG changeset patch
# User prog
# Date 1578487813 18000
# Node ID 975585306dc4ca5f6dc507fab200205e68c58096
"planemo upload commit c2694fb4aa55c4f25eb53db73496eaf5e56d7872"
diff -r 000000000000 -r 975585306dc4 README.md
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/README.md Wed Jan 08 07:50:13 2020 -0500
@@ -0,0 +1,12 @@
+# ISA extractor
+
+[](https://travis-ci.org/workflow4metabolomics/isa-extractor)
+
+Extract collections of raw data files (mzML, mzXML, netCDF, mzData or nmrML) from an ISA study.
+
+## Updates
+
+### 1.3.0
+
+ * Wrote planemo tests for all tools.
+ * Run tests in Travis-CI.
diff -r 000000000000 -r 975585306dc4 extract-from-isa
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/extract-from-isa Wed Jan 08 07:50:13 2020 -0500
@@ -0,0 +1,130 @@
+#!/bin/bash
+# vi: fdm=marker
+
+# Constants {{{1
+################################################################
+
+PROG_NAME=$(basename $0)
+PROG_PATH=$(dirname $0)
+YES=yes
+NO=no
+
+# Global variables {{{1
+################################################################
+
+DEBUG=0
+EXT=
+INPUT_DIR=
+OUTPUT_DIR=
+SYMLINK=$NO
+
+# Print help {{{1
+################################################################
+
+function print_help {
+    echo "Usage: $PROG_NAME -i <ISA_DIR> -e <EXT> -o <OUTPUT_DIR>"
+    echo
+    echo "Extract files with a given extension from ISA-Tab archives into a collection."
+    echo
+    echo "Options:"
+    echo " -e, --ext EXT The extension of the files to find."
+    echo " -h, --help Print this help message."
+    echo " -i, --input DIR Input directory containing ISA archive."
+    echo " -o, --output DIR Set the output directory to use."
+    echo " -s, --symlink Create symbolic links instead of copying files."
+}
+
+# Error {{{1
+################################################################
+
+function error {
+
+    local msg=$1
+
+    echo "ERROR: $msg" >&2
+
+    exit 1
+}
+
+# Print debug msg {{{1
+################################################################
+
+function print_debug_msg {
+
+    local dbglvl=$1
+    local dbgmsg=$2
+
+    [ $DEBUG -ge $dbglvl ] && echo "[DEBUG] $dbgmsg" >&2
+}
+
+# Read args {{{1
+################################################################
+
+function read_args {
+
+    local args="$*" # save arguments for debugging purpose
+
+    # Read options
+    while true ; do
+        shift_count=1
+        case $1 in
+            -e|--ext) EXT="$2" ; shift_count=2 ;;
+            -g|--debug) DEBUG=$((DEBUG + 1)) ;;
+            -h|--help) print_help ; exit 0 ;;
+            -i|--input) INPUT_DIR="$2" ; shift_count=2 ;;
+            -o|--output) OUTPUT_DIR="$2" ; shift_count=2 ;;
+            -s|--symlink) SYMLINK=$YES ;;
+            -) error "Illegal option $1." ;;
+            --) error "Illegal option $1." ;;
+            --*) error "Illegal option $1." ;;
+            -?) error "Unknown option $1." ;;
+            -[^-]*) split_opt=$(echo $1 | sed 's/^-//' | sed 's/\([a-zA-Z]\)/ -\1/g') ; set -- $1$split_opt "${@:2}" ;;
+            *) break
+        esac
+        shift $shift_count
+    done
+    shift $((OPTIND - 1))
+
+    # Debug
+    print_debug_msg 1 "Arguments are : $args"
+
+    # Check input params
+    [[ $# -eq 0 ]] || error "No remaining arguments are allowed."
+    [[ -n $INPUT_DIR ]] || error "You must specify an input directory, using -i option."
+    [[ -d $INPUT_DIR ]] || error "\"$INPUT_DIR\" is not a valid directory."
+    [[ -n $OUTPUT_DIR ]] || error "You must specify an output directory, using -o option."
+    [[ ! -e $OUTPUT_DIR ]] || error "\"$OUTPUT_DIR\" already exists."
+    [[ -n $EXT ]] || error "You must specify the extension of the files you are looking for, with the -e option."
+}
+
+# MAIN {{{1
+################################################################
+
+read_args "$@"
+
+# Create output directory
+print_debug_msg 1 "Create output directory \"$OUTPUT_DIR\"."
+mkdir -p "$OUTPUT_DIR"
+
+# Find files to extract
+print_debug_msg 1 "Find \"$EXT\" files to extract in \"$INPUT_DIR\"."
+files_to_extract=$(mktemp -t tmp.XXXXXX)
+find "$(realpath "$INPUT_DIR")" -iname "*.$EXT" >"$files_to_extract"
+print_debug_msg 1 "Files to extract:"
+if [[ $DEBUG -ge 1 ]] ; then
+    cat "$files_to_extract" >&2
+fi
+
+# Extract files
+if [[ $SYMLINK == $YES ]] ; then
+    print_debug_msg 1 "Create symbolic links of all \"$EXT\" files to extract into \"$OUTPUT_DIR\"."
+    xargs -I % ln -s % "$OUTPUT_DIR" <"$files_to_extract"
+else
+    print_debug_msg 1 "Copy all \"$EXT\" files to extract to \"$OUTPUT_DIR\"."
+    xargs -I % cp % "$OUTPUT_DIR" <"$files_to_extract"
+fi
+rm "$files_to_extract"
+print_debug_msg 1 "Files extracted:"
+if [[ $DEBUG -ge 1 ]] ; then
+    ls -1 "$OUTPUT_DIR" >&2
+fi
diff -r 000000000000 -r 975585306dc4 isa2mzdata.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/isa2mzdata.xml Wed Jan 08 07:50:13 2020 -0500
@@ -0,0 +1,77 @@
+<!-- vi: se fdm=marker : -->
+<tool id="isa2mzdata" name="ISA to mzData" version="1.3.0">
+
+    <description>Extract mzData files from an ISA dataset and output a collection of mzData datasets.</description>
+
+    <!-- Command {{{1 -->
+
+    <command><![CDATA[
+        ## @@@BEGIN_CHEETAH@@@
+
+        $__tool_directory__/extract-from-isa
+            -i "$isa.extra_files_path"
+            -e mzData
+            -o mzData
+
+        ## @@@END_CHEETAH@@@
+    ]]></command>
+
+    <!-- Inputs {{{1 -->
+
+    <inputs>
+        <param name="isa" label="ISA" type="data" format="isa-tab"/>
+    </inputs>
+
+    <!-- Outputs {{{1 -->
+
+    <outputs>
+        <collection name="mzData" type="list" label="mzData files">
+            <discover_datasets pattern="(?P&lt;designation&gt;.+)\.[mM][zZ][dD][aA][tT][aA]$" directory="mzData" format="mzdata"/>
+        </collection>
+    </outputs>
+
+    <!-- Tests {{{1 -->
+    <tests>
+        <test>
+            <param name="isa" value="mzdata_study.zip" ftype="isa-tab"/>
+            <output_collection name="mzData" type="list" count="1">
+                <element name="empty" file="mzdata_study_output/empty.mzData" ftype="mzdata"/>
+            </output_collection>
+        </test>
+    </tests>
+
+    <!-- Help {{{1 -->
+    <help>
+<!-- @@@BEGIN_RST@@@ -->
+
+====================
+ISA to mzData
+====================
+
+Extract mzData files contained inside an ISA archive.
+
+-----
+Input
+-----
+
+ISA dataset
+===========
+
+The ISA-Tab dataset from which to extract the files.
+
+------
+Output
+------
+
+The output is a collection of mzData files.
+
+<!-- @@@END_RST@@@ -->
+    </help>
+
+    <!-- Citations {{{1 -->
+    <citations>
+        <citation type="doi">10.1038/ng.1054</citation> <!-- ISA -->
+        <citation type="doi">10.1093/bioinformatics/btu813</citation> <!-- W4M -->
+    </citations>
+
+</tool>
diff -r 000000000000 -r 975585306dc4 isa2mzml.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/isa2mzml.xml Wed Jan 08 07:50:13 2020 -0500
@@ -0,0 +1,77 @@
+<!-- vi: se fdm=marker : -->
+<tool id="isa2mzml" name="ISA to mzML" version="1.3.0">
+
+    <description>Extract mzML files from an ISA dataset and output a collection of mzML datasets.</description>
+
+    <!-- Command {{{1 -->
+
+    <command><![CDATA[
+        ## @@@BEGIN_CHEETAH@@@
+
+        $__tool_directory__/extract-from-isa
+            -i "$isa.extra_files_path"
+            -e mzML
+            -o mzML
+
+        ## @@@END_CHEETAH@@@
+    ]]></command>
+
+    <!-- Inputs {{{1 -->
+
+    <inputs>
+        <param name="isa" label="ISA" type="data" format="isa-tab"/>
+    </inputs>
+
+    <!-- Outputs {{{1 -->
+
+    <outputs>
+        <collection name="mzML" type="list" label="mzML files">
+            <discover_datasets pattern="(?P&lt;designation&gt;.+)\.[mM][zZ][mM][lL]$" directory="mzML" format="mzml"/>
+        </collection>
+    </outputs>
+
+    <!-- Tests {{{1 -->
+    <tests>
+        <test>
+            <param name="isa" value="mzml_study.zip" ftype="isa-tab"/>
+            <output_collection name="mzML" type="list" count="1">
+                <element name="empty" file="mzml_study_output/empty.mzML" ftype="mzml"/>
+            </output_collection>
+        </test>
+    </tests>
+
+    <!-- Help {{{1 -->
+    <help>
+<!-- @@@BEGIN_RST@@@ -->
+
+====================
+ISA to mzML
+====================
+
+Extract mzML files contained inside an ISA archive.
+
+-----
+Input
+-----
+
+ISA dataset
+===========
+
+The ISA-Tab dataset from which to extract the files.
+
+------
+Output
+------
+
+The output is a collection of mzML files.
+
+<!-- @@@END_RST@@@ -->
+    </help>
+
+    <!-- Citations {{{1 -->
+    <citations>
+        <citation type="doi">10.1038/ng.1054</citation> <!-- ISA -->
+        <citation type="doi">10.1093/bioinformatics/btu813</citation> <!-- W4M -->
+    </citations>
+
+</tool>
diff -r 000000000000 -r 975585306dc4 isa2mzxml.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/isa2mzxml.xml Wed Jan 08 07:50:13 2020 -0500
@@ -0,0 +1,77 @@
+<!-- vi: se fdm=marker : -->
+<tool id="isa2mzxml" name="ISA to mzXML" version="1.3.0">
+
+    <description>Extract mzXML files from an ISA dataset and output a collection of mzXML datasets.</description>
+
+    <!-- Command {{{1 -->
+
+    <command><![CDATA[
+        ## @@@BEGIN_CHEETAH@@@
+
+        $__tool_directory__/extract-from-isa
+            -i "$isa.extra_files_path"
+            -e mzXML
+            -o mzXML
+
+        ## @@@END_CHEETAH@@@
+    ]]></command>
+
+    <!-- Inputs {{{1 -->
+
+    <inputs>
+        <param name="isa" label="ISA" type="data" format="isa-tab"/>
+    </inputs>
+
+    <!-- Outputs {{{1 -->
+
+    <outputs>
+        <collection name="mzXML" type="list" label="mzXML files">
+            <discover_datasets pattern="(?P&lt;designation&gt;.+)\.[mM][zZ][xX][mM][lL]$" directory="mzXML" format="mzxml"/>
+        </collection>
+    </outputs>
+
+    <!-- Tests {{{1 -->
+    <tests>
+        <test>
+            <param name="isa" value="mzxml_study.zip" ftype="isa-tab"/>
+            <output_collection name="mzXML" type="list" count="1">
+                <element name="empty" file="mzxml_study_output/empty.mzXML" ftype="mzxml"/>
+            </output_collection>
+        </test>
+    </tests>
+
+    <!-- Help {{{1 -->
+    <help>
+<!-- @@@BEGIN_RST@@@ -->
+
+====================
+ISA to mzXML
+====================
+
+Extract mzXML files contained inside an ISA archive.
+
+-----
+Input
+-----
+
+ISA dataset
+===========
+
+The ISA-Tab dataset from which to extract the files.
+
+------
+Output
+------
+
+The output is a collection of mzXML files.
+
+<!-- @@@END_RST@@@ -->
+    </help>
+
+    <!-- Citations {{{1 -->
+    <citations>
+        <citation type="doi">10.1038/ng.1054</citation> <!-- ISA -->
+        <citation type="doi">10.1093/bioinformatics/btu813</citation> <!-- W4M -->
+    </citations>
+
+</tool>
diff -r 000000000000 -r 975585306dc4 isa2netcdf.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/isa2netcdf.xml Wed Jan 08 07:50:13 2020 -0500
@@ -0,0 +1,77 @@
+<!-- vi: se fdm=marker : -->
+<tool id="isa2netcdf" name="ISA to netCDF" version="1.3.0">
+
+    <description>Extract netCDF files from an ISA dataset and output a collection of netCDF datasets.</description>
+
+    <!-- Command {{{1 -->
+
+    <command><![CDATA[
+        ## @@@BEGIN_CHEETAH@@@
+
+        $__tool_directory__/extract-from-isa
+            -i "$isa.extra_files_path"
+            -e CDF
+            -o netCDF
+
+        ## @@@END_CHEETAH@@@
+    ]]></command>
+
+    <!-- Inputs {{{1 -->
+
+    <inputs>
+        <param name="isa" label="ISA" type="data" format="isa-tab"/>
+    </inputs>
+
+    <!-- Outputs {{{1 -->
+
+    <outputs>
+        <collection name="netCDF" type="list" label="netCDF files">
+            <discover_datasets pattern="(?P&lt;designation&gt;.+)\.[cC][dD][fF]$" directory="netCDF" format="netcdf"/>
+        </collection>
+    </outputs>
+
+    <!-- Tests {{{1 -->
+    <tests>
+        <test>
+            <param name="isa" value="netcdf_study.zip" ftype="isa-tab"/>
+            <output_collection name="netCDF" type="list" count="1">
+                <element name="empty" file="netcdf_study_output/empty.CDF" ftype="netcdf"/>
+            </output_collection>
+        </test>
+    </tests>
+
+    <!-- Help {{{1 -->
+    <help>
+<!-- @@@BEGIN_RST@@@ -->
+
+====================
+ISA to netCDF
+====================
+
+Extract netCDF files contained inside an ISA archive.
+
+-----
+Input
+-----
+
+ISA dataset
+===========
+
+The ISA-Tab dataset from which to extract the files.
+
+------
+Output
+------
+
+The output is a collection of netCDF files.
+
+<!-- @@@END_RST@@@ -->
+    </help>
+
+    <!-- Citations {{{1 -->
+    <citations>
+        <citation type="doi">10.1038/ng.1054</citation> <!-- ISA -->
+        <citation type="doi">10.1093/bioinformatics/btu813</citation> <!-- W4M -->
+    </citations>
+
+</tool>
diff -r 000000000000 -r 975585306dc4 isa2nmrml.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/isa2nmrml.xml Wed Jan 08 07:50:13 2020 -0500
@@ -0,0 +1,77 @@
+<!-- vi: se fdm=marker : -->
+<tool id="isa2nmrml" name="ISA to nmrML" version="1.3.0">
+
+    <description>Extract nmrML files from an ISA dataset and output a collection of nmrML datasets.</description>
+
+    <!-- Command {{{1 -->
+
+    <command><![CDATA[
+        ## @@@BEGIN_CHEETAH@@@
+
+        $__tool_directory__/extract-from-isa
+            -i "$isa.extra_files_path"
+            -e nmrML
+            -o nmrML
+
+        ## @@@END_CHEETAH@@@
+    ]]></command>
+
+    <!-- Inputs {{{1 -->
+
+    <inputs>
+        <param name="isa" label="ISA" type="data" format="isa-tab"/>
+    </inputs>
+
+    <!-- Outputs {{{1 -->
+
+    <outputs>
+        <collection name="nmrML" type="list" label="nmrML files">
+            <discover_datasets pattern="(?P&lt;designation&gt;.+)\.[nN][mM][rR][mM][lL]$" directory="nmrML" format="nmrml"/>
+        </collection>
+    </outputs>
+
+    <!-- Tests {{{1 -->
+    <tests>
+        <test>
+            <param name="isa" value="nmrml_study.zip" ftype="isa-tab"/>
+            <output_collection name="nmrML" type="list" count="1">
+                <element name="empty" file="nmrml_study_output/empty.nmrML" ftype="nmrml"/>
+            </output_collection>
+        </test>
+    </tests>
+
+    <!-- Help {{{1 -->
+    <help>
+<!-- @@@BEGIN_RST@@@ -->
+
+====================
+ISA to nmrML
+====================
+
+Extract nmrML files contained inside an ISA archive.
+
+-----
+Input
+-----
+
+ISA dataset
+===========
+
+The ISA-Tab dataset from which to extract the files.
+
+------
+Output
+------
+
+The output is a collection of nmrML files.
+
+<!-- @@@END_RST@@@ -->
+    </help>
+
+    <!-- Citations {{{1 -->
+    <citations>
+        <citation type="doi">10.1038/ng.1054</citation> <!-- ISA -->
+        <citation type="doi">10.1093/bioinformatics/btu813</citation> <!-- W4M -->
+    </citations>
+
+</tool>
diff -r 000000000000 -r 975585306dc4 test-data/mzdata_study.zip
Binary file test-data/mzdata_study.zip has changed
diff -r 000000000000 -r 975585306dc4 test-data/mzdata_study_output/empty.mzData
diff -r 000000000000 -r 975585306dc4 test-data/mzml_study.zip
Binary file test-data/mzml_study.zip has changed
diff -r 000000000000 -r 975585306dc4 test-data/mzml_study_output/empty.mzML
diff -r 000000000000 -r 975585306dc4 test-data/mzxml_study.zip
Binary file test-data/mzxml_study.zip has changed
diff -r 000000000000 -r 975585306dc4 test-data/mzxml_study_output/empty.mzXML
diff -r 000000000000 -r 975585306dc4 test-data/netcdf_study.zip
Binary file test-data/netcdf_study.zip has changed
diff -r 000000000000 -r 975585306dc4 test-data/netcdf_study_output/empty.CDF
diff -r 000000000000 -r 975585306dc4 test-data/nmrml_study.zip
Binary file test-data/nmrml_study.zip has changed
diff -r 000000000000 -r 975585306dc4 test-data/nmrml_study_output/empty.nmrML
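
Reviewer note: the core of extract-from-isa is a find-then-copy (or symlink) step. It can be sketched as a standalone function, which may help when reading or testing the patch. This is a minimal sketch under assumptions, not the script from the changeset: the function name `extract_by_ext`, its positional parameters, the `symlink` mode keyword, and the `-type f` filter are all hypothetical. It uses `find -exec` instead of a temporary file list so that file names containing spaces are handled, and it relies on `-iname`, which GNU and BSD find provide.

```bash
# Hypothetical helper (not part of the patch): copy or symlink every regular
# file named "*.<ext>" (case-insensitive) from <src> into a new directory <dest>.
# Usage: extract_by_ext <src> <ext> <dest> [copy|symlink]
# The body runs in a subshell "( ... )" so its variables stay local and
# "exit" only leaves the function, not the calling shell.
extract_by_ext() (
    src=$1; ext=$2; dest=$3; mode=${4:-copy}

    # Same preconditions as the script: input must exist, output must not.
    [ -d "$src" ] || { echo "ERROR: \"$src\" is not a valid directory." >&2; exit 1; }
    [ ! -e "$dest" ] || { echo "ERROR: \"$dest\" already exists." >&2; exit 1; }
    mkdir -p "$dest" || exit 1

    # Use an absolute source path so that symbolic links stay valid
    # wherever the output directory is located.
    abs_src=$(cd "$src" && pwd) || exit 1

    if [ "$mode" = symlink ]; then
        find "$abs_src" -type f -iname "*.$ext" -exec ln -s {} "$dest/" \;
    else
        find "$abs_src" -type f -iname "*.$ext" -exec cp {} "$dest/" \;
    fi
)
```

For example, `extract_by_ext ./study mzML ./mzML symlink` would link every mzML file found under `./study` into a fresh `./mzML` directory, which is the directory name the isa2mzml tool wrapper expects for dataset discovery.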