# HG changeset patch
# User pieter.lukasse@wur.nl
# Date 1427140921 -3600
# Node ID ce9228263148cc4c62abdcc8d573137e410386b6
# Parent 8fa07f40d2eb2dd194edf599f1fc68af59e45af7
renamed to TermMapper

diff -r 8fa07f40d2eb -r ce9228263148 LICENSE
--- a/LICENSE Fri Aug 01 17:21:30 2014 +0200
+++ b/LICENSE Mon Mar 23 21:02:01 2015 +0100
@@ -1,7 +1,7 @@
 Apache License
 Version 2.0, January 2004
- http://www.apache.org/licenses/
+ http://www.apache.org/licenses/
 TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
diff -r 8fa07f40d2eb -r ce9228263148 README.rst
--- a/README.rst Fri Aug 01 17:21:30 2014 +0200
+++ b/README.rst Mon Mar 23 21:02:01 2015 +0100
@@ -20,6 +20,7 @@
 ============== ======================================================================
 Date           Changes
 -------------- ----------------------------------------------------------------------
+August 2014    * improvements release
 May 2014       * first release via Tool Shed
 ============== ======================================================================
diff -r 8fa07f40d2eb -r ce9228263148 Results2O.jar
Binary file Results2O.jar has changed
diff -r 8fa07f40d2eb -r ce9228263148 TermMapperTool.jar
Binary file TermMapperTool.jar has changed
diff -r 8fa07f40d2eb -r ce9228263148 results2o.xml
--- a/results2o.xml Fri Aug 01 17:21:30 2014 +0200
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,107 +0,0 @@
-
- use ontology mapping to annotate results (e.g.
annotate protein identifications with Gene Ontology[GO] terms) - - - - Results2O.jar - -inputFileName $inputFileName - -inputIdColumnName "$inputIdColumnName" - -inputIdPrefix "$inputIdPrefix" - -quantifColumn "$quantifColumn" - - -ontologyMappingFileName $ontologyMappingFileName - -mappingFileIdColName "$mappingFileIdColName" - -mappingIdPrefix "$mappingIdPrefix" - -mappingFileOntologyTermColName "$mappingFileOntologyTermColName" - -removeWhiteSpacesFromOterms $removeWhiteSpacesFromOterms - - -outputFileName $outputFileName - -outputObservationsFileName $outputObservationsFileName - - - - - - - - - - - - - - - - - - - - - - #if isinstance( $inputFileName.datatype, $__app__.datatypes_registry.get_datatype_by_extension('tabular').__class__): - - #else: - - #end if - - - - - - - - - - -.. class:: infomark - -This tool is responsible for annotating quantifications result file -with the ontology terms given in a mapping file. This mapping file links the items found in the result file -(e.g. protein identifications coded in common protein coding formats such as UniProt ) -to their respective ontology terms (e.g. GO terms). It enables users to use the cross-reference -information now available in different repositories (like uniprot and KEGG - see for example -http://www.uniprot.org/taxonomy/ or http://www.genome.jp/linkdb/ ) -to map their results to other useful coding schemes such as ontologies for functional annotations. - -As an example for transcripts and proteins, users can check http://www.uniprot.org/taxonomy/ to -see if their organism has been mapped to GO terms by Uniprot. For example the link -http://www.uniprot.org/uniprot/?query=taxonomy:2850 will show the Uniprot repository and cross-references -for the taxonomy 2850. -When the organism being studied is not available, then other strategies -could be tried (like Blast2GO for example). 
- - -Despite the specific examples above, this class is generic and can be used to map any -results file to an Ontology according to a given mapping file. One example would be mapping metabolomics -identifications to the CheBI ontology. - - ------ - -**Output** - -This method will read in the given input file and for each line it will add a new column -containing the Ontology terms found for the ID in that line. So the output file is the same as the -input file + extra Ontology terms column (separated by ; ). - -A second summarized "ontology observations" file is also generated which can be used for visualizing the results -in an ontology viewer (e.g. see OntologyAndObservationsViewer). - - - diff -r 8fa07f40d2eb -r ce9228263148 term_mapper.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/term_mapper.xml Mon Mar 23 21:02:01 2015 +0100 @@ -0,0 +1,207 @@ + + use cross-reference lookup tables to annotate results + + + + TermMapperTool.jar + -inputFileName $inputFileName + -inputIdColumnName "$inputIdColumnName" + #if $inputIdCol.inputIdHasPrefix == True + -inputIdPrefix "$inputIdCol.inputIdPrefix" + #end if + + -mappingFileName $mappingFileName + -mappingFileIdColName "$mappingFileIdColName" + + #if $mappingIdCol.mappingIdHasPrefix == True + -mappingIdPrefix "$mappingIdCol.mappingIdPrefix" + #end if + + -mappingFileTermColName "$mappingFileTermColName" + + -outputFileName $outputFileName + + #if $genObservations.genObservationsFile == True + -outputObservationsFileName $outputObservationsFileName + -quantifColumn "$genObservations.quantifColumn" + #end if + + -mappedTermsColName $mappedTermsColName + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + #if isinstance( $inputFileName.datatype, $__app__.datatypes_registry.get_datatype_by_extension('tabular').__class__): + + #else: + + #end if + + + + ( genObservations.genObservationsFile == True ) + + + + + + + + + +.. 
class:: infomark
+
+
+This tool is responsible for annotating the given target file
+with the terms given in a lookup table. This lookup table maps the items found in the target file
+(e.g. protein identifications coded in common protein coding formats such as UniProt )
+to their respective terms (e.g. GO terms). It enables users to use the cross-reference
+information now available from different repositories (like uniprot and KEGG - see for example
+http://www.uniprot.org/taxonomy/ or http://www.genome.jp/linkdb/ )
+to map their data to other useful coding schemes or to ontologies and functional annotations.
+
+.. class:: infomark
+
+**NB:** Currently the tool will do "smart parsing" of hierarchy based fields in the target file ID column.
+ This means that if the column contains a ".", the trailing part of the ID after the "." is ignored if the full
+ ID does not get a match in the lookup table while the part before the "." does.
+
+.. class:: infomark
+
+Examples of usage:
+
+ annotate protein identifications with Gene Ontology[GO] terms
+
+ annotate metabolite CAS identifications with chebi codes
+
+ add KEGG gene codes to a file containing UNIPROT codes
+
+ add KEGG compound codes to a file containing chebi codes
+
+ etc
+
+As an example for transcripts and proteins, users can check http://www.uniprot.org/taxonomy/ to
+see if their organism has been mapped to GO terms by Uniprot. For example the link
+http://www.uniprot.org/uniprot/?query=taxonomy:2850 will show the Uniprot repository and cross-references
+for the taxonomy 2850.
+When the organism being studied is not available, then other strategies
+could be tried (like Blast2GO for example).
+
+Despite the specific examples above, this class is generic and can be used to map any
+values to new terms according to a given lookup table.
+
+.. class:: infomark
+
+*Omics cross-reference resources on the web:*
+
+LinkDB: http://www.genome.jp/linkdb/
+
+*Ready to use metabolomics links:*
+
+http://rest.genome.jp/link/compound/chebi
+
+http://rest.genome.jp/link/compound/lipidmaps
+
+http://rest.genome.jp/link/compound/lipidbank
+
+http://rest.genome.jp/link/compound/hmdb
+
+
+*Ready to use proteomics links:*
+
+http://rest.genome.jp/link/uniprot/pti (Phaeodactylum Tri.)
+
+http://rest.genome.jp/link/uniprot/hsa (Homo Sapiens)
+
+(for organism code list see: )
+
+
+Uniprot to GO
+
+http://www.uniprot.org/taxonomy/
+
+
+-----
+
+**Output**
+
+This method will read in the given input file and for each line it will add a new column
+containing the terms found for the ID in that line. So the output file is the same as the
+input file + extra terms column (separated by ; ).
+
+-----
+
+**Link to ontology viewer**
+
+A second summarized "terms observations" file can also be generated.
+In case the terms are ontology terms, this file can be used for visualizing the results
+in the ontology viewer "OntologyAndObservationsViewer".
+
+
+
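The help text in term_mapper.xml describes three behaviours: optional ID prefixes, a fallback for IDs containing a ".", and an output file that equals the input plus one extra column of ";"-separated terms. The Python sketch below illustrates that description only. It is not the code inside TermMapperTool.jar, and every name in it (`strip_prefix`, `build_lookup`, `find_terms`, `annotate`) is hypothetical. It assumes tab-separated files with a header row and assumes the fallback cuts the ID at its first "."; the patch does not say which "." is used.

```python
import csv
import io


def strip_prefix(identifier, prefix):
    """Drop an optional prefix (e.g. a database tag) from an identifier."""
    if prefix and identifier.startswith(prefix):
        return identifier[len(prefix):]
    return identifier


def build_lookup(mapping_text, id_col, term_col, prefix=""):
    """Read a tab-separated lookup table into {id: [term, ...]}."""
    lookup = {}
    for row in csv.DictReader(io.StringIO(mapping_text), delimiter="\t"):
        key = strip_prefix(row[id_col], prefix)
        lookup.setdefault(key, []).append(row[term_col])
    return lookup


def find_terms(identifier, lookup):
    """Try the full ID first, then the part before the first '.'.

    Splitting at the FIRST '.' is an assumption made for this sketch.
    """
    if identifier in lookup:
        return lookup[identifier]
    head = identifier.split(".", 1)[0]
    return lookup.get(head, [])


def annotate(target_text, lookup, id_col, new_col, prefix=""):
    """Return the target table with one extra column of ';'-joined terms."""
    reader = csv.DictReader(io.StringIO(target_text), delimiter="\t")
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=list(reader.fieldnames) + [new_col],
        delimiter="\t",
        lineterminator="\n",
    )
    writer.writeheader()
    for row in reader:
        terms = find_terms(strip_prefix(row[id_col], prefix), lookup)
        row[new_col] = ";".join(terms)  # empty string when nothing matches
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    # Made-up data, only to exercise the functions above.
    mapping = "acc\tterm\nup:P1\tGO:1\nup:P1\tGO:2\nup:P2\tGO:3\n"
    target = "id\tquant\nP1.2\t10\nP2\t5\nP9\t1\n"
    table = build_lookup(mapping, "acc", "term", prefix="up:")
    print(annotate(target, table, "id", "Mapped terms"), end="")
```

Rows whose ID has no match keep an empty value in the new column, so the output has the same number of lines as the input.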