changeset 0:c34c364ce75d

First commit
author Vimalkumar Velayudhan <vimal@biotechcoder.com>
date Tue, 21 Jul 2015 14:48:39 +0100
parents
children 4419c2f77b60
files .hgignore LICENSE README.rst difftrans.xml metagene.xml prepare.xml riboseqr/__init__.py riboseqr/difftrans.py riboseqr/metagene.py riboseqr/prepare.py riboseqr/ribosome_profile.py riboseqr/triplet.py riboseqr/utils.py ribosome_profile.xml tests/__init__.py tests/test_riboseqr.py tool_dependencies.xml triplet.xml
diffstat 18 files changed, 2305 insertions(+), 0 deletions(-) [+]
line wrap: on
line diff
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/.hgignore	Tue Jul 21 14:48:39 2015 +0100
@@ -0,0 +1,3 @@
+syntax: glob
+
+*.pyc
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/LICENSE	Tue Jul 21 14:48:39 2015 +0100
@@ -0,0 +1,340 @@
+                    GNU GENERAL PUBLIC LICENSE
+                       Version 2, June 1991
+
+ Copyright (C) 1989, 1991 Free Software Foundation, Inc., <http://fsf.org/>
+ 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ Everyone is permitted to copy and distribute verbatim copies
+ of this license document, but changing it is not allowed.
+
+                            Preamble
+
+  The licenses for most software are designed to take away your
+freedom to share and change it.  By contrast, the GNU General Public
+License is intended to guarantee your freedom to share and change free
+software--to make sure the software is free for all its users.  This
+General Public License applies to most of the Free Software
+Foundation's software and to any other program whose authors commit to
+using it.  (Some other Free Software Foundation software is covered by
+the GNU Lesser General Public License instead.)  You can apply it to
+your programs, too.
+
+  When we speak of free software, we are referring to freedom, not
+price.  Our General Public Licenses are designed to make sure that you
+have the freedom to distribute copies of free software (and charge for
+this service if you wish), that you receive source code or can get it
+if you want it, that you can change the software or use pieces of it
+in new free programs; and that you know you can do these things.
+
+  To protect your rights, we need to make restrictions that forbid
+anyone to deny you these rights or to ask you to surrender the rights.
+These restrictions translate to certain responsibilities for you if you
+distribute copies of the software, or if you modify it.
+
+  For example, if you distribute copies of such a program, whether
+gratis or for a fee, you must give the recipients all the rights that
+you have.  You must make sure that they, too, receive or can get the
+source code.  And you must show them these terms so they know their
+rights.
+
+  We protect your rights with two steps: (1) copyright the software, and
+(2) offer you this license which gives you legal permission to copy,
+distribute and/or modify the software.
+
+  Also, for each author's protection and ours, we want to make certain
+that everyone understands that there is no warranty for this free
+software.  If the software is modified by someone else and passed on, we
+want its recipients to know that what they have is not the original, so
+that any problems introduced by others will not reflect on the original
+authors' reputations.
+
+  Finally, any free program is threatened constantly by software
+patents.  We wish to avoid the danger that redistributors of a free
+program will individually obtain patent licenses, in effect making the
+program proprietary.  To prevent this, we have made it clear that any
+patent must be licensed for everyone's free use or not licensed at all.
+
+  The precise terms and conditions for copying, distribution and
+modification follow.
+
+                    GNU GENERAL PUBLIC LICENSE
+   TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
+
+  0. This License applies to any program or other work which contains
+a notice placed by the copyright holder saying it may be distributed
+under the terms of this General Public License.  The "Program", below,
+refers to any such program or work, and a "work based on the Program"
+means either the Program or any derivative work under copyright law:
+that is to say, a work containing the Program or a portion of it,
+either verbatim or with modifications and/or translated into another
+language.  (Hereinafter, translation is included without limitation in
+the term "modification".)  Each licensee is addressed as "you".
+
+Activities other than copying, distribution and modification are not
+covered by this License; they are outside its scope.  The act of
+running the Program is not restricted, and the output from the Program
+is covered only if its contents constitute a work based on the
+Program (independent of having been made by running the Program).
+Whether that is true depends on what the Program does.
+
+  1. You may copy and distribute verbatim copies of the Program's
+source code as you receive it, in any medium, provided that you
+conspicuously and appropriately publish on each copy an appropriate
+copyright notice and disclaimer of warranty; keep intact all the
+notices that refer to this License and to the absence of any warranty;
+and give any other recipients of the Program a copy of this License
+along with the Program.
+
+You may charge a fee for the physical act of transferring a copy, and
+you may at your option offer warranty protection in exchange for a fee.
+
+  2. You may modify your copy or copies of the Program or any portion
+of it, thus forming a work based on the Program, and copy and
+distribute such modifications or work under the terms of Section 1
+above, provided that you also meet all of these conditions:
+
+    a) You must cause the modified files to carry prominent notices
+    stating that you changed the files and the date of any change.
+
+    b) You must cause any work that you distribute or publish, that in
+    whole or in part contains or is derived from the Program or any
+    part thereof, to be licensed as a whole at no charge to all third
+    parties under the terms of this License.
+
+    c) If the modified program normally reads commands interactively
+    when run, you must cause it, when started running for such
+    interactive use in the most ordinary way, to print or display an
+    announcement including an appropriate copyright notice and a
+    notice that there is no warranty (or else, saying that you provide
+    a warranty) and that users may redistribute the program under
+    these conditions, and telling the user how to view a copy of this
+    License.  (Exception: if the Program itself is interactive but
+    does not normally print such an announcement, your work based on
+    the Program is not required to print an announcement.)
+
+These requirements apply to the modified work as a whole.  If
+identifiable sections of that work are not derived from the Program,
+and can be reasonably considered independent and separate works in
+themselves, then this License, and its terms, do not apply to those
+sections when you distribute them as separate works.  But when you
+distribute the same sections as part of a whole which is a work based
+on the Program, the distribution of the whole must be on the terms of
+this License, whose permissions for other licensees extend to the
+entire whole, and thus to each and every part regardless of who wrote it.
+
+Thus, it is not the intent of this section to claim rights or contest
+your rights to work written entirely by you; rather, the intent is to
+exercise the right to control the distribution of derivative or
+collective works based on the Program.
+
+In addition, mere aggregation of another work not based on the Program
+with the Program (or with a work based on the Program) on a volume of
+a storage or distribution medium does not bring the other work under
+the scope of this License.
+
+  3. You may copy and distribute the Program (or a work based on it,
+under Section 2) in object code or executable form under the terms of
+Sections 1 and 2 above provided that you also do one of the following:
+
+    a) Accompany it with the complete corresponding machine-readable
+    source code, which must be distributed under the terms of Sections
+    1 and 2 above on a medium customarily used for software interchange; or,
+
+    b) Accompany it with a written offer, valid for at least three
+    years, to give any third party, for a charge no more than your
+    cost of physically performing source distribution, a complete
+    machine-readable copy of the corresponding source code, to be
+    distributed under the terms of Sections 1 and 2 above on a medium
+    customarily used for software interchange; or,
+
+    c) Accompany it with the information you received as to the offer
+    to distribute corresponding source code.  (This alternative is
+    allowed only for noncommercial distribution and only if you
+    received the program in object code or executable form with such
+    an offer, in accord with Subsection b above.)
+
+The source code for a work means the preferred form of the work for
+making modifications to it.  For an executable work, complete source
+code means all the source code for all modules it contains, plus any
+associated interface definition files, plus the scripts used to
+control compilation and installation of the executable.  However, as a
+special exception, the source code distributed need not include
+anything that is normally distributed (in either source or binary
+form) with the major components (compiler, kernel, and so on) of the
+operating system on which the executable runs, unless that component
+itself accompanies the executable.
+
+If distribution of executable or object code is made by offering
+access to copy from a designated place, then offering equivalent
+access to copy the source code from the same place counts as
+distribution of the source code, even though third parties are not
+compelled to copy the source along with the object code.
+
+  4. You may not copy, modify, sublicense, or distribute the Program
+except as expressly provided under this License.  Any attempt
+otherwise to copy, modify, sublicense or distribute the Program is
+void, and will automatically terminate your rights under this License.
+However, parties who have received copies, or rights, from you under
+this License will not have their licenses terminated so long as such
+parties remain in full compliance.
+
+  5. You are not required to accept this License, since you have not
+signed it.  However, nothing else grants you permission to modify or
+distribute the Program or its derivative works.  These actions are
+prohibited by law if you do not accept this License.  Therefore, by
+modifying or distributing the Program (or any work based on the
+Program), you indicate your acceptance of this License to do so, and
+all its terms and conditions for copying, distributing or modifying
+the Program or works based on it.
+
+  6. Each time you redistribute the Program (or any work based on the
+Program), the recipient automatically receives a license from the
+original licensor to copy, distribute or modify the Program subject to
+these terms and conditions.  You may not impose any further
+restrictions on the recipients' exercise of the rights granted herein.
+You are not responsible for enforcing compliance by third parties to
+this License.
+
+  7. If, as a consequence of a court judgment or allegation of patent
+infringement or for any other reason (not limited to patent issues),
+conditions are imposed on you (whether by court order, agreement or
+otherwise) that contradict the conditions of this License, they do not
+excuse you from the conditions of this License.  If you cannot
+distribute so as to satisfy simultaneously your obligations under this
+License and any other pertinent obligations, then as a consequence you
+may not distribute the Program at all.  For example, if a patent
+license would not permit royalty-free redistribution of the Program by
+all those who receive copies directly or indirectly through you, then
+the only way you could satisfy both it and this License would be to
+refrain entirely from distribution of the Program.
+
+If any portion of this section is held invalid or unenforceable under
+any particular circumstance, the balance of the section is intended to
+apply and the section as a whole is intended to apply in other
+circumstances.
+
+It is not the purpose of this section to induce you to infringe any
+patents or other property right claims or to contest validity of any
+such claims; this section has the sole purpose of protecting the
+integrity of the free software distribution system, which is
+implemented by public license practices.  Many people have made
+generous contributions to the wide range of software distributed
+through that system in reliance on consistent application of that
+system; it is up to the author/donor to decide if he or she is willing
+to distribute software through any other system and a licensee cannot
+impose that choice.
+
+This section is intended to make thoroughly clear what is believed to
+be a consequence of the rest of this License.
+
+  8. If the distribution and/or use of the Program is restricted in
+certain countries either by patents or by copyrighted interfaces, the
+original copyright holder who places the Program under this License
+may add an explicit geographical distribution limitation excluding
+those countries, so that distribution is permitted only in or among
+countries not thus excluded.  In such case, this License incorporates
+the limitation as if written in the body of this License.
+
+  9. The Free Software Foundation may publish revised and/or new versions
+of the General Public License from time to time.  Such new versions will
+be similar in spirit to the present version, but may differ in detail to
+address new problems or concerns.
+
+Each version is given a distinguishing version number.  If the Program
+specifies a version number of this License which applies to it and "any
+later version", you have the option of following the terms and conditions
+either of that version or of any later version published by the Free
+Software Foundation.  If the Program does not specify a version number of
+this License, you may choose any version ever published by the Free Software
+Foundation.
+
+  10. If you wish to incorporate parts of the Program into other free
+programs whose distribution conditions are different, write to the author
+to ask for permission.  For software which is copyrighted by the Free
+Software Foundation, write to the Free Software Foundation; we sometimes
+make exceptions for this.  Our decision will be guided by the two goals
+of preserving the free status of all derivatives of our free software and
+of promoting the sharing and reuse of software generally.
+
+                            NO WARRANTY
+
+  11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
+FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW.  EXCEPT WHEN
+OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
+PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
+OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.  THE ENTIRE RISK AS
+TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU.  SHOULD THE
+PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
+REPAIR OR CORRECTION.
+
+  12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
+WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
+REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
+INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
+OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
+TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
+YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
+PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGES.
+
+                     END OF TERMS AND CONDITIONS
+
+            How to Apply These Terms to Your New Programs
+
+  If you develop a new program, and you want it to be of the greatest
+possible use to the public, the best way to achieve this is to make it
+free software which everyone can redistribute and change under these terms.
+
+  To do so, attach the following notices to the program.  It is safest
+to attach them to the start of each source file to most effectively
+convey the exclusion of warranty; and each file should have at least
+the "copyright" line and a pointer to where the full notice is found.
+
+    {description}
+    Copyright (C) {year}  {fullname}
+
+    This program is free software; you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation; either version 2 of the License, or
+    (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+    GNU General Public License for more details.
+
+    You should have received a copy of the GNU General Public License along
+    with this program; if not, write to the Free Software Foundation, Inc.,
+    51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+
+Also add information on how to contact you by electronic and paper mail.
+
+If the program is interactive, make it output a short notice like this
+when it starts in an interactive mode:
+
+    Gnomovision version 69, Copyright (C) year name of author
+    Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
+    This is free software, and you are welcome to redistribute it
+    under certain conditions; type `show c' for details.
+
+The hypothetical commands `show w' and `show c' should show the appropriate
+parts of the General Public License.  Of course, the commands you use may
+be called something other than `show w' and `show c'; they could even be
+mouse-clicks or menu items--whatever suits your program.
+
+You should also get your employer (if you work as a programmer) or your
+school, if any, to sign a "copyright disclaimer" for the program, if
+necessary.  Here is a sample; alter the names:
+
+  Yoyodyne, Inc., hereby disclaims all copyright interest in the program
+  `Gnomovision' (which makes passes at compilers) written by James Hacker.
+
+  {signature of Ty Coon}, 1 April 1989
+  Ty Coon, President of Vice
+
+This General Public License does not permit incorporating your program into
+proprietary programs.  If your program is a subroutine library, you may
+consider it more useful to permit linking proprietary applications with the
+library.  If this is what you want to do, use the GNU Lesser General
+Public License instead of this License.
+
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/README.rst	Tue Jul 21 14:48:39 2015 +0100
@@ -0,0 +1,36 @@
+riboseqr_wrapper
+----------------
+`riboSeqR <http://bioconductor.org/packages/3.0/bioc/html/riboSeqR.html>`_
+integration for `Galaxy <http://galaxyproject.org/>`_ and
+`RiboGalaxy <http://ribogalaxy.ucc.ie/>`_.
+
+Included tools
+..............
+In the order they are run
+
+1. Prepare riboSeqR Input - Prepare riboSeqR format input files from SAM format alignment files.
+   The SAM format files should be obtained by aligning Ribo-Seq and RNA-Seq data to the transcriptome.
+   (RNA-Seq data is optional but required for Differential translation analysis).
+
+2. Triplet Periodicity - Plot triplet periodicity for different read lengths.
+
+3. Metagene Analysis
+
+4. Plot Ribosome Profile
+
+   [OR]
+
+   Differential Translation Analysis - Get Ribo and RNA-Seq counts with riboSeqR. Perform differential
+   translation analysis with baySeq.
+
+Dependencies
+............
+Tested on Ubuntu Linux 14.04 LTS, 64-bit. Dependencies should install automatically on Linux x86_64.
+
+**R** ``3.1.2``
+
+**riboSeqR** ``1.0.5``
+
+**baySeq** ``2.0.50``
+
+**rpy2** ``2.3.10``
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/difftrans.xml	Tue Jul 21 14:48:39 2015 +0100
@@ -0,0 +1,149 @@
+<tool id="difftrans" name="Differential Translation Analysis" version="0.1.0">
+    <description>
+        (Step 4) Get Ribo and RNA-Seq counts with riboSeqR. Perform differential
+        translation analysis with baySeq.
+    </description>
+    <requirements>
+        <requirement type="package" version="3.1.2">R</requirement>
+        <requirement type="package" version="6.2">readline</requirement>
+        <requirement type="package" version="2.3.10">rpy2</requirement>
+        <requirement type="package" version="0.4.0">riboseqr_wrapper_deps
+        </requirement>
+    </requirements>
+    <stdio>
+        <exit_code range="1:" level="fatal" description="Error"/>
+    </stdio>
+    <command interpreter="python">riboseqr/difftrans.py
+        --rdata_load "$rdata_load"
+        --slice_lengths "$slice_lengths"
+        --frames "$frames"
+        --group1 "$group1"
+        --group2 "$group2"
+        --num_counts "$num_counts"
+        --normalize "$normalize"
+        --html_file "$html_file"
+        --output_path "$html_file.files_path"
+    </command>
+    <inputs>
+        <param name="rdata_load" format="rda" type="data"
+               label="Select Metagene analysis (R data file)"
+               multiple="false">
+            <validator type="expression"
+                       message="Please check if the correct RDA file is selected">value.name == "Metagene analysis (R data file)"</validator>
+        </param>
+        <param name="slice_lengths" type="text"
+               label="&lt;h4&gt;&lt;font color=&quot;#666666&quot;&gt;Ribo and RNA counts (riboSeqR)&lt;/font&gt;&lt;/h4&gt;Lengths of ribosome footprints to inform count data"
+               value="27,28">
+            <validator type="empty_field" message="Field requires a value"/>
+        </param>
+        <param name="frames" type="text" label="Frames corresponding to the lengths of ribosome footprints given above"
+               value="0,2">
+            <validator type="empty_field" message="Field requires a value"/>
+        </param>
+        <param type="text" name="group1"
+               label="&lt;h4&gt;&lt;font color=&quot;#666666&quot;&gt;baySeq Analysis&lt;/font&gt;&lt;/h4&gt;Group (model) structure to test on the data"
+               help="ex. 1,1,1,1">
+            <validator type="empty_field" message="Field requires a value"/>
+        </param>
+        <param type="text" name="group2" label="and" help="ex. WT,WT,M,M">
+            <validator type="empty_field" message="Field requires a value"/>
+        </param>
+        <param type="integer" name="num_counts"
+               label="How many results to return? (top candidates for differential translation)"
+               value="10"/>
+
+        <param name="normalize" type="boolean" checked="no"
+               label="Normalize data?" falsevalue="FALSE" truevalue="TRUE"/>
+    </inputs>
+    <outputs>
+        <data format="html" name="html_file"
+              label="Differential Translation Analysis (HTML report)"/>
+    </outputs>
+    <!--<tests>-->
+    <!--<test>-->
+    <!--<param name="rdata_load" value="Metagene analysis (R data file)"/>-->
+    <!--<param name="slice_lengths" value="27,28"/>-->
+    <!--<param name="frames" value="0,2"/>-->
+    <!--<param name="group1" value="1,1,1,1"/>-->
+    <!--<param name="group2" value="WT,WT,M,M"/>-->
+    <!--<param name="normalize" checked="yes"/>-->
+    <!--<output name="html_file" file="Differential_Translation_Analysis_(HTML_report)">-->
+    <!--<extra_files type="file" name="RNACounts.csv" value="RNACounts.csv" ftype="csv"/>-->
+    <!--<extra_files type="file" name="RiboCounts.csv" value="RiboCounts.csv" ftype="csv"/>-->
+    <!--<extra_files type="file" name="TopCounts.csv" value="TopCounts.csv" ftype="csv"/>-->
+    <!--</output>-->
+    <!--</test>-->
+    <!--</tests>-->
+    <help>
+Differential Translation Analysis
+---------------------------------
+riboSeqR version: ``1.0.5``, baySeq version: ``2.0.50``.
+
+This tool can be used for obtaining the Ribo and RNA-Seq counts and
+finding the
+top candidates for differential translation using baySeq. The input is
+the
+R data file from the Metagene analysis step.
+
+How to use?
+-----------
+Inputs
+......
+Select *Metagene analysis (R data file)*, enter lengths of ribosome
+footprints
+and the frames. For baySeq analysis, enter the group model and the
+number of
+results to return. Please refer to the documentation for explanation of
+the
+options.
+
+Outputs
+.......
+The following files will be generated on completion:
+
+*Differential Translation Analysis (HTML report)*
+
+A HTML file with results, links to other output files and the R script
+used for
+the session.
+
+riboSeqR functions used
+.......................
+``sliceCounts``, ``rnaCounts``.
+
+baySeq functions used
+.....................
+``getLibsizes``, ``getPriors``, ``getLikelihoods``, ``topCounts``.
+
+For detailed description of the functions and the options used, please
+consult
+the riboSeqR and baySeq documentation.
+
+Links
+.....
+* `riboSeqR &lt;http://bioconductor.org/packages/3.0/bioc/html/riboSeqR.html&gt;`_
+* `baySeq &lt;http://bioconductor.org/packages/3.0/bioc/html/baySeq.html&gt;`_
+    </help>
+    <citations>
+        <citation type="bibtex">
+            @Manual{,
+            title = {riboSeqR: Analysis of sequencing data from ribosome
+            profiling experiments.},
+            author = {Thomas J. Hardcastle},
+            year = {2014},
+            note = {R package version 1.0.5},
+            }
+        </citation>
+        <citation type="bibtex">
+            @Manual{,
+            title = {baySeq: Empirical Bayesian analysis of patterns of
+            differential
+            expression in count data},
+            author = {Thomas J. Hardcastle},
+            year = {2012},
+            note = {R package version 2.0.50},
+            }
+        </citation>
+        <citation type="doi">10.1186/1471-2105-11-422</citation>
+    </citations>
+</tool>
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/metagene.xml	Tue Jul 21 14:48:39 2015 +0100
@@ -0,0 +1,202 @@
+<tool id="riboseqr_metagene" name="Metagene Analysis" version="0.4.0">
+    <description>
+        (Step 3) Metagene analysis using riboSeqR.
+    </description>
+    <requirements>
+        <requirement type="package" version="3.1.2">R</requirement>
+        <requirement type="package" version="6.2">readline</requirement>
+        <requirement type="package" version="2.3.10">rpy2</requirement>
+        <requirement type="package" version="0.4.0">riboseqr_wrapper_deps</requirement>
+    </requirements>
+    <command interpreter="python">riboseqr/metagene.py
+        --rdata_load "$rdata_load"
+        --selected_lengths "$selected_lengths"
+        --selected_frames "$selected_frames"
+        --hit_mean "$hit_mean"
+        --unique_hit_mean "$unique_hit_mean"
+        --ratio_check "$ratio_check"
+        --plot_lengths "$plot_lengths"
+        --min5p "$min5p"
+        --max5p "$max5p"
+        --min3p "$min3p"
+        --max3p "$max3p"
+        --cap "$cap"
+        --plot_title "$plot_title"
+        --rdata_save "$rdata_save"
+        --html_file "$html_file"
+        --output_path "$html_file.files_path"
+    </command>
+    <inputs>
+        <param name="rdata_load" type="data" format="rda"
+               label="Select Triplet periodicity (R data file)"
+               multiple="false" optional="false"
+               help="&lt;br&gt;&lt;h4&gt;&lt;font color=&quot;#666666&quot;&gt;filterHits parameters (footprint reads to be used in the analysis)&lt;/font&gt;&lt;/h4&gt;">
+            <validator type="expression"
+                       message="Please check if the correct RDA file is selected">value.name == "Triplet periodicity (R data file)"</validator>
+        </param>
+
+        <param name="selected_lengths" type="text" value="27"
+               label="Length of ribosome footprint
+                      reads to be used in filtering (lengths)"
+               help="ex: 27,28. Multiple values must be comma-separated.
+                     Please consult Periodicity-report.txt." optional="false">
+            <validator type="empty_field" message="Field requires a value"/>
+        </param>
+
+        <param name="selected_frames" type="text" value=""
+               label="Frames of the ribosome footprint reads to be used in filtering
+                      (frames)"
+               help="ex: 1,0. Should be of equal length to the lengths parameter above.
+                     Multiple values must be comma-separated.
+                     Please consult the periodicity report from previous step."
+               optional="false">
+            <validator type="empty_field" message="Field requires a value"/>
+        </param>
+
+        <param name="hit_mean" size="4" type="integer" value="10"
+               label="Mean number of hits within a replicate group that
+                      should be observed to pass filtering (hitMean)" optional="false">
+            <validator type="empty_field" message="Field requires a value"/>
+        </param>
+
+        <param name="unique_hit_mean" size="4" type="integer" value="1"
+               label="Mean number of unique sequences within a replicate group
+                      that should be observed to pass filtering
+                      (unqhitMean)" optional="false">
+            <validator type="empty_field" message="Field requires a value"/>
+        </param>
+
+        <param name="ratio_check" type="boolean" checked="yes"
+               label="Check the ratios of the expected phase to maximal phase
+                      within the putative coding sequence
+                      (ratioCheck)" falsevalue="FALSE"
+               truevalue="TRUE"
+               help="&lt;br&gt;&lt;h4&gt;&lt;font color=&quot;#666666&quot;&gt;plotCDS parameters&lt;/font&gt;&lt;/h4&gt;"/>
+
+        <param name="plot_lengths" type="text"
+               label="Length of footprints to be
+                      plotted (lengths)"
+               help="Multiple values should be comma-separated.
+                     In that case, multiple plots will be produced" value="27"/>
+
+        <param name="min5p" label="The distance upstream of the translation
+                                   start to be plotted (min5p)" value="-20" type="integer"/>
+
+        <param name="max5p" label="The distance downstream of the translation
+                                   start to be plotted (max5p)" value="200" type="integer"/>
+
+        <param name="min3p" label="The distance upstream of the translation
+                                   end to be plotted (min3p)" value="-200" type="integer"/>
+
+        <param name="max3p" label="The distance downtream of the translation
+                                   end to be plotted (max3p)" value="20" type="integer"/>
+
+        <param name="cap"
+               label="If given, caps the height of plotted values (cap)"
+               value="" type="integer" optional="true"/>
+
+        <param name="plot_title"
+               label="Title of the plot (main)" type="text" size="30"
+               value=""/>
+    </inputs>
+    <stdio>
+        <exit_code range="1:"  level="fatal" description="Error" />
+    </stdio>
+    <outputs>
+        <data format="rda" name="rdata_save"
+              label="Metagene analysis (R data file)"/>
+        <data format="html" name="html_file"
+              label="Metagene analysis (HTML report)"/>
+    </outputs>
+    <!--<tests>-->
+    <!--<test>-->
+    <!--<param name="rdata_load" value="Triplet periodicity (R data file)" ftype="rda"/>-->
+    <!--<param name="hit_mean" value="50"/>-->
+    <!--<param name="unique_hit_mean" value="10"/>-->
+    <!--<param name="min5p" value="-20"/>-->
+    <!--<param name="max5p" value="200"/>-->
+    <!--<param name="min3p" value="-200"/>-->
+    <!--<param name="max3p" value="20"/>-->
+    <!--<param name="cap" value=""/>-->
+    <!--<param name="plot_title" value=""/>-->
+    <!--<output name="rdata_save" file="Metagene analysis (R data file)" ftype="rda"/>-->
+    <!--<output name="html_file" file="Metagene_analysis_(HTML_report).html">-->
+    <!--<extra_files type="file" name="Metagene-analysis-plot1.pdf" value="Metagene-analysis-plot1.pdf"/>-->
+    <!--<extra_files type="file" name="Metagene-analysis-plot1_1.png" value="Metagene-analysis-plot1_1.png"/>-->
+    <!--<extra_files type="file" name="Metagene-analysis-plot1_2.png" value="Metagene-analysis-plot1_2.png"/>-->
+    <!--<extra_files type="file" name="Metagene-analysis-plot1_3.png" value="Metagene-analysis-plot1_3.png"/>-->
+    <!--<extra_files type="file" name="Metagene-analysis-plot1_4.png" value="Metagene-analysis-plot1_4.png"/>-->
+    <!--<extra_files type="file" name="Metagene-analysis-plot2.pdf" value="Metagene-analysis-plot2.pdf"/>-->
+    <!--<extra_files type="file" name="Metagene-analysis-plot2_1.png" value="Metagene-analysis-plot2_1.png"/>-->
+    <!--<extra_files type="file" name="Metagene-analysis-plot2_2.png" value="Metagene-analysis-plot2_2.png"/>-->
+    <!--<extra_files type="file" name="Metagene-analysis-plot2_3.png" value="Metagene-analysis-plot2_3.png"/>-->
+    <!--<extra_files type="file" name="Metagene-analysis-plot2_4.png" value="Metagene-analysis-plot2_4.png"/>-->
+    <!--</output>-->
+    <!--</test>-->
+    <!--</tests>-->
+    <help>
+Metagene Analysis
+-----------------
+riboSeqR version: ``1.0.5``.
+
+The input is the R data file from the previous step - Triplet periodicity.
+
+How to use?
+-----------
+Inputs
+......
+#. Select *Triplet periodicity (R data file)* from the previous step.
+
+#. Specify length of ribosome footprint reads to be used in filtering
+(lengths). Only these reads **will** be used in the analysis.
+
+#. Specify frames to consider. This information can be obtained
+from the *Triplet periodicity (HTML report)*.
+
+.. class:: warningmark
+
+Please note that the frames specified should correspond to the
+lengths of the reads.
+
+#. Under *plotCDS parameters*, input length of footprints to be considered for
+generating the plot.
+
+#. Review/change other options if necessary and execute program.
+
+Outputs
+.......
+The following files will be generated on completion:
+
+#. Metagene analysis (HTML report)
+
+A HTML file with results and links to other output files - plots for
+specified lengths (PDF) and R script used for the session.
+
+#. Metagene analysis (R data file)
+
+Used as input for the next step - *Plot Rribosome Profile*.
+
+riboSeqR functions used
+.......................
+``filterHits``.
+
+For detailed description of the functions and the options used, please consult
+the riboSeqR documentation.
+
+Links
+.....
+`riboSeqR &lt;http://bioconductor.org/packages/3.0/bioc/html/riboSeqR.html&gt;`_.
+
+    </help>
+    <citations>
+        <citation type="bibtex">
+            @Manual{,
+            title = {riboSeqR: Analysis of sequencing data from ribosome
+            profiling experiments.},
+            author = {Thomas J. Hardcastle},
+            year = {2014},
+            note = {R package version 1.0.5},
+            }
+        </citation>
+    </citations>
+</tool>
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/prepare.xml	Tue Jul 21 14:48:39 2015 +0100
@@ -0,0 +1,175 @@
+<tool id="riboseqr_prepare_input" name="Prepare riboSeqR input" version="0.4.0">
+    <description>
+        (Step 1) Prepare alignment file (SAM format, Ribo-Seq or RNA-Seq alignments)
+        for riboSeqR analysis.
+    </description>
+    <requirements>
+        <requirement type="package" version="3.1.2">R</requirement>
+        <requirement type="package" version="6.2">readline</requirement>
+        <requirement type="package" version="2.3.10">rpy2</requirement>
+        <requirement type="package" version="0.4.0">riboseqr_wrapper_deps</requirement>
+    </requirements>
+    <stdio>
+        <exit_code range="1:"  level="fatal" description="Error" />
+    </stdio>
+    <command interpreter="python">riboseqr/prepare.py
+        #set $files_ribo = []
+        #set $files_rna = []
+        #set $replicates = []
+
+        #for $i, $s in enumerate($rnaseq.datasets)
+        #silent $replicates.append($s.replicate_name.value)
+        #silent $files_ribo.append($s.ribo_files.file_name)
+        #try:
+        #silent $s.rna_files
+        #except:
+        #pass
+        #else:
+        #silent $files_rna.append($s.rna_files.file_name)
+        #end try
+        #end for
+
+        #set $ribofiles = ", ".join($files_ribo)
+        #set $rnafiles = ", ".join($files_rna)
+        #set $replicate_names = ", ".join($replicates)
+
+        --ribo_files "$ribofiles"
+        --rna_files "$rnafiles"
+        --replicate_names "$replicate_names"
+        --seqnames "$seqnames"
+        --rdata_save "$rdata_save"
+        --sam_format
+        --html_file "$html_file"
+        --output_path "$html_file.files_path"
+    </command>
+    <inputs>
+        <conditional name="rnaseq">
+            <param name="have-rnaseq" type="boolean" truevalue="checked"
+                   falsevalue="unchecked" label="Have RNA-Seq data?"/>
+            <when value="checked">
+                <repeat name="datasets" title="Ribo and RNA-Seq alignment"
+                        default="1">
+                    <param type="text" name="replicate_name" size="20"
+                           label="Provide a common name for each replicate group"
+                           value=""/>
+                    <param format="sam" name="ribo_files" type="data"
+                           label="Ribo-Seq alignment file (SAM)"
+                           multiple="false"/>
+                    <param format="sam" name="rna_files" type="data"
+                           label="RNA-Seq alignment file (SAM)"
+                           multiple="false"/>
+                </repeat>
+            </when>
+            <when value="unchecked">
+                <repeat name="datasets" title="Ribo-Seq alignment" default="1">
+                    <param type="text" name="replicate_name" size="20"
+                           label="Provide a common name for each replicate group"
+                           value=""/>
+                    <param format="sam" name="ribo_files" type="data"
+                           label="Ribo-Seq alignment file (SAM)"
+                           multiple="false">
+                    </param>
+                </repeat>
+            </when>
+        </conditional>
+        <param name="seqnames" type="text" area="True" label="Transcript (seqname)
+                                                              names to be read" size="3x60"
+               help="[Optional] - only the mapped footprints for these transcripts will be
+                     used in further riboSeqR analysis. Multiple transcript names should
+                     be separated by commas. There should be no quotes around transcript names.">
+            <sanitizer>
+                <valid>
+                    <add value="|"/>
+                </valid>
+            </sanitizer>
+        </param>
+    </inputs>
+    <outputs>
+        <data format="RData" name="rdata_save"
+              label="Prepare riboSeqR input (R data file)"/>
+        <data format="html" name="html_file"
+              label="Prepare riboSeqR input (HTML report)"/>
+    </outputs>
+    <!--<tests>-->
+    <!--<test>-->
+    <!--<param name="ribo_files" value="12hRPF.sam" />-->
+    <!--<output name="html_file" file="Prepare_riboSeqR_input_(HTML_report).html">-->
+    <!--<extra_files type="file" name="RiboSeq file 1" value="RiboSeq file 1"/>-->
+    <!--</output>-->
+    <!--</test>-->
+    <!--</tests>-->
+    <help>
+Prepare riboSeqR Input
+----------------------
+riboSeqR version: ``1.0.5``.
+
+This tool can be used to prepare input data for riboSeqR from SAM
+format alignments of Ribo or RNA-Seq data to a reference transcriptome. You can
+do this alignment manually using bowtie or the
+"Transcriptome Mapping" -> "Align to transcriptome using Bowtie"
+tool on `RiboGalaxy &lt;http://ribogalaxy.ucc.ie&gt;`_.
+
+The required input format for riboSeqR is mentioned in the
+*"Getting Data"* section of the documentation.
+
+How to use?
+-----------
+Inputs
+......
+Select SAM format Ribo-Seq alignment files in the input section.
+
+If you have RNA-Seq data, these can be included if the *"Have RNA-Seq data"*
+option is checked.
+
+Outputs
+.......
+The following files will be generated on completion:
+
+#. Prepare riboSeqR input (HTML report) - A HTML file with links to all other
+output files.
+
+* Generated riboSeqR format input files of RiboSeq and RNASeq data(if provided).
+These files are plain text and lines have the following information -
+strand, transcript name, alignment position, sequence.
+
+Please note the alignments are made *0-indexed*.
+
+* R script used in this session.
+
+#. Prepare riboSeqR input (R data file) - used as input for the next step -
+*Triplet Periodicity*.
+
+How are the SAM alignments processed?
+.....................................
+#. Lines starting with ``@`` are ignored.
+#. Lines having a ``FLAG=0`` are considered as successful alignments. These are
+considered for the next step.
+#. Alignment start is located on ``column 4``. These are decremented by
+1 as SAM alignments are 1-indexed.
+#. riboSeqR input file is written with the strand (``+``), transcript name,
+alignment start and the aligned sequence.
+
+riboSeqR functions used
+.......................
+``readRibodata``.
+
+For detailed description of the functions and the options used, please consult
+the riboSeqR documentation.
+
+Links
+.....
+`riboSeqR &lt;http://bioconductor.org/packages/3.0/bioc/html/riboSeqR.html&gt;`_
+
+    </help>
+    <citations>
+        <citation type="bibtex">
+            @Manual{,
+            title = {riboSeqR: Analysis of sequencing data from ribosome
+            profiling experiments.},
+            author = {Thomas J. Hardcastle},
+            year = {2014},
+            note = {R package version 1.0.5},
+            }
+        </citation>
+    </citations>
+</tool>
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/riboseqr/__init__.py	Tue Jul 21 14:48:39 2015 +0100
@@ -0,0 +1,1 @@
+# -*- coding: utf-8 -*-
\ No newline at end of file
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/riboseqr/difftrans.py	Tue Jul 21 14:48:39 2015 +0100
@@ -0,0 +1,150 @@
+# -*- coding: utf-8 -*-
+import os
+import sys
+import argparse
+import logging
+import rpy2.robjects as robjects
+import utils
+
+rscript = ''
+R = robjects.r
+
+
+def run_rscript(command=None):
+    """Run R command, log it, append to rscript"""
+    global rscript
+    if not command:
+        return
+    logging.debug(command)
+    rscript += '{}\n'.format(command)
+    output = R(command)
+    return output
+
+
+def get_counts(rdata_load='Metagene.rda', slice_lengths='27',
+               frames='', group1=None, group2=None, num_counts=10,
+               normalize='FALSE', html_file='Counts.html',
+               output_path='counts'):
+    options = {'slice_lengths': utils.process_args(
+        slice_lengths, ret_type='int', ret_mode='charvector')}
+
+    if not num_counts:
+        num_counts = 10
+
+    options['frames'] = utils.process_args(
+        frames, ret_type='int', ret_mode='listvector')
+
+    run_rscript('suppressMessages(library(riboSeqR))')
+    run_rscript('load("{}")'.format(rdata_load))
+
+    cmd_args = 'ffCs, lengths={slice_lengths}'.format(**options)
+    if frames:
+        cmd_args += ', frames={frames}'.format(**options)
+
+    run_rscript("""riboCounts <- sliceCounts({})
+    annotation <- as.data.frame(ffCs@CDS)
+    rownames(riboCounts) <- annotation$seqnames
+    colnames(riboCounts) <- names(riboDat@riboGR)""".format(cmd_args))
+
+    run_rscript("""mrnaCounts <- rnaCounts(riboDat, ffCs@CDS)
+    rownames(mrnaCounts) <- annotation$seqnames""")
+
+    if not os.path.exists(output_path):
+        os.mkdir(output_path)
+
+    html = '<h2>Differential Translation Analysis</h2><hr>'
+    for count_name, file_name, legend in (
+            ('riboCounts', 'RiboCounts.csv', 'Ribo-Seq counts'),
+            ('mrnaCounts', 'RNACounts.csv', 'RNA-Seq counts'),
+            ('tC', 'TopCounts.csv', 'baySeq topCounts')):
+        if count_name == 'tC' and R['riboCounts'] and R['mrnaCounts']:
+            run_rscript('suppressMessages(library(baySeq))')
+            cmd = """pD <- new("countData", replicates=ffCs@replicates, \
+            data=list(riboCounts, mrnaCounts), groups=list(NDT={0}, DT={1}), \
+            annotation=as.data.frame(ffCs@CDS), \
+            densityFunction=bbDensity)""".format(
+                utils.process_args(
+                    group1, ret_type='int', ret_mode='charvector'),
+                utils.process_args(
+                    group2, ret_type='str', ret_mode='charvector'))
+            run_rscript(cmd)
+
+            run_rscript('libsizes(pD) <- getLibsizes(pD)')
+            run_rscript('pD <- getPriors(pD, cl=NULL)')
+            run_rscript('pD <- getLikelihoods(pD, cl=NULL)')
+            run_rscript('tC <- topCounts(pD, "DT", normaliseData={}, '
+                        'number={})'.format(normalize, num_counts))
+
+        if R[count_name]:
+            html += '<h3>{}</h3>'.format(legend)
+            output_file = os.path.join(output_path, file_name)
+            run_rscript('write.csv({}, file="{}")'.format(
+                count_name, output_file))
+            with open(output_file) as g:
+                html += '<table cellpadding="4" border="1">'
+                header = g.readline()
+                html += '<tr>'
+                for item in header.strip().split(","):
+                    html += '<th>{}</th>'.format(item.strip('"'))
+                html += '</tr>'
+                for line in g:
+                    html += '<tr>'
+                    data = line.strip().split(",")
+                    # html += '<td>{}</td>'.format(data[0].strip('"'))
+                    for item in data:
+                        html += ('<td align="center"><code>{}</code>'
+                                 '</td>'.format(item.strip('"')))
+                    html += '</tr>'
+                html += ('</table><p>Download: <a href="{0}">'
+                         '{0}</a></p><hr>'.format(file_name))
+
+    with open(os.path.join(output_path, 'counts.R'), 'w') as r:
+        r.write(rscript)
+
+    html += ('<h4>R script for this session</h4>'
+             '<p>Download: <a href="counts.R">counts.R</a></p>')
+
+    with open(html_file, 'w') as f:
+        f.write(html)
+
+
+if __name__ == '__main__':
+    parser = argparse.ArgumentParser(description='Get Ribo and RNA-Seq counts')
+
+    # required arguments
+    flags = parser.add_argument_group('required arguments')
+    flags.add_argument('--rdata_load', required=True,
+                       help='Saved riboSeqR data from Step 2')
+    flags.add_argument('--slice_lengths',
+                       help='Lengths of ribosome footprints to inform count '
+                            'data', default='27,28', required=True)
+    flags.add_argument('--group1', help='Group model for baySeq', required=True)
+    flags.add_argument('--group2', help='Group model for baySeq', required=True)
+    flags.add_argument('--frames',
+                       help='Frames of ribosome footprints (relative to '
+                            'coding start site). If omitted, all frames '
+                            'are used.', required=True)
+
+    # optional arguments
+    parser.add_argument(
+        '--num_counts', help='How many results to return? (topCounts)')
+    parser.add_argument('--normalize', help='Normalize data?',
+                        choices=['TRUE', 'FALSE'], default='FALSE')
+    parser.add_argument('--html_file', help='HTML file with reports')
+    parser.add_argument('--output_path', help='Directory to save output files')
+    parser.add_argument(
+        '--debug', help='Produce debug output', action='store_true')
+
+    args = parser.parse_args()
+    if args.debug:
+        logging.basicConfig(format='%(levelname)s - %(message)s',
+                            level=logging.DEBUG, stream=sys.stdout)
+        logging.debug('Supplied Arguments\n{}\n'.format(vars(args)))
+
+    if not os.path.exists(args.output_path):
+        os.mkdir(args.output_path)
+
+    get_counts(rdata_load=args.rdata_load, slice_lengths=args.slice_lengths,
+               frames=args.frames, group1=args.group1, group2=args.group2,
+               num_counts=args.num_counts, normalize=args.normalize,
+               html_file=args.html_file, output_path=args.output_path)
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/riboseqr/metagene.py	Tue Jul 21 14:48:39 2015 +0100
@@ -0,0 +1,209 @@
+#!/usr/bin/env python
+import os
+import sys
+import glob
+import argparse
+import logging
+import rpy2.robjects as robjects
+
+import utils
+
+rscript = ''
+R = robjects.r
+
+
+def run_rscript(command=None):
+    """Run R command, log it, append to rscript"""
+    global rscript
+    if not command:
+        return
+    logging.debug(command)
+    rscript += '{}\n'.format(command)
+    output = R(command)
+    return output
+
+
+def do_analysis(
+        rdata_load='Periodicity.rda', selected_lengths='27',
+        selected_frames='', hit_mean='10', unique_hit_mean='1',
+        ratio_check='TRUE', min5p='-20', max5p='200', min3p='-200', max3p='20',
+        cap='', plot_title='', plot_lengths='27', rdata_save='Metagene.rda',
+        html_file='Metagene-report.html', output_path=os.getcwd()):
+    """Metagene analysis from saved periodicity R data file. """
+    run_rscript('suppressMessages(library(riboSeqR))')
+    run_rscript('load("{}")'.format(rdata_load))
+
+    logging.debug('fS\n{}\nfCs\n{}\n'.format(R['fS'], R['fCs']))
+    options = {}
+    for key, value, rtype, rmode in (
+            ('lengths', selected_lengths, 'int', 'charvector'),
+            ('frames', selected_frames, 'int', 'listvector'),
+            ('hit_mean', hit_mean, 'int', None),
+            ('unique_hit_mean', unique_hit_mean, 'int', None),
+            ('ratio_check', ratio_check, 'bool', None),
+            ('min5p', min5p, 'int', None), ('max5p', max5p, 'int', None),
+            ('min3p', min3p, 'int', None), ('max3p', max3p, 'int', None),
+            ('cap', cap, 'int', None),
+            ('plot_title', plot_title, 'str', 'charvector'),
+            ('plot_lengths', plot_lengths, 'int', 'list')):
+            options[key] = utils.process_args(
+                value, ret_type=rtype, ret_mode=rmode)
+
+    cmd_args = """fCs, lengths={lengths},
+    frames={frames}, hitMean={hit_mean},
+    unqhitMean={unique_hit_mean}, fS=fS""".format(**options)
+
+    if ratio_check == 'TRUE':
+        cmd_args += ', ratioCheck = TRUE'
+
+    run_rscript('ffCs <- filterHits({})'.format(cmd_args))
+    logging.debug("ffCs\n{}\n".format(R['ffCs']))
+
+    cds_args = ('coordinates=ffCs@CDS, riboDat=riboDat, min5p={min5p}, '
+                'max5p={max5p}, min3p={min3p}, max3p={max3p}'.format(**options))
+
+    if options['cap']:
+        cds_args += ', cap={cap}'.format(**options)
+
+    if options['plot_title']:
+        cds_args += ', main={plot_title}'.format(**options)
+
+    html = """<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2//EN">
+    <html>
+    <head>
+    <title>Metagene Analysis - Report</title>
+    </head>
+    <body>
+    """
+    html += '<h2>Metagene analysis - results</h2>\n<hr>\n'
+    html += ('<p>\nLengths of footprints used in analysis - <strong>'
+             '<code>{0}</code></strong><br>\nLengths of footprints '
+             'selected for the plot - <strong><code>{1}</code></strong>'
+             '\n</p>\n'.format(selected_lengths, plot_lengths))
+    for count, length in enumerate(options['plot_lengths']):
+        count += 1
+        html += '<h3>Length: {0}</h3>\n'.format(length)
+        plot_file = os.path.join(output_path,
+                                 'Metagene-analysis-plot{0}'.format(count))
+        for fmat in ('pdf', 'png'):
+            if fmat == 'png':
+                cmd = 'png(file="{0}_%1d.png", type="cairo")'
+            else:
+                cmd = 'pdf(file="{0}.pdf")'
+            run_rscript(cmd.format(plot_file))
+            run_rscript('plotCDS({0},{1})'.format(
+                cds_args, 'lengths={}'.format(length)))
+            run_rscript('dev.off()')
+        for image in sorted(
+                glob.glob('{}*.png'.format(plot_file))):
+            html += '<p><img border="1" src="{0}" alt="{0}"></p>\n'.format(
+                os.path.basename(image))
+        html += '<p><a href="{0}.pdf">PDF version</a></p>\n'.format(
+            os.path.basename(plot_file))
+    run_rscript('save("ffCs", "riboDat", "fastaCDS", file="{}", '
+                'compress=FALSE)'.format(rdata_save))
+
+    logging.debug('\n{:#^80}\n{}\n{:#^80}\n'.format(
+        ' R script for this session ', rscript, ' End R script '))
+
+    with open(os.path.join(output_path, 'metagene.R'), 'w') as r:
+        r.write(rscript)
+
+    html += ('<h4>R script for this session</h4>\n'
+             '<p><a href="metagene.R">metagene.R</a></p>\n'
+             '<p>Next step: <em>Plot Ribosome profile</em></p>\n'
+             '</body>\n</html>\n')
+
+    with open(html_file, 'w') as f:
+        f.write(html)
+
+if __name__ == '__main__':
+
+    parser = argparse.ArgumentParser(description='Metagene analysis')
+
+    # required arguments
+    flags = parser.add_argument_group('required arguments')
+    flags.add_argument('--rdata_load', required=True,
+                       help='Saved riboSeqR data from Periodicity step.')
+    flags.add_argument('--selected_lengths', required=True,
+                       help='Select frame lengths to filter. Comma-separated',
+                       default='27')
+    flags.add_argument(
+        '--selected_frames', required=True,
+        help='Select frames corresponding to frame lengths. Comma-separated')
+
+    flags.add_argument(
+        '--hit_mean', required=True,
+        help='Mean number of hits within the replicate group for filtering',
+        default='10')
+
+    flags.add_argument(
+        '--unique_hit_mean', required=True,
+        help='Mean number of unique sequences within the replicate group '
+             'for filtering', default='1')
+
+    parser.add_argument(
+        '--rdata_save', help='File to write R data to (default: %(default)s)',
+        default='Metagene.rda')
+
+    parser.add_argument(
+        '--ratio_check',
+        help='Check the ratios of the expected phase to maximal phase '
+             'within the putative coding sequence (default: %(default)s)',
+        choices=['TRUE', 'FALSE'], default='TRUE')
+
+    parser.add_argument(
+        '--plot_lengths',
+        help='Length of footprints to be plotted. Multiple values should be '
+             'comma-separated. In that case, multiple plots will be produced'
+             '(default: %(default)s)', default='27')
+
+    parser.add_argument(
+        '--min5p',
+        help='The distance upstream of the translation start to be plotted '
+             '(default: %(default)s)', default='-20')
+
+    parser.add_argument(
+        '--max5p',
+        help='The distance downstream of the translation start to be plotted '
+             '(default: %(default)s)', default='200')
+
+    parser.add_argument(
+        '--min3p',
+        help='The distance upstream of the translation end to be plotted '
+             '(default: %(default)s)', default='-200')
+
+    parser.add_argument(
+        '--max3p',
+        help='The distance downtream of the translation end to be plotted '
+             '(default: %(default)s)', default='20')
+
+    parser.add_argument(
+        '--cap', help='If given, caps the height of plotted values '
+                      '(default: %(default)s)')
+
+    parser.add_argument('--plot_title', help='Title of the plot', default='')
+    parser.add_argument('--html_file', help='HTML file with reports')
+    parser.add_argument('--output_path', help='Directory to save output files')
+    parser.add_argument(
+        '--debug', help='Produce debug output', action='store_true')
+
+    args = parser.parse_args()
+    if args.debug:
+        logging.basicConfig(format='%(module)s: %(levelname)s - %(message)s',
+                            level=logging.DEBUG, stream=sys.stdout)
+        logging.debug('Supplied Arguments\n{}\n'.format(vars(args)))
+
+    if not os.path.exists(args.output_path):
+        os.mkdir(args.output_path)
+
+    do_analysis(
+        rdata_load=args.rdata_load, selected_lengths=args.selected_lengths,
+        selected_frames=args.selected_frames, hit_mean=args.hit_mean,
+        unique_hit_mean=args.unique_hit_mean, ratio_check=args.ratio_check,
+        min5p=args.min5p, max5p=args.max5p, min3p=args.min3p, max3p=args.max3p,
+        cap=args.cap, plot_title=args.plot_title,
+        plot_lengths=args.plot_lengths, rdata_save=args.rdata_save,
+        html_file=args.html_file, output_path=args.output_path)
+
+    logging.debug('Done!')
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/riboseqr/prepare.py	Tue Jul 21 14:48:39 2015 +0100
@@ -0,0 +1,214 @@
+#!/usr/bin/env python
+import os
+import sys
+import argparse
+import logging
+import rpy2.robjects as robjects
+
+import utils
+
+rscript = ''
+R = robjects.r
+
+
+def run_rscript(command=None):
+    """Run R command, log it, append to rscript"""
+    global rscript
+    if not command:
+        return
+    logging.debug(command)
+    rscript += '{}\n'.format(command)
+    output = R(command)
+    return output
+
+
+def prep_riboseqr_input(sam_file, output_file):
+    """Generate input file for riboSeqR from SAM format file."""
+    with open(output_file, 'w') as f:
+        for line in open(sam_file):
+            if line.startswith('@'):
+                continue
+            line = line.split()
+            flag = line[1]
+
+            if flag != '0':
+                continue
+            # make start 0-indexed, sam alignments are 1-indexed
+            start = int(line[3]) - 1
+            (name, sequence) = (line[2], line[9])
+            f.write('"+"\t"{0}"\t{1}\t"{2}"\n'.format(name, start, sequence))
+
+
+def batch_process(sam_files, seq_type, output_path):
+    """Batch process the conversion of SAM format files -> riboSeqR format
+    input files.
+
+    Files are saved with file names corresponding to their sequence type.
+
+    """
+    outputs = []
+    prefix = '{}'
+    if seq_type == 'riboseq':
+        prefix = 'RiboSeq file {}'
+    elif seq_type == 'rnaseq':
+        prefix = 'RNASeq file {}'
+
+    for count, fname in enumerate(sam_files):
+        count += 1
+        out_file = os.path.join(output_path, prefix.format(count))
+        logging.debug('Processing: {}'.format(fname))
+        logging.debug('Writing output to: {}'.format(out_file))
+        prep_riboseqr_input(fname, out_file)
+        outputs.append(out_file)
+    return outputs
+
+
+def generate_ribodata(ribo_files='', rna_files='', replicate_names='',
+                      seqnames='', rdata_save='Prepare.rda', sam_format=True,
+                      html_file='Prepare-report.html', output_path=os.getcwd()):
+    """Prepares Ribo and RNA seq data in the format required for riboSeqR. Calls
+    the readRibodata function of riboSeqR and saves the result objects in an
+    R data file which can be used as input for the next step.
+
+    """
+    input_ribo_files = utils.process_args(ribo_files, ret_mode='list')
+    logging.debug('Found {} Ribo-Seq files'.format(len(input_ribo_files)))
+    logging.debug(input_ribo_files)
+
+    input_rna_files = []
+    if rna_files:
+        input_rna_files = utils.process_args(rna_files, ret_mode='list')
+        logging.debug('Found {} RNA-Seq files'.format(len(input_rna_files)))
+        logging.debug(input_rna_files)
+
+    replicates = utils.process_args(replicate_names, ret_mode='charvector')
+    logging.debug('Replicates: {}\n'.format(replicates))
+
+    if sam_format:
+        ribo_seq_files = batch_process(input_ribo_files, 'riboseq', output_path)
+    else:
+        ribo_seq_files = input_ribo_files
+
+    html = '<h2>Prepare riboSeqR input - results</h2><hr>'
+    if len(ribo_seq_files):
+        html += '<h4>Generated riboSeqR format input files ' \
+                '<em>(RiboSeq)</em></h4><p>'
+        for fname in ribo_seq_files:
+            html += '<a href="{0}">{0}</a><br>'.format(
+                os.path.basename(fname))
+        html += '</p>'
+
+    rna_seq_files = []
+    if len(input_rna_files):
+        if sam_format:
+            rna_seq_files = batch_process(
+                input_rna_files, 'rnaseq', output_path)
+        else:
+            rna_seq_files = input_rna_files
+
+    if len(rna_seq_files):
+        html += ('<h4>Generated riboSeqR format input files '
+                 '<em>(RNASeq)</em></h4><p>')
+        for fname in rna_seq_files:
+            html += '<a href="{0}">{0}</a><br>'.format(
+                os.path.basename(fname))
+        html += '</p>'
+
+    input_seqnames = utils.process_args(seqnames, ret_mode='charvector')
+    options = {'ribo_seq_files': 'c({})'.format(str(ribo_seq_files)[1:-1]),
+               'rna_seq_files': 'c({})'.format(str(rna_seq_files)[1:-1]),
+               'input_replicates': replicates,
+               'input_seqnames': input_seqnames}
+
+    logging.debug('{}'.format(R('sessionInfo()')))
+
+    script = ''
+    cmd = 'suppressMessages(library(riboSeqR))'
+    run_rscript(cmd)
+    script += '{}\n'.format(cmd)
+
+    if len(rna_seq_files):
+        cmd_args = ('riboFiles={ribo_seq_files}, '
+                    'rnaFiles={rna_seq_files}'.format(**options))
+    else:
+        cmd_args = 'riboFiles={ribo_seq_files}'.format(**options)
+
+    if input_seqnames:
+        cmd_args += ', seqnames={input_seqnames}'.format(**options)
+    if replicates:
+        cmd_args += ', replicates={input_replicates}'.format(**options)
+    else:
+        cmd_args += ', replicates=c("")'
+    cmd = 'riboDat <- readRibodata({0})'.format(cmd_args)
+    run_rscript(cmd)
+    script += '{}\n'.format(cmd)
+
+    ribo_data = R['riboDat']
+    logging.debug('riboDat \n{}\n'.format(ribo_data))
+    cmd = 'save("riboDat", file="{}", compress=FALSE)'.format(rdata_save)
+    run_rscript(cmd)
+    script += '{}\n'.format(cmd)
+
+    msg = '\n{:#^80}\n{}\n{:#^80}\n'.format(
+        ' R script for this session ', script, ' End R script ')
+    logging.debug(msg)
+
+    with open(os.path.join(output_path, 'prepare.R'), 'w') as r:
+        r.write(script)
+
+    html += ('<h4>R script for this session</h4>'
+             '<p><a href="prepare.R">prepare.R</a></p>'
+             '<p>Next step: <em>Triplet periodicity</em></p>')
+
+    with open(html_file, 'w') as f:
+        f.write(html)
+
+    return ribo_data
+
+
+if __name__ == '__main__':
+
+    description = (
+        'Prepare riboSeqR input file from SAM format RNA/Ribo-Seq alignment.')
+    parser = argparse.ArgumentParser(description=description)
+
+    # required arguments
+    flags = parser.add_argument_group('required arguments')
+    flags.add_argument('--ribo_files', required=True,
+                       help='List of Ribo-Seq files, comma-separated')
+    # optional arguments
+    parser.add_argument('--rna_files',
+                        help='List of RNA-Seq files. Comma-separated')
+    parser.add_argument('--replicate_names',
+                        help='Replicate names, comma-separated')
+    parser.add_argument('--seqnames',
+                        help='Transcript (seqname) names to be read')
+    parser.add_argument('--rdata_save',
+                        help='File to write RData to (default: %(default)s)',
+                        default='Prepare.rda')
+    parser.add_argument(
+        '--sam_format',
+        help='Flag. Input is in SAM format', action='store_true')
+    parser.add_argument('--html_file', help='Output file for results (HTML)')
+    parser.add_argument('--output_path',
+                        help='Files are saved in this directory')
+    parser.add_argument('--debug', help='Flag. Produce debug output',
+                        action='store_true')
+    args = parser.parse_args()
+
+    if args.debug:
+        logging.basicConfig(format='%(module)s: %(levelname)s - %(message)s',
+                            level=logging.DEBUG, stream=sys.stdout)
+        logging.debug('Supplied Arguments: {}'.format(vars(args)))
+
+    if not os.path.exists(args.output_path):
+        os.mkdir(args.output_path)
+
+    generate_ribodata(
+        ribo_files=args.ribo_files, rna_files=args.rna_files,
+        replicate_names=args.replicate_names, seqnames=args.seqnames,
+        rdata_save=args.rdata_save,
+        sam_format=args.sam_format, html_file=args.html_file,
+        output_path=args.output_path
+    )
+    logging.debug('Done')
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/riboseqr/ribosome_profile.py	Tue Jul 21 14:48:39 2015 +0100
@@ -0,0 +1,127 @@
+#!/usr/bin/env python
+import os
+import sys
+import glob
+import argparse
+import logging
+import rpy2.robjects as robjects
+import utils
+
+rscript = ''
+R = robjects.r
+
+
+def run_rscript(command=None):
+    """Run R command, log it, append to rscript"""
+    global rscript
+    if not command:
+        return
+    logging.debug(command)
+    rscript += '{}\n'.format(command)
+    msg = R(command)
+
+
+def plot_transcript(rdata_load='Metagene.rda', transcript_name='',
+                    transcript_length='27', transcript_cap='',
+                    html_file='Plot-ribosome-profile.html',
+                    output_path=os.getcwd()):
+    """Plot ribosome profile for a given transcript. """
+    options = {}
+    for key, value, rtype, rmode in (
+            ('transcript_name', transcript_name, 'str', None),
+            ('transcript_length', transcript_length, 'int', 'charvector'),
+            ('transcript_cap', transcript_cap, 'int', None)):
+        options[key] = utils.process_args(value, ret_type=rtype, ret_mode=rmode)
+
+    run_rscript('suppressMessages(library(riboSeqR))')
+    run_rscript('load("{}")'.format(rdata_load))
+
+    html = """<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2//EN">
+    <html>
+    <head>
+    <title>Ribosome Profile Plot - Report</title>
+    </head>
+    <body>
+    """
+    html += '<h2>Plot ribosome profile - results</h2>\n<hr>\n'
+    if len(transcript_name):
+        cmd_args = (
+            '"{transcript_name}", main="{transcript_name}",'
+            'coordinates=ffCs@CDS, riboData=riboDat,'
+            'length={transcript_length}'.format(**options))
+        if transcript_cap:
+            cmd_args += ', cap={transcript_cap}'.format(**options)
+        plot_file = os.path.join(output_path, 'Ribosome-profile-plot')
+        for fmat in ('pdf', 'png'):
+            if fmat == 'png':
+                cmd = 'png(file="{}_%1d.png", type="cairo")'.format(plot_file)
+            else:
+                cmd = 'pdf(file="{}.pdf")'.format(plot_file)
+            run_rscript(cmd)
+            cmd = 'plotTranscript({})'.format(cmd_args)
+            run_rscript(cmd)
+            run_rscript('dev.off()')
+
+        html += ('<p>Selected ribosome footprint length: '
+                 '<strong>{0}</strong>\n'.format(transcript_length))
+
+        for image in sorted(glob.glob('{}_*.png'.format(plot_file))):
+            html += '<p><img border="1" src="{0}" alt="{0}"></p>\n'.format(
+                os.path.basename(image))
+        html += '<p><a href="Ribosome-profile-plot.pdf">PDF version</a></p>\n'
+    else:
+        msg = 'No transcript name was provided. Did not generate plot.'
+        html += '<p>{}</p>'.format(msg)
+        logging.debug(msg)
+
+    logging.debug('\n{:#^80}\n{}\n{:#^80}\n'.format(
+        ' R script for this session ', rscript, ' End R script '))
+
+    with open(os.path.join(output_path, 'ribosome-profile.R'), 'w') as r:
+        r.write(rscript)
+
+    html += ('<h4>R script for this session</h4>\n'
+             '<p><a href="ribosome-profile.R">ribosome-profile.R</a></p>\n'
+             '</body>\n</html>\n')
+
+    with open(html_file, 'w') as f:
+        f.write(html)
+
+
+if __name__ == '__main__':
+    parser = argparse.ArgumentParser(description='Plot Ribosome profile')
+
+    # required arguments
+    flags = parser.add_argument_group('required arguments')
+    flags.add_argument('--rdata_load', required=True,
+                       help='Saved riboSeqR data from Step 2')
+    flags.add_argument('--transcript_name', required=True,
+                       help='Name of the transcript to be plotted')
+    flags.add_argument(
+        '--transcript_length', required=True,
+        help='Size class of ribosome footprint data to be plotted',
+        default='27')
+    flags.add_argument(
+        '--transcript_cap', required=True,
+        help=('Cap on the largest value that will be plotted as an abundance '
+              'of the ribosome footprint data'))
+    parser.add_argument('--html_file', help='HTML file with reports')
+    parser.add_argument('--output_path', help='Directory to save output files')
+    parser.add_argument('--debug', help='Produce debug output',
+                        action='store_true')
+
+    args = parser.parse_args()
+    if args.debug:
+        logging.basicConfig(format='%(levelname)s - %(message)s',
+                            level=logging.DEBUG, stream=sys.stdout)
+        logging.debug('Supplied Arguments\n{}\n'.format(vars(args)))
+
+    if not os.path.exists(args.output_path):
+        os.mkdir(args.output_path)
+
+    plot_transcript(rdata_load=args.rdata_load,
+                    transcript_name=args.transcript_name,
+                    transcript_length=args.transcript_length,
+                    transcript_cap=args.transcript_cap,
+                    html_file=args.html_file, output_path=args.output_path)
+    logging.debug('Done!')
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/riboseqr/triplet.py	Tue Jul 21 14:48:39 2015 +0100
@@ -0,0 +1,158 @@
+#!/usr/bin/env python
+import os
+import sys
+import argparse
+import logging
+import rpy2.robjects as robjects
+
+import utils
+
+rscript = ''
+R = robjects.r
+
+
+def run_rscript(command=None):
+    """Run R command, log it, append to rscript"""
+    global rscript
+    if not command:
+        return
+    logging.debug(command)
+    rscript += '{}\n'.format(command)
+    output = R(command)
+    return output
+
+
+def find_periodicity(
+        rdata_load='Prepare.rda', start_codons='ATG', stop_codons='TAG,TAA,TGA',
+        fasta_file=None, include_lengths='25:30', analyze_plot_lengths='26:30',
+        text_legend='Frame 0, Frame 1, Frame 2', rdata_save='Periodicity.rda',
+        html_file='Periodicity-report.html', output_path=os.getcwd()):
+    """Plot triplet periodicity from prepared R data file. """
+    logging.debug('{}'.format(R('sessionInfo()')))
+    cmd = 'suppressMessages(library(riboSeqR))'
+    run_rscript(cmd)
+
+    logging.debug('Loading saved R data file')
+    cmd = 'load("{}")'.format(rdata_load)
+    run_rscript(cmd)
+
+    # R("""options(showTailLines=Inf)""")
+    starts, stops = (utils.process_args(start_codons, ret_mode='charvector'),
+                     utils.process_args(stop_codons, ret_mode='charvector'))
+
+    cmd = ('fastaCDS <- findCDS(fastaFile={0!r}, startCodon={1}, '
+           'stopCodon={2})'.format(fasta_file, starts, stops))
+    run_rscript(cmd)
+
+    logging.debug('Potential coding sequences using start codon (ATG) and '
+                  'stop codons TAG, TAA, TGA')
+    logging.debug('{}\n'.format(R['fastaCDS']))
+
+    cmd = """fCs <- frameCounting(riboDat, fastaCDS, lengths={0})
+    fS <- readingFrame(rC=fCs, lengths={1}); fS""".\
+        format(include_lengths, analyze_plot_lengths)
+    run_rscript(cmd)
+
+    logging.debug('riboDat \n{}\n'.format(R['riboDat']))
+    logging.debug('fCs\n{0}\n'.format(R['fCs']))
+    logging.debug('Reading frames for each n-mer\n{}'.format(R['fS']))
+
+    legend = utils.process_args(text_legend, ret_mode='charvector')
+
+    for fmat in ('pdf', 'png'):
+        if fmat == 'png':
+            cmd = '{0}(file="{1}", type="cairo")'
+        else:
+            cmd = '{0}(file="{1}")'
+        run_rscript(cmd.format(fmat, os.path.join(
+            output_path, '{0}.{1}'.format('Periodicity-plot', fmat))))
+        run_rscript('plotFS(fS, legend.text = {0})'.format(legend))
+        run_rscript('dev.off()')
+
+    run_rscript('save("fCs", "fS", "riboDat", "fastaCDS", '
+                'file="{}", compress=FALSE)'.format(rdata_save))
+
+    html = '<h2>Triplet periodicity - results</h2><hr>'
+    html += ('<h4>Results of reading frame analysis</h4>'
+             '<pre>{}</pre><br>'.format(R['fS']))
+    html += ('<p>Lengths used for reading frame analysis - <code>{0}</code>'
+             '<br>Lengths selected for the plot - <code>{1}</code>'
+             '</p>'.format(include_lengths, analyze_plot_lengths))
+    html += ('<p><img src="Periodicity-plot.png" border="1" '
+             'alt="Triplet periodicity plot" />'
+             '<br><a href="Periodicity-plot.pdf">PDF version</a></p>')
+
+    logging.debug('\n{:#^80}\n{}\n{:#^80}\n'.format(
+        ' R script for this session ', rscript, ' End R script '))
+
+    with open(os.path.join(output_path, 'periodicity.R'), 'w') as r:
+        r.write(rscript)
+
+    html += ('<h4>R script for this session</h4>'
+             '<p><a href="periodicity.R">periodicity.R</a></p>'
+             '<p>Next step: <em>Metagene analysis</em></p>')
+
+    with open(html_file, 'w') as f:
+        f.write(html)
+
+
+if __name__ == '__main__':
+    
+    parser = argparse.ArgumentParser(
+        description='Plot triplet periodicity for different read lengths.')
+    
+    # required arguments
+    flags = parser.add_argument_group('required arguments')
+    flags.add_argument(
+        '--rdata_load', required=True,
+        help='riboSeqR data from input preparation step (Prepare.rda)')
+    flags.add_argument(
+        '--fasta_file', required=True,
+        help='FASTA file of the reference transcriptome')
+    
+    # optional arguments
+    parser.add_argument(
+        '--start_codons',
+        help='Start codon(s) to use. (default: %(default)s)', default='ATG')
+    parser.add_argument(
+        '--stop_codons', help='Stop codon(s) to use. (default: %(default)s)',
+        default='TAG, TAA, TGA')
+    parser.add_argument(
+        '--include_lengths',
+        help='Lengths of ribosome footprints to be included '
+             '(default: %(default)s)', default='25:30')
+    parser.add_argument(
+        '--analyze_plot_lengths',
+        help='Lenghts of reads to be analyzed for frame-shift and plotting '
+             '(default: %(default)s)', default='26:30')
+    parser.add_argument(
+        '--text_legend',
+        help='Text for legend used in the plot (default: %(default)s)',
+        default='Frame 0, Frame 1, Frame 2')
+    parser.add_argument(
+        '--rdata_save', help='File to write RData to (default: %(default)s)',
+        default='Periodicity.rda')
+    parser.add_argument('--html_file', help='Output file for results (HTML)')
+    parser.add_argument('--output_path',
+                        help='Files are saved in this directory')
+    parser.add_argument(
+        '--debug', help='Produce debug output', action='store_true')
+    
+    args = parser.parse_args()
+    if args.debug:
+        logging.basicConfig(format='%(module)s: %(levelname)s - %(message)s',
+                            level=logging.DEBUG, stream=sys.stdout)
+        logging.debug('Supplied Arguments\n{}\n'.format(vars(args)))
+
+    if not os.path.exists(args.output_path):
+        os.mkdir(args.output_path)
+
+    find_periodicity(
+        rdata_load=args.rdata_load, start_codons=args.start_codons,
+        stop_codons=args.stop_codons, fasta_file=args.fasta_file,
+        include_lengths=args.include_lengths,
+        analyze_plot_lengths=args.analyze_plot_lengths,
+        text_legend=args.text_legend,
+        rdata_save=args.rdata_save, html_file=args.html_file,
+        output_path=args.output_path)
+logging.debug("Done!")
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/riboseqr/utils.py	Tue Jul 21 14:48:39 2015 +0100
@@ -0,0 +1,93 @@
+"""Common functions"""
+
+
+def process_args(args, ret_type='str', ret_mode=None):
+    """Split arguments (only strings) on comma, return in requested
+
+    ret_type
+        (str, int or bool)
+
+    as
+
+    ret_mode
+        char vector - c(1,2,3)
+        list vector - list(1,2,3)
+        list - as list [1,2,3]
+        None  - same as input
+
+    """
+    if args:
+        all_args = [item.strip() for item in args.split(',')]
+    else:
+        all_args = []
+    num_options = len(all_args)
+    if num_options == 1 and len(all_args[0]):
+        if ret_type == 'int':
+            option = int(all_args[0])
+            if ret_mode == 'charvector':
+                # if option:
+                return 'c({})'.format(option)
+                # else:
+                #     return 'c()'
+            elif ret_mode == 'listvector':
+                # if option:
+                return 'list({})'.format(option)
+                # else:
+                #     return 'list()'
+            elif ret_mode == 'list':
+                # if option:
+                return [option]
+                # else:
+                #     return []
+            else:
+                return option
+        else:
+            # str, bool
+            option = all_args[0]
+            if ret_mode == 'charvector':
+                # if len(option):
+                return 'c("{}")'.format(option)
+                # else:
+                #     return 'c("")'
+            elif ret_mode == 'listvector':
+                # if option:
+                return 'list("{}")'.format(option)
+                # else:
+                #     return 'list("")'
+            elif ret_mode == 'list':
+                # if option:
+                return [option]
+                # else:
+                #     return []
+            else:
+                return option
+    elif num_options > 1:
+        if ret_type == 'int':
+            options = tuple([int(item) for item in all_args])
+            if ret_mode == 'charvector':
+                # if len(options):
+                return 'c{}'.format(options)
+                # else:
+                #     return 'c()'
+            elif ret_mode == 'listvector':
+                # if len(options):
+                return 'list{}'.format(options)
+                # else:
+                #     return 'list()'
+            elif ret_mode == 'list':
+                return list(options)
+        else:
+            options = tuple(all_args)
+            if ret_mode == 'charvector':
+                # if len(options):
+                return 'c{}'.format(options)
+            elif ret_mode == 'listvector':
+                return 'list{}'.format(options)
+            elif ret_mode == 'list':
+                # if len(all_args):
+                return all_args
+                # else:
+                #     return []
+            else:
+                # as original with spaces stripped
+                return ','.join(all_args)
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/ribosome_profile.xml	Tue Jul 21 14:48:39 2015 +0100
@@ -0,0 +1,128 @@
+<tool id="riboseqr_ribosome_profile" name="Plot Ribosome Profile"
+      version="0.4.0">
+    <description>
+        (Step 4) Plot Ribosome profile using riboSeqR.
+    </description>
+    <requirements>
+      <requirement type="package" version="3.1.2">R</requirement>
+      <requirement type="package" version="6.2">readline</requirement>
+      <requirement type="package" version="2.3.10">rpy2</requirement>
+      <requirement type="package" version="0.4.0">riboseqr_wrapper_deps</requirement>
+    </requirements>
+    <stdio>
+        <exit_code range="1:"  level="fatal" description="Error" />
+    </stdio>
+    <command interpreter="python">riboseqr/ribosome_profile.py
+        --rdata_load "$rdata_load"
+        --transcript_name "$transcript_name"
+        --transcript_length "$transcript_length"
+        --transcript_cap "$transcript_cap"
+        --html_file "$html_file"
+        --output_path "$html_file.files_path"
+    </command>
+    <inputs>
+        <param name="rdata_load" format="rda" type="data"
+               label="Select Metagene analysis (R data file)"
+               multiple="false">
+            <validator type="expression"
+                       message="Please check if the correct RDA file is selected">value.name == "Metagene analysis (R data file)"</validator>
+        </param>
+
+        <param name="transcript_name" type="text" size="30"
+               label="Name of the transcript to be plotted"
+               help="Plot of ribosome footprint abundance and mRNA coverage
+               (if available) for a specific transcript.">
+            <validator type="expression" message="Please input a single transcript name">len(value.split(',')) == 1</validator>
+            <validator type="empty_field" message="Field requires a value"/>
+            <sanitizer>
+                <valid>
+                    <add value="|"/>
+                </valid>
+            </sanitizer>
+        </param>
+
+        <param name="transcript_length" type="text"
+               label="Select Ribosome footprint length"
+               help="" value="28">
+            <validator type="expression" message="Please input a single footprint length">len(value.split(',')) == 1</validator>
+            <validator type="expression" message="Please input a single footprint length">len(value.split(' ')) == 1</validator>
+            <validator type="empty_field" message="Field requires a value"/>
+        </param>
+
+        <param name="transcript_cap" type="integer" value="200"
+               label="Cap on the largest value that will be plotted as an
+               abundance of the ribosome footprint data"/>
+    </inputs>
+    <outputs>
+        <data format="html" name="html_file"
+              label="Plot ribosome profile (HTML report)"/>
+    </outputs>
+    <!--<tests>-->
+        <!--<test>-->
+            <!--<param name="rdata_load" value="Metagene analysis (R data file)"/>-->
+            <!--<param name="transcript_name" value="CUFF.37930.1"/>-->
+            <!--<param name="transcript_length" value="27"/>-->
+            <!--<param name="transcript_cap" value="200"/>-->
+            <!--<output name="html_file" file="Plot_ribosome_profile_(HTML_report).html">-->
+              <!--<extra_files type="file" name="Ribosome-profile-plot.pdf" value="Ribosome-profile-plot.pdf" ftype="pdf"/>-->
+              <!--<extra_files type="file" name="Ribosome-profile-plot_1.png" value="Ribosome-profile-plot_1.png" ftype="png"/>-->
+              <!--<extra_files type="file" name="Ribosome-profile-plot_2.png" value="Ribosome-profile-plot_2.png" ftype="png"/>-->
+              <!--<extra_files type="file" name="Ribosome-profile-plot_3.png" value="Ribosome-profile-plot_3.png" ftype="png"/>-->
+              <!--<extra_files type="file" name="Ribosome-profile-plot_4.png" value="Ribosome-profile-plot_4.png" ftype="png"/>-->
+            <!--</output>-->
+        <!--</test>-->
+    <!--</tests>-->
+    <help>
+Plot Ribosome Profile
+---------------------
+riboSeqR version: ``1.0.5``.
+
+This tool can be used for generating a ribosome profile plot for a given
+transcript. The input is the R data file from the previous
+step - Metagene analysis.
+
+How to use?
+-----------
+Inputs
+......
+Select *Metagene analysis (R data file)* from the previous step, enter name
+of the transcript to plot, review/change parameters and execute program.
+
+.. class:: warningmark
+
+        The transcript name should correspond to names used in SAM alignments
+        and FASTA file of the transcriptome.
+
+Outputs
+.......
+The following files will be generated on completion:
+
+Plot ribosome profile (HTML report)
+
+A HTML file with results and links to other output files - plot for the
+specified transcript (PDF) and R script used for the session.
+
+riboSeqR functions used
+.......................
+``plotTranscript``.
+
+For detailed description of the functions and the options used, please consult
+the riboSeqR documentation.
+
+Links
+.....
+`riboSeqR &lt;http://bioconductor.org/packages/3.0/bioc/html/riboSeqR.html&gt;`_.
+
+    </help>
+    <citations>
+        <citation type="bibtex">
+            @Manual{,
+            title = {riboSeqR: Analysis of sequencing data from ribosome
+            profiling experiments.},
+            author = {Thomas J. Hardcastle},
+            year = {2014},
+            note = {R package version 1.0.5},
+            }
+        </citation>
+    </citations>
+</tool>
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/tests/__init__.py	Tue Jul 21 14:48:39 2015 +0100
@@ -0,0 +1,1 @@
+# -*- coding: utf-8 -*-
\ No newline at end of file
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/tests/test_riboseqr.py	Tue Jul 21 14:48:39 2015 +0100
@@ -0,0 +1,58 @@
+"""riboSeqR Galaxy unit tests"""
+import unittest
+from riboseqr import utils
+
+
+class PrepareTestCase(unittest.TestCase):
+
+    def test_process_args(self):
+        """Test processing arguments. """
+        # "ATG" -> c("ATG")
+        rs = utils.process_args('ATG', ret_mode='charvector')
+        self.assertEqual(rs, 'c("ATG")','Return string as a character vector.')
+
+        # stop codons "TAG,TAA,TGA" -> c("TAG", "TAA", "TGA"). Also
+        # replicate names, seqnames.
+        rs = utils.process_args('TAG,TAA,TGA', ret_mode='charvector')
+        self.assertEqual(
+            rs, "c('TAG', 'TAA', 'TGA')",
+            'Return comma separated strings as a character vector.')
+
+        # "" -> None
+        rs = utils.process_args('')
+        self.assertIsNone(rs, 'Return empty string as None.')
+
+        # "27,28" -> c(27, 28)
+        rs = utils.process_args("27,28", ret_type='int', ret_mode='charvector')
+        self.assertEqual(
+            rs, "c(27, 28)", 'Return number strings as a character vector.')
+
+        # "27,28" -> [27, 28]
+        rs = utils.process_args("27,28", ret_type='int', ret_mode='list')
+        self.assertEqual(rs, [27, 28], 'Return number strings as a list.')
+
+        # "0,2" -> list(0,2)
+        rs = utils.process_args("0,2", ret_type='int', ret_mode='listvector')
+        self.assertEqual(
+            rs, "list(0, 2)", 'Return number strings as a list vector.')
+
+        # "50" -> 50
+        rs = utils.process_args("50", ret_type='int')
+        self.assertEqual(rs, 50, 'Return number string as a number.')
+
+        # "-200" -> -200
+        rs = utils.process_args("-200", ret_type='int')
+        self.assertEqual(rs, -200, 'Return number string as a number.')
+
+        # "TRUE" -> TRUE
+        rs = utils.process_args("TRUE", ret_type='bool')
+        self.assertEqual(rs, 'TRUE', 'Return bool string as bool.')
+
+        # 'chlamy17,chlamy3' -> 'chlamy17,chlamy3' for ribo and rna names
+        rs = utils.process_args('chlamy17,chlamy3')
+        self.assertEqual(rs, 'chlamy17,chlamy3', 'Return csv string as string.')
+
+        # 'chlamy17.idx, chlamy3.idx' -> ['chlamy17.idx', 'chlamy3.idx']
+        rs = utils.process_args('chlamy17.idx, chlamy3.idx', ret_mode='list')
+        self.assertEqual(rs, ['chlamy17.idx', 'chlamy3.idx'],
+                         'Return files as a list.')
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/tool_dependencies.xml	Tue Jul 21 14:48:39 2015 +0100
@@ -0,0 +1,100 @@
+<?xml version="1.0"?>
+<tool_dependency>
+    <package name="R" version="3.1.2">
+        <repository changeset_revision="9f2fddb9d6e2" name="package_r_3_1_2" 
+                    owner="iuc" prior_installation_required="True" 
+                    toolshed="http://toolshed.g2.bx.psu.edu" />
+    </package>
+    <package name="readline" version="6.2">
+        <repository changeset_revision="52d6f646b3b5" name="package_readline_6_2" 
+                    owner="iuc" prior_installation_required="True" 
+                    toolshed="http://toolshed.g2.bx.psu.edu" />
+    </package>
+    <package name="rpy2" version="2.3.10">
+        <install version="1.0">
+            <actions>
+                <action type="download_by_url">
+                    https://pypi.python.org/packages/source/r/rpy2/rpy2-2.3.10.tar.gz
+                </action>
+                <action type="set_environment_for_install">
+                    <repository changeset_revision="52d6f646b3b5" name="package_readline_6_2" 
+                                owner="iuc" toolshed="http://toolshed.g2.bx.psu.edu">
+                        <package name="readline" version="6.2" />
+                    </repository>
+                    <repository changeset_revision="9f2fddb9d6e2" name="package_r_3_1_2" 
+                                owner="iuc" toolshed="http://toolshed.g2.bx.psu.edu">
+                        <package name="R" version="3.1.2" />
+                    </repository>
+                </action>
+                <action type="shell_command">
+                    export PYTHONPATH=$PYTHONPATH:$INSTALL_DIR/lib/python &amp;&amp;
+                    export LDFLAGS="-L$READLINE_LIB_PATH -lreadline" &amp;&amp;
+                    export CPPFLAGS="-I$READLINE_INCLUDE_PATH" &amp;&amp;
+                    python setup.py build --r-home $R_HOME --r-home-lib $R_HOME/lib install --install-lib $INSTALL_DIR/lib/python
+                </action>
+                <action type="set_environment">
+                    <environment_variable action="append_to" name="PYTHONPATH">
+                        $INSTALL_DIR/lib/python
+                    </environment_variable>
+                    <environment_variable action="append_to" name="LD_LIBRARY_PATH">
+                        $ENV[READLINE_LIB_PATH]
+                    </environment_variable>
+                    <environment_variable action="append_to" name="LD_LIBRARY_PATH">
+                        $ENV[RHOME]/lib
+                    </environment_variable>
+                    <environment_variable action="append_to" name="LD_LIBRARY_PATH">
+                        $ENV[R_ROOT_DIR]/lib
+                    </environment_variable>
+                </action>
+            </actions>
+        </install>
+        <readme>rpy2 is a redesign and rewrite of rpy. It is providing a low-level interface to R, 
+            a proposed high-level interface, including wrappers to graphical libraries, 
+            as well as R-like structures and functions.
+            
+            http://rpy.sourceforge.net/rpy2.html
+        </readme>
+    </package>
+    <package name="riboseqr_wrapper_deps" version="0.4.0">
+        <install version="1.0">
+            <actions>
+                <action type="setup_r_environment">
+                    <repository changeset_revision="9f2fddb9d6e2" name="package_r_3_1_2" 
+                                owner="iuc" toolshed="http://toolshed.g2.bx.psu.edu">
+                        <package name="R" version="3.1.2" />
+                    </repository>
+                    <package>
+                        https://github.com/vimalkumarvelayudhan/riboseqr_wrapper_deps/raw/master/0.4.0/BiocGenerics_0.12.1.tar.gz?raw=true
+                    </package>
+                    <package>
+                        https://github.com/vimalkumarvelayudhan/riboseqr_wrapper_deps/raw/master/0.4.0/S4Vectors_0.4.0.tar.gz?raw=true
+                    </package>
+                    <package>
+                        https://github.com/vimalkumarvelayudhan/riboseqr_wrapper_deps/raw/master/0.4.0/IRanges_2.0.1.tar.gz?raw=true
+                    </package>
+                    <package>
+                        https://github.com/vimalkumarvelayudhan/riboseqr_wrapper_deps/raw/master/0.4.0/GenomeInfoDb_1.2.5.tar.gz?raw=true
+                    </package>
+                    <package>
+                        https://github.com/vimalkumarvelayudhan/riboseqr_wrapper_deps/raw/master/0.4.0/XVector_0.6.0.tar.gz?raw=true
+                    </package>
+                    <package>
+                        https://github.com/vimalkumarvelayudhan/riboseqr_wrapper_deps/raw/master/0.4.0/GenomicRanges_1.18.4.tar.gz?raw=true
+                    </package>
+                    <package>
+                        https://github.com/vimalkumarvelayudhan/riboseqr_wrapper_deps/raw/master/0.4.0/abind_1.4-3.tar.gz?raw=true
+                    </package>
+                    <package>
+                        https://github.com/vimalkumarvelayudhan/riboseqr_wrapper_deps/raw/master/0.4.0/baySeq_2.0.50.tar.gz?raw=true
+                    </package>
+                    <package>
+                        https://github.com/vimalkumarvelayudhan/riboseqr_wrapper_deps/raw/master/0.4.0/riboSeqR_1.0.5.tar.gz?raw=true
+                    </package>
+                </action>
+            </actions>
+        </install>
+        <readme>
+        Dependency installation for riboseqr_wrapper
+        </readme>
+    </package>
+</tool_dependency>
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/triplet.xml	Tue Jul 21 14:48:39 2015 +0100
@@ -0,0 +1,161 @@
+<tool id="riboseqr_triplet_periodicity" name="Triplet Periodicity"
+      version="0.4.0">
+    <description>
+        (Step 2) Plot triplet periodicity for different read lengths.
+    </description>
+    <requirements>
+        <requirement type="package" version="3.1.2">R</requirement>
+        <requirement type="package" version="6.2">readline</requirement>
+        <requirement type="package" version="2.3.10">rpy2</requirement>
+        <requirement type="package" version="0.4.0">riboseqr_wrapper_deps</requirement>
+    </requirements>
+    <stdio>
+        <exit_code range="1:"  level="fatal" description="Error" />
+    </stdio>
+    <command interpreter="python">riboseqr/triplet.py
+        --rdata_load "$rdata_load"
+        --fasta_file "$fasta_file"
+        --start_codons "$start_codons"
+        --stop_codons "$stop_codons"
+        --include_lengths "$include_lengths"
+        --analyze_plot_lengths "$analyze_plot_lengths"
+        --text_legend "$text_legend"
+        --rdata_save "$rdata_save"
+        --html_file "$html_file"
+        --output_path "$html_file.files_path"
+    </command>
+    <inputs>
+        <param name="rdata_load" type="data" format="rda"
+               label="Select prepared riboSeqR input (R data file)"
+               multiple="false" optional="false"
+               help="&lt;br&gt;&lt;h4&gt;&lt;font color=&quot;#666666&quot;&gt;findCDS
+                     parameters&lt;/font&gt;&lt;/h4&gt;">
+            <validator type="expression"
+                       message="Please check if the correct RDA file is selected">value.name == "Prepare riboSeqR input (R data file)"</validator>
+        </param>
+
+        <param format="fasta" name="fasta_file" type="data"
+               label="Select FASTA file of the reference transcriptome"
+               help="This Fasta file is used to find start (ATG) and
+                     stop codons(TAG, TAA, TGA) in frame with one another. If no files
+                     are listed, you will need to upload a FASTA file of the reference
+                     transcriptome. Please note that the FASTA header of the transcriptome
+                     sequences must match the sequence names from the SAM file.">
+            <validator type="empty_field" message="Field requires a value"/>
+        </param>
+
+        <param name="start_codons" type="text" size="15" value="ATG"
+               label="Start codon(s) to use"
+               help="Default is ATG. Multiple values must be comma-separated.">
+            <validator type="empty_field" message="Field requires a value"/>
+        </param>
+
+        <param name="stop_codons" type="text" size="15" value="TAG, TAA, TGA"
+               label="Stop codon(s) to use"
+               help="Default is TAG, TAA, TGA. Multiple values must be comma-separated.
+                     &lt;h4&gt;&lt;font color=&quot;#666666&quot;&gt;frameCounting
+                     parameters&lt;/font&gt;&lt;/h4&gt;">
+            <validator type="empty_field" message="Field requires a value"/>
+        </param>
+
+        <param name="include_lengths" type="text"
+               label="Lengths of ribosome footprints
+                      to be included"
+               help="Here, you can specify the lengths of ribosome footprints to
+                     be included in the riboData object. &lt;br&gt;For example 25:30.
+                     Range, semi-colon separated. &lt;h4&gt;&lt;font color=&quot;#666666&quot;&gt;readingFrame
+                     parameters&lt;/font&gt;&lt;/h4&gt;" value="25:30"/>
+
+        <param name="analyze_plot_lengths" type="text"
+               label="Lengths of reads to be analysed for frame-shift, or to be plotted"
+               help="[Optional] If omitted, all lengths will be plotted.
+                     &lt;br&gt;For example 26:30. Range, semi-colon separated.
+                     &lt;h4&gt;&lt;font color=&quot;#666666&quot;&gt;plotFS
+                     parameters&lt;/font&gt;&lt;/h4&gt;" value="26:30"/>
+
+        <param name="text_legend" type="text" size="30"
+               value='Frame 0, Frame 1, Frame 2'
+               label="Text for legend used in the
+                      plot" help="Comma-separated values."/>
+    </inputs>
+    <outputs>
+        <data format="rda" name="rdata_save"
+              label="Triplet periodicity (R data file)"/>
+        <data format="html" name="html_file"
+              label="Triplet periodicity (HTML report)"/>
+    </outputs>
+    <tests>
+        <test>
+            <param name="rdata_load" value="Prepare riboSeqR input (R data file)" ftype="rda" />
+            <param name="fasta_file" value="rsem_chlamy236_deNovo.transcripts.fa" />
+            <param name="include_lengths" value="25:30" />
+            <param name="analyze_plot_lengths" value="26:30" />
+            <output name="rdata_save" file="Triplet periodicity (R data file)" ftype="rda" />
+            <output name="html_file" file="Triplet_periodicity_(HTML_report).html">
+                <extra_files type="file" name="Periodicity-plot.png" value="Periodicity-plot.png" ftype="png" />
+                <extra_files type="file" name="Periodicity-plot.pdf" value="Periodicity-plot.pdf" ftype="pdf" />
+            </output>
+        </test>
+    </tests>
+    <help>
+Triplet Periodicity
+-------------------
+riboSeqR version: ``1.0.5``.
+
+This tool can be used to plot triplet periodicity for different read lengths.
+The inputs for this program are:
+
+#. Prepare riboSeqR input (R data file) from the previous step.
+#. FASTA format file of the reference transcriptome.
+
+.. class:: infomark
+
+Please note that the FASTA header of the transcriptome sequences must
+match the sequence names from the SAM file.
+
+How to use?
+-----------
+Inputs
+......
+Select prepared R data file from the previous step.
+
+Select FASTA format file of the reference transcriptome, review/change other
+options if necessary and execute program.
+
+Outputs
+.......
+The following files will be generated on completion:
+
+#. Triplet periodicity (HTML report)
+
+A HTML file with results and links to other output files - triplet
+periodicity plot (PDF) and R script used for the session.
+
+#. Triplet periodicity (R data file)
+
+Used as input for the next step - *Metagene analysis*.
+
+riboSeqR functions used
+.......................
+``findCDS``, ``frameCounting``, ``readingFrame`` and ``plotFS``.
+
+For detailed description of the functions and the options used, please consult
+the riboSeqR documentation.
+
+Links
+.....
+`riboSeqR &lt;http://bioconductor.org/packages/3.0/bioc/html/riboSeqR.html&gt;`_.
+
+    </help>
+    <citations>
+        <citation type="bibtex">
+            @Manual{,
+            title = {riboSeqR: Analysis of sequencing data from ribosome
+            profiling experiments.},
+            author = {Thomas J. Hardcastle},
+            year = {2014},
+            note = {R package version 1.0.5},
+            }
+        </citation>
+    </citations>
+</tool>