Mercurial > repos > fubar > egapx_runner
changeset 2:a3b158471bd3 draft
planemo upload for repository https://github.com/ncbi/egapx commit 98875ef7eda9323fc9991970103954e9097d9e73
line wrap: on
line diff
--- a/LICENSE Sat Aug 03 12:10:13 2024 +0000 +++ b/LICENSE Sun Aug 04 00:06:43 2024 +0000 @@ -86,3 +86,25 @@ applicable to my particular case? A. Contact us. Send your questions to 'toolbox@ncbi.nlm.nih.gov'. + +MIT License + +Copyright (c) 2024 Ross Lazarus + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE.
--- a/egapx_runner.xml Sat Aug 03 12:10:13 2024 +0000 +++ b/egapx_runner.xml Sun Aug 04 00:06:43 2024 +0000 @@ -38,7 +38,130 @@ <help><![CDATA[ - **What it Does** +Galaxy tool wrapping the Eukaryotic Genome Annotation Pipeline (EGAPx) +================================================================================================= + +**A very simple and crude way to run the EGAPx workflows inside Galaxy** + +EGAPx requires huge resources to run with useful data. *128GB and 32 cores* are the minimum requirement; *256GB and 64 cores* are recommended. + +There is a special test minimal example that can be run in 6GB with 4 cores. + +In this implementation, the user can supply a yaml configuration file as initial proof of concept. + +Does not use computational resources as efficiently as converting the NF workflow components into Galaxy tools, but it is very simple to maintain until EGAPx becomes stable. It will also enable measurement of the actual loss of efficiency from the crude wrapping method. + + +Sample yaml configurations +=========================== + +YAML sample configurations can be uploaded into your Galaxy history from the `EGAPx github repository <https://github.com/ncbi/egapx/tree/main/examples/>`_. +The simplest possible example is shown below - can be cut/paste into a history dataset in the upload tool. + + +*./examples/input_D_farinae_small.yaml* is included in the examples linked above. RNA-seq data is provided as URI to the reads FASTA files. +These FASTA files are a sampling of the reads from the complete SRA read files to expedite testing. + +:: + + genome: https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/020/809/275/GCF_020809275.1_ASM2080927v1/GCF_020809275.1_ASM2080927v1_genomic.fna.gz + taxid: 6954 + reads: + - https://ftp.ncbi.nlm.nih.gov/genomes/TOOLS/EGAP/data/Dermatophagoides_farinae_small/SRR8506572.1 + - https://ftp.ncbi.nlm.nih.gov/genomes/TOOLS/EGAP/data/Dermatophagoides_farinae_small/SRR8506572.2 + - https://ftp.ncbi.nlm.nih.gov/genomes/TOOLS/EGAP/data/Dermatophagoides_farinae_small/SRR9005248.1 + - https://ftp.ncbi.nlm.nih.gov/genomes/TOOLS/EGAP/data/Dermatophagoides_farinae_small/SRR9005248.2 + + + +Purpose +======== + +**This is not intended for production** + +Just a proof of concept. +It is possibly too inefficient to be useful although it may turn out not to be a problem if run on a dedicated workstation. +At least the efficiency can now be more easily estimated. + +This tool is not recommended for public deployment because of the resource demands. + +EGAPx Overview +=============== + +.. image:: $PATH_TO_IMAGES/Pipeline_sm_ncRNA_CAGE_80pct.png + +**Warning:** +The current version is an alpha release with limited features and organism scope to collect initial feedback on execution. Outputs are not yet complete and not intended for production use. Please open a GitHub [Issue](https://github.com/ncbi/egapx/issues) if you encounter any problems with EGAPx. You can also write to cgr@nlm.nih.gov to give us your feedback or if you have any questions. + +EGAPx is the publicly accessible version of the updated NCBI [Eukaryotic Genome Annotation Pipeline](https://www.ncbi.nlm.nih.gov/genome/annotation_euk/process/). + +EGAPx takes an assembly fasta file, a taxid of the organism, and RNA-seq data. Based on the taxid, EGAPx will pick protein sets and HMM models. The pipeline runs `miniprot` to align protein sequences, and `STAR` to align RNA-seq to the assembly. Protein alignments and RNA-seq read alignments are then passed to `Gnomon` for gene prediction. In the first step of `Gnomon`, the short alignments are chained together into putative gene models. In the second step, these predictions are further supplemented by _ab-initio_ predictions based on HMM models. The final annotation for the input assembly is produced as a `gff` file. + +**Security Notice:** + +EGAPx has dependencies in and outside of its execution path that include several thousand files from the [NCBI C++ toolkit](https://www.ncbi.nlm.nih.gov/toolkit), and more than a million total lines of code. Static Application Security Testing has shown a small number of verified buffer overrun security vulnerabilities. Users should consult with their organizational security team on risk and if there is concern, consider mitigating options like running via VM or cloud instance. + + +*To specify an array of NCBI SRA datasets in yaml* + +:: + + reads: + - SRR8506572 + - SRR9005248 + + +*To specify an SRA entrez query* + +:: + + reads: 'txid6954[Organism] AND biomol_transcript[properties] NOT SRS024887[Accession] AND (SRR8506572[Accession] OR SRR9005248[Accession] )' + + +**Note:** Both the above examples will have more RNA-seq data than the `input_D_farinae_small.yaml` example. To make sure the entrez query does not produce a large number of SRA runs, please run it first at the [NCBI SRA page](https://www.ncbi.nlm.nih.gov/sra). If there are too many SRA runs, then select a few of them and list it in the input yaml. + +Output +======= + +EGAPx output will appear as a collection in the user history. The main annotation file is called *accept.gff*. + +:: + + accept.gff + annot_builder_output + nextflow.log + run.report.html + run.timeline.html + run.trace.txt + run_params.yaml + + +The *nextflow.log* is the log file that captures all the process information and their work directories. ``run_params.yaml`` has all the parameters that were used in the EGAPx run. More information about the process time and resources can be found in the other run* files. + +## Intermediate files + +In the log, each line denotes the process that completed in the workflow. The first column (_e.g._ `[96/621c4b]`) is the subdirectory where the intermediate output files and logs are found for the process in the same line, _i.e._, `egapx:miniprot:run_miniprot`. To see the intermediate files for that process, you can go to the work directory path that you had supplied and traverse to the subdirectory `96/621c4b`: + +:: + + $ aws s3 ls s3://temp_datapath/D_farinae/96/ + PRE 06834b76c8d7ceb8c97d2ccf75cda4/ + PRE 621c4ba4e6e87a4d869c696fe50034/ + $ aws s3 ls s3://temp_datapath/D_farinae/96/621c4ba4e6e87a4d869c696fe50034/ + PRE output/ + 2024-03-27 11:19:18 0 + 2024-03-27 11:19:28 6 .command.begin + 2024-03-27 11:20:24 762 .command.err + 2024-03-27 11:20:26 762 .command.log + 2024-03-27 11:20:23 0 .command.out + 2024-03-27 11:19:18 13103 .command.run + 2024-03-27 11:19:18 129 .command.sh + 2024-03-27 11:20:24 276 .command.trace + 2024-03-27 11:20:25 1 .exitcode + $ aws s3 ls s3://temp_datapath/D_farinae/96/621c4ba4e6e87a4d869c696fe50034/output/ + 2024-03-27 11:20:24 17127134 aligns.paf + + ]]></help> <citations> <citation type="doi">10.1093/bioinformatics/bts573</citation>
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/oldgit/COMMIT_EDITMSG Sun Aug 04 00:06:43 2024 +0000 @@ -0,0 +1,19 @@ +update help +considering adding some prepackaged yaml options? +currently point to samples in egapx github repo +# Please enter the commit message for your changes. Lines starting +# with '#' will be ignored, and an empty message aborts the commit. +# +# On branch main +# Your branch is up to date with 'origin/main'. +# +# Changes to be committed: +# new file: .shed.yml +# new file: EGAPXREADME.md +# modified: README.md +# new file: assets/default_task_params.yaml +# new file: egapx_runner.xml +# new file: static/images/Pipeline_sm_ncRNA_CAGE_80pct.png +# new file: test-data/yamlconfig_sample +# modified: ui/assets/config/executor/docker_minimal.config +#
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/oldgit/HEAD Sun Aug 04 00:06:43 2024 +0000 @@ -0,0 +1,1 @@ +ref: refs/heads/main
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/oldgit/config Sun Aug 04 00:06:43 2024 +0000 @@ -0,0 +1,5 @@ +[core] + repositoryformatversion = 0 + filemode = true + bare = false + logallrefupdates = true
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/oldgit/description Sun Aug 04 00:06:43 2024 +0000 @@ -0,0 +1,1 @@ +Unnamed repository; edit this file 'description' to name the repository.
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/oldgit/hooks/applypatch-msg.sample Sun Aug 04 00:06:43 2024 +0000 @@ -0,0 +1,15 @@ +#!/bin/sh +# +# An example hook script to check the commit log message taken by +# applypatch from an e-mail message. +# +# The hook should exit with non-zero status after issuing an +# appropriate message if it wants to stop the commit. The hook is +# allowed to edit the commit message file. +# +# To enable this hook, rename this file to "applypatch-msg". + +. git-sh-setup +commitmsg="$(git rev-parse --git-path hooks/commit-msg)" +test -x "$commitmsg" && exec "$commitmsg" ${1+"$@"} +:
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/oldgit/hooks/commit-msg.sample Sun Aug 04 00:06:43 2024 +0000 @@ -0,0 +1,24 @@ +#!/bin/sh +# +# An example hook script to check the commit log message. +# Called by "git commit" with one argument, the name of the file +# that has the commit message. The hook should exit with non-zero +# status after issuing an appropriate message if it wants to stop the +# commit. The hook is allowed to edit the commit message file. +# +# To enable this hook, rename this file to "commit-msg". + +# Uncomment the below to add a Signed-off-by line to the message. +# Doing this in a hook is a bad idea in general, but the prepare-commit-msg +# hook is more suited to it. +# +# SOB=$(git var GIT_AUTHOR_IDENT | sed -n 's/^\(.*>\).*$/Signed-off-by: \1/p') +# grep -qs "^$SOB" "$1" || echo "$SOB" >> "$1" + +# This example catches duplicate Signed-off-by lines. + +test "" = "$(grep '^Signed-off-by: ' "$1" | + sort | uniq -c | sed -e '/^[ ]*1[ ]/d')" || { + echo >&2 Duplicate Signed-off-by lines. + exit 1 +}
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/oldgit/hooks/fsmonitor-watchman.sample Sun Aug 04 00:06:43 2024 +0000 @@ -0,0 +1,174 @@ +#!/usr/bin/perl + +use strict; +use warnings; +use IPC::Open2; + +# An example hook script to integrate Watchman +# (https://facebook.github.io/watchman/) with git to speed up detecting +# new and modified files. +# +# The hook is passed a version (currently 2) and last update token +# formatted as a string and outputs to stdout a new update token and +# all files that have been modified since the update token. Paths must +# be relative to the root of the working tree and separated by a single NUL. +# +# To enable this hook, rename this file to "query-watchman" and set +# 'git config core.fsmonitor .git/hooks/query-watchman' +# +my ($version, $last_update_token) = @ARGV; + +# Uncomment for debugging +# print STDERR "$0 $version $last_update_token\n"; + +# Check the hook interface version +if ($version ne 2) { + die "Unsupported query-fsmonitor hook version '$version'.\n" . + "Falling back to scanning...\n"; +} + +my $git_work_tree = get_working_dir(); + +my $retry = 1; + +my $json_pkg; +eval { + require JSON::XS; + $json_pkg = "JSON::XS"; + 1; +} or do { + require JSON::PP; + $json_pkg = "JSON::PP"; +}; + +launch_watchman(); + +sub launch_watchman { + my $o = watchman_query(); + if (is_work_tree_watched($o)) { + output_result($o->{clock}, @{$o->{files}}); + } +} + +sub output_result { + my ($clockid, @files) = @_; + + # Uncomment for debugging watchman output + # open (my $fh, ">", ".git/watchman-output.out"); + # binmode $fh, ":utf8"; + # print $fh "$clockid\n@files\n"; + # close $fh; + + binmode STDOUT, ":utf8"; + print $clockid; + print "\0"; + local $, = "\0"; + print @files; +} + +sub watchman_clock { + my $response = qx/watchman clock "$git_work_tree"/; + die "Failed to get clock id on '$git_work_tree'.\n" . + "Falling back to scanning...\n" if $? != 0; + + return $json_pkg->new->utf8->decode($response); +} + +sub watchman_query { + my $pid = open2(\*CHLD_OUT, \*CHLD_IN, 'watchman -j --no-pretty') + or die "open2() failed: $!\n" . + "Falling back to scanning...\n"; + + # In the query expression below we're asking for names of files that + # changed since $last_update_token but not from the .git folder. + # + # To accomplish this, we're using the "since" generator to use the + # recency index to select candidate nodes and "fields" to limit the + # output to file names only. Then we're using the "expression" term to + # further constrain the results. + my $last_update_line = ""; + if (substr($last_update_token, 0, 1) eq "c") { + $last_update_token = "\"$last_update_token\""; + $last_update_line = qq[\n"since": $last_update_token,]; + } + my $query = <<" END"; + ["query", "$git_work_tree", {$last_update_line + "fields": ["name"], + "expression": ["not", ["dirname", ".git"]] + }] + END + + # Uncomment for debugging the watchman query + # open (my $fh, ">", ".git/watchman-query.json"); + # print $fh $query; + # close $fh; + + print CHLD_IN $query; + close CHLD_IN; + my $response = do {local $/; <CHLD_OUT>}; + + # Uncomment for debugging the watch response + # open ($fh, ">", ".git/watchman-response.json"); + # print $fh $response; + # close $fh; + + die "Watchman: command returned no output.\n" . + "Falling back to scanning...\n" if $response eq ""; + die "Watchman: command returned invalid output: $response\n" . + "Falling back to scanning...\n" unless $response =~ /^\{/; + + return $json_pkg->new->utf8->decode($response); +} + +sub is_work_tree_watched { + my ($output) = @_; + my $error = $output->{error}; + if ($retry > 0 and $error and $error =~ m/unable to resolve root .* directory (.*) is not watched/) { + $retry--; + my $response = qx/watchman watch "$git_work_tree"/; + die "Failed to make watchman watch '$git_work_tree'.\n" . + "Falling back to scanning...\n" if $? != 0; + $output = $json_pkg->new->utf8->decode($response); + $error = $output->{error}; + die "Watchman: $error.\n" . + "Falling back to scanning...\n" if $error; + + # Uncomment for debugging watchman output + # open (my $fh, ">", ".git/watchman-output.out"); + # close $fh; + + # Watchman will always return all files on the first query so + # return the fast "everything is dirty" flag to git and do the + # Watchman query just to get it over with now so we won't pay + # the cost in git to look up each individual file. + my $o = watchman_clock(); + $error = $output->{error}; + + die "Watchman: $error.\n" . + "Falling back to scanning...\n" if $error; + + output_result($o->{clock}, ("/")); + $last_update_token = $o->{clock}; + + eval { launch_watchman() }; + return 0; + } + + die "Watchman: $error.\n" . + "Falling back to scanning...\n" if $error; + + return 1; +} + +sub get_working_dir { + my $working_dir; + if ($^O =~ 'msys' || $^O =~ 'cygwin') { + $working_dir = Win32::GetCwd(); + $working_dir =~ tr/\\/\//; + } else { + require Cwd; + $working_dir = Cwd::cwd(); + } + + return $working_dir; +}
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/oldgit/hooks/post-update.sample Sun Aug 04 00:06:43 2024 +0000 @@ -0,0 +1,8 @@ +#!/bin/sh +# +# An example hook script to prepare a packed repository for use over +# dumb transports. +# +# To enable this hook, rename this file to "post-update". + +exec git update-server-info
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/oldgit/hooks/pre-applypatch.sample Sun Aug 04 00:06:43 2024 +0000 @@ -0,0 +1,14 @@ +#!/bin/sh +# +# An example hook script to verify what is about to be committed +# by applypatch from an e-mail message. +# +# The hook should exit with non-zero status after issuing an +# appropriate message if it wants to stop the commit. +# +# To enable this hook, rename this file to "pre-applypatch". + +. git-sh-setup +precommit="$(git rev-parse --git-path hooks/pre-commit)" +test -x "$precommit" && exec "$precommit" ${1+"$@"} +:
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/oldgit/hooks/pre-commit.sample Sun Aug 04 00:06:43 2024 +0000 @@ -0,0 +1,49 @@ +#!/bin/sh +# +# An example hook script to verify what is about to be committed. +# Called by "git commit" with no arguments. The hook should +# exit with non-zero status after issuing an appropriate message if +# it wants to stop the commit. +# +# To enable this hook, rename this file to "pre-commit". + +if git rev-parse --verify HEAD >/dev/null 2>&1 +then + against=HEAD +else + # Initial commit: diff against an empty tree object + against=$(git hash-object -t tree /dev/null) +fi + +# If you want to allow non-ASCII filenames set this variable to true. +allownonascii=$(git config --type=bool hooks.allownonascii) + +# Redirect output to stderr. +exec 1>&2 + +# Cross platform projects tend to avoid non-ASCII filenames; prevent +# them from being added to the repository. We exploit the fact that the +# printable range starts at the space character and ends with tilde. +if [ "$allownonascii" != "true" ] && + # Note that the use of brackets around a tr range is ok here, (it's + # even required, for portability to Solaris 10's /usr/bin/tr), since + # the square bracket bytes happen to fall in the designated range. + test $(git diff --cached --name-only --diff-filter=A -z $against | + LC_ALL=C tr -d '[ -~]\0' | wc -c) != 0 +then + cat <<\EOF +Error: Attempt to add a non-ASCII file name. + +This can cause problems if you want to work with people on other platforms. + +To be portable it is advisable to rename the file. + +If you know what you are doing you can disable this check using: + + git config hooks.allownonascii true +EOF + exit 1 +fi + +# If there are whitespace errors, print the offending file names and fail. +exec git diff-index --check --cached $against --
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/oldgit/hooks/pre-merge-commit.sample Sun Aug 04 00:06:43 2024 +0000 @@ -0,0 +1,13 @@ +#!/bin/sh +# +# An example hook script to verify what is about to be committed. +# Called by "git merge" with no arguments. The hook should +# exit with non-zero status after issuing an appropriate message to +# stderr if it wants to stop the merge commit. +# +# To enable this hook, rename this file to "pre-merge-commit". + +. git-sh-setup +test -x "$GIT_DIR/hooks/pre-commit" && + exec "$GIT_DIR/hooks/pre-commit" +:
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/oldgit/hooks/pre-push.sample Sun Aug 04 00:06:43 2024 +0000 @@ -0,0 +1,53 @@ +#!/bin/sh + +# An example hook script to verify what is about to be pushed. Called by "git +# push" after it has checked the remote status, but before anything has been +# pushed. If this script exits with a non-zero status nothing will be pushed. +# +# This hook is called with the following parameters: +# +# $1 -- Name of the remote to which the push is being done +# $2 -- URL to which the push is being done +# +# If pushing without using a named remote those arguments will be equal. +# +# Information about the commits which are being pushed is supplied as lines to +# the standard input in the form: +# +# <local ref> <local oid> <remote ref> <remote oid> +# +# This sample shows how to prevent push of commits where the log message starts +# with "WIP" (work in progress). + +remote="$1" +url="$2" + +zero=$(git hash-object --stdin </dev/null | tr '[0-9a-f]' '0') + +while read local_ref local_oid remote_ref remote_oid +do + if test "$local_oid" = "$zero" + then + # Handle delete + : + else + if test "$remote_oid" = "$zero" + then + # New branch, examine all commits + range="$local_oid" + else + # Update to existing branch, examine new commits + range="$remote_oid..$local_oid" + fi + + # Check for WIP commit + commit=$(git rev-list -n 1 --grep '^WIP' "$range") + if test -n "$commit" + then + echo >&2 "Found WIP commit in $local_ref, not pushing" + exit 1 + fi + fi +done + +exit 0
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/oldgit/hooks/pre-rebase.sample Sun Aug 04 00:06:43 2024 +0000 @@ -0,0 +1,169 @@ +#!/bin/sh +# +# Copyright (c) 2006, 2008 Junio C Hamano +# +# The "pre-rebase" hook is run just before "git rebase" starts doing +# its job, and can prevent the command from running by exiting with +# non-zero status. +# +# The hook is called with the following parameters: +# +# $1 -- the upstream the series was forked from. +# $2 -- the branch being rebased (or empty when rebasing the current branch). +# +# This sample shows how to prevent topic branches that are already +# merged to 'next' branch from getting rebased, because allowing it +# would result in rebasing already published history. + +publish=next +basebranch="$1" +if test "$#" = 2 +then + topic="refs/heads/$2" +else + topic=`git symbolic-ref HEAD` || + exit 0 ;# we do not interrupt rebasing detached HEAD +fi + +case "$topic" in +refs/heads/??/*) + ;; +*) + exit 0 ;# we do not interrupt others. + ;; +esac + +# Now we are dealing with a topic branch being rebased +# on top of master. Is it OK to rebase it? + +# Does the topic really exist? +git show-ref -q "$topic" || { + echo >&2 "No such branch $topic" + exit 1 +} + +# Is topic fully merged to master? +not_in_master=`git rev-list --pretty=oneline ^master "$topic"` +if test -z "$not_in_master" +then + echo >&2 "$topic is fully merged to master; better remove it." + exit 1 ;# we could allow it, but there is no point. +fi + +# Is topic ever merged to next? If so you should not be rebasing it. +only_next_1=`git rev-list ^master "^$topic" ${publish} | sort` +only_next_2=`git rev-list ^master ${publish} | sort` +if test "$only_next_1" = "$only_next_2" +then + not_in_topic=`git rev-list "^$topic" master` + if test -z "$not_in_topic" + then + echo >&2 "$topic is already up to date with master" + exit 1 ;# we could allow it, but there is no point. + else + exit 0 + fi +else + not_in_next=`git rev-list --pretty=oneline ^${publish} "$topic"` + /usr/bin/perl -e ' + my $topic = $ARGV[0]; + my $msg = "* $topic has commits already merged to public branch:\n"; + my (%not_in_next) = map { + /^([0-9a-f]+) /; + ($1 => 1); + } split(/\n/, $ARGV[1]); + for my $elem (map { + /^([0-9a-f]+) (.*)$/; + [$1 => $2]; + } split(/\n/, $ARGV[2])) { + if (!exists $not_in_next{$elem->[0]}) { + if ($msg) { + print STDERR $msg; + undef $msg; + } + print STDERR " $elem->[1]\n"; + } + } + ' "$topic" "$not_in_next" "$not_in_master" + exit 1 +fi + +<<\DOC_END + +This sample hook safeguards topic branches that have been +published from being rewound. + +The workflow assumed here is: + + * Once a topic branch forks from "master", "master" is never + merged into it again (either directly or indirectly). + + * Once a topic branch is fully cooked and merged into "master", + it is deleted. If you need to build on top of it to correct + earlier mistakes, a new topic branch is created by forking at + the tip of the "master". This is not strictly necessary, but + it makes it easier to keep your history simple. + + * Whenever you need to test or publish your changes to topic + branches, merge them into "next" branch. + +The script, being an example, hardcodes the publish branch name +to be "next", but it is trivial to make it configurable via +$GIT_DIR/config mechanism. + +With this workflow, you would want to know: + +(1) ... if a topic branch has ever been merged to "next". Young + topic branches can have stupid mistakes you would rather + clean up before publishing, and things that have not been + merged into other branches can be easily rebased without + affecting other people. But once it is published, you would + not want to rewind it. + +(2) ... if a topic branch has been fully merged to "master". + Then you can delete it. More importantly, you should not + build on top of it -- other people may already want to + change things related to the topic as patches against your + "master", so if you need further changes, it is better to + fork the topic (perhaps with the same name) afresh from the + tip of "master". + +Let's look at this example: + + o---o---o---o---o---o---o---o---o---o "next" + / / / / + / a---a---b A / / + / / / / + / / c---c---c---c B / + / / / \ / + / / / b---b C \ / + / / / / \ / + ---o---o---o---o---o---o---o---o---o---o---o "master" + + +A, B and C are topic branches. + + * A has one fix since it was merged up to "next". + + * B has finished. It has been fully merged up to "master" and "next", + and is ready to be deleted. + + * C has not merged to "next" at all. + +We would want to allow C to be rebased, refuse A, and encourage +B to be deleted. + +To compute (1): + + git rev-list ^master ^topic next + git rev-list ^master next + + if these match, topic has not merged in next at all. + +To compute (2): + + git rev-list master..topic + + if this is empty, it is fully merged to "master". + +DOC_END
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/oldgit/hooks/pre-receive.sample Sun Aug 04 00:06:43 2024 +0000 @@ -0,0 +1,24 @@ +#!/bin/sh +# +# An example hook script to make use of push options. +# The example simply echoes all push options that start with 'echoback=' +# and rejects all pushes when the "reject" push option is used. +# +# To enable this hook, rename this file to "pre-receive". + +if test -n "$GIT_PUSH_OPTION_COUNT" +then + i=0 + while test "$i" -lt "$GIT_PUSH_OPTION_COUNT" + do + eval "value=\$GIT_PUSH_OPTION_$i" + case "$value" in + echoback=*) + echo "echo from the pre-receive-hook: ${value#*=}" >&2 + ;; + reject) + exit 1 + esac + i=$((i + 1)) + done +fi
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/oldgit/hooks/prepare-commit-msg.sample Sun Aug 04 00:06:43 2024 +0000 @@ -0,0 +1,42 @@ +#!/bin/sh +# +# An example hook script to prepare the commit log message. +# Called by "git commit" with the name of the file that has the +# commit message, followed by the description of the commit +# message's source. The hook's purpose is to edit the commit +# message file. If the hook fails with a non-zero status, +# the commit is aborted. +# +# To enable this hook, rename this file to "prepare-commit-msg". + +# This hook includes three examples. The first one removes the +# "# Please enter the commit message..." help message. +# +# The second includes the output of "git diff --name-status -r" +# into the message, just before the "git status" output. It is +# commented because it doesn't cope with --amend or with squashed +# commits. +# +# The third example adds a Signed-off-by line to the message, that can +# still be edited. This is rarely a good idea. + +COMMIT_MSG_FILE=$1 +COMMIT_SOURCE=$2 +SHA1=$3 + +/usr/bin/perl -i.bak -ne 'print unless(m/^. Please enter the commit message/..m/^#$/)' "$COMMIT_MSG_FILE" + +# case "$COMMIT_SOURCE,$SHA1" in +# ,|template,) +# /usr/bin/perl -i.bak -pe ' +# print "\n" . `git diff --cached --name-status -r` +# if /^#/ && $first++ == 0' "$COMMIT_MSG_FILE" ;; +# *) ;; +# esac + +# SOB=$(git var GIT_COMMITTER_IDENT | sed -n 's/^\(.*>\).*$/Signed-off-by: \1/p') +# git interpret-trailers --in-place --trailer "$SOB" "$COMMIT_MSG_FILE" +# if test -z "$COMMIT_SOURCE" +# then +# /usr/bin/perl -i.bak -pe 'print "\n" if !$first_line++' "$COMMIT_MSG_FILE" +# fi
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/oldgit/hooks/push-to-checkout.sample Sun Aug 04 00:06:43 2024 +0000 @@ -0,0 +1,78 @@ +#!/bin/sh + +# An example hook script to update a checked-out tree on a git push. +# +# This hook is invoked by git-receive-pack(1) when it reacts to git +# push and updates reference(s) in its repository, and when the push +# tries to update the branch that is currently checked out and the +# receive.denyCurrentBranch configuration variable is set to +# updateInstead. +# +# By default, such a push is refused if the working tree and the index +# of the remote repository has any difference from the currently +# checked out commit; when both the working tree and the index match +# the current commit, they are updated to match the newly pushed tip +# of the branch. This hook is to be used to override the default +# behaviour; however the code below reimplements the default behaviour +# as a starting point for convenient modification. +# +# The hook receives the commit with which the tip of the current +# branch is going to be updated: +commit=$1 + +# It can exit with a non-zero status to refuse the push (when it does +# so, it must not modify the index or the working tree). +die () { + echo >&2 "$*" + exit 1 +} + +# Or it can make any necessary changes to the working tree and to the +# index to bring them to the desired state when the tip of the current +# branch is updated to the new commit, and exit with a zero status. +# +# For example, the hook can simply run git read-tree -u -m HEAD "$1" +# in order to emulate git fetch that is run in the reverse direction +# with git push, as the two-tree form of git read-tree -u -m is +# essentially the same as git switch or git checkout that switches +# branches while keeping the local changes in the working tree that do +# not interfere with the difference between the branches. + +# The below is a more-or-less exact translation to shell of the C code +# for the default behaviour for git's push-to-checkout hook defined in +# the push_to_deploy() function in builtin/receive-pack.c. +# +# Note that the hook will be executed from the repository directory, +# not from the working tree, so if you want to perform operations on +# the working tree, you will have to adapt your code accordingly, e.g. +# by adding "cd .." or using relative paths. + +if ! git update-index -q --ignore-submodules --refresh +then + die "Up-to-date check failed" +fi + +if ! git diff-files --quiet --ignore-submodules -- +then + die "Working directory has unstaged changes" +fi + +# This is a rough translation of: +# +# head_has_history() ? "HEAD" : EMPTY_TREE_SHA1_HEX +if git cat-file -e HEAD 2>/dev/null +then + head=HEAD +else + head=$(git hash-object -t tree --stdin </dev/null) +fi + +if ! git diff-index --quiet --cached --ignore-submodules $head -- +then + die "Working directory has staged changes" +fi + +if ! git read-tree -u -m "$commit" +then + die "Could not update working tree to new HEAD" +fi
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/oldgit/hooks/sendemail-validate.sample Sun Aug 04 00:06:43 2024 +0000 @@ -0,0 +1,77 @@ +#!/bin/sh + +# An example hook script to validate a patch (and/or patch series) before +# sending it via email. +# +# The hook should exit with non-zero status after issuing an appropriate +# message if it wants to prevent the email(s) from being sent. +# +# To enable this hook, rename this file to "sendemail-validate". +# +# By default, it will only check that the patch(es) can be applied on top of +# the default upstream branch without conflicts in a secondary worktree. After +# validation (successful or not) of the last patch of a series, the worktree +# will be deleted. +# +# The following config variables can be set to change the default remote and +# remote ref that are used to apply the patches against: +# +# sendemail.validateRemote (default: origin) +# sendemail.validateRemoteRef (default: HEAD) +# +# Replace the TODO placeholders with appropriate checks according to your +# needs. + +validate_cover_letter () { + file="$1" + # TODO: Replace with appropriate checks (e.g. spell checking). + true +} + +validate_patch () { + file="$1" + # Ensure that the patch applies without conflicts. + git am -3 "$file" || return + # TODO: Replace with appropriate checks for this patch + # (e.g. checkpatch.pl). + true +} + +validate_series () { + # TODO: Replace with appropriate checks for the whole series + # (e.g. quick build, coding style checks, etc.). + true +} + +# main ------------------------------------------------------------------------- + +if test "$GIT_SENDEMAIL_FILE_COUNTER" = 1 +then + remote=$(git config --default origin --get sendemail.validateRemote) && + ref=$(git config --default HEAD --get sendemail.validateRemoteRef) && + worktree=$(mktemp --tmpdir -d sendemail-validate.XXXXXXX) && + git worktree add -fd --checkout "$worktree" "refs/remotes/$remote/$ref" && + git config --replace-all sendemail.validateWorktree "$worktree" +else + worktree=$(git config --get sendemail.validateWorktree) +fi || { + echo "sendemail-validate: error: failed to prepare worktree" >&2 + exit 1 +} + +unset GIT_DIR GIT_WORK_TREE +cd "$worktree" && + +if grep -q "^diff --git " "$1" +then + validate_patch "$1" +else + validate_cover_letter "$1" +fi && + +if test "$GIT_SENDEMAIL_FILE_COUNTER" = "$GIT_SENDEMAIL_FILE_TOTAL" +then + git config --unset-all sendemail.validateWorktree && + trap 'git worktree remove -ff "$worktree"' EXIT && + validate_series +fi
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/oldgit/hooks/update.sample Sun Aug 04 00:06:43 2024 +0000 @@ -0,0 +1,128 @@ +#!/bin/sh +# +# An example hook script to block unannotated tags from entering. +# Called by "git receive-pack" with arguments: refname sha1-old sha1-new +# +# To enable this hook, rename this file to "update". +# +# Config +# ------ +# hooks.allowunannotated +# This boolean sets whether unannotated tags will be allowed into the +# repository. By default they won't be. +# hooks.allowdeletetag +# This boolean sets whether deleting tags will be allowed in the +# repository. By default they won't be. +# hooks.allowmodifytag +# This boolean sets whether a tag may be modified after creation. By default +# it won't be. +# hooks.allowdeletebranch +# This boolean sets whether deleting branches will be allowed in the +# repository. By default they won't be. +# hooks.denycreatebranch +# This boolean sets whether remotely creating branches will be denied +# in the repository. By default this is allowed. +# + +# --- Command line +refname="$1" +oldrev="$2" +newrev="$3" + +# --- Safety check +if [ -z "$GIT_DIR" ]; then + echo "Don't run this script from the command line." >&2 + echo " (if you want, you could supply GIT_DIR then run" >&2 + echo " $0 <ref> <oldrev> <newrev>)" >&2 + exit 1 +fi + +if [ -z "$refname" -o -z "$oldrev" -o -z "$newrev" ]; then + echo "usage: $0 <ref> <oldrev> <newrev>" >&2 + exit 1 +fi + +# --- Config +allowunannotated=$(git config --type=bool hooks.allowunannotated) +allowdeletebranch=$(git config --type=bool hooks.allowdeletebranch) +denycreatebranch=$(git config --type=bool hooks.denycreatebranch) +allowdeletetag=$(git config --type=bool hooks.allowdeletetag) +allowmodifytag=$(git config --type=bool hooks.allowmodifytag) + +# check for no description +projectdesc=$(sed -e '1q' "$GIT_DIR/description") +case "$projectdesc" in +"Unnamed repository"* | "") + echo "*** Project description file hasn't been set" >&2 + exit 1 + ;; +esac + +# --- Check types +# if $newrev is 0000...0000, it's a commit to delete a ref. +zero=$(git hash-object --stdin </dev/null | tr '[0-9a-f]' '0') +if [ "$newrev" = "$zero" ]; then + newrev_type=delete +else + newrev_type=$(git cat-file -t $newrev) +fi + +case "$refname","$newrev_type" in + refs/tags/*,commit) + # un-annotated tag + short_refname=${refname##refs/tags/} + if [ "$allowunannotated" != "true" ]; then + echo "*** The un-annotated tag, $short_refname, is not allowed in this repository" >&2 + echo "*** Use 'git tag [ -a | -s ]' for tags you want to propagate." >&2 + exit 1 + fi + ;; + refs/tags/*,delete) + # delete tag + if [ "$allowdeletetag" != "true" ]; then + echo "*** Deleting a tag is not allowed in this repository" >&2 + exit 1 + fi + ;; + refs/tags/*,tag) + # annotated tag + if [ "$allowmodifytag" != "true" ] && git rev-parse $refname > /dev/null 2>&1 + then + echo "*** Tag '$refname' already exists." >&2 + echo "*** Modifying a tag is not allowed in this repository." >&2 + exit 1 + fi + ;; + refs/heads/*,commit) + # branch + if [ "$oldrev" = "$zero" -a "$denycreatebranch" = "true" ]; then + echo "*** Creating a branch is not allowed in this repository" >&2 + exit 1 + fi + ;; + refs/heads/*,delete) + # delete branch + if [ "$allowdeletebranch" != "true" ]; then + echo "*** Deleting a branch is not allowed in this repository" >&2 + exit 1 + fi + ;; + refs/remotes/*,commit) + # tracking branch + ;; + refs/remotes/*,delete) + # delete tracking branch + if [ "$allowdeletebranch" != "true" ]; then + echo "*** Deleting a tracking branch is not allowed in this repository" >&2 + exit 1 + fi + ;; + *) + # Anything else (is there anything else?) + echo "*** Update hook: unknown type of update to ref $refname of type $newrev_type" >&2 + exit 1 + ;; +esac + +# --- Finished +exit 0
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/oldgit/info/exclude Sun Aug 04 00:06:43 2024 +0000 @@ -0,0 +1,6 @@ +# git ls-files --others --exclude-from=.git/info/exclude +# Lines that start with '#' are comments. +# For a project mostly in C, the following would be a good set of +# exclude patterns (uncomment them if you want to use them): +# *.[oa] +# *~
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/oldgit/logs/HEAD Sun Aug 04 00:06:43 2024 +0000 @@ -0,0 +1,2 @@ +0000000000000000000000000000000000000000 8173d01b08d9a91c9ec5f6cb50af346edc8020c4 fubar2 <ross.lazarus@gmail.com> 1722655773 +1000 clone: from https://github.com/ncbi/egapx.git +8173d01b08d9a91c9ec5f6cb50af346edc8020c4 375ac3ad6c77bc7efad8b0c8c3375c7157e29651 fubar2 <ross.lazarus@gmail.com> 1722728096 +1000 commit: update help
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/oldgit/logs/refs/heads/main Sun Aug 04 00:06:43 2024 +0000 @@ -0,0 +1,2 @@ +0000000000000000000000000000000000000000 8173d01b08d9a91c9ec5f6cb50af346edc8020c4 fubar2 <ross.lazarus@gmail.com> 1722655773 +1000 clone: from https://github.com/ncbi/egapx.git +8173d01b08d9a91c9ec5f6cb50af346edc8020c4 375ac3ad6c77bc7efad8b0c8c3375c7157e29651 fubar2 <ross.lazarus@gmail.com> 1722728096 +1000 commit: update help
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/oldgit/packed-refs Sun Aug 04 00:06:43 2024 +0000 @@ -0,0 +1,6 @@ +# pack-refs with: peeled fully-peeled sorted +203afce5dc30d03a6aab0d27a5a60d26276f3413 refs/tags/v0.1.0-alpha +4c01189ce1d1e3970fc7d8b9326083cc006c3d28 refs/tags/v0.1.1-alpha +^2380370cc0c522742619ba6c463a5b7f7a79507c +5a916d370a13cd49d70cad71cd7589f01fdea573 refs/tags/v0.1.2-alpha +314cbbc4ff1480f8f470c0ccef8979ef302a15cd refs/tags/v0.2-alpha