🔧 Installing GECCO
GECCO is implemented in Python and supports Python 3.6 and later.
It requires additional libraries that can be installed directly
from PyPI, the Python Package Index.
Use pip to install GECCO:
$ pip install gecco-tool
If you’d rather use Conda, a package is available
in the bioconda channel. You can install it with:
$ conda install -c bioconda gecco
This will install GECCO, its dependencies, and the data needed to run
predictions. Around 100 MB of data has to be downloaded, so this
could take some time depending on your Internet connection. Once done,
you will have a gecco command available in your $PATH.
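To confirm that the gecco command is reachable from your $PATH after installing, a quick check along these lines may help. This is a generic POSIX shell sketch, not something shipped with GECCO; have_command is a hypothetical helper name.

```shell
# Return success if the given command name can be resolved in $PATH.
# (have_command is an illustrative helper, not part of GECCO.)
have_command() {
    command -v "$1" >/dev/null 2>&1
}

if have_command gecco; then
    echo "gecco found at: $(command -v gecco)"
else
    echo "gecco not found in \$PATH; check that the installation succeeded" >&2
fi
```

If the command is not found right after a pip install, the directory where pip places executables may simply be missing from $PATH.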
Note that GECCO uses HMMER3, which can
only run on PowerPC and recent x86-64 machines running a POSIX operating
system. Therefore, Linux and OSX are supported platforms, but GECCO will
not be able to run on Windows.
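The platform restriction above can be checked before installing. The sketch below classifies the output of uname according to that note; platform_status is a hypothetical helper, and the architecture strings (x86_64, ppc64, ppc64le) are assumptions about what uname -m commonly reports on x86-64 and PowerPC machines, not values taken from GECCO or HMMER3.

```shell
# Classify an OS/architecture pair according to the note above.
#   $1: kernel name, as printed by `uname -s`
#   $2: machine type, as printed by `uname -m`
# (platform_status is an illustrative helper, not part of GECCO.)
platform_status() {
    case "$1" in
        Linux|Darwin) ;;                     # the two platforms named in the note
        *) echo "unsupported-os"; return 0 ;;
    esac
    case "$2" in
        x86_64|ppc64|ppc64le) echo "ok" ;;   # assumed uname -m strings
        *) echo "unsupported-arch" ;;
    esac
}

echo "platform check: $(platform_status "$(uname -s)" "$(uname -m)")"
```

The check is deliberately conservative: it only accepts the combinations named in the note, so an "unsupported" result means "not covered by the note" rather than "guaranteed to fail".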
🧬 Running GECCO
Once gecco is installed, you can run it from the terminal by giving
it a FASTA or GenBank file with the genomic sequence you want to
analyze, as well as an output directory:
$ gecco run --genome some_genome.fna -o some_output_dir
Additional parameters of interest are:
- --jobs, which controls the number of threads that will be spawned
by GECCO whenever a step can be parallelized. The default, 0, will
autodetect the number of CPUs on the machine.
- --cds, controlling the minimum number of consecutive genes a BGC
region must have to be detected by GECCO (default is 3).
- --threshold, controlling the minimum probability for a gene to be
considered part of a BGC region. Using a lower number will increase
the number (and possibly length) of predictions, but reduce accuracy.
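The parameters above can be combined in a single invocation. The sketch below only uses the flags named in this section; the values (4 threads, 5 genes, a threshold of 0.5) are arbitrary examples and not recommended settings or GECCO defaults. The run wrapper and the GECCO_DRY_RUN variable are hypothetical conveniences so the command can be inspected before it is executed.

```shell
# Print the command by default; execute it only when GECCO_DRY_RUN=0.
# (run and GECCO_DRY_RUN are illustrative, not GECCO features.)
run() {
    if [ "${GECCO_DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Example values only: 4 threads, at least 5 consecutive genes per
# region, and a per-gene probability cutoff of 0.5.
run gecco run --genome some_genome.fna -o some_output_dir \
    --jobs 4 --cds 5 --threshold 0.5
```

Setting GECCO_DRY_RUN=0 in the environment makes the wrapper launch the command instead of printing it.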