changeset 358:00c88eac3f6d draft

Uploaded
author bimib
date Tue, 17 Sep 2024 10:50:31 +0000
parents 551c32c33bab
children 4665b8ff31e7
files marea_2/GSOC project submission.html marea_2/README.md marea_2/custom_data_generator.py marea_2/custom_data_generator.xml marea_2/flux_simulation.py marea_2/flux_simulation.xml marea_2/flux_to_map.py marea_2/flux_to_map.xml marea_2/local/svg metabolic maps/ENGRO2_map.svg marea_2/local/svg metabolic maps/ENGRO2_no_legend_map.svg marea_2/local/svg metabolic maps/HMRcore_no_legend_map.svg marea_2/marea.py marea_2/marea.xml marea_2/marea_cluster.xml marea_2/marea_macros.xml marea_2/ras_generator.py marea_2/ras_generator.xml marea_2/ras_to_bounds.py marea_2/ras_to_bounds.xml marea_2/rps_generator.xml marea_2/utils/general_utils.py
diffstat 7 files changed, 0 insertions(+), 2269 deletions(-) [+]
line wrap: on
line diff
--- a/marea_2/GSOC project submission.html	Thu Aug 29 20:46:04 2024 +0000
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,65 +0,0 @@
-<!DOCTYPE html>
-<html lang="en">
-<head>
-    <meta charset="UTF-8">
-    <meta name="viewport" content="width=device-width, initial-scale=1.0">
-    <title>Google Summer of Code 2024 - COBRAxy: COBRA and MaREA4Galaxy</title>
-    <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.1/dist/css/bootstrap.min.css" rel="stylesheet">
-</head>
-<body>
-    <div class="container my-5">
-        <h1 class="text-center mb-4">Google Summer of Code 2024</h1>
-        <h2 class="text-center mb-4">COBRAxy: COBRA and MaREA4Galaxy</h2>
-        <p><strong>National Resource for Network Biology (NRNB)</strong></p>
-        <p><strong>Mentors:</strong></p>
-        <ul>
-            <li>Alex Graudenzi, alex.graudenzi@unimib.it</li>
-            <li>Chiara Damiani, chiara.damiani@unimib.it</li>
-            <li>Marco Antoniotti, marco.antoniotti@unimib.it</li>
-        </ul>
-        <p><strong>Contributor:</strong></p>
-        <ul>
-            <li>Luca Milazzo (University of Milano-Bicocca) – lucmil2000@gmail.com, luca.milazzo@epfl.ch</li>
-        </ul>
-
-        <h3 class="mt-4">Project Description</h3>
-        <p>
-            The project focused on developing an advanced Galaxy tool that enhances the data mapping capabilities of MaREA4Galaxy. The extension of this framework includes the analysis of fluxomics data, starting from a metabolic model and progressing to the representation of up-regulated fluxes on a metabolic map. This tool enables users to perform constraint-based enrichment analysis of metabolic pathways.
-        </p>
-        <p>The primary goals of the project were:</p>
-        <ul>
-            <li>Create a flux sampling and analysis interface to allow users to work with constraint-based metabolic models (e.g., sampling algorithms, FBA, pFBA, and FVA).</li>
-            <li>Adapt the existing clustering module to cluster fluxomics data and implement additional clustering algorithms (e.g., Leiden and Louvain).</li>
-            <li>Build upon the existing module for visualizing enriched reactions based on RAS to create a new module for enrichment analysis of metabolic pathways based on simulated fluxomics data, and visualize the results on the metabolic map.</li>
-        </ul>
-
-        <h3 class="mt-4">What I Did</h3>
-        <ul>
-            <li>Updated all existing modules of MaREA4Galaxy to use recent versions of Python libraries, ensuring greater future compatibility.</li>
-            <li>Modified the "Custom Data Generator" tool to extract rules, reactions, bounds, and medium information from a COBRA model.</li>
-            <li>Developed the "RAS to Bounds" tool, which generates metabolic reaction bounds based on the RAS matrix and a growth medium (either custom or one of 26 pre-defined settings), enabling the creation of cell-specific bounds from a generic metabolic model (e.g., ENGRO2 or a custom model).</li>
-            <li>Developed the "Flux Simulation" tool, allowing users to sample multiple metabolic models using cell-specific bounds, employing the CBS and OPTGP algorithms. This tool also supports flux analysis using FBA, pFBA, FVA, and biomass sensitivity analysis.</li>
-            <li>Developed the "Metabolic Flux Enrichment Analysis" tool, which visualizes up-regulated fluxes identified by the "Flux Simulation" tool, compares different sub-classes identified by the clustering tool over fluxomics data, and visualizes all results on the metabolic map.</li>
-        </ul>
-
-        <h3 class="mt-4">Current State and Future Extensions</h3>
-        <p>
-            Currently, the updated MaREA4Galaxy tool allows users to perform constraint-based enrichment analysis of metabolic pathways using RNA-seq profiles by simulating fluxomics. Additionally, users can compare different sub-populations identified by the clustering tool. The architecture minimizes computational costs by handling cell-specific models through a set of bounds, without storing complete COBRA models, which would contain a large amount of redundant information.
-        </p>
-        <p>
-            Implementing the "Metabolic Flux Enrichment Analysis" tool left too little time to extend the clustering module with new algorithms such as HDBSCAN, Leiden, and Louvain; this remains a potential future extension. Moreover, implementing a more advanced clustering grid search could further optimize clustering results.
-        </p>
-
-        <h3 class="mt-4">About the Code</h3>
-        <p>
-            I worked on the Mercurial repository of MaREA4Galaxy, where this document is stored. I committed all my changes, as the repository history shows, although I could not use Git-style merge operations because the Mercurial interface does not support them.
-        </p>
-
-        <h3 class="mt-4">Conclusions</h3>
-        <p>
-            Over the past years, I have focused on biology-related subjects, particularly metabolic fluxes and other omics data such as gene expression datasets. Through this project, I was able to apply the knowledge I have gained in constraint-based modeling, flux sampling, and omics enrichment analysis by expanding the MaREA4Galaxy tool. This experience not only enhanced my programming skills but also deepened my understanding of the real needs of biologists when working with such omics data.
-        </p>
-    </div>
-</body>
-</html>
-
--- a/marea_2/flux_simulation.py	Thu Aug 29 20:46:04 2024 +0000
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,437 +0,0 @@
-import argparse
-import utils.general_utils as utils
-from typing import Optional, List
-import os
-import numpy as np
-import pandas as pd
-import cobra
-import utils.CBS_backend as CBS_backend
-from joblib import Parallel, delayed, cpu_count
-from cobra.sampling import OptGPSampler
-import sys
-
-################################# process args ###############################
-def process_args(args :List[str]) -> argparse.Namespace:
-    """
-    Processes command-line arguments.
-
-    Args:
-        args (list): List of command-line arguments.
-
-    Returns:
-        Namespace: An object containing parsed arguments.
-    """
-    parser = argparse.ArgumentParser(usage = '%(prog)s [options]',
-                                     description = 'process some values')
-
-    parser.add_argument('-ol', '--out_log', 
-                        help = "Output log")
-    
-    parser.add_argument('-td', '--tool_dir',
-                        type = str,
-                        required = True,
-                        help = 'your tool directory')
-    
-    parser.add_argument('-in', '--input',
-                        required = True,
-                        type=str,
-                        help = 'inputs bounds')
-    
-    parser.add_argument('-ni', '--names',
-                        required = True,
-                        type=str,
-                        help = 'cell names')
- 
-    parser.add_argument(
-        '-ms', '--model_selector', 
-        type = utils.Model, default = utils.Model.ENGRO2, choices = [utils.Model.ENGRO2, utils.Model.Custom],
-        help = 'choose which type of model you want to use')
-    
-    parser.add_argument("-mo", "--model", type = str)
-    
-    parser.add_argument("-mn", "--model_name", type = str, help = "custom model name")
-    
-    parser.add_argument('-a', '--algorithm',
-                        type = str,
-                        choices = ['OPTGP', 'CBS'],
-                        required = True,
-                        help = 'choose sampling algorithm')
-    
-    parser.add_argument('-th', '--thinning', 
-                        type = int,
-                        default= 100,
-                        required=False,
-                        help = 'choose thinning')
-    
-    parser.add_argument('-ns', '--n_samples', 
-                        type = int,
-                        required = True,
-                        help = 'choose how many samples')
-    
-    parser.add_argument('-sd', '--seed', 
-                        type = int,
-                        required = True,
-                        help = 'seed')
-    
-    parser.add_argument('-nb', '--n_batches', 
-                        type = int,
-                        required = True,
-                        help = 'choose how many batches')
-    
-    parser.add_argument('-ot', '--output_type', 
-                        type = str,
-                        required = True,
-                        help = 'output type')
-    
-    parser.add_argument('-ota', '--output_type_analysis', 
-                        type = str,
-                        required = False,
-                        help = 'output type analysis')
-    
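-    # A typical invocation, mirroring the Galaxy wrapper's command line
-    # (file names here are purely illustrative):
-    #   python flux_simulation.py --tool_dir . --model_selector ENGRO2 \
-    #     --input boundsA.tsv,boundsB.tsv --names A,B --algorithm CBS \
-    #     --thinning 0 --n_samples 1000 --n_batches 10 --seed 0 \
-    #     --output_type mean,median,quantiles --output_type_analysis pFBA \
-    #     --out_log log.txt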
-    ARGS = parser.parse_args()
-    return ARGS
-
-########################### warning ###########################################
-def warning(s :str) -> None:
-    """
-    Log a warning message to an output log file and print it to the console.
-
-    Args:
-        s (str): The warning message to be logged and printed.
-    
-    Returns:
-      None
-    """
-    with open(ARGS.out_log, 'a') as log:
-        log.write(s + "\n\n")
-    print(s)
-
-
-def write_to_file(dataset: pd.DataFrame, name: str, keep_index:bool=False)->None:
-    dataset.index.name = 'Reactions'
-    dataset.to_csv(ARGS.output_folder + name + ".csv", sep = '\t', index = keep_index)
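-    # NOTE: despite the ".csv" extension the file is tab-separated; this is the
-    # format fluxes_statistics expects when it reads the samples back in.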
-
-############################ dataset input ####################################
-def read_dataset(data :str, name :str) -> pd.DataFrame:
-    """
-    Read a dataset from a CSV file and return it as a pandas DataFrame.
-
-    Args:
-        data (str): Path to the CSV file containing the dataset.
-        name (str): Name of the dataset, used in error messages.
-
-    Returns:
-        pandas.DataFrame: DataFrame containing the dataset.
-
-    Raises:
-        pd.errors.EmptyDataError: If the CSV file is empty.
-        sys.exit: If the CSV file has the wrong format, the execution is aborted.
-    """
-    try:
-        dataset = pd.read_csv(data, sep = '\t', header = 0, index_col=0, engine='python')
-    except pd.errors.EmptyDataError:
-        sys.exit('Execution aborted: wrong format of ' + name + '\n')
-    if len(dataset.columns) < 2:
-        sys.exit('Execution aborted: wrong format of ' + name + '\n')
-    return dataset
-
-
-
-def OPTGP_sampler(model:cobra.Model, model_name:str, n_samples:int=1000, thinning:int=100, n_batches:int=1, seed:int=0)-> None:
-    """
-    Samples the given model using the OptGP (improved artificial centering hit-and-run) algorithm and saves the results to CSV files.
-
-    Args:
-        model (cobra.Model): The COBRA model to sample from.
-        model_name (str): The name of the model, used in naming output files.
-        n_samples (int, optional): Number of samples per batch. Default is 1000.
-        thinning (int, optional): Thinning parameter for the sampler. Default is 100.
-        n_batches (int, optional): Number of batches to run. Default is 1.
-        seed (int, optional): Random seed for reproducibility. Default is 0.
-    
-    Returns:
-        None
-    """
-
-    for i in range(0, n_batches):
-        optgp = OptGPSampler(model, thinning=thinning, seed=seed)  # keywords avoid relying on the sampler's positional parameter order
-        samples = optgp.sample(n_samples)
-        samples.to_csv(ARGS.output_folder +  model_name + '_'+ str(i)+'_OPTGP.csv', index=False)
-        seed+=1
-    samplesTotal = pd.DataFrame()
-    for i in range(0, n_batches):
-        samples_batch = pd.read_csv(ARGS.output_folder  +  model_name + '_'+ str(i)+'_OPTGP.csv')
-        samplesTotal = pd.concat([samplesTotal, samples_batch], ignore_index = True)
-
-    write_to_file(samplesTotal.T, model_name, True)
-
-    for i in range(0, n_batches):
-        os.remove(ARGS.output_folder + model_name + '_' + str(i) + '_OPTGP.csv')
-
-
-def CBS_sampler(model:cobra.Model, model_name:str, n_samples:int=1000, n_batches:int=1, seed:int=0)-> None:
-    """
-    Samples the given model using the CBS (corner-based sampling) algorithm and saves the results to CSV files.
-
-    Args:
-        model (cobra.Model): The COBRA model to sample from.
-        model_name (str): The name of the model, used in naming output files.
-        n_samples (int, optional): Number of samples per batch. Default is 1000.
-        n_batches (int, optional): Number of batches to run. Default is 1.
-        seed (int, optional): Random seed for reproducibility. Default is 0.
-    
-    Returns:
-        None
-    """
-
-    df_FVA = cobra.flux_analysis.flux_variability_analysis(model,fraction_of_optimum=0).round(6)
-    
-    df_coefficients = CBS_backend.randomObjectiveFunction(model, n_samples*n_batches, df_FVA, seed=seed)
-
-    for i in range(0, n_batches):
-        samples = pd.DataFrame(columns =[reaction.id for reaction in model.reactions], index = range(n_samples))
-        try:
-            CBS_backend.randomObjectiveFunctionSampling(model, n_samples, df_coefficients.iloc[:,i*n_samples:(i+1)*n_samples], samples)
-        except Exception as e:
-            utils.logWarning(
-            "Warning: GLPK solver has failed for " + model_name + ". Trying with COBRA interface. Error:" + str(e),
-            ARGS.out_log)
-            CBS_backend.randomObjectiveFunctionSampling_cobrapy(model, n_samples, df_coefficients.iloc[:,i*n_samples:(i+1)*n_samples], 
-                                                    samples)
-        utils.logWarning(ARGS.output_folder +  model_name + '_'+ str(i)+'_CBS.csv', ARGS.out_log)
-        samples.to_csv(ARGS.output_folder +  model_name + '_'+ str(i)+'_CBS.csv', index=False)
-
-    samplesTotal = pd.DataFrame()
-    for i in range(0, n_batches):
-        samples_batch = pd.read_csv(ARGS.output_folder  +  model_name + '_'+ str(i)+'_CBS.csv')
-        samplesTotal = pd.concat([samplesTotal, samples_batch], ignore_index = True)
-
-    write_to_file(samplesTotal.T, model_name, True)
-
-    for i in range(0, n_batches):
-        os.remove(ARGS.output_folder + model_name + '_' + str(i) + '_CBS.csv')
-
-
-def model_sampler(model_input_original:cobra.Model, bounds_path:str, cell_name:str)-> List[pd.DataFrame]:
-    """
-    Prepares the model with bounds from the dataset and performs sampling and analysis based on the selected algorithm.
-
-    Args:
-        model_input_original (cobra.Model): The original COBRA model.
-        bounds_path (str): Path to the CSV file containing the bounds dataset.
-        cell_name (str): Name of the cell, used to generate filenames for output.
-
-    Returns:
-        List[pd.DataFrame]: A list of DataFrames containing statistics and analysis results.
-    """
-
-    model_input = model_input_original.copy()
-    bounds_df = read_dataset(bounds_path, "bounds dataset")
-    for rxn_index, row in bounds_df.iterrows():
-        model_input.reactions.get_by_id(rxn_index).lower_bound = row.lower_bound
-        model_input.reactions.get_by_id(rxn_index).upper_bound = row.upper_bound
-    
-    name = cell_name.split('.')[0]
-    
-    if ARGS.algorithm == 'OPTGP':
-        OPTGP_sampler(model_input, name, ARGS.n_samples, ARGS.thinning, ARGS.n_batches, ARGS.seed)
-
-    elif ARGS.algorithm == 'CBS':
-        CBS_sampler(model_input,  name, ARGS.n_samples, ARGS.n_batches, ARGS.seed)
-
-    df_mean, df_median, df_quantiles = fluxes_statistics(name, ARGS.output_types)
-
-    if("fluxes" not in ARGS.output_types):
-        os.remove(ARGS.output_folder  +  name + '.csv')
-
-    returnList = []
-    returnList.append(df_mean)
-    returnList.append(df_median)
-    returnList.append(df_quantiles)
-
-    df_pFBA, df_FVA, df_sensitivity = fluxes_analysis(model_input, name, ARGS.output_type_analysis)
-
-    if("pFBA" in ARGS.output_type_analysis):
-        returnList.append(df_pFBA)
-    if("FVA" in ARGS.output_type_analysis):
-        returnList.append(df_FVA)
-    if("sensitivity" in ARGS.output_type_analysis):
-        returnList.append(df_sensitivity)
-
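-    # NOTE: the fixed order (mean, median, quantiles, then any requested pFBA,
-    # FVA and sensitivity frames) is relied upon by main(), which unpacks the
-    # results starting from index 3.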
-    return returnList
-
-def fluxes_statistics(model_name: str,  output_types:List)-> List[pd.DataFrame]:
-    """
-    Computes statistics (mean, median, quantiles) for the fluxes.
-
-    Args:
-        model_name (str): Name of the model, used in filename for input.
-        output_types (List[str]): Types of statistics to compute (mean, median, quantiles).
-
-    Returns:
-        List[pd.DataFrame]: List of DataFrames containing mean, median, and quantiles statistics.
-    """
-
-    df_mean = pd.DataFrame()
-    df_median= pd.DataFrame()
-    df_quantiles= pd.DataFrame()
-
-    df_samples = pd.read_csv(ARGS.output_folder  +  model_name + '.csv', sep = '\t', index_col = 0).T
-    df_samples = df_samples.round(8)
-
-    for output_type in output_types:
-        if(output_type == "mean"):
-            df_mean = df_samples.mean()
-            df_mean = df_mean.to_frame().T
-            df_mean = df_mean.reset_index(drop=True)
-            df_mean.index = [model_name]
-        elif(output_type == "median"):
-            df_median = df_samples.median()
-            df_median = df_median.to_frame().T
-            df_median = df_median.reset_index(drop=True)
-            df_median.index = [model_name]
-        elif(output_type == "quantiles"):
-            newRow = []
-            cols = []
-            for rxn in df_samples.columns:
-                quantiles = df_samples[rxn].quantile([0.25, 0.50, 0.75])
-                newRow.append(quantiles[0.25])
-                cols.append(rxn + "_q1")
-                newRow.append(quantiles[0.5])
-                cols.append(rxn + "_q2")
-                newRow.append(quantiles[0.75])
-                cols.append(rxn + "_q3")
-            df_quantiles = pd.DataFrame(columns=cols)
-            df_quantiles.loc[0] = newRow
-            df_quantiles = df_quantiles.reset_index(drop=True)
-            df_quantiles.index = [model_name]
-    
-    return df_mean, df_median, df_quantiles
-
-def fluxes_analysis(model:cobra.Model,  model_name:str, output_types:List)-> List[pd.DataFrame]:
-    """
-    Performs flux analysis including pFBA, FVA, and sensitivity analysis.
-
-    Args:
-        model (cobra.Model): The COBRA model to analyze.
-        model_name (str): Name of the model, used in filenames for output.
-        output_types (List[str]): Types of analysis to perform (pFBA, FVA, sensitivity).
-
-    Returns:
-        List[pd.DataFrame]: List of DataFrames containing pFBA, FVA, and sensitivity analysis results.
-    """
-
-    df_pFBA = pd.DataFrame()
-    df_FVA= pd.DataFrame()
-    df_sensitivity= pd.DataFrame()
-
-    for output_type in output_types:
-        if(output_type == "pFBA"):
-            model.objective = "Biomass"
-            solution = cobra.flux_analysis.pfba(model)
-            fluxes = solution.fluxes
-            df_pFBA.loc[0,[rxn._id for rxn in model.reactions]] = fluxes.tolist()
-            df_pFBA = df_pFBA.reset_index(drop=True)
-            df_pFBA.index = [model_name]
-            df_pFBA = df_pFBA.astype(float).round(6)
-        elif(output_type == "FVA"):
-            fva = cobra.flux_analysis.flux_variability_analysis(model, fraction_of_optimum=0, processes=1).round(8)
-            columns = []
-            for rxn in fva.index.to_list():
-                columns.append(rxn + "_min")
-                columns.append(rxn + "_max")
-            df_FVA= pd.DataFrame(columns = columns)
-            for index_rxn, row in fva.iterrows():
-                df_FVA.loc[0, index_rxn+ "_min"] = fva.loc[index_rxn, "minimum"]
-                df_FVA.loc[0, index_rxn+ "_max"] = fva.loc[index_rxn, "maximum"]
-            df_FVA = df_FVA.reset_index(drop=True)
-            df_FVA.index = [model_name]
-            df_FVA = df_FVA.astype(float).round(6)
-        elif(output_type == "sensitivity"):
-            model.objective = "Biomass"
-            solution_original = model.optimize().objective_value
-            reactions = model.reactions
-            single = cobra.flux_analysis.single_reaction_deletion(model)
-            newRow = []
-            df_sensitivity = pd.DataFrame(columns = [rxn.id for rxn in reactions], index = [model_name])
-            for rxn in reactions:
-                newRow.append(single.knockout[rxn.id].growth.values[0]/solution_original)
-            df_sensitivity.loc[model_name] = newRow
-            df_sensitivity = df_sensitivity.astype(float).round(6)
-    return df_pFBA, df_FVA, df_sensitivity
-
-############################# main ###########################################
-def main() -> None:
-    """
-    Initializes everything and sets the program in motion based on the frontend input arguments.
-
-    Returns:
-        None
-    """
-    if not os.path.exists('flux_simulation/'):
-        os.makedirs('flux_simulation/')
-
-    num_processors = cpu_count()
-
-    global ARGS
-    ARGS = process_args(sys.argv)
-
-    ARGS.output_folder = 'flux_simulation/'
-    
-    
-    model_type :utils.Model = ARGS.model_selector
-    if model_type is utils.Model.Custom:
-        model = model_type.getCOBRAmodel(customPath = utils.FilePath.fromStrPath(ARGS.model), customExtension = utils.FilePath.fromStrPath(ARGS.model_name).ext)
-    else:
-        model = model_type.getCOBRAmodel(toolDir=ARGS.tool_dir)
-    
-    ARGS.bounds = ARGS.input.split(",")
-    ARGS.bounds_name = ARGS.names.split(",")
-    ARGS.output_types = ARGS.output_type.split(",")
-    ARGS.output_type_analysis = ARGS.output_type_analysis.split(",") if ARGS.output_type_analysis else []
-
-
-    results = Parallel(n_jobs=num_processors)(delayed(model_sampler)(model, bounds_path, cell_name) for bounds_path, cell_name in zip(ARGS.bounds, ARGS.bounds_name))
-
-    all_mean = pd.concat([result[0] for result in results], ignore_index=False)
-    all_median = pd.concat([result[1] for result in results], ignore_index=False)
-    all_quantiles = pd.concat([result[2] for result in results], ignore_index=False)
-
-    if("mean" in ARGS.output_types):
-        all_mean = all_mean.fillna(0.0)
-        all_mean = all_mean.sort_index()
-        write_to_file(all_mean.T, "mean", True)
-
-    if("median" in ARGS.output_types):
-        all_median = all_median.fillna(0.0)
-        all_median = all_median.sort_index()
-        write_to_file(all_median.T, "median", True)
-    
-    if("quantiles" in ARGS.output_types):
-        all_quantiles = all_quantiles.fillna(0.0)
-        all_quantiles = all_quantiles.sort_index()
-        write_to_file(all_quantiles.T, "quantiles", True)
-
-    index_result = 3
-    if("pFBA" in ARGS.output_type_analysis):
-        all_pFBA = pd.concat([result[index_result] for result in results], ignore_index=False)
-        all_pFBA = all_pFBA.sort_index()
-        write_to_file(all_pFBA.T, "pFBA", True)
-        index_result+=1
-    if("FVA" in ARGS.output_type_analysis):
-        all_FVA= pd.concat([result[index_result] for result in results], ignore_index=False)
-        all_FVA = all_FVA.sort_index()
-        write_to_file(all_FVA.T, "FVA", True)
-        index_result+=1
-    if("sensitivity" in ARGS.output_type_analysis):
-        all_sensitivity = pd.concat([result[index_result] for result in results], ignore_index=False)
-        all_sensitivity = all_sensitivity.sort_index()
-        write_to_file(all_sensitivity.T, "sensitivity", True)
-
-        
-##############################################################################
-if __name__ == "__main__":
-    main()
\ No newline at end of file
--- a/marea_2/flux_simulation.xml	Thu Aug 29 20:46:04 2024 +0000
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,129 +0,0 @@
-<tool id="fluxSimulation" name="Flux Simulation" version="2.0.0">
-    
-    <macros>
-        <import>marea_macros.xml</import>
-    </macros>
-
-	<requirements>
-        <requirement type="package" version="1.24.4">numpy</requirement>
-        <requirement type="package" version="2.0.3">pandas</requirement>
-		<requirement type="package" version="0.29.0">cobra</requirement>
-        <requirement type="package" version="5.2.2">lxml</requirement>
-        <requirement type="package" version="1.4.2">joblib</requirement>
-        <requirement type="package" version="1.10.1">scipy</requirement>
-	</requirements>
-
-    <command detect_errors="exit_code">
-        <![CDATA[
-        python $__tool_directory__/flux_simulation.py
-        --tool_dir $__tool_directory__
-        --model_selector $cond_model.model_selector
-        #if $cond_model.model_selector == 'Custom'
-            --model $model
-            --model_name $model.element_identifier
-        #end if
-        --input "${",".join(map(str, $inputs))}"
-        #set $names = ""
-        #for $input_temp in $inputs:
-            #set $names = $names + $input_temp.element_identifier + ","
-        #end for
-        --names $names
-        --thinning 0
-        #if $algorithm_param.algorithm == 'OPTGP':
-        	--thinning $algorithm_param.thinning
-        #end if
-        --algorithm $algorithm_param.algorithm
-        --n_batches $n_batches
-        --n_samples $n_samples
-        --seed $seed
-        --output_type "${",".join(map(str, $output_types))}"
-        --output_type_analysis "${",".join(map(str, $output_types_analysis))}"
-        --out_log $log
-        ]]>
-    </command>
-    <inputs>
-
-        <conditional name="cond_model">
-            <expand macro="options_ras_to_bounds_model"/>
-            <when value="Custom">
-                <param name="model" argument="--model" type="data" format="json, xml" label="Custom model" />
-            </when>
-        </conditional> 
-
-        <param name="inputs" argument="--inputs" multiple="true" type="data" format="tabular, csv, tsv" label="Bound(s):" />
-        
-        
-        <conditional name="algorithm_param">
-			<param name="algorithm" argument="--algorithm" type="select" label="Choose sampling algorithm:">
-                    <option value="CBS" selected="true">CBS</option>
-                	<option value="OPTGP">OPTGP</option>
-        	</param>
-        	<when value="OPTGP">
-        		<param name="thinning" argument="--thinning" type="integer" label="Thinning:"  value="100" help="Number of iterations to wait before taking a sample."/>
-        	</when>
-
-		</conditional>
-
-
-        <param name="n_samples" argument="--n_samples" type="integer" label="Samples:" value="1000"/>
-
-        <param name="n_batches" argument="--n_batches" type="integer" label="Batches:" value="10" help="This is useful for computational performance."/>
-
-        <param name="seed" argument="--seed" type="integer" label="Seed:" value="0" help="Random seed."/>
-
-        <param type="select" argument="--output_types" multiple="true" name="output_types" label="Desired outputs from sampling">
-            <option value="mean" selected="true">Mean</option>
-            <option value="median" selected="true">Median</option>
-            <option value="quantiles" selected="true">Quantiles</option>
-            <option value="fluxes" selected="false">All fluxes</option>
-        </param>
-
-        <param type="select" argument="--output_types_analysis" multiple="true" name="output_types_analysis" label="Desired outputs from flux analysis">
-            <option value="pFBA" selected="false">pFBA</option>
-            <option value="FVA" selected="false">FVA</option>
-            <option value="sensitivity" selected="false">Sensitivity reaction knock-out (Biomass)</option>
-        </param>
-    </inputs>
-
-        		
-    <outputs>
-        <data format="txt" name="log" label="fluxSimulation - Log" />
-        <collection name="results" type="list" label="${tool.name} - Samples">
-            <discover_datasets pattern="__name_and_ext__" directory="flux_simulation"/>
-        </collection>
-    </outputs>
-       
-        
-    <help>
-<![CDATA[
-
-What it does
--------------
-
-This tool generates flux samples from a model in JSON or XML format using the CBS (Corner-based Sampling) or OPTGP (Improved Artificial Centering Hit-and-Run Sampler) algorithm. It can return sampled fluxes by applying summary statistics such as:
-   - Mean
-   - Median
-   - Quantiles (0.25, 0.50, 0.75)
-Additionally, flux analysis can be performed on the metabolic model, including:
-   - Parsimonious-FBA (optimized by Biomass)
-   - FVA (Flux Variability Analysis)
-   - Biomass sensitivity analysis (single reaction knockout), which calculates the ratio between the optimal FBA coefficients of the Biomass reaction after knocking out a reaction and the same coefficients in the complete model.
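-     For example, under this definition a sensitivity of 1 means the knockout leaves the optimal Biomass flux unchanged, while a value close to 0 marks the reaction as essential.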
-
-**Accepted Files:**
-   - A Model: JSON or XML file containing the reactions and rules defined in the model. This can be a single model, multiple models, or a collection of models.
-   - Cell-Specific Bounds: Generated by the **RAS2Bounds** tool.
-
-**Output:**
---------------
-The tool generates the following:
-   - **Samples:** A CSV file reporting the sampled fluxes for each reaction.
-   - **Log File:** A text file (.txt) containing logs of the operation.
-
-.. class:: infomark
-
-**Tip:** The 'Batches' parameter is useful for managing memory usage by processing samples in smaller batches. For example, if you want to sample 10,000 points, it is recommended to set `n_samples = 1000` and `n_batches = 10`.
-
-]]>
-    </help>
-    <expand macro="citations" />
-</tool>
\ No newline at end of file
--- a/marea_2/flux_to_map.py	Thu Aug 29 20:46:04 2024 +0000
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,1055 +0,0 @@
-from __future__ import division
-import csv
-from enum import Enum
-import re
-import sys
-import numpy as np
-import pandas as pd
-import itertools as it
-import scipy.stats as st
-import lxml.etree as ET
-import math
-import utils.general_utils as utils
-from PIL import Image
-import os
-import copy
-import argparse
-import pyvips
-from PIL import Image, ImageDraw, ImageFont
-from typing import Tuple, Union, Optional, List, Dict
-import matplotlib.pyplot as plt
-
-ERRORS = []
-########################## argparse ##########################################
-ARGS :argparse.Namespace
-def process_args() -> argparse.Namespace:
-    """
-    Interfaces the script of a module with its frontend, making the user's choices for various parameters available as values in code.
-
-    Args:
-        args : Always obtained (in file) from sys.argv
-
-    Returns:
-        Namespace : An object containing the parsed arguments
-    """
-    parser = argparse.ArgumentParser(
-        usage = "%(prog)s [options]",
-        description = "process flux values to create a comparison map.")
-    
-    #General:
-    parser.add_argument(
-        '-td', '--tool_dir',
-        type = str,
-        required = True,
-        help = 'your tool directory')
-    
-    parser.add_argument('-on', '--control', type = str)
-    parser.add_argument('-ol', '--out_log', help = "Output log")
-
-    #Computation details:
-    parser.add_argument(
-        '-co', '--comparison',
-        type = str, 
-        default = 'manyvsmany',
-        choices = ['manyvsmany', 'onevsrest', 'onevsmany'])
-    
-    parser.add_argument(
-        '-pv' ,'--pValue',
-        type = float, 
-        default = 0.1, 
-        help = 'P-Value threshold (default: %(default)s)')
-    
-    parser.add_argument(
-        '-fc', '--fChange',
-        type = float, 
-        default = 1.5, 
-        help = 'Fold-Change threshold (default: %(default)s)')
-    
-
-    parser.add_argument(
-        '-op', '--option',
-        type = str, 
-        choices = ['datasets', 'dataset_class'],
-        help='dataset or dataset and class')
-
-    parser.add_argument(
-        '-idf', '--input_data_fluxes',
-        type = str,
-        help = 'input dataset fluxes')
-    
-    parser.add_argument(
-        '-icf', '--input_class_fluxes', 
-        type = str,
-        help = 'sample group specification fluxes')
-    
-    parser.add_argument(
-        '-idsf', '--input_datas_fluxes', 
-        type = str,
-        nargs = '+', 
-        help = 'input datasets fluxes')
-    
-    parser.add_argument(
-        '-naf', '--names_fluxes', 
-        type = str,
-        nargs = '+', 
-        help = 'input names fluxes')
-    
-    #Output:
-    parser.add_argument(
-        "-gs", "--generate_svg",
-        type = utils.Bool("generate_svg"), default = True,
-        help = "choose whether to generate svg")
-    
-    parser.add_argument(
-        "-gp", "--generate_pdf",
-        type = utils.Bool("generate_pdf"), default = True,
-        help = "choose whether to generate pdf")
-    
-    parser.add_argument(
-        '-cm', '--custom_map',
-        type = str,
-        help='custom map to use')
-    
-    parser.add_argument(
-        '-mc',  '--choice_map',
-        type = utils.Model, default = utils.Model.HMRcore,
-        choices = [utils.Model.HMRcore, utils.Model.ENGRO2, utils.Model.Custom])
-    
-    parser.add_argument(
-        '-colorm',  '--color_map',
-        type = str,
-        choices = ["jet", "viridis"])
-
-    args :argparse.Namespace = parser.parse_args()
-    args.net = True
-
-    return args
-          
-############################ dataset input ####################################
-def read_dataset(data :str, name :str) -> pd.DataFrame:
-    """
-    Tries to read the dataset from its path (data) as a tsv and turns it into a DataFrame.
-
-    Args:
-        data : filepath of a dataset (from frontend input params or literals upon calling)
-        name : name associated with the dataset (from frontend input params or literals upon calling)
-
-    Returns:
-        pd.DataFrame : dataset in a runtime operable shape
-    
-    Raises:
-        sys.exit : if there's no data (pd.errors.EmptyDataError) or if the dataset has less than 2 columns
-    """
-    try:
-        dataset = pd.read_csv(data, sep = '\t', header = 0, engine='python')
-    except pd.errors.EmptyDataError:
-        sys.exit('Execution aborted: wrong format of ' + name + '\n')
-    if len(dataset.columns) < 2:
-        sys.exit('Execution aborted: wrong format of ' + name + '\n')
-    return dataset
-
-############################ dataset name #####################################
-def name_dataset(name_data :str, count :int) -> str:
-    """
-    Produces a unique name for a dataset based on what was provided by the user. The default name for any dataset is "Dataset", thus if the user didn't change it this function appends f"_{count}" to make it unique.
-
-    Args:
-        name_data : name associated with the dataset (from frontend input params)
-        count : counter from 1 to make these names unique (external)
-
-    Returns:
-        str : the name made unique
-    """
-    if str(name_data) == 'Dataset':
-        return str(name_data) + '_' + str(count)
-    else:
-        return str(name_data)
-
-############################ map_methods ######################################
-FoldChange = Union[float, int, str] # Union[float, Literal[0, "-INF", "INF"]]
-def fold_change(avg1 :float, avg2 :float) -> FoldChange:
-    """
-    Calculates a fold-change-like score between two average flux values.
-
-    Args:
-        avg1 : average flux value from one dataset
-        avg2 : average flux value from the other dataset
-
-    Returns:
-        FoldChange :
-            0 : when both input values are 0
-            "-INF" : when avg1 is 0
-            "INF" : when avg2 is 0
-            float : for any other combination of values
-    """
-    if avg1 == 0 and avg2 == 0:
-        return 0
-    elif avg1 == 0:
-        return '-INF'
-    elif avg2 == 0:
-        return 'INF'
-    else: # (threshold_F_C - 1) / (abs(threshold_F_C) + 1) with threshold_F_C > 1
-        return (avg1 - avg2) / (abs(avg1) + abs(avg2))
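-    # e.g. avg1 = 3, avg2 = 1 gives (3 - 1) / (3 + 1) = 0.5; the score is
-    # bounded in [-1, 1], so the user's fold-change threshold is rescaled with
-    # the formula noted above before comparison (e.g. 1.5 maps to 0.2).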
-    
-def fix_style(l :str, col :Optional[str], width :str, dash :str) -> str:
-    """
-    Produces a "fixed" style string to assign to a reaction arrow in the SVG map, assigning style properties to the corresponding values passed as input params.
-
-    Args:
-        l : current style string of an SVG element
-        col : new value for the "stroke" style property
-        width : new value for the "stroke-width" style property
-        dash : new value for the "stroke-dasharray" style property
-
-    Returns:
-        str : the fixed style string
-    """
-    tmp = l.split(';')
-    flag_col = False
-    flag_width = False
-    flag_dash = False
-    for i in range(len(tmp)):
-        if tmp[i].startswith('stroke:'):
-            tmp[i] = 'stroke:' + col
-            flag_col = True
-        if tmp[i].startswith('stroke-width:'):
-            tmp[i] = 'stroke-width:' + width
-            flag_width = True
-        if tmp[i].startswith('stroke-dasharray:'):
-            tmp[i] = 'stroke-dasharray:' + dash
-            flag_dash = True
-    if not flag_col:
-        tmp.append('stroke:' + col)
-    if not flag_width:
-        tmp.append('stroke-width:' + width)
-    if not flag_dash:
-        tmp.append('stroke-dasharray:' + dash)
-    return ';'.join(tmp)
-
-# The type of d values is collapsed, losing precision, because the dict contains lists instead of tuples, please fix!
-def fix_map(d :Dict[str, List[Union[float, FoldChange]]], core_map :ET.ElementTree, threshold_P_V :float, threshold_F_C :float, max_z_score :float) -> ET.ElementTree:
-    """
-    Edits the selected SVG map based on the p-value and fold change data (d) and some significance thresholds also passed as inputs.
-
-    Args:
-        d : dictionary mapping a p-value and a fold-change value (values) to each reaction ID as encoded in the SVG map (keys)
-        core_map : SVG map to modify
-        threshold_P_V : threshold for a p-value to be considered significant
-        threshold_F_C : threshold for a fold change value to be considered significant
-        max_z_score : highest z-score (absolute value)
-    
-    Returns:
-        ET.ElementTree : the modified core_map
-
-    Side effects:
-        core_map : mut
-    """
-    maxT = 12
-    minT = 2
-    grey = '#BEBEBE'
-    blue = '#6495ed'
-    red = '#ecac68'
-    for el in core_map.iter():
-        el_id = str(el.get('id'))
-        if el_id.startswith('R_'):
-            tmp = d.get(el_id[2:])
-            if tmp is not None:
-                p_val :float = tmp[0]
-                f_c = tmp[1]
-                z_score = tmp[2]
-                if p_val < threshold_P_V:
-                    if not isinstance(f_c, str):
-                        if abs(f_c) < ((threshold_F_C - 1) / (abs(threshold_F_C) + 1)): # user threshold rescaled to the bounded fold-change space (see fold_change)
-                            col = grey
-                            width = str(minT)
-                        else:
-                            if f_c < 0:
-                                col = blue
-                            elif f_c > 0:
-                                col = red
-                            width = str(max((abs(z_score) * maxT) / max_z_score, minT))
-                    else:
-                        if f_c == '-INF':
-                            col = blue
-                        elif f_c == 'INF':
-                            col = red
-                        width = str(maxT)
-                    dash = 'none'
-                else:
-                    dash = '5,5'
-                    col = grey
-                    width = str(minT)
-                el.set('style', fix_style(el.get('style', ""), col, width, dash))
-    return core_map
-
-def getElementById(reactionId :str, metabMap :ET.ElementTree) -> utils.Result[ET.Element, utils.Result.ResultErr]:
-    """
-    Finds any element in the given map with the given ID. ID uniqueness in an svg file is recommended but
-    not enforced, if more than one element with the exact ID is found only the first will be returned.
-
-    Args:
-        reactionId (str): exact ID of the requested element.
-        metabMap (ET.ElementTree): metabolic map containing the element.
-
-    Returns:
-        utils.Result[ET.Element, ResultErr]: result of the search, either the first match found or a ResultErr.
-    """
-    return utils.Result.Ok(
-        f"//*[@id=\"{reactionId}\"]").map(
-        lambda xPath : metabMap.xpath(xPath)[0]).mapErr(
-        lambda _ : utils.Result.ResultErr(f"No elements with ID \"{reactionId}\" found in map"))
-        # ^^^ we shamelessly ignore the contents of the IndexError, it offers nothing to the user.
-
-def styleMapElement(element :ET.Element, styleStr :str) -> None:
-    currentStyles :str = element.get("style", "")
-    if re.search(r";stroke:[^;]+;stroke-width:[^;]+;stroke-dasharray:[^;]+$", currentStyles):
-        currentStyles = ';'.join(currentStyles.split(';')[:-3])
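-        # ^^^ strips a previously applied arrow style (the trailing stroke,
-        # stroke-width and stroke-dasharray properties) so repeated calls do
-        # not accumulate style entries.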
-
-    element.set("style", currentStyles + styleStr)
-
-class ReactionDirection(Enum):
-    Unknown = ""
-    Direct  = "_F"
-    Inverse = "_B"
-
-    @classmethod
-    def fromDir(cls, s :str) -> "ReactionDirection":
-        # vvv as long as there's so few variants I actually condone the if spam:
-        if s == ReactionDirection.Direct.value:  return ReactionDirection.Direct
-        if s == ReactionDirection.Inverse.value: return ReactionDirection.Inverse
-        return ReactionDirection.Unknown
-
-    @classmethod
-    def fromReactionId(cls, reactionId :str) -> "ReactionDirection":
-        return ReactionDirection.fromDir(reactionId[-2:])
-
-def getArrowBodyElementId(reactionId :str) -> str:
-    if reactionId.endswith("_RV"): reactionId = reactionId[:-3] #TODO: standardize _RV
-    elif ReactionDirection.fromReactionId(reactionId) is not ReactionDirection.Unknown: reactionId = reactionId[:-2]
-    return f"R_{reactionId}"
-
-def getArrowHeadElementId(reactionId :str) -> Tuple[str, str]:
-    """
-    We attempt extracting the direction information from the provided reaction ID, if unsuccessful we provide the IDs of both directions.
-
-    Args:
-        reactionId : the provided reaction ID.
-
-    Returns:
-        Tuple[str, str]: either a single str ID for the correct arrow head followed by an empty string or both options to try.
-    """
-    if reactionId.endswith("_RV"): reactionId = reactionId[:-3] #TODO: standardize _RV
-    elif ReactionDirection.fromReactionId(reactionId) is not ReactionDirection.Unknown: return reactionId[:-3:-1] + reactionId[:-2], ""
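-    # ^^^ reactionId[:-3:-1] reverses the last two characters ("_F" -> "F_"),
-    # so e.g. "Pyr_F" yields ("F_Pyr", ""); the empty second element signals
-    # that the direction was unambiguous.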
-    return f"F_{reactionId}", f"B_{reactionId}"
-
-class ArrowColor(Enum):
-    """
-    Encodes possible arrow colors based on their meaning in the enrichment process.
-    """
-    Invalid       = "#BEBEBE" # gray, fold-change under threshold
-    Transparent   = "#ffffff00" # white, not significant p-value
-    UpRegulated   = "#ecac68" # red, up-regulated reaction
-    DownRegulated = "#6495ed" # blue, down-regulated reaction
-
-    UpRegulatedInv = "#FF0000"
-    # ^^^ stronger shade of red, used for the up-regulated net value of a reversible reaction
-    # with conflicting enrichment in the two directions.
-
-    DownRegulatedInv = "#0000FF"
-    # ^^^ stronger shade of blue, used for the down-regulated net value of a reversible reaction
-    # with conflicting enrichment in the two directions.
-
-    @classmethod
-    def fromFoldChangeSign(cls, foldChange :float, *, useAltColor = False) -> "ArrowColor":
-        colors = (cls.DownRegulated, cls.DownRegulatedInv) if foldChange < 0 else (cls.UpRegulated, cls.UpRegulatedInv)
-        return colors[useAltColor]
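-        # the bool indexes the pair: False -> 0 (normal shade), True -> 1 (alternative shade)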
-
-    def __str__(self) -> str: return self.value
-
-class Arrow:
-    """
-    Models the properties of a reaction arrow that change based on enrichment.
-    """
-    MIN_W = 2
-    MAX_W = 12
-
-    def __init__(self, width :int, col: ArrowColor, *, isDashed = False) -> None:
-        """
-        (Private) Initializes an instance of Arrow.
-
-        Args:
-            width : width of the arrow, ideally to be kept within Arrow.MIN_W and Arrow.MAX_W (not enforced).
-            col : color of the arrow.
-            isDashed : whether the arrow should be dashed, meaning the associated pValue resulted not significant.
-        
-        Returns:
-            None : practically, an Arrow instance.
-        """
-        self.w    = width
-        self.col  = col
-        self.dash = isDashed
-    
-    def applyTo(self, reactionId :str, metabMap :ET.ElementTree, styleStr :str) -> None:
-        if getElementById(reactionId, metabMap).map(lambda el : styleMapElement(el, styleStr)).isErr:
-            ERRORS.append(reactionId)
-
-    def styleReactionElements(self, metabMap :ET.ElementTree, reactionId :str, *, mindReactionDir = True) -> None:
-        if not mindReactionDir:
-            return self.applyTo(getArrowBodyElementId(reactionId), metabMap, self.toStyleStr())
-        
-        # Now we style the arrow head(s):
-        idOpt1, idOpt2 = getArrowHeadElementId(reactionId)
-        self.applyTo(idOpt1, metabMap, self.toStyleStr(downSizedForTips = True))
-        if idOpt2: self.applyTo(idOpt2, metabMap, self.toStyleStr(downSizedForTips = True))
-
-    def styleReactionElementsMeanMedian(self, metabMap :ET.ElementTree, reactionId :str, isNegative:bool) -> None:
-
-        self.applyTo(getArrowBodyElementId(reactionId), metabMap, self.toStyleStr())
-        idOpt1, idOpt2 = getArrowHeadElementId(reactionId)
-
-        if isNegative:
-            self.applyTo(idOpt2, metabMap, self.toStyleStr(downSizedForTips = True))
-            self.col = ArrowColor.Transparent
-            self.applyTo(idOpt1, metabMap, self.toStyleStr(downSizedForTips = True)) # transparent
-        else:
-            self.applyTo(idOpt1, metabMap, self.toStyleStr(downSizedForTips = True))
-            self.col = ArrowColor.Transparent
-            self.applyTo(idOpt2, metabMap, self.toStyleStr(downSizedForTips = True)) # transparent
-
-
-    
-    def getMapReactionId(self, reactionId :str, mindReactionDir :bool) -> str:
-        """
-        Computes the reaction ID as encoded in the map for a given reaction ID from the dataset.
-
-        Args:
-            reactionId: the reaction ID, as encoded in the dataset.
-            mindReactionDir: if True forward (F_) and backward (B_) directions will be encoded in the result.
-    
-        Returns:
-            str : the ID of an arrow's body or tips in the map.
-        """
-        # we assume the reactionIds also don't encode reaction dir if they don't mind it when styling the map.
-        if not mindReactionDir: return "R_" + reactionId
-
-        #TODO: this is clearly something we need to make consistent in fluxes
-        return (reactionId[:-3:-1] + reactionId[:-2]) if reactionId[-2:] in ["_F", "_B"] else f"F_{reactionId}" # "Pyr_F" --> "F_Pyr"
-
-    def toStyleStr(self, *, downSizedForTips = False) -> str:
-        """
-        Collapses the styles of this Arrow into a str, ready to be applied as part of the "style" property on an svg element.
-
-        Returns:
-            str : the styles string.
-        """
-        width = self.w
-        if downSizedForTips: width *= 0.8
-        return f";stroke:{self.col};stroke-width:{width};stroke-dasharray:{'5,5' if self.dash else 'none'}"
-
-# vvv These constants could be static properties on the Arrow class itself, but Python
-# offers no concise way to declare them there, so they live at module level.
-INVALID_ARROW = Arrow(Arrow.MIN_W, ArrowColor.Invalid)
-INSIGNIFICANT_ARROW = Arrow(Arrow.MIN_W, ArrowColor.Invalid, isDashed = True)
-
-def applyFluxesEnrichmentToMap(fluxesEnrichmentRes :Dict[str, Union[Tuple[float, FoldChange], Tuple[float, FoldChange, float, float]]], metabMap :ET.ElementTree, maxNumericZScore :float) -> None:
-    """
-    Applies fluxes enrichment results to the provided metabolic map.
-
-    Args:
-        fluxesEnrichmentRes : fluxes enrichment results.
-        metabMap : the metabolic map to edit.
-        maxNumericZScore : biggest finite z-score value found.
-    
-    Side effects:
-        metabMap : mut
-    
-    Returns:
-        None
-    """
-    for reactionId, values in fluxesEnrichmentRes.items():
-        pValue = values[0]
-        foldChange = values[1]
-        z_score = values[2]
-
-        if isinstance(foldChange, str): foldChange = float(foldChange)
-        if pValue >= ARGS.pValue: # pValue above threshold: dashed arrow
-            INSIGNIFICANT_ARROW.styleReactionElements(metabMap, reactionId)
-            INSIGNIFICANT_ARROW.styleReactionElements(metabMap, reactionId, mindReactionDir = False)
-
-            continue
-
-        if abs(foldChange) <  (ARGS.fChange - 1) / (abs(ARGS.fChange) + 1):
-            INVALID_ARROW.styleReactionElements(metabMap, reactionId)
-            INVALID_ARROW.styleReactionElements(metabMap, reactionId, mindReactionDir = False)
-
-            continue
-        
-        width = Arrow.MAX_W
-        if not math.isinf(foldChange):
-            try: 
-                width = max(abs(z_score * Arrow.MAX_W) / maxNumericZScore, Arrow.MIN_W) 
-
-            except ZeroDivisionError: pass
-        
-        #if not reactionId.endswith("_RV"): # RV stands for reversible reactions
-        #    Arrow(width, ArrowColor.fromFoldChangeSign(foldChange)).styleReactionElements(metabMap, reactionId)
-        #    continue
-        
-        #reactionId = reactionId[:-3] # Remove "_RV"
-        
-        inversionScore = (values[3] < 0) + (values[4] < 0) # Compacts the signs of averages into 1 easy to check score
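-        # 0 = both averages positive, 1 = opposite signs, 2 = both negative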
-        if inversionScore == 2: foldChange *= -1
-        # ^^^ Style the inverse direction with the opposite sign netValue
-        
-        # If the score is 1 (opposite signs) we use alternative colors vvv
-        arrow = Arrow(width, ArrowColor.fromFoldChangeSign(foldChange, useAltColor = inversionScore == 1))
-        
-        # vvv These 2 if statements can both be true and can both happen
-        if ARGS.net: # style arrow head(s):
-            arrow.styleReactionElements(metabMap, reactionId + ("_B" if inversionScore == 2 else "_F"))
-            arrow.applyTo(("F_" if inversionScore == 2 else "B_") + reactionId, metabMap, f";stroke:{ArrowColor.Transparent};stroke-width:0;stroke-dasharray:none")
-
-        arrow.styleReactionElements(metabMap, reactionId, mindReactionDir = False)
-
-
-############################ split class ######################################
-def split_class(classes :pd.DataFrame, resolve_rules :Dict[str, List[float]]) -> Dict[str, List[List[float]]]:
-    """
-    Generates a :dict that groups together data from a :DataFrame based on classes the data is related to.
-
-    Args:
-        classes : a :DataFrame of only string values, containing class information (rows) and keys to query the resolve_rules :dict
-        resolve_rules : a :dict containing :float data
-
-    Returns:
-        dict : the dict with data grouped by class
-
-    Side effects:
-        classes : mut
-    """
-    class_pat :Dict[str, List[List[float]]] = {}
-    for i in range(len(classes)):
-        classe :str = classes.iloc[i, 1]
-        if pd.isnull(classe): continue
-
-        l :List[List[float]] = []
-        for j in range(i, len(classes)):
-            if classes.iloc[j, 1] == classe:
-                pat_id :str = classes.iloc[j, 0]
-                tmp = resolve_rules.get(pat_id, None)
-                if tmp is not None:
-                    l.append(tmp)
-                classes.iloc[j, 1] = None
-        
-        if l:
-            class_pat[classe] = list(map(list, zip(*l)))
-            continue
-        
-        utils.logWarning(
-            f"Warning: no sample found in class \"{classe}\", the class has been disregarded", ARGS.out_log)
-    
-    return class_pat
-
-############################ conversion ##############################################
-#conversion from svg to png 
-def svg_to_png_with_background(svg_path :utils.FilePath, png_path :utils.FilePath, dpi :int = 72, scale :int = 1, size :Optional[float] = None) -> None:
-    """
-    Internal utility to convert an SVG to PNG (forced opaque) to aid in PDF conversion.
-
-    Args:
-        svg_path : path to SVG file
-        png_path : path for new PNG file
-        dpi : dots per inch of the generated PNG
-        scale : scaling factor for the generated PNG, computed internally when a size is provided
-        size : final effective width of the generated PNG
-
-    Returns:
-        None
-    """
-    if size:
-        image = pyvips.Image.new_from_file(svg_path.show(), dpi=dpi, scale=1)
-        scale = size / image.width
-        image = image.resize(scale)
-    else:
-        image = pyvips.Image.new_from_file(svg_path.show(), dpi=dpi, scale=scale)
-
-    white_background = pyvips.Image.black(image.width, image.height).new_from_image([255, 255, 255])
-    white_background = white_background.affine([scale, 0, 0, scale])
-
-    if white_background.bands != image.bands:
-        white_background = white_background.extract_band(0)
-
-    composite_image = white_background.composite2(image, 'over')
-    composite_image.write_to_file(png_path.show())
-
-# single helper function: the files are kept outside and passed in as inputs
-#conversion from png to pdf
-def convert_png_to_pdf(png_file :utils.FilePath, pdf_file :utils.FilePath) -> None:
-    """
-    Internal utility to convert a PNG to PDF, to aid conversion from SVG.
-
-    Args:
-        png_file : path to PNG file
-        pdf_file : path to new PDF file
-
-    Returns:
-        None
-    """
-    image = Image.open(png_file.show())
-    image = image.convert("RGB")
-    image.save(pdf_file.show(), "PDF", resolution=100.0)
-
-#function called to reduce redundancy in the code
-def convert_to_pdf(file_svg :utils.FilePath, file_png :utils.FilePath, file_pdf :utils.FilePath) -> None:
-    """
-    Converts the SVG map at the provided path to PDF.
-
-    Args:
-        file_svg : path to SVG file
-        file_png : path to PNG file
-        file_pdf : path to new PDF file
-
-    Returns:
-        None
-    """
-    svg_to_png_with_background(file_svg, file_png)
-    try:
-        convert_png_to_pdf(file_png, file_pdf)
-        print(f'PDF file {file_pdf.filePath} successfully generated.')
-    
-    except Exception as e:
-        raise utils.DataErr(file_pdf.show(), f'Error generating PDF file: {e}')
-
-############################ map ##############################################
-def buildOutputPath(dataset1Name :str, dataset2Name = "rest", *, details = "", ext :utils.FileFormat) -> utils.FilePath:
-    """
-    Builds a FilePath instance from the names of confronted datasets ready to point to a location in the
-    "result/" folder, used by this tool for output files in collections.
-
-    Args:
-        dataset1Name : name of the first dataset in the comparison.
-        dataset2Name : name of the second dataset. Defaults to "rest".
-        details : extra details to embed in the file name.
-        ext : the file format of the output.
-
-    Returns:
-        utils.FilePath : path pointing to the output file in the "result/" folder.
-    """
-    # This function returns a util data structure but is extremely specific to this module.
-    # RAS also uses collections as output and as such might benefit from a method like this, but I'd wait
-    # TODO: until a third tool with multiple outputs appears before porting this to utils.
-    return utils.FilePath(
-        f"{dataset1Name}_vs_{dataset2Name}" + (f" ({details})" if details else ""),
-        # ^^^ yes this string is built every time even if the form is the same for the same 2 datasets in
-        # all output files: I don't care, this was never the performance bottleneck of the tool and
-        # there is no other net gain in saving and re-using the built string.
-        ext,
-        prefix = "result")
-
-FIELD_NOT_AVAILABLE = '/'
-def writeToCsv(rows: List[list], fieldNames :List[str], outPath :utils.FilePath) -> None:
-    fieldsAmt = len(fieldNames)
-    with open(outPath.show(), "w", newline = "") as fd:
-        writer = csv.DictWriter(fd, fieldnames = fieldNames, delimiter = '\t')
-        writer.writeheader()
-        
-        for row in rows:
-            sizeMismatch = fieldsAmt - len(row)
-            if sizeMismatch > 0: row.extend([FIELD_NOT_AVAILABLE] * sizeMismatch)
-            writer.writerow({ field : data for field, data in zip(fieldNames, row) })
-
-OldEnrichedScores = Dict[str, List[Union[float, FoldChange]]] #TODO: try to use Tuple whenever possible
-def writeTabularResult(enrichedScores : OldEnrichedScores, outPath :utils.FilePath) -> None:
-    fieldNames = ["ids", "P_Value", "fold change"]
-    fieldNames.extend(["average_1", "average_2"])
-
-    writeToCsv([ [reactId] + values for reactId, values in enrichedScores.items() ], fieldNames, outPath)
-
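-# For reference: the resulting TSV has the columns ids, P_Value, fold change, average_1,
-# average_2; rows shorter than the header are padded with FIELD_NOT_AVAILABLE by writeToCsv.
-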
-def temp_thingsInCommon(tmp :Dict[str, List[Union[float, FoldChange]]], core_map :ET.ElementTree, max_z_score :float, dataset1Name :str, dataset2Name = "rest") -> None:
-    # this function compiles the things always in common between comparison modes after enrichment.
-    # TODO: organize, name better.
-    writeTabularResult(tmp, buildOutputPath(dataset1Name, dataset2Name, details = "Tabular Result", ext = utils.FileFormat.TSV))
-    for reactId, enrichData in tmp.items(): tmp[reactId] = tuple(enrichData)
-    applyFluxesEnrichmentToMap(tmp, core_map, max_z_score)
-
-def computePValue(dataset1Data: List[float], dataset2Data: List[float]) -> Tuple[float, float]:
-    """
-    Computes the statistical significance score (P-value) of the comparison between coherent data
-    from two datasets. The data is supposed to, in both datasets:
-    - be related to the same reaction ID;
-    - be ordered by sample, such that the item at position i in both lists is related to the
-      same sample or cell line.
-
-    Args:
-        dataset1Data : data from the 1st dataset.
-        dataset2Data : data from the 2nd dataset.
-
-    Returns:
-        tuple: (P-value, Z-score)
-            - P-value from a Kolmogorov-Smirnov test on the provided data.
-            - Z-score of the difference between means of the two datasets.
-    """
-    # Perform Kolmogorov-Smirnov test
-    ks_statistic, p_value = st.ks_2samp(dataset1Data, dataset2Data)
-    
-    # Calculate means and standard deviations
-    mean1 = np.mean(dataset1Data)
-    mean2 = np.mean(dataset2Data)
-    std1 = np.std(dataset1Data, ddof=1)
-    std2 = np.std(dataset2Data, ddof=1)
-    
-    n1 = len(dataset1Data)
-    n2 = len(dataset2Data)
-    
-    # Calculate Z-score
-    z_score = (mean1 - mean2) / np.sqrt((std1**2 / n1) + (std2**2 / n2))
-    
-    return p_value, z_score
-
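-# Illustrative call (a sketch with made-up numbers, shown as a comment so nothing runs at import):
-#   p, z = computePValue([0.10, 0.20, 0.30, 0.25], [0.50, 0.60, 0.55, 0.70])
-#   # p: two-sample KS-test p-value; z: Welch-style z-score of the difference between means
-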
-def compareDatasetPair(dataset1Data :List[List[float]], dataset2Data :List[List[float]], ids :List[str]) -> Tuple[Dict[str, List[Union[float, FoldChange]]], float]:
-    #TODO: the following code still suffers from "dumbvarnames-osis"
-    tmp :Dict[str, List[Union[float, FoldChange]]] = {}
-    count   = 0
-    max_z_score = 0
-
-    for l1, l2 in zip(dataset1Data, dataset2Data):
-        reactId = ids[count]
-        count += 1
-        if not reactId: continue # we skip ids that have already been processed
-
-        try: 
-            p_value, z_score = computePValue(l1, l2)
-            avg1 = sum(l1) / len(l1)
-            avg2 = sum(l2) / len(l2)
-            avg = fold_change(avg1, avg2)
-            if not isinstance(z_score, str) and max_z_score < abs(z_score): max_z_score = abs(z_score)
-            tmp[reactId] = [float(p_value), avg, z_score, avg1, avg2]
-        except (TypeError, ZeroDivisionError): continue
-    
-    return tmp, max_z_score
-
-def computeEnrichment(metabMap :ET.ElementTree, class_pat :Dict[str, List[List[float]]], ids :List[str]) -> None:
-    """
-    Compares clustered data based on a given comparison mode and applies enrichment-based styling on the
-    provided metabolic map.
-
-    Args:
-        metabMap : SVG map to modify.
-        class_pat : the clustered data.
-        ids : ids for data association.
-        
-
-    Returns:
-        None
-
-    Raises:
-        sys.exit : if fewer than 2 classes are provided for comparison
-    
-    Side effects:
-        metabMap : mut
-        ids : mut
-    """
-    class_pat = { k.strip() : v for k, v in class_pat.items() }
-    #TODO: simplify this stuff vvv and stop using sys.exit (raise the correct utils error)
-    if (not class_pat) or (len(class_pat.keys()) < 2): sys.exit('Execution aborted: fewer than two classes were provided for comparison\n')
-
-    if ARGS.comparison == "manyvsmany":
-        for i, j in it.combinations(class_pat.keys(), 2):
-            #TODO: these 2 functions are always called in pair and in this order and need common data,
-            # some clever refactoring would be appreciated.
-            comparisonDict, max_z_score = compareDatasetPair(class_pat.get(i), class_pat.get(j), ids)
-            temp_thingsInCommon(comparisonDict, metabMap, max_z_score, i, j)
-    
-    elif ARGS.comparison == "onevsrest":
-        for single_cluster in class_pat.keys():
-            t :List[List[List[float]]] = []
-            for k in class_pat.keys():
-                if k != single_cluster:
-                    t.append(class_pat.get(k))
-            
-            rest :List[List[float]] = []
-            for i in t:
-                rest = rest + i
-            
-            comparisonDict, max_z_score = compareDatasetPair(class_pat.get(single_cluster), rest, ids)
-            temp_thingsInCommon(comparisonDict, metabMap, max_z_score, single_cluster)
-    
-    elif ARGS.comparison == "onevsmany":
-        controlItems = class_pat.get(ARGS.control)
-        for otherDataset in class_pat.keys():
-            if otherDataset == ARGS.control: continue
-            
-            comparisonDict, max_z_score = compareDatasetPair(controlItems, class_pat.get(otherDataset), ids)
-            temp_thingsInCommon(comparisonDict, metabMap, max_z_score, ARGS.control, otherDataset)
-
-def createOutputMaps(dataset1Name :str, dataset2Name :str, core_map :ET.ElementTree) -> None:
-    svgFilePath = buildOutputPath(dataset1Name, dataset2Name, details = "SVG Map", ext = utils.FileFormat.SVG)
-    utils.writeSvg(svgFilePath, core_map)
-
-    if ARGS.generate_pdf:
-        pngPath = buildOutputPath(dataset1Name, dataset2Name, details = "PNG Map", ext = utils.FileFormat.PNG)
-        pdfPath = buildOutputPath(dataset1Name, dataset2Name, details = "PDF Map", ext = utils.FileFormat.PDF)
-        convert_to_pdf(svgFilePath, pngPath, pdfPath)                     
-
-    if not ARGS.generate_svg: os.remove(svgFilePath.show())
-
-ClassPat = Dict[str, List[List[float]]]
-def getClassesAndIdsFromDatasets(datasetsPaths :List[str], datasetPath :str, classPath :str, names :List[str]) -> Tuple[List[str], ClassPat]:
-    # TODO: I suggest creating dicts with ids as keys instead of keeping class_pat and ids separate,
-    # for the sake of everyone's sanity.
-    class_pat :ClassPat = {}
-    if ARGS.option == 'datasets':
-        num = 1 #TODO: the dataset naming function could be a generator
-        for path, name in zip(datasetsPaths, names):
-            name = name_dataset(name, num)
-            resolve_rules_float, ids = getDatasetValues(path, name)
-            if resolve_rules_float is not None:
-                class_pat[name] = list(map(list, zip(*resolve_rules_float.values())))
-        
-            num += 1
-    
-    elif ARGS.option == "dataset_class":
-        classes = read_dataset(classPath, "class")
-        classes = classes.astype(str)
-
-        resolve_rules_float, ids = getDatasetValues(datasetPath, "Dataset Class (not actual name)")
-        if resolve_rules_float is not None: class_pat = split_class(classes, resolve_rules_float)
-    
-    return ids, class_pat
-    #^^^ TODO: this could be a match statement over an enum, make it happen future marea dev with python 3.12! (it's why I kept the ifs)
-
-#TODO: create these damn args as FilePath objects
-def getDatasetValues(datasetPath :str, datasetName :str) -> Tuple[ClassPat, List[str]]:
-    """
-    Opens the dataset at the given path and extracts the values (expected nullable numerics) and the IDs.
-
-    Args:
-        datasetPath : path to the dataset
-        datasetName (str): dataset name, used in error reporting
-
-    Returns:
-        Tuple[ClassPat, List[str]]: values and IDs extracted from the dataset
-    """
-    dataset = read_dataset(datasetPath, datasetName)
-    IDs = dataset.iloc[:, 0].astype(str).tolist()
-
-    dataset = dataset.drop(dataset.columns[0], axis = "columns").to_dict("list")
-    return { id : list(map(utils.Float("Dataset values, not an argument"), values)) for id, values in dataset.items() }, IDs
-
-def rgb_to_hex(rgb):
-    """
-    Convert RGB values (0-1 range) to hexadecimal color format.
-
-    Args:
-        rgb (numpy.ndarray): An array of RGB color components (in the range [0, 1]).
-
-    Returns:
-        str: The color in hexadecimal format (e.g., '#ff0000' for red).
-    """
-    # Convert RGB values (0-1 range) to hexadecimal format
-    rgb = (np.array(rgb) * 255).astype(int)
-    return '#{:02x}{:02x}{:02x}'.format(rgb[0], rgb[1], rgb[2])
-
-
-
-def save_colormap_image(min_value: float, max_value: float, path: utils.FilePath, colorMap:str="viridis"):
-    """
-    Create and save an image of the colormap showing the gradient and its range.
-
-    Args:
-        min_value (float): The minimum value of the colormap range.
-        max_value (float): The maximum value of the colormap range.
-        path (utils.FilePath): The path where the image will be saved.
-        colorMap (str): Name of the matplotlib colormap to use. Defaults to "viridis".
-    """
-
-    # Create a colormap using matplotlib
-    cmap = plt.get_cmap(colorMap)
-
-    # Create a figure and axis
-    fig, ax = plt.subplots(figsize=(6, 1))
-    fig.subplots_adjust(bottom=0.5)
-
-    # Create a gradient image
-    gradient = np.linspace(0, 1, 256)
-    gradient = np.vstack((gradient, gradient))
-
-    # Add min and max value annotations
-    ax.text(0, 0.5, f'{np.round(min_value, 3)}', va='center', ha='right', transform=ax.transAxes, fontsize=12, color='black')
-    ax.text(1, 0.5, f'{np.round(max_value, 3)}', va='center', ha='left', transform=ax.transAxes, fontsize=12, color='black')
-
-
-    # Display the gradient image
-    ax.imshow(gradient, aspect='auto', cmap=cmap)
-    ax.set_axis_off()
-
-    # Save the image
-    plt.savefig(path.show(), bbox_inches='tight', pad_inches=0)
-    plt.close()
-
-def min_nonzero_abs(arr):
-    # Flatten the array and filter out zeros, then find the minimum of the remaining values
-    non_zero_elements = np.abs(arr)[np.abs(arr) > 0]
-    return np.min(non_zero_elements) if non_zero_elements.size > 0 else None
-
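-# e.g. min_nonzero_abs(np.array([0.0, -0.5, 2.0])) -> 0.5; an all-zero array yields None
-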
-def computeEnrichmentMeanMedian(metabMap: ET.ElementTree, class_pat: Dict[str, List[List[float]]], ids: List[str], colormap:str) -> None:
-    """
-    Compute and visualize the metabolic map based on mean and median of the input fluxes.
-    The fluxes are normalised across classes/datasets and visualised using the given colormap.
-
-    Args:
-        metabMap (ET.ElementTree): An XML tree representing the metabolic map.
-        class_pat (Dict[str, List[List[float]]]): A dictionary where keys are class names and values are lists of enrichment values.
-        ids (List[str]): A list of reaction IDs to be used for coloring arrows.
-        colormap (str): Name of the matplotlib colormap used for the visualization.
-    
-    Returns:
-        None
-    """
-    # Work on deep copies so the original map stays untouched
-    metabMap_mean = copy.deepcopy(metabMap)
-    metabMap_median = copy.deepcopy(metabMap)
-
-    # Compute medians and means
-    medians = {key: np.round(np.median(np.array(value), axis=1), 6) for key, value in class_pat.items()}
-    means = {key: np.round(np.mean(np.array(value), axis=1),6) for key, value in class_pat.items()}
-
-    # Normalize medians and means
-    max_flux_medians = max(np.max(np.abs(arr)) for arr in medians.values())
-    max_flux_means = max(np.max(np.abs(arr)) for arr in means.values())
-
-    min_flux_medians = min(min_nonzero_abs(arr) for arr in medians.values())
-    min_flux_means = min(min_nonzero_abs(arr) for arr in means.values())
-
-    medians = {key: median/max_flux_medians for key, median in medians.items()}
-    means = {key: mean/max_flux_means for key, mean in means.items()}
-
-    save_colormap_image(min_flux_medians, max_flux_medians, utils.FilePath("Color map median", ext=utils.FileFormat.PNG, prefix="result"), colormap)
-    save_colormap_image(min_flux_means, max_flux_means, utils.FilePath("Color map mean", ext=utils.FileFormat.PNG, prefix="result"), colormap)
-
-    cmap = plt.get_cmap(colormap)
-
-    for key in class_pat:
-        # Create color mappings for median and mean
-        colors_median = {
-            rxn_id: rgb_to_hex(cmap(abs(medians[key][i]))) if medians[key][i] != 0 else '#bebebe'  # grey for blocked (zero-valued) reactions
-            for i, rxn_id in enumerate(ids)
-        }
-
-        colors_mean = {
-            rxn_id: rgb_to_hex(cmap(abs(means[key][i]))) if means[key][i] != 0 else '#bebebe'  # grey for blocked (zero-valued) reactions
-            for i, rxn_id in enumerate(ids)
-        }
-
-        for i, rxn_id in enumerate(ids):
-            isNegative = medians[key][i] < 0
-
-            # Apply median arrows
-            apply_arrow(metabMap_median, rxn_id, colors_median[rxn_id], isNegative)
-
-            isNegative = means[key][i] < 0
-            # Apply mean arrows
-            apply_arrow(metabMap_mean, rxn_id, colors_mean[rxn_id], isNegative)
-
-        # Save and convert the SVG files
-        save_and_convert(metabMap_mean, "mean", key)
-        save_and_convert(metabMap_median, "median", key)
-
-def apply_arrow(metabMap, rxn_id, color, isNegative):
-    """
-    Apply an arrow to a specific reaction in the metabolic map with a given color.
-
-    Args:
-        metabMap (ET.ElementTree): An XML tree representing the metabolic map.
-        rxn_id (str): The ID of the reaction to which the arrow will be applied.
-        color (str): The color of the arrow in hexadecimal format.
-        isNegative (bool): True if the reaction's value is negative; forwarded to the arrow styling.
-
-    Returns:
-        None
-    """
-    arrow = Arrow(width=5, col=color)
-    arrow.styleReactionElementsMeanMedian(metabMap, rxn_id, isNegative)
-
-def save_and_convert(metabMap, map_type, key):
-    """
-    Save the metabolic map as an SVG file and optionally convert it to PNG and PDF formats.
-
-    Args:
-        metabMap (ET.ElementTree): An XML tree representing the metabolic map.
-        map_type (str): The type of map ('mean' or 'median').
-        key (str): The key identifying the specific map.
-
-    Returns:
-        None
-    """
-    svgFilePath = utils.FilePath(f"SVG Map {map_type} - {key}", ext=utils.FileFormat.SVG, prefix="result")
-    utils.writeSvg(svgFilePath, metabMap)
-    if ARGS.generate_pdf:
-        pngPath = utils.FilePath(f"PNG Map {map_type} - {key}", ext=utils.FileFormat.PNG, prefix="result")
-        pdfPath = utils.FilePath(f"PDF Map {map_type} - {key}", ext=utils.FileFormat.PDF, prefix="result")
-        convert_to_pdf(svgFilePath, pngPath, pdfPath)
-    if not ARGS.generate_svg:
-        os.remove(svgFilePath.show())
-
-
-
-    
-############################ MAIN #############################################
-def main() -> None:
-    """
-    Initializes everything and sets the program in motion based on the front-end input arguments.
-
-    Returns:
-        None
-    
-    Raises:
-        sys.exit : if a user-provided custom map is in the wrong format (ET.XMLSyntaxError, ET.XMLSchemaParseError)
-    """
-
-    global ARGS
-    ARGS = process_args()
-
-    if not os.path.isdir('result'): os.makedirs('result')
-    
-    core_map :ET.ElementTree = ARGS.choice_map.getMap(
-        ARGS.tool_dir,
-        utils.FilePath.fromStrPath(ARGS.custom_map) if ARGS.custom_map else None)
-    # TODO: ^^^ ugly but fine for now, the argument is None if the model isn't custom because no file was given.
-    # getMap will None-check the customPath and panic when the model IS custom but there's no file (good). A cleaner
-    # solution can be derived from my comment in FilePath.fromStrPath
-
-    ids, class_pat = getClassesAndIdsFromDatasets(ARGS.input_datas_fluxes, ARGS.input_data_fluxes, ARGS.input_class_fluxes, ARGS.names_fluxes)
-
-    if ARGS.choice_map == utils.Model.HMRcore:
-        temp_map = utils.Model.HMRcore_no_legend
-        computeEnrichmentMeanMedian(temp_map.getMap(ARGS.tool_dir), class_pat, ids, ARGS.color_map)
-    elif ARGS.choice_map == utils.Model.ENGRO2:
-        temp_map = utils.Model.ENGRO2_no_legend
-        computeEnrichmentMeanMedian(temp_map.getMap(ARGS.tool_dir), class_pat, ids, ARGS.color_map)
-    else:
-        computeEnrichmentMeanMedian(core_map, class_pat, ids, ARGS.color_map)
-    
-
-    computeEnrichment(core_map, class_pat, ids)
-    
-    # create output files: TODO: this is the same comparison happening in "maps", find a better way to organize this
-    if ARGS.comparison == "manyvsmany":
-        for i, j in it.combinations(class_pat.keys(), 2): createOutputMaps(i, j, core_map)
-        return
-    
-    if ARGS.comparison == "onevsrest":
-        for single_cluster in class_pat.keys(): createOutputMaps(single_cluster, "rest", core_map)
-        return
-    
-    for otherDataset in class_pat.keys():
-        if otherDataset != ARGS.control: createOutputMaps(ARGS.control, otherDataset, core_map)
-
-    if not ERRORS: return
-    utils.logWarning(
-        f"The following reaction IDs were mentioned in the dataset but weren't found in the map: {ERRORS}",
-        ARGS.out_log)
-    
-    print('Execution succeeded')
-
-###############################################################################
-if __name__ == "__main__":
-    main()
\ No newline at end of file
--- a/marea_2/flux_to_map.xml	Thu Aug 29 20:46:04 2024 +0000
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,245 +0,0 @@
-<tool id="FluxToMap" name="Metabolic Flux Enrichment Analysis" version="2.0.0">
-	<macros>
-		<import>marea_macros.xml</import>
-	</macros>
-	
-	<requirements>
-		<requirement type="package" version="1.24.4">numpy</requirement>
-        <requirement type="package" version="2.0.3">pandas</requirement>
-		<requirement type="package" version="0.13.0">seaborn</requirement>
-        <requirement type="package" version="1.10.1">scipy</requirement>
-		<requirement type="package" version="1.5.1">svglib</requirement>
-		<requirement type="package" version="2.2.3">pyvips</requirement>
-		<requirement type="package" version="2.7.1">cairosvg</requirement>
-		<requirement type="package" version="0.29.0">cobra</requirement>
-		<requirement type="package" version="5.2.2">lxml</requirement>
-	</requirements>
-	
-	<command detect_errors="exit_code">
-		<![CDATA[
-      	python $__tool_directory__/flux_to_map.py
-
-      	--tool_dir $__tool_directory__
-      	--option $cond.type_selector
-        --out_log $log
-		--color_map $color_map
-	
-        #if $cond.type_selector == 'datasets':
-
-			--input_datas_fluxes
-			#for $data in $cond.input_datasets_fluxes:
-				${data.input_fluxes}
-			#end for
-
-			--names_fluxes
-			#for $data in $cond.input_datasets_fluxes:
-				${data.input_name_fluxes}
-			#end for
-
-        #elif $cond.type_selector == 'dataset_class':
-
-			--input_data_fluxes $input_data_fluxes
-			--input_class_fluxes $input_class_fluxes
-
-        #end if
-
-		--comparison ${comparis.comparison}
-		#if $comparis.comparison == 'onevsmany'
-			--control '${comparis.controlgroup}'
-		#end if
-
-		--choice_map '${cond_choice_map.choice_map}'
-		#if $cond_choice_map.choice_map == 'Custom':
-			--custom_map ${cond_choice_map.custom_map}
-		#end if
-		
-		#if $advanced.choice == 'true':
-			--pValue ${advanced.pValue}
-			--fChange ${advanced.fChange}
-			--generate_svg ${advanced.generateSvg}
-			--generate_pdf ${advanced.generatePdf}
-		#else 
-			--pValue 0.05
-			--fChange 1.2
-			--generate_svg false
-			--generate_pdf true
-		#end if
-        ]]>
-	</command>
-	
-	<inputs>
-
-		<conditional name="cond">
-			<param name="type_selector" argument="--option" type="select" label="Input format:">
-				<option value="datasets" selected="true">Fluxes of group 1 + Fluxes of group 2 + ... + Fluxes of group N</option>
-				<option value="dataset_class">All fluxes + sample group specification</option>
-			</param>
-
-			<when value="datasets">
-				<repeat name="input_datasets_fluxes" title="Fluxes dataset" min="2">
-					<param name="input_fluxes" argument="--input_datas_fluxes" type="data" format="tabular, csv, tsv" label="add dataset" />
-					<param name="input_name_fluxes" argument="--names_fluxes" type="text" label="Dataset's name:" value="Dataset" help="Default: Dataset" />
-				</repeat>
-			</when>
-
-			<when value="dataset_class">
-				<param name="input_data_fluxes" argument="--input_data_fluxes" type="data" format="tabular, csv, tsv" label="All fluxes" />
-				<param name="input_class_fluxes" argument="--input_class_fluxes" type="data" format="tabular, csv, tsv" label="Sample group specification" />
-			</when>
-		</conditional>
-
-		<conditional name="comparis">
-			<param name="comparison" argument="--comparison" type="select" label="Groups comparison:">
-				<option value="manyvsmany" selected="true">One vs One</option>
-				<option value="onevsrest">One vs All</option>
-				<option value="onevsmany">One vs Control</option>
-			</param>
-			<when value="onevsmany">
-				<param name="controlgroup" argument="--controlgroup" type="text" label="Control group label:" value="0" help="Name of group label to be compared to others"/>
-			</when>
-		</conditional>
-		
-		<conditional name="cond_choice_map">
-			<param name="choice_map" argument="--choice_map" type="select" label="Choose metabolic map:">
-				<option value="ENGRO2" selected="true">ENGRO2</option>
-				<option value="HMRcore" >HMRcore</option>
-				<option value="Custom">Custom</option>
-			</param>
-
-			<when value="Custom">				
-				<param name="custom_map" argument="--custom_map" type="data" format="xml, svg" label="custom-map.svg"/>
-			</when>
-		</conditional>
-
-		<param name="color_map" argument="--color_map" type="select" label="Color map:">
-				<option value="viridis" selected="true">Viridis</option>
-				<option value="jet">Jet</option>
-		</param>
-
-		<conditional name="advanced">
-			<param name="choice" type="boolean" checked="false" label="Use advanced options?" help="Use this options to choose custom parameters for evaluation: pValue, Fold-Change threshold, how to solve (A and NaN) and specify output maps.">
-				<option value="true" selected="true">No</option>
-				<option value="false">Yes</option>
-			</param>
-
-			<when value="true">
-				<param name="pValue" argument="--pValue" type="float" size="20" value="0.05" max="1" min="0" label="P-value threshold:" help="min value 0" />
-				<param name="fChange" argument="--fChange" type="float" size="20" value="1.2" min="1" label="Fold-Change threshold:" help="min value 1" />
-				<param name="generateSvg" argument="--generateSvg" type="boolean" checked="false" label="Generate SVG map" help="should the program generate an editable svg map of the processes?" />
-				<param name="generatePdf" argument="--generatePdf" type="boolean" checked="true" label="Generate PDF map" help="should the program return a non editable (but displayble) pdf map of the processes?" />
-			</when>
-		</conditional>
-	</inputs>
-
-	<outputs>
-		<data format="txt" name="log" label="FluxToMap - Log" />
-		<collection name="results" type="list" label="FluxToMap - Results">
-			<discover_datasets pattern="__name_and_ext__" directory="result"/>
-		</collection>
-	</outputs>
-	
-	<help>
-	<![CDATA[
-
-What it does
--------------
-
-This tool analyzes and visualizes differences in the reaction fluxes of sample groups returned by the Flux Simulation tool.
-
-Accepted files are: 
-    - option 1) two or more fluxes datasets, each referring to samples in a given group. The user can specify a label for each group;
-    - option 2) one fluxes dataset and one group-file specifying the group each sample belongs to (e.g. the group file returned by the Clustering tool).
-
-Optional files:
-    - custom SVG map. Graphical elements must have the same IDs as the reactions they represent.
-
-The tool generates:
-    - 1) a tab-separated file: reporting fold-change and p-values of fluxes between a pair of conditions/classes;
-    - 2) a metabolic map file (downloadable as .svg): visualizing up- and down-regulated reactions between a pair of conditions/classes;
-    - 3) a log file (.txt).
-    
-Output options:
-To calculate P-Values and Fold-Changes and to enrich maps, comparisons are performed for each possible pair of groups (default option ‘One vs One’).
-
-Alternative options are:
-    - comparison of each group vs. the rest of the samples (option ‘One vs All’);
-    - comparison of each group vs. a control group (option ‘One vs Control’). If this option is selected, the user must indicate the control group label.
-
-Output files will be named classA_vs_classB (e.g. comparing groups labeled "Tumor" and "Control" produces files named Tumor_vs_Control). Reactions will conventionally be reported as up-regulated (down-regulated) if they are significantly more (less) active in the class labeled "classA".
-
-Example input
--------------
-
-"Fluxes of group 1 + Fluxes of group 2 + ... + Fluxes of group N" option:
-
-Fluxes Dataset 1:
-
-+------------+----------------+----------------+----------------+ 
-| Reaction ID|   Patient1     |   Patient2     |   Patient3     |  
-+============+================+================+================+
-| r1642      |    0.523167    |    0.371355    |    0.925661    |  
-+------------+----------------+----------------+----------------+    
-| r1643      |    0.568765    |    0.765567    |    0.456789    |    
-+------------+----------------+----------------+----------------+    
-| r1640      |    0.876545    |    0.768933    |    0.987654    |  
-+------------+----------------+----------------+----------------+
-| r1641      |    0.456788    |    0.876543    |    0.876542    |    
-+------------+----------------+----------------+----------------+    
-| r1646      |    0.876543    |    0.786543    |    0.897654    |   
-+------------+----------------+----------------+----------------+
-
-Fluxes Dataset 2:
-
-+------------+----------------+----------------+----------------+ 
-| Reaction ID|   Patient1     |   Patient2     |   Patient3     |  
-+============+================+================+================+
-| r1642      |    0.523167    |    0.371355    |    0.925661    |  
-+------------+----------------+----------------+----------------+    
-| r1643      |    0.568765    |    0.765567    |    0.456789    |    
-+------------+----------------+----------------+----------------+    
-| r1640      |    0.876545    |    0.768933    |    0.987654    |  
-+------------+----------------+----------------+----------------+
-| r1641      |    0.456788    |    0.876543    |    0.876542    |    
-+------------+----------------+----------------+----------------+    
-| r1646      |    0.876543    |    0.786543    |    0.897654    |   
-+------------+----------------+----------------+----------------+
-
-"Fluxes of all samples + sample group specification" option:
-
-Fluxes Dataset:
-
-+------------+----------------+----------------+----------------+ 
-| Reaction ID|   Patient1     |   Patient2     |   Patient3     |  
-+============+================+================+================+
-| r1642      |    0.523167    |    0.371355    |    0.925661    |  
-+------------+----------------+----------------+----------------+    
-| r1643      |    0.568765    |    0.765567    |    0.456789    |    
-+------------+----------------+----------------+----------------+    
-| r1640      |    0.876545    |    0.768933    |    0.987654    |  
-+------------+----------------+----------------+----------------+
-| r1641      |    0.456788    |    0.876543    |    0.876542    |    
-+------------+----------------+----------------+----------------+    
-| r1646      |    0.876543    |    0.786543    |    0.897654    |   
-+------------+----------------+----------------+----------------+
-
-Group-file
-
-+---------------+-----------+
-| Patient ID    |   Class   | 
-+===============+===========+
-| Patient1      |    0      | 
-+---------------+-----------+  
-| Patient2      |    1      |    
-+---------------+-----------+   
-| Patient3      |    1      |
-+---------------+-----------+
-
-
-**TIP**: If your dataset is not split into classes, use `MaREA cluster analysis`_.
-
-.. class:: infomark
-
-]]>
-	</help>
-	<expand macro="citations" />
-</tool>
\ No newline at end of file
--- a/marea_2/ras_to_bounds.py	Thu Aug 29 20:46:04 2024 +0000
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,231 +0,0 @@
-import argparse
-import utils.general_utils as utils
-from typing import Optional, List
-import os
-import numpy as np
-import pandas as pd
-import cobra
-import sys
-import csv
-from joblib import Parallel, delayed, cpu_count
-
-################################# process args ###############################
-def process_args(args :List[str]) -> argparse.Namespace:
-    """
-    Processes command-line arguments.
-
-    Args:
-        args (list): List of command-line arguments.
-
-    Returns:
-        Namespace: An object containing parsed arguments.
-    """
-    parser = argparse.ArgumentParser(usage = '%(prog)s [options]',
-                                     description = 'Generates reaction bounds for a metabolic model, optionally integrating RAS values')
-    
-    parser.add_argument(
-        '-ms', '--model_selector', 
-        type = utils.Model, default = utils.Model.ENGRO2, choices = [utils.Model.ENGRO2, utils.Model.Custom],
-        help = 'choose which type of model you want to use')
-    
-    parser.add_argument("-mo", "--model", type = str,
-        help = "path to input file with custom rules, if provided")
-    
-    parser.add_argument("-mn", "--model_name", type = str, help = "custom mode name")
-
-    parser.add_argument(
-        '-mes', '--medium_selector', 
-        default = "allOpen",
-        help = 'choose which type of medium you want to use')
-    
-    parser.add_argument("-meo", "--medium", type = str,
-        help = "path to input file with custom medium, if provided")
-
-    parser.add_argument('-ol', '--out_log', 
-                        help = "Output log")
-    
-    parser.add_argument('-td', '--tool_dir',
-                        type = str,
-                        required = True,
-                        help = 'your tool directory')
-    
-    parser.add_argument('-ir', '--input_ras',
-                        type=str,
-                        required = False,
-                        help = 'input ras')
-    
-    parser.add_argument('-rs', '--ras_selector',
-                        required = True,
-                        type=utils.Bool("using_RAS"),
-                        help = 'ras selector')
-    
-    ARGS = parser.parse_args()
-    return ARGS
-
-########################### warning ###########################################
-def warning(s :str) -> None:
-    """
-    Log a warning message to an output log file and print it to the console.
-
-    Args:
-        s (str): The warning message to be logged and printed.
-    
-    Returns:
-      None
-    """
-    with open(ARGS.out_log, 'a') as log:
-        log.write(s + "\n\n")
-    print(s)
-
-############################ dataset input ####################################
-def read_dataset(data :str, name :str) -> pd.DataFrame:
-    """
-    Read a dataset from a CSV file and return it as a pandas DataFrame.
-
-    Args:
-        data (str): Path to the CSV file containing the dataset.
-        name (str): Name of the dataset, used in error messages.
-
-    Returns:
-        pandas.DataFrame: DataFrame containing the dataset.
-
-    Raises:
-        pd.errors.EmptyDataError: If the CSV file is empty.
-        sys.exit: If the CSV file has the wrong format, the execution is aborted.
-    """
-    try:
-        dataset = pd.read_csv(data, sep = '\t', header = 0, engine='python')
-    except pd.errors.EmptyDataError:
-        sys.exit('Execution aborted: wrong format of ' + name + '\n')
-    if len(dataset.columns) < 2:
-        sys.exit('Execution aborted: wrong format of ' + name + '\n')
-    return dataset
-
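-# For reference, read_dataset above expects a tab-separated file with a header row and at
-# least two columns; anything narrower aborts execution.
-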
-
-def apply_ras_bounds(model, ras_row, rxns_ids):
-    """
-    Adjust the bounds of reactions in the model based on RAS values.
-
-    Args:
-        model (cobra.Model): The metabolic model to be modified.
-        ras_row (pd.Series): A row from a RAS DataFrame containing scaling factors for reaction bounds.
-        rxns_ids (list of str): List of reaction IDs to which the scaling factors will be applied.
-    
-    Returns:
-        None
-    """
-    for reaction in rxns_ids:
-        if reaction in ras_row.index and pd.notna(ras_row[reaction]):
-            rxn = model.reactions.get_by_id(reaction)
-            scaling_factor = ras_row[reaction]
-            rxn.lower_bound *= scaling_factor
-            rxn.upper_bound *= scaling_factor
-
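-# Illustrative effect (a sketch, not executed): with ras_row = pd.Series({"R1": 0.5}) and
-# reaction "R1" having bounds (-10, 10), apply_ras_bounds rescales them to (-5.0, 5.0).
-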
-def process_ras_cell(cellName, ras_row, model, rxns_ids, output_folder):
-    """
-    Process a single RAS cell, apply bounds, and save the bounds to a CSV file.
-
-    Args:
-        cellName (str): The name of the RAS cell (used for naming the output file).
-        ras_row (pd.Series): A row from a RAS DataFrame containing scaling factors for reaction bounds.
-        model (cobra.Model): The metabolic model to be modified.
-        rxns_ids (list of str): List of reaction IDs to which the scaling factors will be applied.
-        output_folder (str): Folder path where the output CSV file will be saved.
-    
-    Returns:
-        None
-    """
-    model_new = model.copy()
-    apply_ras_bounds(model_new, ras_row, rxns_ids)
-    bounds = pd.DataFrame([(rxn.lower_bound, rxn.upper_bound) for rxn in model_new.reactions], index=rxns_ids, columns=["lower_bound", "upper_bound"])
-    bounds.to_csv(output_folder + cellName + ".csv", sep='\t', index=True)
-
-def generate_bounds(model: cobra.Model, medium: dict, ras=None, output_folder='output/') -> pd.DataFrame:
-    """
-    Generate reaction bounds for a metabolic model based on medium conditions and optional RAS adjustments.
-    
-    Args:
-        model (cobra.Model): The metabolic model for which bounds will be generated.
-        medium (dict): A dictionary where keys are reaction IDs and values are the medium conditions.
-        ras (pd.DataFrame, optional): A DataFrame with RAS scaling factors for different cell types. Defaults to None.
-        output_folder (str, optional): Folder path where output CSV files will be saved. Defaults to 'output/'.
-
-    Returns:
-        None: bounds are written as CSV files into output_folder (one per cell when RAS is provided).
-    """
-    rxns_ids = [rxn.id for rxn in model.reactions]
-    
-    # Set medium conditions
-    for reaction, value in medium.items():
-        if value is not None:
-            model.reactions.get_by_id(reaction).lower_bound = -float(value)
-    
-    # Perform Flux Variability Analysis (FVA)
-    df_FVA = cobra.flux_analysis.flux_variability_analysis(model, fraction_of_optimum=0, processes=1).round(8)
-    
-    # Set FVA bounds
-    for reaction in rxns_ids:
-        rxn = model.reactions.get_by_id(reaction)
-        rxn.lower_bound = float(df_FVA.loc[reaction, "minimum"])
-        rxn.upper_bound = float(df_FVA.loc[reaction, "maximum"])
-
-    if ras is not None:
-        Parallel(n_jobs=cpu_count())(delayed(process_ras_cell)(cellName, ras_row, model, rxns_ids, output_folder) for cellName, ras_row in ras.iterrows())
-    else:
-        model_new = model.copy()
-        apply_ras_bounds(model_new, pd.Series([1]*len(rxns_ids), index=rxns_ids), rxns_ids)
-        bounds = pd.DataFrame([(rxn.lower_bound, rxn.upper_bound) for rxn in model_new.reactions], index=rxns_ids, columns=["lower_bound", "upper_bound"])
-        bounds.to_csv(output_folder + "bounds.csv", sep='\t', index=True)
-
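-# Minimal usage sketch (assumes a loaded COBRA model; the exchange-reaction ID is illustrative):
-#   model = cobra.io.read_sbml_model("model.xml")
-#   generate_bounds(model, medium = {"EX_glc__D_e": 10.0}, output_folder = "out/")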
-
-############################# main ###########################################
-def main() -> None:
-    """
-    Initializes everything and sets the program in motion based on the front-end input arguments.
-
-    Returns:
-        None
-    """
-    if not os.path.exists('ras_to_bounds'):
-        os.makedirs('ras_to_bounds')
-
-
-    global ARGS
-    ARGS = process_args(sys.argv)
-
-    ARGS.output_folder = 'ras_to_bounds/'
-
-    if ARGS.ras_selector:
-        ras = read_dataset(ARGS.input_ras, "ras dataset")
-        ras.replace("None", None, inplace=True)
-        ras.set_index("Reactions", drop=True, inplace=True)
-        ras = ras.T
-        ras = ras.astype(float)
-    
-    model_type :utils.Model = ARGS.model_selector
-    if model_type is utils.Model.Custom:
-        model = model_type.getCOBRAmodel(customPath = utils.FilePath.fromStrPath(ARGS.model), customExtension = utils.FilePath.fromStrPath(ARGS.model_name).ext)
-    else:
-        model = model_type.getCOBRAmodel(toolDir=ARGS.tool_dir)
-
-    if(ARGS.medium_selector == "Custom"):
-        medium = read_dataset(ARGS.medium, "medium dataset")
-        medium.set_index(medium.columns[0], inplace=True)
-        medium = medium.astype(float)
-        medium = medium[medium.columns[0]].to_dict()
-    else:
-        df_mediums = pd.read_csv(ARGS.tool_dir + "/local/medium/medium.csv", index_col = 0)
-        ARGS.medium_selector = ARGS.medium_selector.replace("_", " ")
-        medium = df_mediums[[ARGS.medium_selector]]
-        medium = medium[ARGS.medium_selector].to_dict()
-
-    if ARGS.ras_selector:
-        generate_bounds(model, medium, ras = ras, output_folder=ARGS.output_folder)
-    else:
-        generate_bounds(model, medium, output_folder=ARGS.output_folder)
-
-##############################################################################
-if __name__ == "__main__":
-    main()
\ No newline at end of file
--- a/marea_2/ras_to_bounds.xml	Thu Aug 29 20:46:04 2024 +0000
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,107 +0,0 @@
-<tool id="RAStoBounds" name="RAS2Bounds" version="2.0.0">
-    
-    <macros>
-        <import>marea_macros.xml</import>
-    </macros>
-
-	<requirements>
-        <requirement type="package" version="1.24.4">numpy</requirement>
-        <requirement type="package" version="2.0.3">pandas</requirement>
-		<requirement type="package" version="0.29.0">cobra</requirement>
-        <requirement type="package" version="5.2.2">lxml</requirement>
-        <requirement type="package" version="1.4.2">joblib</requirement>
-	</requirements>
-
-    <command detect_errors="exit_code">
-        <![CDATA[
-      	python $__tool_directory__/ras_to_bounds.py
-        --tool_dir $__tool_directory__
-        --model_selector $cond_model.model_selector
-        #if $cond_model.model_selector == 'Custom'
-            --model $model
-            --model_name $model.element_identifier
-        #end if
-        --medium_selector $cond_medium.medium_selector
-        #if $cond_medium.medium_selector == 'Custom'
-            --medium $medium
-        #end if
-        --ras_selector $cond_ras.ras_choice
-        #if $cond_ras.ras_choice == "True"
-        	--input_ras $cond_ras.input_ras
-        #end if
-        --out_log $log
-        ]]>
-    </command>
-    <inputs>
-        <conditional name="cond_model">
-            <expand macro="options_ras_to_bounds_model"/>
-            <when value="Custom">
-                <param name="model" argument="--model" type="data" format="json, xml" label="Custom model" />
-            </when>
-        </conditional> 
-
-        <conditional name="cond_ras">
-			<param name="ras_choice" argument="--ras_choice" type="select" label="Do want to use RAS?">
-                	<option value="True" selected="true">Yes</option>
-                	<option value="False">No</option>
-        	</param>
-            <when value="True">
-                <param name="input_ras" argument="--input_ras" multiple="false" type="data" format="tabular, csv, tsv" label="RAS matrix:" />
-            </when>
-        </conditional>  
-        
-        <conditional name="cond_medium">
-            <expand macro="options_ras_to_bounds_medium"/>
-            <when value="Custom">
-                <param name="medium" argument="--medium" type="data" format="tabular, csv, tsv" label="Custom medium" />
-            </when>
-        </conditional> 
-
-    </inputs>
-
-    <outputs>
-        <data format="txt" name="log" label="RAStoBounds- Log" />
-        
-        <collection name="ras_to_bounds" type="list" label="Ras to Bounds">
-            <discover_datasets name = "collection" pattern="__name_and_ext__" directory="ras_to_bounds"/>
-        </collection>
-
-    </outputs>
-
-    <help>
-
-    <![CDATA[
-
-What it does
--------------
-This tool generates reaction bounds for a given metabolic model (in JSON or XML format), with or without the integration of Reaction Activity Scores (RAS) into the model.
-Additionally, it allows the use of custom or pre-defined growth media to constrain exchange reactions (see the format example below).
-If a RAS matrix, generated by the **Expression2RAS** tool, is used, then a separate bounds file is generated for each cell. Otherwise, a single bounds file is returned.
-
-**Accepted Files:**
-   - **Model:** A JSON or XML file containing the reactions and rules defined in the model.
-   - **RAS Matrix:** A tab-separated file with RAS data as generated by the **Expression2RAS** tool.
-   - **Medium:** A tab-separated file specifying the lower and upper bounds of medium reactions.
-
-Example medium file
--------------------
-
-+------------+----------------+----------------+
-| Reaction ID|   lower_bound  |   upper_bound  |
-+============+================+================+
-| r1         |    0.123167    |    0.371355    |
-+------------+----------------+----------------+
-| r2         |    0.268765    |    0.765567    |
-+------------+----------------+----------------+
-
-
-Output:
--------------
-
-The tool generates:
-    - bounds: reporting the bounds of the model, or cells if RAS is used. Format: tab-separated.
-    - a log file (.txt).
-    ]]>
-    </help>
-    <expand macro="citations" />
-</tool>
\ No newline at end of file