annotate table_pandas_rename_column.py @ 1:d4bacc06365e draft default tip

planemo upload for repository https://github.com/RECETOX/galaxytools/tree/master/tools/tables commit f69717379738dfeb8e0c212685c269fbb0a2035e
author recetox
date Wed, 29 Jan 2025 15:46:51 +0000
parents e6d5fee8c7a6
children
e6d5fee8c7a6 planemo upload for repository https://github.com/RECETOX/galaxytools/tree/master/tools/tables commit d0ff40eb2b536fec6c973c3a9ea8e7f31cd9a0d6
recetox
import argparse
import logging
from typing import Tuple

import pandas as pd
from utils import KeyValuePairsAction, LoadDataAction, StoreOutputAction


def rename_columns(df: pd.DataFrame, rename_dict: dict):
    """
    Rename columns in the dataframe based on the provided dictionary.

    Parameters:
    df (pd.DataFrame): The input dataframe.
    rename_dict (dict): A dictionary with 1-based column index as key and new column name as value.

    Returns:
    pd.DataFrame: The dataframe with renamed columns.
    """
    try:
        rename_map = {
            df.columns[key - 1]: value for key, value in rename_dict.items()
        }  # Convert 1-based index to column name
        return df.rename(columns=rename_map)
    except IndexError as e:
        logging.error(f"Invalid column index: {e}")
        raise
    except Exception as e:
        logging.error(f"Error renaming columns: {e}")
        raise


def main(input_dataset: pd.DataFrame, rename_dict: dict, output_dataset: Tuple[callable, str]):
    """
    Main function to load the dataset, rename columns, and save the result.

    Parameters:
    input_dataset (pd.DataFrame): The input dataset.
    rename_dict (dict): A dictionary with 1-based column index as key and new column name as value.
    output_dataset (tuple): The function to store the output dataset and the path.
    """
    try:
        write_func, file_path = output_dataset
        write_func(rename_columns(input_dataset, rename_dict), file_path)
    except Exception as e:
        logging.error(f"Error in main function: {e}")
        raise


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    parser = argparse.ArgumentParser(description="Rename columns in a dataframe.")
    parser.add_argument(
        "--input_dataset",
        nargs=2,
        action=LoadDataAction,
        required=True,
        help="Path to the input dataset and its file extension (csv, tsv, parquet)",
    )
    parser.add_argument(
        "--rename",
        nargs="+",
        action=KeyValuePairsAction,
        required=True,
        help="List of key=value pairs with 1-based column index as key and new column name as value",
    )
    parser.add_argument(
        "--output_dataset",
        nargs=2,
        action=StoreOutputAction,
        required=True,
        help="Path to the output dataset and its file extension (csv, tsv, parquet)",
    )

    args = parser.parse_args()
    main(args.input_dataset, args.rename, args.output_dataset)
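
The 1-based index convention described in the `rename_columns` docstring can be illustrated with a small in-memory example. This is only a sketch: `rename_by_position` is a hypothetical stand-in that mirrors the mapping step without the logging or the `utils` actions, and the column names and data are made up for illustration.

```python
import pandas as pd


def rename_by_position(df: pd.DataFrame, rename_dict: dict) -> pd.DataFrame:
    """Hypothetical stand-in: map 1-based column positions to new names."""
    # Translate each 1-based position into the column label currently at that position.
    rename_map = {df.columns[pos - 1]: new_name for pos, new_name in rename_dict.items()}
    # DataFrame.rename returns a new frame; the input frame is left unchanged.
    return df.rename(columns=rename_map)


# Made-up example data with three columns.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})

# Rename the first and third columns; the second keeps its name.
renamed = rename_by_position(df, {1: "id", 3: "value"})
print(list(renamed.columns))  # ['id', 'b', 'value']
```

One caveat of this mapping, assuming integer keys: a key of `0` becomes position `-1`, which selects the last column instead of raising `IndexError`, so callers may want to validate that indices are at least 1.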