Galaxy | Tool Preview

Split Dataset (version 1.0.11.0)
This tool splits only one array per run. If X and y are in separate files, run the tool twice, swapping the input dataset while keeping all other parameters the same.
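
The two-run approach works because, for a fixed random seed and row count, the shuffle in scikit-learn depends only on the number of rows and not on the contents. The following is a minimal sketch of that behaviour using `train_test_split` directly; the arrays, `test_size`, and `random_state` values are illustrative assumptions, and the tool's own internals may differ.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data standing in for two separate files with matching row order.
X = np.arange(20).reshape(10, 2)   # row i holds the values [2*i, 2*i + 1]
y = np.arange(10)                  # label i belongs to row i of X

# Two independent calls, mirroring two tool runs: same parameters, different input.
X_train, X_test = train_test_split(X, test_size=0.3, random_state=42, shuffle=True)
y_train, y_test = train_test_split(y, test_size=0.3, random_state=42, shuffle=True)

# Recover the original row index of each X row and compare it with y.
rows_in_X_train = X_train[:, 0] // 2
rows_in_X_test = X_test[:, 0] // 2
aligned = np.array_equal(rows_in_X_train, y_train) and np.array_equal(rows_in_X_test, y_test)
print("X and y subsets aligned:", aligned)
```

If the random seed or any other parameter differs between the two runs, the rows of X and y will generally no longer correspond.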

What it does: This tool uses the splitter functions and classes from the sklearn.model_selection module to split the contents (rows) of a table into two subsets, one for training and one for testing. The simple train test split mode supports the shuffle split and stratified shuffle split provided natively by the train_test_split function, and extends them with group shuffle split. The cross-validation splitter mode supports more diverse splitting strategies. Each tool run outputs a single split (one training set and one test set). To obtain multiple splits, for example for nested CV, run the tool multiple times with different values of nth_split.
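
The splitting strategies described above can be sketched with scikit-learn as follows. This is an illustration under stated assumptions, not the tool's actual code: the sample table, the helper `nth_split_of`, and the choice of a 1-based `n` are all hypothetical.

```python
from itertools import islice

import numpy as np
from sklearn.model_selection import GroupShuffleSplit, KFold, train_test_split

# Hypothetical table: 12 rows, a binary label per row, and a group id per row.
rows = np.arange(12)
labels = np.array([0, 1] * 6)
groups = np.repeat(np.arange(4), 3)   # 4 groups of 3 rows each

# Stratified shuffle split: class proportions are preserved in both subsets.
train_rows, test_rows = train_test_split(
    rows, test_size=0.5, stratify=labels, random_state=0
)

# Group shuffle split: all rows of a group land on the same side of the split.
gss = GroupShuffleSplit(n_splits=1, test_size=0.5, random_state=0)
g_train, g_test = next(gss.split(rows, groups=groups))


def nth_split_of(splitter, X, n, y=None, groups=None):
    """Return the n-th (train, test) index pair; n is 1-based (an assumption)."""
    return next(islice(splitter.split(X, y, groups), n - 1, None))


# Cross-validation splitter mode: each "run" picks one fold of the same splitter.
kfold = KFold(n_splits=3, shuffle=True, random_state=0)
cv_train, cv_test = nth_split_of(kfold, rows, n=2)

print("stratified test labels:", labels[test_rows])
print("groups in train / test:", set(groups[g_train]), set(groups[g_test]))
print("fold 2 test rows:", cv_test)
```

Repeating the last call with `n=1`, `n=2`, and `n=3` on the same seeded splitter yields the three folds one at a time, which corresponds to running the tool once per value of nth_split.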

Input: a tabular dataset.

Output: two tabular datasets containing training and test subsets, respectively.