Split a feature table into training and testing sets.
Split a feature table into training and testing sets. By default stratifies
training and test sets on a metadata column, such that values in that
column are evenly represented across training and test sets.
Parameters
- table : FeatureTable[Frequency]
- Feature table containing all features that should be used for target
prediction.
- metadata : MetadataColumn[Numeric | Categorical]
- Numeric metadata column to use as prediction target.
- test_size : Float % Range(0.0, 1.0, inclusive_start=False), optional
- Fraction of input samples to exclude from training set and use for
classifier testing.
- random_state : Int, optional
- Seed used by random number generator.
- stratify : Bool, optional
- Evenly stratify training and test data among metadata categories. If
True, all values in column must match at least two samples.
- missing_samples : Str % Choices('error', 'ignore'), optional
- How to handle missing samples in metadata. "error" will fail if missing
samples are detected. "ignore" will cause the feature table and
metadata to be filtered, so that only samples found in both files are
retained.
Returns
- training_table : FeatureTable[Frequency]
- Feature table containing training samples
- test_table : FeatureTable[Frequency]
- Feature table containing test samples