What it does
Given a pre-built Keras deep learning model and a labeled training dataset, this tool works in one of two modes.
- Train and Validate: the training dataset is split into train and validation portions. The model is fitted on the train portion, and its performance is checked on the validation portion repeatedly as training progresses. The outputs are a fitted model (skeleton + weights) and its validation performance scores.
- Train, Validate and Evaluate: the training dataset is split into three portions: train, val and test. The train and val portions are used exactly as in Train and Validate. The test portion is held out exclusively for testing (evaluation). The outputs are a fitted model (skeleton + weights) and its test performance scores.
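The difference between the two modes comes down to how the sample indices are partitioned. The following is a minimal sketch of that idea only. The function name `split_indices`, its parameters, and the plain random shuffle are assumptions made for illustration; the tool's own splitting options are not described here and may differ.

```python
import random


def split_indices(n_samples, val_size=0.2, test_size=0.0, seed=0):
    """Shuffle sample indices and partition them into train, val and test.

    Hypothetical helper for illustration. test_size=0.0 mimics the
    "Train and Validate" mode; test_size > 0 mimics
    "Train, Validate and Evaluate".
    """
    if not 0 < val_size < 1 or not 0 <= test_size < 1:
        raise ValueError("val_size must be in (0, 1) and test_size in [0, 1)")
    if val_size + test_size >= 1:
        raise ValueError("val_size + test_size must leave room for training")

    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility

    n_test = int(round(n_samples * test_size))
    n_val = int(round(n_samples * val_size))

    test = indices[:n_test]               # held out, never seen in training
    val = indices[n_test:n_test + n_val]  # scored repeatedly during training
    train = indices[n_test + n_val:]      # used to fit the model
    return train, val, test


# Train and Validate: two portions, the test portion is empty.
train, val, test = split_indices(100, val_size=0.2)

# Train, Validate and Evaluate: three disjoint portions.
train3, val3, test3 = split_indices(100, val_size=0.2, test_size=0.2)
```

In the three-way case the test indices never overlap with the train or val indices, which is what makes the final evaluation scores independent of the training process.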
In both modes, the true labels and predicted values can be output in addition to the performance scores. These can be used to generate plots in other tools, for example the machine learning visualization extensions.
Because all training and model parameters can be viewed and changed in the Hyperparameter Swapping section, the training and evaluation processes are transparent and fully controllable.
Input
- tabular
- sparse
- sequences in a fasta file, for working with DNA, RNA and proteins through the corresponding fasta data generator
- reference genome and intervals, which work exclusively with GenomicIntervalBatchGenerator
Output
- performance scores from evaluation
- fitted estimator skeleton and weights
- true labels or values and predicted values from the evaluation
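As an illustration of how the true and predicted values might be used downstream, the sketch below derives confusion counts and accuracy for a binary task. It assumes the values have already been loaded into two parallel Python lists, with predictions given as probabilities. The layout of the tool's actual output file is not specified here, and the function name `binary_confusion_counts` and the sample numbers are made up for the example.

```python
def binary_confusion_counts(y_true, y_pred_prob, threshold=0.5):
    """Count tp, fp, tn, fn for binary labels and predicted probabilities.

    Hypothetical helper for illustration; a probability at or above
    `threshold` is treated as a positive prediction.
    """
    if len(y_true) != len(y_pred_prob):
        raise ValueError("y_true and y_pred_prob must have the same length")

    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, prob in zip(y_true, y_pred_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and truth == 1:
            counts["tp"] += 1
        elif predicted == 1 and truth == 0:
            counts["fp"] += 1
        elif predicted == 0 and truth == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


# Made-up example values, not output from the tool.
y_true = [1, 0, 1, 1, 0]
y_pred_prob = [0.9, 0.4, 0.3, 0.8, 0.6]

counts = binary_confusion_counts(y_true, y_pred_prob)
accuracy = (counts["tp"] + counts["tn"]) / len(y_true)
```

The same two lists are also the inputs that curve-based plots such as ROC or precision-recall need, which is why they are offered as a separate output.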