0.9
python
scikit-learn
pandas
xgboost
asteval
Train a model
Load a model and predict
Predict class labels
Include advanced options
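For orientation, a minimal Python sketch of how these tasks map onto scikit-learn calls; the estimator choice, file name, and toy data are illustrative assumptions, not the tool's actual wrapper code:

import pickle

from sklearn.ensemble import RandomForestClassifier

# Train a model on toy data (hypothetical).
X = [[0, 0], [1, 1], [0, 1], [1, 0]]
y = [0, 1, 1, 0]
model = RandomForestClassifier(n_estimators=10, random_state=0)
model.fit(X, y)

# Persist the fitted estimator.
with open('model.pkl', 'wb') as fh:
    pickle.dump(model, fh)

# Load a model and predict class labels.
with open('model.pkl', 'rb') as fh:
    loaded = pickle.load(fh)
print(loaded.predict([[0, 0], [1, 1]]))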
squared loss
huber
epsilon insensitive
squared epsilon insensitive
l2
l1
elastic net
none
optimal
constant
inverse scaling
auto
svd
cholesky
lsqr
sparse_cg
sag
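These labels correspond to scikit-learn parameter values whose spellings differ slightly: 'squared loss' is loss='squared_loss' in older releases, 'elastic net' is penalty='elasticnet', and 'inverse scaling' is learning_rate='invscaling'. A hedged sketch with illustrative values:

from sklearn.linear_model import Ridge, SGDRegressor

# SGD with 'huber' loss, an 'elastic net' penalty, and
# an 'inverse scaling' learning-rate schedule.
sgd = SGDRegressor(loss='huber', penalty='elasticnet',
                   learning_rate='invscaling', eta0=0.01)

# Ridge regression with one of the solvers listed above.
ridge = Ridge(alpha=1.0, solver='lsqr')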
Gini impurity
Information gain
mse - mean squared error
mae - mean absolute error
auto - max_features=n_features
sqrt - max_features=sqrt(n_features)
log2 - max_features=log2(n_features)
Type in a number directly (or None)
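The criteria and max_features options above are decision-tree parameters; a small sketch, assuming the spellings used by the scikit-learn release this tool targets:

from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# 'Gini impurity' -> criterion='gini'; 'Information gain' -> criterion='entropy'.
clf = DecisionTreeClassifier(criterion='entropy', max_features='sqrt')

# 'mse' / 'mae' are regression criteria; max_features may also be an
# integer typed in directly, or None (use all features).
reg = DecisionTreeRegressor(criterion='mse', max_features=None)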
auto
true
false
k-means++
random
Calculate metrics globally by counting the total true positives, false negatives and false positives. (micro)
Calculate metrics for each instance, and find their average. Only meaningful for multilabel. (samples)
Calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account. (macro)
Calculate metrics for each label, and find their average, weighted by support (the number of true instances for each label). This alters 'macro' to account for label imbalance; it can result in an F-score that is not between precision and recall. (weighted)
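These descriptions match the average parameter of scikit-learn's multiclass/multilabel metrics; a quick illustration with f1_score on made-up labels:

from sklearn.metrics import f1_score

y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 2, 2, 2, 1, 1]

print(f1_score(y_true, y_pred, average='micro'))     # global TP/FN/FP counts
print(f1_score(y_true, y_pred, average='macro'))     # unweighted mean over labels
print(f1_score(y_true, y_pred, average='weighted'))  # mean weighted by support
print(f1_score(y_true, y_pred, average=None))        # per-label scores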
None
Select columns by column index number(s)
Select columns by column header name(s)
All columns except those specified by column index number(s)
All columns except those specified by column header name(s)
All columns
Tabular
Sparse
Tabular
Sparse
tabular data
sparse matrix
Uniform weights. All points in each neighborhood are weighted equally. (Uniform)
Weight points by the inverse of their distance. (Distance)
Auto
BallTree
KDTree
Brute-force
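A short sketch of how the weighting and neighbor-search options combine in a k-nearest-neighbors estimator (values illustrative):

from sklearn.neighbors import KNeighborsClassifier

# 'Distance' weighting; algorithm may be 'auto', 'ball_tree',
# 'kd_tree', or 'brute'.
knn = KNeighborsClassifier(n_neighbors=5, weights='distance', algorithm='auto')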
rbf
linear
poly
sigmoid
precomputed
arpack
lobpcg
amg
RBF
precomputed
Nearest neighbors
kmeans
discretize
auto
ball_tree
kd_tree
brute
KMeans
Spectral Clustering
Mini Batch KMeans
DBSCAN
Birch
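A minimal sketch of two of the clustering algorithms above, wired up with the initialization, affinity, eigen-solver, and label-assignment options listed earlier (toy data, illustrative parameters):

from sklearn.cluster import KMeans, SpectralClustering

X = [[1.0, 2.0], [1.5, 1.8], [8.0, 8.0], [8.2, 7.9]]

# 'k-means++' (vs. 'random') controls centroid initialization.
km = KMeans(n_clusters=2, init='k-means++', random_state=0)
print(km.fit_predict(X))

# Spectral clustering assembled from the affinity / eigen_solver /
# assign_labels options.
sc = SpectralClustering(n_clusters=2, affinity='rbf', eigen_solver='arpack',
                        assign_labels='kmeans', random_state=0)
print(sc.fit_predict(X))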
euclidean
cityblock
cosine
l1
l2
manhattan
braycurtis
canberra
chebyshev
correlation
dice
hamming
jaccard
kulsinski
mahalanobis
matching
minkowski
rogerstanimoto
russellrao
seuclidean
sokalmichener
sokalsneath
sqeuclidean
yule
rbf
sigmoid
polynomial
linear
chi2
additive_chi2
Euclidean distance matrix
Distance matrix
Minimum distances between one point and a set of points
Additive chi-squared kernel
Exponential chi-squared kernel
Linear kernel
L1 distances
Kernel
Polynomial kernel
Gaussian (rbf) kernel
Laplacian kernel
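The metric and kernel names above are accepted by sklearn.metrics.pairwise_distances (several are delegated to SciPy) and sklearn.metrics.pairwise.pairwise_kernels; a short sketch on non-negative toy data (required by the chi-squared kernels):

import numpy as np
from sklearn.metrics import pairwise_distances
from sklearn.metrics.pairwise import pairwise_kernels, rbf_kernel

X = np.array([[0.0, 1.0], [1.0, 1.0], [2.0, 0.0]])

# Distance matrix under one of the metrics listed above.
D = pairwise_distances(X, metric='cosine')

# Kernel matrix; 'rbf', 'sigmoid', 'polynomial', 'linear', 'chi2',
# and 'additive_chi2' are all valid kernel names.
K = pairwise_kernels(X, metric='rbf', gamma=0.5)
assert np.allclose(K, rbf_kernel(X, gamma=0.5))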
Standard Scaler (Standardizes features by removing the mean and scaling to unit variance)
Binarizer (Binarizes data)
Imputer (Completes missing values)
Max Abs Scaler (Scales features by their maximum absolute value)
Normalizer (Normalizes samples individually to unit norm)
Kernel Centerer (Centers a kernel matrix)
Minmax Scaler (Scales features to a range)
Polynomial Features (Generates polynomial and interaction features)
Robust Scaler (Scales features using statistics that are robust to outliers)
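Each pre-processor above follows the same fit/transform API; two representative examples on toy data:

import numpy as np
from sklearn.preprocessing import MaxAbsScaler, StandardScaler

X = np.array([[1.0, -2.0], [3.0, 0.0], [5.0, 2.0]])

# Standardize: zero mean, unit variance per feature.
print(StandardScaler().fit_transform(X))

# Scale each feature by its maximum absolute value (preserves sparsity).
print(MaxAbsScaler().fit_transform(X))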
Replace missing values using the mean along the axis
Replace missing values using the median along the axis
Replace missing values using the most frequent value along the axis
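A sketch of the three imputation strategies; newer scikit-learn releases expose them through sklearn.impute.SimpleImputer, while the legacy sklearn.preprocessing.Imputer named above additionally took an axis argument:

import numpy as np
from sklearn.impute import SimpleImputer

X = np.array([[1.0, 2.0], [np.nan, 3.0], [7.0, np.nan]])

# strategy is 'mean', 'median', or 'most_frequent', matching the options above.
imp = SimpleImputer(strategy='mean')
print(imp.fit_transform(X))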
Yes
No. Load a prefitted estimator
Yes
SelectKBest - Select features according to the k highest scores
SelectFromModel - Meta-transformer for selecting features based on importance weights
GenericUnivariateSelect - Univariate feature selector with configurable strategy
SelectPercentile - Select features according to a percentile of the highest scores
SelectFpr - Filter: Select the p-values below alpha based on a FPR test
SelectFdr - Filter: Select the p-values for an estimated false discovery rate
SelectFwe - Filter: Select the p-values corresponding to Family-wise error rate
RFE - Feature ranking with recursive feature elimination
RFECV - Feature ranking with recursive feature elimination and cross-validated selection of the best number of features
VarianceThreshold - Feature selector that removes all low-variance features
percentile
k_best
fpr
fdr
fwe
chi2 - Compute chi-squared stats between each non-negative feature and class
f_classif - Compute the ANOVA F-value for the provided sample
f_regression - Univariate linear regression tests
mutual_info_classif - Estimate mutual information for a discrete target variable
mutual_info_regression - Estimate mutual information for a continuous target variable
fit_transform - Fit to data, then transform it
get_support - Get a mask, or integer index, of the features selected
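Putting a selector, a score function, and the two methods above together; a minimal sketch on the bundled iris data:

from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, chi2

X, y = load_iris(return_X_y=True)

# Keep the k features with the highest chi-squared scores.
selector = SelectKBest(score_func=chi2, k=2)
X_new = selector.fit_transform(X, y)

# Boolean mask of the selected features (indices=True gives integer indices).
print(selector.get_support())
print(X_new.shape)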
Default (use the estimator's own score method)
Classification -- 'accuracy'
Classification -- 'balanced_accuracy'
Classification -- 'average_precision'
Classification -- 'f1'
Classification -- 'f1_micro'
Classification -- 'f1_macro'
Classification -- 'f1_weighted'
Classification -- 'f1_samples'
Classification -- 'neg_log_loss'
Classification -- 'precision'
Classification -- 'precision_micro'
Classification -- 'precision_macro'
Classification -- 'precision_weighted'
Classification -- 'precision_samples'
Classification -- 'recall'
Classification -- 'recall_micro'
Classification -- 'recall_macro'
Classification -- 'recall_weighted'
Classification -- 'recall_samples'
Classification -- 'roc_auc'
Regression -- 'explained_variance'
Regression -- 'neg_mean_absolute_error'
Regression -- 'neg_mean_squared_error'
Regression -- 'neg_mean_squared_log_error'
Regression -- 'neg_median_absolute_error'
Regression -- 'r2'
Classification -- 'accuracy'
Classification -- 'balanced_accuracy'
Classification -- 'average_precision'
Classification -- 'f1'
Classification -- 'f1_micro'
Classification -- 'f1_macro'
Classification -- 'f1_weighted'
Classification -- 'f1_samples'
Classification -- 'neg_log_loss'
Classification -- 'precision'
Classification -- 'precision_micro'
Classification -- 'precision_macro'
Classification -- 'precision_weighted'
Classification -- 'precision_samples'
Classification -- 'recall'
Classification -- 'recall_micro'
Classification -- 'recall_macro'
Classification -- 'recall_weighted'
Classification -- 'recall_samples'
Classification -- 'roc_auc'
Regression -- 'explained_variance'
Regression -- 'neg_mean_absolute_error'
Regression -- 'neg_mean_squared_error'
Regression -- 'neg_mean_squared_log_error'
Regression -- 'neg_median_absolute_error'
Regression -- 'r2'
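Any of these strings is passed as the scoring argument to scikit-learn's cross-validation and hyperparameter-search utilities; a brief illustration:

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

scores = cross_val_score(LogisticRegression(max_iter=200), X, y,
                         cv=5, scoring='f1_macro')
print(scores.mean())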
Final estimator
Pre-processing step #1
Pre-processing step #2
Pre-processing step #3
Pre-processing step #4
Pre-processing step #5
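A sketch of how pre-processing steps and a final estimator chain together in a scikit-learn Pipeline (the step names and estimators here are illustrative):

from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Pre-processing steps run in order; the final estimator comes last.
pipe = Pipeline([
    ('scale', StandardScaler()),         # pre-processing step #1
    ('select', SelectKBest(f_classif)),  # pre-processing step #2
    ('clf', SVC()),                      # final estimator
])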
sklearn.svm
sklearn.linear_model
sklearn.ensemble
sklearn.naive_bayes
sklearn.tree
sklearn.neighbors
xgboost
LinearSVC
LinearSVR
NuSVC
NuSVR
OneClassSVM
SVC
SVR
ARDRegression
BayesianRidge
ElasticNet
ElasticNetCV
HuberRegressor
Lars
LarsCV
Lasso
LassoCV
LassoLars
LassoLarsCV
LassoLarsIC
LinearRegression
LogisticRegression
LogisticRegressionCV
MultiTaskLasso
MultiTaskElasticNet
MultiTaskLassoCV
MultiTaskElasticNetCV
OrthogonalMatchingPursuit
OrthogonalMatchingPursuitCV
PassiveAggressiveClassifier
PassiveAggressiveRegressor
Perceptron
RANSACRegressor
Ridge
RidgeClassifier
RidgeClassifierCV
RidgeCV
SGDClassifier
SGDRegressor
TheilSenRegressor
AdaBoostClassifier
AdaBoostRegressor
BaggingClassifier
BaggingRegressor
ExtraTreesClassifier
ExtraTreesRegressor
GradientBoostingClassifier
GradientBoostingRegressor
IsolationForest
RandomForestClassifier
RandomForestRegressor
RandomTreesEmbedding
BernoulliNB
GaussianNB
MultinomialNB
DecisionTreeClassifier
DecisionTreeRegressor
ExtraTreeClassifier
ExtraTreeRegressor
KNeighborsClassifier
KNeighborsRegressor
KernelDensity
LocalOutlierFactor
RadiusNeighborsClassifier
RadiusNeighborsRegressor
NearestCentroid
NearestNeighbors
XGBRegressor
XGBClassifier
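The module and class lists above pair a package path with an estimator name; a hypothetical helper (build_estimator is not part of the tool) showing how such a pair can be resolved dynamically:

import importlib

def build_estimator(module_name, class_name, **params):
    # Resolve an estimator class by module path and class name.
    module = importlib.import_module(module_name)
    return getattr(module, class_name)(**params)

clf = build_estimator('sklearn.ensemble', 'RandomForestClassifier',
                      n_estimators=100)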
Nystroem
RBFSampler
AdditiveChi2Sampler
SkewedChi2Sampler
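The kernel-approximation transformers above map data into an explicit feature space that approximates a kernel; a small sketch of two of them (toy data, illustrative parameters):

from sklearn.kernel_approximation import Nystroem, RBFSampler

X = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.0, 0.0]]

# Low-rank (Nystroem) approximation of an RBF kernel feature map.
X_ny = Nystroem(kernel='rbf', gamma=0.5, n_components=3,
                random_state=0).fit_transform(X)

# Random Fourier features approximating the same kernel.
X_rff = RBFSampler(gamma=0.5, n_components=3, random_state=0).fit_transform(X)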
DictionaryLearning
FactorAnalysis
FastICA
IncrementalPCA
KernelPCA
LatentDirichletAllocation
MiniBatchDictionaryLearning
MiniBatchSparsePCA
NMF
PCA
SparsePCA
TruncatedSVD
FeatureAgglomeration
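These decomposition transformers share the fit_transform API; two representative examples on the bundled iris data:

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA, TruncatedSVD

X, _ = load_iris(return_X_y=True)

# Project onto the first two principal components.
X_pca = PCA(n_components=2).fit_transform(X)

# TruncatedSVD also works on sparse matrices (no centering step).
X_svd = TruncatedSVD(n_components=2).fit_transform(X)
print(X_pca.shape, X_svd.shape)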
ReliefF
SURF
SURFstar
MultiSURF
MultiSURFstar
TuRF
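These Relief-based selectors come from the skrebate package and follow scikit-learn's fit conventions; a sketch on random toy data (shapes and parameter values are illustrative assumptions):

import numpy as np
from skrebate import ReliefF

rng = np.random.RandomState(0)
X = rng.rand(50, 5)
y = rng.randint(0, 2, size=50)

fs = ReliefF(n_features_to_select=2, n_neighbors=10)
fs.fit(X, y)
print(fs.top_features_[:2])  # indices of the highest-scoring features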
selected_tasks['selected_task'] == 'load'
selected_tasks['selected_task'] == 'train'
10.5281/zenodo.15094
@article{scikit-learn,
title={Scikit-learn: Machine Learning in {P}ython},
author={Pedregosa, F. and Varoquaux, G. and Gramfort, A. and Michel, V.
and Thirion, B. and Grisel, O. and Blondel, M. and Prettenhofer, P.
and Weiss, R. and Dubourg, V. and Vanderplas, J. and Passos, A. and
Cournapeau, D. and Brucher, M. and Perrot, M. and Duchesnay, E.},
journal={Journal of Machine Learning Research},
volume={12},
pages={2825--2830},
year={2011}
}
@Misc{scipy,
author = {Eric Jones and Travis Oliphant and Pearu Peterson and others},
title = {{SciPy}: Open source scientific tools for {Python}},
year = {2001--},
url = {http://www.scipy.org/},
note = {[Online; accessed 2016-04-09]}
}
@article{DBLP:journals/corr/abs-1711-08477,
author = {Ryan J. Urbanowicz and
Randal S. Olson and
Peter Schmitt and
Melissa Meeker and
Jason H. Moore},
title = {Benchmarking Relief-Based Feature Selection Methods},
journal = {CoRR},
volume = {abs/1711.08477},
year = {2017},
url = {http://arxiv.org/abs/1711.08477},
archivePrefix = {arXiv},
eprint = {1711.08477},
timestamp = {Mon, 13 Aug 2018 16:46:04 +0200},
biburl = {https://dblp.org/rec/bib/journals/corr/abs-1711-08477},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
@inproceedings{Chen:2016:XST:2939672.2939785,
author = {Chen, Tianqi and Guestrin, Carlos},
title = {{XGBoost}: A Scalable Tree Boosting System},
booktitle = {Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
series = {KDD '16},
year = {2016},
isbn = {978-1-4503-4232-2},
location = {San Francisco, California, USA},
pages = {785--794},
numpages = {10},
url = {http://doi.acm.org/10.1145/2939672.2939785},
doi = {10.1145/2939672.2939785},
acmid = {2939785},
publisher = {ACM},
address = {New York, NY, USA},
keywords = {large-scale machine learning},
}