.. include:: _contributors.rst

.. currentmodule:: sklearn

============
Version 0.19
============

.. _changes_0_19:

Version 0.19.2
==============

**July, 2018**

This release exists exclusively to support Python 3.7.

Related changes
---------------

- ``n_iter_`` may vary from previous releases in
  :class:`linear_model.LogisticRegression` with ``solver='lbfgs'`` and
  :class:`linear_model.HuberRegressor`. For SciPy <= 1.0.0, the optimizer could
  perform more than the requested maximum number of iterations. Now both
  estimators will report at most ``max_iter`` iterations even if more were
  performed. :issue:`10723` by `Joel Nothman`_.

Version 0.19.1
==============

**October 23, 2017**

This is a bug-fix release with some minor documentation improvements and
enhancements to features released in 0.19.0.

Note there may be minor differences in TSNE output in this release (due to
:issue:`9623`), in the case where multiple samples have equal distance to some
sample.

Changelog
---------

API changes
...........

- Reverted the addition of ``metrics.ndcg_score`` and ``metrics.dcg_score``,
  which had been merged into version 0.19.0 by mistake. The implementations
  were broken and undocumented.

- ``return_train_score``, which was added to
  :class:`model_selection.GridSearchCV`,
  :class:`model_selection.RandomizedSearchCV` and
  :func:`model_selection.cross_validate` in version 0.19.0, will change its
  default value from True to False in version 0.21. We found that calculating
  training scores could have a great effect on cross validation runtime in some
  cases. Users should explicitly set ``return_train_score`` to False if
  prediction or scoring functions are slow, resulting in a deleterious effect
  on CV runtime, or to True if they wish to use the calculated scores.
  :issue:`9677` by :user:`Kumar Ashutosh <thechargedneutron>` and `Joel
  Nothman`_.

- ``correlation_models`` and ``regression_models`` from the legacy gaussian
  processes implementation have been belatedly deprecated. :issue:`9717` by
  :user:`Kumar Ashutosh <thechargedneutron>`.

Bug fixes
.........

- Avoid integer overflows in :func:`metrics.matthews_corrcoef`.
  :issue:`9693` by :user:`Sam Steingold <sam-s>`.

- Fixed a bug in the objective function for :class:`manifold.TSNE` (both exact
  and with the Barnes-Hut approximation) when ``n_components >= 3``.
  :issue:`9711` by :user:`goncalo-rodrigues`.

- Fix regression in :func:`model_selection.cross_val_predict` where it
  raised an error with ``method='predict_proba'`` for some probabilistic
  classifiers. :issue:`9641` by :user:`James Bourbeau <jrbourbeau>`.

- Fixed a bug where :func:`datasets.make_classification` modified its input
  ``weights``. :issue:`9865` by :user:`Sachin Kelkar <s4chin>`.

- :class:`model_selection.StratifiedShuffleSplit` now works with multioutput
  multiclass or multilabel data with more than 1000 columns. :issue:`9922` by
  :user:`Charlie Brummitt <crbrummitt>`.

- Fixed a bug with nested and conditional parameter setting, e.g. setting a
  pipeline step and its parameter at the same time. :issue:`9945` by `Andreas
  Müller`_ and `Joel Nothman`_.

Regressions in 0.19.0 fixed in 0.19.1:

- Fixed a bug where parallelised prediction in random forests was not
  thread-safe and could (rarely) result in arbitrary errors. :issue:`9830` by
  `Joel Nothman`_.

- Fix regression in :func:`model_selection.cross_val_predict` where it no
  longer accepted ``X`` as a list. :issue:`9600` by :user:`Rasul Kerimov
  <CoderINusE>`.

- Fixed handling of :func:`model_selection.cross_val_predict` for binary
  classification with ``method='decision_function'``. :issue:`9593` by
  :user:`Reiichiro Nakano <reiinakano>` and core devs.

- Fix regression in :class:`pipeline.Pipeline` where it no longer accepted
  ``steps`` as a tuple. :issue:`9604` by :user:`Joris Van den Bossche
  <jorisvandenbossche>`.

- Fix bug where ``n_iter`` was not properly deprecated, leaving ``n_iter``
  unavailable for interim use in
  :class:`linear_model.SGDClassifier`, :class:`linear_model.SGDRegressor`,
  :class:`linear_model.PassiveAggressiveClassifier`,
  :class:`linear_model.PassiveAggressiveRegressor` and
  :class:`linear_model.Perceptron`. :issue:`9558` by `Andreas Müller`_.

- Dataset fetchers now make sure temporary files are closed before removing
  them; removing files that were still open caused errors on Windows.
  :issue:`9847` by :user:`Joan Massich <massich>`.

- Fixed a regression in :class:`manifold.TSNE` where it no longer supported
  metrics other than 'euclidean' and 'precomputed'. :issue:`9623` by :user:`Oli
  Blum <oliblum90>`.

Enhancements
............

- Our test suite and :func:`utils.estimator_checks.check_estimator` can now be
  run without Nose installed. :issue:`9697` by :user:`Joan Massich <massich>`.

- To improve usability of version 0.19's :class:`pipeline.Pipeline`
  caching, ``memory`` now allows ``joblib.Memory`` instances.
  This makes use of the new :func:`utils.validation.check_memory` helper.
  :issue:`9584` by :user:`Kumar Ashutosh <thechargedneutron>`.

- Some fixes to examples: :issue:`9750`, :issue:`9788`, :issue:`9815`.

- Made a FutureWarning in SGD-based estimators less verbose. :issue:`9802` by
  :user:`Vrishank Bhardwaj <vrishank97>`.

Code and Documentation Contributors
-----------------------------------

With thanks to:

Joel Nothman, Loic Esteve, Andreas Mueller, Kumar Ashutosh,
Vrishank Bhardwaj, Hanmin Qin, Rasul Kerimov, James Bourbeau,
Nagarjuna Kumar, Nathaniel Saul, Olivier Grisel, Roman
Yurchak, Reiichiro Nakano, Sachin Kelkar, Sam Steingold,
Yaroslav Halchenko, diegodlh, felix, goncalo-rodrigues,
jkleint, oliblum90, pasbi, Anthony Gitter, Ben Lawson, Charlie
Brummitt, Didi Bar-Zev, Gael Varoquaux, Joan Massich, Joris
Van den Bossche, nielsenmarkus11

Version 0.19
============

**August 12, 2017**

Highlights
----------

We are excited to release a number of great new features including
:class:`neighbors.LocalOutlierFactor` for anomaly detection,
:class:`preprocessing.QuantileTransformer` for robust feature transformation,
and the :class:`multioutput.ClassifierChain` meta-estimator to simply account
for dependencies between classes in multilabel problems. We have some new
algorithms in existing estimators, such as multiplicative update in
:class:`decomposition.NMF` and multinomial
:class:`linear_model.LogisticRegression` with L1 penalty (use ``solver='saga'``).

Cross validation is now able to return the results from multiple metric
evaluations. The new :func:`model_selection.cross_validate` can return many
scores on the test data as well as training set performance and timings, and we
have extended the ``scoring`` and ``refit`` parameters for grid/randomized
search :ref:`to handle multiple metrics <multimetric_grid_search>`.

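As a rough illustration of the multi-metric cross validation described above,
the sketch below calls :func:`model_selection.cross_validate` with two scorers.
The synthetic dataset, the estimator and the choice of metrics are assumptions
made for this example only.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate

# Synthetic binary classification data (illustrative only).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

results = cross_validate(
    LogisticRegression(),
    X,
    y,
    cv=5,
    scoring=["accuracy", "roc_auc"],  # evaluate several metrics in one pass
    return_train_score=True,          # also report training-set scores
)

# The result is a dict with one array of length ``cv`` per key:
# fit/score timings plus a train and a test score for each metric.
for key in sorted(results):
    print(key, results[key].round(3))
```

Setting ``return_train_score`` explicitly keeps the behaviour independent of
the library default.
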
You can also learn faster. For instance, the :ref:`new option to cache
transformations <pipeline_cache>` in :class:`pipeline.Pipeline` makes grid
search over pipelines including slow transformations much more efficient. And
you can predict faster: if you're sure you know what you're doing, you can turn
off validating that the input is finite using :func:`config_context`.

We've made some important fixes too. We've fixed a longstanding implementation
error in :func:`metrics.average_precision_score`, so please be cautious with
prior results reported from that function. A number of errors in the
:class:`manifold.TSNE` implementation have been fixed, particularly in the
default Barnes-Hut approximation. :class:`semi_supervised.LabelSpreading` and
:class:`semi_supervised.LabelPropagation` have had substantial fixes.
LabelPropagation was previously broken. LabelSpreading should now correctly
respect its alpha parameter.

Changed models
--------------

The following estimators and functions, when fit with the same data and
parameters, may produce different models from the previous version. This often
occurs due to changes in the modelling logic (bug fixes or enhancements), or in
random sampling procedures.

- :class:`cluster.KMeans` with sparse X and initial centroids given (bug fix)
- :class:`cross_decomposition.PLSRegression`
  with ``scale=True`` (bug fix)
- :class:`ensemble.GradientBoostingClassifier` and
  :class:`ensemble.GradientBoostingRegressor` where ``min_impurity_split`` is used (bug fix)
- gradient boosting ``loss='quantile'`` (bug fix)
- :class:`ensemble.IsolationForest` (bug fix)
- :class:`feature_selection.SelectFdr` (bug fix)
- :class:`linear_model.RANSACRegressor` (bug fix)
- :class:`linear_model.LassoLars` (bug fix)
- :class:`linear_model.LassoLarsIC` (bug fix)
- :class:`manifold.TSNE` (bug fix)
- :class:`neighbors.NearestCentroid` (bug fix)
- :class:`semi_supervised.LabelSpreading` (bug fix)
- :class:`semi_supervised.LabelPropagation` (bug fix)
- tree based models where ``min_weight_fraction_leaf`` is used (enhancement)
- :class:`model_selection.StratifiedKFold` with ``shuffle=True``
  (this change, due to :issue:`7823`, was not mentioned in the release notes at
  the time)

Details are listed in the changelog below.

(While we are trying to better inform users by providing this information, we
cannot assure that this list is complete.)

Changelog
---------

New features
............

Classifiers and regressors

- Added :class:`multioutput.ClassifierChain` for multi-label
  classification. By :user:`Adam Kleczewski <adamklec>`.

- Added solver ``'saga'`` that implements the improved version of Stochastic
  Average Gradient, in :class:`linear_model.LogisticRegression` and
  :class:`linear_model.Ridge`. It allows the use of L1 penalty with
  multinomial logistic loss, and behaves marginally better than 'sag'
  during the first epochs of ridge and logistic regression.
  :issue:`8446` by `Arthur Mensch`_.

Other estimators

- Added the :class:`neighbors.LocalOutlierFactor` class for anomaly
  detection based on nearest neighbors.
  :issue:`5279` by `Nicolas Goix`_ and `Alexandre Gramfort`_.

- Added :class:`preprocessing.QuantileTransformer` class and
  :func:`preprocessing.quantile_transform` function for feature
  normalization based on quantiles.
  :issue:`8363` by :user:`Denis Engemann <dengemann>`,
  :user:`Guillaume Lemaitre <glemaitre>`, `Olivier Grisel`_, `Raghav RV`_,
  :user:`Thierry Guillemot <tguillemot>`, and `Gael Varoquaux`_.

- The new solver ``'mu'`` implements a Multiplicative Update in
  :class:`decomposition.NMF`, allowing the optimization of all
  beta-divergences, including the Frobenius norm, the generalized
  Kullback-Leibler divergence and the Itakura-Saito divergence.
  :issue:`5295` by `Tom Dupre la Tour`_.

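To give a feel for the :class:`neighbors.LocalOutlierFactor` entry above, here
is a small sketch that fits the estimator on a tight cluster plus one distant
point. The data, the value of ``n_neighbors`` and the variable names are
assumptions chosen for illustration.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.RandomState(0)
inliers = rng.normal(loc=0.0, scale=0.5, size=(100, 2))  # dense cluster
outlier = np.array([[8.0, 8.0]])                         # far from the cluster
X = np.vstack([inliers, outlier])

lof = LocalOutlierFactor(n_neighbors=20)
labels = lof.fit_predict(X)              # 1 for inliers, -1 for outliers
scores = lof.negative_outlier_factor_    # lower means more anomalous

print("label of the distant point:", labels[-1])
print("index of the most anomalous sample:", int(scores.argmin()))
```

How many cluster points are also flagged depends on the contamination setting,
so only the distant point should be relied on here.
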
Model selection and evaluation

- :class:`model_selection.GridSearchCV` and
  :class:`model_selection.RandomizedSearchCV` now support simultaneous
  evaluation of multiple metrics. Refer to the
  :ref:`multimetric_grid_search` section of the user guide for more
  information. :issue:`7388` by `Raghav RV`_.

- Added :func:`model_selection.cross_validate`, which allows evaluation
  of multiple metrics. This function returns a dict with more useful
  information from cross-validation such as the train scores, fit times and
  score times.
  Refer to the :ref:`multimetric_cross_validation` section of the user guide
  for more information. :issue:`7388` by `Raghav RV`_.

- Added :func:`metrics.mean_squared_log_error`, which computes
  the mean squared error of the logarithmic transformation of targets,
  particularly useful for targets with an exponential trend.
  :issue:`7655` by :user:`Karan Desai <karandesai-96>`.

- Added :func:`metrics.dcg_score` and :func:`metrics.ndcg_score`, which
  compute Discounted cumulative gain (DCG) and Normalized discounted
  cumulative gain (NDCG).
  :issue:`7739` by :user:`David Gasquez <davidgasquez>`.

- Added the :class:`model_selection.RepeatedKFold` and
  :class:`model_selection.RepeatedStratifiedKFold`.
  :issue:`8120` by `Neeraj Gangwar`_.

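The quantity behind the :func:`metrics.mean_squared_log_error` entry above can
be sketched in a few lines of plain Python. This is a standalone illustration
of the formula, the mean of squared differences between ``log(1 + y_true)`` and
``log(1 + y_pred)``, and not the library's implementation; the helper name
``msle`` is hypothetical.

```python
import math


def msle(y_true, y_pred):
    """Mean squared difference between log(1 + y_true) and log(1 + y_pred)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(v < 0 for v in list(y_true) + list(y_pred)):
        raise ValueError("targets and predictions must be non-negative")
    squared_log_errors = [
        (math.log1p(t) - math.log1p(p)) ** 2 for t, p in zip(y_true, y_pred)
    ]
    return sum(squared_log_errors) / len(squared_log_errors)


error = msle([3, 5, 2.5, 7], [2.5, 5, 4, 8])
print(round(error, 4))
```

Because the errors are measured on a log scale, the metric responds to relative
rather than absolute differences, which is why it suits targets that grow
exponentially.
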
Miscellaneous

- Validation that input data contains no NaN or inf can now be suppressed
  using :func:`config_context`, at your own risk. This will save on runtime,
  and may be particularly useful for prediction time. :issue:`7548` by
  `Joel Nothman`_.

- Added a test to ensure parameter listing in docstrings matches the
  function/class signature. :issue:`9206` by `Alexandre Gramfort`_ and
  `Raghav RV`_.

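The :func:`config_context` entry above could be used along the following
lines. This is a minimal sketch: the model and data are assumptions for the
example, and skipping the finiteness check is only safe when the input is
already known to contain no NaN or inf.

```python
import numpy as np
from sklearn import config_context, get_config
from sklearn.linear_model import LinearRegression

rng = np.random.RandomState(0)
X = rng.rand(100, 3)
y = X.dot(np.array([1.0, 2.0, 3.0]))
model = LinearRegression().fit(X, y)

# Inside the block, input validation skips the NaN/inf check.
with config_context(assume_finite=True):
    inside = get_config()["assume_finite"]
    predictions = model.predict(X)

# The previous configuration is restored when the block exits.
outside = get_config()["assume_finite"]
print(inside, outside, predictions.shape)
```
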
Enhancements
............

Trees and ensembles

- The ``min_weight_fraction_leaf`` constraint in tree construction is now
  more efficient, taking a fast path to declare a node a leaf if its weight
  is less than 2 * the minimum. Note that the constructed tree will be
  different from previous versions where ``min_weight_fraction_leaf`` is
  used. :issue:`7441` by :user:`Nelson Liu <nelson-liu>`.

- :class:`ensemble.GradientBoostingClassifier` and :class:`ensemble.GradientBoostingRegressor`
  now support sparse input for prediction.
  :issue:`6101` by :user:`Ibraim Ganiev <olologin>`.

- :class:`ensemble.VotingClassifier` now allows changing estimators by using
  :meth:`ensemble.VotingClassifier.set_params`. An estimator can also be
  removed by setting it to ``None``.
  :issue:`7674` by :user:`Yichuan Liu <yl565>`.

- :func:`tree.export_graphviz` now shows a configurable number of decimal
  places. :issue:`8698` by :user:`Guillaume Lemaitre <glemaitre>`.

- Added ``flatten_transform`` parameter to :class:`ensemble.VotingClassifier`
  to change the output shape of the ``transform`` method to 2-dimensional.
  :issue:`7794` by :user:`Ibraim Ganiev <olologin>` and
  :user:`Herilalaina Rakotoarison <herilalaina>`.

Linear, kernelized and related models

- :class:`linear_model.SGDClassifier`, :class:`linear_model.SGDRegressor`,
  :class:`linear_model.PassiveAggressiveClassifier`,
  :class:`linear_model.PassiveAggressiveRegressor` and
  :class:`linear_model.Perceptron` now expose ``max_iter`` and
  ``tol`` parameters, to handle convergence more precisely.
  The ``n_iter`` parameter is deprecated, and the fitted estimator exposes
  an ``n_iter_`` attribute, with the actual number of iterations before
  convergence. :issue:`5036` by `Tom Dupre la Tour`_.

- Added ``average`` parameter to perform weight averaging in
  :class:`linear_model.PassiveAggressiveClassifier`. :issue:`4939`
  by :user:`Andrea Esuli <aesuli>`.

- :class:`linear_model.RANSACRegressor` no longer throws an error
  when calling ``fit`` if no inliers are found in its first iteration.
  Furthermore, causes of skipped iterations are tracked in newly added
  attributes, ``n_skips_*``.
  :issue:`7914` by :user:`Michael Horrell <mthorrell>`.

- In :class:`gaussian_process.GaussianProcessRegressor`, method ``predict``
  is a lot faster with ``return_std=True``. :issue:`8591` by
  :user:`Hadrien Bertrand <hbertrand>`.

- Added ``return_std`` to ``predict`` method of
  :class:`linear_model.ARDRegression` and
  :class:`linear_model.BayesianRidge`.
  :issue:`7838` by :user:`Sergey Feldman <sergeyf>`.

- Memory usage enhancements: Prevent cast from float32 to float64 in:
  :class:`linear_model.MultiTaskElasticNet`;
  :class:`linear_model.LogisticRegression` when using newton-cg solver; and
  :class:`linear_model.Ridge` when using svd, sparse_cg, cholesky or lsqr
  solvers. :issue:`8835`, :issue:`8061` by :user:`Joan Massich <massich>`,
  :user:`Nicolas Cordier <ncordier>` and :user:`Thierry Guillemot <tguillemot>`.

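The ``max_iter`` and ``tol`` parameters from the first entry in this list
might be used as follows. The dataset and the parameter values are assumptions
picked for the sketch, not recommended settings.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

# Synthetic data (illustrative only).
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# ``max_iter`` caps the number of passes over the data; ``tol`` lets training
# stop earlier once the loss stops improving by at least that amount.
clf = SGDClassifier(max_iter=1000, tol=1e-3, random_state=0)
clf.fit(X, y)

# ``n_iter_`` reports how many passes were actually run.
print("epochs run:", clf.n_iter_)
```
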
Other predictors

- Custom metrics for the :mod:`sklearn.neighbors` binary trees now have
  fewer constraints: they must take two 1d-arrays and return a float.
  :issue:`6288` by `Jake Vanderplas`_.

- ``algorithm='auto'`` in :mod:`sklearn.neighbors` estimators now chooses the most
  appropriate algorithm for all input types and metrics. :issue:`9145` by
  :user:`Herilalaina Rakotoarison <herilalaina>` and :user:`Reddy Chinthala
  <preddy5>`.

Decomposition, manifold learning and clustering

- :class:`cluster.MiniBatchKMeans` and :class:`cluster.KMeans`
  now use significantly less memory when assigning data points to their
  nearest cluster center. :issue:`7721` by :user:`Jon Crall <Erotemic>`.

- :class:`decomposition.PCA`, :class:`decomposition.IncrementalPCA` and
  :class:`decomposition.TruncatedSVD` now expose the singular values
  from the underlying SVD. They are stored in the attribute
  ``singular_values_``, like in :class:`decomposition.IncrementalPCA`.
  :issue:`7685` by :user:`Tommy Löfstedt <tomlof>`.

- :class:`decomposition.NMF` is now faster when ``beta_loss=0``.
  :issue:`9277` by :user:`hongkahjun`.

- Memory improvements for method ``barnes_hut`` in :class:`manifold.TSNE`.
  :issue:`7089` by :user:`Thomas Moreau <tomMoral>` and `Olivier Grisel`_.

- Optimization schedule improvements for Barnes-Hut :class:`manifold.TSNE`
  so the results are closer to those from the reference implementation
  `lvdmaaten/bhtsne <https://github.com/lvdmaaten/bhtsne>`_ by :user:`Thomas
  Moreau <tomMoral>` and `Olivier Grisel`_.

- Memory usage enhancements: Prevent cast from float32 to float64 in
  :class:`decomposition.PCA` and
  `decomposition.randomized_svd_low_rank`.
  :issue:`9067` by `Raghav RV`_.

Preprocessing and feature selection

- Added ``norm_order`` parameter to :class:`feature_selection.SelectFromModel`
  to enable selection of the norm order when ``coef_`` is more than 1D.
  :issue:`6181` by :user:`Antoine Wendlinger <antoinewdg>`.

- Added ability to use sparse matrices in :func:`feature_selection.f_regression`
  with ``center=True``. :issue:`8065` by :user:`Daniel LeJeune <acadiansith>`.

- Small performance improvement to n-gram creation in
  :mod:`sklearn.feature_extraction.text` by binding methods for loops and
  special-casing unigrams. :issue:`7567` by :user:`Jaye Doepke <jtdoepke>`

- Relax assumption on the data for the
  :class:`kernel_approximation.SkewedChi2Sampler`. Since the Skewed-Chi2
  kernel is defined on the open interval :math:`(-skewedness; +\infty)^d`,
  the transform function should not check whether ``X < 0`` but whether
  ``X < -self.skewedness``. :issue:`7573` by :user:`Romain Brault <RomainBrault>`.

- Made default kernel parameters kernel-dependent in
  :class:`kernel_approximation.Nystroem`.
  :issue:`5229` by :user:`Saurabh Bansod <mth4saurabh>` and `Andreas Müller`_.

Model evaluation and meta-estimators

- :class:`pipeline.Pipeline` is now able to cache transformers
  within a pipeline by using the ``memory`` constructor parameter.
  :issue:`7990` by :user:`Guillaume Lemaitre <glemaitre>`.

- :class:`pipeline.Pipeline` steps can now be accessed as attributes of its
  ``named_steps`` attribute. :issue:`8586` by :user:`Herilalaina
  Rakotoarison <herilalaina>`.

- Added ``sample_weight`` parameter to :meth:`pipeline.Pipeline.score`.
  :issue:`7723` by :user:`Mikhail Korobov <kmike>`.

- Added ability to set ``n_jobs`` parameter to :func:`pipeline.make_union`.
  A ``TypeError`` will be raised for any other kwargs. :issue:`8028`
  by :user:`Alexander Booth <alexandercbooth>`.

- :class:`model_selection.GridSearchCV`,
  :class:`model_selection.RandomizedSearchCV` and
  :func:`model_selection.cross_val_score` now allow estimators with callable
  kernels which were previously prohibited.
  :issue:`8005` by `Andreas Müller`_.

- :func:`model_selection.cross_val_predict` now returns output of the
  correct shape for all values of the argument ``method``.
  :issue:`7863` by :user:`Aman Dalmia <dalmia>`.

- Added ``shuffle`` and ``random_state`` parameters to shuffle training
  data before taking prefixes of it based on training sizes in
  :func:`model_selection.learning_curve`.
  :issue:`7506` by :user:`Narine Kokhlikyan <NarineK>`.

- :class:`model_selection.StratifiedShuffleSplit` now works with multioutput
  multiclass (or multilabel) data. :issue:`9044` by `Vlad Niculae`_.

- Speed improvements to :class:`model_selection.StratifiedShuffleSplit`.
  :issue:`5991` by :user:`Arthur Mensch <arthurmensch>` and `Joel Nothman`_.

- Add ``shuffle`` parameter to :func:`model_selection.train_test_split`.
  :issue:`8845` by :user:`themrmax <themrmax>`.

- :class:`multioutput.MultiOutputRegressor` and :class:`multioutput.MultiOutputClassifier`
  now support online learning using ``partial_fit``.
  :issue:`8053` by :user:`Peng Yu <yupbank>`.

- Add ``max_train_size`` parameter to :class:`model_selection.TimeSeriesSplit`.
  :issue:`8282` by :user:`Aman Dalmia <dalmia>`.

- More clustering metrics are now available through :func:`metrics.get_scorer`
  and ``scoring`` parameters. :issue:`8117` by `Raghav RV`_.

- A scorer based on :func:`metrics.explained_variance_score` is also available.
  :issue:`9259` by :user:`Hanmin Qin <qinhanmin2014>`.

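The two :class:`pipeline.Pipeline` entries at the top of this list, caching via
``memory`` and attribute access on ``named_steps``, can be combined in a short
sketch. The step names, the estimators and the temporary cache directory are
assumptions made for the example.

```python
import shutil
import tempfile

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

cache_dir = tempfile.mkdtemp()  # throwaway directory for cached transformers
try:
    pipe = Pipeline(
        [("scaler", StandardScaler()), ("clf", LogisticRegression())],
        memory=cache_dir,  # fitted transformers are cached on disk here
    )
    pipe.fit(X, y)

    # Steps are reachable as attributes as well as by key.
    scaler = pipe.named_steps.scaler
    same_object = scaler is pipe.named_steps["scaler"]
    print(same_object, scaler.mean_.shape)
finally:
    shutil.rmtree(cache_dir)  # clean up the cache
```

Caching pays off mainly when the same transformer is refit with identical
parameters and data, as happens repeatedly during a grid search.
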
Metrics

- :func:`metrics.matthews_corrcoef` now supports multiclass classification.
  :issue:`8094` by :user:`Jon Crall <Erotemic>`.

- Add ``sample_weight`` parameter to :func:`metrics.cohen_kappa_score`.
  :issue:`8335` by :user:`Victor Poughon <vpoughon>`.

Miscellaneous

- :func:`utils.estimator_checks.check_estimator` now attempts to ensure that methods
  transform, predict, etc. do not set attributes on the estimator.
  :issue:`7533` by :user:`Ekaterina Krivich <kiote>`.

- Added type checking to the ``accept_sparse`` parameter in
  :mod:`sklearn.utils.validation` methods. This parameter now accepts only boolean,
  string, or list/tuple of strings. ``accept_sparse=None`` is deprecated and
  should be replaced by ``accept_sparse=False``.
  :issue:`7880` by :user:`Josh Karnofsky <jkarno>`.

- Make it possible to load a chunk of an svmlight formatted file by
  passing a range of bytes to :func:`datasets.load_svmlight_file`.
  :issue:`935` by :user:`Olivier Grisel <ogrisel>`.

- :class:`dummy.DummyClassifier` and :class:`dummy.DummyRegressor`
  now accept non-finite features. :issue:`8931` by :user:`Attractadore`.

Bug fixes
.........

Trees and ensembles

- Fixed a memory leak in trees when using trees with ``criterion='mae'``.
  :issue:`8002` by `Raghav RV`_.

- Fixed a bug where :class:`ensemble.IsolationForest` used an
  incorrect formula for the average path length.
  :issue:`8549` by `Peter Wang <https://github.com/PTRWang>`_.

- Fixed a bug where :class:`ensemble.AdaBoostClassifier` threw
  ``ZeroDivisionError`` while fitting data with single class labels.
  :issue:`7501` by :user:`Dominik Krzeminski <dokato>`.

- Fixed a bug in :class:`ensemble.GradientBoostingClassifier` and
  :class:`ensemble.GradientBoostingRegressor` where a float being compared
  to ``0.0`` using ``==`` caused a divide by zero error. :issue:`7970` by
  :user:`He Chen <chenhe95>`.

- Fix a bug where :class:`ensemble.GradientBoostingClassifier` and
  :class:`ensemble.GradientBoostingRegressor` ignored the
  ``min_impurity_split`` parameter.
  :issue:`8006` by :user:`Sebastian Pölsterl <sebp>`.

- Fixed ``oob_score`` in :class:`ensemble.BaggingClassifier`.
  :issue:`8936` by :user:`Michael Lewis <mlewis1729>`.

- Fixed excessive memory usage in prediction for random forests estimators.
  :issue:`8672` by :user:`Mike Benfield <mikebenfield>`.

- Fixed a bug where ``sample_weight`` as a list broke random forests in Python 2.
  :issue:`8068` by :user:`xor`.

- Fixed a bug where :class:`ensemble.IsolationForest` failed when
  ``max_features`` is less than 1.
  :issue:`5732` by :user:`Ishank Gulati <IshankGulati>`.

- Fix a bug where gradient boosting with ``loss='quantile'`` computed
  negative errors for negative values of ``ytrue - ypred`` leading to wrong
  values when calling ``__call__``.
  :issue:`8087` by :user:`Alexis Mignon <AlexisMignon>`.

- Fix a bug where :class:`ensemble.VotingClassifier` raised an error
  when a numpy array is passed in for weights. :issue:`7983` by
  :user:`Vincent Pham <vincentpham1991>`.

- Fixed a bug where :func:`tree.export_graphviz` raised an error
  when the length of ``feature_names`` does not match n_features in the decision
  tree. :issue:`8512` by :user:`Li Li <aikinogard>`.

Linear, kernelized and related models

- Fixed a bug where :func:`linear_model.RANSACRegressor.fit` may run until
  ``max_iter`` if it finds a large inlier group early. :issue:`8251` by
  :user:`aivision2020`.

- Fixed a bug where :class:`naive_bayes.MultinomialNB` and
  :class:`naive_bayes.BernoulliNB` failed when ``alpha=0``. :issue:`5814` by
  :user:`Yichuan Liu <yl565>` and :user:`Herilalaina Rakotoarison
  <herilalaina>`.

- Fixed a bug where :class:`linear_model.LassoLars` does not give
  the same result as the LassoLars implementation available
  in R (lars library). :issue:`7849` by :user:`Jair Montoya Martinez <jmontoyam>`.

- Fixed a bug in `linear_model.RandomizedLasso`,
  :class:`linear_model.Lars`, :class:`linear_model.LassoLars`,
  :class:`linear_model.LarsCV` and :class:`linear_model.LassoLarsCV`,
  where the parameter ``precompute`` was not used consistently across
  classes, and some values proposed in the docstring could raise errors.
  :issue:`5359` by `Tom Dupre la Tour`_.

- Fix inconsistent results between :class:`linear_model.RidgeCV` and
  :class:`linear_model.Ridge` when using ``normalize=True``. :issue:`9302`
  by `Alexandre Gramfort`_.

- Fix a bug where :func:`linear_model.LassoLars.fit` sometimes
  left ``coef_`` as a list, rather than an ndarray.
  :issue:`8160` by :user:`CJ Carey <perimosocordiae>`.

- Fix :func:`linear_model.BayesianRidge.fit` to return
  ridge parameters ``alpha_`` and ``lambda_`` consistent with calculated
  coefficients ``coef_`` and ``intercept_``.
  :issue:`8224` by :user:`Peter Gedeck <gedeck>`.

- Fixed a bug in :class:`svm.OneClassSVM` where it returned floats instead of
  integer classes. :issue:`8676` by :user:`Vathsala Achar <VathsalaAchar>`.

- Fix AIC/BIC criterion computation in :class:`linear_model.LassoLarsIC`.
  :issue:`9022` by `Alexandre Gramfort`_ and :user:`Mehmet Basbug <mehmetbasbug>`.

- Fixed a memory leak in our LibLinear implementation. :issue:`9024` by
  :user:`Sergei Lebedev <superbobry>`.

- Fix bug where stratified CV splitters did not work with
  :class:`linear_model.LassoCV`. :issue:`8973` by
  :user:`Paulo Haddad <paulochf>`.

- Fixed a bug in :class:`gaussian_process.GaussianProcessRegressor`
  where predicting the standard deviation or covariance before fitting
  would fail with an uninformative error by default.
  :issue:`6573` by :user:`Quazi Marufur Rahman <qmaruf>` and
  `Manoj Kumar`_.

Other predictors

- Fix `semi_supervised.BaseLabelPropagation` to correctly implement
  ``LabelPropagation`` and ``LabelSpreading`` as done in the referenced
  papers. :issue:`9239`
  by :user:`Andre Ambrosio Boechat <boechat107>`, :user:`Utkarsh Upadhyay
  <musically-ut>`, and `Joel Nothman`_.

Decomposition, manifold learning and clustering

- Fixed the implementation of :class:`manifold.TSNE`:

  - ``early_exaggeration`` parameter had no effect and is now used for the
    first 250 optimization iterations.
  - Fixed the ``AssertionError: Tree consistency failed`` exception
    reported in :issue:`8992`.
  - Improve the learning schedule to match the one from the reference
    implementation `lvdmaaten/bhtsne <https://github.com/lvdmaaten/bhtsne>`_.

  by :user:`Thomas Moreau <tomMoral>` and `Olivier Grisel`_.

- Fix a bug in :class:`decomposition.LatentDirichletAllocation`
  where the ``perplexity`` method was returning incorrect results because
  the ``transform`` method returns normalized document topic distributions
  as of version 0.18. :issue:`7954` by :user:`Gary Foreman <garyForeman>`.

- Fix output shape and bugs with ``n_jobs > 1`` in
  :class:`decomposition.SparseCoder` transform and
  :func:`decomposition.sparse_encode`
  for one-dimensional data and one component.
  This also impacts the output shape of :class:`decomposition.DictionaryLearning`.
  :issue:`8086` by `Andreas Müller`_.

- Fixed the implementation of ``explained_variance_``
  in :class:`decomposition.PCA`,
  `decomposition.RandomizedPCA` and
  :class:`decomposition.IncrementalPCA`.
  :issue:`9105` by `Hanmin Qin <https://github.com/qinhanmin2014>`_.

- Fixed the implementation of ``noise_variance_`` in :class:`decomposition.PCA`.
  :issue:`9108` by `Hanmin Qin <https://github.com/qinhanmin2014>`_.

- Fixed a bug where :class:`cluster.DBSCAN` gave incorrect
  results when the input is a precomputed sparse matrix with initial
  rows all zero. :issue:`8306` by :user:`Akshay Gupta <Akshay0724>`.

- Fix a bug regarding fitting :class:`cluster.KMeans` with a sparse
  array X and initial centroids, where X's means were unnecessarily being
  subtracted from the centroids. :issue:`7872` by :user:`Josh Karnofsky <jkarno>`.

- Fixes to the input validation in :class:`covariance.EllipticEnvelope`.
  :issue:`8086` by `Andreas Müller`_.

- Fixed a bug in :class:`covariance.MinCovDet` where inputting data
  that produced a singular covariance matrix would cause the helper method
  ``_c_step`` to throw an exception.
  :issue:`3367` by :user:`Jeremy Steward <ThatGeoGuy>`.

- Fixed a bug in :class:`manifold.TSNE` affecting convergence of the
  gradient descent. :issue:`8768` by :user:`David DeTomaso <deto>`.

- Fixed a bug in :class:`manifold.TSNE` where it stored the incorrect
  ``kl_divergence_``. :issue:`6507` by :user:`Sebastian Saeger <ssaeger>`.

- Fixed improper scaling in :class:`cross_decomposition.PLSRegression`
  with ``scale=True``. :issue:`7819` by :user:`jayzed82 <jayzed82>`.

- :class:`cluster.SpectralCoclustering` and
  :class:`cluster.SpectralBiclustering` ``fit`` method conforms
  with API by accepting ``y`` and returning the object. :issue:`6126`,
  :issue:`7814` by :user:`Laurent Direr <ldirer>` and :user:`Maniteja
  Nandana <maniteja123>`.

- Fix bug where :mod:`sklearn.mixture` ``sample`` methods did not return as many
  samples as requested. :issue:`7702` by :user:`Levi John Wolf <ljwolf>`.

- Fixed the shrinkage implementation in :class:`neighbors.NearestCentroid`.
  :issue:`9219` by `Hanmin Qin <https://github.com/qinhanmin2014>`_.

Preprocessing and feature selection
|
|
|
|
- For sparse matrices, :func:`preprocessing.normalize` with ``return_norm=True``
|
|
will now raise a ``NotImplementedError`` with 'l1' or 'l2' norm and with
|
|
norm 'max' the norms returned will be the same as for dense matrices.
|
|
:issue:`7771` by `Ang Lu <https://github.com/luang008>`_.
|
|
|
|
- Fix a bug where :class:`feature_selection.SelectFdr` did not
|
|
exactly implement the Benjamini-Hochberg procedure. It formerly may have
|
|
selected fewer features than it should.
|
|
:issue:`7490` by :user:`Peng Meng <mpjlu>`.
|
|
|
|
- Fixed a bug where `linear_model.RandomizedLasso` and
|
|
`linear_model.RandomizedLogisticRegression` broke for
|
|
sparse input. :issue:`8259` by :user:`Aman Dalmia <dalmia>`.
|
|
|
|
- Fix a bug where :class:`feature_extraction.FeatureHasher`
|
|
mandatorily applied a sparse random projection to the hashed features,
|
|
preventing the use of
|
|
:class:`feature_extraction.text.HashingVectorizer` in a
|
|
pipeline with :class:`feature_extraction.text.TfidfTransformer`.
|
|
:issue:`7565` by :user:`Roman Yurchak <rth>`.
|
|
|
|
- Fix a bug where :func:`feature_selection.mutual_info_regression` did not
|
|
correctly use ``n_neighbors``. :issue:`8181` by :user:`Guillaume Lemaitre
|
|
<glemaitre>`.
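
For readers unfamiliar with the Benjamini-Hochberg procedure mentioned in the
:class:`feature_selection.SelectFdr` entry above, the following is a minimal
standalone sketch of the textbook step-up rule. It is not scikit-learn's
implementation; the function name ``benjamini_hochberg`` and the p-values are
made up for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return the indices selected by the Benjamini-Hochberg step-up rule.

    Hypothetical helper for illustration only.
    """
    n = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(n), key=lambda i: p_values[i])
    # Find the largest rank k whose p-value is below alpha * k / n.
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / n:
            largest_k = rank
    # Everything ranked at or below that k is selected, even if some
    # intermediate rank failed its own threshold.
    return sorted(order[:largest_k])


selected = benjamini_hochberg([0.03, 0.5, 0.001, 0.035], alpha=0.05)
```

In this example the second-smallest p-value (0.03) misses its own threshold
(0.025), but it is still selected because the third-smallest (0.035) passes
its threshold (0.0375).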
|
|
|
|
Model evaluation and meta-estimators
|
|
|
|
- Fixed a bug where `model_selection.BaseSearchCV.inverse_transform`
|
|
returned ``self.best_estimator_.transform()`` instead of
|
|
``self.best_estimator_.inverse_transform()``.
|
|
:issue:`8344` by :user:`Akshay Gupta <Akshay0724>` and :user:`Rasmus Eriksson <MrMjauh>`.
|
|
|
|
- Added ``classes_`` attribute to :class:`model_selection.GridSearchCV`,
|
|
:class:`model_selection.RandomizedSearchCV`, `grid_search.GridSearchCV`,
|
|
and `grid_search.RandomizedSearchCV` that matches the ``classes_``
|
|
attribute of ``best_estimator_``. :issue:`7661` and :issue:`8295`
|
|
by :user:`Alyssa Batula <abatula>`, :user:`Dylan Werner-Meier <unautre>`,
|
|
and :user:`Stephen Hoover <stephen-hoover>`.
|
|
|
|
- Fixed a bug where :func:`model_selection.validation_curve`
|
|
reused the same estimator for each parameter value.
|
|
:issue:`7365` by :user:`Aleksandr Sandrovskii <Sundrique>`.
|
|
|
|
- :func:`model_selection.permutation_test_score` now works with Pandas
|
|
types. :issue:`5697` by :user:`Stijn Tonk <equialgo>`.
|
|
|
|
- Several fixes to input validation in
|
|
:class:`multiclass.OutputCodeClassifier`.
|
|
:issue:`8086` by `Andreas Müller`_.
|
|
|
|
- :class:`multiclass.OneVsOneClassifier`'s ``partial_fit`` now ensures all
|
|
classes are provided up-front. :issue:`6250` by
|
|
:user:`Asish Panda <kaichogami>`.
|
|
|
|
- Fix :func:`multioutput.MultiOutputClassifier.predict_proba` to return a
|
|
list of 2d arrays, rather than a 3d array. In the case where different
|
|
target columns had different numbers of classes, a ``ValueError`` would be
|
|
raised on trying to stack matrices with different dimensions.
|
|
:issue:`8093` by :user:`Peter Bull <pjbull>`.
|
|
|
|
- Cross validation now works with Pandas datatypes that have a
|
|
read-only index. :issue:`9507` by `Loic Esteve`_.
|
|
|
|
Metrics
|
|
|
|
- :func:`metrics.average_precision_score` no longer linearly
|
|
interpolates between operating points, and instead weights precisions
|
|
by the change in recall since the last operating point, as per the
|
|
`Wikipedia entry <https://en.wikipedia.org/wiki/Average_precision>`_.
|
|
(`#7356 <https://github.com/scikit-learn/scikit-learn/pull/7356>`_). By
|
|
:user:`Nick Dingwall <ndingwall>` and `Gael Varoquaux`_.
|
|
|
|
- Fix a bug in `metrics.classification._check_targets`
|
|
which would return ``'binary'`` if ``y_true`` and ``y_pred`` were
|
|
both ``'binary'`` but the union of ``y_true`` and ``y_pred`` was
|
|
``'multiclass'``. :issue:`8377` by `Loic Esteve`_.
|
|
|
|
- Fixed an integer overflow bug in :func:`metrics.confusion_matrix` and
|
|
hence :func:`metrics.cohen_kappa_score`. :issue:`8354`, :issue:`7929`
|
|
by `Joel Nothman`_ and :user:`Jon Crall <Erotemic>`.
|
|
|
|
- Fixed passing of ``gamma`` parameter to the ``chi2`` kernel in
|
|
:func:`metrics.pairwise.pairwise_kernels`. :issue:`5211` by
|
|
:user:`Nick Rhinehart <nrhine1>`,
|
|
:user:`Saurabh Bansod <mth4saurabh>` and `Andreas Müller`_.
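
The non-interpolated definition described in the
:func:`metrics.average_precision_score` entry above can be sketched in plain
Python as below. This is an illustration of the step-wise sum (each precision
weighted by the increase in recall), assuming binary 0/1 labels and no tied
scores; it is not the library's code, and the function name is hypothetical.

```python
def average_precision(y_true, y_score):
    """Step-wise average precision: sum of (recall change) * precision."""
    # Rank samples by decreasing score (ties are not handled specially).
    ranked = sorted(zip(y_score, y_true), key=lambda pair: -pair[0])
    n_positive = sum(y_true)
    true_positives = 0
    previous_recall = 0.0
    total = 0.0
    for k, (_, label) in enumerate(ranked, start=1):
        true_positives += label
        precision = true_positives / k
        recall = true_positives / n_positive
        # Only operating points where recall increases contribute.
        total += (recall - previous_recall) * precision
        previous_recall = recall
    return total


ap = average_precision([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```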
|
|
|
|
Miscellaneous
|
|
|
|
- Fixed a bug where :func:`datasets.make_classification` failed
|
|
when generating more than 30 features. :issue:`8159` by
|
|
:user:`Herilalaina Rakotoarison <herilalaina>`.
|
|
|
|
- Fixed a bug where :func:`datasets.make_moons` gave an
|
|
incorrect result when ``n_samples`` is odd.
|
|
:issue:`8198` by :user:`Josh Levy <levy5674>`.
|
|
|
|
- Some ``fetch_`` functions in :mod:`sklearn.datasets` were ignoring the
|
|
``download_if_missing`` keyword. :issue:`7944` by :user:`Ralf Gommers <rgommers>`.
|
|
|
|
- Fix estimators to accept a ``sample_weight`` parameter of type
|
|
``pandas.Series`` in their ``fit`` function. :issue:`7825` by
|
|
`Kathleen Chen`_.
|
|
|
|
- Fix a bug in cases where ``numpy.cumsum`` may be numerically unstable,
|
|
raising an exception if instability is identified. :issue:`7376` and
|
|
:issue:`7331` by `Joel Nothman`_ and :user:`yangarbiter`.
|
|
|
|
- Fix a bug where `base.BaseEstimator.__getstate__`
|
|
obstructed pickling customizations of child-classes, when used in a
|
|
multiple inheritance context.
|
|
:issue:`8316` by :user:`Holger Peters <HolgerPeters>`.
|
|
|
|
- Update Sphinx-Gallery from 0.1.4 to 0.1.7 for resolving links in
|
|
documentation builds with Sphinx>1.5. :issue:`8010`, :issue:`7986` by
|
|
:user:`Oscar Najera <Titan-C>`.
|
|
|
|
- Add ``data_home`` parameter to :func:`sklearn.datasets.fetch_kddcup99`.
|
|
:issue:`9289` by `Loic Esteve`_.
|
|
|
|
- Fix dataset loaders using Python 3 version of makedirs to also work in
|
|
Python 2. :issue:`9284` by :user:`Sebastin Santy <SebastinSanty>`.
|
|
|
|
- Several minor issues were fixed thanks to alerts from
|
|
`lgtm.com <https://lgtm.com/>`_. :issue:`9278` by :user:`Jean Helie <jhelie>`,
|
|
among others.
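
The ``numpy.cumsum`` entry above describes raising an exception when a
cumulative sum looks numerically unstable. One way such a check could work is
sketched below using only the standard library. The function name, tolerances
and exception type are assumptions made for this illustration and do not
reflect scikit-learn's actual helper.

```python
import math
from itertools import accumulate


def checked_cumsum(values, rel_tol=1e-5, abs_tol=1e-8):
    """Cumulative sum that raises if its last element drifts from the total.

    Hypothetical helper: compares naive accumulation against math.fsum,
    which tracks the total more accurately.
    """
    running = list(accumulate(values))
    expected_total = math.fsum(values)
    if running and not math.isclose(
        running[-1], expected_total, rel_tol=rel_tol, abs_tol=abs_tol
    ):
        raise RuntimeError("cumulative sum is numerically unstable")
    return running


stable = checked_cumsum([0.1] * 10)

try:
    # The 1.0 is lost when added to 1e16, so the naive total ends at 0.0.
    checked_cumsum([1e16, 1.0, -1e16])
    unstable_detected = False
except RuntimeError:
    unstable_detected = True
```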
|
|
|
|
API changes summary
|
|
-------------------
|
|
|
|
Trees and ensembles
|
|
|
|
- Gradient boosting base models are no longer estimators. By `Andreas Müller`_.
|
|
|
|
- All tree based estimators now accept a ``min_impurity_decrease``
|
|
parameter in lieu of the ``min_impurity_split``, which is now deprecated.
|
|
A node is split only if the split induces a weighted impurity decrease of at
least ``min_impurity_decrease``. :issue:`8449` by `Raghav RV`_.
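
The quantity compared against ``min_impurity_decrease`` in the entry above can
be sketched as follows. This is a standalone illustration based on the formula
given in the parameter's documentation, not the tree builder's internal code;
the function name and the sample counts are made up for the example.

```python
def weighted_impurity_decrease(n_total, n_node, n_left, n_right,
                               impurity, left_impurity, right_impurity):
    """Impurity decrease of a candidate split, weighted by node size."""
    return (n_node / n_total) * (
        impurity
        - (n_right / n_node) * right_impurity
        - (n_left / n_node) * left_impurity
    )


# A node holding 40 of 100 samples, split into children of 30 and 10 samples.
decrease = weighted_impurity_decrease(
    n_total=100, n_node=40, n_left=30, n_right=10,
    impurity=0.5, left_impurity=0.2, right_impurity=0.0,
)

min_impurity_decrease = 0.1
# The split is kept only if the decrease reaches the threshold.
should_split = decrease >= min_impurity_decrease
```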
|
|
|
|
Linear, kernelized and related models
|
|
|
|
- ``n_iter`` parameter is deprecated in :class:`linear_model.SGDClassifier`,
|
|
:class:`linear_model.SGDRegressor`,
|
|
:class:`linear_model.PassiveAggressiveClassifier`,
|
|
:class:`linear_model.PassiveAggressiveRegressor` and
|
|
:class:`linear_model.Perceptron`. By `Tom Dupre la Tour`_.
|
|
|
|
Other predictors
|
|
|
|
- `neighbors.LSHForest` has been deprecated and will be
|
|
removed in 0.21 due to poor performance.
|
|
:issue:`9078` by :user:`Laurent Direr <ldirer>`.
|
|
|
|
- :class:`neighbors.NearestCentroid` no longer purports to support
|
|
``metric='precomputed'`` which now raises an error. :issue:`8515` by
|
|
:user:`Sergul Aydore <sergulaydore>`.
|
|
|
|
- The ``alpha`` parameter of :class:`semi_supervised.LabelPropagation` now
|
|
has no effect; it is deprecated and will be removed in 0.21. :issue:`9239`
|
|
by :user:`Andre Ambrosio Boechat <boechat107>`, :user:`Utkarsh Upadhyay
|
|
<musically-ut>`, and `Joel Nothman`_.
|
|
|
|
Decomposition, manifold learning and clustering
|
|
|
|
- Deprecate the ``doc_topic_distr`` argument of the ``perplexity`` method
|
|
in :class:`decomposition.LatentDirichletAllocation` because the
|
|
user no longer has access to the unnormalized document topic distribution
|
|
needed for the perplexity calculation. :issue:`7954` by
|
|
:user:`Gary Foreman <garyForeman>`.
|
|
|
|
- The ``n_topics`` parameter of :class:`decomposition.LatentDirichletAllocation`
|
|
has been renamed to ``n_components``; ``n_topics`` will be removed in version 0.21.
|
|
:issue:`8922` by :user:`Attractadore`.
|
|
|
|
- :meth:`decomposition.SparsePCA.transform`'s ``ridge_alpha`` parameter is
|
|
deprecated in favor of the class parameter.
|
|
:issue:`8137` by :user:`Naoya Kanai <naoyak>`.
|
|
|
|
- :class:`cluster.DBSCAN` now has a ``metric_params`` parameter.
|
|
:issue:`8139` by :user:`Naoya Kanai <naoyak>`.
|
|
|
|
Preprocessing and feature selection
|
|
|
|
- :class:`feature_selection.SelectFromModel` now has a ``partial_fit``
|
|
method only if the underlying estimator does. By `Andreas Müller`_.
|
|
|
|
- :class:`feature_selection.SelectFromModel` now validates the ``threshold``
|
|
parameter and sets the ``threshold_`` attribute during the call to
|
|
``fit``, and no longer during the call to ``transform``. By `Andreas
|
|
Müller`_.
|
|
|
|
- The ``non_negative`` parameter in :class:`feature_extraction.FeatureHasher`
|
|
has been deprecated, and replaced with a more principled alternative,
|
|
``alternate_sign``.
|
|
:issue:`7565` by :user:`Roman Yurchak <rth>`.
|
|
|
|
- `linear_model.RandomizedLogisticRegression`,
|
|
and `linear_model.RandomizedLasso` have been deprecated and will
|
|
be removed in version 0.21.
|
|
:issue:`8995` by :user:`Ramana.S <sentient07>`.
|
|
|
|
Model evaluation and meta-estimators
|
|
|
|
- Deprecate the ``fit_params`` constructor input to the
|
|
:class:`model_selection.GridSearchCV` and
|
|
:class:`model_selection.RandomizedSearchCV` in favor
|
|
of passing keyword parameters to the ``fit`` methods
|
|
of those classes. Data-dependent parameters needed for model
|
|
training should be passed as keyword arguments to ``fit``,
|
|
and conforming to this convention will allow the hyperparameter
|
|
selection classes to be used with tools such as
|
|
:func:`model_selection.cross_val_predict`.
|
|
:issue:`2879` by :user:`Stephen Hoover <stephen-hoover>`.
|
|
|
|
- In version 0.21, the default behavior of splitters that use the
|
|
``test_size`` and ``train_size`` parameters will change, such that
|
|
specifying ``train_size`` alone will cause ``test_size`` to be the
|
|
remainder. :issue:`7459` by :user:`Nelson Liu <nelson-liu>`.
|
|
|
|
- :class:`multiclass.OneVsRestClassifier` now has ``partial_fit``,
|
|
``decision_function`` and ``predict_proba`` methods only when the
|
|
underlying estimator does. :issue:`7812` by `Andreas Müller`_ and
|
|
:user:`Mikhail Korobov <kmike>`.
|
|
|
|
- :class:`multiclass.OneVsRestClassifier` now has a ``partial_fit`` method
|
|
only if the underlying estimator does. By `Andreas Müller`_.
|
|
|
|
- The ``decision_function`` output shape for binary classification in
|
|
:class:`multiclass.OneVsRestClassifier` and
|
|
:class:`multiclass.OneVsOneClassifier` is now ``(n_samples,)`` to conform
|
|
to scikit-learn conventions. :issue:`9100` by `Andreas Müller`_.
|
|
|
|
- The :func:`multioutput.MultiOutputClassifier.predict_proba`
|
|
function used to return a 3d array (``n_samples``, ``n_classes``,
|
|
``n_outputs``). In the case where different target columns had different
|
|
numbers of classes, a ``ValueError`` would be raised on trying to stack
|
|
matrices with different dimensions. This function now returns a list of
|
|
arrays where the length of the list is ``n_outputs``, and each array is
|
|
(``n_samples``, ``n_classes``) for that particular output.
|
|
:issue:`8093` by :user:`Peter Bull <pjbull>`.
|
|
|
|
- Changed the ``named_steps`` attribute from a ``dict`` to a :class:`utils.Bunch`
|
|
in :class:`pipeline.Pipeline` to enable tab completion in interactive
|
|
environments. If a step name conflicts with a ``dict`` attribute,
the ``dict`` behavior takes priority.
|
|
:issue:`8481` by :user:`Herilalaina Rakotoarison <herilalaina>`.
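
To make the ``named_steps`` entry above concrete, here is a minimal stand-in
for a Bunch-like container: a ``dict`` whose keys can also be read as
attributes. The class below is an illustrative sketch and not the actual
:class:`utils.Bunch` source; the step names and values are placeholders.

```python
class Bunch(dict):
    """Illustrative stand-in: a dict whose keys are also readable as attributes."""

    def __getattr__(self, key):
        # Only called when normal attribute lookup fails, so genuine dict
        # attributes such as ``values`` or ``items`` win over same-named keys.
        try:
            return self[key]
        except KeyError:
            raise AttributeError(key)


named_steps = Bunch(scaler="a scaler step", values="a step named 'values'")

by_attribute = named_steps.scaler          # attribute access to a step
by_key = named_steps["values"]             # key access always reaches the step
dict_method = named_steps.values           # the dict method, not the step
```

Steps whose names collide with ``dict`` attributes remain reachable through
item access, which is why the conflict rule in the entry is harmless in
practice.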
|
|
|
|
Miscellaneous
|
|
|
|
- Deprecate the ``y`` parameter in ``transform`` and ``inverse_transform``.
|
|
These methods should not accept a ``y`` parameter, since they are used at prediction time.
|
|
:issue:`8174` by :user:`Tahar Zanouda <tzano>`, `Alexandre Gramfort`_
|
|
and `Raghav RV`_.
|
|
|
|
- SciPy >= 0.13.3 and NumPy >= 1.8.2 are now the minimum supported versions
|
|
for scikit-learn. The following backported functions in
|
|
:mod:`sklearn.utils` have been removed or deprecated accordingly.
|
|
:issue:`8854` and :issue:`8874` by :user:`Naoya Kanai <naoyak>`.
|
|
|
|
- The ``store_covariances`` and ``covariances_`` parameters of
|
|
:class:`discriminant_analysis.QuadraticDiscriminantAnalysis`
|
|
have been renamed to ``store_covariance`` and ``covariance_`` to be
|
|
consistent with the corresponding parameter names of the
|
|
:class:`discriminant_analysis.LinearDiscriminantAnalysis`. The old names will be
|
|
removed in version 0.21. :issue:`7998` by :user:`Jiacheng <mrbeann>`.
|
|
|
|
Removed in 0.19:
|
|
|
|
- ``utils.fixes.argpartition``
|
|
- ``utils.fixes.array_equal``
|
|
- ``utils.fixes.astype``
|
|
- ``utils.fixes.bincount``
|
|
- ``utils.fixes.expit``
|
|
- ``utils.fixes.frombuffer_empty``
|
|
- ``utils.fixes.in1d``
|
|
- ``utils.fixes.norm``
|
|
- ``utils.fixes.rankdata``
|
|
- ``utils.fixes.safe_copy``
|
|
|
|
Deprecated in 0.19, to be removed in 0.21:
|
|
|
|
- ``utils.arpack.eigs``
|
|
- ``utils.arpack.eigsh``
|
|
- ``utils.arpack.svds``
|
|
- ``utils.extmath.fast_dot``
|
|
- ``utils.extmath.logsumexp``
|
|
- ``utils.extmath.norm``
|
|
- ``utils.extmath.pinvh``
|
|
- ``utils.graph.graph_laplacian``
|
|
- ``utils.random.choice``
|
|
- ``utils.sparsetools.connected_components``
|
|
- ``utils.stats.rankdata``
|
|
|
|
- Estimators with both methods ``decision_function`` and ``predict_proba``
|
|
are now required to have a monotonic relation between them. The
|
|
method ``check_decision_proba_consistency`` has been added in
|
|
``utils.estimator_checks`` to check their consistency.
|
|
:issue:`7578` by :user:`Shubham Bhardwaj <shubham0704>`.
|
|
|
|
- All checks in ``utils.estimator_checks``, in particular
|
|
:func:`utils.estimator_checks.check_estimator`, now accept estimator
|
|
instances. Most other checks do not accept
|
|
estimator classes any more. :issue:`9019` by `Andreas Müller`_.
|
|
|
|
- Ensure that estimators' attributes ending with ``_`` are not set
|
|
in the constructor but only in the ``fit`` method. Most notably,
|
|
ensemble estimators (deriving from `ensemble.BaseEnsemble`)
|
|
now only have ``self.estimators_`` available after ``fit``.
|
|
:issue:`7464` by `Lars Buitinck`_ and `Loic Esteve`_.
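
The convention in the last entry above (attributes ending with ``_`` are
created in ``fit``, never in the constructor) can be illustrated with a toy
class. ``MeanPredictor`` and its ``shrink`` parameter are invented for this
sketch and are not part of scikit-learn.

```python
class MeanPredictor:
    """Toy estimator following the convention; not a scikit-learn class."""

    def __init__(self, shrink=1.0):
        # Constructor: store hyperparameters only, with no trailing underscore.
        self.shrink = shrink

    def fit(self, X, y):
        # Learned state gets a trailing underscore and is created here.
        self.mean_ = self.shrink * sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


model = MeanPredictor()
has_state_before_fit = hasattr(model, "mean_")
model.fit([[0], [1], [2]], [1.0, 2.0, 3.0])
has_state_after_fit = hasattr(model, "mean_")
```

Checking for the presence of a trailing-underscore attribute is then a simple
way to tell whether an estimator has been fitted.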
|
|
|
|
|
|
Code and Documentation Contributors
|
|
-----------------------------------
|
|
|
|
Thanks to everyone who has contributed to the maintenance and improvement of the
|
|
project since version 0.18, including:
|
|
|
|
Joel Nothman, Loic Esteve, Andreas Mueller, Guillaume Lemaitre, Olivier Grisel,
|
|
Hanmin Qin, Raghav RV, Alexandre Gramfort, themrmax, Aman Dalmia, Gael
|
|
Varoquaux, Naoya Kanai, Tom Dupré la Tour, Rishikesh, Nelson Liu, Taehoon Lee,
|
|
Nelle Varoquaux, Aashil, Mikhail Korobov, Sebastin Santy, Joan Massich, Roman
|
|
Yurchak, RAKOTOARISON Herilalaina, Thierry Guillemot, Alexandre Abadie, Carol
|
|
Willing, Balakumaran Manoharan, Josh Karnofsky, Vlad Niculae, Utkarsh Upadhyay,
|
|
Dmitry Petrov, Minghui Liu, Srivatsan, Vincent Pham, Albert Thomas, Jake
|
|
VanderPlas, Attractadore, JC Liu, alexandercbooth, chkoar, Óscar Nájera,
|
|
Aarshay Jain, Kyle Gilliam, Ramana Subramanyam, CJ Carey, Clement Joudet, David
|
|
Robles, He Chen, Joris Van den Bossche, Karan Desai, Katie Luangkote, Leland
|
|
McInnes, Maniteja Nandana, Michele Lacchia, Sergei Lebedev, Shubham Bhardwaj,
|
|
akshay0724, omtcyfz, rickiepark, waterponey, Vathsala Achar, jbDelafosse, Ralf
|
|
Gommers, Ekaterina Krivich, Vivek Kumar, Ishank Gulati, Dave Elliott, ldirer,
|
|
Reiichiro Nakano, Levi John Wolf, Mathieu Blondel, Sid Kapur, Dougal J.
|
|
Sutherland, midinas, mikebenfield, Sourav Singh, Aseem Bansal, Ibraim Ganiev,
|
|
Stephen Hoover, AishwaryaRK, Steven C. Howell, Gary Foreman, Neeraj Gangwar,
|
|
Tahar, Jon Crall, dokato, Kathy Chen, ferria, Thomas Moreau, Charlie Brummitt,
|
|
Nicolas Goix, Adam Kleczewski, Sam Shleifer, Nikita Singh, Basil Beirouti,
|
|
Giorgio Patrini, Manoj Kumar, Rafael Possas, James Bourbeau, James A. Bednar,
|
|
Janine Harper, Jaye, Jean Helie, Jeremy Steward, Artsiom, John Wei, Jonathan
|
|
LIgo, Jonathan Rahn, seanpwilliams, Arthur Mensch, Josh Levy, Julian Kuhlmann,
|
|
Julien Aubert, Jörn Hees, Kai, shivamgargsya, Kat Hempstalk, Kaushik
|
|
Lakshmikanth, Kennedy, Kenneth Lyons, Kenneth Myers, Kevin Yap, Kirill Bobyrev,
|
|
Konstantin Podshumok, Arthur Imbert, Lee Murray, toastedcornflakes, Lera, Li
|
|
Li, Arthur Douillard, Mainak Jas, tobycheese, Manraj Singh, Manvendra Singh,
|
|
Marc Meketon, MarcoFalke, Matthew Brett, Matthias Gilch, Mehul Ahuja, Melanie
|
|
Goetz, Meng, Peng, Michael Dezube, Michal Baumgartner, vibrantabhi19, Artem
|
|
Golubin, Milen Paskov, Antonin Carette, Morikko, MrMjauh, NALEPA Emmanuel,
|
|
Namiya, Antoine Wendlinger, Narine Kokhlikyan, NarineK, Nate Guerin, Angus
|
|
Williams, Ang Lu, Nicole Vavrova, Nitish Pandey, Okhlopkov Daniil Olegovich,
|
|
Andy Craze, Om Prakash, Parminder Singh, Patrick Carlson, Patrick Pei, Paul
|
|
Ganssle, Paulo Haddad, Paweł Lorek, Peng Yu, Pete Bachant, Peter Bull, Peter
|
|
Csizsek, Peter Wang, Pieter Arthur de Jong, Ping-Yao, Chang, Preston Parry,
|
|
Puneet Mathur, Quentin Hibon, Andrew Smith, Andrew Jackson, 1kastner, Rameshwar
|
|
Bhaskaran, Rebecca Bilbro, Remi Rampin, Andrea Esuli, Rob Hall, Robert
|
|
Bradshaw, Romain Brault, Aman Pratik, Ruifeng Zheng, Russell Smith, Sachin
|
|
Agarwal, Sailesh Choyal, Samson Tan, Samuël Weber, Sarah Brown, Sebastian
|
|
Pölsterl, Sebastian Raschka, Sebastian Saeger, Alyssa Batula, Abhyuday Pratap
|
|
Singh, Sergey Feldman, Sergul Aydore, Sharan Yalburgi, willduan, Siddharth
|
|
Gupta, Sri Krishna, Almer, Stijn Tonk, Allen Riddell, Theofilos Papapanagiotou,
|
|
Alison, Alexis Mignon, Tommy Boucher, Tommy Löfstedt, Toshihiro Kamishima,
|
|
Tyler Folkman, Tyler Lanigan, Alexander Junge, Varun Shenoy, Victor Poughon,
|
|
Vilhelm von Ehrenheim, Aleksandr Sandrovskii, Alan Yee, Vlasios Vasileiou,
|
|
Warut Vijitbenjaronk, Yang Zhang, Yaroslav Halchenko, Yichuan Liu, Yuichi
|
|
Fujikawa, affanv14, aivision2020, xor, andreh7, brady salz, campustrampus,
|
|
Agamemnon Krasoulis, ditenberg, elena-sharova, filipj8, fukatani, gedeck,
|
|
guiniol, guoci, hakaa1, hongkahjun, i-am-xhy, jakirkham, jaroslaw-weber,
|
|
jayzed82, jeroko, jmontoyam, jonathan.striebel, josephsalmon, jschendel,
|
|
leereeves, martin-hahn, mathurinm, mehak-sachdeva, mlewis1729, mlliou112,
|
|
mthorrell, ndingwall, nuffe, yangarbiter, plagree, pldtc325, Breno Freitas,
|
|
Brett Olsen, Brian A. Alfano, Brian Burns, polmauri, Brandon Carter, Charlton
|
|
Austin, Chayant T15h, Chinmaya Pancholi, Christian Danielsen, Chung Yen,
|
|
Chyi-Kwei Yau, pravarmahajan, DOHMATOB Elvis, Daniel LeJeune, Daniel Hnyk,
|
|
Darius Morawiec, David DeTomaso, David Gasquez, David Haberthür, David
|
|
Heryanto, David Kirkby, David Nicholson, rashchedrin, Deborah Gertrude Digges,
|
|
Denis Engemann, Devansh D, Dickson, Bob Baxley, Don86, E. Lynch-Klarup, Ed
|
|
Rogers, Elizabeth Ferriss, Ellen-Co2, Fabian Egli, Fang-Chieh Chou, Bing Tian
|
|
Dai, Greg Stupp, Grzegorz Szpak, Bertrand Thirion, Hadrien Bertrand, Harizo
|
|
Rajaona, zxcvbnius, Henry Lin, Holger Peters, Icyblade Dai, Igor
|
|
Andriushchenko, Ilya, Isaac Laughlin, Iván Vallés, Aurélien Bellet, JPFrancoia,
|
|
Jacob Schreiber, Asish Mahapatra
|