:orphan:

.. title:: Testimonials

.. _testimonials:

==========================
Who is using scikit-learn?
==========================

`J.P.Morgan <https://www.jpmorgan.com>`_
----------------------------------------

.. div:: sk-text-image-grid-large

  .. div:: text-box

    Scikit-learn is an indispensable part of the Python machine learning
    toolkit at JPMorgan. It is very widely used across all parts of the bank
    for classification, predictive analytics, and very many other machine
    learning tasks. Its straightforward API, its breadth of algorithms, and
    the quality of its documentation combine to make scikit-learn
    simultaneously very approachable and very powerful.

    .. rst-class:: annotation

      Stephen Simmons, VP, Athena Research, JPMorgan

  .. div:: image-box

    .. image:: images/jpmorgan.png
      :target: https://www.jpmorgan.com


`Spotify <https://www.spotify.com>`_
------------------------------------

.. div:: sk-text-image-grid-large

  .. div:: text-box

    Scikit-learn provides a toolbox with solid implementations of a bunch of
    state-of-the-art models and makes it easy to plug them into existing
    applications. We've been using it quite a lot for music recommendations at
    Spotify and I think it's the most well-designed ML package I've seen so far.

    .. rst-class:: annotation

      Erik Bernhardsson, Engineering Manager Music Discovery & Machine Learning, Spotify

  .. div:: image-box

    .. image:: images/spotify.png
      :target: https://www.spotify.com


`Inria <https://www.inria.fr/>`_
--------------------------------

.. div:: sk-text-image-grid-large

  .. div:: text-box

    At INRIA, we use scikit-learn to support leading-edge basic research in many
    teams: `Parietal <https://team.inria.fr/parietal/>`_ for neuroimaging, `Lear
    <https://lear.inrialpes.fr/>`_ for computer vision, `Visages
    <https://team.inria.fr/visages/>`_ for medical image analysis, `Privatics
    <https://team.inria.fr/privatics>`_ for security. The project is a fantastic
    tool to address difficult applications of machine learning in an academic
    environment as it is performant and versatile, but also easy to use and well
    documented, which makes it well suited to grad students.

    .. rst-class:: annotation

      Gaël Varoquaux, researcher at Parietal

  .. div:: image-box

    .. image:: images/inria.png
      :target: https://www.inria.fr/


`betaworks <https://betaworks.com>`_
------------------------------------

.. div:: sk-text-image-grid-large

  .. div:: text-box

    Betaworks is a NYC-based startup studio that builds new products, grows
    companies, and invests in others. Over the past 8 years we've launched a
    handful of social data analytics-driven services, such as Bitly, Chartbeat,
    digg and Scale Model. Consistently the betaworks data science team uses
    Scikit-learn for a variety of tasks. From exploratory analysis, to product
    development, it is an essential part of our toolkit. Recent uses are included
    in `digg's new video recommender system
    <https://medium.com/i-data/the-digg-video-recommender-2f9ade7c4ba3>`_,
    and Poncho's `dynamic heuristic subspace clustering
    <https://medium.com/@DiggData/scaling-poncho-using-data-ca24569d56fd>`_.

    .. rst-class:: annotation

      Gilad Lotan, Chief Data Scientist

  .. div:: image-box

    .. image:: images/betaworks.png
      :target: https://betaworks.com


`Hugging Face <https://huggingface.co>`_
----------------------------------------

.. div:: sk-text-image-grid-large

  .. div:: text-box

    At Hugging Face we're using NLP and probabilistic models to generate
    conversational artificial intelligences that are fun to chat with. Despite using
    deep neural nets for `a few <https://medium.com/huggingface/understanding-emotions-from-keras-to-pytorch-3ccb61d5a983>`_
    of our `NLP tasks <https://huggingface.co/coref/>`_, scikit-learn is still the
    bread-and-butter of our daily machine learning routine. The ease of use and
    predictability of the interface, as well as the straightforward mathematical
    explanations that are here when you need them, is the killer feature. We use a
    variety of scikit-learn models in production and they are also operationally very
    pleasant to work with.

    .. rst-class:: annotation

      Julien Chaumond, Chief Technology Officer

  .. div:: image-box

    .. image:: images/huggingface.png
      :target: https://huggingface.co


`Evernote <https://evernote.com>`_
----------------------------------

.. div:: sk-text-image-grid-large

  .. div:: text-box

    Building a classifier is typically an iterative process of exploring
    the data, selecting the features (the attributes of the data believed
    to be predictive in some way), training the models, and finally
    evaluating them. For many of these tasks, we relied on the excellent
    scikit-learn package for Python.

    `Read more <http://blog.evernote.com/tech/2013/01/22/stay-classified/>`_

    .. rst-class:: annotation

      Mark Ayzenshtat, VP, Augmented Intelligence

  .. div:: image-box

    .. image:: images/evernote.png
      :target: https://evernote.com


`Télécom ParisTech <https://www.telecom-paristech.fr/>`_
--------------------------------------------------------

.. div:: sk-text-image-grid-large

  .. div:: text-box

    At Télécom ParisTech, scikit-learn is used for hands-on sessions and home
    assignments in introductory and advanced machine learning courses. The classes
    are for undergrads and masters students. The great benefit of scikit-learn is
    its fast learning curve that allows students to quickly start working on
    interesting and motivating problems.

    .. rst-class:: annotation

      Alexandre Gramfort, Assistant Professor

  .. div:: image-box

    .. image:: images/telecomparistech.jpg
      :target: https://www.telecom-paristech.fr/


`Booking.com <https://www.booking.com>`_
----------------------------------------

.. div:: sk-text-image-grid-large

  .. div:: text-box

    At Booking.com, we use machine learning algorithms for many different
    applications, such as recommending hotels and destinations to our customers,
    detecting fraudulent reservations, or scheduling our customer service agents.
    Scikit-learn is one of the tools we use when implementing standard algorithms
    for prediction tasks. Its API and documentation are excellent and make it easy
    to use. The scikit-learn developers do a great job of incorporating
    state-of-the-art implementations and new algorithms into the package. Thus,
    scikit-learn provides convenient access to a wide spectrum of algorithms, and
    allows us to readily find the right tool for the right job.

    .. rst-class:: annotation

      Melanie Mueller, Data Scientist

  .. div:: image-box

    .. image:: images/booking.png
      :target: https://www.booking.com


`AWeber <https://www.aweber.com/>`_
-----------------------------------

.. div:: sk-text-image-grid-large

  .. div:: text-box

    The scikit-learn toolkit is indispensable for the Data Analysis and Management
    team at AWeber.  It allows us to do AWesome stuff we would not otherwise have
    the time or resources to accomplish. The documentation is excellent, allowing
    new engineers to quickly evaluate and apply many different algorithms to our
    data. The text feature extraction utilities are useful when working with the
    large volume of email content we have at AWeber. The RandomizedPCA
    implementation, along with Pipelining and FeatureUnions, allows us to develop
    complex machine learning algorithms efficiently and reliably.

    Anyone interested in learning more about how AWeber deploys scikit-learn in a
    production environment should check out talks from PyData Boston by AWeber's
    Michael Becker available at https://github.com/mdbecker/pydata_2013.

    .. rst-class:: annotation

      Michael Becker, Software Engineer, Data Analysis and Management Ninjas

  .. div:: image-box

    .. image:: images/aweber.png
      :target: https://www.aweber.com


`Yhat <https://www.yhat.com>`_
------------------------------

.. div:: sk-text-image-grid-large

  .. div:: text-box

    The combination of consistent APIs, thorough documentation, and top-notch
    implementation makes scikit-learn our favorite machine learning package in
    Python. scikit-learn makes doing advanced analysis in Python accessible to
    anyone. At Yhat, we make it easy to integrate these models into your production
    applications, thus eliminating the unnecessary dev time encountered when
    productionizing analytical work.

    .. rst-class:: annotation

      Greg Lamp, Co-founder

  .. div:: image-box

    .. image:: images/yhat.png
      :target: https://www.yhat.com


`Rangespan <http://www.rangespan.com>`_
---------------------------------------

.. div:: sk-text-image-grid-large

  .. div:: text-box

    The Python scikit-learn toolkit is a core tool in the data science
    group at Rangespan. Its large collection of well-documented models and
    algorithms allows our team of data scientists to prototype fast and
    quickly iterate to find the right solution to our learning problems.
    We find that scikit-learn is not only the right tool for prototyping,
    but its careful and well-tested implementation gives us the confidence
    to run scikit-learn models in production.

    .. rst-class:: annotation

      Jurgen Van Gael, Data Science Director

  .. div:: image-box

    .. image:: images/rangespan.png
      :target: http://www.rangespan.com


`Birchbox <https://www.birchbox.com>`_
--------------------------------------

.. div:: sk-text-image-grid-large

  .. div:: text-box

    At Birchbox, we face a range of machine learning problems typical to
    E-commerce: product recommendation, user clustering, inventory prediction,
    trends detection, etc. Scikit-learn lets us experiment with many models,
    especially in the exploration phase of a new project: the data can be passed
    around in a consistent way; models are easy to save and reuse; updates keep us
    informed of new developments from the pattern discovery research community.
    Scikit-learn is an important tool for our team, built the right way in the
    right language.

    .. rst-class:: annotation

      Thierry Bertin-Mahieux, Data Scientist

  .. div:: image-box

    .. image:: images/birchbox.jpg
      :target: https://www.birchbox.com


`Bestofmedia Group <http://www.bestofmedia.com>`_
-------------------------------------------------

.. div:: sk-text-image-grid-large

  .. div:: text-box

    Scikit-learn is our #1 toolkit for all things machine learning
    at Bestofmedia. We use it for a variety of tasks (e.g. spam fighting,
    ad click prediction, various ranking models) thanks to the varied,
    state-of-the-art algorithm implementations packaged into it.
    In the lab it accelerates prototyping of complex pipelines. In
    production I can say it has proven to be robust and efficient enough
    to be deployed for business critical components.

    .. rst-class:: annotation

      Eustache Diemert, Lead Scientist

  .. div:: image-box

    .. image:: images/bestofmedia-logo.png
      :target: http://www.bestofmedia.com


`Change.org <https://www.change.org>`_
--------------------------------------

.. div:: sk-text-image-grid-large

  .. div:: text-box

    At change.org we automate the use of scikit-learn's RandomForestClassifier
    in our production systems to drive email targeting that reaches millions
    of users across the world each week. In the lab, scikit-learn's ease-of-use,
    performance, and overall variety of algorithms implemented have proved invaluable
    in giving us a single reliable source to turn to for our machine-learning needs.

    .. rst-class:: annotation

      Vijay Ramesh, Software Engineer in Data/science at Change.org

  .. div:: image-box

    .. image:: images/change-logo.png
      :target: https://www.change.org


`PHIMECA Engineering <https://www.phimeca.com/?lang=en>`_
---------------------------------------------------------

.. div:: sk-text-image-grid-large

  .. div:: text-box

    At PHIMECA Engineering, we use scikit-learn estimators as surrogates for
    expensive-to-evaluate numerical models (mostly but not exclusively
    finite-element mechanical models) for speeding up the intensive post-processing
    operations involved in our simulation-based decision making framework.
    Scikit-learn's fit/predict API together with its efficient cross-validation
    tools considerably eases the task of selecting the best-fit estimator. We are
    also using scikit-learn for illustrating concepts in our training sessions.
    Trainees are always impressed by the ease-of-use of scikit-learn despite the
    apparent theoretical complexity of machine learning.

    .. rst-class:: annotation

      Vincent Dubourg, PHIMECA Engineering, PhD Engineer

  .. div:: image-box

    .. image:: images/phimeca.png
      :target: https://www.phimeca.com/?lang=en


`HowAboutWe <http://www.howaboutwe.com/>`_
------------------------------------------

.. div:: sk-text-image-grid-large

  .. div:: text-box

    At HowAboutWe, scikit-learn lets us implement a wide array of machine learning
    techniques in analysis and in production, despite having a small team.  We use
    scikit-learn's classification algorithms to predict user behavior, enabling us
    to (for example) estimate the value of leads from a given traffic source early
    in the lead's tenure on our site. Also, our users' profiles consist of
    primarily unstructured data (answers to open-ended questions), so we use
    scikit-learn's feature extraction and dimensionality reduction tools to
    translate these unstructured data into inputs for our matchmaking system.

    .. rst-class:: annotation

      Daniel Weitzenfeld, Senior Data Scientist at HowAboutWe

  .. div:: image-box

    .. image:: images/howaboutwe.png
      :target: http://www.howaboutwe.com/


`PeerIndex <https://www.brandwatch.com/peerindex-and-brandwatch>`_
------------------------------------------------------------------

.. div:: sk-text-image-grid-large

  .. div:: text-box

    At PeerIndex we use scientific methodology to build the Influence Graph - a
    unique dataset that allows us to identify who's really influential and in which
    context. To do this, we have to tackle a range of machine learning and
    predictive modeling problems. Scikit-learn has emerged as our primary tool for
    developing prototypes and making quick progress. From predicting missing data
    and classifying tweets to clustering communities of social media users,
    scikit-learn proved useful in a variety of applications. Its very intuitive
    interface and excellent compatibility with other Python tools make it an
    indispensable tool in our daily research efforts.

    .. rst-class:: annotation

      Ferenc Huszar, Senior Data Scientist at PeerIndex

  .. div:: image-box

    .. image:: images/peerindex.png
      :target: https://www.brandwatch.com/peerindex-and-brandwatch


`DataRobot <https://www.datarobot.com>`_
----------------------------------------

.. div:: sk-text-image-grid-large

  .. div:: text-box

    DataRobot is building next generation predictive analytics software to make data
    scientists more productive, and scikit-learn is an integral part of our system. The
    variety of machine learning techniques in combination with the solid implementations
    that scikit-learn offers makes it a one-stop-shopping library for machine learning
    in Python. Moreover, its consistent API, well-tested code and permissive licensing
    allow us to use it in a production environment. Scikit-learn has literally saved us
    years of work we would have had to do ourselves to bring our product to market.

    .. rst-class:: annotation

      Jeremy Achin, CEO & Co-founder DataRobot Inc.

  .. div:: image-box

    .. image:: images/datarobot.png
      :target: https://www.datarobot.com


`OkCupid <https://www.okcupid.com/>`_
-------------------------------------

.. div:: sk-text-image-grid-large

  .. div:: text-box

    We're using scikit-learn at OkCupid to evaluate and improve our matchmaking
    system. The range of features it has, especially preprocessing utilities, means
    we can use it for a wide variety of projects, and it's performant enough to
    handle the volume of data that we need to sort through. The documentation is
    really thorough, as well, which makes the library quite easy to use.

    .. rst-class:: annotation

      David Koh - Senior Data Scientist at OkCupid

  .. div:: image-box

    .. image:: images/okcupid.png
      :target: https://www.okcupid.com


`Lovely <https://livelovely.com/>`_
-----------------------------------

.. div:: sk-text-image-grid-large

  .. div:: text-box

    At Lovely, we strive to deliver the best apartment marketplace, with respect to
    our users and our listings. From understanding user behavior and improving data
    quality to detecting fraud, scikit-learn is a regular tool for gathering
    insights, predictive modeling and improving our product. The easy-to-read
    documentation and intuitive architecture of the API make machine learning both
    explorable and accessible to a wide range of Python developers. I'm constantly
    recommending that more developers and scientists try scikit-learn.

    .. rst-class:: annotation

      Simon Frid - Data Scientist, Lead at Lovely

  .. div:: image-box

    .. image:: images/lovely.png
      :target: https://livelovely.com


`Data Publica <http://www.data-publica.com/>`_
----------------------------------------------

.. div:: sk-text-image-grid-large

  .. div:: text-box

    Data Publica builds a new predictive sales tool for commercial and marketing teams
    called C-Radar. We extensively use scikit-learn to build segmentations of customers
    through clustering, and to predict future customers based on past partnerships
    success or failure. We also categorize companies using their website communication
    thanks to scikit-learn and its machine learning algorithm implementations.
    Eventually, machine learning makes it possible to detect weak signals that
    traditional tools cannot see. All these complex tasks are performed in an easy and
    straightforward way thanks to the great quality of the scikit-learn framework.

    .. rst-class:: annotation

      Guillaume Lebourgeois & Samuel Charron - Data Scientists at Data Publica

  .. div:: image-box

    .. image:: images/datapublica.png
      :target: http://www.data-publica.com/


`Machinalis <https://www.machinalis.com/>`_
-------------------------------------------

.. div:: sk-text-image-grid-large

  .. div:: text-box

    Scikit-learn is the cornerstone of all the machine learning projects carried out
    at Machinalis. It has a consistent API, a wide selection of algorithms and lots of
    auxiliary tools to deal with the boilerplate. We have used it in production
    environments on a variety of projects including click-through rate prediction,
    `information extraction <https://github.com/machinalis/iepy>`_, and even counting
    sheep!

    In fact, we use it so much that we've started to freeze our common use cases
    into Python packages, some of them open-sourced, like `FeatureForge
    <https://github.com/machinalis/featureforge>`_. Scikit-learn in one word: Awesome.

    .. rst-class:: annotation

      Rafael Carrascosa, Lead developer

  .. div:: image-box

    .. image:: images/machinalis.png
      :target: https://www.machinalis.com/


`solido <https://www.solidodesign.com/>`_
-----------------------------------------

.. div:: sk-text-image-grid-large

  .. div:: text-box

    Scikit-learn is helping to drive Moore's Law, via Solido. Solido creates
    computer-aided design tools used by the majority of top-20 semiconductor
    companies and fabs, to design the bleeding-edge chips inside smartphones,
    automobiles, and more. Scikit-learn helps to power Solido's algorithms for
    rare-event estimation, worst-case verification, optimization, and more. At
    Solido, we are particularly fond of scikit-learn's libraries for Gaussian
    Process models, large-scale regularized linear regression, and classification.
    Scikit-learn has increased our productivity, because for many ML problems we no
    longer need to “roll our own” code. `This PyData 2014 talk
    <https://www.youtube.com/watch?v=Jm-eBD9xR3w>`_ has details.

    .. rst-class:: annotation

      Trent McConaghy, founder, Solido Design Automation Inc.

  .. div:: image-box

    .. image:: images/solido_logo.png
      :target: https://www.solidodesign.com/


`INFONEA <http://www.infonea.com/en/>`_
---------------------------------------

.. div:: sk-text-image-grid-large

  .. div:: text-box

    We employ scikit-learn for rapid prototyping and custom-made Data Science
    solutions within our in-memory based Business Intelligence Software
    INFONEA®. As a well-documented and comprehensive collection of
    state-of-the-art algorithms and pipelining methods, scikit-learn enables
    us to provide flexible and scalable scientific analysis solutions. Thus,
    scikit-learn is immensely valuable in realizing a powerful integration of
    Data Science technology within self-service business analytics.

    .. rst-class:: annotation

      Thorsten Kranz, Data Scientist, Coma Soft AG.

  .. div:: image-box

    .. image:: images/infonea.jpg
      :target: http://www.infonea.com/en/


`Dataiku <https://www.dataiku.com/>`_
-------------------------------------

.. div:: sk-text-image-grid-large

  .. div:: text-box

    Our software, Data Science Studio (DSS), enables users to create data services
    that combine `ETL <https://en.wikipedia.org/wiki/Extract,_transform,_load>`_ with
    Machine Learning. Our Machine Learning module integrates
    many scikit-learn algorithms. The scikit-learn library is a perfect integration
    with DSS because it offers algorithms for virtually all business cases. Our goal
    is to offer a transparent and flexible tool that makes it easier to optimize
    time consuming aspects of building a data service, preparing data, and training
    machine learning algorithms on all types of data.

    .. rst-class:: annotation

      Florian Douetteau, CEO, Dataiku

  .. div:: image-box

    .. image:: images/dataiku_logo.png
      :target: https://www.dataiku.com/


`Otto Group <https://ottogroup.com/>`_
--------------------------------------

.. div:: sk-text-image-grid-large

  .. div:: text-box

    Here at Otto Group, one of the global Big Five B2C online retailers, we are using
    scikit-learn in all aspects of our daily work, from data exploration to development
    of machine learning applications to the productive deployment of those services.
    It helps us to tackle machine learning problems ranging from e-commerce to logistics.
    Its consistent APIs enabled us to build the `Palladium REST-API framework
    <https://github.com/ottogroup/palladium/>`_ around it and continuously deliver
    scikit-learn based services.

    .. rst-class:: annotation

      Christian Rammig, Head of Data Science, Otto Group

  .. div:: image-box

    .. image:: images/ottogroup_logo.png
      :target: https://ottogroup.com


`Zopa <https://zopa.com/>`_
---------------------------

.. div:: sk-text-image-grid-large

  .. div:: text-box

    At Zopa, the first ever Peer-to-Peer lending platform, we extensively use
    scikit-learn to run the business and optimize our users' experience. It powers our
    Machine Learning models involved in credit risk, fraud risk, marketing, and pricing,
    and has been used for originating at least 1 billion GBP worth of Zopa loans. It is
    very well documented, powerful, and simple to use. We are grateful for the
    capabilities it has provided, and for allowing us to deliver on our mission of
    making money simple and fair.

    .. rst-class:: annotation

      Vlasios Vasileiou, Head of Data Science, Zopa

  .. div:: image-box

    .. image:: images/zopa.png
      :target: https://zopa.com


`MARS <https://www.mars.com/global>`_
-------------------------------------

.. div:: sk-text-image-grid-large

  .. div:: text-box

    Scikit-Learn is integral to the Machine Learning Ecosystem at Mars. Whether
    we're designing better recipes for petfood or closely analysing our cocoa
    supply chain, Scikit-Learn is used as a tool for rapidly prototyping ideas
    and taking them to production. This allows us to better understand and meet
    the needs of our consumers worldwide. Scikit-Learn's feature-rich toolset is
    easy to use and equips our associates with the capabilities they need to
    solve the business challenges they face every day.

    .. rst-class:: annotation

      Michael Fitzke, Next Generation Technologies Sr Leader, Mars Inc.

  .. div:: image-box

    .. image:: images/mars.png
      :target: https://www.mars.com/global


`BNP Paribas Cardif <https://www.bnpparibascardif.com/>`_
---------------------------------------------------------

.. div:: sk-text-image-grid-large

  .. div:: text-box

    BNP Paribas Cardif uses scikit-learn for several of its machine learning models
    in production. Our internal community of developers and data scientists has
    been using scikit-learn since 2015, for several reasons: the quality of the
    developments, documentation and contribution governance, and the sheer size of
    the contributing community. We even explicitly mention the use of
    scikit-learn's pipelines in our internal model risk governance as one of our
    good practices to decrease operational risks and overfitting risk. As a way to
    support open source software development and in particular the scikit-learn
    project, we decided to participate in scikit-learn's consortium at La Fondation
    Inria since its creation in 2018.

    .. rst-class:: annotation

      Sébastien Conort, Chief Data Scientist, BNP Paribas Cardif

  .. div:: image-box

    .. image:: images/bnp_paribas_cardif.png
      :target: https://www.bnpparibascardif.com/