:orphan:
.. title:: Testimonials
.. _testimonials:
==========================
Who is using scikit-learn?
==========================
`J.P.Morgan <https://www.jpmorgan.com>`_
----------------------------------------
.. div:: sk-text-image-grid-large
.. div:: text-box
Scikit-learn is an indispensable part of the Python machine learning
toolkit at JPMorgan. It is very widely used across all parts of the bank
for classification, predictive analytics, and very many other machine
learning tasks. Its straightforward API, its breadth of algorithms, and
the quality of its documentation combine to make scikit-learn
simultaneously very approachable and very powerful.
.. rst-class:: annotation
Stephen Simmons, VP, Athena Research, JPMorgan
.. div:: image-box
.. image:: images/jpmorgan.png
:target: https://www.jpmorgan.com
`Spotify <https://www.spotify.com>`_
------------------------------------
.. div:: sk-text-image-grid-large
.. div:: text-box
Scikit-learn provides a toolbox with solid implementations of a bunch of
state-of-the-art models and makes it easy to plug them into existing
applications. We've been using it quite a lot for music recommendations at
Spotify and I think it's the most well-designed ML package I've seen so far.
.. rst-class:: annotation
Erik Bernhardsson, Engineering Manager Music Discovery & Machine Learning, Spotify
.. div:: image-box
.. image:: images/spotify.png
:target: https://www.spotify.com
`Inria <https://www.inria.fr/>`_
--------------------------------
.. div:: sk-text-image-grid-large
.. div:: text-box
At INRIA, we use scikit-learn to support leading-edge basic research in many
teams: `Parietal <https://team.inria.fr/parietal/>`_ for neuroimaging, `Lear
<https://lear.inrialpes.fr/>`_ for computer vision, `Visages
<https://team.inria.fr/visages/>`_ for medical image analysis, `Privatics
<https://team.inria.fr/privatics>`_ for security. The project is a fantastic
tool to address difficult applications of machine learning in an academic
environment as it is performant and versatile, but also easy to use and well
documented, which makes it well suited to grad students.
.. rst-class:: annotation
Gaël Varoquaux, researcher at Parietal
.. div:: image-box
.. image:: images/inria.png
:target: https://www.inria.fr/
`betaworks <https://betaworks.com>`_
------------------------------------
.. div:: sk-text-image-grid-large
.. div:: text-box
Betaworks is a NYC-based startup studio that builds new products, grows
companies, and invests in others. Over the past 8 years we've launched a
handful of social data analytics-driven services, such as Bitly, Chartbeat,
digg and Scale Model. Consistently the betaworks data science team uses
Scikit-learn for a variety of tasks. From exploratory analysis to product
development, it is an essential part of our toolkit. Recent uses are included
in `digg's new video recommender system
<https://medium.com/i-data/the-digg-video-recommender-2f9ade7c4ba3>`_,
and Poncho's `dynamic heuristic subspace clustering
<https://medium.com/@DiggData/scaling-poncho-using-data-ca24569d56fd>`_.
.. rst-class:: annotation
Gilad Lotan, Chief Data Scientist
.. div:: image-box
.. image:: images/betaworks.png
:target: https://betaworks.com
`Hugging Face <https://huggingface.co>`_
----------------------------------------
.. div:: sk-text-image-grid-large
.. div:: text-box
At Hugging Face we're using NLP and probabilistic models to generate
conversational artificial intelligences that are fun to chat with. Despite using
deep neural nets for `a few <https://medium.com/huggingface/understanding-emotions-from-keras-to-pytorch-3ccb61d5a983>`_
of our `NLP tasks <https://huggingface.co/coref/>`_, scikit-learn is still the
bread-and-butter of our daily machine learning routine. The ease of use and
predictability of the interface, as well as the straightforward mathematical
explanations that are here when you need them, is the killer feature. We use a
variety of scikit-learn models in production and they are also operationally very
pleasant to work with.
.. rst-class:: annotation
Julien Chaumond, Chief Technology Officer
.. div:: image-box
.. image:: images/huggingface.png
:target: https://huggingface.co
`Evernote <https://evernote.com>`_
----------------------------------
.. div:: sk-text-image-grid-large
.. div:: text-box
Building a classifier is typically an iterative process of exploring
the data, selecting the features (the attributes of the data believed
to be predictive in some way), training the models, and finally
evaluating them. For many of these tasks, we relied on the excellent
scikit-learn package for Python.
`Read more <http://blog.evernote.com/tech/2013/01/22/stay-classified/>`_
.. rst-class:: annotation
Mark Ayzenshtat, VP, Augmented Intelligence
.. div:: image-box
.. image:: images/evernote.png
:target: https://evernote.com
`Télécom ParisTech <https://www.telecom-paristech.fr/>`_
--------------------------------------------------------
.. div:: sk-text-image-grid-large
.. div:: text-box
At Télécom ParisTech, scikit-learn is used for hands-on sessions and home
assignments in introductory and advanced machine learning courses. The classes
are for undergrads and master's students. The great benefit of scikit-learn is
its fast learning curve that allows students to quickly start working on
interesting and motivating problems.
.. rst-class:: annotation
Alexandre Gramfort, Assistant Professor
.. div:: image-box
.. image:: images/telecomparistech.jpg
:target: https://www.telecom-paristech.fr/
`Booking.com <https://www.booking.com>`_
----------------------------------------
.. div:: sk-text-image-grid-large
.. div:: text-box
At Booking.com, we use machine learning algorithms for many different
applications, such as recommending hotels and destinations to our customers,
detecting fraudulent reservations, or scheduling our customer service agents.
Scikit-learn is one of the tools we use when implementing standard algorithms
for prediction tasks. Its API and documentation are excellent and make it easy
to use. The scikit-learn developers do a great job of incorporating
state-of-the-art implementations and new algorithms into the package. Thus, scikit-learn
provides convenient access to a wide spectrum of algorithms, and allows us to
readily find the right tool for the right job.
.. rst-class:: annotation
Melanie Mueller, Data Scientist
.. div:: image-box
.. image:: images/booking.png
:target: https://www.booking.com
`AWeber <https://www.aweber.com/>`_
-----------------------------------
.. div:: sk-text-image-grid-large
.. div:: text-box
The scikit-learn toolkit is indispensable for the Data Analysis and Management
team at AWeber. It allows us to do AWesome stuff we would not otherwise have
the time or resources to accomplish. The documentation is excellent, allowing
new engineers to quickly evaluate and apply many different algorithms to our
data. The text feature extraction utilities are useful when working with the
large volume of email content we have at AWeber. The RandomizedPCA
implementation, along with Pipelining and FeatureUnions, allows us to develop
complex machine learning algorithms efficiently and reliably.
Anyone interested in learning more about how AWeber deploys scikit-learn in a
production environment should check out talks from PyData Boston by AWeber's
Michael Becker available at https://github.com/mdbecker/pydata_2013.
.. rst-class:: annotation
Michael Becker, Software Engineer, Data Analysis and Management Ninjas
.. div:: image-box
.. image:: images/aweber.png
:target: https://www.aweber.com
`Yhat <https://www.yhat.com>`_
------------------------------
.. div:: sk-text-image-grid-large
.. div:: text-box
The combination of consistent APIs, thorough documentation, and top-notch
implementation makes scikit-learn our favorite machine learning package in
Python. scikit-learn makes doing advanced analysis in Python accessible to
anyone. At Yhat, we make it easy to integrate these models into your production
applications, thus eliminating the unnecessary dev time encountered
productionizing analytical work.
.. rst-class:: annotation
Greg Lamp, Co-founder
.. div:: image-box
.. image:: images/yhat.png
:target: https://www.yhat.com
`Rangespan <http://www.rangespan.com>`_
---------------------------------------
.. div:: sk-text-image-grid-large
.. div:: text-box
The Python scikit-learn toolkit is a core tool in the data science
group at Rangespan. Its large collection of well-documented models and
algorithms allows our team of data scientists to prototype fast and
quickly iterate to find the right solution to our learning problems.
We find that scikit-learn is not only the right tool for prototyping,
but its careful and well-tested implementation gives us the confidence
to run scikit-learn models in production.
.. rst-class:: annotation
Jurgen Van Gael, Data Science Director
.. div:: image-box
.. image:: images/rangespan.png
:target: http://www.rangespan.com
`Birchbox <https://www.birchbox.com>`_
--------------------------------------
.. div:: sk-text-image-grid-large
.. div:: text-box
At Birchbox, we face a range of machine learning problems typical of
e-commerce: product recommendation, user clustering, inventory prediction,
trend detection, etc. Scikit-learn lets us experiment with many models,
especially in the exploration phase of a new project: the data can be passed
around in a consistent way; models are easy to save and reuse; updates keep us
informed of new developments from the pattern discovery research community.
Scikit-learn is an important tool for our team, built the right way in the
right language.
.. rst-class:: annotation
Thierry Bertin-Mahieux, Data Scientist
.. div:: image-box
.. image:: images/birchbox.jpg
:target: https://www.birchbox.com
`Bestofmedia Group <http://www.bestofmedia.com>`_
-------------------------------------------------
.. div:: sk-text-image-grid-large
.. div:: text-box
Scikit-learn is our #1 toolkit for all things machine learning
at Bestofmedia. We use it for a variety of tasks (e.g. spam fighting,
ad click prediction, various ranking models) thanks to the varied,
state-of-the-art algorithm implementations packaged into it.
In the lab it accelerates prototyping of complex pipelines. In
production I can say it has proven to be robust and efficient enough
to be deployed for business critical components.
.. rst-class:: annotation
Eustache Diemert, Lead Scientist
.. div:: image-box
.. image:: images/bestofmedia-logo.png
:target: http://www.bestofmedia.com
`Change.org <https://www.change.org>`_
--------------------------------------
.. div:: sk-text-image-grid-large
.. div:: text-box
At change.org we automate the use of scikit-learn's RandomForestClassifier
in our production systems to drive email targeting that reaches millions
of users across the world each week. In the lab, scikit-learn's ease of use,
performance, and overall variety of algorithms implemented have proved invaluable
in giving us a single reliable source to turn to for our machine-learning needs.
.. rst-class:: annotation
Vijay Ramesh, Software Engineer in Data/science at Change.org
.. div:: image-box
.. image:: images/change-logo.png
:target: https://www.change.org
`PHIMECA Engineering <https://www.phimeca.com/?lang=en>`_
---------------------------------------------------------
.. div:: sk-text-image-grid-large
.. div:: text-box
At PHIMECA Engineering, we use scikit-learn estimators as surrogates for
expensive-to-evaluate numerical models (mostly but not exclusively
finite-element mechanical models) for speeding up the intensive post-processing
operations involved in our simulation-based decision making framework.
Scikit-learn's fit/predict API together with its efficient cross-validation
tools considerably eases the task of selecting the best-fit estimator. We are
also using scikit-learn for illustrating concepts in our training sessions.
Trainees are always impressed by the ease-of-use of scikit-learn despite the
apparent theoretical complexity of machine learning.
.. rst-class:: annotation
Vincent Dubourg, PHIMECA Engineering, PhD Engineer
.. div:: image-box
.. image:: images/phimeca.png
:target: https://www.phimeca.com/?lang=en
`HowAboutWe <http://www.howaboutwe.com/>`_
------------------------------------------
.. div:: sk-text-image-grid-large
.. div:: text-box
At HowAboutWe, scikit-learn lets us implement a wide array of machine learning
techniques in analysis and in production, despite having a small team. We use
scikit-learn's classification algorithms to predict user behavior, enabling us
to (for example) estimate the value of leads from a given traffic source early
in the lead's tenure on our site. Also, our users' profiles consist
primarily of unstructured data (answers to open-ended questions), so we use
scikit-learn's feature extraction and dimensionality reduction tools to
translate these unstructured data into inputs for our matchmaking system.
.. rst-class:: annotation
Daniel Weitzenfeld, Senior Data Scientist at HowAboutWe
.. div:: image-box
.. image:: images/howaboutwe.png
:target: http://www.howaboutwe.com/
`PeerIndex <https://www.brandwatch.com/peerindex-and-brandwatch>`_
------------------------------------------------------------------
.. div:: sk-text-image-grid-large
.. div:: text-box
At PeerIndex we use scientific methodology to build the Influence Graph - a
unique dataset that allows us to identify who's really influential and in which
context. To do this, we have to tackle a range of machine learning and
predictive modeling problems. Scikit-learn has emerged as our primary tool for
developing prototypes and making quick progress. From predicting missing data
and classifying tweets to clustering communities of social media users,
scikit-learn proved useful in a variety of applications. Its very intuitive interface
and excellent compatibility with other Python tools make it an indispensable
tool in our daily research efforts.
.. rst-class:: annotation
Ferenc Huszar, Senior Data Scientist at PeerIndex
.. div:: image-box
.. image:: images/peerindex.png
:target: https://www.brandwatch.com/peerindex-and-brandwatch
`DataRobot <https://www.datarobot.com>`_
----------------------------------------
.. div:: sk-text-image-grid-large
.. div:: text-box
DataRobot is building next-generation predictive analytics software to make data
scientists more productive, and scikit-learn is an integral part of our system. The
variety of machine learning techniques in combination with the solid implementations
that scikit-learn offers makes it a one-stop-shopping library for machine learning
in Python. Moreover, its consistent API, well-tested code and permissive licensing
allow us to use it in a production environment. Scikit-learn has literally saved us
years of work we would have had to do ourselves to bring our product to market.
.. rst-class:: annotation
Jeremy Achin, CEO & Co-founder DataRobot Inc.
.. div:: image-box
.. image:: images/datarobot.png
:target: https://www.datarobot.com
`OkCupid <https://www.okcupid.com/>`_
-------------------------------------
.. div:: sk-text-image-grid-large
.. div:: text-box
We're using scikit-learn at OkCupid to evaluate and improve our matchmaking
system. The range of features it has, especially preprocessing utilities, means
we can use it for a wide variety of projects, and it's performant enough to
handle the volume of data that we need to sort through. The documentation is
really thorough, as well, which makes the library quite easy to use.
.. rst-class:: annotation
David Koh - Senior Data Scientist at OkCupid
.. div:: image-box
.. image:: images/okcupid.png
:target: https://www.okcupid.com
`Lovely <https://livelovely.com/>`_
-----------------------------------
.. div:: sk-text-image-grid-large
.. div:: text-box
At Lovely, we strive to deliver the best apartment marketplace, with respect to
our users and our listings. From understanding user behavior and improving data
quality to detecting fraud, scikit-learn is a regular tool for gathering
insights, predictive modeling and improving our product. The easy-to-read
documentation and intuitive architecture of the API make machine learning both
explorable and accessible to a wide range of Python developers. I'm constantly
recommending that more developers and scientists try scikit-learn.
.. rst-class:: annotation
Simon Frid - Data Scientist, Lead at Lovely
.. div:: image-box
.. image:: images/lovely.png
:target: https://livelovely.com
`Data Publica <http://www.data-publica.com/>`_
----------------------------------------------
.. div:: sk-text-image-grid-large
.. div:: text-box
Data Publica builds a new predictive sales tool for commercial and marketing teams
called C-Radar. We extensively use scikit-learn to build segmentations of customers
through clustering, and to predict future customers based on the success or
failure of past partnerships. We also categorize companies using their website communication
thanks to scikit-learn and its machine learning algorithm implementations.
Eventually, machine learning makes it possible to detect weak signals that
traditional tools cannot see. All these complex tasks are performed in an easy and
straightforward way thanks to the great quality of the scikit-learn framework.
.. rst-class:: annotation
Guillaume Lebourgeois & Samuel Charron - Data Scientists at Data Publica
.. div:: image-box
.. image:: images/datapublica.png
:target: http://www.data-publica.com/
`Machinalis <https://www.machinalis.com/>`_
-------------------------------------------
.. div:: sk-text-image-grid-large
.. div:: text-box
Scikit-learn is the cornerstone of all the machine learning projects carried out at
Machinalis. It has a consistent API, a wide selection of algorithms and lots of
auxiliary tools to deal with the boilerplate. We have used it in production
environments on a variety of projects including click-through rate prediction,
`information extraction <https://github.com/machinalis/iepy>`_, and even counting
sheep!
In fact, we use it so much that we've started to freeze our common use cases
into Python packages, some of them open-sourced, like `FeatureForge
<https://github.com/machinalis/featureforge>`_. Scikit-learn in one word: Awesome.
.. rst-class:: annotation
Rafael Carrascosa, Lead developer
.. div:: image-box
.. image:: images/machinalis.png
:target: https://www.machinalis.com/
`solido <https://www.solidodesign.com/>`_
-----------------------------------------
.. div:: sk-text-image-grid-large
.. div:: text-box
Scikit-learn is helping to drive Moore's Law, via Solido. Solido creates
computer-aided design tools used by the majority of top-20 semiconductor
companies and fabs, to design the bleeding-edge chips inside smartphones,
automobiles, and more. Scikit-learn helps to power Solido's algorithms for
rare-event estimation, worst-case verification, optimization, and more. At
Solido, we are particularly fond of scikit-learn's libraries for Gaussian
Process models, large-scale regularized linear regression, and classification.
Scikit-learn has increased our productivity, because for many ML problems we no
longer need to “roll our own” code. `This PyData 2014 talk
<https://www.youtube.com/watch?v=Jm-eBD9xR3w>`_ has details.
.. rst-class:: annotation
Trent McConaghy, founder, Solido Design Automation Inc.
.. div:: image-box
.. image:: images/solido_logo.png
:target: https://www.solidodesign.com/
`INFONEA <http://www.infonea.com/en/>`_
---------------------------------------
.. div:: sk-text-image-grid-large
.. div:: text-box
We employ scikit-learn for rapid prototyping and custom-made Data Science
solutions within our in-memory based Business Intelligence Software
INFONEA®. As a well-documented and comprehensive collection of
state-of-the-art algorithms and pipelining methods, scikit-learn enables
us to provide flexible and scalable scientific analysis solutions. Thus,
scikit-learn is immensely valuable in realizing a powerful integration of
Data Science technology within self-service business analytics.
.. rst-class:: annotation
Thorsten Kranz, Data Scientist, Coma Soft AG.
.. div:: image-box
.. image:: images/infonea.jpg
:target: http://www.infonea.com/en/
`Dataiku <https://www.dataiku.com/>`_
-------------------------------------
.. div:: sk-text-image-grid-large
.. div:: text-box
Our software, Data Science Studio (DSS), enables users to create data services
that combine `ETL <https://en.wikipedia.org/wiki/Extract,_transform,_load>`_ with
Machine Learning. Our Machine Learning module integrates
many scikit-learn algorithms. The scikit-learn library is a perfect integration
with DSS because it offers algorithms for virtually all business cases. Our goal
is to offer a transparent and flexible tool that makes it easier to optimize
time consuming aspects of building a data service, preparing data, and training
machine learning algorithms on all types of data.
.. rst-class:: annotation
Florian Douetteau, CEO, Dataiku
.. div:: image-box
.. image:: images/dataiku_logo.png
:target: https://www.dataiku.com/
`Otto Group <https://ottogroup.com/>`_
--------------------------------------
.. div:: sk-text-image-grid-large
.. div:: text-box
Here at Otto Group, one of the global Big Five B2C online retailers, we are using
scikit-learn in all aspects of our daily work from data exploration to development
of machine learning applications to the productive deployment of those services.
It helps us to tackle machine learning problems ranging from e-commerce to logistics.
Its consistent APIs enabled us to build the `Palladium REST-API framework
<https://github.com/ottogroup/palladium/>`_ around it and continuously deliver
scikit-learn based services.
.. rst-class:: annotation
Christian Rammig, Head of Data Science, Otto Group
.. div:: image-box
.. image:: images/ottogroup_logo.png
:target: https://ottogroup.com
`Zopa <https://zopa.com/>`_
---------------------------
.. div:: sk-text-image-grid-large
.. div:: text-box
At Zopa, the first ever Peer-to-Peer lending platform, we extensively use
scikit-learn to run the business and optimize our users' experience. It powers our
Machine Learning models involved in credit risk, fraud risk, marketing, and pricing,
and has been used for originating at least 1 billion GBP worth of Zopa loans. It is
very well documented, powerful, and simple to use. We are grateful for the
capabilities it has provided, and for allowing us to deliver on our mission of
making money simple and fair.
.. rst-class:: annotation
Vlasios Vasileiou, Head of Data Science, Zopa
.. div:: image-box
.. image:: images/zopa.png
:target: https://zopa.com
`MARS <https://www.mars.com/global>`_
-------------------------------------
.. div:: sk-text-image-grid-large
.. div:: text-box
Scikit-Learn is integral to the Machine Learning Ecosystem at Mars. Whether
we're designing better recipes for petfood or closely analysing our cocoa
supply chain, Scikit-Learn is used as a tool for rapidly prototyping ideas
and taking them to production. This allows us to better understand and meet
the needs of our consumers worldwide. Scikit-Learn's feature-rich toolset is
easy to use and equips our associates with the capabilities they need to
solve the business challenges they face every day.
.. rst-class:: annotation
Michael Fitzke, Next Generation Technologies Sr Leader, Mars Inc.
.. div:: image-box
.. image:: images/mars.png
:target: https://www.mars.com/global
`BNP Paribas Cardif <https://www.bnpparibascardif.com/>`_
---------------------------------------------------------
.. div:: sk-text-image-grid-large
.. div:: text-box
BNP Paribas Cardif uses scikit-learn for several of its machine learning models
in production. Our internal community of developers and data scientists has
been using scikit-learn since 2015, for several reasons: the quality of the
developments, documentation and contribution governance, and the sheer size of
the contributing community. We even explicitly mention the use of
scikit-learn's pipelines in our internal model risk governance as one of our
good practices to decrease operational risks and overfitting risk. As a way to
support open source software development and in particular the scikit-learn
project, we decided to participate in scikit-learn's consortium at La Fondation
Inria since its creation in 2018.
.. rst-class:: annotation
Sébastien Conort, Chief Data Scientist, BNP Paribas Cardif
.. div:: image-box
.. image:: images/bnp_paribas_cardif.png
:target: https://www.bnpparibascardif.com/