
Building an Effective Machine Learning Workflow with scikit-learn

You have some experience building Machine Learning models, but only with artificially clean training data. How do you make the leap to building models using dirty, real-world data?

In this 8-hour course, you'll learn:

  • How to prepare complex datasets for Machine Learning using scikit-learn
  • How to handle common scenarios such as missing values, text data, and categorical data
  • How to build a reusable and efficient workflow that starts with a pandas DataFrame and ends with a trained scikit-learn model
  • How to integrate feature engineering, selection, and standardization into your workflow
  • How to avoid data leakage so that you can correctly estimate model performance
  • How to tune your entire workflow for maximum performance

By the end of this course, you'll be able to apply this workflow to your own datasets so that you can solve your own Machine Learning problems using scikit-learn!
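To give a rough sense of the kind of workflow described above, here is a minimal sketch. It is not taken from the course materials: the DataFrame, its column names (age, plan, note, churned), and the choice of LogisticRegression are all made up for illustration. It starts from a pandas DataFrame with a missing-value column, a categorical column, and a text column, and ends with a trained model that is evaluated by cross-validating the whole pipeline, so that preprocessing is learned only from the training portion of each fold.

```python
import numpy as np
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data: every column name and value here is invented.
df = pd.DataFrame({
    "age": [22.0, np.nan, 35.0, 58.0, 41.0, np.nan, 19.0, 63.0],
    "plan": ["basic", "pro", "basic", "pro", "free", "free", "basic", "pro"],
    "note": ["late payment", "renewed early", "asked for refund", "renewed again",
             "late again", "refund requested", "late payment", "renewed early"],
    "churned": [1, 0, 1, 0, 1, 1, 1, 0],
})
X = df[["age", "plan", "note"]]
y = df["churned"]

# Each column type gets its own preprocessing step.
preprocess = make_column_transformer(
    (SimpleImputer(strategy="median"), ["age"]),         # fill missing numbers
    (OneHotEncoder(handle_unknown="ignore"), ["plan"]),  # encode categories
    (CountVectorizer(), "note"),                         # text: pass a single column name, not a list
)

# Preprocessing and model live in one object, so they are always applied together.
pipe = make_pipeline(preprocess, LogisticRegression(max_iter=1000))

# Cross-validating the pipeline refits the preprocessing inside each fold.
scores = cross_val_score(pipe, X, y, cv=2, scoring="accuracy")

# Train on all rows, then predict for a new row with a missing age and an unseen plan.
pipe.fit(X, y)
X_new = pd.DataFrame({"age": [np.nan], "plan": ["enterprise"], "note": ["renewed early"]})
pred = pipe.predict(X_new)
```

With only eight rows the scores themselves mean nothing; the point is the shape of the workflow.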

How do I know if I'm ready for the course?

You're ready for this course if you can use scikit-learn to solve simple classification or regression problems, including loading a dataset, defining the features and target, training and evaluating a model, and making predictions with new data.

You'll also need to know how to perform a few basic pandas operations, including reading a CSV file and selecting columns from a DataFrame.
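As a quick self-check, the prerequisites above amount to being comfortable with something like the following sketch. The dataset (scikit-learn's built-in iris data), the two selected columns, and the model are my own choices for illustration, and the built-in loader stands in for reading a CSV file.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Load a dataset as a pandas DataFrame (in practice, often pd.read_csv on your own file).
df = load_iris(as_frame=True).frame

# Define the features and target by selecting columns.
feature_cols = ["petal length (cm)", "petal width (cm)"]
X = df[feature_cols]
y = df["target"]

# Evaluate the model with cross-validation, then train it on all of the data.
model = LogisticRegression(max_iter=1000)
scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
model.fit(X, y)

# Make a prediction for a new, made-up observation.
X_new = pd.DataFrame({"petal length (cm)": [1.5], "petal width (cm)": [0.2]})
new_pred = model.predict(X_new)
```

If each step here looks familiar, you have the background the course assumes.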

Has the Live Course already taken place?

Yes, I taught the Live Course in April of 2020.

Can I still purchase the Live Course?

Yes! By purchasing the Live Course ($99), you will receive:

  • Video recordings of the core lessons (4.5 hours)
  • Video recordings of the office hours (3.5 hours), during which I answered 45 student questions in great detail
  • Jupyter notebooks for the core lessons and office hours, which include my detailed lesson notes (9,000 words) for easy reference
  • Lifetime access to the recordings and notebooks

What topics were covered during the Live Course?

  • Review of the basic Machine Learning workflow
  • Encoding categorical features
  • Encoding text data
  • Handling missing values
  • Creating an efficient workflow for preprocessing and model building
  • Tuning your workflow for maximum performance
  • Avoiding data leakage
  • Proper model evaluation
  • Model persistence
  • Feature selection
  • Feature standardization
  • Feature engineering using custom transformers

Is the course material up-to-date?

Yes! I created this course in 2020 using the latest version of scikit-learn.

What is the "Live Course + Advanced Course" bundle?

In 2021, I'll be publishing an Advanced Course that will cover all of the topics from the Live Course in greater depth, plus additional topics that I didn't have time to cover during the Live Course. I'll be recording the Advanced Course from scratch and having it professionally edited. (It won't be taught live.)

If you buy the Live Course + Advanced Course bundle ($129), you'll get automatic access to the Advanced Course as soon as it's available, regardless of how much it costs on its own!

What is your refund policy?

If you decide that the course isn't a good fit for you, I'd be happy to give you a full refund within 30 days of purchase.

Will the Live Course be offered another time?

No, I'm not planning to teach the Live Course again.

I have another question...

Please email me and I'd be happy to answer your question: kevin@dataschool.io

Comments from past students

"This was one of the best data science classes I have ever taken... I was impressed with Kevin's easy-to-understand teaching style where he clearly explains the 'what' and 'why' of each principle... I highly recommend this course." - Khaled Jafar, Director of Analytics

"This course takes you through some of the challenges we face with real data, which is not always the case in other courses... If you are familiar with Machine Learning but need to know how to apply it using scikit-learn, then this course is definitely for you!" - Abla Elsergany, MS in Advanced Analytics

"Learning Machine Learning is a bit of a zig zag process. A little from here. A little from there. Kevin Markham is BETTER THAN ANYBODY at pulling all those pieces together so you can use them and understand what you're doing." - Les Guessing, Creative Director at Creative Algorithm

"This class will not only save me a lot of time in the future, but will also ensure that my models will be robust to data leakage... The explanations and demonstrations are worth the price of admission." - Mike F., Data Scientist

"Kevin is a master at explaining difficult and confusing concepts with ease, and I was amazed at the sheer amount of information he was able to pack in a rather short span of time... I learned more about scikit-learn from this course than from months of watching YouTube tutorials and taking MOOCs." - Pranjal Chaubey, AI Mentor at Udacity

"I've already used the learnings from the course in a Machine Learning competition and got impressive results, while keeping the code clean and easy to understand. Also, I'm much more confident at tackling Machine Learning problems and I'm sure this will contribute a lot to my career." - João Vítor Franco, Data Scientist at 99
