This course is a complete guide to data cleansing for machine learning engineers. In this course, you will learn data imputation and advanced data cleansing techniques.

111 on-demand videos & exercises
Level: Beginner
English
3hrs 33mins
Access on mobile, web and TV

What to know about this course

Data preparation may be the most important part of a machine learning project. It is often the most time-consuming part, yet it is among the least discussed. Data preparation, sometimes referred to as data preprocessing, is the act of transforming raw data into a form that is appropriate for modeling. Machine learning algorithms generally require numeric input data, and most algorithm implementations maintain this expectation. Therefore, if your data contains values that are not numbers, such as labels, you will need to convert them to numbers. Further, specific machine learning algorithms have expectations regarding the data types, scale, probability distribution, and relationships between input variables, and you may need to change the data to meet these expectations.
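As a rough illustration of converting labels to numbers, the sketch below one-hot encodes a small column of text labels using only the Python standard library. The `one_hot_encode` helper and the `colors` column are hypothetical names made up for this example; they are not taken from the course materials, and one-hot encoding is only one of several possible encodings.

```python
def one_hot_encode(values):
    """Map each distinct label to a binary indicator vector."""
    # Sort the categories so the column order is deterministic.
    categories = sorted(set(values))
    index = {label: i for i, label in enumerate(categories)}

    encoded = []
    for label in values:
        row = [0] * len(categories)  # one column per category
        row[index[label]] = 1        # flag the column for this label
        encoded.append(row)
    return categories, encoded


# Hypothetical example column containing non-numeric labels.
colors = ["red", "green", "blue", "green"]
categories, encoded = one_hot_encode(colors)

for label, row in zip(colors, encoded):
    print(label, row)
```

Each label becomes a row of zeros with a single one, so a model that expects numeric input can consume the column without treating the categories as ordered.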

In this course, you will learn data imputation and advanced data cleansing techniques, and how to apply real-world data cleansing techniques to your own data. You will also learn how to prepare data in a way that avoids data leakage and, in turn, incorrect model evaluation. By the end of this course, you will be able to perform data preprocessing and will have mastered core data cleaning skills. The complete code bundle for this course is available at https://github.com/PacktPublishing/Data-Cleansing-Master-Class-in-Python
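To make the data leakage point concrete, here is a minimal sketch, assuming a simple standardization step: the scaling statistics are computed from the training split only and then reused on the test split, so no information from the test data influences the transform. The function names and the toy data are hypothetical and are not drawn from the course code bundle.

```python
from statistics import mean, pstdev


def fit_standardizer(train_values):
    """Learn the mean and standard deviation from training data only."""
    mu = mean(train_values)
    sigma = pstdev(train_values)  # population standard deviation
    return mu, sigma


def standardize(values, mu, sigma):
    """Apply previously learned statistics; assumes sigma is non-zero."""
    return [(v - mu) / sigma for v in values]


# Hypothetical toy data: the last value is held out as the test split.
data = [2.0, 4.0, 6.0, 8.0, 100.0]
train, test = data[:4], data[4:]

# Fit on the training split only, then transform both splits.
mu, sigma = fit_standardizer(train)
train_scaled = standardize(train, mu, sigma)
test_scaled = standardize(test, mu, sigma)

print(mu, sigma)
print(test_scaled)
```

Had the statistics been computed on all five values, the held-out value of 100.0 would have shifted the mean and standard deviation used to scale the training data, which is the kind of leakage that can distort model evaluation.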

Who's this course for?

This course is for you if you are serious about becoming a machine learning engineer in the real world. You will need a solid foundation in Python and should understand the basics of machine learning. Also, you should have some expertise with machine learning libraries.

What you'll learn

  • Prepare data in a way that avoids data leakage.
  • Identify and handle problems with messy data.
  • Know which feature selection method to choose based on the data types.
  • Transform the probability distribution of input variables.
  • Identify and remove irrelevant and redundant input variables.
  • Project variables into a lower-dimensional space.

Key Features

  • Learn how to apply real-world data cleansing techniques to your data.
  • Learn advanced data cleansing techniques.
  • Learn how to prepare data in a way that avoids data leakage, and in turn, incorrect model evaluation.

About the Author

Mike West

Mike West is the founder of LogikBot. He has worked with databases for over two decades and has worked for or consulted with over 50 different companies, either as a full-time employee or as a consultant. These include Fortune 500 companies as well as several small to mid-size companies, among them Georgia Pacific, SunTrust, Reed Construction Data, Building Systems Design, NetCertainty, The Home Shopping Network, SwingVote, Atlanta Gas and Light, and Northrop Grumman. Over the last five years, Mike has transitioned to the exciting world of applied machine learning. He is excited to show you what he has learned and help you move into one of the most important fields in this space.