Clustering and Classification with Machine Learning in R

The underlying patterns in your data hold vital insights; unearth them with cutting-edge clustering and classification techniques in R.

72 on-demand videos & exercises
Level: Intermediate
English
7hrs 42mins
Access on mobile, web and TV

What to know about this course

This course is a complete guide to both supervised and unsupervised learning using R. It covers all the main aspects of practical data science; if you take this course, there is no need to take other courses or buy books on R-based data science. In this age of big data, companies across the globe use R to sift through the avalanche of information at their disposal. By becoming proficient in unsupervised and supervised learning in R, you can give your company a competitive edge and take your career to the next level.

Over the course of her research, the author realized that almost all the R data science courses and books out there do not take account of the multidimensional nature of the topic. This course will give you a robust grounding in the main aspects of machine learning: clustering and classification. Unlike other R instructors, the author digs deep into R's machine learning features and gives you a one-of-a-kind grounding in data science. You will go all the way from reading and cleaning data to implementing powerful machine learning algorithms and evaluating their performance in R.

The following topics will be covered:
• A full introduction to the R framework for data science.
• Data structures and reading data in R, including CSV, Excel, and HTML data.
• How to pre-process and clean data (for example, by removing NAs/no-data values) and visualize it.
• Machine learning, supervised learning, and unsupervised learning in R.
• Model building and selection, and much more!

The course will help you implement methods using real data obtained from different sources. Many courses use made-up data that does not empower students to implement R-based data science in real life. After taking this course, you'll easily use data science packages such as caret to work with real data in R. You'll also understand concepts such as unsupervised learning, dimension reduction, and supervised learning. All the code and supporting files for this course are available at https://github.com/PacktPublishing/Clustering-and-Classification-with-Machine-Learning-in-R

Who's this course for?

  • Students interested in getting started with data science applications in the RStudio environment, including those who want to learn how to implement unsupervised learning on real data.
  • Anyone with prior exposure to R who wants to get started with practical data science.

What you'll learn

  • Read data into the R environment from different sources.
  • Carry out basic data pre-processing and wrangling in RStudio.
  • Implement unsupervised learning/clustering techniques such as K-means clustering.
  • Implement dimensionality reduction techniques (PCA) and feature selection.
  • Implement supervised learning/classification techniques such as random forests.
  • Evaluate model performance and learn the best practices for evaluating machine learning model accuracy.

Key Features

  • The course explains how the exam is structured, how to approach the questions, and how to study successfully to pass.
  • The course also includes advice on the best way to prepare and what to expect from the testing process.

About the Author

Minerva Singh

Minerva Singh is a PhD graduate from Cambridge University, where she specialized in Tropical Ecology. She is also a part-time data scientist. As part of her research, she carries out extensive data analysis, including spatial data analysis. For this purpose, she prefers to use a combination of freeware tools: R, QGIS, and Python. She does most of her spatial data analysis work using R and QGIS. Apart from being free, these are very powerful tools for data visualization, processing, and analysis. She also holds an MPhil degree in Geography and Environment from Oxford University. She has honed her statistical and data analysis skills through several MOOCs, including The Analytics Edge and Statistical. In addition to spatial data analysis, she is also proficient in statistical analysis, machine learning, and data mining.