Introduction to Machine Learning

Welcome to Introduction to Machine Learning! This course is part of the Key Capabilities for Data Science program and covers machine learning, a field closely related to artificial intelligence (AI), data science, and statistics. It presents the introductory concepts of machine learning from a data science perspective, with a focus on making predictions. It covers models such as K-NN, decision trees, and linear classifiers, as well as the concepts needed to prepare and preprocess data before building them. No course would be complete without showing how to read the results, so we cover different ways to evaluate your model and when to question your results. Finally, we show you how to streamline the entire process by implementing pipelines in your workflow.

Course prerequisites: Programming in Python for Data Science

Module 5: Preprocessing Numerical Features, Pipelines and Hyperparameter Optimization

In this module, we will concentrate on the steps that need to be taken before building a model. Preparing data through imputation and scaling is an important part of model building and can be done using tools such as pipelines. Next, we will explore automated hyperparameter optimization.
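
As a rough preview of what imputation, scaling, pipelines, and automated hyperparameter optimization can look like together, here is a minimal sketch using scikit-learn. The toy data, the choice of median imputation, the K-NN model, and the grid of n_neighbors values are all assumptions made for illustration and are not taken from the course materials.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: two numeric features on very different scales.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 2))
X[:, 1] *= 100                      # second feature has a much larger scale
y = (X[:, 0] > 0).astype(int)       # the label depends on the first feature only
X[::10, 1] = np.nan                 # knock out some values to simulate missing data

# The pipeline chains the preprocessing steps and the model, so the imputer
# and scaler are fit only on the training portion of each cross-validation fold.
pipe = Pipeline([
    ("imputer", SimpleImputer(strategy="median")),  # fill missing values
    ("scaler", StandardScaler()),                   # standardize each feature
    ("knn", KNeighborsClassifier()),
])

# Automated hyperparameter optimization: try each candidate value with
# 5-fold cross-validation and keep the best one.
param_grid = {"knn__n_neighbors": [1, 3, 5, 7]}
search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X, y)

best_k = search.best_params_["knn__n_neighbors"]
best_score = search.best_score_
print(f"best n_neighbors: {best_k}, mean CV accuracy: {best_score:.2f}")
```

The step names ("imputer", "scaler", "knn") are arbitrary labels; the "knn__n_neighbors" key in the grid uses the step name plus a double underscore to reach the model's hyperparameter inside the pipeline.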

Module 6: Preprocessing Categorical Variables

This module will teach you different encoding methods for categorical variables (ordinal and one-hot encoding) and how to set them up appropriately. We will also introduce ColumnTransformer and CountVectorizer from the sklearn (scikit-learn) library and show you how to implement them.
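
To give a sense of how these tools fit together, the sketch below applies ordinal encoding, one-hot encoding, and CountVectorizer to different columns through a single ColumnTransformer. The small DataFrame, its column names, and the category ordering are hypothetical and chosen only for illustration.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# Hypothetical toy data with three kinds of non-numeric columns.
df = pd.DataFrame({
    "size": ["small", "large", "medium", "small"],    # ordered categories
    "colour": ["red", "blue", "red", "green"],        # unordered categories
    "review": ["great fit", "too big", "great colour", "too small"],  # free text
})

preprocessor = ColumnTransformer([
    # Ordinal encoding: the order is supplied explicitly so that
    # small < medium < large maps to 0 < 1 < 2.
    ("ordinal", OrdinalEncoder(categories=[["small", "medium", "large"]]), ["size"]),
    # One-hot encoding: one binary column per colour; categories unseen
    # during fit are encoded as all zeros instead of raising an error.
    ("onehot", OneHotEncoder(handle_unknown="ignore"), ["colour"]),
    # CountVectorizer expects a 1-D sequence of strings, so the column is
    # given as a plain string rather than a list.
    ("text", CountVectorizer(), "review"),
])

encoded = preprocessor.fit_transform(df)
print(encoded.shape)
```

Each transformer sees only the columns assigned to it, and the results are concatenated side by side, so the output has one row per original row.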

Introduction to Machine Learning

Module 0: Welcome to Introduction to Machine Learning

Module 1: Machine Learning Terminology

Module 2: Decision Trees

Module 3: Splitting, Cross-Validation and the Fundamental Tradeoff

Module 4: Similarity-Based Approaches to Supervised Learning

Module 5: Preprocessing Numerical Features, Pipelines and Hyperparameter Optimization

Module 6: Preprocessing Categorical Variables

Module 7: Assessment and Measurements

Module 8: Linear Models

Module Closing Remarks