Introduction to Machine Learning

Welcome to Introduction to Machine Learning! This course is part of the Key Capabilities in Data Science program and covers machine learning, a topic closely related to artificial intelligence (AI), data science, and statistics. It takes the data science perspective on the introductory concepts in machine learning, with a focus on making predictions. It covers different models such as K-NN, decision trees and linear classifiers, along with the important concepts needed to prepare and preprocess data before building them. A model is only useful if you can read its results, so we also cover different ways to evaluate your model and when to question your results. Finally, we show you how to streamline the entire process by implementing pipelines in your workflow.

Course prerequisites: Programming in Python for Data Science

Module 0: Welcome to Introduction to Machine Learning

Course introduction, summary of course learning outcomes and prerequisite validation.

Module 1: Machine Learning Terminology

In this module, we will explain the different branches of machine learning and introduce the steps needed to build a model by constructing baseline models.
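As a rough illustration of what a baseline model can look like, here is a minimal sketch that always predicts the most frequent training label. It uses only the Python standard library; the function names and the toy labels are made up for this example and are not taken from the course materials.

```python
from collections import Counter


def fit_majority_baseline(labels):
    """Return the most frequent label seen in the training labels."""
    return Counter(labels).most_common(1)[0][0]


def accuracy(predictions, targets):
    """Fraction of predictions that match the true targets."""
    correct = sum(p == t for p, t in zip(predictions, targets))
    return correct / len(targets)


# Hypothetical toy labels, invented for illustration only.
train_labels = ["spam", "ham", "ham", "ham", "spam"]
test_labels = ["ham", "spam", "ham", "ham"]

majority = fit_majority_baseline(train_labels)
baseline_predictions = [majority] * len(test_labels)
baseline_accuracy = accuracy(baseline_predictions, test_labels)
print(majority, baseline_accuracy)
```

A baseline like this gives a reference score: a more elaborate model is only worth keeping if it does better.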

Module 2: Decision Trees

In this module, we will introduce the decision tree model. We will explain the structure of decision trees and the process they follow to make predictions.
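To give a feel for how a decision tree makes a prediction, the sketch below walks a small hand-written tree from the root to a leaf. The tree, its feature names and its thresholds are hypothetical and chosen only for illustration; a real tree would be learned from data.

```python
# A hand-written tree: internal nodes test one feature against a threshold,
# leaf nodes carry the predicted label. All values here are hypothetical.
tree = {
    "feature": "hours_studied",
    "threshold": 3.0,
    "left": {"label": "fail"},
    "right": {
        "feature": "hours_slept",
        "threshold": 6.0,
        "left": {"label": "fail"},
        "right": {"label": "pass"},
    },
}


def predict(node, example):
    """Follow the tests from the root until a leaf is reached."""
    while "label" not in node:
        goes_left = example[node["feature"]] < node["threshold"]
        node = node["left"] if goes_left else node["right"]
    return node["label"]


rested_student = {"hours_studied": 5.0, "hours_slept": 7.0}
unprepared_student = {"hours_studied": 2.0, "hours_slept": 8.0}
print(predict(tree, rested_student), predict(tree, unprepared_student))
```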

Module 3: Splitting, Cross-Validation and the Fundamental Tradeoff

In this module, we will introduce why and how we split our data as well as how cross-validation works on training data. We will also explain two important concepts in machine learning: the fundamental tradeoff and the golden rule.
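The splitting and cross-validation ideas can be sketched with plain index lists. This is a minimal sketch under the assumption of 10 rows, a 20% test set and 4 folds, all chosen arbitrarily for the example; the helper names are hypothetical. Note that the test indices are set aside first and the folds are built from the training indices only.

```python
import random


def train_test_split_indices(n_samples, test_fraction, seed):
    """Shuffle row indices and hold out a fraction of them as the test set."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold_splits(train_indices, n_folds):
    """Yield (fit_indices, validation_indices) pairs, one per fold."""
    for fold in range(n_folds):
        validation = train_indices[fold::n_folds]
        fit = [i for i in train_indices if i not in validation]
        yield fit, validation


train_idx, test_idx = train_test_split_indices(10, test_fraction=0.2, seed=0)
folds = list(k_fold_splits(train_idx, n_folds=4))

for fit_idx, validation_idx in folds:
    print("fit on", fit_idx, "validate on", validation_idx)
```

Each training row is used for validation exactly once, and the held-out test rows never appear in any fold.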

Module 4: Similarity-Based Approaches to Supervised Learning

In this module, we will cover two similarity-based models: 𝑘-Nearest Neighbours (also known as 𝑘-NNs) and Support Vector Machines (SVMs) with an RBF kernel.
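Both models rely on a notion of how similar two examples are. The sketch below shows one simple way to write a 𝑘-NN prediction with Euclidean distance, plus the standard RBF kernel formula exp(-gamma * squared distance). It is an illustration only: the data points, the choice of k = 3 and the value of gamma are assumptions made for this example.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k):
    """Predict the majority label among the k training points closest to query."""
    by_distance = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


def rbf_similarity(a, b, gamma):
    """RBF kernel: 1.0 for identical points, decaying towards 0 with distance."""
    return math.exp(-gamma * math.dist(a, b) ** 2)


# Hypothetical toy data: two well-separated groups of 2-D points.
train_points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 7.5), (6.5, 8.0)]
train_labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(train_points, train_labels, (2.0, 2.0), k=3))
print(knn_predict(train_points, train_labels, (6.0, 7.0), k=3))
print(rbf_similarity((1.0, 1.0), (1.0, 1.0), gamma=0.5))
```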

Module 5: Preprocessing Numerical Features, Pipelines and Hyperparameter Optimization

In this module, we will concentrate on the steps that need to be taken before building a model. Preparation through imputation and scaling is an important step of model building and can be done using tools such as pipelines. Next, we will explore automated hyperparameter optimization.
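A minimal sketch of imputation followed by scaling on a single numeric column is shown below. It assumes mean imputation and standardization (subtract the mean, divide by the standard deviation), which are common choices but not necessarily the ones used in the course; the function names and values are hypothetical. The parameters are learned from the training column only and then reused on new data, which is the kind of bookkeeping a pipeline automates.

```python
from statistics import mean, pstdev


def fit_preprocessor(train_column):
    """Learn the fill value and scaling statistics from training data only."""
    observed = [v for v in train_column if v is not None]
    fill_value = mean(observed)
    filled = [fill_value if v is None else v for v in train_column]
    return {"fill": fill_value, "mean": mean(filled), "std": pstdev(filled)}


def transform(column, params):
    """Impute missing values, then standardize, using the learned parameters."""
    filled = [params["fill"] if v is None else v for v in column]
    return [(v - params["mean"]) / params["std"] for v in filled]


# Hypothetical toy column; None marks a missing value.
train_column = [1.0, None, 3.0, 2.0]
test_column = [None, 4.0]

params = fit_preprocessor(train_column)
scaled_train = transform(train_column, params)
scaled_test = transform(test_column, params)
print(params, scaled_test)
```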

Module 6: Preprocessing Categorical Variables

This module will teach you different encoding methods for categorical variables (ordinal and one-hot encoding) and how to set them up appropriately. We will also introduce ColumnTransformer and CountVectorizer from the sklearn library and show you how to implement them.
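The difference between the two encodings can be sketched in a few lines of plain Python. This is not the sklearn implementation; the helper names, categories and category order below are hypothetical and serve only to show the idea.

```python
def ordinal_encode(values, ordered_categories):
    """Map each category to its position in a meaningful, user-supplied order."""
    position = {category: index for index, category in enumerate(ordered_categories)}
    return [position[value] for value in values]


def one_hot_encode(values):
    """Create one 0/1 column per category; no order is implied."""
    categories = sorted(set(values))
    rows = [[1 if value == category else 0 for category in categories] for value in values]
    return categories, rows


# Hypothetical toy columns.
sizes = ["small", "large", "medium"]
colours = ["red", "blue", "red"]

encoded_sizes = ordinal_encode(sizes, ["small", "medium", "large"])
colour_columns, encoded_colours = one_hot_encode(colours)
print(encoded_sizes)
print(colour_columns, encoded_colours)
```

Ordinal encoding suits categories with a natural order, such as sizes, while one-hot encoding avoids implying an order where none exists, such as colours.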

Module 7: Assessment and Measurements

This module will teach you how to appropriately assess your model. We will teach you how to evaluate your model by calculating an assortment of different measurements.
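The module description does not name specific measurements, so as an assumption this sketch uses four that are commonly computed for a binary classifier: accuracy, precision, recall and F1. The labels and predictions are invented for illustration.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for labels coded as 0 and 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)  # true positives
    fp = sum(t == 0 and p == 1 for t, p in pairs)  # false positives
    fn = sum(t == 1 and p == 0 for t, p in pairs)  # false negatives
    tn = sum(t == 0 and p == 0 for t, p in pairs)  # true negatives
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


# Hypothetical true labels and model predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

This sketch assumes at least one positive prediction and one positive label; otherwise precision or recall would divide by zero.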

Module 8: Linear Models

This module will teach you about different types of linear models. You will learn how these models can be interpreted as well as their advantages and limitations.

Module Closing Remarks

Well done on finishing Introduction to Machine Learning.

About this course

This course covers the data science perspective on the introductory concepts in machine learning, with a focus on making predictions. It covers how to build different models such as K-NN, decision trees and linear classifiers, as well as important concepts such as data splitting, the fundamental tradeoff and the golden rule. In addition, this course will teach you how to evaluate models properly and question their validity, all while streamlining the process with pipelines.

About the program

The University of British Columbia (UBC) is a comprehensive research-intensive university, consistently ranked among the 40 best universities in the world. The Key Capabilities in Data Science program was launched in September 2020 and is developed and taught by many of the same instructors as the UBC Master of Data Science program.