Machine Learning with R

University of Utah

Date: Monday March 31 - Tuesday April 1, 2025

Time: 9:00 am – 4:00 pm MDT

Location: HELIX Rm - GS150 - Chokecherry

Instructors: Rebecca Barter

Registration: Use the following link to sign up for this workshop.

Sign up for the DELPHI mailing list to stay in the loop about future workshops and funding opportunities.

General Information

What: This workshop provides a practical introduction to foundational machine learning algorithms, including linear regression, random forest, and XGBoost, with hands-on applications and best practices using tidymodels in R. Note that this workshop will NOT cover Large Language Models (LLMs) and while it may touch on Neural Network (NN)/Deep Learning models briefly at the end of the workshop, these will not be the focus.

Who: This workshop is designed for researchers, staff, and students who want to gain experience using machine learning techniques in practice in the R programming language.

Prerequisites: Participants should be familiar with the basics of the R programming language, including the tidyverse. Participants do not need to have any previous machine learning, statistical or mathematical experience to attend this workshop.

Requirements: Participants must bring a laptop onto which they can download R and Rstudio (and you should do so before the workshop).

Contact: Please email andrew.george@hsc.utah.edu and rebecca.barter@hsc.utah.edu for more information.

Resources

Books and websites

tidymodels.org. The official tidymodels website full of helpful tutorials and documentation.

Tidy Modeling with R by Max Kuhn and Julia Silge. A great read to get familiar with everything there is to do in the tidymodels ecosystem.

Efficient Machine Learning with R by Simon P Couch. Lots of great tips for speeding up your tidymodels code once you get a little more advanced.

Posit Cloud

A Posit Cloud workspace will be set up prior to the workshop for those who cannot (or prefer not to) install applications on their laptop.

You will need to create a posit cloud account if you do not already have one.

Click here to access the posit cloud workspace for this workshop.

Download files and data

If you are working in RStudio locally (rather than using posit cloud, above) click here to download all of the .R code scripts and data files we will be using throughout the workshop.

Schedule

Note that the schedule below serves as a guideline. The start, end, and break times are fixed, but timing for each topics covered may vary as we may go faster or slower through the content.

Note that morning snacks and lunch will be provided on both days.

Day 1

Time Topic Content
9:00 Introduction to Prediction Problems 01_intro.pdf
9:30 Linear Regression for Continuous Responses 02_linear_regression.pdf
10:00 Evaluating Continuous Response Predictions 02_linear_regression.pdf
10:30 [Break]
10:45 [Coding Session] Linear Regression for Continuous Responses <02_linear_regression.R>
11:30 Data Preprocessing 03_data_preprocessing.pdf
12:00 [Lunch]
1:00 [Coding Session] Data Preprocessing <03_data_preprocessing.R>
1:45 Logistic Regression for Predicting Binary Responses 04_binary_Responses.pdf
2:00 Evaluating Binary Responses Predictions 04_binary_Responses.pdf
2:30 [Break]
2:45 [Coding Session] Logistic Regression and Evaluation Binary Response Predictions <04_binary_Responses.R>
4:00 [End]

Day 2

Time Topic Content
9:00 Decision Trees and Random Forest 05_decision_trees.pdf
10:00 XGBoost 05_decision_trees.pdf
10:15 Variable Importance 05_decision_trees.pdf
10:30 [Break]
10:45 [Coding Session] Decision Tree Algorithms <05_decision_trees.R>
12:00 [Lunch]
1:00 Hyperparameter Tuning 06_hyperparameter_tuning.pdf
1:30 [Coding Session] Hyperparameter Tuning <06_hyperparameter_tuning.R>
2:30 [Break]
2:45 Basic Neural Networks 07_nn.pdf
3:00 [Coding Session] Basic Neural Networks <07_nn.R>
4:00 [End]