Machine Learning with R
University of Utah
Registration: Use the following link to sign up for this workshop.
Sign up for the DELPHI mailing list to stay in the loop about future workshops and funding opportunities.
General Information
What: This workshop provides a practical introduction to foundational machine learning algorithms, including linear regression, random forest, and XGBoost, with hands-on applications and best practices using tidymodels in R. Note that this workshop will NOT cover Large Language Models (LLMs) and while it may touch on Neural Network (NN)/Deep Learning models briefly at the end of the workshop, these will not be the focus.
Who: This workshop is designed for researchers, staff, and students who want to gain experience using machine learning techniques in practice in the R programming language.
Prerequisites: Participants should be familiar with the basics of the R programming language, including the tidyverse. Participants do not need to have any previous machine learning, statistical or mathematical experience to attend this workshop.
Requirements: Participants must bring a laptop onto which they can download R and Rstudio (and you should do so before the workshop).
Contact: Please email andrew.george@hsc.utah.edu and rebecca.barter@hsc.utah.edu for more information.
Resources
Books and websites
tidymodels.org. The official tidymodels website full of helpful tutorials and documentation.
Tidy Modeling with R by Max Kuhn and Julia Silge. A great read to get familiar with everything there is to do in the tidymodels ecosystem.
Efficient Machine Learning with R by Simon P Couch. Lots of great tips for speeding up your tidymodels code once you get a little more advanced.
Posit Cloud
A Posit Cloud workspace will be set up prior to the workshop for those who cannot (or prefer not to) install applications on their laptop.
You will need to create a posit cloud account if you do not already have one.
Click here to access the posit cloud workspace for this workshop.
Download files and data
If you are working in RStudio locally (rather than using posit cloud, above) click here to download all of the .R code scripts and data files we will be using throughout the workshop.
Schedule
Note that the schedule below serves as a guideline. The start, end, and break times are fixed, but timing for each topics covered may vary as we may go faster or slower through the content.
Note that morning snacks and lunch will be provided on both days.
Day 1
Time | Topic | Content |
---|---|---|
9:00 | Introduction to Prediction Problems | 01_intro.pdf |
9:30 | Linear Regression for Continuous Responses | 02_linear_regression.pdf |
10:00 | Evaluating Continuous Response Predictions | 02_linear_regression.pdf |
10:30 | [Break] | |
10:45 | [Coding Session] Linear Regression for Continuous Responses | <02_linear_regression.R> |
11:30 | Data Preprocessing | 03_data_preprocessing.pdf |
12:00 | [Lunch] | |
1:00 | [Coding Session] Data Preprocessing | <03_data_preprocessing.R> |
1:45 | Logistic Regression for Predicting Binary Responses | 04_binary_Responses.pdf |
2:00 | Evaluating Binary Responses Predictions | 04_binary_Responses.pdf |
2:30 | [Break] | |
2:45 | [Coding Session] Logistic Regression and Evaluation Binary Response Predictions | <04_binary_Responses.R> |
4:00 | [End] |
Day 2
Time | Topic | Content |
---|---|---|
9:00 | Decision Trees and Random Forest | 05_decision_trees.pdf |
10:00 | XGBoost | 05_decision_trees.pdf |
10:15 | Variable Importance | 05_decision_trees.pdf |
10:30 | [Break] | |
10:45 | [Coding Session] Decision Tree Algorithms | <05_decision_trees.R> |
12:00 | [Lunch] | |
1:00 | Hyperparameter Tuning | 06_hyperparameter_tuning.pdf |
1:30 | [Coding Session] Hyperparameter Tuning | <06_hyperparameter_tuning.R> |
2:30 | [Break] | |
2:45 | Basic Neural Networks | 07_nn.pdf |
3:00 | [Coding Session] Basic Neural Networks | <07_nn.R> |
4:00 | [End] |