Introduction to Natural Language Processing with Applications to Clinical Data Science

University of Utah

Date: March 8th, 2024

Time: 9:00 am – 4:00 pm MST

Location: EHSEB 4100B

Instructors: Alec Chapman, Kelly Peterson

Registration: Use the following link to sign up for this workshop.

Sign up for the DELPHI mailing list to stay in the loop about future workshops and funding opportunities.

General Information

What: This workshop will introduce methods for using natural language processing (NLP) to extract information from unstructured text data. The primary focus will be applications to clinical text and electronic health record notes, but the methods we learn can be applied to other domains as well.

Who: The course is aimed at graduate students, postdocs, faculty, and other researchers across campus who are interested in learning how to use NLP for data analysis.

Requirements: Participants must bring a laptop with internet access and a web browser. We will use Google Colab notebooks throughout the workshop. The workshop uses Python, and participants are expected to have introductory-level Python skills. If you are new to Python, we strongly recommend registering for our Introduction to Data Analysis in Python workshop or completing our virtual short course on Python (in development).

Contact: Please email penny.atkins@hsc.utah.edu or alec.chapman@hsc.utah.edu for more information.

Schedule

| Time | Topic | Notebook | Notebook solutions |
|---|---|---|---|
| Before Workshop | Python Review | 0. Roadmap; 1. Python Essentials | 1. Python Essentials |
| 9:00 am | Introduction and Setup | Slides - Intro to NLP | |
| 9:30 am | Working with text in Python | 2. String Methods | 2. String Methods |
| 10:00 am | Regular expressions | 3. Regular Expressions | 3. Regular Expressions |
| 10:30 am | Morning Break | | |
| 10:45 am | Rule-based NLP and medspaCy | 4. Rule-based NLP with medspaCy | 4. Rule-based NLP with medspaCy |
| 11:30 am | Attribute Detection | 5. Attribute Detection | 5. Attribute Detection |
| 12:00 pm | Lunch | | |
| 1:00 pm | Pneumonia classification | 6. Pneumonia Classification | Solutions posted after the workshop |
| 1:45 pm | Introduction to machine learning NLP | Slides - Intro to ML for NLP | |
| 2:00 pm | Pre-training a LM from scratch | 7. Train a language model | |
| 2:30 pm | Afternoon Break | | |
| 2:45 pm | Fine-tuning a LM for classification | 8. Fine tuning text Classification for Pneumonia | |
| 3:30 pm | Wrap-Up and Resources | Slides - ML wrap up and Resources | |
| 4:00 pm | End | | |