Model Validation for Applied Data Science
Nov. 19, 2021, noon - 2 p.m.
Organizer -
DataLab: Data Science and Informatics
Contact -
datalab-training@ucdavis.edu
Location -
Zoom
Description
In this workshop, we will discuss the basics of creating, comparing, and validating predictive models using a case study from the health sciences. We will demonstrate categorical prediction with logistic regression and numerical prediction with a regression tree approach. We will calculate accuracy measures appropriate to each type of model, and use cross-validation to find the model parameters that generate the best predictions. Finally, we will interpret the results for insights about the real-world process being modeled. Although this workshop uses health data, the conceptual framework and principles discussed should generalize to research in other domains.
Learning Objectives
- Fit a logistic regression model
- Fit a random forest model
- Use cross-validation to tune model parameters
- Estimate the accuracy of predictions for future data
- Interpret model parameters
Prerequisites
This workshop is open to learners at all levels, but prior experience with R is required to participate fully in this interactive, hands-on workshop.
Software
Please follow the DataLab install guides (https://datalab.ucdavis.edu/install-guide/) to install R and RStudio before the workshop. DataLab office hours are held via Zoom and in person on Wednesdays from 1:30pm–3:00pm. If you need help troubleshooting the installations, drop by office hours before the workshop. See the office hours page (https://datalab.ucdavis.edu/office-hours/) for details.
Instructors: Wesley Brooks, Vladimir Filkov
Instructors’ Biographies
Wesley Brooks holds a Statistics Ph.D. from the University of Wisconsin. He works at the DataLab as a Data Scientist.
Vladimir Filkov is a Professor of Computer Science and DataLab's director for translational data science. He leads the Health Data Science and Systems research and learning cluster.