Practical Machine Learning for Surveys, Panels, and Experiments
Lecturer: Marco Steenbergen
Modality: In person
Week 2: 17-21 August 2026
Workshop Contents and Objectives
Machine learning (ML) is changing how social scientists approach familiar research problems. Issues such as missing data, model specification, measurement error, heterogeneity, and complex longitudinal patterns are not new. However, modern ML methods offer new ways to diagnose, correct, and exploit these challenges. This course shows how ML can directly improve the quality, robustness, and creativity of empirical social science research. Rather than focusing on abstract model families, we anchor each method in concrete tasks researchers face every day: cleaning messy data, building better measures, designing credible causal analyses, and extracting structure from complex datasets.
Through hands-on sessions, participants will learn how to combine traditional statistical thinking with modern ML workflows, including advanced tree-based algorithms, causal ML, and deep learning for tabular and survey data. By the end of the week, participants will know not only how these methods work, but also when they meaningfully expand what social scientists can learn from their data.
Detailed lecture plan (daily schedule)
Day 1 – Foundations for modern ML workflows
Day 2 – ML for missing data and data quality
Day 3 – ML for causal inference I
Day 4 – ML for causal inference II
Day 5 – Deep learning for social science data
Class materials
Recommended: Kuhn, Max and Julia Silge. 2022. Tidy Modeling with R: A Framework for Modeling in the Tidyverse. O’Reilly. ISBN: 978-1492096481
Prerequisites
Prior knowledge of regression and R is highly recommended.
Marco Steenbergen
University of Zürich, Switzerland
He is a professor of political methodology at the University of Zurich. His methodological interests span choice models, machine learning, measurement, and multilevel analysis. His substantive interests cover voting behavior and digital democracy, in particular online deliberative processes. Originally from the Netherlands, he previously taught at Carnegie Mellon University, the University of North Carolina at Chapel Hill, and the University of Bern. He has published extensively and is co-author of the award-winning book The Ambivalent Partisan (OUP 2012).