Symposium Series
The DDSS Frontiers in Data Science Symposium series is organized around specific topics and features an interdisciplinary group of speakers with expertise in areas including computer science, statistics, the social sciences, industry, and government. The topics center on research questions and societal challenges where data-driven scientific solutions require the integration of the social sciences with statistical methods, computer science, and research design.
Overview
October 17-18, 2024
This symposium will feature state-of-the-art approaches to record linkage, bringing together social scientists, statisticians, academic researchers, and industry professionals. Presenters will showcase technical expertise in the development and use of innovative methods and software.
Topics will include:
- Census and historical record linkage
- Scalable probabilistic record linkage
- Evaluation
- Streaming and simultaneous analysis
- Industry collaboration
This event is cosponsored with the Industrial Relations Section. Registration is required.

Speakers
Senior Statistician
NORC
Data Scientist
American Institutes for Research
Assistant Professor of Political Science
Washington University in St. Louis
PhD Candidate, Economics
Princeton University
Assistant Professor in the Department of Government
University of Texas at Austin
Consultant
Institute for Health Metrics & Evaluation (IHME), University of Washington
Assistant Professor, Department of Statistics
Colorado State University
Lead Author of Splink, a Python library for record linkage
Data Scientist
Ministry of Justice UK
Assistant Professor of Statistics
Penn State University
Assistant Professor of Public Policy
Sanford School of Public Policy, Duke University
Professor of Economics
Brigham Young University
Professor of Statistical Science and Bass Fellow
Duke University
Associate Professor, Department of Statistics
University of Virginia
Schedule
Program Overview
Speaker | Topic |
DAY ONE | |
Session 1 | Census Part I |
Hannah Postel, Duke University | Subgroup Disparities in Automated Census Record Linkage |
Allison Green, Princeton University | Linking Historical Datasets: An Application to World War II Navy Muster Rolls |
Session 2 | Census Part II |
Joe Price, Brigham Young University | Breakthroughs in Historical Record Linking Using Genealogy Data: The Census Tree Project |
Adrian Haws, Cornell University | Software Demo: Census Tree - XGBoost |
Jonas Helgertz, Lund University [presented by Joe Price] | Examining the Role of Training Data for Supervised Methods of Automated Record Linkage: Lessons for Best Practice in Economic History |
Software Demo | |
Session 3 | Methods Part I |
Brenda Betancourt, NORC at the University of Chicago | Bayesian Clustering for Record Linkage Tasks |
Ted Enamorado, Washington University in St. Louis | A Locally Sensitive Hashing Approach to Scaling Up Probabilistic Record Linkage |
Software Demo: fastLink | |
Session 4 | Methods Part II |
Jerry Reiter, Duke University | Simultaneous Record Linkage and Statistical Modeling |
Andee Kaplan, Colorado State University | Fast Bayesian Record Linkage for Streaming Data Contexts |
Software Demo: bstrl (R) | |
Session 5 | Project Discussion: Splink |
Robin Linacre | Splink: Discussion and Software Demo |
DAY TWO | |
Session 6 | Evaluation and Analysis |
Martin Slawski, University of Virginia | Some Recent Advances and Open Problems in Post-Linkage Data Analysis |
Software Demo: Pldamixture | |
Olivier Binette, Duke University | How to Evaluate Entity Resolution Systems: An Entity-Centric Framework with Application to Inventor Name Disambiguation |
Software Demo: ER-EVALUATION | |
Session 7 | Further Applications |
Cory McCartan, Penn State University | A Missing Data Approach to Record Linkage and Measurement Error |
Connor Jerzak, University of Texas, Austin | New Directions in Large-scale Record Linkage Using Half a Billion Open Collaborated Records from LinkedIn |
Software Demo: LinkOrgs | |
Spring 2024: The Spread of Misinformation in a World Optimized for Engagement

Overview
May 10, 2024
The spread of misinformation is one of society's current challenges. The vast communication networks and algorithms built into social media platforms can amplify false or inaccurate information at an unprecedented scale, posing potential risks to public health, public security, democratic accountability, and many other domains. This DDSS symposium will focus on the role of algorithmic amplification in the spread of misinformation, the role of statistics and machine learning in processing information and misinformation, and some of the strategies being developed to counter misinformation.
Speakers
Mitsui Professor of Political Science
Massachusetts Institute of Technology
Assistant Professor of Politics and Public Affairs and Associated Faculty, Center for Information Technology Policy
Princeton University
Assistant Professor, School of International and Public Affairs, Department of Political Science
Columbia University
Professor of Computer Science and Director, Center for Information Technology Policy
Princeton University
Professor of Politics and International Affairs
Princeton University
Professor and Coca-Cola Foundation Chair
Georgia Institute of Technology
Schedule
Time | Speaker | Topic |
8:55am - 9:00am | Rocío Titiunik, Princeton University | Opening Remarks |
9:00am - 9:45am | Arvind Narayanan, Princeton University | Understanding Social Media Recommendation Algorithms |
9:45am - 10:30am | Andy Guess, Princeton University | Social Media, Ranking Algorithms, and Misinformation |
11:00am - 11:45am | Adam Berinsky, MIT | Thinking About Misinformation Interventions |
11:45am - 12:30pm | Yao Xie, Georgia Institute of Technology | Discovery and Mitigation of Disparities by Data |
1:30pm - 2:25pm | Tamar Mitts, Columbia University; Arvind Narayanan, Princeton University; Jacob Shapiro, Princeton University | Roundtable Discussion |
2:25pm - 2:30pm | Rocío Titiunik, Princeton University | Closing Remarks |
Contributions to and/or sponsorship of any event do not constitute departmental or institutional endorsement of the specific program, speakers, or views presented.