Record Linkage II: Implementation

Feb 19, 2024, 4:30 pm - 6:00 pm


Event Description

This workshop briefly introduces participants to several R tools for record linkage. We then discuss Splink, an increasingly popular Python package for record linkage and deduplication. Following a brief (re)introduction to the underlying probabilistic framework, a code-along will demonstrate an example use case, highlighting the package’s scalability to large datasets and its helpful diagnostic tools. We also provide examples of additional pre- and post-processing that researchers might require before and after using the package in their own work.

This workshop will be held in person; snacks will be served. Please RSVP.

Initiative for Data-Driven Social Science