Speaker
Details
Event Description
Researchers often need to merge or deduplicate datasets that lack a common identifier for the individuals or entities they describe. The choice of solution to this problem can affect downstream analyses, and researchers may also wish to carry the uncertainty from this stage into those analyses. In this workshop, we introduce record linkage as a generic problem and break the task into its component stages. For each stage, we discuss best practices, potential pitfalls, and scalability, with examples of existing implementations in the social sciences. We conclude by discussing when a custom solution might be desirable.
This workshop will take place in person; snacks will be served. Please RSVP.
Sponsor
Initiative for Data-Driven Social Science