Advanced Geospatial II: Large Data Processing with R and SQL

Mar 4, 2024, 4:30 pm – 6:00 pm



Event Description

This workshop introduces researchers to tools for large geospatial data processing in R and SQL. Before researchers can conduct statistical analyses, they often first need to standardize and geolocate a list of addresses, then perform one or more common operations, such as spatial joins, identifying and aggregating covariates from nearest neighbors, or harmonizing geographies (for example, Census tracts and voting districts). These tasks, like other geospatial data processing tasks, pose unique challenges when the dataset is larger than available memory. In this workshop, we cover open-source options for both standardizing and geocoding address data (some with associated R packages), and discuss how each solution scales to large data. We then cover storage and processing options for large spatial data that do not require standalone GIS software. Basic working knowledge of R (especially the tidyverse) and SQL is a useful prerequisite.

This workshop will take place in person; snacks will be served. Please RSVP.

Initiative for Data-Driven Social Science