
PICARD: A New Cloud Platform For Secure Data Needs
Computational social science often depends on large datasets that may contain sensitive or proprietary information. However, universities frequently lack the infrastructure needed to support these datasets, leading researchers to build ad hoc solutions that are often inefficient and insecure, and that are reached only through unnecessary trial and error.
To address this gap, DDSS has developed a new, shareable research platform for social scientists that combines strong security with scalable computing power. Built in partnership with Databricks and Amazon Web Services, the platform can host massive structured or unstructured datasets, allowing researchers to share data securely and easily. It also offers on-demand computational resources to handle even the most demanding research tasks.
The Platform for Interactive Computation and Research Data (PICARD) supports the secure storage, access, and usage of large observational social science datasets containing identifiable private information, or personally identifiable information (PII). It excludes data marked as Controlled Unclassified Information (CUI).
Engineers: Eric Manning, Colin Swaney
Accessing Our Data
Currently, faculty, graduate students, and postdoctoral researchers can apply to access:
L2 Voter File: Standardized compilation of states’ voter registration files augmented with consumer data. Contains only registered voters. Standardized data begins in 2013. Raw data for some states is available beginning in the early 2000s.
L2 Commercial File: U.S. adults (i.e., including adults not registered to vote; excluding any vote history). Consumer data not sourced from any voter registration files.
Verisk Consumer History Snapshot: Address, name, and phone history for up to the last ten addresses and/or 30 years.