Evaluating the Impact of De-Identification on Social and Behavioral Research Data
Sayuri Modi, Melodie Galamo, Bhargavi Alluri, Azwa Bajwah, Dakshita Pal
Project Description
Social and behavioral researchers often collect sensitive data about people. Publishing research data is beneficial for replication, meta-analysis, and public research. To protect research participants from harm and privacy violations, the data must be de-identified before it is published.
Principled approaches to de-identification, such as differential privacy and k-anonymity, can help ensure that data meets certain standards of privacy. However, researchers understandably have concerns that this gain in privacy will be unacceptably offset by a loss of data utility or fairness.
As a first step towards addressing researchers’ concerns, we aim to establish a baseline understanding of how existing de-identification tools impact data utility.
Goals
01 To understand how existing de-identification tools impact the utility of real research data
02 To learn to use the ARX and sdcMicro tools to analyze data and calculate risk factors
03 To form hypotheses about how these tools could be better designed to meet the needs of social and behavioral researchers
sdcMicro vs. ARX
We anonymize the same dataset with both tools and compare the results to understand how the two tools treat it differently.
Raw Data Set
Reducing Crime and Violence: Experimental Evidence from Cognitive Behavioral Therapy in Liberia
The data records attributes of 999 criminally engaged men from Liberia and surrounding regions.
Notable variables include age, country of birth, city of birth, neighbourhood, tribe, religion, and level of education.
Importance of De-Identification
Methodologies (general)
k-anonymization
Generalizing (Hierarchy)
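To make the two methods above concrete, here is a minimal Python sketch of how a generalization hierarchy can be applied until a table becomes k-anonymous. The records, the 10-year age bands, and the helper names (`generalize_age`, `smallest_class`, `is_k_anonymous`) are illustrative assumptions and are not taken from the study data, ARX, or sdcMicro.

```python
from collections import Counter

# Hypothetical toy records (age, education); NOT taken from the study data.
records = [
    (21, "primary"),
    (23, "primary"),
    (24, "secondary"),
    (35, "secondary"),
    (37, "none"),
    (38, "none"),
]

def generalize_age(age, width):
    """Replace an exact age with a band of the given width, e.g. 23 -> '20-29'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def smallest_class(rows):
    """Size of the smallest group of rows sharing all quasi-identifier values."""
    return min(Counter(rows).values())

def is_k_anonymous(rows, k):
    """A table is k-anonymous if every such group has at least k rows."""
    return smallest_class(rows) >= k

# Level 0 of the hierarchy: raw values, every record is unique.
level0 = records
# Level 1: ages generalized to 10-year bands, education kept as-is.
level1 = [(generalize_age(a, 10), e) for a, e in records]
# Level 2: ages in 10-year bands, education suppressed to '*'.
level2 = [(generalize_age(a, 10), "*") for a, _ in records]

for name, rows in [("level0", level0), ("level1", level1), ("level2", level2)]:
    print(name, "smallest class:", smallest_class(rows))
```

In this toy example only the most generalized level reaches 3-anonymity, which illustrates the trade-off: each step up the hierarchy lowers re-identification risk but removes detail from the data.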
01 sdcMicro
Results
02 ARX
Results
Generalized..
k-anonymization
Comparison
By cross-examining the anonymized outputs, we saw risk and utility percentages similar to those the tools themselves reported.
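One way such a cross-check could be done by hand is sketched below in Python. It assumes a simple definition of risk (each record's risk is 1 divided by the size of its group of identical quasi-identifier values) and a deliberately crude utility proxy (the share of cells left unchanged). These are illustrative assumptions, not necessarily the measures ARX or sdcMicro report, and the rows are made up.

```python
from collections import Counter

def risk_summary(rows):
    """Summarize re-identification risk, assuming risk = 1 / group size."""
    counts = Counter(rows)
    n = len(rows)
    per_record = [1 / counts[r] for r in rows]
    return {
        "highest_risk": max(per_record),
        "average_risk": sum(per_record) / n,
        "unique_share": sum(1 for r in rows if counts[r] == 1) / n,
    }

def cells_retained(original, anonymized):
    """Crude utility proxy: share of cells whose value is unchanged."""
    total = sum(len(row) for row in original)
    same = sum(
        o == a
        for orow, arow in zip(original, anonymized)
        for o, a in zip(orow, arow)
    )
    return same / total

# Hypothetical rows (age, education) before and after anonymization.
original = [(21, "primary"), (23, "primary"), (24, "secondary"), (35, "secondary")]
anonymized = [("20-29", "primary"), ("20-29", "primary"), ("20-29", "*"), ("30-39", "*")]

print("risk before:", risk_summary(original))
print("risk after: ", risk_summary(anonymized))
print("cells retained:", cells_retained(original, anonymized))
```

Running the same summary on the output of each tool gives numbers that can be compared directly with the percentages each tool displays.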
03 Final Thoughts
Challenges Faced
Takeaways
Thank you!
Special thanks to graduate students Wentao and Emma!