Copy of RDF Sprint: Data Anonymization Group
Solutions:
OK: Release without alteration
Full Redaction: Do not release in any form
Partial Redaction: Incomplete release (remove certain parts of the data)
Chunk Up/Zoom Out: Collapse observations into groups (age example: 10-15, 16-20, 21-25)
Assign Alternative ID: Assign your own number to each observation at random (example: 001, 002, 003, 004)
Add Noise: Add random numbers to a piece of numeric data
Average: Collapse units of observation through averaging (average the ages of 5 people to create a single observation)
Variables:
super-variable: likely to appear across a range of data sets
Names
first name: X X
last name: X X
middle name: X X
username: X X
petition signature: X
Non-Bio Characteristics
job title: X (unless small sample size allows for identification based on this characteristic)
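The small-sample caveat on job title can be checked mechanically before release. The following is a minimal sketch, not a method taken from this sheet: the threshold of 5, the `REDACTED` placeholder, and the function names `rare_values` and `suppress_rare` are all assumptions chosen for illustration.

```python
from collections import Counter

def rare_values(values, min_count=5):
    """Return the set of values that occur fewer than min_count times."""
    counts = Counter(values)
    return {value for value, n in counts.items() if n < min_count}

def suppress_rare(values, min_count=5, placeholder="REDACTED"):
    """Replace values shared by too few records with a placeholder."""
    rare = rare_values(values, min_count)
    return [placeholder if value in rare else value for value in values]

# Hypothetical data: one job title appears only once and could identify a person.
job_titles = ["teacher"] * 6 + ["nurse"] * 5 + ["astronaut"]
released = suppress_rare(job_titles)
print(released[-1])
```

The same check would apply to any variable released without alteration, such as gender or ethnicity, where a small subgroup could point to specific individuals.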
Relationships
Guardian's profession
Spouse's name
Doctor's name
Identification Numbers
ID (student, school): X X
donor information, including bank account #
passport #
IP address: X
platform-specific identifier #
electronic signature from a 3rd-party app
official government-provided ID (such as social security number): X X
identifier number from a 3rd-party application
Geographic
Physical address
Resource location
Location
District/County/Province
Do you need public transport (school bus)?
Pick up/drop-off address
Current/event location
Hazardous substance repository locations
Phone geolocation
social media report geotag
Transport routes
Bio-
Date of Birth: X (remove the day or month variable as needed)
Age: X (collapse precise ages into categories, e.g. 18-24 years old); another option is also workable but more complicated, and would reduce the number of observations
Gender: X (unless the person can still be identified because there are few people of that kind in the data set)
Ethnicity: X (if certain subgroups are so small that they could refer to specific individuals)
Diagnoses (e.g. HIV status)
Bio-indicators (blood pressure, weight, blood type, etc.)
Pollution exposure
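The notes on date of birth and age describe two simple transformations: dropping the day or month, and collapsing exact ages into categories. A minimal sketch of both follows. The function names and the bracket edges (18, 25, 35, 50, 65) are assumptions for illustration; only the 18-24 category comes from the note above.

```python
from datetime import date

def redact_birth_date(dob, keep_month=False):
    """Partial redaction: keep the year, and optionally the month, of a birth date."""
    if keep_month:
        return f"{dob.year:04d}-{dob.month:02d}"
    return f"{dob.year:04d}"

def age_bracket(age, edges=(18, 25, 35, 50, 65)):
    """Chunk up: map an exact age to a range label such as '18-24'."""
    if age < edges[0]:
        return f"under {edges[0]}"
    for low, high in zip(edges, edges[1:]):
        if low <= age < high:
            return f"{low}-{high - 1}"
    return f"{edges[-1]}+"

# Hypothetical record.
print(redact_birth_date(date(1990, 7, 14)))                   # year only
print(redact_birth_date(date(1990, 7, 14), keep_month=True))  # year and month
print(age_bracket(21))
```

Wider brackets protect individuals better but lose more detail, so the edges are a judgment call for each data set.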
Authentication
Passwords: X
Google/FB/Twitter login
Contact
Phone no.: X (safest option); X (OK to release the area code ONLY if it would be useful for establishing geography)
Email address: X
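Releasing only the area code of a phone number could look like the sketch below. It assumes North American numbers (ten digits, optionally preceded by the country code 1), which the sheet does not specify; any other format falls back to full redaction by returning `None`. The function name `area_code_only` is hypothetical.

```python
import re

def area_code_only(phone):
    """Partial redaction: return the 3-digit area code, or None if the format is unknown."""
    digits = re.sub(r"\D", "", phone)  # strip spaces, dashes, parentheses, plus signs
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]            # drop the assumed country code
    if len(digits) != 10:
        return None                    # unknown format: release nothing
    return digits[:3]

# Hypothetical numbers.
print(area_code_only("(212) 555-0147"))
print(area_code_only("555-0147"))
```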
Communication Content
social media message
social media followers/followees
social media "likes"
social media report date
social media message content
testimony message content
Financial
insurance provider: X
political donation amount: X
financial dependency: X
political donation target
membership in a subsidy program
payment data (credit card #): X
Personal Activity History
Role in clinical trial (group assignment)
habits of behavior
scores/marks/performance metrics (ed.)
highest level of education
school history
voting history
energy use
citations/no. of publications
group membership
arrest record
resource provider
party affiliation
standardized test score info
browsing data
attendance rates (absentee rates)
phone use history
noise pollution levels
application for a license
pollution exposure (regional)
biodiversity stats
aggregated env. DBs
quantities of hazardous materials
medications prescribed
deforestation
resource consumption
post-secondary school info (about next moves, e.g. university or military)
habitat loss
transition rates/drop out rates (ed.)
Time
political donation date
log in timestamp
submitted timestamp
date of offline testimony
datetime of action or event
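Three of the solutions defined at the top of the sheet (Assign Alternative ID, Add Noise, Average) change values rather than remove them, so a short sketch may help. This is an illustration under stated assumptions, not a vetted anonymization routine: the function names, the uniform noise distribution, the noise scale of 2.0, and the group size of 5 (borrowed from the "5 people" example) are all choices made here.

```python
import random

def assign_alternative_ids(original_ids, rng):
    """Give each distinct ID a randomly assigned label such as '001', '002', ..."""
    unique = sorted(set(original_ids))
    labels = [f"{i:03d}" for i in range(1, len(unique) + 1)]
    rng.shuffle(labels)  # random assignment, so label order reveals nothing
    mapping = dict(zip(unique, labels))
    return [mapping[x] for x in original_ids], mapping

def add_noise(values, rng, scale=2.0):
    """Add a random offset in [-scale, scale] to each numeric value."""
    return [v + rng.uniform(-scale, scale) for v in values]

def average_in_groups(values, group_size=5):
    """Collapse consecutive groups of observations into their averages."""
    groups = [values[i:i + group_size] for i in range(0, len(values), group_size)]
    # The last group may hold fewer than group_size values and is then less protected.
    return [sum(g) / len(g) for g in groups]

rng = random.Random()  # pass random.Random(seed) for a reproducible run
student_ids = ["s-17", "s-42", "s-17", "s-08"]  # hypothetical IDs
new_ids, mapping = assign_alternative_ids(student_ids, rng)
ages = [10, 12, 14, 16, 18, 30]
print(new_ids)
print(add_noise(ages, rng))
print(average_in_groups(ages))
```

The `mapping` returned by `assign_alternative_ids` links the new labels back to the originals, so it would need to be stored separately from the released data or discarded.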