Sensitive qualitative data and pseudonymisation
Arin Tham Savran
OS in the Swedish Context
Göteborg, 2026-04-06
Göteborgs universitet - Chalmers tekniska högskola - Karolinska Institutet - Kungliga Tekniska högskolan - Lunds universitet - Stockholms universitet - Sveriges lantbruksuniversitet - Umeå universitet - Uppsala universitet
This work is licensed under a Creative Commons Attribution 4.0 International License.
Session plan
15:00 – 15:10 Welcome
15:10 – 15:40 Presentation
15:45 – 15:50 Activity 1
15:50 – 16:00 Presentation continues
16:00 – 16:10 Activity 2
16:10 – 16:20 Class discussion
16:20 – 16:30 Presentation finishes
Photo: Matthew Henry on Burst
Types of qualitative “sensitive data collection”
Svensk nationell datatjänst
Sensitive data
GDPR (General Data Protection Regulation; in Swedish, Dataskyddsförordningen)
Pseudonymisation & Anonymisation: In a nutshell
Here’s the thing though…
RE-IDENTIFICATION: Indirect identifiers and background variables
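Combinations of indirect identifiers can single out a person even when no name appears in the data. The following is a minimal sketch, not a complete risk assessment, of how one might count how many records share each combination of background variables. The field names, the sample records, and the threshold `k` are illustrative assumptions.

```python
from collections import Counter


def rare_combinations(records, fields, k=2):
    """Return value combinations shared by fewer than k records.

    `records` is a list of dicts; `fields` names the indirect
    identifiers to combine (both hypothetical here).
    """
    counts = Counter(tuple(record[field] for field in fields) for record in records)
    return {combo: n for combo, n in counts.items() if n < k}


# Made-up example data, not taken from any real study
records = [
    {"occupation": "nurse", "language": "Arabic", "town": "A"},
    {"occupation": "nurse", "language": "Swedish", "town": "A"},
    {"occupation": "nurse", "language": "Swedish", "town": "A"},
]

rare = rare_combinations(records, ["occupation", "language", "town"])
print(rare)
```

Any combination returned here describes a single record, so those values are candidates for generalisation or removal.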
1. Remove the “obvious” and generalise identifying details
Make sure there’s nothing that could identify individuals, whether study participants or others, including via re-identification.
Exceptions? Yes, in some cases people want to be identified. But it is ultimately up to the PI to judge whether that is appropriate.
2. Create code/alias key
Use fake names, participant IDs, or role-based labels.
Keep the key separate from the research data, stored securely, and accessible only to authorised individuals.
Example:
Participant ID | Real name | Contact details |
P001 | Jane Doe | Email/phone |
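A key like the one above could be generated with a short script. This is a minimal sketch under stated assumptions: the field names (`name`, `contact`), the `P001`-style numbering, and the CSV layout are illustrative choices, and the second participant is made up.

```python
import csv
import io


def build_key(participants, prefix="P"):
    """Assign sequential IDs (P001, P002, ...) to participants.

    Returns the key rows and a lookup from real name to participant ID.
    """
    key_rows = []
    id_by_name = {}
    for number, person in enumerate(participants, start=1):
        participant_id = f"{prefix}{number:03d}"
        key_rows.append({
            "participant_id": participant_id,
            "real_name": person["name"],
            "contact": person["contact"],
        })
        id_by_name[person["name"]] = participant_id
    return key_rows, id_by_name


def key_to_csv(key_rows):
    """Serialise the key as CSV text, to be stored apart from the research data."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["participant_id", "real_name", "contact"]
    )
    writer.writeheader()
    writer.writerows(key_rows)
    return buffer.getvalue()


# Hypothetical participants for illustration only
participants = [
    {"name": "Jane Doe", "contact": "jane@example.org"},
    {"name": "John Roe", "contact": "john@example.org"},
]
key_rows, id_by_name = build_key(participants)
print(id_by_name["Jane Doe"])
```

Only the participant IDs should appear in transcripts and analysis files; the CSV produced by `key_to_csv` belongs in separate, access-controlled storage.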
3. Pseudonyms for roles, organisations, geographical regions
4. Pseudonymise places, organisations/companies, and networks
Original:
“I worked at Volvo Torslanda under my foreman Anders at T3.”
Replace with something like:
“I worked for a large manufacturing company in a major Swedish city under a senior manager.”
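A first pass over a transcript can be automated with a replacement table, although the result still needs manual review and rewording, as in the safer version above. The sketch below assumes a hand-built table; the placeholder labels `[manager]` and `[unit]` are illustrative, not a prescribed convention.

```python
import re

# Hypothetical replacement table compiled by the researcher
REPLACEMENTS = {
    "Volvo Torslanda": "a large manufacturing company in a major Swedish city",
    "Anders": "[manager]",
    "T3": "[unit]",
}


def pseudonymise(text, replacements):
    """Replace each listed term with its placeholder, matching whole words only."""
    # Longest terms first, so multi-word names win over shorter overlapping ones
    terms = sorted(replacements, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(term) for term in terms) + r")\b")
    return pattern.sub(lambda match: replacements[match.group(1)], text)


original = "I worked at Volvo Torslanda under my foreman Anders at T3."
result = pseudonymise(original, REPLACEMENTS)
print(result)
```

Note that the automated output still contains “foreman”, a role detail the manual rewrite generalises to “a senior manager”. Simple search and replace only catches terms already in the table.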
5. Modify dates
Dates can be identifying, especially around incidents, complaints, hospitalisation, legal proceedings, or media-covered events.
Source: ChatGPT Edu
Original | Safer version |
“on 17 September 2023” | “in autumn 2023” or “2023” |
“three days after the public inquiry” | “shortly after a major public event” |
“during the 2020 election campaign” | “during a national campaign period” |
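Where exact dates are stored in structured form, the first row of the table can be approximated in code. This sketch assumes meteorological seasons for the northern hemisphere (September to November as autumn); the function name and the `level` parameter are hypothetical. Relative references such as “three days after the public inquiry” still have to be reworded by hand.

```python
from datetime import date

# Assumption: meteorological seasons, northern hemisphere
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}


def generalise_date(exact, level="season"):
    """Coarsen an exact date to 'season year' or to the year alone."""
    if level == "year":
        return str(exact.year)
    return f"{SEASON_BY_MONTH[exact.month]} {exact.year}"


print(generalise_date(date(2023, 9, 17)))
print(generalise_date(date(2023, 9, 17), level="year"))
```

How coarse to go depends on the material: if a season still points to a known event, the year alone may be the safer choice.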
6. Generalise and categorise background information
Examples:
Source: www.researchdata.se
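One common way to categorise a background variable is to replace exact ages with age bands. The sketch below is an illustration only and is not taken from the slide's source; the band width and the top-coding threshold are assumptions to adjust per study.

```python
def age_band(age, width=10, top_code=80):
    """Map an exact age to a band such as '40-49'; ages at or above top_code become '80+'."""
    if age >= top_code:
        return f"{top_code}+"
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


print(age_band(47))
print(age_band(83))
```

Top-coding groups the oldest participants together, since very high ages tend to be rare and therefore more identifying.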
ACTIVITY 1
3-5 min
ACTIVITY 1: Pseudonymised examples from ChatGPT
Original | Pseudonymised |
“I live in Kiruna” | |
“I am a 47-year-old neurosurgeon” | |
“My son attends Greenfield Primary” | |
“I started on 12 March 2024” | |
“My research is on hygiene practices in post-WW2 Finland and Sweden” | |
“As the only Arabic-speaking pediatric nurse in town…” | |
Pseudonymised examples from ChatGPT (cont.)
Original | Pseudonymised |
“I live in Kiruna” | “I live in a small town in northern Sweden” |
“I am a 47-year-old neurosurgeon” | “I am a mid-career specialist doctor” |
“My son attends Greenfield Primary” | “My child attends a local primary school” |
“I started on 12 March 2024” | “I started in early 2024” |
“My research is on hygiene practices in post-WW2 Finland and Sweden” | “My research is on public health practices in two Nordic countries during the mid-20th century” |
“As the only Arabic-speaking pediatric nurse in town…” | “As one of few multilingual pediatric healthcare workers in the area…” |
Read more…
ACTIVITY 2
10 min
CLASS DISCUSSION
10 min
Also, to think about…
Risk area | What to check |
File metadata | Author name, institution, path names, creation history |
Track changes | Deleted names or places may remain recoverable |
Comments | Researcher notes may contain real identities |
Embedded objects | Images, audio clips, linked files, spreadsheets |
File names | Avoid Interview_Anders_Andersson_2024.docx; use P01_transcript.docx |
PDF redactions | Do not just draw black boxes over text; use proper redaction/export workflows |
Audio/video transcripts | Check timestamps and speaker labels for identifiers |
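The file-name row of the checklist above lends itself to a quick automated check. The sketch below flags file names that contain any part of a known real name; the function name is hypothetical, and the list of names is assumed to come from the securely stored key. It does not inspect file contents or metadata.

```python
import re
from pathlib import PurePath


def flag_identifying_filenames(paths, known_names):
    """Return the paths whose file name contains any part of a known real name."""
    name_parts = {part.lower() for name in known_names for part in name.split()}
    flagged = []
    for path in paths:
        # Split the file name (without extension) on anything that is not a letter or digit
        stem_parts = re.split(r"[^0-9a-zà-ÿ]+", PurePath(path).stem.lower())
        if name_parts & set(stem_parts):
            flagged.append(path)
    return flagged


paths = ["Interview_Anders_Andersson_2024.docx", "P01_transcript.docx"]
flagged = flag_identifying_filenames(paths, ["Anders Andersson"])
print(flagged)
```

A check like this only catches exact name parts, so nicknames, initials, and misspellings still need a manual look.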
Swedish National Data Service
Questions?
Thanks for listening
Arin Tham Savran
arin.tham@snd.se
www.researchdata.se
University of Gothenburg - Chalmers University of Technology - Karolinska Institutet - KTH Royal Institute of Technology - Lund University - Stockholm University - Swedish University of Agricultural Sciences - Umeå University - Uppsala University