1 of 15

A Configurable Anonymisation Service for Semantically Annotated Data

Semantics 2025, NXDG, Sept 3rd, 2025

2 of 15

About OwnYourData

  • Founded in 2015, we are a research boutique focusing on data sovereignty
  • 1st Austrian Data Intermediation Service Provider ("DID Daten-Intermediär-Dienste FlexCo")

  • Open Source solutions for data exchange:
    • Storage: Semantic Container
    • Identity: did:oyd
    • Contracts: Data Agreements
    • Data models: Semantic Overlay Architecture

Key Characteristics

  • Provisioned
  • Packaged
  • Distributed
  • FAIR
  • Composable
  • Tradeable
  • Standards-based

3 of 15

About SOyA

  • Swiss Army Knife for Data Model Management
  • SOyA Components
    • SOyA Structure with SOyA Ontology
    • SOyA Tools: Online Repository, Command Line tools, Web tools

4 of 15

SOyA Authoring

YAML:

JSON-LD:

soya init

5 of 15

SOyA Instances

flat JSON:

Turtle:

soya acquire

soya canonical

6 of 15

Anonymization

  • Process to protect privacy of personal data while preserving structural integrity
  • Goal is to prevent linking attacks
    • In a linking attack, two data sets are joined via quasi-identifiers
  • Different approaches described in literature
    • Randomization: adds random salt
    • Generalization: values are replaced with more abstract or broader representations
  • k-anonymity as the benchmark for anonymization
    • Each record must be indistinguishable from at least k − 1 other records
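
The k-anonymity criterion above can be checked with a few lines of Python. This is a minimal sketch, not the service's implementation: the function name `k_anonymity` and the sample records are hypothetical, and the quasi-identifiers are assumed to be already generalized.

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Return the size of the smallest group of records that share
    the same combination of quasi-identifier values."""
    groups = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return min(groups.values()) if groups else 0


# Hypothetical, already generalized sample data
records = [
    {"age": "30-39", "zip": "10**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "10**", "diagnosis": "asthma"},
    {"age": "40-49", "zip": "11**", "diagnosis": "flu"},
    {"age": "40-49", "zip": "11**", "diagnosis": "diabetes"},
]

k = k_anonymity(records, ["age", "zip"])
print(k)  # 2: every record shares its (age, zip) pair with one other record
```

Adding "diagnosis" to the quasi-identifiers would drop the result to 1 here, which illustrates why the choice of quasi-identifiers matters for the measured k.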

7 of 15

Data Governance in Europe

  • GDPR (2018) – core regulation for personal data
    • Principles: minimization, purpose limitation, accountability
    • Rights: access, rectification, erasure
    • Anonymization: irreversible → exempt from GDPR
  • Data Governance Act (DGA)
    • Enables secure reuse of public data, data altruism, neutral intermediaries
    • Rules on registration, neutrality, obligations
  • Layered model: GDPR → individual rights; DGA → ecosystem trust

8 of 15

Architecture

  • Attribute-oriented implementation
  • Anonymization is applied to each attribute individually
  • Open interface for integrating new anonymization approaches
  • Data type of the implementation must match the data type of the attribute
  • Currently Generalization and Randomization are implemented
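
An attribute-oriented design with an open interface could look roughly like the following sketch. All class and function names (`Anonymizer`, `NumericGeneralizer`, `NumericRandomizer`, `anonymize_attribute`) are hypothetical and not taken from the service; the equal-width bucketing and Gaussian noise are assumed stand-ins for the generalization and randomization implementations.

```python
import random
from abc import ABC, abstractmethod


class Anonymizer(ABC):
    """Hypothetical open interface: one anonymizer handles one attribute."""

    data_type: tuple  # Python types this implementation accepts

    @abstractmethod
    def anonymize(self, values: list) -> list:
        ...


class NumericGeneralizer(Anonymizer):
    """Replaces each number with the (lower, upper) bounds of its bucket."""

    data_type = (int, float)

    def __init__(self, buckets: int):
        self.buckets = buckets

    def anonymize(self, values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / self.buckets or 1.0  # avoid zero-width buckets
        result = []
        for v in values:
            idx = min(int((v - lo) / width), self.buckets - 1)
            result.append((lo + idx * width, lo + (idx + 1) * width))
        return result


class NumericRandomizer(Anonymizer):
    """Adds Gaussian noise to each number (assumed noise model)."""

    data_type = (int, float)

    def __init__(self, scale: float, seed=None):
        self.scale = scale
        self.rng = random.Random(seed)

    def anonymize(self, values):
        return [v + self.rng.gauss(0.0, self.scale) for v in values]


def anonymize_attribute(values, anonymizer):
    """Apply one anonymizer to one attribute, enforcing the data type match."""
    if not all(isinstance(v, anonymizer.data_type) for v in values):
        raise TypeError("attribute data type does not match the anonymizer")
    return anonymizer.anonymize(values)


ages = [20, 30, 40, 60]
print(anonymize_attribute(ages, NumericGeneralizer(buckets=2)))
# [(20.0, 40.0), (20.0, 40.0), (40.0, 60.0), (40.0, 60.0)]
```

A custom approach would plug in by subclassing `Anonymizer` and declaring the data types it supports, which mirrors the type-matching rule stated above.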

Figure: Input data → Anonymization Service (Generalization, Randomization, and Custom implementations) → Anonymized data. Generalization variants shown: numeric data, dates, object structures.

9 of 15

Data Transformation

  • Input data arrives in an object-oriented schema
  • To apply attribute-specific anonymizers, the data is transformed into an attribute-oriented schema
  • The result is transformed back into the object-oriented schema
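
The round trip between the two schemas can be sketched as follows. This is an illustrative assumption about the transformation, using plain Python dictionaries; the helper names `to_columns`, `to_records`, and `anonymize_records` are hypothetical, and all records are assumed to share the same keys.

```python
def to_columns(records):
    """Object-oriented -> attribute-oriented: one list of values per attribute."""
    keys = list(records[0]) if records else []
    return {key: [record[key] for record in records] for key in keys}


def to_records(columns):
    """Attribute-oriented -> object-oriented: one dict per original record."""
    names = list(columns)
    return [dict(zip(names, row)) for row in zip(*columns.values())]


def anonymize_records(records, anonymizers):
    """Apply per-attribute functions; attributes without one pass through."""
    columns = to_columns(records)
    for name, anonymize in anonymizers.items():
        if name in columns:
            columns[name] = anonymize(columns[name])
    return to_records(columns)


records = [{"name": "A", "age": 34}, {"name": "B", "age": 47}]

# Hypothetical anonymizer: round ages down to the decade
result = anonymize_records(
    records, {"age": lambda ages: [a // 10 * 10 for a in ages]}
)
print(result)  # [{'name': 'A', 'age': 30}, {'name': 'B', 'age': 40}]
```

Working on whole columns lets an anonymizer see the full value distribution of an attribute, which bucket-based generalization needs.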

10 of 15

Working Example

11 of 15

Degree of Anonymization

  • The degree of anonymization depends on the number of attributes n and the data set size k
  • Number of buckets g is used as central anonymization parameter to represent the
  • Our goal is to reach a 99% confidence that no individual can be uniquely identified

  • For randomization, an adaptation based on the number of instances per bucket i is applied
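
One simple way to reason about how n, k, and g interact is sketched below. It is not the formula used by the service; it assumes records fall uniformly and independently into the g^n possible bucket combinations, and it uses the expected number of unique records as an upper bound on the probability that any record is unique (Markov's inequality). The function names and the `risk` parameter are hypothetical.

```python
def expected_unique_records(k, n, g):
    """Expected number of records alone in their bucket combination,
    assuming k records spread uniformly over g**n combinations."""
    cells = g ** n
    return k * (1 - 1 / cells) ** (k - 1)


def max_buckets(k, n, risk=0.01):
    """Largest g for which the expected number of unique records stays
    at or below `risk`. Since P(at least one unique record) is bounded by
    that expectation, risk=0.01 corresponds to at least 99% confidence
    under the uniformity assumption."""
    g = 1
    while expected_unique_records(k, n, g + 1) <= risk:
        g += 1
    return g


# More attributes leave room for fewer buckets per attribute
print(max_buckets(k=1000, n=1), max_buckets(k=1000, n=2))
```

Under these assumptions, a larger data set permits more buckets and therefore finer-grained values, while each additional attribute forces coarser buckets.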

12 of 15

Service Evaluation

  • Evaluation performed with a test suite with synthetic test data
  • K-anonymity as primary performance metric
  • Randomization assessed via k-anonymity with similarity measure
    • Similarity benchmark: average distance between original and anonymized values
    • Two anonymized records are considered similar if distance ≤
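
A similarity-based variant of k-anonymity for randomized data might be computed as in this sketch. It assumes numeric records compared by Euclidean distance; the function names are hypothetical, and `threshold` is left as an explicit parameter rather than a value taken from the evaluation.

```python
import math


def average_distance(original, anonymized):
    """Mean Euclidean distance between each original record and its
    anonymized counterpart (the similarity benchmark)."""
    pairs = list(zip(original, anonymized))
    return sum(math.dist(o, a) for o, a in pairs) / len(pairs)


def similarity_k(anonymized, threshold):
    """Smallest number of anonymized records (including the record itself)
    that lie within `threshold` of any one record."""
    return min(
        sum(1 for other in anonymized if math.dist(record, other) <= threshold)
        for record in anonymized
    )


# Hypothetical single-attribute records before and after randomization
original = [(30.0,), (31.0,), (50.0,), (52.0,)]
anonymized = [(31.0,), (30.0,), (52.0,), (50.0,)]

print(average_distance(original, anonymized))  # 1.5
print(similarity_k(anonymized, threshold=2.0))  # 2
```

With a tighter threshold such as 0.5, every record is similar only to itself and the measure drops to 1.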

13 of 15

Trade-off: Safety vs. Information Value

  • Trade-off between privacy protection and information value
  • Goal: apply minimum anonymization to meet regulatory requirements
  • Required level depends on number of attributes and dataset size
  • With the anonymization level fixed, information value is determined by the number of attributes and the dataset size
  • Service can adapt anonymization level when higher protection is needed

14 of 15

Summary and Future Work

  • Introduction of Anonymization Service
    • Integrated with semantic technologies
    • Flexible implementation
    • Dynamic degree of anonymization
  • Future Work
    • Enrich the anonymization output with benchmark values and KPIs
    • Extend pool of anonymization techniques
    • Evaluate scalability & operational impact in real-world settings

15 of 15

OWNYOURDATA.EU

Your Data is precious.

Try our Anonymiser Service: https://anonymiser.ownyourdata.eu

Dr. Christoph Fabianek

✉️ christoph@ownyourdata.eu

Paul Feichtenschlager, MSc.

✉️ paul@ownyourdata.eu