1 of 13

2 of 13

Motivation

  • Key Issues with LLM Embedding in Daily Life
    • Data Sources: Concerns about the data LLMs are trained on and the presence of societal biases.
    • Misinformation Risk: Potential for amplifying biases and spreading disinformation.
    • Ownership & Copyright: Questions about who owns the knowledge generated by LLMs; copyright disputes from content creators over datasets like Books3 (e.g., books, songs, news articles).
    • Legal Actions: Lawsuits claim LLMs are trained on pirated content, sparking debates over copyright and fair use.

🡪 Developers are reluctant to reveal training data sources.

🡪 GPT-4: No dataset information released.

🡪 Meta: Shared dataset details for LLaMA, but refused to disclose them for LLaMA-2.

https://medium.com/@kevaldekivadiya2415/fine-tuning-llama-2-your-path-to-chemistry-text-perfection-aa4c54ff5790

3 of 13

Threat Model (Auditing setup)

  • Auditor assumptions:
    • The LM is trained on a dataset Dtrain.
    • The auditor infers whether a given document D ∈ Dtrain.
    • Access: the auditor has query-only access to the LM and full access to its tokenizer T.

🡪 The auditor can query the LM with a token sequence S and receive probability outputs for every token v in the vocabulary V.

    • Document sets: the auditor has access to DM and DNM, both drawn from the same distribution as D.

🡪 DM: Consists of publicly available datasets (e.g., Project Gutenberg) that are similar to D.

🡪 DNM: Constructed from data published after Dtrain was collected.

4 of 13

Method

STEP1. Querying the model

STEP2. Normalization (NORMALIZE)

STEP3. Feature extraction (FEATAGG)

STEP4. Meta-classifier

5 of 13

STEP1. Querying the model

[Figure] Document D (N = document length in tokens) is split into chunks of at most C tokens (C = max_input), and the model is queried on each chunk.
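The querying step can be sketched as below. `query_lm` is a hypothetical stand-in for the auditor's query-only access (a toy uniform model here), and the chunking reflects the C = max_input limit from the slide:

```python
def chunk_tokens(tokens, C):
    """Split a tokenized document of length N into chunks of at most
    C tokens, where C is the model's max_input length."""
    return [tokens[i:i + C] for i in range(0, len(tokens), C)]

def token_probabilities(tokens, C, query_lm):
    """Query the LM chunk by chunk and collect, for every true token,
    the probability the model assigned to it. `query_lm` is a
    hypothetical stand-in for the auditor's query-only access."""
    probs = []
    for chunk in chunk_tokens(tokens, C):
        probs.extend(query_lm(chunk))
    return probs

# Toy stand-in for the real model: uniform over a 100-token vocabulary.
uniform_lm = lambda chunk: [1.0 / 100] * len(chunk)
doc = list(range(250))                          # a document with N = 250 tokens
p = token_probabilities(doc, 128, uniform_lm)   # C = 128 -> two chunks
```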

 

6 of 13

STEP2. Normalization (NORMALIZE)

  • The model is likely to assign a higher probability to common tokens, like "the" or "and."
  • In contrast, it tends to predict a lower probability for less frequent tokens.
  • To enhance membership inference, the output is normalized based on token rarity.

 

 

[Equations] Two ratio strategies: RatioNormTF divides the true token's probability by its token frequency (TF) from a reference corpus; RatioNormGP divides it by the token's general probability (GP) under the model.
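A minimal sketch of the ratio normalization, assuming it divides each true-token probability by a per-token reference value (token frequency for TF, general probability for GP); the numbers are illustrative:

```python
def ratio_norm(token_probs, reference, eps=1e-12):
    """Divide each true-token probability by a per-token reference value:
    token frequencies (RatioNormTF) or the model's general, context-free
    token probabilities (RatioNormGP). `eps` guards against zeros."""
    return [p / max(r, eps) for p, r in zip(token_probs, reference)]

probs = [0.30, 0.01]    # model's probability for a common token, then a rare one
tf    = [0.05, 0.0001]  # illustrative token frequencies from a reference corpus
scores = ratio_norm(probs, tf)  # the rare token's score is upweighted
```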

7 of 13

 

STEP2. Normalization (NORMALIZE)

[Equation] The model's probability for the true token given its context.

8 of 13

  • MaxNorm assumes that the gap between the true token's probability and the maximum predicted probability at that position is informative for membership inference.

🡪 MaxNorm is then combined with the ratio normalization strategies to obtain MaxNormTF and MaxNormGP.
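Under one plausible reading of the slide, MaxNorm takes the difference to the maximum predicted probability, and MaxNormTF rescales that difference by the token frequency (MaxNormGP analogously by the general probability). This is a sketch under those assumptions, not the paper's exact formulas:

```python
def max_norm(token_probs, max_probs):
    """MaxNorm: difference between the true token's probability and the
    highest probability the model predicted at that position."""
    return [p - m for p, m in zip(token_probs, max_probs)]

def max_norm_tf(token_probs, max_probs, token_freqs, eps=1e-12):
    """MaxNormTF: the MaxNorm difference rescaled by token frequency
    (an assumed combination; MaxNormGP would use general probabilities)."""
    return [(p - m) / max(f, eps)
            for p, m, f in zip(token_probs, max_probs, token_freqs)]

probs     = [0.20, 0.05]   # true-token probabilities
max_probs = [0.50, 0.50]   # top predicted probability at each position
tf        = [0.10, 0.001]  # illustrative token frequencies
mn  = max_norm(probs, max_probs)
mnt = max_norm_tf(probs, max_probs, tf)
```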

STEP2. Normalization (NORMALIZE)

 

 

9 of 13

STEP3. Feature extraction (FEATAGG)

Aggregate feature extractor (AggFE)

Histogram feature extractor (HistFE)
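The two extractors can be sketched as follows; the specific aggregate statistics and bin settings are assumptions for illustration:

```python
import statistics

def agg_fe(scores):
    """AggFE: summarize per-token scores with aggregate statistics
    (min, max, mean, stdev here; the exact set is an assumption)."""
    return [min(scores), max(scores),
            statistics.mean(scores), statistics.stdev(scores)]

def hist_fe(scores, bins, lo, hi):
    """HistFE: fraction of scores falling into each of `bins`
    equal-width histogram bins over [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for s in scores:
        i = min(int((s - lo) / width), bins - 1)
        counts[max(i, 0)] += 1
    return [c / len(scores) for c in counts]
```

Either extractor turns a variable-length list of per-token scores into a fixed-size feature vector for the meta-classifier.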

10 of 13

STEP4. Meta-classifier
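As a stand-in for the meta-classifier (the slide does not name the model, so a tiny pure-Python logistic regression is used here), trained on per-document feature vectors with membership labels:

```python
import math

def train_meta_classifier(X, y, lr=0.5, epochs=200):
    """Train a minimal logistic-regression meta-classifier on document-level
    feature vectors X with labels y (1 = member, 0 = non-member).
    A stand-in for whatever classifier the method actually uses."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1 / (1 + math.exp(-z))     # predicted membership probability
            g = p - yi                     # gradient of the log-loss
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b

def predict_membership(w, b, x):
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 / (1 + math.exp(-z))

# Toy features: member documents score high on the single feature.
X = [[0.9], [0.8], [0.1], [0.2]]
y = [1, 1, 0, 0]
w, b = train_meta_classifier(X, y)
```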

11 of 13

Evaluation setup

  • Models: OpenLLaMA
  • Datasets:
    • Books: Project Gutenberg (PG-19 for members, books added after 2019 for non-members)
    • Academic Papers: ArXiv (papers published before and after training data cutoff)
    • Member: Collected from data sources commonly used for language modeling
    • Non-member: Added after the period during which the training data was collected

  • Eval Metric
    • AUC
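AUC can be computed directly from member vs. non-member scores as the probability that a randomly chosen member document outscores a randomly chosen non-member:

```python
def auc(scores_members, scores_nonmembers):
    """AUC: probability that a random member score exceeds a random
    non-member score; ties count as 0.5."""
    wins = 0.0
    for m in scores_members:
        for n in scores_nonmembers:
            wins += 1.0 if m > n else (0.5 if m == n else 0.0)
    return wins / (len(scores_members) * len(scores_nonmembers))

members     = [0.9, 0.8]   # illustrative document-level scores
non_members = [0.1, 0.2]
score = auc(members, non_members)   # perfect separation
```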

12 of 13

Evaluation

ArXiv Paper Dataset

Books Dataset

13 of 13

Conclusion

  • Questions have been raised about the data used to train LLMs.

🡪 Yet model developers remain reluctant to disclose details about their training datasets.

  • This work proposes the task of document-level membership inference for LLMs and a methodology to achieve it.