1 of 13

2 of 13

Motivation

  • Key Issues with LLM Embedding in Daily Life
    • Data Sources: Concerns about the data LLMs are trained on and the presence of societal biases.
    • Misinformation Risk: Potential for amplifying biases and spreading disinformation.
    • Ownership & Copyright: Questions about who owns the knowledge generated by LLMs; copyright disputes from content creators over datasets like Books3 (e.g., books, songs, news articles).
    • Legal Actions: Lawsuits claim LLMs are trained on pirated content, sparking debates over copyright and fair use.

🡪 Developers are reluctant to reveal training data sources.

🡪 GPT-4: No dataset information released.

🡪 Meta: Shared dataset details for LLaMA, but refused to disclose them for LLaMA-2.

https://medium.com/@kevaldekivadiya2415/fine-tuning-llama-2-your-path-to-chemistry-text-perfection-aa4c54ff5790

3 of 13

Threat Model (Auditing setup)

  • Auditor assumptions:
    • The LM is trained on a dataset Dtrain.
    • The auditor infers whether a given document D ∈ Dtrain.
    • Access: the auditor has query-only access to the LM and full access to its tokenizer T.

🡪 The auditor can query the LM with a token sequence S and receive probability outputs for every token v in the vocabulary V.

    • Document sets: the auditor has access to DM and DNM, both drawn from the same distribution as D.

🡪 DM: Consists of publicly available datasets (e.g., Project Gutenberg) that are similar to D.

🡪 DNM: Constructed from data published after Dtrain was collected.

4 of 13

Method

STEP1. Querying the model

STEP2. Normalization (NORMALIZE)

STEP3. Feature extraction (FEATAGG)

STEP4. Meta-classifier

5 of 13

STEP1. Querying the model

[Figure] Document D (N = document length in tokens) is split into chunks of at most C tokens (C = max_input), and the model is queried on each chunk.
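The querying step can be sketched as below. `query_lm` is a hypothetical stand-in for the auditor's query-only access (a toy uniform model here), and the chunking reflects the C = max_input limit from the slide:

```python
def chunk_tokens(tokens, C):
    """Split a tokenized document of length N into chunks of at most
    C tokens, where C is the model's max_input length."""
    return [tokens[i:i + C] for i in range(0, len(tokens), C)]

def token_probabilities(tokens, C, query_lm):
    """Query the LM chunk by chunk and collect, for every true token,
    the probability the model assigned to it. `query_lm` is a
    hypothetical stand-in for the auditor's query-only access."""
    probs = []
    for chunk in chunk_tokens(tokens, C):
        probs.extend(query_lm(chunk))
    return probs

# Toy stand-in for the real model: uniform over a 100-token vocabulary.
uniform_lm = lambda chunk: [1.0 / 100] * len(chunk)
doc = list(range(250))                          # a document with N = 250 tokens
p = token_probabilities(doc, 128, uniform_lm)   # C = 128 -> two chunks
```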

 

6 of 13

STEP2. Normalization (NORMALIZE)

  • The model is likely to assign a higher probability to common tokens, like "the" or "and."
  • In contrast, it tends to predict a lower probability for less frequent tokens.
  • To enhance membership inference, the output is normalized based on token rarity.

 

 

[Equations] Two ratio strategies: RatioNormTF divides the true token's probability by its token frequency (TF) from a reference corpus; RatioNormGP divides it by the token's general probability (GP) under the model.
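A minimal sketch of the ratio normalization, assuming it divides each true-token probability by a per-token reference value (token frequency for TF, general probability for GP); the numbers are illustrative:

```python
def ratio_norm(token_probs, reference, eps=1e-12):
    """Divide each true-token probability by a per-token reference value:
    token frequencies (RatioNormTF) or the model's general, context-free
    token probabilities (RatioNormGP). `eps` guards against zeros."""
    return [p / max(r, eps) for p, r in zip(token_probs, reference)]

probs = [0.30, 0.01]    # model's probability for a common token, then a rare one
tf    = [0.05, 0.0001]  # illustrative token frequencies from a reference corpus
scores = ratio_norm(probs, tf)  # the rare token's score is upweighted
```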

7 of 13

 

STEP2. Normalization (NORMALIZE)

[Equation] The model's probability for the true token given its context.

8 of 13

  • MaxNorm assumes that the gap between the true token's probability and the maximum predicted probability at that position is informative for membership inference.

🡪 MaxNorm is then combined with the ratio normalization strategies to obtain MaxNormTF and MaxNormGP.
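Under one plausible reading of the slide, MaxNorm takes the difference to the maximum predicted probability, and MaxNormTF rescales that difference by the token frequency (MaxNormGP analogously by the general probability). This is a sketch under those assumptions, not the paper's exact formulas:

```python
def max_norm(token_probs, max_probs):
    """MaxNorm: difference between the true token's probability and the
    highest probability the model predicted at that position."""
    return [p - m for p, m in zip(token_probs, max_probs)]

def max_norm_tf(token_probs, max_probs, token_freqs, eps=1e-12):
    """MaxNormTF: the MaxNorm difference rescaled by token frequency
    (an assumed combination; MaxNormGP would use general probabilities)."""
    return [(p - m) / max(f, eps)
            for p, m, f in zip(token_probs, max_probs, token_freqs)]

probs     = [0.20, 0.05]   # true-token probabilities
max_probs = [0.50, 0.50]   # top predicted probability at each position
tf        = [0.10, 0.001]  # illustrative token frequencies
mn  = max_norm(probs, max_probs)
mnt = max_norm_tf(probs, max_probs, tf)
```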

STEP2. Normalization (NORMALIZE)

 

 

9 of 13

STEP3. Feature extraction (FEATAGG)

Aggregate feature extractor (AggFE)

Histogram feature extractor (HistFE)
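The two extractors can be sketched as follows; the specific aggregate statistics and bin settings are assumptions for illustration:

```python
import statistics

def agg_fe(scores):
    """AggFE: summarize per-token scores with aggregate statistics
    (min, max, mean, stdev here; the exact set is an assumption)."""
    return [min(scores), max(scores),
            statistics.mean(scores), statistics.stdev(scores)]

def hist_fe(scores, bins, lo, hi):
    """HistFE: fraction of scores falling into each of `bins`
    equal-width histogram bins over [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for s in scores:
        i = min(int((s - lo) / width), bins - 1)
        counts[max(i, 0)] += 1
    return [c / len(scores) for c in counts]
```

Either extractor turns a variable-length list of per-token scores into a fixed-size feature vector for the meta-classifier.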

10 of 13

STEP4. Meta-classifier
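As a stand-in for the meta-classifier (the slide does not name the model, so a tiny pure-Python logistic regression is used here), trained on per-document feature vectors with membership labels:

```python
import math

def train_meta_classifier(X, y, lr=0.5, epochs=200):
    """Train a minimal logistic-regression meta-classifier on document-level
    feature vectors X with labels y (1 = member, 0 = non-member).
    A stand-in for whatever classifier the method actually uses."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1 / (1 + math.exp(-z))     # predicted membership probability
            g = p - yi                     # gradient of the log-loss
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b

def predict_membership(w, b, x):
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 / (1 + math.exp(-z))

# Toy features: member documents score high on the single feature.
X = [[0.9], [0.8], [0.1], [0.2]]
y = [1, 1, 0, 0]
w, b = train_meta_classifier(X, y)
```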

11 of 13

Evaluation setup

  • Models: OpenLLaMA
  • Datasets:
    • Books: Project Gutenberg (PG-19 for members, books added after 2019 for non-members)
    • Academic Papers: ArXiv (papers published before and after training data cutoff)
    • Member: Collected from data sources commonly used for language modeling
    • Non-member: Added after the period during which the training data was collected

  • Eval Metric
    • AUC
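AUC can be computed directly from member vs. non-member scores as the probability that a randomly chosen member document outscores a randomly chosen non-member:

```python
def auc(scores_members, scores_nonmembers):
    """AUC: probability that a random member score exceeds a random
    non-member score; ties count as 0.5."""
    wins = 0.0
    for m in scores_members:
        for n in scores_nonmembers:
            wins += 1.0 if m > n else (0.5 if m == n else 0.0)
    return wins / (len(scores_members) * len(scores_nonmembers))

members     = [0.9, 0.8]   # illustrative document-level scores
non_members = [0.1, 0.2]
score = auc(members, non_members)   # perfect separation
```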

12 of 13

Evaluation

ArXiv Paper Dataset

Books Dataset

13 of 13

Conclusion

  • Questions have been raised about the data used to train LLMs.

🡪 Yet model developers remain reluctant to disclose details about their training datasets.

  • This work proposes the task of document-level membership inference for LLMs and a methodology to achieve it.