Motivation
🡪 Developers are reluctant to reveal training data sources.
🡪 GPT-4: No dataset information released.
🡪 Meta: Shared dataset details for LLaMA, but refused to disclose them for LLaMA-2.
Threat Model (Auditing setup)
: Auditor has query-only access to LM and full access to its tokenizer T.
🡪 Can query LM with token sequence S and receive probability outputs for all tokens v in V.
: Auditor has access to member data DM and non-member data DNM, both drawn from the same distribution as D.
🡪 DM: Consists of publicly available datasets (e.g., Project Gutenberg) that are similar to D.
🡪 DNM: Auditor collects data released after the Dtrain cutoff, so it cannot have been used for training.
Method
STEP1. Querying the model
STEP2. Normalization (NORMALIZE)
STEP3. Feature extraction (FEATAGG)
STEP4. Meta-classifier
STEP1. Querying the model
🡪 Query the LM with the target document D in consecutive chunks, collecting the predicted probability of each token.
*N : document length (in tokens)
*C : max_input (maximum number of tokens the model accepts per query)
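The querying step above can be sketched as follows. This is a minimal illustration, not the paper's code: `lm` is assumed to be a callable that returns one predicted probability per input token, and the helper names `chunk_tokens` / `query_probabilities` are hypothetical.

```python
def chunk_tokens(tokens, max_input):
    """Split a token sequence of length N into consecutive chunks of at most
    max_input (= C) tokens, so every token is queried exactly once."""
    return [tokens[i:i + max_input] for i in range(0, len(tokens), max_input)]

def query_probabilities(lm, tokens, max_input):
    """Query the LM chunk by chunk and collect one probability per token.

    `lm` is an assumed interface: a callable mapping a token chunk to a list
    of per-token predicted probabilities (one float per token in the chunk).
    """
    probs = []
    for chunk in chunk_tokens(tokens, max_input):
        probs.extend(lm(chunk))
    return probs
```

With a dummy model `lm = lambda chunk: [0.5] * len(chunk)`, querying a 10-token document with `max_input=4` yields 10 probabilities from 3 queries.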
STEP2. Normalization (NORMALIZE)
🡪 TF (token frequency): normalize a token's predicted probability by the token's frequency in a reference corpus.
🡪 GP (general probability): normalize by the probability the LM assigns to the token given only a generic context.
🡪 Both compare the probability the LM assigns to a token given its document context against a document-independent baseline.
🡪 MaxNorm: relate each token's probability to the highest probability the model predicts at that position.
🡪 This is then combined with the ratio normalization strategies to get MaxNormTF and MaxNormGP.
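A sketch of these normalization strategies on numeric arrays. The exact formulas (simple ratios with an epsilon for stability) and the way MaxNorm is composed with the TF/GP baselines are assumptions for illustration; `p`, `tf`, `gp`, and `p_max` are hypothetical array names.

```python
import numpy as np

def ratio_norm(p, baseline, eps=1e-12):
    """Ratio normalization: the LM's in-context token probability `p`
    relative to a baseline (token frequency TF or general probability GP)."""
    return p / (baseline + eps)

def max_norm(p, p_max, eps=1e-12):
    """MaxNorm: probability relative to the most likely token at that
    position (`p_max` = max over the vocabulary of the predicted probability)."""
    return p / (p_max + eps)

def max_norm_ratio(p, p_max, baseline, eps=1e-12):
    """MaxNorm combined with a ratio baseline, standing in for
    MaxNormTF (baseline = tf) or MaxNormGP (baseline = gp)."""
    return max_norm(p, p_max, eps) / (baseline + eps)
```

For example, a token with in-context probability 0.2 and corpus frequency 0.1 gets a ratio score of about 2, i.e. the model finds it twice as likely as the baseline suggests.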
STEP3. Feature extraction (FEATAGG)
Aggregate feature extractor (AggFE)
Histogram feature extractor (HistFE)
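The two extractors above can be sketched as follows. The specific statistics in AggFE and the bin edges in HistFE are assumptions; the idea is that each turns a variable-length sequence of normalized token scores into a fixed-size feature vector.

```python
import numpy as np

def agg_fe(scores):
    """AggFE sketch: summarize per-token scores with aggregate statistics
    (here: min, max, mean, std, and 25th/50th/75th percentiles)."""
    s = np.asarray(scores, dtype=float)
    return np.array([s.min(), s.max(), s.mean(), s.std(),
                     *np.percentile(s, [25, 50, 75])])

def hist_fe(scores, bins):
    """HistFE sketch: fraction of the document's tokens falling into each
    histogram bin, using bin edges shared across all documents."""
    s = np.asarray(scores, dtype=float)
    counts, _ = np.histogram(s, bins=bins)
    return counts / max(len(s), 1)
```

Either feature vector (or their concatenation) has the same length for every document, regardless of N, which is what the meta-classifier in STEP 4 requires.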
STEP4. Meta-classifier
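To make STEP 4 concrete, here is a toy stand-in for the meta-classifier: a plain logistic regression fit by gradient descent on feature vectors from known member (label 1, from DM) and non-member (label 0, from DNM) documents. The choice of logistic regression and all hyperparameters are illustrative, not the paper's.

```python
import numpy as np

def train_meta_classifier(X, y, lr=0.1, epochs=500):
    """Fit logistic-regression weights on (features, membership-label) pairs."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        pred = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid
        grad = pred - y                            # logistic-loss gradient
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

def predict_membership(X, w, b):
    """Estimated probability that each document was in the training set."""
    z = np.asarray(X, dtype=float) @ w + b
    return 1.0 / (1.0 + np.exp(-z))
```

Trained on separable toy features (members near 1.0, non-members near 0.0), the classifier scores an unseen member-like document above 0.5 and a non-member-like one below it.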
Evaluation setup
Evaluation
ArXiv paper Dataset
Books Dataset
Conclusion
🡪 LLMs are trained on massive text corpora, but model developers are reluctant to disclose details about the training dataset.