ANALYSIS
- SelfElicit (SE) can accurately locate evidence sentences across models and datasets, reaching 80-95 AUROC/NDCG in most cases (see the scoring sketch after this list).
- SelfElicit is robust to noisy context: it selects less text as evidence when the context contains more distracting information.
- SelfElicit helps the LM focus on the important evidence across various task types (e.g., true/false, comparison, fact retrieval, multi-hop reasoning).
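A minimal sketch of how such sentence-level evidence localization could be scored with AUROC/NDCG; the per-sentence scores and labels below are made-up illustrations, not the paper's data or evaluation code.

```python
# Illustrative only: hypothetical per-sentence scores and binary evidence labels,
# scored with scikit-learn's AUROC and NDCG.
import numpy as np
from sklearn.metrics import ndcg_score, roc_auc_score

sentence_scores = np.array([0.91, 0.12, 0.75, 0.08, 0.33])  # e.g., deep-layer attention per sentence
evidence_labels = np.array([1, 0, 1, 0, 0])                 # ground-truth evidence sentences

auroc = roc_auc_score(evidence_labels, sentence_scores)
ndcg = ndcg_score(evidence_labels[None, :], sentence_scores[None, :])
print(f"AUROC = {auroc:.3f}, NDCG = {ndcg:.3f}")
```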
MAIN RESULTS
- SelfElicit effectively boosts 7B-70B models from different families (Llama, Mistral, Qwen) on multiple context-based QA tasks.
SelfElicit: Your Language Model Secretly Knows Where is the Relevant Evidence
Zhining Liu†, Rana Ali Amjad‡, Ravinarayana Adkathimar‡, Tianxin Wei†, Hanghang Tong†
†University of Illinois at Urbana-Champaign  ‡Amazon Science
Contact: liu326@illinois.edu
MOTIVATION
- Recent studies have found that LMs often struggle to fully comprehend and utilize key evidence from the context, especially when it contains noise and irrelevant information: an issue common in real-world scenarios.
- Question: Can we help LMs focus on the right evidence at inference time, without retraining?
FINDING
- Your LM Secretly Knows the Evidence (but may fail to use it)
- We inspect the layer-wise attention on evidence vs. non-evidence sentences (a minimal inspection sketch follows this list).
- Attention in mid-to-deep layers (e.g., the later 50%) consistently focuses more on the true evidence sentences, even when the LM answers incorrectly.
- This holds across different model families (Llama3.1, Mistral, Qwen2.5).
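A minimal sketch of this layer-wise inspection, assuming a HuggingFace causal LM that exposes attention weights; the model name, prompt, and the token index ranges for evidence / non-evidence sentences are illustrative assumptions, not the paper's setup.

```python
# Compare average attention to evidence vs. non-evidence context tokens, per layer.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-3.1-8B-Instruct"  # any causal LM with attention outputs
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, attn_implementation="eager")

prompt = ("Context: Paris is the capital of France. Bananas are yellow.\n"
          "Question: What is the capital of France?\nAnswer:")
inputs = tok(prompt, return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_attentions=True)

# out.attentions: one (batch, heads, seq, seq) tensor per layer. Take attention
# from the last position (which emits the first answer token) to every input token.
last_pos_attn = torch.stack(out.attentions)[:, 0, :, -1, :]  # (layers, heads, seq)
per_layer = last_pos_attn.mean(dim=1)                        # average over heads

# Hypothetical token index ranges for the evidence / non-evidence sentences;
# in practice these come from mapping sentence boundaries to token positions.
evidence_idx = list(range(2, 10))
non_evidence_idx = list(range(10, 16))
for layer, attn in enumerate(per_layer):
    ev = attn[evidence_idx].mean().item()
    nev = attn[non_evidence_idx].mean().item()
    print(f"layer {layer:2d}  evidence={ev:.4f}  non-evidence={nev:.4f}")
```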
PRACTICAL IMPLICATIONS
- Mid-to-deep layers play an important role in extracting and utilizing contextual information, but their influence may not propagate to the final output. Our findings echo existing works showing that early exiting before the final layers can sometimes improve factuality.
- DoLa [1] finds that the decoding probability of factually correct tokens increases quickly in intermediate layers, whereas incorrect tokens can maintain consistently high probabilities across all layers and still be selected as the final output (a simple layer-probing sketch follows this list).
- Our finding also suggests that the final layers do not prioritize contextual evidence as strongly as the mid layers do. This echoes existing work on "overthinking" [2], which finds that removing the last few layers can sometimes improve factual accuracy, implying that deeper layers may introduce unnecessary complexity or spurious reasoning.
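A logit-lens style probe (a simplification for illustration, not DoLa's contrastive decoding) sketching how a candidate token's decoding probability could be tracked across layers; the model name, probed token, and the `model.model.norm` / `model.lm_head` attributes are architecture-dependent assumptions.

```python
# Decode each intermediate hidden state with the final norm + LM head and
# watch how the probability of a candidate token evolves across layers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-0.5B-Instruct"  # any causal LM exposing hidden states
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "The capital of France is"
inputs = tok(prompt, return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

candidate_id = tok(" Paris", add_special_tokens=False).input_ids[0]
final_norm = model.model.norm  # final RMSNorm (attribute name is architecture-dependent)
lm_head = model.lm_head

# out.hidden_states: (num_layers + 1) tensors of shape (batch, seq, hidden);
# index 0 is the embedding output, so start from layer 1.
for layer, h in enumerate(out.hidden_states[1:], start=1):
    logits = lm_head(final_norm(h[:, -1, :]))  # decode directly from this layer
    prob = torch.softmax(logits, dim=-1)[0, candidate_id].item()
    print(f"layer {layer:2d}  P(' Paris') = {prob:.4f}")
```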
[1] Chuang, Yung-Sung, et al. "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models." The Twelfth International Conference on Learning Representations. 2024.
[2] Halawi, Danny, et al. "Overthinking the Truth: Understanding How Language Models Process False Demonstrations." arXiv preprint arXiv:2307.09476 (2023).
ACKNOWLEDGEMENTS & RESOURCES
METHODOLOGY: SELFELICIT
- We find that the high attention to evidence in deep layers appears as early as the generation of the very first token.
- Based on this, SelfElicit does the following (a minimal sketch follows this list):
- Extract the deep layers' attention over the input context.
- Compute the average attention per token for each sentence in the context document.
- Identify evidence sentences with high deep-layer attention and highlight them with text markers.
- Perform the actual QA with the highlighted context document.
- SelfElicit is also efficient because:
- It does not require generating many additional tokens (e.g., explicit evidence statements or chain-of-thought).
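A minimal sketch of this pipeline, assuming a HuggingFace causal LM; the period-based sentence splitting, the "deeper 50% of layers" range, the relative selection threshold, and the `<evidence>` markers are illustrative choices rather than the paper's exact settings.

```python
# Score each context sentence by its average deep-layer attention per token,
# then highlight high-scoring sentences before the actual QA pass.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-0.5B-Instruct"  # any causal LM with attention outputs
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, attn_implementation="eager")


def elicit_evidence(context: str, question: str, threshold: float = 0.5) -> str:
    sentences = [s.strip() + "." for s in context.split(".") if s.strip()]
    prompt = f"Context: {context}\nQuestion: {question}\nAnswer:"
    enc = tok(prompt, return_tensors="pt", return_offsets_mapping=True)
    offsets = enc.pop("offset_mapping")[0]  # (seq, 2) character spans per token
    with torch.no_grad():
        out = model(**enc, output_attentions=True)

    # Attention from the last position (first generated token) to the input,
    # averaged over heads and over the deeper half of the layers.
    attn = torch.stack(out.attentions)[:, 0, :, -1, :].mean(dim=1)  # (layers, seq)
    deep_attn = attn[attn.shape[0] // 2 :].mean(dim=0)              # (seq,)

    # Average attention per token for each context sentence.
    scores = []
    for sent in sentences:
        start = prompt.find(sent)
        end = start + len(sent)
        idx = [i for i, (a, b) in enumerate(offsets.tolist()) if a >= start and b <= end]
        scores.append(deep_attn[idx].mean().item() if idx else 0.0)

    # Highlight sentences whose score exceeds a fraction of the maximum score.
    cutoff = threshold * max(scores)
    highlighted = " ".join(
        f"<evidence>{s}</evidence>" if sc >= cutoff else s
        for s, sc in zip(sentences, scores)
    )
    return f"Context: {highlighted}\nQuestion: {question}\nAnswer:"
```

The returned highlighted prompt is then passed back to the same model to generate the final answer, matching the "QA with the highlighted context document" step above.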