1 of 15

[NeurIPS’23]

2 of 15

Background

[Figure: the model misclassifies "RRR movie has a great story and amazing visuals." as Negative instead of Positive. A corrective input, supplied via "in-context learning", explains: "The keywords 'great' and 'amazing' are important cues in predicting the sentiment of this sentence."]

3 of 15

Motivation

"In-context learning" requires human involvement, which challenges scalability.

→ Generate corrective rationales automatically instead.

4 of 15

Introduction

  • Proposes a novel framework, Amplifying Model Performance by Leveraging In-Context Learning with Post Hoc Explanations (AMPLIFY).
    • Leverages post hoc explanation methods, which output attribution scores that capture the influence of each input feature on model predictions.
    • Generates rationales that embed insights from post hoc explanations to provide corrective signals to LLMs.

  • Two challenges:
    • Calculating gradients for billion-parameter LLMs is computationally intensive.

→ Compute post hoc explanations using a smaller proxy model and then incorporate these explanations into prompts for larger language models.

    • Many LLMs are black boxes, limiting access to gradients and internal details.

→ Takes advantage of the accessibility of smaller models that are open source.

5 of 15

AMPLIFY

6 of 15

AMPLIFY

STEP 1. Proxy Model Selection

  • Pre-trained Model (e.g., GPT-2, BERT)
    • requires no additional computational cost as we directly use a pre-trained (potentially open-sourced) model
  • Fine-tuned Model (e.g., BERT + downstream task)

→ Neither kind of proxy model (a smaller model) performs well on reasoning tasks by itself.

7 of 15

AMPLIFY

STEP 2. Few-shot Sample Selection

  1. Identify instances from the validation set that are misclassified by the LLM (not by the proxy model).
  2. Select few-shot samples

→ The samples with the highest misclassification confidence score (MCS) represent the most egregious misclassifications.

*x : Input sequence

*y : Incorrect label

*ŷ : Ground truth label

*f : Fine-tuned LM
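As a minimal sketch (the helper names and example probabilities are hypothetical), the MCS used to rank misclassified samples can be read as the probability gap the model f assigns between the incorrect label y and the ground truth ŷ:

```python
def mcs(probs, y_incorrect, y_true):
    """Misclassification Confidence Score: how confidently the model
    prefers the incorrect label over the ground truth label.
    probs: mapping from label to predicted probability under f."""
    return probs[y_incorrect] - probs[y_true]

# Rank misclassified validation samples; the highest-MCS samples are
# the most egregious misclassifications and become few-shot examples.
samples = [
    {"x": "RRR movie has a great story and amazing visuals.",
     "probs": {"Positive": 0.2, "Negative": 0.8},
     "pred": "Negative", "gold": "Positive"},
]
ranked = sorted(samples,
                key=lambda s: mcs(s["probs"], s["pred"], s["gold"]),
                reverse=True)
```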

[Figure: the proxy model assigns the incorrect label Negative to x = "RRR movie has a great story and amazing visuals."; the ground truth is Positive.]

8 of 15

AMPLIFY

STEP 3. Rationale Generation

  • Use a post hoc explanation method to calculate the attribution scores for each token.
    • Attribution scores capture the influence of each token on the label predicted by the proxy model.

  • Filter the top-k words with the highest attribution scores.

→ Output the set of top-k words for the input sample.

[Figure: the proxy model processes "RRR movie has a great story and amazing visuals."; the explanation method outputs the top-k words {great, amazing, …}.]
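The top-k filtering step can be sketched as follows; the attribution scores below are illustrative placeholders, not actual gradient values from a proxy model:

```python
def top_k_rationale(tokens, attributions, k=5):
    """Return the k tokens with the highest attribution scores, as
    produced by a post hoc explanation method (e.g. Vanilla Gradients)."""
    ranked = sorted(zip(tokens, attributions), key=lambda t: t[1], reverse=True)
    return [tok for tok, _ in ranked[:k]]

tokens = ["RRR", "movie", "has", "a", "great", "story",
          "and", "amazing", "visuals", "."]
scores = [0.05, 0.02, 0.01, 0.01, 0.40, 0.08, 0.01, 0.35, 0.06, 0.01]
print(top_k_rationale(tokens, scores, k=2))  # → ['great', 'amazing']
```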

9 of 15

AMPLIFY

STEP 4. Prompt Design for LLMs

  • Construct the corrective rationale for each selected sample using the template.
    • Template: [Input][Rationale][Label]

[Figure: the few-shot example pairs the input "RRR movie has a great story and amazing visuals." with the corrective rationale "The key words: 'great' and 'amazing' are important clues to predict 'Positive' as the correct answer." and the label Positive; the prompt is completed with the test samples and sent to the model, which now predicts Positive instead of Negative.]
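A minimal sketch of assembling the [Input][Rationale][Label] template into a prompt; the helper names and the test sentence are hypothetical:

```python
RATIONALE_TEMPLATE = ("The key words: {keywords} are important clues "
                      "to predict '{label}' as the correct answer.")

def build_prompt(samples, test_input):
    """Render each selected sample as [Input][Rationale][Label],
    then append the test sample to complete the few-shot prompt."""
    blocks = []
    for s in samples:
        keywords = " and ".join(f"'{w}'" for w in s["top_k"])
        rationale = RATIONALE_TEMPLATE.format(keywords=keywords,
                                              label=s["label"])
        blocks.append(f"{s['input']}\n{rationale}\n{s['label']}")
    blocks.append(test_input)
    return "\n\n".join(blocks)

prompt = build_prompt(
    [{"input": "RRR movie has a great story and amazing visuals.",
      "top_k": ["great", "amazing"], "label": "Positive"}],
    "The plot was dull and the acting was wooden.")  # hypothetical test sample
```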

10 of 15

Experiment Setup

  • Datasets
    • BigBench-Hard benchmark
      • Snarks, Causal Judgment, Ruin Names, Formal Fallacies, Salient Translation Error Detection tasks
    • CommonsenseQA dataset, Coin Flip dataset

  • Implementation Details
    • Proxy model : GPT-2 + downstream task
    • k = 5 (top-k)

    • Models
      • GPT-3, 175B
      • GPT-3.5, 175B

    • Post Hoc Explanation Methods
      • Vanilla Gradients

11 of 15

Experiment Setup

  • Baseline
    • Post Hoc Explanation Methods
      • Vanilla Gradients (ours)
        • attributions = gradient of the predicted-label score w.r.t. token embeddings
      • Gradient x Input
        • attributions = gradient * token embedding
      • Contrastive explanations
        • attributions = predicted-label gradient − foil-label gradient
      • Contrastive explanations x Input
        • attributions = (predicted-label gradient − foil-label gradient) * token embedding

    • Methods (in-context learning)
      • Answer-Only (AO) prompts
        • few-shot prompts

      • Chain-of-Thought (CoT)
        • few-shot prompts + human-annotated rationales
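The four baseline attribution computations can be sketched as follows, assuming the gradients and token embeddings are already available as arrays; the sum reduction over the embedding dimension and the function name are assumptions for illustration:

```python
import numpy as np

def attributions(grad_pred, grad_foil, token_emb, method):
    """Token attribution scores for the four baseline explanation methods.
    grad_pred: gradient of the predicted-label score w.r.t. token embeddings
    grad_foil: gradient of the contrast (foil) label score
    token_emb: the token embeddings themselves
    All arrays have shape (num_tokens, embed_dim); per-token scores are
    obtained by summing over the embedding dimension (an assumption)."""
    if method == "vanilla":                 # Vanilla Gradients
        raw = grad_pred
    elif method == "grad_x_input":          # Gradient x Input
        raw = grad_pred * token_emb
    elif method == "contrastive":           # Contrastive explanations
        raw = grad_pred - grad_foil
    elif method == "contrastive_x_input":   # Contrastive x Input
        raw = (grad_pred - grad_foil) * token_emb
    else:
        raise ValueError(method)
    return raw.sum(axis=-1)
```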


12 of 15

Evaluation

Overall Task Performance

13 of 15

Evaluation

Impact of Proxy Model Selection on LLM Performance

*E: proxy model fine-tuning epoch

14 of 15

Evaluation

Impact of Selection Strategies on LLM Performance

  • Experiment with four selection strategies
    • Random, H-MCS (highest MCS), L-MCS (lowest MCS), F-Exp

15 of 15

Evaluation

Impact of Post Hoc Explanation Method on LLM Performance