[NeurIPS’23]
Background
Example: “RRR movie has a great story and amazing visuals.” → Model → Positive / Negative
Corrective input: “The keywords ’great’ and ’amazing’ are important cues in predicting the sentiment of this sentence.”
“in-context learning”
Motivation
→ “In-context learning” with such explanations requires human involvement to write them.
→ This raises scalability challenges.
→ Goal: generate the explanations automatically.
Introduction
→ Compute post hoc explanations using a smaller proxy model and then incorporate these explanations into prompts for larger language models.
→ Takes advantage of the accessibility of smaller, open-source models.
AMPLIFY
AMPLIFY
STEP 1. Proxy Model Selection
→ The proxy models (smaller models) do not perform well on the reasoning tasks by themselves.
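A minimal sketch of instantiating a proxy, assuming a Hugging Face Transformers setup; the checkpoint named below is only an illustrative small open-source classifier, not necessarily the one used in the paper.

from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Illustrative checkpoint: any small, open-source classifier fine-tuned on the
# task can serve as the proxy; this SST-2 sentiment model is just an example.
PROXY_NAME = "distilbert-base-uncased-finetuned-sst-2-english"

tokenizer = AutoTokenizer.from_pretrained(PROXY_NAME)
proxy_model = AutoModelForSequenceClassification.from_pretrained(PROXY_NAME)
proxy_model.eval()  # the proxy is only queried (and differentiated), not trained further here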
AMPLIFY
STEP 2. Few-shot Sample Selection
→ The samples with the highest misclassification confidence score (MCS) represent the most egregious misclassifications and are selected as the few-shot examples (see the sketch below).
*x : Input sequence
*y : Incorrect label
*ŷ : Ground truth label
*f : Fine-tuned LM
x = { RRR movie has a great story and amazing visuals. }
Proxy model → Positive / Negative
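A minimal sketch of the selection step, assuming MCS is the confidence f assigns to its incorrect prediction y minus the confidence it assigns to the ground-truth label ŷ (in the notation above, roughly f(x)_y − f(x)_ŷ); proxy_model and tokenizer come from the Step 1 sketch, and the function names are illustrative.

import torch

def mcs(logits: torch.Tensor, true_label: int) -> float:
    """Misclassification confidence score for one sample: confidence in the
    predicted (incorrect) label minus confidence in the ground-truth label
    (assumed definition)."""
    probs = torch.softmax(logits, dim=-1)
    predicted = int(torch.argmax(probs))
    return float(probs[predicted] - probs[true_label])

def select_few_shot(samples, model, tokenizer, k=4):
    """Keep the k misclassified samples with the highest MCS."""
    scored = []
    for text, true_label in samples:
        enc = tokenizer(text, return_tensors="pt")
        with torch.no_grad():
            logits = model(**enc).logits[0]
        if int(torch.argmax(logits)) != true_label:   # misclassified samples only
            scored.append((mcs(logits, true_label), text, true_label))
    scored.sort(key=lambda t: t[0], reverse=True)     # highest MCS first
    return [(text, label) for _, text, label in scored[:k]]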
AMPLIFY
STEP 3. Rationale Generation
→ Run a post hoc explanation method on the proxy model and output the set of top-k words (highest attribution) for each selected input sample (see the sketch below).
“RRR movie has a great story and amazing visuals.”
Proxy model → {great, amazing, …} (top-k words)
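A minimal sketch of rationale generation using a simple gradient × input attribution over the proxy model’s token embeddings; the paper compares several post hoc explanation methods (see the Evaluation), so this particular scoring rule and the function name are illustrative.

import torch

def top_k_rationale(model, tokenizer, text, label, k=5):
    """Score each token with a gradient x input attribution toward `label`
    and return the k highest-scoring tokens as the rationale keywords."""
    enc = tokenizer(text, return_tensors="pt")
    # Embed the tokens explicitly so gradients can be taken w.r.t. the embeddings.
    embeds = model.get_input_embeddings()(enc["input_ids"]).detach()
    embeds.requires_grad_(True)
    logits = model(inputs_embeds=embeds,
                   attention_mask=enc["attention_mask"]).logits
    logits[0, label].backward()
    # Gradient x input, summed over the embedding dimension -> one score per token.
    scores = (embeds.grad * embeds).sum(dim=-1).squeeze(0)
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist())
    ranked = sorted(zip(tokens, scores.tolist()), key=lambda t: -t[1])
    keywords = [tok for tok, _ in ranked if tok not in tokenizer.all_special_tokens]
    return keywords[:k]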
AMPLIFY
STEP 4. Prompt Design for LLMs
"The key words: ’great’ and ’amazing’ are important clues to predict ‘Positive’ as the correct answer."
“RRR movie has a great story and amazing visuals.”
Positive
Negative
corrective input
Model
+ {test samples}
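A minimal sketch of assembling the prompt, assuming each few-shot example is laid out as input sentence, rationale built from its top-k keywords, and answer, followed by the test sample for the large LM to complete; the helper name and the extra test sentence in the usage example are illustrative.

def build_amplify_prompt(few_shot_with_rationales, test_text):
    """few_shot_with_rationales: list of (text, label_name, keywords) tuples."""
    parts = []
    for text, label_name, keywords in few_shot_with_rationales:
        quoted = " and ".join(f"'{w}'" for w in keywords)
        parts.append(text)
        parts.append(f"The key words: {quoted} are important clues to "
                     f"predict '{label_name}' as the correct answer.")
        parts.append(f"Answer: {label_name}")
        parts.append("")                     # blank line between examples
    parts.append(test_text)
    parts.append("Answer:")                  # the large LM completes this line
    return "\n".join(parts)

# Usage with the running example from the slides (the test sentence is made up):
prompt = build_amplify_prompt(
    [("RRR movie has a great story and amazing visuals.",
      "Positive", ["great", "amazing"])],
    "The plot was dull and the acting was worse.")
print(prompt)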
Experiment Setup
→ Tasks: Formal Fallacies, Salient Translation Error Detection
→ Datasets: CommonsenseQA, Coin Flip
Experiment Setup
* Post hoc explanation method: gradient-based attribution
Evaluation
Overall Task Performance
Evaluation
Impact of Proxy Model Selection on LLM Performance
*E: number of fine-tuning epochs for the proxy model
Evaluation
Impact of Selection Strategies on LLM Performance
Evaluation
Impact of Post Hoc Explanation Method on LLM Performance