1 of 29

Building with Instruction-Tuned LLMs

A Step-by-Step Guide, by Chris Alexiuk and Greg Loughnane

2 of 29

A Motivating Example 🟠 🟢 ✈️!

Instruction

Identify the odd one out and explain your choice.

Input

Orange, Green, Airplane.


3 of 29

Identifying the odd one out 🟠 🟢 ✈️!

Instruction-Tuned Model

Response:

✈️ Airplane is the odd one out.

Explanation:

  • Airplane has nothing to do with the color spectrum.
  • Airplane uses aerodynamics and other techniques to fly in the air and not on the ground.
  • Airplane is commonly made of metal and other durable materials for survival and durability.

Non-Instruct-Tuned (Base) Model

Response:

🟠 Orange is the odd one out.

Explanation:

Orange is the odd one out because it is the only one that is not a plane.


4 of 29

Outline


LLM Context

Supervised Fine-Tuning, a.k.a. Instruction-Tuning

D1: Instruction-Tuning

OpenLLaMA and Dolly 15k with QLoRA

D2: Fine-Tuning Input/Output Schema

BLOOMZ with PEFT-LoRA

Conclusions


5 of 29

LLM Context

6 of 29

Generative Pre-Trained Transformers (GPTs)

Built on a foundation of:

  • Unsupervised pre-training
    • Web pages
  • Supervised fine-tuning
    • Benchmarks
  • More compute + data


7 of 29

Adapting Large Language Models (LLMs) to New Tasks

Zero-shot learning

  • prompting

Few-shot learning

  • prompting with examples (see the sketch below)

Fine-Tuning

  • dozens or fewer examples
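
To make the distinction concrete, here is a minimal sketch of zero-shot vs. few-shot prompting with the Hugging Face transformers pipeline. The model id is an assumption; any causal LM (instruction-tuned or not) can be swapped in.

```python
# A minimal sketch of zero-shot vs. few-shot prompting.
# Assumption: the "openlm-research/open_llama_7b" checkpoint is used; any
# causal LM works here.
from transformers import pipeline

generator = pipeline("text-generation", model="openlm-research/open_llama_7b")

# Zero-shot: just ask.
zero_shot = "Identify the odd one out: Orange, Green, Airplane.\nAnswer:"

# Few-shot: prepend worked examples so the model can imitate the pattern.
few_shot = (
    "Identify the odd one out: Apple, Banana, Car.\nAnswer: Car\n\n"
    "Identify the odd one out: Dog, Cat, Table.\nAnswer: Table\n\n"
    "Identify the odd one out: Orange, Green, Airplane.\nAnswer:"
)

for prompt in (zero_shot, few_shot):
    print(generator(prompt, max_new_tokens=32)[0]["generated_text"])
```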


https://www.deeplearning.ai/the-batch/issue-198/


8 of 29

Prompting (a.k.a. Prompt Engineering)

Whether zero-shot or few-shot:

  • We’re exploring the space of what’s possible in the LLM

  • Benefits: cheap, quick


A 3D visualization of latent space in the style of Salvador Dali

https://www.deeplearning.ai/the-batch/issue-197/


9 of 29

What task should our LLM have superpowers on?

Large Language Models

  • Unsupervised pre-training
    • Web pages
  • Supervised fine-tuning
    • Benchmarks
  • Lots of public data

Your Models & GenAI Apps

  • Fine-tuning
    • Specific task
  • Your data!


10 of 29

Instruction-Tuning vs. “Fine-Tuning”

Large Language Models

  • Unsupervised pre-training
    • Web pages
  • Supervised fine-tuning
    • Benchmarks
  • Lots of public data

Your Models & GenAI Apps

  • Fine-tuning
    • Specific task
  • Your data!

  • Instruction-Tuning
    • Following instructions
    • New benchmarks: Bias, Toxicity, etc.

  • Fine-tuning of input-output schema
    • How a user interacts with our app!


11 of 29

Instruction-Tuning (a.k.a. Instruct-Tuning)

Large Language Models

  • Unsupervised pre-training
    • Web pages
  • Supervised fine-tuning
    • Benchmarks
  • Lots of public data

  • Instruction-Tuning
    • Following instructions
    • New benchmarks: Bias, Toxicity, etc.


Instruction Tuning enhances Supervised Fine-Tuning!

https://openai.com/research/instruction-following


12 of 29

Instruction Tuning IS Fine-Tuning

  • Instruct-Tuning is a subset of Fine-Tuning

  • Focused on alignment with humans
  • Concerned with following “instructions” like a human would


Diagram: Instruction-Tuning shown as a subset within all possible Fine-Tuning


13 of 29

Instruct-Tuning, Step-by-Step

Dolly 15k, OpenLLaMA, QLoRA

14 of 29

Dolly 15k Dataset

  • Contains 15,000 high-quality human-generated prompt-response pairs

  • Specifically designed for instruction tuning LLMs

  • Dataset Structure (see the loading sketch after this list)
    • Instruction
    • Context
    • Response
    • Category

  • Dolly 2.0 & Dolly 15k can be used commercially

Categories of Instructions

  • Creative Writing
  • Closed QA
  • Open QA
  • Summarization
  • Information Extraction
  • Classification
  • Brainstorming
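
As a rough illustration of how this dataset is consumed, here is a hedged sketch of loading Dolly 15k from the Hugging Face Hub and flattening its four fields into a single training string. The prompt template is an assumption, not the exact one used in the notebook.

```python
# Load Dolly 15k and flatten instruction/context/response into one prompt.
# The "### Instruction / ### Response" template is an assumption.
from datasets import load_dataset

dolly = load_dataset("databricks/databricks-dolly-15k", split="train")

def to_prompt(example):
    context = f"\nContext: {example['context']}" if example["context"] else ""
    text = (
        f"### Instruction:\n{example['instruction']}{context}\n\n"
        f"### Response:\n{example['response']}"
    )
    return {"text": text}

dolly = dolly.map(to_prompt)
print(dolly[0]["text"])      # one flattened training example
print(dolly[0]["category"])  # e.g. "open_qa", "brainstorming", ...
```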


15 of 29

OpenLLaMA

  • A reproduction of Meta AI’s LLaMA (Large Language Model Meta AI)

  • May 22 update: a 700B-token checkpoint of the OpenLLaMA 7B model
    • “We expect the full 1T token training run to finish at the end of this week”

  • Trained on RedPajama Dataset

  • Apache 2.0 License - can be used commercially

By OpenLM Research (UC Berkeley)


16 of 29

QLoRA: Efficient Fine-Tuning of Quantized LLMs

  • 💡 The Big Idea: Downstream tasks have low intrinsic dimensionality
    • When fine-tuning, you can get away with a lot less compute!

  • Leverages: 4-bit NormalFloat (NF4) quantization, double quantization, and paged optimizers (see the sketch below)

  • QLoRA is an improvement on LoRA: Low-Rank Adaptation of Large Language Models
    • PEFT-LoRA methods have become a standard for fine-tuning LLMs
    • QLoRA paper submitted May 23, 2023
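
A minimal sketch of what that looks like with bitsandbytes and PEFT: load the base model in 4-bit NF4, then attach a small LoRA adapter so only the low-rank matrices are trained. The model id and LoRA hyperparameters are assumptions, not the notebook's exact settings.

```python
# QLoRA-style setup: 4-bit NF4 base model + trainable LoRA adapter.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",            # 4-bit NormalFloat
    bnb_4bit_use_double_quant=True,       # quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "openlm-research/open_llama_7b",      # assumed base model
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # assumed target modules
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()        # only the adapters are trainable
```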


17 of 29

Let’s check it out!

18 of 29

Important Notes!

Training

  • Supervised Fine-Tuning
  • 15,000 data points, 5,000 Steps
  • Cost: 75 Google Colab Compute Units
  • 4-bit Quantization (QLoRA)
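
One way to reproduce a run like this (a sketch only, using TRL's SFTTrainer as of mid-2023; batch size and learning rate are assumptions, and `model`, `dolly`, and `lora_config` come from the earlier sketches):

```python
# Supervised fine-tuning run roughly matching the numbers on this slide.
from transformers import TrainingArguments
from trl import SFTTrainer

args = TrainingArguments(
    output_dir="open-llama-dolly-qlora",
    max_steps=5000,                    # step count from this slide
    per_device_train_batch_size=4,     # assumption
    learning_rate=2e-4,                # assumption
    logging_steps=50,
)

trainer = SFTTrainer(
    model=model,                       # 4-bit model from the QLoRA sketch
    train_dataset=dolly,               # flattened Dolly 15k prompts
    dataset_text_field="text",
    peft_config=lora_config,
    args=args,
)
trainer.train()
```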

Rule of Thumb, Instruct-Tuning

  • Pick instruction-tuned models off the shelf when building GenAI apps!


19 of 29

Recall…

20 of 29

Instruction-Tuning vs. “Fine-Tuning”

Large Language Models

  • Unsupervised pre-training
    • Web pages
  • Supervised fine-tuning
    • Benchmarks
  • Lots of public data

Your Models & GenAI Apps

  • Fine-tuning
    • Specific task
  • Your data!

  • Instruction-Tuning
    • Following instructions
    • New benchmarks: Bias, Toxicity, etc.

  • Fine-tuning of input-output schema
    • How a user interacts with our app!


21 of 29

Fine-Tuning the “Structure” or “Input-Output Schema”

Your Models & GenAI Apps

  • Fine-tuning
    • Specific task
  • Your data!

  • Fine-tuning of input-output schema
    • How a user interacts with our app!
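
In practice, "fine-tuning the input-output schema" often means committing to a single prompt template that defines how users interact with the app. A hypothetical sketch (the delimiters and field names below are assumptions, not the walkthrough's exact template):

```python
# Hypothetical I/O schema for a marketing-assistant app: the user supplies a
# product description; the model is trained to always answer with an email.
def to_training_text(product_description: str, marketing_email: str) -> str:
    return (
        "### Product:\n"
        f"{product_description}\n\n"
        "### Marketing Email:\n"
        f"{marketing_email}"
    )

# At inference time the user fills in only the first half of the schema:
def to_inference_prompt(product_description: str) -> str:
    return f"### Product:\n{product_description}\n\n### Marketing Email:\n"
```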


22 of 29

Fine-Tuning, Step-by-Step

BLOOMZ + PEFT-LoRA Fine-Tuning of an AI Marketing Assistant

23 of 29

  • Problem: Writing direct email marketing copy is tedious and should be streamlined.

  • Proposed LLM Solution: A fine-tuned AI marketing assistant can generate marketing emails for products in one click!

  • 💰 Value: Produce high-quality marketing emails in the same voice and tone as previous company messages, without needing to spend on marketing experts or copywriters.

  • 🤖 Data
    • Data was generated synthetically with GPT-4
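
A sketch of how such synthetic pairs could be generated through the OpenAI API; the system prompt, product list, and model settings here are illustrative assumptions, not the prompts used to build the actual dataset.

```python
# Generate synthetic (product, email) training pairs with GPT-4.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

products = ["noise-cancelling headphones", "cold-brew coffee maker"]
pairs = []
for product in products:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "You write short, upbeat marketing emails."},
            {"role": "user", "content": f"Write a marketing email for: {product}"},
        ],
    )
    pairs.append({"product": product, "email": response.choices[0].message.content})
```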


24 of 29

BLOOMZ

  • BLOOM: the BigScience Large Open-science Open-access Multilingual Language Model
    • BigScience initiative: the largest collaboration of AI researchers on a single research project (over 1,000 researchers from 70+ countries and 250+ institutions)
    • Similar architecture to GPT-3; released under the Responsible AI License
    • Generates text in 46 natural languages and 13 programming languages

  • BLOOMZ is BLOOM fine-tuned (instruction-tuned) on the xP3 dataset
    • The dataset includes 13 different instruction-tuning tasks
    • Highly recommend reading the paper for many more details!
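
A minimal sketch of pulling a BLOOMZ checkpoint from the Hugging Face Hub and prompting it; the small 560M variant is used here only to keep the example lightweight, while the walkthrough may use a larger checkpoint.

```python
# Load a small BLOOMZ checkpoint and prompt it zero-shot.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloomz-560m")
model = AutoModelForCausalLM.from_pretrained("bigscience/bloomz-560m")

inputs = tokenizer("Translate to French: I love open-source LLMs.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```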


25 of 29

Let’s check it out!

26 of 29

Important Notes!

Training

  • Unsupervised Fine-Tuning
  • 17 data points, 100 Steps
  • Cost: <5 Google Colab Compute Units
  • 8-bit Quantization (LoRA)

General Process: Fine-Tuning the I/O Schema for single-task superpowers

  • Decide who you’re building for and how they’ll interact with your app
  • Try zero-shot and few-shot prompting first
  • Then be data-centric about collecting high-quality examples!
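
Once the adapter is trained, serving the assistant can be as simple as loading the base model, attaching the saved LoRA weights, and completing the product-to-email schema. A sketch with an assumed adapter path and the small BLOOMZ checkpoint from the earlier example:

```python
# Serve the fine-tuned assistant: base model + saved LoRA adapter.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("bigscience/bloomz-560m")
model = PeftModel.from_pretrained(base, "path/to/marketing-lora-adapter")  # placeholder path
tokenizer = AutoTokenizer.from_pretrained("bigscience/bloomz-560m")

prompt = "### Product:\nreusable stainless-steel water bottle\n\n### Marketing Email:\n"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=120)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```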


27 of 29

Conclusions

  1. Instruction-tuning is a subset of fine-tuning, focused on following instructions and aligning with humans
  2. Fine-tuning of the input-output schema lowers the effective dimensionality of the LLM, focusing the model on one task
  3. Rule of thumb: Pick instruction-tuned models off the shelf when building!
  4. This means you will be fine-tuning on top of instruct-tuned models.
  5. Airplane is the odd one out 🟠 🟢 ✈️


28 of 29

Shared Resources from today’s event!


29 of 29

Thanks! Questions?
