1 of 29

Building with Instruction-Tuned LLMs

A Step-by-Step Guide, by Chris Alexiuk and Greg Loughnane

2 of 29

A Motivating Example 🟠 🟢 ✈️!

Instruction

Identify the odd one out and explain your choice.

Input

Orange, Green, Airplane.


3 of 29

Identifying the odd one out 🟠 🟢 ✈️!

Instruction-Tuned Model

Response:

✈️ Airplane is the odd one out.

Explanation:

  • Airplane has nothing to do with the color spectrum.
  • Airplane uses aerodynamics and other techniques to fly in the air and not on the ground.
  • Airplane is commonly made of metal and other durable materials for survival and durability.

Non-Instruct-Tuned (Base) Model

Response:

🟠 Orange is the odd one out.

Explanation:

Orange is the odd one out because it is the only one that is not a plane.


4 of 29

Outline


LLM Context

Supervised Fine-Tuning, a.k.a. Instruction-Tuning

D1: Instruction-Tuning

OpenLLaMA and Dolly 15k with QLoRA

D2: Fine-Tuning Input/Output Schema

BLOOMZ with PEFT-LoRA

Conclusions


5 of 29

LLM Context

6 of 29

Generative Pre-Trained Transformers (GPTs)

Built on a foundation of:

  • Unsupervised pre-training
    • Web pages
  • Supervised fine-tuning
    • Benchmarks
  • More compute + data


7 of 29

Adapting Large Language Models (LLMs) to New Tasks

Zero-shot learning

  • prompting

Few-shot learning

  • prompting with examples (see the sketch below)

Fine-Tuning

  • dozens or fewer examples
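
To make the distinction concrete, here is a minimal sketch of zero-shot vs. few-shot prompting with the Hugging Face transformers pipeline. The model id is an assumption; any causal LM (instruction-tuned or not) can be swapped in.

```python
# A minimal sketch of zero-shot vs. few-shot prompting.
# Assumption: the "openlm-research/open_llama_7b" checkpoint is used; any
# causal LM works here.
from transformers import pipeline

generator = pipeline("text-generation", model="openlm-research/open_llama_7b")

# Zero-shot: just ask.
zero_shot = "Identify the odd one out: Orange, Green, Airplane.\nAnswer:"

# Few-shot: prepend worked examples so the model can imitate the pattern.
few_shot = (
    "Identify the odd one out: Apple, Banana, Car.\nAnswer: Car\n\n"
    "Identify the odd one out: Dog, Cat, Table.\nAnswer: Table\n\n"
    "Identify the odd one out: Orange, Green, Airplane.\nAnswer:"
)

for prompt in (zero_shot, few_shot):
    print(generator(prompt, max_new_tokens=32)[0]["generated_text"])
```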


https://www.deeplearning.ai/the-batch/issue-198/


8 of 29

Prompting (a.k.a. Prompt Engineering)

Whether zero-shot or few-shot:

  • We’re exploring the space of what’s possible in the LLM

  • Benefits: cheap, quick


A 3D visualization of latent space in the style of Salvador Dali

https://www.deeplearning.ai/the-batch/issue-197/


9 of 29

What task should our LLM have superpowers on?

Large Language Models

  • Unsupervised pre-training
    • Web pages
  • Supervised fine-tuning
    • Benchmarks
  • Lots of public data

Your Models & GenAI Apps

  • Fine-tuning
    • Specific task
  • Your data!


10 of 29

Instruction-Tuning vs. “Fine-Tuning”

Large Language Models

  • Unsupervised pre-training
    • Web pages
  • Supervised fine-tuning
    • Benchmarks
  • Lots of public data

Your Models & GenAI Apps

  • Fine-tuning
    • Specific task
  • Your data!

  • Instruction-Tuning
    • Following instructions
    • New benchmarks: Bias, Toxicity, etc.

  • Fine-tuning of input-output schema
    • How a user interacts with our app!


11 of 29

Instruction-Tuning (a.k.a. Instruct-Tuning)

Large Language Models

  • Unsupervised pre-training
    • Web pages
  • Supervised fine-tuning
    • Benchmarks
  • Lots of public data

  • Instruction-Tuning
    • Following instructions
    • New benchmarks: Bias, Toxicity, etc.


Instruction Tuning enhances Supervised Fine-Tuning!

https://openai.com/research/instruction-following


12 of 29

Instruction Tuning IS Fine-Tuning

  • Instruct-Tuning is a subset of Fine-Tuning

  • Focused on alignment with humans
  • Concerned with following “instructions” like a human would


Diagram: Instruction-Tuning shown as a subset within all possible Fine-Tuning


13 of 29

Instruct-Tuning, Step-by-Step

Dolly 15k, OpenLLaMA, QLoRA

14 of 29

Dolly 15k Dataset

  • Contains 15,000 high-quality human-generated prompt-response pairs

  • Specifically designed for instruction tuning LLMs

  • Dataset Structure (see the loading sketch after this list)
    • Instruction
    • Context
    • Response
    • Category

  • Dolly 2.0 & Dolly 15k can be used commercially

Categories of Instructions

  • Creative Writing
  • Closed QA
  • Open QA
  • Summarization
  • Information Extraction
  • Classification
  • Brainstorming
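
As a rough illustration of how this dataset is consumed, here is a hedged sketch of loading Dolly 15k from the Hugging Face Hub and flattening its four fields into a single training string. The prompt template is an assumption, not the exact one used in the notebook.

```python
# Load Dolly 15k and flatten instruction/context/response into one prompt.
# The "### Instruction / ### Response" template is an assumption.
from datasets import load_dataset

dolly = load_dataset("databricks/databricks-dolly-15k", split="train")

def to_prompt(example):
    context = f"\nContext: {example['context']}" if example["context"] else ""
    text = (
        f"### Instruction:\n{example['instruction']}{context}\n\n"
        f"### Response:\n{example['response']}"
    )
    return {"text": text}

dolly = dolly.map(to_prompt)
print(dolly[0]["text"])      # one flattened training example
print(dolly[0]["category"])  # e.g. "open_qa", "brainstorming", ...
```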


15 of 29

OpenLLaMA

  • A reproduction of Meta AI’s LLaMA (Large Language Model Meta AI)

  • May 22 update: a 700B-token checkpoint of the OpenLLaMA 7B model
    • “We expect the full 1T token training run to finish at the end of this week”

  • Trained on RedPajama Dataset

  • Apache 2.0 License - can be used commercially

By OpenLM Research (UC Berkeley)


16 of 29

QLoRA: Efficient Fine-Tuning of Quantized LLMs

  • 💡 The Big Idea: Downstream tasks have low intrinsic dimensionality
    • When fine-tuning, you can get away with a lot less compute!

  • Leverages: 4-bit NormalFloat (NF4) quantization, double quantization, and paged optimizers (see the sketch below)

  • QLoRA is an improvement on LoRA: Low-Rank Adaptation of Large Language Models
    • PEFT-LoRA methods have become a standard for fine-tuning LLMs
    • QLoRA paper submitted May 23, 2023
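
A minimal sketch of what that looks like with bitsandbytes and PEFT: load the base model in 4-bit NF4, then attach a small LoRA adapter so only the low-rank matrices are trained. The model id and LoRA hyperparameters are assumptions, not the notebook's exact settings.

```python
# QLoRA-style setup: 4-bit NF4 base model + trainable LoRA adapter.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",            # 4-bit NormalFloat
    bnb_4bit_use_double_quant=True,       # quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "openlm-research/open_llama_7b",      # assumed base model
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # assumed target modules
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()        # only the adapters are trainable
```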


17 of 29

Let’s check it out!

18 of 29

Important Notes!

Training

  • Supervised Fine-Tuning
  • 15,000 data points, 5,000 Steps
  • Cost: 75 Google Colab Compute Units
  • 4-bit Quantization (QLoRA)
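
One way to reproduce a run like this (a sketch only, using TRL's SFTTrainer as of mid-2023; batch size and learning rate are assumptions, and `model`, `dolly`, and `lora_config` come from the earlier sketches):

```python
# Supervised fine-tuning run roughly matching the numbers on this slide.
from transformers import TrainingArguments
from trl import SFTTrainer

args = TrainingArguments(
    output_dir="open-llama-dolly-qlora",
    max_steps=5000,                    # step count from this slide
    per_device_train_batch_size=4,     # assumption
    learning_rate=2e-4,                # assumption
    logging_steps=50,
)

trainer = SFTTrainer(
    model=model,                       # 4-bit model from the QLoRA sketch
    train_dataset=dolly,               # flattened Dolly 15k prompts
    dataset_text_field="text",
    peft_config=lora_config,
    args=args,
)
trainer.train()
```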

Rule of Thumb, Instruct-Tuning

  • Pick instruction-tuned models off the shelf when building GenAI apps!


19 of 29

Recall…

20 of 29

Instruction-Tuning vs. “Fine-Tuning”

Large Language Models

  • Unsupervised pre-training
    • Web pages
  • Supervised fine-tuning
    • Benchmarks
  • Lots of public data

Your Models & GenAI Apps

  • Fine-tuning
    • Specific task
  • Your data!

  • Instruction-Tuning
    • Following instructions
    • New benchmarks: Bias, Toxicity, etc.

  • Fine-tuning of input-output schema
    • How a user interacts with our app!


21 of 29

Fine-Tuning the “Structure” or “Input-Output Schema”

Your Models & GenAI Apps

  • Fine-tuning
    • Specific task
  • Your data!

  • Fine-tuning of input-output schema
    • How a user interacts with our app!
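
In practice, "fine-tuning the input-output schema" often means committing to a single prompt template that defines how users interact with the app. A hypothetical sketch (the delimiters and field names below are assumptions, not the walkthrough's exact template):

```python
# Hypothetical I/O schema for a marketing-assistant app: the user supplies a
# product description; the model is trained to always answer with an email.
def to_training_text(product_description: str, marketing_email: str) -> str:
    return (
        "### Product:\n"
        f"{product_description}\n\n"
        "### Marketing Email:\n"
        f"{marketing_email}"
    )

# At inference time the user fills in only the first half of the schema:
def to_inference_prompt(product_description: str) -> str:
    return f"### Product:\n{product_description}\n\n### Marketing Email:\n"
```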


22 of 29

Fine-Tuning, Step-by-Step

BLOOMZ + PEFT-LoRA Fine-Tuning of an AI Marketing Assistant

23 of 29

  • Problem: Writing direct email marketing copy is tedious and should be streamlined.

  • Proposed LLM Solution: A fine-tuned AI marketing assistant can generate marketing emails for products in one click!

  • 💰 Value: Produce high-quality marketing emails in the same voice and tone as previous company messages, without needing to spend on marketing experts or copywriters.

  • 🤖 Data
    • Data was generated synthetically with GPT-4
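
A sketch of how such synthetic pairs could be generated through the OpenAI API; the system prompt, product list, and model settings here are illustrative assumptions, not the prompts used to build the actual dataset.

```python
# Generate synthetic (product, email) training pairs with GPT-4.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

products = ["noise-cancelling headphones", "cold-brew coffee maker"]
pairs = []
for product in products:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "You write short, upbeat marketing emails."},
            {"role": "user", "content": f"Write a marketing email for: {product}"},
        ],
    )
    pairs.append({"product": product, "email": response.choices[0].message.content})
```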


24 of 29

BLOOMZ

  • BLOOM: the BigScience Large Open-science Open-access Multilingual Language Model
    • BigScience initiative: the largest collaboration of AI researchers on a single research project (over 1,000 researchers from 70+ countries and 250+ institutions)
    • Similar architecture to GPT-3; released under the Responsible AI License
    • Generates text in 46 natural languages and 13 programming languages

  • BLOOMZ is BLOOM fine-tuned (instruction-tuned) on the xP3 dataset
    • The dataset includes 13 different instruction-tuning tasks
    • Highly recommend reading the paper for many more details!
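
A minimal sketch of pulling a BLOOMZ checkpoint from the Hugging Face Hub and prompting it; the small 560M variant is used here only to keep the example lightweight, while the walkthrough may use a larger checkpoint.

```python
# Load a small BLOOMZ checkpoint and prompt it zero-shot.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloomz-560m")
model = AutoModelForCausalLM.from_pretrained("bigscience/bloomz-560m")

inputs = tokenizer("Translate to French: I love open-source LLMs.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```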


25 of 29

Let’s check it out!

26 of 29

Important Notes!

Training

  • Unsupervised Fine-Tuning
  • 17 data points, 100 Steps
  • Cost: <5 Google Colab Compute Units
  • 8-bit Quantization (LoRA)

General Process: Fine-Tuning the I/O Schema for single-task superpowers

  • Decide who you’re building for and how they’ll interact with your app
  • Try zero-shot and few-shot prompting first
  • Then be data-centric about collecting high-quality examples!
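
Once the adapter is trained, serving the assistant can be as simple as loading the base model, attaching the saved LoRA weights, and completing the product-to-email schema. A sketch with an assumed adapter path and the small BLOOMZ checkpoint from the earlier example:

```python
# Serve the fine-tuned assistant: base model + saved LoRA adapter.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("bigscience/bloomz-560m")
model = PeftModel.from_pretrained(base, "path/to/marketing-lora-adapter")  # placeholder path
tokenizer = AutoTokenizer.from_pretrained("bigscience/bloomz-560m")

prompt = "### Product:\nreusable stainless-steel water bottle\n\n### Marketing Email:\n"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=120)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```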


27 of 29

Conclusions

  1. Instruction-tuning is a subset of fine-tuning, focused on following instructions and aligning with humans
  2. Fine-tuning of the input-output schema lowers the effective dimensionality of the LLM, focusing the model on one task
  3. Rule of thumb: Pick instruction-tuned models off the shelf when building!
  4. This means you will be fine-tuning on top of instruct-tuned models.
  5. Airplane is the odd one out 🟠 🟢 ✈️


28 of 29

Shared Resources from today’s event!


29 of 29

Thanks! Questions?
