1 of 24

Prompt Engineering

John Berryman

2 of 24

Hi! I'm John Berryman

  • Aerospace Engineer (just long enough to get the merit badge)
  • Search Technology Consultant
  • Eventbrite Search Engineer
  • Wrote a book. (Swore never to do so again.)
  • GitHub Code Search
  • GitHub Data Science
  • GitHub Copilot Prompt Engineer
  • Writing a book. (But why!?)
  • LLM Application Consulting – Arcturus Labs


3 of 24

What is a Language Model?

How has this taken the world by storm?

4 of 24

What is a Large Language Model?

It's the same thing, just a lot more accurate.

Our model, called GPT-2 (a successor to GPT), was trained simply to predict the next word in 40GB of Internet text. Due to our concerns about malicious applications of the technology, we are not releasing the trained model. (ref)

5 of 24

What is a Large Language Model?

  • But with great power comes great responsibility. Models can:
    • Generate misleading news articles
    • Impersonate others online
    • Automate the production of abusive or faked content to post on social media
    • Automate the production of spam/phishing content

(These are all from the Feb 2019 GPT-2 release article.)

  • GPT-2 was beating models trained for specific tasks
    • missing word prediction
    • pronoun understanding
    • part of speech tagging
    • text compression
    • summarization
    • sentiment analysis
    • entity extraction
    • question answering
    • translation
    • content generation

6 of 24

What is a Large Language Model?

  • GPT-2 was beating models trained for specific tasks
    • missing word prediction
    • pronoun understanding
    • part of speech tagging
    • text compression
    • summarization
    • sentiment analysis
    • entity extraction
    • question answering
    • translation
    • content generation
  • But with great power comes great responsibility. Models can:
    • Generate misleading news articles
    • Impersonate others online
    • Automate the production of abusive or faked content to post on social media
    • Automate the production of spam/phishing content

(These are all from the Feb 2019 GPT-2 release article.)

Our model, called GPT-2 (a successor to GPT), was trained simply to predict the next word in 40GB of Internet text.

…And we figured out that now you can just ask it to do stuff and it will!

IT'S AMAZING

(But also it will help you make bombs, and drugs, and overthrow governments. So…)

Due to our concerns about malicious applications of the technology, we are not releasing the trained model.

7 of 24

Prompt Crafting

technique #1: few-shot prompting

examples to set the pattern:

> How are you doing today?
< ¿Cómo estás hoy?

> My name is John.
< Mi nombre es John.

the actual task:

> Can I have fries with that?
< ¿Puedo tener papas fritas con eso?
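
As a rough sketch, a few-shot prompt like this can be sent to a completion-style endpoint. The snippet below assumes the OpenAI Python SDK; the model name and stop sequence are illustrative choices, not part of the slide.

# Few-shot prompting sketch (OpenAI Python SDK; model name and stop sequence are illustrative).
from openai import OpenAI

client = OpenAI()

few_shot_prompt = (
    "> How are you doing today?\n"
    "< ¿Cómo estás hoy?\n"
    "> My name is John.\n"
    "< Mi nombre es John.\n"
    "> Can I have fries with that?\n"
    "< "  # the model continues the pattern and supplies the translation
)

resp = client.completions.create(
    model="gpt-3.5-turbo-instruct",
    prompt=few_shot_prompt,
    max_tokens=50,
    stop=["\n>"],  # stop before the model invents another example
)
print(resp.choices[0].text.strip())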

8 of 24

Prompt Crafting

Q: It takes one baker an hour to make a cake. How long does it take 3 bakers to make 3 cakes?

A: 3

technique #2: chain-of-thought reasoning

9 of 24

Prompt Crafting

Q: Jim is twice as old as Steve. Jim is 12 years old. How old is Steve?

A: In equation form: 12=2*a where a is Steve's age. Dividing both sides by 2 we see that a=6. Steve is 6 years old.

Q: It takes one baker an hour to make a cake. How long does it take 3 bakers to make 3 cakes?

A: The amount of time it takes to bake a cake is the same regardless of how many cakes are made and how many people work on them. Therefore the answer is still 1 hour.

technique #2: chain-of-thought reasoning

10 of 24

Prompt Crafting

technique #2: chain-of-thought reasoning

Q: It takes one baker an hour to make a cake. How long does it take 3 bakers to make 3 cakes?

A: Let's think step-by-step. The amount of time it takes to bake a cake is the same regardless of how many cakes are made and how many people work on them. Therefore the answer is still 1 hour.
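
A minimal sketch of zero-shot chain-of-thought: the only change from an ordinary prompt is the "Let's think step-by-step." cue appended to the answer prefix. OpenAI Python SDK assumed; the model name is illustrative.

# Zero-shot chain-of-thought sketch (OpenAI Python SDK; model name illustrative).
from openai import OpenAI

client = OpenAI()

question = ("It takes one baker an hour to make a cake. "
            "How long does it take 3 bakers to make 3 cakes?")

# The trailing cue nudges the model to reason before committing to an answer.
prompt = f"Q: {question}\nA: Let's think step-by-step."

resp = client.completions.create(
    model="gpt-3.5-turbo-instruct",
    prompt=prompt,
    max_tokens=150,
)
print(resp.choices[0].text)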

11 of 24

Prompt Crafting

technique #3: document mimicry

# IT Support Assistant

The following is a transcript between an award winning IT support rep and a customer.

## Customer:

My cable is out! And I'm going to miss the Superbowl!

## Support Assistant:

What if you found this scrap of paper on the ground?

What do you think the rest of the paper would say?

12 of 24

Prompt Crafting

technique #3: document mimicry

# IT Support Assistant

The following is a transcript between an award winning IT support rep and a customer.

## Customer:

My cable is out! And I'm going to miss the Superbowl!

## Support Assistant:

Let's figure out how to diagnose your problem…

  • It uses markdown to establish structure.
  • The document type is a transcript.
  • It tells a story to condition a particular response.
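
A sketch of how a document-mimicry prompt might be assembled in code. The helper name is invented for illustration; the text mirrors the transcript above.

# Document mimicry sketch: lay the prompt out as a plausible document and stop
# exactly where we want the model to continue. (Helper name is illustrative.)
def build_support_prompt(customer_message: str) -> str:
    return (
        "# IT Support Assistant\n\n"
        "The following is a transcript between an award winning IT support rep "
        "and a customer.\n\n"
        "## Customer:\n\n"
        f"{customer_message}\n\n"
        "## Support Assistant:\n\n"  # the model "continues the document" from here
    )

prompt = build_support_prompt("My cable is out! And I'm going to miss the Superbowl!")
# Send `prompt` to a completion endpoint as in the earlier sketches; stopping on
# "\n## " keeps the model from also writing the customer's next turn.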

13 of 24

Prompt Crafting Intuition: LLMs are Dumb Mechanical Humans.

  • LLMs understand better when you use familiar language and constructs.
  • LLMs get distracted. Don't fill the prompt with lots of "just in case" information.
  • LLMs aren't psychic. If information is neither in training data nor in the prompt, they don't know it.
  • If you look at the prompt and can't make sense of it, an LLM is hopeless.

14 of 24

Building LLM Applications

The hard part!

15 of 24

Creating the Prompt

  • Collect context
  • Rank context
  • Trim context
  • Assemble prompt
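
One way to picture this pipeline, as a hedged sketch: ContextItem, the score field, and the token budget are invented names for illustration, not any particular product's API.

# Collect -> rank -> trim -> assemble, sketched with invented names.
from dataclasses import dataclass

@dataclass
class ContextItem:
    text: str
    score: float   # how relevant we believe this item is
    tokens: int    # the cost of including it

def assemble_prompt(items: list[ContextItem], task: str, budget: int) -> str:
    # Rank: most relevant items first.
    ranked = sorted(items, key=lambda item: item.score, reverse=True)
    # Trim: greedily keep items while they fit the token budget.
    kept, used = [], 0
    for item in ranked:
        if used + item.tokens <= budget:
            kept.append(item)
            used += item.tokens
    # Assemble: context first, then the actual task.
    return "\n\n".join(item.text for item in kept) + "\n\n" + task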

16 of 24

Creating the Prompt: Copilot Code Completion

  • Collect context – current document, open tabs, symbols, file path
  • Rank context – file path -> current document -> open tabs -> symbols
  • Trim context – drop open tab snippets; truncate current document
  • Assemble prompt (see the sketch after the example below)

// pkg/skills/search.go
// <consider this snippet from ../skill.go>
// type Skill interface {
//     Execute(data []byte) (refs, error)
// }
// </end snippet>
package searchskill

import (
    "context"
    "encoding/json"
    "fmt"
    "strings"
    "time"
)

type Skill struct {
}

type params struct {

The assembled prompt contains the file path (the first comment), a snippet from an open tab (the <consider …> block), and the current document up to the cursor, which sits right after "type params struct {".
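
A rough sketch of how a prompt like the one above could be assembled. The helper, its arguments, and the character-based budget are invented for illustration and are not Copilot internals.

# Sketch: assemble a code-completion prompt like the one above.
def build_completion_prompt(file_path, open_tab_snippets, document_prefix, budget_chars):
    header = [f"// {file_path}"]
    for origin, snippet in open_tab_snippets:
        block = [f"// <consider this snippet from {origin}>"]
        block += ["// " + line for line in snippet.splitlines()]
        block.append("// </end snippet>")
        # Drop snippets that would crowd out the current document.
        if sum(len(l) + 1 for l in header + block) < budget_chars // 4:
            header += block
    # Truncate the current document from the top, keeping the text nearest the cursor.
    room = max(budget_chars - sum(len(l) + 1 for l in header), 0)
    prefix = document_prefix[max(len(document_prefix) - room, 0):]
    return "\n".join(header) + "\n" + prefix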

17 of 24

The Introduction of Chat

# IT Support Assistant

The following is a transcript between an award winning IT support rep and a customer.

## Customer:

My cable is out! And I'm going to miss the Superbowl!

## Support Assistant:

Let's figure out how to diagnose your problem…

document

messages = [
    {"role": "system",
     "content": "You are an award winning support staff representative that helps customers."},
    {"role": "user",
     "content": "My cable is out! And I'm going to miss the Superbowl!"}
]

API

benefits

  • Really easy for users to build assistants.
    • System messages make controlling behavior easy.
    • The assistant always responds with a complete thought and then stops.
  • Safety is baked in:
    • Assistant will (almost) never respond with insults or instructions to make bombs
    • Assistant will (almost) never hallucinate false information.
    • Prompt injection is (almost) impossible.

<|im_start|>system
You are an award winning IT support rep. Help the user with their request.<|im_end|>
<|im_start|>user
My cable is out! And I'm going to miss the Superbowl!<|im_end|>
<|im_start|>assistant
Let's figure out how to diagnose your problem…
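
A sketch of sending the messages above through a chat endpoint, assuming the OpenAI Python SDK; the model name is illustrative. The caller only supplies the message list, and the API renders it into a transcript format like the one shown above.

# Chat API sketch (OpenAI Python SDK; model name illustrative).
from openai import OpenAI

client = OpenAI()

messages = [
    {"role": "system",
     "content": "You are an award winning IT support rep. Help the user with their request."},
    {"role": "user",
     "content": "My cable is out! And I'm going to miss the Superbowl!"},
]

resp = client.chat.completions.create(model="gpt-4o", messages=messages)
print(resp.choices[0].message.content)
# The API renders these messages into a format like the <|im_start|>…<|im_end|> transcript above.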

18 of 24

The Introduction of Tools

  • Agents can reach out into the real world
    • Read information
    • Write information
  • Model chooses to answer in text or run a tool
  • Tools can be called in series or in parallel
  • Tools can be interleaved with user and assistant text

{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the weather",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state"
                },
                "unit": {
                    "type": "string",
                    "description": "degrees Fahrenheit or Celsius",
                    "enum": ["celsius", "fahrenheit"]
                }
            },
            "required": ["location"]
        }
    }
}

Input:

{"role": "user",
 "content": "What's the weather like in Miami?"}

Function Call:

{"role": "assistant",
 "function": {"name": "get_weather",
              "arguments": '{"location": "Miami, FL"}'}}

Real API request:

curl http://weathernow.com/miami/FL?deg=f
{"temp": 78}

Function Response:

{"role": "tool",
 "name": "get_weather",
 "content": "78ºF"}

Assistant Response:

{"role": "assistant",
 "content": "It's a balmy 78ºF"}

19 of 24

Building LLM Applications

20 of 24

Building LLM Applications:

Bag of Tools Agent

functions:

  • getTemp()
  • setTemp(degreesF)

user: make it 2 degrees warmer in here

assistant: getTemp()

function: 70ºF

assistant: setTemp(72)

function: success

assistant: Done!

user: actually… put it back

assistant: setTemp(70)

function: success

assistant: Done again, you fickle pickle!
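
The transcript above is what a simple agent loop produces. Here is a hedged sketch of such a loop; the thermostat stubs and the chat helper are invented for illustration, with chat standing in for a chat-completions call that has the tool definitions attached.

# Bag-of-tools agent loop sketch; the thermostat stubs and `chat` helper are illustrative.
import json

def get_temp():
    return "70ºF"      # stand-in for reading a real thermostat

def set_temp(degreesF):
    return "success"   # stand-in for setting a real thermostat

TOOLS = {"getTemp": get_temp, "setTemp": set_temp}

def run_turn(messages, chat):
    """Keep calling the model until it answers in text instead of calling a tool."""
    while True:
        reply = chat(messages)        # e.g. a chat.completions call with the tools attached
        messages.append(reply)        # record the assistant turn (text or tool call)
        if not reply.tool_calls:
            return reply.content      # "Done!" means the turn is complete
        for call in reply.tool_calls:
            args = json.loads(call.function.arguments or "{}")
            result = TOOLS[call.function.name](**args)
            messages.append({"role": "tool", "tool_call_id": call.id, "content": result})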

21 of 24

Creating the Prompt: Copilot Chat

  • Collect context:
    • References – files, snippets, and issues that users attach or that tools produce
    • Prior messages
  • Rank, Trim and Assemble:
    • must fit:
      • system message
      • function definitions (if we plan to use them)
      • user's most recent message
    • fit if possible:
      • all the function calls and evaluations that follow
      • the references that belong to each message
      • historic messages (most recent being most important)
    • fall back to no-function usage if everything won't fit with them (this causes the assistant to respond and the turn to complete)
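
A sketch of the must-fit / fit-if-possible budgeting described above; the token counting and message shapes are deliberately simplified for illustration.

# Must-fit / fit-if-possible budgeting sketch (token counting is deliberately crude).
def count_tokens(msg) -> int:
    return len(msg["content"]) // 4    # stand-in for a real tokenizer

def fit_messages(system, function_tokens, history, latest_user, budget):
    # Must fit: system message, function definitions, and the user's most recent message.
    used = count_tokens(system) + count_tokens(latest_user) + function_tokens
    if used > budget:
        raise ValueError("cannot satisfy the must-fit set")
    # Fit if possible: walk history newest-first, keeping whatever still fits.
    kept = []
    for msg in reversed(history):      # most recent messages are most important
        cost = count_tokens(msg)
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    return [system] + list(reversed(kept)) + [latest_user]

If the function definitions themselves don't fit, the caller can retry with them omitted, which is the no-function fallback noted above.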

22 of 24

Tips for Defining Tools

  • Don't have "too many" tools - look for evidence of collisions
  • Name tools simply and clearly (and in TypeScript format?)
  • Don't copy/paste your API - keep arguments simple and few
  • Keep function and arg descriptions short and consider what the model knows
    • It probably understands public documentation.
    • It doesn't know about internal company acronyms.
  • More on arguments
    • Nested arguments don't retain their descriptions
    • You can use enum and default, but not minimum or maximum
  • Skill output – don't include extra "just-in-case" content
  • Skill errors – when reasonable, send errors back to the model (e.g. validation errors)

23 of 24

Questions?

P.S. I'm also available for LLM application consulting at jfberryman@gmail.com

24 of 24

Questions?