1 of 77

Build with AI

Build agent with Gemini/Gemma and AutoGen

Jimmy Liao

Generative AI Evangelist

Taoyuan

Deck Link

April 13, 2024

2 of 77

Discord for GDG Taiwan

https://discord.com/invite/google-dev-community

https://discord.gg/6xGhqZf6J8

3 of 77

In the upcoming series of session/workshops

使用 Gemini Pro Vision API 打造出國旅遊的小幫手
Gemini多模態RAG應用/ Gemini Code Assist 應用介紹
AI Your Summer

4 of 77

Before start - check your GCP access

https://github.com/jimmyliao/genai-gdg

已經有 Google Cloud billing account ，而且可以打開取得產生 API Key 的權限了嗎？

https://aistudio.google.com/

5 of 77

Temp access for this workshop

為了 Workshop方便，會提供講師的 Google Cloud Project，加入學員的 email。預計課程後一個禮拜會關掉，之後請改用自己的帳號與 Project。

請用 QRCode 填入你的 Email

6 of 77

Agenda

Intro to the Gemini Pro API
Hands-on: AI Studio with Gemini Pro, Vision, Prompt Library
Hands-on: Simple Chat with Gemini Pro
Hands-on: Image Recognition with Gemini Pro Vision API
Hands-on: RAG Concept with Gemini Pro API
Hands-on: Multi-Agent, Gemma, Ollama, and AutoGen

https://github.com/jimmyliao/genai-gdg

7 of 77

Gemini Models

and APIs

Taoyuan

10 of 77

AI Studio

https://aistudio.google.com/

Generate API Keys
Create, test, save prompts
Customize models in minutes
Generate starter code

11 of 77

From Prompt Library

14 of 77

Prompting

Chained prompts

make a plan, then execute it

Context

Few shot prompts

Generation parameters

Temperature, Safety, Top-P, Top-K

15 of 77

Search and Information Synthesis

16 of 77

BYO Data

Models have knowledge cut-offs
LLMs are not fact engines
No exposure to private data

17 of 77

Use the prompt’s context window

Instructions + Context + Question all in the prompt
Easy to implement

No extra code, just ask

19 of 77

Gemini 1.5 Pro!

20 of 77

Use the prompt’s content window

gemini-1.5-pro: 1M tokens.

Remember: Everything is tokens

Using context may be more flexible than Retrieval Augmented Generation (RAG)
More input context means higher latency

21 of 77

AQA Attributed Question Answering

with inline chunks

Instruction + Chunks + Question in one request

AQA model is specifically tuned for RAG

AQA is a new feature

24 of 77

Use AQA with inline chunks

Simple to use

Example: handle chunks returned by a search tool

Limited by the request size (Max 4MB)
Inefficient when asking about the same data repeatedly

25 of 77

Use AQA with the retriever service

Corpus < Document < Chunk hieararchy

Pass a reference to a document or corpus to the generate_answer function

27 of 77

Function Calling

Describe external functions to the model.
The model may ask you to call the function to help it respond to your queries.
Endless possibilities for integrating external tools.

28 of 77

Function Calling - Basic

How?
The google.generativeai SDK will inspect the function’s type hints to determine the schema.
Allowed types are limited:

AllowedTypes = (

31 of 77

Function Calling - Basic

What happened?

Use the chat history to find out.

The chat history collects all the function calls and responses that tool place.

33 of 77

Function Calling - Schema

JSON object

34 of 77

Function Calling - Structured data

Structured data extraction.
Ask the model to do it and return JSON.

35 of 77

Before function calling

36 of 77

Function Calling - Structured data

Asking for JSON often works.
Function calling lets you strictly describe the schema.
With a strict description, can strictly enforce that what gets returned.

37 of 77

with function calling declaration

38 of 77

Function Calling - Structured data

Why?

Function calling doesn’t return a string.
return data-structures: You don’t parse text.

39 of 77

Here how function calling with SDK…

41 of 77

Image understanding

46 of 77

Gemini 1.5 Pro

47 of 77

Gemini 1.5 Pro

48 of 77

Image understanding

Images are just tokens in the input

Can be used for instructions, context or query subject

49 of 77

Time to

Get Hands Dirty

Kaohsiung

50 of 77

Task 1: AI Studio and Prompting

Generate API Key

Gemini Pro

Playground
Generate Code
1.0, 1.5

Gemini Vision

Image

Prompt Library

51 of 77

Task 2: Python SDK part 1 - Chat

Notebook => intro_gemini_chat.ipynb

Model parameters

Stream

Chat History

Use with LangChain

Estimated: 10 mins

52 of 77

Task 3: Python SDK part 2 - Vision

Notebook =>

https://colab.research.google.com/github/GoogleCloudPlatform/generative-ai/blob/main/gemini/getting-started/intro_gemini_pro_vision_python.ipynb

Image (Local) -> Text

Image + Text -> Text

Video -> Text

Estimated: 20 mins

53 of 77

Task 4: cURL to know all Gemini Pro APIs

Notebook => intro_gemini_curl.ipynb

text -> text
image -> text
chat (multi-turn, role)
function calling
vision

image -> text
video -> text

Estimated: 15 mins

54 of 77

Task 5: Build RAG application (LMM)

Purpose link
Notebook: Gemini-lmm.ipynb

Remember to use TPU runtime

Estimated: 20 mins

(https://github.com/jimmyliao/genai-gdg/blob/main/gemini/rag-intro/gemini-lmm.ipynb)

55 of 77

Gemma

Taoyuan

56 of 77

https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard

58 of 77

https://www.youtube.com/watch?v=60V70JqGkuU

60 of 77

Setup Tokenizer

61 of 77

Create Pipeline

62 of 77

Text Generation

63 of 77

Create Message and Prompt

64 of 77

Ollama

Taoyuan

65 of 77

Run Your Local LLM!

66 of 77

AI Agent

Taoyuan

68 of 77

Difference

Source: Youtube video of Andrew Ng at Sequoia Capital (Link in References)

69 of 77

AI Agent Example

70 of 77

AutoGPT

71 of 77

Langroid

72 of 77

AutoGen

73 of 77

Agentic Workflow (example)

https://www.linkedin.com/pulse/introducing-ai-agents-agentic-workflows-future-ken-rheingans-1rwce/

74 of 77

Related publication

Agent / Multi-Agent Application 淺談系列 - 1 (https://jimmyliaonet.substack.com/p/agent-multi-agent-application-1 )

75 of 77

Task 6: Multi-Agent - Gemma, Ollama, and AutoGen

Notebook

Remember to use TPU runtime

Estimated: 30~60 mins

(https://colab.research.google.com/github/yeyu2/Youtube_demos/blob/main/autogen-ollama-gemma.ipynb )

76 of 77

References

Official

Gemini Official Cloud Training doc
Google Developers Group training deck (remade)
Multi-Agent demo https://www.youtube.com/watch?v=RC6OFzyHYpY

77 of 77

DM for detail

We are working on * initiatives around GenAI

1 of 77

2 of 77

3 of 77

4 of 77

5 of 77

6 of 77

7 of 77

8 of 77

9 of 77

10 of 77

11 of 77

12 of 77

13 of 77

14 of 77

15 of 77

16 of 77

17 of 77

18 of 77

19 of 77

20 of 77

21 of 77

22 of 77

23 of 77

24 of 77

25 of 77

26 of 77

27 of 77

28 of 77

29 of 77

30 of 77

31 of 77

32 of 77

33 of 77

34 of 77

35 of 77

36 of 77

37 of 77

38 of 77

39 of 77

40 of 77

41 of 77

42 of 77

43 of 77

44 of 77

45 of 77

46 of 77

47 of 77

48 of 77

49 of 77

50 of 77

51 of 77

52 of 77

53 of 77

54 of 77

55 of 77

56 of 77

57 of 77

58 of 77

59 of 77

60 of 77

61 of 77

62 of 77

63 of 77

64 of 77

65 of 77

66 of 77

67 of 77

68 of 77

69 of 77

70 of 77

71 of 77

72 of 77

73 of 77

74 of 77

75 of 77

76 of 77

77 of 77