1 of 77

Build with AI

Build agent with Gemini/Gemma and AutoGen

Jimmy Liao

Generative AI Evangelist

Taoyuan

Deck Link

April 13, 2024

2 of 77

Discord for GDG Taiwan

https://discord.com/invite/google-dev-community

3 of 77

In the upcoming series of session/workshops

  • 使用 Gemini Pro Vision API 打造出國旅遊的小幫手
  • Gemini多模態RAG應用/ Gemini Code Assist 應用介紹
  • AI Your Summer

4 of 77

Before start - check your GCP access

https://github.com/jimmyliao/genai-gdg

已經有 Google Cloud billing account ,而且可以打開取得產生 API Key 的權限了嗎?

https://aistudio.google.com/

5 of 77

Temp access for this workshop

為了 Workshop方便,會提供講師的 Google Cloud Project,加入學員的 email。預計課程後一個禮拜會關掉,之後請改用自己的帳號與 Project。

請用 QRCode 填入你的 Email

6 of 77

Agenda

  1. Intro to the Gemini Pro API
  2. Hands-on: AI Studio with Gemini Pro, Vision, Prompt Library
  3. Hands-on: Simple Chat with Gemini Pro
  4. Hands-on: Image Recognition with Gemini Pro Vision API
  5. Hands-on: RAG Concept with Gemini Pro API
  6. Hands-on: Multi-Agent, Gemma, Ollama, and AutoGen

https://github.com/jimmyliao/genai-gdg

7 of 77

Gemini Models

and APIs

Taoyuan

8 of 77

9 of 77

10 of 77

AI Studio

https://aistudio.google.com/

  • Generate API Keys
  • Create, test, save prompts
  • Customize models in minutes
  • Generate starter code

11 of 77

From Prompt Library

12 of 77

13 of 77

14 of 77

Prompting

Chained prompts

  • make a plan, then execute it

Context

  • Few shot prompts

Generation parameters

  • Temperature, Safety, Top-P, Top-K

15 of 77

Search and Information Synthesis

16 of 77

BYO Data

  • Models have knowledge cut-offs
  • LLMs are not fact engines
  • No exposure to private data

17 of 77

Use the prompt’s context window

  • Instructions + Context + Question all in the prompt
  • Easy to implement
    • No extra code, just ask

18 of 77

19 of 77

Gemini 1.5 Pro!

20 of 77

Use the prompt’s content window

  • gemini-1.5-pro: 1M tokens.
    • Remember: Everything is tokens
  • Using context may be more flexible than Retrieval Augmented Generation (RAG)
  • More input context means higher latency

21 of 77

AQA Attributed Question Answering

with inline chunks

  • Instruction + Chunks + Question in one request

  • AQA model is specifically tuned for RAG

  • AQA is a new feature

22 of 77

23 of 77

24 of 77

Use AQA with inline chunks

  • Simple to use

  • Example: handle chunks returned by a search tool

  • Limited by the request size (Max 4MB)
  • Inefficient when asking about the same data repeatedly

25 of 77

Use AQA with the retriever service

  • Corpus < Document < Chunk hieararchy

  • Pass a reference to a document or corpus to the generate_answer function

26 of 77

27 of 77

Function Calling

  • Describe external functions to the model.
  • The model may ask you to call the function to help it respond to your queries.
  • Endless possibilities for integrating external tools.

28 of 77

Function Calling - Basic

  • How?
  • The google.generativeai SDK will inspect the function’s type hints to determine the schema.
  • Allowed types are limited:

AllowedTypes = (

int | float | str | bool | list | dict )

29 of 77

30 of 77

31 of 77

Function Calling - Basic

  • What happened?
    • Use the chat history to find out.
  • The chat history collects all the function calls and responses that tool place.

32 of 77

33 of 77

Function Calling - Schema

  • JSON object

34 of 77

Function Calling - Structured data

  • Structured data extraction.
  • Ask the model to do it and return JSON.

35 of 77

Before function calling

36 of 77

Function Calling - Structured data

  • Asking for JSON often works.
  • Function calling lets you strictly describe the schema.
  • With a strict description, can strictly enforce that what gets returned.

37 of 77

with function calling declaration

38 of 77

Function Calling - Structured data

  • Why?
    • Function calling doesn’t return a string.
    • return data-structures: You don’t parse text.

39 of 77

Here how function calling with SDK…

40 of 77

41 of 77

Image understanding

42 of 77

43 of 77

44 of 77

45 of 77

46 of 77

Gemini 1.5 Pro

47 of 77

Gemini 1.5 Pro

48 of 77

Image understanding

  • Images are just tokens in the input

  • Can be used for instructions, context or query subject

49 of 77

Time to

Get Hands Dirty

Kaohsiung

50 of 77

Task 1: AI Studio and Prompting

Generate API Key

Gemini Pro

  • Playground
  • Generate Code
  • 1.0, 1.5

Gemini Vision

  • Image

Prompt Library

51 of 77

Task 2: Python SDK part 1 - Chat

Notebook => intro_gemini_chat.ipynb

Model parameters

Stream

Chat History

Use with LangChain

Estimated: 10 mins

52 of 77

Task 3: Python SDK part 2 - Vision

53 of 77

Task 4: cURL to know all Gemini Pro APIs

Notebook => intro_gemini_curl.ipynb

  1. text -> text
  2. image -> text
  3. chat (multi-turn, role)
  4. function calling
  5. vision
    1. image -> text
    2. video -> text

Estimated: 15 mins

54 of 77

Task 5: Build RAG application (LMM)

  • Purpose link
  • Notebook: Gemini-lmm.ipynb
    • Remember to use TPU runtime
  • Estimated: 20 mins

(https://github.com/jimmyliao/genai-gdg/blob/main/gemini/rag-intro/gemini-lmm.ipynb)

55 of 77

Gemma

Taoyuan

56 of 77

https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard

57 of 77

58 of 77

https://www.youtube.com/watch?v=60V70JqGkuU

59 of 77

60 of 77

Setup Tokenizer

61 of 77

Create Pipeline

62 of 77

Text Generation

63 of 77

Create Message and Prompt

64 of 77

Ollama

Taoyuan

65 of 77

Run Your Local LLM!

66 of 77

AI Agent

Taoyuan

67 of 77

AI Agent

68 of 77

Difference

Source: Youtube video of Andrew Ng at Sequoia Capital (Link in References)

69 of 77

AI Agent Example

70 of 77

AutoGPT

71 of 77

Langroid

72 of 77

AutoGen

73 of 77

Agentic Workflow (example)

https://www.linkedin.com/pulse/introducing-ai-agents-agentic-workflows-future-ken-rheingans-1rwce/

74 of 77

Related publication

75 of 77

Task 6: Multi-Agent - Gemma, Ollama, and AutoGen

  • Notebook
    • Remember to use TPU runtime
  • Estimated: 30~60 mins

(https://colab.research.google.com/github/yeyu2/Youtube_demos/blob/main/autogen-ollama-gemma.ipynb )

76 of 77

References

Official

77 of 77

DM for detail

We are working on * initiatives around GenAI