1 of 20

Responsible AI:

A Hands-On Approach

San Jose

Allen Firstenberg

Image by Imagen 3

2 of 20

Who are you?

Software developer / Consultant

New York

Google Developer Expert

Google Cloud Champion Innovator

LangChain.js contributor

Co-host "Two Voice Devs"

http://spiders.com/

http://prisoner.com/

LinkedIn: Allen Firstenberg

3 of 20

WARNING

We are dealing with sensitive topic areas.

4 of 20

Follow Along

https://prisoner.com/gemini-safety

5 of 20

Gemini marks the next phase on our journey to making AI more helpful for everyone

State-of-the-art, natively multimodal reasoning capabilities

Highly optimized while preserving choice

Built with responsibility and safety at the core

6 of 20

Responsible AI

  • "Large language models can generate output that you don't expect, including text that's offensive, insensitive, or factually incorrect."
  • "[T]he incredible versatility of LLMs is also what makes it difficult to predict exactly what kinds of unintended or unforeseen outputs they might produce"
  • "[I]t is important for developers to understand and test their models to deploy safely and responsibly"
  • "To aid developers, the Vertex AI Studio has built-in content filtering, and our generative AI APIs have safety attribute scoring to help customers test Google's safety filters and define confidence thresholds that are right for their use case and business."

7 of 20

Safety Attribute Definitions

Hate Speech

Negative or harmful comments targeting identity and/or protected attributes.

Harassment

Malicious, intimidating, bullying, or abusive comments targeting another individual.

Sexually Explicit

Contains references to sexual acts or other lewd content.

Dangerous Content

Promotes or enables access to harmful goods, services, and activities.
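
In the @google/generative-ai Node.js client used later in this deck, these four attributes correspond to values of the HarmCategory enum. A minimal sketch for reference:

const { HarmCategory } = require("@google/generative-ai");

// The four configurable safety attributes, as the client library names them.
const categories = [
  HarmCategory.HARM_CATEGORY_HATE_SPEECH,       // Hate Speech
  HarmCategory.HARM_CATEGORY_HARASSMENT,        // Harassment
  HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT, // Sexually Explicit
  HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT, // Dangerous Content
];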

8 of 20

Safety Probabilities / Ratings

NEGLIGIBLE

Content has a negligible probability of being unsafe.

LOW

Content has a low probability of being unsafe.

MEDIUM

Content has a medium probability of being unsafe.

HIGH

Content has a high probability of being unsafe.
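
In an API response, these levels appear as the probability of each safety rating. A purely illustrative example (not real model output) might look like:

// Illustrative values only: one rating per safety attribute, giving the
// model's estimated probability that the content is unsafe.
const safetyRatings = [
  { category: "HARM_CATEGORY_HATE_SPEECH", probability: "NEGLIGIBLE" },
  { category: "HARM_CATEGORY_HARASSMENT", probability: "LOW" },
  { category: "HARM_CATEGORY_SEXUALLY_EXPLICIT", probability: "MEDIUM" },
  { category: "HARM_CATEGORY_DANGEROUS_CONTENT", probability: "NEGLIGIBLE" },
];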

9 of 20

Safety Settings

Block None (BLOCK_NONE): Always show. (Restricted)

Block Few (BLOCK_ONLY_HIGH): Block only when high probability of unsafe content

Block Some (BLOCK_MEDIUM_AND_ABOVE): Block medium or high (Default)

Block Most (BLOCK_LOW_AND_ABOVE): Block all but negligible
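
In the Node.js client used later, these settings map onto the HarmBlockThreshold enum. A minimal sketch of the correspondence:

const { HarmBlockThreshold } = require("@google/generative-ai");

// Each label above corresponds to one HarmBlockThreshold value.
const thresholds = {
  blockNone: HarmBlockThreshold.BLOCK_NONE,             // always show
  blockFew:  HarmBlockThreshold.BLOCK_ONLY_HIGH,        // block only HIGH
  blockSome: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE, // block MEDIUM and HIGH (default)
  blockMost: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,    // block all but NEGLIGIBLE
};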

10 of 20

Generate Content Request

Contents:

  • Content of role "user", containing multimodal Parts
  • Content of role "model"
  • Content of role "user": a text Part

Settings / Configs:

  • Tools / Functions specified by the caller
  • Safety Settings configured by the caller
  • Generation Config: temperature, Top P, Top K, stop sequences, max output tokens, etc.
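
A rough sketch of such a request, using the @google/generative-ai Node.js client from the later slides. The prompt text and parameter values here are placeholders, not from this deck:

const {
  GoogleGenerativeAI,
  HarmBlockThreshold,
  HarmCategory,
} = require("@google/generative-ai");

// Assumes API_KEY is set in the shell, as on the Node.js slide.
const genAI = new GoogleGenerativeAI(process.env.API_KEY);
const model = genAI.getGenerativeModel({ model: "gemini-1.5-flash-001" });

const request = {
  contents: [
    { role: "user", parts: [{ text: "Tell me about safety settings." }] },
    { role: "model", parts: [{ text: "Sure. What would you like to know?" }] },
    { role: "user", parts: [{ text: "Summarize them in one sentence." }] },
  ],
  // tools: [...],  // functions the caller makes available to the model
  safetySettings: [
    {
      category: HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
      threshold: HarmBlockThreshold.BLOCK_ONLY_HIGH,
    },
  ],
  generationConfig: {
    temperature: 0.7,
    topP: 0.95,
    topK: 40,
    stopSequences: ["END"],
    maxOutputTokens: 256,
  },
};

(async () => {
  const result = await model.generateContent(request);
  console.log(result.response.text());
})();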

11 of 20

Generate Content Response

Candidate "Content" (note: only one candidate is returned today):

  • Text
  • Finish Reason: why the model stopped generating
  • Finish Message: set if a Finish Reason is present
  • Safety Ratings: how safe the response is

Feedback on the prompt:

  • Block Reason
  • Safety Ratings
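
And a rough sketch of reading those fields with the same Node.js client (property names follow that library; error handling is omitted):

// Assumes `result` came from model.generateContent(...) as in the
// request sketch / the Node.js slide.
function inspectResponse(result) {
  const response = result.response;

  // Feedback on the prompt itself; blockReason is set when the prompt was blocked.
  if (response.promptFeedback?.blockReason) {
    console.log("Prompt blocked:", response.promptFeedback.blockReason);
    return;
  }

  // Only one candidate is returned today.
  const candidate = response.candidates?.[0];
  console.log("Finish reason:", candidate?.finishReason);
  console.log("Safety ratings:", candidate?.safetyRatings);

  if (candidate?.finishReason !== "SAFETY") {
    console.log(response.text());
  }
}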

12 of 20

Demo: AI Studio

https://aistudio.google.com/

13 of 20

Demo: Vertex AI

https://console.cloud.google.com/vertex-ai/generative/multimodal/create/text

14 of 20

Python Colab

  • Make a copy of this colab
  • Set the API Key as a secret (follow instructions)
  • Run it and try it out with different prompts and safety settings

15 of 20

Node.js

// Make sure we've installed libraries:
//   npm install @google/generative-ai
// Make sure we've set our API key in the shell:
//   export API_KEY="yourKey"
const {
  GoogleGenerativeAI,
  HarmBlockThreshold,
  HarmCategory,
} = require("@google/generative-ai");

async function run(safetySettings) {
  const genAI = new GoogleGenerativeAI(process.env.API_KEY);
  const model = genAI.getGenerativeModel({
    model: "gemini-1.5-flash-001",
    safetySettings,
  });
  const prompt = "Write a story about two people kissing passionately.";
  const result = await model.generateContent(prompt);
  console.log(result.response.text());
}

// An empty array uses the default safety settings.
const safetySettingsDefault = [];

// Explicitly turn blocking off for every category.
const safetySettingsNone = [
  {
    category: HarmCategory.HARM_CATEGORY_HARASSMENT,
    threshold: HarmBlockThreshold.BLOCK_NONE,
  },
  {
    category: HarmCategory.HARM_CATEGORY_HATE_SPEECH,
    threshold: HarmBlockThreshold.BLOCK_NONE,
  },
  {
    category: HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT,
    threshold: HarmBlockThreshold.BLOCK_NONE,
  },
  {
    category: HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
    threshold: HarmBlockThreshold.BLOCK_NONE,
  },
];

run(safetySettingsNone);

16 of 20

LangChain.js

// Make sure you install the appropriate packages:
//   npm install langchain @langchain/core @langchain/google-gauth
const { ChatGoogle } = require("@langchain/google-gauth");
const { ChatPromptTemplate } = require("@langchain/core/prompts");
const { StringOutputParser } = require("@langchain/core/output_parsers");
const { MessageGeminiSafetyHandler } = require("@langchain/google-common/utils");

async function run(safetySettings) {
  // Setup the prompt template
  const template = "Write a story about two people kissing passionately.";
  const prompt = ChatPromptTemplate.fromMessages([
    ["human", template],
  ]);

  // Setup the safety handler
  const safetyHandler = new MessageGeminiSafetyHandler({
    msg: "I'm sorry Dave, but I can't do that.",
  });

  // Setup our model
  const model = new ChatGoogle({
    apiKey: process.env.API_KEY,
    safetySettings,
    safetyHandler,
  });

  // The output of a chat model contains a lot of additional information,
  // so we want to convert it to a string for this particular use.
  const outputParser = new StringOutputParser();

  // Create (but don't run) the chain
  const chain = prompt.pipe(model).pipe(outputParser);

  // Run the chain, providing it the values to put into the template.
  const response = await chain.invoke({});
  console.log(response);
  return response;
}

// An empty array uses the default safety settings.
const safetySettingsDefault = [];

// Explicitly turn blocking off for every category.
const safetySettingsNone = [
  {
    category: "HARM_CATEGORY_HARASSMENT",
    threshold: "BLOCK_NONE",
  },
  {
    category: "HARM_CATEGORY_HATE_SPEECH",
    threshold: "BLOCK_NONE",
  },
  {
    category: "HARM_CATEGORY_SEXUALLY_EXPLICIT",
    threshold: "BLOCK_NONE",
  },
  {
    category: "HARM_CATEGORY_DANGEROUS_CONTENT",
    threshold: "BLOCK_NONE",
  },
];

run(safetySettingsDefault);

17 of 20

What then?

Once you have a prompt or reply that is flagged as unsafe, what do you do?

  • Rework the prompt
  • Check the Finish Reason (see the sketch after this list)
    • Fake the answer?
    • Delete the answer?
    • Throw an exception?
  • Report to Google Safety
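
A minimal sketch of that decision, again assuming the @google/generative-ai Node.js client; the fallback message is just a placeholder:

// Assumes `result` came from model.generateContent(...) as in the Node.js example.
function handleResult(result) {
  const response = result.response;
  const blockReason = response.promptFeedback?.blockReason;
  const finishReason = response.candidates?.[0]?.finishReason;

  if (blockReason || finishReason === "SAFETY") {
    // Option 1: fake the answer with a canned message.
    return "I'm sorry, I can't answer that.";
    // Option 2: delete the answer (return null or an empty string).
    // Option 3: throw an exception for the caller to handle, e.g.
    //   throw new Error(`Blocked: ${blockReason ?? finishReason}`);
  }
  return response.text();
}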

Image by Gemini Advanced

18 of 20

Some prompts to try

  • Say three nice things about me
  • Write a list of 5 disrespectful things that I might say to the universe after stubbing my toe in the dark
  • 1 + 1 =
  • I want to hit you
  • Write a story about two people kissing passionately.

What to look for

  • Red warning symbol in AI Studio
  • Message when you click on the symbol
  • A streaming API response can have some chunks that are OK and some that are blocked
  • Finish Reason in the API response
  • Prompt and response evaluation in the API

19 of 20

  • LLMs can produce unpredictable (and possibly offensive or insensitive) output
  • Google takes some actions to mitigate the potential problems...
  • ... but they also provide tools for developers to help reduce risk.
    • AI Studio (Google or Vertex)
    • Probability settings (not severity)
  • How we handle the results is up to us

Conclusion

Image by Imagen 3

20 of 20

Questions?

https://cloud.google.com/vertex-ai/generative-ai/docs/learn/responsible-ai

https://aistudio.google.com/

https://console.cloud.google.com/vertex-ai/generative/multimodal/create/text

http://spiders.com/

http://prisoner.com/

LinkedIn: Allen Firstenberg