Ruby + AI for building multi-agent architecture
Sergey Sergyenko
24 Oct 2024, EDI
Sergey Sergyenko
@sergyenko
Ruby Software Development Agency
Addicted to Ruby since v2.1.9
https://rubyness.co.uk
sergy@rubyness.co.uk
AI + Ruby and Rails
Ruby + AI for multi-agent architecture
When you start an AI project,
in 99% of cases you go with a platform
Microsoft Azure AI, Google Cloud AI, OpenAI API, Hugging Face, AWS AI/ML
Is ChatGPT a language model?
In most cases, one model is not enough
Multi-Agent Architecture
over
“One-Size-Fits-All” LLM
Typical MA Ecosystem
User
Input
Initial Analysis and Routing
LLM Function Calling
Agent Skills Calling
RAG
Internal / External APIs
Evals
Debug / Graph
Agent Step
Analyze
Evals
Continuous Analysis and re-Routing
Router and
Planner function
Router
Planner
Agent Step
Skill 1
Skill N-1
User
Input
Skill 2
Messages
Memory
Critique Step
Router
Planner
Agent Step
Skill N
User
Input
Critique Step
Agent Process (Router - Skill)
Saved State
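The Router → Agent Step → Critique loop above can be sketched in plain Ruby. All class and skill names here are illustrative stubs, not a real gem's API; a real router would call an LLM instead of matching keywords:

```ruby
# Minimal sketch of the Router -> Agent Step loop with a Messages/Memory store.
class Router
  def initialize(skills)
    @skills = skills
  end

  # Pick a skill for a message; a real router would ask an LLM here.
  def route(message)
    @skills.find { |skill| skill.handles?(message) }
  end
end

Skill = Struct.new(:name, :matcher, :handler) do
  def handles?(message)
    message.match?(matcher)
  end

  def call(message)
    handler.call(message)
  end
end

skills = [
  Skill.new(:search,  /find|search|kindle/i, ->(m) { "searching for: #{m}" }),
  Skill.new(:support, /help|broken/i,        ->(m) { "escalating: #{m}" })
]

memory = []                         # the Messages / Memory box
router = Router.new(skills)

input  = "Please find a Kindle e-reader"
skill  = router.route(input)        # Agent Step: routing
result = skill.call(input)          # Agent Step: skill execution
memory << { input: input, skill: skill.name, result: result }
```

A Critique Step would inspect `result` and either accept it or feed it back through the router for re-routing, as the "Continuous Analysis and re-Routing" arrow suggests.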
Router Example
Customer Input
I need a new Kindle e-reader for my reading hobby. Are there any discounts currently?
Classification: OpenAI Call
Item Search
Q/A
LLM
Recommend Item
LLM
Query Response
Purchase
Query
Classification::Purchase
Classification::Query
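The classification step above can be sketched as follows. The "Classification: OpenAI Call" box is stubbed with a keyword heuristic so the flow is visible without an API key; in the talk's example it would be a real OpenAI request:

```ruby
# Route a customer message to the Purchase or Query branch.
module Classification
  Purchase = :purchase
  Query    = :query
end

def classify(input)
  # Stand-in for the OpenAI classification call.
  if input.match?(/\b(buy|need|purchase|order)\b/i)
    Classification::Purchase
  else
    Classification::Query
  end
end

def route_intent(input)
  case classify(input)
  when Classification::Purchase then "Item Search -> LLM -> Recommend Item"
  when Classification::Query    then "Q/A -> LLM -> Query Response"
  end
end

route_intent("I need a new Kindle e-reader. Are there any discounts currently?")
```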
Code
LLM
User Intent
Router
Product Comparison
Product Search
Customer Support
Track Package
Promos and Deals
Chatbot Query: Product, Price, Quality
Unstructured to Structured
Extract Query
Call Order Details API
Call Promo Database for latest promos
Unstructured to Structured
Search API
RAG on Customer Support Docs
Tracking UI
Promo UI
Search API
Rank Products
Chatbot Query: Classification, FAQ
Chat with Live Support Agent
Comparison UI
Summarize Product Description
Add to Wishlist
Checkout
Skills
LLM Call
Application
API Call
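The intent → skill chains in the diagram can be written down as a plain dispatch table. Each step is tagged with its type from the legend (`:llm` for an LLM call, `:api` for an API call, `:app` for the application layer); the chains below cover three of the branches shown:

```ruby
# Skill chains per user intent, following the diagram's legend.
SKILL_CHAINS = {
  product_comparison: [[:llm, "Unstructured to Structured"],
                       [:api, "Search API"],
                       [:app, "Comparison UI"]],
  track_package:      [[:llm, "Extract Query"],
                       [:api, "Call Order Details API"],
                       [:app, "Tracking UI"]],
  promos_and_deals:   [[:api, "Call Promo Database for latest promos"],
                       [:app, "Promo UI"]]
}.freeze

def plan(intent)
  SKILL_CHAINS.fetch(intent).map { |type, step| "#{type}: #{step}" }
end
```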
Router Example
Chatbot Query: Classification, FAQ
Return to Menu
User Intent
Router
Product Comparison
Chatbot Query: Product, Price, Quality
Unstructured to Structured
Search API
Comparison UI
Skills
LLM Call
Application
API Call
Router Example
PROMPT
PROMPT
Function Call
Customer Support
Extract Query
RAG on Customer Support Docs
Chatbot Query: Classification, FAQ
Return to Menu
Function Call
PROMPT
PROMPT
User Intent
Router
Product Comparison
Chatbot Query: Product, Price, Quality
Unstructured to Structured
Search API
Comparison UI
Skills
LLM Call
Application
API Call
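The "Function Call" boxes in this build map to LLM tool definitions. As a sketch, the Customer Support branch's `Extract Query` step could be declared as a Ruby hash in the OpenAI function-calling schema (the function name and parameters are illustrative):

```ruby
# Hypothetical tool definition for the "Extract Query" function call.
EXTRACT_QUERY_TOOL = {
  type: "function",
  function: {
    name: "extract_query",
    description: "Turn a free-form support message into a structured FAQ query",
    parameters: {
      type: "object",
      properties: {
        topic:    { type: "string", description: "FAQ topic, e.g. returns, shipping" },
        question: { type: "string", description: "The customer's actual question" }
      },
      required: ["topic", "question"]
    }
  }
}.freeze
```

The router sends this definition alongside the prompt; when the model responds with a tool call, its structured arguments feed the "RAG on Customer Support Docs" step.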
Conversational Model
Classifier Model
Instructional Model
Extract Query
RAG on Customer Support Docs
Chatbot Query: Classification, FAQ
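One model rarely fits every step, which is the point of the slide: each stage gets the model class it needs. As a sketch, the mapping could be held as plain configuration (the step-to-model pairing follows the slide; any concrete model names would be a deployment choice):

```ruby
# Which class of model serves which pipeline step.
MODEL_PER_STEP = {
  "Chatbot Query: Classification, FAQ" => :classifier_model,     # cheap, fast intent routing
  "Extract Query"                      => :instructional_model,  # structured output
  "RAG on Customer Support Docs"       => :conversational_model  # final user-facing answer
}.freeze
```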
Own Your Own AI
Story behind llama.cpp
Georgi Gerganov
Sofia, Bulgaria
$ git clone https://github.com/ggerganov/llama.cpp.git
$ python3 -m pip install -r requirements.txt
$ cd models
llama.cpp
brew install llama.cpp
🤗
$ git lfs install
$ git clone git@hf.co:openlm-research/open_llama_7b
Clone Model Repository (🤗 huggingface.co)
Convert the Model to GGUF format
$ python3 convert_hf_to_gguf.py models/open_llama_7b
$ make
This conversion enables the model to be loaded and executed with improved performance on CPUs.
Quantization reduces the precision of model weights (e.g., from 32-bit to 16-bit or even 1-bit) to save memory and speed up inference. For example:
- **32-bit weight**: `0.123456789`
- **16-bit weight**: `0.1234`
- **1-bit weight**: `0` or `1`
This allows AI models to run efficiently on devices like smartphones or IoT hardware with minimal accuracy loss.
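The trade-off above can be illustrated with a toy round-trip in Ruby: map a 32-bit float onto 16 levels (4 bits) over a fixed range, then reconstruct it. This is a simplification of what `llama-quantize` does (real schemes like `q4_0` quantize in blocks with per-block scales), but it shows where the accuracy loss comes from:

```ruby
# Quantize a weight to 4 bits (16 levels) over [-1, 1] and reconstruct it.
def quantize_4bit(w, min = -1.0, max = 1.0)
  levels = 15.0                                   # 2**4 - 1 steps
  code   = ((w - min) / (max - min) * levels).round
  approx = min + code * (max - min) / levels
  [code, approx]                                  # [stored 4-bit code, reconstructed weight]
end

code, approx = quantize_4bit(0.123456789)
error = (0.123456789 - approx).abs
# the weight survives as one of only 16 values; `error` is the accuracy cost
```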
Quantize the Model
$ ./llama-quantize \
./models/open_llama_7b/Open_Llama_7B-6.7B-F16.gguf \
./models/open_llama_7b/Open_Llama_7B-6.7B-Q4_0.gguf \
q4_0
Demo
require 'llama_cpp'
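The demo loads the model through the llama_cpp gem. A gem-free alternative is to drive the same quantized model through llama.cpp's `llama-cli` binary; the sketch below only builds the command line (the model path is illustrative), so it runs even without the binary installed:

```ruby
# Build a llama-cli invocation for the quantized GGUF model.
def llama_command(model:, prompt:, n_predict: 128)
  ["llama-cli",
   "-m", model,           # quantized GGUF produced by llama-quantize
   "-p", prompt,          # the prompt text
   "-n", n_predict.to_s]  # number of tokens to generate
end

cmd = llama_command(model:  "models/open_llama_7b/model-q4_0.gguf",
                    prompt: "Hello from Ruby")
# run with: system(*cmd)
```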
rspec-llama
gem install rspec-llama
https://github.com/aifoundry-org/rspec-llama
LLamagator
docker compose up
https://github.com/aifoundry-org/llamagator
LLamagator
LLM-as-a-Judge made easy
https://github.com/aifoundry-org/llamagator
Thank you!
@sergyenko