I2 Cloud Community Chatbot

Timothy Manik, Internet2

Date: May 21, 2026

Agenda

  • Background
  • Architecture
  • Demo
  • What’s next?
  • Q&A

In the beginning…

Background

  • Early 2024: We want a chatbot. Who can help us build?
  • Summer 2024: Jan connects us with the CalPoly CIC Team
  • Fall 2024: They're booked solid. So we wait…
  • Early 2025: A slot opens!
  • Mid 2025: Iterations begin
  • Fall 2025: Prototype is done!

Architecture - Web App

Architecture - Data Pipeline

Architecture - Data Pipeline (Video)

  1. Transcribe
  2. Detect Scenes
  3. Transcript-Scene Alignment
  4. OCR Text Detection
    1. Pre-filtering with Tesseract
    2. AWS Textract over textual scenes
    3. Cut Textract costs by ~90%
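
The OCR step above can be sketched as a two-stage filter: a cheap local pass decides which scenes contain text, and only those scenes are sent to the paid service. This is a minimal sketch, not the team's implementation. The `Scene` type, the function names, and the `min_chars` threshold are all hypothetical, and the slide does not say how the pre-filter decides that a scene is "textual". In practice `cheap_ocr` would presumably wrap Tesseract and `expensive_ocr` would wrap AWS Textract; here both are plain callables so the logic stays independent of either library.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class Scene:
    """Hypothetical scene record: an id plus one encoded keyframe image."""
    scene_id: int
    frame: bytes


def select_textual_scenes(
    scenes: Iterable[Scene],
    cheap_ocr: Callable[[bytes], str],
    min_chars: int = 20,  # assumed threshold, not taken from the slides
) -> List[Scene]:
    """Keep only scenes where the cheap OCR pass finds enough text."""
    kept = []
    for scene in scenes:
        text = cheap_ocr(scene.frame)
        # Count non-whitespace characters so blank frames are dropped.
        visible = sum(1 for ch in text if not ch.isspace())
        if visible >= min_chars:
            kept.append(scene)
    return kept


def run_ocr(
    scenes: Iterable[Scene],
    cheap_ocr: Callable[[bytes], str],
    expensive_ocr: Callable[[bytes], str],
    min_chars: int = 20,
) -> Dict[int, str]:
    """Run the expensive OCR only on scenes that passed the pre-filter."""
    textual = select_textual_scenes(scenes, cheap_ocr, min_chars)
    return {scene.scene_id: expensive_ocr(scene.frame) for scene in textual}
```

The saving comes from the number of scenes that never reach `expensive_ocr`; the actual ratio depends on how much of the video shows text on screen.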

[Figure: timeline comparing one large text chunk (0s to 30s) with several small text chunks (marks at 5s, 10s, 15s, 23s). Annotation: "Content is here, timestamp is 18s early."]

  1. Chunking
    1. Split long dialogues at speaker change for retrieval
    2. Chunk length: 400 to 800 characters
  2. Index chunks into OpenSearch
    • Timestamp at start of chunk

Demo

What’s next?

Links

Thank you!

Q&A
