1 of 8

Evolving KServe: The Unified Model Inference Platform for Both Predictive and Generative AI

Yuan Tang

Senior Principal Software Engineer, Red Hat

Project Lead, KServe

#KubeCon #CloudNativeCon

2 of 8

Model inference platform for both predictive and generative AI

  • Supported runtimes
  • Orchestration
  • Hardware accelerators (GPUs, CPUs, etc.)
  • Cloud native integrations
  • Autoscaling
  • Networking
  • GenAI integrations
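
For the predictive side, much of the stack above is driven by a single InferenceService resource. Below is a minimal sketch using the KServe Python SDK; the model name, namespace, and storage URI are illustrative (adapted from the upstream scikit-learn quick-start) and assume a cluster that already has KServe installed.

```python
# Minimal sketch: deploy a scikit-learn model as a KServe InferenceService.
# Assumes KServe is installed in the cluster and kubeconfig access is available.
from kubernetes import client
from kserve import (
    KServeClient,
    V1beta1InferenceService,
    V1beta1InferenceServiceSpec,
    V1beta1PredictorSpec,
    V1beta1SKLearnSpec,
)

NAME = "sklearn-iris"          # illustrative name
NAMESPACE = "kserve-test"      # illustrative namespace

isvc = V1beta1InferenceService(
    api_version="serving.kserve.io/v1beta1",
    kind="InferenceService",
    metadata=client.V1ObjectMeta(name=NAME, namespace=NAMESPACE),
    spec=V1beta1InferenceServiceSpec(
        predictor=V1beta1PredictorSpec(
            # Built-in sklearn serving runtime; storage_uri points at the model artifacts.
            sklearn=V1beta1SKLearnSpec(
                storage_uri="gs://kfserving-examples/models/sklearn/1.0/model"
            )
        )
    ),
)

kserve_client = KServeClient()
kserve_client.create(isvc)                                 # submit the resource
kserve_client.wait_isvc_ready(NAME, namespace=NAMESPACE)   # block until the service is Ready
```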

3 of 8

KServe

CNCF incubating project (Sept 2025)

30+ adopters

19 maintainers

300+ contributors

Trusted by industry leaders

KServe is used in production by organizations across various industries, providing reliable model inference at scale.

4 of 8

Predictive AI Features

Illustrations by Alexa Griffith
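
Once a predictive InferenceService is ready, it can be queried over KServe's Open Inference Protocol (v2). The sketch below uses plain HTTP; the host and model name are assumptions that depend on your ingress setup.

```python
# Minimal sketch: call a predictive InferenceService via the Open Inference
# Protocol (v2). The base URL and model name below are illustrative.
import requests

BASE_URL = "http://sklearn-iris.kserve-test.example.com"  # assumed ingress host
MODEL = "sklearn-iris"

payload = {
    "inputs": [
        {
            "name": "input-0",
            "shape": [2, 4],
            "datatype": "FP64",
            "data": [[6.8, 2.8, 4.8, 1.4], [6.0, 3.4, 4.5, 1.6]],
        }
    ]
}

resp = requests.post(f"{BASE_URL}/v2/models/{MODEL}/infer", json=payload, timeout=30)
resp.raise_for_status()
print(resp.json()["outputs"])  # predictions come back as v2 output tensors
```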

5 of 8

GenAI Building Blocks

  • LLM Metric-Based Autoscaling
  • Prompt Caching
  • Intelligent Routing, Traffic Management
  • GenAI Runtime: vLLM, TRT-LLM, llm-d

Scale, Cost, Latency / Throughput, Efficiency
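
As a concrete example of the GenAI runtime block, a KServe-hosted LLM can typically be queried through an OpenAI-compatible API (as exposed by runtimes such as vLLM). The endpoint path, host, and model name below are assumptions and vary by runtime and ingress configuration.

```python
# Sketch: chat completion against an OpenAI-compatible LLM endpoint served
# behind KServe. Base URL, route prefix, and model name are illustrative.
from openai import OpenAI

client = OpenAI(
    base_url="http://llama-isvc.kserve-test.example.com/openai/v1",  # assumed host + prefix
    api_key="not-needed",  # placeholder; the serving runtime may not check keys
)

resp = client.chat.completions.create(
    model="llama-3-8b-instruct",  # assumed served model name
    messages=[{"role": "user", "content": "Summarize what KServe does in one sentence."}],
    max_tokens=64,
)
print(resp.choices[0].message.content)
```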

6 of 8

Optimized LLM Inference

A CNCF Sandbox project for distributed large language model inference that runs natively on Kubernetes.

7 of 8

Join Our Community!

  • Repo: https://github.com/kserve/kserve
  • Website: https://kserve.github.io
  • Biweekly community meetings on Thursdays at 9 AM PST
  • #kserve and #kserve-contributors channels in the CNCF Slack


8 of 8

Find Us This Week!

  • Check out our maintainer session on Thursday! https://sched.co/2EF54
  • We have a project booth throughout the week!
    • Kiosk Number: P-8A
    • Location: Halls 1-5 | Project Pavilion
    • Schedule:
      • Tuesday 10:15 - 14:40
      • Wednesday 10:00 - 13:30
      • Thursday 10:00 - 12:00
