you
rn
“An LLM agent runs tools in a loop to achieve a goal.”
What Exactly is an Agent?
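The one-line definition above can be sketched in a few lines of Python. This is a minimal illustration, not any particular framework's API: `scripted_model`, `row_count`, and `TOOLS` are hypothetical names, and the "model" is a fixed script standing in for a real LLM call so the example runs offline.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tool: a plain function standing in for a real data tool.
def row_count(table: str) -> str:
    fake_counts = {"orders": 1200, "customers": 300}
    return str(fake_counts.get(table, 0))

# Tool registry the agent is allowed to call (assumption for illustration).
TOOLS: Dict[str, Callable[[str], str]] = {"row_count": row_count}

Step = Tuple[str, str, str]  # (action, argument, observation)

def scripted_model(goal: str, history: List[Step]) -> Tuple[str, str]:
    """Stand-in for an LLM call: returns (action, argument).

    A real agent would send the goal and history to a model; this stub
    follows a fixed script so the loop can be run without network access.
    """
    if not history:
        return ("row_count", "orders")
    last_observation = history[-1][2]
    return ("finish", f"orders has {last_observation} rows")

def run_agent(goal: str, max_steps: int = 5) -> str:
    """Run tools in a loop until the model says it is done or the budget ends."""
    history: List[Step] = []
    for _ in range(max_steps):
        action, arg = scripted_model(goal, history)
        if action == "finish":
            return arg
        observation = TOOLS[action](arg)   # execute the chosen tool
        history.append((action, arg, observation))
    return "stopped: step budget exhausted"

answer = run_agent("How many rows are in orders?")
```

The step budget (`max_steps`) is a design choice worth keeping in any real loop: it bounds how long an autonomous agent can keep issuing tool calls.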
10× enterprise data growth, year over year
…not ONLY
Tokenmaxxing!
Human-cadence data systems 🧑‍💻
Designed for batch queries
QA cycles: ~16 days
Policy approvals: ~6 weeks
Agent-cadence query loads 🤖
Continuous, autonomous queries
Self-directed exploration
10× YoY data growth
Snowflake
Databricks
BigQuery
Hadoop
S3
ADLS
GCS
on-prem clusters
Vendor lock-in
Cost spirals
Expensive ETL
Fragmented governance
No schema context
Factory Floors: Customers' production sites (On-Prem)
Office: Their own teams (Cloud)
Portable compute: Runs anywhere data live
Federated catalog: Spans environments
Traveling governance: Policy moves with data
Intelligence layer: Reasons across the footprint
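As a rough sketch of how the first three pieces could fit together: a single catalog records where each dataset lives, the policy is stored with the dataset, and the job is sent to the data rather than the other way round. Everything here is hypothetical (`CATALOG`, `Policy`, `Dataset`, `run_where_data_lives`, and the sample rows); it is not xLake's or Apache Ranger's API, only an illustration of the idea.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set, Tuple

Row = Dict[str, object]

@dataclass
class Policy:
    masked_columns: Set[str]      # columns hidden from every job

@dataclass
class Dataset:
    site: str                     # where the data lives: "on-prem" or "cloud"
    rows: List[Row]
    policy: Policy                # policy is stored with the data

# Federated catalog: one lookup spanning both environments (made-up data).
CATALOG: Dict[str, Dataset] = {
    "sensor_readings": Dataset(
        site="on-prem",
        rows=[
            {"line": "A", "temp_c": 71, "operator": "kim"},
            {"line": "B", "temp_c": 64, "operator": "lee"},
        ],
        policy=Policy(masked_columns={"operator"}),
    ),
    "team_roster": Dataset(
        site="cloud",
        rows=[{"team": "data", "headcount": 12, "salary_band": "C"}],
        policy=Policy(masked_columns={"salary_band"}),
    ),
}

def apply_policy(rows: List[Row], policy: Policy) -> List[Row]:
    """Mask governed columns before any job sees the rows."""
    return [
        {k: ("***" if k in policy.masked_columns else v) for k, v in row.items()}
        for row in rows
    ]

def run_where_data_lives(name: str, job: Callable[[List[Row]], object]) -> Tuple[str, object]:
    """Dispatch the job to the dataset's site; the data itself is not copied out."""
    dataset = CATALOG[name]
    visible = apply_policy(dataset.rows, dataset.policy)
    return dataset.site, job(visible)

site, max_temp = run_where_data_lives(
    "sensor_readings", lambda rows: max(r["temp_c"] for r in rows)
)
_, operators = run_where_data_lives(
    "sensor_readings", lambda rows: [r["operator"] for r in rows]
)
```

Because masking happens inside the dispatch function, the same rule applies whether the job lands on the factory floor or in the cloud.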
Factory Floors
Portable Spark · on-prem GKE
Office
Same Spark, in cloud
Federated catalog · Traveling governance · Intelligence
Control Plane
xLake Compute (Hybrid, On-premise, VPC)
Enterprise Context, Governance for Agents & Applications
Observability, Automated Operations & Semantic Knowledge Layer
xLake Platform
Data Runtime
Agentic Runtime
Agentic Data Management
Data Warehousing
Automated Data Engineering
AI Applications
Streaming Applications
Analytics & Business Applications
Data & AI Observability
TPC-DS benchmark · 12-node GKE cluster · near-linear to 6.3B rows
Peak: 3.26M rows/sec · 5× faster than baseline · 1TB profiled in under 3 minutes
45B rows validated in under 2 hours · top-3 US telco
3× Spark performance on same hardware · $150–300K saved annually
5× faster validation vs. baseline · TPC-DS benchmark
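For a sense of scale, the headline figures above can be turned into implied minimum rates. This is back-of-the-envelope arithmetic only: it assumes "under 2 hours" and "under 3 minutes" are upper bounds on wall-clock time and that 1TB means 10^12 bytes. The telco result and the 12-node benchmark are separate measurements, so these rates are not directly comparable with the 3.26M rows/sec peak.

```python
# Figures taken from the slides; units are assumptions noted inline.
rows_validated = 45_000_000_000          # 45B rows
validation_window_s = 2 * 60 * 60        # "under 2 hours" treated as an upper bound

bytes_profiled = 1_000_000_000_000       # 1TB, assuming decimal terabytes
profiling_window_s = 3 * 60              # "under 3 minutes" treated as an upper bound

# Lower bounds on sustained throughput implied by those windows.
min_rows_per_sec = rows_validated / validation_window_s
min_gb_per_sec = bytes_profiled / profiling_window_s / 1e9

print(f"validation: at least {min_rows_per_sec:,.0f} rows/sec")
print(f"profiling:  at least {min_gb_per_sec:.2f} GB/sec")
```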
Snowflake
Centralized data warehouse
AI workloads
Growing rapidly
Dimension · Databricks-style · xLake
Deployment · Managed SaaS · Customer-managed Kubernetes
Foundation · Proprietary runtime · Open source (Apache)
Data location · Migrate to platform · Stays where it lives
Governance · Unity Catalog (native) · Apache Ranger (portable)
Cost profile · DBUs, compute + storage coupled · Compute and storage decoupled
AI integration · Mosaic / Genie · Schema-aware AI Studio
AI at speed · Slow deployment · Trusted data, fast
Self-healing · Reactive firefighting · Agents repair pipelines
Cost discipline · Runaway cloud spend · Decoupled, right-sized
Always-on compliance · Audit debt · Policy travels with data
Factory Floors
Office
Portable Spark on-site and in cloud
Federated catalog spanning both surfaces
One governance regime, end to end
Not 'cloud or on-prem.' Both. On one architecture.
Data don't move anymore.
The compute does.
youtube.com/c/JonKrohnLearns
linkedin.com/in/jonkrohn
jonkrohn.com/talks