Copyright © 2026
Miguel Soto
Cloud Solutions Architect, LATAM
Intel Xeon 6
The Future of the AI-Powered Cloud
The majority of enterprise AI projects run on Intel® Xeon®, built for scalable, general-purpose AI workloads.
GenAI is shifting from GPU-heavy LLM training to smaller, more targeted models, inference optimization, and agentic AI.
And GenAI is only part of the enterprise AI story. For decades, enterprises have relied on general-purpose compute to power AI – from data analytics and machine learning to forecasting and fraud detection.
Customer service teams leverage churn prediction models to proactively identify and retain high-value clients.
Operations teams optimize supply chain decisions using predictive analytics to improve inventory accuracy and reduce delays.
The finance team uses continuous fraud detection to monitor all transactions in real-time and trigger instant alerts.
Product developers accelerate designs with rapid prototyping and GenAI-powered simulation tools.
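Workloads like the fraud-detection and churn examples above are classic CPU-native machine learning. A minimal sketch with scikit-learn and synthetic data (the dataset, features, and model choice here are illustrative, not taken from this deck):

```python
# Illustrative CPU-native classical ML: train an ensemble fraud detector.
# Synthetic data stands in for real transaction features.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# 10,000 synthetic "transactions" with a ~1% fraud rate
X, y = make_classification(n_samples=10_000, n_features=20,
                           weights=[0.99, 0.01], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Ensemble models like random forests execute natively on CPU cores;
# n_jobs=-1 spreads the work across all of them.
clf = RandomForestClassifier(n_estimators=100, n_jobs=-1, random_state=0)
clf.fit(X_train, y_train)

scores = clf.predict_proba(X_test)[:, 1]  # fraud probability per transaction
print(f"ROC AUC: {roc_auc_score(y_test, scores):.3f}")
```

In production, the same pattern runs continuously against streaming transactions, which is exactly the "continuous fraud detection" use case above.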
Enterprise AI, in reality, looks like this:
MATURE AI → EMERGING AI

AI WORKLOAD CATEGORY | CPU ROLE
Data & Feature Engineering | Ingestion, transformation, orchestration, vectorization
Classical Machine Learning Training & Inference | Native execution; optimal for training and inference; ensemble models, batch processing
Deep Learning Inference | Small/mid-size DL models, transformer inference, real-time and batch execution
Generative AI Fine-Tuning / Training | Multi-GPU orchestration, memory/I/O coordination, fallback compute
Generative AI Inference | Executes small/mid-size model inference with low-latency response; orchestrates RAG pipelines and optimizes MoE routing
Agentic AI Orchestration | Task routing, tool execution, hybrid model coordination
Edge & Embedded AI | Orchestrates low-latency GenAI inference at the edge, with fallback compute, RAG execution, and agentic coordination
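The "orchestrates RAG pipelines" role can be made concrete: the retrieval step of a RAG pipeline is typically a similarity search that runs well on CPU vector units. A sketch with NumPy and random placeholder embeddings (a real pipeline would use an embedding model and a vector store):

```python
# Illustrative CPU-side RAG retrieval: brute-force cosine similarity
# over document embeddings. The embeddings are random placeholders.
import numpy as np

rng = np.random.default_rng(0)
docs = rng.normal(size=(1_000, 384)).astype(np.float32)  # 1,000 "documents"
docs /= np.linalg.norm(docs, axis=1, keepdims=True)      # L2-normalize once

def top_k(query: np.ndarray, k: int = 3) -> np.ndarray:
    """Return indices of the k most similar documents (cosine similarity)."""
    q = query / np.linalg.norm(query)
    sims = docs @ q                        # one matrix-vector product on CPU
    return np.argsort(sims)[::-1][:k]

hits = top_k(rng.normal(size=384).astype(np.float32))
print(hits)  # indices of the 3 best-matching documents
```

The retrieved documents would then be packed into the model prompt; the search itself is pure linear algebra that CPU SIMD units handle directly.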
The foundation is already there. Your business, your developers, your operations have operated in this CPU environment for 30 years.
Why CPUs now? Because…
CPUs provide flexibility across your enterprise workloads and deployment environments.
CPUs are still the ideal compute for mature AI and AI data architectures.
CPUs are required in accelerator-based AI systems to serve as the central orchestrator.
CPUs are foundational to secure AI infrastructure, enabling trusted execution and enterprise-grade data protection.
CPUs are optimized for emerging AI workloads like SLM inference and Agentic AI.
With CPUs, the future is more accessible than you think.
Why choose our accelerated CPUs:
Trusted Compute Foundation
AI-Optimized Architecture
Deployment & Workload Flexibility
TRUSTED COMPUTE FOUNDATION
When we say Trusted Compute Foundation, we’re talking about a deep commitment to keeping your business safe, secure, and resilient - right from the heart of your infrastructure.
With Intel Xeon, security starts inside the chip itself. Technologies like SGX and TDX enable confidential computing and Zero Trust architectures at the hardware level - protecting data, applications, and memory even while in use.
Xeon is a proven foundation - trusted across enterprise workloads for decades. It integrates seamlessly into your existing environments, so you can modernize securely without retraining teams or rearchitecting systems.
100% of the Intel processor vulnerabilities addressed in 2024 were discovered through internal security research.
Intel scored 82.2, ranking #1 across the silicon industry for product security assurance maturity.
Intel reported 4.4x fewer firmware vulnerabilities in root-of-trust and 1.8x fewer in confidential computing technologies than AMD.
Xeon 6 handles up to 69,000 concurrent queries – up to 3.2× more concurrent prompts than AMD EPYC 9965.
Xeon 6 with AMX delivers up to:
2.59× higher vector search throughput for RAG systems vs AMD EPYC 9575F
1.93× higher DLRM performance vs AMD EPYC 9654
1.85× faster BERT-Large inference vs AMD EPYC 9654
17× faster ResNet-50 batch inference vs AMD EPYC 9654
AI-OPTIMIZED INFRASTRUCTURE
When we say AI-Optimized Infrastructure, we mean giving your business the tools it needs to run enterprise AI smarter, faster, and easier – on the Xeon platform it already trusts.
Xeon accelerates AI workloads with built-in instruction sets that deliver intelligent performance out of the box, without changing your apps or retraining your teams.
Integrated accelerators handle data movement, analytics, and security - streamlining the AI pipeline and freeing up compute for what matters most.
And with high core density, advanced memory, and energy-efficient design, Xeon scales to support any AI workload - from single models to enterprise-wide deployments.
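Those built-in instruction sets (for example Intel AMX) are advertised through CPU feature flags, so software can select an optimized code path at runtime. A minimal Linux-only sketch; on other platforms it simply reports False:

```python
# Illustrative runtime check for Intel AMX support via /proc/cpuinfo.
# Linux-only; elsewhere (or on CPUs without AMX) it returns False.
from pathlib import Path

def has_amx() -> bool:
    cpuinfo = Path("/proc/cpuinfo")
    if not cpuinfo.exists():          # e.g. macOS or Windows
        return False
    for line in cpuinfo.read_text().splitlines():
        if line.startswith("flags"):
            # amx_tile is the base AMX flag; amx_bf16 and amx_int8 also exist
            return "amx_tile" in line.split()
    return False

print(has_amx())
```

AI frameworks with Xeon support perform an equivalent check internally and dispatch to AMX kernels when the flag is present, which is why apps gain the speedup without code changes.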
Xeon 6 delivers up to 50% lower TCO and stronger performance than AMD EPYC 9005 - across diverse enterprise workloads.
Supports predictable scaling and operational autonomy across deployment models.
Intel–NVIDIA collaboration aligns the x86 stack with the CUDA architecture, simplifying deployment across GPU-accelerated AI workloads.
DEPLOYMENT & WORKLOAD FLEXIBILITY
When we say Deployment & Workload Flexibility, we mean it’s built to handle any business computing challenge - wherever and however you need it.
Xeon supports diverse workloads on a single platform, so you don’t need separate systems for different jobs.
Deployments are simplified and future-ready thanks to Xeon’s ability to run in public, private, sovereign, and edge environments - from data centers to retail stores to remote sites.
Its open ecosystem works with leading software stacks and avoids vendor lock-in, making it easy to launch new solutions and reduce investment risk as your business evolves.
Matching GPU Price Performance Using Amazon Instances With Intel® Xeon® Processors
1 For more complete information about performance and benchmark results, visit https://www.intel.com/content/www/us/en/customer-spotlight/stories/storm-reply-customer-story.html
Solving problems from AI/ML and Analytics to Database and HPC
Achieve Cost-Performance for the Workloads that Matter with new AWS 8th Gen EC2 Instances powered by custom Intel® Xeon® 6 CPUs
Delivering through our strong partnership
Intel and AWS Partnership Dates Back to the First EC2 Instance with Intel® Xeon® Processor
“At AWS, we’re committed to delivering the most powerful and innovative cloud infrastructure to our customers. By co-developing next-generation AI fabric chips on Intel 18A, we continue our long-standing collaboration, dating back to 2006 when we launched the first Amazon EC2 instance featuring their chips. Our continued collaboration allows us to empower our joint customers with the ability to run any workload and unlock new AI capabilities.”
–Matt Garman, CEO at AWS
Hardware optimization
© Copyright 2026, Intel | Confidential – NDA Required
Intel Xeon generation | AWS instances | Price*
Intel Xeon (2006) | M1 | M1.2xlarge: $255
Intel Xeon Scalable 2nd Gen (2019) | M5, C5, R5… | M5.2xlarge: $280
Intel Xeon Scalable 3rd Gen (2021) | M6i, C6i, R6i… | M6i.2xlarge: $280
Intel Xeon Scalable 4th Gen (2023) | M7i, C7i, R7i… | M7i.2xlarge: $294; Flex: $279
Intel Xeon Scalable 5th Gen (2024) | I7ie, G7, P6 | N/A
Intel Xeon Scalable 6th Gen (2025) | M8i, M8id, … | M8i.2xlarge: $309; Flex: $293

*Prices based on AWS public calculator.
Intel® Architecture Instance Types on AWS

General Purpose – a balance of compute, memory, and networking resources, usable for a variety of diverse workloads.
Compute Optimized – ideal for compute-bound applications that benefit from high-performance processors.
Memory Optimized – designed to deliver fast performance for workloads that process large data sets in memory.
Accelerated Compute – uses hardware accelerators, or co-processors, to perform functions more efficiently.
Storage Optimized – designed for workloads that require high, sequential read and write access to very large data sets on local storage.
HPC Optimized – ideal for applications that benefit from high-performance processors, including large, complex simulations and deep learning workloads.

Processor families represented: Intel® Xeon® v3 and v4 processors; 1st through 6th Gen Intel® Xeon® Scalable processors (4th Gen and later feature Intel® AMX); Habana Gaudi (DL1).

Instance types shown, by category:
General Purpose: T2, T3, M4, M5(d), M5(d)n, M5zn, M6i(d), M6i(d)n, M7i, M7i-Flex, M8i, M8i-Flex, HMI
Compute Optimized: C4, C5(d), C5n, C6i(d), C6in, C7i, C7i-Flex, C8i, C8i-Flex
Memory Optimized: R4, R5(d), R5(d)n, R5b, R6i(d), R6i(d)n, R7i, R7iz, R8i, R8i-Flex, U7i, X1, X1e, X2idn, X2iedn, X2iezn, z1d
Accelerated Compute: P2, P3, P3dn, P4d, G3, G4dn, F1, DL1
Storage Optimized: D2, D3, D3en, H1, I3, I3en, I4i, I7i, I7ie
HPC Optimized: HPC6id

See https://aws.amazon.com/ec2/instance-types/ and speaker notes for details.
Instance Analysis
Current instance | Hourly $ | 6i alternative | $/perf advantage* | 7i alternative | $/perf advantage* |
c5.xlarge | 0.2620 | c6i.xlarge | 20% | c7i.xlarge | 33% |
c6a.2xlarge | 0.4716 | c6i.2xlarge | 3% | c7i.2xlarge | 29% |
m5.xlarge | 0.3060 | m6i.xlarge | 20% | m7i.xlarge | 33% |
m5.2xlarge | 0.6120 | m6i.2xlarge | 20% | m7i.2xlarge | 33% |
m6a.2xlarge | 0.5508 | m6i.2xlarge | 3% | m7i.2xlarge | 29% |
Public prices, São Paulo Region
* Estimated
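The $/perf columns above can be reproduced mechanically. A sketch of the arithmetic, using the table's m5.xlarge price; the 25% performance uplift is a hypothetical placeholder, chosen to show how a 20% price-performance advantage can arise even at equal hourly prices:

```python
# Illustrative $/perf arithmetic: how much cheaper each unit of
# performance gets when moving to a newer instance generation.
def cost_perf_gain(old_price: float, new_price: float, perf_uplift: float) -> float:
    """Fractional $/perf improvement; perf_uplift = new perf / old perf."""
    old_cost_per_perf = old_price / 1.0          # baseline perf = 1.0
    new_cost_per_perf = new_price / perf_uplift
    return 1 - new_cost_per_perf / old_cost_per_perf

# Same hourly price ($0.3060, m5.xlarge in São Paulo) with a hypothetical
# 25% performance uplift gives a 20% better price-performance ratio.
gain = cost_perf_gain(0.3060, 0.3060, 1.25)
print(f"{gain:.0%}")
```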
Resource optimization
5:3 Instance Consolidation

 | 2nd Generation Intel Xeon Scalable | 4th Generation Intel Xeon Scalable
Instance | C5.xlarge | m7i-flex.xlarge
Instance count | 5 | 3
Hourly price per instance* | $0.2620 | $0.3052
Total | $1.31 (5 instances) | $0.9157 (3 instances)
Total RAM | 40 GB | 48 GB

30% potential savings

*Hourly price per instance in São Paulo, pay-as-you-go, Linux.
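A quick check of the consolidation arithmetic above, using the São Paulo hourly rates from the slide:

```python
# Verify the 5:3 consolidation math: 5x C5.xlarge vs 3x m7i-flex.xlarge.
old_count, old_price = 5, 0.2620    # C5.xlarge (2nd Gen Xeon Scalable)
new_count, new_price = 3, 0.3052    # m7i-flex.xlarge (4th Gen)

old_total = old_count * old_price   # $1.31/hr
new_total = new_count * new_price   # ~$0.9156/hr
savings = 1 - new_total / old_total

print(f"${old_total:.2f}/hr -> ${new_total:.4f}/hr: {savings:.0%} potential savings")
```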
ITAU Unibanco – Case Study

The largest bank in Brazil and Latin America, with operations around the world, serving some 55 million customers.

Challenge: Transform and modernize its applications, reducing costs, increasing profit, and improving scalability.

Solution: Migrated 99% of its private cloud and 20% of its distributed platform to AWS – more than 19,000 servers.

Results: A 99% reduction in platform delivery time and improved customer satisfaction, with 3.5x to 6.4x higher infrastructure performance through AWS instances based on 4th Gen Intel Xeon.