Apache Pinot Roadmap 2023
Brought to you by the Apache Pinot Community
Opening remarks
Apache Pinot Incubation
2022
2021
2020
2019
2018
800 Members
150 Contributors
500k Lines changed
100 Members
3500 Members
270 Contributors
10k Commits
100 Deployments
Presenter: Mayank Shrivastava
2000 Members
Kicking off the roadmap show-and-tell
Presenter: Neha Pawar
Multi-stage Engine
V2 Engine Improvements
Framework Enhancement
Query Planner Improvement
Partition Multi-threading
Presenter: Rong Rong
Co-located Joins
Colocated What?
Colocated Why?
Colocated When?
~30 TB data
2M+ MPS
30+ stages
P90 < 5s
Presenter: Ankit Sultana
Local Joins
What?
SELECT/AGGREGATE/GROUP-BY
TRANSFORM/LOCAL-JOIN
PROJECTION
FILTER
Why?
SELECT …
FROM T1 JOIN T2 ON T1.key = T2.key
WHERE T1.a = 123 AND T1.b > T2.c
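One way to picture the query above being answered locally is a hash join that runs entirely inside a single server. This is a minimal sketch, assuming the matching rows of T1 and T2 already sit on the same server; the function name `local_hash_join` and the sample rows are hypothetical and are not Pinot APIs.

```python
from collections import defaultdict

def local_hash_join(t1_rows, t2_rows):
    """Join T1 and T2 on `key` in-process, with no data shuffle (illustrative only)."""
    # Build side: index T2 rows by join key.
    t2_by_key = defaultdict(list)
    for r2 in t2_rows:
        t2_by_key[r2["key"]].append(r2)

    result = []
    for r1 in t1_rows:
        # Single-table predicate T1.a = 123 is applied before probing.
        if r1["a"] != 123:
            continue
        # Probe side: look up matching T2 rows, then apply T1.b > T2.c.
        for r2 in t2_by_key.get(r1["key"], []):
            if r1["b"] > r2["c"]:
                result.append((r1, r2))
    return result

# Hypothetical sample rows, invented for illustration.
t1 = [
    {"key": 1, "a": 123, "b": 10},
    {"key": 2, "a": 123, "b": 1},
    {"key": 1, "a": 999, "b": 50},
]
t2 = [
    {"key": 1, "c": 5},
    {"key": 2, "c": 7},
]

joined = local_hash_join(t1, t2)
print(joined)
```

Only the first T1 row survives here: the second fails `T1.b > T2.c` and the third fails `T1.a = 123`.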
When?
Presenter: Jackie Jiang
SQL Compliance
Postgres SQL compliance
SQL type compliance
Better usability with the SQL standard
Presenter: Yao Liu
Window Functions
Presenter: Sonam Mandal
Pagination
Presenter: Jialiang Li
Pluggability
Index SPI
Presenter: Gonzalo Ortiz
Ingestion
Spark Connector
Presenter: Caner Balci
Pauseless consumption
Motivation
Approach
Status/Challenges
Presenter: Sajjad Moradi
Record Deletion in upserts
Problem: Explicitly delete a record from an upsert table (#10452)
Use Case Examples:
Approach (Design Doc)
Presenter: Navina Ramesh
TTL for Upserts
Context
Presenter: Qiaochu Liu
Performance Enhancements
Group by improvements
High latency and low resource utilization for large scale group by
Presenter: Yao Liu
Offheap distinct(count)
Presenter: Jia Guo
Partitioned distinct(count)
%3 = 0: 3, 3, 12, 15, 18, 15, 21
%3 = 1: 1, 4, 4, 7, 7, 7, 13
%3 = 2: 5, 5, 5, 8, 11, 11
5 + 4 + 3 = 12
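The arithmetic on this slide can be reproduced with a short sketch. It assumes values are partitioned by `value % 3`, so equal values always fall in the same partition and the per-partition distinct counts can simply be added. The function name `partitioned_distinct_count` is hypothetical, not a Pinot API.

```python
from collections import defaultdict

def partitioned_distinct_count(values, num_partitions=3):
    """Return per-partition distinct counts and their sum."""
    partitions = defaultdict(set)
    for v in values:
        # Equal values share a partition, so the partition sets never overlap.
        partitions[v % num_partitions].add(v)
    per_partition = {p: len(s) for p, s in sorted(partitions.items())}
    return per_partition, sum(per_partition.values())

# The values shown on the slide, concatenated across the three partitions.
values = [3, 3, 12, 15, 18, 15, 21,
          1, 4, 4, 7, 7, 7, 13,
          5, 5, 5, 8, 11, 11]

per_partition, total = partitioned_distinct_count(values)
print(per_partition)  # {0: 5, 1: 4, 2: 3}
print(total)          # 12
```

Because the partitions are disjoint, no merge of the per-partition sets is needed; only the three counts are combined.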
Presenter: Jia Guo
Arg_Min/Max
| Position            | Avg(Salary) | Max(Salary) | ArgMax(Salary, FirstName) | ArgMax(Salary, ID) |
| Mechanical Engineer | 7000        | 18000       | Henry                     | 12345              |
|                     |             |             | Linda                     | 12310              |
| Professor           | 7200        | 16000       | Juliette                  | 13451              |
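The table suggests that ArgMax returns the projected column for every row that ties on the maximum, which is why Mechanical Engineer lists both Henry and Linda. Below is a minimal sketch of that behavior in plain Python. The function name `arg_max_by_group` is hypothetical, and the input rows marked as filler are invented for illustration, so the averages from the table are not reproduced.

```python
def arg_max_by_group(rows):
    """For each position, keep the max salary and all rows that attain it."""
    best = {}
    for position, first_name, emp_id, salary in rows:
        current = best.get(position)
        if current is None or salary > current["max_salary"]:
            # New maximum: reset the list of winners.
            best[position] = {
                "max_salary": salary,
                "first_names": [first_name],
                "ids": [emp_id],
            }
        elif salary == current["max_salary"]:
            # Tie on the maximum: keep every matching row.
            current["first_names"].append(first_name)
            current["ids"].append(emp_id)
    return best

# (position, first_name, employee_id, salary)
rows = [
    ("Mechanical Engineer", "Henry", 12345, 18000),
    ("Mechanical Engineer", "Linda", 12310, 18000),
    ("Mechanical Engineer", "Omar", 12001, 6000),   # invented filler row
    ("Professor", "Juliette", 13451, 16000),
    ("Professor", "Sam", 13002, 5000),              # invented filler row
]

result = arg_max_by_group(rows)
for position, info in result.items():
    print(position, info["max_salary"], info["first_names"], info["ids"])
```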
Presenter: Jia Guo
Continuing Resiliency Work (Workload Management, Scheduling)
Problem Statement
Mitigation
Presenter: Vivek Iyer
CLP and log compression
Presenter: Ting Chen
Operational Improvements
Groovy function registry
Function Registry
function x() -> Groovy(...)
function y() -> Groovy(...)
function z() -> Groovy(...)
…
Authorized users register functions
(API endpoint should be authorized based on ACL)
"ingestionConfig": {
  "filterConfig": {
    "filterFunction": "x()"
  },
  "transformConfigs": [{
    "columnName": "colA",
    "transformFunction": "y()"
  },
  ...
  ]
}
select z() from T...
Use those registered functions for ingestion config
Use those registered functions for queries
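The registry idea can be sketched as follows: named functions are registered by authorized users only, and the ingestion config then refers to them by name. This is an illustrative model, not the proposed Pinot design. The class `FunctionRegistry`, the helper `ingest`, the user names, and the column `colB` are all hypothetical, and Python callables stand in for Groovy scripts. The sketch also assumes a record is dropped when the filter function returns true.

```python
class FunctionRegistry:
    """Hypothetical registry: only authorized users may register functions."""

    def __init__(self, authorized_users):
        self._authorized = set(authorized_users)
        self._functions = {}

    def register(self, user, name, func):
        # Stand-in for an ACL check on the registration endpoint.
        if user not in self._authorized:
            raise PermissionError(f"{user} may not register functions")
        self._functions[name] = func

    def resolve(self, expression):
        # Turn a reference such as "x()" into the registered callable.
        name = expression.strip()
        if name.endswith("()"):
            name = name[:-2]
        return self._functions[name]


def ingest(records, config, registry):
    """Apply the registered filter and transform functions to each record."""
    drop = registry.resolve(config["filterConfig"]["filterFunction"])
    out = []
    for record in records:
        if drop(record):  # assumption: true means the record is filtered out
            continue
        row = dict(record)
        for t in config["transformConfigs"]:
            row[t["columnName"]] = registry.resolve(t["transformFunction"])(record)
        out.append(row)
    return out


registry = FunctionRegistry(authorized_users={"admin"})
registry.register("admin", "x", lambda record: record["colB"] < 0)
registry.register("admin", "y", lambda record: record["colB"] * 2)

ingestion_config = {
    "filterConfig": {"filterFunction": "x()"},
    "transformConfigs": [{"columnName": "colA", "transformFunction": "y()"}],
}

rows = ingest([{"colB": 2}, {"colB": -1}], ingestion_config, registry)
print(rows)  # [{'colB': 2, 'colA': 4}]
```

Keeping the function bodies in the registry means table configs and queries carry only a name, so access control can be enforced at the single point where code is registered.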
Presenter: Haitao Zhang
Some other interesting features in the poll
Presenter: Neha Pawar
Content
Advocacy Updates
Conferences
Presenter: Mark Needham
Q&A and closing remarks
Guidelines