Why Rent When You Can Own?
Build your modern data lakehouse with true optionality
01 What’s The Problem With Today’s Architecture?
The Data Warehouse Paradigm Creates Vendor Lock-In
Your data is locked into a proprietary database
02 Why A Data Lakehouse?
Lakehouse = Data Warehouse Without Vendor Lock-in, With Best-Of-Breed Tools
What An Open Lakehouse Looks Like
Lakehouse Offers More Functionality Without Compromise
Feature | Lakehouse | Data Warehouse |
Interactive queries | Yes | Yes |
Manipulation of data (DML) | Yes | Yes |
Petabytes of data | Yes | No |
Indexing and caching to speed up queries | Yes, with Starburst + Varada | Yes |
Ability to use the best engine for your use case, not locked into a vendor’s ecosystem | Yes | No |
Optionality to switch to open source | Yes, with Starburst/Trino | No |
Active data warehousing | No | Yes |
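To make the DML row above concrete, here is a minimal sketch of an upsert written as a Trino MERGE statement. The catalog `lake`, the schemas, and the table and column names are all hypothetical, and the sketch assumes the catalog uses a table format that supports row-level changes (for example Iceberg or Delta Lake).

```sql
-- Sketch only: lake, sales, staging, and all table/column names are
-- hypothetical placeholders, not objects defined in this deck.
MERGE INTO lake.sales.customer_dim t
USING lake.staging.customer_updates s
  ON t.custkey = s.custkey
-- Rows present on both sides: refresh the mutable attributes.
WHEN MATCHED THEN
  UPDATE SET name = s.name, address = s.address
-- Rows only in the staging table: add them to the dimension.
WHEN NOT MATCHED THEN
  INSERT (custkey, name, address)
  VALUES (s.custkey, s.name, s.address);
```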
Get The Benefits Of Today And Years To Come With The Lakehouse
Why The Starburst Approach To The Lakehouse
03 How To Build The Data Lakehouse
Lakehouse Architecture
What It Looks Like With Starburst Galaxy
04 Operate Your Data Lakehouse With Starburst
Use Case: Data Lakehouse Engine
Deploy to any environment. Supports HDFS, cloud object storage, and S3-compatible storage (Dell ECS, MinIO, etc.)
High-concurrency, auto-scaling MPP engine (Trino), widely used in industry as a replacement for Hive
Full role-based access control
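As an illustration of the access control bullet above, role-based access can be expressed with standard Trino role and grant statements. This is a sketch under assumptions: the role `analyst`, the user `alice`, and the table `lake.sales.orders` are hypothetical, and whether these statements are accepted depends on the catalog and the access control system configured for the deployment.

```sql
-- Sketch only: analyst, alice, and lake.sales.orders are hypothetical names.
-- Support for these statements varies by catalog and access control setup.

-- Define a role that groups read-only privileges.
CREATE ROLE analyst;

-- Allow members of the role to read one table.
GRANT SELECT ON lake.sales.orders TO ROLE analyst;

-- Make a user a member of the role.
GRANT analyst TO USER alice;
```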
Use Case: Data Lakehouse With Data Mesh
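-- Join customers stored in Teradata to their orders stored in SQL Server
SELECT
  c.custkey,
  o.orderkey,
  o.shippriority
FROM
  teradata.tpch.customer c
  JOIN sql_server.tpch.orders o ON c.custkey = o.custkey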
Query over 35 data sources using standard ANSI SQL
The Starburst engine speeds up queries with file indexing, caching, a cost-based optimizer, dynamic filtering, join pushdown, and more
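One way to see some of these optimizations at work is to inspect a query plan. The statements below are a sketch, assuming catalogs named `teradata` and `sql_server` are configured as in the federated query on this slide. What the plan shows (cost estimates, dynamic filters, pushed-down joins) depends on the connectors and the statistics available, so no output is reproduced here.

```sql
-- Sketch only: assumes the teradata and sql_server catalogs exist.

-- Show the distributed plan the optimizer chose for the federated join.
EXPLAIN
SELECT c.custkey, o.orderkey, o.shippriority
FROM teradata.tpch.customer c
JOIN sql_server.tpch.orders o ON c.custkey = o.custkey;

-- Inspect the table statistics available to the cost-based optimizer.
SHOW STATS FOR sql_server.tpch.orders;

-- Execute the query and report actual per-operator timings and row counts.
EXPLAIN ANALYZE
SELECT c.custkey, o.orderkey, o.shippriority
FROM teradata.tpch.customer c
JOIN sql_server.tpch.orders o ON c.custkey = o.custkey;
```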
Use Case: Data Processing Engine
05 What Makes Starburst (Trino) A Versatile Engine
Fast And Cost-Efficient
*Test run on the TPC-H 10 TB schema using five m5.8xlarge machines
Ability To Run Trino On Spot Instances For Cost Savings
Trino Is Fast And Predictable On Spot Instances
Trino queries on spot instances run faster than Spark queries on on-demand instances
06 Starburst Galaxy + dbt Demo
Demo - Building Pipeline and Consumption
Starburst Galaxy Provides A Great Ecosystem For Trino
Ecosystem of connectors
Performance and flexibility
Scalability
Ease of use / consumability
Security and compliance
Optionality
Ease of use and consumability
Capabilities that enable easy discovery and consumption of high-quality data
Easy to connect to a rich ecosystem of data sources, BI tools, partner products
Intuitive user experience using the SQL skills and tools you already know
Fully managed SaaS option
Resource elasticity: reduces the need for a dedicated operations team
Flexible and transparent licensing, pricing, and billing options