1 of 21

BI ARCHITECTURE ON AZURE

Alessio Goggia | Business Intelligence Consultant

2 of 21

MEETING AGENDA

  1. What is Business Intelligence?
  2. Overview on CDC and IIDR
  3. BI architecture as-is
  4. Overview on some Azure resources
  5. BI architecture on Azure (draft)
  6. Bibliography
  7. Q&A

3 of 21

WHAT IS BUSINESS INTELLIGENCE?

4 of 21

WHAT IS BUSINESS INTELLIGENCE (BI)?

  • Business processes to collect and analyze strategic information;
  • Technology used to execute these processes;
  • Info obtained from these processes;
  • Key words: reports, OLAP, ETL, DW, data mining etc.

5 of 21

OVERVIEW ON CDC AND IIDR

6 of 21

CHANGE DATA CAPTURE (CDC)

Collection of software design patterns used to detect any data change in the database.

It lets a data warehouse or database react, and perform some action, as soon as a data change is captured.

CDC is a data integration approach that provides reliable, low-latency, scalable replication of high-velocity data while using fewer compute resources.

Companies can deliver data changes to BI tools and team members in real time, keeping them up to date.
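
As an illustration of one of the simplest CDC patterns, the sketch below compares two snapshots of a table and emits one change event per inserted, updated or deleted row. It is a snapshot-diff sketch, not the log-based capture that tools such as IIDR use; the function name `diff_snapshots`, the event layout and the sample rows are all assumptions made for this example.

```python
from typing import Any

Row = dict[str, Any]


def diff_snapshots(old: dict[int, Row], new: dict[int, Row]) -> list[dict[str, Any]]:
    """Compare two snapshots keyed by primary key and return change events."""
    events: list[dict[str, Any]] = []
    # Keys present only in the new snapshot are inserts.
    for key in sorted(new.keys() - old.keys()):
        events.append({"op": "insert", "key": key, "after": new[key]})
    # Keys present in both snapshots with different content are updates.
    for key in sorted(old.keys() & new.keys()):
        if old[key] != new[key]:
            events.append(
                {"op": "update", "key": key, "before": old[key], "after": new[key]}
            )
    # Keys present only in the old snapshot are deletes.
    for key in sorted(old.keys() - new.keys()):
        events.append({"op": "delete", "key": key, "before": old[key]})
    return events


# Hypothetical sample data: row 2 is renamed and row 3 is added.
old = {1: {"name": "Ann"}, 2: {"name": "Bob"}}
new = {1: {"name": "Ann"}, 2: {"name": "Robert"}, 3: {"name": "Cy"}}
changes = diff_snapshots(old, new)
```

Only the changed rows produce events, which is why CDC moves less data than reloading the whole table.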

7 of 21

IBM INFOSPHERE DATA REPLICATION (IIDR)

Log-based change data capture with real-time replication that provides trusted data integration and synchronization.

High-throughput replication with integrity and consistency.

Benefits:

  1. Centralized monitoring platform;
  2. Continuous, trusted data delivery;
  3. Increased business agility;
  4. Reduced costs.

8 of 21

BI ARCHITECTURE AS-IS

9 of 21

BI ARCHITECTURE AS-IS

10 of 21

OVERVIEW ON SOME AZURE RESOURCES

11 of 21

EVENT HUB FOR KAFKA

Provides an endpoint compatible with Kafka, as an alternative to running your own Apache Kafka cluster.

Existing Kafka applications can use it without code changes: only the configuration needs to be modified.

Kafka and Event Hubs are very similar: they're both built for streaming data.

Kafka is software you typically need to install and operate, whereas Event Hubs is a fully managed, cloud-native service.

Scale in Event Hubs is controlled by how many throughput units or processing units you purchase.
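
To make the "configuration only" point concrete, here is a minimal sketch of the client settings a Kafka application would typically change to point at an Event Hubs namespace. The keys follow the librdkafka naming used by clients such as confluent-kafka; the helper `event_hubs_kafka_config`, the namespace `my-namespace` and the connection string are placeholders assumed for this example.

```python
def event_hubs_kafka_config(namespace: str, connection_string: str) -> dict[str, str]:
    """Build Kafka client settings that target an Event Hubs Kafka endpoint."""
    return {
        # The Kafka-compatible endpoint listens on port 9093 of the namespace.
        "bootstrap.servers": f"{namespace}.servicebus.windows.net:9093",
        # Connections are encrypted and authenticated with SASL PLAIN.
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "PLAIN",
        # The username is the literal string "$ConnectionString";
        # the password is the namespace connection string itself.
        "sasl.username": "$ConnectionString",
        "sasl.password": connection_string,
    }


# Placeholder values: replace with a real namespace and connection string.
config = event_hubs_kafka_config(
    "my-namespace",
    "Endpoint=sb://my-namespace.servicebus.windows.net/;SharedAccessKeyName=...;SharedAccessKey=...",
)
```

The producer and consumer code stays the same. Only this dictionary differs from a setup that targets a self-managed Kafka cluster.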

12 of 21

DEBEZIUM

A CDC-based connector that captures row-level changes occurring in the db schemas.

The first time Debezium connects to a db, it takes a snapshot of the schemas. Once the snapshot is complete, the connector continuously captures the row-level changes made by INSERT, UPDATE or DELETE operations committed to the db.

It produces events for each data change, and streams them to Kafka topics.
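
A consumer of these topics usually reads the change event envelope, which carries an `op` code (`c` create, `u` update, `d` delete, `r` snapshot read) together with the `before` and `after` row images. The sketch below applies such events to an in-memory replica. The helper `apply_change`, the `id` key column and the sample events are assumptions for illustration, and real events carry more fields (for example `source` and `ts_ms`).

```python
from typing import Any


def apply_change(table: dict[Any, dict], event: dict, key_field: str = "id") -> None:
    """Apply one Debezium-style change event to an in-memory copy of a table."""
    # With schemas enabled the envelope sits under "payload"; otherwise it is top-level.
    payload = event.get("payload", event)
    op = payload["op"]
    if op in ("c", "r", "u"):
        # Create, snapshot read and update all carry the new row image in "after".
        row = payload["after"]
        table[row[key_field]] = row
    elif op == "d":
        # Delete carries the last known row image in "before".
        row = payload["before"]
        table.pop(row[key_field], None)
    else:
        raise ValueError(f"unsupported operation code: {op!r}")


# Hypothetical event stream: a snapshot row, an insert, an update and a delete.
events = [
    {"payload": {"op": "r", "before": None, "after": {"id": 1, "name": "Ann"}}},
    {"payload": {"op": "c", "before": None, "after": {"id": 2, "name": "Bob"}}},
    {"payload": {"op": "u", "before": {"id": 2, "name": "Bob"},
                 "after": {"id": 2, "name": "Robert"}}},
    {"payload": {"op": "d", "before": {"id": 1, "name": "Ann"}, "after": None}},
]

replica: dict[Any, dict] = {}
for event in events:
    apply_change(replica, event)
```

After the loop the replica holds only row 2 with its updated name, mirroring the state of the source table.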

13 of 21

FUNCTIONAPP

Serverless solution that allows you to write less code, maintain less infrastructure, and save on costs.

Provides "compute on-demand" in two ways:

  1. lets you implement the system's logic in readily available blocks of code, called "functions", which can run at any time;
  2. meets demand with as many resources and function instances as necessary; any extra resources drop off automatically.

Benefits: write functions in different languages, automate deployment, troubleshoot a function, flexible pricing options.

14 of 21

AZURE DATA LAKE STORAGE (ADLS)

Provides file system semantics, file-level security, scale, low-cost, tiered storage, high availability/disaster recovery.

Organizes objects/files into a hierarchy of directories for efficient data access.

Allows you to manage and access data just as you would with HDFS.

Integrates with several platforms and services, such as Databricks, Event Hubs, Logic Apps, ML, Stream Analytics, Power BI, SQL db etc.
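
The directory hierarchy can be used to partition landed data, for instance by table and date. The sketch below builds such a path and the corresponding `abfss://` URI. The folder layout (`raw/<table>/year=/month=/day=`), the helper names, the account `mydatalake` and the container `bi` are assumptions for illustration, not a layout prescribed by ADLS.

```python
from datetime import date
from pathlib import PurePosixPath


def landing_path(table: str, day: date, root: str = "raw") -> PurePosixPath:
    """Build a date-partitioned directory path for one table."""
    return (
        PurePosixPath(root)
        / table
        / f"year={day:%Y}"
        / f"month={day:%m}"
        / f"day={day:%d}"
    )


def abfss_uri(account: str, container: str, path: PurePosixPath) -> str:
    """Format a path as an abfss:// URI for a storage account and container."""
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path}"


# Hypothetical account, container and table names.
uri = abfss_uri("mydatalake", "bi", landing_path("orders", date(2024, 1, 5)))
```

With a layout like this, a job that needs a single day of one table only has to list one directory rather than scan the whole container.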

15 of 21

BI ARCHITECTURE ON AZURE (DRAFT)

16 of 21

BI ARCHITECTURE ON AZURE (DRAFT)

17 of 21

BIBLIOGRAPHY

18 of 21

BIBLIOGRAPHY

19 of 21

THANK YOU
