1 of 25

Thinking BIG

Big Data Concepts and Patterns

Dr. Lilia Sfaxi

INSAT

2 of 25

DATA

Data Mining

Data Viz

Data Analytics

Open Data

Data Science

Cloud

Mobile

IoT

BI

Big Data

3 of 25

Data Issues

Collection

Processing

Storage

Visualization

Security - Monitoring

Business

4 of 25

Data Issues

Available Data

Value Extraction

On-Time

5 of 25

Data Issues

Scalable

Available

Flexible

Volume

Velocity

Variety

3V

6 of 25

Big Data Principles

Store THEN Process

Data-driven Decisions (NOT Decision-Driven Data)

Redundancy is GOOD

There is NO NEEDLESS Data

GO Polyglot!

7 of 25

Scalability

Scale UP vs Scale OUT

Classical : Scale UP

8 of 25

Scalability

Scale UP vs Scale OUT

Big Data : Scale OUT

9 of 25

Scalability

Architectures

Ring

Master - Worker
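
A ring architecture is often built on consistent hashing: nodes and keys are hashed onto the same circular space, and each key belongs to the first node found clockwise from it. The sketch below illustrates that idea only; the slides do not name a specific system, and the `HashRing` class, the node names, and the choice of SHA-256 are assumptions made for the example.

```python
import bisect
import hashlib


def _position(value: str) -> int:
    """Map a string to a stable position on the ring."""
    return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical consistent-hash ring: a key goes to the next node clockwise."""

    def __init__(self, nodes):
        self._ring = sorted((_position(node), node) for node in nodes)

    def add(self, node: str) -> None:
        bisect.insort(self._ring, (_position(node), node))

    def owner(self, key: str) -> str:
        positions = [pos for pos, _ in self._ring]
        # First node strictly after the key's position, wrapping around the ring.
        idx = bisect.bisect_right(positions, _position(key)) % len(self._ring)
        return self._ring[idx][1]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"user:{i}" for i in range(1000)]
before = {key: ring.owner(key) for key in keys}

ring.add("node-d")  # scale out by one node
after = {key: ring.owner(key) for key in keys}

# Keys either stay where they were or move to the new node.
moved = [key for key in keys if before[key] != after[key]]
```

In this sketch, adding a node never reshuffles keys between the existing nodes, which is what makes the layout attractive for scaling out.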

10 of 25

Scalability

Co-Locality of Processing & Storage

Classical Architectures

Big Data Architecture

Database

Application Server
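
Co-locality means the computation runs on the node that already holds the data, so only small partial results travel over the network. The sketch below simulates this in a single process; the `partitions` dictionary, the node names, and the `local_count` function are hypothetical stand-ins for real storage nodes and worker tasks.

```python
# Each entry stands for the lines stored on one node (simulated locally).
partitions = {
    "node-1": ["big data", "data science"],
    "node-2": ["open data", "data mining", "data viz"],
}


def local_count(lines, word):
    """Runs where `lines` are stored; only one integer leaves the node."""
    return sum(line.split().count(word) for line in lines)


# Workers compute partial results next to their data.
partials = {node: local_count(lines, "data") for node, lines in partitions.items()}

# The coordinator only combines the small partial results.
total = sum(partials.values())
```

The same split between workers that produce partial results and a coordinator that merges them is the basis of the master-worker layout.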

11 of 25

Scalability

Fault Tolerance

Data Replication

12 of 25

Scalability

Fault Tolerance

Data Replication

Cluster Replication

13 of 25

Scalability

Fault Tolerance

Data Replication

Cluster Replication

Rack Awareness
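
Rack awareness means replicas are placed with the physical topology in mind, so that losing one rack does not lose every copy of a block. The sketch below uses a simple illustrative policy (an assumption, not taken from the slides): the first copy stays on the writer's node, and the remaining copies go to a different rack when one exists. The function name `place_replicas` and the cluster layout are hypothetical.

```python
def place_replicas(nodes_by_rack, local_node, replicas=3):
    """Pick replica targets so that the copies span more than one rack if possible."""
    rack_of = {node: rack for rack, nodes in nodes_by_rack.items() for node in nodes}
    local_rack = rack_of[local_node]
    chosen = [local_node]  # first copy stays with the writer

    # Prefer nodes from one remote rack, then fall back to any remaining node.
    remote_racks = sorted(rack for rack in nodes_by_rack if rack != local_rack)
    candidates = list(nodes_by_rack[remote_racks[0]]) if remote_racks else []
    candidates += [node for node in sorted(rack_of) if node not in candidates]

    for node in candidates:
        if len(chosen) == replicas:
            break
        if node not in chosen:
            chosen.append(node)
    return chosen


cluster = {"rack-1": ["n1", "n2"], "rack-2": ["n3", "n4"]}
targets = place_replicas(cluster, local_node="n1")
racks_used = {rack for rack, nodes in cluster.items() for n in nodes if n in targets}
```

With this layout the three copies land on two racks, so the block survives the failure of either rack.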

14 of 25

Availability

CAP Theorem

Consistency

Availability

Partition Tolerance

Pick Only Two

15 of 25

Availability

CAP Theorem

ACID

Atomicity

Consistency

Isolation

Durability

BASE

Basically Available

Soft-State

Eventual Consistency
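
The BASE properties can be illustrated with two replicas that accept writes independently and converge later. The sketch below assumes a last-write-wins rule based on timestamps, which is only one of several possible reconciliation strategies; the `Replica` class and its methods are hypothetical.

```python
class Replica:
    """Hypothetical replica storing a (timestamp, value) pair per key."""

    def __init__(self):
        self.data = {}

    def write(self, key, value, ts):
        # Last-write-wins: keep the entry with the newest timestamp.
        current = self.data.get(key)
        if current is None or ts > current[0]:
            self.data[key] = (ts, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None

    def sync_from(self, other):
        """Pull every entry from another replica, keeping the newest version."""
        for key, (ts, value) in other.data.items():
            self.write(key, value, ts)


a, b = Replica(), Replica()
a.write("cart", ["book"], ts=1)
b.sync_from(a)

a.write("cart", ["book", "pen"], ts=2)  # accepted at once; b is not updated yet

stale = b.read("cart")      # soft state: b still answers with the old value
b.sync_from(a)
converged = b.read("cart")  # eventual consistency: b catches up after the sync
```

Both replicas stay available for reads and writes throughout; the price is the window during which `b` returns a stale value.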

16 of 25

Availability

Time

Stream Processing Support

Dynamic and Interactive Charts and Reports

In-Memory Processing

In-Memory Storage
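
Stream processing with in-memory state can be sketched as a sliding-window counter: each event updates the result as it arrives, and only the events still inside the window are kept in memory. The `SlidingWindowCounter` class is hypothetical, and the sketch assumes that timestamps arrive in increasing order.

```python
from collections import deque


class SlidingWindowCounter:
    """Count the events seen in the last `window` seconds, entirely in memory."""

    def __init__(self, window: float):
        self.window = window
        self._timestamps = deque()

    def add(self, ts: float) -> int:
        self._timestamps.append(ts)
        # Evict events that have fallen out of the window.
        while self._timestamps and self._timestamps[0] <= ts - self.window:
            self._timestamps.popleft()
        return len(self._timestamps)


counter = SlidingWindowCounter(window=10.0)
counts = [counter.add(ts) for ts in [0.0, 3.0, 9.0, 12.0, 25.0]]
```

The result is available after every event rather than at the end of a batch, which is what makes dynamic and interactive reports possible.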

17 of 25

Flexibility

ONE application can support…

Diverse Data Sources

Schema-less Data

Multiple Processing Paradigms

Multiple Storage Systems
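
Schema-less data from diverse sources can be handled by discovering fields at read time instead of declaring them up front. The sketch below uses made-up JSON events; the field names and source labels are assumptions chosen for illustration.

```python
import json

# Events from three hypothetical sources, each with its own set of fields.
raw_events = [
    '{"source": "mobile", "user": "u1", "gps": [36.8, 10.2]}',
    '{"source": "web", "user": "u2", "page": "/home", "referrer": null}',
    '{"source": "sensor", "device": "d7", "temperature": 21.5}',
]

records = [json.loads(line) for line in raw_events]

# No schema is declared: discover the fields that are actually present.
fields = sorted({key for record in records for key in record})

# Read with defaults so that a missing field does not break processing.
users = [record.get("user", "unknown") for record in records]
```

The same application code processes all three record shapes, at the cost of handling missing fields explicitly.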

18 of 25

Research Domains

In the Big Data Domain

EVERYTHING

Is Yet to Be Done

19 of 25

Research Domains

Optimization

Processing Time Optimization

Storage Size and Compression

Data Access Optimization

Tradeoff between Consistency and Availability

20 of 25

Research Domains

Data Science

Distributed Algorithms for Machine Learning

Semantic & Sentiment Analysis

Visualization Algorithms

Data Mining, Data Prediction, Data Analytics

21 of 25

Research Domains

Big Data Design

Design Methodologies for Big Data Systems

Standardization of Big Data Architectures

Design and Architectural Patterns

Modeling Language(s) for Schema-less Data

22 of 25

Research Domains

Big Data Security

Non-Relational Databases Security

Logs Gathering and Analysis

Source Data Validation and Filtering

Access Control and Cryptography

23 of 25

Research Domains

Big Data & Other Trends

Big Data & Business Intelligence

Big Data & Cloud Computing

Big Data & Internet of Things

Big Data & Mobile

24 of 25

Research Domains

Big Data & Business Fields

Big Data in Education

Big Data in Health

Big Data in Art

Big Data in Finance

25 of 25

Dr. Lilia Sfaxi

https://www.linkedin.com/in/liliasfaxi/

https://liliasfaxi.wixsite.com/liliasfaxi

https://www.youtube.com/c/TechWall

Lilia.sfaxi@insat.ucar.tn