Thinking BIG
Big Data Concepts and Patterns
Dr. Lilia Sfaxi
INSAT
2
DATA
Data Mining
Data Viz
Data Analytics
Open Data
Data Science
Cloud
Mobile
IoT
BI
Big Data
3
Collection
Processing
Storage
Visualization
Security - Monitoring
Business
Data Issues
4
Data Issues
Available Data
Value Extraction
On-Time
5
Data Issues
Scalable
Available
Flexible
Volume
Velocity
Variety
3V
6
Store THEN Process
Big Data Principles
Data-driven Decisions (NOT Decision-Driven Data)
Redundancy is GOOD
There is NO NEEDLESS Data
GO Polyglot!
7
Scalability
Scale UP vs Scale OUT
Classical: Scale UP
8
Scalability
Scale UP vs Scale OUT
Big Data: Scale OUT
9
Ring
Master - Worker
Scalability
Architectures
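A minimal sketch of the ring idea, assuming keys are assigned to the first node found clockwise on a hash ring (consistent hashing). The names HashRing, node_for and add_node are hypothetical and only illustrate why a ring scales out without a central master: adding a node moves only the keys that fall to the new node.

```python
import bisect
import hashlib


def _position(value: str) -> int:
    """Map a string to a position on the ring (a large integer)."""
    return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical ring of nodes; each key belongs to the next node clockwise."""

    def __init__(self, nodes):
        if not nodes:
            raise ValueError("the ring needs at least one node")
        self._ring = sorted((_position(n), n) for n in nodes)
        self._points = [p for p, _ in self._ring]

    def add_node(self, node: str) -> None:
        """Scale out: insert one more node at its position on the ring."""
        bisect.insort(self._ring, (_position(node), node))
        self._points = [p for p, _ in self._ring]

    def node_for(self, key: str) -> str:
        """Return the node responsible for the key."""
        # First ring position strictly after the key, wrapping around at the end.
        idx = bisect.bisect_right(self._points, _position(key)) % len(self._ring)
        return self._ring[idx][1]


ring = HashRing(["node-a", "node-b", "node-c"])
owner = ring.node_for("user-42")  # one of the three nodes, stable across calls
```

In a master-worker architecture the same lookup would instead be answered by the master, which keeps the mapping of data to workers.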
10
Classical Architectures
Big Data Architecture
Database
Application Server
Scalability
Co-Locality of Processing & Storage
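A minimal sketch of co-locality, assuming each node holds a partition of text lines and runs the counting step itself, so that only small partial results travel over the network. The functions local_count and word_count and the sample partitions are hypothetical; a real cluster would run local_count on the nodes in parallel.

```python
from collections import Counter


def local_count(lines):
    """Runs where the data lives: count words in one node's partition."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def word_count(partitions):
    """Merge the small per-node results instead of moving the raw data."""
    total = Counter()
    for _node, lines in partitions.items():
        total.update(local_count(lines))  # simulated here in a single process
    return total


partitions = {
    "node1": ["big data", "Big ideas"],
    "node2": ["data data"],
}
totals = word_count(partitions)
```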
11
Scalability
Fault Tolerance
Data Replication
12
Scalability
Fault Tolerance
Data Replication
Cluster Replication
13
Scalability
Fault Tolerance
Data Replication
Cluster Replication
Rack Awareness
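A minimal sketch of rack-aware replica placement, loosely modelled on the commonly described policy of keeping one copy on the writer's node and the others on a different rack. The function place_replicas and the topology layout are assumptions for illustration, not the API of any specific system.

```python
def place_replicas(topology, writer_node, replication=3):
    """Pick nodes for the replicas of one block.

    topology: dict mapping rack name -> list of node names (hypothetical layout).
    """
    rack_of = {node: rack for rack, nodes in topology.items() for node in nodes}
    if writer_node not in rack_of:
        raise ValueError(f"unknown node: {writer_node}")

    chosen = [writer_node]  # first replica: the writer's own node
    local_rack = rack_of[writer_node]

    # Second replica: a node on a different rack, so a rack failure loses one copy at most.
    remote = [n for n in sorted(rack_of) if rack_of[n] != local_rack]
    if remote and replication >= 2:
        chosen.append(remote[0])

    # Third replica: another node on the same rack as the second one.
    if len(chosen) == 2 and replication >= 3:
        remote_rack = rack_of[chosen[1]]
        same_rack = [n for n in sorted(topology[remote_rack]) if n not in chosen]
        if same_rack:
            chosen.append(same_rack[0])

    # Fall back to any unused node if the cluster is too small for the policy.
    for node in sorted(rack_of):
        if len(chosen) >= replication:
            break
        if node not in chosen:
            chosen.append(node)
    return chosen[:replication]


topology = {"rack1": ["n1", "n2"], "rack2": ["n3", "n4"]}
replicas = place_replicas(topology, "n1")
```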
14
Availability
CAP Theorem
Consistency
Availability
Partition Tolerance
Pick Only Two
15
Availability
CAP Theorem
ACID: Atomicity, Consistency, Isolation, Durability
BASE: Basically Available, Soft-State, Eventual Consistency
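A minimal sketch of eventual consistency, assuming a last-write-wins rule and caller-supplied timestamps. The Replica class and its methods are hypothetical: each replica accepts writes on its own (basically available), replicas may disagree for a while (soft state), and they converge once they synchronize.

```python
class Replica:
    """Hypothetical replica that resolves conflicts with last-write-wins."""

    def __init__(self, name):
        self.name = name
        self.store = {}  # key -> (timestamp, origin, value)

    def write(self, key, value, timestamp, origin=None):
        """Accept the write locally if it is newer than what is stored."""
        origin = origin or self.name
        current = self.store.get(key)
        # Ties on timestamp are broken by the origin replica's name.
        if current is None or (timestamp, origin) > current[:2]:
            self.store[key] = (timestamp, origin, value)

    def read(self, key):
        entry = self.store.get(key)
        return None if entry is None else entry[2]

    def sync_with(self, other):
        """Exchange entries in both directions so both replicas converge."""
        for key in set(self.store) | set(other.store):
            for src, dst in ((self, other), (other, self)):
                entry = src.store.get(key)
                if entry is not None:
                    dst.write(key, entry[2], entry[0], origin=entry[1])


a, b = Replica("a"), Replica("b")
a.write("x", "old", timestamp=1)
b.write("x", "new", timestamp=2)
stale = a.read("x")   # "old": replica a has not seen the newer write yet
a.sync_with(b)
fresh = a.read("x")   # "new": both replicas now agree
```

An ACID system would instead block or reject one of the two writes so that no reader ever sees the stale value.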
16
Availability
Time
Stream Processing Support
Dynamic and Interactive Charts and Reports
In-Memory Processing
In-Memory Storage
17
Flexibility
ONE application can support…
Diverse Data Sources
Schema-less Data
Multiple Processing Paradigms
Multiple Storage Systems
18
Research Domains
In the Big Data Domain
EVERYTHING
Is Yet to Be Done
19
Research Domains
Optimization
Processing Time Optimization
Storage Size and Compression
Data Access Optimization
Trade-off between Consistency and Availability
20
Research Domains
Data Science
Distributed Algorithms for Machine Learning
Semantic & Sentiment Analysis
Visualization Algorithms
Data Mining, Data Prediction, Data Analytics
21
Research Domains
Big Data Design
Design Methodologies for Big Data Systems
Standardization of Big Data Architectures
Design and Architectural Patterns
Modeling Language(s) for Schema-less Data
22
Research Domains
Big Data Security
Non-Relational Database Security
Log Gathering and Analysis
Source Data Validation and Filtering
Access Control and Cryptography
23
Research Domains
Big Data & Other Trends
Big Data & Business Intelligence
Big Data & Cloud Computing
Big Data & Internet of Things
Big Data & Mobile
24
Research Domains
Big Data & Business Fields
Big Data in Education
Big Data in Health
Big Data in Art
Big Data in Finance
25
Dr. Lilia Sfaxi
https://www.linkedin.com/in/liliasfaxi/
https://liliasfaxi.wixsite.com/liliasfaxi
https://www.youtube.com/c/TechWall
Lilia.sfaxi@insat.ucar.tn