Big Data Analytics - 17CI18
COURSE OUTCOMES (COS)
After the completion of this course, the student will be able to:
CO1: Identify Big Data and its Business Implications.
CO2: Access and Process Data on Distributed File System.
CO3: Manage Job Execution in Hadoop Environment.
CO4: Develop Big Data Solutions using Hadoop Eco System.
CO5: Apply Machine Learning Techniques using R.
Welcome to the World of Big Data
Data: What Makes It Big?
How big is big?
No Single Definition
The term ‘big data’ is self-explanatory − a collection of huge data sets that normal computing techniques cannot process.
“Big Data” is data whose scale, diversity, and complexity require new architecture, techniques, algorithms, and analytics to manage it and extract value and hidden knowledge from it…
Big Facts About Big Data
As of 2013, experts estimated that 90% of the world's data had been generated in just the preceding two years (2011–2012).
In 2018, more than 2.5 quintillion bytes of data were created every day.
The amount of data in the world was estimated to be 44 zettabytes at the dawn of 2020.
Google handles a staggering 1.2 trillion searches every year.
Of Bits and Bytes…….
A gigabyte is equal to 1,024 megabytes. A terabyte is equal to 1,024 gigabytes.
A petabyte is equal to 1,024 terabytes.
An exabyte is equal to 1,024 petabytes. A zettabyte is equal to 1,024 exabytes. A yottabyte is equal to 1,024 zettabytes.
In decimal terms, 10^21 bytes (roughly one billion terabytes) form a zettabyte.
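As a quick sanity check, these unit sizes can be computed directly in R (the language used later in this course); a minimal sketch, with unit names as listed above:
# each unit is 1,024 times the previous one, so unit n holds 1024^n bytes
units <- c("KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB")
bytes <- 1024^(1:8)
data.frame(unit = units, bytes = bytes)
# a zettabyte is 1024^7 bytes, about 1.18e21 -- on the order of 10^21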
Three attributes stand out as defining Big Data characteristics
Huge volume of data: Rather than thousands or millions of rows, Big Data can be billions of rows and millions of columns.
Complexity of data types and structures: Big Data reflects the variety of new data sources, formats, and structures, including digital traces being left on the web and other digital repositories for subsequent analysis.
Speed of new data creation and growth: Big Data can describe high velocity data, with rapid data ingestion and near real time analysis.
Big Data- the 4Vs
Characteristics of Big Data: 1-Scale (Volume)
Characteristics of Big Data: 2-Complexity (Variety)
Characteristics of Big Data: 3-Speed (Velocity)
Example: delivering promotions right now for the store next to you.
Characteristics of Big Data: 4-Accuracy/Trustworthiness (Veracity)
The 4Vs in a Nutshell
The Sources….
Sources of Big Data Deluge
Who’s Generating Big Data
Social media and networks
(all of us are generating data)
Scientific instruments
(collecting all sorts of data)
Mobile devices
(tracking all objects all the time)
Sensor technology and networks
(measuring all kinds of data)
The Evolution…
Evolution of Big Data by technology
Evolution of Big Data by Internet Of Things
Evolution of Big Data by Social Media
Evolution of Big Data by other factors
The Model Has Changed…
Old Model: Few companies are generating data, all others are consuming data
New Model: All of us are generating data, and all of us are consuming data
Harnessing Big Data
[Figure: the progression from DBMSs to Data Warehousing to Big Data]
Value of Big Data Analytics
Challenges in Handling Big Data
Challenge #1: Insufficient understanding and acceptance of big data
Challenge #2: Confusing variety of big data technologies
Challenge #3: Paying loads of money
Challenge #4: Complexity of managing data quality
Challenge #5: Dangerous big data security holes
Challenge #6: Tricky process of converting big data into valuable insights
Challenge #7: Troubles of scaling up
Types of big data
Big Data can be found in three forms: structured, unstructured, and semi-structured.
Structured Data
Employee_ID | Employee_Name | Gender | Department | Salary_In_lacs
2365 | Rajesh Kulkarni | Male | Finance | 650000
3398 | Pratibha Joshi | Female | Admin | 650000
7465 | Shushil Roy | Male | Admin | 500000
7500 | Shubhojit Das | Male | Finance | 500000
7699 | Priya Sane | Female | Finance | 550000
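Structured data like this maps directly onto a table with typed columns. As a minimal R sketch (using the exact rows and column names from the table above), the same records can be held in a data frame:
# structured data fits naturally into an R data frame
employees <- data.frame(
  Employee_ID    = c(2365, 3398, 7465, 7500, 7699),
  Employee_Name  = c("Rajesh Kulkarni", "Pratibha Joshi", "Shushil Roy",
                     "Shubhojit Das", "Priya Sane"),
  Gender         = c("Male", "Female", "Male", "Male", "Female"),
  Department     = c("Finance", "Admin", "Admin", "Finance", "Finance"),
  Salary_In_lacs = c(650000, 650000, 500000, 500000, 550000)
)
str(employees)  # every column has one declared, uniform type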
Unstructured Data
Data with no predefined model or organization, such as text documents, images, audio, and video.
Semi-structured Data
Semi-structured data does not live in fixed tables but carries organizational tags or markers; XML records are a typical example:
<rec><name>Prashant Rao</name><gender>Male</gender><age>35</age></rec>
<rec><name>Seema R.</name><gender>Female</gender><age>41</age></rec>
<rec><name>Satish Mane</name><gender>Male</gender><age>29</age></rec>
<rec><name>Subrato Roy</name><gender>Male</gender><age>26</age></rec>
<rec><name>Jeremiah J.</name><gender>Male</gender><age>35</age></rec>
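Because the structure travels inside the tags, semi-structured records can be pulled into a table at read time. A minimal R sketch, assuming the xml2 package is available (the package choice is an illustration, not part of the original):
# parse <rec> records like those above with the xml2 package
library(xml2)
doc <- read_xml('<recs>
  <rec><name>Prashant Rao</name><gender>Male</gender><age>35</age></rec>
  <rec><name>Seema R.</name><gender>Female</gender><age>41</age></rec>
</recs>')
recs <- xml_find_all(doc, "//rec")
data.frame(
  name   = xml_text(xml_find_first(recs, "name")),
  gender = xml_text(xml_find_first(recs, "gender")),
  age    = as.integer(xml_text(xml_find_first(recs, "age")))
)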
A Contrast of the Three Types
Why Big Data Analytics
Big Data analytics is a process used to extract meaningful insights, such as hidden patterns, unknown correlations, market trends, and customer preferences.
Big Data Analytics Advantages
Cost Savings: helps in identifying more efficient ways of doing business.
Time Reductions: helps businesses analyze data immediately and make quick decisions based on what they learn.
New Product Development: by tracking trends in customer needs and satisfaction through analytics, you can create products that match what customers want.
Understand Market Conditions: by analyzing big data, you can get a better understanding of current market conditions.
Control Online Reputation: big data tools can perform sentiment analysis, giving you feedback about who is saying what about your company.
Big Data Analytics Applications
Big Data Use Cases
Netflix: Big Data & User Experience
Manufacturing Big Data Use Cases
The digital revolution has transformed the manufacturing industry. Manufacturers are now finding new ways to harness all the data they generate to improve operational efficiency, streamline business processes, and uncover valuable insights that will drive profits and growth.
Predictive Maintenance: Big data can help predict equipment failure. Potential issues can be discovered by analyzing both structured data (equipment year, make, and model) and multistructured data (log entries, sensor data, error messages, engine temperature, and other factors). With this data, manufacturers can maximize parts and equipment uptime and deploy maintenance more cost-effectively.
Operational Efficiency: Operational efficiency is one of the areas in which big data can have the most impact on profitability. With big data, you can analyze and assess production processes, proactively respond to customer feedback, and anticipate future demands.
Production Optimization: Optimizing production lines can decrease costs and increase revenue. Big data can help manufacturers understand the flow of items through their production lines and see which areas can benefit. Data analysis will reveal which steps lead to increased production time and which areas are causing delays.
Retail Big Data Use Cases
Competition is fierce in retail. To stay ahead, companies strive to differentiate themselves. Big data is being used across all stages of the retail process—from product predictions to demand forecasting to in-store optimization. Using big data, retailers are finding new ways to innovate.
Healthcare Big Data Use Cases
Healthcare organizations are using big data for everything from improving profitability to helping save lives. Healthcare companies, hospitals, and researchers collect massive amounts of data. But all of this data isn’t useful in isolation. It becomes important when the data is analyzed to highlight trends and threats in patterns and create predictive models.
Patient Experience and Outcomes
Claims Fraud
Healthcare Billing Analytics
Oil and Gas Big Data Use Cases
For the past few years, the oil and gas industry has been leveraging big data to find new ways to innovate. The industry has long made use of data sensors to track and monitor the performance of oil wells, machinery, and operations. Oil and gas companies have been able to harness this data to monitor well activity, create models of the Earth to find new oil sources, and perform many other value-added tasks.
Predictive Equipment Maintenance: Oil and gas companies often lack visibility into the condition of their equipment, especially in remote offshore and deep-water locations. Big data can help by providing insight so companies can predict the remaining optimal life of their systems and components, ensuring that their assets operate at optimum production efficiency.
Oil Exploration and Discovery: Exploring for oil and gas can be expensive. But companies can make use of the vast amount of data generated in the drilling and production process to make informed decisions about new drilling sites. Data generated from seismic monitors can be used to find new oil and gas sources by identifying traces that were previously overlooked.
Oil Production Optimization: Unstructured sensor and historical data can be used to optimize oil well production. By creating predictive models, companies can measure well production to understand usage rates. With deeper data analysis, engineers can determine why actual well outputs aren't tallying with their predictions.
Telecommunications Big Data Use Cases
The popularity of smart phones and other mobile devices has given telecommunications companies tremendous growth opportunities. But there are challenges as well, as organizations work to keep pace with customer demands for new digital services while managing an ever-expanding volume of data.
Optimize Network Capacity: Optimal network performance is essential for a telecom's success. Network usage analytics can help companies identify areas with excess capacity and reroute bandwidth as needed. Big data analytics can help them plan for infrastructure investments and design new services that meet customer demands. With new insights, telecoms are able to maintain customer loyalty and avoid losing revenue to competitors.
Telecom Customer Churn: By analyzing the data telecoms already have about service quality, convenience, and other factors, they can predict overall customer satisfaction. They can also set up alerts when customers are at risk of churning, and take action with retention campaigns and proactive offers.
New Product Offerings: Usage and customer data can also reveal demand for new digital services, helping telecoms design and launch offerings that match what subscribers actually want.
Financial Services Big Data Use Cases
Forward-thinking banks and financial services firms are capitalizing on big data. From capturing new market opportunities to reducing fraud, financial services organizations have been able to convert big data into a competitive advantage.
Fraud and Compliance: When it comes to security, it's not just a few rogue hackers; the financial services industry is up against entire expert teams, while security landscapes and compliance requirements constantly evolve. Using big data, companies can identify patterns that indicate fraud and aggregate large volumes of information to streamline regulatory reporting.
Anti-Money Laundering: Financial services firms are under more pressure than ever before from governments passing anti-money-laundering laws. These laws require that banks show proof of proper diligence and submit suspicious activity reports. In this extraordinarily complicated arena, big data analytics can help companies identify potential fraud patterns.
Financial Regulatory and Compliance Analytics: Financial services companies must comply with a wide variety of requirements concerning risk, conduct, and transparency. At the same time, banks must comply with the Dodd-Frank Act, Basel III, and other regulations that require detailed reporting.
To Conclude…
A Gentle Introduction to Hadoop
Hadoop is a well-adopted, standards-based, open-source software framework built on the foundation of Google’s MapReduce and Google File System.
It’s meant to leverage the power of massive parallel processing to take advantage of Big Data, generally by using lots of inexpensive commodity servers.
Hadoop is designed to abstract away much of the complexity of distributed processing.
This lets developers focus on the task at hand, instead of getting lost in the technical details of deploying such a functionally rich environment.
Hadoop
Hadoop is an Apache open-source framework written in Java that allows distributed processing of large datasets across clusters of computers using simple programming models.
A Hadoop-framework application works in an environment that provides distributed storage and computation across clusters of computers.
Hadoop is designed to scale up from a single server to thousands of machines, each offering local computation and storage.
MapReduce
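MapReduce is Hadoop's processing model: a map phase turns each input record into key-value pairs, and a reduce phase aggregates all values that share a key. The sketch below mimics the classic word count on a single machine in R, purely for intuition; it is not the Hadoop API (real jobs are typically written in Java and run in parallel across the cluster).
# toy word count illustrating the map and reduce phases
lines <- c("big data is big", "data needs hadoop")
# map: emit one word token per occurrence (conceptually, a (word, 1) pair)
words <- unlist(strsplit(lines, " "))
# shuffle + reduce: group identical keys and sum their counts
table(words)
#   big   data hadoop     is  needs
#     2      2      1      1      1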
Hadoop Distributed File System (HDFS)
Takes care of the storage part of the Hadoop architecture.
HDFS breaks files into small pieces of data known as blocks. The default block size in HDFS is 128 MB.
We can configure the block size as per our requirements.
These blocks are stored in the cluster in a distributed manner on different nodes.
This provides a mechanism for MapReduce to process the data in parallel in the cluster.
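To make the block mechanics concrete, here is a minimal arithmetic sketch (the 1 GB file size is a hypothetical example; 128 MB is the default block size quoted above, and the 3x replication factor appears under fault tolerance below):
# how many 128 MB blocks does a hypothetical 1 GB (1,024 MB) file need?
file_size_mb  <- 1024
block_size_mb <- 128
num_blocks <- ceiling(file_size_mb / block_size_mb)
num_blocks      # 8 blocks, spread across the cluster's DataNodes
num_blocks * 3  # 24 block copies with the default 3x replication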
Advantages of HDFS
Fault Tolerance – Each data block is replicated three times (by default, every block is stored on three machines/DataNodes in the cluster). This helps protect the data against DataNode (machine) failure.
Space – Just add more data nodes if you need more disk space.
Scalability – Unlike traditional databases HDFS is highly scalable because it can store and distribute very large datasets across many nodes that can operate in parallel.
Flexibility – It can store any kind of data, whether it's structured, semi-structured, or unstructured.
Cost-effective – HDFS uses direct-attached storage and shares the cost of the network and computers it runs on with MapReduce. It is also open-source software.
Basic Data Analytic Methods Using R
Introduction to R
R is a programming language and software framework for statistical analysis and graphics.
The following R code illustrates a typical analytical situation in which a dataset is imported, the contents of the dataset are examined, and some model building tasks are executed.
In the following scenario, the annual sales in U.S. dollars for 10,000 retail customers have been provided in the form of a comma-separated-value (CSV) file. The read.csv() function is used to import the CSV file. The dataset is stored in the R variable sales using the assignment operator <-.
# import a CSV file of the total annual sales for each customer
sales <- read.csv("c:/data/yearly_sales.csv")
# examine the imported dataset
head(sales)
#The summary() function provides some descriptive statistics, such as the mean and median, for each data column. Additionally, the minimum and maximum values as well as the 1st and 3rd quartiles are provided. Because the gender column contains only two possible characters, "F" (female) or "M" (male), summary() reports the count of each value instead of numeric statistics.
summary(sales)
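To see those counts directly, the column can be tabulated; a short sketch using the gender column described above:
# tabulate the gender column: returns how many "F" and "M" values occur
table(sales$gender)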
[Figure: a distribution annotated with its quartiles, showing the median, the 25% and 75% quartiles, and the interquartile range (IQR)]
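These statistics can also be computed individually; a short sketch using the sales_total column imported above:
# quartiles: the values that split the data into four equal parts
quantile(sales$sales_total, probs = c(0.25, 0.50, 0.75))
# interquartile range: the 3rd quartile minus the 1st quartile
IQR(sales$sales_total)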
#Plotting a dataset’s contents can provide information about the relationships between the various columns.
# plot num_of_orders vs. sales
plot(sales$num_of_orders, sales$sales_total,
     main="Number of Orders vs. Sales")
The plot() function generates a scatterplot of the number of orders (sales$num_of_orders) against the annual sales (sales$sales_total).
Note:The $ is used to reference a specific column in the dataset sales.
Data Import and Export
The dataset was imported into R using the read.csv() function
#sales <- read.csv("c:/data/yearly_sales.csv")
The setwd() function can be used to set the working directory for the subsequent import and export operations
#setwd("c:/data/")
#sales <- read.csv("yearly_sales.csv")
The read.table() function imports the same file with explicit header and separator arguments:
#sales_table <- read.table("yearly_sales.csv", header=TRUE, sep=",")
The read.delim() function is intended for delimited files and defaults to tab separation; here the separator is overridden:
#sales_delim <- read.delim("yearly_sales.csv", sep=",")
Add a column for the average sales per order:
#sales$per_order <- sales$sales_total/sales$num_of_orders
Export the data as tab-delimited without the row names:
#write.table(sales, "sales_modified.txt", sep="\t", row.names=FALSE)
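As a quick check, the exported file can be read back in; a short sketch (the file name matches the write.table() call above):
# re-import the tab-delimited export to verify the round trip
#sales_check <- read.delim("sales_modified.txt")
#head(sales_check)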
Big Data Analytics
Best Practices for Big data Analytics
1. UNDERSTAND THE BUSINESS REQUIREMENTS
Analyzing and understanding the business requirements and organizational goals is the first and foremost step, to be carried out even before bringing big data analytics into your projects.
Business users must identify which projects in their company would profit most from big data analytics.
2. DETERMINE THE COLLECTED DIGITAL ASSETS
The second best practice is to identify the type of data pouring into the organization, as well as the data generated in-house. Usually the collected data is disorganized and in varying formats. Moreover, some data is never even exploited (so-called dark data), and it is essential that organizations identify this data too.
3. IDENTIFY WHAT IS MISSING
The third practice is analyzing and understanding what is missing. Once you have collected the data needed for a project, identify the additional information that might be required for that particular project and where it can come from. For instance, if you want to use big data analytics to understand your employees' well-being, then along with information such as login/logout times, medical reports, and email reports, you need some additional information about, say, employees' stress levels. This information can be provided by co-workers or leaders.
4. COMPREHEND WHICH BIG DATA ANALYTICS MUST BE LEVERAGED
After analyzing and collecting data from different sources, it is time for the organization to understand which big data technologies, such as predictive analytics, stream analytics, data preparation, fraud detection, and sentiment analysis, can best serve the current business requirements.
For instance, big data analytics helps HR teams identify the right talent faster during recruitment by combining social media and job-portal data using predictive and sentiment analysis.
5. ANALYZE DATA CONTINUOUSLY
This is the final best practice that an organization must follow when it comes to big data. You must always be aware of what data your organization holds and what is being done with it.
Check the health of your data periodically so you never miss important but hidden signals in the data.
Before implementing any new technology in your organization, it is vital to have a strategy to help you get the most out of it. With adequate and accurate data at their disposal, companies must also follow the above-mentioned practices to extract value from this data.
Stages of Big Data Analytical Evolution
The process of dealing with big data is quite different from handling traditional data. Big Data processing consists of the following stages:
1. Data Collection
This is the first stage, which involves collecting web data, log data, and structured and unstructured data from several types of data sources, such as mobile devices, sensor devices, and social media.
2. Storing
In this stage the collected data has to be stored in distributed database systems and servers. The introduction of NoSQL made storing big data practical: because NoSQL has no fixed schema and no relationships between entities, it is well suited to dynamic and unstructured data.
3. Data Organization
In this stage, data is arranged and organized as structured, semi-structured, and unstructured data so that it can be accessed and analyzed.
4. Analysis
After the data is arranged and organized, the analysis stage is applied. Analyzing large datasets involves greater complexity and computation, and research is ongoing into algorithms and mathematical models that minimize computational and storage costs.
The hidden information extracted will be useful to industry, academia, and government for making decisions and taking action.
The infrastructure needed for big data should be highly scalable, support statistical analytics and data mining, and allow automated decisions to be made quickly based on the analytical model.
5. Data Visualization
Once information has been extracted from the data, it has to be represented visually. This is generally done using data visualization tools, which enable decision makers to grasp difficult concepts and patterns easily.
State of the Practice in Analytics
Business Drivers for Advanced Analytics
BI Versus Data Science
Current Analytical Architecture
Drivers of Big Data
Emerging Big Data Ecosystem and a New Approach to Analytics
The Data Scientist
Profile of a Data Scientist
Data scientists have five main sets of skills and behavioral characteristics:
Quantitative skill: such as mathematics or statistics
Technical aptitude: namely, software engineering, machine learning, and programming skills
Skeptical mind-set and critical thinking: It is important that data scientists can examine their work critically rather than in a one-sided way.
Curious and creative: Data scientists are passionate about data and finding creative ways to solve problems and portray information.
Communicative and collaborative: Data scientists must be able to articulate the business value in a clear way and collaboratively work with other groups, including project sponsors and key stakeholders.
Data Analytics Lifecycle
Data Analytics Lifecycle Overview
The Data Analytics Lifecycle is designed specifically for Big Data problems and data science projects.
The lifecycle has six phases, and project work can occur in several phases at once.
For most phases in the lifecycle, the movement can be either forward or backward.
This iterative depiction of the lifecycle is intended to more closely portray a real project, in which aspects of the project move forward and may return to earlier stages as new information is uncovered and team members learn more about various stages of the project.
This enables participants to move iteratively through the process and drive toward operationalizing the project work.
Key Roles for a Successful Analytics Project
Background and Overview of Data Analytics Lifecycle
Phase 1: Discovery
Learning the Business Domain
Resources
Framing the Problem
Identifying Key Stakeholders
Interviewing the Analytics Sponsor
Developing Initial Hypotheses
Identifying Potential Data Sources
Phase 2: Data Preparation
Preparing the Analytic Sandbox
Performing ETLT
Learning About the Data
Data Conditioning
Survey and Visualize
Common Tools for the Data Preparation Phase
Phase 3: Model Planning
Data Exploration and Variable Selection
Model Selection
Common Tools for the Model Planning Phase
Phase 4: Model Building
Common Tools for the Model Building Phase
Commercial Tools:
● SAS Enterprise Miner allows users to run predictive and descriptive models based on large volumes of data from across the enterprise. It interoperates with other large data stores, has many partnerships, and is built for enterprise-level computing and analytics.
● SPSS Modeler (provided by IBM and now called IBM SPSS Modeler) offers methods to explore and analyze data through a GUI.
● Matlab provides a high-level language for performing a variety of data analytics, algorithms, and data exploration.
● Alpine Miner provides a GUI front end for users to develop analytic workflows and interact with Big Data tools and platforms on the back end.
● STATISTICA and Mathematica are also popular and well-regarded data mining and analytics tools.
Free or Open-Source Tools:
● R and PL/R: R was described earlier in the model planning phase, and PL/R is a procedural language for PostgreSQL with R. Using this approach means that R commands can be executed in-database. This technique provides higher performance and is more scalable than running R in memory.
● Octave, a free software programming language for computational modeling, has some of the functionality of Matlab. Because it is freely available, Octave is used in major universities when teaching machine learning.
● WEKA is a free data mining software package with an analytic workbench. The functions created in WEKA can be executed within Java code.
● Python is a programming language that provides toolkits for machine learning and analysis, such as scikit-learn, numpy, scipy, and pandas, and related data visualization using matplotlib.
● SQL in-database implementations, such as MADlib, provide an alternative to in-memory desktop analytical tools. MADlib provides an open-source machine learning library of algorithms that can be executed in-database, for PostgreSQL or Greenplum.
Phase 5: Communicate Results
After executing the model, the team needs to compare the outcomes of the modeling to the criteria established for success and failure.
It is critical to articulate the results properly and position the findings in a way that is appropriate for the audience.
The team needs to determine whether it succeeded or failed in its objectives. People are often reluctant to admit failure, but in this instance a negative result should not be considered a true failure; rather, it means the data failed to adequately accept or reject a given hypothesis.
The key is to remember that the team must be rigorous enough with the data to determine whether it will prove or disprove the hypotheses outlined in Phase 1 (Discovery).
Phase 6: Operationalize
In the final phase, the team communicates the benefits of the project more broadly and sets up a pilot project to deploy the work in a controlled way before broadening the work to a full enterprise or ecosystem of users.
Rather than deploying these models immediately on a wide-scale basis, the risk can be managed more effectively and the team can learn by undertaking a small-scope pilot deployment before a wide-scale rollout.
This approach enables the team to learn about the performance and related constraints of the model in a production environment on a small scale and make adjustments before a full deployment.
Part of the operationalizing phase includes creating a mechanism for performing ongoing monitoring of model accuracy and, if accuracy degrades, finding ways to retrain the model.
Key outputs from a successful analytics project
Question Time
Have a Great Learning Time!