1 of 23

Google File System / Hadoop

Alan Nguyen, Tom Lai

University of Washington

View this presentation at bit.ly/hadoop434

2 of 23

Outline

  • Intro
  • Architecture
    • Namenode
    • Datanode
  • Availability/Recovery
  • Consistency/Atomicity
  • Application domains focused on by this system
  • Other notable features such as performance, scalability, and security
  • Paper Critique

3 of 23

Intro

Hadoop Features

  • Open-source Java Framework
    • supported on all major platforms
  • Extremely fault-tolerant
    • hardware failure is the norm rather than the exception
  • Slaves run on cheap computers
  • Linear Scalability

4 of 23

Intro

The Idea Came from Google

  • Google published a paper describing its system
  • But Google's implementation (the Google File System) stayed private to Google
  • Doug Cutting read the paper and used the architecture in his own project
    • he called it Hadoop
  • GFS and HDFS are very similar

5 of 23

Intro

Fun Fact

“The name my kid gave a stuffed yellow elephant. Short, relatively easy to spell and pronounce, meaningless, and not used elsewhere: those are my naming criteria. Kids are good at generating such. Googol is a kid’s term”

  • Doug Cutting

6 of 23

Intro

Who Uses Hadoop

(logos of Hadoop users -- and more)

7 of 23

Architecture

Components

1. NameNode (master)

2. DataNode (slave)

3. Client

8 of 23

Architecture

DataNode

  • Slave (many per cluster)
  • Stores data blocks
  • Computes on data
    • when the node is also acting as a TaskTracker

9 of 23

Architecture

NameNode

  • Master (only one per cluster)
  • Stores metadata
    • which DataNodes are alive
    • which DataNodes hold which data blocks
  • Interacts with clients
    • accepts requests
    • redirects clients to the right DataNodes
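The metadata lookup above can be sketched in a few lines of Python. The class and method names here are hypothetical, not the real HDFS client API; the point is that the NameNode holds only metadata and redirects the client to DataNodes, never serving file bytes itself:

```python
# Toy NameNode (hypothetical names, not the real HDFS API): the client
# asks *where* blocks live, then fetches the bytes directly from DataNodes.

class NameNode:
    def __init__(self):
        # metadata only: filename -> ordered list of (block_id, [datanodes])
        self.block_map = {}
        self.alive = set()

    def add_file(self, name, blocks):
        self.block_map[name] = blocks

    def get_block_locations(self, name):
        # redirect the client: return locations of live replicas,
        # never the data itself
        return [(bid, [d for d in nodes if d in self.alive])
                for bid, nodes in self.block_map[name]]

nn = NameNode()
nn.alive = {"dn1", "dn2", "dn3"}
nn.add_file("/logs/a.txt", [("blk_1", ["dn1", "dn2"]),
                            ("blk_2", ["dn2", "dn3"])])
print(nn.get_block_locations("/logs/a.txt"))
```

Note how dead DataNodes are filtered out of the answer: the client is only ever pointed at replicas the master currently believes are alive.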

15 of 23

Availability / Recovery

Checkpoint Node / Secondary Namenode

  • Merges the edit log into the fsimage file
  • Sends the merged image back to the NameNode
  • Problem: it may take a long time for a Checkpoint Node to take over
    • it must first receive heartbeat messages and block locations from the DataNodes
    • hours for large clusters
    • Hadoop 2.x uses a Standby Node instead

16 of 23

Availability / Recovery

Checkpointing

  • A snapshot of the file system is stored in:
    • fsimage
    • the edit log
  • Changing fsimage directly is network- and I/O-intensive, so it is better to append to the edit log and periodically merge the two
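A toy model of that merge, assuming a dict for the fsimage and a list of operations for the edit log (not the real HDFS on-disk formats): a checkpoint replays the logged operations into the image, then truncates the log.

```python
# Simplified checkpointing model: fsimage is a dict snapshot, the edit
# log is a list of operations, and a checkpoint replays the log into
# the image and clears it.

def apply_edit(fsimage, op):
    kind, path = op[0], op[1]
    if kind == "create":
        fsimage[path] = op[2]          # path -> block list
    elif kind == "delete":
        fsimage.pop(path, None)
    elif kind == "rename":
        fsimage[op[2]] = fsimage.pop(path)

def checkpoint(fsimage, edit_log):
    for op in edit_log:
        apply_edit(fsimage, op)
    edit_log.clear()                   # the log restarts after the merge
    return fsimage

image = {"/a": ["blk_1"]}
log = [("create", "/b", ["blk_2"]), ("rename", "/a", "/c")]
print(checkpoint(image, log))          # {'/b': ['blk_2'], '/c': ['blk_1']}
```

Appending one tuple to the log is cheap; rewriting the whole image on every change would not be, which is exactly the trade-off the slide describes.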

17 of 23

Availability / Recovery

Clones of Blocks

  • Each file is divided into blocks

  • Multiple copies of each block are stored on different nodes
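A minimal sketch of block splitting and replica assignment. The block size here is tiny for demonstration (HDFS uses large blocks, 64–128 MB depending on version, and 3 replicas by default), and the round-robin placement is a stand-in for the NameNode's real rack- and load-aware policy:

```python
# Split a file into fixed-size blocks and assign each block to several
# DataNodes. Block size and placement are illustrative only.

BLOCK_SIZE = 4           # bytes, tiny for demonstration
REPLICATION = 2

def split_into_blocks(data, block_size=BLOCK_SIZE):
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def place_replicas(blocks, datanodes, replication=REPLICATION):
    # round-robin placement; the real NameNode is rack- and load-aware
    placement = {}
    for i, _ in enumerate(blocks):
        placement[i] = [datanodes[(i + r) % len(datanodes)]
                        for r in range(replication)]
    return placement

blocks = split_into_blocks(b"hello world!")
print(blocks)                                  # [b'hell', b'o wo', b'rld!']
print(place_replicas(blocks, ["dn1", "dn2", "dn3"]))
```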

18 of 23

Availability / Recovery

Rack Distribution

  • Files are divided into blocks

  • Copies of each block are placed on separate racks

  • If one rack fails, the remaining racks still hold a copy of every block of the file
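The idea can be sketched as a simplified version of the default HDFS placement policy, which puts one replica on one rack and the remaining replicas together on a second rack; the rack-selection logic below is illustrative, not the real algorithm:

```python
# Simplified rack-aware replica placement: one replica on a first rack,
# the rest on a second rack, so no single rack holds all copies.

def place_on_racks(block_id, racks, replication=3):
    """racks: dict rack_name -> list of datanodes. Returns chosen nodes."""
    rack_names = sorted(racks)
    first_rack = rack_names[block_id % len(rack_names)]
    second_rack = rack_names[(block_id + 1) % len(rack_names)]
    replicas = [racks[first_rack][0]]
    replicas += racks[second_rack][:replication - 1]
    return replicas

racks = {"rack1": ["dn1", "dn2"], "rack2": ["dn3", "dn4"]}
print(place_on_racks(0, racks))   # ['dn1', 'dn3', 'dn4']
```

Losing rack1 leaves dn3 and dn4 with copies; losing rack2 leaves dn1. Either way the block survives.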

19 of 23

Availability

Heartbeat Messages

  1. The master receives a heartbeat message from each slave periodically
  2. No heartbeat = dead 💀💀💀
    1. no new I/O requests are sent to dead nodes
    2. new replicas are made if a block has too few
  3. Each heartbeat message also reports which file blocks the node holds
  4. The master acknowledges each heartbeat message
    • sometimes a command is piggybacked on the ack
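Failure detection from heartbeats boils down to a timeout check. The timeout below is illustrative; the real NameNode waits roughly ten minutes of silence before declaring a DataNode dead:

```python
# Heartbeat-based failure detection: any node silent longer than the
# timeout is declared dead (timeout value is illustrative only).

TIMEOUT = 30  # seconds of silence before a node is considered dead

def dead_nodes(last_heartbeat, now, timeout=TIMEOUT):
    """last_heartbeat: dict node -> timestamp of last heartbeat (seconds)."""
    return sorted(n for n, t in last_heartbeat.items() if now - t > timeout)

beats = {"dn1": 100.0, "dn2": 95.0, "dn3": 60.0}
print(dead_nodes(beats, now=105.0))   # ['dn3']  (silent for 45 s)
```

Everything downstream (stopping I/O to the node, re-replicating its blocks) is triggered by this one check.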

20 of 23

Consistency / Atomicity Enforcement

Atomicity

These must be atomic:

  • Creating a file
  • Deleting a file
  • Renaming a file
  • Renaming a directory
  • Creating a directory

Deleting a directory recursively may not be atomic

Consistency

One-copy-update semantics

  • every read sees the effect of all previous writes

All immediately visible:

  • CRUD -- create, rename, update, delete
  • Delete followed by create
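Why atomic rename matters can be shown with a local-filesystem sketch: `os.replace` is atomic on POSIX, so a reader sees either the old file or the new one, never a half-written mix. This illustrates the general technique, not HDFS internals:

```python
# Atomic file update: write to a temporary file in the same directory,
# then swap it into place in one atomic rename.

import os
import pathlib
import tempfile

def atomic_write(path, data):
    d = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=d)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    os.replace(tmp, path)  # the atomic step

target = pathlib.Path(tempfile.gettempdir()) / "hadoop_demo.txt"
atomic_write(str(target), b"version 1")
atomic_write(str(target), b"version 2")
print(target.read_bytes())  # b'version 2'
```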

21 of 23

Application domains focused on by this system

Data Collection

  • Log storage
  • Data storage
    • Documents
    • Sensors

Data Processing

  • Map Reduce
  • Analyze user behavior

Machine learning

  • Spam Filters
  • Generate suggestions
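Word count is the canonical MapReduce example. Here is a single-process sketch in Python; real Hadoop jobs implement Mapper and Reducer classes, usually in Java, and run distributed across the DataNodes:

```python
# Word count as map + reduce: the map phase emits (word, 1) pairs,
# the reduce phase groups by word and sums the counts.

from collections import defaultdict

def map_phase(lines):
    for line in lines:
        for word in line.split():
            yield word, 1

def reduce_phase(pairs):
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

lines = ["the quick brown fox", "the lazy dog", "the fox"]
print(reduce_phase(map_phase(lines)))
# {'the': 3, 'quick': 1, 'brown': 1, 'fox': 2, 'lazy': 1, 'dog': 1}
```

The framework's job is everything this sketch leaves out: splitting the input across nodes, shuffling pairs to reducers by key, and retrying failed tasks.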

22 of 23

Other notable features

DataNode Hot Swap Drive

  • You can add or replace hard drives without shutting down the DataNode

Checkpoint Node

  • Makes backups of the NameNode's metadata

23 of 23

Paper Critique

Contributions

  • GFS: the architecture, with write and lease examples
  • GFS Case Study: mentions the limitation of having one GFS instance per data center and how they worked around it
  • Hadoop: architecture very similar to GFS; a lot of information, more than you can read

Drawbacks of ideas

  • The single master was not scalable enough (limited by its memory size) -- an issue addressed in the GFS Case Study
  • Some problems don't fit neatly into a single Map/Reduce job

How to improve

  • Hadoop uses a separate file system: you have to move data into Hadoop's file space to do any computation, then move it back to see the result
