Google File System / Hadoop
Alan Nguyen, Tom Lai
University of Washington
View this presentation at bit.ly/hadoop434
Outline
Intro
Hadoop Features
Intro
The idea came from Google
Intro
Fun Fact
“The name my kid gave a stuffed yellow elephant. Short, relatively easy to spell and pronounce, meaningless, and not used elsewhere: those are my naming criteria. Kids are good at generating such. Googol is a kid’s term”
Intro
Who uses Hadoop
(and more)
Architecture
Components
1. NameNode (master)
2. DataNode (slave)
3. Client
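A minimal sketch of how the three components could interact on a read: the client asks the NameNode where a file's blocks live, then fetches each block directly from a DataNode. All class and method names here (`get_block_locations`, `read_block`, the `blk_*` ids) are hypothetical and only illustrate the roles; this is not the real Hadoop API.

```python
from dataclasses import dataclass, field


@dataclass
class DataNode:
    """Stores block contents; knows nothing about file names."""
    name: str
    blocks: dict = field(default_factory=dict)  # block_id -> bytes

    def read_block(self, block_id):
        return self.blocks[block_id]


@dataclass
class NameNode:
    """Holds metadata only: which blocks make up a file and where replicas live."""
    # path -> ordered list of (block_id, [DataNodes holding a replica])
    files: dict = field(default_factory=dict)

    def get_block_locations(self, path):
        return self.files[path]


class Client:
    def __init__(self, namenode):
        self.namenode = namenode

    def read(self, path):
        data = b""
        # Metadata comes from the NameNode; block bytes come from DataNodes.
        for block_id, replicas in self.namenode.get_block_locations(path):
            data += replicas[0].read_block(block_id)
        return data


dn1 = DataNode("dn1", {"blk_1": b"hello "})
dn2 = DataNode("dn2", {"blk_2": b"world"})
nn = NameNode({"/greeting.txt": [("blk_1", [dn1]), ("blk_2", [dn2])]})
result = Client(nn).read("/greeting.txt")
print(result)
```

The point of the split is that file data never flows through the NameNode, so the master handles only small metadata requests.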
Architecture
DataNode
Architecture
NameNode
Availability / Recovery
Checkpoint Node / Secondary Namenode
Availability / Recovery
Checkpointing
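A rough sketch of the idea behind checkpointing: the last saved namespace image is merged with the log of edits made since then, giving a new image and an empty log. The record format (`("create", path)` / `("delete", path)`) and the function name `checkpoint` are assumptions made for illustration, not the on-disk format Hadoop uses.

```python
def checkpoint(fsimage, edit_log):
    """Replay edit_log on top of fsimage; return (new image, truncated log)."""
    namespace = set(fsimage)
    for op, path in edit_log:
        if op == "create":
            namespace.add(path)
        elif op == "delete":
            namespace.discard(path)
        else:
            raise ValueError(f"unknown operation: {op}")
    # After a checkpoint the log can start over from empty.
    return namespace, []


old_image = {"/a", "/b"}
edits = [("create", "/c"), ("delete", "/b")]
new_image, new_log = checkpoint(old_image, edits)
print(sorted(new_image))
```

Doing this merge periodically keeps the edit log short, so a restarting NameNode has less to replay.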
Availability / Recovery
Clones of Blocks
Availability / Recovery
Rack Distribution
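A small sketch of rack-aware placement for three replicas, assuming the commonly described default: one replica on the writer's node, and two on different nodes of a single other rack. The topology dictionary, node names, and `place_replicas` are hypothetical; a real cluster would get its topology from configuration.

```python
import random


def rack_of(node, nodes_by_rack):
    """Return the rack that contains the given node."""
    return next(r for r, nodes in nodes_by_rack.items() if node in nodes)


def place_replicas(writer, nodes_by_rack, rng=random):
    """Pick three nodes: the writer, plus two distinct nodes on one remote rack."""
    local_rack = rack_of(writer, nodes_by_rack)
    remote_rack = rng.choice([r for r in nodes_by_rack if r != local_rack])
    second, third = rng.sample(nodes_by_rack[remote_rack], 2)
    return [writer, second, third]


topology = {
    "rack1": ["dn1", "dn2"],
    "rack2": ["dn3", "dn4"],
    "rack3": ["dn5", "dn6"],
}
placement = place_replicas("dn1", topology, random.Random(0))
print(placement)
```

With this layout, losing a whole rack still leaves at least one replica, while only one copy has to cross racks during the write.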
Availability
Heartbeat Messages
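A minimal sketch of heartbeat-based failure detection: each DataNode reports in periodically, and the NameNode treats a node as dead once it has been silent for longer than a timeout. The class name `HeartbeatMonitor` and the 30-second timeout are illustrative assumptions, not Hadoop's actual settings.

```python
class HeartbeatMonitor:
    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.last_seen = {}  # node name -> time of last heartbeat (seconds)

    def heartbeat(self, node, now):
        """Record that a heartbeat from `node` arrived at time `now`."""
        self.last_seen[node] = now

    def dead_nodes(self, now):
        """Nodes whose last heartbeat is older than the timeout."""
        return sorted(
            n for n, t in self.last_seen.items() if now - t > self.timeout_s
        )


monitor = HeartbeatMonitor(timeout_s=30)
monitor.heartbeat("dn1", now=0)
monitor.heartbeat("dn2", now=0)
monitor.heartbeat("dn1", now=25)  # dn1 keeps reporting, dn2 goes silent
dead = monitor.dead_nodes(now=40)
print(dead)
```

Once a node is declared dead, the blocks it held are under-replicated and can be copied again from the surviving replicas.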
Consistency / Atomicity enforcement
Consistency
One-copy-update semantics
All immediately visible:
Atomicity
These must be atomic:
Deleting a directory recursively may not be atomic
Application domains focused on by this system
Data Collection
Data Processing
Machine learning
Other notable features
DataNode Hot Swap Drive
Checkpoint Node
Paper Critique
| | GFS | GFS Case Study | Hadoop |
| Contributions | GFS architecture, with write and lease examples | Mentions the limitation of having one GFS instance per data center and how they worked around it | Hadoop architecture, which is very similar to GFS. A lot of information, more than you can read. |
| Drawbacks of ideas | | | |
| How to improve | | | |