
View from the Top

Instructors: Rebecca Herman and Steven Ho


Administrivia

  • HW7 due tonight—the last assignment! :D
  • Final: Thu 8/10, 7-10pm
    • Rooms: 10 Evans (aaa-aee), Hearst Annex A1 (aef-aig)
    • Three cheat sheets allowed
    • Review slides and the MT2 walkthrough will be posted on the course website
  • Office hours during lab & discussion
    • Labs today (return borrowed clickers!)
    • Optional discussions tomorrow
  • Course Evaluations
  • Self-Evaluation for EPA
  • Grades should be up to date on glookup
    • If you’re missing any grades, please email a TA

CS61C Su17 - Lecture 27


8/8/2017


Powers of Ten-inspired View of 61C


The Dalles, Oregon


10^4 meters: 11,100 meters (across city)

Google’s Oregon WSC


10^3 meters: 1,100 meters (around facility)

Containers in WSCs


10^2 meters: 100 meters (all containers)

Google Server Array

  • 2 long rows, each with 29 racks
  • Cooling below raised floor
  • Hot air returned behind racks


10^1 meters: 10 meters (container)

Google Rack

  • Google rack with 20 servers + Network Switch in the middle
  • 48-port 1 Gigabit/sec Ethernet switch every other rack
  • Array switches connect to racks via multiple 1 Gbit/s links
  • 2 datacenter routers connect to array switches over 10 Gbit/s links


10^0 meters: 3 meters (tall)

Google Server Internals


10^-1 meters: 0.45 meters

Central Processing Unit (CPU)


10^-2 meters: 3 centimeters

Processor


10^-2 meters: 1 centimeter (L3 cache labeled on die photo)

CPU Core


10^-3 meters: 3 millimeters

CMOS Gate


10^-6 meters: 2 micrometers

What is CS61C About?

  • It is about the hardware-software interface
    • What does the programmer need to know about the underlying hardware?
  • Introduction to systems courses
    • Software: compilers, operating systems, security
    • Hardware: digital logic design, embedded systems, VLSI, computer architecture


CS61C For Software Development

  • Knowing the tools of the trade – computers!
    • “Computers” come in all shapes and sizes
    • Computing achieved in many different ways nowadays
  • When performance matters
    • Ex: taking advantage of parallelism
  • Differences between programming languages under the hood
  • Design large systems – abstraction in hardware
  • Security
  • Design methodology – limitations and tradeoffs


“Software is so bad because it’s so complex, and because it’s trying to talk to other programs on the same computer, or over connections to other computers. Even your computer is kind of more than one computer, boxes within boxes, and each one of those computers is full of little programs trying to coordinate their actions and talk to each other.”

“Your average piece-of-shit Windows desktop is so complex that no one person on Earth really knows what all of it is doing, or how.”


Course Learning Objectives

  • After taking this class students should be able to:
    • Identify and explain the various layers of abstraction that allow computer users to perform complex software tasks without understanding what the computer hardware is actually doing
    • Judge the effect of changing computer components (e.g. processor, RAM, HDD, cache) on the performance of a computer program
    • Explain how the memory hierarchy creates the illusion of being almost as fast as the fastest type of memory and almost as large as the biggest memory
    • Construct a working CPU from logic gates for a specified instruction set architecture
    • Identify the different types of parallelism and predict their effects on different types of applications


Course Learning Objectives

  • In addition, this class will require students to work on the following skills:
    • Creating and modifying designs to meet a given set of specifications
    • Identifying unexpected or problematic situations and creating test cases to ensure proper behavior
    • Defending design choices based on tradeoffs and limitations


61C Mission Statement

“CS61C introduces students to foundational concepts that form the backbone of different branches of computer science such as computer architecture, low-level software, computer security, and large-scale computing.  This course demystifies some of the vulnerabilities inherent in abstraction barriers that programmers of all levels confide in when the performance and correctness of high-level code is under fire.  After taking this course, students will be able to see computing through a bit-level perspective and understand the central trade-offs and choices involved in making fast hardware and software of all levels that work together efficiently and safely.”

—Rebecca and Steven

CS61C Su17 - Lecture 1


6/19/2017


Number Representation

  • Anything can be represented as a number!
    • With n digits in base B, can represent B^n things
    • Signed (two’s complement) and unsigned ints
    • Floating point (sign, biased exp, mantissa)
    • Characters/text
    • Computer instructions (stored program concept)
  • Limitations of fixed width: overflow, underflow, sign extension; precision and truncation/rounding


[Field layouts: MIPS R-format is opcode | rs | rt | rd | shamt | funct; IEEE 754 single precision is Sign (bit 31, 1 bit) | Exponent (bits 30-23, 8 bits) | Mantissa (bits 22-0, 23 bits)]

Levels of Interpretation

Higher-Level Language Program (e.g. C):
    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;
        ↓ Compiler
Assembly Language Program (e.g. MIPS):
    lw $t0, 0($2)
    lw $t1, 4($2)
    sw $t1, 0($2)
    sw $t0, 4($2)
        ↓ Assembler
Machine Language Program (MIPS):
    0000 1001 1100 0110 1010 1111 0101 1000
    1010 1111 0101 1000 0000 1001 1100 0110
    1100 0110 1010 1111 0101 1000 0000 1001
    0101 1000 0000 1001 1100 0110 1010 1111
        ↓ Machine Interpretation
Hardware Architecture Description (e.g. block diagrams)
        ↓ Architecture Implementation
Logic Circuit Description (Circuit Schematic Diagrams)

Higher-Level Language (HLL)

  • C exposes more of hardware (especially Mem)
    • Compiled language is machine-dependent
    • Weakly-typed language stresses data as bits
    • “C is good for two things: being beautiful and creating catastrophic 0days in memory management.” (same Medium blog post)
  • Arrays and strings don’t know own length
  • Pointers hold addresses, used to pass by ref
    • Pointer arithmetic; array vs. pointer syntax
  • Structs are padded collections of variables


Assembly Language

  • Close to the level that a machine understands
    • ISA in human-readable format
    • TAL vs. MAL (pseudo-instructions)
  • RISC vs. CISC
    • Trade-off in complexity of HW & SW
  • MIPS Instruction Formats: R, I, J
    • Meaning and limitations of the fields
    • Relative (branch) vs. absolute (jump) addressing
    • Register conventions (saved/volatile; caller/callee)


EVERY COMMAND IS DIRECTLY IMPLEMENTED IN THE HARDWARE


(from CS61C Su16 - Lecture 8, 6/30/2016)

C program: foo.c
    ↓ Compiler
Assembly program: foo.s
    ↓ Assembler
Object (mach lang module): foo.o  (+ lib.o)
    ↓ Linker
Executable (mach lang pgm): a.out
    ↓ Loader
Memory

Machine Language

  • Everything is just 0’s and 1’s!
    • Actually voltages running through transistors
  • Executable produced by linker, run by loader
    • OS creates new process (PT, swap space)
    • Instructions loaded and run by processor


Pipelined Instruction Execution

  • 5-stage pipelined Datapath: IF (PC,I$), ID (RegFile), EX (ALU), MEM (D$), WB (RegFile)
    • Clock rate potentially 5x faster
  • Control handles decision-making and signal routing


[Pipeline datapath diagram: 1. Instruction Fetch (PC, instruction memory, +4) → 2. Decode/Register Read (Register File: rs, rt) → 3. Execute (ALU, imm, MUX) → 4. Memory (data memory) → 5. Write Back (Register File: rd)]


The Memory Hierarchy


  • Trade-off in speed and cost vs. capacity!
  • Increasing distance from application/user
  • Based on the principle of locality!

CACHES

[Diagrams: 32-bit address (bit 31 down to 0) split into Tag | Index | Offset (T, I, O); CPU sends address to Cache: a hit returns data, a miss goes to Main Memory]

Memory

  • Programmer treats memory as one long array
    • Virtual address space
  • Memory is byte-addressed
    • Look at data in different-sized chunks: byte, word, block, page
  • Multicore systems use shared memory
    • Protection, synchronization, and cache coherence necessary


[Memory layout diagram, from ~0hex to ~FFFF FFFFhex: code, static data, heap, stack]

Accessing Memory


[Address translation diagrams:
  To TLB: Virtual Address = TLB Tag | TLB Index | Page Offset
  From TLB: VPN maps to PPN; Physical Address = PPN | Page Offset
  To Cache: Physical Address = Tag | Index | Offset
  Access path: CPU sends VA to TLB (miss: walk Page Table; page fault: Disk); resulting PA goes to Cache (hit: block returned; miss: Main Memory)]

Data Management

  • Extra memory bits to detect/correct errors
    • Even parity using XOR
    • Conceptual minimum Hamming distances
      • 2 is SED, 3 is SEC, 4 is SEC/DED
    • Hamming ECC can indicate corrupted bit
  • RAID to protect data against disk failure(s)
    • Main ideas: data striping and extra parity bits
  • Make sure system components are reliable so system remains available


Input/Output

  • Disk Latency = Seek Time + Rotation Time + Transfer Time + Controller Overhead
  • Processor must synchronize with I/O devices before use due to difference in data rates:
    • Polling is expensive due to repeated queries
    • Interrupts are asynchronous events from I/O devices
    • Exceptions are “unexpected” events in processor
  • In SW, need special handling code


Performance

  • Allows direct comparisons of architectures and quantification of improvements
    • It is all about time to finish (latency)
    • Includes both setup and execution.
  • Match application and hardware to exploit:
    • Locality
    • Parallelism
    • Special hardware features, like specialized instructions (e.g. matrix manipulation)


Performance Measurements

  • Execution time (latency) and work per time (throughput)
    • CPU Time = Instructions × CPI × Clock Cycle Time
  • Memory Access:
    • AMAT uses hit time, miss rate, miss penalty
    • Definitions recursive back to last level in hierarchy
  • Amdahl’s Law
    • Speedup = 1 / [ (1-F) + F/S ]
    • Why we almost never get max possible speedup


Parallelism


[Diagram: leveraging parallelism to achieve high performance at every scale, from Smart Phone to Warehouse Scale Computer, across the software and hardware levels]

  • Parallel Requests
    • Assigned to computer, e.g. search “assembler.c”
  • Parallel Threads
    • Assigned to core, e.g. lookup, ads
  • Parallel Instructions
    • >1 instruction @ one time, e.g. 5 pipelined instructions
  • Parallel Data
    • >1 data item @ one time, e.g. add of 4 pairs of words (A0+B0, A1+B1, A2+B2, A3+B3)
  • Hardware descriptions
    • All gates functioning in parallel at same time

Types of Parallelism

  • Request-Level Parallelism (RLP)
    • Handling many requests per second (e.g. web search)
  • Data-Level Parallelism (DLP)
    • Operate on many pieces of data at once
    • SIMD: at the level of single instructions
    • MapReduce: at the level of programs (split into map and reduce)
  • Thread-Level Parallelism (TLP)
    • Have many processors, run either different programs or different parts of same program at same time


Thread-Level Parallelism Concerns

  • OpenMP: work sharing, synchronization
  • Splitting up work properly is difficult!
    • Shared vs. private variables in OpenMP
    • Often requires re-designing your algorithm
  • Shared memory needs cache coherence
    • False sharing is a problem (coherence miss)
    • Sharing protocols necessary (MOESI)
  • Synchronization requires hardware support
    • Test-and-set mechanism
    • ll and sc in MIPS


What’s Next? Classes


[Course map, from HARDWARE through SYSTEMS and SOFTWARE up to APPLICATIONS: Digital Systems Design (EECS 151), Embedded Systems (EECS 149), Computer Architecture, Operating Systems, Security, Languages & Compilers, Databases, Software Engineering]

Becoming a Part of CS61C

  • Four levels of the CS61C staff:
    • Lab assistant: help students in labs (course credit)
    • Tutor: teach guerrilla sessions and assist students ($$)
    • TA: teach labs, sections, and help run the course ($$$)
    • Lecturer…?!?!?!?

  • Lab Assisting:
    • Learn more, help others, get units, get on teaching track

[Teaching track progression: Lab Assistant → Tutor → TA]

Making as much of college as is humanly and healthily reasonable

  • Seek out experiences that lead to new experiences (i.e. that pay dividends)
    • Build skills, interests, relationships
    • Meet new people, join interesting clubs, go on adventures
  • Don’t go it alone – find a friend group for classes
  • Take advantage of educational opportunities
  • Take care of yourself!


On the Shoulders of Giants

Many thanks to the main architects of CS61C:


Brian Harvey, Dan Garcia, Randy Katz, David Patterson

Lab Assistants

  • Jasper Deng
  • Jackson Sipple
  • Brose Johnstone
  • Darren Lim
  • Morgan Rae Reschenberg
  • Ryan Panwar
  • Vincent Escueta
  • Darrell Dao
  • Ryan Pottle
  • Robert Zhang
  • Nick Titterton
  • Jonathan Yang
  • Ahmad Badary


Your TAs

Zubeezy (Head TA), Derk (Head TA), Nick, Nikh, Ryan, Alex


Your TAs

Zubair (Head TA), Derek (Head TA), Ryan, Nikhil, Nick, Alex


Your Tutors

Megan, Chris, Ann, Damon, Keyhan, Lance


Ask Us Anything


Good Luck!
