Computer Organization & Design 1

Pratyay Pandey

Introduction

Motivation to Learn CO&D

  • Computers are quite literally the third revolution in civilization, after the Agricultural and Industrial Revolutions
    • Knowing algorithms alone is not enough to understand computer science holistically or to implement what you have learned
  • Each time the cost of computing improves by a factor of 10, new possibilities unfold for computing
    • Artificial intelligence - The transition from the GOFAI era was only possible because of strides in computing in the 2000s, through GPUs and processors
  • Low-level programming is vital to understanding how your programs run and why code works the way it does
    • How does your program translate into something that a machine can understand?
    • How does the machine interpret machine code?
  • Low-level programming is also a great industry skill, even in AI
    • Though we don’t really notice it these days with the Python ecosystem, NumPy and other libraries are implemented in C and compiled to machine code, which is why they are so fast
    • Knowing exactly how a computer works allows you to fully utilize it
  • Computer organization and design open up new possibilities for programming practices, such as parallelism
  • Understanding how memory is organized lets you minimize memory use and make programs fast

Classes of Computing Applications

  • Desktop computers = Computers designed for use by individuals, typically with a graphic display, keyboard, and mouse
    • The best-known class; emphasizes good performance at low cost, typically running third-party software
  • Servers = Computers used for running large programs for multiple users, often at the same time, accessed via a network
    • Modern form of what were once mainframes
    • Servers are directed towards large workloads, like several tiny applications or a few very complex ones
    • Built from the same basic tech as desktop computers, yet provide more computing and input / output capability
  • Supercomputers = Class of computers with highest performance and cost; typically in the millions
    • These consist of thousands of processors and typically terabytes (2^40 bytes) of memory and petabytes (1000 or 1024 terabytes) of storage
    • Used for large-scale scientific and engineering calculations
  • Datacenters = Rooms or buildings meant to handle the power, cooling, and networking of several servers
    • Used by companies like eBay and Google and contain thousands of processors, terabytes of memory, and petabytes of storage
  • Embedded computers = Computers inside other devices meant for one application or a collection of software
    • Largest class of computers (e.g., the computers embedded in cars, phones, and video game consoles)

Programs and Hardware

Programs

  • Involve three parts
    • The hardware
    • The systems software
      • Between the hardware and the actual application software
      • Meant to organize the connections between the hardware and the application
      • Two types that are in almost any computer today
        • Operating system
          • Interfaces between a user’s program and the hardware, providing services and supervisory functions
          • Handles basic input and output operations
          • Allocates storage and memory
          • Provides for protected sharing across multiple applications used simultaneously on a computer
        • Compiler
          • Translates a program written in a high-level language into instructions that the hardware can execute

High-Level Languages to Hardware

  • From an electronic standpoint, we either turn on an electric signal or turn it off
    • Hence, we can use a binary digit, or bit, that is either off (0) or on (1)
    • The commands a computer understands are called instructions, a subtle, yet important, piece of terminology for future concepts
  • The first programmers communicated with computers in binary numbers but later wrote programs to translate from symbolic notation to binary
    • Assemblers were made to translate from certain sets of instructions to binary
      • add A, B → 1000110010100000
    • This symbolic language is now known as assembly language, while what the machine understands, the binary representation, is machine language
  • High-level languages, like C and Python, are first translated into an assembly language program and then into machine language
    • Python is typically interpreted instead: the Python interpreter, itself a program that was compiled from C through assembly into machine language, executes Python code directly
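The assembler step can be sketched as a toy translator. Only the final bit pattern for `add A, B` comes from the notes; the opcode table, register numbers, and field layout below are invented for illustration, as a real ISA defines these fields precisely.

```python
# Toy assembler sketch: translates one symbolic instruction into a
# 16-bit machine word. The field layout and register numbers below
# are invented for illustration -- a real ISA defines them exactly.

OPCODES = {"add": 0b1000110}            # hypothetical 7-bit opcode field
REGISTERS = {"A": 0b0101, "B": 0b0000}  # hypothetical register numbers

def assemble(line):
    """Translate one 'op dst, src' line into a 16-bit binary string."""
    op, operands = line.split(maxsplit=1)
    dst, src = (r.strip() for r in operands.split(","))
    word = (OPCODES[op] << 9) | (REGISTERS[dst] << 5) | (REGISTERS[src] << 1)
    return format(word, "016b")

print(assemble("add A, B"))  # -> 1000110010100000, matching the notes
```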

Introduction to Hardware

  • There are five classical components to any computer
    • Input, output, memory, datapath, and control
    • Traditionally, the last two are combined into what we call the processor
  • Input devices, such as a keyboard or a mouse, feed information into a computer
  • Output devices are what convey the result of a computation to a user or another computer

Display

  • Most displays these days are liquid crystal displays (LCDs), a display technology that uses a thin layer of liquid polymers to transmit or block light depending on the applied charge
    • This gives a thin, low-power display that works by controlling the rod-shaped molecules in the liquid, which form a twisting helix that bends the light entering the display
  • Most LCDs use an active matrix, which has a tiny transistor switch for each pixel to control its current
    • An image is simply a matrix of picture elements, pixels, which are represented by a matrix of bits, or a bit map
  • The computer hardware support for graphics is typically a raster refresh buffer, or frame buffer, to store the bit map
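The bit-map and frame-buffer idea can be sketched with a minimal monochrome display; the dimensions are arbitrary, and real frame buffers store several bits per pixel for color.

```python
# A bit map: the image is a matrix of pixels, each represented by one
# bit. The frame buffer stores those bits in a flat array, the way
# display hardware would.

WIDTH, HEIGHT = 8, 4
frame_buffer = [0] * (WIDTH * HEIGHT)

def set_pixel(x, y, on):
    """Turn pixel (x, y) on or off in the frame buffer."""
    frame_buffer[y * WIDTH + x] = 1 if on else 0

def refresh():
    """Render the bit map as text, one row of pixels per line."""
    for y in range(HEIGHT):
        row = frame_buffer[y * WIDTH:(y + 1) * WIDTH]
        print("".join("#" if bit else "." for bit in row))

set_pixel(0, 0, True)   # top-left corner
set_pixel(7, 3, True)   # bottom-right corner
refresh()
```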

Inside the Box

  • Motherboard = Plastic board containing packages of integrated circuits, like the processor, cache, memory, and connectors for I/O devices
    • Integrated circuits or chips are devices with dozens to millions of transistors
  • Memory is where programs and their data are kept when they are running
    • One type of memory is DRAM, or dynamic random access memory
      • “Random access” means any memory location takes roughly the same time to access, regardless of its address
  • The processor is the active part of the board, which follows the instructions of the program
    • Sometimes called the CPU, or central processing unit, it contains the datapath and control, does arithmetic and logic, tests numbers, signals I/O devices, etc.
    • The datapath is what performs arithmetic operations
    • Control tells the datapath, memory, and I/O devices what to do
    • The processor also has another type of memory, cache memory, which is small, fast memory that acts as a buffer for slow, larger memory
      • Cache is built on static random access memory (SRAM), which is faster but less dense and, thus, more expensive
  • A common theme that is easily noticeable is abstraction, or the use of layering, where lower-level details are hidden to offer a simpler model
    • Hardware and software build upon one another in a manner that hides away irrelevant details for simplicity
    • A very important abstraction is that between hardware and the lowest-level software, which is known as instruction set architecture or simply architecture
    • The combination of the operating system interface and the basic instruction set is known as the application binary interface (ABI)

Data

  • Data is very important and comes in many types
  • Some memory is volatile, meaning it stores data only while the device is receiving power; if the device is powered off, the data is lost
    • This includes DRAM
  • Nonvolatile memory, like a DVD, retains data even in the absence of a power source
  • Main memory or primary memory = Memory used to hold programs when they are running, like DRAM
  • Secondary memory = Nonvolatile memory used to store programs and data between runs, like magnetic disks
    • Magnetic disks = Nonvolatile memory that is composed of rotating platters coated with magnetic recording material
      • To read and write information on a hard disk, a movable arm containing a small electromagnetic coil called a read-write head is kept just above the surface
    • Flash memory = A nonvolatile semiconductor memory that is cheaper and slower than DRAM but more expensive and faster than magnetic disks
  • There are several types of storage technologies
    • Optical disks, like CDs and DVDs, are the most common form of removable storage
    • Flash-based removable memory typically attaches to USB connections and is often used to transfer files
    • Magnetic tapes provide slow serial access and are used to back up disks, though this is most often done with duplicate hard drives now

Tech for Building Processors and Memory

  • Processors and memory have improved at a very drastic rate
  • Transistors are simply on / off switches controlled by electricity
  • An integrated circuit is a combination of many transistors
  • A very large-scale integrated (VLSI) circuit is a chip with millions of transistors
  • The rate of increasing integration has been very consistent
    • The industry has quadrupled capacity every three years for twenty years
    • The increase in transistor count is known as Moore’s law
      • The transistor capacity doubles every 18 - 24 months
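The two rates quoted above (quadrupling every three years, doubling every 18 to 24 months) describe essentially the same exponential growth; a quick arithmetic sketch:

```python
# Moore's law as arithmetic: capacity grows as 2^(years / doubling period).

def capacity_multiplier(doubling_period_years, years):
    """How much transistor capacity grows over a span of years."""
    return 2 ** (years / doubling_period_years)

# Quadrupling every 3 years is the same as a doubling period of 1.5 years:
print(capacity_multiplier(1.5, 3))   # 4.0x over three years
# Doubling every 24 months, sustained for twenty years:
print(capacity_multiplier(2, 20))    # 1024.0x over twenty years
```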

Performance

  • Accurately measuring and comparing different computers is very important
  • Suppose we define performance in terms of speed
    • We would want to reduce response time, or execution time, the time it takes to complete each task
    • Datacenters, with hundreds of computers, instead want to increase throughput, or bandwidth, the total amount of work done in a given time
    • We can define performance to be the reciprocal of execution time
      • The lower the execution time, the higher the performance
  • Execution time itself varies and depends on what kind of task we are referring to
    • CPU execution time or CPU time is the amount of time required for the CPU to compute for a task
      • This can be split into user CPU time, the time spent in the program, and system CPU time, the time spent performing tasks on behalf of the program
  • We care about how fast hardware can perform basic functions
    • Most computers are constructed with a clock that determines when events take place inside the hardware
      • Time is divided into discrete intervals known as clock cycles (also called ticks, clock periods, clocks, cycles, or clock ticks)
      • Clock rate is the inverse of the clock period, or time for a complete clock cycle
    • CPU execution time = # of clock cycles * clock cycle time = # of clock cycles / clock rate
    • CPU clock cycles = Instructions for a program * Average clock cycles per instruction (CPI)
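The performance equations above compose into a small calculator; the instruction count, CPI, and clock rate below are made-up figures for illustration.

```python
# CPU time = (instruction count * CPI) / clock rate, and performance
# is defined as the reciprocal of execution time. All figures here
# are hypothetical.

def cpu_time(instruction_count, cpi, clock_rate_hz):
    """Seconds of CPU time for a program."""
    cycles = instruction_count * cpi   # CPU clock cycles
    return cycles / clock_rate_hz

def performance(execution_time_s):
    """Lower execution time means higher performance."""
    return 1 / execution_time_s

t = cpu_time(10e9, 2.0, 4e9)   # 10 billion instructions, CPI 2, 4 GHz
print(t)               # 5.0 seconds
print(performance(t))  # 0.2
```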

The Power Wall

  • Clock rate and power have increased very rapidly for decades but have started to flatten out
    • They are correlated, which is why they grew together
    • They are slowing down because there is a power limit for cooling
  • Dominant technology for chips is CMOS (complementary metal oxide semiconductor)
    • The primary source of power dissipation is dynamic power, the power consumed during switching
    • Dynamic power dissipation depends on the capacitive load of each transistor, the voltage applied, and the frequency at which the transistor is switched
    • Power = Capacitive load x Voltage^2 x Frequency of switches
  • This power limit is what changed the design of microprocessors
    • Companies started referring to the processors on a chip as “cores” to avoid confusion between processors and microprocessors
    • Such chips are called multicore microprocessors, because they hold multiple processors
      • This relates to the topic of parallelism, which is beyond the introductory lecture
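The CMOS dynamic-power relation, power = capacitive load x voltage^2 x switching frequency, can be checked numerically. The component values below are invented, but they show why the quadratic voltage term mattered so much.

```python
# Dynamic power in CMOS: P = C * V^2 * f. The quadratic dependence on
# voltage is why voltage scaling kept power manageable for decades.
# All component values here are invented for illustration.

def dynamic_power(capacitive_load_farads, voltage_volts, frequency_hz):
    return capacitive_load_farads * voltage_volts ** 2 * frequency_hz

old = dynamic_power(1e-9, 5.0, 100e6)  # 5 V part at 100 MHz
new = dynamic_power(1e-9, 1.0, 2e9)    # 1 V part at 2 GHz
print(f"{old:.1f} W vs {new:.1f} W")   # 20x the frequency, yet less power
```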

Making a Chip

  • All chips begin with silicon, which does not conduct electricity well and is called a semiconductor
    • Chemical processes can turn areas of silicon into one of three things
      • Conductors
      • Insulators
      • Areas that can switch between conducting and insulating (transistors)
    • The process starts with a silicon crystal ingot, which is sliced into wafers no more than 0.1 inch thick
    • The wafers are patterned through a series of processing steps and then cut into pieces called dies, which become chips
    • The dies are connected to I/O pins in a process called bonding, then tested and shipped
    • Cost per Die = Cost Per Wafer / (Dies per Wafer x Yield)
      • Yield is the percentage of good dies out of total dies
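The cost-per-die formula can be sketched with invented figures:

```python
# Cost per die = cost per wafer / (dies per wafer * yield), where
# yield is the fraction of dies that work. All figures are invented.

def cost_per_die(wafer_cost, dies_per_wafer, yield_fraction):
    return wafer_cost / (dies_per_wafer * yield_fraction)

# A $5000 wafer holding 400 dies at 50% yield:
print(cost_per_die(5000.0, 400, 0.5))  # 25.0 dollars per working die
```

Note that yield appears in the denominator: halving the yield doubles the cost per good die, which is why defect rates dominate chip economics.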