Computer Organization & Design 1

Pratyay Pandey

Introduction

Motivation to Learn CO&D

  • Computers are quite literally the third revolution in civilization, after the Agricultural and Industrial Revolutions
    • Knowing algorithms alone is not enough to understand computer science holistically or to implement what you have learned
  • Each time the cost of computing improves by a factor of 10, new possibilities unfold for computing
    • Artificial intelligence - The transition from the GOFAI era was only possible because of strides in computing in the 2000s, through GPUs and processors
  • Low-level programming is vital to understanding how your programs run and why code works the way it does
    • How does your program translate into something that a machine can understand?
    • How does the machine interpret machine code?
  • Low-level programming is also a great industry skill, even in AI
    • Though we don’t really notice it these days with the Python ecosystem, NumPy and other libraries are implemented in C and compiled to machine code, which is why they are so fast
    • Knowing exactly how a computer works allows you to fully utilize it
  • Computer organization and design open up new possibilities for programming practices, such as parallelism
  • Understanding how memory is organized lets you minimize memory use and make programs fast

Classes of Computing Applications

  • Desktop computers = Computers designed for use by individuals, typically with a graphic display, keyboard, and mouse
    • The best-known class; emphasizes good performance at low cost, typically running third-party software
  • Servers = Computers used for running large programs for multiple users, often at the same time, accessed via a network
    • Modern form of what were once mainframes
    • Servers are directed towards large workloads, like several tiny applications or a few very complex ones
    • Built from the same basic tech as desktop computers, yet provide more computing and input / output capability
  • Supercomputers = Class of computers with highest performance and cost; typically in the millions
    • These consist of thousands of processors and typically terabytes (2^40 bytes) of memory and petabytes (1000 or 1024 terabytes) of storage
    • Used for large-scale scientific and engineering calculations
  • Datacenters = Rooms or buildings meant to handle the power, cooling, and networking of several servers
    • Used by companies like eBay and Google and contain thousands of processors, terabytes of memory, and petabytes of storage
  • Embedded computers = Computers inside other devices meant for one application or a collection of software
    • Largest class of computers (e.g., the computers embedded in cars, phones, and video game consoles)

Programs and Hardware

Programs

  • Involve three parts
    • The hardware
    • The systems software
      • Between the hardware and the actual application software
      • Meant to organize the connections between the hardware and the application
      • Two types that are in almost any computer today
        • Operating system
          • Interfaces between a user’s program and the hardware, providing services and supervisory functions
          • Handles basic input and output operations
          • Allocates storage and memory
          • Provides for protected sharing across multiple applications used simultaneously on a computer
        • Compiler
          • Translates a program written in a high-level language into instructions that the hardware can execute

High-Level Languages to Hardware

  • From an electronic standpoint, we either turn on an electric signal or turn it off
    • Hence, we can use a binary digit, or bit, that is either off (0) or on (1)
    • The commands a computer understands are called instructions, a subtle, yet important, piece of terminology for future concepts
  • The first programmers communicated with computers in binary numbers but later wrote programs to translate from symbolic notation to binary
    • Assemblers were made to translate from certain sets of instructions to binary
      • add A, B → 1000110010100000
    • This symbolic language is now known as assembly language, while what the machine understands, the binary representation, is machine language
  • High-level languages, like C and Python, are first translated into an assembly language program and then into machine language
    • Python is typically interpreted instead: the Python interpreter, itself a program that was compiled from C through assembly into machine language, executes Python code directly
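The assembler step can be sketched as a toy translator. Only the final bit pattern for `add A, B` comes from the notes; the opcode table, register numbers, and field layout below are invented for illustration, as a real ISA defines these fields precisely.

```python
# Toy assembler sketch: translates one symbolic instruction into a
# 16-bit machine word. The field layout and register numbers below
# are invented for illustration -- a real ISA defines them exactly.

OPCODES = {"add": 0b1000110}            # hypothetical 7-bit opcode field
REGISTERS = {"A": 0b0101, "B": 0b0000}  # hypothetical register numbers

def assemble(line):
    """Translate one 'op dst, src' line into a 16-bit binary string."""
    op, operands = line.split(maxsplit=1)
    dst, src = (r.strip() for r in operands.split(","))
    word = (OPCODES[op] << 9) | (REGISTERS[dst] << 5) | (REGISTERS[src] << 1)
    return format(word, "016b")

print(assemble("add A, B"))  # -> 1000110010100000, matching the notes
```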

Introduction to Hardware

  • There are five classical components to any computer
    • Input, output, memory, datapath, and control
    • Traditionally, the last two are combined into what we call the processor
  • Input devices, such as a keyboard or a mouse, feed information into a computer
  • Output devices are what convey the result of a computation to a user or another computer

Display

  • Most displays these days are liquid crystal displays (LCDs), a display technology that uses a thin layer of liquid polymers to transmit or block light depending on the applied charge
    • This gives a thin, low-power display that works by controlling the rod-shaped molecules in the liquid, which form a twisting helix that bends the light entering the display
  • Most LCDs use an active matrix, which has a tiny transistor switch for each pixel to control its current
    • An image is simply a matrix of picture elements, pixels, which are represented by a matrix of bits, or a bit map
  • The computer hardware support for graphics is typically a raster refresh buffer, or frame buffer, to store the bit map
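The bit-map and frame-buffer idea can be sketched with a minimal monochrome display; the dimensions are arbitrary, and real frame buffers store several bits per pixel for color.

```python
# A bit map: the image is a matrix of pixels, each represented by one
# bit. The frame buffer stores those bits in a flat array, the way
# display hardware would.

WIDTH, HEIGHT = 8, 4
frame_buffer = [0] * (WIDTH * HEIGHT)

def set_pixel(x, y, on):
    """Turn pixel (x, y) on or off in the frame buffer."""
    frame_buffer[y * WIDTH + x] = 1 if on else 0

def refresh():
    """Render the bit map as text, one row of pixels per line."""
    for y in range(HEIGHT):
        row = frame_buffer[y * WIDTH:(y + 1) * WIDTH]
        print("".join("#" if bit else "." for bit in row))

set_pixel(0, 0, True)   # top-left corner
set_pixel(7, 3, True)   # bottom-right corner
refresh()
```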

Inside the Box

  • Motherboard = Plastic board containing packages of integrated circuits, like the processor, cache, memory, and connectors for I/O devices
    • Integrated circuits or chips are devices with dozens to millions of transistors
  • Memory is where programs and their data are kept when they are running
    • One type of memory is DRAM, or dynamic random access memory
      • “Random access” means any memory location takes roughly the same time to access, regardless of its address
  • The processor is the active part of the board, which follows the instructions of the program
    • Sometimes called the CPU, or central processing unit, it contains the datapath and control, does arithmetic and logic, tests numbers, signals I/O devices, etc.
    • The datapath is what performs arithmetic operations
    • Control tells the datapath, memory, and I/O devices what to do
    • The processor also has another type of memory, cache memory, which is small, fast memory that acts as a buffer for slow, larger memory
      • Cache is built on static random access memory (SRAM), which is faster but less dense and, thus, more expensive
  • A common theme that is easily noticeable is abstraction, or the use of layering, where lower-level details are hidden to offer a simpler model
    • Hardware and software build upon one another in a manner that hides away irrelevant details for simplicity
    • A very important abstraction is that between hardware and the lowest-level software, which is known as instruction set architecture or simply architecture
    • The combination of the operating system interface and the basic instruction set is known as the application binary interface (ABI)

Data

  • Data is very important and comes in many types
  • Some memory is volatile, meaning it stores data only while the device is receiving power; if the device is powered off, the data is lost
    • This includes DRAM
  • Nonvolatile memory, like a DVD, retains data even in the absence of a power source
  • Main memory or primary memory = Memory used to hold programs when they are running, like DRAM
  • Secondary memory = Nonvolatile memory used to store programs and data between runs, like magnetic disks
    • Magnetic disks = Nonvolatile memory that is composed of rotating platters coated with magnetic recording material
      • To read and write information on a hard disk, a movable arm containing a small electromagnetic coil called a read-write head is kept just above the surface
    • Flash memory = A nonvolatile semiconductor memory that is cheaper and slower than DRAM but more expensive and faster than magnetic disks
  • There are several types of storage technologies
    • Optical disks, like CDs and DVDs, are the most common form of removable storage
    • Flash-based removable memory typically attaches to USB connections and is often used to transfer files
    • Magnetic tapes provide slow serial access and are used to back up disks, though this is most often done with duplicate hard drives now

Tech for Building Processors and Memory

  • Processors and memory have improved at a very drastic rate
  • Transistors are simply on / off switches controlled by electricity
  • An integrated circuit is a combination of many transistors
  • A very large-scale integrated (VLSI) circuit is a chip with millions of transistors
  • The rate of increasing integration has been very consistent
    • The industry has quadrupled capacity every three years for twenty years
    • The increase in transistor count is known as Moore’s law
      • The transistor capacity doubles every 18 - 24 months
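The two rates quoted above (quadrupling every three years, doubling every 18 to 24 months) describe essentially the same exponential growth; a quick arithmetic sketch:

```python
# Moore's law as arithmetic: capacity grows as 2^(years / doubling period).

def capacity_multiplier(doubling_period_years, years):
    """How much transistor capacity grows over a span of years."""
    return 2 ** (years / doubling_period_years)

# Quadrupling every 3 years is the same as a doubling period of 1.5 years:
print(capacity_multiplier(1.5, 3))   # 4.0x over three years
# Doubling every 24 months, sustained for twenty years:
print(capacity_multiplier(2, 20))    # 1024.0x over twenty years
```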

Performance

  • Accurately measuring and comparing different computers is very important
  • Suppose we define performance in terms of speed
    • We would want to reduce response time, or execution time, the time it takes to complete each task
    • Datacenters, with hundreds of computers, instead want to increase throughput, or bandwidth, the total amount of work done in a given time
    • We can define performance to be the reciprocal of execution time
      • The lower the execution time, the higher the performance
  • Execution time itself varies and depends on what kind of task we are referring to
    • CPU execution time or CPU time is the amount of time required for the CPU to compute for a task
      • This can be split into user CPU time, the time spent in the program, and system CPU time, the time spent performing tasks on behalf of the program
  • We care about how fast hardware can perform basic functions
    • Most computers are constructed with a clock that determines when events take place inside the hardware
      • Time is divided into discrete intervals known as clock cycles (also called ticks, clock periods, clocks, cycles, or clock ticks)
      • Clock rate is the inverse of the clock period, or time for a complete clock cycle
    • CPU execution time = # of clock cycles * clock cycle time = # of clock cycles / clock rate
    • CPU clock cycles = Instructions for a program * Average clock cycles per instruction (CPI)
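The performance equations above compose into a small calculator; the instruction count, CPI, and clock rate below are made-up figures for illustration.

```python
# CPU time = (instruction count * CPI) / clock rate, and performance
# is defined as the reciprocal of execution time. All figures here
# are hypothetical.

def cpu_time(instruction_count, cpi, clock_rate_hz):
    """Seconds of CPU time for a program."""
    cycles = instruction_count * cpi   # CPU clock cycles
    return cycles / clock_rate_hz

def performance(execution_time_s):
    """Lower execution time means higher performance."""
    return 1 / execution_time_s

t = cpu_time(10e9, 2.0, 4e9)   # 10 billion instructions, CPI 2, 4 GHz
print(t)               # 5.0 seconds
print(performance(t))  # 0.2
```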

The Power Wall

  • Clock rate and power have increased very rapidly for decades but have started to flatten out
    • They are correlated, which is why they grew together
    • They are slowing down because there is a power limit for cooling
  • Dominant technology for chips is CMOS (complementary metal oxide semiconductor)
    • The primary source of power dissipation is dynamic power, the power consumed during switching
    • Dynamic power dissipation depends on the capacitive load of each transistor, the voltage applied, and the frequency at which the transistor is switched
    • Power = Capacitive load x Voltage^2 x Frequency of switches
  • This power limit is what changed the design of microprocessors
    • Companies started referring to the processors on a chip as “cores” to avoid confusion between processors and microprocessors
    • Such chips are called multicore microprocessors, because they hold multiple processors
      • This relates to the topic of parallelism, which is beyond the introductory lecture
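The CMOS dynamic-power relation, power = capacitive load x voltage^2 x switching frequency, can be checked numerically. The component values below are invented, but they show why the quadratic voltage term mattered so much.

```python
# Dynamic power in CMOS: P = C * V^2 * f. The quadratic dependence on
# voltage is why voltage scaling kept power manageable for decades.
# All component values here are invented for illustration.

def dynamic_power(capacitive_load_farads, voltage_volts, frequency_hz):
    return capacitive_load_farads * voltage_volts ** 2 * frequency_hz

old = dynamic_power(1e-9, 5.0, 100e6)  # 5 V part at 100 MHz
new = dynamic_power(1e-9, 1.0, 2e9)    # 1 V part at 2 GHz
print(f"{old:.1f} W vs {new:.1f} W")   # 20x the frequency, yet less power
```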

Making a Chip

  • All chips begin with silicon, which does not conduct electricity well and is called a semiconductor
    • Chemical processes can turn areas of silicon into one of three things
      • Conductors
      • Insulators
      • Areas that can switch between conducting and insulating (transistors)
    • The process starts with a silicon crystal ingot, which is sliced into wafers no more than 0.1 inch thick
    • The wafers are patterned through a series of processing steps and then cut into pieces called dies, which become chips
    • The dies are connected to I/O pins in a process called bonding, then tested and shipped
    • Cost per Die = Cost Per Wafer / (Dies per Wafer x Yield)
      • Yield is the percentage of good dies out of total dies
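The cost-per-die formula can be sketched with invented figures:

```python
# Cost per die = cost per wafer / (dies per wafer * yield), where
# yield is the fraction of dies that work. All figures are invented.

def cost_per_die(wafer_cost, dies_per_wafer, yield_fraction):
    return wafer_cost / (dies_per_wafer * yield_fraction)

# A $5000 wafer holding 400 dies at 50% yield:
print(cost_per_die(5000.0, 400, 0.5))  # 25.0 dollars per working die
```

Note that yield appears in the denominator: halving the yield doubles the cost per good die, which is why defect rates dominate chip economics.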