Computer Architecture
Prepared by
M.Arumugam
AP/CT-PG
Kongu Engineering College, Perundurai
The Memory System
Some basic concepts
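[Figure: Connection of the memory to the processor. The processor's MAR drives a k-bit address bus and its MDR connects to an n-bit data bus; control lines (R/W, MFC, etc.) run between processor and memory. The memory has up to 2^k addressable locations; word length = n bits.]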
Semiconductor RAM memories
Internal organization of memory chips
[Figure: Organization of bit cells in a memory chip. Address lines A0 to A3 feed an address decoder that drives word lines W0 to W15. Each row of memory cells (FF) connects through bit lines to the Sense/Write circuits, which provide the data input/output lines b7 ... b1, b0 and are controlled by the R/W and CS inputs.]
SRAM Cell
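[Figure: A static RAM cell. Transistors T1 and T2, controlled by the word line, connect the latch nodes X and Y to the bit lines b and b′.]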
Asynchronous DRAMs
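[Figure: Internal organization of a 2M × 8 dynamic memory chip. The multiplexed address inputs A20-9 / A8-0 feed a row address latch (strobed by RAS) and a column address latch (strobed by CAS). The row decoder selects a row of the 4096 × (512 × 8) cell array; the column decoder and the Sense/Write circuits connect the selected byte to data lines D7 to D0, under control of CS and R/W.]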
Fast Page Mode
fast page mode feature.
Synchronous DRAMs
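[Figure: Synchronous DRAM. The Row/Column address inputs feed a row address latch and row decoder, and a column address counter and column decoder; a refresh counter is also present. The cell array connects through Read/Write circuits and latches to a data input register and a data output register on the Data lines. A mode register and timing control block receives Clock, RAS, CAS, R/W and CS.]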
with processor clock signal.
connected to a latch.
contents of the cells in a row are
loaded onto the latches.
contents of the cells are refreshed
without changing the contents of
the latches.
to the selected columns are transferred
to the output.
successive columns are selected using
column address counter and clock.
The CAS signal need not be generated
externally. New data is placed during
the rising edge of the clock.
Latency, Bandwidth, and DDR SDRAMs
Static memories
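[Figure: Organization of a 2M × 32 memory module using 512K × 8 static memory chips. Of the 21-bit addresses A0 to A20, a 19-bit internal chip address goes to every chip and the remaining 2 bits drive a 2-bit decoder that generates the chip select signals. Each 512K × 8 memory chip has a 19-bit address input, a chip select input and an 8-bit data input/output; the four chips in a row drive D31-24, D23-16, D15-8 and D7-0.]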
Implement a memory unit of 2M
words of 32 bits each.
Use 512K × 8 static memory chips.
Each column consists of 4 chips
and implements one byte position.
A chip is selected by setting its
chip select control line to 1.
The selected chip places its data on the
data output lines; the outputs of the other
chips are in the high-impedance state.
21 bits are needed to address a 32-bit word.
The high-order 2 bits select the row,
by activating one of the
four Chip Select signals.
The remaining 19 bits access a specific
byte location inside each chip of the
selected row.
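The address split described above can be sketched in a few lines of Python. This is only an illustrative model under the stated 2M × 32 organization; the function names `decode_word_address` and `chip_selects` are hypothetical and do not come from the slides.

```python
CHIP_ADDRESS_BITS = 19  # 512K locations inside each chip
ROW_SELECT_BITS = 2     # 4 rows of chips, one Chip Select signal per row


def decode_word_address(address):
    """Split a 21-bit word address into (row, chip_address)."""
    if not 0 <= address < (1 << (CHIP_ADDRESS_BITS + ROW_SELECT_BITS)):
        raise ValueError("address must fit in 21 bits")
    row = address >> CHIP_ADDRESS_BITS                        # high-order 2 bits
    chip_address = address & ((1 << CHIP_ADDRESS_BITS) - 1)   # low-order 19 bits
    return row, chip_address


def chip_selects(row):
    """One-hot output of the 2-bit decoder: 1 marks the selected row of chips."""
    return [1 if r == row else 0 for r in range(1 << ROW_SELECT_BITS)]


row, chip_address = decode_word_address((1 << 19) + 5)
print(row, chip_address, chip_selects(row))
```

All four chips in the selected row receive the same 19-bit address, and each supplies one byte of the 32-bit word.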
Dynamic memories
Memory controller
controller circuit is inserted between the processor
and memory.
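[Figure: Use of a memory controller. The processor sends Address, R/W, Request and Clock to the memory controller, which drives the memory with the multiplexed Row/Column address, RAS, CAS, R/W, CS and Clock. Data lines run between the processor and the memory.]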
Read-Only Memories (ROMs)
of memory is called Read-Only memory (ROM).
larger memory modules are implemented using
flash cards and flash drives.
Speed, Size, and Cost
of storage, but is much slower than DRAMs.
Memory Hierarchy
[Figure: Memory hierarchy. From top to bottom: processor registers, primary cache (L1), secondary cache (L2), main memory, magnetic disk secondary memory. Size increases toward the bottom; speed and cost per bit increase toward the top.]
processor registers. Registers are at
the top of the memory hierarchy.
can be implemented on the processor
chip. This is processor cache.
is on the processor chip. Level 2 (L2)
cache is in between main memory and
processor.
as SIMMs. Much larger, but much slower
than cache memory.
of inexpensive storage.
idea is to bring instructions and data
that will be used in the near future as
close to the processor as possible.
Cache Memories
Locality of Reference
to be executed soon.
Cache memories
from the main memory, some block of words in the cache must be
replaced. This is determined by a “replacement algorithm”.
[Figure: Use of a cache memory. The cache sits between the processor and the main memory.]
Cache hit
Cache miss
addressed word is first brought into the cache. The desired word
is overwritten with new information.
Cache Coherence Problem
Mapping functions
Direct mapping
[Figure: Direct-mapped cache. Main memory blocks 0 to 4095 map onto cache blocks 0 to 127, each stored with a tag. Main memory address fields: Tag (5 bits), Block (7 bits), Word (4 bits).]
the cache. 0 maps to 0, 129 maps to 1.
position in the cache.
cache is not full.
replace the old block, leading to a trivial replacement
algorithm.
- The low-order 4 bits determine one of the 16
words in a block.
- When a new block is brought into the cache,
the next 7 bits determine which cache
block this new block is placed in.
- The high-order 5 bits determine which of the possible
32 blocks is currently present in the cache. These
are the tag bits.
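The three address fields used by direct mapping can be extracted with shifts and masks. The sketch below assumes the 16-bit main memory address implied by the field widths above (5 + 7 + 4 bits); the function names are hypothetical.

```python
WORD_BITS = 4    # 16 words per block
BLOCK_BITS = 7   # 128 cache blocks
TAG_BITS = 5     # 32 memory blocks compete for each cache block


def split_direct(address):
    """Split a 16-bit address into (tag, cache_block, word)."""
    word = address & ((1 << WORD_BITS) - 1)
    cache_block = (address >> WORD_BITS) & ((1 << BLOCK_BITS) - 1)
    tag = address >> (WORD_BITS + BLOCK_BITS)
    return tag, cache_block, word


def cache_block_for(memory_block):
    """Memory block j is placed in cache block j modulo 128."""
    return memory_block % (1 << BLOCK_BITS)


# Word 3 of memory block 129 lands in cache block 1 with tag 1.
print(split_direct(129 * 16 + 3))
print(cache_block_for(0), cache_block_for(129))
```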
Associative mapping
position.
- Low order 4 bits identify the word within a block.
- High order 12 bits or tag bits identify a memory
block when it is resident in the cache.
existing block in the cache when the cache is full.
the need to search all 128 patterns to determine
whether a given block is in the cache.
[Figure: Associative-mapped cache. Any of main memory blocks 0 to 4095 can be placed in any of cache blocks 0 to 127, each stored with a tag. Main memory address fields: Tag (12 bits), Word (4 bits).]
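With associative mapping, deciding whether a block is in the cache means comparing the 12-bit tag of the address against the tags of all 128 cache blocks. The following sketch models that search with a sequential loop, whereas hardware would perform the comparisons in parallel. The list-of-tags representation and the function names are assumptions made for illustration.

```python
WORD_BITS = 4          # 16 words per block
NUM_CACHE_BLOCKS = 128


def split_associative(address):
    """Split a 16-bit address into (tag, word); the tag is the memory block number."""
    return address >> WORD_BITS, address & ((1 << WORD_BITS) - 1)


def find_block(cache_tags, address):
    """Return the cache position whose tag matches the address, or None on a miss.

    cache_tags holds one entry per cache block: a tag, or None if the block is empty.
    """
    tag, _ = split_associative(address)
    for position, stored_tag in enumerate(cache_tags):
        if stored_tag == tag:
            return position
    return None


cache_tags = [None] * NUM_CACHE_BLOCKS
cache_tags[5] = 4095                      # memory block 4095 sits in cache block 5
print(find_block(cache_tags, 0xFFF7))     # hit in position 5
print(find_block(cache_tags, 0x0010))     # miss
```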
Set-Associative mapping
Blocks of the cache are grouped into sets.
The mapping function allows a block of the main
memory to reside in any block of a specific set.
Divide the cache into 64 sets, with two blocks per set.
Memory blocks 0, 64, 128, etc. map to set 0, and they
can occupy either of the two positions in that set.
The memory address is divided into three fields:
- The low-order 4 bits identify the word within a block.
- The next 6-bit field determines the set number.
- The high-order 6 bits are compared to the tag
fields of the two blocks in the set.
Set-associative mapping is a combination of direct and
associative mapping.
The number of blocks per set is a design parameter.
- One extreme is to have all the blocks in one set,
requiring no set bits (fully associative mapping).
- The other extreme is to have one block per set, which is
the same as direct mapping.
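Because the number of blocks per set is a design parameter, the field widths can be computed from it. The sketch below assumes the same 128-block cache, 16-word blocks and 16-bit addresses used in the examples above, and assumes the number of sets is a power of two; the function names are hypothetical.

```python
def field_widths(blocks_per_set, cache_blocks=128, address_bits=16, word_bits=4):
    """Return (tag_bits, set_bits, word_bits) for a given associativity."""
    if cache_blocks % blocks_per_set != 0:
        raise ValueError("blocks_per_set must divide the number of cache blocks")
    num_sets = cache_blocks // blocks_per_set
    if num_sets & (num_sets - 1) != 0:
        raise ValueError("number of sets must be a power of two")
    set_bits = num_sets.bit_length() - 1          # log2 of a power of two
    tag_bits = address_bits - set_bits - word_bits
    return tag_bits, set_bits, word_bits


def split_address(address, blocks_per_set=2):
    """Split an address into (tag, set_index, word)."""
    _, set_bits, word_bits = field_widths(blocks_per_set)
    word = address & ((1 << word_bits) - 1)
    set_index = (address >> word_bits) & ((1 << set_bits) - 1)
    tag = address >> (word_bits + set_bits)
    return tag, set_index, word


print(field_widths(2))     # two blocks per set
print(field_widths(1))     # one block per set: direct mapping
print(field_widths(128))   # all blocks in one set: fully associative
print(split_address(128 * 16))   # memory block 128 maps to set 0
```

The two extreme cases reproduce the 5/7/4 split of direct mapping and the 12/4 split of associative mapping.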
[Figure: Set-associative-mapped cache with two blocks per set. Main memory blocks 0 to 4095 map onto the sets of the cache (cache blocks 0 to 127), and each cache block is stored with its own tag.]
Performance considerations
Interleaving
Methods of address layouts
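[Figure: Two methods of address layout across memory modules; each module has its own address buffer register (ABR) and data buffer register (DBR). (a) Consecutive words in a module: the high-order k bits of the MM address select the module (module 0 to module n-1) and the low-order m bits give the address in module. (b) Consecutive words in consecutive modules: the low-order k bits select the module (module 0 to module 2^k - 1) and the high-order m bits give the address in module.]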
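The difference between the two address layouts is only in which end of the address selects the module. A minimal sketch, assuming k module-select bits and m address-in-module bits as labeled in the figure (the function names are hypothetical):

```python
def consecutive_in_module(address, k, m):
    """High-order k bits select the module; returns (module, address_in_module)."""
    if not 0 <= address < (1 << (k + m)):
        raise ValueError("address must fit in k + m bits")
    return address >> m, address & ((1 << m) - 1)


def interleaved(address, k, m):
    """Low-order k bits select the module; returns (module, address_in_module)."""
    if not 0 <= address < (1 << (k + m)):
        raise ValueError("address must fit in k + m bits")
    return address & ((1 << k) - 1), address >> k


# With 4 modules (k = 2) of 16 words each (m = 4), list the module used
# by four consecutive addresses under each layout.
print([consecutive_in_module(a, 2, 4)[0] for a in range(4)])
print([interleaved(a, 2, 4)[0] for a in range(4)])
```

Under the interleaved layout, consecutive addresses fall in different modules, so a block transfer can keep several modules busy at once.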
Hit Rate and Miss Penalty
Caches on the processor chip
$T_{ave} = h_1 c_1 + (1 - h_1) h_2 c_2 + (1 - h_1)(1 - h_2) M$
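The formula can be evaluated directly. The slide text does not define its symbols here, so the sketch below assumes the usual reading: h1 and h2 are the hit rates of the L1 and L2 caches, c1 and c2 are their access times, and M is the main memory access time. The numeric values are made up purely for illustration.

```python
def average_access_time(h1, h2, c1, c2, m):
    """T_ave = h1*c1 + (1 - h1)*h2*c2 + (1 - h1)*(1 - h2)*M."""
    return h1 * c1 + (1 - h1) * h2 * c2 + (1 - h1) * (1 - h2) * m


# Illustrative values only: times in clock cycles.
t_ave = average_access_time(h1=0.95, h2=0.90, c1=1, c2=10, m=100)
print(round(t_ave, 3))
```

With these assumed numbers the three terms contribute 0.95, 0.45 and 0.5 cycles, which shows how much the rare main memory accesses still weigh on the average.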
Other Performance Enhancements
Write buffer
block can be read first.
Other Performance Enhancements (Contd.,)
Prefetching
memory references and then prefetches according
to this pattern.
Other Performance Enhancements (Contd.,)
Lockup-Free Cache
information about these misses.
Virtual Memory
Virtual memory organization
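[Figure: Virtual memory organization. The processor sends a virtual address to the MMU, which produces the physical address used by the cache and the main memory. Data moves between the processor, the cache and the main memory, and between the main memory and disk storage by DMA transfer.]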
virtual addresses into physical addresses.
main memory they are fetched as described
previously.
the main memory, they must be transferred
from secondary storage to the main memory.
the data from the secondary storage into the
main memory.
Address translation
[Figure: Virtual-memory address translation. Labels: virtual address from processor (virtual page number, offset); page table base register (page table address); adder (+); PAGE TABLE entry (control bits, page frame in memory); physical address in main memory (page frame, offset).]
The virtual address is interpreted as a page number and an offset.
The page table holds information about each page. This includes
the starting address of the page in the main memory.
The PTBR (page table base register) holds the address of the page table.
PTBR + virtual page number gives the address of the page's entry
in the page table.
This entry holds the starting location of the page.
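The translation steps above can be sketched as follows. The page size (4K bytes, so a 12-bit offset), the dictionary standing in for main memory, and the (valid, page_frame) layout of a page table entry are all assumptions made for this illustration; they are not specified in the slides.

```python
PAGE_OFFSET_BITS = 12  # assumed page size of 4K bytes


def translate(virtual_address, ptbr, memory):
    """Translate a virtual address using a page table stored in `memory`.

    `memory` maps an address to a page table entry (valid, page_frame);
    `ptbr` is the address of the first entry of the page table.
    """
    virtual_page = virtual_address >> PAGE_OFFSET_BITS
    offset = virtual_address & ((1 << PAGE_OFFSET_BITS) - 1)
    valid, page_frame = memory[ptbr + virtual_page]  # PTBR + virtual page number
    if not valid:
        raise LookupError(f"page fault on virtual page {virtual_page}")
    return (page_frame << PAGE_OFFSET_BITS) | offset


# Hypothetical page table with three entries, starting at address 100.
memory = {100: (True, 7), 101: (False, 0), 102: (True, 3)}
print(hex(translate(0x2ABC, ptbr=100, memory=memory)))
```

The valid flag plays the role of one of the control bits: when it is clear, the page is not in main memory and must be brought in from secondary storage.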
[Figure: Use of an associative-mapped TLB. The virtual page number of the virtual address from the processor is compared (=?) with the virtual page numbers stored in the TLB; each TLB entry also holds control bits and the page frame in memory. On a hit, the page frame and the offset form the physical address in main memory; the other outcome is a miss.]
High-order bits of the virtual address
generated by the processor select the
virtual page.
These bits are compared to the virtual
page numbers in the TLB.
If there is a match, a hit occurs and
the corresponding address of the page
frame is read.
If there is no match, a miss occurs
and the page table within the main
memory must be consulted.
Set-associative mapped TLBs are
found in commercial processors.
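The hit and miss cases can be modeled with a small lookup function. As before, the 12-bit offset and the representation of the TLB as a list of (virtual_page, page_frame) pairs are assumptions for illustration; a real TLB compares all entries in parallel rather than in a loop.

```python
PAGE_OFFSET_BITS = 12  # assumed page size of 4K bytes


def tlb_lookup(virtual_address, tlb):
    """Return the physical address on a TLB hit, or None on a miss.

    `tlb` is a list of (virtual_page, page_frame) pairs.
    """
    virtual_page = virtual_address >> PAGE_OFFSET_BITS
    offset = virtual_address & ((1 << PAGE_OFFSET_BITS) - 1)
    for entry_page, page_frame in tlb:
        if entry_page == virtual_page:            # match: TLB hit
            return (page_frame << PAGE_OFFSET_BITS) | offset
    return None                                   # miss: consult the page table


tlb = [(2, 3), (5, 9)]
print(tlb_lookup(0x2ABC, tlb))   # hit
print(tlb_lookup(0x1000, tlb))   # miss
```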
Memory Management
system space + several user spaces.
Memory management (contd..)
given time.
Memory management (contd..)
Secondary Storage
Magnetic Hard Disks
Disk
Disk drive
Disk controller
Organization of Data on a Disk
[Figure 5.30. Organization of one surface of a disk. Labeled sectors: sector 0, track 0; sector 0, track 1; sector 3, track n.]
Access Data on a Disk
Disk Controller
[Figure 5.31. Disks connected to the system bus. The processor, the main memory and a disk controller are attached to the system bus; the disk controller connects to two disk drives.]
RAID Disk Arrays
Optical Disks
[Figure 5.32. Optical disk. (a) Cross-section. (b) Transition from pit to land: a source and a detector are shown over a pit, over a pit-to-land transition and over a land, with the labels Reflection, No reflection and Reflection. (c) Stored binary pattern.]
Magnetic Tape Systems
[Figure 5.33. Organization of data on magnetic tape. Labels: file, file mark, file gap, record, record gap; 7 or 9 bits are recorded across the tape.]