
CHAPTER-4

MEMORY ORGANIZATION


4.1 INTRODUCTION

  • Most computers are built using the Von Neumann model, which is centered on memory.
  • The programs that perform the processing are stored in memory.
  • We know memory is logically structured as a linear array of locations, with addresses from 0 to the maximum memory size the processor can address.
  • In this chapter we examine the various types of memory and how each is part of the memory hierarchy system.
  • We then look at cache memory (a special high-speed memory) and virtual memory.


4.2 Types of Memory: RAM (Random Access Memory)

  • Static RAM (SRAM)
    • Consists essentially of internal flip-flops and transistors that store the binary information.
    • The stored information remains valid as long as power is applied to the unit.
    • Relatively insensitive to disturbances such as electrical noise.
    • Roughly 8-16 times faster than DRAM, and roughly 8-16 times more expensive.
  • Dynamic RAM (DRAM)
    • Each cell stores a bit using a capacitor.
    • Value must be refreshed every 10-100 ms.
    • Sensitive to disturbances.
    • Slower and cheaper than SRAM.


  • Even though a large number of memory technologies exist, there are only two basic types of memory:
  • RAM (random access memory) and ROM (read-only memory).
  • RAM is also called main memory or primary memory. It is used to store programs and data that the computer needs when executing programs; but RAM is volatile, and loses this information once the power is turned off.
  • There are two general types of chips used to build the bulk of RAM memory in today’s computers:
  • SRAM and DRAM (Static and dynamic Random Access Memory).


  • SRAM is faster and much more expensive than DRAM; however, designers use DRAM because it is much denser (can store many bits per chip), uses less power, and generates less heat than SRAM.
  • For these reasons, both technologies are often used in combination: DRAM for main memory and SRAM for cache.
  • In addition to RAM, most computers contain a small amount of ROM (read only memory) that stores critical information necessary to operate the system, such as the program necessary to boot the computer.
  • ROM is not volatile and always retains its data.


  • This type of memory is also used in embedded systems or any systems where the programming does not need to change.

There are five basic ROM types:

  • ROM - Read Only Memory
  • PROM - Programmable Read Only Memory
  • EPROM - Erasable Programmable Read Only Memory
  • EEPROM - Electrically Erasable Programmable Read Only Memory
  • Flash memory - a variation of EEPROM that is erased and rewritten in blocks


4.3 Characteristics of a Memory System

  • Capacity: The amount of information that can be contained in a memory unit.
  • Memory word: The natural unit of organization in the memory, typically the number of bits used to represent a number.
  • Addressable unit: The fundamental data element size that can be addressed in the memory.
  • Unit of transfer: The number of data elements transferred at a time.
  • Transfer rate: The rate at which data is transferred to/from the memory device.
  • Access time: For RAM, the time to address the unit and perform the transfer.

4.3.1 Access Techniques: How are memory contents accessed?

Random Access:

  • Each location has a unique physical address.
  • Locations can be accessed in any order and all access times are the same.
  • Example: main memory


Sequential access:

  • Data does not have a unique address
  • Must read all data items in sequence until the desired item is found.
  • Example: tape drive units

Direct Access: Data items have a unique address

  • Access is done using a combination of moving to a general memory “area” followed by a sequential access to reach the desired data item.
  • Example: disk drives

Associative access:

  • Data items are accessed based on their contents rather than their actual location.
  • Search all data items in parallel for a match to a given search pattern.
  • Example: some cache memory units


4.4 THE MEMORY HIERARCHY

  • Today’s computer systems use a combination of memory types to provide the best performance at the best cost.
  • This approach is called hierarchical memory.
  • As a rule, the faster memory is, the more expensive it is per bit of storage.
  • By using a hierarchy of memories, each with different access speeds and storage capacities, a computer system can provide fast access to most data at a reasonable cost.


  • Today’s computers each have a small amount of very high-speed memory, called a cache, where data from frequently used memory locations may be temporarily stored.
  • This cache is connected to a much larger main memory, which is typically a medium-speed memory.
  •  We classify memory based on its “distance” from the processor, with distance measured by the number of machine cycles required for access.


The following terminology is used when referring to this memory hierarchy:

  • Hit—the requested data resides in a given level of memory.
  • Miss—the requested data is not found in the given level of memory.
  • Hit rate—the percentage of memory accesses found in a given level of memory.
  • Miss rate—the percentage of memory accesses not found in a given level of memory.

Note: Miss Rate = 1 - Hit Rate.

  • Hit time—the time required to access the requested information in a given level of memory.
  • Miss penalty—the time required to process a miss, which includes replacing a block in an upper level of memory, plus the additional time to deliver the requested data to the processor. (The time to process a miss is typically significantly larger than the time to process a hit.)


The memory hierarchy is illustrated in the figure shown below.


4.5 CACHE MEMORY

  • A cache memory is a small, temporary, but fast memory that the processor uses for information it is likely to need again in the very near future.
  • The computer really has no way to know, a priori, what data is most likely to be accessed, so it uses the locality principle and transfers an entire block from main memory into cache whenever it has to make a main memory access.
  • The cache location for this new block depends on two things:
      • the cache mapping policy and
      • the cache size


  • The size of cache memory can vary enormously.
  • A typical personal computer’s level 2 (L2) cache is 256K or 512K.
  • Level 1 (L1) cache is smaller, typically 8K or 16K.
  • L1 cache resides on the processor, whereas L2 cache resides between the CPU and main memory.
  • L1 cache is, therefore, faster than L2 cache.
  •   What makes cache “special”? Cache is not accessed by address; it is accessed by content.
  • For this reason, cache is sometimes called content addressable memory or CAM.
  • To simplify this process of locating the desired data, various cache mapping algorithms are used.


4.5.1 Cache Mapping Schemes

  • When accessing data or instructions, the CPU first generates a main memory address.
  • Main memory and cache are both divided into blocks of the same size (the block size varies from system to system).
  • When a memory address is generated, cache is searched first to see if the required word exists there.
  • When the requested word is not found in cache, the entire main memory block in which the word resides is loaded into cache.
  •  How, then, does the CPU locate data when it has been copied into cache?
  • The CPU uses a specific mapping scheme that “converts” the main memory address into a cache location.


  • This address conversion is done by giving special significance to the bits in the main memory address.
  • We first divide the bits into distinct groups we call fields.
  • Depending on the mapping scheme, we may have two or three fields.
  • How we use these fields depends on the particular mapping scheme being used.
  • The mapping scheme determines
    • where the data is placed when it is originally copied into cache, and
    • how the CPU finds previously copied data when searching cache.


  • So, how do we use fields in the main memory address?
  • One field of the main memory address points us to a location in cache:
      • the location where the data resides, if it is resident in cache (this is called a cache hit), or
      • the location where it is to be placed, if it is not resident (which is called a cache miss).
  • The cache block referenced is then checked to see if it is valid.
  • This is done by associating a valid bit with each cache block.
  • A valid bit of 0 means the cache block is not valid (we have a cache miss), so we must access main memory.


  • A valid bit of 1 means it is valid (we may have a cache hit).
  • We then compare the tag in the cache block to the tag field of our address.
  • (The tag is a special group of bits derived from the main memory address that is stored with its corresponding block in cache.)
  • If the tags are the same,
    • then we have found the desired cache block (we have a cache hit).
  • At this point we need to locate the desired word in the block;
    • this can be done using a different portion of the main memory address called the word field.


Direct Mapped Cache

  • Direct mapped cache assigns cache mappings using a modular approach.
  • Because there are more main memory blocks than there are cache blocks, it should be clear that main memory blocks compete for cache locations.
  • Direct mapping maps block X of main memory to cache block (X mod N), where N is the total number of blocks in cache.
  • For example, if cache contains 10 blocks, then main memory block 0 maps to cache block 0, main memory block 1 maps to cache block 1, . . . , main memory block 9 maps to cache block 9, and main memory block 10 maps to cache block 0.
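The modular placement rule described above can be sketched in a few lines of Python (a minimal sketch; the function name is illustrative, not from the text):

```python
# Direct mapping: main memory block X maps to cache block X mod N,
# where N is the total number of blocks in cache.
def cache_block(memory_block: int, num_cache_blocks: int) -> int:
    return memory_block % num_cache_blocks

# With a 10-block cache, as in the example above:
print(cache_block(0, 10))   # -> 0
print(cache_block(9, 10))   # -> 9
print(cache_block(10, 10))  # -> 0 (blocks 0 and 10 compete for cache block 0)
```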


  • You may be wondering, if main memory blocks 0 and 10 both map to cache block 0, how does the CPU know which block actually resides in cache block 0 at any given time?
  • The answer is that each block is copied to cache and identified by the tag.


  • To perform direct mapping, the binary main memory address is partitioned into the fields shown below


  • Consider the following example: Assume memory consists of 2^14 words, cache has 16 blocks, and each block has 8 words.
  • From this we determine that memory has

2^14 / 2^3 = 2^11 blocks.

  • We know that each main memory address requires 14 bits.
  • Of this 14-bit address field, the rightmost 3 bits reflect the word field (we need 3 bits to uniquely identify one of 8 words in a block).
  • We need 4 bits to select a specific block in cache, so the block field consists of the middle 4 bits. The remaining 7 bits make up the tag field.
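As a sketch (names are illustrative), the 14-bit address in this example can be split into its tag, block, and word fields with shifts and masks:

```python
# Field widths from the example: 7-bit tag, 4-bit block, 3-bit word.
WORD_BITS, BLOCK_BITS = 3, 4

def split_address(addr: int):
    word = addr & ((1 << WORD_BITS) - 1)                    # rightmost 3 bits
    block = (addr >> WORD_BITS) & ((1 << BLOCK_BITS) - 1)   # middle 4 bits
    tag = addr >> (WORD_BITS + BLOCK_BITS)                  # leftmost 7 bits
    return tag, block, word

# A 14-bit address with tag 5, block 3, word 2:
print(split_address(0b0000101_0011_010))  # -> (5, 3, 2)
```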


  • In this example, because main memory blocks 0 and 16 both map to cache block 0, the tag field would allow the system to differentiate between block 0 and block 16.
  • The binary addresses in block 0 differ from those in block 16 in the upper leftmost 7 bits, so the tags are different and unique.
  •  To see how these addresses differ, let’s look at a smaller, simpler example.
  • Suppose we have a system using direct mapping with 16 words of main memory divided into 8 blocks (so each block has 2 words).
  • Assume the cache is 4 blocks in size (for a total of 8 words).


  • We know: A main memory address has 4 bits (because there are 2^4 = 16 words in main memory).
  • This 4-bit main memory address is divided into three fields:
  • The word field is 1 bit (we need only 1 bit to differentiate between the two words in a block);
  • the block field is 2 bits (the cache has 4 blocks, and we need 2 bits to uniquely identify each one); and the tag field has 1 bit (this is all that is left over).
  •  The main memory address is divided into the fields as shown below



  • Suppose we generate the main memory address 9.
  • We can see from the mapping listing above that address 9 is in main memory block 4 and should map to cache block 0 (which means the contents of main memory block 4 should be copied into cache block 0).
  • The computer, however, uses the actual main memory address to determine the cache mapping block.
  • This address, in binary, is represented in the figure shown below.
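A minimal Python sketch of this 4-bit decomposition (field widths from the example: 1 tag bit, 2 block bits, 1 word bit; the function name is illustrative):

```python
def split_4bit(addr: int):
    word = addr & 0b1           # 1-bit word field
    block = (addr >> 1) & 0b11  # 2-bit block field
    tag = addr >> 3             # 1-bit tag field
    return tag, block, word

# Address 9 = 0b1001: tag 1, cache block 0, word 1.
# Main memory block 4 (= 9 // 2) maps to cache block 0 (= 4 mod 4), as stated.
print(split_4bit(9))  # -> (1, 0, 1)
```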


4.5.2. Effective Access Time and Hit Ratio

  • The performance of a hierarchical memory is measured by its effective access time (EAT), or the average time per access.
  • EAT is a weighted average that uses the hit ratio and the relative access times of the successive levels of the hierarchy.
  • For example, suppose the cache access time is 10ns, main memory access time is 200ns, and the cache hit rate is 99%.
  • The average time for the processor to access an item in this two-level memory would then be:

EAT = 0.99 × 10ns + 0.01 × 200ns = 9.9ns + 2ns = 11.9ns


  • What, exactly, does this mean? If we look at the access times over a long period of time, this system performs as if it had a single large memory with an 11.9ns access time.
  • A 99% cache hit rate allows the system to perform very well, even though most of the memory is built using slower technology with an access time of 200ns.
  • The formula for calculating effective access time for a two-level memory is given by:

EAT = H × AccessC + (1 − H) × AccessMM

  • where H = cache hit rate, AccessC = cache access time, and AccessMM = main memory access time.
  • This formula can be extended to apply to three- or even four-level memories.
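The two-level EAT formula can be expressed as a one-line function (a minimal sketch; the name `eat` is illustrative):

```python
def eat(hit_rate: float, cache_ns: float, memory_ns: float) -> float:
    # EAT = H * AccessC + (1 - H) * AccessMM
    return hit_rate * cache_ns + (1 - hit_rate) * memory_ns

print(eat(0.99, 10, 200))  # ~11.9 ns, matching the example above
```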


4.6 VIRTUAL MEMORY

  • The purpose of virtual memory is to use the hard disk as an extension of RAM, thus increasing the available address space a process can use.
  • Using virtual memory, your computer addresses more main memory than it actually has, and it uses the hard drive to hold the excess.
  • This area on the hard drive is called a page file, because it holds chunks of main memory on the hard drive.


  • The easiest way to think about virtual memory is to conceptualize it as an imaginary memory in which all addressing issues are handled by the operating system.
  • The most common way to implement virtual memory is by using paging,
    • a method in which main memory is divided into fixed-size blocks and programs are divided into the same size blocks.
    • Program addresses, once generated by the CPU, must be translated to main memory addresses.


  • How is this done?
    • Before delving further into an explanation of virtual memory, let’s define some frequently used terms for virtual memory implemented through paging:
  • Virtual address—The logical or program address that the process uses.
    • Whenever the CPU generates an address, it is always in terms of virtual address space.
  • Physical address—The real address in physical memory or main memory.
  • Mapping—The mechanism by which virtual addresses are translated into physical ones.
  • Page frames—The equal-size chunks or blocks into which main memory (physical memory) is divided.


  • Pages—The chunks or blocks into which virtual memory (the logical address space) is divided, each equal in size to a page frame.
    • Virtual pages are stored on disk until needed.

  • Paging—The process of copying a virtual page from disk to a page frame in main memory.
  • Fragmentation—Memory that becomes unusable.
  • Page fault—An event that occurs when a requested page is not in main memory and must be copied into memory from disk.


  • Virtual memory can be implemented with different techniques, including paging, segmentation, or a combination of both, but paging is the most popular.
  • The success of paging, like that of cache, is very dependent on the locality principle.
  • When data is needed that does not reside in main memory, the entire block in which it resides is copied from disk to main memory.


4.6.1 Paging

  • The basic idea behind paging is quite simple:
    • Allocate physical memory to processes in fixed size chunks (page frames)
    • and keep track of where the various pages of the process reside by recording information in a page table.
  • Every process has its own page table that typically resides in main memory,
    • and the page table stores the physical location of each virtual page of the process.
  • The page table has N rows, where N is the number of virtual pages in the process.


  • If there are pages of the process currently not in main memory,
    • the page table indicates this by setting a valid bit to 0; if the page is in main memory, the valid bit is set to 1.
  • Therefore, each entry of the page table has two fields:
    • a valid bit and a frame number
  • To accomplish this address translation, a virtual address is divided into two fields:
    • a page field, and an offset field that represents the location within that page where the requested data resides.


To access data at a given virtual address, the system performs the following steps:

      • Extract the page number from the virtual address.
      • Extract the offset from the virtual address.
      • Translate the page number into a physical page frame number by accessing the page table:
          • Look up the page number in the page table (using the virtual page number as an index).
          • Check the valid bit for that page.
          • If the valid bit = 0, the system generates a page fault and the operating system must intervene to:
              • Locate the desired page on disk.
              • Find a free page frame (this may necessitate removing a “victim” page from memory and copying it back to disk if memory is full).


      • Copy the desired page into the free page frame in main memory.
      • Update the page table. (The virtual page just brought in must have its frame number and valid bit in the page table modified. If there was a “victim” page, its valid bit must be set to zero.)
      • If the valid bit = 1, the page is in memory; replace the virtual page number with the actual frame number.
      • Access the data at the offset in the physical page frame by appending the offset to the frame number for the given virtual page.
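The translation steps above can be sketched as follows (a hypothetical page table of (valid, frame) pairs; the page-fault path is reduced to an exception rather than a full disk transfer):

```python
PAGE_BITS = 5  # assume 2**5 = 32 words per page

def translate(virtual_addr: int, page_table):
    page = virtual_addr >> PAGE_BITS                 # extract page number
    offset = virtual_addr & ((1 << PAGE_BITS) - 1)   # extract offset
    valid, frame = page_table[page]                  # look up page table entry
    if not valid:                                    # valid bit = 0: page fault
        raise RuntimeError("page fault: OS must copy the page in from disk")
    return (frame << PAGE_BITS) | offset             # frame number + offset

# Hypothetical table: pages 0 and 2 resident in frames 3 and 0; others on disk.
table = [(1, 3), (0, None), (1, 0), (0, None)]
print(translate(7, table))  # page 0, offset 7 -> frame 3, physical address 103
```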


  • Let’s look at an example. Suppose that we have a virtual address space of 2^8 words for a given process and physical memory of 4 page frames.
  • Assume also that pages are 32 words in length.
  • Virtual addresses contain 8 bits, and physical addresses contain 7 bits (4 frames of 32 words each is 128 = 2^7 words).
  • Suppose, also, that some pages from the process have been brought into main memory.
  • The diagram shown below illustrates the current state of the system.


  • Each virtual address has 8 bits and is divided into 2 fields: the page field has 3 bits, indicating there are 2^3 pages of virtual memory (2^8 / 2^5 = 2^3). Each page is 2^5 = 32 words in length, so we need 5 bits for the page offset. Therefore, an 8-bit virtual address has the format shown below.
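A minimal sketch of this 8-bit address format (3 page bits, 5 offset bits; the function name is illustrative):

```python
def split_virtual(addr: int):
    return addr >> 5, addr & 0b11111  # (page number, offset within page)

print(split_virtual(0b101_01101))  # -> (5, 13): page 5, offset 13
```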


Thank You!