What is it and why do we need it?
Until now, all our discussions have assumed that every program run by the
operating system fits into available memory. What if a program does not?
Overlays were the first solution to this problem. The program was split into
pieces called overlays, all of which were kept on disk and swapped in and out
of memory as needed. Overlay 0 would start running first; when it was done, it
would call another overlay. The problem with this approach was that the
programmer had to split the program into overlays manually. Virtual memory
was developed later to make the computer do all this work.
The motivation behind the idea is that if the combined size of the program, data and stack exceeds the amount of available physical memory, the operating system can still execute the program. Those parts of the program that are currently in use are stored in main memory; the rest on the disk. For example, a 16M program can run on a 4M machine by choosing which parts to keep in memory and which on the disk. The pieces of the program are swapped between memory and disk as required. Virtual memory can work in a multiprogramming system as well, with bits and pieces of many programs in memory.
In a system that implements virtual memory, all addresses generated by the program are virtual addresses. In a system without it, the addresses generated by the program can be put directly on the memory bus, because each virtual address is the same as the physical address. With virtual memory, however, the addresses go to an MMU (Memory Management Unit) that maps virtual addresses to physical addresses. The virtual address space is divided into pages, which correspond to page frames in physical memory; both have exactly the same size. Transfers between memory and disk are always in units of whole pages. Typical page sizes range from 512 bytes to 64K.
How the MMU Works
When a program tries to access address x, the virtual
address is sent to the MMU. The MMU determines the page number that x
falls into, say y, and looks up the page frame mapping for y (that is,
the page frame in physical memory that corresponds to page y).
x is then transformed into a physical address and output onto the
memory bus. Note that since the number of page frames may be smaller than the
number of pages a program occupies, not all pages will be mapped to a
page frame. A Present/absent bit for each page is used to keep track
of whether the page is mapped or not.
[Figure: vm.gif]
If the program tries to access an unmapped page, the MMU notices that the page is unmapped and causes a page fault - a trap to the operating system. The operating system then chooses a page frame that has not been used much and, if its contents have been modified, writes them back to disk. It then fetches the page that was just referenced into the freed page frame. The Present/absent bit of the replaced page is cleared and that of the new page is set.
Implementation of the MMU
Each virtual address is split into a virtual page number (the high-order bits)
and an offset (the low-order bits). The mapping between pages and page frames
is stored in a page table. The virtual page number is used as an index
into this page table; the corresponding entry yields the page frame for the
virtual page. The page frame number (if any, as determined by the Present/absent
bit) is attached to the high-order end of the offset, replacing the virtual
page number, to form a physical address that can be sent to memory.
For example, assuming pages of size 4K, for 16-bit addresses, the virtual address is split into a 4-bit page number and a 12-bit offset. With 4 bits for the page number, we can represent 16 pages, and with 12 bits for the offset, all 4096 bytes within a page can be addressed.
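This split can be sketched in a few lines of C++. The layout (4-bit page number, 12-bit offset, 4K pages) matches the example above; the function and variable names are my own, and for simplicity the sketch assumes the page is present, so there is no Present/absent check:

```cpp
#include <cstdint>

const int      OFFSET_BITS = 12;                       // 4K pages
const uint16_t OFFSET_MASK = (1u << OFFSET_BITS) - 1;  // 0x0FFF

// pageTable[vpn] holds the page frame number for virtual page vpn.
uint16_t Translate(uint16_t virtualAddr, const uint16_t *pageTable) {
    uint16_t vpn    = virtualAddr >> OFFSET_BITS;  // high-order 4 bits
    uint16_t offset = virtualAddr & OFFSET_MASK;   // low-order 12 bits
    return (pageTable[vpn] << OFFSET_BITS) | offset;
}
```

For instance, if virtual page 2 is mapped to page frame 6, virtual address 0x2ABC translates to physical address 0x6ABC: the page number is replaced, the offset is untouched.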
Note that with paging, the CPU may have to make extra page table references for every memory reference. Since execution speed is limited by the rate at which the CPU can get instructions and data out of memory, these references seriously reduce performance. A hardware solution to this problem is a device called a TLB (Translation Lookaside Buffer), also called associative memory. This device maps virtual addresses to physical addresses without going through the page table. It usually sits inside the MMU and functions as a cache of virtual page-to-page frame mappings. Each entry in the TLB contains the following information about a virtual page: whether the entry is valid, the virtual page number, the modify bit, the protection bits, and the page frame in which the page is located.
Translation of a virtual address into a physical address now takes the following form. The virtual page number is first presented to the TLB. On a hit, the page frame is taken directly from the matching entry, without consulting the page table. On a miss, the MMU does an ordinary page table lookup and loads the resulting entry into the TLB.
Thus, traps to the operating system occur only when a page is not in memory. In case of a page table lookup, the MMU evicts one of the entries from the TLB and replaces it with the page table entry just looked up. Thus, if that page is used again soon, the second time it will result in a hit rather than a miss. When an entry is purged from the TLB, the modify bit is copied back into the page table entry in memory. This is necessary because the modify bit determines whether the page needs to be written back to the disk in case it is replaced. When the TLB is loaded from the page table, all the fields are taken from memory.
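The hit/miss path, including the modify-bit writeback on eviction, can be modeled in a short sketch. This is not the real MMU or Nachos code: the entry layout, the round-robin eviction policy, and all names are my own, and the page is assumed to be present in memory:

```cpp
const int TLB_SIZE = 4;  // illustrative; real TLBs hold more entries

struct TlbEntry {
    bool valid;
    int  virtualPage;
    int  pageFrame;
    bool modified;
};

struct PageTableEntry {
    int  pageFrame;
    bool present;
    bool modified;
};

TlbEntry tlb[TLB_SIZE];  // zero-initialized: all entries invalid
int nextVictim = 0;      // simple round-robin eviction, for illustration

// Returns the page frame for vpn, consulting the TLB first and falling
// back to the page table on a miss (assumes the page is present).
int Lookup(int vpn, PageTableEntry *pageTable) {
    for (int i = 0; i < TLB_SIZE; i++)
        if (tlb[i].valid && tlb[i].virtualPage == vpn)
            return tlb[i].pageFrame;                   // TLB hit

    // TLB miss: evict an entry, copying its modify bit back first.
    TlbEntry &victim = tlb[nextVictim];
    if (victim.valid)
        pageTable[victim.virtualPage].modified = victim.modified;
    victim = {true, vpn, pageTable[vpn].pageFrame, pageTable[vpn].modified};
    nextVictim = (nextVictim + 1) % TLB_SIZE;
    return victim.pageFrame;
}
```

The first lookup of a page misses and loads the TLB from the page table; repeated lookups of the same page then hit without touching the page table.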
Software-managed TLBs
Some RISC machines, including MIPS and Alpha, do nearly all of the page
management described above in software. On these machines, the TLB entries
are explicitly loaded by the operating system. When a TLB miss occurs,
instead of the MMU just going to the page tables to find and fetch the needed
page reference, a TLB fault is generated. The operating system then must
find the page, remove an entry from the TLB, enter the new one, and restart
the instruction that faulted. Note that this must be done very efficiently
because TLB misses occur much more frequently than page faults.
If the TLB is reasonably large to reduce the miss rate, software management of the TLB turns out to be quite efficient.
An inverted page table contains one entry per page frame of real memory, rather than one entry per page of virtual address space. For example, with 64-bit virtual addresses, a 4K page size, and 32 MB of RAM, an inverted page table requires only 8K entries (one per 4K frame). Each entry is of the form (process, virtual page) and keeps track of the page frame corresponding to that virtual page.
A serious downside to inverted page tables is that virtual-to-physical translation becomes much harder. When process n references virtual page vp, the hardware can no longer find the physical page by using vp as an index into the page table. Instead it must search the entire inverted page table for the entry (n, vp). This problem can be overcome by maintaining the inverted page table as a hash table. Thus, to translate a virtual address, the TLB is searched first; on a TLB miss, the inverted page table is searched.
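A sketch of the hashed lookup, using C++'s std::unordered_map as the hash table; the function names and the packing of a 32-bit pid and vpn into one 64-bit key are my own choices:

```cpp
#include <cstdint>
#include <unordered_map>

// Pack (process id, virtual page) into a single 64-bit hash key.
uint64_t Key(uint32_t pid, uint32_t vpn) {
    return (uint64_t(pid) << 32) | vpn;
}

// One entry per page frame actually in use, keyed by (pid, vpn).
std::unordered_map<uint64_t, int> invertedTable;

// Returns the page frame for (pid, vpn), or -1 on a miss (page fault).
int IptLookup(uint32_t pid, uint32_t vpn) {
    auto it = invertedTable.find(Key(pid, vpn));
    return it == invertedTable.end() ? -1 : it->second;
}
```

The hash lookup takes expected constant time, instead of a linear scan of every frame.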
Inverted page tables are currently used on some IBM and Hewlett-Packard workstations.
The Optimal Page Replacement Algorithm
This algorithm requires each page in memory to be labeled with the number of
instructions that will be executed before that page is next referenced. The
page with the highest label is chosen for replacement. If one page
will not be used for 6 instructions and another will not be used for 8
instructions, replacing the latter pushes the resulting page fault further
into the future than replacing the former would.
The problem with implementing this algorithm is that the operating system has no way of knowing when each page will be referenced next. The closest it can come is to run the program once on a simulator, record every page reference, and then use that trace to drive the algorithm on a second run of the program.
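Given such a recorded reference string, the optimal algorithm is easy to simulate. A minimal sketch (the names are my own; on each fault with full frames it evicts the resident page whose next use lies farthest in the future):

```cpp
#include <algorithm>
#include <vector>

// Counts page faults for the optimal algorithm on a known reference string.
int OptFaults(const std::vector<int> &refs, int numFrames) {
    std::vector<int> frames;
    int faults = 0;
    for (size_t i = 0; i < refs.size(); i++) {
        int p = refs[i];
        if (std::find(frames.begin(), frames.end(), p) != frames.end())
            continue;                       // already resident
        faults++;
        if ((int)frames.size() < numFrames) { frames.push_back(p); continue; }
        // Evict the resident page whose next use is farthest in the future.
        size_t victim = 0, farthest = 0;
        for (size_t f = 0; f < frames.size(); f++) {
            size_t next = refs.size();      // "never used again"
            for (size_t j = i + 1; j < refs.size(); j++)
                if (refs[j] == frames[f]) { next = j; break; }
            if (next >= farthest) { farthest = next; victim = f; }
        }
        frames[victim] = p;
    }
    return faults;
}
```

The simulator has the whole future reference string in hand, which is exactly what a real operating system lacks.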
Not Recently Used Page Replacement Algorithm
Most systems implementing virtual memory have two bits associated with each
page - a reference bit and a modify bit. The first is
set when the page is read or written; the second is set only when the page is
written. When a process starts, both bits for all of its pages are set
to 0 by the operating system. Periodically, the reference bit is
cleared to distinguish pages that have not been referenced recently from those
that have been. The modify bit is not cleared, because that
information is needed to decide whether a page needs to be written back to
disk or not. Upon a page fault, the operating system divides all the pages
into the following four categories:

Class 0: not referenced, not modified
Class 1: not referenced, modified
Class 2: referenced, not modified
Class 3: referenced, modified

(Class 1 arises when a clock interrupt clears the reference bit of a page that has been modified.)
The algorithm chooses a page at random from the lowest numbered nonempty category. Note that this algorithm is different from one that would choose the least recently used page.
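The category number can be computed directly from the two bits, weighting the reference bit more heavily than the modify bit. A minimal sketch (the function name is my own):

```cpp
// NRU classes, numbered so that a lower class is a better victim:
//   class 0: not referenced, not modified
//   class 1: not referenced, modified
//   class 2: referenced, not modified
//   class 3: referenced, modified
int NruClass(bool referenced, bool modified) {
    return (referenced ? 2 : 0) + (modified ? 1 : 0);
}
```

The weighting encodes the policy: it is better to evict a modified page that has not been referenced recently than a clean page that is in heavy use.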
First-In, First-Out (FIFO) Page Replacement Algorithm
In this algorithm, the operating system maintains a list of all pages
currently in memory, with the page at the head of the list the oldest one and
the page at the tail the most recent arrival. On a page fault, the page at
the head is removed and the new page added to the tail of the list.
A problem with this algorithm is that it may throw out a heavily used page.
The Second Chance Page Replacement Algorithm
This algorithm is a modification of the FIFO algorithm. Before a page is
thrown out, its reference bit is inspected. If the bit is not
set, the page is both old and unused, and it is replaced immediately. If the
bit is set, the bit is cleared, the page is put onto the end of the list of
pages, and its load time is updated to the current time. The search continues
until a page whose reference bit is not set is found.
Note that this algorithm degenerates into pure FIFO if all the pages have been referenced. It is also inefficient because it constantly moves around pages on its list.
The Clock Page Replacement Algorithm
The clock algorithm keeps all the pages on a circular list in the form of a
clock. A hand points to the oldest page. When a page fault occurs, the page
being pointed to by the hand is inspected. If its reference bit is
not set, the page is evicted, the new page is inserted into the clock in its
place, and the hand is advanced one position. If the reference bit
is set, it is cleared and the hand is advanced to the next page. This process
is repeated until a page is found with its reference bit not set.
This algorithm differs from the second chance algorithm only in its implementation - it avoids the problem of moving around pages in the list.
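A sketch of the victim-selection step in C++ (the data structures and names are my own; a real implementation would also write back the evicted page and install the new one in its place):

```cpp
#include <vector>

struct Frame {
    int  page;
    bool referenced;
};

// Selects a victim frame with the clock algorithm: sweep the hand,
// clearing reference bits, until an unreferenced frame is found.
// Returns the index of the victim; 'hand' ends up just past it.
int ClockVictim(std::vector<Frame> &frames, int &hand) {
    while (true) {
        Frame &f = frames[hand];
        if (!f.referenced) {
            int victim = hand;
            hand = (hand + 1) % (int)frames.size();
            return victim;
        }
        f.referenced = false;  // give the page a second chance
        hand = (hand + 1) % (int)frames.size();
    }
}
```

Frames whose reference bit is set are skipped (and cleared) in place; nothing is ever moved around on the list.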
The Least Recently Used (LRU) Page Replacement Algorithm
This algorithm is an approximation to the optimal algorithm. The idea is that
the pages that have been heavily used in the last few instructions are likely
to be used again in the next few. Conversely, pages that have not been used
for ages will probably remain unused for a long time. Upon a page fault, LRU
throws out a page that has been unused for the longest time.
This algorithm is not cheap to implement, because it needs to maintain a linked list of all pages in memory, with the most recently used page at the front and the least recently used page at the rear. The problem is that the list must be updated on every memory reference; the update itself is an expensive operation, because it involves finding the page in the list, deleting it, and moving it to the front.
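A sketch of this list-based scheme using std::list (the names are my own; note that Touch() must run on every memory reference, which is exactly the expensive part):

```cpp
#include <algorithm>
#include <list>

std::list<int> lruList;  // front = most recently used, back = least

// Called on every reference to page p: move (or add) p to the front.
void Touch(int p) {
    auto it = std::find(lruList.begin(), lruList.end(), p);
    if (it != lruList.end())
        lruList.erase(it);   // the costly per-reference search and delete
    lruList.push_front(p);
}

// On a page fault, the victim is the page at the rear of the list.
int LruVictim() {
    int victim = lruList.back();
    lruList.pop_back();
    return victim;
}
```

For example, after touching pages 1, 2, 3 and then 1 again, the least recently used page (and hence the victim) is 2.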
Thus, translation of a virtual address generated by the currently running
process takes place in the following way. First, the TLB is checked. If the
mapping to the corresponding physical address is found, we are done. On a TLB
miss, we check the inverted page table. If a mapping of the form (process,
virtual page) is found there, the corresponding page frame is used and the
mapping is loaded into the TLB; if not, a page fault has occurred and the
page must be brought in from disk.
How to Implement TLBs and the Inverted Page Table
For the multiprogramming lab, each process had a
page table. On a context switch, the pageTable variable of the
machine object was set to the page table of the process that was
going to execute next. Nachos is set up to use either TLBs or page tables
(see the constructor function of the machine class in machine.cc). The flag
USE_TLB, defined in the Makefile in the vm directory, determines which will
be used. When you compile Nachos for virtual memory, the flag is defined,
so in this lab the machine object uses a TLB.
The MIPS simulator in Nachos raises a PageFaultException in the event of a TLB miss. Thus, you need to write a function that will be called from the ExceptionHandler() (see exception.cc) to handle the TLB miss. You will also need functions that will write a page from memory to disk and from disk to memory. For this, the functions ReadAt() and WriteAt() defined as part of the OpenFile class in openfile.h(cc) will be helpful.
The inverted page table will have the same size as the number of pages in mainMemory (NumPhysPages) and will contain information about each physical page: whether it is valid, whether it is dirty, the corresponding virtual page, the process that owns the page, and any information you need to maintain for your page replacement algorithm. On a TLB miss, we do not want to search this table sequentially, so it is important that you implement it as a hash table. It is helpful for the hash function to be some computation based on the virtual address and the id of the process that owns the page.
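One possible hash function is sketched below. The multipliers are arbitrary mixing constants of my own choosing, and the table size of 32 is the NumPhysPages value in stock Nachos; any function that spreads (virtual page, pid) pairs evenly over the table will do:

```cpp
const int NumPhysPages = 32;  // as in stock Nachos machine.h

// A simple hash for the inverted page table: mix the virtual page
// number with the owning process id, then reduce modulo the table size.
int HashIpt(int vpn, int pid) {
    unsigned h = (unsigned)vpn * 2654435761u ^ (unsigned)pid * 40503u;
    return (int)(h % NumPhysPages);
}
```

Collisions are still possible, of course, so each bucket must be probed or chained as in any hash table.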
What Happens to the AddrSpace Class
Two things are important here. First, instead of having a page
table, each process now needs a backing store: a swapfile that will
be used in the event of a miss in the inverted page table. It is helpful
to name this file swapfile.pid, where pid is the id of the process that
owns the file. This file must be initialized from the executable that
contains the program to run, in the same way that mainMemory was
initialized for the multiprogramming lab.
Another thing to note is what happens on a context switch. The entries in the TLB now become invalid because they belonged to the process that was executing before the context switch. Thus, as part of the routine RestoreState() in the AddrSpace class (this routine is called by the scheduler on a context switch. See the Run() routine in scheduler.cc.), all TLB entries need to be invalidated. Also, some entries in the inverted page table may need to be updated.
Reference: Andrew S. Tanenbaum and Albert S. Woodhull, Operating Systems:
Design and Implementation, Second Edition, Chapter 4.
Source code in code/vm
Last modified on Thursday, 29-May-97 17:03:31 EDT