What is it and why do we need it?
Until now, all our discussions have assumed that every program run by the
operating system fits into available memory. What if a program does not?
Overlays were the first solution to this problem. The program was split into
pieces called overlays, all of which were kept on disk and swapped in and out
of memory as needed. Overlay 0 would start running first; when it was done, it
would call another overlay. The problem with this approach was that the
programmer had to split the program into overlays manually. Virtual memory
was developed later to make the computer do all this work.
The motivation behind the idea is that if the combined size of the program, data and stack exceeds the amount of available physical memory, the operating system can still execute the program. Those parts of the program that are currently in use are stored in main memory; the rest on the disk. For example, a 16M program can run on a 4M machine by choosing which parts to keep in memory and which on the disk. The pieces of the program are swapped between memory and disk as required. Virtual memory can work in a multiprogramming system as well, with bits and pieces of many programs in memory.
In a system that implements virtual memory, all addresses generated by the program are virtual addresses. In a system without it, the addresses generated by the program can be put directly on the memory bus, because each virtual address is the same as the physical address. With virtual memory, however, the addresses go to an MMU (Memory Management Unit) that maps virtual addresses to physical addresses. The virtual address space is divided into pages, which correspond to page frames in physical memory; both have exactly the same size. Transfers between memory and disk are always in units of whole pages. Typical page sizes range from 512 bytes to 64K.
How the MMU Works
When a program tries to access address x, the virtual
address is sent to the MMU. The MMU determines the page number that x
falls into, say y, and looks up the page frame mapping for y (that is,
the page frame in physical memory that corresponds to page y).
x is then transformed into a physical address and output onto the
memory bus. Note that since the number of page frames may be smaller than the
number of pages a program occupies, not all pages will be mapped to a
page frame. A Present/absent bit for each page is used to keep track
of whether the page is mapped or not.
[Figure: vm.gif]
If the program tries to access an unmapped page, the MMU notices that the page is unmapped and causes a page fault - a trap to the operating system. The operating system then chooses a page frame that has not been used much and, if its contents have been modified, writes them back to disk. It then fetches the page that was just referenced into the freed page frame. The Present/absent bit of the replaced page is cleared and that of the new page is set.
Implementation of the MMU
Each virtual address is split into a virtual page number (the high-order bits)
and an offset (the low-order bits). The mapping between pages and page frames
is stored in a page table. The virtual page number is used as an index
into this page table; the corresponding entry yields the page frame for the
virtual page. The page frame number (if any, as determined by the Present/absent
bit) is attached to the high-order end of the offset, replacing the virtual
page number, to form a physical address that can be sent to memory.
For example, assuming pages of size 4K, for 16-bit addresses, the virtual address is split into a 4-bit page number and a 12-bit offset. With 4 bits for the page number, we can represent 16 pages, and with 12 bits for the offset, all 4096 bytes within a page can be addressed.
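This split can be sketched in a few lines of C++. The layout (4-bit page number, 12-bit offset, 4K pages) matches the example above; the function and variable names are my own, and for simplicity the sketch assumes the page is present, so there is no Present/absent check:

```cpp
#include <cstdint>

const int      OFFSET_BITS = 12;                       // 4K pages
const uint16_t OFFSET_MASK = (1u << OFFSET_BITS) - 1;  // 0x0FFF

// pageTable[vpn] holds the page frame number for virtual page vpn.
uint16_t Translate(uint16_t virtualAddr, const uint16_t *pageTable) {
    uint16_t vpn    = virtualAddr >> OFFSET_BITS;  // high-order 4 bits
    uint16_t offset = virtualAddr & OFFSET_MASK;   // low-order 12 bits
    return (pageTable[vpn] << OFFSET_BITS) | offset;
}
```

For instance, if virtual page 2 is mapped to page frame 6, virtual address 0x2ABC translates to physical address 0x6ABC: the page number is replaced, the offset is untouched.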
Note that with paging, the CPU may have to make extra page table references for every memory reference. Since execution speed is limited by the rate at which the CPU can get instructions and data out of memory, these references seriously reduce performance. A hardware solution to this problem is a device called a TLB (Translation Lookaside Buffer), also called associative memory. This device maps virtual addresses to physical addresses without going through the page table. It usually sits inside the MMU and functions as a cache of virtual page-to-page frame mappings. Each entry in the TLB contains the following information about a virtual page: whether the entry is valid, the virtual page number, the modify bit, the protection bits, and the page frame in which the page is located.
Translation of a virtual address into a physical address now takes the following form. The virtual page number is first presented to the TLB. On a hit, the page frame is taken directly from the matching entry, without consulting the page table. On a miss, the MMU does an ordinary page table lookup and loads the resulting entry into the TLB.
Thus, traps to the operating system occur only when a page is not in memory. In case of a page table lookup, the MMU evicts one of the entries from the TLB and replaces it with the page table entry just looked up. Thus, if that page is used again soon, the second time it will result in a hit rather than a miss. When an entry is purged from the TLB, the modify bit is copied back into the page table entry in memory. This is necessary because the modify bit determines whether the page needs to be written back to the disk in case it is replaced. When the TLB is loaded from the page table, all the fields are taken from memory.
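The hit/miss path, including the modify-bit writeback on eviction, can be modeled in a short sketch. This is not the real MMU or Nachos code: the entry layout, the round-robin eviction policy, and all names are my own, and the page is assumed to be present in memory:

```cpp
const int TLB_SIZE = 4;  // illustrative; real TLBs hold more entries

struct TlbEntry {
    bool valid;
    int  virtualPage;
    int  pageFrame;
    bool modified;
};

struct PageTableEntry {
    int  pageFrame;
    bool present;
    bool modified;
};

TlbEntry tlb[TLB_SIZE];  // zero-initialized: all entries invalid
int nextVictim = 0;      // simple round-robin eviction, for illustration

// Returns the page frame for vpn, consulting the TLB first and falling
// back to the page table on a miss (assumes the page is present).
int Lookup(int vpn, PageTableEntry *pageTable) {
    for (int i = 0; i < TLB_SIZE; i++)
        if (tlb[i].valid && tlb[i].virtualPage == vpn)
            return tlb[i].pageFrame;                   // TLB hit

    // TLB miss: evict an entry, copying its modify bit back first.
    TlbEntry &victim = tlb[nextVictim];
    if (victim.valid)
        pageTable[victim.virtualPage].modified = victim.modified;
    victim = {true, vpn, pageTable[vpn].pageFrame, pageTable[vpn].modified};
    nextVictim = (nextVictim + 1) % TLB_SIZE;
    return victim.pageFrame;
}
```

The first lookup of a page misses and loads the TLB from the page table; repeated lookups of the same page then hit without touching the page table.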
Software-managed TLBs
Some RISC machines, including MIPS and Alpha, do nearly all of the page
management described above in software. On these machines, the TLB entries
are explicitly loaded by the operating system. When a TLB miss occurs,
instead of the MMU just going to the page tables to find and fetch the needed
page reference, a TLB fault is generated. The operating system then must
find the page, remove an entry from the TLB, enter the new one, and restart
the instruction that faulted. Note that this must be done very efficiently
because TLB misses occur much more frequently than page faults.
If the TLB is reasonably large to reduce the miss rate, software management of the TLB turns out to be quite efficient.
An inverted page table contains one entry per page frame of real memory, rather than one entry per page of virtual address space. For example, with 64-bit virtual addresses, a 4K page size, and 32 MB of RAM, an inverted page table requires only 8K entries (one per 4K frame). Each entry is of the form (process, virtual page) and keeps track of the page frame corresponding to that virtual page.
A serious downside to inverted page tables is that virtual-to-physical translation becomes much harder. When process n references virtual page vp, the hardware can no longer find the physical page by using vp as an index into the page table. Instead it must search the entire inverted page table for the entry (n, vp). This problem can be overcome by maintaining the inverted page table as a hash table. Thus, to translate a virtual address, the TLB is searched first; on a TLB miss, the inverted page table is searched.
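A sketch of the hashed lookup, using C++'s std::unordered_map as the hash table; the function names and the packing of a 32-bit pid and vpn into one 64-bit key are my own choices:

```cpp
#include <cstdint>
#include <unordered_map>

// Pack (process id, virtual page) into a single 64-bit hash key.
uint64_t Key(uint32_t pid, uint32_t vpn) {
    return (uint64_t(pid) << 32) | vpn;
}

// One entry per page frame actually in use, keyed by (pid, vpn).
std::unordered_map<uint64_t, int> invertedTable;

// Returns the page frame for (pid, vpn), or -1 on a miss (page fault).
int IptLookup(uint32_t pid, uint32_t vpn) {
    auto it = invertedTable.find(Key(pid, vpn));
    return it == invertedTable.end() ? -1 : it->second;
}
```

The hash lookup takes expected constant time, instead of a linear scan of every frame.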
Inverted page tables are currently used on some IBM and Hewlett-Packard workstations.
The Optimal Page Replacement Algorithm
This algorithm requires each page in memory to be labeled with the number of
instructions that will be executed before that page is next referenced. The
page with the highest label is chosen for replacement. If one page
will not be used for 6 instructions and another will not be used for 8
instructions, replacing the latter pushes the resulting page fault further
into the future than replacing the former would.
The problem with implementing this algorithm is that the operating system has no way of knowing when each page will be referenced next. The closest it can come is to run the program once on a simulator, record every page reference, and then use that trace to drive the algorithm on a second run of the program.
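Given such a recorded reference string, the optimal algorithm is easy to simulate. A minimal sketch (the names are my own; on each fault with full frames it evicts the resident page whose next use lies farthest in the future):

```cpp
#include <algorithm>
#include <vector>

// Counts page faults for the optimal algorithm on a known reference string.
int OptFaults(const std::vector<int> &refs, int numFrames) {
    std::vector<int> frames;
    int faults = 0;
    for (size_t i = 0; i < refs.size(); i++) {
        int p = refs[i];
        if (std::find(frames.begin(), frames.end(), p) != frames.end())
            continue;                       // already resident
        faults++;
        if ((int)frames.size() < numFrames) { frames.push_back(p); continue; }
        // Evict the resident page whose next use is farthest in the future.
        size_t victim = 0, farthest = 0;
        for (size_t f = 0; f < frames.size(); f++) {
            size_t next = refs.size();      // "never used again"
            for (size_t j = i + 1; j < refs.size(); j++)
                if (refs[j] == frames[f]) { next = j; break; }
            if (next >= farthest) { farthest = next; victim = f; }
        }
        frames[victim] = p;
    }
    return faults;
}
```

The simulator has the whole future reference string in hand, which is exactly what a real operating system lacks.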
Not Recently Used Page Replacement Algorithm
Most systems implementing virtual memory have two bits associated with each
page - a reference bit and a modify bit. The first is
set when the page is read or written; the second is set only when the page is
written. When a process starts, both bits for all of its pages are set
to 0 by the operating system. Periodically, the reference bit is
cleared to distinguish pages that have not been referenced recently from those
that have been. The modify bit is not cleared, because that
information is needed to decide whether a page needs to be written back to
disk or not. Upon a page fault, the operating system divides all the pages
into the following four categories:

Class 0: not referenced, not modified
Class 1: not referenced, modified
Class 2: referenced, not modified
Class 3: referenced, modified

(Class 1 arises when a clock interrupt clears the reference bit of a page that has been modified.)
The algorithm chooses a page at random from the lowest numbered nonempty category. Note that this algorithm is different from one that would choose the least recently used page.
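The category number can be computed directly from the two bits, weighting the reference bit more heavily than the modify bit. A minimal sketch (the function name is my own):

```cpp
// NRU classes, numbered so that a lower class is a better victim:
//   class 0: not referenced, not modified
//   class 1: not referenced, modified
//   class 2: referenced, not modified
//   class 3: referenced, modified
int NruClass(bool referenced, bool modified) {
    return (referenced ? 2 : 0) + (modified ? 1 : 0);
}
```

The weighting encodes the policy: it is better to evict a modified page that has not been referenced recently than a clean page that is in heavy use.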
First-In, First-Out (FIFO) Page Replacement Algorithm
In this algorithm, the operating system maintains a list of all pages
currently in memory, with the page at the head of the list the oldest one and
the page at the tail the most recent arrival. On a page fault, the page at
the head is removed and the new page added to the tail of the list.
A problem with this algorithm is that it may throw out a heavily used page.
The Second Chance Page Replacement Algorithm
This algorithm is a modification of the FIFO algorithm. Before a page is
thrown out, its reference bit is inspected. If the bit is not
set, the page is both old and unused, and it is replaced immediately. If the
bit is set, the bit is cleared, the page is put onto the end of the list of
pages, and its load time is updated to the current time. The search continues
until a page whose reference bit is not set is found.
Note that this algorithm degenerates into pure FIFO if all the pages have been referenced. It is also inefficient because it constantly moves around pages on its list.
The Clock Page Replacement Algorithm
The clock algorithm keeps all the pages on a circular list in the form of a
clock. A hand points to the oldest page. When a page fault occurs, the page
being pointed to by the hand is inspected. If its reference bit is
not set, the page is evicted, the new page is inserted into the clock in its
place, and the hand is advanced one position. If the reference bit
is set, it is cleared and the hand is advanced to the next page. This process
is repeated until a page is found with its reference bit not set.
This algorithm differs from the second chance algorithm only in its implementation - it avoids the problem of moving around pages in the list.
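A sketch of the victim-selection step in C++ (the data structures and names are my own; a real implementation would also write back the evicted page and install the new one in its place):

```cpp
#include <vector>

struct Frame {
    int  page;
    bool referenced;
};

// Selects a victim frame with the clock algorithm: sweep the hand,
// clearing reference bits, until an unreferenced frame is found.
// Returns the index of the victim; 'hand' ends up just past it.
int ClockVictim(std::vector<Frame> &frames, int &hand) {
    while (true) {
        Frame &f = frames[hand];
        if (!f.referenced) {
            int victim = hand;
            hand = (hand + 1) % (int)frames.size();
            return victim;
        }
        f.referenced = false;  // give the page a second chance
        hand = (hand + 1) % (int)frames.size();
    }
}
```

Frames whose reference bit is set are skipped (and cleared) in place; nothing is ever moved around on the list.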
The Least Recently Used (LRU) Page Replacement Algorithm
This algorithm is an approximation to the optimal algorithm. The idea is that
the pages that have been heavily used in the last few instructions are likely
to be used again in the next few. Conversely, pages that have not been used
for ages will probably remain unused for a long time. Upon a page fault, LRU
throws out a page that has been unused for the longest time.
This algorithm is not cheap to implement, because it needs to maintain a linked list of all pages in memory, with the most recently used page at the front and the least recently used page at the rear. The problem is that the list must be updated on every memory reference; the update itself is an expensive operation, because it involves finding the page in the list, deleting it, and moving it to the front.
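A sketch of this list-based scheme using std::list (the names are my own; note that Touch() must run on every memory reference, which is exactly the expensive part):

```cpp
#include <algorithm>
#include <list>

std::list<int> lruList;  // front = most recently used, back = least

// Called on every reference to page p: move (or add) p to the front.
void Touch(int p) {
    auto it = std::find(lruList.begin(), lruList.end(), p);
    if (it != lruList.end())
        lruList.erase(it);   // the costly per-reference search and delete
    lruList.push_front(p);
}

// On a page fault, the victim is the page at the rear of the list.
int LruVictim() {
    int victim = lruList.back();
    lruList.pop_back();
    return victim;
}
```

For example, after touching pages 1, 2, 3 and then 1 again, the least recently used page (and hence the victim) is 2.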
Thus, translation of a virtual address generated by the currently running
process takes place in the following way. First, the TLB is checked. If the
mapping to the corresponding physical address is found, we are done. On a TLB
miss, we check the inverted page table. If a mapping of the form (process,
virtual page) is found there, the corresponding page frame is used and the
mapping is loaded into the TLB; if not, a page fault has occurred and the
page must be brought in from disk.
How to Implement TLBs and the Inverted Page Table
For the multiprogramming lab, each process had a
page table. On a context switch, the pageTable variable of the
machine object was set to the page table of the process that was
going to execute next. Nachos is set up to use either TLBs or page tables
(see the constructor function of the machine class in machine.cc). The flag
USE_TLB, defined in the Makefile in the vm directory, determines which will
be used. When you compile Nachos for virtual memory, the flag is defined,
so in this lab the machine object uses a TLB.
The MIPS simulator in Nachos raises a PageFaultException in the event of a TLB miss. Thus, you need to write a function that will be called from the ExceptionHandler() (see exception.cc) to handle the TLB miss. You will also need functions that will write a page from memory to disk and from disk to memory. For this, the functions ReadAt() and WriteAt() defined as part of the OpenFile class in openfile.h(cc) will be helpful.
The inverted page table will have the same size as the number of pages in mainMemory (NumPhysPages) and will contain information about each physical page: whether it is valid, whether it is dirty, the corresponding virtual page, the process that owns the page, and any information you need to maintain for your page replacement algorithm. On a TLB miss, we do not want to search this table sequentially, so it is important that you implement it as a hash table. It is helpful for the hash function to be some computation based on the virtual address and the id of the process that owns the page.
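One possible hash function is sketched below. The multipliers are arbitrary mixing constants of my own choosing, and the table size of 32 is the NumPhysPages value in stock Nachos; any function that spreads (virtual page, pid) pairs evenly over the table will do:

```cpp
const int NumPhysPages = 32;  // as in stock Nachos machine.h

// A simple hash for the inverted page table: mix the virtual page
// number with the owning process id, then reduce modulo the table size.
int HashIpt(int vpn, int pid) {
    unsigned h = (unsigned)vpn * 2654435761u ^ (unsigned)pid * 40503u;
    return (int)(h % NumPhysPages);
}
```

Collisions are still possible, of course, so each bucket must be probed or chained as in any hash table.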
What Happens to the AddrSpace Class
Two things are important here. First, instead of having a page
table, each process now needs a backing store: a swapfile that will
be used in the event of a miss in the inverted page table. It is helpful
to name this file swapfile.pid, where pid is the id of the process that
owns the file. This file must be initialized from the executable that
contains the program to run, in the same way that mainMemory was
initialized for the multiprogramming lab.
Another thing to note is what happens on a context switch. The entries in the TLB now become invalid because they belonged to the process that was executing before the context switch. Thus, as part of the routine RestoreState() in the AddrSpace class (this routine is called by the scheduler on a context switch. See the Run() routine in scheduler.cc.), all TLB entries need to be invalidated. Also, some entries in the inverted page table may need to be updated.
Reference: Andrew S. Tanenbaum and Albert S. Woodhull, Operating Systems:
Design and Implementation, Second Edition, Chapter 4.
Source code in code/vm
Last modified on Thursday, 29-May-97 17:03:31 EDT