Project 2: Basic Multiprogramming and System Calls

Due May 14, 1999

Contents

Overview
Problems
Issues to Consider
Suggestions
What to turn in

NOTE: This project is more difficult than the previous one. We advise you to start early. For questions about this project, send mail to besmith@cs.

Overview

The original implementation of Nachos has a very limited capability for running user-level programs: at most one user process can run at a time, and halt is the only system-call implemented. For this project, we ask you to correct some of these deficiencies and to turn Nachos into a multiprogramming operating system with a working set of basic system calls.

For the first part of the project, you are to modify Nachos so that it can run multiple cooperative user processes simultaneously. For this, you are to implement the Fork(), Yield(), Exit(), Exec(), and Join() system calls (detailed specs later in this note). You are also to implement non-preemptive scheduling. For the second part of the project, you are to implement the Creat, Open, Read, Write, and Close system calls.

You will be working on the version of Nachos in the userprog directory. To test your additions to Nachos, you will need to write some simple user programs, compile them using the MIPS cross compiler, and run them under Nachos. User programs are to be written in ANSI C. For examples of user programs, go to the test subdirectory of Nachos. Several sample applications are provided in this directory. The only one that will run on an unmodified Nachos is the halt program. To run this program: (1) go to the test directory; (2) type gmake halt to cross-compile halt.c and to create a Nachos executable in halt; (3) go to the userprog directory and type gmake nachos; (4) type nachos -x halt. This will start Nachos and ask it to load and run the halt program. You should see a message indicating that Nachos is halting at the request of the user program.

In brief, what happens when you type nachos -x halt is as follows:

Nachos starts. The initial thread starts by running function StartProcess() in file progtest.cc.

A new address space is allocated for the user process, and the contents of the executable file halt are loaded into that address space. This is accomplished by the constructor function AddrSpace::AddrSpace() in the file addrspace.cc.

MIPS registers and memory management hardware are initialized for the new user process by the functions AddrSpace::InitRegisters() and AddrSpace::RestoreState() in addrspace.cc.

Control is transferred to user mode and the halt program begins running. This is accomplished by the function Machine::Run(), which starts the MIPS emulator.

The system call Halt() is executed from user mode (now running the program halt). This causes a trap back to the Nachos kernel via function ExceptionHandler() in file exception.cc.

The exception handler determines that a Halt() system call was requested from user mode, and it halts Nachos by calling the function Interrupt::Halt().

Trace through the Nachos code until you think you understand how program halt is executed.

In this project, you will also need to know the object file format for Nachos. NOFF (Nachos Object File Format) looks like the following:


-----------
| DATA    |
-----------
| ....    |         -----
-----------             |
| ....    |             |
-----------             |
| bss     | segment     |
-----------             |---- CODE Section
| data    | segment     |
-----------             |
| code    | segment     |
-----------             |
| magic # | 0xbadfad    | 
-----------         -----


NOFF files have only CODE and DATA sections. Inside the CODE section are segment descriptors that point to the real locations of the code, data, and bss segments. Each descriptor has three fields:

--------------
|virtual addr|  points to the location in virtual memory
--------------
|in file addr|  points to a location inside the DATA part of NOFF file
--------------
|size        |  size of a segment in bytes
--------------
Information about NOFF can be found in bin/noff.h.
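As a rough illustration of the layout above, the header can be pictured as a magic number followed by three segment descriptors. The sketch below is modeled on that description; the type and function names (SegmentSketch, NoffHeaderSketch, LooksLikeNoff, SegmentBytes) are made up for this example, so check the real definitions in bin/noff.h. It also assumes the file and the host use the same byte order, which a real loader cannot take for granted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Illustrative names only; compare with the definitions in bin/noff.h.
const int32_t kNoffMagic = 0xbadfad;

struct SegmentSketch {
    int32_t virtualAddr;  // location of the segment in virtual memory
    int32_t inFileAddr;   // location of the segment inside the NOFF file
    int32_t size;         // size of the segment in bytes
};

struct NoffHeaderSketch {
    int32_t noffMagic;         // must equal kNoffMagic
    SegmentSketch code;        // code segment
    SegmentSketch initData;    // data segment
    SegmentSketch uninitData;  // bss segment
};

// Copy a header out of the first bytes of a file image and check the magic
// number. Returns false if the image is too short or is not a NOFF file.
bool LooksLikeNoff(const unsigned char *image, std::size_t length,
                   NoffHeaderSketch *out) {
    if (length < sizeof(NoffHeaderSketch)) return false;
    std::memcpy(out, image, sizeof(NoffHeaderSketch));
    return out->noffMagic == kNoffMagic;
}

// Total bytes covered by the three segments (the user stack is not included).
int32_t SegmentBytes(const NoffHeaderSketch &h) {
    return h.code.size + h.initData.size + h.uninitData.size;
}
```

A loader would typically call something like LooksLikeNoff() before trusting any of the segment fields, and would refuse to run the file if the check fails.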

When you create user programs and compile them using the MIPS cross compiler, you get a COFF (Common Object File Format) file. This is a normal MIPS object (executable) file that has DATA, TEXT and CODE sections. For this file to be runnable under Nachos, it has to be converted to NOFF. This is done with the coff2noff translator.

Problems

For the first part of this project, you are to implement the Fork(), Yield(), Exit(), Exec(), and Join() system calls that act as follows:
The Fork(func) system call creates a new user-level (child) process, whose address space starts out as an exact copy of that of the caller (the parent), but immediately the child abandons the program of the parent and starts executing the function supplied by the single argument.

The Yield() call is used by a process executing in user mode to temporarily relinquish the CPU to another process.

The Exit(int) call takes a single argument, which is an integer status value as in Unix. The currently executing process is terminated. For now, you can just ignore the status value. Later you will figure out how to get this value to an interested process.

The Exec(filename) system call spawns a new user-level thread (process), but creates a new address space and begins executing a new program given by the object code in the Nachos file whose name is supplied as an argument to the call. It should return to the parent a SpaceId which can be used to uniquely identify the newly created process.

The Join() call waits and returns only after a process with the specified ID (supplied as an argument to that call) has finished.

Test your code by creating several user programs that exercise the various system calls. Be sure to test each of the system calls, and to try forking up to three processes (since each has a 1024 byte stack, that's all that will fit in Nachos' 4K byte physical memory) and have them yield back and forth for a while to make sure everything is working. Since the functionality for providing I/O operations to user programs will be implemented as a part of this project, you may initially have to rely on using debugging printout in the kernel to track what is happening. Use the DEBUG macro for this, and make sure that debugging printout is disabled by default when you submit your code for grading.

In the second part of this project, you are to implement the file-system calls: Creat, Open, Read, Write, and Close. The semantics of these calls are specified in syscall.h. You should extend your file-system code to handle the console as well as normal files.

To support the system calls that access the console device, you will probably find it useful to implement a SynchConsole class that provides the abstraction of synchronous access to the console. The file progtest.cc has the beginning of a SynchConsole implementation.

Issues to Consider

Here is an outline of some of the major issues you will have to deal with to make Nachos into a multiprogrammed system:
To implement the system calls, you will need to modify the ExceptionHandler() function in exception.cc to determine which system call or exception occurred, and to transfer control to an appropriate function. You might want to consider introducing ``stubs'' (functions with empty bodies or with debugging printout so you can tell when they are called) for all the system calls right away, and then postpone their actual implementation till you have the control-flow working correctly. This strategy will help you understand the flow of control for a system-call -- from user mode to kernel mode and back.
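One way to picture the stub approach is a dispatcher that switches on the call number and hands off to a placeholder for each call. The following is only a sketch: the call numbers, the Stub() helper, and DispatchSketch() are hypothetical and stand in for the real constants in syscall.h and the real ExceptionHandler(), which would take the call number and arguments from the MIPS registers rather than from function parameters.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical call numbers for illustration only; real code should use the
// constants defined in syscall.h.
enum SyscallSketch {
    kHaltCall, kExitCall, kExecCall, kJoinCall, kForkCall, kYieldCall
};

// Records which stub ran last, so the control flow can be checked before
// any call is really implemented.
static int lastStubCalled = -1;

// A stub: prints a debugging line and returns a placeholder result.
static int Stub(int which, const char *name, int arg) {
    lastStubCalled = which;
    std::printf("stub: %s(%d)\n", name, arg);
    return 0;
}

// Decide which system call was requested and hand off to its stub.
// Returns the value to pass back to the user program; -1 reports an
// unknown call number instead of crashing the kernel.
int DispatchSketch(int which, int arg) {
    switch (which) {
      case kHaltCall:  return Stub(which, "Halt", arg);
      case kExitCall:  return Stub(which, "Exit", arg);
      case kExecCall:  return Stub(which, "Exec", arg);
      case kJoinCall:  return Stub(which, "Join", arg);
      case kForkCall:  return Stub(which, "Fork", arg);
      case kYieldCall: return Stub(which, "Yield", arg);
      default:
        std::printf("unknown system call %d\n", which);
        return -1;
    }
}
```

Once every call reaches its stub and returns to user mode cleanly, the stubs can be replaced one at a time with real implementations.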

The original version of Nachos is simple-minded about memory management. In particular, the constructor function AddrSpace::AddrSpace() simply determines the amount of memory that will be required by the application to be run and then allocates that much space contiguously starting at address zero in physical memory. The page tables (which control the address translation hardware) are set up so that the logical addresses (what the user program sees) are identical to the physical addresses (where the data is actually stored).

The above scheme is inadequate for running more than one application at a time. You will need to design and implement a scheme for allocating and freeing physical memory, and you will need to arrange to set up the page tables so that the logical address space seen by a user application is a contiguous region starting from address zero, even though the data is stored at different physical addresses. You will want to implement a memory management scheme that is flexible enough to extend to virtual memory later in the semester. We suggest implementing a C++ class with methods for allocating and freeing physical memory one page at a time. By setting up the page tables properly, you can give the user application a contiguous logical address space even though each page of actual data might be stored anywhere in physical memory.
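A page-at-a-time physical memory manager can be quite small. The class below is a minimal sketch of the idea, not a required design: FrameAllocatorSketch and its method names are invented for this example, and it leaves out the locking that a real kernel version would need when several threads allocate at once.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical page-at-a-time physical memory manager.
class FrameAllocatorSketch {
  public:
    explicit FrameAllocatorSketch(int numFrames)
        : inUse(numFrames, false), freeCount(numFrames) {}

    // Returns the number of a free physical frame and marks it used,
    // or returns -1 if physical memory is exhausted.
    int Allocate() {
        for (std::size_t i = 0; i < inUse.size(); i++) {
            if (!inUse[i]) {
                inUse[i] = true;
                freeCount--;
                return static_cast<int>(i);
            }
        }
        return -1;
    }

    // Releases a frame. Returns false (and changes nothing) if the frame
    // number is out of range or the frame was not allocated.
    bool Free(int frame) {
        if (frame < 0 || static_cast<std::size_t>(frame) >= inUse.size())
            return false;
        if (!inUse[frame]) return false;
        inUse[frame] = false;
        freeCount++;
        return true;
    }

    int NumFree() const { return freeCount; }

  private:
    std::vector<bool> inUse;  // inUse[i] is true while frame i is allocated
    int freeCount;            // number of frames currently free
};
```

Checking NumFree() before building a page table lets the caller fail cleanly when a program does not fit, rather than discovering the shortage halfway through allocation.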

The Fork() system call is probably the most difficult part of this project. It is different from the Exec system call in that Fork will start a new process that runs a user function specified by the argument of the call, while Exec will start a process that runs a different executable file. The signatures for Fork() and Exec() also differ. Fork(func) takes an argument func which is a pointer to a function. The function must be compiled as part of the user program that is currently running. By making this system call Fork(func), the user program expects the following: a new thread will be generated for use by the user program; and this thread will run func in an address space that is an exact copy of the current one. This implementation of Fork makes it possible to have multiple entry points in an executable file.

To implement the Fork(func) system-call, you will need to know how to find the entry point of the function that is passed as a parameter. The parameter convention is determined by the cross-compiler which produces executable code from the user source program. Look at the file exception.cc to see that this entry point, which is an address in the executable code's address space, is already loaded into register 4 when the trap to the exception handler occurs. All you need to do is to insert code into the exception handler (or call a new function of your own) which does the following: set up an address space which is a copy of the address space of the current thread, and load the address that is in register 4 into the program counter. After these steps, use Thread::Fork() to create a new thread, initialize the MIPS registers for the new process, and have both the new and old processes return to user mode. The parent should return to user mode by returning from the exception handler; the child process should continue to run from the address that is now in the program counter, which is the entry point of the function.

To implement Fork, you will need to introduce modifications to the AddrSpace class in addrspace.cc so that you can make a ``clone'' of a running user application program. We suggest adding a function AddrSpace::Fork(). In brief, calling this function will create a new address space that is an exact copy of the original. You will have to allocate additional physical memory for this copy, set up the page tables properly for the new address space, and copy the data from the old address space to the new. Once the physical memory has been allocated and the page tables set up, you will use Thread::Fork() to create a new kernel thread, initialize the MIPS registers for the new process, and then have both the old and the new processes return to user mode. The child process should continue by finishing the Fork() system call. The parent should return to user mode merely by returning from the ExceptionHandler() function.
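The copying step of such a clone can be sketched in isolation. The code below is a simplified model, not Nachos code: physical memory is a plain byte vector, a page table is a vector of (virtual page, physical page) pairs, free frames come from a simple list, and the 128-byte page size is an assumption. All names are hypothetical.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

const int kPageSize = 128;  // assumed page size in bytes

struct PageEntrySketch {
    int virtualPage;   // page number as seen by the user program
    int physicalPage;  // frame that actually holds the data
};

// Copy every page of `parent` into frames taken from `freeFrames`.
// On success, fills `child` with the new page table and returns true.
// If there are too few free frames, nothing is changed and false is returned.
bool CloneSpaceSketch(const std::vector<PageEntrySketch> &parent,
                      std::vector<unsigned char> &memory,
                      std::vector<int> &freeFrames,
                      std::vector<PageEntrySketch> *child) {
    if (freeFrames.size() < parent.size()) return false;
    child->clear();
    for (const PageEntrySketch &p : parent) {
        int frame = freeFrames.back();
        freeFrames.pop_back();
        // Same virtual page number, different physical frame, same contents.
        std::memcpy(&memory[frame * kPageSize],
                    &memory[p.physicalPage * kPageSize], kPageSize);
        child->push_back({p.virtualPage, frame});
    }
    return true;
}
```

Checking the frame count up front keeps the failure case simple: a Fork() that cannot get enough memory can return an error without having to undo a partial copy.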

The Exit() system call should work by calling Thread::Finish(), but only after deallocating any physical memory and other resources that are assigned to the thread that is exiting.

In order to implement the Exec() system call, you will need a method for transferring data (the name of the executable, supplied as the argument to the system call) between the user address space and the kernel. You are not to use functions Machine::ReadMem() and Machine::WriteMem() in machine/translate.cc. Instead, you will have to code your own functions that take into account the address translations described by the page tables to locate the proper physical address for any given logical address. (Recall that strings in C are stored as sequences of characters in successive memory locations, terminated by a null character.)
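The following sketch shows one way such a copy-in routine might look. It is a standalone model with hypothetical names (TranslationSketch, TranslateSketch, CopyInStringSketch) and an assumed 128-byte page size; physical memory is a byte vector rather than the simulated machine's memory. Each byte is translated separately because a string may cross a page boundary, and consecutive virtual pages need not be adjacent in physical memory.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

const int kPageSizeBytes = 128;  // assumed page size

struct TranslationSketch {
    int physicalPage;  // frame holding this virtual page
    bool valid;        // false if the page is not mapped
};

// Translate one virtual address to a physical address.
// Returns -1 if the address is negative, past the end of the page table,
// or on a page that is not valid.
int TranslateSketch(const std::vector<TranslationSketch> &pageTable,
                    int vaddr) {
    if (vaddr < 0) return -1;
    std::size_t vpn = static_cast<std::size_t>(vaddr / kPageSizeBytes);
    if (vpn >= pageTable.size() || !pageTable[vpn].valid) return -1;
    return pageTable[vpn].physicalPage * kPageSizeBytes
           + vaddr % kPageSizeBytes;
}

// Copy a null-terminated string from user address `vaddr` into `out`.
// Fails if any byte is unmapped or if no terminator is found within
// `maxLen` bytes, so a bad pointer cannot crash or stall the kernel.
bool CopyInStringSketch(const std::vector<TranslationSketch> &pageTable,
                        const std::vector<unsigned char> &memory,
                        int vaddr, std::size_t maxLen, std::string *out) {
    out->clear();
    for (std::size_t i = 0; i < maxLen; i++) {
        int paddr = TranslateSketch(pageTable, vaddr + static_cast<int>(i));
        if (paddr < 0 || static_cast<std::size_t>(paddr) >= memory.size())
            return false;
        unsigned char c = memory[paddr];
        if (c == '\0') return true;
        out->push_back(static_cast<char>(c));
    }
    return false;  // no terminator within maxLen bytes
}
```

The maxLen bound is a design choice for this sketch: it keeps a user program from making the kernel scan all of memory by passing a string with no terminator.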

Once the name of the executable has been copied into the kernel, and the file has been verified to exist, the executable file should be consulted to determine the amount of physical memory required for the new program. This physical memory should be allocated and initialized with data from the executable file, the page tables should be set up for the new program, the MIPS registers should be reinitialized for starting at the beginning of the new program, and control should return to user mode. File progtest.cc contains a sample for executing a binary program.

If you use ``machine->Run'' to execute a user program, it terminates the current thread. Since Exec() needs to return a SpaceId to the caller, you should find a way to do that.

NOTE: The object code produced by the MIPS cross-compiler assumes that the data segment begins at the physical address immediately following the text segment. In particular, there is no page alignment, so that if the text segment ends in the middle of a page, then the data segment will start just after it and the page will contain both code and data.

The Yield() system call will call Thread::Yield() after making sure to save any necessary state information about the currently executing process.

Be sure to synchronize your code correctly. You will need to put lock operations in your code to ensure that it will work properly. Your locking should be fine-grained enough to eliminate any spurious latency problems caused by coarse-grained locking. For example, any time a thread accesses the disk or the console it should not hold a lock that would prevent another thread from accessing some other I/O device or piece of data.

The code you are given buffers its reads from the executable file in a disk buffer called diskbuffer. All of your file I/O must go through the diskbuffer.

You will need to solve a synchronization problem that occurs when multiple processes try to read or write from a file at the same time.

Suggestions

Error handling: System calls invoked by a user process in user mode should never ``crash'' Nachos. This means you should never trust that the arguments supplied for a system call are correct or reasonable. Instead, these arguments should be checked, and if they are incorrect, a failure status should be passed back to the caller. You will want to think of some scheme for accomplishing this.

Build and test incrementally: We suggest not trying to implement everything at once, but rather do a little at a time, testing that each bit you do works before going on to the next modification. This incremental approach seems to work best for operating systems, since it enables you never to be too far from having a version that ``runs''. If you make too many changes all at once, the debugging task to get back to a running system becomes enormous.

Adding source files: Don't be afraid to add new source files to Nachos, especially if it makes your program more modular. For example, it might be reasonable to add a new source file for the physical memory manager.

Debugging flags and switches: Using the ``-s'' flag to Nachos along with the ``-x'' flag causes Nachos to single-step while in user mode. This might be helpful for debugging and understanding. Also, have a look at the file threads/utility.h to see all the code letters that can be supplied along with the ``-d'' flag to enable various kinds of debugging printout from Nachos. The ``-d m'' option prints out each MIPS instruction as it is executed, which is very helpful for tracing problems with Fork() and Exec().

Incrementing Program Counter after a System Call: Before returning to user mode after a system call, it is necessary to increment the MIPS program counter to point to the next instruction. A system call instruction takes 4 bytes, so you must add 4 to the program counter. If you don't do this, the application making the system call will go into a loop in which it repeatedly makes the same call over and over. This is hard to figure out otherwise, so I'm telling you now so you don't have to.
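The adjustment itself is only a few lines. The sketch below models it on a tiny stand-in register file; the names and indices (kPCReg, kNextPCReg, kPrevPCReg, RegistersSketch) are assumptions made for this example, so look in machine/machine.h for the real register names. It assumes the simulator keeps a "next PC" and a "previous PC" alongside the current one, and keeps all three consistent.

```cpp
#include <cassert>

// Hypothetical stand-in for the simulated register file. The indices are
// made up for this example; use the names from machine/machine.h.
enum { kPCReg = 0, kNextPCReg = 1, kPrevPCReg = 2, kNumRegsSketch = 3 };

struct RegistersSketch {
    int reg[kNumRegsSketch];
};

// Step past the 4-byte system call instruction so that the user program
// resumes at the following instruction instead of repeating the call.
void AdvancePCSketch(RegistersSketch *r) {
    r->reg[kPrevPCReg] = r->reg[kPCReg];      // remember where we were
    r->reg[kPCReg] = r->reg[kNextPCReg];      // move to the next instruction
    r->reg[kNextPCReg] = r->reg[kPCReg] + 4;  // and the one after that
}
```

It is probably simplest to call a helper like this at one place on the return path of the exception handler, so that no individual system call can forget it.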

What to turn in

All additions (new code in existing files as well as new files, if any) should be in the userprog directory. You are to turn in your userprog directory. In addition to the code, you are to include a file called PROJECT2_WRITEUP that explains the design of your code and how it works. If some parts of your code do not run, you need to say this outright in PROJECT2_WRITEUP and to describe your design, implementation and difficulties. This is needed for partial credit. Your code must compile and link if you want to receive any credit at all. That by itself, however, does not receive any credit. We will test your code with our own test programs. These programs will test the system-calls you will have implemented -- both singly and in combinations. You will receive points for each of our tests your system can execute correctly. For partial credit, the PROJECT2_WRITEUP file must contain a description of what works and what doesn't, your design, implementation and difficulties.

Required output

In order for us to see how your program works, some debugging output must be included in your code. You must print out the following information:
  1. the number of pages allocated to a user program
  2. the identity of a thread that has been forked
  3. the number and the identities of the pages allocated to a new process
  4. identity of the system-call made
  5. the filename argument in the Exec system call
  6. indication that a thread yields/exits, and which thread gets the control