Heloise's Helpful Hints for CS170 KOS Lab

Aunt Heloise


KOS -- A Tasty and Nutritious Way to Start Your Day

When cooking up a batch of KOS, there are a few ways to make your creation a crowd pleaser instead of something that makes your family and guests say "please pass the Windows-NT." Here are things I do when I want my KOS Lab to keep them coming back for more.


Gathering and Using your Utensils

Before you begin, it will be best if you have all of your utensils near you and you are proficient in using them. Dr. Plank has provided a series of modules that make this lab MUCH easier to complete. In particular, you will very much want to understand how to use his dllist routines (see /cs/faculty/rich/cs170/include/dllist.h for the interface). Without these tools, you are really missing some of technology's greatest time savers, although you are certainly free to write your own.


Compiling KOS and The Code Structure

You may be confused about how the simulator, your KOS code, and a user program fit together. I know I was. Here is the presentation of KOS that helped me make the most tasty KOS Lab. The thing to realize is that the simulator and your code are being combined to simulate the behavior of a DEC MIPS machine and its OS. To understand how this works, you probably need to think back to your assembly language programming experience. It all comes down to assembly language, doesn't it? Recall that each assembly language instruction changes a small part of the machine's "state" (registers, condition codes, memory, etc.). If you think about it for a minute, you could probably write a C program that defines a variable for each piece of machine state, and then walks through an assembly language program one-instruction-at-a-time making the same state changes that the hardware would have made. For example, consider the register set and a fictitious assembly language instruction "ADD R1 R2 R3" that adds the contents of R1 to the contents of R2 and puts the results in R3. You can imagine defining R1, R2, and R3 as integers and writing C code that does this if it encounters the string "ADD R1 R2 R3." If you do this in a detailed enough way, and if you are willing to read the instructions in the format that they are stored in an actual Unix binary (as opposed to as strings) you have built a machine simulator. The file /cs/faculty/rich/cs170/lib/libsim.a that you need to load with does exactly this for MIPS binaries that have been compiled for the DEC Ultrix version of Unix.

The next thing to appreciate is that when a program makes a system call, it is really issuing a special assembly instruction called a TRAP instruction. It is up to the operating system to define where it expects to find the arguments to the system call when a TRAP is executed. Usually, some set of registers (like 5, 6, and 7 maybe?) are chosen. When the compiler compiles in a system call, it arranges for the arguments to be loaded into the right registers before the TRAP is issued. So what happens when the simulator "simulates" a TRAP instruction?

The answer, in our case, is that your code gets control through the exceptionHandler() subroutine. That is, you are actually writing the part of the simulator that deals with a TRAP instruction and you are writing in C.

Indeed, the way I visualize the problem is to think of a simulator as having been written, but with a couple of modules missing. Your job, then, is to write code that is loaded with the code you are given to complete the full simulation. Think of it as a simulator with a hole in it that you must fill in. The next question, then, is "what parts am I given?"

The libsim.a and main_lab1.o (which you must load with) combine to give you

If someone were to give you a MIPS binary (called "a.out" maybe?) that was compiled for DEC Ultrix, and you were to tell the simulator to load it into main_memory, and then you were to set the simulator's program counter to the first instruction and to initialize your OS, then the simulator could start to run your program. It will call your OS when it reaches a TRAP instruction or when your simulated console device interrupts.

One point of confusion here, though, might arise over the difference between compiling your simulator+OS and cross-compiling so you can make your own MIPS/DEC Ultrix binaries. YOU compile your OS code for Linux using gcc. The simulator (libsim.a) you are given is compiled for Linux using gcc, as is main_lab1.o. We also have a special version of gcc available that lets us build binaries for the MIPS processor running DEC Ultrix. The directory /cs/faculty/rich/cs170/test_execs contains a bunch of C programs that have been compiled with this special version of gcc. You should use them as the programs to load when you are running your simulator+OS.


Structure of KOS Lab

You are really being asked to accomplish four independent tasks in KOS Lab:

You should think of these as separate assignments, but do them in this order as each builds skills used in the next. You are strongly encouraged to follow your dear Auntie's cook book as it is a step-by-step list of things to accomplish in order to finish the lab. Understand each step thoroughly before moving on and try to see where each of the tasks begins and ends.

Here are some realizations that may help you make things go more smoothly.

OS Initialization

The key thing here to realize is that the simulator is expecting your code (the OS kernel) to do its business, store off anything it will need to remember when the next exception or interrupt occurs, and then call run_user_code(). When run_user_code() executes, your code is done. Anything you store in a global variable (like the ready queue) will be preserved, any blocked threads will still be blocked, but the currently running thread dies an ignominious death. The structure, then, that minimizes the ignominy of your OS is as follows:

exception called
     	.
	.
	.
you kt_fork whatever it is you need to get done
	.
	.
	.
the exception handler does a kt_joinall() and continues through 
a routine (usually the scheduler) that eventually calls run_user_code().  
Notice the thread synchronization structure. The kt_joinall() won't run until all of your other threads have either successfully called kt_exit() or have blocked themselves on a semaphore. That is, when there is no more work for the kernel to do, the kt_joinall() fires and your code goes back into user mode.

Writing the Console

There isn't much to say here other than what I have in my cook book recipe. The high-level realization to have is that the semaphore is really being used as a blocking lock. That is, a P() call locks the console, and a V() call unlocks it so another character can be written. Also, you are asked to use a second semaphore to implement exclusive access to the console. Ask yourself why that might be.

Reading the Console

Here, things are a bit different. The structure that is advocated is that you create a reader thread that consumes characters from the console and buffers them in kernel space. Many actual hardware devices work in this way with respect to unsolicited interrupts. The piece of hardware typically has a very small buffer (one character in this case). Your kernel buffer lets the system pull characters out and store them until a process can consume them. Also, the most elegant solution to this part of the lab takes advantage of the ability of semaphores to count how many "wake-ups" have occurred. If you implement this part of the assignment using a semaphore as a lock around a separate counter, it will work, but you may wish to review semaphores a bit.

The Process Control Block -- a Precious Ingredient

One source of confusion for many new kos chefs concerns the use of a Process Control Block (PCB) to record the process state when a user space process either traps into the OS or is interrupted by a device. To start out, if you think about it a bit, the CPU only has one set of registers and the C compiler uses all of them when you compile any code -- your own code to run in user space or the operating system.

Thus, when you compile a program to run as a user space program (i.e. not the OS) the compiler will use all of the registers in the CPU to implement your code logic. Similarly, when you compile your OS, the compiler will use all of the CPU registers. However, when the user space program executes a trap instruction or a device interrupts, the hardware will switch between the two and the compiler doesn't know that it will happen. In particular, the register values that are present when a user space program is paused (either because of a trap or an interrupt) cannot be different when the program is resumed or the compiled code will generate an incorrect answer. Put another way, the compiler assumes that your user space program will never be paused when it compiles your code. It relies on the hardware and OS to make such pauses transparent to the user.

The way that the OS can implement this transparency is to save off the register values into a set of memory locations immediately, as the first thing it does, when the user space code calls a trap instruction or when a device interrupts. In kos, the function examine_registers() does this saving. Note that this saving function is tricky to write because it is compiled with a C compiler and, thus, it uses registers, and thus it could overwrite the registers it is trying to save. For this reason the register saving function is usually written carefully, in assembly language, so that it does not damage the registers before it saves them. However, it is also destructive. Once it has saved off the registers, it has used the registers to do the saving, and thus the register values are changed after a call to examine_registers(). For this reason, your kos should only call examine_registers() once per trap or interrupt.

The question comes up, though, of where examine_registers() should save the register values. For this lab, where there is only one user space process in memory, you could use a global variable. That variable (an array of integers large enough to hold all registers) would be filled in every time the user space process calls a trap or a device interrupts and restored with a call to run_user_code(). However, in the next labs, we will enjoy creating multiple processes, so there will need to be more than one register save area and a single global won't work.

The solution, then, is to allocate a save area whenever a process is created and then to deallocate the save area when the process exits. This record is a data structure that the OS uses to save the user process state that the OS needs to restore the process (and, it turns out, other pieces of information that the OS needs to operate a specific process). The record is called a Process Control Block (PCB). There is one per existing user space process. You should allocate it when the process is created (e.g. in initialize_user_process()) and deallocate it when the process exits.

Note that your Aunt Heloise has observed some of your older cousins allocating a PCB every time a trap occurs or an interrupt happens. That is, there is a call to malloc() in exceptionHandler() and interruptHandler() for a PCB. Sadly, this error does not cause a problem for the first lab, but is disastrous in the next two. Can you see why?

The problem is that when you save off a process's registers, and then another process runs and you save its registers, and you then want to go back to run the first process, you can't use a new PCB that you have just malloced -- you need the registers from the first process. Thus, each process has one PCB that needs to be created when the process is created and destroyed when the process exits, and you use that PCB to save the registers (and other process-specific data) during the process's lifetime.

Initializing the User Program's argc and argv[]

Linux (and Ultrix -- the OS that ran on the MIPS) uses a specific argument passing mechanism in which the name of the program and all of the arguments are passed to the main() function as an array of strings. By convention, element zero of the array is the name of the program and the last element of the array is NULL. The integer parameter argc is set to the number of arguments (counting the zeroth argument as argument 1).

It is always good to validate your understanding by tasting your cooking during the preparation process. In this case, you might try compiling and running the following code on a Linux system.

#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
        int i;
        printf("program %s called with argc: %d\n",
                argv[0], argc);
        if(argc > 1) {
                for(i=1; i < argc; i++) {
                        printf("\targv[%d]: %s\n",i,argv[i]);
                }
        }
        return(0);
}
Compile it as follows
gcc mytestprogram.c
where the file mytestprogram.c contains this code. Then try running it as
./a.out
./a.out firstarg secondarg thirdarg
What do you get? Your aunt gets something like
heloise@csilvm-03$ ./a.out
program ./a.out called with argc: 1
heloise@csilvm-03$
and
heloise@csilvm-03$ ./a.out firstarg secondarg thirdarg
program ./a.out called with argc: 4
	argv[1]: firstarg
	argv[2]: secondarg
	argv[3]: thirdarg
heloise@csilvm-03$
Note that argc counts the number of arguments passed, including argv[0], which is the name of the program ("./a.out" in this example).

So far so good? This is the way that Linux passes arguments from the command line to a program executed by the shell or, more properly, between programs when they are executed as subprograms.

This part of the lab is, by far, the most exotic and flavorful. It is really pretty straightforward, but at the same time it is fraught with dangerous undertones. Basically, you need to realize two things in order to make it a pleasant experience. First, you should try and visualize where argc and argv[] are going to live and how it is that the MIPS simulator will find them. Here is a really bad picture of their neighborhood. They live on "the wrong side of the stack."

00000:	|---------------|
	|		|
	|		|
	|		|
	|		|
	|		|
	|		|
	|		|
	|		|
	|		|
	|		|
	|		|
	|		|
	|		|
	|		|
	|		|
	|		|
	|		|
    	|		|
sp:	| used by C  	|
	| used by C	|
	| used by C 	|
	| argc  	|
	| &argv[0]	|----
	| &envp[0]	|   |
   ----	| argv[0]	|<---
   |	| argv[1]	|
   |	| argv[2]	|
   |	|   .     	|
   |	|   .		|
   |	|   .		|
   |	| argv[argc-1]	|
   |	| NULL		|
   ---->| string0	|
	| string1	|
	| string2	|
	|   .     	|
	|   .		|
	|   .		|
	| string.argc-1	|
	|---------------|
Study this picture. Go ahead. Close your eyes. Now, visualize it. The "trick" here is that once a user program starts running (not the OS but a program loaded into main_memory[] by load_user_program()) the stack pointer (a register in the CPU) will never be incremented past its initial value. Thus, all of the data above the stack pointer in memory (at higher addresses in main_memory[]) is "safe" and will not be overwritten or changed by a correctly working C program. Of course, you can "hack the stack" by setting a pointer explicitly to address these values or a programming error could result in an address that is above the initial stack pointer, but the compiler will assume that the arguments passed to the C program from the command line by the shell will be put into locations above the initial value loaded into the stack pointer register.

This is the organization that the MIPS/DEC Ultrix process-launching mechanism (defined by gcc) assumes will be in place when a program is initiated. Bigger addresses are at the bottom of the figure, by the way. Yes, it is weird, but it is less weird than writing the string values backwards. Think about it. Anyway, the stack (which grows from bigger addresses to smaller addresses) grows up in this figure.

To set up this stack structure (which it is helpful to understand) the OS bootstrap includes two functions with the prototypes

int *MoveArgsToStack(int *registers, char *argv[], int mem_base);
void InitCRuntime(int *user_args, int *registers, char *argv[], int mem_base);
that will initialize the stack for you. These prototypes are in "simulator.h" and the functions are included in the bootstrap code (like load_user_program() and run_user_code()). MoveArgsToStack() takes three arguments (the register array, the argument vector, and a memory base, which is 0 in the example below), and InitCRuntime() takes those same three plus the pointer that MoveArgsToStack() returns. These routines, together, build the runtime argument data structure that the C-language runtime for the DEC Ultrix version of gcc expects. Note that they build it in memory after the program is loaded but before the program is run. So your OS (in this lab -- we will extend this to multiple programs in the next lab) needs to do something like
void SetUpandRunUserProgram(void)
{
  int local_registers[NumTotalRegs]; /* memory for initial set of registers */
  int *user_args;
  .
  .
  .
  /* see discussion below for how to use kos_argv[] */
  load_user_program(kos_argv[0]); /* argv[0] is the name of the program */
  .
  .
  .
  /* init local_registers */
  .
  .
  local_registers[StackReg] = MemorySize - 12; /* set stack at top of memory */
  .
  .
  .
  user_args = MoveArgsToStack(local_registers,kos_argv,0); /* copy strings onto user-space stack */
  InitCRuntime(user_args,local_registers,kos_argv,0); /* build gcc runtime data structures in user space */
  run_user_code(local_registers); /* jump to user space and run the program */
  /* not reached */
}
Note that this example uses kos_argv[] (discussed in the next section) directly. In future labs (like when you implement the exec() call) you will want to make your own argv[] that you can pass to these functions.

This example is a sketch of how to call MoveArgsToStack() and InitCRuntime() -- there are other pieces of code you will need to include. These functions use the initial value of the stack pointer (contained in registers[StackReg], passed as an argument) to initialize the C runtime. They also reset registers[StackReg] to the initial stack pointer value that the C runtime will use as the base of the stack (the highest address that the compiler will access). Thus, you should not change this value between the call to MoveArgsToStack() and the call to run_user_code(), since the purpose is to initialize the user-space stack prior to jumping to user space.

Getting the "-a" flag to work

Another subtlety has to do with booting your version of KOS. If you think about it a bit, you'll see that the main routine KOS() is really the entry point for the whole operating system. This code is usually called bootstrap code in OS parlance and "booting" refers to running the bootstrap code. Thus when you boot KOS, you start running in the function KOS() (which is called by the machine simulator when a boot is initiated).

When you boot KOS, you need to give the machine something to do once your boot sequence is completed. For example, my boot sequence initializes a doubly linked list to use as the ready queue of runnable processes. Once that queue is initialized, my OS is booted and needs to do something. If there are no runnable processes, then it needs to call noop(). Otherwise, it needs to schedule a runnable process to the CPU.

In this assignment, what you want to do is to boot the OS and then immediately load a user process into main_memory along with the command line arguments that you would like to pass to it. Thus you will probably write a function like init_user_process(char *fname) which takes the path in the Linux file system to an R3000 binary which has been cross compiled (in my solution, init_user_process() is forked using kt_fork() so the argument is a (void *) but the function unmarshals the argument into a (char *)).

If the name of the binary is, say, "a.out" then the call might look like


kt_fork(init_user_process,(void *)"a.out");

Notice, though, that the binary must be called "a.out" and there is no way to pass command-line arguments. That is, there is no way to run

a.out Rex is awesome!

Dr. Plank's solution uses a global variable called Argv which the init_user_process() function can access. Thus, each time you want to run a different program you need to copy it (or symlink it) to the file "a.out" and change the global variable Argv (and recompile). Not a great solution, but functional and fine once we get a shell going for KOS.

However, to make this a little cleaner we have added a global variable to the machine boot code that captures the arguments following the "-a" flag on the command line when you run (boot) your OS. The variable is char *kos_argv[] and it is declared global to the simulator. Thus you can run

./kos -a a.out Rex is awesome!
and when the simulator calls your KOS() bootstrap function the global variable will be initialized as:
kos_argv[0] = "a.out";
kos_argv[1] = "Rex";
kos_argv[2] = "is";
kos_argv[3] = "awesome!";
kos_argv[4] = NULL;
There is no kos_argc. Instead, you need to look for the NULL in the array of pointers as the end of the argument list.

With this feature, then, you don't need to name your R3000 binary "a.out". For example, you could run the "hello world" test code we have provided as

./kos -a /cs/faculty/rich/cs170/test_execs/hw
and you will pick up the right path name to the file hw which we have compiled for the R3000. As another example, you could run your own test program that takes three arguments:
./kos -a ./my-awesome-test-code-R3000 arg1 arg2 arg3
and the boot code for the machine will fill in the kos_argv[] array for you with the appropriate strings.

And that is it. Follow my recipe and keep these helpful hints in mind for a fluffy and perky KOS Lab every time.