CS170 Lecture notes -- A Brief Introduction to Threads

  • Rich Wolski and James Plank
  • Source Code Directory: /cs/faculty/rich/public_html/class/cs170/IntroThreads
  • Lecture notes: http://www.cs.ucsb.edu/~rich/class/cs170/notes/IntroThreads/index.html
  • Posix Threads can be fun and exciting if used properly.
  • A makefile for all of the code examples in this lecture is also available.
    Threads are programming abstractions that permit you to encapsulate the functionality you wish to implement using multiple, self-contained processing units that can be made to cooperate via shared memory. In this lecture, we will begin to study threads, both in abstract terms and in terms of the Posix Threads standard abstractions.

    What are threads?

    In its most basic and abstract form, a thread is defined to be a program counter and some local storage (state). Multiple threads can be executing within the same program (list of instructions), but at different places (each given by a different program counter value). Typically, each thread has its own stack, so local variables are local to a thread. All threads share the global variables and heap space for the program in which they execute, however, and they use this "shared memory" to pass data between themselves.

    Threads are often called "lightweight processes". Whereas a typical process in Unix consists of CPU state (i.e. registers), memory (code, globals, heap and stack), and OS info (such as open files, a process ID, etc), in a thread system there is a larger entity, called a "task", a "pod", or sometimes a "heavyweight process."

    [Figure: Tasks, programs, threads]

    We will alter this definition slightly. For our work, a "task" or "program" will refer to a collection of resources (registers, memory space, file descriptors, network connections, etc.). A thread (or a set of threads) will refer to the active execution (a moving program counter and local variables) through a program. Under these definitions, then, a Unix process is a "task" (the resources defined by the process charged to a user ID) with a single "thread" of execution within it.

    [Figure: virtual parallel processing]

    When you program with multiple threads explicitly, you assume that they execute simultaneously within their task. In other words, it should appear to you as if each thread is executing on its own CPU, and that all the threads share the same memory, network connections, file descriptors, disk storage, files, etc.

    Why threads?

    There are many reasons to program with threads. In the context of this class, there are two important ones:
    1. They allow you to deal with asynchronous events synchronously and efficiently.
    2. They allow you to get parallel performance on a shared-memory multiprocessor.
    You'll find threads to be a big help in writing an operating system.

    Some Useful Definitions

    Before we go further, you will need to know what a few terms mean in the context of an operating system.

    program: a list of instructions that direct the machine to perform a desired computation.

    state: a set of values contained in a specified set of variables (memory locations).

    process: the state associated with a running program.

    thread: a fundamental unit of computation consisting of a program counter and some local state (typically a private stack).

    Using these definitions, a program becomes a process when it is initiated. It may contain one or more threads, each characterized by an individual program counter and local state variables, all accessing a shared global set of state variables.

    Thread Primitives

    [Figure: basic thread primitives]

    There are various primitives that a thread system must provide. Let's start with two basic ones: creating a thread and waiting for (joining with) a thread. In this initial discussion, I am talking about a generic thread system. We'll talk about specific ones (such as POSIX and Solaris threads) later.

    Posix threads

    [Figure: syntax versus semantics]

    On Solaris systems, there is a thread system that you can use. It is called ``Solaris threads.'' There is another thread system called ``Posix threads'' that is a standard. The differences are subtle and mostly have to do with the corporate machinations that take place whenever a standard is defined. For the most part, Posix Threads and Solaris Threads have very similar syntax. Their semantics can differ substantially. In this course, we'll stick with Posix Threads unless otherwise specified. Be forewarned, though, your mileage may vary, especially if you use Solaris instead of Linux. Most Linux systems support Posix threads as well, although there is some dispute over the issue of semantics. Bottom line: RTFM.

    To make use of Posix threads in your program, you need to have the following include directive:

    #include <pthread.h>
    
    And you have to link libpthread.a to your object files. The tricky part is that some Unix and Linux systems build libpthread.a into the standard C library. The easiest way to make sure you get what you are paying for is to use the -lpthread build option.
    UNIX> cc -c main.c
    UNIX> cc -o main main.o -lpthread
    
    You can use gcc too.
    UNIX> gcc -c main.c
    UNIX> gcc -o main main.o -lpthread
    
    In this class, please use gcc unless we give you a good reason that you should not. There are, again, subtle differences between the way in which Solaris cc works and the way in which gcc works, particularly with respect to other libraries that you may use (perhaps implicitly). When all else fails, please use gcc.

    There's a lot of junk in the pthread library. You can read about it in the various man pages. Start with ``man pthreads''. The two basic primitives defined above are the following in Posix threads:

         int pthread_create(pthread_t *new_thread_ID,
                            const pthread_attr_t *attr,
                            void * (*start_func)(void *), 
                            void *arg);
    
         int pthread_join(pthread_t target_thread, 
                          void **status);
    
    This isn't too bad, and not too far off from my generic description above. Instead of returning a pointer to a thread control block, pthread_create() has you pass the address of one, and it fills it in. Don't worry about the attr argument -- just use NULL. Then start_func is the function, and arg is the argument to the function, which is a (void *). When pthread_create() returns, the TCB (which uniquely identifies the created thread) is in *new_thread_ID, and the new thread is running start_func(arg).

    pthread_join() has you specify a thread, and give a pointer to a (void *). When the specified thread exits, the pthread_join() call will return, and *status will be the return or exit value of that thread.

    In all the Posix threads calls, an integer is returned. If it is zero, everything went ok. Otherwise, an error has occurred. As with system calls, it is always good to check the return values of these calls to see if there has been an error. In my code here in the lecture notes, I'll omit error checking, but it is in the files, and you should do it.

    How does a thread exit? By calling return or pthread_exit().

    Ok, so check out the following program (in hw.c):

    
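    /*
     * hw.c -- hello world with posix threads
     *
     */
    
    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>	/* exit() */
    #include <string.h>	/* strerror() */
    
    /* thread entry point: print a greeting and exit by returning */
    void *printme(void *arg)
    {
    	printf("Hello world\n");
    	return NULL;
    }
    
    int
    main()
    {
    	pthread_t tcb;
    	void *status;
    	int err;
    
    	/* create a thread running printme(NULL); its TCB is stored in tcb */
    	err = pthread_create(&tcb, NULL, printme, NULL);
    	if (err != 0) 
    	{
    		/* pthread calls return the error code; they do not set errno */
    		fprintf(stderr, "pthread_create: %s\n", strerror(err));
    		exit(1);
    	}
    
    	/* wait for the thread to exit */
    	err = pthread_join(tcb, &status);
    	if (err != 0) 
    	{ 
    		fprintf(stderr, "pthread_join: %s\n", strerror(err));
    		exit(1); 
    	}
    
    	return(0);
    }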
    
    Try copying hw.c to your home area, compiling it, and running it. It should print out ``Hello world''.

    Forking multiple threads

    Now, look at print4.c. This forks off 4 threads that print out ``Hi. I'm thread n'', where n is the TCB identifier. Notice that this might be an integer or an address, depending on the implementation, but that it doesn't matter which. The TCB is the unique "name" of the thread within your program. This should give you a good idea of how the pthread library works. Feel free to play with this library to get a feeling for how a thread system works. Since Unix is not multithreaded, and since your machines are not multiprocessors, the threads don't get you any extra performance. It just lets you play with threads.

    Here's the output of print4.c when run on the department's Linux systems:

    ./print4
    Hi.  I'm thread -1208280176
    Hi.  I'm thread -1218770032
    Hi.  I'm thread -1229259888
    Hi.  I'm thread -1239749744
    main thread -- Hi.  I'm thread -1208277312
    I'm -1208277312 Trying to join with thread -1208280176
    -1208277312 Joined with thread -1208280176
    I'm -1208277312 Trying to join with thread -1218770032
    -1208277312 Joined with thread -1218770032
    I'm -1208277312 Trying to join with thread -1229259888
    -1208277312 Joined with thread -1229259888
    I'm -1208277312 Trying to join with thread -1239749744
    -1208277312 Joined with thread -1239749744
    
    So what happened is the following. The main() program forked the four threads and they each ran in turn. Then the main() thread got control and printed its message, after which it called pthread_join for threads -1208280176, -1218770032, -1229259888, and -1239749744 respectively. Finally, when main() returns, all the threads are done, and the program exits. Three things to note. First, the main program is implicitly, itself, a thread. Notice that thread -1208277312 was never explicitly created, but the call to Ego() works all the same. Secondly, the order in which created threads run is not defined by pthreads. Thirdly, pthreads is free to choose any way it wants to name threads. Here is the output from exactly the same program run on the department's Solaris machines.
    ./print4
    main thread -- Hi.  I'm thread 1
    I'm 1 Trying to join with thread 4
    Hi.  I'm thread 4
    Hi.  I'm thread 5
    Hi.  I'm thread 6
    Hi.  I'm thread 7
    1 Joined with thread 4
    I'm 1 Trying to join with thread 5
    1 Joined with thread 5
    I'm 1 Trying to join with thread 6
    1 Joined with thread 6
    I'm 1 Trying to join with thread 7
    1 Joined with thread 7
    
    First, the main thread (thread 1) runs before the created threads -- not after. Why? Because pthreads doesn't dictate when created threads run relative to each other. The Solaris implementers chose to allow the main thread to keep running until the call to pthread_join() at which time it blocked and the other runnable threads each ran in turn. Notice also that the Solaris version just happens to skip 2 and 3 for reasons it is not obligated to tell us.

    Under OSX, the following output is generated from the same program:

    ./print4
    main thread -- Hi.  I'm thread -1610559488
    I'm -1610559488 Trying to join with thread 25166848
    Hi.  I'm thread 25166848
    Hi.  I'm thread 25167872
    Hi.  I'm thread 25168896
    Hi.  I'm thread 25169920
    -1610559488 Joined with thread 25166848
    I'm -1610559488 Trying to join with thread 25167872
    -1610559488 Joined with thread 25167872
    I'm -1610559488 Trying to join with thread 25168896
    -1610559488 Joined with thread 25168896
    I'm -1610559488 Trying to join with thread 25169920
    -1610559488 Joined with thread 25169920
    

    Again, it is key to your cosmic wa and general happiness that you understand all three of these executions are absolutely correct. That is, the thread system is free to impose any ordering and any naming scheme it chooses. It is your responsibility to ensure that threads execute in the order you want them to, and we'll discuss how you can control this ordering.

    exit() vs pthread_exit()

    In pthreads there are two things you should know about thread/program termination. The first is that pthread_exit() makes a thread exit, but keeps the program alive, while exit() terminates the entire program. If all threads (and the main() program should be considered a thread) have terminated, then the program terminates. So, look at p4a.c.

    Here, all threads, including the main() program, exit with pthread_exit(). You'll see that the output is the same as print4. Notice, however, that the main thread cannot call printme() and get the same output since printme() calls pthread_exit(). p4b.c illustrates what happens when we replace the printf statement at line 69 with a call to printme() which contains a pthread_exit(). The output (for Linux) is:

    ./p4b
    Hi.  I'm thread 1082375472
    Hi.  I'm thread 1090768176
    Hi.  I'm thread 1099160880
    Hi.  I'm thread 1116949808
    main thread -- Hi.  I'm thread 1073980576
    
    You'll note that none of the "Joining" lines were printed out because the main thread had exited. However, the other threads ran just fine, and the program terminated when all the threads had exited.

    The second thing you need to know is that when a forked thread returns from its initial calling procedure (e.g. printme() in print4.c), then that is the same as calling pthread_exit(). However, if the main() thread returns, then that is the same as calling exit(), and the program dies, even if the other threads have not yet had a chance to run. Here is where you really need to be careful. Check out p4c.c. Here is the Linux output:

    ./p4c
    Hi.  I'm thread 1082375472
    Hi.  I'm thread 1090768176
    Hi.  I'm thread 1099160880
    Hi.  I'm thread 1116949808
    main thread -- Hi.  I'm thread 1073980576
    
    Under OSX, however, you get
    main thread -- Hi.  I'm thread -1610609172
    
    and that's it. All threads have been created when the main thread exits, but they haven't run yet. When the main thread returns, the task is terminated, and thus the threads do not run. Again, it is critical that you understand that both of these executions are correct from the perspective of the standard.

    Finally, look at p4d.c. Here, the threads call exit() instead of pthread_exit(). You'll note that the output is:

    main thread -- Hi.  I'm thread -1610559488
    I'm -1610559488 Trying to join with thread 25166848
    Hi.  I'm thread 25166848
    
    This is because the program is terminated by thread 25166848's exit() call.

    Parameter passing and return values

    Often, you are going to want to pass parameters to a thread and get back one or more return values through pthread_join(). The single void * argument that each thread takes is intended to allow the caller of pthread_create() to specify one or more arguments to the thread that is created. To see how this technique is commonly employed, consider the code in adder.c in which each thread does the same thing to different parameters passed by the main thread.

    The thread entry point called AddIt() takes a single void * argument. It converts that pointer to a pointer to a structure of type struct thread_arg so that it can extract the two fields: value and increment. It then mallocs a structure for the return value and puts into it the sum of the value and the increment that are passed. Finally, it frees the argument structure and passes the pointer to the return structure to pthread_exit(), cast as a void *. The calling thread gets this pointer through a call to pthread_join() and, once the return value is printed out, frees the malloced space.

    You should study this code very carefully. Not only does it illustrate the common method of parameter and return value passing under pthreads, but it covers most of the important C concepts (e.g. malloc(), casting, structures, pointers and addresses) that you will need for the remainder of this class. If this code is not 100% crystal clear, you should consider brushing up on your C.


    Preemption versus non-preemption

    Now, take a look at iloop.c. Here, four threads are forked off, and then the main() thread goes into an infinite loop. When you execute it on a Solaris system, you see nothing. Threads 0 through 3 are never executed. This is because the threads system on these machines is non-preemptive. In other words, there is one CPU, and unless a thread voluntarily gives up the CPU (via a blocking call like pthread_join or by terminating), it will retain the CPU. In a preemptive system, such as Linux, threads may be interrupted and rescheduled at any time, and iloop will actually have threads 0 through 3 print out their id's (although the program will never terminate). There are some machines that have multiple CPU's attached to a single memory. These systems are by nature preemptive, since different threads will actually execute on different CPU's. Such machines will have threads 0 through 3 print out their id's (although the program will never terminate).

    We will talk more about preemption later.

    A non-preemptive thread system on a system with a single CPU (called a "uniprocessor") may seem useless, but in actuality it is extremely useful. On a multiprocessor system (i.e. multiple CPU's attached to one memory), it should be obvious that you can use threads to achieve parallel speedup on your programs.