"Unix Network Programming Volume 1, Networking APIs: Sockets and XTI", W. Richard Stevens, Prentice Hall PTR, 1998
as it is generally considered biblical with respect to the subject of sockets. There is some ambiguity even in the bible, however, so your best bet is to consult with the local man pages as well.
IPC -- In the last lecture we spent a great deal of time talking about threads. In particular, we said that threads communicate state between themselves via shared variables, and that communication can be controlled using synchronization primitives. Threads for Unix, however, are a relatively new occurrence as Unix goes. When Unix was first invented, and for many years after, there were no thread packages (none that were portable, at least). Processes were the only control abstraction, and they can't share memory.
Well, they can, but they had to use what is called System V shared memory to do so, and it wasn't as straight-forward as using threads.
message passing -- Worse, what if two threads wanted to share information and they were not located within the same process, or even on the same machine? Unix and Linux do not currently support shared variables over the network (although doing so has been and continues to be an on-going research area). In the bad old days, before threads, there needed to be a way to get data from the memory space of one process into another and to synchronize the transfer. That is why Unix invented Unix pipes as a programming abstraction.
network communication -- But there was still a problem. Pipes included file-system semantics that were really hard to implement over a network. Fundamentally, all pipes do is to push the shared memory down into the kernel -- they still assume that memory can be shared. To communicate over a network, then, a different abstraction was necessary.
device drivers -- The typical approach to implementing network communication was via a Unix device driver which was completely specialized for each combination of network protocol and Unix implementation. You could write a program (like telnet) for a particular device driver implementation of TCP on a particular version of Unix, but it would be as portable as the Brooklyn Bridge. At O.S. upgrade time, you had to hope that the device would not have changed. In addition, if you wanted to write an application consisting of processes running on multiple computers, you were faced with having to write the communication part differently for each driver implementation.
sockets as a common abstraction -- It was decided, then, that a common abstraction for network communication was needed. There were two competing possibilities that anyone took seriously: sockets and System V streams (invented for making terminal handling better). Sockets won out as a user abstraction, although the functionality is simple enough that Sun Microsystems implemented sockets using kernel-level streams in early versions of Solaris (I don't know what they do now).
byte streams and file descriptors -- From a user's perspective, sockets are intended to function like Unix files, once they are initialized. They are represented in a user program by file descriptors, and as such the Unix read(), write(), and close() work with sockets without modification. The idea, then, is that once a socket is established, it looks to your program like a file. In particular, data is sent and received as a byte stream (without record boundaries). Of course, in practice, the byte stream abstraction is not strictly observed (cf. UDP, which preserves datagram boundaries) but it is mostly true. Notice also that lseek() is not supported since it is not generally possible to rewind and randomly seek into a network communication with another process.
universal IPC -- all network protocols -- On the user side, sockets are intended to present a single interface: the Unix byte-stream abstraction. On the system side, though, they were designed to be able to interface to all of the networking protocols of the day. This sounds simple until you consider all of the possible addressing modes and protocol parameters that any protocol might need. This level of generality complicates the interface substantially, in my opinion.
Internet protocol suite -- Despite this complexity, however, about the only real protocol implementations that are available are for two of the Internet protocol combinations: TCP/IP and UDP/IP. There are two reasons, and they are not independent. First, these are, by far, the most popular communication protocols on the planet. Secondly, sockets were most competently implemented for Berkeley Software Distribution (BSD) Unix which was the base O.S. for many of the workstations that were being attached to the Internet at the time ethernet became popular.
In order to be general, sockets define a generic address structure that can be
interpreted in the context of whatever protocol happens to be the one that
will be attached to the socket. The sockaddr_in data structure in
Stevens (and usually found in <netinet/in.h>) is
struct sockaddr_in {
    uint8_t         sin_len;      /* length of structure (16) */
    sa_family_t     sin_family;   /* address family is AF_INET for IP */
    in_port_t       sin_port;     /* port number (UDP/TCP) */
    struct in_addr  sin_addr;     /* IP address */
    char            sin_zero[8];  /* unused */
};
which is fairly self-explanatory. If you were implementing sockets for
another family of protocols, you would need to define something analogous to
this address structure for the specific protocol. The sin_len field
allows you to specify the size of the structure to the other socket routines
so you are not constrained to a fixed address structure size.
TCP/UDP ports and IP addresses -- There are two essential pieces of information in this data structure that actually have to do with addresses: the address field sin_addr and the port number sin_port. All user-level transport connections use an (addr,port) ordered pair as an address. The idea is that the address refers to the machine (or more cryptically, an interface on the machine) and the port number refers to a particular process. It is these fields that you must set when you wish to get your socket to talk to another socket (e.g. on another machine).
Incidentally, here is sockaddr_in on my Linux machine:
/* Structure describing an Internet socket address. */
struct sockaddr_in
{
__SOCKADDR_COMMON (sin_);
in_port_t sin_port; /* Port number. */
struct in_addr sin_addr; /* Internet address. */
/* Pad to size of `struct sockaddr'. */
unsigned char sin_zero[sizeof (struct sockaddr) -
__SOCKADDR_COMMON_SIZE -
sizeof (in_port_t) -
sizeof (struct in_addr)];
};
This is a somewhat more cryptic version, but it conveys an important concept.
The intention is that you will cast between a "generic" socket address
described by struct sockaddr and a particular kind of address referred
to, in our case, by struct sockaddr_in. Somehow this state of affairs
sounds like having your generality cake and eating it too, but I'll refrain
from further comment. You should be aware, though, that you will be called
upon to perform mighty feats of casting occasionally when you program sockets.
It is not the way I would have done it, but that is probably a good thing.
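To make the casting concrete, here is a minimal sketch of filling in a struct sockaddr_in and then looking at it through a generic struct sockaddr pointer. The helper names fill_ipv4() and family_of() are made up for this illustration and do not come from Stevens or from any library. The sketch uses inet_pton(), a POSIX routine, to parse the dotted-notation string.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: fill an IPv4 address structure.
 * Returns 0 on success, -1 if the dotted string does not parse. */
int fill_ipv4(struct sockaddr_in *sa, const char *dotted, uint16_t port)
{
    memset(sa, 0, sizeof(*sa));          /* also clears the sin_zero padding */
    sa->sin_family = AF_INET;
    sa->sin_port = htons(port);          /* stored in network byte order */
    if (inet_pton(AF_INET, dotted, &sa->sin_addr) != 1)
        return -1;
    return 0;
}

/* Hypothetical helper: what a protocol-independent routine can rely on.
 * Through the generic pointer, only the address family is meaningful. */
int family_of(const struct sockaddr *generic)
{
    return generic->sa_family;
}
```

A caller would write family_of((struct sockaddr *)&sa) after filling sa, which is exactly the cast that bind(), connect(), and accept() expect. The family field is what lets the generic code decide how to interpret the rest of the bytes.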
#include <arpa/inet.h>

uint32_t htonl(uint32_t host_long);
uint16_t htons(uint16_t host_short);
uint32_t ntohl(uint32_t net_long);
uint16_t ntohs(uint16_t net_short);

The function htonl() is shorthand for "host-to-network-long", where a "long" is hopefully, but not always (thanks Cray), a 32-bit quantity. I'll let you substitute the word "short" for the letter "s" and figure out the other three calls. You need to use them to convert IP addresses and port numbers to network byte ordering and back again when you use them in socket calls: the long forms for 32-bit IP addresses and the short forms for 16-bit port numbers.
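If you want to see what these calls actually do, a small sketch like the following copies the in-memory bytes of a converted value into an array so you can inspect them. The helper names bytes_of_u16() and bytes_of_u32() are invented for this example.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy the in-memory bytes of a 16-bit value. */
void bytes_of_u16(uint16_t v, unsigned char out[2])
{
    memcpy(out, &v, sizeof(v));
}

/* Hypothetical helper: copy the in-memory bytes of a 32-bit value. */
void bytes_of_u32(uint32_t v, unsigned char out[4])
{
    memcpy(out, &v, sizeof(v));
}
```

After bytes_of_u16(htons(8293), b), the array holds 0x20 then 0x65 on any host, because 8293 is 0x2065 and network order puts the most significant byte first. Likewise bytes_of_u32(htonl(0x7F000001), b) yields 127, 0, 0, 1. One consequence worth remembering: on a little-endian host, storing htonl(8293) into a 16-bit port field keeps only the low 16 bits of the converted value, which are zero, so ports must always go through htons().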
#include <sys/socket.h>

int tcp_sd;
int udp_sd;

tcp_sd = socket(AF_INET, SOCK_STREAM, 0);
udp_sd = socket(AF_INET, SOCK_DGRAM, 0);

What you've done with these calls is to tell the kernel that you want a TCP (UDP) socket from the Internet (INET) protocol family, and the third argument is zero because life is better that way (zero asks for the default protocol for the given family and socket type). Strangely, the kernel is happy to know this information, but the socket at this stage of its existence is really pretty useless. Oh. If the kernel is unhappy about your intentions, it will let you know by returning a -1.
To make your sockets more interesting, you need to bind them to actual addresses. This binding typically takes two forms, depending on whether you wish the socket to be a server socket or a client socket. More specifically, if you know that some other entity will be connecting to your socket, you are loosely thinking of the socket as a "server" socket. Otherwise, your socket is a "client" socket.
To bind your server socket to a well-known address you must use the bind() system call (why would they name it that?).
#include <string.h>
#include <netinet/in.h>
#include <sys/socket.h>

int tcp_sd;
struct sockaddr_in sa;
unsigned short my_port = 8293;

/*
 * make a socket
 */
tcp_sd = socket(AF_INET, SOCK_STREAM, 0);

memset(&sa, 0, sizeof(sa));
sa.sin_family = AF_INET;
sa.sin_addr.s_addr = htonl(INADDR_ANY);
sa.sin_port = htons(my_port);    /* ports are 16 bits: htons(), not htonl() */

bind(tcp_sd, (struct sockaddr *)&sa, sizeof(sa));

This example bears a little explaining. Calling socket() creates a TCP socket, as before. To bind that socket with bind() we need to specify an address in a sockaddr_in structure, with the family set to AF_INET. If we had a particular IP address in mind, we could specify it. Often, though, a server wants to be able to respond to any connection that is attempted to a given port (in this case, 8293). Unix and Linux both provide INADDR_ANY as a wildcard IP address constant in <netinet/in.h>.
argument problems -- But there is a problem and if you ask me, and I know I do frequently, this problem is the reason they should have thrown sockets out as a general mechanism and should have made them IP specific. The call to bind() doesn't take a struct sockaddr_in * as an argument. Instead, it takes a struct sockaddr *, which points to a "generic" socket address structure. Okay, if it is "generic", shouldn't it be a void * or caddr_t? I'd feel better about things if there was a conversion routine or a packaging routine or something, but that isn't the way it is done. Instead, the common parlance is to simply cast the address of your sockaddr_in to a struct sockaddr * and then let things fly. Why does this interface suck? Because typing the second argument as a struct sockaddr * implies that it has a data type and, as such, a size (at least). Giving a size as the third argument implies that the size can be anything. C does not support structured data types of indeterminate length. Yes -- I know I'm nit picking, but if you are going to go through all of the pain and suffering of this interface, you might as well get it right.
Anyway, the call to bind() then attaches the socket to any IP address associated with the machine, but to port 8293.
Notice, also, that we had to convert the address and port numbers to network byte order before using them in the bind() call.
You might think that after a call to socket() and a call to bind(), your program would be in possession of a working server socket. Yes -- well -- we can dream. Unfortunately, the socket() call actually creates a client socket by default. Never mind that the call to bind() specifies INADDR_ANY which makes little sense for a client. Think of it for a minute. I'm a client and I want to connect to a server using all legal addresses for my machine? Sigh. Sockets are supposed to be general, and this is definitely specific knowledge about the IP protocol. For the socket routines to recognize that you are writing a server from the arguments to bind(), they would need to understand the contents of sockaddr_in. They explicitly do not.
Okay, so you don't really have a server socket yet. To tell the kernel that, in fact, you plan to treat the socket as a server socket, you need to issue a call to listen().
#include <string.h>
#include <netinet/in.h>
#include <sys/socket.h>

int tcp_sd;
struct sockaddr_in sa;
unsigned short my_port = 8293;

/*
 * make a socket
 */
tcp_sd = socket(AF_INET, SOCK_STREAM, 0);

memset(&sa, 0, sizeof(sa));
sa.sin_family = AF_INET;
sa.sin_addr.s_addr = htonl(INADDR_ANY);
sa.sin_port = htons(my_port);

bind(tcp_sd, (struct sockaddr *)&sa, sizeof(sa));
listen(tcp_sd, 5);

The listen() call does two things. First, it tells the kernel that the socket will be a server socket. What this really means is that the TCP state is set to LISTEN so that the socket is the "contactee" and not the "contactor" in a rendezvous. Second, its second argument sets the backlog value for the socket, which is the number of connections you are requesting the kernel to buffer for you.
backlog -- Why is the kernel buffering anything for you at all? As a server process, it turns out that it has to in order to avoid a race condition. Consider what happens if it does not. A client connects to your server, but before Unix schedules your process so that you can service the connection (with an accept() call -- see below), a second client attempts a connection. That second client's attempt will fail with a "connection refused" response because the socket is engaged, but not serviced. Why the kernel just doesn't give you the maximum by default, we just don't know. It used to be that the maximum number was 5, but with web servers and things like the Kenneth Starr report, 5 has turned out to be a bit small. I'm old, though, so I'll stick with 5 in my examples.
#include <string.h>
#include <netinet/in.h>
#include <sys/socket.h>

int tcp_sd;
int new_sd;
struct sockaddr_in sa;
struct sockaddr_in remote_addr;
socklen_t remote_len;
unsigned short my_port = 8293;

/*
 * make a socket
 */
tcp_sd = socket(AF_INET, SOCK_STREAM, 0);

memset(&sa, 0, sizeof(sa));
sa.sin_family = AF_INET;
sa.sin_addr.s_addr = htonl(INADDR_ANY);
sa.sin_port = htons(my_port);

bind(tcp_sd, (struct sockaddr *)&sa, sizeof(sa));
listen(tcp_sd, 5);

/*
 * accept() takes a pointer to the length, not the length itself
 */
remote_len = sizeof(remote_addr);
new_sd = accept(tcp_sd, (struct sockaddr *)&remote_addr, &remote_len);

Now, to me, this call is really confusing. Not only do we get to enjoy the usual magic dance of the "generic" structured data type casting, but we get back a new socket. A NEW SOCKET?? I thought these were supposed to be file descriptors. They are not.
server sockets and active sockets -- It turns out that while the "generic" socket address is really an unstructured data type that is implemented as a structure for no good reason, the socket descriptor is a structured data type that is implemented as an integer. Sigh. I cannot reconstruct the thinking that went into this decision, but the way it works on the server end is that there are two kinds of sockets -- listening sockets and active sockets -- and they are both represented by the same data type (the integer). The semantics of this call are that a new socket descriptor is created that is in the connected state if a client has attempted to connect. Otherwise, the call blocks (indefinitely) waiting for a connection. When a connection is made, the new descriptor is created, and the call unblocks returning the new descriptor. The old descriptor remains the old descriptor, complete with its original characteristics.
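Putting the server-side steps together with error checking, a minimal sketch might look like the following. The function names make_listener() and accept_one() are hypothetical, chosen only for this illustration.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP socket bound to INADDR_ANY:port and
 * put it in the LISTEN state. Returns the listening descriptor or -1. */
int make_listener(uint16_t port, int backlog)
{
    struct sockaddr_in sa;
    int sd = socket(AF_INET, SOCK_STREAM, 0);
    if (sd < 0)
        return -1;

    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(INADDR_ANY);
    sa.sin_port = htons(port);           /* 16-bit quantity, so htons() */

    if (bind(sd, (struct sockaddr *)&sa, sizeof(sa)) < 0 ||
        listen(sd, backlog) < 0) {
        close(sd);
        return -1;
    }
    return sd;
}

/* Hypothetical helper: block until a client connects and return the new,
 * connected descriptor. The listening descriptor is left untouched and
 * can be handed to accept() again for the next client. */
int accept_one(int listen_sd, struct sockaddr_in *peer)
{
    socklen_t len = sizeof(*peer);       /* in: buffer size, out: bytes used */
    return accept(listen_sd, (struct sockaddr *)peer, &len);
}
```

A server loop would call make_listener(8293, 5) once and then accept_one() repeatedly, doing its reads and writes on each returned descriptor and closing it when finished. Passing port 0 asks the kernel to pick a free port, which is handy for experiments.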
#include <string.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/socket.h>

int tcp_sd;
unsigned short my_port = 8293;
struct sockaddr_in server_addr;

tcp_sd = socket(AF_INET, SOCK_STREAM, 0);

memset(&server_addr, 0, sizeof(server_addr));
server_addr.sin_family = AF_INET;
inet_aton("207.171.183.16", &server_addr.sin_addr);
server_addr.sin_port = htons(my_port);

connect(tcp_sd, (struct sockaddr *)&server_addr, sizeof(server_addr));
Here, there are two things you need to understand. The first is the call to
inet_aton() which converts an ASCII string in dotted notation to an
Internet address in network byte order. The second is the call to
connect() which takes, as its arguments, a sockaddr specifying
the address of the remote server with which the socket is to be connected.
If the server specified by the address is there and listening, the call to
connect() will block until the connection is established.
If the server is not there, the call will terminate with an error. The
time it takes to terminate, however, is a function of what the remote end and
the network send back. If the machine is there, but it knows there is no
server listening to the specified port, an RST is usually sent back that
causes the connect to fail immediately. Otherwise, the call will hang
while the kernel retries for an implementation-specific amount of time.
When the connect() completes successfully, however, the socket specified as its first argument is the connected socket. Sigh. Just like accept()? Not.
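The client side, with the return codes checked, can be sketched as a single helper. The name connect_to() is invented for this example, and it uses the POSIX routine inet_pton() where the code above uses inet_aton().

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: connect a new TCP socket to dotted_ip:port.
 * Returns the connected descriptor, or -1 with errno set. */
int connect_to(const char *dotted_ip, uint16_t port)
{
    struct sockaddr_in server_addr;
    int sd;

    memset(&server_addr, 0, sizeof(server_addr));
    server_addr.sin_family = AF_INET;
    server_addr.sin_port = htons(port);
    if (inet_pton(AF_INET, dotted_ip, &server_addr.sin_addr) != 1) {
        errno = EINVAL;                  /* string was not a dotted address */
        return -1;
    }

    sd = socket(AF_INET, SOCK_STREAM, 0);
    if (sd < 0)
        return -1;

    if (connect(sd, (struct sockaddr *)&server_addr,
                sizeof(server_addr)) < 0) {
        int saved = errno;               /* close() may overwrite errno */
        close(sd);
        errno = saved;
        return -1;
    }
    return sd;                           /* same descriptor, now connected */
}
```

Unlike accept(), there is no second descriptor here: the descriptor that socket() returned is the one that ends up connected, so the helper simply hands it back.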
At this point, you have enough to write a simple client/server application using sockets, assuming that you have the dotted-notation IP addresses for the servers, at least. The last piece you need to make this process seem natural is some understanding of how to convert a machine "name" as we think of it to an IP address. Overwhelmingly, the way this is done in almost all environments today is with the Domain Name System (DNS), although that hasn't always been true. There is a set of calls, though, that lets you ask the local resolver library (which is usually contacting the DNS) for name-to-address translations and vice versa.
The most portable conversion routine, at present, is gethostbyname().
#include <netdb.h>
struct hostent *hn;
hn = gethostbyname("amazon.com");
This call returns a pointer to a struct hostent
struct hostent {
    char  *h_name;        /* official name of host */
    char **h_aliases;     /* alias list */
    int    h_addrtype;    /* host address type */
    int    h_length;      /* length of address */
    char **h_addr_list;   /* list of addresses */
};
There are a couple of tricky parts here. First, you get back a pointer to a
struct hostent that you didn't allocate. How is this possible? It is
because gethostbyname() is using a static buffer to hold the structure.
WARNING: If you are using threads, this static buffer is a global
variable. You must protect it as a critical region.
The other tricky part concerns the actual addresses. The field h_addr_list points to an array of char * pointers, each one of which points to an array of address bytes, in network byte order. The last pointer in this array is NULL, and each address is h_length bytes long.
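Regarding the static-buffer warning: POSIX also defines getaddrinfo(), which returns a result list that the caller owns and frees, so there is no shared buffer to protect. It is not the routine these notes use, so treat the following as a sketch of an alternative rather than a replacement for the examples below. The function name first_ipv4() is made up for this illustration.

```c
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: resolve name to its first IPv4 address and write
 * it in dotted notation into out. Returns 0 on success, -1 on failure. */
int first_ipv4(const char *name, char *out, socklen_t outlen)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    int rc = -1;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;           /* IPv4 only, to match the text */
    hints.ai_socktype = SOCK_STREAM;     /* avoid one entry per socket type */

    if (getaddrinfo(name, NULL, &hints, &res) != 0)
        return -1;

    if (res != NULL) {
        const struct sockaddr_in *sin =
            (const struct sockaddr_in *)res->ai_addr;
        if (inet_ntop(AF_INET, &sin->sin_addr, out, outlen) != NULL)
            rc = 0;
    }
    freeaddrinfo(res);                   /* the result list is caller-owned */
    return rc;
}
```

A buffer of INET_ADDRSTRLEN bytes is large enough for any dotted IPv4 string. Passing a dotted-notation string as the name is also accepted and simply converts it without a DNS query.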
Let's look at an example. Before we do, however, you need to understand another portability point.
While most Unix and Linux systems give you the necessary socket calls as part of the default invocation of the compiler, Sun is different. When you compile a program containing socket and/or DNS routines on a Solaris system you must include
-lsocket -lnsl
as part of your load line. It turns out that you need to use autoconf and configure to handle this difficulty automatically. In the makefile, I handle it manually by including a LIBS constant. If you want to use the makefile on Solaris, uncomment that line. And if you ever work in a position of authority at Sun, and you do not immediately institute an engineering change that adds the socket and network routines to the standard compile line (being the networking-is-the-computer company), I will personally hunt you down. You have been warned.
The code contained in my_dns1.c takes a DNS name from the command line and prints out the dotted-notation ASCII for its IP address.
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>
#include <errno.h>
#include <signal.h>
#include <math.h>
#include <string.h>
int main(int argc, char *argv[])
{
struct hostent *hn;
int i;
struct in_addr addr;
if(argc < 2)
{
fprintf(stderr,"usage: my_dns1 dns_name\n");
fflush(stderr);
exit(1);
}
hn = gethostbyname(argv[1]);
if(hn == NULL)
{
fprintf(stderr,"gethostbyname() failed for %s\n",argv[1]);
fflush(stderr);
exit(1);
}
/*
* loop through address list
*/
i = 0;
while(hn->h_addr_list[i] != NULL)
{
/*
* this is ugly -- inet_ntoa() takes a structure as a value
* parameter
*
* please don't ever do this
*/
memcpy(&addr,hn->h_addr_list[i],sizeof(addr));
fprintf(stdout,"%s\n", inet_ntoa(addr));
i++;
}
return(0);
}
There are a couple of things to notice from this example. The first is that
each element of the array h_addr_list is, in fact, a pointer to an
array of bytes containing an address. The second, is that the call
inet_ntoa() takes a struct in_addr and not a pointer to a
structure as an argument. It used to be that all C compilers would let you
perform various machinations using casting before a final dereference to get
the address value, but the portable way to do that now is to copy the address
data into a structure of the correct type using memcpy() and then to
pass the target variable into inet_ntoa().
This second example illustrates just exactly how bad things can be when decisions about compatibility are not made by rational human beings. The file my_dns2.c extends my_dns1.c in a very small way. Instead of taking only a DNS name as an argument, it will take either a DNS name or a dotted-notation IP address on the command line. If an address is specified, the ASCII has to be translated into a network-byte order struct in_addr before it can be passed to gethostbyaddr() to retrieve the hostent.
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>
#include <errno.h>
#include <signal.h>
#include <math.h>
#include <string.h>
int main(int argc, char *argv[])
{
int is_addr;
struct hostent *hn;
int i;
struct in_addr addr;
if(argc < 2)
{
fprintf(stderr,"usages: my_dns2 dns_name\nmy_dns2 ip_addr\n");
fflush(stderr);
exit(1);
}
/*
* first, try to make an ip address out of the string passed as the
* first argument
*/
#ifdef sun
is_addr = inet_addr(argv[1]);
if(is_addr == -1)
{
is_addr = 0;
}
else
{
memcpy(&addr,&is_addr,sizeof(addr));
is_addr = 1;
}
#else
is_addr = inet_aton(argv[1],&addr);
#endif
/*
* if this worked, argv[1] is a dotted notation IP address =>
* gethostbyaddr()
*/
if(is_addr)
{
hn = gethostbyaddr((char *)&addr, sizeof(addr), AF_INET);
if(hn == NULL)
{
fprintf(stderr,"gethostbyaddr() failed for %s\n",
argv[1]);
fflush(stderr);
exit(1);
}
}
else
{
hn = gethostbyname(argv[1]);
if(hn == NULL)
{
fprintf(stderr,"gethostbyname() failed for %s\n",
argv[1]);
fflush(stderr);
exit(1);
}
}
/*
* print its official host name
*/
fprintf(stdout,"host name: %s\n",hn->h_name);
/*
* loop through the alias list
*/
i = 0;
while(hn->h_aliases[i] != NULL)
{
fprintf(stdout,"alias: %s\n",hn->h_aliases[i]);
i++;
}
/*
* loop through address list
*/
i = 0;
while(hn->h_addr_list[i] != NULL)
{
/*
* this is ugly -- inet_ntoa() takes a structure as a value
* parameter
*
* please don't ever do this
*/
memcpy(&addr,hn->h_addr_list[i],sizeof(addr));
fprintf(stdout,"address: %s\n", inet_ntoa(addr));
i++;
}
return(0);
}
It turns out that the function inet_aton() is not supported under
Solaris. Instead, an older "obsolete" function inet_addr() is
available, but of course, it works very differently. I tested the code shown
above on Linux, Solaris on SPARC, and Solaris on X86 and it worked. I'll
leave it to you to consult the relevant man pages to understand why, but the
bottom line is that writing portable code can be exasperating.
waiting for multiple events -- Okay, let's say that you are writing a server that calculates pi or something and the pseudocode for your server looks something like:
do all of the junk necessary to create a server socket;
do forever
    accept() an incoming request;
    spawn a thread to compute pi;
end do

and you want your code to do a graceful shutdown if one of your threads encounters an error. What do you do? In particular, how does your dispatch thread learn of the error code that is going to come back when the thread calls pthread_exit()? It turns out that the answer is not all that clean. First, the problem is that it is hard to make a Unix thread (or process) wait for either a socket event or a thread/child exit. Notice that if the main thread synchronizes with any or all of the children, it will block waiting for at least one of them to terminate. But if it is blocked in a pthread_cond_wait() call, it isn't blocked in the accept() call so the server socket goes unserved.
Alternatively, if the server thread blocks in accept() it won't wake up to reap the status code (or whatever) from any terminating threads.
The best way to solve this problem actually requires another system call: select(). The select() call lets a Unix process or thread wait for activity on one of a number of different file descriptors. When the call terminates, the file descriptor(s) that is (are) "ready" is (are) indicated in a bit mask that is passed back as an out parameter.
To solve the wait problem, then, the server thread can open a socket (or a pipe if you are really brave) to each child thread and add the socket to the set of file descriptors it wishes to wait for. It also puts the server socket in that list. When it blocks, it blocks in select(). Here is some pseudocode:
do all of the junk necessary to create a server socket;
add server_socket to select list;
do forever
    select(select list);
    if(socket that is ready is server socket)
        accept() an incoming request;
        make a new socket for thread and add to select list;
        spawn a thread to compute pi;
    else /* must be a child terminating */
        find thread associated with the ready socket
        join with thread to reap status
        close and discard socket
end do
I've really glossed over quite a few nitty-gritty details here. First, select
modifies its file descriptor parameters (they are in-out parameters). What
that means is that the main thread must keep a global list and re-initialize
the arguments each time. Next, the main thread would need to be able to map
from a socket descriptor to a thread ID. A hash table would do. The reason
is that when the select() unblocks, the main thread needs to determine
the ID of the thread that caused the unblock. The point here is that what
looks like a simple problem might require several hundred lines of code to
implement properly and portably. Such is the current state of portable (and
hence Grid) programming.
There are three ways to implement time-limited connections using sockets: signals, non-blocking sockets and select, and time-out threads. We'll look at one in detail and discuss the other two more abstractly.
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include "socket_signal.h"
static void TO_Handler(int sig)
{
return;
}
int Accept(int s, struct sockaddr *addr, socklen_t *addrlen, unsigned int sec)
{
void (*was)(int); /* old signal handler */
int ret_code;
/*
* make sure the signal causes blocking system calls to unblock
*/
if(siginterrupt(SIGALRM, 1) < 0)
{
printf("siginterrupt failed\n");
exit(-1);
}
/*
* set our sig handler and remember the one that was there
*/
was = signal(SIGALRM,TO_Handler);
/*
* set the timeout alarm
*/
alarm(sec);
ret_code = accept(s, addr, addrlen);
/*
* accept done, cancel alarm
*/
alarm(0);
/*
* reset the signal handler
*/
signal(SIGALRM, was);
/*
* ret_code will be -1 if accept died as a result of alarm
*/
return(ret_code);
}
int Connect(int sockfd, const struct sockaddr *serv_addr, socklen_t addrlen,
unsigned int sec)
{
void (*was)(int); /* old signal handler */
int ret_code;
/*
* make sure the signal causes blocking system calls to unblock
*/
if(siginterrupt(SIGALRM, 1) < 0)
{
printf("siginterrupt failed\n");
exit(-1);
}
/*
* set our sig handler and remember the one that was there
*/
was = signal(SIGALRM,TO_Handler);
/*
* set the timeout alarm
*/
alarm(sec);
ret_code = connect(sockfd, serv_addr, addrlen);
/*
* connect done, cancel alarm
*/
alarm(0);
/*
* reset the signal handler
*/
signal(SIGALRM, was);
/*
* ret_code will be -1 if connect died as a result of alarm
*/
return(ret_code);
}
int Send(int s, const void *msg, size_t len, int flags, unsigned int sec)
{
void (*was)(int); /* old signal handler */
int ret_code;
/*
* make sure the signal causes blocking system calls to unblock
*/
if(siginterrupt(SIGALRM, 1) < 0)
{
printf("siginterrupt failed\n");
exit(-1);
}
/*
* set our sig handler and remember the one that was there
*/
was = signal(SIGALRM,TO_Handler);
/*
* set the timeout alarm
*/
alarm(sec);
ret_code = send(s, msg, len, flags);
/*
* send done, cancel alarm
*/
alarm(0);
/*
* reset the signal handler
*/
signal(SIGALRM, was);
/*
* ret_code will be -1 if send died as a result of alarm
*/
return(ret_code);
}
int Recv(int s, void *buf, size_t len, int flags, unsigned int sec)
{
void (*was)(int); /* old signal handler */
int ret_code;
/*
* make sure the signal causes blocking system calls to unblock
*/
if(siginterrupt(SIGALRM, 1) < 0)
{
printf("siginterrupt failed\n");
exit(-1);
}
/*
* set our sig handler and remember the one that was there
*/
was = signal(SIGALRM,TO_Handler);
/*
* set the timeout alarm
*/
alarm(sec);
ret_code = recv(s, buf, len, flags);
/*
* recv done, cancel alarm
*/
alarm(0);
/*
* reset the signal handler
*/
signal(SIGALRM, was);
/*
* ret_code will be -1 if recv died as a result of alarm
*/
return(ret_code);
}
It turns out that you really only need to worry about four calls (at a
minimum): accept(), connect(), send(), and recv(), each of which is wrapped above.
There are two subtleties that you should notice. First, the call to siginterrupt() is important. It tells the operating system that system calls should not be restarted after a signal is received by the process. On some systems (including many versions of Linux), the default action is to go back and restart the system call (thereby negating the signal's ability to interrupt and cause a time out).
The other subtlety is really just a matter of fastidious signal programming style. The wrapper codes save off the signal handler that was in place and then restore it before they return.
To see these calls in action, check out the code in echo.c which implements a small echo client and server (under the control of a pre-processor definition). I strongly recommend that you study this code, install it on a machine or two, and play around with it. In particular, change the definitions of SERVER_TIMEOUT and CLIENT_TIMEOUT to convince yourself that they work.
Using alarm() to implement timeouts is fairly portable in a single-threaded program. In a threaded program, however, that uses pthreads there is an important issue to consider: pthreads DOES NOT specify that a thread posting an alarm signal request will be the thread to receive the request. According to the POSIX standard, the implementation is free to deliver a process signal to ANY thread in the process -- not necessarily the one that requests it.
You can get around this difficulty by using the select() system call in a clever way. Consider the code in socket_select.c. Each of the routines we have defined uses a select() to block before making a system call on a socket, but with a timeout. If the select times out, the subsequent blocking call is not made.
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/select.h>
#include "socket_select.h"

int Accept(int s, struct sockaddr *addr, socklen_t *addrlen, unsigned int sec)
{
    fd_set fds;
    int ret_code;
    int nfds;
    struct timeval time_out;

    /*
     * set the socket as the only one to check
     */
    FD_ZERO(&fds);
    FD_SET(s,&fds);
    nfds = s+1; /* check up to the fd we've passed in */

    /*
     * set the timeout value
     */
    time_out.tv_sec = sec;
    time_out.tv_usec = 0;

    /*
     * wait until the socket becomes readable
     * or a time out has occurred
     */
    ret_code = select(nfds,&fds,NULL,NULL,&time_out);
    if(ret_code > 0)
    {
        ret_code = accept(s, addr, addrlen);
    }
    else
    {
#ifdef VERBOSE
        fprintf(stderr,"Accept: error\n");
        fflush(stderr);
#endif
    }
    return(ret_code);
}

int Connect(int s, const struct sockaddr *serv_addr, socklen_t addrlen,
            unsigned int sec)
{
    fd_set fds;
    int status;
    int ret_code;
    int nfds;
    struct timeval time_out;
    int flags;

    /*
     * get the current flags
     */
    flags = fcntl(s, F_GETFL, 0);
    if(flags < 0)
    {
#ifdef VERBOSE
        fprintf(stderr,"Connect: fcntl to get flags failed\n");
        fflush(stderr);
#endif
        return(-1);
    }
    status = fcntl(s, F_SETFL, flags | O_NONBLOCK);
    if(status < 0)
    {
#ifdef VERBOSE
        fprintf(stderr,"Connect: fcntl to set flags failed\n");
        fflush(stderr);
#endif
        return(-1);
    }

    /*
     * try a non-blocking connect
     */
    connect(s, serv_addr, addrlen);

    /*
     * set the socket as the only one to check
     */
    FD_ZERO(&fds);
    FD_SET(s,&fds);
    nfds = s+1; /* check up to the fd we've passed in */

    /*
     * set the timeout value
     */
    time_out.tv_sec = sec;
    time_out.tv_usec = 0;

    /*
     * select and block until the socket becomes writable (the
     * connection attempt has completed)
     */
    ret_code = select(nfds,NULL,&fds,NULL,&time_out);
    if(ret_code <= 0)
    {
#ifdef VERBOSE
        fprintf(stderr,"Connect: time out failure\n");
        fflush(stderr);
#endif
    }

    /*
     * now set the socket back to blocking
     */
    flags = fcntl(s, F_GETFL, 0);
    if(flags < 0)
    {
#ifdef VERBOSE
        fprintf(stderr,"Connect: fcntl to get flags failed\n");
        fflush(stderr);
#endif
        return(-1);
    }
    status = fcntl(s, F_SETFL, flags & ~O_NONBLOCK);
    if(status < 0)
    {
#ifdef VERBOSE
        fprintf(stderr,"Connect: fcntl to set flags failed\n");
        fflush(stderr);
#endif
        return(-1);
    }
    return(ret_code);
}

int Send(int s, const void *msg, size_t len, int flags, unsigned int sec)
{
    fd_set fds;
    int ret_code;
    int nfds;
    struct timeval time_out;

    /*
     * set the socket as the only one to check
     */
    FD_ZERO(&fds);
    FD_SET(s,&fds);
    nfds = s+1; /* check up to the fd we've passed in */

    /*
     * set the timeout value
     */
    time_out.tv_sec = sec;
    time_out.tv_usec = 0;

    ret_code = select(nfds,NULL,&fds,NULL,&time_out);
    if(ret_code <= 0)
    {
#ifdef VERBOSE
        fprintf(stderr,"Send: select failed\n");
        fflush(stderr);
#endif
        return(ret_code);
    }
    ret_code = send(s, msg, len, flags);
    return(ret_code);
}

int Recv(int s, void *buf, size_t len, int flags, unsigned int sec)
{
    fd_set fds;
    int ret_code;
    int nfds;
    struct timeval time_out;

    /*
     * set the socket as the only one to check
     */
    FD_ZERO(&fds);
    FD_SET(s,&fds);
    nfds = s+1; /* check up to the fd we've passed in */

    /*
     * set the timeout value
     */
    time_out.tv_sec = sec;
    time_out.tv_usec = 0;

    ret_code = select(nfds,&fds,NULL,NULL,&time_out);
    if(ret_code <= 0)
    {
#ifdef VERBOSE
        fprintf(stderr,"Recv: select failed %d\n",ret_code);
        fflush(stderr);
#endif
        return(ret_code);
    }
    ret_code = recv(s, buf, len, flags);
    return(ret_code);
}

There are two noteworthy aspects of this code to consider. The first is that it assumes there is no race condition between when a select() call completes and the subsequent blocking call. That is, a socket will not become blocking after a select() completes and before another call is made on it. That is a fairly safe assumption given most implementations.
The second issue is with connect(). Because connect() is the call that starts the connection, there is nothing for select() to wait on before it is made, so it is not possible to use select() to protect it the way we protect the other calls. The solution is to use the fcntl() command to set the status of the file descriptor to be non-blocking, call connect(), and then use select() to tell you when the connect() has completed.
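The non-blocking connect() sequence just described can be sketched as follows. This is a minimal sketch, not the lecture's Connect() wrapper: the names connect_timeout() and loopback_demo() are hypothetical, and it assumes a system where a non-blocking connect() reports EINPROGRESS and where the pending result can be read back with getsockopt() and SO_ERROR. The SO_ERROR check matters because a socket can be reported writable after a failed connect as well as after a successful one.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>

/*
 * connect_timeout() is a hypothetical helper, not part of the lecture code.
 * Returns 0 if the connection completed within sec seconds, -1 otherwise.
 */
int connect_timeout(int s, const struct sockaddr *addr, socklen_t len,
                    unsigned int sec)
{
    int result = -1;
    int flags = fcntl(s, F_GETFL, 0);

    if (flags < 0 || fcntl(s, F_SETFL, flags | O_NONBLOCK) < 0)
        return -1;

    if (connect(s, addr, len) == 0) {
        result = 0;                      /* connected immediately */
    } else if (errno == EINPROGRESS) {
        fd_set wfds;
        struct timeval tv;

        FD_ZERO(&wfds);
        FD_SET(s, &wfds);
        tv.tv_sec = sec;
        tv.tv_usec = 0;

        /* writable means the connect finished, for better or worse */
        if (select(s + 1, NULL, &wfds, NULL, &tv) > 0) {
            int err = 0;
            socklen_t elen = sizeof(err);

            if (getsockopt(s, SOL_SOCKET, SO_ERROR, &err, &elen) == 0 &&
                err == 0)
                result = 0;
        }
    }

    /* put the descriptor back the way we found it */
    if (fcntl(s, F_SETFL, flags) < 0)
        return -1;
    return result;
}

/* Usage sketch: connect to a listener on the loopback interface. */
int loopback_demo(void)
{
    struct sockaddr_in addr;
    socklen_t alen = sizeof(addr);
    int ls = socket(AF_INET, SOCK_STREAM, 0);
    int cs = socket(AF_INET, SOCK_STREAM, 0);
    int rc = -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                   /* let the kernel pick a port */

    if (ls >= 0 && cs >= 0 &&
        bind(ls, (struct sockaddr *)&addr, sizeof(addr)) == 0 &&
        listen(ls, 1) == 0 &&
        getsockname(ls, (struct sockaddr *)&addr, &alen) == 0)
        rc = connect_timeout(cs, (struct sockaddr *)&addr, sizeof(addr), 2);

    if (cs >= 0) close(cs);
    if (ls >= 0) close(ls);
    return rc;
}
```

The helper restores the original flags rather than simply clearing O_NONBLOCK, so a caller that had already made the socket non-blocking gets it back unchanged.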
problems with non-blocking I/O -- The issue with this approach is portability. First, not all systems treat socket descriptors as true file descriptors, so fcntl() does not have exactly the same effect as it would on a normal file. Some systems, like Solaris, provide a non-blocking setsockopt() call, but again -- not all. Finally, while non-blocking I/O will work on most systems for read() and write(), their behavior with respect to connect() and accept() is less clear. The select() call takes three file descriptor bit masks: read descriptors, write descriptors, and exception descriptors. What category does a connect() descriptor fall into? It isn't exactly clear, as it isn't for accept(). You'd hope that a socket would be both readable and writable only after a connect() completes. For accept(), though, the socket is being created. Do you consider the server socket (which cannot be read or written) readable and writable if an accept() will pay off? I don't know, but I suspect the behavior isn't uniform across implementations. POSIX does define aio_read() and aio_write() calls (which are not yet available everywhere) but that doesn't clear up the confusion with connect() and accept(). Thus be warned: this approach works on our systems but might cause problems elsewhere.
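One way to find out how a particular system answers the accept() question is to ask it. The sketch below is only a probe, under the assumption that the loopback interface is available; wait_readable() and pending_accept_demo() are hypothetical names, not part of the lecture code. It checks that a listening socket is not reported readable while nothing is pending, and that it is reported readable once a client connection has been queued, which is the behavior the Accept() wrapper above relies on.

```c
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>

/*
 * Hypothetical helper: returns 1 if s is reported readable within
 * sec seconds, 0 on a time out, and -1 on error.
 */
int wait_readable(int s, unsigned int sec)
{
    fd_set rfds;
    struct timeval tv;
    int rc;

    FD_ZERO(&rfds);
    FD_SET(s, &rfds);
    tv.tv_sec = sec;
    tv.tv_usec = 0;

    rc = select(s + 1, &rfds, NULL, NULL, &tv);
    if (rc > 0)
        return FD_ISSET(s, &rfds) ? 1 : 0;
    return rc;
}

/*
 * Probe: returns 0 if the listening socket was quiet before the
 * client connected and readable afterwards, -1 otherwise.
 */
int pending_accept_demo(void)
{
    struct sockaddr_in addr;
    socklen_t alen = sizeof(addr);
    int ls = socket(AF_INET, SOCK_STREAM, 0);
    int cs = socket(AF_INET, SOCK_STREAM, 0);
    int before = -1;
    int after = -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                   /* let the kernel pick a port */

    if (ls >= 0 && cs >= 0 &&
        bind(ls, (struct sockaddr *)&addr, sizeof(addr)) == 0 &&
        listen(ls, 1) == 0 &&
        getsockname(ls, (struct sockaddr *)&addr, &alen) == 0) {
        before = wait_readable(ls, 0);   /* nothing pending yet */
        if (connect(cs, (struct sockaddr *)&addr, sizeof(addr)) == 0)
            after = wait_readable(ls, 2); /* one connection queued */
    }

    if (cs >= 0) close(cs);
    if (ls >= 0) close(ls);
    return (before == 0 && after == 1) ? 0 : -1;
}
```

A result of 0 says only that the system it ran on behaves this way; it is not evidence about any other implementation.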
timeout threads -- The other option is to spawn a separate thread that sleeps for the timeout interval, and closes the socket if a timeout occurs. The thread making the blocking call, then, must cancel the timeout thread if a timeout does not occur. This approach has several potential problems. The first is that it is not clear what behavior you expect if one thread is blocked reading a socket and another closes that socket. You'd hope that the thread that is blocked would be unblocked because the socket is technically dead, but I'm not sure you can count on it. Even if that does work, though, there is a serious performance consideration in spawning and canceling a thread for each timeout.
#ifdef SERVER
char *ARGS = "p:\n";
void *ServerThread(void *arg)
{
    int incoming_sock = (int)arg;
    char send_buf[255];
    char recv_buf[255];
    int recvd;
    int sent;

    memset(send_buf,0,sizeof(send_buf));
    memset(recv_buf,0,sizeof(recv_buf));

    fprintf(stdout,"connected.\n");
    fflush(stdout);

    fprintf(stdout,"waiting for string...");
    fflush(stdout);
    recvd = RecvString(incoming_sock,recv_buf,sizeof(recv_buf));
    if(recvd < 0)
    {
        fprintf(stdout,"failed in server thread.\n");
        fflush(stdout);
        HangUp(incoming_sock);
        return(NULL);
    }
    fprintf(stdout,"got:\n\n\t%s\n\nfrom client\n",recv_buf);
    fflush(stdout);

    fprintf(stdout,"echoing...");
    fflush(stdout);
    sent = SendString(incoming_sock,recv_buf);
    if(sent < 0)
    {
        fprintf(stdout,",failed\n");
        fflush(stdout);
        HangUp(incoming_sock);
        return(NULL);
    }
    fprintf(stdout,"successfully\n");

    /*
     * do not do a HangUp -- the other side MAY think the
     * socket is dead before the data gets there
     */
    close(incoming_sock);

    /*
     * throw away the server socket, though
     */
    return(NULL);
}
int
main(int argc,char *argv[])
{
    int c;
    int server_sock;
    int incoming_sock;
    pthread_t tid;
    pthread_attr_t attr;

    while((c = getopt(argc,argv,ARGS)) != EOF)
    {
        switch(c)
        {
            case 'p':
                Server_port = atoi(optarg);
                break;
            default:
                fprintf(stderr,
                    "echo server unrecognized argument: %c\n",c);
                fflush(stderr);
                exit(1);
                argc = -1;
                break;
        }
    }

    if(argc < 1)
    {
        fprintf(stderr,
            "usage: server -p server port\n");
        fflush(stderr);
        exit(1);
    }

    fprintf(stdout,"server creating port %d...",
        Server_port);
    server_sock = CreateServerSocket(Server_port);
    if(server_sock < 0)
    {
        fprintf(stdout,"failed.\n");
        fflush(stdout);
        exit(1);
    }
    fprintf(stdout,"created.\n");
    fflush(stdout);

    /*
     * VERY IMPORTANT to use strncpy with sockets so that incoming
     * data can't overwrite stack
     */
    fprintf(stdout,"waiting for client...");
    fflush(stdout);

    while(1)
    {
        incoming_sock = WaitForClient(server_sock);
        if(incoming_sock < 0)
        {
            fprintf(stdout,"failed, looping back.\n");
            fflush(stdout);
            continue;
        }

        pthread_attr_init(&attr);
        pthread_attr_setdetachstate(&attr,PTHREAD_CREATE_DETACHED);
        pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);
        pthread_create(&tid, &attr,
            ServerThread, (void *)incoming_sock);

        fprintf(stdout,"server looping back after spawn\n");
        fflush(stdout);
    }

    HangUp(server_sock);
    return(0);
}
#endif
The code uses the SIGALRM protected version of the time out wrappers
discussed above. In addition, we have added a print statement to the signal
handler to indicate the process ID of the process catching the signal.
We can then test to see if the main thread gets a signal when the spawned
server thread times out by blocking the client indefinitely. Here is the
output when I block the client in gdb.
UNIX> echo_server_thread
server creating port 8008...created.
waiting for client...server looping back after spawn
connected.
waiting for string...signal handler called by 3972
RecvString: recv failed after 0
failed in server thread.
Notice that only one signal is caught. Also this version of the echo server loops back and does another accept(). If you try this yourself, you'll notice that the server keeps running just fine.
It is a good theory, but it doesn't quite tell you everything you need to know. In particular, even after the child has terminated, the rebind may fail when you restart your server. That is, you run your server and your client, you kill them both, and then you try to rerun your server and it won't work because the rebind fails. What's happening here is that the kernel doesn't know that your client has been killed. As such, it has to assume that the connection to the other end is still valid. It should know that there are no other processes locally who might have the connection open, but according to Richard, it doesn't.
Actually what is happening is that the TCP state associated with the connection must persist for some amount of time after a connection is terminated to protect against "replay." If the kernel were smart, it would let the bind occur but cause a failure if the same remote address/port pair tried to connect again within the timeout. It's not. Instead, it prevents you from re-opening the server address for an unspecified amount of time, which if you are like me, you find pretty annoying.
There is a fix, though, and it is to use the setsockopt() command to tell the kernel to stop this silly behavior. Once you have created a socket, you run
int socket_sd;
int on = 1;
/* this is really brain dead */
setsockopt(socket_sd, SOL_SOCKET, SO_REUSEADDR, (char *)&on, sizeof(on));
If ever there was a ridiculous interface, this is it. The first three arguments make some kind of sense (even though the behavior you are trying to turn off does not). The fourth argument has to be the address of an integer variable containing the number "1" and the last argument says how big the number "1" is.
Sigh.
You should know that this trick can introduce a security hole in your program because, on some systems, another process (including one that does not belong to you) can then bind to this address.
I'm sure there was a really good reason for all of this complexity at some point in the past, but it does seem that now, knowing what we know, a better job of it could be had. Anyway -- there it is.
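To show where the call goes, here is a minimal sketch of a listening socket that sets SO_REUSEADDR. The function make_reusable_listener() is a hypothetical name, not the CreateServerSocket() used in the echo server, and the backlog of 5 is an arbitrary choice. The one detail that matters is the ordering: the option is set after socket() and before bind(), since bind() is the call that would otherwise fail.

```c
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: returns a listening socket bound to port on
 * all local addresses, or -1 on failure.  Passing 0 lets the kernel
 * pick a port.
 */
int make_reusable_listener(unsigned short port)
{
    struct sockaddr_in addr;
    int on = 1;
    int s = socket(AF_INET, SOCK_STREAM, 0);

    if (s < 0)
        return -1;

    /*
     * set the option before bind(); the fourth argument is the
     * address of the integer, not the integer itself
     */
    if (setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) < 0) {
        close(s);
        return -1;
    }

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(s, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(s, 5) < 0) {
        close(s);
        return -1;
    }
    return s;
}
```

Reading the option back with getsockopt() is a cheap way to confirm that the kernel accepted it.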
Speaking of brain dead, there are a few troublesome lines in the Solaris documentation on the cond_wait() and pthread_cond_wait() thread calls. Note that pthread_cond_wait() has the same semantics as cond_wait() when you load with "-lthread" on Solaris.
When cond_wait() returns the value of the condition is indeterminate and must be reevaluated.
and
If a signal is delivered to a thread waiting for a condition variable, upon return from the signal handler the thread resumes waiting for the condition variable as if it was not interrupted, or it returns 0 due to spurious wakeup.
What does this mean to you? It means that the code in rr_condvar.c from the threads lecture that goes
pthread_mutex_lock(t->lock);
while((MyTurn % t->nthreads) != t->id)
{
    pthread_cond_wait(&(t->wait[t->id]),t->lock);
}
is, in fact, the correct code for Solaris. Why? Because if your thread
gets sent a signal for any reason, the pthread_cond_wait() call will
unblock even if no pthread_cond_signal() was made. You, therefore,
need to test the condition in a while loop even though, in the example code,
only one thread is being signalled.
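The same pattern in a self-contained form looks like the sketch below. It is not rr_condvar.c; the names ready, waiter() and condvar_demo() are hypothetical. The main thread deliberately signals once while the condition is still false, standing in for a spurious wakeup, and the while loop is what keeps the waiter from proceeding too early.

```c
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static int ready = 0;                 /* the condition, protected by lock */

static void *waiter(void *arg)
{
    (void)arg;

    pthread_mutex_lock(&lock);
    while (!ready) {
        /* re-test the condition every time the wait returns */
        pthread_cond_wait(&cond, &lock);
    }
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Returns 1 if the waiter finished only after ready was set, -1 on error. */
int condvar_demo(void)
{
    pthread_t tid;

    if (pthread_create(&tid, NULL, waiter, NULL) != 0)
        return -1;

    /* a wakeup with the condition still false */
    pthread_mutex_lock(&lock);
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&lock);

    /* the real wakeup: change the condition, then signal */
    pthread_mutex_lock(&lock);
    ready = 1;
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&lock);

    if (pthread_join(tid, NULL) != 0)
        return -1;
    return ready;
}
```

With an if in place of the while, the first signal could let the waiter through with ready still 0, depending on how the threads are scheduled.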
What conclusions can we draw from all of this? First, writing portable, robust, and useful threaded and socket code is not easy. You really have to sweat the details. Secondly, from the perspective of sockets and the alarm signal, threads behave as individual Unix or Linux processes, at least on the systems tested here. When a thread sets a signal handler or fields a signal it does not affect the rest of the process to which that thread belongs. That's a good thing since the alternative would require quite a bit of bookkeeping in the signal handlers. With all of that said, you should be careful since there are lots of ways to use threads, signals, and sockets that might cause problems. For example, what happens if a signal handler calls pthread_mutex_unlock()? I'll leave it to you to figure out what you can and cannot do, but my suggestion is to keep the interaction between these three subsystems to a minimum whenever possible.