Asynchronous Control Events

When writing networked programs, I keep running into the same problem over and over again. I want to be able to get user input, but there may be input from the network at the same time as well. This is something that a program blocked in a single read() cannot handle -- you can either wait for input from the user, or wait for input from the network. Of course, this is not entirely true, since everything is possible. There is this wonderful system call named select() that enables you to wait for multiple sources of input at once.
If there is input from the user, get it, and if there is input from the network, roll the network routines and read the input. Things are bound to get very complicated though. Think of a server that offers lots of functionality to hundreds of simultaneous clients. While it is not impossible to handle this using select() (since everything is possible, remember), the program code will get immensely complex, since you are messing with the natural flow of your program. You are trying to do two things in a single thread of execution. In my attempts to circumvent this problem I have gotten as far as developing my own implementation of 'threads' and thread scheduling. Even though the code works, I refuse to believe that operating system developers want to encourage you to develop your own little operating system that runs under theirs. In this document I describe a design that is yet another attempt to come up with something workable.

I want to be able to send control events (or messages) to the working threads. I also want to be able to use the master thread for user interaction, because it has the controlling tty. In Unix, the controlling tty is the terminal where the end-user can give input via the keyboard (amongst other things which are not relevant at this time).

Model: Process A forks a worker named B. A and B communicate through a socketpair (a socketpair is like a bi-directional pipe). There's nothing special about that... The problem, however, is that A and B need to be written in such a way that they can properly communicate with each other. They have to be perfectly synchronized. When A writes, B needs to read, and when B writes, A needs to read, otherwise things won't work.
This can be solved by putting the master process into a message handling loop. This works for programs that typically work in a 'question-answer' fashion (stateless), or for simple state machines. For programs that need states to go a number of levels deep, things can get really complicated, and remember that we also want to do user input at the same time, et cetera. So far no real solutions have been given, only more problems have arisen.
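For concreteness, the bare master/worker model (before any solution is applied) could be set up roughly as in this sketch. The names spawn_worker() and echo_worker() are made up for illustration, and the worker simply echoes bytes in strict question-answer fashion.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: fork a worker that runs worker_main(fd) and exits.
 * On success returns the master's end of the socketpair and stores the
 * worker's pid in *pid; returns -1 on failure. */
int spawn_worker(void (*worker_main)(int fd), pid_t *pid)
{
    int sv[2];

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    *pid = fork();
    if (*pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (*pid == 0) {            /* worker B keeps sv[1] */
        close(sv[0]);
        worker_main(sv[1]);
        _exit(0);
    }
    close(sv[1]);               /* master A keeps sv[0] */
    return sv[0];
}

/* Example worker: answers every byte it receives with the same byte. */
void echo_worker(int fd)
{
    char c;

    while (read(fd, &c, 1) == 1)
        if (write(fd, &c, 1) != 1)
            break;
}
```

The master writes a byte to the returned descriptor and must then read the answer before doing anything else. This lock-step is the synchronization problem described above.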

The idea is that both master and worker follow what I call a natural program flow, without any state switches cluttering the source code. For the asynchronous events, we should have event handlers. This means that when an asynchronous event occurs, I want the event handler to be called, side-stepping from the natural program flow, but not actually disturbing the natural program flow, and keeping the source code readable and cleanly structured.
This can be achieved by wrapping the existing I/O library functions with functions that check for any asynchronous input. The wrapper function works much like the Yield() mechanism I came up with for the BBX simulated threads scheduler, except that now we do not switch threads, we just call an event handling routine and return. This keeps things simple, stupid..!
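In C (the global port and event_handler() are assumed to be defined elsewhere):

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/select.h>
#include <unistd.h>

extern int port;                    /* descriptor on which control messages arrive */
extern void event_handler(void);    /* reads and processes one control message */

ssize_t fdread(int fd, char *buf, size_t num) {
    fd_set rfds;
    int err, saved_errno;
    int highest_fd = (fd > port) ? fd : port;

    for(;;) {
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        FD_SET(port, &rfds);

        /* the first argument is the highest descriptor plus one */
        err = select(highest_fd + 1, &rfds, NULL, NULL, NULL);
        if (err > 0) {
            if (FD_ISSET(fd, &rfds))     /* fd is ready to be read */
                break;

            if (FD_ISSET(port, &rfds))   /* async event has arrived */
                event_handler();
        }
        if (err < 0) {
            if (errno == EAGAIN || errno == EINTR || errno == EWOULDBLOCK)
                continue;

            /* on EBADF, find the rotten egg: if it is the caller's fd, report it */
            saved_errno = errno;
            if (saved_errno == EBADF && fcntl(fd, F_GETFD) == -1)
                return -1;

            errno = saved_errno;
            perror("fdread(): select() failed");
            exit(EXIT_FAILURE);
        }
    }
    return read(fd, buf, num);
}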
This function blocks until the data on the file descriptor has been read, while handling asynchronous events in between. The event handler reads the control message and processes it, probably via a switch. Note that we set out to avoid state switches in the code, but mind that we have now separated the natural program flow from the control flow.
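An event handler of the kind described could look like the sketch below. The one-byte message format, the message codes, and the variables quit_requested and pings_seen are all assumptions; no message format is defined in this text. Only the names port and event_handler() come from fdread().

```c
#include <unistd.h>

/* Hypothetical one-byte control messages. */
enum { MSG_PING = 'p', MSG_QUIT = 'q' };

int port = -1;              /* control descriptor, set up by the program */
int quit_requested = 0;     /* illustrative state changed by the handler */
int pings_seen = 0;

/* Read one control message from port and act on it. */
void event_handler(void)
{
    char msg;

    if (read(port, &msg, 1) != 1)
        return;             /* nothing usable arrived */

    switch (msg) {
    case MSG_PING:
        pings_seen++;
        break;
    case MSG_QUIT:
        quit_requested = 1;
        break;
    default:                /* unknown messages are ignored */
        break;
    }
}
```

The switch lives in one place only, so the code that calls fdread() never has to know which control messages exist.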

There is a problem with this function though: it is too simplistic, since there can be only one input port. Generally this is not a problem for a worker, but it is a problem for the master. Also, the event handler is hardcoded. I guess it could be a pointer that is set to a different handler each time, but that would be ugly. As an improvement, we should add multiple ports and configurable event handlers.

typedef struct Port_tag Port;

struct Port_tag {
    int fd;
    void (*handler)(Port *);
};
Done! Well err... almost. Filling in the missing bits is a bit of standard work. You can put the ports in an array or in a linked list, whatever you prefer. I would also add a rank number, and have the master assign each worker a rank. By convention, the master can be reached at rank #0.
If you are familiar with parallel programming, you will recognize this and wonder why I didn't resort to using MPI, PVM, or whatever. The main reason is that these libraries only hide the socket (or shared memory) programming and replace it with another complex library. Although they are popular and widely in use, they do not directly solve the problem presented above.
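Returning to the Port structure: one way to fill in the missing bits is sketched below, with the ports kept in a plain array. The names fdread_ports(), count_event() and events_handled are hypothetical, and the rank field is only the suggestion made above, not a fixed part of the design. Unlike fdread(), this version returns -1 on any select() failure instead of exiting.

```c
#include <errno.h>
#include <sys/select.h>
#include <unistd.h>

typedef struct Port_tag Port;

struct Port_tag {
    int fd;
    int rank;                   /* assumption: rank 0 is the master */
    void (*handler)(Port *);
};

/* Like fdread(), but watches every port in the array while waiting. */
ssize_t fdread_ports(int fd, char *buf, size_t num, Port *ports, int nports)
{
    fd_set rfds;
    int i, maxfd, n;

    for (;;) {
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        maxfd = fd;
        for (i = 0; i < nports; i++) {
            FD_SET(ports[i].fd, &rfds);
            if (ports[i].fd > maxfd)
                maxfd = ports[i].fd;
        }

        n = select(maxfd + 1, &rfds, NULL, NULL, NULL);
        if (n < 0) {
            if (errno == EINTR || errno == EAGAIN)
                continue;
            return -1;
        }

        /* side-step into the handler of every port that has a message */
        for (i = 0; i < nports; i++)
            if (FD_ISSET(ports[i].fd, &rfds))
                ports[i].handler(&ports[i]);

        if (FD_ISSET(fd, &rfds))
            return read(fd, buf, num);
    }
}

/* Example handler: consume one byte from the port and count it. */
int events_handled = 0;

void count_event(Port *p)
{
    char c;

    if (read(p->fd, &c, 1) == 1)
        events_handled++;
}
```

Each port carries its own handler, so the master can treat every worker differently without a global handler pointer being swapped around.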

There is still some work to do: a sleep() call should be wrapped the same way as we dealt with read(). Every time a program blocks is an opportunity to check the asynchronous ports. You could also insert calls to the wrapped sleep(0) into busy loops, to make the system more responsive.
The actual implementation could be done via sockets, shared memory, or message queues. That is what they are here for, but the idea of side-stepping into an event handler remains the same.
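A sleep wrapper in the same spirit might look like this sketch, which uses the select() timeout as the sleep. The names fdsleep(), drain_one() and wakeups are hypothetical, a single port is passed in for brevity, and the timing only has one-second resolution because it relies on time().

```c
#include <errno.h>
#include <sys/select.h>
#include <sys/time.h>
#include <time.h>
#include <unistd.h>

/* Sleep for `seconds`, side-stepping into handler(port) whenever the
 * port becomes readable. With seconds == 0 the port is polled once. */
void fdsleep(unsigned int seconds, int port, void (*handler)(int fd))
{
    time_t deadline = time(NULL) + (time_t)seconds;
    struct timeval tv;
    fd_set rfds;
    int n;

    do {
        time_t now = time(NULL);

        /* remaining time; zero turns select() into a non-blocking poll */
        tv.tv_sec = (now < deadline) ? (deadline - now) : 0;
        tv.tv_usec = 0;

        FD_ZERO(&rfds);
        FD_SET(port, &rfds);

        n = select(port + 1, &rfds, NULL, NULL, &tv);
        if (n > 0)
            handler(port);      /* handle the event, then keep sleeping */
        else if (n == 0 || (errno != EINTR && errno != EAGAIN))
            return;             /* timed out, or an unexpected error */
    } while (time(NULL) < deadline);
}

/* Example handler: consume one byte and count the wake-up. */
int wakeups = 0;

void drain_one(int fd)
{
    char c;

    if (read(fd, &c, 1) == 1)
        wakeups++;
}
```

Calling fdsleep(0, port, handler) inside a busy loop gives pending control messages a chance to be handled without actually delaying the loop.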

Now that we are using select() to monitor the asynchronous I/O, it becomes tempting to reintegrate the worker code back into the master process and to return to a model in which we deal with everything from within one single process. I see this as a pitfall and something you definitely should not do. By using separate processes (or threads), we have separated two entities that are logically separate, and that each have their own natural program flow. Uniting these separate entities would be illogical (as Spock would say). Keep it simple (stupid..!), and it will show in the program source code. Clean and easy to understand code is, above all, most important.
This model also opens the road that leads into the distributed environment, but that is a different story...


If you really must, you can contact the author at walter at heiho dot net