Claudio
Fleiner, Jerry
Feldman, David
Stoutamire
fleiner@,
jfeldman@,
davids@icsi.berkeley.edu
International Computer Science
Institute
Berkeley, USA
The parallel Sather (pSather) group continues its efforts to make parallel OO programming safe, efficient and relatively easy. Since the last POOMA meeting we have learned many lessons, some of which seem to be of general interest. All Sather releases since 1.0.6 in May 1995 have included pSather. The latest version runs on Solaris SMPs, networks of these linked by Myrinet or TCP/IP, and the Meiko CS-2. It is currently being used in a parallel computing course at Berkeley and for ICSI projects.
Sather is an object oriented language that has parameterized classes, object-oriented dispatch, statically-checked strong (contravariant) typing, separate implementation and type inheritance, multiple inheritance, garbage collection, iteration abstraction, higher-order routines and iters, exception handling, assertions, preconditions, postconditions, and class invariants. pSather is a parallel extension that addresses non-uniform-memory-access multiprocessor architectures but presents a shared memory model to the programmer. It extends serial Sather with threads, synchronization and data distribution. Unlike actor languages, multiple threads can execute in one object. It offers several synchronization mechanisms like futures, gates, mutex, reader/writer locks, barrier synchronization, rendezvous and a disjunctive lock statement.
The talk will briefly introduce the key features of pSather and the new modifications. These include disjunctive locks, visitor/ mutator protection and iterator yields from within a lock statement.
One of the most interesting lessons of the last year has been a major revision of how thread termination is handled. There was a "gate.clear" command that terminated all threads attached to a gate, regardless of their state. We eventually convinced ourselves that there is no way to do this that guarantees correct semantics. Meanwhile, we had been discussing whether it was worth adding disjunctive locks to the language. It turns out that disjunctive locks cleanly handle all the cases in which we were using asynchronous thread termination. We believe that both the negative and positive sides of this finding are general, but will be very interested in the experience of other groups.
pSather has a lock statement that allows a thread to safely acquire one or more locks at a time, guaranteeing correct unlocking and reasonable properties of fairness and deadlock avoidance even in the face of exceptions. The disjunctive lock is an extension that allows a thread to wait for one of several locks and, depending on the locks it acquired, execute some statements. Similar constructs are used in Ada (select statement) and in Occam (ALT statement), although both statements are used to select data from one or more input channels, like the select(2) system call in UNIX. The disjunctive lock can be used together with gates (a pSather object that implements a queue and can be used to synchronize threads) to get the same effects, but there are many other ways to use it. The general syntax of this lock statement is as follows:
lock [guard condition] when lock_expr [, lock_expr] then statements [guard condition] when lock_expr [, lock_expr] then statements ... [else statements] end
The lock_expr is supposed to return an object of type $LOCK. The standard libraries have classes for mutual exclusion locks (MUTEX), several flavours of reader/writer locks, barrier locks and rendezvous locks. As these synchronization objectst are standard pSather classes, one can add new lock types if necessary. When a thread enters a lock statement, it evaluates all guard conditions. All when branches with guard conditions that evaluate to false are ignored. The thread will then select a branch for which it can acquire all locks. If no branch is available the thread executes the optional else part, if present, or blocks. The system avoids thread starvation: if a thread can run, it will eventually do so. It also guarantees that eventually each of the different when clauses will be selected if its locks are free and their respective guard condition is true.
A thread that waits for input on one of several different gates (gates are FIFO queues with a blocking dequeue) could be written as follows:
loop lock when gate1.not_empty then work_on(gate1.dequeue); when gate2.not_empty then work_on(gate2.dequeue); end; end;
Note that gate1.not_empty does not return a boolean, but an object of the abstract type $LOCK. In this case it returns a lock object which provides mutual exclusion , but which cannot be locked until a certain condition (here: gate is not empty) holds. This incorporates an older insight; signal conditions (not_empty) must be lockable to ensure that the condition still holds when the signalled thread resumes. As long as the thread holds this lock, no other thread can enqueue or dequeue a value from the gate.
During the last year we convinced ourselves that there is no clean way to asynchronously terminate a thread in an object oriented language. Simply stopping a thread is clearly not an option, as such a thread could leave some broken objects behind, like a linked list where only some of the pointers have been updated. If pSather had such a construct, it would also need a way to protect against it. However, a thread that shields itself against the kill signal and calls a function that blocks (like gate.dequeue) cannot be killed while blocked and might never be unblocked.
A programmer that relies on the fact that a thread can be killed would have to know exactly which functions can block and make sure that the thread does not shield itself against kill signals while calling those functions. Shielding makes code reuse difficult too, as one has to know for each function if it is supposed to be called with or without shielding.
Of course an interuptable thread can poll a flag while it is running; the problem is terminating a blocked thread. One idea is to raise a special exception in this thread. This removes some of the problems above, but it is still not easy to write correct programs. Besides, such a program can only use libraries that are aware of the fact that a clear exception can happen and that shield themselves against it to make sure that no broken objects are left behind. Take for example the following code fragment:
1. protect 2. e:=gate.dequeue; 3. when CLEARED_EXCEPTION then 4. clean_up; 5. end;
If the gates are not designed to handle clear exceptions correctly, the gate may be in a unsafe state after the exception occurred and hence may not be used anymore (neither by this thread nor by another thread). If it is written correctly, we do not know in line 4 if a value has been dequeued, and even if we knew that it had been dequeued (by checking the size of the gate, for example), there is no guarantee that e holds this value, as the exception could have occurred after gate.dequeue returned, but before the return value has been assigned to e. To do it correctly one has to write
1. got_value:=false; 2. protect 3. lock gate.not_empty then 4. shield 5. e:=gate.dequeue; got_value:=true; 6. end; 7. work_on(e); 8. end; 9. when CLEARED_EXCEPTIONthen 10. if got_value(e)then safe_e end; 11. end;
Note that although we know in line 10 whether e has been dequeued or not, we have no information on how far work_on(e) has proceeded. To get this information we would have to use additional flags inside work_on(). This example shows that although it is possible to write code that works with CLEARED_EXCEPTION it is far from easy.
Some additional problems occur with bound routines (a Sather construct similar to closures and function pointers in C). Suppose we have a clever search routine to find database records: As an input parameter it gets a bound routine that decides for each record if it is the one being sought. This search routine could start several threads that run through the database in parallel and stop as soon as any one has found a matching record. Unfortunately this works only if either the bound routine is shielded against the clear exception (which works only if it will never block) or has been designed to correctly catch the exception. As the compiler has no way to check for either of those conditions, it cannot help in detecting errors and hence the programmer may easily introduce hard to find bugs.
Not only would it be troublesome to write all libraries in a way that they are safe to use with a CLEARED_EXCEPTION, it would also slow them down whenever this functionality is not needed (the protect as well as the shield command need some instructions and the former has to store the current frame information to make it possible to jump back if an exception happened). Even though the killing of a thread should occur at most once it would slowdown the complete program.
The above problems convinced us that it is not possible to have an easy, safe and efficient parallel, object oriented language language that offers a way to asynchronously stop a thread. Instead, we introduced the disjunctive lock discussed above. A thread that needs to be stopped asynchronously can test a flag while executing and, if it waits on locks, it can simultaneously wait on a special lock which will signal termination. The appendix shows how a producer/consumer can be implemented in pSather such that all consumer threads stop as soon as all producers have finished and all items generated by them have been consumed - without any spinning.
Over the last several years a large number of classes have been developed for Sather, and we would like to use most of them in pSather too. Any Sather class can be used in pSather as long as the programmer assures that all accesses to it are serialzed. However, we would like to provide classes that guarantee serializations. For most classes this means to put a lock statement in most routines.
We identified three different types of functions:
The code for initializing, updating, and testing iteration variables is often complex and error prone. The code to step through complex containers such as hash tables typically must utilize the detailed internal structure of the container, sometimes causing duplication of virtually the same code in many places. Ensuring correct synchronization while iterating introduces additional pitfalls.
To remedy this, Sather encourages encapsulation of iteration, abstracting iteration operations as part of the interface of the container class rather than a part of the code in the client of the container. Sather iterators are like routines, but instead of returning a value they can yield and later resume execution while still holding a lock. For example, most container classes define elt!:T as shown above, which yields another element on every call.
We implemented this protocol by adding a reader/writer lock to each such object and wrapping mutator and visitor methods with lock statements as shown in the following code:
class LIST{T} < $LIST{T} is include VISITOR_MUTATOR; -- includes the code for the RW_LOCK enqueue(e:T) is lock mutator then -- enqueue element end; end; head:T is lock visitor then return head_element; end; end; elt!:T is lock visitor then current_element:=head_element; loop while!(~void(current_element)); yield current_element; current_element:=current_element.next; end; end; end;
This shows a very important feature of the current design. The elt! iterator yields each element of the list one at a time with the guarantee that the list structure does not change during the loop. Unlike routine, iterator calls within lock statements do not give up the lock until the loop completes. The following loop uses this iterator to print out all elements of the list.
loop #OUT + list.elt! + ", "; end;
When elt! executes for the first time it locks the list as a visitor. This lock will only be released when the loop finishes and it guarantees that the list does not change during the loop. Until recently we did not allow yields inside locks, but we convinced ourselves that we needed this feature to implement thread-safe iterators.
We believe that those changes have brought us one step closer to a safe and efficient parallel object oriented language for SMPs and SMPs connected through high speed networks.
Sather home page: http://www.icsi.berkeley.edu/~sather
pSather home page: http://www.icsi.berkeley.edu/~sather/psather.html
Source code is available from ftp://ftp.icsi.berkeley.edu/pub/sather
This pSather program shows how to implement producer/consumer with correct termination.
class PRODUCER_CONSUMER is producer(comm:GATE{INT}) is -- enqueue 1..5 in the gate loop comm.enqueue(1.upto!(5)); end; end; consumer(comm:GATE{INT},prod:ATTACH) is loop lock -- as long as the gate is not empty, we -- dequeue an element and print it out when comm.not_empty then #OUT+comm.dequeue+'\n'; -- if the gate is empty and all producers have -- finished their work, the consumer ends too. when comm.empty, prod.no_threads then return; end; end; end; main is prod::=#ATTACH; -- all producer threads are attached to this object comm::=#GATE{INT}; -- gate used to send values to the consumers loop 5.times!; -- create 5 producers that run in parallel prod:-producer(comm); end; parloop 3.times!; -- create three consumers do consumer(comm,prod); end; -- main thread waits here until all consumers have ended end; end;