Thursday, February 16, 2012

Locking Style for Concurrent Programs (c#)

As a job I do a lot of asynchronous programming. I have been working intensively with multi threading some 6 years now (before that I did programming on Unix in c++).

As a consequence of the problems I encountered during this 6 year period, the way I use locks has changed a lot. The most recurring problems were:
  • deadlocks.
  • pumping on the UI thread. When the UI-thread waits on a lock, the runtime can decide to start pumping on that call stack!!! This way *any* code can be called from nearly any point in your code. This can easily cause deadlocks or other (crazy) faulty programs because of unexpected reentry.
  • performance: I used to take my locks over longer periods of time. For example during doing I/O or remoting to get some kind of serialization behavior. Instead it has shown to be a better idea to schedule work where possible, as it is often not important when something really happens.
I adopted a certain style when writing code with locks, in effect limiting the way in which locks are used. They are now used only to protect against concurrent access of fields and possibly to order the task scheduling. This style forces me to use other (higher level) constructs for coordination. This style has served me well in writing cleaner concurrent code.


Remark that the TPL (Task Parallel Library) has made this style of programming a lot easier. It provides a very easy to use interface to schedule tasks. Before TPL I often had dedicated Threads running to do this kind of scheduling (e.g. consuming from a queue). Not included in the TPL is ParallelExtensionsExtras also from Microsoft. This one provide a lot of extensions, especially the additional TaskSchedulers are very useful.

I am looking forward to the TPL DataFlow library in the next version of .net.

What follows are the rules I apply when designing and implementing concurrent software. You will see that there rules are formulated as negatives, limiting the applicability of locks, forcing you to use higher level constructs to deal with concurrency.


A lock should be defined as a private field of a class

The way to keep your locking problems little is by keeping the scope of your lock small. A good start is to define you lock as a private instance field of your class:

class X {
  ...

  private readonly object lockme = new object();
  // her follows fields that are protected by the lockme lock:
  ...

  public X(...) {
  ...

Never ever use "this" as a lock, as the lock should always remain hidden away. Otherwise, when someone decides to use your object as a lock you're suddenly sharing locks.

I always define a specific member (like "lockme") for locking. I never recycle some other member for this purpose. Locking is tricky enough already, and not using an special locking object obscures it. Exception is when I expect ot have many little objects and want to save memory.

Personally, I always give my lock the same un-inspirational name: "lockme". When I see a "lockme" field defined, I know what to expect of usage pattern, code style, etc..  I quickly can see what fields are protected by the lock (those declared after lockme). And I know I am going to see some short time locking, probably also the use of Tasks and may be a custom TaskScheduler. I might also see some cleaver stuff being done outside of lock (nicely marked with "// outside of lock:" comments). All this makes it the code less surprising, easier to understand and verify the correctness of the source code of that class.

A lock should only protect against concurrent access of data members in an object

Limiting locks to protect concurrent access to the private data of an object keeps the scope of the lock small, the time the lock is taken small and specifically makes it easy to reason about the correctness of your program.

The lock should be taken a very short time, really to read/write fields of your object atomically, do some (simple) calculation/decision work and the rest of the work should be pushed outside of the scope of that lock, or be scheduled.

Here is an example where we notify listeners about a change. Calling this from within the lock would be the biggest no-no of them all, as listeners represents unknown code:

private readonly object lockme = new object();
private State state;
private Action listeners;
private int changeCount;

void IncrementChangeCount()
{
  Action tmp;
  int count;
  lock(lockme) {
    if(state != State.Running) {
      return;
    }
    tmp = listeners;
    count = ++changeCount;
  }

  // outside of lock
  if(tmp != null) {
    tmp(count);
 }
}

Prefer to define at most one lock per class

If you need several locks in your class, it might be that your class is getting too big. It is not always so, but it could be an indication that you might be better of splitting the class.

Writing classes for multi threading often deal with asynchronous events or need to do coordination. This is often though matter already. When there is no real need to create a second lock then do not do it.

Another argument to use only one lock per object is that if locks are really only to set and get member fields values in an atomic way, the lock time will be very small and contention will very likely not to be an issue.

Do not take the lock recursively

The class should be designed so that:
  • when entering a public member function the lock should not be taken.
  • when entering a private member function it should be clear by design if the lock is taken or not, and not depend on runtime.
When limiting the code in this fashion it gets again easier to reason about correctness of the code and to prevent inadverted holding the lock when the control flow passes through the object.
    It might be an idea to name the members or annotate them to convey expectations about locking. I personally do not do this, instead I usually comment the members that expect the lock to be taken beforehand.

    Never call a method of another cooperating object while holding the lock

    Locks tend to leak: When holding a lock and calling a member function of another object, then that object will effectively be holding that lock. This goes against the idea of encapsulation of locks. Such code will create nesting levels in locking and sooner or later, particularly in production, the program will deadlock.

    Exceptions to the rule are of course (private) helper classes or classes that are well known and stable like BCL container classes. So if you want to call methods on other objects or do I/O you should do this outside of a lock.

    One problem that ofter reoccurs in this context is, that one wants to I/O in the order the lock was taken. This can easily be done by using for example the OrderedTaskScheduler from ParallelExtensionsExtras (as useful supportive library for the TPL):

    ...
    private readonly TaskScheduler snapshotScheduler = new OrderedTaskScheduler();
    
    private object lockme = new object();
    private int nScheduledSnapshots;
    
    void MakeSnapshot() {  
      lock(lockme) {  
        ++nScheduledSnapshots;  
        Task.Factory.StartNew(DoSnapshot, CancellationToken.None,  
                    TaskCreationOptions.None, snapshotScheduler)  
          .LogExceptions(Log, "MakeSnapshotBackground")  
          .ContinueWith(  
           t => {  
            lock(lockme) {  
             --nScheduledSnapshots;  
            }  
           });  
        }  
    }
    
    void DoSnapshot()
    {
      // outside of lockme:
       ... I/O or long running calculation ...
    } 

    Never call unknown code (callbacks) while holding a lock

    Callbacks are about the worst thing to call while holding a lock as a callback could really be any code and do whatever. In most cases it is possible to capture the callback delegate (list) while holding the list and then calling them outside. Remember that the delegate object declared like:

    Action listeners;

    is really immutable. So you can safely copy it while holding the lock, effectively taking a snapshot, and then use that copy to call the callbacks (after releasing the lock).

    A lock should not be used for coordination purposes

    Coordination is about ordering so some things happen after other things. Yes, it is possible to do some really clever coordination by using all kinds of different lock. These solutions reduce to some user thread doing some waiting. This can have a number of problems:
    • It is often a very clever and fragile solution, and can be difficult to understand for other programmers.
    • Or it is a rather dumb solution and then it has scalability or deadlock problems.
    • To wait you often do not need the resources of a thread. Either a callback mechanism is provided, or if you are out of luck and you only have a WaitHandle, you can use the ThreadPool.RegisterWaitForSingleObject method.
    But there are a great many tools to do coordination in a clear manner:
    • Model your class as a state machine. While taking the lock, you'll just be changing state and scheduling stuff to do.
    • To do stuff in a certain order use Tasks and TaskScheduler; or a producer-consumer solution; another typical solution is AsyncCall from ParallelExtensionsExtras.
    • Design your interface accordingly: return Tasks when you want to give the caller abilities to react to a future event, thus allowing for waiting without actually blocking a Thread for waiting. 


    No comments:

    Post a Comment