Page 1 of 1

[SOLVED] Concurrency: Lock and non-preemptible code

Posted: Wed Sep 21, 2016 11:01 am
by wichtounet
Hi fellow osdevers,

I just realized that I had a huge problem with concurrency when a process is waiting for a lock and the lock is released from an interrupt handler (non-preemptible). This is typical case of deferred work from IRQ handlers.

More specifically, I have this particular case:
  • 1 thread is waiting on the lock: calling lock.wait() and the thread never change
  • 1 IRQ handler is releasing the lock: calling lock.notify()
Until now, I was using a spinlock to protect the binary lock value, but this won't obviously work since then there would be a possible livelock. The obvious solution in that case would be to disable interrupts in the wait() function.

For this very particular problem (not for a general-purpose lock), I was thinking of using Compare-And-Swap to solve this issue rather than disable interrupts.

The pseudo-code would be something like this:

Code: Select all

//pid is the pid of the preemptible process
//value is the current value of the lock (INIT/WAIT/FREE)

void wait(){
    state[pid] = DEFER;
    if(!CAS(value, FREE, WAIT)){
        if(CAS(state[pid], DEFER, BLOCKED)){
            reschedule();
        }
    }
    state[pid] = RUNNING;
}

void notify(){
    if(value == WAIT){
          state[pid] = READY; 
    }
    value = FREE;
}
(This does not fully take preemption into account (preemption would have to play nice with DEFER))

It's harder to adapt for non-binary locks though.

What do you think of this code ? Could this kind of thing work ? Is it worth trying to make it work fully or is it simply better to disable IRQ at this point ?

Thanks

Re: Concurrency: Lock and non-preemptible code

Posted: Wed Sep 21, 2016 3:00 pm
by Boris
Hi,
What happens if you get notified the lock became free just before your second Compare And Swap call ?
you will end up having a thread blocked and a released resource.

You need to prevent the current thread to be preempted by the notifier thread between the two CAS ifs... Which may make you disable interrupts for that.

Re: Concurrency: Lock and non-preemptible code

Posted: Wed Sep 21, 2016 3:47 pm
by simeonz
Without locking, having multiple waiters requires implementing non-blocking queue. A single waiter is easier (again depending). One central issue is the "use after free" of list nodes during CAS processing. In this regard, many papers for non-blocking structures assume there is some "garbage collection" scheme in place.

The linux kernel RCU is one such efficient "garbage collection" method. Just note, that it is patent encumbered in its direct form. Alternatively, there are reference counted pointers with atomic pointer assignment. This is not equivalent to atomic reference counters (which is useful on its own). A third alternative is to grow the node storage, but without reclamation.

The above schemes do not work with stack allocations (waiters can not allocate their nodes on the stack). If the node is inside the task structure, the latter will be subject to garbage collection instead.

Going the non-blocking route will place you on the frontier of algorithmic advancement. And to further illustrate, this is not used in neither Linux, nor Windows. Linux disables ISRs while manipulating wait lists. Windows does not disable them, but does not provide direct signalling from ISRs. You need deferred work to signal a worker thread and the deferred work level (akin to tasklets) is disabled during dispatcher processing. In fact, Windows XP and earlier used global synchronization for the dispatcher - a contention inducing spinlock. Windows 7 removed it and uses a mix of CAS and local per-object spinlocks.

Re: Concurrency: Lock and non-preemptible code

Posted: Wed Sep 21, 2016 4:38 pm
by simeonz
For this very particular problem (not for a general-purpose lock),
I must have been high on grass. You have mentioned that this is communication to that particular thread for deferred work, not synchronization in the general sense. The IRQ part also makes more sense now. Forget my ramblings above.

Boris is correct. You can get pulsed and "forget" before reaching the "sleep" state.

This is what I would do instead:

Code: Select all

void wait(){
    state[pid] = DEFER;
    old_value = CAS(value, NON_SIGNALED, WAIT);
    if(old_value == NON_SIGNALED){
        if(CAS(state[pid], DEFER, BLOCKED)){
            reschedule();
        }
    }
    state[pid] = RUNNING;
}

void notify(){
    if(xchg(value, SIGNALED) == WAIT){
          state[pid] = READY;
    }
}

Re: Concurrency: Lock and non-preemptible code

Posted: Wed Sep 21, 2016 4:41 pm
by simeonz
And - just one last thing. The event must be cleared by the waiter. But it may be better to do this in a separate method if it makes sense (reset() or smth).

Re: Concurrency: Lock and non-preemptible code

Posted: Thu Sep 22, 2016 1:31 am
by wichtounet
Thanks @Boris and @simeonz :)

You're absolutely right, I forgot this case. And there is indeed need for a clearing of the state.

I haven't found a good way to deal with preemption nicely, except for disabling it. For now, I'll write a version with a minimal critical section where interruptions are disabled. Since this is done in both Linux and Windows, it's probably not too bad if I minimize the critical section.

@simeonz Your ramblings are quite interesting :) I started thinking at the beginning of a general solution, but it is way more complicated in the general case. My OS is really not at the point where I think it's worth delving into such problems :P