Mutex implementation

zeusk · Post by **zeusk** » Fri Sep 14, 2012 1:16 pm

Hi,

I've been referring to the wiki since some time though I'm new to the forums here. I am working on a preemptive multitasking system for ARM, It's an educational project for me. Below is the implementation I have written for mutex objects, I'd like to know if there are possible better implementations or maybe a bug/lockups. Though i don't intend to use it on a multicore system, it wouldn't hurt to support them for future extensibility. Thank you for your time and help

Code: Select all

/*******************************************************************************
 * @File    mutex.c
 * @Date    22nd September 2012
 * @Brief   SMP implementation of mutex
 *          Structure size should be (4 + wait_queue_t) bytes
 *
 * For more info, Read /notes/thread-safety
 *
 * @Authors
 *          Shantanu Gupta <shans95g AT gmail.com>
 ******************************************************************************/

#include <os.h>
#include <cpu.h>
#include <mutex.h>
#include <thread.h>
#include <spinlock.h>

OS_INLINE OS_ERR mutex_acquire_try(mutex_t *mutex)
{
    if(unlikely(cpu_atomic_cmpxchg((void *)&mutex->owner, current_thread, OS_NULL) != OS_EOKAY))
        return OS_EBUSY;

    return OS_EOKAY;
}

OS_ERR mutex_acquire_timeout(mutex_t *mutex, time_t timeout)
{
    if(unlikely(mutex_acquire_try(mutex) != OS_EOKAY))
    {
        spinlock_acquire(&mutex->lock);
        while(unlikely(mutex_acquire_try(mutex) != OS_EOKAY))
        {
            thread_boost(mutex->owner);
            spinlock_release(&mutex->lock);
            if(thread_block(&mutex->queue, timeout) != OS_EOKAY)
                return OS_EBUSY;
            spinlock_acquire(&mutex->lock);
        }
        spinlock_release(&mutex->lock);
    }

    return OS_EOKAY;
}

void mutex_acquire(mutex_t *mutex)
{
    if(unlikely(mutex_acquire_try(mutex) != OS_EOKAY))
    {
        spinlock_acquire(&mutex->lock);
        while(unlikely(mutex_acquire_try(mutex) != OS_EOKAY))
        {
            thread_boost(mutex->owner);
            spinlock_release(&mutex->lock);
            thread_block(&mutex->queue, timeout);
            spinlock_acquire(&mutex->lock);
        }
        spinlock_release(&mutex->lock);
    }
}

OS_INLINE void mutex_release(mutex_t *mutex)
{
#ifdef DEBUG
    /* In good code, release should only be called if acquire was successful */
    if(unlikely((mutex->owner) != current_thread))
    {
        LOGF("%s: TCB %p OBJ %p\n", __func__, current_thread, (void *) mutex);
    } else
#endif
    {
        spinlock_acquire(mutex->lock);
        mutex->owner = OS_NULL;
        spinlock_release(mutex->lock);

        thread_queue_wake_one_sync(&mutex->queue);
    }
}

OS_INLINE void mutex_init(mutex_t *mutex)
{
    mutex->owner = OS_NULL;
    spinlock_init(mutex->lock);
    thread_queue_init(&mutex->queue);
}

zeusk · Post by **zeusk** » Sat Sep 15, 2012 10:25 am

Hmm, 90 views and not a single response ?

Combuster · Post by **Combuster** » Sat Sep 15, 2012 4:19 pm

Probably a tl;dr issue. Especially when "todo"s pop up.

zeusk · Post by **zeusk** » Sat Sep 15, 2012 10:13 pm

Hmm, it is just one and non-critical, haven't thought of a good way to prevent a rogue thread from locking the mutex right after we unlock it and before the thread in wait queue locks it, though it can handle it, the wait queue thread will go back to sleep.

gerryg400 · Post by **gerryg400** » Mon Sep 17, 2012 3:58 am

Code: Select all

OS_ERR mutex_acquire(mutex_t *mutex)
{
    while(unlikely(cpu_atomic_cmpxchg((void *)&mutex->owner, curr_thread, OS_NULL) != OS_OKAY))
    {
        if(mutex->owner == curr_thread)
            goto locked;

        preempt_disable();

        if(mutex->owner == OS_NULL)
            continue;

        if(curr_thread->priority >= mutex->owner->priority)
            thread_boost(mutex->owner);

        thread_block(&mutex->queue, 0);

        preempt_enable();
    }

locked:
    mutex->count++;
    return OS_EOKAY;
}

The lock code seems incorrect. If a thread A misses the lock but gets pre-empted before it can call preempt_disable() and the lock gets unlocked by thread B, thread A will end up in the mutex_queue forever.

Performance might be a problem too. preempt_disable(), thread_block(), thread_boost() and preempt_disable() will all most likely need to be system calls and that will hurt. It means 4 system calls to lock a contended mutex and a system call on every free.

JamesM · Post by **JamesM** » Mon Sep 17, 2012 4:26 am

Code: Select all

       if(unlikely(mutex->owner != curr_thread))
            return OS_EBUSY;

I'd have thought this would be an error condition. If we just failed to get the mutex, and the mutex is owned by our thread, then we should assert or panic. Something went seriously wrong!

Code: Select all

        /*
         * Jackpot, resource freed right after our attempt, try again.
         * Also ensures the code below doesn't cause NULL de-reference.
         */
        if(mutex->owner == OS_NULL)
            continue;

The compiler is free to optimise this and hoist it above preempt_disable if it knows anything about preempt_disable's behaviour. You should insert a volatile local variable and store mutex->owner in it:

Code: Select all

volatile int owner = mutex->owner;
if (owner == OS_NULL) continue;

The code for acquiring is also not multicore safe - you assume once you've called preempt_disable you don't need any atomic accesses any more. As long as you know it isn't multicore safe, it's no problem.

You assume preemption was enabled on entry to the lock. Is this always true? is it valid for me to call preempt_disable() then get a lock? in which case, on exit from acquiring the lock preemption has suddenly become enabled!

By convention a mutex should be a binary semaphore. That is, instead of count++ just do count = 1. Unless it's actually a counting semaphore or a recursive mutex, in which case label it as such

Code: Select all

    if(mutex->owner != curr_thread)
        return OS_EINVAL;

This is a serious error condition. If I were you, I'd make this a hard panic instead of a soft return value error. It'll help the developer spot problems quicker.

Otherwise it looks like good, well written code.

zeusk · Post by **zeusk** » Mon Sep 17, 2012 6:56 am

gerryg400 wrote:The lock code seems incorrect. If a thread A misses the lock but gets pre-empted before it can call preempt_disable() and the lock gets unlocked by thread B, thread A will end up in the mutex_queue forever.

after preempt_disable, it checks for owner == NULL, which signifies the thread is not owned by anyone and if it did happen, it retries the loop with "continue". Since it doesn't support SMP yet, preempt disable should be enough to ensure object isn't changed after that point .... Oh god. Thanks, i just noticed that if it does occur, we cause imbalanced preempt_disable/enable Will fix it right away.

gerryg400 wrote:Performance might be a problem too. preempt_disable(), thread_block(), thread_boost() and preempt_disable() will all most likely need to be system calls and that will hurt. It means 4 system calls to lock a contended mutex and a system call on every free.

The implementation is for kernel drivers and internal use, why does the kernel code need syscall ?

JamesM wrote:I'd have thought this would be an error condition. If we just failed to get the mutex, and the mutex is owned by our thread, then we should assert or panic. Something went seriously wrong!

It's not an error for now, For start I am enabling recursive locks to get around some bad code, I will clean this once i have most things working. (cmpxchg will fail if we're recursive since we look for NULL pointer at owner, but what exists there is our tcb pointer, so we explicitly check if it's already locked by us or not).

JamesM wrote:The compiler is free to optimise this and hoist it above preempt_disable if it knows anything about preempt_disable's behaviour. You should insert a volatile local variable and store mutex->owner in it

Thanks, will do that. Would it be better to use a memory barrier instead after preempt_* If i stick to gcc ?

JamesM wrote:The code for acquiring is also not multicore safe - you assume once you've called preempt_disable you don't need any atomic accesses any more. As long as you know it isn't multicore safe, it's no problem.

Yep, i know it wouldn't work on multicores for now.

JamesM wrote:You assume preemption was enabled on entry to the lock. Is this always true? is it valid for me to call preempt_disable() then get a lock? in which case, on exit from acquiring the lock preemption has suddenly become enabled!

My preemption_(disable|enable) (which disable just context switch) and preemption_irq_(disable|enable) (which completely disable all IRQs) are recursive safe, they count the calls and preemption is only enabled after the counter reaches zero again, I don't use atomic functions to deal with these counters but store them in each thread's control block so that thread can yield safely if it chooses to while preempt is disable safely. So when it comes back, preemption is disabled again once we pop back the value from it's tcb.

JamesM wrote:By convention a mutex should be a binary semaphore. That is, instead of count++ just do count = 1. Unless it's actually a counting semaphore or a recursive mutex, in which case label it as such

Yeah, i should probably have mentioned I have recursive mutexes because it is required in one of our drivers (very old and buggy, have to rewrite), though I am thinking of making a separate object for recursive mutex instead of having recursive ability on all mutex objects even when not used at all.

JamesM wrote:This is a serious error condition. If I were you, I'd make this a hard panic instead of a soft return value error. It'll help the developer spot problems quicker.

Hmm, I will have to think about this, maybe panic in debug build and just log it in release build.

Thanks for your help

zeusk · Post by **zeusk** » Sun Sep 23, 2012 12:20 am

I've been thinking a lot over this, but couldn't find a suitable fix for the scenario where a thread gets preempted in mutex_acquire(_timeout) right after failed CAS, the new thread unlocks the mutex so when we comeback mutex->owner is null and the queue becomes unbalanced. Any help ? I've looked over pthread_mutex implementation from apple, linux and unless they have some magic trickery, they seem to be affected by this too

P.S Updated my code, moved recursion to a separate structure.

gerryg400 · Post by **gerryg400** » Sun Sep 23, 2012 4:55 am

zeusk wrote:I've been thinking a lot over this, but couldn't find a suitable fix for the scenario where a thread gets preempted in mutex_acquire(_timeout) right after failed CAS, the new thread unlocks the mutex so when we comeback mutex->owner is null and the queue becomes unbalanced. Any help ? I've looked over pthread_mutex implementation from apple, linux and unless they have some magic trickery, they seem to be affected by this too

P.S Updated my code, moved recursion to a separate structure.

In general terms, you need to find a way to ensure that the queue doesn't change after you perform the CAS operation.

I would recommend reading about futexes, which solve this problem by using an extra bit in the mutex flag to signify that the queue is currently occupied or not. At the point of queue insertion the flag can be checked.

Check out http://www.akkadia.org/drepper/futex.pdf

I'm not suggesting that you need a full futex implementation in the kernel, you can probably get away with a spinlock protecting the queue and the bit indicating whether the queue is occupied.

zeusk · Post by **zeusk** » Sun Sep 23, 2012 6:30 am

gerryg400 wrote: I'm not suggesting that you need a full futex implementation in the kernel, you can probably get away with a spinlock protecting the queue and the bit indicating whether the queue is occupied.

Hmm, Currently I've used a spinlock to protect the mutex object (making it SMP safe), however i will look into using that extra bit, I can merge it in the owner pointer itself since TCB(s) are 32 byte aligned in my system.

The document you shared is interesting, thanks for sharing it.

gerryg400 · Post by **gerryg400** » Sun Sep 23, 2012 7:02 am

zeusk wrote:
gerryg400 wrote: I'm not suggesting that you need a full futex implementation in the kernel, you can probably get away with a spinlock protecting the queue and the bit indicating whether the queue is occupied.
Hmm, Currently I've used a spinlock to protect the mutex object (making it SMP safe), however i will look into using that extra bit, I can merge it in the owner pointer itself since TCB(s) are 32 byte aligned in my system.

The document you shared is interesting, thanks for sharing it.

The bit needs to be in the mutex word itself so that it is read at the instant the mutex is taken and rechecked later.

zeusk · Post by **zeusk** » Sun Sep 23, 2012 7:44 am

gerryg400 wrote:The bit needs to be in the mutex word itself so that it is read at the instant the mutex is taken and rechecked later.

Yes, but how to check it with CAS implementation ? (I still haven't completely read the pdf you shared, will read it tonight)

For now, My implementation with spinlock looks very much like linux's implementation except for i use owner field as lock status.

OSDev.org

Mutex implementation

Mutex implementation

Re: Mutex implementation

Re: Mutex implementation

Re: Mutex implementation

Re: Mutex implementation

Re: Mutex implementation

Re: Mutex implementation

Re: Mutex implementation

Re: Mutex implementation

Re: Mutex implementation

Re: Mutex implementation

Re: Mutex implementation