Best way to implement thread blocking in user space

Discussions on more advanced topics such as monolithic vs micro-kernels, transactional memory models, and paging vs segmentation should go here. Use this forum to expand and improve the wiki!
Post Reply
rdos
Member
Member
Posts: 3276
Joined: Wed Oct 01, 2008 1:55 pm

Best way to implement thread blocking in user space

Post by rdos »

So, I have this scenario in the file system class (in user space):

I have an array of pathnames. When the directory is loaded an array is initialized, but it only reads the directory entries in the current directory and leaves the subdirectories as NULL pointers. Now, when somebody requests the subdirectory to be read too, I run into a potential multithreading issue if more than one thread tries to read it at the same time. So, I figured that when the first thread tries to read it, it will see that it's not initialized and so sets the reference count to -1. The next thread will see the reference count being -1, and so should block until the operation is done. Potentially more than two threads could try to get the same directory at the same time, so a list of waiting threads should be used. In kernel space, this is simple enough to solve. You just add the list to the struct and link waiting threads there. However, I cannot do this in user space since threads use kernel-only thread control blocks, and besides, I don't want to expose this to user space anyway. So, how is this ideally implemented? It should support multiple threads and it should only use at most an extra pointer or handle in the struct.
nullplan
Member
Member
Posts: 1766
Joined: Wed Aug 30, 2017 8:24 am

Re: Best way to implement thread blocking in user space

Post by nullplan »

So I find myself explaining futexes again. I'm not entirely sure what you want to do here, I'd just implement opendir() and readdir() like POSIX defines them, and leave the business of puzzling out subdirectories to the applications. But since you want to do this automatically, here's what you can do: The threads try to compare-and-swap (CAS) the refcount from 0 to -1. Whichever thread succeeds must then fill the pointer. If a thread reads a -1, it tries to CAS it to -2, then futex_waits on that count while it is -2. If a thread reads -2, they just immediately futex_wait on the count. When the filling thread is finished, it tests if the count has become -2, then it sets the count to 1 or whatever you need. If the count was -2 before that, it performs a futex wake on the futex. Or in pseudo-code:

Code: Select all

int fill_dir(struct dir* dir)
{
  int wake = 0;
  while (dir->refcnt <= 0) {
    switch (CAS(&dir->refcnt, 0, -1)) {
      case 0:
        do_the_real_work(dir);
        if (dir->refcnt == -2) wake = 1;
        dir->refcnt = 1;
        if (wake) futex_wake(&dir->refcnt, INT_MAX);
        break;
      case -1:
        if (CAS(&dir->refcnt, -1, -2) < 0) {
      case -2:
          futex_wait(&dir->refcnt, -2);
        }
        break;
    }
  }
}
In this I assume that CAS returns the value read from the memory location. futex_wait() and futex_wake() can easily be system calls, so can just be used from user space. Also, pthread_once() works the same way.
Carpe diem!
rdos
Member
Member
Posts: 3276
Joined: Wed Oct 01, 2008 1:55 pm

Re: Best way to implement thread blocking in user space

Post by rdos »

I have futexes (as a way to implement critical sections in user-space), but they have some serious drawbacks in this scenario. First, the futex implies that each release of the futex will wakeup a single thread, but if two threads are waiting for the data to become available, only one of them will be woken up and the other one will be left blocked. The other drawback is that a futex must always be allocated and associated with the struct, which slows down the code. Particularly since multiple threads generally will not try to read the same directory, and so this is an exceptional case. Also, there will be an issue with freeing the futex too, and making sure that a thread will not try to use the futex while it is freed. In general, my usage of futexes does not really garantee problem-free behavior when futexes are freed, rather implies that the code makes sure that they are only freed when they cannot be referenced.

So, my take on this is like this:

Code: Select all

  for (;;)
  {
    if (ref_count >= 0)
    {
      if (pointer)
        return pointer;

      lock ref_count--;
      if (CY)
      {
        pointer = ReadTheEntry();
        lock ref_count++;
        while (wait_count != 0)
          Yield();
        ebx = 0;
        xchg ebx,wait
        if (ebx)
          CloseThreadBlock(ebx);
        return pointer;
      }
      else
        lock ref_count++;
    }
    else
    {
      if (!wait)
      {
        lock wait_count--;
        if (CY)
          wait = CreateThreadBlock();
        lock wait_count++;
      }

      if (wait)
        WaitThreadBlock(wait);
      else
        Yield();
    }
  }
CreateThreadBlock will create a handle that is backed with a list in kernel space. WaitThreadBlock will block a thread on the list, unless the handle is freed. CloseThreadBlock will inactivate the handle, wakeup waiting threads and then free it.

Edited the code many times, but I'm still not sure if it is multicore safe. :(
rdos
Member
Member
Posts: 3276
Joined: Wed Oct 01, 2008 1:55 pm

Re: Best way to implement thread blocking in user space

Post by rdos »

I think I have a working solution, but this is much easier (and probably faster) in x86 assembler. I'm not even sure that my compiler supports the atomic operations needed.

Code: Select all


dir_link_struc  STRUC

dl_offset          DD ?
dl_link            DD ?
dl_wait_handle     DW ?
dl_ref_count       DB ?
dl_wait_count      DB ?

dir_link_struc  ENDS

    extern LowReadDirLink:near

wait_name DB "Wait Dir", 0

LockDirLinkObject_ Proc near
    push eax
    push ebx

ldlRetry:
    test [edi].dl_ref_count,80h
    jnz ldlWait
;
    cmp [edi].dl_link,0
    jnz ldlDone
;
    lock sub [edi].dl_ref_count,1
    jnc ldlLockFailed
;
    call LowReadDirLink
    lock inc [edi].dl_ref_count

ldlWaitLoop:
    cmp [edi].dl_wait_count,0
    je ldlWaitOk
;
    mov ax,1
    WaitMilliSec
    jmp ldlWaitLoop

ldlWaitOk:
    xor bx,bx
    xchg bx,[edi].dl_wait_handle
    or bx,bx
    jz ldlDone
;
    CloseThreadBlock
    jmp ldlDone

ldlLockFailed:
    lock inc [edi].dl_ref_count
    jmp ldlRetry

ldlWait:
    mov bx,[edi].dl_wait_handle
    or bx,bx
    jnz ldlDoWait
;
    lock sub [edi].dl_wait_count,1
    jc ldlWaitCreate
;
    lock inc [edi].dl_wait_count
;
    mov ax,1
    WaitMilliSec
    jmp ldlRetry

ldlWaitCreate:
    push edi
    mov edi,OFFSET wait_name
    CreateThreadBlock
    pop edi
    mov [edi].dl_wait_handle,bx
;
    lock inc [edi].dl_wait_count

ldlDoWait:
    WaitThreadBlock
    jmp ldlRetry

ldlDone:
    lock inc [edi].dl_ref_count
;
    pop ebx
    pop eax
    ret
LockDirLinkObject_ Endp

Post Reply