Thread local storage

Posted: Sat Mar 20, 2010 3:59 am
by FlashBurn
I need TLS (thread-local storage) because it makes multithreading easier. But while thinking about how to port, e.g., a C library, I ran into a problem with TLS.

You have some functions: tls_get_slot, tls_free_slot, tls_get_data and tls_set_data. The problem is that when I allocate a slot in one thread, I cannot assume that allocating a slot in another thread will give me the same slot number. (I need this, e.g., for a C library, so that I can save the slot number in a global variable and use that slot in every thread.)

So I thought it would be a good idea to have two more functions, tls_get_slot_proc and tls_free_slot_proc. With these you get a slot that is marked as used for all threads in a process (including threads created after the call). To make this work, the thread-only slots grow upward and the process-wide slots grow downward.

I would like to know what others think about this system, and whether 1024 slots (1 page) would be enough. (I think so, because you could use one slot to store a memory address where you then keep all your thread-local storage.)

One drawback of this system is performance: it could get slow if I have to test in every thread whether a slot is free.

Re: Thread local storage

Posted: Sat Mar 20, 2010 4:38 pm
by Gigasoft
Why would there be thread only slots? TLS slots are always global to a process. TLS slots used by only a single thread are pointless, since the point of TLS is that every thread uses the same slot to access their own data.

1024 slots are more than enough, since each module usually uses only 1 slot at most.
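
To make the point about process-global slots concrete, here is a minimal user-space sketch. It reuses the function names from the first post, but the bodies are assumptions, not anyone's actual implementation: the slot bitmap is shared by the whole process, while the data array is per-thread (C11 `_Thread_local` stands in for whatever the OS provides). Locking is left out on purpose.

```c
#include <stddef.h>
#include <stdint.h>

#define TLS_SLOTS 1024  /* 1024 4-byte pointers fill one 4 KiB page on 32-bit x86 */

/* Process-wide: which slot numbers are taken. Shared by all threads.
   A real implementation would guard this with a lock or atomics. */
static uint8_t slot_used[TLS_SLOTS];

/* Per-thread: every thread gets its own copy of this array. */
static _Thread_local void *slot_data[TLS_SLOTS];

/* Returns a slot number that is valid in every thread, or -1 if none is free. */
int tls_get_slot(void)
{
    for (int i = 0; i < TLS_SLOTS; i++) {
        if (!slot_used[i]) {
            slot_used[i] = 1;
            return i;
        }
    }
    return -1;
}

/* Releases the slot number for the whole process. */
void tls_free_slot(int slot)
{
    if (slot >= 0 && slot < TLS_SLOTS)
        slot_used[slot] = 0;
}

/* Stores a value in the calling thread's copy of the slot. */
int tls_set_data(int slot, void *value)
{
    if (slot < 0 || slot >= TLS_SLOTS || !slot_used[slot])
        return -1;
    slot_data[slot] = value;
    return 0;
}

/* Reads the calling thread's copy of the slot, NULL if the slot is invalid. */
void *tls_get_data(int slot)
{
    if (slot < 0 || slot >= TLS_SLOTS || !slot_used[slot])
        return NULL;
    return slot_data[slot];
}
```

A library would call tls_get_slot once, keep the number in a global variable, and then every thread uses that same number to reach its own data. Note that this sketch does not clear stale per-thread values when a slot is freed and reused; a real design has to deal with that.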

Re: Thread local storage

Posted: Sun Mar 21, 2010 1:32 am
by FlashBurn
Gigasoft wrote: TLS slots used by only a single thread are pointless, since the point of TLS is that every thread uses the same slot to access their own data.
I haven't used TLS yet, so I don't know how it is used.
Gigasoft wrote: 1024 slots are more than enough, since each module usually uses only 1 slot at most.
The way I would use TLS, every shared library would use a slot, and I think I can also solve my problem of returning 64-bit values from a syscall this way. I only need to specify that the return value is in two TLS slots which are reserved for returning such a value (like slots 0 and 1).

Thanks for the hint!

Re: Thread local storage

Posted: Sun Mar 21, 2010 6:18 pm
by Owen
You'd be surprised how applications can churn through TLS keys. 1024 may not be enough.

Re: Thread local storage

Posted: Sun Mar 21, 2010 6:40 pm
by Gigasoft
For returning 64-bit values from a syscall, I'd just use two registers (usually, eax and edx). To return large structures, the caller can pass a pointer, just make sure the pointer points to user space and be prepared to handle exceptions that occur if the pointer is invalid.
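
As a small illustration of the two-register idea: the kernel puts the low half in one register and the high half in the other, and the user-side wrapper glues them back together. The helper names below are made up for the sketch, and the actual register loading (which needs inline assembly) is left out.

```c
#include <stdint.h>

/* Split a 64-bit result into the halves the kernel would place in
   eax (low) and edx (high) before returning to user mode. */
static inline void split_u64(uint64_t value, uint32_t *lo, uint32_t *hi)
{
    *lo = (uint32_t)(value & 0xFFFFFFFFu);
    *hi = (uint32_t)(value >> 32);
}

/* Rebuild the 64-bit value in the user-space wrapper from the two halves. */
static inline uint64_t join_u64(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | lo;
}
```

This is the same eax:edx pairing that 32-bit x86 C compilers normally use for 64-bit function return values.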

Re: Thread local storage

Posted: Mon Mar 22, 2010 12:58 am
by FlashBurn
Owen wrote: You'd be surprised how applications can churn through TLS keys. 1024 may not be enough.
So maybe I should design it so that the user can request more memory.
Gigasoft wrote: For returning 64-bit values from a syscall, I'd just use two registers (usually, eax and edx). To return large structures, the caller can pass a pointer, just make sure the pointer points to user space and be prepared to handle exceptions that occur if the pointer is invalid.
But here comes the problem: syscall/sysenter (the fast ring-change instructions) use edx, so I can't use it. With the TLS approach I could also say that if eax != 0 you got an error and the error code is in the TLS, so that the return value is always in the TLS; I could give better errors back, and the error checking is easier.
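
Roughly, a user-space wrapper for that convention could look like the sketch below. Everything here is hypothetical: `fake_syscall` only stands in for the real kernel entry, the slot numbers 0 and 1 follow the earlier suggestion, and the error code 22 is made up.

```c
#include <stdint.h>

#define TLS_RET_LO 0  /* low half of the return value, or the error code */
#define TLS_RET_HI 1  /* high half of the return value */

/* Hypothetical per-thread area that the kernel would fill in. */
static _Thread_local uint32_t tls_slots[1024];

/* Stand-in for the real syscall entry; its return value plays the role of eax. */
static uint32_t fake_syscall(int should_fail)
{
    if (should_fail) {
        tls_slots[TLS_RET_LO] = 22u;  /* made-up error code */
        return 1u;                    /* eax != 0 signals an error */
    }
    tls_slots[TLS_RET_LO] = 0x9ABCDEF0u;
    tls_slots[TLS_RET_HI] = 0x12345678u;
    return 0u;
}

/* Wrapper: returns 0 and fills *value on success,
   returns -1 and fills *error on failure. */
int call_u64(int should_fail, uint64_t *value, uint32_t *error)
{
    if (fake_syscall(should_fail) != 0u) {
        *error = tls_slots[TLS_RET_LO];
        return -1;
    }
    *value = ((uint64_t)tls_slots[TLS_RET_HI] << 32) | tls_slots[TLS_RET_LO];
    return 0;
}
```

The cost of this design is that every call, even one returning a single 32-bit value, goes through memory instead of a register.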

Re: Thread local storage

Posted: Mon Mar 22, 2010 3:23 pm
by Owen
Then return in ebx or ecx,
or edx or esi,
or edi or ebp,
or even esp.

(OK, don't return in esp. That's awkward.)

Re: Thread local storage

Posted: Mon Mar 22, 2010 3:53 pm
by FlashBurn
According to the x86 ABI you are only allowed to clobber eax, ecx and edx. Sysexit/sysret use ecx and edx, so only eax is left for returning a value. But if I say that in my OS the API is that eax holds 0 or an error code and the return value is in the TLS, then that is how it is and programs have to live with that :twisted:

I know that in Haiku, if you want to return a 64-bit value, you have to set some flag, and then you return as if the syscall had been made with an int instruction. I don't know how big the difference between returning with sysexit/sysret and iret is, but since you also need to build a stack frame for the iret instruction, it will not be small.

And since most of the time you will use some wrapper function to call a kernel service, you won't see a difference. Only assembly programmers need to know about it (but with them I have another problem anyway, one that every other OS using sysenter rather than syscall will also have).

Re: Thread local storage

Posted: Tue Mar 23, 2010 10:40 am
by Owen
Syscalls don't follow the ABI anyway; what is the problem? If you're calling from (say) inline assembly anyway, you can correctly instruct the compiler to optimize around it; if calling from assembly itself, then you need to know the system call details.

The kernel should know nothing about TLS; it should be entirely managed in user space, except for perhaps asking the kernel to point %gs or %fs at it.

Re: Thread local storage

Posted: Tue Mar 23, 2010 1:10 pm
by Combuster
Owen wrote:The kernel should know nothing about TLS; (...) asking the kernel to point %gs or %fs at it.
There be a slight contradiction, my friend :wink:

A kernel's design must know about TLS to the extent that it can support it. It's not like userspace can load GS or FS with something and expect it to magically work...?

Re: Thread local storage

Posted: Tue Mar 23, 2010 1:52 pm
by Gigasoft
A kernel could have a system call which sets LDT entries to anything specified by the caller (while making sure they don't contain anything funky). Then it's up to the caller to implement TLS on top of this.

Re: Thread local storage

Posted: Tue Mar 23, 2010 3:09 pm
by Owen
Combuster wrote:
Owen wrote:The kernel should know nothing about TLS; (...) asking the kernel to point %gs or %fs at it.
There be a slight contradiction, my friend :wink:

A kernel's design must know about TLS to the extent it can support that, It's not like userspace can load GS or FS with something and expect it to magically work...?
For the kernel, they're just arbitrary extra pointer registers which userspace can't directly write ;-) No reason they have to be TLS

(You might use one to point to a pointer to the system call vector, for example, so apps just do a call (%fs:0) to get to the kernel.)

Re: Thread local storage

Posted: Tue Mar 23, 2010 3:47 pm
by FlashBurn
The problem with sysenter is that the user esp has to be saved by user-space code, so you can't be sure that the value you think is the saved user esp is the right one (a security problem). I don't know how other OSes check whether the stack given to the syscall code is present and valid, but at the moment I have no such checks.

@back to topic

The kernel needs to know about TLS, because it has to create (or rather change) a segment descriptor, and I find it easier to give the user a standard TLS area of 4 KB (1024 entries) and do the rest, like getting a free slot, in a user-space library. But actually a 4-byte value (a pointer) would be enough, because a thread only needs one pointer, which then points to the TLS. Then you won't need to preallocate 4 KB (in my OS) and everything is done in user space. (Maybe I'll take this approach, but I like the other one more.)

Re: Thread local storage

Posted: Tue Mar 23, 2010 4:36 pm
by Gigasoft
When copying the parameters for a system call, just check that the parameters are outside the virtual address range reserved for the kernel, and have an exception handling system so that you can catch accesses to a non-present stack and either return an error code or pass the exception to user mode. You must implement address checking and exception handling for every user buffer passed to a system call, too. Remember that a user buffer can become invalid at any time (if another thread frees it) unless you implement a memory locking mechanism to prevent that from happening.
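
The address check itself can be sketched like this. The kernel base of 0xC0000000 is only an assumption (a common 3 GiB / 1 GiB split on 32-bit x86), and the function name is made up; the point is doing the comparison without letting addr + len wrap around.

```c
#include <stdbool.h>
#include <stdint.h>

#define KERNEL_BASE 0xC0000000u  /* assumption: the kernel owns the top 1 GiB */

/* True if [addr, addr + len) lies entirely below the kernel's address range. */
static bool user_range_ok(uint32_t addr, uint32_t len)
{
    if (addr >= KERNEL_BASE)
        return false;
    /* KERNEL_BASE - addr cannot underflow here, and comparing against it
       avoids the 32-bit wraparound that a naive addr + len check would hit. */
    return len <= KERNEL_BASE - addr;
}
```

Passing this check only means the range is not kernel memory. The pages may still be unmapped, or be unmapped by another thread a moment later, so the fault handling described above is still needed.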

Re: Thread local storage

Posted: Wed Mar 24, 2010 1:22 am
by FlashBurn
Gigasoft wrote: When copying the parameters for a system call, just check that the parameters are outside the virtual address range reserved for the kernel, and have an exception handling system so that you can catch accesses to a non-present stack and either return an error code or pass the exception to user mode. You must implement address checking and exception handling for every user buffer passed to a system call, too. Remember that a user buffer can become invalid at any time (if another thread frees it) unless you implement a memory locking mechanism to prevent that from happening.
I copy the parameters from the user stack to the kernel stack; the only problem is that the function which does this doesn't know anything about the parameters. So maybe I should change my syscall system to use wrapper functions that check the parameters for a kernel function and then call it. At the moment my kernel functions are called directly by the syscall interface.
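
A sketch of what that wrapper layer might look like, under several assumptions: the arguments have already been copied from the user stack into `args`, the kernel starts at 0xC0000000, and all names (`k_write`, `wrap_write`, the error codes) are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define KERNEL_BASE     0xC0000000u  /* assumption: the kernel owns the top 1 GiB */
#define ERR_BAD_SYSCALL 1
#define ERR_BAD_ADDRESS 2

/* The actual kernel function: it only ever sees checked arguments.
   Here it does nothing and reports success. */
static int32_t k_write(uint32_t buf, uint32_t len)
{
    (void)buf;
    (void)len;
    return 0;
}

/* Wrapper: knows what the arguments mean, validates them, then calls k_write. */
static int32_t wrap_write(const uint32_t *args)
{
    uint32_t buf = args[0];
    uint32_t len = args[1];
    if (buf >= KERNEL_BASE || len > KERNEL_BASE - buf)
        return -ERR_BAD_ADDRESS;
    return k_write(buf, len);
}

struct syscall_entry {
    int32_t (*wrapper)(const uint32_t *args);
    uint32_t nargs;  /* how many words the entry code copies from the user stack */
};

static const struct syscall_entry syscall_table[] = {
    { wrap_write, 2 },
};

/* Generic entry point: looks up the wrapper and never calls a kernel function directly. */
int32_t syscall_dispatch(uint32_t number, const uint32_t *args)
{
    if (number >= sizeof syscall_table / sizeof syscall_table[0])
        return -ERR_BAD_SYSCALL;
    return syscall_table[number].wrapper(args);
}
```

With a table like this, the generic copy routine only needs `nargs`, and each wrapper carries the knowledge about which arguments are pointers.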