Threading library

Posted: Sun Jun 21, 2015 6:28 am
by wordnice
I'm writing a light microkernel (copy-pasted from osdever and your wiki), and now that the really low-level stuff like IRQ, IDT, PIT, putchar and dlmalloc (plus utilities like printf from another library) is done, I want to implement threads.

Is there any easy-to-deploy multithreading library for C? I found pth and lthread, but these have, let's say, more dependencies. Or is there a tutorial with code snippets showing how to build primitive threading? I don't know how to write it from the osdev wiki alone. I just need functions to create, join and stop threads and, optionally, mutex locking.

Thanks

PS: I wanted to try it via setjmp, but... someone said that it is not a good idea

Re: Threading library

Posted: Sun Jun 21, 2015 6:39 am
by Kevin
You say you have a microkernel, so you must have processes (tasks). Threads are essentially just tasks running in the same address space as other tasks. I'm sure you can do that without copying.

Re: Threading library

Posted: Sun Jun 21, 2015 6:43 am
by wordnice
Sorry, I worded that badly. I am trying to write something like a microkernel. I just need to implement threads (and processes) first, before the real fun begins. Thanks

Re: Threading library

Posted: Sun Jun 21, 2015 7:38 am
by Brendan
Hi,
wordnice wrote: Sorry, I worded that badly. I am trying to write something like a microkernel. I just need to implement threads (and processes) first, before the real fun begins. Thanks
A threading library is typically just a layer of glue between the API that processes expect (e.g. POSIX or something good) and the functionality the kernel/OS actually provides.

If you're writing a kernel (of any type) you have to write that "functionality the kernel/OS actually provides" yourself.

Note that the scheduler and how it works is extremely important for all OSs - it's the difference between a very sluggish OS (e.g. CPUs busy doing irrelevant work while user is frantically hammering the keyboard wondering why OS is unresponsive) and an OS that always seems to respond "immediately" to everything regardless of what other things are happening. For a micro-kernel (where scheduler has to deal with things like device drivers, etc) it's even more important than it would be for a monolithic kernel; and you'll find that the scheduler is strongly influenced by communication between tasks and tied in with IPC/message passing.


Cheers,

Brendan

Re: Threading library

Posted: Sun Jun 21, 2015 8:24 am
by wordnice
Thanks for the response.

I don't know how to write "the layer under the glue". I haven't implemented processes and threads (the part "under the glue") yet, and don't know where or with what I should start. Does it work like setjmp & longjmp?

Re: Threading library

Posted: Sun Jun 21, 2015 10:25 am
by Brendan
Hi,
wordnice wrote: I don't know how to write "the layer under the glue". I haven't implemented processes and threads (the part "under the glue") yet, and don't know where or with what I should start.
Start by inventing/creating some sort of data structure that contains information about a thread ("thread data structure"). This must include something to keep track of the thread's CPU state when it's not running. It might (or might not) also include things like:
  • Information about the thread's priority (used by scheduler)
  • Information about what the thread belongs to (e.g. if it belongs to the kernel or if it belongs to a process and which one)
  • Information about what the thread is able to access (e.g. if it is allowed to use networking or file IO, or whatever).
  • For multi-CPU; information about which CPUs the thread is allowed to use.
  • Statistical information (how much CPU time the thread has consumed, how many messages it has sent/received, etc).
  • A thread name (e.g. "GUI worker thread"), which can be used by utilities that show running threads/processes to the user (to make it much easier for users to understand than meaningless "thread #1234" numbers).
Don't get too worried about including everything you could possibly want - it should be easy enough to add new fields to your "thread data structure" later.

At the moment you already have one thread running; and you need to create a "thread data structure" for it. After inventing/designing your "thread data structure" create one for the initial/existing thread.
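As a rough illustration of the two paragraphs above, here is a minimal sketch in C of what such a structure and the setup for the already-running thread might look like. Every field name, the state enum and thread_init_initial() are made up for this example; a real design would choose its own fields.

```c
#include <stddef.h>
#include <stdint.h>

#define THREAD_NAME_MAX 32

/* Hypothetical thread states; the names are illustrative only. */
enum thread_state { THREAD_READY, THREAD_RUNNING, THREAD_WAITING, THREAD_DEAD };

/* One possible "thread data structure". Every field here is an assumption. */
struct thread {
    uintptr_t kernel_stack_top;   /* saved stack pointer while not running */
    enum thread_state state;
    int priority;                 /* used by the scheduler */
    int owner_pid;                /* 0 = belongs to the kernel */
    uint32_t permissions;         /* bit flags: networking, file IO, ... */
    uint32_t cpu_affinity;        /* bit n set = may run on CPU n */
    uint64_t cpu_time_ticks;      /* statistics */
    uint64_t messages_sent;
    char name[THREAD_NAME_MAX];   /* e.g. "GUI worker thread" */
};

/* Fill in a structure for the thread that is already running at boot. */
void thread_init_initial(struct thread *t, const char *name)
{
    size_t i;

    *t = (struct thread){0};
    t->state = THREAD_RUNNING;    /* it is running right now */
    t->cpu_affinity = 1u;         /* assume it starts on CPU 0 */
    for (i = 0; i + 1 < THREAD_NAME_MAX && name[i] != '\0'; i++)
        t->name[i] = name[i];
    t->name[i] = '\0';
}
```

The saved stack pointer is left at zero here because the initial thread is running; it only gets a meaningful value the first time that thread is switched away from.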

Then you need a way to create a new thread. This will be a function that creates a new "thread data structure"; including setting up values for the "something to keep track of the thread's CPU state when it's not running".
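A hedged sketch of such a function in C follows. It assumes a switch routine that pops four callee-preserved registers from the new stack and then returns, so the initial stack is built to look as if the thread had already been switched away from once. The struct is trimmed down to what this step needs, and KSTACK_WORDS, SAVED_REGS, thread_create() and demo_entry() are all hypothetical names.

```c
#include <stdint.h>
#include <stdlib.h>

#define KSTACK_WORDS 1024   /* assumed kernel stack size, in machine words */
#define SAVED_REGS   4      /* e.g. EBX, ESI, EDI, EBP on 32-bit x86 */

struct thread {
    uintptr_t *kernel_stack_top;    /* saved stack pointer while not running */
    uintptr_t *kernel_stack_base;   /* lowest address of the stack allocation */
};

/* Example entry point, present only so there is something to point at. */
void demo_entry(void) { }

/* Allocate a thread and build the stack the switch routine expects to find:
 * SAVED_REGS zeroed register slots, then the address it will "return" to. */
struct thread *thread_create(void (*entry)(void))
{
    struct thread *t = malloc(sizeof *t);
    if (t == NULL)
        return NULL;

    t->kernel_stack_base = malloc(KSTACK_WORDS * sizeof(uintptr_t));
    if (t->kernel_stack_base == NULL) {
        free(t);
        return NULL;
    }

    uintptr_t *sp = t->kernel_stack_base + KSTACK_WORDS;  /* stacks grow down */
    *--sp = (uintptr_t)entry;     /* consumed last, by the routine's return */
    for (int i = 0; i < SAVED_REGS; i++)
        *--sp = 0;                /* initial callee-preserved register values */

    t->kernel_stack_top = sp;
    return t;
}
```

Note that nothing here handles the entry function returning. A real version would probably also place the address of some thread-exit routine above the entry address, so that a returning thread lands somewhere sensible.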

Next step is to implement code that does "switch immediately to thread # X". This will mostly just save the current thread's state in its "something to keep track of the thread's CPU state when it's not running" and load the new thread's "something to keep track of the thread's CPU state when it's not running". This needs to be tested to make sure it works - for example, create a second thread, and add some code so that your initial/first thread calls this function to switch to the new thread, and the new thread calls this function to switch back to the initial/first thread; so that you end up with 2 threads constantly switching to each other. Do not continue until you're extremely sure that this function definitely works correctly.
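The real switch routine has to be written in assembly for your kernel, but the ping-pong test itself can be tried out in user space first. The sketch below is only an analogue: it assumes a hosted Linux/glibc environment and uses the ucontext functions (getcontext, makecontext, swapcontext) to stand in for "save the current state, load the other one". The names ping_pong(), second_thread() and DEMO_STACK_SIZE are invented for the example.

```c
#include <ucontext.h>

#define DEMO_STACK_SIZE (64 * 1024)

static ucontext_t first_ctx, second_ctx;
static char second_stack[DEMO_STACK_SIZE];
static int switches_seen;

/* Body of the second "thread": count each arrival, then switch straight back. */
static void second_thread(void)
{
    for (;;) {
        switches_seen++;
        swapcontext(&second_ctx, &first_ctx);   /* save second, load first */
    }
}

/* Bounce between the two contexts `rounds` times. Returns how many times
 * the second context actually got to run. */
int ping_pong(int rounds)
{
    switches_seen = 0;

    getcontext(&second_ctx);
    second_ctx.uc_stack.ss_sp = second_stack;
    second_ctx.uc_stack.ss_size = sizeof second_stack;
    second_ctx.uc_link = &first_ctx;   /* where to go if second_thread returns */
    makecontext(&second_ctx, second_thread, 0);

    for (int i = 0; i < rounds; i++)
        swapcontext(&first_ctx, &second_ctx);   /* save first, load second */

    return switches_seen;
}
```

If the switching works, ping_pong(n) comes back with n. The same kind of counter is a cheap sanity check for the kernel version: both threads should keep making progress, and neither stack should grow from one round to the next.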

Once you know that works, you need code that decides which thread to switch to (and calls the "switch immediately to thread # X" function once it has decided). There are many very different ways to do this. My advice is, if you don't know what you're doing, just implement a very simple (and very crappy) "round robin" thing for now.
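For reference, a very simple round robin can be as small as the sketch below. It assumes threads live in a fixed-size table where each slot is either in use or free; struct thread_slot and round_robin_next() are hypothetical names, and a real scheduler would more likely use a queue or linked list.

```c
#include <stddef.h>

struct thread_slot {
    int in_use;   /* non-zero if this slot holds a thread */
};

/* Pick the next in-use slot after `current`, wrapping around the table.
 * Returns `current` itself if it is the only thread, or -1 if the table
 * holds no threads at all. */
int round_robin_next(const struct thread_slot *table, size_t count, size_t current)
{
    for (size_t step = 1; step <= count; step++) {
        size_t candidate = (current + step) % count;
        if (table[candidate].in_use)
            return (int)candidate;
    }
    return -1;
}
```

The result would then be handed to the "switch immediately to thread # X" function, for example from the timer interrupt handler or when a thread gives up the CPU voluntarily.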

The next step is IPC ("Inter Process Communication"). The name is very misleading - this is how "things" (mostly threads) communicate and isn't for processes at all. There are many ways of doing IPC too. Research the alternatives and pick whichever is right for you; then implement it. You will find that sometimes threads need to wait until they receive data. This is important because it affects the scheduler. Basically, when a thread has to wait you tell the scheduler "don't give this thread CPU time until whatever it's waiting for occurs" and then when whatever it was waiting for occurs (e.g. it receives data via IPC) you tell the scheduler "Hey, that thread is no longer waiting and can be given CPU time again now".
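To make the wait/wake interaction concrete, here is a minimal sketch using a one-slot mailbox per thread. This is just one of the many possible IPC designs, chosen because it is short; ipc_try_receive(), ipc_send() and the struct layout are assumptions made for the example.

```c
enum thread_state { THREAD_READY, THREAD_WAITING };

struct thread {
    enum thread_state state;
    int has_message;   /* one-slot mailbox: 1 if `message` is valid */
    int message;
};

/* Receiver side: take a message if one is there, otherwise mark the thread
 * as waiting so the scheduler will skip it. Returns 1 if a message was taken. */
int ipc_try_receive(struct thread *self, int *out)
{
    if (self->has_message) {
        *out = self->message;
        self->has_message = 0;
        return 1;
    }
    self->state = THREAD_WAITING;   /* "don't give this thread CPU time" */
    return 0;
}

/* Sender side: deliver the message and wake the receiver if it was waiting.
 * Returns 0 on success, -1 if the one-slot mailbox is already full. */
int ipc_send(struct thread *receiver, int message)
{
    if (receiver->has_message)
        return -1;
    receiver->message = message;
    receiver->has_message = 1;
    if (receiver->state == THREAD_WAITING)
        receiver->state = THREAD_READY;   /* "can be given CPU time again" */
    return 0;
}
```

In a real kernel the receive path would call the scheduler right after marking the thread as waiting, and both functions would need protection against interrupts and other CPUs (for example by disabling interrupts or taking a lock). Both are left out here to keep the state changes visible.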

After your IPC works; if you implemented a very simple (and very crappy) "round robin" thing before; replace it with something that doesn't suck. At this point I'd strongly recommend doing some benchmarking - e.g. how long does a thread switch take, how quickly can you send data via IPC and get a response back? Most importantly; if there are 123 low priority threads running and a high priority thread becomes ready to run, how long does it take for the high priority thread to get CPU time? You want to make sure the scheduler behaves the way you hoped it would under various conditions; and possibly tune/modify the scheduler to improve its behaviour under various conditions.

Next; you want to have some way to kill a thread. Actually, you want 2 ways - one for when the thread voluntarily terminates itself (e.g. "thread_exit()"); and one for when a thread has been naughty (crashed) and needs to be forcibly killed. This mostly means code to free any resources the thread was using, and (possibly) notifying other threads that the terminated thread was killed (and why).
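One way to structure the two paths is to share a single teardown routine and only vary the recorded reason. The sketch below assumes the thread owns just one resource (its kernel stack) and offers an optional notification hook; thread_teardown(), thread_exit(), thread_kill(), count_death() and the enum names are all hypothetical.

```c
#include <stdlib.h>

enum thread_state { THREAD_READY, THREAD_DEAD };
enum death_reason { REASON_NONE, REASON_VOLUNTARY, REASON_KILLED };

struct thread {
    enum thread_state state;
    enum death_reason reason;
    void *kernel_stack;                      /* resource owned by the thread */
    void (*on_death)(struct thread *dead);   /* optional notification hook */
};

/* Example hook: just counts how many deaths it has been told about. */
int deaths_seen;
void count_death(struct thread *dead) { (void)dead; deaths_seen++; }

/* Shared teardown: free resources, record why, notify whoever asked. */
static void thread_teardown(struct thread *t, enum death_reason reason)
{
    free(t->kernel_stack);
    t->kernel_stack = NULL;
    t->state = THREAD_DEAD;
    t->reason = reason;
    if (t->on_death != NULL)
        t->on_death(t);
}

/* Voluntary termination, requested by the thread itself. */
void thread_exit(struct thread *self)   { thread_teardown(self, REASON_VOLUNTARY); }

/* Forced termination, e.g. after a crash. */
void thread_kill(struct thread *victim) { thread_teardown(victim, REASON_KILLED); }
```

One caveat: a thread cannot safely free the stack it is currently running on. In a real kernel the voluntary path would typically mark the thread dead, switch away, and leave the actual freeing to some other thread.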

Once all of that is done, you will be able to create new threads and terminate them, and schedule them (give them CPU time); and the threads will be able to communicate with each other and do useful work. However; you will probably only be using "kernel threads" at this stage. The next step would be implementing code to create "user-space" processes; where starting a process includes creating an initial thread for the new process. This will probably involve some sort of executable file loader.

Once you're able to start new processes, you're going to want to provide some sort of kernel API that processes can use to do things like allocate memory, create threads, communicate (via IPC), etc. This kernel API is what your "threading library" would use.
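The split between the kernel API and the library on top of it might look roughly like the sketch below. It is a simulation in plain C: kernel_syscall() is called directly, whereas a real kernel would be entered through a software interrupt or a syscall instruction. The call numbers and the function names kernel_syscall() and lib_thread_create() are invented for illustration.

```c
#include <stdint.h>

/* Hypothetical system call numbers; a real kernel defines its own. */
enum { SYS_THREAD_CREATE = 1, SYS_THREAD_EXIT = 2 };

static int next_thread_id = 1;

/* Kernel side: a single entry point that dispatches on the call number. */
intptr_t kernel_syscall(int number, intptr_t arg)
{
    switch (number) {
    case SYS_THREAD_CREATE:
        (void)arg;                 /* would be the new thread's entry address */
        return next_thread_id++;   /* pretend a thread was created */
    case SYS_THREAD_EXIT:
        return 0;
    default:
        return -1;                 /* unknown call number */
    }
}

/* Library side: the "glue" that gives programs a friendlier interface. */
int lib_thread_create(void (*entry)(void))
{
    return (int)kernel_syscall(SYS_THREAD_CREATE, (intptr_t)entry);
}
```

A POSIX-style library would be built from wrappers of the same kind, each one translating the arguments a program expects into whatever the kernel's own calls take.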
wordnice wrote: Does it work like setjmp & longjmp?
Sort of, but not really. ;)


Cheers,

Brendan

Re: Threading library

Posted: Sun Jun 21, 2015 11:54 am
by wordnice
Thanks for the exhaustive answer. Everything is clear except how to "save current state". I have just one question: is it something like setjmp()? I am an assembler newbie who knows only a few very basic instructions, which is enough when not actually writing an OS from scratch.
Again, many thanks for the answer.

Re: Threading library

Posted: Sun Jun 21, 2015 12:29 pm
by Brendan
Hi,
wordnice wrote: Thanks for the exhaustive answer. Everything is clear except how to "save current state". I have just one question: is it something like setjmp()? I am an assembler newbie who knows only a few very basic instructions, which is enough when not actually writing an OS from scratch.
For "save current state" you need to ensure that whatever needs to be saved is saved. It's extremely likely this includes the contents of CPU registers (including ESP and EIP). It may also include FPU/MMX/SSE/AVX state, plus potentially other more obscure things (e.g. the debug registers if you're doing "per thread debugging", the performance monitoring state if you're doing "per thread performance monitoring", etc).

Note that "ensure whatever needs to be saved is saved" doesn't necessarily mean saving it yourself in cases where you know it must've already been saved. Specifically, for most calling conventions there are "caller preserved" registers which must've already been saved by caller and therefore needn't be saved again. Also calling a function causes the caller's EIP to be saved.

Also; "ensure whatever needs to be saved is saved" doesn't really include anything that remains the same. For example, while kernel is running most segment registers are typically constant and needn't be saved, and a thread's CR3 doesn't change after the thread is created either.

For "bare minimum" saving a thread's state may mean pushing "callee preserved" registers onto the thread's kernel stack; saving "kernel stack top" into the thread's "thread data structure", and nothing else. Of course "load saved state" is the opposite. E.g. for "bare minimum" you'd load ESP with "kernel stack top" from the thread's "thread data structure", and pop the "callee preserved" registers from the thread's kernel stack. This forms the basis of your "switch immediately to thread # X" code; which does "save old thread's state" followed by "load new thread's state".
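The real routine is a handful of assembly instructions, but the order of operations can be shown with a toy model in plain C. Below, struct toy_cpu stands in for the processor (just ESP plus the four registers that are callee-preserved in the common 32-bit x86 calling convention), and save_state(), load_state() and switch_to() are hypothetical names. This is a simulation of the idea, not code that switches real threads.

```c
#include <stdint.h>

#define STACK_WORDS 64

/* A toy CPU: only the stack pointer and four callee-preserved registers. */
struct toy_cpu {
    uint32_t *esp;
    uint32_t ebx, esi, edi, ebp;
};

struct thread {
    uint32_t *kernel_stack_top;    /* saved ESP while not running */
    uint32_t stack[STACK_WORDS];   /* the thread's kernel stack */
};

/* "Save old thread's state": push the registers, then remember ESP. */
void save_state(struct toy_cpu *cpu, struct thread *old)
{
    *--cpu->esp = cpu->ebx;
    *--cpu->esp = cpu->esi;
    *--cpu->esp = cpu->edi;
    *--cpu->esp = cpu->ebp;
    old->kernel_stack_top = cpu->esp;
}

/* "Load new thread's state": restore ESP, then pop in reverse order. */
void load_state(struct toy_cpu *cpu, const struct thread *next)
{
    cpu->esp = next->kernel_stack_top;
    cpu->ebp = *cpu->esp++;
    cpu->edi = *cpu->esp++;
    cpu->esi = *cpu->esp++;
    cpu->ebx = *cpu->esp++;
}

/* "Switch immediately to thread # X": save the old state, load the new. */
void switch_to(struct toy_cpu *cpu, struct thread *old, const struct thread *next)
{
    save_state(cpu, old);
    load_state(cpu, next);
}
```

In the assembly version the return address is already on the stack because the routine was called as a function, so the final return instruction is what resumes the new thread at the point where it was last switched away from.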


Cheers,

Brendan

Re: Threading library

Posted: Sun Jun 21, 2015 12:49 pm
by wordnice
Many thanks for the help! The fun begins.