Do Threads Remain the Best Concurrency Abstraction?

Discussions on more advanced topics such as monolithic vs micro-kernels, transactional memory models, and paging vs segmentation should go here. Use this forum to expand and improve the wiki!
Avarok
Member
Member
Posts: 102
Joined: Thu Aug 30, 2007 9:09 pm

Post by Avarok »

I suppose that's why this belongs in OS Design & Theory. :D

Well, the plan for me wasn't quite continuations because the idea isn't to throw them away each time though I suppose you could since they're rather cheap. My plan was to keep them by default so that you could pre-empt them and use them for multiplexing as well.

The idea was to then furnish a separate ring 0 and ring 3 change_thread(), and perhaps put the change_thread() code in execute-only pages for each agent.

The change_thread() would pre-emptively iterate through their continuation stack in a round-robin fashion, which is okay because we move up and down using interrupt prioritization via the LAPIC.

The continuations are being stored by pushing everything onto the continuation's (users) stack and storing the RSP on the continuation stack (belonging to us). I then put a separate push on there for status information.

So anyways, now it looks like this:

Code: Select all

struct Continuation {
 ulong flags = 
 [63..3 = signal I'm waiting for]
 [2 = is waiting for a signal]
 [1 = is awake]
 [0 = is alive];
 ulong RSP;
}
The thought at this point is that you can now simply call broadcast_signal_to_threads(4); and all the threads listening on signal 4 are changed from waiting to asleep (ready to be awake). I'm doing this simply by scanning the continuation stack for the appropriate value after shifting 4 << x.

So yes, my plan then uses pervasive continuations, as event handlers at entry points for IPC and RPC, and for doing various things inside a "process/agent". In fact, the plan was to do this exclusively.

Things I encountered so far:

1)
I realized you can't push status data on active threads or mark continuations dead very effectively if you try to do it by storing it on the user thread's stack. The initial plan was to push it just before it went to sleep, and pop it every time just after it woke up and before it resumed.

2)
Killing threads can be done simply by setting the bit, which is checked during change_thread. However, that doesn't clear it off the stack. An innovation I came up with to remedy the dirtiness was to, when executing change_thread(), store the old stack back if that one was dead. It causes the algorithm to slowly shift all dead stacks to the bottom, where the RSP simply adjusts.

3)
I think the signal thing is kind of cool. It evolved from an explanation I made on the idea of one thread ceding to another thread. It's asynchronous if you keep running, or synchronous if you then cooperatively give up this CPU session.

4)
State changes on threads that may be continuated or awake are asynchronous atm. I'm thinking you'd have to pre-empt it, since it couldn't possibly be aware of the state change synchronously while running.

Definitely would like some more exploration into this subject.

For example, perhaps more on why you feel that discarding continuations after one use is better than keeping them?
There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies.
- C. A. R. Hoare
Crazed123
Member
Member
Posts: 248
Joined: Thu Oct 21, 2004 11:00 pm

Post by Crazed123 »

You discard continuations after one use because otherwise you would have to copy them on each use.

A continuation isn't just the registers and things saved by an interrupt, it includes the entire stack. So if you throw to a continuation, that stack goes from read-only data in a continuation to read-write data of a running computation. Therefore, the only way to preserve the actual saved continuation when it is thrown is to copy it, including the entire stack.

Most implementations of Scheme, ML and other languages with continuations actually do this to make continuations immutable values, but I don't see why an operating-system kernel should go to the trouble.
Avarok
Member
Member
Posts: 102
Joined: Thu Aug 30, 2007 9:09 pm

Post by Avarok »

Yes, of course it includes the entire stack. The thing is though, is that the stack isn't lost simply by using it once. The data on it may be pushed off, but the memory is still allocated and the RSP and RBP are still in the right places. By running that thread and assuming it clears off it's stack data, you're back in a ready state for more data to be pushed on. Even if you pre-empt to do other stuff and it's still got stuff on the stack, that tends to mean you want to keep that stack until it's processed. I also suggest that you can just mark it's pages read/write rather than read only. In fact, I don't see the benefit of marking pages read only unless you want to lock it because you don't trust other things in your address space. In which case, it ought not share an address space?

*note the "?"

IMHO spurious copies are evil.
There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies.
- C. A. R. Hoare
Crazed123
Member
Member
Posts: 248
Joined: Thu Oct 21, 2004 11:00 pm

Post by Crazed123 »

My point is that once you throw a continuation, the data in those registers and on that stack just isn't the same any longer. Since it isn't the same data, it is no longer the same continuation. And since continuations cannot be mutable objects, the choice a kernel developer implementing continuations faces is whether to:

1) Copy the continuation and throw to the copy on a throw. This will make continuations immutable data types at the cost of massive data copying.
2) Destroy the continuation on a throw. This doesn't provide as much theoretical elegance as #1, but works without copying dozens of pages on every throw.
Avarok
Member
Member
Posts: 102
Joined: Thu Aug 30, 2007 9:09 pm

Post by Avarok »

Okay... :p

I've read the Wikipedia listing for Continuations. They're interesting, but a long ways from what I'm thinking of. I *am* thinking in threads.

I sort of understand now what you mean about copying continuations. Threads are much easier to continually and repeatedly use without any copying, and tend to be much easier to initialize.

After reading a few PDFs and the Wikipedia entry on multi-threading, I'm under the impression I may have stumbled into some solutions for scheduling problems:

1) By using round-robin (and/or possibly another simple) context switchers and a LAPIC prioritization scheme for interrupts, one can reduce the need for complex scheduling and improve SMP handling at the same time. I'm probably missing something here, but we might be able to shift it away from being a complicated daemon.

2) By using a ring 3 context switcher for threads, we still have the problem of our pre-emption code being based on the timer, which is a ring 0 thing. If we could pre-empt in ring 3, we could make the context-switching cost roughly half as much.

3) By using multiple entry points, rather than forcing threads to branch from a single entry-point, one can save some spurious code execution and make for cheap event handlers by providing entry-points/handler stack population.

4) Mutexes, spin-locks, and atomic write strategies are the simile for the complication wrought by IPC between processes. A thread's stack is shared memory if more than one thread can write to it.

5) By providing a ring 0 atomic_write(a,b) function which uses cli/sti to avoid pre-emption (and then possibly cooperatively thread switches), one can safely and efficiently serialize threads (avoiding the nasties) for shared memory writes without any additional complexity.

Just my thoughts.
There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies.
- C. A. R. Hoare
Crazed123
Member
Member
Posts: 248
Joined: Thu Oct 21, 2004 11:00 pm

Post by Crazed123 »

Yes, you have been thinking of threads.

However, I'm going to note one of the last ways in which continuation systems act as a kind of mirror image of thread systems. A thread system grants concurrency with parallelism bolted on as an OS feature. A continuation system grants parallelism natively with concurrency (time-sharing) built in user space.

So I personally think that a multicore system will see far more benefit from continuations than a single-core system.
Avarok
Member
Member
Posts: 102
Joined: Thu Aug 30, 2007 9:09 pm

Post by Avarok »

:) Well, the summary is where we differ now.

Actually threads in the system I'm thinking are concurrency in it's most streamlined form assuming all threads are active. Saving the state of a thread for later execution is overhead (continuation or thread); useful only as a means of desynchronizing execution so we can prioritize something else.

Nothing about my threads involves "bolted on", as they're not an extension or feature of some other strategy, but a replacement for them as an execution model. Since they are the single execution primitive, they ought be rather effective and efficient comparatively.

Neither threads nor continuations can exist purely in ring 3 and be pre-emptive (a requirement more or less for computing unsafe code). While threads or continuations can implement cooperative strategies, both require the system timer to pre-empt.

Your implementation stores continuation locations in user-space and so the timer (assuming you pre-empt) then switches between something *else* which then determines which continuations to execute. My system instead stores pointers to stacks, and so the "OS" can switch directly into waiting execution states, which means I *can* use them as my only execution primitive.

That's how I see it.
There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies.
- C. A. R. Hoare
Crazed123
Member
Member
Posts: 248
Joined: Thu Oct 21, 2004 11:00 pm

Post by Crazed123 »

My continuation system does *not* schedule (beyond event handling). It does *not* have any primitive underlying continuations. And it does *not* store continuations in user space. They are kernel-space objects that user programs can manipulate using a name, like all other kernel objects.

Instead, it uses events to pass the handling of an interrupt (like the timer interrupt) on to user space. The event handler can then run whatever it likes.

Thus, a preemptive concurrency system can be implemented in user-space using only the primitives given by the kernel.

I mean no offense, but have you actually read and understood what I've described in this thread? Am I just that bad a writer?
Avarok
Member
Member
Posts: 102
Joined: Thu Aug 30, 2007 9:09 pm

Post by Avarok »

My continuation system does *not* schedule (beyond event handling).
Sir, I did not say it did.
It does *not* have any primitive underlying continuations.
I quote dictionary.reference.com:
Primitive: (7) Computer Science: A basic or fundamental unit of machine instruction or translation.
You mean to tell me that it lacks any fundamental unit of machine instruction or translation? Here I thought the objective was to accomplish exactly that; the definition of a more effective primitive.
And it does *not* store continuations in user space.
Instead, it uses events to pass the handling of an interrupt (like the timer interrupt) on to user space. The event handler can then run whatever it likes.
So the kernel stores the object, and the user programs manipulate them using a name. The system timer triggers the interrupt handler but the whole thing runs in ring 3. The event handler can run whatever it likes, but it can't run since it doesn't have any context until it acquires one from your continuations, which you say are kernel space objects.

Thus a pre-emptive concurrency system can be implemented in user space but still requires a switch to ring 0 every time the system timer triggers the handler interrupt.

Moreso, you don't have any primitive underlying continuations, but your system is implemented in user space using only primitives given by the kernel.

No offense to you sir, but I would argue the latter.
There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies.
- C. A. R. Hoare
Crazed123
Member
Member
Posts: 248
Joined: Thu Oct 21, 2004 11:00 pm

Post by Crazed123 »

Avarok wrote: You mean to tell me that it lacks any fundamental unit of machine instruction or translation? Here I thought the objective was to accomplish exactly that; the definition of a more effective primitive.
Right. And primitives don't have other things sitting beneath them. They are the foundations, and thus they require no foundation.

Rather than switching to a kernel-space *something* which determines what continuations run (the underlying primitive you seem to think is there), a continuation system switches continuations either on an event or at the command of user-space. In either case, the command to switch includes a new continuation to switch to.
So the kernel stores the object, and the user programs manipulate them using a name. The system timer triggers the interrupt handler but the whole thing runs in ring 3. The event handler can run whatever it likes, but it can't run since it doesn't have any context until it acquires one from your continuations, which you say are kernel space objects.
What is this context you speak of?
Thus a pre-emptive concurrency system can be implemented in user space but still requires a switch to ring 0 every time the system timer triggers the handler interrupt.
Basically. All operating-systems switch to kernel mode on interrupts.
Moreso, you don't have any primitive underlying continuations, but your system is implemented in user space using only primitives given by the kernel.
The continuation is the primitive given by the kernel.
No offense to you sir, but I would argue the latter.
The latter what, exactly? You aren't making much sense.
Avarok
Member
Member
Posts: 102
Joined: Thu Aug 30, 2007 9:09 pm

Post by Avarok »

*sigh*
Right. And primitives don't have other things sitting beneath them. They are the foundations, and thus they require no foundation.
Exactly. That's the concept. They are "whatever the most fundamentally small useful piece is". That's the definition.

To elaborate, since you don't seem to have understood the dictionary definition; electrons, neutrons and protons make excellent primitives in chemistry for example. Sure they have something underneath them, but such things are unnecessary to perform chemistry.
Rather than switching to a kernel-space *something* which determines what continuations run
The system timer interrupts whatever is happening. That interrupt inherently, by definition, jumps into ring 0 (kernel space). You really don't have any choice in that with an x86. I suggested it would be wonderful if you did, but you don't. There is no real difference as to whether you jump back into ring 3 before figuring out which context to restore, or after. Unless you figure it out in ring 3, and then attempt to restore it in ring 0; in which case you're making the transition twice as often.
What is this context you speak of?
Google "context". Look it up on dictionary.reference.com. It's important if you ever want to discussing computing that you learn all the terminology and existing strategies. Otherwise you jump on the first boat you see - such as "continuations" for example.
There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies.
- C. A. R. Hoare
Crazed123
Member
Member
Posts: 248
Joined: Thu Oct 21, 2004 11:00 pm

Post by Crazed123 »

Avarok wrote:*sigh*

Exactly. That's the concept. They are "whatever the most fundamentally small useful piece is". That's the definition.

To elaborate, since you don't seem to have understood the dictionary definition; electrons, neutrons and protons make excellent primitives in chemistry for example. Sure they have something underneath them, but such things are unnecessary to perform chemistry.
Please refrain from condescending to me just because I do not understand your broken English.
The system timer interrupts whatever is happening. That interrupt inherently, by definition, jumps into ring 0 (kernel space). You really don't have any choice in that with an x86. I suggested it would be wonderful if you did, but you don't. There is no real difference as to whether you jump back into ring 3 before figuring out which context to restore, or after. Unless you figure it out in ring 3, and then attempt to restore it in ring 0; in which case you're making the transition twice as often.
I understand quite well about the transition to ring 0 on an interrupt, having actually coded interrupt handlers, but you do not seem to understand that a continuation system needs only the simplest and least functional of kernel-land schedulers to decide what to run after an interrupt. It just runs the event handlers for the event associated with the interrupt.
Google "context". Look it up on dictionary.reference.com. It's important if you ever want to discussing computing that you learn all the terminology and existing strategies. Otherwise you jump on the first boat you see - such as "continuations" for example.
As I said before, do not condescend to me for failing to understand your personal tongue. I know well what "context" means, but as we can see here:
So the kernel stores the object, and the user programs manipulate them using a name. The system timer triggers the interrupt handler but the whole thing runs in ring 3. The event handler can run whatever it likes, but it can't run since it doesn't have any context until it acquires one from your continuations, which you say are kernel space objects.
you appear to have used the word... out of context. What kind of context must an event handler "acquire" from a continuation to run?
speal
Member
Member
Posts: 43
Joined: Wed Mar 07, 2007 10:09 am
Location: Minneapolis, Minnesota
Contact:

Post by speal »

I've been trying to follow this discussion since the idea of an alternate to processes/threads is an appealing one to me. I wasn't familiar with continuations before, but I've tried to figure it out with some reading. Maybe I've misunderstood, but it seems to me that there's nothing inherently concurrent about a continuation. My assumption was that the behavior was much like a function call, except instead of returning a single value, the continuation will yield an intermediate value, and can be resumed at a later time to return another intermediate value. Is this correct?

If I'm right, I don't see how you can construct a system of concurrency from this, as the control flow is still in one place, but is operating in a different context when resuming a continuation.

The only idea I had for this was that you could assume a program will eventually want many intermediate values from a continuation, so the continuation may be called repeatedly, and its output stored until the program that initiated the continuation requests them. Is this anywhere near what you're proposing?
User avatar
Combuster
Member
Member
Posts: 9301
Joined: Wed Oct 18, 2006 3:45 am
Libera.chat IRC: [com]buster
Location: On the balcony, where I can actually keep 1½m distance
Contact:

Post by Combuster »

continuations are just as concurrent as single-core CPU's - they appear to be concurrent because they are done in bits after each other. The only difference is that continuations make this decade-old 'cheat' tangible to the userland programmer.
"Certainly avoid yourself. He is a newbie and might not realize it. You'll hate his code deeply a few years down the road." - Sortie
[ My OS ] [ VDisk/SFS ]
speal
Member
Member
Posts: 43
Joined: Wed Mar 07, 2007 10:09 am
Location: Minneapolis, Minnesota
Contact:

Post by speal »

So in the end, you still need something like a process abstraction to handle independent programs executing "simultaneously."

Seems to me that this has very little to do with how the kernel presents itself, but instead is simply a method of dispatching kernel tasks and handling events. It's not so much an alternative to processes/threads as it is a construction that is useful for people when writing a preemptible kernel.

It's an interesting idea, but I don't see it as an alternative to threads. The fact that continuations are often described as "gotos that pass a value" suggests the added complexity in your control flow may not be worth the "elegant solution," if you see it that way.
Post Reply