Setup syscall stack pointer

torshie
Member
Posts: 89
Joined: Sun Jan 11, 2009 7:41 pm

Setup syscall stack pointer

Post by torshie »

I'm using the 64-bit syscall/sysret instructions for system call handling, and I'm not sure how to set up the system call stack.
Should I put the system call stack in the lower half or in the higher half? Is it necessary to set up a system call stack for every user mode thread?

Thanks
-torshie
Nessphoro
Member
Posts: 308
Joined: Sat Apr 30, 2011 12:50 am

Re: Setup syscall stack pointer

Post by Nessphoro »

Okay, I do not know a lot about syscall/sysret, but regarding the syscalls -

There are two models by which you can go,

1: One kernel stack per CPU Core - very hard to implement since you need to use tricks like stack continuation.

2: One kernel stack per thread - very easy to implement: you just change ESP0 in the TSS (if you're using interrupts), or the MSR, to point to the thread's kernel stack.
Also, since writing to the MSR is quite slow, you can put a memory location in there instead and load the ESPs for all the threads from it.
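
As a rough sketch of model 2 (not code from this thread): on every context switch the kernel repoints the stack used on kernel entry at the incoming thread's kernel stack. Note that on x86-64 the SYSCALL instruction does not load RSP itself, so the entry stub has to fetch the kernel stack pointer from memory, commonly from a per-CPU area, while interrupts from ring 3 take RSP0 from the TSS. All names below (`struct thread`, `struct cpu`, `switch_kernel_stack`) are hypothetical, and the TSS is reduced to the one field that matters here.

```c
#include <assert.h>
#include <stdint.h>

#define KSTACK_SIZE 4096  /* assumed size of each per-thread kernel stack */

/* Hypothetical thread control block that owns its own kernel stack. */
struct thread {
    uint8_t kstack[KSTACK_SIZE];
};

/* Hypothetical per-CPU data. In a real kernel tss_rsp0 would be the RSP0
 * field of the TSS and syscall_rsp a slot the SYSCALL entry stub reads. */
struct cpu {
    uint64_t tss_rsp0;      /* loaded by the CPU on ring 3 -> ring 0 interrupts */
    uint64_t syscall_rsp;   /* loaded in software by the SYSCALL entry stub */
    struct thread *current;
};

/* Top of a thread's kernel stack; x86 stacks grow downwards. */
static uint64_t kstack_top(struct thread *t)
{
    return (uint64_t)(uintptr_t)(t->kstack + KSTACK_SIZE);
}

/* Called on the context switch path: both entry mechanisms must now land
 * on the incoming thread's kernel stack. These are plain memory writes,
 * so no MSR has to be rewritten per switch. */
static void switch_kernel_stack(struct cpu *c, struct thread *next)
{
    c->tss_rsp0 = kstack_top(next);
    c->syscall_rsp = kstack_top(next);
    c->current = next;
}
```

With this layout the answer to "one stack per user thread?" is yes for model 2: each thread carries its own kernel stack, and the per-CPU slots only ever point at the current one.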

Cheers,

Paul
gerryg400
Member
Posts: 1801
Joined: Thu Mar 25, 2010 11:26 pm
Location: Melbourne, Australia

Re: Setup syscall stack pointer

Post by gerryg400 »

3. A combination of both. You can have a small kernel stack per thread (about 96 bytes for i386, 160 bytes for x86_64) that's just big enough to hold the ring 3 context and then switch to a bigger kernel stack per core that's used for performing syscalls, interrupts and faults. There are some tricks needed to pre-empt system calls and nest interrupts but it's all very doable (at least in a microkernel).
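
To make the 160-byte figure concrete, here is one hypothetical x86_64 save area that happens to come to exactly that size: 15 general-purpose registers plus the five values of an interrupt return frame. This is only an illustration; the field order, the names, and the per-core stack size are assumptions, not gerryg400's actual layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical ring 3 context: 20 quadwords = 160 bytes. Lowest address
 * first, so an entry stub that pushes rax first and r15 last, on top of
 * the CPU-pushed frame, fills the struct exactly. */
struct ring3_context {
    uint64_t r15, r14, r13, r12, r11, r10, r9, r8;
    uint64_t rbp, rdi, rsi, rdx, rcx, rbx, rax;
    uint64_t rip, cs, rflags, rsp, ss;   /* interrupt return frame */
};

_Static_assert(sizeof(struct ring3_context) == 160,
               "save area should be 20 quadwords");

#define CORE_STACK_SIZE 8192  /* assumed size of the big per-core stack */

struct thread { struct ring3_context ctx; };   /* small, one per thread */
struct core   { uint8_t stack[CORE_STACK_SIZE]; };  /* big, one per core */

/* Initial stack pointer on kernel entry: the end of the save area. */
static uint64_t entry_stack_top(struct thread *t)
{
    return (uint64_t)(uintptr_t)(&t->ctx + 1);
}

/* Stack pointer to switch to once the ring 3 context has been saved. */
static uint64_t core_stack_top(struct core *c)
{
    return (uint64_t)(uintptr_t)(c->stack + CORE_STACK_SIZE);
}
```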
If a trainstation is where trains stop, what is a workstation ?
torshie
Member
Posts: 89
Joined: Sun Jan 11, 2009 7:41 pm

Re: Setup syscall stack pointer

Post by torshie »

Nessphoro wrote:Okay, I do not know a lot about syscall/sysret, but regarding the syscalls -

There are two models by which you can go,

1: One kernel stack per CPU Core - very hard to implement since you need to use tricks like stack continuation.

2: One kernel stack per thread - very easy to implement you just change the ESP0 in TSS (If you're using interrupts) - or the MSR to point to the threads kernel stack.
And also since writing to the MSR is quite slow you can put a memory location in there to load the ESP's for all the threads.

Cheers,

Paul
I'm very interested in the "one stack per CPU core" model. Do you have any documents or links about it?
I tried this model, but couldn't figure out how to handle long system calls. During a long system call, another system call (from a different process) can arrive in the middle of the current one. I have no idea how to handle this kind of mess :(
torshie
Member
Posts: 89
Joined: Sun Jan 11, 2009 7:41 pm

Re: Setup syscall stack pointer

Post by torshie »

It's not only long system calls that have this problem. If the kernel is fully preemptible, another system call could happen anytime, anywhere :shock:
Nessphoro
Member
Posts: 308
Joined: Sat Apr 30, 2011 12:50 am

Re: Setup syscall stack pointer

Post by Nessphoro »

The only reference I can point you to right now is
http://i30www.ira.uka.de/~neider/edu/mk ... /ch02.html

But as it points out: "All threads executing on one CPU can use the same kernel stack. As a consequence, either only one thread can execute in kernel mode at any time (i.e., threads executing in kernel mode cannot be preempted), or unusual approaches such as continuations must be used." I would advise against it, but suit yourself, sir.
gerryg400
Member
Posts: 1801
Joined: Thu Mar 25, 2010 11:26 pm
Location: Melbourne, Australia

Re: Setup syscall stack pointer

Post by gerryg400 »

You're assuming that (pre-emptable == good). This is not an automatic decision.

I always wonder what the advantages of a pre-emptable kernel are. As far as I can see it's about scheduling latency.

Writing a fully pre-emptable kernel is not easy, and it doesn't guarantee that the maximum interrupt and/or scheduling latency is shorter than in a non-pre-emptable kernel. The reason is that you usually cannot pre-empt or sleep a thread that is holding an exclusive lock, or a deadlock will occur. This in turn means that you need to disable pre-emption, and perhaps interrupts, frequently in your code. The kernel is really only pre-emptable while pre-emption isn't disabled.
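
A minimal sketch of what "disabling pre-emption" tends to look like, assuming a simple nesting counter and a deferred-reschedule flag. The names are hypothetical, the state would be per-CPU in a real kernel, and `schedule()` is a stand-in that only counts calls.

```c
#include <assert.h>
#include <stdbool.h>

static int  preempt_count;     /* > 0 means pre-emption is disabled */
static bool resched_pending;   /* a reschedule was requested meanwhile */
static int  context_switches;  /* stand-in for real scheduling work */

static void schedule(void)
{
    resched_pending = false;
    context_switches++;
}

/* Taken before acquiring an exclusive lock; calls may nest. */
static void preempt_disable(void)
{
    preempt_count++;
}

/* Released after dropping the lock; runs any reschedule that was deferred. */
static void preempt_enable(void)
{
    assert(preempt_count > 0);
    if (--preempt_count == 0 && resched_pending)
        schedule();
}

/* Timer interrupt deciding that the current thread's time slice is over. */
static void timer_tick(void)
{
    if (preempt_count > 0)
        resched_pending = true;   /* lock held: defer, adding latency */
    else
        schedule();
}
```

The deferred branch in `timer_tick` is exactly where the extra latency comes from: the longer a lock is held, the longer the reschedule waits.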

I think it's better to decide what sort of latencies you can live with and try to keep your system calls shorter than that. If a system call is longer, it can be split up into pieces to allow co-operative pre-emption between them. It may even be possible for a split-up system call to be completed by more than one core.

You may judge that this is okay and it seems the Linux guys did after quite some debate. Personally I feel that, for a microkernel at least, co-operative pre-emption provides better control over the latencies of the kernel.
Owen
Member
Posts: 1700
Joined: Fri Jun 13, 2008 3:21 pm
Location: Cambridge, United Kingdom

Re: Setup syscall stack pointer

Post by Owen »

...This is all fine if you design your kernel to never run with interrupts disabled (or only disable them during some critical scheduler operations)
gerryg400
Member
Posts: 1801
Joined: Thu Mar 25, 2010 11:26 pm
Location: Melbourne, Australia

Re: Setup syscall stack pointer

Post by gerryg400 »

Owen wrote:...This is all fine if you design your kernel to never run with interrupts disabled (or only disable them during some critical scheduler operations)
Is there any other way ? What are you thinking ?
Nessphoro
Member
Posts: 308
Joined: Sat Apr 30, 2011 12:50 am

Re: Setup syscall stack pointer

Post by Nessphoro »

Here this should give you an idea:
http://www.disy.cse.unsw.edu.au/theses_ ... warton.pdf

Single-stack kernels are only possible if (I think):
1: A thread's data is discarded on a context switch
2: You use blocking context switches
gerryg400
Member
Posts: 1801
Joined: Thu Mar 25, 2010 11:26 pm
Location: Melbourne, Australia

Re: Setup syscall stack pointer

Post by gerryg400 »

Nessphoro wrote:Here this should give you an idea:
http://www.disy.cse.unsw.edu.au/theses_ ... warton.pdf

Single stack kernels are only possible if (I think):
1:Threads data is being discarded on a context switch
2:You use blocking context switches
I'll have a proper read of that article tonight. I do see now what continuations are. I do use a continuation function (just one function, not a stack of them) in my kernel. When a thread wakes from a blocked state (let's say it went to sleep waiting for a thread to join) the scheduler calls a post-processing function on its behalf to complete the system call. In this case it grabs the exit status and returns it to the joiner.
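
The join case could be sketched roughly like this. It is a simplified model, not the actual kernel: the thread structure, the function names, and the idea of running the continuation directly from `wake()` are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum state { RUNNABLE, BLOCKED, EXITED };

/* Hypothetical TCB: holds one continuation, not a stack of them. */
struct thread {
    enum state state;
    int exit_status;
    long syscall_result;                    /* value returned to user mode */
    struct thread *joining;                 /* thread being waited for */
    void (*continuation)(struct thread *);  /* post-processing on wake-up */
};

/* Completes a blocked join: fetch the exit status for the joiner. */
static void join_continuation(struct thread *self)
{
    self->syscall_result = self->joining->exit_status;
    self->joining = NULL;
}

static void sys_join(struct thread *self, struct thread *target)
{
    self->joining = target;
    if (target->state == EXITED) {          /* nothing to wait for */
        join_continuation(self);
        return;
    }
    /* Everything needed to finish lives in the TCB, so the kernel stack
     * can be reused by whichever thread runs next. */
    self->continuation = join_continuation;
    self->state = BLOCKED;
}

/* Scheduler side: wake a thread and finish its system call for it. */
static void wake(struct thread *t)
{
    t->state = RUNNABLE;
    if (t->continuation != NULL) {
        void (*k)(struct thread *) = t->continuation;
        t->continuation = NULL;
        k(t);
    }
}

static void sys_exit(struct thread *self, int status, struct thread *joiner)
{
    self->exit_status = status;
    self->state = EXITED;
    if (joiner != NULL && joiner->state == BLOCKED && joiner->joining == self)
        wake(joiner);
}
```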

I really don't believe that it's much more difficult to design a single-stack microkernel than a multi-stack one, especially in the multi-core case. It's probably different for a monolithic kernel. I don't really understand point 2, but in answer to your point 1:

1. I assume you mean that the thread's syscall data is discarded on a pre-emption. That would be true, but if you use co-operative pre-emption the thread can store enough info in its TCB to complete the syscall in a continuation function. The trick is to interrupt the system call at a convenient point. Take, for example, a long system call like deleting a process and all its resources. You might begin by removing the process from the process table to make sure no-one accesses it any more. Then, as soon as any reference counts reach zero, you can start to dismantle the process and remove and delete its resources. If it takes too long, the thread running in the kernel might need to yield to another thread. Later on, when it resumes, it can try to delete the process again. It will make more progress each time and eventually the system call will complete.
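
The process-deletion example might look something like the following sketch. The work budget, the structure, and the convention of returning false to mean "yielded, call again" are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define BUDGET 4  /* assumed maximum number of resources freed per attempt */

/* Hypothetical process descriptor, reduced to what the example needs. */
struct process {
    bool in_process_table;
    int  resources_left;
};

/* Returns true once the process is fully deleted, false if the call gave
 * up the CPU at a convenient point and has to be retried later. */
static bool sys_delete_process(struct process *p)
{
    /* Step 1 is idempotent, so repeating it on every retry is harmless. */
    p->in_process_table = false;

    for (int done = 0; p->resources_left > 0; done++) {
        if (done == BUDGET)
            return false;          /* taking too long: yield here */
        p->resources_left--;       /* stand-in for freeing one resource */
    }
    return true;
}
```

Because the progress made so far is recorded in the process itself rather than on the kernel stack, each retry simply starts from the top and picks up where the previous attempt stopped.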
Nessphoro
Member
Posts: 308
Joined: Sat Apr 30, 2011 12:50 am

Re: Setup syscall stack pointer

Post by Nessphoro »

What I meant by 2 is that no interrupt will occur during a syscall, which enforces the one-thread-in-kernel-mode rule.
gerryg400
Member
Posts: 1801
Joined: Thu Mar 25, 2010 11:26 pm
Location: Melbourne, Australia

Re: Setup syscall stack pointer

Post by gerryg400 »

Nessphoro wrote:What I meant by 2 is that on a syscall - no interrupt will occur, so that it will enforce one thread in kernel mode rule
That's certainly not true. Interrupts can execute below the syscall on the kernel stack (on Intel, where the stack grows downwards), and they can nest as well. No problem there at all.
Combuster
Member
Posts: 9301
Joined: Wed Oct 18, 2006 3:45 am
Libera.chat IRC: [com]buster
Location: On the balcony, where I can actually keep 1½m distance

Re: Setup syscall stack pointer

Post by Combuster »

The point was: you can't freely have interrupts in kernel land when you have a global kernel stack rather than a per-process one. You might end up calling the scheduler from random points within unrelated system calls, which is probably not what you want.
"Certainly avoid yourself. He is a newbie and might not realize it. You'll hate his code deeply a few years down the road." - Sortie
gerryg400
Member
Posts: 1801
Joined: Thu Mar 25, 2010 11:26 pm
Location: Melbourne, Australia

Re: Setup syscall stack pointer

Post by gerryg400 »

It's true, you can't call the scheduler from just anywhere. You can only call it when you're about to return to userspace. There are a few cases to consider.
1. An interrupt occurs during userspace. -- execute ISR and then call scheduler.
2. An interrupt occurs during a syscall -- execute ISR and queue the result, resume syscall, process queued result then call scheduler.
3. An interrupt occurs during another interrupt -- execute (or defer if lower priority) ISR and queue result. When all interrupts are complete process the queued results then call the scheduler.
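
The three cases above can be sketched as a small model. Everything here is hypothetical (names, queue size, counters standing in for real work), and a real kernel would have to mask interrupts around the checks; the point is only that the scheduler runs when the kernel is about to return to userspace, never from the middle of a syscall.

```c
#include <assert.h>
#include <stdbool.h>

#define QUEUE_MAX 16

static int  queue[QUEUE_MAX];  /* results queued by ISRs */
static int  queued;
static int  irq_depth;         /* interrupt nesting level */
static bool in_syscall;
static int  last_result;       /* stand-ins for real processing */
static int  processed;
static int  schedules;

/* Only reached when the kernel is about to return to userspace. */
static void process_queue_and_schedule(void)
{
    for (int i = 0; i < queued; i++) {
        last_result = queue[i];
        processed++;
    }
    queued = 0;
    schedules++;               /* stand-in for calling the scheduler */
}

static void irq_enter(void) { irq_depth++; }

/* ISR body: do the minimum and queue the result for later. */
static void isr(int result)
{
    if (queued < QUEUE_MAX)
        queue[queued++] = result;
}

static void irq_exit(void)
{
    /* Cases 1 and 3: the outermost interrupt is finishing and no syscall
     * was interrupted. In case 2 the syscall is simply resumed. */
    if (--irq_depth == 0 && !in_syscall)
        process_queue_and_schedule();
}

static void syscall_enter(void) { in_syscall = true; }

static void syscall_exit(void)
{
    in_syscall = false;
    process_queue_and_schedule();   /* case 2: handle what ISRs queued */
}
```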

Add to that some co-operative yielding inside long system calls and you can achieve latencies that are more than acceptable, even comparable with a pre-emptable kernel.

Note, Combuster, that I'm talking about a microkernel here, where system calls are short and there's no waiting on I/O, etc.