Saving context
Posted: Sat May 01, 2010 4:27 pm
by gerryg400
Hi all,
When I enter the kernel in my OS I do lots of 'push' instructions to save all the GP regs. Last night I had a peek at the Linux x86_64 context save routines and saw that they use rsp-relative moves to save the register context. Something like this:
subq $9*8, %rsp          # reserve 9 slots so that offset 8*8 stays inside the saved area
movq %rdi, 8*8(%rsp)
movq %rsi, 7*8(%rsp)
... etc.
Can anyone suggest why? Surely push is faster than a reg-rel mov.
Thanks
- gerryg400
Re: Saving context
Posted: Sat May 01, 2010 5:10 pm
by NickJohnson
gerryg400 wrote: Can anyone suggest why? Surely push is faster than a reg-rel mov.
Why do you assume that? Every push is a write plus a register modification, while each of those movs is just a write: the stack pointer is modified only once, by the initial subq. I don't know which is actually faster, but I'd trust the Linux people to have optimized at least that section to hell. Beyond speed, it may just be neater to think of the area the context is saved to as an array rather than a stack.
Edit: Actually, maybe I do understand. It seems like pushes would be hard to pipeline, because they all modify the stack pointer. By saving each register without modifying the stack pointer, it may play better with the processor pipeline. It would all depend on whether writes can be pipelined like that - maybe only on newer processors?
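The dependency argument above can be made concrete with a deliberately crude model. This is only a sketch under stated assumptions: every instruction takes one step, an instruction can start as soon as the rsp value it reads exists, and nothing else limits throughput. The function names are made up for illustration.

```c
/* Toy model, not a real CPU: every instruction takes one step and can start
 * only once the rsp value it reads has been produced. */

/* n pushes: each push reads rsp and produces a new rsp for the next one. */
static unsigned push_chain_steps(unsigned n)
{
    unsigned rsp_ready = 0;              /* step at which the current rsp is known */
    for (unsigned i = 0; i < n; i++)
        rsp_ready += 1;                  /* push i finishes one step later and yields the next rsp */
    return rsp_ready;
}

/* One subq, then n movs that all read the same rsp and never write it. */
static unsigned mov_chain_steps(unsigned n)
{
    unsigned rsp_ready = 1;              /* the subq produces the final rsp at step 1 */
    unsigned finish = rsp_ready;
    for (unsigned i = 0; i < n; i++) {
        unsigned done = rsp_ready + 1;   /* every mov can start as soon as rsp is ready */
        if (done > finish)
            finish = done;
    }
    return finish;
}
```

Under these assumptions, push_chain_steps(9) is 9 while mov_chain_steps(9) is 2. Real hardware limits how many stores can issue per cycle and may treat push specially, so the model only restates the dependency argument; it says nothing about measured timings.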
Re: Saving context
Posted: Sat May 01, 2010 6:38 pm
by gerryg400
I'm not sure about instruction timings either, but push rdx, push rcx, push rax, etc. are single-byte instructions, whereas those stack-relative moves are 5 bytes each. I'm pretty sure that, in isolation, it's quicker to move something to the stack with a push.
However, I think you're on to something with the pipelining and out-of-order execution. I don't really know anything about this type of stuff, but it strikes me that the pushes would have to be done one at a time, in order, whereas the movs could be done in parallel or out of order since they all access different regs and different memory. This is way beyond my knowledge at the moment; it's just a curiosity for me. When my OS is complete (yeah right!) and I'm looking for a performance improvement, I'll look at this again. Thanks, NickJohnson.
- gerryg400
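The code-size comparison earlier in this post (single-byte pushes versus 5-byte stack-relative moves) can be tallied with a little arithmetic. The sketch below assumes the usual x86-64 encodings: push of rax..rdi is one opcode byte, push of r8..r15 adds a REX prefix, and a 64-bit mov to disp(%rsp) is REX.W + opcode + ModRM + SIB plus a displacement of 0, 1 or 4 bytes. The helper names and the register mix used in the note after the code are hypothetical; the thread only shows rdi and rsi.

```c
#include <stdbool.h>

/* Bytes for `push %reg`: one opcode byte, plus a REX prefix for r8-r15. */
static unsigned push_size(bool extended_reg)
{
    return extended_reg ? 2u : 1u;
}

/* Bytes for `movq %reg, disp(%rsp)`: REX.W + opcode + ModRM + SIB + displacement. */
static unsigned mov_to_rsp_size(int disp)
{
    unsigned base = 4;
    if (disp == 0)
        return base;                     /* no displacement byte needed */
    if (disp >= -128 && disp <= 127)
        return base + 1;                 /* 8-bit displacement */
    return base + 4;                     /* 32-bit displacement */
}

/* Bytes for `subq $imm, %rsp`: REX.W + opcode + ModRM + 8-bit or 32-bit immediate. */
static unsigned sub_rsp_size(int imm)
{
    return (imm >= -128 && imm <= 127) ? 4u : 7u;
}

/* Total bytes to save registers with pushes. */
static unsigned push_save_size(unsigned legacy_regs, unsigned extended_regs)
{
    return legacy_regs * push_size(false) + extended_regs * push_size(true);
}

/* Total bytes for one subq plus nregs movs into slots at 0, 8, ..., 8*(nregs-1). */
static unsigned mov_save_size(unsigned nregs)
{
    unsigned total = sub_rsp_size((int)(nregs * 8));
    for (unsigned i = 0; i < nregs; i++)
        total += mov_to_rsp_size((int)(i * 8));
    return total;
}
```

For a hypothetical set of nine registers, five of them from rax..rdi and four from r8..r15, push_save_size(5, 4) gives 13 bytes and mov_save_size(9) gives 48 bytes. That supports the point about code size, but byte counts alone do not settle which sequence runs faster.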