Theory of implementing a new ABI for X86 LongMode
Posted: Wed Nov 07, 2018 3:36 pm
by TightCoderEx
I've come to a point now where I need to start thinking about an ABI. Of the ones that are out there, or at least those that I've had exposure to, FASTCALL is the most appealing, but with a twist: only use those registers that are specifically designed to be counters, pointers to arrays, and indices. Keep in mind that this is only an example. I don't want to imply that I'm saving ABI registers above the procedure frame, but rather that whatever is essential to preserve is saved outside the frame.
- RBX = Base pointer to arrays, structures or even arrays of structures
- RCX = Counter
- RSI = Source index
- RDI = Destination index
Essentially, the stack is used only for local data and/or buffers, with this type of prologue and epilogue.
Code:
Proc:   push rdi
        push rsi
        push rbx
        push rbp
        mov  rbp, rsp           ; or maybe even ENTER ?,?
        ; ..... Body of Procedure ....
        leave                   ; mov rsp, rbp then pop rbp
        pop  rbx
        pop  rsi
        pop  rdi
        ret
This has worked very effectively, as I no longer need to be concerned about the stack pointer: LEAVE restores RSP from RBP, discarding whatever the body allocated.
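As a minimal sketch of how a body with local storage fits into that frame (NASM syntax assumed; the procedure name `FillScratch` and the 64-byte buffer size are hypothetical and only for illustration):

```nasm
        section .text
        global  FillScratch

; Hypothetical procedure: fills a 64-byte local buffer with the byte in AL.
; RCX is used as the counter and comes back as 0; RDI is restored on exit.
FillScratch:
        push    rdi
        push    rsi
        push    rbx
        push    rbp
        mov     rbp, rsp
        sub     rsp, 64             ; reserve locals at [rbp-64] .. [rbp-1]

        cld                         ; make sure STOSB moves upward
        lea     rdi, [rbp-64]       ; destination index = start of local buffer
        mov     ecx, 64             ; counter
        rep stosb                   ; store AL 64 times

        leave                       ; mov rsp, rbp / pop rbp: locals are released
        pop     rbx
        pop     rsi
        pop     rdi
        ret
```

No matching `add rsp, 64` is needed before the epilogue, since LEAVE resets RSP regardless of how much the body allocated.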
As my system is not intended to be compatible with POSIX, System V or anything else, it has been plagued with a few tribulations where I've had to redraft, at times right from the beginning. What I'm looking for here is this: has anyone else designed such an OS, and what sort of ABI did you implement? Although I appreciate what M$ may have had to do for backward compatibility, using shadow space does introduce a lot of bloat.
Re: Theory of implementing a new ABI for X86 LongMode
Posted: Wed Nov 07, 2018 6:15 pm
by TightCoderEx
mariuszp wrote: I actually condone "shadow space".
Probably the most poignant word to describe that paradigm, but I suppose it was the best alternative to maintain backward compatibility.
Re: Theory of implementing a new ABI for X86 LongMode
Posted: Wed Nov 07, 2018 11:29 pm
by nullplan
I'm not sure it would be worth the effort to me. Inventing a new ABI is all well and good, but then you have to tell the compiler about it. One look at gcc's source code and I knew I wanted no part of that.
All ABIs need to make tradeoffs. The System V i386 ABI has to cope with the fact that only 8 GP regs exist, 2 of which aren't all that GP. So putting arguments on the stack was a sound decision. For x86_64, however, enough registers exist to avoid spilling to memory, which is, in general, slower than just keeping it all in regs.
Personally, I like the PowerPC ABI a hell of a lot more than the System V x86_64 ABI, as it does handle variadic functions and still passes most things in registers. In x86_64, a variadic function is treated as if the variadic arguments were normal arguments to the function. In the PowerPC ABI, however, variadic args are always pushed to the stack. Non-variadic args are put into registers 3-10 (allowing for 8 arguments in registers), or the FP regs, as appropriate. So this ABI trades consistency for easy access to variadic args.
But you still haven't described your ABI: Which registers are volatile (clobbered by callee, i.e. caller-saved), which registers are non-volatile (or callee-saved), and how does argument passing work? From what you wrote, I can only conclude that RBX, RCX, RDI, RSI, RBP, and RSP are non-volatile. So RAX, RDX, and R8-R15 are all volatile? And all args are passed on the stack? That's a lot of registers to save if I want to call a function. In System V I can keep a local variable in R15 and the callees will save it if they need to clobber it. But most functions don't need to do that, so my variable never makes it into memory.
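The R15 point can be sketched as follows (NASM syntax assumed; `helper` and `sum_helper` are hypothetical names, and `helper` is assumed to follow the System V AMD64 convention and return a value in RAX):

```nasm
        section .text
        extern  helper              ; hypothetical System V callee, result in RAX
        global  sum_helper

; Hypothetical: calls helper EDI times and returns the sum of its results.
sum_helper:
        push    r15                 ; callee-saved: preserve the caller's R15
        push    rbx                 ; callee-saved: preserve the caller's RBX
        sub     rsp, 8              ; keep RSP 16-byte aligned at the CALL
        xor     r15d, r15d          ; running total lives in R15
        mov     ebx, edi            ; loop counter lives in RBX
        test    ebx, ebx
        jz      .done
.next:
        call    helper              ; may clobber RAX, RCX, RDX, RSI, RDI, R8-R11
        add     r15, rax            ; R15 and RBX survive the call
        dec     ebx
        jnz     .next
.done:
        mov     rax, r15
        add     rsp, 8
        pop     rbx
        pop     r15
        ret
```

Inside the loop the total is never written to memory; the only stack traffic for R15 is the single push/pop pair at the edges of the function.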
Re: Theory of implementing a new ABI for X86 LongMode
Posted: Thu Nov 08, 2018 1:07 am
by TightCoderEx
NOTE: Probably should have mentioned everything is developed using assembly.
Maybe calling this an ABI is a little misleading. The volatility of any register would be dictated by the procedure's intent. To clarify what I mean, consider this example, which could be thought of as STRLEN but is not necessarily looking for NULL.
Code:
mov rsi, WideTxt ; Points to a wide text buffer
mov eax, 0x1D2A ; Just a hypothetical terminating character
mov ecx, 1024 ; Maximum
call STRLEN
So in this case, RAX & RCX will be the only two registers modified, and the callee would be responsible for preserving anything else it needs to accomplish the task. Return values would be in their respective registers, meaning RCX would be the text buffer length. If RCX = 0 && EAX = original value, then the string was exactly the length specified by the caller; otherwise EAX = whatever character is at RSI + RCX. Now to accomplish something like STRCAT
So that is considerably more compact than:
Code:
mov ecx, 1024
mov rdx, WideTxt
mov r8, 0x1D2A
sub rsp, 32
call StrLen ; The function would then have to move them again, or at least R8 & RDX.
add rsp, 32
Obviously, at some point I will need documentation, but that's the other objective I hope to address: when writing code, what needs to be passed to the callee should be a little more intuitive, based on the target architecture.
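For reference, the callee side of the register-based STRLEN call could look something like the sketch below (NASM syntax assumed). It rests on assumptions not spelled out in the post: characters are 16 bits wide, ECX on entry is a count of characters, and RCX on return is the number of characters before the terminator (or the maximum if none was found). As a simplification, this version leaves EAX untouched rather than implementing the EAX return convention described above.

```nasm
        section .text
        global  STRLEN

; In:  RSI = pointer to wide (16-bit) text, AX = terminator, ECX = maximum
; Out: RCX = characters before the terminator, or the maximum if not found
; Only RCX is modified; RDX is used as scratch and restored.
STRLEN:
        push    rdx
        mov     edx, ecx            ; EDX = maximum character count
        xor     ecx, ecx            ; ECX = current index (also clears upper RCX)
.scan:
        cmp     ecx, edx
        jae     .done               ; reached the maximum without a match
        cmp     [rsi + rcx*2], ax   ; compare one 16-bit character
        je      .done
        inc     ecx
        jmp     .scan
.done:
        pop     rdx
        ret
```

The caller sets up exactly the three registers shown in the first call sequence, and nothing has to be copied into other registers or stack slots on entry.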
Re: Theory of implementing a new ABI for X86 LongMode
Posted: Mon Nov 12, 2018 2:54 pm
by TightCoderEx
mariuszp wrote: Not sure what you mean here... in what way is it done for backward compatibility?
What I'm suggesting, and it probably doesn't apply to System V, is that with CDECL and STDCALL in 32-bit, memory above EBP will look exactly the same as that of FASTCALL once the callee has moved the registers into shadow space, other than data being 4 bytes vs 8 respectively. Although I don't know for sure, this would suggest an air of compatibility on M$'s part.
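A minimal sketch of that layout under the Microsoft x64 convention (NASM syntax assumed; `Callee` is a hypothetical five-argument function that returns its first argument plus its fifth):

```nasm
        section .text
        global  Callee

; On entry: [rsp] = return address, [rsp+8..rsp+39] = shadow space,
;           [rsp+40] = fifth argument (passed on the stack by the caller).
Callee:
        mov     [rsp+8],  rcx       ; home argument 1
        mov     [rsp+16], rdx       ; home argument 2
        mov     [rsp+24], r8        ; home argument 3
        mov     [rsp+32], r9        ; home argument 4
        push    rbp
        mov     rbp, rsp
        ; Arguments 1-5 now sit contiguously at [rbp+16], [rbp+24],
        ; [rbp+32], [rbp+40] and [rbp+48], much like [ebp+8], [ebp+12], ...
        ; in a 32-bit CDECL or STDCALL frame, only with 8-byte slots.
        mov     rax, [rbp+16]       ; argument 1
        add     rax, [rbp+48]       ; plus argument 5
        pop     rbp
        ret
```

Whether that resemblance was a design goal is not something this sketch can show; it only illustrates the contiguous layout being described.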