x86-64 calling convention, returning structs

Post by AndrewAPrice »

I want to return two values from my functions (I'm working on my VM and want to return both the value and its type), and I'm trying to understand the calling convention so I can generate this code in my JIT compiler. My goal is to JIT compile my functions to machine code that matches the C/C++ calling convention, so the caller can call either native or JIT'ed functions through a single function pointer.

I'm using Visual Studio 2012 (MS C++ 11) so I'm trying to follow the Microsoft x64 calling convention. I'm using these MSDN resources here and here.

I can return a single value in RAX, but I want to return two values. Here is an example in C++ of what I'm trying to do:

Code:

#include <cstdint>

struct value {
    uint64_t type;
    uint64_t data;
};

value do_something() {
    value v;
    v.type = 10; // unsigned integer
    v.data = 0xDEADBEEFDEADBEEF; // dead beef
    return v;
}

int call_do_something() {
    value v = do_something();
    return 0;
}
The Visual Studio debugger disassembles that code into this:

The callee:

Code:

value do_something() {
000000013F71A6A0  mov         qword ptr [rsp+8],rcx  
000000013F71A6A5  push        rsi  
000000013F71A6A6  push        rdi  
000000013F71A6A7  sub         rsp,58h  
000000013F71A6AB  mov         rdi,rsp  
000000013F71A6AE  mov         ecx,16h  
000000013F71A6B3  mov         eax,0CCCCCCCCh  
000000013F71A6B8  rep stos    dword ptr [rdi]  
000000013F71A6BA  mov         rcx,qword ptr [rsp+70h]  
   value v;
   v.type = 10; // unsigned integer
000000013F71A6BF  mov         qword ptr [v],0Ah  
   v.data = 0xDEADBEEFDEADBEEF; // dead beef
000000013F71A6C8  mov         rax,0DEADBEEFDEADBEEFh  
000000013F71A6D2  mov         qword ptr [rsp+30h],rax  
   return v;
000000013F71A6D7  lea         rax,[v]  
000000013F71A6DC  mov         rdi,qword ptr [rsp+70h]  
000000013F71A6E1  mov         rsi,rax  
000000013F71A6E4  mov         ecx,10h  
000000013F71A6E9  rep movs    byte ptr [rdi],byte ptr [rsi]  
000000013F71A6EB  mov         rax,qword ptr [rsp+70h]  
}
000000013F71A6F0  mov         rdi,rax  
000000013F71A6F3  mov         rcx,rsp  
000000013F71A6F6  lea         rdx,[string L"(size & (size - 1)) "...+8428h (013F72F608h)]  
000000013F71A6FD  call        _RTC_CheckStackVars (013F71AF60h)  
000000013F71A702  mov         rax,rdi  
000000013F71A705  add         rsp,58h  
000000013F71A709  pop         rdi  
000000013F71A70A  pop         rsi  
000000013F71A70B  ret
The caller:

Code:

int call_do_something() {
000000013F52A720  push        rsi  
000000013F52A722  push        rdi  
000000013F52A723  sub         rsp,78h  
000000013F52A727  mov         rdi,rsp  
000000013F52A72A  mov         ecx,1Eh  
000000013F52A72F  mov         eax,0CCCCCCCCh  
000000013F52A734  rep stos    dword ptr [rdi]  
	value v = do_something();
000000013F52A736  lea         rcx,[rsp+58h]  
000000013F52A73B  call        do_something (013F4E164Fh)  
000000013F52A740  lea         rcx,[rsp+48h]  
000000013F52A745  mov         rdi,rcx  
000000013F52A748  mov         rsi,rax  
000000013F52A74B  mov         ecx,10h  
000000013F52A750  rep movs    byte ptr [rdi],byte ptr [rsi]  
000000013F52A752  lea         rax,[v]  
000000013F52A757  lea         rcx,[rsp+48h]  
000000013F52A75C  mov         rdi,rax  
000000013F52A75F  mov         rsi,rcx  
000000013F52A762  mov         ecx,10h  
000000013F52A767  rep movs    byte ptr [rdi],byte ptr [rsi]  
	return 0;
000000013F52A769  xor         eax,eax  
}
000000013F52A76B  mov         edi,eax  
000000013F52A76D  mov         rcx,rsp  
000000013F52A770  lea         rdx,[string L"(size & (size - 1)) "...+84C0h (013F53F6A0h)]  
000000013F52A777  call        _RTC_CheckStackVars (013F52AF60h)  
000000013F52A77C  mov         eax,edi
000000013F52A77E  add         rsp,78h  
000000013F52A782  pop         rdi  
000000013F52A783  pop         rsi  
000000013F52A784  ret
This is compiled unoptimized with all runtime checks enabled, so the compiler adds some fluff to the prolog and epilog (the "mov eax,0CCCCCCCCh" fill and the "call _RTC_CheckStackVars").

If I'm reading this correctly, the callee copies the struct to the top of its local stack and shrinks its stack back down to all but the struct size; the caller then copies that temporary struct into its own stack and shrinks the stack again. Am I interpreting this correctly?

That seems a little heavy compared to simply returning the values in rax/rbx, so I may wrap JIT<->native calls in a thunk.

Post by Owen »

covers it. For return values larger than 64 bits, except __m128 and co. (SSE vectors), the caller passes a pointer to the return buffer in RCX, and the callee must return that pointer in RAX.
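
As a rough illustration of that rule applied to the `value` struct from the first post, the hidden pointer can be written out as an ordinary first parameter. This is a hand-written sketch, not compiler output, and the names `do_something_lowered` and `call_do_something_lowered` are made up for the example:

```cpp
#include <cassert>
#include <cstdint>

struct value {
    std::uint64_t type;
    std::uint64_t data;
};

// Hand-written equivalent of `value do_something()` under the rule above.
// `ret` stands in for the hidden pointer the caller passes in RCX; the
// returned pointer stands in for what the callee leaves in RAX.
value* do_something_lowered(value* ret) {
    ret->type = 10;                     // unsigned integer
    ret->data = 0xDEADBEEFDEADBEEFull;  // dead beef
    return ret;                         // hand the same pointer back
}

// What the call site amounts to: the caller owns the buffer and passes
// its address (compare `lea rcx,[rsp+58h]` before the call in the listing).
std::uint64_t call_do_something_lowered() {
    value v;
    value* p = do_something_lowered(&v);
    assert(p == &v);  // the callee must return the pointer it was given
    return v.type;
}
```

For a JIT, this suggests emitting code that stores both fields through the pointer received in RCX and then copies that pointer into RAX before returning.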

Seriously consider using GCC or Clang and the AMD64 (System V) ABI if you can. They're significantly more efficient, especially when AVX comes into play. That ABI can return structures that fit in two registers in rdx:rax, like the i386 calling convention does, and it supports more registers for argument passing.
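
The difference between the two conventions for small structs can be summarised with a deliberately simplified model. It assumes plain structs made only of integer or pointer fields, and ignores floating-point members, SSE vectors, and types with non-trivial constructors or destructors. The names `Abi`, `ReturnIn` and `classify_return` are invented for this sketch:

```cpp
#include <cassert>
#include <cstddef>

enum class Abi { MsX64, SysV };
enum class ReturnIn { Rax, RaxRdx, HiddenPointer };

// Simplified model: where does a plain integer-only struct of `size`
// bytes come back? Not a full implementation of either ABI.
ReturnIn classify_return(Abi abi, std::size_t size) {
    assert(size > 0);
    if (abi == Abi::MsX64) {
        // Microsoft x64: only 1, 2, 4 or 8 byte values are returned in RAX;
        // everything else goes through a caller-provided buffer.
        const bool fits = size == 1 || size == 2 || size == 4 || size == 8;
        return fits ? ReturnIn::Rax : ReturnIn::HiddenPointer;
    }
    // System V AMD64: up to 16 bytes of integer data use RAX, then RDX.
    if (size <= 8) return ReturnIn::Rax;
    if (size <= 16) return ReturnIn::RaxRdx;
    return ReturnIn::HiddenPointer;
}
```

Under this model the 16-byte `value` struct needs the hidden pointer on Microsoft x64, while on System V it fits in the RAX and RDX pair.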

I'd be especially tempted towards GCC and Clang if you're stuck with MSVC 2012. That version is @$$ slow (2013 isn't much better...)

Post by AndrewAPrice »

I'm thinking about adopting an AMD64-style calling convention with two-register returns (type and value) for my JIT, and writing thunking routines in assembly to convert calls between my calling convention and native code in MS x64 or AMD64.
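
A real thunk of this kind would be written in assembly, but its job can be modelled in C++ to show the data flow. In this sketch, `reg_pair` is a stand-in for the two return registers, and `jit_fn`, `thunk_jit_to_native` and `jitted_example` are hypothetical names that do not come from any actual JIT:

```cpp
#include <cassert>
#include <cstdint>

struct value {
    std::uint64_t type;
    std::uint64_t data;
};

// Stand-in for the two registers a JIT-compiled function returns in.
struct reg_pair {
    std::uint64_t r0;  // type
    std::uint64_t r1;  // data
};

using jit_fn = reg_pair (*)();

// Model of a JIT-to-native thunk: call the JIT-convention function, then
// store both results through the return-buffer pointer and hand it back,
// which is the shape a hidden-pointer caller expects.
value* thunk_jit_to_native(jit_fn fn, value* ret) {
    assert(fn != nullptr && ret != nullptr);
    const reg_pair r = fn();
    ret->type = r.r0;
    ret->data = r.r1;
    return ret;
}

// Hypothetical JIT-compiled function, written by hand for the example.
reg_pair jitted_example() {
    return {10, 0xDEADBEEFDEADBEEFull};
}
```

Here the target function is passed as an extra argument to keep the model simple. An assembly thunk that has to match the native signature exactly would need another way to find its target, for example one generated thunk per function.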

Post by AndrewAPrice »

Regarding the 16-byte stack alignment requirement in both x64 calling conventions - does the stack have to be 16-byte aligned before or after the 'call' instruction (which pushes a return address onto the stack)?

Post by jnc100 »

MessiahAndrw wrote: Regarding the 16-byte stack alignment requirement in both x64 calling conventions - does the stack have to be 16-byte aligned before or after the 'call' instruction (which pushes a return address onto the stack)?
I don't know about MS, but for SysV it's aligned before the call, so the return address is not 16-byte aligned (remember that push decrements rsp, then saves the data). The 16-byte alignment for local variables is then usually restored by the called procedure pushing the old rbp.

Regards,
John.
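
The arithmetic behind "aligned before the call" can be sketched as a small helper for a code generator. It assumes every pushed register takes 8 bytes and that rsp was 16-byte aligned immediately before the `call` that entered the function. The names `aligned_at_call` and `padded_frame` are made up for this example:

```cpp
#include <cassert>
#include <cstdint>

// True if rsp is a multiple of 16 at call sites inside a function that
// pushed `pushed_regs` registers and then executed `sub rsp, frame_bytes`.
bool aligned_at_call(std::uint64_t pushed_regs, std::uint64_t frame_bytes) {
    const std::uint64_t return_address = 8;  // pushed by `call` on entry
    return (return_address + 8 * pushed_regs + frame_bytes) % 16 == 0;
}

// Smallest `sub rsp, N` that holds `locals_bytes` and keeps rsp
// 16-byte aligned for any call made from inside the function.
std::uint64_t padded_frame(std::uint64_t pushed_regs,
                           std::uint64_t locals_bytes) {
    std::uint64_t frame = (locals_bytes + 7) / 8 * 8;  // round up to 8
    if (!aligned_at_call(pushed_regs, frame)) {
        frame += 8;  // one extra slot fixes the odd multiple of 8
    }
    assert(aligned_at_call(pushed_regs, frame));
    return frame;
}
```

Both MSVC listings in the first post fit this arithmetic: two pushes plus `sub rsp,58h` gives 8 + 16 + 88 = 112 bytes, and two pushes plus `sub rsp,78h` gives 8 + 16 + 120 = 144 bytes, both multiples of 16.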

Post by AndrewAPrice »

Thanks, jnc100.