earlz wrote: There is GCCs first 6 go into registers, and then the rest as cdecl
there is still the cdecl used sometimes
and there is Microsofts hack job that is like GCCs but it does something different that I can't remember
The MS calling convention dictates that the first four arguments of 64 bits or less go to rcx, rdx, r8 and r9, in that order. Each of them also gets a so-called home location on the stack: the caller reserves the space (32 bytes in total) but does not store the value there, and the callee can spill the register into it, for example when the argument has its address taken. The rest of the arguments are stored on the stack. Floating-point arguments use xmm0-xmm3, and the slots are positional: a double passed as the second argument goes in xmm1, and rdx is simply left unused for that call.
The calling convention is basically cdecl (the called function returns with a plain retn rather than retn N, and the caller deallocates the arguments from the stack), but there is a trick. You have to save any nonvolatile registers you trash (rbx, rbp, rsi, rdi, r12-r15, and xmm6-xmm15), and you are not allowed to modify the stack pointer once the function prologue has run. This means that if you call two functions, one which takes 5 arguments and one which takes 200, you have to allocate stack space in the prologue for the larger number (200) of arguments, and rsp has to be aligned to a 16-byte boundary at every call. When you call the first function (the one that takes 5 arguments), the upper portion of this stack space is unused and only the lower 5 'slots' are used; and even then the first 4 arguments go to registers and only the fifth is actually stored on the stack.
If I recall correctly, this was introduced so that the stack could be unwound easily. There is an official specification out there, google it.
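To make the rules above a bit more concrete, here is a small sketch in plain C that models them. It is not taken from any compiler or from the official specification; the names ArgKind, ms_x64_arg_location, ms_x64_outgoing_area and ms_x64_frame_alloc are made up for illustration. It assumes every argument fits in one 8-byte slot, and that rsp is 8 bytes off a 16-byte boundary on function entry (because the call instruction has just pushed the return address).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the rules described above; these names are not
   part of any real toolchain or library. */
typedef enum { ARG_INT, ARG_FLOAT } ArgKind;

/* Where the argument at 0-based position `index` is passed.
   The first four positions each map to one register: an integer register
   or the xmm register with the same position. Everything else is stored
   on the stack by the caller. */
const char *ms_x64_arg_location(ArgKind kind, size_t index)
{
    static const char *const int_regs[4]   = { "rcx", "rdx", "r8", "r9" };
    static const char *const float_regs[4] = { "xmm0", "xmm1", "xmm2", "xmm3" };

    assert(kind == ARG_INT || kind == ARG_FLOAT);
    if (index >= 4)
        return "stack";
    return kind == ARG_FLOAT ? float_regs[index] : int_regs[index];
}

/* Bytes of outgoing-argument space a function must reserve, given the
   largest argument count among all the calls it makes. There are always
   at least four 8-byte slots: the home locations of the register args. */
size_t ms_x64_outgoing_area(size_t max_call_args)
{
    size_t slots = max_call_args < 4 ? 4 : max_call_args;
    return slots * 8;
}

/* Value for the prologue's "sub rsp, N" so that rsp is 16-byte aligned
   at every call site. `pushed_regs` is the number of nonvolatile
   registers pushed before the sub; `locals_bytes` is space for locals. */
size_t ms_x64_frame_alloc(size_t max_call_args, size_t pushed_regs,
                          size_t locals_bytes)
{
    size_t alloc = ms_x64_outgoing_area(max_call_args) + locals_bytes;
    /* Distance below the 16-byte-aligned rsp the caller had before its
       call instruction: return address + pushes + our allocation. */
    size_t depth = 8 + 8 * pushed_regs + alloc;

    alloc += (16 - depth % 16) % 16;  /* pad up to the next multiple of 16 */
    return alloc;
}
```

With this model, a call to a function declared as f(int a, double b, int c, float d, int e) puts a in rcx, b in xmm1, c in r8, d in xmm3 and e on the stack. A function that pushes nothing, has no locals and calls at most a 5-argument function gets ms_x64_frame_alloc(5, 0, 0) == 40, which is the familiar sub rsp, 28h; for the 200-argument case it comes out as 1608 (1600 bytes of slots plus 8 of padding).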
I consider it a good thing that they did away with the different calling conventions (x86 used cdecl, stdcall, different versions of fastcall, etc.), but the stack unwinding requirement feels like an unnecessary complication at first, especially when you write assembly code that has to call functions coded in C: you have to make sure the stack frame is what the C compiler expects. When writing leaf functions (functions that don't call other functions) you can in some cases do away with the stack frame entirely.
I don't know exactly what calling convention GCC uses, but I expect it to be more or less the same (register-based cdecl, optionally preallocating space for all arguments).
It is indeed unfortunate that MSVC and GCC use slightly different calling conventions.