Re: Calling-Conventions
Posted: Fri Jan 15, 2010 2:05 pm
Owen wrote: I just have the this parameter behave as a hidden first parameter to a function.
ErikVikinger wrote: Where is the position of the this pointer exactly?
The function "SomeClass::aMethod(int param, short param2)" is called as if it were "aFunction(SomeClass* thisp, int param, short param2)".
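To make the equivalence concrete, here is a minimal sketch in standard C++. The field base_ and both function bodies are made up for illustration; only the two signatures come from the post. It shows the source-level view, not what any particular ABI does with registers.

```cpp
#include <cassert>

// Hypothetical class; base_ exists only so the method has some state to read.
struct SomeClass {
    int base_;
    int aMethod(int param, short param2) { return base_ + param + param2; }
};

// The same operation written as a free function, with the object pointer
// passed explicitly as the first parameter (the "hidden first parameter").
int aFunction(SomeClass* thisp, int param, short param2) {
    assert(thisp != nullptr);
    return thisp->base_ + param + param2;
}
```

Calling obj.aMethod(2, 3) and aFunction(&obj, 2, 3) gives the same result; under the convention described, they would also pass their arguments the same way.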
Owen wrote: This has a few advantages:
- C style functions can masquerade as class functions (This kind of thing is used a lot for dynamic bridging of C++ to dynamic languages)
- This becomes a valid argument for the va_start C builtin
Of course, this is pretty rare, but it's useful.
ErikVikinger wrote: Please describe this a little bit more in detail.
I have some code which, on supporting ABIs (e.g. not x86_64, where a vararg call requires extra work by the caller), masquerades as a function and then walks its parameters based on some tables, for things like serializing across a network (for RPC, for example). On architectures with differing varargs methods, this gets more complex (and generally requires assembly).
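A rough sketch of the "walk the parameters from a table" idea, in portable C++. Everything here is hypothetical: RpcStub, its signature string and marshal_call are names invented for the example, and the object pointer is written out as an explicit parameter called self because standard C++ does not let a free function stand in for a member function. The point is only that the object pointer, being the last named parameter, is what va_start gets handed.

```cpp
#include <cassert>
#include <cstdarg>
#include <string>

// Hypothetical per-method table entry: a name plus one type tag per argument.
// 'i' = int, 's' = const char* (NUL-terminated string).
struct RpcStub {
    const char* name;
    const char* signature;
};

// Serializes a call as text. `self` is the last named parameter, so it is the
// argument passed to va_start; the variadic arguments are then read according
// to the table stored in the stub.
std::string marshal_call(const RpcStub* self, ...) {
    assert(self != nullptr);
    std::string out = self->name;
    out += '(';
    va_list args;
    va_start(args, self);
    for (const char* tag = self->signature; *tag != '\0'; ++tag) {
        if (tag != self->signature) out += ',';
        switch (*tag) {
            case 'i': out += std::to_string(va_arg(args, int)); break;
            case 's': out += va_arg(args, const char*); break;
            default:  assert(false && "unknown type tag");
        }
    }
    va_end(args);
    out += ')';
    return out;
}
```

With RpcStub stub{"add", "is"}, the call marshal_call(&stub, 7, "abc") walks one int and one string. A real bridge would emit a wire format instead of text, but the table-driven walk is the same shape.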
Owen wrote: Additionally, classes call their own members relatively rarely;
ErikVikinger wrote: You are sure? I have looked into vector.h and there are many method-calls with this.
I will try to discover it.
std::vector & co are somewhat unusual examples in that they're simple container classes rather than application ones, which tend to be more complex. Optimizing for such classes would seem like a good idea given how often they are used, but their methods are often inlined anyway, making it moot.
Any method which calls external classes is going to have to copy the this pointer somewhere, probably into the callee-saved registers, which is something it would have to do anyway if the this register was not callee-saved. However, it now also has the expense of copying it back before returning. I'm inclined to say that it would end up cheaper to just treat this as any other parameter.
Owen wrote: What kind of exception are we talking about here - processor or high level language?
ErikVikinger wrote: Processor exceptions, for example on memory access (a memory read into multiple registers that crosses a page boundary) or during mathematical calculations. If these instructions are destructive they must be careful and must detect all possible exceptions before they write their results into the registers. Or you have register renaming and can allocate new virtual registers for the results; I will not implement register renaming in my CPU. Non-destructive instructions can raise an exception and, after it is handled, simply be restarted. High level language exceptions are not my problem; the compiler can do this job, if needed.
I suppose the difference here is that I don't have any instructions which can fault midway through execution which don't at least maintain a shadow state in registers. For example:
Code:
memcpy r1, r2, r3 ; same as C memcpy(r1, r2, r3)
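One way to picture "shadow state in registers" for an instruction like this, as a small C++ model. This is a guess at the semantics, not a description of the real hardware: it assumes the three operand registers themselves are updated as the copy progresses, so re-executing the instruction after a fault simply continues. CopyRegs, memcpy_step and the fault_at parameter (which stands in for an unmapped page) are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Models r1 (destination), r2 (source) and r3 (remaining length).
struct CopyRegs {
    std::uint8_t* dst;
    const std::uint8_t* src;
    std::size_t len;
};

// Executes the modelled instruction. Returns false if it "faults" on reaching
// fault_at, true once the copy is complete. Because progress lives in the
// registers, calling it again after a fault resumes where it stopped.
bool memcpy_step(CopyRegs& r, const std::uint8_t* fault_at) {
    assert(r.len == 0 || (r.dst != nullptr && r.src != nullptr));
    while (r.len != 0) {
        if (r.src == fault_at) return false;  // simulated page fault
        *r.dst++ = *r.src++;
        --r.len;
    }
    return true;
}
```

After the handler fixes up the fault (modelled here by passing nullptr as fault_at), the same instruction is restarted with the registers as they were left, and no bytes are copied twice.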
Owen wrote: I suspect our instruction encodings look radically different - I don't know about yours, but mine is pretty esoteric;
ErikVikinger wrote: Yes, our instructions look really totally different, but mine are esoteric too. I have 4 complete flag-sets, and every instruction that modifies flags must specify which one is used; this avoids a single point of dependency and helps with loop unrolling. I do not have indirect accesses; each use of a register or flag-set must be specified in the assembly instruction. A PUSHM.F2:Z R10,10 is expanded to STRMIA.W,F2:Z [R62!],R10,10 which means: STRM = store multiple, I = increment address, A = after each register, W = word size (32 bit) for all registers, so the increment of R62 is 4 after each register, F2 = execution depends on flag-set 2, Z = the instruction is executed if Z is set (in flag-set 2), R62 is the stack pointer, ! = the new value of R62 after all memory writes is written back to R62, R10 is the first register that is written to memory, 10 = ten registers R10...R19 are written. This means: if the instruction is executed, the registers R10...R19, 40 bytes, are written to [R62] and R62 += 40. Another example is ADD.H,F1:GE R7,R6,R5,F0 which means: if the instruction is executed (GE condition in flag-set 1 is true) it does a half-word (16 bit) R7 = R6 + R5 and sets the flags in flag-set 0 for the 16 bit result.
Aah, 64-bit instructions I guess? No way I could see for you to squeeze that into 32. I wonder what the speed tradeoff is like w.r.t. memory bandwidth vs. expressiveness.
I'll mention a few other esoteric things:
Fused multiply-shift and shift-divide instructions, for fast fixed point: (Edit: I should add that they use a 64-bit intermediate)
Code:
; X prefix for fixed point ;-)
XMULS r1, r2, 8, r3 ; r3 = (r1 * r2) >> 8. signed
XMULU r4, r5, 16, r6 ; r6 = (r4 * r5) >> 16. unsigned
XDIV r3, 8, r6, r7 ; r7 = (r3 << 8) / r6
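For anyone who wants to check the arithmetic, the three instructions can be modelled in C++ roughly as below. The function names are made up, and XDIV is assumed to be signed since the listing does not say. The widening to 64 bits is the intermediate mentioned in the edit note.

```cpp
#include <cassert>
#include <cstdint>

// XMULS: signed multiply, then shift right. Right-shifting a negative value
// is arithmetic on mainstream compilers (and guaranteed from C++20 on).
std::int32_t xmuls(std::int32_t a, std::int32_t b, unsigned shift) {
    assert(shift < 64);
    const std::int64_t wide = static_cast<std::int64_t>(a) * b;
    return static_cast<std::int32_t>(wide >> shift);
}

// XMULU: unsigned multiply, then shift right; result truncated to 32 bits.
std::uint32_t xmulu(std::uint32_t a, std::uint32_t b, unsigned shift) {
    assert(shift < 64);
    const std::uint64_t wide = static_cast<std::uint64_t>(a) * b;
    return static_cast<std::uint32_t>(wide >> shift);
}

// XDIV: pre-shift the dividend left in 64 bits, then divide. The shift is
// written as a multiply so negative dividends stay well defined.
std::int32_t xdiv(std::int32_t a, unsigned shift, std::int32_t b) {
    assert(b != 0 && shift < 32);
    const std::int64_t wide =
        static_cast<std::int64_t>(a) * (std::int64_t{1} << shift);
    return static_cast<std::int32_t>(wide / b);
}
```

In 24.8 fixed point, 1.5 is 384 and 2.0 is 512, so xmuls(384, 512, 8) yields 768 (3.0) and xdiv(768, 8, 512) yields 384 (1.5). Without the 64-bit intermediate, products such as 0xFFFF0000 * 0x00010000 would overflow before the shift.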
Code:
ENTER r31, r30, 8, 4
; Pseudocode:
; StartSP = SP
; SP -= 8 words
; for(int i = 0; i < 4; i++) *(StartSP - 1 - i) = r[31 - i]; (Save top 4 registers to 4 stack slots of the 8 we just reserved)
; r31 = LR (Link register - an SFR)
; r30 = StartSP (Create a frame pointer)
EXIT r31, r30, 4
; LR = r31
; SP = r30
; for(int i = 0; i < 4; i++) r[31 - i] = *(SP - 1 - i); (Restore top 4 registers off stack)
; JMP LR ; Return!
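The pseudocode can be run as a small C++ model to see that the save and restore line up. The Cpu struct, the word-indexed stack and the function names are all made up for the sketch. The operand order (link register, frame register, words reserved, registers saved) is read off the pseudocode above, not taken from any documentation.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical machine state. sp is a word index into `stack`, growing down.
struct Cpu {
    std::array<std::uint32_t, 32> r{};
    std::uint32_t sp = 64;
    std::uint32_t lr = 0;
    std::uint32_t pc = 0;
    std::vector<std::uint32_t> stack = std::vector<std::uint32_t>(64);
};

// ENTER link, frame, reserve, save
void enter(Cpu& c, unsigned link, unsigned frame, unsigned reserve, unsigned save) {
    assert(save <= reserve && reserve <= c.sp && save <= 32);
    const std::uint32_t start_sp = c.sp;
    c.sp -= reserve;                              // reserve the frame
    for (unsigned i = 0; i < save; ++i)           // save the top registers
        c.stack[start_sp - 1 - i] = c.r[31 - i];
    c.r[link] = c.lr;                             // keep the return address
    c.r[frame] = start_sp;                        // create the frame pointer
}

// EXIT link, frame, save
void exit_frame(Cpu& c, unsigned link, unsigned frame, unsigned save) {
    c.lr = c.r[link];                             // read before it is restored
    c.sp = c.r[frame];                            // pop the whole frame
    for (unsigned i = 0; i < save; ++i)           // restore the top registers
        c.r[31 - i] = c.stack[c.sp - 1 - i];
    c.pc = c.lr;                                  // JMP LR
}
```

Note the ordering in exit_frame: the link and frame registers are read before the restore loop overwrites them with the caller's values, which is why the pseudocode sets LR and SP first.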