Page 1 of 1

how looks the lisp `car` function in asm?

Posted: Mon Jul 05, 2004 12:57 pm
by d11in
hi,

i`m new in lisp and i`m wondering (is my english correct?.. it doesn`t matter ::) ) how the car function looks like if its compiled --> assembler code, i made a example.please give me your comments.

d11in


example:

it`s a 32-bit assembler routine, the input argument (eax) shoul`d be a list (cons cell) and the output argument is the car value (eax) which can be an atom or a list. the three `highest significant` bits are used for typing and garbage collection, maybe. we work in a single-address space, unsegmented protected mode --> 32-bit flat pmode....

Code: Select all

f-car:        push ebp              ; enter stack frame
              mov ebp,esp

              mov ebx,eax          ; store argument in ebx too

              and eax,0xE0000000     ; check if arg a list
              jz f-car-work

f-car-error:   ???                       ; illegal CAR argument

               pop ebp
               ret

f-car-work:    and ebx,0x1fffffff    ; get car-value
               mov eax,[ebx]

               pop ebp                  ; leave stack frame
               ret

Re:how looks the lisp `car` function in asm?

Posted: Tue Jul 06, 2004 9:22 am
by Schol-R-LEA
To start with, this probably belongs in the General Programming forum, even if you are planning a systems dialect of Lisp. See the thread "Language Design: Sibilance" for a (brief) discussion on this topic.

Your existing code, as best as I can follow it, does seem to do the job after a fashion: it takes a pointer to a cons cell as an argument (in eax), checks that the cons cell is valid (by checking the tag of eax), then returns the value of the first item in the cons cell (it copies the value of eax to ebx, then dereferences ebx and stores the result in eax).

This having been said, there are some specific critiques I would add regarding your particular design. These critiques are from the point of view of classic Lisp and Scheme implementations; if you are intending a system-programming subset (as seems likely), then they would be less critical, though other limitations would apply.

To begin with, I don't think that using the primary accumulator for argument passing is a great idea, esp. on a register-poor architecture like the x86 (which really needs a good dozen or so more for that sort of thing). This is particularly problematic in a language that uses recursion as it's primary means of repetition; even on an architecture where register windowing were an option, you would quickly reach a call depth where you would be using the stack more using register passing than with stack passing, since you would have to push the whole register set pre-emptively in the calling function rather than selectively pushing single registers in the called function. It also would make TCO a nightmare, as you wouldn't know ahead of time which registers could be elided and which could not be. Besides, what would you need the save the frame pointer for if you aren't using the stack for arguments?

Conversely, using a conventional environment frame - either using the stack or register windowing - would rule out the possibility of closures. Since closures are the traditional form of persistent objects in Lisp (and especially in Scheme), this would be a serious handicap. It would also make continuations impossible (though that is only of interest in Scheme-like implementations). In the end, you would need to have free environment tables to support a full Lisp, I think, or at the very least the ability to copy the local variables off of the stack onto a persistent table in order to pass a function state variable.

As for using tagged pointers, it's hard to say if it is worth it. While it would be really useful in many ways, it would also mean reducing the effective address space to an 8th of the total memory, or 512M in a 32-bit system. While this is still a huge amount of memory - even today, many systems have less than 256M of RAM - it would still mean a considerable loss for a language as memory-intensive as Lisp. Also, two type bits and a sweep bit may not be enough to make it worthwhile; with only four types, one of which would by necessity be null, it would be too little information at too much cost. Tagged memory is very useful when there is hardware support, but it is far less so when done in software alone. Here, too, it may be better just to eat the overhead on a table for the variables.

Finally, I don't know if the approach you've taken captures the concept generally enough. In most Lisps, (car) is defined as taking a proper list - a cons cell whose second element points to another cons cell, or to null - and returns a the pointer in the first value. This last part is important, as (in most Lisps), a returned value is a pointer to a structure allocated on the heap. In the design you've chosen, it looks as if you mean to 'snap the pointers', that is to say, to replace pointers to literals with the pointed values themselves. While this would be very speed efficient, it presents problems whe handling shared data structures, as well as adding overhead in that it requires you to test the type of the return value.

Re:how looks the lisp `car` function in asm?

Posted: Tue Jul 06, 2004 1:56 pm
by d11in
Your existing code, as best as I can follow it, doe seem to do the job after a fashion: it takes a pointer to a cons cell as an argument (in eax), checks that the cons cell is valid (by checking the tag of eax), then returns the value of the first item in the cons cell (it copies the value of eax to ebx, then dereferences ebx and stores the result in eax).
that`s right but it`s just an example.
To begin with, I don't think that using the primary accumulator for argument passing is a great idea, esp. on a register-poor architecture like the x86 (which really needs a good dozen or so more for that sort of thing). This is particularly problematic in a language that uses recursion as it's primary means of repetition; even on an architecture where register windowing were an option, you would quickly reach a call depth where you would be using the stack more using register passing than with stack passing, since you would have to push the whole register set pre-emptively in the calling function rather than selectively pushing single registers in the called function. It also would make TCO a nightmare, as you wouldn't know ahead of time which registers could be elided and which could not be. Besides, what would you need the save the frame pointer for if you aren't using the stack for arguments?
i agree with you. the arguments should stored on the stack. but i`ve got a question:

push ebp
mov ebp,esp
mov eax,[ebp+8]
mov ebx,eax
:

that`s the code to enter the new stack frame and get
the first argument. maybe the ecx-register holds the value of arguments (??). but at the end of this routine: how do i store the result(s) back on stack? maybe throught a pointer in eax-register to the heap memory, or i do "mov [ebp+8],eax" - store the result back to 1st one? what would you do?


d11in

Re:how looks the lisp `car` function in asm?

Posted: Tue Jul 06, 2004 5:37 pm
by Christoph Reichenbach
Hi,
Schol-R-LEA wrote: To begin with, I don't think that using the primary accumulator for argument passing is a great idea, esp. on a register-poor architecture like the x86 (which really needs a good dozen or so more for that sort of thing). This is particularly problematic in a
language that uses recursion as it's primary means of repetition;
Recall that, thanks to CPS, we can always get tail recursion and, thus, transform recursion into iteration at compile time. (Though we don't neccessarily get many benefits out of doing CPS all the time, admittedly ;-)
Schol-R-LEA wrote: even on an architecture where register windowing were an option, you would quickly reach a call depth where you would be using the stack more using register passing than with stack passing, since you would have to push the whole register set pre-emptively in the calling function rather than selectively pushing single registers in the called function. It also would make TCO a nightmare, as you wouldn't know ahead of time which registers could be elided and which could not be. Besides, what would you need the save the frame pointer for if you aren't using the stack for arguments?
I would argue that using one or two registers for argument passing or return values might well be OK even on broken architectures like the IA32. Only having to back up one or two registers doesn't make that much of a difference if you're very likely to pass at least one or two arguments anyway; moreover, if many functions return immediately, backing up arguments in the activation record can typically be avoided.
Schol-R-LEA wrote: Conversely, using a conventional environment frame - either using the stack or register windowing - would rule out the possibility of closures. Since closures are the traditional form of persistent objects in Lisp (and especially in Scheme), this would be a serious handicap. It would also make continuations impossible (though that is only of interest in Scheme-like implementations). In the end, you would need to have free environment tables to support a full Lisp, I think, or at the very least the ability to copy the local variables off of the stack onto a persistent table in order to pass a function state variable.
Do you have a reference to how Scheme traditionally implements closures and continuations? I was taught that the "usual way" was to represent closures as straightforward (code pointer, environment table pointer) tuples, at least for escaping closures, and continuations as copies of the stack (which seemed rather costly). I'm not sure how destructive updates are supposed to factor into this, though.
Schol-R-LEA wrote: As for using tagged pointers, it's hard to say if it is worth it. While it would be really useful in many ways, it would also mean reducing the effective address space to an 8th of the total memory, or 512M in a 32-bit system.
I don't know the recent literature on whether tagged memory is "worth it", but my gut feeling-- considering the extreme price we have to pay for memory access these days-- would be that it might well be. Note that you don't address unaligned offsets anyway, so that the last two bits are typically zero; if you extend the size of your cons cells to two 32 bit words (and, honestly, why should you not do that?), you have precisely three bits to spare.
Schol-R-LEA wrote: In the design you've chosen, it looks as if you mean to 'snap the pointers', that is to say, to replace pointers to literals with the pointed values themselves. While this would be very speed efficient, it presents problems whe handling shared data structures, as well as adding overhead in that it requires you to test the type of the return value.
Interesting point. Again, we should consider that memory access is very, very expensive nowadays. A few hundred instructions are usually cheaper than a single access to main memory (as opposed to cached memory). It might be worthwhile to check the current literature (spineless tagless G-machine, O'Caml RTS) to see how current FPLs deal with this.

-- Christoph