enter and leave instruction in asm

wenn32
Posts: 17
Joined: Sun Oct 24, 2010 2:30 am

enter and leave instruction in asm

Post by wenn32 »

Hello, I am currently learning asm. Please take a look at this asm program:

Code:


[section .data]
fmt: db 'First = %d',10,'Second = %d',10,0

[section .text]
global _main

extern _printf

_main:

enter 8,0
mov dword [ebp - 4],123
mov dword [ebp - 8],456
push dword [ebp - 8]
push dword [ebp - 4]
push fmt
call _printf
add esp,12

leave
mov eax,0
ret

What do the enter and leave instructions do?
Combuster
Member
Posts: 9301
Joined: Wed Oct 18, 2006 3:45 am
Libera.chat IRC: [com]buster
Location: On the balcony, where I can actually keep 1½m distance

Re: enter and leave instruction in asm

Post by Combuster »

They create (enter) and discard (leave) a stack frame. See the Intel manuals (Intel volume 2A) for a detailed description.
"Certainly avoid yourself. He is a newbie and might not realize it. You'll hate his code deeply a few years down the road." - Sortie
[ My OS ] [ VDisk/SFS ]
bewing
Member
Posts: 1401
Joined: Wed Feb 07, 2007 1:45 pm
Location: Eugene, OR, US

Re: enter and leave instruction in asm

Post by bewing »

Well, the manuals don't do a very good job of telling you why you might want to create a stack frame.

When a function gets called in C, for example, the arguments get pushed onto the stack. Then the function gets called. The stack will be used lots more in just a second, so many programmers think it is a good idea to save a copy of ESP at this moment. The EBP register was made for doing exactly that. So you do "PUSH EBP; MOV EBP, ESP". That is called "setting up a stack frame pointer", which is EBP. That is what the ENTER opcode does -- it's pretty much a replacement for those two opcodes. But then while the function is running, you can use ESP as much as you want and leave it trashed -- since you saved a good copy of the pointer in EBP. You can also use EBP to easily access the arguments that were pushed onto the stack. LEAVE does a "MOV ESP, EBP; POP EBP".
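
To make that equivalence concrete, here is a minimal sketch in C that models only ESP, EBP and a small block of memory. The names Cpu, push32, pop32, enter0, leave and call_example are made up for this illustration and are not from any real emulator. It assumes the nesting level is 0, as in "enter 8,0", and it includes the SUB ESP step that reserves the locals.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of a 32-bit x86 stack: ESP, EBP and a little memory. */
typedef struct {
    uint32_t esp, ebp;
    uint32_t mem[64];                 /* indexed by address / 4 */
} Cpu;

void push32(Cpu *c, uint32_t v) {
    assert(c->esp >= 4);              /* the toy stack must not underflow */
    c->esp -= 4;
    c->mem[c->esp / 4] = v;
}

uint32_t pop32(Cpu *c) {
    uint32_t v = c->mem[c->esp / 4];
    c->esp += 4;
    return v;
}

/* ENTER locals,0  ==  PUSH EBP; MOV EBP,ESP; SUB ESP,locals */
void enter0(Cpu *c, uint32_t locals) {
    push32(c, c->ebp);
    c->ebp = c->esp;
    c->esp -= locals;
}

/* LEAVE  ==  MOV ESP,EBP; POP EBP */
void leave(Cpu *c) {
    c->esp = c->ebp;
    c->ebp = pop32(c);
}

/* Walk through one call and return the value read through [ebp+8]. */
uint32_t call_example(Cpu *c) {
    push32(c, 456);                   /* argument pushed by the caller */
    push32(c, 0x1000);                /* stand-in for the return address */
    enter0(c, 8);                     /* like "enter 8,0" */
    uint32_t arg = c->mem[(c->ebp + 8) / 4];  /* first argument */
    c->mem[(c->ebp - 4) / 4] = 123;   /* a local at [ebp-4] */
    c->esp -= 12;                     /* ESP is free to move meanwhile */
    leave(c);                         /* ESP/EBP as they were right after CALL */
    return arg;
}
```

Starting from esp = 200 and ebp = 240, call_example returns 456 and leaves esp at 192 (the slot holding the stand-in return address) with ebp back at 240, even though ESP was moved around inside the function.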
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Re: enter and leave instruction in asm

Post by Brendan »

Hi,
bewing wrote:When a function gets called in C, for example, the arguments get pushed onto the stack. Then the function gets called. The stack will be used lots more in just a second, so many programmers think it is a good idea to save a copy of ESP at this moment. The EBP register was made for doing exactly that. So you do "PUSH EBP; MOV EBP, ESP". That is called "setting up a stack frame pointer", which is EBP. That is what the ENTER opcode does -- it's pretty much a replacement for those two opcodes.
Actually, up to 3 opcodes ("PUSH EBP; MOV EBP, ESP; SUB ESP,<space_for_local_variables>").

Ironically, on most CPUs ENTER/LEAVE are implemented in micro-code and it's faster to use 2 or 3 smaller/simpler instructions instead, so most compilers don't use ENTER/LEAVE at all.

Also note that if you replace ENTER/LEAVE with the faster/smaller/simpler alternative instructions and then optimise the assembly (e.g. replace "MOV ESP,EBP" with "ADD ESP,<space_for_local_variables>" and remove the "MOV EBP,ESP", then use ESP instead of EBP to access local variables and input parameters to free up EBP for normal use) you end up with smaller/faster code with no stack frame.
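
The bookkeeping behind that optimisation is just offset arithmetic. The sketch below is a hypothetical helper (esp_offset is not from any real tool) that converts an EBP-relative offset from the framed version into the ESP-relative offset for the frameless version. It assumes the frameless function drops "PUSH EBP" entirely and reserves locals with a single "SUB ESP,locals".

```c
#include <assert.h>
#include <stdint.h>

/*
 * Translate an EBP-relative offset from a framed function into the
 * ESP-relative offset used by the frameless version.
 *   locals : bytes reserved by SUB ESP,locals
 *   pushed : bytes pushed since then (ESP keeps moving, EBP would not)
 */
int32_t esp_offset(int32_t ebp_off, int32_t locals, int32_t pushed) {
    /* [ebp+0..3] is the saved EBP, which no longer exists without a frame */
    assert(ebp_off < 0 || ebp_off >= 4);
    if (ebp_off < 0)
        return ebp_off + locals + pushed;      /* local variable */
    return ebp_off - 4 + locals + pushed;      /* return address or argument */
}
```

For the program in the first post (8 bytes of locals), "push dword [ebp - 8]" would become "push dword [esp]", and the following "push dword [ebp - 4]" would become "push dword [esp + 8]", because the first push has already moved ESP down by 4. Tracking that moving offset by hand is the price of freeing up EBP.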


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
JamesM
Member
Posts: 2935
Joined: Tue Jul 10, 2007 5:27 am
Location: York, United Kingdom

Re: enter and leave instruction in asm

Post by JamesM »

Brendan wrote:Ironically, on most CPUs ENTER/LEAVE are implemented in micro-code
Every instruction is microcoded on every CPU since the 1990s.
wenn32
Posts: 17
Joined: Sun Oct 24, 2010 2:30 am

Re: enter and leave instruction in asm

Post by wenn32 »

thanks!
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Re: enter and leave instruction in asm

Post by Brendan »

Hi,
JamesM wrote:
Brendan wrote:Ironically, on most CPUs ENTER/LEAVE are implemented in micro-code
Every instruction is microcoded on every CPU since the 1990s.
Simple instructions are decoded directly into a small number of micro-ops (typically 1 micro-op). Complex instructions aren't, and (for the sake of over-simplifying) are a little bit like miniature subroutines stored in microcode ROM (or "microcoded") rather than actual instructions that are executed directly (quickly).

From Intel's Optimisation Reference Manual:
"Assembler/Compiler Coding Rule 40. (ML impact, M generality) Avoid using complex instructions (for example, enter, leave, or loop) that have more than 4 uops and require multiple cycles to decode. Use sequences of simple instructions instead.

Complex instructions may save architectural registers, but incur a penalty of 4 uops to set up parameters for the microcode ROM."


Cheers,

Brendan
Brynet-Inc
Member
Posts: 2426
Joined: Tue Oct 17, 2006 9:29 pm
Libera.chat IRC: brynet
Location: Canada

Re: enter and leave instruction in asm

Post by Brynet-Inc »

berkus wrote:Just don't forget to add you're speaking about Intel cpus, not all cpus.
JamesM works for a very successful designer of semiconductors.
Twitter: @canadianbryan. Award by smcerm, I stole it. Original was larger.
Combuster
Member
Posts: 9301
Joined: Wed Oct 18, 2006 3:45 am
Libera.chat IRC: [com]buster
Location: On the balcony, where I can actually keep 1½m distance

Re: enter and leave instruction in asm

Post by Combuster »

Depending on where you draw the line between microcoded instructions and others: on AMD's Athlon series there are two distinct decoder systems, directpath and vectorpath. The manuals suggest that the decoding operations are hardwired into the directpath unit, while the vectorpath unit generates internal opcodes from an internal memory (ROM is technically not the best description).

Thing is, out-of-order execution always requires some internal state to be saved so that it can be dispatched to the various ALU components. At this point, the distinction between microcode and storage of "simple" control signals is kind of blurred. An ancient processor like a 6502 simply grabs an instruction, loads the operands when needed, performs the ALU op, then stores the result where needed. The moment you start pipelining that, you can do the loads in one cycle where possible, the operation in the next, and the store in the third. If you do that out of order, you can just save some control signals for later use. In all cases, there is no technical difference between a "discrete" lookup table, which we label a "microcode ROM", that converts an opcode into signal batches, and a more efficient logic network that takes advantage of the similarities in instruction formatting. The net effect is, at this level, the same.

Therefore the statement "microcode rom is slow" or "microcoded instructions are slow" is, as a generalisation, wrong.

The difference is how much load is put on the so-called microcode unit. If it can always respond with the same number of operations, there's no difference. If it has to respond with a variable number of operations, then it can become a bottleneck the moment the number of operations dispatched is high compared to the number of instructions coming in. The execution engine then sees bunches of operations belonging to one instruction, and goes idle because it has no other source instruction in the queue that it could work on in parallel. And that is the situation behind the microcode myth: "complex microcoded instructions reduce the amount of independent work available to the processor, so that it can no longer do more than one thing at the same time".
Owen
Member
Posts: 1700
Joined: Fri Jun 13, 2008 3:21 pm
Location: Cambridge, United Kingdom

Re: enter and leave instruction in asm

Post by Owen »

Also note that, IIRC, simple LEAVEs are fast (at least as good as the simple instructions), but ENTER and complex LEAVEs are expensive and to be avoided.

(This would appear to be corroborated by GCC generating LEAVEs quite often when it doesn't elide the frame pointer altogether)
JamesM
Member
Posts: 2935
Joined: Tue Jul 10, 2007 5:27 am
Location: York, United Kingdom

Re: enter and leave instruction in asm

Post by JamesM »

berkus wrote:
Brynet-Inc wrote:
berkus wrote:Just don't forget to add you're speaking about Intel cpus, not all cpus.
JamesM works for a very successful designer of semiconductors.
This alone still doesn't mean that ALL cpus are microcoded. If you want superscalar and out-of-order, then yes, unit-specific mops make more sense, but for the microcontroller sort of cpus just dumb direct execution may be more efficient.

Prove me wrong, James.
I'd like to point out that I do not work in the processor division of said company, so my statements have as much research behind them as any of yours.

Every CPU with a pipeline requires one operation to be broken down into multiple micro-ops - LOAD, EXECUTE, WRITEBACK for a very simple system (ignoring instruction fetch because although it is a pipeline stage it obviously doesn't depend on instruction content).

As Combuster rightly mentions, to split an instruction into micro-ops, you need what is functionally equivalent to a lookup table. A request to ROM is functionally equivalent to just a combinational function; there's no constraint on how that combinational function is implemented. On x86 it seems some instructions fall through to a ROM (ENTER et al.) and others are special-cased for speed. This is what I would expect.

But they're all microcoded, because they all use a pipeline. Even the Cortex-M3 is pipelined, so yes, it is microcoded too.

How the microcode lookup is implemented is an implementation detail!

James
drudru
Posts: 1
Joined: Fri Oct 21, 2011 8:39 pm

Re: enter and leave instruction in asm

Post by drudru »

Sooo.....

Why not just transform the instruction into the equivalent fast preamble??

Is it because of a Transmeta patent? :-)


miker00lz
Member
Posts: 144
Joined: Wed Dec 08, 2010 3:16 am
Location: St. Louis, MO USA

Re: enter and leave instruction in asm

Post by miker00lz »

Yep, as has been mentioned, ENTER sets up a stack frame and LEAVE tears it down. This is the code that handles them in my x86 emu, so you can see how they work:

Code:

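case 0xC8: //C8 ENTER (80186+)
    stacksize = getmem16(segregs[regcs], ip); StepIP(2);
    nestlev = getmem8(segregs[regcs], ip); StepIP(1);
    push(getreg16(regbp));            //save the caller's frame pointer
    frametemp = getreg16(regsp);      //this becomes the new BP
    if (nestlev) {
        for (temp16=1; temp16<nestlev; temp16++) {
                putreg16(regbp, getreg16(regbp) - 2);
                //copy the enclosing frame pointer stored at SS:BP, not BP itself
                //(regss is assumed to be the emulator's index for SS)
                push(getmem16(segregs[regss], getreg16(regbp)));
        }
        push(frametemp);              //the new frame's own pointer comes last
    }
    putreg16(regbp, frametemp);
    putreg16(regsp, getreg16(regsp) - stacksize); //locals go below the copied pointers
    break;

case 0xC9: //C9 LEAVE (80186+)
    putreg16(regsp, getreg16(regbp)); //discard locals and any copied pointers
    putreg16(regbp, pop());           //restore the caller's frame pointer
    break;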
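
For anyone who wants to check the nesting-level behaviour without a full emulator, here is a small self-contained sketch in C. It follows the operation pseudocode in the Intel manual as I understand it, for the 16-bit case only. Cpu16, push16 and enter16 are hypothetical names invented for this example, and the memory is a flat toy array rather than a real stack segment.

```c
#include <assert.h>
#include <stdint.h>

/* Minimal 16-bit machine: SP, BP and a small flat stack area. */
typedef struct {
    uint16_t sp, bp;
    uint16_t mem[128];                /* indexed by address / 2 */
} Cpu16;

void push16(Cpu16 *c, uint16_t v) {
    assert(c->sp >= 2);               /* the toy stack must not underflow */
    c->sp -= 2;
    c->mem[c->sp / 2] = v;
}

/* ENTER alloc,level with 16-bit operand size and stack size. */
void enter16(Cpu16 *c, uint16_t alloc, uint8_t level) {
    level &= 0x1F;                    /* nesting level is taken modulo 32 */
    push16(c, c->bp);                 /* save the caller's frame pointer */
    uint16_t frame = c->sp;           /* FrameTemp: the new frame pointer */
    if (level > 0) {
        for (uint8_t i = 1; i < level; i++) {
            c->bp -= 2;
            push16(c, c->mem[c->bp / 2]);  /* copy enclosing frame pointers */
        }
        push16(c, frame);             /* own frame pointer goes last */
    }
    c->bp = frame;
    c->sp -= alloc;                   /* locals go below the copied pointers */
}
```

With sp = 200 and bp = 0, calling enter16(&c, 4, 1) for an outer procedure, pushing a return address, and then calling enter16(&c, 2, 2) for a nested one ends with bp = 188 and sp = 182. The word at [bp-2] holds 198, the outer procedure's frame pointer, which is how the nested procedure can reach its parent's locals.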