Is the enter/leave instruction better?
Posted: Mon Oct 26, 2015 2:53 am
by wyj
For assembly,
is it better to do something like
foo:
enter sth,0
......
leave
ret
or
foo:
push rbp
mov rbp,rsp
sub rsp,sth
......
mov rsp,rbp
pop rbp
ret
Which one is better for us to use?
I can also see several different versions of
sub rsp,sth
while gcc may not even touch rsp at all.
It is actually unnecessary, since you can always read parameters with an offset,
but which one is better for us?
Re: Is the enter/leave instruction better?
Posted: Mon Oct 26, 2015 3:58 am
by Brendan
Hi,
In assembly, it's better not to bother (and just use ESP/RSP to find things on the stack), because it's faster and lets you use EBP/RBP as an extra general purpose register (which helps a lot for 32-bit code, where there are fewer other general purpose registers).
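A minimal sketch of the "just use RSP" approach, assuming NASM syntax, 64-bit code and the System V calling convention (arguments in rdi and rsi). The function name sum_sq is hypothetical and the locals are spilled only to show the addressing:

```nasm
; Hypothetical leaf-style function: returns rdi*rdi + rsi*rsi.
; No frame pointer is set up, so rbp is never touched.
bits 64
section .text
global sum_sq

sum_sq:
    sub   rsp, 16            ; reserve two 8-byte locals
    mov   [rsp], rdi         ; local 0, addressed relative to rsp
    mov   [rsp+8], rsi       ; local 1
    mov   rax, [rsp]
    imul  rax, rax           ; rax = a*a
    mov   rdx, [rsp+8]
    imul  rdx, rdx           ; rdx = b*b
    add   rax, rdx
    add   rsp, 16            ; release the locals
    ret
```

The cost is that every push, pop or further adjustment of rsp inside the body shifts the offsets of the locals, so they have to be tracked by hand.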
Apart from that, the ENTER/LEAVE instructions are better for code size and worse for speed; so you want to use them in things like "only executed once" initialisation code and want to avoid them in code that's executed often.
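The code size side of this can be seen by putting the two forms next to each other. This is a sketch in NASM syntax for 64-bit code with a 32-byte frame picked as an example; the labels are hypothetical and the byte sequences in the comments are what an assembler typically emits for these instructions:

```nasm
bits 64
section .text

with_enter:
    enter 32, 0              ; C8 20 00 00   (4 bytes)
    ; ... body ...
    leave                    ; C9            (1 byte)
    ret                      ; C3

with_manual:
    push  rbp                ; 55            (1 byte)
    mov   rbp, rsp           ; 48 89 E5      (3 bytes)
    sub   rsp, 32            ; 48 83 EC 20   (4 bytes)
    ; ... body ...
    mov   rsp, rbp           ; 48 89 EC      (3 bytes)
    pop   rbp                ; 5D            (1 byte)
    ret                      ; C3
```

Here the prologue is 4 bytes against 8 and the epilogue 1 byte against 4. The sketch says nothing about speed, which depends on the CPU.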
Cheers,
Brendan
Re: Is the enter/leave instruction better?
Posted: Mon Oct 26, 2015 7:02 am
by Combuster
In practice you never see ENTER being used. LEAVE is short and doesn't do much and therefore is much more optimised from the processor's perspective. It can be found in various forms of optimised code. Not using EBP as a copy of the stack pointer is faster, but it will break stacktrace functionality. Considering you seem to be using 64-bit code, you can even consider using the red zone - as long as it's not part of the kernel.
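The combination described here, a hand-written prologue paired with LEAVE, might look like the following sketch (NASM syntax, 64-bit; the function name framed and the 32-byte frame are assumptions for illustration):

```nasm
bits 64
section .text
global framed

framed:
    push  rbp                ; save the caller's frame pointer
    mov   rbp, rsp           ; now [rbp] = caller's rbp, [rbp+8] = return address
    sub   rsp, 32            ; locals live in [rbp-32] .. [rbp-1]
    mov   qword [rbp-8], 0   ; example local
    leave                    ; same effect as: mov rsp, rbp / pop rbp
    ret
```

The saved rbp values form a linked list from frame to frame, which is what a stack trace walks. A function that skips the prologue does not add its link to that chain.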
Re: Is the enter/leave instruction better?
Posted: Mon Oct 26, 2015 1:12 pm
by tlf30
A really good topic on the matter that was posted some time ago:
http://forum.osdev.org/viewtopic.php?t=22683
Re: Is the enter/leave instruction better?
Posted: Fri Oct 30, 2015 8:19 am
by wyj
Combuster wrote:In practice you never see ENTER being used. LEAVE is short and doesn't do much and therefore is much more optimised from the processor's perspective. It can be found in various forms of optimised code. Not using EBP as a copy of the stack pointer is faster, but it will break stacktrace functionality. Considering you seem to be using 64-bit code, you can even consider using the red zone - as long as it's not part of the kernel.
Yes, it is x64. I'm currently writing only so-called "leaf functions" with no more than 4 parameters; 4 is usually enough for most occasions,
and if not, I will use a struct and a pointer to avoid the stack (laugh).
I hate counting bytes on the stack, to be honest.
And I am sorry, but what do you mean by "red zone"?
Re: Is the enter/leave instruction better?
Posted: Fri Oct 30, 2015 11:18 am
by SpyderTL
wyj wrote:and I am sorry but what do you mean by "red zone?"
Calling Conventions - System V X86_64
There is a 128-byte area below the stack pointer called the 'red zone', which leaf functions may use without adjusting %rsp. This requires the kernel to skip an additional 128 bytes below %rsp when delivering signals to user space. The CPU does not do this: if interrupts use the current stack (as with kernel code) and the red zone is enabled (the default), interrupts will silently corrupt the stack. Always pass -mno-red-zone when compiling kernel code (even support libraries such as a libc embedded in the kernel) if interrupts don't respect the red zone.
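As a rough illustration, a user-space leaf function that relies on the red zone could look like this. It is a sketch in NASM syntax assuming the System V x86-64 calling convention; the function name add3 is hypothetical and the spills are there only to show the addressing:

```nasm
; Hypothetical leaf: returns rdi + rsi + rdx, spilling through the red zone.
bits 64
section .text
global add3

add3:
    mov   [rsp-8],  rdi      ; red zone slot; rsp is never adjusted
    mov   [rsp-16], rsi
    mov   [rsp-24], rdx
    mov   rax, [rsp-8]
    add   rax, [rsp-16]
    add   rax, [rsp-24]
    ret
```

This only works because the function calls nothing else: a call would push a return address at [rsp-8] and overwrite the first slot.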
Re: Is the enter/leave instruction better?
Posted: Fri Oct 30, 2015 12:24 pm
by kzinti
Curious... Seems to me like it would be better to enable red zones in the kernel and properly fix the stack when entering interrupt gates. Has anyone done any testing here?
Re: Is the enter/leave instruction better?
Posted: Fri Oct 30, 2015 11:35 pm
by gerryg400
kiznit wrote:Curious... Seems to me like it would be better to enable red zones in the kernel and properly fix the stack when entering interrupt gates. Anyone has done some testing here?
The problem is that if you enable the red zone in your kernel and an interrupt occurs, stuff gets pushed onto your ring 0 stack. If there is a red zone, it will be trashed. A userspace red zone is okay because when an interrupt occurs, nothing is pushed onto the userspace stack, and the red zone is undisturbed.
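A sketch of the failure case, in NASM syntax for 64-bit code. The function name leaf_in_kernel is hypothetical, and the sketch assumes the code runs in ring 0 and that the interrupt does not switch to a different stack:

```nasm
bits 64
section .text
global leaf_in_kernel

leaf_in_kernel:
    mov   [rsp-16], rdi      ; value parked in the red zone
    ; If an interrupt arrives here, the CPU pushes SS, RSP, RFLAGS, CS and RIP
    ; (5 x 8 bytes) onto this same stack, starting at or just below the
    ; interrupted rsp, so the frame lands on top of [rsp-16].
    mov   rax, [rsp-16]      ; may now read part of the interrupt frame, not rdi
    ret
```

Nothing in the function itself is wrong, which is why the corruption is silent and only shows up when an interrupt happens to hit between the store and the load.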
Re: Is the enter/leave instruction better?
Posted: Fri Oct 30, 2015 11:52 pm
by kzinti
Right... What was I thinking... =)
Re: Is the enter/leave instruction better?
Posted: Sat Oct 31, 2015 12:01 am
by gerryg400
kiznit wrote:Right... What was I thinking... =)
Yeah, don't feel too bad. The red-zone has caught plenty. Read this
http://forum.osdev.org/viewtopic.php?f= ... t=red+zone
Re: Is the enter/leave instruction better?
Posted: Fri Dec 18, 2015 7:30 am
by azblue
Combuster wrote:Considering you seem to be using 64-bit code, you can even consider using the red zone - as long as it's not part of the kernel.
I just learned about the red zone from this thread (and the other one linked), but I don't understand why it needs to be confined exclusively to long mode; shouldn't it also work in ring 3 protected mode leaf functions? If they're not calling other functions, and an interrupt will switch to another stack, I don't see why long mode is required (or why the red zone would be limited to 128 bytes).
Re: Is the enter/leave instruction better?
Posted: Fri Dec 18, 2015 8:16 am
by Brendan
Hi,
azblue wrote:Combuster wrote:Considering you seem to be using 64-bit code, you can even consider using the red zone - as long as it's not part of the kernel.
I just learned about the red zone from this thread (and the other one linked), but I don't understand why it needs to be confined exclusively to long mode; shouldn't it also work in ring 3 protected mode leaf functions? If they're not calling other functions, and an interrupt will switch to another stack, I don't see why long mode is required (or why the red zone would be limited to 128 bytes).
The first thing to understand is that for instructions like "mov rax,[rsp+(-123)]" there's 3 alternatives:
- encode the offset (-123) as a sign extended 8-bit immediate
- encode the offset (-123) as a sign extended 16-bit immediate and waste 2 extra bytes (one for the extra immediate byte and another for the size override prefix you'd need)
- encode the offset (-123) as a sign extended 32-bit immediate and waste 3 extra bytes
The point of the red zone is to make code more efficient by avoiding the need to adjust RSP (e.g. doing "sub rsp,256" to make space, which causes a dependency problem for later instructions that use RSP because they have to wait until the new value of RSP has been calculated); while also increasing the chance that those (shorter, better) "sign extended 8-bit immediate" instructions can be used.
It's this "(shorter, better) sign extended 8-bit immediate" that's responsible for the 128 byte size limit. If the red zone was larger, you'd have to use something less efficient (16-bit or 32-bit immediate), and it'd probably be better to adjust RSP instead.
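The 128 byte boundary is visible in the encodings. The sketch below uses NASM syntax for 64-bit code and covers only the 8-bit and 32-bit forms; the label disp_demo is hypothetical, and the byte sequences in the comments assume the assembler picks the shortest offset encoding, as assemblers normally do by default:

```nasm
bits 64
section .text

disp_demo:
    mov   rax, [rsp-123]     ; 48 8B 44 24 85            (5 bytes, 8-bit offset)
    mov   rax, [rsp-128]     ; 48 8B 44 24 80            (5 bytes, lowest offset that still fits)
    mov   rax, [rsp-129]     ; 48 8B 84 24 7F FF FF FF   (8 bytes, 32-bit offset)
    ret
```

A sign extended 8-bit offset covers -128 to +127, so everything from [rsp-128] to [rsp-1] is reachable with the short form, and going one byte further costs 3 extra bytes per instruction.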
Now, calling conventions...
There is no reason the same red zone stuff couldn't be done for 32-bit code (or 16-bit code). In fact, if you're willing to write your own compiler and ensure that all your shared libraries, kernel API, etc. are designed for it, nothing prevents you from implementing any calling convention you like. The problem here is that it'd break compatibility with the calling conventions that everything has used for about 25 years.
Also note that you only need a strictly defined calling convention for cases where the tools can't optimise the calling convention properly (e.g. because the called function is in a completely different object file, or in a shared library or something). For better tools (e.g. where native code generation is done by a link-time optimiser, and where the calling conventions used by most functions can be optimised properly) the "strictly defined calling convention" wouldn't be used anywhere near as much, and would have far less performance impact. Basically; if you're going to replace standard tools just to improve that strictly defined calling convention; then you're probably solving the wrong problem in the first place.
Cheers,
Brendan
Re: Is the enter/leave instruction better?
Posted: Fri Dec 18, 2015 6:51 pm
by TightCoderEx
I generally use ENTER all the time as most procedures are at least 500 - 750 cycles, so the .8% saving is negligible. There is also a two byte saving for frames greater than 128 bytes and not often, but there have been times when nested frames were handy. That being said, I'll probably refrain from using it in interrupt handlers, but I can't really see there would be a need for a procedure frame in a handler anyway.