push & pop in 64bit mode
I'm trying to write my first function in 64-bit mode,
but 64-bit mode doesn't seem to accept the PUSHA/POPA/PUSHAD/POPAD instructions.
I'm using AMD Sempron 140 Single core 2.7GHz
Re: push & pop in 64bit mode
Those instructions are not supported in long mode. You need to create
a macro instead which simulates the functionality of pushaq and popaq.
regards
Mac2004
Re: push & pop in 64bit mode
In 64-bit mode there are so many registers.
Should we really create pushaq/popaq macros?
Won't that hurt performance?
Re: push & pop in 64bit mode
Assuming you are using an assembler, take a look
at its manual to see how to create macros.
regards
Mac2004
- Combuster
Re: push & pop in 64bit mode
If pusha (or popa) is what you want, then just push everything. However, in practice you won't need to push more than 15 GPRs, so every pusha is wasting at least one push, if not more.
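As a rough illustration of pushing only what you need, here is a minimal NASM sketch. It assumes a routine that clobbers just rbx and r12 and a caller that follows the System V AMD64 convention (where those two are callee-saved). The label `example_func` and the choice of registers are made up for the example.

```nasm
; Hypothetical routine: saves only the registers it actually clobbers
; instead of the whole register file.
example_func:
    push rbx                ; callee-saved under System V AMD64 (assumed ABI)
    push r12                ; callee-saved under System V AMD64 (assumed ABI)

    ; ... body that uses rbx and r12 as scratch ...

    pop r12                 ; restore in reverse order of the pushes
    pop rbx
    ret
```

Two pushes and two pops replace the fifteen of each that a full pushaq/popaq pair would cost.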
Re: push & pop in 64bit mode
Here are the macros I use (fasm syntax):
Code:
;**************************************************************************************
;PUSHAQ: Emulates the 'pushaq instruction' under long mode.
;
; Input: --
;
; Output: --
;
;**************************************************************************************
align 8
macro PUSHAQ
{
;Save registers to the stack.
;--------------------------------
push rax ;save current rax
push rbx ;save current rbx
push rcx ;save current rcx
push rdx ;save current rdx
push rbp ;save current rbp
push rdi ;save current rdi
push rsi ;save current rsi
push r8 ;save current r8
push r9 ;save current r9
push r10 ;save current r10
push r11 ;save current r11
push r12 ;save current r12
push r13 ;save current r13
push r14 ;save current r14
push r15 ;save current r15
} ;end of macro definition
;**************************************************************************************
;POPAQ: Emulates the 'popaq instruction' under long mode.
;
; Input: --
;
; Output: --
;
;**************************************************************************************
align 8
macro POPAQ
{
;Restore registers from the stack.
;--------------------------------
pop r15 ;restore current r15
pop r14 ;restore current r14
pop r13 ;restore current r13
pop r12 ;restore current r12
pop r11 ;restore current r11
pop r10 ;restore current r10
pop r9 ;restore current r9
pop r8 ;restore current r8
pop rsi ;restore current rsi
pop rdi ;restore current rdi
pop rbp ;restore current rbp
pop rdx ;restore current rdx
pop rcx ;restore current rcx
pop rbx ;restore current rbx
pop rax ;restore current rax
} ;end of macro definition
Re: push & pop in 64bit mode
Although it's bad for performance, I followed Mac2004 and made a NASM version of PUSHAQ.
It makes coding a little easier.
Code:
%macro pushaq 0
push rax ;save current rax
push rbx ;save current rbx
push rcx ;save current rcx
push rdx ;save current rdx
push rbp ;save current rbp
push rdi ;save current rdi
push rsi ;save current rsi
push r8 ;save current r8
push r9 ;save current r9
push r10 ;save current r10
push r11 ;save current r11
push r12 ;save current r12
push r13 ;save current r13
push r14 ;save current r14
push r15 ;save current r15
%endmacro
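Only the push side is shown above. A matching popaq for NASM would presumably mirror the fasm POPAQ posted earlier, popping in exactly the reverse order of the pushes. The following is a sketch under that assumption; the label `example_routine` is hypothetical and it relies on the pushaq macro above already being defined.

```nasm
%macro popaq 0
    pop r15                 ; last register pushed is the first restored
    pop r14
    pop r13
    pop r12
    pop r11
    pop r10
    pop r9
    pop r8
    pop rsi
    pop rdi
    pop rbp
    pop rdx
    pop rcx
    pop rbx
    pop rax                 ; first register pushed is the last restored
%endmacro

; Hypothetical usage (assumes the pushaq macro above is defined):
example_routine:
    pushaq
    ; ... body that may clobber any general-purpose register except rsp ...
    popaq
    ret
```

If the pop order ever gets out of sync with the push order, registers come back swapped without any assembler warning, so it is worth keeping the two macros next to each other in the source.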
Re: push & pop in 64bit mode
If you want better performance, it is probably faster to decrement the stack by the size of the entire register set and then mov each register to the stack one at a time.
i.e. instead of
Code:
push reg1
push reg2
push reg3
...
push reg15
try
Code:
sub $120, %rsp
mov reg1, 112(%rsp)
mov reg2, 104(%rsp)
mov reg3, 96(%rsp)
...
mov reg15, 0(%rsp)
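For the NASM users in this thread, the same idea in Intel syntax might look roughly like the sketch below. The macro names `pushaq_mov` and `popaq_mov` are invented for illustration, and the offsets are chosen so the stack layout matches the push-based pushaq macro posted earlier. Whether this is actually faster depends on the CPU, so treat it as something to measure rather than a guaranteed win.

```nasm
; Hypothetical macros; names are not from the thread.
; Layout matches the push-based version: rax at [rsp+112], r15 at [rsp].
%macro pushaq_mov 0
    sub rsp, 120            ; reserve 15 * 8 bytes in one step
    mov [rsp + 112], rax
    mov [rsp + 104], rbx
    mov [rsp +  96], rcx
    mov [rsp +  88], rdx
    mov [rsp +  80], rbp
    mov [rsp +  72], rdi
    mov [rsp +  64], rsi
    mov [rsp +  56], r8
    mov [rsp +  48], r9
    mov [rsp +  40], r10
    mov [rsp +  32], r11
    mov [rsp +  24], r12
    mov [rsp +  16], r13
    mov [rsp +   8], r14
    mov [rsp +   0], r15
%endmacro

%macro popaq_mov 0
    mov r15, [rsp +   0]
    mov r14, [rsp +   8]
    mov r13, [rsp +  16]
    mov r12, [rsp +  24]
    mov r11, [rsp +  32]
    mov r10, [rsp +  40]
    mov r9,  [rsp +  48]
    mov r8,  [rsp +  56]
    mov rsi, [rsp +  64]
    mov rdi, [rsp +  72]
    mov rbp, [rsp +  80]
    mov rdx, [rsp +  88]
    mov rcx, [rsp +  96]
    mov rbx, [rsp + 104]
    mov rax, [rsp + 112]
    add rsp, 120            ; release the 15 * 8 bytes
%endmacro
```

One difference from the PUSH/POP version: SUB and ADD modify RFLAGS, while PUSH and POP do not. If the flags must survive the save/restore, `lea rsp, [rsp - 120]` and `lea rsp, [rsp + 120]` adjust the stack pointer without touching them.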
If a trainstation is where trains stop, what is a workstation ?
Re: push & pop in 64bit mode
gerryg400's method using MOV & SUB SP seems to be 15+1 ticks for 15 MOVs & 1 SUB
and the method using direct PUSH seems to be 15+15 ticks for 15 MOVs & 15 DEC,
25% faster, i think
Re: push & pop in 64bit mode
nicola wrote:
gerryg400's method using MOV & SUB SP seems to be 15+1 ticks for 15 MOVs & 1 SUB
and the method using direct PUSH seems to be 15+15 ticks for 15 MOVs & 15 DEC,
25% faster, i think

It's not as simple as that. In this case a stand-alone MOV and PUSH would take the same amount of time because the micro-ops that make up the PUSH can be done in parallel.
The advantage comes from the fact that each PUSH depends on the stack pointer value left by the previous one, so using MOV instead of PUSH reduces the dependency from one instruction to the next and allows the CPU to do parallel and out-of-order execution.
NOTE: I'm no expert. I hope an expert chimes in soon if I've got this wrong !