pusha and push eax, ecx, edx, ebx, etc.

Posted: Thu Feb 08, 2007 10:15 am
by digo_rp
Guys, just to save some clock cycles, I'm trying to find the cheapest way to save the registers.

I saw in some tutorials, HelpPC, etc.

that pusha takes 11 clock cycles
and push takes only one
push segreg takes 3 clock cycles

Now here is my question:

the only segregs that x86 has are cs, ds, es, fs and gs, right?
eax, ecx, edx, ebx, esp, ebp, esi and edi are not segregs, right?
My doubt is about esp, ebp, esi and edi. I'm almost sure they aren't segregs.

Guys, could you help me with that question, please?

Posted: Thu Feb 08, 2007 10:51 am
by JAAman
digo_rp wrote:that pusha takes 11 clock cycles
and push takes only one
push segreg takes 3 clock cycles
i wouldn't trust anything that tells you how many cycles an instruction takes, since that changes for each release of each model of each class of CPU (there are at least 6 different timings for chips marked as 'P4', probably more). plus, intel no longer publishes how many cycles each instruction takes, because it is no longer important: other things affect the timing more than the per-instruction cycle counts, making it virtually impossible to tell at compile time how many cycles something will take, even with a cycle timing chart
instead, get the intel optimization guide (should be on the page with the manuals) or the AMD equivalent
digo_rp wrote:the only segregs that x86 has are cs, ds, es, fs and gs, right?
eax, ecx, edx, ebx, esp, ebp, esi and edi are not segregs, right?
My doubt is about esp, ebp, esi and edi. I'm almost sure they aren't segregs.
look at the encodings in the intel manual, volume 2b, appendix B:

GPRs:
000 EAX/AX/AL
001 ECX/CX/CL
010 EDX/DX/DL
011 EBX/BX/BL
100 ESP/SP/AH
101 EBP/BP/CH
110 ESI/SI/DH
111 EDI/DI/BH

all these can be used as normal registers (reg)
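
As a rough illustration of where these 3-bit codes show up in machine code: the one-byte `push r32` opcodes are 0x50 plus the register code, so the table above is enough to decode them. This is only a minimal Python sketch; the helper name `decode_push` is made up for the example, and it assumes a 32-bit operand size.

```python
# 3-bit register codes from the table above (32-bit names only).
GPR32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_push(opcode):
    """Decode a one-byte PUSH r32 opcode (0x50..0x57).

    Hypothetical helper for illustration; assumes 32-bit operand size.
    """
    if not 0x50 <= opcode <= 0x57:
        raise ValueError("not a one-byte PUSH r32 opcode")
    # The low three bits of the opcode are the register code.
    return "push " + GPR32[opcode & 0b111]


if __name__ == "__main__":
    for byte in (0x50, 0x53, 0x57):
        print(hex(byte), "->", decode_push(byte))
```

Note that esp, ebp, esi and edi come out of the same table as eax..ebx, which is one way to see that they are general-purpose registers and not segment registers.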

segregs (better known as segment registers, or seg2/seg3) are used to hold segments:
seg2/seg3
00/000 - ES
01/001 - CS
10/010 - SS
11/011 - DS

AVAILABLE IN seg3 ONLY:
100 - FS
101 - GS
110 - RESERVED
111 - RESERVED
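
To show how the two tables are used, here is a small Python sketch that builds the opcode bytes for `push <segment register>`. It assumes the bit patterns given for PUSH in the same appendix (`000 seg2 110` for ES/CS/SS/DS, and `0F` followed by `10 seg3 000` for FS/GS); the function name `encode_push_seg` is invented for the example.

```python
# 2-bit (seg2) and 3-bit (seg3) segment register codes from the tables above.
SEG2 = {"es": 0b00, "cs": 0b01, "ss": 0b10, "ds": 0b11}
SEG3 = {"es": 0b000, "cs": 0b001, "ss": 0b010, "ds": 0b011,
        "fs": 0b100, "gs": 0b101}


def encode_push_seg(name):
    """Return the opcode bytes for PUSH <segment register>.

    Hypothetical helper for illustration only.
    """
    name = name.lower()
    if name in SEG2:
        # One-byte form, bit pattern: 000 seg2 110
        return bytes([(SEG2[name] << 3) | 0b110])
    if name in ("fs", "gs"):
        # Two-byte form: 0x0F, then bit pattern 10 seg3 000
        return bytes([0x0F, 0b10000000 | (SEG3[name] << 3)])
    raise ValueError("not a segment register: " + name)


if __name__ == "__main__":
    for seg in ("es", "cs", "ss", "ds", "fs", "gs"):
        print("push", seg, "->", encode_push_seg(seg).hex())
```

FS and GS need the longer form because a 2-bit field has no room for them, which is why they only appear in the seg3 table.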



the sad part is i wrote most of that from memory (i had to look up DS/SS)

Posted: Thu Feb 08, 2007 11:22 am
by XCHG
Just to add a few points to [JAAman]'s post, the PUSHA instruction does *not* push 32-bit registers. Instead, it pushes the 16-bit registers: Accumulator (AX), Base (BX), Count (CX), Data (DX), Stack Pointer (SP), Base Pointer (BP), Source Index (SI) and, last but not least, Destination Index (DI).

The 32-bit equivalent of the PUSHA instruction is the PUSHAD. They have POP versions also as in POPA and POPAD for 16-bit and 32-bit registers, respectively.

The combination of PUSHA and POPA instructions took 17 clock cycles to execute on my PIII 800 MHz machine while the code segment was aligned on a DWORD boundary. I checked it on a WORD boundary and it yielded the same result. The PUSHAD and POPAD versions took 2 clock cycles more on the same CPU, for 19 clock cycles in total.

A sequence of consecutive PUSH instructions followed by POP instructions, doing the same job as PUSHAD and POPAD manually, took 20 clock cycles on the same machine. I have not checked sequential PUSHes and POPs of the 16-bit registers, but I guess they would take more clock cycles to execute due to partial register access stalls.

Posted: Thu Feb 08, 2007 11:54 am
by digo_rp
Guys, I said pusha and popa, but nasm chooses the right one:
as I'm working in 32-bit pmode, nasm uses pushad and popad.

Is that right?

So, should I use pushad and popad,

and not push eax, ecx, blah blah blah? <- should that one take more clock cycles?

Posted: Thu Feb 08, 2007 12:47 pm
by JAAman
XCHG wrote:Just to add a few points to [JAAman]'s post, the PUSHA instruction does *not* push 32-bit registers. Instead, it pushes the 16-bit registers: Accumulator (AX), Base (BX), Count (CX), Data (DX), Stack Pointer (SP), Base Pointer (BP), Source Index (SI) and, last but not least, Destination Index (DI).

The 32-bit equivalent of the PUSHA instruction is the PUSHAD. They have POP versions also as in POPA and POPAD for 16-bit and 32-bit registers, respectively.
your post is slightly misleading:
there is no separate PUSHAD/POPAD instruction

there is only one pusha/popa instruction: if your current operand size is 16 bits, it will store/retrieve the 16-bit registers; if your current operand size is 32 bits, it will store/retrieve the 32-bit registers. both names are aliases for the same instruction:


0110.0000 this is pusha if your default size is 16bit, pushad if your default size is 32bit
0110.0001 this is popa if your default size is 16bit, popad if your default size is 32bit

0110.0110 this is the operand size override. many (not all) assemblers will automatically put this before your pusha/popa instruction if you use the wrong term for your current bits setting (if you use pushad under 'bits 16', or pusha under 'bits 32'). however, it is proper (just not common) to refer to both forms as pusha/popa, since they are the same instruction

you should probably change that to PUSHAD/POPAD instead, just to be safe
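
The prefix rule described in this post can be sketched in a few lines of Python. This is only a toy model of the "many (not all)" assemblers mentioned above, where pusha/popa name the 16-bit form and pushad/popad the 32-bit form; the function name `assemble` and the table layout are made up for the example, and a particular assembler may instead treat plain pusha as following the current bits setting.

```python
OPERAND_SIZE_PREFIX = 0x66  # 0110.0110

# mnemonic -> (opcode byte, operand size the name implies)
OPCODES = {
    "pusha": (0x60, 16),   # 0110.0000
    "pushad": (0x60, 32),
    "popa": (0x61, 16),    # 0110.0001
    "popad": (0x61, 32),
}


def assemble(mnemonic, bits):
    """Return the bytes a prefix-inserting assembler would emit.

    Hypothetical model: `bits` is the current default operand size (16 or 32).
    """
    if bits not in (16, 32):
        raise ValueError("bits must be 16 or 32")
    opcode, size = OPCODES[mnemonic.lower()]
    if size == bits:
        # The name matches the default operand size: no prefix needed.
        return bytes([opcode])
    # The name asks for the other size: prepend the operand size override.
    return bytes([OPERAND_SIZE_PREFIX, opcode])


if __name__ == "__main__":
    for bits in (16, 32):
        for name in ("pusha", "pushad", "popa", "popad"):
            print("bits", bits, name, "->", assemble(name, bits).hex())
```

In this model pusha under 16 bits and pushad under 32 bits both come out as the single byte 60, which is the point of the post: the two names are the same opcode, and only the prefix differs.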