Anything I say about optimization is probably out of date, since the only processors I ever had to heavily optimize for were 8086-80386 real mode and protected mode for 386 under DOS.
I think some operations, mostly arithmetic, are faster when using EAX as the destination. At the very least, some of them have slightly shorter encodings when EAX is the destination.
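To make the size point concrete, here is a small sketch in C that just stores the machine-code bytes of two 32-bit adds and compares their lengths. The array and function names are made up for illustration. The bytes assume 32-bit mode and an immediate too large for the sign-extended 8-bit form (with a small immediate both registers get the same 3-byte encoding).

```c
#include <stddef.h>

/* add eax, 0x12345678
   Short accumulator form: opcode 05, then the 32-bit immediate. */
static const unsigned char add_eax_imm32[] = {
    0x05, 0x78, 0x56, 0x34, 0x12
};

/* add ebx, 0x12345678
   General form: opcode 81 /0, ModRM byte C3 (register EBX),
   then the 32-bit immediate. */
static const unsigned char add_ebx_imm32[] = {
    0x81, 0xC3, 0x78, 0x56, 0x34, 0x12
};

/* Hypothetical helper: how many bytes the EAX form saves. */
size_t encoding_size_difference(void)
{
    return sizeof add_ebx_imm32 - sizeof add_eax_imm32;
}
```

The saving is one byte per instruction, so on its own it only matters in tight loops or when code size is a concern.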
For newer processors, one important optimization is to order the instructions to maximize parallelization. So the following:
mov eax, [esp + 4]
shr eax, 5
mov ebx, [esp + 8]
and ebx, 0xF0F0F0F0
could be coded better as:
mov eax, [esp + 4]
mov ebx, [esp + 8]
shr eax, 5
and ebx, 0xF0F0F0F0
The goal is to let the processor pipeline as many instructions as possible. If an instruction uses EAX right after another instruction that writes it, the second one has to wait for the first to finish before it can execute. In the reordered version the two loads are independent of each other, so they can overlap, and each arithmetic instruction is one step further away from the load it depends on.
Of course, there are always the classics such as unrolling small loops, improving cache performance, and optimizing for branch prediction. For more information, there are the Intel and AMD optimization manuals, and lots of websites. Just google something like "your processor optimization".
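The same idea shows up above the assembly level. Here is a minimal sketch in C, not a benchmark: the function names are invented for illustration, and whether the second version is actually faster depends on the processor and on what the compiler already does by itself. The first function has a single dependency chain through `sum`. The second unrolls the loop by two and keeps two independent accumulators, so consecutive adds do not have to wait on each other.

```c
#include <stddef.h>
#include <stdint.h>

/* One accumulator: every add depends on the result of the previous one. */
uint32_t sum_single_chain(const uint32_t *data, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += data[i];
    return sum;
}

/* Loop unrolled by two with two accumulators: the add into sum0 and
   the add into sum1 are independent, so the CPU is free to overlap them. */
uint32_t sum_two_chains(const uint32_t *data, size_t n)
{
    uint32_t sum0 = 0, sum1 = 0;
    size_t i = 0;

    for (; i + 1 < n; i += 2) {
        sum0 += data[i];
        sum1 += data[i + 1];
    }
    if (i < n)              /* leftover element when n is odd */
        sum0 += data[i];

    return sum0 + sum1;     /* combine the two chains at the end */
}
```

Unsigned arithmetic is used so that both versions give the same result even on overflow. With floating point the reordering can change the rounding, which is one reason compilers will not always do this for you.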