
Question to the fellow ASM programmers

Posted: Sun Apr 30, 2006 11:00 pm
by ComputerPsi
Which do you think is faster (or smaller):

lodsb
mov cl,al

OR

mov cl,[si]
inc si

Or if anybody has a different way of doing it, please share. :) The byte being read is the length of a string that will be printed. (cld will come first with or without lodsb, so don't take it into account.) Also, does anybody know a good website with clock cycle information for different processors?

Re: Question to the fellow ASM programmers

Posted: Sun Apr 30, 2006 11:00 pm
by rexlunae
ComputerPsi wrote: Which do you think is faster (or smaller):

lodsb
mov cl,al

OR

mov cl,[si]
inc si
I don't care to speculate as to which is faster; the difference is probably irrelevant. But I can tell you that they are exactly the same size: in 16-bit code, both versions produce exactly 3 bytes. You can determine the size by assembling them to flat binaries.
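
For reference, here is a minimal sketch of where the 3 bytes come from, assuming NASM syntax and 16-bit code (the file name size.asm is made up for illustration). The bytes in the comments are the standard 8086 encodings. Another assembler may pick the alternative encoding 8A C8 for mov cl,al, which is also 2 bytes.

```nasm
bits 16

; Version 1: 1 + 2 = 3 bytes
lodsb           ; AC
mov cl, al      ; 88 C1

; Version 2: 2 + 1 = 3 bytes
mov cl, [si]    ; 8A 0C
inc si          ; 46
```

Assembling this with `nasm -f bin size.asm -o size.bin` should give a 6-byte file, 3 bytes per version.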

Re: Question to the fellow ASM programmers

Posted: Sun Apr 30, 2006 11:00 pm
by Da_Maestro

mov cl,[si] 
inc si
This is faster, because it doesn't have the overhead of incrementing SI, while the lodsb instruction in the other does. The difference is negligible, though.

Re: Question to the fellow ASM programmers

Posted: Mon May 01, 2006 11:00 pm
by JAAman
I would have to agree, but I don't think the increment would make it any slower (I would guess both take the same number of cycles).

However, the second one (mov cl,[si]; inc si) is non-dependent, which allows the CPU to reorder the instructions if it chooses to, executing the two instructions in reverse order. The first pair (lodsb; mov cl,al) is dependent: it cannot be reordered, because al cannot be moved into cl until previous instructions using al as a destination are complete.

Re: Question to the fellow ASM programmers

Posted: Mon May 01, 2006 11:00 pm
by Kicer
Once upon a time I read that all commands like lodsb, movsb, loop, etc. are slower than, for example,
(lodsb:)
mov al,[si]
inc si

or
(loop:)
dec ecx
jnz @B

But I can't remember where it was or why it is so. Since then I haven't used them.
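
One caveat on the substitutions above, sketched in NASM syntax for 16-bit code (the label name and the count of 5 are made up for illustration, and cx stands in for ecx). The replacement pairs are not exact drop-ins. lodsb and loop leave the flags alone, while inc and dec change every arithmetic flag except CF. lodsb also steps SI according to the direction flag, whereas inc si always counts up.

```nasm
bits 16

; lodsb replacement: same result only when DF = 0 (after cld)
    mov al, [si]    ; load the byte at DS:SI
    inc si          ; changes OF/SF/ZF/AF/PF; lodsb changes no flags

; loop replacement
    mov cx, 5       ; example count, chosen arbitrarily
next_iter:
    ; ... loop body goes here ...
    dec cx          ; changes OF/SF/ZF/AF/PF; loop changes no flags
    jnz next_iter   ; taken while CX != 0, like "loop next_iter"
```

So the swap is safe only if nothing after it relies on flags set earlier.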

Re: Question to the fellow ASM programmers

Posted: Tue May 02, 2006 11:00 pm
by Da_Maestro
On the newer CPUs, lodsb, loop, and whatnot are slower for the reasons that JAAman gave.

The increment operation can make it slower, because the CPU must increment that register. If it didn't have to increment that register it could be instead starting another instruction.

Re: Question to the fellow ASM programmers

Posted: Wed May 03, 2006 11:00 pm
by JAAman
There is an increment both ways. What I was saying is that with


lodsb
mov cl,al

the lodsb must be completed before the mov cl,al can be executed

but with

mov cl,[si]
inc si

the inc si can be executed before or at the same time as the mov cl,[si]
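
Spelled out as comments, in NASM syntax for 16-bit code. This is only a sketch of the register reads and writes. Whether a given CPU actually overlaps the second pair depends on its design.

```nasm
bits 16

; Pair 1: mov needs the AL that lodsb produces (read-after-write)
lodsb           ; reads SI and the byte at DS:SI; writes AL and SI
mov cl, al      ; reads AL, so it must wait for lodsb

; Pair 2: neither instruction needs the other's result
mov cl, [si]    ; reads SI and the byte at DS:SI; writes CL
inc si          ; reads and writes SI; needs nothing from the mov
```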


If they are executed in order, then I would think that both would take fairly similar execution time (the increment is present in both sets of instructions; the single instruction may be faster, but then you need an inc/dec).

But I firmly believe that assembly-level optimization usually doesn't make any sense at all (due to vast differences between CPUs and hardware-specific runtime optimization). It's very likely that, given these two code samples run repeatedly (even on the same CPU), sometimes #1 will be faster and sometimes #2 will be faster, depending on other (unpredictable) factors.

Re: Question to the fellow ASM programmers

Posted: Wed May 03, 2006 11:00 pm
by Da_Maestro
:)