OSDev.org

Posted: **Sun Apr 30, 2006 11:00 pm**

Which do you think is faster (or smaller):

lodsb
mov cl,al

OR

mov cl,[si]
inc si

Or if anybody has a different way of doing it please share.

The byte being read is for the length of a string that will be printed. (cld will be in the front with or without lodsb, so don't take that into account). Also, is there a good website anybody knows that has clock cycle information for different processors?
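For context, here is a minimal sketch of the kind of routine the length byte would feed. It is not from the original post: the label `print_counted` is hypothetical, and it assumes 16-bit real mode, NASM syntax, a string at DS:SI, and the BIOS teletype service (int 0x10, AH = 0x0E) for output.

```nasm
bits 16

; Hypothetical routine: print a length-prefixed string at DS:SI.
; Assumes real mode with BIOS video services available.
print_counted:
    cld                 ; string instructions step forward
    lodsb               ; AL = length byte, SI now points at the first character
    mov cl, al
    xor ch, ch          ; CX = length (0..255)
    jcxz .done          ; empty string: nothing to print
.next:
    lodsb               ; AL = next character
    mov ah, 0x0E        ; BIOS teletype output
    xor bh, bh          ; display page 0
    int 0x10
    dec cx
    jnz .next           ; repeat until all characters are printed
.done:
    ret
```

The first two instructions after `cld` are the pair being asked about; swapping in `mov cl,[si]` / `inc si` would leave the rest of the routine unchanged.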

Posted: **Sun Apr 30, 2006 11:00 pm**

ComputerPsi wrote: Which do you think is faster (or smaller):

lodsb
mov cl,al

OR

mov cl,[si]
inc si

Don't care to speculate as to which is faster, the difference is probably irrelevant, but I can tell you that they are exactly the same size. Both versions produce exactly 3 bytes of code. You can determine the size by assembling them to flat binaries.
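A sketch of how the size check could be done, assuming NASM and 16-bit code (the labels and the file name `sizes.asm` are made up for illustration). The encodings in the comments are for 16-bit mode; in 32-bit code the second version would need address-size and operand-size prefixes and would no longer be 3 bytes.

```nasm
; Assemble with: nasm -f bin sizes.asm -o sizes.bin
bits 16

version_a:
    lodsb               ; AC
    mov cl, al          ; 88 C1
version_a_len equ $ - version_a     ; 1 + 2 bytes

version_b:
    mov cl, [si]        ; 8A 0C
    inc si              ; 46
version_b_len equ $ - version_b     ; 2 + 1 bytes

    ; Emit both lengths as the last two bytes of the flat binary
    db version_a_len, version_b_len
```

Dumping `sizes.bin` in a hex viewer shows the six instruction bytes followed by the two length bytes.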

Posted: **Sun Apr 30, 2006 11:00 pm**

mov cl,[si] 
inc si

This is faster, because it doesn't have the overhead of incrementing SI, while the lodsb instruction in the other does. Negligible difference, though.

Posted: **Mon May 01, 2006 11:00 pm**

I would have to agree, but I don't think the increment would make it any slower (both take the same number of cycles, I would guess).

However, the second one (mov cl,[si]; inc si) has no data dependency between its two instructions, which allows the CPU to reorder them if it chooses to (even executing the 2 instructions in reverse order), while the first pair (lodsb; mov cl,al) is dependent: it cannot be reordered, because al cannot be moved into cl until previous instructions that use al as a destination are complete.
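The dependency argument can be annotated directly on the two sequences. This is only an illustration in NASM syntax; whether reordering actually happens depends on the CPU, and the note about register renaming applies to out-of-order processors in general rather than to any specific model.

```nasm
bits 16

; Pair 1: read-after-write dependency through AL
    lodsb               ; reads [SI], writes AL, updates SI
    mov cl, al          ; reads AL, so it must wait for the load above

; Pair 2: no value flows from the first instruction to the second
    mov cl, [si]        ; reads SI, writes CL
    inc si              ; reads and writes SI; the only hazard is
                        ; write-after-read on SI, which register
                        ; renaming can remove
```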

Posted: **Mon May 01, 2006 11:00 pm**

Once upon a time I read that all commands like lodsb, movsb, loop, etc. are slower than, for example,
(lodsb:)
mov al,[si]
inc si

or
(loop:)
dec ecx
jnz @B

but I can't remember where it was or why it is so. Since then I haven't used them.

Posted: **Tue May 02, 2006 11:00 pm**

On the newer CPUs lodsb and loop and whatnot are slower for the reasons that JAAman was saying.

The increment operation can make it slower, because the CPU must increment that register. If it didn't have to increment that register it could be instead starting another instruction.

Posted: **Wed May 03, 2006 11:00 pm**

There is an increment both ways. What I was saying is that with

lodsb
mov cl,al

the lodsb must be completed before the mov cl,al can be executed

but with

mov cl,[si]
inc si

the inc si can be executed before or at the same time as the mov cl,[si]

If they are executed in order, then I would think that both would take fairly similar execution time (the increment is present in both sets of instructions: the single instruction may be faster, but then you need an inc/dec).

But I firmly believe that assembly-level optimization usually doesn't make any sense at all (due to vast differences between CPUs and hardware-specific runtime optimization). It's very likely that, given these 2 code samples run repeatedly (even on the same CPU), sometimes #1 will be faster and sometimes #2 will be faster, depending on other (unpredictable) factors.

Posted: **Wed May 03, 2006 11:00 pm**

Question to the fellow ASM programmers
