Question to the fellow ASM programmers

ComputerPsi
Member
Posts: 83
Joined: Fri Oct 22, 2004 11:00 pm

Question to the fellow ASM programmers

Post by ComputerPsi »

Which do you think is faster (or smaller):

lodsb
mov cl,al

OR

mov cl,[si]
inc si

Or if anybody has a different way of doing it, please share. :) The byte being read is the length of a string that will be printed. (cld will come first with or without lodsb, so don't take that into account.) Also, does anybody know a good website with clock cycle information for different processors?
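For context, here is a minimal sketch of how the length byte might be used to print a length-prefixed string. NASM syntax, 16-bit real mode, and the BIOS teletype call (int 0x10 with AH = 0x0E) are all assumptions, since the post does not say how the string is printed; the label print_counted is made up for illustration.

```nasm
bits 16

; Hypothetical helper: print a length-prefixed string.
; In:       DS:SI -> one length byte followed by that many characters
; Clobbers: AX, BX, CX, SI
print_counted:
    cld                 ; make string instructions step SI forward
    xor ch, ch          ; the count goes in CX, so clear its high byte
    mov cl, [si]        ; read the length byte (second variant from the question)
    inc si              ; SI now points at the first character
    jcxz .done          ; empty string: nothing to print
    xor bx, bx          ; BH = display page 0 (assumed), BL = 0
.next:
    lodsb               ; AL = next character, SI = SI + 1
    mov ah, 0x0E        ; BIOS teletype output (assumed environment)
    int 0x10            ; print the character in AL
    loop .next          ; CX = CX - 1, repeat while CX != 0
.done:
    ret
```

The length byte is read once per string, so whichever variant is used for it, the per-character loop is likely to dominate the running time.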
rexlunae
Member
Posts: 134
Joined: Sun Oct 24, 2004 11:00 pm
Location: North Dakota, where the buffalo roam

Re: Question to the fellow ASM programmers

Post by rexlunae »

ComputerPsi wrote: Which do you think is faster (or smaller):

lodsb
mov cl,al

OR

mov cl,[si]
inc si
I don't care to speculate about which is faster; the difference is probably irrelevant. But I can tell you that they are exactly the same size: in 16-bit code, both versions assemble to exactly 3 bytes (lodsb is 1 byte and mov cl,al is 2; mov cl,[si] is 2 and inc si is 1). You can check the size yourself by assembling them to flat binaries.
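One way to do the flat-binary check, as a sketch assuming NASM and 16-bit code. The file name and labels are made up for illustration.

```nasm
; size.asm (hypothetical file name)
; Assemble with:  nasm -f bin size.asm -o size.bin
bits 16                 ; assumption: 16-bit code, as the use of SI suggests

version_a:
    lodsb               ; 1 byte
    mov cl, al          ; 2 bytes
version_a_end:

version_b:
    mov cl, [si]        ; 2 bytes
    inc si              ; 1 byte
version_b_end:

; Append the size of each variant so it can be read from the end of size.bin.
    db version_a_end - version_a
    db version_b_end - version_b
```

Under these assumptions size.bin should come out 8 bytes long, with the last two bytes both 03. Adding -l size.lst to the command line also produces a listing that shows the encoding of each instruction.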
Da_Maestro
Member
Posts: 144
Joined: Tue Oct 26, 2004 11:00 pm
Location: Australia

Re: Question to the fellow ASM programmers

Post by Da_Maestro »

mov cl,[si] 
inc si
This is faster, because it doesn't have the overhead of incrementing SI, while the lodsb instruction in the other version does. The difference is negligible, though.
Two things are infinite: The universe and human stupidity. But I'm not quite sure about the universe.
--- Albert Einstein
JAAman
Member
Posts: 879
Joined: Wed Oct 27, 2004 11:00 pm
Location: WA

Re: Question to the fellow ASM programmers

Post by JAAman »

I would have to agree, but I don't think the increment would make it any slower (I would guess both take the same number of cycles).

However, the second one (mov cl,[si]; inc si) has no dependency between its two instructions, allowing the CPU to reorder them if it chooses to, even executing them in reverse order. The first pair (lodsb; mov cl,al) is dependent and cannot be reordered: al cannot be moved into cl until the previous instructions that write al are complete.
Kicer
Posts: 13
Joined: Sat Apr 01, 2006 12:00 am

Re: Question to the fellow ASM programmers

Post by Kicer »

Once upon a time... I read that all commands like lodsb, movsb, loop, etc. are slower than, for example,
(lodsb:)
mov al,[si]
inc si

or
(loop:)
dec ecx
jnz @B

but I can't remember where I read it or why it is so. Since then I haven't used them.
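A sketch of the two styles side by side, in NASM syntax for 16-bit code (so cx instead of ecx, and NASM local labels instead of the MASM-style @B above). The routine names and the byte-summing task are made up for illustration. One difference that is certain: lodsb and loop leave the flags alone, while inc and dec modify the arithmetic flags (all except CF), so the replacements are not always drop-in.

```nasm
bits 16

; Both routines: In:  DS:SI -> buffer, CX = byte count (must be non-zero)
;                Out: DL = 8-bit sum of the bytes
;                Clobbers: AL, CX, SI

; Style 1: string instruction plus LOOP.
sum_with_string_ops:
    cld                 ; LODSB only steps SI forward when DF = 0
    xor dl, dl
.next:
    lodsb               ; AL = [DS:SI], SI = SI + 1; flags untouched
    add dl, al
    loop .next          ; CX = CX - 1, jump while CX != 0; flags untouched
    ret

; Style 2: the simple-instruction replacements from the post.
sum_with_simple_ops:
    xor dl, dl
.next:
    mov al, [si]        ; plain load, independent of the direction flag
    inc si              ; changes OF/SF/ZF/AF/PF, leaves CF alone
    add dl, al
    dec cx              ; sets ZF when CX reaches 0, leaves CF alone
    jnz .next
    ret
```

Which style is faster depends on the processor, so this only shows that the two are functionally interchangeable here, not which one to prefer.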
Da_Maestro
Member
Posts: 144
Joined: Tue Oct 26, 2004 11:00 pm
Location: Australia

Re: Question to the fellow ASM programmers

Post by Da_Maestro »

On the newer CPUs lodsb and loop and whatnot are slower for the reasons that JAAman was saying.

The increment operation can make it slower, because the CPU must increment that register. If it didn't have to increment that register, it could instead be starting another instruction.
JAAman
Member
Posts: 879
Joined: Wed Oct 27, 2004 11:00 pm
Location: WA

Re: Question to the fellow ASM programmers

Post by JAAman »

There is an increment both ways. What I was saying is that with


lodsb
mov cl,al

the lodsb must be completed before the mov cl,al can be executed

but with

mov cl,[si]
inc si

the inc si can be executed before or at the same time as the mov cl,[si]


If they are executed in order, then I would think that they would both take fairly similar execution time (the increment is present in both sets of instructions: the single instruction may be faster, but then you need an inc/dec).

But I firmly believe that assembly-level optimization usually doesn't make any sense at all, due to vast differences between CPUs and hardware-specific runtime optimization. It's very likely that, given these two code samples run repeatedly (even on the same CPU), sometimes #1 will be faster and sometimes #2 will be faster, depending on other (unpredictable) factors.
Da_Maestro
Member
Posts: 144
Joined: Tue Oct 26, 2004 11:00 pm
Location: Australia

Re: Question to the fellow ASM programmers

Post by Da_Maestro »

:)