Which do you think is faster (or smaller):
lodsb
mov cl,al
OR
mov cl,[si]
inc si
Or if anybody has a different way of doing it please share. The byte being read is for the length of a string that will be printed. (cld will be in the front with or without lodsb, so don't take that into account). Also, is there a good website anybody knows that has clock cycle information for different processors?
Question to the fellow ASM programmers
Re: Question to the fellow ASM programmers
I don't care to speculate as to which is faster; the difference is probably irrelevant. I can tell you that they are exactly the same size, though: in 16-bit code, both versions assemble to exactly 3 bytes. You can check the size by assembling them to flat binaries.
ComputerPsi wrote: Which do you think is faster (or smaller):
lodsb
mov cl,al
OR
mov cl,[si]
inc si
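The flat-binary size check mentioned above could look something like the sketch below. It assumes NASM as the assembler; the file name and the labels (`v1_start`, `v2_start`, and so on) are made up for illustration.

```nasm
; sizes.asm (hypothetical file name)
; Assemble with: nasm -f bin sizes.asm -o sizes.bin
bits 16

v1_start:
    lodsb               ; AL = [DS:SI], SI adjusted according to DF
    mov  cl, al
v1_end:

v2_start:
    mov  cl, [si]       ; CL = [DS:SI]
    inc  si
v2_end:

; Append the length of each sequence as a data byte so the sizes
; can be read straight out of a hex dump of sizes.bin.
    db   v1_end - v1_start
    db   v2_end - v2_start
```

The last two bytes of the output are the two lengths, so if the 3-byte claim holds they should both read 03. The count only applies to 16-bit code: with `bits 32`, NASM has to add address-size and operand-size prefixes to the second pair, which makes it longer.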
Re: Question to the fellow ASM programmers
mov cl,[si]
inc si
Two things are infinite: The universe and human stupidity. But I'm not quite sure about the universe.
--- Albert Einstein
Re: Question to the fellow ASM programmers
I would have to agree, but I don't think the increment would make it any slower (I would guess both take the same number of cycles).
However, the second pair (mov cl,[si]; inc si) has no data dependency between its two instructions, which allows the CPU to reorder them if it chooses to, even executing them in reverse order. The first pair (lodsb; mov cl,al) is dependent and cannot be reordered: al cannot be moved into cl until previous instructions that write al are complete.
Re: Question to the fellow ASM programmers
Once upon a time I read that instructions like lodsb, movsb, loop, etc. are slower than, for example,
(lodsb:)
mov al,[si]
inc si
or
(loop:)
dec ecx
jnz @B
But I can't remember where I read it or why it is so. I haven't used them since.
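To put the two replacements in context, here is a minimal sketch of the original poster's use case: printing a string whose first byte is its length. It is written in NASM syntax, and both the routine name `print_pstr` and the BIOS teletype call (int 10h, AH=0Eh) are assumptions for illustration; the thread does not say how the string gets printed.

```nasm
bits 16

; print_pstr (hypothetical name): print a length-prefixed string.
; In:       DS:SI -> length byte followed by that many characters
; Clobbers: AX, BX, CX, SI, flags
print_pstr:
    xor  ch, ch
    mov  cl, [si]       ; CX = length byte (zero-extended)
    inc  si             ; SI -> first character
    jcxz .done          ; nothing to print for an empty string
.next:
    mov  al, [si]       ; stands in for lodsb
    inc  si
    mov  ah, 0x0E       ; BIOS teletype output (assumed output method)
    xor  bx, bx         ; display page 0
    int  0x10
    dec  cx             ; stands in for loop .next
    jnz  .next
.done:
    ret
```

The replacements are not exact drop-ins. lodsb and loop leave the flags untouched, while inc and dec change them. lodsb also follows the direction flag, whereas the mov/inc form always moves forward and needs no cld.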
Re: Question to the fellow ASM programmers
On newer CPUs, lodsb, loop and the like are slower for the reasons JAAman gave.
The increment operation can make it slower, because the CPU must increment that register. If it didn't have to increment that register, it could instead be starting another instruction.
Re: Question to the fellow ASM programmers
There is an increment either way. What I was saying is that with
lodsb
mov cl,al
the lodsb must complete before the mov cl,al can be executed,
but with
mov cl,[si]
inc si
the inc si can be executed before or at the same time as the mov cl,[si].
If they are executed in order, then I would think both would take fairly similar execution time (the increment is present in both sets of instructions: the single instruction may be faster, but then you need an inc/dec).
But I firmly believe that assembly-level optimization usually doesn't make any sense at all, due to vast differences between CPUs and hardware-specific runtime optimization. It's very likely that, given these two code samples run repeatedly (even on the same CPU), sometimes #1 will be faster and sometimes #2 will be faster, depending on other (unpredictable) factors.
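The dependency argument above can be spelled out as comments on the two pairs. This is only an annotated sketch in NASM syntax. Whether a given CPU actually overlaps the second pair depends on the microarchitecture, which the thread does not pin down.

```nasm
bits 16

; Pair 1: a true (read-after-write) dependency through AL.
    lodsb               ; writes AL and SI
    mov  cl, al         ; reads AL, so it needs lodsb's result first

; Pair 2: no result flows from the first instruction into the second.
    mov  cl, [si]       ; reads SI, writes CL
    inc  si             ; writes SI, which the mov only reads
```

The second pair still has an ordering constraint: inc si must not overwrite SI before the mov has read the old value (a write-after-read hazard). Out-of-order cores typically remove that constraint with register renaming, which is what lets the two instructions overlap.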