ASM question: preserving carry flag
Posted: Wed Mar 05, 2008 4:46 am
by bewing
Can anyone give me a single opcode asm command that sets the zero flag, and does not bork the carry flag? I know I can do:
mov al, 1
dec al
but I'm hoping for something just a little bit less ugly than that.
I was thinking that any bitwise operator would do it, but it turns out that pretty much all of them force carry clear.
Posted: Wed Mar 05, 2008 6:05 am
by Combuster
inc [some_hardware_reg_that_always_reads_ff]
or
inc [in_bochs_where_theres_no_memory]
but really, what do you want to achieve?
Posted: Wed Mar 05, 2008 6:10 am
by bewing
I have a function that I want to have return two testable flags. It's easy if the ZF gets set first -- if I need to also set/clear carry, then STC/CLC. It's harder the other way around -- if carry is already set properly, and I also want to force the zero flag on or off ....
Posted: Wed Mar 05, 2008 6:22 am
by Combuster
well then, how about changing the ZF and CF's semantics to the order in which you can most easily set them?
It's just that inc and dec are the only official opcodes that leave CF untouched but still modify ZF, and for them to work, you must get some value from somewhere. If it's about speed, weave the mov al into the code a bit earlier; if it's about size, then the mov/dec combo is pretty much the smallest you can get already (iirc it's 4 bytes, maybe 3).
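A minimal sketch of the "weave the mov in earlier" idea, in NASM syntax. The `add bx, cx` line is only a hypothetical stand-in for whatever instruction produces the carry in the real function. The point it illustrates is that MOV touches no flags, while DEC updates ZF and leaves CF alone.

```nasm
bits 16

    mov  al, 1      ; loaded early; MOV does not modify any flags (B0 01, 2 bytes)
    add  bx, cx     ; hypothetical carry-producing work: CF = unsigned overflow
    dec  al         ; AL = 0, so ZF = 1; DEC never writes CF (FE C8, 2 bytes)
    ; here ZF = 1 and CF still holds the result of the ADD
```

Note that DEC does rewrite OF, SF, AF and PF, so this only works if the caller tests nothing but ZF and CF.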
Posted: Thu Mar 06, 2008 10:05 pm
by Masterkiller
PUSHF
MOV bx, sp
OR byte [ss:bx], 0x40 ; ZF is bit 6 of FLAGS, in the low byte; [bx] defaults to DS, hence the SS override
POPF
Posted: Sat Mar 08, 2008 7:53 am
by Philip
how about
xor al,al
Posted: Mon Mar 10, 2008 10:30 am
by JAAman
xor clears CF
Posted: Mon Mar 10, 2008 11:33 am
by Combuster
I've been browsing the flag masks for instructions, and there are only three groups that leave CF while modifying ZF:
1) inc/dec, needs a load and a trashed register. However, these instructions aren't microcoded and will execute pretty fast.
2) cmpxchg8b/cmpxchg16b (plain cmpxchg rewrites CF the way cmp does) - needs even more preparation
3) the segmentation helper instructions. While they can do what you want in a single opcode, they are microcoded and thus slower.
- VERR and VERW set ZF if you supply a readable/writable segment selector. Doing VERR [copy_of_ds] should be enough. However, this call forces a GDT read, which costs you a few more memory cycles.
- LAR and LSL set ZF if a selector is valid, then copy the relevant part of that GDT entry to a register. Apart from the GDT access, they also trash a register.
- ARPL sets ZF if a selector's RPL field has to be adjusted (the destination's RPL is numerically lower than the source's and gets raised to match). If you want to set ZF, you'll have setup costs and a trashed register. However, if you want to clear ZF, arpl ax,ax does just that: it keeps all other flags in their original state and doesn't trash any registers. It is probably faster too, because it doesn't involve a GDT access.
I'd stick with the mov/dec combo.
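For reference, a rough sketch of the single-opcode variants from group 3, in NASM syntax. It assumes 32-bit protected mode with DS holding a valid, readable data selector. VERR and ARPL are not recognized in real mode, and ARPL does not exist in 64-bit mode. The labels and the `copy_of_ds` variable are hypothetical names used only for illustration.

```nasm
bits 32

section .data
copy_of_ds: dw 0            ; hypothetical variable holding a copy of the DS selector

section .text
init:
    mov  [copy_of_ds], ds   ; save the selector once (MOV r/m16, Sreg)
    ret

set_zf_keep_cf:
    verr word [copy_of_ds]  ; ZF = 1 if the segment is readable at the current privilege level
    ret                     ; only ZF is written, so CF is left as it was

clear_zf_keep_cf:
    arpl ax, ax             ; destination RPL is not below source RPL: ZF = 0, AX unchanged
    ret
```

If DS happened to hold a null or unreadable selector, VERR would clear ZF instead of setting it, so the saved selector has to be one that is known to be readable.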
Posted: Mon Mar 10, 2008 7:00 pm
by bewing
Wow, Combuster! Impressive! Thank you!
That info is really above and beyond the call of kindness.
What I actually decided to do is reverse the definition of the zero flag in the return. In 4G-1 cases out of 4G, an inc or dec will clear ZF. So that's much easier to set up than setting ZF, of course.
So, there was one case where I was intending an error return of carry (don't care) and zf clear. After changing things, that one return became carry clear and zf set. So: xor al, al ; pop registers ; ret.
All the other returns ended up as: preserve carry, clear zf. So: inc any old register with a value != -1 (which was easy) ; pop registers ; ret.
So that is sort of an implementation of your first & second suggestion a week ago.
BTW -- how do you know which opcodes are microcoded? Is that in the AMD manuals? I have only looked at the intel ones, and if it's in one of those, it's one I haven't seen.
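The reversed return convention described in this post could look roughly like the sketch below (NASM syntax, 32-bit). The routine `check`, its error condition and the `cmp` that produces the carry are all hypothetical; only the exit sequences (`inc` on a register known not to be -1, or `xor al, al`, followed by pops and `ret`) come from the post.

```nasm
bits 32

; Hypothetical routine, not from the thread.
; Returns: ZF = 1 -> error (CF = 0); ZF = 0 -> success, CF carries the result.
check:
    push ecx
    test eax, eax
    jz   .error         ; hypothetical error condition: EAX == 0
    cmp  eax, ebx       ; CF = 1 if EAX < EBX (unsigned); this is the flag to preserve
    mov  ecx, 0         ; MOV leaves flags alone; ECX is now known to be != -1
    inc  ecx            ; ECX = 1, so ZF = 0; INC does not write CF
    pop  ecx            ; POP and RET do not modify flags
    ret
.error:
    xor  al, al         ; ZF = 1, CF = 0 (AL was already 0 on this path)
    pop  ecx
    ret

caller:
    mov  eax, 5
    mov  ebx, 9
    call check
    jz   .failed        ; ZF set: error return
    jc   .below         ; success with carry set (taken here, since 5 < 9)
.not_below:
    ret
.below:
    ret
.failed:
    ret
```

Because the scratch register is pushed on entry and popped on exit, the `inc` costs nothing in terms of trashed registers.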
Posted: Mon Mar 10, 2008 8:33 pm
by B.E
bewing wrote:I have a function that I want to have return two testable flags. It's easy if the ZF gets set first -- if I need to also set/clear carry, then STC/CLC. It's harder the other way around -- if carry is already set properly, and I also want to force the zero flag on or off ....
What's wrong with this (if eax is in use, there are 3 other registers):
or eax, 1 ; to set cf
and eax, 0xFFFFFFFE ; to clear cf
xor eax, 1; to toggle cf
or eax, 2 ; to set zf
and eax, 0xFFFFFFFD ; to clear zf
xor eax, 2; to toggle zf
test eax, 1; if cf is set
test eax, 2; if zf is set
Posted: Tue Mar 11, 2008 1:13 am
by bewing
Nothing particularly wrong with it -- using bits in eax (or whatever register) instead of using EFLAGS bits as error return flags. It just uses up a register and adds a couple opcodes. A perfectly reasonable thing to do. I'm just silly about efficient code.
And at least a few OSes that I know of utilize the carry flag as an error indication flag return from system calls. It is a convention that I am overly familiar/comfortable with.
Posted: Tue Mar 11, 2008 4:36 am
by Combuster
bewing wrote:BTW -- how do you know which opcodes are microcoded? Is that in the AMD manuals? I have only looked at the intel ones, and if it's in one of those, it's one I haven't seen.
Non-standard instructions always are
I've located the information in both an AMD and an Intel manual. Neither uses the label "microcoded" though (hint: look for µop breakdown and decoder type/specification), and neither belongs to the set of holy books.
Posted: Tue Mar 11, 2008 6:24 pm
by Masterkiller
B.E wrote:or eax, 1 ; to set cf
and eax, 0xFFFFFFFE ; to clear cf
xor eax, 1; to toggle cf
What is the point of using 32-bit AND and OR when you only change the lowest two bits? This is over 2 times slower and four or five times larger than: