ASM question: preserving carry flag
Can anyone give me a single opcode asm command that sets the zero flag, and does not bork the carry flag? I know I can do:
mov al, 1
dec al
but I'm hoping for something just a little bit less ugly than that.
I was thinking that any bitwise operator would do it, but it turns out that pretty much all of them force carry clear.
- Combuster
- Member
- Posts: 9301
- Joined: Wed Oct 18, 2006 3:45 am
- Libera.chat IRC: [com]buster
- Location: On the balcony, where I can actually keep 1½m distance
Well then, how about changing the semantics of ZF and CF to match the order in which you can most easily set them?
It's just that inc and dec are the only official opcodes that leave CF untouched while still updating ZF, and for them to work, you must get some value from somewhere. If it's about speed, weave the mov al into the code a bit earlier; if it's about size, then the mov/dec combo is pretty much the smallest you can get already (iirc it's 4 bytes, maybe 3).
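For reference, the mov/dec pair can be sketched as below in NASM syntax. The label name and the `bits 16` setting are assumptions for illustration; the byte encodings in the comments are those of the 8-bit forms.

```nasm
bits 16                 ; assumption: real-mode code; the 8-bit forms encode the same in 32-bit code

set_zf_keep_cf:         ; hypothetical label
    mov al, 1           ; B0 01: load a known value (MOV changes no flags)
    dec al              ; FE C8: result is 0, so ZF=1; DEC never writes CF
    ret                 ; RET changes no flags: caller sees ZF=1 and its original CF
```

The pair costs 4 bytes and trashes al. Note that dec also rewrites OF, SF, AF and PF, so only CF survives.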
- Masterkiller
- Member
- Posts: 153
- Joined: Sat May 05, 2007 6:20 pm
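PUSHF
MOV bx, sp ; bx -> saved FLAGS word (assumes SS = DS, since [bx] is addressed through DS)
OR byte [bx], 0x40 ; ZF is bit 6 of FLAGS, in the low byte; CF (bit 0) is left alone
POPF ; reload FLAGS with ZF set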
- Combuster
I've been browsing the flag masks for instructions, and there are only three groups that leave CF while modifying ZF:
1) inc/dec, needs a load and a trashed register. However, these instructions aren't microcoded and will execute pretty fast
2) cmpxchgxxx - needs even more preparation
3) the segmentation helper instructions. While they can do what you want in a single opcode, they are microcoded and thus slower.
- VERR and VERW set ZF if you supply a readable/writable segment selector. Doing VERR [copy_of_ds] should be enough. However, this call forces a GDT read, which costs you a few more memory cycles.
- LAR and LSL set ZF if a selector is valid, then copy the relevant part of that GDT entry to a register. Apart from the GDT access, they also trash a register.
- ARPL sets ZF if a selector's RPL field has to be adjusted. If you want to set ZF, you'll have setup costs and a trashed register. However, if you want to clear ZF, arpl ax,ax does just that, keeps all other flags in their original state, and doesn't trash any registers. It's probably faster because it doesn't involve a GDT access.
I'd stick with the mov/dec combo.
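A rough sketch of the two single-instruction variants from group 3, in NASM syntax. It assumes 32-bit protected mode (these instructions are not recognized in real mode, and ARPL does not exist in 64-bit mode) and that DS holds a valid, non-null selector. The labels and the `copy_of_ds` variable are hypothetical.

```nasm
bits 32                         ; assumption: 32-bit protected mode

set_zf_with_verr:               ; hypothetical label
    mov  word [copy_of_ds], ds  ; store the current data selector (MOV changes no flags)
    verr word [copy_of_ds]      ; ZF=1 if the segment is readable from the current CPL; CF untouched
    ret

clear_zf_with_arpl:             ; hypothetical label
    arpl ax, ax                 ; destination RPL is not below source RPL, so ZF=0; ax is unchanged
    ret

copy_of_ds: dw 0                ; hypothetical scratch word holding the selector
```

In practice the selector copy would be stored once at startup, so only the verr itself sits on the return path.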
Wow, Combuster! Impressive! Thank you!
That info is really above and beyond the call of kindness.
What I actually decided to do is reverse the definition of the zero flag in the return. In 4G-1 cases out of 4G, an inc or dec will clear ZF. So that's much easier to set up than setting ZF, of course.
So, there was one case where I was intending an error return of carry (don't care) and zf clear. After changing things, that one return became carry clear and zf set. So: xor al, al ; pop registers ; ret.
All the other returns ended up as: preserve carry, clear zf. So: inc any old register with a value != -1 (which was easy) ; pop registers ; ret.
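A minimal sketch of that return convention in NASM syntax. The routine name, the register choices, and the shr standing in for real work are all assumptions for illustration; only the xor and inc exits come from the description above.

```nasm
bits 32                  ; assumption: 32-bit code, matching the "4G" count above

two_flag_return:         ; hypothetical routine: input in eax
    push ebx
    test eax, eax
    jz   .special
    shr  eax, 1          ; stand-in for real work: CF = old bit 0 of eax
    mov  ebx, esp        ; a value assumed never to be 0xFFFFFFFF (MOV changes no flags)
    inc  ebx             ; result is nonzero, so ZF=0; INC leaves CF alone
    pop  ebx             ; restore the register; POP changes no flags
    ret
.special:
    xor  al, al          ; result is zero, so ZF=1; XOR always clears CF
    pop  ebx
    ret
```

A caller would then test ZF first (jz for the special case) and CF second (jc/jnc).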
So that is sort of an implementation of your first & second suggestion a week ago.
BTW -- how do you know which opcodes are microcoded? Is that in the AMD manuals? I have only looked at the intel ones, and if it's in one of those, it's one I haven't seen.
Last edited by bewing on Tue Mar 11, 2008 1:31 am, edited 1 time in total.
bewing wrote: I have a function that I want to have return two testable flags. It's easy if the ZF gets set first -- if I need to also set/clear carry, then STC/CLC. It's harder the other way around -- if carry is already set properly, and I also want to force the zero flag on or off ....
What's wrong with this (if eax is used, there are 3 other registers):
or eax, 1 ; to set cf
and eax, 0xFFFFFFFE ; to clear cf
xor eax, 1 ; to toggle cf
or eax, 2 ; to set zf
and eax, 0xFFFFFFFD ; to clear zf
xor eax, 2 ; to toggle zf
test eax, 1 ; if cf is set
test eax, 2 ; if zf is set
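One way a caller might consume those status bits, sketched in NASM syntax. The `STATUS_CF`/`STATUS_ZF` names and the `caller`/`worker` labels are hypothetical and not part of the suggestion itself.

```nasm
bits 32

STATUS_CF equ 1              ; hypothetical name for the "cf" bit in eax
STATUS_ZF equ 2              ; hypothetical name for the "zf" bit in eax

caller:
    call worker
    test eax, STATUS_CF      ; nonzero result means the "cf" bit is set
    jnz  .cf_set
    test eax, STATUS_ZF      ; nonzero result means the "zf" bit is set
    jnz  .zf_set
    ret                      ; neither bit set
.cf_set:
    ret                      ; placeholder for the "cf" handling
.zf_set:
    ret                      ; placeholder for the "zf" handling

worker:
    xor  eax, eax            ; start with both status bits clear
    or   eax, STATUS_ZF      ; report "zf" without touching the "cf" bit
    ret
```

Because the bits live in an ordinary register, either one can be set or cleared in any order, at the cost of that register and the extra test instructions.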
Nothing particularly wrong with it -- using bits in eax (or whatever register) instead of using EFLAGS bits as error return flags. It just uses up a register and adds a couple opcodes. A perfectly reasonable thing to do. I'm just silly about efficient code.
And at least a few OSes that I know of utilize the carry flag as an error indication flag return from system calls. It is a convention that I am overly familiar/comfortable with.
- Combuster
bewing wrote: BTW -- how do you know which opcodes are microcoded? Is that in the AMD manuals? I have only looked at the intel ones, and if it's in one of those, it's one I haven't seen.
Non-standard instructions always are.
I've located the information in both an AMD and an Intel manual. Neither uses the label "microcoded" though (hint: look for µop breakdown and decoder type/specification), and neither belongs to the set of holy books.
- Masterkiller
B.E wrote:
or eax, 1 ; to set cf
and eax, 0xFFFFFFFE ; to clear cf
xor eax, 1 ; to toggle cf
What is the point of using 32-bit AND and OR when you change only the lowest two bits? This is over 2 times slower and four or five times larger than:
or al, 1
and al, 0xFE