Is xor ax,ax and mov ax,0 both setting the ax register to 0?
- thewebparadox
- Posts: 4
- Joined: Sun Jun 10, 2018 4:27 pm
- Libera.chat IRC: the web paradox
Is xor ax,ax and mov ax,0 both setting the ax register to 0?
I'm asking because I've never seen xor before. I know that mov ex1,ex2 moves ex2 into ex1, but I noticed some code where somebody wrote xor ax,ax and said in the comment "; setting the ax register to 0".
finesseOS
Re: Is xor ax,ax and mov ax,0 both setting the ax register t
This is an assembly question more than an os development question, but okay.
I would look up the instruction in the Intel manual, or search for "xor instruction". Almost any page will tell you exactly what this instruction does.
Code:
    10100110
xor 00111011
------------
    10011101
The difference is that the "mov" instruction will not affect the flags register, whereas the "xor" instruction will.
Ben
- http://www.fysnet.net/osdesign_book_series.htm
- Octocontrabass
- Member
- Posts: 5586
- Joined: Mon Mar 25, 2013 7:01 pm
Re: Is xor ax,ax and mov ax,0 both setting the ax register t
There are also differences in size: XOR and SUB are one byte smaller than MOV for setting a 16-bit register to 0 (and three bytes smaller for a 32-bit register). This makes it a popular optimization for bootloaders, where space can be very expensive.
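To make the size difference concrete, here is a small C sketch that simply stores the machine-code bytes of each instruction and compares their lengths. The array and function names are made up for illustration. The encodings are the ones assemblers commonly emit (some assemblers choose opcode 33 instead of 31 for XOR, which is the same length), and the 32-bit figures assume 32-bit code, where no operand-size prefix is needed.

```c
#include <stddef.h>
#include <stdint.h>

/* 16-bit code (real mode, e.g. a boot sector) */
const uint8_t mov_ax_0[]  = { 0xB8, 0x00, 0x00 };  /* mov ax, 0  : opcode + imm16 */
const uint8_t xor_ax_ax[] = { 0x31, 0xC0 };        /* xor ax, ax : opcode + ModRM */
const uint8_t sub_ax_ax[] = { 0x29, 0xC0 };        /* sub ax, ax : opcode + ModRM */

/* 32-bit code */
const uint8_t mov_eax_0[]   = { 0xB8, 0x00, 0x00, 0x00, 0x00 };  /* mov eax, 0 : opcode + imm32 */
const uint8_t xor_eax_eax[] = { 0x31, 0xC0 };                    /* xor eax, eax */

/* Hypothetical helpers: bytes saved by using XOR instead of MOV. */
size_t bytes_saved_16(void) { return sizeof mov_ax_0  - sizeof xor_ax_ax; }
size_t bytes_saved_32(void) { return sizeof mov_eax_0 - sizeof xor_eax_eax; }
```

The immediate operand is what makes MOV longer: it has to carry the zero as data, while XOR and SUB only need a ModRM byte naming the register twice.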
- Schol-R-LEA
- Member
- Posts: 1925
- Joined: Fri Oct 27, 2006 9:42 am
- Location: Athens, GA, USA
Re: Is xor ax,ax and mov ax,0 both setting the ax register t
To expand on Ben's answer a bit (though, as Ben said, it really isn't an OS question so much as a general assembly programming question): the XOR instruction performs a bitwise logical operation known as 'exclusive or', in which the result is true if and only if exactly one of the two logical values it is applied to (in this case, stored as individual bits, with false == 0, true == 1) is true.
XOR is one of several bitwise logical operations, the other common ones being AND, OR, and NOT. The general Boolean truth tables for the common logical operations are
Unary Not:
Code:
 F | T
-------
 T | F
And:
Code:
   | F | T
-----------
 F | F | F
-----------
 T | F | T
Regular (Inclusive) Or:
Code:
   | F | T
-----------
 F | F | T
-----------
 T | T | T
Exclusive Or:
Code:
   | F | T
-----------
 F | F | T
-----------
 T | T | F
Since these instructions operate on the bits in a data word, they perform these operations on the individual bits, paired up between the two pieces of data. So, if you have one byte x that holds the binary value 1100 1001 (splitting the nybbles just to make it easier to read) and another byte y that has the binary value 0001 1010, then the results would be
Code:
NOT x
    1100 1001
-------------
    0011 0110

AND x, y
    1100 1001
    0001 1010
-------------
    0000 1000

OR x, y
    1100 1001
    0001 1010
-------------
    1101 1011

XOR x, y
    1100 1001
    0001 1010
-------------
    1101 0011
Now, here's the trick being used: if you XOR any value against itself, all the set bits cancel, clearing the entire datum. In other words,
Code:
XOR n, n == 0, for all n
Note that these bitwise operators are not specific to assembly language; the same AND, OR, XOR, and NOT operations are performed by the `&` (ampersand), `|` (vertical bar, or pipe), `^` (caret) and `~` (tilde) operators in C and related languages:
Code:
#include <stdint.h>   // for uint8_t

int main(void)
{
    uint8_t a, b, c, d, n, m, x, y;
    x = 0xC9;    // == binary 11001001 == decimal 201
    y = 0x1A;    // == binary 00011010 == decimal 26
    a = ~x;      // == binary 00110110 == dec 54 == hex 36 (truncated back to 8 bits)
    b = x & y;   // == binary 00001000 == dec 8 == hex 08
    c = x | y;   // == binary 11011011 == dec 219 == hex DB
    d = x ^ y;   // == binary 11010011 == dec 211 == hex D3
    n = x ^ x;   // == binary 00000000 == dec 0 == hex 00
    m = y ^ y;   // == binary 00000000 == dec 0 == hex 00
    return 0;
}
I hope this helps, because to be honest, this is something you really ought to have down solid before jumping into OS-Dev.
As has already been said, the main reason this is sometimes used is because, on the x86 instruction set (and several others), the XOR operation (also sometimes called EOR or something similar on different instruction set architectures) is encoded in fewer bytes than the MOV operation, and can also be faster in some implementations, too (for example, some of the early 8088 models). Also, as Ben mentioned, it clears the Overflow and Carry flags in the FLAGS register, which MOV doesn't, and that's sometimes useful to do when clearing a register.
This isn't universal, however; for example, both the size and the number of cycles used by the equivalent instructions are the same either way on ARM CPUs, and in the MIPS CPU, the MOVE <regX>, <regY> pseudo-instruction is just an alias for the OR <regX>, $zero, <regY> instruction (the register $0 - also called $zero - is permanently set to zero), while CLEAR <regX> is just OR <regX>, $zero, $zero. The Flags/Status register issues are different, too; the CPSR (Current Processor Status Register) in the ARM design behaves differently from the x86 FLAGS register, while MIPS doesn't have a status register, period (at least not one used for this purpose).
Last edited by Schol-R-LEA on Sat Jun 30, 2018 10:13 am, edited 8 times in total.
Rev. First Speaker Schol-R-LEA;2 LCF ELF JAM POEE KoR KCO PPWMTF
Ordo OS Project
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.
Re: Is xor ax,ax and mov ax,0 both setting the ax register t
Hi,
BenLunt wrote: The difference is that the "mov" instruction will not affect the flags register where the "xor" instruction will.
Octocontrabass wrote: There are also differences in size: XOR and SUB are one byte smaller than MOV for setting a 16-bit register to 0 (and three bytes smaller for a 32-bit register). This makes it a popular optimization for bootloaders, where space can be very expensive.
The other difference is whether or not the CPU (mistakenly) thinks that the instruction depends on the previous value of EAX.
For example, if you do an instruction like "div" (which is slow and updates EAX) followed by "mov eax,0" then all out-of-order 80x86 CPUs will know that the second instruction doesn't have to wait for the slow division to finish; and if you do "div" and then "xor eax,eax" some (older) CPUs will wait for the "div" to finish (and some newer CPUs won't).
Of course even on older CPUs (where there is a false dependency) "xor eax,eax" might still be faster in some cases (if the bottleneck is instruction fetch, and if EAX hasn't been changed for ages anyway).
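The false dependency can be illustrated with a deliberately simplified C model. This is only a sketch of the scheduling idea, not how any real CPU is implemented: an instruction may start only once the registers it is believed to read are ready. The names `toy_cpu`, `start_cycle`, and `clear_after_div`, and the latency passed in, are assumptions chosen for illustration.

```c
#include <stdbool.h>

/* Toy model: the cycle at which the pending write to EAX completes. */
typedef struct {
    int eax_ready;
} toy_cpu;

/* Cycle at which an instruction issued at `now` can start executing.
   reads_eax: whether the CPU believes the instruction needs the old EAX. */
int start_cycle(const toy_cpu *cpu, int now, bool reads_eax)
{
    if (reads_eax && cpu->eax_ready > now)
        return cpu->eax_ready;   /* must wait for the earlier producer (e.g. DIV) */
    return now;                  /* independent: can start right away */
}

/* Scenario from the post: a DIV issued at cycle 0 writes EAX after
   div_latency cycles; the register-clearing instruction is issued at cycle 1. */
int clear_after_div(int div_latency, bool reads_eax)
{
    toy_cpu cpu = { .eax_ready = div_latency };
    return start_cycle(&cpu, 1, reads_eax);
}
```

In this model "mov eax,0" always corresponds to reads_eax == false. "xor eax,eax" corresponds to reads_eax == true on a CPU that does not treat it specially, and to false on one that does.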
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Re: Is xor ax,ax and mov ax,0 both setting the ax register t
Perhaps this is not very relevant, but I used an "optimization" in my boot code when I wanted to set a register to zero without changing the flags. I was able to assume that a segment register was always zero, so a "mov ?x, cs" instruction could be used. The instruction takes two bytes, like the xor instruction. This was mostly for return code paths, for example
Code:
        ...
        jc      .Err1
        ...
        jc      .Err2
        ...
.Err1:  mov     ax, cs          ; ax = 0 when cs == 0 (only two bytes)
        ret                     ; return cf = 1
.Err2:  mov     ax, 0x0001      ; (something else)
        ret                     ; return cf = 1
Please note that this is not optimized for speed.
- Octocontrabass
- Member
- Posts: 5586
- Joined: Mon Mar 25, 2013 7:01 pm
Re: Is xor ax,ax and mov ax,0 both setting the ax register t
Brendan wrote: Of course even on older CPUs (where there is a false dependency) "xor eax,eax" might still be faster in some cases (if the bottleneck is instruction fetch, and if EAX hasn't been changed for ages anyway).
It can also be faster where the CPU mistakenly thinks future instructions using AX, AH, or AL are dependent on the result of "mov eax,0" but not "xor eax,eax". In some cases, it's fastest to use "mov eax,0" followed immediately by "xor eax,eax" to prevent both past and future false dependencies.
Optimizations like this are too dependent on individual CPUs to make any general statements, so it's best to forget about speed unless you're optimizing for a specific CPU.
Antti wrote: Perhaps this is not very relevant but I used an "optimization" in my boot code if I wanted to set a register to zero without changing the flags. I was able to assume that a segment register was always zero so a "mov ?x, cs" instruction could be used.
Keep in mind that CS may not be 0 if you haven't used a far jump to explicitly set CS.
- Schol-R-LEA
- Member
- Posts: 1925
- Joined: Fri Oct 27, 2006 9:42 am
- Location: Athens, GA, USA
Re: Is xor ax,ax and mov ax,0 both setting the ax register t
Brendan wrote: Of course even on older CPUs (where there is a false dependency) "xor eax,eax" might still be faster in some cases (if the bottleneck is instruction fetch, and if EAX hasn't been changed for ages anyway).
Octocontrabass wrote: It can also be faster where the CPU mistakenly thinks future instructions using AX, AH, or AL are dependent on the result of "mov eax,0" but not "xor eax,eax". In some cases, it's fastest to use "mov eax,0" followed immediately by "xor eax,eax" to prevent both past and future false dependencies.
That's a situation I hadn't heard of before. Can anyone else confirm, and more importantly, does anyone know which processor models might behave this way, and under what circumstances? I don't believe you are lying - such are the idiosyncrasies of model-specific optimization - but 'extraordinary claims demand extraordinary proof' and all that.
If so, it is rather amusing to me that there would be situations where the best speed performance would come from duplicating instructions with identical results, reached in different ways. That's definitely not something you'd normally expect.
It would be somewhat interesting to know just when and where that is likely to happen, and what the downsides of optimizing for that case would be (aside from the obvious use of two instructions, I mean). I doubt it would be a particularly relevant matter anyway, as I can't see a case where clearing a register would be a critical-path bottleneck, but you never know.
Re: Is xor ax,ax and mov ax,0 both setting the ax register t
This SO post may be of interest. I haven't checked the Intel Pentium manuals yet, but the answer there claims that "xor reg, reg" has been a recognized "zeroing idiom" since the Pentium (correction: Pentium Pro) days.
- Schol-R-LEA
- Member
- Posts: 1925
- Joined: Fri Oct 27, 2006 9:42 am
- Location: Athens, GA, USA
Re: Is xor ax,ax and mov ax,0 both setting the ax register t
simeonz wrote: This SO post may be of interest. I haven't checked the Intel Pentium manuals yet, but the answer there claims that "xor reg, reg" has been a recognized "zeroing idiom" since the Pentium (correction: Pentium Pro) days.
I don't know about 'recognized', but it is a lot older than that. I recall reading about it in an old 8088 assembly book (Peter Norton's, I think, or maybe Lafore's, both of which I tried and failed to get through back in the 1980s; I had somewhat better luck with The IBM Personal Computer from the Inside Out, but still didn't really get a good grasp of assembly until much later).
For that matter, I am pretty sure it got used in early PDP-11 Unix (both in assembly and in C), but I don't have a copy of Lions' Commentary on hand to check. EDIT: Hey, there's an authorized PDF version of both the commentary and the source code, nice! I will look to see if I was right. Further update: Nope, it isn't used there. Turns out that the PDP-11 had a dedicated Clear Register instruction, and XOR-clearing doesn't seem to have been used in the C code, either.
Last edited by Schol-R-LEA on Thu Jun 14, 2018 12:39 pm, edited 1 time in total.
- Octocontrabass
- Member
- Posts: 5586
- Joined: Mon Mar 25, 2013 7:01 pm
Re: Is xor ax,ax and mov ax,0 both setting the ax register t
Octocontrabass wrote: In some cases, it's fastest to use "mov eax,0" followed immediately by "xor eax,eax" to prevent both past and future false dependencies.
Schol-R-LEA wrote: That's a situation I hadn't heard of before. Can anyone else confirm, and more importantly, does anyone know which processor models might behave this way, and under what circumstances? I don't believe you are lying - such are the idiosyncrasies of model-specific optimization - but 'extraordinary claims demand extraordinary proof' and all that.
That example is straight out of Agner Fog's microarchitecture optimization guide. It's in the section that describes the Pentium Pro, Pentium II, and Pentium III. The example scenario is optimizing for a balance of reasonably high speed on a variety of older CPUs without using MOVZX, which is quite slow on some CPUs.
Re: Is xor ax,ax and mov ax,0 both setting the ax register t
simeonz wrote: This SO post may be of interest. I haven't checked the Intel Pentium manuals yet, but the answer there claims that "xor reg, reg" has been a recognized "zeroing idiom" since the Pentium (correction: Pentium Pro) days.
Schol-R-LEA wrote: I don't know about 'recognized', but it is a lot older than that. I recall reading about it in an old 8088 assembly book (Peter Norton's, I think, or maybe Lafore's, both of which I tried and failed to get through back in the 1980s; I had somewhat better luck with The IBM Personal Computer from the Inside Out, but still didn't really get a good grasp of assembly until much later).
I think it's "recognized" as in the instruction decoder handles it in a special way for the sake of optimization (i.e. the decoder recognizes it as a special case). That means that e.g. XOR EDX, EDX would immediately get rid of previous dependencies, but e.g. XOR EBX, ECX wouldn't, despite both being based on the same opcode.
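As a rough illustration of what "recognizing" the special case could mean, here is a C sketch of a check on the two instruction bytes. It is only a model of the idea, not a description of real decoder hardware; the function name is hypothetical, and prefixes and the 8-bit XOR opcodes are ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if opcode + ModRM encode "XOR reg, reg" with the same
   register on both sides, i.e. the zeroing idiom. */
bool is_xor_zeroing_idiom(uint8_t opcode, uint8_t modrm)
{
    /* 0x31 = XOR r/m, r ; 0x33 = XOR r, r/m (16/32-bit operand forms) */
    if (opcode != 0x31 && opcode != 0x33)
        return false;

    uint8_t mod = modrm >> 6;         /* 3 means both operands are registers */
    uint8_t reg = (modrm >> 3) & 7;   /* register field */
    uint8_t rm  = modrm & 7;          /* register/memory field */

    return mod == 3 && reg == rm;
}
```

With these encodings, XOR EDX, EDX is 31 D2 and passes the check, while XOR EBX, ECX is 31 CB and does not, even though the opcode byte is identical.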
Re: Is xor ax,ax and mov ax,0 both setting the ax register t
Sik wrote: I think it's "recognized" as in the instruction decoder handles it in a special way for the sake of optimization (i.e. the decoder recognizes it as a special case). That means that e.g. XOR EDX, EDX would immediately get rid of previous dependencies, but e.g. XOR EBX, ECX wouldn't, despite both being based on the same opcode.
Yes, but it sounded like I was referring to the popularity of the assembly programming technique. The SO post is much more detailed about the benefits of the idiom from the hardware standpoint, in some places so much so that it is hard to digest.
In the end however, as per the Pentium manual:
Zero-Extension of Short Integers
The MOVZX instruction has a prefix and takes 3 cycles to execute (a total of 4 cycles).
As with the Intel486 CPU, it is recommended to use the following sequence instead:
xor eax, eax
mov al, mem
If this occurs within a loop, it may be possible to pull the XOR out of the loop if the only
assignment to EAX is the MOV AL, MEM. This has greater importance for the Pentium
processor due to its concurrency of instruction execution.
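As a rough C model of why the two-instruction sequence quoted above gives the same result as MOVZX: clearing all 32 bits first and then writing only the low byte leaves the upper 24 bits zero. The function names and the stale starting value are made up for illustration.

```c
#include <stdint.h>

/* Models "xor eax, eax" followed by "mov al, mem". */
uint32_t zero_extend_via_xor(uint8_t mem)
{
    uint32_t eax = 0xDEADBEEFu;        /* arbitrary stale register contents */
    eax ^= eax;                        /* xor eax, eax : all 32 bits cleared */
    eax = (eax & 0xFFFFFF00u) | mem;   /* mov al, mem  : only the low 8 bits are written */
    return eax;
}

/* Models "movzx eax, byte [mem]". */
uint32_t zero_extend_via_movzx(uint8_t mem)
{
    return (uint32_t)mem;
}
```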
The same is cited in the Pentium Pro manual. Now, that is not to say that Intel necessarily has the best understanding of how its CPUs are used in practice, but the instruction was probably specifically catered for.
Clearing a Register
The preferred sequence to move zero to a register is XOR REG, REG. This saves code
space but sets the condition codes. In contexts where the condition codes must be
preserved, use: MOV REG, 0.
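The condition-code difference discussed throughout this thread can be sketched as a small C model. The struct and function names are hypothetical, and only the flags relevant here are modelled (AF is left out because XOR leaves it undefined).

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint16_t ax;
    bool cf, zf, sf, of, pf;   /* carry, zero, sign, overflow, parity */
} toy_regs;

/* Models "xor ax, ax": clears AX and rewrites the flags. */
void do_xor_ax_ax(toy_regs *r)
{
    r->ax ^= r->ax;   /* result is always 0 */
    r->cf = false;    /* XOR always clears CF and OF */
    r->of = false;
    r->zf = true;     /* result is zero */
    r->sf = false;    /* top bit of the result is clear */
    r->pf = true;     /* low byte 0 has an even number of set bits */
}

/* Models "mov ax, 0": clears AX and leaves every flag as it was. */
void do_mov_ax_0(toy_regs *r)
{
    r->ax = 0;
}
```

This is why an error path that must return with the carry flag still set, like the one shown earlier in the thread, cannot use the XOR form to zero the return register.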