How can xchg return a value?

Question about which tools to use, bugs, the best way to implement a function, etc should go here. Don't forget to see if your question is answered in the wiki first! When in doubt post here.
Post Reply
quadrant
Member
Member
Posts: 74
Joined: Tue Apr 24, 2018 9:46 pm

How can xchg return a value?

Post by quadrant »

Hi

I am trying to understand how xv6 uses the xchg instruction to achieve atomicity when creating locks. Here is the relevant snippet from spinlock.c:

Code: Select all

void acquire ( struct spinlock *lk )
{
	...

	while( xchg( &lk->locked, 1 ) != 0 )
	{
		// spin, waiting for lock to become available
	}

	...
}
The function xchg is a wrapper for the xchg assembly instruction as defined below (see x86.h):

Code: Select all

static inline uint xchg ( volatile uint *addr, uint newval )
{
	uint result;

	// The + in "+m" denotes a read-modify-write operand.
	asm volatile(

		"lock; xchgl %0, %1" :
		"+m" ( *addr ), "=a" ( result ) :
		"1" ( newval ) :
		"cc"
	);

	return result;
}
The xv6 book has this to say about the use of xchg:
In one atomic operation, xchg swaps a word in memory with the contents of a register. The function acquire repeats this xchg instruction in a loop; each iteration atomically reads lk->locked and sets it to 1. If the lock is already held, lk->locked will already be 1, so the xchg returns 1 and the loop continues. If the xchg returns 0, however, acquire has successfully acquired the lock (locked was 0 and is now 1) so the loop can stop.
What I don't understand is how the function xchg returns a value. How can you get a return value from the assembly xchg instruction? What does the code inside asm volatile do? What would it look like if it was not inline, but instead written in a separate assembly file? The only line I sort of understand is "lock; xchgl %0, %1". Any insight would be much appreciated!

Edit: The following quote from this OS textbook, makes me think that the line "+m" ( *addr ), "=a" ( result ) does some voodoo to move the old value (addr) into the return value (result):
The key, of course, is that this sequence of operations is performed atomically. The reason it is called "test and set" is that it enables you to "test" the old value (which is what is returned) while simultaneously "setting" the memory location to a new value; as it turns out, this slightly more powerful instruction is enough to build a simple spin lock...
User avatar
xenos
Member
Member
Posts: 1121
Joined: Thu Aug 11, 2005 11:00 pm
Libera.chat IRC: xenos1984
Location: Tartu, Estonia
Contact:

Re: How can xchg return a value?

Post by xenos »

There is actually less voodoo involved than inline assembly "magic", but that is less tricky:

First of all, the xchg assembly instruction does just one thing: it exchanges atomically, i.e., as one non-dividable operation, the contents of a memory location and a register. With the lock prefix, it (simply said) also prevents other CPUs from interfering.

Now let's have a look at the inline assembly. The first line "lock; xchgl %0, %1" is the actual assembly code to be emitted. It has two placeholders, %0 and %1, which the compiler must replace before emitting the instruction, and the following lines tell it how to do so. The "+m" ( *addr ) tells the compiler that %0 shall refer to the memory location which contains the value of addr, and the + in particular means that this address is both read from and written to. The "=a" ( result ) means that %1 shall be replaced with the "a" register (which is actually eax here, since the operand size is 32 bit), and that the contents of this register is stored in the variable result after the operation. The next line "1" ( newval ) says that before the instruction, the register corresponding to placeholder %1 (so eax) shall be filled with the contents of newval. The last line "cc" informs the compiler that also condition codes in eflags wll change during the operation, so it shall not rely on them being preserved. See http://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html for more information.

Not this is what a simple test gives me, if I remove the "inline" attribute:

Code: Select all

movl 4(%esp), %edx
movl 8(%esp), %eax
lock; xchgl (%edx), %eax
ret
The first line moves the address that addr points to into the edx register, while the second line moves the contents of newval into eax. The third line atomically exchanges the contents of the memory location that edx points to with the contents of eax. Now whatever was at that memory location is in eax. But this is also what the function is supposed to return, namely result, so there is nothing left to do, because eax is just the place where a function returns its return value, and that value is already there. So the function simply returns.

So mostly this is about understanding how inline assembly works with input and output operands. The actual result will still be a bit more tricky because the "inline" at the beginning tells the compiler to inline the function instead of emitting it as shown here. But I guess you get the point.
Programmers' Hardware Database // GitHub user: xenos1984; OS project: NOS
mikegonta
Member
Member
Posts: 229
Joined: Thu May 19, 2011 5:13 am
Contact:

Re: How can xchg return a value?

Post by mikegonta »

XenOS wrote:There is actually less voodoo involved than inline assembly "magic", but that is less tricky:

Code: Select all

movl 4(%esp), %edx
movl 8(%esp), %eax
lock; xchgl (%edx), %eax
ret
xchg wrote:Exchanges the contents of the destination (first) and source (second) operands. The operands can be two general-purpose registers or a register and a memory location.
If a memory operand is referenced, the processor's locking protocol is automatically implemented for the duration of the exchange operation, regardless of the presence
or absence of the LOCK prefix or of the value of the IOPL.
lock wrote:The LOCK prefix can be prepended only to the following instructions and only to those forms of the instructions where the destination operand is a memory operand:
ADD, ADC, AND, BTC, BTR, BTS, CMPXCHG, CMPXCH8B, DEC, INC, NEG, NOT, OR, SBB, SUB, XOR, XADD, and XCHG. If the LOCK prefix is used with one of these instructions
and the source operand is a memory operand, an undefined opcode exception (#UD) may be generated.
So, in the original code (as translated above) an exception should be generated when using the redundant lock prefix with the xchg source operand as a memory location.
lock wrote:The XCHG instruction always asserts the LOCK# signal regardless of the presence or absence of the LOCK prefix.
This implies that xchg is also atomic in a multiprocessor environment without the lock prefix.

Oh, and neither xchg nor mov affect the flags so the condition codes won't change during the inline operation.
Mike Gonta
look and see - many look but few see

https://mikegonta.com
User avatar
xenos
Member
Member
Posts: 1121
Joined: Thu Aug 11, 2005 11:00 pm
Libera.chat IRC: xenos1984
Location: Tartu, Estonia
Contact:

Re: How can xchg return a value?

Post by xenos »

Indeed, good point. That seems to be a bug in the xv6 code, which I find a bit surprising, as I would have expected that this had been discovered before... But this test does the right thing (as does my preferred way using C++11 atomics):

Code: Select all

movl 4(%esp), %edx
movl 8(%esp), %eax
xchgl (%edx), %eax
ret
Apart from that, one might actually use cmpxchg here, since for the spin lock the next thing to do would be to test the result of the operation and perform a conditional jump, and cmpxchg already sets ZF depending on the result.

EDIT: Another quick test gave this assembly if one uses the compiler builtin __sync_val_compare_and_swap:

Code: Select all

movl 4(%esp), %edx
xorl %eax, %eax
movl $1, %ecx
lock cmpxchgl %ecx, (%edx)
ret
Programmers' Hardware Database // GitHub user: xenos1984; OS project: NOS
Octocontrabass
Member
Member
Posts: 5584
Joined: Mon Mar 25, 2013 7:01 pm

Re: How can xchg return a value?

Post by Octocontrabass »

mikegonta wrote:So, in the original code (as translated above) an exception should be generated when using the redundant lock prefix with the xchg source operand as a memory location.
Both operands of XCHG are destinations. In fact, the XCHG opcode does not encode the order of the operands - you can reverse them and the assembler will output exactly the same opcode. As long as either operand is memory, a LOCK prefix will not cause an exception.

Of course, the lock prefix is still redundant and can safely be removed. Additionally, there is no reason to restrict the register operand to EAX - any register is valid here. (Replace "=a" with "=r".) And it's already been mentioned, but you can also remove the "cc" clobber since XCHG doesn't modify any flags.

A lot of the xv6 inline assembly has similar optimization potential.
XenOS wrote:Apart from that, one might actually use cmpxchg here, since for the spin lock the next thing to do would be to test the result of the operation and perform a conditional jump, and cmpxchg already sets ZF depending on the result.
I suspect the authors are intentionally limiting themselves to the 386 instruction set, since xv6 is intended to be educational and modern x86 is quite complex.
quadrant
Member
Member
Posts: 74
Joined: Tue Apr 24, 2018 9:46 pm

Re: How can xchg return a value?

Post by quadrant »

Thank you all for the detailed replies! @XenOs, using godbolt to expand the inline assembly is brilliant, will definitely be using it more often!
quadrant
Member
Member
Posts: 74
Joined: Tue Apr 24, 2018 9:46 pm

Re: How can xchg return a value?

Post by quadrant »

XenOS wrote:But this test does the right thing (as does my preferred way using C++11 atomics):

Code: Select all

movl 4(%esp), %edx
movl 8(%esp), %eax
xchgl (%edx), %eax
ret
Given that it decomposes into three instructions, what happens if a context switch happens before we reach the xchgl instruction? Or does it not matter because whichever process/thread manages to reach xchgl first, all the rest who were interrupted midway on resuming will do the xchgl and find that the value now indicates that the lock is held...
Octocontrabass
Member
Member
Posts: 5584
Joined: Mon Mar 25, 2013 7:01 pm

Re: How can xchg return a value?

Post by Octocontrabass »

It doesn't matter. The important part when acquiring the lock is reading it and updating its status to "held", while ensuring no other programs can access the lock between your read and write. Since the XCHG instruction performs both the read and the write, a context switch can't occur between them. Since XCHG with a memory operand includes an implicit bus lock, no other CPUs can access the lock between your read and write.
Post Reply