Hi,
Some notes for more discussion, etc.
Teodor Väänänen wrote:When I've seen spinlock code in the past, I've also seen the use of a special instruction to avoid penalties with the processor pipeline or cache, IIRC the PAUSE instruction. Maybe a separate page, or a subsection on why to use it?
For a tight loop the CPU can pipeline many different iterations of the loop. For a spinloop this is bad because you can end up with the same instruction repeated in the CPUs pipeline several times, which is pointless and increases the time taken to exit the loop. It's even worse for hyper-threading (where the other logical CPU could be using the pipeline instead). The PAUSE instruction prevents this problem (AFAIK by waiting for the previous instruction to complete before shoving more instructions onto the pipeline) and prevents the loop exit penalty.
Teodor Väänänen wrote:I've seen other forms of spinlocks than the ones using BTx, IIRC Linux uses XCHG. A comparison on methods maybe?
The XCHG instruction is a little faster and probably consumes less bytes (no LOCK prefix is needed because it's implied). The BTS method only relies one one bit, which means that the remainder of the dword can be used for extra information, like recording which task acquired the lock (to detect deadlocks and/or when someone tries to free the lock when someone else acquired it), or having a "this lock was under contention" flag that's polled occasionally to see which locks are under contention and which locks aren't.
Note: There's also other things you can do to detect programming errors, like checking that the lock was acquired when you attempt to free it. It's always better to have a little extra code to find problems like this than to have random unexpected crashes that you can never find (especially for alpha/beta code).
Teodor Väänänen wrote:[*]I seem to recall that the preferred way is to use a plain BT to see if the approriate bit is set/clear, then use
a LOCK BTx to actually try to get the lock. Any thoughts on this?
The "test then test-and-set" method is Intel's recommended method. The idea is to avoid continually LOCKing when there's lock contention; which can be a serious problem if 2 or more CPUs are trying to acquire an already locked lock, because you end up with a cache line constantly bouncing from CPU to CPU (while the bus is constantly locked) rather than having the cache line in the shared state in all CPUs (and a lot less LOCKed bus traffic).
I'd also add something about the
bakery algorithm. It has similar properties as a spinlock but also makes sure that tasks get the lock in the order they arrive (which prevents starvation). The disadvantage is that it costs twice as much data (e.g. either 2 cache lines instead of one, or 2 dwords instead of one).
The code is simple enough:
Code: Select all
mov ebx,1
lock xadd [ticket],ebx ;ebx = my ticket number
cmp [counter],ebx ;Is it my turn?
je .acquired ; yes
.retry:
pause
cmp [counter],ebx ;Is it my turn?
je .retry ; no, keep waiting
.acquired:
< Critical Section Here >
lock inc [counter]
I'd guess spinlocks are better if lock contention is unlikely, while a "bakery" is better if lock contention is expected. For a small amount of lock contention I'd use a bakery where the "ticket" and "counter" are on the same cache line (to reduce cache line fetches) while for moderate lock contention I'd put "ticket" and "counter" on different cache lines. For even heavier lock contention I'd find a way to reduce lock contention...
Also note that re-entrancy locks protect data, and the critical section/s are pieces of code that work on that data. Some people assume that re-entrancy locks protect critical sections, but that's just wrong.
It's probably a good idea to have something about avoiding deadlocks too - e.g. always acquire locks in the same order, and always free locks in the reverse order.
Doh - there's a mistake on the Wiki page too - the BTC instruction is "Bit Test and Compliment". You'd want BTR (Bit Test and Reset). This might have been my fault (I have a habit of thinking BTC is "Bit Test and Clear" when it's not
).
Cheers,
Brendan