
Out of order loads

Posted: Tue Dec 17, 2013 8:05 pm
by devsau
I've been looking over a lot of Linux code lately, a few locks in particular. I have seen this pattern a few times and was wondering what its purpose is:

lock and byte ptr [rcx], 0xf3   //leaving a spinlock
lfence
//do stuff
Afaik, lock prefixes act as memory barriers and drain/complete any loads or stores. So I don't see the purpose of the lfence in that case. Can anyone enlighten me, or point me to some reading material? Thanks :D

Re: Out of order loads

Posted: Wed Dec 18, 2013 12:51 am
by Brendan
Hi,
devsau wrote: Afaik, lock prefixes act as memory barriers and drain/complete any loads or stores. So I don't see the purpose of the lfence in that case. Can anyone enlighten me, or point me to some reading material? Thanks :D
As far as I know, you're correct and the "lfence" should be unnecessary.

However, the "lfence" may have been needed as a work-around for bugs in a few CPUs. Specifically, there were bugs in AMD's "family 15" CPUs that involved locks/semaphores and read ordering rules. I think the specific CPU bugs I'm thinking of have since been fixed by a microcode update, so the "lfence" might just be there in case someone hasn't installed the most recent microcode update, but I really can't be too sure.


Cheers,

Brendan

Re: Out of order loads

Posted: Wed Dec 18, 2013 10:43 am
by devsau
Thanks, Brendan. I was beginning to think something along those lines.

Re: Out of order loads

Posted: Thu Dec 19, 2013 1:50 am
by kscguru
Yes, this is AMD erratum 147:
http://support.amd.com/TechDocs/25759.pdf

It was never fixed in microcode; Family 15 Rev F2 fixed it in silicon (~2006). And the erratum was under NDA for another three years (sigh). It's memorable for how invasive and costly the workaround is. As far as I'm concerned, it's bad enough to render a few years of Opterons junk: nobody wants to debug lock instructions that don't work right.

Linux kernel mention of the erratum: https://bugzilla.kernel.org/show_bug.cgi?id=11305
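
For anyone who wants to experiment with the pattern from the first post, here is a rough C sketch of a spinlock whose release is a locked "and" followed by a load fence. It is only an illustration, not the kernel's actual code: spinlock_t, LOCK_BITS, load_fence, spin_lock, spin_unlock, spin_is_locked and demo are all made-up names, and treating the bits cleared by the 0xf3 mask (0x0c) as the "held" bits is an assumption. The fence is placed only where the original snippet has it (after the release), and the non-x86 fallback is just a guess at something reasonable.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical lock type; which bits mean "held" is an assumption. */
typedef struct { atomic_uchar state; } spinlock_t;

#define LOCK_BITS 0x0cu   /* bits cleared by the 0xf3 mask in the snippet */

/* Load fence: "lfence" on x86, a full C11 fence elsewhere (assumed fallback). */
static inline void load_fence(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ __volatile__("lfence" ::: "memory");
#else
    atomic_thread_fence(memory_order_seq_cst);
#endif
}

/* Acquire: atomically set the lock bits, spin while they were already set. */
static inline void spin_lock(spinlock_t *l)
{
    while (atomic_fetch_or(&l->state, LOCK_BITS) & LOCK_BITS) {
        /* busy-wait until the holder clears the bits */
    }
}

/* Release: a locked AND with 0xf3 (on x86 compilers typically emit
 * "lock and" here), then the extra load fence from the snippet. */
static inline void spin_unlock(spinlock_t *l)
{
    atomic_fetch_and(&l->state, (unsigned char)~LOCK_BITS);
    load_fence();
}

static inline int spin_is_locked(spinlock_t *l)
{
    return (atomic_load(&l->state) & LOCK_BITS) != 0;
}

/* Single-threaded smoke test of the lock/unlock round trip. */
int demo(void)
{
    spinlock_t l = { 0 };
    int shared = 0;

    spin_lock(&l);
    assert(spin_is_locked(&l));
    shared++;                 /* "do stuff" while holding the lock */
    spin_unlock(&l);
    assert(!spin_is_locked(&l));
    return shared;
}
```

On a CPU without the erratum the load_fence() call in spin_unlock should be redundant on x86, since the locked instruction already orders earlier loads and stores; it is kept here only to mirror the snippet being discussed.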