lock and byte ptr [rcx], 0xf3 //leaving a spinlock
lfence
//do stuff
AFAIK, lock prefixes act as memory barriers and drain (complete) any outstanding loads and stores, so I don't see the purpose of the lfence in this case. Can anyone enlighten me, or point me to some reading material? Thanks.
devsau wrote: AFAIK, lock prefixes act as memory barriers and drain (complete) any outstanding loads and stores, so I don't see the purpose of the lfence in this case. Can anyone enlighten me, or point me to some reading material? Thanks.
As far as I know, you're correct and the "lfence" should be unnecessary.
However, the "lfence" may have been needed as a workaround for bugs in a few CPUs. Specifically, there were bugs in AMD's "family 15" CPUs that involved locks/semaphores and read ordering rules. I think the specific bugs I have in mind have since been fixed by a microcode update, so the "lfence" might just be there in case someone hasn't installed the most recent microcode, but I can't be sure.
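To make the "no extra fence" case concrete, here is a minimal C11 sketch of the same release pattern. The helper names, the LOCK_BITS/UNLOCK_MASK constants, and the idea that bits 2-3 are the lock bits are assumptions for illustration, inferred only from the AND with 0xf3 in the snippet above. With GCC or Clang on x86-64, the atomic AND is typically emitted as a "lock and", but check your own compiler output rather than taking that as guaranteed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: bits 2-3 (0x0c) are the lock bits, because the original
   release clears exactly those with AND 0xf3. Other bits are preserved. */
#define LOCK_BITS   0x0cu
#define UNLOCK_MASK 0xf3u

/* Hypothetical helper: one attempt to take the lock, true on success. */
static bool spin_try_acquire(_Atomic uint8_t *lock)
{
    uint8_t old = atomic_load_explicit(lock, memory_order_relaxed);
    if (old & LOCK_BITS)
        return false;                       /* already held */
    /* Sequentially consistent compare-exchange; on x86-64 this is
       typically compiled to a locked read-modify-write. */
    return atomic_compare_exchange_strong_explicit(
        lock, &old, (uint8_t)(old | LOCK_BITS),
        memory_order_seq_cst, memory_order_relaxed);
}

/* Hypothetical helper: leave the lock with a single atomic AND.
   No separate fence is written here; the atomic operation itself
   orders the earlier loads and stores before the lock bits clear. */
static void spin_release(_Atomic uint8_t *lock)
{
    atomic_fetch_and_explicit(lock, (uint8_t)UNLOCK_MASK,
                              memory_order_seq_cst);
}

/* Single-threaded usage check: returns the counter after one
   critical section. */
static int critical_section_demo(void)
{
    _Atomic uint8_t lock = 0x01;   /* bit 0 is an unrelated flag */
    int counter = 0;

    bool got = spin_try_acquire(&lock);
    assert(got);
    counter++;                     /* protected work */
    spin_release(&lock);

    assert(atomic_load(&lock) == 0x01);   /* unrelated bit survives */
    return counter;
}
```

On a CPU affected by the kind of bug described above, the workaround would presumably be an explicit load fence right after the locked operation, which is what the "lfence" in the original snippet looks like.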
Cheers,
Brendan
It was never fixed in microcode; Family 15 Rev F2 fixed it in silicon (~2006). And the erratum was under NDA for another three years (sigh). It's memorable for how invasive and costly the workaround is. As far as I'm concerned, it's bad enough to render a few years' worth of Opterons junk: nobody wants to debug lock instructions that don't work right.