I don't develop for Arm, but wanted to ask anyway.
Generally, code built on LDREX and STREX (or LDAXR and STLXR) is considered lock-free when it is properly implemented and supported by the OS. I mean basic operations, like increments, CAS, etc. Then again, the Arm specs give me the impression of obstruction-freedom more than lock-freedom. For example, two cores spinning tightly in an LL/SC loop might each fail their STREX because of the other's LDREX, and make no progress for some arbitrary amount of time. Maybe I am misunderstanding the term "exclusive monitor" here.
So, what are the progress guarantees of a tight LL/SC loop implementing a basic operation on Arm-compliant CPUs, assuming no interrupts or other memory accesses intervene? Can lock-freedom be achieved, or is it obstruction-free code at best?
PS: To rephrase the question. Does a LDREX from a competing core fail the STREX of a core that already holds exclusive monitor on that address?
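For concreteness, here is a minimal sketch of the kind of loop in question, written with C11 atomics rather than Arm assembly. The helper name fetch_inc is hypothetical. On Arm targets without single-instruction atomics, compilers typically lower a weak compare-exchange loop like this to a load-exclusive/store-exclusive retry loop, but the exact code generated is an assumption that depends on the compiler and target.

```c
#include <stdatomic.h>

/* Hypothetical helper: increment *p and return the previous value.
   The weak compare-exchange is allowed to fail spuriously, which is how
   a failed store-exclusive surfaces at the C level, so it is retried. */
static unsigned fetch_inc(atomic_uint *p)
{
    unsigned old = atomic_load_explicit(p, memory_order_relaxed);
    while (!atomic_compare_exchange_weak_explicit(
               p, &old, old + 1,
               memory_order_acq_rel, memory_order_relaxed)) {
        /* On failure, old now holds the value currently in *p; retry. */
    }
    return old;
}
```

The question is then whether, with several cores running this loop on the same address, at least one of them is guaranteed to leave the loop.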
Progress guarantees of Arm LL/SC
Re: Progress guarantees of Arm LL/SC
My v7-A manual says this at the very end of subsection A3.4.5:
In the event of repeatedly-contending load-exclusive/store-exclusive sequences from multiple processors, an implementation must ensure that forward progress is made by at least one processor.
simeonz wrote: Does a LDREX from a competing core fail the STREX of a core that already holds exclusive monitor on that address?
If I understand the global monitor state machine right, it doesn't. Instead, the matching STREX from second core will fail.
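The behaviour described above can be sketched as a toy, single-threaded model in C. This is only an illustration of the reading given in this reply, not the architecture's actual state machine: the names monitor, load_excl and store_excl are hypothetical, each core tracks a single address, and effects such as ordinary stores, interrupts and reservation granules are left out.

```c
#include <stdbool.h>
#include <stddef.h>

#define NCORES 2

/* Toy model only: one monitor per core, tagging at most one address. */
struct monitor { bool armed; const int *addr; };
static struct monitor mon[NCORES];

/* Load-exclusive: tag the address for this core.
   Monitors belonging to other cores are left untouched. */
static int load_excl(int core, const int *addr)
{
    mon[core].armed = true;
    mon[core].addr = addr;
    return *addr;
}

/* Store-exclusive: returns 0 on success and 1 on failure,
   following the STREX status convention. */
static int store_excl(int core, int *addr, int value)
{
    if (!mon[core].armed || mon[core].addr != addr) {
        mon[core].armed = false;
        return 1;
    }
    *addr = value;
    /* A successful store clears every monitor tagging this address,
       including the storing core's own. */
    for (size_t i = 0; i < NCORES; i++)
        if (mon[i].addr == addr)
            mon[i].armed = false;
    return 0;
}
```

In this model, if core 0 and core 1 both call load_excl on the same address and core 0 stores first, core 0's store_excl returns 0 and core 1's returns 1, so one of the two always gets through.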
Re: Progress guarantees of Arm LL/SC
Icee wrote: If I understand the global monitor state machine right, it doesn't. Instead, the matching STREX from second core will fail.
Great. That's nice to know. Looking at the quote, I see a subtle nuance, where the specs say "repeatedly contending", which might imply backoff heuristics. I doubt that it would be very practical though. The arbitration is probably implemented through the cache coherence. It doesn't even matter too much, if the progress is required by the architecture.
Thanks.
Re: Progress guarantees of Arm LL/SC
You might also be interested in a related discussion in section 6.2 of Volume 1 of the RISC-V manual. It goes into more detail on how an actual implementation might guarantee forward progress in an LR/SC contention scenario, and it does indeed rely on cache line manipulation.
Re: Progress guarantees of Arm LL/SC
@Icee: You probably meant section 5.2.
The idea is apparently to allow blocking implementations. That is, a core can keep the cache line exclusive for as long as necessary, stalling others in the meantime. Due to the syntactical constraints imposed on the code section guarded by LL/SC, the execution time of the blocker is always bounded. Hence the progress guarantee.
LL/SC ends up being not so much a limited version of transactional memory as a flexible, user-programmable version of the x86 atomic instructions, which fits well with the RISC principles overall.
PS: Thanks again. I will try to RTFM next time.
Re: Progress guarantees of Arm LL/SC
simeonz wrote: @Icee: You probably meant section 5.2.
It's actually section 7.2 in the latest (v2.2) spec. The RISC-V people are very liberal about moving stuff around.