
lock memory bus for multiple processor system

Posted: Thu Sep 13, 2018 8:45 pm
by ITchimp
As far as I know, on a single-processor system, disabling interrupts is sufficient to implement atomicity... on a multi-processor system... there is a TSL instruction that does the memory bus locking...

So here is my question: what is the exact behavior of memory bus locking? Does it completely disable memory access for any processor trying to use the bus... or does it only block memory requests that refer to the locked memory addresses (accesses to addresses not related to the atomic operation are still allowed to proceed)?

In other words, in x86, x64 or ARM implementations, is this a full lock or a memory-address-specific lock?

Re: lock memory bus for multiple processor system

Posted: Thu Sep 13, 2018 9:41 pm
by Brendan
Hi,
ITchimp wrote: As far as I know, on a single-processor system, disabling interrupts is sufficient to implement atomicity... on a multi-processor system... there is a TSL instruction that does the memory bus locking...

So here is my question: what is the exact behavior of memory bus locking? Does it completely disable memory access for any processor trying to use the bus... or does it only block memory requests that refer to the locked memory addresses (accesses to addresses not related to the atomic operation are still allowed to proceed)?

In other words, in x86, x64 or ARM implementations, is this a full lock or a memory-address-specific lock?
I don't know about ARM.

For 80x86, there is no "test, set and lock" instruction that leaves the bus locked until later. Instead there's a lock prefix that you can use to make sure a single read-modify-write instruction is atomic; nothing stays locked after that instruction completes.

For ancient CPUs this prefix literally locked the entire bus, which meant that everything else trying to use the bus (other CPUs, devices doing bus mastering or DMA) had to wait for the instruction to complete before the bus was unlocked again (which is relatively bad for performance/scalability when there are many CPUs and/or devices waiting to use the bus).

For newer CPUs (except for special/rare cases, such as an operand that is split across two cache lines or is in uncached memory) it only locks the cache line and doesn't lock the bus, and (with cache coherency) this is enough to create the illusion that the bus was locked even though it wasn't. So on these CPUs the lock is effectively specific to the cache line being accessed, and accesses to other addresses can proceed.

In both cases the locked instruction also acts as a full memory barrier (other reads and writes can't be re-ordered around it), because an instruction with a "lock" prefix is often used to synchronise other accesses (e.g. you might use an instruction with a lock prefix to determine if it's safe to read other data from somewhere else, so the CPU can't assume it's safe to read data from anywhere else until after the instruction with the "lock" prefix is done).
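
To make the "single atomic instruction" idea concrete, here is a minimal sketch in C11 of how a test-and-set spinlock and an atomic counter might be built on top of it. The names (demo_spinlock, spin_lock, locked_increment and so on) are made up for illustration and are not from any particular kernel. On 80x86, compilers typically turn atomic_flag_test_and_set into an xchg with a memory operand (which is locked implicitly) and atomic_fetch_add into a lock-prefixed xadd or add, but the exact instructions depend on the compiler and target.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical spinlock type for illustration; a real kernel's will differ. */
typedef struct {
    atomic_flag locked;
} demo_spinlock;

static demo_spinlock demo_lock = { ATOMIC_FLAG_INIT };
static long shared_counter = 0;        /* protected by demo_lock */
static atomic_long atomic_counter = 0; /* updated with one atomic instruction */

/* Try once: returns true if we took the lock, false if it was already held. */
static bool spin_try_lock(demo_spinlock *l) {
    /* Atomically sets the flag and returns its previous value. */
    return !atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire);
}

/* Spin until the lock is acquired. */
static void spin_lock(demo_spinlock *l) {
    while (atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire)) {
        /* Busy-wait; real code would usually add a pause/yield hint here. */
    }
}

/* Release the lock so another CPU's test-and-set can succeed. */
static void spin_unlock(demo_spinlock *l) {
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

/* Multi-step critical section guarded by the spinlock. */
static long locked_increment(void) {
    spin_lock(&demo_lock);
    long value = ++shared_counter;
    spin_unlock(&demo_lock);
    return value;
}

/* Same effect with a single atomic read-modify-write and no lock at all. */
static long atomic_increment(void) {
    return atomic_fetch_add(&atomic_counter, 1) + 1;
}
```

Note that the hardware-level lock only lasts for the one atomic instruction inside spin_lock; the spinlock itself is just a flag in memory that software agrees to respect. When the whole update fits in one read-modify-write (as in atomic_increment), no spinlock is needed.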


Cheers,

Brendan