Disappointment with ARMv8's limited synchronization primitives
Posted: Mon Dec 16, 2013 5:59 pm
ARMv8 is a fairly new ISA, but despite that its synchronization primitives remain the traditional kind: exclusive load and exclusive store. Load-acquire and store-release have been added, but as I understand it they were added to fit the memory model; they could be emulated before, at the expense of performance.
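To make the load-acquire/store-release part concrete, here is a minimal sketch in C11 atomics of the usual publish/consume pattern. The names payload, ready, publish and try_consume are made up for illustration. As far as I know, AArch64 compilers typically lower the two explicit atomic operations to STLR and LDAR, but that is a detail of the toolchain and not something the code depends on.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;                 /* plain data, written before the flag */
static atomic_bool ready = false;   /* flag that publishes the payload */

/* Producer: write the data, then publish it with a store-release.
   The release ordering keeps the payload write from moving after the flag. */
void publish(int value) {
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Consumer: a load-acquire that observes the flag set is also guaranteed
   to observe the payload written before the matching store-release. */
bool try_consume(int *out) {
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false;               /* nothing published yet */
    *out = payload;
    return true;
}
```

Without acquire/release semantics the same ordering would need explicit barriers around plain loads and stores, which I assume is the emulation cost referred to above.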
What I think is very much needed today is a new type of synchronization: transactional memory. Intel recently added this with its TSX extension, which is what I and many other programmers want. Making even basic algorithms and data structures concurrent is very hard, and the result is often hard to understand and takes much more code than the original non-concurrent version. Small, fast locks such as spinlocks are virtually useless in user space, because a thread can be preempted while holding the lock, which has nasty side effects, so they are seldom used.
The limitations of CAS have been known for a very long time, and transactional memory has been proposed for some time now, so I think ARM really took the cheap route by not including it.
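One way to see the kind of CAS limitation I mean, assuming the main one is that a CAS covers only a single word: the sketch below uses C11 atomics, and withdraw and transfer are hypothetical helpers, not taken from any library. Updating one balance with a CAS retry loop is fine, but moving an amount between two balances needs two separate atomic steps, and nothing makes the pair atomic.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Withdraw with a CAS retry loop: works, because only one word changes. */
bool withdraw(atomic_int *balance, int amount) {
    int old = atomic_load_explicit(balance, memory_order_relaxed);
    do {
        if (old < amount)
            return false;           /* insufficient funds, nothing changed */
        /* On failure the CAS reloads the current value into old and we retry. */
    } while (!atomic_compare_exchange_weak_explicit(
                 balance, &old, old - amount,
                 memory_order_acq_rel, memory_order_relaxed));
    return true;
}

/* Transfer is two separate atomic steps. Another thread can observe the
   state between them, where the amount has left one balance and has not
   yet reached the other. A single-word CAS cannot close that window. */
bool transfer(atomic_int *from, atomic_int *to, int amount) {
    if (!withdraw(from, amount))
        return false;
    atomic_fetch_add_explicit(to, amount, memory_order_acq_rel);
    return true;
}
```

With transactional memory, both updates could sit inside one transaction that either commits as a whole or is retried, which is the simplification I am asking for.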