Types of spinlocks (I need a new type!)

Discussions on more advanced topics such as monolithic vs micro-kernels, transactional memory models, and paging vs segmentation should go here. Use this forum to expand and improve the wiki!
User avatar
turdus
Member
Member
Posts: 496
Joined: Tue Feb 08, 2011 1:58 pm

Re: Types of spinlocks (I need a new type!)

Post by turdus »

gerryg400 wrote:
Brendan wrote:That needs a LOCK prefix (a memory fence isn't enough).
Are you sure ? Operations that clear individual bits may be okay because only the lock holder will ever clear that bit. The test and the reset don't need to be atomic.

In any event the original code is not guaranteed to be sufficient.
I use lock prefix in that case (since it's an inter cpu lock). And the page that holds the flags in marked as global. Intel manual says that global pages are cached individually to other pages. It looks good in my testcases, but I haven't finished smp in my os yet, so it might need revision later. The reset must be atomic too, as Brendan pointed out in that example (the qword holds several locks, so multiple clears or sets may happen).
rdos
Member
Member
Posts: 3276
Joined: Wed Oct 01, 2008 1:55 pm

Re: Types of spinlocks (I need a new type!)

Post by rdos »

gerryg400 wrote:
turdus wrote:Releasing a lock is simple as

Code: Select all

btr qword [offset], lockid
Don't you also sometimes need a memory fence before that btr ? If you use a LOCK prefix in the unlock case as well it might be okay. I'm not 100% sure.

Rdos, there's no fence in your unlock code either. That can cause problems on SMP unless you have a fence, if necessary, in the calling code.
I don't think one is needed. I'm not using a read-modify-write instruction to release the lock. It is a write-only instruction, and until it has propagated to the other cores they will get one instead of zero on their xchg instructions.
User avatar
Brendan
Member
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!
Contact:

Re: Types of spinlocks (I need a new type!)

Post by Brendan »

Hi,
rdos wrote:
gerryg400 wrote:Rdos, there's no fence in your unlock code either. That can cause problems on SMP unless you have a fence, if necessary, in the calling code.
I don't think one is needed. I'm not using a read-modify-write instruction to release the lock. It is a write-only instruction, and until it has propagated to the other cores they will get one instead of zero on their xchg instructions.
You don't need a lock for something like "mov [es:p_spinlock],ax", as long as the access is aligned on a sane boundary. If the write isn't aligned (e.g. half on one cache line and half on another) then other CPUs may read an old half and a new half; but in rdos' case (where the highest 8 bits are always zero anyway) that wouldn't matter, and if it did matter you'd fix the alignment rather than using a lock prefix.
gerryg400 wrote:I have a question related to locks. In my spin_unlock function I use a fence whenever I clear a lock so that I don't need to remember to figure out whether to put a fence in my main code.

Do you think this perhaps unnecessary fence would have a big performance penalty ?
Fences are only really required when the CPU's memory ordering isn't strong enough, which only really happens when there's strange/hidden side effects involved (e.g. when dealing with memory mapped IO). For example, imagine the video card has a memory mapped register that effects how reads from display memory are handled (like this VGA register, but memory mapped and not an IO port), and you write to this register (changing how reads from display memory are handled) and then read from display memory. In this case the CPU can't really know that the write effects the read, and may do the read before the write - a fence would be needed. Note: if the register was an IO port and not memory mapped, there'd be no need for a fence as IO port instructions cause strong ordering, which is part of the reason why accessing IO ports is slower (and why most modern video cards allow the registers to be memory mapped instead).

However, when a fence isn't required it might still be beneficial. For example, if lots of CPUs are waiting for another CPU to do a write, then putting a fence after the write might help to make the write visible to other CPUs sooner (by limiting the amount of other things the CPU might do before the write is visible to others). It's really hard to say when this might help and when it won't though - you'd need to benchmark specific code (with specific access patterns) on a variety of different CPUs. In general, I'd assume that (for spinlocks) if the chance of lock contention is low then the fence won't be beneficial, and if the chance of lock contention is high the fence still might hurt performance (and it'd be better to reduce lock contention somehow anyway).


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
gerryg400
Member
Member
Posts: 1801
Joined: Thu Mar 25, 2010 11:26 pm
Location: Melbourne, Australia

Re: Types of spinlocks (I need a new type!)

Post by gerryg400 »

Thanks Brendan. Sorry Turdus and Rdos, it seems that the fence in only rarely required and that my spin_unlock is a too safe. Guess I can delete it then.
If a trainstation is where trains stop, what is a workstation ?
User avatar
Combuster
Member
Member
Posts: 9301
Joined: Wed Oct 18, 2006 3:45 am
Libera.chat IRC: [com]buster
Location: On the balcony, where I can actually keep 1½m distance
Contact:

Re: Types of spinlocks (I need a new type!)

Post by Combuster »

Brendan wrote:Fences are only really required when the CPU's memory ordering isn't strong enough, which only really happens when there's strange/hidden side effects involved
AMD 2 ch 7.2 wrote:Non-overlapping Loads may pass stores.
"Not happening" depends if your code does not include cases such as:

Code: Select all

//*a == *b == 0;

// thread 1
if(*a) *b = 1;


// thread 2
int temp = *b;
*a = 1;
// temp may be 1
"Certainly avoid yourself. He is a newbie and might not realize it. You'll hate his code deeply a few years down the road." - Sortie
[ My OS ] [ VDisk/SFS ]
gerryg400
Member
Member
Posts: 1801
Joined: Thu Mar 25, 2010 11:26 pm
Location: Melbourne, Australia

Re: Types of spinlocks (I need a new type!)

Post by gerryg400 »

There would be a problem in that situation. I think that type of problem might occur if you try to design a lockless linked list for example.

But I think we can say that on x86, for a simple spinlock protecting a shared piece of memory, if the LOCK prefix is used to take the spinlock it's sufficient to release the spinlock without a LOCK prefix or fence. (Well except that some CPUs have an erratum about this).

BTW, for the "__sync_lock_release" intrinsic which I use to release user-land locks, GCC generates

Code: Select all

  400410:	0f ae f0             	mfence 
  400413:	c7 07 00 00 00 00    	movl   $0x0,(%rdi)
so it seems they take a conservative approach.
If a trainstation is where trains stop, what is a workstation ?
OSwhatever
Member
Member
Posts: 595
Joined: Mon Jul 05, 2010 4:15 pm

Re: Types of spinlocks (I need a new type!)

Post by OSwhatever »

For those who are interested in this subject I've found that this page

http://locklessinc.com/articles/locks/

it has some interesting information about spinlocks. Lockless inc has also other interesting articles about locks, you can check it out here.

http://locklessinc.com/articles/
User avatar
Brendan
Member
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!
Contact:

Re: Types of spinlocks (I need a new type!)

Post by Brendan »

Hi,
OSwhatever wrote:For those who are interested in this subject I've found that this page

http://locklessinc.com/articles/locks/

it has some interesting information about spinlocks. Lockless inc has also other interesting articles about locks, you can check it out here.

http://locklessinc.com/articles/
That reminds me of this: http://concurrencykit.org/

It's a "concurrency kit" that contains code for various types of locks, etc. :)


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Post Reply