I use lock prefix in that case (since it's an inter cpu lock). And the page that holds the flags in marked as global. Intel manual says that global pages are cached individually to other pages. It looks good in my testcases, but I haven't finished smp in my os yet, so it might need revision later. The reset must be atomic too, as Brendan pointed out in that example (the qword holds several locks, so multiple clears or sets may happen).gerryg400 wrote:Are you sure ? Operations that clear individual bits may be okay because only the lock holder will ever clear that bit. The test and the reset don't need to be atomic.Brendan wrote:That needs a LOCK prefix (a memory fence isn't enough).
In any event the original code is not guaranteed to be sufficient.
Types of spinlocks (I need a new type!)
Re: Types of spinlocks (I need a new type!)
Re: Types of spinlocks (I need a new type!)
I don't think one is needed. I'm not using a read-modify-write instruction to release the lock. It is a write-only instruction, and until it has propagated to the other cores they will get one instead of zero on their xchg instructions.gerryg400 wrote:Don't you also sometimes need a memory fence before that btr ? If you use a LOCK prefix in the unlock case as well it might be okay. I'm not 100% sure.turdus wrote:Releasing a lock is simple asCode: Select all
btr qword [offset], lockid
Rdos, there's no fence in your unlock code either. That can cause problems on SMP unless you have a fence, if necessary, in the calling code.
Re: Types of spinlocks (I need a new type!)
Hi,
However, when a fence isn't required it might still be beneficial. For example, if lots of CPUs are waiting for another CPU to do a write, then putting a fence after the write might help to make the write visible to other CPUs sooner (by limiting the amount of other things the CPU might do before the write is visible to others). It's really hard to say when this might help and when it won't though - you'd need to benchmark specific code (with specific access patterns) on a variety of different CPUs. In general, I'd assume that (for spinlocks) if the chance of lock contention is low then the fence won't be beneficial, and if the chance of lock contention is high the fence still might hurt performance (and it'd be better to reduce lock contention somehow anyway).
Cheers,
Brendan
You don't need a lock for something like "mov [es:p_spinlock],ax", as long as the access is aligned on a sane boundary. If the write isn't aligned (e.g. half on one cache line and half on another) then other CPUs may read an old half and a new half; but in rdos' case (where the highest 8 bits are always zero anyway) that wouldn't matter, and if it did matter you'd fix the alignment rather than using a lock prefix.rdos wrote:I don't think one is needed. I'm not using a read-modify-write instruction to release the lock. It is a write-only instruction, and until it has propagated to the other cores they will get one instead of zero on their xchg instructions.gerryg400 wrote:Rdos, there's no fence in your unlock code either. That can cause problems on SMP unless you have a fence, if necessary, in the calling code.
Fences are only really required when the CPU's memory ordering isn't strong enough, which only really happens when there's strange/hidden side effects involved (e.g. when dealing with memory mapped IO). For example, imagine the video card has a memory mapped register that effects how reads from display memory are handled (like this VGA register, but memory mapped and not an IO port), and you write to this register (changing how reads from display memory are handled) and then read from display memory. In this case the CPU can't really know that the write effects the read, and may do the read before the write - a fence would be needed. Note: if the register was an IO port and not memory mapped, there'd be no need for a fence as IO port instructions cause strong ordering, which is part of the reason why accessing IO ports is slower (and why most modern video cards allow the registers to be memory mapped instead).gerryg400 wrote:I have a question related to locks. In my spin_unlock function I use a fence whenever I clear a lock so that I don't need to remember to figure out whether to put a fence in my main code.
Do you think this perhaps unnecessary fence would have a big performance penalty ?
However, when a fence isn't required it might still be beneficial. For example, if lots of CPUs are waiting for another CPU to do a write, then putting a fence after the write might help to make the write visible to other CPUs sooner (by limiting the amount of other things the CPU might do before the write is visible to others). It's really hard to say when this might help and when it won't though - you'd need to benchmark specific code (with specific access patterns) on a variety of different CPUs. In general, I'd assume that (for spinlocks) if the chance of lock contention is low then the fence won't be beneficial, and if the chance of lock contention is high the fence still might hurt performance (and it'd be better to reduce lock contention somehow anyway).
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Re: Types of spinlocks (I need a new type!)
Thanks Brendan. Sorry Turdus and Rdos, it seems that the fence in only rarely required and that my spin_unlock is a too safe. Guess I can delete it then.
If a trainstation is where trains stop, what is a workstation ?
- Combuster
- Member
- Posts: 9301
- Joined: Wed Oct 18, 2006 3:45 am
- Libera.chat IRC: [com]buster
- Location: On the balcony, where I can actually keep 1½m distance
- Contact:
Re: Types of spinlocks (I need a new type!)
Brendan wrote:Fences are only really required when the CPU's memory ordering isn't strong enough, which only really happens when there's strange/hidden side effects involved
"Not happening" depends if your code does not include cases such as:AMD 2 ch 7.2 wrote:Non-overlapping Loads may pass stores.
Code: Select all
//*a == *b == 0;
// thread 1
if(*a) *b = 1;
// thread 2
int temp = *b;
*a = 1;
// temp may be 1
Re: Types of spinlocks (I need a new type!)
There would be a problem in that situation. I think that type of problem might occur if you try to design a lockless linked list for example.
But I think we can say that on x86, for a simple spinlock protecting a shared piece of memory, if the LOCK prefix is used to take the spinlock it's sufficient to release the spinlock without a LOCK prefix or fence. (Well except that some CPUs have an erratum about this).
BTW, for the "__sync_lock_release" intrinsic which I use to release user-land locks, GCC generatesso it seems they take a conservative approach.
But I think we can say that on x86, for a simple spinlock protecting a shared piece of memory, if the LOCK prefix is used to take the spinlock it's sufficient to release the spinlock without a LOCK prefix or fence. (Well except that some CPUs have an erratum about this).
BTW, for the "__sync_lock_release" intrinsic which I use to release user-land locks, GCC generates
Code: Select all
400410: 0f ae f0 mfence
400413: c7 07 00 00 00 00 movl $0x0,(%rdi)
If a trainstation is where trains stop, what is a workstation ?
-
- Member
- Posts: 595
- Joined: Mon Jul 05, 2010 4:15 pm
Re: Types of spinlocks (I need a new type!)
For those who are interested in this subject I've found that this page
http://locklessinc.com/articles/locks/
it has some interesting information about spinlocks. Lockless inc has also other interesting articles about locks, you can check it out here.
http://locklessinc.com/articles/
http://locklessinc.com/articles/locks/
it has some interesting information about spinlocks. Lockless inc has also other interesting articles about locks, you can check it out here.
http://locklessinc.com/articles/
Re: Types of spinlocks (I need a new type!)
Hi,
It's a "concurrency kit" that contains code for various types of locks, etc.
Cheers,
Brendan
That reminds me of this: http://concurrencykit.org/OSwhatever wrote:For those who are interested in this subject I've found that this page
http://locklessinc.com/articles/locks/
it has some interesting information about spinlocks. Lockless inc has also other interesting articles about locks, you can check it out here.
http://locklessinc.com/articles/
It's a "concurrency kit" that contains code for various types of locks, etc.
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.