Hello,
I know that the amd64 architecture has a strong memory ordering model, but a store can still be reordered after a later load.
So it basically means that I may only need a compiler barrier (particularly with gcc at -O2 or -O3) and an sfence in my code, right?
There is no need for lfence and mfence, since load/load and store/store are always kept in order?
Thanks
AMD64 -> intel memory ordering : question on fences
-
- Member
- Posts: 97
- Joined: Tue Mar 10, 2015 10:08 am
Re: AMD64 -> intel memory ordering : question on fences
Hi,
JulienDarc wrote:
I know that the amd64 architecture has a strong memory ordering model, but a store can still be reordered after a later load.
So it basically means that I may only need a compiler barrier (particularly with gcc at -O2 or -O3) and an sfence in my code, right?
There is no need for lfence and mfence, since load/load and store/store are always kept in order?
What you need (or don't need) depends on what your code does.
In general, code written in C uses "C abstract machine ordering", which is mostly weak ordering (unless the variable is 'volatile'), and has nothing to do with the underlying architecture.
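As a rough illustration of the compiler side of this, here is a minimal sketch assuming a C11 compiler. The names `compiler_barrier`, `publish_single_cpu`, `data` and `flag` are hypothetical and only serve the example. `atomic_signal_fence` restricts reordering by the compiler and issues no CPU fence instruction, so it plays the role of the compiler barrier mentioned in the question; it is not a replacement for sfence, lfence or mfence.

```c
#include <stdatomic.h>

/* Hypothetical variables for illustration only. */
static int data;
static int flag;

/* Compiler-only barrier: asks the compiler not to move memory
   accesses across this point. No CPU fence instruction is issued. */
static inline void compiler_barrier(void)
{
    atomic_signal_fence(memory_order_seq_cst);
}

/* Without the barrier, the C abstract machine allows the compiler
   to emit these two plain stores in either order. */
void publish_single_cpu(int value)
{
    data = value;
    compiler_barrier();
    flag = 1;
}
```

This only pins down the order in the generated code. Whether another CPU or a device observes the stores in that order is a separate question that depends on the hardware memory model.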
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Re: AMD64 -> intel memory ordering : question on fences
Thanks Brendan,
Yes, that's all right for the C part.
But as far as CPU fences/barriers are concerned, one should never use lfence/mfence on an amd64 arch, right?
https://en.wikipedia.org/wiki/Memory_or ... rd_pdf_6-0
Because they seem useless: load/load, store/store and load/store operations are strongly ordered, but store/load is not.
So I guess that on this arch one should only use sfence and forget about the others, as they are already implied.
Am I correct?
Re: AMD64 -> intel memory ordering : question on fences
Hi,
JulienDarc wrote:
Thanks Brendan,
Yes, that's all right for the C part.
But as far as CPU fences/barriers are concerned, one should never use lfence/mfence on an amd64 arch, right?
No. You should always use lfence/mfence in cases where they're necessary.
JulienDarc wrote:
Because they seem useless: load/load, store/store and load/store operations are strongly ordered, but store/load is not.
So I guess that on this arch one should only use sfence and forget about the others, as they are already implied.
Am I correct?
The default memory ordering on 80x86 is "write ordering with store forwarding". This is only the default. There are various instructions (e.g. non-temporal stores) that use weaker ordering, and there are caching types (e.g. write combining) that also weaken the ordering. In addition to that, the CPU does some (rare) things that ignore the memory ordering (e.g. setting the accessed and dirty flags in page table entries can happen "out of order").
Also, loads can happen in any order. This may create problems when there are multiple CPUs (or just one CPU and a device). For example, imagine if the first CPU does this:
Code:
    mov dword [data],123456 ;Set the data for the second CPU
    mov dword [flag],1      ;Tell the second CPU it can continue
And the second CPU does this:
Code:
.l1: cmp dword [flag],0     ;Can this CPU continue yet?
    je .l1                  ; no, keep waiting
    mov eax,[data]          ; yes, get the data
In this case the second CPU may read from "data" speculatively (before the first CPU has set "flag"), so the value in EAX may not be the value stored by the first CPU (123456). A fence (e.g. lfence) is needed to ensure the second load ("mov eax,[data]") doesn't happen before the earlier load says "flag" is non-zero.
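For readers working in C rather than assembly, the same flag/data handoff can be sketched with C11 atomics. This is an illustration under assumptions, not the code discussed in this thread: `publish` and `try_consume` are hypothetical names, and the release/acquire pair states the ordering requirement in the source and leaves it to the compiler to emit whatever fence instructions, if any, the target needs.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int data;              /* plain payload, like [data] above */
static atomic_int flag = 0;   /* like [flag] above */

/* First CPU: write the payload, then raise the flag.
   The release store keeps the earlier write to data ahead of it. */
void publish(int value)
{
    data = value;
    atomic_store_explicit(&flag, 1, memory_order_release);
}

/* Second CPU: one polling step. Returns true and fills *out once
   the flag is set. The acquire load keeps the later read of data
   behind it. */
bool try_consume(int *out)
{
    if (atomic_load_explicit(&flag, memory_order_acquire) == 0)
        return false;
    *out = data;
    return true;
}
```

A consumer thread would call `try_consume` in a loop until it returns true, mirroring the `.l1` spin loop, while the producer thread calls `publish(123456)` once.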
Cheers,
Brendan
Re: AMD64 -> intel memory ordering : question on fences
Glad you are here, Brendan.
I would have run into some major bugs without your help.
Thanks for your clarifications,
julien