gcc optimization for statement re-ordering
Posted: Sat Apr 28, 2012 11:53 am
by bluemoon
Hi,
I'm now working on the code that initializes my MMU handler (I did this in assembly for my 32-bit kernel without problems, but now I want to do it in C),
and I have something like this:
Code: Select all
// add page structures for foo_ptr to SOME_ADDRESS
// flush CR3
_MOVCR3 (KADDR_PMA(k_PML4E));
//kprintf ("foo\n");
uint64_t* foo_ptr = (uint64_t*) SOME_ADDRESS;
foo_ptr[0] = 1;
kprintf ("bar\n");
If I don't put kprintf("foo") there, gcc moves the write through foo_ptr before the CR3 flush, which generates a #PF.
If I put kprintf("foo") there, gcc emits the statements in the expected order and the write succeeds.
So, what's the proper way to make sure gcc respects this ordering dependency and avoid overly aggressive optimization?
Re: gcc optimization for statement re-ordering
Posted: Sat Apr 28, 2012 12:04 pm
by bluemoon
I'm not sure if it's the best way, but for now I'm putting this after flushing CR3 and it seems to work.
Code: Select all
__asm volatile("mfence":::"memory");
But still, what's the proper way to do it?
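A note on why that workaround has an effect: as far as I can tell, the part that matters to gcc is the "memory" clobber, not the mfence instruction itself. A mov to CR3 is already a serializing instruction on x86, so the reordering here happens in the compiler, not in the CPU. The sketch below shows a compiler-only barrier with the same clobber and no instruction at all. The names barrier and store_after_barrier are made up for illustration, and the function is a user-space stand-in for "flush CR3, then touch the new mapping".

```c
#include <stdint.h>

/* Compiler-only barrier: emits no instruction. The "memory" clobber alone
 * tells gcc that memory may have changed, so it will not move loads or
 * stores across this point. */
#define barrier() __asm__ __volatile__("" : : : "memory")

/* Hypothetical stand-in for "flush CR3, then write to the new mapping". */
static inline uint64_t store_after_barrier(uint64_t *p)
{
    barrier();   /* the store below cannot be hoisted above this line */
    p[0] = 1;
    return p[0];
}
```

This only constrains the compiler; it is not a substitute for a real fence where CPU-level ordering between cores matters.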
Re: gcc optimization for statement re-ordering
Posted: Sat Apr 28, 2012 12:40 pm
by Rudster816
bluemoon wrote:I'm not sure if it's the best way, but for now I'm putting this after flushing CR3 and it seems to work.
Code: Select all
__asm volatile("mfence":::"memory");
But still, what's the proper way to do it?
Just do a normal function call to a stub that reads/writes CR3. You're already flushing the TLB, so performance isn't an issue, and GCC can't move the function call around, even if it decides to inline it by itself.
Re: gcc optimization for statement re-ordering
Posted: Sat Apr 28, 2012 12:52 pm
by bluemoon
I figured it out. I need to tell gcc to clobber "memory" after flushing CR3.
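Roughly, the CR3 write ends up looking like the sketch below. This is not my actual _MOVCR3 macro: write_cr3, remap_and_store and the KERNEL_BUILD macro are names invented for this example. KERNEL_BUILD is assumed to be defined only when compiling ring-0 code; without it the privileged instruction is replaced by an empty asm with the same clobber, so the snippet can also be compiled and run in user space.

```c
#include <stdint.h>

/* Load CR3 with the physical address of the top-level page table.
 * The "memory" clobber tells gcc the asm may read or write any memory,
 * so no load or store is moved across it. */
static inline void write_cr3(uintptr_t pml4_phys)
{
#if defined(KERNEL_BUILD) && (defined(__x86_64__) || defined(__i386__))
    __asm__ __volatile__("mov %0, %%cr3" : : "r"(pml4_phys) : "memory");
#else
    /* User-space fallback (assumption for testing only): no instruction,
     * but the same constraint on the compiler. */
    __asm__ __volatile__("" : : "r"(pml4_phys) : "memory");
#endif
}

/* Mirrors the sequence from the first post: switch page tables, then
 * write through a pointer that is only valid under the new mapping. */
static inline uint64_t remap_and_store(uintptr_t pml4_phys, uint64_t *foo_ptr)
{
    write_cr3(pml4_phys);
    foo_ptr[0] = 1;      /* stays after the CR3 write */
    return foo_ptr[0];
}
```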
Re: gcc optimization for statement re-ordering
Posted: Sun Apr 29, 2012 10:46 am
by JamesM
bluemoon wrote:I figured it out. I need to tell gcc to clobber "memory" after flushing CR3.
Indeed, that is the correct way.
Re: gcc optimization for statement re-ordering
Posted: Sun Apr 29, 2012 11:40 am
by bluemoon
I see. I think the same applies to invlpg, and I have updated the wiki example to include the "memory" clobber.
Re: gcc optimization for statement re-ordering
Posted: Sun Apr 29, 2012 1:25 pm
by JamesM
bluemoon wrote:I see. I think the same applies to invlpg, and I have updated the wiki example to include the "memory" clobber.
Indeed - note that you will only see this if you use macro expansion, inline functions or whole-program optimization to pull the inline asm into a function that uses the resulting memory - all (opaque) function calls are assumed to clobber memory.
Re: gcc optimization for statement re-ordering
Posted: Sun Apr 29, 2012 3:58 pm
by Owen
bluemoon wrote:I see. I think the same applies to invlpg, and I have updated the wiki example to include the "memory" clobber.
For invlpg, the optimal solution is something like
Code: Select all
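static inline void invlpg(void* page)
{
    /* View the target as one page-sized object so the operand covers the whole page */
    struct pagesized { char _[4096]; } *pPage = (struct pagesized*) page;
    /* pPage is a pointer, so the memory operand is the object it points to */
    __asm__ __volatile__("invlpg %0" : "=m"(*pPage));
}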
That gives GCC a precise clobber - i.e. it knows that anything which could possibly overlap should be reloaded, but anything which provably doesn't won't be.
(This code is untested. Consult the GCC manual; it contains a similar demonstration)
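To see the precise-operand idea without needing ring 0, here is a small user-space sketch. It uses an empty asm in place of invlpg, and it uses "+m" (read and write) rather than "=m"; that choice is mine, on the assumption that an output-only operand could let GCC treat earlier stores to the page as dead. The names page_barrier and touch_page are hypothetical.

```c
#include <stdint.h>

struct pagesized { char bytes[4096]; };

/* Empty asm that claims to read and write exactly one page-sized object.
 * GCC has to complete earlier stores to that page and reload from it
 * afterwards, but values it can prove do not overlap may stay cached
 * in registers. */
static inline void page_barrier(void *page)
{
    struct pagesized *p = (struct pagesized *)page;
    __asm__ __volatile__("" : "+m"(*p));
}

/* Store, barrier, reload: the load after the barrier is not folded
 * into the constant by the compiler, it is read back from memory. */
static inline int touch_page(char *page)
{
    page[0] = 42;
    page_barrier(page);
    return page[0];
}
```

Compared with a blanket "memory" clobber, this only invalidates what GCC knows about that one page.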