gcc optimization for statement re-ordering

bluemoon
Member
Posts: 1761
Joined: Wed Dec 01, 2010 3:41 am
Location: Hong Kong

Post by bluemoon »

Hi,

I'm now working on code to initialize the MMU handler (I did it in assembly in my 32-bit kernel without problems, but now I want to do it in C), and I have something like this:

Code: Select all

    // add page structures for foo_ptr to SOME_ADDRESS

    // flush CR3
    _MOVCR3 (KADDR_PMA(k_PML4E));

    //kprintf ("foo\n");
    uint64_t* foo_ptr = (uint64_t*) SOME_ADDRESS;
    foo_ptr[0] = 1;
    kprintf ("bar\n");

If I don't put kprintf("foo") there, the write through foo_ptr is moved before the CR3 flush and therefore generates a #PF.
If I do put kprintf("foo") there, gcc emits the statements in the expected order and the write succeeds.


So, what's the proper way to ensure the code dependency is respected and to avoid overly aggressive optimization?

Post by bluemoon »

I'm not sure if it's the best way, but for now I'm putting this after flushing CR3 and it seems to work.

Code: Select all

__asm volatile("mfence":::"memory");
But still, what's the proper way to do it?
Rudster816
Member
Posts: 141
Joined: Thu Jun 17, 2010 2:36 am

Post by Rudster816 »

bluemoon wrote: I'm not sure if it's the best way, but for now I'm putting this after flushing CR3 and it seems to work.

Code: Select all

__asm volatile("mfence":::"memory");
But still, what's the proper way to do it?
Just do a normal function call to a stub that reads/writes CR3. You're already flushing the TLB, so performance isn't an issue, and GCC can't move the function call around, even if it decides to inline it by itself.

Post by bluemoon »

I figured it out. I need to tell gcc to clobber "memory" after flushing CR3.
JamesM
Member
Posts: 2935
Joined: Tue Jul 10, 2007 5:27 am
Location: York, United Kingdom

Post by JamesM »

bluemoon wrote: I figured it out. I need to tell gcc to clobber "memory" after flushing CR3.
Indeed, that is the correct way.
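
For readers following along, here is a minimal sketch of what such a clobber can look like, assuming x86-64 and GCC-style inline asm. The names `compiler_barrier` and `write_cr3` are hypothetical; the kernel in this thread uses its own `_MOVCR3` macro, whose definition is not shown. The `"memory"` clobber tells GCC that the asm may read or write arbitrary memory, so it must not move loads or stores across it or keep memory values cached in registers.

```c
#include <stdint.h>

/* Hypothetical helper: compiler-only barrier. It emits no instruction;
 * the "memory" clobber alone stops GCC from moving memory accesses
 * across this point. */
static inline void compiler_barrier(void)
{
    __asm__ __volatile__("" ::: "memory");
}

/* Hypothetical helper: load CR3 with the physical address of the PML4.
 * x86-64, ring 0 only; executing this in user mode faults.
 * The "memory" clobber keeps later stores (such as foo_ptr[0] = 1)
 * from being scheduled ahead of the page-table switch. */
static inline void write_cr3(uint64_t pml4_phys)
{
    __asm__ __volatile__("mov %0, %%cr3" : : "r"(pml4_phys) : "memory");
}
```

The problem described in the first post is compile-time reordering, which the clobber addresses on its own, so under these assumptions a store written after `write_cr3(...)` stays after it in the generated code.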

Post by bluemoon »

I see. I think the same applies to invlpg, and I have updated the wiki example to include the "memory" clobber.
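
As an illustration only, an `invlpg` wrapper with a `"memory"` clobber could look like the sketch below. It assumes x86 and GCC-style inline asm; the function name and signature are hypothetical and are not claimed to match the wiki's exact text.

```c
/* Hypothetical helper: invalidate the TLB entry for one virtual address.
 * x86 only, ring 0 only; executing this in user mode faults.
 * The "memory" clobber keeps GCC from moving loads or stores across
 * the invalidation. */
static inline void invlpg(void *vaddr)
{
    __asm__ __volatile__("invlpg (%0)" : : "r"(vaddr) : "memory");
}
```

The address is passed in a register and dereferenced in the asm template, so the compiler sees no specific memory operand; the blanket clobber is what orders it against surrounding accesses.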

Post by JamesM »

bluemoon wrote: I see. I think the same applies to invlpg, and I have updated the wiki example to include the "memory" clobber.
Indeed. Note that you will only see this when macro expansion, inline functions, or whole-program optimization places the inline asm in a function that uses the resulting memory; all (opaque) function calls are assumed to clobber memory.
Owen
Member
Posts: 1700
Joined: Fri Jun 13, 2008 3:21 pm
Location: Cambridge, United Kingdom

Post by Owen »

bluemoon wrote: I see. I think the same applies to invlpg, and I have updated the wiki example to include the "memory" clobber.
For invlpg, the optimal solution is something like:

Code: Select all

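static inline void invlpg(void* page)
{
    /* View the target as one page-sized object so the operand covers all of it. */
    struct pagesized { char _[4096]; } *pPage = (struct pagesized*) page;

    /* pPage is a pointer, so it must be dereferenced to form the memory operand.
       "+m" (read-write) rather than "=m" so that earlier stores to the page
       are not treated as dead. */
    __asm__ __volatile__("invlpg %0" : "+m"(*pPage));
}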
That gives GCC a precise clobber - i.e. it knows that anything which could possibly overlap should be reloaded, but anything which provably doesn't won't be.

(This code is untested. Consult the GCC manual; it contains a similar demonstration)
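
The precise-operand idea can also be tried in user space, where `invlpg` itself would fault. The sketch below is an illustration under stated assumptions, not code from this thread: `touch_page` and `PAGE_BYTES` are hypothetical names, and the asm body is empty, so it only changes what the compiler is allowed to assume. The `"+m"` operand tells GCC that this one page-sized object may be read and modified, without a blanket `"memory"` clobber.

```c
#define PAGE_BYTES 4096  /* assumed page size, matching the 4096 used above */

struct pagesized { char bytes[PAGE_BYTES]; };

/* Hypothetical helper: empty asm whose only effect is on the optimizer.
 * GCC must assume the object at `page` is read and rewritten here, but it
 * may still keep values from provably unrelated memory in registers. */
static inline void touch_page(void *page)
{
    struct pagesized *p = (struct pagesized *) page;
    __asm__ __volatile__("" : "+m"(*p));
}
```

Because the operand is read-write, stores made to the page before the call are kept, and loads from it afterwards are reissued rather than reused from registers.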