Two ways of flushing TLB in IA32 and two different behaviors

giovanig
Member
Posts: 29
Joined: Tue Mar 13, 2012 12:03 pm

Post by giovanig »

Hi guys,

IA-32 processors offer two ways of flushing the TLB. One is reloading CR3 with its own value:
ASMV("movl %cr3,%eax");
ASMV("movl %eax,%cr3");
and the other is the invlpg instruction:
ASMV("invlpg %0" : : "m"(addr));
I am remapping some logical pages in the page table to different physical pages. After remapping, I must flush the TLB.

My test application allocates 10 char arrays of 4 KB each, remaps their physical pages (it is a page coloring mechanism), and initializes the first element of each array with 'A' + (array index). Then it prints the first character.

When I use the first method of flushing the TLB, the character printed is not 'A' but some garbage character. On the other hand, when I use the invlpg instruction, it correctly prints 'A'.

Does anyone have any clue about this behavior? The processor is an Intel i7-2600.

Best regards,
Giovani
NickJohnson
Member
Posts: 1249
Joined: Tue Mar 24, 2009 8:11 pm
Location: Sunnyvale, California

Post by NickJohnson »

Are you properly alerting the compiler that eax is being trashed by your inline assembly?
giovanig
Member
Posts: 29
Joined: Tue Mar 13, 2012 12:03 pm

Post by giovanig »

NickJohnson wrote: Are you properly alerting the compiler that eax is being trashed by your inline assembly?
What do you mean by trashing eax? ASMV is a macro for __asm__ __volatile__. The function that flushes the TLB is the following:

Code: Select all

#define ASMV  __asm__ __volatile__
static void flush_tlb() {
        ASMV("movl %cr3,%eax");
        ASMV("movl %eax,%cr3");
}
This is what Intel's manual says. Maybe the problem is related to my remapping mechanism, which is described here: http://forum.osdev.org/viewtopic.php?f=1&t=25496
jbemmel
Member
Posts: 53
Joined: Fri May 11, 2012 11:54 am

Post by jbemmel »

Are you setting the "global" bit in your PTEs?
bluemoon
Member
Posts: 1761
Joined: Wed Dec 01, 2010 3:41 am
Location: Hong Kong

Post by bluemoon »

giovanig wrote:

Code: Select all

ASMV("movl %cr3,%eax");
ASMV("movl %eax,%cr3");
and using invlpg instruction:

Code: Select all

ASMV("invlpg %0" : : "m"(addr));
The syntax looks like you're using gcc.
As noted by NickJohnson, you need to tell the compiler that you clobber eax in the first snippet.
Furthermore, you need to tell the compiler that you clobber memory in both snippets.
Check the gcc manual for the exact syntax of the clobber list.
Owen
Member
Posts: 1700
Joined: Fri Jun 13, 2008 3:21 pm
Location: Cambridge, United Kingdom

Post by Owen »

  • Firstly, in fact, that whole first code segment is invalid. It is entirely legal for the compiler to have emitted something like this...

    Code: Select all

    #line "yourfile.c" 23
        mov %cr3, %eax
    #line /*Mark this as a compiler generated section */
        mov 8(%esp), %eax
    #line "yourfile.c" 24
        mov %eax, %cr3
    
  • Secondly, assuming "addr" is a pointer, the "m"(addr) specifier will generate a reference to the address of that pointer. You probably wanted "m"(*addr)
  • Finally, in neither case are you informing the compiler that a region of memory is (from its point of view) being modified. You have two options here:
    • You can use the "memory" clobber, in which case the compiler will assume that all memory could have changed and reload all pointers
    • You can do something like

      Code: Select all

      struct PageSize { char contents[4096]; };
      ...
      PageSize* p = (PageSize*) addr;
      asm volatile("invlpg %0" : "+m"(*p));
      
      The "+m" specifier tells GCC that the inline assembly both "reads and writes to" *p. Because *p is 4096 bytes in size, it assumes that whole region is/may be modified, and so this will ensure that writes and reads are not incorrectly reorganized.
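Putting the three points above together, a minimal sketch of both flush routines might look like this. It assumes GCC or Clang targeting x86, code running in ring 0 (both instructions are privileged), and that addr is a plain address somewhere inside the page to invalidate. The names struct page, flush_tlb_all and flush_tlb_page are made up for illustration and are not taken from the original poster's kernel.

```c
#include <assert.h>

/* Hypothetical helper type: one 4 KB page, used only to size the "m" operand. */
struct page { char contents[4096]; };
static_assert(sizeof(struct page) == 4096, "struct page must cover one 4 KB page");

#if defined(__i386__) || defined(__x86_64__)
/* Reload CR3 with its own value. The read and the write live in ONE asm
 * statement and the compiler picks the scratch register, so nothing can be
 * scheduled between the two moves. This does not evict TLB entries that
 * were created from PTEs with the global bit set. */
static inline void flush_tlb_all(void)
{
    unsigned long cr3;
    __asm__ __volatile__("mov %%cr3, %0\n\t"
                         "mov %0, %%cr3"
                         : "=r"(cr3)
                         :
                         : "memory");
}

/* Invalidate the TLB entry for the page that contains addr. The operand is
 * the memory addr points AT, not the variable holding the address. */
static inline void flush_tlb_page(void *addr)
{
    __asm__ __volatile__("invlpg %0"
                         :
                         : "m"(*(struct page *)addr)
                         : "memory");
}
#endif
```

If the address is kept in an integer variable rather than a pointer, casting it to `void *` at the call site keeps the "m" operand pointing at the page itself.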
giovanig
Member
Posts: 29
Joined: Tue Mar 13, 2012 12:03 pm

Post by giovanig »

Thank you guys for the answers. I made some changes as you pointed out, but the problem remains.
jbemmel wrote: Are you setting the "global" bit in your PTEs?
Yes. A PTE has the following bits set: 0 (present), 1 (read/write), 2 (user access), 5 (accessed), and 8 (global).
bluemoon wrote: As noted by NickJohnson, you need to tell compiler you clobber eax on first code. Furthermore you need to tell compiler you clobber memory in both code.
I changed the flush_tlb code to:

Code: Select all

ASMV("movl %%cr3, %%eax \n"
          "movl %%eax, %%cr3 \n"
           : : : "eax", "memory");
Running the system with this code, I get the same result as before: a wrong character is printed instead of 'A'. The strange thing is that if I print the remaining 9 characters, they are correct.
Owen wrote: Secondly, assuming "addr" is a pointer, the "m"(addr) specifier will generate a reference to the address of that pointer. You probably wanted "m"(*addr)
addr already holds the logical address of a page; it is not a pointer.
Owen wrote: The "+m" specifier tells GCC that the inline assembly both "reads and writes to" *p. Because *p is 4096 bytes in size, it assumes that whole region is/may be modified, and so this will ensure that writes and reads are not incorrectly reorganized.
I tried to change the ASMV accordingly, but g++ complained about the "+m", maybe because addr is not a pointer. So I just added "memory" to the clobber list, to keep the optimizer from reordering accesses across invlpg:

Code: Select all

ASMV("invlpg %0" : : "m"(addr) : "memory");
I am not sure if invlpg is really invalidating a TLB entry, since it behaves differently from flush_tlb.
giovanig
Member
Posts: 29
Joined: Tue Mar 13, 2012 12:03 pm

Post by giovanig »

I made another test to understand what is happening. I print the value of the first character in the loop that allocates memory:

Code: Select all

char *array[10];
for(int i = 0; i < SIZE; i++) {
    if(i == 0) {
        array[i] = new (COLOR_2) char[4092]; // allocates 4096 bytes (the heap list uses 4 more bytes) and remaps a page
        array[i][0] = 'A' + i;
        cout << array[i][0]; // prints 'A', correct
    } else {
        array[i] = new (COLOR_2) char[4092];
        array[i][0] = 'A' + i;
        cout << " " << array[0][0]; // prints 'A' 8 times, but the last one is wrong
    }
}
For some reason, the 10th time the code remaps a page, the value of 'A' changes.
bluemoon
Member
Posts: 1761
Joined: Wed Dec 01, 2010 3:41 am
Location: Hong Kong

Post by bluemoon »

giovanig wrote: For some reason, in the 10th time that the code remaps a page, the value of 'A' changes.
And what did it change to? 'A'+i or random garbage?
It looks like a bug in the heap allocator. If you are not sure, replace your page invalidation function with a hardcoded "mov cr3" and see if the problem is still there.
giovanig
Member
Posts: 29
Joined: Tue Mar 13, 2012 12:03 pm

Post by giovanig »

Hi,
bluemoon wrote: And what did it change to? 'A'+i or random garbage?
It changes to random garbage. When "invlpg" is used, this does not happen.
Owen
Member
Posts: 1700
Joined: Fri Jun 13, 2008 3:21 pm
Location: Cambridge, United Kingdom

Post by Owen »

giovanig wrote: Thank you guys for the answers. I made some changes as you pointed out, but the problem remains.
jbemmel wrote: Are you setting the "global" bit in your PTEs?
Yes. A PTE has the following bits set: 0 (present), 1 (read/write), 2 (user access), 5 (accessed), and 8 (global).
Hang on...

you have the global bit set... and you're still wondering why reloading CR3 isn't causing that page to be flushed from the TLB?

Go and reread the paging section of the manual (in particular the definition of the global bit) again, please...
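For reference, the bit list quoted above can be written out as a small sketch of the flag layout. The bit positions are the ones given in the post for a 32-bit entry mapping a 4 KB page; the PTE_* names and both helper functions are invented for this example, and the "survives" check assumes CR4.PGE is enabled, since the global bit has no effect otherwise.

```c
#include <stdint.h>

/* Flag bits of a 32-bit page-table entry for a 4 KB page. */
#define PTE_PRESENT  (1u << 0)
#define PTE_RW       (1u << 1)
#define PTE_USER     (1u << 2)
#define PTE_ACCESSED (1u << 5)
#define PTE_GLOBAL   (1u << 8)

/* The flag combination reported in the thread: bits 0, 1, 2, 5 and 8. */
#define PTE_FLAGS_FROM_POST \
    (PTE_PRESENT | PTE_RW | PTE_USER | PTE_ACCESSED | PTE_GLOBAL)

/* Hypothetical helper: return the entry with the global bit cleared. */
static inline uint32_t pte_clear_global(uint32_t pte)
{
    return pte & ~(uint32_t)PTE_GLOBAL;
}

/* Returns 1 if a TLB entry created from this PTE is kept across a plain
 * CR3 reload (assuming CR4.PGE is enabled), 0 otherwise. */
static inline int pte_survives_cr3_reload(uint32_t pte)
{
    return (pte & PTE_PRESENT) && (pte & PTE_GLOBAL);
}
```

With these definitions the flags from the post come out as 0x127, and clearing the global bit leaves 0x027.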
jbemmel
Member
Posts: 53
Joined: Fri May 11, 2012 11:54 am

Post by jbemmel »

giovanig wrote:Yes. A PTE has the following bits set: 0 (present), 1 (read/write), 2 (user access), 5 (accessed), and 8 (global).
Well, then that explains why your mov from/to cr3 does not flush these mappings.

"Global" means "do not flush unless explicitly requested through invlpg". So if you want these pages to be flushed when you change cr3, you should clear the global bit in their PTEs
giovanig
Member
Posts: 29
Joined: Tue Mar 13, 2012 12:03 pm

Post by giovanig »

Well, a quick test with the GLOBAL bit removed did not change anything. I will investigate the memory allocator to see whether there is a bug there.
Combuster
Member
Posts: 9301
Joined: Wed Oct 18, 2006 3:45 am
Libera.chat IRC: [com]buster
Location: On the balcony, where I can actually keep 1½m distance

Post by Combuster »

And changing the G bit in memory doesn't change the G bit in the TLB, which means still no flush...
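One way to get rid of stale global entries without issuing invlpg for every page is to toggle CR4.PGE, because the Intel manuals state that changing that bit invalidates all TLB entries, global ones included. A minimal sketch, assuming GCC or Clang on x86, ring 0, CR4.PGE already enabled, and interrupts disabled by the caller; flush_tlb_global is a made-up name:

```c
#define CR4_PGE (1UL << 7)   /* CR4 bit 7: page global enable */

#if defined(__i386__) || defined(__x86_64__)
static inline void flush_tlb_global(void)
{
    unsigned long cr4;

    /* Read the current CR4 value into a C variable. */
    __asm__ __volatile__("mov %%cr4, %0" : "=r"(cr4));
    /* Clearing PGE flushes every TLB entry, including global ones. */
    __asm__ __volatile__("mov %0, %%cr4" : : "r"(cr4 & ~CR4_PGE) : "memory");
    /* Write the original value back to re-enable global pages. */
    __asm__ __volatile__("mov %0, %%cr4" : : "r"(cr4) : "memory");
}
#endif
```

Splitting this into several asm statements is safe here because the value travels through a C variable with proper constraints, not through a hard-coded register the compiler does not know about.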
"Certainly avoid yourself. He is a newbie and might not realize it. You'll hate his code deeply a few years down the road." - Sortie
[ My OS ] [ VDisk/SFS ]
sv75
Posts: 1
Joined: Sat Mar 23, 2013 9:06 am

Post by sv75 »

(deleted)
Last edited by sv75 on Sat Mar 23, 2013 11:46 am, edited 1 time in total.