
Disable cache in specific pages on Intel i7

Posted: Wed Mar 21, 2012 3:02 pm
by giovanig
Hello guys,

I am implementing a memory allocator that is aware of how the memory it hands out will be used. The idea is to pass an annotation to the allocator, and based on this annotation it can disable caching for the newly allocated memory. Something like this (I am using our own OS):

Code: Select all

unsigned int *data = new (ALLOC_WR) unsigned int[20];

ALLOC_WR says that this data will be both written and read, so the allocator disables caching for it by changing the page flags. This is done by calling another method, inside the MMU:

Code: Select all

//receives the new address (addr), already allocated
static void change_page_flags(Log_Addr addr, int size, IA32_Flags flags) {
    //gets the current page directory by reading the CR3 register
    Page_Directory * pd = current();

    //gets the current page table, accessing the Page directory
    //directory() makes a shift >> of 22 bits in the received address
    Page_Table * pt = (*pd)[directory(addr)];

    //access the specific address page and set the PCD flag (disable cache)
    //PCD = 0x010
    (*pt)[page(addr)] = (*pt)[page(addr)] | (IA32_Flags::PCD);

    //flushes TLB to make the change
    flush_tlb();
}

The flush_tlb() code is:

Code: Select all

static void flush_tlb() {
    ASMV("movl %cr3,%eax");
    ASMV("movl %eax,%cr3");
}
The problem is weird: the system restarts when it flushes the TLB. But if I add a "cout" that prints the page table entry (cout << (*pt)[page(addr)]) before or after changing the flags, the system does not restart, but the cache is not disabled either.

I also tried another way to flush the TLB, using the invlpg instruction, with no success:

Code: Select all

static void flush_tlb(Log_Addr addr) {
	ASMV("invlpg %0" : : "m"(addr));
}
I do not know what is going on here. Does anyone know of something I can try to avoid the restart?

Thanks in advance,
Giovani
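
A note on the `new (ALLOC_WR)` form used in the post above: C++ forwards extra arguments in that position to an overloaded `operator new[]`. The following is a minimal user-space sketch of that mechanism, not the poster's OS code. `Alloc_Hint`, `ALLOC_DEFAULT`, `last_hint`, `last_size` and `make_annotated_array` are hypothetical names, and `std::malloc` stands in for the special heap.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical annotation type; the thread only shows the name ALLOC_WR.
enum Alloc_Hint { ALLOC_DEFAULT, ALLOC_WR };

// Hypothetical bookkeeping so the effect of the annotation can be observed.
static Alloc_Hint last_hint = ALLOC_DEFAULT;
static std::size_t last_size = 0;

// `new (ALLOC_WR) T[n]` calls this overload: the byte count comes first,
// followed by the arguments written between the parentheses.
void* operator new[](std::size_t bytes, Alloc_Hint hint) {
    void* p = std::malloc(bytes);   // stand-in for the special heap
    if (!p) throw std::bad_alloc();
    last_hint = hint;
    last_size = bytes;
    // A kernel allocator would change the page flags for [p, p + bytes) here.
    return p;
}

// Matching placement delete; the compiler calls it only if construction throws.
void operator delete[](void* p, Alloc_Hint) noexcept { std::free(p); }

// Same allocation as in the post, wrapped in a function.
unsigned int* make_annotated_array() {
    return new (ALLOC_WR) unsigned int[20];
}
```

Because page flags apply to whole 4 KiB pages, an allocator like this would presumably need to hand out page-aligned, page-sized regions so that unrelated data does not end up uncached as well.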

Re: Disable cache in specific pages on Intel i7

Posted: Wed Mar 21, 2012 3:42 pm
by Nable
A bit off-topic, but I'll just note that disabling the cache dramatically degrades performance (dynamic memory has a long latency and is much slower than the static memory inside the CPU). Cache disabling is normally used only for MMIO, such as the APIC or PCI devices' registers.

CPUs use a special kind of cache synchronization protocol (http://en.wikipedia.org/wiki/MESI_protocol). You can also use memory fences and other interesting things if you want to sync data between multiple cores, and prefetch instructions to preload data that you expect to be very heavily used.

Closer to this topic: "the system does not restart, but it does not disable the cache either" - how do you determine that the cache is not disabled?

Re: Disable cache in specific pages on Intel i7

Posted: Wed Mar 21, 2012 3:55 pm
by giovanig
Hi Nable,

Nable wrote:
Closer to this topic: "the system does not restart, but it does not disable the cache either" - how do you determine that the cache is not disabled?

That is exactly what I am measuring. I have an application composed of two threads, each running on a different core. Each thread reads and writes the same locations in two arrays, over thousands of repetitions. I want to disable caching for those arrays to keep the cache coherency protocol from acting, since a bus snoop costs about as much time as an access to main memory.

There are three versions of this application: sequential (two threads running on the same core, one after the other), parallel sharing data, and parallel not sharing data. I can tell whether the cache is disabled by measuring the execution time of these applications.

When I disabled the cache for the parallel version with data sharing on a quad-core machine, it ran faster than with the cache enabled.

I now want to run some tests on the i7 machine, which uses the MESIF protocol instead of MESI.

Thanks,
Giovani

Re: Disable cache in specific pages on Intel i7

Posted: Wed Mar 21, 2012 4:43 pm
by Nable
Stupid question, but what about the IA32_Flags::PWT bit, and the MTRR and PAT mappings? It looks like (chapter 11.5.2 of IASDM, vol. 3A) their values also have some effect.

Also, maybe if you rewrite your program to avoid such aggressive memory sharing, you'll achieve MUCH better performance.

Re: Disable cache in specific pages on Intel i7

Posted: Thu Mar 22, 2012 2:32 am
by bluemoon
giovanig wrote:

Code: Select all

//receives the new address (addr), already allocated
static void change_page_flags(Log_Addr addr, int size, IA32_Flags flags) {
    //gets the current page directory by reading the CR3 register
    Page_Directory * pd = current();

    //gets the current page table, accessing the Page directory
    //directory() makes a shift >> of 22 bits in the received address
    Page_Table * pt = (*pd)[directory(addr)];

    //access the specific address page and set the PCD flag (disable cache)
    //PCD = 0x010
    (*pt)[page(addr)] = (*pt)[page(addr)] | (IA32_Flags::PCD);

    //flushes TLB to make the change
    flush_tlb();
}
(*pd)[directory(addr)] is the physical address PLUS the flags; you need to mask it, and it won't work unless it's in an identity-mapped region.
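
A minimal sketch of the masking described here, assuming 32-bit paging with 4 KiB pages, where the low 12 bits of a directory entry hold flags and the upper 20 bits hold the page table's physical frame. `frame_address`, `entry_flags` and `phys_to_virt` are hypothetical helpers, and the fixed-offset mapping in `phys_to_virt` is only one possible kernel layout (an offset of 0 corresponds to identity mapping).

```cpp
#include <cstdint>

// Low 12 bits of a page-directory or page-table entry are flag bits.
constexpr std::uint32_t FLAGS_MASK = 0xFFFu;

// Physical base address of the frame the entry points to.
constexpr std::uint32_t frame_address(std::uint32_t entry) {
    return entry & ~FLAGS_MASK;
}

// Flag bits only (present, read/write, user, PWT, PCD, ...).
constexpr std::uint32_t entry_flags(std::uint32_t entry) {
    return entry & FLAGS_MASK;
}

// Hypothetical translation for a kernel that maps all physical memory at a
// fixed virtual offset; with offset == 0 this is an identity mapping.
constexpr std::uint32_t phys_to_virt(std::uint32_t phys, std::uint32_t offset) {
    return phys + offset;
}

// Made-up entry value for illustration: frame 0x00345000, flags 0x027.
static_assert(frame_address(0x00345027u) == 0x00345000u, "frame bits");
static_assert(entry_flags(0x00345027u) == 0x027u, "flag bits");
```

Only the result of `phys_to_virt(frame_address(entry), offset)` is safe to dereference as a page table; the raw entry is neither flag-free nor, in general, a virtual address.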

Re: Disable cache in specific pages on Intel i7

Posted: Thu Mar 22, 2012 9:27 am
by giovanig
Hello guys, thank you for replying.

Nable wrote:
what about the IA32_Flags::PWT bit, and the MTRR and PAT mappings? It looks like (chapter 11.5.2 of IASDM, vol. 3A) their values also have some effect.

I will take a look at these flags.

Nable wrote:
Also, maybe if you rewrite your program to avoid such aggressive memory sharing, you'll achieve MUCH better performance.

I want exactly this behavior, to stress the bus and the cache coherency protocol. Let's say it is a kind of benchmark.

bluemoon wrote:
(*pd)[directory(addr)] is the physical address PLUS the flags; you need to mask it, and it won't work unless it's in an identity-mapped region.

I printed the received addr value, directory(addr), page(addr) and (*pt)[page(addr)] before and after setting the PCD flag. Here is the output:

Code: Select all

addr = 0x017fffc0;
directory(addr) = 0x05;
page(addr) = 0x3ff; //1023
(*pt)[ page(addr) ] = 0x80902701; //before setting PCD flag, PCD=0x10
(*pt)[ page(addr) ]  = (*pt)[ page(addr) ]  | IA32::PCD; //set the PCD flag
(*pt)[ page(addr) ] = 0x07ffc000;
It seems that the address+flags value of (*pt)[ page(addr) ] before setting PCD is right. The expected output would be 0x80902711, that is, with only the 0x10 bit set. Instead, the output is 0x07ffc000.
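
The numbers in this output can be checked in isolation. In the sketch below, `directory()` follows the description in the thread (a right shift by 22 bits); the definition of `page()` as bits 12 to 21 is an assumption based on standard 32-bit paging with 4 KiB pages, and `with_pcd` is a hypothetical helper.

```cpp
#include <cstdint>

constexpr std::uint32_t PCD = 0x10u;  // bit 4, as stated in the post

// Index into the page directory: top 10 bits of the address.
constexpr std::uint32_t directory(std::uint32_t addr) { return addr >> 22; }

// Index into the page table: next 10 bits (assumed definition).
constexpr std::uint32_t page(std::uint32_t addr) { return (addr >> 12) & 0x3FFu; }

// Entry value after setting only the cache-disable bit.
constexpr std::uint32_t with_pcd(std::uint32_t pte) { return pte | PCD; }

// Values taken from the output above.
static_assert(directory(0x017fffc0u) == 0x05u, "directory index");
static_assert(page(0x017fffc0u) == 0x3ffu, "page index");
static_assert(with_pcd(0x80902701u) == 0x80902711u, "expected entry");
```

The indices match the printed ones, so the surprising value most likely comes from where the entry is read and written rather than from the index arithmetic.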

Is that what you mean, bluemoon? If so, could you give an example?

Re: Disable cache in specific pages on Intel i7

Posted: Thu Mar 22, 2012 9:38 am
by bluemoon
giovanig wrote:

Code: Select all

addr = 0x017fffc0;
directory(addr) = 0x05;
page(addr) = 0x3ff; //1023
(*pt)[ page(addr) ] = 0x80902701; //before setting PCD flag, PCD=0x10
(*pt)[ page(addr) ]  = (*pt)[ page(addr) ]  | IA32::PCD; //set the PCD flag
(*pt)[ page(addr) ] = 0x07ffc000;
(*pt)[ page(addr) ] = 0x80902701; //before setting PCD flag, PCD=0x10
That is a physical address above 2 GiB; are you sure you are not mixing up physical and logical addresses?

Re: Disable cache in specific pages on Intel i7

Posted: Thu Mar 22, 2012 10:12 am
by giovanig
bluemoon wrote:
(*pt)[ page(addr) ] = 0x80902701; //before setting PCD flag, PCD=0x10
That is a physical address above 2 GiB; are you sure you are not mixing up physical and logical addresses?

I've found an error. I have two heaps in the system, one for normal allocation and another for special allocations such as those with caching disabled. The latter was not being created correctly. I fixed it, and now the output is:

Code: Select all

addr = 0x06fd0fc0;
directory(addr) = 0x1b; // 27
page(addr) = 0x3d0; // 976
(*pt)[ page(addr) ] = 0xfda02706; //before setting PCD flag, PCD=0x10
(*pt)[ page(addr) ]  = (*pt)[ page(addr) ]  | IA32::PCD; //set the PCD flag
(*pt)[ page(addr) ] = 0x07ffc000; //this output remains the same

Re: Disable cache in specific pages on Intel i7

Posted: Thu Mar 22, 2012 10:33 am
by bluemoon
OK, now look at the code again; I'll comment inline in the code.
giovanig wrote:

Code: Select all

static void change_page_flags(Log_Addr addr, int size, IA32_Flags flags) { 
      //gets the current page directory by reading the CR3 register
      Page_Directory * pd = current(); 

      //gets the current page table, accessing the Page directory 
      //directory() makes a shift >> of 22 bits in the received address

For convenience, let's say pd = C0150000 (this is a logical address on the heap)

      Page_Table * pt = (*pd)[directory(addr)];

pt will be the physical address of that page table plus the flags; unless the whole thing is identity mapped,
it is possible that pt = 00160001 (this is about 1.4 MiB physical address, with the present bit set)

       //access the specific address page and set the PCD flag (disable cache) 
       //PCD = 0x010
      (*pt)[page(addr)] = (*pt)[page(addr)] | (IA32_Flags::PCD); 

Now you are accessing something unintended, possibly overwriting important data.

        //flushes TLB to make the change
        flush_tlb(); 
}
Note: even if the whole thing is identity mapped, you still need to mask off that "present bit", otherwise your access is shifted.
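
The "shifted access" can be reproduced in user space by simulating physical memory as a byte array. This is only an illustrative sketch: `read32`, `write32` and `lookup_pte` are hypothetical helpers, and the simulated memory is assumed to be identity mapped.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

using Memory = std::vector<std::uint8_t>;  // simulated physical memory

// Reads a 32-bit value at an arbitrary (possibly unaligned) byte address.
std::uint32_t read32(const Memory& mem, std::uint32_t byte_addr) {
    std::uint32_t value;
    std::memcpy(&value, &mem[byte_addr], sizeof value);
    return value;
}

// Writes a 32-bit value at an arbitrary byte address.
void write32(Memory& mem, std::uint32_t byte_addr, std::uint32_t value) {
    std::memcpy(&mem[byte_addr], &value, sizeof value);
}

// Fetches page-table entry `index` through directory entry `pde`.
// With mask_flags == false the flag bits stay in the base address, so every
// access lands (flags) bytes past the intended entry.
std::uint32_t lookup_pte(const Memory& mem, std::uint32_t pde,
                         std::uint32_t index, bool mask_flags) {
    const std::uint32_t base = mask_flags ? (pde & ~0xFFFu) : pde;
    return read32(mem, base + 4u * index);
}
```

With a directory entry such as 0x1001 (table at 0x1000, present bit set), the masked lookup returns the stored entry, while the unmasked one reads four bytes starting one byte too far and returns a mix of two neighbouring entries.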

Re: Disable cache in specific pages on Intel i7

Posted: Thu Mar 22, 2012 11:25 am
by giovanig
bluemoon wrote:
Note: even if the whole thing is identity mapped, you still need to mask off that "present bit", otherwise your access is shifted.

I don't know if I got it. I printed the pd and pt values:

Code: Select all

pd = 0x07ffc000;
pt = 0x7ff5027;
The present bit (bit 0) in pt is set. Should I set it to 0?

The last printed value of (*pt)[page(addr)] is the same as the address of the current "pd".
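
One way to read the pt value above is to split it into its frame address and named flag bits. The sketch below assumes the standard 32-bit paging layout, and `Entry_Flags`, `decode_flags` and `table_base` are hypothetical names. It interprets the earlier advice as masking a local copy of the entry before using it as a pointer; the thread does not confirm that interpretation. Clearing bit 0 in the stored entry itself would mark the page table as not present.

```cpp
#include <cstdint>

// Flag bits shared by page-directory and page-table entries.
struct Entry_Flags {
    bool present;        // bit 0
    bool writable;       // bit 1
    bool user;           // bit 2
    bool write_through;  // bit 3 (PWT)
    bool cache_disable;  // bit 4 (PCD)
    bool accessed;       // bit 5
};

constexpr Entry_Flags decode_flags(std::uint32_t entry) {
    return Entry_Flags{
        (entry & 0x01u) != 0, (entry & 0x02u) != 0, (entry & 0x04u) != 0,
        (entry & 0x08u) != 0, (entry & 0x10u) != 0, (entry & 0x20u) != 0
    };
}

// Physical base of the table, computed from a copy; the entry is untouched.
constexpr std::uint32_t table_base(std::uint32_t entry) {
    return entry & 0xFFFFF000u;
}

// pt value printed above: base 0x07ff5000, flags 0x027.
static_assert(table_base(0x7ff5027u) == 0x07ff5000u, "table base");
```

For 0x7ff5027 this gives present, writable, user and accessed set, with PWT and PCD clear.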