Cost of enabling / disabling pages?

Posted: Wed May 16, 2007 9:40 pm
by astrocrep
Is there a heavy performance cost if I flip the paging bit in CR0 a lot?

Thanks,
Rich

Posted: Thu May 17, 2007 3:15 am
by nick8325
I'm not sure. Write a program that switches CR0 on and off a lot of times and see how quickly it runs :)

I think the cost might be lower than switching address spaces (changing CR3). The processor won't have to flush the TLB (it can just ignore it) but it might have to restart instructions in the pipeline that access memory.

What are you thinking of using this for? (i.e. how often are you going to change CR0?)

Re: Cost of enabling / disabling pages?

Posted: Thu May 17, 2007 3:45 am
by Brendan
Hi,
astrocrep wrote:Is there a heavy performance cost if I flip the paging bit in CR0 a lot?
There's a full TLB flush each time you modify CR0 (including "global" pages). This won't matter while paging is disabled, but when you enable paging again the CPU will have lots of TLB misses, and will need to read lots of page directory entries, page table entries, etc. again.

The "best case" is that the paging data stays in the CPU's caches, which means the caches hold less of everything else but TLB misses can be resolved faster. The "worst case" is where the paging data isn't in the CPU's caches and the CPU needs to do many reads from (slow) RAM.

For e.g. fetching a page directory entry and a page table entry from RAM can take over 400 cycles (depending on a lot of things, like other bus traffic, difference between CPU speed and RAM speed, depth of CPU pipeline/s, chipset, etc).

For code in a tight loop, the paging data would remain in the cache, there wouldn't be too many TLB misses and the performance decrease wouldn't be very bad. For certain code (e.g. where the "working set" comes close to filling the TLB) disabling/enabling paging regularly could cause the code to run 20 times slower.
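
As a rough back-of-the-envelope illustration of the reasoning above, the sketch below multiplies the number of pages touched after paging is re-enabled by a per-miss cost. The function name is hypothetical, and the model assumes every touched page misses exactly once and every miss costs the same, which real hardware won't match exactly.

```c
#include <stdint.h>

/*
 * Hypothetical estimate: cycles spent refilling the TLB after paging is
 * re-enabled, assuming each page touched afterwards misses once and each
 * miss costs cycles_per_miss (e.g. the "over 400 cycles" worst case above).
 */
uint64_t tlb_refill_cycles(uint64_t pages_touched, uint64_t cycles_per_miss)
{
    return pages_touched * cycles_per_miss;
}
```

For instance, a working set of 64 pages (an assumed figure, not a measurement) at 400 cycles per miss would cost 25600 cycles per re-enable, which is why code that does little work between flips suffers the most.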


Cheers,

Brendan

Re: Cost of enabling / disabling pages?

Posted: Thu May 17, 2007 3:57 am
by nick8325
Brendan wrote:There's a full TLB flush each time you modify CR0 (including "global" pages).
I didn't realise that. Is that true even for flags which have nothing to do with memory (e.g. TS)?

Posted: Thu May 17, 2007 6:24 am
by astrocrep
I am having a hard time in the last leg of paging...

The whole chicken and the egg scenario...

When I read my physical address from the pd for a specific pt... I disable paging and modify the pt (adding a new address or removing one)... once I am done, I flip the paging bit and everything is ok...

Needless to say, this is going to happen A LOT...
What I was trying to do to avoid this is: when I read the pd and get the physical address of the pt, map that physical pt address at the end of the kernel's pt (0m-4m). This way, 4m-4k would give me the physical page that holds the pt's data....

Once I had access to its memory, I would be able to manipulate the data. However, for some reason, I cannot get it to work (page faults)...

That's why I am killing the paging mechanism for a couple of cycles...

Anyone have any good ideas for getting around this???

Thanks,
Rich

Posted: Thu May 17, 2007 6:40 am
by urxae
astrocrep wrote:to map the physical pt address in the end of the kernel's pt (0m-4m) this way, 4m-4k would give me the physical page that represents the pt's data....

Once I had access to its memory, I would be able to manipulate the data. However, for some reason, I cannot get it to work (page faults)...
Are you using the invlpg instruction (or reloading cr3) after altering the page mappings? You need to do that to make sure the TLB stays in sync with the mappings...

Re: Cost of enabling / disabling pages?

Posted: Thu May 17, 2007 7:13 am
by Brendan
Hi,
nick8325 wrote:
Brendan wrote:There's a full TLB flush each time you modify CR0 (including "global" pages).
I didn't realise that. Is that true even for flags which have nothing to do with memory (e.g. TS)?
Sorry - only modifying the PE or PG flags in CR0 causes a full TLB flush.

The full list (according to Intel's manual) for full TLB flush is:
  - writing to an MTRR (if MTRRs are supported)
  - modifying PE or PG in CR0
  - modifying the PSE, PGE or PAE flags in CR4
There's also "partial TLB flush" (where everything except global pages is flushed):
  - any write to CR3 (including a write that doesn't change anything)
  - hardware task switches that change CR3 (where "change" != "writes to"; no TLB flush occurs if the same value is written to CR3 for hardware task switches)
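
The rules in the two lists above can be sketched as a small classifier. This is only an illustration: the helper names and the enum are hypothetical, the bit positions are the architectural ones for CR0 and CR4, and the MTRR and hardware task switch cases are left out.

```c
#include <stdint.h>

/* Architectural bit positions in CR0 and CR4. */
#define CR0_PE  (1u << 0)   /* protection enable */
#define CR0_TS  (1u << 3)   /* task switched (unrelated to paging) */
#define CR0_PG  (1u << 31)  /* paging enable */
#define CR4_PSE (1u << 4)
#define CR4_PAE (1u << 5)
#define CR4_PGE (1u << 7)

/* Hypothetical result type for this sketch. */
enum tlb_flush { TLB_FLUSH_NONE, TLB_FLUSH_NON_GLOBAL, TLB_FLUSH_FULL };

/* A CR0 write flushes everything only if PE or PG actually changes. */
enum tlb_flush cr0_write_flush(uint32_t old_val, uint32_t new_val)
{
    uint32_t changed = old_val ^ new_val;
    return (changed & (CR0_PE | CR0_PG)) ? TLB_FLUSH_FULL : TLB_FLUSH_NONE;
}

/* A CR4 write flushes everything only if PSE, PGE or PAE changes. */
enum tlb_flush cr4_write_flush(uint32_t old_val, uint32_t new_val)
{
    uint32_t changed = old_val ^ new_val;
    return (changed & (CR4_PSE | CR4_PAE | CR4_PGE)) ? TLB_FLUSH_FULL : TLB_FLUSH_NONE;
}

/* Any explicit write to CR3 flushes the non-global entries, even if the value is unchanged. */
enum tlb_flush cr3_write_flush(void)
{
    return TLB_FLUSH_NON_GLOBAL;
}
```

So clearing PG gives TLB_FLUSH_FULL, while setting only TS gives TLB_FLUSH_NONE.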

Cheers,

Brendan

Posted: Thu May 17, 2007 7:47 am
by Brendan
Hi,

Urxae is right - you need to invalidate the affected area in the TLB to keep the TLB in sync.

This includes when a page goes from "not present" to "present" (even though you'd assume a "not present" page wouldn't be in the TLB, and therefore wouldn't need to be invalidated). I've had problems in the past on Cyrix CPUs because I assumed "not present" pages weren't in the TLB, and now that I know better I'd expect problems on newer CPUs due to speculative execution.

It also includes modifying the dirty and accessed flags. For e.g. if the TLB thinks a page is already marked as "dirty" then it won't set the dirty flag again, even if the dirty flag is actually clear in RAM.
astrocrep wrote:When I read the pd, and get the physical address of the pt,

to map the physical pt address in the end of the kernel's pt (0m-4m) this way, 4m-4k would give me the physical page that represents the pt's data....

Once I had access to its memory, I would be able to manipulate the data. However, for some reason, I cannot get it to work (page faults)...

That's why I am killing the paging mechanism for a couple of cycles...

Anyone have any good ideas for getting around this???
There's a trick that might help - by pretending that the page directory is a page table and giving it its own page directory entry, you can have 4 MB of address space that contains every page table entry and page directory entry for the current process.

This is confusing, but imagine what would happen if you did something like this every time you created a new page directory:

Code: Select all

    mov eax, physical_address_of_new_page_directory
    mov ebx,eax
    or eax,5                           ;Set the supervisor and present flags
    mov [ebx+0xFFC],eax      ;Put "self-reference" into page directory
Once you do this you'd be able to access all page table entries from 0xFFC00000 to 0xFFFFEFFF, and all page directory entries from 0xFFFFF000 to 0xFFFFFFFF.

In this case, to change a page table entry you'd do something like:

Code: Select all

    test dword [0xFFFFF000 + pageDirectoryEntryNumber * 4], 1
    je .pageTableNotPresent
    mov [0xFFC00000 + pageTableEntryNumber * 4], eax
    invlpg [linearAddress]
Of course you don't need to use the last 4 MB of each address space for this (you could use any 4 MB area you like).
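
To make the address arithmetic concrete, here is a minimal C sketch of where entries show up once page directory entry number `slot` holds the self-reference. The function names are hypothetical, and it assumes 32-bit non-PAE paging with 4 KB pages.

```c
#include <stdint.h>

/* Base of the 4 MB window created by a self-reference in PDE number `slot`. */
uint32_t window_base(uint32_t slot)
{
    return slot << 22;
}

/* Virtual address of the page table entry that maps linear address `linear`. */
uint32_t pte_address(uint32_t slot, uint32_t linear)
{
    return window_base(slot) + (linear >> 12) * 4;
}

/* Virtual address of the page directory entry that covers `linear`.
 * The page directory itself appears as page number `slot` inside the window. */
uint32_t pde_address(uint32_t slot, uint32_t linear)
{
    return window_base(slot) + slot * 4096 + (linear >> 22) * 4;
}
```

With slot 1023 this gives the 0xFFC00000 and 0xFFFFF000 bases mentioned above. With slot 127 the window starts at 0x1FC00000 (508 MB) and the page directory appears at 0x1FC7F000.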


Cheers,

Brendan

Posted: Thu May 17, 2007 8:37 am
by astrocrep
Brendan,

Thanks, I am a lot closer to understanding this... it took me a couple of re-reads... but I think I got it! I cannot wait to get home and start cracking at this again! A few quick questions:

1.) What did you mean?

Code: Select all

or eax,5                           ;Set the supervisor and present flags
Shouldn't that be 3??

2.) The physical address of page tables can be anywhere, right? This way I can dynamically grab a 4k block, and stick it in the pd... from there I just access the hybrid pd/pt... that's AWESOME!

3.) When I invalidate a page... whats the value I should be passing to it?
For instance... I add a new page table to the pd to map 16mb - 20mb... what do I pass...
For instance... I add a new page of ram to the above page table now that 16785408 maps to some physical page...

A big big Thanks,
Rich

Posted: Thu May 17, 2007 10:25 am
by Brendan
Hi,
astrocrep wrote:1.) What did you mean?

Code: Select all

or eax,5                           ;Set the supervisor and present flags
Shouldn't that be 3??
Yes :)

More correctly, it should be something like PGFLG_SUPERVISOR_RW rather than any number... ;)
astrocrep wrote:2.) The physical address of page tables can be anywhere, right? This way I can dynamically grab a 4k block, and stick it in the pd... from there I just access the hybrid pd/pt... that's AWESOME!

3.) When I invalidate a page... whats the value I should be passing to it?
For instance... I add a new page table to the pd to map 16mb - 20mb... what do I pass...
For instance... I add a new page of ram to the above page table now that 16785408 maps to some physical page..
This is where things get tricky (or badly documented). INVLPG is only guaranteed to invalidate the TLB entry for a single page. This means if you change a page directory entry you need to INVLPG everything that is affected by that entry, which is 1024 pages. However, because of the "self-reference" a page directory entry also affects a page in the mapping (between 0xFFC00000 and 0xFFFFEFFF) which also needs to be invalidated.

BTW the INVLPG instruction only takes a memory address, which needs to be the linear address somewhere in the page being invalidated (it doesn't matter where). For e.g. "invlpg [0x1000]" does the same as "invlpg [0x1FFF]".
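
As a rough sketch of what that means in practice, the C below lists every linear address that would need an INVLPG after one page directory entry changes. The function names are hypothetical, `slot` is the page directory entry holding the self-reference, and actually issuing INVLPG for each address is left to the kernel.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE       4096u
#define PAGES_PER_TABLE 1024u

/* Any address inside a page names that page, so round down to the page start. */
uint32_t page_base(uint32_t linear)
{
    return linear & ~(PAGE_SIZE - 1);
}

/*
 * Fill `out` (room for PAGES_PER_TABLE + 1 entries) with one address per page
 * to invalidate after changing page directory entry `pd_index`: the 1024 pages
 * it covers, plus the page in the self-reference window through which the
 * affected page table is visible. Returns the number of addresses written.
 */
size_t pages_to_invalidate(uint32_t slot, uint32_t pd_index, uint32_t *out)
{
    size_t n = 0;
    uint32_t first = pd_index << 22;

    for (uint32_t i = 0; i < PAGES_PER_TABLE; i++)
        out[n++] = first + i * PAGE_SIZE;

    out[n++] = (slot << 22) + pd_index * PAGE_SIZE;
    return n;
}
```

For a new page table covering 16 MB to 20 MB (page directory entry 4) with the self-reference in entry 1023, that is 0x01000000 through 0x013FF000 plus the window page at 0xFFC04000.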

There is also a way to cheat called "lazy TLB invalidation" that is mainly used for multi-CPU (where one CPU needs to tell another CPU to invalidate) but can also make some single-CPU situations faster. It's probably best to get INVLPG working correctly without lazy TLB invalidation first though...


Cheers,

Brendan

Posted: Thu May 17, 2007 11:05 am
by astrocrep
For sanity's sake,

While implementing the recursive pd, would it be ok to just call cr3(pd); every time we change (map/unmap) a page? Then once I got everything working without issue, I remove the cr3(pd) calls and switch them to the proper INVLPG? My fear is implementing both together and having INVLPG not working, but me trying to find a bug in the allocator...

Thanks,
Rich

Posted: Thu May 17, 2007 11:18 am
by nick8325
Yes, that should be fine, just a bit slower.

Me = Failure

Posted: Thu May 17, 2007 7:54 pm
by astrocrep
I've got it like 80% of the way... I don't know what I am doing wrong...
Below is the code that does the grunt work for me...

My page directory is physically allocated somewhere low in ram... it's dynamic... but more than likely between 2mb-4mb (which is ALWAYS mapped 1:1)

These guys are my physical page allocators, and they work well.
mm_allocl(); returns a page < 16mb...
mm_alloc(); returns a page > 16mb (if there are any, else < 16mb)

kernel_pd is a global variable for the virtual mem man...

I allocate the page directory, 1:1 map the first 4mb and then recursively map the pd to 508mb mark.

This is my init function...

Code: Select all

int vmm_install()
{
	int i;
	unsigned long *pt;
	
	kernel_pd = (unsigned long *) (mm_allocl() * 4096);
	
	for (i = 0; i < 1024; i++)
	{
		kernel_pd[i] = 0 | 2;
	}
	
	//Now, we will init the whole pd w/ 1 pt (1:1 for 0-4mb) and the rest blank
	kernel_pd[0] = (mm_allocl() * 4096);
	vmm_create_pt1to1(kernel_pd[0], 3, 0);
	kernel_pd[0] = kernel_pd[0] | 3;
	
	//Map the address of the pd as a pt in the pd... wha??
	kernel_pd[127] = *kernel_pd ;
	kernel_pd[127] = kernel_pd[127] | 3;
	
	vmm_set_pd(kernel_pd);
	vmm_enable_paging();
		
	//Map in 2 pages (8k) at 1gb for use later w/ the kernel malloc...
	vmm_map(262144,mm_alloc());
	vmm_map(262145,mm_alloc());
	
	kmalloc_page_count = 2;
	
	return TRUE;
}

This guy is called in my actual mapping function below... His job is to return the virtual address of the page table for a page directory entry if one exists; if one doesn't, he creates it.

Code: Select all

unsigned long vmm_get_virtual_pt(unsigned long pd_index)
{
	unsigned long *pt;
	unsigned long *vpd_base;
	
	vpd_base = (unsigned long *) 0x1FC00000;	//508mb
	
	pt = (unsigned long *)kernel_pd[pd_index];	
		
	if (pt == (unsigned long *)2)
		pt = (unsigned long *) (mm_alloc() * 4096);	//create a new page table.
	else
		return vpd_base + (pd_index * 1024);
	
	kernel_pd[pd_index] = pt;
	kernel_pd[pd_index] = kernel_pd[pd_index] | 3;
	
	vmm_disable_paging();
	vmm_enable_paging();
	
	return vpd_base + (pd_index * 1024);
}
This guy does the actual mapping... he takes a virtual page (not address), and a physical page (not address) and maps them together...

Code: Select all

int vmm_map(unsigned long virt_page, unsigned long phys_page)
{
	int pd_index;
	int pt_index;
	unsigned long *vpt;
	
	pd_index = virt_page / 1024;
	
	vpt = (unsigned long*) vmm_get_virtual_pt(pd_index);
	
	pt_index = ((virt_page * 4096) - (pd_index * 0x400000)) / 4096;

	vpt[pt_index] = (phys_page * 4096);
	vpt[pt_index] = vpt[pt_index] | 3;
	
	vmm_disable_paging();
	vmm_enable_paging();
}

As you saw in the vmm_install procedure, I map two pages to the first two at 1gb mark. This way my kmalloc function starts giving out addresses at the virtual 1gb mark.
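
For reference, here is how a virtual page number breaks down into a linear address, a page directory index and a page table index. This is a small sketch with hypothetical helper names, assuming 4 KB pages and 1024 entries per table. The modulo form is equivalent to the subtraction used for pt_index in vmm_map above.

```c
#include <stdint.h>

/* Linear address of the first byte of virtual page number `virt_page`. */
uint32_t page_to_address(uint32_t virt_page)
{
    return virt_page * 4096u;
}

/* Index of the page directory entry (one per 4 MB) covering the page. */
uint32_t pd_index_of(uint32_t virt_page)
{
    return virt_page / 1024u;
}

/* Index of the entry inside that page table. */
uint32_t pt_index_of(uint32_t virt_page)
{
    return virt_page % 1024u;
}
```

Page 262144 is linear address 0x40000000 (the 1 GB mark), page directory entry 256, page table entry 0. Page 262145 is entry 1 of the same table.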

This code runs fine until I attempt to write to the virtual 1gb mark. Something is wrong in the paging code... but I have no idea... been on this for a couple of hours now, and figured I could use another set of eyes.

Thanks in advance,
Rich

Re: Me = Failure

Posted: Thu May 17, 2007 8:54 pm
by Brendan
Hi,
astrocrep wrote:This code runs fine until I attempt to write to the virtual 1gb mark. Something is wrong in the paging code... but I have no idea... been on this for a couple of hours now, and figured I could use another set of eyes.
I'd try something more like this:

Code: Select all

unsigned long *vpt_base = (unsigned long *) 0x1FC00000;	//508mb
unsigned long *vpd_base = (unsigned long *) (0x1FC00000 + 127 * 4096);


int vmm_install()
{
	int i;
	unsigned long *pt;

	// Allocate a page directory	

	kernel_pd = (unsigned long *) (mm_allocl() * 4096);

	// Fill page directory with "not present"

	for (i = 0; i < 1024; i++) {
		kernel_pd[i] = 0 | 2;
	}

	// Identity map first 4 Mb of physical RAM

	pt = (unsigned long *) (mm_allocl() * 4096);	// keep the pointer free of flag bits
	kernel_pd[0] = (unsigned long) pt | 3;

	for (i = 0; i < 1024; i++)
	{
		pt[i] = (i * 4096) | 3;
	}

	// Do self-reference

	kernel_pd[127] = (unsigned long) kernel_pd | 3;	// the page directory's own address, not its first entry

	// Load page directory into CR3 and enable paging

	vmm_set_pd(kernel_pd);
	vmm_enable_paging();

	//Map in 2 pages (8k) at 1gb for use later w/ the kernel malloc...

	vmm_map(262144,mm_alloc(), 3, 3);
	vmm_map(262145,mm_alloc(), 3, 3);

	kmalloc_page_count = 2;
	
	return TRUE;
}


int vmm_map(unsigned long virt_page, unsigned long phys_page, unsigned long PTE_flags, unsigned long PDE_flags)
{
	unsigned long pt_phys;	// physical address of a newly allocated page table

	if( (vpd_base[virt_page / 1024] & 1) == 0 ) { // No page table mapped
		pt_phys = (mm_allocl() * 4096);	// separate variable so phys_page isn't overwritten
		vpd_base[virt_page / 1024] = pt_phys | PDE_flags | 1;
		vmm_disable_paging();	// flush the TLB so the new page table is visible
		vmm_enable_paging();
	}
	if( (vpt_base[virt_page] & 1) != 0 ) return ERROR_PAGE_ALREADY_PRESENT;
	vpt_base[virt_page] = (phys_page * 4096) | PTE_flags | 1;
	vmm_disable_paging();	// flush the TLB again for the new mapping
	vmm_enable_paging();

	return SUCCESS;
}

It's all entirely untested, but it should be close to correct...


Cheers,

Brendan

Posted: Thu May 17, 2007 9:19 pm
by astrocrep
I've copy/pasted what you wrote, and it works (compiles)...

But I still page fault when I try to write to the 1gb mark...

even:

Code: Select all

  s1 = (char *) 0x40000000; //kmalloc(1200);
  strcpy(s1,"This is test 1\n\0");
  kprintf("%d || %s", s1,s1);	
Throws a page fault on the strcpy...

:(

-Rich