OSDev.org

Posted: **Fri Oct 21, 2011 6:02 am**

The number of levels in x86 page tables has grown like Topsy, and the reason is that we are still working with 4096-byte pages.

In the days when computers had 64 MB of memory, that was probably a sensible size, and I suppose the argument against 2 MB pages is that they imply the potential to waste 2097151 bytes in a single page. That sounds like a lot. But for computers which have a now not uncommon 4 GB of memory, it represents only 0.05% of available memory, and in the future that percentage is going to shrink.

Posted: **Fri Oct 21, 2011 6:06 am**

Erm... yes?

You are free to use 4-MByte pages in your OS, ever since the Pentium...?!?

Posted: **Fri Oct 21, 2011 6:08 am**

Solar wrote:Erm... yes?

You are free to use 4-MByte pages in your OS, ever since the Pentium...?!?

The question is whether it would have been sensible back then.

Posted: **Fri Oct 21, 2011 6:20 am**

The 4096-byte page size was worked out many years ago, in the 80s I think, as the most appropriate size for swapping, fragmentation etc. Some newer CPU architectures use 8192 bytes as the smallest page size, and some of them support a wide range of page sizes. If you are unhappy with the 4096 page size you can in your operating system use whatever size you want, just that the filling of the page table will require that you fill several consecutive entries.

I have no idea if 4096 is the best page size today, but since RAM is getting bigger I would say a larger page would reduce the amount of memory required for the page table itself. The effect is very direct: just moving to 8192 halves the number of entries. It also makes sense since the TLB hasn't really scaled up with RAM sizes. Here I assume we are talking about desktops and perhaps mobile computing as well.
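As a rough sketch of what "filling several consecutive entries" could look like, here is a toy model in Python. The dict-based page table, the `map_logical_page` name and the 16 KiB logical page are all made up for illustration; only the flag value 0x3 (present + writable) follows the x86 entry layout.

```python
PAGE_SIZE = 4096  # hardware page size in bytes


def map_logical_page(page_table, virt_addr, phys_addr, logical_size, flags=0x3):
    """Emulate one larger page by filling consecutive 4 KiB entries.

    page_table is a plain dict {virtual page number: entry}, a stand-in
    for a real page table (an assumption made for this sketch).
    flags=0x3 means present + writable in an x86 entry.
    """
    if logical_size % PAGE_SIZE:
        raise ValueError("logical size must be a multiple of the hardware page size")
    if virt_addr % logical_size or phys_addr % logical_size:
        raise ValueError("addresses must be aligned to the logical page size")

    first = virt_addr // PAGE_SIZE
    count = logical_size // PAGE_SIZE
    for i in range(count):
        page_table[first + i] = (phys_addr + i * PAGE_SIZE) | flags
    return count


table = {}
filled = map_logical_page(table, 0x10000, 0x200000, 16 * 1024)
print(filled, [hex(entry) for entry in table.values()])
```

Note that the number of entries is unchanged, so this only simplifies bookkeeping inside the OS; it does not make the page table itself any smaller.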

Posted: **Fri Oct 21, 2011 6:30 am**

OSwhatever wrote: If you are unhappy with the 4096 page size you can in your operating system use whatever size you want, just that the filling of the page table will require that you fill several consecutive entries.

The x86 architecture lets you cut out a whole level of page tables by using 2 MB pages (or 4 MB in 32-bit mode). That would be the point of using larger page sizes. Simply filling out consecutive entries for 4 KB pages wouldn't achieve much.
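To make the "cut out a whole level" point concrete, here is a small sketch of how a 48-bit long mode virtual address splits into table indices. The function name and the dict it returns are just for illustration; the bit positions are the standard 9-bits-per-level layout with a 12-bit offset for 4 KB pages and a 21-bit offset for 2 MB pages.

```python
def split_long_mode(vaddr, large_page=False):
    """Split a 48-bit virtual address into paging indices and page offset.

    With 2 MB pages the walk stops at the page directory, so there is
    no page table index and the offset grows from 12 to 21 bits.
    """
    parts = {
        "pml4": (vaddr >> 39) & 0x1FF,
        "pdpt": (vaddr >> 30) & 0x1FF,
        "pd": (vaddr >> 21) & 0x1FF,
    }
    if large_page:
        parts["offset"] = vaddr & 0x1FFFFF
    else:
        parts["pt"] = (vaddr >> 12) & 0x1FF
        parts["offset"] = vaddr & 0xFFF
    return parts


addr = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 5
print(split_long_mode(addr))                   # four indices, 12-bit offset
print(split_long_mode(addr, large_page=True))  # three indices, 21-bit offset
```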

Posted: **Fri Oct 21, 2011 6:40 am**

Casm wrote:
OSwhatever wrote: If you are unhappy with the 4096 page size you can in your operating system use whatever size you want, just that the filling of the page table will require that you fill several consecutive entries.
The x86 architecture lets you cut out a whole level of page tables by using 2 MB pages (or 4 MB in 32-bit mode). That would be the point of using larger page sizes. Simply filling out consecutive entries for 4 KB pages wouldn't achieve much.

There is not much performance gain CPU architecture wise, as you get the same number of TLB misses; it's more that you reduce the amount of management, as the number of pages inside your OS goes down.

The x86 architecture is limited to 4 KB (4096 bytes) and 2 MB so you don't have much choice here. You can support both sizes but that is non-trivial. Usually optimizing page sizes is only done for locked memory.

Posted: **Fri Oct 21, 2011 7:09 am**

OSwhatever wrote: There is not much performance gain CPU architecture wise, as you get the same number of TLB misses; it's more that you reduce the amount of management, as the number of pages inside your OS goes down.

The x86 architecture is limited to 4 KB (4096 bytes) and 2 MB so you don't have much choice here. You can support both sizes but that is non-trivial. Usually optimizing page sizes is only done for locked memory.

I suppose four levels of page tables offends against my aesthetic sense, because it looks too much like a kludge (and a desperate one at that) to keep piling on extra levels of page tables as memory sizes increase.

Posted: **Fri Oct 21, 2011 10:12 am**

There is an equation in my OS:

kernel vir addr = 64T + phys addr

That's something like an identity mapping (at a fixed offset) in kernel space, so I use the largest page size supported there: 2M in Bochs and QEMU, 1G on my bare metal.

The minimum space the kernel can allocate is 1K, and the maximum is currently 1G; neither has anything to do with the page size.
But in user space a mix of 2M and 4K pages will be much more flexible and more tightly controlled, though more complex.

The 4K page is still supported for compatibility reasons, I think. Intel/AMD would rather waste their silicon than lose their customers.
It seems SPARC supports variable page sizes, but I'm not sure.
If I could design a chip, I would use 1M as the basic page size.

Posted: **Sat Oct 22, 2011 2:59 am**

Hi,

For RAM usage, you can't just look at the overhead of paging structures alone.

Typically (for user space), each process has at least 3 areas with different characteristics: executable, read only no-execute and read/write no-execute. When paging is used to enforce this you end up with padding from the actual end of each area up to the next page boundary. You can assume that on average this padding will be 50% of the page size. For example, with 4096 byte pages and 3 areas you'd expect 6 KiB of RAM wasted per process due to padding, and with 2 MiB pages you'd expect 3 MiB of RAM wasted per process.

Typically (for OSs like Windows and Linux) there's around 50 processes where most use very little RAM and some use lots (X, browser, etc). If you assume 50 processes that use an average of 10 MiB of RAM each; then (for 4 KiB pages in long mode) each process (on average) would use a PML4, PDPT, PD and 5 page tables; or 32 KiB for all paging structures. For 2 MiB pages each process (on average) would use a PML4, PDPT and a PD; or 12 KiB for all paging structures.

That gives us some rough figures for comparison. For 50 processes where each process is an average of 10 MiB and has 3 different areas:

4 KiB paging will cost about 6 KiB for padding and 32 KiB for paging structures, or about 38 KiB of overhead per process
2 MiB paging will cost about 3 MiB for padding and 12 KiB for paging structures, or about 3084 KiB of overhead per process
4 KiB paging will cost about 1.86 MiB of overhead for all 50 processes combined
2 MiB paging will cost about 150.6 MiB of overhead for all 50 processes combined
With 500 MiB of RAM actually used by all processes; 1.86 MiB of total overhead works out to about 0.37% of the total, which is almost nothing
With 500 MiB of RAM actually used by all processes; 150.6 MiB of total overhead works out to about 23.15% of the total (or about 30% on top of the 500 MiB), which is massive
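The figures above can be re-derived with a few lines of Python. This is only a sketch of the same back-of-envelope model (3 areas, half a page of padding per area, 4 KiB per paging structure, 50 processes, 500 MiB in use); the function and variable names are made up. The percentages are computed both ways, since the 23.15% figure corresponds to overhead divided by used RAM plus overhead.

```python
KIB = 1024
MIB = 1024 * KIB


def overhead_per_process(page_size, paging_structures, areas=3):
    """Average padding (half a page per area) plus 4 KiB per paging structure."""
    padding = areas * page_size // 2
    structures = paging_structures * 4 * KIB
    return padding + structures


PROCESSES = 50
USED = 500 * MIB

# 4 KiB pages: PML4 + PDPT + PD + 5 page tables = 8 structures
# 2 MiB pages: PML4 + PDPT + PD = 3 structures
small = overhead_per_process(4 * KIB, 8)
large = overhead_per_process(2 * MIB, 3)

for name, per_process in (("4 KiB", small), ("2 MiB", large)):
    total = per_process * PROCESSES
    share_of_total = 100 * total / (USED + total)
    on_top_of_used = 100 * total / USED
    print(f"{name} pages: {per_process // KIB} KiB per process, "
          f"{total / MIB:.2f} MiB total, "
          f"{share_of_total:.2f}% of the total, "
          f"{on_top_of_used:.2f}% on top of used RAM")
```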

For performance (e.g. TLB misses) it's much harder to estimate the likely cost, as it depends on how much of the paging structures remain in the cache (and how much has to be fetched from RAM), how large the TLBs are (for both small pages and large pages), if the CPU caches higher level structures (modern CPUs do), the working set of each process and its access pattern, how often you switch between processes, etc.

However (for typical OSs with typical loads), I'd assume it's very unlikely that the performance gains you get from using 2 MiB pages are going to justify having roughly 23% of all RAM in use wasted on overhead.

Basically, 4 KiB pages (with 4 levels of paging structures) are starting to get a little small, but the next step up (2 MiB pages with 3 levels of paging structures) is far too big to be practical for most things.

To reduce the number of levels of paging structures, a better idea would be to also increase the size of page directories, PDPTs, etc. For example, for 55-bit virtual addresses you could have 64 KiB pages, 64 KiB page tables, 64 KiB page directories and 64 KiB PDPTs. Unfortunately we have to wait for Intel (or AMD) to do something like that though.
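The 55-bit figure can be checked with a little arithmetic. The sketch below assumes power-of-two sizes and 8-byte entries (as in long mode; for the hypothetical 64 KiB format that is an assumption), and the function name is made up for illustration.

```python
def virtual_address_bits(page_size, table_size, levels, entry_size=8):
    """Offset bits plus index bits per level, for power-of-two sizes.

    entry_size=8 matches long mode entries; treating the hypothetical
    64 KiB format the same way is an assumption of this sketch.
    """
    offset_bits = page_size.bit_length() - 1
    index_bits = (table_size // entry_size).bit_length() - 1
    return offset_bits + levels * index_bits


KIB = 1024
MIB = 1024 * KIB

# Today's long mode: 4 KiB pages, 4 levels of 4 KiB tables
current = virtual_address_bits(4 * KIB, 4 * KIB, levels=4)
# 2 MiB pages: 3 levels of 4 KiB tables
large = virtual_address_bits(2 * MIB, 4 * KIB, levels=3)
# Proposed: 64 KiB pages, 3 levels of 64 KiB tables
proposed = virtual_address_bits(64 * KIB, 64 * KIB, levels=3)

print(current, large, proposed)
```

So under these assumptions three levels of 64 KiB structures would cover more virtual address space than the current four levels of 4 KiB structures.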

Cheers,

Brendan

Posted: **Mon Oct 24, 2011 4:06 am**

berkus wrote:ARM architecture has a superior set of page sizes at programmer's disposal: large=64Kb, small=4Kb, tiny=1Kb. They do it by overlapping some page table entries (giving less memory savings for the page tables), but still it's more handy and the sizes are mostly practical for ARM systems.

Depends what you mean by a good page table design. The ARM page table is not recursive like the one you'd find in x86, where that is a very handy feature. The 1KB page size is deprecated and no longer exists in the newer Cortex line of CPUs. The default is 4KB; a 64KB page is selected by a type field in the entry, and you fill the page table with the same page entry 16 times. In the TLB it takes a single 64KB entry, and on a miss it doesn't matter which of the 16 entries the page table walk hits, because they all describe the same 64KB page. You also have 1MB and 16MB page sizes.
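Here is a toy model of the 16-repeated-entries scheme, as a sketch only. The list-based table, the `("large", base)` tuples and both function names are stand-ins invented for illustration, not the real ARM descriptor encoding; the 256-entry second-level table is assumed.

```python
SMALL_PAGE = 4 * 1024
LARGE_PAGE = 64 * 1024
ENTRIES_PER_LARGE = LARGE_PAGE // SMALL_PAGE  # 16 identical entries


def map_large_page(l2_table, virt_addr, phys_addr):
    """Write the same 'large page' entry into 16 consecutive slots."""
    if virt_addr % LARGE_PAGE or phys_addr % LARGE_PAGE:
        raise ValueError("64KB pages must be 64KB aligned")
    entry = ("large", phys_addr)  # stand-in for a real descriptor
    first = (virt_addr // SMALL_PAGE) % len(l2_table)
    for i in range(ENTRIES_PER_LARGE):
        l2_table[first + i] = entry


def walk(l2_table, virt_addr):
    """Translate an address; any of the 16 slots gives the same result."""
    kind, base = l2_table[(virt_addr // SMALL_PAGE) % len(l2_table)]
    size = LARGE_PAGE if kind == "large" else SMALL_PAGE
    return base + (virt_addr % size)


l2_table = [None] * 256  # assumed: one second-level table with 256 slots
map_large_page(l2_table, 0x30000, 0x80000000)
print(hex(walk(l2_table, 0x30000)), hex(walk(l2_table, 0x3F123)))
```

Whichever slot the walk lands on, the offset is taken within the whole 64KB page, which is why the 16 copies have to be identical.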

The problem with the ARM page table is that a 4KB block of L2 tables covers 4MB of virtual space, because each L2 table is only 1KB and covers 1MB. In practice you have to fill 4 entries in L1 unless you want your OS to work with 1KB block sizes. There are a few design flaws here and there because of legacy functionality. In practice, with the ARM page table you get the following page sizes: 4KB, 64KB, 4MB, 16MB.

Then with the Cortex-A15, ARM introduced a new page table format called LPAE. This page table is very similar to PAE found in x86. It supports recursive page tables this time. This page table format also gives us a hint that they are going to use it in their upcoming 64-bit processors.

Time for 2mb pages?
