Page coloring and page sizes
If I were to create artificially large pages, would that eliminate the need for page coloring or reduce the number of colors needed?
For example, if I always map physical memory to virtual memory at 64K boundaries in 64K chunks (essentially creating a form of 64K pages), would I be correct in assuming I've just reduced the number of colors needed by a factor of 16? (Thus entirely eliminating the need for coloring on any system requiring 16 or fewer colors.)
Similarly, if I use 2MB pages (real pages, not "artificial" ones), would that reduce the number of colors needed by a factor of 512?
Re: Page coloring and page sizes
Hi,
azblue wrote:
If I were to create artificially large pages, would that eliminate the need for page coloring or reduce the number of colors needed?
For example, if I always map physical memory to virtual memory at 64K boundaries in 64K chunks (essentially creating a form of 64K pages), would I be correct in assuming I've just reduced the number of colors needed by a factor of 16? (Thus entirely eliminating the need for coloring on any system requiring 16 or fewer colors.)
Similarly, if I use 2MB pages (real pages, not "artificial" ones), would that reduce the number of colors needed by a factor of 512?

It would reduce the number of page colours as you've described (all the way down to one page colour for 2 MiB pages in most cases, where "1 page colour" is the same as "no page colouring at all").
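As a rough check on the colour counts being discussed, here is a minimal sketch. It assumes the usual rule that the number of page colours is the size of one cache way divided by the page size; the function name `page_colours` and the 8 MiB, 16-way example cache are illustrative choices, not something a particular OS defines.

```python
KIB = 1024
MIB = 1024 * KIB

def page_colours(cache_size, ways, page_size):
    """Colours = bytes covered by one cache way / page size (never less than 1)."""
    way_size = cache_size // ways
    return max(1, way_size // page_size)

# Illustrative cache: 8 MiB, 16-way associative, so one way spans 512 KiB.
for page_size in (4 * KIB, 64 * KIB, 2 * MIB):
    colours = page_colours(8 * MIB, 16, page_size)
    print(f"{page_size // KIB} KiB pages: {colours} colour(s)")
```

Under these assumptions, going from 4 KiB to 64 KiB pages divides the colour count by 16, and 2 MiB pages leave a single colour because the page is larger than one cache way.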
However...
Imagine you've got 50 tiny processes that each have 4 KiB of code at 0x00200000 and 4 KiB of data at 0x00400000 in their own/separate virtual address space. In this case, the code for all 50 processes will alias each other in the CPU's L2 instruction cache, and the data for all 50 processes would alias each other in the CPU's L2 data cache, and all the code and all the data for all 50 processes would alias in the L3 unified cache. Assuming the L2 caches are both 256 KiB 8-way associative you'd only be using 32 KiB of each L2 cache due to aliasing between processes, and assuming the L3 cache is 8 MiB with 16-way associativity you'd only be using 256 KiB of the L3 cache due to aliasing between processes.
Normally to avoid this aliasing between processes you'd use skewing - instead of doing "page_colour = (address/page_size)%page_colours" you do "page_colour = (address/page_size + process_ID)%page_colours", so that the page at address 0x00200000 is a different colour for each different process.
By increasing the page size, you decrease the amount of skewing you can do (all the way down to "no skewing possible at all") and end up with different processes aliasing in caches.
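The two formulas above can be sketched directly in code. This is only an illustration: the function name `page_colour` is made up here, and the colour counts (128 colours for 4 KiB pages, 1 colour for 2 MiB pages) assume the 8 MiB 16-way L3 cache from the example.

```python
KIB = 1024
MIB = 1024 * KIB

def page_colour(address, page_size, page_colours, process_id=0):
    """Skewed colour: (address/page_size + process_ID) % page_colours.

    With process_id left at 0 this is the plain, unskewed formula.
    """
    return (address // page_size + process_id) % page_colours

ADDRESS = 0x00200000  # where every process in the example puts its code

# 4 KiB pages, 128 colours: each process ID lands on a different colour.
small = [page_colour(ADDRESS, 4 * KIB, 128, pid) for pid in range(4)]

# 2 MiB pages, 1 colour: skewing has nothing left to vary.
large = [page_colour(ADDRESS, 2 * MIB, 1, pid) for pid in range(4)]

print(small)  # [0, 1, 2, 3]
print(large)  # [0, 0, 0, 0]
```

With only one colour available, adding the process ID changes nothing, which is the "no skewing possible at all" case.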
Also note that increasing page size increases other problems, like RAM wasted due to "partially used" pages. The normal formula I use is "RAM_wasted = number_of_processes * average_number_of_sections_with_different_page_permissions * page_size/2". With 50 processes that each have 3 sections (executable, read only, read/write sections) you'd waste 300 KiB of RAM if you use 4 KiB pages, and you'd waste 150 MiB of RAM if you use 2 MiB pages. It also makes things like shared memory, memory mapped files and swap space less efficient (e.g. if you've got a memory mapped executable file that has 12 KiB of "only used during initialisation" code and 50 KiB of "used after initialisation" code, you can't free that 12 KiB of "not needed anymore" code after the process is initialised). Of course more RAM wasted means less RAM for things like file caches, which means worse performance.
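The wasted-RAM estimate is easy to check numerically. The sketch below just evaluates the formula as quoted, on the assumption that each section wastes half a page on average; `ram_wasted` is an illustrative name.

```python
KIB = 1024
MIB = 1024 * KIB

def ram_wasted(processes, sections_per_process, page_size):
    """RAM_wasted = processes * sections * page_size / 2 (half a page lost per section)."""
    return processes * sections_per_process * page_size // 2

print(ram_wasted(50, 3, 4 * KIB) // KIB, "KiB wasted with 4 KiB pages")
print(ram_wasted(50, 3, 2 * MIB) // MIB, "MiB wasted with 2 MiB pages")
```

For 50 processes with 3 sections each this gives 300 KiB and 150 MiB, the same ratio of 512 as between the two page sizes.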
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Re: Page coloring and page sizes
Brendan wrote:
Assuming the L2 caches are both 256 KiB 8-way associative you'd only be using 32 KiB of each L2 cache due to aliasing between processes, and assuming the L3 cache is 8 MiB with 16-way associativity you'd only be using 256 KiB of the L3 cache due to aliasing between processes.

You lost me on the L3 cache; wouldn't it only be using 64KB? If it's 16-way associative I can have 16 pages of the same color loaded at once; if everything's mapped to the same color wouldn't I only have 16x4KB = 64KB in L3?
Re: Page coloring and page sizes
Hi,
azblue wrote:
You lost me on the L3 cache; wouldn't it only be using 64KB? If it's 16-way associative I can have 16 pages of the same color loaded at once; if everything's mapped to the same color wouldn't I only have 16x4KB = 64KB in L3?

Yes, you're right (I messed up the maths).
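For anyone following the arithmetic, here is a small sketch of the corrected numbers. It assumes that when every page has the same colour, a cache can hold at most one page per way; `usable_when_aliased` is an illustrative name.

```python
KIB = 1024
MIB = 1024 * KIB

def usable_when_aliased(cache_size, ways, page_size):
    """Cache bytes actually usable if all pages share one colour: one page per way."""
    return min(cache_size, ways * page_size)

l2 = usable_when_aliased(256 * KIB, 8, 4 * KIB)   # 8 ways * 4 KiB
l3 = usable_when_aliased(8 * MIB, 16, 4 * KIB)    # 16 ways * 4 KiB

print(l2 // KIB, "KiB of each L2 cache")
print(l3 // KIB, "KiB of the L3 cache")
```

This gives 32 KiB for each L2 cache and 64 KiB for the L3 cache, matching the 16 x 4KB figure in the question.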
Cheers,
Brendan
Re: Page coloring and page sizes
Brendan wrote:
Yes, you're right (I messed up the maths)

Whew! I primarily posted this to ensure I understood coloring; when I saw your L3 numbers I thought I was still totally lost.
Thank you so much for your replies!