Paging and Cache Look Ahead

johnsa
Member
Posts: 296
Joined: Mon Oct 15, 2007 3:04 pm

Paging and Cache Look Ahead

Post by johnsa »

Hey all,

Here's a quick question regarding what happens to the look-ahead caching when paging is enabled.
Let's say we have physical memory laid out as follows:

| free   | used  | free   |
| 128 KB | 64 KB | 128 KB |

and in a process we request 256 KB of memory, so we page it in from the two 128 KB blocks, mapped sequentially from 0 to 256 KB (for example).
Let's say that memory was allocated and used for a single 256 KB array that we scan through. Normally (assuming no paging) the CPU would cache ahead as we scan through that array/memory in a forward direction. Now with paging, does the CPU cache ahead from physical memory based on the paging values (i.e. mapping in the two blocks with a physical gap between them wouldn't cause any cache stall), or does it still look at the actual physical memory layout to cache ahead, thus causing a stall in the middle of the array?

My logic tells me the CPU is smart enough to cache based on the virtual -> physical lookup, seeing as it knows paging is enabled... but I couldn't find a direct answer to this in the manuals.
Colonel Kernel
Member
Posts: 1437
Joined: Tue Oct 17, 2006 6:06 pm
Location: Vancouver, BC, Canada
Contact:

Re: Paging and Cache Look Ahead

Post by Colonel Kernel »

You might be getting caching confused with prefetching.

For caching, when you read even a single byte from memory, the CPU will load the entire cache line to which that byte belongs. So if you're reading sequentially, the pattern will look like:

miss, hit, hit, hit, ..., miss, hit, hit, hit, ...

Cache lines are aligned such that they won't cross page boundaries, so it shouldn't matter whether caching is done based on physical or virtual addresses. AFAIK on x86 caching is based on physical addresses, but this is not always true (e.g. -- I think the Pentium 4's trace cache used virtual addresses... not sure if I remember correctly). In any case, it shouldn't matter from the point of view of paging.

Prefetching happens a little "closer" to the CPU than caching, and I'm pretty sure it's based on virtual addresses when paging is enabled, because I think prefetching can easily cross page boundaries (e.g. -- fetching the next instruction which happens to start on the next page). Otherwise, the principle of locality wouldn't really apply and performance would be terrible.
Top three reasons why my OS project died:
  1. Too much overtime at work
  2. Got married
  3. My brain got stuck in an infinite loop while trying to design the memory manager
Don't let this happen to you!
Hyperdrive
Member
Posts: 93
Joined: Mon Nov 24, 2008 9:13 am

Re: Paging and Cache Look Ahead

Post by Hyperdrive »

johnsa wrote:does the CPU cache ahead from the physical memory based on the paging values, i.e. mapping in the two blocks with a physical gap between them wouldn't cause any cache stall, or does it still look at the actual physical memory layout to cache ahead, thus causing a stall in the middle of the array?
As far as I know there are caches both before and after the virtual-to-physical address translation (i.e. caching blocks at virtual addresses and caching blocks at physical addresses). Just think a moment about that: Which cache (L1/L2/...) should do what and why? ... So the answer is: You don't know. You have to detect it (if that's possible...).

--TS
Combuster
Member
Posts: 9301
Joined: Wed Oct 18, 2006 3:45 am
Libera.chat IRC: [com]buster
Location: On the balcony, where I can actually keep 1½m distance
Contact:

Re: Paging and Cache Look Ahead

Post by Combuster »

I think the question mainly deals with the stream detector - that thing that detects serial memory accesses and then speculatively bursts lots of memory into the cache just before it is needed. (i.e. lines of memory that aren't being referenced anywhere in the pipeline)

I know it exists, but I can't tell off the top of my head whether it works on virtual or physical addresses, sorry :(
"Certainly avoid yourself. He is a newbie and might not realize it. You'll hate his code deeply a few years down the road." - Sortie
[ My OS ] [ VDisk/SFS ]
johnsa
Member
Posts: 296
Joined: Mon Oct 15, 2007 3:04 pm

Re: Paging and Cache Look Ahead

Post by johnsa »

It's an interesting one... Combuster is right: there is a specific component of the CPU designed to detect that sort of memory access pattern and optimize prefetching/cache loading of data ahead. If that were based on physical addressing rather than virtual, it would be nice to know, because even with a full paging system in your OS, I'd add an option to the allocator to provide better performance by not splitting memory allocation requests across non-contiguous physical blocks. Memory is cheap and plentiful, especially in 64-bit :) I can afford to waste a little here and there to fragmentation for performance's sake.
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!
Contact:

Re: Paging and Cache Look Ahead

Post by Brendan »

Hi,

The L1/L2/L3 caches (at least on Intel CPUs) operate on physical addresses only, and are completely unaware of linear addresses or paging.

The hardware prefetcher (at least on Netburst/Pentium 4) detects sequential cache line accesses (in both directions), and is capable of handling several streams at once, but it will not cross a page boundary (even if paging is disabled), and takes several accesses before it will begin prefetching.

For example, if you're reading dwords sequentially starting from physical address 0x00000000 (with paging disabled), then it might go like this:

Code:

0x00000000: Cache miss  - CPU fetches the cache line
0x00000004 to 0x0000003C: Cache hits (same cache line as last fetch)
0x00000040: Cache miss  - CPU fetches the cache line
0x00000044 to 0x0000007C: Cache hits (same cache line as last fetch)
0x00000080: Cache miss  - CPU fetches the cache line. Hardware prefetcher notices sequential accesses.
0x00000084 to 0x000000BC: Cache hits (same cache line as last fetch)
0x000000C0: Cache hit because hardware prefetcher has prefetched the cache line
0x000000C4 to 0x00000FFC: Cache hits (either prefetched, or same cache line as last prefetch)
0x00001000: Cache miss, because the hardware prefetcher stopped caring (crossed a page boundary)
0x00001004 to 0x0000103C: Cache hits (same cache line as last fetch)
0x00001040: Cache miss  - CPU fetches the cache line
0x00001044 to 0x0000107C: Cache hits (same cache line as last fetch)
0x00001080: Cache miss  - CPU fetches the cache line. Hardware prefetcher notices sequential accesses again.
0x00001084 to 0x000010BC: Cache hits (same cache line as last fetch)
0x000010C0: Cache hit because hardware prefetcher has prefetched the cache line
In this case you get 3 cache misses per page, and 61 hardware prefetches per page.

For certain benchmarks, CPU manufacturers disable hardware prefetching and do the prefetching manually. This helps to avoid those 3 cache misses per page, and also helps to avoid prefetching the cache line after the last cache line (that isn't needed) - e.g. if you stop reading in the middle of a page, then the hardware prefetcher will keep going and prefetch one extra cache line you don't need.
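Manual prefetching of that kind might look like the sketch below, using GCC/Clang's `__builtin_prefetch` (which compiles to the PREFETCH family of instructions on x86). The 4-line prefetch distance is an arbitrary assumption you would tune per CPU; the point is that prefetches are issued for the first lines of each page before the scan reaches them, and nothing past the end of the buffer is pulled in.

```c
#include <stddef.h>
#include <stdint.h>

#define LINE     64u            /* assumed cache-line size       */
#define DISTANCE (4u * LINE)    /* made-up prefetch distance to tune */

/* Sum a buffer while software-prefetching a few cache lines ahead. The
 * bounds check keeps the prefetcher-style "one extra line past the end"
 * problem from occurring. */
uint64_t sum_with_prefetch(const uint8_t *buf, size_t len)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        if (i % LINE == 0 && i + DISTANCE < len)
            __builtin_prefetch(buf + i + DISTANCE, 0 /* read */, 3 /* high locality */);
        sum += buf[i];
    }
    return sum;
}
```

A prefetch is only a hint, so this is safe even if the distance overshoots; the cost of getting the distance wrong is wasted bus bandwidth, not incorrect results.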

The hardware prefetcher stops on page boundaries because different pages may have different cache types. For example, if you read data sequentially from 0x0009F000 to 0x0009FFFC (write-back cache type) then you don't want the hardware prefetcher to prefetch from 0x000A0000 (which should be uncached).


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Love4Boobies
Member
Posts: 2111
Joined: Fri Mar 07, 2008 5:36 pm
Location: Bucharest, Romania

Re: Paging and Cache Look Ahead

Post by Love4Boobies »

Brendan's becoming annoying :D
"Computers in the future may weigh no more than 1.5 tons.", Popular Mechanics (1949)
[ Project UDI ]
JackScott
Member
Posts: 1035
Joined: Thu Dec 21, 2006 3:03 am
Location: Hobart, Australia
Mastodon: https://aus.social/@jackscottau
Matrix: @JackScottAU:matrix.org
GitHub: https://github.com/JackScottAU
Contact:

Re: Paging and Cache Look Ahead

Post by JackScott »

How on earth can you say that? If we could somehow convince Brendan to let us download his brain onto a web server, the Wiki would be finished.
Love4Boobies
Member
Member
Joined: Fri Mar 07, 2008 5:36 pm
Location: Bucharest, Romania

Re: Paging and Cache Look Ahead

Post by Love4Boobies »

Lol :P I obviously don't think that way about him...
"Computers in the future may weigh no more than 1.5 tons.", Popular Mechanics (1949)
[ Project UDI ]