Uncacheable Memory

Programming, for all ages and all languages.
janktrank
Posts: 13
Joined: Thu Aug 11, 2011 2:52 pm

Uncacheable Memory

Post by janktrank »

While reading Intel's Software Developer's Manual Volume 3, chapter eleven, "Memory Cache Control," I stumbled upon something interesting in the section "Code Fetches in Uncacheable Memory." Intel states that, even when the code resides in uncacheable memory, the processor will still fetch instructions (they don't say anything about data) into the cache(s).
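
For reference, here is roughly how a region ends up uncacheable in the first place: a minimal sketch, assuming 64-bit paging and the power-on default PAT, where setting PCD and PWT in a page-table entry selects the strong uncacheable (UC) type (the names below are just illustrative, not from the SDM).

#include <stdint.h>

/* Page-table-entry flags (x86, long mode). With the power-on default PAT,
 * PAT=0, PCD=1, PWT=1 selects the strong uncacheable (UC) memory type. */
#define PTE_PRESENT (1ULL << 0)
#define PTE_WRITE   (1ULL << 1)
#define PTE_PWT     (1ULL << 3)   /* page-level write-through */
#define PTE_PCD     (1ULL << 4)   /* page-level cache disable */

/* Build a PTE mapping the 4 KiB page at 'phys' as UC. */
static inline uint64_t make_uc_pte(uint64_t phys)
{
    return (phys & ~0xFFFULL) | PTE_PRESENT | PTE_WRITE | PTE_PCD | PTE_PWT;
}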

After doing a little searching for clarity, I found the "Introduction to Intel Architecture: The Basics" whitepaper, which states that "data transfers between the CPU and memory are always the full width of the L2 cache on the CPU." This makes it sound like cachelines are allocated regardless of the memory type of the region the data is being read from or written to, which means (for data accesses when the memory type is UC) that those cachelines must also be invalidated almost immediately. To me, this seems highly inefficient: potentially evicting valid cachelines (if the cache is full) and, depending on the write policy, writing their contents back to memory, simply to make room for temporary cachelines.
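
As a side note, the line width the whitepaper is talking about can be queried at runtime: CPUID leaf 01H reports the CLFLUSH line size in EBX bits 15:8 (in 8-byte units), which on current parts matches the 64-byte L1/L2 line size. A quick sketch, assuming a hosted build with GCC/Clang's <cpuid.h>:

#include <cpuid.h>
#include <stdio.h>

int main(void)
{
    unsigned int eax, ebx, ecx, edx;

    /* CPUID leaf 01H: EBX bits 15:8 = CLFLUSH line size in 8-byte chunks. */
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        unsigned int line_size = ((ebx >> 8) & 0xFF) * 8;
        printf("cache line size: %u bytes\n", line_size);
    }
    return 0;
}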

I also had a look at some Intel memory controller hub datasheets and couldn't find any control pins on the front-side bus that enable or disable specific groups of data-bus bits from being written to memory, so it seems like a minimum of eight bytes gets written to memory per cycle anyway.

Can anyone provide some clarity on this?
Combuster
Member
Posts: 9301
Joined: Wed Oct 18, 2006 3:45 am
Libera.chat IRC: [com]buster
Location: On the balcony, where I can actually keep 1½m distance

Re: Uncacheable Memory

Post by Combuster »

janktrank wrote:
Intel states that, even when the code resides in uncacheable memory, the processor will still fetch instructions (they don't say anything about data) into the cache(s).
That is because code is immutable by design, and even the old 16-bit CPUs would read entire bus widths as a feature. The idea is that executing from MMIO is always a bug, and even existing emulators like QEMU and Bochs will attempt to hurt you if you try.
janktrank wrote:
which states that "data transfers between the CPU and memory are always the full width of the L2 cache on the CPU." This makes it sound like cachelines are allocated regardless of the memory type of the region the data is being read from or written to, which means
I changed the emphasis a bit. Memory is separate from I/O (and therefore MMIO), and glancing over the rest of the document shows it hasn't been proofread in general: there are numerous typos and grammar errors, and the document also suggests that the L1/L2 caches are universally used, which is in direct contradiction to what the reference manual says.
I would treat that document as describing "typical use" rather than the full truth with all its murky details. It is, after all, meant to be an overview rather than a replacement for Volume 3.
"Certainly avoid yourself. He is a newbie and might not realize it. You'll hate his code deeply a few years down the road." - Sortie
[ My OS ] [ VDisk/SFS ]