While reading Intel's Software Developer's Manual, Volume 3, Chapter 11, "Memory Cache Control," I stumbled upon something interesting in the section "Code Fetches in Uncacheable Memory." Intel states that, despite being in uncacheable memory, the processor will still fetch instructions (although they didn't specify for data) into the cache(s).
After a bit of searching for clarity, I found the "Introduction to Intel Architecture - The Basics" whitepaper in which it was stated that "data transfers between the CPU and memory are always the full width of the L2 cache on the CPU." This makes it sound like cachelines are allocated regardless of the memory type of the region from which the data is being read from/written to which means (for data when the memory type is UC) that those cachelines must also be quickly invalidated. To me, this seems highly inefficient: it could deallocate valid cachelines (if the cache is full) and, depending on write policy, write their contents back to memory simply to make room for temporary cachelines.
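To make the whitepaper's "full width" claim concrete, here is a minimal sketch of the address arithmetic only. It assumes a 64-byte line, which is common on x86 but is not stated in the whitepaper or in this post, and `line_base` and `lines_touched` are hypothetical helper names. It says nothing about whether a UC access actually allocates a line.

```c
#include <stdint.h>

/* Assumed line size: 64 bytes is typical, but it is model-specific. */
#define LINE_SIZE 64u

/* Base address of the line-sized block that contains addr. */
static uint64_t line_base(uint64_t addr)
{
    return addr & ~(uint64_t)(LINE_SIZE - 1);
}

/* Number of line-sized blocks covered by the access [addr, addr + len).
   len must be greater than zero. */
static unsigned lines_touched(uint64_t addr, unsigned len)
{
    uint64_t first = line_base(addr);
    uint64_t last  = line_base(addr + len - 1);
    return (unsigned)((last - first) / LINE_SIZE) + 1;
}
```

Under that assumption, an 8-byte read at the hypothetical address 0x1003C straddles two blocks (0x10000 and 0x10040), so a line-granular transfer would move 128 bytes to satisfy it.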
I also had a look at Intel memory controller hub datasheets and couldn't find any control pins on the front-side bus that prevent specific bit groups on the data bus from being written to memory, so it seems like a minimum of eight bytes can be written to memory per cycle anyway.
Can anyone provide some clarity on this?
Uncacheable Memory
- Combuster
Re: Uncacheable Memory
Quote: Intel states that, despite being in uncacheable memory, the processor will still fetch instructions (although they didn't specify for data) into the cache(s).
That is because code is immutable by design, and even the old 16-bit CPUs would read entire bus widths as a feature. The idea is that executing from MMIO is always a bug, and even existing emulators like QEMU and Bochs will attempt to hurt you if you try.
Quote: in which it was stated that "data transfers between the CPU and memory are always the full width of the L2 cache on the CPU." This makes it sound like cachelines are allocated regardless of the memory type of the region from which the data is being read from/written to which means
I changed the emphasis a bit. Memory is separate from IO (and therefore MMIO), and glancing over the rest of the document shows it hasn't been proofread in general: there are numerous typos and grammar errors, and the doc also suggests that L1/L2 caches are universally used, which is in direct contradiction to what the reference says.
I would treat that document as describing "typical use" rather than the full truth with all its murky details. It is, after all, meant to be an overview rather than a replacement for volume 3.