I have been told not to use huge-page mappings if they cover multiple memory types. That is very sensible, of course, but it puts the burden on me to actually check this. Since my current approach is to map everything as write-back memory, only the MTRRs could make a range inconsistent. Being a very top-down thinker, I decomposed the problem: we are given a physical address and a length, and have to make sure the whole range has the same memory type. OK, easy enough to conceptualize:
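/* True if every byte of [addr, addr + len) has the same memory type. */
static bool is_range_memtype_consistent(phys_addr_t addr, size_t len)
{
    phys_addr_t end = addr + len;
    int t = query_memtype(addr);

    /* next_region() is expected to return the next address where the type may change. */
    while ((addr = next_region(addr)) < end) {
        if (query_memtype(addr) != t)
            return false;
    }
    return true;
}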
But the problem is the last remaining case: if I have an address that is not part of any MTRR, how do I find, at least halfway quickly, the next address that lies inside some MTRR? Or even figure out that there are no other MTRRs left? If the variable MTRRs were given as base addresses and lengths, this would be easy, but they are not: each one is a base and a mask, and the mask can have holes in it. I don't know why anyone would use a striped MTRR setup, but BIOS writers can be weird people.
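To make the hole problem concrete, here is a minimal sketch of the match rule used in this post, (addr & mask) == value. The struct var_mtrr, its fields, and both function names are hypothetical, and the 64 KiB address space and 4 KiB page size in the usage note are toy values chosen only to keep the numbers readable.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, pre-filtered view of one variable MTRR.
 * Assumption: value == base & mask, so (value & ~mask) == 0. */
struct var_mtrr {
    uint64_t mask;
    uint64_t value;
    int type;
};

/* An address is covered when its masked bits equal the stored value. */
static bool var_mtrr_matches(const struct var_mtrr *m, uint64_t addr)
{
    return (addr & m->mask) == m->value;
}

/* Counts the separate runs of covered pages in [0, limit).
 * A contiguous mask yields at most one run; a mask with a hole can yield more. */
static unsigned count_matching_runs(const struct var_mtrr *m,
                                    uint64_t limit, uint64_t page)
{
    unsigned runs = 0;
    bool prev = false;

    for (uint64_t a = 0; a < limit; a += page) {
        bool cur = var_mtrr_matches(m, a);
        if (cur && !prev)
            runs++;
        prev = cur;
    }
    return runs;
}
```

With mask 0xF000 and value 0x2000 this covers the single run 0x2000-0x2FFF. With mask 0xA000 and value 0x8000 (bit 14 is a hole) it covers 0x8000-0x9FFF and 0xC000-0xDFFF, which is why a base-plus-length view does not work in general.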
Maybe I ought to reduce the problem to a smaller one. Given an input, a mask, and a value, where (value & ~mask) == 0 (if that does not hold, the MTRR is effectively disabled, and I filter those out at initialization time) and (input & mask) != value, what is the smallest x such that ((input + x) & mask) == value? If I can solve that problem, I can iterate over all variable MTRRs, figure out for which one I get the smallest x, and return that. Thankfully, we don't need to handle the case where x does not exist: the addition is allowed to overflow, and if it does, the addend will be bigger than any non-overflowing one. So I can filter out the overflows at the end of the function and return the highest physical address in that case.
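To pin the question down, a brute-force reference on 8-bit values might look like the sketch below. It is far too slow for real 64-bit addresses and is meant only as a specification to test a cleverer version against; the name min_addend_u8 and the 8-bit width are assumptions made for illustration.

```c
#include <stdint.h>

/* Reference only: tries every addend in order, with the sum wrapping at 8 bits.
 * Assumes (value & ~mask) == 0, in which case a match always exists because
 * x == (value - input) mod 256 makes the sum equal to value itself. */
static unsigned min_addend_u8(uint8_t input, uint8_t mask, uint8_t value)
{
    for (unsigned x = 0; x < 256; x++) {
        if (((uint8_t)(input + x) & mask) == value)
            return x;
    }
    return 256; /* only reachable if the assumption above is violated */
}
```

For mask 0xF0 and value 0x20, input 0x15 gives x = 0x0B without wrapping. Input 0x35 has no match above it in 8 bits, so the sum wraps around to 0x20 and x = 0xEB, which is larger than 0xCA, the largest addend that does not overflow from 0x35.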
I get the feeling I need to look at (input & mask) ^ value, the mask of differences. But not all differences are the same! Again, I see two possibilities: if an input bit is 0 but needs to be 1, I can just set that bit in x; but if the bit is 1 and needs to be 0, setting that bit in x causes a carry out of that position, which can create more differences further up. Does anyone have an idea here? Or am I way overthinking this, and should I just query the memtype for every page in the range?
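One possible way to handle both cases, offered as a sketch rather than a definitive answer: only the highest differing bit matters. Everything above it already agrees, and everything below it can be set to the smallest legal pattern (masked bits from value, free bits zero). If value has a 1 at that bit, no carry is needed. Otherwise the carry has to land in the lowest free bit above it where input is 0, which (input | mask | low) + 1 finds in one step. The names min_addend, next_mtrr_start, and struct var_mtrr are hypothetical, uint64_t stands in for phys_addr_t, and UINT64_MAX stands in for "the highest physical address".

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pre-filtered variable MTRR; assumes (value & ~mask) == 0. */
struct var_mtrr {
    uint64_t mask;
    uint64_t value;
    int type;
};

/* Smallest x with ((input + x) & mask) == value, addition modulo 2^64. */
static uint64_t min_addend(uint64_t input, uint64_t mask, uint64_t value)
{
    uint64_t diff = (input & mask) ^ value;
    if (diff == 0)
        return 0; /* already matches */

    /* Smear the highest differing bit downward: low covers bits 0..h. */
    uint64_t low = diff;
    low |= low >> 1;
    low |= low >> 2;
    low |= low >> 4;
    low |= low >> 8;
    low |= low >> 16;
    low |= low >> 32;
    uint64_t top = low ^ (low >> 1); /* just bit h */

    uint64_t target;
    if (value & top) {
        /* Input has 0 where 1 is needed: keep free bits above h, clear the rest. */
        target = (input & ~low & ~mask) | value;
    } else {
        /* Input has 1 where 0 is needed: carry into the lowest free zero bit
         * above h. If there is none, t wraps to 0 and target becomes value. */
        uint64_t t = (input | mask | low) + 1;
        target = (t & ~mask) | value;
    }
    return target - input; /* wraps exactly when the sum would overflow */
}

/* Lowest address >= addr covered by any of the n MTRRs,
 * or UINT64_MAX if every candidate would overflow. */
static uint64_t next_mtrr_start(const struct var_mtrr *m, size_t n, uint64_t addr)
{
    uint64_t best = UINT64_MAX;

    for (size_t i = 0; i < n; i++) {
        uint64_t x = min_addend(addr, m[i].mask, m[i].value);
        if (x < best)
            best = x;
    }
    if (best > UINT64_MAX - addr)
        return UINT64_MAX; /* filter out the overflowing case */
    return addr + best;
}
```

This costs a handful of bit operations per MTRR instead of one query per page. It has only been reasoned through on paper and checked against brute force on small values, so treat it as a starting point.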