Phenom's TLB/L3 cache issue

Posted: Sat Dec 08, 2007 12:15 pm
by madeofstaples
I've been reading about the Phenom's recent TLB bug, but I'm a bit confused...

How would a BIOS patch fix this? Wouldn't the kernel of whatever OS is running also need to be patched to yield to the BIOS whenever this bug occurs? In which case, wouldn't it be much quicker to just write a kernel patch?

Maybe I'm just not understanding when this bug occurs. Is it that one core reads/writes a page for the first time, and then another core reads/writes the same page but doesn't see soon enough that the page is dirty/accessed?

Would someone care to clarify this bug to me?

Thanks

Posted: Sat Dec 08, 2007 1:12 pm
by cyr1x
First, the L3 has no TLB, but the L2 does.
If the L2 TLB copies some entries into the L3 and an entry gets accessed before the CPU has written the accessed/dirty flags, an error occurs and the CPU hangs.
The "BIOS fix" just disables some functionality in the TLB.
There's a Linux patch which emulates the setting of the a/d flags.
Another way would be to not use the a/d flags.
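
For reference, here is a minimal C sketch of the accessed/dirty flags mentioned above. The bit positions are the architectural x86 page-table-entry ones; the helper names (pte_touch, pte_is_accessed, pte_is_dirty) are hypothetical, and the code only models in software what the CPU normally does in hardware. It is not the Linux patch itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86 page-table-entry flag bits (architectural positions). */
#define PTE_PRESENT  (UINT64_C(1) << 0)
#define PTE_ACCESSED (UINT64_C(1) << 5)  /* set on the first access to the page */
#define PTE_DIRTY    (UINT64_C(1) << 6)  /* set on the first write to the page */

/* Hypothetical helper: software model of the flag update that the
 * CPU performs when it uses a translation. */
uint64_t pte_touch(uint64_t pte, bool is_write)
{
    pte |= PTE_ACCESSED;
    if (is_write)
        pte |= PTE_DIRTY;
    return pte;
}

/* Hypothetical helpers: read the two flags back. */
bool pte_is_accessed(uint64_t pte) { return (pte & PTE_ACCESSED) != 0; }
bool pte_is_dirty(uint64_t pte)    { return (pte & PTE_DIRTY) != 0; }
```

A read through a fresh entry sets only the accessed flag; a later write additionally sets the dirty flag. The race described above sits in the window between using the entry and that flag update becoming visible.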

Posted: Sat Dec 08, 2007 2:04 pm
by madeofstaples
Hey, thanks for the reply

I think I'm a little more confused now though, heh
cyr1x wrote: First, the L3 has no TLB, but the L2 does.
If the L2 TLB copies some entries into the L3 and an entry gets accessed before the CPU has written the accessed/dirty flags, an error occurs and the CPU hangs.
All right, bear with me:

If the entry is in the L2 cache, one of the bits is likely already set (unless the OS has cleared them, but wouldn't the entry reach L3 before then?), because a program first has to try to read/write an address before the CPU looks up and caches the page table entry, right?

So, say the first access ever to a page is a read. If the OS never clears the accessed bit, the L2 cache should never contain a version of the relevant page table entry with a cleared accessed bit. I gather that, in this case, the bug occurs on the first write to, and subsequent read of, the same data from this page? But I can't understand why this is a huge problem unless a task switch occurs IMMEDIATELY after a write AND the OS is low on memory, so it swaps out the dirty page thinking it's clean. But it seems unlikely that the OS would select such a recently used page to swap out, so this can't be it...
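
To make the swap scenario concrete, here is a deliberately simplified C sketch of the decision I have in mind. The names (reclaim_action, reclaim_decision) are hypothetical and this is not how any particular kernel implements reclaim; it only assumes the OS trusts the dirty bit (bit 6 of an x86 page table entry) when evicting a page.

```c
#include <stdint.h>

#define PTE_DIRTY (UINT64_C(1) << 6)  /* x86 dirty flag */

/* Hypothetical outcome of evicting a page from memory. */
enum reclaim_action {
    RECLAIM_DROP,       /* page looks clean: discard it, reload from disk later */
    RECLAIM_WRITE_BACK  /* page is dirty: write it out before freeing the frame */
};

/* Hypothetical helper: the OS only knows what the dirty bit tells it. */
enum reclaim_action reclaim_decision(uint64_t pte)
{
    return (pte & PTE_DIRTY) ? RECLAIM_WRITE_BACK : RECLAIM_DROP;
}
```

If a write had happened but the dirty bit were not yet visible, this function would return RECLAIM_DROP and the modified data would be lost, which is the case I'm describing.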

Maybe I need to read up on caches; it's been a while, and I have a feeling I'm making some false assumptions.
cyr1x wrote: The "BIOS fix" just disables some functionality in the TLB.
There's a Linux patch which emulates the setting of the a/d flags.
Another way would be to not use the a/d flags.
So the BIOS patch is probably just some start-up code, and the 1-20% performance hit is due to certain functionality being unavailable?

Thanks again