Hi,
Some notes...
Every 80x86 CPU ever made has bugs. For Intel CPUs ranging from Pentium to the latest "Core" CPUs there are between 70 and 190 different bugs for each CPU, listed in the corresponding errata published by Intel. I assume AMD CPUs aren't much different.
For the errata lists, some of the things Intel include are insane. One example would be "Item No 8: Mode C paging in SMM causes use of incorrect page tables" (Pentium Pro), which doesn't make much sense because the manual explicitly states that paging isn't supported in SMM mode on any CPU.
Some of the things listed can't affect any OS. An example would be "Item No 43: L2 cache may incorrectly report BIST failure" (Pentium II). In this case the Built In Self Test would only occur during boot before an OS is started, so the problem can't affect an OS after it has started.
There are usually quite a few problems with FRC mode (Functional Redundancy Checking). The idea here is that a pair of CPUs are glued together and one CPU is used to check the output of the other. If the outputs of the two CPUs don't match, an FRCERR is signalled to external circuitry. I've never seen or heard of this feature actually being used - it might only be used by Intel for quality control (although I have heard of similar arrangements for non-Intel high-end servers, where extreme fault tolerance is needed).
For a number of the things listed there are motherboard, chipset or BIOS work-arounds, and some items only affect motherboards that do some things in certain ways. In these cases it's impossible for the OS to know if the work-around has been implemented or not, or if the motherboard is affected by the bug at all. I assume the problem is fixed, because there's no practical way of detecting otherwise and it's probably not a good idea to warn users about problems that might not exist.
For how I've been classifying CPU bugs for my OS, it might be worth reading this page in my user manual. For Intel chips from Pentium to Pentium II, I've got 7 classified as "errata" and 8 classified as "flaws", however some of the "flaws" represent several problems that all affect the same CPUs. There are also 33 of them that I've marked as "TO BE REVIEWED" - for these I either don't understand enough about the problem, or I am unsure whether it will affect my OS or not (at least half of these won't be classified until I implement machine check exception handling).
I should also point out that Intel's published errata are very comprehensive - they seem to list everything that could ever affect anything. Because of this I'm quite lenient - I tend to look for reasons why each bug won't matter, rather than looking for reasons why each bug might matter. This is partly because I've been using Intel CPUs (with heaps of "bugs") for many years without ever actually having a problem that can be attributed to a CPU bug.
For a public database of bugs, I'd suggest the following fields:
- Bug Name
- Category
- Affected CPU/s (manufacturer, family, model, stepping, etc)
- Contributing Factors (a short description)
- Bug Description (about 1 paragraph)
- Workaround (about 1 paragraph if a work-around can be implemented by the OS kernel)
For an example:
Bug Name: F00F Bug
Category: SEVERE, FIXABLE
Contributing Factors: LOCK CMPXCHG8B instruction
Bug Description: The LOCK CMPXCHG8B instruction with a register operand (an invalid encoding) can be used to completely lock up the computer (at any privilege level) due to a CPU bug that leaves the bus locked while trying to invoke an invalid opcode exception.
Workaround: The easiest known work-around is to set the IDT to "write-through" caching (e.g. using the flags in the page table entry). For more information see http://www.x86.org/errata/dec97/f00fbug.htm.
Affected CPUs: All Intel Pentium CPUs.
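As a rough sketch of how a kernel might store an entry like this and check it against the running CPU, something along these lines could work in C. All of the names here (struct cpu_bug, struct cpu_id, decode_signature, bug_affects, the BUG_* flags) are made up for illustration and aren't taken from any existing database or OS. The decoding follows Intel's documented layout for the CPUID leaf 1 signature in EAX, and mapping "All Intel Pentium CPUs" to "GenuineIntel, family 5" is my own assumption that would need checking against the errata.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical category flags, so "SEVERE, FIXABLE" is BUG_SEVERE | BUG_FIXABLE. */
#define BUG_SEVERE   0x1u
#define BUG_FIXABLE  0x2u

/* Wildcard for family/model/stepping fields in a database entry. */
#define ANY_VALUE    0xFFFFFFFFu

struct cpu_id {
    char vendor[13];     /* CPUID leaf 0 vendor string, e.g. "GenuineIntel" */
    unsigned family;
    unsigned model;
    unsigned stepping;
};

/* One database record, using the fields suggested above. */
struct cpu_bug {
    const char *name;
    unsigned category;
    const char *factors;
    const char *description;
    const char *workaround;
    struct cpu_id affected;   /* ANY_VALUE in a numeric field matches everything */
};

/* Decode the CPUID leaf 1 EAX signature (Intel's rules for the extended fields). */
static struct cpu_id decode_signature(const char *vendor, uint32_t eax)
{
    struct cpu_id id;
    unsigned base_family = (eax >> 8) & 0xFu;

    assert(vendor != NULL);
    strncpy(id.vendor, vendor, sizeof id.vendor - 1);
    id.vendor[sizeof id.vendor - 1] = '\0';

    id.stepping = eax & 0xFu;
    id.model = (eax >> 4) & 0xFu;
    id.family = base_family;

    /* The extended family is only added when the base family is 0xF. */
    if (base_family == 0xFu)
        id.family += (eax >> 20) & 0xFFu;
    /* The extended model supplies the high nibble for base families 6 and 0xF. */
    if (base_family == 0x6u || base_family == 0xFu)
        id.model |= ((eax >> 16) & 0xFu) << 4;

    return id;
}

/* Returns 1 if the entry applies to the given CPU, otherwise 0. */
static int bug_affects(const struct cpu_bug *bug, const struct cpu_id *cpu)
{
    const struct cpu_id *a = &bug->affected;

    if (strcmp(a->vendor, cpu->vendor) != 0)
        return 0;
    if (a->family != ANY_VALUE && a->family != cpu->family)
        return 0;
    if (a->model != ANY_VALUE && a->model != cpu->model)
        return 0;
    if (a->stepping != ANY_VALUE && a->stepping != cpu->stepping)
        return 0;
    return 1;
}

/* The F00F entry from above; "family 5" is an assumed mapping, not from the errata. */
static const struct cpu_bug f00f_bug = {
    .name        = "F00F Bug",
    .category    = BUG_SEVERE | BUG_FIXABLE,
    .factors     = "LOCK CMPXCHG8B instruction",
    .description = "Bus stays locked while invoking the exception; locks up the computer.",
    .workaround  = "Set the IDT to write-through caching.",
    .affected    = { "GenuineIntel", 5, ANY_VALUE, ANY_VALUE },
};
```

With this, a signature of 0x00000543 from a "GenuineIntel" CPU decodes to family 5, model 4, stepping 3, so bug_affects(&f00f_bug, &cpu) returns 1, while a family 6 signature such as 0x00000633 doesn't match. A real database would probably need ranges or lists of steppings per entry rather than a single wildcard.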
The problem here is going to be deciding which pieces of errata published by Intel and AMD actually matter, and finding details for other manufacturers....
Cheers,
Brendan