CPU Errata Resource?
I'm hoping someone here could point me to a single resource that lists all known CPU errata. I'd rather not have to go digging through OS source and tons of different websites or PDFs looking for these things. If something like this doesn't exist, it'd be a good thing to get on the wiki. I'm mostly interested in x86 and x86-64 at this point, but going forward, errata for other architectures would be greatly appreciated as well.
Re: CPU Errata Resource?
If I remember correctly, Ralf Brown's Interrupt List package used to have one. But it may be pretty outdated (only includes bugs up to the Pentium).
- Brynet-Inc
Re: CPU Errata Resource?
After discussing it with quok, here is what we have..
ftp://download.intel.com/design/processor/specupdt/
ftp://download.intel.com/design/mobile/SPECUPDT/
AMD doesn't have a public FTP server AFAIK, but their processor errata documents appear to be called "Revision Guides".
"Revision Guide for AMD Athlon(tm) 64 and AMD Opteron(tm) Processors"
http://www.amd.com/us-en/assets/content ... /25759.pdf
"Revision Guide for AMD Family 10h Processors"
http://www.amd.com/us-en/assets/content ... /41322.pdf
Why hasn't anyone tried to organize this better for errata tracking?

Re: CPU Errata Resource?
Probably because there is just too damned much of it, and it changes too fast.
- Brynet-Inc
Re: CPU Errata Resource?
bewing wrote: Probably because there is just too damned much of it, and it changes too fast.
Still, as developers, we should be tracking these changes, and perhaps warning users of known problems.
OpenBSD currently does this: if a known erratum is detected, it recommends a BIOS update.
Re: CPU Errata Resource?
Brynet-Inc wrote: Still, as developers, we should be tracking these changes.. perhaps warning users of known problems. OpenBSD currently does this, if a known errata is detected.. they recommend a BIOS update.
I agree, I think it should be tracked. There is a lot of it, yes, and it does change very fast. However, manufacturers also pull documents for older processors from their websites as new ones come out. I think a central place to find this information would be worth the hassle, especially as I'm sure I'm not the only one who's going to be implementing the proper workarounds in their OS.
- Combuster
Re: CPU Errata Resource?
There was a page on this in the wiki: CPU Bugs - maybe an idea to update it?
Re: CPU Errata Resource?
Hi,
quok wrote: I'm hoping someone here could point me to a single resource that lists all known CPU errata. I'd rather not have to go digging through OS source and tons of different websites or PDFs looking for these things. Perhaps if something like this doesn't exist, it'd be a good thing to get on the wiki. I'm mostly interested in x86 and x86-64 at this point, but going forward other architecture's errata would be greatly appreciated as well.
For Intel and AMD, the "specification updates" or "revision guides" are on their web sites. IMHO the fastest way to find them is a Google search like "specification update site:intel.com". For both companies, errata for old CPUs may still be on the company's web site but not listed alongside recent CPUs (e.g. it could be in an "archived" area). Old CPUs might also be renamed - I remember finding 80486 errata in Intel's "embedded CPU" section about a year ago, as the chip had been "end of lifed" for desktop use but was still being sold for embedded systems until a few years ago.
For Intel and AMD, if someone discovers a new problem then the corresponding errata is updated, so (if you're seriously vigilant) you'd need to double check everything every few months. Of course errata for new CPUs is more likely to be updated than errata for older CPUs. In your source code, I'd recommend keeping track of the version or date of the errata document the code was derived from.
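One way to keep track of which errata document a piece of code was derived from is to store that information next to the workaround itself. The following is only a minimal sketch: `ErrataSource`, `is_stale`, the 180-day threshold, and the placeholder revision string and date are all assumptions made for illustration, not anything taken from a vendor document.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ErrataSource:
    """Which vendor document a workaround was derived from."""
    title: str      # document title as printed by the vendor
    revision: str   # vendor's revision string for that document
    checked: date   # when the code was last compared against the document


def is_stale(source: ErrataSource, today: date, max_age_days: int = 180) -> bool:
    """True when the document has not been re-checked within max_age_days."""
    return (today - source.checked).days > max_age_days


# Hypothetical entry: the revision string and date are placeholders.
FAMILY_10H = ErrataSource(
    title="Revision Guide for AMD Family 10h Processors",
    revision="placeholder-rev",
    checked=date(2008, 6, 1),
)
```

A build script or a periodic review could call `is_stale` on every recorded source to list the documents that are due for another look.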
For other 80x86 CPU manufacturers (IBM, Cyrix/VIA, NSC, Centaur, IDT, SiS, NexGen, Rise, Transmeta, UMC, STM, Texas Instruments, ZF) you'd first want to work out which CPUs you support. For example, if your OS requires MMX or something then you can forget most of these. In any case, good errata is hard to find (any documentation for any old CPU is hard to find, especially if the company is no longer in business). The only advice I can give here is to find what you can while you can and archive it. Sandpile.org is a good place to start, as they list all the documentation - even though you can't download it from their private links you can find the document titles and file names, and use this information to improve your web searches.
Also note that some of these "old" CPUs seem like they're discontinued, but aren't. For example, about a month ago I bought a new computer (a tiny diskless thing, that's entirely "PC compatible" and came with USB, video, ethernet, etc all built-in, including PXE support). This computer came with a "Vortex86" CPU - an "80486 compatible" CPU made by SiS that's normally used for embedded systems.
The information you get from most web sites (including the OSdev wiki) is normally "minimal" - usually only significant problems are mentioned (e.g. Cyrix Coma bug, Intel Pentium F00F bug) and a huge number of other problems aren't mentioned (if you've seen the errata for any modern/recent CPU you'll know what I mean).
IMHO it's important to consider what you're going to do with the errata information, and to deal with this information in a responsible way. Too many times I've seen average users on news sites and forums in a panic (saying how broken CPUs are, advocating boycotts, etc), because they lack the knowledge needed to understand the errata information provided by the CPU manufacturer and how it affects (or more commonly, doesn't affect) real systems. Worse, sometimes some useless journalist will sensationalize a recently released piece of errata. It's all bad, because with enough bad publicity the CPU manufacturers will be very tempted to restrict access to their errata information (e.g. make people sign an NDA before they can see the errata, or perhaps only give the errata information to chipset/motherboard manufacturers and Microsoft). I hope you can understand why all OS developers would like to avoid that...

If you carefully look at the full list of errata for a CPU you'll notice that some of it is irrelevant. For example, it might involve something that isn't used (e.g. FRC mode, where the pins on one CPU are used to monitor the signals on a second CPU) or it might involve software that does something that the programming manuals say causes undefined behavior. Some of the errata might not affect your OS anyway (e.g. my OS will never support Virtual86 mode, so any bugs involving Virtual86 mode will never affect my OS). Some of the errata can be easily fixed by the OS (e.g. dodgy CPUID return values are common, but are easily fixed by the OS because all software should use the OS's "CPU data" rather than using CPUID directly). In all of these cases the OS should be silent (don't mention the errata to any user).
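The "CPU data" idea can be sketched as a table that the OS fills from raw CPUID values once and then corrects before any other software reads it. Everything below is hypothetical: the `CpuData` fields, the vendor string "ExampleVendor", the family/model numbers and the feature name "examplefeat" are made up to show the shape of the approach, and a real kernel would do this in its own language at boot.

```python
from dataclasses import dataclass, replace
from typing import Callable, FrozenSet, Iterable


@dataclass(frozen=True)
class CpuData:
    vendor: str
    family: int
    model: int
    features: FrozenSet[str]


Fixup = Callable[[CpuData], CpuData]


def drop_misreported_feature(data: CpuData) -> CpuData:
    """Hypothetical fixup: one made-up model reports a feature it lacks."""
    if (data.vendor, data.family, data.model) == ("ExampleVendor", 5, 2):
        return replace(data, features=data.features - {"examplefeat"})
    return data


def build_cpu_data(raw: CpuData, fixups: Iterable[Fixup]) -> CpuData:
    """Apply every fixup in order; the result is what software should query."""
    data = raw
    for fixup in fixups:
        data = fixup(data)
    return data
```

Because applications only ever see the corrected table, nothing has to be reported to the user for this class of errata.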
Some of the errata can be fixed by a work-around in the kernel, and won't affect normal software (applications, etc). For these I'd have a set of flags to keep track of them (so the kernel knows if the work-around/s are needed or not). The user doesn't need to know about these either.
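A set of workaround flags could look roughly like the sketch below. The flag names come from the two bugs mentioned earlier in this post; the detection rule is a deliberate simplification (it treats every Intel family 5 part as affected by F00F) and the names `Workaround` and `detect_workarounds` are assumptions for illustration. No rule is given for the Cyrix flag; it would have to be filled in from the vendor's errata.

```python
from enum import Flag, auto


class Workaround(Flag):
    NONE = 0
    F00F_LOCKUP = auto()   # Intel Pentium F00F bug
    CYRIX_COMA = auto()    # Cyrix Coma bug (no detection rule in this sketch)


def detect_workarounds(vendor: str, family: int) -> Workaround:
    """Return the set of kernel workarounds that should be enabled."""
    needed = Workaround.NONE
    # Simplified rule: treat every Intel family 5 part as affected by F00F.
    if vendor == "GenuineIntel" and family == 5:
        needed |= Workaround.F00F_LOCKUP
    return needed
```

The kernel would compute this once at boot and test individual flags (for example `Workaround.F00F_LOCKUP in needed`) wherever a workaround has to be applied.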
Some of the errata is for chipset/motherboard designers to worry about, and an OS can't assume the problem affects the chipset/motherboard. IMHO the user doesn't need to know about these problems either (unless the OS knows for sure that the chipset/motherboard is affected).
Some of the errata may cause problems (typically only in rare circumstances) for normal users. Most normal users don't really need to know about these problems either, but the OS should make this information available to system administrators so they know the computer is affected by the problem. If the OS does make the information available, then it should also provide an adequate explanation of the problem (including the chance of the errata causing any problem) and advice for the system administrators.
Also, if you're going to be thorough, the first thing I'd recommend is having an "unknown" flag, so that if there isn't enough good information (or if you haven't had time to examine each piece of errata yet) you can set the "unknown" flag to let people know that there may or may not be errata that the OS hasn't detected.
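The "unknown" flag fits naturally into an administrator-facing report. The sketch below assumes a three-state `Status` and a `summarize` helper, both hypothetical, and uses made-up erratum names; the point is only that an unreviewed CPU produces an explicit "unknown" line rather than silence.

```python
from enum import Enum
from typing import Dict, List


class Status(Enum):
    NOT_AFFECTED = "not affected"
    AFFECTED = "affected"
    UNKNOWN = "unknown"


def summarize(checked: Dict[str, Status], errata_reviewed: bool) -> List[str]:
    """Build report lines for administrators, skipping unaffected entries."""
    lines = [
        f"{name}: {checked[name].value}"
        for name in sorted(checked)
        if checked[name] is not Status.NOT_AFFECTED
    ]
    if not errata_reviewed:
        # The "unknown" flag: say so when the errata list was never fully examined.
        lines.append("unknown: errata for this CPU have not been fully reviewed")
    return lines
```

A real report would also carry the explanation and advice for each entry, as described above.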
Finally, for Intel CPUs some of the errata is fixed by CPU microcode updates; but there's no way to determine which problems are fixed by which microcode updates. In this case (for example) your OS might tell system administrators about a problem that might or might not have been fixed, that might or might not affect the reliability of software they use. In this case it is possible to detect if any microcode updates have been installed, and to use this to improve the information the OS tells system administrators. For example, if there's no microcode update installed, then the OS could tell system administrators there definitely is a problem and recommend a microcode update; but if there is some sort of microcode update installed, then the OS could stay silent or perhaps mention that the problem may or may not exist.
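The microcode decision described above reduces to a small function. This sketch assumes the kernel has already obtained the loaded update revision as an integer, with zero meaning that no update is loaded (on Intel CPUs this is normally read through the IA32_BIOS_SIGN_ID MSR; check the Intel manual for the exact procedure). The function name and message wording are assumptions.

```python
def microcode_advice(update_revision: int) -> str:
    """Turn the loaded microcode revision into a message for administrators."""
    if update_revision == 0:
        # No update loaded, so a microcode-fixable erratum is still present.
        return "affected: no microcode update is loaded; installing one is recommended"
    # Some update is loaded, but which errata it fixes cannot be determined.
    return "possibly fixed: a microcode update is loaded, but its contents are not known"
```

An OS that prefers to stay silent in the second case could return nothing there instead of a "possibly fixed" message.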
For AMD CPUs, I'd recommend reading Chapter 17, "OS-Visible Workaround Information" from a recent copy of "AMD64 Architecture Programmer’s Manual Volume 2: System Programming". You'll find these "OS-visible workaround/s" listed in AMD's revision guides. It complicates things a little for the OS, but should also be useful because it lets the OS know if the problem is fixed by hardware or not (I wish Intel would implement this feature).
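As I understand AMD's description, OS-visible workaround information consists of a length value (how many erratum IDs have a defined status bit) and a status bitmap, both read from MSRs when CPUID reports the feature. The sketch below only shows how those two values would be interpreted; the function name is an assumption, and the exact MSR numbers and erratum IDs should be taken from the manual and revision guides rather than from here.

```python
from typing import Optional


def osvw_needs_workaround(erratum_id: int, id_length: int, status: int) -> Optional[bool]:
    """Interpret OSVW data for one erratum ID.

    Returns True when the workaround is required, False when the hardware
    does not need it, and None when OSVW says nothing about this ID and the
    OS must decide some other way (for example from family/model/stepping).
    """
    if erratum_id >= id_length:
        return None  # no status bit is defined for this ID on this CPU
    return bool((status >> erratum_id) & 1)
```

The `None` case pairs well with an "unknown" flag: the OS can fall back to its own tables and report uncertainty if it has none.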
Cheers,
Brendan