I wrote this as something for others to think about and discuss. I'm not sure if it's really helpful, but...
OS Testing
Eventually (hopefully) your OS will reach the stage where you need to test things like device drivers, support for different CPU features, etc. You think the code you've written works because it works on your computer and some emulators, but there are lots of different computers with lots of different chipsets, lots of different devices, lots of different features, and unfortunately, lots of different problems. You've got no idea how many computers your OS will actually work properly on. Maybe it works on all computers, but maybe it only works on some.
You need to test your OS on a variety of computers, but how do you do that? Hopefully this guide will help.
Volunteers
The first option is to ask volunteers to test the OS on their computers and let you know how it went. This can work, but mostly it doesn't. The volunteers don't have much idea what they're looking for, and if something doesn't work, trying to find and fix the problem can be a tedious mess (typically involving asking the volunteer to do a series of different things to see if you can track down the problem, with a variety of misunderstandings and explanations along the way). I've been in this situation before - after several rounds of questions and answers leading nowhere I asked if I could just buy their computer (unfortunately they wanted to keep it).
Even if the volunteer says the OS did work properly you've still got no idea how much they actually tested. For example, it's extremely rare for a volunteer to trash the data on their hard drives to test someone else's OS, or to spend more than 5 minutes trying different things.
Using volunteers is still a viable method of testing, and there are some extremely generous and knowledgeable volunteers out there, but in the end the best option is to do as much testing as you can yourself.
Testing Yourself
The problem with testing your OS yourself is that you need a lot of different computers. How many computers? Lots.
You can have a perfect OS that works extremely well on 99.99% of computers, and still find problems on the remaining 0.01% of computers because different hardware has different problems. Supporting dodgy hardware is a pain in the neck, and it'd be nice if all hardware conformed to all relevant standards and all documentation and behaved correctly, but the reality is that the hardware itself is complex and it's almost impossible to find a complete computer that doesn't have some "errata". This is partly because you're looking at the sum of many parts: the chance of problems in the CPU, plus the chance of problems in the northbridge/memory controller, plus the chance of problems in the southbridge, plus the chance of problems in every other device, which adds up to a very high chance of problems somewhere. Basically if your OS doesn't support any flawed/dodgy hardware, then your OS probably won't support any hardware.
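To put a rough number on the "sum of many parts" idea, here's a small sketch. Strictly speaking the chances don't simply add; if you assume flaws in different components are independent, the chance of at least one flaw is one minus the chance that every part is clean. The per-component chances below are made up for illustration, not measured data.

```python
from math import prod

def chance_of_any_flaw(component_flaw_chances):
    """Chance that at least one component in a machine is flawed,
    assuming flaws in different components are independent."""
    return 1.0 - prod(1.0 - p for p in component_flaw_chances)

# Hypothetical figures: CPU, northbridge/memory controller, southbridge
# at 10% each, plus ten other devices at 5% each.
example_machine = [0.10, 0.10, 0.10] + [0.05] * 10
print(f"Chance of a flaw somewhere: {chance_of_any_flaw(example_machine):.0%}")
```

Even with fairly small chances per part, the combined chance climbs quickly as you add parts.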
This means that to find out if your OS works on all computers you'd need to test your OS on all possible combinations. If there were only 4 CPUs and 4 chipsets to worry about then there'd only be 16 combinations. Unfortunately there are about 100 different CPUs and about 300 different chipsets, and thousands of different add-on cards and other devices. I'd estimate you'd need several billion computers to cover all combinations, and that just isn't practical.
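The combination count is just the number of choices for each part multiplied together. A minimal sketch, where the two add-on card slots with 1000 possible cards each are an assumption standing in for "thousands of different add-on cards":

```python
from math import prod

def count_combinations(*choices_per_part):
    """Number of distinct machines if every part can be chosen independently."""
    return prod(choices_per_part)

print(count_combinations(4, 4))        # 4 CPUs x 4 chipsets
print(count_combinations(100, 300))    # roughly 100 CPUs x 300 chipsets

# Hypothetical: two add-on card slots, about 1000 possible cards in each.
print(count_combinations(100, 300, 1000, 1000))
```

With these rough figures, just two add-on slots already push the count into the billions.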
So what do you do? You try to maximize the chance of finding problems by increasing the number of computers you can use for testing and maximizing the variation between these computers. Having a collection of fifty identical computers isn't going to help much, and having a collection of 2 very different computers probably won't help much either.
Management
If you're planning to have many computers for testing, the next problem is managing these computers to make them effective for OS testing purposes. If some of them need to use boot floppies and others need to use boot CDs and others need to use USB/flash, then it'll be a nightmare for regular testing. IMHO the best possible option is booting from network. You set up a server with suitably configured DHCP and TFTP services, and when you build your OS you update the boot files in the TFTP service's directory. Then you can turn on any test computers you like and they'll all boot the latest version of your OS (at the same time if you like), without messing about with floppies, CDs, etc. and waiting for one test computer to boot before you take the disk out and start the next test computer.
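The "update the boot files in the TFTP service's directory" step is easy to automate as the last step of your build. Here's a minimal sketch; the function name and the example paths are my own assumptions (your build output and TFTP directory will be wherever you configured them), and it only copies files sitting directly in the build directory.

```python
import shutil
from pathlib import Path

def deploy_boot_files(build_dir, tftp_root):
    """Copy every regular file from the build output into the TFTP directory.

    Returns the names of the files copied. Subdirectories are ignored.
    """
    build_dir, tftp_root = Path(build_dir), Path(tftp_root)
    tftp_root.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(build_dir.iterdir()):
        if src.is_file():
            shutil.copy2(src, tftp_root / src.name)
            copied.append(src.name)
    return copied

# Example call (hypothetical paths):
#   deploy_boot_files("build/boot", "/srv/tftp")
```

Once something like this runs after every build, testing a new version is just a matter of switching the test machines on.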
In this case all the test machines need to support booting from network. This actually isn't that hard to do - some computers (most newer computers) will support this already. For the others you can buy network cards that support PXE for around $40 (Australian dollars that is). Another alternative is to use cheaper ethernet cards (or whatever the computer has) and install Etherboot/gPXE on a floppy or on the hard drive.
This is only the start though. You'd also need an ethernet switch, network cabling and power cabling. The cabling itself is cheap, and it's not too hard to find a good deal on a second-hand ethernet switch (especially if you stick with 100BASE-TX because everyone else is replacing it with 1000BASE-T equipment at the moment, and honestly, 100BASE-TX is plenty fast enough for what you're doing). Also note that ethernet switches can be cascaded. For example, you could have a main 16-port ethernet switch, with several smaller ethernet switches hanging off of it (using cross-over cables or up-link ports).
Then comes the next "biggie" - keyboards, monitors and mice. There are 2 main problems here - cost and space. Don't bother with second-hand keyboards and mice - you can buy a new cheap keyboard and mouse for $20 or less. The cost of monitors is a problem though - it can cost $100 per monitor (second-hand) plus postage (and believe me, old/cheap monitors are typically heavy CRT things that cost a lot for shipping). This adds up to around $150 or more per computer, which is far too much. Worse, it'll cost you around 1 square meter of desk space for each computer. I don't know about you, but most people just don't have desks large enough for that sort of setup.
There's several options here:
- constant plugging and unplugging
- KVMs
- headless!
Constant Plugging and Unplugging
In this case you use one keyboard, one mouse and one monitor, and when you want to try a different computer you unplug everything and plug it into the next computer. If you test your OS on 5 different computers 5 times a day, then it works out to 25 rounds of unplugging and plugging back in each day, with up to 3 cables (keyboard, mouse and monitor) each time, excluding network and power cabling. It's the cheapest option, but it's a pain in the neck. It also doesn't work too well when some computers need USB keyboard and mouse while others need PS/2 keyboard and mouse and others need PS/2 keyboard and serial mouse. However, to be honest you can probably live without having a mouse on the test machines most of the time...
KVMs
A KVM is basically a switch that lets you connect several computers to the same keyboard, monitor and mouse. There are 3 types - the old fashioned mechanical switch, the electronic version, and the newer electronic version with additional stuff like on screen display. The old mechanical switch type of KVM is a bit dodgy and isn't worth the price you pay. The newer electronic version isn't too bad, but usually you can only get small KVMs like this (e.g. for 2 or 4 computers). My advice is don't mess around with kids' toys. If you look at the "price per computer", large KVMs aren't too expensive in comparison and can often work out cheaper than many smaller KVMs. They're extremely well designed pieces of equipment because they're mainly intended for server rooms (smaller KVMs are mostly designed for small home offices, and are less reliable), and typically they come with features like on screen display and password protection, support for PS/2 and USB, and let you assign names to each computer so you can remember which computer is connected where.
For an example, you can get a good 16-port KVM for around $750 new with cables, which works out to $47 per computer. As a bonus, most good KVMs can be daisy-chained together or cascaded, so you can use 3 KVMs connected together as if they're one huge KVM with one keyboard, one monitor and one mouse (instead of having 3 KVMs with 3 keyboards, 3 monitors and 3 mice). If you care about audio you can get KVMs with audio switching capabilities too, but to be honest I don't really care that much about audio...
Just one warning here - make sure you know how much the cables cost. There are some KVMs out there where you need to pay $90 per cable, and some "KVM over IP" switches can be much worse. A cheap KVM can cost you a lot if you're not careful.
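To compare KVMs fairly, work out the price per connected computer with the cables included. A small sketch: the $750 16-port figure is from above, while the "cheap" 4-port KVM at $200 is a made-up example to show what $90 cables do to the price.

```python
def kvm_cost_per_computer(kvm_price, ports, cable_price=0.0):
    """Cost per connected computer for a fully populated KVM,
    with one cable set bought separately for every port."""
    return (kvm_price + ports * cable_price) / ports

# 16-port KVM at $750 with cables included.
print(round(kvm_cost_per_computer(750, 16)))      # 47

# Hypothetical "cheap" 4-port KVM at $200 that needs a $90 cable per port.
print(round(kvm_cost_per_computer(200, 4, 90)))   # 140
```

In this example the "cheap" KVM ends up costing about three times as much per computer.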
Headless Systems
A headless system is a computer without any monitor or keyboard. Instead, you use a serial port and a dumb terminal (or terminal emulator); or perhaps telnet over the network. Using serial ports can be cheap and effective, but the problem is most computers only have one or 2 of them, so you can't connect 7 test computers to your server at the same time. To get around that you can buy devices that monitor serial ports and buffer the data. This probably still works out a lot cheaper than a KVM, but it also means you can't test the video, keyboard or mouse on the test computers easily.
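If you do end up with several serial feeds coming into one server, it helps to tag every line with the machine it came from before logging it. This is only a sketch under assumptions: it takes any binary stream (on Linux, a serial device such as /dev/ttyS0 opened in binary mode and already configured for the right baud rate would be one possibility), and the machine name and boot messages below are invented.

```python
import io

def tag_serial_lines(stream, machine_name):
    """Yield each line from a binary stream as text, prefixed with the machine name."""
    for raw in stream:
        text = raw.decode("ascii", errors="replace").rstrip("\r\n")
        yield f"[{machine_name}] {text}"

# Stand-in for a real serial port: an in-memory stream with made-up boot output.
fake_port = io.BytesIO(b"Booting...\r\nPANIC: no APIC\r\n")
for line in tag_serial_lines(fake_port, "k6-box"):
    print(line)
```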
Using telnet over the network seems like a good idea, especially when you're booting from the network and have the cabling and ethernet switch already. The problem here is that you need a device driver for each ethernet card, and the device driver needs to be working before the computer can communicate. One alternative may be to use PXE protected mode functions instead of providing your own ethernet card drivers, but this can mean trying to get PXE to work in protected mode, which isn't that easy considering how badly documented it is (and considering that during boot your OS will probably need to mess with a lot of hardware, etc that PXE relies on). I really should do more research on supporting and using PXE like this - if you can get it to work, then it'd work out the cheapest option (but it'd still mean you can't test the video, keyboard or mouse on the test computers).
Getting Computers
The final problem is actually getting the computers for testing. The good news here is that you don't need to buy "latest and greatest" machines. Forget about computers with large amounts of RAM and fast CPU frequencies. For example, if your OS crashes on a 1.66 GHz Pentium 4 then it'll crash the same on a more expensive 3 GHz Pentium 4, and if you're testing hardware compatibility then it's unlikely that more RAM will make any difference.
The first thing I'd recommend is your local computer shop. Go and see them and tell them you're looking for "dodgy old crud". They might think you're insane, they might even be right, but they might also sell you a heap of older computers for next to nothing just to free up space in the back room. As an example, I did this and ended up paying $100 for 10 computers (a mixture - an 80386, an 80486DX, some Cyrix 6x86, some AMD K6, and some Pentium) and a monitor. The monitor would have been worth around $40 alone. Out of those 10 computers 6 of them worked fine, and I got 2 more working by playing "mix and match" with the parts. The remaining 2 computers got stripped for spare parts and dumped. That worked out to an average of about $7.50 per working computer (excluding the $40 monitor), and they delivered it all for free.
The next thing is friends, family, neighbors, etc. Let everyone know you're looking for cheap computers and you'll be surprised what people have hiding in the back shed. Most of the time they'll give it to you for free, especially older stuff they don't know what to do with. I've got 2 computers this way (both free), and one has a "hard to find" Am5x86 CPU.
Now comes buying second-hand stuff. You can spend months going to garage sales and second hand shops and find nothing - don't waste your time. Instead, find something like eBay and start bidding. Start with a lot of small/cheap bids on lots of computers, and never bid what it's worth. Now and then you'll get lucky and nobody else will bid, and you'll end up with a half decent computer for a very decent price. The seller might not like it, but they'll still sell it and send it for the agreed price (but make sure you know postage and handling costs before you bid, and stick with sellers with good ratings).
Sooner or later you'll end up with a collection of computers, and you'll start looking for specific items you don't already have. My pet project is rare and hard to find CPUs - Transmeta, NexGen, National Semiconductor's Geode, STPC Atlas, IBM's Blue Lightning, etc. In this case you can't wait patiently and hope you get lucky - you need to be prepared to put your money on the line. This doesn't mean you can't get lucky though - some people think this stuff is worthless, and to be honest, it is relatively worthless (for people who aren't testing OSs).
Here's a big tip - thin clients. Yeah you heard me right: thin clients. They're silly little boxes that mostly contain a CPU, some RAM, crappy video and nothing else... or are they? You might be surprised, but sometimes you can get thin clients at very cheap prices (e.g. large companies upgrading with lots of thin clients to get rid of) and when you look up the specifications you'll be surprised - silent/fanless systems with reasonable built-in video, perfectly usable ethernet, keyboard, sound, mouse, USB ports, etc. About the only real difference is that you can't have a normal hard drive in it and there are no PCI expansion slots. Does it matter? No - if you're booting over network you don't need any hard drives and you'll have plenty of computers with PCI slots you can use when you need to try out a specific card. Most of the newer thin clients support PXE too - just plug it in and you're done. Just make sure you find out exactly what the thin client is and what it supports (there's a few older thin clients going around that really are crap - stripped down to bare essentials, and sometimes only usable with the manufacturer's software). Also, make sure it definitely does support PXE, because you won't be able to put a different ethernet card in or install Etherboot/gPXE on a hard drive that doesn't exist.
Then there's the cutting edge. I'm talking about VIA's new "Nano" CPU that nobody sells anywhere, and Intel's Atom, and all sorts of top of the line "Quad core with extra bells and whistles" and "just released yesterday" systems. For these you can afford to wait - they'll be second hand stuff in a year or 2, and old obsolete stuff you can get cheap in 3 or 4 years. You just get what you can whenever the price drops to your level.
Above all, don't forget you're looking for variety. Something completely different will help you find more bugs than something similar to what you've already got.
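One way to keep yourself honest about variety is to write down what each machine has and, when choosing the next purchase, pick whichever candidate adds the most things you don't already own. A rough sketch of that idea; the machine names and feature labels are invented, and counting new features is only one crude way to measure "different".

```python
def most_different(candidates, owned):
    """Return the name of the candidate that adds the most features
    not already covered by the machines in the collection."""
    covered = set().union(*owned.values())
    # Sorting first makes ties resolve to the alphabetically first name.
    return max(sorted(candidates), key=lambda name: len(candidates[name] - covered))

# Made-up inventory: each machine is described by a set of feature labels.
owned = {"pentium-box": {"cpu: Pentium", "chipset: A", "video: X"}}
candidates = {
    "another-pentium": {"cpu: Pentium", "chipset: B", "video: X"},
    "geode-thin-client": {"cpu: Geode", "chipset: C", "video: Y"},
}
print(most_different(candidates, owned))   # geode-thin-client
```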
Summary
Ok, let's talk numbers. Let's talk about a network of 30 computers, all set up for rapid/streamlined OS testing. How much would something like that cost?
Well the price of the computers themselves will vary a lot - anywhere from free to $2000. For a good variety at least half of them should be old/obsolete stuff you get for less than $200 each. Allow roughly 20 computers at an average of $150 per computer. The remaining 10 computers will cost you more - allow for an average of $400 per computer. That works out to around $7000.
Next, the KVMs. For 30 computers I'd buy a pair of 16-port KVMs and daisy-chain them together - allow $1600 for those. Make it $1800 to include a monitor (and mouse and keyboard).
Then there's ethernet switches. A pair of 16-port switches will do at $150 each ($300). 31 ethernet cables (including the cable used for cascading) at $5 each is another $155. Then throw in 15 ethernet cards at $40 each ($600), and you end up with about $1055 for networking.
Lastly you'd need some "miscellaneous". 30 computers, 2 KVMs, 2 ethernet switches and a monitor means you need at least 35 power sockets, or about seven 6-way power boards (where an entire power board is used to plug the other power boards in) at $18 each, and probably 30 power cables at $5 each. You might also want a few spare parts, some spare cabling, a mouse pad and other things, so round this up to $300.
That gives us a total of about $10000 ($333 per computer). That's about $2000 per year for 5 years, because you don't need it all as soon as you start writing the boot loader.
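If you want to play with the numbers for your own setup, the tally is easy to script. This sketch uses the quantities and prices listed above: the networking line is summed from its items (two switches, 31 cables, 15 cards), while the KVM and miscellaneous lines use the rounded allowances.

```python
# Rough budget for a 30-computer test network (Australian dollars).
budget = {
    "computers": 20 * 150 + 10 * 400,           # 20 old machines + 10 newer ones
    "KVMs, monitor, keyboard, mouse": 1800,     # rounded allowance
    "networking": 2 * 150 + 31 * 5 + 15 * 40,   # switches + cables + PXE cards
    "miscellaneous": 300,                       # power boards, cables, spares (rounded)
}

total = sum(budget.values())
for item, cost in budget.items():
    print(f"{item:32} ${cost}")
print(f"{'total':32} ${total}")
print(f"per computer: ${total // 30}, per year over 5 years: ${total // 5}")
```

The exact total comes out slightly above the round figure, which is close enough given how rough the estimates are.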
Is it worth it? I can't answer that - it depends on you and your OS...
I've been following my own advice for a while now (I'm up to 14 computers, all ready to boot at the press of a button). I like it a lot. I'm expanding too - I plan to reach 20 computers by the end of the year.
Note: All prices (everywhere in this guide) are very rough estimates in Australian dollars.
Cheers,
Brendan