Debugging OS on real hardware

MollenOS · Post by **MollenOS** » Mon Apr 18, 2016 3:44 am

rdos wrote:
gerryg400 wrote:I must be misunderstanding printf debugging. What's wrong with it ?
Everything. They alter state and introduce new bugs + cannot tell you what a debugger typically can tell you. Once the basic exception handlers are in place, the next step is to integrate a remote debugger or local debugger thread so you can single step the kernel, preferably in source-code form. Before that is done, planting "int 3" in code, which triggers the exception handlers, is a good way to get the current register state.

Of course, if you are in an emulator rather than on real hardware, then the emulator might have good options for doing the above an easier way, but you still need it when you come to real hardware that isn't booting.

I really disagree with this statement. Printf debugging is slow, but it surely works really well. Integrating support for a remote debugger or a local debugger is a lot more work out of the box, and printf debugging is much simpler to implement.

There hasn't been one issue in my OS that could not get fixed by printf debugging. Printf debugging might not be suitable for you, but it sure as hell is for 90% of people here.

rdos · Post by **rdos** » Mon Apr 18, 2016 5:27 am

MollenOS wrote:
rdos wrote:
gerryg400 wrote:I must be misunderstanding printf debugging. What's wrong with it ?
Everything. They alter state and introduce new bugs + cannot tell you what a debugger typically can tell you. Once the basic exception handlers are in place, the next step is to integrate a remote debugger or local debugger thread so you can single step the kernel, preferably in source-code form. Before that is done, planting "int 3" in code, which triggers the exception handlers, is a good way to get the current register state.

Of course, if you are in an emulator rather than on real hardware, then the emulator might have good options for doing the above an easier way, but you still need it when you come to real hardware that isn't booting.
I really disagree with this statement. Printf debugging is slow, but it surely works really well. Integrating support for a remote debugger or a local debugger is a lot more work out of the box, and printf debugging is much simpler to implement.

There hasn't been one issue in my OS that could not get fixed by printf debugging. Printf debugging might not be suitable for you, but it sure as hell is for 90% of people here.

Well, if you can do printf, then you can do register dumps, and if you can do register dumps and have support for keyboard, then you can do an interactive debugger which is far more effective than printf-debugging will ever be. You are just used to an inefficient method.

BTW, I developed the low level stuff in my OS before emulators were generally available, so one of my first steps was to do a 486-emulator so I could trace the boot process. After that, I made the integrated debugger that works after multithreading is started. It runs as a thread, and individual faulted threads can be inspected, modified and single-stepped / restarted regardless of whether they run in kernel or application space, including V86-mode, protected mode and long mode. This is still available and used when there are problems in the startup process on real hardware or in specific drivers. Lastly, I ported OpenWatcom and their remote debugging stub so I could debug applications (and drivers) at source level remotely, which is the optimal way of debugging. I use this to debug device-drivers too, because then I just define the "test" syscall, chain it to the code I want to debug in kernel space, and can then trace into it with the application debugger.
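
As a rough illustration of the "if you can do printf, then you can do register dumps" step, here is a minimal hosted C sketch of formatting a saved register frame. It is not rdos's code: `struct reg_frame`, its field layout and `format_reg_dump` are hypothetical names chosen for the example, and a real kernel would fill the frame from its exception stub (for instance the one reached by a planted "int 3") and write the text to the screen or a serial port instead of a buffer.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical snapshot of the registers an exception stub might save.
   The layout is an assumption for illustration, not a real kernel's frame. */
struct reg_frame {
    uint32_t eax, ebx, ecx, edx;
    uint32_t esi, edi, ebp, esp;
    uint32_t eip, eflags;
    uint32_t vector;   /* exception number, e.g. 3 for a breakpoint */
};

/* Format the frame as three fixed-width lines into buf.
   Returns what snprintf returns: the length the full text needs,
   excluding the terminating NUL. */
int format_reg_dump(char *buf, size_t cap, const struct reg_frame *f)
{
    assert(buf != NULL && f != NULL);
    return snprintf(buf, cap,
        "EXC %02u EIP=%08X EFL=%08X\n"
        "EAX=%08X EBX=%08X ECX=%08X EDX=%08X\n"
        "ESI=%08X EDI=%08X EBP=%08X ESP=%08X\n",
        (unsigned)f->vector, (unsigned)f->eip, (unsigned)f->eflags,
        (unsigned)f->eax, (unsigned)f->ebx, (unsigned)f->ecx, (unsigned)f->edx,
        (unsigned)f->esi, (unsigned)f->edi, (unsigned)f->ebp, (unsigned)f->esp);
}
```

Keeping the formatting separate from the output device means the same routine can be exercised on an emulator or a host build before it is trusted on real hardware.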

gerryg400 · Post by **gerryg400** » Mon Apr 18, 2016 5:53 am

rdos wrote:Testing on real hardware should be done ASAP because emulators are not real hardware. Debugging is best done by catching serious bugs like protection fault, stack fault and double faults with handlers, and then printing register state on-screen. These handlers can be tested on an emulator. Printf-debugging is worthless.

I still don't understand rdos. You say here that you print registers on the screen. Is that not printf debugging ? What precisely is the thing called 'Printf-debugging' that you say is worthless ?

Schol-R-LEA · Post by **Schol-R-LEA** » Mon Apr 18, 2016 9:08 am

Perhaps I am misreading this, but I suspect that rdos is actually arguing against the sort of ad-hoc 'I got this far, I got this far, this is what this value is now' sort of print debugging that most people use, and advocating using a more systematic approach, by writing a sort of monitor program for doing the debugging in. I have to agree, but I think rdos is saying it in a way that is a bit unclear, not to mention a bit too strident - writing such a monitor is itself a significant undertaking, so in the earliest stages of OS development it wouldn't be feasible. It is good advice, but probably premature until you have at least some of the OS facilities already in place.

gerryg400 · Post by **gerryg400** » Mon Apr 18, 2016 2:50 pm

Schol-R-LEA wrote:Perhaps I am misreading this, but I suspect that rdos is actually arguing against the sort of ad-hoc 'I got this far, I got this far, this is what this value is now' sort of print debugging that most people use, and advocating using a more systematic approach, by writing a sort of monitor program for doing the debugging in. I have to agree, but I think rdos is saying it in a way that is a bit unclear, not to mention a bit too strident - writing such a monitor is itself a significant undertaking, so in the earliest stages of OS development it wouldn't be feasible. It is good advice, but probably premature until you have at least some of the OS facilities already in place.

"I got this far, I got this far, this is what this value is now" sounds to me like logging and is completely respectable. RDOS cannot have meant that logging is worthless.

In reality the best tool for a debugging job depends on lots of things about the bug and the system that it affects. In some cases a monitor or debugger cannot be used. In those cases printfing may be more useful. As an experienced engineer RDOS must know that. His statement was not correct and certainly not helpful.

jojo · Post by **jojo** » Mon Apr 18, 2016 3:37 pm

You guys know what the best tool/method is?

The one that gets your job done.

Schol-R-LEA · Post by **Schol-R-LEA** » Mon Apr 18, 2016 3:54 pm

I think he was arguing more for being systematic about it, i.e., having a structured set of logging functions rather than just throwing a printf call in, having an actual monitor-debugger for handling the testing, etc.

However, as I said, while this is a great idea, it wouldn't be practical at many points on the OS dev process, especially early on. In the end, Jojo hit the nail on the head - having something that works counts most of all.

Hellbender · Post by **Hellbender** » Tue Apr 19, 2016 1:37 am

Also, when you add/remove printf (or enable/disable logging), you change the way the code works. Different code paths, different addresses, different register contents, etc. That can hide or create problems, just as the differences between debug and release builds often do.

So using non-invasive debugging (I think that's what rdos was after) is better, but as was stated above, requires more work to get going.

However, I want to add that avoiding 'the proper way' just because 'it requires more work' is plainly the wrong attitude. 'The proper way' is actually the long-term least-effort way (almost by definition).

gerryg400 · Post by **gerryg400** » Tue Apr 19, 2016 4:09 am

Hellbender wrote:Also, when you add/remove printf (or enable/disable logging), you change the way the code works. Different code paths, different addresses, different register contents, etc. That can hide or create problems, just as the differences between debug and release builds often do.

So using non-invasive debugging (I think that's what rdos was after) is better, but as was stated above, requires more work to get going.

However, I want to add that avoiding 'the proper way' just because 'it requires more work' is plainly the wrong attitude. 'The proper way' is actually the long-term least-effort way (almost by definition).

Whether a debugging technique is invasive or not depends on the system and the bug. Even an ICE is invasive when it stops a CPU core because the rest of the system continues to run and the bug may depend on interactions with parts that are still running. Some issues cannot be debugged that way. And in those cases a clever logging system might be able to tell you what's happening.
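
One possible shape for such a "clever logging system" is an in-memory trace ring: the hot path stores two integers and never formats text, so it disturbs timing far less than a printf, and the buffer is read out afterwards. This is only a sketch under assumptions. `trace_ring`, `trace_put`, `trace_read` and the slot count are hypothetical names and values, the code assumes a single writer (a multicore kernel would need per-core rings or atomic updates), and wraparound of the 32-bit counter is not handled.

```c
#include <assert.h>
#include <stdint.h>

#define TRACE_SLOTS 8u   /* illustrative size; must be a power of two */

struct trace_entry {
    uint32_t id;      /* which trace point fired */
    uint32_t value;   /* one value of interest at that point */
};

struct trace_ring {
    struct trace_entry slot[TRACE_SLOTS];
    uint32_t next;    /* total number of entries ever written */
};

/* Record one event, overwriting the oldest entry once the ring is full. */
void trace_put(struct trace_ring *r, uint32_t id, uint32_t value)
{
    assert((TRACE_SLOTS & (TRACE_SLOTS - 1u)) == 0u);
    struct trace_entry *e = &r->slot[r->next & (TRACE_SLOTS - 1u)];
    e->id = id;
    e->value = value;
    r->next++;
}

/* Copy up to max retained entries into out, oldest first.
   Returns the number of entries copied. */
uint32_t trace_read(const struct trace_ring *r,
                    struct trace_entry *out, uint32_t max)
{
    uint32_t held  = r->next < TRACE_SLOTS ? r->next : TRACE_SLOTS;
    uint32_t first = r->next - held;   /* sequence number of the oldest entry */
    uint32_t n     = held < max ? held : max;
    for (uint32_t i = 0; i < n; i++)
        out[i] = r->slot[(first + i) & (TRACE_SLOTS - 1u)];
    return n;
}
```

The ring can be dumped from a fault handler or a monitor after the fact, which is how a log can still help when stopping a core would hide the bug.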

Note that no-one here is saying that emulators, debuggers, simulators, monitors are bad or has even ranked them on a scale of good to bad. And that's a good thing because it's situation-dependent.

But one person, RDOS, said that printf debugging is worthless and that is wrong. If you think that printf debugging is worthless then you are wrong too.

Hellbender · Post by **Hellbender** » Tue Apr 19, 2016 5:39 am

gerryg400 wrote:Whether a debugging technique is invasive or not depends on the system and the bug.
... If you think that printf debugging is worthless then you are wrong too.

You are correct that the proper method depends on the situation, and I too use printf debugging occasionally. It just should not be the only tool in the box.

Schol-R-LEA · Post by **Schol-R-LEA** » Tue Apr 19, 2016 6:35 am

gerryg400 wrote:But one person, RDOS, said that printf debugging is worthless and that is wrong.

Hyperbole is a thing. This whole argument comes down to rdos exaggerating his point to make it stick but failing to make it clear that it was an exaggeration, with the result that several others took that exaggeration literally.

Eh, it happens all the time, especially in text-based media.

Luns · Post by **Luns** » Wed Apr 20, 2016 7:03 am

Late to the thread, but wanted to note that hardware debuggers do exist and are very useful for debugging low level code. Probably a bit expensive for the hobbyist though.

abcdef4bfd · Post by **abcdef4bfd** » Thu Apr 21, 2016 9:02 am

Luns wrote:Late to the thread, but wanted to note that hardware debuggers do exist and are very useful for debugging low level code. Probably a bit expensive for the hobbyist though.

Yep, I don't have so much money.

jojo · Post by **jojo** » Thu Apr 21, 2016 11:59 am

Yeah, you just need one of these, at $3,000:
https://designintools.intel.com/product ... 3brext.htm

And one of these, for $1,100, to go with it:
https://designintools.intel.com/product ... ehswh3.htm

NBD

rdos · Post by **rdos** » Fri Apr 22, 2016 4:13 pm

OK, let me develop the argument with a few examples.

1. Assume you have some fault in the boot process. One of your device-drivers faults. Is printf-debugging useful to find this problem? No, because if you have decent fault handlers, then you will have the exact position and register context directly on your screen.

2. Assume you have some other error in the boot process that doesn't end in a fault but rather hangs the system. In this situation you might add printf at various places to figure out how far it goes.

3. Assume your boot process causes a RESET. In this case printf debugging will not be helpful because you are unlikely to see the last text before it reboots, and in most cases decent fault handlers will solve this too. If not, the best method would be to plant "int 3" at different positions until you have figured out where it happens.

If you are debugging things that are post-boot, then at least in RDOS there will be virtual consoles, so you cannot printf directly to the screen. Instead the output ends up in the current console, which might not be visible. At this point you want to use an application debugger or integrated debugger.

Let me give you another good example where printf debugging is not likely to be useful: Debugging multicore problems. Many multicore problems are highly dependent on timing, and even a tiny modification of the code might change the behavior. Also, as previously mentioned, I cannot debug multicore issues with printf because there will be multiple consoles, and printf would only create visible output if the console is in the foreground.

Another example (related to multicore):

4. Assume the kernel stack on one core gets exhausted by IPIs. First, a simple fault handler that just dumps register context (which is simple to do) is unlikely to spot the problem. The interactive debugger cannot solve it either, because when the kernel stack is full the scheduler will no longer work, and neither will the interactive debugger. Instead, I've created a separate monitor that is entered upon fatal (multicore) issues in the scheduler, which will freeze all the cores and use one core to set up an independent environment where register and memory context can be inspected per core. The scheduler then has a number of "asserts" which will enter the monitor when something goes wrong.
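
The scheduler "asserts" described in point 4 could look roughly like the hosted C sketch below. It is an assumption about the general pattern, not the RDOS implementation: `SCHED_ASSERT`, `monitor_enter`, `monitor_report` and `sched_dequeue` are hypothetical names, and `monitor_enter` here only records the first failure, whereas a real monitor would freeze the other cores, switch to its own stack and not return.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of the first failed scheduler check. */
struct monitor_report {
    const char *expr;   /* text of the failed condition */
    const char *file;
    int line;
    int entered;        /* nonzero once the monitor has been entered */
};

static struct monitor_report g_report;

/* Stand-in for entering the monitor. It keeps only the first failure,
   since later ones are usually consequences of it. */
void monitor_enter(const char *expr, const char *file, int line)
{
    assert(expr != NULL && file != NULL);
    if (g_report.entered)
        return;
    g_report.expr = expr;
    g_report.file = file;
    g_report.line = line;
    g_report.entered = 1;
}

/* Check an invariant; on failure hand control to the monitor. */
#define SCHED_ASSERT(cond)                                  \
    do {                                                    \
        if (!(cond))                                        \
            monitor_enter(#cond, __FILE__, __LINE__);       \
    } while (0)

/* Example invariant: a hypothetical ready queue must not be empty
   when the scheduler dequeues from it. */
int sched_dequeue(int ready_count)
{
    SCHED_ASSERT(ready_count > 0);
    return ready_count > 0 ? ready_count - 1 : 0;
}
```

The check costs one compare and branch on the normal path, so it can stay enabled in the builds that run on real hardware.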

OSDev.org
