Tips for random disk lockups

Posted: Fri Sep 21, 2007 9:31 pm
by bewing
I would think that just about all of us have run into random disk lockups at one time or another while building our OSes. I did a search, but couldn't find any threads with suggestions on which of the (millions, of course) possible things could be wrong. So I thought that such a thread might give future users (or even me!) a glimmer of an idea of where to look when we run into the problem (again).

I can even give one tip that hung me up for a bit a few months ago: I didn't realize back then that I needed to wait for an interrupt (or alternatively poll until the "busy" flag clears) in between EVERY SINGLE sector that I was reading or writing in PIO mode. I was under the impression that once I issued the command, I'd only have to wait once for the disk to become ready, and then the whole multi-sector transfer would be streamed in one gulp ... silly, I know. :wink:
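A minimal sketch of that per-sector wait, in NASM syntax for 32-bit protected mode. This is not bewing's code: the label names, the register conventions (ebx = sector count, at least 1; es:edi = destination buffer) and the legacy primary-bus ports 0x1F0/0x1F7 are assumptions. It polls the status register instead of waiting for the IRQ, assumes the READ SECTORS command has already been issued, and has no timeout, which a real driver would want.

```nasm
; Hypothetical sketch: polled PIO read, waiting before EVERY sector.
ATA_DATA    equ 0x1F0       ; primary bus data port (assumed)
ATA_STATUS  equ 0x1F7       ; primary bus status port (assumed)
ST_ERR      equ 0x01        ; error bit
ST_DRQ      equ 0x08        ; data request: a sector is ready to transfer
ST_DF       equ 0x20        ; device fault bit
ST_BSY      equ 0x80        ; drive busy

read_sectors_pio:           ; in: ebx = sector count, es:edi = buffer
        cld                 ; insw must increment edi
.next_sector:
        mov dx, ATA_STATUS
.wait:
        in al, dx
        test al, ST_BSY
        jnz short .wait     ; still busy preparing this sector
        test al, ST_ERR | ST_DF
        jnz short .fail     ; drive reported an error
        test al, ST_DRQ
        jz short .wait      ; not busy, but no data ready yet
        mov dx, ATA_DATA
        mov ecx, 256        ; 256 words = one 512-byte sector
        rep insw            ; transfer exactly ONE sector
        dec ebx
        jnz short .next_sector  ; go back and wait again for the next one
        clc                 ; CF = 0: all sectors read
        ret
.fail:
        stc                 ; CF = 1: error, al holds the status byte
        ret
```

The point is that the `.wait` loop sits inside the per-sector loop; moving it in front of `.next_sector` reproduces the "one gulp" mistake described above.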


But now I've run into it again. I just moved from Bochs (where everything was running perfectly) onto *gasp* real hardware. And now I get random disk lockups -- it seems after about 10 or 20 reads/writes.

So: I'd love to hear some examples of what caused disk lockups for all the rest of you, and maybe it'll shorten my debugging time. I hate debugging random stuff. :x

Posted: Sat Sep 22, 2007 7:20 am
by frank
I've noticed that Bochs and real hardware sometimes issue a different number of IRQs for the same operation. This was one of the problems I had getting my floppy driver to work on real hardware: Bochs generated an IRQ when the floppy drive was reset and the real hardware didn't. I'd keep an eye out for stuff like that.

Posted: Sat Sep 22, 2007 6:56 pm
by bewing
OK, this particular bout of disk lockups was caused by a hardware timing problem. I could read all day long, as long as I never did any writes. It was the writes that would kill me (usually after a delay).

Hardware: AMD K2/266MHz, older non-ATX motherboard, IBM 1GB ATA disk with no UDMA support of any sort.

The ATA6 specification says that you are intended to use REP INSW and REP OUTSW to copy data in PIO mode. For reading (REP INSW) this works. For writing, it does not, on this hardware setup. I needed to do a loop:

Code:

.odlylp:
	call io_delay	; short pause before each word
	outsw		; write one word from [esi] to port dx, esi += 2
	dec ecx		; ecx = words left to write
	jg short .odlylp
If I tried to use REP OUTSW instead, the results varied: sometimes the disk would lock up right then, sometimes the write would work correctly, sometimes the write would fail gracefully, and sometimes the write would fail invisibly and the NEXT disk access would lock up.
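The thread does not show what `io_delay` does. One common way to get a short pause on PC hardware is to do a few dummy reads of the ATA alternate status port, since each port read costs a slow I/O bus cycle and, unlike the regular status port, reading it does not acknowledge a pending interrupt. The routine below is a hypothetical stand-in under that assumption: the port number 0x3F6 (primary bus) and the choice of four reads are illustrative, not taken from bewing's code.

```nasm
; Hypothetical io_delay: a guess at a workable delay, not the original routine.
ATA_ALT_STATUS equ 0x3F6    ; primary bus alternate status port (assumed)

io_delay:
        push eax            ; al is clobbered by the dummy reads
        push edx            ; the caller needs dx = data port for outsw
        mov dx, ATA_ALT_STATUS
        in al, dx           ; four dummy reads; the count is a tunable guess
        in al, dx
        in al, dx
        in al, dx
        pop edx
        pop eax
        ret
```

It preserves eax and edx and leaves ecx and esi alone, so it can be called from the write loop above without disturbing the word count, the source pointer, or the port in dx.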

Posted: Mon Sep 24, 2007 10:36 am
by Combuster
Hardware: AMD K2
That must've been a typo, that chip doesn't exist...

Posted: Mon Sep 24, 2007 10:58 am
by Brynet-Inc
Combuster wrote:
Hardware: AMD K2
That must've been a typo, that chip doesn't exist...
It was a nickname for AMD's K6-2 models. I know a few people who call it that, so stop being so damn pedantic ;)