Load via BIOS & USB floppy - fail after mode switch [solved]

Posted: Mon Apr 11, 2011 5:01 pm
by DavidCooper
I've finally dared to try booting a laptop with my OS via a USB floppy drive. The first few modules load fine, but after switching into protected mode and back, the machine gets stuck at the next attempt to load more. I've tried putting in a little bit of extra code (see below) to switch into protected mode and back before loading the first modules (without doing anything else at all while in protected mode) and it then gets stuck before loading the first module. I've also tried turning the A20 on and running the LGDT instruction before loading the first modules and this doesn't prevent the BIOS from loading them (so the problem can't be anything to do with my GDT getting in the way of the BIOS or the memory above 1MB being opened up). It looks as if it must be the mode switch itself that's causing the problem, so something must be different after the processor has run in protected mode.

Here is the mode-switch code I added in that causes the BIOS to lose its ability to load from the USB floppy drive:-

Code:

250 (disable interrupts)
15 32 192 12 1 15 34 192 (standard way to switch to protected mode)
[CPU now running in protected mode, but CS still needs to be loaded]
102 234 a a a a 8 0 (far jump to address aaaa which is the next byte after this)
[CPU now running in 32-bit protected mode]
234 b b b b 24 0 (far jump to address bbbb which is the next byte after this)
[CPU now running in 16-bit protected mode]
15 32 192 36 254 15 34 192 (standard way to switch to real mode)
[CPU now back in real mode, but CS needs to be loaded]
234 c c 0 0 (far jump to address cc which is the next byte after this)
[CPU now fully back in real mode]
251 (enable interrupts)

GDT (standard values that most people probably use):-

Code:

0 0 0 0 0 0 0 0
255 255 0 0 0 155 207 0 (32-bit code seg)
255 255 0 0 0 147 207 0 (32-bit data seg)
255 255 0 0 0 155 15 0 (16-bit code seg)
255 255 0 0 0 147 15 0 (16-bit data seg)
(Changing 155 and 147 to 154 and 146 makes no difference - the accessed bit).
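
For anyone checking the GDT bytes above by hand, here is a minimal Python sketch that decodes each 8-byte entry into its base, limit and flag bits. It assumes the standard x86 segment-descriptor layout; the name `decode_descriptor` and the dictionary keys are made up for illustration and are not part of the OS being discussed.

```python
# Decode 8-byte GDT descriptors given as lists of decimal byte values.
# Assumes the standard x86 code/data segment-descriptor layout.

def decode_descriptor(raw):
    """Return the main fields of one 8-byte segment descriptor."""
    assert len(raw) == 8
    limit = raw[0] | (raw[1] << 8) | ((raw[6] & 0x0F) << 16)
    base = raw[2] | (raw[3] << 8) | (raw[4] << 16) | (raw[7] << 24)
    access = raw[5]
    flags = raw[6] >> 4
    return {
        "base": base,
        "limit": limit,
        "present": bool(access & 0x80),
        "code": bool(access & 0x08),          # executable bit of the type field
        "accessed": bool(access & 0x01),
        "granularity_4k": bool(flags & 0x8),  # G bit
        "default_32bit": bool(flags & 0x4),   # D/B bit
    }

# The table exactly as posted (decimal bytes).
GDT = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [255, 255, 0, 0, 0, 155, 207, 0],
    [255, 255, 0, 0, 0, 147, 207, 0],
    [255, 255, 0, 0, 0, 155, 15, 0],
    [255, 255, 0, 0, 0, 147, 15, 0],
]

for index, raw in enumerate(GDT):
    # index * 8 is the selector value that refers to this entry
    print(index * 8, decode_descriptor(raw))
```

Decoded this way, the two entries with 207 in byte 6 are 32-bit segments with 4 KB granularity, and the two with 15 are 16-bit segments with byte granularity; all four have base 0 and a limit field of 0xFFFFF.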

It all works fine on old machines with built-in floppy drives and in Bochs.

So, here are the questions:-

1. Have any of you managed to do what I'm trying to do (i.e. use the BIOS to load and save via a USB floppy drive after running in protected mode)?

2. Does anyone have any ideas as to what may have changed in the machine (perhaps in some hidden register) which is upsetting the BIOS?

Re: Load via BIOS and USB floppy - fails after mode switch

Posted: Mon Apr 11, 2011 7:14 pm
by Brendan
Hi,

Code:

250 (disable interrupts)
15 32 192 12 1 15 34 192 (standard way to switch to protected mode)
[CPU now running in protected mode, but CS still needs to be loaded]
102 234 a a a a 8 0 (far jump to address aaaa which is the next byte after this)
[CPU now running in 32-bit protected mode]
234 b b b b 24 0 (far jump to address bbbb which is the next byte after this)
[CPU now running in 16-bit protected mode]
15 32 192 36 254 15 34 192 (standard way to switch to real mode)
[CPU now back in real mode, but CS needs to be loaded]
234 c c 0 0 (far jump to address cc which is the next byte after this)
[CPU now fully back in real mode]
251 (enable interrupts)
Could you convert this unmaintainable stream of gibberish into actual code that sane people can read?

Alternatively, given that (in the long term) an OS project needs to (eventually) attract volunteers, and that attracting volunteers is hard enough with good, clean, maintainable code; why bother fixing the bugs? It's easier to delete everything and start from scratch now, rather than later...


Cheers,

Brendan

Re: Load via BIOS and USB floppy - fails after mode switch

Posted: Mon Apr 11, 2011 8:55 pm
by DavidCooper
Brendan wrote:Could you convert this unmaintainable stream of gibberish into actual code that sane people can read?
I wasn't expecting anyone to try to read the machine code - I included it purely so that you'd see from the comments that it does nothing more than switch into protected mode and back by the bog-standard route that everyone uses. If there had been a possible bug in it I would have translated it into standard equally-unmaintainable (or equally-maintainable) mnemonical gibberish for you. Here it is again though with some added explanation just to clarify that it does nothing out of the ordinary, though I'm not going to insult anyone's intelligence by turning the far jump 234 instructions into standard gibberish.

Code:

250 (disable interrupts)
15 32 192 (mov eax,cr0)
12 1 (or al,1)
15 34 192 (mov cr0,eax)
(standard way to switch to protected mode)
[CPU now running in protected mode, but CS still needs to be loaded]
102 234 a a a a 8 0 (far jump to address aaaa which is the next byte after this)
(the 102 is a prefix to make the far jump instruction use a 32-bit address)
[CPU now running in 32-bit protected mode]
234 b b b b 24 0 (far jump to address bbbb which is the next byte after this)
[CPU now running in 16-bit protected mode]
15 32 192 (mov eax,cr0)
36 254 (and al,254)
15 34 192 (mov cr0,eax)
(standard way to switch to real mode)
[CPU now back in real mode, but CS needs to be loaded]
234 c c 0 0 (far jump to address cc which is the next byte after this)
[CPU now fully back in real mode]
251 (enable interrupts)
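
Since most opcode references list bytes in hex rather than decimal, a small Python sketch like the following can convert the listing above for cross-checking. The pairing of bytes with mnemonics follows the annotations in the listing; the names `LISTING` and `to_hex` are purely illustrative, and the address placeholders (a, b, c) are left out.

```python
# Convert the decimal byte values from the listing into hex so they can be
# compared against an opcode reference. Jump target bytes are omitted.

LISTING = [
    ([250], "cli (disable interrupts)"),
    ([15, 32, 192], "mov eax,cr0"),
    ([12, 1], "or al,1"),
    ([15, 34, 192], "mov cr0,eax"),
    ([102, 234], "operand-size prefix + far jump opcode"),
    ([234], "far jump opcode"),
    ([36, 254], "and al,254"),
    ([251], "sti (enable interrupts)"),
]

def to_hex(values):
    """Format a list of byte values as space-separated two-digit hex."""
    return " ".join(f"{v:02X}" for v in values)

for values, meaning in LISTING:
    print(f"{to_hex(values):<10} {meaning}")
```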

Brendan wrote:Alternatively, given that (in the long term) an OS project needs to (eventually) attract volunteers,
It doesn't need any volunteers. The main purpose of my OS is to run A.I. on top of it which will at some future time take over the programming, so I'll have a practically infinite workforce to call on to develop it further. The problem is getting to that point, because my old machines are clapped-out and failing fast, so I have to make the switch to my newer machines which lack internal floppy drives.
Brendan wrote:and that attracting volunteers is hard enough with good, clean, maintainable code; why bother fixing the bugs?
The code is perfectly clean and maintainable, and it's damned good code too - well worth fixing any bugs in. There's one known bug which I haven't got round to tracking down (related to de-indexing external indexes - very low priority), but apart from that the OS is rock solid - never crashes, never freezes.
Brendan wrote:It's easier to delete everything and start from scratch now, rather than later...
I'm not going to delete perfectly good code just because it's written using a sensible programming method rather than using some masochistic assembly language system which pours tons of unnecessary complexity on top of everything you're trying to do. I have no problem with you complaining that you can't read my code as it's done using a non-standard programming method - that's a fair complaint, but you're not being fair in attacking a non-standard method of programming on the basis that it's non-standard. If I had to work within assembler or a compiler, I wouldn't live long enough to get my programs written because I'd have to spend years getting on top of the over-complicated tools (which are quite frankly a mess) and then the reward for all that effort would be to end up trapped in a system where it's infinitely harder to debug code - I can debug mine with ease by running it through a monitor program and reading the indexed machine code as it runs. The only kinds of bug that get in my way are where the hardware throws up unexpected problems.

I asked two questions (labelled 1 and 2 at the bottom of my original post). I simply want to know if what I'm trying to do has successfully been done by someone else who might know how to get round the problem, or failing that, I want to know if there's anything that's known to change in the machine after a switch to protected mode and back. If the answer to both of those turns out to be no, I may simply have to change track and write my own USB code to communicate directly with the external floppy drive and load it through the BIOS without leaving real mode until it's all in place, but that would be an expensive delay (my health isn't good and I can't afford unnecessary delays).

Re: Load via BIOS and USB floppy - fails after mode switch

Posted: Mon Apr 11, 2011 9:52 pm
by Brendan
Hi,

USB floppy behaves no differently than legacy floppy (when used via the "int 0x13" BIOS functions); there are no hidden registers or anything that your code could have possibly messed up; and (other than modifying EAX and consuming CPU cycles) your code affects nothing that could possibly influence other code.

That's why I originally suspected a bug and wanted to double check the code.

For reference (for other people who might care), here's the code converted back into assembly (NASM syntax):

Code:

	cli
	mov eax,cr0
	or al,0x1
	mov cr0,eax                  ;Enable protected mode

	jmp dword 0x8:.temp1         ;Jump to 32-bit code segment
.temp1:

	bits 32

	jmp dword 0x18:.temp2        ;Jump to 16-bit code segment
.temp2:

	bits 16

	mov eax,cr0
	and al,0xfe
	mov cr0,eax                  ;Disable protected mode

	jmp word 0x0:.temp3          ;Reload CS with real-mode segment 0
.temp3:
	sti

Unfortunately, now that I've been able to confirm that the bug is not in this code (unless the offsets in the jumps are wrong), the only remaining possibility is that there are bugs somewhere else in the code.

I wish you luck trying to find someone willing to search for this bug in the remainder of your source code.


Cheers,

Brendan

Re: Load via BIOS and USB floppy - fails after mode switch

Posted: Mon Apr 11, 2011 10:16 pm
by Brynet-Inc
Brendan wrote:I wish you luck trying to find someone willing to search for this bug in the remainder of your source code.
s/source code/magic numbers/

As for the problem, obviously the BIOS is simulating a floppy drive.. USB floppy drives are simply USB mass storage devices.

Assuming for a moment that Loseth.., I mean, DavidCooper's crazy OS isn't at fault.. the problem might be somewhere else.

Re: Load via BIOS and USB floppy - fails after mode switch

Posted: Mon Apr 11, 2011 10:53 pm
by Brendan
Hi,
Brynet-Inc wrote:
Brendan wrote:I wish you luck trying to find someone willing to search for this bug in the remainder of your source code.
s/source code/magic numbers/

As for the problem, obviously the BIOS is simulating a floppy drive.. USB floppy drives are simply USB mass storage devices.
I think it's a real USB floppy drive (not emulated) - you can still buy them. Mine looks very similar to this one.

For some reason I have a feeling that something somewhere is trashing the EBDA; and it "works by accident" for legacy floppy because the EBDA isn't relied on by the BIOS as much, but fails when the BIOS is relying on data in the EBDA to support its USB stack. It's just a silly guess though.

I'm guessing we'll never know the real problem - DavidCooper will work around the symptoms instead of finding/fixing the real bug, and be back in 3 weeks time with some other bug. Of course maybe I'm wrong, and he'll unleash an army of "magical AI workers" that will find and fix all the bugs for him while they write an elite OS that makes every other piece of software obsolete, and then go on to cure cancer and solve world hunger (while David takes a holiday flying through space in a unicorn-powered submarine).


Cheers,

Brendan

Re: Load via BIOS and USB floppy - fails after mode switch

Posted: Mon Apr 11, 2011 11:13 pm
by Chandra
I'll be replying to this post with respect to Brendan's converted code.

Code:

   cli
   mov eax,cr0
   or al,0x1
   mov cr0,eax                  ;Enable protected mode

   jmp dword 0x8:.temp1         ;Jump to 32-bit code segment
.temp1:

   bits 32

   jmp dword 0x18:.temp2        ;Jump to 16-bit code segment
.temp2:

   bits 16

   mov eax,cr0
   and al,0xfe
   mov cr0,eax                  ;Disable protected mode

   jmp word 0x0:.temp3          ;Reload CS with real-mode segment 0
.temp3:
   sti

Nowhere in this code do I see GDTR being loaded. But selectors 0x18 and 0x8 are used, which might be the reason behind the original issue. I'm not going to complain about DavidCooper's way of writing code since it definitely looks unique (and at the same time unreadable), but the use of the decimal number system looks strange.
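
To make the selector point concrete, here is a small Python sketch that splits a selector into its table index, table indicator and privilege level, and looks the index up in the GDT layout posted at the start of the thread. The names `decode_selector` and `GDT_LABELS` are hypothetical; the labels are copied from the original post. The selectors only mean anything once GDTR points at that table, which the first post says is done with LGDT elsewhere in the loader.

```python
# Split an x86 segment selector into its three fields and map the index
# onto the GDT layout from the first post (labels copied from that post).

def decode_selector(selector):
    """Return index, table indicator and requested privilege level."""
    return {
        "index": selector >> 3,                        # bits 15..3
        "table": "LDT" if selector & 0x4 else "GDT",   # bit 2
        "rpl": selector & 0x3,                         # bits 1..0
    }

GDT_LABELS = [
    "null",
    "32-bit code seg",
    "32-bit data seg",
    "16-bit code seg",
    "16-bit data seg",
]

for selector in (0x08, 0x18):
    fields = decode_selector(selector)
    print(hex(selector), fields, GDT_LABELS[fields["index"]])
```

So 0x8 selects entry 1 (the 32-bit code segment) and 0x18 selects entry 3 (the 16-bit code segment), which matches the comments in the converted code.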

Re: Load via BIOS and USB floppy - fails after mode switch

Posted: Mon Apr 11, 2011 11:38 pm
by Brynet-Inc
Brendan wrote:I think it's a real USB floppy drive (not emulated) - you can still buy them. Mine looks very similar to this one.
AFAIK, it's still a USB device.. not unlike using a flash device emulating a floppy or hard drive.

The BIOS/firmware is still exposing only an approximation of the legacy device that works with BIOS services; perhaps this trickery ceases when you switch to protected mode (so that the device can be enumerated by the OS without causing problems).

For all intents and purposes, it is an emulated drive.

Re: Load via BIOS and USB floppy - fails after mode switch

Posted: Mon Apr 11, 2011 11:45 pm
by DavidCooper
Thanks for your comments and sorry to have taken up your time on something where there was actually a silly bug causing a misdiagnosis. As you already know, my attempt to debug this led to me adding in some code to switch to protected mode and back before loading in the first module, and it seemed to replicate the problem, but I now know that the switch to protected mode and back caused no such problem for the BIOS. I had previously moved the bit of code that loads the GDT (using the LGDT instruction) to an earlier position in order to test whether that was messing up the BIOS's GDT (just in case it had created one in order to access the USB floppy and was assuming it wouldn't be changed by me). That repositioned code worked in Bochs by luck - Bochs must set the segment registers to zeros on entry - but when running it directly on the laptop it failed because the segment registers must have been different.
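
As a rough illustration of why the segment registers matter here: in real mode a memory operand is addressed as segment * 16 + offset, so an LGDT that works when the segment register is zero reads its 6-byte pointer from somewhere else entirely when it is not. The Python sketch below assumes the LGDT operand is addressed through DS (the default for a plain memory operand); the offset and the non-zero DS value are invented for the example and are not taken from the actual loader.

```python
# Real-mode address translation: linear = segment * 16 + offset.
# The offset and DS values below are hypothetical examples.

def linear_address(segment, offset):
    """Return the linear address a real-mode segment:offset pair refers to."""
    return (segment << 4) + offset

GDT_POINTER_OFFSET = 0x7E00  # hypothetical offset of the 6-byte GDT pointer

for ds in (0x0000, 0x07C0):  # DS = 0 versus a hypothetical non-zero DS
    address = linear_address(ds, GDT_POINTER_OFFSET)
    print(f"DS={ds:04X} -> LGDT reads its pointer from {address:05X}")
```

Setting the segment register explicitly before the LGDT, rather than relying on whatever value was left on entry, removes that dependency.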

Anyway, the long and the short of it is that changing to protected mode and back is not the cause of the problem. Also, I've just added a massive half-minute delay loop into the added code (while in protected mode) to slow down the return from protected mode in an attempt to simulate a lot of code running in case a delay was relevant, but the BIOS then loaded the first few modules without difficulty. So, it appears that what I'm trying to do should indeed be possible - it's just a matter of hunting for the bug elsewhere. No one will be able to help with that without learning how to read machine code directly in order to work with my OS, but I'm confident that I can now find it without help.

Thanks again for your input.

[EDIT - p.s. the EBDA is completely untouched]

Re: Load via BIOS and USB floppy - fails after mode switch

Posted: Mon Apr 11, 2011 11:49 pm
by Brynet-Inc
One additional example I can come up with, I own several ATAPI LS-120 "SuperDisk" drives.. essentially they're fancy floppy drives that support higher capacity media (..120M) while also retaining compatibility with legacy floppy drives.

The BIOS on many systems can simulate a floppy or hard disk to allow booting from these special devices; this also ceases to function once the OS switches to protected mode.

This functionality was probably the precursor to offering such services for USB media, but that alone means there is some fairly complex mechanism behind the scenes.. keeping up this facade until it's no longer necessary, perhaps it's handled in SMM mode?

The point being, don't switch from protected mode to real mode and assume that such luxuries will still be available to you.. better yet, don't switch back to real mode at all.

Re: Load via BIOS and USB floppy - fails after mode switch

Posted: Tue Apr 12, 2011 12:07 am
by DavidCooper
Brendan wrote:I'm guessing we'll never know the real problem - DavidCooper will work around the symptoms instead of finding/fixing the real bug, and be back in 3 weeks time with some other bug. Of course maybe I'm wrong, and he'll unleash an army of "magical AI workers" that will find and fix all the bugs for him while they write an elite OS that makes every other piece of software obsolete, and then go on to cure cancer and solve world hunger (while David takes a holiday flying through space in a unicorn-powered submarine).
I always enjoy a bit of ridicule - it'll just make the success all the more fun.

Anyway, I'll mark this thread as solved, even though it isn't technically solved yet - the problem is no longer the one stated on the tin.

BP value miles from SP caused BIOS to freeze on int 13h call

Posted: Wed Apr 13, 2011 1:12 am
by DavidCooper
Bug found at last. It was caused by BP. In a routine called "mdbm" (make disk bitmap) which makes a bitmap of sectors to be loaded, I used BP to hold the address of the end of the directory map. The value in BP was miles away from the one in SP, and that caused the BIOS to throw a wobbly, freezing it up such that it simply wouldn't return from the int 13h call. I hunted down the bug by writing a bit of extra code to switch to real mode and load a couple of sectors while displaying on the screen the track value from immediately after the int 13h call so that I could see if it had run or not before the BIOS froze. By calling this code from different places I was able to pin down the point where the problem was: the BIOS loaded the test sectors when the call was immediately in front of a bit of code loading BP and it failed to load them when I moved it to immediately after the bit of code loading BP. The solution was to add two bytes of code to make BP equal to SP before calling the BIOS. The result: my entire OS loaded from USB floppy and I was able to go on to load the word processor and text files.
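
For readers wondering what a two-byte fix of this kind might look like in the decimal notation used in this thread, the Python sketch below builds the encoding of a 16-bit `mov bp,sp`. This is an assumption: the post does not say which instruction or encoding was actually used, and the helper name `mov_reg_reg` is invented for the example. It uses opcode 0x89 (MOV r/m16, r16); the alternative form with opcode 0x8B is equally valid and also two bytes.

```python
# Build the two-byte encoding of a 16-bit register-to-register MOV.
# Assumption: the fix was `mov bp,sp`; the post does not confirm this.

REG16 = {"ax": 0, "cx": 1, "dx": 2, "bx": 3, "sp": 4, "bp": 5, "si": 6, "di": 7}

def mov_reg_reg(dest, src):
    """Encode `mov dest, src` with opcode 0x89 (MOV r/m16, r16)."""
    # ModRM: mod=11 (register direct), reg=source, r/m=destination
    modrm = 0xC0 | (REG16[src] << 3) | REG16[dest]
    return [0x89, modrm]

encoded = mov_reg_reg("bp", "sp")
print(encoded)                              # decimal, as used in this thread
print(" ".join(f"{b:02X}" for b in encoded))  # hex, for opcode references
```

In a loader like the one described, these two bytes would sit just ahead of the int 13h call so that BP and SP agree when the BIOS takes over.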

I had no idea that any BIOS would require you to make sure BP and SP hold values that are relatively close together, but this one does. For the record, it's a Megatrends BIOS in an Advent netbook.