[SOLVED] Partially written string literals

eisdt
Member
Posts: 31
Joined: Sat Nov 07, 2015 9:58 am
Location: Italy

Re: [SOLVED] Partially written string literals

Post by eisdt »

intx13, will you post the code you wrote to dump the MBR to disk? Also, shouldn't this issue be discussed in a separate topic? I marked this one as solved because the printing problem was indeed solved, but the sneakier problem of the BIOS touching reserved memory was not. I believe that one deserves its own topic and should be documented accordingly. It's very weird that there doesn't seem to be any material about such a problem.
intx13
Member
Posts: 112
Joined: Wed Sep 07, 2011 3:34 pm

Re: [SOLVED] Partially written string literals

Post by intx13 »

eisdt wrote:intx13, will you post the code you wrote to dump the MBR to disk?
The code I wrote wasn't working on the laptop for some reason. Worked fine on Bochs. Possibly because of the very bug/feature we identified? It's on a PC without an internet connection, but if I get a chance I will grab it. You could probably reproduce it really quickly with something like this (untested):

Code: Select all

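[BITS 16]
org 0x7C00                  ; BIOS loads the boot sector at 0000:7C00

top:
  jmp 0x0000:force          ; far jump so CS:IP is 0000:7Cxx on every BIOS

prepad:
  times 400 db '.'          ; dot filler: any byte the BIOS mangles stands out

force:
  xor AX, AX
  mov DS, AX                ; DS:SI must point at the disk address packet
  mov SI, dap
  mov AX, 0x4301            ; AH=0x43 extended write, AL=0x01 write flags
  mov DL, 0x80              ; first hard disk (assumes we booted from it)
  int 0x13
  jmp $                     ; hang here

dap:                        ; disk address packet for the INT 13h extensions
  db 0x10                   ; packet size
  db 0x00                   ; reserved
  dw 0x01                   ; number of sectors to write
  dd 0x00007C00             ; source buffer, segment:offset = 0000:7C00
  dq 0x0000000000000001     ; destination LBA 1 (the second sector)

postpad:
  times 446 - ($ - top) db '.'

parttable:
  times 64 db 0             ; empty partition table
  dw 0xAA55                 ; boot signature

buffer:
  times 512 db 0            ; LBA 1 placeholder, overwritten by the dump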

Code: Select all

nasm -f bin <that file>
When assembled and written to disk, the first 512 bytes should be a bootable MBR containing the "dump 0x7C00 - 0x7DFF" code, and the second 512 bytes should be zeroed. After running, the second 512 bytes should be replaced with the MBR read back from RAM at runtime.
eisdt wrote: Also, shouldn't this issue be discussed in a separate topic? I marked this one as solved because the printing problem was indeed solved, but the sneakier problem of the BIOS touching reserved memory was not. I believe that one deserves its own topic and should be documented accordingly. It's very weird that there doesn't seem to be any material about such a problem.
Well, I think the printing problem was caused by the bug/feature. I believe that either (1) the bug is in INT10H and since your solution avoids INT10H the bug doesn't occur, or (2) the bug has to do with the BPB and happens prior to execution of the MBR, and your solution happened to be structured in a way that it wasn't noticeably affected (Brendan's suggestion).

Post by eisdt »

intx13 wrote: Well, I think the printing problem was caused by the bug/feature. I believe that either (1) the bug is in INT10H and since your solution avoids INT10H the bug doesn't occur, or (2) the bug has to do with the BPB and happens prior to execution of the MBR, and your solution happened to be structured in a way that it wasn't noticeably affected (Brendan's suggestion).
Let me raise this problem again.
To find out whether int $0x10 was buggy (I don't like that term, though: we're talking about AMI, and I strongly doubt a bug like this would be left unfixed), I copied a chunk of code from the "corrupted" area at 0x7C00 to a safe region and jumped there. The code did print the message using the interrupt. I would therefore tend to rule out a faulty interrupt and opt for the BPB explanation.

You may find the code here.

Post by intx13 »

eisdt wrote: Let me raise this problem again.
To find out whether int $0x10 was buggy (I don't like that term, though: we're talking about AMI, and I strongly doubt a bug like this would be left unfixed), I copied a chunk of code from the "corrupted" area at 0x7C00 to a safe region and jumped there. The code did print the message using the interrupt. I would therefore tend to rule out a faulty interrupt and opt for the BPB explanation.

You may find the code here.
That doesn't test the question though, since you're still using INT10H. It could still be mangling those bytes around 7C00 and you wouldn't know. To test it you'd have to do the dot test using a non-INT10H-based print function. If that still shows mangled memory, then it's something written by BIOS prior to bootloader execution. If not, it's something written by BIOS during INT10H execution.
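A non-INT10H dot test could look roughly like the sketch below. It is untested and assumes the BIOS leaves the display in the standard 80x25 colour text mode with video memory at 0xB800:0000; the labels and the 0x07 attribute are arbitrary choices. It copies the 512 bytes at 0x7C00 straight into video memory, so any byte of the dot filler that was changed shows up as something other than a dot.

```nasm
bits 16
org 0x7C00

start:
    jmp 0x0000:main             ; normalise CS:IP to 0000:7Cxx

prepad:
    times 400 db '.'            ; dot filler covering the suspect area

main:
    cli
    xor ax, ax
    mov ds, ax                  ; DS = 0, the source is 0000:7C00
    mov ss, ax
    mov sp, 0x7C00              ; stack grows down from just below the sector
    sti
    mov ax, 0xB800              ; assumption: colour text mode video segment
    mov es, ax
    cld
    mov si, 0x7C00              ; first byte of the loaded sector
    xor di, di                  ; top-left character cell
    mov cx, 512                 ; one full sector
    mov ah, 0x07                ; attribute: light grey on black
.copy:
    lodsb                       ; AL = next byte of the sector
    stosw                       ; store character (AL) and attribute (AH)
    loop .copy
.hang:
    hlt
    jmp .hang

    times 446 - ($ - $$) db '.' ; more dot filler up to the partition table
    times 64 db 0               ; empty partition table
    dw 0xAA55                   ; boot signature
```

If the dots come out intact here but not in the INT10H version, that would point at INT10H. If they are already mangled, the change happened before or outside of INT10H.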
kzinti
Member
Posts: 898
Joined: Mon Feb 02, 2015 7:11 pm

Re: [SOLVED] Partially written string literals

Post by kzinti »

It seems more likely / reasonable to me that the BIOS is updating something (BPB) after loading the bootloader (as opposed to int 10h corrupting it). But it would be nice to get confirmation one way or another.

Post by eisdt »

intx13 wrote: That doesn't test the question though, since you're still using INT10H. It could still be mangling those bytes around 7C00 and you wouldn't know. To test it you'd have to do the dot test using a non-INT10H-based print function. If that still shows mangled memory, then it's something written by BIOS prior to bootloader execution. If not, it's something written by BIOS during INT10H execution.
What you're saying is true: I took a different, probably less straightforward approach which doesn't fully exonerate the INT from generating the issues, though.

I assumed the BIOS was touching the region the code issued the INT from, for whatever reason, so whether that region was the MBR or somewhere else shouldn't make a difference: the issue would show up regardless. The code settled that, because issuing INT 0x10 away from 7C00 works perfectly. Now, does that mean the BIOS doesn't mangle the memory when it detects that the calling code is in the MBR? No, it does not. But it becomes less likely, because the BIOS would have to do that knowingly (so it wouldn't be a bug): why would it check for that? Hope that was clear.
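For reference, the relocation test can be sketched as below. This is not the exact code linked earlier in the thread, just a minimal untested illustration; the choice of 0x0600 as the "safe region", the message text and the labels are all assumptions. The sector copies itself away from 0x7C00, far-jumps into the copy, and only then calls INT 0x10.

```nasm
bits 16
org 0x0600                      ; labels are resolved for the relocated copy

start:                          ; still running at 0000:7C00 here, so only
    cli                         ; position-independent instructions are used
    xor ax, ax
    mov ds, ax
    mov es, ax
    mov ss, ax
    mov sp, 0x7C00              ; stack below the original load address
    sti
    cld
    mov si, 0x7C00              ; source: where the BIOS loaded the sector
    mov di, 0x0600              ; destination: assumed safe region
    mov cx, 256                 ; 256 words = 512 bytes
    rep movsw
    jmp 0x0000:relocated        ; continue inside the copy

relocated:
    mov si, msg
.print:
    lodsb                       ; AL = next character of the message
    test al, al
    jz .hang                    ; stop at the terminating zero
    mov ah, 0x0E                ; INT 10h teletype output
    xor bx, bx                  ; page 0
    int 0x10
    jmp .print
.hang:
    hlt
    jmp .hang

msg: db 'Running from 0x0600', 0

    times 510 - ($ - $$) db 0
    dw 0xAA55                   ; boot signature
```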

I'll probably try to give it a definitive reason, although I'm not very familiar with the BPB. I still wonder why so many bootsectors I have seen don't take that into account and why there's apparently no documentation about that (again, we're talking about AMI), given the caliber of the problem.

Cheers,
eisdt.
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Re: [SOLVED] Partially written string literals

Post by Brendan »

Hi,
eisdt wrote:I'll probably try to give it a definitive reason, although I'm not very familiar with the BPB. I still wonder why so many bootsectors I have seen don't take that into account and why there's apparently no documentation about that (again, we're talking about AMI), given the caliber of the problem.
There's only really 2 normal cases:
  • There's no partitions and the MBR is the boot loader. This is mostly only used for floppy disks now; and it's "slightly reasonable" for firmware to expect a BPB in this case.
  • There are partitions and the MBR contains a partition table; and the boot loader is in the first sector of its partition and not at the start of the disk. Most boot loaders you see are probably designed for this case.
I'd also be very tempted to assume that you're having problems because your boot loader isn't doing either of the normal cases (there's no BPB like firmware would expect for a "floppy like device", and there's no partition table like firmware would expect for a "hard disk like device"). You're wandering around in the land of "unexpected and untested corner case" wondering why strange things happen. ;)
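For the first case, a boot sector that reserves a BPB might start like the sketch below. The values are the usual ones for a 1.44 MB FAT12 floppy and are only an assumption; the OEM name, volume label and serial number are placeholders. Whether a particular BIOS actually reads or patches these fields has not been confirmed in this thread.

```nasm
bits 16
org 0x7C00

    jmp short start             ; 2-byte jump over the BPB
    nop                         ; pad the jump to 3 bytes
    db 'MYOS    '               ; OEM name, 8 bytes (placeholder)
    dw 512                      ; bytes per sector
    db 1                        ; sectors per cluster
    dw 1                        ; reserved sectors (this boot sector)
    db 2                        ; number of FATs
    dw 224                      ; root directory entries
    dw 2880                     ; total sectors (1.44 MB floppy)
    db 0xF0                     ; media descriptor
    dw 9                        ; sectors per FAT
    dw 18                       ; sectors per track
    dw 2                        ; number of heads
    dd 0                        ; hidden sectors
    dd 0                        ; total sectors (32-bit), unused here
    db 0                        ; drive number
    db 0                        ; reserved
    db 0x29                     ; extended boot signature
    dd 0x12345678               ; volume serial number (placeholder)
    db 'NO NAME    '            ; volume label, 11 bytes
    db 'FAT12   '               ; file system type, 8 bytes
    times 0x3E - ($ - $$) db 0  ; pad to 0x3E; fails to assemble on overrun

start:                          ; real code begins after the BPB
    cli
    xor ax, ax
    mov ds, ax
    mov ss, ax
    mov sp, 0x7C00
    sti
.hang:
    hlt
    jmp .hang

    times 510 - ($ - $$) db 0
    dw 0xAA55                   ; boot signature
```

With this layout, anything the firmware writes into the first 62 bytes lands in data fields rather than in code or string literals.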


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.

Post by intx13 »

Additionally, I think I mentioned in a previous comment that GRUB does include an unused buffer space for the BPB, so (assuming that's what's happening, which is probably a good bet) it is accounted for in that bootloader at least. NTLOADER would be easy to check too, if you happen to have a pre-UEFI Windows system around.

Post by Brendan »

Hi,
intx13 wrote:Additionally, I think I mentioned in a previous comment that GRUB does include an unused buffer space for the BPB, so (assuming that's what's happening, which is probably a good bet) it is accounted for in that bootloader at least. NTLOADER would be easy to check too, if you happen to have a pre-UEFI Windows system around.
Here's an examination of the Windows 7 & 8 MBR. It has no BPB (and is only designed/intended for the "has partitions/partition table" case where BPB isn't needed).

GRUB is probably designed for 2 different cases: with BPB (and no need for partition table), and with partitions/partition table (and no need for BPB).
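As a rough illustration of covering both cases in one sector (this is not GRUB's actual source, only a sketch of the idea), the code can jump over a reserved hole where a BPB would sit and still carry a partition table at offset 446. The hole size of 0x5A (large enough for a FAT32-style BPB), the partition type 0x0C, the 2048-sector start and the 10 MiB size are all assumed values; the CHS fields assume a 255-head, 63-sectors-per-track geometry.

```nasm
bits 16
org 0x7C00

    jmp short code              ; skip the reserved area
    nop
    times 0x5A - ($ - $$) db 0  ; hole the firmware may treat as a BPB

code:
    cli
    xor ax, ax
    mov ds, ax
    mov ss, ax
    mov sp, 0x7C00
    sti
.hang:
    hlt
    jmp .hang

    times 446 - ($ - $$) db 0   ; pad up to the partition table

part1:                          ; one 16-byte partition entry
    db 0x80                     ; status: active/bootable
    db 0x20, 0x21, 0x00         ; CHS of first sector (head 32, sector 33, cyl 0)
    db 0x0C                     ; partition type (assumed: FAT32 LBA)
    db 0x66, 0x25, 0x01         ; CHS of last sector (head 102, sector 37, cyl 1)
    dd 2048                     ; first sector (LBA)
    dd 20480                    ; sector count (10 MiB)

    times 48 db 0               ; three unused partition entries
    dw 0xAA55                   ; boot signature
```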


Cheers,

Brendan

Post by eisdt »

Sorry for the late answer -- hope it's not too much of necroposting.
Brendan wrote:You're wandering around in the land of "unexpected and untested corner case" wondering why strange things happen. ;)
Yeah, exactly. What I struggle to understand is that lots (actually, most) of the bootsectors I have seen provide neither a BPB nor a partition layout, just like mine, and yet they are described as "fine and working". First example that comes to my mind here. Many OSDev examples don't either.

Could it be that some BIOSes are (much) more tolerant than others, which hides the issue? IIRC, intx13 also ran into the same issue before the cause was discovered, and so (s)he was somewhat stumped too because, again, the code itself is fine.

Post by Brendan »

Hi,
eisdt wrote:
Brendan wrote:You're wandering around in the land of "unexpected and untested corner case" wondering why strange things happen. ;)
Yeah, exactly. What I struggle to understand is that lots (actually, most) of the bootsectors I have seen provide neither a BPB nor a partition layout, just like mine, and yet they are described as "fine and working". First example that comes to my mind here. Many OSDev examples don't either.
Yes; there's lots of examples (in tutorials and beginners' projects) that are all bad for various reasons (including the fact that they were intentionally "simple" because they were intended as an example and not intended as production quality code for a real OS).

Did you notice the "minimal usable bootsector" example/tutorial you linked doesn't even set up a stack before loading sectors into memory where the stack might be, the error handling is "lock up if the read failed", and there's an "entire kernel fits on a single track" assumption (that would probably cause it to fail if kernel is larger than 9 KiB)?
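A sketch of the first two points: set up a known stack before touching the disk, retry a failed read a few times, and report an error instead of silently locking up. It is untested; the load address 0x7E00, the retry count of 3 and the labels are assumptions, and a second stage is assumed to sit in the sector right after the boot sector.

```nasm
bits 16
org 0x7C00

start:
    jmp 0x0000:init             ; normalise CS:IP

init:
    cli
    xor ax, ax
    mov ds, ax
    mov es, ax
    mov ss, ax
    mov sp, 0x7C00              ; stack below the boot sector, away from 0x7E00
    sti
    mov [boot_drive], dl        ; keep the drive number the BIOS passed in

.read:
    mov ax, 0x0201              ; AH=0x02 read sectors, AL=1 sector
    mov cx, 0x0002              ; CH=cylinder 0, CL=sector 2
    mov dh, 0                   ; head 0
    mov dl, [boot_drive]
    mov bx, 0x7E00              ; ES:BX = 0000:7E00 destination
    int 0x13
    jnc .ok                     ; carry clear means the read succeeded

    xor ah, ah                  ; AH=0x00 reset disk system
    mov dl, [boot_drive]
    int 0x13
    dec byte [retries]
    jnz .read                   ; try again until the counter hits zero

    mov ax, 0x0E00 | 'E'        ; teletype output of 'E' to signal the error
    xor bx, bx                  ; page 0
    int 0x10
.halt:
    hlt
    jmp .halt

.ok:
    jmp 0x0000:0x7E00           ; hand over to the loaded sector

boot_drive: db 0
retries:    db 3

    times 510 - ($ - $$) db 0
    dw 0xAA55                   ; boot signature
```

The single-track assumption is not addressed here; loading a larger kernel would need a loop that advances the sector, head and cylinder between reads.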

Also don't forget that when most people say "it works", often what they mean is that it worked for them on one computer (or on one emulator) under a specific set of circumstances (e.g. as a real floppy and not as USB flash pretending to be a "floppy like" device). It takes a relatively long time before beginners (which includes most of the people writing tutorials) really understand the difference between "works for me" and "works on all computers".
eisdt wrote: Could it be that some BIOSes are (much) more tolerant than others, which hides the issue? IIRC, intx13 also ran into the same issue before the cause was discovered, and so (s)he was somewhat stumped too because, again, the code itself is fine.
Note that actual floppy disks probably don't strictly need a BPB on almost all computers (and if an actual floppy has no BPB you just get annoying whining from OSs like Windows about the disk not being formatted and/or not being valid even though it is); but "USB flash pretending to be a floppy disk" is not an actual floppy disk and a BIOS is much more likely to have problems in that case.


Cheers,

Brendan