Bootloader is not consistent when booting in real hardware

Post by liwinux »

Octocontrabass wrote:
liwinux wrote:I don't know why it works now.. Can someone explain it to me ?
It's hard to say for sure without being able to run dozens of tests. Maybe you finally got a valid partition table. Maybe something about the far JMP instruction you added convinces your firmware that it doesn't need to "fix" the BPB anymore. Maybe your 4.5 KiB stack is finally big enough. (Some uncommon BIOS calls still need a bigger stack!)
Well, that sucks, because I really wanted to know what's going on. Unfortunately, this kind of thing is really hard to debug without testing every change, and having to reboot my computer every time I change the code makes it even harder. But anyway, thank you all for your lovely help. I learned a LOT, and I think that's the most important thing! :)

Post by neon »

Hi,

Did you add the changes incrementally? That would have allowed you to narrow down the potential cause pretty quickly. My guess, as was posted above, is that the BIOS was probably trying to "correct" the BPB, which would corrupt the code. You can test this:

1. Move the include directives back to where they were and remove the jmp. Does it cause issues? Does it still cause issues with the jmp?
2. What if you keep the jmp but add padding right after it at the start, covering the area where the BPB would typically be? Does it still occur?
OS Development Series | Wiki | os | ncc
char c[2]={"\x90\xC3"};int main(){void(*f)()=(void(__cdecl*)(void))(void*)&c;f();}

Post by liwinux »

neon wrote:Hi,

Did you add the changes incrementally? That would have allowed you to narrow down the potential cause pretty quickly. My guess, as was posted above, is that the BIOS was probably trying to "correct" the BPB, which would corrupt the code. You can test this:

1. Move the include directives back to where they were and remove the jmp. Does it cause issues? Does it still cause issues with the jmp?
2. What if you keep the jmp but add padding right after it at the start, covering the area where the BPB would typically be? Does it still occur?
Hi, so basically yeah, I knew that in order to track down the issue I would have to change one thing at a time and test it immediately. God knows how many times I've restarted my computer...
neon wrote:Move the include directives back to where they were and remove the jmp. Does it cause issues? Does it still cause issues with the jmp?
With or without the jump, it's the same story. As I said, the only times it actually worked as expected were when I moved the include directives to the beginning, OR when the include directives stayed where they were BUT I removed the "section .text" line. (Is it really required to have a .text section? What's the difference? Also, I don't really understand why I have to use the global directive. I know it makes the symbol visible to the linker, but I still don't think I need that.)
neon wrote: What if you keep the jmp but include additional padding for the area that would typically be where the BPB is right after the jmp at the start
I'm not really sure how to achieve that, sorry. Could you give me an example? (Are you talking about the "times" directive that I'm using at the end of my boot loader?)

Post by neon »

Hi,

You got the idea. If the BIOS was indeed modifying the code trying to "correct" a nonexistent BPB you can introduce some initial padding where the BPB would normally go to test it (can be done with the times directive).

Also - you shouldn't need to use the global directive here. Technically you don't need "section .text" either, as that is the default. If you are seeing different behavior with and without section .text, I would be interested to see the resulting image files with just that one change, and how you are currently building it.

Post by liwinux »

I think you guys were all right. My BIOS really is trying to look for a BPB, but as I didn't create one, it ends up messing up my code.
neon wrote:Hi,

You got the idea. If the BIOS was indeed modifying the code trying to "correct" a nonexistent BPB you can introduce some initial padding where the BPB would normally go to test it (can be done with the times directive).
So the idea is to add some padding (because the far jump alone isn't effective). But what's the size of a BPB, then? I looked at the source code of GRUB2 and saw this:

Code: Select all

.globl _start, start;
_start:
start:
	/*
	 * _start is loaded at 0x7c00 and is jumped to with CS:IP 0:0x7c00
	 */

	/*
	 * Beginning of the sector is compatible with the FAT/HPFS BIOS
	 * parameter block.
	 */

	jmp	LOCAL(after_BPB)
	nop	/* do I care about this ??? */

#ifdef HYBRID_BOOT
	nop
	nop
	nop
	nop
	nop
	nop
	nop
	nop
	nop
	nop
	nop
	nop
	nop

	nop
	nop
	nop
	nop
	nop
	nop
	nop
	nop

	nop
	nop
	jmp	LOCAL(after_BPB)
#else
	/*
	 * This space is for the BIOS parameter block!!!!  Don't change
	 * the first jump, nor start the code anywhere but right after
	 * this area.
	 */

	.org GRUB_BOOT_MACHINE_BPB_START
	.org 4
Is it to prevent the BPB check? Are the nops just there to prevent the strange behavior that I'm seeing on my computer?

Also, as Octocontrabass said:
Octocontrabass wrote:Firmware gets picky if you don't have either a MBR with a valid partition table and one active partition, or a FAT VBR with a valid BPB.
Now, with programs like Etcher, it's obvious that it won't work because of the missing BPB, but with an MBR I could technically skip creating a BPB. But here's the thing: it does nothing... In fact, I think I'm doing something wrong. I opened GParted, created an MSDOS partition table, created an empty partition, set the "boot" flag to make it active, and finally used dd directly on that partition. Have I done this right?
neon wrote:I would be interested to see the resulting image files with just that one change and how you are currently building it.
Well, I ran xxd on both binaries and compared the output, and strangely they are the same image.
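
For comparing two builds, a byte-level diff can be more direct than reading two xxd dumps side by side. A minimal Python sketch, assuming both images fit in memory; the function name `diff_offsets` and the two in-memory "builds" are made up for illustration, and in practice the bytes would be read from the real image files:

```python
from typing import List


def diff_offsets(a: bytes, b: bytes) -> List[int]:
    """Return every offset at which the two images differ.

    If one image is longer, the extra offsets count as differences.
    """
    shared = min(len(a), len(b))
    diffs = [i for i in range(shared) if a[i] != b[i]]
    diffs.extend(range(shared, max(len(a), len(b))))
    return diffs


# Hypothetical stand-ins for two 512-byte builds that differ in one byte.
build_a = bytes(512)
build_b = bytes(100) + b"\x90" + bytes(411)
print([hex(o) for o in diff_offsets(build_a, build_b)])  # prints ['0x64']
```

An empty list means the two builds really are byte-for-byte identical, in which case any difference in behavior has to come from somewhere other than the image itself.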

Post by Octocontrabass »

liwinux wrote:and finally, used dd directly on that partition.
Hold on a minute. Why are you writing to the partition? You're trying to install your code on the first sector of the disk, right?
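
If the drive keeps the partition table created in GParted, the boot code has to go into the first sector of the disk without overwriting bytes 446 to 511, which hold the four partition entries and the 0x55 0xAA signature. A minimal Python sketch of that merge, assuming the classic 446-byte boot code area; the function name `merge_boot_code` is hypothetical, and reading or writing the raw device is deliberately left out:

```python
SECTOR_SIZE = 512
BOOT_CODE_SIZE = 446  # bytes 0..445; the partition table starts at offset 446


def merge_boot_code(first_sector: bytes, boot_image: bytes) -> bytes:
    """Return a new first sector: boot code taken from boot_image,
    partition table and signature kept from the existing first_sector."""
    if len(first_sector) != SECTOR_SIZE:
        raise ValueError("first_sector must be exactly 512 bytes")
    if len(boot_image) < BOOT_CODE_SIZE:
        raise ValueError("boot_image must contain at least 446 bytes")
    return boot_image[:BOOT_CODE_SIZE] + first_sector[BOOT_CODE_SIZE:]


# Made-up example data: a sector with one active entry, and a fake boot image.
existing = bytes(446) + b"\x80" + bytes(63) + b"\x55\xaa"
boot = b"\xea" + bytes(511)
merged = merge_boot_code(existing, boot)
print(hex(merged[0]), hex(merged[446]), merged[510:].hex())  # prints 0xea 0x80 55aa
```

Anything the boot image places at offset 446 or later (strings, for example) is dropped by this merge, so the code and its data have to fit in the first 446 bytes.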

Post by Ethin »

Question: why are you testing this on real hardware? You should probably use an emulator until you're ready for real hardware. Or at least do the real hardware tests on a machine that isn't your primary machine.

Post by bzt »

liwinux wrote:I think you guys were all right. My BIOS really is trying to look for a BPB, but as I didn't create one, it ends up messing up my code.
I'm pretty sure that's not the case. If that were true, it would mean you couldn't boot anything other than FAT file systems, but I'm pretty sure you can boot Windows (NTFS) and Linux (ext2/3/4) on those machines just fine, so that cannot be the reason, it just happens that your issue looks like it. (Plus it would mean no GPT and no MBR partitions allowed on your machines, and again, I'm sure that's not true. The BPB is part of the VBR, not the MBR.)

If I were you, I'd look for a different reason. You probably need the 3-byte jump at the beginning and the 2 magic bytes at the end of the sector, but that's all.

AGAIN: the BPB, despite what its name suggests, is not part of the BIOS, it is part of the FAT12/FAT16/FAT32 file systems, and those file systems only. Nothing else uses it, and BIOSes are not checking for it (they actually can't check for it, that's just not possible). BIOSes should check for the magic bytes at the end, some BIOSes might check for a jump instruction at the beginning, end of story. Nothing about the BPB after the starting jump check is mandatory (it actually differs between FAT12/16/32), and it never was mandatory. Not even old DOS (3.3 and up) and Windows (3.11, 95, 98) versions installed a BPB in the MBR.

Cheers,
bzt

Post by neon »

Hi,
bzt wrote:AGAIN: the BPB, despite what its name suggests, is not part of the BIOS, it is part of the FAT12/FAT16/FAT32 file systems, and those file systems only.
NTFS boot code reserves space for a valid BPB (although, as with FAT32, it is an extended BPB).

But this doesn't matter anymore: I don't even know what is being targeted. If the intent is to write an MBR you wouldn't need a BPB. But you also wouldn't need to edit a partition table so I don't know what is being done at this time.

Post by nexos »

bzt wrote:If that were true, it would mean you couldn't boot anything other than FAT file systems
That couldn't be true, as the first sector normally has the partition table, not the FS data. The BIOS won't read the partition boot record, where the FS data is stored.
"How did you do this?"
"It's very simple — you read the protocol and write the code." - Bill Joy
Projects: NexNix | libnex | nnpkg

Post by Octocontrabass »

bzt wrote:If that were true, it would mean you couldn't boot anything other than FAT file systems, but I'm pretty sure you can boot Windows (NTFS) and Linux (ext2/3/4) on those machines just fine, so that cannot be the reason, it just happens that your issue looks like it.
NTFS has a BPB with geometry in the same location as FAT, so even if the firmware patches the BPB, Windows is unaffected.

GRUB doesn't store anything where the BPB would go, so even if the firmware patches the BPB, Linux is unaffected.

Firmware only patches USB flash drives without a partition table. All other media, including USB flash drives with a partition table, are unaffected.
bzt wrote:Nothing else uses it, and BIOSes are not checking for it (they actually can't check for it, that's just not possible).
They can check for and use it. I've found one Dell PC that uses the BPB to decide how INT 0x13 AH=0x02 should translate CHS to LBA. It hangs during POST if the BPB is mostly correct but specifies 0 heads per cylinder. (This only happens with unpartitioned drives, of course.)
neon wrote:If the intent is to write an MBR you wouldn't need a BPB. But you also wouldn't need to edit a partition table so I don't know what is being done at this time.
Some firmware will try to correct the geometry in the BPB when booting from an unpartitioned USB flash drive. The solution is to either have a BPB so the corrected geometry will end up where it belongs, or add a partition table with an active partition so firmware will detect that it's a MBR and stop trying to correct the nonexistent BPB.
nexos wrote:That couldn't be true, as the first sector normally has the partition table, not the FS data.
USB flash drives don't always have a partition table. Sometimes the first sector of the disk is the first sector of the filesystem.
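
The two layouts described above can be told apart from the first sector alone. A rough Python sketch, assuming the classic MBR layout (four 16-byte entries starting at offset 446, a status byte of 0x80 marking an active entry, the partition type at byte 4 of each entry, and 0x55 0xAA at offset 510). The function names are hypothetical, and real firmware heuristics differ between vendors, so this only approximates what a given BIOS might check:

```python
from typing import List


def has_boot_signature(sector: bytes) -> bool:
    """True if the sector ends with the 0x55 0xAA boot signature."""
    return len(sector) >= 512 and sector[510:512] == b"\x55\xaa"


def active_partitions(sector: bytes) -> List[int]:
    """Return the indexes (0..3) of entries that are marked active
    and have a non-zero partition type."""
    if len(sector) < 512:
        raise ValueError("need a full 512-byte sector")
    found = []
    for i in range(4):
        entry = sector[446 + 16 * i : 446 + 16 * (i + 1)]
        status, part_type = entry[0], entry[4]
        if status == 0x80 and part_type != 0x00:
            found.append(i)
    return found


def looks_like_mbr(sector: bytes) -> bool:
    """Signature present and exactly one active partition entry."""
    return has_boot_signature(sector) and len(active_partitions(sector)) == 1


# Made-up example sectors, not real disk contents.
unpartitioned = bytes(510) + b"\x55\xaa"
partitioned = bytearray(unpartitioned)
partitioned[446] = 0x80  # status: active
partitioned[450] = 0x83  # type: any non-zero value; 0x83 is Linux
print(looks_like_mbr(unpartitioned), looks_like_mbr(bytes(partitioned)))  # prints False True
```

A sector that passes `has_boot_signature` but not `looks_like_mbr` is the case where some firmware may treat the drive as unpartitioned and go looking for a BPB.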

Post by liwinux »

Octocontrabass wrote: Hold on a minute. Why are you writing to the partition? You're trying to install your code on the first sector of the disk, right?
Sorry, my bad. I actually meant to say that I'm writing directly to the first sector and leaving the partition as it is, without touching it. And apparently, without a BPB, it doesn't boot: nothing is displayed. So yeah, I think the goal would be to write one.
Ethin wrote:Question: why are you testing this on real hardware? You should probably use an emulator until your ready for real hardware. Or at least be doing the real HW tests on a machine that isn't your primary machine.
I think you didn't read the whole thread. I know I could use QEMU, VirtualBox, etc. But what if I want to test it on real hardware, like booting it directly from a USB drive? Well, that's exactly what I thought, and while it works fine in QEMU, it doesn't on real hardware. So I was wondering why, because I don't see what the issue could be. My point is that every change I make has to be tested. Imagine if I wrote my whole boot loader relying only on emulation, and then, when I finally booted it, realized that nothing works! Good luck finding out why it doesn't work! The more code I write, the harder it gets to debug. Once I know that it's booting correctly, I will probably test it less often.
bzt wrote:If I were you, I'd look for a different reason. You probably need the 3-byte jump at the beginning and the 2 magic bytes at the end of the sector, but that's all.
Actually, I have both of those things, but something is messing with my code: for example, only 2 of the 3 strings are printed, or the beginning of a string is replaced with another character. So yeah, I don't see any other possibility... I'm really confused, since so many things could cause this to happen.

Post by neon »

Hi,
liwinux wrote:only 2 of the 3 strings are printed, or the beginning of a string is replaced with another character
I don't think you mentioned this detail before. This is a very strong indicator for the theory that the firmware may be modifying it.

This is basically what I meant by reserving space for a BPB (we use a FAT32 BPB for this):

Code: Select all

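bits 16
org 0x7c00
jmp 0x0000:BOOTSECT     ; 5-byte far jump; also forces CS to 0
times 0x5a db 0         ; zero padding over the area where a BPB would sit
BOOTSECT:
;
; rest of code here...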
If everything works perfectly, then we can be sure that the firmware is for some reason thinking there is a BPB and trying to correct it. Keep in mind this is just a test - you can remove it afterwards. Just seeing if it'll work.
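
Whether a built image keeps code and data out of the area that firmware might patch can also be checked before writing it to the drive. A small Python sketch, assuming the FAT32 layout in which the BPB occupies offsets 0x0B to 0x59 of the sector (the geometry fields, sectors per track and number of heads, sit at 0x18 and 0x1A). The function name and both example sectors are made up for illustration:

```python
from typing import List

BPB_START = 0x0B  # first BPB field (bytes per sector)
BPB_END = 0x5A    # in a FAT32 boot sector the boot code starts here


def nonzero_bytes_in_bpb_area(sector: bytes) -> List[int]:
    """Return the offsets inside the BPB area that hold non-zero bytes,
    i.e. bytes that a firmware geometry "correction" could clobber."""
    end = min(BPB_END, len(sector))
    return [i for i in range(BPB_START, end) if sector[i] != 0]


# Layout from the snippet above: 5-byte far jump to 0x0000:0x7C5F,
# then 0x5A zero bytes, then (fake) code.
padded = bytes([0xEA, 0x5F, 0x7C, 0x00, 0x00]) + bytes(0x5A) + b"\xfa\xf4"
# A sector whose code starts right after a 3-byte jump, with no padding.
unpadded = b"\xeb\x3c\x90" + b"\xfa" * 100

print(nonzero_bytes_in_bpb_area(padded))         # prints []
print(len(nonzero_bytes_in_bpb_area(unpadded)))  # prints 79
```

An empty result only shows that nothing in the image lives in that range. It says nothing about whether a particular firmware actually patches it.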

Post by liwinux »

neon wrote:If everything works perfectly, then we can be sure that the firmware is for some reason thinking there is a BPB and trying to correct it. Keep in mind this is just a test
Wow, that was it, end of story. Honestly, guys, without your help I don't think I could have made it on my own. Now, is it OK if I just leave this padding at the beginning? I guess it's just a waste of memory after all...

But thank you all for your help, it finally worked as expected on both computers!

Post by nexos »

Octocontrabass wrote:USB flash drives don't always have a partition table. Sometimes the first sector of the disk is the first sector of the filesystem.
That struck me after posting. Most USB flash drives still have a FAT FS, and BIOS manufacturers care only about Windows.