Boot sequence and MBR

Posted: Wed Oct 14, 2020 11:34 am
by abstractmath
For the past month or so, as I've been getting into OS development, I've just been dumping all of my kernel and bootloader code into a single binary file. This feels like a bit of a hack, and I would like to understand a more proper way to load my kernel from a file system, or at least to have an MBR and a well-known kernel location on disk. The thing is that I'm having difficulty understanding how the boot process actually works when an MBR and partition table are present, and how I can make a disk image while still rolling my own bootloader code.

This is my current understanding: every tutorial I have seen on writing a bootloader includes the org 0x7C00 directive, which I understand to mean that the assembler starts putting the code at address 0x7C00, because that's where the BIOS will look for a sector with the bootloader code, and it will start executing from there. From what I understand about the MBR, though, it's in the very first sector of the disk rather than at this 0x7C00 address, and I'm confused as to why that is. Does the BIOS look for an MBR first, and then just default to the 0x7C00 location on disk? Also, how can I roll my own bootloader code within the MBR? Any clarification on these questions would be much appreciated, thanks!

Re: Boot sequence and MBR

Posted: Wed Oct 14, 2020 11:49 am
by iansjack
You are confusing addresses on the disk and addresses in memory.

The BIOS loads the first sector of the boot device into memory (at the address you mentioned), then jumps to the first instruction at that address.

Re: Boot sequence and MBR

Posted: Wed Oct 14, 2020 11:54 am
by abstractmath
Oh okay, that makes sense. Now a follow-up question: why include the org directive at all? I'm guessing it's because the BIOS ROM is already loaded into memory, and it would be bad to overwrite that?

Re: Boot sequence and MBR

Posted: Wed Oct 14, 2020 12:30 pm
by iansjack
The org directive tells the assembler at what address the program will be loaded. This is important to ensure that any references to memory addresses are correct. Note that this means you should explicitly set the segment registers to 0 early in your code.
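
As a rough illustration of what org changes (a Python model, not assembler output), the sketch below treats a label's address as its byte offset in the file plus the org value, then runs it through real-mode address translation. The function names and the example offset 0x20 are made up for illustration.

```python
def label_address(org: int, offset_in_file: int) -> int:
    """Address a flat-binary assembler encodes for a label: org + file offset."""
    return org + offset_in_file


def physical_address(segment: int, offset: int) -> int:
    """Real-mode translation: segment * 16 + offset, wrapped to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF


# A string stored 0x20 bytes into the boot sector (hypothetical offset).
msg_offset_in_file = 0x20

# With org 0x7C00 and DS = 0, the encoded address lands inside the loaded sector.
with_org = physical_address(0x0000, label_address(0x7C00, msg_offset_in_file))

# With no org (treated as org 0) and DS = 0, the same reference points at low memory.
without_org = physical_address(0x0000, label_address(0x0000, msg_offset_in_file))

print(hex(with_org), hex(without_org))
```

The second case is why the org value and the segment registers have to agree: the assembler only knows the offset it was told to assume.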

Re: Boot sequence and MBR

Posted: Wed Oct 14, 2020 6:31 pm
by BenLunt
abstractmath wrote:Also, how can I roll my own bootloader code within the MBR? Any clarification on these questions would be much appreciated, thanks!
If I may, I would like to expand on the answers that have been given.

The BIOS has absolutely no idea, nor does it care what the first sector of the disk is, other than if the sector has the correct Signature at offset 510. Some BIOSes also check the first few bytes of the sector for valid code, but this is not standard.
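
For reference, the signature check amounts to comparing the last two bytes of the sector against 0x55, 0xAA. A minimal Python sketch of that test on a sector image (the function name is just for illustration):

```python
SECTOR_SIZE = 512
BOOT_SIGNATURE = b"\x55\xAA"  # bytes at offsets 510 and 511


def has_boot_signature(sector: bytes) -> bool:
    """True if the sector is at least 512 bytes and carries the boot signature."""
    return len(sector) >= SECTOR_SIZE and sector[510:512] == BOOT_SIGNATURE


blank = bytes(SECTOR_SIZE)                # all zeros, no signature
bootable = blank[:510] + BOOT_SIGNATURE   # same sector with the signature added

print(has_boot_signature(blank), has_boot_signature(bootable))
```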

With this in mind, the first sector on the disk can be anything you want, though most of the time, for Legacy Boot machines, this is a MBR.

The purpose of the MBR is simply to:
1) move out of the way
2) find a valid partition entry
3) load the first sector of the found partition
4) jump to it

The code must inspect the four entries within its already loaded data (offset 0x1BE of this sector), as well as account for extended partitions. There never was a standard specification stating exactly how an extended partitioned drive was to be formatted. For example, most took the idea that you could have one valid partition entry and one valid extended partition entry per partition table. However, this was/is not set in stone (as far as I remember). Most commercial OSes of that time did it this way.

However, this doesn't mean that you can't have three valid partition entries, each pointing to a partition (within the range limit of course) and then have a single extended partition entry, pointing to a table of four more entries. Doing something like this should be perfectly valid, however not the normal way of doing it.

Your MBR code should gracefully bow out if it didn't find a valid/bootable partition entry as well as display disk read errors, if any.
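
Steps 2 and 3 can be sketched in Python against a disk image. This is only a model of the logic, assuming the classic 16-byte entry layout (status, CHS start, type, CHS end, start LBA, sector count); it ignores extended partitions entirely, and the example partition values are arbitrary.

```python
import struct
from typing import List, NamedTuple, Optional

PARTITION_TABLE_OFFSET = 0x1BE  # the four entries start here
ENTRY_SIZE = 16
# status, CHS of first sector, type, CHS of last sector, start LBA, sector count
ENTRY_FORMAT = "<B3sB3sII"


class PartitionEntry(NamedTuple):
    status: int
    chs_first: bytes
    part_type: int
    chs_last: bytes
    lba_start: int
    sector_count: int


def parse_partition_table(sector: bytes) -> List[PartitionEntry]:
    """Decode the four primary partition entries of an MBR sector."""
    entries = []
    for index in range(4):
        start = PARTITION_TABLE_OFFSET + index * ENTRY_SIZE
        fields = struct.unpack_from(ENTRY_FORMAT, sector, start)
        entries.append(PartitionEntry(*fields))
    return entries


def find_active(entries: List[PartitionEntry]) -> Optional[PartitionEntry]:
    """Return the first entry marked active (0x80) with a non-empty type."""
    for entry in entries:
        if entry.status == 0x80 and entry.part_type != 0x00:
            return entry
    return None  # the caller should report "no bootable partition"


# Example: an otherwise empty MBR whose second entry is an active partition.
sector = bytearray(512)
sector[0x1BE + 16:0x1BE + 32] = struct.pack(
    ENTRY_FORMAT, 0x80, b"\x00\x00\x00", 0x06, b"\x00\x00\x00", 2048, 20480)
sector[510:512] = b"\x55\xAA"

active = find_active(parse_partition_table(bytes(sector)))
print(active.lba_start, active.sector_count)
```

The `lba_start` of the active entry is the sector the MBR would load to 0x7C00 after relocating itself.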

It is common for the MBR to occupy only one 512-byte sector, and nothing more. With 512 bytes, minus (16 * 4) + 2 = 66 bytes for the partition table and signature, you have 446 bytes left, which is enough space to do almost anything you would need to in a MBR.
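
For building a disk image by hand, the same space budget can be enforced by a small packing helper. The Python sketch below is an assumption about how one might do it, not anyone's actual build script; `build_mbr` and the two-byte example code are hypothetical.

```python
SECTOR_SIZE = 512
TABLE_SIZE = 16 * 4                 # four 16-byte partition entries
SIGNATURE = b"\x55\xAA"
CODE_AREA = SECTOR_SIZE - TABLE_SIZE - len(SIGNATURE)  # bytes left for code


def build_mbr(code: bytes, table: bytes = bytes(TABLE_SIZE)) -> bytes:
    """Pad assembled code, then append the partition table and signature."""
    if len(code) > CODE_AREA:
        raise ValueError("boot code is %d bytes, limit is %d" % (len(code), CODE_AREA))
    if len(table) != TABLE_SIZE:
        raise ValueError("partition table must be exactly %d bytes" % TABLE_SIZE)
    return code.ljust(CODE_AREA, b"\x00") + table + SIGNATURE


# Example: EB FE is "jmp $", an endless loop, standing in for real boot code.
mbr = build_mbr(b"\xEB\xFE")
print(CODE_AREA, len(mbr))
```

Writing `mbr` at offset 0 of an image file, followed by the partition contents at the LBAs named in the table, gives a minimal partitioned disk image.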

I have a (somewhat older) version of my MBR code at https://github.com/fysnet/FYSOS/blob/ma ... sc/mbr.asm that is able to use CHS addressing or LBA addressing, depending on the build flags. It uses a stack of 512-byte blocks to parse the extended partition table(s), allowing for multiple nested partition entries (up to 65536 bytes of stack space).

Once you have a working MBR, and it can load the first sector of your partition, usually called a Partition Boot Record (PBR), you now need a valid PBR.

The PBR is specific to the file system it resides on. However, it needs a few notes:
1) The PBR needs some way to know where it resides, i.e. the base LBA of this partition. This is usually done with a BPB, like in the FAT file systems, for example.
2) It may need to load extra sectors since the 512 bytes your MBR loaded may not be enough. One huge mistake some newbies have made is to put their "read_from_disk" code in the sectors yet to be read. Your "read_from_disk" code *must* reside within the first 512 bytes, including any data it might need to do so.
3) This PBR will usually parse the file system to find a loader file. This may have a few requirements, such as the loader file must be in the root directory. However, if you feel ambitious enough, you can write your PBR to search sub directories as well.
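
To illustrate point 1 for FAT specifically, here is a Python sketch that reads the common BPB fields starting at offset 0x0B of the boot sector and uses the hidden-sectors field as the partition's base LBA. The example values are arbitrary, and the helper name is made up.

```python
import struct

BPB_OFFSET = 0x0B
# bytes/sector, sectors/cluster, reserved sectors, FAT count, root entries,
# total sectors (16-bit), media, sectors/FAT (16-bit), sectors/track, heads,
# hidden sectors, total sectors (32-bit)
BPB_FORMAT = "<HBHBHHBHHHII"


def parse_bpb(sector: bytes) -> dict:
    """Decode the BPB fields shared by FAT12, FAT16 and FAT32 boot sectors."""
    (bytes_per_sector, sectors_per_cluster, reserved_sectors, num_fats,
     root_entries, total_16, media, fat_size_16, sectors_per_track,
     num_heads, hidden_sectors, total_32) = struct.unpack_from(
        BPB_FORMAT, sector, BPB_OFFSET)
    return {
        "bytes_per_sector": bytes_per_sector,
        "reserved_sectors": reserved_sectors,
        "num_fats": num_fats,
        "fat_size_16": fat_size_16,
        "hidden_sectors": hidden_sectors,  # base LBA of the partition
    }


# Example PBR for a partition that starts at LBA 2048 (arbitrary values).
pbr = bytearray(512)
struct.pack_into(BPB_FORMAT, pbr, BPB_OFFSET,
                 512, 4, 1, 2, 512, 20480, 0xF8, 20, 63, 16, 2048, 0)

bpb = parse_bpb(bytes(pbr))
# Absolute LBA of the first FAT: partition base plus the reserved sectors.
first_fat_lba = bpb["hidden_sectors"] + bpb["reserved_sectors"]
print(first_fat_lba)
```

Every disk read the PBR issues needs that base added, since the BIOS only understands LBAs relative to the start of the disk.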

None of this is strictly necessary, but here are a few thoughts.
1) The MBR (obviously) must be written in assembly.
2) The PBR (obviously) must be written in assembly.
3) However, the loader does not have to be. My loader is 90% ANSI C.

Since my loader is in ANSI C, it is much easier to maintain. My loader also is written for all of the file systems I support. I then simply give it a build flag to only include the specific file system I need. This way, the loader does the same thing for all file systems.

My loader searches in the Root directory ('\'), the '\BOOT' sub-directory, and then finally, the '\SYSTEM\BOOT' sub-directory for the kernel file(s) and other files needed.

Each of these files has a header at the beginning telling the loader what it needs to know about the file:
1) is it the kernel file? (if so, where to jump to after loaded)
2) where to load into memory.
3) is this a mandatory file (i.e.: if not a valid file/didn't read all of file, can we continue?)
4) compression type. (i.e.: these files can be compressed to save disk space)

This way the loader file only has to have a list of filenames to load. Nothing more. The loader file doesn't care about things like kernels, drivers, etc. It only cares about what the headers of these files tell it.
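
To make the idea concrete, here is a Python sketch of such a header. The layout is entirely hypothetical (magic value, field order, flag bits and sizes are all invented for illustration and are not the actual FYSOS format); it only shows how the four pieces of information listed above could be packed and read back.

```python
import struct
from typing import NamedTuple

# Hypothetical 16-byte header: magic, flags, compression, reserved,
# load address, entry point. None of this is a real on-disk format.
HEADER_FORMAT = "<4sBBHII"
MAGIC = b"LDRF"            # made-up magic value
FLAG_IS_KERNEL = 0x01      # made-up flag bits
FLAG_MANDATORY = 0x02


class FileHeader(NamedTuple):
    is_kernel: bool
    mandatory: bool
    compression: int
    load_address: int
    entry_point: int


def parse_header(data: bytes) -> FileHeader:
    """Decode the hypothetical header found at the start of a loadable file."""
    magic, flags, compression, _reserved, load_address, entry_point = \
        struct.unpack_from(HEADER_FORMAT, data, 0)
    if magic != MAGIC:
        raise ValueError("not a loadable file")
    return FileHeader(bool(flags & FLAG_IS_KERNEL), bool(flags & FLAG_MANDATORY),
                      compression, load_address, entry_point)


# Example: an uncompressed, mandatory kernel loaded at 1 MiB.
raw = struct.pack(HEADER_FORMAT, MAGIC, FLAG_IS_KERNEL | FLAG_MANDATORY,
                  0, 0, 0x00100000, 0x00100010)
header = parse_header(raw + b"...file contents...")
print(header)
```

With something like this, the loader's per-file logic reduces to: read the header, decide where the payload goes, and remember the entry point if the kernel flag is set.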

You don't have to build your project like this, but this is just a way you might think about it.

Remember, maintainability is/will be a key factor when your project gets more advanced. Right now, simplicity is the key. However, when your project gets to be hundreds of files, if not thousands, maintainability will be a must.

Hope this helps,
Ben
- http://www.fysnet.net/osdesign_book_series.htm

Re: Boot sequence and MBR

Posted: Thu Oct 15, 2020 12:49 pm
by abstractmath
Ahhh I see! Thank you, that was incredibly helpful.

Re: Boot sequence and MBR

Posted: Thu Oct 15, 2020 2:03 pm
by rdos
iansjack wrote:The org directive tells the assembler at what address the program will be loaded. This is important to ensure that any references to memory addresses are correct. Note that this means you should explicitly set the segment registers to 0 early in your code.
I actually prefer to set CS to 7C0 instead (with a far jmp) so that org is not needed, and also so it works regardless of how the BIOS set up CS before jumping to the boot sector code.

Code:

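BootSectInit:
    jmp StartBoot               ; jump over the OEM name and the structure below
    nop

    db 'Rdos    '               ; 8-byte OEM name field

BootMedia       boot_struc      <>      ; boot_struc is defined elsewhere (presumably the BPB)

StartBoot:
    db 0EAh                     ; opcode of a far jmp with a 16:16 pointer
    dw OFFSET JmpBootCode       ; new IP, relative to the start of the sector
    dw 07C0h                    ; new CS

JmpBootCode:                    ; execution continues here with CS = 07C0h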

Re: Boot sequence and MBR

Posted: Thu Oct 15, 2020 2:24 pm
by rdos
BenLunt wrote:Once you have a working MBR, and it can load the first sector of your partition, usually called a Partition Boot Record (PBR), you now need a valid PBR.

The PBR is specific to the file system it resides on. However, it needs a few notes:
1) The PBR needs some way to know where it resides, i.e. the base LBA of this partition. This is usually done with a BPB, like in the FAT file systems, for example.
2) It may need to load extra sectors since the 512 bytes your MBR loaded may not be enough. One huge mistake some newbies have made is to put their "read_from_disk" code in the sectors yet to be read. Your "read_from_disk" code *must* reside within the first 512 bytes, including any data it might need to do so.
3) This PBR will usually parse the file system to find a loader file. This may have a few requirements, such as the loader file must be in the root directory. However, if you feel ambitious enough, you can write your PBR to search sub directories as well.
I prefer to use the hidden sector count in the BPB to place the second stage boot loader directly after the MBR. When I partition a disc for my operating system, I always create a MBR with 16 hidden sectors and then the actual partitions will be after that. I have no nested booting, rather it's the second stage boot loader that scans the root partition (which always needs to be FAT12, FAT16 or FAT32) for OS image files, and then either loads the normal boot file on timeout or the one the user selects.

The EFI boot loader does something similar. It scans the EFI system partition for OS image files and, like the BIOS boot loader, lets the user select one or falls back to the normal boot image on timeout.

Re: Boot sequence and MBR

Posted: Thu Oct 15, 2020 2:42 pm
by Octocontrabass
BenLunt wrote:The BIOS has absolutely no idea, nor does it care what the first sector of the disk is, other than if the sector has the correct Signature at offset 510. Some BIOSes also check the first few bytes of the sector for valid code, but this is not standard.
This is definitely not true of USB flash drives. I've even seen one BIOS skip the MBR code entirely and go directly to loading the first sector of the active partition. (Unfortunately, it's not a convenient development machine, so I haven't checked to see if it will do that with the internal hard disk.)
rdos wrote:I actually prefer to set CS to 7C0 instead (with a far jmp) so that org is not needed, and also so it works regardless of how the BIOS set up CS before jumping to the boot sector code.
Most control flow instructions don't rely too much on the value of CS, so they work regardless of how the BIOS set up CS. But I'd still recommend setting the segment registers (including CS) to 0 and using org 0x7C00 so that address calculations can ignore segmentation.

Re: Boot sequence and MBR

Posted: Fri Oct 16, 2020 1:55 pm
by rdos
Octocontrabass wrote:
rdos wrote:I actually prefer to set CS to 7C0 instead (with a far jmp) so that org is not needed, and also so it works regardless of how the BIOS set up CS before jumping to the boot sector code.
Most control flow instructions don't rely too much on the value of CS, so they work regardless of how the BIOS set up CS. But I'd still recommend setting the segment registers (including CS) to 0 and using org 0x7C00 so that address calculations can ignore segmentation.
You cannot use offsets if you don't know the CS (and IP).

Personally, I think segmentation is useful, and orgs are a bit problematic, especially if the linker/locator fills up with zeros. It's a lot easier to create a binary image that starts at zero without having to rely on fancy tools. You can even create a dos .com executable.

Also, when the second stage boot-loader is loaded, it's most convenient to avoid orgs there too and place it at offset 0 in some fixed segment.

Re: Boot sequence and MBR

Posted: Fri Oct 16, 2020 2:48 pm
by Octocontrabass
rdos wrote:You cannot use offsets if you don't know the CS (and IP).
You cannot use absolute offsets, but IP-relative offsets still work fine as long as you stay within the boundaries of CS. It's safe to assume CS will be either 0 or 0x7C0 upon entry to your boot sector, so you can jump anywhere within the overlap between those two 64k segments using a relative jump (although it's still a good idea to set CS as soon as possible).

The most commonly used jump instructions use IP-relative offsets.
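
A small Python model of that claim, using made-up instruction positions: the same relative jump is executed once with CS:IP based on 0x0000:0x7C00 and once based on 0x07C0:0x0000, and both end up at the same physical address because only IP changes.

```python
def physical(segment: int, offset: int) -> int:
    """Real-mode translation: segment * 16 + offset, wrapped to 20 bits."""
    return ((segment << 4) + (offset & 0xFFFF)) & 0xFFFFF


def relative_jump(cs: int, ip_after_insn: int, displacement: int):
    """A near relative jump adds the displacement to IP and leaves CS alone."""
    return cs, (ip_after_insn + displacement) & 0xFFFF


# The jump ends 0x40 bytes into the sector and skips forward 0x30 bytes
# (both numbers are arbitrary). The two entries are the two BIOS conventions.
entries = [(0x0000, 0x7C40), (0x07C0, 0x0040)]
targets = [physical(*relative_jump(cs, ip, 0x30)) for cs, ip in entries]

print([hex(t) for t in targets])
```

An absolute reference, by contrast, would encode one fixed offset and so would only be right for one of the two CS values.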
rdos wrote:Personally, I think segmentation is useful,
Personally, I don't.
rdos wrote:and orgs are a bit problematic, especially if the linker/locator fills up with zeros.
It's not a problem if you understand what it's doing and why. That's why we recommend being familiar with your toolchain.
rdos wrote:It's a lot easier to create a binary image that starts at zero without having to rely on fancy tools. You can even create a dos .com executable.
But a DOS .com executable requires org 0x100.
rdos wrote:Also, when the second stage boot-loader is loaded, it's most convenient to avoid orgs there too and place it at offset 0 in some fixed segment.
Gotta disagree with you there too. Setting segment registers to zero is much more convenient.

Re: Boot sequence and MBR

Posted: Fri Oct 16, 2020 3:23 pm
by nexos
Octocontrabass wrote:Gotta disagree with you there too. Setting segment registers to zero is much more convenient.
I agree with you, Octocontrabass. In my bootloader, I was using segments in the second stage at one point. This made it nearly impossible to switch to PMode, as I had to deal with segments. I prefer zeroing everything, and then using flat segments in PMode or long mode. I think segmentation could be better, but it just needs some changes.

Re: Boot sequence and MBR

Posted: Sat Oct 17, 2020 6:54 am
by rdos
nexos wrote:
Octocontrabass wrote:Gotta disagree with you there too. Setting segment registers to zero is much more convenient.
I agree with you, Octocontrabass. In my bootloader, I was using segments in the second stage at one point. This made it nearly impossible to switch to PMode, as I had to deal with segments. I prefer zeroing everything, and then using flat segments in PMode or long mode. I think segmentation could be better, but it just needs some changes.
Actually, it's just as easy to load a code selector with a zero base as a non-zero base. If you load your second stage boot loader at a fixed segment, you can safely use a fixed base for the code segment in protected mode.

There certainly is a need to create a 4G flat selector too, but code should not be using orgs or similar.

Besides, if the second stage boot loader is zero-based without orgs, it can be loaded anywhere.
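
For comparison, here is a Python sketch that encodes two protected-mode code descriptors in the standard 8-byte x86 format: a flat 4G one with base 0, and one whose base matches a loader placed at real-mode segment 0x1000. The segment 0x1000 and the chosen access/flag values are assumptions for the example.

```python
import struct


def make_descriptor(base: int, limit: int, access: int, flags: int) -> bytes:
    """Encode an 8-byte x86 segment descriptor (20-bit limit, 32-bit base)."""
    return struct.pack(
        "<HHBBBB",
        limit & 0xFFFF,                                 # limit bits 0..15
        base & 0xFFFF,                                  # base bits 0..15
        (base >> 16) & 0xFF,                            # base bits 16..23
        access,                                         # access byte
        ((flags & 0xF) << 4) | ((limit >> 16) & 0xF),   # flags + limit bits 16..19
        (base >> 24) & 0xFF,                            # base bits 24..31
    )


CODE_ACCESS = 0x9A  # present, ring 0, code, readable

# Flat 4G code segment: base 0, limit 0xFFFFF pages (G=1), 32-bit (D=1).
flat_code = make_descriptor(0x00000000, 0xFFFFF, CODE_ACCESS, 0xC)

# Code segment based where the loader sits: real-mode segment 0x1000 << 4,
# 64 KiB limit in bytes (G=0), 32-bit (D=1).
loader_segment = 0x1000  # hypothetical load segment
based_code = make_descriptor(loader_segment << 4, 0xFFFF, CODE_ACCESS, 0x4)

print(flat_code.hex(), based_code.hex())
```

Either descriptor is equally easy to build; the difference shows up later, in whether offsets inside the loader are relative to 0 or to the load address.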

Re: Boot sequence and MBR

Posted: Sat Oct 17, 2020 6:58 am
by rdos
Octocontrabass wrote:
rdos wrote:and orgs are a bit problematic, especially if the linker/locator fills up with zeros.
It's not a problem if you understand what it's doing and why. That's why we recommend being familiar with your toolchain.
My original tool chain in 1988 was a DOS-based assembler & linker. :-)

Re: Boot sequence and MBR

Posted: Sat Oct 17, 2020 2:57 pm
by bzt
rdos wrote:I prefer to use the hidden sector count in the BPB to place the second stage boot loader directly after the MBR. When I partition a disc for my operating system, I always create a MBR with 16 hidden sectors and then the actual partitions will be after that.
I got a bit confused about this part.

So, the MBR is the first sector of the disk. It does not store hidden sectors count, only a partitioning table. BPB is part of the file system (stored on the first sector of a FAT volume). So if you don't have partitions, just one single FAT file system, then BPB is indeed in the first sector, but then you don't have an MBR. If you have partitions, you also have an MBR, but then BPB is not on the first sector of the disk (rather in the first sector of the partition). So how can hidden sectors (given by BPB) mark the space directly after the MBR?

Just for the record, I prefer to have a simple starting "pointer" (the first sector of the 2nd stage) in the 1st stage code, and then I can boot from a GPT partitioned disk too. It doesn't matter then if the 2nd stage is right after the MBR (before the first partition but after the GPT), after the VBR (in hidden sectors) or in a defragmented system file on the boot partition.
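
A sketch of how an installer might patch such a pointer into the first stage, in Python. The field offset 0x1B0, the 64-bit width, and the function names are assumptions for illustration only; the real location would be whatever the 1st stage code reserves for it.

```python
import struct

SECTOR_SIZE = 512
POINTER_OFFSET = 0x1B0   # hypothetical spot reserved inside the 1st stage
POINTER_FORMAT = "<Q"    # 64-bit LBA, wide enough for GPT disks


def patch_stage2_pointer(boot_sector: bytes, stage2_lba: int) -> bytes:
    """Return a copy of the boot sector with the 2nd stage's start LBA filled in."""
    if len(boot_sector) != SECTOR_SIZE:
        raise ValueError("expected exactly one 512-byte sector")
    patched = bytearray(boot_sector)
    struct.pack_into(POINTER_FORMAT, patched, POINTER_OFFSET, stage2_lba)
    return bytes(patched)


def read_stage2_pointer(boot_sector: bytes) -> int:
    """What the 1st stage would read before issuing its disk read."""
    return struct.unpack_from(POINTER_FORMAT, boot_sector, POINTER_OFFSET)[0]


# Example: the 2nd stage was written starting at LBA 2048 (arbitrary).
original = bytes(510) + b"\x55\xAA"
patched = patch_stage2_pointer(original, 2048)
print(read_stage2_pointer(patched))
```

The field sits below the partition table at 0x1BE, so patching it leaves the table and the signature untouched.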
nexos wrote:
Octocontrabass wrote:Gotta disagree with you there too. Setting segment registers to zero is much more convenient.
I agree with you, Octocontrabass.
I second that. It's easier to set the segment's base to 0. However I do the far jump rdos was talking about, because I've run into different BIOSes, some started the code at 0:7C00 and others at 07C0:0.

Cheers,
bzt