MBR partition parsing

Antti
Member
Posts: 923
Joined: Thu Jul 05, 2012 5:12 am
Location: Finland

Re: MBR partition parsing

Post by Antti »

The OSDev we practise should not underestimate different viewpoints. If there is a list of goals that suggests, more or less (in)directly, that anything not meeting the same goals is universally inferior, the discussion becomes very controversial, especially if the goals are treated as strict requirements that the industry does not really accept as such. I am not saying that this is necessarily the case here, but it cannot go unnoticed that there is some disparagement against "Legacy MBR code understands GPT and boots a partition that is marked as legacy bootable in GPT."

For me, this version ("MBR understands GPT") sounds like the most sensible one to implement, and intuition backs this up. If I tried to implement the goals listed above, it would guarantee that I ended up with nothing. Besides, it would not be "my style" to exercise so much policy on this partition scheme. Perhaps this is one of the many reasons I should just follow the industry standard (that exists) and cause as little controversy as possible. And still support all the "four" scenarios. :)

Please note that I'm not saying that the intentions behind those goals were bad. I just brought out my viewpoint.
zaval
Member
Posts: 656
Joined: Fri Feb 17, 2017 4:01 pm
Location: Ukraine, Bachmut

Post by zaval »

Yeah, I got a little confused looking at the table (5.2.3), where it says:

Mnemonic                    Byte Offset   Byte Length   Description
Boot Code                   0             440           Unused by UEFI systems.
Unique MBR disk signature   440           4             Unused. Set to zero.
Unknown                     444           2             Unused. Set to zero.
The specification doesn't mandate that the MBR boot code be filled with zeros, just that it is unused; I confused it with the next two fields.
If it matters: since there is still a requirement to have only one partition record (for the PMBR partition, which should start at LBA 1), that effectively means the MBR code should parse the real GPT Partition Entry Array and derive the boot source from there, which is problematic for 440 bytes of code, to say the least. And again, a "legacy" MBR (0xEF) means GPT is not present.
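To make the layout concrete, here is a minimal Python sketch of how the fields in the table map onto a 512-byte sector. The offsets of the partition records (446) and the 0xAA55 signature (510), and the 0xEE type of the protective record, are taken from the UEFI specification rather than from the excerpt quoted above; the function and key names are hypothetical, chosen only for illustration.

```python
import struct

SECTOR_SIZE = 512


def parse_mbr(sector: bytes) -> dict:
    """Split a 512-byte MBR sector into its fixed-offset fields."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("an MBR sector must be exactly 512 bytes")
    # Offsets 440 and 444: disk signature (4 bytes) and "unknown" (2 bytes).
    disk_signature, unknown = struct.unpack_from("<IH", sector, 440)
    records = []
    for i in range(4):
        rec = sector[446 + 16 * i: 446 + 16 * (i + 1)]
        start_lba, size_lba = struct.unpack_from("<II", rec, 8)
        records.append({
            "boot_indicator": rec[0],
            "os_type": rec[4],
            "start_lba": start_lba,
            "size_lba": size_lba,
        })
    (signature,) = struct.unpack_from("<H", sector, 510)
    return {
        "boot_code": sector[0:440],
        "disk_signature": disk_signature,
        "unknown": unknown,
        "records": records,
        "signature": signature,
    }


def is_protective_mbr(mbr: dict) -> bool:
    """True if exactly one record is in use, has type 0xEE and starts at LBA 1."""
    used = [r for r in mbr["records"] if r["os_type"] != 0]
    return (
        mbr["signature"] == 0xAA55
        and len(used) == 1
        and used[0]["os_type"] == 0xEE
        and used[0]["start_lba"] == 1
    )
```

With only that single 0xEE record available, the MBR record itself says nothing about which partition to boot, which is why the boot code would have to go on to the GPT.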
Antti wrote:it cannot go unnoticed that there is some disparagement against "Legacy MBR code understands GPT and boots a partition that is marked as legacy bootable in GPT."
If you are referring to me, :D I am not disparaging the use of the GPT scheme on BIOS-based platforms. I do dislike "hybrid GPT", yes, since it's a mess, non-conformant, and very confusing (dual scheme -> dual management, nested partitions).
I just expressed concern that it might not work. But it would be cool if it always worked; the GPT scheme is quite good, and I would like to use it everywhere. For example, in my own designs I am going to rely on GUIDs for drives and partitions as a means to avoid all the confusion around drive identification, which is so easily found, for example, when you multi-boot from multiple disks.

As for using the "legacy boot attribute flag": it's good, but it doesn't resolve the problem of choosing between many such partitions. It's not an "active" marker. How is your MBR code going to pick one?
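A small Python sketch may help show why the flag alone is ambiguous. It assumes the usual 128-byte partition entries, with the 64-bit attributes field at offset 48 of each entry and "Legacy BIOS Bootable" as bit 2 (per the UEFI specification); the function name is hypothetical.

```python
import struct
from typing import List

ENTRY_SIZE = 128                    # assumption: the common entry size
ATTR_REQUIRED_PARTITION = 1 << 0    # bit 0 of the attributes field
ATTR_LEGACY_BIOS_BOOTABLE = 1 << 2  # bit 2 of the attributes field


def legacy_bootable_entries(entry_array: bytes,
                            entry_size: int = ENTRY_SIZE) -> List[int]:
    """Return the indices of all used entries with the legacy boot flag set."""
    hits = []
    for index in range(len(entry_array) // entry_size):
        entry = entry_array[index * entry_size:(index + 1) * entry_size]
        if entry[0:16] == bytes(16):
            continue  # an all-zero type GUID marks an unused entry
        (attributes,) = struct.unpack_from("<Q", entry, 48)
        if attributes & ATTR_LEGACY_BIOS_BOOTABLE:
            hits.append(index)
    return hits
```

Nothing in the format stops this list from having more than one element, so a tie-breaking rule (first match, a stored GUID, a menu in a later stage) still has to come from somewhere else.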

Personally I use either MBR or GPT on a given medium, not both. It's easier and clearer.
On the board I program, the ROM code just requires the FW to lie at LBA 1, so there is no way to use GPT there. GPT might also be problematic on NAND, where it would be necessary to use something different.
ANT - NT-like OS for x64 and arm64.
efify - UEFI for a couple of boards (mips and arm). suspended due to lost of all the target park boards (russians destroyed our town).

Post by Antti »

zaval wrote:I just expressed concern that it might not work. But it would be cool if it always worked; the GPT scheme is quite good, and I would like to use it everywhere.
It is a valid concern, but one that could be solved if we made some assumptions: no legacy partition table parsing, no CHS, and only two error messages (e.g. "Invalid GPT" and "Disk error"). That should be doable, even with "rather good" GPT validation.
zaval wrote:As for using the "legacy boot attribute flag": it's good, but it doesn't resolve the problem of choosing between many such partitions. It's not an "active" marker. How is your MBR code going to pick one?
A good question. This is exactly the kind of detail that could be solved in collaboration. Maybe the "expensive" MBR area has a corner for one GUID?
Neutron
Posts: 2
Joined: Sat Jul 29, 2017 8:14 am

Post by Neutron »

zaval wrote:
You could do what GRUB does: in the BIOS/GPT case the MBR boot sector looks for a "BIOS boot partition" containing the next stage.
but this defeats the main advantage of GPT: providing as many bootable partitions as one wants and the disk can fit.
What? How does requiring one GPT partition entry for embedding the next stage of the boot loader do that? I meant that the MBR boot sector would parse the GPT and load the next stage from there. You still have quite a lot of entries left.

The reason I suggested this was to keep the boot loader independent of the MBR table entries, because if loading legacy systems is desired, the first MBR entry should probably cover the GPT structures, the BIOS boot partition, and the EFI system partition, if it exists. I don't think it would be unreasonable to require these partitions at the beginning of the disk. The boot loader could select one of the installed legacy systems and alias its partition as the next entry in the MBR table, as a primary partition. If some scheme for extended partitions can be devised, the third MBR entry could be an alias for some other type of partition in the GPT. The last entry, if the disk is smaller than 2 TB, would be a protective entry for the GPT backup.

This would mean, of course, that the legacy systems are not visible to each other unless they support GPT and wouldn't get confused by this.

Aliasing extended partitions would probably require some sort of "extended partition record header partition" in the GPT set preceding a partition that would be allowed into the extended partition alias in the MBR table.
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Post by Brendan »

Hi,
Antti wrote:For me, this version ("MBR understands GPT") sounds like the most sensible one to implement, and intuition backs this up. If I tried to implement the goals listed above, it would guarantee that I ended up with nothing. Besides, it would not be "my style" to exercise so much policy on this partition scheme. Perhaps this is one of the many reasons I should just follow the industry standard (that exists) and cause as little controversy as possible. And still support all the "four" scenarios. :)
OK, let's estimate code size.

The first thing MBR should do is some basic setup (set segment registers, set a sane stack, etc). Let's call that 10 bytes.

The next thing MBR should do is check if there's a video card, and if there is make sure the video mode is text mode to ensure that any error messages are displayed correctly. The simplest way to do this is to always set the video mode, but that's ugly (leads to "excessive screen flashing" during boot when too many things set a new video mode). Instead I use code to check if there's a video card; then check the current video mode, and then set the mode if and only if it wasn't already in text mode. I also disable the cursor (it should only be visible when software is waiting for user input) and make sure the video mode is using "16 background colours" and not "8 background colours with blinking". The other thing I do is get "display page number" (because you have to know what that is to be able to print text properly). Some of these things can be skipped (more if you're happy with poor quality code); but for now let's allow about 50 bytes for video setup.

The next thing you'll need to do is try to figure out the relationship between "BIOS LBA that uses 512-byte sectors" and "GPT LBA that may not use 512-byte sectors". I think this ends up being a crude brute force search. Basically, assume that GPT uses 512-byte sectors and try to find the GPT header at "BIOS LBA 1", and if it's not there assume GPT uses 1024-byte sectors (such that "GPT LBA 1 = 1024/512 = BIOS LBA 2") and try to find the GPT header at "BIOS LBA 2", and keep doing that (doubling the "BIOS LBA" for each attempt) until you hit some maximum (maybe "max. of 65536-byte sectors for GPT" would be enough future proofing). Alternatively you could do something smarter (e.g. start with 512-byte sectors then try 4096-byte sectors to get the most likely cases out of the way early; then fall back to brute force search for unlikely cases). Anyway, let's allow about 70 bytes for this.
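The search described above can be sketched in a few lines of Python (the real thing would be 16-bit assembly, of course). The `read_bios_sector` callback is a hypothetical stand-in for an INT 0x13 read of one 512-byte sector, and the whole approach rests on the assumption stated in the paragraph, namely that GPT sectors may be larger than BIOS sectors.

```python
BIOS_SECTOR = 512
GPT_SIGNATURE = b"EFI PART"


def probe_gpt_sector_size(read_bios_sector,
                          max_sector_size=65536,
                          likely_first=(512, 4096)):
    """Return the GPT sector size whose "GPT LBA 1" holds the signature.

    read_bios_sector(lba) must return the 512-byte sector at that BIOS LBA.
    Returns None if no candidate size matches.
    """
    # Try the most likely sizes first, then every other power of two.
    candidates = list(likely_first)
    size = BIOS_SECTOR
    while size <= max_sector_size:
        if size not in candidates:
            candidates.append(size)
        size *= 2
    for size in candidates:
        bios_lba = size // BIOS_SECTOR  # GPT LBA 1 in 512-byte units
        if read_bios_sector(bios_lba)[:8] == GPT_SIGNATURE:
            return size
    return None
```

For example, a disk whose GPT header sits at byte offset 4096 would be found on the second attempt, at BIOS LBA 8. A signature match alone is weak evidence, so a real implementation would want to validate the header before trusting the result.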

You'll also want some code to check a potential GPT header (is the "EFI PART" signature correct, is the header size sane, is the CRC32 correct, etc). I'd probably want to use several hundred bytes for this alone (to check everything as thoroughly as possible), but for now let's allow about 100 bytes.
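As a rough illustration of the minimum set of checks, here is a Python sketch that verifies the signature, the header size, the header CRC32 and the header's own LBA. The field offsets (header size at 12, CRC32 at 16, MyLBA at 24) come from the UEFI specification; the function name is hypothetical, and a thorough check would look at many more fields.

```python
import struct
import zlib

GPT_MIN_HEADER_SIZE = 92


def check_gpt_header(sector: bytes, expected_lba: int) -> bool:
    """Minimal sanity check of a candidate GPT header."""
    if len(sector) < GPT_MIN_HEADER_SIZE or sector[0:8] != b"EFI PART":
        return False
    header_size, stored_crc = struct.unpack_from("<II", sector, 12)
    if header_size < GPT_MIN_HEADER_SIZE or header_size > len(sector):
        return False
    # The CRC32 covers header_size bytes with the CRC field itself zeroed.
    zeroed = sector[:16] + bytes(4) + sector[20:header_size]
    if zlib.crc32(zeroed) & 0xFFFFFFFF != stored_crc:
        return False
    # The header records the LBA it was written to.
    (my_lba,) = struct.unpack_from("<Q", sector, 24)
    return my_lba == expected_lba
```

In Python the CRC is a library call; in an MBR it has to be computed by hand (typically a bitwise loop rather than a 1 KiB table), which is a fair share of the byte estimate above.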

Of course you'll also want some "disk IO" code - detect if "int 0x13 extensions" exist, do LBA->CHS conversion, do "GPT LBA -> BIOS LBA" conversion, handle retries, handle errors, etc. I very much like having good error messages (e.g. so the user can tell the difference between "programmer mistake" and "media removed" and "faulty device"), but maybe acceptable code is asking too much given the space limitations. Let's maybe allow another 80 bytes for this anyway.
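The two address conversions are simple arithmetic, sketched here in Python. When the INT 0x13 extensions are missing, the loader has an LBA and needs cylinder/head/sector values for the classic read call; the geometry parameters would come from the BIOS, and the function names are hypothetical.

```python
def gpt_lba_to_bios_lba(gpt_lba, gpt_sector_size, bios_sector_size=512):
    """Convert an LBA in GPT-sized sectors to an LBA in BIOS-sized sectors."""
    return gpt_lba * (gpt_sector_size // bios_sector_size)


def lba_to_chs(lba, heads_per_cylinder, sectors_per_track):
    """Convert a BIOS LBA to (cylinder, head, sector); sectors are 1-based."""
    cylinder, rest = divmod(lba, heads_per_cylinder * sectors_per_track)
    head, sector = divmod(rest, sectors_per_track)
    return cylinder, head, sector + 1
```

With 16 heads and 63 sectors per track, `lba_to_chs(63, 16, 63)` gives `(0, 1, 1)`, and `gpt_lba_to_bios_lba(1, 4096)` gives `8`.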

Next; I'd want to check if the GPT header (at "GPT LBA 1") matches the backup GPT header (at "backup LBA"). This would help to make sure that the "GPT LBA -> BIOS LBA" conversion is being done right, would detect header corruption, and would be the beginning of basic fault tolerance (e.g. use the backup if something is wrong with the original). Let's allow another 50 bytes for this.
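One possible shape for that comparison, as a Python sketch: the two headers should point at each other through their MyLBA/AlternateLBA fields and agree on everything that describes the disk, while the header CRC32 and the PartitionEntryLBA legitimately differ. The offsets are from the UEFI specification; which fields to compare is my own choice here, not something stated in the post.

```python
import struct

# Byte ranges that should be identical in both headers:
# revision + header size, usable LBA range + disk GUID,
# entry count + entry size + entry array CRC32.
SHARED_RANGES = [(8, 16), (40, 72), (80, 92)]


def headers_consistent(primary: bytes, backup: bytes) -> bool:
    """Check that a primary and a backup GPT header describe the same disk."""
    if len(primary) < 92 or len(backup) < 92:
        return False
    p_my, p_alt = struct.unpack_from("<QQ", primary, 24)
    b_my, b_alt = struct.unpack_from("<QQ", backup, 24)
    # Each header's AlternateLBA must be the other header's MyLBA.
    if (p_my, p_alt) != (b_alt, b_my):
        return False
    return all(primary[a:b] == backup[a:b] for a, b in SHARED_RANGES)
```

This only says whether the two copies agree; deciding which copy to trust when they disagree still depends on the per-header CRC check.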

At this point we've gone through about 360 bytes of code, and we haven't looked at any piece of (any copy of) the partition table itself yet. Let's skip ahead...

After we find whatever we're trying to load (the boot manager's second stage?) using whatever method we use to find it (whether it's in the UEFI system partition or in its own partition), the MBR should load it and then update the TPM before executing anything. This probably costs about 40 bytes (including a "jmp whatever_got_loaded" to complete the MBR's responsibilities).

That means (likely with error messages that are worse than I'd consider acceptable, GPT header checking that's worse than I'd consider acceptable, etc) we've consumed about 400 bytes out of the 440 bytes of space available; and all of the GPT partition table loading, checking, and parsing/searching (including "fall back to backup GPT if necessary") needs to fit in 40 bytes. If you're going to search for a partition with the right GUID (which is what I'd probably want to do - e.g. have a "boot manager partition" with a specific GUID that no OS uses), then you'll have to store a copy of the GUID you're looking for somewhere, and that will cost 16 of those 40 bytes all by itself.
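For anyone who wants to play with the numbers, the budget above can be tallied in a few lines of Python. The figures are the rough estimates from this post, not measurements, and the labels are just shorthand for the steps.

```python
MBR_CODE_AREA = 440  # bytes of boot code available in the MBR
GUID_SIZE = 16       # bytes needed to store the GUID being searched for

# Rough per-step estimates, in bytes, as given in the post.
estimates = {
    "basic setup": 10,
    "video setup": 50,
    "sector size probing": 70,
    "GPT header checks": 100,
    "disk IO": 80,
    "backup header comparison": 50,
    "load, update TPM, jump": 40,
}

used = sum(estimates.values())
remaining = MBR_CODE_AREA - used
after_guid = remaining - GUID_SIZE

print(used, remaining, after_guid)  # prints: 400 40 24
```

Dropping or shrinking any single estimate shows immediately how much room that frees for the partition table code.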

At this point, it's just a question of how poor the quality needs to be to make everything fit, and whether or not your standards are so low that you're willing to accept the horror of a "least worst that fits" solution.

Note: An alternative would be to store a (non-standard) "BIOS LBA for second stage of boot manager" field (or maybe 2 of them, for redundant 2nd stages) in the MBR somewhere, and postpone everything to do with GPT. This would be "slightly bad" (would break if any OS moves partitions around without updating those fields in the MBR), but perhaps "slightly bad" is the least worst solution.


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.

Post by Antti »

Thank you for the reply. This is a discussion forum at its best. I really mean this, no sarcasm.
Brendan wrote:[...]
Let's assume you are right. It is possible to ridicule a so-called normal implementation. Would it really help me or other developers if we set our standards that high? Would standards that high be directly proportional to the overall value a developer can provide to others? I honestly don't know what to answer. Your goals are different, and I believe that it helps that there are people like you "pushing limits", but I really doubt that it would help in general if everyone tried to do that.

Would you be happy if an MBR with GPT support that, practically speaking, just works never gets born? Beaten by perfection.
Brendan wrote:This would be "slightly bad" (would break if any OS moves partitions around without updating those fields in the MBR), but perhaps "slightly bad" is the least worst solution.
To help you achieve your goals, there is a bit in the partition entry attributes that says "Required Partition" and this may be a very good feature for the hybrid scheme you and BenLunt are thinking of. I think that this mostly prevents any modifications, including partition moving. Probably you knew this already but this is a very good thing to bring up when introducing the hybrid scheme idea to others.

Post by zaval »

Neutron wrote:
zaval wrote:
You could do what GRUB does: in the BIOS/GPT case the MBR boot sector looks for a "BIOS boot partition" containing the next stage.
but this defeats the main advantage of GPT: providing as many bootable partitions as one wants and the disk can fit.
What? How does requiring one GPT partition entry for embedding the next stage of the boot loader do that? I meant that the MBR boot sector would parse the GPT and load the next stage from there. You still have quite a lot of entries left.
Ah. Then both the potential to boot from any partition and the problem of finding out which one to boot remain. :^) Of course, this second stage might be as powerful as a whole firmware Boot Manager, and by keeping its own environment, where every needed piece of information lives, it would not have any problem making the decision. But then it turns into yet another Big Firmware Boot Manager. Thus, to use GPT on legacy BIOS systems all the way, we need a "little" addition in the form of ... a UEFI-like boot manager. If one is ready to create and supply this as a complement to his/her OS installation on legacy machines, why not. :^)

Post by zaval »

Antti wrote:
zaval wrote:I just expressed concern that it might not work. But it would be cool if it always worked; the GPT scheme is quite good, and I would like to use it everywhere.
It is a valid concern, but one that could be solved if we made some assumptions: no legacy partition table parsing, no CHS, and only two error messages (e.g. "Invalid GPT" and "Disk error"). That should be doable, even with "rather good" GPT validation.
zaval wrote:As for using the "legacy boot attribute flag": it's good, but it doesn't resolve the problem of choosing between many such partitions. It's not an "active" marker. How is your MBR code going to pick one?
A good question. This is exactly the kind of detail that could be solved in collaboration. Maybe the "expensive" MBR area has a corner for one GUID?
On UEFI this is decided by the boot policies established by the "platform firmware" itself, and by the user, who creates new boot options by running OS setup EFI applications and also modifies them interactively through the FW. All of this is then stored somewhere inside the FW's non-volatile storage as the "environment".
I think it would be too hard to reduce all this to just one GUID placed somewhere it might be wiped out by some clueless utility.
At the bare minimum, you need a map between a set of partitions and the boot entries you are about to try as a boot sequence, plus some fallback behavior if none of them works out: some "default boot". The latter might just be a happy "disk error" message to the user, right. :) But even this minimum requires a little more space, and effort. How do you build this? Ask the user at MBR creation time, showing him/her the available partitions, and then put this map into the MBR? And then check every time whether it is still valid or the partitions are no longer the same? How do you determine that they aren't the same? Just by their number? Or by LBAs and sizes? Or also by the GUIDs representing their type? The latter is important for deciding whether it is worth transferring control there or whether it's simply not a bootable partition. Things would be simpler if every partition modification went through your control. But it won't. Your MBR code won't always be asked about or informed of the changes. So you need to embed quite a non-trivial map analysis into 440 bytes. UEFI's Boot Manager has to do this too; boot options might have become invalid since the last run of the FW as well. But the Boot Manager has the advantage of being able to get as bloated as it wants. :^D Unlike MBR code.
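To give an idea of what even the simplest "is the map still valid" check involves, here is a Python sketch. It is only one possible answer to the questions above: it assumes the stored map is keyed by each partition's unique GUID (offset 16 in a 128-byte entry) and records the first and last LBA (offsets 32 and 40), and it does not look at type GUIDs at all. The function names and the map format are hypothetical.

```python
import struct
from typing import Dict, List, Tuple

ENTRY_SIZE = 128  # assumption: the common partition entry size


def current_partitions(entry_array: bytes) -> Dict[bytes, Tuple[int, int]]:
    """Map each used entry's unique GUID to its (first LBA, last LBA)."""
    found = {}
    for off in range(0, len(entry_array) - ENTRY_SIZE + 1, ENTRY_SIZE):
        entry = entry_array[off:off + ENTRY_SIZE]
        if entry[0:16] == bytes(16):
            continue  # an all-zero type GUID marks an unused entry
        first_lba, last_lba = struct.unpack_from("<QQ", entry, 32)
        found[entry[16:32]] = (first_lba, last_lba)
    return found


def stale_boot_entries(boot_map: Dict[bytes, Tuple[int, int]],
                       entry_array: bytes) -> List[bytes]:
    """Return GUIDs from the stored map that are missing or have moved."""
    now = current_partitions(entry_array)
    return [guid for guid, extent in boot_map.items()
            if now.get(guid) != extent]
```

Even this much needs a stored table of 32 bytes per boot entry plus the comparison loop, which is hard to picture inside the MBR's code area next to everything else.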
Really, as Neutron suggested, you would need to recreate the Boot Manager your own way and put it into its own GPT partition. Then your MBR code just needs to pick that one hardcoded partition, the Boot Manager's, and from there you are free to bring the power of GPT to the BIOS machine.
Octocontrabass
Member
Posts: 5513
Joined: Mon Mar 25, 2013 7:01 pm

Post by Octocontrabass »

Brendan wrote:Who says it's an everyday OS? Why does it matter which OS it is? Why does how powerful the computer is matter?
Because when you're aiming for perfection, eventually you will have to make a compromise, and it makes sense to start making those compromises on the scenarios that are least likely to affect a real user first. For example, anyone installing FreeDOS in 2017 is either using it for fun (and won't care that they can't dual boot FreeDOS and your OS) or using it on a ridiculously old piece of hardware (that your OS can't be installed on in the first place).
Brendan wrote:The next thing MBR should do is check if there's a video card, and if there is make sure the video mode is text mode to ensure that any error messages are displayed correctly.
As an example of another good compromise, you can remove all of this and operate on the (very safe) assumption that INT 0x10 AH=0x0E will output text somewhere that the user can see it, without worrying about whether it's through a video card or a serial console or some other peripheral, and without worrying about how said peripheral is configured.
Brendan wrote:The next thing you'll need to do is try to figure out the relationship between "BIOS LBA that uses 512-byte sectors" and "GPT LBA that may not use 512-byte sectors".
I suspect BIOS LBA and GPT LBA are always the same size. The real trick is finding a BIOS that's capable of booting from a hard disk with sectors that are not 512 bytes. (If you do happen to find one, I hear the GRUB folks could use some help figuring out how to make it work.)
Brendan wrote:At this point, it's just a question of how poor the quality needs to be to make everything fit, and whether or not your standards are so low that you're willing to accept the horror of a "least worst that fits" solution.
Users are willing to accept any solution that works, if it's a choice between that or no solution at all.