Why do multiboot2 & Limine binary search?
Posted: Thu Feb 15, 2024 2:51 pm
by eekee
Both multiboot2 and Limine binary-search the kernel image for magic numbers to find places to communicate with the kernel. I assume this is so they can load kernels in ELF format, but I don't understand what's so great about ELF format itself, for kernels. Is there any reason other than just convenience? I'm kind-of hoping there is, because I find binary search to be rather questionable, and I don't see why anyone wouldn't want to tack binary data onto the beginning of the kernel image anyway. Or even text data; an ini file isn't hard to parse.
Re: Why do multiboot2 & Limine binary search?
Posted: Thu Feb 15, 2024 3:58 pm
by nexos
Not exactly sure; I find this to be a questionable design choice as well. The better solution (and this is what old school Limine did) would be to have a special section in the ELF which contains this data structure.
Re: Why do multiboot2 & Limine binary search?
Posted: Thu Feb 15, 2024 6:10 pm
by thewrongchristian
eekee wrote:Both multiboot2 and Limine binary-search the kernel image for magic numbers to find places to communicate with the kernel. I assume this is so they can load kernels in ELF format, but I don't understand what's so great about ELF format itself, for kernels. Is there any reason other than just convenience? I'm kind-of hoping there is, because I find binary search to be rather questionable, and I don't see why anyone wouldn't want to tack binary data onto the beginning of the kernel image anyway. Or even text data; an ini file isn't hard to parse.
When you say "binary search", do you mean it searches the binary?
I certainly don't think they do a binary search in the accepted computer science sense, which is a way of finding specific values in a sorted list in logarithmic time. Kernel binaries are almost certainly not a sorted list of numbers.
So I assume you mean why do they linear scan the binary?
To which I'd reply, why not? Certainly in multiboot, the actual binary format doesn't matter; multiboot supports ELF and a.out formats. Looking for a specific signature allows the kernel to embed the structure without worrying about exactly where the multiboot structure will end up in the file (other than it being in the first 8 KB of the image, as multiboot requires).
So, basically, they do it because it is easier than not doing it. The kernel will already be in memory after they've been instructed to load the kernel file by the bootloader config file or the user, and scanning memory is cheap if you're only doing it within known bounds (probably on the order of microseconds to locate the multiboot header of a loaded kernel).
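As a rough illustration of how cheap that scan is, here is a minimal sketch in C. It assumes the Multiboot 1 header rules (magic 0x1BADB002, 4-byte alignment, header within the first 8192 bytes, and magic + flags + checksum equal to zero modulo 2^32). The names find_mb1_header and read_le32 are hypothetical and not taken from any particular bootloader.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MB1_MAGIC      0x1BADB002u  /* Multiboot 1 header magic */
#define MB1_SEARCH_MAX 8192u        /* header must start within the first 8 KiB */

/* Read a 32-bit little-endian value (hypothetical helper). */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Linear scan over 4-byte-aligned offsets.
 * Returns the byte offset of the header, or -1 if none is found. */
static long find_mb1_header(const uint8_t *image, size_t size)
{
    assert(image != NULL);
    size_t limit = size < MB1_SEARCH_MAX ? size : MB1_SEARCH_MAX;

    for (size_t off = 0; off + 12 <= limit; off += 4) {
        uint32_t magic    = read_le32(image + off);
        uint32_t flags    = read_le32(image + off + 4);
        uint32_t checksum = read_le32(image + off + 8);

        /* The checksum guards against a stray magic value in code or data. */
        if (magic == MB1_MAGIC && (uint32_t)(magic + flags + checksum) == 0)
            return (long)off;
    }
    return -1;
}
```

The loop touches at most 2048 words, so the worst case is a few thousand simple comparisons.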
Re: Why do multiboot2 & Limine binary search?
Posted: Thu Feb 15, 2024 10:51 pm
by nullplan
Multiboot at least allows you to use any binary format you should choose, if you enable the a.out kludge. So the custom headers were added as a way to have independence of the host tool chain. Also, they contain data not transmittable via the standard headers.
I personally use ELF for my 64-bit kernel to instruct the bootloader on how to map it correctly. The ELF headers contain the mapping information in a portable way, and also contain an entry point. And this is basically all I need.
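To make "mapping information plus an entry point" concrete, here is a minimal sketch of what a loader might pull out of an ELF64 file. The struct layouts follow the ELF64 format; the LoadSegment record and the function elf64_collect_segments are hypothetical names for illustration, and the code assumes the host and the file are both little-endian.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    unsigned char e_ident[16];
    uint16_t e_type, e_machine;
    uint32_t e_version;
    uint64_t e_entry, e_phoff, e_shoff;
    uint32_t e_flags;
    uint16_t e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx;
} Elf64Header;

typedef struct {
    uint32_t p_type, p_flags;
    uint64_t p_offset, p_vaddr, p_paddr, p_filesz, p_memsz, p_align;
} Elf64ProgramHeader;

static_assert(sizeof(Elf64Header) == 64, "unexpected ELF64 header size");
static_assert(sizeof(Elf64ProgramHeader) == 56, "unexpected program header size");

#define PT_LOAD_TYPE 1u  /* p_type value of a loadable segment */

/* Hypothetical record: what a loader needs in order to map one segment. */
typedef struct {
    uint64_t file_off, vaddr, file_size, mem_size;
    uint32_t flags;  /* PF_X = 1, PF_W = 2, PF_R = 4 */
} LoadSegment;

/* Collects PT_LOAD segments and the entry point.
 * Returns the number of segments found, or -1 on a malformed image. */
static int elf64_collect_segments(const uint8_t *img, size_t size,
                                  LoadSegment *out, int max_out,
                                  uint64_t *entry)
{
    Elf64Header eh;
    if (size < sizeof eh) return -1;
    memcpy(&eh, img, sizeof eh);
    if (memcmp(eh.e_ident, "\x7f" "ELF", 4) != 0) return -1;
    if (eh.e_ident[4] != 2) return -1;                  /* ELFCLASS64 */
    if (eh.e_phentsize != sizeof(Elf64ProgramHeader)) return -1;
    if (eh.e_phoff > size) return -1;

    int n = 0;
    for (uint16_t i = 0; i < eh.e_phnum; i++) {
        uint64_t off = eh.e_phoff + (uint64_t)i * eh.e_phentsize;
        if (off > size || size - off < sizeof(Elf64ProgramHeader)) return -1;

        Elf64ProgramHeader ph;
        memcpy(&ph, img + off, sizeof ph);
        if (ph.p_type != PT_LOAD_TYPE) continue;        /* skip non-loadable */
        if (n == max_out) return -1;

        out[n].file_off  = ph.p_offset;
        out[n].vaddr     = ph.p_vaddr;
        out[n].file_size = ph.p_filesz;
        out[n].mem_size  = ph.p_memsz;   /* bytes beyond file_size are zeroed */
        out[n].flags     = ph.p_flags;
        n++;
    }
    *entry = eh.e_entry;
    return n;
}
```

Note that only program headers are read here; section headers are not needed for loading.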
Re: Why do multiboot2 & Limine binary search?
Posted: Fri Feb 16, 2024 5:36 am
by eekee
thewrongchristian wrote:When you say "binary search", do you mean it searches the binary?
I certainly don't think they do a binary search in the accepted computer science sense, which is a way of finding specific values in a sorted list in logarithmic time. Kernel binaries are almost certainly not a sorted list of numbers.
So I assume you mean why do they linear scan the binary?
Oh lol, yes I do. I've been looking at text search algorithms lately, and just substituted 'binary' for 'text' without really thinking. I thought there was something off about it.
thewrongchristian wrote:To which I'd reply, why not?
Because there's a chance of finding arbitrary data which just happens to match the search key. The 128-bit keys used by both Multiboot and Limine minimize that chance, but I'm still not 100% comfortable with it. ...*re-reads multiboot page* Actually, multiboot requires the headers be within the first 8KB, further reducing the chance of a bad match. The linker script on our wiki's multiboot page places the multiboot header in the very first section for the lowest possible risk. I suppose you could apply the same tricks to the Limine header, but the Limine page doesn't mention them.
nullplan wrote:I personally use ELF for my 64-bit kernel to instruct the bootloader on how to map it correctly. The ELF headers contain the mapping information in a portable way, and also contain an entry point. And this is basically all I need.
Mapping and entry point are good reasons to use ELF. I just don't see what's so hard about prepending a header to the ELF, adding a little program to generate it and calling the program from the makefile or whatever. But I'm writing this when it's all the same to me. I'd have to modify my toolchain to use either multiboot or limine; it only generates PE32.
In actual fact, the dead-code elimination in my compiler means I'd have to implement extra headers in the same part of the compiler which generates the normal headers. (There is no linker.) I suppose I could make the compiler generate ELF and add a special case to detect multiboot or limine headers in my source, maybe with a specially-named record (struct). Such a special case could instead trigger creation of a custom binary format for any other bootloader. Huh... this might actually be a good idea.
Re: Why do multiboot2 & Limine binary search?
Posted: Fri Feb 16, 2024 9:12 am
by nullplan
eekee wrote:I just don't see what's so hard about prepending a header to the ELF, adding a little program to generate it and calling the program from the makefile or whatever.
ELF files must necessarily start with the ELF header. You can get a hell of a benefit from keeping the kernel file a valid ELF file, because then GDB can find things where they are expected. And for most people it is just easier to add the multiboot header to the first section. If only there weren't so many tutorials out there calling for linker scripts that way over-align all the sections, things could be so much easier.
eekee wrote:I'd have to modify my toolchain to use either multiboot or limine; it only generates PE32.
As I said, you can use multiboot at least with whatever binary format blows your skirt up if you enable the bit they call "a.out kludge". Which essentially adds a very simple executable header (containing memory sizes and the entry point) into the multiboot header. You can even use it without any header whatsoever if you do that.
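For reference, a minimal sketch of what that looks like, assuming the Multiboot 1 header layout: flag bit 16 marks the address fields as valid, and five 32-bit physical addresses follow the checksum. The struct name Mb1AoutHeader and the helper mb1_make_aout_header are hypothetical, and the helper assumes a flat image that is loaded whole, starting at load_addr.

```c
#include <assert.h>
#include <stdint.h>

#define MB1_MAGIC     0x1BADB002u
#define MB1_AOUT_FLAG 0x00010000u  /* bit 16: address fields below are valid */

typedef struct {
    uint32_t magic, flags, checksum;
    uint32_t header_addr;    /* physical address where this header is loaded */
    uint32_t load_addr;      /* physical address of the start of the image */
    uint32_t load_end_addr;  /* end of the data loaded from the file */
    uint32_t bss_end_addr;   /* end of the zero-filled area after it */
    uint32_t entry_addr;     /* physical address the bootloader jumps to */
} Mb1AoutHeader;

static_assert(sizeof(Mb1AoutHeader) == 32, "unexpected header size");

/* Build a header for a flat binary of image_size bytes, loaded at load_addr,
 * with the header located header_offset bytes into the file. */
static Mb1AoutHeader mb1_make_aout_header(uint32_t load_addr,
                                          uint32_t header_offset,
                                          uint32_t image_size,
                                          uint32_t bss_size,
                                          uint32_t entry_addr)
{
    Mb1AoutHeader h;
    h.magic         = MB1_MAGIC;
    h.flags         = MB1_AOUT_FLAG;
    h.checksum      = 0u - (h.magic + h.flags);  /* fields sum to 0 mod 2^32 */
    h.header_addr   = load_addr + header_offset;
    h.load_addr     = load_addr;
    h.load_end_addr = load_addr + image_size;
    h.bss_end_addr  = h.load_end_addr + bss_size;
    h.entry_addr    = entry_addr;
    return h;
}
```

A build step could write these 32 bytes near the start of any flat image, which is why the kernel's own executable format stops mattering.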
eekee wrote:In actual fact, the dead-code elimination in my compiler means I'd have to implement extra headers in the same part of the compiler which generates the normal headers.
The normal solution for problems like these is to have some way to tell the compiler that something is used no matter what it thinks. GCC has __attribute__((used)), linker scripts have KEEP(), etc.
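A minimal sketch of the GCC/Clang side, assuming a Multiboot 1 style header with no flags set. The names boot_tag and boot_header are hypothetical. In a real kernel the object would usually also carry a section attribute so that the linker script can place it early in the image and wrap it in KEEP(); that part is omitted here to keep the example portable.

```c
#include <assert.h>
#include <stdint.h>

#define BOOT_MAGIC 0x1BADB002u
#define BOOT_FLAGS 0x00000000u

struct boot_tag {
    uint32_t magic, flags, checksum;
};

static_assert(sizeof(struct boot_tag) == 12, "unexpected header size");

/* Nothing in the program references boot_header, so an optimizing compiler
 * would normally discard it. The `used` attribute tells the compiler to emit
 * it anyway. */
__attribute__((used, aligned(4)))
static const struct boot_tag boot_header = {
    BOOT_MAGIC,
    BOOT_FLAGS,
    0u - (BOOT_MAGIC + BOOT_FLAGS),
};
```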
Re: Why do multiboot2 & Limine binary search?
Posted: Fri Feb 16, 2024 11:58 am
by Octocontrabass
eekee wrote:Mapping and entry point are good reasons to use ELF.
Doesn't PE have those too?
eekee wrote:I just don't see what's so hard about prepending a header to the ELF, adding a little program to generate it and calling the program from the makefile or whatever.
That would work with Multiboot, since you can repeat all the information the bootloader would normally get from the ELF (or PE) headers in the Multiboot headers, but Limine seems to require a valid executable binary.
Re: Why do multiboot2 & Limine binary search?
Posted: Sat Feb 17, 2024 6:17 pm
by davmac314
nexos wrote:Not exactly sure; I find this to be a questionable design choice as well. The better solution (and this is what old school Limine did) would be to have a special section in the ELF which contains this data structure.
In my opinion, using sections for this purpose was wrong; section headers shouldn't even need to be present in a loadable ELF.
It should've instead used special segments. That's what Tosaithe does (though, if the special segment is missing, it looks for the magic at the beginning of every segment in the file, as a concession to making it even easier to generate suitable executables).
Re: Why do multiboot2 & Limine binary search?
Posted: Sun Feb 18, 2024 12:30 pm
by eekee
nullplan wrote:eekee wrote:I just don't see what's so hard about prepending a header to the ELF, adding a little program to generate it and calling the program from the makefile or whatever.
ELF files must necessarily start with the ELF header. You can get a hell of a benefit from keeping the kernel file a valid ELF file, because then GDB can find things where they are expected. And for most people it is just easier to add the multiboot header to the first section. If only there weren't so many tutorials out there calling for linker scripts that way over-align all the sections, things could be so much easier.
Right. GDB can be configured to work with a different format, but writing that config is often one of those, "In the carpenter's house, all the doors are off their hinges," kind of things. I've seen it happen in 9front; some wanted an acid script (≈ GDB config) for something or other, (not the kernel,) but it was months before someone got around to writing it.
nullplan wrote:eekee wrote:I'd have to modify my toolchain to use either multiboot or limine; it only generates PE32.
As I said, you can use multiboot at least with whatever binary format blows your skirt up if you enable the bit they call "a.out kludge". Which essentially adds a very simple executable header (containing memory sizes and the entry point) into the multiboot header. You can even use it without any header whatsoever if you do that.
I might just do that!
nullplan wrote:eekee wrote:In actual fact, the dead-code elimination in my compiler means I'd have to implement extra headers in the same part of the compiler which generates the normal headers.
The normal solution for problems like these is to have some way to tell the compiler that something is used no matter what it thinks. GCC has __attribute__((used)), linker scripts have KEEP(), etc.
I'd have to implement that. I could instead take a pointer to the multiboot structure during initialization; that wouldn't hurt.
Octocontrabass wrote:eekee wrote:Mapping and entry point are good reasons to use ELF.
Doesn't PE have those too?
Just there, I should have written "ELF or PE," as opposed to a kernel/bootloader-specific non-executable format.
Octocontrabass wrote:eekee wrote:I just don't see what's so hard about prepending a header to the ELF, adding a little program to generate it and calling the program from the makefile or whatever.
That would work with Multiboot, since you can repeat all the information the bootloader would normally get from the ELF (or PE) headers in the Multiboot headers, but Limine seems to require a valid executable binary.
Right; Multiboot's a.out kludge.
Regardless of what I think is "doing it properly", Multiboot with the a.out kludge could save me a lot of time.