OSDev.org

Posted: **Sat Feb 13, 2010 9:03 pm**

Well, I've gotten my hands pretty dirty and learned a lot. Maybe more than I expected initially...

I learned to make PE executables multiboot compliant, how to use I/O ports, how to install a GDT and IDT, work with VGA, PIC, basic drivers, etc. So I've covered most of the bottom rung of the ladder (I guess, unless that excludes something really basic/fundamental). I've kept my code and design as neat as possible, even though I started with a very blurry picture of how things would turn out. But it was inevitable this design would begin to look sloppy when it was time to step up to the next level. I couldn't really plan ahead and outline the project before I knew what the project was going to include; in stark contrast to writing a userland application. I've got a much better picture of what needs to be done now, but certainly not the whole range of it. So, I'm trying to get organized, make some coherent plans and get into a real development cycle. I can't stay in this chaotic cowboy-coding approach, lol.

The first big concerns of mine are maintainability, modularity and structure. For example, it's totally unacceptable for me to just dump tons of code into a single project/folder and try to work with it. Way too chaotic and too difficult for other people to jump into further along the road. Likewise, that's going to cause a LOT of code repetition and thus increase the size of the kernel (or whatever component we're dealing with). That's one of the biggest downfalls of static libs, though they do have some big advantages. To put it into perspective, my VGA code is currently a static lib which gets linked into the kernel. That's ok at this very moment, but other processes will need it soon. That means that it will be linked into ALL of them; each with their own little copy of it. So it seems to, obviously, follow that this should be a module (shared library) which can be digested by multiple processes as needed. If it hasn't hit the accumulator registers in your brain yet, I'm essentially going for a micro-kernel-like design, in which I'm going to attempt some modifications and deviations from the norm (a hybrid kernel, I suppose).

So it occurred to me that what I'm currently using as my primitive monolithic kernel could/should actually be a multiboot compliant system initialization program. From what I've read, Grub can load other things as "modules", which your own code has to handle (fair enough). So I figure why not have it load a more sophisticated micro-kernel as a module, load libraries/drivers/etc, and have my "initializer" perform all the setup work (maybe even installation in the future if needed). Then as control is given to the shiny new kernel, the initializer's code and resources can be disposed of and it won't need to exist for the duration of runtime (sounds good to me). I think this can be very advantageous in the future, since a certain interface between the initializer and kernel can be created which doesn't need changes, even though the internal implementations may be changed. This could make, for instance, upping the system to 64-bit less painful. I guess really I'm looking for an overview of how this is done. All I've really seen/read was a snippet from menu.lst files with "module <file name here>". I already get the multiboot information pointer after loading and know it has some data about modules. I've just never seen (and can't find) how to tie that together and handle it correctly. I'm pretty sure I can handle implementing something like GetProcAddress to find functions/resources and whatnot. The Grub side of it is a bit mysterious to me though, as is how I can find out information about each and every module BEFORE I start trying to parse them. If anyone can fill in some of my mental blanks here, please do!

Another thing, which I can't find an answer to, is how to reliably determine the actual boot device on different hardware. All I've really seen is the suggestion that getting 0x0 is likely to be the 1st floppy drive, and 0x80 is likely to be the main HD. You could determine this much easier in Real Mode, but is this even relevant anymore when we're in PMode? Or is this information useless? I can't seem to find any standards on this about knowing the actual device with certainty; which I think could be very important. The spec doesn't seem to offer much insight here. Will I just need to come up with something original to determine this because BIOSes are too diverse in this area?

Wow, this is getting way too long, so let me try to wrap it up... I'm trying to learn more about how some working, modern operating systems work. It's been hard to find certain things, since I really don't know what to search for or it's simply non-existent. For example, there are plenty of very broad/general diagrams about how all the major operating system components relate, but practically nothing along the lines of deeper, more detailed diagrams about what goes on between them. Anyone know where some are? I've also been unable to find any info/diagrams about the things that go on AFTER booting and BEFORE the shell is initialized in modern operating systems. All that nitty-gritty initialization work is a bit mysterious to me, and so far I've just made stuff up (varying degrees of success). Most articles and those little pictures of all the stuff (like the kernel, HAL, drivers, etc) are nice, but demonstrate nothing along the lines of how all that gets setup in the first place.

I'm also starting to try to dissect MINIX in hopes that it can teach me a lot about how a compact, Unix-like system works. That's great and all, but there are 497 files in my copy and the names of most files/folders reveal little to me. I'm unable to find any outline of directories and their contents or any diagrams of how everything fits together. I also can't extract the floppy image to see the final deployment structure. So this is starting to seem like a long, tedious, Sherlock Holmes mystery.

So to end this, if anyone knows where I can find a "handbook" to the structure and implementation of MINIX, please point me to it (other than Tanenbaum's $120+ book, for now). I think it can be a great teacher, but I hate to have to spend eons (or hundreds of $$$) just figuring out what's IN the thing.

Thanks, and my apologies for the wall-O-text! Just starting to get more and more curious as I learn! Any other insight you can offer me about organizing and planning this thing will be greatly appreciated as well!

Posted: **Sat Feb 13, 2010 11:43 pm**

Hi,

ATC wrote:So I figure why not have it load a more sophisticated micro-kernel as a module, load libraries/drivers/etc, and have my "initializer" perform all the setup work (maybe even installation in the future if needed).

That's similar to the way I do things; although I have a boot image (containing lots of files) as one module (rather than lots of separate modules/files loaded by GRUB), because this is easier for end users and makes it easier for my code to find files (it's like a mini file system in RAM, where I can search for the file "/foo/bar/whatsit.bin" within the boot image).

ATC wrote:Then as control is given to the shiny new kernel, the initializer's code and resources can be disposed of and it won't need to exist for the duration of runtime (sounds good to me). I think this can be very advantageous in the future, since a certain interface between the initializer and kernel can be created which doesn't need changes, even though the internal implementations may be changed. This could make, for instance, upping the system to 64-bit less painful.

The initializer could check which features the CPU supports, then search the boot image for a kernel that supports those features. For an example, imagine a "live CD" that has a 32-bit kernel and a 64-bit kernel, and auto-selects which one to use.
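A rough sketch of that auto-selection, assuming the initializer has already executed CPUID and kept the raw register values. The CPUID facts used (leaf 0x80000000 reports the highest extended leaf in EAX, leaf 0x80000001 reports long mode in EDX bit 29) are standard; the function names and the two file paths are made up for illustration and are not anyone's actual layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Extended CPUID leaf that carries the long mode (64-bit) feature bit. */
#define CPUID_EXT_FEATURES  0x80000001u
#define CPUID_EDX_LONG_MODE (1u << 29)

typedef enum { KERNEL_32BIT, KERNEL_64BIT } kernel_kind;

/* max_ext_leaf: EAX returned by CPUID leaf 0x80000000.
 * ext_edx:      EDX returned by CPUID leaf 0x80000001 (ignored if that
 *               leaf is not available on this CPU). */
bool cpu_has_long_mode(uint32_t max_ext_leaf, uint32_t ext_edx)
{
    if (max_ext_leaf < CPUID_EXT_FEATURES)
        return false;               /* leaf 0x80000001 does not exist */
    return (ext_edx & CPUID_EDX_LONG_MODE) != 0;
}

/* Pick the 64-bit kernel only if the CPU can run it AND the boot image
 * actually contains one; otherwise fall back to the 32-bit kernel. */
kernel_kind select_kernel(uint32_t max_ext_leaf, uint32_t ext_edx,
                          bool have_kernel64)
{
    if (have_kernel64 && cpu_has_long_mode(max_ext_leaf, ext_edx))
        return KERNEL_64BIT;
    return KERNEL_32BIT;
}

/* Hypothetical paths inside the boot image. */
const char *kernel_path(kernel_kind kind)
{
    return kind == KERNEL_64BIT ? "/boot/kernel64.bin" : "/boot/kernel32.bin";
}
```

Keeping the decision separate from the CPUID instruction itself means the same logic can be exercised on a development machine with faked register values.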

ATC wrote:The Grub side of it is a bit mysterious to me though, and how I can find out information about each and every module BEFORE I start trying to parse them. If anyone can fill in some of my mental blanks here, please do!

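GRUB (multi-boot) gives you a list containing the start and end address of each loaded module, plus a pointer to the string that followed the file name on its "module" line. Beyond that, you'd need to look at each module to determine what it is. I put a "file type" field in the module's header, which makes this easy.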
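For reference, a minimal sketch of walking that list under Multiboot version 1. The field order and the meaning of flag bit 3 follow the Multiboot specification; the type and function names (`mb_info_head`, `mb_module_at`, and so on) are invented here, and the pointer cast assumes a 32-bit kernel where physical addresses are still directly accessible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MB_INFO_FLAG_MODS (1u << 3)   /* mods_count/mods_addr are valid */

/* Leading fields of the Multiboot (v1) information structure. */
typedef struct {
    uint32_t flags;
    uint32_t mem_lower;
    uint32_t mem_upper;
    uint32_t boot_device;
    uint32_t cmdline;
    uint32_t mods_count;   /* number of modules loaded */
    uint32_t mods_addr;    /* physical address of the first mb_module */
} mb_info_head;

/* One entry per "module" line in menu.lst. */
typedef struct {
    uint32_t mod_start;    /* physical start address of the module */
    uint32_t mod_end;      /* physical end address of the module */
    uint32_t string;       /* address of the module's NUL-terminated string */
    uint32_t reserved;
} mb_module;

static_assert(sizeof(mb_module) == 16, "module entries are 16 bytes");
static_assert(offsetof(mb_info_head, mods_count) == 20, "unexpected layout");

/* Number of usable module entries, or 0 if the loader gave us none. */
uint32_t mb_module_count(const mb_info_head *mbi)
{
    return (mbi->flags & MB_INFO_FLAG_MODS) ? mbi->mods_count : 0;
}

/* Entry i of the module list, or NULL if i is out of range. */
const mb_module *mb_module_at(const mb_info_head *mbi, uint32_t i)
{
    if (i >= mb_module_count(mbi))
        return NULL;
    return (const mb_module *)(uintptr_t)mbi->mods_addr + i;
}

/* Size in bytes, treating mod_end as one past the last byte. */
uint32_t mb_module_size(const mb_module *m)
{
    return (m->mod_end > m->mod_start) ? m->mod_end - m->mod_start : 0;
}
```

The `string` field is the only thing known about a module before its contents are inspected, so anything else (PE, boot image, compressed file) has to come from a magic number or type field at the start of the module itself.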

ATC wrote:Another thing, which I can't find an answer to, is how to reliably determine the actual boot device on different hardware. All I've really seen is the suggestion that getting 0x0 is likely to be the 1st floppy drive, and 0x80 is likely to be the main HD. You could determine this much easier in Real Mode, but is this even relevant anymore when we're in PMode? Or is this information useless?

This information isn't useless. After the OS boots it may want to update some of its boot files (e.g. I have a "boot script" that controls which video mode is set up during boot, etc). This means you need to know where the boot files are.

When the OS is installed you could store the boot device somewhere (e.g. type of device, which bus the controller is on, etc). This method fails for USB devices (which could have been plugged in anywhere) and can also cause problems when the user shifts hard drives (or IP addresses) around.

When the OS is installed you could store a randomly generated signature somewhere, then search all devices for this signature. This method fails when the user clones the boot disk, as you may have 2 (or more) disks with the same signature.

You could also ask the user "where did I boot from?", but this isn't very user-friendly, and it's probably a bad idea to assume the current user is the administrator (or has any idea where the OS booted from).

The best way is for the boot loader to tell the OS exactly which device it booted from. Unfortunately the "BIOS device number" is inadequate for this. For example "device 0x00" could be the first or second floppy drive (some BIOSes have a "swap floppy drives at boot" option), or could be an emulated floppy drive (when you've actually booted from CD or USB). Ideally the boot loader would tell the OS which partition on which device on which controller on which bus, but this is difficult for the boot loader itself to find out in some cases (and GRUB/multi-boot doesn't even try).
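For what multi-boot does provide, here is a small sketch that unpacks the `boot_device` field, assuming Multiboot version 1 (valid when flag bit 1 is set; BIOS drive number in the most significant byte, then three partition levels, with 0xFF meaning "unused"). The struct and function names are invented for this example, and the result is still only the BIOS drive number, with all the ambiguity described above.

```c
#include <stdbool.h>
#include <stdint.h>

#define MB_INFO_FLAG_BOOTDEV (1u << 1)  /* boot_device field is valid */
#define MB_PART_UNUSED       0xFFu      /* partition level not used */

typedef struct {
    uint8_t drive;  /* BIOS INT 0x13 drive number, e.g. 0x00 or 0x80 */
    uint8_t part1;  /* top-level partition number */
    uint8_t part2;  /* sub-partition within part1 */
    uint8_t part3;  /* sub-partition within part2 */
} mb_boot_device;

/* Split the packed 32-bit field into its four bytes.
 * Returns false (leaving *out untouched) if the loader did not supply it. */
bool mb_decode_boot_device(uint32_t flags, uint32_t boot_device,
                           mb_boot_device *out)
{
    if ((flags & MB_INFO_FLAG_BOOTDEV) == 0)
        return false;
    out->drive = (uint8_t)(boot_device >> 24);
    out->part1 = (uint8_t)(boot_device >> 16);
    out->part2 = (uint8_t)(boot_device >> 8);
    out->part3 = (uint8_t)boot_device;
    return true;
}

/* BIOS convention only: numbers with bit 7 set are "fixed disk" style
 * drives. This says nothing about the real hardware behind the number. */
bool mb_drive_in_fixed_disk_range(const mb_boot_device *dev)
{
    return (dev->drive & 0x80u) != 0;
}
```

A value such as 0x8000FFFF would decode to drive 0x80, first partition, with no sub-partitions. Treat that as a hint to combine with other methods, not as an identification of the device.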

In the end, if you need to know where you booted from, then you will probably need to use a combination of several methods.

Cheers,

Brendan

Posted: **Sun Feb 14, 2010 7:25 pm**

Thank you! Very helpful thoughts! I think this is really nice because it can dodge the PMode paradox of needing to load critical modules/drivers without the critical modules/drivers to load themselves!

I've realized operating system implementation can be rife with weird paradoxes like that. I think I should keep it simple, load just the modules I need to handle everything else, and go from there, no?

But it begs the question, how exactly do you make your module act as a virtual filesystem in RAM? It's made up of multiple files/directories, so how do you load it as one module? Some sort of archive, like GZ/RAR? Please elaborate if you can. I've never done this with GRUB before. I'm about to just try loading a DLL as a module and see if I can just get a successful call and return to/from one function. If that works out, I can get a lot more leverage to tackle some of my conceptual issues. I have some pretty cool ideas which I'm pretty sure are relatively viable, but I just lack the experience to dive in and push the envelope. I'm also wondering what would be the best way to arrange the kernel and critical system components in memory. I've heard of and read about the "higher-half kernel" concept, but have heard of no alternatives. Are there any, and what sort of advantages/problems do different approaches have? If I've missed this in the Wiki somehow, I'll take no offense to being sent to it!

Also, my curiosity is still piqued about the way some modern operating systems are designed and implemented. I've seen some very simple diagrams and tidbits of information, but little that goes beyond protection layers and some general theory. I'm still liking a lot of what I see in MINIX for study material, but yeah, that's a ton of files to dissect with no guidance; so it will be outlandishly slow. And this phase between system boot and normal operation is still murky and mysterious to me; and I think it's extremely important to get things right here before I try to get too fancy. No use trying to build a mansion on top of a smelly bog, lol! It's pretty important to me to get a deeper understanding in these areas so I can plan this thing properly; before randomly hacking away at a messy implementation.

I've searched all over for a lot of this information, but I encounter problems in finding it. First, I often don't know what the correct search terms should be. Second, it may just be unavailable, period. And third, I'm going into some uncharted waters in certain areas in the name of innovation; problematic, but definitely fun and potentially beneficial.

Posted: **Mon Feb 15, 2010 4:12 am**

Hi,

ATC wrote:But it begs the question, how exactly do you make your module act as a virtual filesystem in RAM? It's made up of multiple files/directories, so how do you load it as one module? Some sort of archive, like GZ/RAR?

I wrote a fairly simple utility to parse a text file and create a boot image containing the files listed in that text file (while creating/checking checksums; and creating fake "directory entry" information with permissions, file owner, modification times, etc). I've also got a second utility to compress files. Mostly it's like "tar" and "gzip" (except it's for my OS and complies with my file format specifications, etc).

GRUB loads the boot image as a module, my boot code checks it (and if it's a compressed file rather than a boot image it decompresses it, then checks it again), and after that it's used like a read-only file system in RAM. Eventually (when I've got the VFS set up) the files in the "file system in RAM" will be injected into the VFS's file cache (so the boot image could be used as a way to pre-fetch files that are likely to be used after boot, for example); and the VFS will check if the files from the boot image are in the file system and write them to disk if they aren't (which will eventually make installing the OS easier - format a partition, mount it at "/", then let the VFS create the initial directory structure and store the files from the boot image in it).
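To make the "read-only file system in RAM" idea concrete, here is a minimal sketch of a path lookup over a flat archive. The layout below (magic number, entry count, fixed-size name/offset/size records) is a hypothetical format invented for this example; it is not the format described above, and it leaves out the checksums, permissions and compression mentioned there.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BOOTIMG_MAGIC    0x474D4942u  /* hypothetical magic number */
#define BOOTIMG_NAME_MAX 56

/* Hypothetical layout: header, then entry_count entries, then file data. */
typedef struct {
    uint32_t magic;
    uint32_t entry_count;
} bootimg_header;

typedef struct {
    char     name[BOOTIMG_NAME_MAX]; /* NUL-terminated absolute path */
    uint32_t offset;                 /* file data, from start of image */
    uint32_t size;                   /* file size in bytes */
} bootimg_entry;

/* Look up path in the image. On success returns a pointer to the file's
 * bytes inside the image and stores its size in *size_out. Returns NULL
 * if the image is malformed or the file is not present. */
const void *bootimg_find(const void *image, size_t image_size,
                         const char *path, uint32_t *size_out)
{
    const uint8_t *base = image;
    bootimg_header hdr;

    if (image_size < sizeof hdr)
        return NULL;
    memcpy(&hdr, base, sizeof hdr);
    if (hdr.magic != BOOTIMG_MAGIC)
        return NULL;
    /* The whole entry table must fit inside the image. */
    if (hdr.entry_count > (image_size - sizeof hdr) / sizeof(bootimg_entry))
        return NULL;

    for (uint32_t i = 0; i < hdr.entry_count; i++) {
        bootimg_entry e;
        memcpy(&e, base + sizeof hdr + (size_t)i * sizeof e, sizeof e);
        if (e.name[BOOTIMG_NAME_MAX - 1] != '\0')
            continue;                       /* name is not terminated */
        if (strcmp(e.name, path) != 0)
            continue;
        /* Reject entries whose data would run past the end of the image. */
        if (e.offset > image_size || e.size > image_size - e.offset)
            return NULL;
        *size_out = e.size;
        return base + e.offset;
    }
    return NULL;
}
```

In a freestanding kernel `memcpy` and `strcmp` would have to come from your own support code. A call would look like `bootimg_find(start, size, "/foo/bar/whatsit.bin", &len)`, where `start` and `size` come from the module entry GRUB handed over.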

In addition, during boot I keep track of which files (in the boot image) were read and which ones weren't. This also makes installing the OS easier - e.g. a generic boot CD that has lots of things in the boot image (several kernels, lots of device drivers, etc), where (after boot) a new "minimal" boot image (containing only the things that were needed for that computer) can be created automatically (and then installed on that computer's hard drive).

ATC wrote:I'm also wondering what would be the best way to arrange the kernel and critical system components in memory. I've heard of and read about the "higher-half kernel" concept, but have heard of no alternatives. Are there any, and what sort of advantages/problems do different approaches have? If I've missed this in the Wiki somehow, I'll take no offense to being sent to it!

My boot code has its own physical memory manager (and dynamically allocates pages to use for everything). It also sets up paging before the kernel is started. I have no idea which pieces end up at which physical addresses (it's different for different computers) - it doesn't actually matter much.

For the virtual address space, most OSs (including mine) have "process space" at the bottom of each virtual address space and "kernel space" at the top. This makes it easier to change the size of "process space" without recompiling applications, etc - processes just use from 0x00000000 to "MEM_TOP"; which can be important if different kernels use different amounts of space. A good example of this is a 32-bit process, where a 32-bit kernel might tell it that "MEM_TOP" is 0xBFFFFFFF (3 GiB) and a 64-bit kernel might tell it that "MEM_TOP" is 0xFFFFFFFF (4 GiB).
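As a small illustration of why a single "MEM_TOP" value is convenient, here is a sketch of the check a kernel might make before accepting a buffer from a 32-bit process. The two constants are the example values from the paragraph above; the function name is made up, and treating "MEM_TOP" as the highest usable address (inclusive) is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Example values only: highest address a 32-bit process may use. */
#define MEM_TOP_32BIT_KERNEL 0xBFFFFFFFu  /* kernel keeps the top 1 GiB */
#define MEM_TOP_64BIT_KERNEL 0xFFFFFFFFu  /* process gets all 4 GiB */

/* True if [start, start + length) lies entirely inside process space,
 * i.e. at or below mem_top. Written to avoid 32-bit overflow when
 * start + length would wrap around. */
bool user_range_ok(uint32_t start, uint32_t length, uint32_t mem_top)
{
    if (length == 0)
        return true;                 /* empty range touches nothing */
    if (start > mem_top)
        return false;                /* begins inside kernel space */
    return (length - 1) <= (mem_top - start);
}
```

The same process binary passes or fails this check purely according to the `mem_top` the running kernel reports, which is the point of not hard-coding the split into applications.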

Of course the virtual address space layout is much easier for a micro-kernel, because device drivers are like processes and you don't need to worry about space for them. For a 32-bit monolithic kernel, it can be "fun" trying to figure out where to map video display memory (for example) without running out of space for the disk caches, etc.

Cheers,

Brendan

Posted: **Mon Feb 15, 2010 5:29 am**

Cool! Again, very helpful!

Right now I'm working out a solid implementation of a GetProcAddress-style function and a whole set of other functions to gather data about PE files. Goodness, I've learned more about the PE format than I ever wanted to! And I'm sure I've hardly scratched the surface!

But I'm hoping within the next hour or two I'll be able to get the module data from GRUB's multibootinfo* and be able to (somewhat) elegantly gather data from PEs, find their exports, and handle them properly. If it all works out, this will be a huge leap forward for me and this project, so I'm pretty anxious to see what happens.
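As a starting point for the export lookup, here is a sketch of the first step only: checking that a module loaded by GRUB is a 32-bit PE image and reading the export directory's RVA and size from the optional header. The offsets come from the PE/COFF format (e_lfanew at 0x3C, export table as data directory 0); the function name is invented, and PE32+ (64-bit) images are deliberately rejected to keep it short.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Little-endian reads that do not depend on alignment. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Locate the export directory of a PE32 image held in img[0..len).
 * Returns true and fills *rva / *size if the image has one. */
bool pe32_export_directory(const uint8_t *img, size_t len,
                           uint32_t *rva, uint32_t *size)
{
    /* DOS header: "MZ" signature, e_lfanew at offset 0x3C. */
    if (len < 0x40 || img[0] != 'M' || img[1] != 'Z')
        return false;
    uint32_t nt = rd32(img + 0x3C);

    /* Need: signature (4) + COFF header (20) + optional header up to
     * and including data directory 0 (96 + 8 bytes for PE32). */
    if (nt > len || len - nt < 4 + 20 + 104)
        return false;
    if (rd32(img + nt) != 0x00004550u)          /* "PE\0\0" */
        return false;

    const uint8_t *coff = img + nt + 4;
    const uint8_t *opt  = coff + 20;
    if (rd16(coff + 16) < 104)                  /* SizeOfOptionalHeader */
        return false;
    if (rd16(opt) != 0x010B)                    /* PE32 magic only */
        return false;
    if (rd32(opt + 92) < 1)                     /* NumberOfRvaAndSizes */
        return false;

    *rva  = rd32(opt + 96);                     /* export table RVA */
    *size = rd32(opt + 100);                    /* export table size */
    return *rva != 0 && *size != 0;
}
```

One caveat worth planning for: GRUB loads a module as the raw file, so sections are not yet placed at their RVAs. Before following the export directory you would either need to map the sections yourself or translate each RVA to a file offset via the section table.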

Your methods sound very cool though, so I'll be thinking about it and see if I can put my own spin on it to work for my project. You've been quite helpful, and I can't thank you enough!

If anyone can address any of my other concerns/questions and has time, please do! Thanks!

Need a friendly slap in the right direction...
