Hi, I'm finding that I am being hindered by addressing, memory management and memory models. I really need to properly understand all this stuff before I can do anything terrific.
What are the memory models? There's 16-bit real mode, where there are only 64K segments, or somesuch; there's "unreal" mode (is that where you enable the A20 gate so you have 32-bit addressing, but are working with physical memory, but still have access to the BIOS and you aren't yet in pmode, hence "unreal" mode?), and then of course there's 32-bit protected mode, and then there's the AMD x86-64's 64-bit mode. Not sure about that. I haven't yet saved up the cash to get a 64-bit PC, but that is definitely what I am going to do next.
I am trying not to be confused by DOS memory models and the real memory models offered by Intel and compatible chips. As far as I know, when there are 64KB segments in 16-bit real mode, in theory an application could use an address space much larger, but will need "near" and "far" pointers, and changes of segment registers.
Is that all there is? 16-bit segmented real mode, unreal mode (16/32 bit? Physical addresses? BIOS access OK? Segments? Flat?), 32-bit flat protected mode with no segments in the sense of 16-bit segmented mode, (also add something about Virtual 8086 Mode - what is it, how to switch in and out of it as well...), and then 64-bit mode offered by the AMD x86-64. I'll cover that last one in great detail after I get my 64-bit PC and have learned about it.
I would also like more details on the BIOS, especially detecting the amount of memory installed and so on. I believe there is a limitation in the BIOS or something which means perhaps there is somewhere else in the ROM where the amount of memory can be detected, and such.
In any mode, theoretically the kernel could be loaded at 2K, right? Since there is only so much space the interrupt vectors for real mode take up. If the kernel is loaded at 2K, then it must take care not to bump into the 640K BIOS starting point... what is the physical address the BIOS starts at, exactly the offset for 640KB? It runs until 1MB, with the 1024th (or 1025th?) kilobyte being available for general use? I wonder where I can find the BIOS memory map, so that a structure describing BIOS reserved memory areas can be built? There are various addresses that are reserved high in memory (or mapped there anyway) like the PCI configuration space aka 32-bit PCI BIOS, and so on...
Hmm... does anyone know of any documentation available for a modern "standard" BIOS? Also, any decent literature on memory models?
Ah yes, I need to understand more about stacks and heaps... I'll come back to that later.
Memory Models: and what is "unreal mode?"
Re:Memory Models: and what is "unreal mode?"
Lots of very nice questions, each having a single answer. Let me see if I can give them to you.
kernel_journeyman wrote: What are the memory models? There's 16-bit real mode, where there are only 64K segments, or somesuch; there's "unreal" mode (is that where you enable the A20 gate so you have 32-bit addressing, but are working with physical memory, but still have access to the BIOS and you aren't yet in pmode, hence "unreal" mode?), and then of course there's 32-bit protected mode, and then there's the AMD x86-64's 64-bit mode. Not sure about that. I haven't yet saved up the cash to get a 64-bit PC, but that is definitely what I am going to do next.
There are LOTS of models. The bulk:
16-bit real mode: You have 64K segments with 16-bit offsets, both for code and for data. A 16-bit segment register selects where a segment starts, and consecutive segment values start 16 bytes apart, so the linear address is segment * 16 + offset. This puts your maximum address at 0x10FFEF, because you can set both the segment and the offset to 0xFFFF. With the A20 gate disabled, that address wraps around to 0xFFEF. If you enable the A20 gate, you get what DOS people call the high memory area, an additional block of nearly 64K where nobody will come fight.
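A minimal sketch of that arithmetic in Python, purely as an illustration. The function name `real_mode_linear` and its `a20_enabled` parameter are made up for this example, and the A20 gate is modelled simply as clearing address bit 20.

```python
def real_mode_linear(segment, offset, a20_enabled=True):
    """Return the address a real-mode segment:offset pair resolves to."""
    # Segment values are 16 bytes apart, so shift left by 4 and add the offset.
    linear = ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)
    if not a20_enabled:
        # With the A20 gate closed, address bit 20 is forced to 0,
        # so anything above 1 MiB wraps around into low memory.
        linear &= ~(1 << 20)
    return linear


print(hex(real_mode_linear(0xFFFF, 0xFFFF)))                     # 0x10ffef
print(hex(real_mode_linear(0xFFFF, 0xFFFF, a20_enabled=False)))  # 0xffef
print(hex(real_mode_linear(0x07C0, 0x0000)))                     # 0x7c00
```

The last line shows that many different segment:offset pairs name the same byte: 07C0:0000 and 0000:7C00 both resolve to 0x7C00.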
24-bit protected mode: The 286 also has a protected mode, with 16-bit segments and 24-bit physical addresses. Segments are at most 64K, addressed with 16-bit offsets, but the processor can physically address 16M of memory. To use it, switch to protected mode, use 286-style segments, and don't try to use addresses above 16M. Using this mode isn't a good idea.
32-bit protected mode, without paging or with plain paging: You have 4GB of address space using 32-bit registers, and you can make up to 16383 segments (you need one entry for the LDT), each stretching up to 4GB. Segment:offset addresses are translated to a 32-bit linear address, which is then translated to a 32-bit physical address.
Using PAE paging, you can get that last number up to 36 bits, giving you a possible 64GB of physical memory, but still only 4GB of linear address space. This DOES mean that even if you use multiple page directories (or PDPTs, which you need in order to support 64GB physical), you can still only use 4GB per process, which can be split up into segments.
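To make the linear-to-physical step a bit more concrete, here is a rough Python sketch of how a 32-bit linear address is cut into table indexes for 4K pages, once for plain paging (10 + 10 + 12 bits) and once for PAE (2 + 9 + 9 + 12 bits). The function names and the example address are made up for this illustration.

```python
def split_linear_32(addr):
    """Plain 32-bit paging: page directory index, page table index, offset."""
    directory = (addr >> 22) & 0x3FF   # top 10 bits
    table = (addr >> 12) & 0x3FF       # next 10 bits
    offset = addr & 0xFFF              # low 12 bits within the 4K page
    return directory, table, offset


def split_linear_pae(addr):
    """PAE paging: PDPT index, page directory index, page table index, offset."""
    pdpt = (addr >> 30) & 0x3          # top 2 bits select one of 4 PDPT entries
    directory = (addr >> 21) & 0x1FF   # 9 bits
    table = (addr >> 12) & 0x1FF       # 9 bits
    offset = addr & 0xFFF
    return pdpt, directory, table, offset


example = 0xC0101234  # hypothetical linear address
print([hex(part) for part in split_linear_32(example)])   # ['0x300', '0x101', '0x234']
print([hex(part) for part in split_linear_pae(example)])  # ['0x3', '0x0', '0x101', '0x234']
print((1 << 36) // (1 << 30), "GiB of physical memory with 36-bit addresses")
```

Either way the linear address stays 32 bits wide; PAE only widens the physical addresses stored in the table entries.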
The 16-bit-like V86 mode:
Intel engineers thought you might want to run 16-bit programs on a 32-bit computer. They could hardly have been more right: you STILL use some 16-bit code on the newest Microsoft OSes. Also, most OS devers abuse this mode for 16-bit BIOS calls. The properties are the same as real mode (by design): addresses are formed with the 16-bit segment * 16 + offset mechanism, ignoring the 32-bit descriptor stuff. The resulting linear address IS translated using paging, though, so V86 tasks can be multiplexed on a 32-bit machine. Yay.
The bastard, or unreal, mode:
In unreal mode you abuse a bug that was in the first 32-bit processors. Intel has since seen the use for this bug and decided to call it a feature. You switch to protected mode, load a data segment register with a descriptor that has a 4GB limit, and switch back to real mode; the processor keeps using the cached limit. This gives you 32-bit addressing from 16-bit code by using the 32-bit operand-size and address-size prefixes. The prefixes make the code larger, and your code must still be within the first meg (no, the internal IP isn't stretched). Also, there's no 32-bit segmentation or paging, so you still have to live with real-mode segment bases in 16-byte steps. In short, for loading a really big program (say, an OS) from 16-bit code, it's useful. For anything else, the code is too bloated and slow, plus too limited, to be of use.
The 64-bit AMD mode:
AMD came up with a really nice mode they called long mode, which has 64-bit and compatibility submodes. The 64-bit mode gives you 64-bit addressing (effectively 48 bits, because the top 4 nibbles carry no extra information: they must be copies of bit 47) on a physical address space of up to 52 bits, without using segmentation. The paging is 4-level, with 4K pages and 8-byte entries (four 9-bit indexes plus a 12-bit offset match the 48 bits exactly). This mode allows up to 256TB of data per process, and 16 fully loaded address spaces of that fit in physical memory, for a total of 4PB (PB being petabyte, which not many people know). Since memory is only starting to go into the GB ranges, and only Microsoft OSes actually need that much, this is probably no problem for the near to semidistant future. For big DB systems, it might get a little cramped in the not too distant future. For hobby OS programmers, it's really cool and something you probably need, since my guess is that all future computers will be 64-bit (with of course the niche markets for 8-bit, 16-bit and soon 32-bit).
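The numbers above can be checked with a small Python sketch. It assumes 48-bit virtual addresses with the upper bits sign-extended from bit 47, 4-level paging with 9-bit indexes, and a 52-bit physical limit; the function names are hypothetical.

```python
TIB = 1 << 40  # one tebibyte


def is_canonical(addr):
    """True if bits 63..47 of a 64-bit address are all equal."""
    top = (addr >> 47) & 0x1FFFF       # the 17 bits from bit 47 upward
    return top == 0 or top == 0x1FFFF


def split_linear_64(addr):
    """Return (PML4, PDPT, PD, PT) indexes and the 12-bit page offset."""
    indexes = tuple((addr >> shift) & 0x1FF for shift in (39, 30, 21, 12))
    return indexes + (addr & 0xFFF,)


print((1 << 48) // TIB)                       # 256 (TiB per address space)
print((1 << 52) // TIB)                       # 4096 (TiB physical, i.e. 4 PiB)
print((1 << 52) // (1 << 48))                 # 16 full address spaces
print(is_canonical(0x00007FFFFFFFFFFF))       # True
print(is_canonical(0x0000800000000000))       # False
print(split_linear_64(0xFFFF800000000000))    # (256, 0, 0, 0, 0)
```

The last address is the lowest one in the upper canonical half, which is a common place to put a kernel.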
Next post contains more answers:
Re:Memory Models: and what is "unreal mode?"
kernel_journeyman wrote: I am trying not to be confused by DOS memory models and the real memory models offered by Intel and compatible chips. As far as I know, when there are 64KB segments in 16-bit real mode, in theory an application could use an address space much larger, but will need "near" and "far" pointers, and changes of segment registers.
Far pointers are segment:offset pairs, so in effect you have already covered the option of changing segment registers.
kernel_journeyman wrote: Is that all there is? 16-bit segmented real mode, unreal mode (16/32 bit? Physical addresses? BIOS access OK? Segments? Flat?), 32-bit flat protected mode with no segments in the sense of 16-bit segmented mode, (also add something about Virtual 8086 Mode - what is it, how to switch in and out of it as well...), and then 64-bit mode offered by the AMD x86-64. I'll cover that last one in great detail after I get my 64-bit PC and have learned about it.
If you add compatibility mode on the 64-bitters, yes.
kernel_journeyman wrote: I would also like more details on the BIOS, especially detecting the amount of memory installed and so on. I believe there is a limitation in the BIOS or something which means perhaps there is somewhere else in the ROM where the amount of memory can be detected, and such.
Memory detection: search RBIL (www.ctyme.com/rbrown.htm) for INT 15h with EAX=0000E820h. That's the most common memory detection routine. For old computers, use the same interrupt with AX=E801h or AH=88h (I'm quoting these function numbers from memory, so check them). For the others, search RBIL yourself.
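For a feel of what the E820h call hands back, here is a hedged Python sketch that decodes a buffer of its entries. It assumes the basic 20-byte entry layout (64-bit base, 64-bit length, 32-bit type, little-endian); the sample map below is invented for the example and does not come from a real machine.

```python
import struct

# Region types used by the E820h memory map.
E820_TYPES = {1: "usable", 2: "reserved", 3: "ACPI reclaimable",
              4: "ACPI NVS", 5: "bad memory"}


def parse_e820(buffer):
    """Decode consecutive 20-byte entries into (base, length, kind) tuples."""
    entries = []
    for base, length, kind in struct.iter_unpack("<QQI", buffer):
        # Unknown type codes are treated as reserved, to be on the safe side.
        entries.append((base, length, E820_TYPES.get(kind, "reserved")))
    return entries


def usable_bytes(entries):
    """Total size of all regions marked usable."""
    return sum(length for _base, length, kind in entries if kind == "usable")


# Invented sample: low memory, a small reserved area below 640K, RAM above 1M.
sample_map = (struct.pack("<QQI", 0x00000000, 0x0009FC00, 1) +
              struct.pack("<QQI", 0x0009FC00, 0x00000400, 2) +
              struct.pack("<QQI", 0x00100000, 0x07F00000, 1))

for base, length, kind in parse_e820(sample_map):
    print(f"{base:#012x} {length:#012x} {kind}")
print(usable_bytes(parse_e820(sample_map)) // 1024, "KiB usable")
```

In a real boot loader the buffer would be filled by calling the interrupt repeatedly from real mode; only the decoding step is shown here.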
kernel_journeyman wrote: In any mode, theoretically the kernel could be loaded at 2K, right? Since there is only so much space the interrupt vectors for real mode take up. If the kernel is loaded at 2K, then it must take care not to bump into the 640K BIOS starting point... what is the physical address the BIOS starts at, exactly the offset for 640KB? It runs until 1MB, with the 1024th (or 1025th?) kilobyte being available for general use? I wonder where I can find the BIOS memory map, so that a structure describing BIOS reserved memory areas can be built? There are various addresses that are reserved high in memory (or mapped there anyway) like the PCI configuration space aka 32-bit PCI BIOS, and so on...
The interrupt vectors take up 1K, then comes the BDA, which takes up to 0.5K. You can load your kernel at 1.5K, but for paging convenience, try 4K. The system BIOS itself technically starts at 896K or at 960K. The hardware BIOSes in total can start anywhere between 640K and 1020K; at the least, they own that entire range. Commonly, 640K-704K is mapped to your video card's graphics-mode memory, and 704K-768K to its text-mode memory. The EBDA is also present, taking up to 8 pages (32K) from 640K downward. Keep those areas reserved if you care, and at the very least don't ask the BIOS to overwrite them; that's not reliable. There is no BIOS memory map except for the one we have in the OSFAQ, which tells you more or less the same as I just did.
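The layout just described can be written down as a small table, for instance to sanity-check where a kernel or stack would land. This is only a sketch built from the figures in this post (with the worst-case 32K EBDA); the names `RESERVED` and `conflicts` are made up for the example.

```python
KIB = 1024

# (start, end, description); end is exclusive. Figures as given in the post.
RESERVED = [
    (0 * KIB, 1 * KIB, "real-mode interrupt vectors"),
    (1 * KIB, 1 * KIB + 512, "BIOS data area"),
    (608 * KIB, 640 * KIB, "EBDA (worst case, 32K)"),
    (640 * KIB, 704 * KIB, "video memory, graphics modes"),
    (704 * KIB, 768 * KIB, "video memory, text modes"),
    (768 * KIB, 1024 * KIB, "adapter ROMs and system BIOS"),
]


def conflicts(start, size):
    """Return the descriptions of every reserved area that [start, start+size) touches."""
    end = start + size
    return [name for low, high, name in RESERVED if start < high and low < end]


print(conflicts(4 * KIB, 64 * KIB))     # []
print(conflicts(600 * KIB, 64 * KIB))   # overlaps the EBDA and graphics memory
```

A 64K kernel at 4K is clear of everything in the table, while the same kernel at 600K would run into the EBDA and the video memory.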
kernel_journeyman wrote: Hmm... does anyone know of any documentation available for a modern "standard" BIOS? Also, any decent literature on memory models?
The architecture manuals from both Intel and AMD, The Indispensable PC Hardware Book, stuff like that. Also, try RBIL for lots of BIOS calls.
Re:Memory Models: and what is "unreal mode?"
Hi Candy,
Thanks very much for your responses! Getting my head around this has been confusing: information overload. Attempting to learn everything at once just doesn't work. Instead, I'm now taking it piece by piece: starting with improving my assembly language skills. This means machine organisation, computer architecture, and the IA-32 architecture and IBM and compatible PCs to be precise. First, I need to understand everything there is to know about memory models, then the in-built BIOS services (have bookmarked Ralf's site), then I/O at the lowest level (controllers and chipsets), then everything else to do with OSes from there.
With 16-bit real mode, I understand that I use 64K segments, and that each segment may start on a paragraph boundary as you say. I can choose from any of these paragraphs where I would like to begin and end the segments, great. Now, before enabling the A20 line, I have only 19 or 20 bits of address? The A20 line enables what, 24-bit or 32-bit addresses or something? Anyway, before the A20 line is enabled, I have 1MB or "lowmem" to play with. If I load myself, without Grub or some other bootloader, I must enable the A20 gate to be able to see beyond 1MB, and up to (what? 16MB as in the 286? What about the rest of memory?)
So, if I load myself, I may first disable interrupts to set up a kernel stack safely out of the way at some arbitrary location, re-enable interrupts, enable the A20 line, then relocate the kernel (or boot code that loads the kernel) to 1MB, or whatever. Or of course relocate everything down to 4KB, which I may well do because it will be a small kernel anyway. Should interrupts be disabled while this is happening? Perhaps they should, because there is no IDT yet, and only real mode vectors (what happens when an interrupt occurs at this point?)
I understand that the PIC must be re-programmed because some IRQ lines are in the wrong places, where Intel reserves for divide by zero and other things. At this point, should the interrupt chip be reprogrammed? (Where can I get the datasheet for it? What is it, the old 80259A chip (or a similar name), or the more modern APIC? I have an ASUS A7V333 board, Athlon XP chip, etc, and have an APIC. Should I use this instead of the old interrupt chip, or does it behave the same way as the old chip but with enhancements like multiprocessing?)
After the kernel loader has loaded and it has loaded the kernel, I may set up the GDT for kernel space and userspace (must I have LDTs yet?), the IDT pointing to a spurious_irq routine that prints a harmless message until all the real IDT addresses are filled in, then switch into 32-bit protected mode with no paging and a flat address space (as specified by the GDT presumably), so I am in a 32-bit flat address space with linear addresses having a one-to-one correspondence with physical addresses. Then I find the kernel entry point and jump to it. Done? Kernel running? From the kernel I can enable paging and stuff.
Thanks!
Re:Memory Models: and what is "unreal mode?"
kernel_journeyman wrote: First, I need to understand everything there is to know about memory models, then the in-built BIOS services (have bookmarked Ralf's site), then I/O at the lowest level (controllers and chipsets), then everything else to do with OSes from there.
Technically speaking, Ralf's site is a different one. That site is run by Mark Perkel, who hosts an HTML version of the list.
kernel_journeyman wrote: Now, before enabling the A20 line, I have only 19 or 20 bits of address? The A20 line enables what, 24-bit or 32-bit addresses or something?
Well, theoretically, 31-bit addresses. You can address anything in memory, but bit 20 of the physical address will be forced to 0. Since real mode can't generate more than 21 address bits, you can't access beyond 1M there. In other modes you can, but then you can only use the even megabytes unless you enable A20.
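A tiny Python sketch of that effect, modelling the disabled A20 gate as nothing more than clearing bit 20 of the physical address (the function name is made up for the example):

```python
MIB = 1 << 20
A20_BIT = 1 << 20


def physical_with_a20_disabled(addr):
    """Physical address actually reached while address line 20 is forced to 0."""
    return addr & ~A20_BIT


for megabyte in range(4):
    reached = physical_with_a20_disabled(megabyte * MIB) // MIB
    print(f"access to megabyte {megabyte} lands in megabyte {reached}")
```

Megabytes 0 and 2 are reached as expected, while accesses to megabytes 1 and 3 silently land in megabytes 0 and 2.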
kernel_journeyman wrote: Anyway, before the A20 line is enabled, I have 1MB or "lowmem" to play with. If I load myself, without Grub or some other bootloader, I must enable the A20 gate to be able to see beyond 1MB, and up to (what? 16MB as in the 286? What about the rest of memory?)
When you enable the A20 gate, you can see up to 4GB of memory. If you enable PAE paging, you can use 64GB on PPro/Athlon-like thingies, and up to 1TB on AMD64s. Note that this doesn't fit in your address space, but you can access it using map/unmap calls. If you go as far as enabling long mode, you can reach up to 4096TB, or 4PB, of which you can then access 256TB per address space.
kernel_journeyman wrote: So, if I load myself, I may first disable interrupts to set up a kernel stack safely out of the way at some arbitrary location, re-enable interrupts, enable the A20 line, then relocate the kernel (or boot code that loads the kernel) to 1MB, or whatever. Or of course relocate everything down to 4KB, which I may well do because it will be a small kernel anyway. Should interrupts be disabled while this is happening? Perhaps they should, because there is no IDT yet, and only real mode vectors (what happens when an interrupt occurs at this point?)
Most people choose an arbitrary location far away from their code, not realising that it's on top of something very important. Hint: place it below 608K and above 1.5K.
kernel_journeyman wrote: I understand that the PIC must be re-programmed because some IRQ lines are in the wrong places, where Intel reserves for divide by zero and other things. At this point, should the interrupt chip be reprogrammed? (Where can I get the datasheet for it? What is it, the old 80259A chip (or a similar name), or the more modern APIC? I have an ASUS A7V333 board, Athlon XP chip, etc, and have an APIC. Should I use this instead of the old interrupt chip, or does it behave the same way as the old chip but with enhancements like multiprocessing?)
You don't HAVE to reprogram it; leaving it as it is is just inconvenient. The old chip is the 8259A; with MOST new computers, it is the APIC. I believe all AMD64s have one, all Athlons except model 1, and on the Intel side I believe all P4s and so on, down to the PPro and Pentiums. There might be more, and I might be missing some (people leave out the APIC for price considerations... how stupid can you be?).
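If you do decide to reprogram the old PIC pair, the usual initialisation sequence looks roughly like the Python sketch below. It only builds the list of (port, value) writes rather than touching hardware; the standard PC port numbers are used, while the vector offsets 0x20 and 0x28 are a common choice and an assumption here, as is the function name.

```python
PIC1_COMMAND, PIC1_DATA = 0x20, 0x21   # master 8259A
PIC2_COMMAND, PIC2_DATA = 0xA0, 0xA1   # slave 8259A


def pic_remap_sequence(master_offset=0x20, slave_offset=0x28):
    """Return the port writes that move IRQ 0-15 to the given interrupt vectors."""
    return [
        (PIC1_COMMAND, 0x11), (PIC2_COMMAND, 0x11),           # ICW1: start init, ICW4 follows
        (PIC1_DATA, master_offset), (PIC2_DATA, slave_offset),  # ICW2: vector offsets
        (PIC1_DATA, 0x04), (PIC2_DATA, 0x02),                 # ICW3: slave sits on IRQ2
        (PIC1_DATA, 0x01), (PIC2_DATA, 0x01),                 # ICW4: 8086 mode
    ]


for port, value in pic_remap_sequence():
    print(f"out {port:#04x}, {value:#04x}")
```

In a kernel, each pair would become an `out` instruction; with these offsets IRQ 0-7 arrive at vectors 0x20-0x27 and IRQ 8-15 at 0x28-0x2F, clear of the CPU exception vectors.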
kernel_journeyman wrote: After the kernel loader has loaded and it has loaded the kernel, I may set up the GDT for kernel space and userspace (must I have LDTs yet?), the IDT pointing to a spurious_irq routine that prints a harmless message until all the real IDT addresses are filled in, then switch into 32-bit protected mode with no paging and a flat address space (as specified by the GDT presumably), so I am in a 32-bit flat address space with linear addresses having a one-to-one correspondence with physical addresses. Then I find the kernel entry point and jump to it. Done? Kernel running? From the kernel I can enable paging and stuff.
You must set up a GDT when entering pmode, and you may replace it at any time you like, or not at all. If you don't use an LDT, you don't need one. An IDT is very useful, but you can keep interrupts disabled instead (keep track of your irets: they pop the flags too, so if you iret to a different task it might have interrupts enabled). For the rest, it looks pretty much OK as far as I can tell... It's my own boot sequence, except that I use the setup routine (in C, in pmode) to set up paging, so the kernel can run at its own location from its entry point onward. That saves some ugly coding / compiling. If you want your kernel low in memory, that's not an issue (placing it at 2M-3M is a good choice; you don't even need A20 for that).
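As an illustration of what such a flat GDT contains, the Python sketch below packs the three usual descriptors (null, ring-0 code, ring-0 data) into their 8-byte form. The helper name `gdt_descriptor` is hypothetical; the access bytes 0x9A and 0x92 and the flags nibble 0xC (4K granularity, 32-bit) are the commonly used values for a flat 4GB layout.

```python
import struct


def gdt_descriptor(base, limit, access, flags):
    """Pack base (32-bit), limit (20-bit), access byte and flags nibble into 64 bits."""
    value = limit & 0xFFFF                  # limit bits 0..15
    value |= (base & 0xFFFFFF) << 16        # base bits 0..23
    value |= (access & 0xFF) << 40          # present, ring, type
    value |= ((limit >> 16) & 0xF) << 48    # limit bits 16..19
    value |= (flags & 0xF) << 52            # granularity, default size
    value |= ((base >> 24) & 0xFF) << 56    # base bits 24..31
    return value


flat_gdt = [
    0,                                          # mandatory null descriptor
    gdt_descriptor(0, 0xFFFFF, 0x9A, 0xC),      # code: base 0, limit 4GB
    gdt_descriptor(0, 0xFFFFF, 0x92, 0xC),      # data: base 0, limit 4GB
]

for entry in flat_gdt:
    print(f"{entry:#018x}")

gdt_bytes = struct.pack("<3Q", *flat_gdt)       # 24 bytes, as laid out in memory
print(len(gdt_bytes))
```

With both segments based at 0 and covering 4GB, segment offsets equal linear addresses, which is the flat model asked about above; user-mode descriptors would be added the same way with a different access byte.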
Re:Memory Models: and what is "unreal mode?"
Thanks again Candy. You've made my day.
Now before I go off to do some experiments, when I link a kernel written in C, is it always position independent code that results after everything has been linked? So references to data and jumps, assuming it is self contained and doesn't depend on external modules, will always be relative to the current instruction?
Should I compile ELF modules with the PIC flag, or is this unnecessary? I would like to use LCC instead of GCC actually, because it's smaller, faster and less buggy. I don't have much of a choice with linkers however. GNU's LD is probably my only choice. I may have to fiddle around with segments with an LD script anyway.
- Pype.Clicker
Re:Memory Models: and what is "unreal mode?"
I wouldn't enable the Position Independent Code flag for a kernel, if I were you:
- you *know* where the kernel will be loaded, so PIC is unneeded
- PIC requires a non-trivial amount of run-time magic as well as assumptions about register use that you'll probably find hard to follow in an OS (regarding interrupt handlers and registers that 'should always contain X', for instance).
Re:Memory Models: and what is "unreal mode?"
kernel_journeyman wrote: Now before I go off to do some experiments, when I link a kernel written in C, is it always position independent code that results after everything has been linked?
Usually not. Most executables are not position independent, and some shared libraries aren't either. My kernel, for instance, is statically linked to a certain location.
kernel_journeyman wrote: So references to data and jumps, assuming it is self contained and doesn't depend on external modules, will always be relative to the current instruction?
For data references, that's not possible in traditional x86 (it is in amd64). PIC there is ebx-relative, where ebx points to a fixed point in the file. This also means you lose yet another very valuable register... not a good thing :S. For future PIC code I'd be tempted to use FS or GS as a base, or to prelink it to a certain address. Most people don't use enough libraries to fill their virtual address space, so it's worth it.
kernel_journeyman wrote: Should I compile ELF modules with the PIC flag, or is this unnecessary? I would like to use LCC instead of GCC actually, because it's smaller, faster and less buggy. I don't have much of a choice with linkers however. GNU's LD is probably my only choice. I may have to fiddle around with segments with an LD script anyway.
That depends. I'm using ELF modules for runtime linking in my kernel; I compile them statically and relocate them at load time, using a kernel-bound ELF loader. It still needs testing, but it's nearly done. I could send you a dump of the current code, containing everything but the relocations themselves (and the loading of the relocated module into the module/object repository, but that needs lots of work).