
Having trouble with multiple address spaces

Posted: Thu Dec 04, 2003 7:50 am
by Candy
I have my IDT at 0x1000 physical, 0xFF101000 virtual, and the GDT at 0x1800 physical (and thus 0xFF101800 virtual). The registers themselves are loaded with 0x1000/7FF and 0x1800/7FF, so they should work. However, when I switch from the initial space with those pages identity-mapped to a different one without those pages, it crashes on each interrupt / segment load.

Now, I think, maybe they're virtual addresses (seems logical in a way). So I reload the GDTR to 0xFF101800 and the IDTR to 0xFF101000, both are mapped to the exact same pages, and now it trips on each interrupt from then on. Not sure whether it trips on segment loads though...

Is the IDTR pointing to a physical or virtual address?

If physical, why doesn't it work when I switch address space?

If virtual, why does it crash when I change the IDTR and GDTR to their virtual places (mapping to the exact same physical page nonetheless)?

Does anyone have some help / examples on how you do it?

Re:Having trouble with multiple address spaces

Posted: Thu Dec 04, 2003 8:55 am
by Pype.Clicker
IDTR.base and GDTR.base refer to *linear* addresses -- that is, addresses after segmentation has been applied but before paging is applied.

So every address space should map them at the very same place (and potentially each address space could have its own GDT by mapping *something else* at that place -- which is strongly discouraged ;)
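As a rough illustration of what the base and limit values look like in memory, here is a minimal sketch of the 6-byte operand that LGDT/LIDT read in 32-bit protected mode. The names `dtr_image` and `make_dtr` are made up for this example, and the `packed` attribute is a GCC/Clang extension, so treat it as a sketch under those assumptions.

```c
#include <stdint.h>

/* In-memory operand of LGDT/LIDT in 32-bit protected mode:
 * a 16-bit limit followed by a 32-bit base. */
struct dtr_image {
    uint16_t limit;  /* size of the table in bytes, minus one */
    uint32_t base;   /* linear address: paging applies to it, segmentation does not */
} __attribute__((packed));  /* GCC/Clang extension; avoids padding after limit */

/* Build the operand for a table of `entries` 8-byte descriptors.
 * `entries` must be at least 1 (hypothetical helper, not from the thread). */
static struct dtr_image make_dtr(uint32_t linear_base, uint32_t entries)
{
    struct dtr_image d;
    d.limit = (uint16_t)(entries * 8u - 1u);
    d.base  = linear_base;
    return d;
}
```

With 256 entries this gives the 0x7FF limit mentioned in the first post; the base would be whatever linear address the table is mapped at in every address space.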

Now let's say you have initially loaded IDTR.base==0x1000 and that the segmentation unit makes physical address 0x1000 appear at virtual address 0xF0001000 and that paging is disabled. Everything will work fine as the 'base' information is translated directly to a physical address.

As soon as you enable paging, the physical address used by the GDT will depend on the page table entries that map linear address 0x1000. If these pages are not present, you're likely to get a reboot when you try to reload segment registers. So my suggestion would be the following:
  • setup page(0xF00xxxxx --> 0x000xxxxx), which is the translation you wish
  • setup page(0x000xxxxx --> 0x000xxxxx) so that you can still access your GDT and IDT.
  • enable paging
  • change code & data selectors to zero-based ones.
  • reload GDTR and IDTR with their 0xF0001000 counterparts
  • remove page(0x000xxxxx) mapping
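To check by hand which physical address a given IDTR/GDTR base ends up at, the two-level lookup can be written out in software. The following is a minimal sketch assuming plain 4 KiB pages (no PAE, no 4 MiB pages); `walk_4k` and the `phys_words` parameter (physical memory viewed as an array of 32-bit words) are assumptions made for this example, not something from the posts above.

```c
#include <stdint.h>
#include <stdbool.h>

#define PG_PRESENT 0x001u        /* bit 0 of a PDE/PTE */
#define PG_FRAME   0xFFFFF000u   /* bits 31..12: physical frame address */

/* Translate `linear` the way the MMU would for 4 KiB pages.
 * phys_words models physical memory starting at physical address 0,
 * cr3 holds the physical address of the page directory.
 * Returns false if the PDE or PTE is not present. */
static bool walk_4k(const uint32_t *phys_words, uint32_t cr3,
                    uint32_t linear, uint32_t *phys_out)
{
    const uint32_t *pd = phys_words + ((cr3 & PG_FRAME) >> 2);
    uint32_t pde = pd[linear >> 22];               /* top 10 bits */
    if (!(pde & PG_PRESENT))
        return false;

    const uint32_t *pt = phys_words + ((pde & PG_FRAME) >> 2);
    uint32_t pte = pt[(linear >> 12) & 0x3FFu];    /* middle 10 bits */
    if (!(pte & PG_PRESENT))
        return false;

    *phys_out = (pte & PG_FRAME) | (linear & 0xFFFu);
    return true;
}
```

Running this for both the low address and the high alias, in every page directory, should yield the same physical address before the low mapping is removed.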

Re:Having trouble with multiple address spaces

Posted: Thu Dec 04, 2003 9:12 am
by Candy
Pype.Clicker wrote: IDTR.base and GDTR.base refer to *linear* addresses -- that is, addresses after segmentation has been applied but before paging is applied.
I tend to use wrong names - bad me. I don't use segmentation in the original sense of the word; all I use segmentation for is GS as an offset into DS, so I can put some processor-specific thread data there without using weird structures for it. It speeds up context switching a bit, and makes SMP a lot better for me.

I meant, it's at virtual 0xFF101000 / 0xFF101800.
Pype.Clicker wrote: So every address space should map them at the very same place (and potentially each address space could have its own GDT by mapping *something else* at that place -- which is strongly discouraged ;)
Agreed, did that.
Pype.Clicker wrote: Now let's say you have initially loaded IDTR.base==0x1000 and that the segmentation unit makes physical address 0x1000 appear at virtual address 0xF0001000 and that paging is disabled. Everything will work fine as the 'base' information is translated directly to a physical address.

As soon as you enable paging, the physical address used by the GDT will depend on the page table entries that map linear address 0x1000. If these pages are not present, you're likely to get a reboot when you try to reload segment registers. So my suggestion would be the following:
  • setup page(0xF00xxxxx --> 0x000xxxxx), which is the translation you wish
  • setup page(0x000xxxxx --> 0x000xxxxx) so that you can still access your GDT and IDT.
  • enable paging
  • change code & data selectors to zero-based ones.
  • reload GDTR and IDTR with their 0xF0001000 counterparts
  • remove page(0x000xxxxx) mapping
Thanks for the idea, but I got this part, except for removing the page 0x00001000 mapping. AFAIK the IA32 thingies allow you to map the same physical page to multiple virtual addresses (at least, the bochs one).

Still, the problem remains. The GDTR and IDTR are both reloaded (debugger dump_cpu shows they are), both correctly loaded with new values, mapping set up properly, and still it crashes.

I have manually checked all page tables for the entries, and they are present.

As an aside, the entire kernel space region is copied (as in, the same page tables mapped in all address spaces), so they must be identical (checked this too).
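For readers following along, sharing the kernel region this way could look roughly like the sketch below: the page-directory entries that cover kernel space are copied, so every address space points at the same page tables. The function name and the `kernel_base` parameter are assumptions for illustration; the actual split used in this kernel is not stated in the thread.

```c
#include <stdint.h>
#include <string.h>

#define PDE_COUNT 1024u   /* entries in a 32-bit, non-PAE page directory */

/* Copy the directory entries covering [kernel_base, 4 GiB) from src_pd to
 * dst_pd, so both directories reference the same kernel page tables.
 * kernel_base must be 4 MiB aligned (one PDE covers 4 MiB). */
static void share_kernel_space(uint32_t *dst_pd, const uint32_t *src_pd,
                               uint32_t kernel_base)
{
    uint32_t first = kernel_base >> 22;   /* index of the first kernel PDE */
    memcpy(&dst_pd[first], &src_pd[first],
           (PDE_COUNT - first) * sizeof(uint32_t));
    /* If the last entry is a self-reference to the directory itself,
     * it would have to be set per directory after this copy. */
}
```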

The page the IDTR points to is valid, it is present, it is 100% identical to the entry making 0x1000 present, yet the CPU doesn't accept it.

I'll try to get some code here soon, but atm I'm not making it on an internet-connected box.

Help me?

Re:Having trouble with multiple address spaces

Posted: Thu Dec 04, 2003 9:13 am
by Ozguxxx
Well, I don't know if this will be useful, but taking care of the TLBs (flushing them) on an address space change might be helpful. BTW, Pype, how can the segmentation unit make 0x1000 be seen as 0xF0001000? Can it do some kind of mapping? Good luck.

Re:Having trouble with multiple address spaces

Posted: Thu Dec 04, 2003 10:49 am
by Pype.Clicker
TLBs are *automatically* flushed on address space switches. Check your manuals and you'll see that mov CR3,<whatever> actually flushes the whole TLB.

Re:Having trouble with multiple address spaces

Posted: Thu Dec 04, 2003 12:22 pm
by Tim
In fact, reloading CR3 was the only way of invalidating any TLB entry on the 386. INVLPG was introduced with the 486.

Re:Having trouble with multiple address spaces

Posted: Thu Dec 04, 2003 12:49 pm
by FlashBurn
What happens when I have the 1st MB and the kernel 1:1 mapped (my IDT is in the kernel) and my ring 3 segment starts at address 0x40000000? This means that the IDT isn't in this code segment, because I use the 1st GB for the kernel and the last 3 GB for the app. Will this work, or does the IDT have to be in the address space of my app?

Re:Having trouble with multiple address spaces

Posted: Thu Dec 04, 2003 12:54 pm
by Tim
The IDT is in the address space as long as the mappings for its pages are valid in the page directory. The CPU ignores segmentation when looking at the address in IDTR.
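To make the distinction concrete, here is a small sketch of what the segmentation unit does for ordinary memory accesses: it adds the segment base to the offset and truncates to 32 bits (limit checks are left out). `seg_to_linear` is a name made up for this example. The value in IDTR.base never goes through this addition; it is used as a linear address directly.

```c
#include <stdint.h>

/* Linear address for a memory access through a segment:
 * segment base plus offset, truncated to 32 bits.
 * Limit and privilege checks are omitted in this sketch. */
static uint32_t seg_to_linear(uint32_t seg_base, uint32_t offset)
{
    return seg_base + offset;   /* unsigned arithmetic wraps modulo 2^32 */
}
```

With a ring 3 segment based at 0x40000000, offset 0x1000 becomes linear 0x40001000, which has nothing to do with where the IDT lives. The wrap-around also suggests one way 0x1000 could be seen at 0xF0001000 without paging: with a segment base of 0x10000000 (an assumed value chosen to match those numbers), offset 0xF0001000 wraps to linear 0x00001000.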

Re:Having trouble with multiple address spaces

Posted: Sat Dec 06, 2003 6:30 am
by Candy
Found out what I did wrong...

My IDT (& GDT) were at 0xFF100000 and 0xFF100800 :(

First rule of debugging: Once you've tried all complex solutions, check the easy ones

Thanks all

Re:Having trouble with multiple address spaces

Posted: Sat Dec 06, 2003 7:50 am
by Tim
There's no reason why you should have to hard-code addresses like this. Just define your GDT and IDT as variables in your program, and the linker will decide where to put them. It will eliminate problems like this.
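A minimal sketch of that approach, assuming 256 entries per table to match the 0x7FF limits from the first post. The names `kernel_gdt`, `kernel_idt`, `table_info`, `gdt_info` and `idt_info` are invented for this example; the point is only that the base comes from the symbol and the limit from `sizeof`, so neither is typed in by hand.

```c
#include <stdint.h>

#define GDT_ENTRIES 256
#define IDT_ENTRIES 256

/* One 8-byte descriptor; the field layout is left opaque in this sketch. */
typedef uint64_t descriptor_t;

/* The linker picks the addresses of these arrays. */
static descriptor_t kernel_gdt[GDT_ENTRIES];
static descriptor_t kernel_idt[IDT_ENTRIES];

struct table_info {
    uintptr_t base;   /* address chosen by the linker */
    uint16_t  limit;  /* size in bytes, minus one */
};

static struct table_info gdt_info(void)
{
    struct table_info t = { (uintptr_t)kernel_gdt,
                            (uint16_t)(sizeof kernel_gdt - 1) };
    return t;
}

static struct table_info idt_info(void)
{
    struct table_info t = { (uintptr_t)kernel_idt,
                            (uint16_t)(sizeof kernel_idt - 1) };
    return t;
}
```

The values returned here are what would be handed to LGDT/LIDT, provided the kernel is linked at the virtual address it actually runs at.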

Re:Having trouble with multiple address spaces

Posted: Sat Dec 06, 2003 3:39 pm
by FlashBurn
But what if you want to add more entries to your GDT/IDT? For that you need a bigger memory area.

Re:Having trouble with multiple address spaces

Posted: Sat Dec 06, 2003 5:05 pm
by Tim
malloc
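Spelled out, that one-word answer might look something like the sketch below: allocate a larger table, copy the existing descriptors, then reload GDTR and free the old table. The hosted `calloc` stands in for whatever allocator the kernel provides, and `grow_gdt` is a hypothetical name, so this is only an illustration of the idea.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* The GDTR limit is 16 bits, so a GDT holds at most 8192 descriptors. */
#define GDT_MAX_ENTRIES 8192u

/* Return a new, zero-padded table holding the old descriptors, or NULL
 * on failure. The caller would reload GDTR with the new base and limit
 * and only then free the old table. */
static uint64_t *grow_gdt(const uint64_t *old, size_t old_entries,
                          size_t new_entries)
{
    if (new_entries < old_entries || new_entries > GDT_MAX_ENTRIES)
        return NULL;

    uint64_t *fresh = calloc(new_entries, sizeof *fresh);
    if (fresh == NULL)
        return NULL;

    if (old_entries > 0)
        memcpy(fresh, old, old_entries * sizeof *old);
    return fresh;
}
```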

Re:Having trouble with multiple address spaces

Posted: Sun Dec 07, 2003 2:31 pm
by Candy
I chose to use static allocation for a number of structures in my OS which

- Never need to grow (beyond a certain size)
- I want to be able to check without indirection
- I would like to have a static address

For me, there is no reason not to do it. It simplifies a lot of administrative things, such as determining what a certain thread is, which number is next free, etc.

The things I want to put in a static location are the thread & process tables (both have a static maximum which is reasonable in terms of address space allocation), the IDT, GDT, BDA, EBDA, and the HWINT table (which are all small and constant in size, and for which dynamic allocation would only be excessive overhead, since they never grow and are always needed), and the processor-dependent stuff (such as the XH table, page fault handler lists, hash table, current thread, current process, current anything).

Re:Having trouble with multiple address spaces

Posted: Sun Dec 07, 2003 4:27 pm
by Tim
What do you mean by 'static allocation'? As a C programmer, to me that means module-level variables, or function-level variables with the static keyword. There is utterly no point hard-coding addresses for what could be arrays defined in code. Coming up with your own addresses is very error-prone as you have to check addresses and lengths with pencil and paper. Defining them as variables moves that task to the linker -- and isn't that what the linker is for, anyway?

Mobius never grows its GDT or IDT, so it doesn't use malloc, and there are no special alignment requirements on the GDT and IDT, so it is appropriate to stick them in the program as arrays and have the linker figure out what address they have.

There are only a few places in the Mobius source code where I've had to come up with full addresses myself for pointers.
  • Module and kernel base addresses: the places in memory where EXEs, DLLs, drivers and the kernel start. I keep track of these in a text file and I have a Perl script to look them up and pass them to ld.
  • Kernel heap. Clearly this changes in size, and can't be defined by malloc, because it is malloc. This is between E000_0000 and F000_0000.
  • Kernel stacks. For stability, I don't allocate these on the heap, because buffer overruns of heap allocations could wipe out a thread's vital structures. These grow downwards between D000_0000 and E000_0000.
  • Temporary mapping area. This is an internal part of the physical memory manager used for temporarily mapping physical pages when some part of the kernel needs to access them. This is between F000_0000 and FFC0_0000.
  • Current page tables and page directory. These appear in the top 4MB of each address space because of the way they are mapped.
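The ranges listed above can be written down as a small lookup, shown in the sketch below. The second half assumes the common self-referencing trick, where the last page-directory entry points back at the directory itself; that would explain why the page tables show up in the top 4MB, but it is an assumption here, not something confirmed for Mobius. All names are invented for this example.

```c
#include <stdint.h>

enum kregion { KR_OTHER, KR_STACKS, KR_HEAP, KR_TEMP_MAP, KR_PAGE_TABLES };

/* Classify an address by the kernel ranges listed above. */
static enum kregion classify(uint32_t addr)
{
    if (addr >= 0xFFC00000u) return KR_PAGE_TABLES;  /* top 4MB */
    if (addr >= 0xF0000000u) return KR_TEMP_MAP;     /* F000_0000..FFC0_0000 */
    if (addr >= 0xE0000000u) return KR_HEAP;         /* E000_0000..F000_0000 */
    if (addr >= 0xD0000000u) return KR_STACKS;       /* D000_0000..E000_0000 */
    return KR_OTHER;
}

/* Assuming the last PDE maps the page directory itself:
 * virtual address of the PTE that maps `linear`. */
static uint32_t pte_vaddr(uint32_t linear)
{
    return 0xFFC00000u + (linear >> 12) * 4u;
}

/* Under the same assumption: virtual address of the PDE covering `linear`. */
static uint32_t pde_vaddr(uint32_t linear)
{
    return 0xFFFFF000u + (linear >> 22) * 4u;
}
```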

Re:Having trouble with multiple address spaces

Posted: Sun Dec 07, 2003 4:46 pm
by Candy
I've also thought about moving them to the linker, but that doesn't rule out not doing it right now. I made my own malloc / free set and I don't trust them 200% right now, so I want to be able to develop the rest without worrying about malloc bugs too much.

I still don't know how you do your thread data. I was planning on using a flat array for speed, with links that may or may not be used for priority queueing. Using a flat array with malloc implies that it is a certain size and can only grow if you copy the first part. I try to avoid that (certainly because I can spare the address space for it), and it keeps everything nicely naturally-aligned. Also, unused entries do not waste space, which they might if you do not page out pages yet. I just don't see how you can do that efficiently without static allocation.
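A flat table along these lines could be sketched as follows. The cap of 1024 threads, the field names, and the helper functions are all assumptions made for this example; the idea is that the thread id is simply the array index, so lookup needs no indirection and finding the next free number is a scan.

```c
#include <stdint.h>
#include <stddef.h>

#define MAX_THREADS 1024   /* hypothetical static maximum */

struct thread {
    uint8_t in_use;
    uint8_t priority;
    int32_t next;      /* index of the next thread in a queue, or -1 */
};

/* Fixed address, fixed size; pages never touched need no backing memory
 * if the kernel only maps them on demand. */
static struct thread thread_table[MAX_THREADS];

/* The thread id is the array index. */
static struct thread *thread_by_id(uint32_t tid)
{
    if (tid < MAX_THREADS && thread_table[tid].in_use)
        return &thread_table[tid];
    return NULL;
}

/* Claim the lowest free slot; returns the new id, or -1 if full. */
static int32_t thread_alloc(uint8_t priority)
{
    for (int32_t i = 0; i < MAX_THREADS; i++) {
        if (!thread_table[i].in_use) {
            thread_table[i].in_use = 1;
            thread_table[i].priority = priority;
            thread_table[i].next = -1;
            return i;
        }
    }
    return -1;
}

static void thread_free(uint32_t tid)
{
    if (tid < MAX_THREADS)
        thread_table[tid].in_use = 0;
}
```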

IMO, static allocation means that you allocate a bit of address space to a certain bit of data (allocation), without allowing it to be deallocated (static).

How do you allocate kernel stacks? Do you do that in a heap-like way, page-wise, do you allocate address space or pages, and do you use guard pages at the start/end?

And, something for my future, do you allocate user-space stacks in the same way?