Re:Microkernels

Posted: Thu Jan 06, 2005 2:57 pm
by distantvoices
@brendan: *rofl* I haven't expressed myself very clearly, obviously.

I do a hybrid microkernel, say: services like the process manager, file manager and GUI manager are processes of their own, while device drivers are *threads* running in kernel space - they are present in each and every address space but are in general accessed only by the services - which means one needs some way to transfer data from one address space to another.

@balroj: we are talking about means to transfer information between processes. In a microkernel this happens most of the time via small fixed-size messages. Gonna rant more later on or tomorrow. I'm debugging my floppy driver at the moment and don't really have a hankering for lengthy explanations. Please bear with me for the moment.

Re:Microkernels

Posted: Fri Jan 07, 2005 3:57 am
by Brendan
Hi,
Balroj wrote:
There's nothing wrong with message passing to/from ring0 code if the device drivers have their own address space. If the device drivers are mapped into all address spaces, then use far calls or software interrupts instead of IPC (it'd be much faster).
I couldn't understand it very well - can you explain it a little bit? Thanks, and sorry.
If device drivers have their own address spaces, then (as far as IPC is concerned) there's no difference between a device driver and any other process, so device drivers could use the same/normal IPC. In this case it really doesn't matter (to the kernel's IPC code) if the driver/process is running in ring0 or not. This is similar to my current kernel (except device drivers run in ring3, but that makes little difference).

If the device drivers are mapped into all address spaces (like the kernel usually is), then there's no real point bothering with the extra overhead of IPC - the device drivers can use IRQs directly (e.g. IDT descriptor/s pointing directly at device driver code), and software interrupts and/or call gates (the same as the kernel can). The device drivers could also be called by the kernel using a far call (or even a near call in some cases). This is faster because there's less overhead than IPC. The only problem here is that you have to be careful with re-entrancy - i.e. N threads and M IRQs could be trying to use the device driver at the same time (with message passing you only handle one message at a time and don't have to worry about re-entrancy). In this case the device driver runs in the context of the thread/process that called it (and/or the thread/process it interrupted).

There's nothing to prevent the device driver from running as a thread/process too (so that it runs in its own context rather than in the context of other threads/processes), but then the device driver/s would need to be split - a part that runs in any context and the rest running in its own context, where the former provides re-entrancy protection for the latter. This is how my earliest kernels did things, with "cli;sti" used for "quick & dirty" re-entrancy protection.
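
A minimal sketch of that "quick & dirty" approach - the part that runs in any context uses "cli;sti" to guard a request queue, which the part running in its own context drains later. All the names (fdc_submit, fdc_queue) are made up for illustration, and note that "cli;sti" is only enough on a single-CPU machine:

Code: Select all

/* front half: far-called from any thread's context, so N threads and
 * M IRQs may enter at once - disable interrupts around the shared
 * queue update (UP only; SMP would need a real lock) */
#include <stdint.h>

#define QUEUE_SIZE 32

struct fdc_request { uint32_t lba; void *buffer; };

static struct fdc_request fdc_queue[QUEUE_SIZE];
static volatile int queue_head = 0, queue_tail = 0;

int fdc_submit(uint32_t lba, void *buffer)
{
    int ok = 0;
    asm volatile ("cli");                     /* quick & dirty protection */
    int next = (queue_tail + 1) % QUEUE_SIZE;
    if (next != queue_head) {                 /* queue not full */
        fdc_queue[queue_tail].lba = lba;
        fdc_queue[queue_tail].buffer = buffer;
        queue_tail = next;
        ok = 1;
    }
    asm volatile ("sti");
    return ok;   /* the back half (driver's own context) drains the queue */
}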

If the device drivers run as threads/processes and are mapped into all address spaces, then they can still use messaging if (and only if) the messaging code uses message queues. In this case device drivers run in their own context, and there are no problems with re-entrancy, etc. This is how my kernels did things around 5 years ago. I even had a hybrid thing where the message could be passed to the device driver via its message queue, or by far call if the device driver had its "re-entrant" flag set.
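
As a rough illustration of that hybrid delivery (hypothetical names, not the actual code):

Code: Select all

/* deliver a message by direct call if the driver declared itself
 * re-entrant, otherwise append it to the driver's message queue */
#include <stdint.h>

#define DRIVER_REENTRANT 0x01

struct message { uint32_t type; uint32_t data; };

struct driver {
    uint32_t flags;
    void   (*entry)(struct message *m);   /* runs in the sender's context */
    int    (*enqueue)(struct message *m); /* handled later in own context */
};

void send_to_driver(struct driver *drv, struct message *m)
{
    if (drv->flags & DRIVER_REENTRANT)
        drv->entry(m);       /* cheap path: no thread switch */
    else
        drv->enqueue(m);     /* queued; the driver thread picks it up */
}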

Of course there's probably heaps of combinations and permutations I've missed (e.g. all device drivers sharing an address space).


Cheers,

Brendan

Re:Microkernels

Posted: Fri Jan 07, 2005 4:27 am
by distantvoices
Big trouble shows up if you have a device driver in kernel land which only fetches blocks from a device and passes them on to a buffer in a service process - if you don't take care that the address spaces remain the same, you're likely to overwrite crucial data in another process (a task switch happens while the floppy driver waits for its IRQ) and pof - the system is gone.

I will sooner or later replace my current async message passing stuff with a synchronous approach - say, a rendezvous approach - to make sure that too many messages sent away by one task don't eat up all the memory. Gonna be interesting to do this in a paged environment. Maybe I'll keep a buffer in the TCB structure. Gonna have a look. All the slot allocating in the message passer slows things down a bit after all, despite some optimizations I've done. For the async stuff, I already have events at hand.
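
Roughly what I have in mind - just a sketch, with made-up scheduler helpers (block/unblock) and the necessary interrupt disabling left out:

Code: Select all

/* rendezvous: the sender blocks until the receiver has taken the
 * message, so unread messages can never pile up and eat memory */
struct message { unsigned type, data; };

struct tcb {
    struct message msg_buf;        /* one message slot kept in the TCB */
    struct tcb *waiting_sender;    /* sender blocked on us, if any */
};

/* assumed scheduler hooks - illustrative, not real APIs */
extern struct tcb *current_thread(void);
extern void block(void);           /* deschedule the current thread */
extern void unblock(struct tcb *t);

void send(struct tcb *dest, struct message *m)
{
    dest->msg_buf = *m;
    dest->waiting_sender = current_thread();
    unblock(dest);   /* receiver may be sleeping in receive() */
    block();         /* sleep until the receiver copies the message */
}

struct message receive(void)
{
    struct tcb *self = current_thread();
    while (self->waiting_sender == 0)
        block();                       /* nothing to take yet */
    struct message m = self->msg_buf;  /* safe: sender is still blocked */
    unblock(self->waiting_sender);     /* rendezvous complete */
    self->waiting_sender = 0;
    return m;
}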

@brendan: You are using variable-sized messages? That sounds interesting.

Re:Microkernels

Posted: Fri Jan 07, 2005 5:13 am
by Brendan
Hi,
beyond infinity wrote: Big trouble shows up if you have a device driver in kernel land which only fetches blocks from a device and passes them on to a buffer in a service process - if you don't take care that the address spaces remain the same, you're likely to overwrite crucial data in another process (a task switch happens while the floppy driver waits for its IRQ) and pof - the system is gone.
This is what "bounce buffers" are for - the floppy loads the data into it's own private buffer and when it's all there it sends it to the thread as a message (or something). For floppy drivers there's (normally) little choice because it (normally) uses the ISA DMA, which is incapable of transferring data to anywhere you like (DMA buffer must be below 16 Mb, contiguous, less than 64 KB and can't cross a 64 Kb boundary). It's possible to use PIO instead, but you need to waste huge amounts of CPU time waiting for floppy data and it makes the OS look like crap (e.g. try formatting a floppy while editing a text document on windows 95/98).

For things like ethernet cards, video, etc., bounce buffers do add a fair bit of overhead though. Whether the added security (and design consistency) is worth the extra overhead or not is up to you :-).
beyond infinity wrote: I will sooner or later replace my current async message passing stuff with a synchronous approach - say, a rendezvous approach - to make sure that too many messages sent away by one task don't eat up all the memory. Gonna be interesting to do this in a paged environment. Maybe I'll keep a buffer in the TCB structure. Gonna have a look. All the slot allocating in the message passer slows things down a bit after all, despite some optimizations I've done. For the async stuff, I already have events at hand.

@brendan: You are using variable-sized messages? That sounds interesting.
I'm using asynchronous variable-sized messages with message queues, designed to allow messages up to 32 MB in size, reduce the number of thread switches and honour thread priorities. On top of this there's a hack to allow synchronous messaging to use the same message queues (mainly for swap space and memory mapped file support, but also so it's possible to implement a standard C library for legacy software).
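
Just to give the general shape of such a thing, a guess at one possible layout (not my actual structures):

Code: Select all

#include <stdint.h>

#define MAX_MESSAGE_SIZE (32 * 1024 * 1024)  /* the 32 MB cap */

/* each message carries its own length, so the receiver can walk the
 * queue by hopping "length" bytes at a time */
struct message_header {
    uint32_t length;      /* total size in bytes, header included */
    uint32_t type;
    uint32_t sender_id;
    /* (length - sizeof(struct message_header)) bytes of payload follow */
};

struct message_queue {
    uint8_t *buffer;      /* heap area holding the queued messages */
    uint32_t capacity;
    uint32_t head, tail;  /* byte offsets into buffer */
};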


Cheers,

Brendan

Re:Microkernels

Posted: Sun Jan 09, 2005 6:27 am
by Balroj
Thanks for all.

What exactly is mapping device drivers into all address spaces? Right now I've got an address space of its own for the device drivers. What you said sounds interesting, but I don't know exactly what it is/means.

Re:Microkernels

Posted: Sun Jan 09, 2005 8:55 am
by Brendan
Hi,
Balroj wrote: What exactly is mapping device drivers into all address spaces? Right now I've got an address space of its own for the device drivers. What you said sounds interesting, but I don't know exactly what it is/means.

OK, some basics first. Modern computers generally support "virtual address spaces": there are physical addresses that correspond directly to actual hardware addresses (RAM, ROM, etc.) and virtual addresses that correspond to whatever you like. When you're designing an OS, the first decisions made are normally whether the OS will use these virtual address spaces (almost all do) and then what will be in each virtual address space. Generally you'd need to decide where (at least) 3 different types of things go - the kernel, the processes' memory, and the device drivers. Each "type of thing" may be mapped into all address spaces, or exist in only one (or some) address spaces.

A typical monolithic OS would have the kernel and device drivers mapped into one part of all address spaces, while the remaining space is used for processes - a different process in each address space.

A typical microkernel OS would have the kernel mapped into one part of all address spaces, while the remaining space is used for processes (a different process in each address space). Here device drivers are implemented as processes, such that each device driver has its own address space (with the kernel mapped into part of it).

Another variation would be an exo-kernel, where nothing is really mapped into all address spaces. In this case each process contains its own kernel (and possibly device driver code), which is added to the process's code in the form of a library.

There's also a heap of alternatives - e.g. you could have a special "device driver only" address space, or you could use one address space for all processes, or have device drivers in physical memory (i.e. not in any virtual address space).

Now some of the possible choices come with "technical difficulties" that would need to be resolved. For example, different pieces of hardware usually need a way to tell the corresponding device drivers that they need attention. On the hardware side this is done with IRQs (Interrupt ReQuests). A hardware generated IRQ is received by the CPU, the CPU looks in the IDT (or IVT in real mode) to determine where the IRQ handler code is, and then the CPU runs this IRQ handling code (interrupting whatever was running at the time).

Now, if device drivers are mapped to the same addresses in all virtual address spaces, the CPU/IDT can be configured so that the correct device driver code is started directly by the CPU (which is the fastest possible method, as there's no intermediate code/extra overhead).
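
For example, installing a driver's handler straight into the IDT could look roughly like this (standard 80x86 interrupt gate layout; the driver/handler names are invented, and loading the IDTR is omitted):

Code: Select all

#include <stdint.h>

struct idt_entry {
    uint16_t offset_low;    /* handler address, bits 0..15 */
    uint16_t selector;      /* code segment the handler runs in */
    uint8_t  zero;
    uint8_t  type_attr;     /* 0x8E = present, ring 0, 32-bit int gate */
    uint16_t offset_high;   /* handler address, bits 16..31 */
} __attribute__((packed));

static struct idt_entry idt[256];

static void set_idt_gate(int vector, void (*handler)(void))
{
    uint32_t addr = (uint32_t)handler;
    idt[vector].offset_low  = addr & 0xFFFF;
    idt[vector].selector    = 0x08;   /* kernel code segment */
    idt[vector].zero        = 0;
    idt[vector].type_attr   = 0x8E;
    idt[vector].offset_high = addr >> 16;
}

/* the driver's IRQ stub - mapped at the same address in every
 * address space, so the CPU can jump to it directly */
extern void floppy_irq_entry(void);

void install_floppy_handler(void)
{
    set_idt_gate(0x26, floppy_irq_entry); /* IRQ6, with PIC remapped to 0x20 */
}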

If this is not the case (e.g. device drivers aren't mapped to the same addresses in all virtual address spaces), then the OS will need some way of receiving the IRQ and passing it on to wherever the correct device driver/s are. For my OS the kernel is mapped into all address spaces, but device drivers are implemented as processes and have their own address spaces. Therefore my kernel contains IRQ handling code that receives the IRQ and sends messages to the corresponding device drivers.
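
A rough sketch of that forwarding (send_message() and the port table are invented for illustration):

Code: Select all

#include <stdint.h>

#define MSG_IRQ 1

/* assumed kernel messaging primitive */
extern int send_message(uint32_t dest_port, uint32_t type, uint32_t data);

static uint32_t irq_owner[16];  /* message port of the driver for each IRQ */

/* the kernel owns the IDT entry and just turns the IRQ into a message */
void kernel_irq_handler(int irq)
{
    if (irq_owner[irq] != 0)
        send_message(irq_owner[irq], MSG_IRQ, (uint32_t)irq);
    /* sending EOI to the PIC would go here */
}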

If you put all device drivers into a single special purpose address space, then you'll also need some way of notifying the device driver/s when an IRQ is received (you can't get the CPU/IDT to run the device driver's IRQ handling code directly because the CPU may be using a completely different address space when the IRQ is received).


Cheers,

Brendan

Re:Microkernels

Posted: Sun Jan 09, 2005 9:48 am
by FlashBurn
Another thing concerning microkernels: what malloc version do you use? Because when you do a microkernel, this also means you will have many threads within one process - or am I wrong? - and when more than one thread wants some memory, they have to wait if you use a normal malloc. There are some special mallocs for multithreading, like mtmalloc, ptmalloc or hoard.

Re:Microkernels

Posted: Sun Jan 09, 2005 11:08 am
by Brendan
Hi,
FlashBurn wrote: Another thing concerning microkernels: what malloc version do you use? Because when you do a microkernel, this also means you will have many threads within one process - or am I wrong? - and when more than one thread wants some memory, they have to wait if you use a normal malloc. There are some special mallocs for multithreading, like mtmalloc, ptmalloc or hoard.
I don't know about everyone else, but I have my own versions of malloc etc. built into the kernel (I write my own code in general). My OS also has "process space" and "thread space", so I've got different kernel API functions for malloc/calloc/free for each area - e.g. "malloc_process()" and "malloc_thread()".

The versions for thread space are quicker because they don't have to lock the heap management data structures (or wait if they're already locked).
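
To illustrate the difference (heap internals elided; all names are illustrative):

Code: Select all

#include <stddef.h>

struct heap;                                   /* free-list bookkeeping, elided */
extern void *heap_alloc(struct heap *h, size_t size);

extern struct heap   process_heap;             /* shared by all threads */
extern volatile int  process_heap_lock;
extern void spinlock_acquire(volatile int *lock);
extern void spinlock_release(volatile int *lock);
extern struct heap *current_thread_heap(void); /* this thread's private heap */

/* process space is shared, so allocations must take the heap lock
 * (and possibly wait for other threads) */
void *malloc_process(size_t size)
{
    spinlock_acquire(&process_heap_lock);
    void *p = heap_alloc(&process_heap, size);
    spinlock_release(&process_heap_lock);
    return p;
}

/* thread space is only ever touched by its own thread, so there's
 * nothing to lock and nothing to wait for */
void *malloc_thread(size_t size)
{
    return heap_alloc(current_thread_heap(), size);
}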

Note: This isn't a "micro-kernel only" thing (the type of kernel doesn't have much effect on how processes, threads and memory/address spaces are implemented).


Cheers,

Brendan

Re:Microkernels

Posted: Mon Jan 10, 2005 4:16 am
by Balroj
A typical microkernel OS would have the kernel mapped into one part of all address spaces, while the remaining space is used for processes (a different process in each address space). Here device drivers are implemented as processes, such that each device driver has its own address space (with the kernel mapped into part of it).
Well, so the kernel is mapped into part of the device driver's address space, as some kind of lib?
For my OS the kernel is mapped into all address spaces,
Sorry again, but what do you mean by "all address spaces"?

Re:Microkernels

Posted: Mon Jan 10, 2005 5:13 am
by distantvoices
@balroj:

what we mean by an address space is the virtualization/management of an area of memory which belongs to one process - or is shared, whatever you like.

in other words: address space = 1 page directory.

let's do it the 'graphics' way:

say you create a process - say it is the program ls we are about to run:

the kernel sets off and creates an address space - with an allocation tree, which keeps track of all the allocated regions (what physical memory lies behind them is irrelevant when dealing with virtual address spaces). During this process, the kernel takes and clears out a memory area of 4 KB - that's the page directory, which is loaded into the processor's cr3 register.

Then the kernel enters the page tables which belong to the kernel, say, starting at 0xc0000000 virtual. A virtual address is translated via a set of offsets - one into the page directory, one into the page table, and the rest is the offset into the page itself.
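
For standard i386 4 KB paging the split looks like this (just illustrating the offsets):

Code: Select all

#include <stdint.h>

#define PDE_INDEX(va)   (((uint32_t)(va) >> 22) & 0x3FF) /* top 10 bits  */
#define PTE_INDEX(va)   (((uint32_t)(va) >> 12) & 0x3FF) /* next 10 bits */
#define PAGE_OFFSET(va) ((uint32_t)(va) & 0xFFF)         /* low 12 bits  */

/* e.g. 0xc0000000: PDE_INDEX = 768, PTE_INDEX = 0, PAGE_OFFSET = 0 */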

Ok, this is what a virtual address space looks like, in a sketchy way:

-----> 0x00000000 -> zero page - to catch NULL pointer references
-----> 0x00001000 -> start of the process image
-----> 0x0009ffff -> end of the process image
-----> 0xa0000000 -> start of the heap
-----> 0xa2ffffff -> current end of the heap
-----> 0xb0000000 -> stack bottom
-----> 0xb001ffff -> stack top
-----> 0xc0000000 -> start of the mapped-in kernel image
-----> 0xffc00000 - 0xffffffff -> page directory and page table management area

mark: for each virtual allocation, the address space management deals out a set of physical memory areas - memory which actually exists: say the 4 KB chunk at 0x1000-0x1fff virtual represents the physical memory from 0x34000-0x34fff.

mark 2: on i386 the address space management can also be done with segments - but that's kinda complicated and adds more overhead than necessary to kernel programming - and to the development of user level programs.

To make things a bit funnier: imagine you have a lot of these programs running on your system: you have an equal number of page directories - and at each switch of *processes* the cr3 register is loaded with the new page directory, so the processor knows which address space it is operating in.

To explain the thing with the allocation tree: we use a dynamic data structure to manage chunks of allocated memory. If a process (program) needs more memory (or gives memory back), it issues a call: "gimme more memory from xxxx with limit yyyy". The virtual memory manager checks back: valid allocation and enough memory available? Then it just adds an element to the allocation tree, without actually mapping anything into the page directory.

As soon as the process dares to access an address area with no physical memory behind it, it traps into the page fault handler, which checks: memory allocated? If yes, get a chunk of 4 KB and map it in; if no, kill the process - with a nice message (seg fault).
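
In sketchy C, that fault path looks about like this (alloc_tree_contains and the other helpers are stand-in names):

Code: Select all

#include <stdint.h>

extern int      alloc_tree_contains(uint32_t vaddr); /* allocation tree lookup */
extern uint32_t alloc_phys_page(void);               /* grab a free 4 KB frame */
extern void     map_page(uint32_t vaddr, uint32_t paddr);
extern void     kill_process(const char *msg);

void page_fault_handler(uint32_t fault_addr)
{
    if (alloc_tree_contains(fault_addr)) {
        /* allocated but not yet backed: map in a fresh 4 KB chunk */
        map_page(fault_addr & ~0xFFFu, alloc_phys_page());
    } else {
        /* never allocated: the process touched memory it doesn't own */
        kill_process("seg fault");
    }
}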

this is valid for each and every process you ever create on your system. ;-)

Hope this clears it up a bit

Re:Microkernels

Posted: Mon Jan 10, 2005 5:55 am
by Brendan
Hi,
Balroj wrote:
A typical microkernel OS would have the kernel mapped into one part of all address spaces, while the remaining space is used for processes (a different process in each address space). Here device drivers are implemented as processes, such that each device driver has its own address space (with the kernel mapped into part of it).
Well, so the kernel is mapped into part of the device driver's address space, as some kind of lib?
For my OS the kernel is mapped into all address spaces,
Sorry again, but what do you mean by "all address spaces"?

On 80x86 CPUs (and almost all other modern CPUs) there's the physical address space, which does exist and is what you use in real mode, and then there are virtual address spaces. Virtual address spaces are what applications use on modern OSs - they don't actually exist, but the CPU and OS create the illusion that they do. This illusion allows multiple processes to be run, where each process uses the same range of addresses. It also allows for better protection, because software can only access things that are in its virtual address space.

The illusion is created with paging data structures (page directories, page tables and pages) that are used to convert the virtual/fake addresses used by applications/processes into actual physical addresses. On 80x86 this is normally done with 4 KB pages, where any 4 KB area in a virtual address space is mapped to any 4 KB area of the physical address space. You can also map the same 4 KB of physical memory into more than one virtual address space. When I say "mapping the kernel into all address spaces", I mean that the physical memory that actually stores the kernel is mapped into every virtual address space, so that the kernel looks the same regardless of which virtual address space the CPU is using.
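
One common way to do this on 80x86 is to make every page directory's kernel entries point at the same kernel page tables, so those virtual addresses resolve identically in every address space. A sketch with illustrative names:

Code: Select all

#include <stdint.h>

#define PD_ENTRIES  1024
#define KERNEL_BASE 0xC0000000           /* kernel at 3 GB, say */
#define KERNEL_PDE  (KERNEL_BASE >> 22)  /* first kernel PD slot: 768 */

extern uint32_t kernel_page_dir[PD_ENTRIES];   /* the master copy */

/* called whenever a new address space (page directory) is created */
void init_address_space(uint32_t *new_page_dir)
{
    int i;
    for (i = 0; i < KERNEL_PDE; i++)
        new_page_dir[i] = 0;                   /* user part starts empty */
    for (i = KERNEL_PDE; i < PD_ENTRIES; i++)
        new_page_dir[i] = kernel_page_dir[i];  /* shared kernel part */
}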

Now, imagine you've got 6 processes labelled A, B, C, D, E and F, where each process lives in its "own" address space. You've also got the kernel's space, labelled X, mapped into the top of all address spaces (so a specific part of all address spaces is identical). The 6 address spaces end up looking like this:

Code: Select all

 4 GB-> X X X X X X
 3 GB-> X X X X X X
 2 GB-> A B C D E F
 1 GB-> A B C D E F
We also need somewhere to put device drivers. If they go in all address spaces (like the kernel), you could end up with each address space looking like this (with # representing space for all device drivers):

Code: Select all

 4 GB-> X X X X X X
 3 GB-> # # # # # #
 2 GB-> A B C D E F
 1 GB-> A B C D E F
Alternatively you could have all device drivers in a special "device driver only" address space, which would end up looking like this:

Code: Select all

 4 GB-> X X X X X X X
 3 GB-> X X X X X X X
 2 GB-> A B C D E F #
 1 GB-> A B C D E F #
For my OS, there's no real difference between device drivers and other processes. Each device driver has its own address space (they don't share). If there are 4 device drivers labelled 0, 1, 2 and 3, then my OS would end up looking like this:

Code: Select all

 4 GB-> X X X X X X X X X X
 3 GB-> X X X X X X X X X X
 2 GB-> A B C D E F 0 1 2 3
 1 GB-> A B C D E F 0 1 2 3

Cheers,

Brendan

Re:Microkernels

Posted: Sun Jan 16, 2005 3:40 pm
by Balroj
First of all,

Thanks both of you beyond infinity and brendan for the patience and the answers.

If I don't misunderstand, what you mean by "mapped into all address spaces" is that for each process/app (with its own page dir) the kernel is present, is it?

For brendan models:

Code: Select all

 4 GB-> X X X X X X X X X X
 3 GB-> X X X X X X X X X X
 2 GB-> A B C D E F 0 1 2 3
 1 GB-> A B C D E F 0 1 2 3
Device drivers (0,...,3), like any process, will only see themselves and the kernel (with its restrictions), and in the other model:

Code: Select all

 4 GB-> X X X X X X
 3 GB-> # # # # # #
 2 GB-> A B C D E F
 1 GB-> A B C D E F
every process will see the device drivers and the kernel, is it?

Thanks for all.