(candy) modules & drivers.
- Pype.Clicker
(candy) modules & drivers.
Initial post in another thread (leaving undefined references by LD ...) by Candy.
Re:(candy) modules & drivers.
Adek336 wrote: Candy: I see you have designed your method in every aspect!
No way I didn't. There are still a lot of things about it that I haven't worked out yet; it's just that I've thought it out to the point where it is now. And as far as my module system goes, it's sufficient for now. As Tim would put it, it's a design made by somebody who hasn't done anything like this before, so it's probably wrong.
Adek336 wrote: At this moment I'll be trying to make it as simple as possible, but I will think about all those nifty features. What do you think of a system where all modules are in kernel space and none in user space? Something like Linux; I don't like the microkernel design very much.
Must admit that I never thought "Hey, let's make a microkernel", just like you. The concept of message passing, including making the kernel a pure message-passer (Amoeba etc.), seems like an oversimplification of the kernel: it might look nice, and if you optimize it like **** it might be faster than a normal monolithic kernel (as they call 'em), but I don't see any reason to do so. Or, as Einstein put it, make it as simple as it can be, but no simpler (quoted in Tanenbaum's book, on the page where he describes Amoeba).
My idea of as simple as it can be is a kernel that understands the entire hardware layer; that is, hardware abstracted up to the level where you can write a driver for a piece of hardware without worrying about memory problems, paging or scheduling, for instance. Timers also belong to this bottom layer, if only because some other core bits need them. The central piece of the kernel that is loaded at bootup is called the Core.
The Core is loaded alongside a relocatable module (just a .o file) that contains the drivers required to get it up to self-loading strength. These include the FDD driver, the FDC driver, the old DMA driver (it's not strictly non-hardware, but it can be separated from the Core without damaging core functionality) and, for instance, the FAT12 driver. Since they are loaded without any additional help, they have to be in memory already.
After this loading phase, the memory allocator frees all init memory (still to be implemented; I just found out how thanks to Chris Giese) and then starts loading modules that either have no objects they depend on, or whose dependencies are already present. It loads only the probe section (.probe.text, .probe.data and .probe.bss), which can do nothing but probe for the device. If the probe returns success, the rest of the module is loaded and its init is called; then init and probe are removed again (so you can pass detected info from the probe on to the rest of the module).
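As a minimal sketch (not the actual code; the driver names, functions and return convention are invented), a driver could place its detection code and data into those probe sections with gcc section attributes, roughly like this:

```c
/* Hypothetical sketch of probe-section placement using gcc attributes. */
#define PROBE_TEXT __attribute__((section(".probe.text")))
#define PROBE_DATA __attribute__((section(".probe.data")))

/* Probe-only data: candidate base ports to test. */
static const int fdc_candidates[] PROBE_DATA = { 0x3F0, 0x370 };

/* Probe-only code: thrown away again once init has run.  Returns the detected
 * base port (or -1), which the loader can hand on to fdc_init(). */
PROBE_TEXT int fdc_probe(void)
{
    for (unsigned i = 0; i < sizeof fdc_candidates / sizeof fdc_candidates[0]; i++) {
        (void)fdc_candidates[i];   /* ...poke the controller here (omitted)... */
    }
    return -1;                     /* "nothing found" in this sketch */
}

/* Resident code in the normal .text, only loaded after a successful probe. */
int fdc_init(int io_base)
{
    (void)io_base;                 /* detected info handed over by the loader */
    return 0;
}
```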
As for the modules that run in kernel space: 100% good idea. I was thinking of going purely kernel myself, but then I thought about two things.
1. I am not going to write all the drivers myself (I hope), and I don't trust anybody else's programming skills.
2. I myself am only human, so I too am going to make buggy drivers.
Thinking about this, it would be best if you relocated every module before running it, and then according to user settings decided whether it would be kernel mode or user mode. The modules themselves can indicate through a BIGMODULE bit that they are big @$$ (a big module or a big dataset), so they need their own address space. The kernel/user bit is fully decided by the user (not presented as such, but as a speed-versus-safety switch; users will probably be intelligent enough to see that if the system crashes, they have to set some things back to safety).
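As a rough sketch of how that hint could be carried (struct layout, field names and flag value are invented; only the BIGMODULE bit and the user-controlled speed/safety switch come from the post):

```c
#include <stdint.h>

#define MOD_FLAG_BIGMODULE 0x0001   /* big code or data set: wants its own address space */

struct module_header {
    uint32_t magic;        /* identifies the file as a module */
    uint32_t flags;        /* MOD_FLAG_* bits, set by the module author */
};

enum load_domain { LOAD_KERNEL, LOAD_USER };

/* The kernel/user decision comes from the user's speed-vs-safety setting,
 * not from the module itself. */
static enum load_domain pick_domain(int user_wants_safety)
{
    return user_wants_safety ? LOAD_USER : LOAD_KERNEL;
}

/* BIGMODULE only says "give me my own address space", independent of domain. */
static int needs_own_address_space(const struct module_header *h)
{
    return (h->flags & MOD_FLAG_BIGMODULE) != 0;
}
```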
IMHO this combines the advantages of microkernel with the advantages of monolithic systems (as in, you can call everything you want, but the kernel is still safe from shitty drivers).
And here is a point that I haven't thought out yet: how do you let a user-space driver communicate quickly with a kernel-space driver? Off to the thinking board again then...
Having been to the thinking board again, this is an actual edit of that post.
I was thinking about linking the modules through a high-speed kernel call: something that all modules call to call other modules. Still wondering about better ways, since this would require all kernel-only modules to communicate through that path too, turning it into a lag point.
- Pype.Clicker
Re:(candy) modules & drivers.
Candy wrote: After this loading phase, the memory allocator frees all init memory (still to be implemented; I just found out how thanks to Chris Giese) and then starts loading modules that either have no objects they depend on, or whose dependencies are already present.
Hmm ... so conceptually, you're walking the list of available modules, looking for modules whose dependency graph is made only of loaded modules, then loading them and looking for further "ready" modules.
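In other words, something like the following repeated sweep (a sketch only; module_t, its fields and load_module() are placeholders, not an existing API):

```c
#include <stdbool.h>
#include <string.h>

typedef struct module {
    const char    *name;
    const char   **deps;     /* NULL-terminated list of module names it needs */
    bool           loaded;
    struct module *next;
} module_t;

int load_module(module_t *m);   /* probe + init, as described above */

static bool deps_satisfied(const module_t *m, const module_t *all)
{
    for (const char **d = m->deps; d && *d; d++) {
        const module_t *o;
        for (o = all; o; o = o->next)
            if (o->loaded && strcmp(o->name, *d) == 0)
                break;
        if (!o)
            return false;            /* dependency not loaded (or unknown) */
    }
    return true;
}

void load_ready_modules(module_t *all)
{
    bool progress = true;
    while (progress) {                       /* repeat until nothing new loads */
        progress = false;
        for (module_t *m = all; m; m = m->next) {
            if (m->loaded || !deps_satisfied(m, all))
                continue;
            if (load_module(m) == 0) {       /* probe succeeded, module is in */
                m->loaded = true;
                progress = true;             /* may unblock further modules   */
            }
        }
    }
}
```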
Candy wrote: It loads only the probe section (.probe.text, .probe.data and .probe.bss), which can do nothing but probe for the device. If the probe returns success, the rest of the module is loaded and its init is called; then init and probe are removed again (so you can pass detected info from the probe on to the rest of the module).
That probe-section thing is interesting... I have tried to do something similar myself and came to a design with roughly 3 modules per driver:
- the "xyzCore" module that contains helper functions required by both the probing and the running code.
- the "xyzBusdriver" that is responsible for scanning hardware and loading the requested modules for the available devices.
- the "xyzDeviceDrivers" modules that will actually do the useful job.
Once the busdriver has installed the device drivers, it could theoretically be unloaded...
One thing that remains obscure to me with those "probe" sections is how much the compiler can help you detect probe code that calls non-probe code or references non-probe data.
Candy wrote: Thinking about this, it would be best if you relocated every module before running it, and then according to user settings decided whether it would be kernel mode or user mode.
Do you mean the loader should be able to load the same binary either at kernel or at user level?
I suppose you'll use something like a dynamically linked (and generated ?) library to provide "trampolines" to system calls when needed (like, calling INT $82 instead of calling kprint directly).
One thing i wonder is how you'll deal with "objects", or pointers to data structures that must be kept within the kernel ... Let's say for instance you need to trampoline to kDelay* setupTimeout(int millis) and need the return value for a later cancelTimeout(kDelay* delay), but you do not want user-level code to hand you "forged" or "stolen" timeout structures ? ...
Candy wrote: I was thinking about linking the modules through a high-speed kernel call: something that all modules call to call other modules. Still wondering about better ways, since this would require all kernel-only modules to communicate through that path too, turning it into a lag point.
If you're sure that the target of a communication does not change once the modules have been installed, what you can do is decide at load time whether a direct component connection is available or whether a proxy/stub (hiding the cross-domain communication) must be used.
I used to think "blah! stubs and proxies must be generated on the fly, this is too much complication for kernel level", but if you look at how COM works, you'll see that in most cases the proxy and stub code is actually located in Yet Another Module, whose location can be resolved from the id of what it allows you to connect to (or from)...
I have ideas on my own for that kind of flexible communication, but they're still in the sketchbook and need further study before i can express them clearly ...
Re:(candy) modules & drivers.
Pype.Clicker wrote: Hmm ... so conceptually, you're walking the list of available modules, looking for modules whose dependency graph is made only of loaded modules, then loading them and looking for further "ready" modules.
That's true. Also note that all of the modules must be written to be completely reentrant, as will the kernel. As far as possible, multiple modules will be probed simultaneously (so slow probes do not slow down the boot by themselves). There will be a kernel-level module admin thread that does all the adminning and decides what to try (something like the intelligence of the system), and then posts messages (no, I'm still not going microkernel) to the worker threads, which then call those functions.
Pype.Clicker wrote: That probe-section thing is interesting... I have tried to do something similar myself and came to a design with roughly 3 modules per driver:
- the "xyzCore" module that contains helper functions required by both the probing and the running code.
- the "xyzBusdriver" that is responsible for scanning hardware and loading the requested modules for the available devices.
- the "xyzDeviceDrivers" modules that will actually do the useful job.
Once the busdriver has installed the device drivers, it could theoretically be unloaded...
One thing that remains obscure to me with those "probe" sections is how much the compiler can help you detect probe code that calls non-probe code or references non-probe data.
Well, the compiler helps you a lot. Since the GNU compiler (like the GNU linker) does not remove any relocations until the final relocation step, which I do myself, I have all the relocation info (strip is forbidden) and can thus check for every symbol whether it is in one section or the other. If I then find a relocation in either probe or core that links to a symbol of the other, that is obviously wrong.
Also, I think the compiler would just tell you straight out that you cannot make that call. It's hard to represent a relative call from one section to another in ELF relocations (at least, it's impossible with the relocation types that are in the docs).
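A sketch of what that relocation check could look like (module_t here is an invented wrapper around the mapped .o image; the ELF structures are the standard ones from <elf.h>):

```c
#include <elf.h>
#include <string.h>

typedef struct {
    Elf32_Ehdr *ehdr;       /* the module's .o image, mapped in memory */
    Elf32_Shdr *shdrs;      /* section header table */
    const char *shstrtab;   /* section name string table */
} module_t;

static int is_probe_section(const module_t *m, unsigned shndx)
{
    const char *name = m->shstrtab + m->shdrs[shndx].sh_name;
    return strncmp(name, ".probe.", 7) == 0;
}

/* Check one SHT_REL section; returns 0 if clean, -1 on a probe<->core link. */
int check_cross_relocs(const module_t *m, const Elf32_Shdr *relsh)
{
    Elf32_Rel *rel  = (Elf32_Rel *)((char *)m->ehdr + relsh->sh_offset);
    Elf32_Sym *syms = (Elf32_Sym *)((char *)m->ehdr +
                                    m->shdrs[relsh->sh_link].sh_offset);
    unsigned   nrel = relsh->sh_size / sizeof(Elf32_Rel);
    int target_is_probe = is_probe_section(m, relsh->sh_info);

    for (unsigned i = 0; i < nrel; i++) {
        Elf32_Sym *sym = &syms[ELF32_R_SYM(rel[i].r_info)];
        if (sym->st_shndx == SHN_UNDEF || sym->st_shndx >= SHN_LORESERVE)
            continue;       /* external or special symbol: resolved elsewhere */
        if (is_probe_section(m, sym->st_shndx) != target_is_probe)
            return -1;      /* probe linked against core, or vice versa */
    }
    return 0;
}
```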
Pype.Clicker wrote: Do you mean the loader should be able to load the same binary either at kernel or at user level? I suppose you'll use something like a dynamically linked (and generated ?) library to provide "trampolines" to system calls when needed (like, calling INT $82 instead of calling kprint directly).
No, as a matter of fact I was thinking about a method with a number of macros that wrap those module calls in a shared header and let them function somewhat like the PLT in ELF files. That is, code that modifies itself, but only in that one place. It is still something I'm not sure about, so correct me if this is a really stupid idea.
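A loose sketch of that idea, with all names invented; instead of truly self-modifying code it patches a per-call function-pointer slot once, which is the same trick the PLT plays:

```c
typedef int (*modcall_fn)(void *arg);

struct modcall_slot {
    modcall_fn  target;           /* filled in on first use */
    const char *symbol;           /* e.g. "fat12.read_sector" */
};

/* Resolver chosen by the loader: a direct pointer when caller and callee sit
 * in the same (kernel) domain, or a small trampoline that traps into the
 * kernel when the callee lives in another address space. */
extern modcall_fn resolve_modcall(const char *symbol);

static inline int modcall_dispatch(struct modcall_slot *slot, void *arg)
{
    if (!slot->target)
        slot->target = resolve_modcall(slot->symbol);  /* one-time fixup */
    return slot->target(arg);
}

/* The macro a module would use from a shared header. */
#define MODCALL(name, arg)                                       \
    ({ static struct modcall_slot __slot_##name = { 0, #name };  \
       modcall_dispatch(&__slot_##name, (arg)); })
```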
Pype.Clicker wrote: One thing i wonder is how you'll deal with "objects", or pointers to data structures that must be kept within the kernel ... Let's say for instance you need to trampoline to kDelay* setupTimeout(int millis) and need the return value for a later cancelTimeout(kDelay* delay), but you do not want user-level code to hand you "forged" or "stolen" timeout structures ? ...
Well, I don't do timeouts that way, honestly... Still, your point is taken, and I must admit I didn't think about that. I'll think about it some tonight.
Pype.Clicker wrote: I have ideas on my own for that kind of flexible communication, but they're still in the sketchbook and need further study before i can express them clearly ...
Please, let your conceptual ideas out. At worst, all we can do is learn from them and point you to obvious flaws, as you yourself do for all the others here, including me.
- Pype.Clicker
Re:(candy) modules & drivers.
Candy wrote: Please, let your conceptual ideas out. At worst, all we can do is learn from them and point you to obvious flaws, as you yourself do for all the others here, including me.
Hum. Well, i'll try -- note that i'm rather inclined to idea-sharing, but this one is still at a *very* draft level ... I'll give it a try though.
So, in Clicker, you have the so-called KDS services that allow modules to talk to each other. Basically, KDS describes a set of interfaces and groups them by "service". One of these interfaces, for instance, is "system.error:log", and another interface for the same service could be "system.error:errno-translator".
Each interface consists of a list of methods that server modules will provide and that client modules may call. A small tool parses IDL files and generates argument structures, as well as inline functions that wrap the kdsInvoque call in something that looks much like a local call.
Note that so far it sounds much like COM with its IDispatch interface, except that the logical-name-to-dispatch-ID mapping is resolved at load time (from the programmer's point of view only).
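For illustration only, a generated wrapper for a "system.error:log" method might look roughly like this; the argument struct, the kdsInvoque signature and the method id are all guesses, only the names KDS and kdsInvoque come from the post:

```c
#include <stddef.h>

/* generated from the IDL file: argument structure for log.write(msg, level) */
struct syserror_log_write_args {
    const char *msg;
    int         level;
};

/* the generic invocation entry point of the KDS core (assumed signature) */
extern int kdsInvoque(int service_id, int method_id, void *args, size_t len);

/* generated inline wrapper: looks like a local call to the client */
static inline int syserror_log_write(int service_id, const char *msg, int level)
{
    struct syserror_log_write_args a = { msg, level };
    return kdsInvoque(service_id, /* LOG_WRITE */ 1, &a, sizeof a);
}
```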
Of course, user programs can be servers or clients just as kernel components can, and of course we'd like to allow efficient single-domain calls while keeping the ability to do inter-domain calls.
What i have in mind for transmitting 'calls' across domain boundaries is "multipart messages": rather than marshalling the whole call into one single data array, the client is responsible for providing a {nb parts, {ptr, length, r/w}, {ptr, length, r/w}, {ptr, length, r/w} ... } structure that identifies all of the consecutive "parts" of the message (in addition to the 'atomic' parameters that are kept within the message itself).
The KDS core is then able to deliver that message efficiently within a domain (by just passing the {nb parts, ...} structure and letting the other side use the pointers directly), or it can see that a domain crossing is mandatory and collect the parts into a file / network connection / shared memory area / whatever ...
On reception of the multipart message, it's quite easy to generate a new multipart message whose pointers refer to the locally retrieved parts.
Compared to Java serialisation or COM marshalling, the multipart technique has the advantage that the core does not need to 'know' the primitive types of the parts: it just has to copy (or not copy) given-length arrays into an IPC medium.
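In C, the structure the client provides could look something like this (field names invented):

```c
#include <stddef.h>
#include <stdint.h>

#define PART_READ   0x1   /* callee may read this part  */
#define PART_WRITE  0x2   /* callee may write this part */

struct msg_part {
    void    *ptr;         /* start of this consecutive piece of the message */
    size_t   length;
    uint32_t access;      /* PART_READ and/or PART_WRITE */
};

struct multipart_msg {
    int             nb_parts;
    struct msg_part part[];   /* nb_parts entries, provided by the client */
};

/* Same-domain delivery: the KDS core can hand this structure over untouched
 * and let the server use the pointers directly.  Cross-domain delivery would
 * instead walk part[] and copy or map each piece into the target domain. */
```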
- Pype.Clicker
Re:(candy) modules & drivers.
now, about the "kernel objects returned to user level": what i planned to do is to have a "resource table" associated with each process. Each resource is something the kernel is able to export to the user, on which a list of methods is defined. When the KDS core has to expose a kernel "object" to the client, it simply stores it as a new resource ...
now, i admit that i still have to find out how the KDS core will know what class of resource it must use for the object, but that information is available through the IDL description of the service, so it could simply be inferred from the return type name or something like this ...
Another feature i plan to have is the ability for the kernel to add component code to the user's app.
This would allow (for instance) the user to receive a "proxy" of the object that it could manipulate through locally imported functions. But here again, it's still very sketchy ...
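A minimal sketch of such a per-process resource table (names and the fixed table size invented): kernel objects become small handles, and every handle is looked up and class-checked on the way back in.

```c
enum res_class { RES_FREE = 0, RES_TIMEOUT, RES_FILE, RES_SHMEM };

struct resource {
    enum res_class cls;
    void          *kobj;        /* the real kernel object, never shown to user */
};

struct process {
    struct resource res[64];    /* per-process resource table */
};

/* Export a kernel object: returns a handle the user may hold. */
int res_export(struct process *p, enum res_class cls, void *kobj)
{
    for (int i = 0; i < 64; i++)
        if (p->res[i].cls == RES_FREE) {
            p->res[i].cls  = cls;
            p->res[i].kobj = kobj;
            return i;
        }
    return -1;                  /* table full */
}

/* Look a handle back up; a forged or wrong-class handle yields NULL. */
void *res_lookup(struct process *p, enum res_class cls, int handle)
{
    if (handle < 0 || handle >= 64 || p->res[handle].cls != cls)
        return NULL;
    return p->res[handle].kobj;
}
```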
Re:(candy) modules & drivers.
I've thought a little about it too, and I've come up with some solutions.
One of them is like one of yours: you keep a table per user-level driver of which objects it was passed, and check all incoming pointers against that hash table to see whether each one is an allowed object for that driver. If it isn't, call for help.
Another, which I found quite logical, is to only check pointers from user space for validity and, in the case of kernel objects, to check whether they point into kernel space. There is a very small chance that a random pointer is valid, and if it isn't valid you have a 3/4 chance of missing kernel space. The system is meant to catch buggy drivers, not intentionally trashy drivers (note, this is a general approach; my system wants to protect itself).
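A sketch of both checks (a plain array stands in for the hash table here, and the kernel-space boundary is just an example value, not from the post):

```c
#include <stdbool.h>
#include <stdint.h>

#define KERNEL_BASE 0xC0000000u          /* example split, assumption only */

struct driver_objects {
    void *allowed[32];                   /* objects this driver was handed */
    int   count;
};

/* Approach 1: strict - the pointer must be one we gave the driver earlier. */
bool object_allowed(const struct driver_objects *d, const void *ptr)
{
    for (int i = 0; i < d->count; i++)
        if (d->allowed[i] == ptr)
            return true;
    return false;                        /* unknown object: call for help */
}

/* Approach 2: cheap - only make sure a "kernel object" points at kernel space,
 * which catches buggy (not malicious) drivers most of the time. */
bool plausibly_kernel_object(const void *ptr)
{
    return (uintptr_t)ptr >= KERNEL_BASE;
}
```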
As for me, the entire idea of putting drivers in user space was so that a driver could not harm the OS, at whatever speed expense that came. The other part of the story is that it must run at full speed, unhampered, when in kernel space. Using a macro to build a PLT-ish structure that calls the kernel through a stub (defined by the macro itself, preferably) or calls directly would be the best way. The kernel then does not have to worry about the module calls hurting it: a module loaded at kernel level could only hurt the module itself through those calls, and a module loaded at user level would not be able to harm the kernel through the same methods. The module would not suffer a speed loss, because the macro part of the code rewrites the calls into direct ones. A module in user space would not escape its limitations either, because trying to would result in a GPF (supervisor pages), and using the jump technique would just let it do the same thing, but under supervision.
Seems like a win-win to me. Except for the part that probably interests you, since you're making a real microkernel where everything is in user space.
As for that part in particular, you're going to have to excuse me for the day. I'm not into microkernel-specifics even though I should be, and I'm going to need some more time to think. Be back tomorrow.
Re:(candy) modules & drivers.
just for kicks...
What about the method that is fully possible for me but absolutely out of the question for you? Copying everything?
For me it would slow down the system somewhat, and if used for high-bandwidth drivers, a lot, but it would make it fully safe. The interface would, though, always require a buffer pointer and a fixed length, plus a direction indication (usually obvious from the function; otherwise it's both). That would not cause any delay if implemented properly...
For a function call, the request would be put on the module's list of messages to handle, and the sending process would get a please-hold kind of token (in the kernel trap interface, not for user viewing); when the call returns, the info is there. Still thinking hard about some real-life things such as hard disk drivers... might be an idea to reserve 1/4 GB of space for just those drivers...
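A sketch of what such a copy-everything call descriptor could look like (all names invented; only the pointer + fixed length + direction convention comes from the post):

```c
#include <stddef.h>

enum copy_dir { COPY_IN = 1, COPY_OUT = 2, COPY_BOTH = 3 };

struct call_buffer {
    void         *ptr;      /* caller's buffer */
    size_t        length;   /* fixed, known length */
    enum copy_dir dir;      /* usually obvious from the function */
};

struct driver_call {
    int                function;   /* which entry point of the module */
    int                nbuffers;
    struct call_buffer buf[4];     /* buffers to copy according to dir */
};

/* The kernel trap interface would copy every COPY_IN buffer into the driver's
 * own space before queueing the call on the module's message list, park the
 * caller on its "please hold" token, and copy COPY_OUT buffers back when the
 * driver returns. */
```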
Still, I'm going to let the subject of theory rest for a while now (for my implementation at least) and get down to the practical side of module loading, linking and starting... kernel-only stuff that is...
- Pype.Clicker
Re:(candy) modules & drivers.
Candy wrote: just for kicks... What about the method that is fully possible for me but absolutely out of the question for you? Copying everything?
Hum. I fear after all this time i've lost the thread a bit ... what exactly do you wish to copy ? ...
I don't think creating copies of driver-to-kernel call parameters will solve everything.
Let's take the following example: a disk driver needs to send the content of a user data page to a disk sector ... In order to do this, it performs the following when in kernel mode:
- request a 4K free system area (just virtual memory) and receive the vaddr_t for it
- map the page frame provided by the "syscall decoder" at the newly allocated vaddr_t
- read the user data and send them to the disk with I/O opcodes (which your kernel will have allowed through a proper I/O bitmap)
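To make the scenario concrete, those three steps written out, with guessed signatures for the services named (vmAlloc, pgMap) and an invented ide_write_sector() standing in for the port I/O:

```c
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t vaddr_t;
typedef uint32_t  frame_t;

extern vaddr_t vmAlloc(size_t bytes);               /* free system area (virtual memory) */
extern int     pgMap(vaddr_t va, frame_t frame);    /* map the frame at that address     */
extern void    ide_write_sector(uint32_t lba, const void *src);  /* I/O opcodes          */

int write_user_page_to_disk(frame_t user_frame, uint32_t lba)
{
    vaddr_t va = vmAlloc(4096);              /* 1. request a 4K free system area          */
    if (!va)
        return -1;
    if (pgMap(va, user_frame) != 0)          /* 2. map the syscall-decoded page frame     */
        return -1;
    for (int s = 0; s < 8; s++)              /* 3. read the user data and send it to disk */
        ide_write_sector(lba + s, (const void *)(va + s * 512));
    return 0;
}
```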
How would the 'copy everything' handle the fact that only the returned vaddr_t and the syscall-decoded frame_t may be given to the pgMap service ?
Once you've allowed PCI DMA, how will you still check things, as there will no longer be any kind of pgMap service involved, but rather frame_t values fed directly into the UltraDMA PCI configuration space ? ...
Re:(candy) modules & drivers.
Pype.Clicker wrote: Hum. I fear after all this time i've lost the thread a bit ... what exactly do you wish to copy ? ... I don't think creating copies of driver-to-kernel call parameters will solve everything. Let's take the following example: a disk driver needs to send the content of a user data page to a disk sector ... In order to do this, it performs the following when in kernel mode:
- request a 4K free system area (just virtual memory) and receive the vaddr_t for it
- map the page frame provided by the "syscall decoder" at the newly allocated vaddr_t
- read the user data and send them to the disk with I/O opcodes (which your kernel will have allowed through a proper I/O bitmap)
How would the 'copy everything' handle the fact that only the returned vaddr_t and the syscall-decoded frame_t may be given to the pgMap service ?
By not allowing those modules to allocate kernel address space in the first place. A bad module could request too much, which is not good. (Also, why would the address space have to be in the kernel? The driver has its own address space when it runs in user space.) So the function itself maps to a different set of memory pages depending on whether the module runs in user or kernel mode.
Kernel mode:
User requests a block from disk
The disk driver has mapped in a block of I/O space, using a module that acts as a stub IDE interface to get the right to do so. It did this in init, since it's illegal to call the address-space allocation functions outside init (and probe...).
It then uses those pages to dump the data to disk from a kernel copy of the buffer, and voila, there it is. Then it returns normally.
User mode:
User requests a block from disk
All the user request data is copied into a separate buffer (in user space) for the disk driver, even after having been copied to kernel space.
The user-space driver == the kernel-space driver, so... it once again has the I/O space it allocated at startup. It also has the userland buffer that was made for it, so now all it has to do is exactly the same as in kernel space: start the transaction.
Then it returns fairly normally, but the kernel-space section that handles user-space drivers catches the return, saves the results for that thread (for use by the other part of the driver), signals that thread (which is currently executing in the other part of that driver) and then goes on to execute another function call on that module.
The function calls that module makes to kernel space are again handled immediately, but the ones to a different userland are suspended in exactly the same fashion.
I still need this to work transparently and, most importantly, to get out of the way immediately for kernel-space calls...
Pype.Clicker wrote: Once you've allowed PCI DMA, how will you still check things, as there will no longer be any kind of pgMap service involved, but rather frame_t values fed directly into the UltraDMA PCI configuration space ? ...
It is more or less impossible to do so. You can only expect it not to take random frame_t's; if it does, it misbehaves in its own space, the user notices, and kills the driver. Still, I don't expect many people to write this kind of driver; I was thinking more along the lines of utility drivers, where it is obvious when they don't work anymore (video card drivers, sound card drivers, filesystem drivers, etc.).