Dynamic linking subsystem (call gates, etc)

rlee0001 · Post by **rlee0001** » Sun Feb 27, 2005 12:00 am

I've been reading a lot about OS development over the last few years and took a quick stab at writing a simple real-mode OS a while ago. I think I'm about ready to get back to coding and have a few ideas in mind for system design. Specifically, I've been thinking a lot about conventions for dynamic linking and system calls lately and wanted to run an idea past y'all. I'm wondering if anyone out there has had similar ideas and if it's ever been done before...

The rest of this post outlines the dynamic linking convention and is somewhat technical...

First, Shared Libraries and eXecutables would share a common proprietary format (call it the SLX format), which would allow a file to:

1. Export functions to other processes (i.e. be a shared lib)
2. Define and export a "main" function (i.e. be an executable)
3. Call into previously loaded libraries (or trigger other libraries to be loaded on the fly) via an embedded link table and patch table

The link table would be a list of module names and procedure names. For example:

0> cli.slx, writeln
1> kernel.slx, exit

The patch table would contain offsets into the binary image where FAR CALL instructions are located which need to be resolved by the loader. The FAR CALL itself would identify the desired module/procedure in the link table. For example:

.data
msg db 'hello world',0
.code
push offset msg
push seg msg
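call far 0:0 ; placeholder: offset field = link table entry 0 (cli.slx, writeln)
call far 0:1 ; placeholder: offset field = link table entry 1 (kernel.slx, exit)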

Obviously this is a messy example but you get the idea. The loader looks at the link table, ensures that the necessary modules are loaded, creates a call gate descriptor in the app's LDT for each entry, then walks the patch table and fixes up each FAR CALL with the appropriate gate selector.
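The loader step described above can be sketched in C. This is only an illustration under stated assumptions: it uses the 16-bit `CALL ptr16:16` encoding (opcode `0x9A`, then a 16-bit offset, then a 16-bit selector), it assumes the assembler left the link table index in the offset field, and `gate_selector_for` is a hypothetical stand-in for "load the module and create a gate in the LDT". None of these names come from the original post.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One link table entry: which module and procedure a call refers to. */
struct link_entry {
    const char *module;     /* e.g. "cli.slx" */
    const char *procedure;  /* e.g. "writeln" */
};

/* Hypothetical stand-in for "ensure the module is loaded and create a
 * gate in the app's LDT". Here it just hands out consecutive LDT slots:
 * descriptor index + 1, TI = 1 (LDT), RPL = 3. */
static uint16_t gate_selector_for(const struct link_entry *e, size_t index)
{
    assert(e->module != NULL && e->procedure != NULL);
    return (uint16_t)(((index + 1) << 3) | 0x4 | 0x3);
}

/* Patch every FAR CALL listed in the patch table. The link table index
 * is read from the instruction's offset field and the gate selector is
 * written into its selector field. Returns 0 on success, -1 if the
 * patch table or the image is malformed. */
static int patch_far_calls(uint8_t *image, size_t image_len,
                           const struct link_entry *links, size_t n_links,
                           const size_t *patch_offsets, size_t n_patches)
{
    for (size_t i = 0; i < n_patches; i++) {
        size_t at = patch_offsets[i];
        if (at > image_len || image_len - at < 5 || image[at] != 0x9A)
            return -1;

        size_t index = (size_t)image[at + 1] | ((size_t)image[at + 2] << 8);
        if (index >= n_links)
            return -1;

        uint16_t sel = gate_selector_for(&links[index], index);
        image[at + 1] = 0;  /* the CPU ignores the offset when the */
        image[at + 2] = 0;  /* selector refers to a gate           */
        image[at + 3] = (uint8_t)(sel & 0xFF);
        image[at + 4] = (uint8_t)(sel >> 8);
    }
    return 0;
}
```

A real loader would also have to decide what happens when a module cannot be found; this sketch only rejects malformed tables.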

In this system no dispatcher is needed in the OS or in the libraries. Functions are called directly, through a call gate. The same convention would be used for system calls (except that the gate would involve a privilege level change). The gate would be built from information stored in the CALLEE (not the caller): the number of parameters copied would be specified by the callee's export table for each exported function. For shared libraries calling each other in ring 3, I think calls could all occur without a context switch, right? That would limit context switches to calls made from ring 3 to ring 0 (i.e. system calls).
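To make the "built from information stored in the callee" idea concrete, here is a minimal C sketch of filling in a 32-bit call gate descriptor from an export table entry. The descriptor layout follows the Intel manuals; the `export_entry` structure and the function name are assumptions made up for this example, not part of any proposed SLX format.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical export table entry stored in the callee module. */
struct export_entry {
    const char *name;         /* exported procedure name */
    uint32_t    entry;        /* offset of the procedure in its code segment */
    uint8_t     param_dwords; /* parameters the CPU copies on a stack switch */
};

/* A 32-bit call gate occupies one 8-byte descriptor slot. */
struct call_gate {
    uint16_t offset_low;   /* entry point, bits 15..0 */
    uint16_t selector;     /* code segment selector of the callee */
    uint8_t  param_count;  /* bits 4..0: dword count, bits 7..5 zero */
    uint8_t  access;       /* P (bit 7), DPL (bits 6..5), S = 0, type = 0xC */
    uint16_t offset_high;  /* entry point, bits 31..16 */
};

_Static_assert(sizeof(struct call_gate) == 8, "a call gate is 8 bytes");

/* Build a present 32-bit call gate for one exported procedure.
 * dpl is the lowest privilege (highest ring number) allowed to call
 * through the gate: 3 lets ring 3 code use it. */
static struct call_gate make_call_gate(const struct export_entry *exp,
                                       uint16_t callee_code_sel,
                                       unsigned dpl)
{
    assert(exp->param_dwords <= 31 && dpl <= 3);

    struct call_gate g;
    g.offset_low  = (uint16_t)(exp->entry & 0xFFFF);
    g.selector    = callee_code_sel;
    g.param_count = exp->param_dwords;
    g.access      = (uint8_t)(0x80 | (dpl << 5) | 0x0C);
    g.offset_high = (uint16_t)(exp->entry >> 16);
    return g;
}
```

The parameter count only matters when the gate changes privilege level and the CPU switches stacks; for a same-ring call nothing is copied.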

This should significantly increase call speeds between libraries and the system compared to systems that use dispatch routines and jump tables at runtime, right?

In this system, each app will get its own LDT, which will be filled with call gates for each external procedure needed. I can't imagine any case where an app would need access to more than 8000-some-odd external procedures, but apps would be limited to the number of descriptors an LDT can store (8192). Would this be an issue?
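The limit comes from the selector format: 13 bits of index, one table indicator bit and two bits of requested privilege level. A small C sketch of how a loader might form the selector for an LDT slot (the names here are illustrative only):

```c
#include <assert.h>
#include <stdint.h>

/* A selector has a 13-bit index, so one LDT holds at most 8192 entries. */
enum { LDT_MAX_DESCRIPTORS = 8192 };

/* Selector layout: index (bits 15..3), TI (bit 2, 1 = LDT), RPL (bits 1..0). */
static uint16_t ldt_selector(unsigned index, unsigned rpl)
{
    assert(index < LDT_MAX_DESCRIPTORS && rpl <= 3);
    return (uint16_t)((index << 3) | (1u << 2) | rpl);
}
```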

Is there something I'm not seeing here?

-Robert

carbonBased · Post by **carbonBased** » Sun Feb 27, 2005 12:00 am

I don't know if the above method would be extensible enough. The minute you enforce a function-to-number mapping, you're limiting what you can do, I think.

Each application will have its own link table like this? If so, the same shared library will have its contents in several link tables if it is used by several apps. If not, you're somehow going to have to define a master link table for all apps that can be added to... I dunno if that's possible.

I would take a look at the various different object formats currently in use. Typically dynamic linking is done with a functionName -> address mapping per shared library, and each application (or even another shared library) which needs to import a symbol would have a functionName -> offset mapping.

In other words, to dynamically link, you scan through the input executable for all the fix-up addresses (ie, all the offsets) and replace them with the addresses found in the functionName to address mapping table.
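As a rough C sketch of that scheme: the library carries a name-to-address table, the importing image carries a name-to-offset table, and the linker writes each resolved address at the recorded offset. The structure and function names are invented for this example, and it assumes 32-bit little-endian absolute addresses; real object formats such as rdoff2 carry more information per relocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Exported by a shared library: functionName -> address. */
struct export_sym {
    const char *name;
    uint32_t    address;
};

/* Needed by an importing image: functionName -> offset of the fix-up. */
struct import_fixup {
    const char *name;
    size_t      offset;
};

static const struct export_sym *find_export(const struct export_sym *exports,
                                            size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(exports[i].name, name) == 0)
            return &exports[i];
    return NULL;
}

/* Write each resolved address into the image as a 32-bit little-endian
 * value. Returns the number of fix-ups that could not be applied. */
static size_t apply_fixups(uint8_t *image, size_t image_len,
                           const struct import_fixup *fixups, size_t n_fixups,
                           const struct export_sym *exports, size_t n_exports)
{
    assert(image != NULL);
    size_t unresolved = 0;

    for (size_t i = 0; i < n_fixups; i++) {
        const struct export_sym *sym =
            find_export(exports, n_exports, fixups[i].name);
        size_t at = fixups[i].offset;

        if (sym == NULL || at > image_len || image_len - at < 4) {
            unresolved++;
            continue;
        }
        image[at + 0] = (uint8_t)(sym->address);
        image[at + 1] = (uint8_t)(sym->address >> 8);
        image[at + 2] = (uint8_t)(sym->address >> 16);
        image[at + 3] = (uint8_t)(sym->address >> 24);
    }
    return unresolved;
}
```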

If you download the nasm source code, there's a definition of the rdoff2 object format which is fairly easy to understand. There are also rdoff2 tools... the source to rdlink (I believe that's the name... in any event, the linker) is a good source of information on dynamic linking.

My take on the matter?

I, personally, think it would be great to have everything be a dynamic object... much like Java classes, all applications would simply be a collection of dynamic objects that any and all other applications can use. Linking would always be done at runtime and dynamically.

This would be slower than current systems, of course... and the initial linking step upon launching an application would probably be noticeable. In an attempt to quicken this process, I believe one could simply perform the linking in a separate thread *while the application is running*.

This is, perhaps, a bizarre notion... running an unlinked application? But it could work. If all the fix-up addresses are made to be something known and unique (such as 0xdeadbeef) then the page fault handler can look at the address, see that the fault was caused by running an unlinked application, and *not* destroy the app, but rather wait until the linker thread has correctly fixed this portion of the code (perhaps force the linker thread to perform this link right away) before returning to the application, which would then continue running as if nothing happened.
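The decision the fault handler has to make can be modelled in plain C, without touching real page tables. This is only a simulation under loud assumptions: `import_slot`, `on_page_fault` and the "link everything still pending" policy are invented for illustration, and since every unlinked slot holds the same sentinel, the handler here does not try to work out which slot faulted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UNLINKED_SENTINEL 0xDEADBEEFu  /* placeholder written into every fix-up */

enum fault_action { FAULT_RESUME, FAULT_KILL };

/* Hypothetical import slot: 'target' is what the application jumps
 * through, 'real_address' is what the linker will eventually store. */
struct import_slot {
    uint32_t target;
    uint32_t real_address;
};

/* Stand-in for the linker thread finishing one slot. */
static void link_slot(struct import_slot *slot)
{
    slot->target = slot->real_address;
}

/* Model of the page fault handler's decision. A fault on the sentinel
 * address means the app ran ahead of the linker, so the remaining slots
 * are linked immediately and the app is resumed. Any other fault
 * address is treated as a genuine error. */
static enum fault_action on_page_fault(uint32_t fault_addr,
                                       struct import_slot *slots, size_t n)
{
    assert(slots != NULL || n == 0);

    if (fault_addr != UNLINKED_SENTINEL)
        return FAULT_KILL;

    for (size_t i = 0; i < n; i++)
        if (slots[i].target == UNLINKED_SENTINEL)
            link_slot(&slots[i]);
    return FAULT_RESUME;
}
```

A real handler would also need the sentinel to be an address that is guaranteed to be unmapped, and would have to re-execute the faulting instruction after the fix-up.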

--Jeff

rlee0001 · Post by **rlee0001** » Tue Mar 08, 2005 12:00 am

carbonBased,

I'm not too stuck on the segmentation detail; I'm primarily concerned with the linking at this point. I might also just go with an offset patch table for relocation in a flat-model system.

Anyways, do you know of any existing OSes that use call gates to call directly into the target procedure without going through any dispatchers first? From the Intel manuals it looks like this was Intel's original intention for call gates. In the descriptor, Intel has fields for the number of parameters to copy between stacks and so forth.

Anyways...

Anton · Post by **Anton** » Thu Mar 10, 2005 12:00 am

carbonBased wrote: This is, perhaps, a bizarre notion... running an unlinked application? But it could work. If all the fix-up addresses are made to be something known and unique (such as 0xdeadbeef) then the page fault handler can look at the address, see that the fault was caused by running an unlinked application, and *not* destroy the app, but rather wait until the linker thread has correctly fixed this portion of the code (perhaps force the linker thread to perform this link right away) before returning to the application, which would then continue running as if nothing happened.
--Jeff

This is actually how dynamic linking works. The program is linked at runtime, not before execution. The default addresses are 0, which causes a fault; the linker then corrects these addresses, and execution returns to the program.

rexlunae · Post by **rexlunae** » Fri Mar 11, 2005 12:00 am

rlee0001 wrote: Anyways, do you know of any existing OSs that use call gates to call directly into the target procedure without going through any dispatchers first?

I know of one. MMURTL, by Richard A. Burgess. I would give you a link, but I'm not sure the project ever had a website, and the author has officially renounced copyright. If you want a look, I recommend Google, but I warn you...MMURTL is not...good. It is written in a combination of assembly and a language called C-, with its own custom compilers and assemblers.

rlee0001 wrote: From the Intel manuals it looks like this was Intel's original intention for call gates. In the descriptor, intel has fields for size of parameters to copy between stacks and so forth.

Probably. But you run into problems when you rely on features only supported on one or a few specific processors. Even if you never plan on porting your OS, looking to future portability is wise. The 64-bit x86 processors that are out there now only support the bare minimum of segmentation features, and if you ever want your OS to run on those processors, you will need to find another way. Besides, I would imagine that you will get a much greater performance improvement by looking into the sysenter/sysexit instructions available on later 32-bit x86s.

OSDev.org
