Handle managment schemes

proxy · Post by **proxy** » Wed Apr 02, 2008 10:49 am

In my OS, I currently have 3 types of handles:

PIDs, TIDs and SIDs. These are Process IDs, Thread IDs, and Semaphore IDs (semaphores IDs speak generically about any synchronization primitive).

At the moment, to implement sanity checking, my scheme is a 32-bit ID, where the upper 8-bits are the type (I know this is more than necessary but planning ahead). And the lower 24 bits are the unique value for the type.

If the lower 24-bits are all unset, it is the "invalid handle" for that type:

So for example, the first PID issued could be 0x01000001.

On to my question, given that I can have up to 2^24 IDs per type, what is a good way to track which are used and which are available?

Originally, I had a stack for each type, where I just pushed/poped IDs as needed. I soon decided that the memory trade off was a bit much, so I switched to a simple increment a counter and check for overflow (at which point I just panic for now).

I am thinking perhaps I should do a bitmap, which is managed with levels of indirection so I don't need to allocate all the bits at once (which would still be 64 thousand 32-bit values!).

What do you guys think? I could put some more constraints on the IDs to make it more practical, but I am trying to not have hard coded limits in my system, instead I'd rather the limits be imposed by the amount of available memory.

proxy

skyking · Post by **skyking** » Wed Apr 02, 2008 1:06 pm

Since you almost certainly has a kernel side struct describing the ID's content (PCB, Semaphore counter etc) you might want to implement this using a hash table containing pointer to the kernel side struct. Then you might not need to distinguish them with a field indicating type.

proxy · Post by **proxy** » Wed Apr 02, 2008 2:11 pm

yea, i thought about having some sort of mapping of pointer to ID. But one thing to keep in mind, is that this does have some security implications. Specifically, giving user space information about the locations of kernel object can create security issues.

Now, I realize that my OS is currently only used by me

and in fact will likely never be popular enough that anyone would bother trying to attack it...but, personally, I like to do things "the right way" if i can

proxy

t0xic · Post by **t0xic** » Wed Apr 02, 2008 2:16 pm

You could always encrypt the pointer, with very little overhead. Besides, assuming applications are running in ring 3 (and in their own address space) accessing kernel objects shouldn't be a problem.

Ready4Dis · Post by **Ready4Dis** » Thu Apr 03, 2008 2:00 am

How are your tasks sorted? You could just do counter, then once the counter rolls over, you can simply search the list for the next unused number. Sure that will be a bit slow, but if you go through 16 million numbers, something might possibly be wrong (or the computer was just running for a very long time).

Using the pointer in memory (as suggested) is an idea, provided it will never change, and that your kernel data is off-limits to userland. The process won't be able to read/write directly to the memory pointed to by the pointer anyways, so it doesn't really matter if it knows where it is in kernel land.

Korona · Post by **Korona** » Thu Apr 03, 2008 3:23 am

Use a map that maps handles -> task/whatever structs. Use 64 bit unsigned handles to make sure they won't overflow in the next few centuries even if millions of tasks get created every second. The map can be efficiently implemented by using a binary tree (perhaps a red-black tree).

proxy · Post by **proxy** » Thu Apr 03, 2008 9:45 am

@Ready4Dis: knowing pointers has more risks associated with it than just the idea of a process being able to tample with kernel memory (though that would be a concern if I ever implement anything like /dev/kmem

). In addition to that, a user space app could coax the kernel into performing operations on memory it doesn't own, or used to own through system calls.

A contrived example goes like this:

* user space app, figures out where some interesting kernel structure is, perhaps even relative to valid data it does have.

* this application, takes notice that the read system call does not properly validate that the target buffer is kernel space or user space. (i do, but this is an example).

* the application then calls read providing a target buffer that it acquired in step 1, viola, tampering with kernel memory indirectly!

there are a few other ways that I can think of that a user space app knowing the address of kernel data can go badly, but that's besides the point.

@Korona: Yea, I keep my processes in a "map" (a tree). So I could just go with a "big enough" handle space. I'll have to keep that in mind.

proxy

skyking · Post by **skyking** » Fri Apr 04, 2008 12:27 pm

proxy wrote:yea, i thought about having some sort of mapping of pointer to ID. But one thing to keep in mind, is that this does have some security implications. Specifically, giving user space information about the locations of kernel object can create security issues.

Well, but the kernel has to know the mapping - so the mapping has to exist. If you want to make it possible for userspace code to tell whether a handle is valid or not you might add a syscall so that the kernel can check that (this shouldn't be needed since the program should know that already anyway).

Korona · Post by **Korona** » Fri Apr 04, 2008 1:12 pm

If you use pointers as handles, the kernel cannot tell whether a handle is valid or not. It has to lookup the handle in some sort of set to validate it, but if you do that all advantages (O(1) lookups) of using pointers as handles vanish. Use a 64bit counter instead. Even if a gigahertz cpu would create one thread per clock cycle the 64bit counter won't overflow in the next 600 years.

skyking · Post by **skyking** » Fri Apr 04, 2008 3:50 pm

Korona wrote:If you use pointers as handles, the kernel cannot tell whether a handle is valid or not. It has to lookup the handle in some sort of set to validate it, but if you do that all advantages (O(1) lookups) of using pointers as handles vanish. Use a 64bit counter instead. Even if a gigahertz cpu would create one thread per clock cycle the 64bit counter won't overflow in the next 600 years.

Yes, but then you still have to look up the pointer anyway - so you win not much more than the address of the PCB remains secret.

omegaman · Post by **omegaman** » Sun Apr 06, 2008 9:53 am

Note that some developers assume their user apps will receive handles with the usual Posix semantics: lowest available ID first. Some application may even fail if this is not honored. So if portability is an issue, take note!

The various pointer-as-handle schemes violates this, as well as leaking information from the kernel (maybe even in non-obvious ways... not as safe as it sounds.)

My favorite handle implementation is the radix tree solution in Solaris - it's very fast and fairly easy to implement.

skyking · Post by **skyking** » Mon Apr 07, 2008 12:21 pm

omegaman wrote:Note that some developers assume their user apps will receive handles with the usual Posix semantics: lowest available ID first. Some application may even fail if this is not honored. So if portability is an issue, take note!

Is this really so? I have not seen any documentation mentioning anything like that. If portability is an issue one should not assume things that can't be guaranteed (especially when there is no need to do those assumtions).

robos · Post by **robos** » Mon Apr 07, 2008 12:48 pm

skyking wrote:
omegaman wrote:Note that some developers assume their user apps will receive handles with the usual Posix semantics: lowest available ID first. Some application may even fail if this is not honored. So if portability is an issue, take note!
Is this really so? I have not seen any documentation mentioning anything like that. If portability is an issue one should not assume things that can't be guaranteed (especially when there is no need to do those assumtions).

Yes, that's what programmers do. They assume things they shouldn't. It's one of the big things with the whole backwards compatibility in Windows, see this blog entry for example:

http://blogs.msdn.com/oldnewthing/archi ... 03614.aspx

So if you need to compatible then everything the API exposes (which includes the way handles are allocated) should be done the same way on your system (if you want max compat).

zaleschiemilgabriel · Post by **zaleschiemilgabriel** » Tue Apr 08, 2008 2:03 am

Well, being that it's your OS, you can impose whatever constraints you want on the applications that run it. Whatever you do to fix this, however, will not be perfect. The only way a fix would be perfect is if you had a variable that you could modify an infinite amount of times and get a different value every time, which computers simply can't do very well.
Encryption and rollover algorithms will help a bit, but there's always a chance that they'll overflow, because they can't count forever.