How does QNX solve async IPC w user malloc and free?

OSwhatever · Post by **OSwhatever** » Wed Feb 01, 2012 9:02 am

Although QNX relies heavily on synchronous IPC, it also provides an asynchronous API.

Most notably, when you create an asynchronous channel you can provide your own malloc and free, defined in user space. If you don't, it just uses the normal malloc and free.

You can read about their API here:
http://www.qnx.com/developers/docs/6.3. ... aging.html

I can't really get my head around how they would have solved this efficiently. I can think of a few ways, but they would require more task switches and user mode callbacks going in and out of the kernel. It all comes down to having to alter the memory allocation of a process that isn't currently running, which is an unfortunate effect.

QNX isn't really open source, so it is hard to know, but does anybody know how they would have solved this efficiently? User mode memory management would be the preferable solution, but I have trouble seeing how to do it efficiently without involving the kernel.

Brynet-Inc · Post by **Brynet-Inc** » Wed Feb 01, 2012 12:55 pm

The source was available, but sadly they disabled public access to their SVN repositories. You might be able to find someone who still has a copy you can look at.

gerryg400 · Post by **gerryg400** » Thu Feb 02, 2012 5:25 am

I'm guessing that their async messaging, like their synchronous messaging, uses process-to-process copy.

To receive a msg you create a channel with a bunch of message buffers malloced from your own memory. The "asyncmsg_channel_create" specifies the number of msg buffers, the size of the buffers and the call-backs to get the actual memory. Then do a receive. If there's a message available it will be copied into one of your buffers.

So all buffering is, I'm guessing, done in user space. The kernel just does the copy.
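
A minimal sketch of the arrangement guessed at above: the receiver owns a fixed set of message buffers obtained through its own allocator callbacks, and a delivery step (standing in for the kernel's process-to-process copy) only copies into them. Every name here (`rx_channel`, `rx_channel_create`, `rx_channel_deliver`, and so on) is hypothetical and is not the QNX asyncmsg API; the real signatures are in the QNX documentation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback types; NOT taken from the QNX headers. */
typedef void *(*alloc_fn)(size_t size);
typedef void (*free_fn)(void *ptr);

struct rx_channel {
    size_t nbufs;            /* number of receive buffers */
    size_t buf_size;         /* capacity of each buffer in bytes */
    size_t head;             /* index of the oldest queued message */
    size_t count;            /* number of queued messages */
    unsigned char *storage;  /* nbufs * buf_size bytes from the callback */
    size_t *lens;            /* length of the message in each buffer */
    alloc_fn alloc;
    free_fn release;
};

/* Create a channel; NULL callbacks fall back to malloc/free. */
static int rx_channel_create(struct rx_channel *ch, size_t nbufs,
                             size_t buf_size, alloc_fn a, free_fn f)
{
    assert(nbufs > 0 && buf_size > 0);
    ch->alloc = a ? a : malloc;
    ch->release = f ? f : free;
    ch->nbufs = nbufs;
    ch->buf_size = buf_size;
    ch->head = 0;
    ch->count = 0;
    ch->storage = ch->alloc(nbufs * buf_size);
    ch->lens = ch->alloc(nbufs * sizeof *ch->lens);
    if (!ch->storage || !ch->lens) {
        if (ch->storage) ch->release(ch->storage);
        if (ch->lens) ch->release(ch->lens);
        return -1;
    }
    return 0;
}

/* Stand-in for the kernel copy: fails if the queue is full. */
static int rx_channel_deliver(struct rx_channel *ch, const void *msg,
                              size_t len)
{
    if (len > ch->buf_size || ch->count == ch->nbufs)
        return -1;
    size_t slot = (ch->head + ch->count) % ch->nbufs;
    memcpy(ch->storage + slot * ch->buf_size, msg, len);
    ch->lens[slot] = len;
    ch->count++;
    return 0;
}

/* Return the oldest message, or NULL if none is queued.
   The pointer stays valid only until its slot is reused. */
static const void *rx_channel_receive(struct rx_channel *ch, size_t *len)
{
    if (ch->count == 0)
        return NULL;
    const void *msg = ch->storage + ch->head * ch->buf_size;
    *len = ch->lens[ch->head];
    ch->head = (ch->head + 1) % ch->nbufs;
    ch->count--;
    return msg;
}

static void rx_channel_destroy(struct rx_channel *ch)
{
    ch->release(ch->storage);
    ch->release(ch->lens);
}
```

Because the buffers are allocated once at channel creation, delivery never has to call back into the receiver's allocator; it just fails when all buffers are occupied.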

OSwhatever · Post by **OSwhatever** » Thu Feb 02, 2012 6:27 am

gerryg400 wrote:I'm guessing that their asynch messaging, like their synchronous messaging uses process-to-process copy.

To receive a msg you create a channel with a bunch of message buffers malloced from your own memory. The "asyncmsg_channel_create" specifies the number of msg buffers, the size of the buffers and the call-backs to get the actual memory. Then do a receive. If there's a message available it will be copied into one of your buffers.

So all buffering is, I'm guessing, done in user space. The kernel just does the copy.

Indeed, but imagine this scenario.

You send a message to another process, message is already allocated in sending process.
Kernel switches to the destination process
Kernel copies the message by mapping the parts of the source address space somewhere.
Now the kernel must deallocate the message in the source process but the source process is not running, we are currently running in the destination process.
In order to deallocate we must switch back again to source process and call a user mode callback.
Scheduling continues and we probably have to switch process again.

You can do the copying in either the destination or the source address space, but you end up with the same problem: you have to either allocate or deallocate in the other process's address space. So these malloc and free callbacks carry a task switching penalty. If the allocation were kernel controlled, everything would be in kernel space and no task switching would be required. Now, the people who develop QNX aren't stupid, so the question is whether they implemented this without the penalty.

turdus · Post by **turdus** » Thu Feb 02, 2012 9:10 am

OSwhatever wrote:Indeed, but imagine this scenario.

You send a message to another process, message is already allocated in sending process.
Kernel switches to the destination process
Kernel copies the message by mapping the parts of the source address space somewhere.
Now the kernel must deallocate the message in the source process but the source process is not running, we are currently running in the destination process.
In order to deallocate we must switch back again to source process and call a user mode callback.
Scheduling continues and we probably have to switch process again.

It can be done with only one switch. Imagine this:
1. Kernel maps the destination process' address space partially into kernel memory
2. Copies the message from sender's address space to temporary, partially mapped memory
3. Unmaps dest. memory in kernel space
4. Deallocates message in source process (it is still running, no need to switch!)
5. Scheduling continues and switch process only once to destination process.

Or even better, if the message is page aligned, it's simply mapped directly into the dest. process' address space. The first 2 points would be:
1. kernel maps dest. process' paging tables into kernel memory
2. map message into dest. process' memory
3.-5. ...(the same as above)

OSwhatever · Post by **OSwhatever** » Thu Feb 02, 2012 10:53 am

turdus wrote: It can be done with only one switch. Imagine this:
1. Kernel maps the destination process' address space partially into kernel memory
2. Copies the message from sender's address space to temporary, partially mapped memory
3. Unmaps dest. memory in kernel space
4. Deallocates message in source process (it is still running, no need to switch!)
5. Scheduling continues and switch process only once to destination process.

So how does the kernel know where exactly to put the message in the destination process?

turdus wrote: Or even better, if message is paged aligned, it's simply mapped directly to dest. process' address space. First 2 points would be:
1. kernel maps dest. process' paging tables into kernel memory
2. map message into dest. process' memory
3.-5. ...(the same as above)

I think you are thinking of the synchronous case now. When dozens of threads send messages to one thread, you must have some mechanism for storing multiple messages. This temporary buffer then becomes a kind of kernel controlled allocation again.

For small messages double copying can be faster, but then you often use kernel memory that is always mapped. QNX, for example, has a threshold of 256 bytes below which it does a double copy. That is for the synchronous case, but it is valid for asynchronous messages as well.
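
The size threshold can be sketched roughly as follows. This is a single-process simulation, not QNX code: the "kernel buffer" is just a local array, and whether the 256-byte boundary is inclusive is an assumption made here. The names `transfer`, `copy_path` and `SMALL_MSG_THRESHOLD` are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* 256 is the figure quoted in the post; treating it as inclusive
   is an assumption of this sketch. */
#define SMALL_MSG_THRESHOLD 256

enum copy_path { COPY_VIA_KERNEL_BUFFER, COPY_DIRECT };

/* Copy len bytes from src to dst and report which path was taken. */
static enum copy_path transfer(void *dst, const void *src, size_t len)
{
    if (len <= SMALL_MSG_THRESHOLD) {
        /* Small message: copy twice through an always-mapped buffer,
           so the other address space never has to be mapped in. */
        unsigned char bounce[SMALL_MSG_THRESHOLD];
        memcpy(bounce, src, len);
        memcpy(dst, bounce, len);
        return COPY_VIA_KERNEL_BUFFER;
    }
    /* Large message: a real kernel would map the peer's pages first;
       here a single copy stands in for that path. */
    memcpy(dst, src, len);
    return COPY_DIRECT;
}
```

The trade-off is the one described above: two short copies through memory that is always mapped can be cheaper than setting up a temporary mapping for a handful of bytes.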

OSwhatever · Post by **OSwhatever** » Thu Feb 02, 2012 10:56 am

berkus wrote:
OSwhatever wrote:You send a message to another process, message is already allocated in sending process.
Kernel switches to the destination process
Kernel copies the message by mapping the parts of the source address space somewhere.
Now the kernel must deallocate the message in the source process but the source process is not running, we are currently running in the destination process.
Wrong, you just mark that buffer as available.

How do you mark something that is not available?

What you can do is put this buffer in a kernel managed list, and when you visit that process again, then you can deallocate.
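
A rough sketch of that deferred-free idea, with made-up names (`struct process`, `defer_free`, `drain_deferred_frees`) and a fixed-size list for simplicity. The "kernel" side only records the pointer; the user-supplied free callback runs later, when the owning process is the one being run, so no extra switch is needed just to free. `counting_free` is only there so the example can observe the calls.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PENDING 8

typedef void (*free_fn)(void *ptr);

/* Hypothetical per-process bookkeeping kept by the kernel. */
struct process {
    free_fn user_free;           /* the process's own free callback */
    void *pending[MAX_PENDING];  /* buffers waiting to be freed */
    size_t npending;
};

/* Called while some OTHER process is running: just remember the buffer. */
static int defer_free(struct process *p, void *buf)
{
    if (p->npending == MAX_PENDING)
        return -1;  /* list full; the caller needs a fallback */
    p->pending[p->npending++] = buf;
    return 0;
}

/* Called when p is scheduled again: run the user callbacks now. */
static size_t drain_deferred_frees(struct process *p)
{
    size_t n = p->npending;
    for (size_t i = 0; i < n; i++)
        p->user_free(p->pending[i]);
    p->npending = 0;
    return n;
}

/* Example callback that only counts how often it was invoked. */
static size_t freed_count;

static void counting_free(void *ptr)
{
    (void)ptr;
    freed_count++;
}
```

The cost is that a buffer stays allocated until its owner runs again, so the sender's pool has to be sized for that delay.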

turdus · Post by **turdus** » Thu Feb 02, 2012 12:46 pm

OSwhatever wrote:I think you are thinking about the synchronous case now.

No, it has nothing to do with synchronization. I was describing a way to pass a message with one process switch, independent of whether the sender waits for a response or not.

OSwhatever · Post by **OSwhatever** » Thu Feb 02, 2012 1:02 pm

turdus wrote:
OSwhatever wrote:I think you are thinking about the synchronous case now.
No, it has nothing to do with synchronization. I was describing a way how to pass a message with one process switch, independent of whether sender waits for a response or not.

2. Copies the message from sender's address space to temporary, partially mapped memory

When the kernel is about to copy the message, the buffer for that message hasn't been allocated yet in the destination process. It has no idea which part of the user memory it is supposed to map. The kernel simply doesn't know about it.

I think you need to describe this part in more detail.

OSwhatever · Post by **OSwhatever** » Thu Feb 02, 2012 5:22 pm

berkus wrote:How is it not available? Kernel has access to the entire address space and when it copies data FROM the buffer it immediately marks it available. Now that the data is copied into the recipient's buffer the sender can reuse this buffer for other messages.

Is marking it available enough? In practice it must be freed, which means modifying the internal allocation structures so that it shows up again for a concurrent malloc to use. Doing that would require locking the allocation structures.

An allocation algorithm could, however, step through the allocated buffers to see whether any is available, but that requires a search among the allocated buffers. Checking a word is usually atomic, so that would work in the concurrent case.
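
The scan-for-a-flag approach could look something like this, using C11 atomics. It is a sketch under assumptions: a fixed pool of equally sized slots, one status word per slot, and hypothetical names (`slot_alloc`, `slot_mark_available`). The side that finishes copying only stores a 0 into the word; the allocator finds the slot again with a linear search and claims it with a compare-and-swap, so no lock around the allocation structures is needed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NSLOTS 4
#define SLOT_SIZE 64

struct slot {
    atomic_int in_use;             /* 0 = available, 1 = owned */
    unsigned char data[SLOT_SIZE];
};

static struct slot pool[NSLOTS];   /* static storage: all flags start at 0 */

/* Linear search for a free slot; the CAS makes the claim race-free. */
static void *slot_alloc(void)
{
    for (size_t i = 0; i < NSLOTS; i++) {
        int expected = 0;
        if (atomic_compare_exchange_strong(&pool[i].in_use, &expected, 1))
            return pool[i].data;
    }
    return NULL;  /* every slot is currently owned */
}

/* What the copying side does when it is done with the buffer:
   a single atomic store, no allocator structures touched. */
static void slot_mark_available(void *data)
{
    struct slot *s = (struct slot *)((unsigned char *)data -
                                     offsetof(struct slot, data));
    atomic_store(&s->in_use, 0);
}
```

The search is O(number of slots), which is the price mentioned above for avoiding a shared free list.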

turdus · Post by **turdus** » Fri Feb 03, 2012 5:08 am

OSwhatever wrote:When the kernel is about to copying the message the buffer for that message hasn't been allocated yet in destination process.

The temporary mapping points to a part of the destination process' memory (not a temporary memory of the source process). The message is copied directly to the destination process' stack. The stack frame is pointed to by rbp/rsp, which is indeed known by the kernel and can be easily adjusted, so no allocation is needed. I assume the blocked recv() will expect the message as an argument anyway. If not, you can still use a dedicated message area.

OSwhatever · Post by **OSwhatever** » Fri Feb 03, 2012 8:23 am

turdus wrote:
OSwhatever wrote:When the kernel is about to copying the message the buffer for that message hasn't been allocated yet in destination process.
The temporary mapping points to a part of destination process' memory (not a temporary memory of source process). Message is copied directly to destination process' stack. The stack frame is pointed by rbp/rsp, which is indeed known by the kernel, and can be easily adjusted, no allocation needed. I assume the blocked recv() will expect the message as an argument anyway. If not you can still use a dedicated message area.

You can't adjust the stack while the destination thread is running, which might be the case here. This might work with synchronous messages but not with asynchronous ones. With a dedicated message receiving area, that's basically kernel controlled memory again.

turdus · Post by **turdus** » Fri Feb 03, 2012 2:01 pm

OSwhatever wrote:basically kernel controlled memory again.

Think again. Any kind of IPC is controlled by the kernel. The destination memory will be written/mapped by the kernel, regardless of sync or async.

OSwhatever · Post by **OSwhatever** » Fri Feb 03, 2012 4:02 pm

turdus wrote:
OSwhatever wrote:basically kernel controlled memory again.
Think it again. Any kind of IPC is controlled by the kernel. The destination memory will be written/mapped by the kernel, regardless of sync or async.

My original point was that I wanted to find a way to let the user process manage the allocation rather than the kernel. The benefit is, for example, that different allocation algorithms suit different types of message traffic. If the user process could handle its own allocation and management structures, that would be beneficial. Also, sometimes you want a separate pool for some purpose rather than a single one, for example if you want less locking on one pool.

It's not the mapping I'm after but fine-grained message allocation.

turdus · Post by **turdus** » Sun Feb 05, 2012 6:54 am

OSwhatever wrote:My original point was that I wanted to find a way to let the user process manage the allocation rather than the kernel. The benefit of doing so is for example that different types of allocation algorithms are suitable for different types of message traffic. If the user process could handle its own allocation and management structures that would be beneficial. Also sometimes you want a separate pool for some purpose rather than a single one, if you want less locking one pool for example.

It's not the mapping I'm after but the fine granularity message allocations.

I'm still convinced that you are mixing things up. A userspace process can allocate memory for the receive buffer and inform the kernel about it on recv(). In this case the memory would be a dedicated area, as I mentioned before. I think your problem is that you do not know what async and sync are. Either way, the receiver is blocked waiting for the message; the difference is on the sender's side (go on or wait for a response). If you assume that the receiver is not blocked, then it would be a signal, where no alloc is required (the signal handler gets its arguments in registers or on top of the stack, as discussed before).

OSDev.org
