How does QNX solve async IPC w user malloc and free?



Post by OSwhatever »

Although QNX relies heavily on synchronous IPC, it has added an asynchronous API as well.

Most notably, when you create an asynchronous channel you can provide your own malloc and free, defined in user space. If you don't, it just uses the normal malloc and free.
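
To make the idea concrete, here is a minimal sketch of a channel that takes allocator callbacks and falls back to malloc and free when none are given. All names (`chan_create`, `chan_get_buffer`, `counting_alloc` and so on) are made up for illustration; this is not the QNX asyncmsg API, only the general shape of "bring your own allocator".

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical callback types; not taken from the QNX headers. */
typedef void *(*chan_alloc_fn)(size_t size);
typedef void (*chan_free_fn)(void *ptr);

struct chan {
    chan_alloc_fn alloc;    /* called for every message buffer */
    chan_free_fn  release;  /* called when a buffer is no longer needed */
    size_t        buf_size;
};

/* Create a channel; NULL callbacks fall back to malloc/free. */
struct chan *chan_create(size_t buf_size, chan_alloc_fn a, chan_free_fn f)
{
    struct chan *c = malloc(sizeof *c);
    if (c == NULL)
        return NULL;
    c->alloc    = a ? a : malloc;
    c->release  = f ? f : free;
    c->buf_size = buf_size;
    return c;
}

void *chan_get_buffer(struct chan *c)
{
    assert(c != NULL);
    return c->alloc(c->buf_size);
}

void chan_put_buffer(struct chan *c, void *buf) { c->release(buf); }
void chan_destroy(struct chan *c) { free(c); }

/* Example user-space allocator that tracks how many buffers are live. */
int live_buffers = 0;

void *counting_alloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL)
        live_buffers++;
    return p;
}

void counting_free(void *ptr)
{
    if (ptr != NULL)
        live_buffers--;
    free(ptr);
}
```

Passing `counting_alloc` and `counting_free` to `chan_create` routes every buffer through the user's own bookkeeping, while passing NULL for both gives the default behaviour.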

You can read about their API here:
http://www.qnx.com/developers/docs/6.3. ... aging.html

I can't really get my head around how they would have solved this efficiently. I can think of a few ways, but they would require more task switches and trips in and out of user mode for the callbacks. It all comes down to having to alter the memory allocation state of a process that isn't currently running, which is an unfortunate effect.

QNX isn't really open source, so it is hard to know, but does anybody know how they would have solved this efficiently? User mode memory management would be the preferable solution, but I have trouble seeing how to do it efficiently without involving the kernel.

Post by Brynet-Inc »

The source was available, but sadly they disabled public access to their svn repositories. You might be able to find someone who still has a copy you can look at.

Post by gerryg400 »

I'm guessing that their async messaging, like their synchronous messaging, uses process-to-process copy.

To receive a message you create a channel with a bunch of message buffers malloced from your own memory. The "asyncmsg_channel_create" call specifies the number of message buffers, the size of the buffers and the callbacks that get the actual memory. Then you do a receive. If there's a message available, it will be copied into one of your buffers.

So all buffering is, I'm guessing, done in user space. The kernel just does the copy.
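
A rough sketch of that guess: the receiver owns a fixed set of message slots, and the only thing the "kernel" does is copy into the next free one. `rx_deliver` stands in for the kernel's process-to-process copy and is just a memcpy here. The names and sizes are invented, and the slots live inside the struct rather than coming from the user's callbacks, to keep it short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RX_SLOTS     4   /* number of message buffers (assumed) */
#define RX_SLOT_SIZE 64  /* size of each buffer in bytes (assumed) */

struct rx_slot {
    size_t        len;
    unsigned char data[RX_SLOT_SIZE];
};

/* Receiver-owned ring of message buffers, all in user memory. */
struct rx_channel {
    struct rx_slot slots[RX_SLOTS];
    size_t head, tail, count;
};

void rx_init(struct rx_channel *ch) { ch->head = ch->tail = ch->count = 0; }

/* Stand-in for the kernel copy: returns 0 on success, -1 if full or too big. */
int rx_deliver(struct rx_channel *ch, const void *msg, size_t len)
{
    assert(ch != NULL && msg != NULL);
    if (len > RX_SLOT_SIZE || ch->count == RX_SLOTS)
        return -1;
    struct rx_slot *s = &ch->slots[ch->tail];
    memcpy(s->data, msg, len);
    s->len = len;
    ch->tail = (ch->tail + 1) % RX_SLOTS;
    ch->count++;
    return 0;
}

/* Receiver side: copy out the oldest message; returns its length or -1. */
int rx_receive(struct rx_channel *ch, void *out, size_t cap)
{
    if (ch->count == 0)
        return -1;
    struct rx_slot *s = &ch->slots[ch->head];
    if (s->len > cap)
        return -1;
    memcpy(out, s->data, s->len);
    ch->head = (ch->head + 1) % RX_SLOTS;
    ch->count--;
    return (int)s->len;
}
```

Once all slots are occupied, `rx_deliver` fails until the receiver drains one, which is the back-pressure you would expect when buffering is entirely the receiver's responsibility.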

Post by OSwhatever »

gerryg400 wrote: I'm guessing that their async messaging, like their synchronous messaging, uses process-to-process copy.

To receive a message you create a channel with a bunch of message buffers malloced from your own memory. The "asyncmsg_channel_create" call specifies the number of message buffers, the size of the buffers and the callbacks that get the actual memory. Then you do a receive. If there's a message available, it will be copied into one of your buffers.

So all buffering is, I'm guessing, done in user space. The kernel just does the copy.

Indeed, but imagine this scenario.

1. You send a message to another process; the message is already allocated in the sending process.
2. The kernel switches to the destination process.
3. The kernel copies the message by mapping the relevant parts of the source address space somewhere.
4. Now the kernel must deallocate the message in the source process, but the source process is not running; we are currently running in the destination process.
5. In order to deallocate, we must switch back to the source process and call a user mode callback.
6. Scheduling continues and we probably have to switch process again.

You can do the copying in either the destination or the source address space, but you end up with the same problem: you have to either allocate or deallocate in the other process's address space. So these malloc and free callbacks carry a task switching penalty. If the allocation were kernel controlled, everything would be in kernel space and no task switching would be required. Now, the people who develop QNX aren't stupid, so the question is whether they implemented this without the penalty.

Post by turdus »

OSwhatever wrote: Indeed, but imagine this scenario.

1. You send a message to another process; the message is already allocated in the sending process.
2. The kernel switches to the destination process.
3. The kernel copies the message by mapping the relevant parts of the source address space somewhere.
4. Now the kernel must deallocate the message in the source process, but the source process is not running; we are currently running in the destination process.
5. In order to deallocate, we must switch back to the source process and call a user mode callback.
6. Scheduling continues and we probably have to switch process again.

It can be done with only one switch. Imagine this:
1. The kernel maps part of the destination process' address space into kernel memory.
2. It copies the message from the sender's address space to that temporary, partially mapped memory.
3. It unmaps the destination memory from kernel space.
4. It deallocates the message in the source process (which is still running, so no need to switch!).
5. Scheduling continues and switches process only once, to the destination process.

Or even better, if the message is page-aligned, it is simply mapped directly into the destination process' address space. The first two steps would then be:
1. The kernel maps the destination process' paging tables into kernel memory.
2. It maps the message into the destination process' memory.
3.-5. ...(the same as above)
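
The page-aligned variant depends on a check along these lines before the kernel decides to remap instead of copy. This is only a sketch: the 4096-byte page size and the names `can_remap` and `pages_needed` are assumptions, and a real kernel would also have to validate ownership and permissions of the pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MSG_PAGE_SIZE 4096u /* assumed page size */

static_assert((MSG_PAGE_SIZE & (MSG_PAGE_SIZE - 1)) == 0,
              "page size must be a power of two");

/* True if the message starts on a page boundary and covers whole pages,
 * so its pages could be mapped into the destination instead of copied. */
bool can_remap(uintptr_t addr, size_t len)
{
    return len != 0 &&
           (addr & (MSG_PAGE_SIZE - 1)) == 0 &&
           (len & (MSG_PAGE_SIZE - 1)) == 0;
}

/* Number of pages a message of len bytes occupies, rounded up. */
size_t pages_needed(size_t len)
{
    return (len + MSG_PAGE_SIZE - 1) / MSG_PAGE_SIZE;
}
```

Anything that fails the check would take the copy path from the first list.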

Post by OSwhatever »

turdus wrote: It can be done with only one switch. Imagine this:
1. The kernel maps part of the destination process' address space into kernel memory.
2. It copies the message from the sender's address space to that temporary, partially mapped memory.
3. It unmaps the destination memory from kernel space.
4. It deallocates the message in the source process (which is still running, so no need to switch!).
5. Scheduling continues and switches process only once, to the destination process.

So how does the kernel know exactly where to put the message in the destination process?

turdus wrote: Or even better, if the message is page-aligned, it is simply mapped directly into the destination process' address space. The first two steps would then be:
1. The kernel maps the destination process' paging tables into kernel memory.
2. It maps the message into the destination process' memory.
3.-5. ...(the same as above)

I think you are thinking about the synchronous case now. When dozens of threads send messages to one thread, you need some mechanism for storing multiple messages. This temporary buffer then becomes a kind of kernel controlled allocation again.

For small messages double copying can be faster, but then you typically go through kernel memory that is always mapped. QNX, for example, has a 256-byte threshold for double copying. That is for the synchronous case, but it is a valid approach for asynchronous messages as well.
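
As a sketch of what such a threshold looks like in code: small messages go through a buffer that is always mapped (two copies), larger ones are copied directly. The 256 comes from the figure above; whether the boundary is inclusive, and the names `transfer_msg` and `kernel_buf`, are my assumptions. Both paths are plain memcpy here, since there is only one address space in a user-space demo.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SMALL_MSG_LIMIT 256 /* figure from the post; "<=" is an assumption */

enum copy_path { COPY_VIA_KERNEL_BUFFER, COPY_DIRECT };

/* Stand-in for kernel memory that is mapped in every address space. */
static unsigned char kernel_buf[SMALL_MSG_LIMIT];

/* Copy len bytes from src to dst and report which path was taken. */
enum copy_path transfer_msg(void *dst, const void *src, size_t len)
{
    if (len <= SMALL_MSG_LIMIT) {
        assert(len <= sizeof kernel_buf);
        memcpy(kernel_buf, src, len); /* copy 1: sender -> kernel buffer */
        memcpy(dst, kernel_buf, len); /* copy 2: kernel buffer -> receiver */
        return COPY_VIA_KERNEL_BUFFER;
    }
    /* Large message: would need a temporary mapping in a real kernel. */
    memcpy(dst, src, len);
    return COPY_DIRECT;
}
```

The point of the small path is that neither address space has to be temporarily mapped into the other; the cost is the second copy.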
Last edited by OSwhatever on Thu Feb 02, 2012 11:06 am, edited 1 time in total.

Post by OSwhatever »

berkus wrote:
OSwhatever wrote:
1. You send a message to another process; the message is already allocated in the sending process.
2. The kernel switches to the destination process.
3. The kernel copies the message by mapping the relevant parts of the source address space somewhere.
4. Now the kernel must deallocate the message in the source process, but the source process is not running; we are currently running in the destination process.

Wrong, you just mark that buffer as available.

How do you mark something that is not available?

What you can do is put the buffer on a kernel managed list and deallocate it the next time you visit that process.
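
Roughly what I have in mind, as a sketch: the kernel only records the pointer while some other process is running, and the user's free callback is run when the owning process is next switched in. `struct proc`, `defer_free` and `drain_deferred` are invented names, and the fixed-size array is just to keep the example small.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_DEFERRED 16 /* arbitrary limit for the sketch */

/* Per-process bookkeeping the kernel would keep (hypothetical layout). */
struct proc {
    void (*user_free)(void *);     /* the process's own free callback */
    void  *deferred[MAX_DEFERRED]; /* buffers waiting to be released */
    size_t ndeferred;
};

/* Called while another process is running: only remember the buffer. */
int defer_free(struct proc *p, void *buf)
{
    if (p->ndeferred == MAX_DEFERRED)
        return -1;
    p->deferred[p->ndeferred++] = buf;
    return 0;
}

/* Called when p is next switched in: run the callbacks in p's context.
 * Returns how many buffers were released. */
size_t drain_deferred(struct proc *p)
{
    assert(p->user_free != NULL);
    size_t n = p->ndeferred;
    for (size_t i = 0; i < n; i++)
        p->user_free(p->deferred[i]);
    p->ndeferred = 0;
    return n;
}
```

This avoids the extra switch, at the price that the sender's memory stays allocated until it runs again.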

Post by turdus »

OSwhatever wrote: I think you are thinking about the synchronous case now.

No, it has nothing to do with synchronization. I was describing a way to pass a message with one process switch, regardless of whether the sender waits for a response or not.

Post by OSwhatever »

turdus wrote:
OSwhatever wrote: I think you are thinking about the synchronous case now.

No, it has nothing to do with synchronization. I was describing a way to pass a message with one process switch, regardless of whether the sender waits for a response or not.

turdus wrote: 2. It copies the message from the sender's address space to that temporary, partially mapped memory.

When the kernel is about to copy the message, the buffer for that message hasn't been allocated yet in the destination process. The kernel has no idea which part of user memory it is supposed to map; it simply doesn't know about it.

I think you need to describe this part in more detail.

Post by OSwhatever »

berkus wrote: How is it not available? The kernel has access to the entire address space, and when it copies data FROM the buffer it immediately marks it available. Now that the data is copied into the recipient's buffer, the sender can reuse this buffer for other messages.

Is marking it available enough? In practice the buffer must be freed, which means modifying the allocator's internal structures so that it shows up again and a concurrent malloc can reuse it. Doing that would require locking the allocation structures.

An allocation algorithm could, however, step through the allocated buffers to see whether any has become available, though that requires a search among the allocated buffers. Checking a word is usually atomic, so that would work in the concurrent case.
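
A sketch of that scanning scheme with C11 atomics: each buffer carries one flag word, the "kernel" clears it with a single store after copying, and the allocator claims a buffer with a compare-and-swap while it scans. The pool size and the names `buf_claim` and `buf_mark_available` are made up. No lock is needed, but allocation becomes a linear search.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NBUF     4   /* number of buffers in the pool (assumed) */
#define BUF_SIZE 128 /* payload bytes per buffer (assumed) */

struct msg_buf {
    atomic_int    in_use; /* 0 = available, 1 = owned by the sender */
    unsigned char data[BUF_SIZE];
};

/* Static storage, so every in_use flag starts out as 0. */
struct msg_buf msg_pool[NBUF];

/* Allocator side: scan for a free buffer and claim it atomically. */
struct msg_buf *buf_claim(void)
{
    for (size_t i = 0; i < NBUF; i++) {
        int expected = 0;
        if (atomic_compare_exchange_strong(&msg_pool[i].in_use, &expected, 1))
            return &msg_pool[i];
    }
    return NULL; /* everything is still in flight */
}

/* "Kernel" side: after copying the message out, one atomic store
 * makes the buffer visible to the allocator again. */
void buf_mark_available(struct msg_buf *b)
{
    assert(b != NULL);
    atomic_store(&b->in_use, 0);
}
```

The kernel never touches the allocator's own data structures here, only the flag word, which is what makes it safe to do from outside the owning process.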

Post by turdus »

OSwhatever wrote: When the kernel is about to copy the message, the buffer for that message hasn't been allocated yet in the destination process.

The temporary mapping points to a part of the destination process' memory (not to temporary memory of the source process). The message is copied directly onto the destination process' stack. The stack frame is pointed to by rbp/rsp, which are indeed known to the kernel and can easily be adjusted, so no allocation is needed. I assume the blocked recv() will expect the message as an argument anyway. If not, you can still use a dedicated message area.

Post by OSwhatever »

turdus wrote:
OSwhatever wrote: When the kernel is about to copy the message, the buffer for that message hasn't been allocated yet in the destination process.

The temporary mapping points to a part of the destination process' memory (not to temporary memory of the source process). The message is copied directly onto the destination process' stack. The stack frame is pointed to by rbp/rsp, which are indeed known to the kernel and can easily be adjusted, so no allocation is needed. I assume the blocked recv() will expect the message as an argument anyway. If not, you can still use a dedicated message area.

You can't adjust the stack while the destination thread is running, which might be the case here. This might work with synchronous messages but not with asynchronous ones. And a dedicated message receiving area is basically kernel controlled memory again.

Post by turdus »

OSwhatever wrote: basically kernel controlled memory again.

Think again. Any kind of IPC is controlled by the kernel. The destination memory will be written/mapped by the kernel, regardless of sync or async.

Post by OSwhatever »

turdus wrote:
OSwhatever wrote: basically kernel controlled memory again.

Think again. Any kind of IPC is controlled by the kernel. The destination memory will be written/mapped by the kernel, regardless of sync or async.

My original point was that I wanted to find a way to let the user process manage the allocation rather than the kernel. The benefit is, for example, that different allocation algorithms suit different types of message traffic. It would be beneficial if the user process could handle its own allocation and management structures. Also, sometimes you want separate pools for particular purposes rather than a single one, for example to reduce locking on one pool.

It's not the mapping I'm after but the fine-grained message allocations.
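
As an example of the kind of user-side policy I mean, here is a sketch of a fixed-size block pool built on a free list. A process could keep one such pool per traffic type, so that exhausting or locking one does not affect the other. The names and sizes are invented for illustration and locking is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_BLOCKS     8  /* blocks per pool (assumed) */
#define POOL_BLOCK_SIZE 64 /* bytes per block (assumed) */

/* A free block stores the link to the next free block in its own bytes. */
union pool_block {
    union pool_block *next;
    unsigned char     bytes[POOL_BLOCK_SIZE];
};

struct msg_pool {
    union pool_block  blocks[POOL_BLOCKS];
    union pool_block *free_list;
};

/* Thread every block onto the free list. */
void pool_init(struct msg_pool *p)
{
    p->free_list = NULL;
    for (size_t i = 0; i < POOL_BLOCKS; i++) {
        p->blocks[i].next = p->free_list;
        p->free_list = &p->blocks[i];
    }
}

/* O(1) allocation: pop the head of the free list, or NULL if empty. */
void *pool_alloc(struct msg_pool *p)
{
    union pool_block *b = p->free_list;
    if (b != NULL)
        p->free_list = b->next;
    return b;
}

/* O(1) release: push the block back onto the free list. */
void pool_free(struct msg_pool *p, void *ptr)
{
    assert(ptr != NULL);
    union pool_block *b = ptr;
    b->next = p->free_list;
    p->free_list = b;
}
```

With two instances, say one for control messages and one for bulk data, running the control pool dry leaves the bulk pool untouched, which is hard to get from a single kernel controlled allocator.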

Post by turdus »

OSwhatever wrote: My original point was that I wanted to find a way to let the user process manage the allocation rather than the kernel. The benefit is, for example, that different allocation algorithms suit different types of message traffic. It would be beneficial if the user process could handle its own allocation and management structures. Also, sometimes you want separate pools for particular purposes rather than a single one, for example to reduce locking on one pool.

It's not the mapping I'm after but the fine-grained message allocations.

I'm still convinced that you are mixing things up. The userspace process can allocate memory for a receive buffer and inform the kernel about it on recv(). In this case the memory would be a dedicated area, as I mentioned before. I think your problem is that you do not know what async and sync mean. Either way, the receiver is blocked waiting for the message; the difference is on the sender's side (carry on or wait for a response). If you assume that the receiver is not blocked, then it would be a signal, where no alloc is required (the signal handler gets its arguments in registers or on top of the stack, as discussed before).