OSDev.org

Posted: **Sat Aug 10, 2024 7:06 pm**

Hello,
I'm currently implementing system calls in my hobby OS, and I'm curious about how different operating systems move data between user/kernel space in system calls. How does this work in other people's operating systems?

I'm currently brainstorming different designs. I'm developing (very slowly) a microkernel for the RISC-V architecture, and running the OS in S-Mode. When moving from U-Mode to S-Mode in a syscall, the kernel can't directly access memory in user pages. Of course you can manipulate the page tables to give yourself access, I guess. This means that passing data to the kernel isn't as simple as just storing a pointer in a register, because the kernel can't just access arbitrary addresses in userspace.

I've had a few ideas, but all of them have their downsides. For now I've made a 'shared buffer' (mapped at different addresses in user and kernel space) for copying data in and out. This works well enough, but it takes up extra memory, and the copying is inefficient. I'd love to hear other people's ideas.

Posted: **Sat Aug 10, 2024 10:54 pm**

I highly doubt that it is impossible to access user pages from system mode on RISC-V. But maybe some operation is necessary to tell the CPU it's OK now. Other architectures have a feature called KUAP (kernel user access prevention). On current x86, for example, you can set it up so that the kernel has to set the AC bit in the FLAGS register to be able to access user pages. Otherwise it gets a protection fault.

In general, the approach I would recommend is to transmit integer parameters and pointers in registers. Limit yourself to 6 arguments and 1 register for the syscall number, and the whole scheme will be portable even to i386, if you should ever desire this.
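To make the register convention concrete, here is a minimal sketch of a dispatcher that reads one syscall number and six arguments from a saved register frame. The frame layout, the mapping to a7 and a0..a5, the two toy syscalls, and the error value are all assumptions made for illustration, not taken from any particular kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical saved user registers: one syscall number plus six arguments. */
struct syscall_frame {
    uintptr_t nr;      /* syscall number (a7 in this sketch) */
    uintptr_t arg[6];  /* arguments (a0..a5 in this sketch) */
    intptr_t  ret;     /* result handed back to the caller (a0 in this sketch) */
};

typedef intptr_t (*syscall_fn)(const uintptr_t arg[6]);

/* Two toy syscalls, invented purely for illustration. */
static intptr_t sys_add(const uintptr_t arg[6])
{
    return (intptr_t)(arg[0] + arg[1]);
}

static intptr_t sys_getpid(const uintptr_t arg[6])
{
    (void)arg;
    return 1;
}

enum { SYS_ADD = 0, SYS_GETPID = 1, SYS_COUNT };

#define ERR_NOSYS (-38) /* arbitrary "no such syscall" code for this sketch */

static const syscall_fn syscall_table[SYS_COUNT] = {
    [SYS_ADD]    = sys_add,
    [SYS_GETPID] = sys_getpid,
};

/* Called by the trap handler once the user registers have been saved. */
static void syscall_dispatch(struct syscall_frame *f)
{
    assert(f != NULL);
    /* The number comes from user mode, so bounds-check it before indexing. */
    if (f->nr >= SYS_COUNT || syscall_table[f->nr] == NULL) {
        f->ret = ERR_NOSYS;
        return;
    }
    f->ret = syscall_table[f->nr](f->arg);
}
```

Because every handler receives the same six-slot array, adding a syscall only means adding a table entry; nothing in the trap path depends on how many arguments a given call actually uses.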

Registers have the advantage of always existing and always being accessible without side effects.

Larger and variable-length arguments (like path names) need to be transmitted through memory. You will typically have a function that does the user access explicitly. This function is typically instrumented so that it can fail if the access faults (so the fault handler can see that it was that function and redirect the instruction stream accordingly).

Everything needs to be copied into kernel space before it can be used. This prevents a variety of possible attacks.
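A rough sketch of the copy-in-then-validate pattern, runnable on a host. User memory is simulated with a plain array, and `copy_from_user`, `sys_fill`, `struct write_req`, and the error values are hypothetical names for this example. A real kernel would rely on the fault handler rather than a bounds check on an array, but the important part is the same: the request is copied once, and only the private copy is validated and used, so user code cannot change it between the check and the use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simulated user address space: addresses [0, USER_SIZE) count as mapped. */
#define USER_SIZE 256u
static unsigned char user_mem[USER_SIZE];

#define ERR_FAULT (-14) /* arbitrary error codes for this sketch */
#define ERR_INVAL (-22)

/* The one place that touches user memory; fails instead of faulting. */
static int copy_from_user(void *dst, uintptr_t uaddr, size_t len)
{
    assert(dst != NULL);
    /* Overflow-safe range check: never compute uaddr + len directly. */
    if (len > USER_SIZE || uaddr > USER_SIZE - len)
        return ERR_FAULT;
    memcpy(dst, &user_mem[uaddr], len);
    return 0;
}

/* Hypothetical request structure that user code passes by pointer. */
struct write_req {
    uint32_t offset;
    uint32_t len;
};

#define BUF_SIZE 64u
static unsigned char kernel_buf[BUF_SIZE];

/* Toy syscall: fill part of a kernel buffer as described by the request. */
static int sys_fill(uintptr_t ureq, unsigned char value)
{
    struct write_req req;

    /* Copy the whole request into kernel memory exactly once. */
    int err = copy_from_user(&req, ureq, sizeof req);
    if (err)
        return err;

    /* Validate and use only the private copy from here on. */
    if (req.len > BUF_SIZE || req.offset > BUF_SIZE - req.len)
        return ERR_INVAL;

    memset(&kernel_buf[req.offset], value, req.len);
    return 0;
}
```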

Posted: **Sat Aug 10, 2024 11:45 pm**

nullplan wrote: ↑Sat Aug 10, 2024 10:54 pm I highly doubt that it is impossible to access user pages from system mode on RISC-V. But maybe some operation is necessary to tell the CPU it's OK now. Other architectures have a feature called KUAP (kernel user access prevention). On current x86, for example, you can set it up so that the kernel has to set the AC bit in the FLAGS register to be able to access user pages. Otherwise it gets a protection fault.

I had a bit of a look into this, and you are right. Pardon me. RISC-V has the 'SUM' field in sstatus, which controls whether accesses to user pages will fault in kernel mode. That does make things much simpler.
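For illustration, here is a small sketch of how a kernel might bracket user accesses with SUM. SUM is bit 18 of sstatus. To keep the example runnable on a host, the CSR is replaced by an ordinary variable (`fake_sstatus`); in a real S-mode kernel the two helpers would set and clear the bit with CSR instructions on sstatus instead. The helper names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

#define SSTATUS_SUM (UINT64_C(1) << 18) /* SUM is bit 18 of sstatus */

/* Stand-in for the real CSR so the sketch runs outside a kernel. */
static uint64_t fake_sstatus;

/* Real kernel: set the SUM bit in sstatus with a CSR set instruction. */
static void user_access_begin(void)
{
    fake_sstatus |= SSTATUS_SUM;
}

/* Real kernel: clear the SUM bit in sstatus with a CSR clear instruction. */
static void user_access_end(void)
{
    fake_sstatus &= ~SSTATUS_SUM;
}

static int sum_enabled(void)
{
    return (fake_sstatus & SSTATUS_SUM) != 0;
}

/* Keep the window in which user pages are reachable as short as possible. */
static unsigned char read_user_byte(const unsigned char *uptr)
{
    user_access_begin();
    assert(sum_enabled());
    unsigned char value = *uptr;
    user_access_end();
    return value;
}
```

Leaving SUM clear outside these helpers means a stray kernel dereference of a user pointer still faults, which is the same protection the KUAP feature mentioned earlier provides.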

Posted: **Fri Aug 16, 2024 12:41 am**

Pointers to user memory would typically need validation since the linear address space of user mode and kernel mode is the same. That means user mode can send a pointer to kernel memory, and if used without validation, user mode could corrupt kernel memory.

My segmented design doesn't need pointer validation. User mode has flat selectors that only map user-mode memory, not kernel-mode memory. Pointers are passed as 48-bit segment:offset, and used that way in the kernel too.

Posted: **Fri Aug 16, 2024 9:00 am**

rdos wrote: ↑Fri Aug 16, 2024 12:41 am Pointers to user memory would typically need validation since the linear address space of user mode and kernel mode is the same. That means user mode can send a pointer to kernel memory, and if used without validation, user mode could corrupt kernel memory.

Exactly. In any higher-half kernel design I know of, this is a single compare.
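A minimal sketch of that check, assuming a hypothetical boundary `USER_TOP` below which all user addresses live. A bare pointer really is one compare; once a length is involved, a second compare is needed so the check cannot be defeated by integer overflow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical split: user space is [0, USER_TOP), the kernel lives above. */
#define USER_TOP UINT64_C(0x0000800000000000)

/* A single pointer: one compare. */
static bool user_ptr_ok(uint64_t addr)
{
    return addr < USER_TOP;
}

/* A buffer: compare against USER_TOP - len so addr + len cannot wrap. */
static bool user_range_ok(uint64_t addr, uint64_t len)
{
    return len <= USER_TOP && addr <= USER_TOP - len;
}

/* A few example calls showing the intended behaviour. */
void user_range_ok_examples(void)
{
    assert(user_ptr_ok(0x1000));
    assert(user_range_ok(USER_TOP - 16, 16));  /* ends exactly at the limit */
    assert(!user_range_ok(USER_TOP - 16, 17)); /* one byte into the kernel */
    assert(!user_range_ok(UINT64_MAX, 2));     /* addr + len would wrap */
}
```

The check only proves the range lies in the user half. It says nothing about whether the pages are mapped, so the copy itself still has to cope with faults.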

rdos wrote: ↑Fri Aug 16, 2024 12:41 am My segmented design doesn't need pointer validation. User mode has flat selectors that only map user-mode memory, not kernel-mode memory. Pointers are passed as 48-bit segment:offset, and used that way in the kernel too.

And if the user guesses the kernel segment correctly, you will be writing file data into the process list, anyway. Unless you pass the segments in segment registers; that might work.

Posted: **Mon Aug 19, 2024 12:58 am**

nullplan wrote: ↑Fri Aug 16, 2024 9:00 am And if the user guesses the kernel segment correctly, you will be writing file data into the process list, anyway. Unless you pass the segments in segment registers; that might work.

All pointers must be passed in registers, so if the user tries to access kernel data, loading the segment register will fault.

How to move data between user and kernel space in syscalls?
