Question about blocking operations in syscalls

heavyweight87
Posts: 22
Joined: Sun Apr 28, 2019 7:39 am

Question about blocking operations in syscalls

Post by heavyweight87 »

I was thinking about syscalls, and that they basically do all their work in an interrupt.


I come from the embedded world, where you want to do as little as possible in an interrupt, but the syscall design seems not to follow this.
It seems to me that every syscall operation is a blocking one...
E.g. a read from a slow file system could hold the system up for tens of ms. A task that reads a large file could fire multiple interrupts and really slow everything down.

Is this not problematic?
iansjack
Member
Posts: 4703
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Re: Question about blocking operations in syscalls

Post by iansjack »

1. System calls don't have to be handled by interrupts. You could also use the much faster syscall mechanism (although a system call may still spend an appreciable amount of time running in kernel mode).

2. Not all system calls block.

3. A blocking system call returns almost immediately, having put the blocked task on a queue of blocked tasks and switched to the next runnable task. Blocking calls may well spend less time in the interrupt routine than non-blocking calls.

4. If you use buffered I/O routines then a single system call will transfer a large amount of data (depending upon the size of your buffers). If the disk driver is interrupt driven then you will get an interrupt for each sector (or possibly group of sectors) read, but the routine called will probably be very small.

5. The time spent in interrupts is far less important in interactive systems than real-time ones. A real-time OS will follow a very different design to a general-purpose OS.
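
Point 4 can be illustrated with a rough user-space sketch. Everything here is hypothetical: `fake_sys_read` only stands in for a real read system call by copying from an in-memory "file" and counting how often it is entered, and the 4096-byte buffer size is an arbitrary choice.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a read system call: copies from an
 * in-memory "file" and counts how many times it was entered. */
static const unsigned char *file_data;
static size_t file_size, file_pos;
static unsigned long syscall_count;

size_t fake_sys_read(void *dst, size_t len)
{
    assert(file_pos <= file_size);
    size_t left = file_size - file_pos;
    size_t n = len < left ? len : left;
    syscall_count++;
    memcpy(dst, file_data + file_pos, n);
    file_pos += n;
    return n;
}

#define BUF_SIZE 4096   /* arbitrary buffer size for this sketch */

struct buffered_reader {
    unsigned char buf[BUF_SIZE];
    size_t fill;   /* bytes currently held in buf */
    size_t next;   /* index of the next unread byte */
};

/* Returns the next byte, or -1 at end of file. The "kernel" is only
 * entered when the user-space buffer has been used up. */
int buffered_getc(struct buffered_reader *r)
{
    if (r->next == r->fill) {
        r->fill = fake_sys_read(r->buf, BUF_SIZE);
        r->next = 0;
        if (r->fill == 0)
            return -1;
    }
    return r->buf[r->next++];
}

/* Reads up to `count` bytes one at a time and reports how many
 * "system calls" that took. */
unsigned long count_syscalls_buffered(const unsigned char *data,
                                      size_t size, size_t count)
{
    struct buffered_reader r = { .fill = 0, .next = 0 };
    file_data = data;
    file_size = size;
    file_pos = 0;
    syscall_count = 0;
    for (size_t i = 0; i < count; i++)
        if (buffered_getc(&r) < 0)
            break;
    return syscall_count;
}
```

With a 10000-byte file, reading every byte through `buffered_getc` enters `fake_sys_read` 3 times, where an unbuffered byte-at-a-time loop would enter it 10000 times.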
heavyweight87
Posts: 22
Joined: Sun Apr 28, 2019 7:39 am

Re: Question about blocking operations in syscalls

Post by heavyweight87 »

Thanks for the reply.

So a blocking call will just return, and as it has the pointer to the user's buffer it can just copy the data to the buffer and wake the task up when it's done?

How exactly does the kernel block the process once it's returned to user land from the syscall interrupt?
iansjack
Member
Posts: 4703
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Re: Question about blocking operations in syscalls

Post by iansjack »

To block a process, the kernel places the running process (the one that made the system call) on the queue of blocked processes. It then initiates a task switch to the next runnable task, which resumes from wherever it was when it was last switched out. So the system call doesn't actually return until the blocked task runs again, but the blocked process consumes no CPU time.

The exact mechanism by which the blocked task is subsequently unblocked is part of the design of the OS; it's not necessarily the same for all OSs.

All system calls have to do a certain amount of work, whether they are called by a software interrupt or a syscall mechanism. There's not really any way of avoiding this. I prefer to use syscall rather than a software interrupt as it's simpler and, I believe, faster. Even if you use software interrupts, they are not really interrupts because they are not asynchronous.

There's some interesting information about system calls here: https://x86.lol/generic/2019/07/04/kernel-entry.html
bellezzasolo
Member
Posts: 110
Joined: Sun Feb 20, 2011 2:01 pm

Re: Question about blocking operations in syscalls

Post by bellezzasolo »

heavyweight87 wrote:Thanks for the reply.

So a blocking call will just return, and as it has the pointer to the user's buffer it can just copy the data to the buffer and wake the task up when it's done?

How exactly does the kernel block the process once it's returned to user land from the syscall interrupt?
Well, I don't have syscalls, yet, but the approach is quite similar to a hardware interrupt.

The hardware interrupt signals a semaphore, and returns, just doing basic EOI sort of stuff.
This wakes an event thread which does the actual interrupt handling (the Deferred Procedure Call on Windows is a similar mechanism).

So, for a syscall, what you probably want to do is as follows.

Dispatch to the appropriate subsystem, say, the VFS.

Create a semaphore, and put the request, along with associated semaphore, into a queue. Then wait on the semaphore, which should block the current thread, and immediately invoke the scheduler.

So the thread is blocked, and another thread is run. Meanwhile, you have a kernel thread (high priority) that processes requests and issues them to the hardware. Another thread handles completed requests, or it could be the same thread. How much threading you want in the kernel/drivers is up to you.

But, when the requisite event is complete, you signal the associated semaphore. The user thread is woken, and you now have the data requested.

This also has the advantage that, for non-blocking I/O, you do exactly the same thing, but return the semaphore, instead of waiting. The user process might WaitForSingleObject or whatever at a later stage.

In fact, you could just implement blocking I/O in user space by chaining the two syscalls.
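
A minimal single-threaded sketch of this scheme, under loud assumptions: every name below (`ksem_wait_or_block`, `io_submit`, `io_wait`, `io_worker_step`) is invented for illustration, "blocking" is modelled by flipping a thread state rather than by really calling a scheduler, and "performing" the request is just setting a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum thread_state { THREAD_RUNNABLE, THREAD_BLOCKED };

struct thread {
    enum thread_state state;
    struct thread *next_waiter;
};

/* Counting semaphore with a list of waiting threads. */
struct ksem {
    int count;
    struct thread *waiters;
};

/* An I/O request carries its own completion semaphore. */
struct io_request {
    struct ksem done;
    bool completed;
    struct io_request *next;
};

static struct io_request *queue_head, *queue_tail;

/* Returns true if the semaphore was taken at once, false if the caller
 * was put to sleep (a real kernel would invoke the scheduler here). */
bool ksem_wait_or_block(struct ksem *s, struct thread *self)
{
    if (s->count > 0) {
        s->count--;
        return true;
    }
    self->state = THREAD_BLOCKED;
    self->next_waiter = s->waiters;
    s->waiters = self;
    return false;
}

/* Wake one waiter if there is one, otherwise remember the signal. */
void ksem_signal(struct ksem *s)
{
    if (s->waiters) {
        struct thread *t = s->waiters;
        s->waiters = t->next_waiter;
        t->next_waiter = NULL;
        t->state = THREAD_RUNNABLE;
    } else {
        s->count++;
    }
}

/* Non-blocking half: queue the request and return immediately. */
void io_submit(struct io_request *req)
{
    assert(req != NULL);
    req->done.count = 0;
    req->done.waiters = NULL;
    req->completed = false;
    req->next = NULL;
    if (queue_tail)
        queue_tail->next = req;
    else
        queue_head = req;
    queue_tail = req;
}

/* Blocking half: wait on the request's semaphore. */
bool io_wait(struct io_request *req, struct thread *self)
{
    return ksem_wait_or_block(&req->done, self);
}

/* One iteration of the kernel worker thread: take a request, "perform"
 * it, and signal completion. Returns false if the queue was empty. */
bool io_worker_step(void)
{
    struct io_request *req = queue_head;
    if (!req)
        return false;
    queue_head = req->next;
    if (!queue_head)
        queue_tail = NULL;
    req->completed = true;
    ksem_signal(&req->done);
    return true;
}
```

A blocking read is then `io_submit` followed directly by `io_wait`, while a non-blocking read calls `io_submit`, returns to the caller, and leaves `io_wait` for later. If the worker has already finished by then, the wait returns at once without blocking.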
Whoever said you can't do OS development on Windows?
https://github.com/ChaiSoft/ChaiOS
heavyweight87
Posts: 22
Joined: Sun Apr 28, 2019 7:39 am

Re: Question about blocking operations in syscalls

Post by heavyweight87 »

So the system call doesn't actually return until the blocked task runs again, but no cpu time is consumed by the blocked process.
This is the part I’m unsure about. So the blocking process is stuck in an interrupt?
I guess the interrupt is still in the context of the running task so it’s ok to switch to another task and resume the blocking task later on
iansjack
Member
Posts: 4703
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Re: Question about blocking operations in syscalls

Post by iansjack »

Again, it's probably best to stop thinking in terms of interrupts. It's really just a synchronous change of privilege level.
bzt
Member
Posts: 1584
Joined: Thu Oct 13, 2016 4:55 pm

Re: Question about blocking operations in syscalls

Post by bzt »

heavyweight87 wrote: This is the part I’m unsure about. So the blocking process is stuck in an interrupt?
I guess the interrupt is still in the context of the running task so it’s ok to switch to another task and resume the blocking task later on
Think of it like this:

Process A makes a "read" syscall. In the syscall handler (which could be an interrupt handler too), the kernel sees that the request cannot be fulfilled right away. Therefore the kernel removes process A from the active tasks queue, and instructs the disk driver to read sectors. Then the scheduler picks process B (the next task in the active tasks queue), and carries on.

When the sector read finishes, an interrupt is generated, and process B is interrupted. There the kernel sees that this interrupt is the response to a request by process A, so it wakes process A (adding it to the head of the active tasks queue), and control is returned to process A, which now has the data for the "read" syscall.

Process A has absolutely no clue that it was blocked during the syscall. From its perspective, it called "read" and the kernel returned the data. The kernel never blocked in the syscall handler either; it just returned to another process and did not schedule process A until its data became ready.
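
The sequence above can be sketched as a small single-threaded C simulation. All names here (`struct task`, `sys_read_blocking`, `disk_irq`, `schedule`) are made up for illustration, and the "context switch" is only a pointer assignment; a real kernel would also save and restore register state and actually program the disk controller.

```c
#include <assert.h>
#include <stddef.h>

enum task_state { TASK_READY, TASK_RUNNING, TASK_BLOCKED };

struct task {
    const char *name;
    enum task_state state;
    struct task *next;     /* link in the active (ready) queue */
    int read_result;       /* value the "read" syscall will return */
};

static struct task *ready_head, *ready_tail;
static struct task *current;       /* task that owns the CPU right now */
static struct task *disk_waiter;   /* task waiting for the disk, if any */

void ready_push_back(struct task *t)
{
    t->next = NULL;
    if (ready_tail)
        ready_tail->next = t;
    else
        ready_head = t;
    ready_tail = t;
}

void ready_push_front(struct task *t)
{
    t->next = ready_head;
    ready_head = t;
    if (!ready_tail)
        ready_tail = t;
}

struct task *ready_pop(void)
{
    struct task *t = ready_head;
    if (t) {
        ready_head = t->next;
        if (!ready_head)
            ready_tail = NULL;
        t->next = NULL;
    }
    return t;
}

/* Pick the next ready task and make it current. */
void schedule(void)
{
    struct task *next = ready_pop();
    if (next)
        next->state = TASK_RUNNING;
    current = next;
}

/* "read" syscall handler: the data is not available yet, so the
 * caller is taken off the CPU and another task runs instead. */
void sys_read_blocking(void)
{
    assert(current != NULL);
    current->state = TASK_BLOCKED;
    disk_waiter = current;   /* a real kernel would start the disk read here */
    schedule();
}

/* Disk interrupt handler: runs while some other task is current. */
void disk_irq(int bytes_read)
{
    struct task *woken = disk_waiter;
    if (!woken)
        return;
    disk_waiter = NULL;
    woken->read_result = bytes_read;
    if (current) {           /* the interrupted task goes back on the queue */
        current->state = TASK_READY;
        ready_push_front(current);
    }
    woken->state = TASK_READY;
    ready_push_front(woken); /* head of the queue, so it runs next */
    schedule();
}
```

With tasks A and B queued in that order, calling `schedule()`, then `sys_read_blocking()`, then `disk_irq(512)` leaves A running again with `read_result` set to 512 and B back in the ready state. At no point does the syscall handler spin waiting for the disk.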

Cheers,
bzt
nexos
Member
Posts: 1081
Joined: Tue Feb 18, 2020 3:29 pm
Libera.chat IRC: nexos

Re: Question about blocking operations in syscalls

Post by nexos »

As long as the register state is saved when blocked, all is well. That is how I do it in NexOS.
"How did you do this?"
"It's very simple — you read the protocol and write the code." - Bill Joy
Projects: NexNix | libnex | nnpkg
bellezzasolo
Member
Posts: 110
Joined: Sun Feb 20, 2011 2:01 pm

Re: Question about blocking operations in syscalls

Post by bellezzasolo »

nexos wrote:As long as the register state is saved when blocked, all is well. That is how I do it in NexOS.
The one thing to be careful of is correct handling of context switches.

E.g. you have a thread which is preempted, via a timer interrupt.

You load a context which is from a thread that blocked.

You need to make sure that you send the EOI for the timer interrupt, probably just before you change context.

I think this could be the issue I currently have with semaphores, actually...

I'm probably going to need to put a flag in the per-cpu data, so I know whether to call the EOI function or not.
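
One possible shape for such a flag, as a rough sketch only: `struct cpu_data`, `timer_irq_enter` and `context_switch` are hypothetical names, and `send_eoi` merely counts calls where a real kernel would write to the interrupt controller.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU data block. */
struct cpu_data {
    bool eoi_pending;    /* set on IRQ entry, cleared once the EOI is sent */
    unsigned eoi_sent;   /* stands in for the real EOI register write */
};

/* Stand-in for acknowledging the interrupt at the controller. */
void send_eoi(struct cpu_data *cpu)
{
    assert(cpu->eoi_pending);   /* never acknowledge twice */
    cpu->eoi_sent++;
    cpu->eoi_pending = false;
}

/* Called at the start of the timer interrupt handler. */
void timer_irq_enter(struct cpu_data *cpu)
{
    cpu->eoi_pending = true;
}

/* Called for every context switch, whether it was triggered by the
 * timer or by a thread blocking voluntarily. */
void context_switch(struct cpu_data *cpu)
{
    if (cpu->eoi_pending)
        send_eoi(cpu);   /* acknowledge just before changing context */
    /* ...save the old context and load the new one here... */
}
```

A switch caused by a thread blocking on a semaphore sends no EOI, while a switch out of the timer handler sends exactly one, regardless of which kind of context is being loaded.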
nexos
Member
Posts: 1081
Joined: Tue Feb 18, 2020 3:29 pm
Libera.chat IRC: nexos

Re: Question about blocking operations in syscalls

Post by nexos »

That explains why the timer waiting code in NexOS wasn't working. Thanks, @bellezzasolo. It is probably best to use hardware locks in the timer code. Although this helps prevent the problem, it doesn't eliminate it fully. It is just something to look out for, as I have fallen victim to lock problems before. Locks sometimes cause more problems than they solve... :?
nexos
Member
Posts: 1081
Joined: Tue Feb 18, 2020 3:29 pm
Libera.chat IRC: nexos

Re: Question about blocking operations in syscalls

Post by nexos »

My blocking code can be found here.
heavyweight87
Posts: 22
Joined: Sun Apr 28, 2019 7:39 am

Re: Question about blocking operations in syscalls

Post by heavyweight87 »

Thanks everyone!!
linguofreak
Member
Posts: 510
Joined: Wed Mar 09, 2011 3:55 am

Re: Question about blocking operations in syscalls

Post by linguofreak »

heavyweight87 wrote:
So the system call doesn't actually return until the blocked task runs again, but no cpu time is consumed by the blocked process.
This is the part I’m unsure about. So the blocking process is stuck in an interrupt?
I guess the interrupt is still in the context of the running task so it’s ok to switch to another task and resume the blocking task later on
On some non-x86 architectures, there's a distinction between interrupts and traps.

On such architectures, an interrupt is an event triggered by an external signal from hardware that causes the CPU to start fetching instructions from a defined address. Different architectures determine the handler's address in different ways, but often there is a register that contains a pointer to a table of pointers to handlers. On older, simpler architectures there might not be a table; the CPU might just jump to a hard-wired address in low memory.

A trap, on architectures that make the distinction, is an instruction that, when executed, causes the CPU to start fetching instructions from a defined address.

Traps never involve external hardware, so the trap handler doesn't have to service any hardware or interact with the interrupt controller, and interrupts are only disabled very briefly, if at all. System calls are generally implemented in terms of traps.

However, traditionally, on x86, the trap instruction is called "Int", and selects a handler from the same table of handler addresses that is used to service hardware interrupts. More recently, though, there are instructions like "syscall" that speed things up by eliminating a lot of things that have to be done for a real hardware interrupt, but not for a trap. But whether you use "Int" or "syscall", the things that the handler needs to do are more along the lines of a trap handler than an interrupt handler.

So when a process makes a blocking syscall, the interrupt (if the "Int" instruction is used) is generally over, as far as the hardware is concerned, pretty much immediately, but from the software standpoint, the OS switches to another task and only completes the interrupt handler much later.

But you have to look at what the process is doing when it makes the syscall. If it were a program on an embedded system that had full control of the machine, the equivalent operation would generally be something like issuing a command to an I/O device and then halting or busy-waiting until the device responds. But since user programs generally don't have access to device I/O or the halt instruction, a blocking syscall replaces that combination. It tells the OS to perform some operation on behalf of the program, and tells the OS that the program is done executing until that operation is complete. The syscall is *implemented* in terms of a trap/software interrupt to get the attention of the OS, and so, yes, the program's register state does end up on an interrupt/trap stack somewhere in kernel space until the program is ready to run again, but it's better to think of the program as being halted than as being in the middle of an interrupt.
Korona
Member
Posts: 1000
Joined: Thu May 17, 2007 1:27 pm

Re: Question about blocking operations in syscalls

Post by Korona »

Part of the confusion is probably that when you say "interrupt", you think of an atomic (for lack of a better word) interrupt context, i.e., one that blocks other interrupts from happening, runs on its own per-CPU stack and is disallowed from performing most operations that are not reentrant. This is due to the asynchronous nature of the interrupt, and the disallowed operations usually include sleeping / blocking. For syscalls, however, that is not the case: syscalls typically run on a per-task kernel thread and interrupts are not masked during syscalls.
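
One way to make that distinction explicit in code is a per-CPU nesting counter that the blocking path checks. This is only a sketch with hypothetical names (`struct cpu_state`, `irq_enter`, `irq_exit`, `try_block_current`), not how any particular kernel does it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU state: how many interrupt handlers are active. */
struct cpu_state {
    int irq_nesting;
};

void irq_enter(struct cpu_state *cpu)
{
    cpu->irq_nesting++;
}

void irq_exit(struct cpu_state *cpu)
{
    assert(cpu->irq_nesting > 0);
    cpu->irq_nesting--;
}

/* Sleeping is only legal outside atomic interrupt context. */
bool may_block(const struct cpu_state *cpu)
{
    return cpu->irq_nesting == 0;
}

/* Returns true if the caller was allowed to block. A real kernel would
 * more likely panic or log a bug than return false. */
bool try_block_current(struct cpu_state *cpu)
{
    if (!may_block(cpu))
        return false;
    /* ...move the current task to a wait queue and call the scheduler... */
    return true;
}
```

A syscall handler never goes through `irq_enter`, so it may call `try_block_current` freely; a hardware interrupt handler runs between `irq_enter` and `irq_exit` and is refused.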
managarm: Microkernel-based OS capable of running a Wayland desktop (Discord: https://discord.gg/7WB6Ur3). My OS-dev projects: [mlibc: Portable C library for managarm, qword, Linux, Sigma, ...] [LAI: AML interpreter] [xbstrap: Build system for OS distributions].