Question about blocking operations in syscalls

heavyweight87 · Post by **heavyweight87** » Fri Jul 24, 2020 3:48 am

I was thinking about syscalls, and that they basically do all their work in an interrupt.

I come from the embedded world, where you want to do as little as possible in an interrupt, but the syscall design seems not to follow this.
It seems to me that every syscall operation is a blocking one...
E.g. a read from a slow file system could hold the system up for tens of ms. A task that reads a large file could fire multiple interrupts and really slow everything down.

Is this not problematic?

iansjack · Post by **iansjack** » Fri Jul 24, 2020 4:18 am

1. System calls don't have to be handled by interrupts. You could also use the much faster syscall mechanism (although a system call may still spend an appreciable amount of time running in kernel mode).

2. Not all system calls block.

3. A blocking system call leaves the handler almost immediately: it puts the calling task on a queue of blocked tasks and switches to the next runnable task. Blocking calls may well spend less time in the interrupt routine than non-blocking calls.

4. If you use buffered I/O routines then a single system call will transfer a large amount of data (depending upon the size of your buffers). If the disk driver is interrupt driven then you will get an interrupt for each sector (or possibly group of sectors) read, but the routine called will probably be very small.

5. The time spent in interrupts is far less important in interactive systems than real-time ones. A real-time OS will follow a very different design to a general-purpose OS.

heavyweight87 · Post by **heavyweight87** » Fri Jul 24, 2020 4:25 am

Thanks for the reply.

So a blocking call will just return, and as it has the pointer to the user's buffer it can just copy the data to the buffer and wake the task up when it's done?

How exactly does the kernel block the process once it's returned to user land from the syscall interrupt?

iansjack · Post by **iansjack** » Fri Jul 24, 2020 5:08 am

To block a process, the kernel places the running process (the one that made the system call) on the queue of blocked processes. The kernel then initiates a task switch to the next runnable task, which resumes from the point at which it was previously switched out. So the system call doesn't actually return until the blocked task runs again, but no CPU time is consumed by the blocked process.
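As a rough illustration of the paragraph above, here is a minimal user-space sketch in C of "put the caller on the blocked queue, switch to the next runnable task". Everything in it is an assumption made for the example: the `tasks` array, `block_current`, `wake` and `timer_tick` are hypothetical names, the "queue" is just a state flag, and assigning to `current` stands in for a real context switch.

```c
#include <assert.h>

enum task_state { TASK_RUNNABLE, TASK_BLOCKED };

struct task {
    enum task_state state;
    unsigned ticks_run;            /* CPU time this task has consumed */
};

#define NTASKS 3
struct task tasks[NTASKS];         /* zero-initialised: all TASK_RUNNABLE */
int current = 0;                   /* index of the running task */

/* Round-robin: next runnable task after `current`, or -1 if there is none. */
static int pick_next(void)
{
    for (int i = 1; i <= NTASKS; i++) {
        int cand = (current + i) % NTASKS;
        if (tasks[cand].state == TASK_RUNNABLE)
            return cand;
    }
    return -1;
}

/* Called inside the syscall handler when the request cannot complete yet. */
void block_current(void)
{
    tasks[current].state = TASK_BLOCKED;   /* goes on the "blocked queue" */
    int next = pick_next();
    assert(next >= 0);    /* a real kernel would fall back to an idle task */
    current = next;       /* stands in for the actual context switch */
}

/* Called when whatever the task was waiting for has happened. */
void wake(int id)
{
    tasks[id].state = TASK_RUNNABLE;
}

/* One timer tick: charge the running task, then rotate to the next one. */
void timer_tick(void)
{
    tasks[current].ticks_run++;
    current = pick_next();   /* never -1 here: `current` itself is runnable */
}
```

If task 0 calls `block_current()` and the timer then ticks any number of times, `tasks[0].ticks_run` stays at zero until something calls `wake(0)`, which is the "no CPU time is consumed" point.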

The exact mechanism by which the blocked task is subsequently unblocked is part of the design of the OS; it's not necessarily the same for all OSs.

All system calls have to do a certain amount of work, whether they are called by a software interrupt or a syscall mechanism. There's not really any way of avoiding this. I prefer to use syscall rather than a software interrupt as it's simpler and, I believe, faster. Even if you use software interrupts, they are not really interrupts because they are not asynchronous.

There's some interesting information about system calls here: https://x86.lol/generic/2019/07/04/kernel-entry.html

bellezzasolo · Post by **bellezzasolo** » Fri Jul 24, 2020 5:14 am

heavyweight87 wrote:Thanks for the reply.

So a blocking call will just return, and as it has the pointer to the user's buffer it can just copy the data to the buffer and wake the task up when it's done?

How exactly does the kernel block the process once it's returned to user land from the syscall interrupt?

Well, I don't have syscalls, yet, but the approach is quite similar to a hardware interrupt.

The hardware interrupt signals a semaphore, and returns, just doing basic EOI sort of stuff.
This wakes an event thread which deals with actually handling the interrupt (Deferred Procedure Call on Windows is a similar mechanism).

So, for a syscall, what you probably want to do is as follows.

Dispatch to the appropriate subsystem, say, the VFS.

Create a semaphore, and put the request, along with associated semaphore, into a queue. Then wait on the semaphore, which should block the current thread, and immediately invoke the scheduler.

So the thread is blocked, and another thread is run. Meanwhile, you have a (high priority) kernel thread that processes requests and issues them to the hardware. Another thread handles completed requests, or it could be the same thread. How much threading you want in the kernel/drivers is up to you.

But, when the requisite event is complete, you signal the associated semaphore. The user thread is woken, and you now have the data requested.

This also has the advantage that, for non-blocking I/O, you do exactly the same thing, but return the semaphore, instead of waiting. The user process might WaitForSingleObject or whatever at a later stage.

In fact, you could just implement blocking I/O in user space by chaining the two syscalls.
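The scheme described in this post could be sketched roughly as follows. This is a single-threaded simulation, not kernel code: `struct ksem`, `struct kthread`, `struct io_request`, `submit` and `worker_step` are all hypothetical names, "blocking" is just a flag on the thread, and the device I/O is a placeholder computation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct kthread {
    bool blocked;
    struct kthread *next_waiter;
};

struct ksem {
    int count;
    struct kthread *waiters;      /* threads blocked on this semaphore */
};

/* Returns true if the caller may continue. Returns false if the caller was
 * marked blocked; a real kernel would invoke the scheduler at that point. */
bool ksem_wait(struct ksem *s, struct kthread *self)
{
    if (s->count > 0) {
        s->count--;
        return true;
    }
    self->blocked = true;
    self->next_waiter = s->waiters;
    s->waiters = self;
    return false;
}

void ksem_signal(struct ksem *s)
{
    struct kthread *t = s->waiters;
    if (t != NULL) {              /* wake one waiter */
        s->waiters = t->next_waiter;
        t->next_waiter = NULL;
        t->blocked = false;
    } else {                      /* nobody waiting yet: remember the event */
        s->count++;
    }
}

struct io_request {
    struct ksem done;             /* signalled when the request completes */
    int sector;
    int result;
    struct io_request *next;
};

static struct io_request *queue_head, *queue_tail;

/* Syscall side: queue the request. The caller then either waits on
 * r->done at once (blocking I/O) or hands it back to user space as a
 * handle to wait on later (non-blocking I/O). */
void submit(struct io_request *r)
{
    assert(r != NULL);
    r->done.count = 0;
    r->done.waiters = NULL;
    r->next = NULL;
    if (queue_tail != NULL)
        queue_tail->next = r;
    else
        queue_head = r;
    queue_tail = r;
}

/* One iteration of the kernel worker thread. Returns false if idle. */
bool worker_step(void)
{
    struct io_request *r = queue_head;
    if (r == NULL)
        return false;
    queue_head = r->next;
    if (queue_head == NULL)
        queue_tail = NULL;
    r->result = r->sector * 2;    /* placeholder for real device I/O */
    ksem_signal(&r->done);
    return true;
}
```

Because the semaphore remembers a signal that arrives before anyone waits, the same `submit` path serves both cases: wait immediately for blocking I/O, or wait later (possibly without sleeping at all) for non-blocking I/O.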

heavyweight87 · Post by **heavyweight87** » Fri Jul 24, 2020 6:19 am

iansjack wrote:So the system call doesn't actually return until the blocked task runs again, but no CPU time is consumed by the blocked process.

This is the part I’m unsure about. So the blocking process is stuck in an interrupt?
I guess the interrupt is still in the context of the running task, so it’s OK to switch to another task and resume the blocking task later on.

iansjack · Post by **iansjack** » Fri Jul 24, 2020 6:30 am

Again, it's probably best to stop thinking in terms of interrupts. It's really just a synchronous change of privilege level.

bzt · Post by **bzt** » Fri Jul 24, 2020 7:44 am

heavyweight87 wrote:This is the part I’m unsure about. So the blocking process is stuck in an interrupt?
I guess the interrupt is still in the context of the running task, so it’s OK to switch to another task and resume the blocking task later on.

Think of it like this:

Process A makes a "read" syscall. In the syscall handler (which could be an interrupt handler too), the kernel sees that the request cannot be fulfilled. Therefore the kernel removes process A from the active tasks queue, and instructs the disk driver to read sectors. Then the scheduler picks process B (the next task in the active tasks queue), and carries on.

When the sector read finishes, an interrupt is generated, and process B is interrupted. The kernel sees that this interrupt is the response to a request by process A, so it wakes process A (adding it to the head of the active tasks queue), and control is returned to process A, which now has the data for the "read" syscall.

Process A has absolutely no clue that it was blocked during the syscall. From its perspective it called "read" and the kernel returned the data. The kernel never blocked in the syscall handler either; it just returned to another process, and did not schedule process A until its data became ready.
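The sequence above can be walked through with a small C simulation. All names here (`sys_read_begin`, `disk_irq`, `sys_read_finish`, the `active` array) are hypothetical, and the sketch makes one assumption beyond the description: the interrupt handler fills a kernel-side buffer and the copy to the user's buffer happens once process A is running again, since A's address space is not necessarily the active one while B is being interrupted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PROCS   4
#define SECTOR_SIZE 8              /* tiny, for illustration only */

struct process {
    char *user_buf;                /* where the pending read should land */
    char kbuf[SECTOR_SIZE];        /* kernel-side buffer filled by the driver */
};

/* Active tasks queue; active[0] is the process that runs next. */
struct process *active[MAX_PROCS];
int nactive;

static struct process *pending_reader;   /* owner of the in-flight request */

void enqueue_tail(struct process *p)
{
    assert(nactive < MAX_PROCS);
    active[nactive++] = p;
}

static void enqueue_head(struct process *p)
{
    assert(nactive < MAX_PROCS);
    memmove(&active[1], &active[0], (size_t)nactive * sizeof active[0]);
    active[0] = p;
    nactive++;
}

static struct process *dequeue_head(void)
{
    assert(nactive > 0);
    struct process *p = active[0];
    nactive--;
    memmove(&active[0], &active[1], (size_t)nactive * sizeof active[0]);
    return p;
}

/* First half of the "read" syscall: the data is not ready, so start the
 * I/O and take the caller off the active queue. */
void sys_read_begin(char *user_buf)
{
    struct process *caller = dequeue_head();
    caller->user_buf = user_buf;
    pending_reader = caller;       /* stands in for programming the disk */
    /* the scheduler now runs active[0], i.e. the next process */
}

/* Disk interrupt handler: runs while some other process is current. */
void disk_irq(const char *sector_data)
{
    struct process *p = pending_reader;
    assert(p != NULL);
    memcpy(p->kbuf, sector_data, SECTOR_SIZE);
    pending_reader = NULL;
    enqueue_head(p);               /* head of the queue: it runs next */
}

/* Second half of the syscall, running in the woken process's context
 * again: hand the data over and return to user mode. */
void sys_read_finish(void)
{
    assert(nactive > 0);
    struct process *self = active[0];
    assert(self->user_buf != NULL);
    memcpy(self->user_buf, self->kbuf, SECTOR_SIZE);
    self->user_buf = NULL;
}
```

With A and B queued in that order, `sys_read_begin` leaves B at the head, `disk_irq` puts A back in front of B, and `sys_read_finish` delivers the sector to A's buffer.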

Cheers,
bzt

nexos · Post by **nexos** » Fri Jul 24, 2020 7:48 am

As long as the register state is saved when blocked, all is well. That is how I do it in NexOS.

bellezzasolo · Post by **bellezzasolo** » Fri Jul 24, 2020 9:13 am

nexos wrote:As long as the register state is saved when blocked, all is well. That is how I do it in NexOS.

The one thing to be careful of is correct handling of context switches.

E.g. you have a thread which is preempted, via a timer interrupt.

You load a context which is from a thread that blocked.

You need to make sure that you send the EOI for the timer interrupt, probably just before you change context.

I think this could be the issue I currently have with semaphores, actually...

I'm probably going to need to put a flag in the per-cpu data, so I know whether to call the EOI function or not.
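One way the per-CPU flag idea might look, as a sketch only: `struct percpu`, `eoi_pending`, `switch_to` and `timer_irq` are hypothetical names, `send_eoi` just counts calls instead of touching an interrupt controller, and the context switch is reduced to an assignment.

```c
#include <assert.h>
#include <stdbool.h>

struct percpu {
    bool eoi_pending;   /* an IRQ was taken on this CPU and not yet acknowledged */
    int current;        /* id of the running thread */
};

int eoi_count;          /* stands in for writes to the interrupt controller */

static void send_eoi(void)
{
    eoi_count++;
}

/* Acknowledge the interrupt exactly once, whoever gets here first. */
static void ack_if_pending(struct percpu *cpu)
{
    if (cpu->eoi_pending) {
        cpu->eoi_pending = false;
        send_eoi();
    }
}

/* May be reached from the timer IRQ (preemption) or from a blocking call.
 * The thread being resumed may have blocked voluntarily, so it will not
 * return through this interrupt's exit path; the EOI has to go out first. */
void switch_to(struct percpu *cpu, int next)
{
    ack_if_pending(cpu);
    cpu->current = next;    /* placeholder for saving/restoring registers */
}

void timer_irq(struct percpu *cpu, bool preempt, int next)
{
    assert(!cpu->eoi_pending);   /* no nested unacknowledged IRQ in this sketch */
    cpu->eoi_pending = true;
    if (preempt)
        switch_to(cpu, next);
    ack_if_pending(cpu);         /* no-op if switch_to already sent the EOI */
}
```

The flag makes `switch_to` safe to call from either path: it sends one EOI when entered from the timer interrupt and none when a thread blocks outside interrupt context.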

nexos · Post by **nexos** » Fri Jul 24, 2020 6:25 pm

That explains why the timer waiting code in NexOS wasn't working. Thanks, @bellezzasolo. It is probably best to use hardware locks in the timer code. Although this helps prevent the problem, it doesn't mitigate it fully. It is just something to look out for, as I have fallen victim to lock problems before. Locks sometimes cause more problems than they solve...

nexos · Post by **nexos** » Fri Jul 24, 2020 6:28 pm

My blocking code can be found here.

heavyweight87 · Post by **heavyweight87** » Fri Jul 24, 2020 11:25 pm

Thanks everyone!!

linguofreak · Post by **linguofreak** » Sat Jul 25, 2020 2:08 pm

heavyweight87 wrote:
So the system call doesn't actually return until the blocked task runs again, but no CPU time is consumed by the blocked process.
This is the part I’m unsure about. So the blocking process is stuck in an interrupt?
I guess the interrupt is still in the context of the running task, so it’s OK to switch to another task and resume the blocking task later on.

On some non-x86 architectures, there's a distinction between interrupts and traps.

On such architectures, an interrupt is an event triggered by an external signal from hardware that causes the CPU to start fetching instructions from a defined address. Different architectures determine the address of the interrupt handler in different ways, but often there is a register that contains a pointer to a table of pointers to handlers. On older, simpler architectures there might not be a table; the CPU might just jump to a hard-wired address in low memory.

A trap, on architectures that make the distinction, is an instruction that, when executed, causes the CPU to start fetching instructions from a defined address.

Traps never involve external hardware, so the trap handler doesn't have to service any hardware or interact with the interrupt controller, and interrupts are only disabled very briefly, if at all. System calls are generally implemented in terms of traps.

However, traditionally, on x86, the trap instruction is called "Int", and selects a handler from the same table of handler addresses that is used to service hardware interrupts. More recently, though, there are instructions like "syscall" that speed things up by eliminating a lot of things that have to be done for a real hardware interrupt, but not for a trap. But whether you use "Int" or "syscall", the things that the handler needs to do are more along the lines of a trap handler than an interrupt handler.

So when a process makes a blocking syscall, the interrupt (if the "Int" instruction is used) is generally over, as far as the hardware is concerned, pretty much immediately, but from the software standpoint, the OS switches to another task and only completes the interrupt handler much later.

But you have to look at what the process is doing when it makes the syscall. If it were a program on an embedded system that had full control of the machine, the equivalent operation would generally be something like issuing a command to an I/O device and then halting or busy-waiting until the device responds. But since user programs generally don't have access to device I/O or the halt instruction, a blocking syscall replaces that combination. It tells the OS to perform some operation on behalf of the program, and tells the OS that the program is done executing until that operation is complete. The syscall is *implemented* in terms of a trap/software interrupt to get the attention of the OS, and so, yes, the program's register state does end up on an interrupt/trap stack somewhere in kernel space until the program is ready to run again, but it's better to think of the program as being halted than as being in the middle of an interrupt.

Korona · Post by **Korona** » Sun Jul 26, 2020 2:12 am

Part of the confusion is probably that when you say "interrupt", you think of an atomic (for lack of a better word) interrupt context, i.e., one that blocks other interrupts from happening, runs on its own per-CPU stack and is disallowed from performing most operations that are not reentrant (this is due to the asynchronous nature of the interrupt, and the disallowed operations usually include sleeping / blocking). For syscalls, however, that is not the case: syscalls typically run on a per-task kernel thread and interrupts are not masked during syscalls.
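The distinction can be made concrete with a small guard that refuses to block in interrupt context. This is a hedged sketch: `irq_depth`, `irq_enter`, `irq_exit`, `in_irq_context` and `try_block` are names invented for the example, and a real kernel would keep the counter per CPU.

```c
#include <assert.h>
#include <stdbool.h>

static int irq_depth;    /* how many interrupt handlers are currently nested */

void irq_enter(void)
{
    irq_depth++;
}

void irq_exit(void)
{
    assert(irq_depth > 0);
    irq_depth--;
}

bool in_irq_context(void)
{
    return irq_depth > 0;
}

/* Returns true if the caller is allowed to block. In syscall context a real
 * kernel would now mark the task as blocked and call the scheduler; in
 * interrupt context it must not sleep, because the interrupted code (and
 * the rest of the handler) would be stalled along with it. */
bool try_block(void)
{
    if (in_irq_context())
        return false;
    return true;
}
```

A syscall handler never calls `irq_enter`, so `try_block` succeeds there; the same call made from inside a hardware interrupt handler is rejected.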

OSDev.org
