How to kill a (different) task? using signal?

songziming · Post by **songziming** » Tue Nov 19, 2024 2:01 am

Hi,

What do you think is the proper way of killing a task (aka thread)?

My OS is multiprocessor and multitasking. If I want to stop task-X from task-Y, then task-Y calls `task_kill(X)`. But task-X might be running on another CPU, task-X might be modifying its TCB, and, even worse, task-X might be trying to kill task-Y at the same time.

Linux uses signals to kill a process: task-Y sends a KILL signal to task-X, and when task-X next runs, it checks its pending signals, finds KILL, and stops itself.

With this scheme a task only ever modifies its own TCB, so there is no need for a spinlock. I use a uint32 bitmask in the TCB to represent signals, so sending a signal is just an atomic_or. Signals are checked during interrupt return and syscall return.
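A minimal sketch of the bitmask scheme described above, using C11 atomics. The names `task_t`, `signal_send`, `signal_take_pending`, and the signal numbers are hypothetical and not taken from the poster's kernel; taking the pending bits with an atomic exchange is one possible choice, not necessarily what the poster does.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical signal numbers and TCB layout, for illustration only. */
enum { SIG_KILL = 0, SIG_USR = 1 };

typedef struct task {
    _Atomic uint32_t signals;   /* one bit per pending signal */
} task_t;

/* May be called by any task on any CPU: lock-free, touches only the bitmask. */
static inline void signal_send(task_t *target, unsigned sig)
{
    assert(sig < 32);
    atomic_fetch_or(&target->signals, UINT32_C(1) << sig);
}

/* Called on the interrupt/syscall return path for the task about to run.
 * Atomically takes every pending bit, so a bit set by a racing sender is
 * either returned now or stays set for the next check. */
static inline uint32_t signal_take_pending(task_t *self)
{
    return atomic_exchange(&self->signals, 0);
}

/* Tests one signal in a mask returned by signal_take_pending(). */
static inline bool signal_is_set(uint32_t mask, unsigned sig)
{
    assert(sig < 32);
    return ((mask >> sig) & 1u) != 0;
}
```

The sender never touches any other TCB field, which is what makes the lock-free approach workable here.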

The only problem is that the scheduler may run during an interrupt, and `tid_next` might change. During interrupt return, I check `tid_next->signals` to see if SIG_KILL is set. If so, `tid_next` is terminated, the scheduler runs again, a new task is assigned to `tid_next`, and signals are checked again.

So the interrupt calls the scheduler, handling SIG_KILL calls the scheduler again, and signals are handled during interrupt return. Should I set a limit on how many rounds of signal handling I run during int_return?

thewrongchristian · Post by **thewrongchristian** » Tue Nov 19, 2024 5:44 am

I'm not sure what you're asking.

Are you asking whether you should limit the number of pending signal handlers to invoke in the user context you're about to return to?

If so, then yes, you can only handle one signal at a time, so if you're handling a signal in a user signal handler, you should just find the first signal to handle, put its context on the user stack, and return to the user signal context (assuming the user is handling the signal.)

Once the user code makes another system call, upon return from that system call, you can add any further signal being handled to the context on return from the system call (assuming said signals are not blocked.)

If you're asking whether you should limit the number of processes that end up getting killed when the scheduler picks a new process, then no, you should not limit that. If each process the scheduler chooses needs to be killed due to a pending signal, then kill it and move on until you pick a process that does not have a pending signal to kill it, or you have no more runnable processes left (in which case you idle).
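The "kill it and move on" loop could be sketched roughly as follows. This is a single-threaded simplification with hypothetical names (`runqueue_t`, `rq_pop`, `pick_next_live`); a real kernel would read the signal word atomically and do proper teardown instead of setting a `dead` flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SIG_KILL_BIT UINT32_C(1)   /* hypothetical bit for SIG_KILL */

typedef struct task {
    uint32_t     signals;          /* pending-signal bitmask */
    bool         dead;             /* stand-in for real task teardown */
    struct task *next;             /* run-queue link */
} task_t;

typedef struct { task_t *head; } runqueue_t;

/* Removes and returns the first runnable task, or NULL if none is left. */
static task_t *rq_pop(runqueue_t *rq)
{
    assert(rq != NULL);
    task_t *t = rq->head;
    if (t != NULL) {
        rq->head = t->next;
        t->next = NULL;
    }
    return t;
}

/* Returns the next task to run, or NULL meaning "run the idle loop".
 * Every iteration removes one task from the queue, so the loop is bounded
 * by the number of runnable tasks and needs no extra iteration limit. */
static task_t *pick_next_live(runqueue_t *rq)
{
    task_t *t;
    while ((t = rq_pop(rq)) != NULL) {
        if ((t->signals & SIG_KILL_BIT) == 0)
            return t;              /* nothing pending that kills it */
        t->dead = true;            /* kill it and move on */
    }
    return NULL;                   /* no runnable task left: idle */
}
```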

nullplan · Post by **nullplan** » Tue Nov 19, 2024 2:09 pm

Here's how I do it: Each task has a word of task information flags in its TCB, and the TCB is conveniently located for assembler code. If another task sends a signal, it sets the TIF_SIGPENDING flag. The signal is kept pending in another way as well (actually two ways: either a siginfo structure, which is a list node, or a sigpending flag), because of the need to support sigpending() and signal holding semantics.

Anyway, the only use of the TIF_SIGPENDING flag is that if it is set during interrupt/syscall return to userspace (i.e. the destination CS is the user CS), then the kernel is entered again before the return. The function reenter_kernel() clears the TIF_SIGPENDING flag, gets the least-numbered pending unblocked signal (so signal 1 is processed before signal 2, etc.), and handles it according to policy. For SIGKILL and SIGSTOP, this unconditionally means acting on their respective default actions; for the others, the handler is looked up and acted upon.
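As a rough illustration of the selection step, here is a sketch that picks the least-numbered deliverable signal from a 64-bit pending set. The struct layout, the `K_SIGKILL`/`K_SIGSTOP` values (the Linux x86 numbers are used here), and the re-arming of the flag when more signals remain are all assumptions of this sketch, not details from the post.

```c
#include <assert.h>
#include <stdint.h>

#define TIF_SIGPENDING UINT32_C(1)        /* hypothetical flag bit */
enum { K_SIGKILL = 9, K_SIGSTOP = 19 };   /* assumed signal numbers */

/* Signals are numbered from 1, so signal n lives in bit n-1. */
#define SIGBIT(n) (UINT64_C(1) << ((n) - 1))

typedef struct task {
    uint32_t tif_flags;
    uint64_t pending;   /* signals waiting for delivery */
    uint64_t blocked;   /* signals the task currently blocks */
} task_t;

/* Pending signals that may be delivered now; KILL and STOP ignore the mask. */
static uint64_t deliverable(const task_t *t)
{
    uint64_t unblockable = SIGBIT(K_SIGKILL) | SIGBIT(K_SIGSTOP);
    return t->pending & (~t->blocked | unblockable);
}

/* Clears TIF_SIGPENDING and returns the least-numbered deliverable signal,
 * removing it from the pending set. Returns 0 if nothing is deliverable. */
static int reenter_kernel_pick(task_t *t)
{
    assert(t->tif_flags & TIF_SIGPENDING);
    t->tif_flags &= ~TIF_SIGPENDING;

    uint64_t ready = deliverable(t);
    if (ready == 0)
        return 0;

    int sig = 1;
    while ((ready & SIGBIT(sig)) == 0)
        sig++;                            /* lowest set bit wins */
    t->pending &= ~SIGBIT(sig);

    /* Assumption: re-arm the flag so remaining signals get their turn. */
    if (deliverable(t) != 0)
        t->tif_flags |= TIF_SIGPENDING;
    return sig;
}
```

The caller would then dispatch on the returned number: default action for SIGKILL and SIGSTOP, handler lookup for everything else.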

Now, true, the destination process may have active threads running on another CPU when the signal is sent. I have an IPI that makes the destination CPU reschedule the current thread. And since returning from IPI is returning from an interrupt, the above processing will happen whether another process is runnable for that CPU or not. All the sending process must do is check that the destination thread might be running on that CPU.
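The sender's side of this might look something like the sketch below. Everything here is hypothetical: the `on_cpu` field, `CPU_NONE`, and `send_resched_ipi()` (a stub that only counts requests) stand in for whatever the real kernel uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define TIF_SIGPENDING UINT32_C(1)   /* hypothetical flag bit */
#define CPU_NONE (-1)                /* task is not running anywhere */
#define MAX_CPUS 8

typedef struct task {
    _Atomic uint32_t tif_flags;
    _Atomic int      on_cpu;         /* CPU running the task, or CPU_NONE */
} task_t;

/* Stub standing in for the hardware IPI: just counts requests per CPU. */
static unsigned ipi_count[MAX_CPUS];

static void send_resched_ipi(int cpu)
{
    assert(cpu >= 0 && cpu < MAX_CPUS);
    ipi_count[cpu]++;
}

/* Marks a signal as pending, then pokes the CPU the target may be on.
 * The flag is set before on_cpu is read, so a target that is not running
 * yet sees the flag on its next return to userspace, and a target that
 * is running gets an IPI whose return path performs the same check. */
static void notify_signal(task_t *target)
{
    atomic_fetch_or(&target->tif_flags, TIF_SIGPENDING);
    int cpu = atomic_load(&target->on_cpu);
    if (cpu != CPU_NONE)
        send_resched_ipi(cpu);
}
```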

rdos · Post by **rdos** » Tue Nov 19, 2024 3:07 pm

There are many problems with killing threads (or processes, for that matter). They might hold locks that will cause other threads to become indefinitely blocked. They might hold resources that must be freed. They might be in kernel mode waiting for input (or other events, like disc data). A typical example is waiting for keyboard input. Killing a thread in kernel mode introduces many more problems.

I keep track of kernel resources per process, so killing a thread will not free them, since the kernel doesn't know that the thread owns them.

Additionally, I don't have a "decode level" in syscalls, so it cannot be used to check for signals. Syscalls use call gates that act just like any call, except it's a far call to a different mode.

So, as a consequence, I don't support killing threads or processes from outside. They need to terminate gracefully themselves.

nullplan · Post by **nullplan** » Tue Nov 19, 2024 8:05 pm

rdos wrote: ↑Tue Nov 19, 2024 3:07 pm There are many problems with killing threads (or processes, for that matter). They might hold locks that will cause other threads to become indefinitely blocked. They might hold resources that must be freed. They might be in kernel mode waiting for input (or other events, like disc data). A typical example is waiting for keyboard input. Killing a thread in kernel mode introduces many more problems.

You know, those are all solved problems. On UNIX, killing a thread kills the whole process. This gets rid of process-internal locks. And since exiting a process cleans up any open files and memory maps it may have, most resources end up freed as well. The only locks that remain are process-external ones, e.g. in shared memory, but there is the option of using robust mutexes, so that the next one to lock them gets notified that we may be in an inconsistent state, and has to clean up that state as best as possible.

And as for killing threads in kernel mode: If they are in interruptible sleep (e.g. waiting on the keyboard, or a pipe, or a socket), there is no issue, since they can hold no kernel resources like spin locks or such. And if they are in non-interruptible sleep (e.g. waiting on disk data), sending a signal does not wake them, even if it is a killing signal.
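The distinction between the two kinds of sleep can be reduced to a small wake-up decision, sketched below. The types and the `wake_check()` function are hypothetical and only illustrate the rule; a real wait primitive would call something like this each time the sleeper is examined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { SLEEP_INTERRUPTIBLE, SLEEP_UNINTERRUPTIBLE } sleep_kind_t;
typedef enum { WAKE_NONE, WAKE_EVENT, WAKE_SIGNAL } wake_reason_t;

typedef struct task {
    uint32_t     pending_signals;   /* bitmask of pending signals */
    sleep_kind_t sleep_kind;        /* how the task went to sleep */
} task_t;

/* Decides whether a sleeping task should be woken, and why.
 * The awaited event always wakes the task. A pending signal wakes it
 * only from interruptible sleep; in uninterruptible sleep even a killing
 * signal stays pending until the event arrives. */
static wake_reason_t wake_check(const task_t *t, bool event_ready)
{
    assert(t != NULL);
    if (event_ready)
        return WAKE_EVENT;
    if (t->sleep_kind == SLEEP_INTERRUPTIBLE && t->pending_signals != 0)
        return WAKE_SIGNAL;
    return WAKE_NONE;
}
```

A task woken with `WAKE_SIGNAL` would typically abandon the wait and return to the syscall exit path, where the pending signal is acted upon.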

rdos · Post by **rdos** » Wed Nov 20, 2024 1:53 am

nullplan wrote: ↑Tue Nov 19, 2024 8:05 pm
You know, those are all solved problems. On UNIX, killing a thread kills the whole process. This gets rid of process-internal locks. And since exiting a process cleans up any open files and memory maps it may have, most resources end up freed as well. The only locks that remain are process-external ones, e.g. in shared memory, but there is the option of using robust mutexes, so that the next one to lock them gets notified that we may be in an inconsistent state, and has to clean up that state as best as possible.

Then UNIX doesn't actually support killing a thread, since what really happens is that the whole process is killed. Killing the process is a lot easier than killing a thread, given that you can then clear the whole user address space and track kernel resources per process so they can be freed too. I have most of this done.

Still, if you want to kill multithreaded processes, then you are back to problems with synchronization within the process, as well as the problem with exiting syscalls in a safe way. All threads in the process must be killed before the process can be terminated.

nullplan wrote: ↑Tue Nov 19, 2024 8:05 pm And as for killing threads in kernel mode: If they are in interruptible sleep (e.g. waiting on the keyboard, or a pipe, or a socket), there is no issue, since they can hold no kernel resources like spin locks or such. And if they are in non-interruptible sleep (e.g. waiting on disk data), sending a signal does not wake them, even if it is a killing signal.

The issue is that you must clutter all kernel synchronization primitives and resources with checking for signals.

OSDev.org
