Multitasking issues - Triple Fault for unknown reason

minater247
Posts: 17
Joined: Sat Jun 18, 2022 11:38 pm

Multitasking issues - Triple Fault for unknown reason

Post by minater247 »

Hello again folks,

I've been struggling with multitasking yet again for the past week, and I'm completely stuck on a weird issue. At this point I think I might be a bit out of my depth, so I would appreciate any and all help.

So far, I believe that:
- Task switching works properly. If I don't use the usermode process management functions such as execv, fork, and waitpid, everything appears to work perfectly fine.
- Something is very wrong, as I don't even get the "Kernel panic:\n" message in serial (which should be the very first thing it does).
- The issue has something to do with task switching, as single-stepping the code in GDB and setting breakpoints causes the issue not to occur. However, I don't know exactly where the issue lies.

To expand on that last point: the issue happens in QEMU but not in Bochs, and with UEFI disabled the code occasionally works. When single-stepping it works every time, although I believe this is only because QEMU doesn't deliver interrupts the same way while GDB is single-stepping.

One notable thing is that when I halt the program while it hangs after executing the command, but before it triple faults, it appears to be running some code up in the 0x770000 range (which shouldn't even be mapped in virtual memory?). When I backtrace with GDB, I get 2 calls at 0x20, which isn't an instruction address for either the shell or loaded program, so I have no idea how it ended up there, despite it looking like a valid location in memory.

Unfortunately, for now, that's all I've got. I'm completely lost trying to debug this, so if anyone is able to garner any insight from this, it would be much appreciated!

The current branch for the OS code is at:
https://github.com/Minater247/Xana3/tree/multitasking
Some notable files are:
process.c: https://github.com/Minater247/Xana3/blo ... /process.c
elf_loader.c: https://github.com/Minater247/Xana3/blo ... f_loader.c

All C code is located in /arch/x86_64/c/, and assembly in arch/x86_64/asm/.

And the current branch for the shell, in case it's helpful, is here:
https://github.com/Minater247/Xansh

The issue occurs when attempting to run any file from the shell - the provided XanHello.elf is a perfect example of the issue at hand. To test this, all that must be done after booting is typing `/mnt/ramdisk/bin/XanHello.elf` into the terminal.

If anything else would help, please let me know! I will try and provide anything else that would help in figuring this one out.
nullplan
Member
Posts: 1760
Joined: Wed Aug 30, 2017 8:24 am

Re: Multitasking issues - Triple Fault for unknown reason

Post by nullplan »

Your syscall handler doesn't switch stacks. That may be an issue. The syscall instruction does not automatically switch stacks for you; you have to do it yourself. Something like:

Code:

syscall_handler_asm:
  mov [syscall_old_rsp], rsp   ; save the user stack pointer
  mov rsp, kernel_stack_top    ; switch to the kernel stack

And then at the end, you need to restore rsp, of course. You somehow already have syscall_old_rsp, but it is never set, nor is rsp restored at the end.
Carpe diem!
Octocontrabass
Member
Posts: 5497
Joined: Mon Mar 25, 2013 7:01 pm

Re: Multitasking issues - Triple Fault for unknown reason

Post by Octocontrabass »

You can't switch stacks in inline assembly. (Huh, I feel like I'm repeating myself.)
minater247
Posts: 17
Joined: Sat Jun 18, 2022 11:38 pm

Re: Multitasking issues - Triple Fault for unknown reason

Post by minater247 »

nullplan wrote:Your syscall handler doesn't switch stacks. That may be an issue.
It probably will be down the line - I had implemented that back around (this commit) but in the process of debugging had removed it. In this case it turned out to be unrelated, but thanks for the tip!
Octocontrabass wrote:You can't switch stacks in inline assembly. (Huh, I feel like I'm repeating myself.)
I see how it may cause issues, especially in the long run, but I still don't see why it's a "can't" and not a "may cause issues if used incorrectly"? I've tried to minimize the use of inline assembly, but in the cases where I've used it I don't see any other way to read, write, or store these variables without using it. Notably, if I call a true assembly function, RSP is modified and GCC sometimes pushes registers - so I can't effectively get the calling function's RSP/RBP (although I can get its RIP as such, which there is an assembly function for). All in all it seems, at least from my knowledge, to be a lot more complex and error prone than just using inline assembly and marking the registers as clobbered if C code is used afterwards. Is there just something I'm missing here? (and just to make sure it's clear, not saying you're wrong in any capacity, I just really don't understand why you're saying it can't be done).



As for the original issue, it turned out to be a small error in the execv function. I was modifying the process's page directory without ever storing the address of the new structure in the process struct. When execv jumped to usermode and set CR3, it worked once (which is why there was no problem if nothing was scheduled and the process exited on its first run), but every later reload took the old process's CR3 from the unmodified task structure, ended up running at a random location in another process's code, and crashed hard. It was a one-line fix, and it turned out to be completely unrelated to task switching!
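
To make the failure mode concrete, here is a minimal user-space sketch of it. It is not the actual Xana3 code: `task_t`, `load_cr3`, `schedule_to`, `execv_buggy`, `execv_fixed` and the addresses are all hypothetical, and a plain variable stands in for the real CR3 register so that the example can run anywhere.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified task structure; the field name is illustrative. */
typedef struct task {
    uint64_t cr3;   /* physical address of this task's top-level page table */
} task_t;

/* Stand-in for the real CR3 register so the sketch runs in user space. */
static uint64_t fake_cr3;

static void load_cr3(uint64_t value) { fake_cr3 = value; }

/* What a scheduler does on every switch: reload CR3 from the task struct. */
static void schedule_to(const task_t *next) {
    assert(next->cr3 != 0);   /* every task must have an address space */
    load_cr3(next->cr3);
}

/* Buggy execv: activates the new address space but never records it. */
static void execv_buggy(task_t *self, uint64_t new_tables) {
    (void)self;
    load_cr3(new_tables);
}

/* Fixed execv: record the new tables in the task struct, then activate them. */
static void execv_fixed(task_t *self, uint64_t new_tables) {
    self->cr3 = new_tables;
    load_cr3(self->cr3);
}

/* Returns the CR3 value in effect after execv followed by one reschedule. */
uint64_t cr3_after_reschedule(int use_fix) {
    task_t task = { .cr3 = 0x1000 };      /* address space inherited at fork */
    const uint64_t new_tables = 0x2000;   /* address space built by execv */
    if (use_fix) {
        execv_fixed(&task, new_tables);
    } else {
        execv_buggy(&task, new_tables);
    }
    schedule_to(&task);                   /* timer tick: CR3 is reloaded */
    return fake_cr3;
}
```

With the buggy variant the reschedule puts the stale 0x1000 back into the fake register, while the fixed variant keeps 0x2000. That matches the symptom described above: the first run works, and things go wrong only once the scheduler reloads CR3 from the task structure.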
Octocontrabass
Member
Posts: 5497
Joined: Mon Mar 25, 2013 7:01 pm

Re: Multitasking issues - Triple Fault for unknown reason

Post by Octocontrabass »

minater247 wrote:I see how it may cause issues, especially in the long run, but I still don't see why it's a "can't" and not a "may cause issues if used incorrectly"?
GCC's optimizations assume you're not going to change the stack pointer, so it can do things like spill temporary values to the stack (which will all disappear or be replaced by garbage when you change the stack pointer) or precalculate pointers to the stack (which will all point to the old stack once you switch to the new stack). There's no way to stop GCC from doing these things around inline assembly, so you can't use inline assembly to change the stack pointer.
minater247 wrote:Notably, if I call a true assembly function, RSP is modified and GCC sometimes pushes registers - so I can't effectively get the calling function's RSP/RBP
The whole point of the assembly function is that it is the task switch. Consider what happens if you call this assembly function from different places in your kernel: each stack will contain the return address of the caller, so when you switch stacks, the function "returns" to whatever code called it while that stack was previously in use.
minater247 wrote:All in all it seems, at least from my knowledge, to be a lot more complex and error prone than just using inline assembly and marking the registers as clobbered if C code is used afterwards.
You can't clobber RSP or spilled temporary values.
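
For illustration, this is roughly what keeping the switch in a standalone assembly routine can look like. It is only a sketch under several assumptions: x86-64, the System V calling convention, an ELF target, and GCC or Clang (the routine is written as a file-scope asm block in AT&T syntax so that the example fits in one C file; a separate .asm file works the same way). The names `switch_stack`, `task_entry` and `run_task_once` are made up for the example and are not from the Xana3 source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical context-switch routine. Because it is an ordinary function
 * call, the compiler already assumes that only callee-saved registers
 * survive it, so nothing has to be declared as clobbered. */
void switch_stack(uint64_t *save_rsp, uint64_t load_rsp);

__asm__(
    ".pushsection .text\n"
    ".globl switch_stack\n"
    ".type switch_stack, @function\n"
    "switch_stack:\n"
    "    pushq %rbp\n"            /* save callee-saved registers ... */
    "    pushq %rbx\n"
    "    pushq %r12\n"
    "    pushq %r13\n"
    "    pushq %r14\n"
    "    pushq %r15\n"
    "    movq %rsp, (%rdi)\n"     /* ... and remember where this stack stopped */
    "    movq %rsi, %rsp\n"       /* continue on the other stack */
    "    popq %r15\n"             /* restore what that stack saved earlier */
    "    popq %r14\n"
    "    popq %r13\n"
    "    popq %r12\n"
    "    popq %rbx\n"
    "    popq %rbp\n"
    "    ret\n"                   /* return address comes from the new stack */
    ".popsection\n"
);

static uint64_t main_rsp, task_rsp;
static uint64_t task_stack[1024] __attribute__((aligned(16)));
static volatile int task_ran;     /* written while running on the other stack */

static void task_entry(void) {
    task_ran = 1;
    switch_stack(&task_rsp, main_rsp);   /* hand control back to the caller */
    for (;;) { }                         /* not reached: never resumed here */
}

/* Builds the frame switch_stack expects, runs the task once, reports back. */
int run_task_once(void) {
    uint64_t *sp = &task_stack[1024];          /* one past the end of the stack */
    *--sp = 0;                                 /* padding keeps ABI alignment */
    *--sp = (uint64_t)(uintptr_t)task_entry;   /* consumed by ret */
    for (int i = 0; i < 6; i++) {
        *--sp = 0;                             /* initial rbp, rbx, r12-r15 */
    }
    assert(((uintptr_t)sp & 15) == 0);         /* frame starts 16-byte aligned */
    task_ran = 0;
    task_rsp = (uint64_t)(uintptr_t)sp;
    switch_stack(&main_rsp, task_rsp);         /* returns once the task yields */
    return task_ran;
}
```

The C code never touches RSP itself. From the compiler's point of view `switch_stack` is just a function that eventually returns, so spilled temporaries and precalculated stack pointers in the caller stay valid: by the time the call returns, execution is back on the same stack it was made from.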
eekee
Member
Posts: 872
Joined: Mon May 22, 2017 5:56 am
Location: Kerbin
Discord: eekee

Re: Multitasking issues - Triple Fault for unknown reason

Post by eekee »

minater247 wrote:Notably, if I call a true assembly function, RSP is modified and GCC sometimes pushes registers - so I can't effectively get the calling function's RSP/RBP
Why do you need the calling function's RSP? Can't you return to the C function which called the assembly routine? I mean this:

Code:

proc1 user code
      |
   syscall
      |
kernel func x
      |
    call
      |
scheduler (c) with vars for proc1
      |
    call
      |
stack switcher (asm)
      |
     ret
      |
scheduler (c) with vars for proc2
      |
     ret
      |
kernel func y
      |
   syscall ret
      |
proc2 user code
Note that the stack switcher returns to the same function, but it's a different instance of that function; it has a whole different set of local variables because the stacks have been switched.
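
The diagram can be tried out in user space. The sketch below uses `swapcontext` from `<ucontext.h>` as a stand-in for the assembly stack switcher (an assumption made for illustration, assuming a glibc-style environment; a kernel would use its own routine). The names `scheduler`, `make_task`, `run_demo` and `resumed_id` are hypothetical.

```c
#include <assert.h>
#include <ucontext.h>

#define STACK_SIZE (64 * 1024)

static ucontext_t main_ctx, task_ctx[2];
static char task_stack[2][STACK_SIZE];
static int resumed_ids[2];   /* order in which scheduler instances resumed */
static int resumed_count;

/* Plays the role of "scheduler (c)" in the diagram. */
static void scheduler(int self, int next) {
    int my_id = self;                                /* lives on this task's stack */
    swapcontext(&task_ctx[self], &task_ctx[next]);   /* the "stack switcher" */
    /* Reached only when something switches back to this task's stack. */
    assert(resumed_count < 2);
    resumed_ids[resumed_count++] = my_id;
}

static void task0(void) { scheduler(0, 1); }   /* resumed when task 1 switches */
static void task1(void) { scheduler(1, 0); }   /* resumed when task 0 finishes */

static void make_task(int id, void (*entry)(void), ucontext_t *link) {
    getcontext(&task_ctx[id]);
    task_ctx[id].uc_stack.ss_sp = task_stack[id];
    task_ctx[id].uc_stack.ss_size = STACK_SIZE;
    task_ctx[id].uc_link = link;   /* context resumed when entry returns */
    makecontext(&task_ctx[id], entry, 0);
}

/* Runs both tasks to completion; returns how many scheduler calls resumed. */
int run_demo(void) {
    resumed_count = 0;
    make_task(0, task0, &task_ctx[1]);
    make_task(1, task1, &main_ctx);
    swapcontext(&main_ctx, &task_ctx[0]);
    return resumed_count;
}

/* Returns the my_id seen by the i-th scheduler instance that resumed. */
int resumed_id(int i) { return resumed_ids[i]; }
```

After `run_demo()`, `resumed_id(0)` is 0 and `resumed_id(1)` is 1. Task 1 is the one that calls the switcher first, yet the code that continues right after that switch sees `my_id == 0`, because it is running on task 0's stack with task 0's locals.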
Kaph — a modular OS intended to be easy and fun to administer and code for.
"May wisdom, fun, and the greater good shine forth in all your work." — Leo Brodie
minater247
Posts: 17
Joined: Sat Jun 18, 2022 11:38 pm

Re: Multitasking issues - Triple Fault for unknown reason

Post by minater247 »

For Octocontrabass: I felt like I wasn't fully understanding what you said so rather than throw out more questions I decided to keep working on that section of code until I got it, and boy, it makes a whole lot more sense now! I definitely see the problems with it, and I've been gradually moving any presently-non-erroneous C stack movement to assembly. I know it's been a while, but I wanted to come back and say thank you so much for taking the time to explain all that!! You've helped me avoid so many future problems.


And for eekee:
eekee wrote:Why do you need the calling function's RSP? Can't you return to the C function which called the assembly routine? I mean this:
You're totally right. I was misunderstanding how the calls were working at that point in time and assuming I needed a bit more context to return, but nope, it just needs a return with the context stack. Thanks for the explanation and chart, though - that was definitely helpful!