Hi,
If possible, I would like some advice, because I have been stuck on this problem for a while. I have been doing systems programming for 5 months, so I know the basics of OS theory (e.g., GDT, IDT, TSS, how interrupts work, segmentation, etc.).
I am working on a project that requires me to write a simple user-space routine for an open-source hypervisor project. The hypervisor runs on bare metal in kernel mode (VMX root) and does not support a Linux OS on top of it, which means it has no concept like a Linux process. My problem is that I need to execute a simple user-space routine (which probably does nothing but store an integer into a register for testing purposes), and whenever I execute IRET, the system hangs, freezes, or runs into an infinite loop, so I am unable to get more information about the hang. Here is what I have tried:
- For my kernel-space to user-space routine, I follow Linux's way of returning (i.e., kernel_execve): an interrupt is generated (int $0x80) in the hypervisor's kernel space, it is trapped, and the hypervisor's handler prepares the context for the user-space routine before executing IRET. Unlike Linux, I did not statically initialize the process context.
- The user-space routine context requires 3 main things: (1) two GDT entries, for User Code (DPL=3) and User Data (DPL=3); (2) (optional) a TSS, so that the user-space routine can return to kernel space if an interrupt happens in user space; and (3) the top of the stack, so that IRET can pop its information.
- I have tried enabling some flags (RF, TF, IF) through the temporaryFlag saved on the stack, but none of them has any effect.
- My return routine is taken from "returning to user-mode" by James Molloy and OSDev's "Getting to Ring 3", and they work on the Bochs emulator. So I basically know how to return to user mode.
- I have tried performing IRET between segments with the same privilege level, and it works as I expect. This experiment is meant to show that I set up the GDT entries and the top of the stack correctly. However, if I switch between segments with different privilege levels, IRET hangs, freezes, or runs into an infinite loop.
- I have tried intentionally supplying wrong parameters (wrong DPL, wrong limit, out-of-bounds index) to IRET, and I get the correct exceptions (e.g., #GP, #SS, #NP).
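For readers following along, here is a minimal C sketch of how two flat ring-3 descriptors like those in item (1) could be encoded. It assumes 32-bit protected mode with 4 KiB granularity. The helper names `gdt_encode` and `gdt_dpl` and the access bytes 0xFA and 0xF2 are illustrative choices, not taken from the hypervisor in question.

```c
#include <stdint.h>

/* Hypothetical helper: pack base, limit, access byte and flags nibble
 * into the 8-byte legacy segment descriptor format. */
static uint64_t gdt_encode(uint32_t base, uint32_t limit,
                           uint8_t access, uint8_t flags)
{
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFFu);             /* limit 15:0        */
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;      /* base 23:0         */
    d |= (uint64_t)access << 40;                  /* P, DPL, S, type   */
    d |= (uint64_t)((limit >> 16) & 0xFu) << 48;  /* limit 19:16       */
    d |= (uint64_t)(flags & 0xFu) << 52;          /* G, D/B, L, AVL    */
    d |= (uint64_t)(base >> 24) << 56;            /* base 31:24        */
    return d;
}

/* The DPL field occupies bits 45..46 of the descriptor. */
static unsigned gdt_dpl(uint64_t desc)
{
    return (unsigned)((desc >> 45) & 0x3u);
}
```

With these assumptions, `gdt_encode(0, 0xFFFFF, 0xFA, 0xC)` would describe a flat user code segment and `gdt_encode(0, 0xFFFFF, 0xF2, 0xC)` a flat user data segment, both with DPL 3.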
I am currently reading how Linux boots the first user-space process (PID 1), but it is complicated for me to trace all the dependencies at the moment. So I am confused about what I should do to make a successful switch, and what could be a possible reason that IRET fails to execute.
Thanks in advance,
Minh
IRET hangs when switch back from kernel-space to user-space
-
- Posts: 3
- Joined: Sat Sep 30, 2017 11:03 pm
Re: IRET hangs when switch back from kernel-space to user-sp
There may be several reasons for your kernel to hang when it switches to user space by executing the IRET/IRETD instruction. Switching to user space requires the following:
1. A GDT containing a TSS descriptor. The GDT must be loaded with LGDT, and the TSS descriptor must be present.
2. The TSS must be loaded with the LTR instruction.
3. IRET/IRETD depends on the current stack. For a kernel-to-user mode switch (lowering the privilege, i.e. the CPL numerically increasing from 0 to 3), the stack must contain the following data, pushed in this order:
a) SS - This should equal the user-mode data segment selector. You can look into the GDT tutorial for this value.
b) User-mode ESP - The stack to switch to on entering user mode.
c) EFLAGS - The EFLAGS value to load on entering user mode. You could push a value with only the IF flag set (1 << 9).
d) CS - The user-mode code segment selector. You can look into the GDT tutorial for this value.
e) EIP - The address you want to jump to in user mode.
5. The TSS must contain an ESP0 stack pointer. This must be set to an allocated region in kernel space. When an interrupt occurs in user mode, ESP is loaded from this value. So, if a page fault occurs and ESP0 is not set, a triple fault will occur.
NOTE: From your post, I think you haven't mapped your kernel code at a user-mode location.
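The stack contents listed in step 3 can be sketched in C as follows. This is only an illustration under stated assumptions: 32-bit protected mode, a GDT in which index 3 is the user code segment and index 4 the user data segment (so the selectors are 0x1B and 0x23 once RPL 3 is ORed in), and hypothetical names `iret_frame` and `make_user_frame`. The real selectors depend on the GDT layout actually in use.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed selectors: GDT index 3 = user code, index 4 = user data,
 * each ORed with RPL 3. Adjust to the actual GDT layout. */
#define USER_CS 0x1Bu                  /* (3 << 3) | 3 */
#define USER_DS 0x23u                  /* (4 << 3) | 3 */
#define EFLAGS_RESERVED1 (1u << 1)     /* bit 1 always reads as 1 */
#define EFLAGS_IF        (1u << 9)     /* interrupts enabled */

/* Frame popped by IRETD on a privilege change, lowest address first.
 * The push order is the reverse: SS, ESP, EFLAGS, CS, EIP. */
struct iret_frame {
    uint32_t eip;
    uint32_t cs;
    uint32_t eflags;
    uint32_t esp;
    uint32_t ss;
};

static_assert(sizeof(struct iret_frame) == 20, "five 32-bit slots expected");

/* Fill in a frame that would enter ring 3 at 'entry' with the given
 * user stack top. */
static struct iret_frame make_user_frame(uint32_t entry, uint32_t user_stack_top)
{
    struct iret_frame f;
    f.eip = entry;
    f.cs = USER_CS;
    f.eflags = EFLAGS_IF | EFLAGS_RESERVED1;
    f.esp = user_stack_top;
    f.ss = USER_DS;
    return f;
}
```

A kernel would copy such a frame to the top of its current stack and then execute IRETD. The low two bits of both selectors carry RPL 3, which is what makes the CPU treat the return as a switch to ring 3.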
Re: IRET hangs when switch back from kernel-space to user-sp
Hi SukantPal,
Thanks for your advice! Regarding your points, here are my answers:
1+2. I have a user-mode TSS entry with DPL = 3, and it is loaded before switch_to_user_from_kernel. The load is done with the LTR instruction, and I have reconfirmed that the task register (TR) was successfully changed.
3. The top of the stack is correctly prepared, since switching between segments of the same privilege level succeeded; the SS and CS images on the stack also have RPL = 3, as required. So I do not think there is any problem with the stack.
4. The destination code page is mapped: kernel code as a user-mode page. However, in either case, I think there should be a #PF exception instead of a hang.
5. I have double-checked that the user-mode TSS entry contains ESP0 and SS0; I prepared these in case an interrupt happens in user mode.
So I actually mapped my kernel code with the user/supervisor bit set to 1, and the pages are present in the page table. I am sorry for leaving out this information. But I have a question about "map at user-mode location": why does the location matter? I mapped the code in the kernel address space, but with user privilege.
Minh
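On the user/supervisor bit: a page is reachable from ring 3 only if the bit is set in every paging-structure entry used for the translation, not just in the PTE. A minimal C sketch of that check, assuming classic 32-bit two-level paging (one PDE and one PTE per address); the function name `user_can_access` is hypothetical, and a hypervisor using PAE or 4-level paging would need the same check at each additional level:

```c
#include <stdbool.h>
#include <stdint.h>

#define PG_PRESENT (1u << 0)   /* P bit   */
#define PG_USER    (1u << 2)   /* U/S bit */

/* True only if both levels are present and marked user-accessible. */
static bool user_can_access(uint32_t pde, uint32_t pte)
{
    uint32_t need = PG_PRESENT | PG_USER;
    return (pde & need) == need && (pte & need) == need;
}
```

The same condition applies to the page holding the user-mode stack, since the first push or pop in ring 3 touches it.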
Re: IRET hangs when switch back from kernel-space to user-sp
Hi makata0611,
By user-mode location, I just meant a location mapped with the user-mode bit set in the paging structures. If the kernel code is mapped with the user-mode bit set, that is good and correct.
Looks like everything is fine. Can you explain what your code does in user mode? Posting your kernel-to-user-mode switch code and your user-mode routine would be very helpful in solving your problem.
Re: IRET hangs when switch back from kernel-space to user-sp
Hi SukantPal,
I have managed to pinpoint my hanging problem. The problem comes from the multi-core setting, where multiple cores wait for a single core, but that core crashes without dumping any information, so I think it hangs the system. In a single-core setting, my user-mode routine works well: the interrupt in user mode is triggered and trapped into the kernel's interrupt handler. The test user-mode routine is nothing but a few register updates and an interrupt call:
Code:
user-routine.S
movl $0x2, %ecx    /* store a test value in ECX */
movl $0x2, %ecx
int $0x80          /* trap into the kernel's software interrupt handler */
movl $0x2, %ecx    /* reached only if the handler returns */
I don't have access to the switch-from-kernel-to-user code right now, but it basically does what you suggested, and I confirmed that it works in the single-core setting. I installed my own software interrupt handler, which simply dumps the information on the stack and then halts. The point is that when multiple cores run in parallel, I pick a single core and let it return to userland. However, it does not dump any information while the other cores wait for it to return. Hence, I think it hung.
Thank you very much for your guidance, Sukant! I really appreciate it.
Minh
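One way to make a silent failure like this visible is to bound the wait on the other cores, so they can report a timeout instead of spinning forever. The following C sketch shows the idea with C11 atomics. The names `core_done` and `wait_for_core` are hypothetical, and a spin count stands in for a real timer; none of this is taken from the hypervisor discussed in the thread.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag that the selected core sets once its handler has run. */
static atomic_bool core_done;

/* Spin for at most max_spins iterations. Returns false on timeout so the
 * waiting core can log a diagnostic instead of waiting indefinitely. */
static bool wait_for_core(atomic_bool *done, uint64_t max_spins)
{
    for (uint64_t i = 0; i < max_spins; i++) {
        if (atomic_load_explicit(done, memory_order_acquire))
            return true;
    }
    return false;
}
```

The selected core would store `true` into `core_done` with release ordering at the end of its interrupt handler. A waiting core that gets `false` back knows the selected core never reached that point and can dump whatever state it has.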