OSDev.org

Posted: **Wed Jun 08, 2022 7:49 am**

I am working on enabling Intel SGX on a unikernel that has no native ring 3 support. To invoke the user-mode SGX instructions, I therefore need to implement a ring-switch routine. Following JamesM's tutorial (http://jamesmolloy.co.uk/tutorial_html/ ... 0Mode.html), which is a 32-bit solution, I drafted a long-mode version:

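void switch_to_ring3(void)
{
	/*
	 * Load the user data selector (0x23) into ds/es, then build an iretq
	 * frame on the current stack: SS, RSP, RFLAGS, CS (0x1B), RIP.
	 * iretq pops the frame and resumes at label 1 in ring 3.
	 */
	asm volatile("  \
      mov $0x23, %rax; \
      mov %rax, %ds; \
      mov %rax, %es; \
      mov %rsp, %rax; \
      push $0x23; \
      push %rax; \
      pushf; \
      push $0x1B; \
      push $1f; \
      iretq; \
    1: \
      " ::: "rax", "memory"); /* rax is overwritten, so declare it clobbered */
}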

I am sure that I have set up the GDT entries properly and that 0x23/0x1B are exactly the selectors of the user-mode data/code descriptors: the code descriptor value is 0xaffb000000ffff and the data descriptor value is 0xaff3000000ffff.
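As a sanity check on values like these, the selector and descriptor fields can be decoded by hand or with a few lines of C. The following is only a minimal sketch: the struct and function names (`selector`, `desc_bits`, `decode_selector`, `decode_descriptor`, `check_post_values`) are made up for illustration and do not come from Unikraft. The bit positions follow the standard x86 segment descriptor layout.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of a segment selector as loaded into CS/SS/DS/ES. */
struct selector {
    unsigned index; /* descriptor index within the table */
    unsigned ti;    /* table indicator: 0 = GDT, 1 = LDT */
    unsigned rpl;   /* requested privilege level */
};

struct selector decode_selector(uint16_t sel)
{
    struct selector s = { sel >> 3, (sel >> 2) & 1u, sel & 3u };
    return s;
}

/* A few access-byte and flag bits of a code/data segment descriptor. */
struct desc_bits {
    unsigned present;    /* P, bit 47 */
    unsigned dpl;        /* DPL, bits 45..46 */
    unsigned code;       /* bit 43: 1 = code segment, 0 = data segment */
    unsigned conforming; /* bit 42, only meaningful for code segments */
    unsigned long_mode;  /* L, bit 53, only meaningful for code segments */
};

struct desc_bits decode_descriptor(uint64_t d)
{
    struct desc_bits b;
    b.present    = (unsigned)((d >> 47) & 1);
    b.dpl        = (unsigned)((d >> 45) & 3);
    b.code       = (unsigned)((d >> 43) & 1);
    /* For data segments bit 42 means "expand-down", so report 0 here. */
    b.conforming = b.code ? (unsigned)((d >> 42) & 1) : 0;
    b.long_mode  = (unsigned)((d >> 53) & 1);
    return b;
}

/* Hypothetical self-check using the values quoted in the post. */
void check_post_values(void)
{
    assert(decode_selector(0x1B).rpl == 3);
    assert(decode_selector(0x23).rpl == 3);
    assert(decode_descriptor(0x00affb000000ffffULL).dpl == 3);
    assert(decode_descriptor(0x00aff3000000ffffULL).dpl == 3);
}
```

With the values from the post, 0x1B decodes to GDT index 3 and 0x23 to GDT index 4, both with RPL 3, and both descriptors are present with DPL 3; the code descriptor has the L bit set and the conforming bit clear.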

What's strange is that the iretq executes successfully and rip advances to the instruction after it, which is a nop with optimization disabled and a ret with optimization enabled. However, when that next instruction executes, the kernel dies without any output (my unikernel has an exception handler that prints something even for unhandled exceptions). I tried debugging with GDB, and GDB reported that the program received SIGQUIT.

I checked the registers but found nothing wrong: cs is 0x1b; ss, ds and es are 0x23; and rip points correctly to the instruction after iretq.

I am really confused about why it receives SIGQUIT. If an exception had happened, the handler should have printed the dump message, or at least the qemu log should contain a 'check_exception' message, but the log is empty. Everything seems okay: the segment registers and rsp/rbp/rip are correct, the kernel code segment is user-accessible because the conforming bit of its descriptor is set, and the high/low base addresses in all descriptors are 0x0.

I have been stuck on this problem for a whole day and cannot find a solution. I hope someone here can save my life T_T

Posted: **Thu Jun 09, 2022 4:02 am**

Presumably you are using the qemu-sgx version as, I believe, the standard version of qemu doesn't support SGX.

Anyway, that doesn't solve your immediate problem. From what you have said, my guess would be that your mapping of the page containing the instructions you are trying to execute isn't configured to allow execution by user-mode code. It's difficult to think of any other fault that would prevent execution of a "nop" instruction.

That's just a guess as I suspect that the code you have shared isn't enough to determine the true cause of the problem; but faults in paging code often cause peculiar errors. Do you have a link to a repository of your code?

Posted: **Thu Jun 09, 2022 7:53 am**

Are you using KVM? Qemu + KVM seems to act up when trying to use a debugger (e.g. receiving Debug exceptions in the guest), which does not happen when using an emulated CPU.

Posted: **Thu Jun 09, 2022 8:48 am**

iansjack wrote: Presumably you are using the qemu-sgx version as, I believe, the standard version of qemu doesn't support SGX.

Yes, I am using qemu-sgx with SGX options that I have verified with a Linux guest, so this should not be the problem.

iansjack wrote: Anyway, that doesn't solve your immediate problem. From what you have said, my guess would be that your mapping of the page containing the instructions you are trying to execute isn't configured to allow execution by user-mode code. It's difficult to think of any other fault that would prevent execution of a "nop" instruction.

YES! I finally found the mistake just now. I forgot to set the U/S bit in the page table entry, so a page fault occurred as soon as the CPU switched to ring 3. After I set this bit, things went well.
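One detail worth keeping in mind with this kind of bug: on x86-64 the U/S bit has to be set at every level of the walk (PML4E, PDPTE, PDE and PTE), not only in the final entry, for ring 3 to be allowed to touch the page. A minimal sketch of such a check follows; `user_accessible()`, `check_example()` and the sample entry values are hypothetical and not taken from Unikraft's paging code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PTE_PRESENT (1ULL << 0) /* P: entry is valid */
#define PTE_WRITE   (1ULL << 1) /* R/W: writes allowed */
#define PTE_USER    (1ULL << 2) /* U/S: 1 = accessible from ring 3 */

/*
 * entries[] holds the paging-structure entries used to translate one
 * address, top level first (4 entries for a 4 KiB page, fewer for
 * large pages). Ring 3 may access the page only if every entry is
 * present and has U/S set.
 */
bool user_accessible(const uint64_t *entries, int levels)
{
    for (int i = 0; i < levels; i++) {
        if (!(entries[i] & PTE_PRESENT))
            return false;
        if (!(entries[i] & PTE_USER))
            return false;
    }
    return true;
}

/* Hypothetical example: U/S is missing in the last-level entry only. */
void check_example(void)
{
    const uint64_t walk[4] = {
        0x1000 | PTE_PRESENT | PTE_WRITE | PTE_USER,
        0x2000 | PTE_PRESENT | PTE_WRITE | PTE_USER,
        0x3000 | PTE_PRESENT | PTE_WRITE | PTE_USER,
        0x4000 | PTE_PRESENT | PTE_WRITE, /* supervisor-only */
    };
    assert(!user_accessible(walk, 4));
}
```

A single supervisor-only entry anywhere in the walk is enough to produce a page fault on the first instruction fetch in ring 3, which matches the symptom described above.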

But now a new bug has appeared. When I tried to test the exception handler from ring 3 by triggering a simple divide error, I found that the CPU does not switch back to ring 0 correctly. I have set the rsp of the TSS to the kernel stack address. For the IDT entries, I set the selector to the kernel code segment selector 0x8 and the DPL to 0. Strangely, once execution enters the exception handler, the CS register holds 0xb, which is (0x8 | 0x3), meaning that the CPU did not switch back to ring 0.
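For reference, a long-mode IDT gate with the settings described above (selector 0x8, DPL 0) could be encoded roughly as follows. This is a sketch under assumptions: `struct idt_gate`, `make_gate()` and `check_gate()` are illustrative names rather than Unikraft's own definitions, and `__attribute__((packed))` is a GCC/Clang extension.

```c
#include <assert.h>
#include <stdint.h>

/* 16-byte IDT gate descriptor as used in 64-bit mode. */
struct idt_gate {
    uint16_t offset_lo;  /* handler address bits 0..15 */
    uint16_t selector;   /* code segment selector the handler runs in */
    uint8_t  ist;        /* bits 0..2: IST index, 0 = no IST stack */
    uint8_t  type_attr;  /* P (bit 7), DPL (bits 5..6), type (bits 0..3) */
    uint16_t offset_mid; /* handler address bits 16..31 */
    uint32_t offset_hi;  /* handler address bits 32..63 */
    uint32_t reserved;   /* must be zero */
} __attribute__((packed));

#define GATE_TYPE_INTERRUPT 0xE /* 64-bit interrupt gate */
#define GATE_PRESENT        0x80

struct idt_gate make_gate(uint64_t handler, uint16_t selector, unsigned dpl)
{
    struct idt_gate g = {0};
    g.offset_lo  = (uint16_t)(handler & 0xFFFF);
    g.offset_mid = (uint16_t)((handler >> 16) & 0xFFFF);
    g.offset_hi  = (uint32_t)(handler >> 32);
    g.selector   = selector;
    g.ist        = 0;
    g.type_attr  = (uint8_t)(GATE_PRESENT | ((dpl & 3) << 5) | GATE_TYPE_INTERRUPT);
    return g;
}

/* Hypothetical self-check: the handler address here is an arbitrary example. */
void check_gate(void)
{
    struct idt_gate g = make_gate(0x0000000000101234ULL, 0x08, 0);
    assert(sizeof g == 16);
    assert(g.type_attr == 0x8E);
}
```

The gate's DPL is only checked for software interrupts such as `int n`, so DPL 0 does not prevent a divide error raised in ring 3 from reaching the handler. Which privilege level the handler then runs at depends on the code segment descriptor that the gate's selector points to, not on the gate itself.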

I am wondering whether I have missed some important configuration or setup that the CPU or kernel needs in order to switch back to ring 0 when an exception occurs in ring 3. So far I have not found any useful information.

iansjack wrote: Do you have a link to a repository of your code?

Yes, this is my fork of Unikraft, a lightweight unikernel. The link to my repo is https://github.com/xymeng16/unikraft

In case you would like to go through the code briefly, here are the important source files:
1. plat/kvm/x86/traps.c sets up the GDT, TSS and IDT.
2. plat/common/x86/traps.c defines and registers the exception handlers; the real entry points of the handlers are defined in plat/kvm/x86/cpu_vectors_x86_64.S.
3. lib/uksgx/sgx_utils.c defines switch_to_ring3(), which is invoked by lib/uksgx/sgx_ioctl.c:sgx_ioc_enclave_create(). The user application (which in this unikernel also runs in ring 0) uses ioctl() to talk to the SGX device. I am porting the SGX driver right now.

Thanks for your help!

Posted: **Thu Jun 09, 2022 8:54 am**

Demindiro wrote: Are you using KVM? Qemu + KVM seems to act up when trying to use a debugger (e.g. receiving Debug exceptions in the guest), which does not happen when using an emulated CPU.

Yes, I am using KVM. I had considered this, but sadly I cannot use an emulated CPU: SGX can only be virtualized with KVM/QEMU on Linux 5.13+. Anyway, my original problem is fixed by setting the U/S bit in the page table entry. Although CPU exceptions are not printed in the QEMU log, I don't think that will be a problem once I get the exception handler working for ring 3, since the handler outputs the dumped registers and some messages about the exception.

Posted: **Fri Jun 10, 2022 12:52 am**

I finally worked out how to fix the exception handler. I thought that the conforming bit had to be set in order to share the code/data segments between ring 0 and ring 3. However, it does not. As the Wiki says:

For code selectors: Conforming bit.
If clear (0) code in this segment can only be executed from the ring set in DPL.
If set (1) code in this segment can be executed from an equal or lower privilege level. For example, code in ring 3 can far-jump to conforming code in a ring 2 segment. The DPL field represent the highest privilege level that is allowed to execute the segment. For example, code in ring 0 cannot far-jump to a conforming code segment where DPL is 2, while code in ring 2 and 3 can. Note that the privilege level remains the same, ie. a far-jump from ring 3 to a segment with a DPL of 2 remains in ring 3 after the jump.

So, if I am right, the conforming bit only affects far jumps that do not change the CPL. That's why my code doesn't need to set it.
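The same rule applies to interrupt and exception delivery, which would explain the CS value of 0xb seen earlier: when the gate's target code segment is conforming, the handler keeps running at the interrupted CPL, whereas a non-conforming segment with a lower DPL causes a switch to that DPL. A simplified model of the rule is sketched below; `handler_cpl()`, `handler_cs()` and `check_conforming_case()` are hypothetical names, and the model leaves out every other check the CPU performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * CPL the handler runs at, given the interrupted CPL and the DPL and
 * conforming bit of the code segment named by the gate's selector.
 * Returns -1 where the CPU would raise #GP instead.
 */
int handler_cpl(int cpl, int target_dpl, bool conforming)
{
    if (target_dpl > cpl)
        return -1;                    /* target is less privileged: #GP */
    return conforming ? cpl           /* conforming: stay at current CPL */
                      : target_dpl;   /* non-conforming: switch to its DPL */
}

/* CS value visible inside the handler: gate selector with RPL = new CPL. */
uint16_t handler_cs(uint16_t gate_selector, int new_cpl)
{
    return (uint16_t)((gate_selector & ~3u) | (unsigned)new_cpl);
}

/* Hypothetical self-check for the conforming case described in the thread. */
void check_conforming_case(void)
{
    int cpl = handler_cpl(3, 0, true);
    assert(cpl == 3);
    assert(handler_cs(0x08, cpl) == 0x0B);
}
```

In this model, a fault in ring 3 through a gate with selector 0x8 gives CS = 0xb when the kernel code segment is conforming, and CS = 0x8 once the conforming bit is cleared.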

Very strange SIGQUIT when try to switch from ring 0 to ring