Debugging UEFI Issues on real hardware (POSIX-UEFI)

charco
Posts: 7
Joined: Sun Jul 18, 2021 2:12 am


Post by charco »

Hi All,

Yesterday I started playing with UEFI and wanted to try out an EFI app. I downloaded POSIX-UEFI and went to test the hello world program. Under QEMU it worked without any issues, but I could not get it to run on real hardware.

These are the steps that I am running:

I generated the helloworld.efi program as indicated in the wiki article. I also downloaded a version of the UEFI Shell so I can boot into it.

Then I created a FAT16 filesystem containing the UEFI Shell and the EFI app, following the instructions in the UEFI wiki article. The only difference is that I created the EFI/BOOT folders with mmd and put the UEFI Shell at EFI\BOOT\BOOTX64.EFI. I then wrote the image to the flash drive with dd if=path/to/img of=/dev/sda, ran sync, and rebooted the computer.

Upon reboot, I am able to boot into the UEFI Shell if I choose the flash drive. I can also see the helloworld.efi app with ls, but when I try to execute it, the computer just hangs. I don't know how to further debug this. The next thing I am going to try is to use GNU-UEFI or TianoCore directly to see if it's a problem related to POSIX-UEFI.

Any ideas?
nexos
Member
Posts: 1078
Joined: Tue Feb 18, 2020 3:29 pm
Libera.chat IRC: nexos

Re: Debugging UEFI Issues on real hardware (POSIX-UEFI)

Post by nexos »

Try filing a bug report on bzt's gitlab.
"How did you do this?"
"It's very simple — you read the protocol and write the code." - Bill Joy
Projects: NexNix | libnex | nnpkg
charco
Posts: 7
Joined: Sun Jul 18, 2021 2:12 am

Re: Debugging UEFI Issues on real hardware (POSIX-UEFI)

Post by charco »

nexos wrote:Try filing a bug report on bzt's gitlab.
Thanks for the suggestion, nexos. I created an issue in their gitlab. Today I will try to get GNU-EFI or TianoCore to run and see if the issue is only with POSIX-UEFI.

However, in general, how could I debug this issue? Sadly, everything works perfectly on QEMU with OVMF.
charco
Posts: 7
Joined: Sun Jul 18, 2021 2:12 am

Re: Debugging UEFI Issues on real hardware (POSIX-UEFI)

Post by charco »

Update: I tried a hello world app with GNU-EFI and it worked. So it seems like the issue is related to POSIX-UEFI. However, I still have no clue how to debug it.
Octocontrabass
Member
Posts: 5524
Joined: Mon Mar 25, 2013 7:01 pm

Re: Debugging UEFI Issues on real hardware (POSIX-UEFI)

Post by Octocontrabass »

You can try removing the printf call to narrow down if the problem is the startup code or the printf code. You can also try using something like this in place of the printf call:

Code: Select all

ST->ConOut->OutputString(ST->ConOut, L"Hello World!\r\n");
I heard recently that some UEFI implementations don't initialize the FPU correctly. I've attached a utility to display the contents of CR0 and CR4, which I'm pretty sure will show us if that's the problem.
Attachments
printcr.zip
Print CR0 and CR4 (x64 only)
Ethin
Member
Member
Posts: 625
Joined: Sun Jun 23, 2019 5:36 pm
Location: North Dakota, United States

Re: Debugging UEFI Issues on real hardware (POSIX-UEFI)

Post by Ethin »

According to the UEFI spec, the FPU control word has to be set, as well as the MMX control word. However, it also states that interrupts have to be masked, excluding those for the timer. It does not state that floating-point has to actually work, though. (From what I can tell, it doesn't actually say anything about the FPU other than what the control word should be set to.) EDK II doesn't use any floating-point operations anywhere, and they actively discourage its use in both UEFI applications and drivers. I imagine that other UEFI vendors also discourage -- if not try to prevent -- usage of the FPU because you just don't need it there. Especially since FP usually means SSE. Does POSIX-EFI use FP?
Octocontrabass
Member
Posts: 5524
Joined: Mon Mar 25, 2013 7:01 pm

Re: Debugging UEFI Issues on real hardware (POSIX-UEFI)

Post by Octocontrabass »

Ethin wrote:However, it also states that interrupts have to be masked, excluding those for the timer.
Exceptions are separate from interrupts. It does state exceptions have to be masked, but supposedly there are some implementations that don't follow the spec correctly, so that's another thing that might be set up wrong.
Ethin wrote:Does POSIX-EFI use FP?
It uses variadic functions, which can cause various floating-point instructions to be emitted even in programs that don't do any floating-point operations.
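
To illustrate the point about variadic functions, here is a minimal sketch of an integer-only variadic function. The name sum_ints is hypothetical and is not part of POSIX-UEFI. Assuming the function is compiled for the System V x86-64 calling convention, the callee cannot know whether the caller passed floating-point arguments in xmm0-xmm7, so GCC and Clang typically emit a prologue that stores those registers to the register save area, guarded by a test of AL.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical helper, not part of POSIX-UEFI: sums `count` int arguments. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    assert(count >= 0);

    /* va_start relies on the register save area that the compiler-generated
     * prologue fills in; nothing in this source touches floating point. */
    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);

    return total;
}
```

Compiling this to assembly (for example with gcc -O2 -S) and searching for movaps should show whether the SSE register spills are present. Building with -mno-sse or -mgeneral-regs-only should make them disappear.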
charco
Posts: 7
Joined: Sun Jul 18, 2021 2:12 am

Re: Debugging UEFI Issues on real hardware (POSIX-UEFI)

Post by charco »

Octocontrabass wrote:You can try removing the printf call to narrow down if the problem is the startup code or the printf code. You can also try using something like this in place of the printf call:

Code: Select all

ST->ConOut->OutputString(ST->ConOut, L"Hello World!\r\n");
I heard recently that some UEFI implementations don't initialize the FPU correctly. I've attached a utility to display the contents of CR0 and CR4, which I'm pretty sure will show us if that's the problem.
Thanks for the answer, Octocontrabass!

This is the result of the experiments:
  • Running an empty UEFI app with POSIX-UEFI works (no prints).
  • Calling OutputString instead of printf works on QEMU but not on real hw.
  • The value of CR0 and CR4, as printed from an EFI App compiled with GNU-EFI are:
    CR0: 0x0000000080010033
    CR4: 0x0000000000000668
    These are the same values that I get from running the app in QEMU
This is the code I used:

Code: Select all

#include <efi.h>
#include <efilib.h>

EFI_STATUS
EFIAPI
efi_main (EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
{
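	/* Set up the GNU-EFI library before calling Print. */
	InitializeLib(ImageHandle, SystemTable);
	Print(L"Hello, World!\n");

	/* Sentinel values make it obvious if a register read did not happen. */
	uint64_t cr0 = 0x4141414142424242;
	uint64_t cr4 = 0x4141414142424242;

	/* Control registers can only be moved to general-purpose registers. */
	__asm__ volatile("mov %%cr0, %0\n" : "=r" (cr0));
	__asm__ volatile("mov %%cr4, %0\n" : "=r" (cr4));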

	Print(L"CR0: 0x%016lx\n", cr0);
	Print(L"CR4: 0x%016lx\n", cr4);
	return EFI_SUCCESS;
}
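
The reported values can be checked against the bits that gate floating-point use. The sketch below is a hosted C helper, not UEFI code, and the function names are hypothetical. It uses the bit positions from the Intel and AMD architecture manuals: CR0.EM (bit 2) and CR0.TS (bit 3) must be clear for x87 and SSE instructions to run without faulting, and CR4.OSFXSR (bit 9) must be set for SSE.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define CR0_EM     (1ULL << 2)  /* x87 emulation: FP instructions fault when set */
#define CR0_TS     (1ULL << 3)  /* task switched: FP instructions raise #NM when set */
#define CR4_OSFXSR (1ULL << 9)  /* must be set for SSE instructions to execute */

/* Returns 1 if x87 instructions can execute without faulting, else 0. */
int x87_usable(uint64_t cr0)
{
    return (cr0 & (CR0_EM | CR0_TS)) == 0;
}

/* Returns 1 if SSE instructions can execute without faulting, else 0. */
int sse_usable(uint64_t cr0, uint64_t cr4)
{
    return x87_usable(cr0) && (cr4 & CR4_OSFXSR) != 0;
}

/* Prints a short summary of the floating-point state to `out`. */
void print_fp_state(FILE *out, uint64_t cr0, uint64_t cr4)
{
    assert(out != NULL);
    fprintf(out, "x87 usable: %s\n", x87_usable(cr0) ? "yes" : "no");
    fprintf(out, "SSE usable: %s\n", sse_usable(cr0, cr4) ? "yes" : "no");
}
```

For the values above (CR0 0x80010033, CR4 0x668), both helpers return 1: EM and TS are clear and OSFXSR is set. That suggests the control registers are not what differs between this machine and QEMU, although it says nothing about the x87 control word or MXCSR.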
Ethin
Member
Posts: 625
Joined: Sun Jun 23, 2019 5:36 pm
Location: North Dakota, United States

Re: Debugging UEFI Issues on real hardware (POSIX-UEFI)

Post by Ethin »

Octocontrabass wrote:
Ethin wrote:However, it also states that interrupts have to be masked, excluding those for the timer.
Exceptions are separate from interrupts. It does state exceptions have to be masked, but supposedly there are some implementations that don't follow the spec correctly, so that's another thing that might be set up wrong.
OVMF catches all exceptions -- or at least I think it does. It definitely catches #GP and #PF.
Octocontrabass wrote:
Ethin wrote:Does POSIX-EFI use FP?
It uses variadic functions, which can cause various floating-point instructions to be emitted even in programs that don't do any floating-point operations.
I didn't know that. That doesn't make much sense if you're not using any float/double/long double/__float128/__float32/__float80/... operands.
charco
Posts: 7
Joined: Sun Jul 18, 2021 2:12 am

Re: Debugging UEFI Issues on real hardware (POSIX-UEFI)

Post by charco »

Ok, I also made some calls to UEFI Boot Services and they work, so it looks like the problem is mostly related to the print functions.

I tried adding -mgeneral-regs-only to the CFLAGS to see if that changed anything, but no luck.
charco
Posts: 7
Joined: Sun Jul 18, 2021 2:12 am

Re: Debugging UEFI Issues on real hardware (POSIX-UEFI)

Post by charco »

Ethin wrote:
Octocontrabass wrote:
Ethin wrote:However, it also states that interrupts have to be masked, excluding those for the timer.
Exceptions are separate from interrupts. It does state exceptions have to be masked, but supposedly there are some implementations that don't follow the spec correctly, so that's another thing that might be set up wrong.
OVMF catches all exceptions -- or at least I think it does. It definitely catches #GP and #PF.
Octocontrabass wrote:
Ethin wrote:Does POSIX-EFI use FP?
It uses variadic functions, which can cause various floating-point instructions to be emitted even in programs that don't do any floating-point operations.
I didn't know that. That doesn't make much sense if you're not using any float/double/long double/__float128/__float32/__float80/... operands.
By disassembling the program, I didn't see any use of xmm registers other than in the printing functions, and only on the path taken when eax != 0 (which it is not here). So it looks like this isn't the cause.
Octocontrabass
Member
Posts: 5524
Joined: Mon Mar 25, 2013 7:01 pm

Re: Debugging UEFI Issues on real hardware (POSIX-UEFI)

Post by Octocontrabass »

charco wrote:
  • Running an empty UEFI app with POSIX-UEFI works (no prints).
  • Calling OutputString instead of printf works on QEMU but not on real hw.
You also mentioned in your bug report that you can call other boot services, so this points to an issue with linking. Unfortunately, I have no idea how POSIX-UEFI links the final binary, so I don't know where to go from here.
charco
Posts: 7
Joined: Sun Jul 18, 2021 2:12 am

Re: Debugging UEFI Issues on real hardware (POSIX-UEFI)

Post by charco »

Octocontrabass wrote:
charco wrote:
  • Running an empty UEFI app with POSIX-UEFI works (no prints).
  • Calling OutputString instead of printf works on QEMU but not on real hw.
You also mentioned in your bug report that you can call other boot services, so this points to an issue with linking. Unfortunately, I have no idea how POSIX-UEFI links the final binary, so I don't know where to go from here.
You lost me there. How could it be an issue with linking if it works in QEMU?
Octocontrabass
Member
Posts: 5524
Joined: Mon Mar 25, 2013 7:01 pm

Re: Debugging UEFI Issues on real hardware (POSIX-UEFI)

Post by Octocontrabass »

The answer is relocations.

EFI executables must be relocatable: the binary is linked to run at one particular address, but includes information (relocations) so that it can be patched at runtime (relocated) to run at a different address. OVMF tends to load EFI executables at their desired address, so they don't need to be relocated, but there's no guarantee that all firmware will do that.

Not everything requires relocations; some things work fine without them. That would explain why you're able to make it work without strings.
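
As a rough illustration of what relocating means here, the sketch below models the fixup step in hosted C. It is a simplified model, not a PE/COFF parser: the flat fixups list and the function name apply_relocations are assumptions made for the example. Each fixup offset points at a 64-bit absolute address (for instance a stored pointer to a string literal) that was computed for the address the binary was linked at.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Simplified model of base relocation (hypothetical, not real PE parsing):
 * `fixups` is a flat list of image offsets, each holding a 64-bit absolute
 * address that was computed for `linked_base`.
 */
void apply_relocations(uint8_t *image, size_t image_size,
                       const uint32_t *fixups, size_t fixup_count,
                       uint64_t linked_base, uint64_t loaded_base)
{
    /* Unsigned arithmetic wraps, so this also works when loading lower. */
    uint64_t delta = loaded_base - linked_base;

    for (size_t i = 0; i < fixup_count; i++) {
        uint64_t value;

        assert((size_t)fixups[i] + sizeof value <= image_size);
        memcpy(&value, image + fixups[i], sizeof value);  /* unaligned-safe read */
        value += delta;
        memcpy(image + fixups[i], &value, sizeof value);
    }
}
```

When the firmware happens to load the image at the linked address, delta is zero and missing or incorrect fixups go unnoticed. On firmware that picks a different address, every stored absolute pointer that was not patched points at the wrong memory.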
davmac314
Member
Posts: 121
Joined: Mon Jul 05, 2021 6:57 pm

Re: Debugging UEFI Issues on real hardware (POSIX-UEFI)

Post by davmac314 »

There is a wiki page with a technique for debugging UEFI applications which I found helpful.

I've tried out POSIX-UEFI, and in particular I had issues until I added "-mno-sse" to the compilation flags (both for my app and for the POSIX-UEFI library). In fact I went with "-march=x86-64 -mno-sse" to also avoid other extensions.

POSIX-UEFI compiles code with "-fpic" and then includes stub code to perform the necessary relocations, but on x86_64 a better (IMO) option is to use "-fpie", and then no relocations are necessary.