[semi-solved] memory mapping throwing error at 476MiB mark

Posted: Sat Jan 30, 2021 1:04 pm
by austanss
Note: no error is reported when running without KVM, but the behaviour is still unexpected.

I'm getting a KVM internal error:

Code: Select all

KVM internal error. Suberror: 3
extra data[0]: 80000306
extra data[1]: 31
extra data[2]: 182
extra data[3]: ee8ce968
RAX=000000000000001d RBX=0000000000151de8 RCX=0000000000001d00 RDX=00000000ee77296c
RSI=00000000ee8ce8b4 RDI=0000000000107c67 RBP=000000001dc00000 RSP=00000000ee8ce97c
R8 =000000001ffa0000 R9 =0000000000000030 R10=0000000000000017 R11=0000000000400000
R12=0000000000000000 R13=0000000000000000 R14=0000000000012770 R15=0000000000011000
RIP=0000000000100c31 RFL=00010082 [--S----] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0010 0000000000000000 00000000 00009300 DPL=0 DS   [-WA]
CS =0008 0000000000000000 f0000fff 00a09b00 DPL=0 CS64 [-RA]
SS =0010 0000000000000000 00000000 00009300 DPL=0 DS   [-WA]
DS =0010 0000000000000000 00000000 00009300 DPL=0 DS   [-WA]
FS =0010 0000000000000000 00000000 00009300 DPL=0 DS   [-WA]
GS =0010 0000000000000000 00000000 00009300 DPL=0 DS   [-WA]
LDT=0000 0000000000000000 0000ffff 00008200 DPL=0 LDT
TR =0000 0000000000000000 0000ffff 00008b00 DPL=0 TSS64-busy
GDT=     00000000001021f1 00000017
IDT=     0000000000111260 00000fff
CR0=80010033 CR2=ffffffffffffffad CR3=000000001fc01000 CR4=00000668
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000d00
Code=00 00 00 03 40 b8 1d 00 00 00 00 03 50 b8 1d 00 00 00 00 03 <60> b8 1d 00 00 00 00 03 70 b8 1d 00 00 00 00 03 80 b8 1d 00 00 00 00 03 90 b8 1d 00 00 00
RIP points into the printf function, which is unrelated.

This seems to be a memory mapping issue, because I never called printf.
Also, in the serial output I get:

Code: Select all

text buffer at 12
which comes from a function I didn't call.

Now, I did do some debugging.

The error comes from this chunk of code:

Code: Select all

for (uint64_t t = 0;
     t < memory::allocation::get_total_memory_size(bootloader_info->memory_map,
                                                   bootloader_info->mmap_size,
                                                   bootloader_info->mmap_descriptor_size);
     t += 0x1000)
{
    // Identity map: virtual address t -> physical address t
    memory::paging::map_memory((void*)t, (void*)t);
}
which most likely arises when calling map_memory.

However, the issue does not lie in the map_memory function itself, but rather the parameters.

The first time the function is called, everything is fine. I have yet to discover which iteration throws the error, but I am working on that right now.

If you need more source, dig in:
Referenced source file above: https://github.com/microNET-OS/microCOR ... onfigf.cxx
Source file with paging code, including the map_memory function: https://github.com/microNET-OS/microCOR ... memory.cxx

Re: Memory mapping error (paging) calling random functions

Posted: Sat Jan 30, 2021 1:12 pm
by austanss
UPDATE: the error is thrown on the 121856-th iteration, or when

Code: Select all

mapping virtual address 1dc00000 to physical address 1dc00000
that is, at the 476 MiB mark.
(The QEMU instance has 512 'M' allocated; I do not know whether that means MiB or MB.)
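
The iteration count, the address, and the 476 MiB figure in this update agree with each other. A minimal compile-time check, assuming the loop starts at iteration 0 and steps by 0x1000 as in the first post (the constant and function names are made up for illustration):

```cpp
#include <cstdint>

// Illustrative names; none of these come from the microCORE source.
constexpr std::uint64_t kPageSize = 0x1000;          // 4 KiB, the loop's step
constexpr std::uint64_t kMiB      = 1024ULL * 1024;  // binary megabyte

// Address handled by a given iteration of the identity-mapping loop,
// assuming iteration 0 maps address 0.
constexpr std::uint64_t address_of_iteration(std::uint64_t iteration) {
    return iteration * kPageSize;
}

static_assert(address_of_iteration(121856) == 0x1dc00000,
              "iteration 121856 maps address 0x1dc00000");
static_assert(0x1dc00000 / kMiB == 476,
              "0x1dc00000 is exactly the 476 MiB mark");
```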

Re: memory mapping throwing error at 476MiB mark

Posted: Sat Jan 30, 2021 2:24 pm
by xeyes
austanss wrote:
KVM internal error
One of the few occasions where you might want to try Bochs.

Or, if you are confident enough that it's KVM's bug and not yours, try it on real hardware.
austanss wrote:
512
It should be in binary units (1K = 1024), so 512 MiB. But it is always a good idea to check against the multi-boot header's memory map structure before deciding to use a physical location. We all know that on real hardware a single 4GB stick doesn't give the OS 4GB of "usable memory", and there are good reasons for that.
austanss wrote:
121856-th iteration
Something for you to consider: how about using bigger pages if you are identity mapping anyway? Even if you need finer-granularity updates later, you can drop down to the finer levels at that time.
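
To put rough numbers on the bigger-pages suggestion: a small sketch that counts how many leaf entries an identity map of 512 MiB needs with 4 KiB pages versus 2 MiB pages. The 512 MiB figure is the QEMU setting mentioned above; the names are illustrative and the sketch does no actual mapping.

```cpp
#include <cstdint>

constexpr std::uint64_t kMiB       = 1024ULL * 1024;
constexpr std::uint64_t kSmallPage = 4 * 1024;   // 4 KiB page
constexpr std::uint64_t kLargePage = 2 * kMiB;   // 2 MiB page

// Number of leaf page-table entries needed to cover `bytes`
// with pages of `page_size`, rounding up.
constexpr std::uint64_t entries_needed(std::uint64_t bytes, std::uint64_t page_size) {
    return (bytes + page_size - 1) / page_size;
}

static_assert(entries_needed(512 * kMiB, kSmallPage) == 131072,
              "4 KiB pages: one entry (and one loop iteration) per page");
static_assert(entries_needed(512 * kMiB, kLargePage) == 256,
              "2 MiB pages: 512 times fewer entries");
```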

Re: memory mapping throwing error at 476MiB mark

Posted: Sat Jan 30, 2021 2:43 pm
by austanss
xeyes wrote:
KVM internal error
One of the few occasions where you might want to try Bochs.

Or, if you are confident enough that it's KVM's bug and not yours, try it on real hardware.
It wasn't a KVM bug; the bug was fatal, but QEMU without KVM simply didn't report an error.

Also, I doubt Bochs has a stable UEFI firmware. It's also probably very slow.
xeyes wrote:
It should be in binary units (1K = 1024), so 512 MiB. But it is always a good idea to check against the multi-boot header's memory map structure before deciding to use a physical location. We all know that on real hardware a single 4GB stick doesn't give the OS 4GB of "usable memory", and there are good reasons for that.
Thanks for pointing that out, but my get_memory_size function does use the memory map as a source. So no, I am not under the assumption that you can use all the memory advertised on a stick.

Re: memory mapping throwing error at 476MiB mark

Posted: Sat Jan 30, 2021 5:24 pm
by Octocontrabass
rizxt wrote:

Code: Select all

KVM internal error. Suberror: 3
extra data[0]: 80000306
extra data[1]: 31
extra data[2]: 182
extra data[3]: ee8ce968
According to this information, the CPU reported a fault during virtualization that the hypervisor was not prepared to handle. The CPU says the hypervisor's page tables for translating guest physical addresses to host physical addresses are invalid. I suspect this is caused by a memory corruption bug.
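
For readers who want to see how such a reading can be derived, here is a hedged sketch that decodes the first two extra data words. It assumes the values are printed in hexadecimal, that data[0] is the VMX IDT-vectoring information field, and that data[1] is the VM-exit reason, which is how KVM is understood to fill them for this suberror. The struct and function names are illustrative and not taken from KVM or QEMU.

```cpp
#include <cstdint>

// Decoded view of a VMX IDT-vectoring information word (layout per the Intel SDM).
struct VectoringInfo {
    std::uint8_t vector;  // bits 7:0, interrupt or exception vector
    std::uint8_t type;    // bits 10:8, where 3 means hardware exception
    bool valid;           // bit 31
};

constexpr VectoringInfo decode_vectoring_info(std::uint32_t raw) {
    return VectoringInfo{
        static_cast<std::uint8_t>(raw & 0xff),
        static_cast<std::uint8_t>((raw >> 8) & 0x7),
        (raw >> 31) != 0,
    };
}

// Basic exit reason 49 (0x31) is "EPT misconfiguration" in the Intel SDM.
constexpr std::uint32_t kExitReasonEptMisconfig = 0x31;

// Assumption: extra data[0] = 0x80000306, extra data[1] = 0x31 (both hex).
static_assert(decode_vectoring_info(0x80000306).valid, "event was being delivered");
static_assert(decode_vectoring_info(0x80000306).type == 3, "hardware exception");
static_assert(decode_vectoring_info(0x80000306).vector == 6, "vector 6 is #UD");
static_assert(kExitReasonEptMisconfig == 49, "0x31 is exit reason 49");
```

Under those assumptions, the guest was in the middle of delivering an invalid-opcode exception when the EPT misconfiguration exit occurred.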
rizxt wrote:Also, I doubt bochs has a stable UEFI firmware.
OVMF works in Bochs. This thread has advice on setting it up, if you're interested.
rizxt wrote:So no, I am not under the assumption that you can use all the memory advertised on a stick.
Do you assume that the memory map will always describe memory starting at address zero? Do you assume that the memory map will never have holes in it? If you make either of these assumptions, you'll end up reading or writing past the end of your bitmap.
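
The difference between "how much memory is described" and "where the described memory ends" can be sketched as follows. This is an illustration only: MemoryDescriptor is a simplified stand-in for a UEFI memory descriptor, and the function names are hypothetical rather than the ones used in microCORE.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Simplified stand-in for a UEFI memory descriptor (hypothetical name).
struct MemoryDescriptor {
    std::uint32_t type;
    std::uint64_t physical_start;
    std::uint64_t virtual_start;
    std::uint64_t number_of_pages;  // counted in 4 KiB pages
    std::uint64_t attribute;
};

constexpr std::uint64_t kPageSize = 0x1000;

// Returns descriptor `index`, stepping by the firmware-reported descriptor
// size rather than sizeof(MemoryDescriptor), which may be smaller.
inline const MemoryDescriptor* descriptor_at(const void* map, std::size_t index,
                                             std::size_t descriptor_size) {
    const auto* bytes = static_cast<const unsigned char*>(map);
    return reinterpret_cast<const MemoryDescriptor*>(bytes + index * descriptor_size);
}

// Sum of all region sizes: how much memory is described, not where it ends.
inline std::uint64_t total_described_bytes(const void* map, std::size_t map_size,
                                           std::size_t descriptor_size) {
    assert(descriptor_size >= sizeof(MemoryDescriptor));
    std::uint64_t total = 0;
    for (std::size_t i = 0; i < map_size / descriptor_size; ++i)
        total += descriptor_at(map, i, descriptor_size)->number_of_pages * kPageSize;
    return total;
}

// One past the highest described physical address. Makes no assumption about
// ordering, holes, or the map starting at address zero.
inline std::uint64_t highest_described_address(const void* map, std::size_t map_size,
                                               std::size_t descriptor_size) {
    assert(descriptor_size >= sizeof(MemoryDescriptor));
    std::uint64_t highest = 0;
    for (std::size_t i = 0; i < map_size / descriptor_size; ++i) {
        const MemoryDescriptor* d = descriptor_at(map, i, descriptor_size);
        const std::uint64_t end = d->physical_start + d->number_of_pages * kPageSize;
        if (end > highest) highest = end;
    }
    return highest;
}
```

A bitmap indexed by physical page number but sized from the first value is too small whenever the map has a hole or starts above zero; sizing it from the second value avoids that particular overrun.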

Re: memory mapping throwing error at 476MiB mark

Posted: Sat Jan 30, 2021 5:29 pm
by austanss
Well, I tested the code on different RAM sizes, and it always broke when reaching 476 MiB.

Also, I too strongly suspect it is a memory corruption bug, but I believe it is caused by my OS.

Re: memory mapping throwing error at 476MiB mark

Posted: Sat Jan 30, 2021 7:12 pm
by neon
Hi,

...Which is irrelevant. The memory map (1) doesn't need to start at 0, (2) does not need to be in ascending order, or any order for that matter, (3) can contain memory holes and remappable holes, (4) can contain overlapping regions, (5) might be provided through a pointer into firmware memory rather than free memory, and (6) can contain reclaimable memory currently in use by the firmware, since EFI is still in use until ExitBootServices is called. The software must not make any assumptions about the memory map or it will break in unexpected ways. The amount of physical memory allocated to the virtual machine is irrelevant.

If the goal is to map only kernel space, you can map just the respective PTEs, PDEs, etc., and the entire identity mapping code can be about 20 lines. If the goal is to map the entire address space, then you should be doing just that -- using the highest addressable page that can be mapped. Neither case relies on the memory map: the amount of physical memory installed or reported via the memory map has no effect on the address space itself.
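
As an illustration of how small such code can be, here is a sketch that identity-maps a fixed address range with 4 KiB pages in a 4-level x86-64 layout. It is not neon's code. It assumes page tables can be dereferenced at their physical address (true while identity mapped, and simulated here with ordinary pointers), and the table pool, flag constants, and function names are all hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::uint64_t kPageSize    = 0x1000;
constexpr std::uint64_t kPresent     = 1ULL << 0;
constexpr std::uint64_t kWritable    = 1ULL << 1;
constexpr std::uint64_t kAddressMask = 0x000ffffffffff000ULL;

struct alignas(4096) Table { std::uint64_t entry[512]; };

// Hypothetical bump allocator handing out zeroed tables from a static pool.
static Table table_pool[8];
static std::size_t tables_used = 0;
inline Table* alloc_table() {
    return tables_used < 8 ? &table_pool[tables_used++] : nullptr;
}

// Identity-maps [start, end) with 4 KiB pages. Returns false if the pool runs out.
inline bool identity_map_range(Table* pml4, std::uint64_t start, std::uint64_t end) {
    for (std::uint64_t addr = start & ~(kPageSize - 1); addr < end; addr += kPageSize) {
        Table* table = pml4;
        // Walk PML4 -> PDPT -> PD, creating missing tables on the way down.
        for (int shift = 39; shift > 12; shift -= 9) {
            std::uint64_t& e = table->entry[(addr >> shift) & 0x1ff];
            if (!(e & kPresent)) {
                Table* next = alloc_table();
                if (!next) return false;
                e = reinterpret_cast<std::uint64_t>(next) | kPresent | kWritable;
            }
            table = reinterpret_cast<Table*>(e & kAddressMask);
        }
        // Leaf entry in the page table: physical address equals virtual address.
        table->entry[(addr >> 12) & 0x1ff] = addr | kPresent | kWritable;
    }
    return true;
}

// Returns the leaf entry for `addr`, or 0 if the address is not mapped.
inline std::uint64_t leaf_entry(const Table* pml4, std::uint64_t addr) {
    const Table* table = pml4;
    for (int shift = 39; shift > 12; shift -= 9) {
        const std::uint64_t e = table->entry[(addr >> shift) & 0x1ff];
        if (!(e & kPresent)) return 0;
        table = reinterpret_cast<const Table*>(e & kAddressMask);
    }
    return table->entry[(addr >> 12) & 0x1ff];
}
```

For example, mapping 0x100000 to 0x300000 this way touches one PDPT, one page directory, and two page tables, and never consults the memory map.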

Is the goal to identity map the entire address space? How does this correlate with the repo that says this is to be a microkernel? User space is typically mapped on demand independent of kernel space so I don't understand the goal here.

Re: memory mapping throwing error at 476MiB mark

Posted: Sat Jan 30, 2021 7:32 pm
by austanss
Thank you, neon, for your amazing answer.

On the comment about my kernel's design as a microkernel: you are right, my kernel has design choices that do not fit a microkernel. I hope to resolve these issues. My current goal is to reach a state where I can load executables in userland and execute them. Then I will focus more on design.

And that piece on microkernel address spaces is *chef's kiss*.

Thank you.

Re: memory mapping throwing error at 476MiB mark

Posted: Sat Jan 30, 2021 8:00 pm
by Octocontrabass
xeyes wrote:Something for you to consider: how about using bigger pages if you are identity mapping anyways?
Bigger pages aren't allowed to cross memory type boundaries, so the code to use bigger pages will be more complicated.

Re: memory mapping throwing error at 476MiB mark

Posted: Sat Jan 30, 2021 8:10 pm
by neon
Hi,

The thing is -- you can't change this later without rewriting almost everything. As with supporting multiprocessing, some choices need to be made carefully from the start and followed through. Unless, of course, the intent is to rewrite later. This is a particularly important choice, as all page tables use page frame numbers (PFNs), not virtual addresses, so you need to decide how you plan to access them when paging is enabled - we use recursive page structures but there are other ways. I.e. quite literally all your current paging code will be unusable when you are no longer identity mapping and paging is enabled.
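
For readers unfamiliar with recursive page structures, the address arithmetic can be sketched like this. It assumes 4-level x86-64 paging with one PML4 slot pointing back at the PML4 itself; the slot number used in the checks (510) is an arbitrary assumption, not something stated in this thread, and the function names are illustrative.

```cpp
#include <cstdint>

// Sign-extends bit 47 so the result is a canonical 48-bit virtual address.
constexpr std::uint64_t canonical(std::uint64_t addr) {
    return (addr & (1ULL << 47)) ? (addr | 0xffff000000000000ULL)
                                 : (addr & 0x0000ffffffffffffULL);
}

// Virtual address at which the page-table entry (PTE) for `vaddr` is visible,
// given that PML4 entry `slot` maps the PML4 itself.
constexpr std::uint64_t pte_address(std::uint64_t vaddr, std::uint64_t slot) {
    // Drop the 12-bit page offset, then scale by 8 bytes per entry.
    return canonical((slot << 39) | (((vaddr & 0x0000ffffffffffffULL) >> 12) << 3));
}

// With slot 510 the PTE window starts at 0xFFFFFF0000000000.
static_assert(pte_address(0x0, 510) == 0xFFFFFF0000000000ULL, "PTE of page 0");
static_assert(pte_address(0x1000, 510) == 0xFFFFFF0000000008ULL, "PTE of page 1");
static_assert(pte_address(0x1dc00000, 510) == 0xFFFFFF00000EE000ULL,
              "PTE of the page at the 476 MiB mark");
```

Applying the same function to its own result gives the address of the corresponding page-directory entry, which is what makes the scheme work without any identity mapping.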

Besides, writing a small function to just identity map kernel space for now also leaves you less code to debug and work through. Compare -- our identity mapping code is around 20 lines. If there were an issue with it, that would be all there is to look over. And as noted above, no need for the memory map.