[semi-solved] memory mapping throwing error at 476MiB mark

austanss
Member
Posts: 377
Joined: Sun Oct 11, 2020 9:46 pm
Location: United States

[semi-solved] memory mapping throwing error at 476MiB mark

Post by austanss »

Note: no error is reported when running without KVM, but the behaviour is still unexpected.

I'm getting a KVM internal error:

Code:

KVM internal error. Suberror: 3
extra data[0]: 80000306
extra data[1]: 31
extra data[2]: 182
extra data[3]: ee8ce968
RAX=000000000000001d RBX=0000000000151de8 RCX=0000000000001d00 RDX=00000000ee77296c
RSI=00000000ee8ce8b4 RDI=0000000000107c67 RBP=000000001dc00000 RSP=00000000ee8ce97c
R8 =000000001ffa0000 R9 =0000000000000030 R10=0000000000000017 R11=0000000000400000
R12=0000000000000000 R13=0000000000000000 R14=0000000000012770 R15=0000000000011000
RIP=0000000000100c31 RFL=00010082 [--S----] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0010 0000000000000000 00000000 00009300 DPL=0 DS   [-WA]
CS =0008 0000000000000000 f0000fff 00a09b00 DPL=0 CS64 [-RA]
SS =0010 0000000000000000 00000000 00009300 DPL=0 DS   [-WA]
DS =0010 0000000000000000 00000000 00009300 DPL=0 DS   [-WA]
FS =0010 0000000000000000 00000000 00009300 DPL=0 DS   [-WA]
GS =0010 0000000000000000 00000000 00009300 DPL=0 DS   [-WA]
LDT=0000 0000000000000000 0000ffff 00008200 DPL=0 LDT
TR =0000 0000000000000000 0000ffff 00008b00 DPL=0 TSS64-busy
GDT=     00000000001021f1 00000017
IDT=     0000000000111260 00000fff
CR0=80010033 CR2=ffffffffffffffad CR3=000000001fc01000 CR4=00000668
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000d00
Code=00 00 00 03 40 b8 1d 00 00 00 00 03 50 b8 1d 00 00 00 00 03 <60> b8 1d 00 00 00 00 03 70 b8 1d 00 00 00 00 03 80 b8 1d 00 00 00 00 03 90 b8 1d 00 00 00
RIP points to a line in the printf function, which is unrelated.

This seems to be a memory mapping issue, because I never called printf.
Also, in the serial output I get:

Code:

text buffer at 12
which comes from a function I didn't call.

Now, I did do some debugging.

The error comes from this chunk of code:

Code:

const uint64_t total_memory = memory::allocation::get_total_memory_size(bootloader_info->memory_map,
                                                                        bootloader_info->mmap_size,
                                                                        bootloader_info->mmap_descriptor_size);

// Identity-map every 4 KiB page below total_memory.
for (uint64_t t = 0; t < total_memory; t += 0x1000)
    memory::paging::map_memory((void*)t, (void*)t);
which most likely arises when calling map_memory.

However, the issue does not lie in the map_memory function itself, but rather the parameters.

The first time the function is called, everything is fine. I have yet to discover which iteration throws the error, but I am working on that right now.

If you need more source, dig in:
Referenced source file above: https://github.com/microNET-OS/microCOR ... onfigf.cxx
Source file with paging code, including the map_memory function: https://github.com/microNET-OS/microCOR ... memory.cxx
Last edited by austanss on Sat Jan 30, 2021 11:28 pm, edited 2 times in total.
Skylight: https://github.com/austanss/skylight

I make stupid mistakes and my vision is terrible. Not a good combination.

NOTE: Never respond to my posts with "it's too hard".
austanss
Member
Posts: 377
Joined: Sun Oct 11, 2020 9:46 pm
Location: United States

Re: Memory mapping error (paging) calling random functions

Post by austanss »

UPDATE: the error is thrown on the 121856-th iteration, or when

Code:

mapping virtual address 1dc00000 to physical address 1dc00000
or, the 476 MiB mark.
(the QEMU instance has 512 'M' allocated, I do not know if that represents MiB or MB)
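
The three figures in this update agree with each other: the loop advances in 4 KiB steps, so the iteration number times the page size gives the address being mapped. A small compile-time check of that arithmetic (the constant and function names are made up for illustration; only the values come from the post):

```cpp
#include <cstdint>

// Hypothetical names; the values are taken from the post above.
constexpr std::uint64_t kPageSize  = 0x1000;       // 4 KiB step used by the mapping loop
constexpr std::uint64_t kIteration = 121856;       // iteration on which the error appears
constexpr std::uint64_t kMiB       = 1024 * 1024;  // binary megabyte

// Address handed to map_memory on that iteration (the loop variable t).
constexpr std::uint64_t failing_address() { return kIteration * kPageSize; }

static_assert(failing_address() == 0x1dc00000, "matches the logged address");
static_assert(failing_address() / kMiB == 476, "the 476 MiB mark");
```

Note that the 476 figure only works out in binary units (MiB), not decimal megabytes.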
xeyes
Member
Posts: 212
Joined: Mon Dec 07, 2020 8:09 am

Re: memory mapping throwing error at 476MiB mark

Post by xeyes »

austanss wrote: KVM internal error
One of the few occasions where you might want to try Bochs.

Or, if you are confident that it's KVM's bug and not yours, try it on real hardware.
austanss wrote: 512
That should be in binary units (1K = 1024), so MiB. But it is always a good idea to check against the multiboot header's memory map structure before deciding to use a physical location. We all know that on real hardware a single 4 GB stick doesn't give the OS 4 GB of "usable memory", and there are good reasons for that.
austanss wrote: 121856-th iteration
Something for you to consider: how about using bigger pages if you are identity mapping anyways? Even if you need finer-grained updates later, you can drop down to the smaller page sizes at that time.
austanss
Member
Posts: 377
Joined: Sun Oct 11, 2020 9:46 pm
Location: United States

Re: memory mapping throwing error at 476MiB mark

Post by austanss »

xeyes wrote:
KVM internal error
One of the few occasions where you might want to try Bochs.

Or, if you are confident that it's KVM's bug and not yours, try it on real hardware.
It wasn't a KVM bug. The bug was fatal, but QEMU without KVM didn't throw an error.

Also, I doubt bochs has a stable UEFI firmware. It's also probably very slow.
xeyes wrote: That should be in binary units (1K = 1024), so MiB. But it is always a good idea to check against the multiboot header's memory map structure before deciding to use a physical location. We all know that on real hardware a single 4 GB stick doesn't give the OS 4 GB of "usable memory", and there are good reasons for that.
Thanks for pointing that out, but my get_memory_size function does use the memory map as a source. So no, I am not under the assumption that you can use all the memory advertised on a stick.
Octocontrabass
Member
Posts: 5568
Joined: Mon Mar 25, 2013 7:01 pm

Re: memory mapping throwing error at 476MiB mark

Post by Octocontrabass »

rizxt wrote:

Code:

KVM internal error. Suberror: 3
extra data[0]: 80000306
extra data[1]: 31
extra data[2]: 182
extra data[3]: ee8ce968
According to this information, the CPU reported a fault during virtualization that the hypervisor was not prepared to handle. The CPU says the hypervisor's page tables for translating guest physical addresses to host physical addresses are invalid. I suspect this is caused by a memory corruption bug.
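
For anyone who wants to read those words themselves, here is a minimal decoding sketch. It assumes that extra data[0] is the Intel VMX IDT-vectoring information field and extra data[1] is the basic exit reason, which is how Linux KVM appears to fill them for suberror 3 on Intel hosts; treat that layout as an assumption. The struct and function names are made up.

```cpp
#include <cstdint>

// Assumed layout of the Intel VMX IDT-vectoring information field.
struct VectoringInfo {
    std::uint8_t vector;    // bits 7:0, interrupt or exception vector
    std::uint8_t type;      // bits 10:8, where 3 means hardware exception
    bool error_code_valid;  // bit 11
    bool valid;             // bit 31
};

constexpr VectoringInfo decode_vectoring_info(std::uint32_t raw) {
    return VectoringInfo{
        static_cast<std::uint8_t>(raw & 0xff),
        static_cast<std::uint8_t>((raw >> 8) & 0x7),
        ((raw >> 11) & 1) != 0,
        ((raw >> 31) & 1) != 0,
    };
}

// Basic exit reason 49 (0x31) is "EPT misconfiguration" in the Intel SDM.
constexpr std::uint32_t kExitReasonEptMisconfig = 49;
```

Under those assumptions, 80000306 decodes to a valid hardware exception with vector 6 (#UD, invalid opcode), and 31 is the EPT misconfiguration exit. The #UD would fit the register dump: the byte at RIP is 60, which is not a valid opcode in 64-bit mode, and the surrounding bytes repeat with an 8-byte period, which looks more like table data than code.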
rizxt wrote:Also, I doubt bochs has a stable UEFI firmware.
OVMF works in Bochs. This thread has advice on setting it up, if you're interested.
rizxt wrote:So no, I am not under the assumption that you can use all the memory advertised on a stick.
Do you assume that the memory map will always describe memory starting at address zero? Do you assume that the memory map will never have holes in it? If you make either of these assumptions, you'll end up reading or writing past the end of your bitmap.
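
A minimal sketch of why those two assumptions matter, using a simplified, hypothetical region record rather than the real firmware descriptor. It is not the code from the linked repository; it only contrasts "total bytes described" with "highest address described" when sizing a one-bit-per-page bitmap.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified region record; a real firmware map has more fields.
struct Region {
    std::uint64_t base;    // first physical address of the region
    std::uint64_t length;  // size in bytes
};

// Sum of all region lengths: the amount of memory described.
std::uint64_t total_bytes(const Region* map, std::size_t count) {
    std::uint64_t sum = 0;
    for (std::size_t i = 0; i < count; ++i) sum += map[i].length;
    return sum;
}

// Highest address described (exclusive), regardless of entry order or holes.
std::uint64_t highest_address(const Region* map, std::size_t count) {
    std::uint64_t top = 0;
    for (std::size_t i = 0; i < count; ++i) {
        const std::uint64_t end = map[i].base + map[i].length;
        if (end > top) top = end;
    }
    return top;
}

// Bytes needed for one bit per 4 KiB frame, indexed by physical address / 4096.
std::size_t bitmap_bytes_for(std::uint64_t address_limit) {
    const std::uint64_t pages = (address_limit + 0xfff) / 0x1000;
    return static_cast<std::size_t>((pages + 7) / 8);
}
```

With an unordered map that has a hole, say 1 MiB at 0x100000 followed by 512 KiB at 0, the total is 0x180000 bytes (a 48-byte bitmap) while the highest address is 0x200000 (a 64-byte bitmap). A bitmap sized from the total but indexed by physical address would be overrun by any frame above 0x180000.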
austanss
Member
Posts: 377
Joined: Sun Oct 11, 2020 9:46 pm
Location: United States

Re: memory mapping throwing error at 476MiB mark

Post by austanss »

Octocontrabass wrote:
Do you assume that the memory map will always describe memory starting at address zero? Do you assume that the memory map will never have holes in it? If you make either of these assumptions, you'll end up reading or writing past the end of your bitmap.
Well, I tested the code with different RAM sizes, and it always broke on reaching 476 MiB.

Also, I too strongly suspect it is a memory corruption bug, but I believe it is caused by my OS.
neon
Member
Posts: 1567
Joined: Sun Feb 18, 2007 7:28 pm

Re: memory mapping throwing error at 476MiB mark

Post by neon »

Hi,

...Which is irrelevant. The memory map (1) doesn't need to start at 0, (2) doesn't need to be in ascending order, or in any order for that matter, (3) can contain memory holes and remappable holes, and (4) can contain overlapping regions. In addition, (5) the pointer provided might point into firmware memory rather than free memory, and (6) the map can contain reclaimable memory that is currently in use by the firmware, since EFI is still in use until ExitBootServices is called. The software must not make any assumptions about the memory map or it will break in unexpected ways. The amount of physical memory allocated to the virtual machine is irrelevant.

If the goal is to map only kernel space, you can map just the respective PTEs, PDEs, etc., and the entire identity mapping code can be about 20 lines. If the goal is to map the entire address space, then you should be doing just that -- using the highest addressable page that can be mapped. Neither case relies on the memory map: the amount of physical memory installed or reported via the memory map has no effect on the address space itself.
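
To give a sense of scale, here is a rough sketch of an identity map that never looks at the memory map. It is not the code referred to above: the names, the static tables, the choice of 2 MiB pages, and the 4 GiB limit are all assumptions made for illustration, and it ignores memory-type considerations.

```cpp
#include <cstdint>

// Hypothetical sketch: names and table layout are not from the thread's repository.
constexpr std::uint64_t kPresent  = 1ull << 0;
constexpr std::uint64_t kWritable = 1ull << 1;
constexpr std::uint64_t kHugePage = 1ull << 7;  // PS bit: a PD entry maps 2 MiB directly
constexpr std::uint64_t kTwoMiB   = 0x200000;
constexpr unsigned kGiBToMap      = 4;          // assumption: the first 4 GiB is enough

struct alignas(4096) Table { std::uint64_t entry[512]; };

static Table pml4, pdpt, pd[kGiBToMap];

// Assumes the tables themselves are identity mapped, so a pointer value
// can be used as a physical address.
static std::uint64_t phys(const Table& t) {
    return static_cast<std::uint64_t>(reinterpret_cast<std::uintptr_t>(&t));
}

// Fills the tables so that virtual == physical for the first kGiBToMap GiB.
// Returns the value to load into CR3.
std::uint64_t build_identity_map() {
    pml4.entry[0] = phys(pdpt) | kPresent | kWritable;
    for (unsigned g = 0; g < kGiBToMap; ++g) {
        pdpt.entry[g] = phys(pd[g]) | kPresent | kWritable;
        for (unsigned i = 0; i < 512; ++i) {
            const std::uint64_t address = (g * 512ull + i) * kTwoMiB;
            pd[g].entry[i] = address | kPresent | kWritable | kHugePage;
        }
    }
    return phys(pml4);
}
```

The loop bounds come from the address range to cover, not from how much RAM the firmware reports, which is the point being made above.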

Is the goal to identity map the entire address space? How does this correlate with the repo that says this is to be a microkernel? User space is typically mapped on demand independent of kernel space so I don't understand the goal here.
OS Development Series | Wiki | os | ncc
char c[2]={"\x90\xC3"};int main(){void(*f)()=(void(__cdecl*)(void))(void*)&c;f();}
austanss
Member
Posts: 377
Joined: Sun Oct 11, 2020 9:46 pm
Location: United States

Re: memory mapping throwing error at 476MiB mark

Post by austanss »

Thank you, neon, for your amazing answer.

On the comment about my kernel's design as a microkernel: you are right, my kernel has design choices that do not fit a microkernel. I hope to resolve these issues. My current goal is to reach a point where I can load executables in userland and execute them. Then I will focus more on design.

And that piece on microkernel address spaces is *chef's kiss*.

Thank you.
Octocontrabass
Member
Posts: 5568
Joined: Mon Mar 25, 2013 7:01 pm

Re: memory mapping throwing error at 476MiB mark

Post by Octocontrabass »

xeyes wrote:Something for you to consider: how about using bigger pages if you are identity mapping anyways?
Bigger pages aren't allowed to cross memory type boundaries, so the code to use bigger pages will be more complicated.
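
A sketch of the kind of extra check this implies, assuming the caller already has a list of physical ranges that each have one uniform memory type. The record and function names are hypothetical, and where the ranges come from is left open.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical record: a physical range with one uniform memory type.
struct TypedRange {
    std::uint64_t base;
    std::uint64_t length;
    int type;  // placeholder for whatever "memory type" means to the caller
};

constexpr std::uint64_t kLargePage = 0x200000;  // 2 MiB

// True when [address, address + 2 MiB) is aligned and lies inside a single range,
// so one large page would not straddle a type boundary.
bool fits_large_page(std::uint64_t address, const TypedRange* ranges, std::size_t count) {
    if (address % kLargePage != 0) return false;
    for (std::size_t i = 0; i < count; ++i) {
        const std::uint64_t end = ranges[i].base + ranges[i].length;
        if (address >= ranges[i].base && address + kLargePage <= end) return true;
    }
    return false;
}
```

When the check fails, the mapper would fall back to 4 KiB pages for that 2 MiB window. The sketch does not merge adjacent ranges of the same type, so it can be more conservative than necessary.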
neon
Member
Posts: 1567
Joined: Sun Feb 18, 2007 7:28 pm

Re: memory mapping throwing error at 476MiB mark

Post by neon »

Hi,

The thing is -- you can't change this later without rewriting almost everything. As with supporting multiprocessing, some choices need to be made carefully at the start and followed through from then on. Unless, of course, the intent is to rewrite later. This is a particularly important choice because all page tables use page frame numbers (PFNs), not virtual addresses, so you need to decide how you plan to access them once paging is enabled -- we use recursive page structures, but there are other ways. I.e., quite literally all of your current paging code will be unusable once you are no longer identity mapping and paging is enabled.
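
For readers unfamiliar with recursive page structures, a minimal sketch of the address arithmetic on x86-64 with 4-level paging. It assumes one PML4 slot (510 here, an arbitrary choice) points back at the PML4 itself; the names are made up and this is not the code mentioned above.

```cpp
#include <cstdint>

// Assumption: PML4 slot 510 points back at the PML4 itself. Any free slot works.
constexpr std::uint64_t kRecursiveSlot = 510;

// Sign-extend bit 47 so the result is a canonical 48-bit virtual address.
constexpr std::uint64_t canonical(std::uint64_t address) {
    return (address & (1ull << 47)) ? (address | 0xffff000000000000ull) : address;
}

// Virtual address at which the page-table entry (PTE) for `virt` is visible.
constexpr std::uint64_t pte_address(std::uint64_t virt) {
    return canonical((kRecursiveSlot << 39) | ((virt >> 9) & 0x7ffffffff8ull));
}
```

With this layout the entry for any virtual address can be read or written through an ordinary pointer, with no identity mapping required, because the recursive slot makes the hardware walk the tables one level short.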

Besides, writing a small function to just identity map kernel space for now is also less code for you to have to debug and work through. Compare -- our identity mapping code is around 20 lines. If there was an issue with it, it would be only that to look over. And as noted above, no need for the memory map.