Ethin wrote: Doing all the generation in the macro call itself is probably slower than generating the string by hand, but it does the job. If I add an allocate_page_range() call (my VMM's function for allocating virtual pages) or an allocate_phys_range() call (my VMM's function for allocating physical pages of RAM for MMIO access), the fault handler keeps getting called repeatedly with lots of addresses and never seems to stop faulting.
Random guess: maybe both of your allocators need to touch unmapped memory in order to allocate the memory for the page you need in the page fault handler; that's why you get an infinite fault recursion. Debugging this with QEMU + GDB shouldn't be hard. Not sure about the Rust case, though.
For debugging purposes, you could raise an in_allocator flag at the beginning of your VMM allocator (or the physical one, depending on how you want to allocate memory in the PF handler) and ASSERT(!in_allocator) in the PF handler.
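A minimal sketch of that idea in C (vmm_alloc_page, do_the_real_allocation and panic are placeholder names, not real APIs; use whatever your kernel already has):

Code:

#include <stdbool.h>

/* Stand-in for your kernel's assert macro. */
#define ASSERT(x) do { if (!(x)) panic("ASSERT failed: " #x); } while (0)

void panic(const char *msg);             /* your kernel's panic */
void *do_the_real_allocation(void);      /* placeholder */

/* Set while the allocator runs; checked by the PF handler to catch
 * the case where the allocator itself triggers a page fault. */
static volatile bool in_allocator;

void *vmm_alloc_page(void)
{
   ASSERT(!in_allocator);        /* catches recursive entry too */
   in_allocator = true;

   void *vaddr = do_the_real_allocation();

   in_allocator = false;
   return vaddr;
}

void page_fault_handler(void *fault_addr)
{
   /* If we faulted while the allocator was running, the allocator
    * touched unmapped memory: without this check, we'd recurse
    * forever instead of failing loudly right here. */
   ASSERT(!in_allocator);

   /* ... handle the fault, possibly calling vmm_alloc_page() ... */
}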
Another theory: your VMM allocator might need to actually map the page it got from the physical allocator into the page directory. When doing that, sometimes you need to allocate a page for a new page table, and if that allocation goes through the same allocator, that's a problem (more recursion).
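Roughly, the hazard looks like this on 32-bit x86-style paging (hypothetical names, just to illustrate):

Code:

#include <stdint.h>

#define PAGE_PRESENT 0x1u
#define PAGE_RW      0x2u

uintptr_t alloc_phys_page(void);  /* returns the paddr of a free 4 KB frame */

/* Map one 4 KB page: vaddr -> paddr. */
void map_page(uint32_t *pdir, void *vaddr, uintptr_t paddr)
{
   uint32_t pd_index = ((uintptr_t)vaddr >> 22) & 0x3ffu;

   if (!(pdir[pd_index] & PAGE_PRESENT)) {

      /* No page table covers this 4 MB range yet, so one must be
       * allocated. If this went through the VMM allocator, which
       * itself calls map_page(), we'd recurse. Going straight to
       * the physical allocator avoids that. */
      uintptr_t pt_paddr = alloc_phys_page();
      pdir[pd_index] = (uint32_t)pt_paddr | PAGE_PRESENT | PAGE_RW;
   }

   /* ... write the PTE for vaddr into the page table (note: the PDE
    * stores the table's paddr; you still need a vaddr to write to
    * the table, which is trivial only with a linear mapping) ... */
   (void)paddr;
}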
If I remember correctly, I had several kinds of small problems when I had both a physical pageframe allocator and a VMM allocator in Tilck. At some point, I made it all work properly, but the performance wasn't as good as I wanted because the VMM allocator had to call the physical allocator from time to time. Even if that happened only for some bigger chunks (e.g. 1 MB), it was still too much overhead for my taste (note: my performance criteria are very tight). Also, each page directory needed a PADDR pointer, because I wasn't using a linear mapping: each page in the kernel's VMM could be anywhere in physical memory. That approach is flexible, but it has an overhead in many places, both in terms of performance and in terms of code complexity.
In the end, I switched to a simplified version of the Linux model: pure linear mapping. At boot, I map all the usable regions of physical memory in the range [0, 896 MB) to +3 GB in the kernel's VMM (0xC0000000). When possible, I use 4 MB pages to save memory on page tables. With that, the conversion between a paddr and a vaddr is linear, with a simple constant. The allocator reserves virtual memory in "heaps" contained in usable regions that have a 1:1 mapping to physical memory. The extra 128 MB in the VMM are used for special purposes, but there's nothing like Linux's "HighMem": in other words, on 32-bit machines my kernel won't support more than 896 MB of RAM. But that's not a problem for me: either the OS will run on small 32-bit embedded systems where 896 MB is more than we could ever need, or it will run (in the future) on 64-bit machines, where the linear mapping has no practical limitations.
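With that model, the paddr/vaddr conversion is just an add/subtract with the base constant. A minimal sketch (KERNEL_BASE_VA is just my name here for the 0xC0000000 constant from above):

Code:

#include <stdint.h>

#define KERNEL_BASE_VA 0xC0000000u   /* +3 GB */

/* Valid only for paddrs below 896 MB: those are mapped 1:1 at boot. */
static inline void *pa_to_va(uintptr_t paddr)
{
   return (void *)(paddr + KERNEL_BASE_VA);
}

static inline uintptr_t va_to_pa(void *vaddr)
{
   return (uintptr_t)vaddr - KERNEL_BASE_VA;
}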
Windows' model: to my knowledge, Windows uses a hybrid model: linear mapping between regions in the VMM and physical memory, but different regions in the VMM can be mapped anywhere in physical memory relative to each other. So, given a vaddr, we need to know its region before being able to determine where it is mapped; given a paddr, we first need to determine whether we have a region containing that paddr range, and that region's initial address, in order to get a usable vaddr. That's the most complex model IMHO, but it certainly works as well.
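In code, the vaddr -> paddr direction of such a model would look something like this (everything below is hypothetical, just to illustrate the per-region lookup; I don't know Windows' actual internals):

Code:

#include <stddef.h>
#include <stdint.h>

struct va_region {
   uintptr_t va_start;   /* region start in the VMM */
   uintptr_t pa_start;   /* where the region begins in physical memory */
   size_t    size;
};

static struct va_region regions[16];
static size_t region_count;

/* Per-region linear translation: find the region containing vaddr,
 * then apply that region's own constant offset. Returns (uintptr_t)-1
 * when vaddr falls outside every region. */
uintptr_t region_va_to_pa(uintptr_t vaddr)
{
   for (size_t i = 0; i < region_count; i++) {

      const struct va_region *r = &regions[i];

      if (vaddr >= r->va_start && vaddr < r->va_start + r->size)
         return r->pa_start + (vaddr - r->va_start);
   }

   return (uintptr_t)-1;   /* not mapped in any region */
}

The paddr -> vaddr direction is the mirror image: scan the regions for one whose physical range contains the paddr, then add the offset to its va_start.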