PMM: Do I ever need to allocate a specific address?
Posted: Wed Apr 23, 2025 5:09 am
by sebihepp
Dear colleagues,
I wonder whether I will ever need to allocate a specific address in my physical memory manager.
If not, it would make the implementation much easier: I would just keep a pointer to the first free page, and every free page would contain a pointer to the next free page, giving O(1) allocation and freeing.
But if I ever need to allocate/free a specific address, this algorithm becomes infeasible, because it degrades to O(n) (traversing the list until I have found the address).
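To make that concrete, here is a minimal sketch of the stack-like free list I mean (the names are just placeholders, and it assumes free pages are reachable through some mapping, e.g. an identity map):
Code:

#include <stddef.h>

/* Each free page stores a pointer to the next free page in its first bytes. */
typedef struct FreePage {
    struct FreePage *next;
} FreePage;

static FreePage *free_list_head = NULL;

/* O(1): pop the first free page off the list. */
static void *pmm_alloc_page(void)
{
    FreePage *page = free_list_head;
    if (page != NULL)
        free_list_head = page->next;
    return page;
}

/* O(1): push the page back onto the list. */
static void pmm_free_page(void *addr)
{
    FreePage *page = (FreePage *)addr;
    page->next = free_list_head;
    free_list_head = page;
}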
For that case I thought of something like
Code:
struct PhysicalMemoryChunk_t {
    size_t mSize;                             // size of this free chunk
    PhysicalMemoryChunk_t *mPrev;             // doubly linked list of free chunks
    PhysicalMemoryChunk_t *mNext;
    PhysicalMemoryChunk_t *mParentByAddress;  // AVL tree ordered by address
    PhysicalMemoryChunk_t *mLeftByAddress;
    PhysicalMemoryChunk_t *mRightByAddress;
    PhysicalMemoryChunk_t *mParentBySize;     // AVL tree ordered by size
    PhysicalMemoryChunk_t *mLeftBySize;
    PhysicalMemoryChunk_t *mRightBySize;
};
This structure sits at the beginning of each free physical chunk. The pointers provide doubly-linked-list access, an AVL tree ordered by address, and an AVL tree ordered by size. The implementation is much harder (I have forgotten AVL trees after not using them for more than 10 years and need to figure out / learn the implementation again), but it makes access much faster, at O(log n).
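For illustration, a lookup in the by-address tree might look like the following sketch, using the struct above (balancing and rotations omitted; the function name is just mine):
Code:

#include <stddef.h>
#include <stdint.h>

/* Hypothetical lookup in the address-ordered tree of the struct above.
 * O(log n) as long as the tree is kept balanced (rotations not shown). */
struct PhysicalMemoryChunk_t *FindChunkByAddress(struct PhysicalMemoryChunk_t *root,
                                                 uintptr_t address)
{
    while (root != NULL) {
        uintptr_t start = (uintptr_t)root;   /* chunk header sits at the chunk start */
        if (address < start)
            root = root->mLeftByAddress;
        else if (address >= start + root->mSize)
            root = root->mRightByAddress;
        else
            return root;                     /* address falls inside this free chunk */
    }
    return NULL;                             /* address is not in any free chunk */
}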
So, is a stack-like physical page frame allocator really good enough? I have heard that nearly all devices today are capable of non-contiguous (scatter-gather) DMA. Is this correct? And how do Bochs and especially QEMU behave in this regard?
Did I make a mistake in my reasoning about the alternative approach?
Best regards
Sebi
Re: PMM: Do I ever need to allocate a specific address?
Posted: Wed Apr 23, 2025 5:52 am
by nullplan
There are cases where you need to allocate addresses satisfying certain conditions for reasons of DMA. For example, 32-bit EHCI devices place control structures in RAM and can typically only talk to the first 4GB of physical address space, so you need to allocate memory in the low 4GB. Even the 64-bit EHCI devices only allow you to pick a specific 32-bit "page" to use; they don't let you allocate any 64-bit address.
Or if you want to start secondary CPUs with the SIPI method, you need to allocate at least a page in the low 1MB for that.
So no, you never need a specific address, but you might need more than just "the next free address".
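As an interface, something like this is usually enough (purely an illustrative sketch; the names are made up, not from any real kernel):
Code:

#include <stdint.h>

/* Illustrative constraint values; not from any real kernel. */
#define PMM_LIMIT_1M   0x0000000000100000ull   /* e.g. the SIPI trampoline     */
#define PMM_LIMIT_4G   0x0000000100000000ull   /* e.g. 32-bit-only DMA masters */
#define PMM_LIMIT_NONE UINT64_MAX              /* anywhere in physical memory  */

/* Return the physical address of one free 4k page below 'limit', or 0 on failure. */
uint64_t pmm_alloc_page_below(uint64_t limit);

/* Example: the AP startup trampoline must live below 1 MiB. */
/* uint64_t trampoline = pmm_alloc_page_below(PMM_LIMIT_1M); */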
Re: PMM: Do I ever need to allocate a specific address?
Posted: Wed Apr 23, 2025 6:23 am
by sebihepp
Thanks @nullplan
Below 1M, below 4G, and the rest are all the memory regions with special requirements that I know of.
In that case I could simply have three stack-like free lists, one for each region.
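Something like this, roughly (an untested sketch; the region boundaries and names are only placeholders, and it assumes free pages are directly addressable, e.g. via an identity map):
Code:

#include <stddef.h>
#include <stdint.h>

typedef struct FreePage { struct FreePage *next; } FreePage;

/* One stack-like free list per region with special requirements. */
enum { REGION_LOW_1M, REGION_LOW_4G, REGION_HIGH, REGION_COUNT };
static FreePage *region_free_list[REGION_COUNT];

static int region_of(uint64_t phys)
{
    if (phys < 0x100000ull)    return REGION_LOW_1M;
    if (phys < 0x100000000ull) return REGION_LOW_4G;
    return REGION_HIGH;
}

/* Free: push the page onto the list of its region. */
static void pmm_free_page(uint64_t phys)
{
    FreePage *page = (FreePage *)(uintptr_t)phys;   /* identity-mapped assumption */
    int r = region_of(phys);
    page->next = region_free_list[r];
    region_free_list[r] = page;
}

/* Allocate below 4G: prefer the larger region, fall back to low memory. */
static FreePage *pmm_alloc_page_below_4g(void)
{
    for (int r = REGION_LOW_4G; r >= REGION_LOW_1M; r--) {
        FreePage *page = region_free_list[r];
        if (page) {
            region_free_list[r] = page->next;
            return page;
        }
    }
    return NULL;
}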
Currently I am ignoring multiple CPUs/cores and plan to add them later. Or would you recommend planning with SMP in mind from the start?
Re: PMM: Do I ever need to allocate a specific address?
Posted: Wed Apr 23, 2025 1:52 pm
by nullplan
sebihepp wrote: ↑Wed Apr 23, 2025 6:23 am
Below 1M, below 4G, and the rest are all the memory regions with special requirements that I know of.
In that case I could simply have three stack-like free lists, one for each region.
According to the Linux source, there are also certain EHCI devices that only support 31-bit addresses. I would wager such silliness can occur with other devices as well, so the region approach might not be flexible enough. The gold standard is of course 16-bit ISA DMA: allocate a contiguous buffer that lives in the low 16 MB (24-bit addresses) and does not cross a 64kB boundary. That is probably the most complicated physical allocation you will need to support.
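As an illustration of how simple that constraint is to express once you can check candidate ranges, a validity test might look like this (just a sketch):
Code:

#include <stdbool.h>
#include <stdint.h>

/* Illustrative check only: does a contiguous buffer satisfy the classic
 * ISA DMA constraints (below 16 MiB, not crossing a 64 KiB boundary)? */
static bool isa_dma_ok(uint64_t phys, uint64_t len)
{
    if (phys + len > (1ull << 24))                /* must lie below 16 MiB         */
        return false;
    if ((phys >> 16) != ((phys + len - 1) >> 16)) /* must stay in one 64 KiB bank  */
        return false;
    return true;
}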
sebihepp wrote: ↑Wed Apr 23, 2025 6:23 am
Currently I am ignoring multiple CPUs/cores and plan to add them later. Or would you recommend planning with SMP in mind from the start?
Some changes are hard to retrofit. If you started out writing your code with the idea that "cli" is the same as taking a global lock, then changing that decision later is hard. In the specific case of the PMM, I will tell you that you should write it to cope with the requirements your device drivers will have. More generally, I will tell you that writing your code to be SMP-ready, just starting with the special case of only supporting a single CPU, makes your code easier to actually parallelize. Always assume there are other threads accessing your global data, for example. This also reduces the amount of code that has to run with interrupts disabled.
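For example, even while you only support one CPU, the allocator can already be written as if its global data were shared (a sketch using C11 atomics; the lock is trivial today, but the structure is already SMP-ready):
Code:

#include <stdatomic.h>

typedef struct FreePage { struct FreePage *next; } FreePage;

static FreePage *free_list_head;
static atomic_flag free_list_lock = ATOMIC_FLAG_INIT;

static void *pmm_alloc_page(void)
{
    /* Uniprocessor today, SMP tomorrow: the locking discipline stays the same. */
    while (atomic_flag_test_and_set_explicit(&free_list_lock, memory_order_acquire))
        ;                                   /* spin until the lock is free */

    FreePage *page = free_list_head;
    if (page)
        free_list_head = page->next;

    atomic_flag_clear_explicit(&free_list_lock, memory_order_release);
    return page;
}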
Re: PMM: Do I ever need to allocate a specific address?
Posted: Wed Apr 23, 2025 5:13 pm
by zaval
sebihepp wrote: ↑Wed Apr 23, 2025 5:09 am
But if I ever need to allocate/free a specific address, this algorithm becomes infeasible, because it degrades to O(n) (traversing the list until I have found the address).
How about a list (an array, really), every entry of which describes a page frame? The index of such an entry becomes the page frame number, giving you constant-time lookup for a specific address. Also, searches through an AVL tree (or any other balanced binary tree) describing ranges are not O(n), are they?
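Something along these lines (a rough sketch; the names and the state field are just illustrative):
Code:

#include <stdint.h>

#define PAGE_SHIFT 12

/* One descriptor per physical page frame, indexed by page frame number (PFN). */
typedef struct PageFrame {
    uint8_t  state;      /* e.g. FREE, ALLOCATED, ...                   */
    uint32_t next_free;  /* PFN of the next free frame, for a free list */
} PageFrame;

static PageFrame *frame_db;   /* allocated at boot, one entry per frame */

/* O(1): find the descriptor for any physical address. */
static PageFrame *frame_of(uint64_t phys)
{
    return &frame_db[phys >> PAGE_SHIFT];
}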
Re: PMM: Do I ever need to allocate a specific address?
Posted: Thu Apr 24, 2025 1:40 am
by rdos
Using bitmaps instead of lists handles the region issues a lot better. Additionally, there are devices that need more than one 4k page of contiguous physical memory, and you might also want to allocate 2M pages, since they are more efficient: one less paging level to walk and fewer TLB entries.
As for ISA DMA, I would simply skip that. It's not necessary to support.
Another definite issue with lists is that all physical memory must be mapped. That's impossible for a 32-bit kernel and unwanted for a 64-bit one.
Re: PMM: Do I ever need to allocate a specific address?
Posted: Thu Apr 24, 2025 6:16 am
by sebihepp
rdos wrote: ↑Thu Apr 24, 2025 1:40 am
Another definite issue with lists is that all physical memory must be mapped. That's impossible for a 32-bit kernel and unwanted for a 64-bit one.
Is it really such a big issue to map all physical memory in 64-bit mode? The virtual address space is big enough. I could also get along with only one page that I always map wherever I need it.
For a bitmap I need to know the highest physical address and then allocate physical pages without yet having a physical page frame allocator. And it uses up much more space. A list of only the free pages uses much less space.
Re: PMM: Do I ever need to allocate a specific address?
Posted: Thu Apr 24, 2025 11:57 am
by rdos
sebihepp wrote: ↑Thu Apr 24, 2025 6:16 am
For a bitmap I need to know the highest physical address and then allocate physical pages without yet having a physical page frame allocator. And it uses up much more space. A list of only the free pages uses much less space.
One bit per 4k (1 byte per 32k) is not a lot of space. It's 0.003%. The page tables themselves consume 8 bytes per 4k, which is 0.2%. 32-bit paging only requires 4 bytes per 4k, which is 0.1%.
You don't need to know the highest physical address, and the bitmap is only mapped for areas that contain physical memory. You need one startup page to initialize it, but after that, you can allocate new pages as you fill up the bitmap.
Besides, the list allocator also needs startup pages since you need to map physical memory in linear memory, which requires physical memory in the page table structures.
Additionally, the bitmap allocator requires only 128k of linear memory to manage 4G of physical memory, whereas a list allocator necessitates 4G of linear memory to handle the same amount of physical memory.
So, you need to reserve linear memory based on how much physical memory you want to support (and thus on the highest physical address). This reservation of linear address space is an issue in 32-bit mode, but my physical memory allocator supports 1.5 TB of physical memory, which requires a 48 MB reservation; I feel that is a good compromise. With a list allocator, the same amount of linear address space would only let me cover about 48 MB of physical memory, short of using strange tricks.
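The arithmetic, spelled out as a small sketch (the function name is invented for the example):
Code:

#include <stdint.h>

/* Bitmap size for a given amount of physical memory, one bit per 4k frame:
 * 1.5 TB / 4 KB = 384M frames -> 384M bits = 48 MB of bitmap. */
static uint64_t bitmap_bytes(uint64_t max_phys)
{
    uint64_t frames = max_phys >> 12;   /* number of 4 KB page frames */
    return (frames + 7) / 8;            /* one bit per frame          */
}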
Another advantage of the bitmap allocator is that it is possible to implement it lock-free, and thus it can be accessed from anywhere, including in locked parts of the scheduler and in IRQs.
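For illustration, a lock-free claim of a single frame could look roughly like this (a sketch using C11 atomics and a GCC/Clang builtin; not the code of any particular kernel):
Code:

#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

static _Atomic uint64_t bitmap[32768];   /* 1 bit per 4k frame, 1 = free; sized for the example */

/* Try to claim one free frame lock-free; returns the frame number or -1. */
static int64_t bitmap_alloc_frame(size_t words)
{
    for (size_t w = 0; w < words; w++) {
        uint64_t old = atomic_load_explicit(&bitmap[w], memory_order_relaxed);
        while (old != 0) {
            int bit = __builtin_ctzll(old);            /* lowest set (free) bit */
            uint64_t desired = old & ~(1ull << bit);   /* mark it used          */
            if (atomic_compare_exchange_weak_explicit(&bitmap[w], &old, desired,
                    memory_order_acquire, memory_order_relaxed))
                return (int64_t)(w * 64 + bit);
            /* CAS failed: 'old' now holds the fresh value, so just retry. */
        }
    }
    return -1;
}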
Re: PMM: Do I ever need to allocate a specific address?
Posted: Thu Apr 24, 2025 11:58 am
by nullplan
sebihepp wrote: ↑Thu Apr 24, 2025 6:16 am
Is it really such a big issue to map all physical memory in 64-bit mode?
No, it isn't. It is just a bee in rdos's bonnet. He still thinks that segmentation is superior to paging and hasn't yet noticed that that particular ship has sailed all the way around the globe, came back home, and sailed again. When pressed, he will typically say something about protection from errors being better that way, and that might be true. But considering segmentation is just not available in 64-bit mode (and that mode is the only one I would ever consider writing something new for, as the others are just for compat), the point is kind of moot. It's like claiming trains are the superior method of transportation when you're stuck somewhere in Michigan.
sebihepp wrote: ↑Thu Apr 24, 2025 6:16 am
For a bitmap I need to know the highest physical address and then allocate physical pages without yet having a physical page frame allocator. And it uses up much more space. A list of only the free pages uses much less space.
A bitmap uses one bit for each page of physical memory, so every 8 pages use up 1 byte. Since 8 pages are 32kB, the ratio is 1/32768. I highly doubt that that's an appreciable amount of memory. And that scheme allows you to allocate space with constraints (even for ISA DMA if needed). This becomes especially important when you want to allocate more than one page contiguously. Your linked list has no way to do this, ever. Well, OK, I can think of an algorithm, but it takes O(n) space and O(m^n) run time, where n is the number of pages needed and m is the number of nodes in the list. Meanwhile, the bitmap algorithm just counts bits.
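For instance, finding a run of contiguous free frames within some address window is a plain scan over bits (a sketch, with the obvious layout of one bit per 4k frame, 1 = free):
Code:

#include <stdint.h>

/* Returns the first frame of a free run of 'count' frames whose frame numbers
 * lie in [first, limit), or -1 if there is none. */
static int64_t bitmap_find_run(const uint64_t *bitmap, uint64_t first,
                               uint64_t limit, uint64_t count)
{
    uint64_t run = 0;
    for (uint64_t frame = first; frame < limit; frame++) {
        if (bitmap[frame / 64] & (1ull << (frame % 64))) {
            if (++run == count)
                return (int64_t)(frame - count + 1);   /* start of the run */
        } else {
            run = 0;                                   /* run broken by a used frame */
        }
    }
    return -1;
}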
The space for the bitmap can be allocated just by iterating over the memory info and taking the first free spot you have that is big enough. At the time you do this, RAM is empty except for the kernel.
I myself am using memblocks, which are essentially run-length-encoded bitmaps. That's also what Linux uses during early boot.
Re: PMM: Do I ever need to allocate a specific address?
Posted: Thu Apr 24, 2025 12:14 pm
by rdos
nullplan wrote: ↑Thu Apr 24, 2025 11:58 am
sebihepp wrote: ↑Thu Apr 24, 2025 6:16 am
Is it really such a big issue to map all physical memory in 64-bit mode?
No, it isn't. It is just a bee in rdos's bonnet. He still thinks that segmentation is superior to paging and hasn't yet noticed that that particular ship has sailed all the way around the globe, came back home, and sailed again. When pressed, he will typically say something about protection from errors being better that way, and that might be true. But considering segmentation is just not available in 64-bit mode (and that mode is the only one I would ever consider writing something new for, as the others are just for compat), the point is kind of moot. It's like claiming trains are the superior method of transportation when you're stuck somewhere in Michigan.
Well, mapping physical memory in linear memory is not related to segmentation. It's just bad practice, since it makes all physical memory accessible anywhere in the kernel, even memory that is otherwise mapped only in a high-security server process and should be protected from tampering.
Besides, I use paging, and my user space uses a flat memory model. It's my kernel that is segmented.
Re: PMM: Do I ever need to allocate a specific address?
Posted: Thu Apr 24, 2025 4:20 pm
by zaval
rdos wrote: ↑Thu Apr 24, 2025 1:40 am
Using bitmaps instead of lists handles the region issues a lot better. Additionally, there are devices that need more than one 4k page of contiguous physical memory, and you might also want to allocate 2M pages, since they are more efficient: one less paging level to walk and fewer TLB entries.
As for ISA DMA, I would simply skip that. It's not necessary to support.
Another definite issue with lists is that all physical memory must be mapped. That's impossible for a 32-bit kernel and unwanted for a 64-bit one.
If page frames were to have more than two states (free, taken), but rather multiple states, like zeroed, free, standby, modified, modified-no-write, and taken, then bitmaps wouldn't be such a "perfect" fit.
Re: PMM: Do I ever need to allocate a specific address?
Posted: Fri Apr 25, 2025 12:59 am
by rdos
zaval wrote: ↑Thu Apr 24, 2025 4:20 pm
rdos wrote: ↑Thu Apr 24, 2025 1:40 am
Using bitmaps instead of lists handles the region issues a lot better. Additionally, there are devices that need more than one 4k page of contiguous physical memory, and you might also want to allocate 2M pages, since they are more efficient: one less paging level to walk and fewer TLB entries.
As for ISA DMA, I would simply skip that. It's not necessary to support.
Another definite issue with lists is that all physical memory must be mapped. That's impossible for a 32-bit kernel and unwanted for a 64-bit one.
If page frames were to have more than two states (free, taken), but rather multiple states, like zeroed, free, standby, modified, modified-no-write, and taken, then bitmaps wouldn't be such a "perfect" fit.
You can still support that. For instance, you can keep a pool of zeroed, free pages in a list rather than in the free bitmap. Although it's a bit questionable whether this is any advantage over just zeroing a page after having allocated it; the latter seems safer. Modified is typically an attribute of allocated pages, and when a page is allocated and mapped, you can use available bits in the paging structures for things like accessed and modified.
Still, there are a few drawbacks to not having all physical memory mapped into linear memory. Initializing page directory entries is one; it would be easier and safer if the page were mapped. Dynamically loading and relocating user-space pages is another, which can fail if the user process has multiple threads and the work is done through the user-space mapping rather than through a linear mapping of the physical page; this would be easier if the physical page were already mapped. Handling PCI device queues can also be a challenge, since these use physical addresses and not linear ones.
Perhaps a 32-bit kernel could reserve linear address space for x pages that are mapped to physical memory, which can then be initialized and moved elsewhere. If they are handled like a circular buffer, then when the buffer wraps, new pages could be allocated and linked as needed and all the linear TLB entries could be flushed at once. This would avoid doing a flush for every mapping, which is costly.
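A sketch of that circular-buffer idea (the helper functions and the window size are assumptions, not existing code):
Code:

#include <stddef.h>
#include <stdint.h>

#define WINDOW_SLOTS 64                /* reserved virtual pages for temporary maps */

static uintptr_t window_base;          /* start of the reserved virtual region */
static unsigned  next_slot;            /* next slot in the circular buffer     */

/* Assumed helpers, not shown here: map one 4k page / flush the whole window. */
extern void map_page(uintptr_t virt, uint64_t phys);
extern void flush_window(uintptr_t base, size_t slots);

/* Map a physical page into the next slot; flush the TLB only when the ring wraps. */
static void *map_temporary(uint64_t phys)
{
    if (next_slot == WINDOW_SLOTS) {
        flush_window(window_base, WINDOW_SLOTS);   /* one batched flush per wrap */
        next_slot = 0;
    }
    uintptr_t virt = window_base + (uintptr_t)next_slot * 4096;
    map_page(virt, phys);
    next_slot++;
    return (void *)virt;
}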