nullplan wrote:The problem with this approach is that constrained allocation becomes harder. You don't even know from looking at the list if any memory fulfilling the requirements even can exist. You can iterate over the list to find a suitable page, but given today's humongous memory sizes, that would take a while. Even then, since nothing imposes any order on the list, you might end up having to traverse pretty much the entire list to find a suitable page. Whereas, with a bitmap allocator, you at least know the right region to look at.
Thanks for all of your answers. I edited my bitmap algorithm to be much better, but the one thing I still don't know how to do well is where to store the bitmap itself. I don't have an allocator yet to allocate space for it, so I need to point it at some location, but where?
A simple bump allocator should be sufficient for bootstrap. Just start the pointer at the end of your uninitialized data, then when you allocate something, just advance that pointer over whatever you're allocating, and return the old pointer.
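A minimal sketch of such a bootstrap bump allocator. In a real kernel the starting pointer would come from a linker symbol marking the end of the uninitialized data (the symbol name varies per linker script); here a static buffer stands in for that so the sketch is self-contained:

```c
#include <stdint.h>
#include <stddef.h>

/* Stand-in for "end of kernel .bss"; in a kernel this would be the
 * address of a linker-provided symbol, not a static array. */
static uint8_t boot_heap[4096];
static uint8_t *bump_ptr = boot_heap;

static void *bump_alloc(size_t size, size_t align)
{
    /* Round the current pointer up to the requested alignment
     * (align must be a power of two). */
    uintptr_t p = ((uintptr_t)bump_ptr + align - 1) & ~(uintptr_t)(align - 1);
    bump_ptr = (uint8_t *)(p + size);  /* advance past the allocation */
    return (void *)p;                  /* return the old (aligned) pointer */
}
```

Note there is no `free`: a bump allocator only ever moves forward, which is fine for one-time bootstrap structures like the bitmap.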
Thanks for your answer. I would also like to ask:
1. How do I allocate several frames one after another and return them as a chunk (for example, for 1 GB huge pages on x86)?
2. What if a driver's MMIO lies in an area marked as free, and some frame is allocated there before the driver says that the space should be reserved? Or is it impossible for MMIO to be located in areas the memory map lists as free (what about the bootloader's framebuffer)?
ngx wrote:How do I allocate several frames one after another and return them as a chunk (for example, for 1 GB huge pages on x86)?
Look for enough contiguous frames with appropriate alignment to satisfy the allocation, then allocate all of them at once. You only need to return the address of the first frame, since they're contiguous. Freeing will require the caller to inform you how many frames there are.
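A minimal sketch of that search over a used/free bitmap, assuming 4 KiB frames and a hypothetical fixed-size bitmap (a 1 GiB page would need 262144 contiguous 4 KiB frames starting on a 262144-frame boundary). This is a simple O(n) scan, not an optimized implementation:

```c
#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>

#define TOTAL_FRAMES 1024u

/* One bit per frame: 1 = used, 0 = free. */
static uint8_t frame_bitmap[TOTAL_FRAMES / 8];

static bool frame_used(size_t i) { return frame_bitmap[i / 8] & (1u << (i % 8)); }
static void frame_set(size_t i)  { frame_bitmap[i / 8] |= (uint8_t)(1u << (i % 8)); }

/* Find `count` contiguous free frames whose first frame index is a
 * multiple of `align_frames` (must be >= 1). Returns the first frame
 * index, or (size_t)-1 on failure. Only the first frame's address needs
 * to be returned to the caller, since the run is contiguous. */
static size_t alloc_contiguous(size_t count, size_t align_frames)
{
    for (size_t start = 0; start + count <= TOTAL_FRAMES; start += align_frames) {
        size_t i;
        for (i = 0; i < count && !frame_used(start + i); i++)
            ;
        if (i == count) {             /* found a long-enough free run */
            for (i = 0; i < count; i++)
                frame_set(start + i); /* claim the whole run at once */
            return start;
        }
    }
    return (size_t)-1;
}
```

Freeing is symmetric: the caller passes back the first frame and the count, and you clear that many bits.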
ngx wrote:What if a driver's MMIO lies in an area marked as free, and some frame is allocated there before the driver says that the space should be reserved? Or is it impossible for MMIO to be located in areas the memory map lists as free (what about the bootloader's framebuffer)?
MMIO is not memory, so it will never be listed as free memory in the memory map. The framebuffer is MMIO.
How should I determine the size of the bitmap? The base address of the last memory-map entry plus its size is bigger than the sum of the sizes of all the entries, so which of these numbers should I use: the end of the last area in the memory map, or the total size of all the areas?
If 4 GB fits in 128 KB, then the maximum amount of memory a PC can address - 64 TB - would need a 2 GB bitmap. That would be quite slow to initialize.
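The arithmetic behind those figures: one bit per 4 KiB frame means one byte covers 32 KiB of memory. A one-line sketch:

```c
#include <stdint.h>

/* One bit per 4 KiB frame: bitmap bytes = memory bytes / (4096 * 8). */
static uint64_t bitmap_bytes(uint64_t mem_bytes)
{
    return mem_bytes / (4096ull * 8);
}
```

So 4 GiB of address space needs 128 KiB of bitmap, and 64 TiB needs 2 GiB.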
I suppose there aren't that many PCs around with 64 TB of RAM, but still... This doesn't seem to scale. I suppose one can incrementally build the bitmap as needed by walking the memory map provided by the firmware (and not do everything up-front).
kzinti wrote:If 4 GB fits in 128 KB, then the maximum amount of memory a PC can address - 64 TB - would need a 2 GB bitmap. That would be quite slow to initialize.
How did you come up with 64TiB? The architectural limit is 4096TiB, which would need a 128GiB bitmap. (That ought to make 2GiB look pretty fast!)
kzinti wrote:I suppose there aren't that many PCs around with 64 TB of RAM, but still... This doesn't seem to scale. I suppose one can incrementally build the bitmap as needed by walking the memory map provided by the firmware (and not do everything up-front).
The first "PC" with 64TiB of RAM will be a server with many CPUs and hundreds of cores (if it doesn't already exist - I'm not sure if it does). You'll have bigger scaling problems than just initializing the bitmap.
Consumer hardware with 64TiB of RAM is far enough off that I'm not sure it's worth worrying about right now. Check back in a decade or so.
Octocontrabass wrote:How did you come up with 64TiB? (That ought to make 2GiB look pretty fast!)
It is my understanding that current x86 processors can only address 64 TB of physical memory. I understand that more bits are being added and that we could one day reach 4096 TB. But we are not there yet.
Octocontrabass wrote:Consumer hardware with 64TiB of RAM is far enough off that I'm not sure it's worth worrying about right now. Check back in a decade or so.
Octocontrabass wrote:The first "PC" with 64TiB of RAM will be a server with many CPUs and hundreds of cores (if it doesn't already exist - I'm not sure if it does). You'll have bigger scaling problems than just initializing the bitmap.
I believe there are such issues with my PC, which has only 128 GB of physical memory. It boots considerably slower than other PCs, and it's the BIOS that is slower at initializing things (probably memory).
A few decades ago, I think many BIOSes actually tested most of the physical memory at boot, but that is no longer the case. However, with hundreds of GB of memory, initialization seems to be slow even without testing all of it.
Octocontrabass wrote:The first "PC" with 64TiB of RAM will be a server with many CPUs and hundreds of cores (if it doesn't already exist - I'm not sure if it does). You'll have bigger scaling problems than just initializing the bitmap.
And it would almost certainly have some degree of NUMA partitioning, so you can easily have each CPU setting up its own bitmaps in parallel. Or, you could initialize, say, the first 10% to get the machine up and running, and spin off a kernel thread to set up the rest later in the boot process.
The machine you describe does exist in the server space. The highest end IBM p-series supports up to 64TB of RAM and 192 cores.
I wouldn't worry too much about how much memory you need for free page bitmaps. There's just a certain amount of overhead a kernel has to consume to be able to do anything. At least the need is predictably linear. If you need X% of your memory to store free page bitmaps, you'll also need X% with 100x the RAM. You're sure to consume more memory storing page maps anyway.
Octocontrabass wrote:The offset within the bitmap indicates the address of the frame, so your bitmap needs to be big enough to cover all addresses that contain memory.
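A minimal sketch of the quoted rule, assuming a hypothetical `memmap_entry` layout: because the bit index encodes the frame's address, the bitmap must be sized from the highest end address in the map, not from the sum of the entry sizes (holes between entries still consume bit positions):

```c
#include <stdint.h>
#include <stddef.h>

struct memmap_entry { uint64_t base, length; }; /* hypothetical layout */

/* Size the bitmap to cover every address that contains memory: find the
 * highest end address across all entries, then allot one bit per 4 KiB
 * frame up to that address. */
static uint64_t bitmap_size_bytes(const struct memmap_entry *map, size_t n)
{
    uint64_t highest = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t end = map[i].base + map[i].length;
        if (end > highest)
            highest = end;
    }
    return highest / (4096ull * 8); /* one bit per 4 KiB frame */
}
```

Frames falling into the holes are simply marked used so they can never be handed out.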
I have finally implemented the physical memory allocator. Am I correct in how I have implemented the bitmap algorithm? Are there any serious bugs?
Last edited by rpio on Sun Jan 23, 2022 4:46 am, edited 1 time in total.
sj95126 wrote:I wouldn't worry too much about how much memory you need for free page bitmaps. There's just a certain amount of overhead a kernel has to consume to be able to do anything. At least the need is predictably linear. If you need X% of your memory to store free page bitmaps, you'll also need X% with 100x the RAM. You're sure to consume more memory storing page maps anyway.
It's not the amount of memory consumed by the bitmap that is of concern, it's how much time it is going to take to initialize it (and, by extension, to manage it).
Managing your memory using a bitmap tends to produce algorithms of linear complexity, and that is not going to scale well.
Last edited by kzinti on Thu Mar 11, 2021 11:36 am, edited 1 time in total.
iansjack wrote:Whatever scheme you use is going to have to be initialized.
But not all schemes will have linear initialization time.
I am happy with the proposed solution above for bitmaps: initialize a subset (say the first 4GB) to get the kernel going and when you have SMP up and running, spawn tasks to finish initializing the bitmaps.
I think bitmap initialization will be much faster than linking pages. Linking pages requires one write per 4 KiB frame; with a bitmap, you can initialize 32 frames with one 32-bit write, or 64 frames with a 64-bit write.
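A sketch of that word-wise initialization, assuming a small hypothetical frame count: each 64-bit store covers 64 frames, whereas a free-list would need one pointer write per frame:

```c
#include <stdint.h>
#include <stddef.h>

#define TOTAL_FRAMES 4096u

/* One bit per frame, packed into 64-bit words: 1 = used, 0 = free. */
static uint64_t bitmap[TOTAL_FRAMES / 64];

/* Mark every frame used in one pass; each store covers 64 frames.
 * Free regions from the firmware memory map would then be cleared
 * afterwards, again mostly word-at-a-time. */
static void bitmap_init_all_used(void)
{
    for (size_t i = 0; i < TOTAL_FRAMES / 64; i++)
        bitmap[i] = ~0ull;
}
```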