Worst way to initialize physical frame allocator
Posted: Sat May 22, 2021 9:45 pm
So, I believe that I have succeeded in making just about the worst initializer for the physical frame allocator (and by extension, one of the worst allocators ever). I figure if anyone wants to die a little inside, I'll go ahead and show what I did.
Basically, I used a bitmap that mapped all 4k blocks in the entire 4GB address space. Any unusable memory was considered to be allocated, and any gaps were considered to be allocated. This whole bitmap was more 131KB in size.
This is the initializer code:
Basically, it tries to find available memory for the bitmap, avoiding the kernel and stack regions, and then sets all of the bits in the range to 0xFF. After that, it goes through the memory map and then figures out which bits are in the available memory regions, not part of the bitmap, and then sets those bits to zero. Ideally, it should also be checking for stack and kernel regions, though. The reason I had the phys_mem_bitmap as a uint64_t is because I figured that eventually I'll want to port this kernel to the x86_64. I figured that if it is running on an x86, the addresses will obviously never exceed the 4GB range, and so I'm good to truncate half of it when I treat it as a pointer. Ideally, I'll have those settled in the arch directory (i386, 86-64), though.
It literally took like two or three minutes just for it to go through the whole bitmap.
Now that I've made something so unspeakable, it's time to redo it. I figured do it once poorly just to get a feel for it, and then do it better a second time. I'm not sure exactly what data structure I'm going to use next time around. I could save space by having three or four structs that each contain a bitmap instead. Each one would then cover a region of usable memory (since there are large chunks that just plain aren't usable). I considered doing the buddy system, but based on everything I've read about it, it's extremely circular between using it as an allocator and needing an allocator for it.
Basically, I used a bitmap that mapped all 4k blocks in the entire 4GB address space. Any unusable memory was considered to be allocated, and any gaps were considered to be allocated. This whole bitmap was more 131KB in size.
This is the initializer code:
Code: Select all
int init_memory_map(multiboot_info_t *mbd, uint32_t page_size) {
/* We only actually use the mmap_t here, and so we're going
* to go ahead and define it within the function
*/
struct mmap_t {
uint32_t size;
uint64_t base;
uint64_t len;
uint32_t type;
} __attribute((packed));
/* Make sure that page size is a power of 2 */
if ((page_size == 0) || ((page_size & (page_size - 1)) != 0)) {
printf("Page size must be a power of 2!\n");
return 0;
}
unsigned char check_flag = (1 << 0) & mbd->flags;
// Check flags to make sure memory mapping is sane
if (!check_flag) {
printf("Flag is not set at bit 0!\n");
return 0;
}
check_flag = (1 << 6) & mbd->flags;
if (!check_flag) {
printf("Flag is not set at bit 6!\n");
return 0;
}
// Get the address and length to set up
struct mmap_t *mmap_addr = (struct mmap_t *) mbd->mmap_addr;
uint32_t arr_size = (uint64_t) mbd->mmap_length / sizeof(struct mmap_t);
/* Make a copy. Since it isn't very big for an x86 system, we'll have
* it the stack. That way it'll get destroyed when we finish initializing
* the physical memory and we also don't need to worry about clobbering it
* when we do our bitmap
*/
struct mmap_t entries[arr_size];
memcpy(entries, mmap_addr, mbd->mmap_length);
/* The exponent of the page_size as a power of 2 */
uint32_t page_power = 0;
uint32_t page_size_copy = page_size;
for (page_power; page_size_copy; page_power++) {
/* This ensures that page_power does not end
* up one greater than it should be
*/
if ((page_size_copy >> 1) == 0) {
break;
}
page_size_copy >>= 1;
}
/* Get the base + length to determine total "usable"
* ram
*/
uint64_t last_address = entries[arr_size-1].base +
entries[arr_size-1].len;
// Get the number of pages
uint32_t blocks = last_address / (1 << page_power);
// Get the number of bytes for the bitmap, each bit is 1 page
bitmap_size = blocks / 8;
uint64_t end = 0;
uint64_t top_check = 0;
uint64_t bottom_check = 0;
uint64_t sb_addr = (uint64_t) &stack_bottom;
uint64_t st_addr = (uint64_t) &stack_top;
uint64_t ke_addr = (uint64_t) &kernel_end;
/* Find available memory after the end of the kernel
*
* After that, align to the next nearest page boundary (unless the
* base is already aligned to a page boundary), and then use that
* to store the bitmap.
*/
for (uint32_t i = 0; i < arr_size; ++i) {
end = entries[i].base + entries[i].len;
/* In order to check the space, we are going to check
* to see if end-bitmap_size is aligned on a page boundary.
* If it is not, then we are going to round down
* to the nearest page boundary. If it is, we are going
* to use it as is. That way, we can ensure that we
* are actually using pages to allocate memory for the
* bitmap
*/
top_check = (((end-bitmap_size) % page_size) != 0) ?
((end-bitmap_size) & ~(page_size-1)) :
(end-bitmap_size);
/* Additionally, we are going to check to see if the aligned base
* can fit into memory. This time we are rounding up
*/
bottom_check = ((entries[i].base % page_size) != 0) ?
((entries[i].base & ~(page_size-1)) + page_size) :
(entries[i].base);
bottom_check += bitmap_size;
if (entries[i].type != MULTIBOOT_MEMORY_AVAILABLE) {
continue;
}
/* We found some space between the stack and the kernel */
if (st_addr < (1 << 20) &&
top_check > st_addr &&
end < (1 << 20)) {
phys_mem_bitmap = top_check;
break;
}
/* We found memory below the stack */
else if (sb_addr < (1 << 20) &&
bottom_check < sb_addr) {
phys_mem_bitmap = bottom_check - bitmap_size;
break;
}
/* The stack is above the kernel, but we did manage to find
* some memory that is below the kernel and the stack
*
* In this case, I don't think that it actually matters if
* we use the top range or bottom range for the bitmap
*/
else if (sb_addr > (1 << 20) &&
bottom_check < sb_addr &&
bottom_check < (1 << 20)) {
phys_mem_bitmap = bottom_check - bitmap_size;
break;
}
/* The bottom of the stack is at a higher address than the
* kernel, and we found some memory between the kernel and the
* stack bottom
*/
if (sb_addr > (1 << 20) &&
bottom_check - bitmap_size > ke_addr &&
bottom_check < sb_addr) {
phys_mem_bitmap = bottom_check - bitmap_size;
break;
}
/* The bottom of the stack is at a higher address than the kernel
* and we found some memory after the top of the stack.
*/
else if (sb_addr > (1 << 20) &&
st_addr < top_check) {
phys_mem_bitmap = top_check;
break;
}
/* We went through all of the memory, but didn't find
* anything we liked. Time to give the kernel a panic.
*/
if (i == (arr_size - 1)) {
printf("Failed to find memory for the bitmap!\n");
return 0;
}
}
// Set all bits to 1
memset((uint8_t *) phys_mem_bitmap, 0xFF, bitmap_size);
uint32_t ndx = 0;
uint64_t base = 0;
uint64_t limit = 0;
uint64_t address = 0;
/* TODO: Find a faster way to do this */
for (uint32_t i = 0; i < blocks; ++i) {
/* Get the address and determine if it
* is inside of a mmap_t structure
*/
address = i*(1 << page_power);
base = entries[ndx].base;
limit = entries[ndx].len;
/* This means that there is a gap between
* two different mmap_t entries.
*/
if (address < base) {
// The bit has already been set
printf_serial("Bit has already been set... continuing\n");
continue;
}
/* This means that the address is within the physical memory bitmap */
if (address >= phys_mem_bitmap &&
address <= phys_mem_bitmap + bitmap_size) {
printf_serial("The bit is within the bitmap... continuing\n");
continue;
}
// Clear the bit if the memory is available
if (entries[ndx].type == MULTIBOOT_MEMORY_AVAILABLE) {
((uint32_t *) phys_mem_bitmap)[i/8] &= ~(1 << (i%8));
}
// Ignore if the memory is not specifically available
/* Determine if the next address is within the
* mmap struct
*/
if ((address+(1 << page_power)) > (base+limit)) {
++ndx;
}
}
return 1;
}
Basically, it tries to find available memory for the bitmap, avoiding the kernel and stack regions, and then sets all of the bits in the range to 0xFF. After that, it goes through the memory map and then figures out which bits are in the available memory regions, not part of the bitmap, and then sets those bits to zero. Ideally, it should also be checking for stack and kernel regions, though. The reason I had the phys_mem_bitmap as a uint64_t is because I figured that eventually I'll want to port this kernel to the x86_64. I figured that if it is running on an x86, the addresses will obviously never exceed the 4GB range, and so I'm good to truncate half of it when I treat it as a pointer. Ideally, I'll have those settled in the arch directory (i386, 86-64), though.
It literally took like two or three minutes just for it to go through the whole bitmap.
Now that I've made something so unspeakable, it's time to redo it. I figured do it once poorly just to get a feel for it, and then do it better a second time. I'm not sure exactly what data structure I'm going to use next time around. I could save space by having three or four structs that each contain a bitmap instead. Each one would then cover a region of usable memory (since there are large chunks that just plain aren't usable). I considered doing the buddy system, but based on everything I've read about it, it's extremely circular between using it as an allocator and needing an allocator for it.