Yeah, I realized it, many people have mentioned it. I guess I will have to deal with it. Even 1000 KB of RAM is nothing compared to 4 GB.
I finally fully understand what you meant!
LtG wrote:
@Octacone..
I'm in a bit of a hurry, so don't have time to quote properly, also let me know if I missed anything..
Wrt the "why 5 and not 3", not sure what you meant by that..? If you are referring to the "5" in the "100x5B", then that 5 is picked up from your earlier post (32-27), so I was just continuing with your example and pointing out that wasting 5B per allocation is irrelevant if there are only ever going to be 100 such allocations (i.e. 500B). It's also irrelevant if there are 10k such allocations on modern x86 systems (that's only about 50KB), though embedded may of course be different. So I was trying to point out that while you want to be frugal with most of your memory, in some cases (especially wrt the kernel) it really doesn't matter.
Whether you want to optimize for cache lines is of course up to you, but only supporting a couple of allocation sizes for the kernel malloc (kmalloc) makes things simpler on that side. If your userland malloc crashes it's easier to diagnose, if your kmalloc crashes it's more difficult, so keeping kmalloc simple has benefits, keeping malloc simple isn't really feasible..
Didn't really get your 7247 => 200+B, where does the 200 come from? But regardless, if there's only a handful of such structs allocated then 200B wasted on each is again meaningless.
Wrt the 33 -> 64 rounding, I was just giving some examples; the exact rounding you want to do is up to you. There are two things to consider: one is having a relatively low number of allocation sizes to keep the allocator simple, and the other is cache line optimizations. You can ignore the cache line optimizations for now if you like, but keeping things simple does help. So 33 could also be rounded to 36 or 48, etc.. Once you know all the structs you need you can decide how you want to round them. For starters you could just give each one 4KiB to get the ball rolling and fix those once you have all of your structs ready, since you probably don't know all the sizes yet..
The basic idea is that when you call kmalloc:
void* ptr = kmalloc(32); // ok
void* ptr = kmalloc(33); // kpanic, invalid allocation size
void* ptr = kmalloc(64); // ok
etc..
So then inside kmalloc you just check the requested size, and based on it you have a different page for each size from which you allocate; that way you don't have to deal with odd sizes, etc. The idea here is to keep it as simple as possible. So if you only needed three different allocation sizes, say 32B, 64B and 256B, you might have:
void* free32BList;
void* free64BList;
void* free256BList;
And then just keep track of those, but if this complicates things for you, you can certainly do it any way you like =)
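To make the free-list idea a bit more concrete, here is a rough sketch of a kmalloc that only accepts 32B, 64B and 256B. Everything beyond the three list sizes is an assumption for illustration: `alloc_page()` and its small static pool stand in for the real PMM, `kfree()` takes the size as a second argument to keep things simple, and an invalid size returns NULL where a real kernel would kpanic.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE  4096u
#define POOL_PAGES 8u

/* Hypothetical stand-in for the PMM: hands out pages from a static pool. */
static _Alignas(4096) uint8_t page_pool[POOL_PAGES][PAGE_SIZE];
static size_t pages_used;

static void *alloc_page(void) {
    if (pages_used == POOL_PAGES) return NULL;
    return page_pool[pages_used++];
}

/* One free list per supported size. A free chunk stores the pointer to
   the next free chunk in its own first bytes. */
static const size_t class_size[3] = { 32, 64, 256 };
static void *free_list[3];

/* Map a requested size to its list index, or -1 if it is not supported. */
static int size_class(size_t size) {
    for (int i = 0; i < 3; i++)
        if (class_size[i] == size) return i;
    return -1;
}

/* Carve one fresh page into equal chunks and push them on the free list. */
static int refill(int cls) {
    uint8_t *page = alloc_page();
    if (!page) return 0;
    for (size_t off = 0; off + class_size[cls] <= PAGE_SIZE; off += class_size[cls]) {
        void *chunk = page + off;
        *(void **)chunk = free_list[cls];
        free_list[cls] = chunk;
    }
    return 1;
}

void *kmalloc(size_t size) {
    int cls = size_class(size);
    if (cls < 0) return NULL;  /* invalid size: a real kernel would kpanic */
    if (!free_list[cls] && !refill(cls)) return NULL;  /* out of pages */
    void *chunk = free_list[cls];
    free_list[cls] = *(void **)chunk;  /* pop the head of the list */
    return chunk;
}

/* Assumption: the caller passes the size back so no header is needed. */
void kfree(void *ptr, size_t size) {
    int cls = size_class(size);
    if (cls < 0 || !ptr) return;
    *(void **)ptr = free_list[cls];  /* push the chunk back on its list */
    free_list[cls] = ptr;
}
```

Because every chunk on a list has the same size, there is no splitting or merging to get wrong, which is the whole point of keeping kmalloc this simple.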
Note also that I would not use identity mapping at all, what benefit do you think you get from it?
Finally, the userland malloc is not something that comes with the OS at all, it's part of the language runtime, yes, you'll have to create it if you want to compile C apps for your OS, but it's not part of the OS and you should _NOT_ want to enforce any malloc on anyone.
Even if you tried to enforce your malloc on your userland, you couldn't. Consider the following code on _any_ OS:
Code: Select all
void* lotsOfMem; // allocate ram from OS, using "enforced" malloc
void* myMalloc(size_t count) {
    void* addr;
    // somehow parcel out the large 10GiB block for internal allocations
    return addr;
}
void main() {
    lotsOfMem = malloc(10GiB);
    uint32_t* manyInts = myMalloc(10 * 1000 * 1000 * 4);
}
The point here is that you can't enforce your malloc, nor should you even attempt to. AFAIK, in Windows apps that use malloc, the malloc internally uses VirtualAlloc (part of the Win32 API) or something similar.
So the idea is that the language runtime malloc uses the OS's VMM to allocate what it wants. And you can offload the burden of figuring out what they want to userland, which makes for a simpler and more stable kernel, leaves more room for userland optimizations, etc.
So when userland calls:
VMM_Alloc(/*start*/ 0x10000, /*stop*/ 0x20000);
All your kernel VMM has to do is check that the userland is not requesting allocation on top of kernel memory. You may also want to check it's not doing something stupid, like requesting allocation on top of something that's already allocated to it (or you may want to allow that as a quick way to change the allocation type (read-only vs read-write vs execute, etc)). My VMM_Alloc example above didn't include "type", you may want to do that though.
So all VMM_Alloc needs to do is basic sanity and security checking and then just do what it was asked to do, which makes its implementation quite simple.
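As a rough sketch of what that sanity checking could look like (not a complete VMM_Alloc): the `KERNEL_BASE` value assumes a higher-half kernel at 0xC0000000, and the page-alignment requirement, the status enum and the name `vmm_check_request` are all made up for illustration. `stop` is treated as exclusive.

```c
#include <stdint.h>

#define PAGE_SIZE   0x1000u
#define KERNEL_BASE 0xC0000000u  /* assumption: kernel lives above this address */

typedef enum {
    VMM_OK,
    VMM_BAD_RANGE,     /* empty, reversed or unaligned range */
    VMM_KERNEL_RANGE   /* range overlaps kernel memory */
} vmm_status_t;

/* Validate a userland request for the range [start, stop). */
vmm_status_t vmm_check_request(uint32_t start, uint32_t stop) {
    if (start >= stop)
        return VMM_BAD_RANGE;      /* also rejects a wrapped-around range */
    if ((start | stop) & (PAGE_SIZE - 1))
        return VMM_BAD_RANGE;      /* both ends must be page aligned */
    if (stop > KERNEL_BASE)
        return VMM_KERNEL_RANGE;   /* would land on top of kernel memory */
    return VMM_OK;
}
```

Only once this returns `VMM_OK` would the kernel go on to map pages for the range; checking for overlap with existing allocations (or allowing it to change the type) would come after that.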
Then as a starting malloc (for userland) you can do something simple (though inefficient) and just always round up the requested allocation to the nearest 4KiB and allocate that (yeah, I said it's inefficient). And a couple of days later, fix it when it becomes a problem.
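A minimal sketch of such a starter malloc, under a few assumptions: `vmm_alloc()` is a stub standing in for the VMM_Alloc call (it always succeeds here), the heap start address is arbitrary, and the function is called `simple_malloc` so it doesn't clash with a real malloc. There is no free; that is part of the "fix it later".

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u

/* Round size up to the next multiple of PAGE_SIZE (a power of two). */
size_t round_up_to_page(size_t size) {
    return (size + PAGE_SIZE - 1) & ~(size_t)(PAGE_SIZE - 1);
}

/* Assumed start of the userland heap; any free user address would do. */
static uintptr_t heap_cursor = 0x40000000u;

/* Hypothetical stand-in for the VMM_Alloc call into the kernel. */
static int vmm_alloc(uintptr_t start, uintptr_t stop) {
    (void)start;
    (void)stop;
    return 1;  /* pretend the kernel mapped the range */
}

/* Every allocation gets its own whole pages: wasteful, but trivial. */
void *simple_malloc(size_t size) {
    if (size == 0) return NULL;
    size_t bytes = round_up_to_page(size);
    if (bytes < size) return NULL;  /* rounding overflowed */
    uintptr_t start = heap_cursor;
    if (!vmm_alloc(start, start + bytes)) return NULL;
    heap_cursor += bytes;           /* next allocation starts after this one */
    return (void *)start;
}
```

So a 10-byte request costs a full 4096 bytes and a 5000-byte request costs 8192, which is exactly the inefficiency mentioned above.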
So all four pieces (PMM, VMM, kmalloc and malloc) become really simple, and the only reason to change them is to improve efficiency, but that's just optimization.
What confused me was that I thought you could only round to 32 and 64 bits (because of the CPU architecture). Looks like that is not the case, perfect. So I just need to figure out the most common sizes that my kernel will use and make chunks of those sizes available. Maybe I could just copy and paste an already existing PMM allocator and make it compatible with different even sizes. Or even: yo yo PMM, give me a page (0x1000/4096 bytes); now I have 4096 bytes available, so why not parcel that page and turn it into 62 chunks of 48 bytes + 14 chunks of 32 bytes + 10 chunks of 64 bytes, or something like that. You really managed to get my brain running. Thanks for that!!! Okay, I will not enforce it after all, but I am still going to provide a decent implementation that programmers will be able to use if they want.
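Just to sanity check that such a mixed layout actually fits in one page, here is a small sketch. The `chunk_group` struct and `group_offset()` are names made up for this example; the groups are assumed to be laid out back to back from the start of the page.

```c
#include <stddef.h>

#define PAGE_SIZE 4096u

/* Hypothetical description of one group of equally sized chunks in a page. */
struct chunk_group {
    size_t chunk_size;  /* bytes per chunk */
    size_t count;       /* number of chunks of that size */
};

/* Byte offset at which group `index` starts when the groups are laid out
   back to back. Passing the number of groups as `index` gives the total
   number of bytes the whole layout uses. */
size_t group_offset(const struct chunk_group *groups, size_t index) {
    size_t offset = 0;
    for (size_t i = 0; i < index; i++)
        offset += groups[i].chunk_size * groups[i].count;
    return offset;
}
```

For the layout `{48, 62}, {32, 14}, {64, 10}` the total is 2976 + 448 + 640 = 4064 bytes, so it fits with 32 bytes to spare. In that order the 64-byte chunks start at offset 3424, which is not a multiple of 64, so putting the 64-byte chunks first would be the way to go if cache line alignment ever matters.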
Okay, I've got all the answers I needed on the topic above. I will try to use them wisely and not ignore them.
Now: does anybody have a way of knowing if the page table (the virtual one that needs to be allocated) has already been allocated, so I don't need to call PMM_Allocate_Block() twice (or n times) for each mapping?
Right now I am using this awful hack (I am surprised it even works, sometimes):
Code: Select all
if (((uint32_t) page_directory->virtual_page_tables[virtual_address >> 22]) % 0x1000 == 0)
{
    // page table already allocated, no need to create it again
    // (caveat: an entry that is still 0 is also page-aligned and passes this test)
    // mapping...
}
else
{
    // page table not allocated, need to create one
    uint32_t page_table_address = PMM.Allocate_Block();
    page_table_t* new_page_table = (page_table_t*) page_table_address;
    String.Memory_Set(new_page_table, 0, 4096);
    // mapping...
}