Some people will claim that kmalloc is all you need, and that it solves every memory allocation issue. However, I'm moving away from kmalloc completely, since bugs in it cause random kernel crashes in unrelated code. I have an API for doing safe kmalloc that returns a selector mapped to the allocated address. This works for objects that are created in rather small numbers; otherwise, there is a risk of running out of GDT selectors. Also, the more selectors that are valid, the less selective the protection becomes, since a stray selector value is more likely to hit a valid descriptor.
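To make the selector idea concrete, here is a minimal sketch of how a data descriptor could be built for one allocation. It is not my actual API. The function names are hypothetical, and it assumes a 32-bit, ring 0, byte-granular data segment, so objects are limited to 1 MB. The descriptor layout and the 8192-entry GDT limit are standard x86.

```c
#include <assert.h>
#include <stdint.h>

#define GDT_MAX_ENTRIES 8192u   /* 13-bit selector index: why selectors can run out */

/* Build an x86 data-segment descriptor covering [base, base + size).
 * Assumptions: present, ring 0, writable data (access byte 0x92),
 * 32-bit segment (D/B = 1), byte granularity (G = 0). */
uint64_t make_data_descriptor(uint32_t base, uint32_t size)
{
    assert(size >= 1u && size <= (1u << 20));
    uint32_t limit = size - 1u;
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFFu);              /* limit 15:0   */
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;       /* base 23:0    */
    d |= (uint64_t)0x92u << 40;                    /* access byte  */
    d |= (uint64_t)((limit >> 16) & 0xFu) << 48;   /* limit 19:16  */
    d |= (uint64_t)0x4u << 52;                     /* flags: D/B   */
    d |= (uint64_t)(base >> 24) << 56;             /* base 31:24   */
    return d;
}

uint32_t descriptor_base(uint64_t d)
{
    return (uint32_t)((d >> 16) & 0xFFFFFFu) | ((uint32_t)(d >> 56) << 24);
}

uint32_t descriptor_limit(uint64_t d)
{
    return (uint32_t)(d & 0xFFFFu) | ((uint32_t)((d >> 48) & 0xFu) << 16);
}

/* Selector for a GDT entry: index in bits 15:3, TI = 0 (GDT), RPL = 0. */
uint16_t make_selector(uint16_t index)
{
    assert(index < GDT_MAX_ENTRIES);
    return (uint16_t)(index << 3);
}

/* Models the limit check the CPU performs on every access through the
 * selector: anything outside the object faults instead of corrupting
 * a neighbouring allocation. */
int access_in_bounds(uint64_t d, uint32_t offset, uint32_t len)
{
    uint32_t limit = descriptor_limit(d);
    return len >= 1u && offset <= limit && len - 1u <= limit - offset;
}
```

The point is the last function: with one selector per object, an overrun is caught by the hardware at the faulting instruction rather than showing up later as a crash somewhere else.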
For USB devices, I created a new block memory allocator based on a 4k page mapped to a selector. Memory inside the page is allocated through a bitmap, and lockless operation is supported. The header also contains both the physical and the linear address, which makes it easy to convert objects in schedules to linear addresses and back. When 4k is too small, the code will allocate and link another 4k selector. The drawback with this allocator is that it still uses flat linear addresses, so memory corruption can still occur. On the positive side, it can detect double-frees, the correct memory type is enforced, and it has garbage collection (or rather, the whole object can be freed in one simple step without any possible memory leakage).
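A rough sketch of the bitmap part of such a block allocator, using C11 atomics. This is not my real code: the slot size, header size, struct layout and function names are all assumptions, and linking extra pages and type enforcement are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define BLOCK_SIZE   4096u
#define HEADER_SIZE  64u    /* assumed space reserved for the header */
#define SLOT_SIZE    32u    /* assumed fixed object size */
#define SLOT_COUNT   ((BLOCK_SIZE - HEADER_SIZE) / SLOT_SIZE)
#define BITMAP_WORDS ((SLOT_COUNT + 31u) / 32u)

/* Header at the start of the 4k page. Keeping both bases here is what
 * makes physical <-> linear conversion a simple add and subtract. */
struct usb_block {
    uint64_t linear_base;
    uint64_t physical_base;
    _Atomic uint32_t bitmap[BITMAP_WORDS];   /* bit set = slot in use */
};

static_assert(sizeof(struct usb_block) <= HEADER_SIZE, "header too large");

void block_init(struct usb_block *b, uint64_t linear, uint64_t physical)
{
    b->linear_base = linear;
    b->physical_base = physical;
    for (unsigned i = 0; i < BITMAP_WORDS; i++)
        atomic_init(&b->bitmap[i], 0u);
}

/* Lock-free allocation: returns a slot index, or -1 if the page is full. */
int block_alloc(struct usb_block *b)
{
    for (unsigned slot = 0; slot < SLOT_COUNT; slot++) {
        _Atomic uint32_t *word = &b->bitmap[slot / 32u];
        uint32_t mask = 1u << (slot % 32u);
        if (atomic_load(word) & mask)
            continue;                         /* cheap check: already taken */
        if ((atomic_fetch_or(word, mask) & mask) == 0)
            return (int)slot;                 /* we set the bit, so we own it */
    }
    return -1;
}

/* Returns 0 on success, -1 on a bad index or a double-free. */
int block_free(struct usb_block *b, unsigned slot)
{
    if (slot >= SLOT_COUNT)
        return -1;
    uint32_t mask = 1u << (slot % 32u);
    uint32_t old = atomic_fetch_and(&b->bitmap[slot / 32u], ~mask);
    return (old & mask) ? 0 : -1;             /* bit already clear: double-free */
}

uint64_t slot_to_physical(const struct usb_block *b, unsigned slot)
{
    return b->physical_base + HEADER_SIZE + (uint64_t)slot * SLOT_SIZE;
}

uint64_t physical_to_linear(const struct usb_block *b, uint64_t physical)
{
    return physical - b->physical_base + b->linear_base;
}
```

Because every object lives inside the page, freeing the page frees everything in it at once, which is where the "garbage collection" comes from. Note that nothing here stops a write past the end of a slot, which is the flat-address drawback mentioned above.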
I tried to use the block memory allocator for file descriptors, which will have lists of various lengths of mapped sectors and physical addresses, as well as for wait objects and process contexts. However, I want a design where the allocator returns an offset within the file descriptor, not a flat linear address. This can be done in a similar way to how the block memory allocator is extended, by allocating another 4k page, except that the new page now has to be added at the end of the file descriptor. I cannot rely on being able to extend the underlying linear address range by another page, so I need to allocate a new linear address range, copy the pages, and then change the base for the file descriptor. This is a bit of a problem in a multicore environment, where other cores might still be using the old selector (and thus the old linear address). One solution is to wait a millisecond or so to make sure all threads have been scheduled (and thus have reloaded the selector) before freeing the old linear address. Another might be to use the TLB shoot-down mechanism to force a selector reload.
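The grow-by-copy idea can be sketched in user-space C like this. Everything here is an assumption made for illustration: `calloc` stands in for allocating a linear address range, the atomic `base` pointer stands in for the selector base, allocation is a simple bump pointer with a single writer, and the grace period (the wait or the shoot-down) is left to the caller, who calls `fd_reclaim` afterwards.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE   4096u
#define FD_MAX_SIZE (1u << 20)      /* hypothetical upper bound */
#define FD_NO_SPACE UINT32_MAX

struct file_desc {
    _Atomic(unsigned char *) base;  /* stands in for the selector base */
    unsigned char *retired;         /* old mapping, other cores may still use it */
    uint32_t size;                  /* bytes currently mapped */
    uint32_t used;                  /* next free offset */
};

int fd_init(struct file_desc *fd)
{
    unsigned char *p = calloc(1, PAGE_SIZE);
    if (!p)
        return -1;
    atomic_init(&fd->base, p);
    fd->retired = NULL;
    fd->size = PAGE_SIZE;
    fd->used = 0;
    return 0;
}

/* Allocate a larger range, copy the pages, then switch the base. */
static int fd_grow(struct file_desc *fd, uint32_t needed)
{
    if (fd->retired)
        return -1;                  /* previous mapping not reclaimed yet */
    uint32_t new_size = (needed + PAGE_SIZE - 1u) / PAGE_SIZE * PAGE_SIZE;
    unsigned char *old = atomic_load(&fd->base);
    unsigned char *fresh = calloc(1, new_size);
    if (!fresh)
        return -1;
    memcpy(fresh, old, fd->size);
    atomic_store(&fd->base, fresh);
    fd->retired = old;              /* must outlive readers of the old base */
    fd->size = new_size;
    return 0;
}

/* Returns an offset within the descriptor, never a flat address. */
uint32_t fd_alloc(struct file_desc *fd, uint32_t len)
{
    if (len > FD_MAX_SIZE - fd->used)
        return FD_NO_SPACE;
    if (fd->used + len > fd->size && fd_grow(fd, fd->used + len) != 0)
        return FD_NO_SPACE;
    uint32_t off = fd->used;
    fd->used += len;
    return off;
}

void *fd_ptr(struct file_desc *fd, uint32_t off)
{
    assert(off < fd->size);
    return atomic_load(&fd->base) + off;
}

/* Call only after the grace period, when no core can hold the old base. */
void fd_reclaim(struct file_desc *fd)
{
    free(fd->retired);
    fd->retired = NULL;
}

void fd_destroy(struct file_desc *fd)
{
    fd_reclaim(fd);
    free(atomic_load(&fd->base));
}
```

Offsets handed out before a grow stay valid afterwards, which is the whole reason for returning offsets. The sketch does not solve the real problem: a core that writes through the old base after the copy loses that write, so the copy and the base switch still need some form of synchronization with writers.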