Re: Physical page allocation
Posted: Tue Oct 16, 2012 11:49 am
So I've implemented the basics, and it seems to work well. The nice thing is that this new scheme is pretty easy to debug in a standard kernel thread, because the managed physical memory is never touched. The next step is to replace the old allocation functions with the new ones.
I've also figured out a few optimizations I'll probably implement.

First, the search algorithm for a good bitmap should break as soon as it has found a bitmap with over 50% free pages. There is no sense in searching for a bitmap better than that.

Second, I think there is a very effective optimization that could be done. By providing a per-core cache (implemented as a stack) that contains a fixed number of entries, a core can quickly reclaim recently freed pages. That would improve performance in scenarios with many allocate-free pairs. Each cache entry could contain:

Code:
struct PhysCacheEntry
{
    short int BitmapNr;
    short int BitNr;
    struct PhysCacheEntry *Next;
};

That's 8 bytes per entry (with a 32-bit Next pointer), so a single page could hold 512 entries, covering 2MB of physical memory, which might be enough. BitmapNr and BitNr are kept so the bit in the bitmap can be updated quickly (this would still be done in order to detect double-frees), and so the physical address can be calculated. As the cache is per core, it won't need any spinlocks, and the links can be updated under cli/sti.

A possible further optimization would be to only link the pages and not free them in the bitmap, but this has some disadvantages. One is that double-frees could not be detected, and another is that the entries could only be allocated from the core they were freed on. So I think they should always be freed in the bitmap as well. A complication of freeing them there is that another core could then allocate them, but that is detected when trying to clear the bit in the bitmap with btr: if CY is not set, the entry is discarded and another one is tried instead.

An additional advantage is that cache misses are reduced when the same physical page is reused.
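For what it's worth, here is a minimal sketch of the early-break search. It assumes each bitmap keeps a running count of its free pages; the PageBitmap struct, the PAGES_PER_BITMAP constant and the find_bitmap name are mine, not from the actual allocator:

```c
#include <stddef.h>

#define PAGES_PER_BITMAP 4096   /* hypothetical: pages tracked per bitmap */

struct PageBitmap {
    int FreePages;              /* running count of free pages in this bitmap */
};

/* Pick a bitmap to allocate from: prefer the one with the most free pages,
 * but stop as soon as one is over 50% free, since anything "better" than
 * that buys nothing. Returns -1 if every bitmap is full. */
int find_bitmap(const struct PageBitmap *maps, int count)
{
    int best = -1;
    int best_free = 0;

    for (int i = 0; i < count; i++) {
        if (maps[i].FreePages > best_free) {
            best = i;
            best_free = maps[i].FreePages;
            if (best_free > PAGES_PER_BITMAP / 2)
                break;          /* over 50% free: good enough, stop searching */
        }
    }
    return best;
}
```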
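The per-core stack could be sketched roughly like this. The PhysCache wrapper, the function names and the no-op interrupt stubs are my own, standing in for the real per-core storage and for cli/sti in the kernel:

```c
#include <stddef.h>

#define CACHE_MAX 512   /* one 4K page of 8-byte entries, as above */

struct PhysCacheEntry {
    short int BitmapNr;
    short int BitNr;
    struct PhysCacheEntry *Next;
};

/* Hypothetical per-core cache head; in the real kernel this lives in
 * per-core storage, so no spinlock is needed. */
struct PhysCache {
    struct PhysCacheEntry *Head;
    int Count;
};

/* Stand-ins for cli/sti so the sketch runs in user space. */
static void disable_ints(void) { /* cli */ }
static void enable_ints(void)  { /* sti */ }

/* Push a freed page onto the core-local stack, if there is room.
 * Returns 0 when the cache is full and the caller should just free
 * the page in the bitmap. */
static int cache_push(struct PhysCache *c, struct PhysCacheEntry *e)
{
    if (c->Count >= CACHE_MAX)
        return 0;
    disable_ints();
    e->Next = c->Head;
    c->Head = e;
    c->Count++;
    enable_ints();
    return 1;
}

/* Pop the most recently freed page, or NULL if the cache is empty. */
static struct PhysCacheEntry *cache_pop(struct PhysCache *c)
{
    disable_ints();
    struct PhysCacheEntry *e = c->Head;
    if (e) {
        c->Head = e->Next;
        c->Count--;
    }
    enable_ints();
    return e;
}
```

LIFO order is what gives the cache-miss advantage mentioned above: the most recently freed page, which is most likely still warm in the data cache, is handed out first.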
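The btr-based discard on the allocation path could look roughly like this in a user-space sketch. Here __atomic_fetch_and stands in for a lock btr instruction (the returned old bit plays the role of CY), and claim_first_valid is a hypothetical name for the loop over cached candidates:

```c
#include <stdint.h>

/* Atomically test-and-clear bit 'nr' in a bitmap; returns the old bit
 * value, mirroring what lock btr leaves in CF. */
static int test_and_clear_bit(uint32_t *map, int nr)
{
    uint32_t mask = 1u << (nr & 31);
    uint32_t old = __atomic_fetch_and(&map[nr >> 5], ~mask, __ATOMIC_SEQ_CST);
    return (old & mask) != 0;
}

/* Try cached candidate bit numbers in order. A candidate is only valid
 * if its bitmap bit is still set (page still free). If another core
 * already allocated it, btr finds the bit clear and the stale entry is
 * simply discarded and the next one is tried. Returns the claimed bit
 * number, or -1 if all candidates were stale. */
static int claim_first_valid(uint32_t *map, const int *cands, int n)
{
    for (int i = 0; i < n; i++)
        if (test_and_clear_bit(map, cands[i]))
            return cands[i];   /* CY was set: page was still ours to take */
    return -1;                 /* all stale: fall back to a bitmap scan */
}
```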