mixing 4KB and 4MB pages
Posted: Thu Jul 03, 2008 4:28 pm
by blound
Does anyone here mix both 4 MB and 4 KB pages inside their OS? I have been using 4 MB pages out of convenience, but I know their limitations and wastefulness (allocating 4 MB at a time for every need of a physical page).
I do not have multitasking yet, so the 4 MB pages I have allocated now are:
1) identity map of 0-4MB
2) kernel code/data/bss
3) kernel stack (this uses 4 KB pages, but has a 4 MB space reserved)
4) kernel heap
So as you can see, I am using 16 MB of RAM for no real reason, since these pages cannot be subdivided. I am wondering what the best way to mix 4 MB and 4 KB pages is: should your physical memory manager take allocations of both 4 KB and 4 MB? One method I was thinking of would be to have the physical memory manager know only of 4 MB pages and then subdivide them into 4 KB blocks. I wouldn't use a 2D array for this, but it is easy to visualize as phys_pages[4MB_PAGE][4K_index]: if a 4 MB page is requested, the manager can give out a whole first-level entry, and otherwise it can hand out 4 KB ranges from a partially used 4 MB page. The other thing I thought of was to have the physical memory manager know only of 4 KB pages; if a 4 MB page was requested it could still give one out, but then you would have to compensate somehow to keep enough contiguous 4 MB regions around.
I am mostly wondering what the proper method would be to code a physical memory manager that can effectively handle both 4 MB and 4 KB pages...
Re: mixing 4KB and 4MB pages
Posted: Thu Jul 03, 2008 4:41 pm
by iammisc
I think the best solution here would be a zone allocator. Basically, you start out with a list of all the 4 MB blocks and an empty list of 4 KB blocks, and you use these lists to satisfy 4 MB and 4 KB requests. When a request for a 4 KB page comes in for the first time, the 4 KB list will be empty, so you take a 4 MB block, add its 1023 unused pages to the 4 KB list, and return the one page that was not put into the list. The next time a request comes in for a 4 KB page, you can simply return one of the pages already in the list. When the list becomes empty, you allocate another 4 MB block and the cycle starts again.
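For illustration, here is a minimal C sketch of such a zone allocator (names like free_4m, push and pop are invented for the example, and it assumes free frames are identity-mapped so their first word can hold a link to the next frame):

Code:
#include <stdint.h>
#include <stddef.h>

#define PAGE_4K 0x1000u
#define PAGE_4M 0x400000u
#define PAGES_PER_4M (PAGE_4M / PAGE_4K) /* 1024 */

/* Intrusive free lists: the first word of each free frame stores the
   physical address of the next free frame. */
static uintptr_t free_4m; /* head of the 4 MB block list */
static uintptr_t free_4k; /* head of the 4 KB page list  */

static void push(uintptr_t *head, uintptr_t frame) {
    *(uintptr_t *)frame = *head;
    *head = frame;
}

static uintptr_t pop(uintptr_t *head) {
    uintptr_t frame = *head;
    if (frame)
        *head = *(uintptr_t *)frame;
    return frame; /* 0 means the list is empty */
}

uintptr_t alloc_4m(void) {
    return pop(&free_4m);
}

uintptr_t alloc_4k(void) {
    uintptr_t page = pop(&free_4k);
    if (page)
        return page;
    /* The 4 KB list is empty: split a 4 MB block, donate 1023 of its
       pages to the 4 KB list, and return the remaining one. */
    uintptr_t block = pop(&free_4m);
    if (!block)
        return 0; /* out of memory */
    for (size_t i = 1; i < PAGES_PER_4M; i++)
        push(&free_4k, block + i * PAGE_4K);
    return block;
}

Note that nothing in this sketch ever turns 1024 free 4 KB pages back into a 4 MB block, which is exactly the problem discussed further down the thread.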
Re: mixing 4KB and 4MB pages
Posted: Thu Jul 03, 2008 9:18 pm
by geppy
blound wrote:
I am mostly wondering what the proper method would be to code a physical memory manager that can effectively handle both 4 MB and 4 KB pages...
to have the physical memory manager know only of 4 MB pages and then subdivide them into 4 KB blocks
That's what I use. A linked list of 2 MB pages (long mode) is saved in unused & unmapped RAM (check Brendan's posts). Then I have a 128-bit bitmask at the beginning of each 2 MB page describing its 16 KB blocks (that's the minimum memory you can get per request). I then use bsr/bsf to locate a free 16 KB block; in some cases my implementation can find more than 16 KB with a single bsr instruction. The reason for the bitmask is fewer memory accesses: a single uncached access (which may fetch an entire cache line) is likely to cost at least 100 cycles.
I have two pointers: one points to a completely empty 2 MB page and the other to a partially empty one.
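As a rough sketch of that bitmask search (using a compiler builtin rather than hand-written bsf, and invented names; in practice the bits covering the chunk that holds the mask itself would start out marked as used):

Code:
#include <stdint.h>

#define CHUNK_16K 0x4000u

/* One descriptor at the start of each 2 MB block: 128 bits,
   one bit per 16 KB chunk (128 * 16 KB = 2 MB), bit set = free. */
struct block2m {
    uint64_t mask[2];
};

/* Return the byte offset of a free 16 KB chunk inside the block and
   mark it used, or -1 if the block is full. __builtin_ctzll emits a
   bsf/tzcnt-style instruction. */
long take_free_16k(struct block2m *b) {
    for (int w = 0; w < 2; w++) {
        if (b->mask[w]) {
            int bit = __builtin_ctzll(b->mask[w]);
            b->mask[w] &= b->mask[w] - 1; /* clear lowest set bit */
            return ((long)(w * 64 + bit)) * CHUNK_16K;
        }
    }
    return -1;
}

Clearing the lowest set bit with mask & (mask - 1) keeps the whole search in registers, which is the point about avoiding extra uncached memory accesses.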
Re: mixing 4KB and 4MB pages
Posted: Thu Jul 03, 2008 10:19 pm
by Brendan
Hi,
When you're supporting both 4 MB and 4 KB pages, the ability to split a 4 MB page into 4 KB pages can be assumed, as it's relatively simple to implement.
The problem is that eventually all 4 MB pages end up split into 4 KB pages, and (due to fragmentation, etc) you end up with no 4 MB pages left and no way to get more free 4 MB pages. Basically, after a while it degenerates into "4 KB pages only". In this case I'd wonder if there's any point bothering with 4 MB pages in the first place.
So, to have a sensible system that uses both 4 MB and 4 KB pages you'd need a way to recombine 4 KB pages back into a 4 MB page; and you'd need to do it in a way where the overhead of detecting if pages can be recombined is less than the overhead saved by using 4 MB pages (otherwise it'd work but would be slower/worse than "4 KB pages only").
I can't think of any method to detect if 4 KB pages can be recombined into a 4 MB page that has low enough overhead, given that each method I think of prevents the use of (fast) free page stacks, and given that it'll also add complexity/overhead to other things (e.g. swap space management).
This mostly just leaves one good use for 4 MB pages: large static mappings that are made during boot (and are never freed/changed), which the paging code can mostly ignore.
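For reference, that kind of static boot-time mapping is just a page directory entry with the page-size bit set. A minimal sketch for 32-bit paging (assuming CR4.PSE is already enabled; the names are invented):

Code:
#include <stdint.h>

#define PDE_PRESENT 0x001u
#define PDE_WRITE   0x002u
#define PDE_PS      0x080u /* page-size bit: this entry maps 4 MB */

/* The boot page directory; must be page aligned. */
static uint32_t page_dir[1024] __attribute__((aligned(4096)));

/* Identity-map one 4 MB region; phys must be 4 MB aligned. */
void map_4m_identity(uint32_t phys) {
    page_dir[phys >> 22] = phys | PDE_PS | PDE_WRITE | PDE_PRESENT;
}

Since such an entry is written once at boot and never freed, none of the recombination problems above apply to it.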
Cheers,
Brendan
Re: mixing 4KB and 4MB pages
Posted: Fri Jul 04, 2008 3:41 am
by jal
Brendan wrote:This mostly just leaves one good use for 4 MB pages: large static mappings that are made during boot (and are never freed/changed), which the paging code can mostly ignore.
Just off the top of my head: if an OS needs to load large binaries often, or has processes that often request large amounts of memory, one could have a zone for 4 KB pages and one for 4 MB pages, only eating into the 4 MB zone once the 4 KB zone is exhausted. Recombining 4 KB pages into 4 MB pages needn't be that much of a problem, as ye average (desktop) OS is idling most of the time, so there's plenty o' time in the idle loop to take care of that, methinks.
JAL
Re: mixing 4KB and 4MB pages
Posted: Sat Jul 05, 2008 1:22 am
by Brendan
Hi,
jal wrote:Just off the top of my head: if an OS needs to load large binaries often, or has processes that often request large amounts of memory, one could have a zone for 4 KB pages and one for 4 MB pages, only eating into the 4 MB zone once the 4 KB zone is exhausted. Recombining 4 KB pages into 4 MB pages needn't be that much of a problem, as ye average (desktop) OS is idling most of the time, so there's plenty o' time in the idle loop to take care of that, methinks.
Ok, imagine that there's 1000 MB of RAM: 500 MB is being used by processes, 484 MB is being used by the file system cache, and the remaining 16 MB consists of free 4 KB pages. These free 4 KB pages are managed with 256 free page stacks (256 page/cache colours, with a free page stack for each colour). Your job is to end up with four free 4 MB pages without adding any extra overhead when the OS isn't idle. There are a few restrictions:
1) you can't do any indexing, etc when pages are allocated/freed to speed things up (as this increases overhead when the OS isn't idle)
2) you can't do anything with large data structures (trashing CPU caches and TLBs also increases overhead when the OS isn't idle)
3) it needs to be relatively fast so the CPU can cool down during idle time (don't increase the chance of thermal throttling when the OS isn't idle)
Also, I should point out that it's extremely unlikely that the 16 MB of free 4 KB pages will happen to contain all the pages needed for any 4 MB page. For example, you might find 1000 pages on the free page stacks and have to search the "in use" pages for the remaining 24 pages you need to construct a free 4 MB page.
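For concreteness, here is a minimal sketch of the kind of colour-indexed free page stacks described above (invented names; it assumes free frames are mapped so their first word can hold a link). Note that the stacks are plain LIFO lists with no ordering or adjacency information, which is precisely what makes recombination expensive:

Code:
#include <stdint.h>

#define COLOURS 256
#define PAGE_4K 0x1000u

/* The colour is taken from the physical address bits that select
   the cache set/bin. */
static inline unsigned colour_of(uintptr_t frame) {
    return (unsigned)(frame >> 12) & (COLOURS - 1);
}

static uintptr_t stack_head[COLOURS]; /* one intrusive stack per colour */

void free_page(uintptr_t frame) {
    unsigned c = colour_of(frame);
    *(uintptr_t *)frame = stack_head[c]; /* link to old top */
    stack_head[c] = frame;
}

uintptr_t alloc_page(unsigned preferred_colour) {
    /* Try the preferred colour, then fall back to the others. */
    for (unsigned i = 0; i < COLOURS; i++) {
        unsigned c = (preferred_colour + i) & (COLOURS - 1);
        uintptr_t frame = stack_head[c];
        if (frame) {
            stack_head[c] = *(uintptr_t *)frame;
            return frame;
        }
    }
    return 0; /* out of memory */
}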
So, what will your code do in idle time to combine 4 KB pages into 4 MB pages?
After you've worked this out, do you still think it needn't be that much of a problem?
Cheers,
Brendan
Re: mixing 4KB and 4MB pages
Posted: Sun Jul 06, 2008 12:50 pm
by FlashBurn
As I'm also a fan of mixing 4 MB pages and 4 KB ones, I looked for a solution and found one which is very fast for allocation and not so fast for deallocation. The real disadvantage is the memory overhead, which is 16 bytes per 4 KB page (4 MB -> 1024 4 KB pages -> 16 KB of memory for keeping track of used and unused pages).
It would be possible to lower the memory overhead with a bitmap, but then it wouldn't be as fast as it is now. Allocating: remove an object from a list, decrease its count, and put it at the end of another list. Deallocating: check whether there is a page adjacent to mine (as you would in a bitmap) and, in the worst case, take two objects from the list, merge them, and put the result back at the end of another list.
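One way to read this scheme (my interpretation, with invented names) is a small descriptor per contiguous run of free pages, where freeing a page scans for an adjacent run to extend:

Code:
#include <stdint.h>

#define PAGE_4K 0x1000u

/* One small descriptor per contiguous run of free 4 KB pages
   (roughly the 16-byte-per-page bookkeeping mentioned above). */
struct run {
    uintptr_t base;    /* first free page of the run */
    uint32_t count;    /* number of free pages in it */
    struct run *next;
};

static struct run *free_runs; /* singly linked list of free runs */

/* Fast path: take one page off the first run, O(1). */
uintptr_t alloc_page(void) {
    struct run *r = free_runs;
    if (!r)
        return 0;
    uintptr_t page = r->base;
    r->base += PAGE_4K;
    if (--r->count == 0)
        free_runs = r->next; /* a real allocator would recycle r here */
    return page;
}

/* Slow path: find a neighbouring run and attach the page to it.
   (The no-neighbour case, which needs a fresh descriptor, and the
   case where a freed page bridges two runs are omitted for brevity.) */
void free_page(uintptr_t page) {
    for (struct run *r = free_runs; r; r = r->next) {
        if (r->base + (uintptr_t)r->count * PAGE_4K == page) {
            r->count++; /* extends the run upward */
            return;
        }
        if (page + PAGE_4K == r->base) {
            r->base = page; /* extends the run downward */
            r->count++;
            return;
        }
    }
    /* no adjacent run found: allocate and link a new descriptor */
}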
Re: mixing 4KB and 4MB pages
Posted: Sat Jul 12, 2008 2:30 pm
by Combuster
A tree approach, where each node keeps a counter of how much is available at the levels below it, can recombine 4 MB pages very quickly: as soon as the node corresponding to a 4 MB region sees 4 MB worth of free pages below it, the region can be handed out whole again.
That makes allocation and deallocation O(log n) operations rather than O(1), but it makes detecting an opportunity for recombination very fast, since you can check the condition during the update of the tree.
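A minimal sketch of such a counter tree (an implicit heap-style array; the sizes and names are invented):

Code:
#include <stdint.h>

#define FRAMES (1u << 18)      /* e.g. 1 GB of 4 KB frames        */
#define LEAF(i) (FRAMES + (i)) /* implicit-heap index of frame i  */

/* free_below[n] = number of free 4 KB frames under node n.
   Node n's parent is n/2; the leaves start at index FRAMES. */
static uint32_t free_below[2 * FRAMES];

/* Mark frame i freed (delta = +1) or allocated (delta = -1):
   an O(log n) walk from the leaf up to the root. */
void update_frame(uint32_t i, int delta) {
    for (uint32_t n = LEAF(i); n >= 1; n /= 2)
        free_below[n] += (uint32_t)delta;
}

/* A 4 MB region is 1024 frames, so its covering node sits 10 levels
   above the leaves; it is recombinable exactly when its counter says
   all 1024 frames below it are free. */
int can_recombine_4m(uint32_t first_frame) {
    return free_below[LEAF(first_frame) >> 10] == 1024;
}

Since free_below is updated on every allocation and deallocation anyway, spotting a recombinable region costs nothing beyond the O(log n) walk already being done.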
Of course, you could decide never to degrade a certain percentage of memory, which essentially comes down to keeping a separate 4 KB set and a 4 MB set (and once the 4 KB set runs out, you start using whole 4 MB pages, or some cleverer algorithm that reduces memory waste).
The problem is, basically, that everybody here is right with his/her arguments. It's just a question of which arguments matter more to you.