Paging ...

dc0d32 · Post by **dc0d32** » Sat May 06, 2006 11:28 am

hi,

i wrote my paging code months ago.
for deallocating pages, i use a trick. while booting, i create an array of structures representing page objects, one entry per physical page. when deallocating a page, the application gives me the virtual address. i look up the physical address of the page in the application's page tables. then, using the physical page number as an index into my array, i move this page from that process to the free pages list.
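
A minimal sketch of the per-frame array described above, in C. All names (`struct page_frame`, `frame_alloc`, `frame_free`, `MAX_FRAMES`) are made up for illustration, the array size is fixed instead of being sized from detected memory, and the virtual-to-physical lookup in the page tables is assumed to have happened already.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12u          /* 4 KB pages */
#define MAX_FRAMES 1024u        /* hypothetical: number of frames the array covers */
#define NO_OWNER   (-1)

/* One entry per physical page frame (illustrative layout). */
struct page_frame {
    struct page_frame *next_free;   /* link in the free list */
    int owner;                      /* owning process id, or NO_OWNER when free */
};

static struct page_frame frames[MAX_FRAMES];
static struct page_frame *free_list;

/* Put every frame on the free list (a real kernel would skip holes). */
void frames_init(void)
{
    free_list = NULL;
    for (size_t i = MAX_FRAMES; i-- > 0; ) {
        frames[i].owner = NO_OWNER;
        frames[i].next_free = free_list;
        free_list = &frames[i];
    }
}

/* Pop a frame for process `pid`; returns its physical address,
 * or UINT32_MAX when no frame is left. */
uint32_t frame_alloc(int pid)
{
    struct page_frame *f = free_list;
    if (f == NULL)
        return UINT32_MAX;
    assert(f->owner == NO_OWNER);
    free_list = f->next_free;
    f->next_free = NULL;
    f->owner = pid;
    return (uint32_t)(f - frames) << PAGE_SHIFT;
}

/* Free by physical address: the frame number indexes the array, so this is O(1).
 * Returns 0 on success, -1 if the address is not tracked or not owned by `pid`. */
int frame_free(uint32_t phys, int pid)
{
    uint32_t idx = phys >> PAGE_SHIFT;
    if (idx >= MAX_FRAMES)
        return -1;                  /* not RAM we track (e.g. a device range) */
    struct page_frame *f = &frames[idx];
    if (f->owner != pid)
        return -1;                  /* double free, or someone else's page */
    f->owner = NO_OWNER;
    f->next_free = free_list;
    free_list = f;
    return 0;
}
```

The bounds and owner checks in `frame_free` are the part worth noting: an address outside the array (such as a memory mapped device) or a page that is already free is rejected instead of corrupting the list.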

when a professor of mine saw the logic & code, he said that i would eventually run into trouble.

what flaws do you see (apart from shared pages)?

BTW, is this similar to what is called 'inverted page tables'?

paulbarker · Post by **paulbarker** » Sat May 06, 2006 2:19 pm

I don't see any real problems in this simple overview of the design, it may be more to do with the implementation. I can guess off the top of my head that it may be to do with not properly handling the initial state of memory (accounting for all the holes, etc) or a lack of protection against corrupting the array (use spinlocks or mutexes to protect the array). These are just guesses though, your implementation may be perfect and there may be a flaw in the details of the design.

Inverted page tables are a method of handling the virtual to physical translation, see http://en.wikipedia.org/wiki/Page_table#Inverted_page_table. They are useful when the size of the address space is much bigger than the amount of physical RAM (say a 64 bit system with 1 GB of memory). They must either be implemented in hardware (as part of the MMU) or be used on systems which throw a TLB fault rather than looking up the translation in a table in RAM (e.g. MIPS systems). They generally have nothing to do with the management of physical memory.

I hope the above makes sense.

dc0d32 · Post by **dc0d32** » Sat May 06, 2006 2:45 pm

i am going to add locking once i am able to. he said that the design wasn't really appropriate for this task. my point is that it does all the work in nearly O(1) time (which matters more today than space complexity).
has anyone implemented something similar? they might be able to tell me what i should and should not do.

paulbarker · Post by **paulbarker** » Sat May 06, 2006 3:45 pm

On the space usage, remember that you are using a proportion of the total memory rather than a fixed amount, and so space is still a concern (using 1k per page with 4k pages would be a problem no matter how much memory you had). I do agree that saving time is more important than saving memory though, as long as the memory usage is reasonable.

I think the problem may be to do with allocating more than one page as a block. For DMA to perform well you need a physically contiguous block of memory, possibly up to 64k. Do you handle anything like this?

dc0d32 · Post by **dc0d32** » Sat May 06, 2006 10:15 pm

totally forgot the DMA :-\ (i have only 2 driver modules, out of which the HDD is in PIO) . thanks.

we can have something similar to extents, similar to sector allocation. any suggestions on how to implement linked lists and extents on the same array, preferably using space interchangeably?

edit :- and about space, the structure is 16 bytes, so it takes up 1 MB for handling 256 MB of RAM (65536 pages of 4 KB). quite decent in my opinion.

dc0d32 · Post by **dc0d32** » Sat May 06, 2006 10:27 pm

i see one more.

how can i account for physical addresses which are not memory but memory mapped devices, like framebuffers (e.g. with the LFB at 0xE0000000)? while booting, i create the array only for the amount of memory detected.

a serious mistake...

Brendan · Post by **Brendan** » Sun May 07, 2006 1:48 am

Hi,

prashant wrote:totally forgot the DMA :-\ (i have only 2 driver modules, out of which the HDD is in PIO) . thanks.

we can have something similar to extents, similar to sector allocation. any suggestions on how to implement linked lists and extents on the same array, preferably using space interchangeably?

For DMA, you'd first need to look at what uses it. Most modern hard drive controllers, USB controllers, ethernet cards, etc use "scatter gather" DMA where there's a lookup table containing lists of pages (i.e. they are designed for doing DMA to/from physical pages that are scattered).

It's usually older hardware that needs contiguous physical pages for DMA - floppy, sound cards, parallel port, etc. For half of this you're looking at the ISA DMA controller chips that are limited to the first 16 MB of physical memory.

There are also 32 bit PCI cards (and PCI to PCI bridges) that don't handle 64 bit DMA transfers. In this case you need to be able to allocate pages that are below 4 GB. If you use PAE then there's also page directory pointer tables (they need to be below 4 GB too). This is only a concern if your OS supports physical addresses above 4 GB (PAE and/or long mode).

My solution to this is to use a "special" physical memory manager for all RAM below 16 MB. I use a simple byte map, where one byte represents a physical page (so the 4096 pages below 16 MB add up to 4 KB).
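
A rough sketch of such a one-byte-per-page map with a first-fit search for contiguous pages, in C. The state values and function names are assumptions for illustration, not Brendan's actual code, and the sketch does not deal with the ISA DMA rule that a buffer must not cross a 64 KB boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE  4096u
#define LOW_PAGES  4096u   /* 16 MB / 4 KB; one byte each, so the map is 4 KB */

enum { PG_RESERVED = 0, PG_FREE = 1, PG_USED = 2 };  /* illustrative states */

static uint8_t low_map[LOW_PAGES];   /* zero-initialised: everything reserved */

/* Called by whoever detects memory: mark a run of pages as usable RAM. */
void low_mark_free(size_t first_page, size_t count)
{
    assert(first_page <= LOW_PAGES && count <= LOW_PAGES - first_page);
    for (size_t i = 0; i < count; i++)
        low_map[first_page + i] = PG_FREE;
}

/* First-fit search for `count` physically contiguous free pages.
 * Returns the physical address of the run, or UINT32_MAX on failure. */
uint32_t low_alloc_contiguous(size_t count)
{
    size_t run = 0;
    if (count == 0 || count > LOW_PAGES)
        return UINT32_MAX;
    for (size_t i = 0; i < LOW_PAGES; i++) {
        run = (low_map[i] == PG_FREE) ? run + 1 : 0;
        if (run == count) {
            size_t first = i + 1 - count;
            for (size_t j = first; j <= i; j++)
                low_map[j] = PG_USED;
            return (uint32_t)(first * PAGE_SIZE);
        }
    }
    return UINT32_MAX;
}

/* Return a run of pages to the free state. */
void low_free(uint32_t phys, size_t count)
{
    size_t first = phys / PAGE_SIZE;
    assert(first <= LOW_PAGES && count <= LOW_PAGES - first);
    for (size_t i = 0; i < count; i++)
        low_map[first + i] = PG_FREE;
}
```

A linear scan over 4096 bytes is slow compared to a free list, but the area is small and contiguous allocations from it should be rare.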

In addition to this I have separate physical memory managers for memory between 16 MB and 4 GB, and memory above 4 GB. This means I can allocate single pages with 32 bit physical addresses without using the memory below 16 MB. This can be important - at least one well known OS messed this up. They wasted several MB of this limited space by identity mapping a large kernel at 1 MB, and then ran out of space on 64 bit computers due to those "32 bit limited" PCI devices (which had to use the area below 16 MB because there was no other way to ask for pages below 4 GB).
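
One way to picture the three-way split is the sketch below, in C. The caller names the highest zone it can tolerate and the allocator falls back to lower zones only when that one is empty, so the scarce memory below 16 MB is used last. The zone names, the tiny fixed-size stacks and the fallback policy are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Three hypothetical zones, mirroring the split described above. */
enum zone_id { ZONE_LOW16M = 0, ZONE_BELOW4G = 1, ZONE_ABOVE4G = 2, ZONE_COUNT = 3 };

#define ZONE_STACK_MAX 64u   /* tiny fixed-size free-page stack, for illustration */
#define NO_PAGE UINT64_MAX

struct zone {
    uint64_t free_pages[ZONE_STACK_MAX];
    size_t   count;
};

static struct zone zones[ZONE_COUNT];

/* Work out which zone a physical address belongs to. */
static enum zone_id zone_of(uint64_t phys)
{
    if (phys < (16ull << 20)) return ZONE_LOW16M;
    if (phys < (4ull << 30))  return ZONE_BELOW4G;
    return ZONE_ABOVE4G;
}

/* Free a page into whichever zone its address belongs to. Returns 0 on success. */
int pmm_free_page(uint64_t phys)
{
    struct zone *z = &zones[zone_of(phys)];
    if (z->count == ZONE_STACK_MAX)
        return -1;
    z->free_pages[z->count++] = phys;
    return 0;
}

/* Allocate one page from `highest` or, if that zone is empty, from a lower one. */
uint64_t pmm_alloc_page(enum zone_id highest)
{
    assert(highest < ZONE_COUNT);
    for (int z = (int)highest; z >= 0; z--) {
        if (zones[z].count > 0)
            return zones[z].free_pages[--zones[z].count];
    }
    return NO_PAGE;
}
```

A driver for a "32 bit limited" PCI device would ask for `ZONE_BELOW4G` and only touch the area below 16 MB once the middle zone has run dry.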

prashant wrote:how can i account for physical addresses which are not memory but memory mapped devices, like framebuffers (eg. with LFB@0xE0000000). because while booting, i create the array only for the amount of memory detected.

I like to think of it as 2 completely separate things - a "physical memory manager" which manages pages of RAM, and a "physical address space manager" which manages ranges within the physical address space.

The physical memory manager is only ever used to allocate or free RAM - nothing more.

During boot the physical address space manager detects as much as it can about the physical address space and determines which areas are usable RAM (and which areas are used by other things). Then it tells the physical memory manager which pages are free to use. After boot the physical address space manager is used to make sure memory mapped devices don't overlap with each other (or RAM or anything else), to handle the MTRRs, to handle "hot plug" RAM, etc.

For the physical address space manager, you don't need it to be fast because nothing changes often. A sorted list of "address range entries" will do (which is convenient, considering that that's what "int 0x15, eax=0xE820" gives you)...

Cheers,

Brendan

dc0d32 · Post by **dc0d32** » Sun May 07, 2006 2:23 am

i had included those "gaps" in between those areas as resources. your idea is more like it, practical. I'll try and do that. Thanks Brendan.

edit :- i'll google about scatter gather DMA

dc0d32 · Post by **dc0d32** » Sun May 07, 2006 2:30 am

but Brendan sir,

the problem of allocating and deallocating from these holes persists. i had used buddy system for allocation inside these areas(in my "resource" approach). do you mean the same?
the only change i require is that any special addresses that an application requests and holds, are just mapped in the app's page tables, i've considered these resource areas as ranges. but this is not paging system, this is resource allocation system. do i need to merge these two systems? because the allocation system also takes care of ports allocation in similar way, ie. port address ranges. it can be done easily though.

Rob · Post by **Rob** » Sun May 07, 2006 6:56 am

Brendan wrote: A sorted list of "address range entries" will do (which is convenient, considering that that's what "int 0x15, eax=0xE820" gives you)...

Is that guaranteed? That E820 gives you a sorted list? I don't
seem to remember that being indicated anywhere (I could be
wrong). I was planning on not counting on that and doing a
sort just to be safe.

prashant: it sounds like the user level memory manager is still
dealing in pages. You say that on deallocate the application
gives the virtual *page* to free. Normally that would simply
be a pointer to a memory buffer of arbitrary size.

The memory manager then returns that memory range to the
application's pool or if it deems necessary free up some pages
with the physical memory manager.

dc0d32 · Post by **dc0d32** » Sun May 07, 2006 7:59 am

Rob wrote:The memory manager then returns that memory range to the
application's pool or if it deems necessary free up some pages
with the physical memory manager.

this is what we are referring to (not the application heap manager!).

Brendan · Post by **Brendan** » Sun May 07, 2006 11:15 am

Hi,

Rob wrote:
Brendan wrote: A sorted list of "address range entries" will do (which is convenient, considering that that's what "int 0x15, eax=0xE820" gives you)...
Is that guaranteed? That E820 gives you a sorted list? I don't
seem to remember that being indicated anywhere (I could be
wrong). I was planning on not counting on that and doing a
sort just to be safe.

No - I'm wrong! E820 returns an unsorted list that may contain unused entries and (in rare/dodgy cases) may return overlapping areas.

Most OS's filter the entries returned from E820 - removing unused entries (entries with length = 0), combining adjacent ranges of the same type, changing any overlapping areas to the most restrictive type and changing any unrecognised "type" values to a standard value (i.e. so you know it's safe to use "deviceID | 0x80000000" to denote entries used by memory mapped devices later on).
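
Part of that filtering can be sketched in C as below: drop zero-length entries, map unrecognised types to "reserved", sort by base address and merge adjacent ranges of the same type. Resolving overlapping areas to the most restrictive type is deliberately left out to keep the example short. The struct layout and function names are assumptions for illustration; the type values 1 to 4 are the ones the E820 interface defines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* E820 type values; anything outside 1..4 is treated as reserved here. */
enum { E820_RAM = 1, E820_RESERVED = 2, E820_ACPI_RECLAIM = 3, E820_ACPI_NVS = 4 };

struct range {
    uint64_t base;
    uint64_t length;
    uint32_t type;
};

/* qsort comparator: order entries by base address. */
static int cmp_base(const void *a, const void *b)
{
    const struct range *x = a, *y = b;
    return (x->base > y->base) - (x->base < y->base);
}

/* Clean up `n` raw entries in place and return the new entry count.
 * Overlapping entries are NOT resolved here; that step is left out. */
size_t ranges_normalise(struct range *r, size_t n)
{
    size_t kept = 0;
    assert(r != NULL || n == 0);

    /* Pass 1: drop empty entries and map unknown types to "reserved". */
    for (size_t i = 0; i < n; i++) {
        if (r[i].length == 0)
            continue;
        if (r[i].type < E820_RAM || r[i].type > E820_ACPI_NVS)
            r[i].type = E820_RESERVED;
        r[kept++] = r[i];
    }
    if (kept == 0)
        return 0;

    /* Pass 2: sort by base address. */
    qsort(r, kept, sizeof r[0], cmp_base);

    /* Pass 3: merge neighbours that touch and share a type. */
    size_t out = 0;
    for (size_t i = 1; i < kept; i++) {
        if (r[i].type == r[out].type &&
            r[i].base == r[out].base + r[out].length) {
            r[out].length += r[i].length;
        } else {
            r[++out] = r[i];
        }
    }
    return out + 1;
}
```

Treating every unknown type as reserved is the conservative choice: the worst case is some wasted RAM, not the kernel handing out memory that belongs to firmware.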

Also, if E820 isn't supported the OS can still construct a compatible list using older "get memory size" functions (and assuming there's ROMs in the ROM areas, etc).

A sorted list of entries is what you should end up with after you've done "physical address space detection" (more commonly called "memory detection")...

Cheers,

Brendan

Brendan · Post by **Brendan** » Sun May 07, 2006 12:41 pm

Hi,

prashant wrote:the problem of allocating and deallocating from these holes persists. i had used buddy system for allocation inside these areas(in my "resource" approach). do you mean the same?

I mean that you should have about 5 separate "managers" (one for the physical address space, three for physical memory and one for linear/virtual memory), and that these "managers" should all be entirely isolated from each other so that you can completely change how they work without affecting the interface to them and without affecting any other code.

The physical address space manager tells the physical memory managers which pages are free, and the physical memory managers never touch anything that they haven't been told is free (including those holes).

The linear/virtual memory manager allocates and frees single pages and contiguous groups of pages using the physical memory managers. The linear/virtual memory manager also maps memory mapped devices into address spaces, but the physical memory managers aren't involved in this at all (the linear memory manager might get this information directly from the physical address space manager, or there could be a "device manager" in between, or the device driver might do it itself). Then there's things like swap space, memory mapped files and shared memory that complicate things.

For a very rough example, each "manager" might have the following functions:

Physical address space manager (PASM):

- detect memory (called early during boot)
- register "prefetchable" device area (called by PCI device initialization code)
- register "non-prefetchable" device area (called by PCI device initialization code)
- register "device ROM" area (called by PCI device initialization code)
- remove "prefetchable" device area (called by PCI device initialization code)
- remove "non-prefetchable" device area (called by PCI device initialization code)
- remove "device ROM" area (called by PCI device initialization code)
- add RAM area (called by chipset driver for "hot plug" RAM support)
- change "ACPI reclaimable" areas into free RAM (called by ACPI initialization code)

Physical memory managers (3 * PMM):

- free page (called by PASM or VMM)
- free pages (optional, called by PASM only)
- allocate page (called by VMM)
- allocate contiguous pages (called by VMM, only supported by the PMM that handles memory below 16 MB)

Linear/virtual memory manager (VMM):

- create address space (called by scheduler)
- destroy address space (called by scheduler)
- allocate RAM (called by lots of things, including user level code)
- free page/s (called by lots of things, including user level code)
- map "prefetchable" device area (called by <don't know>)
- map "non-prefetchable" device area (called by <don't know>)
- map "device ROM" area (called by <don't know>)
- map "system ROM/BIOS" area (optional, called by user level code)
- unmap special area (called by <don't know>)
- memory map a file (called by user level code and/or possibly some parts of the kernel)
- unmap a memory mapped file (called by user level code or possibly some parts of the kernel)

Of course this list is only an example - you might have reserved areas in the linear address space, or some areas could have different behaviour (e.g. "allocation on demand" areas), or other functions for other things.

You'd also want to think about where access/permission checks are done, and what happens when someone decides to map or allocate something into an area of the linear address space that is already in use.

prashant wrote:the only change i require is that any special addresses that an application requests and holds, are just mapped in the app's page tables, i've considered these resource areas as ranges. but this is not paging system, this is resource allocation system. do i need to merge these two systems? because the allocation system also takes care of ports allocation in similar way, ie. port address ranges. it can be done easily though.

That sounds easy to me - for each page table entry there's at least 2 "available" bits that can be used to encode the type of present pages (and for not-present pages there's at least 31 available bits). Two bits give you up to 4 "page types", for example RAM, "special", and "memory mapped file" (with one value to spare).

When the linear/virtual memory manager needs to free a page it checks this "page type". If it's normal RAM it calls the physical memory manager's "free page" function. If it's a "special" page it just unmaps it. If it's part of a memory mapped file then it would check if the page was modified (the dirty bit) and flush the data to disk if it was, and then call the physical memory manager's "free page" function.
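
The page type check described above might look roughly like this in C for 32 bit x86 page table entries. Bit 0 (present), bit 6 (dirty) and the OS-available bits 9 to 11 are the architectural layout; using bits 9 and 10 for the type, and every name in the code, is an assumption for illustration. Instead of calling a real physical memory manager, the function returns the action it would take.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PTE_PRESENT    (1u << 0)
#define PTE_DIRTY      (1u << 6)
#define PTE_TYPE_SHIFT 9u                        /* bits 9-10: two of the OS-available bits */
#define PTE_TYPE_MASK  (3u << PTE_TYPE_SHIFT)
#define PTE_FRAME_MASK 0xFFFFF000u

enum page_type { PT_RAM = 0, PT_SPECIAL = 1, PT_FILE = 2 };   /* a fourth value is spare */

/* What the VMM should do when it frees a page (illustrative names). */
enum free_action { ACT_NONE, ACT_FREE_FRAME, ACT_UNMAP_ONLY, ACT_FLUSH_THEN_FREE };

/* Build a present PTE for `phys` with the given type and extra flag bits. */
static inline uint32_t pte_make(uint32_t phys, enum page_type t, uint32_t flags)
{
    return (phys & PTE_FRAME_MASK) | ((uint32_t)t << PTE_TYPE_SHIFT) | flags | PTE_PRESENT;
}

/* Extract the page type stored in the available bits. */
static inline enum page_type pte_type(uint32_t pte)
{
    return (enum page_type)((pte & PTE_TYPE_MASK) >> PTE_TYPE_SHIFT);
}

/* Decide how to release the page behind `*pte`, then clear the entry. */
enum free_action vmm_free_page(uint32_t *pte)
{
    enum free_action act = ACT_NONE;
    assert(pte != NULL);
    if (*pte & PTE_PRESENT) {
        switch (pte_type(*pte)) {
        case PT_RAM:     act = ACT_FREE_FRAME; break;   /* hand the frame back to the PMM */
        case PT_SPECIAL: act = ACT_UNMAP_ONLY; break;   /* device memory: never freed as RAM */
        case PT_FILE:                                   /* write back first if modified */
            act = (*pte & PTE_DIRTY) ? ACT_FLUSH_THEN_FREE : ACT_FREE_FRAME;
            break;
        default:         act = ACT_UNMAP_ONLY; break;   /* unused type value: be conservative */
        }
    }
    *pte = 0;   /* a real kernel would also invalidate the TLB entry here */
    return act;
}
```

With this in place a framebuffer mapped at 0xE0000000 is tagged `PT_SPECIAL`, so freeing it never reaches the physical memory manager and never indexes past the end of a per-frame array.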

Cheers,

Brendan

dc0d32 · Post by **dc0d32** » Sun May 07, 2006 2:09 pm

more clear now. thanks Brendan.

distantvoices · Post by **distantvoices** » Mon May 08, 2006 3:40 am

@prashant: If I were you, I'd avoid using a virtual-address-to-page trick. I'd remove the pages from the list of dealt-out pages and push them back on the stack (a linked list structure). No looping through page tables & page directories, just finding the pages associated with a given pagedir and removing them.

Then, give back the zeroed-out pagedir and there you are: a deleted process. *gg*

stay safe.

OSDev.org
