Best/Most efficient way of managing virtual address space

heat · Post by **heat** » Sat Jun 04, 2016 6:11 pm

Hi,

I'm developing an x86-64 OS. While I was trying to be able to run user programs, I ran into a problem: my virtual memory manager isn't good enough. It suffers from a lot of problems: overly-complicated code, spaghetti code, too much memory used, etc. I'm trying to figure out the best solution to make a virtual space manager.

I've tried looking into a bitmap, but the huge address space x86_64 has makes it impossible for me to use ( > 1 GiB).

I'm currently using a linked-list, and the biggest problem I have is that I can't zone the memory, like another operating systems do (allocated stacks are at 0x7fffffffffffff, heap is at 0x30000000, etc).

What solutions do you use, and what's the best?

Thanks a bunch.

Combuster · Post by **Combuster** » Sun Jun 05, 2016 1:59 am

Bitmaps are supposed to only index the used status of RAM, not the entire address space. So if you have say 8 gigabytes of RAM and you split it into typical 4kb page-size chunks, you need 256kb for your bitmap.

heat · Post by **heat** » Sun Jun 05, 2016 5:48 am

If a bitmap isn't an option, what method should I use?

I know that Linux uses AVL trees. Is it any good? Are there other ways to manage the virtual adress space?

iansjack · Post by **iansjack** » Sun Jun 05, 2016 6:15 am

I'm not quite sure what management you are looking at for the virtual address space. Each process only has it's own discrete address space (although parts of it will belong to all processes) so there's not a huge amount of management to do. That's one of the advantages of paging. And because the available address space for each process is so huge that just makes the job much easier.

neon · Post by **neon** » Sun Jun 05, 2016 8:23 am

We use AVL trees for user space addresses and a free list of pte's to manage kernel space. I believe the AVL tree is the best current option given its design; both Windows and Linux use it.

In any case, whatever method you choose - list or tree - the process control block should contain a pointer to the root of the structure. It should be used to reserve areas of the address space so that it can be demand paged into memory later on if its needed.

You might also want to look into recursive page directories if not already done so. They can simplify the architecture a lot.

heat · Post by **heat** » Sun Jun 05, 2016 8:54 am

neon wrote:We use AVL trees for user space addresses and a free list of pte's to manage kernel space. I believe the AVL tree is the best current option given its design; both Windows and Linux use it.

In any case, whatever method you choose - list or tree - the process control block should contain a pointer to the root of the structure. It should be used to reserve areas of the address space so that it can be demand paged into memory later on if its needed.

You might also want to look into recursive page directories if not already done so. They can simplify the architecture a lot.

What does recursive mapping have to do with virtual memory management? I'm thinking about unmapping the recursive trick since with x86_64, I can map all the physical memory to the address space with ease. Even the most RAM I've seen ( 192 GB ) in a single consumer PC would fit into a single PDPT.

neon · Post by **neon** » Sun Jun 05, 2016 9:58 am

Sorry, I consider the paging system as a subset of virtual memory management given that being able to update and manipulate paging structures directly (which is the whole point of recursive mapping) effect the virtual address spaces.

You can certainly just map all of physical memory into the address space. You can even use segmentation to separate processes in the address space. Sure, you would lose the benefits of what paging offers, but it is your design so make it however you want it to be.

Boris · Post by **Boris** » Sun Jun 05, 2016 11:24 pm

I use two types of virtual memory representations on my x86_64 kernel:
* Linked list of free virtual zones, for my highest memory zone ( last GB ). The advantage of this is : it's simple and does not require complicated kmalloc. I use that for my kernel heap with pages of 2MB=> 512 zones max. I could have used a bitmap but it's slower.
* AVL trees ( double trees, to quickly allocate by best size and quickly merge by address ) ,for my kernel modules zone (-2Gb) , and non-kernel memory

heat · Post by **heat** » Mon Jun 06, 2016 2:22 am

I think I figured out how I will re-do this. I'm implementing an AVL tree for the VMM. But one thing I'm not sure is how does the kernel allocate special memory like the heap or the thread's stacks to special memory. I think (correct me If I'm wrong) that the the address space is split into zones, and each zone has a purpose and a max size. So If I want to allocate stacks for threads, and imagine I want to start allocation at address 0x7fffffffffffff, I would just declare a zone that has that address, some random size like 512GB and ZONE_STACK purpose?

This is what

Code: Select all

pmap /bin/bash

looks like (notice the stack is at a very high address in the virtual address space, far from any other allocated memory and together with the shared libraries) :

Code: Select all

715:   /bin/bash
0000000000400000    756K r-x-- bash
00000000006bc000      4K r---- bash
00000000006bd000     16K rw--- bash
00000000006c1000     40K rw---   [ anon ]
0000000002284000    264K rw---   [ anon ]
00007fa206ef6000     44K r-x-- libnss_files-2.23.so
00007fa206f01000   2044K ----- libnss_files-2.23.so
00007fa207100000      4K r---- libnss_files-2.23.so
00007fa207101000      4K rw--- libnss_files-2.23.so
00007fa207102000     24K rw---   [ anon ]
00007fa207108000   1628K r-x-- libc-2.23.so
00007fa20729f000   2048K ----- libc-2.23.so
00007fa20749f000     16K r---- libc-2.23.so
00007fa2074a3000      8K rw--- libc-2.23.so
00007fa2074a5000     16K rw---   [ anon ]
00007fa2074a9000      8K r-x-- libdl-2.23.so
00007fa2074ab000   2048K ----- libdl-2.23.so
00007fa2076ab000      4K r---- libdl-2.23.so
00007fa2076ac000      4K rw--- libdl-2.23.so
00007fa2076ad000    412K r-x-- libncursesw.so.6.0
00007fa207714000   2048K ----- libncursesw.so.6.0
00007fa207914000     16K r---- libncursesw.so.6.0
00007fa207918000      8K rw--- libncursesw.so.6.0
00007fa20791a000    260K r-x-- libreadline.so.6.3
00007fa20795b000   2044K ----- libreadline.so.6.3
00007fa207b5a000      8K r---- libreadline.so.6.3
00007fa207b5c000     24K rw--- libreadline.so.6.3
00007fa207b62000      8K rw---   [ anon ]
00007fa207b64000    140K r-x-- ld-2.23.so
00007fa207d5a000     20K rw---   [ anon ]
00007fa207d87000      4K r---- ld-2.23.so
00007fa207d88000      4K rw--- ld-2.23.so
00007fa207d89000      4K rw---   [ anon ]
00007ffef7d6f000    132K rw---   [ stack ]
00007ffef7dcc000     12K r----   [ anon ]
00007ffef7dcf000      8K r-x--   [ anon ]
ffffffffff600000      4K r-x--   [ anon ]
 total            14136K

xmm15 · Post by **xmm15** » Mon Jun 06, 2016 12:30 pm

FWIW: I use the kernel's page tables. Since each process have their own page tables, and the kernel has an identity mapping, I use the AVL bits in each PTE of the kernel page tables.

So when I need to find an available page to map into a user process, the kernel searches the first PTE with the AVL=0 and sets it to 1. This way, it knows that the page is mapped somehwere. Since I have a few AVL bits available, I can use different values to mark as free, in use by process, in use by mmio etc.

Certainly not the best approach though but very simple. AVL would be much more performant. But the nice thing about my solution is that it doesn't require any extra memory since I need those PTE anyway (well... I could use huge pages since I'm using identity mapping though).

OSDev.org

Best/Most efficient way of managing virtual address space

Best/Most efficient way of managing virtual address space

Re: Best/Most efficient way of managing virtual address spac

Re: Best/Most efficient way of managing virtual address spac

Re: Best/Most efficient way of managing virtual address spac

Re: Best/Most efficient way of managing virtual address spac

Re: Best/Most efficient way of managing virtual address spac

Re: Best/Most efficient way of managing virtual address spac

Re: Best/Most efficient way of managing virtual address spac

Re: Best/Most efficient way of managing virtual address spac

Re: Best/Most efficient way of managing virtual address spac