
Re: which memory allocation system to use

Posted: Fri Jul 23, 2021 7:25 am
by nexos
rdos wrote:In 32-bit mode, there are only 1024 PDEs, and 256 of those would be for the 1G kernel space. Allocating these on demand has no speed implications. However, if you pre-allocate them then you will waste 256 x 4k = 1M of physical memory (2M for PAE, where 512 page tables are needed to cover 1G). Compare that with 4k for the system mapping. It actually doesn't matter if you set bit 0 or not. If you are not allocating them dynamically, then you will need to reserve physical memory for them that cannot be used for other things.
Actually, I was talking about allocating page tables. Here is how my flow will go:
User or kernel requests, say, 12K of memory.
Kernel searches the AVL tree for a free region of that size.
Kernel allocates an Address Region Descriptor (ARD).
Kernel now accesses the PDE for this region.
If the page table is not present, the kernel allocates the page table from zero-filled memory and maps it.
The kernel sets the corresponding PTEs to point to the ARD for this region, leaving bit 0 (Present) clear so the first access faults and the kernel knows the page has not been mapped yet.
User or kernel accesses this region.
Kernel grabs the PTE for the faulting address.
It checks the ARD pointed to by the PTE to see what kind of memory this is (zero fill, swapped, mapped to a file, etc.; I will only explain how zero fill will work).
If zero fill, the kernel allocates a zero-filled page and makes the PTE point to it, also setting bits such as R/W, NX, and so on according to the ARD.
Kernel then restarts the instruction.
Of course, things are much different for swapped or memory-mapped files, but I haven't planned those yet :)
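The zero-fill path described above can be sketched in user space as follows. This is only an illustration under stated assumptions: the `struct ard` fields, the function names `region_reserve` and `region_fault`, and the tiny `phys_mem` pool are all hypothetical, and a page table is simulated as a plain array. A not-present PTE stores the descriptor pointer, which works because the CPU ignores all other bits of an entry whose bit 0 is clear.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE   4096u
#define PAGE_SHIFT  12
#define PTE_PRESENT 0x1u            /* bit 0 */
#define PTE_WRITE   0x2u            /* bit 1, R/W once the entry is present */
#define NFRAMES     8u              /* hypothetical tiny pool of simulated frames */

enum ard_kind { ARD_ZERO_FILL, ARD_SWAPPED, ARD_FILE };

/* Address Region Descriptor: one per allocated region (field names assumed). */
struct ard {
    enum ard_kind kind;
    size_t        pages;
    int           writable;
};

typedef uintptr_t pte_t;            /* wide enough to hold a host pointer here */

unsigned char phys_mem[NFRAMES][PAGE_SIZE];  /* stand-in for physical memory */
size_t next_frame = 1;                       /* frame 0 is never handed out */

/* Allocation time: point every PTE of the region at its ARD, bit 0 clear. */
void region_reserve(pte_t *pt, size_t first, struct ard *r)
{
    assert(((uintptr_t)r & PTE_PRESENT) == 0);  /* pointer must not look present */
    for (size_t i = 0; i < r->pages; i++)
        pt[first + i] = (uintptr_t)r;
}

/* Fault time: returns 0 if the faulting instruction can be restarted. */
int region_fault(pte_t *pt, size_t index)
{
    pte_t pte = pt[index];
    if ((pte & PTE_PRESENT) || pte == 0)
        return -1;                  /* already mapped, or nothing reserved here */

    const struct ard *r = (const struct ard *)pte;
    if (r->kind != ARD_ZERO_FILL || next_frame >= NFRAMES)
        return -1;                  /* swap and file backing are not sketched */

    size_t frame = next_frame++;
    memset(phys_mem[frame], 0, PAGE_SIZE);      /* hand out a zeroed page */
    pt[index] = ((pte_t)frame << PAGE_SHIFT) | PTE_PRESENT
              | (r->writable ? PTE_WRITE : 0u);
    return 0;
}
```

Reserving a three-page region and then touching its second page maps exactly one frame; the other two PTEs keep pointing at the descriptor and consume no physical memory.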
rdos wrote:I set unallocated PTEs to zero, and when they are allocated I set bit 1 to indicate this. When the PF handler sees a PTE with bit 0 zero and bit 1 as one, it will allocate physical memory for it, set up the PTE, and reexecute. So, if I allocate 10 pages in kernel, they will be set up with PTEs containing "2", and will consume no physical memory. When a pagefault occurs, physical memory is mapped automatically. The pagefault handler will generate a pagefault exception if bit 1 is not set. I also have a more specialized function where I can link a callback. This is not really demand paging, rather lazy allocation of physical memory in kernel space.
I will have one more level of indirection to make allocations more efficient, and also to allow for versatility with swap and memory mapped files.
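For comparison, the simpler bit-1 scheme quoted above (a PTE of 0 means unallocated, a PTE of 2 means allocated but not yet backed) might look roughly like this. The names `reserve_pages`, `page_fault` and the bump allocator `next_free_frame` are assumptions for illustration, not rdos's actual code. Keeping bit 1 set when the page becomes present is also an assumption of this sketch; the hardware then reads that bit as R/W, so the page ends up writable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT   12
#define PTE_PRESENT  0x1u   /* bit 0 */
#define PTE_RESERVED 0x2u   /* bit 1: linear address allocated, no frame yet */

enum fault_result { FAULT_MAPPED, FAULT_EXCEPTION };

uint32_t next_free_frame = 0x100;   /* hypothetical bump allocator for frames */

/* Allocating linear memory only writes the value 2 into each PTE. */
void reserve_pages(uint32_t *pt, size_t first, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        assert(pt[first + i] == 0);         /* must be unallocated */
        pt[first + i] = PTE_RESERVED;
    }
}

/* Page-fault path: back reserved pages, raise an exception for anything else. */
enum fault_result page_fault(uint32_t *pt, size_t index)
{
    uint32_t pte = pt[index];
    if ((pte & PTE_PRESENT) == 0 && (pte & PTE_RESERVED) != 0) {
        uint32_t frame = next_free_frame++;
        pt[index] = (frame << PAGE_SHIFT) | PTE_RESERVED | PTE_PRESENT;
        return FAULT_MAPPED;                /* caller re-executes the instruction */
    }
    return FAULT_EXCEPTION;                 /* bit 1 not set: genuine fault */
}
```

The difference from the descriptor-based flow is that the not-present PTE carries a single flag instead of a pointer, so every lazily backed page is treated the same way.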

Re: which memory allocation system to use

Posted: Fri Jul 23, 2021 7:48 am
by rdos
I think you are over-complicating stuff. When I allocate a linear area in kernel space I scan PDEs for unallocated entries. The algorithm is a bit smart so it remembers where it last found a page. This will make linear memory allocation rather fast. I doubt AVL trees will perform significantly better on real-life scenarios. I see no reason at all for having region descriptors. In my design, I hardcode the function of the 1G kernel areas. Most of it will be for the page aligned memory allocator, much less for a malloc-like byte aligned allocator, the current page alias, the system alias and the kernel image itself.
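A scan that remembers where it last succeeded is usually called next-fit. A minimal sketch of the idea, assuming a hypothetical 1024-entry window and the marker value 2 for reserved entries (the names `linear_alloc`, `linear_free` and `KERNEL_PAGES` are illustrative, not taken from any real kernel):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KERNEL_PAGES   1024u  /* hypothetical size of the scanned window */
#define ENTRY_RESERVED 2u     /* marker for "allocated, not yet backed" */

struct linear_allocator {
    uint32_t entries[KERNEL_PAGES];  /* 0 = unallocated */
    size_t   hint;                   /* where the next scan starts */
};

/* Find `count` consecutive free entries in [from, to); -1 if there are none. */
long scan_range(const uint32_t *e, size_t from, size_t to, size_t count)
{
    size_t run = 0;
    for (size_t i = from; i < to; i++) {
        run = (e[i] == 0) ? run + 1 : 0;
        if (run == count)
            return (long)(i + 1 - count);
    }
    return -1;
}

/* Next-fit: scan from the hint to the top, then wrap around to the bottom. */
long linear_alloc(struct linear_allocator *a, size_t count)
{
    assert(a != NULL);
    if (count == 0 || count > KERNEL_PAGES)
        return -1;

    long at = scan_range(a->entries, a->hint, KERNEL_PAGES, count);
    if (at < 0) {
        /* Second pass covers every run that starts below the hint. */
        size_t limit = a->hint + count - 1;
        if (limit > KERNEL_PAGES)
            limit = KERNEL_PAGES;
        at = scan_range(a->entries, 0, limit, count);
    }
    if (at < 0)
        return -1;

    for (size_t i = 0; i < count; i++)
        a->entries[(size_t)at + i] = ENTRY_RESERVED;
    a->hint = ((size_t)at + count) % KERNEL_PAGES;
    return at;
}

void linear_free(struct linear_allocator *a, size_t first, size_t count)
{
    for (size_t i = 0; i < count; i++)
        a->entries[first + i] = 0;
}
```

While the window is mostly empty, each allocation starts right where the previous one ended and returns almost immediately. Once the hint reaches the top, a request has to rescan from the bottom, and the cost depends on how fragmented the window has become.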

As for swap, that is ancient stuff that no sane OS today should support. Actually, this never worked properly anyway in ancient OSes. If you overcommit memory and cannot fix it by discarding file caches, then the only reasonable action is a shutdown.

When it comes to memory mapping files, this is a userspace function. How you implement the userspace VMM will depend on the executable formats used and on whether you preload images or not. Memory mapping files is just a specialized userspace function and thus doesn't need to be part of the kernel-space VMM.

Re: which memory allocation system to use

Posted: Fri Jul 23, 2021 8:11 am
by nexos
rdos wrote:I think you are over-complicating stuff. When I allocate a linear area in kernel space I scan PDEs for unallocated entries. The algorithm is a bit smart so it remembers where it last found a page. This will make linear memory allocation rather fast. I doubt AVL trees will perform significantly better on real-life scenarios. I see no reason at all for having region descriptors. In my design, I hardcode the function of the 1G kernel areas. Most of it will be for the page aligned memory allocator, much less for a malloc-like byte aligned allocator, the current page alias, the system alias and the kernel image itself.
Remembering where a page was last found works until you reach the top of the address space. Then you have to start the search from the beginning all over again, and from that point on the solution won't work nearly as well. Performance graphs for AVL trees show that lookups scale very well (a search is O(log n)). Having region descriptors will make larger allocations perform better, as we could centralize management of that memory region and only have one structure for that region, making anticipatory paging easier.
rdos wrote:As for swap, that is ancient stuff that no sane OS today should support. Actually, this never worked properly anyway in ancient OSes. If you overcommit memory and cannot fix it by discarding file caches, then the only reasonable action is a shutdown.
In theory, swap should be antiquated, until you start linking huge apps :) Try building LLVM... As memory gets larger, so do apps, so swap isn't quite as important as it used to be, but it is still relevant.
rdos wrote:When it comes to memory mapping files, this is a userspace function. How you implement the userspace VMM will depend on the executable formats used and on whether you preload images or not. Memory mapping files is just a specialized userspace function and thus doesn't need to be part of the kernel-space VMM.
All major OSes implement mmap in kernel space. Could you please explain what you mean?

Re: which memory allocation system to use

Posted: Fri Jul 23, 2021 10:08 am
by Korona
rdos wrote:As for swap, that is ancient stuff that no sane OS today should support. Actually, this never worked properly anyway in ancient OSes. If you overcommit memory and cannot fix it by discarding file caches, then the only reasonable action is a shutdown.
That might be true on the desktop but it's certainly not true in the server and mobile segments. If you have a program on a cloud server that briefly consumes a lot of memory (say, to link a large object or run a large database query), it's certainly better to swap for 5s than to attach more RAM to the VM (which costs money even if it is not used).

Re: which memory allocation system to use

Posted: Fri Jul 23, 2021 3:11 pm
by rdos
nexos wrote:All major OSes implement mmap in kernel space. Could you please explain what you mean?
I didn't say it's not implemented in the kernel, only that it should not be part of the kernel VMM. You can implement mmap by using the mapping functions in the kernel VMM. It also needs to be integrated with the filesystem. You don't want to involve filesystem code in your VMM.

Actually, I would say that mmap belongs to the filesystem implementation. You might also decide to demand load your executables, and this function would belong to the executable loader. That's because demand loading might also require fixing references and relocation, something that is dependent on the executable format used.

You might also want to export syscalls that allocate user memory and share it between applications. These don't belong to the kernel VMM either.
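One way to picture this layering is a VMM that only offers a per-range fault hook, with the filesystem supplying the function that fills a page. The sketch below is a user-space simulation under assumptions: `struct vmm_range`, `vmm_handle_fault`, `struct fs_file` and `fs_mmap_fill` are hypothetical names, "physical memory" is a byte array, and the "file" is an in-memory buffer. The same hook could equally be used by an executable loader for demand loading.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 4096u

/* ---- kernel VMM side: knows nothing about files ---- */
typedef int (*vmm_fault_cb)(void *ctx, size_t page, unsigned char *frame);

struct vmm_range {
    size_t         pages;
    vmm_fault_cb   on_fault;  /* fills a page on first touch */
    void          *ctx;       /* opaque to the VMM */
    unsigned char *frames;    /* simulated memory, pages * PAGE_SIZE bytes */
    unsigned char *present;   /* one flag per page */
};

/* Called from the page-fault path; returns 0 once the page is mapped. */
int vmm_handle_fault(struct vmm_range *r, size_t page)
{
    assert(r != NULL && r->on_fault != NULL);
    if (page >= r->pages || r->present[page])
        return -1;
    unsigned char *frame = r->frames + page * PAGE_SIZE;
    if (r->on_fault(r->ctx, page, frame) != 0)
        return -1;
    r->present[page] = 1;
    return 0;
}

/* ---- filesystem side: implements mmap on top of the VMM hook ---- */
struct fs_file {
    const unsigned char *data;   /* stand-in for reading blocks from disc */
    size_t               size;
};

int fs_mmap_fill(void *ctx, size_t page, unsigned char *frame)
{
    const struct fs_file *f = ctx;
    size_t off = page * PAGE_SIZE;

    memset(frame, 0, PAGE_SIZE);         /* bytes past end of file read as zero */
    if (off < f->size) {
        size_t n = f->size - off;
        if (n > PAGE_SIZE)
            n = PAGE_SIZE;
        memcpy(frame, f->data + off, n);
    }
    return 0;
}
```

The VMM half never touches `struct fs_file`; it only calls through the pointer it was given, which is the separation argued for above.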

Re: which memory allocation system to use

Posted: Fri Jul 23, 2021 3:14 pm
by rdos
Korona wrote:
rdos wrote:As for swap, that is ancient stuff that no sane OS today should support. Actually, this never worked properly anyway in ancient OSes. If you overcommit memory and cannot fix it by discarding file caches, then the only reasonable action is a shutdown.
That might be true on the desktop but it's certainly not true in the server and mobile segments. If you have a program on a cloud server that briefly consumes a lot of memory (say, to link a large object or run a large database query), it's certainly better to swap for 5s than to attach more RAM to the VM (which costs money even if it is not used).
I couldn't care less about emulators or virtual servers. For mobile devices, there typically is plenty of memory and no suitable swap device, and so these do not benefit from having a swap function.

Re: which memory allocation system to use

Posted: Fri Jul 23, 2021 6:10 pm
by nexos
rdos wrote:
nexos wrote:All major OSes implement mmap in kernel space. Could you please explain what you mean?
I didn't say it's not implemented in the kernel, only that it should not be part of the kernel VMM. You can implement mmap by using the mapping functions in the kernel VMM. It also needs to be integrated with the filesystem. You don't want to involve filesystem code in your VMM.
Throughout this whole conversation I thought you were talking about the user VMM #-o . Oops.... For the kernel VMM, that should be as simple as possible, no doubt.

Re: which memory allocation system to use

Posted: Sat Jul 24, 2021 3:24 am
by Korona
rdos wrote:I couldn't care less about emulators or virtual servers. For mobile devices, there typically is plenty of memory and no suitable swap device, and so these do not benefit from having a swap function.
Nothing that I said applies exclusively to virtual servers. And nobody uses emulators in production anyway. For servers, it's not about having "plenty of memory" available, but about the ratio of RAM to core count. There are many physical servers that have 48 hardware threads but only 192 GiB of RAM (RAM is much more expensive than CPUs with higher core counts + you pay a premium for mainboards with more DIMM slots and DIMMs with higher density), so the amount of memory per core is much worse than on your high end desktop machine. And a large parallel compilation (e.g., of C++ or Rust code; try compiling LLVM) can take more than 6 GiB of RAM; that's why you always want to run your build servers (or any other servers that run throughput- and not latency-critical jobs) with swap enabled. Swap to SSDs is very fast anyway.
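The ratio argument can be made concrete with the figures from the paragraph above. The sketch assumes the worst case, which the post does not state explicitly: one job per hardware thread, all hitting a 6 GiB peak at the same moment. The function names are illustrative.

```c
#include <assert.h>

/* GiB of RAM available per hardware thread. */
unsigned ram_per_thread_gib(unsigned ram_gib, unsigned threads)
{
    assert(threads > 0);
    return ram_gib / threads;
}

/* GiB by which demand exceeds RAM if every thread runs one job at its peak. */
unsigned peak_shortfall_gib(unsigned ram_gib, unsigned threads, unsigned job_gib)
{
    unsigned demand = threads * job_gib;
    return demand > ram_gib ? demand - ram_gib : 0;
}
```

With 192 GiB and 48 threads, `ram_per_thread_gib(192, 48)` gives 4 GiB per thread, while 48 jobs at 6 GiB each would need 288 GiB, so `peak_shortfall_gib(192, 48, 6)` is 96 GiB. That gap either has to be absorbed by swap for the duration of the spike or avoided by running fewer jobs in parallel.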

For mobile devices: that's just wrong; modern mobile apps are quite memory hungry and it's way faster to reload them from swap than to restart them from disk. Android swaps by default, with the swappiness of background apps set to a very high value (up to 100 in some configurations, i.e., swap everything immediately).

The claim that swap is a feature of the past just does not match reality at all.

Re: which memory allocation system to use

Posted: Sat Jul 24, 2021 3:55 am
by rdos
Korona wrote:
rdos wrote:I couldn't care less about emulators or virtual servers. For mobile devices, there typically is plenty of memory and no suitable swap device, and so these do not benefit from having a swap function.
Nothing that I said applies exclusively to virtual servers. And nobody uses emulators in production anyway. For servers, it's not about having "plenty of memory" available, but about the ratio of RAM to core count. There are many physical servers that have 48 hardware threads but only 192 GiB of RAM (RAM is much more expensive than CPUs with higher core counts + you pay a premium for mainboards with more DIMM slots and DIMMs with higher density), so the amount of memory per core is much worse than on your high end desktop machine. And a large parallel compilation (e.g., of C++ or Rust code; try compiling LLVM) can take more than 6 GiB of RAM; that's why you always want to run your build servers (or any other servers that run throughput- and not latency-critical jobs) with swap enabled. Swap to SSDs is very fast anyway.

For mobile devices: that's just wrong; modern mobile apps are quite memory hungry and it's way faster to reload them from swap than to restart them from disk. Android swaps by default, with the swappiness of background apps set to a very high value (up to 100 in some configurations, i.e., swap everything immediately).

The claim that swap is a feature of the past just does not match reality at all.
I don't believe that compiling is a good example of a system needing a swap function. In fact, swap is the slowest possible way to fix low-memory situations. You need to write out the code or data you want to swap and then read it back again. Before the OS even considers doing this, it should shrink the disc cache and use demand-loading of executables. You can also discard parts of the loaded application at no cost since they are read-only, and this doesn't require a swap device.

Besides, I have no idea why a compiler would need several GB of memory. The C++ compiler I use doesn't even come close to such a memory requirement, and if the compiler does use too much memory, it will perform horribly if part of its data must be written and reread all the time from a disc (even if it's an SSD). This is exactly the scenario that created useless results in the past.

The only time I ever had this kind of problem with recent hardware was with STATA and an extremely large dataset. Windows didn't have enough memory, and the analysis took almost forever (and so it was useless). When I installed 24 GB it worked fine. Thus, swap was not the solution at all.

Re: which memory allocation system to use

Posted: Sat Jul 24, 2021 4:05 am
by Korona
rdos wrote:I don't believe that compiling is a good example of a system needing a swap function. In fact, swap is the slowest possible way to fix low-memory situations. You need to write out the code or data you want to swap and then read it back again. Before the OS even considers doing this, it should shrink the disc cache and use demand-loading of executables. You can also discard parts of the loaded application at no cost since they are read-only, and this doesn't require a swap device.

Besides, I have no idea why a compiler would need several GB of memory. The C++ compiler I use doesn't even come close to such a memory requirement, and if the compiler does use too much memory, it will perform horribly if part of its data must be written and reread all the time from a disc (even if it's an SSD). This is exactly the scenario that created useless results in the past.
Well, swap is not needed for the system to function but swap improves performance: RAM usage very rarely spikes above the RAM size and if it does, it only does so briefly (say, for 20s). Yes, the compiler will obviously be much slower when it needs to swap. However, the alternative would be to limit the number of concurrently running jobs, but that is a pessimization and has a higher performance overhead than swapping briefly.

Obviously, letting the system swap constantly is quite bad. But using swap for spikes in RAM usage makes a lot of sense if you only care about throughput, and not latency. In fact, there is still ongoing work in Linux to make swap more scalable, with the multi-generational LRU patch and locking improvements to the swap system.

Re: which memory allocation system to use

Posted: Sat Jul 24, 2021 4:13 am
by rdos
nexos wrote:
rdos wrote:
nexos wrote:All major OSes implement mmap in kernel space. Could you please explain what you mean?
I didn't say it's not implemented in the kernel, only that it should not be part of the kernel VMM. You can implement mmap by using the mapping functions in the kernel VMM. It also needs to be integrated with the filesystem. You don't want to involve filesystem code in your VMM.
Throughout this whole conversation I thought you were talking about the user VMM #-o . Oops.... For the kernel VMM, that should be as simple as possible, no doubt.
I brought it up since it appeared you wanted to mix up kernel & user functionality in your VMM. My (kernel) VMM only contains process creation, linear memory allocation & mapping functions. The filesystem will support memory-mapping files, and the executable loader will support demand loading (if configured). I also export a set of functions for user mode that allow tweaking memory objects and that call into the VMM.

In fact, my VMM comes in two variants: a 32-bit paging variant and a PAE variant. Which one the system uses is determined at boot time. The VMM exports a set of basic functions, and these are linked either to the 32-bit paging interface or to the PAE interface. The page tables cannot be accessed outside of this module, primarily because the layout differs based on mode.
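One common way to get this kind of boot-time binding in C is a table of function pointers that is chosen once and then used by every caller. The sketch below assumes that approach; it is not a description of how RDOS links its interfaces, and `struct vmm_ops`, `vmm_select` and the `cpu_has_pae` flag are hypothetical. The layout constants themselves are architectural: 32-bit paging uses 4-byte entries with 1024 per table, PAE uses 8-byte entries with 512 per table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Everything callers may know about the active paging mode. */
struct vmm_ops {
    const char *name;
    unsigned    entry_bytes;         /* size of one page-table entry */
    unsigned    entries_per_table;
    unsigned  (*table_index)(uint32_t linear);  /* PTE index for an address */
};

unsigned legacy_table_index(uint32_t linear) { return (linear >> 12) & 0x3FFu; }
unsigned pae_table_index(uint32_t linear)    { return (linear >> 12) & 0x1FFu; }

const struct vmm_ops vmm_legacy = { "32-bit paging", 4, 1024, legacy_table_index };
const struct vmm_ops vmm_pae    = { "PAE",           8,  512, pae_table_index };

const struct vmm_ops *vmm;           /* bound once at boot, read-only afterwards */

/* cpu_has_pae is a hypothetical flag, e.g. derived from CPU feature detection. */
void vmm_select(int cpu_has_pae)
{
    vmm = cpu_has_pae ? &vmm_pae : &vmm_legacy;
}

/* Linear address space covered by one page table in the active mode. */
uint32_t vmm_bytes_per_table(void)
{
    assert(vmm != NULL);
    return vmm->entries_per_table * 4096u;
}
```

Code outside the module goes through `vmm` only, so it never depends on whether a table covers 4M (32-bit paging) or 2M (PAE) of linear address space.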

Re: which memory allocation system to use

Posted: Sat Jul 24, 2021 4:23 am
by rdos
Korona wrote:
rdos wrote:I don't believe that compiling is a good example of a system needing a swap function. In fact, swap is the slowest possible way to fix low-memory situations. You need to write out the code or data you want to swap and then read it back again. Before the OS even considers doing this, it should shrink the disc cache and use demand-loading of executables. You can also discard parts of the loaded application at no cost since they are read-only, and this doesn't require a swap device.

Besides, I have no idea why a compiler would need several GB of memory. The C++ compiler I use doesn't even come close to such a memory requirement, and if the compiler does use too much memory, it will perform horribly if part of its data must be written and reread all the time from a disc (even if it's an SSD). This is exactly the scenario that created useless results in the past.
Well, swap is not needed for the system to function but swap improves performance: RAM usage very rarely spikes above the RAM size and if it does, it only does so briefly (say, for 20s). Yes, the compiler will obviously be much slower when it needs to swap. However, the alternative would be to limit the number of concurrently running jobs, but that is a pessimization and has a higher performance overhead than swapping briefly.

Obviously, letting the system swap constantly is quite bad. But using swap for spikes in RAM usage makes a lot of sense if you only care about throughput, and not latency. In fact, there is still ongoing work in Linux to make swap more scalable, with the multi-generational LRU patch and locking improvements to the swap system.
Well, at least my web hotel will reboot the virtual server if RAM consumption goes too high or if the CPU gets too heavily loaded. This will terminate offending programs and require them to be restarted. When you actually restart them you need to make sure they have enough resources. I can tweak the memory settings, but it costs money, and so I prefer to keep them at a point that doesn't create too many reboots.

And, again, it's not my impression that swapping is a feature that improves performance, rather the contrary. It's better to simply terminate offending applications if it's impossible to get enough RAM to run them, or to reboot the virtual server.