
Automatic memory management

Posted: Thu Nov 19, 2015 10:48 am
by Roman
I'm working on a programming language (currently on code generation), so I'm thinking about how memory management will be implemented. The only thing I'm sure about is that it should be automatic.

One of the popular choices is a garbage collector, but I think it creates unwanted overhead. Garbage is not always collected promptly, which is bad if destructors do something important.

ARC is the technique I've always favored. However, it has a disadvantage: it can't deal with cyclic references on its own; instead it lets the programmer break cycles with weak references, which don't increment reference counts. Recently I came across different ideas about combining ARC with a region-based memory model or with tracing garbage collection in a hybrid.
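
To make the weak-reference workaround concrete, here is a minimal sketch in Rust, whose Rc/Weak pointers play the same strong/weak roles as ARC (the parent/child shape is just an illustration):

Code:
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// A parent owns its children (strong references); each child points
// back at its parent with a Weak reference, so the cycle doesn't keep
// the pair alive forever.
struct Node {
    name: String,
    parent: RefCell<Weak<Node>>,
    children: RefCell<Vec<Rc<Node>>>,
}

fn main() {
    let parent = Rc::new(Node {
        name: "parent".into(),
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(Vec::new()),
    });
    let child = Rc::new(Node {
        name: "child".into(),
        parent: RefCell::new(Rc::downgrade(&parent)), // does not bump the strong count
        children: RefCell::new(Vec::new()),
    });
    parent.children.borrow_mut().push(Rc::clone(&child));

    // Upgrading a Weak yields an Option: None once the target is gone.
    if let Some(p) = child.parent.borrow().upgrade() {
        println!("{}'s parent is {}", child.name, p.name);
    }
    println!("strong = {}, weak = {}", Rc::strong_count(&parent), Rc::weak_count(&parent));
} // both nodes are freed here; with two strong references in a cycle they wouldn't be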

I'm quite interested in your thoughts on this, and on memory management in general.

Re: Automatic memory management

Posted: Sat Nov 21, 2015 11:56 am
by Schol-R-LEA
If you can afford to buy them, or have access to a college library, you might want to look up either Garbage Collection (1996) or its recent update The Garbage Collection Handbook (2012), by Richard Jones.

Re: Automatic memory management

Posted: Sat Nov 21, 2015 12:52 pm
by Rusky
Garbage collection works fine for memory, if you can afford or work around its downsides. It doesn't work so well when you're managing resources other than memory that need to be explicitly closed, which is why Java grew 'try-with-resources' and C# grew 'using' blocks.
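
For contrast, here's a minimal Rust sketch of the deterministic alternative (which is where the rest of this post is heading): the file handle is a value with a destructor, so it's closed at end of scope without any special language construct:

Code:
use std::fs::File;
use std::io::{self, Read};

// The File handle is closed deterministically when `f` goes out of
// scope (its destructor runs), even on the early `?` returns -- the
// same guarantee try-with-resources/'using' have to bolt on.
fn read_config(path: &str) -> io::Result<String> {
    let mut f = File::open(path)?;
    let mut contents = String::new();
    f.read_to_string(&mut contents)?;
    Ok(contents)
} // `f` dropped (and the OS handle closed) here

fn main() {
    match read_config("/etc/hostname") {
        Ok(s) => println!("{}", s.trim()),
        Err(e) => eprintln!("couldn't read it: {}", e),
    }
}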

Region-based memory management is a pretty powerful alternative if you're willing to make the language a little more complex. Probably the most accessible implementation today is the Rust programming language, but there's also Cyclone and ATS.

Rust builds the concept of ownership into the language, so that all values begin with a single owner responsible for running their destructor. Values are moved by default (not in the C++ sense where a move constructor replaces the old value with an empty destructible object, but in the sense where the object is simply memcpy'd and the old location becomes statically inaccessible) so they continue to have a single owner, though values that can be safely duplicated can be marked as available for copying.
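
A tiny sketch of what that looks like in practice:

Code:
fn main() {
    let a = String::from("owned buffer"); // `a` is the single owner
    let b = a;                            // ownership moves to `b` (a shallow memcpy)
    // println!("{}", a);                 // error: `a` was moved and is statically inaccessible

    // Types marked Copy (like integers) are duplicated instead of moved:
    let x = 5;
    let y = x;
    println!("{} {} {}", b, x, y);
} // `b` is dropped exactly once, here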

Pointers to these objects are marked with lifetime regions, which they are statically prevented from outliving. Functions accepting or returning pointers are generic on those lifetimes so you can say something like "fn foo<'a>(x: &'a X) -> &'a Y", meaning "given a lifetime 'a where the value pointed to by x lives at least that long, return a pointer to another value that also lives at least that long." In simple cases like this, lifetimes are inferred to avoid proliferating too much line noise (in this example, the only other sane option for the return value's lifetime is 'static, meaning it lives forever, but that's not very common and can thus be specified explicitly if necessary).
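
Fleshed out into a compilable sketch (X and Y are stand-in types made up for the example):

Code:
struct Y(u32);
struct X(Y);

// Fully explicit form of the signature above: the returned &Y cannot
// outlive the X it was borrowed from.
fn foo<'a>(x: &'a X) -> &'a Y {
    &x.0
}

fn main() {
    let y_ref;
    {
        let x = X(Y(42));
        y_ref = foo(&x);
        println!("{}", y_ref.0);
    } // `x` is dropped here...
    // println!("{}", y_ref.0); // ...so using `y_ref` here fails to compile
}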

At any one time, a value can have either 1) only a single mutable reference, or 2) many immutable references. This is not, at first glance, strictly related to memory management, but IIRC it is necessary to prevent the above invariants from being invalidated accidentally by separate parts of the program. However, it also happens to statically prevent things like iterator invalidation, makes data races impossible, and makes things easier to optimize, so it's not too much of a pain.
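
For example:

Code:
fn main() {
    let mut v = vec![1, 2, 3];

    // Rule 2: any number of immutable references may coexist...
    let r1 = &v;
    let r2 = &v;
    println!("{} {}", r1[0], r2[1]);

    // Rule 1: ...but a mutable reference must be the only one.
    let m = &mut v; // fine: r1/r2 are no longer used after this point
    m.push(4);

    // Iterator invalidation is rejected at compile time:
    // for x in &v { v.push(*x); } // error: can't borrow `v` mutably
    //                             // while the loop holds it immutably
    println!("{:?}", v);
}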

Finally, if you need something outside of this pattern (pretty much the usual way of managing memory in C or C++, but enforced by the compiler), you (or a library) can create new types that mark parts of their implementation as "unsafe," essentially telling the compiler that the library author has verified their safety manually. This enables things like shared ownership through reference counted pointers, multiple mutable references through things like locks (or their single-threaded equivalents), or even integration with garbage collectors in other runtimes.
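
For instance, Rc plus RefCell (both built on "unsafe" internals, both exposing a safe interface) give you shared ownership with the single-writer rule checked at runtime instead of compile time:

Code:
use std::cell::RefCell;
use std::rc::Rc;

fn main() {
    // Two owners of the same vector; RefCell moves the "one mutable
    // reference at a time" check from compile time to runtime.
    let log = Rc::new(RefCell::new(Vec::<String>::new()));
    let log2 = Rc::clone(&log); // second owner, same vector

    log.borrow_mut().push("from owner 1".to_string());
    log2.borrow_mut().push("from owner 2".to_string());

    // Holding two simultaneous borrow_mut()s would panic at runtime --
    // the dynamic equivalent of the static borrow check.
    println!("{:?}", log.borrow());
}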

So in the end the question is whether you want just memory management to be automatic, or all resource management, and whether the extra language complexity is worth it in your language's target niche.

Re: Automatic memory management

Posted: Sat Nov 21, 2015 8:52 pm
by Brendan
Hi,
Roman wrote:I'm quite interested in your thoughts on this, and on memory management in general.
My thoughts on memory management are that traditional approaches like "malloc()" are broken, because:
  • it's virtually impossible to get good locality (e.g. if some pieces of data are used frequently and others aren't, there's no way to ensure the frequently used pieces are packed into the same area to reduce cache misses and TLB misses)
  • the ability to specify alignment should have been built in, but wasn't, and became an ugly and/or non-standard afterthought
  • there's no way to set quotas (e.g. when there isn't much virtual address space remaining, refuse to allocate memory for some things to ensure there's always enough space to allocate other, more important things)
  • "free()" returns nothing, making it impossible to return errors (e.g. when you free something twice, free something that was never allocated, etc)
  • there's no sane way to allow inspection (e.g. get a list of allocated memory areas with descriptions of what they are and where they were allocated), which makes tracking down memory leaks hard
In general; garbage collection is far worse - it doesn't do anything to fix the majority of problems with "malloc()" (the only problem it avoids is the "free() returns nothing" problem); and just adds more problems:
  • memory that could've/should've been freed when it was no longer needed stays around until the garbage collector collects it, which increases the average amount of RAM consumed by a process; which makes locality worse and means that memory can't be used by other processes or by the OS (e.g. file system caches, etc).
  • garbage collection adds overhead, often in the form of cache/TLB thrashing "lag spikes of doom".
  • garbage collection ruins swap space management because everything touched by the garbage collector is considered "recently used" when the OS is trying to figure out which page/s to send to swap space
  • programmers that have no idea what's going on behind the scenes tend to write worse code, and automatic memory management encourages this
My thoughts are that the new problems that garbage collection causes are unsolvable; and the original problems with "malloc()" and "free()" would be relatively trivial to fix. For example, you could have a "createPool(maxPoolSize)" and "malloc(pool, size, alignment, description)" and return errors from "free()"; and you could provide facilities to inspect a pool (e.g. to iterate over the metadata for each allocated area in a specific pool, to count the number of things with the same description, to return the amount of space left in the pool, etc) and make it easy to detect and fix any leaks.
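
As a very rough sketch of the shape such an interface could take (the names and the handle-based bookkeeping are made up for illustration; a real version would hand out real memory rather than IDs):

Code:
use std::collections::BTreeMap;

#[derive(Debug)]
enum FreeError {
    NeverAllocated, // freeing something this pool never handed out
    DoubleFree,     // freeing something already freed
}

struct Allocation {
    size: usize,
    alignment: usize,
    description: &'static str,
    live: bool,
}

struct Pool {
    max_size: usize, // the quota
    used: usize,
    next_id: u64,
    allocations: BTreeMap<u64, Allocation>,
}

fn create_pool(max_size: usize) -> Pool {
    Pool { max_size, used: 0, next_id: 0, allocations: BTreeMap::new() }
}

impl Pool {
    // Returns None when the pool's quota would be exceeded.
    fn alloc(&mut self, size: usize, alignment: usize, description: &'static str) -> Option<u64> {
        if self.used + size > self.max_size {
            return None;
        }
        self.used += size;
        let id = self.next_id;
        self.next_id += 1;
        self.allocations.insert(id, Allocation { size, alignment, description, live: true });
        Some(id)
    }

    // Unlike free(), this reports bad frees instead of silently corrupting state.
    fn free(&mut self, id: u64) -> Result<(), FreeError> {
        match self.allocations.get_mut(&id) {
            None => Err(FreeError::NeverAllocated),
            Some(a) if !a.live => Err(FreeError::DoubleFree),
            Some(a) => {
                a.live = false;
                self.used -= a.size;
                Ok(())
            }
        }
    }

    // Inspection: walk the live allocations to hunt for leaks.
    fn dump(&self) {
        for (id, a) in self.allocations.iter().filter(|(_, a)| a.live) {
            println!("#{id}: {} bytes, align {}, \"{}\"", a.size, a.alignment, a.description);
        }
    }
}

fn main() {
    let mut pool = create_pool(1024);
    let a = pool.alloc(256, 16, "parser scratch").unwrap();
    let _b = pool.alloc(512, 64, "token table").unwrap();
    pool.free(a).unwrap();
    assert!(matches!(pool.free(a), Err(FreeError::DoubleFree)));
    pool.dump(); // only "token table" is still live -- a would-be leak
}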

Also note that there is another problem that you might want to consider - the proliferation of languages. When you've got many languages for many niches you create artificial barriers that prevent programmers from understanding and/or using each other's work. This destroys things like "many eyes" and causes a huge amount of wasted effort (e.g. 100 different people writing 100 libraries that all do the same thing in 100 different languages). The benefits of niche languages (for their niche and nobody else) aren't as important as the benefits of a common language. Basically; it should be possible to use a language for kernels and high performance software and then (possibly with some "convenience" libraries) use that same language for rapid development of things like temporary scripts; and in the same way it should be possible to quickly slap together a crude prototype and then (one piece at a time) evolve the prototype into a high quality finished product.


Cheers,

Brendan

Re: Automatic memory management

Posted: Sun Nov 22, 2015 7:38 am
by embryo2
Brendan wrote:In general; garbage collection is far worse - it doesn't do anything to fix the majority of problems with "malloc()" (the only problem it avoids is the "free() returns nothing" problem);
Why should we trade the new value for a mediocre, old-fashioned solution? There's no "it improves this piece of the $hit" approach; the new thing simply replaces the mediocre thing entirely. The new value is a collection of new things, and if it's mixed with old problem-solving approaches the value will be lost.
Brendan wrote:and just adds more problems:
  • memory that could've/should've been freed when it was no longer needed stays around until the garbage collector collects it, which increases the average amount of RAM consumed by a process; which makes locality worse and means that memory can't be used by other processes or by the OS (e.g. file system caches, etc).
It's the trade of "a bit of efficiency for new value". The value here is programmer productivity. The efficiency increase is just a matter of time, but the productivity is already here.
Brendan wrote:
  • garbage collection adds overhead, often in the form of cache/TLB thrashing "lag spikes of doom".
It should be looked at from the global perspective of memory management, which includes compile-time decisions as well. The latter are very important, and you seem to miss them.
Brendan wrote:
  • garbage collection ruins swap space management because everything touched by the garbage collector is considered "recently used" when the OS is trying to figure out which page/s to send to swap space
Swap is a very ugly thing of the past. It ruins the uninterrupted flow of the user's work; it should just be removed.
Brendan wrote:
  • programmers that have no idea what's going on behind the scenes tend to write worse code, and automatic memory management encourages this
Programmers that have an idea of what's going on behind the scenes tend to think about how to liberate themselves from the need to think about tedious details, and automatic memory management not only encourages this, but really helps in the process. And the final goal is not a 1% performance increase, but 100% productivity growth.
Brendan wrote:Also note that there is another problem that you might want to consider - the proliferation of languages. When you've got many languages for many niches you create artificial barriers that prevent programmers from understanding and/or using each other's work. This destroys things like "many eyes" and causes a huge amount of wasted effort (e.g. 100 different people writing 100 libraries that all do the same thing in 100 different languages).
I agree. But there are also larger divides in the language landscape. It is mostly the safe vs. unsafe discussion. And there it's not 100 different languages, but just two different approaches.

Re: Automatic memory management

Posted: Sun Nov 22, 2015 8:54 am
by Brendan
Hi,
embryo2 wrote:
Brendan wrote:In general; garbage collection is far worse - it doesn't do anything to fix the majority of problems with "malloc()" (the only problem it avoids is the "free() returns nothing" problem);
Why should we trade the new value for a mediocre, old-fashioned solution? There's no "it improves this piece of the $hit" approach; the new thing simply replaces the mediocre thing entirely. The new value is a collection of new things, and if it's mixed with old problem-solving approaches the value will be lost.
A good engineer looks at advantages and disadvantages (and tries to increase the advantages while decreasing disadvantages). Age has nothing to do with this, and the fact that garbage collection is ancient (dating back to 1959) doesn't make it bad.
embryo2 wrote:
Brendan wrote:and just adds more problems:
  • memory that could've/should've been freed when it was no longer needed stays around until the garbage collector collects it, which increases the average amount of RAM consumed by a process; which makes locality worse and means that memory can't be used by other processes or by the OS (e.g. file system caches, etc).
It's the trade of "a bit of efficiency for new value". The value here is programmer productivity. The efficiency increase is just a matter of time, but the productivity is already here.
It's a trade - slightly faster development time for low-quality software that people wish had never existed.
embryo2 wrote:
Brendan wrote:
  • garbage collection adds overhead, often in the form of cache/TLB thrashing "lag spikes of doom".
It should be looked at from the global perspective of memory management, which includes compile-time decisions as well. The latter are very important, and you seem to miss them.
Dynamic memory management excludes compile-time decisions by definition (anything allocated at compile time is not dynamically allocated).
embryo2 wrote:
Brendan wrote:
  • garbage collection ruins swap space management because everything touched by the garbage collector is considered "recently used" when the OS is trying to figure out which page/s to send to swap space
Swap is a very ugly thing of the past. It ruins the uninterrupted flow of the user's work; it should just be removed.
It takes a massive misunderstanding of the benefits of (correctly implemented) swap space to make a misinformed statement like that.
embryo2 wrote:
Brendan wrote:
  • programmers that have no idea what's going on behind the scenes tend to write worse code, and automatic memory management encourages this
Programmers that have an idea of what's going on behind the scenes tend to think about how to liberate themselves from the need to think about tedious details, and automatic memory management not only encourages this, but really helps in the process. And the final goal is not a 1% performance increase, but 100% productivity growth.
Programmers that "liberate themselves from the need to think about tedious details" tend to completely oblivious to how incompetent they've become.
embryo2 wrote:
Brendan wrote:Also note that there is another problem that you might want to consider - the proliferation of languages. When you've got many languages for many niches you create artificial barriers that prevent programmers from understanding and/or using each other's work. This destroys things like "many eyes" and causes a huge amount of wasted effort (e.g. 100 different people writing 100 libraries that all do the same thing in 100 different languages).
I agree. But there are also larger divides in the language landscape. It is mostly the safe vs. unsafe discussion. And there it's not 100 different languages, but just two different approaches.
2 different approaches (with 50 different languages for the "safe" approach and another 50 different languages for the "unsafe" approach) is nothing like 100 different languages! :roll:

Note that this has nothing at all to do with safe vs. unsafe - you can do explicit memory management with either approach, and you can do garbage collection with either approach.


Cheers,

Brendan

Re: Automatic memory management

Posted: Sun Nov 22, 2015 12:49 pm
by Boris
I'm curious about "smart" usage of swap space (besides extending physical RAM).
Do you have cases where it's better to swap a physical RAM page out to disk and reuse it for another purpose, rather than using a free physical page?

Re: Automatic memory management

Posted: Sun Nov 22, 2015 4:40 pm
by FallenAvatar
Boris wrote:I'm curious about "smart" usage of swap space (besides extending physical RAM).
Do you have cases where it's better to swap a physical RAM page out to disk and reuse it for another purpose, rather than using a free physical page?
When a user closes an application they use frequently, you can swap out the app's current state instead of truly closing it, so the app can start up more quickly (and right where the user left off) without the app developer needing to do anything special.

- Monk

Re: Automatic memory management

Posted: Sun Nov 22, 2015 11:18 pm
by Brendan
Hi,
Boris wrote:I'm curious about "smart" usage of swap space (besides extending physical RAM).
Do you have cases where it's better to swap a physical RAM page out to disk and reuse it for another purpose, rather than using a free physical page?
Let's forget about swap space and think about "ideal RAM contents", and things like pre-fetching.

Imagine there is:
  • 1 GiB of temporary data used by running applications (stack, heap, etc)
  • 1234 GiB of files on that computer's file system (including executable files that are currently running)
  • 8 GiB of domain names that the OS could cache
  • 100 EiB of static files on the internet that the OS could cache
Let's call this "the set of all data".

Now imagine if you split all that data into 4 KiB pages, and give each page a score representing the probability it will be needed soon. If the computer's RAM can store 1 million pages, then "ideal RAM contents" is when the 1 million pages that have the highest probability of being needed soon are in RAM. Of course maximum performance is achieved when "ideal RAM contents" is achieved (e.g. everything pre-fetched before it's needed and no delays for any IO).
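
The selection step itself is trivial once you pretend the scores exist (computing good scores is the actual hard part); a toy sketch:

Code:
// A toy version of the selection step: given a score ("probability this
// page is needed soon") per page, keep the top `ram_pages` in RAM. The
// scoring itself is the hard part and is simply assumed here.
fn ideal_ram_contents(mut pages: Vec<(u64, f64)>, ram_pages: usize) -> Vec<u64> {
    // (page id, score); sort by descending score
    pages.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    pages.into_iter().take(ram_pages).map(|(id, _)| id).collect()
}

fn main() {
    let pages = vec![(0, 0.9), (1, 0.05), (2, 0.6), (3, 0.3)];
    // With room for 2 pages, pages 0 and 2 belong in RAM.
    assert_eq!(ideal_ram_contents(pages, 2), vec![0, 2]);
}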

This means that (e.g.) if there's 100 KiB of data that the GUI used during initialisation and you know it's very unlikely it will be needed again, and if you know that the user that just logged in happens to go to Wikipedia fairly often; then you want to send those "unlikely to be used soon" GUI pages to swap space just so that you can pre-fetch "more likely to be used soon" data from Wikipedia (so if/when the user wants the data it's in RAM before they ask for it).

However; it's not this simple - disk and network bandwidth limits combined with frequently changing "most likely to be needed" scores mean that the OS will probably never reach the "ideal RAM contents" state. The other problem is that it takes a lot of work just to determine which pages are most likely to be needed (e.g. you'd have to keep track of a huge amount of things, like access patterns for all software, end user habits, etc) and you need to compromise between the complexity/overhead of calculating the scores and the quality of the scores. Basically; the goal of the OS is to do the best job it can within those constraints - constantly trying to get closer to (a less than perfect approximation of) the "ideal RAM contents" state but never achieving it; and only being able to get closer to maximum performance.

Now, swap space...

Getting "most likely to be needed" data into RAM means getting "less likely to be needed" data out of RAM to make space. If the "less likely to be needed" data has been modified, then the only place you can put it is swap space. If the "less likely to be needed" data has not been modified then you could get it from wherever it originally came from and don't need to store it in swap space, but if swap space is faster (or more reliable?) than wherever the data originally came from then you might want to store the data in swap space anyway.

Basically swap space is just one of the pieces of the puzzle; but without it you can't improve performance (e.g. using pre-fetching to avoid IO delays) anywhere near as much.

Now let's think about how to design swap space support...

One of the things people often don't realise is that modern RAM chips have their own power saving. If you've got a computer with a pair of 4 GiB RAM chips and only use the first RAM chip, then the other RAM chip can go into a low power mode. If we put all the "most likely to be used" data in the first RAM chip and all the "slightly less likely to be used" data in the second RAM chip; then that second RAM chip can spend a lot more time in its low power mode; and that means longer battery life for mobile systems (and servers that last longer on emergency UPS power), and less heat, less fan noise, etc. Fortunately an OS typically has the logic needed for this - just use the second RAM chip for swap space (or more correctly; half of each NUMA domain's RAM as swap space).

If we're using RAM as swap space anyway, then we can try to compress data before storing it in "RAM swap". With 4 GiB of RAM being used for "RAM swap", half of the pages might not compress well and half might compress to 50% (or less) of their original size; so we might be able to store 6 GiB of data in "RAM swap" (2 GiB of it holding 2 GiB of incompressible data, and the other 2 GiB holding 4 GiB of data compressed 2:1).

The next thing you're going to want is a tiered system of swap providers. Maybe the video card has 1 GiB of video memory and is only using a quarter of it, so it offers 768 MiB of swap space; maybe the SSD drive has a swap partition and provides another 10 GiB of swap space; and maybe there's a mechanical hard drive with another swap partition that offers another 20 GiB of swap space. They're all different speeds. Obviously you want to use the swap space in order, so that more likely to be used data (in swap space) gets stored by the fastest swap provider and least likely to be used data gets stored by the slowest swap provider. The "RAM swap" is fastest so it's first, and if you run out of space in "RAM swap" you transfer (potentially already compressed) data from "RAM swap" to "video display memory swap". This means that for a total of "4+0.75+10+20=34.75" GiB of physical swap space you might store almost 50 GiB of data due to compression.

Of course when you're transferring data from a faster swap provider to a slower swap provider, if the original data wasn't modified (and swap is just being used for caching) then you'd check if the slower swap provider is faster or slower than the original supplier. For a file that came from SSD, it makes sense to cache it in "video display memory swap" to improve performance, but doesn't make any sense to cache it on "mechanical hard disk swap" that's slower.
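
Put together, the swap-out decision might look something like this sketch (the names and relative speed numbers are invented for illustration):

Code:
// Sketch of the tiered swap-out decision described above. Providers are
// ordered fastest-first; a clean (unmodified) page is only written to a
// tier that beats its original backing store, otherwise it's dropped.
struct Tier {
    name: &'static str,
    speed: u32, // bigger = faster (relative units)
    free_pages: u32,
}

// Returns the tier to store the page in, or None (drop it and re-read
// it from the original source later).
fn swap_out(tiers: &mut [Tier], dirty: bool, origin_speed: u32) -> Option<&'static str> {
    for t in tiers.iter_mut() {
        if t.free_pages == 0 {
            continue; // cascade to the next (slower) provider
        }
        // A dirty page must go *somewhere*; a clean one is only worth
        // caching in a tier faster than where it originally came from.
        if dirty || t.speed > origin_speed {
            t.free_pages -= 1;
            return Some(t.name);
        }
        return None; // tiers are fastest-first, so the rest are slower still
    }
    None
}

fn main() {
    let mut tiers = [
        Tier { name: "RAM swap", speed: 100, free_pages: 0 }, // already full
        Tier { name: "video memory swap", speed: 50, free_pages: 1000 },
        Tier { name: "SSD swap", speed: 10, free_pages: 10000 },
    ];
    // A dirty page must be kept: it lands in the fastest tier with space.
    assert_eq!(swap_out(&mut tiers, true, 10), Some("video memory swap"));
    // A clean page that came from an SSD (speed 10): caching it in faster
    // video memory (speed 50) is a win...
    assert_eq!(swap_out(&mut tiers, false, 10), Some("video memory swap"));
    // ...but a clean page whose origin is as fast as anything with space
    // left is simply dropped.
    assert_eq!(swap_out(&mut tiers, false, 50), None);
}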

Now; let's go all the way back to the start (to "ideal RAM contents"). Imagine if you split all that (> 100 EiB) "set of all data" into 4 KiB pages, and give each page a score representing the probability it will be needed soon. The pages with the highest scores go in RAM, the pages with the next highest scores go in "RAM swap", the pages with the next highest scores go in "video display memory swap", and so on. This gives you "ideal RAM+swap contents". Maximum performance is achieved when "ideal RAM+swap contents" is achieved.

However; it's not that simple (disk and network bandwidth limits and.... ); but the goal of the OS is to do the best job it can; constantly trying to get closer to (a less than perfect approximation of) the "ideal RAM+swap contents" state but never achieving it; and only being able to get closer to maximum performance.


Cheers,

Brendan

Re: Automatic memory management

Posted: Mon Nov 23, 2015 6:18 am
by Octocontrabass
Brendan wrote:One of the things people often don't realise is that modern RAM chips have their own power saving. If you've got a computer with a pair of 4 GiB RAM chips and only use the first RAM chip, then the other RAM chip can go into a low power mode.
What about interleaving?

Re: Automatic memory management

Posted: Mon Nov 23, 2015 8:46 am
by Brendan
Hi,
Octocontrabass wrote:
Brendan wrote:One of the things people often don't realise is that modern RAM chips have their own power saving. If you've got a computer with a pair of 4 GiB RAM chips and only use the first RAM chip, then the other RAM chip can go into a low power mode.
What about interleaving?
There's a few cases where you're not going to get power saving (e.g. only one RAM chip or 2 RAM chips that are interleaved). For the majority of the cases there's either no interleaving or there's some interleaving that doesn't matter (e.g. RAM chips 1 & 2 are interleaved, and RAM chips 3 & 4 are interleaved).


Cheers,

Brendan

Re: Automatic memory management

Posted: Mon Nov 23, 2015 8:54 am
by embryo2
Brendan wrote:A good engineer looks at advantages and disadvantages (and tries to increase the advantages while decreasing disadvantages).
Automatic memory management gives us the productivity advantage. It also reduces or complicates things like flexibility and speed. But with the help of a few additional architectural decisions it is possible to minimize the drawbacks while keeping the advantage intact. So, we have the advantage (and we have it now) and we have some gradually decreasing (as time passes) disadvantages. Our future world will have the advantage and no disadvantages (they'll be minimized to irrelevance). Your future world has no advantage.
Brendan wrote:It's a trade - slightly faster development time for low-quality software that people wish had never existed.
People buy Android and Windows, use a Java-based internet, and work in companies whose information infrastructure is run by Java application servers. But yes, some people see such a world as something disgusting.
Brendan wrote:Dynamic memory management excludes compile-time decisions by definition (anything allocated at compile time is not dynamically allocated).
If something can be allocated on the stack, that greatly reduces garbage collection overhead. And it's a compile-time decision.
Brendan wrote:Programmers that "liberate themselves from the need to think about tedious details" tend to be completely oblivious to how incompetent they've become.
It's just a matter of experience, not the technology. An experienced developer can use it efficiently and be productive at the same time, while the alternative just allows some efficiency without any productivity advantage.
Brendan wrote:2 different approaches (with 50 different languages for the "safe" approach and another 50 different languages for the "unsafe" approach) is nothing like 100 different languages! :roll:
Yes, 2 is less than 100. And even if there were 10,000 languages per approach, the number of approaches would still be manageable.
Brendan wrote:Note that this has nothing at all to do with safe vs. unsafe - you can do explicit memory management with either approach, and you can do garbage collection with either approach.
So now it should be obvious to you that automatic memory management can help even in the case of your "very efficient" OS. At least, it can be used with your OS.

Re: Automatic memory management

Posted: Mon Nov 23, 2015 9:11 am
by embryo2
Brendan wrote:Let's forget about swap space and think about "ideal RAM contents", and things like pre-fetching.
Your approach is about keeping things as close as possible. Generally, it looks good. But the devil is in the details.

First, you need information: information about what is really needed and when. But your approach is so general (being at the OS level) that it prevents you from having a lot of the required information.

Let's remember how programs work. They load, process and store some data. And here it seems your approach should work; it just keeps things a bit closer to the application. But who on earth should know which data fragments are more or less important to the application? My bet is on the application developer. But yours, it seems, is on the OS developer. Here is just one question: how can the OS developer know more than the application developer?

And while the application developer is aware of his application's needs, for him making the optimal decision is just a matter of experience and available libraries. For the OS developer it's a matter of voodoo magic or very inaccurate statistics.

So, second: it's much (much, much, much) better to ask the application developer what information is needed and when. Only after the application developer provides the required information does it become viable to help him get that information as soon as possible, for every information priority setting.

And yes, it's not about traditional swapping. It's about traditional caching.

Re: Automatic memory management

Posted: Mon Nov 23, 2015 10:50 am
by Brendan
Hi,
embryo2 wrote:
Brendan wrote:Let's forget about swap space and think about "ideal RAM contents", and things like pre-fetching.
Your approach is about keeping things as close as possible. Generally, it looks good. But the devil is in the details.

First, you need information: information about what is really needed and when. But your approach is so general (being at the OS level) that it prevents you from having a lot of the required information.

Let's remember how programs work. They load, process and store some data. And here it seems your approach should work; it just keeps things a bit closer to the application. But who on earth should know which data fragments are more or less important to the application? My bet is on the application developer. But yours, it seems, is on the OS developer. Here is just one question: how can the OS developer know more than the application developer?
What application?

I turn a computer on, then go to the toilet, then make a cup of coffee, then rummage through the fridge and grab something to eat. By the time I get back to the computer maybe it's been booted and running for 10 minutes. From past behaviour it should know only one user logs on, and it should know it's very likely that user is going to use a web browser sooner or later; so it should have pre-fetched most of the files that the web browser would want. It should also pre-fetch the IP addresses for some frequently used domain names (osdev.org, google.com, wikipedia.org); plus files for a few other applications (PDF viewer, text file editor) and some other things (my project's source files, NASM, GCC, etc). It should start doing all of this before I even log on (before any application is started at all).

If I was playing a game that gobbles a lot of RAM and then exit the game; the OS should notice there's suddenly a lot of "under-used"/free RAM and start pre-fetching. If I happen to start the web browser while it's pre-fetching the browser's executable the OS just bumps that file up to "high priority".

Note that Windows has supported something roughly like this since Vista (they call it "SuperFetch"), and Apple's OS X also supports something roughly like this (they call it "Task Working Set").
embryo2 wrote:And while the application developer is aware of his application's needs, for him making the optimal decision is just a matter of experience and available libraries. For the OS developer it's a matter of voodoo magic or very inaccurate statistics.
Nothing prevents an application from also doing its own pre-fetching (which is also beneficial if done right).
embryo2 wrote:So, second: it's much (much, much, much) better to ask the application developer what information is needed and when. Only after the application developer provides the required information does it become viable to help him get that information as soon as possible, for every information priority setting.
You start a text editor, and it notices you're editing source code of some kind and (because it's been profiling user behaviour in the background when it's not even running) the text editor decides to start pre-fetching the compiler? While you're using a spreadsheet, it notices that it's getting late at night and "knows" that at this time of day you normally check your email before turning the computer off, so it starts pre-fetching your email client?
embryo2 wrote:And yes, it's not about traditional swapping. It's about traditional caching.
It's about interactions between many pieces (where swap is just one of the pieces involved). If you only look at swap in isolation then you end up with naive misconceptions.


Cheers,

Brendan

Re: Automatic memory management

Posted: Tue Nov 24, 2015 8:45 am
by embryo2
Brendan wrote:What application?
Database, browser, game.

And there are also Notepad, Calculator, and Minesweeper.

For the first list it is important to have tools an application can use to cache its data or code. For the second list, maybe it is better if the OS takes care of the caching. If there are some common libraries (used by Notepad and Calculator and many other applications) then the library itself can optimize the caching better than the OS can. But that requires some change in the way the libraries are created.

However, some common things are suitable for the OS. And the OS can improve the application's performance even more if it has additional information, such as what an AOT or JIT compiler can provide. That means the OS is destined to include features that help it "manage" the application. Yes, in the end it's a managed environment.
Brendan wrote:From past behaviour it should know only one user logs on, and it should know it's very likely that user is going to use a web browser sooner or later; so it should have pre-fetched most of the files that the web browser would want.
Prefetching "most of the files" is inefficient. The browser can help and tell the OS "I need this and this". But of course, there can be some cooperation.
Brendan wrote:It should also pre-fetch the IP addresses for some frequently used domain names
The effect of such a prefetch is negligible. And there should also be some service that frequently refreshes the prefetched data to detect changes; such a service requires some memory. The actually prefetched data requires some memory too, and the indexes and usage-pattern statistics require still more. The final result will be less memory in exchange for a 100 ms gain in page load speed.

I think it would be useful for you to look at databases and the way they use memory for caches. It's not a simple story. But your idea is, in essence, about building a database.
Brendan wrote:If I was playing a game that gobbles a lot of RAM and then exit the game; the OS should notice there's suddenly a lot of "under-used"/free RAM and start pre-fetching.
Hopefully it starts the prefetching at some background priority. And also hopefully I can tell the OS what I do and don't want prefetched (a bit of extra control over the situation).
Brendan wrote:If I happen to start the web browser while it's pre-fetching the browser's executable the OS just bumps that file up to "high priority".
There are such things as schedulers. The user, or the scheduler's developer, still knows better what it is for and how to optimize it. Even more: because a scheduled task runs rarely (once per day, while I run a browser many times a day) it can be in a "poorly prefetched" state, and when it starts, its disk usage can break my uninterrupted work in the browser. That's actually the case for some antivirus software and its scheduled updates on my PC (it also wants to do everything it can at boot time, when the OS seems to let you start working, but there's this ugly thing running...).
Brendan wrote:Note that Windows has supported something roughly like this since Vista (they call it "SuperFetch")
I know it. And the prefetch does not work "as expected". Maybe there are some other problems with Windows and it's actually not the prefetcher's sin, but something starts periodically and slows things down in a very noticeable manner. In the end I just turned prefetching off. At least now I can control what happens and when. But the general direction of prefetching development in different OSes is good.
Brendan wrote:You start a text editor, and it notices you're editing source code of some kind and (because it's been profiling user behaviour in the background when it's not even running) the text editor decides to start pre-fetching the compiler?
In fact I do not use a great mix of applications for development; it's just Eclipse, and most of the time there's nothing else. But yes, sometimes an external watching entity can speed things up. The only problem here is the actual value we can get from it. A speed increase of 100 ms when the start time is 1 second is not visible. But if it's something like "from 15 seconds to 3 seconds", then yes, it's interesting. And again, if it buys start-up time at the price of interrupting my work with other software, then I prefer not to make such a trade.
Brendan wrote:It's about interactions between many pieces (where swap is just one of the pieces involved). If you only look at swap in isolation then you end up with naive misconceptions.
Well, the initial question was about the swap use cases.

But cooperation between the applications and the OS is something I really support. And the management of an application's life cycle can be performed optimally if the OS has more information about the application, be it annotations, AOT/JIT output, or anything else related to the OS's ability to introspect the code and its usage.