Hi,
AUsername wrote: I was under the impression that pagefiles are to only be used when required by the system in order to continue operation.
That's not quite right. For performance, RAM should be used to store whatever is most likely to be needed (regardless of what it is).
If there's 4 GiB of usable RAM and software is using 3 GiB for code and data; then "whatever is most likely to be needed" may be 2 GiB of software's code and data plus 2 GiB of file system cache; where the remaining 1 GiB of software's code and data is "unlikely to be needed" and is sent to swap space to make room for "more likely to be needed" file system cache.
Of course it's not that simple either - you can't reliably predict the future, and there's different costs involved with sending different pages of RAM to disk.
Instead of predicting the future you estimate it (typically based on "least recently used"; although there are other ways that are better for less common access patterns, and there are also more complex and more accurate ways involving tracking previous access patterns; and in any case you might detect things like sequential reads and use that to prefetch data). Something else to consider is shared memory areas (e.g. the estimated chance of a shared page being needed is the sum of the estimated chances of each task that shares the page needing it).
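As a rough sketch of the "least recently used" style of estimating; here's the classic "aging" approximation, where each page has a small counter that's shifted right periodically and the hardware "accessed" bit is ORed into the top bit (a higher counter means "accessed more recently", which is treated as "more likely to be needed soon"). The page structure and the "read_and_clear_accessed_bit()" helper are made up for the example, not code from any real OS.

Code:
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    uint8_t age;            /* higher = accessed more recently */
    /* ... other per-page data ... */
} page_t;

/* Assumed helper: reads the page's "accessed" bit from its page table
   entry and clears it, so the next interval starts fresh. */
bool read_and_clear_accessed_bit(page_t *page);

/* Called periodically (e.g. from a timer) for every page of RAM */
void update_page_age(page_t *page) {
    page->age >>= 1;                          /* older accesses matter less */
    if (read_and_clear_accessed_bit(page)) {
        page->age |= 0x80;                    /* accessed during this interval */
    }
}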
For "different costs" there's 3 categories: "full cost", "half cost" and "no cost".
Normal read-write pages that were modified are almost always "full cost"; where there's no copy of the data on disk, so the data needs to be written to swap space and then read back from swap space if/when it's needed.

However, most OSs have something I call "allocate on demand", where there's a special page full of zeros that's mapped many times into many address spaces as read-only. When software attempts to write to a "read-only page of zeros" the page fault handler allocates a new page (and fills it with zeros) and maps the new page into the address space as read-write. This creates the illusion that large areas of the address space/s (e.g. the ".bss" section) are read-write and full of zeros without actually using real RAM until/unless it's necessary.

So; normal read-write pages that were modified are almost always "full cost"; but if software fills several pages with zeros (e.g. using "memset()" or maybe "calloc()") then you'd end up with modified read-write pages that happen to be full of zeros, and the OS could detect this and convert them back to "allocate on demand" (instead of doing the "full cost" thing and writing the page/s to disk). This is "no cost" (because the original data can be obtained without any disk I/O at all).
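As a rough sketch of detecting that "no cost" case (with made up helper functions, not code from any real OS); before writing a dirty page to swap you'd check whether it happens to be full of zeros, and if it is you'd map the shared read-only zero page in its place instead of doing any disk I/O:

Code:
#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>

#define PAGE_SIZE 4096

static bool page_is_all_zero(const void *virt_addr) {
    const uint64_t *p = virt_addr;
    for (size_t i = 0; i < PAGE_SIZE / sizeof(uint64_t); i++) {
        if (p[i] != 0) {
            return false;
        }
    }
    return true;
}

/* Hypothetical kernel helpers */
void map_zero_page_readonly(void *virt_addr);
void unmap_page(void *virt_addr);
void write_page_to_swap(void *virt_addr);
void free_phys_frame(uintptr_t phys_frame);

void evict_dirty_page(void *virt_addr, uintptr_t phys_frame) {
    if (page_is_all_zero(virt_addr)) {
        /* "no cost": convert back to "allocate on demand", no disk I/O */
        map_zero_page_readonly(virt_addr);
    } else {
        /* "full cost": the data has to be written to swap space first */
        write_page_to_swap(virt_addr);
        unmap_page(virt_addr);
    }
    free_phys_frame(phys_frame);
}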
For "half cost", any page that hasn't been modified must have come from somewhere. Often it came from disk, and often there's still a copy of it on disk (a simple example is memory mapped files). In this case you don't need to write the data to disk again, and if/when the page is needed again you can read it from wherever it originally came from (e.g. for memory mapped files you can read the data from the original file on disk). For a less obvious example; when a page of data in swap space is needed the OS reads it into RAM from swap space; but the OS could try to leave a copy of it in swap space for as long as possible. If the page isn't modified after it's loaded from swap space, and if there's still a copy of it in swap space, then that page doesn't need to be written to disk again.
Also note that (especially for memory mapped files) there's different costs involved in reading data from disk, because different devices have different seek times and different transfer times (and for swapping the seek times can be a lot more important than transfer times). Because of this, in some cases, "full cost" might end up being faster than "half cost". For example if a memory mapped file is stored on an old slow hard drive, then it might be faster to write it to a newer faster hard drive and read it back again (instead of not writing it anywhere and then reading it from the old slow hard drive if it's needed). The other thing to consider is disk load - e.g. when a disk is idle the cost of reading/writing data might be lower, and when a disk is going flat out trying to keep up then the cost of reading/writing data might be higher. For swap space, you could also take into account the amount of free swap space (e.g. so that if swap space gets close to full the OS starts preferring to reclaim pages of RAM where the data is in the file system).
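As a very rough sketch of weighing those different costs; you could give each disk some average figures plus a "how busy is it" factor and estimate the cost of one page-sized transfer (the structure, the numbers and the formula are only illustrative):

Code:
typedef struct {
    unsigned seek_time_us;       /* average seek time */
    unsigned transfer_time_us;   /* time to transfer one page of data */
    unsigned load_percent;       /* 0 = idle, 100 = going flat out */
} disk_t;

/* Estimated cost (in microseconds) of reading or writing one page */
unsigned page_io_cost(const disk_t *d) {
    unsigned base = d->seek_time_us + d->transfer_time_us;
    /* a busy disk makes every transfer effectively more expensive */
    return base + (base * d->load_percent) / 100;
}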
Now imagine that for every virtual page (in every address space) you calculate a score using:

    score = cost_of_storing_data_somewhere + estimated_chance_of_needing_page_soon * cost_of_estimate_being_wrong

and then try to make sure that pages of RAM contain the data with the highest score. If you use a complex method of estimating the chance of needing a page soon, then this could mean pre-loading data from disk into RAM before it's needed. For example, maybe your OS has noticed that once per hour "cron" starts something that uses certain pages - immediately after this has happened the estimated chance of needing those pages soon is "unlikely" (and the OS can reclaim those pages of RAM immediately), and after 59 minutes and 45 seconds the estimated chance of needing those pages soon is "extremely likely" (and the OS pre-loads the pages before they're actually accessed).
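As a rough sketch (with made up names matching the formula, and a fixed-point fraction for the probability), the score might look like:

Code:
typedef struct {
    unsigned cost_of_storing_data_somewhere;   /* e.g. 0 for "no cost" pages */
    unsigned chance_of_needing_soon_permille;  /* 0..1000 = 0%..100% */
    unsigned cost_of_estimate_being_wrong;     /* cost of fetching the data back */
} vpage_info_t;

unsigned page_score(const vpage_info_t *p) {
    return p->cost_of_storing_data_somewhere
         + (p->chance_of_needing_soon_permille * p->cost_of_estimate_being_wrong) / 1000;
}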
However, this is extremely complex, and most OSs use a much simpler method of estimating which pages will be needed. If you use something like "least recently used" then the estimated chance of a page being needed always decreases unless it's accessed (and you can assume that recently accessed pages are always in RAM). In this case you'd only need to care about pages of RAM (rather than every virtual page). You'd wait until you need to free some page/s of RAM and only then calculate a score for each page of RAM using something like:

    score = cost_of_storing_data_somewhere + estimated_chance_of_needing_page_soon * cost_of_loading_data_if_needed

Obviously if you need to free N pages of RAM you'd select the N pages with the lowest score.
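As a rough sketch of "select the N pages with the lowest score" (a real kernel would probably keep pages on approximately sorted lists rather than sorting an array like this; the partial selection sort here is only to show the idea):

Code:
#include <stddef.h>

typedef struct {
    unsigned score;
    /* ... which page of RAM this entry refers to ... */
} ram_page_t;

/* Move the N lowest-scoring entries to the front of the array */
void select_victims(ram_page_t *pages, size_t count, size_t n) {
    for (size_t i = 0; i < n && i < count; i++) {
        size_t lowest = i;
        for (size_t j = i + 1; j < count; j++) {
            if (pages[j].score < pages[lowest].score) {
                lowest = j;
            }
        }
        ram_page_t tmp = pages[i];
        pages[i] = pages[lowest];
        pages[lowest] = tmp;
        /* pages[i] is now the next eviction victim */
    }
}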
You can also end up with some sort of hybrid arrangement, where something simpler (e.g. "least recently used") is augmented by a separate pre-fetching/pre-loading feature. This might give some of the advantages of more complex methods without so much complexity; but it might also be difficult to coordinate (the worst case would be the same pages being repeatedly loaded into RAM by the pre-fetcher and repeatedly evicted by the "least recently used" code).
On top of all of this; most OSs try to keep some pages free, so that they can respond quickly when a reasonable amount of RAM is allocated. This can mean sending some "unlikely to be used" pages to swap space to ensure that there's a reasonable amount of free pages. For a computer with 2 GiB of RAM, if processes are using 2 GiB of RAM you might have 128 MiB in swap space and 128 MiB of free RAM (rather than no swap space and no free RAM).
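As a rough sketch of keeping some pages free (like the 128 MiB example above); a background thread could watch the free page count and evict the lowest scoring pages whenever it drops below a low watermark (all the helper functions and both watermarks are made up for the example):

Code:
#define LOW_WATERMARK_PAGES   (32 * 1024)   /* e.g. 128 MiB of 4 KiB pages */
#define HIGH_WATERMARK_PAGES  (40 * 1024)   /* stop a little above the minimum */

/* Hypothetical helpers */
unsigned long free_page_count(void);
void *pick_lowest_score_page(void);
void evict_page(void *page);
void sleep_until_memory_pressure(void);

void page_reclaim_thread(void) {
    for (;;) {
        if (free_page_count() < LOW_WATERMARK_PAGES) {
            /* Keep evicting the cheapest/least likely to be needed pages
               until there's a reasonable amount of free RAM again */
            while (free_page_count() < HIGH_WATERMARK_PAGES) {
                evict_page(pick_lowest_score_page());
            }
        }
        sleep_until_memory_pressure();
    }
}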
Cheers,
Brendan