Hi,
onlyonemac wrote:At least once my Linux system has started it's actually ready for use, and not still ridiculously slow because it's trying to impress me by making it look like the web browser can open in half a second (actually the web browser on my Linux system *can* open in half a second, but that's a different matter...).
For my Linux machine, I don't think there's any pre-fetching at all. If the machine boots and isn't used for 5 entire days, everything is still slow the first time you use it (it comes from disk rather than from the file system cache). It's like it's suffering from Alzheimer's disease and is completely surprised when I start the web browser, despite the fact that I've probably started the web browser on this machine 4 times a day (on average) for the last 2000+ days. It's just another reason why Linux is a crippled joke.
SpyderTL wrote:Do you honestly think Windows is so stupid that it'd do a low priority "background prefetch" when there are higher priority "an application needs this ASAP" fetches waiting?
I honestly think that a non-SSD drive can only read one block at a time, and that once a command has been issued there is no way to interrupt it with a higher priority command. And unless the low priority data is right next to the high priority data, you're going to be waiting for the drive head to find the low priority data, read it, and then find your high priority data.
If the OS is smart (and I'm not saying Windows is smart), very low priority reads (for pre-fetching) would be broken up into lots of small operations (e.g. one seek, then read one sector, then read the next sector, and so on) rather than being a single large "seek then read 1234 sectors" operation. That way, even in the worst possible case, if a high priority transfer is requested at any point the time until that higher priority transfer can begin is always negligible (possibly even less than the time it'd take the drive to come out of the power saving state it would've been in if there was no activity).
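As a rough sketch of the idea (high_priority_pending(), disk_read_sector() and requeue_prefetch() are hypothetical driver hooks, stubbed here so the sketch compiles, and not any real API):

#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512

/* Hypothetical driver hooks (stubs only; a real driver would supply these). */
static bool high_priority_pending(void) { return false; }
static void disk_read_sector(uint64_t lba, void *buf) { (void)lba; memset(buf, 0, SECTOR_SIZE); }
static void requeue_prefetch(uint64_t lba, uint64_t count) { (void)lba; (void)count; }

/* Prefetch 'count' sectors starting at 'lba', one sector at a time,
   checking for urgent work between sectors. */
void prefetch_low_priority(uint64_t lba, uint64_t count, uint8_t *buf)
{
    for (uint64_t i = 0; i < count; i++) {
        if (high_priority_pending()) {
            requeue_prefetch(lba + i, count - i);   /* resume later */
            return;
        }
        disk_read_sector(lba + i, buf + i * SECTOR_SIZE);
    }
}

With something like this, the worst-case delay added to a high priority request is one sector read (plus whatever seek the prefetch was already in the middle of), rather than an entire multi-sector transfer.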
SpyderTL wrote:A good approach to prefetching would be to wait for no activity from the user for, say 5 minutes, and then start loading stuff. Or better yet, wait till the machine is locked.
Yes; and Windows "SuperFetch" does wait for about 5 minutes before it starts prefetching.
azblue wrote:
SpyderTL wrote:
azblue wrote:I don't understand the contempt for swapping that I've seen from a few OSdevers
I turn off paging on all of my Windows machines...
I also disable all prefetch/superfetch functionality as well, for similar reasons. I always feel like I'm waiting on it to finish loading stuff so that it will be faster at some later point, while making me wait right now.
This is essentially "Windows is [or appears to be] stupid," but that's no reason to believe that the only way to implement swapping is to do it stupidly.
Basically, swapping should be implemented in such a way that performance cannot possibly be any slower than if swapping were not implemented. Doing it any other way is wrong and stupid and broken.
Yes (mostly).
At some point (e.g. maybe when there's no free RAM left, and the OS has to "steal" pages from the file cache or from unmodified memory mapped files if it needs some RAM) the OS would have to start keeping track of "least recently used" pages so that it knows which pages to send to swap space if things get worse. This tracking can add a little overhead (e.g. scanning page tables and fiddling with the accessed/dirty bits to build up a reasonable idea of when each page was used last).
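As a minimal sketch of one tracking pass over an (x86-style) page table, where page_info_t and its last_used field are an assumed per-page structure, not any real kernel's:

#include <stdint.h>

#define PAGE_PRESENT  0x01   /* x86 "present" bit in a page table entry */
#define PAGE_ACCESSED 0x20   /* x86 "accessed" bit in a page table entry */
#define ENTRIES_PER_TABLE 512

/* Assumed per-page metadata kept by the OS (illustrative only). */
typedef struct {
    uint64_t last_used;      /* timestamp of the last observed access */
} page_info_t;

/* One "least recently used" approximation pass: if the CPU set the
   accessed bit since our last pass, record the time and clear the bit
   so the next pass can detect new accesses. */
void scan_page_table(uint64_t *table, page_info_t *info, uint64_t now)
{
    for (int i = 0; i < ENTRIES_PER_TABLE; i++) {
        if (!(table[i] & PAGE_PRESENT))
            continue;
        if (table[i] & PAGE_ACCESSED) {
            info[i].last_used = now;
            table[i] &= ~(uint64_t)PAGE_ACCESSED;
            /* Note: a real kernel would also need to invalidate the
               TLB entry here, or the CPU may not set the bit again. */
        }
    }
}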
The result is a grey area between "no swapping, zero overhead" and "swap space in use, no denial of service", where there's some overhead even though swap space isn't being used.
Of course the amount of overhead this involves depends on how it's implemented. Ideally you'd determine how close you are to actually needing to send pages to swap space, and adjust the overhead of the "least recently used" tracking to suit: if you're very close to needing to swap (or are already swapping) you'd do thorough tracking (with more overhead), and if you're not that close you'd do less thorough tracking (with less overhead).
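A crude sketch of that idea (the thresholds and delays are made-up numbers, purely for illustration):

#include <stdint.h>

/* Decide how long to wait before the next "least recently used"
   tracking pass, based on memory pressure (assumes total_pages > 0). */
uint64_t next_scan_delay_ms(uint64_t free_pages, uint64_t total_pages)
{
    uint64_t free_percent = free_pages * 100 / total_pages;

    if (free_percent < 5)  return 100;     /* near swapping: scan often */
    if (free_percent < 20) return 1000;    /* getting close: scan sometimes */
    return 10000;                          /* plenty of free RAM: scan rarely */
}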
Also don't forget that (e.g.) if a process hasn't received any CPU time (all threads blocked) for 1234 seconds you can assume that none of its "not shared" pages have been used for at least 1234 seconds without scanning anything at all. This gives you a decent head start.
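In code that head start could be as simple as the following (process_t and its last_ran field are assumed fields, purely for illustration):

#include <stdint.h>

/* Assumed process structure (illustrative only). */
typedef struct {
    uint64_t last_ran;   /* time any of this process's threads last ran */
} process_t;

/* Lower bound on the age of every "not shared" page the process owns;
   no page table scanning is needed to establish it. */
uint64_t min_private_page_age(const process_t *proc, uint64_t now)
{
    return now - proc->last_ran;
}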
Finally, don't forget that swapping is just one piece of a larger puzzle (e.g. it can/does make perfect sense to send pages to swap space just so you can prefetch files, if/when those files are more likely to be needed than the page/s you send to swap). The whole system (caching, pre-fetching, swapping, etc.) should always be beneficial, even if there are pathological cases where individual pieces of that whole system add a little extra overhead.
Cheers,
Brendan