1) As for whether there is a way to know if a sub-page byte range has been accessed: the answer is no. Why doesn't such a feature exist? Suppose the hardware could tell you which bytes of RAM have been accessed since you asked it to start monitoring. Then every byte (8 bits) would need one more bit holding the "has been accessed" information, so an 8 GB system would need an extra 1 GB of storage just for that. I think the problem is obvious.
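To put numbers on that, here is a quick back-of-the-envelope sketch. The function name accessed_bitmap_bytes is made up for illustration, and the 4 KiB page size used for comparison is an assumption (it is the common small page size, but your platform may differ).

```python
GIB = 1024 ** 3  # bytes in one GiB


def accessed_bitmap_bytes(ram_bytes: int, granule_bytes: int = 1) -> int:
    """Bytes needed to keep one 'accessed' bit per granule of RAM.

    Hypothetical helper, for illustration only.
    """
    granules = -(-ram_bytes // granule_bytes)  # ceiling division
    return -(-granules // 8)                   # 8 flag bits fit in one byte


# One flag bit per byte of an 8 GiB machine.
per_byte = accessed_bitmap_bytes(8 * GIB)

# One flag bit per 4 KiB page (assumed page size) of the same machine.
per_page = accessed_bitmap_bytes(8 * GIB, 4096)

print(per_byte / GIB, "GiB of flags for byte granularity")
print(per_page // 1024, "KiB of flags for page granularity")
```

Tracking at byte granularity costs one eighth of all RAM, while tracking at page granularity costs a few hundred KiB, which is why access tracking is only practical at page level.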
2) As iansjack explained, for a GC you don't care whether the object has been accessed but whether it can be accessed.
(An object may not be accessed for several passes of the GC, yet it may be referenced by a static variable, so it should stay in RAM because it can still be accessed in the future.) So how can you do that?
See https://en.wikipedia.org/wiki/Garbage_c ... r_science), specifically the Strategies section.
The 2 most common approaches are:
A) Tracing. Used by high-level (and object-oriented) programming languages: .NET (C#, VB, F#), Java, ...
Basically, you start at the "roots" (all the places from which an object can be accessed directly, e.g. registers, static variables, the stack, ...) and mark each object you find there as "accessible". Then you continue by checking each object's fields for pointers to further objects, until every accessible object has been marked "accessible". Finally, you free the rest of the memory.
B) Reference Counting. Used in lower-level environments (not necessarily object-oriented). The Windows kernel uses it for objects that can be accessed by several components and/or programs, such as files, events, threads, ...
Basically, when you first want to access the object (e.g. you open a file, represented by an object) you increment a counter in the object's header by one. When you no longer need the object (e.g. you close the file) you decrement the counter by one. When the counter reaches zero (i.e. no one is operating on the file) the object gets freed.
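The two strategies above can be sketched roughly as follows. This is a toy model, not how any real runtime or the Windows kernel implements it: the Obj class, the heap list and the mark, sweep, acquire and release helpers are all hypothetical names invented for this example.

```python
class Obj:
    """Hypothetical heap object: a name plus references to other objects."""

    def __init__(self, name):
        self.name = name
        self.fields = []     # pointers to further objects
        self.marked = False  # "accessible" flag used by tracing
        self.refcount = 0    # header counter used by reference counting


# --- A) Tracing (mark, then sweep) ---

def mark(roots):
    """Mark every object reachable from the roots as accessible."""
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj.marked:
            continue          # already visited, also guards against cycles
        obj.marked = True
        stack.extend(obj.fields)


def sweep(heap):
    """Keep the marked objects, drop the rest, reset marks for next pass."""
    live = [o for o in heap if o.marked]
    for o in live:
        o.marked = False
    return live


# --- B) Reference counting ---

def acquire(obj):
    """Start using the object, e.g. open a file."""
    obj.refcount += 1
    return obj


def release(obj, freed):
    """Stop using the object; record it as freed when the count hits zero."""
    obj.refcount -= 1
    if obj.refcount == 0:
        freed.append(obj.name)


# Tracing demo: a -> b is reachable from the root, c -> d is not.
a, b, c, d = Obj("a"), Obj("b"), Obj("c"), Obj("d")
a.fields.append(b)
c.fields.append(d)
heap = [a, b, c, d]
mark(roots=[a])
heap = sweep(heap)
survivors = [o.name for o in heap]
print(survivors)

# Reference counting demo: two users open the same file object.
freed = []
f = Obj("file")
acquire(f)
acquire(f)
release(f, freed)   # one user still holds it, nothing is freed yet
release(f, freed)   # last user is gone, the object is freed
print(freed)
```

Note that the tracing part never asks whether an object was recently touched, only whether it can still be reached from a root, which is the point made in 2) above.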
Footnotes:
1) In both strategies I am skipping many minor (yet important) details; check the links for a better description and different sub-strategies.
2) Many people refer to "Tracing Garbage Collection" as "Garbage Collection".
3) About kzinti's (last) answer: I think you need to first implement a GC that works and only then try those kinds of tricks. (In my opinion, and without having done any testing whatsoever, using page faults for lazy reference updates is just too slow. I am actually interested in this, so if anyone can provide further information it would be really helpful.)