I was reading a thread (http://forum.osdev.org/viewtopic.php?f=15&t=27914) and at some point it was mentioned that segmentation (without paging) could be used to memory-map files. I was wondering which kind of trick can be used to accomplish this or if there is no such trick and they were only talking about a hypothetical segmented processor.
Sure, we can just create a small segment with the first L bytes of a given file. When the user tries to read something at position L or higher, we get a fault, so we can read more of the file and update the segment descriptor. But then if the user reads the position of the last byte, wouldn't we need to copy the entire file into memory? Is there a clever way to do this?
Memory map files with segmentation
Re: Memory map files with segmentation
On x86, the clever solution is paging. If you aren't using paging, you have to memory-map the entire file.
On a hypothetical segmented processor, where segments have both lower and upper limits, you could conceivably map only part of a file. Of course, as you improve your hypothetical segmentation, it gets more and more similar to paging...
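To make the "lower and upper limits" idea concrete, here is a minimal sketch of the check such a processor might perform. Everything in it is hypothetical: `struct window_segment` and `window_translate` are invented names, and no x86 descriptor has this layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical descriptor, not an x86 structure. It maps file offsets
 * lower..upper onto a RAM buffer that holds only that window. */
struct window_segment {
    uint32_t base;   /* physical address of the byte at offset `lower` */
    uint32_t lower;  /* lowest valid offset (inclusive) */
    uint32_t upper;  /* highest valid offset (inclusive) */
};

/* Returns true and stores the physical address when the access falls
 * inside the window; returns false where the imagined hardware would
 * raise a fault so the OS could slide the window. */
static bool window_translate(const struct window_segment *seg,
                             uint32_t offset, uint32_t *phys)
{
    assert(seg->lower <= seg->upper);
    if (offset < seg->lower || offset > seg->upper)
        return false;
    *phys = seg->base + (offset - seg->lower);
    return true;
}
```

Only `upper - lower + 1` bytes need to be resident here, which is exactly the property real x86 segments lack.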
Re: Memory map files with segmentation
Well, not the entire file but everything between the lowest (ok maybe it has to be 0) and the highest position ever claimed by the process, right?
I never hoped it would be possible to keep only fragmented pieces of the file in memory, but maybe just a window near the last accessed byte (which would be possible if we had both lower and upper limits).
If the program is reading the first bytes and then suddenly tries to read bytes far away, couldn't we just turn the segment into an "expand down" one? Then we could free the memory that stores the first bytes of the file, since a later access to the first bytes of the segment would trigger a fault, which would let us load them back.
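For reference, the difference between a normal and an expand-down data segment comes down to which side of the limit is valid. The sketch below mirrors that check for a single byte access, assuming byte granularity (G = 0); the function name `offset_in_segment` is made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mirrors the x86 data-segment limit check for a one-byte access,
 * assuming byte granularity (G = 0) so the limit fits in 20 bits.
 * `big` stands for the descriptor's B flag, which selects the upper
 * bound of an expand-down segment. */
static bool offset_in_segment(uint32_t offset, uint32_t limit,
                              bool expand_down, bool big)
{
    assert(limit <= 0xFFFFFu);
    if (!expand_down)
        return offset <= limit;              /* valid: 0 .. limit */
    uint32_t top = big ? 0xFFFFFFFFu : 0xFFFFu;
    return offset > limit && offset <= top;  /* valid: limit+1 .. top */
}
```

Note that an expand-down segment is always valid up to the top of the offset range, so the end of the file would have to be placed at the highest offset for this to act as a "tail of the file" mapping.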
Re: Memory map files with segmentation
goback wrote: Well, not the entire file but everything between the lowest (ok maybe it has to be 0) and the highest position ever claimed by the process, right?
The process accesses both the beginning and end of the file. You end up loading the entire file into memory, or crashing if there is not enough memory to load the entire file.
goback wrote: If the program is reading the first bytes and then suddenly tries to read bytes far away, couldn't we just turn the segment into an "expand down" one? Then we could free the memory that stores the first bytes of the file, since a later access to the first bytes of the segment would trigger a fault, which would let us load them back.
The process accesses the middle of the file. You end up loading half of the file into memory, or crashing if there is not enough memory to load half of the file.
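Both worst cases fall out of a small calculation. The sketch below assumes the only two options are a segment growing up from offset 0 or one growing down from the end of the file; `min_resident_bytes` is an illustrative helper, not part of any real kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest number of file bytes that must be resident to cover every
 * accessed offset in lowest..highest, when the segment can only grow
 * up from offset 0 or grow down from the end of the file. */
static uint64_t min_resident_bytes(uint64_t file_size,
                                   uint64_t lowest, uint64_t highest)
{
    assert(lowest <= highest && highest < file_size);
    uint64_t expand_up   = highest + 1;         /* bytes 0 .. highest */
    uint64_t expand_down = file_size - lowest;  /* bytes lowest .. end */
    return expand_up < expand_down ? expand_up : expand_down;
}
```

For a 1000-byte file, touching offsets 0 and 999 gives 1000 (the whole file), and touching only offset 500 gives 500 (half of it).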
Re: Memory map files with segmentation
Hi,
goback wrote: Sure, we can just create a small segment with the first L bytes of a given file. And when the user tries to read something at a position higher than L, we get a fault, so that we can read the file and update the segment descriptor. But then if the user reads the position associated with the last byte wouldn't we need to copy the entire file into memory? Is there a clever way to do this?
The first "clever" trick I can think of is automatically deciding to use either a normal segment (program wants data near the start of the file only) or an expand-down segment (program wants data near the end of the file only).
Note that:
- The general protection fault handler won't tell you the offset (in the segment) that was accessed. The only way to find this information (without paging) would be to decode and analyse the instruction at EIP; without that information you can't know whether to load the first part of the file or the last part (and would have to load the entire file as soon as it's accessed).
- Software typically accesses multiple parts of a file. For example, if software reads one byte at "offset 1234", you could allocate 1235 bytes of memory, create a 1235 byte segment and load those 1235 bytes; but then the software may read another byte at "offset 2345" and you'd have to re-allocate the memory, adjust the segment and load the next 1111 bytes. This means that even for the "simplest case" sequential access pattern you'd be constantly diddling (reallocating memory and adjusting the segment).
- The RAM backing a segment must be physically contiguous. Physical memory will become fragmented (as programs allocate and free memory); and when allocating larger amounts of memory you'll probably also have to relocate everything in memory to be able to allocate the physically contiguous RAM needed (which is also going to be painfully slow and complicated; especially for multi-CPU where other CPUs are using things in memory while you're trying to de-fragment). In some cases (e.g. fast SSDs) de-fragmenting RAM may be more expensive than loading a large file.
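The "offset 1234, then offset 2345" example from the list above can be sketched as bookkeeping for one expand-up segment. This assumes the fault handler has somehow recovered the faulting offset already; `struct file_segment` and `grow_to_cover` are hypothetical names, and the actual reallocation and disk read are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping for one expand-up file segment. */
struct file_segment {
    size_t resident;  /* bytes 0 .. resident-1 of the file are loaded */
    size_t regrows;   /* how many times the buffer had to be resized */
};

/* Called from an imagined fault handler once the faulting offset is
 * known. Returns how many additional bytes must be read from the file. */
static size_t grow_to_cover(struct file_segment *seg, size_t offset)
{
    assert(seg != NULL);
    if (offset < seg->resident)
        return 0;                    /* already mapped, no fault occurs */
    size_t extra = offset + 1 - seg->resident;
    seg->resident = offset + 1;      /* new segment limit is `offset` */
    seg->regrows++;                  /* each growth means a realloc */
    return extra;
}
```

Starting from an empty segment, an access at 1234 returns 1235 and a later access at 2345 returns 1111. Every call that returns non-zero stands for a contiguous reallocation plus a descriptor update.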
There's also a limit to the number of descriptors you can have in the GDT and/or LDT. This causes compromises - for a single 2 GiB file you can't save more RAM by splitting it into many smaller segments (e.g. 4 KiB segments) because you'll run out of descriptors; and for multiple files you'll probably run out of descriptors anyway (e.g. with four 2 GiB files and 1 MiB segments you'd need 8192 descriptors).
Of course, if you could use lots of descriptors (e.g. a massive number of 4 KiB segments) then it'd become an "excessively annoying for programmers" version of paging.
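The descriptor arithmetic is easy to check. The sketch below assumes 8-byte protected-mode descriptors, so a table with a 16-bit limit tops out at 8192 entries; `descriptors_needed` is just an illustrative helper.

```c
#include <assert.h>
#include <stdint.h>

/* A descriptor table holds at most 8192 entries: its 16-bit limit
 * covers 64 KiB and each descriptor is 8 bytes. */
#define MAX_DESCRIPTORS_PER_TABLE 8192u

/* Number of descriptors needed to cover `files` files of `file_size`
 * bytes each with segments of `segment_size` bytes (rounded up). */
static uint64_t descriptors_needed(uint64_t files, uint64_t file_size,
                                   uint64_t segment_size)
{
    assert(segment_size > 0);
    uint64_t per_file = (file_size + segment_size - 1) / segment_size;
    return files * per_file;
}
```

Four 2 GiB files with 1 MiB segments need 4 * 2048 = 8192 descriptors, which already fills a whole table, and one 2 GiB file with 4 KiB segments would need 524288.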
Cheers,
Brendan