
Linux "High Memory"

Posted: Sun Jan 12, 2014 12:33 pm
by dmatlack
So I was reading http://lwn.net/Articles/75174/, which talks about why Linux couldn't support more than 1G of RAM on 32-bit systems back in the day. From what I understand, the kernel wanted to be able to access all of RAM from its own address space (which was 0xC0000000+). Since the kernel only had 1G of virtual address space on 32-bit systems, only 1G of RAM could be supported.

But my question is, why did Linux want to map all of physical memory into the kernel address space so badly? The solution to the issue makes even less sense to me: now the kernel maps the first 896M of physical memory into its address space starting at 0xC0000000, and the top 128M of kernel virtual memory is used to temporarily map the rest of physical memory. Under what conditions would the kernel need to access physical memory that isn't already mapped in either the kernel's or the user's address space?
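Just to make sure I have the scheme right, here's roughly how I picture it (a made-up sketch, not real Linux code; map_page_at(), the window address, and the other names are all hypothetical):

Code:
#include <stdint.h>

#define PAGE_OFFSET   0xC0000000UL            /* kernel direct map starts here       */
#define LOWMEM_LIMIT  (896UL * 1024 * 1024)   /* end of the permanently mapped RAM   */
#define KMAP_WINDOW   0xF8000000UL            /* hypothetical temporary mapping slot */

/* hypothetical helper: installs a page table entry and flushes the TLB entry */
void map_page_at(uintptr_t vaddr, uintptr_t paddr);

void *kernel_access(uintptr_t paddr)
{
    if (paddr < LOWMEM_LIMIT)
        return (void *)(PAGE_OFFSET + paddr);        /* "low memory": just add an offset */

    /* "high memory": temporarily map the page into the 128M window at the top */
    map_page_at(KMAP_WINDOW, paddr & ~0xFFFUL);
    return (void *)(KMAP_WINDOW + (paddr & 0xFFFUL));
}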

Thanks.

Re: Linux "High Memory"

Posted: Sun Jan 12, 2014 12:45 pm
by sortie
I think the idea was that it was considerably cheaper to access raw physical memory if it was already mapped: to reach the physical memory at physical_memory_offset you just read at base_physical_memory_in_virtual_memory + physical_memory_offset, rather than manually remapping the desired physical memory at a new location and possibly doing a TLB flush and such. While this approach doesn't make that much sense when the amount of physical memory is roughly the same size as the virtual address space, as on 32-bit x86 systems, it makes a bit more sense on an x86_64 system, where the address space is huge compared to the RAM in many present commodity systems.
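As a rough illustration (just a sketch of the idea, not actual kernel code; the base address and helper names are made up), reading physical memory through such a permanent mapping is a single address computation:

Code:
#include <stdint.h>

/* hypothetical base of a "map all of RAM here" region in kernel space */
#define DIRECT_MAP_BASE 0xFFFF880000000000ULL

static inline void *phys_to_virt(uint64_t physical_memory_offset)
{
    /* no page table edits, no TLB flush - just arithmetic */
    return (void *)(uintptr_t)(DIRECT_MAP_BASE + physical_memory_offset);
}

uint32_t read_phys_u32(uint64_t physical_memory_offset)
{
    return *(volatile uint32_t *)phys_to_virt(physical_memory_offset);
}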

Re: Linux "High Memory"

Posted: Sun Jan 12, 2014 12:46 pm
by OSwhatever
One reason I know about is that the slab allocator (now replaced by slub) used the 1:1 mapped physical memory to find which physical memory a slab was using. If the kernel knew the virtual address of the slab, then the physical address was that address minus some offset, since the region was mapped at a fixed offset in the kernel's part of the address space. The physical address of the slab was not stored inside its own metadata. However, I'm not sure if this was the original reason for the physically mapped memory or if the slab allocator was just taking advantage of it.
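Roughly what I mean (just an illustration; the names and the offset are made up, not the real slab code):

Code:
#include <stdint.h>

#define PAGE_OFFSET 0xC0000000UL   /* where the 1:1 physical mapping starts */

/* recover the physical address of a slab object from its virtual address;
 * the slab metadata never needs to store it */
static inline uintptr_t slab_virt_to_phys(const void *obj)
{
    return (uintptr_t)obj - PAGE_OFFSET;
}

/* which physical page frame the object lives in */
static inline uintptr_t slab_phys_frame(const void *obj)
{
    return slab_virt_to_phys(obj) & ~0xFFFUL;
}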

Re: Linux "High Memory"

Posted: Sun Jan 12, 2014 1:02 pm
by dmatlack
OSwhatever wrote:One reason I know about is that the slab allocator (now replaced by slub) used the 1:1 mapped physical memory to find which physical memory a slab was using. If the kernel knew the virtual address of the slab, then the physical address was that address minus some offset, since the region was mapped at a fixed offset in the kernel's part of the address space. The physical address of the slab was not stored inside its own metadata. However, I'm not sure if this was the original reason for the physically mapped memory or if the slab allocator was just taking advantage of it.
In my time looking into the Linux kernel I keep hearing about the "slab allocator". I think it's time I actually learn about it :P. Thanks for the response.

Re: Linux "High Memory"

Posted: Sun Jan 12, 2014 1:07 pm
by dmatlack
sortie wrote:I think the idea was that it was considerably cheaper to access raw physical memory if it was already mapped: to reach the physical memory at physical_memory_offset you just read at base_physical_memory_in_virtual_memory + physical_memory_offset, rather than manually remapping the desired physical memory at a new location and possibly doing a TLB flush and such.
I do get this; it is definitely easier to access physical memory this way. I guess I just can't think of many good reasons you would need this operation to be cheap and simple. I can only think of a few cases where the kernel would want to access raw, unmapped physical memory:
  1. During fork, the kernel needs to copy data from one address space to another. Obviously only one address space is mapped at a time, so getting raw access to the pages in the other address space would be nice.
OK, I can actually only think of one :P. What are some others? (I'm assuming things like DMA don't count, because the kernel's virtual address space covers the first 896M of physical memory.)

Re: Linux "High Memory"

Posted: Sun Jan 12, 2014 9:54 pm
by Brendan
Hi,
dmatlack wrote:But my question is, why did Linux want to map all of physical memory into the kernel address space so badly?
When the Linux kernel was first written, people were using 80386 machines, computers with more than 16 MiB of RAM were rare, computers with more than 1 GiB of RAM were hard to imagine, and Linus didn't think the kernel would be used for much anyway. Mapping all memory into kernel space sounded like a good idea at the time.

It took around 10 years before it started becoming a problem - the amount of RAM in computers kept increasing, and servers with more than 1 GiB started to appear. Then Intel added PAE. Then computers with more than 4 GiB started to appear.

By the time Linux developers finally realised that the "map all physical memory into kernel space" idea was a problem, it was too late. There were 10 years or more of code built on top of the "all physical memory is mapped" assumption, and Linux was being relied on by actual people. The risk of changing the physical memory management and breaking something (regressions) meant that nobody wanted to change such a fundamental part of the kernel too much - they couldn't redesign it properly, and had to settle for a hack-job that resulted in the "low memory/high memory" split (so that old code could continue using "low memory", with very little chance of regressions).

Of course now we know that "X bytes of RAM will be enough for everything!" is always wrong sooner or later. For example, for 64-bit 80x86, kernel space can be as large as 128 TiB, and (like Linus working on 80386 machines) computers with that much RAM seem hard to imagine now. If you assume RAM sizes double every 3 years, then you can expect to see computers with more than 128 TiB in about 30 years. When that happens, you can expect Linux developers to start whining about CPU manufacturers failing to increase virtual address space sizes (and blaming the CPU designers for the kernel's own stupidity) again. Of course there are technologies currently being developed that could have a massive impact on this, like non-volatile RAM (e.g. instead of having 16 GiB of RAM and 3 TiB of hard disk space, you might just have 3 TiB of non-volatile RAM).

The other thing we know is that (when used properly) paging is extremely powerful. By mapping "all" physical memory into kernel space like that, you make it impossible to use any paging tricks in kernel space, and make it impossible for the kernel itself to gain any benefits from one of the CPU's most powerful features (even when all physical memory actually does fit in kernel space).


Cheers,

Brendan

Re: Linux "High Memory"

Posted: Sun Jan 12, 2014 10:10 pm
by Brendan
Hi,
dmatlack wrote:I do get this; it is definitely easier to access physical memory this way. I guess I just can't think of many good reasons you would need this operation to be cheap and simple. I can only think of a few cases where the kernel would want to access raw, unmapped physical memory:
  1. During fork, the kernel needs to copy data from one address space to another. Obviously only one address space is mapped at a time, so getting raw access to the pages in the other address space would be nice.
Actually, no. :)

During fork, the kernel needs to clone the virtual address space using "copy on write" paging tricks. It only really needs to copy the process' highest level paging table (e.g. PML4) to do this (and make the entries in both copies of that highest level paging table "read only") - the kernel doesn't have to "memcpy()" all the data in each individual page during fork. It's after fork has finished (e.g. when one of the processes attempts to write to a "read only, copy on write" page) that the page fault handler may need to copy the actual data. Of course (because only data that is modified is copied) this is a massive performance improvement and saves a huge amount of RAM, especially when most processes do the "fork then exec" thing anyway.
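Very roughly (a deliberately simplified sketch, not how Linux or any specific kernel actually implements it - all the types and helpers are invented, and a real handler would also lazily copy the intermediate paging tables, not just the data page):

Code:
#include <stdint.h>
#include <string.h>

#define ENTRIES       512
#define PTE_PRESENT   0x1ULL
#define PTE_WRITABLE  0x2ULL
#define FRAME_MASK    (~0xFFFULL)

typedef uint64_t pte_t;

pte_t    *alloc_table(void);              /* zeroed, page-aligned paging table        */
uintptr_t alloc_phys_page(void);          /* physical address of a fresh page         */
void     *phys_to_virt(uintptr_t paddr);  /* view a physical page via the direct map  */

/* fork: duplicate only the highest level table; both copies end up read only,
 * so parent and child share all lower tables and data pages for now */
pte_t *cow_clone_top_level(pte_t *parent_top)
{
    pte_t *child_top = alloc_table();
    for (int i = 0; i < ENTRIES; i++) {
        parent_top[i] &= ~PTE_WRITABLE;   /* write-protect the parent's entry */
        child_top[i]   = parent_top[i];   /* child points at the same subtree */
    }
    return child_top;
}

/* later, when one process writes to a shared page, the page fault handler
 * copies just that one page and makes the faulting process's entry writable */
void cow_write_fault(pte_t *pte)
{
    uintptr_t new_frame = alloc_phys_page();
    memcpy(phys_to_virt(new_frame), phys_to_virt(*pte & FRAME_MASK), 4096);
    *pte = new_frame | PTE_PRESENT | PTE_WRITABLE;
    /* (a TLB invalidation for the faulting address would be needed here too) */
}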


Cheers,

Brendan

Re: Linux "High Memory"

Posted: Tue Jan 14, 2014 6:46 pm
by dmatlack
Thanks Brendan!