64 bit recursive paging

Korona
Member
Member
Posts: 1000
Joined: Thu May 17, 2007 1:27 pm
Contact:

Re: 64 bit recursive paging

Post by Korona »

kzinti wrote:But how is that helping in user space (if at all)? Surely you have to map the pages so that the user programs can access them and that means TLB shootdowns and all the complexity is back?
Yes, but now you have the kernel MM available when implementing stuff like lazy shootdown which makes stuff simpler. In many contexts, userspace memory is not touched at all, e.g., it is (presumably) never accessed from IRQs or NMIs (maybe not even from regions with IRQs disabled), which also makes everything a lot simpler. You can also access the userspace page tables through the physical mapping without worrying about shootdown of these pages.
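A minimal sketch of what "accessing the page tables through the physical mapping" can look like, assuming the kernel maps all physical memory at one fixed virtual offset. The names `DIRECT_MAP_BASE`, `phys_to_virt`, `virt_to_phys` and `table_entry_virt` are hypothetical, and the base address is just one plausible higher-half choice, not any particular kernel's layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical base of the kernel's direct map of all physical memory.
 * This value is an assumption for the sketch, not a real kernel's layout. */
#define DIRECT_MAP_BASE 0xFFFF800000000000ULL

/* Bits 12..51 of an x86-64 paging-structure entry hold the frame address. */
#define PTE_ADDR_MASK 0x000FFFFFFFFFF000ULL

/* Translate a physical address to its alias inside the direct map. */
static inline uint64_t phys_to_virt(uint64_t phys)
{
    assert(phys < (1ULL << 46)); /* 64 TiB window assumed by this sketch */
    return DIRECT_MAP_BASE + phys;
}

/* Inverse translation; only valid for addresses inside the direct map. */
static inline uint64_t virt_to_phys(uint64_t virt)
{
    assert(virt >= DIRECT_MAP_BASE);
    return virt - DIRECT_MAP_BASE;
}

/* Virtual address of entry `index` in the page table located at physical
 * address `table_phys`. No temporary mapping is created, so nothing has
 * to be shot down afterwards. */
static inline uint64_t table_entry_virt(uint64_t table_phys, unsigned index)
{
    assert(index < 512); /* 512 eight-byte entries per 4 KiB table */
    return phys_to_virt(table_phys & PTE_ADDR_MASK) + (uint64_t)index * 8;
}
```

With this layout, walking another address space's tables is plain pointer arithmetic on the value read from CR3 or from a parent entry.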
managarm: Microkernel-based OS capable of running a Wayland desktop (Discord: https://discord.gg/7WB6Ur3). My OS-dev projects: [mlibc: Portable C library for managarm, qword, Linux, Sigma, ...] [LAI: AML interpreter] [xbstrap: Build system for OS distributions].
iansjack
Member
Member
Posts: 4703
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Re: 64 bit recursive paging

Post by iansjack »

Apart from anything else, does anyone here seriously expect their OS to run on that sort of hardware?

I agree with the "map all physical memory" approach. As long as your paging code is modular, you can always rewrite it without affecting the rest of the OS should address space become a problem.
kzinti
Member
Member
Posts: 898
Joined: Mon Feb 02, 2015 7:11 pm

Re: 64 bit recursive paging

Post by kzinti »

iansjack wrote:Apart from anything else, does anyone here seriously expect their OS to run on that sort of hardware?
Not really.

I started with recursive mapping in the bootloader because it sounded interesting at the time. Then I did the same thing in 64 bits because why not.

I haven't actually implemented a proper virtual memory manager yet (or even a physical memory manager yet). I also haven't started working on SMP.

But I am getting there now... So this discussion came up at the right time for me. I will just map all physical memory in 64 bits as it indeed seems the way to go about it. I am not sure what I'll do for 32 bits yet.
linguofreak
Member
Member
Posts: 510
Joined: Wed Mar 09, 2011 3:55 am

Re: 64 bit recursive paging

Post by linguofreak »

bzt wrote:
Octocontrabass wrote:
Virtual memory is 56 bits wide, always (architectural limit).
Citation needed? I can't find anything in the AMD documentation on the architecture that says that there is an architectural limit on virtual memory short of the 64-bit mark. Indeed, while no implementations currently implement that many bits and the mechanism for doing so is "reserved for a future specification", AMD explicitly states the intention to eventually have 64-bit virtual addresses. However, AMD documentation *does* state that *physical* addresses are limited to 52 bits, and that this *is* an architectural limit (due to the PTE format).
bzt
Member
Member
Posts: 1584
Joined: Thu Oct 13, 2016 4:55 pm
Contact:

Re: 64 bit recursive paging

Post by bzt »

linguofreak wrote:Citation needed? I can't find anything in the AMD documentation on the architecture that says that there is an architectural limit
Try the search phrase "architectural limit" :-) I've looked it up for you, and I remembered incorrectly, it is not 56 bits, just 52 bits in best case.

Btw, here are the references:
  • AMD64 Architecture Programmer's Manual Volume 2: System Programming, section 3.1.2 CR2 and CR3 registers, bits 52-63 reserved and MBZ.
  • AMD64 Architecture Programmer's Manual Volume 2: System Programming, section 5.1 Page Translation Overview,
    Currently, the AMD64 architecture defines a mechanism for translating 48-bit virtual addresses to 52-bit physical addresses. The mechanism used to translate a full 64-bit virtual addresses is reserved
    and it also mentions that a certain implementation may implement fewer bits, requiring minimum 36 bits only (to support legacy PAE).
  • Intel64 and IA-32 Architectures Software Developer's Manual, Volume 3A, System Programming Guide, Part 1, section 3.10.4 Enhanced Paging Data Structures (allows up to 39 bits, but bits 52-63 reserved for system programmer, so it can't go above 52 bits).
  • ARM DDI0478 ARM Architecture Reference Manual ARMv8 section D4.1.3 VMSA address types and address spaces: with ARMv8-LVA it is 52 bits, otherwise 48 bits, physical address likewise.
Furthermore, on wikipedia:
In addition, the AMD specification requires that the most significant 16 bits of any virtual address, bits 48 through 63, must be copies of bit 47 (in a manner akin to sign extension). If this requirement is not met, the processor will raise an exception.
and
The first versions of Windows for x64 did not even use the full 256 TiB; they were restricted to just 8 TiB of user space and 8 TiB of kernel space. Windows did not support the entire 48-bit address space until Windows 8.1, which was released in October 2013.
This means there's several hundred times more virtual space than there will ever be RAM on any 64-bit architecture, x86 and ARM alike (unless RAM becomes super cheap and manufacturers begin to sell machines with 256T RAM, without AMD revising the 48-bit VA barrier).
To sum it up:
AMD: VA 48, PA 52 (however nobody ever built a CPU with more than 39 bits, maybe some research arch uses 40 bits, 1T RAM)
Intel: VA 48, PA 39 (as of the first release of the spec) theoretical max 52
ARM: VA 48, PA max 48 (but BCM2835 supports just 39), with LVA and LPA extensions VA 52, PA max 52
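
The sign-extension rule quoted above (bits 48 through 63 must be copies of bit 47) can be checked in a few lines. This is only a sketch: `is_canonical` is a hypothetical helper name, and the `va_bits` parameter is added so the same check works for widths other than 48.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if `addr` is canonical for a virtual address width of
 * `va_bits`: every bit from (va_bits - 1) upward must have the same value. */
static bool is_canonical(uint64_t addr, unsigned va_bits)
{
    assert(va_bits >= 2 && va_bits <= 64);
    if (va_bits == 64)
        return true; /* nothing left to sign-extend */

    /* Keep the top implemented bit and everything above it. */
    uint64_t upper    = addr >> (va_bits - 1);
    uint64_t all_ones = UINT64_MAX >> (va_bits - 1);
    return upper == 0 || upper == all_ones;
}
```

For `va_bits == 48` this accepts 0x00007FFFFFFFFFFF and 0xFFFF800000000000 and rejects everything in the hole between them.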

Cheers,
bzt
Korona
Member
Member
Posts: 1000
Joined: Thu May 17, 2007 1:27 pm
Contact:

Re: 64 bit recursive paging

Post by Korona »

Intel CPUs with more than 39 physical address bits do exist (for quite some time now). For example, the newest Xeons implement 42 bits.
bzt
Member
Member
Posts: 1584
Joined: Thu Oct 13, 2016 4:55 pm
Contact:

Re: 64 bit recursive paging

Post by bzt »

Korona wrote:Intel CPUs with more than 39 physical address bits do exist (for quite some time now). For example, the newest Xeons implement 42 bits.
There's still a 2^6 factor. It has 64 times more virtual address space than physical bus. For ARM, there's still a factor of 2^10 (52 - 42, read as: thousand times more). Plus I'd like to see that motherboard which you actually can fit 4T RAM into.

Let's take a practical point of view. Most desktop computers have 4 memory slots. Some expensive gamer motherboards have 8. In enterprise server appliances you might find 16. For the sake of the argument, let's assume your hypothetical motherboard has 32 slots.

Now the biggest memory module you can buy is 64G, but for the sake of the argument, let's overestimate again and say you have 128G RAM modules.

That's 32*128G = 4T, 2^42. May I remind you we have overestimated both the number of modules and the module size by a magnitude. In reality, as of 2020 on Amazon you can buy 8x64G = 512G of RAM tops, which is very far from the 256T that would cause trouble (and that's only if AMD does not change and update the spec). A typical desktop machine in 2020 has 16G - 32G RAM only, and 512G is extremely rare even for servers.
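
The slot arithmetic above is easy to replay. A small sketch, with `installed_bytes` and `address_bits_for` as hypothetical helper names:

```c
#include <assert.h>
#include <stdint.h>

/* Total RAM in bytes for `slots` modules of `gib_per_module` GiB each. */
static uint64_t installed_bytes(unsigned slots, uint64_t gib_per_module)
{
    return ((uint64_t)slots * gib_per_module) << 30;
}

/* Smallest number of address bits that can address `bytes` bytes. */
static unsigned address_bits_for(uint64_t bytes)
{
    assert(bytes > 0);
    unsigned bits = 0;
    while (bits < 64 && (1ULL << bits) < bytes)
        bits++;
    return bits;
}
```

`address_bits_for(installed_bytes(32, 128))` gives 42, and the 8x64G configuration gives 39, so against a 48-bit virtual address space the gap is 6 and 9 bits respectively (a factor of 64 and 512).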

Cheers,
bzt
Korona
Member
Member
Posts: 1000
Joined: Thu May 17, 2007 1:27 pm
Contact:

Re: 64 bit recursive paging

Post by Korona »

I fully agree with your "virtual memory will be much larger than physical memory on x86_64 for the next few years". But 16x128 GiB = 2 TiB is something that you can easily buy and something that people actually use for specialized applications. At work, we use such a machine for computation on very large data sets.

This is a mainboard that can host such a configuration: https://www.supermicro.com/en/products/ ... d/X11DPI-N
EDIT: Note that if you go to quad socket boards, you can also have much more memory: https://www.supermicro.com/en/products/ ... rd/X11QPH+ (e.g., this board has 48 DIMMs)

Below is an example configuration (which I just got after some googling -- if you ask your favorite enterprise retailer, they'll surely have something similar).
Attachments
2tib.png
nullplan
Member
Member
Posts: 1792
Joined: Wed Aug 30, 2017 8:24 am

Re: 64 bit recursive paging

Post by nullplan »

Korona wrote:I fully agree with your "virtual memory will be much larger than physical memory on x86_64 for the next few years". But 16x128 GiB = 2 TiB is something that you can easily buy and something that people actually use for specialized applications. At work, we use such a machine for computation on very large data sets.
For the next lots of years. Intel published the 5-level paging whitepaper in 2016, and Ice Lake was announced as the first microarchitecture to implement it. That the freshly released Ice Lake CPUs apparently still don't have it is irksome, but not of importance at this point. Even the $2 mill. configuration Octo posted above does not quite get to the 64 TB limit imposed by mapping all physical memory to a quarter of the 48-bit address space, and by the time we get there, Intel will hopefully have ironed out the kinks in Ice Lake, granting us a 57-bit VA space, with a new memory limit of 32 PB. And if we ever actually get there, maybe someone should ask what the hell we are doing with enough memory to save all human knowledge.
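
The limit that comes from mapping all physical memory into a quarter of the address space is just a shift. A sketch, assuming (as above) that the direct map gets exactly one quarter of the virtual address space; `direct_map_limit` is a hypothetical name:

```c
#include <assert.h>
#include <stdint.h>

/* Largest amount of physical memory (in bytes) that fits in a direct map
 * occupying one quarter of a `va_bits`-wide virtual address space. */
static uint64_t direct_map_limit(unsigned va_bits)
{
    assert(va_bits >= 2 && va_bits <= 64);
    return 1ULL << (va_bits - 2);
}
```

For 48 bits this returns 2^46 bytes (64 TiB). Each extra address bit doubles it: 2^54 (16 PiB) for 56 bits, 2^55 (32 PiB) for the 57 bits that 5-level paging provides.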
Carpe diem!
Korona
Member
Member
Posts: 1000
Joined: Thu May 17, 2007 1:27 pm
Contact:

Re: 64 bit recursive paging

Post by Korona »

True. By "next few" I meant: probably a decade or so. (At least once we get to a 128 bit ISA, that will be enough forever. :))
linguofreak
Member
Member
Posts: 510
Joined: Wed Mar 09, 2011 3:55 am

Re: 64 bit recursive paging

Post by linguofreak »

bzt wrote:
linguofreak wrote:Citation needed? I can't find anything in the AMD documentation on the architecture that says that there is an architectural limit
Try the search phrase "architectural limit" :-) I've looked it up for you, and I remembered incorrectly, it is not 56 bits, just 52 bits in best case.
You cut the sentence off early when you quoted me, what I said was:
Citation needed? I can't find anything in the AMD documentation on the architecture that says that there is an architectural limit *on virtual memory short of the 64-bit mark*.
(Missing bit in italics, important bit also in bold).

The AMD docs mention the limit on the number of *physical* address bits as an architectural limit, but while it is true that no existing implementations support full 64-bit virtual addresses, and that the page table structure to do so has not yet been specified, they do *not* say anywhere that the limit on *virtual* address width is an *architectural* limit.

Btw, here are the references:
  • AMD64 Architecture Programmer's Manual Volume 2: System Programming, section 3.1.2 CR2 and CR3 registers, bits 52-63 reserved and MBZ.
  • AMD64 Architecture Programmer's Manual Volume 2: System Programming, section 5.1 Page Translation Overview,
    Currently, the AMD64 architecture defines a mechanism for translating 48-bit virtual addresses to 52-bit physical addresses. The mechanism used to translate a full 64-bit virtual addresses is reserved
    and it also mentions that a certain implementation may implement fewer bits, requiring minimum 36 bits only (to support legacy PAE).
Once again, you're mixing up physical and virtual addresses. The 36 bits needed to support PAE are bits of *physical* address width, and have nothing to do with virtual addresses. And saying that the mechanism to translate a full 64-bit virtual address is reserved is basically a statement that the current limit of 48 bits is *not* an architectural limit: they're reserving it (and enforcing sign-extension of addresses) for the explicit purpose of being able to expand virtual addresses out later on in a backwards-compatible manner.
  • Intel64 and IA-32 Architectures Software Developer's Manual, Volume 3A, System Programming Guide, Part 1, section 3.10.4 Enhanced Paging Data Structures (allows up to 39 bits, but bits 52-63 reserved for system programmer, so it can't go above 52 bits).
The format discussed here is exactly identical in the AMD and Intel manuals, except for slight differences in the terminology used to describe it. For the CR3 format, bits 52-63 are *reserved*, not for the system programmer, but for future expansion, and must be zero. For paging structure entries that point at a paging structure one level down, bits 52-62 are Available/Ignored (as stated by the AMD/Intel manuals respectively), and bit 63 is the NX/XD bit. For paging structure entries that designate pages of any size, bits 59-62, instead of being available/ignored/reserved to the system programmer, are a protection key if CR4.PKE is set.

Note that CR3 and paging structure entries contain (partial) *physical* addresses (plus flag bits). And all of those XD/PKE/Available bits are *why* the *physical* address width is *architecturally* limited to 52-bits for all future implementations (whatever the physical address width of current implementations), while virtual addresses may, in future implementations, eventually be as long as a full 64-bits (whatever the virtual address width of current implementations). AMD left themselves room to expand virtual addresses out to 64 bits, but they only left room to expand physical addresses out to 52-bits without introducing breaking changes to the PTE format.
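
To make the bit layout concrete, here is a sketch of the fields just described for an entry that maps a 4 KiB page. The macro and function names are hypothetical; the bit positions (address in bits 12..51, protection key in bits 59..62 when CR4.PKE is set, NX/XD in bit 63) are the ones discussed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PTE_PRESENT    (1ULL << 0)
#define PTE_ADDR_MASK  0x000FFFFFFFFFF000ULL /* bits 12..51: frame address */
#define PTE_PKEY_SHIFT 59
#define PTE_PKEY_MASK  (0xFULL << PTE_PKEY_SHIFT) /* bits 59..62 */
#define PTE_NX         (1ULL << 63)              /* no-execute (NX/XD) */

/* Build an entry from a page-aligned physical address and flag bits.
 * The assert is exactly the 52-bit physical limit: anything above bit 51
 * would collide with the available, protection-key and NX bits. */
static uint64_t pte_make(uint64_t phys, uint64_t flags)
{
    assert((phys & ~PTE_ADDR_MASK) == 0);
    return phys | flags;
}

/* Physical frame address stored in the entry. */
static uint64_t pte_phys(uint64_t pte)
{
    return pte & PTE_ADDR_MASK;
}

/* True if the entry forbids instruction fetches. */
static bool pte_no_execute(uint64_t pte)
{
    return (pte & PTE_NX) != 0;
}

/* Protection key of a leaf entry (meaningful only if CR4.PKE is set). */
static unsigned pte_pkey(uint64_t pte)
{
    return (unsigned)((pte & PTE_PKEY_MASK) >> PTE_PKEY_SHIFT);
}
```

Because every bit above 51 already has a meaning, widening the physical address field would change how existing entries are interpreted, which is why the 52-bit limit is architectural.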
  • ARM DDI0478 ARM Architecture Reference Manual ARMv8 section D4.1.3 VMSA address types and address spaces: with ARMv8-LVA it is 52 bits, otherwise 48 bits, physical address likewise.
ARM manuals are irrelevant to the specification of the x86-64 architecture.
To sum it up:
AMD: VA 48, PA 52 (however nobody ever built a CPU with more than 39 bits, maybe some research arch uses 40 bits, 1T RAM)
Intel: VA 48, PA 39 (as of the first release of the spec) theoretical max 52
Not quite:

AMD: Current VA 48, Current PA 39-ish, Max VA 64, Max PA 52
Intel: Current VA 48, Current PA 39-ish, Max VA 64, Max PA 52

In both cases, the only *architectural* limit on VA size is the word-width of the architecture, but PA size is architecturally limited to 52 bits by the PTE format. Meanwhile, there are no implementations that have address spaces as big as the 64/52 architectural limit, nor have the architectural extensions needed to reach that point been specified yet (but room has been left in the current specifications to draw up such extensions without breaking backward compatibility).
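
The "current" widths of a given implementation don't have to be guessed: CPUID leaf 0x80000008 reports them in EAX (bits 7:0 are the physical address width, bits 15:8 the linear address width). The sketch below only decodes an EAX value that is passed in, so it stays portable; executing CPUID itself is left out. `struct addr_widths` and `decode_addr_widths` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

struct addr_widths {
    unsigned pa_bits; /* implemented physical address bits */
    unsigned va_bits; /* implemented virtual (linear) address bits */
};

/* Decode EAX as returned by CPUID leaf 0x80000008. */
static struct addr_widths decode_addr_widths(uint32_t eax)
{
    struct addr_widths w;
    w.pa_bits = eax & 0xFF;        /* bits 7:0  */
    w.va_bits = (eax >> 8) & 0xFF; /* bits 15:8 */

    /* Sanity checks against the limits discussed in this thread. */
    assert(w.pa_bits <= 52);
    assert(w.va_bits <= 64);
    return w;
}
```

An EAX value of 0x3027, for example, decodes to 39 physical and 48 virtual address bits.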
bzt
Member
Member
Posts: 1584
Joined: Thu Oct 13, 2016 4:55 pm
Contact:

Re: 64 bit recursive paging

Post by bzt »

linguofreak wrote:Once again, you're mixing up physical and virtual addresses.
Nope. In order to calculate the address space / physical RAM max factor you need both.
linguofreak wrote:current limit of 48 bits is *not* an architectural limit: they're reserving it (and enforcing sign-extension of addresses) for the explicit purpose of being able to expand virtual addresses out later on in a backwards-compatible manner.
Okay, I won't argue on this. Let's say the architecture was specified with a 48 bit limit, and that part of the spec hasn't been updated yet, so that limit still stands. Don't call it a limit of the architecture if you don't want to. :-)
linguofreak wrote:The format discussed here is exactly identical in the AMD and Intel manuals
Nope, AMD specifies as 52 bits, but Intel specifies as 39, explicitly marking bits 40 to 52 as MBZ. This was clearly shortsighted on Intel's part: since current processors use more bits for the physical address bus, they should have written 52 bits just like AMD.
linguofreak wrote:bits 52-63 are *reserved*, not for the system programmer, but for future expansion, and must be zero.
Since we are talking about such insignificant things: I meant the page table format, in which those bits are marked as "Avail." in both specs, just as you wrote in the next sentence.
linguofreak wrote:ARM manuals are irrelevant to the specification of the x86-64 architecture.
Only if you fail to recognize that "64 bit recursive paging" applies to multiple platforms.

I have a feeling we are saying the same thing, just with different words. Am I right? :-) The point is, the theoretical maximum for virtual address space is 64, while the hard upper limit for physical address bus is 52 bits, so there will be always more virtual memory than physical. Even though the former is 48 bits and the latter is 42 bits tops in reality, the relation still stands, so there's nothing to argue about.

Cheers,
bzt
Octocontrabass
Member
Member
Posts: 5575
Joined: Mon Mar 25, 2013 7:01 pm

Re: 64 bit recursive paging

Post by Octocontrabass »

bzt wrote:Let's say the architecture was specified with a 48 bit limit, and that part of the spec hasn't been updated yet, so that limit still stands.
Here's a draft of the spec update to support 57-bit virtual addresses. Intel hasn't officially finalized the spec yet, but it's likely they intend to finalize it with the release of Ice Lake server processors, whenever that finally happens.
bzt wrote:Nope, AMD specifies as 52 bits, but Intel specifies as 39, explicitly marking bits 40 to 52 as MBZ.
Intel's latest manual doesn't say that. Are you perhaps looking at an older one?
linguofreak
Member
Member
Posts: 510
Joined: Wed Mar 09, 2011 3:55 am

Re: 64 bit recursive paging

Post by linguofreak »

bzt wrote:
linguofreak wrote:current limit of 48 bits is *not* an architectural limit: they're reserving it (and enforcing sign-extension of addresses) for the explicit purpose of being able to expand virtual addresses out later on in a backwards-compatible manner.
Okay, I won't argue on this. Let's say the architecture was specified with a 48 bit limit, and that part of the spec hasn't been updated yet, so that limit still stands. Don't call it a limit of the architecture if you don't want to. :-)
The point is, AMD uses the phrase "architectural limit" *only* in discussing the 52-bit limit on physical addresses, and *nowhere* in discussing the width of virtual addresses. And the reason for this is plain: "architectural limit" means a limit that can't be gotten around without breaking changes. A limit that is *not* an architectural limit is one that can be gotten around with backward-compatible extensions. All those "Available/ignored/reserved to the system programmer" bits at bit 52 and up mean that physical addresses for the architecture cannot be extended to 64-bits without breaking changes. Virtual addresses, however, can be; it's just that the appropriate extensions don't exist yet.

The reason that I don't call the 48-bit virtual address limit an "architectural limit" isn't just that I don't want to call it that. I don't call it an architectural limit *because AMD doesn't*, and because what they mean by "architectural limit", when referring to the 52-bit limit on physical addresses, is clear.
bzt
Member
Member
Posts: 1584
Joined: Thu Oct 13, 2016 4:55 pm
Contact:

Re: 64 bit recursive paging

Post by bzt »

Octocontrabass wrote:Intel hasn't officially finalized the spec yet
Octocontrabass wrote:Are you perhaps looking at an older one?
I wrote:
bzt wrote:Intel: VA 48, PA 39 (as of the first release of the spec) theoretical max 52
Look, we could discuss this as long as you want, it won't change the fact that all 64 bit CPUs handle a 48 bit virtual address space, and even the most bleeding edge top CPUs have a physical address bus several orders of magnitude smaller. And just because the bus allows it doesn't mean your motherboard actually has enough slots for that much RAM; you'd be lucky to have 64G RAM these days, that's 2^36.

Btw, do as you like. Map the entire RAM or don't map. You can, if you want to, but you don't have to.

Cheers,
bzt