confusion over canonical/higher-half addressing

furryfreak
Member
Posts: 28
Joined: Mon Nov 03, 2008 12:45 pm
Location: SW England

confusion over canonical/higher-half addressing

Post by furryfreak »

Hi, I've been reading up on canonical address form for quite a while, but one thing remains unclear.

For simplicity's sake, let's say I have just 1 KiB of RAM, and I'm using 16-bit addressing.
If I wrote a byte to the canonical address 0xFFFF, would it be written to the end of memory (0x03FF), or to some random location?
thepowersgang
Member
Posts: 734
Joined: Tue Dec 25, 2007 6:03 am
Libera.chat IRC: thePowersGang
Location: Perth, Western Australia

Re: confusion over canonical/higher-half addressing

Post by thepowersgang »

Theoretically, in that case a page fault would occur and the fault handler would be called.
Normally, with paging enabled, accessing an address that is not mapped causes a page fault. The only exception is an access above 1 MiB without the A20 line enabled, in which case the access wraps around.
furryfreak
Member
Posts: 28
Joined: Mon Nov 03, 2008 12:45 pm
Location: SW England

Re: confusion over canonical/higher-half addressing

Post by furryfreak »

So basically, I should just do what I've already been doing: use INT 0x15 with EAX = 0xE820 to build a memory map, and use that to determine the end of physical memory.
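
For illustration, here is a minimal sketch of finding the end of usable RAM once an E820 map has been collected. It assumes the entries have already been copied out of the BIOS buffer into an array. The names `e820_entry`, `E820_TYPE_USABLE` and `highest_usable_end` are made up for this example; only the base/length/type fields and type 1 meaning "usable" come from the E820 interface itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Type 1 marks memory that is usable by the OS. */
#define E820_TYPE_USABLE 1u

/* One map entry as reported by INT 0x15, EAX = 0xE820.
 * This is a convenient in-memory copy (hypothetical layout),
 * not the packed 20-byte record the BIOS writes. */
struct e820_entry {
    uint64_t base;    /* physical start address */
    uint64_t length;  /* size of the range in bytes */
    uint32_t type;    /* 1 = usable, anything else = not usable here */
};

/* Return the exclusive end address of the highest usable range,
 * or 0 if the map contains no usable memory. */
uint64_t highest_usable_end(const struct e820_entry *map, size_t count)
{
    assert(map != NULL || count == 0);

    uint64_t top = 0;
    for (size_t i = 0; i < count; i++) {
        if (map[i].type != E820_TYPE_USABLE || map[i].length == 0)
            continue;
        uint64_t end = map[i].base + map[i].length;
        if (end > top)
            top = end;
    }
    return top;
}
```

Note that the result is a physical address, so it is unrelated to whether any virtual address is canonical.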
Love4Boobies
Member
Posts: 2111
Joined: Fri Mar 07, 2008 5:36 pm
Location: Bucharest, Romania

Re: confusion over canonical/higher-half addressing

Post by Love4Boobies »

Canonical addresses only exist in 64-bit mode. The address is sign-extended from the 48th bit (bit 47) up to the most significant bit.
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Re: confusion over canonical/higher-half addressing

Post by Brendan »

Hi,
furryfreak wrote:Hi, I've been reading up on canonical address form for quite a while, but one thing remains unclear.

For simplicity's sake, let's say I have just 1 KiB of RAM, and I'm using 16-bit addressing.
If I wrote a byte to the canonical address 0xFFFF, would it be written to the end of memory (0x03FF), or to some random location?
In long mode, the linear address space is split in half with a big hole in the middle. For example, a CPU might support 48-bit linear/virtual addresses, where the highest supported bit is extended into the unsupported bits. E.g. the address 0xABCD000000001234 isn't a valid 48-bit address, and the 48th bit (bit 47) is zero, so this zero bit would be extended into the unsupported bits, making the canonical form of the address 0x0000000000001234. In the same way, the address 0x1234800000001234 isn't a valid 48-bit address, and its canonical form would be 0xFFFF800000001234 because the 48th bit is set.

An address is "canonical" if the unsupported bits in the address are already set the same as the highest supported bit. For example, 0x0000000000001234 and 0xFFFF800000001234 are canonical addresses; but 0xABCD000000001234 and 0x1234800000001234 are not canonical.
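
As a rough sketch of the rule above, the check can be written in a few lines of C. The function names `canonicalize` and `is_canonical` and the `bits` parameter (number of supported linear address bits, 48 in the examples) are invented for illustration; the CPU itself does not fix up a non-canonical address for you, this only computes what the canonical form would be.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Copy the highest supported bit (bit `bits - 1`) into all of the
 * unsupported upper bits of a 64-bit linear address. */
uint64_t canonicalize(uint64_t addr, unsigned bits)
{
    assert(bits >= 1 && bits <= 64);
    if (bits == 64)
        return addr;                              /* no unsupported bits */

    uint64_t upper = ~UINT64_C(0) << bits;        /* mask of unsupported bits */
    uint64_t sign  = UINT64_C(1) << (bits - 1);   /* highest supported bit */

    return (addr & sign) ? (addr | upper) : (addr & ~upper);
}

/* An address is canonical if sign-extending it changes nothing. */
bool is_canonical(uint64_t addr, unsigned bits)
{
    return canonicalize(addr, bits) == addr;
}
```

With `bits` set to 48, `canonicalize(0x1234800000001234, 48)` gives 0xFFFF800000001234, and `is_canonical` accepts 0x0000000000001234 while rejecting 0xABCD000000001234, matching the examples in the post.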

Linear/virtual addresses used for memory accesses must be canonical, or the CPU will generate an exception (a general protection fault, or a stack fault for stack accesses).

Also note that (for 48-bit linear addressing in long mode) you end up with a linear/virtual address space like this:

0x0000000000000000 to 0x00007FFFFFFFFFFF - canonical addresses (often used for "user space")
0x0000800000000000 to 0xFFFF7FFFFFFFFFFF - non-canonical addresses (unusable)
0xFFFF800000000000 to 0xFFFFFFFFFFFFFFFF - canonical addresses (often used for "kernel space")
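
The three ranges above can be derived from the address width instead of being hard-coded. The following is only a sketch: the `region` enum and the `classify` function are hypothetical names, and `bits` is assumed to be the number of supported linear address bits (48 for the layout shown).

```c
#include <assert.h>
#include <stdint.h>

enum region {
    LOWER_HALF,     /* canonical, often used for "user space" */
    NON_CANONICAL,  /* the hole in the middle, unusable */
    HIGHER_HALF     /* canonical, often used for "kernel space" */
};

/* Decide which part of the linear address space an address falls in. */
enum region classify(uint64_t addr, unsigned bits)
{
    assert(bits >= 1 && bits <= 64);

    /* Last address of the lower half: highest supported bit clear,
     * all lower bits set (0x00007FFFFFFFFFFF for 48 bits). */
    uint64_t lower_end = (UINT64_C(1) << (bits - 1)) - 1;

    /* First address of the higher half: highest supported bit and
     * everything above it set (0xFFFF800000000000 for 48 bits). */
    uint64_t higher_start = ~UINT64_C(0) << (bits - 1);

    if (addr <= lower_end)
        return LOWER_HALF;
    if (addr >= higher_start)
        return HIGHER_HALF;
    return NON_CANONICAL;
}
```

A higher-half kernel would then expect every address it links against to classify as `HIGHER_HALF`.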

Note that this has everything to do with linear/virtual addressing; and has nothing to do with physical addresses.

The "end of the physical address space" depends on how many bits the CPU supports for physical addresses. You can get this information from CPUID (eax = 0x80000008), which returns the physical address width in bits 0-7 of EAX.
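
A small sketch of decoding that CPUID result follows. It deliberately takes the raw EAX value as a parameter rather than executing CPUID, so it stays portable; the names `addr_widths`, `decode_leaf_80000008` and `physical_limit` are invented for this example. In real code you would first confirm via CPUID leaf 0x80000000 that leaf 0x80000008 is supported.

```c
#include <assert.h>
#include <stdint.h>

struct addr_widths {
    unsigned physical_bits;  /* EAX bits 7:0  */
    unsigned linear_bits;    /* EAX bits 15:8 */
};

/* Split the EAX value returned by CPUID leaf 0x80000008 into its
 * physical and linear address width fields. */
struct addr_widths decode_leaf_80000008(uint32_t eax)
{
    struct addr_widths w;
    w.physical_bits = eax & 0xFFu;
    w.linear_bits   = (eax >> 8) & 0xFFu;
    return w;
}

/* Highest physical address the CPU can generate for a given width. */
uint64_t physical_limit(unsigned physical_bits)
{
    assert(physical_bits >= 1 && physical_bits <= 64);
    if (physical_bits == 64)
        return UINT64_MAX;
    return (UINT64_C(1) << physical_bits) - 1;
}
```

For example, an EAX value of 0x3028 decodes to 40 physical bits and 48 linear bits, so the physical address space would end at 0xFFFFFFFFFF. This is the limit of what the CPU can address, not the amount of RAM actually installed.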


Cheers,

Brendan
furryfreak
Member
Posts: 28
Joined: Mon Nov 03, 2008 12:45 pm
Location: SW England

Re: confusion over canonical/higher-half addressing

Post by furryfreak »

Brendan wrote:Note that this has everything to do with linear/virtual addressing; and has nothing to do with physical addresses.
I've got it now; I thought the physical address a page was mapped to had to be canonical.
Cheers for the help.

--freak