confusion over canonical/higher-half addressing

Posted: Sun Nov 16, 2008 7:35 am
by furryfreak
Hi, I've been reading up on canonical address form for quite a while, but one thing remains unclear.

For simplicity's sake, let's say I have just 1 KB of RAM and I'm using 16-bit addressing.
If I wrote a byte to the canonical address 0xFFFF, would it be written to the end of memory (0x03FF), or to some random location?

Re: confusion over canonical/higher-half addressing

Posted: Sun Nov 16, 2008 7:54 am
by thepowersgang
Theoretically, in that case a page fault would occur and the fault handler would be called.
Normally, if memory that does not exist is accessed, a page fault happens. The only exception is an access above 1 MB without the A20 line enabled, in which case the access wraps around.

Re: confusion over canonical/higher-half addressing

Posted: Sun Nov 16, 2008 8:05 am
by furryfreak
So basically, I should just keep doing what I've already been doing: use INT 0x15 with EAX = 0xE820 to build a memory map, and use that to determine the end of physical memory.
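
A rough sketch of that last step in C, assuming the E820 entries have already been copied into an array. The struct and function names are made up for illustration; the fields (64-bit base, 64-bit length, 32-bit type, with type 1 meaning usable RAM) follow the usual E820 entry format.

```c
#include <stddef.h>
#include <stdint.h>

/* Fields of one E820 entry. Hypothetical names; this struct is not packed,
 * so it is a copy of the data, not a byte-for-byte overlay of the BIOS buffer. */
struct e820_entry {
    uint64_t base;    /* physical start address of the range */
    uint64_t length;  /* size of the range in bytes */
    uint32_t type;    /* 1 = usable RAM, other values = reserved/other */
};

#define E820_TYPE_USABLE 1u

/* Return the first physical address past the highest usable RAM range,
 * or 0 if the map contains no usable RAM. */
static uint64_t e820_end_of_ram(const struct e820_entry *map, size_t count)
{
    uint64_t end = 0;
    for (size_t i = 0; i < count; i++) {
        if (map[i].type != E820_TYPE_USABLE || map[i].length == 0)
            continue;
        uint64_t range_end = map[i].base + map[i].length;
        if (range_end < map[i].base)   /* sum wrapped past 2^64: clamp */
            range_end = UINT64_MAX;
        if (range_end > end)
            end = range_end;
    }
    return end;
}
```

Only type 1 ranges are counted here, so reserved regions near the top of the address space do not inflate the result.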

Re: confusion over canonical/higher-half addressing

Posted: Sun Nov 16, 2008 8:24 am
by Love4Boobies
Canonical addresses only exist in 64-bit mode. The address gets sign-extended from the 48th bit up to the most significant bit.

Re: confusion over canonical/higher-half addressing

Posted: Sun Nov 16, 2008 8:30 am
by Brendan
Hi,
furryfreak wrote:Hi, I've been reading up on canonical address form for quite a while, but one thing remains unclear.

For simplicity's sake, let's say I have just 1 KB of RAM and I'm using 16-bit addressing.
If I wrote a byte to the canonical address 0xFFFF, would it be written to the end of memory (0x03FF), or to some random location?
In long mode, the linear address space is split in half with a big hole in the middle. For example, a CPU might support 48-bit linear/virtual addresses, where the highest supported bit is extended into the unsupported bits. E.g. the address 0xABCD000000001234 isn't a valid 48-bit address; its 48th bit is zero, so this zero bit would be extended into the unsupported bits of the address, causing the actual address to be 0x0000000000001234. In the same way, the address 0x1234800000001234 isn't a valid 48-bit address and would become 0xFFFF800000001234 because its 48th bit is set.

An address is "canonical" if the unsupported bits in the address are already set the same as the highest supported bit. For example, 0x0000000000001234 and 0xFFFF800000001234 are canonical addresses; but 0xABCD000000001234 and 0x1234800000001234 are not canonical.
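
As a small illustration of the rule above, here is a minimal C sketch. The function names `canonicalize` and `is_canonical` are hypothetical, and `bits` is the number of supported linear address bits (48 in the examples, assumed to be between 1 and 63).

```c
#include <stdbool.h>
#include <stdint.h>

/* Copy the highest supported bit (bit bits-1) into all unsupported bits.
 * Assumes 1 <= bits <= 63. */
static uint64_t canonicalize(uint64_t addr, unsigned bits)
{
    uint64_t low_mask = (UINT64_C(1) << bits) - 1;   /* the supported bits */
    uint64_t top_bit  = UINT64_C(1) << (bits - 1);   /* highest supported bit */
    uint64_t low      = addr & low_mask;

    return (low & top_bit) ? (low | ~low_mask) : low;
}

/* An address is canonical if extending the top bit changes nothing. */
static bool is_canonical(uint64_t addr, unsigned bits)
{
    return canonicalize(addr, bits) == addr;
}
```

With `bits = 48`, `canonicalize(0x1234800000001234, 48)` gives 0xFFFF800000001234, and `is_canonical` returns false for 0xABCD000000001234, matching the examples above.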

Most (all?) linear/virtual addresses must be canonical or the CPU will generate an exception.

Also note that (for 48-bit linear addressing in long mode) you end up with a linear/virtual address space like this:

0x0000000000000000 to 0x00007FFFFFFFFFFF - canonical addresses (often used for "user space")
0x0000800000000000 to 0xFFFF7FFFFFFFFFFF - non-canonical addresses (unusable)
0xFFFF800000000000 to 0xFFFFFFFFFFFFFFFF - canonical addresses (often used for "kernel space")
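
The boundaries in this list follow directly from the number of supported bits. A minimal sketch in C, with a hypothetical struct and function name, assuming `bits` is between 1 and 63:

```c
#include <stdint.h>

/* Boundaries of the two canonical halves (hypothetical helper type). */
struct canonical_layout {
    uint64_t lower_last;   /* last address of the lower canonical half */
    uint64_t upper_first;  /* first address of the upper canonical half */
};

/* Everything strictly between lower_last and upper_first is the
 * non-canonical hole. Assumes 1 <= bits <= 63. */
static struct canonical_layout canonical_layout_for(unsigned bits)
{
    struct canonical_layout l;
    l.lower_last  = (UINT64_C(1) << (bits - 1)) - 1;  /* top bit clear, rest set */
    l.upper_first = ~l.lower_last;                    /* top bit and above set */
    return l;
}
```

For `bits = 48` this yields 0x00007FFFFFFFFFFF and 0xFFFF800000000000, the same boundaries as listed above.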

Note that this has everything to do with linear/virtual addressing; and has nothing to do with physical addresses.

The "end of the physical address space" depends on how many bits the CPU supports for physical addresses. You can get this information from CPUID (eax = 0x80000008).
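
A hedged sketch of reading that leaf in C. In the value returned in EAX, bits 7:0 hold the physical address width and bits 15:8 hold the linear address width. The struct and function names are hypothetical, and the `query_addr_widths` part assumes a GCC or Clang toolchain on x86, where `<cpuid.h>` provides `__get_cpuid`.

```c
#include <stdint.h>

/* Address widths reported by CPUID leaf 0x80000008 (hypothetical type). */
struct addr_widths {
    unsigned phys_bits;    /* EAX bits 7:0  */
    unsigned linear_bits;  /* EAX bits 15:8 */
};

/* Decode the EAX value returned by CPUID with eax = 0x80000008. */
static inline struct addr_widths decode_addr_widths(uint32_t eax)
{
    struct addr_widths w;
    w.phys_bits   = eax & 0xFFu;
    w.linear_bits = (eax >> 8) & 0xFFu;
    return w;
}

#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#include <cpuid.h>

/* Returns 1 and fills *out on success, 0 if the leaf is not available. */
static inline int query_addr_widths(struct addr_widths *out)
{
    unsigned eax, ebx, ecx, edx;
    if (!__get_cpuid(0x80000008u, &eax, &ebx, &ecx, &edx))
        return 0;
    *out = decode_addr_widths(eax);
    return 1;
}
#endif
```

For example, an EAX value of 0x00003024 decodes to 36 physical address bits and 48 linear address bits.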


Cheers,

Brendan

Re: confusion over canonical/higher-half addressing

Posted: Sun Nov 16, 2008 10:54 am
by furryfreak
Brendan wrote:Note that this has everything to do with linear/virtual addressing; and has nothing to do with physical addresses.
I've got it now. I thought the physical address a page was mapped to had to be canonical.
Cheers for the help.

--freak