
64 bit - what difference will it make ?

Posted: Wed Apr 22, 2009 5:29 am
by extremecoder
What would be the difference between developing a 64-bit operating system and a 32-bit one?

Apart from addressing 2^64 bytes of memory and being able to handle 64-bit numbers, what other significant differences will it make?

When I want to develop a 64-bit OS, does that mean I have to do that using 64 bit ASM / Compiler ?

Just want to know multiple views on 64 bit OS's pros and cons ...

Re: 64 bit - what difference will it make ?

Posted: Wed Apr 22, 2009 5:40 am
by Brendan
Hi,
extremecoder wrote:What would be the difference between developing a 64-bit operating system and a 32-bit one?

Apart from addressing 2^64 bytes of memory and being able to handle 64-bit numbers, what other significant differences will it make?
That alone makes a significant difference (depending on a lot of things).

The other advantage is that there are more registers and they're wider (e.g. sixteen 64-bit general purpose registers instead of eight 32-bit general purpose registers), which can make some pieces of code run a lot faster.

For a "64-bit only" OS you can avoid a lot of legacy issues too - e.g. you know the CPU has an FPU, MMX and SSE without even looking, the chance of ISA cards being present is a lot lower, you're almost guaranteed to have local APIC/s and I/O APICs, etc.
extremecoder wrote:When I want to develop a 64-bit OS, does that mean I have to do that using 64 bit ASM / Compiler ?
Yes.
extremecoder wrote:Just want to know multiple views on 64 bit OS's pros and cons ...
It's possible to write one OS that has a 32-bit kernel and a 64-bit kernel... ;)


Cheers,

Brendan

Re: 64 bit - what difference will it make ?

Posted: Wed Apr 22, 2009 7:38 am
by 01000101
Also, the x86-64 ABI uses registers for function parameters and is thus a little faster there.

"Knowing" SSE is available is a huge benefit, as being able to utilize vector operations using 128-bit XMM registers in the core of an OS can really speed things up. There is of course the 48-bit memory addressing mechanism, which is always a plus.

Even the non-SSE instruction suffixes are a plus (e.g. movsq); these can cut iteration counts in half if not more and will speed up basic memory functions.
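
To put a number on the "cut iteration counts in half" point, here is a minimal C sketch rather than real kernel code. The helper name copy_in_words is made up for illustration, and it assumes the length is a multiple of the word size. Each loop iteration moves one word, so moving 8 bytes at a time (as movsq does) needs half as many iterations as moving 4 bytes at a time (movsd).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copies len bytes in chunks of `word` bytes (4 or 8)
   and returns how many loop iterations were needed.
   Assumes len is a multiple of word. */
static size_t copy_in_words(void *dst, const void *src, size_t len, size_t word)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t iterations = 0;

    for (size_t off = 0; off < len; off += word) {
        memcpy(d + off, s + off, word);   /* one word-sized move per iteration */
        iterations++;
    }
    return iterations;
}
```

Copying a 64-byte buffer this way takes 16 iterations with 4-byte words and 8 iterations with 8-byte words. Whether that translates into a real speedup depends on the CPU and on how the copy is implemented.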

Re: 64 bit - what difference will it make ?

Posted: Wed Apr 22, 2009 8:50 am
by Love4Boobies
I'm not sure it really speeds up memory accesses. If you want faster memory, you need faster memory.

Re: 64 bit - what difference will it make ?

Posted: Wed Apr 22, 2009 9:45 am
by Firestryke31
It won't speed up the time it takes to read memory, but it will cut the number of times you need to read said memory in about half, resulting in increased speed.

Re: 64 bit - what difference will it make ?

Posted: Wed Apr 22, 2009 12:29 pm
by Dex
Most of the advantage can be discounted if you have FPU, MMX and SSE etc. It's also possible to have a 32-bit OS but run programs in 64-bit long mode.

Re: 64 bit - what difference will it make ?

Posted: Wed Apr 22, 2009 4:09 pm
by bewing
Not mentioned yet: one big difference is that the segment registers don't do nearly as much anymore, so you can forget about using segmentation as a memory model. (Not meaningful, of course, if you had no intention of doing that anyway.)

Re: 64 bit - what difference will it make ?

Posted: Wed Apr 22, 2009 4:37 pm
by madeofstaples
Firestryke31 wrote:It won't speed up the time it takes to read memory, but it will cut the number of times you need to read said memory in about half, resulting in increased speed.
Could you (or anyone) elaborate on this a little bit? Sure, you have more registers, so data need not retire from the registers as soon to make room for more data, but is it necessarily faster to load a quadword from memory in 64-bit mode rather than 32-bit mode? With a 64-bit data bus, and the presumption that the compiler automatically aligns data, requesting to read the first 32 bits would cause the last 32 bits to be cached, and a subsequent instruction to load those bits would generate a cache hit, no?

Re: 64 bit - what difference will it make ?

Posted: Wed Apr 22, 2009 5:02 pm
by Firestryke31
I probably shouldn't have said that since I don't actually know all of the specific cases, so I can't actually back that up. It makes sense to me, but what makes sense to me and what's actually going on are usually two different things.

Re: 64 bit - what difference will it make ?

Posted: Wed Apr 22, 2009 6:54 pm
by Brendan
Hi,
madeofstaples wrote:
Firestryke31 wrote:It won't speed up the time it takes to read memory, but it will cut the number of times you need to read said memory in about half, resulting in increased speed.
Could you (or anyone) elaborate on this a little bit? Sure, you have more registers, so data need not retire from the registers as soon to make room for more data, but is it necessarily faster to load a quadword from memory in 64-bit mode rather than 32-bit mode? With a 64-bit data bus, and the presumption that the compiler automatically aligns data, requesting to read the first 32 bits would cause the last 32 bits to be cached, and a subsequent instruction to load those bits would generate a cache hit, no?
It's not about improving RAM access speeds, it's about ordering those RAM accesses better and doing more in parallel.

For example, imagine you've got 8 * 8 font data and you need to draw a character in an 8-bits per pixel video mode. With 64-bit code you can load all the font data into 8 general purpose registers (one byte per register per screen line, e.g. "movzx r8, byte [fontData]" to "movzx r15, byte [fontData+7]"); then use these registers as an index into a lookup table to get eight 64-bit masks (e.g. so that 00111000b becomes 0x0000FFFFFF000000); then do "clearMask = ~mask" and "setMask = mask & colourData", then write the data to the screen with "temp = clearMask", "and temp,[rdi]", "or temp,setMask" and "mov [rdi],temp".

Now consider how an out-of-order CPU works:

Code:

    movzx r8,byte [rsi]
    movzx r9,byte [rsi+1]
    movzx r10,byte [rsi+2]
    movzx r11,byte [rsi+3]
    movzx r12,byte [rsi+4]
    movzx r13,byte [rsi+5]
    movzx r14,byte [rsi+6]
    movzx r15,byte [rsi+7]
    mov r8,[lookupTable + r8*8]   ;CPU needs to wait for "movzx r8,byte [rsi]" to complete
    mov r9,[lookupTable + r9*8]   ;CPU needs to wait for "movzx r9,byte [rsi+1]" to complete
Now consider attempting the same with 32-bit code:

Code:

    movzx eax,byte [esi]
    movzx ecx,byte [esi+1]
    mov ebx,eax                    ;CPU needs to wait for "movzx eax,byte [esi]" to complete
    mov edx,ecx                    ;CPU needs to wait for "movzx ecx,byte [esi+1]" to complete
    and al,0x0F
    and cl,0x0F
    shr ebx,4                      ;CPU needs to wait for "mov ebx,eax" to complete
    shr edx,4                      ;CPU needs to wait for "mov edx,ecx" to complete
    mov eax,[lookupTable+eax*4]    ;CPU needs to wait for "and al,0x0F" to complete
    mov ecx,[lookupTable+ecx*4]    ;CPU needs to wait for "and cl,0x0F" to complete
    mov ebx,[lookupTable+ebx*4]    ;CPU needs to wait for "shr ebx,4" to complete
    mov edx,[lookupTable+edx*4]    ;CPU needs to wait for "shr edx,4" to complete
Notice that for the 32-bit code there's less distance between dependent instructions (more chance of performance being affected by instruction latencies), there are more instructions, and it's doing sixteen pixels at a time (rather than 64 pixels at a time in the 64-bit version). In this case the 32-bit code would need to do 4 groups of 16 pixels (instead of one group of 64 pixels), which would make it 4 times slower to start with, and when you take into account the instruction latencies plus the additional instructions for bit twiddling you might end up with code that's actually about 6 times slower.
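
For anyone who finds the mask trick easier to follow in C, here is a rough sketch of one character row being drawn with 64-bit operations. The function names expand_row and draw_row are made up for illustration, and the loop in expand_row stands in for the 256-entry lookup table described above. Treat it as an illustration of the idea under those assumptions rather than the actual code.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: expands one 8-pixel font row into a 64-bit mask with
   one 0xFF byte per set pixel, so 00111000b becomes 0x0000FFFFFF000000.
   Real code would use a precomputed 256-entry lookup table instead. */
static uint64_t expand_row(uint8_t bits)
{
    uint64_t mask = 0;
    for (int i = 0; i < 8; i++) {
        if (bits & (1u << i))
            mask |= 0xFFull << (8 * i);
    }
    return mask;
}

/* Hypothetical helper: draws one font row onto 8 pixels of an 8 bpp
   framebuffer using a single 64-bit read-modify-write. */
static void draw_row(uint8_t *dst, uint8_t bits, uint8_t colour)
{
    uint64_t mask       = expand_row(bits);
    uint64_t colourData = 0x0101010101010101ull * colour; /* colour in every byte */
    uint64_t clearMask  = ~mask;
    uint64_t setMask    = mask & colourData;
    uint64_t temp;

    memcpy(&temp, dst, sizeof temp);      /* load 8 pixels at once */
    temp = (temp & clearMask) | setMask;  /* keep background, set foreground */
    memcpy(dst, &temp, sizeof temp);      /* store 8 pixels at once */
}
```

Which pixel ends up leftmost on screen depends on byte order and on how the font data is stored, so the bit-to-byte mapping here is only one possible convention.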


Cheers,

Brendan

Re: 64 bit - what difference will it make ?

Posted: Wed Apr 22, 2009 11:49 pm
by skyking
madeofstaples wrote:
Firestryke31 wrote:It won't speed up the time it takes to read memory, but it will cut the number of times you need to read said memory in about half, resulting in increased speed.
Could you (or anyone) elaborate on this a little bit? Sure, you have more registers, so data need not retire from the registers as soon to make room for more data, but is it necessarily faster to load a quadword from memory in 64-bit mode rather than 32-bit mode? With a 64-bit data bus, and the presumption that the compiler automatically aligns data, requesting to read the first 32 bits would cause the last 32 bits to be cached, and a subsequent instruction to load those bits would generate a cache hit, no?
This actually has a lot to do with different CPUs having different bus performance, which does not depend only on which instruction set is in use. Also, as pointed out, 64-bit memory accesses are possible with the 32-bit instruction set too, and the physical memory access these days will not be as wide as that due to the way the FSB is designed; however, the memory will be read in cache-line blocks.

Also, the number of registers is only a logical number. The actual number of physical registers may be higher, and given the problems caused by having only 8 GPRs it's quite reasonable to expect that it is, so you might not need to stall the processor waiting for operations to complete if execution happens out of order.

However, there are cases where performance differences are likely. If you do actual arithmetic on 64-bit data, the 32-bit program needs several instructions for things that the 64-bit program can do in a single instruction. Also, the superscalar capabilities will not always do the optimal thing.
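
As a small illustration of the arithmetic point, here is a C sketch of a 64-bit addition done the way 32-bit code has to do it, next to the native version. The struct and function names are made up for this example. On x86 the two-step version corresponds to an add followed by an adc, while 64-bit code needs a single add.

```c
#include <stdint.h>

/* A 64-bit value held as two 32-bit halves, the way 32-bit code has to
   keep it in registers (e.g. edx:eax). Hypothetical type for illustration. */
struct u64_pair {
    uint32_t lo;
    uint32_t hi;
};

/* Adds two 64-bit values using only 32-bit operations: add the low halves,
   then add the high halves plus the carry out of the low addition. */
static struct u64_pair add64_via_32(struct u64_pair a, struct u64_pair b)
{
    struct u64_pair r;
    r.lo = a.lo + b.lo;
    uint32_t carry = (r.lo < a.lo);   /* unsigned wrap-around means a carry */
    r.hi = a.hi + b.hi + carry;
    return r;
}

/* The same addition when 64-bit registers are available: one operation. */
static uint64_t add64_native(uint64_t a, uint64_t b)
{
    return a + b;
}
```

Multiplication and division widen the gap further, since the 32-bit versions need several partial products or a helper routine.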

Brendan: if your kernel is able to control 64-bit processes, the kernel has to be written for a 64-bit processor, and then IMHO it is a 64-bit OS even though not all its code is written in the 64-bit instruction set.

Re: 64 bit - what difference will it make ?

Posted: Thu Apr 23, 2009 12:55 am
by Brendan
Hi,
Brendan wrote:
extremecoder wrote:Just want to know multiple views on 64 bit OS's pros and cons ...
It's possible to write one OS that has a 32-bit kernel and a 64-bit kernel... ;)
skyking wrote:Brendan: if your kernel is able to control 64-bit processes, the kernel has to be written for a 64-bit processor, and then IMHO it is a 64-bit OS even though not all its code is written in the 64-bit instruction set.
If you've got 2 separate kernels, where one kernel is "32-bit only" and runs in protected mode, and the other kernel runs in long mode (and is capable of running both 32-bit and 64-bit processes), then you can have one OS that uses either kernel. Further, if device drivers, etc are implemented as processes (e.g. the micro-kernel approach) then both kernels would be able to run 32-bit device drivers. In this case you couldn't really call it a 32-bit OS or a 64-bit OS, because it's both.
Dex wrote:Most of the advantage can be discounted if you have FPU, MMX and SSE etc. It's also possible to have a 32-bit OS but run programs in 64-bit long mode.
For a 64-bit process to run under a 32-bit kernel, it'd need to run at CPL=0, have special thunking to access the kernel API, run with IRQs disabled (or use thunking for IRQs too, complete with horrendous race conditions), include its own exception handlers (because the 32-bit kernel's exception handlers will be useless), manage its own page tables (e.g. convert "32-bit paging" into "long mode paging" and update the accessed and dirty bits in the "32-bit paging" when thunking back to 32-bit), and have its own GDT and IDT.

In general, I also think it'd be a good idea to classify OSs more accurately. For example, if an OS's kernel allows massive security/protection violations, then it should be called a "machine monitor" or something to avoid confusion with real OSs... :P


Cheers,

Brendan

Re: 64 bit - what difference will it make ?

Posted: Thu Apr 23, 2009 2:32 am
by Tomaka17
Maybe a noobish question because I don't know much about amd64 architecture

But isn't it possible for a kernel to detect whether it runs on a 32-bit or 64-bit CPU and adapt itself to this?
I mean a 64-bit kernel that would also run on a 32-bit CPU by using only protected mode and not long mode.

Re: 64 bit - what difference will it make ?

Posted: Thu Apr 23, 2009 2:45 am
by AJ
Hi,

It's possible to have a kernel image containing both 32 and 64 bit code and only run the 64 bit code if on a 64 bit CPU (determined with cpuid), yes. In fact, that's pretty much how my boot loader works. But for a full kernel, that seems a bit of a waste.

You need to load the entire kernel image into memory, so you are (almost) doubling up there. You also need to include code in the kernel to determine the processor type and run appropriate code. IMO, it's better to make this distinction at compile-time and have your boot loader decide which kernel image to load and run.
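
For reference, the cpuid check mentioned above boils down to testing the long mode (LM) flag, bit 29 of EDX from CPUID leaf 0x80000001. The sketch below is hosted C and assumes GCC or Clang with their <cpuid.h> header. The function and macro names are made up for illustration, and a real boot loader would more likely do this in assembly (and first confirm that the CPUID instruction itself exists).

```c
#include <stdint.h>

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>   /* GCC/Clang helper header; an assumption of this sketch */
#endif

/* Bit 29 of EDX from CPUID leaf 0x80000001 is the long mode (LM) flag. */
#define CPUID_EXT_FEATURES 0x80000001u
#define CPUID_EDX_LM       (1u << 29)

/* Hypothetical helper: tests the LM flag in an EDX value. */
static int edx_has_long_mode(uint32_t edx)
{
    return (edx & CPUID_EDX_LM) != 0;
}

/* Hypothetical helper: returns 1 if the CPU reports long mode support,
   0 otherwise. On non-x86 builds this sketch simply reports 0. */
static int cpu_supports_long_mode(void)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    unsigned int eax, ebx, ecx, edx;
    /* __get_cpuid returns 0 if the requested leaf is not supported. */
    if (!__get_cpuid(CPUID_EXT_FEATURES, &eax, &ebx, &ecx, &edx))
        return 0;
    return edx_has_long_mode(edx);
#else
    return 0;
#endif
}
```

A boot loader following the approach above would run a check like this once and then pick the 32-bit or 64-bit kernel image accordingly.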

Cheers,
Adam

Re: 64 bit - what difference will it make ?

Posted: Thu Apr 23, 2009 8:37 am
by Love4Boobies
Just thought I should mention this. Back when I was a Free Pascal fan, I remember Florian Klämpfl (the project coordinator) saying he did some experiments and found 64-bit code to be somewhat slower than 32-bit. I'm not quite sure what exactly he tried however.