long mode

Dex4u · Post by **Dex4u** » Fri Aug 19, 2005 9:05 am

I agree with you: from a hobby OS point of view, games are the only thing you would need it for. But there's the big problem: unless you have written a driver for your own graphics card and for most other people's, you will always have a bottleneck with VESA from a games point of view.

It may be good for things like password crackers >:( .

Pype.Clicker · Post by **Pype.Clicker** » Sat Aug 20, 2005 3:23 pm

Dex4u wrote: but I do not see any advantage in using long mode for a desktop OS, let alone a hobby OS; maybe someone could point them out.

Well, the main advantage I can point out is how easy it becomes to implement "small address space"-like technologies (co-hosting several programs in the same address space to reduce switching cost), plus easier support for shared libraries (either because you have more room to put them, or because AMD64 now at last supports RIP-relative addressing of data).

Colonel Kernel · Post by **Colonel Kernel** » Sat Aug 20, 2005 8:54 pm

Pype.Clicker wrote: Well, the main advantage I can point out is how easy it becomes to implement "small address space"-like technologies (co-hosting several programs in the same address space to reduce switching cost)

I thought long mode didn't support segmentation. Without segmentation, how would you implement small spaces?

Brendan · Post by **Brendan** » Sun Aug 21, 2005 12:31 am

Hi,

Colonel Kernel wrote:
Pype.Clicker wrote: Well, the main advantage I can point out is how easy it becomes to implement "small address space"-like technologies (co-hosting several programs in the same address space to reduce switching cost)
I thought long mode didn't support segmentation. Without segmentation, how would you implement small spaces?

Supporting small address spaces isn't a problem - each small address space can have one or more 1 GB "chunks" (the large address space uses multiple page directories that control 1 GB each). If all CPL=3 code uses RIP-relative addressing you won't need segmentation to make it run at any address (it'd be position-independent code). The problem is doing it securely.
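To make the 1 GB figure concrete: with 4-level paging and 4 KiB pages, a 48-bit virtual address splits into four 9-bit table indices plus a 12-bit page offset, so one page directory (selected by one PDPT entry) spans 512 x 2 MiB = 1 GiB. The following is only a sketch of that arithmetic; `split_vaddr`, `chunk_number` and `chunk_base` are illustrative names, not part of any real kernel.

```python
PAGE_SHIFT = 12              # 4 KiB pages
INDEX_MASK = 0x1FF           # 9 bits: 512 entries per paging table
VADDR_MASK = (1 << 48) - 1   # 48 implemented virtual address bits (assumed)
GIB = 1 << 30


def split_vaddr(vaddr):
    """Split a virtual address into (pml4, pdpt, pd, pt, offset) indices."""
    vaddr &= VADDR_MASK
    offset = vaddr & ((1 << PAGE_SHIFT) - 1)
    pt = (vaddr >> 12) & INDEX_MASK
    pd = (vaddr >> 21) & INDEX_MASK
    pdpt = (vaddr >> 30) & INDEX_MASK
    pml4 = (vaddr >> 39) & INDEX_MASK
    return pml4, pdpt, pd, pt, offset


def chunk_number(vaddr):
    """Index of the 1 GiB chunk (one page directory) that contains vaddr."""
    return (vaddr & VADDR_MASK) >> 30


def chunk_base(chunk):
    """Lowest virtual address of a given 1 GiB chunk (hypothetical layout)."""
    return chunk * GIB
```

Under this hypothetical layout, a program linked as position-independent code could be loaded at `chunk_base(n)` for whichever chunk `n` is free, which is the relocation half of the small-space idea; it does nothing for the protection half.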

As for doing small spaces securely (that is, protecting small address spaces from each other), I can't think of a mechanism that would help. An alternative would be to only allow trusted code in small spaces (device drivers and system code perhaps), or to group related code (e.g. a parent process and all its child processes share the same large address space). I call it "minimizing risk", and for some OSs it might be good enough (e.g. a games OS, or an OS that runs virtual machines instead of binary code).

In any case, the cost of TLB flushes will depend on the OS design. Systems where the CPU regularly goes from one address space to a second address space and then back to the first (e.g. OSs that use client/server style blocking IPC) would suffer from TLB flushes a lot - getting rid of the TLB flushes would mean that the TLB entries for the first address space may still be in the CPU when it returns to that address space.

For systems that use non-blocking IPC, where context switches (or address space switches) aren't required for IPC the TLB flushes aren't as much of a problem. This is mainly because the TLB entries for one address space are more likely to be replaced by more recent TLB entries by the time the CPU returns to that address space, so flushing them makes less difference to performance.
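The two cases above can be illustrated with a toy model. This is a deliberately crude sketch, not a description of real hardware: `ToyTLB` is a hypothetical fully associative LRU cache keyed by (address space, page), and "no flush" simply means entries of both spaces may coexist, as they would if the programs shared one large address space.

```python
from collections import OrderedDict


class ToyTLB:
    """Tiny LRU TLB model; entries are keyed by (address space, page)."""

    def __init__(self, capacity, flush_on_switch=True):
        self.capacity = capacity
        self.flush_on_switch = flush_on_switch
        self.entries = OrderedDict()
        self.current = None
        self.misses = 0

    def switch_to(self, space):
        # Model an address space switch that discards all cached entries.
        if space != self.current and self.flush_on_switch:
            self.entries.clear()
        self.current = space

    def access(self, page):
        key = (self.current, page)
        if key in self.entries:
            self.entries.move_to_end(key)      # hit: mark as recently used
            return
        self.misses += 1
        self.entries[key] = True
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)   # evict least recently used


def run(tlb, schedule):
    """Replay (space, pages) steps and return the total number of misses."""
    for space, pages in schedule:
        tlb.switch_to(space)
        for page in pages:
            tlb.access(page)
    return tlb.misses


# Blocking client/server IPC: client -> server -> back to client.
PING_PONG = [("client", range(8)), ("server", range(8)), ("client", range(8))]
```

With a 64-entry model, `run(ToyTLB(64, True), PING_PONG)` gives 24 misses and `run(ToyTLB(64, False), PING_PONG)` gives 16, because the client's entries are still there when it resumes. Shrink the model to 8 entries and the unflushed run also gives 24: the server has already pushed the client's entries out, so skipping the flush buys nothing. That is the same effect described for workloads where a lot happens before the CPU returns to the first address space.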

Another interesting factor is "global pages" that aren't flushed when address spaces are changed, which are commonly used for the kernel's part of the address space. If you put commonly used things in the kernel's part of the address space and run them at CPL=0 then their TLB entries won't be flushed because of a context switch, and you also won't need to change address spaces to use those commonly used things. An example here would be memory management.
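On x86 the "global" property is a single flag in the page table entry (bit 8), and it is only honoured when CR4.PGE is enabled. A minimal sketch of how kernel and user entries might differ; the helper names (`make_pte`, `kernel_pte`, `user_pte`, `survives_cr3_reload`) are made up for illustration, and the model assumes CR4.PGE is already set.

```python
PTE_PRESENT = 1 << 0
PTE_WRITABLE = 1 << 1
PTE_USER = 1 << 2      # accessible from CPL=3
PTE_GLOBAL = 1 << 8    # entry is not flushed when CR3 is reloaded

ADDR_MASK = 0x000FFFFFFFFFF000   # physical frame address, bits 12..51


def make_pte(phys_addr, flags):
    """Combine a 4 KiB-aligned physical address with flag bits."""
    if phys_addr & 0xFFF:
        raise ValueError("physical address must be 4 KiB aligned")
    return (phys_addr & ADDR_MASK) | flags


def kernel_pte(phys_addr):
    """Kernel mapping shared by every address space: mark it global."""
    return make_pte(phys_addr, PTE_PRESENT | PTE_WRITABLE | PTE_GLOBAL)


def user_pte(phys_addr):
    """Per-process mapping: not global, so a CR3 reload discards it."""
    return make_pte(phys_addr, PTE_PRESENT | PTE_WRITABLE | PTE_USER)


def survives_cr3_reload(pte):
    """True if a cached translation for this entry outlives a CR3 reload."""
    return bool(pte & PTE_PRESENT) and bool(pte & PTE_GLOBAL)
```

Anything placed in the kernel's part of the address space, such as the memory manager in the example above, would be mapped with entries like `kernel_pte(...)`.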

@Dex4u - IMHO one of the main benefits of long mode is the extra general registers, which can improve performance a lot for some code. The fact is that by the time most of our OSs are actually useful, single-CPU 32 bit stuff will probably be obsolete.

Cheers,

Brendan

Dex4u · Post by **Dex4u** » Sun Aug 21, 2005 7:05 am

Brendan wrote: @Dex4u - IMHO one of the main benefits of long mode is the extra general registers, which can improve performance a lot for some code. The fact is that by the time most of our OSs are actually useful, single-CPU 32 bit stuff will probably be obsolete.

As an asm programmer, the extra registers will come in handy, but the advantages are outweighed by the disadvantages.

I think multicore will make a bigger difference.
I know no one will agree with me, but multitasking as we know it will be dead in years to come; we will give each task its own CPU (maybe a low-end CPU).
As I said a long time ago, and was flamed for it:
Multitasking is an invention from when CPUs were $3000 apiece.

And as 64-bit chips are backward compatible with 32-bit, it will be as obsolete as 16-bit, but people used to a 32-bit OS will not see the big difference they did between 16- and 32-bit OSs.

Brendan · Post by **Brendan** » Sun Aug 21, 2005 10:52 am

Hi,

Dex4u wrote:
@Dex4u - IMHO one of the main benefits of long mode is the extra general registers, which can improve performance a lot for some code. The fact is that by the time most of our OSs are actually useful, single-CPU 32 bit stuff will probably be obsolete.
As an asm programmer, the extra registers will come in handy, but the advantages are outweighed by the disadvantages.

For most OS designs, the only disadvantage that I'm aware of is slightly larger binaries - something that is mostly irrelevant considering modern RAM and cache sizes.

Dex4u wrote: I think multicore will make a bigger difference.
I know no one will agree with me, but multitasking as we know it will be dead in years to come; we will give each task its own CPU (maybe a low-end CPU).
As I said a long time ago, and was flamed for it:
Multitasking is an invention from when CPUs were $3000 apiece.

And as 64-bit chips are backward compatible with 32-bit, it will be as obsolete as 16-bit, but people used to a 32-bit OS will not see the big difference they did between 16- and 32-bit OSs.

I just did "ps -A" as root on the dual-CPU computer I'm using and got a list of 108 processes, but almost all of these processes spend almost all of the time waiting. If this computer had over 100 CPUs, then almost all CPUs would be doing nothing almost all of the time. For this computer, what matters is that when something happens (some network activity, a keypress, etc) any processing that needs to be done is done quickly. Unfortunately, for any one "event" I doubt that more than one of the CPUs is normally used to process it. In fact I'd be surprised if having a second CPU helps with much on this computer except compiling (and then only if "make -j" is used).

The problem, as I see it, is that programmers have been taught to write linear code since the beginning of computers. Multi-tasking helped, but allowed programmers to continue writing their linear code. Multi-threading helps, but unfortunately a relatively large number of programs/programmers ignore it or break their code into a few large threads rather than a lot of small threads, and continue to program without too much change since single-tasking systems.

In the next few years manufacturers will roll out a whole pile of multi-processing hardware, and I wouldn't be surprised if 4 or more CPUs is common in 5 years time, but it's going to take a decade or more for programmers to write code that is designed to get the best performance from these multi-processor systems.

Because of this, for CPU manufacturers it makes much more sense to produce single core, dual core and possibly quad core chips with fast cores than it does to produce chips with lots more cores and reduced "per core" performance. I don't think this will change until programmers and OSs start doing things differently.

Cheers,

Brendan

Candy · Post by **Candy** » Mon Aug 22, 2005 1:58 am

Brendan wrote: I just did "ps -A" as root on the dual-CPU computer I'm using and got a list of 108 processes, but almost all of these processes spend almost all of the time waiting. If this computer had over 100 CPUs, then almost all CPUs would be doing nothing almost all of the time. For this computer, what matters is that when something happens (some network activity, a keypress, etc) any processing that needs to be done is done quickly. Unfortunately, for any one "event" I doubt that more than one of the CPUs is normally used to process it. In fact I'd be surprised if having a second CPU helps with much on this computer except compiling (and then only if "make -j" is used).

The problem, as I see it, is that programmers have been taught to write linear code since the beginning of computers. Multi-tasking helped, but allowed programmers to continue writing their linear code. Multi-threading helps, but unfortunately a relatively large number of programs/programmers ignore it or break their code into a few large threads rather than a lot of small threads, and continue to program without too much change since single-tasking systems.

In the next few years manufacturers will roll out a whole pile of multi-processing hardware, and I wouldn't be surprised if 4 or more CPUs is common in 5 years time, but it's going to take a decade or more for programmers to write code that is designed to get the best performance from these multi-processor systems.

Because of this, for CPU manufacturers it makes much more sense to produce single core, dual core and possibly quad core chips with fast cores than it does to produce chips with lots more cores and reduced "per core" performance. I don't think this will change until programmers and OSs start doing things differently.

Can't agree more.

However, you can do your bit to help programmers realise the advantage of multithreading. Make the things you offer them all thread safe, offer easy multithreading primitives with short & intuitive names (get/put instead of WaitOne() and ReleaseMutex()), make creating a thread very low overhead and preferably make the base of the application (the thing generated by wizards creating a basic application) use at least two threads, one for the UI and one for the processing.

This is exactly what Microsoft is not doing: a complex threading interface, weird stuff about single-threaded apartments and multi-threaded apartments, awkward bugs and untraceable problems when multithreading, not to mention that pretty much all of the .NET classes are not thread safe. I've added threads to a .NET app because it had a very good use for them (it's a viewing app with very complex import, so there are imports taking a few hours or up to a day). That took me a few days to get right. Creating the multithreading stuff in my own OS took less.
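As a rough picture of the "get/put plus two threads" skeleton suggested above, here is a minimal sketch using Python's standard `queue` and `threading` modules. It is not a proposal for any particular OS API; `worker`, `run_app` and the squaring "job" are placeholders standing in for a UI thread handing long imports to a processing thread.

```python
import queue
import threading


def worker(jobs, results):
    """Processing thread: take jobs until the None sentinel arrives."""
    while True:
        job = jobs.get()          # blocks until the UI side puts something
        if job is None:
            break
        results.put(job * job)    # stand-in for a long-running import


def run_app(inputs):
    """'UI' thread: hand work to the processing thread, then collect results."""
    jobs = queue.Queue()
    results = queue.Queue()
    processing = threading.Thread(target=worker, args=(jobs, results))
    processing.start()
    for item in inputs:
        jobs.put(item)            # never blocks the UI on the work itself
    jobs.put(None)                # tell the worker to finish
    processing.join()
    return [results.get() for _ in inputs]
```

The whole synchronisation surface is two verbs, `put` and `get`, on a thread-safe queue; the caller never touches a mutex directly.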

OSDev.org
