OSDev.org

Posted: **Thu May 19, 2011 4:15 am**

Hi everyone.
I've heard that the hardware multitasking mechanism in Intel processors is very slow, that nobody uses it nowadays, and that multitasking in an OS is usually implemented in software instead. I'd like to hear your opinions on this; I need your advice.

Posted: **Thu May 19, 2011 5:07 am**

One, it is slower than doing it in software.

Two, it binds the number of possible processes to the hardware limit, as each TSS requires a GDT entry. The GDT is limited to 8192 entries - simply not sufficient for a modern platform.
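For reference, the 8192 figure follows from the selector format: a segment selector is 16 bits, of which 13 are the table index. Here is a minimal C sketch of that layout. The names `make_selector` and `selector_index` are made up for illustration and are not from any particular kernel.

```c
#include <assert.h>
#include <stdint.h>

/* x86 segment selector layout:
 *   bits 15..3  index into the GDT/LDT
 *   bit  2      table indicator (0 = GDT, 1 = LDT)
 *   bits 1..0   requested privilege level (RPL)
 */
enum {
    SELECTOR_INDEX_BITS = 13,
    GDT_MAX_ENTRIES     = 1 << SELECTOR_INDEX_BITS  /* 8192 */
};

/* Build a GDT selector from an index and an RPL (hypothetical helper). */
static inline uint16_t make_selector(uint16_t index, unsigned rpl)
{
    assert(index < GDT_MAX_ENTRIES && rpl < 4);
    return (uint16_t)((index << 3) | rpl);
}

/* Recover the table index from a selector (hypothetical helper). */
static inline uint16_t selector_index(uint16_t selector)
{
    return (uint16_t)(selector >> 3);
}
```

With one TSS descriptor per task, the index field is what caps the task count, regardless of how much memory the machine has.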

There might be more de-facto limitations, but those two were enough for me to never seriously consider using HW/TSS based multitasking so I didn't really read up on it.

Posted: **Thu May 19, 2011 6:11 am**

You may also want to look at the wiki page on Context Switching and also at this thread, particularly Brendan's post.

You also asked for my opinion, which is this: Design the interface first, then it doesn't matter which you choose. Then you can always change the implementation later. That said, I'd definitely lean towards software-based task switching.

Posted: **Thu May 19, 2011 10:56 am**

I think the hardware method is faster for segmented designs, where segment-registers frequently change when switching threads. For flat designs, where segment-registers are not reloaded when switching threads, software is probably faster.

In the beginning I used the hardware method, because it seemed superior, and could catch invalid selectors better than the software method. It is especially more efficient when stack exceptions occur in the kernel, something that is easily caught when TSSes are used. The software task switch uses the stack of the incoming thread, and thus can fault in the scheduler while saving registers, which is bad indeed. OTOH, this could be solved by having a private per-core stack in the scheduler.

The biggest problem with the hardware method is how it interacts with SMP and multiple CPUs. I think the hardware method is fundamentally incompatible with SMP. That was why I switched to the software method.

Posted: **Thu May 19, 2011 11:29 am**

Solar wrote:Two, it binds the number of possible processes to the hardware limit, as each TSS requires a GDT entry. The GDT is limited to 8192 entries - simply not sufficient for a modern platform.

I don't see an 8192 thread limit as a problem. I don't even see an 8192 / 3 (3 GDT selectors per thread) = 2730 thread limit as a problem. A normal system would not need to have thousands of threads. I use three GDT selectors per thread (thread selector, TSS selector and ring 0 stack), and I've not run into a non-error situation where GDT selectors are running out. I have the TSS selector only so that ring 3 to ring 0 switches would not need patching ESP0 in a single TSS, and to allow for the IO permission bitmap to be different between threads. I almost phased this out as I tried TR for processor core identification, but after I chose to have a slightly different GDT per core for core identification instead, I also reverted to one TSS per thread because it offered more flexibility (especially regarding the IO permission bitmap). In the current setup, the TSS selector has the same base and size as the thread selector, which means that the thread block starts with the 32-bit TSS, has some private data, and ends with the IO permission bitmap. Software task switching would save and load registers in the normal places of the TSS. I don't save registers on the stack, especially since the kernel debugger needs some defined way to find the registers of a kernel thread that has faulted. The standard TSS provides this, and there is no reason to have any other layout.
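To make the layout concrete, here is a rough C sketch of a thread block that starts with a 32-bit TSS, carries some private data, and ends with the IO permission bitmap. The TSS field offsets follow the architectural 104-byte layout. Everything else (`struct thread_block`, `PRIVATE_BYTES`, `thread_block_init`) is a hypothetical illustration, not RDOS's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 32-bit TSS as the CPU expects it (104 bytes). */
struct tss32 {
    uint16_t prev_task, reserved0;
    uint32_t esp0;                  /* ring 0 stack pointer */
    uint16_t ss0, reserved1;        /* ring 0 stack segment */
    uint32_t esp1;
    uint16_t ss1, reserved2;
    uint32_t esp2;
    uint16_t ss2, reserved3;
    uint32_t cr3, eip, eflags;
    uint32_t eax, ecx, edx, ebx, esp, ebp, esi, edi;
    uint16_t es, reserved4;
    uint16_t cs, reserved5;
    uint16_t ss, reserved6;
    uint16_t ds, reserved7;
    uint16_t fs, reserved8;
    uint16_t gs, reserved9;
    uint16_t ldt, reserved10;
    uint16_t trap;                  /* bit 0: debug trap on task switch */
    uint16_t iomap_base;            /* offset from TSS base to the IO bitmap */
};

static_assert(sizeof(struct tss32) == 104, "TSS must be 104 bytes");

#define PRIVATE_BYTES   64              /* assumption: arbitrary size */
#define IO_BITMAP_BYTES (65536 / 8)     /* one bit per IO port */

/* Hypothetical per-thread block: TSS first, IO bitmap last. */
struct thread_block {
    struct tss32 tss;
    uint8_t private_data[PRIVATE_BYTES];
    uint8_t io_bitmap[IO_BITMAP_BYTES + 1]; /* +1: terminating 0xFF byte */
};

/* Zero the block, point the TSS at the bitmap, and deny all ports. */
static inline void thread_block_init(struct thread_block *tb)
{
    memset(tb, 0, sizeof *tb);
    tb->tss.iomap_base = (uint16_t)offsetof(struct thread_block, io_bitmap);
    memset(tb->io_bitmap, 0xFF, sizeof tb->io_bitmap);
}
```

With this arrangement a software task switch can store the outgoing registers into `tss.eax`, `tss.esp` and so on, so a debugger always finds them at fixed offsets from the thread block's base.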

Posted: **Thu May 19, 2011 12:46 pm**

rdos wrote:I don't see an 8192 thread limit as a problem. I don't even see an 8192 / 3 (3 GDT selectors per thread) = 2730 thread limit as a problem.

I'm browsing now with a basic (Core 2 duo T5250 @1.5GHz, 2GiB RAM), old laptop with Vista Home Premium. A quick look at the task manager reveals 70 processes with 997 active threads. I have an email client and this browser open with 2 tabs (and everything's running perfectly quickly enough...). If that's the case on such a basic setup, I can see how in a server environment or when running more processes at once, those threads would quickly surpass the 2730 limit.

Cheers,
Adam

Posted: **Thu May 19, 2011 1:03 pm**

AJ wrote:
rdos wrote:I don't see an 8192 thread limit as a problem. I don't even see an 8192 / 3 (3 GDT selectors per thread) = 2730 thread limit as a problem.
I'm browsing now with a basic (Core 2 duo T5250 @1.5GHz, 2GiB RAM), old laptop with Vista Home Premium. A quick look at the task manager reveals 70 processes with 997 active threads. I have an email client and this browser open with 2 tabs (and everything's running perfectly quickly enough...). If that's the case on such a basic setup, I can see how in a server environment or when running more processes at once, those threads would quickly surpass the 2730 limit.

Cheers,
Adam

Windows is not a normal system. I'm sure that you did not start 70 programs with an average of 14 threads each yourself?

In my configurations of RDOS, the number of threads usually fits on a single page (25), and when our payment application is run, it might go up to 50. That is a long way from 1000.

Posted: **Thu May 19, 2011 2:11 pm**

rdos wrote:Windows is not a normal system.

It depends how you define "normal". I'd say that the number of computers using it makes it more of a "normal" system to benchmark other OSes against than rdos. The only reason I chose Vista as an example is because that happens to be what I'm using on this computer now.

I'm sure that you did not start 70 programs with an average of 14 threads each yourself?

Of course not - most of the threads there would be due to background services, but that's the situation on a lot of desktop PCs. It would be sensible for anyone getting into OS dev to take into account that on a modern OS running on modern hardware, a large number of threads may be created (particularly on anything running server software). On a new OS, it is also sensible to use the hardware manufacturer's advised method of task switching rather than one which is considered "legacy".

In my configurations of RDOS, the number of threads usually fits on a single page (25), and when our payment application is run, it might go up to 50. That is a long way from 1000.

That's fine if that's the only environment that an OS is designed to run in. If someone is starting out right now with something that they want to be a general-purpose modern OS (which I am assuming because the OP has not said otherwise), then I would not suggest imposing a limit on threads which should not be there in the first place.

Cheers,
Adam

Posted: **Thu May 19, 2011 2:43 pm**

Hi,

rdos wrote:
Solar wrote:Two, it binds the number of possible processes to the hardware limit, as each TSS requires a GDT entry. The GDT is limited to 8192 entries - simply not sufficient for a modern platform.
I don't see an 8192 thread limit as a problem.

It's not a problem. You need 2 GDT entries per CPU - one for the "currently in use" TSS gate and one for a "currently spare" TSS gate. During task switches you set the base address in the "currently spare" TSS gate to point to the new task's TSS, then do the hardware task switch. With one GDT you end up with a limit of about 4000 CPUs (and "infinite" tasks). For more CPUs than that you could use different GDTs for different CPUs.
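The "spare slot" idea can be sketched in C roughly as follows. `tss_descriptor` encodes a standard 8-byte descriptor for an available 32-bit TSS (access byte 0x89). `struct cpu_tss_slots` and `prepare_hw_switch` are hypothetical names for illustration, and the far jump that actually performs the hardware switch is left out, so this only models the GDT bookkeeping.

```c
#include <assert.h>
#include <stdint.h>

/* Encode a GDT descriptor for an available 32-bit TSS. */
static inline uint64_t tss_descriptor(uint32_t base, uint32_t limit)
{
    assert(limit <= 0xFFFFF);
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFF);             /* limit 15:0  */
    d |= (uint64_t)(base & 0xFFFFFF) << 16;      /* base  23:0  */
    d |= (uint64_t)0x89 << 40;                   /* present, DPL 0, type 9 */
    d |= (uint64_t)((limit >> 16) & 0xF) << 48;  /* limit 19:16 */
    d |= (uint64_t)(base >> 24) << 56;           /* base  31:24 */
    return d;
}

/* Hypothetical per-CPU bookkeeping: two GDT slots per CPU. */
struct cpu_tss_slots {
    uint64_t *gdt;      /* GDT as seen by this CPU */
    uint16_t  in_use;   /* GDT index of the TSS currently in TR */
    uint16_t  spare;    /* GDT index free to be repointed */
};

/* Point the spare slot at the next task's TSS and return the selector
 * a far jump would use. The jump itself is omitted in this sketch;
 * afterwards the two slots swap roles. */
static inline uint16_t prepare_hw_switch(struct cpu_tss_slots *c,
                                         uint32_t next_tss_base)
{
    c->gdt[c->spare] = tss_descriptor(next_tss_base, 103); /* 104-byte TSS */
    uint16_t selector = (uint16_t)(c->spare << 3);
    uint16_t old = c->in_use;
    c->in_use = c->spare;
    c->spare = old;
    return selector;
}
```

Because only two descriptors per CPU are ever live, the number of tasks is bounded by memory for the TSS structures themselves rather than by the GDT size.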

For "number of threads", most modern software is crap, and doesn't have anywhere near enough threads to use multiple CPUs effectively. Ideal software would use at least one thread per CPU; so with 100 processes on a 16-CPU system like mine you should expect to have 1600 threads. Of course more CPUs means the user would expect to be able to do more at the same time, so for more CPUs you should expect more processes running (and more threads per process too). A limit (?) of 1000 processes per CPU would imply a need for a kernel designed to handle up to 1000 threads on a single CPU, 4000 threads on dual-CPU, 16000 threads on quad-CPU, 64000 threads on eight-CPU, 256000 threads on sixteen-CPU, etc.
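The arithmetic behind those figures is quadratic in the CPU count: processes scale with the number of CPUs, and each process runs one thread per CPU. A small C sketch of that reading (the function names are made up for illustration):

```c
#include <assert.h>

/* Threads for a fixed number of processes, one thread per CPU each. */
static inline unsigned long threads_for(unsigned long processes,
                                        unsigned long cpus)
{
    return processes * cpus;
}

/* Threads when the process count also scales with the CPU count:
 * (processes_per_cpu * cpus) processes, each with `cpus` threads. */
static inline unsigned long expected_threads(unsigned long cpus,
                                             unsigned long processes_per_cpu)
{
    assert(cpus > 0);
    return threads_for(processes_per_cpu * cpus, cpus);
}
```

For example, `threads_for(100, 16)` gives the 1600 threads mentioned for a 16-CPU machine, and `expected_threads(n, 1000)` gives 1000 × n² for n CPUs.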

rdos wrote:I think the hardware method is faster for segmented designs, where segment-registers frequently change when switching threads. For flat designs, where segment-registers are not reloaded when switching threads, software is probably faster.

Do you still have copies of RDOS from immediately before and after software task switching was implemented? It'd be nice to get accurate "apples vs. apples" benchmarks of the difference, even if it is for an "all segment registers need to be reloaded" design...

Normally only the kernel does task switches. If the kernel always uses specific segment registers, then you don't need to change segment registers during task switches (because you're switching from the kernel in one task to the kernel in another task); even if user-space uses lots of different segment registers. If the kernel does change some segment registers then you only need to save/load those segment registers during task switches, so even in that case you wouldn't need to save/load CS and may be able to avoid saving/loading some of the other segment registers.

Hardware multi-tasking was designed to allow user space code in one task to switch directly to user-space code in another task, without any kernel involvement at all. In this case everything must be saved/loaded, and extra checks are needed to determine if the GDT entry is in GDT limits, if the GDT entry is a task gate, if the current task is allowed to switch directly to the other task and if the other task is "busy". This is what makes it slower than saving/loading all general registers and all segment registers in software (which is obviously slower than only saving/loading some or none of the general registers and some or none of the segment registers).

Cheers,

Brendan

Posted: **Thu May 19, 2011 10:42 pm**

AJ wrote:Of course not - most of the threads there would be due to background services, but that's the situation on a lot of desktop PCs. It would be sensible for anyone getting into OS dev to take into account that on a modern OS running on modern hardware, a large number of threads may be created (particularly on anything running server software).

Actually, it would not be sensible for someone doing their own OS to expect to end up with a bloated design like Windows in their lifetime. I've worked on my OS since 1988, and I have still not exceeded a few dozen kernel threads, so I don't expect to reach thousands of kernel threads in my lifetime. And the chances of a hobby OS becoming the next mainstream OS are almost zero (if not zero), since the niche for open source is already filled by Linux.

We also should remember that Microsoft didn't invent "threads" until quite recently. They stomped on one thread for years, so I suspect they are overdoing it now.

And most (even bigger) Windows programs today are single-threaded. If today's software really moved towards efficient multithreaded applications, we would see new computers advertised on the market with an increasing number of cores. This does not seem to be the case. The most popular (and cost-effective) machines have two or four cores, effectively allowing multitasking of a few single-threaded applications. So the fact that Windows has thousands of threads does not mean that the environment is efficient. Most of them seem to be sleeping all of the time.

Posted: **Fri May 20, 2011 12:42 am**

rdos wrote:We also should remember that Microsoft didn't invent "threads" until quite recently.

I don't think that anybody except you considers "Since Windows 95" as "recent".

Posted: **Fri May 20, 2011 7:50 am**

Combuster wrote:
rdos wrote:We also should remember that Microsoft didn't invent "threads" until quite recently.
I don't think that anybody except you considers "Since Windows 95" as "recent".

No?

Look at the timeline of Microsoft's (and other x86) OSes (http://en.wikipedia.org/wiki/Timeline_o ... ng_systems)

* 1980 - Bill Gates and another guy get the contract for an OS that will later become MS-DOS
* 1995 - Microsoft's first attempts at multitasking, which didn't work very well

IOW, it took Microsoft 15 years to realize there was something called threads. It was only 16 years ago that their first bug-ridden, multithreaded OS was released. It was not until Windows 98 (or possibly Windows NT) that they had something that worked to any reasonable quality standard.

Posted: **Fri May 20, 2011 9:00 am**

Microsoft's first multi-threading capable OS was released in 1987 (OS/2 1.0). NT 3.1 was released in 1993. NT and OS/2 were, of course, designed to use threads and built with multi-threaded applications in mind right from the start.


Do you use hardware mechanism of multitasking?
