
Re:Minimalist Microkernel Memory Management

Posted: Tue Aug 09, 2005 8:11 pm
by gaf
Colonel Kernel wrote:I also think it would be silly for the OS API to vary wildly depending on the underlying policies.
Thanks dude, you probably helped me more with this than you'd ever expected :D

As I've already mentioned before, my plan was to divide the OS into three parts:
[*] kernel + drivers: mechanism to access/multiplex the hardware, protected by hierarchical capabilities
[*] user-level managers: system-wide minimum policy to provide fairness
[*] libraries: abstraction for the app programmer, policy as far as the user-level managers would allow

What you've pointed me to is that I never really gave any thought to how the user-level managers should actually accomplish their task in detail. I just had the rather faint idea that they would have to decide upon a policy (that's why I wanted to support multiple managers..) and then provide a somewhat specialized interface to the apps. I always had a bad feeling about this, as it means taking away some of the policy from the apps, which in my opinion made it necessary to support multiple managers with all the problems this would cause (different APIs, resource balancing between managers, etc).

What I've for some reason not noticed is that it's not the policy that ensures fairness in a kernel, but built-in limits that are either statically set by the programmer or can be specified by the user. One example would be a priority round-robin scheduler that seems to ensure fairness, although by itself it doesn't prevent apps from scheduling themselves forever by setting their priority to a very high value. In fact it's a built-in limitation that makes sure this doesn't happen, by either restricting access to high priority classes to certain privileged apps (eg: drivers only) or by using some kind of time limitation (priority shrinks for each time-slice used completely).

Consequently it should also be possible to ensure system-wide fairness without involving any policy, but only by enforcing user-set limitations/quotas for the resources provided by the kernel. This can be done by allowing the user to specify these restrictions in advance using some ini-file and then using a user-space manager that enforces these limits.

In order to show you how this can work in real life, I'll present a small example that will deal with scheduling as this is in my opinion the most complex kernel resource (CPU, Memory+I/O, IRQ) to manage. Let's assume that the user is using a system that runs mostly desktop tasks, but sometimes would also like to execute apps that have to meet real-time constraints.

Here's a small sketch of the ini-file to be used:
[pre]
[scheduler - version: 1.0]

class realtime
{
level: 0
cpu: 100%

capability: GID 0x0012, GID 0x0013

quantum_min: 1 TS
quantum_max: 10 TS
}

class round-robin
{
level: 1
cpu: 50%

capability: 0

quantum_min: 1 TS
quantum_max: 4 TS
}

class priority-class
{
level: 1
cpu: 50%

capability: 0

quantum_min: 3 TS
quantum_max: 6 TS
}
[/pre]

As the scheduler's job in my approach is mainly to decide who gets how many resources, all it has to do when a new task shall be started is to keep to what the user has specified. In this case this would mean that there are 3 scheduler classes (real-time, round-robin, priority-class) on two levels. Tasks that are scheduled according to the real-time class can together consume all of the CPU time, but a special group ID is required to be allowed to use it. The other two classes share whatever CPU time remains in equal parts (50%, 50%). They don't require any authorisation and are thus open for all. In fact these two classes only differ in the allowed quantum length, and I only didn't combine them for demonstration purposes ;)
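To make the two-level split concrete, here's a minimal sketch (illustrative C only, not code from any real manager; the names and the integer-percent representation are assumptions) of how the user-level manager could turn the ini-file percentages into actual shares of the CPU:

```c
#include <assert.h>

/* Illustrative sketch: the share of total CPU time a class is entitled to.
 * A class at a lower level gets its configured percentage of whatever CPU
 * time the levels above it left unused. */
int effective_share(int class_percent, int used_by_higher_levels)
{
    int remaining = 100 - used_by_higher_levels;  /* CPU left for this level */
    return remaining * class_percent / 100;       /* this class's cut of it  */
}
```

If the real-time class actually consumed 40% of the CPU, each of the two level-1 classes would be entitled to `effective_share(50, 40)`, i.e. 30% of the total; with no real-time load they would get the full 50% each.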

Any opinions about this idea ?

cheers
gaf

Re:Minimalist Microkernel Memory Management

Posted: Tue Aug 09, 2005 10:27 pm
by Solar
gaf wrote: If the system is not able to constantly produce 25 fps for a movie there's hardly any use in trying to play it.
I strongly disagree. It's up to the user to decide what kind of multimedia performance he considers bearable and what he doesn't. If he has to have stable rates, he should be using the "real time" mode (and software written for it, since it's the one mode that doesn't easily generalize.)

Add to that that being able to ensure a stable frame rate reduces the maximum frame rate possible. That's why I drafted a dedicated "performance / multimedia" mode: Get the maximum possible framerate, even if it's abysmal.
It's similar for games, which need a certain frame-rate to avoid stutter, while it's at the same time pointless to produce more than that, as most engines do today, since a frame-rate around 30 is totally sufficient to create the illusion of smooth movement.
Many ego-shooter power-gamers will disagree with you on that point.
As Legend has already said a lot of users would probably like to run both desktop and gaming apps on their system without having to restart/reinstall first.
They can anytime. But when would users want to run a system optimized for both at the same time? It's not as if desktop apps would stop working if the system is set for max. gaming performance. I daresay the difference would hardly be noticeable except under heavy load. (And how do you get that with desktop apps?)
Quite in general I think that they would often like to use applications on their systems that you didn't have on your mind when designing the distro..
It's not about being able to run, it's about what the system is optimized for.
Some weeks ago I was at a (private) LAN party, and the guy who invited us set up an older computer that he normally used for desktop work to allow the internet to be accessed throughout the whole local network. If he had used your OS, the software needed to allow something like this would have been part of the "server" distro and therefore couldn't have been used on his "desktop" installation. How should he have solved the problem ?
There wouldn't have been a problem. Remember it's about optimization, not about being able to run at all.

The internet-sharing software would most likely use the generic module interface, since you'd want something like this on a desktop or a gaming machine, too. He could have run the system at his usual desktop settings, which would most likely suffice for that use. Or he could have set the system to "throughput" (like, longer time slices, less priority to GUI / user input), which would have sped up the net software. He'd still be able to use his desktop apps.

Only a few applications really require a specific setting. Professional server apps come to mind, or game consoles.

Re:Minimalist Microkernel Memory Management

Posted: Wed Aug 10, 2005 2:46 am
by Legend_
gaf wrote:
Legend wrote:I would try to make an approach that covers all situations okay (like it is done today with monolithic kernels).
Why reinvent the wheel if you're happy with the current design ? By just taking the Linux source and building your own distro, you can change the details you don't like about it without having to write a whole OS. This means much less work, and you get compatibility with thousands of Linux apps for free..
I really don't think Linux is the peak of possible monolithic kernel development, but true, my greatest intention in doing something more micro was to reduce the amount of code running in Ring 0 and get the chance for more stability that way.
Legend wrote:But in 99% of the cases the application can't really benefit from a special pager/scheduler etc.
That's just your opinion..
Then how will a program like Word, Firefox, Thunderbird, Winamp, gcc, or, as the example popped up, any game gain speed using special policies? At least, when any swapping occurs during game rendering - no matter the policy - that frame will take ages.
At best they can lose speed by using some really dumb algorithms.
Legend wrote:I see no reason for punishing them by adding overhead to the generic pager and scheduler, by putting them into user mode, when the amount of code pushed to user space would not be that big either.
I'll never get why this is really such a big problem for some people. Let's assume I'd rewrite parts of the Linux kernel to turn it into a µ-kernel with modules in user-space. How big do you expect the difference in performance to be compared to a traditional kernel ?

I'm serious - give me some numbers, maybe this will help me to understand..
Well, I would not be too surprised if, for memory operations, the time for an operation really doubled or tripled. Probably the amount of speed lost for the whole program might be somewhere around 1% - but that speed is wasted for nothing.

For scheduling policies in userland, I wouldn't be surprised to see some 5% or so lost, if I really need to invoke a scheduler process each time, unless you keep your time slices long, which hurts responsiveness.
Legend wrote:If someone thought that pagers loaded into kernel mode at runtime could be made secure, that discussion would not exist anyway.
It can be made secure using a secure scripting language. In my opinion this wouldn't even be necessary: A signature guaranteeing that the module is an official release should suffice, as monolithic kernels also run all this stuff in kernel mode and nobody worries about security/stability there..

Now does this change anything for you ?
It could change something, for example, for a database, or a graphics program handling some dozens of MB of image data. Those might benefit if they have control, in contrast to the example applications I mentioned above.

Re:Minimalist Microkernel Memory Management

Posted: Wed Aug 10, 2005 2:53 am
by Legend_
Yet, I have to admit that I have never seen that server benchmark.

The only benchmark that I saw and still was able to find was http://pdos.csail.mit.edu/exo/exo-slides/sld030.htm.
Well, as you can see, gcc gets the same amount of apples on all three systems, whereas rm has a lot fewer oranges when compared to the BSDs. Conclusion: the first time I see a useful benchmark of an exokernel, I'll probably take a closer look at what they did there.

Re:Minimalist Microkernel Memory Management

Posted: Wed Aug 10, 2005 3:23 am
by Legend
Sorry for three posts in a row, but now I'm finally logged in! ;)

And now, here's the big reason why I'm sceptical about exokernel benchmarks: http://www.deas.harvard.edu/courses/cs261/papers/exo-sosp97.pdf

Look at page 9, and how cp supposedly was sped up.
Very interesting. A program that (when the system is working correctly, meaning, on IDE systems with DMA enabled etc.) is fully limited by the speed of the harddisk should have been sped up by a different kernel design - sure!

Re:Minimalist Microkernel Memory Management

Posted: Wed Aug 10, 2005 3:50 am
by Solar
Seeing how the fastest cp in the second go was the unmodified OpenBSD made me question the whole thing when I first read it a couple of years ago.

How often was that test run? Just once? That would explain the "outlier" results, and shed not so good a light on those doing the testing. What were the exact results? (Data table)

My professor would have had my head for work like that...

Re:Minimalist Microkernel Memory Management

Posted: Wed Aug 10, 2005 8:59 am
by gaf
Legend_ wrote:Then how will a program like Word, Firefox, Thunderbird, Winamp, gcc, or, as the example popped up, any game gain speed using special policies? At least, when any swapping occurs during game rendering - no matter the policy - that frame will take ages.
Word, Firefox, etc. are bad examples because they are idle all the time anyway. Even if somebody succeeded in speeding them up by 50% (with exokernel techniques or without..) you probably wouldn't really notice, except that the response-time could be a bit better.

GCC is a different story as it actually does some work. As far as I know (please correct me if I'm wrong) gcc uses several sub-processes to do the compilation. It could therefore benefit from a policy that allows fast communication with these programs, as well as a scheduler that allows gcc to specify which of the sub-processes should be run, as it knows the "priorities" best.
Legend_ wrote:At best they can lose speed by using some really dumb algorithms.
Did you ever get the idea that the algorithm in your kernel might be really dumb for some applications ?
Legend_ wrote:Probably the amount of speed lost for the whole program might be somewhere around 1% - but that speed is wasted for nothing.
No, it's not as it allows greater flexibility..
Legend_ wrote:For scheduling policies in userland, I wouldn't be surprised to see some 5% or so lost, if I really need to invoke a scheduler process each time, unless you keep your time slices long, which hurts responsiveness.
Let's assume that you're using a scheduler that needs high responsiveness and sets the quantum length for apps to 20ms, which is a rather small value. This means that the scheduler is called 50 times per second, so for your 5% figure each call would have to waste 0.1% of the cycles that your CPU executes in one second.

I've got 3 computers here:
[pre]700 MHz (pIII) -> 700 000 cycles
1920 MHz (aXP) -> 1 920 000 cycles
2800 MHz (pIV) -> 2 800 000 cycles[/pre]
Do you really think that this is a realistic overhead for one call from the kernel to a user-app plus one return to the kernel ?
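Spelling out the arithmetic behind those numbers (a sketch; the 5% figure and the 20ms quantum are the ones from the discussion above, everything else is illustrative):

```c
#include <assert.h>

/* Back-of-the-envelope arithmetic: with 20 ms quanta the scheduler runs
 * 50 times per second, so a total overhead of 5 % of the CPU corresponds
 * to 0.1 % of one second's cycles per scheduler invocation. */
long long cycles_per_call(long long clock_hz)
{
    long long calls_per_sec = 1000 / 20;            /* 20 ms quantum -> 50/s */
    long long total_overhead = clock_hz * 5 / 100;  /* claimed 5 % loss      */
    return total_overhead / calls_per_sec;          /* cycle budget per call */
}
```

For the 700 MHz PIII this yields 700 000 cycles per invocation, matching the table above; for the 2800 MHz PIV, 2 800 000 cycles.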
Solar wrote:Many ego-shooter power-gamers will disagree with you on that point.
You mean those guys who buy one of these new dual graphics-card systems for 3000 euros to raise their frame-rate from 180 to 210, while still working with a TFT that cuts it all down to some 60 fps due to the update time ? In that case I really don't care..

I do agree that you need a bit more (40fps?) for ego-shooters, but anything above that shouldn't be necessary, and in my opinion doesn't have anything to do with the frame-rate, but rather the input-rate or scan-rate that the program processes (your eyes can't see more than 30 fps, but you get a more fine-grained control over the mouse-movements).
Solar wrote:Only few applications really require a specific setting. Professional server apps come to mind, or game consoles.
I could as well say that only a few applications shouldn't be written in VisualBasic, as the performance obviously doesn't matter, but that's not the point. This argument just kills any discussion about optimization in general !

I'm really surprised about this attitude, as nobody seems to have any doubts that the increased number of context switches in µ-kernels means a severe performance loss for all applications and has to be avoided at any cost..

regards,
gaf

btw: No comments on my second post ?

Re:Minimalist Microkernel Memory Management

Posted: Wed Aug 10, 2005 9:31 am
by Colonel Kernel
Did you ever get the idea that the algorithm in your kernel might be really dumb for some applications ?
It's possible... any specific examples? I'm still reading through the Application-Level Networking paper, and so far the only example I've seen is excessive copying of memory when sending and receiving packets. Not good for performance, to be sure, but probably solvable without an exokernel.
No, it's not as it allows greater flexibility..
Flexibility is just a warm happy feeling you get, unless it facilitates something else. Flexibility for its own sake is worthless. I'm not saying that the flexibility provided by exokernels is necessarily worthless, it's just taking a while to grasp what it's facilitating.
I could as well say that only a few applications shouldn't be written in VisualBasic, as the performance obviously doesn't matter, but that's not the point. This argument just kills any discussion about optimization in general !
The real question isn't whether optimization is worthwhile. To me, the question is, "if you need that much specialization, why not specialize in hardware and build your own appliance?" This is the essence of game consoles for example.

If I see a real case for "securely multiplexing" between highly specialized apps and regular run-of-the-mill apps on the same machine, then I'm sold.
I'm really surprised about this attitude, as nobody seems to have any doubts that the increased number of context switches in µ-kernels means a severe performance loss for all applications and has to be avoided at any cost..
Each extra context switch can hurt performance, if it's in an inappropriate part of the system (i.e. -- on the critical path of a frequently-used operation). I don't trust myself to predict when and how extra context switches can slow things down... IMO the only thing to do is implement, then benchmark. :)
btw: No comments on my second post ?
Not yet... I'm madly typing this while the plumber is getting to work on my leaky hot water tank... And I have to go to work soon. Ahhh, distraction. ;D

Re:Minimalist Microkernel Memory Management

Posted: Wed Aug 10, 2005 11:42 am
by Legend
gaf wrote:
Legend_ wrote:Then how will a program like Word, Firefox, Thunderbird, Winamp, gcc, or, as the example popped up, any game gain speed using special policies? At least, when any swapping occurs during game rendering - no matter the policy - that frame will take ages.
Word, Firefox, etc. are bad examples because they are idle all the time anyway. Even if somebody succeeded in speeding them up by 50% (with exokernel techniques or without..) you probably wouldn't really notice, except that the response-time could be a bit better.

GCC is a different story as it actually does some work. As far as I know (please correct me if I'm wrong) gcc uses several sub-processes to do the compilation. It could therefore benefit from a policy that allows fast communication with these programs, as well as a scheduler that allows gcc to specify which of the sub-processes should be run, as it knows the "priorities" best.
So you really expect 50% ... okay, that explains why you take that side so strongly! ;)
As long as that paper (or pages from it) that I posted is not the only source of that 50% ...

gcc indeed does call other programs, but there is not much more to it than doing several steps in several programs. Give gcc a fast pipe (which any good Unix has got to have) and you have satisfied gcc's interprocess communication needs.
Gcc is an especially bad candidate for kernel optimizations, as it really spends most of its time in user mode messing around with strings, or later, code optimization.
Legend_ wrote:At best they can lose speed by using some really dumb algorithms.
Did you ever get the idea that the algorithm in your kernel might be really dumb for some applications ?
Meaning as in what I have implemented now, or what I would implement if I had the time to focus on it? Yes, probably not the optimum, but I expect no more than some 10%.
Why - schedulers today do their work in O(1); what you could do there is to hand out time slices in a different order to satisfy some requirement.
Using a tree for finding free pages in the virtual address space gives me something in the O(log n) class; hard to do a lot more optimization from there. Unless you use some O(1) algorithm that just adds more and more, but really can't free something in the middle - but that is not a good idea for every application.
Using a stack for physical pages then is O(1) again; not really much more possible there either.

With dumb I meant something like bitmaps (as in easy, but really not the fastest way); algorithms that could be done faster without being case-specific.
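The O(1) stack for physical pages mentioned above can be sketched in a few lines (illustrative only; the pool size and names are made up, and a real kernel would initialize the stack from the memory map):

```c
#include <stddef.h>

#define MAX_FRAMES 1024          /* illustrative pool size */

static unsigned long free_stack[MAX_FRAMES];
static size_t top = 0;

/* Put a physical frame number back on the free stack: O(1). */
void frame_free(unsigned long frame)
{
    free_stack[top++] = frame;
}

/* Pop a free frame, or return (unsigned long)-1 if the pool is empty:
 * also O(1) - no searching, unlike scanning a bitmap. */
unsigned long frame_alloc(void)
{
    return top ? free_stack[--top] : (unsigned long)-1;
}
```

This is exactly the trade-off described: constant-time alloc/free, but no way to hand out a *specific* frame in the middle, which a bitmap would allow.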
Legend_ wrote:Probably the amount of speed lost for the whole program might be somewhere around 1% - but that speed is wasted for nothing.
No, it's not as it allows greater flexibility..
Nothing to add here to Col. Kernel's post - flexibility in this area is good if it gets you a bit more speed. Flexibility itself is not a sufficient reason.
Legend_ wrote:For scheduling policies in userland, I wouldn't be surprised to see some 5% or so lost, if I really need to invoke a scheduler process each time, unless you keep your time slices long, which hurts responsiveness.
Let's assume that you're using a scheduler that needs high responsiveness and sets the quantum length for apps to 20ms, which is a rather small value. This means that the scheduler is called 50 times per second, so for your 5% figure each call would have to waste 0.1% of the cycles that your CPU executes in one second.

I've got 3 computers here:
[pre]700 MHz (pIII) -> 700 000 cycles
1920 MHz (aXP) -> 1 920 000 cycles
2800 MHz (pIV) -> 2 800 000 cycles[/pre]
Do you really think that this is a realistic overhead for one call from the kernel to a user-app plus one return to the kernel ?
You mean from Process A to the Kernel, from the Kernel to the Scheduler (context switch), from the Scheduler to the Kernel, from the Kernel to Process B (context switch).

Probably not in the millions, but I wouldn't be surprised about a dozen thousand cycles (or more?) lost at a context switch, as RAM is getting slower compared to the speed of today's CPUs. This would probably lower my number of %s lost, however - but I still think they are lost for nothing, especially as I could implement different pagers in kernel mode, too.

Re:Minimalist Microkernel Memory Management

Posted: Wed Aug 10, 2005 1:18 pm
by gaf
Legend wrote: So you really expect 50% ... okay, that explains why you are that side that extremely! ;)
- Did you also read the parenthetical remark ?!
- This was meant to show that it's no use to talk about the performance of such apps, as it's hard to measure
- If you had read my other posts you'd know that I expect <10 % for such apps
Legend wrote:With dumb I meant something like bitmaps (as in easy, but really not the fastest way), algorithms that could be done faster without being case specific.
Well, if the kernel policy doesn't matter at all, I'm wondering why you don't simply take that approach to save memory..
Legend wrote:Nothing to add here to Col. Kernel's post - flexibility in this area is good if you can get a bit of more speed. Flexibility itself is not a sufficient reason.
modularity -> flexibility -> specialization

Sorry, but I think that this is really basic stuff..
Legend wrote:You mean from Process A to the Kernel, from the Kernel to the Scheduler (context switch), from the Scheduler to the Kernel, from the Kernel to Process B (context switch).
When the IRQ occurs, the CPU is already in kernel mode. The only difference compared to the monolithic approach is that the user-space scheduler has to be involved:

kernel -> Scheduler
Scheduler -> kernel -> ProcessX (this is a system-call: yield)

- TLB costs can mostly be neglected as it doesn't really matter if we switch only once or twice in a row
- The overhead is 2 kernel exits and one kernel entry
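The transition count above can be written out as a trivial sketch (purely illustrative; it just tallies the mode switches in the path described, it doesn't model their cost):

```c
/* Count the extra mode transitions for a user-space scheduler on a timer
 * IRQ. The IRQ itself already puts the CPU in kernel mode, so only the
 * detour through the scheduler adds transitions. */
int userspace_sched_transitions(void)
{
    int exits = 0, entries = 0;
    exits++;    /* kernel -> user-space scheduler (upcall)       */
    entries++;  /* scheduler -> kernel (the yield system call)   */
    exits++;    /* kernel -> the process chosen by the scheduler */
    return exits + entries;  /* 2 kernel exits + 1 kernel entry  */
}
```

A monolithic kernel would take only the final exit to the chosen process, so the argument is about the cost of two extra crossings per quantum.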

Don't get me wrong but the whole discussion is getting pretty pointless as nobody is moving a bit and nothing new is added..

I'd however still be glad about opinions about my second post ;)

regards,
gaf

Re:Minimalist Microkernel Memory Management

Posted: Thu Aug 11, 2005 1:11 am
by Solar
gaf wrote: You mean those guys who buy one of these new dual graphics-card systems for 3000 euros to raise their frame-rate from 180 to 210, while still working with a TFT that cuts it all down to some 60 fps due to the update time ? In that case I really don't care..
No, I mean those guys that buy one of these new two graphic-card systems and wouldn't prod a TFT with a stick because it's too slow in updating screen content.

Those same guys want the utmost max in framerates, because it allows them to see a movement that little earlier and have that little better aim.

Anyway. The point being, applications that require the levels of optimization we're talking about run, in virtually all cases, on "dedicated systems", meaning that you wouldn't run another application requiring a different type of optimization on the same machine anyway.

Take a pro database application. You would run the production server with the utmost "throughput" settings. You wouldn't run a game, hard-disk recording tool, compiler, or anything like that on that same machine.

Take a game console. Even if you would run a database on it, you probably couldn't be bothered with tuning it to the max - if you'd require that level of performance, you'd be using server hardware (with corresponding periphery like a FibreChannel RAID instead of an IDE disk).

I don't say that Exokernels cannot improve things. But I seriously doubt that you need that kind of "mix-and-mingle" flexibility in a system.

Re:Minimalist Microkernel Memory Management

Posted: Thu Aug 11, 2005 11:27 pm
by Colonel Kernel
Finally, I have a new hot water tank and some time to think. :)
gaf wrote:Thanks dude, you probably helped me more with this than you'd ever expected :D
Yep, didn't expect that. I counted myself as one of the most confused participants in this conversation. :)
Consequently it should also be possible to ensure system-wide fairness without involving any policy but only by enforcing user set limitations/quotas for the resources provided by the kernel. This can be done by allowing the user to specify these restrictions in advance using some ini-file and then using a user-space manager that enforces these limits.
Any opinions about this idea ?
I think the idea is pretty sound in general, although it's hard to visualize how it would work for some resources (CPU time for example -- I can't quite wrap my head around your example yet). I've been reading the paper on application-level networking, and it seems that the Cheetah HTTP server does exactly what you suggest -- The user must use a special admin tool to grant access to parts of the file system to Cheetah. I can see how this same tool could be used to limit the amount of physical memory Cheetah can use as well.

What this seems to imply is that specializing in the policies and the apps also means specializing their configuration and administration. I guess sometimes performance has its price. :)

The techniques used in that paper to implement Cheetah sound pretty cool, but I think they could be implemented just as securely in a microkernel architecture. The idea of having a network device driver embedded in the kernel still gives me the willies... Also, I'm having a hard time grasping how Xok defers thread scheduling decisions to user-land. My usual sources come up short when looking into "upcalls" and "scheduler activations".

I have to admit, these techniques sound very interesting, but it's still the kind of "interesting" that you get from watching the aftermath of a car accident. ;)

Re:Minimalist Microkernel Memory Management

Posted: Fri Aug 12, 2005 4:20 am
by Soap_
gaf wrote:It's similar for games, which need a certain frame-rate to avoid stutter, while it's at the same time pointless to produce more than that, as most engines do today, since a frame-rate around 30 is totally sufficient to create the illusion of smooth movement.
Thirty frames might be sufficient - if you are watching a film, where you have motion blur giving the illusion of a smooth transition between two separate frames.

Computer games currently do not simulate this, and so they need a higher framerate than your telly in order to generate a perception of smooth movement. This is especially noticeable with lateral movement in first person shooters, where large sections of the screen are being updated at once - try turning in Doom 3 when it's running on a low-powered machine and see what happens.

Smooth, rapid updates help the player gauge tricky movement and timing issues, like getting a headshot on a distant and fast-moving enemy.

In addition, higher framerates give an advantage in game engines where the physics calculations are tied to the frames per second. In (the non framerate-locked) Quake based games, for example, players with a high frame rate can jump higher and further than their less fortunate counterparts.

Sorry for the off topic lecture, but I thought us ego-shooter power-gamers should get a chance to explain.

Re:Minimalist Microkernel Memory Management

Posted: Fri Aug 12, 2005 8:54 am
by QuiTeVexat
Colonel Kernel wrote: I think the idea is pretty sound in general, although it's hard to visualize how it would work for some resources (CPU time for example -- I can't quite wrap my head around your example yet).
As I understand it, Xok divides processor time into slices and allows applications to allocate them. So you could conceivably limit an application to allocating only a certain share of the slices.
The idea of having a network device driver embedded in the kernel still gives me the willies...
The exokernel idea does not demand that device drivers be in the supervisor mode kernel. Device drivers could be implemented in user mode servers, and they would provide the same secure-multiplexing-without-abstraction interface.
Also, I'm having a hard time grasping how Xok defers thread scheduling decisions to user-land.
I do not know if or how Xok does threads, but it seems to me that if you notify the application on timer interrupts, it could schedule threads for itself if it did so desire.

Re:Minimalist Microkernel Memory Management

Posted: Fri Aug 12, 2005 2:59 pm
by gaf
Colonel Kernel wrote:What this seems to imply is that specializing in the policies and the apps also means specializing their configuration and administration. I guess sometimes performance has its price. :)
A powerful tool is certainly needed, but for the average case it should be enough to set up standard limits for certain groups of apps or users.
Colonel Kernel wrote:Also, I'm having a hard time grasping how Xok defers thread scheduling decisions to user-land.
"Aegis (-> original MIPS exokernel) represents the CPU as a linear vector, where each element corresponds to a time slice. Time slices are partitioned at the clock granularity and can be allocated in a manner similar to physical memory. Scheduling is done round robin by cycling through the vector of time slices. A crucial property of this representation is position, which encodes an ordering and an approximate upper bound on when the time slice will be run. Position can be used to meet deadlines and to tradeoff latency for throughput. For example, along-running scientific application could allocate contiguous time slices in order to minimize the overhead of context switching, while an interactive application could allocate several equidistant time slices to maximize responsiveness.[..]"

(Exokernel: An Operating System Architecture for Application-Level Resource Management)

In my opinion the approach taken in Aegis leads to problems as soon as there is more than one scheduling policy present, because the "value" of the time-slices that applications can allocate changes when new apps are added or old ones finish their work.

As an example, let's have a look at a simple equal-share system in which two users are present and both are entitled to 50% of the CPU, no matter how many apps are running. In the beginning both users are running one app each. If either of them now starts a second task, he'll get twice as much as the other, and the whole system has to be rebalanced to meet the policy again. The same problem occurs if apps quit or are merely blocked (!), and it gets even worse when apps are allowed to choose how many time-slices they want..

For these reasons I'll probably use a lottery scheduler in my own OS. In such a system apps can define what percentage of the CPU time they want and will then be given a number of "tickets" that reflects this value. At each clock-tick a lottery is held by the kernel, and the task with the winning ticket is scheduled for one time-slice.
I was at first a bit worried about the overhead due to the lottery, but after doing some reading and benchmarking it turned out to be *very* small..
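The draw itself is cheap; a minimal sketch (names made up; a real kernel would get `draw` from a PRNG in the tick handler, e.g. `rand() % total_tickets`):

```c
/* One lottery draw: each task holds tickets proportional to its requested
 * CPU share; 'draw' is the winning ticket number, 0 <= draw < sum of all
 * tickets. Returns the index of the winning task, or -1 if 'draw' is out
 * of range. Illustrative sketch only. */
struct task {
    int tickets;  /* share of CPU time this task asked for */
};

int lottery_pick(const struct task *tasks, int ntasks, int draw)
{
    for (int i = 0; i < ntasks; i++) {
        if (draw < tasks[i].tickets)
            return i;             /* winning ticket falls in this range */
        draw -= tasks[i].tickets;
    }
    return -1;
}
```

A task holding 50 of 100 tickets then wins about half of all time-slices in the long run, which is exactly the proportional-share behaviour described above.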

(goes on)