Posted: Mon Apr 09, 2007 12:30 pm
by jnc100
This is an idea I've been working on too. My inspiration came from thinking about how to design some device drivers for my OS. Originally, I thought that each device driver could implement certain interfaces for the functionality it provides, e.g. a display adapter could implement TextDisplay, GraphicalDisplay and 3DDisplay. After quickly discarding the idea of querying the display object for its interface (à la COM), I settled on providing this information in metadata, like Microsoft's Common Language Infrastructure does. From there, it's a natural step to attempt to write as much of the kernel as possible in JIT-compiled, metadata-described managed code.
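
In rough C terms, the idea might look something like this (sketch only; all the names are made up):

Code: Select all

/* Sketch: a driver advertises the interfaces it implements in
 * static metadata, so the kernel never has to query it at runtime. */
struct TextDisplay {
    void (*putc)(char c);
    void (*clear)(void);
};

struct GraphicalDisplay {
    void (*put_pixel)(int x, int y, unsigned colour);
};

struct driver_metadata {
    const char                    *name;   /* e.g. "vga"            */
    const struct TextDisplay      *text;   /* NULL if not supported */
    const struct GraphicalDisplay *gfx;    /* NULL if not supported */
};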

The benefits are many. For example, object allocation is actually very fast, as it can be performed by simply incrementing an end-of-heap pointer. A garbage collection algorithm can then identify unused objects and compact the heap in a background task.
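
For example, the allocation fast path might look something like this (minimal sketch, with made-up names):

Code: Select all

/* Sketch of a bump-pointer allocator: allocation is a bounds check
 * plus a pointer increment; the GC reclaims space later by
 * compacting live objects and resetting heap_end. */
static char *heap_end;    /* current end-of-heap pointer   */
static char *heap_limit;  /* top of the reserved heap area */

void *gc_alloc(unsigned long size)
{
    size = (size + 7) & ~7UL;        /* keep allocations aligned */
    if (heap_end + size > heap_limit)
        return 0;                    /* trigger a collection here */
    void *obj = heap_end;
    heap_end += size;
    return obj;
}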

JIT compilation also protects against badly written code causing invalid page faults, as the compiler decides where pointers should point. Furthermore, system calls can be made as plain function calls through a call gate, as the compiler just inserts the relevant call instruction. It also allows a novel method of task switching, a sort of enforced cooperative multitasking: task-switch calls can simply be inserted into the compiled code at regular intervals (decided by the task's priority), while protecting against task switches at inopportune moments, such as during page faults. It also stops us having to mess around with the timer :wink:

I don't, however, think it's really sensible to try and code this all in assembly, as it requires a lot of infrastructure to get a JIT compiler or interpreter up and running. My kernel currently uses C and, although it is not very advanced yet, it has set up all its internal structures in a tagged 'managed' format that JIT-compiled code will be able to interpret (given suitable metadata) once the compiler is up and running.

At the moment, however, that seems a long way away...

Regards,
John.

Posted: Mon Apr 09, 2007 12:38 pm
by mystran
You probably want to mess with the timer anyway. Not only does doing so mean that you don't need the overhead of polling for a task switch in your generated code (polling is bad, interrupts are good), but you'll also need the timer to keep track of real time (say, how do you sleep for half a second without having a timer?).

edit: I wanna add that the overhead of polling is necessarily quite large, because you need a check inside every basic block if you don't want to risk getting stuck, and that includes even the most time-critical inner loops. It's much better to just use interrupts.

Posted: Mon Apr 09, 2007 2:25 pm
by Colonel Kernel
jnc100 wrote:JIT compilation also protects against badly written code causing invalid page faults, as the compiler decides where pointers should point.
This kind of memory safety is a property of the source and target languages, not of the compilation model. In other words, you can get this without JIT (this is what Bartok is used for in Singularity).

Posted: Mon Apr 09, 2007 3:53 pm
by jnc100
Thanks, I appreciate the swift response as this is a topic I'm really starting to get interested in.
Colonel Kernel wrote:
jnc100 wrote:JIT compilation also protects against badly written code causing invalid page faults, as the compiler decides where pointers should point.
This kind of memory safety is a property of the source and target languages, not of the compilation model. In other words, you can get this without JIT (this is what Bartok is used for in Singularity).
You make a good point. Basically, my OS rejects languages where memory safety is not ensured, including, dare I say it, good old C. This is not as much of a problem as it seems, as there is now quite a large amount of software available on the internet written in memory-safe languages, such as Microsoft's intermediate language (MSIL). I hope to eventually have my OS run MSIL as one of its native bytecodes, so it could actually run many 'binaries' (in PE format) straight off.

mystran wrote:You probably want to mess with the timer anyway. Not only doing so means that you don't need the overhead of polling for task switch in your generated code (polling is bad, interrupts are good) but you'll need the timer anyway to keep track of real time (say, how do you sleep for half a second without having a timer?).
Okay, so two issues here. For task switching, if I understand you correctly (please let me know if I don't), 'standard' task switching involves setting up the timer (reprogramming it to fire at a certain interval, keeping track of actual seconds yourself, etc.) and then performing a task switch at certain intervals (at the end of a timeslice). My method is to generate task-switch instructions in the compiled code a certain number of instructions apart. Given that all the basic instructions execute in a comparably short amount of time, it is possible to place these instructions approximately a certain number of milliseconds apart. This really wouldn't require any more overhead than responding to interrupts.

To be honest, I hadn't really thought about a sleep() function. If I'm honest, it's not a real concern, as my OS is geared towards a non-realtime user application/workstation model, where the application should respond to user input rather than timer interrupts. Even games more often have a doaction() function which moves the players by an amount based upon the time elapsed since the function was last called, which doesn't require timer interrupts.
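
Something like this, for instance (sketch, with made-up names):

Code: Select all

/* Sketch: movement scaled by elapsed wall-clock time, so no
 * timer interrupt is needed to drive the update rate. */
struct player { double x, y, vx, vy; };

void doaction(struct player *pl, double seconds_elapsed)
{
    pl->x += pl->vx * seconds_elapsed;
    pl->y += pl->vy * seconds_elapsed;
}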

Having said that though, it might be important in device drivers, so I will probably implement a Timer class at some point, once I allow device drivers to hook into an interrupt. As I said, my kernel is still very much a work in progress...

Regards,
John.

Posted: Mon Apr 09, 2007 4:58 pm
by mystran
jnc100 wrote: Okay, so two issues here. For task switching, if I understand you correctly (please let me know if I don't), 'standard' task switching involves setting up the timer (reprogramming it to fire at a certain interval, keeping track of actual seconds yourself, etc.) and then performing a task switch at certain intervals (at the end of a timeslice). My method is to generate task-switch instructions in the compiled code a certain number of instructions apart. Given that all the basic instructions execute in a comparably short amount of time, it is possible to place these instructions approximately a certain number of milliseconds apart. This really wouldn't require any more overhead than responding to interrupts.
Wrong.

Consider a simple function like memset():

Code: Select all

void memset(void * p, int c, unsigned long n) {
    unsigned long i;    /* match the type of n */
    char * b = p;

    for (i = 0; i < n; i++) {
        b[i] = (char) c;    /* one tiny store per iteration */
    }
}
The amount of work done per iteration is very small, but the number of iterations could be large. It could take a long time. So you have to place a test within the loop body (leading to overhead), or unroll the loop a few times, and then put a test within the unrolled loop body (leading to less direct overhead, but using more memory and therefore increasing the cache footprint).
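
To make the first option concrete, the test-in-the-loop-body version might look something like this (sketch; QUANTUM and taskswitch() are made-up names):

Code: Select all

/* Sketch: polling with a test inside the loop body. The check runs
 * every iteration, swamping the single-byte store of real work. */
#define QUANTUM 1000            /* iterations between switch offers */
extern void taskswitch(void);   /* hypothetical yield primitive */

void memset_polled(void *p, int c, unsigned long n)
{
    char *b = p;
    unsigned long i, budget = QUANTUM;

    for (i = 0; i < n; i++) {
        b[i] = (char) c;
        if (--budget == 0) {    /* the per-basic-block test */
            budget = QUANTUM;
            taskswitch();
        }
    }
}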

Of course the example is trivial, but it demonstrates the point. In any Turing-complete (i.e. general-purpose) language, you can't always prove that a given piece of code ever terminates (or that it fails to terminate, for that matter). So you need the tests, and the overhead.

Of course you only need one test in the body (unrolled or not), and this is why I said that, in general, you need one test per basic block (a basic block is a straight-line segment of code containing no jumps).

No matter how fast you make the tests, you lose, because interrupts are completely free, as long as no interrupts occur. So I repeat my point: polling is bad, interrupts are good. That's why processors have interrupts in the first place.
To be honest, I hadn't really thought about a sleep() function.
Consider: you can read the wall-clock time from the RTC. It gives you an accuracy of one second (and reading it has overhead, but that's not the point). Are you sure you never need to know time (mostly relative time between events) more accurately than to the second? There might or might not be other means of getting more accurate time. The PIT is always available, and can easily give you around 10 ms accuracy, on modern computers even better.

How about power saving and CPU heating? Are you sure you want to run at 100% CPU use simply because the user told the computer to play an MP3? That's less than 1% work on a modern computer. Of course, if you want to avoid all interrupts, you also need the user interface to poll whether the user moved the mouse or pressed a key, or whether it's time to feed the sound card the next block of audio. So you have to run at 100% CPU use all the time anyway. Laptop batteries last a few hours because most of the time the CPU is sleeping. Many overclocked and badly cooled computers will crash if you don't let them cool down by sleeping at times. Even the rest will unnecessarily waste power. When you HLT the CPU in order to cool down/save power, the only way to wake up is by an interrupt.
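
And the idle loop is literally nothing more than this (sketch):

Code: Select all

/* Sketch: the idle task. HLT stops the CPU until the next
 * interrupt arrives, which is the whole point. */
void idle_task(void)
{
    for (;;)
        __asm__ __volatile__ ("hlt");
}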

Point being, you need to deal with interrupts anyway, and if you are going to deal with them anyway, you might as well generate normal code, and let interrupts take care of multi-tasking, saving you the overhead of polling.

Posted: Tue Apr 10, 2007 3:06 am
by jnc100
mystran wrote:Point being, you need to deal with interrupts anyway, and if you are going to deal with them anyway, you might as well generate normal code, and let interrupts take care of multi-tasking, saving you the overhead of polling.
I think you slightly misunderstand me. I wasn't suggesting doing away with interrupts altogether (where would we be without page faults!). It's just that I _really_ don't like the PIT.

To answer your memset example: yes, I would probably just unroll the loop, e.g.

Code: Select all

extern void taskswitch(void);   /* compiler-inserted primitive */

void memset(void * p, int c, unsigned long n) {
    unsigned long i = 0;    /* must be initialised before the while test */
    char * b = p;
    unsigned long a = 0;

    while (i < n) {
        /* up to 1000 iterations of real work... */
        for (i = a; (i < n) && (i < (a + 1000)); i++) {
            b[i] = (char) c;
        }
        a += 1000;
        taskswitch();       /* ...then offer a task switch */
    }
}

Alternatively, maybe I can get away with just supporting systems with an APIC and using that. This would also let me deal better with scheduling on multiprocessor systems. I would JIT-compile my programs to include NOPs before every branch and at regular intervals. Then, when an APIC timer interrupt comes along, I can just insert a call to the scheduler at the next available NOP, thus ensuring that a task switch doesn't occur during atomic operations.
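
Very roughly, I imagine the patch itself looking something like this (pure sketch: it assumes the JIT records every NOP-site address in its metadata, that code pages are writable, and 32-bit x86 where a CALL rel32 is five bytes):

Code: Select all

/* Sketch: turn a 5-byte NOP pad into CALL scheduler_entry.
 * nop_site comes from the JIT's metadata. */
extern void scheduler_entry(void);

void patch_nop_with_call(unsigned char *nop_site)
{
    long rel = (long) scheduler_entry - (long) (nop_site + 5);
    nop_site[0] = 0xE8;                 /* CALL rel32 opcode     */
    *(long *) (nop_site + 1) = rel;     /* relative displacement */
}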

Thanks for your input, it really helps me to think about these things before I try to implement them.

Regards,
John.

Posted: Tue Apr 10, 2007 7:39 am
by mystran
jnc100 wrote:It's just that I _really_ don't like the PIT.
I'm sorry to say this, but to me it sounds like you never learned how to do pre-emptive multi-tasking properly.

Posted: Tue Apr 10, 2007 8:12 am
by Colonel Kernel
jnc100 wrote:Then, when an APIC timer interrupt comes along, I can just insert a call to the scheduler at the next available NOP, thus ensuring that a task switch doesn't occur during atomic operations.
How do you know that the next available NOP is not in the middle of an atomic operation?

It sounds to me like you're inventing a solution without a problem. Even Singularity implements pre-emptive multitasking the tried-and-true way (with a timer interrupt that calls the scheduler directly).

Posted: Tue Apr 10, 2007 9:22 am
by mystran
I wanna add that if you only have trusted code generated by a known codegen, then you can implement atomic operations quite easily: emit a CLI, then the operation, then an STI. For SMP, add spinlocks where necessary.

The code to be compiled need not be able to disable interrupts. Rather, you simply need to implement atomic operations (say, heap allocation) as compiler primitives.
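
As a minimal sketch of what the codegen might emit for such a primitive (uniprocessor case, GCC-style inline assembly; heap_end is a stand-in name):

Code: Select all

/* Sketch (uniprocessor): the codegen brackets the primitive with
 * CLI/STI so a timer interrupt can't split the heap bump. */
static char *heap_end;

void *alloc_atomic(unsigned long size)
{
    void *obj;
    __asm__ __volatile__ ("cli");   /* emitted by the compiler,    */
    obj = heap_end;                 /* never writable by user code */
    heap_end += size;
    __asm__ __volatile__ ("sti");
    return obj;
}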

Posted: Tue Apr 10, 2007 9:26 am
by Colonel Kernel
I was thinking more broadly than just kernel-level atomic operations. Would you want interrupts disabled for the duration of a distributed transaction? :wink: No spinlock is going to help you there either...

Instrumenting compiled code is a neat idea (you don't need JIT for that either, BTW), but I guess I don't see the point in inserting calls to the scheduler when the timer fires, when you have already saved most of the register state of the thread anyway, and can just call the scheduler right away...

Posted: Tue Apr 10, 2007 9:31 am
by Brendan
Hi,

IMHO for most systems, most task switches aren't caused by the scheduler's timer. It's more likely to be a task blocking, the user pressing a key, the network driver receiving a packet, the hard disk completing a transfer, etc.

For an example, consider what happens when the user presses a key while the OS is running some other (lower priority) task. The IRQ fires and causes the keyboard device driver to send some sort of "a key was pressed" information to the GUI, which sends it to the active application. For a responsive system the GUI's task would be blocked waiting to receive something, and would preempt whatever was running when it receives the keypress information. In the same way the active application would be blocked waiting to receive something and would be the highest priority task once it receives the keypress information (and once the GUI has finished running).

In this case the first task is preempted by the GUI, the GUI uses a tiny bit of CPU time and blocks. The active application will probably do a little processing and block too (causing a task switch back to the first task). That's 3 task switches in a row that don't involve the timer.

For a system where the highest priority task that can run always does run, the scheduler doesn't even need a timer (but you'd still have a lot of task switches and preemption).
jnc100 wrote:Alternatively, maybe I can get away with just supporting those systems with an APIC and use that. This would also let me deal better with scheduling in multiple processor systems. I would JIT compile my programs to include NOPs before every branch and at regular intervals. Then, when an APIC timer interrupt comes along, I can just insert a call to the scheduler in the next available NOP, thus ensuring that a taskswitch doesn't occur during atomic operations.
You could have a "need to reschedule" flag that is set by many different things (not just the scheduler's timer), and check the flag in the next available NOP. The problem is efficiency. In most (all?) cases the CPU is running kernel code when you find out that a task switch needs to occur - setting a flag, returning to CPL=3, running a little more code, checking the flag, then switching back to CPL=0 is all unnecessary overhead.

Of course you may be thinking of running your JIT code at CPL=0. In that case it's not as bad, but setting a flag, returning to the interrupted JIT code, running a little more code, checking the flag, then calling the scheduler is still all overhead (easier to just call the scheduler instead of setting the flag).


Cheers,

Brendan

Posted: Tue Apr 10, 2007 3:52 pm
by mystran
You can just as well not modify the code and still use the "need-reschedule" flag: you set the flag in an ISR if the ISR did something that makes rescheduling necessary, and then (after you're done with the interrupt, EOI'd and everything), before you return to the interrupted task, you check the flag and call the scheduler if necessary.
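
In rough C terms (sketch; all the helper names are made up):

Code: Select all

/* Sketch: ISRs only set the flag; the common interrupt-exit path
 * checks it once, after the EOI, just before returning. */
extern unsigned char read_scancode(void);
extern void queue_keypress(unsigned char sc);
extern void send_eoi(void);
extern void schedule(void);

volatile int need_resched;

void keyboard_isr(void)
{
    queue_keypress(read_scancode());
    need_resched = 1;               /* woke a higher-priority task */
}

void interrupt_exit(void)           /* runs just before the iret */
{
    send_eoi();
    if (need_resched) {
        need_resched = 0;
        schedule();
    }
}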

I actually needed that implemented in my own kernel yesterday, so I guess I know what I'll do next. :D

Posted: Wed Apr 11, 2007 11:16 pm
by AndrewAPrice
I've thought about this too. Having a minimal assembly kernel which includes:
  • All the required assembly code
  • An interpreter (something like my own C-Script) which supports context switching
  • A FS driver (just to load the C-Script)
The kernel then loads the stage 2 kernel, which is written in C-Script, and the stage 2 kernel loads all the device drivers (also written in C-Script). Then, if the user has permission to do so, they can edit any part of the operating system (including the kernel) while it is running, and see the effects instantly! (This may also cause it to crash, e.g. if the kernel was in a function you just deleted.)