Hi,
IMHO either QNX docs are messed up or QNX isn't very re-entrant or pre-emptable. Either way I'd be tempted to ignore QNX.
Colonel Kernel wrote:
Again, my goal here is to understand all the possibilities for these kernel control-flow mechanisms. I guess to summarize, I want to understand how "the other guys do it" before I undertake designing this portion of my own kernel.
I'm not very familiar with other OSs, but I'm willing to describe mine. Perhaps people will point out alternatives with different advantages/disadvantages.
First a short intro to my OS. It's a microkernel where device drivers are user level processes. The kernel contains:
IO port protection
IRQ handling
Basic chip handling (ISA DMA, PIC/APICs, PIT and RTC)
Linear memory management
Physical memory management
Scheduler
IPC/messaging
It's designed for single CPU and multi-CPU systems (up to 256 CPUs). The kernel will be as re-entrant and as pre-emptable as possible.
IRQ HANDLING
When an IRQ is received by a CPU the kernel sends a message to any user level threads that have requested it (the priority of each thread determines the user level IRQ handler's priority). I don't allow nested interrupts - each IRQ is handled by an "interrupt gate" so interrupts are disabled until the message is sent.
On a multi-CPU computer the IO APIC delivers each IRQ using the "lowest priority delivery mode", and the kernel's IRQ handlers adjust the "Task Priority Register" so that, where possible, an IRQ is accepted by a CPU that isn't already handling another IRQ. In this way several IRQs can be handled by the kernel at the same time (by different CPUs) without IRQ nesting.
On a single-CPU computer this adds an (IMHO negligible) increase in interrupt latency. Because the kernel doesn't have to deal with nested IRQs, its IRQ handler can switch contexts directly from within the handler.
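Purely as illustration, here's roughly what that kernel-side path could look like in C. Every name in it (irq_listeners, send_message, lapic_read/lapic_write, reschedule_if_needed, MSG_IRQ) is an invented placeholder, not my kernel's actual API - it's only meant to show the order of events described above:
Code:
#include <stdint.h>

/* Placeholder declarations - the real kernel's types and functions differ. */
typedef struct thread thread_t;
struct thread { thread_t *next_listener; /* ... */ };

extern thread_t *irq_listeners[16];              /* threads that asked for each IRQ */
extern uint32_t  lapic_read(uint32_t reg);
extern void      lapic_write(uint32_t reg, uint32_t value);
extern void      send_message(thread_t *dest, int type, int data);
extern void      reschedule_if_needed(void);

enum { MSG_IRQ = 1 };
#define LAPIC_TPR 0x80       /* local APIC Task Priority Register offset */
#define LAPIC_EOI 0xB0       /* local APIC End Of Interrupt register offset */

void kernel_irq_handler(int irq)
{
    /* Entered through an interrupt gate, so interrupts are already disabled. */
    uint32_t old_tpr = lapic_read(LAPIC_TPR);
    lapic_write(LAPIC_TPR, 0xFF);       /* raise TPR so further IRQs are steered
                                           to other CPUs (lowest priority delivery) */

    /* Queue a message for every user-level thread that requested this IRQ;
       each message is handled at the receiving thread's priority. */
    for (thread_t *t = irq_listeners[irq]; t != NULL; t = t->next_listener)
        send_message(t, MSG_IRQ, irq);

    lapic_write(LAPIC_EOI, 0);
    lapic_write(LAPIC_TPR, old_tpr);

    /* No nested IRQs to worry about, so it's safe to switch directly to a
       thread that the messages just unblocked. */
    reschedule_if_needed();
}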
SPINLOCKS
The kernel uses 2 types of spinlocks for quick operations. The first can only be acquired by one thread/CPU at a time. This type of lock is used for data that is always modified and never just read.
The second type of spinlock allows read/write access so that multiple threads/CPUs can read at the same time, but no other thread can have any access while it's being modified. It uses a 32 bit value, where 0 = unused, positive values are a count of the number of readers, and -1 is used when it's locked for modifying. This type of lock is used when data may only need to be read.
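For anyone who prefers C, here's a sketch of that read/write encoding using GCC-style atomic builtins. It's not my kernel's code (which is asm), and it leaves out the interrupt disabling/restoring that both lock types do:
Code:
#include <stdint.h>

/* 0 = unused, positive = number of readers, -1 = locked for modifying. */
typedef int32_t rwspinlock_t;

static void rw_lock_read(rwspinlock_t *lock)
{
    for (;;) {
        int32_t old = __atomic_load_n(lock, __ATOMIC_RELAXED);
        if (old >= 0 &&
            __atomic_compare_exchange_n(lock, &old, old + 1, 0,
                                        __ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
            return;                     /* became one more reader */
        __builtin_ia32_pause();         /* spin politely */
    }
}

static void rw_unlock_read(rwspinlock_t *lock)
{
    __atomic_sub_fetch(lock, 1, __ATOMIC_RELEASE);      /* one less reader */
}

static void rw_lock_write(rwspinlock_t *lock)
{
    for (;;) {
        int32_t expected = 0;           /* only take it when nobody holds it */
        if (__atomic_compare_exchange_n(lock, &expected, -1, 0,
                                        __ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
            return;
        __builtin_ia32_pause();
    }
}

static void rw_unlock_write(rwspinlock_t *lock)
{
    __atomic_store_n(lock, 0, __ATOMIC_RELEASE);
}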
In any case both types of locks disable interrupts when they are acquired, and restore interrupts when the lock is released. All kernel data is split into the smallest lockable structures possible. For example, rather than having a large data structure and one lock for all message queues there's a separate lock for each message queue, so that separate queues can be modified at the same time.
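As a (made up) illustration of that granularity, each message queue would simply embed its own lock instead of sharing a global one:
Code:
#include <stdint.h>

/* Sketch only - not my kernel's real layout. Because each queue has its own
 * (simple, exclusive) lock, two CPUs can work on two different threads'
 * queues at the same time instead of serialising on one big lock. */
struct message;

typedef struct message_queue {
    int32_t         lock;       /* simple spinlock - protects only this queue */
    struct message *head;
    struct message *tail;
} message_queue_t;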
LONGER OPERATIONS
Longer operations are split into several small operations that each use the spinlocks above. The longest operations in the kernel are the ones that create and terminate a thread. As an example, the code that creates a thread is split into these smaller operations (there's a rough C outline after the list):
Create entry in thread state table with "state = init" (first lock)
Create initial address space (no lock needed)
Clone the process's address space (second lock)
Create thread data area (no lock needed as thread "state = init")
Add thread to the process data area (third lock)
Change entry in thread state table to "state = ready" (fourth lock)
Add thread to "ready to run" scheduler queue (fifth lock)
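Here's that sequence as a very rough C outline. Every name in it is an invented placeholder (the real code is quite different); it's only meant to show the pattern - each step briefly takes one small lock, and between steps the code can be pre-empted freely:
Code:
/* Rough outline of the create-thread path above - placeholder names only. */
thread_id_t create_thread(process_t *proc, void *entry_point)
{
    /* 1: allocate a slot in the thread state table, state = INIT (first lock) */
    lock(&thread_state_table.lock);
    thread_id_t tid = alloc_thread_slot(THREAD_STATE_INIT);
    unlock(&thread_state_table.lock);

    /* 2: create the initial address space (no lock needed) */
    addr_space_t *as = create_address_space();

    /* 3: clone the process's address space (second lock) */
    lock(&proc->addr_space_lock);
    clone_address_space(as, proc->addr_space);
    unlock(&proc->addr_space_lock);

    /* 4: build the thread data area - nothing else can touch the thread
          while its state is still INIT, so no lock is needed */
    init_thread_data(tid, as, entry_point);

    /* 5: add the thread to the process data area (third lock) */
    lock(&proc->thread_list_lock);
    add_thread_to_process(proc, tid);
    unlock(&proc->thread_list_lock);

    /* 6: mark the thread ready (fourth lock) */
    lock(&thread_state_table.lock);
    set_thread_state(tid, THREAD_STATE_READY);
    unlock(&thread_state_table.lock);

    /* 7: put it on the "ready to run" queue (fifth lock) */
    lock(&ready_queue.lock);
    enqueue_ready(tid);
    unlock(&ready_queue.lock);

    return tid;
}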
Notice that this code can be pre-empted most of the time, but does use a lot of locks (terminating a thread is worse - around 10 locks). Most of these locks aren't held for long though, and most operations only use one or two locks.
Consider the code for the multi-CPU version of the simpler (more common) spinlock:
Code:
acquire:                                   ; ("lock" can't be used as a label - it's an instruction prefix)
    pushfd                                 ; save EFLAGS (interrupt flag)
    mov edi,[GS:CPUlocalAPICaddress]
    cli
    push dword [edi+LAPICtaskPriority]     ; save the old task priority
    mov dword [edi+LAPICtaskPriority],0xFF ; steer further IRQs to other CPUs
    lock bts dword [theLock],1
    jnc .l2                                ; got the lock on the first try
.l1:
    push dword [esp+4]                     ; copy the saved EFLAGS...
    popfd                                  ; ...so interrupts are restored while spinning
    pause
    cli
    lock bts dword [theLock],1
    jc .l1                                 ; still locked - keep spinning
.l2:

unlock:
    lock btc dword [theLock],1
    jnc .criticalErrorHandler              ; the lock wasn't held - kernel bug!
    pop dword [edi+LAPICtaskPriority]      ; restore the old task priority
    popfd                                  ; restore EFLAGS (interrupt flag)
If the lock is free the overhead is negligible, but if it isn't many cycles can be wasted. More locks means less chance of a CPU spinning. I'm looking at over 300 little spinlocks (mostly due to the way I manage physical memory).
Cheers,
Brendan