OSDev.org

Posted: **Fri Sep 18, 2009 12:12 am**

I need some ideas on how to implement thread safety in drivers for my OS.

Some things to keep in mind:

- Drivers are written by other developers, so thread safety depends entirely on their ability to follow the recommendation that all drivers be thread safe.
- Calls to drivers are implemented with interrupts and pointers to functions. The hardware process/task mechanisms are not implemented at all.
- The kernel has only one process, so utilizing other CPUs is done explicitly, using an API to start up some code on a specific processor.

All of the above factors make it difficult to trap calls to drivers from different CPUs and implement a catch-all method to synchronize calls to the driver. The way this was solved in Windows was to use a queue mechanism that is serviced by a pool of synchronized and dedicated process/thread combinations. However, I do not have the concept of a task scheduler in the OS, and adding one is not an option.

Any ideas?

Posted: **Fri Sep 18, 2009 1:59 am**

Hi,

mybura wrote: All of the above factors make it difficult to trap calls to drivers from different CPUs and implement a catch-all method to synchronize calls to the driver. The way this was solved in Windows was to use a queue mechanism that is serviced by a pool of synchronized and dedicated process/thread combinations. However, I do not have the concept of a task scheduler in the OS, and adding one is not an option.

If there's no task scheduler, then you only need to worry about two things: multiple CPUs trying to execute the same driver's code at the same time, and the driver's IRQ handler firing while the driver is running.

For "multiple CPUs at the same time", most OSes allow that: it's up to the device driver to use spinlocks, mutexes, semaphores, etc. as needed. There's no guarantee of safety (but potentially no performance bottlenecks either). The alternative is to have a big lock around the entire device driver, to make sure that (ignoring IRQs) only one CPU can be using the driver at a time. That would probably suck for performance/scalability, but it means the kernel can control the lock and guarantee that device drivers can't mess things up as easily. In any case, "no task scheduler" means you can't use mutexes and semaphores, and have to use something like spinlocks (or Lamport's bakery algorithm, which is similar to a spinlock but avoids starvation). Basically, each CPU does nothing until it can use the device driver's code.
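As a rough illustration of the "big lock" alternative, here is a minimal C11 sketch. It does not implement Lamport's bakery algorithm; it swaps in a ticket lock, a simpler relative that also lets CPUs in strictly in arrival order, and it assumes the hardware provides atomic fetch-and-add. All names (`ticket_lock_t`, `driver_t`, `kernel_call_driver`, `demo_entry`) are hypothetical and not taken from any real kernel.

```c
#include <assert.h>
#include <stdatomic.h>

/* The spin loop below is only safe if the counters are truly lock-free. */
static_assert(ATOMIC_INT_LOCK_FREE == 2, "ticket counters must be lock-free");

/* Ticket lock: CPUs enter in the order they asked, so none can starve. */
typedef struct {
    atomic_uint next_ticket;  /* next ticket to hand out */
    atomic_uint now_serving;  /* ticket currently allowed in */
} ticket_lock_t;

static void ticket_lock_acquire(ticket_lock_t *l)
{
    unsigned int mine = atomic_fetch_add(&l->next_ticket, 1);
    while (atomic_load(&l->now_serving) != mine) {
        /* Busy-wait: with no scheduler there is nothing else to run. */
    }
}

static void ticket_lock_release(ticket_lock_t *l)
{
    atomic_fetch_add(&l->now_serving, 1);
}

/* Hypothetical driver descriptor: one lock per driver, owned by the kernel. */
typedef struct {
    ticket_lock_t lock;
    int (*entry)(void *request);
} driver_t;

/* Kernel-side wrapper: the driver itself never sees the lock. */
static int kernel_call_driver(driver_t *d, void *request)
{
    ticket_lock_acquire(&d->lock);
    int rc = d->entry(request);
    ticket_lock_release(&d->lock);
    return rc;
}

/* Stand-in driver: counts calls and does no locking of its own. */
static int demo_entry(void *request)
{
    int *calls = request;
    return ++*calls;
}
```

Because every call goes through `kernel_call_driver`, the driver writer has nothing to get wrong, at the cost of serializing all CPUs on that one driver. This wrapper deliberately ignores IRQs.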

For IRQs you can't rely on something like a normal spinlock, because the IRQ can occur while the same CPU already holds the re-entrancy lock (so trying to acquire the lock a second time on the same CPU will cause a deadlock). To avoid this, if the IRQ occurs and can't acquire the lock, then it can set an "IRQ occurred" flag and return from the IRQ. Then, just before the lock is released, you check whether the "IRQ occurred" flag is set, and if it is, you execute the IRQ handler's code at that point (before releasing the lock). You'd need to be careful about race conditions in the re-entrancy locking code, though.
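The deferred-IRQ scheme could look roughly like the C11 sketch below. It is one possible arrangement, not a definitive one: all names (`irq_safe_driver_t`, `driver_enter`, `driver_leave`, `driver_irq`, `count_irq`) are hypothetical, and it relies on the default sequentially consistent atomics. To close the race between "check the flag" and "release the lock", the IRQ path sets the flag before it tries the lock, and the release path re-checks the flag after unlocking.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Atomics touched from IRQ context must not fall back to hidden locks. */
static_assert(ATOMIC_BOOL_LOCK_FREE == 2, "IRQ flag must be lock-free");

typedef struct {
    atomic_flag busy;               /* re-entrancy lock */
    atomic_bool irq_pending;        /* the "IRQ occurred" flag */
    void (*handle_irq)(void *ctx);  /* the real IRQ work */
    void *ctx;
} irq_safe_driver_t;

/* Normal (non-IRQ) entry: spin until the driver is free. */
static void driver_enter(irq_safe_driver_t *d)
{
    while (atomic_flag_test_and_set(&d->busy)) {
        /* busy-wait */
    }
}

/* Run any deferred IRQ work, then release the lock. */
static void driver_leave(irq_safe_driver_t *d)
{
    for (;;) {
        while (atomic_exchange(&d->irq_pending, false))
            d->handle_irq(d->ctx);
        atomic_flag_clear(&d->busy);
        /* An IRQ may have arrived between the last check and the unlock.
           If so, retake the lock and drain again; if another CPU already
           took the lock, its own driver_leave will drain the flag. */
        if (!atomic_load(&d->irq_pending) || atomic_flag_test_and_set(&d->busy))
            return;
    }
}

/* Called from the interrupt stub: never spins, so it cannot deadlock. */
static void driver_irq(irq_safe_driver_t *d)
{
    atomic_store(&d->irq_pending, true);     /* record the IRQ first */
    if (!atomic_flag_test_and_set(&d->busy)) /* lock was free */
        driver_leave(d);                     /* drains the flag, unlocks */
    /* Otherwise the current lock holder runs the handler on its way out. */
}

/* Stand-in handler used for illustration: counts handled IRQs. */
static void count_irq(void *ctx)
{
    ++*(int *)ctx;
}
```

An IRQ that finds the driver idle is handled immediately, while one that arrives inside `driver_enter`/`driver_leave` is handled when the holder calls `driver_leave`. Several IRQs that arrive while the lock is held collapse into a single handler run, so the handler has to cope with that.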

Cheers,

Brendan

Posted: **Sat Sep 19, 2009 3:04 pm**

What you are suggesting makes sense.

However, the solution requires me to have control over the driver developers and how they implement or ignore the locking. I was hoping for something that would enforce driver stability without placing a burden on the driver writers.

Another caveat, which you did mention, is performance. The main aim of the OS is to run bare-bones, leaving maximum performance available to the applications. Implementing a solution the way you suggested would go against the very grain of the OS.

Maybe there is a way to implement your ideas without the driver writers having to worry about it. Almost like a filter mechanism that will check requests to drivers without drivers knowing about it.

What about using debug interrupts to intercept calls to the drivers (both from IRQs and from software), with the kernel enforcing proper locking using the methods described, to stop the threads from treading on each other's toes? Performance alert on debug interrupts/context switches! Are they worth their cost for the function they provide?

Posted: **Sat Sep 19, 2009 4:21 pm**

Hi,

mybura wrote:What you are suggesting makes sense.

However, the solution requires me to have control over the driver developers and how they implement or ignore the locking. I was hoping for something that would enforce driver stability without placing a burden on the driver writers.

Another caveat, which you did mention, is performance. The main aim of the OS is to run bare-bones, leaving maximum performance available to the applications. Implementing a solution the way you suggested would go against the very grain of the OS.

Maybe there is a way to implement your ideas without the driver writers having to worry about it. Almost like a filter mechanism that will check requests to drivers without drivers knowing about it.

Unfortunately, there are always compromises.

For maximum possible performance, you need to give device driver writers the ability to completely screw everything up. That includes the ability to use many CPUs at the same time, the need for device drivers to deal with their own locking, and the ability for device drivers to cause deadlocks and other problems if they get the locking wrong. If you want device drivers that can't screw anything up, then you need to sacrifice performance in some way (e.g. by preventing device drivers from using more than one CPU at a time so they don't need to deal with locking themselves).

mybura wrote: What about using debug interrupts to intercept calls to the drivers (both from IRQs and from software), with the kernel enforcing proper locking using the methods described, to stop the threads from treading on each other's toes? Performance alert on debug interrupts/context switches! Are they worth their cost for the function they provide?

The kernel can't enforce proper locking unless it only allows one CPU into the device driver's code at a time (bad performance) or knows exactly what types of re-entrancy protection are needed by different pieces of code inside the device driver. The latter is impossible to figure out unless the device driver tells the kernel and the kernel does what the device driver tells it to, which means the device driver needs to work it out anyway.

Also, you shouldn't let applications call device drivers directly (and therefore shouldn't need to use anything to intercept this).

Cheers,

Brendan

Posted: **Sun Sep 20, 2009 3:06 am**

I'm using an approach that doesn't require the drivers to be thread-safe. That means they don't have to use threads, but they can if they like. That's because:
1: The kernel manages message queues for sending requests to a driver and writing answers back. The driver simply requests the next message and handles it.
2: The drivers inform the kernel whether there is currently data to read, either through a system call or with the answer to a read request that is sent to the kernel. This way the kernel blocks a requesting thread if there is no data.
So the drivers are not forced to handle multiple requests in parallel, which makes them very simple. But, as mentioned, a driver can of course use multiple threads and implement its own queue to handle multiple requests in parallel. This can make sense, for example, for an ATA driver that uses DMA. I haven't implemented that yet, but I'm planning to use threads in my ATA driver to handle multiple requests.
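A kernel-managed request queue of this kind might be sketched as follows in C11. This is an illustration under assumptions, not the poster's actual code: the names (`message_t`, `msg_queue_t`, `queue_send`, `queue_next`, `driver_service`, `demo_handle`), the fixed capacity, and the spinlock inside the queue are all hypothetical. The point it shows is that the only locking lives in the kernel's queue code, while the driver handles one message at a time.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_CAPACITY 8u
static_assert((QUEUE_CAPACITY & (QUEUE_CAPACITY - 1u)) == 0,
              "capacity must be a power of two for the index mask");

typedef struct {
    int arg;    /* request payload */
    int reply;  /* answer written back by the driver */
} message_t;

typedef struct {
    atomic_flag lock;                 /* taken only by kernel queue code */
    message_t *slots[QUEUE_CAPACITY];
    size_t head;                      /* total messages taken so far */
    size_t tail;                      /* total messages queued so far */
} msg_queue_t;

static void queue_lock(msg_queue_t *q)
{
    while (atomic_flag_test_and_set(&q->lock)) {
        /* busy-wait */
    }
}

static void queue_unlock(msg_queue_t *q)
{
    atomic_flag_clear(&q->lock);
}

/* Kernel side: queue a request; returns false if the queue is full. */
static bool queue_send(msg_queue_t *q, message_t *m)
{
    bool ok = false;
    queue_lock(q);
    if (q->tail - q->head < QUEUE_CAPACITY) {
        q->slots[q->tail++ & (QUEUE_CAPACITY - 1u)] = m;
        ok = true;
    }
    queue_unlock(q);
    return ok;
}

/* Driver side: fetch the next request, or NULL if there is none. */
static message_t *queue_next(msg_queue_t *q)
{
    message_t *m = NULL;
    queue_lock(q);
    if (q->head != q->tail)
        m = q->slots[q->head++ & (QUEUE_CAPACITY - 1u)];
    queue_unlock(q);
    return m;
}

/* Stand-in request handler: doubles the argument, with no locking. */
static void demo_handle(message_t *m)
{
    m->reply = m->arg * 2;
}

/* Driver loop: strictly one request at a time. */
static size_t driver_service(msg_queue_t *q)
{
    size_t handled = 0;
    message_t *m;
    while ((m = queue_next(q)) != NULL) {
        demo_handle(m);
        handled++;
    }
    return handled;
}
```

Any number of CPUs can call `queue_send` concurrently, and requests are served in the order they were queued.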

This might not be the most performant solution, but it makes driver development simple, and I think it's fast enough.

Thread Safety in Drivers
