OSDev.org

Posted: **Sun Dec 19, 2010 2:01 pm**

Been awhile since I started a topic, here goes.

Recently, I implemented a mutual exclusion system. It works fairly well. I also took a look at the method in which my kernel had been locking read/write/ioctl requests. It doesn't is the short answer. So now, I figure that the next step in my kernel is to provide a locking system for devices. One of my questions is, how do you handle that?

Since devices need not be locked for everything (eg terminals don't seem to need it, nor do serial ports, but correct me if that assumption is wrong), one approach would be to do no locking whatsoever in the kernel, and then implement the needed locking inside the drivers themselves. The problem with this is that there will probably be wasted code (though not much, if I use mutexes to do it. Just a few lines).

Block devices would seem to need locking (you can't have a process read and write at the same time), I could provide a locking system for them. But then what do I do about ioctl requests? Can they be run during read and write operations? Or do they have to be separate? Another solution would be to provide a locking system for all devices that drivers can simply configure for their device, specify what needs to be locked out from what.

So, any comments, corrections? What do you guys do, and what would you recommend for a device locking system?

Thanks,
-JL

Posted: **Sun Dec 19, 2010 3:28 pm**

Hi,

I'd buffer requests.

For example, you could use FIFO queues. When a task asks a driver to do something their request is put onto the driver's queue. The driver takes a request from the queue, performs the request, tells the task the results, then takes the next request from the queue, etc. The only locking needed is for inserting requests on the queue and removing them (which should be very fast).

For more complex drivers you can replace the FIFO queue with something else, but the same idea applies - some sort of data structure/s to keep track of pending requests, where locking is needed for inserting/removing requests from the data structure/s (and no locking needed anywhere else). For example, rather than performing requests in the order they arrive, the driver might attempt to maximise efficiency by performing requests in a different order and/or you might support "I/O priorities" where some requests are more important than others.

For something like FIFO queues, the kernel could do the buffering on behalf of the driver/s to avoid duplicating the code in every driver; but for something more complex this isn't very practical as the kernel would need to know how to determine the best order for each type of device. However, there's no reason why the requests can't be managed in multiple stages. For example, the kernel could maintain a FIFO queue of requests (where simpler drivers that do requests in the order they arrive need nothing else), and more complex drivers could remove requests from the kernel's FIFO queue, do some pre-processing (if necessary) then insert the request/s into their own data structure/s (where the driver takes the requests from its own internal data structure/s when they're actually performed).

Cheers,

Brendan

Posted: **Sun Dec 19, 2010 3:36 pm**

Interesting. When would the driver process the requests? It would have to schedule time in the scheduler, wouldn't it? Would that be faster than 1 process reading/writing data while other processes are allowed to run (just not using that device)?

-JL

Posted: **Sun Dec 19, 2010 4:28 pm**

Hi,

piranha wrote:Interesting. When would the driver process the requests? It would have to schedule time in the scheduler, wouldn't it? Would that be faster than 1 process reading/writing data while other processes are allowed to run (just not using that device)?

This depends a lot on what sort of kernel you're doing.

For my micro-kernel/s, I use asynchronous messaging (where there's message queues which are FIFOs; and there's no need for the kernel to have anything special for those device driver FIFOs) and device drivers run as separate tasks. When the device driver calls "get_message()" (to get the next message/request), if there are no messages/requests the scheduler puts the task to sleep until a message/request arrives. If there is a message it processes it, sends a reply message and then gets the next message. Basically you end up with something like this in the driver:

Code: Select all

    for(;;) {
        get_message();
        switch (message_type) {
            /* Message handling code (request processing code) */
        }
    }

When device drivers aren't running as separate tasks it'd be a lot uglier. The main difference is where the driver gets it's CPU time from. For example when a task puts a request onto the queue it could check if the driver is idle (queue is empty) and start the request immediately (so the driver runs in the context of the task that made that requests). If the driver isn't idle, the task puts the request on the queue and blocks. When the driver completes a request it checks the FIFO queue and unblocks the task for the next requests (if any). In this case, for a simple FIFO queue it ends up being equivalent to Lamport's bakery algorithm.

For a monolithic kernel, you can choose to use either method - either run the driver as a "kernel thread" that is stopped when it's FIFO is empty (and started when there's work for it to do), or run the driver in the context of the task that made the request it happens to be handling.

Cheers,

Brendan

Posted: **Sun Dec 19, 2010 4:41 pm**

Well, since my drivers are loaded as modules and they don't run in tasks, I'll probably choose the method of running the driver in the context of the task that requests the IO.

For example when a task puts a request onto the queue it could check if the driver is idle (queue is empty) and start the request immediately (so the driver runs in the context of the task that made that requests). If the driver isn't idle, the task puts the request on the queue and blocks. When the driver completes a request it checks the FIFO queue and unblocks the task for the next requests (if any). In this case, for a simple FIFO queue it ends up being equivalent to Lamport's bakery algorithm.

It seems to me that if I were to run the driver in the context of the task, then this would collapse into a mutex type locking, where the task that makes the request is stopped until the IO completes for it (it would wait for the IO to be free of course).

I'm thinking that I could accomplish this easily with mutexes, where the driver simply locks it's important sections of code with a mutex for read/writes and etc. And if I put this into the driver's themselves, then the drivers can organize requests as they like. The way I do mutexes doesn't allow for priorities, however, so I wouldn't be able to do IO priorities.

-JL

Posted: **Sun Dec 19, 2010 5:59 pm**

Hi,

piranha wrote:I'm thinking that I could accomplish this easily with mutexes, where the driver simply locks it's important sections of code with a mutex for read/writes and etc. And if I put this into the driver's themselves, then the drivers can organize requests as they like. The way I do mutexes doesn't allow for priorities, however, so I wouldn't be able to do IO priorities.

The problem with a simple mutex is that there's no "fairness". What this means is that under contention (e.g. many tasks fighting for the same mutex) a task can be very unlucky and remain blocked for a very large amount of time, while other tasks can be very lucky and hog the resource being shared. For extremely short critical sections and/or situations where contention is extremely unlikely, this can be ignored. For things like device drivers (where you can have 1000 tasks all pounding the same disk driver at the same time, where it takes a relatively large amount of time for each request to be completed) it becomes a major problem. To make sure this doesn't happen (e.g. ensure fairness) you need something like a FIFO queue, so that tasks acquire the mutex in order.

So, for a "fair" mutex, tasks would be put onto a FIFO queue and block, and when the mutex is released the task that released it would unblock the next task on the FIFO queue. To implement a "fair" mutex you end up with something like this:

Code: Select all

acquire_mutex() {
    acquire_spinlock();
    status = insert_task_on_FIFO();
    if(status != queue_was_empty) release_spinlock_and_block_task();
    else release_spinlock();
}

release_mutex() {
    acquire_spinlock();
    next = remove_next_task_from_FIFO();
    unblock_task(next);
    release_spinlock();
}

Now, what if that FIFO queue was replaced by a more complex data structure? What if the task that released the lock chose the "best" task to unblock (rather than just the task that has been waiting the longest)? This only takes minor changes to the previous code:

Code: Select all

acquire_mutex() {
    acquire_spinlock();
    status = insert_task_on_SOMETHING();
    if(status != something_was_empty) release_spinlock_and_block_task();
    else release_spinlock();
}

release_mutex() {
    acquire_spinlock();
    next = remove_best_task_from_SOMETHING();
    unblock_task(next);
    release_spinlock();
}

Of course to be able to decide which task is the "best" task to do next, you'd need to store more information:

Code: Select all

acquire_mutex() {
    acquire_spinlock();
    status = insert_description_of_request_on_SOMETHING();
    if(status != queue_was_empty) release_spinlock_and_block_task();
    else release_spinlock();
}

release_mutex() {
    acquire_spinlock();
    next = remove_best_task_from_SOMETHING();
    unblock_task(next);
    release_spinlock();
}

Now the function names are misleading because it's not really a mutex anymore. Lets change them:

Code: Select all

queue_request() {
    acquire_spinlock();
    status = insert_description_of_request_on_SOMETHING();
    if(status != queue_was_empty) release_spinlock_and_block_task();
    else release_spinlock();
}

complete_request() {
    acquire_spinlock();
    next = remove_best_task_from_SOMETHING();
    unblock_task(next);
    release_spinlock();
}

Basically, you need code for spinlocks and code for blocking/unblocking tasks to implement "fair" mutexes; but nothing prevents a device driver from using the spinlocks and blocking/unblocking directly to implement more complex request handling.

Cheers,

Brendan

Posted: **Sun Dec 19, 2010 6:05 pm**

Makes sense. Alright, on to implementation!

Thanks,
-JL

OSDev.org

Device Locking

Device Locking

Re: Device Locking

Re: Device Locking

Re: Device Locking

Re: Device Locking

Re: Device Locking

Re: Device Locking