Hi,
You're getting confused (and trying too hard to avoid "process it later"). Let me be perfectly clear: there are only two cases:
- IRQ handlers use some sort of "process it later" somewhere; or
- Everything happens inside an IRQ handler
There are no other choices.
For example, imagine there's a task that is blocked waiting for "read()" to complete, and the disk driver's IRQ occurs:
- You can use "process it later" somewhere - either:
- The disk driver's IRQ handler could use some sort of "process it later" (and all the rest can happen later); or
- The disk driver's IRQ handler could complete that transfer and arrange for the next transfer, then use some sort of "process it later" (and all the rest can happen later); or
- The disk driver's IRQ handler could complete that transfer and arrange for the next transfer and then call the file system's code directly, and the file system's code could use some sort of "process it later" (and all the rest can happen later); or
- The disk driver's IRQ handler could complete that transfer and arrange for the next transfer and then call the file system's code directly, which could call the VFS layer directly, which could use some sort of "process it later" (and all the rest can happen later)
Or:
- The disk driver's IRQ handler could complete that transfer and arrange for the next transfer and then it could call the file system's code directly, which could call the VFS layer directly, which could complete the "read()" and return to the file system's code, which could return to the disk driver's IRQ handler, which could do IRET
The second option is entirely silly, so you have to use some sort of "process it later". This could involve IPC (messages, pipes, whatever), or some sort of queue, or "soft IRQ", or something "signal-like", or polling, or any other way of arranging for code to be executed outside of the IRQ handler. It doesn't matter much which method you use (as long as it's not polling, because polling sucks; although signals are ugly too, so I'd avoid them as well). It does matter where it is (obviously you'd want to keep the IRQ handlers short/fast, so "as soon as possible" is better). It also matters whether it's a generic/standard mechanism or every device driver has to implement its own hack.
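As a concrete illustration of a generic "process it later" mechanism, here's a minimal sketch of a deferred-work queue: the IRQ handler pushes a small work item (function pointer plus argument) and returns, and a kernel worker thread runs the items later, outside IRQ context. All names here (`irq_work`, `work_push`, `work_drain`) are hypothetical, and it assumes a single IRQ-context producer and a single consumer:

```c
/* Sketch of a "process it later" queue. Assumes one producer (the IRQ
 * handler) and one consumer (a kernel worker thread). Hypothetical names. */
#include <stdatomic.h>
#include <stddef.h>

#define WORK_SLOTS 16

typedef void (*work_fn)(void *arg);

struct irq_work { work_fn fn; void *arg; };

static struct irq_work work_queue[WORK_SLOTS];
static atomic_size_t work_head, work_tail;  /* head: consumer, tail: producer */

/* Called from the IRQ handler: O(1), no locks, no memory allocation. */
static int work_push(work_fn fn, void *arg)
{
    size_t t = atomic_load_explicit(&work_tail, memory_order_relaxed);
    if (t - atomic_load_explicit(&work_head, memory_order_acquire) == WORK_SLOTS)
        return -1;                       /* queue full: drop it (or count it) */
    work_queue[t % WORK_SLOTS] = (struct irq_work){ fn, arg };
    atomic_store_explicit(&work_tail, t + 1, memory_order_release);
    return 0;
}

/* Called from a kernel worker thread, outside IRQ context. */
static void work_drain(void)
{
    size_t h = atomic_load_explicit(&work_head, memory_order_relaxed);
    while (h != atomic_load_explicit(&work_tail, memory_order_acquire)) {
        struct irq_work w = work_queue[h % WORK_SLOTS];
        atomic_store_explicit(&work_head, ++h, memory_order_release);
        w.fn(w.arg);                     /* the "later" part happens here */
    }
}
```

The key property is that the IRQ handler's part is short and bounded, and that every driver can reuse the same mechanism instead of inventing its own hack.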
sounds wrote:I'm glad you pointed out the potential queue deadlock (priority inversion). I overlooked it at first; some kernels go with "process it later" code to solve this problem. Is a lock-free queue workable here?
For data being sent from software to a device, a lock-free queue could work (sort of), assuming the device generates some sort of "ready for more" IRQ that the driver's IRQ handler can use to take data from the lock-free queue and send it to the device. However there'd be race conditions involved when the queue is empty (and no "ready for more" IRQ is expected).
A lock-free queue wouldn't work well on its own as a form of "process it later" for data being sent from the device to software, as software wouldn't know when to check the lock-free queue (you'd have to poll the queue).
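The empty-queue race mentioned above (for the software-to-device direction) is worth spelling out. The usual fix is to make the idle-to-busy transition an atomic claim, with a re-check on the IRQ side. This is a hedged sketch, not a real driver: the queue, `device_send()`, and the `device_idle` flag are all hypothetical stand-ins.

```c
/* Sketch of the "queue empty, no IRQ expected" race. Hypothetical names. */
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Tiny queue standing in for the lock-free queue discussed above. */
#define QLEN 8
static int q[QLEN];
static atomic_size_t q_head, q_tail;

static void enqueue(int v)
{
    q[atomic_load(&q_tail) % QLEN] = v;
    atomic_fetch_add(&q_tail, 1);
}

static bool dequeue(int *v)
{
    size_t h = atomic_load(&q_head);
    if (h == atomic_load(&q_tail))
        return false;
    *v = q[h % QLEN];
    atomic_store(&q_head, h + 1);
    return true;
}

/* Stubs for the hardware side; a real driver would poke the device. */
static atomic_bool device_idle = true;
static int last_sent = -1;
static void device_send(int buf) { last_sent = buf; }

/* Producer side (outside the IRQ handler). */
void submit(int buf)
{
    int v;
    bool expected = true;
    enqueue(buf);
    /* Claim the idle->busy transition atomically; without this, the IRQ
     * handler could mark the device idle just after we looked, and the
     * buffer would sit in the queue with no further IRQ to drain it. */
    if (atomic_compare_exchange_strong(&device_idle, &expected, false))
        if (dequeue(&v))
            device_send(v);        /* no "ready for more" IRQ is coming */
}

/* IRQ handler side: on "ready for more", send the next buffer or go idle. */
void on_ready_irq(void)
{
    int v;
    bool expected = true;
    if (dequeue(&v)) {
        device_send(v);
        return;
    }
    atomic_store(&device_idle, true);
    /* Re-check: a producer may have enqueued between our failed dequeue
     * and the store above, seen idle == false, and skipped the kick. */
    if (dequeue(&v)) {
        if (atomic_compare_exchange_strong(&device_idle, &expected, false))
            device_send(v);
        else
            enqueue(v);            /* the producer won the claim; hand it back */
    }
}
```

Without the atomic claim and the re-check, there's a window where the producer sees "device busy" and the IRQ handler sees "queue empty", and the data is stranded.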
Also, in general lock-free algorithms do solve deadlock problems, but don't solve starvation problems. If the IRQ handler has to get something from a lock-free queue or put something on a lock-free queue, then it could repeatedly fail and retry for an unlimited amount of time.
Note: "lock free" means that someone (not necessarily the IRQ handler) makes forward progress, while "wait free" means that everyone (including your IRQ handler) makes forward progress in a bounded number of steps.
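The distinction shows up directly in the classic CAS retry loop. This sketch (a Treiber-style lock-free stack; the names `lf_node`, `lf_push`, `lf_pop` are mine, not from any real API) is lock free, because if one thread's CAS fails it's only because another thread's CAS succeeded, but it is not wait free, because *this* thread can keep losing the race indefinitely:

```c
/* Lock-free stack sketch showing why "lock free" != "wait free". */
#include <stdatomic.h>
#include <stddef.h>

struct lf_node { struct lf_node *next; int data; };

static _Atomic(struct lf_node *) lf_top;

static void lf_push(struct lf_node *n)
{
    struct lf_node *old = atomic_load(&lf_top);
    do {
        n->next = old;
        /* Lock free: if this CAS fails, some other thread's succeeded.
         * Wait free would require *this* thread to finish in a bounded
         * number of steps - an IRQ handler stuck in this loop makes no
         * forward progress while other CPUs keep winning. */
    } while (!atomic_compare_exchange_weak(&lf_top, &old, n));
}

static struct lf_node *lf_pop(void)
{
    struct lf_node *old = atomic_load(&lf_top);
    do {
        if (old == NULL)
            return NULL;
        /* Note: a real kernel version would also need to deal with ABA. */
    } while (!atomic_compare_exchange_weak(&lf_top, &old, old->next));
    return old;
}
```

For an IRQ handler, that unbounded retry is exactly the starvation problem: the deadlock is gone, but the worst-case latency isn't bounded.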
sounds wrote:Thanks as well for providing a more complete list of which drivers need IRQs. For the purpose of writing interrupt handlers, are we agreed that e.g. the USB Keyboard driver doesn't have an IRQ handler because the USB controller has one?
Ok.
sounds wrote:The USB controller is a great place to focus in and hash out the essential questions:
- Can the interrupt handler submit the DMA buffer pointer to the appropriate destination without taking forever? [specifically, without deadlocking or calling sections of the code not intended to happen inside an interrupt]
For sending data, the USB controller driver's interrupt handler should probably be able to submit the pointer to the DMA buffer to the USB controller (the destination) relatively quickly. For receiving data, the USB controller driver's interrupt handler may or may not be able to submit the pointer to the DMA buffer to the "whatever" (the destination); depending on what the "whatever" is, and depending on what "submit" means, and depending on how "submit" works.
sounds wrote:- A corollary to #1: when will the destination wake up? Typically the scheduler is invoked at the end of all device interrupt handlers, so blocked threads do run after the interrupt handler releases the right lock, but not inside the interrupt handler.
If "destination" is a task (and not another driver in the kernel or something) and if "wake up" means "unblocked" (e.g. data being received by a task that was blocked waiting for IO from a USB device) then the destination/task will wake up/unblock when "something" tells the scheduler to unblock the task; where "something" might be the VFS or GUI or shell or network stack or kernel or some other process or whatever else (depending on what the device was, what the received data is, etc).
sounds wrote:- Can buffer pointers be allocated/freed outside the interrupt handler - so no memory management is needed during the IRQ?
Yes - everything can happen outside the IRQ handler (except for arranging some sort of "process it later").
sounds wrote:- Data moving the other way may also block waiting for the interrupt to indicate the device's buffers are not full. Is it sufficient to let the scheduler deal with this as another thread, scheduled when the interrupt handler releases the lock that blocked it?
Yes, maybe.
sounds wrote:- (a repeat of the first question for completeness) Does the queue implementation guarantee the interrupt handler can always submit buffer pointers to the queue, potentially causing the destination to have to back off / retry its access? [the destination would be using a kernel-provided library function to pull data off the queue, so the back off / retry behavior is transparent to the driver writer]
Polling sucks (even if it's "polling with back off/retry"). You poll, it's not there, you back off a little; you poll again, it's still not there, you back off a little more. Six months later (after preventing the CPU from going into any sleep state with your wasteful polling), you poll again, it's still not there, you back off a little more and immediately after that it happens, but you backed off so far that you don't even notice for 3 days.
Then there's the issue of how you're planning to measure the time delays for the back off - e.g. are you going to poll a time counter constantly, or are you going to poll the queue from within a timer IRQ handler?
Now it's my turn for questions:
- Does your OS have some sort of IPC that normal processes can use to send data to other normal processes?
- Is the IPC you already have (or the IPC you're planning to have) sane? For example, can tasks block waiting for IPC, and be woken up when the IPC occurs?
- Is the IPC you already have (or the IPC you're planning to have) useful? For example, can a process send data to a driver or receive data from a driver using exactly the same IPC code without knowing/caring that it's talking to a driver and not another process?
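To make the "sane" and "useful" properties concrete, here's a minimal sketch of what I mean: a task blocks in a receive call (no polling) and is woken when a message arrives, and the same send/receive pair works regardless of whether the other endpoint is a driver or an ordinary process. Everything here is hypothetical (`mailbox`, `ipc_send`, `ipc_recv`); a real kernel's IPC would live behind a syscall boundary, but the shape is the same:

```c
/* Sketch of blocking IPC: receiver sleeps, sender wakes it. Hypothetical API. */
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define MSG_MAX 64

struct mailbox {
    pthread_mutex_t lock;
    pthread_cond_t  nonempty;
    char msg[MSG_MAX];
    size_t len;                          /* 0 means "no message pending" */
};

void ipc_send(struct mailbox *mb, const void *buf, size_t len)
{
    pthread_mutex_lock(&mb->lock);
    memcpy(mb->msg, buf, len);
    mb->len = len;
    pthread_cond_signal(&mb->nonempty);  /* wake the blocked receiver */
    pthread_mutex_unlock(&mb->lock);
}

size_t ipc_recv(struct mailbox *mb, void *buf)
{
    pthread_mutex_lock(&mb->lock);
    while (mb->len == 0)                 /* task blocks here - no polling */
        pthread_cond_wait(&mb->nonempty, &mb->lock);
    size_t len = mb->len;
    memcpy(buf, mb->msg, len);
    mb->len = 0;
    pthread_mutex_unlock(&mb->lock);
    return len;
}
```

If your IPC looks roughly like this, then "process it later" for drivers falls out almost for free: the driver's IRQ handler arranges for a message to be sent, and whoever was blocked waiting for that data gets woken through the ordinary IPC path.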
Cheers,
Brendan