I appreciate you taking the time to work through the issues of something as complex as this. I hope this thread will be useful for others too.
thewrongchristian wrote:
Consider this scenario:
1. Task A locks L1
2. Task B locks L2
3. Task C locks L3
4. Task A wants L2
5. Task B wants L3
6. Task C wants L1
What is a deadlock in a wait queue system, one that you can actually detect using dependency tracing, becomes a livelock in your system. The wait queue can help you detect this situation if you were so inclined.
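(Concretely, the dependency tracing is just a pointer walk. A minimal sketch, assuming hypothetical fields where each lock records its owning task and each task records the lock it is currently blocked on:)

    struct lock;
    struct task {
        struct lock *blocked_on;  /* lock this task is waiting for, or NULL */
    };
    struct lock {
        struct task *owner;       /* task currently holding the lock */
    };

    /* Would 'waiter' blocking on 'wanted' close a cycle (A->B->C->A)?
       Follow owner -> blocked_on -> owner ... and see if we return to
       the start. Assumes no pre-existing cycle elsewhere; a production
       version would bound the walk. */
    static int would_deadlock(struct task *waiter, struct lock *wanted)
    {
        for (struct task *t = wanted->owner; t != NULL;
             t = t->blocked_on ? t->blocked_on->owner : NULL) {
            if (t == waiter)
                return 1;
        }
        return 0;
    }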
That is a very contrived example, but I get your point. In my system, sharing of resources is normally (i.e. at the user-space level) done via a class of messages which have very strict (and executive-enforced) rules about access, and this is all handled within the normal waiting system architecture. But the GUI task currently works differently (this is my first desktop GUI effort, so it's easily the weakest part of my project). Other than the GUI task, locks are only used at the executive level, so they are not as arbitrary as the example you have given. I will probably (definitely) need to rewrite the GUI task at some point to make better use of the messaging system.
Separately, if you do your "hlt"-based idle at low priority, you have a race condition if you're relying on interrupts to clear any locks. For example:
1. Task A starts an I/O operation
2. I/O completes quicker than expected, signals the completion event
3. Task A waits until the I/O is complete, which presumably involves polling an event
4. With no other tasks to schedule (Task A is waiting for an event that won't happen again) you "hlt" waiting for an interrupt that won't arrive.
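(For reference, the usual cure on x86 is to make the "anything runnable?" check and the hlt atomic with respect to interrupts: sti only takes effect after the instruction that follows it, so nothing can slip in between it and the hlt. A rough sketch, where runnable_tasks() and schedule() are hypothetical scheduler hooks:)

    extern int  runnable_tasks(void);
    extern void schedule(void);

    void idle_task(void)
    {
        for (;;) {
            asm volatile("cli");        /* close the race window */
            if (runnable_tasks()) {
                asm volatile("sti");
                schedule();             /* work arrived after all */
                continue;
            }
            /* sti enables interrupts only after the NEXT instruction,
               so a wakeup cannot be lost between the check and the hlt. */
            asm volatile("sti; hlt");
        }
    }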
From this example, I think we might be talking at cross purposes, which would explain our disagreement. My system is a microkernel; tasks normally can't lock a resource that doesn't belong to them, and I'm using locks purely to ensure synchronisation. (I also have a Forbid()/Permit() pair of functions in the kernel which tells the scheduler not to reschedule when the task's quantum has expired, for a similar purpose, but I don't intend to expose that to user space, for obvious reasons! I mostly used them for temporary locking while working on other system components; I don't think they are used any more and they can probably be removed.)
So all hardware access is handled by a dedicated device class of task (or multiple tasks, depending upon the needs of the device, but the user's task has no idea how the device works). If a user's task wants to perform an I/O operation, it raises an IORequest class of message with the required device (task). There is a formal interface for this, and the task can choose either to wait for the I/O to complete (in which case it will sleep and be woken via the normal signalling system), or to do other processing and periodically check on the status of the IORequest, dealing with it as it sees fit once it is returned.
Interrupts do not access system resources. They only fill buffers and then signal their associated device task that an interrupt has happened. If that task is waiting on an interrupt, it will be woken and will deal with the newly filled buffer; if that task is already running, it will see the signal (i.e. the Wait(signalMask) function will return immediately rather than putting the task to sleep) and again handle the buffer. User tasks never wait on an interrupt; they always deal with the associated device task via the messaging system.
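(In code, that interrupt-to-task handshake looks roughly like the following. Signal() and Wait() mirror the names used in this thread; everything else is a hypothetical stand-in:)

    #include <stdint.h>

    #define RX_SIZE 256
    static volatile uint8_t  rx_buffer[RX_SIZE];
    static volatile unsigned rx_head;

    /* Interrupt side: touch no system resources, just fill the buffer
       and signal the device task. */
    void uart_interrupt(void)
    {
        rx_buffer[rx_head++ % RX_SIZE] = read_uart_data();
        Signal(uart_task, SIG_RX);      /* wake the device task */
    }

    /* Device task side: sleeps until signalled; if a signal arrived
       while it was still running, Wait() returns immediately instead. */
    void uart_device_task(void)
    {
        for (;;) {
            uint32_t sigs = Wait(SIG_RX | SIG_IOREQ);
            if (sigs & SIG_RX)
                drain_rx_buffer();      /* consume what the ISR queued */
            if (sigs & SIG_IOREQ)
                service_io_requests(); /* messages from user tasks */
        }
    }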
In the best case, you have an unrelated interrupt (like a timer) that runs your scheduler again, and task A now notices the I/O has completed, but you've turned an unexpectedly quick operation (so quick you missed it) into a long, potentially indeterminate, wait.
Good example. I have a timer device task. The timer interrupt has a simple counter and can signal the timer task when a certain value is reached. The timer task is not scheduled until it receives a signal from the timer interrupt that the interrupt event has occurred; it then checks the counter, from which it knows the current "system time", and can then service any IORequests that it may be holding. If a user task needs a timer, it sets up an IORequest with the timer task, specifying the period of time after which it expects the IORequest to be returned. The user task can then wait (it won't be scheduled while it is waiting for a message) for the IORequest to be returned by the timer task. It is also free to do other things and use CheckIO(IORequest*) to see if the timer has returned.
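(From the user task's side that boils down to something like this; CheckIO() is the real call mentioned above, the other names are hypothetical equivalents:)

    /* Ask the timer device task to return this request in 500 ms. */
    IORequest *req = CreateIORequest(timer_device);
    req->period_ms = 500;               /* hypothetical field */
    SendIO(req);                        /* hand it to the timer task */

    do_other_work();                    /* the task stays runnable */

    if (!CheckIO(req))                  /* not returned yet? */
        WaitIO(req);                    /* sleep until it comes back */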
A task waiting on a lock queue will use literally no CPU time.
If you have to poll for locks now being free, you're wasting CPU time, but there are bigger problems with this approach (see above.)
So, I think I didn't really explain the scope of how locks are used in my system. The examples you are citing are all handled by the messaging system (which is built on a system of signals), and any task waiting for a signal sleeps until one of the signals it is waiting for arrives.
All events in my GUI, for example, are built using proper messages; if they weren't, the idle task would never run, since every task would be busy-waiting for the GUI to signal it!
The particular issue here is that I don't have a reverse mechanism for a task sending (graphical) data back to the GUI task, so the GUI task just reads the data from the task directly. If the task is making changes to the description of the data it wishes to display, it needs to complete that update before the GUI task tries to read it. This really needs a more formal mechanism... and during the writing of this post, I have thought about how that might work... But the concept of atomic data structure updates remains an important one in the OS design.
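(In the meantime it is just the usual critical-section discipline, sketched here with made-up structure names around the Lock()/Free() pair discussed below:)

    /* Client task: complete the whole update under the lock, so the
       GUI task can never observe a half-written description. */
    Lock(&win->desc_lock);
    win->desc.width  = new_width;
    win->desc.height = new_height;
    Free(&win->desc_lock);

    /* GUI task: takes the same lock before reading the description. */
    Lock(&win->desc_lock);
    render(&win->desc);
    Free(&win->desc_lock);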
That would require the lock to keep a record of all the tasks which want the resource; then, upon freeing of the lock, one of the tasks in the list would need to be signalled to run. This doesn't feel very robust to me, and would be quite messy to debug... I think...
You need a simple linked list, which needs to be robust in the face of multi-threading. You'll need that anyway, really, in a robust OS.
You build the lock and its waiting list on simpler, easier to debug primitives.
At the bottom, you have an uninterruptible spin lock. It needs to be uninterruptible so you can safely use it in interrupt handlers.
The spinlock protects the higher level mutex, which encapsulates the locking state as well as the task waiting list. Once you have the spinlock protection, you can poll and modify the wait list with impunity.
The waiting list itself can be either a singly or doubly linked list. I use a doubly linked circular list, with a single head pointer forming the wait list, to enable arbitrary removal from the list without having to walk the list.
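(A minimal sketch of that list; because every node carries both links, removal is O(1) and never walks the list:)

    /* Circular doubly linked list; the list itself is just a head
       pointer, NULL when empty. */
    struct node { struct node *next, *prev; };

    static void list_append(struct node **head, struct node *n)
    {
        if (*head == NULL) {
            n->next = n->prev = n;      /* first node points at itself */
            *head = n;
        } else {
            struct node *h = *head;
            n->next = h;                /* insert just before the head, */
            n->prev = h->prev;          /* i.e. at the tail of the ring */
            h->prev->next = n;
            h->prev = n;
        }
    }

    static void list_remove(struct node **head, struct node *n)
    {
        if (n->next == n) {             /* sole element */
            *head = NULL;
        } else {
            n->prev->next = n->next;
            n->next->prev = n->prev;
            if (*head == n)
                *head = n->next;
        }
    }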
That's the foundation every OS is built on. It can be written and debugged at user level, so making it robust can be done in a nice easy environment.
The trickiest part is getting the spinlock reliable in the face of interrupts and SMP (for future use). But once that's done, you can build more complex primitives on top. My own personal sync library, disregarding the processor-dependent spinlocks and the linked list macros, is less than 450 lines of C for:
- Spinlock based mutex with wait list
- Mutex based read/write lock
- Mutex based monitor (enter/leave/wait/signal/broadcast)
- Spinlock based monitor (as above, but usable for interrupt locking and signalling)
Given how critical synchronization is to implementing an OS robustly, you need to invest time to do this properly.
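(Putting those layers together, a rough sketch: C11 atomics for the spinlock, with interrupt masking elided, and hypothetical scheduler hooks block_current()/make_runnable() plus wait-list helpers:)

    #include <stdatomic.h>

    /* Layer 1: spinlock. A real kernel version also saves, disables
       and restores the interrupt flag around the critical section. */
    typedef struct { atomic_flag f; } spinlock_t;

    static void spin_lock(spinlock_t *s)
    {
        while (atomic_flag_test_and_set_explicit(&s->f, memory_order_acquire))
            ;                           /* spin */
    }

    static void spin_unlock(spinlock_t *s)
    {
        atomic_flag_clear_explicit(&s->f, memory_order_release);
    }

    /* Layer 2: mutex = spinlock guarding a locked flag + wait list. */
    struct task;
    typedef struct {
        spinlock_t   guard;
        int          locked;
        struct node *waiters;           /* circular list as sketched above */
    } mutex_t;

    void mutex_lock(mutex_t *m)
    {
        spin_lock(&m->guard);
        while (m->locked) {             /* re-check after every wakeup */
            enqueue_current(&m->waiters);  /* hypothetical: park the caller */
            spin_unlock(&m->guard);
            /* A real version marks the task as sleeping before dropping
               the guard, so a wakeup between these two lines isn't lost. */
            block_current();            /* sleep: zero CPU while queued */
            spin_lock(&m->guard);
        }
        m->locked = 1;
        spin_unlock(&m->guard);
    }

    void mutex_unlock(mutex_t *m)
    {
        spin_lock(&m->guard);
        m->locked = 0;
        struct task *t = dequeue_waiter(&m->waiters);  /* may be NULL */
        spin_unlock(&m->guard);
        if (t)
            make_runnable(t);           /* wake exactly one waiter */
    }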
Yes, I think that my architecture already covers this, but I haven't really communicated it very well (my fault, not yours).
I have used my kernel architecture in embedded designs for many years, but I realise the desktop is a much more complex use case (users suck!), and that's why I'm doing this!
-edit- It would be possible to implement the lock waiting system using the normal task signalling system; I just wanted a more lightweight mechanism, as I only use the Lock()/Free() function pairs to ensure that the update of particular data structures is an atomic operation.
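(A sketch of what that might look like, reusing Forbid()/Permit() as the short critical-section guard and the normal signalling machinery; the lock structure and helper names are hypothetical:)

    struct lock { int held; /* plus a list of waiting tasks */ };

    void Lock(struct lock *l)
    {
        Forbid();                       /* brief, non-preemptible section */
        while (l->held) {
            add_waiter(l, current_task());  /* remember who is sleeping */
            Permit();
            Wait(SIG_LOCK);             /* normal signalling: no CPU used */
            Forbid();                   /* woken: re-check before taking it */
        }
        l->held = 1;
        Permit();
    }

    void Free(struct lock *l)
    {
        Forbid();
        l->held = 0;
        struct task *t = pop_waiter(l); /* may be NULL */
        Permit();
        if (t)
            Signal(t, SIG_LOCK);        /* wake exactly one waiter */
    }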