Atomic Operations
- liquid.silver
- Member
- Posts: 46
- Joined: Sat Jun 30, 2007 12:07 pm
- Location: South Africa
- Contact:
Atomic Operations
I was thinking of how to implement atomic operations in my os. My os is specifically for x86 processors. Can i simply execute a cli instruction and then a sti after the operation. Any problems with this? Also if an interrupt occurs when the flag is clear, will the interrupt happen when the flag is set again?
Re: Atomic Operations
Interrupts continue to be asserted until the CPU explicitly clears them. CLI does not clear interrupts -- it just makes the CPU ignore them for a little while. If an interrupt is asserted while the CPU is ignoring interrupts (with CLI), then the CPU will be interrupted immediately (2 opcodes) after the next STI instruction. The CPU may have to handle several pending interrupts immediately after STI.liquid.silver wrote: Also if an interrupt occurs when the flag is clear, will the interrupt happen when the flag is set again?
It is also possible to handle atomics on multicore systems without spinlocks (if you happen to hate spinlocks) by assigning write privileges to your resource (that you are doing the atomic operation on) to just one thread on just one core, and then using any singlecore mechanism to make sure the operation is atomic on just that one thread on just that one core. Any other core trying to read the resource while it is in an "invalid" state would need to task switch to something else, until the resource is valid again.
-
- Member
- Posts: 368
- Joined: Sun Sep 23, 2007 4:52 am
You don't need spinlocks, just a mutex and a "blocked" state in the scheduler.
To make the mutex you use CMPXCHG on single processor systems and LOCK CMPXCHG on multi processor systems (works on single processor systems as well, but is slower).
If the mutex is not available, then the thread goes into the blocked state (not running, and can't run), and is revived (either set to running or can run) when the other thread releases the mutex.
Allowing user processes to disable interrupts is a very bad idea for general purpose operating systems.
To make the mutex you use CMPXCHG on single processor systems and LOCK CMPXCHG on multi processor systems (works on single processor systems as well, but is slower).
If the mutex is not available, then the thread goes into the blocked state (not running, and can't run), and is revived (either set to running or can run) when the other thread releases the mutex.
Allowing user processes to disable interrupts is a very bad idea for general purpose operating systems.
You don't need spinlocks, just a mutex and a "blocked" state in the scheduler.
I do not advocate this as being the most efficient method.If the mutex is not available, then the thread goes into the blocked state (not running, and can't run), and is revived (either set to running or can run) when the other thread releases the mutex.
In a single processing environment, fine it is very efficient, as you know that because the lock is held, there is no way this process can get the lock in this timeslice. However in a concurrent environment, another core may be executing the process/thread that has acquired that lock, and it may just be that process is about to release it!
So you either
(a) spin and wait for it to be released. Best case: released before timeslice ends, process continues some execution.
(b) block the thread and swap out. Processor gets put to work immediately, but that process is starved of resources somewhat.
(c) hybrid of the two: spin for x cycles then block.
Most OSs use (c), and being that (c) is an extension to (a) which is what I suggested, it seems logical to first start with (a).
JamesM
-
- Member
- Posts: 368
- Joined: Sun Sep 23, 2007 4:52 am
- liquid.silver
- Member
- Posts: 46
- Joined: Sat Jun 30, 2007 12:07 pm
- Location: South Africa
- Contact:
Thanx. I do know what spinlocks are. I really just wanted to clarify about interrupts not being lost.
At the moment my os is single core. I'm going with the strategy of trying to get as much stuff to work as possible, even if it's a mess. Then when it's too much of a mess i rewrite everything. As i'm working at a microkernel, it isn't too bad and it helps me implement concepts early in order to understand them and then i can clean everything up later.
At the moment therefore, i don't need to worry about multiple cores, i'll get to that when i start on acpi. I'd rather sort out my ipc first
At the moment my os is single core. I'm going with the strategy of trying to get as much stuff to work as possible, even if it's a mess. Then when it's too much of a mess i rewrite everything. As i'm working at a microkernel, it isn't too bad and it helps me implement concepts early in order to understand them and then i can clean everything up later.
At the moment therefore, i don't need to worry about multiple cores, i'll get to that when i start on acpi. I'd rather sort out my ipc first
Do yourself a favor and abstract out your synchronization to, at the very least, some functions.
For example:
rather than calling asm("cli"); before the access, and asm("sti"); right after it, define some basic functions:
synchronized_enter()
{
asm("cli");
}...
and synchronized_exit() (you can guess what it looks like).
Then, when you decide you want to use spinlocks, sleeplocks, hybrid locks, semaphores, etc... down the line, you don't have to do a find and replace on your whole kernel.
On another note, if you think you're ever going to want to play with transactional memory in your kernel, you should probably make synchronized_enter() and exit take a list of references to data that will be modified. No need to do anything with it now, but you'll thank yourself later if you do try some STM.
For example, if I'm modifying the data in some structs (a, b, and c):
synchronized_enter(a, b, c);
//do operations here
synchronized_exit();
This is probably much more than you wanted to know, but it's not a bad idea to provide some means of changing the way some basic components of your kernel are implemented in a modular way. Your code will be a little easier on the eyes too if you don't toss in too many inline assembly statements without explanations.
For example:
rather than calling asm("cli"); before the access, and asm("sti"); right after it, define some basic functions:
synchronized_enter()
{
asm("cli");
}...
and synchronized_exit() (you can guess what it looks like).
Then, when you decide you want to use spinlocks, sleeplocks, hybrid locks, semaphores, etc... down the line, you don't have to do a find and replace on your whole kernel.
On another note, if you think you're ever going to want to play with transactional memory in your kernel, you should probably make synchronized_enter() and exit take a list of references to data that will be modified. No need to do anything with it now, but you'll thank yourself later if you do try some STM.
For example, if I'm modifying the data in some structs (a, b, and c):
synchronized_enter(a, b, c);
//do operations here
synchronized_exit();
This is probably much more than you wanted to know, but it's not a bad idea to provide some means of changing the way some basic components of your kernel are implemented in a modular way. Your code will be a little easier on the eyes too if you don't toss in too many inline assembly statements without explanations.
- liquid.silver
- Member
- Posts: 46
- Joined: Sat Jun 30, 2007 12:07 pm
- Location: South Africa
- Contact:
What do you do if you need mutual exclusion but can't block? Say interrupt handler that needs to talk to hardware without other processors messing with it? Well, I guess in that particular case you could mask the hardware and just wakeup a service thread, but how do you access scheduler-queues without spinlocks then? And how do you make sure you won't run into a race when accessing PIC (in order to mask interrupts)? Work with nothing but lock-free algorithms? Good luck.Craze Frog wrote:I didn't say it was more efficient, just that you don't have to use spinlocks, like you said.
Even the CMPXCHG-based mutex is just half the truth, because how do you solve the race condition when one thread has decided it must block, but has not yet been added to a wait queue, yet the other already releases the lock? What if a third thread then goes and sees a free lock and takes it while the first is still not yet even in the wait queue?
Honestly, it's probably possible to do without spinlocks, but personally I wouldn't want to try it.
The real problem with goto is not with the control transfer, but with environments. Properly tail-recursive closures get both right.