Multiprocessing critical regions
What are the typical critical regions for multiprocessing? Where must I put the spinlocks?
For now I put them only in the scheduler function.
- zaleschiemilgabriel
- Member
- Posts: 232
- Joined: Mon Feb 04, 2008 3:58 am
I've been asking myself that same question about hard disk access. Can two processors issue commands to an ATA device at the same time? I assume the command registers have to be set accordingly, so that's a big NO, but what would happen if a CPU accesses the status register and figures it's ok to send a command, but another CPU immediately steps in and sends another command before the first one? I've never written a hdd driver before, so I might be talking nonsense (maybe this sort of sync is done automatically by the APICs or something).
If anyone knows of similar cases with other devices, I'm making a list.
Cheers,
Gabriel
zaleschiemilgabriel wrote: If anyone knows of similar cases with other devices, I'm making a list.
Well, the IDE controller or hard disk neither knows nor cares how many CPUs there are, and it does no syncing or multiplexing, so you have to make sure on your own that only one CPU at a time (and only one thread at a time, if you are in preemptible code areas) deals with it and keeps the device locked as long as it's busy (i.e., as long as it has inconsistent state). This is obviously the case for *all* devices. The APICs don't have anything to do with this: an APIC just delivers interrupts to a CPU core, but it isn't involved in I/O operations or anything device specific.
You have to make sure that only one command at a time is sent to the ATA controller. That applies to every other device that is not part of the CPU itself, too (e.g. floppy disk, network cards); by contrast, you don't need locks to protect the local APIC registers. The CPU cannot do that kind of synchronization for you, as it does not understand the ATA protocol. It just sends bytes over the PCI bus.
Hi,
I'm sorting out the same kind of thing at the moment.
zaleschiemilgabriel wrote: what would happen if a CPU accesses the status register and figures it's ok to send a command, but another CPU immediately steps in and sends another command before the first one?
This is why you need to use spinlocks with an atomic test-and-set function. The spinlock really needs to be at the level of your hard disk driver, not at the level of port access. This is because a process must be able to lock either *all* or *none* of the resources it needs for a particular task, in order to avoid deadlock and data corruption (particularly if a device has a range of ports).
It is also no good if one process sends its HDD request, finishes and releases the port range, and then a second process locks and writes to the same ports halfway through the disk access. For device access, therefore, the entire device needs locking out for the entirety of the operation. This means using some kind of messaging system or other queue control.
Quote: Which are typical critical regions for multiprocessing? Where I must put the spinlocks?
Agree with JamesM. To expand with some examples, you will need locks for physical memory allocation, "kernel-space" and shared heap allocation, device drivers, VFS functions, thread queue accesses, semaphores and so on...
Cheers,
Adam
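To make the "lock the whole device for the whole operation" point concrete, here is a minimal sketch in portable C11. Everything in it is an assumption made for illustration: `ata_channel_t`, `ata_begin` and `ata_complete` are hypothetical names, and the `regs` array is only a stand-in for the real task-file ports. The point it shows is that ownership is taken once, before the first register write, and is only given back when the operation has finished.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical driver state: one ownership flag per channel, not per port. */
typedef struct {
    atomic_flag busy;     /* set while a command is in flight */
    uint8_t     regs[8];  /* stand-in for the task-file registers */
} ata_channel_t;

/* Try to start a command. Returns false if another CPU already owns the
 * channel; the caller would then queue the request or retry later. */
static inline bool ata_begin(ata_channel_t *ch, uint32_t lba,
                             uint8_t count, uint8_t cmd)
{
    if (atomic_flag_test_and_set_explicit(&ch->busy, memory_order_acquire))
        return false;

    /* We own the channel: nobody else can interleave register writes. */
    ch->regs[2] = count;
    ch->regs[3] = (uint8_t)lba;
    ch->regs[4] = (uint8_t)(lba >> 8);
    ch->regs[5] = (uint8_t)(lba >> 16);
    ch->regs[7] = cmd;
    return true;
}

/* Called once the transfer has finished (for example from the IRQ handler).
 * Only now may another CPU start a new command. */
static inline void ata_complete(ata_channel_t *ch)
{
    atomic_flag_clear_explicit(&ch->busy, memory_order_release);
}
```

A channel would be declared as `ata_channel_t ch = { .busy = ATOMIC_FLAG_INIT };`. Returning `false` instead of spinning is one possible design choice here; it leaves room for the request queue mentioned above.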
- Combuster
- Member
- Posts: 9301
- Joined: Wed Oct 18, 2006 3:45 am
- Libera.chat IRC: [com]buster
- Location: On the balcony, where I can actually keep 1½m distance
- Contact:
Well, there's still the lock-free/wait-free class of algorithms. Which means that in my case, large parts of memory management require no locking, just a few atomic read-modify-write operations. (XADD, CMPXCHG and BTS rock)
I can imagine the same principles being applied to hardware too.
However, I agree that locking is the easier route to take.
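As an illustration of what lock-free memory management can look like (this is not Combuster's actual allocator, just a sketch under stated assumptions), here is a tiny physical-frame bitmap that uses only atomic read-modify-write operations. `FRAME_COUNT`, `frame_bitmap`, `frame_alloc` and `frame_free` are hypothetical names. The C11 `atomic_fetch_or` plays the role of a `LOCK BTS`: it sets a bit and reports whether it was already set.

```c
#include <stdatomic.h>
#include <stdint.h>

#define FRAME_COUNT 64u  /* assumption: a single 64-bit word of frames */

static _Atomic uint64_t frame_bitmap;  /* bit set = frame in use */

/* Returns the index of a frame we now own, or -1 if all are taken. */
static inline int frame_alloc(void)
{
    for (unsigned i = 0; i < FRAME_COUNT; i++) {
        uint64_t bit = UINT64_C(1) << i;
        /* Atomically set the bit and fetch the previous word. */
        uint64_t old = atomic_fetch_or(&frame_bitmap, bit);
        if (!(old & bit))
            return (int)i;  /* the bit was clear, so the frame is ours */
        /* Otherwise someone else owns it; setting it again was harmless. */
    }
    return -1;
}

/* Releases a frame by atomically clearing its bit. */
static inline void frame_free(int i)
{
    atomic_fetch_and(&frame_bitmap, ~(UINT64_C(1) << i));
}
```

Two CPUs racing for the same bit cannot both see it as previously clear, so no lock is needed around the bitmap.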
- zaleschiemilgabriel
IMO it's not the easier route, but the safest. Can you please explain the lock-free/wait-free method a bit?
- Combuster
zaleschiemilgabriel wrote: IMO it's not the easier route, but the safest.
Singletasking is easier, true, but we want correctness in a multiprocessor environment, no?
zaleschiemilgabriel wrote: Can you please explain the lock-free/wait-free method a bit?
Have you done your own research yet?
zaleschiemilgabriel wrote: IMO it's not the easier route, but the safest. Can you please explain the lock-free/wait-free method a bit?
It's *definitely* the easier route. Locking is much easier than atomic ops.
Take an algorithm where you can identify that the invariant is maintained throughout all the code bar a few lines. For example, you calculate something (in thread-local storage) and then add it to a global accumulator. All of this code except the global addition is preemptible and safe. Only that last step needs exclusivity: you could make a lock, grab it, update the global and release the lock, OR you could use some atomic ops to change the value of that global variable atomically.
Whether you can or not depends on the processor, but instructions like CMPXCHG on the x86 allow you to do this.
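The accumulator example can be sketched with C11 atomics. The names `global_total`, `total_add` and `total_add_saturating` are made up for illustration. On x86 a compiler will typically turn the first function into a single `LOCK`-prefixed add or `XADD`, and the retry loop in the second into `LOCK CMPXCHG`, but that is up to the compiler.

```c
#include <stdatomic.h>

static atomic_ulong global_total;  /* hypothetical shared accumulator */

/* Simple case: one atomic read-modify-write, no lock required. */
static inline void total_add(unsigned long local_result)
{
    atomic_fetch_add(&global_total, local_result);
}

/* When the update is not a plain add (here: clamp at max), use a
 * compare-and-swap loop: compute the new value from a snapshot and
 * retry if another CPU changed the global in the meantime. */
static inline void total_add_saturating(unsigned long local_result,
                                        unsigned long max)
{
    unsigned long old = atomic_load(&global_total);
    unsigned long desired;
    do {
        desired = (old >= max || max - old < local_result)
                      ? max
                      : old + local_result;
        /* On failure, old is refreshed with the current value. */
    } while (!atomic_compare_exchange_weak(&global_total, &old, desired));
}
```

The local calculation stays outside both functions; only the final publish step is atomic.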
- zaleschiemilgabriel
Plainly put, those instructions are sync'ed among processors? Is there anything specific that makes them that way? I think this is the point where I get overly inquisitive and should just STFU and google?
Just wanted to say thanks... and also to point out that actually accessing thread-local-data is not the only circumstance in which you need locks; you also need them for accessing device registers in an atomic fashion.
If there are any other resources you might know of that require similar sync'ing, I'm still making that list...
- Combuster
That list of devices depends on the functional behaviour of an individual device, your requirements, and your experience points in Advanced Wizardry.
In other words: different per person.
I wouldn't lock thread-local data. As the name suggests, it's limited to a thread, hence there is no concurrent access.
As for how specific instructions work, see the Intel manuals. I recommend getting some background education on concurrent programming first, as the manuals don't really make sense without that underlying knowledge.
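Code:
typedef struct spinlock
{
    volatile unsigned long lock;    /* 1 = unlocked, <= 0 = locked */
} spinlock_t;

//! Acquire the spinlock (the lock word must be initialized to 1).
static __INLINE__ void spin_lock( spinlock_t *lock )
{
    __asm__ __volatile__(
        "1: lock ; decb %0\n"   /* atomically decrement the low byte */
        "jge 3f\n"              /* result >= 0: the lock was free, we own it */
        "2:\n"
        "cmpb $0, %0\n"         /* plain read while waiting */
        "rep; nop\n"            /* PAUSE hint; does not touch the flags */
        "jle 2b\n"              /* still held: keep spinning */
        "jmp 1b\n"              /* looks free: retry the atomic decrement */
        "3:\n"
        /* "+m": the lock word is both read and written by the asm */
        : "+m"(lock->lock) : : "memory");
}

//! Release the spinlock.
static __INLINE__ void spin_unlock( spinlock_t *lock )
{
    __asm__ __volatile__(
        "movb $1, %0"
        : "=m"(lock->lock) : : "memory");
}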
In a critical section I do:
Code:
void scheduler(unsigned int stack)
{
    unsigned int flags;

    local_irq_save(flags);      /* disable interrupts on this CPU first */
    spin_lock(sched_lock);      /* then keep the other CPUs out */
    ...
    //Code of the scheduler
    ...
    spin_unlock(sched_lock);
    local_irq_restore(flags);
}
Or are there faster or safer methods?
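For comparison, the same kind of lock can be written without inline assembly using C11 atomics. This is only a sketch of a test-and-test-and-set spinlock; the names `tas_lock_t`, `tas_lock`, `tas_trylock` and `tas_unlock` are hypothetical, and whether it is actually faster than the assembly version is something to measure, not assume.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    atomic_bool locked;  /* false = free, true = held */
} tas_lock_t;

static inline void tas_lock(tas_lock_t *l)
{
    for (;;) {
        /* One atomic exchange: if the old value was false, we own it. */
        if (!atomic_exchange_explicit(&l->locked, true,
                                      memory_order_acquire))
            return;
        /* Contended: wait on plain loads so the cache line is not
         * bounced between CPUs by repeated atomic writes. */
        while (atomic_load_explicit(&l->locked, memory_order_relaxed))
            ;  /* a kernel would typically issue PAUSE here */
    }
}

/* Single attempt; returns true if the lock was acquired. */
static inline bool tas_trylock(tas_lock_t *l)
{
    return !atomic_exchange_explicit(&l->locked, true,
                                     memory_order_acquire);
}

static inline void tas_unlock(tas_lock_t *l)
{
    atomic_store_explicit(&l->locked, false, memory_order_release);
}
```

As in the scheduler above, interrupts on the local CPU would still need to be disabled before taking the lock if an interrupt handler can try to take it too.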