Page 3 of 3
Re: Algorithm Challenge
Posted: Fri Aug 14, 2009 6:28 pm
by tantrikwizard
gravaera wrote:EDIT: @tantrikwizard: I had just typed me up a beautiful discourse on the evils of gotos, with witty little detours, and intellectually engaging points. Great timing.
hehe, no worries, just getting out of hand. I did like your post before the edit though.
gravaera wrote:But anyway, @OP: You'll find that you'll probably get more help if you provide a real synopsis of the AREa where you think you need pointers. you just came and dumped code and expected us to figure out it intricacies. Why not post a smaller snippet with something more specific to ask?
I am the OP BTW, but perhaps I should have labelled the thread 'Algorithm Challenge' or something like that? I hoped the bug would jump out at somebody as obvious. Theres a couple bugs or a single bug manifesting itself in a couple ways which doesn't appear unless two or more threads are hitting it simultaniously. It's not a confusing algorithm (even with the gotos) which is what has me confused. The basic idea is to avoid blocking. In my experience you get better performance by using atomic operations rather than blocking especially if the code stays in user space and doesnt cross the kernel boundary. Even if the code must do extra computations to remain in user space it's normally faster (unless the code is crap).
The basic rundown is this:
- when creating a virtual address space a process level PartitionAllocator is handed off to user space for allocating/freeing heap memory.
- the PartitionAllocator is intially setup with a single free partition at the base of user space which defines the entire user memory space (No next pointer and size==sizeof(entire user memory))
- as allocation requests come in it iterates through the _FreePartitions list until it finds a partition large enough for the request.
- it atomically decreses the requested size and uses the failure of this atomic operation for concurrency control.
- next it atomically pushes the new allocation on the _UsedPartitions and returns the pointer to the caller
Freeing is only slightly more complicated because it attempts to merge adjacent free partitions. The order in which merging occurs is important to avoid a two phase operation.
- it first locates and atomically removes the requested used partition from the _UsedPartitions list.
- it then iterates through the _FreePartitions checking if the address prepends an existing free partition. If it does the new partition is resized to include both partitions and replaces the existing partition.
- next it checks whether the address extends an existing partition, if so it expands the existing partition to include the freed space.
Sorry for not outlining this before, I thought it was obvious. This bug isnt easily tracked down, it goes away once I introduce debugging that outputs some runtime info. Eventually there is some contention or overwrite and a partition's Next pointer will point back to itself, thus iterating through the _UsedPartitions or _FreePartitions results in an infinite loop. (Yes! I can code an infinite loop in under 2 hours.)
What am I missing? I cant think of anything. After staring at a hunk of code for too long it all runs together and doesnt make any sense anymore. My eyes are bleeding over this otherwise simple algorithm. Thanks for the help everyone.
Re: Algorithm Challenge
Posted: Sun Aug 16, 2009 7:39 pm
by dude101
gravaera wrote:Listen: any properly designed system NEVER has to use gotos. Ever. Any competent programmer can think in such purely logical terms that such erratic logic as jumping from one point to another doesn't even come to mind while programming. It has never even been an issue for me. Everything is a flow of logic. The reason why we have procedures and functions, i.e. blocks of code that can be repeated and factored out into one block of related code is so that we can jump to that bit of code by using the proper, standard calling convention.
The example you gave above could easily have been done by factoring out the error portion into a function. And it would have looked nicer too. Not only that, but using comparing operands and then a jump statement (via a goto) kind of messes up the compiler's mojo.
You know, there are so many people whom I believe, if I were to give a compiler life for one day, would get bludgeoned or shot.
Design it properly. Take the extra time. It's worth it.
Sometimes goto's can produce more optimized algorithms. You might save an instruction. For me, every instruction matters.
Re: Algorithm Challenge
Posted: Mon Aug 17, 2009 12:11 am
by Solar
NickJohnson wrote:manonthemoon wrote:Umm.... nested functions are illegal in C and C++. So yeah, it is so hard to do that.
More details.
In *standard* C/C++. Works fine in GNU C though, which many of us are using.
Actually, Nick, that (using compiler-specific language extensions) is even more of an offense against proper coding technique than using goto itself. I can grudingly accept goto's, but that would probably get you fired here...
dude101 wrote:Sometimes goto's can produce more optimized algorithms. You might save an instruction. For me, every instruction matters.
Measure. Optimize. Measure.
Re: Algorithm Challenge
Posted: Tue Aug 18, 2009 2:22 pm
by 54616E6E6572
Just felt like throwing in my 2 cents.
The goto statement is nothing more than an unconditional branch.
The if, switch, while, for, etc... statements are just conditional branches.
gotos are always surrounded by some conditional statement anyways (as it would be pointless to always skip a section of code). Therefore it would in fact produce smaller, simpler, faster code to just rearange the braces of the surrounding conditional statement.
Re: Algorithm Challenge
Posted: Tue Aug 18, 2009 5:21 pm
by geppyfx
Not sure how good I understood you code. Hopefully enough to propose to redesign it into this:
Code: Select all
void * Alloc(size_t size)
{
if (!size) return NULL;
ui32 iActualSize = size + sizeof(TPartitionInfo);
PartitionInfo pFreePartition = _FreePartitions;
while(pFreePartition != NULL)
{
if(InterlockedCompareExchange(..) == DOESNT_NEED_RETRY){
for(I don't get why this is infinite loop if you get out of here on first iteration){
if(InterlockedCompareExchange(..) == DOESNT_NEED_RETRY) return ++pRet;
else
{
pFreePartition = pFreePartition->Next; // or pFreePartition=_FreePartitions ???
break;
}
}
}
else pFreePartition = pFreePartition->Next
}
}
Re: Algorithm Challenge
Posted: Tue Aug 18, 2009 11:10 pm
by thooot
I think there will be some problems with your Free routine. Your allocation routine looks okay at first glance. When you free though you can hit a race condition. If you have items: A -> B -> C -> D on your used partition list and you have two threads freeing B & C at the same time then you can have:
Both threads get to the **** in the code:
Code: Select all
//first remove this item from the _UsedPartitions list
Retry1:
if ((ui32)FreeMe != InterlockedCompareExchange((ui32*)&_UsedPartitions, (ui32)FreeMe->Next, (ui32)FreeMe)) { //if its not the 1st item then iterate
for (PartitionInfo pUsedPartition = _UsedPartitions ; pUsedPartition ; pUsedPartition = pUsedPartition->Next){
assert(pUsedPartition->Magic1 == PartitionMagic && pUsedPartition->Magic2 == PartitionMagic);
if (pUsedPartition->Next && pUsedPartition->Next==FreeMe){
****
if ((ui32)FreeMe->Next != InterlockedCompareExchange((ui32*)&pUsedPartition->Next, (ui32)FreeMe->Next, (ui32)FreeMe)) goto Retry1;
}
}
}
The thread removing B goes first and you get: A -> C -> D
Then the thread removing C goes, but since it has pUsedPartition pointing to B, you still get: A -> C -> D
Then later you'll add C to the free list and now you have the used list pointing in to the free list and you have no way to get to D anymore. The problem is to atomically remove B from the list you need to atomically update both A's next pointer & B's next pointer. This way if another thread is currently looking at B the next pointer will no longer match and it won't be able to remove C. However x86 doesn't provide a way to do two atomic operations on disparate memory at the same time. You could try setting B's next pointer to -1 using an interlocked operation before doing the cmpxchg. Then if another thread is walking the list and hits a -1 next pointer it would need to iterate through the list again. This is essentially a spinlock...
I didn't look at the remainder of the Free algorithm so there could be other issues as well.
Re: Algorithm Challenge
Posted: Wed Aug 19, 2009 6:38 pm
by tantrikwizard
thooot wrote:
The thread removing B goes first and you get: A -> C -> D
Then the thread removing C goes, but since it has pUsedPartition pointing to B, you still get: A -> C -> D
I don't think so if I'm following what you mean. If I'm understanding you correctly then the interlocked operations which adjust the list would fail and the entire operation would be retried until there's no write contention. I've been really busy with work and haven't had a chance to revisit it but will as soon as I get a minute and make sure I understand you correctly. The kernels using a nasty single threaded heap with spin locks at the moment but I really want to get this going and compare the performance. Unfortunately it's crunch time at the office so don't have a lot of time for personal stuff. Thanks for the help!