Maybe to continue the conversation for the sake that as a child I had tried to perform compression using permutation. During that time as a child I did not realize that what I was dealing with was called permutations or that I was simply wanting the change bases which is why my idea failed. However, the fondness of permutations, or combinations as I called them, has remained over the years. Some times I start thinking about it. The last time I took a look at it again was a few months ago when I wanted a hash algorithm that had a domain that was much smaller than the items to be hashed which is exactly like most hashing algorithms I know of today.
What I wanted to do differently was to make the algorithm aware of previous hashes so it could generate a hash in a sequential order with out storing the string. I think a equivalent example would be:
Code: Select all
struct sPair
{
uint32_t hash;
uint8_t *string;
};
sLinkedListItem<sPair> pairs;
uint32_t doHash(uint8_t *string)
{
// check if string already exists.
if(STRING_EXISTS_IN_LIST(pairs))
{
return GET_HASH_FROM_LIST_BY_STRING(string, pairs);
}
// create a new hash - and the return value is the index of the pair in the list.
index = CREATE_NEW_PAIR_IN_LIST(string, pairs)
hash = index;
return hash;
}
This is bloated though. What I wanted to do was remove the need to store current pairs and remove the need to illiterate a list or I suppose I could say make it more linear. The way I tried solving this was using premutations of permutations. I think that is the proper way to say it. The idea was similar to:
Here is a direct mapping.
kevin = kevin
apple = apple
pear = pear
shirt = shirt
wall = wall
a = a
However, I wanted to treat the mapping as a permutation it's self and call this permutation of the mapping a sequence so that something simpler like:
Mapping Sequence: aa
aa = 'aa
ab = 'ab
ba = 'ba
bb = 'bb
Mapping Sequence: ab
aa = 'aa
ab = 'ab
ba = 'bb
bb = 'ba
Mapping Sequence: ba
aa = 'aa
bb = 'bb
ba = 'ab
bb = 'ba
Mapping Sequence: bb
aa = 'aa
bb = 'bb
ba = 'ba
bb = 'ab
Mapping Sequence: baa
aa = 'aa
bb = 'ba
ba = 'bb
bb = 'ab
Mapping Sequence:
aa = 'aa
bb = 'ba
ba = 'ab
bb = 'bb
Mapping Sequence:
.... continued onward ...
So if we hashed in this order:
'aa, 'bb, 'ba, 'bb. Then, a direct mapping could be achieved by changing the mapping sequence from the default
aa to
bb. The need to store the current pairs should be removed (eliminated) since this is obviously stored by the mapping sequence instead of being expanded. More like a compressed state that retains the order. This is most likely already done somewhere although it is named something else and represented differently. I have no real idea but then again it might not be.
I forgot to mention that the point is to completely eliminate collision up to a point where you can detect that a collision will happen and you could remove a older or unused hashing by changing the mapping sequence.