Page 1 of 1

Possible sudden stroke of genius?

Posted: Fri Sep 04, 2009 5:44 pm
by 54616E6E6572
So I had this possibly amazing idea and I couldn't find any reference to a similar idea anywhere so far. Essentially I was getting annoyed at 32 & 64 bit pointers and it suddenly hit me that any pointer (or type) for that matter could be compressed into a single byte. Bits 0-6 represent the actual value and Bit 7 is a flag. The flag represents whether or not you subtract one from the uncompressed value before returning it.

For Example:
You have the compressed value of 0x38, then actualValue = 2 << (0x38 - 1) (2 to the power of 56).
If the compressed value >= 0x80, then actualValue = (2 << (byte - 1)) - 1

Any value between 0 and 2^128 could be easily represented in a single byte without any data loss. This includes integers, floating point numbers, strings, etc... This could be especially useful in managed byte code, where pointers and values need to be represented in a size that is both portable and can be read quickly and efficiently. (it takes upto 8 clock cycles to convert the value on a x86 computer)

I also know there is a quick and efficient algorithm to solve for X when 2^X = actualValue...
Solve for X when 2^X = actualValue; x = ln(y) / ln(2)
I would love to hear any and all criticism on this idea :).

Re: Possible sudden stroke of genius?

Posted: Fri Sep 04, 2009 6:46 pm
by oib111
How would I represent a value like 6? The first method only works with values that are powers of 2, and seeing as 6 isn't a power of 2 that doesn't work. Your second method won't work either. If you left shift 2 by 1 you get 4, then subtract 1, and you get 3. If you left shift 2 by 2 you get 8, subtract 1, and you get 7.

Re: Possible sudden stroke of genius?

Posted: Fri Sep 04, 2009 6:56 pm
by Alboin
No. (To the title.)

There is one simple notion to keep in mind when 'inventing' compression techniques: One cannot losslessly compress two bytes into one *single* byte without overhead. To prove this, make it true. In this case, one could make any two bytes one, and, thus, compress any file into one, single byte. Because we know that there are more than 256 types of files, it must not be true.

Re: Possible sudden stroke of genius?

Posted: Fri Sep 04, 2009 7:41 pm
by neon
In all cases, "compressing" any larger then a byte data to a single byte will guarantee a loss of information.

Re: Possible sudden stroke of genius?

Posted: Fri Sep 04, 2009 9:20 pm
by NickJohnson
neon wrote:In all cases, "compressing" any larger then a byte data to a single byte will guarantee a loss of information.
No, it's that you can't have a method that compresses *all* large messages into smaller messages. You can compress the message to the point at which it has an entropy of one bit per bit. But yeah, the OP is not at all right - he can only represent powers of two, and some simple functions of them.

For those interested in actual compression, check out an algorithm like range encoding or LZ77. And equally important, the theory behind compression.

Re: Possible sudden stroke of genius?

Posted: Fri Sep 04, 2009 9:28 pm
by AndrewAPrice
To prove your theory, take a 1 GB file and iterate through it, turning every 4 bytes into 1 byte, brining the file down to 256MB, do it again and bring it down to 64MB, and again, and again, until you have a single byte long file. Try to extract the original file back ;)

It's possible to store a decimal number (which acts similar to a float in terms of precision vs range) in a single byte using the following algorithm to reconstruct the number:

r = n ^ (b + 1) - n

r = real number
n = raise this to increase the possible range
b = byte value
(where ^ is power, not xor)

For example, if n = 1.03 you have the possible range of 0 to 1932.40477:

Code: Select all

n = 1.03:
r:    b:
0   0
1   0.0309
2   0.062727
3   0.09550881
4   0.129274074
..
100 18.76519094
101 19.35904667
102 19.97071807
103 20.60073961
.. 
200 379.4064897
201 390.8195844
202 402.5750719
..
253 1821.417705
254 1876.091136
255 1932.40477
It's not so good for math (rounding issues really stand out), but your standard >, <, == operators work, and it would be useful for some memory critical file format.

Re: Possible sudden stroke of genius?

Posted: Sat Sep 05, 2009 10:54 am
by earlz
Well.. that sucks for you that it doesn't work.

You could maybe in a managed(or unmanaged for that matter) store "large" pointers in 32 bits. Like if you forced all memory accesses to be rounded to the actual machine word(either 4 or 8 bytes) then you can cut out the bottom 2 or 3 bits and therefore have access to 34-35 bit pointers. Or provide something a bit better to allow for like 36 and up pointers by forcing regular pointers to be aligned to like the bottom 6 or more bits and then allowing intensive pointer arithmetic with unaligned "byte pointers"