What's the best way to split a (base 10)string into bits

earlz · Post by **earlz** » Sun May 13, 2007 12:38 am

I am making an infinite precision library, and I am wanting to be able to set the stored value to a value..simple right? well, the problem is that this string can contain more digits than an int..
this problem is easy to fix in hex, just count up how many digits is in an int, and split it up..but with decimal, 7 with only one digit takes up 3 bits, while 8 takes up 4...this causes problems, and I don't know of a good way to go about fixing them without hardcoding a lot of crappy crap...

anyone have any ideas?

btw I'm using C++

Kevin McGuire · Post by **Kevin McGuire** » Sun May 13, 2007 1:46 am

It is a little challenging to think about it, however it is quite possible.

// perform the binary equivalent in math to factor the binary number by ten.
ndecx = (decx<<3)+(decx<<1);

// multiply a number by 0-10
0 - zero
1 - x
2 - x<<1
3 - x<<1+x<<0
4 - x<<2
5 - x<<2+x<<0
6 - x<<2+x<<1
7 - x<<2+x<<1+x<<0
8 - x<<3
9 - x<<3+x<<0
10 - x<<3+x<<1

This (above) operation will multiply any binary number by ten. What you might want to do is start with the (below) and use the shifts and adds to multiply the place value by ten. Then, multiple this place value by the arbitrary numbers 0-9 to produce a result and then add this result to the total (which should start at zero).

Code: Select all

	binaryBaseObject.FromInt(1); /// just get things started.
	// or : binaryBaseObject.Clear(); binaryBaseObject.SetBit(0);
	for(x = 0; myDecimalString[x]; ++x){
		myDecimalObject.FromInt(myDecimalString[x] - '0');
		// put current place value into a tmp
		binaryResultObject = MyBinaryAdd(MyBinaryShiftLeft(binaryPlaceValueObject, a),MyBinaryAdd(MyBinaryShiftLeft(binaryPlaceValueObject, b), MyBinaryShiftLeft(binaryPlaceValueObject, c)));
		binaryTotal = MyBinaryAdd(binaryTotalObject, binaryResultObject);
		binaryPlaceValueObject = MyBinaryAdd(MyBinaryShiftLeft(binaryPlaceValueObject,3), MyBinaryShiftLeft(binaryPlaceValueObject,1));
	}

There might be a more streamlined way to do it. I will have to think about it for a little while.

earlz · Post by **earlz** » Sun May 13, 2007 8:22 am

That is just a bit hard to understand for some reason..I read it like 5 times, and just now really understand basically how it should work..

edit:
my best idea is to get the digits, add there digit number type thing to them... and then add that digit using my binary add()

like for 921
it'd be split into 1, 20, and 900, and each of these would be added using temporary stuff or something...I really don't know..

Kevin McGuire · Post by **Kevin McGuire** » Sun May 13, 2007 11:42 am

You could do your math in base-65535 which should make the math a lot faster and overflow detection easy instead of using base-0xffffffff.

I have written a program that uses a class called LargeNumber. It does math using base-65535 and supports as large of a number as memory and (*hint*) time will allow.

I have tried to test it throughly, but you know sometimes bug do not surface however it is about as simple as I can make it.

One of the interesting features and problems is that if you had a four megabyte number which is huge; and you wanted to perform some math on it you would create another number which is almost just a huge such as:

a+b=c

So that if a and b are thirty-two million bit numbers then c would need to be at least a thirty-two million bit number.

4MB+4MB=4MB

So you might think well I can add numbers that two gigabytes in size, but this would be incorrect since the result would need to be two gigabyte and the two operands are already filling the entire four gigabyte space.
(Just some information for thought.)

Also in some cases things such as:

LargeNumber a, b;
a = b

Can be very costly compared to the seemly innocent.

unsigned int a, b;
a = b;

To combat this problem is needed.

LargeNumber a, b;
a.Eat(b);

This is the same as (a=b), except b is now zero so its the same as (below) when using a built-in type.

unsigned int a, b;
a = b; b = 0; /// ----> a.Eat(b);

Except there is no memory coping performed.

The class structure looks like this:

Code: Select all

class LargeNumber{
	public:
	unsigned int	 dcount;
	tNumBlock 	 *nblock;
	LargeNumber(void);
	~LargeNumber(void);
	static void Add(LargeNumber &a, LargeNumber &b, LargeNumber &c);
	static void Mul(LargeNumber &a, LargeNumber &b, LargeNumber &c);
	void SetDigit(unsigned int index, unsigned short value);
	unsigned short GetDigit(unsigned int index);
	unsigned int GetDigitCount(void);
	void From(const char *s);
	void Clear(void);
	void Eat(LargeNumber &a);
};

Here is the code so you might view.
http://kmcguire.jouleos.galekus.com/dok ... er_program

Kevin McGuire · Post by **Kevin McGuire** » Sun May 13, 2007 3:01 pm

I forgot to mention that I convert the base10 number using:

Code: Select all

void LargeNumber::From(const char *s){
	unsigned int x;
	LargeNumber input, result;
	LargeNumber base, factor10;
	base.SetDigit(0, 1);
	factor10.SetDigit(0, 10);

	for(x = strlen(s); x > 0; --x){
		input.SetDigit(0, s[x-1]-'0');
		LargeNumber::Mul(base, input, result);
		LargeNumber::Add(*this, result, *this);
		result.Clear();
		/// Increase by place value... 1..10...100...1000....10000...100000...
		LargeNumber::Mul(base, factor10, base);
	}
	return;
}

I maintain base to provide the place value, and multiply it with the input from the string using the internal math routines. This also helps maintain the math in a sixteen-bit base.

a==>00c095d56765a2e62b2c494e6d589f801284
// a to the power of eight.
a==>0ad51374dc43c9e27e1457552fc3ac1b86a66aeae782d0d37d55f6b8102df83291cdd7323f0a2a2a813b08d052e9ad0b099fa41650dbaaa3ef75fb2f73fc61731f8e5d1dd6
a19727a684f6837e5fa674e8f5026353266356d289257191cb4c0759d4b3d0ca06a9e527a73cae829053011da785058185f1005b11aee43feb22103ccfecb3c1439d04bfd0386200a1
e5a2493ec891c37de01a54fb6cdb4643dd4b8cdbe106e142d91642d216dc78b6c4bcefa248bf150f56ec9998e71f21756b2ced1dcfb19c45178340bdae71e9cf09fcf3d3e4148bb03e
df92341293aea9035abcc46324e6bdca47cdd343ae9b3373adc3777f89e4619d99b5be92af3379acb3d3b261ec347947d0e1e20573efc30eb33172543a4702277106fb92cbbd1ffbe9
dd96474778e3531d5d6c9157a8a365ec1bc013c17db623316d0a82408be6093e66f4475e268b6bc570e8c04b0715f01ba87a16cf46e3d2ded87f0fbd44f7da6c32580fe8bf3002c44f
3e45b7c8458751d0cd1c1e91d05fca7736c8419e446e730589c4fdf21ccbcc37cc631c54334164a79cec752823f6c4e20bf15c19284a165eca11b9975dced6c2977d55af8dd2a19ab3
a54840de59beffd71d499eb2b0a66f18f16b34645601e7144f2de60827df9769b5c8e4fdac0cb851ae88a26e6b17a691df8ba168b0ae22e95a539ea2bbe74ff19b66b1d76600cec1ed
de226d21fe9ea40a73e0ce6ecc3225161e4d4877314060efce255e64102722dcc86afe410ef95819189638a7d35698109c369395c73badb6ac7316af88474ff85edabecb3629e37033
0161606c7977d4b8192db24b9cc4610290464e85d24c5c526bb4306cc531288f7aa36e61ff8810e45c55fca033c5a812a1bd7a750f38718176f0930267095b349d6af83ef559404119
d79f09a69ee83ed997bac480edda8094a02ffeb242fddadd99960c932fcddea850116bf33836c2f54b3dbe88483a66ab686879dbba0d4eea2408486324982448626cea335f7a64b04a
d4493513e719a8be82dc3e6017b744414d1011421767f8e9803d51ccfa47c3cdaa37045d21da7cdf49cbc0a13b8d4b9c7cef0f7e93a2374c24df5df15aabd6ceab58e6b863330235b5
5eb257ae15f30f04339bf3645ff3b31d7ae4e282640a0b4598fa33e343d7697f0b1285da143271555606971cfa5d8924851dd1f9af15eac7c2b14770a8b5d744f108cfc9f87044a381
a02221c7ee6d0b153ab90046bc784c7d80fd40cf2f9206daba4f2b7a9eea70d23a7b82f31e6095c4bb90c3883f4d12472529ce7767648c05994c8448cbd9cb634b759a96efb363dc52
1e3f004596f7dbd5d9bc38f4de1957067133f574f54ca6e5560a68b7bbfdf1d5a6be1c8ae5eff6bca130bf2c4c7bfc9a0cc43e5f881ccb6f4c59f802b4d32e6cf4cbeeaf5e5441f3f5
ca81a1d33a966a55a94d40c97cc30b0f200bd702d252ee62db40c72a490c1e8753866f538f95089087fbc0b81fca3c6f395f2fe7c50af90538db01d9236bf75d0da988b7b853b77e3f
a05b35a179e0fa65ed708ae7685145b94543f93de44982c7bb739112ff9d85e50962d6e8349ea27a28058f37b5c7a4c376ecc4ff926a73f68fcd7957556d832fe4eedf742e9f86cb1c
91df4abd24e325ae2958dff22882132463868bba2652094484a539cf7abd212e502e7eefc5e55e3af97b4a881425c5357471d4b5a4723d01a1ac576678088bb93154436ebacc691fff
fe7d81c6de8972101ad6825b338a401c3cb08271f939ed6f8663ef54910bbfaa560f1d6a32e6f3b67b427822b10088fdea3434d7ff136bdeddfe34474c176ff42b5ab8fb1d29918827
18b75ea5ba7487a36c580764e397ea9eabd62ada35b3375cf188ecb21273bec2beb151d9cfcdb8aa8be9ef02093eb0752488f9b13ab7887a5999b0ab8890cc782b86eefa1c733c4a83
a38648641248ab6f17960150e9d0cd2584f4cc02cc38043f2d187a83293442603183066935023839b2e229f472bd7c999f17fcbb2f5e695af5e1ed45da3b29448243ffdd672f41efdb
9b1987cf98e3de212c7609a46e8bc0287d77d623e7152334e69249c88bf31a625607f65f835b82d3b27ba578d1e9ee19420445c3a63b7baec94c39852d119ad40ccbd30028c8420ceb
a2076e339794337f465786af8b5d4208144a3faeea2f3e6f2fe9d344993a2d70cd5c060bf2ca54c97b91971d07389064c5a6d9e8f170050f97fcd704e5db6b0e99739e266b40a4d637
a2207e92152f8b429c0725f602501bf9ad4e66b66a69d1be102d30d5abcd721c6f9b5ae535fc04bd38fd7adbd38fe2aa2fd7d3d2a72f77bd60b11b4a4feebf1519f90cdb98da025b9e
0dc9b8fd60ca7a3c65eff48bfce5564125f892624257b04eaa344537837bd018090ded8bd407d7c5f04324561c280a4b6640b3cec54ed91b6ada4bb77f6a658c8a5c8d743c4c956c17
e7690380c7a499c02ab477f8b063a4be23a4c4cfe206e6238b3e0ff489fcf6e9ab6273c076c5038a2c9aa98644e2c2a464d7d163282075f0163ca4a04d401e58fa2003f9370146557b
c55292b3f4ebda68b7abd1334488c6da226f640c3ebe2e397c60ec8ba3444446cbc467391e0181f9162952482757aaa35b2dcfd082ea79424b7f77d25a8a71b9d973f7541dc0e164a7
131b79a72e21f5d983d1b3a5c1255c9cff49d785263d659fdcf21352cbc695a25b4f941844c65e9d8ad5a9ae442c5afdaae5bd07f601d52a113c20f706c611ecf57fca5e66718c42c5
814a7f92f7d7e4044602f0b5fc5f536fa60b946ea31f9ac4891452f8e76f9e64658bab4aafd23e85c40096bd112ffe68d49ec38d3ad76599a98c3960f76d91a2d1ff0a708fcd4d3fb9
33c18096860af7eac41840139636d34843e447ede1f3773b962cf05941c7909e314f9fb89c312bdb4534caec76acb76bec980d3740ce77d93993dcd17e66c66b78e698b1484d1651b3
2d665d4341e4fb8cfdbc0cca4b912c277461e63ba98ce950010000000000000000000000000000000000000000000000000000000000000000

earlz · Post by **earlz** » Sun May 13, 2007 3:29 pm

wow...16bit base...that just doesn't sound fun...

If computers work at a binary level, then would binary number emulation work best?

Kevin McGuire · Post by **Kevin McGuire** » Mon May 14, 2007 1:54 pm

You let the processor do the conversion to binary, because although a computer fundamentally does math in binary it does not expose this binary bit access to the programmer directly in a whole. Instead it exposes digits in the base of 256, 65535, and 4294967295 with the instruction set.

You can think of a digit in the base of 65535 as having sixteen binary bits, but you have no direct access to this instead you can use binary operations but not binary read/write operations to deduce what bit has been set or not (or set a bit or not).

For example a Little-Indian and Big-Indian processor will make you're code in C/C++ run the same unless you try to perform bit level manipulations using binary operations which is a example of this bit access not being exposed to the programmer; and also not exposed in machine code in whole.
^ Note: The above could be slightly incorrect as I have no real experience in dealing with these matters. ^

unsigned short a = SetBits(1000.0100.1100.0011, ...blah..);

Why should you have to set all those bits when the processor can do it for you?

unsigned short a = 0x84C3;

Why should you add in binary a and b when the processor can do it for you?

unsigned short a = 0x84C3;
unsigned short b = 0x9282;

result = a + b;

Using GetBit and SetBit on a and b one at a time is slow and tedious for adding them together (emulating binary) when the entire addition could be solved in one machine instruction instead of over twenty.

I would have chosen to use a thirty-two bit base, but it makes it difficult to detect a overflow with out resorting to some assembler code.

c=a+b;
0xFFFF+0xFFFF = 0x1FFFE.

A 0x1FFFE will not overflow the register or thirty-two bit location. This allows me to check if a overflow occurred with:

if(c > 0xFFFF)

urxae · Post by **urxae** » Mon May 14, 2007 4:43 pm

Kevin McGuire wrote:You let the processor do the conversion to binary, because although a computer fundamentally does math in binary it does not expose this binary bit access to the programmer directly in a whole. Instead it exposes digits in the base of 256, 65535, and 4294967295 with the instruction set.

Off-by-one errors. You probably mean base 256, 65536, and 4294967296 (Hint: powers of 2 (larger than 1) aren't odd numbers

).

For example a Little-Indian and Big-Indian processor

Endian.

Brynet-Inc · Post by **Brynet-Inc** » Mon May 14, 2007 4:45 pm

That's always funny.. People keep spelling Endian incorrectly..

earlz · Post by **earlz** » Fri May 18, 2007 6:23 pm

Well, I did some work and thinking and figured out how to add any base on paper, so I might actually convert to 8bit arithmetic...

thing I can't figure out is how to do subtraction...(in any base but 10)

Brendan · Post by **Brendan** » Sat May 19, 2007 2:44 am

Hi,

hckr83 wrote:Well, I did some work and thinking and figured out how to add any base on paper, so I might actually convert to 8bit arithmetic...

thing I can't figure out is how to do subtraction...(in any base but 10)

Instead of subtracting, change the sign of one number and add them. For example:

X = A - B

Is the same as:

X = A + (-B);

And:

-X = (-A) + B

You'll see why this last one matters soon....

The algorithm you're probably looking for goes a bit like this:

Code: Select all

NUMBER *addNumbers(firstNumber, secondNumber) {
    NUMBER *newNumber;
    int digit;
    signed int temp;
    unsigned int carry = 0;
    NUMBER *tempNumber;
    signed char relationship;

    setToZero(newNumber);

    // Make sure exponents are the same

    while(firstNumber->exponent > secondNumber->exponent) {
        if(firstNumber's digits can be shifted left) {
            shift firstNumber's digits left;
            decrease firstNumber->exponent;
        } else {
            shift secondNumber's digits right;   // Possible precision loss!
            increase secondNumber->exponent;
        }
    }
    while(secondNumber->exponent > firstNumber->exponent) {
        if(secondNumber's digits can be shifted left) {
            shift secondNumber's digits left;
            decrease secondNumber->exponent;
        } else {
            shift firstNumber's digits right;   // Possible precision loss!
            increase firstNumber->exponent;
        }
    }

    if(firstNumber->sign == secondNumber->sign) {

        // We can add the digits

        newNumber->sign = firstNumber->sign;
        for(digit = 0; digit < totalDigits; digit++) {
            temp = firstNumber[digit] + secondNumber[digit] + carry;
            if(temp < BASE) {
                newNumber[digit] = temp;
                carry = 0;
            } else {
                newNumber[digit] = (temp - BASE);
                carry = 1;
            }
        }
        if(carry != 0) {
            shift newNumber's digits right;   // Possible precision loss!
            increase newNumber->exponent;
            newNumber[totalDigits - 1] = 1;
        }
        return newNumber;

    } else {

        // We need to subtract the digits

        relationship = compareNumbers(firstNumber, secondNumber);
        if(relationship == 0) {
            // Both numbers have same magnitude and different signs so the answer is zero
            return newNumber;
        }
        if(relationship > 0) {
            // FirstNumber has larger magnitude than secondNumber, so sign of newNumber = sign of firstNumber
            newNumber->sign = firstNumber->sign;
        } else {
            // FirstNumber has smaller magnitude than secondNumber, so:
            //  1) newNumber has same sign as secondNumber
            //  2) swap the numbers to make things easier later
            newNumber->sign = secondNumber->sign;
            tempNumber = firstNumber;
            firstNumber = secondNumber;
            secondNumber = tempNumber;
        }
        newNumber->sign = secondNumber->sign;

        // Now we're subtracting a smaller number from a larger number!

        for(digit = 0; digit < totalDigits; digit++) {
            temp = firstNumber[digit] - secondNumber[digit] - carry;
            if(temp >= 0) {
                newNumber[digit] = temp;
                carry = 0;
            } else {
                newNumber[digit] = temp + BASE;
                carry = 1;
            }
        }
        // Note: Carry always = 0 here because we subtracted smallest from largest
        return newNumber;
    }
}

Cheers,

Brendan

OSDev.org

What's the best way to split a (base 10)string into bits

What's the best way to split a (base 10)string into bits

Binary Math Emulation Is Efficent?

Re: Binary Math Emulation Is Efficent?