Unsigned 64-bit Divide

mark3094 · Post by **mark3094** » Mon Jul 25, 2011 1:55 am

I am having trouble with 64-bit divides on a 32-bit machine.

I built a printf routine in a normal project (i.e., standard headers etc. are included) which works well and has been thoroughly tested.

When I copied this code into my new kernel (standard headers etc. not included), I had problems printing long long integers, e.g.:

Code: Select all

printf ("%lld\n", 50);
printf ("%llu\n", 50);

As this is a 32-bit machine, the compiler calls the _aulldiv helper function for the divide; for testing purposes, I have borrowed some code for it from the Microsoft Research website.
When I use the print statements above, the result I get is 4507997673881650.

It looks like it is turning the sign bit on for some reason. To test this theory, I tried using this printf statement:

Code: Select all

printf ("%llu\n", (unsigned long long int)50);

With this statement it works correctly.
As all this works just fine when using C libraries, I assume that it must be my modified _aulldiv function (listed below).
However, I just can't see how the sign bit is getting turned on...

Code: Select all

;   The stack:
;
;   |                   |
;   |-------------------|   +32
;   |              (Hi) |
;   | - - - - - - - - - |   +28
;   |  Divisor     (Lo) |
;   |-------------------|   +24
;   |              (Hi) |
;   | - - - - - - - - - |   +20
;   |  Dividend    (Lo) |
;   |-------------------|   +16
;   |  Return Address   |
;   |-------------------|   +12
;   |        ESI        |
;   |-------------------|   +8
;   |        EBP        |
;   |-------------------|   +4
;   |        EBX        |
;   ---------------------   ESP (0)


[BITS 32]
[GLOBAL __aulldiv]

[SECTION .TEXT]


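__aulldiv:
	;  Set up stack frame
	PUSH	ESI
	PUSH	EBP
	PUSH	EBX
	MOV		EBP, ESP

	;  Note: only the low dword of the divisor is used. The high dword
	;  at [ebp+28] is ignored, so this only handles divisors below 2^32.
	mov		ecx,[ebp+24]			; lower divisor
	mov		eax,[ebp+20]			; higher dividend
	xor		edx,edx
	div		ecx						; eax = high quotient, edx = remainder
	mov		ebx,eax					; ebx = temporary storage

	mov		eax,[ebp+16]			; lower dividend
	div		ecx						; edx:eax / ecx, eax = low quotient
	mov		edx,ebx					; recover from temporary storage

Clean:
	;  Clean up (result is returned in edx:eax)
	POP		EBX
	POP		EBP
	POP		ESI
	RET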
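
For anyone following along, the two DIV steps in the routine above can be modelled in C roughly as follows. This is only a sketch for checking the arithmetic on a host machine: the name udiv64_by_32 is made up for illustration, and on a real 32-bit target the 64-bit division inside it would itself call the compiler's helper, so it is not a replacement for the assembly.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical host-side model of the assembly routine: divides a 64-bit
 * value by a 32-bit divisor in two steps, high dword first. */
uint64_t udiv64_by_32(uint64_t dividend, uint32_t divisor)
{
    assert(divisor != 0);                  /* DIV would fault on zero */

    uint32_t hi = (uint32_t)(dividend >> 32);
    uint32_t lo = (uint32_t)dividend;

    /* Step 1: high dword / divisor gives the high dword of the quotient. */
    uint32_t q_hi = hi / divisor;
    uint32_t rem  = hi % divisor;          /* what the first DIV leaves in EDX */

    /* Step 2: (rem:lo) / divisor. Because rem < divisor, the quotient
     * fits in 32 bits, which is why the second DIV cannot overflow. */
    uint64_t partial = ((uint64_t)rem << 32) | lo;
    uint32_t q_lo = (uint32_t)(partial / divisor);

    return ((uint64_t)q_hi << 32) | q_lo;
}
```

Comparing udiv64_by_32(x, 10) against x / 10 for a few values is a quick way to convince yourself the divide logic is sound for small divisors.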

Combuster · Post by **Combuster** » Mon Jul 25, 2011 2:09 am

printf ("%lld\n", 50);

There's your real error. There's a world of difference between passing an int and passing a long long. You send 32 bits and the receiver reads 64 bits (which means that 32 of those bits are garbage).
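
To make the mismatch concrete, here is a minimal sketch of a variadic function that, like a "%lld" conversion, always reads a long long. The name sum_ll is hypothetical and is not part of any printf implementation discussed here.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical helper: sums `count` variadic arguments, reading each one
 * as a long long, the same way a "%lld" conversion reads its argument. */
long long sum_ll(int count, ...)
{
    assert(count >= 0);

    va_list ap;
    va_start(ap, count);

    long long total = 0;
    for (int i = 0; i < count; i++)
        total += va_arg(ap, long long);   /* always consumes a long long */

    va_end(ap);
    return total;
}
```

A call such as sum_ll(2, 50LL, (long long)7) is fine because both arguments really are long long. A call such as sum_ll(1, 50) passes an int instead, which is undefined behaviour: the compiler does not convert variadic arguments to whatever the callee happens to read.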

Turn on warnings if you haven't. GCC will not let that go unnoticed; with a bit of luck, MSVC is just as professional.
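
For a home-grown printf, GCC (and Clang) can only check the arguments if the function is marked as printf-like. A rough sketch, assuming GCC's format attribute; the name kprintf and the buffer it writes to are placeholders, and the body simply forwards to vsnprintf so the example can run on a host:

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* GCC/Clang extension: tells the compiler which parameter is the format
 * string and where the variadic arguments start, so -Wformat can check them. */
#if defined(__GNUC__)
#define PRINTF_LIKE(fmt_idx, first_arg) \
    __attribute__((format(printf, fmt_idx, first_arg)))
#else
#define PRINTF_LIKE(fmt_idx, first_arg)
#endif

/* kprintf is a hypothetical name standing in for the kernel's own printf. */
int kprintf(const char *fmt, ...) PRINTF_LIKE(1, 2);

static char kprintf_buf[128];   /* stand-in for the kernel's console output */

int kprintf(const char *fmt, ...)
{
    assert(fmt != NULL);

    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(kprintf_buf, sizeof kprintf_buf, fmt, ap);
    va_end(ap);
    return n;                   /* number of characters formatted */
}
```

With the attribute in place and -Wall (which enables -Wformat), a call like kprintf("%lld\n", 50) should produce a warning about the int argument, while kprintf("%lld\n", 50LL) compiles cleanly.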

mark3094 · Post by **mark3094** » Mon Jul 25, 2011 2:19 am

You're absolutely right. I did this:

Code: Select all

	unsigned long long int i;
	long long int j;

	i = 50;
	j = 50;

and put that in the printf, and now it's OK. I didn't realise that putting in just the number passed garbage. This will be one I won't forget for a while.

Thank you very much for your help.

Brendan · Post by **Brendan** » Mon Jul 25, 2011 4:26 am

Hi

mark3094 wrote: You're absolutely right. I did this:
Code: Select all
	unsigned long long int i;
	long long int j;

	i = 50;
	j = 50;

Um?

In C, numerical constants like "50" are assumed to be "int". You can override this behaviour by doing something like "50ULL" so that C treats it as an unsigned long long instead.

Basically, the original problem should've been fixed by replacing the buggy lines with something like:

Code: Select all

    printf ("%lld\n", 50LL);
    printf ("%llu\n", 50ULL);

For something like "(unsigned long long int)50", C still assumes that the number is an int, and then generates code that converts the int into an unsigned long long (which would hopefully be optimised out). Conceptually, it's different to "50ULL" (but in practice, with reasonable optimisation, it usually ends up being the same). There are corner cases though - for example, "(long long int)0xFFFFFFFF" is extremely different to "0xFFFFFFFFLL", because the former would be sign extended to 0xFFFFFFFFFFFFFFFF while the latter would be 0x00000000FFFFFFFF.

Cheers,

Brendan

mark3094 · Post by **mark3094** » Mon Jul 25, 2011 5:32 am

Interesting...

I had assumed (incorrectly as it seems) that because I was using printf with 'll', that it would cast '50' as a long long integer.

There's always more to learn about C. Makes me wonder what other assumptions I have made. I guess I will find out in time.

qw · Post by qw » Mon Jul 25, 2011 5:52 am

mark3094 wrote: I had assumed (incorrectly as it seems) that because I was using printf with 'll', that it would cast '50' as a long long integer.

Fortunately, compilers are not that smart. I wouldn't want my compiler to do something like that behind my back. GCC giving a warning is exactly smart enough IMO.

Love4Boobies · Post by **Love4Boobies** » Mon Jul 25, 2011 7:13 am

There are a few subtle things that might cause confusion sometimes so perhaps it's worth being pedantic...

Brendan wrote: In C, numerical constants like "50" are assumed to be "int". You can override this behaviour by doing something like "50ULL" so that C treats it as an unsigned long long instead.

The actual rule for integer constants is that they get the first type, at least as wide as the suffix specifies, that can represent their value; however, the signedness will always be preserved unless there is also a prefix changing the radix (in C99 and later, a decimal constant without a U suffix never becomes unsigned, whereas a hexadecimal or octal one can). E.g., if INT_MAX expands to 32767, the constant 32768, unlike 0x8000, will not be of type unsigned int, but rather long. I agree that this is all pretty shitty (and it's nothing compared to the promotion rules), and unfortunately it can be a great source of bugs.
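
If you want to see which type your compiler gives a particular constant, a C11 _Generic selection can report it. The sketch below is only illustrative: TYPE_NAME and check_constant_types are made-up names, and the hexadecimal check is guarded because it assumes a 32-bit int.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Hypothetical helper macro: maps an expression's type to a readable name. */
#define TYPE_NAME(x) _Generic((x),            \
    int:                "int",                \
    unsigned int:       "unsigned int",       \
    long:               "long",               \
    unsigned long:      "unsigned long",      \
    long long:          "long long",          \
    unsigned long long: "unsigned long long", \
    default:            "other")

/* Checks the types of a few constants; asserts if any expectation fails. */
void check_constant_types(void)
{
    assert(strcmp(TYPE_NAME(50),    "int") == 0);
    assert(strcmp(TYPE_NAME(50U),   "unsigned int") == 0);
    assert(strcmp(TYPE_NAME(50LL),  "long long") == 0);
    assert(strcmp(TYPE_NAME(50ULL), "unsigned long long") == 0);

#if UINT_MAX == 0xFFFFFFFF
    /* Assumes a 32-bit int: the hex spelling becomes unsigned int, while
     * the decimal spelling of the same value moves to a wider signed type. */
    assert(strcmp(TYPE_NAME(0xFFFFFFFF), "unsigned int") == 0);
    assert(strcmp(TYPE_NAME(4294967295), "unsigned int") != 0);
#endif
}
```

This needs a C11 compiler; on older compilers the suffix rules still apply, there is just no convenient way to print the result.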

Brendan wrote: For something like "(unsigned long long int)50", C still assumes that the number is an int, and then generates code that converts the int into an unsigned long long (which would hopefully be optimised out). Conceptually, it's different to "50ULL" (but in practice, with reasonable optimisation, it usually ends up being the same).

While it does sound probable that there exist compilers that work this way, it's not a requirement of the standard. It's legal for a compiler to generate code or not for either of the two scenarios. I wouldn't worry about the cast for performance reasons; however, the suffixes make the code more readable.

Brendan wrote: There are corner cases though - for example, "(long long int)0xFFFFFFFF" is extremely different to "0xFFFFFFFFLL"; because the former would be sign extended to 0xFFFFFFFFFFFFFFFF while the latter would be 0x00000000FFFFFFFF.

No, if this were true, the following would fail:

Code: Select all

unsigned int a = UINT_MAX;
long b = a;

// or

float c = 5;

// or

double d = c;

It's important to remember that C is representation-agnostic, as far as defined behavior goes. The only question that should be on your mind is whether the value you are trying to assign fits the place you are trying to put it in.
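
A small sketch of that point, with made-up function names and assuming unsigned int is narrower than long long (true on the usual x86 targets): widening keeps the value, and the all-ones bit pattern only appears when the source value was negative to begin with.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helpers: implicit widening conversions preserve the value. */
long long widen_unsigned(unsigned int u)
{
    return u;    /* UINT_MAX stays UINT_MAX: the upper bits are zero */
}

long long widen_signed(int s)
{
    assert(s >= INT_MIN);   /* always true; every int value is representable */
    return s;    /* -1 stays -1: the upper bits are copies of the sign bit */
}
```

So widen_unsigned(UINT_MAX) is a large positive number, while widen_signed(-1) is still -1. Sign extension is just how the second result is represented, not something that changes the first.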

Combuster · Post by **Combuster** » Mon Jul 25, 2011 10:46 am

No, if this were true, the following would fail

How? That code is completely agnostic of what type the constants actually have. In both Brendan's everything-is-a-signed-int and your everything-is-the-smallest-lossless-storage, the net result is identical.

Love4Boobies · Post by **Love4Boobies** » Tue Jul 26, 2011 6:12 am

Combuster wrote:
No, if this were true, the following would fail
How? That code is completely agnostic of what type the constants actually have. In both Brendan's everything-is-a-signed-int and your everything-is-the-smallest-lossless-storage, the net result is identical.

I was illustrating a few extra examples of implicit conversion in order to show Brendan that C doesn't screw up like he claimed. The same rules apply to constants as to variables, which is why this is related.

blobmiester · Post by **blobmiester** » Tue Jul 26, 2011 10:29 am

The following compiles and runs without any assertion failing on my computer (Linux, x86-64, GCC):

Code: Select all

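#include <assert.h>

int main(void)
{
        /* Suffix and cast give the same value: no sign extension. */
        assert(0xFFFFFFFFLL == 0x00000000FFFFFFFFLL);
        assert((long long int)0xFFFFFFFF == 0x00000000FFFFFFFFLL);

        /* Only a genuinely negative value ends up as all ones. */
        assert(-1LL == 0xFFFFFFFFFFFFFFFFLL);
        assert((long long int)-1 == 0xFFFFFFFFFFFFFFFFLL);

        return 0;
}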

OSDev.org
