Unsigned 64-bit Divide

Post by mark3094 »

I am having trouble with 64-bit divides on a 32-bit machine.

I built a printf routine in a normal project (i.e., one where the standard headers, etc. are included), which works well and has been thoroughly tested.

When I copied this code into my new kernel (where the standard headers, etc. are not included), I had problems printing long long integers, e.g.:

Code: Select all

printf ("%lld\n", 50);
printf ("%llu\n", 50);
As this is a 32-bit machine, the compiler calls the _aulldiv function to do the divide. For testing purposes, I have based my version of it on some code borrowed from the Microsoft Research website.
When I use the print statements above, the result I get is 4507997673881650.

It looks like it is turning the sign bit on for some reason. To test this theory, I tried using this printf statement:

Code: Select all

printf ("%llu\n", (unsigned long long int)50);
With this statement it works correctly.
As all this works just fine when using C libraries, I assume that it must be my modified _aulldiv function (listed below).
However, I just can't see how the sign bit is getting turned on...

Code: Select all

;	The stack:
;
;	|                   |
;	|-------------------|	+32
;	|              (Hi) |
;	| - - - - - - - - - |	+28
;	| Divisor      (Lo) |
;	|-------------------|	+24
;	|              (Hi) |
;	| - - - - - - - - - |	+20
;	| Dividend     (Lo) |
;	|-------------------|	+16
;	| Return Address    |
;	|-------------------|	+12
;	|        ESI        |
;	|-------------------|	+8
;	|        EBP        |
;	|-------------------|	+4
;	|        EBX        |
;	---------------------	ESP (0)


[BITS 32]
[GLOBAL __aulldiv]

[SECTION .TEXT]


__aulldiv:
	;  Setup stack frame
	PUSH	ESI
	PUSH	EBP
	PUSH	EBX
	MOV		EBP, ESP								

	mov		ecx,[ebp+24]			; lower divisor (the high dword at [ebp+28] is never read)
	mov		eax,[ebp+20]			; higher dividend
	xor		edx,edx
	div		ecx
	mov		ebx,eax					; ebx = temporary storage

	mov		eax,[ebp+16]			; lower dividend
	div		ecx
	mov		edx,ebx					; recover from temporary storage

Clean:
	;  Clean up
	POP		EBX
	POP		EBP
	POP		ESI
	RET
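
For readers who find the assembly hard to follow, the two DIV steps above can be sketched in C. This is only an illustration, not the routine itself: the name `udiv64_by_32` is made up for this example, it assumes the divisor fits in 32 bits (the listing never reads the high half of the divisor), and it leans on the compiler's own 64-bit arithmetic for the second step, which the real helper on a 32-bit target cannot do.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: schoolbook division of a 64-bit dividend by a
 * 32-bit divisor, mirroring the two DIV instructions in the listing. */
uint64_t udiv64_by_32(uint64_t dividend, uint32_t divisor)
{
    uint32_t hi = (uint32_t)(dividend >> 32);   /* [ebp+20] */
    uint32_t lo = (uint32_t)dividend;           /* [ebp+16] */

    assert(divisor != 0);   /* DIV faults on a zero divisor */

    /* First DIV: 0:hi / divisor gives the high quotient and a remainder. */
    uint32_t q_hi = hi / divisor;
    uint32_t rem  = hi % divisor;

    /* Second DIV: rem:lo / divisor. Because rem < divisor, the quotient
     * is guaranteed to fit in 32 bits. */
    uint64_t partial = ((uint64_t)rem << 32) | lo;
    uint32_t q_lo = (uint32_t)(partial / divisor);

    /* Result is returned as EDX:EAX = q_hi:q_lo. */
    return ((uint64_t)q_hi << 32) | q_lo;
}
```

Calling `udiv64_by_32(50, 10)` gives 5. A divisor larger than 32 bits needs a different algorithm altogether.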

Post by Combuster »

printf ("%lld\n", 50);
There's your real error. There's a world of difference between passing an int and passing a long long. You send 32 bits and the receiver reads 64 bits (which means that 32 of those bits are garbage).

Turn on warnings if you haven't. GCC will not let that go unnoticed; with a bit of luck MSVC is just as professional.
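
As a rough sketch of how to get that warning for a home-grown printf: GCC and Clang accept a `format` function attribute that makes `-Wformat` (part of `-Wall`) check the arguments against the format string. The function `kprintf_buf` below is a hypothetical stand-in for a kernel printf; it uses the hosted `vsnprintf` purely so the example can be run outside a kernel.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a kernel printf: formats into a buffer.
 * The attribute (a GCC/Clang extension) says that argument 3 is a
 * printf-style format string and the values to check start at argument 4. */
#if defined(__GNUC__)
__attribute__((format(printf, 3, 4)))
#endif
int kprintf_buf(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(buf, size, fmt, ap);
    va_end(ap);
    return n;
}

void kprintf_demo(void)
{
    char out[32];

    /* kprintf_buf(out, sizeof out, "%lld", 50);   <- -Wformat warns: int passed for %lld */

    kprintf_buf(out, sizeof out, "%lld", 50LL);                     /* suffix */
    assert(strcmp(out, "50") == 0);

    kprintf_buf(out, sizeof out, "%llu", (unsigned long long)50);   /* cast */
    assert(strcmp(out, "50") == 0);
}
```

With the attribute in place, the commented-out call is flagged at compile time instead of printing garbage at run time.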

Post by mark3094 »

You're absolutely right. I did this:

Code: Select all

	unsigned long long int i;
	long long int j;

	i = 50;
	j = 50;
and put that in the printf, and now it's OK. I didn't realise that putting in just the number passed garbage. This will be one I won't forget for a while.

Thank you very much for your help.

Post by Brendan »

Hi
mark3094 wrote:You're absolutely right. I did this:

Code: Select all

	unsigned long long int i;
	long long int j;

	i = 50;
	j = 50;
Um?

In C, numerical constants like "50" are assumed to be "int". You can override this behaviour by doing something like "50ULL" so that C treats it as an unsigned long long instead.

Basically, the original problem should've been fixed by replacing the buggy lines with something like:

Code: Select all

    printf ("%lld\n", 50LL);
    printf ("%llu\n", 50ULL);
For something like "(unsigned long long int)50", C still assumes that the number is an int, and then generates code that converts the int into an unsigned long long (which would hopefully be optimised out). Conceptually, it's different to "50ULL" (but in practice, with reasonable optimisation, it usually ends up being the same). There are corner cases though - for example, "(long long int)0xFFFFFFFF" is extremely different to "0xFFFFFFFFLL"; because the former would be sign extended to 0xFFFFFFFFFFFFFFFF while the latter would be 0x00000000FFFFFFFF.


Cheers,

Brendan

Post by mark3094 »

Interesting...

I had assumed (incorrectly, as it seems) that because I was using printf with 'll', it would cast '50' to a long long integer.

There's always more to learn about C. Makes me wonder what other assumptions I have made. I guess I will find out in time.
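
A small sketch of why no cast can happen: the arguments are placed on the stack before the function ever looks at its format string, and variadic arguments only receive the default promotions (char and short become int, float becomes double). The function `sum_ll` is a made-up example that, like printf handling `%lld`, fetches each variadic argument as a long long.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical variadic function: reads `count` arguments, each one
 * fetched as a long long, and returns their sum. */
long long sum_ll(int count, ...)
{
    va_list ap;
    long long total = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, long long);   /* the caller must really have passed a long long */
    va_end(ap);
    return total;
}

void sum_ll_demo(void)
{
    /* Correct: both arguments have type long long. */
    assert(sum_ll(2, 50LL, (long long)-8) == 42);

    /* sum_ll(1, 50); would be undefined behaviour: an int is passed,
     * but va_arg fetches a long long. */
}
```

The callee has no way to learn what type was actually passed, so getting it right is entirely up to the caller.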

Post by qw »

mark3094 wrote:I had assumed (incorrectly, as it seems) that because I was using printf with 'll', it would cast '50' to a long long integer.
Fortunately, compilers are not that smart. I wouldn't want my compiler to do something like that behind my back. GCC giving a warning is exactly smart enough IMO.

Post by Love4Boobies »

There are a few subtle things that might cause confusion sometimes so perhaps it's worth being pedantic...
Brendan wrote:In C, numerical constants like "50" are assumed to be "int". You can override this behaviour by doing something like "50ULL" so that C treats it as an unsigned long long instead.
The actual rule for integer constants is that they are represented with at least the precision specified by the suffix, or more if the value does not fit; however, the signedness will always be preserved unless there is also a prefix changing the radix. E.g., if INT_MAX expands to 32767, the constant 32768, unlike 0x8000, will not be of type unsigned int, but rather long. I agree that this is all pretty shitty (and it's nothing compared to the promotion rules), and unfortunately, it can be a great source of bugs.
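
One way to see these rules in action, assuming a C11 compiler, is to ask `_Generic` for the type of a constant. The `TYPE_NAME` macro and `constant_types_demo` function are invented for this sketch, and the last two checks only apply where int is 32 bits wide.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* C11 _Generic selects a string based on the type of the expression. */
#define TYPE_NAME(x) _Generic((x),            \
    int:                "int",                \
    unsigned int:       "unsigned int",       \
    long:               "long",               \
    unsigned long:      "unsigned long",      \
    long long:          "long long",          \
    unsigned long long: "unsigned long long", \
    default:            "other")

void constant_types_demo(void)
{
    assert(strcmp(TYPE_NAME(50),    "int") == 0);
    assert(strcmp(TYPE_NAME(50L),   "long") == 0);
    assert(strcmp(TYPE_NAME(50ULL), "unsigned long long") == 0);

#if INT_MAX == 2147483647
    /* Hex constant too big for int: unsigned int is tried next. */
    assert(strcmp(TYPE_NAME(0x80000000), "unsigned int") == 0);
    /* Decimal constant of the same value stays signed: long or long long. */
    assert(TYPE_NAME(2147483648)[0] != 'u');
#endif
}
```

On a target with 16-bit int the same split would show up at 32768 versus 0x8000.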
Brendan wrote:For something like "(unsigned long long int)50", C still assumes that the number is an int, and then generates code that converts the int into an unsigned long long (which would hopefully be optimised out). Conceptually, it's different to "50ULL" (but in practice, with reasonable optimisation, it usually ends up being the same).
While it does sound probable that there exist compilers that work this way, it's not a requirement of the standard. It's legal for a compiler to generate code or not for either of the two scenarios. I wouldn't be too worried about the cast for performance reasons; however, the suffixes make the code more readable.
Brendan wrote:There are corner cases though - for example, "(long long int)0xFFFFFFFF" is extremely different to "0xFFFFFFFFLL"; because the former would be sign extended to 0xFFFFFFFFFFFFFFFF while the latter would be 0x00000000FFFFFFFF.
No, if this were true, the following would fail:

Code: Select all

unsigned int a = UINT_MAX;	// UINT_MAX comes from <limits.h>
long long b = a;		// value-preserving: b == UINT_MAX, no sign extension

// or

float c = 5;

// or

double d = c;
It's important to remember that C is representation-agnostic, as far as defined behavior goes. The only question that should be on your mind is whether the value you are trying to assign fits the place you are trying to put it in.

Post by Combuster »

No, if this were true, the following would fail
How? That code is completely agnostic of what type the constants actually have. In both Brendan's everything-is-a-signed-int and your everything-is-the-smallest-lossless-storage, the net result is identical.

Post by Love4Boobies »

Combuster wrote:
No, if this were true, the following would fail
How? That code is completely agnostic of what type the constants actually have. In both Brendan's everything-is-a-signed-int and your everything-is-the-smallest-lossless-storage, the net result is identical.
I was illustrating a few extra examples of implicit conversion in order to show Brendan that C doesn't screw up like he claimed. The same rules apply to constants as to variables, which is why this is related.

Post by blobmiester »

The following compiles and runs without any assertion failing on my computer (Linux, x86-64, GCC):

Code: Select all

#include <assert.h>
 
int main()
{
        assert(0xFFFFFFFFLL == 0x00000000FFFFFFFFLL);
        assert((long long int)0xFFFFFFFF == 0x00000000FFFFFFFFLL);
 
        assert(-1LL == 0xFFFFFFFFFFFFFFFFLL);
        assert((long long int)-1 == 0xFFFFFFFFFFFFFFFFLL);
 
        return 0;
}