I'm using my own bignumber library to convert (really big) ints to floats. I'm still testing it, but I've noticed that for ints from 16777216 (2^24) upwards, my result disagrees with the compiler's for one number in every four, no matter how I try to compensate. After printing every number my function generates next to the one the compiler/CPU generates, I noticed that the computer's results are always rounded to a neighbouring even number, unless the number itself can be represented exactly. In other words:
16777216 -> 16777216 / 16777216
16777217 -> 16777216 / 16777218
16777218 -> 16777218 / 16777218
16777219 -> 16777220 / 16777220
16777220 -> 16777220 / 16777220
16777221 -> 16777220 / 16777222
16777222 -> 16777222 / 16777222
16777223 -> 16777224 / 16777224
16777224 -> 16777224 / 16777224
Is this a known quirk of IEEE 754, or am I doing something wrong? On each line, the second number is the compiler's result and the third is mine. The CPU used was a K6-2 at 366.
[edit]
Found out where I disagree with my processor. When the part that gets dropped is a single one followed only by zeroes, which in my opinion is the border case where rounding up should start, the processor rounds the number down. I.e., IMO it handles the border case wrong. Am I wrong, or is the processor?
[/edit]
float conversion not identical to GCC version
Re:float conversion not identical to GCC version
Check out <float.h>'s FLT_ROUNDS, as well as <fenv.h>, especially the functions fegetround() and fesetround(). A compiler is basically allowed to define border-case rounding any which way it likes, unless you set it explicitly.
Every good solution is obvious once you've found it.
Re:float conversion not identical to GCC version
Solar wrote: Check out <float.h>'s FLT_ROUNDS, as well as <fenv.h>, especially the functions fegetround() and fesetround(). A compiler is basically allowed to define border-case rounding any which way it likes, unless you set it explicitly.
Hate that... it rounds it almost logically...
Also, I now have working versions of this code, one in assembly (considerably faster than the C++ one) and one in C++. Both work, AFAIK, for long double, double and float. The C++ version is nearly C, except that the functions live in a class.
I'm going to release these into the public domain (PD). Right now they have a skew function that adjusts for what the processor does differently from my function, so that I can check the results with == on my computer.
Thanks for the explanation.
They'll be PD when the entire huge-num library is complete, of course. Till then,
Candy