Simple Float to String Idea

SpyderTL · Post by **SpyderTL** » Wed Aug 20, 2014 7:53 am

I've started adding support for FPU instructions to my system, and I'm trying to put together a quick and dirty float (single) to string method that I can use to verify my numbers are (somewhat) correct.

Implementing a complex, accurate method based on the Dragon4 method is beyond the scope for what I need at the moment, but once I get it working, I'll probably improve my code over time.

So, here is what I've got after spending about a week researching and coding. I'm essentially writing this code in Assembly, but I'm actually using my own XML based language, so I'll just describe the steps that I'm taking:

1. Get an address to a float structure (1-bit sign flag, 8-bit exponent offset by 127, 23-bit mantissa, with implied 24th bit of 1)
2. Mask high bit to get sign.
3. Shift value right by 23 bits, and subtract result by 127 to get exponent.
4. Add implied 24th bit to mantissa.
5. Shift mantissa right by (23 - exponent) bits to get whole number value.
6. Mask mantissa with ((1 << (23 - exponent)) - 1) to get fractional value (bits to the right of the decimal)
7. Divide 1,000,000,000 by (1 << (23 - exponent) to get resolution of fractional value.
8. Multiply resolution by fractional value to get decimal (base 10) fractional value (digits to the right of the decimal)
9. Convert whole number and fractional digits to strings and combine strings. (fractional digits are padded to 9 digits on the left with zeros)

I'm using a constant value of 1,000,000,000 as my "maximum" fractional decimal value, since it's the largest base 10 number that will fit in a 32-bit register.

Using the method above, I'm getting accurate numbers to about 3 decimal places. I'm using the CPU to calculate the resolution value and fractional values above. I may switch to using the FPU to do the division and multiplication, which should give me accurate results to about 6 decimal places. I think the best accuracy you can get with a 32-bit single value is around 7 decimal places, so I would be happy for a while if I can get 6 decimal places in less than 10 instructions.

Do you guys see any major problems with this approach, or do you have any suggestions to improve this?

Floating point structures and calculations are a lot more complicated than I was expecting. The Wiki article doesn't have much detail at all, so I may update it once I get my head wrapped around all of the different formats and issues.

Brendan · Post by **Brendan** » Wed Aug 20, 2014 8:53 am

Hi,

SpyderTL wrote:Do you guys see any major problems with this approach, or do you have any suggestions to improve this?

Every valid floating point number (excluding NaN, infinity, etc) can be displayed with 100% accuracy relatively simply, by adjusting the exponent to ensure that the significand can be displayed as an integer.

For example, assume you've got the "raw hex dword" 0x414587D0 (which is the float representing the value 12.3456573486328125). You can:

extract the sign: 0x414587DD >> 31 = 0 = positive
extract the significand bits: 0x414587D0 & 0x007FFFFF = 0x004587D0
extract the exponent: (0x414587D0 >> 23) & 0xFF = 0x82
insert the implied bit into the significand: 0x004587D0 | 0x00800000 = 0x00C587D0 = 12945360
subtract the bias from the exponent: 0x82 - 0x7F = 3
subtract the significand's width (23 bits) from the exponent: 3 - 23 = -20

At this point you can display the significand and exponent as plain old integers (e.g. "12945360 * 2**-20"). This is 100% accurate.

However, you can simplify it slightly using a loop like this:

Code: Select all

    while( (significand & 1) == 0 ) {
        significand = significand >> 1;
        exponent = exponent + 1;
    }

For the example number I chose, the significand becomes 809085 and exponent becomes -16, so you'd display "809085 * 2**-16". This is still 100% accurate.

Of course you will need to detect things like NaN, infinities, zero and denormal numbers first. For displaying denormals (not including either zero), you can do almost the same thing as normal numbers except you don't insert the implied bit into the significand.

Cheers,

Brendan

SpyderTL · Post by **SpyderTL** » Wed Aug 20, 2014 9:54 am

Brendan wrote:Hi,
For example, assume you've got the "raw hex dword" 0x414587D0 (which is the float representing the value 12.3456573486328125). You can:
extract the sign: 0x414587DD >> 31 = 0 = positive

extract the significand bits: 0x414587D0 & 0x007FFFFF = 0x004587D0

extract the exponent: (0x414587D0 >> 23) & 0xFF = 0x82

insert the implied bit into the significand: 0x004587D0 | 0x00800000 = 0x00C587D0 = 12945360

subtract the bias from the exponent: 0x82 - 0x7F = 3

subtract the significand's width (23 bits) from the exponent: 3 - 23 = -20

This is exactly what I'm doing. (Well, exactly-ish. I'm currently ignoring the sign bit altogether, but that's a simple matter to tack on a "-" to the front of the string...)

Brendan wrote:At this point you can display the significand and exponent as plain old integers (e.g. "12945360 * 2**-20"). This is 100% accurate.

Technicallly, you are right, and this format would be "accurate". But that's kind of like saying that displaying the original hex value (414587D0) on the screen is also 100% accurate.

While it is, technically, accurate, it is also fairly useless for trying to determine whether my newly written float classes, structures and functions are working properly.

If I write a console app that takes two floats (1.0 and 2.0) and multiplies them together, I'd want the output to be "3.0", or something real close to that. Displaying "15359034 * 2^1" on the screen really doesn't help much. I'd rather the result be "inaccurate" but "human readable". And as I mentioned, I just need something that will get me in the "vicinity" of the right answer for now.

Thanks, though.

SpyderTL · Post by **SpyderTL** » Wed Aug 20, 2014 1:31 pm

I could also feed the non-whole number part back to the FPU, and have it multiply the result by 10.0f, in a loop. Then I could strip off the whole number part each time, and add it to the string.

Or, I guess it would be more accurate if I took the whole value, stripped off the whole number, wrote that to a string, appended a ".", and then multiplied the remainder by 10.0f in a loop until the remainder was Zero (0.0), or until we hit a certain number of digits.

Without trying this, I imagine this would be slower than the approach above, but would be more accurate.

So, here are my options:

a) Calculate digits after the decimal using the CPU to divide 1,000,000,000 by the total bits to the right of the decimal place set to 1, and then multiply that by the actual value to the right of the decimal place.
b) Calculate digits after the decimal using the FPU to divide 1,000,000,000.0 by the total bits to the right of the decimal place set to 1, and then multiply that by the actual value to the right of the decimal place.
c) Use the FPU in a loop to round down (truncate) the entire float to an integer, and multiply by 10.0f, and repeat until value is 0.0f or until specific number of digits.

I think these are in order of fastest (and least accurate) to slowest (and most accurate). Would you guys agree?

Are there any other simple options to convert a float to a string?

Brendan · Post by **Brendan** » Wed Aug 20, 2014 3:24 pm

Hi,

SpyderTL wrote:While it is, technically, accurate, it is also fairly useless for trying to determine whether my newly written float classes, structures and functions are working properly.

If I write a console app that takes two floats (1.0 and 2.0) and multiplies them together, I'd want the output to be "3.0", or something real close to that. Displaying "15359034 * 2^1" on the screen really doesn't help much. I'd rather the result be "inaccurate" but "human readable". And as I mentioned, I just need something that will get me in the "vicinity" of the right answer for now.

For testing newly written code, you want to test "interesting" cases near the limits (e.g. both huge and tiny numbers for floats), not trivial cases (like trying to figure out if "1.0 * 2.0 = 3.0").

For floats, if displayed using "kindergarten format numbers" all of these values would be 35 digits or larger. I doubt that's more human readable than something simpler, faster, less error prone and more accurate.

Cheers,

Brendan

SpyderTL · Post by **SpyderTL** » Wed Aug 20, 2014 5:39 pm

I did some testing with .NET today to see if I could force it to make some Float calculation/ToString errors. Amazingly it handles Floats flawlessly (as far as I could tell) until you got out to 7 decimal places. Then it started leaking some minor errors beyond 6 or 7 decimal places. I was using System.BitConverter to convert 1.0000001f to a 4 byte array, and then back to a Float. .NET does an amazing job of handling this as well as it does.

Just out of curiosity, does anyone know how they are pulling this off?

On a side note, I went back and modified my code to do option C above, and my numbers are accurate to 6 decimal places now. I'm just dropping everything after 6 decimal places at the moment, but I'll throw in some rounding for the last digit pretty soon. Then I'll create new versions of the code for Double and the native 80-bit FPU format as well.

Thanks for the help guys.

alexfru · Post by **alexfru** » Thu Aug 21, 2014 12:29 am

These questions/answers might be useful for developing and testing:
C dynamically printf double, no loss of precision and no trailing zeroes
Is there a tool to know whether a value has an exact binary representation as a floating point variable?
Exploring Binary
Floating Point | Random ASCII

Gigasoft · Post by **Gigasoft** » Fri Aug 22, 2014 2:49 pm

SpyderTL wrote:Amazingly it handles Floats flawlessly

Floating point numbers have been in use since 1938, when the first working computer was built. Mankind had better get it right by now.

FallenAvatar · Post by **FallenAvatar** » Sat Aug 23, 2014 4:13 am

Gigasoft wrote:
SpyderTL wrote:Amazingly it handles Floats flawlessly
Floating point numbers have been in use since 1938, when the first working computer was built. Mankind had better get it right by now.

I think he was more "impressed" with the "apparently correct" handling of decimal places and consistency on a data type that is not really known to handle either incredibly well.

- Monk

SpyderTL · Post by **SpyderTL** » Sat Aug 23, 2014 12:16 pm

Yeah. I was impressed that dotnet could take an array of four bytes and convert that to, say, 1.000001, even though this value technically can not be represented with four bytes. I assume that their Float-to-String logic has some very stable rounding logic.

I would love to see this logic, but it apparently handled by the CLR runtime, and not in .NET code. I checked the IL disassembly, and those ToString methods are marked as extern.

- Joshua

Owen · Post by **Owen** » Sat Aug 23, 2014 12:59 pm

SpyderTL wrote:Yeah. I was impressed that dotnet could take an array of four bytes and convert that to, say, 1.000001, even though this value technically can not be represented with four bytes. I assume that their Float-to-String logic has some very stable rounding logic.

I would love to see this logic, but it apparently handled by the CLR runtime, and not in .NET code. I checked the IL disassembly, and those ToString methods are marked as extern.

- Joshua

Its' generally considered a quality of implementation matter that Float <-> String conversions should Just Work (TM).

Go check the code of your favorite C runtime library. The CLR probably even delegates its' implementation of the rounding to the C runtime it was compiled against (or failing that, Windows)

OSDev.org

Simple Float to String Idea

Simple Float to String Idea

Re: Simple Float to String Idea

Re: Simple Float to String Idea

Re: Simple Float to String Idea

Re: Simple Float to String Idea

Re: Simple Float to String Idea

Re: Simple Float to String Idea

Re: Simple Float to String Idea

Re: Simple Float to String Idea

Re: Simple Float to String Idea

Re: Simple Float to String Idea