Simple Float to String Idea
Simple Float to String Idea
I've started adding support for FPU instructions to my system, and I'm trying to put together a quick and dirty float (single) to string method that I can use to verify my numbers are (somewhat) correct.
Implementing a complex, accurate method based on the Dragon4 method is beyond the scope for what I need at the moment, but once I get it working, I'll probably improve my code over time.
So, here is what I've got after spending about a week researching and coding. I'm essentially writing this code in Assembly, but I'm actually using my own XML based language, so I'll just describe the steps that I'm taking:
1. Get an address to a float structure (1-bit sign flag, 8-bit exponent offset by 127, 23-bit mantissa, with implied 24th bit of 1)
2. Mask high bit to get sign.
3. Shift value right by 23 bits, and subtract result by 127 to get exponent.
4. Add implied 24th bit to mantissa.
5. Shift mantissa right by (23 - exponent) bits to get whole number value.
6. Mask mantissa with ((1 << (23 - exponent)) - 1) to get fractional value (bits to the right of the decimal)
7. Divide 1,000,000,000 by (1 << (23 - exponent) to get resolution of fractional value.
8. Multiply resolution by fractional value to get decimal (base 10) fractional value (digits to the right of the decimal)
9. Convert whole number and fractional digits to strings and combine strings. (fractional digits are padded to 9 digits on the left with zeros)
I'm using a constant value of 1,000,000,000 as my "maximum" fractional decimal value, since it's the largest base 10 number that will fit in a 32-bit register.
Using the method above, I'm getting accurate numbers to about 3 decimal places. I'm using the CPU to calculate the resolution value and fractional values above. I may switch to using the FPU to do the division and multiplication, which should give me accurate results to about 6 decimal places. I think the best accuracy you can get with a 32-bit single value is around 7 decimal places, so I would be happy for a while if I can get 6 decimal places in less than 10 instructions.
Do you guys see any major problems with this approach, or do you have any suggestions to improve this?
Floating point structures and calculations are a lot more complicated than I was expecting. The Wiki article doesn't have much detail at all, so I may update it once I get my head wrapped around all of the different formats and issues.
Implementing a complex, accurate method based on the Dragon4 method is beyond the scope for what I need at the moment, but once I get it working, I'll probably improve my code over time.
So, here is what I've got after spending about a week researching and coding. I'm essentially writing this code in Assembly, but I'm actually using my own XML based language, so I'll just describe the steps that I'm taking:
1. Get an address to a float structure (1-bit sign flag, 8-bit exponent offset by 127, 23-bit mantissa, with implied 24th bit of 1)
2. Mask high bit to get sign.
3. Shift value right by 23 bits, and subtract result by 127 to get exponent.
4. Add implied 24th bit to mantissa.
5. Shift mantissa right by (23 - exponent) bits to get whole number value.
6. Mask mantissa with ((1 << (23 - exponent)) - 1) to get fractional value (bits to the right of the decimal)
7. Divide 1,000,000,000 by (1 << (23 - exponent) to get resolution of fractional value.
8. Multiply resolution by fractional value to get decimal (base 10) fractional value (digits to the right of the decimal)
9. Convert whole number and fractional digits to strings and combine strings. (fractional digits are padded to 9 digits on the left with zeros)
I'm using a constant value of 1,000,000,000 as my "maximum" fractional decimal value, since it's the largest base 10 number that will fit in a 32-bit register.
Using the method above, I'm getting accurate numbers to about 3 decimal places. I'm using the CPU to calculate the resolution value and fractional values above. I may switch to using the FPU to do the division and multiplication, which should give me accurate results to about 6 decimal places. I think the best accuracy you can get with a 32-bit single value is around 7 decimal places, so I would be happy for a while if I can get 6 decimal places in less than 10 instructions.
Do you guys see any major problems with this approach, or do you have any suggestions to improve this?
Floating point structures and calculations are a lot more complicated than I was expecting. The Wiki article doesn't have much detail at all, so I may update it once I get my head wrapped around all of the different formats and issues.
Project: OZone
Source: GitHub
Current Task: LIB/OBJ file support
"The more they overthink the plumbing, the easier it is to stop up the drain." - Montgomery Scott
Source: GitHub
Current Task: LIB/OBJ file support
"The more they overthink the plumbing, the easier it is to stop up the drain." - Montgomery Scott
Re: Simple Float to String Idea
Hi,
For example, assume you've got the "raw hex dword" 0x414587D0 (which is the float representing the value 12.3456573486328125). You can:
However, you can simplify it slightly using a loop like this:
For the example number I chose, the significand becomes 809085 and exponent becomes -16, so you'd display "809085 * 2**-16". This is still 100% accurate.
Of course you will need to detect things like NaN, infinities, zero and denormal numbers first. For displaying denormals (not including either zero), you can do almost the same thing as normal numbers except you don't insert the implied bit into the significand.
Cheers,
Brendan
Every valid floating point number (excluding NaN, infinity, etc) can be displayed with 100% accuracy relatively simply, by adjusting the exponent to ensure that the significand can be displayed as an integer.SpyderTL wrote:Do you guys see any major problems with this approach, or do you have any suggestions to improve this?
For example, assume you've got the "raw hex dword" 0x414587D0 (which is the float representing the value 12.3456573486328125). You can:
- extract the sign: 0x414587DD >> 31 = 0 = positive
- extract the significand bits: 0x414587D0 & 0x007FFFFF = 0x004587D0
- extract the exponent: (0x414587D0 >> 23) & 0xFF = 0x82
- insert the implied bit into the significand: 0x004587D0 | 0x00800000 = 0x00C587D0 = 12945360
- subtract the bias from the exponent: 0x82 - 0x7F = 3
- subtract the significand's width (23 bits) from the exponent: 3 - 23 = -20
However, you can simplify it slightly using a loop like this:
Code: Select all
while( (significand & 1) == 0 ) {
significand = significand >> 1;
exponent = exponent + 1;
}
Of course you will need to detect things like NaN, infinities, zero and denormal numbers first. For displaying denormals (not including either zero), you can do almost the same thing as normal numbers except you don't insert the implied bit into the significand.
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Re: Simple Float to String Idea
This is exactly what I'm doing. (Well, exactly-ish. I'm currently ignoring the sign bit altogether, but that's a simple matter to tack on a "-" to the front of the string...)Brendan wrote:Hi,
For example, assume you've got the "raw hex dword" 0x414587D0 (which is the float representing the value 12.3456573486328125). You can:
- extract the sign: 0x414587DD >> 31 = 0 = positive
- extract the significand bits: 0x414587D0 & 0x007FFFFF = 0x004587D0
- extract the exponent: (0x414587D0 >> 23) & 0xFF = 0x82
- insert the implied bit into the significand: 0x004587D0 | 0x00800000 = 0x00C587D0 = 12945360
- subtract the bias from the exponent: 0x82 - 0x7F = 3
- subtract the significand's width (23 bits) from the exponent: 3 - 23 = -20
Technicallly, you are right, and this format would be "accurate". But that's kind of like saying that displaying the original hex value (414587D0) on the screen is also 100% accurate.Brendan wrote:At this point you can display the significand and exponent as plain old integers (e.g. "12945360 * 2**-20"). This is 100% accurate.
While it is, technically, accurate, it is also fairly useless for trying to determine whether my newly written float classes, structures and functions are working properly.
If I write a console app that takes two floats (1.0 and 2.0) and multiplies them together, I'd want the output to be "3.0", or something real close to that. Displaying "15359034 * 2^1" on the screen really doesn't help much. I'd rather the result be "inaccurate" but "human readable". And as I mentioned, I just need something that will get me in the "vicinity" of the right answer for now.
Thanks, though.
Project: OZone
Source: GitHub
Current Task: LIB/OBJ file support
"The more they overthink the plumbing, the easier it is to stop up the drain." - Montgomery Scott
Source: GitHub
Current Task: LIB/OBJ file support
"The more they overthink the plumbing, the easier it is to stop up the drain." - Montgomery Scott
Re: Simple Float to String Idea
I could also feed the non-whole number part back to the FPU, and have it multiply the result by 10.0f, in a loop. Then I could strip off the whole number part each time, and add it to the string.
Or, I guess it would be more accurate if I took the whole value, stripped off the whole number, wrote that to a string, appended a ".", and then multiplied the remainder by 10.0f in a loop until the remainder was Zero (0.0), or until we hit a certain number of digits.
Without trying this, I imagine this would be slower than the approach above, but would be more accurate.
So, here are my options:
a) Calculate digits after the decimal using the CPU to divide 1,000,000,000 by the total bits to the right of the decimal place set to 1, and then multiply that by the actual value to the right of the decimal place.
b) Calculate digits after the decimal using the FPU to divide 1,000,000,000.0 by the total bits to the right of the decimal place set to 1, and then multiply that by the actual value to the right of the decimal place.
c) Use the FPU in a loop to round down (truncate) the entire float to an integer, and multiply by 10.0f, and repeat until value is 0.0f or until specific number of digits.
I think these are in order of fastest (and least accurate) to slowest (and most accurate). Would you guys agree?
Are there any other simple options to convert a float to a string?
Or, I guess it would be more accurate if I took the whole value, stripped off the whole number, wrote that to a string, appended a ".", and then multiplied the remainder by 10.0f in a loop until the remainder was Zero (0.0), or until we hit a certain number of digits.
Without trying this, I imagine this would be slower than the approach above, but would be more accurate.
So, here are my options:
a) Calculate digits after the decimal using the CPU to divide 1,000,000,000 by the total bits to the right of the decimal place set to 1, and then multiply that by the actual value to the right of the decimal place.
b) Calculate digits after the decimal using the FPU to divide 1,000,000,000.0 by the total bits to the right of the decimal place set to 1, and then multiply that by the actual value to the right of the decimal place.
c) Use the FPU in a loop to round down (truncate) the entire float to an integer, and multiply by 10.0f, and repeat until value is 0.0f or until specific number of digits.
I think these are in order of fastest (and least accurate) to slowest (and most accurate). Would you guys agree?
Are there any other simple options to convert a float to a string?
Project: OZone
Source: GitHub
Current Task: LIB/OBJ file support
"The more they overthink the plumbing, the easier it is to stop up the drain." - Montgomery Scott
Source: GitHub
Current Task: LIB/OBJ file support
"The more they overthink the plumbing, the easier it is to stop up the drain." - Montgomery Scott
Re: Simple Float to String Idea
Hi,
For floats, if displayed using "kindergarten format numbers" all of these values would be 35 digits or larger. I doubt that's more human readable than something simpler, faster, less error prone and more accurate.
Cheers,
Brendan
For testing newly written code, you want to test "interesting" cases near the limits (e.g. both huge and tiny numbers for floats), not trivial cases (like trying to figure out if "1.0 * 2.0 = 3.0").SpyderTL wrote:While it is, technically, accurate, it is also fairly useless for trying to determine whether my newly written float classes, structures and functions are working properly.
If I write a console app that takes two floats (1.0 and 2.0) and multiplies them together, I'd want the output to be "3.0", or something real close to that. Displaying "15359034 * 2^1" on the screen really doesn't help much. I'd rather the result be "inaccurate" but "human readable". And as I mentioned, I just need something that will get me in the "vicinity" of the right answer for now.
For floats, if displayed using "kindergarten format numbers" all of these values would be 35 digits or larger. I doubt that's more human readable than something simpler, faster, less error prone and more accurate.
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Re: Simple Float to String Idea
I did some testing with .NET today to see if I could force it to make some Float calculation/ToString errors. Amazingly it handles Floats flawlessly (as far as I could tell) until you got out to 7 decimal places. Then it started leaking some minor errors beyond 6 or 7 decimal places. I was using System.BitConverter to convert 1.0000001f to a 4 byte array, and then back to a Float. .NET does an amazing job of handling this as well as it does.
Just out of curiosity, does anyone know how they are pulling this off?
On a side note, I went back and modified my code to do option C above, and my numbers are accurate to 6 decimal places now. I'm just dropping everything after 6 decimal places at the moment, but I'll throw in some rounding for the last digit pretty soon. Then I'll create new versions of the code for Double and the native 80-bit FPU format as well.
Thanks for the help guys.
Just out of curiosity, does anyone know how they are pulling this off?
On a side note, I went back and modified my code to do option C above, and my numbers are accurate to 6 decimal places now. I'm just dropping everything after 6 decimal places at the moment, but I'll throw in some rounding for the last digit pretty soon. Then I'll create new versions of the code for Double and the native 80-bit FPU format as well.
Thanks for the help guys.
Project: OZone
Source: GitHub
Current Task: LIB/OBJ file support
"The more they overthink the plumbing, the easier it is to stop up the drain." - Montgomery Scott
Source: GitHub
Current Task: LIB/OBJ file support
"The more they overthink the plumbing, the easier it is to stop up the drain." - Montgomery Scott
Re: Simple Float to String Idea
These questions/answers might be useful for developing and testing:
C dynamically printf double, no loss of precision and no trailing zeroes
Is there a tool to know whether a value has an exact binary representation as a floating point variable?
Exploring Binary
Floating Point | Random ASCII
C dynamically printf double, no loss of precision and no trailing zeroes
Is there a tool to know whether a value has an exact binary representation as a floating point variable?
Exploring Binary
Floating Point | Random ASCII
Re: Simple Float to String Idea
Floating point numbers have been in use since 1938, when the first working computer was built. Mankind had better get it right by now.SpyderTL wrote:Amazingly it handles Floats flawlessly
-
- Member
- Posts: 283
- Joined: Mon Jan 03, 2011 6:58 pm
Re: Simple Float to String Idea
I think he was more "impressed" with the "apparently correct" handling of decimal places and consistency on a data type that is not really known to handle either incredibly well.Gigasoft wrote:Floating point numbers have been in use since 1938, when the first working computer was built. Mankind had better get it right by now.SpyderTL wrote:Amazingly it handles Floats flawlessly
- Monk
Re: Simple Float to String Idea
Yeah. I was impressed that dotnet could take an array of four bytes and convert that to, say, 1.000001, even though this value technically can not be represented with four bytes. I assume that their Float-to-String logic has some very stable rounding logic.
I would love to see this logic, but it apparently handled by the CLR runtime, and not in .NET code. I checked the IL disassembly, and those ToString methods are marked as extern.
- Joshua
I would love to see this logic, but it apparently handled by the CLR runtime, and not in .NET code. I checked the IL disassembly, and those ToString methods are marked as extern.
- Joshua
Project: OZone
Source: GitHub
Current Task: LIB/OBJ file support
"The more they overthink the plumbing, the easier it is to stop up the drain." - Montgomery Scott
Source: GitHub
Current Task: LIB/OBJ file support
"The more they overthink the plumbing, the easier it is to stop up the drain." - Montgomery Scott
- Owen
- Member
- Posts: 1700
- Joined: Fri Jun 13, 2008 3:21 pm
- Location: Cambridge, United Kingdom
- Contact:
Re: Simple Float to String Idea
Its' generally considered a quality of implementation matter that Float <-> String conversions should Just Work (TM).SpyderTL wrote:Yeah. I was impressed that dotnet could take an array of four bytes and convert that to, say, 1.000001, even though this value technically can not be represented with four bytes. I assume that their Float-to-String logic has some very stable rounding logic.
I would love to see this logic, but it apparently handled by the CLR runtime, and not in .NET code. I checked the IL disassembly, and those ToString methods are marked as extern.
- Joshua
Go check the code of your favorite C runtime library. The CLR probably even delegates its' implementation of the rounding to the C runtime it was compiled against (or failing that, Windows)