When a number is long in my Excel document, Excel formats the cell value as scientific notation (e.g. 1.23457E+11) while the true number still exists in the formula bar at the top of the document (e.g. 123456789012).
I want to convert this number to a string for my own purposes, but when I do, the scientific notation is captured rather than the true number. How can I ensure that it is the true number that gets converted to a string?
Python will ignore the formatting that Excel applies to anything other than dates and times, so you should just be able to convert the number to a string. You will, however, be limited by Excel's precision. The OOXML file format is not suitable for some tasks, notably those involving historical dates or high-precision times.
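For instance, a minimal sketch assuming the workbook is read with openpyxl (the filename book.xlsx and cell A1 are placeholders):

from openpyxl import load_workbook

wb = load_workbook("book.xlsx")   # hypothetical file
ws = wb.active

value = ws["A1"].value   # the stored number (an int or float), not the
text = str(value)        # scientific-notation display string Excel shows
print(text)              # e.g. '123456789012'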
I want to convert some floats to Decimal, retaining 5 digits after the decimal place regardless of how many digits come before it. Is using string formatting the most efficient way to do this?
I see in the docs:
The significance of a new Decimal is determined solely by the number of digits input. Context precision and rounding only come into play during arithmetic operations.
So that means I need to add 0 to force it to use the specified prec, but prec counts total digits, not digits after the decimal point, so it doesn't actually help.
The best thing I can come up with is
from decimal import Decimal

a = [1.132434, 22.2334, 99.33999434]
[Decimal("%.5f" % round(x, 5)) for x in a]
to get [Decimal('1.13243'), Decimal('22.23340'), Decimal('99.33999')]
Is there a better way? It feels like turning floats into strings just to convert them back to a number format isn't very good, although I can't articulate why.
Do all the formatting on the way out from your code, inside the print and write statements. There is no reason I can think of to lose precision (and convert the numbers to some fixed format) while doing numeric calculations inside the code.
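A minimal sketch of that approach: keep the values as ordinary floats for the arithmetic and apply the 5-digit format only when printing or writing:

a = [1.132434, 22.2334, 99.33999434]

total = sum(a)                 # full float precision during calculation

for x in a:
    print(f"{x:.5f}")          # 1.13243, 22.23340, 99.33999
print(f"total = {total:.5f}")  # total = 122.70583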
Goal
I want to convert all floats to two decimal places, regardless of how many decimal places the float has, without repeating the conversion code.
For example, I want to convert
50 to 50.00
50.5 to 50.50
without repeating the conversion code again and again. What I mean is explained in the following section, Research.
Not what this question is about
This question is NOT about:
Only limiting floats to two decimal places - if there are fewer than two decimal places, I want it padded to two decimal places with zeros in the unused positions.
Flooring the decimal or ceiling it.
Rounding the decimal off.
This question is not a duplicate of this question.
That question only answers the first part of my question - converting floats to two decimal places regardless of how many decimal places the float has - not the second part - doing so without repeating the conversion code.
Nor this question.
That is just how to add units before the decimal place. My question is: how do I convert all floats to two decimal places, regardless of how many decimal places the float has, without repeating the conversion code?
Research
I found two ways I can achieve the conversion. One is using the decimal module:
from decimal import *
TWOPLACES = Decimal(10) ** -2
print(Decimal('9.9').quantize(TWOPLACES))
Another, without using any other modules:
print(f"{9.9:.2f}")
However, that does not fully answer my question. Notice how the conversion code keeps having to be repeated? I keep having to write it again and again. Sadly, my whole program is already almost complete, and it would be quite a waste of time to add this code here and there just so the format comes out correct. Is there any way to convert all floats to two decimal places, regardless of how many decimal places the float has, without repeating the conversion code?
Clarification
What I mean by convert is, as Dmytro Chasovskyi said, that I want every place with floats in my program to start operating like decimals, without extra changes. For example, if I had the operation 1.2345 + 2.7 + 3 + 56.1223183, it should behave as 1.23 + 2.70 + 3.00 + 56.12.
Also, float is a number, not a function.
The bad news is: there is no "float" with "two decimal places".
Floating point numbers are represented internally with a fixed number of digits in base 2 (see https://floating-point-gui.de/basic/).
And these are both efficient and accurate enough for almost all calculations we perform with any modern program.
What we normally want is for the human-readable text representation of a number, in all outputs of a program, to show only two digits. And that is controlled wherever your program writes the value to a text file, prints it to the screen, or renders it into an HTML template (which is "writing it to a text file" again).
So it happens that the same syntax that converts a number to text, embedded in another string, also lets you control the exact output of the number. You gave print(f"{9.9:.2f}") as an example. The only thing that looks impractical there is that you hardcoded the number along with its conversion. Typically, the number will be in a variable.
Then, all you have to do is write, wherever you output the number:
print(f"The value is: {myvar:.02f}")
instead of
print(f"The value is: {myvar}")
Or pass it to whatever function you are calling that needs the rendered version of the number, instead of print. Notice that the use of the word "rendered" here is deliberate: while your program is running, the number is stored in an efficient way in memory, directly usable by the CPU, that is not human readable. At any point you want to "see" the number, you have to convert it into text. It is just that some calls do that implicitly, like print(myvar). So just resort to converting it explicitly in those places: print(f"{myvar:.02f}").
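If retyping the format spec at every output site bothers you, one pattern (just a sketch; the helper name fmt2 is made up here) is to centralize the conversion in a tiny helper and call that wherever you output a number:

def fmt2(value):
    # Render any number with exactly two decimal places.
    return f"{value:.2f}"

myvar = 56.1223183
print(f"The value is: {fmt2(myvar)}")   # The value is: 56.12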
really having 2 decimal places in memory
If you use decimal.Decimal, then yes, there are ways to keep the internal representation of the number with 2 decimal digits, but then, instead of just converting the number on output, you must convert it into a "2 decimal place" value on all inputs as well.
That means that whenever you ingest a number into your program, be it typed by the user, read from a binary file or database, or received over the wire from a sensor, you have to apply a transform similar to the one used on output, as detailed above. More precisely: you convert your float to a properly formatted string, and then convert that to a decimal.Decimal.
And this will prevent your program from accumulating errors due to base conversion, but you will still need to force the format to 2 decimal places on every output, just like above.
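A minimal sketch of that input-side conversion using quantize (the helper name to_2dp and the constant TWO_PLACES are made up here):

from decimal import Decimal

TWO_PLACES = Decimal("0.01")

def to_2dp(raw):
    # Route the value through str() so Decimal sees the short repr,
    # then fix it to exactly two decimal places.
    return Decimal(str(raw)).quantize(TWO_PLACES)

total = to_2dp(1.2345) + to_2dp(2.7) + to_2dp(3) + to_2dp(56.1223183)
print(total)   # 63.05, i.e. 1.23 + 2.70 + 3.00 + 56.12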
Use this function.
def cvt_decimal(input):
    number = float(input)
    return ("%.2f" % number)
print(cvt_decimal(50))
print(cvt_decimal(50.5))
Output is:
50.00
50.50
You can modify the decimal precision, even when doing operations between two Decimal values. Note, though, that prec sets the number of significant digits, not the number of decimal places:
import decimal
from decimal import Decimal
decimal.getcontext().prec = 2
a = Decimal('0.12345')
b = Decimal('0.12345')
print(a + b)  # prints 0.25 - two significant digits, not two decimal places
Decimal calculations are precise, but they take more time than float arithmetic; keep that in mind.
I'm trying to use read_sql_query() to read a query from a MySQL database. One of the fields in the database has type double(24, 8). I want to use the dtype= parameter to have full control of the datatypes and read it as decimal, but it seems pandas can't recognize the decimal type, so I had to read it as Float64.
In the database, the values for this field look like this:
Value
100.96000000
77.17000000
1.00000000
0.12340000
Then I'm trying to read it from Python code:
import pandas as pd
from decimal import *

dtypes = {
    'id': 'Int64',
    'date': 'datetime64',
    'value': 'Float64'
}
df = pd.read_sql_query(sql_query, mysql_engine, dtype=dtypes)
but after reading the data from the code above, it looks like this:
Value
100.96
77.17
1.0
0.1234
How can I read this column to decimal and keep all the digits? Thanks.
What "the data looks like in the database" is tricky. This is because the act of printing it out feeds the bits through a formatting algorithm. In this case it removes trailing zeros. To see what is "in the database", one needs to get a hex dump of the file and then decipher it; this is non-trivial.
I believe that DECIMAL numbers hold all the digits specified, packed 2 digits per byte. No, I don't know how they are packed (0..99 versus 2 hex digits; what to do if the number of digits is odd; where is the sign?)
I believe that FLOAT and DOUBLE exactly conform to the IEEE-754 encoding format. No, I don't know how the bytes are stored (big-endian vs little-endian). I suspect Python's Float64 is an IEEE DOUBLE.
For DECIMAL(10,6), I would expect to see "1.234" to be stored as +, 0001, and 234000, but never displayed with leading zeros and optionally displayed with trailing zeros -- depending on the output formatting package.
For DOUBLE, I would expect to find hex 3ff3be76c8b43958 after adjusting for endianism, and I would not be surprised to see the output be 1.23399999999999999e+0. (Yes, I actually got that, given a suitable formatting in PHP, which I am using.) I would hope to see 1.234 since that is presumably the intent of the number.
Do not use DOUBLE(m,n). The (m,n) leads to extra rounding, and it is deprecated syntax. FLOAT and DOUBLE are not intended for an exact number of decimal places; use DECIMAL for that.
For FLOAT: 1.234 becomes hex 3f9df3b6 and displays something like 1.2339999675751 assuming the output method works in DOUBLE and is asked to show lots of decimal places.
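Those encodings are easy to check from Python with the standard struct module; a quick sketch that reproduces the hex bytes quoted above:

import struct

# Pack 1.234 as big-endian IEEE-754 DOUBLE and FLOAT and show the raw bytes.
print(struct.pack(">d", 1.234).hex())   # 3ff3be76c8b43958
print(struct.pack(">f", 1.234).hex())   # 3f9df3b6

# Decode the DOUBLE again and print extra places to expose the base-2 error.
x = struct.unpack(">d", bytes.fromhex("3ff3be76c8b43958"))[0]
print(f"{x:.20f}")                      # 1.23399999999999998579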
Bottom line: The output method you are using is causing the problem.
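If you do want Decimal objects in the DataFrame, one workaround (a sketch only: pandas has no native Decimal dtype, so the column ends up as dtype object; sql_query and mysql_engine are the variables from the question) is to read the column as float and then quantize each value back to the 8 places declared by double(24, 8):

from decimal import Decimal

import pandas as pd

EIGHT_PLACES = Decimal("0.00000001")    # matches the (24, 8) declaration

df = pd.read_sql_query(sql_query, mysql_engine)
df["value"] = df["value"].map(lambda v: Decimal(str(v)).quantize(EIGHT_PLACES))
print(df["value"].iloc[0])              # 100.96000000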
I have used the fix presented in YAML loads 5e-6 as string and not a number to correctly load numbers represented in scientific notation in a yaml file into Python numbers.
I also have complex numbers in my yaml file. These numbers may have varying amounts of whitespace (e.g., 2+2j, 2 + 2j, etc.). The complex numbers are currently being read in as strings (in the same way that the numbers in scientific notation were read in as strings prior to the fix referenced above). I would like to know how to modify the add_implicit_resolver argument in that fix to correctly read in complex numbers. Ideally, I'd like to continue to use pyyaml.
More specifically, if I have an entry in my yaml file such as:
offset: 2 + 1j
I would like this to be recognized as the complex number (of class 'complex'):
2+1j
Currently, the value in the python dictionary corresponding to the key 'offset' is a string:
'2 + 1j'
which I have to manually convert into a complex number via:
complex('2 + 1j'.replace(' ', ''))
I'm looking to automate this process by modifying the argument to the add_implicit_resolver using the same strategy as in the link above for dealing with numbers in scientific notation.
As for a spec, yes, in general, the real and imaginary parts of the complex number may be in scientific notation: (e.g., ' 2e-3 + 1.3e-4j'). I am fine with restricting the format to have a trailing j for the imaginary part. No tabs, no linefeeds, just individual spaces. Real or imaginary part can be missing and can be expressed as integers or floats.
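Here is a sketch of how that can look with pyyaml, mirroring the implicit-resolver trick from the linked fix. The regular expression is my assumption about the accepted format (both parts present, optional scientific notation, at most one space around the sign, trailing j); it would need extending to allow a missing real or imaginary part:

import re
import yaml

# "<real> [+-] <imag>j", each part an int/float with an optional exponent.
NUMBER = r"[-+]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][-+]?\d+)?"
COMPLEX_RE = re.compile(rf"^{NUMBER} ?[-+] ?{NUMBER}[jJ]$")

def construct_complex(loader, node):
    value = loader.construct_scalar(node)
    return complex(value.replace(" ", ""))   # complex() rejects spaces

yaml.add_implicit_resolver("!complex", COMPLEX_RE, Loader=yaml.SafeLoader)
yaml.add_constructor("!complex", construct_complex, Loader=yaml.SafeLoader)

data = yaml.safe_load("offset: 2 + 1j")
print(data["offset"], type(data["offset"]))  # (2+1j) <class 'complex'>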
I need to convert a string to a float, but the input can come in different formats, such as '1234,5' or '1234.5' or '1 234,5' or '1,234.5' or whatever. And I cannot change the locale's decimal point or thousands separator, because I may not know in advance what data I will get.
Is there a way or method or library to parse and convert to float this kind of locale-specific values without knowing which locale is used?
P.S. Does any solution exist for the same problem with dates?
TIA.
You can make some assumptions about which character is the thousands separator and which is the decimal point. However, there is a case where you cannot know for sure what to do:
Look for the last character that is . or ,. If it occurs more than once, the number does not have a decimal point and that character is the thousands separator
If the string contains exactly one of each, the last one is the decimal point
If the string contains only one point/comma, you are pretty much out of luck: 123.456 or 123,456 might be the number 123456 or 123.456. However, with a number like 123.45 - i.e. the number of digits after the potential thousands separator not being a multiple of three - you can assume that it's a decimal point.
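A sketch of that heuristic in Python (the function name parse_number is made up; the ambiguous lone-separator case is resolved as a thousands separator when exactly three digits follow, which may be wrong for your data):

def parse_number(s):
    # Heuristically parse a numeric string with unknown locale separators.
    s = s.replace(" ", "")                 # spaces only ever group thousands
    last = max(s.rfind("."), s.rfind(","))
    if last == -1:
        return float(s)                    # no separator at all
    sep = s[last]
    if s.count(sep) > 1:                   # repeated: thousands separator
        return float(s.replace(".", "").replace(",", ""))
    other = "," if sep == "." else "."
    if other in s:                         # both present: the last one wins
        return float(s.replace(other, "").replace(sep, "."))
    if len(s) - last - 1 == 3:             # lone separator, 3 digits after:
        return float(s.replace(sep, ""))   # assume thousands (ambiguous!)
    return float(s.replace(sep, "."))      # otherwise it is a decimal point

for text in ["1234,5", "1234.5", "1 234,5", "1,234.5", "1.234.567", "123.45"]:
    print(text, "->", parse_number(text))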