Interpreting Excel Currency Values

I am using Python to read a currency value from Excel. The value returned from the range.Value method is a tuple that I don't know how to parse.
For example, the cell appears as $548,982, but in Python the value is returned as (1, 1194857614).
How can I get the numerical amount from Excel, or how can I convert this tuple into the numerical value?
Thanks!

Try this:
import struct
try:
    import decimal
except ImportError:
    divisor = 10000.0
else:
    divisor = decimal.Decimal(10000)
def xl_money(i1, i2):
    byte8 = struct.unpack(">q", struct.pack(">ii", i1, i2))[0]
    return byte8 / divisor
>>> xl_money(1, 1194857614)
Decimal("548982.491")
Money in Microsoft COM is an 8-byte integer; it's fixed point, with 4 decimal places (i.e. 1 is represented by 10000). My function takes the tuple of two 4-byte integers, packs them into a single 8-byte integer using struct to avoid any issues of sign, and then divides by the constant 10000. The function uses decimal.Decimal if available; otherwise it uses float.
UPDATE (based on comment): So far, it's only COM Currency values being returned as a two-integer tuple, so you might want to check for that, but there are no guarantees that this will always be successful. However, depending on the library you use and its version, it's quite possible that later on, after some upgrade, you will be receiving decimal.Decimals and not two-integer tuples anymore.
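One way to make that check explicit is sketched below. The helper name to_decimal is hypothetical, and it combines the two words arithmetically instead of with struct, so the low word may arrive either signed or unsigned. It assumes the tuple is ordered (high word, low word), as in the example above, and it cannot tell a Currency tuple apart from any other pair of integers.

```python
from decimal import Decimal

# COM Currency is fixed point with 4 decimal places.
CURRENCY_SCALE = Decimal(10000)


def to_decimal(value):
    """Return a Decimal for a COM Currency value (hypothetical helper).

    Accepts a Decimal (passed through unchanged) or a two-integer
    tuple assumed to be (high 32 bits, low 32 bits).
    """
    if isinstance(value, Decimal):
        return value
    if (isinstance(value, tuple) and len(value) == 2
            and all(isinstance(part, int) for part in value)):
        high, low = value
        # Mask the low word so a signed or unsigned value gives the same bits.
        raw = (high << 32) | (low & 0xFFFFFFFF)
        return Decimal(raw) / CURRENCY_SCALE
    raise TypeError("not a recognised currency value: %r" % (value,))


print(to_decimal((1, 1194857614)))  # 548982.491
```

Passing Decimal values through unchanged means the same call keeps working if a later library version starts returning decimal.Decimal directly.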

I tried this with Excel 2007 and VBA. It gives the correct value.
1) Try pasting this value into a new Excel workbook.
2) Press Alt + F11. Gets you to VBA Editor.
3) Press Ctrl + G. Gets you to immediate window.
4) In the immediate window, type ?Range("a1").Value
here "a1" is the cell where you have pasted the value.
I suspect that the cell contains some value or character that causes it to be interpreted this way.
Post your observations here.

Related

Default representation of number as hex in Spyder

I am debugging a hardware IP, and since I am looking at waveforms it is tidier and more organized if I keep the numbers in hex format. (I could change the representation in the waveform viewer to decimal, but I would prefer to avoid that, because with decimal I lose track of the individual byte values.) Example of numbers in the waveform:
The drawback, however, is that whenever I need to check them against my results in Python I need to wrap every number with hex() before printing it. Note that I am using Python for bookkeeping and to check results on the spot; I don't care about performance or anything else.
import numpy as np
#Test HW dot product - This is just an example
e = np.arange(0x5a,0x61)
a = np.arange(0x1,0x8)
# Intermediate result
r1 = a[0:4]*e[0:4]
#Final result
c = np.dot(a,e)
In the spyder console then I type the variable to display the content:
>>>c
Out[6]: 2632
>>>r1
Out[7]: array([ 90, 182, 276, 372])
However, I would like them to be displayed as hex. Is there any console setting or Python representation setting to make this happen?
I know that I can create a wrapper around a print function that calls hex, but I don't want to have prints all around my code, I like that I can just type a variable name in the console and see the value.

Are the values 161137531201111100, 1.611375312011111e+17 equal?

I am trying to manipulate a dataframe. One value in a list that I use to append a column to the dataframe is 161137531201111100. I also created a dictionary whose keys are the unique values of this column, and I use this dictionary in further operations. This code used to run perfectly before.
However, after trying this code on other data I got the following error:
KeyError: 1.611375312011111e+17
which means that this value is not a key of the dictionary. I tried to trace the code, and everything seemed to be okay. However, when I opened the CSV file of the dataframe I built, I found that the value causing the problem is 161137531201111000, which is not in the list (and of course not a key in the dictionary) that I used to create this column of the dataframe. This seems weird, and I don't know the reason. Is there any reason that a number would be saved in another way?
And how can I save it as it is in all phases? Also, why did it change in the CSV?

No, unfortunately they are not equal:
print(1.611375312011111e+17 == 161137531201111000) # False
The problem lies in the way floating-point numbers are handled by computers in general, and by most programming languages, including Python.
Always use integers (and not "too large" ones) when doing computations if you want exact results.
See "Is floating point math broken?" for a generic explanation that you definitely must know as a programmer, even if it's not specific to Python.
(Be aware that Python does a rather good job of keeping precision on integers, but unfortunately that does not carry over to floating-point numbers.)
And just for the sake of "fun" with floating point numbers, 1.611375312011111e+17 is actually equal to the integer 161137531201111104!
print(format(1.611375312011111e+17, ".60g")) # shows 161137531201111104
print(1.611375312011111e+17 == 161137531201111104) # True
a = dict()
a[1.611375312011111e+17] = "hello"
#print(a[161137531201111100]) # Key error, as in question
print(a[161137531201111104]) # This one shows "hello" properly!
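The rounding can also be checked directly. Below is a minimal sketch, assuming the column was converted to float somewhere along the way; the helper name survives_float is made up for illustration. A double has a 53-bit significand, so an integer above 2**53 only survives the conversion if it happens to land on a representable value.

```python
def survives_float(n):
    """Return True if the integer n is unchanged by a round trip through float."""
    return int(float(n)) == n


original = 161137531201111100

print(survives_float(original))   # False
print(int(float(original)))       # 161137531201111104
print(survives_float(2**53))      # True
print(survives_float(2**53 + 1))  # False

# Keeping the key as an int (or as a string) avoids the lossy conversion.
lookup = {original: "hello"}
print(lookup[original])           # hello
```

Keeping the identifiers as integers or strings at every step, including when the CSV is written and read back, is what preserves them exactly.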

Recasting a STRING into a VALUE in LibreOffice Calc

I have a Python class that does some currency conversion and string formatting of numbers. It takes polymorphic input, but only outputs a stringified number. I can push those stringified numbers up to LibreOffice Calc from Python easily enough:
stringifiednumber = str("1.01")
cell_a1 = sheet1.getCellRange("A1")
cell_a1.String = stringifiednumber
This actually works nicely since the builtin currency formats in Calc work just fine with stringified numbers.
What doesn't work, or only sort of works, is formulas. Calling SUM(A1:A2) will not see the stringified A1. There is a workaround (forgive me, it is late and I forget it exactly, but it is similar to this): =SUMPRODUCT(VALUE(A1:A2)).
As I understand it, each cell has a memory location for a number, a string, and a formula. The formula only acts on the VALUE memory location.
Through the spreadsheet UI, I can convert one cell type to another during a copy. To do that I just put the following formula in A2, and it converts STRING(A1) to VALUE(A2):
# formula placed in A2
=VALUE(A1)
but that only works by copying one cell to another. Obviously there is an internal recasting function within the spreadsheet that is doing the conversion during the copy.
What I want to do is write a stringified number to the spreadsheet (as above) and then call the spreadsheet's native recasting function in place from Python, so that VALUE(A1) is recast from STRING(A1).
If I knew what the recasting function was I could just call it after every string write. This would make macros in the UI work like the user expects them to work.
If your answer is: "do type conversion Python-side", I've already considered that, and it is not the solution I'm looking for.

Based on your Title, multiply by 1:

openpyxl please do not assume text as a number when importing

There are numerous questions about how to stop Excel from interpreting text as a number, or how to output number formats with openpyxl, but I haven't seen any solutions to this problem:
I have an Excel spreadsheet given to me by someone else, so I did not create it. When I open the file with Excel, I have certain values like "5E12" (clone numbers, if anyone cares) that appear to display correctly, but there's a little green arrow next to each one warning me that "This appears to be a number stored as text". Excel then asks me if I would like to convert it to a number, and if I say yes, I get 5000000000000, which then converts automatically to scientific notation and displays 5E12 again, only this time a text output would show the full number with zeroes. Note that before the conversion, this really is text, even to Excel, and I'm only being warned and offered the conversion.
So, when reading this file in with openpyxl (from openpyxl.reader.excel import load_workbook), the 5E12 is getting converted automatically to 5000000000000. I assume that openpyxl is making the same assumption that Excel made, only the conversion happens without a prompt or input on my part.
How can I prevent this from happening? I do not want text that looks like "numbers stored as text" to be converted to numbers. It is text unless I say otherwise.
So far, the only solution I have found is to add single quotes to the front of each cell, but this is not an ideal solution, as it's manual labor rather than a programmatic solution. Also, the solution needs to be general, since I don't always know where this problem might occur (I'm reading millions of lines per day, so I don't want to have to do anything by hand).
I think this is a problem with openpyxl. There is a google group discussion from the beginning of 2011 that mentions this problem, but assumes it's too rare to matter. https://groups.google.com/forum/?fromgroups=#!topic/openpyxl-users/HZfpShMp8Tk
So, any suggestions?

If you want to use openpyxl again (for whatever reason), the following changes to the worksheet reader routine do the trick of keeping the strings as strings:
diff --git a/openpyxl/reader/worksheet.py b/openpyxl/reader/worksheet.py
--- a/openpyxl/reader/worksheet.py
+++ b/openpyxl/reader/worksheet.py
@@ -134,8 +134,10 @@
             data_type = element.get('t', 'n')
             if data_type == Cell.TYPE_STRING:
                 value = string_table.get(int(value))
-
-            ws.cell(coordinate).value = value
+                ws.cell(coordinate).set_value_explicit(value=value,
+                                                data_type=Cell.TYPE_STRING)
+            else:
+                ws.cell(coordinate).value = value
             # to avoid memory exhaustion, clear the item after use
             element.clear()
Cell.value is a property that calls Cell._set_value on assignment, which in turn calls Cell.bind_value. According to that method's docstring, it will: "Given a value, infer type and display options". Since the types of the values are recorded in the XML file, those should be used (here I only do that for strings) instead of doing something 'smart'.
As you can see from the code, the test whether it is a string was already there.

Unable to insert floats as range_keys in DynamoDB with Boto

When I attempt to insert a range_key that contains a number with more than 2 decimal places, the number stored in the database is rounded to 2 decimal places.
How do I get around this?
max_number = 1000000.0
random_time = random.randrange(1, max_number-1) / max_number
range_key = int(time.time()) + random_time
data['item_id'] = '12345'
result = db.add(table='media', key=group_id,
                range_key=range_key,
                data=data)
The resulting range_key of "1347053744.819199" gets inserted as "1347053744.82"

UPDATE:
This is actually a bug in boto, see https://github.com/boto/boto/pull/890#issuecomment-8456495 for tracking
ORIGINAL ANSWER:
Why are you using add to store a new item? add is the function for atomic increments. You should use put_item instead.
I do not know why your float is rounded, but in any case you really should not do this. It is bad practice to use floats as keys. It is impossible to reliably check floats for equality because they are inherently approximations. See http://www.cygnus-software.com/papers/comparingfloats/comparingfloats.html for in-depth insight on floats.
Nonetheless, if you really need floats as keys, you need to fully control their representation so as not to depend on the server-side implementation when it deserializes the JSON of the request. You need to replace the float with a regular string.
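A minimal sketch of replacing the float with a string, using only the standard library. The helper name make_range_key and the fixed-width layout are assumptions for illustration and are not part of boto; the sketch also presumes the table's range key is declared as a string. Zero-padding both parts keeps the lexicographic order of the keys the same as their numeric order.

```python
import random
import time


def make_range_key(seconds=None, micros=None):
    """Build a fixed-width 'seconds.microseconds' string key (hypothetical helper)."""
    if seconds is None:
        seconds = int(time.time())
    if micros is None:
        # Random fractional part, kept as an integer so no float is involved.
        micros = random.randrange(1000000)
    return "%010d.%06d" % (seconds, micros)


print(make_range_key(1347053744, 819199))  # 1347053744.819199
```

Because the key never passes through a float, every digit that is written is the digit that is stored.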
