Can't Convert Object to Float. What's the Best Workaround? - python

I'm struggling to convert an object to a float.
df_final['INBCS'] = df_final['INBCS'].astype(float)
It keeps saying: ValueError: could not convert string to float: '1,620,000'
If I try a different approach, I get mostly NaN results.
print(pd.to_numeric(df_final['INBCS'], errors='coerce'))
I tried one more approach, and I still get errors.
df_final = df_final[df_final['INBCS'].apply(lambda x: x.isnumeric())]
There are no NaNs in the data; I already converted them to zeros. When I print the data, it shows commas, but there are no commas at all. I even ran a replace function to get rid of any potential commas, but again, there are no commas in the data. Any idea what's wrong here? Thanks.

The reason you can't convert that string to a float is that Python doesn't know what to do with the commas. You can reproduce this easily:
>>> float('1,000')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: could not convert string to float: '1,000'
It's tempting to just remove the commas and parse the number, but there's an internationalization concern. In some countries, a comma separates thousands (e.g., "1,000,000" is one million). In other countries, a comma is the decimal separator (e.g., "1,05" is one and five one-hundredths).
For that reason, it's best to use localization to parse a number like that if you can't get it in a native form. See this answer for details on that.
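As a rough illustration of the separator problem, here is a minimal sketch that makes the separators explicit. The helper name parse_number and its parameters are hypothetical, not part of any library; the standard library's locale.atof does the same job against the active locale, but depends on which locales are installed on the machine.

```python
def parse_number(text, thousands_sep=',', decimal_sep='.'):
    """Hypothetical helper: parse a formatted number given its separators."""
    # Drop the grouping separator first, then normalize the decimal mark.
    normalized = text.replace(thousands_sep, '').replace(decimal_sep, '.')
    return float(normalized)

# Comma as thousands separator (the format in the question).
us_value = parse_number('1,620,000')
# Comma as decimal separator, dot as thousands separator.
eu_value = parse_number('1,05', thousands_sep='.', decimal_sep=',')
print(us_value, eu_value)
```

The same input string can mean different numbers depending on the convention, which is why the caller has to state the separators (or the locale) rather than guess.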

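The reason is that the values contain commas. Note that Series.replace only matches whole values, so use the .str accessor to strip the commas inside each string:
df_final['INBCS'] = df_final['INBCS'].str.replace(',', '')
df_final['INBCS'] = df_final['INBCS'].astype(float)
That should work.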
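A minimal sketch of the comma-stripping approach on a small made-up frame. The sample values are assumptions standing in for the real df_final['INBCS'] column, and every cell is assumed to be a string.

```python
import pandas as pd

# Hypothetical sample standing in for df_final['INBCS'].
df_final = pd.DataFrame({'INBCS': ['1,620,000', '0', '35,000']})

# .str.replace works on substrings; regex=False treats ',' as a literal.
cleaned = df_final['INBCS'].str.replace(',', '', regex=False)
df_final['INBCS'] = cleaned.astype(float)
print(df_final['INBCS'].tolist())
```

If the column mixes strings with real numbers, the .str accessor returns NaN for the non-string cells, so it may be worth converting the column with astype(str) before stripping.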

Try this:
string = '1,620,000'
decimal = float(''.join(string.split(',')))
print(type(decimal), decimal)
# Prints (<type 'float'>, 1620000.0)
This first gets rid of all the commas using split(','), then recreates the string using ''.join(). Finally, it converts the whole thing to a float using float().

Related

Add percentage numbers from a cell in PPTX

I'm a novice in Python, and I'm struggling with pptx rn.
I have an xlsx file and I want to import a single percentage number from a cell in my presentation.
Here is the traceback:
line 40, in to_unicode
raise TypeError("expected unicode string, got %s value %s" % (type(text), text))
TypeError: expected unicode string, got <class 'float'> value 0.29
Try:
subtitle.text = str(active_sheet['C8'].value)
as a start. This should avoid the exception, but might not give you the exact representation you want.
I expect it will give you "0.29", but let's come back to that. The reason you are getting the exception is because you are assigning a numeric (float) value to a property that expects a str value. Converting it to string using the built-in str() function takes care of that.
To get a percentage representation, try:
subtitle.text = "{:.0%}".format(active_sheet['C8'].value)
There are other ways to interpolate numbers into strings in Python that a search will turn up, but this is a good one in this case.
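To compare the two representations side by side, here is a small sketch that uses a plain float in place of the spreadsheet cell. The variable value is an assumption standing in for active_sheet['C8'].value.

```python
value = 0.29  # stand-in for active_sheet['C8'].value

as_plain = str(value)                    # plain string conversion
as_percent = "{:.0%}".format(value)      # percentage, no decimal places
as_percent_1dp = "{:.1%}".format(value)  # percentage, one decimal place
print(as_plain, as_percent, as_percent_1dp)
```

The % format type multiplies by 100 and appends the percent sign, so the precision in the format spec controls how many decimals appear.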

How to convert unusual unicode string with number to integer in python

I have some fairly hairy unicode strings with numbers in them that I'd like to test the value of. Normally, I'd just use str.isnumeric to test for whether it could be converted via int() but I'm encountering cases where isnumeric returns True but int() raises an exception.
Here's an example program:
>>> s = '⒍'
>>> s.isnumeric()
True
>>> int(s)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: '⒍'
Unicode is always full of surprises, so I'm happy to just be robust to this case and use a try/except block to catch unusual numbers. However, I'd be happier if I could still convert them to integers. Is there a consistent way to do this?
If you want to test if a string can be passed to int, use str.isdecimal. Both str.isnumeric and str.isdigit include decimal-like characters that aren't compatible with int.
And as @abarnert has mentioned in the comments, the most guaranteed way to test if a string can be passed to int is to simply do it in a try block.
On the other hand, '⒍' can be converted to an actual digit with the help of the unicodedata module, e.g.
print(unicodedata.digit('⒍'))
would output 6.
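The difference between the string tests can be checked directly. This is a small sketch; the candidate strings are picked for illustration, and the third one is written as an escape for ARABIC-INDIC DIGIT THREE.

```python
# ASCII digits, DIGIT SIX FULL STOP, ARABIC-INDIC DIGIT THREE
candidates = ['42', '⒍', '\u0663']

results = {}
for s in candidates:
    try:
        converted = int(s)
    except ValueError:
        converted = None  # int() rejected this string
    # Record (isdecimal, isnumeric, int result) for each candidate.
    results[s] = (s.isdecimal(), s.isnumeric(), converted)
    print(repr(s), results[s])
```

For these inputs, isdecimal agrees with whether int succeeds, while isnumeric also accepts '⒍'.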
I don't know how much luck you'll have, but unicodedata may handle some cases (python 3 code):
>>> import unicodedata
>>> unicodedata.normalize('NFKC', '⒍')
'6.'
Slightly better. As to testing, if you want an int you could just int() it and catch the exception.
The best way to find out if a string can be converted to int is to just try it:
s = '⒍'
try:
    num = int(s)
except ValueError:
    pass  # handle it
Sure, you can try to figure out the right way to test the string in advance, but why? If the rule you want is "whatever int accepts", just use int.
If you want to convert something that is a digit, but isn't a decimal, use the unicodedata module:
s = '⒍'
num = unicodedata.digit(s) # 6
num = unicodedata.numeric(s) # 6.0
num = unicodedata.decimal(s) # ValueError: not a decimal
The DIGIT SIX FULL STOP character's entry in the database has Digit and Numeric values, despite being a Number, Other rather than a Number, Decimal Digit (and therefore not being compatible with int).
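Putting the two ideas together, a possible sketch is to try int first and fall back to a per-character unicodedata.digit lookup. The helper name to_int is hypothetical, and treating a run of digit-like characters as a base-10 number is an assumption that may not suit every input.

```python
import unicodedata

def to_int(text):
    """Hypothetical helper: int() first, then per-character digit lookup."""
    try:
        return int(text)
    except ValueError:
        # unicodedata.digit raises ValueError for characters with no digit
        # value, so non-numeric input still fails with ValueError.
        digits = [unicodedata.digit(ch) for ch in text]
        return int(''.join(str(d) for d in digits))

print(to_int('12'), to_int('⒍'))
```

Strings such as 'abc' or the empty string still raise ValueError, so callers can keep a single except clause.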

Import string that looks like a list "[0448521958, +61439800915]" from JSON into Python and make it an actual list?

I am extracting a string out of a JSON document using python that is being sent by an app in development. This question is similar to some other questions, but I'm having trouble just using x = ast.literal_eval('[0448521958, +61439800915]') due to the plus sign.
I'm trying to get each phone number as a string in a python list x, but I'm just not sure how to do it. I'm getting this error:
raise ValueError('malformed string')
ValueError: malformed string
Your problem is not just the +.
The first number starts with 0, which marks it as an octal literal. Octal only supports the digits 0-7, but the number ends with 8 (and also contains other digits above 7).
So it turns out your problems don't stop there.
You can use a regex to fix the plus:
fixed_string = re.sub(r'\+(\d+)', r'\1', '[0445521757, +61439800915]')
ast.literal_eval(fixed_string)
I don't know what you can do about the octal number problem, however.
I think the problem is that ast.literal_eval is trying to interpret the phone numbers as numbers instead of strings. Try this:
s = '[0448521958, +61439800915]'  # avoid naming it str, which shadows the built-in
s.strip('[]').split(', ')
Result:
['0448521958', '+61439800915']
Technically that string isn't valid JSON. If you want to ignore the +, you could strip it out of the file or string before you evaluate it. If you want to preserve it, you'll have to enclose the value with quotes.
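Both options can be sketched briefly. The quoted variant assumes the sending app can be changed to emit the numbers as JSON strings, which the question does not confirm.

```python
import json

# Option 1: treat the payload as plain text and split it by hand.
raw = '[0448521958, +61439800915]'
numbers = [item.strip() for item in raw.strip('[]').split(',') if item.strip()]
print(numbers)

# Option 2: if the sender quotes the values, it is valid JSON.
quoted = '["0448521958", "+61439800915"]'
parsed = json.loads(quoted)
print(parsed)
```

Either way the phone numbers stay strings, so the leading 0 and the + are preserved.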

How to strip letters out of a string and compare values?

I have just learned Python for this project I am working on and I am having trouble comparing two values - I am using the Python xlwt and xlrd libraries and pulling values of cells from the documents. The problem is some of the values are in the format 'NP_000000000', 'IPI00000000.0', and '000000000' so I need to check which format the value is in and then strip the characters and decimal points off if necessary before comparing them.
I have tried using S1[:3] to get the value without alphabet characters, but I get a 'float is not subscriptable' error.
Then I tried doing re.sub(r'[^\d.]+', '', S1) but I get a TypeError: expected a string or buffer.
I figured the value of the cell returned via sheet.cell(x, y).value would be a string, since it is alphanumeric, but it seems it must be returned as a float.
What is the best way to format these values and then compare them?
You are trying to get the numbers from the strings in the format shown? Like to get 2344 from NP_2344? If yes then use this
float(str(S1)[3:])
to get what you want. You can change float to int.
It sounds like the API you're using is returning different types depending on the content of the cells. You have two options.
You can convert everything to a string and then do what you're currently doing:
s = str(S1)
...
You can check the types of the input and act appropriately:
if isinstance(S1, basestring):  # basestring is Python 2; use str on Python 3
    # this is a string, strip off the prefix
    pass
elif isinstance(S1, float):
    # this is a float, just use it
    pass
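One way to fill in the type-check option, as a sketch only: the helper normalize_id is hypothetical, and it assumes that two values should count as equal when their first run of digits matches as an integer, ignoring prefixes, leading zeros, and anything after a decimal point. Whether that matches the real ID scheme is not stated in the question.

```python
import re

def normalize_id(value):
    """Hypothetical helper: reduce a cell value to an integer ID."""
    if isinstance(value, float):
        # Numeric cells come back as floats such as 12345.0.
        return int(value)
    # For strings, take the first run of digits ('NP_' and 'IPI' are skipped,
    # and a trailing '.0' style suffix is ignored).
    match = re.search(r'\d+', str(value))
    if match is None:
        raise ValueError('no digits in %r' % (value,))
    return int(match.group())

print(normalize_id('NP_000012345') == normalize_id(12345.0))
```

Comparing the normalized integers avoids having to care which of the three formats each cell happened to use.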

python Invalid literal for float

I am running some code to select chunks from a big file, and I am getting a strange error:
"Invalid literal for float(): E-135"
Does anybody know how to fix this? Thanks in advance.
Actually, this is the statement that gives me the error:
float (line_temp[line(line_temp)-1])
line_temp is a string.
'line' is any line in an open file, and is also a string.
You need a number in front of the E to make it a valid string representation of a float:
>>> float('1E-135')
1e-135
>>> float('E-135')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: invalid literal for float(): E-135
In fact, which number is E-135 supposed to represent? 1x10^-135?
Valid literal forms for floats are here.
Looks like you are trying to convert a string to a float. If the string is E-135, then it is indeed an invalid value to be converted to a float. Perhaps you are chopping off a digit in the beginning of the string and it really ought to be something like 1E-135? That would be a valid float.
May I suggest you replace
float(x-y)
with
float(x) - float(y)
Ronald, kindly check the answers again. They are right.
What you are doing is: float(EXPRESSION), where the result of EXPRESSION is E-135. E-135 is not valid input into the float() function. I have no idea what the "line_temp[line(line_temp)-1]" does, but it returns incorrect data for the float() function.
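While tracking down where the truncated token comes from, the conversion can be guarded so that bad input is easy to spot. This is a minimal sketch; to_float_or_none is a hypothetical name, and returning None for malformed input is an assumption about what the caller wants.

```python
def to_float_or_none(text):
    """Hypothetical helper: return a float, or None if text is not a valid float."""
    try:
        return float(text)
    except ValueError:
        return None

good = to_float_or_none('1E-135')  # has a mantissa, so it parses
bad = to_float_or_none('E-135')    # no digits before the exponent
print(good, bad)
```

Logging the offending string whenever None comes back should show whether a leading digit is being chopped off by the slicing.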
