Python Pandas: invalid literal for int() with base 10

I keep getting a value error:
ValueError: invalid literal for int() with base 10: '66,790'
I have a column of incomes in the form £xx,xxx as a string, so I'm converting them to an integer:
income["Median_2012/13"]=income["Median_2012/13"].str.replace('£','').replace(',','').astype('int64')
I have no idea why this throws a value error when I'm removing all the components of the string that stop it from being an integer. £66,790 is the first value in the column. It also fails with "float". Any ideas?
Thanks!

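Your second call is Series.replace, not Series.str.replace, so it only replaces cells whose entire value is ',' and the commas inside the strings stay. Remove all undesired characters in one pass with the character class [£,]: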
>>> df['Median_2012/13'].str.replace('[£,]', '', regex=True).astype(int)
0 66790
Name: Median_2012/13, dtype: int64
It is better to use pd.to_numeric than to cast with astype(int), though, because astype(int) raises an exception if any of the strings contains a decimal part:
>>> pd.to_numeric(df['Median_2012/13'].str.replace('[£,]', '', regex=True))
0 66790
Name: Median_2012/13, dtype: int64
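To see why the original chain left the commas in place, here is a minimal sketch. The two-row Series is made-up sample data, not the asker's file. It contrasts Series.replace, which by default only replaces a cell when the whole value matches, with Series.str.replace, which substitutes inside each string.

```python
import pandas as pd

# Hypothetical sample data standing in for the income column.
s = pd.Series(["£66,790", "£41,200"], name="Median_2012/13")

# Series.replace matches whole cell values, so the commas survive.
partly_cleaned = s.str.replace("£", "", regex=False).replace(",", "")

# Series.str.replace works inside each string, so both characters are removed.
cleaned = (
    s.str.replace("£", "", regex=False)
     .str.replace(",", "", regex=False)
     .astype("int64")
)
```

Here partly_cleaned still holds strings with commas, which is what int() was complaining about, while cleaned is an integer column.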

Related

value error while converting object to int

My quantity column has dtype object (it was read from a CSV file) and I am trying to convert it to int using the code below:
df[x1] = df[x1].astype(str).astype(int)
It throws the error below:
ValueError: invalid literal for int() with base 10: '1,000.000'
Can anyone help me with this, please?
You need to deal with both the ',' and the '.'. Use the str.replace method to remove the comma, then cast the data to float (which handles the decimal point) and then to int.
df[x1] = df[x1].str.replace(',','').astype(float).astype(int)
For example, for a Series such as
srs = pd.Series(['1,000.00','1'])
if you cast it to dtype int
srs.astype(int)
you get
ValueError: invalid literal for int() with base 10: '1,000.00'
Then if you remove the comma with the str.replace method and cast to dtype int
srs = srs.str.replace(',','')
srs.astype(int)
you get
ValueError: invalid literal for int() with base 10: '1000.00'
So you cast it to dtype float and to dtype int,
srs = srs.str.replace(',','').astype(float).astype(int)
you get the expected outcome:
0 1000
1 1
dtype: int32
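If some cells cannot be parsed at all, pd.to_numeric with errors='coerce' turns them into NaN instead of raising. A small sketch of that, where the 'n/a' entry is an invented placeholder added to the sample Series purely for illustration:

```python
import pandas as pd

# Sample values; 'n/a' is an invented placeholder for an unparseable cell.
srs = pd.Series(["1,000.00", "1", "n/a"])

# Strip the thousands separator, then parse; bad cells become NaN.
numeric = pd.to_numeric(srs.str.replace(",", "", regex=False), errors="coerce")

# NaN cannot be stored in a plain int column, so drop it before casting.
as_int = numeric.dropna().astype(int)
```

Whether dropping the bad rows is acceptable depends on the data; inspecting numeric.isna() first shows which rows failed to parse.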
You can use a lambda function, as long as it strips the commas and goes through float first:
df[x1] = df[x1].apply(lambda x: int(float(str(x).replace(',', ''))))
The int() function accepts an integer string or a float, but not a float string. If you are given a float string, convert it to float first and then to int:
int(float(userGuess))

Why do I get an error when using the int() function to convert a float to an integer?

Why do I get an error when I try to use the int() function to convert a float to an integer?
>>> int('99.99')
Traceback (most recent call last):
File "<pyshell#27>", line 1, in <module>
int('99.99')
ValueError: invalid literal for int() with base 10: '99.99'
I expected the result to be 99
Your argument isn't a float, it's a string containing the representation of a float. You have to convert it to a float first, then you can convert that to an int.
int(float('99.99'))
Per the docs
Return an integer object constructed from a number or string x, or
return 0 if no arguments are given. If x is a number, return
x.__int__(). For floating point numbers, this truncates towards zero.
If x is not a number or if base is given, then x must be a string,
bytes, or bytearray instance representing an integer literal in radix
base.
Pay particular attention to "representing an integer literal". So your str that you are attempting to convert cannot be a float, because that's a float literal, not an int literal.
So, as others have noted, you cannot go directly from a string holding a float literal to an int; you need to convert the string to a float first:
x = '123.45'
int(float(x))
You are getting a ValueError because you are passing int() an argument that is not consistent with what the Python docs require.
According to the doc:
"If x is not a number or if base is given, then x must be a string or
Unicode object representing an integer literal "
Basically, x (in your case '99.99') is the string 99.99 which does not satisfy the requirements of being an integer literal. You provided a floating literal.
TL;DR
int(float('99.99'))
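Note that int(float(...)) truncates rather than rounds. A short sketch of the difference, using the value from the question plus a negative value added here for illustration:

```python
text = "99.99"

truncated = int(float(text))      # drops the fractional part
rounded = round(float(text))      # rounds to the nearest integer instead

negative = int(float("-99.99"))   # truncation goes toward zero, not down
```

If the nearest whole number is what you want, round() is the better fit; int() only discards the fraction.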

How to extract significant numeric digits from an alphanumeric string?

I have an ID DIS002789. I want to extract 2789 from the given ID. I have to use the extracted number in a for loop using a variable.
I tried using re.findall.
inputk='DIS0002789'
non_decimal = re.findall(r'[\d.]+', inputk)
for n in range(non_decimal, non_decimal + 1000):
I'm getting 002789, but I want my output to be 2789. I also can't use the for loop because of this: it shows an error saying 002789 is invalid syntax.
I tried converting it to int, but it shows the following error:
TypeError: int() argument must be a string, a bytes-like object or a number, not 'list'
re.findall returns a list, so pass its first element (not the list itself) to int to get an integer. int('0123') ignores leading zeroes.
Example:
import re
inputk='DIS0002789'
non_decimal = int(re.findall(r'[\d.]+', inputk)[0])
If you want it to be a string, you can pass it to str again: str(int('0123')) == '123'
If you want the int value, you should convert it to integer as other answers show. If you only want the string, you can try adding the optional leading zeros:
inputk='DIS0002789'
non_decimal = re.findall(r':?[0]*(\d+)', inputk)
non_decimal
output:
['2789']
You can ignore the leading zeros and convert the result to an integer to use in a loop:
inputk='DIS0002789'
non_decimal = int(re.findall(r':?[0]*(\d+)', inputk)[0])
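Putting the pieces together, a possible end-to-end sketch looks like this. It uses re.search instead of re.findall, and the 'DIS' prefix, the seven-digit width, and the count of three IDs are assumptions read off the sample input rather than anything stated in the question.

```python
import re

inputk = "DIS0002789"

# re.search returns a match object (or None), so no list indexing is needed.
match = re.search(r"\d+", inputk)
if match is None:
    raise ValueError(f"no digits found in {inputk!r}")

start = int(match.group())  # int() drops the leading zeros

# Rebuild zero-padded IDs for the next few numbers; 3 is arbitrary here.
ids = [f"DIS{n:07d}" for n in range(start, start + 3)]
```

Keeping start as an int is what makes range(start, start + 1000) work in the original loop; the zero padding is only reapplied when formatting.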

Convert value in a Python dictionary to an int [duplicate]

I have an issue that really drives me mad. Normally doing int(20.0) would result in 20. So far so good. But:
levels = [int(gex_dict[i]) for i in sorted(gex_dict.keys())]
while gex_dict[i] returns a float, e.g. 20.0, results in:
"invalid literal for int() with base 10: '20.0'"
I am just one step away from munching the last piece of my keyboard.
'20.0' is a string, not a float; you can tell by the single-quotes in the error message. You can get an int out of it by first parsing it with float, then truncating it with int:
>>> int(float('20.0'))
20
(Though maybe you'd want to store floats instead of strings in your dictionary, since that is what you seem to be expecting.)
It looks like the value is a string, not a float. So you need int(float(gex_dict[i]))
It looks like the problem is that gex_dict[i] actually returns a string representation of a float, '20.0'. Although int() can convert a float to an int, and a string representation of an integer to an int, it cannot convert a string representation of a float to an int.
The documentation for int can be found here:
http://docs.python.org/library/functions.html#int
The problem is that you have a string and not a float. Compare:
>>> int(20.0)
20
>>> int('20.0')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: '20.0'
You can work around this problem by first converting to float and then to int:
>>> int(float('20.0'))
20
So it would be in your case:
levels = [int(float(gex_dict[i])) for i in sorted(gex_dict.keys())]
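If the dictionary mixes real numbers with numeric strings, a small helper keeps the comprehension readable. This is only a sketch: to_int is a hypothetical name, and the dictionary contents are made up for illustration.

```python
def to_int(value):
    """Convert ints, floats, or numeric strings such as '20.0' to int."""
    if isinstance(value, str):
        try:
            return int(value)         # plain integer strings, exact for any size
        except ValueError:
            return int(float(value))  # float strings, truncated toward zero
    return int(value)


# Made-up data mixing a float string, an integer string, and a real float.
gex_dict = {"b": "20.0", "a": "7", "c": 3.5}
levels = [to_int(gex_dict[key]) for key in sorted(gex_dict)]
```

Trying int() first avoids a detour through float for strings that are already integers, which matters for very large values that a float cannot represent exactly.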
