How can I convert a string like 123,456.908 to float 123456.908 in Python?
For ints, see How to convert a string to a number if it has commas in it as thousands separators?, although the techniques are essentially the same.
Using the localization services
The default locale
The standard library locale module is Python's interface to C-based localization routines.
The basic usage is:
import locale
locale.atof('123,456')
In locales where , is treated as a thousands separator, this would return 123456.0; in locales where it is treated as a decimal point, it would return 123.456.
However, by default, this will not work:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python3.8/locale.py", line 326, in atof
return func(delocalize(string))
ValueError: could not convert string to float: '123,456'
This is because by default, the program is "in a locale" that has nothing to do with the platform the code is running on, but is instead defined by the POSIX standard. As the documentation explains:
Initially, when a program is started, the locale is the C locale, no matter what the user’s preferred locale is. There is one exception: the LC_CTYPE category is changed at startup to set the current locale encoding to the user’s preferred locale encoding. The program must explicitly say that it wants the user’s preferred locale settings for other categories by calling setlocale(LC_ALL, '').
That is: aside from making a note of the system's default setting for the preferred character encoding in text files (nowadays, this will likely be UTF-8), by default, the locale module will interpret data the same way that Python itself does (via a locale named C, after the C programming language). locale.atof will do the same thing as float passed a string, and similarly locale.atoi will mimic int.
Using a locale from the environment
Making the setlocale call mentioned in the above quote from the documentation will pull in locale settings from the user's environment. Thus:
>>> import locale
>>> # passing an empty string asks for a locale configured on the
>>> # local machine; the return value indicates what that locale is.
>>> locale.setlocale(locale.LC_ALL, '')
'en_CA.UTF-8'
>>> locale.atof('123,456.789')
123456.789
>>> locale.atof('123456.789')
123456.789
The locale will not care if the thousands separators are in the right place - it just recognizes and filters them:
>>> locale.atof('12,34,56.789')
123456.789
In 3.6 and up, it will also not care about underscores, which are separately handled by the built-in float and int conversion:
>>> locale.atof('12_34_56.789')
123456.789
On the other side, the string format method, and f-strings, are locale-aware if the n format is used:
>>> f'{123456.789:.9n}' # `.9` specifies 9 significant figures
'123,456.789'
Without the previous setlocale call, the output would not have the comma.
Setting a locale explicitly
It is also possible to make temporary locale settings, using the appropriate locale name, and apply those settings only to a specific aspect of localization. To get localized parsing and formatting only for numbers, for example, use LC_NUMERIC rather than LC_ALL in the setlocale call.
Here are some examples:
>>> # in Denmark, periods are thousands separators and commas are decimal points
>>> locale.setlocale(locale.LC_NUMERIC, 'en_DK.UTF-8')
'en_DK.UTF-8'
>>> locale.atof('123,456.789')
123.456789
>>> # Formatting a number according to the Indian lakh/crore system:
>>> locale.setlocale(locale.LC_NUMERIC, 'en_IN.UTF-8')
'en_IN.UTF-8'
>>> f'{123456.789:9.9n}'
'1,23,456.789'
The necessary locale strings may depend on your operating system, and may require additional work to enable.
To get back to how Python behaves by default, use the C locale described previously, thus: locale.setlocale(locale.LC_ALL, 'C').
Caveats
Setting the locale affects program behaviour globally, and is not thread safe. If done at all, it should normally be done just once at the beginning of the program. Again quoting from documentation:
It is generally a bad idea to call setlocale() in some library routine, since as a side effect it affects the entire program. Saving and restoring it is almost as bad: it is expensive and affects other threads that happen to run before the settings have been restored.
If, when coding a module for general use, you need a locale independent version of an operation that is affected by the locale (such as certain formats used with time.strftime()), you will have to find a way to do it without using the standard library routine. Even better is convincing yourself that using locale settings is okay. Only as a last resort should you document that your module is not compatible with non-C locale settings.
When the Python code is embedded within a C program, setting the locale can even affect the C code:
Extension modules should never call setlocale(), except to find out what the current locale is. But since the return value can only be used portably to restore it, that is not very useful (except perhaps to find out whether or not the locale is C).
(N.B: when setlocale is called with a single category argument, or with None - not an empty string - for the locale name, it does not change anything, and simply returns the name of the existing locale.)
So, this is not meant as a tool, in production code, to try out experimentally parsing or formatting data that was meant for different locales. The above examples are only examples to illustrate how the system works. For this purpose, seek a third-party internationalization library.
However, if the data is all formatted according to a specific locale, specifying that locale ahead of time will make it possible to use locale.atoi and locale.atof as drop-in replacements for int and float calls on string input.
Just remove the , with replace():
float("123,456.908".replace(',',''))
If you don't know the locale and you want to parse any kind of number, use this parseNumber(text) function (My repo). It is not perfect but take into account most cases :
>>> parseNumber("a 125,00 €")
125
>>> parseNumber("100.000,000")
100000
>>> parseNumber("100 000,000")
100000
>>> parseNumber("100,000,000")
100000000
>>> parseNumber("100 000 000")
100000000
>>> parseNumber("100.001 001")
100.001
>>> parseNumber("$.3")
0.3
>>> parseNumber(".003")
0.003
>>> parseNumber(".003 55")
0.003
>>> parseNumber("3 005")
3005
>>> parseNumber("1.190,00 €")
1190
>>> parseNumber("1190,00 €")
1190
>>> parseNumber("1,190.00 €")
1190
>>> parseNumber("$1190.00")
1190
>>> parseNumber("$1 190.99")
1190.99
>>> parseNumber("1 000 000.3")
1000000.3
>>> parseNumber("1 0002,1.2")
10002.1
>>> parseNumber("")
>>> parseNumber(None)
>>> parseNumber(1)
1
>>> parseNumber(1.1)
1.1
>>> parseNumber("rrr1,.2o")
1
>>> parseNumber("rrr ,.o")
>>> parseNumber("rrr1rrr")
1
If the input uses a comma as a decimal point and period as a thousands separator, use .replace twice to convert the data to the format used by the built-in float. Thus:
s = s.replace('.','').replace(',','.')
number = float(s)
What about this?
my_string = "123,456.908"
commas_removed = my_string.replace(',', '') # remove comma separation
my_float = float(commas_removed) # turn from string to float.
In short:
my_float = float(my_string.replace(',', ''))
Better solution for different currency formats:
def text_currency_to_float(text):
t = text
dot_pos = t.rfind('.')
comma_pos = t.rfind(',')
if comma_pos > dot_pos:
t = t.replace(".", "")
t = t.replace(",", ".")
else:
t = t.replace(",", "")
return float(t)
This attempts to detect whether commas are thousands separators and periods are decimal points, or the other way around, by checking where each appears in the string, if at all. (The premise is that thousands separators should not be used in the fractional part of the number.)
s = "123,456.908"
print float(s.replace(',', ''))
Here's a simple way I wrote up for you. :)
>>> number = '123,456,789.908'.replace(',', '') # '123456789.908'
>>> float(number)
123456789.908
You may use babel:
from babel.numbers import parse_decimal
f = float(parse_decimal("123,456.908", locale="en_US"))
Related
How can I convert a string like 123,456.908 to float 123456.908 in Python?
For ints, see How to convert a string to a number if it has commas in it as thousands separators?, although the techniques are essentially the same.
Using the localization services
The default locale
The standard library locale module is Python's interface to C-based localization routines.
The basic usage is:
import locale
locale.atof('123,456')
In locales where , is treated as a thousands separator, this would return 123456.0; in locales where it is treated as a decimal point, it would return 123.456.
However, by default, this will not work:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python3.8/locale.py", line 326, in atof
return func(delocalize(string))
ValueError: could not convert string to float: '123,456'
This is because by default, the program is "in a locale" that has nothing to do with the platform the code is running on, but is instead defined by the POSIX standard. As the documentation explains:
Initially, when a program is started, the locale is the C locale, no matter what the user’s preferred locale is. There is one exception: the LC_CTYPE category is changed at startup to set the current locale encoding to the user’s preferred locale encoding. The program must explicitly say that it wants the user’s preferred locale settings for other categories by calling setlocale(LC_ALL, '').
That is: aside from making a note of the system's default setting for the preferred character encoding in text files (nowadays, this will likely be UTF-8), by default, the locale module will interpret data the same way that Python itself does (via a locale named C, after the C programming language). locale.atof will do the same thing as float passed a string, and similarly locale.atoi will mimic int.
Using a locale from the environment
Making the setlocale call mentioned in the above quote from the documentation will pull in locale settings from the user's environment. Thus:
>>> import locale
>>> # passing an empty string asks for a locale configured on the
>>> # local machine; the return value indicates what that locale is.
>>> locale.setlocale(locale.LC_ALL, '')
'en_CA.UTF-8'
>>> locale.atof('123,456.789')
123456.789
>>> locale.atof('123456.789')
123456.789
The locale will not care if the thousands separators are in the right place - it just recognizes and filters them:
>>> locale.atof('12,34,56.789')
123456.789
In 3.6 and up, it will also not care about underscores, which are separately handled by the built-in float and int conversion:
>>> locale.atof('12_34_56.789')
123456.789
On the other side, the string format method, and f-strings, are locale-aware if the n format is used:
>>> f'{123456.789:.9n}' # `.9` specifies 9 significant figures
'123,456.789'
Without the previous setlocale call, the output would not have the comma.
Setting a locale explicitly
It is also possible to make temporary locale settings, using the appropriate locale name, and apply those settings only to a specific aspect of localization. To get localized parsing and formatting only for numbers, for example, use LC_NUMERIC rather than LC_ALL in the setlocale call.
Here are some examples:
>>> # in Denmark, periods are thousands separators and commas are decimal points
>>> locale.setlocale(locale.LC_NUMERIC, 'en_DK.UTF-8')
'en_DK.UTF-8'
>>> locale.atof('123,456.789')
123.456789
>>> # Formatting a number according to the Indian lakh/crore system:
>>> locale.setlocale(locale.LC_NUMERIC, 'en_IN.UTF-8')
'en_IN.UTF-8'
>>> f'{123456.789:9.9n}'
'1,23,456.789'
The necessary locale strings may depend on your operating system, and may require additional work to enable.
To get back to how Python behaves by default, use the C locale described previously, thus: locale.setlocale(locale.LC_ALL, 'C').
Caveats
Setting the locale affects program behaviour globally, and is not thread safe. If done at all, it should normally be done just once at the beginning of the program. Again quoting from documentation:
It is generally a bad idea to call setlocale() in some library routine, since as a side effect it affects the entire program. Saving and restoring it is almost as bad: it is expensive and affects other threads that happen to run before the settings have been restored.
If, when coding a module for general use, you need a locale independent version of an operation that is affected by the locale (such as certain formats used with time.strftime()), you will have to find a way to do it without using the standard library routine. Even better is convincing yourself that using locale settings is okay. Only as a last resort should you document that your module is not compatible with non-C locale settings.
When the Python code is embedded within a C program, setting the locale can even affect the C code:
Extension modules should never call setlocale(), except to find out what the current locale is. But since the return value can only be used portably to restore it, that is not very useful (except perhaps to find out whether or not the locale is C).
(N.B: when setlocale is called with a single category argument, or with None - not an empty string - for the locale name, it does not change anything, and simply returns the name of the existing locale.)
So, this is not meant as a tool, in production code, to try out experimentally parsing or formatting data that was meant for different locales. The above examples are only examples to illustrate how the system works. For this purpose, seek a third-party internationalization library.
However, if the data is all formatted according to a specific locale, specifying that locale ahead of time will make it possible to use locale.atoi and locale.atof as drop-in replacements for int and float calls on string input.
Just remove the , with replace():
float("123,456.908".replace(',',''))
If you don't know the locale and you want to parse any kind of number, use this parseNumber(text) function (My repo). It is not perfect but take into account most cases :
>>> parseNumber("a 125,00 €")
125
>>> parseNumber("100.000,000")
100000
>>> parseNumber("100 000,000")
100000
>>> parseNumber("100,000,000")
100000000
>>> parseNumber("100 000 000")
100000000
>>> parseNumber("100.001 001")
100.001
>>> parseNumber("$.3")
0.3
>>> parseNumber(".003")
0.003
>>> parseNumber(".003 55")
0.003
>>> parseNumber("3 005")
3005
>>> parseNumber("1.190,00 €")
1190
>>> parseNumber("1190,00 €")
1190
>>> parseNumber("1,190.00 €")
1190
>>> parseNumber("$1190.00")
1190
>>> parseNumber("$1 190.99")
1190.99
>>> parseNumber("1 000 000.3")
1000000.3
>>> parseNumber("1 0002,1.2")
10002.1
>>> parseNumber("")
>>> parseNumber(None)
>>> parseNumber(1)
1
>>> parseNumber(1.1)
1.1
>>> parseNumber("rrr1,.2o")
1
>>> parseNumber("rrr ,.o")
>>> parseNumber("rrr1rrr")
1
If the input uses a comma as a decimal point and period as a thousands separator, use .replace twice to convert the data to the format used by the built-in float. Thus:
s = s.replace('.','').replace(',','.')
number = float(s)
What about this?
my_string = "123,456.908"
commas_removed = my_string.replace(',', '') # remove comma separation
my_float = float(commas_removed) # turn from string to float.
In short:
my_float = float(my_string.replace(',', ''))
Better solution for different currency formats:
def text_currency_to_float(text):
t = text
dot_pos = t.rfind('.')
comma_pos = t.rfind(',')
if comma_pos > dot_pos:
t = t.replace(".", "")
t = t.replace(",", ".")
else:
t = t.replace(",", "")
return float(t)
This attempts to detect whether commas are thousands separators and periods are decimal points, or the other way around, by checking where each appears in the string, if at all. (The premise is that thousands separators should not be used in the fractional part of the number.)
s = "123,456.908"
print float(s.replace(',', ''))
Here's a simple way I wrote up for you. :)
>>> number = '123,456,789.908'.replace(',', '') # '123456789.908'
>>> float(number)
123456789.908
You may use babel:
from babel.numbers import parse_decimal
f = float(parse_decimal("123,456.908", locale="en_US"))
Why is it that Babel does not use the minus sign used by my locale, in functions like format_decimal()? It seems to me like this would be the very job of a library like Babel.
Is there a way I can enforce the usage of locale specific minus signs?
>>> import babel
>>> babel.__version__
'2.11.0'
>>> from babel.numbers import format_decimal, Locale
>>> l = Locale("sv_SE")
>>> l.number_symbols["minusSign"]
'−'
>>> format_decimal(-1.234, locale=l)
'-1,234'
Despite the fact that Local.number_symbols clearly contain a different character (in this case U+2212), format_decimal() (and other Babel formatting functions) use only the fallback hyphen-minus.
I am getting '-1,234' (with an hyphen-minus) where I would have expected '−1,234' (with the U+2212 minus sign).
Babel uses the hyphen-minus ('-') by default, despite the fact that a different character may be specified in the number_symbols attribute of the Locale object.
This is because the format_decimal() function, rely on the locale module of the Python standard library to format numbers. The locale module uses the C library's localization functions, which are not always capable of handling Unicode characters like the U+2212 MINUS SIGN (minus sign used by your locale).
But you can try this:
from babel.numbers import format_decimal, Locale
l = Locale("sv_SE")
number = -1.234
minus_sign = l.number_symbols["minusSign"]
formatted_number = "{}{:,.5f}".format(minus_sign, abs(number))
I have a string that represents a number which uses commas to separate thousands. How can I convert this to a number in python?
>>> int("1,000,000")
Generates a ValueError.
I could replace the commas with empty strings before I try to convert it, but that feels wrong somehow. Is there a better way?
For float values, see How can I convert a string with dot and comma into a float in Python, although the techniques are essentially the same.
import locale
locale.setlocale( locale.LC_ALL, 'en_US.UTF-8' )
locale.atoi('1,000,000')
# 1000000
locale.atof('1,000,000.53')
# 1000000.53
There are several ways to parse numbers with thousands separators. And I doubt that the way described by #unutbu is the best in all cases. That's why I list other ways too.
The proper place to call setlocale() is in __main__ module. It's global setting and will affect the whole program and even C extensions (although note that LC_NUMERIC setting is not set at system level, but is emulated by Python). Read caveats in documentation and think twice before going this way. It's probably OK in single application, but never use it in libraries for wide audience. Probably you shoud avoid requesting locale with some particular charset encoding, since it might not be available on some systems.
Use one of third party libraries for internationalization. For example PyICU allows using any available locale wihtout affecting the whole process (and even parsing numbers with particular thousands separators without using locales):
NumberFormat.createInstance(Locale('en_US')).parse("1,000,000").getLong()
Write your own parsing function, if you don't what to install third party libraries to do it "right way". It can be as simple as int(data.replace(',', '')) when strict validation is not needed.
Replace the commas with empty strings, and turn the resulting string into an int or a float.
>>> a = '1,000,000'
>>> int(a.replace(',' , ''))
1000000
>>> float(a.replace(',' , ''))
1000000.0
I got locale error from accepted answer, but the following change works here in Finland (Windows XP):
import locale
locale.setlocale( locale.LC_ALL, 'english_USA' )
print locale.atoi('1,000,000')
# 1000000
print locale.atof('1,000,000.53')
# 1000000.53
This works:
(A dirty but quick way)
>>> a='-1,234,567,89.0123'
>>> "".join(a.split(","))
'-123456789.0123'
I tried this. It goes a bit beyond the question:
You get an input. It will be converted to string first (if it is a list, for example from Beautiful soup);
then to int,
then to float.
It goes as far as it can get. In worst case, it returns everything unconverted as string.
def to_normal(soupCell):
''' converts a html cell from beautiful soup to text, then to int, then to float: as far as it gets.
US thousands separators are taken into account.
needs import locale'''
locale.setlocale( locale.LC_ALL, 'english_USA' )
output = unicode(soupCell.findAll(text=True)[0].string)
try:
return locale.atoi(output)
except ValueError:
try: return locale.atof(output)
except ValueError:
return output
>>> import locale
>>> locale.setlocale(locale.LC_ALL, "")
'en_US.UTF-8'
>>> print locale.atoi('1,000,000')
1000000
>>> print locale.atof('1,000,000.53')
1000000.53
this is done on Linux in US.
A little late, but the babel library has parse_decimal and parse_number which do exactly what you want:
from babel.numbers import parse_decimal, parse_number
parse_decimal('10,3453', locale='es_ES')
>>> Decimal('10.3453')
parse_number('20.457', locale='es_ES')
>>> 20457
parse_decimal('10,3453', locale='es_MX')
>>> Decimal('103453')
You can also pass a Locale class instead of a string:
from babel import Locale
parse_decimal('10,3453', locale=Locale('es_MX'))
>>> Decimal('103453')
If you're using pandas and you're trying to parse a CSV that includes numbers with a comma for thousands separators, you can just pass the keyword argument thousands=',' like so:
df = pd.read_csv('your_file.csv', thousands=',')
Try this:
def changenum(data):
foo = ""
for i in list(data):
if i == ",":
continue
else:
foo += i
return float(int(foo))
How can I convert a string like 123,456.908 to float 123456.908 in Python?
For ints, see How to convert a string to a number if it has commas in it as thousands separators?, although the techniques are essentially the same.
Using the localization services
The default locale
The standard library locale module is Python's interface to C-based localization routines.
The basic usage is:
import locale
locale.atof('123,456')
In locales where , is treated as a thousands separator, this would return 123456.0; in locales where it is treated as a decimal point, it would return 123.456.
However, by default, this will not work:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python3.8/locale.py", line 326, in atof
return func(delocalize(string))
ValueError: could not convert string to float: '123,456'
This is because by default, the program is "in a locale" that has nothing to do with the platform the code is running on, but is instead defined by the POSIX standard. As the documentation explains:
Initially, when a program is started, the locale is the C locale, no matter what the user’s preferred locale is. There is one exception: the LC_CTYPE category is changed at startup to set the current locale encoding to the user’s preferred locale encoding. The program must explicitly say that it wants the user’s preferred locale settings for other categories by calling setlocale(LC_ALL, '').
That is: aside from making a note of the system's default setting for the preferred character encoding in text files (nowadays, this will likely be UTF-8), by default, the locale module will interpret data the same way that Python itself does (via a locale named C, after the C programming language). locale.atof will do the same thing as float passed a string, and similarly locale.atoi will mimic int.
Using a locale from the environment
Making the setlocale call mentioned in the above quote from the documentation will pull in locale settings from the user's environment. Thus:
>>> import locale
>>> # passing an empty string asks for a locale configured on the
>>> # local machine; the return value indicates what that locale is.
>>> locale.setlocale(locale.LC_ALL, '')
'en_CA.UTF-8'
>>> locale.atof('123,456.789')
123456.789
>>> locale.atof('123456.789')
123456.789
The locale will not care if the thousands separators are in the right place - it just recognizes and filters them:
>>> locale.atof('12,34,56.789')
123456.789
In 3.6 and up, it will also not care about underscores, which are separately handled by the built-in float and int conversion:
>>> locale.atof('12_34_56.789')
123456.789
On the other side, the string format method, and f-strings, are locale-aware if the n format is used:
>>> f'{123456.789:.9n}' # `.9` specifies 9 significant figures
'123,456.789'
Without the previous setlocale call, the output would not have the comma.
Setting a locale explicitly
It is also possible to make temporary locale settings, using the appropriate locale name, and apply those settings only to a specific aspect of localization. To get localized parsing and formatting only for numbers, for example, use LC_NUMERIC rather than LC_ALL in the setlocale call.
Here are some examples:
>>> # in Denmark, periods are thousands separators and commas are decimal points
>>> locale.setlocale(locale.LC_NUMERIC, 'en_DK.UTF-8')
'en_DK.UTF-8'
>>> locale.atof('123,456.789')
123.456789
>>> # Formatting a number according to the Indian lakh/crore system:
>>> locale.setlocale(locale.LC_NUMERIC, 'en_IN.UTF-8')
'en_IN.UTF-8'
>>> f'{123456.789:9.9n}'
'1,23,456.789'
The necessary locale strings may depend on your operating system, and may require additional work to enable.
To get back to how Python behaves by default, use the C locale described previously, thus: locale.setlocale(locale.LC_ALL, 'C').
Caveats
Setting the locale affects program behaviour globally, and is not thread safe. If done at all, it should normally be done just once at the beginning of the program. Again quoting from documentation:
It is generally a bad idea to call setlocale() in some library routine, since as a side effect it affects the entire program. Saving and restoring it is almost as bad: it is expensive and affects other threads that happen to run before the settings have been restored.
If, when coding a module for general use, you need a locale independent version of an operation that is affected by the locale (such as certain formats used with time.strftime()), you will have to find a way to do it without using the standard library routine. Even better is convincing yourself that using locale settings is okay. Only as a last resort should you document that your module is not compatible with non-C locale settings.
When the Python code is embedded within a C program, setting the locale can even affect the C code:
Extension modules should never call setlocale(), except to find out what the current locale is. But since the return value can only be used portably to restore it, that is not very useful (except perhaps to find out whether or not the locale is C).
(N.B: when setlocale is called with a single category argument, or with None - not an empty string - for the locale name, it does not change anything, and simply returns the name of the existing locale.)
So, this is not meant as a tool, in production code, to try out experimentally parsing or formatting data that was meant for different locales. The above examples are only examples to illustrate how the system works. For this purpose, seek a third-party internationalization library.
However, if the data is all formatted according to a specific locale, specifying that locale ahead of time will make it possible to use locale.atoi and locale.atof as drop-in replacements for int and float calls on string input.
Just remove the , with replace():
float("123,456.908".replace(',',''))
If you don't know the locale and you want to parse any kind of number, use this parseNumber(text) function (My repo). It is not perfect but take into account most cases :
>>> parseNumber("a 125,00 €")
125
>>> parseNumber("100.000,000")
100000
>>> parseNumber("100 000,000")
100000
>>> parseNumber("100,000,000")
100000000
>>> parseNumber("100 000 000")
100000000
>>> parseNumber("100.001 001")
100.001
>>> parseNumber("$.3")
0.3
>>> parseNumber(".003")
0.003
>>> parseNumber(".003 55")
0.003
>>> parseNumber("3 005")
3005
>>> parseNumber("1.190,00 €")
1190
>>> parseNumber("1190,00 €")
1190
>>> parseNumber("1,190.00 €")
1190
>>> parseNumber("$1190.00")
1190
>>> parseNumber("$1 190.99")
1190.99
>>> parseNumber("1 000 000.3")
1000000.3
>>> parseNumber("1 0002,1.2")
10002.1
>>> parseNumber("")
>>> parseNumber(None)
>>> parseNumber(1)
1
>>> parseNumber(1.1)
1.1
>>> parseNumber("rrr1,.2o")
1
>>> parseNumber("rrr ,.o")
>>> parseNumber("rrr1rrr")
1
If the input uses a comma as a decimal point and period as a thousands separator, use .replace twice to convert the data to the format used by the built-in float. Thus:
s = s.replace('.','').replace(',','.')
number = float(s)
What about this?
my_string = "123,456.908"
commas_removed = my_string.replace(',', '') # remove comma separation
my_float = float(commas_removed) # turn from string to float.
In short:
my_float = float(my_string.replace(',', ''))
Better solution for different currency formats:
def text_currency_to_float(text):
t = text
dot_pos = t.rfind('.')
comma_pos = t.rfind(',')
if comma_pos > dot_pos:
t = t.replace(".", "")
t = t.replace(",", ".")
else:
t = t.replace(",", "")
return float(t)
This attempts to detect whether commas are thousands separators and periods are decimal points, or the other way around, by checking where each appears in the string, if at all. (The premise is that thousands separators should not be used in the fractional part of the number.)
s = "123,456.908"
print float(s.replace(',', ''))
Here's a simple way I wrote up for you. :)
>>> number = '123,456,789.908'.replace(',', '') # '123456789.908'
>>> float(number)
123456789.908
You may use babel:
from babel.numbers import parse_decimal
f = float(parse_decimal("123,456.908", locale="en_US"))
In Python, what is a clean and elegant way to convert strings like "1,374" or "21,000,000" to int values like 1374 or 21000000?
It really depends where you get your number from.
If the number you are trying to convert comes from user input, use locale.atoi(). That way, the number will be parsed in a way that is consistent with the user's settings and thus expectations.
If on the other hand you read it, let's say, from a file, that always uses the same format, use int("1,234".replace(",", "")) or int("1.234".replace(".", "")) depending on your situation. This is not only easier to read and debug, but it's not affected by the user's locale setting, so your parser will work on any system.
locale.atoi(), after setting an appropriate locale.
>>> s="1,374"
>>> import locale
>>> locale.setlocale(locale.LC_NUMERIC, '')
'en_US.UTF-8'
>>> locale.atoi(s)
1374
int("1,374".replace(",",""))