How to convert variable into integer in python? like in php (int) - python

I have a variable with a value like a = "\x01" from my database. How can I convert it into an integer? I have searched the internet but had no success in finding anything.
Does anyone have an idea?
In PHP, there is a built-in way to convert it. Is there a similar function in Python?

The simple answer is to use ord().
>>> a = '\x01'
>>> ord(a)
1
But if performance is what you are looking for, then refer to @chepner's answer.

You can use the struct module for fixed-length values.
>>> a = '\x01'
>>> import struct
>>> struct.unpack("B", a)
(1,)
unpack always returns a tuple, since you can extract multiple values from a single string.
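On Python 3 the same value would normally arrive as a bytes object rather than a str. The following is a minimal sketch under that assumption; the name raw and the literal b"\x01" are illustrative stand-ins for whatever the database driver returns.

```python
import struct

raw = b"\x01"  # assumed: a single byte as read from the database

via_ord = ord(raw)                           # ord() accepts a length-1 bytes object
via_index = raw[0]                           # indexing bytes yields an int in Python 3
via_from_bytes = int.from_bytes(raw, "big")  # also works for multi-byte values
(via_struct,) = struct.unpack("B", raw)      # unpack the one-element tuple directly

print(via_ord, via_index, via_from_bytes, via_struct)
```

For a single byte, indexing is probably the shortest option; int.from_bytes and struct become more useful once the value spans several bytes.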

Related

Defining unpacking format with python's struct

I'm trying to unpack a binary file with Python's struct.unpack.
When I write struct.unpack("200i",data) it works.
But when I want to use a number of integers found in a previous operation, like this: struct.unpack("a[1]i",data), it doesn't work.
P.S.: a[1]=200
You have to convert the count to a string and build the format string from it, because names inside a string literal are not evaluated:
struct.unpack(str(a[1]) + "i",data)
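The same idea can be tried end to end with a small, made-up payload. In this sketch count stands in for a[1] from the question, and the three packed integers are sample data chosen only for illustration.

```python
import struct

count = 3  # stands in for a[1]; the question uses 200
data = struct.pack("3i", 10, 20, 30)  # sample payload, not from the question

fmt = "{}i".format(count)  # builds the format string "3i"
# calcsize reports how many bytes the format expects, a cheap sanity check
assert struct.calcsize(fmt) == len(data)

values = struct.unpack(fmt, data)
print(values)
```

Checking struct.calcsize(fmt) against len(data) before unpacking tends to give a clearer failure than the struct.error raised on a size mismatch.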

Convert a big number string 2,345,678 into its value in int or float or anything that can be manipulated later in python [duplicate]

I have a string that represents a number which uses commas to separate thousands. How can I convert this to a number in python?
>>> int("1,000,000")
Generates a ValueError.
I could replace the commas with empty strings before I try to convert it, but that feels wrong somehow. Is there a better way?
For float values, see How can I convert a string with dot and comma into a float in Python, although the techniques are essentially the same.
import locale
locale.setlocale( locale.LC_ALL, 'en_US.UTF-8' )
locale.atoi('1,000,000')
# 1000000
locale.atof('1,000,000.53')
# 1000000.53
There are several ways to parse numbers with thousands separators, and I doubt that the way described by @unutbu is the best in all cases. That's why I list other ways too.
The proper place to call setlocale() is in the __main__ module. It is a global setting and will affect the whole program and even C extensions (although note that the LC_NUMERIC setting is not set at system level, but is emulated by Python). Read the caveats in the documentation and think twice before going this way. It's probably OK in a single application, but never use it in libraries for a wide audience. You should probably avoid requesting a locale with a particular charset encoding, since it might not be available on some systems.
Use one of the third-party libraries for internationalization. For example, PyICU allows using any available locale without affecting the whole process (and even parsing numbers with particular thousands separators without using locales):
NumberFormat.createInstance(Locale('en_US')).parse("1,000,000").getLong()
Write your own parsing function if you don't want to install third-party libraries to do it the "right way". It can be as simple as int(data.replace(',', '')) when strict validation is not needed.
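When strict validation is needed, a hand-written parser can check the grouping before stripping the commas. This is only a sketch: the function name parse_grouped_int and the exact rules (optional sign, groups of exactly three digits, plain digits also accepted) are assumptions, not part of any library.

```python
import re

# Optional sign, then either properly grouped digits (1,234,567)
# or a plain run of digits with no commas at all.
_GROUPED_INT = re.compile(r"[+-]?(?:\d{1,3}(?:,\d{3})+|\d+)")


def parse_grouped_int(text):
    """Parse integers such as '1,000,000', rejecting misplaced commas."""
    text = text.strip()
    if not _GROUPED_INT.fullmatch(text):
        raise ValueError("not a comma-grouped integer: {!r}".format(text))
    return int(text.replace(",", ""))


print(parse_grouped_int("1,000,000"))
```

Unlike the bare replace(',', '') approach, this rejects inputs such as "1,00,000" instead of silently accepting them.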
Replace the commas with empty strings, and turn the resulting string into an int or a float.
>>> a = '1,000,000'
>>> int(a.replace(',' , ''))
1000000
>>> float(a.replace(',' , ''))
1000000.0
I got a locale error from the accepted answer, but the following change works here in Finland (Windows XP):
import locale
locale.setlocale( locale.LC_ALL, 'english_USA' )
print locale.atoi('1,000,000')
# 1000000
print locale.atof('1,000,000.53')
# 1000000.53
This works:
(A dirty but quick way)
>>> a='-1,234,567,89.0123'
>>> "".join(a.split(","))
'-123456789.0123'
I tried this. It goes a bit beyond the question:
You get an input. It will be converted to string first (if it is a list, for example from Beautiful soup);
then to int,
then to float.
It goes as far as it can get. In worst case, it returns everything unconverted as string.
def to_normal(soupCell):
    ''' converts a html cell from beautiful soup to text, then to int, then to float: as far as it gets.
    US thousands separators are taken into account.
    needs import locale'''
    locale.setlocale( locale.LC_ALL, 'english_USA' )
    output = unicode(soupCell.findAll(text=True)[0].string)
    try:
        return locale.atoi(output)
    except ValueError:
        try:
            return locale.atof(output)
        except ValueError:
            return output
>>> import locale
>>> locale.setlocale(locale.LC_ALL, "")
'en_US.UTF-8'
>>> print locale.atoi('1,000,000')
1000000
>>> print locale.atof('1,000,000.53')
1000000.53
This was done on Linux in the US.
A little late, but the babel library has parse_decimal and parse_number which do exactly what you want:
from babel.numbers import parse_decimal, parse_number
parse_decimal('10,3453', locale='es_ES')
# Decimal('10.3453')
parse_number('20.457', locale='es_ES')
# 20457
parse_decimal('10,3453', locale='es_MX')
# Decimal('103453')
You can also pass a Locale class instead of a string:
from babel import Locale
parse_decimal('10,3453', locale=Locale('es_MX'))
# Decimal('103453')
If you're using pandas and you're trying to parse a CSV that includes numbers with a comma for thousands separators, you can just pass the keyword argument thousands=',' like so:
df = pd.read_csv('your_file.csv', thousands=',')
Try this:
def changenum(data):
    foo = ""
    for i in list(data):
        if i == ",":
            continue
        else:
            foo += i
    return float(int(foo))

Reversing C-style format strings in Python (`%`)

Introduction and setup
Suppose I have a 'template'* string of the form,
>>> template = """My %(pet)s ate my %(object)s.
... This is a float: %(number)0.2f.
... %(integer)10d is an integer on a newline."""
With this template I can generate a new string with,
>>> d = dict(pet='dog', object='homework', number=7.7375487, integer=743898,)
>>> out_string = template % d
>>> print(out_string)
My dog ate my homework.
This is a float: 7.74.
743898 is an integer on a newline.
How nice!
Question
I'd like to apply template to out_string to produce a new dict. Something like,
>>> d_approx_copy = reverse_cstyle_template(out_string, template)
>>> print(d_approx_copy)
{pet='dog', object='homework', number=7.74, integer=743898,}
Is there a Pythonic way to do this? Does an implementation already exist?**
Notes
*: I'm not using Template because, AFAIK, they don't currently support reversing.
**: I am aware of the risks associated with the loss of precision in number (from 7.7375487 to 7.74). I can deal with that. I'm just looking for a simple way to do this.
As I was developing this question, I could not find an existing tool to reverse C-style strings this way. That is, I think the answer to this question is: the reverse_cstyle_template function I was looking for does not currently exist.
In the process of researching this topic, I found many questions/answers similar to this one that use regular expressions. However, I wanted something simpler and I did not want to have to use a different template string for formatting vs. parsing.
This eventually led me to format string syntax, and Richard Jones' parse package. For example the template above is written in format string syntax as,
>>> template = """My {pet} ate my {object}.
... This is a float: {number:0.2f}.
... {integer:10d} is an integer on a newline."""
With this template, one can use the built-in str.format to create a new string based on d,
template.format(**d)
Then use the parse package to get d_approx_copy,
>>> from parse import parse
>>> d_approx_copy = parse(template, out_string).named
Note here that I've accessed the .named attribute. This is because parse returns a Result object (defined in parse) that captures both named and fixed format specifiers. For example if one uses,
>>> template = """My {pet} {}ate my {object}.
... This is a float: {number:0.2f}.
... {integer:10d} is an integer on a newline.
... Here is another 'fixed' input: {}"""
>>> out_string = template.format('spot ', 7, **d)
>>> print(out_string)
My dog spot ate my homework.
This is a float: 7.74.
743898 is an integer on a newline.
Here is another 'fixed' input: 7
Then we can get the fixed and named data back by,
>>> data = parse(template, out_string)
>>> print(data.named)
{'pet': 'dog', 'integer': 743898, 'object': 'homework', 'number': 7.74}
>>> print(data.fixed)
('spot ', '7')
Cool, right?!
Hopefully someday this functionality will be included as a built-in either in str, or in Template. For now though parse works well for my purposes.
Lastly, I think it's important to re-emphasize the loss of precision that occurs through these steps when specifying precision in the format specifier (i.e. 7.7375487 becomes 7.74)! In general using the precision specifier is probably a bad idea except when creating 'readable' strings (e.g. for 'summary' file output) that are not meant for further processing (i.e. will never be parsed). This, of course, negates the point of this Q/A but needs to be mentioned here.
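The precision loss can be seen without any parsing library. The sketch below uses only built-ins and the number from the example above; it is meant to illustrate the point, not to reproduce what parse does internally.

```python
number = 7.7375487

text = "{:0.2f}".format(number)  # formatting rounds to two decimals
recovered = float(text)          # parsing only sees what was written out

print(text)                 # 7.74
print(recovered == number)  # False: the extra digits are gone

# repr() keeps enough digits for an exact round trip
exact = float(repr(number))
print(exact == number)      # True
```

If the formatted output is meant to be parsed again, writing the value with repr() (or without a precision specifier) avoids this loss.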

python: how to generate char by adding int

I can use 'a'+1 to get 'b' in C, so what is the convenient way to do this in Python?
I can write it like:
chr(ord('a')+1)
but I don't know whether it is the best way.
Yes, this is the best way. Python doesn't automatically convert between a character and an int the way C and C++ do.
Python doesn't actually have a character type, unlike C, so yes, chr(ord(...)) is the way to do it.
If you wanted to do it a bit more cleanly, you could do something like:
def add(c, x):
    return chr(ord(c)+x)
There is the bytearray type in Python.
It is slower than regular strings, but behaves mostly like a C string:
it is mutable, accessing individual elements yields integers in the range 0-255 instead of substrings of length 1, and you can assign to the elements. Still, it is represented as a string, and in Python 2 it can be used in most places a string can without being cast to a str object:
>>> text = bytearray("a")
>>> text
bytearray(b'a')
>>> print text
a
>>> text[0]+=1
>>> print text
b
>>> text[0]
98
>>> print "other_text" + text
other_textb
When using Python 3, to use the contents of a bytearray as a text object, simply call its decode method with an appropriate encoding such as "latin1" or "utf-8":
>>> print ("other_text" + text.decode("latin1"))
What you're doing is really the right way. Python does not conflate a character with its numerical codepoint, as C and similar languages do. The reason is that once you go beyond ASCII, the same integral value can represent different characters, depending on the encoding. C emphasizes direct access to the underlying hardware formats, but python emphasizes well-defined semantics.
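Both approaches discussed in this thread can be tried on Python 3 as follows. This is a sketch: shift_char is a hypothetical helper name, and the bytearray is built from a bytes literal because Python 3 does not accept a bare str there.

```python
def shift_char(c, offset):
    """Return the character `offset` code points after `c`."""
    return chr(ord(c) + offset)


print(shift_char("a", 1))

text = bytearray(b"a")  # Python 3 needs bytes here (or a str plus an encoding)
text[0] += 1            # elements are ints in the range 0-255
combined = "other_text" + text.decode("latin1")
print(combined)
```

The chr(ord(...)) form works on any Unicode code point, while the bytearray form is limited to single-byte values.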

Replacing one character of a string in python

In Python, are strings mutable? The line someString[3] = "a" throws the error
TypeError: 'str' object does not support item assignment
I can see why (as I could have written someString[3] = "test" and that would obviously be illegal) but is there a method to do this in python?
Python strings are immutable, which means that they do not support item or slice assignment. You'll have to build a new string using, e.g., someString[:3] + 'a' + someString[4:] or some other suitable approach.
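The slicing approach can be wrapped in a small helper. This is a minimal sketch; replace_at is a hypothetical name and the sample string is chosen only for illustration.

```python
def replace_at(s, index, new_char):
    """Return a copy of `s` with the character at `index` replaced."""
    return s[:index] + new_char + s[index + 1:]


original = "foobar"
changed = replace_at(original, 3, "f")

print(changed)   # foofar
print(original)  # foobar (the original string is untouched)
```

Because a new string is built each time, this suits occasional edits; for many edits in a row a list of characters is likely the better fit.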
Instead of storing your value as a string, you could use a list of characters:
>>> l = list('foobar')
>>> l[3] = 'f'
>>> l[5] = 'n'
Then if you want to convert it back to a string to display it, use this:
>>> ''.join(l)
'foofan'
If you are changing a lot of characters one at a time, this method will be considerably faster than building a new string each time you change a character.
In new enough pythons you can also use the builtin bytearray type, which is mutable. See the stdlib documentation. But "new enough" here means 2.6 or up, so that's not necessarily an option.
In older pythons you have to create a fresh str as mentioned above, since those are immutable. That's usually the most readable approach, but sometimes using a different kind of mutable sequence (like a list of characters, or possibly an array.array) makes sense. array.array is a bit clunky though, and usually avoided.
>>> import ctypes
>>> s = "1234567890"
>>> mutable = ctypes.create_string_buffer(s)
>>> mutable[3] = "a"
>>> print mutable.value
123a567890
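Use this (note that str.replace substitutes every occurrence of the character found at index 3, not only the one at that position):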
someString.replace(str(list(someString)[3]),"a")
Just define a new string equal to the result of the operation you want to apply to your current string.
def remove_char(s, n):
    # Note: this removes every occurrence of the character s[n].
    return s.replace(s[n], "")
