Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 6 years ago.
I am encountering the following output and I cannot really understand it.
Could you please advise what it is exactly? How to unpack it?
'#\x01\x01\x00'
It does not look to be purely binary or hexadecimal.
I would like to see the ASCII representation of it.
You have a string of bytes; when you print it you see its ASCII representation:
In [5]: s = b'#\x01\x01\x00'
In [8]: print(list(bytearray(s)))
[35, 1, 1, 0]
If you call chr on each of the ints you will see exactly the same output: 35 in ASCII is #, 1 is SOH and 0 is NUL. Without more information, such as where the data came from, there is not much else that can be suggested.
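For readers on Python 3, where byte strings and text are separate types, the same inspection might look like the sketch below. The variable names are illustrative only.

```python
# The four bytes from the question, written as a bytes literal.
data = b'#\x01\x01\x00'

# Iterating over a bytes object in Python 3 yields integers directly.
values = list(data)

# chr() maps each integer back to its one-character string.
chars = [chr(v) for v in values]

print(values)
print(chars)
```

Printing `chars` shows the control characters in escaped form, which is the same notation seen in the original output.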
This seems to be a sequence of four bytes with the values 35, 1, 1, 0.
To interpret it, you need to know how it was encoded or what it is supposed to represent.
Generally, you can unpack binary data in Python with the unpack function in the struct module:
import struct
intval = struct.unpack('<i', b'#\x01\x01\x00')
shortvals = struct.unpack('<hh', b'#\x01\x01\x00')
The first unpack line gives you the bytes interpreted as one 4-byte little-endian integer, which is the number 65827. The second one interprets them as two 2-byte integers (291 and 1). Note that unpack always returns a tuple, so intval is (65827,) and shortvals is (291, 1).
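Because the result depends on byte order and field width, it can help to spell both out in the format string. A small sketch, assuming the data could be either little-endian or big-endian (the question does not say which):

```python
import struct

data = b'#\x01\x01\x00'

# '<' forces little-endian with standard sizes, '>' forces big-endian.
little = struct.unpack('<i', data)[0]
big = struct.unpack('>i', data)[0]

# Four unsigned bytes: this reading makes no byte-order assumption at all.
as_bytes = struct.unpack('4B', data)

print(little, big, as_bytes)
```

The two integer readings differ widely, which is why knowing the source format matters before trusting either one.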
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 2 years ago.
I have a string that is already encoded with UTF-8 (e.g. "No\xf0\x9f\x92\x80"). I would like to decode it so it becomes No💀. However, when I use .decode('utf-8') it says decode is not a method of str.
The string is from a txt file that I am reading with pandas.
If the length is 6, that doesn't quite make sense if you read the file with encoding='utf8'. It should have decoded the UTF-8 bytes correctly, but this would fix it if it is really what you have:
>>> s='No\xf0\x9f\x92\x80'
>>> len(s)
6
>>> s.encode('latin1').decode('utf8')
'No💀'
Instead, if you have literal backslashes and numbers in the string, this would work:
>>> s=r'No\xf0\x9f\x92\x80'
>>> s
'No\\xf0\\x9f\\x92\\x80'
>>> len(s)
18
>>> s.encode('latin1').decode('unicode-escape').encode('latin1').decode('utf8')
'No💀'
unicode-escape translates escape codes to Unicode code points, but only works on byte strings. .encode('latin1') translates Unicode code points 1:1 to their byte equivalents (this works only for U+0000 to U+00FF, of course).
The code above translates a str to bytes, decodes the escapes, converts to bytes again, and decodes correctly as UTF-8.
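The two repairs can be wrapped in small helpers. This is only a sketch: the function names fix_mojibake and fix_literal_escapes are made up for illustration, and both assume every character in the input is in the range U+0000 to U+00FF.

```python
def fix_mojibake(text):
    """Re-interpret code points U+0000..U+00FF as UTF-8 bytes."""
    return text.encode('latin1').decode('utf8')


def fix_literal_escapes(text):
    """Decode literal backslash escapes, then repair as above."""
    raw = text.encode('latin1').decode('unicode-escape')
    return fix_mojibake(raw)


first = fix_mojibake('No\xf0\x9f\x92\x80')
second = fix_literal_escapes(r'No\xf0\x9f\x92\x80')

# ascii() keeps the output printable on any console encoding.
print(ascii(first), ascii(second))
```

Either helper raises UnicodeEncodeError if the input contains characters above U+00FF, which is a useful signal that the string was not damaged in the way assumed here.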
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
As a French user of Python 2.7, I'm trying to properly print strings containing accents such as "é", "è", "à", etc. in the Python console.
I already know the trick of putting u before the literal value of a string, such as:
print(u'Université')
which properly prints the last character.
Now, my question is: how can I do the same for a string that is stored as a variable?
Indeed, I know that I could do the following:
mystring = u'Université'
print(mystring)
but the problem is that the value of mystring is bound to be passed into a SQL query (using psycopg2), and therefore I can't afford to store the u inside the value of mystring.
So how could I do something like "print the unicode value of mystring"?
The u sigil is not part of the value, it's just a type indicator. To convert a string into a Unicode string, you need to know the encoding.
unicodestring = mystring.decode('utf-8') # or 'latin-1' or ... whatever
and to print it you typically (in Python 2) need to convert back to whatever the system accepts on the output filehandle:
print(unicodestring.encode('utf-8')) # or 'latin-1' or ... whatever
Python 3 clarifies (though not directly simplifies) the situation by keeping Unicode strings and (what is now called) bytes objects separate.
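As a rough illustration of that separation in Python 3, the sketch below decodes bytes into text and encodes text back into bytes. The UTF-8 encoding and the sample bytes are assumptions chosen for the example.

```python
# Bytes as they might arrive from a file or database driver (UTF-8 assumed).
raw = b'Universit\xc3\xa9'

# Decoding turns bytes into a str, which holds Unicode text.
text = raw.decode('utf-8')

# Encoding goes the other way, for output channels that need bytes.
encoded = text.encode('utf-8')

# The byte string is one element longer because the accented letter takes two bytes.
print(len(raw), len(text))
```

In Python 3, print(text) encodes for the console automatically, so the explicit encode step is usually needed only when writing to binary streams.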
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
This question is not about the usage of the int function, but rather about how it works internally.
Because the source code is in C, I don't understand what is going on there.
Maybe someone can explain how Python converts the string "123" to the integer 123.
What operations are performed for it?
https://github.com/python/cpython/blob/2d305e1c46abfcd609bf8b2dff8d2065e6af8ab2/Objects/longobject.c#L2075-L2366 contains the implementation you're looking for. While understanding the C is useful, there is a large comment in the middle (starting on line 2132) that explains much of the approach.
When converting a Python string to an int, e.g. a = int("123", 10) (convert the string "123" to an integer in base 10), a C function is called.
First, it checks that the given base is 0 or between 2 and 36 inclusive, and raises an error otherwise.
Next, it skips all leading whitespace (so that " 123" is treated like "123")
and checks whether the number is marked as positive ('+') or negative ('-').
When the base is 0, it checks whether the string starts with '0x', '0o' or '0b' and sets the base accordingly (hexadecimal, octal, binary); otherwise the base is decimal. In Python 3, a nonzero decimal literal with a leading '0', such as "010", is rejected in this mode.
Note that if no base is given, the default base is 10 (decimal).
It then turns the character array into a number, using the algorithm described in the code comment at the link posted by Paul Kehrer.
Trailing whitespace is also ignored, and errors are raised where needed, for example if there is a space in the middle of the string followed by more digits, or if there is a character that is not a valid digit.
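The steps above can be modelled in pure Python. This is a simplified teaching sketch, not CPython's implementation: the name parse_int is hypothetical, it uses the plain multiply-and-add loop rather than the faster approach described in the C comment, and it skips some rules such as underscore separators and the leading-zero check for base 0.

```python
import string

DIGITS = string.digits + string.ascii_lowercase  # '0'..'9', then 'a'..'z'


def parse_int(text, base=10):
    """Simplified, illustrative model of int(text, base)."""
    if base != 0 and not 2 <= base <= 36:
        raise ValueError("base must be 0 or in 2..36")
    s = text.strip().lower()            # drop leading/trailing whitespace
    sign = 1
    if s[:1] in ('+', '-'):             # optional sign
        sign = -1 if s[0] == '-' else 1
        s = s[1:]
    prefixes = {'0x': 16, '0o': 8, '0b': 2}
    if base == 0:                       # infer the base from the prefix
        base = prefixes.get(s[:2], 10)
        if base != 10:
            s = s[2:]
    elif prefixes.get(s[:2]) == base:   # a matching explicit prefix is allowed
        s = s[2:]
    if not s:
        raise ValueError("no digits")
    value = 0
    for ch in s:
        digit = DIGITS.find(ch)
        if digit < 0 or digit >= base:
            raise ValueError("invalid digit %r" % ch)
        value = value * base + digit    # shift one place, then add the digit
    return sign * value


print(parse_int("123"), parse_int("  -0x1F ", 0), parse_int("ff", 16))
```

For "123" in base 10 the loop computes ((0 * 10 + 1) * 10 + 2) * 10 + 3, which is the core idea behind the conversion.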
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
Basically what I'm asking is: what's the most direct way to convert any integer between 0 and 255 into its hexadecimal, escaped equivalent? I mean one that will function correctly if wrapped in a write() function (which means '\x56' writes 'V' and not literally '\x56').
That's what the chr function is for.
f.write(chr(0x56))
Talking about hexadecimal escaped equivalents isn't really relevant in this context: every character has a hexadecimal escape, but when a string is displayed, any character that can be shown as a single printable character is simply output as that character.
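The answer above was written with Python 2 in mind. In Python 3 the choice depends on whether the stream is text or binary; the sketch below uses in-memory streams as stand-ins for real files, which is an assumption made to keep the example self-contained.

```python
import io

value = 0x56

# Text stream: chr() gives a one-character str.
text_stream = io.StringIO()
text_stream.write(chr(value))

# Binary stream (Python 3): build a single byte from the integer.
binary_stream = io.BytesIO()
binary_stream.write(bytes([value]))

print(text_stream.getvalue(), binary_stream.getvalue())
```

Note that bytes(value) without the list would create 86 zero bytes in Python 3, so the square brackets matter.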
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question appears to be off-topic because it lacks sufficient information to diagnose the problem. Describe your problem in more detail or include a minimal example in the question itself.
Closed 8 years ago.
I have a file in .ktx format, which I have opened in 'rb' mode. I want to modify particular bytes in that file. I am reading bytes with read(4) calls (I want to read a number that is 4 bytes long) and converting each chunk into a number. What I want is to increase that number by a specific amount and insert it back into the file stream. Is there any function in Python which converts a byte string to an integer? I tried int() but it prints some binary data.
My code:
bytes=file.read(4)
for char in bytes:
    print(hex(ord(char)))
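import struct
bytes = file.read(4)
# "<i" reads a 4-byte little-endian signed integer; the byte order is an assumption, so check the file format.
# Native "l" can be 8 bytes on some platforms. unpack returns a tuple, hence the [0].
bytesAsInt = struct.unpack("<i", bytes)[0]
do_something_with_int(bytesAsInt)
I think this might be what you are looking for, though it's hard to tell from the question.
Here are the docs on the struct module: https://docs.python.org/3/library/struct.html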
Try this: "How can I convert a character to an integer in Python, and vice versa?"
Here is a suggested workflow for what you seem to want to do:
Read the data
Convert the data to integer
Add X to the integer, where X is the value you want to increase by
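One way to sketch this workflow in Python 3, including writing the result back as the question asks, is shown below. The helper name add_to_uint32, the offset, the increment, and the little-endian unsigned 32-bit format are all assumptions for illustration; an in-memory stream stands in for the real file.

```python
import io
import struct


def add_to_uint32(stream, offset, increment, fmt='<I'):
    """Read a 4-byte integer at offset, add increment, write it back."""
    stream.seek(offset)
    (value,) = struct.unpack(fmt, stream.read(4))
    stream.seek(offset)                  # rewind to overwrite in place
    stream.write(struct.pack(fmt, value + increment))
    return value + increment


# In-memory stand-in for a file opened with open(path, 'r+b').
stream = io.BytesIO(b'\x00\x00\x00\x00\x10\x00\x00\x00')
new_value = add_to_uint32(stream, offset=4, increment=5)
print(new_value, stream.getvalue())
```

With a real file, the mode would need to be 'r+b' rather than 'rb', since 'rb' does not allow writing. struct.pack raises struct.error if the new value no longer fits in four bytes.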