So, I have a float value, e.g. -1.0f. How can I write it to a file in hexadecimal format in Python? I mean that if we open the file in Notepad, we won't see the hexadecimal values, just the ASCII code.
In Python 3:
>>> import struct
>>> "".join("{0:02X}".format(b) for b in struct.pack(">f", -1.0))
'BF800000'
In Python 2:
>>> import struct
>>> "".join("{0:02X}".format(ord(b)) for b in struct.pack(">f", -1.0))
'BF800000'
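To get those hex digits into a file, one option is to write the resulting string in text mode. The following is a minimal Python 3 sketch; the helper name `float_to_hex` and the file name `float.txt` are assumptions made for illustration, not part of the answer above.

```python
import os
import struct
import tempfile

def float_to_hex(value):
    """Return the big-endian IEEE 754 single-precision bytes of value as hex text."""
    return struct.pack(">f", value).hex().upper()

# Write the hex digits as plain text (the file name is a placeholder).
path = os.path.join(tempfile.gettempdir(), "float.txt")
with open(path, "w") as f:
    f.write(float_to_hex(-1.0))

# Read it back: hex text -> raw bytes -> float.
with open(path) as f:
    restored = struct.unpack(">f", bytes.fromhex(f.read()))[0]

print(float_to_hex(-1.0), restored)
```

Opening the file in a text editor should then show the eight hex characters rather than four raw bytes.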
I have Unicode Code Point of an emoticon represented as U+1F498:
emoticon = u'\U0001f498'
I would like to get utf-16 decimal groups of this character, which according to this website are 55357 and 56472.
I tried print emoticon.encode("utf16"), but it did not help because it gives some other characters.
Decoding from UTF-8 before encoding to UTF-16, as in print str(int("0001F498", 16)).decode("utf-8").encode("utf16"), does not help either.
How do I correctly get the utf-16 decimal groups of a unicode character?
You can encode the character with the utf-16 encoding, and then convert every 2 bytes of the encoded data to integers with int.from_bytes (or struct.unpack in python 2).
Python 3
def utf16_decimals(char, chunk_size=2):
    # encode the character as big-endian utf-16
    encoded_char = char.encode('utf-16-be')
    # convert every `chunk_size` bytes to an integer
    decimals = []
    for i in range(0, len(encoded_char), chunk_size):
        chunk = encoded_char[i:i+chunk_size]
        decimals.append(int.from_bytes(chunk, 'big'))
    return decimals
Python 2 + Python 3
import struct
def utf16_decimals(char):
    # encode the character as big-endian utf-16
    encoded_char = char.encode('utf-16-be')
    # convert every 2 bytes to an integer
    decimals = []
    for i in range(0, len(encoded_char), 2):
        chunk = encoded_char[i:i+2]
        decimals.append(struct.unpack('>H', chunk)[0])
    return decimals
Result:
>>> utf16_decimals(u'\U0001f498')
[55357, 56472]
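The same two numbers can also be derived arithmetically from the code point, without going through an encoder. This is a small Python 3 sketch of the standard UTF-16 surrogate-pair calculation; the function name `surrogate_pair` is hypothetical.

```python
def surrogate_pair(code_point):
    """Return the UTF-16 code units for a single Unicode code point."""
    if code_point < 0x10000:
        # Characters in the Basic Multilingual Plane need one code unit.
        return (code_point,)
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return (high, low)

print(surrogate_pair(0x1F498))  # (55357, 56472)
```

This version does no validation (for example, it does not reject code points above 0x10FFFF), so it is only meant to show where the numbers come from.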
In a Python 2 "narrow" build, it is as simple as:
>>> emoticon = u'\U0001f498'
>>> map(ord,emoticon)
[55357, 56472]
This works in Python 2 (narrow and wide builds) and Python 3:
from __future__ import print_function
import struct
emoticon = u'\U0001f498'
print(struct.unpack('<2H',emoticon.encode('utf-16le')))
Output:
(55357, 56472)
This is a more general solution that prints the UTF-16 code units for a string of any length:
from __future__ import print_function,division
import struct
def utf16words(s):
    encoded = s.encode('utf-16le')
    num_words = len(encoded) // 2
    return struct.unpack('<{}H'.format(num_words),encoded)
emoticon = u'ABC\U0001f498'
print(utf16words(emoticon))
Output:
(65, 66, 67, 55357, 56472)
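A likely reason the plain "utf16" codec in the question produced unexpected output is that it prepends a byte order mark (BOM) and uses the machine's native byte order, whereas the "utf-16le" and "utf-16-be" codecs used in the answers write no BOM. A short Python 3 sketch to illustrate this (variable names are illustrative only):

```python
import codecs

emoticon = '\U0001f498'

with_bom = emoticon.encode('utf-16')     # BOM + native byte order
no_bom = emoticon.encode('utf-16-le')    # explicit byte order, no BOM

# The first two bytes of the generic codec's output are the BOM.
has_bom = with_bom[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)

print(len(with_bom), len(no_bom), has_bom)  # 6 4 True
```

Choosing an explicit byte order therefore leaves exactly two 16-bit code units for this character, which is what struct.unpack expects.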
I tried to get the CRC32 of a string variable, but I am getting the following error.
>>> message='hello world!'
>>> import binascii
>>> binascii.crc32(message)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: a bytes-like object is required, not 'str'
For a literal it can be done with binascii.crc32(b'hello world!'), but I would like to know how to do this for a variable of type str.
When you compute the crc32 of some data, you need to know the exact bytes you are hashing. The same string maps to different byte sequences in different encodings, so passing a string as the parameter is ambiguous.
When using binascii.crc32(b'hello world!'), the b'...' literal is already a bytes object: each ASCII character in the literal maps directly to one byte.
To convert any string, you can use:
import binascii
text = 'hello'
binascii.crc32(text.encode('utf8'))
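To make the point about encodings concrete, here is a minimal Python 3 sketch. The helper name `crc32_of_text` and the sample string are assumptions for illustration.

```python
import binascii

def crc32_of_text(text, encoding="utf-8"):
    """Encode text with the given encoding, then compute CRC32 of the bytes."""
    return binascii.crc32(text.encode(encoding))

sample = "h\u00e9llo"  # 'hello' with an accented e

# The same string becomes different bytes depending on the encoding,
# so the checksums will almost certainly differ as well.
utf8_bytes = sample.encode("utf-8")      # accented e takes two bytes
latin1_bytes = sample.encode("latin-1")  # accented e takes one byte

print(crc32_of_text("hello"))
print(crc32_of_text(sample, "utf-8"), crc32_of_text(sample, "latin-1"))
```

For pure ASCII text the choice between UTF-8 and Latin-1 makes no difference, but both sides still need to agree on an encoding as soon as other characters appear.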
This can be done using binascii.crc32 or zlib.crc32. This answer improves upon the prior answer by Tomas by documenting both modules and by producing a string output besides just an integer.
# Define data
> text = "hello"
> data = text.encode()
> data
b'hello'
# Using binascii
> import binascii
> crc32 = binascii.crc32(data)
> crc32
907060870
> hex(crc32)
'0x3610a686'
> f'{crc32:#010x}'
'0x3610a686'
# Using zlib
> import zlib
> zlib.crc32(data)
907060870 # Works the same as binascii.crc32.
If you don't want the string output to have the 0x prefix:
> import base64
> crc32 = 907060870
> digest = crc32.to_bytes(4, 'big')
> digest
b'6\x10\xa6\x86'
> base64.b16encode(digest).decode().lower()
'3610a686'
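As a possible shortcut, the same prefix-free string can be produced with a format specification or with bytes.hex() (Python 3.5+), without importing base64. A sketch using the value from above:

```python
crc32 = 907060870

# Zero-padded to 8 lowercase hex digits, no '0x' prefix.
via_format = format(crc32, '08x')

# Equivalent route through the 4-byte big-endian digest.
via_bytes = crc32.to_bytes(4, 'big').hex()

print(via_format, via_bytes)  # 3610a686 3610a686
```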
I am trying to convert a hex string to shellcode format.
For example, I have a file containing a hex string like aabbccddeeff11223344,
and I want to convert it with Python to show this exact format:
"\xaa\xbb\xcc\xdd\xee\xff\x11\x22\x33\x44", including the quotes "".
My code is:
with open("file","r") as f:
    a = f.read()
b = "\\x".join(a[i:i+2] for i in range(0, len(a), 2))
print b
so my output is aa\xbb\xcc\xdd\xee\xff\x11\x22\x33\x44\x.
I understand I can do it via sed command but I wonder how I may accomplish this through python.
The binascii standard module will help here:
import binascii
print repr(binascii.unhexlify("aabbccddeeff11223344"))
Output:
>>> print repr(binascii.unhexlify("aabbccddeeff11223344"))
'\xaa\xbb\xcc\xdd\xee\xff\x11"3D'
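Note that the repr() output above escapes only non-printable bytes, so 0x22, 0x33 and 0x44 show up as the characters "3D rather than as \x22\x33\x44. If every byte has to appear in \x form, one option is to format each byte explicitly. The following Python 3 sketch assumes the input is a string of hex digits, possibly with a trailing newline; the function name `to_shellcode_literal` is hypothetical.

```python
def to_shellcode_literal(hex_string):
    """Turn 'aabb...' into the text "\\xaa\\xbb..." including the double quotes."""
    # bytes.fromhex validates the digits and rejects odd-length input.
    raw = bytes.fromhex(hex_string.strip())
    escaped = ''.join('\\x{:02x}'.format(byte) for byte in raw)
    return '"' + escaped + '"'

print(to_shellcode_literal("aabbccddeeff11223344\n"))
# "\xaa\xbb\xcc\xdd\xee\xff\x11\x22\x33\x44"
```

Stripping the input first also avoids the stray trailing \x seen in the question, which most likely came from a newline at the end of the file.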
I am reading four bytes from file
I would like to join them
g = f.read(60)
f.seek (60)
k60 =f.read(1)
print('byte60',k60)
k61 =f.read(1)
print('byte61',k61)
k62 =f.read(1)
print('byte62',k62)
k63 =f.read(1)
print('byte63',k63)
print(k63,k62,k61,k60)
print (b''.join([k63,k62,k61,k60]))
Result is:
b'\x00\x00\x00\x80'
I would like to receive:
00000080
To convert a byte string to its hex representation, you can use the hexlify() function from the binascii module:
>>> from binascii import hexlify
>>> ...
>>> raw = b''.join([k63,k62,k61,k60])
>>> print(hexlify(raw))
b'00000080'
>>> print(hexlify(raw).decode('ascii')) # if you want to convert it to a string
00000080
The same could be accomplished by using codecs.encode(raw, 'hex').
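In Python 3.5 and later, bytes objects also have a hex() method, so the four bytes can be read in one call and reversed with a slice. The sketch below uses io.BytesIO as a stand-in for the real file; the sample contents are made up to match the result shown in the question.

```python
import io

# Stand-in for the real file: 60 filler bytes, then the 4 bytes of interest.
f = io.BytesIO(b'\x00' * 60 + b'\x80\x00\x00\x00')

f.seek(60)
chunk = f.read(4)          # bytes 60..63 in file order
reversed_chunk = chunk[::-1]   # byte 63 first, as in the question

hex_text = reversed_chunk.hex()
print(hex_text)  # 00000080
```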
Suppose I have a number like 824 and I write it to a text file using Python. In the text file, it will take 3 bytes of space. However, if I represent it using bits, it has the representation 0000001100111000, which is 2 bytes (16 bits). How can I write bits to a file in Python, not bytes? If I can do that, the size of the file will be 2 bytes, not 3.
Please provide code. I am using Python 2.6. Also, I do not want to use any external modules that do not come with the basic installation.
I tried the code below and it gave me 12 bytes!
a =824;
c=bin(a)
handle = open('try1.txt','wb')
handle.write(c)
handle.close()
The struct module is what you want. From your example, 824 = 0000001100111000 binary or 0338 hexadecimal. This is the two bytes 03H and 38H. struct.pack will convert 824 to a string of these two bytes, but you also have to decide little-endian (write the 38H first) or big-endian (write the 03H first).
Example
>>> import struct
>>> struct.pack('>H',824) # big-endian
'\x038'
>>> struct.pack('<H',824) # little-endian
'8\x03'
>>> struct.pack('H',824) # Use system default
'8\x03'
struct returns a two-byte string. The '\x##' notation means a byte with hexadecimal value ##. The '8' is an ASCII '8' (value 38H). Python byte strings use ASCII for printable characters and \x## notation for unprintable characters.
Below is an example writing and reading binary data to a file. You should always specify the endian-ness when writing to and reading from a binary file, in case it is read on a system with a different endian default:
import struct
a = 824
bin_data = struct.pack('<H',a)
print 'bin_data length:',len(bin_data)
with open('data.bin','wb') as f:
    f.write(bin_data)
with open('data.bin','rb') as f:
    bin_data = f.read()
print 'Value from file:',struct.unpack('<H',bin_data)[0]
print 'bin_data representation:',repr(bin_data)
for i,c in enumerate(bin_data):
    print 'Byte {0} as binary: {1:08b}'.format(i,ord(c))
Output
bin_data length: 2
Value from file: 824
bin_data representation: '8\x03'
Byte 0 as binary: 00111000
Byte 1 as binary: 00000011
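The example above targets Python 2, as requested in the question. For readers on Python 3, a roughly equivalent sketch is shown below; there, iterating over a bytes object already yields integers, so ord() is not needed. The temporary file path is an assumption for illustration.

```python
import os
import struct
import tempfile

value = 824
packed = struct.pack('<H', value)   # 2 bytes, little-endian unsigned short

path = os.path.join(tempfile.gettempdir(), 'data.bin')
with open(path, 'wb') as f:
    f.write(packed)

with open(path, 'rb') as f:
    raw = f.read()

restored = struct.unpack('<H', raw)[0]
bits = ['{:08b}'.format(byte) for byte in raw]  # bytes iterate as ints

print(os.path.getsize(path), restored, bits)
```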
Have a look at struct:
>>> struct.pack("h", 824)
'8\x03'
I think what you want is to open the file in binary mode:
open("file.bla", "wb")
However, if the value is written as a plain integer, it will probably take 4 bytes. I do not know if Python has a 2-byte integer type, but you can work around that by encoding two 16-bit numbers in one 32-bit number:
a = 824
b = 1234
c = (a << 16) + b
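To recover the two values, the shift is reversed and the low 16 bits are masked off. The Python 3 sketch below also shows that packing the combined value as one big-endian 32-bit integer yields the same four bytes as packing the two 16-bit values side by side with struct; variable names other than a, b and c are illustrative.

```python
import struct

a = 824
b = 1234
c = (a << 16) + b          # both values must fit in 16 bits (0..65535)

high = c >> 16             # recovers a
low = c & 0xFFFF           # recovers b

as_one = struct.pack('>I', c)        # one 32-bit unsigned integer
as_two = struct.pack('>HH', a, b)    # two 16-bit unsigned integers

print(high, low, as_one == as_two, len(as_one))  # 824 1234 True 4
```

Either way the pair occupies 4 bytes on disk, i.e. 2 bytes per number.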