Python3 ASCII Hexadecimal to Binary String Conversion - python

I'm using Python 3.2.3 on Windows, and am trying to convert binary data within a C-style ASCII file into its binary equivalent for later parsing using the struct module. For example, my input file contains "0x000A 0x000B 0x000C 0x000D", and I'd like to convert it into "\x00\x0a\x00\x0b\x00\x0c\x00\x0d".
The problem I'm running into is that the string datatypes have changed in Python 3, and the built-in functions to convert from hexadecimal to binary, such as binascii.unhexlify(), no longer accept regular unicode strings, but only byte strings. This process of converting from unicode strings to byte strings and back is confusing me, so I'm wondering if there's an easier way to achieve this. Below is what I have so far:
with open(path, "r") as f:
    l = []
    data = f.read()
    values = data.split(" ")
    for v in values:
        if (v.startswith("0x")):
            l.append(binascii.unhexlify(bytes(v[2:], "utf-8")).decode("utf-8"))
    string = ''.join(l)

3>> ''.join(chr(int(x, 16)) for x in "0x000A 0x000B 0x000C 0x000D".split()).encode('utf-16be')
b'\x00\n\x00\x0b\x00\x0c\x00\r'

As agf says, opening the file with mode 'r' will give you string data.
Since the only thing you are doing here is looking at binary data, you probably want to open with 'rb' mode and make your result of type bytes, not str.
Something like:
import binascii

with open(path, "rb") as f:
    l = []
    data = f.read()
    values = data.split(b" ")
    for v in values:
        if (v.startswith(b"0x")):
            l.append(binascii.unhexlify(v[2:]))
    result = b''.join(l)
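As a minimal sketch of the same conversion without binascii, assuming Python 3 and that every token holds an even number of hex digits: bytes.fromhex() accepts an ordinary str, so no str/bytes round trip is needed. The function name hex_tokens_to_bytes and the struct format ">4H" (four big-endian unsigned 16-bit values) are assumptions made for illustration, not something the question specifies.

```python
import struct


def hex_tokens_to_bytes(text):
    """Convert whitespace-separated tokens such as '0x000A' into raw bytes."""
    # Keep only tokens with a 0x prefix and drop that prefix.
    digits = "".join(
        tok[2:] for tok in text.split() if tok.lower().startswith("0x")
    )
    # bytes.fromhex() raises ValueError on an odd number of hex digits.
    return bytes.fromhex(digits)


raw = hex_tokens_to_bytes("0x000A 0x000B 0x000C 0x000D")

# Assumption: the data is a sequence of big-endian unsigned 16-bit values.
values = struct.unpack(">4H", raw)
print(raw, values)
```

Using split() with no argument also discards a trailing newline, which split(" ") would leave attached to the last token.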

Related

I need to read a file bit by bit in python [duplicate]

I would like an output like 01100011010... or [False,False,True,True,False,...] from a file to then create an encrypted file.
I've already tried byte = file.read(1), but I don't know how to then convert it to bits.
You can read a file in binary mode this way:
with open(r'path\to\file.ext', 'rb') as binary_file:
    bits = binary_file.read()
The 'rb' option stands for Read Binary.
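Iterating over the result yields one integer (0 to 255) per byte, so for the output you asked for, format each byte as eight binary digits:
[print(format(byte, '08b'), end='') for byte in bits]
which is the list-comprehension equivalent of:
for byte in bits:
    print(format(byte, '08b'), end='')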
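Since iterating over a file read in 'rb' mode gives you integers rather than binary digits, you can use the built-in function format (or bin, which adds a '0b' prefix and drops leading zeros) to convert them:
with open(r'D:\help\help.txt', 'rb') as file:
    for char in file.read():
        print(format(char, '08b'), end='')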
Suppose we have a text file, text.txt:
# Read the content:
with open("text.txt") as file:
    content = file.read()
# Convert the string to binary using join() + bytearray() + format()
res = ''.join(format(i, '08b') for i in bytearray(content, encoding='utf-8'))
# Print the result
print("The string after binary conversion : " + res)
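Both output shapes asked for in the question (a string of 0s and 1s, or a list of booleans) can be sketched as follows, assuming Python 3 and most-significant-bit-first order. The function names are made up for illustration.

```python
def bytes_to_bitstring(data):
    """Return the bits of a bytes object as a string of '0' and '1'."""
    # format(byte, "08b") pads each byte to exactly eight binary digits.
    return "".join(format(byte, "08b") for byte in data)


def bytes_to_bools(data):
    """Return the bits of a bytes object as a list of booleans, MSB first."""
    return [
        bool((byte >> shift) & 1)
        for byte in data
        for shift in range(7, -1, -1)
    ]


print(bytes_to_bitstring(b"c"))
print(bytes_to_bools(b"c"))
```

To apply this to a file, pass in the result of open(path, 'rb').read().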

Python Opening UTF-16 file read each byte

I am attempting to parse a file that I believe is UTF-16 encoded (the file magic is 0xFEFF), and I can open the file just as I want with:
f = open(file, 'rb')
But when for example I do
print f.read(40)
it prints the actual unicode strings of the file where I would like to access the hexadecimal data and read that byte-by-byte. This may be a stupid question but I haven't been able to find out how to do this.
Also, as a follow up question. Once I get this working, I would like to parse the file looking for a specific set of bytes, in this case:
0x00 00 00 43 00 00 00
And after that pattern is found, begin parsing an entry. What's the best way to accomplish this? I was thinking about using a generator to walk through each byte, and once this pattern shows up, yield the bytes until the next instance of that pattern? Is there a more efficient way to do this?
EDIT: I am using Python 2.7
Shouldn't you just be able to do this?
>>> string = 'string'
>>> hex(ord(string[1]))
'0x74'
hexString = ''
with open(filename, 'rb') as f:
    while True:
        #char = f.read(1)
        chars = f.read(40)
        if not chars:
            break
        hexString += ''.join(hex(ord(char)) for char in chars)
If you want a string of hexadecimal, you can pass it through binascii.hexlify():
import binascii
with open(filename, 'rb') as f:
    raw = f.read(40)
hexadecimal = binascii.hexlify(raw)
print(hexadecimal)
(This also works without modification on Python 3)
If you need the numerical value of each byte, you can call ord() on each element, or equivalently, map() the function over the string:
with open(filename, 'rb') as f:
    raw = f.read(40)
byte_list = map(ord, raw)
print byte_list
(This doesn't work on Python 3, but on 3.x, you can just iterate over raw directly)
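For the follow-up about locating the marker 0x00 00 00 43 00 00 00, one possible approach is to read the data into memory and let find() do the scanning instead of walking byte by byte in Python. This is only a sketch: it is written for Python 3, it assumes the file fits in memory, and iter_entries is a hypothetical helper name. It also searches for non-overlapping markers only.

```python
MARKER = b"\x00\x00\x00\x43\x00\x00\x00"


def iter_entries(data, marker=MARKER):
    """Yield the bytes after each marker, up to the next marker or the end."""
    start = data.find(marker)
    while start != -1:
        body_start = start + len(marker)
        # Look for the next marker after the current entry's first byte.
        next_start = data.find(marker, body_start)
        end = next_start if next_start != -1 else len(data)
        yield data[body_start:end]
        start = next_start


sample = b"junk" + MARKER + b"abc" + MARKER + b"de"
for entry in iter_entries(sample):
    print(entry)
```

With a real file, data would come from open(filename, 'rb').read(); anything before the first marker is skipped.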

python image (.jpeg) to hex code

I operate with a thermal printer, this printer is able to print images, but it needs to get the data in hex format. For this I would need to have a python function to read an image and return a value containing the image data in hex format.
I currently use this format to send hex data to the printer:
content = b"\x1B\x4E"
Which is the simplest way to do so using Python2.7?
All the best;
I don't really know what you mean by "hex format", but if it needs to get the whole file as a sequence of bytes you can do:
with open("image.jpeg", "rb") as fp:
    img = fp.read()
If your printer expects the image in some other format (like 8-bit values for every pixel), then try using the Pillow library; it has many image manipulation functions and handles a wide range of input and output formats.
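It may help to see that the \x escapes in a literal like b"\x1B\x4E" are only notation: the object is a plain sequence of bytes, the same kind of object that read() returns in 'rb' mode. The sketch below assumes Python 3 (in Python 2.7, bytes([0x1B, 0x4E]) behaves differently), and build_payload is a hypothetical helper. Whether the printer accepts raw image bytes after the command is not something confirmed here.

```python
import binascii

command = b"\x1B\x4E"                    # the literal from the question
same_bytes = bytes([0x1B, 0x4E])         # built from integer values
from_hex = binascii.unhexlify(b"1b4e")   # built from hexadecimal text


def build_payload(prefix, data):
    """Concatenate a command prefix and raw data into one bytes object."""
    return prefix + data


# Placeholder data standing in for the bytes read from an image file.
payload = build_payload(command, b"\xff\xd8\xff")
print(command == same_bytes == from_hex, payload)
```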
How about this:
with open('something.jpeg', 'rb') as f:
    binValue = f.read(1)
    while len(binValue) != 0:
        hexVal = hex(ord(binValue))
        # Do something with the hex value
        binValue = f.read(1)
Or for a function, something like this:
import re
def imgToHex(file):
    string = ''
    with open(file, 'rb') as f:
        binValue = f.read(1)
        while len(binValue) != 0:
            hexVal = hex(ord(binValue))
            string += '\\' + hexVal
            binValue = f.read(1)
    string = re.sub('0x', 'x', string) # Replace '0x' with 'x' for your needs
    return string
Note: You do not necessarily need to do the re.sub portion if you use struct.pack to write the bits, but this will get it into the format that you need
Read in a jpg and make a string of hex values. Then reverse the procedure. Take a string of hex and write it out as a jpg file...
import binascii
with open('my_img.jpg', 'rb') as f:
    data = f.read()
print(data[:10])
im_hex = binascii.hexlify(data)
# check out the hex...
print(im_hex[:10])
# reversing the procedure
im_hex = binascii.a2b_hex(im_hex)
print(im_hex[:10])
# write it back out to a jpg file
with open('my_hex.jpg', 'wb') as image_file:
    image_file.write(im_hex)

Create text file of hexadecimal from binary

I would like to convert a binary to hexadecimal in a certain format and save it as a text file.
The end product should be something like this:
"\x7f\xe8\x89\x00\x00\x00\x60\x89\xe5\x31\xd2\x64\x8b\x52"
Input is from an executable file "a".
This is my current code:
with open('a', 'rb') as f:
    byte = f.read(1)
    hexbyte = '\\x%02s' % byte
    print hexbyte
A few issues with this:
This only prints the first byte.
The result is "\x" followed by a box glyph containing the digits 00 and 7f; in the terminal it looks exactly the same (screenshot not preserved).
Why is this so? And finally, how do I save all the hexadecimals to a text file to get the end product shown above?
EDIT: Able to save the file as text with
txt = open('out.txt', 'w')
print >> txt, hexbyte
txt.close()
You can't inject numbers into escape sequences like that. Escape sequences are essentially constants, so, they can't have dynamic parts.
There's already a module for this, anyway:
from binascii import hexlify
with open('test', 'rb') as f:
    print(hexlify(f.read()).decode('utf-8'))
Just use the hexlify function on a byte string and it'll give you a hex byte string. You need the decode to convert it back into an ordinary string.
Not quite sure if decode works in Python 2, but you really should be using Python 3, anyway.
Your output looks like a representation of a bytestring in Python returned by repr():
with open('input_file', 'rb') as file:
    print repr(file.read())
Note: some bytes are shown as ASCII characters, e.g. '\x52' == 'R'. If you want all bytes to be shown as hex escapes:
with open('input_file', 'rb') as file:
    print "\\x" + "\\x".join([c.encode('hex') for c in file.read()])
Just add the content to list and print:
with open("default.png", 'rb') as file_png:
    a = file_png.read()
l = []
l.append(a)
print l
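A Python 3 sketch of producing the exact "\x7f\xe8..." text from the question and saving it to a file might look like this. The names escape_bytes and dump_escaped are hypothetical, and the surrounding double quotes in the output file are an assumption based on the sample shown in the question.

```python
def escape_bytes(data):
    """Render every byte as a literal backslash-x escape with two hex digits."""
    return "".join("\\x{:02x}".format(byte) for byte in data)


def dump_escaped(src_path, dst_path):
    """Read src_path as binary and write its escaped form as text."""
    with open(src_path, "rb") as src:
        text = escape_bytes(src.read())
    with open(dst_path, "w") as dst:
        # Assumption: the desired output is wrapped in double quotes.
        dst.write('"' + text + '"\n')


print(escape_bytes(b"\x7f\xe8\x89\x00"))
```

Unlike repr(), this never falls back to printable characters, so a byte such as 0x52 stays \x52 instead of becoming R.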

Reading binary file (.chn) in Python

In python, how do I read a binary file (here I need to read a .chn file) and show the result in binary format?
Assuming that values are separated by a space:
with open('myfile.chn', 'rb') as f:
    data = []
    for line in f: # a file supports direct iteration
        data.extend(hex(int(x, 2)) for x in line.split())
In Python it is better to use open() rather than file(); the documentation says so explicitly:
When opening a file, it’s preferable to use open() instead of invoking
the file constructor directly.
rb mode will open the file in binary mode.
Reference:
http://docs.python.org/library/functions.html#open
Try this:
with open('myfile.chn', 'rb') as f:
    data = f.read()
data = [format(ord(x), '08b') for x in data]
print ''.join(data)
If you want the individual binary values instead, keep them as a list:
with open('myfile.chn', 'rb') as f:
    data = f.read()
data = [format(ord(x), '08b') for x in data]
print data
data now holds the list of binary numbers, which you can then convert to hexadecimal.
with open('myfile.chn') as f:
    data = f.read().split() # read everything and split it into a list of binary-digit strings
data = [hex(int(x, 2)) for x in data] # convert to a list of hex strings (via the interim integer value)
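One detail worth checking when trimming the '0b' prefix that bin() produces: str.strip('0b') removes any run of the characters '0' and 'b' from both ends, so it also eats trailing zero bits. The short Python 3 sketch below compares it with format(); to_bits is a hypothetical helper name.

```python
def to_bits(value, width=8):
    """Return value as a zero-padded binary string without the 0b prefix."""
    return format(value, "0{}b".format(width))


stripped = bin(2).strip("0b")   # bin(2) is '0b10'; the trailing '0' is lost
sliced = bin(2)[2:]             # slicing removes only the two-character prefix
padded = to_bits(2)             # fixed width keeps byte boundaries visible

print(stripped, sliced, padded)
```

Fixed-width output matters when the values are joined into one string, since unpadded bytes cannot be told apart afterwards.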
