I would like to convert a binary to hexadecimal in a certain format and save it as a text file.
The end product should be something like this:
"\x7f\xe8\x89\x00\x00\x00\x60\x89\xe5\x31\xd2\x64\x8b\x52"
Input is from an executable file "a".
This is my current code:
with open('a', 'rb') as f:
    byte = f.read(1)
    hexbyte = '\\x%02s' % byte
    print hexbyte
A few issues with this:
This only prints the first byte.
The result is "\x" followed by a box glyph containing "00 7f", the placeholder drawn for an unprintable character. It looks exactly the same in the terminal.
Why is this so? And finally, how do I save all the hexadecimals to a text file to get the end product shown above?
EDIT: Able to save the file as text with
txt = open('out.txt', 'w')
print >> txt, hexbyte
txt.close()
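You can't inject numbers into escape sequences like that. Escape sequences are essentially constants: they are resolved when the string literal is parsed, so they can't have dynamic parts. The %02s in your format string also pads the raw byte as a string instead of printing its numeric value in hex.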
There's already a module for this, anyway:
from binascii import hexlify
with open('test', 'rb') as f:
    print(hexlify(f.read()).decode('utf-8'))
Just use the hexlify function on a byte string and it'll give you a hex byte string. You need the decode to convert it back into an ordinary string.
Not quite sure if decode works in Python 2, but you really should be using Python 3, anyway.
Your output looks like a representation of a bytestring in Python returned by repr():
with open('input_file', 'rb') as file:
    print repr(file.read())
Note: repr() shows printable bytes as ASCII characters, e.g. '\x52' == 'R'. If you want every byte to be shown as a hex escape (Python 2):
with open('input_file', 'rb') as file:
    print "\\x" + "\\x".join([c.encode('hex') for c in file.read()])
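The one-liner above relies on the Python 2 'hex' string codec, which does not exist on Python 3. A minimal Python 3 sketch of the same idea follows; the function names to_hex_escapes and dump_file are made up for illustration, and the output file layout (one quoted line) is an assumption based on the end product shown in the question.

```python
def to_hex_escapes(data: bytes) -> str:
    """Return every byte as a literal \\xNN escape, e.g. b'R' -> '\\x52'."""
    # Iterating over bytes yields integers on Python 3, so %02x applies directly.
    return "".join("\\x%02x" % b for b in data)


def dump_file(src: str, dst: str) -> None:
    """Write the escaped form of src into the text file dst (hypothetical helper)."""
    with open(src, "rb") as f:
        text = to_hex_escapes(f.read())
    with open(dst, "w") as out:
        # Wrap the result in double quotes, as in the desired end product.
        out.write('"' + text + '"\n')
```

Calling dump_file('a', 'out.txt') would then cover both parts of the question: all bytes are converted, not just the first, and the result lands in a text file.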
Just add the content to a list and print it; printing a list shows the repr() of each element:
with open("default.png", 'rb') as file_png:
    a = file_png.read()
l = []
l.append(a)
print l
I have a file like this:
\u9515\u7691\u853c\u788d\u7231
\u9515\u7691\u853c\u788d\u7231
\u9515\u7691\u853c\u788d\u7231
Now I want to read this file and print the strings. I do it like this:
with open(fi, "rb") as fi:
    print(fi.readline().strip().decode("utf-8"))
but I find that it still prints
\u9515\u7691\u853c\u788d\u7231
How can I get the real string:
锕皑蔼碍爱
You can decode your string using unicode-escape (Python 2):
line = "\\u9515\\u7691\\u853c\\u788d\\u7231"
print line.decode("unicode-escape")
Your decode call treats the data as a regular UTF-8 string, and the backslash escapes in the file are plain ASCII, so they come through unchanged. Try decoding the escapes instead:
with open(fi, "rb") as fi:
    data = fi.readline().strip()
    print(data.decode("unicode-escape"))
Alternatively, as this is a Python escaped string, you can use ast.literal_eval:
import ast
line = r"\u9515\u7691\u853c\u788d\u7231"
print(ast.literal_eval('u"' + line + '"'))
gives as expected:
锕皑蔼碍爱
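For reference, here is a small Python 3 sketch that applies the unicode-escape idea to lines read from a file in binary mode. The helper name decode_escaped_line is hypothetical. One assumption to keep in mind: the unicode_escape codec treats any non-escape bytes as Latin-1, so this fits files that contain only ASCII text and \uXXXX escapes, as in the question.

```python
def decode_escaped_line(raw: bytes) -> str:
    """Turn b'\\u9515\\u7691' (literal backslash-u text) into real characters."""
    # strip() removes the trailing newline before the escapes are decoded.
    return raw.strip().decode("unicode_escape")


def read_escaped_file(path: str) -> list:
    """Decode every line of a file of \\uXXXX escapes (hypothetical helper)."""
    with open(path, "rb") as f:
        return [decode_escaped_line(line) for line in f]
```

With the sample file from the question, each returned entry should be the five-character string rather than the escaped text.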
I am attempting to parse a file that I believe is UTF-16 encoded (the file magic is 0xFEFF), and I can open the file just as I want with:
f = open(file, 'rb')
But when for example I do
print f.read(40)
it prints the actual unicode strings of the file where I would like to access the hexadecimal data and read that byte-by-byte. This may be a stupid question but I haven't been able to find out how to do this.
Also, as a follow up question. Once I get this working, I would like to parse the file looking for a specific set of bytes, in this case:
0x00 00 00 43 00 00 00
And after that pattern is found, begin parsing an entry. What's the best way to accomplish this? I was thinking about using a generator to walk through each byte, and once this pattern shows up, yield the bytes until the next instance of that pattern? Is there a more efficient way to do this?
EDIT: I am using Python 2.7
Shouldn't you just be able to do this?
>>> string = 'string'
>>> hex(ord(string[1]))
'0x74'
Applied to the file in 40-byte chunks:
hexString = ''
with open(filename, 'rb') as f:
    while True:
        chars = f.read(40)
        if not chars:
            break
        # hex() of each byte's ordinal, e.g. '0xff'
        hexString += ''.join(hex(ord(char)) for char in chars)
If you want a string of hexadecimal, you can pass it through binascii.hexlify():
import binascii
with open(filename, 'rb') as f:
    raw = f.read(40)
hexadecimal = binascii.hexlify(raw)
print(hexadecimal)
(This also works without modification on Python 3)
If you need the numerical value of each byte, you can call ord() on each element, or equivalently, map() the function over the string:
with open(filename, 'rb') as f:
    raw = f.read(40)
byte_list = map(ord, raw)
print byte_list
(This doesn't work on Python 3, but on 3.x, you can just iterate over raw directly)
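The follow-up about locating the marker 00 00 00 43 00 00 00 can be handled without walking the data byte by byte, because byte strings support find(). The sketch below is written for Python 3 and assumes the whole file fits in memory; PATTERN and split_entries are illustrative names, and treating everything between two markers as one entry is an assumption about the file layout.

```python
# The 7-byte marker from the question, written as hex digits.
PATTERN = bytes.fromhex("00000043000000")


def split_entries(data: bytes, marker: bytes = PATTERN):
    """Yield the bytes that follow each marker, up to the next marker or the end."""
    start = data.find(marker)
    while start != -1:
        body_start = start + len(marker)
        nxt = data.find(marker, body_start)
        # No further marker: the last entry runs to the end of the data.
        yield data[body_start:nxt] if nxt != -1 else data[body_start:]
        start = nxt
```

A typical use would be: read the file with open(path, 'rb'), pass the result to split_entries, and parse each yielded entry. On Python 2.7 the same find() approach works on str objects, with the marker written as '\x00\x00\x00\x43\x00\x00\x00'.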
I'm new to python and I have a file like this:
cw==ZA==YQ==ZA==YQ==cw==ZA==YQ==cw==ZA==YQ==cw==ZA==YQ==cw==ZA==dA==ZQ==cw==dA==
It is keyboard input, encoded with base64, and now I want to decode it.
I tried the code below, but it stops after the first decoded character.
import base64
file = "my_file.txt"
fin = open(file, "rb")
binary_data = fin.read()
fin.close()
b64_data = base64.b64decode(binary_data)
b64_fname = "original_b64.txt"
fout = open(b64_fname, "w")
fout.write(b64_data)
fout.close()
Any help is welcome. thanks
I assume that you created your test input string yourself.
If I split your test input string into blocks of 4 characters and decode each one separately, I get the following:
>>> import base64
>>> s = 'cw==ZA==YQ==ZA==YQ==cw==ZA==YQ==cw==ZA==YQ==cw==ZA==YQ==cw==ZA==dA==ZQ==cw==dA=='
>>> ''.join(base64.b64decode(s[i:i+4]) for i in range(0, len(s), 4))
'sdadasdasdasdasdtest'
However, the correct base64 encoding of your test string sdadasdasdasdasdtest is:
>>> base64.b64encode('sdadasdasdasdasdtest')
'c2RhZGFzZGFzZGFzZGFzZHRlc3Q='
If you place this string in my_file.txt (and rewrite your code to be a bit more concise), then it all works:
import base64
with open("my_file.txt") as f, open("original_b64.txt", 'w') as g:
    encoded = f.read()
    decoded = base64.b64decode(encoded)
    g.write(decoded)
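If the input file cannot be regenerated and really does consist of separately encoded, padded 4-character blocks, the chunk-by-chunk decoding shown in the interactive session can be wrapped in a function. This is a Python 3 sketch; decode_padded_chunks is a made-up name, and it assumes every block is exactly four characters long, which holds for the sample but not for base64 data in general.

```python
import base64


def decode_padded_chunks(text: str) -> bytes:
    """Decode a string made of independently encoded 4-character base64 blocks."""
    # Drop newlines or spaces so the 4-character slicing stays aligned.
    text = "".join(text.split())
    return b"".join(
        base64.b64decode(text[i:i + 4]) for i in range(0, len(text), 4)
    )
```

Because b64decode returns bytes on Python 3, the output file should be opened in 'wb' mode when the result is written out.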
I work with a thermal printer. It is able to print images, but it needs to get the data in hex format. For this I need a Python function that reads an image and returns a value containing the image data in hex format.
I currently send data to the printer in this form:
content = b"\x1B\x4E"
What is the simplest way to do this using Python 2.7?
All the best,
I don't really know what you mean by "hex format", but if it needs to get the whole file as a sequence of bytes you can do:
with open("image.jpeg", "rb") as fp:
    img = fp.read()
If your printer expects the image in some other format (like 8-bit values for every pixel), then try using the pillow library; it has many image manipulation functions and handles a wide range of input and output formats.
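To make the point about "hex format" concrete: b"\x1B\x4E" is only one way of writing two bytes in source code, and data read from a file in binary mode is already the same kind of object. The Python 3 sketch below illustrates this; build_payload is a hypothetical helper, and whether the printer accepts the file's bytes directly after a command is device-specific and not something the thread confirms.

```python
ESC_N = b"\x1B\x4E"  # the two-byte command from the question

# Three spellings of the same two bytes; "hex" is only how they are written down.
same = ESC_N == bytes([0x1B, 0x4E]) == bytes.fromhex("1B4E")


def build_payload(command: bytes, image_bytes: bytes) -> bytes:
    """Concatenate a command prefix and raw image data (hypothetical helper)."""
    return command + image_bytes
```

On Python 2.7 the same concatenation works with plain str objects, since f.read() in 'rb' mode already returns a byte string there.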
How about this:
with open('something.jpeg', 'rb') as f:
    binValue = f.read(1)
    while len(binValue) != 0:
        hexVal = hex(ord(binValue))
        # Do something with the hex value
        binValue = f.read(1)
Or for a function, something like this:
def imgToHex(path):
    # Return the file's bytes as text, one \xNN escape per byte
    parts = []
    with open(path, 'rb') as f:
        binValue = f.read(1)
        while len(binValue) != 0:
            # %02x zero-pads, so byte 1 becomes \x01 rather than \x1
            parts.append('\\x%02x' % ord(binValue))
            binValue = f.read(1)
    return ''.join(parts)
Note: This produces the escaped text form. You do not necessarily need it if you write the raw bytes (for example from struct.pack or f.read()) straight to the printer.
Read in a jpg and make a string of hex values. Then reverse the procedure. Take a string of hex and write it out as a jpg file...
import binascii
with open('my_img.jpg', 'rb') as f:
    data = f.read()
print(data[:10])
im_hex = binascii.hexlify(data)
# check out the hex...
print(im_hex[:10])
# reversing the procedure
im_hex = binascii.a2b_hex(im_hex)
print(im_hex[:10])
# write it back out to a jpg file
with open('my_hex.jpg', 'wb') as image_file:
    image_file.write(im_hex)
I'm using Python 3.2.3 on Windows, and am trying to convert binary data within a C-style ASCII file into its binary equivalent for later parsing using the struct module. For example, my input file contains "0x000A 0x000B 0x000C 0x000D", and I'd like to convert it into "\x00\x0a\x00\x0b\x00\x0c\x00\x0d".
The problem I'm running into is that the string datatypes have changed in Python 3, and the built-in functions to convert from hexadecimal to binary, such as binascii.unhexlify(), no longer accept regular unicode strings, but only byte strings. This process of converting from unicode strings to byte strings and back is confusing me, so I'm wondering if there's an easier way to achieve this. Below is what I have so far:
with open(path, "r") as f:
    l = []
    data = f.read()
    values = data.split(" ")
    for v in values:
        if (v.startswith("0x")):
            l.append(binascii.unhexlify(bytes(v[2:], "utf-8")).decode("utf-8"))
    string = ''.join(l)
In Python 3:
>>> ''.join(chr(int(x, 16)) for x in "0x000A 0x000B 0x000C 0x000D".split()).encode('utf-16be')
b'\x00\n\x00\x0b\x00\x0c\x00\r'
As agf says, opening the file with mode 'r' will give you string data.
Since the only thing you are doing here is looking at binary data, you probably want to open with 'rb' mode and make your result of type bytes, not str.
Something like:
import binascii
with open(path, "rb") as f:
    l = []
    data = f.read()
    values = data.split(b" ")
    for v in values:
        if (v.startswith(b"0x")):
            l.append(binascii.unhexlify(v[2:]))
    result = b''.join(l)
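On Python 3 there is also bytes.fromhex(), which accepts a regular str of hex digits, so the str/bytes round trip can be avoided altogether. A minimal sketch follows; the function name c_hex_words_to_bytes is made up, and it assumes every token has an even number of hex digits after the 0x prefix, as in the sample input.

```python
def c_hex_words_to_bytes(text: str) -> bytes:
    """Convert text like '0x000A 0x000B' into the bytes b'\\x00\\x0a\\x00\\x0b'."""
    # Keep only tokens with a 0x prefix and strip that prefix.
    digits = "".join(
        tok[2:] for tok in text.split() if tok.lower().startswith("0x")
    )
    # bytes.fromhex takes a str, so the file can be opened in text mode.
    return bytes.fromhex(digits)
```

The resulting bytes object can be passed to struct.unpack directly, for example with a '>4H' format for the four big-endian 16-bit values in the sample.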