I basically want to read a PNG file, convert it into binary (base 2), and store the converted base-2 value in a string. I've tried many things, but all of them produce some error.
You can use two approaches:
First, read the image and encode it as base64:
import base64
with open("my_image.png", "rb") as f:
    png_encoded = base64.b64encode(f.read())
Then encode the base64 bytes as a base-2 string:
encoded_b2 = "".join([format(n, '08b') for n in png_encoded])
print(encoded_b2)
You can also decode the base-2 string back into a PNG file:
decoded_b64 = bytes(int(encoded_b2[i:i + 8], 2) for i in range(0, len(encoded_b2), 8))
with open('my_image_decoded.png', 'wb') as f:
    f.write(base64.b64decode(decoded_b64))
Second, read the bytes directly and write each byte as a base-2 number into a string:
from PIL import Image
from io import BytesIO
out = BytesIO()
with Image.open("my_image.png") as img:
    img.save(out, format="png")
image_in_bytes = out.getvalue()
encoded_b2 = "".join([format(n, '08b') for n in image_in_bytes])
print(encoded_b2)
You can then decode the base-2 string back into a file:
decoded_b2 = [int(encoded_b2[i:i + 8], 2) for i in range(0, len(encoded_b2), 8)]
with open('my_image_decoded.png', 'wb') as f:
    f.write(bytes(decoded_b2))
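Both approaches come down to the same two conversions between bytes and a string of 0s and 1s. Here is a minimal sketch, assuming you only need the raw file bytes (note that re-saving through Pillow re-encodes the image, so its bytes may differ from the original file). The helper names bytes_to_bits and bits_to_bytes are illustrative and not part of any library:

```python
def bytes_to_bits(data: bytes) -> str:
    """Return every byte as 8 binary digits, concatenated into one string."""
    return "".join(format(b, "08b") for b in data)


def bits_to_bytes(bits: str) -> bytes:
    """Inverse of bytes_to_bits; the length must be a multiple of 8."""
    if len(bits) % 8:
        raise ValueError("bit string length must be a multiple of 8")
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


# The 8-byte PNG signature stands in for real file contents here.
sample = b"\x89PNG\r\n\x1a\n"
bits = bytes_to_bits(sample)
restored = bits_to_bytes(bits)
```

For a real image, replace sample with the result of open("my_image.png", "rb").read().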
f.read(1) will return 1 byte, not one character. The file is binary but particular ranges in the file are UTF-8 encoded strings with the length coming before the string. There is no newline character at the end of the string. How do I read such strings?
I have seen this question but none of the answers address the UTF-8 case.
Example code:
file = 'temp.txt'
with open(file, 'wb') as f:
    f.write(b'\x41')
    f.write(b'\xD0')
    f.write(b'\xB1')
    f.write(b'\xC0')
with open(file, 'rb') as f:
    print(f.read(1), '+', f.read(1))
with open(file, 'r') as f:
    print(f.buffer.read(1), '+', f.read(1))
This outputs:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc0 in position 2: invalid start byte
When f.write(b'\xC0') is removed, it works as expected. It seems to read more than it is told: the code doesn't say to read the 0xC0 byte.
The file is binary but particular ranges in the file are UTF-8 encoded strings with the length coming before the string.
You have the length of the string, which is likely the byte length as it makes the most sense in a binary file. Read the range of bytes in binary mode and decode it after-the-fact. Here's a contrived example of writing a binary file with a UTF-8 string with the length encoded first. It has a two-byte length followed by the encoded string data, surrounded with 10 bytes of random data on each side.
import os
import struct
string = "我不喜欢你女朋友。你需要一个新的。"
with open('sample.bin','wb') as f:
    f.write(os.urandom(10)) # write 10 random bytes
    encoded = string.encode()
    f.write(len(encoded).to_bytes(2,'big')) # write a two-byte big-endian length
    f.write(encoded) # write string
    f.write(os.urandom(10)) # 10 more random bytes
with open('sample.bin','rb') as f:
    print(f.read()) # show the raw data
# Option 1: Seeking to the known offset, read the length, then the string
with open('sample.bin','rb') as f:
    f.seek(10)
    length = int.from_bytes(f.read(2),'big')
    result = f.read(length).decode()
    print(result)
# Option 2: read the fixed portion as a structure.
with open('sample.bin','rb') as f:
    # read 10 bytes and a big endian 16-bit value (the string length)
    *other,length = struct.unpack('>10bH',f.read(12))
    result = f.read(length).decode()
    print(result)
Output:
b'\xa3\x1e\x07S8\xb9LA\xf0_\x003\xe6\x88\x91\xe4\xb8\x8d\xe5\x96\x9c\xe6\xac\xa2\xe4\xbd\xa0\xe5\xa5\xb3\xe6\x9c\x8b\xe5\x8f\x8b\xe3\x80\x82\xe4\xbd\xa0\xe9\x9c\x80\xe8\xa6\x81\xe4\xb8\x80\xe4\xb8\xaa\xe6\x96\xb0\xe7\x9a\x84\xe3\x80\x82ta\xacg\x9c\x82\x85\x95\xf9\x8c'
我不喜欢你女朋友。你需要一个新的。
我不喜欢你女朋友。你需要一个新的。
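The same idea can be wrapped in a small reusable function. This is a sketch only, assuming a big-endian byte length stored directly before the string; read_prefixed_string and its parameters are hypothetical names, and an in-memory stream stands in for the file:

```python
import io


def read_prefixed_string(stream, offset, length_size=2):
    """Seek to offset, read a big-endian byte length, then decode that many bytes."""
    stream.seek(offset)
    header = stream.read(length_size)
    if len(header) != length_size:
        raise EOFError("stream ended inside the length field")
    length = int.from_bytes(header, "big")
    payload = stream.read(length)
    if len(payload) != length:
        raise EOFError("stream ended inside the string data")
    return payload.decode("utf-8")


text = "我不喜欢"
encoded = text.encode("utf-8")
# 10 filler bytes, two-byte length, string data, 10 more filler bytes
blob = b"\x00" * 10 + len(encoded).to_bytes(2, "big") + encoded + b"\xff" * 10
result = read_prefixed_string(io.BytesIO(blob), offset=10)
```

Checking the number of bytes actually read guards against a truncated file, which a bare f.read(length).decode() would only report as a decoding error, if at all.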
If you do need to read UTF-8 characters from a particular byte offset in a file, you can wrap the binary stream in a UTF-8 reader after seeking:
import codecs
with open('sample.bin','rb') as f:
    f.seek(12)
    c = codecs.getreader('utf8')(f)
    print(c.read(1))
Output:
我
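If you also need the binary stream position to end exactly after the character, one alternative is to feed single bytes to an incremental decoder until it produces a character. This is a sketch, not the answer's method; read_one_char is a hypothetical helper and the in-memory stream stands in for a file opened in 'rb' mode:

```python
import codecs
import io


def read_one_char(stream, encoding="utf-8"):
    """Read bytes one at a time until they decode to a single character."""
    decoder = codecs.getincrementaldecoder(encoding)()
    while True:
        byte = stream.read(1)
        if not byte:
            # End of stream: returns '' if nothing is pending,
            # raises UnicodeDecodeError if a sequence was cut short.
            return decoder.decode(b"", final=True)
        char = decoder.decode(byte)
        if char:
            return char


stream = io.BytesIO("我A".encode("utf-8") + b"\xc0")
first = read_one_char(stream)
position_after_first = stream.tell()
second = read_one_char(stream)
```

Because only the bytes of the requested character are consumed, the invalid 0xC0 byte that follows is never touched unless you ask for a third character.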
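Here's a character that takes up more than one byte. In text mode, read(1) returns the whole character even though it spans several bytes, whether or not you pass the utf-8 encoding explicitly (assuming UTF-8 is your platform's default encoding); in binary mode, read(1) returns a single byte.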
file = 'temp.txt'
with open(file, 'wb') as f:
    f.write('⾀'.encode('utf-8'))
    f.write(b'\x01')
with open(file, 'rb') as f:
    print(f.read(1))
with open(file, 'r') as f:
    print(f.read(1))
Output:
b'\xe2'
⾀
Even though part of the file is not UTF-8, you can still open it in text (non-binary) mode, skip to the byte you want to read, and then read a whole character by running read(1).
This works even if your character isn't in the beginning of the file:
file = 'temp.txt'
with open(file, 'wb') as f:
    f.write(b'\x01')
    f.write('⾀'.encode('utf-8'))
with open(file, 'rb') as f:
    print(f.read(1), '+', f.read(1))
with open(file, 'r') as f:
    print(f.read(1),'+', f.read(1))
If this does not work for you please provide an example.
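One caveat, consistent with the traceback in the question: in text mode, read(1) does not fetch a single character from the file. The text layer reads a chunk of bytes and decodes it, so an invalid byte later in the chunk can raise even though it was never requested. The sketch below uses in-memory streams as stand-ins for a real file, with the same four bytes as the question:

```python
import io

# 0x41 ('A'), 0xD0 0xB1 ('б'), then 0xC0, which is never valid in UTF-8
data = b"A" + "б".encode("utf-8") + b"\xc0"

# Binary mode: read(1) touches exactly one byte.
binary = io.BytesIO(data)
first_byte = binary.read(1)

# Text mode: the wrapper decodes a whole buffered chunk before returning.
text = io.TextIOWrapper(io.BytesIO(data), encoding="utf-8")
try:
    text.read(1)
    outcome = "decoded"
except UnicodeDecodeError:
    outcome = "error"
```

This is why text mode is only safe when everything the decoder may buffer is valid UTF-8.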
Can you help me? I need to open my_url in "rb" mode. This is what I tried:
url = "https://my url/" + file_info.file_path
response = requests.get(url)
with open(BytesIO(response.content), "rb") as f: # Open in 'rb' mode for reading it in way like: 010101010
    byte = f.read(1)
    #some algorithm..............
    while byte:
        hexadecimal = binascii.hexlify(byte)
        decimal = int(hexadecimal, 16)
        binary = bin(decimal)[2:].zfill(8)
        hiddenData += binary
        byte = f.read(1)
I get an error:
TypeError: expected str, bytes or os.PathLike object, not _io.BytesIO
Can you please help: how should I open my URL in "rb" mode?
I tried opening an image with Pillow and that works, but I cannot do the same with open().
You're passing a BytesIO object (basically a file handle) where a filename is expected.
So, a quick fix:
f = BytesIO(response.content)
But better, iterate over the bytes object using iter, either manually (for the start of your algorithm) or automatically (using a for loop, which stops when the iterator is exhausted, so no while is needed). In Python 3, iterating over bytes yields integers, so each one can be formatted directly instead of going through binascii.hexlify:
f = iter(response.content)
byte = next(f)
#some algorithm..............
for byte in f:
    binary = format(byte, '08b')  # 8-digit binary string for this byte
    hiddenData += binary
I'm new to python and I have a file like this:
cw==ZA==YQ==ZA==YQ==cw==ZA==YQ==cw==ZA==YQ==cw==ZA==YQ==cw==ZA==dA==ZQ==cw==dA==
It's keyboard input, encoded with base64, and now I want to decode it.
I tried the code below, but it stops after the first decoded character.
import base64
file = "my_file.txt"
fin = open(file, "rb")
binary_data = fin.read()
fin.close()
b64_data = base64.b64decode(binary_data)
b64_fname = "original_b64.txt"
fout = open(b64_fname, "w")
fout.write(b64_data)
fout.close
Any help is welcome. thanks
I assume that you created your test input string yourself.
If I split your test input string into blocks of 4 characters and decode each one separately, I get the following:
>>> import base64
>>> s = 'cw==ZA==YQ==ZA==YQ==cw==ZA==YQ==cw==ZA==YQ==cw==ZA==YQ==cw==ZA==dA==ZQ==cw==dA=='
>>> ''.join(base64.b64decode(s[i:i+4]) for i in range(0, len(s), 4))
'sdadasdasdasdasdtest'
However, the correct base64 encoding of your test string sdadasdasdasdasdtest is:
>>> base64.b64encode('sdadasdasdasdasdtest')
'c2RhZGFzZGFzZGFzZGFzZHRlc3Q='
If you place this string in my_file.txt (and rewrite your code to be a bit more concise), it all works.
import base64
with open("my_file.txt") as f, open("original_b64.txt", 'w') as g:
    encoded = f.read()
    decoded = base64.b64decode(encoded)
    g.write(decoded)
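The snippets above are written for Python 2, where b64decode returns a str. A rough Python 3 equivalent is sketched below, assuming the decoded data is ASCII text; in Python 3, b64decode returns bytes and b64encode requires bytes. The helper name decode_b64_chunks is illustrative only:

```python
import base64


def decode_b64_chunks(s, chunk=4):
    """Decode a string made of independently encoded base64 blocks."""
    parts = (base64.b64decode(s[i:i + chunk]) for i in range(0, len(s), chunk))
    return b"".join(parts).decode("ascii")


s = "cw==ZA==YQ==ZA==YQ==cw==ZA==YQ==cw==ZA==YQ==cw==ZA==YQ==cw==ZA==dA==ZQ==cw==dA=="
decoded = decode_b64_chunks(s)

# Encoding the whole text in one go gives the compact form shown above.
reencoded = base64.b64encode(decoded.encode("ascii")).decode("ascii")
```

Decoding block by block is what makes the original file readable; a single b64decode call treats the first padding characters as the end of the data.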
I work with a thermal printer. It can print images, but it needs to receive the data in hex format. For this I need a Python function that reads an image and returns a value containing the image data in hex format.
I currently use this format to send hex data to the printer:
content = b"\x1B\x4E"
What is the simplest way to do this using Python 2.7?
All the best;
I don't really know what you mean by "hex format", but if it needs to get the whole file as a sequence of bytes you can do:
with open("image.jpeg", "rb") as fp:
    img = fp.read()
If your printer expects the image in some other format (like 8-bit values for every pixel), then try using the Pillow library; it has many image manipulation functions and handles a wide range of input and output formats.
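It may help to see that the "hex format" in the question is only a way of writing a bytes literal, so bytes read from a file can be joined to it directly. The sketch below is written for Python 3 and makes no claim about what the printer accepts: the command bytes are copied from the question, build_payload is a hypothetical helper, and a temporary file stands in for the image.

```python
import os
import tempfile

# Two spellings of the same two bytes.
command = b"\x1B\x4E"
same_command = bytes([0x1B, 0x4E])


def build_payload(command, image_path):
    """Return the command bytes followed by the raw bytes of the file."""
    with open(image_path, "rb") as fp:
        return command + fp.read()


# A three-byte temporary file stands in for a real image.
with tempfile.NamedTemporaryFile(delete=False, suffix=".bin") as tmp:
    tmp.write(b"\xff\xd8\xff")
    path = tmp.name
payload = build_payload(command, path)
os.remove(path)
```

Whether the device wants the encoded file or raw pixel data after the command is something only the printer's documentation can settle.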
How about this:
with open('something.jpeg', 'rb') as f:
    binValue = f.read(1)
    while len(binValue) != 0:
        hexVal = hex(ord(binValue))
        # Do something with the hex value
        binValue = f.read(1)
Or for a function, something like this:
import re
def imgToHex(file):
    string = ''
    with open(file, 'rb') as f:
        binValue = f.read(1)
        while len(binValue) != 0:
            hexVal = hex(ord(binValue))
            string += '\\' + hexVal
            binValue = f.read(1)
    string = re.sub('0x', 'x', string) # Replace '0x' with 'x' for your needs
    return string
Note: You do not necessarily need to do the re.sub portion if you use struct.pack to write the bits, but this will get it into the format that you need
Read in a jpg and make a string of hex values. Then reverse the procedure. Take a string of hex and write it out as a jpg file...
import binascii
with open('my_img.jpg', 'rb') as f:
    data = f.read()
print(data[:10])
im_hex = binascii.hexlify(data)
# check out the hex...
print(im_hex[:10])
# reversing the procedure
im_hex = binascii.a2b_hex(im_hex)
print(im_hex[:10])
# write it back out to a jpg file
with open('my_hex.jpg', 'wb') as image_file:
    image_file.write(im_hex)
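On Python 3.5 or later, the same round trip is also available without binascii, through the built-in bytes.hex() and bytes.fromhex(). A short sketch, using a few literal bytes in place of a real JPEG file:

```python
# A few bytes standing in for the start of a JPEG file.
data = bytes([0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10])

# bytes -> str of hex digits (binascii.hexlify returns bytes instead)
im_hex = data.hex()

# str of hex digits -> bytes
restored = bytes.fromhex(im_hex)
```

The main practical difference is the return type: data.hex() gives a str, so no decode step is needed before writing it to a text file.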
I would like to convert a binary to hexadecimal in a certain format and save it as a text file.
The end product should be something like this:
"\x7f\xe8\x89\x00\x00\x00\x60\x89\xe5\x31\xd2\x64\x8b\x52"
Input is from an executable file "a".
This is my current code:
with open('a', 'rb') as f:
    byte = f.read(1)
    hexbyte = '\\x%02s' % byte
    print hexbyte
A few issues with this:
This only prints the first byte.
The result is "\x" followed by a box glyph containing the digits 00 7f.
In the terminal it looks exactly like this: [screenshot not preserved]
Why is this so? And finally, how do I save all the hexadecimals to a text file to get the end product shown above?
EDIT: Able to save the file as text with
txt = open('out.txt', 'w')
print >> txt, hexbyte
txt.close()
You can't inject numbers into escape sequences like that. Escape sequences are essentially constants, so, they can't have dynamic parts.
There's already a module for this, anyway:
from binascii import hexlify
with open('test', 'rb') as f:
    print(hexlify(f.read()).decode('utf-8'))
Just use the hexlify function on a byte string and it'll give you a hex byte string. You need the decode to convert it back into an ordinary string.
Not quite sure if decode works in Python 2, but you really should be using Python 3, anyway.
Your output looks like a representation of a bytestring in Python returned by repr():
with open('input_file', 'rb') as file:
    print repr(file.read())
Note: some bytes are shown as ascii characters e.g. '\x52' == 'R'. If you want all bytes to be shown as the hex escapes:
with open('input_file', 'rb') as file:
    print "\\x" + "\\x".join([c.encode('hex') for c in file.read()])
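The two snippets above are Python 2 only (print statement, str.encode('hex')). A possible Python 3 version that also saves the result to a text file is sketched below; to_hex_escapes is a hypothetical helper, and a few literal bytes plus a temporary directory stand in for the executable "a" and the real output path:

```python
import os
import tempfile


def to_hex_escapes(data):
    """Render every byte as a literal backslash-x escape with two hex digits."""
    return "".join("\\x{:02x}".format(b) for b in data)


# Stand-in for: open('a', 'rb').read()
sample = bytes([0x7F, 0xE8, 0x89, 0x00, 0x52])
escaped = to_hex_escapes(sample)

# Write the quoted text to a file, then read it back to check it.
out_path = os.path.join(tempfile.mkdtemp(), "out.txt")
with open(out_path, "w") as txt:
    txt.write('"' + escaped + '"')
with open(out_path) as txt:
    saved = txt.read()
```

Formatting the integer value of each byte keeps printable bytes such as 0x52 as \x52 rather than letting them appear as 'R'.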
Just add the content to a list and print it:
with open("default.png",'rb') as file_png:
    a = file_png.read()
l = []
l.append(a)
print l