Easy way to view and save the binary of a file? - python

What is the easiest way to get the underlying binary code (0s and 1s) for a given file? The context for this question is that I want a Python function which takes a file name, looks it up, and gathers the binary code for that file before either storing it somewhere or returning it. After this I want to do some manipulations on the binary data.

The underlying bytes of a file are available from the .read() method of the file object. Use the b mode modifier when you open the file:
with open("input_file.bin", "rb") as input_file:
    bits = input_file.read()
If you want to easily manipulate the bits after reading them in, you might want to convert them to a bitarray:
from bitarray import bitarray
with open("input_file.bin", "rb") as input_file:
    chars = input_file.read()
bits = bitarray()
bits.frombytes(chars)
# Number of 1 bits and number of 0 bits in the file
print(bits.count(1), bits.count(0))
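If installing bitarray is not an option, the same idea can be sketched with the standard library alone. The helper names file_to_bit_string and bit_string_to_file are hypothetical, and the sketch assumes Python 3 and a file small enough to hold in memory (the bit string takes eight characters per byte of the file):

```python
def file_to_bit_string(path):
    """Read a file and return its contents as a string of '0' and '1'."""
    with open(path, "rb") as f:
        data = f.read()
    # format(byte, "08b") zero-pads each byte to exactly eight binary digits.
    return "".join(format(byte, "08b") for byte in data)


def bit_string_to_file(bit_string, path):
    """Write a string of '0' and '1' (length a multiple of 8) back as raw bytes."""
    if len(bit_string) % 8:
        raise ValueError("bit string length must be a multiple of 8")
    data = bytes(
        int(bit_string[i:i + 8], 2) for i in range(0, len(bit_string), 8)
    )
    with open(path, "wb") as f:
        f.write(data)
```

Any manipulation can then be done on the string between the two calls, as long as the result still has a length that is a multiple of eight.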
References:
https://docs.python.org/2/library/functions.html#open
https://pypi.python.org/pypi/bitarray/0.8.1

Related

Fastest way to read a large binary file with Python

I need to read a simple but large (500MB) binary file in Python 3.6. The file was created by a C program, and it contains 64-bit double precision data. I tried using struct.unpack but that's very slow for a large file.
Here is my simple file read:
def ReadBinary():
    fileName = 'C:\\File_Data\\LargeDataFile.bin'
    with open(fileName, mode='rb') as file:
        fileContent = file.read()
Now I have fileContent. What is the fastest way to decode it into 64-bit double-precision floating point, or read it without the need to do a format conversion?
I want to avoid, if possible, reading the file in chunks. I would like to read it decoded, all at once, like C does.
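You can use the fromfile method of array.array('d'). It takes the number of items to read, which you can derive from the file size:
import array
import os
def ReadBinary():
    fileName = r'C:\File_Data\LargeDataFile.bin'
    fileContent = array.array('d')
    # Number of 8-byte doubles in the file
    itemCount = os.path.getsize(fileName) // fileContent.itemsize
    with open(fileName, mode='rb') as file:
        fileContent.fromfile(file, itemCount)
    return fileContent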
That's a C-level read as raw machine values. mmap.mmap could also work by creating a memoryview of the mmap object and casting it.
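The mmap route mentioned above could look roughly like the following. This is a minimal sketch, assuming Python 3, a file whose size is a multiple of 8 bytes, and doubles stored in the machine's native byte order; read_doubles_mmap is a hypothetical name.

```python
import mmap


def read_doubles_mmap(path):
    """Return the native-endian doubles stored in `path` as a Python list."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; ACCESS_READ keeps the mapping read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # cast("d") reinterprets the mapped bytes as 8-byte doubles without
            # copying. Both views must be released before the mmap is closed,
            # which the with-statement below does in reverse order.
            with memoryview(mm) as raw, raw.cast("d") as doubles:
                # tolist() copies the values out; index `doubles` directly
                # inside this block if the copy should be avoided.
                return doubles.tolist()
```

The copy into a list is only there so the result outlives the mapping. For a 500MB file it is usually better to work on the view inside the with-block.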

Python binary file write directly from string

I have the byte-code of a png-file in a string variable. How do I write it to .png file without python trying to encode it? The string is '\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\x00\n\x00\x00\x00\x07\x08\x02\x00\x00\x00\xbe\xceK4\x00\x00\x00\x01sRGB\x00\xae\xce\x1c\xe9\x00\x00\x00\x04gAMA\x00\x00\xb1\x8f\x0b\xfca\x05\x00\x00\x00\tpHYs\x00\x00\x0e\xc3\x00\x00\x0e\xc3\x01\xc7o\xa8d\x00\x00\x00DIDAT\x18Wc\xf8\xff\xff\xff\xaf\xfd\x07\xdf[:\xbc\x95Q\x81 \xfb\xc7\xaa\xb5#q \x00I#\xcb\xc1\x11D\x11H\xfa\xdb\x94\x19hr\x10\xf4NY\x1b$\x8d\x0c\x90\x95~\xad\xacE\x97F\x03\x94H\xff\xff\x0f\x00\x1f]\xa2\x03U|Z\xa3\x00\x00\x00\x00IEND\xaeB`\x82'
edit: I feel like you might need more info on my situation: I am trying to make a little encryption program, and although it works on strings, I want to make it work for any file too. I am reading a .png file in byte-mode(which gives the string mentioned above), and after it is done being encrypted and decrypted, I have a string with the exact same content, but no way to put it back into a file.
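For Python 3, you have to open the file in binary write mode and convert the string to bytes. Use the latin-1 codec, which maps code points 0-255 to single bytes one-to-one; the default UTF-8 would turn characters such as '\x89' into two bytes and corrupt the file:
with open('filename', 'wb') as f:
    f.write(the_string.encode('latin-1'))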
You could try using PyPNG, looks like a possible solution:
http://pythonhosted.org/pypng/ex.html#writing
This will let you write binary to a file in python.
with open('filename', 'wb') as f:
    f.write(bytecode)
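For the encryption use case in the question, the safest route is usually to keep the data as bytes from start to finish, so that no text encoding is ever involved. A minimal Python 3 sketch follows; the XOR step is only a hypothetical stand-in for the real cipher, and xor_bytes and transform_file are illustrative names.

```python
def xor_bytes(data, key):
    """XOR every byte of `data` with the single-byte integer `key` (0-255)."""
    return bytes(b ^ key for b in data)


def transform_file(src_path, dst_path, key):
    """Read src_path as bytes, XOR it with key, and write the result to dst_path."""
    with open(src_path, "rb") as src:
        data = src.read()  # a bytes object, never decoded to str
    with open(dst_path, "wb") as dst:
        dst.write(xor_bytes(data, key))
```

Because XOR is its own inverse, calling transform_file twice with the same key should reproduce the original file byte for byte.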

How to open and read a binary file in Python?

I have a binary file (link) that I would like to open and read contents of with Python. How are such binary files opened and read with Python? Any specific modules to use for such an operation.
The 'b' flag will get python to treat the file as a binary, so no modules are needed. Also you haven't provided a purpose for having python read a binary file with a question like that.
f = open('binaryfile', 'rb')
print(f.read())
Here is an example:
with open('somefile.bin', 'rb') as f:  # the second parameter "rb" stands for "read binary"
    data = f.read()  # read the whole file and store its contents in the variable data
print(data)
Reading a file in python is trivial (as mentioned above); however, it turns out that if you want to read a binary file and decode it correctly you need to know how it was encoded in the first place.
I found a helpful example that provided some insight at https://www.devdungeon.com/content/working-binary-data-python,
# Binary to Text
binary_data = b'I am text.'
text = binary_data.decode('utf-8')  # Transform back into human-readable text
print(text)
binary_data = bytes([65, 66, 67]) # ASCII values for A, B, C
text = binary_data.decode('utf-8')
print(text)
but I was still unable to decode some files that my work created because they used an unknown encoding method.
Once you know how it is encoded you can read the file bit by bit and perform the decoding with a function of three.
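As an illustration of decoding a file whose layout is known, the sketch below uses the standard struct module. The layout is entirely hypothetical: a 4-byte magic tag, a little-endian unsigned 16-bit version, and a little-endian unsigned 32-bit count followed by that many doubles. Substitute the layout your files actually use.

```python
import struct

# "<" selects little-endian with no padding: 4-byte tag, uint16, uint32.
HEADER = struct.Struct("<4sHI")


def decode_records(data):
    """Decode bytes laid out as HEADER followed by `count` little-endian doubles."""
    magic, version, count = HEADER.unpack_from(data, 0)
    # Build a format such as "<3d" for three doubles and read them after the header.
    values = struct.unpack_from("<%dd" % count, data, HEADER.size)
    return magic, version, list(values)
```

In practice data would come from open(path, 'rb').read(); struct raises struct.error if the buffer is shorter than the format requires, which is a useful early sign that the assumed layout is wrong.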

Save a bitstream to a file using python

I need to output an h.265 (or hevc, is the same) bit-stream onto an str file in python.
I have a bitstream file and I select some data from this file to save to a new one. I use the bitstring module to process the bitstream file.
Edit: My question is how to create a new bitstream file and insert data into it.
Check out the part about Joining BitArrays (base class of BitStream) in this part of the bitstring documentation. How to join the substreams depends on how you have them in the first place.
For writing the bitstream to a file, use the method 'tofile' of the Bits class, which is a base class of BitStream.
with open('fileToWriteTo', 'wb') as f:
    bitstreamObject.tofile(f)
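If you want to write multiple substreams one after another, you can open the file in append mode for each subsequent write. Note that tofile pads the output with zero bits up to a whole byte, so join the substreams first if they are not byte-aligned.
with open('fileToWriteTo', 'ab') as f:
    nextSubstream.tofile(f)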
Take a look at struct.
A quick example:
import struct
characters = "Hello World"
filepath = 'output.bin'  # placeholder output path
with open(filepath, 'wb') as f:
    for char in characters:
        # @B means to pack in native byte order as an unsigned char (1 byte)
        packed = struct.pack('@B', ord(char))
        f.write(packed)

Working with binary in Python

Is there an easy way to work in binary with Python?
I have a file of data I am receiving (in 1's and 0's) and would like to scan through it and look for certain patterns in binary. It has to be in binary because due to my system, I might be off by 1 bit or so which would throw everything off when converting to hex or ascii.
For example, I would like to open the file, then search for '0001101010111100110' or some string of binary and have it tell me whether or not it exists in the file, where it is, etc.
Is this doable or would I be better off working with another language?
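To convert a byte string into a string of '0' and '1', you can use this one-liner (written for Python 2; in Python 3, iterating over a bytes object already yields integers, so use b in place of ord(b)):
bin_str = ''.join(bin(0x100 + ord(b))[-8:] for b in byte_str)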
Combine that with opening and reading the file:
with open(filename, 'rb') as f:
    byte_str = f.read()
Now it's just a simple string search:
if '0001101010111100110' in bin_str:
    print(bin_str.index('0001101010111100110'))  # bit offset of the first match
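Put together for Python 3, the approach might look like the sketch below. The function name find_bit_pattern is hypothetical, and the whole file is expanded into a string in memory, so this is meant for files of modest size.

```python
def find_bit_pattern(path, pattern):
    """Return every bit offset at which `pattern` (a string of 0s and 1s) occurs."""
    with open(path, "rb") as f:
        data = f.read()
    # Iterating over bytes yields integers in Python 3, so no ord() is needed.
    bin_str = "".join(format(byte, "08b") for byte in data)
    offsets = []
    start = bin_str.find(pattern)
    while start != -1:
        offsets.append(start)
        # Resume one bit later so that overlapping matches are also reported.
        start = bin_str.find(pattern, start + 1)
    return offsets
```

Because the search runs on individual bits rather than bytes, a pattern that straddles a byte boundary or starts at an odd bit offset is still found, which addresses the off-by-one-bit concern in the question.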
You would be better off working in another language. Python can do it, for example by opening the file with
file = open("file", "rb")
(appending the b opens it in binary mode) and then using a simple search, but to be honest, it is much easier and faster to do it in a lower-level language such as C.
