Read binary file with header using bitarray's fromfile in Python - python

I wrote a program that uses bitarray 0.8.0 to write bits to a binary file. I would like to add a header to this binary file to describe what's inside the file.
My problem is that I think the method "fromfile" of bitarray necessarily starts reading the file from the beginning. As a workaround, the reading program could extract the header and then rewrite a temporary file containing only the binary portion (written with bitarray's tofile), but that doesn't sound very efficient.
Is there any way to do this properly?
My file could look something like the following where clear text is the header and binary data is the bitarray information:
...{(0, 0): '0'}{(0, 0): '0'}{(0, 0): '0'}[binary bitarray data]...
Edit:
I tried the following after reading the response:
bits = ""
b = bitarray()
with open(Filename, 'rb') as file:
    # Get header
    byte = file.read(1)
    while byte != "":
        # read header
        byte = file.read(1)
    b.fromfile(file)
print b.to01()
print "len(b.to01())", len(b.to01())
The length is 0 and the print of "to01()" is empty.
However, the print of the header is fine.

My problem is that I think the method "fromfile" of bitarray necessarily starts reading the file from the beginning.
This is likely false; it, like most other file read routines, probably starts at the current position within the file, and stops at EOF.
EDIT:
From the documentation:
fromfile(f, [n])
Read n bytes from the file object f and append them to the bitarray interpreted as machine values. When n is omitted, as many bytes are read until EOF is reached.
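In the attempt from the question's edit, the header loop keeps calling read(1) until it returns an empty string, which only happens at EOF, so nothing is left for fromfile to read. The loop has to stop at the end of the header instead. A minimal sketch of one way to do that, assuming the writer stores the header length in a 4-byte prefix (this prefix and the helper names are illustrative assumptions, not part of bitarray). It uses only the standard library, and a plain read() stands in for the b.fromfile(f) call:

```python
import io
import struct

def write_with_header(f, header, payload):
    # Hypothetical layout: 4-byte big-endian header length, header, raw payload.
    f.write(struct.pack(">I", len(header)))
    f.write(header)
    f.write(payload)

def read_header(f):
    # Read exactly the header; the file position ends up on the first payload byte.
    (length,) = struct.unpack(">I", f.read(4))
    return f.read(length)

# An in-memory buffer stands in for the real file opened with 'rb'.
buf = io.BytesIO()
write_with_header(buf, b"{(0, 0): '0'}", b"\xff\x00\xaa")
buf.seek(0)

header = read_header(buf)
# With bitarray, b.fromfile(f) would go here; per the documentation quoted
# above it reads until EOF. A plain read() stands in for it in this sketch.
payload = buf.read()
```

Any scheme that tells the reader where the header ends (a fixed-size header, a terminator byte that cannot occur in the header) would work the same way.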

Related

How to print the last byte of a file .txt

I have a text file (.txt) and I need to print its last byte.
How do I do that when I do not know the size of the document?
The documentation provides an API that can be used to solve this problem. You will need to do the following, in order:
Open the file in binary mode ('rb'); in Python 3, a file opened in text mode does not allow a nonzero seek relative to the end.
Move the file pointer to the last byte using the seek method of the file object: pass os.SEEK_END with an offset of -1 to land one byte before the end of the file.
Read one byte with the read function.
Print that byte.
If you did not use a context manager (the with keyword) when opening the file, call close on the file before exiting the program.
The trick here is the seek method, which accepts an offset relative to the end of the file.
The following should work:
with open("text.txt") as file:
    text = file.read()
byte_array = bytearray(text, "utf8")
print(byte_array[-1:])
If you need the binary representation:
with open("text.txt") as file:
    text = file.read()
byte_array = bytearray(text, "utf8")
binary_byte_list = []
for byte in byte_array:
    binary_representation = bin(byte)
    binary_byte_list.append(binary_representation)
print(binary_byte_list[-1:])
You could do it like this using seek which obviates the need to read the entire file into memory:
import os
with open('foo.txt', 'rb') as foo:
    foo.seek(-1, os.SEEK_END)
    b = foo.read()
print(b)
In this case the last character is newline and therefore:
Output:
b'\n'
Note:
File opened in binary mode
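One case the seek approach does not cover is an empty file, where seeking to one byte before the end raises an error. A small sketch that checks for this first; the function name last_byte and the demo file names are made up for illustration:

```python
import os
import tempfile

def last_byte(path):
    """Return the last byte of the file as bytes, or b"" if the file is empty."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        if f.tell() == 0:          # empty file: there is no last byte
            return b""
        f.seek(-1, os.SEEK_END)    # one byte before the end
        return f.read(1)

# Demo on throwaway files so the sketch is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    text_path = os.path.join(tmp, "demo.txt")
    with open(text_path, "wb") as f:
        f.write(b"hello\n")
    empty_path = os.path.join(tmp, "empty.txt")
    with open(empty_path, "wb") as f:
        pass
    result = last_byte(text_path)
    empty_result = last_byte(empty_path)
```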

Read N number of bytes from stdin of python and output to a temp file for further processing

I would like to read a fixed number of bytes from stdin of a python script and output it to one temporary file batch by batch for further processing. Therefore, when the first N number of bytes are passed to the temp file, I want it to execute the subsequent scripts and then read the next N bytes from stdin. I am not sure what to iterate over in the top loop before While true. This is an example of what I tried.
import sys
while True:
    data = sys.stdin.read(2330049) # Number of bytes I would like to read in one iteration
    if data == "":
        break
    file1=open('temp.fil','wb') #temp file
    file1.write(data)
    file1.close()
    # further_processing on temp.fil (I think this can only be done after file1 is closed)
Two quick suggestions:
Avoid while True when a loop with an explicit end condition will do.
Use Python 3.
Are you trying to read from a file, or from actual standard input (such as the output of a script piped to this one)?
Here is an answer I think will work for you, if you are reading from a file, that I pieced together from some other answers listed at the bottom:
with open("in-file", "rb") as in_file, open("out-file", "wb") as out_file:
    data = in_file.read(2330049)
    while data != b"":
        out_file.write(data)
        data = in_file.read(2330049)  # read the next block; b"" signals EOF
If you want to read from actual standard in, I would read all of it in, then split it up by bytes. The only way this won't work is if you are trying to deal with constant streaming data...which I would most definitely not use standard in for.
The .encode('UTF-8') method might also be of use to you, as might .decode('hex') on Python 2 (in Python 3, use bytes.fromhex or binascii.unhexlify instead).
Sources: https://stackoverflow.com/a/1035360/957648 & Python, how to read bytes from file and save it?
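For the batch-by-batch workflow described in the question, here is a hedged sketch that reads fixed-size blocks from any binary stream, writes each block to a closed temporary file, and only then hands the file path to a processing step. The helper names iter_chunks and process_in_batches, and the handle_file callback, are assumptions made for illustration:

```python
import io
import os
import tempfile

def iter_chunks(stream, size):
    """Yield blocks of up to `size` bytes until the stream is exhausted."""
    while True:
        chunk = stream.read(size)
        if not chunk:              # b"" means EOF
            return
        yield chunk

def process_in_batches(stream, size, handle_file):
    """Write each chunk to a temp file, close it, then call handle_file(path)."""
    results = []
    for chunk in iter_chunks(stream, size):
        with tempfile.NamedTemporaryFile(suffix=".fil", delete=False) as tmp:
            tmp.write(chunk)
        # The file is closed here, so other code can safely open it.
        try:
            results.append(handle_file(tmp.name))
        finally:
            os.remove(tmp.name)
    return results

# Demo on an in-memory stream, with os.path.getsize as the processing step.
sizes = process_in_batches(io.BytesIO(b"abcdefghij"), 4, os.path.getsize)
```

For real piped input in Python 3, the stream would be sys.stdin.buffer (the binary view of standard input) and the size would be 2330049 as in the question; the final block may be shorter than the requested size.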

How to open and read a binary file in Python?

I have a binary file (link) that I would like to open and read contents of with Python. How are such binary files opened and read with Python? Any specific modules to use for such an operation.
The 'b' flag will get python to treat the file as a binary, so no modules are needed. Also you haven't provided a purpose for having python read a binary file with a question like that.
f = open('binaryfile', 'rb')
print(f.read())
Here is an Example:
with open('somefile.bin', 'rb') as f:  # 'rb' stands for "read binary": open for reading without text decoding
    data = f.read()  # read the whole file and store its contents (bytes) in the variable data
print(data)
Reading a file in python is trivial (as mentioned above); however, it turns out that if you want to read a binary file and decode it correctly you need to know how it was encoded in the first place.
I found a helpful example that provided some insight at https://www.devdungeon.com/content/working-binary-data-python,
# Binary to Text
binary_data = b'I am text.'
text = binary_data.decode('utf-8') # Transform back into human-readable text
print(text)
binary_data = bytes([65, 66, 67]) # ASCII values for A, B, C
text = binary_data.decode('utf-8')
print(text)
but I was still unable to decode some files that my work created because they used an unknown encoding method.
Once you know how it is encoded, you can read the file piece by piece and decode each piece with a small function or two.
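As an illustration of decoding with a known layout, the standard struct module can turn raw bytes into Python values. The layout below (a 4-byte tag, an unsigned 16-bit version, and a 64-bit float, all little-endian) is purely hypothetical; a real file needs its own format string:

```python
import struct

# Hypothetical record layout: 4-byte tag, uint16 version, float64 value.
# "<" selects little-endian byte order with no padding between fields.
RECORD = struct.Struct("<4sHd")

# Stand-in for bytes that would normally come from f.read(RECORD.size).
raw = RECORD.pack(b"DATA", 2, 3.5)

tag, version, value = RECORD.unpack(raw)
```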

Want to check this script I wrote to read a Fortran binary file

I'm working on a project that requires me to read Fortran binary files. It is my understanding that Fortran automatically puts a 4-byte header and footer into each file. As such, I want to remove the first and last 4 bytes from the file before I read it. Would this do the trick?
a = open("foo",rb)
b = a.seek(4,0)
x = np.fromfile(b.seek(4,2),dtype='float64')
It might be easier to read the entire file and then chop 4 bytes off each end:
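import numpy as np

with open("foo", "rb") as a:
    data = a.read()
# Drop the 4 bytes at each end, then interpret the rest as float64 values.
# np.frombuffer replaces np.fromstring, which is deprecated for binary data.
x = np.frombuffer(data[4:-4], dtype='float64')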
For a similar question, see How to read part of binary file with numpy?
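If the file holds more than one record, slicing 4 bytes off each end of the whole file is not enough, because each record carries its own markers. A sketch under the question's assumption of 4-byte markers, further assuming they are little-endian and hold the record's byte count (this is compiler dependent, so treat it as an assumption to verify). It uses the standard struct module rather than numpy to stay self-contained, and read_fortran_record is a made-up helper name:

```python
import io
import struct

def read_fortran_record(f):
    """Read one record framed as: 4-byte length, payload, 4-byte length."""
    head = f.read(4)
    if len(head) < 4:
        return None                      # EOF: no more records
    (nbytes,) = struct.unpack("<i", head)
    payload = f.read(nbytes)
    (tail,) = struct.unpack("<i", f.read(4))
    if tail != nbytes:
        raise ValueError("leading and trailing record markers disagree")
    return payload

# Build a fake one-record file in memory holding three float64 values.
body = struct.pack("<3d", 1.0, 2.5, -3.0)
marker = struct.pack("<i", len(body))
stream = io.BytesIO(marker + body + marker)

record = read_fortran_record(stream)
decoded = struct.unpack("<%dd" % (len(record) // 8), record)
```

Checking that the two markers agree is a cheap way to detect a wrong assumption about marker size or byte order.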

Save an array as bin in Matlab, pass it to Python and read the bin file in Python

I am currently trying to save an array as bin file in Matlab, send it to Python and read it in Python. However, Matlab is showing errors when I run it. I am using the following codes:
Read the array in Matlab, convert to bin file and pass to Python:
array1 = rand(5,1); %% array1 is the desired array that needs to be sent to Python
fid = fopen('nazmul.bin','wb'); %% I want to save array1 in the nazmul.bin file
fwrite(fid,array1);
status=fclose(fid);
python('squared.py','nazmul.bin'); %% I want to send the parameters to squared.py program
squared.py file:
import sys
if __name__ == '__main__':
    f = open("nazmul.bin", "rb") # Trying to open the bin file
    try:
        byte = f.read(1) # Reading the bin file and saving it in the byte array
        while byte != "":
            # Do stuff with byte.
            byte = f.read(1)
    finally:
        f.close()
    print byte # printing the byte array from Python
However, when I run this program, nothing gets printed. I guess that the bin file is not getting passed properly to the squared.py file.
Thanks for your feedback.
Nazmul
There are several problems here.
You should use double underscores when checking for 'main', i.e. __name__ == "__main__".
You are not collecting bytes but rather always storing the last byte read. Therefore, the last byte is always "".
Finally, it seems like indentation is not correct. I assume this is just a stackoverflow formatting error.
One more potential issue - When you use fwrite(fid, A) in MATLAB, it assumes that you want to write bytes (8 bit numbers). However, your rand() command generates reals, so MATLAB first rounds the results to integers and your binary file will hold '0' or '1' only.
Final note: Reading a file one byte at a time is probably very inefficient. It is probably better to read the file in large chunks, or - if it is a small file - read the entire file in one read() operation.
The corrected Python code is as follows:
if __name__ == '__main__':
    f = open("xxx.bin", "rb")  # Open the bin file
    try:
        a = []
        byte = f.read(1)  # Read the first byte
        while byte:  # read(1) returns an empty value at EOF
            a.append(byte)
            # Do stuff with byte.
            byte = f.read(1)
    finally:
        f.close()
    print(a)
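Following the two notes above about fwrite precision and reading in one operation: if the MATLAB side is changed to fwrite(fid, array1, 'double'), each element is stored as an 8-byte float, and the Python side can decode the whole file after a single read(). A minimal sketch, assuming little-endian byte order (typical on x86 machines); decode_doubles is a made-up helper name, and the bytes are built in memory here to stand in for the file contents:

```python
import struct

def decode_doubles(raw, byteorder="<"):
    """Interpret raw bytes as consecutive 8-byte floats."""
    count = len(raw) // 8
    return list(struct.unpack("%s%dd" % (byteorder, count), raw[:count * 8]))

# Stand-in for: with open("nazmul.bin", "rb") as f: raw = f.read()
raw = struct.pack("<5d", 0.1, 0.25, 0.5, 0.75, 0.9)
values = decode_doubles(raw)
```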
