Writing 8 bit values to a binary file in python v3.3 - python

Please let me know the best way to write 8bit values to a file in python. The values can be between 0 and 255.
I open the file as follows:
f = open('temp', 'wb')
Assume the value I am writing is an int between 0 and 255 assigned to a variable, e.g.
x = 13
This does not work:
f.write(x)
Python complains that ints do not support the buffer interface. As a workaround I am writing the value as a hex string:
f.write(hex(x))
That works, but it is space-inefficient and clearly not the right Python way. Please help. Thanks.

Try explicitly creating a bytes object:
f.write(bytes([x]))
You can also output a series of bytes as follows:
f.write(bytes([65, 66, 67]))
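
A minimal round-trip sketch of the approach above, assuming a scratch file in a temporary directory (the path and the sample values are made up for illustration). It also shows that bytes() rejects anything outside the range 0 to 255:

```python
import os
import tempfile

values = [0, 13, 255]  # sample 8-bit values (illustrative)

path = os.path.join(tempfile.mkdtemp(), 'temp')  # hypothetical scratch file
with open(path, 'wb') as f:
    f.write(bytes([values[0]]))    # one value -> one byte
    f.write(bytes(values[1:]))     # several values in one call

with open(path, 'rb') as f:
    data = f.read()

read_back = list(data)             # iterating over bytes yields ints
file_size = os.path.getsize(path)  # one byte on disk per value

try:
    bytes([256])                   # out of range for a single byte
    range_checked = False
except ValueError:
    range_checked = True
```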

As an alternative you can use the struct module...
import struct
x = 13
with open('temp', 'wb') as f:
    f.write(struct.pack('>I', x)) # Big-endian, unsigned int
To read x from the file...
with open('temp', 'rb') as f:
    x, = struct.unpack(">I", f.read())
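
Note that the 'I' format code occupies four bytes. For a value that must take exactly one byte on disk, the 'B' code (unsigned char) is probably the closer match. A small sketch using only struct, with sample values chosen for illustration:

```python
import struct

x = 13

four_bytes = struct.pack('>I', x)   # unsigned int: 4 bytes
one_byte = struct.pack('B', x)      # unsigned char: 1 byte

# Pack several 8-bit values at once with a repeat count.
packed = struct.pack('3B', 65, 66, 67)

# unpack always returns a tuple, hence the trailing comma.
y, = struct.unpack('B', one_byte)
```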

You just need to write an item of bytes data type, not integer.

Related

Binary reader from c++ to python

I have a binary reader in C++ and I am trying to write the same thing in Python (I am a beginner in both languages). I read online that I should use struct, but I am having trouble getting it to work.
My C++ code is the following:
struct databin {
    float prixc[60];
    float volume[60];
};
databin qt;
// get the data for day d, minute i
int numsteps = m_nbsteps * sizeof(qt);
int step = d*numsteps+i*sizeof(qt);
m_rf.seekg(step, ios::beg);
m_rf.read((char*) &qt, sizeof(qt));
// at the end we have the data in the object qt
I would really appreciate some help to do the same in python.
Thank you!!
Update:
Sorry Mark, I did not want my message to be perceived that way. I really appreciate the time you spent.
I was really looking for a starting point with struct: my struct in C++ is made of two arrays, and I can't find how to describe the same kind of structure in Python so that I can use unpack.
What I have done so far:
import struct
# get the data for day d, minute i
d = 5000
i = 15
numsteps = 391*480
xx = []
yy = []
step = d*numsteps + i*480
with open(file, "rb") as of:
    of.seek(step, 0)
    couple_bytes = of.read(480)
# first 240 bytes: the 60 floats of prixc
for j in range(0, 240, 4):
    [x] = struct.unpack('f', couple_bytes[j:j+4])
    xx.append(x)
# last 240 bytes: the 60 floats of volume
for j in range(240, 480, 4):
    [y] = struct.unpack('f', couple_bytes[j:j+4])
    yy.append(y)
This works, and xx and yy hold my two arrays. But my goal was a more direct approach: defining a structure and reading it directly.
Thank you again!
The struct module has the ability to unpack many values at a time by including a count in the format string.
with open(file, "rb") as of:
    of.seek(step, 0)
    couple_bytes = of.read(60*4)
    prixc = list(struct.unpack('60f', couple_bytes))
    couple_bytes = of.read(60*4)
    volume = list(struct.unpack('60f', couple_bytes))
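
For the more direct approach the question asks about, one option is a single precompiled struct.Struct that describes the whole record. The sketch below assumes little-endian 4-byte floats with no padding (which matches the 480-byte record size used above); the helper name read_record and the in-memory sample data are made up for illustration:

```python
import io
import struct

# One databin record: 60 floats of prixc followed by 60 floats of volume.
RECORD = struct.Struct('<60f60f')   # assumed little-endian, 480 bytes

def read_record(f, index):
    """Return (prixc, volume) for the record at position `index`."""
    f.seek(index * RECORD.size, 0)
    values = RECORD.unpack(f.read(RECORD.size))
    return list(values[:60]), list(values[60:])

# Build two sample records in memory instead of opening a real file.
rec0 = RECORD.pack(*([1.0] * 60 + [2.0] * 60))
rec1 = RECORD.pack(*([float(k) for k in range(60)] + [0.5] * 60))
sample = io.BytesIO(rec0 + rec1)

prixc, volume = read_record(sample, 1)
```

With the numbers from the question, the record index would be d*391 + i, since step = d*numsteps + i*480 and numsteps = 391*480.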

Read a binary file using unpack in Python compared with a IDL method

I have an IDL procedure that reads a binary file, and I am trying to translate it into a Python routine.
The IDL code looks like this:
a = uint(0)
b = float(0)
c = float(0)
d = float(0)
e = float(0)
x=dblarr(nptx)
y=dblarr(npty)
z=dblarr(nptz)
openr,11,name_file_data,/f77_unformatted
readu,11,a
readu,11,b,c,d,e
readu,11,x
readu,11,y
readu,11,z
It works perfectly. I am writing the same thing in Python, but I can't get the same results (even the value of 'a' is different). Here is my code:
import struct
import numpy as np
x=np.zeros(nptx,float)
y=np.zeros(npty,float)
z=np.zeros(nptz,float)
with open(name_file_data, "rb") as fb:
    a, = struct.unpack("I", fb.read(4))
    b,c,d,e = struct.unpack("ffff", fb.read(16))
    x[:] = struct.unpack(str(nptx)+"d", fb.read(nptx*8))[:]
    y[:] = struct.unpack(str(npty)+"d", fb.read(npty*8))[:]
    z[:] = struct.unpack(str(nptz)+"d", fb.read(nptz*8))[:]
I hope this helps someone answer my question.
Update: as suggested in the answers, I am now trying scipy.io.FortranFile, but I am not sure I understand how to use it.
from scipy.io import FortranFile
f=FortranFile(name_file_data, 'r')
a=f.read_record('H')
b=f.read_record('f','f','f','f')
However, instead of an integer for 'a', I got array([0, 0], dtype=uint16).
And I got the following error for 'b': Size obtained (1107201884) is not a multiple of the dtypes given (16)
According to a table of IDL data types, UINT(0) creates a 16 bit integer (i.e. two bytes). In the Python struct module, the I format character denotes a 4 byte integer, and H denotes an unsigned 16 bit integer.
Try changing the line that unpacks a to
a, = struct.unpack("H", fb.read(2))
Unfortunately, this probably won't fix the problem. You use the option /f77_unformatted with openr, which means the file contains more than just the raw bytes of the variables. (See the documentation of the OPENR command for more information about /f77_unformatted.)
You could try to use scipy.io.FortranFile to read the file, but there are no guarantees that it will work. The binary layout of an unformatted Fortran file is compiler-dependent.
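
To illustrate what /f77_unformatted implies: many Fortran compilers wrap each record in a length marker before and after the payload. The sketch below assumes 4-byte little-endian markers, which is common but not guaranteed, so the real file may differ. The helper names and the sample data are made up for illustration:

```python
import io
import struct

def read_record(f):
    """Read one record framed by (assumed) 4-byte little-endian length markers."""
    (length,) = struct.unpack('<i', f.read(4))
    payload = f.read(length)
    (trailer,) = struct.unpack('<i', f.read(4))
    if trailer != length:
        raise ValueError('record markers disagree; the layout assumption is wrong')
    return payload

def write_record(f, payload):
    """Write one record in the same assumed layout (used here to build sample data)."""
    marker = struct.pack('<i', len(payload))
    f.write(marker + payload + marker)

# Build a sample stream: a 16-bit unsigned int, four floats, three doubles.
buf = io.BytesIO()
write_record(buf, struct.pack('<H', 7))
write_record(buf, struct.pack('<4f', 1.0, 2.0, 3.0, 4.0))
write_record(buf, struct.pack('<3d', 0.1, 0.2, 0.3))
buf.seek(0)

a, = struct.unpack('<H', read_record(buf))
b, c, d, e = struct.unpack('<4f', read_record(buf))
x = list(struct.unpack('<3d', read_record(buf)))
```

If the markers in a real file do not agree, that is a hint that the marker size or byte order assumed here does not match the compiler that produced the file.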

How to skip bytes after reading data using numpy fromfile

I'm trying to read noncontiguous fields from a binary file in Python using numpy fromfile function. It's based on this Matlab code using fread:
fseek(file, 0, 'bof');
q = fread(file, inf, 'float32', 8);
8 indicates the number of bytes I want to skip after reading each value. I was wondering if there was a similar option in fromfile, or if there is another way of reading specific values from a binary file in Python. Thanks for your help.
Henrik
Something like this should work, untested:
import struct
floats = []
with open(filename, 'rb') as f:
    while True:
        buff = f.read(4) # 'f' is 4-bytes wide
        if len(buff) < 4:
            break
        x = struct.unpack('f', buff)[0] # Convert buffer to float (get from returned tuple)
        floats.append(x) # Add float to list (for example)
        f.seek(8, 1) # The second arg 1 specifies relative offset
Using struct.unpack()
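
Another option, sketched here with only the standard library, is to let the format string describe the gap: 'x' is a pad byte in struct, so '<f8x' means one float followed by 8 skipped bytes. This assumes little-endian data and a file made of whole 12-byte groups; the sample buffer is made up for illustration:

```python
import struct

GROUP = struct.Struct('<f8x')   # 4-byte float, then 8 bytes to skip

# Sample data: each float is followed by 8 filler bytes.
data = b''.join(struct.pack('<f', v) + b'\xff' * 8 for v in (1.5, 2.5, 3.5))

# Drop any incomplete trailing group; iter_unpack needs a whole multiple.
usable = len(data) - len(data) % GROUP.size
floats = [v for (v,) in GROUP.iter_unpack(data[:usable])]
```

For a real file, data would come from f.read() on a file opened in 'rb' mode. If the file ends right after the last float, without its 8 trailing bytes, that last value is dropped by the trimming step and would need separate handling.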

Writing to binary file as int in python

I'm working on a project where the output size is very important. As my outputs are numbers between 0 and 100, I'm trying to write them as bytes (or unsigned chars).
However, I'm getting errors when trying to read them.
Here is a simple example:
test_filename='test.b'
g=(3*ones(shape=[1000])).astype('c')
g.tofile(test_filename)
with open(test_filename, "rb") as f:
    bytes = f.read(1)
    num = int(bytes.encode('hex'), 1)
    print num
Here is the error I get. It seems that bytes.encode expects a binary string or something of that sort (I am not sure):
ValueError Traceback (most recent call last)
<ipython-input-43-310a447041fe> in <module>()
----> 1 num = int(bytes.encode('hex'), 1)
2 print num
ValueError: int() base must be >= 2 and <= 36
I should state that I would later need to read the output files in C++.
Thanks in advance,
Gil
The answer depends somewhat on the version of Python you are using.
In Python 2, which I assume you are using because of the print statement, the main problem is that read() returns a str. If the stored value is, say, 50, printing it shows the ASCII character '2'. You need to tell Python to interpret those bits as an int rather than a str, and a simple cast does not do that.
I personally would use the struct package and do the following:
with open(test_filename, "rb") as f:
    bytes = f.read(1)
    num = struct.unpack("B", bytes)[0]
    print num
Another option would be to encode the string to hex and parse it as a base-16 number (which looks like what you were trying):
num = int(bytes.encode("hex_codec"), 16)
print num
One final option would be to put the string in a bytearray and pull the first byte:
num = bytearray(bytes)[0]
print num
If you are actually using Python 3, this is simpler because read() returns a bytes object (in that case, don't name a variable bytes; it shadows the built-in type and is very confusing). Indexing a bytes object returns an int:
num = data[0]  # where data = f.read(1)
print(num)
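
In Python 3 the whole round trip can be sketched as follows, assuming values between 0 and 100 as in the question (the scratch file path is made up for illustration). A file written this way is a plain sequence of unsigned bytes, so it should be readable from C++ as unsigned char:

```python
import os
import tempfile

outputs = [3, 50, 100]                  # sample values in 0..100

path = os.path.join(tempfile.mkdtemp(), 'test.b')  # hypothetical scratch file
with open(path, 'wb') as f:
    f.write(bytes(outputs))             # one unsigned byte per value

with open(path, 'rb') as f:
    raw = f.read()

first = raw[0]       # indexing bytes gives an int
chunk = raw[0:1]     # slicing gives a bytes object of length 1
nums = list(raw)     # all values as ints
```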

How to read a float from a raw binary file written with numpy's tofile()

I am writing a float32 to a file with numpy's tofile().
float_num = float32(3.4353)
float_num.tofile('float_test.bin')
It can be read with numpy's fromfile(), however that doesn't suit my needs and I have to read it as a raw binary with the help of the bitstring module.
So I do the following:
my_file = open('float_test.bin', 'rb')
raw_data = ConstBitStream(my_file)
float_num_ = raw_data.readlist('float:32')
print float_num
print float_num_
Output:
3.4353
-5.56134659129e+32
What could be the cause? The second output should also be 3.4353 or close.
The problem is that NumPy wrote the float32 in little-endian byte order (the native order on most machines), while bitstring's float type is big-endian by default. The solution is to specify little-endian in the data type.
my_file = open('float_test.bin', 'rb')
raw_data = ConstBitStream(my_file)
float_num_ = raw_data.readlist('floatle:32')
print float_num
print float_num_
Output:
3.4353
3.43530011177
See the bitstring documentation for a reference on its data types.
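
The same effect can be reproduced with the standard struct module alone, which may help when bitstring is not available. This sketch assumes the file holds one little-endian 32-bit float, as tofile() would write on a little-endian machine:

```python
import struct

raw = struct.pack('<f', 3.4353)      # 4 bytes, little-endian float32

wrong, = struct.unpack('>f', raw)    # misread as big-endian
right, = struct.unpack('<f', raw)    # read with the matching byte order
```

Here right comes back close to 3.4353 (within float32 precision), while wrong is a value far from it, much like the unexpected output in the question.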
