Reading a *.his (image) file in Python

I am trying to read an image file in *.his format. I do not know much about this format, but after some time on Google I found that it is a binary format and that ImageJ can read it through its raw-format import. On further inquiry, I found the following details of the *.his file:
Image type = 16-bit unsigned
Matrix dimensions in pixels = w1024 x h1024
Skip header info = 100 Bytes (The number of bytes in the file before the first byte of image data).
Little-Endian Byte Order
With this information in hand, I started out ...
I just wanted to print the values one by one to see the output:
from struct import unpack
f = open("file.his", 'rb')
f.seek(100)
try:
    byte = f.read(2)
    while byte != "":
        byte = f.read(2)
        print unpack('<H', byte)
finally:
    f.close()
It prints some numbers and then an error message:
.....
(64846,)
(64846,)
(64830,)
Traceback (most recent call last):
print unpack('
Can someone please suggest how to read this kind of file? I still think 'unpack' is the right function, but if someone has similar experience, any response is greatly appreciated.
Rky.

I've done a very similar task with a *.inr image file, so maybe the same logic could help you. Here is what you could apply:
1-Reading the file
First you need to read the file. Open it in binary mode so the bytes are not altered on the way in:
file = open(hisfile, 'rb')
inp = file.readlines()
2-Get header
In my case I used a for loop until the number of characters reached 256. In your case you need to count bytes (the header is 100 bytes), so you could print line by line to find out where to stop, or use len() to count the bytes in each line:
len(line)  # number of bytes in this line
3-Data
Once you know that the following lines are the raw data, concatenate them into one variable with a for loop:
raw_data = b''
for line in inp:
    raw_data += line
4-Convert the data
To convert the bytes to a numpy array you could do:
from numpy import frombuffer
data = frombuffer(raw_data, dtype='<u2')  # little-endian 16-bit unsigned
And then apply the shape:
data = data.reshape((1024, 1024)).transpose()  # Check whether the transpose is needed; in my case it was essential.
Maybe if you have an example of the file I could try to read it and help you more. Of course, you could do the whole process in one for loop using ifs.
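The loop in the question most likely fails because it calls unpack after the final read has already returned an empty string. As a minimal sketch that stays with struct.unpack, here is a Python 3 version that uses only the details listed in the question (100-byte header, 1024 x 1024 pixels, 16-bit unsigned, little-endian). The function name read_his and its parameters are made up for illustration.

```python
import struct

HEADER_BYTES = 100          # bytes to skip before the pixel data (from the question)
WIDTH, HEIGHT = 1024, 1024  # matrix dimensions in pixels (from the question)


def read_his(path, width=WIDTH, height=HEIGHT, header_bytes=HEADER_BYTES):
    """Return the image as a list of rows of 16-bit unsigned pixel values."""
    n_pixels = width * height
    with open(path, "rb") as f:
        f.seek(header_bytes)
        raw = f.read(n_pixels * 2)  # 2 bytes per pixel
    if len(raw) != n_pixels * 2:
        raise ValueError("file is shorter than the expected image size")
    # '<' = little-endian, 'H' = unsigned 16-bit; unpack all pixels in one call
    pixels = struct.unpack("<%dH" % n_pixels, raw)
    # Split the flat tuple into rows of `width` pixels each
    return [list(pixels[r * width:(r + 1) * width]) for r in range(height)]
```

Calling read_his("file.his") would give a list of 1024 rows. Reading the whole pixel block at once also avoids the per-pixel read loop.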

Related

How to change byte type in Python

I have a small problem that has caused me a lot of trouble. Basically, I want to convert an image to bytes, store a string version of those bytes in a txt file, then read the file contents and transform them back into bytes and then into an image. I've got the first part more or less ready (it works, but it was made quickly and badly), but the conversion from string to bytes gives me problems.
When I read the image bytes, I get something like this: b'GIF89aP\x00P\x00\xe3'
But when I read it from the txt file with 'rb', or just convert the str to bytes, I get this: b'GIF89aP\\x00P\\x00\\xe3'
and with this I can't write it to an image.
So I've tried to read and learn about this, but I couldn't find anything that would help.
The code is here. I know it's really messy, but I just need it to work:
file = open('p.gif', 'rb')
image = file.read()
str_b = str(image)
leng = len(str_b)
print(leng)
str_b = str_b[:0] + str_b[0+2:]
leng =- 1
str_b = str_b[:leng]
print(image)
#a = open('bytearray', 'w+')
#a.write(str_b)
#a.close
a = open('bytearray', 'r')
a = a.read()
temp = a.encode('utf-8')
print(temp)
#b = open('check', 'w+')
#b.write(str(string))
#print(string)
image_result = open('decoded.jpg', 'wb') # create a writable image and write the decoding result
image_result.write(temp)
Basically, my goal right now is to get bytes that look like this: b'GIF89aP\x00P\x00\xe3'
Please do not use eval as suggested above. eval has serious security vulnerabilities and will execute any Python code you pass to it. You could accidentally read a text file containing code that reformats the disk, and it would simply be executed. This is just an example, but you get my point: it is bad practice and only leads to more problems. See https://nedbatchelder.com/blog/201206/eval_really_is_dangerous.html for examples of why eval is bad.
Anyway, let's try to fix your code.
Instead of converting your bytes to a string by wrapping them in str(), I would suggest you use .decode() and .encode(). Arbitrary image bytes are generally not valid UTF-8, so use the latin-1 codec, which maps every byte value to exactly one character.
Fixed Code:
with open('p.gif', 'rb') as file:
    image = file.read()  # read file bytes
str_image = image.decode("latin-1")  # decode the bytes to a string, one character per byte
# newline='' stops Python from translating line-ending characters in the data
with open('image.txt', 'w', encoding='latin-1', newline='') as file:
    file.write(str_image)  # write image as string to a text file
with open('image.txt', 'r', encoding='latin-1', newline='') as file:
    str_from_file = file.read()  # read the text file and store the string
file_bytes = str_from_file.encode("latin-1")  # encode the image str back to bytes
print(type(str_from_file))  # type is str
print(type(file_bytes))  # type is bytes
I hope this fixes your issue and also doesn't introduce vulnerabilities into what you're building.
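If the text file only needs to carry the bytes safely and does not have to look like the b'...' literal, Base64 is another option. This is a swapped-in technique, not something from the answer above. The helper names bytes_to_text and text_to_bytes are made up for this sketch.

```python
import base64


def bytes_to_text(data):
    """Encode arbitrary bytes as an ASCII-only Base64 string."""
    return base64.b64encode(data).decode("ascii")


def text_to_bytes(text):
    """Decode a Base64 string produced by bytes_to_text back into bytes."""
    return base64.b64decode(text.encode("ascii"))


sample = b'GIF89aP\x00P\x00\xe3'   # the byte prefix shown in the question
as_text = bytes_to_text(sample)    # safe to write to a .txt file
restored = text_to_bytes(as_text)  # identical to the original bytes
```

The resulting text contains only ASCII letters, digits, '+', '/' and '=', so it survives any text encoding or line-ending handling.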

Read in binary data with python

I am very new to Python and I am trying to read in a file that partially contains binary data. There is a header with some information about the data, and the binary data follows the header. If one opens the file in a text editor it looks like this:
>>> Begin of header <<<
value1: 5
value2: 7
...
value65: 9
>>> End of header <<<
���ÄI›C¿���†¨¨v#���ÄW]c¿��� U⁄z#���#¬P\¿����∂:q#���#Ò˚U¿���†÷Us#���`ªw4¿��� :‘m#���#À›9#���ÄAs#���¿‹ ¿����ır#���¿#&%#���†„bq#����*˙-#��� [q#����ÚN8#����
Òo#���#√·T#���†‰zm#����9\#����ÃÜq#����€dZ#���`Ëäs#���†∏8I#���¿¬Ot#���†�6
An additional problem is that I did not create the file myself and do not know whether the data are double or float.
So how can I interpret those data?
So first, thanks to all for the help. Basically, the problem is the header: I can read in the data quite well when I remove the header from the file. This can be done quite easily with
x = numpy.fromfile(f, dtype = numpy.complex128 , count = -1)
The problem is that I cannot find any option for the function fromfile that skips lines (one can skip bytes, but the header size may differ from file to file).
In this great thread I found out how to convert a binary string to a numpy array:
convert binary string to numpy array
With this I could overcome the problem by reading the data file line by line and then merging every line after the end-of-header line into one string.
This string was then converted into a nice array, exactly as I wanted.
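The approach described above could be sketched roughly as follows, using only the standard library. It rests on several assumptions: the marker line is exactly ">>> End of header <<<" followed by a newline, the binary data starts immediately after it, and the data are little-endian 64-bit floats (numpy.complex128 corresponds to consecutive real/imaginary pairs of such floats). The function name read_after_header is made up for this sketch.

```python
import struct

END_MARKER = b">>> End of header <<<"  # marker text as shown in the question


def read_after_header(path, end_marker=END_MARKER):
    """Return (header dict, tuple of floats) from a text-header + binary file."""
    header = {}
    with open(path, "rb") as f:
        # Read line by line until the end-of-header marker is found
        for line in iter(f.readline, b""):
            text = line.strip()
            if text == end_marker:
                break
            if b":" in text:
                key, _, value = text.partition(b":")
                header[key.strip().decode("ascii", "replace")] = (
                    value.strip().decode("ascii", "replace"))
        else:
            raise ValueError("end-of-header marker not found")
        payload = f.read()  # everything after the marker line
    n = len(payload) // 8   # number of complete 8-byte doubles
    values = struct.unpack("<%dd" % n, payload[:n * 8])
    return header, values
```

Because the file is read until the marker rather than for a fixed number of bytes, this copes with headers of different sizes. If the values look wrong, the float width or byte order assumption is the first thing to check.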

Python 2.6: Creating image from array

Python rookie here! So, I have a data file which stores a list of bytes, representing pixel values in an image. I know that the image is 3-by-3 pixels. Here's my code so far:
# Part 1: read the data
data = []
file = open("test.dat", "rb")
for i in range(0, 9):
    byte = file.read(1)
    data.append(byte)
file.close()
# Part 2: create the image
image = PIL.Image.frombytes('L', (3, 3), data)
image.save('image.bmp')
I have a couple of questions:
In part 1, is this the best way to read a binary file and store the data in an array?
In part 2, I get the error "TypeError: must be string or read-only buffer, not list".
Any help on either of these?
Thank you!
Part 1
If you know that you need exactly nine bytes of data, that looks like a fine way to do it, though it would probably be cleaner/clearer to use a context manager and skip the explicit loop:
with open('test.dat', 'rb') as infile:
    data = list(infile.read(9))  # read nine bytes and convert to a list
Part 2
According to the documentation, the data you must pass to PIL.Image.frombytes is:
data – A byte buffer containing raw data for the given mode.
A list isn't a byte buffer, so you're probably wasting your time converting the input to a list. My guess is that if you pass it the byte string directly, you'll get what you're looking for. This is what I'd try:
with open('test.dat', 'rb') as infile:
    data = infile.read(9)  # Don't convert the bytestring to a list
image = PIL.Image.frombytes('L', (3, 3), data)  # pass in the bytestring
image.save('image.bmp')
Hopefully that helps; obviously I can't test it over here since I don't know what the content of your file is.
Of course, if you really need the bytes as a list for some other reason (doubtful--you can iterate over a string just as well as a list), you can always either convert them to a list when you need it (datalist = list(data)) or join them into a string when you make the call to PIL:
image = PIL.Image.frombytes('L', (3, 3), ''.join(datalist))
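The join above relies on Python 2, where each list element is a one-character string. As a hedged aside for Python 3 (an assumption, since the question targets Python 2.6): a file opened with 'rb' yields a bytes object, list() turns it into a list of integers, and bytes() turns that list back into a byte buffer. The nine pixel values below are made up for illustration.

```python
# Nine hypothetical 8-bit grayscale pixel values standing in for test.dat
raw = bytes([0, 64, 128, 255, 128, 64, 0, 64, 128])

datalist = list(raw)       # Python 3: a list of ints in the range 0..255
rebuilt = bytes(datalist)  # back to a byte buffer, replacing ''.join(datalist)
```

Here rebuilt is the kind of byte buffer that frombytes expects, per the documentation quoted above.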
Part 3
This is sort of an aside, but it's likely to be relevant: do you know what version of PIL you're using? If you're using the actual, original Python Imaging Library, you may also be running into some of the many problems with that library--it's super buggy and unsupported since about 2009.
If you are, I highly recommend getting rid of it and grabbing the Pillow fork instead, which is the live, functional version. You don't have to change any code (it still installs a module called PIL), but the Pillow library is superior to the original PIL by leaps and bounds.

Read binary file with header using bitarray's from file in Python

I wrote a program that uses bitarray 0.8.0 to write bits to a binary file. I would like to add a header to this binary file to describe what's inside the file.
My problem is that I think the method "fromfile" of bitarray necessarily starts reading the file from the beginning. I could make a workaround so that the reading program gets the header and then rewrite a temporary file containing only the binary portion (bitarray tofile), but it doesn't sound too efficient of an idea.
Is there any way to do this properly?
My file could look something like the following where clear text is the header and binary data is the bitarray information:
...{(0, 0): '0'}{(0, 0): '0'}{(0, 0): '0'}���������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������...
Edit:
I tried the following after reading the response:
bits = ""
b = bitarray()
with open(Filename, 'rb') as file:
    # Get header
    byte = file.read(1)
    while byte != "":
        # read header
        byte = file.read(1)
    b.fromfile(file)
print b.to01()
print "len(b.to01())", len(b.to01())
The length is 0 and the print of "to01()" is empty.
However, the print of the header is fine.
"My problem is that I think the method "fromfile" of bitarray necessarily starts reading the file from the beginning."
This is likely false; it, like most other file read routines, probably starts at the current position within the file, and stops at EOF.
EDIT:
From the documentation:
fromfile(f, [n])
Read n bytes from the file object f and append them to the bitarray interpreted as machine values. When n is omitted, as many bytes are read until EOF is reached.
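That also explains the empty result in the question's edit: the while loop keeps reading until read() returns an empty string, so the file position is already at EOF when fromfile is called. A minimal sketch of one way around this is to store the header length in front of the header. The 4-byte length prefix is a hypothetical layout, not something from the question, and the function names are made up. Plain f.read() stands in for the bitarray call so the example needs only the standard library.

```python
import struct


def write_with_header(path, header, payload):
    """Write a 4-byte little-endian header length, the header, then the payload."""
    with open(path, "wb") as f:
        f.write(struct.pack("<I", len(header)))
        f.write(header)
        f.write(payload)


def read_with_header(path):
    """Return (header, payload), reading only the header bytes before the payload."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<I", f.read(4))
        header = f.read(header_len)
        # The file position is now at the first payload byte. With bitarray,
        # this is where b.fromfile(f) would be called; f.read() stands in here.
        payload = f.read()
    return header, payload
```

Because the reader consumes exactly header_len bytes, whatever reads next starts at the binary portion, and no temporary file is needed.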

Save an array as bin in Matlab, pass it to Python and read the bin file in Python

I am currently trying to save an array as a bin file in Matlab, send it to Python and read it in Python. However, Matlab is showing errors when I run it. I am using the following code:
Read the array in Matlab, convert to bin file and pass to Python:
array1 = rand(5,1); %% array1 is the desired array that needs to be sent to Python
fid = fopen('nazmul.bin','wb'); %% I want to save array1 in the nazmul.bin file
fwrite(fid,array1);
status=fclose(fid);
python('squared.py','nazmul.bin'); %% I want to send the parameters to squared.py program
squared.py file:
import sys
if __name__ == '__main__':
    f = open("nazmul.bin", "rb")  # Trying to open the bin file
    try:
        byte = f.read(1)  # Reading the bin file and saving it in the byte array
        while byte != "":
            # Do stuff with byte.
            byte = f.read(1)
    finally:
        f.close()
    print byte  # printing the byte array from Python
However, when I run this program, nothing gets printed. I guess that the bin file is not getting passed properly to the squared.py file.
Thanks for your feedback.
Nazmul
There are several problems here.
You should use double underscores when checking for 'main', i.e. __name__ == "__main__".
You are not collecting bytes but rather always storing the last byte read. Therefore, the last byte is always "".
Finally, it seems the indentation is not correct. I assume this is just a Stack Overflow formatting error.
One more potential issue - When you use fwrite(fid, A) in MATLAB, it assumes that you want to write bytes (8 bit numbers). However, your rand() command generates reals, so MATLAB first rounds the results to integers and your binary file will hold '0' or '1' only.
Final note: Reading a file one byte at a time is probably very inefficient. It is probably better to read the file in large chunks, or - if it is a small file - read the entire file in one read() operation.
The corrected Python code is as follows:
if __name__ == '__main__':
    f = open("xxx.bin", "rb")  # Trying to open the bin file
    try:
        a = []
        byte = f.read(1)  # Read the first byte
        while byte != b"":
            a.append(byte)  # Collect each byte instead of overwriting it
            # Do stuff with byte.
            byte = f.read(1)
    finally:
        f.close()
    print(a)
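Following the final note about reading the file in one operation, here is a minimal Python 3 sketch that reads the whole file and unpacks it as 64-bit floats. It assumes the MATLAB side is changed to write full doubles, for example with fwrite(fid, array1, 'double'), and that the file is little-endian. Both are assumptions, not something stated in the question. The function name read_doubles is made up.

```python
import struct


def read_doubles(path, byte_order="<"):
    """Read an entire binary file and return its contents as a list of floats."""
    with open(path, "rb") as f:
        raw = f.read()  # small file: one read() call instead of byte-by-byte
    if len(raw) % 8:
        raise ValueError("file size is not a multiple of 8 bytes")
    count = len(raw) // 8
    # 'd' = 64-bit float; byte_order '<' means little-endian (assumed)
    return list(struct.unpack("%s%dd" % (byte_order, count), raw))
```

For the 5-by-1 array in the question, read_doubles("nazmul.bin") would return five floats if the file was written as doubles.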
