I have an image in .img format. The image size is 1920x1200 px. It's an RGB image with 8-bit depth. I am using the following Python code to recover the image. The code runs without error and displays an image, but the content isn't correct. I don't know where I went wrong. Can anyone help?
import numpy as np
import PIL.Image

w, h = 1920, 1200  # .img image size in px

# read the .img file and save it as PNG
with open(file_add, 'rb') as f:
    # Seek backwards from the end of the file by w*h*3 bytes (3 bytes per pixel)
    f.seek(-w*h*3, 2)
    img = np.fromfile(f, dtype=np.uint8).reshape((h, w, 3))

# Save as PNG, and retain 8-bit resolution
PIL.Image.fromarray(img).save('result.png')
I would like to upload the .img file, but it's larger than the 2 MB limit.
Your file is in some hideous, Microsoft-designed "Compound File Binary Format", described at the links below. I don't run Windows so I cannot unpack it. There are apparently tools available, but I cannot vouch for any of them:
https://openrpmsgfile.com/cfbf.html
http://fileformats.archiveteam.org/wiki/Microsoft_Compound_File
There seems to be a Python module called olefile that can read these things. I installed it and was able to test your file and find your image within it as follows:
#!/usr/bin/env python3
import olefile
import numpy as np
from PIL import Image
# Open file
ole = olefile.OleFileIO('image.img')
# Get a directory listing
ole.dumpdirectory()
# Open image stream within file and read
stream = ole.openstream('image/__102/DataObject')
data = stream.read()
# Define image width, height and bytes per pixel
w, h, bpp = 1920, 1200, 3
imsize = w * h * bpp
# Check data size and image size
print(f'Data size: {len(data)}, Image size: {imsize}')
# There are 192 bytes difference, assume it is a header and take our bytes from the tail of the file
data = data[-imsize:]
# Make into a NumPy array: the data is line-interleaved,
# i.e. a whole row of R, then a row of G, then a row of B
na = np.frombuffer(data, dtype=np.uint8).reshape((h*3, w))

# De-interleave the rows into separate channels and stack them pixel-interleaved
R = na[0::3]
G = na[1::3]
B = na[2::3]
na = np.dstack((R, G, B))
# Make into PIL Image and save, but you could equally use OpenCV or scikit-image here
Image.fromarray(na).save('result.jpg')
Sample output from running the script:
'Root Entry' (root) 192 bytes
   'NonDataObjects' (stream) 26 bytes
   'Signature' (stream) 12 bytes
   'image' (storage)
      '__102' (storage)
         'DataObject' (stream) 6912192 bytes
         'DataObjectChilds' (stream) 4 bytes
         'DataObjectStub' (stream) 6760 bytes
Data size: 6912192, Image size: 6912000
I worked out it was a CFBF file from the following. Firstly, if you run the Linux/Unix file command to determine the type of the file, you get this:
file image.img
image.img: Composite Document File V2 Document, Cannot read section info
Secondly, if you dump the file with xxd you will see the CFBF signature bytes referred to in the links above:
xxd image.img
00000000: d0cf 11e0 a1b1 1ae1 0000 0000 0000 0000 ................
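If you want to check for those signature bytes from Python rather than with xxd, a minimal sketch (the filename is an assumption):

# Read the first 8 bytes and compare against the CFBF magic number
with open('image.img', 'rb') as f:
    magic = f.read(8)
print(magic == b'\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1')  # True for CFBF files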
Keywords: OLE file, CFBF, Composite Document File V2 Document, IMG format, d0cf11e0a1b1
This post seems to accomplish what you're looking for. It reads the data with matplotlib instead, but it should still be able to do what you want.
If you have a .img file with an accompanying .hdr file, you can use nibabel or SimpleITK and transform it to a NumPy array.
Important: SimpleITK supports a maximum of 5 dimensions.
import nibabel as nib

data_path = "sample/path"
array_data = nib.load(data_path).get_fdata()  # you get your matrix
print(array_data.shape)
Example with SimpleITK:
import SimpleITK as sitk

data_path = "/your/path"
imgObj = sitk.ReadImage(data_path)  # you get an Image object - a complex data format to handle
array_data = sitk.GetArrayFromImage(imgObj)  # your array matrix
Related
I am trying to convert an image (specifically Spotify album covers which are a constant size of 640x640) to a byte array which will then be used to display the image on a 32x32 RGB matrix. This is the code:
import urllib.request

url = "https://i.scdn.co/image/ab67616d0000b2734c63ce517dd44d0fbd9416da"
path = "test.jpg"

def image_byte(url, file_path):
    urllib.request.urlretrieve(url, file_path)
    with open(file_path, "rb") as image:
        f = image.read()
        b = bytearray(f)
    return b

b = image_byte(url, path)
print(len(b))
The length of the byte array comes out to be 175742. Why? Shouldn't the byte array size be 640 x 640 x 3 = 1228800?
Sorry if I am missing something huge, this is my first project with an RGB matrix and anything to do with image conversion, thanks in advance.
This is because the image is compressed in the JPEG format. If the returned data were the raw pixel values, your calculation would be right; however, pretty much all services return image data in a compressed form.
You can decode the compressed image with a library like Pillow:
from PIL import Image

with open("example_image.jfif", "rb") as img_data:
    img = Image.open(img_data)
    actual_data = list(img.getdata())  # actual_data will hold all pixel values
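To confirm the raw size once the image is decoded, a quick check (assuming the 640x640 cover from the question):

# The decoded pixel data matches the expected raw size
w, h = img.size
print(w * h * 3)         # 1228800 raw bytes for a 640x640 RGB image
print(len(actual_data))  # 409600 pixels, each an (R, G, B) tuple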
Can someone please explain why I get this inconsistency in RGB values after saving the image?
import imageio as io

image = 'img.jpg'
type = image.split('.')[-1]
output = 'output.' + type
img = io.imread(image)
print(img[0][0][1])  # 204
img[0][0][1] = 255
print(img[0][0][1])  # 255
io.imwrite(output, img, type, quality=100)
imgTest = io.imread(output)
print(imgTest[0][0][1])  # 223
# io.help('jpg')
Image used: img.jpg
The reason that pixels change when loading a JPEG image and then saving it as JPEG again is that JPEG uses lossy compression. To save storage space, pixel values are stored in a dimension-reduced representation. You can find some information about the specific algorithm here.
The advantage of lossy compression is that the image size can be reduced significantly without the human eye noticing any changes. However, without any additional measures, we will not retrieve the original image after saving it in JPEG format.
An alternative that does not use lossy compression is the PNG format, which we can verify by converting your example image to PNG and running the code again:
import imageio as io

image = '/content/drive/My Drive/img.png'
type = image.split('.')[-1]
output = 'output.' + type
img = io.imread(image)
print(img[0][0][1])  # 204
img[0][0][1] = 255
print(img[0][0][1])  # 255
io.imwrite(output, img, type)
imgTest = io.imread(output)
print(imgTest[0][0][1])  # 255 - the edited value survives the PNG round trip
Output:
204
255
255
We can also see that the PNG image takes up much more storage space than the JPEG image:
import os
os.path.getsize('img.png')
# output: 688444
os.path.getsize('img.jpg')
# output: 69621
There is a defined process in imageio: it reads images in RGB channel order. If you then save with OpenCV, you need to convert that RGB to BGR, because OpenCV expects BGR. (If you plot with matplotlib, which expects RGB, it varies accordingly.) The best way is (see the sketch below):
Read the image with imageio
Convert RGB to BGR
Save it with OpenCV's write
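A minimal sketch of that sequence (the filenames are assumptions):

import imageio as io
import cv2

img_rgb = io.imread('img.jpg')                      # imageio returns RGB
img_bgr = cv2.cvtColor(img_rgb, cv2.COLOR_RGB2BGR)  # reorder channels for OpenCV
cv2.imwrite('output.jpg', img_bgr)                  # cv2.imwrite expects BGR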
After deciding not to use OpenCV, because I only use one function of it, I was looking to replace the cv2.imencode() function with something else. The goal is to convert a 2D NumPy array into an image format (like .png) to send it to the GCloud Vision API.
This is what I was using until now:
content = cv2.imencode('.png', image)[1].tostring()
image = vision.types.Image(content=content)
And now I am looking to achieve the same without using OpenCV.
Things I've found so far:
Vision API needs base64 encoded data
Imencode returns the encoded bytes for the specific image type
I think it is worth noting that my NumPy array is a binary image with only 2 dimensions, and the whole function will be used in an API, so saving a PNG to disk and reloading it is to be avoided.
PNG writer in pure Python
If you're insistent on using more or less pure python, the following function from ideasman's answer to this question is useful.
def write_png(buf, width, height):
    """ buf: must be bytes or a bytearray in Python3.x,
        a regular string in Python2.x.
    """
    import zlib, struct

    # reverse the vertical line order and add null bytes at the start
    width_byte_4 = width * 4
    raw_data = b''.join(
        b'\x00' + buf[span:span + width_byte_4]
        for span in range((height - 1) * width_byte_4, -1, -width_byte_4)
    )

    def png_pack(png_tag, data):
        chunk_head = png_tag + data
        return (struct.pack("!I", len(data)) +
                chunk_head +
                struct.pack("!I", 0xFFFFFFFF & zlib.crc32(chunk_head)))

    return b''.join([
        b'\x89PNG\r\n\x1a\n',
        png_pack(b'IHDR', struct.pack("!2I5B", width, height, 8, 6, 0, 0, 0)),
        png_pack(b'IDAT', zlib.compress(raw_data, 9)),
        png_pack(b'IEND', b'')])
Write Numpy array to PNG formatted byte literal, encode as base64
To represent the grayscale image as an RGBA image, we will stack the matrix into 4 channels and set the alpha channel. (Supposing your 2d numpy array is called "img"). We also flip the numpy array vertically, due to the manner in which PNG coordinates work.
import base64
import numpy as np

img_rgba = np.flipud(np.stack((img,) * 4, axis=-1))  # flip y-axis
img_rgba[:, :, -1] = 255  # set the alpha channel to fully opaque
data = write_png(bytearray(img_rgba), img_rgba.shape[1], img_rgba.shape[0])
data_enc = base64.b64encode(data)
Test that encoding works properly
Finally, to ensure the encoding works, we decode the base64 string and write the output to disk as "test_out.png". Check that this is the same image you started with.
with open("test_out.png", "wb") as fb:
fb.write(base64.decodestring(data_enc))
Alternative: Just use PIL
However, I'm assuming that you are using some library to actually read your images in the first place? (Unless you are generating them). Most libraries for reading images have support for this sort of thing. Supposing you are using PIL, you could also try the following snippet (from this answer). It just saves the file in memory, rather than on disk, and uses this to generate a base64 string.
import io
import base64

in_mem_file = io.BytesIO()
img.save(in_mem_file, format="PNG")
# reset file pointer to start
in_mem_file.seek(0)
img_bytes = in_mem_file.read()
base64_encoded_result_bytes = base64.b64encode(img_bytes)
base64_encoded_result_str = base64_encoded_result_bytes.decode('ascii')
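Since your starting point is a 2D NumPy array rather than a PIL Image, you would build img first; a sketch, assuming the binary array is called arr and holds values 0 and 1:

import numpy as np
from PIL import Image

# Scale the 0/1 mask to 0/255 so it is visible as a grayscale image
img = Image.fromarray((arr * 255).astype(np.uint8))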
I would like to know if there is a way in Python to measure the memory consumption of a PNG image.
For my test I have two images, normal.png and evil.png. Let's say both images are 100 kB in size.
normal.png consists of data represented by 1 byte per pixel.
evil.png consists of \x00 bytes and a PLTE chunk - 3 bytes per pixel.
For normal.png I could decompress the IDAT data chunk, measure the size and compare it with the original file size to get an approximate memory consumption.
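A minimal sketch of that IDAT approach (assuming a well-formed PNG):

import struct
import zlib

def decompressed_idat_size(path):
    with open(path, 'rb') as f:
        data = f.read()
    idat = b''
    pos = 8  # skip the 8-byte PNG signature
    while pos < len(data):
        (length,) = struct.unpack('!I', data[pos:pos + 4])
        ctype = data[pos + 4:pos + 8]
        if ctype == b'IDAT':
            idat += data[pos + 8:pos + 8 + length]
        pos += 12 + length  # 4-byte length + 4-byte type + data + 4-byte CRC
    return len(zlib.decompress(idat))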
But how should I proceed with evil.png?
You can use the Pillow library to identify the image and to get the number of pixels and the mode, which can be translated into the bit depth:
from PIL import Image

mode_to_bpp = {'1': 1, 'L': 8, 'P': 8, 'RGB': 24, 'RGBA': 32,
               'CMYK': 32, 'YCbCr': 24, 'I': 32, 'F': 32}
i = Image.open('warty-final-ubuntu.png')
w, h = i.size  # PIL reports (width, height)
n_pixels = w * h
bpp = mode_to_bpp[i.mode]
n_bytes = n_pixels * bpp // 8
Image.open does not load the whole data yet: the compressed image in question is 3367987 bytes and 4096000 pixels, and uses 12288000 bytes of memory when uncompressed; however, running strace on the Python script shows that Image.open read only 4096 bytes from the file into memory.
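If you do want to force the full decode after inspecting the header, Image.load() triggers it; a small sketch:

from PIL import Image

i = Image.open('warty-final-ubuntu.png')  # reads only the header
i.load()  # forces the full pixel data to be decoded into memory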
I am trying to open a TIFF image with 16 bits per pixel and multiple bands, to convert it into a raw file. I'm using PIL with the following commands: i = Image.open('image.tif') and then rawData = i.tostring(). It doesn't work with a multi-band TIFF image.
The error is:
File "C:\Python27\lib\site-packages\PIL\Image.py", line 1980, in open
raise IOError("cannot identify image file")
IOError: cannot identify image file
The directory contains the file.
How should I do it?
GDAL is pretty good at opening multiband rasters, and supports 11 different band types, including int16.
from osgeo import gdal

ds = gdal.Open('image.tif')

# loop through each band
for bi in range(ds.RasterCount):
    band = ds.GetRasterBand(bi + 1)
    # Read this band into a 2D NumPy array
    ar = band.ReadAsArray()
    print('Band %d has type %s' % (bi + 1, ar.dtype))
    raw = ar.tobytes()  # tostring() is deprecated in NumPy; tobytes() is equivalent
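If you want all the bands in one go, the dataset's ReadAsArray reads the full raster at once; a sketch:

# Read every band into a single (bands, rows, cols) array
all_bands = ds.ReadAsArray()
raw_all = all_bands.tobytes()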