PIL cannot identify image file for io.BytesIO object

I am using the Pillow fork of PIL and keep receiving the error
OSError: cannot identify image file <_io.BytesIO object at 0x103a47468>
when trying to open an image. I am using virtualenv with python 3.4 and no installation of PIL.
I have tried to find a solution to this based on others encountering the same problem, however, those solutions did not work for me. Here is my code:
from PIL import Image
import io
# This portion is part of my test code
byteImg = Image.open("some/location/to/a/file/in/my/directories.png").tobytes()
# Non test code
dataBytesIO = io.BytesIO(byteImg)
Image.open(dataBytesIO) # <- Error here
The image exists in the initial opening of the file and it gets converted to bytes. This appears to work for almost everyone else but I can't figure out why it fails for me.
EDIT:
dataBytesIO.seek(0)
does not work as a solution (tried it) since I'm not saving the image via a stream, I'm just instantiating the BytesIO with data, therefore (if I'm thinking of this correctly) seek should already be at 0.

(This solution is from the author himself. I have just moved it here.)
SOLUTION:
# This portion is part of my test code
byteImgIO = io.BytesIO()
byteImg = Image.open("some/location/to/a/file/in/my/directories.png")
byteImg.save(byteImgIO, "PNG")
byteImgIO.seek(0)
byteImg = byteImgIO.read()
# Non test code
dataBytesIO = io.BytesIO(byteImg)
Image.open(dataBytesIO)
The problem was with the way that Image.tobytes() was returning the bytes object. tobytes() returns the raw, decoded pixel data rather than an encoded image file, so the result has no PNG header for Image.open() to identify. The 'encoding' couldn't be anything other than raw, which still appeared to output wrong data since almost every byte appeared in the format \xff. However, saving the image into a BytesIO and using .read() to read the entire encoded image gave the correct bytes, which could then be used later.
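To make the difference concrete, here is a minimal sketch, assuming Pillow is installed. It uses a small generated image in place of the file on disk, and the helper name can_identify is made up for this illustration.

```python
import io

from PIL import Image

# A small generated image stands in for the PNG on disk (assumption for this demo).
original = Image.new("RGB", (4, 4), "red")

# Raw pixel data: 4 * 4 pixels * 3 channels, with no file header at all.
raw_pixels = original.tobytes()

# Encoded PNG file contents, written into memory instead of to disk.
buffer = io.BytesIO()
original.save(buffer, format="PNG")
encoded_png = buffer.getvalue()


def can_identify(data):
    """Return True if Pillow recognises `data` as an image file (hypothetical helper)."""
    try:
        Image.open(io.BytesIO(data))
        return True
    except OSError:  # UnidentifiedImageError is a subclass of OSError
        return False


print(len(raw_pixels), can_identify(raw_pixels))
print(encoded_png[:8], can_identify(encoded_png))
```

Only the bytes produced by save() begin with the PNG signature, which is what Image.open() looks for when it identifies a file.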

image = Image.open(io.BytesIO(decoded))
# File "C:\Users\14088\anaconda3\envs\tensorflow\lib\site-packages\PIL\Image.py", line 2968, in open
# "cannot identify image file %r" % (filename if filename else fp)
# PIL.UnidentifiedImageError: cannot identify image file <_io.BytesIO object at 0x000002B733BB11C8>
This is how I fixed it:
# Inside the Flask view function (predict)
message = request.get_json(force=True)
encoded = message['image']
# https://stackoverflow.com/questions/26070547/decoding-base64-from-post-to-use-in-pil
# Removing the leading "data:image/...;base64," prefix is very important.
# If the prefix is not removed, the Image.open() call below fails with:
# PIL.UnidentifiedImageError: cannot identify image file <_io.BytesIO object at 0x000002B733BB11C8>
image_data = re.sub('^data:image/.+;base64,', '', encoded)
decoded = base64.b64decode(image_data)
image = Image.open(io.BytesIO(decoded))
processed_image = preprocess_image(image, target_size=(224, 224))
prediction = model.predict(processed_image).tolist()
response = {
    'prediction': {
        'dog': prediction[0][0],
        'cat': prediction[0][1]
    }
}
print('response:', response)
return jsonify(response)
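The prefix-stripping step can be tried on its own with only the standard library. This is a sketch under assumptions: the function name decode_data_url and the sample payload are invented for the illustration, and the payload is just the PNG signature plus filler bytes rather than a real image.

```python
import base64
import re

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_data_url(value):
    """Strip an optional 'data:image/...;base64,' prefix, then decode (hypothetical helper)."""
    payload = re.sub(r"^data:image/[^;]+;base64,", "", value)
    return base64.b64decode(payload)


# Build a sample data URL the way a browser would send it.
sample = "data:image/png;base64," + base64.b64encode(PNG_SIGNATURE + b"demo").decode("ascii")
decoded = decode_data_url(sample)
print(decoded.startswith(PNG_SIGNATURE))
```

Checking the first bytes of the decoded payload like this is a quick way to confirm that what reaches Image.open() really starts with an image file header.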

In some cases the same error occurs when you are dealing with a raw image file such as CR2. Example: http://www.rawsamples.ch/raws/canon/g10/RAW_CANON_G10.CR2
When you try to run:
byteImg = Image.open("RAW_CANON_G10.CR2")
you will get this error:
OSError: cannot identify image file 'RAW_CANON_G10.CR2'
So you need to convert the image using rawkit first. Here is an example of how to do it:
from io import BytesIO
from PIL import Image, ImageFile
import numpy
from rawkit import raw
def convert_cr2_to_jpg(raw_image):
    raw_image_process = raw.Raw(raw_image)
    buffered_image = numpy.array(raw_image_process.to_buffer())
    if raw_image_process.metadata.orientation == 0:
        jpg_image_height = raw_image_process.metadata.height
        jpg_image_width = raw_image_process.metadata.width
    else:
        jpg_image_height = raw_image_process.metadata.width
        jpg_image_width = raw_image_process.metadata.height
    jpg_image = Image.frombytes('RGB', (jpg_image_width, jpg_image_height), buffered_image)
    return jpg_image
byteImg = convert_cr2_to_jpg("RAW_CANON_G10.CR2")
Code credit goes to mateusz-michalik on GitHub (https://github.com/mateusz-michalik/cr2-to-jpg/blob/master/cr2-to-jpg.py)

While reading DICOM files, the problem might be caused by DICOM compression.
Make sure both gdcm and pydicom are installed.
GDCM is usually the more difficult of the two to install. An easy way to install it is from the conda-forge channel:
conda install -c conda-forge gdcm

When dealing with URLs, this error can arise from a wrong extension on the downloaded
file or simply from a corrupted file.
To avoid that, use a try/except block so your app doesn't crash and can continue its job.
In the except part, you can record the file in question for later analysis.
A snippet:
from contextlib import closing
import urllib.request
# Image and mm are not defined in the original snippet; they look like
# reportlab's Image flowable and mm unit rather than PIL (assumption).
for url in urls:
    with closing(urllib.request.urlopen(url)) as f:
        try:
            img = Image(f, 30*mm, 30*mm)
            d_img.append(img)
        except Exception as e:
            print(url)  # here you get the file causing the exception
            print(e)

The image file itself might be corrupted. So if you need to process a large number of image files, simply enclose the line that processes each image file in a try/except statement.
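A minimal sketch of that pattern, assuming Pillow is installed. The function name load_images and the sample payloads are invented for this illustration; in real code the payloads would come from files or downloads.

```python
import io

from PIL import Image


def load_images(named_payloads):
    """Open each payload, collecting failures instead of stopping (hypothetical helper)."""
    loaded, failed = [], []
    for name, data in named_payloads:
        try:
            image = Image.open(io.BytesIO(data))
            image.load()  # force decoding so damaged pixel data is detected here as well
        except OSError as exc:  # includes "cannot identify image file"
            failed.append((name, str(exc)))
        else:
            loaded.append((name, image))
    return loaded, failed


# One valid in-memory PNG and one payload that is not an image at all.
good = io.BytesIO()
Image.new("RGB", (2, 2), "white").save(good, format="PNG")
loaded, failed = load_images([("good.png", good.getvalue()), ("bad.png", b"not an image")])
print([name for name, _ in loaded], [name for name, _ in failed])
```

Keeping the failing names and messages makes it possible to inspect the bad files afterwards without interrupting the batch.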

Related

PIL Image as Bytes with BytesIO to prevent hard disk saving

Problem
I have a PIL Image and I want to convert it to a bytes array. I can't save the image on my hard disk, so I can't use the default open(file_path, 'rb') function.
What I tried
To work around this problem, I'm trying to use the io library like this:
buf = io.BytesIO()
image.save(buf, format='JPEG')
b_image = buf.getvalue()
Here, image is a valid PIL Image.
The "b_image" value will be used as the argument to the Microsoft Azure Cognitive Services function read_in_stream().
According to the documentation, the image argument of this function has to be:
image (Generator, required): An image stream.
The issue
When I execute it, I get this error:
File "C:...\envs\trainer\lib\site-packages\msrest\service_client.py", line 137, in stream_upload
chunk = data.read(self.config.connection.data_block_size)
AttributeError: 'bytes' object has no attribute 'read'
There is no error in the client authentication or anywhere else, because when I pass an image opened with this line as the parameter:
image = open("./1.jpg", 'rb')
everything works correctly.
Sources
I also saw this post that explains exactly what I want to do, but in my case it's not working. Any idea would be appreciated.

The read_in_stream method needs a stream, that is, an object with a read() method. BytesIO.getvalue() returns the contents of the stream as bytes, which has no read() method. So pass the buffer itself and rewind it first, because save() leaves the position at the end of the buffer:
buf = io.BytesIO()
image.save(buf, format='JPEG')
buf.seek(0)
computervision_client.read_in_stream(buf)
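The difference between the bytes value and the stream object can be seen with the standard library alone. In this sketch a plain write() stands in for image.save(buf, format='JPEG'), which is an assumption made to keep the example free of dependencies.

```python
import io

buf = io.BytesIO()
buf.write(b"fake image bytes")  # stands in for image.save(buf, format="JPEG")

# getvalue() returns a bytes object, which has no read() method.
as_bytes = buf.getvalue()
has_read = hasattr(as_bytes, "read")

# The buffer itself is a stream, but after writing, its position is at the end.
read_without_rewind = buf.read()
buf.seek(0)
read_after_rewind = buf.read()

print(has_read, read_without_rewind, read_after_rewind)
```

This is why a consumer that calls read() on its argument fails with AttributeError for bytes, and why it would receive nothing from a buffer that has not been rewound.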
Update
Regarding the issue, I suggest you use the REST API instead.
import io
import requests
from PIL import Image
import time
url = "{your endpoint}/vision/v3.1/read/analyze"
key = ''
headers = {
    'Ocp-Apim-Subscription-Key': key,
    'Content-Type': 'application/octet-stream'
}
# process image
...
with io.BytesIO() as buf:
    im.save(buf, 'jpeg')
    response = requests.request(
        "POST", url, headers=headers, data=buf.getvalue())
# get result
while True:
    res = requests.request(
        "GET", response.headers['Operation-Location'], headers=headers)
    status = res.json()['status']
    if status == 'succeeded':
        print(res.json()['analyzeResult'])
        break
    if status == 'failed':
        # stop polling instead of looping forever
        print(res.json())
        break
    time.sleep(1)

Error while trying to get the EXIF tags of the image

I'm trying to get the EXIF tags of a JPG image. To do this, I'm using the piexif module.
The problem is that I get a KeyError:
Traceback (most recent call last):
File "D:/PythonProjects/IMGDateByNameRecovery/recovery.py", line 98, in selectImages
self.setExifTag(file_str)
File "D:/PythonProjects/IMGDateByNameRecovery/recovery.py", line 102, in setExifTag
exif = piexif.load(img.info["Exif"])
KeyError: 'Exif'
I did everything as described in the docs, in several Stack Overflow questions, and on the PyPI website. My code:
img = Image.open(file)
exif_dict = piexif.load(img.info["exif"])
altitude = exif_dict['GPS'][piexif.GPSIFD.GPSAltitude]
print(altitude)
How do I read the image's EXIF tags then? Am I doing it wrong?
Please, I'm so clueless. This is such a weird error.

Pillow only adds the exif key to Image.info if EXIF data exists. So if the image has no EXIF data, your script will raise the error because the key does not exist.
You can see which image formats support the info["exif"] data in the Image file formats documentation.
You could do something like this...
img = Image.open(file)
exif_dict = img.info.get("exif")  # returns None if exif key does not exist
if exif_dict:
    exif_data = piexif.load(exif_dict)
    altitude = exif_data['GPS'][piexif.GPSIFD.GPSAltitude]
    print(altitude)
else:
    # Do something else when there is no EXIF data on the image.
    pass
Using mydict.get("key") will return None if the key does not exist, whereas mydict["key"] will raise a KeyError.
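The same behaviour can be checked with a plain dictionary. In this sketch the dictionary named info is a stand-in for Image.info on an image without EXIF data; its contents are made up for the illustration.

```python
# Stand-in for Image.info on an image that carries no EXIF data (assumption).
info = {"dpi": (72, 72)}

# .get() returns None (or a chosen default) when the key is missing.
exif_bytes = info.get("exif")
exif_or_empty = info.get("exif", b"")

# Indexing raises KeyError for the same missing key.
try:
    info["exif"]
    raised_key_error = False
except KeyError:
    raised_key_error = True

print(exif_bytes, exif_or_empty, raised_key_error)
```

Keys are also case-sensitive, so a lookup for "Exif" and one for "exif" are different lookups.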

Say you have encoded metadata within the MakerNote tag.
Make sure to install the following dependencies:
piexif
Pillow
Then run the following code, assuming the image is Image.png and is in the same directory as the script:
from PIL import Image
import piexif
import pickle
img = Image.open('Image.png')
exif_dict = img.info.get("exif")  # returns None if exif key does not exist
if exif_dict:
    exif_data = piexif.load(exif_dict)
    raw = exif_data['Exif'][piexif.ExifIFD.MakerNote]
    # Assumes the MakerNote was written with pickle.dumps; only unpickle data you trust.
    tags = pickle.loads(raw)
    print(tags)

Image processing after upload with Python Bottle

Context
I have made a simple web app for uploading content to a blog. The front sends AJAX requests (using FormData) to the backend which is Bottle running on Python 3.7. Text content is saved to a MySQL database and images are saved to a folder on the server. Everything works fine.
Image processing and PIL/Pillow
Now, I want to enable processing of uploaded images to standardise them (I need them all resized and/or cropped to 700x400px).
I was hoping to use Pillow for this. My problem is creating a PIL Image object from the file object in Bottle. I cannot initialise a valid Image object.
Code
# AJAX sends request to this route
@post('/update')
def update():
    # Form data
    title = request.forms.get("title")
    body = request.forms.get("body")
    image = request.forms.get("image")
    author = request.forms.get("author")
    # Image upload
    file = request.files.get("file")
    if file:
        extension = file.filename.split(".")[-1]
        if extension not in ('png', 'jpg', 'jpeg'):
            return {"result" : 0, "message": "File Format Error"}
        save_path = "my/save/path"
        file.save(save_path)
The problem
This all works as expected, but I cannot create a valid Image object with pillow for processing. I even tried reloading the saved image using the save path but this did not work either.
Other attempts
The code below did not work. It caused an internal server error, though I am having trouble setting up more detailed Python debugging.
path = save_path + "/" + file.filename
image_data = open(path, "rb")
image = Image.open(image_data)
When logged manually, the path is a valid relative URL ("../domain-folder/images") and I have checked that I am definitely importing PIL (Pillow) correctly using PIL.PILLOW_VERSION.
I tried adapting this answer:
image = Image.frombytes('RGBA', (128,128), image_data, 'raw')
However, I won’t know the size until I have created the Image object. I also tried using io:
image = Image.open(io.BytesIO(image_data))
This did not work either. In each case, it is only the line trying to initialise the Image object that causes problems.
Summary
The Bottle documentation says the uploaded file is a file-like object, but I am not having much success in creating an Image object that I can process.
How should I go about this? I do not have a preference about processing before or after saving. I am comfortable with the processing, it is initialising the Image object that is causing the problem.
Edit - Solution
I got this to work by adapting the answer from eatmeimadanish. I had to use a io.BytesIO object to save the file from Bottle, then load it with Pillow from there. After processing, it could be saved in the usual way.
obj = io.BytesIO()
file.save(obj) # This saves the file retrieved by Bottle to the BytesIO object
path = save_path + "/" + file.filename
# Image processing
im = Image.open(obj) # Reopen the object with PIL
im = im.resize((700,400))
im.save(path, optimize=True)
I found this from the Pillow documentation about a different function that may also be of use.
PIL.Image.frombuffer(mode, size, data, decoder_name='raw', *args)
Note that this function decodes pixel data only, not entire images.
If you have an entire image file in a string, wrap it in a BytesIO object, and use open() to load it.

Use an in-memory buffer instead (StringIO on Python 2; on Python 3, as in the question, io.BytesIO).
import base64
import io
from PIL import Image
# Save your in-memory file to this buffer instead of a regular file
s = io.BytesIO()
file = request.files.get("file")
if file:
    extension = file.filename.split(".")[-1]
    if extension not in ('png', 'jpg', 'jpeg'):
        return {"result" : 0, "message": "File Format Error"}
    file.save(s)
    im = Image.open(s)
    im = im.resize((700,400))  # resize() returns a new image
    out = io.BytesIO()  # separate buffer for the processed PNG
    im.save(out, 'png', optimize=True)
    s64 = base64.b64encode(out.getvalue())

From what I understand, you're trying to resize the image after it has been saved locally (note that you could also do the resize before it is saved). If that is what you want to achieve, you can open the image directly with Pillow, which handles opening the file for you (you do not need open(path, "rb")):
image = Image.open(path)
image.resize((700,400)).save(path)
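One detail worth keeping in mind with this call chain is that resize() returns a new image and leaves the original untouched, so its result has to be used or assigned. A minimal sketch, assuming Pillow is installed and using a generated image in place of the uploaded file:

```python
import io

from PIL import Image

# A generated image stands in for the uploaded file (assumption for this demo).
original = Image.new("RGB", (1400, 800), "white")

# resize() does not modify `original`; it returns the resized copy.
resized = original.resize((700, 400))

# Save the resized copy into memory to confirm that it encodes correctly.
out = io.BytesIO()
resized.save(out, format="PNG")

print(original.size, resized.size)
```

Calling resize() on its own line without assigning the result therefore has no visible effect on what gets saved.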

PIL cannot identify image file for a Google Drive image streamed into io.BytesIO

I am using the Drive API to download an image. Following their file downloading documentation in Python, I end up with a variable fh that is a populated io.BytesIO instance. I try to save it as an image:
file_id = "0BwyLGoHzn5uIOHVycFZpSEwycnViUjFYQXR5Nnp6QjBrLXJR"
request = service.files().get_media(fileId=file_id)
fh = io.BytesIO()
downloader = MediaIoBaseDownload(fh, request)
done = False
while done is False:
    status, done = downloader.next_chunk()
    print('Download {} {}%.'.format(file['name'],
                                    int(status.progress() * 100)))
fh.seek(0)
image = Image.open(fh) # error
The error is: cannot identify image file <_io.BytesIO object at 0x106cba890>. Actually, the error does not occur with another image but is thrown with most images, including the one I linked at the beginning of this post.
After reading this answer I change that last line to:
byteImg = fh.read()
dataBytesIO = io.BytesIO(byteImg)
image = Image.open(dataBytesIO) # still the same error
I've also tried this answer, where I change the last line of my first code block to
byteImg = fh.read()
image = Image.open(StringIO(byteImg))
But I still get a cannot identify image file <StringIO.StringIO instance at 0x106471e60> error.
I've tried using alternatives (requests, urllib) with no success. I can Image.open the image if I download it manually.
This error was not present a month ago, and has recently popped up into the application this code is in. I've spent days debugging this error with no success and have finally brought the issue to Stack Overflow. I am using from PIL import Image.

Ditch the Drive service's MediaIoBaseDownload. Instead, use the webContentLink property of a media file (a link for downloading the content of the file in a browser, only available for files with binary content).
With that content link, we can use an alternate form of streaming, the requests and shutil libraries, to get the image.
import requests
import shutil
r = requests.get(file['webContentLink'], stream=True)
with open('output_file', 'wb') as f:
    shutil.copyfileobj(r.raw, f)

Pillow Attribute Error

I want to set up an image stream from my Raspberry Pi to my server.
So I set up a network stream as described in http://picamera.readthedocs.io/en/release-1.12/recipes1.html#streaming-capture.
This worked well, but now I want to save the captured image, so I modified the server script:
import io
import socket
import struct
from PIL import Image
# Start a socket listening for connections on 0.0.0.0:8000 (0.0.0.0 means
# all interfaces)
server_socket = socket.socket()
server_socket.bind(('0.0.0.0', 8000))
server_socket.listen(0)
# Accept a single connection and make a file-like object out of it
connection = server_socket.accept()[0].makefile('rb')
try:
    while True:
        # Read the length of the image as a 32-bit unsigned int. If the
        # length is zero, quit the loop
        image_len = struct.unpack('<L', connection.read(struct.calcsize('<L')))[0]
        if not image_len:
            break
        # Construct a stream to hold the image data and read the image
        # data from the connection
        image_stream = io.BytesIO()
        image_stream.write(connection.read(image_len))
        # Rewind the stream, open it as an image with PIL and do some
        # processing on it
        image_stream.seek(0)
        image = Image.open(image_stream)
        print('Image is %dx%d' % image.size)
        image.verify()
        print('Image is verified')
        im = Image.new("RGB", (640,480), "black") #the saving part
        im = image.copy()
        im.save("./img/test.jpg","JPEG")
finally:
    connection.close()
    server_socket.close()
But it gives me the following error:
Traceback (most recent call last):
File "stream.py", line 33, in <module>
im = image.copy()
File "/usr/lib/python2.7/dist-packages/PIL/Image.py", line 781, in copy
self.load()
File "/usr/lib/python2.7/dist-packages/PIL/ImageFile.py", line 172, in load
read = self.fp.read
AttributeError: 'NoneType' object has no attribute 'read'
How can I fix this?

I don't have a Raspberry Pi, but decided to see if I could reproduce the problem anyway. For input I just created an image file on disk to eliminate all the socket stuff. Sure enough, I got exactly the same error as you encountered. (Note: IMO you should have done this simplification yourself and posted an MCVE illustrating the problem; see How to create a Minimal, Complete, and Verifiable example in the SO Help Center.)
To get the problem to go away I added a call to the image.load() method immediately after the Image.open() statement and things started working. Not only was the error gone, but the output file seemed fine, too.
Here's my simple test code with the fix indicated:
import io
import os
from PIL import Image
image_filename = 'pillow_test.jpg'
image_len = os.stat(image_filename).st_size
image_stream = io.BytesIO()
with open(image_filename, 'rb') as image_file:
    image_stream.write(image_file.read(image_len))
image_stream.seek(0)
image = Image.open(image_stream)
image.load() # <======================== ADDED LINE
print('Image is %dx%d' % image.size)
image.verify()
print('Image is verified')
im = Image.new("RGB", (640,480), "black") #the saving part
im = image.copy()
im.save("pillow_test_out.jpg","JPEG")
print('image written')
The clue was this passage from the pillow documentation for the PIL.Image.open() function:
This is a lazy operation; this function identifies the file, but the file
remains open and the actual image data is not read from the file until you try
to process the data (or call the load() method).
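Emphasis mine. You would think image.verify() would make this unnecessary, because verifying the "file" would seem to require loading the image data in order to check its contents (that method's own documentation says it "verifies the contents of a file"). My guess was that this is a bug. Note, though, that the same documentation also says that if you need to load the image after using verify(), you must reopen the image file, so reopening the stream after verifying is the documented alternative to calling load() first.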
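A sketch of the reopen-after-verify approach, assuming Pillow is installed. A generated PNG held in memory stands in for the data received over the socket, and the variable names are chosen for this illustration only.

```python
import io

from PIL import Image

# Build an in-memory PNG to play the role of the received image data (assumption).
source = io.BytesIO()
Image.new("RGB", (8, 6), "blue").save(source, format="PNG")
payload = source.getvalue()

# First pass: open the data only to verify it.
checker = Image.open(io.BytesIO(payload))
checker.verify()

# Second pass: reopen the same bytes before loading, copying, or saving.
image = Image.open(io.BytesIO(payload))
image.load()
copy = image.copy()

out = io.BytesIO()
copy.save(out, format="JPEG")
print(copy.size)
```

Keeping the received bytes around, rather than only the stream object, makes it cheap to open the image a second time after verification.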
