Why ImageField in form always triggers "invalid_image"? - python

I've implemented ImageField to upload images using Pillow verification in Django 1.8. For some reason, I can't submit the form. It always raises this ValidationError in the form (but with FileField this would work):
Upload a valid image. The file you uploaded was either not an image or a corrupted image.
The weird part of all this is that the ImageField.check method seems to obtain correct MIME type! (see below)
WHAT I TRIED
I've tried with JPG, GIF, and PNG formats; none worked.
So I tried to print some variables in django.forms.fields.ImageField modifying the try statement that triggers this error, adding print statements for testing:
try:
    # load() could spot a truncated JPEG, but it loads the entire
    # image in memory, which is a DoS vector. See #3848 and #18520.
    image = Image.open(file)
    # verify() must be called immediately after the constructor.
    damnit = image.verify()
    print 'MY_LOG: verif=', damnit
    # Annotating so subclasses can reuse it for their own validation
    f.image = image
    # Pillow doesn't detect the MIME type of all formats. In those
    # cases, content_type will be None.
    f.content_type = Image.MIME.get(image.format)
    print 'MY_LOG: image_format=', image.format
    print 'MY_LOG: content_type=', f.content_type
Then I submit a form again to trigger the error after running python manage.py runserver and obtain these lines:
MY_LOG: verif= None
MY_LOG: image_format= JPEG
MY_LOG: content_type= image/jpeg
The image is correctly identified by Pillow and the try statement is executed up to its last line... and still the except clause is triggered? It makes no sense!
Using the same tactic, I tried to obtain some useful log output from django.db.models.fields.files.ImageField and each of its parents up to Field, printing their error lists... all of them empty!
MY QUESTION
Is there anything else I can try to spot what is triggering the ValidationError?
SOME CODE
models.py
from django.db import models

class MyImageModel(models.Model):
    # Using FileField instead would mean a successful upload
    the_image = models.ImageField(upload_to="public_uploads/", blank=True, null=True)
views.py
from django.views.generic.edit import CreateView
from django.forms.models import modelform_factory

class MyFormView(CreateView):
    model = MyImageModel
    form_class = modelform_factory(MyImageModel,
                                   widgets={}, fields=['the_image'])
EDIT:
After trying the tactic suggested by @Alasdair, I obtained this report from e.message:
cannot identify image file <_io.BytesIO object at 0x7f9d52bbc770>
However, the file is successfully uploaded even though I'm not allowed to submit the form. It looks as if, somehow, the path to the image wasn't processed correctly (or something else is hindering the image loading).
I think something is probably failing in these lines (from django.forms.fields.ImageField):
# We need to get a file object for Pillow. We might have a path or we might
# have to read the data into memory.
if hasattr(data, 'temporary_file_path'):
    file = data.temporary_file_path()
else:
    if hasattr(data, 'read'):
        file = BytesIO(data.read())
    else:
        file = BytesIO(data['content'])
If I explore the properties of this BytesIO object, maybe I can extract some relevant information about the error...
EDIT2
data attribute arrives empty! Determining why won't be easy...
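One way an upload's content can look empty is that its stream was already read once and never rewound. This has not been confirmed as the cause here; the sketch below only illustrates the mechanism with a plain BytesIO standing in for the uploaded file (the byte string is a made-up placeholder, not a real JPEG).

```python
from io import BytesIO

# Stand-in for an uploaded file object (placeholder bytes, not a real image).
upload = BytesIO(b"\xff\xd8\xff\xe0 pretend JPEG bytes")

first = upload.read()    # consumes the whole stream
second = upload.read()   # nothing left to read: returns b""

upload.seek(0)           # rewind to the beginning
third = upload.read()    # the full content is available again

print(len(first), len(second), len(third))
```

If something reads the upload before validation runs, the validator sees the equivalent of `second`, and Pillow cannot identify an empty buffer as an image.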

From the Django documentation:
Using an ImageField requires that Pillow is installed with support for the image formats you use. If you encounter a corrupt image error when you upload an image, it usually means that Pillow doesn’t understand its format. To fix this, install the appropriate library and reinstall Pillow.
So first, you should install Pillow instead of PIL (Pillow is a fork of PIL). Second, when installing, make sure that all the libraries Pillow needs to understand the various image formats are installed.
For the list of dependencies, see the Pillow documentation.

After a lot of thought, analysis of the code involved, and trial and error, I edited this line in the try / except block shown in the question (in django.forms.fields.ImageField) like this:
# Before my edit
image = Image.open(file)
# After my edit
image = Image.open(f)
This fixed my issue. Now everything works well and I can submit the form. Invalid files are correctly rejected with the corresponding ValidationError.
MY GUESS ABOUT HOW THIS COULD HAPPEN
I'm not sure if I'm guessing right, but:
I think this worked because the line referred to the wrong variable. In addition, using file as a variable name looks like a mistake, because file is also the name of a built-in in Python 2.
If my guess is right, maybe I should report this issue to the Django developers.

Related

I am getting this error "media type unrecognized" when I upload this image?

I am getting this error ("Media type unrecognized.") after uploading certain kinds of images. Uploading the same image directly to Twitter works fine, but not through the API.
My Code:
images = []
for x in user_tweet_images_list:  # AWS S3 image urls
    media_file = tempfile.NamedTemporaryFile(
        delete=False,
        suffix=get_image_extension(x.tweet_image.url)
    )
    media_file.write(x.tweet_image.read())
    images.append(media_file.name)
media_ids = [api.media_upload(i).media_id_string for i in images]
post_tweet = api.update_status(status=user_tweet.tweet_message, media_ids=media_ids)
EDIT: This error is actually a TweepError originating from Twitter's API. The likely issue is that you're uploading an empty file because you haven't flushed the data to the temporary file, e.g. with media_file.flush().
This is likely due to a known Tweepy issue, which will probably be fixed in v3.9.
In the meantime, depending on what get_image_extension does, you might need to add a dot (.):
[NamedTemporaryFile] does not put a dot between the file name and the suffix; if you need one, put it at the beginning of suffix.
https://docs.python.org/3/library/tempfile.html
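Both points above (getting the data onto disk before another piece of code opens the file by name, and the leading dot in the suffix) can be sketched with the standard library alone. The helper name write_temp_image and the placeholder bytes are assumptions made for illustration; they are not part of Tweepy or of the original code.

```python
import os
import tempfile


def write_temp_image(data, extension):
    """Write image bytes to a named temporary file and return its path."""
    # NamedTemporaryFile does not add the dot itself, so normalise the suffix.
    suffix = extension if extension.startswith(".") else "." + extension
    with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as media_file:
        media_file.write(data)
        media_file.flush()  # push buffered bytes to the file
    # Leaving the with block also closes the file, so the data is on disk
    # before any other code opens it by name.
    return media_file.name


path = write_temp_image(b"\xff\xd8\xff\xe0 placeholder bytes", "jpg")
print(path, os.path.getsize(path))  # the size is non-zero
os.remove(path)  # delete=False means cleanup is up to the caller
```

In the loop from the question, the returned path would be what gets appended to images and later passed to api.media_upload.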

Python Script to detect broken images

I wrote a Python script to detect broken images and count them.
The problem is that my script reports every image rather than only the broken ones. How can I fix this? For my code I referred to:
How to check if a file is a valid image file?
My code
import os
from os import listdir
from PIL import Image
count=0
for filename in os.listdir('/Users/ajinkyabobade/Desktop/2'):
    if filename.endswith('.JPG'):
        try:
            img=Image.open('/Users/ajinkyabobade/Desktop/2'+filename)
            img.verify()
        except(IOError,SyntaxError)as e:
            print('Bad file : '+filename)
            count=count+1
print(count)
I have added another SO answer here that extends the PIL solution to better detect broken images.
I also implemented this solution in my Python script here on GitHub.
I also verified that damaged (jpg) files are frequently not 'broken' images, i.e., a damaged picture file sometimes remains a legitimate picture file: the original image is lost or altered, but you are still able to load it.
I quote the other answer for completeness:
You can use the Python Pillow (PIL) module, with most image formats, to check if a file is a valid and intact image file.
If you also aim to detect broken images, @Nadia Alramli correctly suggests the im.verify() method, but this does not detect all possible image defects, e.g., im.verify does not detect truncated images (which most viewers load with a greyed area).
Pillow is able to detect these types of defects too, but you have to apply an image manipulation or an image decode/recode in order to trigger the check. In the end I suggest using this code:
from PIL import Image
try:
    im = Image.open(filename)
    im.verify()  # I also run verify(); I don't know whether it catches other types of defects
    im.close()  # reopening is necessary in my case
    im = Image.open(filename)
    im.transpose(Image.FLIP_LEFT_RIGHT)  # forces the image data to be decoded
    im.close()
except Exception:
    # manage exceptions here
    pass
In case of image defects this code will raise an exception.
Please consider that im.verify is about 100 times faster than performing the image manipulation (and I think that flip is one of the cheaper transformations).
With this code you are going to verify a set of images at about 10 MBytes/sec (modern 2.5Ghz x86_64 CPU).
For other formats (psd, xcf, ...) you can use the ImageMagick wrapper Wand. The code is as follows:
import wand.image
im = wand.image.Image(filename=filename)
im.flip()  # flip() is a method, so it has to be called
im.close()
However, in my experiments Wand does not detect truncated images; I think it fills the missing parts with a grey area without warning.
I read that ImageMagick has an external identify command that could do the job, but I have not found a way to invoke it programmatically and I have not tested this route.
I suggest always performing a cheap preliminary check that the file size is not zero (or very small):
import os
statfile = os.stat(filename)
filesize = statfile.st_size
if filesize == 0:
    # handle the 'faulty image' case here
    pass
You are building a bad path with
img=Image.open('/Users/ajinkyabobade/Desktop/2'+filename)
Try the following instead (by adding / to the end of the directory path)
img=Image.open('/Users/ajinkyabobade/Desktop/2/'+filename)
or
img=Image.open(os.path.join('/Users/ajinkyabobade/Desktop/2', filename))
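The difference between the two ways of building the path can be seen without opening any image. In this small sketch the file name photo.JPG is a hypothetical example; only the directory comes from the question.

```python
import os

directory = "/Users/ajinkyabobade/Desktop/2"
filename = "photo.JPG"  # hypothetical file name, for illustration only

# Plain concatenation glues the name onto the last path component.
concatenated = directory + filename
print(concatenated)  # /Users/ajinkyabobade/Desktop/2photo.JPG

# os.path.join inserts the separator between directory and file name.
joined = os.path.join(directory, filename)
print(os.path.dirname(joined) == directory)
print(os.path.basename(joined) == filename)
```

Because the concatenated path points to a file that does not exist, Image.open raises IOError for every file, which is why every image is reported as bad.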
Try the code below; it worked fine for me. It identifies bad/corrupted images and removes them as well. If you prefer, you can just print the bad/corrupted file names and drop the final line that deletes the file.
for filename in listdir('/Users/ajinkyabobade/Desktop/2/'):
    if filename.endswith('.JPG'):
        try:
            img = Image.open('/Users/ajinkyabobade/Desktop/2/'+filename)  # open the image file
            img.verify()  # verify that it is, in fact, an image
        except (IOError, SyntaxError) as e:
            print(filename)
            os.remove('/Users/ajinkyabobade/Desktop/2/'+filename)
Pillow reports that Image.load is not available; there is no such function. Image.open is the call that works.
I was also getting errors using:
except (IOError, SyntaxError) as e:
I just changed that to:
except:
and it worked fine.

How to send embedded image created using PIL/pillow as email (Python 3)

I am creating an image that I would like to embed in an e-mail. I cannot figure out how to get the image as binary data and pass it into MIMEImage. Below is the code I have; I get an error when I try to read the image object: "AttributeError: 'NoneType' object has no attribute 'read'".
image=Image.new("RGBA",(300,400),(255,255,255))
image_base=ImageDraw.Draw(image)
emailed_password_pic=image_base.text((150,200),emailed_password,(0,0,0))
imgObj=emailed_password_pic.read()
msg=MIMEMultipart()
html="""<p>Please finish registration <br/><img src="cid:image.jpg"></p>"""
img_file='image.jpg'
msgText = MIMEText(html,'html')
msgImg=MIMEImage(imgObj)
msgImg.add_header('Content-ID',img_file)
msg.attach(msgImg)
msg.attach(msgText)
If you look at line 4 - I am trying to read image so that I can pass it into MIMEImage. Apparently, image needs to be read as binary. However, I don't know how to convert it to binary so that .read() can process it.
FOLLOW-UP
I edited code per suggestions from jsbueno - thank you very much!!!:
emailed_password=os.urandom(16)
image=Image.new("RGBA",(300,400),(255,255,255))
image_base=ImageDraw.Draw(image)
emailed_password_pic=image_base.text((150,200),emailed_password,(0,0,0))
stream_bytes=BytesIO()
image.save(stream_bytes,format='png')
stream_bytes.seek(0)
#in_memory_file=stream_bytes.getvalue()
#imgObj=in_memory_file.read()
imgObj=stream_bytes.read()
msg=MIMEMultipart()
sender='xxx@abc.com'
receiver='jjjj@gmail.com'
subject_header='Please use code provided in this e-mail to confirm your subscription.'
msg["To"]=receiver
msg["From"]=sender
msg["Subject"]=subject_header
html="""<p>Please finish registration by logging into your account and typing in code from this e-mail.<br/><img src="cid:image.png"></p>"""
img_file='image.png'
msgText=MIMEText(html,'html')
msgImg=MIMEImage(imgObj) #Is mistake here?
msgImg.add_header('Content-ID',img_file)
msg.attach(msgImg)
msg.attach(msgText)
smtpObj=smtplib.SMTP('smtp.mandrillapp.com', 587)
smtpObj.login(userName,userPassword)
smtpObj.sendmail(sender,receiver,msg.as_string())
I am not getting errors now, but the e-mail does not have the image in it. I am confused about how the image gets attached and referenced in the HTML part of the e-mail. Any help is appreciated!
UPDATE:
This code actually works - I just had minor typo in the code on my PC.
There are a couple of conceptual errors there, both in using PIL and in what format an image should be in to be incorporated into an e-mail.
In PIL: the ImageDraw class operates in place, unlike the Image class calls, which usually return a new image after each operation. In your code, this means that the call to image_base.text actually changes the pixel data of the object held in your image variable. The call itself returns None, so the code above raises an error like "AttributeError: 'NoneType' object has no attribute 'read'" on the following line.
Past that (that is, you should fetch the data from your image variable to attach it to the e-mail) comes the second issue: PIL, for obvious reasons, holds images in memory in an uncompressed, raw pixel data format. When attaching images to e-mails we usually want them neatly packaged inside a file - PNG or JPG formats are usually better depending on the intent - let's just stay with PNG. So, you have to create the file data using PIL, and then attach that file data (i.e. the data comprising a PNG file, including headers, metadata, and the actual pixel data in compressed form). Otherwise you'd be putting in your e-mail a bunch of (uncompressed) pixel data that the receiving party would have no way to assemble back into an image (even if they treated the data as pixels, raw pixel data does not contain the image shape).
You have two options: either generate the file bytes in memory, or write them to an actual file on disk and re-read that file for attaching. The second form is easier to follow. The first is both more efficient and "the right thing to do" - so let's go with it:
from io import BytesIO
# In Python 2.x:
# from StringIO import StringIO as BytesIO
from PIL import Image, ImageDraw

image = Image.new("RGBA", (300, 400), (255, 255, 255))
image_base = ImageDraw.Draw(image)
# text() draws onto "image" in place and returns None
image_base.text((150, 200), emailed_password, (0, 0, 0))
stream = BytesIO()
image.save(stream, format="png")
stream.seek(0)
imgObj = stream.read()
...
(NB: I have not checked the part dealing with mail and mime proper in your code - if you are using it correctly, it should work now)
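For the mail part, here is a minimal sketch of how the PNG bytes could be tied to the cid: reference in the HTML, using only the standard email package. The function name build_message and the subject text are illustrative assumptions. The "related" multipart subtype and the angle brackets around the Content-ID value are my additions and not something the original code used; some mail clients are stricter about them than others.

```python
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_message(png_bytes, sender, receiver):
    """Build an HTML e-mail whose <img> tag points at an inline PNG part."""
    msg = MIMEMultipart("related")
    msg["From"] = sender
    msg["To"] = receiver
    msg["Subject"] = "Please confirm your subscription"  # placeholder subject

    html = '<p>Please finish registration<br/><img src="cid:image.png"></p>'
    msg.attach(MIMEText(html, "html"))

    # Passing the subtype explicitly avoids relying on content sniffing.
    img_part = MIMEImage(png_bytes, _subtype="png")
    # The header value carries angle brackets; the HTML refers to the same
    # identifier without them, as cid:image.png.
    img_part.add_header("Content-ID", "<image.png>")
    img_part.add_header("Content-Disposition", "inline", filename="image.png")
    msg.attach(img_part)
    return msg
```

Here png_bytes would be the imgObj value produced above, and the result of msg.as_string() is what gets passed to smtpObj.sendmail.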

ipython notebook - uploading from and saving to subdirectories?

Change IPython working directory
Inserting image into IPython notebook markdown
Hi, I've read the two links above, and the second seems the most relevant. What the person describes there, simply referring to the subdirectory, doesn't work for me. For instance, I have an image 'gephi.png' at '/Graphs/gephi.png'.
But when I write the following
from IPython.display import Image
path = "/Graphs/gephi.png"
i = Image(path)
i
no image pops up - Yup. No error. Just nothing pops up besides an empty square box image.
Clarification:
When I move the image to the regular directory, the image pops up fine.
My only code change is path = "gephi.png"
IPython's Image display object takes three kinds of arguments.
The first is raw image data (e.g. the result of open(filename, "rb").read()):
with open("Graphs/graph.png", "rb") as f:
    data = f.read()
Image(data=data)
The second form is to load an image from a filename. This is functionally the same as above, but IPython does the reading from the file:
Image(filename="Graphs/graph.png")
The third form is passing URLs. External URLs can be used, but relative URIs will serve files relative to the notebook's own directory:
Image(url="Graphs/graph.png")
Where this can get confusing is if you don't tell IPython which one of these you are specifying, and you just pass the one argument positionally:
Image("Graphs/graph.png")
IPython tries to guess what you mean in this case:
1. if it looks like a path and points to an existing file, use it as a filename
2. if it looks like a URL, use it as a URL
3. otherwise, fall back on embedding the string as raw PNG data
Case 3 is the source of most of the confusion. If you pass it a filename that doesn't exist, you will get a broken image:
Image("/Graphs/graph.png")
Note that URLs to local files must be relative. Absolute URLs will generally be wrong:
Image(url="/Graphs/graph.png")
An example notebook illustrating these things.
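A quick way to see why "/Graphs/gephi.png" and "Graphs/gephi.png" behave differently is to check how each string resolves on disk before handing it to Image. The helper describe_image_path is a hypothetical name, and the temporary directory only stands in for the notebook's working directory; the leading-slash behaviour described in the comments is that of POSIX-style paths.

```python
import tempfile
from pathlib import Path


def describe_image_path(path_str, notebook_dir):
    """Report how a path string resolves and whether a file exists there."""
    path = Path(path_str)
    # A relative path is looked up under notebook_dir; an absolute one
    # (leading slash) is looked up from the filesystem root instead.
    resolved = path if path.is_absolute() else Path(notebook_dir) / path
    return {"resolved": resolved, "exists": resolved.is_file()}


with tempfile.TemporaryDirectory() as notebook_dir:
    graphs = Path(notebook_dir) / "Graphs"
    graphs.mkdir()
    (graphs / "gephi.png").write_bytes(b"placeholder")  # not a real PNG

    relative = describe_image_path("Graphs/gephi.png", notebook_dir)
    rooted = describe_image_path("/Graphs/gephi.png", notebook_dir)

print(relative["exists"], rooted["exists"])
```

When the file is not found at the resolved location, the positional form Image(path) falls through to the raw-data case and shows a broken image instead of raising an error.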

How to check if a file is a valid image file?

I am currently using PIL.
from PIL import Image
try:
    im = Image.open(filename)
    # do stuff
except IOError:
    # filename is not an image file
    pass
However, while this covers most cases, some image file types, such as xcf, svg and psd, are not detected. Psd files throw an OverflowError exception.
Is there some way I could include them as well?
I have just found the builtin imghdr module. From python documentation:
The imghdr module determines the type
of image contained in a file or byte
stream.
This is how it works:
>>> import imghdr
>>> imghdr.what('/tmp/bass')
'gif'
Using a module is much better than reimplementing similar functionality.
UPDATE: imghdr is deprecated as of Python 3.11 and was removed in Python 3.13.
In addition to what Brian is suggesting you could use PIL's verify method to check if the file is broken.
im.verify()
Attempts to determine if the file is
broken, without actually decoding the
image data. If this method finds any
problems, it raises suitable
exceptions. This method only works on
a newly opened image; if the image has
already been loaded, the result is
undefined. Also, if you need to load
the image after using this method, you
must reopen the image file.
In addition to the PIL image check, you can also add a file name extension check like this:
filename.lower().endswith(('.png', '.jpg', '.jpeg', '.tiff', '.bmp', '.gif'))
Note that this only checks whether the file name has a valid image extension; it does not actually open the file to see whether it is a valid image. That is why you also need PIL or one of the libraries suggested in the other answers.
Often the first few bytes of a file are a magic number that identifies its format. You could check for this in addition to the exception handling above.
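As a rough sketch of the magic-number idea, the leading bytes of a file can be compared against a few well-known signatures. The function names sniff_bytes and sniff_file are made up for this example, and the table covers only a handful of formats; a match says nothing about whether the rest of the file is intact.

```python
# Leading byte signatures ("magic numbers") of a few common image formats.
SIGNATURES = (
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"BM", "bmp"),
    (b"II*\x00", "tiff"),  # little-endian TIFF
    (b"MM\x00*", "tiff"),  # big-endian TIFF
)


def sniff_bytes(header):
    """Return a format name guessed from the first bytes, or None."""
    for signature, name in SIGNATURES:
        if header.startswith(signature):
            return name
    return None


def sniff_file(path):
    """Read only the first few bytes of a file and guess its format."""
    with open(path, "rb") as fh:
        return sniff_bytes(fh.read(16))
```

A file whose extension says .jpg but whose header does not match the JPEG signature is a cheap early hint that something is wrong.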
One option is to use the filetype package.
Installation
python -m pip install filetype
Advantages
Fast: Does its work by loading only the first few bytes of your image (check on the magic number)
Supports different MIME types: Images, Videos, Fonts, Audio, Archives.
Example
filetype >= 1.0.7
import filetype
filename = "/path/to/file.jpg"
if filetype.is_image(filename):
    print(f"{filename} is a valid image...")
elif filetype.is_video(filename):
    print(f"{filename} is a valid video...")
filetype <= 1.0.6
import filetype
filename = "/path/to/file.jpg"
if filetype.image(filename):
    print(f"{filename} is a valid image...")
elif filetype.video(filename):
    print(f"{filename} is a valid video...")
Additional information on the official repo: https://github.com/h2non/filetype.py
Update
I also implemented the following solution in my Python script here on GitHub.
I also verified that damaged (jpg) files are frequently not 'broken' images, i.e., a damaged picture file sometimes remains a legitimate picture file: the original image is lost or altered, but you are still able to load it with no errors. File truncation, however, always causes errors.
End Update
You can use the Python Pillow (PIL) module, with most image formats, to check if a file is a valid and intact image file.
If you also aim to detect broken images, @Nadia Alramli correctly suggests the im.verify() method, but this does not detect all possible image defects, e.g., im.verify does not detect truncated images (which most viewers load with a greyed area).
Pillow is able to detect these types of defects too, but you have to apply an image manipulation or an image decode/recode in order to trigger the check. In the end I suggest using this code:
from PIL import Image
try:
    im = Image.open(filename)
    im.verify()  # I also run verify(); I don't know whether it catches other types of defects
    im.close()  # reopening is necessary in my case
    im = Image.open(filename)
    im.transpose(Image.FLIP_LEFT_RIGHT)  # forces the image data to be decoded
    im.close()
except Exception:
    # manage exceptions here
    pass
In case of image defects this code will raise an exception.
Please consider that im.verify is about 100 times faster than performing the image manipulation (and I think that flip is one of the cheaper transformations).
With this code you are going to verify a set of images at about 10 MBytes/sec with standard Pillow or 40 MBytes/sec with Pillow-SIMD module (modern 2.5Ghz x86_64 CPU).
For other formats (xcf, ...) you can use the ImageMagick wrapper Wand. See the Wand documentation for usage and installation details. The code is as follows:
import wand.image
im = wand.image.Image(filename=filename)
im.flip()  # flip() is a method, so it has to be called
im.close()
However, in my experiments Wand does not detect truncated images; I think it fills the missing parts with a grey area without warning.
I read that ImageMagick has an external identify command that could do the job, but I have not found a way to invoke it programmatically and I have not tested this route.
I suggest always performing a cheap preliminary check that the file size is not zero (or very small):
import os
statfile = os.stat(filename)
filesize = statfile.st_size
if filesize == 0:
    # handle the 'faulty image' case here
    pass
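The size check can be combined with another cheap test, sketched here with the standard library only: a JPEG file normally ends with the end-of-image marker FF D9, and a PNG file ends with its IEND chunk, so a missing trailer is a hint that the file was truncated. This is only a heuristic and an assumption on my part about what is worth screening for; some valid JPEGs carry extra bytes after the marker, so a hit should lead to a full decode rather than to deleting the file. The name quick_precheck and the 16-byte threshold are arbitrary choices for this example.

```python
import os
import tempfile

JPEG_EOI = b"\xff\xd9"         # JPEG end-of-image marker
PNG_IEND = b"IEND\xaeB`\x82"   # IEND chunk type plus its fixed CRC


def quick_precheck(path, min_size=16):
    """Cheap screening before a full decode; returns a reason or None."""
    size = os.stat(path).st_size
    if size < min_size:
        return "empty or very small file"
    with open(path, "rb") as fh:
        head = fh.read(8)
        fh.seek(-8, os.SEEK_END)
        tail = fh.read(8)
    if head.startswith(b"\xff\xd8") and not tail.endswith(JPEG_EOI):
        return "JPEG without end-of-image marker"
    if head.startswith(b"\x89PNG") and not tail.endswith(PNG_IEND):
        return "PNG without IEND chunk"
    return None


# Demonstration on a fake file that starts like a JPEG but has no trailer.
with tempfile.NamedTemporaryFile(delete=False, suffix=".jpg") as tmp:
    tmp.write(b"\xff\xd8\xff\xe0" + b"\x00" * 32)
result = quick_precheck(tmp.name)
os.remove(tmp.name)
print(result)
```

Files that pass this screening still need the verify() and transpose() steps above, since damage in the middle of a file is invisible to a header and trailer check.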
On Linux, you could use python-magic which uses libmagic to identify file formats.
AFAIK, libmagic looks into the file and tries to tell you more about it than just the format, such as bitmap dimensions, format version, etc. So you might see this as a superficial test for "validity".
For other definitions of "valid" you might have to write your own tests.
You could use the Python bindings to libmagic, python-magic and then check the mime types. This won't tell you if the files are corrupted or intact but it should be able to determine what type of image it is.
Adapting from Fabiano and Tiago's answer.
from PIL import Image

def check_img(filename):
    try:
        im = Image.open(filename)
        im.verify()
        im.close()
        im = Image.open(filename)
        im.transpose(Image.FLIP_LEFT_RIGHT)
        im.close()
        return True
    except Exception:
        print(filename, 'corrupted')
        return False

if not check_img('/dir/image'):
    print('do something')
The file extension can be used to check for image files as follows.
import os

# Build the full path of each file whose name contains an image extension
for f in os.listdir(folderPath):
    if (".jpg" in f) or (".bmp" in f):
        filePath = os.path.join(folderPath, f)

# Walk a directory tree and report which files have an image extension
extensions = (".jpg", ".png", ".jpeg")
for (path, dirs, files) in os.walk(folderPath):
    for name in files:
        if name.endswith(extensions):
            print(path)
            print("Valid", name)
        else:
            print(path)
            print("InValid", name)
