I noticed that a lot of dataloaders use PIL to load and transform images, e.g. the dataset builders in torchvision.datasets.folder.
My question is: why use PIL? You would need to do an np.asarray operation before turning it into a tensor. OpenCV seems to load it directly as a numpy array, and is faster too.
One reason I can think of is because PIL has a rich transforms library, but I feel like several of those transforms can be quickly implemented.
There is a discussion about adding OpenCV as one of the possible backends in a torchvision PR.
In summary, the reasons given there are:
OpenCV loads images in BGR format, which would require either a wrapper class that converts to RGB internally or making the format of loaded images backend-dependent
This in turn would lead to code duplication in torchvision's functional transforms, many of which use PIL operations (transformations that support multiple backends would be pretty convoluted)
OpenCV loads images as np.array, and it's not really easier to do transformations on arrays
A different representation might lead to bugs that are hard for users to catch
PyTorch's model zoo depends on the RGB format as well, and they would like to have it easily supported
It doesn't play well with Python's multiprocessing (but that is a non-issue now, as it was a problem with Python 2)
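To make the BGR point concrete, here is a minimal sketch using only NumPy. The small array stands in for what cv2.imread would return (assumed here to be height x width x 3, uint8, in BGR order), and the helper name bgr_to_rgb is hypothetical, not part of any library.

```python
import numpy as np

def bgr_to_rgb(image):
    """Return a contiguous copy with the channel axis reversed (BGR -> RGB)."""
    # image[..., ::-1] is only a negatively strided view of the same data;
    # ascontiguousarray materializes it as an ordinary array.
    return np.ascontiguousarray(image[..., ::-1])

# A 1x2 stand-in "image" in BGR order: one pure-blue and one pure-red pixel.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)
rgb = bgr_to_rgb(bgr)
```

Forgetting this step does not raise an error: the image simply has its red and blue channels swapped, which is the kind of hard-to-catch bug the list above refers to.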
To be honest, I don't see much movement towards this idea, since albumentations already exists, uses OpenCV, and can be integrated with PyTorch rather smoothly.
A little off-topic, but one can switch to a faster backend, Intel's accimage, via torchvision.set_image_backend. Pillow-SIMD can also be used as a drop-in replacement for PIL (it is supposedly faster and is recommended by the fastai project).
When it comes to performance benchmarks, they do not seem too reliable, and AFAIK it's not that easy to tell which library is faster.
There are some elements of an answer here and here.
TL;DR: because of historical reasons, some benchmarks (never trust benchmarks, though) and because PIL is lighter and easier to install than OpenCV.
One possible reason PIL is so common is that there are a lot of examples online using the PIL.Image.open method:
%matplotlib inline
from PIL import Image
img = Image.open(r"img.jpg")
# do more ...
Turns out that we don't need to use PIL, if we open the image using matplotlib.
%matplotlib inline
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
img = mpimg.imread(r"img.jpg")
_ = plt.imshow(img)
A frequent need in Jupyter notebooks is to show the image, and matplotlib.pyplot.imshow is often used for that.
I don't understand why there are so many choices for saving and loading images in Python. I suppose the different methods may differ in the resulting image quality or in running time.
Package examples:
e.g. (PIL's Image.open, Image.save), (cv2.imread, cv2.imwrite), (_, plt.savefig)
Could someone give me some advice on best practice, and explain why things are as the title describes?
Thank you very much.
I have written a Python script to project and overlay geostationary satellite images from the University of Dundee so the resulting image can be used by xplanet to render the surface of the earth. The source code of the tool can be found at https://github.com/jmozmoz/cloudmap/tree/cartopy (this is the branch with cartopy support)
The tool supports two different Python libraries to project the geostationary images onto a flat map: pyresample and cartopy.
I have found the following differences/problems:
pyresample is much faster than cartopy (depending on the size of the output image up to a factor of 10)
The output images differ: The results using pyresample show a stronger contrast.
For examples see the debug directory at https://github.com/jmozmoz/cloudmap/tree/cartopy/debug
If the multiprocessing library is used to do the projection in parallel, the cartopy version crashes with the following error message:
Fatal Python error: PyEval_RestoreThread: NULL tstate
So why is cartopy so much slower? Is pyresample doing the work in C code? Should cartopy support multiprocessing? And how to fix the problem with the contrast?
Thank you for your help
1. pyresample is much faster than cartopy (depending on the size of the output image up to a factor of 10)
The cartopy reprojection functionality hasn't been optimized in any way, and although it uses the scipy ckdtree functionality under the hood, the algorithm itself is written in Python. I seem to remember that a quick win was to use https://pypi.python.org/pypi/kdtree, which from memory gave quite a reasonable speedup with little work. cartopy.img_transform would be the place where changes would be needed.
Cartopy's re-projection functionality is probably also paying the cost of being very general - you can provide an image in any projection, and it will put it into any other projection, dealing with discontinuities and tears without a problem. It would be really cool to hook into pyresample's functionality though (and GDAL's for that matter) to give users the opportunity to speed up the reprojection in certain cases.
2. The output images differ: The results using pyresample show a stronger contrast.
Looks like you're creating a matplotlib figure to resample the image and using mpl's savefig functionality. It is possible that this process is causing the contrast to be lost. I'd advise just using cartopy's reprojection functionality without adding an image to a figure and saving the figure (example at the end).
3. If the multiprocessing library is used to do the projection in parallel, the cartopy version crashes with the following error message:
This really surprised me as there is no C code in cartopy which is doing the reprojecting. Therefore you've either found a bug with scipy, or more likely you are hitting a problem with numpy/matplotlib (google brings up a few results with your exception and matplotlib and/or numpy, e.g. https://github.com/numpy/numpy/issues/1270).
Ok, so here is how I would do the reprojection without using matplotlib at all:
import cartopy.crs as ccrs
from cartopy.img_transform import warp_array
import numpy as np
import PIL.Image
# I've downloaded the file from https://github.com/jmozmoz/cloudmap/blob/78923d15ad906eaa6d1dcab168a6364643d3fc94/debug/2014_8_7_1800_GOES15_4_S1.jpeg
# and clipped the image.
fname = '2014_8_7_1800_GOES15_4_S1.jpeg'
img = PIL.Image.open(fname)
result_array, extent = warp_array(np.array(img),
                                  source_proj=ccrs.Geostationary(),
                                  target_proj=ccrs.PlateCarree(),
                                  target_res=(4000, 2000))
result = PIL.Image.fromarray(result_array)
result.save('reprojected.jpeg')
With the resulting image (eventually) looking something like:
There are some real possibilities for optimisation in this functionality: quite a large amount of work is done creating the kdtree in the first place (which could potentially be cached), and another large chunk of the work is computing the indices into the original image (which, again, caches very well). Caching both would essentially reduce repeated reprojections to a numpy indexing problem.
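As a rough illustration of that caching idea (not cartopy's actual code), the sketch below computes the nearest-source-pixel indices once and then reuses them for every frame. The function names, the tiny coordinate grids, and the brute-force distance search (a stand-in for the kd-tree query) are all assumptions made for the example.

```python
import numpy as np

def build_index_cache(src_coords, dst_coords):
    """For each target point, return the index of the nearest source point.

    Brute-force stand-in for a kd-tree query; only sensible for small grids.
    """
    # Pairwise squared distances, shape (n_dst, n_src).
    diff = dst_coords[:, None, :] - src_coords[None, :, :]
    dist2 = (diff ** 2).sum(axis=-1)
    return dist2.argmin(axis=1)

def reproject(values, index_cache, target_shape):
    """Resample flat source values onto the target grid by pure indexing."""
    return values[index_cache].reshape(target_shape)

# Hypothetical layout: 4 source points at the unit-square corners, 2x3 target grid.
src_coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
ys, xs = np.meshgrid(np.array([0.0, 1.0]), np.array([0.0, 0.4, 1.0]), indexing="ij")
dst_coords = np.column_stack([xs.ravel(), ys.ravel()])

# The expensive step runs once...
index_cache = build_index_cache(src_coords, dst_coords)

# ...and every later frame with the same geometry is a single indexing operation.
frame_a = np.array([10, 20, 30, 40])
frame_b = np.array([1, 2, 3, 4])
out_a = reproject(frame_a, index_cache, (2, 3))
out_b = reproject(frame_b, index_cache, (2, 3))
```

This only pays off when the source and target geometry stay fixed between images, as they would for a series of frames from the same geostationary satellite.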
If you want to look into the performance possibilities or the contrast issue (which I'm uncertain whether my solution fixes or not) please feel free to open up an issue on the github repo and we can talk through some of the options.
Thanks for asking, and HTH!
I've recently started using OpenCV in Python. I've come across various posts comparing cv and cv2, with an overview saying that cv2 is based on numpy and makes use of an array (cvMat), whereas cv makes use of the old OpenCV bindings that used IplImage* (correct me if I'm wrong).
However, I would really like to know how the basic techniques (IplImage* and cvMat) differ, why the latter is faster and better, and how their use in cv and cv2 respectively makes a difference in terms of performance.
Thanks.
There is no question at all: use cv2.
The old cv API, which wraps IplImage and CvMat, is being phased out and will no longer be available in the next release of OpenCV.
The newer cv2 API uses numpy arrays for almost anything, so you can easily combine it with scipy, matplotlib, etc.
I'm looking into learning about processing and handling images with Python. I'm experimenting with searching the inside of an image for a specific picture. For example, this picture has two images in it that are the same:
In Python, how would I go about detecting which two images are the same?
I would recommend taking a look at OpenCV and PIL if you want to implement simple (or complex) algorithms on your own.
Furthermore, you can integrate OpenCV with PIL and also numpy, which makes it a really powerful tool for this kind of job.
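As a very simplified sketch of the idea, assuming the picture is a regular grid of equally sized sub-images and that "the same" means pixel-for-pixel identical, plain numpy is enough. The function name and the tiny test picture are made up for illustration; real photos would need a tolerant comparison (for example OpenCV's template matching) rather than exact equality.

```python
from itertools import combinations

import numpy as np

def find_identical_tiles(image, tile_h, tile_w):
    """Return pairs of (row, col) grid positions whose tiles match exactly."""
    rows, cols = image.shape[0] // tile_h, image.shape[1] // tile_w
    # Cut the picture into a dict of tiles keyed by grid position.
    tiles = {
        (r, c): image[r * tile_h:(r + 1) * tile_h, c * tile_w:(c + 1) * tile_w]
        for r in range(rows)
        for c in range(cols)
    }
    # Compare every unordered pair of tiles once.
    return [
        (p, q)
        for p, q in combinations(sorted(tiles), 2)
        if np.array_equal(tiles[p], tiles[q])
    ]

# 4x4 grayscale picture made of four 2x2 tiles; top-left and bottom-right match.
picture = np.array([[1, 2, 5, 5],
                    [3, 4, 5, 5],
                    [7, 7, 1, 2],
                    [7, 7, 3, 4]], dtype=np.uint8)
matches = find_identical_tiles(picture, 2, 2)
```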
What is the most efficient way in terms of speed to access the pixel data of a PIL image from a C extension? I only need read-only access to it, if that makes a difference.
C-level bindings for PIL are available, but there is very little documentation for them. You will need to consult the source for usage information.
Besides a C extension, you can try numpy. It takes a bit to learn, though. To get started, check "Convert RGBA PNG to RGB with PIL" and http://effbot.org/zone/pil-numpy.htm .
In my experience, numpy performance is great if the code is properly written. Processing image data can still be slow using a C extension. But numpy uses SIMD instructions such as SSE2, which dramatically improves operations such as histogram elevating or alpha blending.
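For instance, the alpha blending mentioned above can be written as one whole-array expression instead of a Python loop over pixels. This is a minimal sketch; the function name and the small test arrays are assumptions for illustration, and in practice the arrays would come from something like np.asarray on a PIL image.

```python
import numpy as np

def alpha_blend(foreground, background, alpha):
    """Blend two uint8 RGB images using a per-pixel alpha in [0, 1]."""
    fg = foreground.astype(np.float32)
    bg = background.astype(np.float32)
    a = alpha[..., None]  # add a channel axis so alpha broadcasts over RGB
    blended = a * fg + (1.0 - a) * bg
    # Round and clip back into the valid 8-bit range.
    return np.clip(np.rint(blended), 0, 255).astype(np.uint8)

fg = np.full((2, 2, 3), 200, dtype=np.uint8)
bg = np.full((2, 2, 3), 100, dtype=np.uint8)
alpha = np.array([[0.0, 0.5], [1.0, 0.25]], dtype=np.float32)
out = alpha_blend(fg, bg, alpha)
```

The cast to float32 matters: doing the arithmetic directly on uint8 arrays would wrap around on overflow instead of saturating.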