What is the fastest way to generate image thumbnails in Python?

I'm building a photo gallery in Python and want to be able to quickly generate thumbnails for the high resolution images.
What's the fastest way to generate high quality thumbnails for a variety of image sources?
Should I be using an external library like imagemagick, or is there an efficient internal way to do this?
The dimensions of the resized images will be (max size):
120x120
720x720
1600x1600
Quality is an issue, as I want to preserve as many of the original colors as possible and minimize compression artifacts.
Thanks.

I fancied some fun so I did some benchmarking on the various methods suggested above and a few ideas of my own.
I collected 1000 high-resolution 12 MP iPhone 6S images, each 4032x3024 pixels, and ran everything on an 8-core iMac.
Here are the techniques and results - each in its own section.
Method 1 - Sequential ImageMagick
This is simplistic, unoptimised code. Each image is read and a thumbnail is produced. Then it is read again and a different sized thumbnail is produced.
#!/bin/bash
start=$SECONDS
# Loop over all files
for f in image*.jpg; do
    # Loop over all sizes
    for s in 1600 720 120; do
        echo Reducing $f to ${s}x${s}
        convert "$f" -resize ${s}x${s} t-$f-$s.jpg
    done
done
echo Time: $((SECONDS-start))
Result: 170 seconds
Method 2 - Sequential ImageMagick with single load and successive resizing
This is still sequential but slightly smarter. Each image is only read one time and the loaded image is then resized down three times and saved at three resolutions. The improvement is that each image is read just once, not 3 times.
#!/bin/bash
start=$SECONDS
# Loop over all files
N=1
for f in image*.jpg; do
    echo Resizing $f
    # Load once and successively scale down
    convert "$f" \
        -resize 1600x1600 -write t-$N-1600.jpg \
        -resize 720x720 -write t-$N-720.jpg \
        -resize 120x120 t-$N-120.jpg
    ((N=N+1))
done
echo Time: $((SECONDS-start))
Result: 76 seconds
Method 3 - GNU Parallel + ImageMagick
This builds on the previous method, by using GNU Parallel to process N images in parallel, where N is the number of CPU cores on your machine.
#!/bin/bash
start=$SECONDS
doit() {
    file=$1
    index=$2
    convert "$file" \
        -resize 1600x1600 -write t-$index-1600.jpg \
        -resize 720x720 -write t-$index-720.jpg \
        -resize 120x120 t-$index-120.jpg
}
# Export doit() to subshells for GNU Parallel
export -f doit
# Use GNU Parallel to do them all in parallel
parallel doit {} {#} ::: *.jpg
echo Time: $((SECONDS-start))
Result: 18 seconds
Method 4 - GNU Parallel + vips
This is the same as the previous method, but it uses vips at the command-line instead of ImageMagick.
#!/bin/bash
start=$SECONDS
doit() {
    file=$1
    index=$2
    r0=t-$index-1600.jpg
    r1=t-$index-720.jpg
    r2=t-$index-120.jpg
    vipsthumbnail "$file" -s 1600 -o "$r0"
    vipsthumbnail "$r0" -s 720 -o "$r1"
    vipsthumbnail "$r1" -s 120 -o "$r2"
}
# Export doit() to subshells for GNU Parallel
export -f doit
# Use GNU Parallel to do them all in parallel
parallel doit {} {#} ::: *.jpg
echo Time: $((SECONDS-start))
Result: 8 seconds
Method 5 - Sequential PIL
This is intended to correspond to Jakob's answer.
#!/usr/local/bin/python3
import glob
from PIL import Image
sizes = [(120,120), (720,720), (1600,1600)]
files = glob.glob('image*.jpg')
N=0
for image in files:
    for size in sizes:
        im=Image.open(image)
        im.thumbnail(size)
        im.save("t-%d-%s.jpg" % (N,size[0]))
    N=N+1
Result: 38 seconds
Method 6 - Sequential PIL with single load & successive resize
This is intended as an improvement to Jakob's answer, wherein the image is loaded just once and then resized down three times instead of re-loading each time to produce each new resolution.
#!/usr/local/bin/python3
import glob
from PIL import Image
sizes = [(120,120), (720,720), (1600,1600)]
files = glob.glob('image*.jpg')
N=0
for image in files:
    # Load just once, then successively scale down
    im=Image.open(image)
    im.thumbnail((1600,1600))
    im.save("t-%d-1600.jpg" % (N))
    im.thumbnail((720,720))
    im.save("t-%d-720.jpg" % (N))
    im.thumbnail((120,120))
    im.save("t-%d-120.jpg" % (N))
    N=N+1
Result: 27 seconds
Method 7 - Parallel PIL
This is intended to correspond to Audionautics' answer, insofar as it uses Python's multiprocessing. It also obviates the need to re-load the image for each thumbnail size.
#!/usr/local/bin/python3
import glob
from PIL import Image
from multiprocessing import Pool
def thumbnail(params):
    filename, N = params
    try:
        # Load just once, then successively scale down
        im=Image.open(filename)
        im.thumbnail((1600,1600))
        im.save("t-%d-1600.jpg" % (N))
        im.thumbnail((720,720))
        im.save("t-%d-720.jpg" % (N))
        im.thumbnail((120,120))
        im.save("t-%d-120.jpg" % (N))
        return 'OK'
    except Exception as e:
        return e
# The guard keeps worker processes from re-running the pool setup on import
if __name__ == '__main__':
    files = glob.glob('image*.jpg')
    pool = Pool(8)
    results = pool.map(thumbnail, zip(files,range((len(files)))))
Result: 6 seconds
Method 8 - Parallel OpenCV
This is intended to be an improvement on bcattle's answer, insofar as it uses OpenCV but it also obviates the need to re-load the image to generate each new resolution output.
#!/usr/local/bin/python3
import cv2
import glob
from multiprocessing import Pool
def thumbnail(params):
    filename, N = params
    try:
        # Load just once, then successively scale down
        im = cv2.imread(filename)
        im = cv2.resize(im, (1600,1600))
        cv2.imwrite("t-%d-1600.jpg" % N, im)
        im = cv2.resize(im, (720,720))
        cv2.imwrite("t-%d-720.jpg" % N, im)
        im = cv2.resize(im, (120,120))
        cv2.imwrite("t-%d-120.jpg" % N, im)
        return 'OK'
    except Exception as e:
        return e
# The guard keeps worker processes from re-running the pool setup on import
if __name__ == '__main__':
    files = glob.glob('image*.jpg')
    pool = Pool(8)
    results = pool.map(thumbnail, zip(files,range((len(files)))))
Result: 5 seconds
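One caveat when comparing the OpenCV output with the others: cv2.resize takes an exact (width, height), so passing (1600,1600) squashes the 4032x3024 originals into a square, whereas PIL's thumbnail() and ImageMagick's -resize fit the image inside the box and keep its aspect ratio. A minimal sketch of computing a matching target size to hand to cv2.resize; the helper name fit_within is made up for illustration and is not part of OpenCV:

```python
def fit_within(width, height, max_side):
    """Return (w, h) scaled to fit a max_side x max_side box, keeping aspect ratio."""
    scale = min(max_side / width, max_side / height)
    if scale >= 1:
        # Never enlarge: an image that already fits keeps its size
        return width, height
    return max(1, round(width * scale)), max(1, round(height * scale))

# The 4032x3024 benchmark images at the three gallery sizes
for side in (1600, 720, 120):
    print(side, fit_within(4032, 3024, side))
```

The resulting tuple can be passed as the size argument to cv2.resize; for shrinking, interpolation=cv2.INTER_AREA is also worth trying instead of the default.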

You want PIL; it does this with ease:
from PIL import Image
sizes = [(120,120), (720,720), (1600,1600)]
files = ['a.jpg','b.jpg','c.jpg']
for image in files:
    for size in sizes:
        im = Image.open(image)
        im.thumbnail(size)
        # Put the size first so the output name keeps its .jpg extension
        im.save("thumbnail_%dx%d_%s" % (size[0], size[1], image))
If you desperately need speed, then thread it, multiprocess it, or get another language.

A little late to the question (only a year!), but I'll piggyback on the "multiprocess it" part of @JakobBowyer's answer.
This is a good example of an embarrassingly parallel problem, as the main bit of code doesn't mutate any state external to itself. It simply reads an input, performs its computation and saves the result.
Python is actually pretty good at these kinds of problems thanks to the map function provided by multiprocessing.Pool.
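from itertools import product
from multiprocessing import Pool
from PIL import Image
def thumbnail(image_details):
    size, filename = image_details
    try:
        im = Image.open(filename)
        im.thumbnail(size)
        # Include the size in the name so the three outputs don't overwrite each other
        im.save("thumbnail_%d_%s" % (size[0], filename))
        return 'OK'
    except Exception as e:
        return e
sizes = [(120,120), (720,720), (1600,1600)]
files = ['a.jpg','b.jpg','c.jpg']
if __name__ == '__main__':
    # Pool() with no argument starts one worker per CPU core
    pool = Pool()
    # product() yields every (size, file) pair; zip() would pair them one-to-one
    results = pool.map(thumbnail, product(sizes, files))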
The core of the code is exactly the same as @JakobBowyer's, but instead of running it in a loop in a single thread, we wrapped it in a function and spread it out across multiple cores via the multiprocessing map function.
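When building the task list for pool.map, it matters how the sizes and files are combined. A small sketch of the difference, using the same example lists: zip pairs the lists element by element, while itertools.product generates every size for every file.

```python
from itertools import product

sizes = [(120, 120), (720, 720), (1600, 1600)]
files = ['a.jpg', 'b.jpg', 'c.jpg']

# zip() walks both lists in step: one size per file, three tasks in total
paired = list(zip(sizes, files))

# product() yields every combination: three sizes for each file, nine tasks
combined = list(product(sizes, files))

print(len(paired), len(combined))
```

If each file should get all three thumbnails, the product form is the one to feed to the pool.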

Another option is to use the python bindings to OpenCV. This may be faster than PIL or Imagemagick.
import cv2
sizes = [(120, 120), (720, 720), (1600, 1600)]
image = cv2.imread("input.jpg")
for size in sizes:
    resized_image = cv2.resize(image, size)
    cv2.imwrite("thumbnail_%d.jpg" % size[0], resized_image)
There's a more complete walkthrough here.
If you want to run it in parallel, use concurrent.futures on Py3 or the futures package on Py2.7:
import concurrent.futures
import cv2
def resize(input_filename, size):
    image = cv2.imread(input_filename)
    resized_image = cv2.resize(image, size)
    cv2.imwrite("thumbnail_%s%d.jpg" % (input_filename.split('.')[0], size[0]), resized_image)
executor = concurrent.futures.ThreadPoolExecutor(max_workers=3)
sizes = [(120, 120), (720, 720), (1600, 1600)]
for size in sizes:
    executor.submit(resize, "input.jpg", size)

One more answer, since (I think?) no one has mentioned quality.
Here's a photo I took with an iPhone 6S at the Olympic park in East London:
The roof is made from a set of wooden slats and unless you downsize rather carefully you'll get very nasty Moire effects. I had to compress the image quite heavily to upload it to Stack Overflow; if you're interested, the original is here.
Here's cv2 resize:
$ python3
Python 3.7.3 (default, Apr 3 2019, 05:39:12)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
>>> x = cv2.imread("IMG_1869.JPG")
>>> y = cv2.resize(x, (120, 90))
>>> cv2.imwrite("cv2.png", y)
True
Here's vipsthumbnail:
$ vipsthumbnail IMG_1869.JPG -s 120 -o vips.png
And here are the two downsized images side-by-side and zoomed by x2, with vipsthumbnail on the left:
(ImageMagick gives the same results as vipsthumbnail)
cv2 defaults to BILINEAR, so it has a fixed 2x2 mask. For every point in the output image, it calculates the corresponding point in the input and takes the 2x2 average. This means it's really only sampling at most 240 points in each line, and simply ignoring the other 3792! This produces ugly aliasing.
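The effect of skipping most of the input can be seen in one dimension. This is only a toy illustration, not cv2's actual code: the stripe pattern and the shrink factor of 32 are made-up values chosen to resemble fine slats, and sparse sampling is simplified to one input pixel per output pixel.

```python
# Toy 1-D "roof": 4 bright pixels then 3 dark, repeated across a 4032-pixel line
line = [255 if i % 7 < 4 else 0 for i in range(4032)]
factor = 32  # shrink factor, close to the 4032 -> 120 case

# Sparse sampling: keep one input pixel per output pixel, ignore the rest
sparse = line[::factor]

# Box filter: average every input pixel that falls under each output pixel
box = [sum(line[i:i + factor]) / factor for i in range(0, len(line), factor)]

print(sorted(set(sparse)))               # sparse output jumps between the extremes
print(round(min(box)), round(max(box)))  # box output stays in a narrow band
```

The sparse version turns a fine, regular texture into a coarse high-contrast pattern, which is the 1-D analogue of the Moire effect described above; averaging every input pixel keeps the output near the true mean brightness.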
vipsthumbnail is doing a more complex three-stage downsize:
It uses the libjpeg shrink-on-load feature to shrink the image by a factor of 8 in each axis with a box filter to turn the 4032 pixel across image to 504 x 378 pixels.
It does a further 2 x 2 box filter shrink to get 252 x 189 pixels.
It finishes with a 5 x 5 Lanczos3 kernel to get the output 120 x 90 pixel image.
This is supposed to give equivalent quality to a full Lanczos3 kernel, but be quicker because it can box filter most of the way.
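The intermediate sizes quoted above follow directly from the 4032 x 3024 original. The sketch below only reproduces that arithmetic; it is not how vipsthumbnail is implemented, and the variable names are made up for illustration.

```python
width, height = 4032, 3024   # iPhone 6S original
target = 120                 # requested thumbnail width

# Stage 1: libjpeg shrink-on-load by 8 in each axis
w1, h1 = width // 8, height // 8
# Stage 2: 2 x 2 box filter shrink
w2, h2 = w1 // 2, h1 // 2
# Stage 3: Lanczos3 handles the remaining, non-integer factor
residual = w2 / target
w3, h3 = round(w2 / residual), round(h2 / residual)

print((w1, h1), (w2, h2), (w3, h3), residual)
```

Only the last factor, a little over 2, goes through the more expensive Lanczos3 kernel; the first two stages cover a factor of 16 with cheap box filtering.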

If you are already familiar with imagemagick, why not stick with the python-bindings?
PythonMagick

Python 2.7, Windows, x64 users
In addition to @JakobBowyer's and @Audionautics' answers: PIL is quite old, and you can find yourself troubleshooting and looking for the right version. Instead, use Pillow from here (source).
The updated snippet will look like this:
im = Image.open(full_path)
im.thumbnail(thumbnail_size)
im.save(new_path, "JPEG")
Full enumeration script for thumbnail creation:
import os
from PIL import Image
output_dir = '.\\output'
thumbnail_size = (200,200)
if not os.path.exists(output_dir):
    os.makedirs(output_dir)
for dirpath, dnames, fnames in os.walk(".\\input"):
    for f in fnames:
        full_path = os.path.join(dirpath, f)
        if f.endswith(".jpg"):
            filename = 'thumbnail_{0}'.format(f)
            new_path = os.path.join(output_dir, filename)
            # Remove any thumbnail left over from a previous run
            if os.path.exists(new_path):
                os.remove(new_path)
            im = Image.open(full_path)
            im.thumbnail(thumbnail_size)
            im.save(new_path, "JPEG")

I stumbled upon this when trying to figure out which library I should use:
It seems like OpenCV is clearly faster than PIL.
That said, I'm working with spreadsheets, and it turns out that the module I was using, openpyxl, already requires me to import PIL to insert images.

Related

Sending RGB image data from Python numpy to a browser HTML page

I need to send realtime image RGB data (in Numpy format) to a HTML page in a browser (web-based GUI), through HTTP.
The following code works with the well-known multipart/x-mixed-replace trick: run this and access http://127.0.0.1:5000/video_feed: you will see a video in the browser.
from flask import Flask, render_template, Response
import numpy as np, cv2
app = Flask('')
def gen_frames():
    while True:
        # imencode expects 8-bit data, so request uint8 explicitly
        img = np.random.randint(0, 255, size=(1000, 1000, 3), dtype=np.uint8)
        ret, buf = cv2.imencode('.jpg', img)
        frame = buf.tobytes()
        yield (b'--frame\r\nContent-Type: image/jpeg\r\n\r\n' + frame + b'\r\n')
@app.route('/video_feed')
def video_feed():
    return Response(gen_frames(), mimetype='multipart/x-mixed-replace; boundary=frame')
app.run()
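For readers unfamiliar with the multipart/x-mixed-replace trick: the response body is an endless sequence of parts, each introduced by the boundary line and its own Content-Type header, and the browser replaces the displayed image whenever a new part arrives. A minimal sketch of the framing on its own, with no Flask or OpenCV involved; the helper name mjpeg_part and the fake payload are made up for illustration.

```python
def mjpeg_part(jpeg_bytes, boundary=b'frame'):
    """Wrap one encoded JPEG as a single part of a multipart/x-mixed-replace body."""
    return b''.join([
        b'--', boundary, b'\r\n',          # boundary line, must match the mimetype
        b'Content-Type: image/jpeg\r\n',   # per-part header
        b'\r\n',                           # blank line ends the part headers
        jpeg_bytes,
        b'\r\n',
    ])

fake_jpeg = b'\xff\xd8 not a real image \xff\xd9'
part = mjpeg_part(fake_jpeg)
print(part)
```

The boundary passed here has to be the same string that appears after boundary= in the response mimetype.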
However, according to my benchmark, the real performance bottleneck is the cv2.imencode('.jpg', img).
In my real application, if I just generate the image, the CPU is ~ 1% for Python.
When I imencode(...), the CPU jumps to 25%, and 15% for Chrome.
I also tried with PNG format but it's similar.
Question: how to efficiently send RGB image data from a numpy array (example: 1000 x 1000 pixels x 3 colors because of RGB) to a browser HTML page?
(Without compression/decompression it might be better, but how?)
Here is the benchmark
Format   FPS    CPU (Python)   CPU (Chrome)   Notes
PNG      10.8   20 %           10 %
JPG      14     23 %           12 %
JPG      10.7   16 %           10 %           with time.sleep to match PNG 10.8 fps
BMP      19     17 %           23 %
BMP      10.8    8 %           12 %           with time.sleep to match PNG 10.8 fps
Try using the PILLOW module instead and see if that improves performance. My benchmark shows that each iteration of the gen_frames() generator function based on PILLOW requires less than half the CPU of the CV2 version.
from flask import Flask, render_template, Response
from PIL import Image
import numpy as np
from io import BytesIO
app = Flask('')
def gen_frames():
    while True:
        img = np.random.randint(0, 255, size=(1000, 1000, 3), dtype=np.uint8)
        rgb_image = Image.fromarray(img, 'RGB')
        buf = BytesIO()
        rgb_image.save(buf, 'JPEG')
        frame = buf.getbuffer()
        yield (b'--frame\r\nContent-Type: image/jpeg\r\n\r\n' + frame + b'\r\n')
@app.route('/video_feed')
def video_feed():
    return Response(gen_frames(), mimetype='multipart/x-mixed-replace; boundary=frame')
app.run()
According to the docs you could check if the default optimization is turned on:
Default Optimization in OpenCV
Many of the OpenCV functions are optimized using SSE2, AVX, etc. It contains the unoptimized code also. So if our system support these features, we should exploit them (almost all modern day processors support them). It is enabled by default while compiling. So OpenCV runs the optimized code if it is enabled, otherwise it runs the unoptimized code. You can use cv.useOptimized() to check if it is enabled/disabled and cv.setUseOptimized() to enable/disable it.
So try this:
In [5]: cv.useOptimized() # check for optimization
Out[5]: False
In [7]: cv.setUseOptimized(True) # turn it on if not already turned on
As it stands AVI or MP4 compression would be good quality even for movies, but the compression itself takes too much CPU time to perform it on live data.
If some arbitrary protocol/format were created, one would not just have to program the server but also the client to consume this protocol/format. Therefore still some standard solution should be preferred.
I believe you can find a compromise between compression and CPU load in video conferencing systems, where the live camera data needs to be compressed and streamed via the network. With that in mind I believe here are sources of information that can help pursuing the topic:
https://www.reachcambridge.com/wp-content/uploads/wp_videocompression_33085_en_0809_lo.pdf
https://cs.haifa.ac.il/~nimrod/Compression/Video/V3h261-2005.pdf
Open Source Video Conferencing software like Jitsi Meet or Apache OpenMeetings.
Maybe you can try encoding the image as base64 and sending it to the browser as a data URI, e.g. data:image/jpeg;base64. Note that base64 is a text encoding rather than compression: the encoded payload is about a third larger than the original bytes.
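To get a feel for what base64 does to payload size, here is a quick check with the standard library. The 3,000,000-byte buffer is just a stand-in for one raw 1000 x 1000 x 3 frame, not real image data.

```python
import base64

payload = bytes(3_000_000)            # stand-in for a 1000 x 1000 x 3 raw RGB frame
encoded = base64.b64encode(payload)   # 4 output characters for every 3 input bytes

print(len(payload), len(encoded), len(encoded) / len(payload))
```

So base64 is useful for embedding binary data in text, but it adds bytes to each frame rather than removing them.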

Image generated using OpenCV for python is larger when run in docker container vs. on local machine

I am using OpenCV within python 2.7 to stitch 4 images together vertically and then output 1 file. Here is the cv-merge.py script:
import cv2
import numpy as np
import time
import sys
import os
location = os.path.join(sys.path[0], "n1.png")
img = cv2.imread(location)
img1 = img
img2 = img
img3 = img
img4 = img
start_time = time.time()
new_image = np.concatenate((img1, img2, img3, img4), axis=0)
cv2.imwrite(os.path.join(sys.path[0], 'out.png'), new_image, [int(cv2.IMWRITE_PNG_COMPRESSION), 6])
print("time time: %s" % (time.time() - start_time))
sys.stdout.flush()
Running locally on MacOS will generate a file of size ~30MB for a compression level of 6 specified above in around 12 seconds.
However, when I run it inside an Ubuntu 16.04 Docker container running a Node.js server, it takes slightly longer (not a problem), but the file size is 130MB every time.
The environment should not matter, since it is the same input image on both my local machine and docker container being used. Why is the generated file size different when I run the script in 2 different environments using the same compression level?
Memory and CPU levels are not being exceeded, and the exported image looks the same.
How I run the script locally as well as inside docker container:
python cv-merge.py

Improving copying bytes from an Image

I have the following minimal code that gets the bytes from an image:
import Image
im = Image.open("kitten.png")
im_data = [pix for pixdata in im.getdata() for pix in pixdata]
This is rather slow (I have gigabytes of images to process) so how could this be sped up? I'm also unfamiliar with what exactly that code is trying to do. All my data is 1280 x 960 x 8-bit RGB, so I can ignore corner cases, etc.
(FYI, the full code is here - I've already replaced the ImageFile loop with the above Image.open().)
You can try
scipy.ndimage.imread()
If you mean speeding it up algorithmically, I can suggest accessing the file with multiple threads simultaneously (only if the processing steps don't depend on each other).
Divide the file logically into a few sections and process each part simultaneously with threads (you have to put your operation inside a function and call it from the threads).
Here is a link to a tutorial about threading in Python:
threading in python
I solved my problem, I think:
>>> [pix for pixdata in im.getdata() for pix in pixdata] ==
numpy.ndarray.tolist(numpy.ndarray.flatten(numpy.asarray(im)))
True
This cuts down the runtime by half, and with a bit of bash magic I can run the conversion on the 56 directories in parallel.
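Since the question mentions not being sure what the original line does: it flattens the sequence of per-pixel (R, G, B) tuples into one long list R, G, B, R, G, B, and so on, which is also what the numpy expression produces. A small standard-library sketch of the same flattening, using a made-up three-pixel list as a stand-in for im.getdata():

```python
from itertools import chain

# Stand-in for im.getdata(): a sequence of (R, G, B) tuples
pixels = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]

nested = [pix for pixdata in pixels for pix in pixdata]   # the original loop
flat = list(chain.from_iterable(pixels))                  # same values via itertools

print(flat == nested)
print(flat)
```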

Difficulty with handling very large image using VIPS

I'm writing a Python(3.4.3) program that uses VIPS(8.1.1) on Ubuntu 14.04 LTS to read many small tiles using multiple threads and put them together into a large image.
In a very simple test:
from concurrent.futures import ThreadPoolExecutor
from multiprocessing import Lock
from gi.repository import Vips
canvas = Vips.Image.black(8000,1000,bands=3)
lock = Lock()  # missing from the original snippet; the full program below defines it
def do_work(x):
    global canvas  # rebind the module-level canvas, as the full program does
    img = Vips.Image.new_from_file('part.tif') # RGB tiff image
    with lock:
        canvas = canvas.insert(img, x*1000, 0)
with ThreadPoolExecutor(max_workers=8) as executor:
    for x in range(8):
        executor.submit(do_work, x)
canvas.write_to_file('complete.tif')
I get the correct result. In my full program, the work for each thread involves reading binary data from a source file, converting it to TIFF format, reading the image data, and inserting it into the canvas. It seems to work, but when I try to examine the result, I run into trouble. Because the image is extremely large (~50000*100000 pixels), I couldn't save the entire image in one file, so I tried
canvas = canvas.resize(.5)
canvas.write_to_file('test.jpg')
This takes an extremely long time, and the resulting JPEG has only black pixels. If I resize three times, the program gets killed. I also tried
canvas.extract_area(20000,40000,2000,2000).write_to_file('test.tif')
This results in the error message "Segmentation fault (core dumped)", but it does save an image. There are image contents in it, but they seem to be in the wrong place.
I'm wondering what the problem could be?
Below is the code for the complete program. The same logic was also implemented using OpenCV + sharedmem (sharedmem handled the multiprocessing part) and it worked without a problem.
import os
import subprocess
import pickle
from multiprocessing import Lock
from concurrent.futures import ThreadPoolExecutor
import threading
import numpy as np
from gi.repository import Vips
lock = Lock()
def read_image(x):
    with open(file_name, 'rb') as fin:
        fin.seek(sublist[x]['dataStartPos'])
        temp_array = np.fromfile(fin, dtype='int8', count=sublist[x]['dataSize'])
    name_base = os.path.join(rd_path, threading.current_thread().name + 'tempimg')
    with open(name_base + '.jxr', 'wb') as fout:
        temp_array.tofile(fout)
    subprocess.call(['./JxrDecApp', '-i', name_base + '.jxr', '-o', name_base + '.tif'])
    temp_img = Vips.Image.new_from_file(name_base + '.tif')
    with lock:
        global canvas
        canvas = canvas.insert(temp_img, sublist[x]['XStart'], sublist[x]['YStart'])
def assemble_all(filename, ramdisk_path, scene):
    global canvas, sublist, file_name, rd_path, tilesize_x, tilesize_y
    file_name = filename
    rd_path = ramdisk_path
    file_info = fetch_pickle(filename) # A custom function
    # this info includes where to begin reading image data, image size and coordinates
    tilesize_x = file_info['sBlockList_P0'][0]['XSize']
    tilesize_y = file_info['sBlockList_P0'][0]['YSize']
    sublist = [item for item in file_info['sBlockList_P0'] if item['SStart'] == scene]
    max_x = max([item['XStart'] for item in file_info['sBlockList_P0']])
    max_y = max([item['YStart'] for item in file_info['sBlockList_P0']])
    canvas = Vips.Image.black((max_x+tilesize_x), (max_y+tilesize_y), bands=3)
    with ThreadPoolExecutor(max_workers=4) as executor:
        for x in range(len(sublist)):
            executor.submit(read_image, x)
    return canvas
The above module (imported as mcv) is called in the driver script:
canvas = mcv.assemble_all(filename, ramdisk_path, 0)
To examine the content, I used
canvas.extract_area(25000, 40000, 2000, 2000).write_to_file('test_vips1.jpg')
I think your problem has to do with the way libvips calculates pixels.
In systems like OpenCV, images are huge areas of memory. You perform a series of operations, and each operation modifies a memory image in some way.
libvips is not like this, though the interface looks similar. In libvips, when you perform an operation on an image, you are actually just adding a new section to a pipeline. It's only when you finally connect the output to some sink (a file on disk, or a region of memory you want filled with image data, or an area of the display) that libvips will actually do any calculations. libvips will then use a recursive algorithm to run a large set of worker threads up and down the whole length of the pipeline, evaluating all of the operations you created at the same time.
To make an analogy with programming languages, systems like OpenCV are imperative, libvips is functional.
The good thing about the way libvips does things is that it can see the whole pipeline at once and it can optimise away most of the memory use and make good use of your CPU. The bad thing is that long sequences of operations can need large amounts of stack to evaluate (whereas with systems like OpenCV you are more likely to be bounded by image size). In particular, the recursive system used by libvips to evaluate means that pipeline length is limited by the C stack, about 2MB on many operating systems.
Here's a simple test program that does more or less what you are doing:
#!/usr/bin/python3
import sys
import pyvips
if len(sys.argv) < 4:
    print("usage: %s image-in image-out n" % sys.argv[0])
    print(" make an n x n grid of image-in")
    sys.exit(1)
tile = pyvips.Image.new_from_file(sys.argv[1])
outfile = sys.argv[2]
size = int(sys.argv[3])
img = pyvips.Image.black(size * tile.width, size * tile.height, bands=3)
for y in range(size):
    for x in range(size):
        # offset each copy by the tile's own width and height so the copies form a grid
        img = img.insert(tile, x * tile.width, y * tile.height)
# we're not interested in huge files for this test, just write a small patch
img.crop(10, 10, 100, 100).write_to_file(outfile)
You run it like this:
time ./bigjoin.py ~/pics/k2.jpg out.tif 2
real 0m0.176s
user 0m0.144s
sys 0m0.031s
It loads k2.jpg (a 2k x 2k JPG image), repeats that image into a 2 x 2 grid, and saves a small part of it. This program will work well with very large images, try removing the crop and running as:
./bigjoin.py huge.tif out.tif[bigtiff] 10
and it'll copy the huge tiff image 100 times into a REALLY huge tiff file. It'll be quick and use little memory.
However, this program will become very unhappy with small images being copied many times. For example, on this machine (a Mac), I can run:
./bigjoin.py ~/pics/k2.jpg out.tif 26
But this fails:
./bigjoin.py ~/pics/k2.jpg out.tif 28
Bus error: 10
With a 28 x 28 output, that's 784 tiles. The way we've built the image, repeatedly inserting a single tile, that's a pipeline 784 operations long -- long enough to cause a stack overflow. On my Ubuntu laptop I can get pipelines up to about 2,900 operations long before it starts failing.
There's a simple way to fix this program: build a wide rather than a deep pipeline. Instead of inserting a single image each time, make a set of strips, then join the strips. Now the pipeline depth will be proportional to the square root of the number of tiles. For example:
img = pyvips.Image.black(size * tile.width, size * tile.height, bands=3)
for y in range(size):
    # build one horizontal strip of tiles ...
    strip = pyvips.Image.black(size * tile.width, tile.height, bands=3)
    for x in range(size):
        strip = strip.insert(tile, x * tile.width, 0)
    # ... then add the finished strip to the output image
    img = img.insert(strip, 0, y * tile.height)
Now I can run:
./bigjoin2.py ~/pics/k2.jpg out.tif 200
Which is 40,000 images joined together.
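To put numbers on the deep versus wide difference, here is a rough model that counts only the insert operations on the longest chain. It is a back-of-the-envelope sketch, not a description of libvips internals, and the function names are made up for illustration.

```python
def deep_depth(n):
    """One long chain: every tile is inserted into the previous result."""
    return n * n

def strip_depth(n):
    """n inserts to build a strip, then n inserts to stack the strips."""
    return 2 * n

for n in (26, 28, 200):
    print(n, deep_depth(n), strip_depth(n))
```

Under this model the 28 x 28 case that failed is 784 operations deep, while the 200 x 200 strip version is only around 400 deep despite joining 40,000 tiles.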

Python server app leaking memory

I'm trying to diagnose why my Python server app is leaking memory. The app takes a request for an image URL, resizes the image using Vips, and returns it. After every request the memory usage grows roughly by the size of the original image.
from fapws import base
import fapws._evwsgi as evwsgi
from gi.repository import Vips
import urllib2
import hmac
import hashlib
import base64
import StringIO
from boto.s3.connection import S3Connection
from boto.s3.bucket import Bucket
def start():
    evwsgi.start('0.0.0.0', '80')
    evwsgi.set_base_module(base)
    def lfrThumbnail(environ, start_response):
        try:
            parameters = environ['PATH_INFO'].split('/')
            s3File = 'my s3 url' + parameters[0]
            width = float(parameters[1])
            height = float(parameters[2])
            hmacSignatureUser = parameters[3]
            hmacSignature = some hasing code...
            if not (hmacSignatureUser == hmacSignature):
                print hmacSignatureUser
                print hmacSignature
                print hmacSignatureUser == hmacSignature
                raise Exception
            bufferedImage = urllib2.urlopen(s3File).read()
            image = Vips.Image.new_from_buffer(bufferedImage, '')
            imageWidth = float(image.width)
            imageHeight = float(image.height)
            imageAspectRatio = imageWidth / imageHeight
            if (width > imageWidth) or (height > imageHeight):
                image = image
            elif abs((imageAspectRatio / (width/height)) - 1) < 0.05:
                image = image.resize(width / imageWidth)
            else:
                scaleRatioWidth = width / imageWidth
                scaleRatioHeight = height / imageHeight
                maxScale = max(scaleRatioWidth, scaleRatioHeight)
                image = image.resize(maxScale)
                cropStartX = (image.width - width) / 2
                cropStartY = (image.height - height) / 2
                image = image.crop(cropStartX, cropStartY, width, height)
        except Exception, e:
            start_response('500 INTERNAL SERVER ERROR', [('Content-Type','text')])
            return ['Error generating thumbnail']
        start_response('200 OK', [
            ('Content-Type','image/jpeg'),
            ('Cache-Control: max-stale', '31536000')
        ])
        return [image.write_to_buffer('.jpg[Q=90]')]
    evwsgi.wsgi_cb(('/lfr/', lfrThumbnail))
    evwsgi.set_debug(0)
    evwsgi.run()
if __name__ == '__main__':
    start()
I've tried using muppy and the pympler tracker, but each diff after the image open/close operations showed only a couple of bytes being used.
Could the external C libraries be the cause of the memory leak? If so, how does one debug that?
In case it's relevant, I'm running the Python server inside a Docker container.
I'm the libvips maintainer. It sounds like the vips operation cache: vips keeps the last few operations in memory and reuses the results if it can. This can be a huge performance win in some cases.
For a web service, you're probably caching elsewhere so you won't want this, or you won't want a large cache at least. You can control the cache size with vips_cache_set_max() and friends:
http://www.vips.ecs.soton.ac.uk/supported/current/doc/html/libvips/VipsOperation.html#vips-cache-set-max
From Python it's:
Vips.cache_set_max(0)
To turn off the cache completely. You can set the cache to limit by memory use, file descriptor use, or number of operations.
There are a couple of other useful things you can set to watch resource usage. Vips.leak_set(True) makes vips report leaked objects on exit, and also report peak pixel buffer memory use. Vips.cache_set_trace(True) makes it trace all operations as they are called, and shows cache hits.
In your code, I would also enable sequential mode. Add access = Vips.Access.SEQUENTIAL to your new_from_buffer().
The default behaviour is to open images for full random access (since vips doesn't know what operations you'll end up running on the image). For things like JPG, this means that vips will decode the image to a large uncompressed array on open. If the image is under 100 MB, it'll keep this array in memory.
However for a simple resize, you only need to access pixels top-to-bottom, so you can hint sequential access on open. In this mode, vips will only decompress a few scanlines at once from your input and won't ever keep the whole uncompressed image around. You should see a nice drop in memory use and latency.
There are a lot of other things you could handle, like exif autorotate, colour management, transparency, jpeg shrink-on-load, and many others, I'm sure you know. The sources to vipsthumbnail might be a useful reference:
https://github.com/jcupitt/libvips/blob/master/tools/vipsthumbnail.c
