Why python is slower than matlab?

Why python is slower than matlab? - python

Finally, I succeeded in changing the MATLAB code to Python code.
However, contrary to what i have heard, Python's execution speed was remarkably slow.
Is MATLAB better on the image processing side these days?
Of course, I have no choice but to use Python because the company doesn't buy MATLAB...
ps. run Python in visual studio environment and IDLE.
If there's a way to speed Python up, please help me.
## Exercise Python image processing ##
import numpy as np
import cv2
import matplotlib.pyplot as plt
B = cv2.imread(r'D:\remedi\Exercise\Xray\Offset.png', -1) # offset image
for i in range(2,3):
org_I = cv2.imread(r'D:\remedi\Exercise\Xray\objects\object (' + str(i) + ').png', -1) # original image
w = cv2.imread(r'D:\remedi\Exercise\Xray\white\white (' + str(i) + ').png', -1) # white image
# dead & bad pixel correction
corrected_w = w.copy()
corrected_org_I = org_I.copy()
c = np.mean(corrected_w)
p = np.abs(corrected_w - c)
sens = 0.7
[num_y, num_x] = np.where((p < c*sens) | (p > c*sens))
ar = np.zeros((3,3))
ar2 = np.zeros((3,3))
for n in range(0, num_y.shape[0]):
for j in range(-1,2):
for k in range(-1,2):
if num_y[n]+j+1 == 0 or num_x[n]+k+1 == 0 or num_y[n]+j+1 == 577 or num_x[n]+k+1 == 577:
ar[j+1][k+1] = 0
ar2[j+1][k+1] = 0
else:
ar[j+1][k+1] = corrected_w[num_y[n]+j-1][num_x[n]+k-1]
ar2[j+1][k+1] = corrected_org_I[num_y[n]+j-1][num_x[n]+k-1]
ar[1][1] = 0
ar2[1][1] = 0
corrected_w[num_y[n]][num_x[n]] = np.sum(ar)/np.count_nonzero(ar)
corrected_org_I[num_y[n]][num_x[n]] = np.sum(ar2)/np.count_nonzero(ar2)
c = np.mean(corrected_w) # constant
FFC = np.uint16(np.divide(c*(corrected_org_I-B), (corrected_w-B))) # flat field correction
plt.subplot(2,3,1), plt.imshow(org_I, cmap='gray'), plt.title('Original Image')
plt.subplot(2,3,2), plt.imshow(corrected_org_I, cmap='gray'), plt.title('corrected original Image')
plt.subplot(2,3,3), plt.imshow(FFC, cmap='gray'), plt.title('FFC')
plt.subplot(2,3,4), plt.imshow(w, cmap='gray'), plt.title('w')
plt.subplot(2,3,5), plt.imshow(corrected_w, cmap='gray'), plt.title('corrected w')
plt.subplot(2,3,6), plt.imshow(B, cmap='gray'), plt.title('B')
plt.show()

cv2 calls into c++ library code, there is some overhead to each of these calls. These can add up depending on what you are doing. Additionally opencv takes advantage of some processor optimizations such as SIMD and NEON.
Here is a concrete example, I have implemented my own threshold function by iterating through the pixels. I then use opencv's built-in threshold function. I print timing for each function and compare the final result to verify they are the same.
img = cv2.imread('image.jpg', cv2.IMREAD_GRAYSCALE)
thresh = 128
thresh_loops = img.copy()
# implement threshold with loops
t1 = cv2.getTickCount()
for c in range(img.shape[0]):
for r in range(img.shape[1]):
if thresh_loops[c,r] > thresh:
thresh_loops[c,r] = 255
else:
thresh_loops[c,r] = 0
t2 = cv2.getTickCount()
print((t2-t1)/cv2.getTickFrequency())
# use the threshold function
t1 = cv2.getTickCount()
thr, thresh_fun = cv2.threshold(img, thresh, 255, cv2.THRESH_BINARY)
t2 = cv2.getTickCount()
print((t2-t1)/cv2.getTickFrequency())
print(np.all(thresh_loops == thresh_fun)) # verify the same result
I ran on a 720p image (720x1280 - or approximately 1 million pixels), on my machine the first function too 2.319724256 seconds the second took 0.013217389 seconds. So the second function is approximately 200x faster! You will run into this problem if you use Java or other languages that call into the OpenCV library (or any other C library).
Use it as good motivation to learn the api.
I'll also add that if you are doing image processing in Python with OpenCV, you should also learn the Numpy api, since the matrix (Mat) class is represented as a numpy array. You can get massive speedups by knowing the numpy api as well.

I switched to python a year ago from my beloved Matlab. Yes you are right, Matlab is faster than python, usually at least by 3-time or more.
2 ways I found to save python is here
Try to read Numpy Functional programming Routines and replace some for-loops with methods there. They are much faster than slicing an ndarray in side a for-loop.
Try Multiprocessing. I know that Matlab parfor is also better than any parallel computation package in python. Try 'joblib' or 'multiprocessing'. They might help a little bit.

Related

How to parallelize a for loop with a shared array return?

I have a numpy array with an image, and a binary segmentation mask containing separate binary "blobs". Something like the binary mask in:
I wish to extract image statistics from pixels in correspondence of each of the binary blobs, separately. These values are stored inside a new numpy array, named cnr_map.
My current implementation uses a for loop. However, when the number of binary blobs increases, it is really slow, and I'm wondering if it is possible to parallelize it.
from scipy.ndimage import label
labeled_array, num_features = label(mask)
cnr_map = np.copy(mask)
for k in range(num_features):
foreground_mask = labeled_array == k
background_mask = 1.0 - foreground_mask
a = np.mean(image[foreground_mask == 1])
b = np.mean(image[background_mask == 1])
c = np.std(image[background_mask == 1])
cnr = np.abs(a - b) / (c + 1e-12)
cnr_map[foreground_mask] = cnr
How can I parallelize the work so that the for loop runs faster?
I have seen this question, but my case is a bit different as I want to return a numpy array with the cumulative modifications of the multiple processes (i.e. cnr_map), and I don't understand how to do it.

OpenCV determine what two hue values would best represent an image

I would like to determine what 2 hue values would best represent an image using OpenCV in a Python 3 script.
So far I have been able to access the hue channel and display its hue histogram:
As you can see there are basically pixels of 2 different hues, Beige and Green.
I have gained access to the hue channel as follows:
hsv = cv.cvtColor( self.img, cv.COLOR_BGR2HSV )
hue,sat,val = cv.split(hsv)
What is the most efficient way to operate on the hue channel to determine what two hue values would best represent the image?
Edit #1 original image requested:
Edit #2 Almost there but still need help:
I have cobbled together some code to use OpenCV kmeans to convert images to 2 colors:
import numpy as np
import cv2
import time
import sys
img = cv2.imread('3.JPG')
cv2.imshow('Original',img)
print (sys.version)
if sys.version_info[0] < 3:
raise Exception("Python 3 or a more recent version is required.")
def redo():
Z = img.reshape((-1,3))
# convert to np.float32
Z = np.float32(Z)
# define criteria, number of clusters(K) and apply kmeans()
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
K = 2
start_time = time.time()
ret,label,center=cv2.kmeans(Z,K,None,criteria,10,cv2.KMEANS_RANDOM_CENTERS)
end_time = time.time()
print("Elapsed time was %g seconds" % (end_time - start_time))
# Now convert back into uint8, and make original image
center = np.uint8(center)
res = center[label.flatten()]
res2 = res.reshape((img.shape))
cv2.imshow('Converted',res2)
while(1):
ch = cv2.waitKey(50)
if ch == 27:
break
if ch == ord(' '):
redo()
cv2.destroyAllWindows()
Still needed:
I really don't want to convert back into uint8, and make original image as 2 colors. I would like to know the Hue value of those two colors. How do I get these 2 values from the kmeans output parameters?
Is there a way to reduce the time to convert using kmeans? kmeans is taking 8.6 seconds to convert to 2 colors using Python 3 script on my Raspberry Pi Zero. The conversion to 2 colors in Gimp is almost instantaneous (I know it is different processor but 8.6 seconds on Pi Zero is unusable for my purposes, maybe 1 second is OK). I am just a novice but to me it looks like this kmeans code is acting on all RGB pixels when I just want the act on Hue, so could I not have kmeans act just on Hue and cut time considerably (doing that is a little beyond my abilities at this point tho).
Here is the 3.JPG image that takes 8.6 seconds:

I've tackled a few of the things you mentioned in you post.
You shouldn't need to cast back to uint8. Center is an array of size (2,3), where the rows are your center points (e.i. hues) and the columns are the BGR values of each hue.
I have switched to Scipy KMeans2 instead of using OpenCVs implementation. It seems to be about 3x faster on the same image as OpenCV.
import numpy as np
from scipy.cluster.vq import kmeans, kmeans2
import time, sys, cv2
img = cv2.imread('3.jpg')
print (sys.version)
if sys.version_info[0] < 3:
raise Exception("Python 3 or a more recent version is required.")
def redo():
Z = img.reshape((-1,3))
Z = np.float32(Z)
# define criteria, number of clusters(K) and apply kmeans()
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
K = 2
start_time = time.time()
ret,label,center=cv2.kmeans(Z,K,None,criteria,10,cv2.KMEANS_RANDOM_CENTERS)
end_time = time.time()
print("Elapsed cv2 time was %g seconds" % (end_time - start_time))
print("center")
print(center)
start_time = time.time()
spcenter, splabel = kmeans2(Z,K,10,1,'points','warn',False)
end_time = time.time()
print("Elapsed scipy time was %g seconds" % (end_time - start_time))
print("center")
print(spcenter)
print("diff")
check = np.abs(spcenter[0][0]-center[0][0])
checkflip = np.abs(spcenter[0][0]-center[1][0])
if (check < checkflip):
print(np.abs(spcenter-center))
else:
print("labels are flipped, has no effect on data")
print(np.abs(spcenter-np.roll(center,1,0)))
# Now convert back into uint8, and make original image
#center = np.uint8(center)
res = spcenter[splabel.flatten()]
res2 = res.reshape((img.shape))
cv2.imwrite('converted.jpg',res2)
redo()
You may not be able to do much better than whatever timing this code gives you. You could try shrinking the images if that works for your solution. It will result in faster processing.
Another solution, like I mentioned, is moving to a faster language such as C. I could try my hand at it, but I'm not sure what other code you are using and if it would work. Let me know if you want to go down this path.

Faster implementation to quantize an image with an existing palette?

I am using Python 3.6 to perform basic image manipulation through Pillow. Currently, I am attempting to take 32-bit PNG images (RGBA) of arbitrary color compositions and sizes and quantize them to a known palette of 16 colors. Optimally, this quantization method should be able to leave fully transparent (A = 0) pixels alone, while forcing all semi-transparent pixels to be fully opaque (A = 255). I have already devised working code that performs this, but I wonder if it may be inefficient:
import math
from PIL import Image
# a list of 16 RGBA tuples
palette = [
(0, 0, 0, 255),
# ...
]
with Image.open('some_image.png').convert('RGBA') as img:
for py in range(img.height):
for px in range(img.width):
pix = img.getpixel((px, py))
if pix[3] == 0: # Ignore fully transparent pixels
continue
# Perform exhaustive search for closest Euclidean distance
dist = 450
best_fit = (0, 0, 0, 0)
for c in palette:
if pix[:3] == c: # If pixel matches exactly, break
best_fit = c
break
tmp = sqrt(pow(pix[0]-c[0], 2) + pow(pix[1]-c[1], 2) + pow(pix[2]-c[2], 2))
if tmp < dist:
dist = tmp
best_fit = c
img.putpixel((px, py), best_fit + (255,))
img.save('quantized.png')
I think of two main inefficiencies of this code:
Image.putpixel() is a slow operation
Calculating the distance function multiple times per pixel is computationally wasteful
Is there a faster method to do this?
I've noted that Pillow has a native function Image.quantize() that seems to do exactly what I want. But as it is coded, it forces dithering in the result, which I do not want. This has been brought up in another StackOverflow question. The answer to that question was simply to extract the internal Pillow code and tweak the control variable for dithering, which I tested, but I find that Pillow corrupts the palette I give it and consistently yields an image where the quantized colors are considerably darker than they should be.
Image.point() is a tantalizing method, but it only works on each color channel individually, where color quantization requires working with all channels as a set. It'd be nice to be able to force all of the channels into a single channel of 32-bit integer values, which seems to be what the ill-documented mode "I" would do, but if I run img.convert('I'), I get a completely greyscale result, destroying all color.
An alternative method seems to be using NumPy and altering the image directly. I've attempted to create a lookup table of RGB values, but the three-dimensional indexing of NumPy's syntax is driving me insane. Ideally I'd like some kind of code that works like this:
img_arr = numpy.array(img)
# Find all unique colors
unique_colors = numpy.unique(arr, axis=0)
# Generate lookup table
colormap = numpy.empty(unique_colors.shape)
for i, c in enumerate(unique_colors):
dist = 450
best_fit = None
for pc in palette:
tmp = sqrt(pow(c[0] - pc[0], 2) + pow(c[1] - pc[1], 2) + pow(c[2] - pc[2], 2))
if tmp < dist:
dist = tmp
best_fit = pc
colormap[i] = best_fit
# Hypothetical pseudocode I can't seem to write out
for iy in range(arr.size):
for ix in range(arr[0].size):
if arr[iy, ix, 3] == 0: # Skip transparent
continue
index = # Find index of matching color in unique_colors, somehow
arr[iy, ix] = colormap[index]
I note with this hypothetical example that numpy.unique() is another slow operation, since it sorts the output. Since I cannot seem to finish the code the way I want, I haven't been able to test if this method is faster anyway.
I've also considered attempting to flatten the RGBA axis by converting the values to a 32-bit integer and desiring to create a one-dimensional lookup table with the simpler index:
def shift(a):
return a[0] << 24 | a[1] << 16 | a[2] << 8 | a[3]
img_arr = numpy.apply_along_axis(shift, 1, img_arr)
But this operation seemed noticeably slow on its own.
I would prefer answers that involve only Pillow and/or NumPy, please. Unless using another library demonstrates a dramatic computational speed increase over any PIL- or NumPy-native solution, I don't want to import extraneous libraries to do something these two libraries should be reasonably capable of on their own.

for loops should be avoided for speed.
I think you should make a tensor like:
d2[x,y,color_index,rgb] = distance_squared
where rgb = 0..2 (0 = r, 1 = g, 2 = b).
Then compute the distance:
d[x,y,color_index] =
sqrt(sum(rgb,d2))
Then select the color_index with the minimal distance:
c[x,y] = min_index(color_index, d)
Finally replace alpha as needed:
alpha = ceil(orig_image.alpha)
img = c,alpha

When plotting Mandelbrot using NumPy & Pillow, program outputs apparent noise

Previously, I had created a Mandelbrot generator in python using turtle. Now, I am re-writing the program to use the Python Imaging Library in order to increase speed and reduce limits on size of images.
However, the program below only outputs RGB nonsense, almost noise. I think it is something to do with a difference in the way NumPy and PIL deal with arrays, since saying l[x,y] = [1,1,1] where l = np.zeros((height,width,3)) doesn't just make 1 pixel white when img = Image.fromarray(l) and img.show() are performed.
def imagebrot(mina=-1.25, maxa=1.25, minb=-1.25, maxb=1.25, width=100, height=100, maxit=300, inf=2):
l,b = np.zeros((height,width,3), dtype=np.float64), minb
for y in range(0, height):
a = mina
for x in range(0, width):
ab = mandel(a, b, maxit, inf)
if ab[0] == maxit:
l[x,y:] = [1,1,1]
#if ab[0] < maxit:
#smoothit = mandelc(ab[0], ab[1], ab[2])
#l[x, y] = colorsys.hsv_to_rgb(smoothit, 1, 1)
a += abs(mina-maxa)/width
b += abs(minb-maxb)/height
img = Image.fromarray(l, "RGB")
img.show()
def mandel(re, im, maxit, inf):
z = complex(re, im)
c,it = z,0
for i in range(0, maxit):
if abs(z) > inf:
break
z,it = z*z+c,it+1
return it,z,inf
def mandelc(it,z,inf):
return (it+1-log(log(abs(z)))/log(2))
UPDATE 1:
I realised that one of the major errors in this program (I'm sure there are many) is the fact that I was using the x,y coords as the complex coefficients! So, 0 to 100 instead of -1.25 to 1.25! I have changed this so that the code now uses variables a,b to describe them, incremented in a manner I've stolen from some of my code in the turtle version. The code above has been updated accordingly. Since the Smooth Colouring Algorithm code is currently commented out for debugging, the inf variable has been reduced to 2 in size.
UPDATE 2:
I have edited the numpy index with help from a great user. The program now outputs this when set to 200 by 200:
As you can see, it definitely shows some mathematical shape and yet is filled with all these strange red, green and blue pixels! Why could these be here? My program can only set RGB values to [1,1,1] or leave it as a default [0,0,0]. It can't be [1,0,0] or anything like that - this must be a serious flaw...
UPDATE 3:
I think there is an error with NumPy and PIL's integration. If I make l = np.zeros((100, 100, 3)) and then state l[0,0,:] = 1 and finally img = Image.fromarray(l) & img.show(), this is what we get:
Here we get a series of coloured pixels. This calls for another question.
UPDATE 4:
I have no idea what was happening previously, but it seems with a np.uint8 array, Image.fromarray() uses colour values from 0-255. With this piece of wisdom, I move one step closer to understanding this Mandelbug!
Now, I do get something vaguely mathematical, however it still outputs strange things.
This dot is all there is... I get even stranger things if I change to np.uint16, I presume due to the different byte-shape and encoding scheme.

You are indexing the 3D array l incorrectly, try
l[x,y,:] = [1,1,1]
instead. For more details on how to access and modify numpy arrays have a look at numpy indexing
As a side note: the quickstart documentation of numpy actually has an implementation of the mandelbrot set generation and plotting.

Normalize histogram (brightness and contrast) of a set of images using Python Image Library (PIL)

I have a script which uses Google Maps API to download a sequence of equal-sized square satellite images and generates a PDF. The images need to be rotated beforehand, and I already do so using PIL.
I noticed that, due to different light and terrain conditions, some images are too bright, others are too dark, and the resulting pdf ends up a bit ugly, with less-than-ideal reading conditions "in the field" (which is backcountry mountain biking, where I want to have a printed thumbnail of specific crossroads).
(EDIT) The goal then is to make all images end up with similar apparent brightness and contrast. So, the images that are too bright would have to be darkened, and the dark ones would have to be lightened. (by the way, I once used imagemagick autocontrast, or auto-gamma, or equalize, or autolevel, or something like that, with interesting results in medical images, but don't know how to do any of these in PIL).
I already used some image corrections after converting to grayscale (had a grayscale printer a time ago), but the results weren't good, either. Here is my grayscale code:
#!/usr/bin/python
def myEqualize(im)
im=im.convert('L')
contr = ImageEnhance.Contrast(im)
im = contr.enhance(0.3)
bright = ImageEnhance.Brightness(im)
im = bright.enhance(2)
#im.show()
return im
This code works independently for each image. I wonder if it would be better to analyze all images first and then "normalize" their visual properties (contrast, brightness, gamma, etc).
Also, I think it would be necessary to perform some analysis in the image (histogram?), so as to apply a custom correction depending on each image, and not an equal correction for all of them (although any "enhance" function implicitly considers initial contitions).
Does anybody had such problem and/or know a good alternative to do this with the colored images (no grayscale)?
Any help will be appreciated, thanks for reading!

What you are probably looking for is a utility that performs "histogram stretching". Here is one implementation. I am sure there are others. I think you want to preserve the original hue and apply this function uniformly across all color bands.
Of course there is a good chance that some of the tiles will have a noticeable discontinuity in level where they join. Avoiding this, however, would involve spatial interpolation of the "stretch" parameters and is a much more involved solution. (...but would be a good exercise if there is that need.)
Edit:
Here is a tweak that preserves image hue:
import operator
def equalize(im):
h = im.convert("L").histogram()
lut = []
for b in range(0, len(h), 256):
# step size
step = reduce(operator.add, h[b:b+256]) / 255
# create equalization lookup table
n = 0
for i in range(256):
lut.append(n / step)
n = n + h[i+b]
# map image through lookup table
return im.point(lut*im.layers)

The following code works on images from a microscope (which are similar), to prepare them prior to stitching. I used it on a test set of 20 images, with reasonable results.
The brightness average function is from another Stackoverflow question.
from PIL import Image
from PIL import ImageStat
import math
# function to return average brightness of an image
# Source: https://stackoverflow.com/questions/3490727/what-are-some-methods-to-analyze-image-brightness-using-python
def brightness(im_file):
im = Image.open(im_file)
stat = ImageStat.Stat(im)
r,g,b = stat.mean
return math.sqrt(0.241*(r**2) + 0.691*(g**2) + 0.068*(b**2)) #this is a way of averaging the r g b values to derive "human-visible" brightness
myList = [0.0]
deltaList = [0.0]
b = 0.0
num_images = 20 # number of images
# loop to auto-generate image names and run prior function
for i in range(1, num_images + 1): # for loop runs from image number 1 thru 20
a = str(i)
if len(a) == 1: a = '0' + str(i) # to follow the naming convention of files - 01.jpg, 02.jpg... 11.jpg etc.
image_name = 'twenty/' + a + '.jpg'
myList.append(brightness(image_name))
avg_brightness = sum(myList[1:])/num_images
print myList
print avg_brightness
for i in range(1, num_images + 1):
deltaList.append(i)
deltaList[i] = avg_brightness - myList[i]
print deltaList
At this point, the "correction" values (i.e. difference between value and mean) are stored in deltaList. The following section applies this correction to all the images one by one.
for k in range(1, num_images + 1): # for loop runs from image number 1 thru 20
a = str(k)
if len(a) == 1: a = '0' + str(k) # to follow the naming convention of files - 01.jpg, 02.jpg... 11.jpg etc.
image_name = 'twenty/' + a + '.jpg'
img_file = Image.open(image_name)
img_file = img_file.convert('RGB') # converts image to RGB format
pixels = img_file.load() # creates the pixel map
for i in range (img_file.size[0]):
for j in range (img_file.size[1]):
r, g, b = img_file.getpixel((i,j)) # extracts r g b values for the i x j th pixel
pixels[i,j] = (r+int(deltaList[k]), g+int(deltaList[k]), b+int(deltaList[k])) # re-creates the image
j = str(k)
new_image_name = 'twenty/' +'image' + j + '.jpg' # creates a new filename
img_file.save(new_image_name) # saves output to new file name

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Why python is slower than matlab? - python

Related

How to parallelize a for loop with a shared array return?

OpenCV determine what two hue values would best represent an image

Faster implementation to quantize an image with an existing palette?

When plotting Mandelbrot using NumPy & Pillow, program outputs apparent noise

Normalize histogram (brightness and contrast) of a set of images using Python Image Library (PIL)

Categories

Resources