Matlab to Python Conversion

I am trying to convert some code from Matlab to Python, but I am unfamiliar with a considerable amount of the Matlab syntax and functionality. I have managed to do some of the conversion using the PIL and NumPy Python packages, but I was hoping someone could explain what is going on in some parts of this code.
clear all;close all;clc;
% Set gray scale to 0 for color images. Will need more memory
GRAY_SCALE = 1
% The physical mask placed close to the sensor has 4 harmonics, therefore
% we will have 9 angular samples in the light field
nAngles = 9;
cAngles = (nAngles+1)/2;
% The fundamental frequency of the cosine in the mask in pixels
F1Y = 238; F1X = 191; %Cosine Frequency in Pixels from Calibration Image
F12X = floor(F1X/2);
F12Y = floor(F1Y/2);
%PhaseShift due to Mask In-Plane Translation wrt Sensor
phi1 = 300; phi2 = 150;
%read 2D image
disp('Reading Input Image...');
I = double(imread('InputCones.png'));
if(GRAY_SCALE)
%take green channel only
I = I(:,:,2);
end
%make image odd size
I = I(1:end,1:end-1,:);
%find size of image
[m,n,CH] = size(I);
%Compute Spectral Tile Centers, Peak Strengths and Phase
for i = 1:nAngles
for j = 1:nAngles
CentY(i,j) = (m+1)/2 + (i-cAngles)*F1Y;
CentX(i,j) = (n+1)/2 + (j-cAngles)*F1X;
%Mat(i,j) = exp(-sqrt(-1)*((phi1*pi/180)*(i-cAngles) + (phi2*pi/180)*(j-cAngles)));
end
end
Mat = ones(nAngles,nAngles);
% 20 is because we cannot have negative values in the mask. So the strength of
% the DC component is 20 times that of the harmonics
Mat(cAngles,cAngles) = Mat(cAngles,cAngles) * 20;
% Beginning of 4D light field computation
% do for all color channel
for ch = 1:CH
disp('=================================');
disp(sprintf('Processing channel %d',ch));
% Find FFT of image
disp('Computing FFT of 2D image');
f = fftshift(fft2(I(:,:,ch)));
%If you want to visualize the FFT of the input 2D image (Figure 8 of
%the paper), uncomment the next 2 lines
% figure;imshow(log10(abs(f)),[]);colormap gray;
% title('2D FFT of captured image (Figure 8 of paper). Note the spectral replicas');
%Rearrange Tiles of 2D FFT into 4D Planes to obtain FFT of 4D Light-Field
disp('Rearranging 2D FFT into 4D');
for i = 1: nAngles
for j = 1: nAngles
FFT_LF(:,:,i,j) = f( CentY(i,j)-F12Y:CentY(i,j)+F12Y, CentX(i,j)-F12X:CentX(i,j)+F12X)/Mat(i,j);
end
end
clear f
k = sqrt(-1);
for i = 1:nAngles
for j = 1:nAngles
shift = (phi1*pi/180)*(i-cAngles) + (phi2*pi/180)*(j-cAngles);
FFT_LF(:,:,i,j) = FFT_LF(:,:,i,j)*exp(k*shift);
end
end
disp('Computing inverse 4D FFT');
LF = ifftn(ifftshift(FFT_LF)); %Compute Light-Field by 4D Inverse FFT
clear FFT_LF
if(ch==1)
LF_R = LF;
elseif(ch==2)
LF_G = LF;
elseif(ch==3)
LF_B = LF;
end
clear LF
end
clear I
%Now we have the 4D light field
disp('Light Field Computed. Done...');
disp('==========================================');
% Digital Refocusing Code
% Take a 2D slice of 4D light field
% For refocusing, we only need the FFT of light field, not the light field
disp('Synthesizing Refocused Images by taking 2D slice of 4D Light Field');
if(GRAY_SCALE)
FFT_LF_R = fftshift(fftn(LF_R));
clear LF_R
else
FFT_LF_R = fftshift(fftn(LF_R));
clear LF_R
FFT_LF_G = fftshift(fftn(LF_G));
clear LF_G
FFT_LF_B = fftshift(fftn(LF_B));
clear LF_B
end
% height and width of refocused image
H = size(FFT_LF_R,1);
W = size(FFT_LF_R,2);
count = 0;
for theta = -14:14
count = count + 1;
disp('===============================================');
disp(sprintf('Calculating New ReFocused Image: theta = %d',theta));
if(GRAY_SCALE)
RefocusedImage = Refocus2D(FFT_LF_R,[theta,theta]);
else
RefocusedImage = zeros(H,W,3);
RefocusedImage(:,:,1) = Refocus2D(FFT_LF_R,[theta,theta]);
RefocusedImage(:,:,2) = Refocus2D(FFT_LF_G,[theta,theta]);
RefocusedImage(:,:,3) = Refocus2D(FFT_LF_B,[theta,theta]);
end
str = sprintf('RefocusedImage%03d.png',count);
%Scale RefocusedImage in [0,255]
RefocusedImage = RefocusedImage - min(RefocusedImage(:));
RefocusedImage = 255*RefocusedImage/max(RefocusedImage(:));
%write as png image
clear tt
for ii = 1:CH
tt(:,:,ii) = fliplr(RefocusedImage(:,:,ii)');
end
imwrite(uint8(tt),str);
disp(sprintf('Refocused image written as %s',str));
end
Here is the Refocus2D function:
function IOut = Refocus2D(FFTLF,theta)
[m,n,p,q] = size(FFTLF);
Theta1 = theta(1);
Theta2 = theta(2);
cTem = floor(size(FFTLF)/2) + 1;
% find the coordinates of 2D slice
[XX,YY] = meshgrid(1:n,1:m);
cc = (XX - cTem(2))/size(FFTLF,2);
cc = Theta2*cc + cTem(4);
dd = (YY - cTem(1))/size(FFTLF,1);
dd = Theta1*dd + cTem(3);
% Resample 4D light field along the 2D slice
v = interpn(FFTLF,YY,XX,dd,cc,'cubic');
%set nan values to zero
idx = find(isnan(v)==1);
disp(sprintf('Number of Nans in sampling = %d',size(idx,1)))
v(isnan(v)) = 0;
% take inverse 2D FFT to get the image
IOut = real(ifft2(ifftshift(v)));
If anyone could help it would be greatly appreciated.
Thanks in advance
Apologies: Here is a brief description of what the code does:
The code reads in an image of a light field. With prior knowledge of the plenoptic mask, we store the relevant nAngles, the fundamental frequencies of the mask, and the phase shift; these are used to find multiple spectral replicas of the image.
Once the image is read in and the green channel is extracted, we perform a Fast Fourier Transform on the image and start taking slices from the image matrix, each representing one of the spectral replicas.
We then take the inverse Fourier transform of all the spectral replicas to produce the light field.
The Refocus2D function then takes a 2-D slice of the 4-D data to recreate a refocused image.
The things I am struggling with specifically are:
FFT_LF(:,:,i,j) = f( CentY(i,j)-F12Y:CentY(i,j)+F12Y, CentX(i,j)-F12X:CentX(i,j)+F12X)/Mat(i,j);
We are taking a slice from the matrix f, but where does that data go in FFT_LF? What does (:,:,i,j) mean? Is it a multidimensional array?
And what does the size function return:
[m,n,p,q] = size(FFTLF);
Just a brief explanation of how this translates to Python would be a great help.
Thanks everyone so far :)

How about getting started with this page: http://www.scipy.org/NumPy_for_Matlab_Users? Also, if you have a brief description of what this is supposed to do, that would be good.

You're correct: FFT_LF(:,:,i,j) refers to a multidimensional array. In this case, FFT_LF is a 4-D array, but the calculations result in a 2-D array. The (:,:,i,j) tells MATLAB exactly how to place the 2-D results into the 4-D variable.
In effect, it is storing one MxN array for each pair of indices (i,j). The colons (:) effectively mean "get every element in that dimension."
What [m,n,p,q] = size(FFTLF) will do is return the length of each dimension in your array. So, if FFTLF ends up being a 5x5x3x2 array, you get:
m=5, n=5, p=3, q=2.
If you have MATLAB available, typing "help size" should give a good explanation of what it does. The same can be said for most MATLAB functions: I've always been quite impressed with their documentation.
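In Python/NumPy terms, FFT_LF(:,:,i,j) maps onto basic slicing of a 4-D ndarray, and size() maps onto the .shape attribute. Here is a minimal sketch (the array sizes are made up for illustration; keep in mind that NumPy indices are 0-based and slice end points are exclusive, unlike MATLAB's 1-based, inclusive ranges):
import numpy as np

# a 4-D complex array analogous to FFT_LF; the tile size is arbitrary here
nAngles = 9
FFT_LF = np.zeros((477, 381, nAngles, nAngles), dtype=complex)

# MATLAB: FFT_LF(:,:,i,j) = f(a:b, c:d) / Mat(i,j)
# NumPy (with 0-based i and j, and exclusive slice ends):
#   FFT_LF[:, :, i, j] = f[a-1:b, c-1:d] / Mat[i, j]

# MATLAB: [m,n,p,q] = size(FFTLF)
m, n, p, q = FFT_LF.shape  # (477, 381, 9, 9)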
Hope that Helps


Rotating a reshaped image as a matrix operation

I have a grayscale image that I want to rotate. However, I need to do optimization on it. Therefore, I cannot use Pillow or OpenCV.
I want to reshape this image using Python with numpy.reshape into a one-dimensional vector (where I use the default C-style reshape).
And thereafter, I want to rotate this image around a point using matrix multiplication and addition, i.e. it should be something like
rotated_image_vector = A @ vector + b # (or the equivalent in homogeneous coordinates).
After this operation I want to reshape the outcome back to two dimensions and have the rotated image.
It would be best if it also used linear interpolation for pixels that do not map exactly onto another pixel.
The mathematical theory says it is possible, and I believe there is a very elegant solution to this problem, but I do not see how to create this matrix. Did anyone already have this problem, or does anyone see an immediate solution?
Thanks a lot,
Eike
I like your approach, but there is a slight misconception in it. What you want to transform are not the pixel values themselves but the coordinates. So you don't reshape your image but rather do an np.indices on it to obtain coordinates for each pixel. For those, a rotation around a point looks like
rotation_matrix @ (coordinates - fixed_point) + fixed_point
except that I have to transpose a bit to get the dimensions to align. The code below is a slight adaptation of my code in this answer.
As an example I am going to use the Wikipedia-logo-v2 by Nohat. It is licensed under the Creative Commons Attribution-Share Alike 3.0 Unported license.
First I read in the picture, swap the x and y axes so as not to get confused, and rotate the coordinates as described above.
import numpy as np
import matplotlib.pyplot as plt
import itertools
image = plt.imread('wikipedia.jpg')
image = np.swapaxes(image,0,1)/255
fixed_point = np.array(image.shape[:2], dtype='float')/2
points = np.moveaxis(np.indices(image.shape[:2]),0,-1).reshape(-1,2)
a = 2*np.pi/8
A = np.array([[np.cos(a),-np.sin(a)],[np.sin(a),np.cos(a)]])
rotated_coordinates = (A @ (points - fixed_point.reshape(1,2)).T).T + fixed_point.reshape(1,2)
Now I set up a little class to interpolate between the pixels that do not map exactly onto another pixel. Finally, I swap the axes back and plot it.
class Image_knn():
    def fit(self, image):
        self.image = image.astype('float')

    def predict(self, x, y):
        # x and y are arrays of (fractional) source coordinates
        image = self.image
        # bilinear weights for the two neighboring pixels along each axis
        weights_x = [(1-(x % 1)).reshape(*x.shape,1), (x % 1).reshape(*x.shape,1)]
        weights_y = [(1-(y % 1)).reshape(*x.shape,1), (y % 1).reshape(*x.shape,1)]
        start_x = np.floor(x)
        start_y = np.floor(y)
        # note: inside the comprehension, x and y are re-bound to the integer
        # offsets 0 and 1 coming from itertools.product
        return sum([image[np.clip(np.floor(start_x + x), 0, image.shape[0]-1).astype('int'),
                          np.clip(np.floor(start_y + y), 0, image.shape[1]-1).astype('int')] * weights_x[x]*weights_y[y]
                    for x,y in itertools.product(range(2),range(2))])
image_model = Image_knn()
image_model.fit(image)
transformed_image = image_model.predict(*rotated_coordinates.T).reshape(*image.shape)
plt.imshow(np.swapaxes(transformed_image,0,1))
And I get a result like this
Possible Issue
The artifact in the bottom left that looks like one needs to clean the screen comes from the following problem: When we rotate it can happen that we don't have enough pixels to paint the lower left. What we do by default in image_knn is to clip the coordinates to an area where we have information. That means when we ask image knn for pixels coming from outside the image it gives us the pixels at the boundary of the image. This looks good if there is a background but if an object touches the edge of the picture it looks odd like here. Just something to keep in mind when using this.
Thank you for your answer!
But actually it is not a misconception that you can let this rotation be represented by a matrix multiplication with the reshaped vector.
I used your code to generate such a matrix (it's surely not the most efficient way, but it works; most likely you'll see a more efficient implementation immediately XD. You see, I really need it as a matrix multiplication :-D).
What I basically did is generate the representation matrix of the linear transformation by computing how each of the 100*100 basis images (i.e. the image with zeros everywhere and a one in a single position) is mapped by your transformation.
import sys
import numpy as np
import matplotlib.pyplot as plt
import itertools
angle = 2*np.pi/6
image_expl = plt.imread('wikipedia.jpg')
image_expl = image_expl[:,:,0]
plt.imshow(image_expl)
plt.title("Image")
plt.show()
image_shape = image_expl.shape
pixel_number = image_shape[0]*image_shape[1]
rot_mat = np.zeros((pixel_number,pixel_number))
for i in range(pixel_number):
vector = np.zeros(pixel_number)
vector[i] = 1
image = vector.reshape(*image_shape)
fixed_point = np.array(image.shape, dtype='float')/2
points = np.moveaxis(np.indices(image.shape),0,-1).reshape(-1,2)
a = -angle
A = np.array([[np.cos(a),-np.sin(a)],[np.sin(a),np.cos(a)]])
rotated_coordinates = (A @ (points - fixed_point.reshape(1,2)).T).T + fixed_point.reshape(1,2)
x,y = rotated_coordinates.T
image = image.astype('float')
weights_x = [(1-(x % 1)).reshape(*x.shape), (x % 1).reshape(*x.shape)]
weights_y = [(1-(y % 1)).reshape(*x.shape), (y % 1).reshape(*x.shape)]
start_x = np.floor(x)
start_y = np.floor(y)
transformed_image_returned = sum([image[np.clip(np.floor(start_x + x), 0, image.shape[0]-1).astype('int'),
np.clip(np.floor(start_y + y), 0, image.shape[1]-1).astype('int')] * weights_x[x]*weights_y[y]
for x,y in itertools.product(range(2),range(2))])
rot_mat[:,i] = transformed_image_returned
if i%100 == 0: print(int(100*i/pixel_number), "% finished")
plt.imshow((rot_mat @ image_expl.reshape(-1)).reshape(image_shape))
Thank you again :-)

image distance transform different xyz voxel sizes

I would like to find minimum distance of each voxel to a boundary element in a binary image in which the z voxel size is different from the xy voxel size. This is to say that a single voxel represents a 225x110x110 (zyx) nm volume.
Normally, I would do something with scipy.ndimage.morphology.distance_transform_edt (https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.ndimage.morphology.distance_transform_edt.html), but this assumes isotropic voxel sizes:
dtrans_stack = np.zeros_like(segm_stack) # empty array to add to
### iterate over the t dimension and get distance transform
for t_iter in range(dtrans_stack.shape[0]):
segm_ = segm_stack[t_iter, ...] # segmented image in single t
neg_segm = np.ones_like(segm_) - segm_ # negative of the segmented image
# get a ditance transform with isotropic voxel sizes
dtrans_stack_iso = distance_transform_edt(segm_)
dtrans_neg_stack_iso = -distance_transform_edt(neg_segm) # make distance in the segmented image negative
dtrans_stack[t_iter, ...] = dtrans_stack_iso + dtrans_neg_stack_iso
I can do this with brute force using scipy.spatial.distance.cdist (https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.cdist.html) but this takes ages and I'd rather avoid it if I can
vox_multiplier = np.array([z_voxelsize, xy_voxelsize, xy_voxelsize]) # array of voxel sizes
## get a subset of coordinates so I'm not wasting time in empty space
disk_size = 5 # size of disk for binary dilation
mip_tz = np.max(np.max(decon_stack, axis = 1), axis = 0)
thresh_li = threshold_li(mip_tz) # from skimage.filters
mip_mask = mip_tz >= thresh_li
mip_mask = remove_small_objects(mip_mask) # from skimage.morphology
mip_dilated = binary_dilation(mip_mask, disk(disk_size)) # from skimage.morphology
# get the coordinates of the mask
coords = np.argwhere(mip_dilated == 1)
ycoords = coords[:, 0]
xcoords = coords[:, 1]
# get the lower and upper bounds of the xyz coordinates
ylb = np.min(ycoords)
yub = np.max(ycoords)
xlb = np.min(xcoords)
xub = np.max(xcoords)
zlb = 0
zub = zdims -1
# make zeros arrays of the proper size
dtrans_stack = np.zeros_like(segm_stack)
dtrans_stack_neg = np.zeros_like(segm_stack) # this will be the distance transform into the low inten area
for t_iter in range(dtrans_stack.shape[0]):
segm_ = segm_stack[t_iter, ...]
neg_segm_ = np.ones_like(segm_) - segm_ # negative of the segmented image
# get the coordinates of the segmented image and convert to nm
segm_coords = np.argwhere(segm_ == 1)
segm_coords_nm = vox_multiplier * segm_coords
neg_segm_coords = np.argwhere(neg_segm_ == 1)
neg_segm_coords_nm = vox_multiplier * neg_segm_coords
# make empty arrays for the xy and z distance transforms
dtrans_stack_x = np.zeros_like(segm_)
dtrans_stack_y = np.zeros_like(segm_)
dtrans_stack_z = np.zeros_like(segm_)
dtrans_stack_neg_x = np.zeros_like(segm_)
dtrans_stack_neg_y = np.zeros_like(segm_)
dtrans_stack_neg_z = np.zeros_like(segm_)
# iterate over the zyx and determine the minimum distance in nm from segmented image
for z_iter in range(zlb, zub):
for y_iter in range(ylb, yub):
for x_iter in range(xlb, xub):
coord_nm = vox_multiplier* np.array([z_iter, y_iter, x_iter]) # change coords from pixel to nm
coord_nm = coord_nm.reshape(1, 3) # reshape for distance calculation
dists_segm = distance.cdist(coord_nm, segm_coords_nm) # distance from the segmented image
dists_neg_segm = distance.cdist(coord_nm, neg_segm_coords_nm) # distance from the negative segmented image
dtrans_stack[t_iter, z_iter, y_iter, x_iter] = np.min(dists_segm) # add minimum distance to distance transfrom stack
dtrans_stack_neg[t_iter, z_iter, y_iter, x_iter] = np.min(dists_neg_segm)
Here is an image of a single z-slice of the segmented image, if that helps clear things up:
single z-slice of segmented image
Normally, I would do something with scipy.ndimage.morphology.distance_transform_edt but this assumes isotropic voxel sizes:
It does no such thing! You are looking for the sampling= parameter. From the latest version of the docs:
Spacing of elements along each dimension. If a sequence, must be of length equal to the input rank; if a single number, this is used for all axes. If not specified, a grid spacing of unity is implied.
The wording "sampling" or "spacing" is probably a bit mysterious if you think of pixels as little squares/cubes, and that is probably why you missed it. In most situations, it is better to think of pixels as point samples on a grid, with fixed spacing between samples. I recommend Alvy Ray Smith's "A Pixel Is Not a Little Square" for a better understanding of this terminology.
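For the question's setup, a minimal sketch might look like this (assuming, as in the question, that segm_stack is a binary t/z/y/x array; the voxel sizes are the ones stated there, and the stand-in data is made up):
import numpy as np
from scipy.ndimage import distance_transform_edt

z_voxelsize, xy_voxelsize = 225.0, 110.0  # nm, from the question
segm_stack = np.random.randint(0, 2, (2, 10, 64, 64))  # stand-in binary t/z/y/x stack

dtrans_stack = np.zeros(segm_stack.shape, dtype=float)
spacing = (z_voxelsize, xy_voxelsize, xy_voxelsize)  # physical spacing along z, y, x
for t_iter in range(segm_stack.shape[0]):
    segm_ = segm_stack[t_iter, ...]
    neg_segm_ = 1 - segm_
    # sampling= makes the EDT return distances in nm rather than in voxels
    dtrans_stack[t_iter, ...] = (distance_transform_edt(segm_, sampling=spacing)
                                 - distance_transform_edt(neg_segm_, sampling=spacing))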

generate center frequency for sin wave in python

I am implementing the spreading function in a psychoacoustic model.
My steps:
Generate center frequency of each Bark
Group subband frequency to Bark group
Calculate power of each critical band
Convert to dB
I started with step 1 and got a syntax error in Python.
I have no idea how to correct it.
# Generate center frequency of each Bark
center = zeros(1, size(bark_scale, 2) - 1)
for k in 1:size(bark_scale, 2) - 1:
center(k) = (bark_scale(k) + bark_scale(k + 1)) / 2
end
Edit:
# Generate center frequency of each Bark
k = np.arange(bark_scale.shape[1] - 1)
center = (bark_scale[0, k] + bark_scale[0, k + 1]) / 2
Edit+corrections
Here is how to correct your code. I assume that bark_scale is an array of shape (1, n) and is loaded with numpy.
NumPy: First you need to import numpy to be able to handle arrays:
import numpy as np
Array size: To get the size of the second dimension of an array (e.g. bark_scale), you should use bark_scale.shape[1] instead of size(bark_scale, 2).
Array filled with zeros: To define an array filled with zeros, you should use np.zeros((1, 10)) instead of zeros(1, 10).
For loop: In Matlab, it looks like this:
for k = 1:10
% ...
end
whereas in Python it looks like:
for k in range(0, 10):
# ...
Array access: You should use bark_scale[k] instead of bark_scale(k) to get the k-th value of bark_scale.
With all these points you should be able to correct your code. I hope this will help you.
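Putting these points together, a corrected version of step 1 might look like this (assuming, as above, that bark_scale is a NumPy array of shape (1, n) holding the Bark band edges; the edge values below are just placeholders):
import numpy as np

# placeholder band edges in Hz; shape (1, n) as assumed above
bark_scale = np.array([[0, 100, 200, 300, 400, 510, 630]])

# center frequency of each Bark band = midpoint of adjacent edges
center = (bark_scale[0, :-1] + bark_scale[0, 1:]) / 2
print(center)  # [ 50. 150. 250. 350. 455. 570.]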

How to perform logical operation and logical indexing using VIPS in Python?

I have the following code that uses Python and OpenCV. Briefly, I have a stack of images taken at different focal depths. The code picks out, at every (x,y) position, the pixel that has the largest Laplacian of Gaussian response among all focal depths (z), thus creating a focus-stacked image. The function get_fmap creates a 2-D array where each pixel contains the number of the focal plane having the largest LoG response. In the following code, lines that are commented out are my current VIPS implementation. They don't look compatible within the function definition because they are only a partial solution.
import sys
import numpy as np
import cv2
# from gi.repository import Vips
def get_log_kernel(siz, std):
x = y = np.linspace(-siz, siz, 2*siz+1)
x, y = np.meshgrid(x, y)
arg = -(x**2 + y**2) / (2*std**2)
h = np.exp(arg)
h[h < sys.float_info.epsilon * h.max()] = 0
h = h/h.sum() if h.sum() != 0 else h
h1 = h*(x**2 + y**2 - 2*std**2) / (std**4)
return h1 - h1.mean()
def get_fmap(img): # img is a 3-d numpy array.
log_response = np.zeros_like(img[:, :, 0], dtype='single')
fmap = np.zeros_like(img[:, :, 0], dtype='uint8')
log_kernel = get_log_kernel(11, 2)
# kernel = get_log_kernel(11, 2)
# kernel = [list(row) for row in kernel]
# kernel = Vips.Image.new_from_array(kernel)
# img = Vips.new_from_file("testimg.tif")
for ii in range(img.shape[2]):
# img_filtered = img.conv(kernel)
img_filtered = cv2.filter2D(img[:, :, ii].astype('single'), -1, log_kernel)
index = img_filtered > log_response
log_response[index] = img_filtered[index]
fmap[index] = ii
return fmap
fmap will then be used to pick out pixels from different focal planes to create a focus-stacked image.
This is done on an extremely large image, and I feel VIPS might do a better job than OpenCV on this. However, the official documentation provides rather scant information on its Python binding. From the information I can find on the internet, I'm only able to make image convolution work (which, in my case, is an order of magnitude faster than OpenCV). I'm wondering how to implement the following lines in VIPS:
log_response = np.zeros_like(img[:, :, 0], dtype = 'single')
index = img_filtered > log_response
log_response[index] = img_filtered[index]
fmap[index] = ii
log_response and fmap are initialized as 3-D arrays in the question code, whereas the question text states that the output, fmap, is a 2-D array. So, I am assuming that log_response and fmap are to be initialized as 2-D arrays with their shapes the same as each image. Thus, the edits would be:
log_response = np.zeros_like(img[:,:,0], dtype='single')
fmap = np.zeros_like(img[:,:,0], dtype='uint8')
Now, back to the theme of the question: you are performing 2D filtering on each image one-by-one and getting the maximum index of the filtered output across all stacked images. In case you didn't know, as per the documentation of cv2.filter2D, it can also be used on a multi-dimensional array, giving a multi-dimensional array as output. Then, getting the maximum index across all images is as simple as .argmax(2). Thus, the implementation would be extremely efficient and simply:
fmap = cv2.filter2D(img,-1,log_kernel).argmax(2)
After consulting the Python VIPS manual and some trial and error, I've come up with my own answer. My NumPy and OpenCV implementation in the question can be translated into VIPS like this:
import pyvips

img = []
for ii in range(num_z_levels):
    img.append(pyvips.Image.new_from_file("testimg_z" + str(ii) + ".tif"))

def get_fmap(img):
    log_kernel = get_log_kernel(11, 2)  # get_log_kernel is my own function, which generates a 2-D numpy array.
    log_kernel = [list(row) for row in log_kernel]  # pyvips.Image.new_from_array takes a list of lists.
    log_kernel = pyvips.Image.new_from_array(log_kernel)  # Turn the kernel into a Vips image so Vips can use it.
    log_response = img[0].conv(log_kernel)
    fmap = log_response * 0  # plane 0 wins everywhere until a stronger response is found
    for ii in range(1, len(img)):
        img_filtered = img[ii].conv(log_kernel)
        mask = img_filtered > log_response  # compare before updating log_response
        fmap = mask.ifthenelse(ii, fmap)
        log_response = mask.ifthenelse(img_filtered, log_response)
    return fmap
Logical indexing is achieved through the ifthenelse method:
result_img = (test_condition).ifthenelse(value_if_true, value_if_false)
The syntax is rather flexible. The test condition can be a comparison between two images of the same size or between an image and a value, e.g. img1 > img2 or img > 5. Likewise, value_if_true can be a single value or a Vips image.
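For instance, a small sketch of both forms (the file name here is just a placeholder):
import pyvips

img1 = pyvips.Image.new_from_file("some_image.tif")  # placeholder file
img2 = img1.gaussblur(2)

# image-vs-image condition with image results: per-pixel maximum of the two images
maxed = (img1 > img2).ifthenelse(img1, img2)

# image-vs-constant condition with a constant result: clamp values above 128
clamped = (img1 > 128).ifthenelse(128, img1)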

Numpy Histogram

I have a problem in which I have a bunch of images for which I have to generate histograms. But I have to generate a histogram for each pixel. I.e., for a collection of n images, I have to count the values that pixel (0,0) assumed and generate a histogram, and the same for (0,1), (0,2), and so on. I coded the following method to do this:
class ImageData:
def generate_pixel_histogram(self, images, bins):
"""
Generate a histogram of the image for each pixel, counting
the values assumed for each pixel in a specified bins
"""
max_value = 0.0
min_value = 0.0
for i in range(len(images)):
image = images[i]
max_entry = max(max(p[1:]) for p in image.data)
min_entry = min(min(p[1:]) for p in image.data)
if max_entry > max_value:
max_value = max_entry
if min_entry < min_value:
min_value = min_entry
interval_size = (math.fabs(min_value) + math.fabs(max_value))/bins
for x in range(self.width):
for y in range(self.height):
pixel_histogram = {}
for i in range(bins+1):
key = round(min_value+(i*interval_size), 2)
pixel_histogram[key] = 0.0
for i in range(len(images)):
image = images[i]
value = round(Utils.get_bin(image.data[x][y], interval_size), 2)
pixel_histogram[value] += 1.0/len(images)
self.data[x][y] = pixel_histogram
Here, each position of a matrix stores a dictionary representing a histogram. But since I do this for each pixel, and the calculation takes considerable time, this seems to me to be a good problem to parallelize. However, I don't have experience with this and I don't know how to do it.
EDIT:
I tried what @Eelco Hoogendoorn suggested and it works perfectly. But applying it to my code, where the data are a large number of images generated with this constructor (after the values are calculated, not just 0 anymore), I just got as h an array of zeros [0 0 0]. What I pass to the histogram method is an array of ImageData.
class ImageData(object):
def __init__(self, width=5, height=5, range_min=-1, range_max=1):
"""
The ImageData constructor
"""
self.width = width
self.height = height
#The values range each pixel can assume
self.range_min = range_min
self.range_max = range_max
self.data = np.arange(width*height).reshape(height, width)
#Another class, just the method here
def generate_pixel_histogram(realizations, bins):
"""
Generate a histogram of the image for each pixel, counting
the values assumed for each pixel in a specified bins
"""
data = np.array([image.data for image in realizations])
min_max_range = data.min(), data.max()+1
bin_boundaries = np.empty(bins+1)
# Function to wrap np.histogram, passing on only the first return value
def hist(pixel):
h, b = np.histogram(pixel, bins=bins, range=min_max_range)
bin_boundaries[:] = b
return h
# Apply this for each pixel
hist_data = np.apply_along_axis(hist, 0, data)
print hist_data
print bin_boundaries
Now I get:
hist_data = np.apply_along_axis(hist, 0, data)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/lib/shape_base.py", line 104, in apply_along_axis
outshape[axis] = len(res)
TypeError: object of type 'NoneType' has no len()
Any help would be appreciated.
Thanks in advance.
As noted by john, the most obvious solution to this is to look for library functionality that will do this for you. It exists, and it will be orders of magnitude more efficient than what you are doing here.
Standard numpy has a histogram function that can be used for this purpose. If you have only a few values per pixel, it will be relatively inefficient, and it creates a dense histogram vector rather than the sparse one you produce here. Still, chances are good that the code below solves your problem efficiently.
import numpy as np
#some example data; 128 images of 4x4 pixels
voxeldata = np.random.randint(0,100, (128, 4,4))
#we need to apply the same binning range to each pixel to get sensible output
globalminmax = voxeldata.min(), voxeldata.max()+1
#number of output bins
bins = 20
bin_boundaries = np.empty(bins+1)
#function to wrap np.histogram, passing on only the first return value
def hist(pixel):
h, b = np.histogram(pixel, bins=bins, range=globalminmax)
bin_boundaries[:] = b #simply overwrite; result should be identical each time
return h
#apply this for each pixel
histdata = np.apply_along_axis(hist, 0, voxeldata)
print bin_boundaries
print histdata[:,0,0] #print the histogram of an arbitrary pixel
But the more general message I'd like to convey, looking at your code sample and the type of problem you are working on, is: do yourself a favor and learn numpy.
Parallelization certainly would not be my first port of call in optimizing this kind of thing. Your main problem is that you're doing lots of looping at the Python level. Python is inherently slow at this kind of thing.
One option would be to learn how to write Cython extensions and write the histogram bit in Cython. This might take you a while.
Actually, taking a histogram of pixel values is a very common task in computer vision and it has already been efficiently implemented in OpenCV (which has python wrappers). There are also several functions for taking histograms in the numpy python package (though they are slower than the OpenCV implementations).
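For example, OpenCV's calcHist computes the histogram of a whole image very quickly (a sketch with made-up data; note this is a per-image histogram, not the per-pixel-across-images histogram asked about, which would still use the np.histogram approach above):
import cv2
import numpy as np

img = np.random.randint(0, 256, (4, 4), dtype=np.uint8)  # stand-in image
# 20 bins over the value range [0, 256)
hist = cv2.calcHist([img], [0], None, [20], [0, 256])
print(hist.ravel())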
