Numpy Array python dimension uniform - python

I have a 2-dimensional array with 15 elements in one dimension and a variable length in the second dimension,
for example:
>>print abc.size()
15
>>print abc[0].size()
5873
>>print abc[1].size()
9825
How can I make the array dimensions uniform, using either numpy or a scikit sparse array? The data are HOG features of an image.

Assuming you want to align all the arrays to the left and pad on the right with zeros, you could first find the maximum length with
max_len = max(a.size for a in abc)
and then pad using zeros:
import numpy as np
for i in range(len(abc)):
    # size is an attribute of a NumPy array, not a method
    abc[i] = np.append(abc[i], np.zeros(max_len - abc[i].size))
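If the goal is a regular 2-D array rather than a list of equal-length arrays, the same right-padding idea can be sketched as follows. The helper name pad_to_matrix and the sample data are made up for illustration.

```python
import numpy as np

def pad_to_matrix(arrays):
    """Stack 1-D arrays of different lengths into one zero-padded 2-D array."""
    max_len = max(len(a) for a in arrays)
    out = np.zeros((len(arrays), max_len), dtype=float)
    for row, a in enumerate(arrays):
        out[row, :len(a)] = a  # left-aligned; the tail of the row stays zero
    return out

# Hypothetical stand-in for the variable-length feature arrays
abc = [np.arange(1, 4), np.arange(1, 6)]
padded = pad_to_matrix(abc)
print(padded.shape)
```

Every row then has the length of the longest input, with zeros filling the unused positions.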

We have two possible cases here:
abc is a list of images, and abc[i] is the set of HOG features of image i.
abc is one image, and each abc[i] is the i-th HOG feature of that image.
In the first case, the image sizes or the HOG parameters (the neighbourhood size) differ from one image to another, so you need to adjust the parameters in order to calculate the HOG features consistently for all the images (if you want fixed-size descriptors).
In the second case, your HOG computation is not correct (the HOG descriptors of a single image should not have different sizes).
So in either case, resizing your arrays is not the way to go. You need to fix your HOG computations.
Edit: related to your problem, you have a dataset of images of different sizes. There are two common approaches for image classification with HOG descriptors. But first, a quick summary of HOG:
HOG splits the image into M x N windows of size m x n each and calculates a histogram of oriented gradients with a fixed number W of bins (the number of orientations) in each window. Hence, you will end up with M x N x W features. Features are usually flattened into a 2D array of size K x W with K = M x N.
Now, for classification there are 2 common approaches:
Combine all the features of an image into one, that is, take an average (or weighted average or norm) over the K features to end up with a vector of size W for each image (the number of orientations).
To preserve (more or less) the spatial relationship of the features, another more common approach is to concatenate all the features to end up with a flattened 1D vector of size Z, with Z = K x W.
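The difference between the two approaches can be illustrated on a K x W feature array. This is only a sketch: the values of K and W are arbitrary, and the random array stands in for real HOG output.

```python
import numpy as np

K, W = 12, 9  # hypothetical: 12 windows, 9 orientation bins
features = np.random.default_rng(0).random((K, W))  # stand-in for real HOG features

pooled = features.mean(axis=0)  # approach 1: average over windows, length W
flat = features.ravel()         # approach 2: concatenate windows, length Z = K * W

print(pooled.shape, flat.shape)
```

The pooled vector has the same length for any image, while the concatenated one only does so when K is the same for every image.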
From your data, I think you are trying to use the 2nd approach. The problem you are facing is that the images have different sizes, and therefore, for a fixed window size m x n, the number of features differs from one image to another.
You could fix that by fixing the number of windows M x N you want and, for a given image, calculating m = height / M and n = width / N, then computing the HOG descriptors with that custom m x n window size (which is different for every image). This way, you will end up with K = M x N windows, with the same K (but a different window size) for every image.
With a fixed K and therefore fixed Z you would be able to perform classification.
I don't know which library you are using to compute the HOG, but the m x n window size parameter should be easy to set manually for every image.
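The window-size arithmetic can be sketched independently of any HOG library. The function names and image sizes below are illustrative, and the integer division (which ignores leftover pixels at the border) is an assumption rather than part of the recipe above.

```python
def window_size(height, width, M, N):
    """Window size (m, n) that splits a height x width image into M x N windows."""
    m = height // M  # assumption: leftover rows at the border are dropped
    n = width // N   # assumption: leftover columns at the border are dropped
    if m == 0 or n == 0:
        raise ValueError("image is smaller than the requested grid")
    return m, n

def descriptor_length(M, N, W):
    """Length Z of the concatenated descriptor: K * W with K = M * N."""
    return M * N * W

# Two hypothetical images of different sizes, same 8 x 8 grid, 9 orientations
sizes = [window_size(h, w, 8, 8) for h, w in [(480, 640), (300, 200)]]
print(sizes)                        # [(60, 80), (37, 25)]
print(descriptor_length(8, 8, 9))   # 576
```

The window size changes with the image, but the descriptor length depends only on M, N and W.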
Hope it helps!

Related

Can this be done faster with numpy?

There's a color image, a numpy array of shape (h,w,3) with N=h*w pixels; there's an array labels of shape (h,w), each label an integer between 1 and M. N is 10^6-10^7, M is 10^3-10^4.
I need to produce
a result image (h,w,3) where the color of each pixel labelled l is the mean color of all pixels labelled l. I.e.:
def recolor1(image, labels):
    h, w = labels.shape
    result = np.empty(shape=(h, w, 3))
    for label in np.unique(labels):
        mask = labels == label
        mean = np.mean(image[mask], axis=0)
        result[mask] = mean
    return result
The code is straightforward, but runs in O(M.N) (the computation of mask is O(N) and the loop runs M times).
An O(N) recolor2 is possible. Basically you go over the labels and image pixels twice. First to compute an auxiliary array, indexed by label, where you keep the sums of each primary and the number of pixels for that label. Then you compute the averages for each label. Then you go over labels and pixels again, computing result. The O(M) time to find the averages is noise.
With recolor2 written in Python, recolor1 and recolor2 break even for N=1000000 and M=1000 at ~4s. As expected, recolor1's time grows linearly to ~20s for M=5000, while recolor2's remains essentially the same.
4s for a relatively small image is not great and it will get much worse for larger images. I'm no expert in numpy and associated libraries. Is there an O(N) solution there?
Let's try np.bincount and loop over the channels:
result = np.stack([np.bincount(labels.flat, weights=image[..., i].flat)[labels]
                   for i in range(3)],
                  axis=-1)
which takes about 35ms on my system with h,w,M = 1000,1000,1000.
Note: this computes the per-label sum; dividing by the per-label pixel counts, np.bincount(labels.flat), gives the mean.
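For completeness, here is a sketch of how the sum could be turned into the mean with a second np.bincount for the pixel counts. The function name recolor_bincount is made up, and it assumes labels holds non-negative integers.

```python
import numpy as np

def recolor_bincount(image, labels):
    """Replace each pixel's color with the mean color of its label."""
    flat = labels.ravel()
    counts = np.bincount(flat)       # pixels per label, indexed by label value
    counts = np.maximum(counts, 1)   # unused label values: avoid division by zero
    channels = []
    for c in range(image.shape[-1]):
        sums = np.bincount(flat, weights=image[..., c].ravel())
        channels.append((sums / counts)[labels])  # look up each pixel's mean
    return np.stack(channels, axis=-1)

rng = np.random.default_rng(0)
labels = rng.integers(1, 5, size=(4, 5))
image = rng.random((4, 5, 3))
result = recolor_bincount(image, labels)
print(result.shape)
```

Both bincount calls and the final lookup are single passes over the pixels, so the cost does not grow with the number of labels.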

How to convolve a 3 dimensional array (in this case a filter bank) with a 2 dimensional image (monochrome) in Python?

I have a function definition that takes in an image that is monochromatic and 2 dimensional, and a filter bank that is a 3 dimensional array (48 2D filters). I need to convolve the two to find the feature vector at each pixel location. How do I do that?
I have tried scipy.ndimage.convolve() but get the error "filter weights array has incorrect shape."
To keep things simple, loop over the third dimension of your filter bank and convolve the image with each filter in turn. Afterwards, stack the results into a 3D matrix. This is actually what I would do for readability.
Suppose your image is stored in img and your filters are stored in filters. img is of size M x N and your filters are of size R x C x D with D being the total number of filters you have.
As you've alluded to using scipy.ndimage.convolve, we can just use that. However, it's possible to use cv2.filter2D too. I'll show you how to use both.
Method #1 - Using scipy.ndimage.convolve
import scipy.ndimage
import numpy as np
outputs = []
D = filters.shape[2]
for i in range(D):
    filt = filters[..., i]
    out = scipy.ndimage.convolve(img, filt)
    outputs.append(out)
outputs = np.dstack(outputs)
The above is straightforward. We create an empty list to store the convolution results, then extract the total number of filters. After that, we loop over each filter, convolve the image with it and append the result to the list. We then use numpy.dstack to stack all of the 2D responses together into a 3D matrix.
Method #2 - Using cv2.filter2D
import cv2
import numpy as np
outputs = []
D = filters.shape[2]
for i in range(D):
    filt = filters[..., i]
    filt = filt[::-1, ::-1]
    out = cv2.filter2D(img, -1, filt)
    outputs.append(out)
outputs = np.dstack(outputs)
This is exactly the same as Method #1 with the exception of calling cv2.filter2D instead. Also take note that I had to rotate the kernel by 180 degrees as cv2.filter2D performs correlation and not convolution. To perform convolution with cv2.filter2D, you need to rotate the kernel first prior to running the method. Take note that the second parameter to cv2.filter2D is the output data type of the result. We set this to -1 to say that it will be whatever the input data type is.
Note on indexing
If you want to avoid indexing into your filter bank altogether and let the for loop do that for you, you can move the axes around so that the number of filters is the first dimension. You can then construct the resulting 3D output matrix with a list comprehension:
filters = filters.transpose((2, 0, 1))
outputs = np.dstack([scipy.ndimage.convolve(img, filt) for filt in filters])
You can make the monochrome image a 3D array by either padding with zeros or replicating the image itself. The number of channels needed depends on the depth of the convolution kernel. For example, let d be the depth of the convolution kernel and I be your image; then
import numpy as np
# Do this for copying the image across channels
I_pad = np.repeat(I[:, :, np.newaxis], d, axis=-1)
# Do this for zero padding (the image stays in the first channel)
zeros = np.zeros(I.shape + (d - 1,), dtype=I.dtype)
I_pad = np.concatenate((I[:, :, np.newaxis], zeros), axis=-1)
Then carry out the convolution. Hope it helps.

Multiplying subarrays of tensor

I am trying to implement a multivariate Gaussian Mixture Model and am trying to calculate the probability density function using tensors. There are n data points, k clusters, and d dimensions. So far, I have two tensors. One is an (n,k,d) tensor of centered data points and the other is a kxdxd tensor of covariance matrices. I can compute an nxk matrix of probabilities by doing
centered = np.repeat(points[:, np.newaxis, :], k, axis=1) - mu[np.newaxis, :]  # NxKxD
prob = np.zeros((n, k))
constant = 1/2/np.power(np.pi, d/2)
for n in range(centered.shape[0]):
    for k in range(centered.shape[1]):
        p = centered[n, k, :][np.newaxis]  # 1xD
        power = -1/2*(p @ np.linalg.inv(sigma[k, :, :]) @ p.T).item()
        prob[n, k] = constant * np.linalg.det(sigma[k, :, :]) * np.exp(power)
where sigma is the triangularized kxdxd matrix of covariances and centered are my points. What is a more pythonic way of doing this using numpy's tensor capabilities?
Just a couple of quick observations:
I don't see you using p in the loop; is this a mistake? Using n instead?
The T in centered[n,k,:].T does nothing; with that index the array is 1d.
np.linalg.inv should handle batches of arrays (it inverts over the last two dimensions), allowing np.linalg.inv(sigma).
@ allows batches, just so long as the last 2 dims are the ones entering into the dot (with the usual last of A, 2nd to the last of B rule); einsum can also be used.
Likewise, np.linalg.det handles batches, returning one determinant per matrix.
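Putting these observations together, a loop-free version might look like the sketch below. It assumes points is (n, d), mu is (k, d) and sigma holds full (k, d, d) covariance matrices, and it uses the standard multivariate normal density (dividing by the square root of the determinant), which is not the same normalization as in the question's loop. The function name is made up.

```python
import numpy as np

def gaussian_pdf_batched(points, mu, sigma):
    """Density of each of n points under each of k Gaussians, as an (n, k) array."""
    d = points.shape[1]
    centered = points[:, np.newaxis, :] - mu[np.newaxis, :, :]  # (n, k, d) via broadcasting
    sigma_inv = np.linalg.inv(sigma)                             # batched inverse, (k, d, d)
    # Quadratic form x^T Sigma^-1 x for every (point, cluster) pair
    maha = np.einsum('nkd,kde,nke->nk', centered, sigma_inv, centered)
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(sigma))      # batched determinant, (k,)
    return np.exp(-0.5 * maha) / norm

rng = np.random.default_rng(0)
points = rng.normal(size=(5, 3))
mu = rng.normal(size=(2, 3))
A = rng.normal(size=(2, 3, 3))
sigma = A @ A.transpose(0, 2, 1) + 3 * np.eye(3)  # symmetric positive definite
prob = gaussian_pdf_batched(points, mu, sigma)
print(prob.shape)
```

Broadcasting replaces np.repeat here, and the einsum contracts both d-sized axes at once, so no Python loop over points or clusters is needed.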

Is there an efficient way of applying a radial average in keras?

I would like to apply a radial average at the end of a keras pipeline.
At the second to last step, I have an image of size n x n. I then want to map this n x n image to a 1 x n/2 vector, where vector[x] = mean(image(radialPosition = x)). I.e. I want to average all points at distance x from the center of the image, and set this as output[x]. We can assume that n is odd, so the center point is a single point.
I have considered looping over all radii and selecting the desired indices, as well as a dot product between the image and multiple "averaging" matrices, but neither of these seems computationally efficient.
Is there a better way of doing this?
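The mapping described above can be sketched in plain NumPy with np.bincount. This is not a Keras layer, only an illustration of the computation; rounding each pixel's distance to the nearest integer radius and ignoring the corners beyond radius n // 2 are assumptions.

```python
import numpy as np

def radial_average(image):
    """Mean of a square, odd-sized image over rings of integer radius 0..n//2."""
    n = image.shape[0]
    c = n // 2                                   # index of the center pixel
    y, x = np.indices(image.shape)
    r = np.rint(np.hypot(x - c, y - c)).astype(int)  # assumption: round to nearest radius
    keep = r <= c                                # drop corners beyond the last full ring
    sums = np.bincount(r[keep], weights=image[keep], minlength=c + 1)
    counts = np.bincount(r[keep], minlength=c + 1)
    return sums / counts

profile = radial_average(np.ones((5, 5)))
print(profile.shape)
```

The radius map depends only on n, so it could be computed once and reused for every image of that size.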

Indexing for 3 dimensional Numpy Arrays (convolutional network)

I'm trying to write a function that performs Convolution, and I'm getting a little challenged trying to create the output volume using numpy. Specifically, I have an input image that is represented as an array of dimensions (150,150,3). Now, I want to convolve over this image with a set of kernels num_kernels, which are arrays of dimension (4,4,3), and I want these kernels to move over the image with a stride of 2. My thought process has been:
(1) I'll create an output array which is comprised of taking (4,4,3) size chunks out of the input array and stretching these out into rows, and ultimately making a large matrix of these.
(2) Then, I'll create a parameter array composed of all of my (4,4,3) kernels stretched out into rows, which will also make a large matrix.
(3) Then I can dot product these matrices together and reshape the output matrix into the proper dimensions.
My rough pseudo-code start to number (1) is as follows.
def Convolution(input, filter_size, num_filters, stride):
    X = input
    output_Volume = np.zeros(#dimensions)
    weights = np.zeros(#dimensions)
    #get weights from other function
    for width in range(0,150,2):
        for height in range(0,150,2):
            row = X(#indexes here to take out chunk).flatten
            output_Volume.append(row) #something of this sort
    return #dot product output volume and weights
If someone could provide a specific code example of how to implement this (most helpful would be answers to (1) and (2)) in Python (I'm using numpy), it would be much appreciated. Thank you!
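Steps (1) to (3) can be sketched as follows. This is only one possible reading of the plan: it assumes no padding, takes the kernels as a single (num_kernels, kh, kw, C) array, and does not flip them (cross-correlation, as is usual in convolutional networks). The function name is made up.

```python
import numpy as np

def conv_im2col(image, kernels, stride):
    """image: (H, W, C); kernels: (num_kernels, kh, kw, C); returns (out_h, out_w, num_kernels)."""
    H, W, C = image.shape
    num_kernels, kh, kw, _ = kernels.shape
    out_h = (H - kh) // stride + 1
    out_w = (W - kw) // stride + 1
    # (1) one flattened (kh, kw, C) patch per output position
    cols = np.empty((out_h * out_w, kh * kw * C), dtype=float)
    for i in range(out_h):
        for j in range(out_w):
            top, left = i * stride, j * stride
            cols[i * out_w + j] = image[top:top + kh, left:left + kw, :].ravel()
    # (2) one flattened kernel per row
    weights = kernels.reshape(num_kernels, -1)
    # (3) a single matrix product, reshaped back into an output volume
    out = cols @ weights.T
    return out.reshape(out_h, out_w, num_kernels)

rng = np.random.default_rng(0)
image = rng.random((6, 6, 3))
kernels = rng.random((2, 4, 4, 3))
volume = conv_im2col(image, kernels, stride=2)
print(volume.shape)
```

With a (150,150,3) input, (4,4,3) kernels and a stride of 2, the same formula gives (150 - 4) // 2 + 1 = 74 output positions along each spatial axis.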
