I've got a numpy array 'image': a two-dimensional array where each element has two components. I want to convert it to another two-dimensional array where each element has three components: the first two, plus a third calculated from the first two, like so:
for x in range(0, width):
    for y in range(0, height):
        horizontal, vertical = image[y, x]
        annotated_image[y, x] = (horizontal, vertical, int(abs(horizontal) > 1.0 or abs(vertical) > 1.0))
This loop works as expected, but is very slow when compared to other numpy functions. For a medium-sized image this takes an unacceptable 30 seconds.
Is there a different way to do the same calculation but faster? The original image array does not have to be preserved.
You could just separate the components of the image and work with multiple images instead:
image_component1 = image[:, :, 0]
image_component2 = image[:, :, 1]
result = (np.abs(image_component1) > 1.) | (np.abs(image_component2) > 1.)
If for some reason you need the layout you specified, you could instead construct another three-dimensional image:
result = np.empty([image.shape[0], image.shape[1], 3], dtype=image.dtype)
result[:, :, 0] = image[:, :, 0]
result[:, :, 1] = image[:, :, 1]
result[:, :, 2] = (np.abs(image[:, :, 0]) > 1.) | (np.abs(image[:, :, 1]) > 1.)
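As a quick sanity check, the vectorized version can be compared against the original double loop on a small array. This is only a minimal sketch: the array size, the random seed, and the helper name `annotate` are illustrative assumptions and not part of the question. It uses np.dstack rather than pre-allocating with np.empty, which should give the same layout.

```python
import numpy as np

def annotate(image):
    """Append a third channel that is 1 where |horizontal| > 1 or |vertical| > 1."""
    # Boolean mask of shape (height, width)
    flag = (np.abs(image[:, :, 0]) > 1.0) | (np.abs(image[:, :, 1]) > 1.0)
    # Stack the two original channels and the flag along the last axis
    return np.dstack((image, flag.astype(image.dtype)))

# Small made-up test image with two components per pixel
rng = np.random.default_rng(0)
image = rng.normal(scale=1.5, size=(4, 6, 2))
annotated = annotate(image)

# Reference result from the explicit double loop in the question
expected = np.empty((4, 6, 3))
for y in range(4):
    for x in range(6):
        h, v = image[y, x]
        expected[y, x] = (h, v, int(abs(h) > 1.0 or abs(v) > 1.0))

print(np.array_equal(annotated, expected))
```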
I'm trying to compute a normal map by taking the derivative of a depth map. For that I'm using a Sobel filter:
d_im = cv2.imread('vin.png')
zx = cv2.Sobel(d_im, cv2.CV_64F, 1, 0, ksize=5)
zy = cv2.Sobel(d_im, cv2.CV_64F, 0, 1, ksize=5)
normal = np.dstack((-zx, -zy, np.ones_like(d_im)))
n = np.linalg.norm(normal, axis=2)
normal[:, :, 0] /= n
normal[:, :, 1] /= n
normal[:, :, 2] /= n
As seen below the result seems to have a harsh transition in the x, y direction:
While looking for a solution I read somewhere (unfortunately, I can't find the reference anymore) that this can be fixed by normalizing one side of the kernel. I am not sure what that means. Is it summing one side and then multiplying the kernel by 1/sum? If that does not sound right, is there any method to get a result that is less harsh than the one I'm getting?
I have a stack of images stored in a 4D array, e.g. [0, 0, :, :] is the image at the location (0, 0). Now I want to make a montage of the images and store them in a 2D array and do something with the images, then I want to transfer the montage back to a 4D array. How can I manage this with numpy? Following is a schematic of what I want to do. It is shown with a 3D array, but I think you can get the idea.
The first part of the operation can be carried out using np.block. You would need to convert to a non-array sequence type for the outer dimensions:
l = [list(x) for x in arr]
montage = np.block(l)
Alternatively, you can just arrange your dimensions the way you like first, then reshape. The key is to remember that later dimensions get raveled together. So if you have an array with (A, B) elements, each of which is an (M, N) image, the result should be an (A * M, B * N) image. You want the original image pixels from each row to stay contiguous, but the rows to be concatenated. So transpose and reshape like this:
a, b, m, n = arr.shape
montage = arr.transpose(0, 2, 1, 3).reshape(a * m, b * n)
You can reshape back using the inverse operation fairly easily:
stack = montage.reshape(a, m, b, n).transpose(0, 2, 1, 3)
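A small round-trip check may make the axis bookkeeping easier to follow. The sketch below uses made-up sizes (a 2 x 3 grid of 4 x 5 images) and confirms that each image lands in the expected block of the montage, that np.block gives the same montage, and that the inverse recovers the original stack.

```python
import numpy as np

# Made-up data: a 2 x 3 grid of 4 x 5 single-channel images
a, b, m, n = 2, 3, 4, 5
arr = np.arange(a * b * m * n).reshape(a, b, m, n)

# Forward: pair grid rows with image rows, grid columns with image columns
montage = arr.transpose(0, 2, 1, 3).reshape(a * m, b * n)

# The image at grid position (i, j) occupies this block of the montage
i, j = 1, 2
block = montage[i * m:(i + 1) * m, j * n:(j + 1) * n]

# np.block on nested lists builds the same montage
montage_block = np.block([list(row) for row in arr])

# Inverse: split both axes again and swap the middle axes back
stack = montage.reshape(a, m, b, n).transpose(0, 2, 1, 3)

print(montage.shape, np.array_equal(stack, arr))
```

Note that a plain `arr.reshape(a * m, b * n)` without the transpose would keep all the values but would not place each image in its own rectangular block.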
This is actually the default behavior of np.reshape(). Just calculate how wide/tall the collage image will be, and then call np.reshape. reshape again will reverse it.
import numpy as np
# placeholder data -- 4 images that are 5x5
image = np.arange(4 * 5 * 5 * 3).reshape(4, 5, 5, 3)
# 2x2 grid of images
collage = image.reshape(10, 10, 3)
result = collage.reshape(4, 5, 5, 3)
assert np.array_equal(image, result)
Edit: I misunderstood the question. I assumed that the 4D array was a 1D list of NxMx3 RGB images. If, instead, it is a 2D grid of 2D (single-channel) images, I can't think of a clever way to do it with numpy operations. But it shouldn't be too slow to just use a Python for-loop.
(assuming row-major order)
# rows = number of rows in image grid
# cols = number of cols in image grid
# width = width of each image
# height = height of each image
rows, cols, height, width = images.shape
# np.empty takes the shape as a single tuple
collage = np.empty((rows * height, cols * width), dtype=images.dtype)
for i in range(rows):
    for j in range(cols):
        y = i * height
        x = j * width
        collage[y:y+height, x:x+width] = images[i, j]
Then to reverse just flip it:
result = np.empty((rows, cols, height, width), dtype=collage.dtype)
for i in range(rows):
    for j in range(cols):
        y = i * height
        x = j * width
        # copy each block of the collage back into its grid position
        result[i, j, :, :] = collage[y:y+height, x:x+width]
I have two tensors of shape a(16,8,8,64) and b(64,64). Suppose I extract the last dimension of a into a column vector c; I want to compute matmul(matmul(c.T, b), c). I want this to be done for each position in the first 3 dimensions of a. That is, the final product should be of shape (16,8,8,1). How can I achieve this in pytorch?
Can be done as follows:
row_vec = a[:, :, :, None, :].float()
col_vec = a[:, :, :, :, None].float()
b = (b[None, None, None, :, :]).float()
prod = torch.matmul(torch.matmul(row_vec, b), col_vec)  # shape (16, 8, 8, 1, 1)
prod = prod.squeeze(-1)  # drop one trailing axis to get (16, 8, 8, 1)
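The shape logic can also be checked without PyTorch. The sketch below is a NumPy stand-in, not PyTorch code: it relies on np.matmul broadcasting over leading batch dimensions in the same way as torch.matmul, uses random made-up data, and compares the result with an np.einsum formulation of the same quadratic form.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=(16, 8, 8, 64))
b = rng.normal(size=(64, 64))

# Same construction as above: view the last axis as a row and a column vector
row_vec = a[:, :, :, None, :]   # (16, 8, 8, 1, 64)
col_vec = a[:, :, :, :, None]   # (16, 8, 8, 64, 1)
prod = np.matmul(np.matmul(row_vec, b), col_vec)  # (16, 8, 8, 1, 1)
prod = prod[..., 0]             # (16, 8, 8, 1)

# Equivalent einsum: sum over i, j of c_i * b_ij * c_j at every position
quad = np.einsum('...i,ij,...j->...', a, b, a)[..., None]

print(prod.shape, np.allclose(prod, quad))
```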
Let tensor T have shape [B, N, N, 6]. I want to multiply the matrices [b, N, N, 0:3] by [b, N, N, 5] element-wise for each b in range(B). Note that [N, N, 4] should not be changed. What is the best way to do this using tensorflow?
My attempts:
result = tf.empty([B, N, N, 5])
for b in range(B):
    for i in range(4):
        result[b, :, :, i] = tf.mul(T[b, :, :, i], T[b, :, :, 5])
    result[b, :, :, 4] = T[b, :, :, 4]
In TensorFlow, it's not generally possible to build a tensor value by assigning to slices. The programming model tends to be more functional than imperative. One way of implementing your calculation is as follows:
# TensorFlow 1.0+ spelling; older releases used tf.mul and tf.concat(3, [...])
result = tf.concat([tf.multiply(T[:, :, :, 0:4], T[:, :, :, 5:6]), T[:, :, :, 4:5]], axis=3)
Note that you don't need multiple multiplications, because (i) the original computation is already element-wise on the 0th dimension (for b in range(B)), and (ii) TensorFlow will broadcast the second argument to the multiplication in the 3rd dimension.
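The broadcasting argument can be illustrated with a NumPy analogue, since NumPy follows the same broadcasting rules for element-wise multiplication. This is a sketch with made-up sizes, not TensorFlow code; it shows why slicing with `5:6` (which keeps a last axis of size 1) is enough to multiply all four channels at once.

```python
import numpy as np

B, N = 2, 3  # made-up sizes
T = np.arange(B * N * N * 6, dtype=float).reshape(B, N, N, 6)

# T[..., 5:6] keeps the last axis (size 1), so it broadcasts against 4 channels
scaled = T[..., 0:4] * T[..., 5:6]                       # (B, N, N, 4)
# Channel 4 is passed through unchanged
result = np.concatenate([scaled, T[..., 4:5]], axis=3)   # (B, N, N, 5)

print(result.shape)
```

Slicing with `T[..., 5]` instead would drop the last axis and give shape (B, N, N), which no longer lines up with the channel axis.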
I need to manipulate a numpy array.
My array has the following format:
x = [1280][720][4]
The array stores image data in the third dimension:
x[0][0] = [Red,Green,Blue,Alpha]
Now I need to manipulate my array into the following form:
x = [1280][720]
x[0][0] = (Red + Green + Blue) / 3
My current code is extremely slow, and I want to use numpy array manipulation to speed it up:
for a in range(1280):
    for b in range(720):
        newx[a][b] = (x[a][b][0] + x[a][b][1] + x[a][b][2]) / 3
x = newx
Also, if possible, I need the code to work for variable array sizes.
Thanks a lot
Use the numpy.mean function:
import numpy as np
n = 1280
m = 720
# Generate a n * m * 4 matrix with random values
x = np.round(np.random.rand(n, m, 4)*10)
# Calculate the mean of the first 3 values along axis 2 (axes are numbered from 0)
xnew = np.mean(x[:, :, 0:3], axis=2)
x[:, :, 0:3] gives you the first 3 values in the 3rd dimension, see: numpy indexing
axis=2 specifies the axis of the array along which the mean value is calculated.
Slice the alpha channel out of the array, and then sum the array along the RGB axis and divide by 3:
x = x[:,:,:-1]
x_sum = x.sum(axis=2)
x_div = x_sum / float(3)
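Both answers work for any image size because they never hard-code the height or width. A minimal sketch that wraps this in a function, assuming the channel order R, G, B, A on the last axis (the name `to_gray` and the sample values are made up for illustration):

```python
import numpy as np

def to_gray(x):
    """Average the first three channels (R, G, B) of an (..., 4) array."""
    # x[..., :3] drops alpha; the mean over the last axis works for any size
    return x[..., :3].mean(axis=-1)

# Made-up RGBA image; any height and width would do
x = np.zeros((3, 2, 4))
x[..., 0] = 30.0    # red
x[..., 1] = 60.0    # green
x[..., 2] = 90.0    # blue
x[..., 3] = 255.0   # alpha, ignored
gray = to_gray(x)

print(gray.shape)
```

If the image is stored as uint8, adding channels by hand (as in the loop in the question) can wrap around at 255, whereas np.mean computes integer inputs in float64 by default.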