I have a 4D array (Python/NumPy) containing a batch of 10000 images with 5 channels each. Each image is 25*25, i.e. the 4D array shape is 10000*5*25*25.
I need to transpose the images. The naive way is with nested loops:
for i in range(np.shape(img)[0]):
    for j in range(np.shape(img)[1]):
        img[i, j, :, :] = np.transpose(img[i, j, :, :])
but I'm sure that there is a more efficient way to do this. Do you have any idea?
Thanks!
The function numpy.transpose is general enough to handle multi-dimensional arrays. By default it reverses the order of dimensions.
However, it takes an optional axes argument, which explicitly specifies the new order of the dimensions. To swap the last two dimensions of a 4D array (i.e. to transpose a stack of images):
np.transpose(x, [0, 1, 3, 2])
No loops are required; it works on the entire 4D array in one call and is very efficient.
Some more examples:
np.transpose(x, [0, 1, 2, 3]) # leaves the array unchanged
np.transpose(x, [3, 2, 1, 0]) # same as np.transpose(x)
np.transpose(x, [0, 2, 1, 3]) # transpose a stack of images with channel in the last dim
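As a quick sanity check, here is a minimal sketch (with a smaller, made-up batch of the same layout) showing that the single call matches the per-image loop from the question:

import numpy as np

# made-up batch with the same layout as in the question:
# (batch, channels, height, width)
img = np.random.rand(10, 5, 25, 25)

# loop version, writing into a separate output array
looped = np.empty_like(img)
for i in range(img.shape[0]):
    for j in range(img.shape[1]):
        looped[i, j, :, :] = np.transpose(img[i, j, :, :])

# vectorized version: swap the last two axes in one call
swapped = np.transpose(img, [0, 1, 3, 2])

print(np.array_equal(looped, swapped))  # True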
I have a numpy array of shape [batch_size, timesteps_per_samples, width, height], where width and height refer to a 2D grid. The values in this array can be interpreted as an elevation at a certain location that changes over time.
I want to know the elevation over time for various paths within this array. Therefore I have a second array of shape [batch_size, paths_per_batch_sample, timesteps_per_path, coordinates] (coordinates = 2, for x and y in the 2D plane).
The resulting array should be of shape [batch_size, paths_per_batch_sample, timesteps_per_path] containing the elevation over time for each sample within the batch.
The following two examples work. The first one is very slow and just serves to show what I am trying to do. I think the second one does what I want, but I have no idea why it works, nor whether it may fail under certain circumstances.
Code for the problem setup:
import numpy as np
batch_size=32
paths_per_batch_sample=10
timesteps_per_path=4
width=64
height=64
elevation = np.arange(0, batch_size*timesteps_per_path*width*height, 1)
elevation = elevation.reshape(batch_size, timesteps_per_path, width, height)
paths = np.random.randint(0, high=width-1, size=(batch_size, paths_per_batch_sample, timesteps_per_path, 2))
range_batch = range(batch_size)
range_paths = range(paths_per_batch_sample)
range_timesteps = range(timesteps_per_path)
The following code works but is very slow:
elevation_per_time = np.zeros((batch_size, paths_per_batch_sample, timesteps_per_path))
for s in range_batch:
    for k in range_paths:
        for t in range_timesteps:
            x_co, y_co = paths[s, k, t, :].astype(int)
            elevation_per_time[s, k, t] = elevation[s, t, x_co, y_co]
The following code works (and is fast, too), but I can't understand why or how:
elevation_per_time_fast = elevation[
:,
range_timesteps,
paths[:, :, range_timesteps, 0].astype(int),
paths[:, :, range_timesteps, 1].astype(int),
][range_batch, range_batch, :, :]
To verify that the results are equal:
check = (elevation_per_time == elevation_per_time_fast)
print(np.all(check))
Can somebody explain how I can slice an nd-array by multiple other arrays?
In particular, I don't understand how NumPy knows that range_timesteps has to run in lockstep as the index for axes 1, 2 and 3.
Thanks in advance!
Let's take a quick look at slicing a NumPy array first:
a = np.arange(0,9,1).reshape([3,3])
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
NumPy has two ways of indexing an array: full sections with start:stop slices, and selection by index from a list [index1, index2, ...]. The output is still an array with the shape of your slice:
a[0:2,:]
array([[0, 1, 2],
[3, 4, 5]])
a[:,[0,2]]
array([[0, 2],
[3, 5],
[6, 8]])
The second part is that, since the returned array has the same number of dimensions, you can chain any number of such slices, as long as you don't try to access an index outside of the array.
a[:][:][:][:][:][:][:][[0,2]][:,[0,2]]
array([[0, 2],
[6, 8]])
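Coming back to the range_timesteps question: when several axes are given integer index arrays, NumPy broadcasts those index arrays against each other and steps through them together, which is why the time index runs in lockstep across axes. A minimal sketch with a small made-up array:

import numpy as np

b = np.arange(27).reshape(3, 3, 3)

rows = np.array([0, 1, 2])
cols = np.array([0, 1, 2])

# integer index arrays for several axes are broadcast together and
# iterated in lockstep: this selects b[0, 0, :], b[1, 1, :], b[2, 2, :]
print(b[rows, cols])                              # shape (3, 3)

# equivalent explicit loop
print(np.array([b[i, i, :] for i in range(3)]))   # same values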
For example, I have a matrix with shape:
x = np.random.rand(3, 10, 2, 6)
As you can see, there are only two arrays along axis=2.
I have a function that accepts these two arrays:
def f(arr1, arr2):  # arr1 with shape (6,) and arr2 with shape (6,)
    return np.add(arr1, arr2)  # for simplicity
How can I apply this function along axis=2 of the x array in a vectorized way, so that the resulting array has shape [3, 10, dim of output]?
I came across the apply_along_axis routine, but it requires that f accept only a 1D slice.
You can't do it entirely arbitrarily, but your particular case reduces to
x.sum(axis=2)
If you want to add the arrays as in your code:
x[:, :, 0, :] + x[:, :, 1, :]
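A quick check (with a made-up x) that the two expressions agree:

import numpy as np

x = np.random.rand(3, 10, 2, 6)

summed = x.sum(axis=2)                 # collapse the length-2 axis
added = x[:, :, 0, :] + x[:, :, 1, :]  # add the two slices explicitly

print(summed.shape)                    # (3, 10, 6)
print(np.allclose(summed, added))      # True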
Suppose I have two images with dimensions 32x32x3 (number of channels = 3). I want to matrix-multiply them (like the matmul function) over the first and second dimensions, for each of the 3 channels, in TensorFlow, to get a new 32x32x3 image.
Can someone help me with this?
Something like this loop:
# x.shape = (32, 32, 3)
# y.shape = (32, 32, 3)
a = np.zeros((x.shape[-3], x.shape[-2], x.shape[-1]), dtype='float32')
for i in range(a.shape[-1]):
    a[:, :, i] = tf.matmul(x[:, :, i], y[:, :, i])
a = tf.convert_to_tensor(a, dtype=tf.float32)
but I was wondering whether there is a more efficient way to do this?
Actually, I found the answer.
tf.matmul also works for 3D arrays, treating the leading dimension as a batch dimension. The channels therefore need to come first, so we use tf.transpose if the channels are in the last dimension, as follows:
x=tf.transpose(x, perm=[2, 0, 1])
y=tf.transpose(y, perm=[2, 0, 1])
a=tf.matmul(x,y)
a=tf.transpose(a, perm=[1, 2, 0])
It gives the same result as the loop I wrote above.
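For reference, a minimal sketch of that equivalence with random inputs (assuming TensorFlow 2.x eager execution):

import numpy as np
import tensorflow as tf

x = tf.constant(np.random.rand(32, 32, 3), dtype=tf.float32)
y = tf.constant(np.random.rand(32, 32, 3), dtype=tf.float32)

# per-channel loop, as in the question
looped = np.zeros((32, 32, 3), dtype='float32')
for i in range(3):
    looped[:, :, i] = tf.matmul(x[:, :, i], y[:, :, i])

# batched version: move the channel axis to the front, matmul, move it back
batched = tf.transpose(
    tf.matmul(tf.transpose(x, perm=[2, 0, 1]),
              tf.transpose(y, perm=[2, 0, 1])),
    perm=[1, 2, 0])

print(np.allclose(looped, batched.numpy(), atol=1e-4))  # True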
I just asked a basic question earlier on the dot product of two 2D arrays, hoping that I could then figure out how to expand the logic to the following question, where one array is actually 3D and the result is then divided by a 1D vector, with no success... Sorry for the back-to-back questions; numpy looks much more complicated than I thought.
I have the following 3 arrays
a = np.array([[1, 2, 3], [3, 4, 5]])
b = np.array([[[1, 0, 1], [1, 1, 0], [0, 1, 1]], [[1, 1, 1], [0, 1, 0], [0, 0, 1]]])
c = np.array([1, 2, 3])
I'd like to do the following computation. I am just going to type out the formula directly as I am not sure how to properly describe it in words (if someone can enlighten me on this as well I would be interested to know!)...
[[(1*1 + 0*2 + 1*3)/1, (1*1 + 1*2 + 0*3)/2, (0*1 + 1*2 + 1*3)/3],
[(1*3 + 1*4 + 1*5)/1, (0*3 + 1*4 + 0*5)/2, (0*3 + 0*4 + 1*5)/3]]
>>> [[4, 1.5, 1.67], [12, 2, 1.67]]
It probably has to do with how to use the axis argument, but I can't quite figure it out yet... thanks much again!
Correct answer: np.sum((a[:,None,:] * b), axis=2) / c
Process, step by step: a[:,None,:] is broadcast against b and multiplied elementwise, the products are summed over axis=2, and the result is divided by c.
Additional comments
a[:,None,:] inserts an extra dimension of size 1 in the middle: a.shape is (2, 3), while a[:,None,:].shape is (2, 1, 3). a[None,:,:] and a[:,:,None] do the same thing at other positions (shapes (1, 2, 3) and (2, 3, 1) respectively); you can check this with the shape attribute.
One approach to understanding this problem is visual: picture each array as a cuboid whose three dimensions correspond to axis=0, axis=1 and axis=2. An array of shape (2, 1, 3) would then be a cuboid of height 2, width 1 and length 3. Seen this way, you can picture how a[:,None,:] and b are broadcast into the new array a[:,None,:] * b.
There's also a symbolic (formal) way to see it. Only arrays with compatible shapes can be broadcast together. So arrays of shapes (2, 3) and (2, 3, 3) cannot be broadcast, unlike arrays of shapes (2, 1, 3) and (2, 3, 3). Arrays of shapes (4, 3) and (3,), on the other hand, can be broadcast. In other words, there are precise mathematical rules that define whether two arrays are broadcastable.
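A minimal sketch of those steps with the arrays from the question, tracking the shapes along the way:

import numpy as np

a = np.array([[1, 2, 3], [3, 4, 5]])                # shape (2, 3)
b = np.array([[[1, 0, 1], [1, 1, 0], [0, 1, 1]],
              [[1, 1, 1], [0, 1, 0], [0, 0, 1]]])   # shape (2, 3, 3)
c = np.array([1, 2, 3])                             # shape (3,)

a_exp = a[:, None, :]        # shape (2, 1, 3): length-1 axis to be stretched
prod = a_exp * b             # broadcast to shape (2, 3, 3)
summed = prod.sum(axis=2)    # shape (2, 3): the row-by-row dot products
result = summed / c          # each column divided by the matching entry of c

print(result)
# [[ 4.          1.5         1.66666667]
#  [12.          2.          1.66666667]]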
That may not be exactly what you were looking for (as it builds the 3D array from a Python list, which is not very efficient), but it seems to work:
d = np.array([[row]*3 for row in a])  # reshape a into a 3D array
print((d*b).sum(axis=2)/c)
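A quick check that this list-based version matches the broadcasting answer above (self-contained, with the same a, b and c as in the question):

import numpy as np

a = np.array([[1, 2, 3], [3, 4, 5]])
b = np.array([[[1, 0, 1], [1, 1, 0], [0, 1, 1]],
              [[1, 1, 1], [0, 1, 0], [0, 0, 1]]])
c = np.array([1, 2, 3])

d = np.array([[row] * 3 for row in a])           # shape (2, 3, 3)
list_based = (d * b).sum(axis=2) / c
broadcast_based = np.sum(a[:, None, :] * b, axis=2) / c

print(np.allclose(list_based, broadcast_based))  # True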
I've read the NumPy doc on slicing (especially the bottom, where it discusses variable array indexing):
https://docs.scipy.org/doc/numpy/user/basics.indexing.html
But I'm still not sure how I could do the following: write a method that returns either a 3D set of indices or a 4D set of indices, which are then used to access an array. I want to write a method for a base class, but the classes that derive from it access either 3D or 4D arrays, depending on which derived class is instantiated.
Example code to illustrate the idea:
import numpy as np
a = np.ones([2,2,2,2])
size = np.shape(a)
print(size)
for i in range(size[0]):
    for j in range(size[1]):
        for k in range(size[2]):
            for p in range(size[3]):
                a[i, j, k, p] = i*size[1]*size[2]*size[3] + j*size[2]*size[3] + k*size[3] + p
print(a)
print('compare')
indices = (0, :, 0, 0)  # invalid syntax; this is what I would like to be able to write
print(a[0,:,0,0])
print(a[indices])
In short, I'm trying to get a tuple (or something) that can be used to make both of the following accesses, depending on how I fill the tuple:
a[i, 0, :, 1]
a[i, :, 1]
The slice method looked promising, but it seems to require a range, and I just want a ":" i.e. the whole dimension. What options are out there for variable numpy array dimension access?
In [324]: a = np.arange(8).reshape(2,2,2)
In [325]: a
Out[325]:
array([[[0, 1],
[2, 3]],
[[4, 5],
[6, 7]]])
slicing:
In [326]: a[0,:,0]
Out[326]: array([0, 2])
In [327]: idx = (0,slice(None),0) # interpreter converts : into slice object
In [328]: a[idx]
Out[328]: array([0, 2])
In [331]: idx
Out[331]: (0, slice(None, None, None), 0)
In [332]: np.s_[0,:,0] # indexing trick to generate same
Out[332]: (0, slice(None, None, None), 0)
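Applied to the 3D-vs-4D base-class question: an index tuple built from integers and slice(None) works on either array. A small sketch, where make_index is a made-up helper name:

import numpy as np

def make_index(i, last, ndim):
    # hypothetical helper: build a[i, :, last] for 3D or a[i, 0, :, last] for 4D
    if ndim == 3:
        return (i, slice(None), last)
    return (i, 0, slice(None), last)

a3 = np.arange(8).reshape(2, 2, 2)
a4 = np.arange(16).reshape(2, 2, 2, 2)

print(a3[make_index(0, 1, a3.ndim)])   # same as a3[0, :, 1]
print(a4[make_index(0, 1, a4.ndim)])   # same as a4[0, 0, :, 1]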
Your code appears to work the way you want using :. The reason the two examples
(a[i, 0, :, 7], a[i, :, 7])
don't work is that the index 7 is out of range for the array. If you change the 7 to something in range, like 1, then a value is returned, which I believe is what you are looking for.