I have a list of numpy.ndarrays, each of shape (33,1,8,45,3).
The problem is that when I concatenate the list using a = np.concatenate(list),
the output shape of a becomes
print(a.shape)
(726,1,8,45,3)
instead of (22,33,1,8,45,3).
How do I cleanly concatenate the list without having to change the input?
You can use numpy.array() or numpy.stack():
import numpy
a = [numpy.random.rand(33,1,8,45,3) for i in range(22)]
b = numpy.array(a)
b.shape # (22, 33, 1, 8, 45, 3)
c = numpy.stack(a, axis=0)
c.shape # (22, 33, 1, 8, 45, 3)
np.concatenate:
Join a sequence of arrays along an existing axis.
np.stack:
Stack a sequence of arrays along a new axis.
import numpy as np
a = np.ones((3, 4))
b = np.stack([a, a])
print(b.shape) # (2, 3, 4)
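To make the difference concrete, here is a small sketch (the array sizes are arbitrary and chosen only for illustration) showing that np.stack gives the same result as adding a leading axis to each array and then concatenating:

```python
import numpy as np

# Three small arrays standing in for the 22 arrays of shape (33, 1, 8, 45, 3)
arrays = [np.full((4, 2), i) for i in range(3)]

joined = np.concatenate(arrays)      # existing axis 0 grows: 3 * 4 = 12
stacked = np.stack(arrays, axis=0)   # new leading axis of length 3

print(joined.shape)   # (12, 2)
print(stacked.shape)  # (3, 4, 2)

# stack is equivalent to giving each array a new axis, then concatenating
manual = np.concatenate([arr[None, ...] for arr in arrays], axis=0)
print(np.array_equal(stacked, manual))  # True
```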
How do you reshape a (55, 11) numpy array to a (55, 11, 1) numpy array?
Attempts:
Simply doing numpy_array.reshape(-1, 1) without any loop produces a (605, 1) column array, not a 3D one.
The following for loop produces a "cannot broadcast" error:
for i in range(len(numpy_array)):
    numpy_array[i] = numpy_array[i].reshape(-1, 1)
Maybe you are looking for numpy.expand_dims (https://numpy.org/doc/stable/reference/generated/numpy.expand_dims.html)?
import numpy
a = numpy.random.rand(55,11)
print(a.shape) # (55, 11)
print(numpy.expand_dims(a, 2).shape) # (55, 11, 1)
Add a new axis to the array:
my_array = np.arange(55*11).reshape(55,11)
my_array.shape
# (55, 11)
# add new axis
new_array = my_array[...,None]
new_array.shape
# (55, 11, 1)
You can also specify the new shape in reshape:
new_array = my_array.reshape(*my_array.shape, 1)
new_array.shape
# (55, 11, 1)
One of the answers recommends using expand_dims. That's a good answer, but if you look at its code, and strip off some generalities, all it is doing is:
In [409]: a = np.ones((2,3)); axis=(2,)
...: out_ndim = 2+1
...: shape_it = iter(a.shape)
...: shape = [1 if ax in axis else next(shape_it) for ax in range(out_ndim)]
In [410]: shape
Out[410]: [2, 3, 1]
followed by a return a.reshape(shape).
In other words, the function call is just hiding the obvious: expanding an (x,y) array to (x,y,1) with
a.reshape(x,y,1)
Are you seeking some 3d 'magic' akin to the -1 in numpy_array.reshape(-1, 1)?
Personally I like to use None to add dimensions, so prefer the other answer [...,None]. But functionally it's all the same.
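As a quick check of that claim, the sketch below (variable names are illustrative) builds the (55, 11, 1) array four ways and confirms they agree. It also shows that -1 still works as a wildcard inside a 3D reshape.

```python
import numpy as np

a = np.arange(55 * 11).reshape(55, 11)

via_expand = np.expand_dims(a, 2)
via_none = a[..., None]
via_reshape = a.reshape(*a.shape, 1)
via_wildcard = a.reshape(-1, a.shape[1], 1)   # -1 is inferred as 55

for result in (via_expand, via_none, via_reshape, via_wildcard):
    assert result.shape == (55, 11, 1)
    assert np.array_equal(result, via_expand)

# For this contiguous array nothing is copied; the result is a view onto a
print(np.shares_memory(a, via_none))  # True
```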
I have 6 tensors of shape (batch_size, S, S, 1) and I want to combine them in one python list of size (batch_size, S*S, 6) - so every element of tensor should be inside the inner list.
Can this be achieved without using loops? What's the efficient way to solve it?
Let batch_size=10 and S=4 for the purpose of this example:
>>> x = [torch.rand(10, 4, 4, 1) for _ in range(6)]
Indeed, the first step is to concatenate the tensors along the last dimension (axis=3):
>>> y = torch.cat(x, -1)
>>> y.shape
torch.Size([10, 4, 4, 6])
Then flatten axis=1 and axis=2 into a single axis. torch.flatten works here because the two axes are adjacent:
>>> y = torch.cat(x, -1).flatten(1, 2)
>>> y.shape
torch.Size([10, 16, 6])
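To see where each element ends up, here is a NumPy mirror of the same two steps. NumPy is used only as a stand-in so the check runs without PyTorch: np.concatenate and reshape play the roles of torch.cat and flatten. It confirms that y[b, i*S + j, k] holds element [b, i, j, 0] of the k-th input.

```python
import numpy as np

batch_size, S = 10, 4
rng = np.random.default_rng(0)
x = [rng.random((batch_size, S, S, 1)) for _ in range(6)]

y = np.concatenate(x, axis=-1)        # (10, 4, 4, 6)
y = y.reshape(batch_size, S * S, 6)   # merge axes 1 and 2 -> (10, 16, 6)
print(y.shape)  # (10, 16, 6)

# Spot-check the index mapping for one arbitrary element
b, i, j, k = 3, 2, 1, 5
print(y[b, i * S + j, k] == x[k][b, i, j, 0])  # True
```

If a plain Python list is really required, calling .tolist() on the result converts it; torch tensors offer the same method.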
I just want to get the dot product of some sets of multidimensional data.
For simplicity, I am posting small pieces and demonstrating my efforts thus far.
To just get 'a' dot 'q', and the 4 numbers that I want is easy enough.
import numpy as np
a = np.arange(1,4) # shape = (3,)
q = np.array([[x, x, x] for x in range(4)])+1 # shape = (4, 3)
c = np.dot(a, q.T) # array([ 6, 12, 18, 24]) shape = (4,)
If I want to add another set to 'a', I can expand the dimensions. Again, pretty easy. The dot product simply reflects the additional dimension.
a = np.arange(1,4).reshape(1,3) # shape = (1,3)
c = np.dot(a, q.T) # array([[ 6, 12, 18, 24]]) shape = (1,4)
and the other set...
a = np.vstack((a,a+1)) # shape = (2,3)
c = np.dot(a, q.T) # array([[ 6, 12, 18, 24], [ 9, 18, 27, 36]]) shape = (2,4)
To add another dimension to q, the transpose needs to be a little more complicated.
q = np.expand_dims(q, axis=0) # shape = (1, 4, 3)
c = np.dot(a, np.transpose(q, (0, 2, 1))) # shape = (2, 1, 4)
now stack 'q' matrix
q = np.vstack((q, q+1)) # shape = (2, 4, 3)
c = np.dot(a, np.transpose(q, (0, 2, 1))) # shape = (2, 2, 4)
Though, what I am going for is the diagonal of c. While I have not tried it yet, I am imagining that when 'a' and 'q' start to reach >(2000, 3) and >(2000, 4, 3) c will be (2000, 2000, 4) and I only need 1/2000th of that. Does anyone know how to make this more efficient than doing the calculation and then taking the diagonal?
Again, what I want is...
c = np.dot(a, np.transpose(q, (0, 2, 1)))
c = c[np.arange(2), np.arange(2)]
or
c[0] = np.dot(a[0:1], np.transpose(q[0:1], (0, 2, 1)))
c[1] = np.dot(a[1:2], np.transpose(q[1:2], (0, 2, 1)))
but without having to make the enormous matrix first and then trim it later.
I have read a couple of other, somewhat similar questions, though I hope this question is perceived to be more complicated than a dot product of the same vector and its diagonal. Also, if the answer is np.einsum(), could you explain the process a bit more than the numpy docs do?
I reposted the question, with the einsum() entries at each c. In fact, Alexander Korovin linked to an excellent einsum summary.
I just want to get the dot product of some sets of multidimensional data.
For simplicity, I am posting small pieces and demonstrating my efforts thus far.
To just get 'a' dot 'q', and the 4 numbers that I want is easy enough.
import numpy as np
a = np.arange(1,4) # shape = (3,)
q = np.array([[x, x, x] for x in range(4)])+1 # shape = (4, 3)
c = np.dot(a, q.T) # array([ 6, 12, 18, 24]) shape = (4,)
c = np.einsum('i,ji->j', a, q)
If I want to add another set to 'a', I can expand the dimensions. Again, pretty easy. The dot product simply reflects the additional dimension.
a = np.arange(1,4).reshape(1,3) # shape = (1,3)
c = np.dot(a, q.T) # array([[ 6, 12, 18, 24]]) shape = (1,4)
c = np.einsum('ij,gj->ig', a, q)
and the other set...
a = np.vstack((a,a+1)) # shape = (2,3)
c = np.dot(a, q.T) # array([[ 6, 12, 18, 24], [ 9, 18, 27, 36]]) shape = (2,4)
c = np.einsum('ij,gj->ig', a, q)
To add another dimension to q, the transpose needs to be a little more complicated.
q = np.expand_dims(q, axis=0) # shape = (1, 4, 3)
c = np.dot(a, np.transpose(q, (0, 2, 1))) # shape = (2, 1, 4)
c = np.einsum('ij,fgj->ifg', a, q)
now stack 'q' matrix
q = np.vstack((q, q+1)) # shape = (2, 4, 3)
c = np.dot(a, np.transpose(q, (0, 2, 1))) # shape = (2, 2, 4)
c = np.einsum('ij,fgj->ifg', a, q)
Though, what I am going for is the diagonal of c. While I have not tried it yet, I am imagining that when 'a' and 'q' start to reach >(2000, 3) and >(2000, 4, 3) c will be (2000, 2000, 4) and I only need 1/2000th of that. Does anyone know how to make this more efficient than doing the calculation and then taking the diagonal?
Again, what I want is...
c = np.dot(a, np.transpose(q, (0, 2, 1)))
c = c[np.arange(2), np.arange(2)]
or
c[0] = np.dot(a[0:1], np.transpose(q[0:1], (0, 2, 1)))
c[1] = np.dot(a[1:2], np.transpose(q[1:2], (0, 2, 1)))
but without having to make the enormous matrix first and then trim it later.
So do this...
c = np.einsum('ik,ijk->ij', a, q)
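To unpack the subscripts: in 'ik,ijk->ij', i labels the shared leading axis of a and q, k labels the length-3 axis that is multiplied and summed because it is absent from the output, and j labels the remaining axis of q. The sketch below rebuilds the small arrays from the question and compares the einsum call with explicit loops and with the full-dot-then-diagonal route:

```python
import numpy as np

a = np.vstack((np.arange(1, 4), np.arange(2, 5)))    # shape (2, 3)
q0 = np.array([[x, x, x] for x in range(4)]) + 1      # shape (4, 3)
q = np.stack((q0, q0 + 1))                            # shape (2, 4, 3)

c = np.einsum('ik,ijk->ij', a, q)                     # shape (2, 4)

# What the subscripts mean, written out as loops
loops = np.zeros((2, 4), dtype=a.dtype)
for i in range(2):          # i is in both inputs and the output: paired, not crossed
    for j in range(4):      # j is only in q and the output
        for k in range(3):  # k is missing from the output, so it is summed over
            loops[i, j] += a[i, k] * q[i, j, k]

# The wasteful route: full (2, 2, 4) product, then keep only the diagonal
full = np.dot(a, np.transpose(q, (0, 2, 1)))
diag = full[np.arange(2), np.arange(2)]

print(np.array_equal(c, loops), np.array_equal(c, diag))  # True True
```

Because i is never crossed with itself, the (2000, 2000, 4) intermediate from the question is never built.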
My program creates a numpy array within a for loop. For example, it creates an array with shape (100, 30, 10), then (160, 30, 10), and then maybe (120, 30, 10). I have to append these to an empty numpy array so that, at the end of the loop, I have a numpy array with shape (380, 30, 10) (i.e. the sum 100+160+120). The second and third dimensions don't change.
How can I do this in Python? I tried the following.
np_model = np.append(np_model,np_temp1)
print("Appended model shape is",np_model.shape)
np_label = np.append(np_label,np_temp2)
print("Appended label shape is",np_label.shape)
The np_model is an empty array which I have defined as np_model = np.empty(1,30,10) and np_label as np_label = np.empty(1 ,str)
np_temp1 corresponds to the array created in each loop iteration, e.g. (100, 30, 10), (120, 30, 10), etc., and np_temp2 is a string such as "item1", "item2", etc.
np_label is a string numpy array with 1 label corresponding to np_temp1.shape[0]. But the result I get in np_model is a flattened array of size 380*30*10 = 114000.
Any help is appreciated.
You can use numpy's concatenate function: append each output array to a list, then pass the list to concatenate:
empty_list = []
x = np.zeros([10, 20, 4])
y = np.zeros([12, 20, 4])
empty_list.append(x)
empty_list.append(y)
z = np.concatenate(empty_list, axis=0)
print(x.shape, y.shape, z.shape)
(10, 20, 4) (12, 20, 4) (22, 20, 4)
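Applied to the loop in the question, the same pattern might look like the sketch below. The chunk sizes and the labels "item1", "item2", ... come from the question. make_chunk is a hypothetical stand-in for whatever produces np_temp1, and repeating each label once per row is an assumption about how np_label should line up with np_model.

```python
import numpy as np

def make_chunk(n_rows):
    # Hypothetical stand-in for the real per-iteration computation
    return np.zeros((n_rows, 30, 10))

model_chunks, label_chunks = [], []
for idx, n_rows in enumerate((100, 160, 120), start=1):
    np_temp1 = make_chunk(n_rows)
    np_temp2 = f"item{idx}"
    model_chunks.append(np_temp1)
    # Assumption: one copy of the label per row of the chunk
    label_chunks.append(np.full(n_rows, np_temp2))

# Join once, after the loop, along the first axis
np_model = np.concatenate(model_chunks, axis=0)
np_label = np.concatenate(label_chunks)

print(np_model.shape)  # (380, 30, 10)
print(np_label.shape)  # (380,)
```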
As @Nullman suggested in a comment, you can use np.vstack.
You can create an empty array like this:
>>> np_model = np.empty((0,30,10))
>>> a = np.random.rand(100,30,10)
>>> b = np.random.rand(160,30,10)
>>> c = np.random.rand(120,30,10)
# This can be done in one line: np_model = np.vstack((a, b, c))
# but I guess you have a loop dependency here
>>> np_model = np.vstack((np_model,a))
>>> np_model = np.vstack((np_model,b))
>>> np_model = np.vstack((np_model,c))
>>> np_model.shape
(380, 30, 10)
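As for why the attempt in the question flattens everything: np.append without an axis argument flattens both inputs before joining them. A minimal sketch (shapes taken from the question) of the difference that axis=0 makes:

```python
import numpy as np

np_model = np.empty((0, 30, 10))
np_temp1 = np.zeros((100, 30, 10))

flat = np.append(np_model, np_temp1)          # no axis: inputs are flattened first
kept = np.append(np_model, np_temp1, axis=0)  # axis=0: joined along the first axis

print(flat.shape)  # (30000,)
print(kept.shape)  # (100, 30, 10)
```

Either way, each call copies the whole array, so for long loops it is usually cheaper to collect the pieces in a list and join them once.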
To specifically answer your question towards starting with an empty array, that'd be my solution, solely using np.concatenate:
import numpy as np
# Some arrays to append in a loop
arrays = (
    np.random.rand(100, 30, 10),
    np.random.rand(160, 30, 10),
    np.random.rand(120, 30, 10)
)
# Initial empty array
array = np.zeros((0, 30, 10))
# Appending arrays in loop
for a in arrays:
    array = np.concatenate((array, a), axis=0)
# Output shape
print(array.shape)
Output:
(380, 30, 10)
Hope that helps!
----------------------------------------
System information
----------------------------------------
Platform: Windows-10-10.0.16299-SP0
Python: 3.8.1
NumPy: 1.18.1
----------------------------------------
Suppose I have a 5x10x3 array, which I interpret as 5 'sub-arrays', each consisting of 10 rows and 3 columns. I also have a separate 1D array of length 5, which I call b.
I am trying to insert a new column into each sub-array, where the column inserted into the ith (i=0,1,2,3,4) sub-array is a 10x1 vector where each element is equal to b[i].
For example:
import numpy as np
np.random.seed(777)
A = np.random.rand(5,10,3)
b = np.array([2,4,6,8,10])
A[0] should look like:
A[1] should look like:
And similarly for the other 'sub-arrays'.
(Notice b[0]=2 and b[1]=4)
What about this?
# Make an array B with the same number of dimensions as A
B = np.tile(b, (1, 10, 1)).transpose(2, 1, 0) # shape: (5, 10, 1)
# Concatenate both
np.concatenate([A, B], axis=-1) # shape: (5, 10, 4)
One method would be np.pad:
np.pad(A, ((0,0),(0,0),(0,1)), 'constant', constant_values=[[[],[]],[[],[]],[[],b[:, None,None]]])
# array([[[9.36513084e-01, 5.33199169e-01, 1.66763960e-02, 2.00000000e+00],
# [9.79060284e-02, 2.17614285e-02, 4.72452812e-01, 2.00000000e+00],
# etc.
Or (more typing but probably faster):
i,j,k = A.shape
res = np.empty((i,j,k+1), np.result_type(A, b))
res[...,:-1] = A
res[...,-1] = b[:, None]
Or dstack after broadcast_to:
np.dstack([A,np.broadcast_to(b[:,None],A.shape[:2])])
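A quick sanity check, using the A and b from the question, that the tile, preallocate, and dstack variants all produce the same (5, 10, 4) array, with b[i] filling the last column of the i-th sub-array:

```python
import numpy as np

np.random.seed(777)
A = np.random.rand(5, 10, 3)
b = np.array([2, 4, 6, 8, 10])

# Variant 1: tile b into a (5, 10, 1) block and concatenate
B = np.tile(b, (1, 10, 1)).transpose(2, 1, 0)
res_tile = np.concatenate([A, B], axis=-1)

# Variant 2: preallocate the result and fill it in
i, j, k = A.shape
res_fill = np.empty((i, j, k + 1), np.result_type(A, b))
res_fill[..., :-1] = A
res_fill[..., -1] = b[:, None]

# Variant 3: dstack with a broadcast view of b
res_dstack = np.dstack([A, np.broadcast_to(b[:, None], A.shape[:2])])

print(res_tile.shape)  # (5, 10, 4)
print(np.array_equal(res_tile, res_fill), np.array_equal(res_fill, res_dstack))  # True True
print(res_fill[1, :, -1])  # ten copies of 4.0
```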