adding matrices to n dimensional array with numpy - python

Perhaps a simple question, but I am using numpy and iteratively generating 9x9x9 matrices.
I would like to stack these so I end up with an Nx9x9x9 array, but append and stack seem to merge the matrices along one of the existing dimensions rather than add them as individual objects. Any ideas how I can do this?
Thanks

This can be solved with np.vstack, but to get the shape you want you need to add another dimension (of length 1) as the first axis. Otherwise you would stack along your current first dimension:
import numpy as np
a = np.ones((1,2,2,2))
print(a.shape) # (1, 2, 2, 2)
or if you create your arrays, then add another dimension by:
a = np.ones((2,2,2))
a = a[None, :] # Adds a new first dimension, shape (1, 2, 2, 2)
and then to stack them you could use:
b = np.vstack([a,a])
print(b.shape) # (2, 2, 2, 2)
c = np.vstack([b,a])
print(c.shape) # (3, 2, 2, 2)
c.shape
You said you create them iteratively, but if you only need the final result at the end you don't even need vstack; just create a new array:
a = np.ones((9,9,9))
b = np.ones((9,9,9))
c = np.ones((9,9,9))
d = np.ones((9,9,9))
res = np.array([a, b, c, d])
print(res.shape) # (4, 9, 9, 9)
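For the iterative case in the question, here is a minimal sketch of collecting the blocks in a list and stacking once at the end. The function make_block is a hypothetical stand-in for whatever produces each 9x9x9 array; it is not from the original post.

```python
import numpy as np

def make_block(i):
    # Hypothetical generator: stands in for the real per-iteration computation.
    return np.full((9, 9, 9), float(i))

blocks = []
for i in range(5):
    blocks.append(make_block(i))

# np.stack joins the arrays along a NEW leading axis.
res = np.stack(blocks, axis=0)
print(res.shape)  # (5, 9, 9, 9)

# Equivalent with vstack: give each block a leading length-1 axis first.
res_v = np.vstack([blk[None, ...] for blk in blocks])
print(np.array_equal(res, res_v))  # True
```

Stacking once at the end avoids copying the growing array on every iteration, which is what repeated vstack calls inside the loop would do.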

Related

Index numpy array by other array as indices

I'm trying to take array
a = [1,5,4,5,7,8,9,8,4,13,43,42]
and array
b = [3,5,6,2,7]
And I want b to be the indexes in a, e.g. a new array that is
[a[b[0]], a[b[1]], a[b[2]], a[b[3]] ...]
So the values in b are indexes into a.
And there are 500k entries in a and 500k in b (approximately).
Is there a fast way to kick in all cores in numpy to do this?
I already do it just fine in for loops and it is sloooooooowwwwww.
Edit to clarify. The solution has to work for 2D and 3D arrays.
so maybe
b = [(2,3), (5,4), (1,2), (1,0)]
and we want
c = [a[b[0]], a[b[1]], ...]
Not saying it is fast, but the numpy way (with a and b as numpy arrays) would simply be:
a[b]
outputs:
array([5, 8, 9, 4, 8])
This can be done in NumPy using advanced indexing. As Christian's answer pointed out, in the 1-D case, you would simply write:
a[b]
and that is equivalent to:
[a[b[x]] for x in range(b.shape[0])]
In higher-dimensional cases, however, you need to have separate lists for each dimension of the indices. Which means, you can't do:
a = np.random.randn(7, 8, 9) # 3D array
b = [(2, 3, 0), (5, 4, 1), (1, 2, 2), (1, 0, 3)]
print(a[b]) # this is incorrect
but you can do:
b0, b1, b2 = zip(*b)
print(a[b0, b1, b2])
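In the 1-D case you can also use np.take. Note that without an axis argument it indexes into the flattened array, so it is not a drop-in replacement for the multi-dimensional form above: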
print(np.take(a, b))
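The multi-dimensional case can be sketched as follows, assuming the indices are kept as an array with one row per index triple. The flat-index variant uses np.ravel_multi_index to produce the kind of indices that np.take expects when no axis is given. Variable names here are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((7, 8, 9))  # 3-D data
b = np.array([(2, 3, 0), (5, 4, 1), (1, 2, 2), (1, 0, 3)])  # one row per index triple

# One index array per axis, i.e. a[b[:, 0], b[:, 1], b[:, 2]]
picked = a[tuple(b.T)]

# Same result via flat indices into the flattened array
flat = np.ravel_multi_index(tuple(b.T), a.shape)
picked_flat = np.take(a, flat)

# Reference: the explicit Python loop
expected = np.array([a[i, j, k] for i, j, k in b])
print(np.array_equal(picked, expected), np.array_equal(picked_flat, expected))  # True True
```

Both vectorised forms run in a single NumPy call, so they avoid the per-element Python overhead of the loop; whether that is fast enough for 500k entries would need measuring.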
I solved this by writing a C extension for numpy called Tensor Weighted Interpolative Transfer, in order to get speed and multi-threading. In pure Python, a scale and fade across a 200x100x3 image takes 3 seconds; the multi-threaded C version on 8 cores takes 0.5 milliseconds for the same operation.
The core C code ended up being like
t2[dstidxs2[i2] + doff1] += t1[srcidxs2[i2] + soff1] * w1 * ws2[i2];
Where the doff1 is the offset in the destination array etc. The w1 and ws2 are the interpolated weights.
All the code is ultra optimized in C for speed. (not code size or maintainability)
All code is available on https://github.com/RMKeene/twit and on PyPI.
I expect further optimization in the future, such as special cases when all weights are 1.0.

How to add a new column to a 4d numpy array

I am trying to add a new column to my image dataset.
Sample Code:
import numpy as np
A = np.arange(240).reshape(3,4,4,5)
print(type(A))
print(A.shape)
B = np.concatenate([A, np.ones((A.shape[0],4,4,5,1),dtype=int)], axis=1)
print(B.shape)
Gives error:
ValueError: all the input arrays must have same number of dimensions
Context:
Consider this as m samples of images (nH=height, nW=width, nC=channels).
The dataset has shape (m, nH, nW, nC), and now I want to add an additional column indicating whether the image is a "good" or "bad" example of an object.
Thus, I want to create a dataset with the label added, of shape (m, nH, nW, nC, l), where l stands for the label and can be either 0 or 1.
How can I achieve this? Thanks in advance.
Even simpler, without reshaping:
A = np.random.rand(3, 4, 4, 5)
B = A[None] # Insert a new dimension at the beginning, shape (1, 3, 4, 4, 5)
B = A[:,:,None] # Insert a new dimension in the middle, shape (3, 4, 1, 4, 5)
B = A[:,:,:,:,None] # Insert a new dimension at the end, shape (3, 4, 4, 5, 1)
Basically, the position of None indicates where to add the new dimension.
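The same idea can be written with np.newaxis (an alias for None) or with np.expand_dims, which names the position of the new axis explicitly. A short sketch:

```python
import numpy as np

A = np.random.rand(3, 4, 4, 5)

# np.newaxis is None under another name, so these behave like A[None] etc.
front = A[np.newaxis]        # (1, 3, 4, 4, 5)
back = A[..., np.newaxis]    # (3, 4, 4, 5, 1)

# np.expand_dims takes the position of the new axis as an argument.
middle = np.expand_dims(A, axis=2)  # (3, 4, 1, 4, 5)

print(front.shape, middle.shape, back.shape)
```

The Ellipsis form (A[..., None]) saves counting colons when the array has many dimensions.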
You don't need to add the fifth column explicitly. Just reshape and add the fifth dimension.
import numpy as np
A = np.arange(240).reshape(3,4,4,5,1) # add the fifth dimension here
print(type(A))
print(A.shape)
To set the "good" or "bad" label, just access the last dimension of A
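One possible reading of the question, sketched under the assumption that the label should sit beside the pixel data along a new last axis (so the last dimension has length 2, not 1). The concatenate call from the question works once both inputs have the same number of dimensions:

```python
import numpy as np

A = np.arange(240).reshape(3, 4, 4, 5)

# Both inputs must have the same number of dimensions, so lift A to 5-D first.
data = A[..., None]                               # (3, 4, 4, 5, 1)
labels = np.ones(A.shape + (1,), dtype=A.dtype)   # (3, 4, 4, 5, 1), all "good" (1)

B = np.concatenate([data, labels], axis=-1)       # (3, 4, 4, 5, 2)
B[1, ..., 1] = 0                                  # mark sample 1 as "bad" (0)

print(B.shape)
```

This repeats the label for every pixel of a sample. Keeping the labels in a separate array of shape (m,) is another option that avoids the duplication.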

Create random numpy matrix of same size as another.

This question here was useful, but mine is slightly different.
I am trying to do something simple here. I have a numpy matrix A, and I want to create another numpy matrix B of the same shape as A, but filled with values from numpy.random.randn(). How can this be done? Thanks.
np.random.randn takes the shape of the array as its input which you can get directly from the shape property of the first array. You have to unpack a.shape with the * operator in order to get the proper input for np.random.randn.
a = np.zeros([2, 3])
print(a.shape)
# outputs: (2, 3)
b = np.random.randn(*a.shape)
print(b.shape)
# outputs: (2, 3)
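As an alternative sketch, the functions that take the shape as a single tuple need no unpacking. The seed value below is arbitrary and only there to make the run repeatable.

```python
import numpy as np

a = np.zeros((2, 3))

# Generator.standard_normal takes the shape as one tuple, so no * is needed.
rng = np.random.default_rng(seed=0)
b = rng.standard_normal(a.shape)

# The legacy function np.random.standard_normal accepts a shape tuple as well.
c = np.random.standard_normal(a.shape)

print(b.shape, c.shape)  # (2, 3) (2, 3)
```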

Adding appending numpy arrays

I'm currently trying to append multiple Numpy arrays together. Basically, what I want to do is to start from a (1 x m) matrix (technically a vector), and end up with a (n x m) matrix. So going from n (1 x m) matrices (vectors) to one (n x m) matrix (If that makes any sense). The ultimate goal with this is to write the matrix into a csv-file with the numpy.savetxt() function so I'll end up with a csv-file with n columns of m length.
The problem with this is that numpy.append() appends the vectors together into a (1 x 2m) vector. So let's say a1 and a2 are Numpy arrays with 10000 elements each. I'll append a2 into a1 by using the append function and simultaneously creating a new array called a, which contains both a1 and a2.
a=np.append(a1, a2, axis=0)
a.shape
>>(20000,)
What I want instead is for the shape to be of the form
>>(2, 10000)
or more generally
>>(n, m)
What should I do? Please note, that I want to continue adding the vectors into the array. Thanks for your time!
You can use the transpose of numpy.column_stack.
For example:
import numpy as np
a = np.array([1, 2, 3, 4, 5])
b = np.array([9, 8, 7, 6, 5])
c = np.column_stack((a, b)).T  # stack as columns, then transpose to rows
print(c)
# [[1 2 3 4 5]
#  [9 8 7 6 5]]
print(a.shape, b.shape, c.shape)
# (5,) (5,) (2, 5)
EDIT:
You can keep adding vectors like so:
d = np.array([2, 2, 2, 2, 2])
c = np.column_stack((c.T, d)).T
print(c)
# [[1 2 3 4 5]
#  [9 8 7 6 5]
#  [2 2 2 2 2]]
print(c.shape)
# (3, 5)
This should work
a=np.append(a1, a2, axis=0).reshape(2,10000)
a.shape
>>(2,10000)
In order to merge arrays vertically I would use np.vstack
import numpy as np
np.vstack((a1,a2))
However, in my view a numpy.array shouldn't be built by appending each new array to the old one inside a for loop. Instead, either create the whole (n x m) numpy.array first and write the data into it from the for loop,
data = np.zeros((n, m))
for i in range(n):
    data[i] = ...
or first build an ordinary Python list using append and convert it to a numpy.array at the end.
data = []
for i in range(n):
    data.append(...)
data = np.asarray(data)
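Both approaches can be sketched end to end as below. The function make_row and the sizes n and m are hypothetical placeholders for the real per-iteration vector, and an in-memory buffer stands in for the CSV file so that the example does not touch the disk.

```python
import io
import numpy as np

n, m = 4, 6

def make_row(i):
    # Hypothetical stand-in for the real per-iteration vector of length m.
    return np.arange(m) + 10 * i

# Option 1: preallocate and fill row by row.
data_pre = np.zeros((n, m))
for i in range(n):
    data_pre[i] = make_row(i)

# Option 2: collect in a list and convert once.
rows = []
for i in range(n):
    rows.append(make_row(i))
data_list = np.asarray(rows)

print(data_pre.shape, data_list.shape)  # (4, 6) (4, 6)

# The question wants each vector as a CSV column, so write the transpose.
buf = io.StringIO()  # in-memory buffer used here instead of a real file
np.savetxt(buf, data_list.T, delimiter=",", fmt="%d")
print(buf.getvalue().splitlines()[0])  # 0,10,20,30
```

Preallocation needs n to be known up front; the list version works when the number of vectors is only known once the loop finishes.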

Numpy: stack array by the last dimension

Suppose I have 3 numpy arrays a, b, c, of the same shape, say
a.shape == b.shape == c.shape == (7,9)
Now I'd like to create a 3-dimensional array of size (7,9,3), say x, such that
x[:,:,0] == a
x[:,:,1] == b
x[:,:,2] == c
What is the "pythonic" way of doing it (perhaps in one line)?
Thanks in advance!
There's a function that does exactly that: numpy.dstack ("d" for "depth"). For example:
In [10]: import numpy as np
In [11]: a = np.ones((7, 9))
In [12]: b = a * 2
In [13]: c = a * 3
In [15]: x = np.dstack((a, b, c))
In [16]: x.shape
Out[16]: (7, 9, 3)
In [17]: (x[:, :, 0] == a).all()
Out[17]: True
In [18]: (x[:, :, 1] == b).all()
Out[18]: True
In [19]: (x[:, :, 2] == c).all()
Out[19]: True
TL;DR:
Use numpy.stack, which joins a sequence of arrays along a new axis of your choice.
Although @NPE's answer is very good and covers many cases, there are some scenarios in which numpy.dstack isn't the right choice (I've just found that out while trying to use it). That's because numpy.dstack, according to the docs:
Stacks arrays in sequence depth wise (along third axis).
This is equivalent to concatenation along the third axis after 2-D
arrays of shape (M,N) have been reshaped to (M,N,1) and 1-D arrays of
shape (N,) have been reshaped to (1,N,1).
Let's walk through an example in which this function isn't desirable. Suppose you have a list with 512 numpy arrays of shape (3, 3, 3) and want to stack them in order to get a new array of shape (3, 3, 3, 512). In my case, those 512 arrays were filters of a 2D-convolutional layer. If you use numpy.dstack:
>>> len(arrays_list)
512
>>> arrays_list[0].shape
(3, 3, 3)
>>> numpy.dstack(arrays_list).shape
(3, 3, 1536)
That's because numpy.dstack always stacks the arrays along the third axis! Instead, you should use numpy.stack, which joins a sequence of arrays along a new axis of your choice:
>>> numpy.stack(arrays_list, axis=-1).shape
(3, 3, 3, 512)
In my case, I passed -1 to the axis parameter because I wanted the arrays stacked along the last axis.
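A small self-contained sketch of the difference, using filler arrays in place of the real convolution filters (the values are arbitrary):

```python
import numpy as np

arrays_list = [np.full((3, 3, 3), k) for k in range(512)]

first = np.stack(arrays_list, axis=0)   # new axis in front
last = np.stack(arrays_list, axis=-1)   # new axis at the end
print(first.shape, last.shape)  # (512, 3, 3, 3) (3, 3, 3, 512)

# Each input is recoverable by indexing the new axis.
print(np.array_equal(last[..., 7], arrays_list[7]))  # True

# For 2-D inputs, dstack and stack(axis=-1) agree.
a, b, c = np.ones((7, 9)), 2 * np.ones((7, 9)), 3 * np.ones((7, 9))
print(np.array_equal(np.dstack((a, b, c)), np.stack((a, b, c), axis=-1)))  # True
```

So dstack is fine for the original (7, 9) question, while stack is the safer choice once the inputs already have three or more dimensions.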
