Want to append 2 2d arrays in numpy - python

I'm trying to append 2 2d numpy arrays
a = np.array([[1],
[2],
[3],
[4]])
b = np.array([[ 4, 5, 6],
[ 7, 8, 9],
[10, 11, 12]])
target is
c = ([[1],
[2],
[3],
[4],
[ 4, 5, 6],
[ 7, 8, 9],
[10, 11, 12]])
Tried np.concatenate((a,b),axis=0) and np.concatenate((a,b),axis=1)
and get
ValueError: all the input array dimensions except for the concatenation axis must match exactly
and np.append(a,b)
But nothing seems to work. If I convert to list it gives me the result I want but seems inefficient
c = a.tolist() + b.tolist()
Is there a numpy way to do this?

As the error indicate, the dimensions have to match.
So you could resize a so that it matches the dimension of b and then concatenate (the empty cells are filled with zeros).
a.resize(3,4)
a = a.transpose()
np.concatenate((a,b))
array([[ 1, 0, 0],
[ 2, 0, 0],
[ 3, 0, 0],
[ 4, 0, 0],
[ 4, 5, 6],
[ 7, 8, 9],
[10, 11, 12]])

Short answer - no. Numpy arrays need to be 'rectangular'; similar to matrices in linear algebra. You can follow the suggestion here and force it in (at the loss of a lot of functionality) or, if you really need the target, use a data structure like a list which is designed to cope with it.

Related

Numpy array indexing with another matrix

I looked into other posts related to indexing numpy array with another numpy array, but still could not wrap my head around to accomplish the following:
a = [[[1,2,3],[4,5,6]],[[7,8,9],[10,11,12]]],
b = [[[1,0],[0,1]],[[1,1],[0,1]]]
a[b] = [[[7,8,9],[4,5,6]],[[10,11,12],[4,5,6]]]
a is an image represented by 3D numpy array, with dimension 2 * 2 * 3 with RGB values for the last dimension. b contains the index that will match to the image. For instance for pixel index (0,0), it should map to index (1,0) of the original image, which should give pixel values [7,8,9]. I wonder if there's a way to achieve this. Thanks!
Here's one way:
In [54]: a = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
In [55]: b = np.array([[[1, 0], [0, 1]], [[1, 1], [0, 1]]])
In [56]: a[b[:, :, 0], b[:, :, 1]]
Out[56]:
array([[[ 7, 8, 9],
[ 4, 5, 6]],
[[10, 11, 12],
[ 4, 5, 6]]])

How to index row elements of a Matrix with a Matrix of indices for each row?

I have a Matrix of indices I e.g.
I = np.array([[1, 0, 2], [2, 1, 0]])
The index at i-th row selects an element from another Matrix M in the i-th row.
So having M e.g.
M = np.array([[6, 7, 8], [9, 10, 11])
M[I] should select:
[[7, 6, 8], [11, 10, 9]]
I could have:
I1 = np.repeat(np.arange(0, I.shape[0]), I.shape[1])
I2 = np.ravel(I)
Result = M[I1, I2].reshape(I.shape)
but this looks very complicated and I am looking for a more elegant solution. Preferably without flattening and reshaping.
In the example I used numpy, but I am actually using jax. So if there is a more efficient solution in jax, feel free to share.
In [108]: I = np.array([[1, 0, 2], [2, 1, 0]])
...: M = np.array([[6, 7, 8], [9, 10, 11]])
...:
...: I,M
I had to add a ']' to M.
Out[108]:
(array([[1, 0, 2],
[2, 1, 0]]),
array([[ 6, 7, 8],
[ 9, 10, 11]]))
Advanced indexing with broadcasting:
In [110]: M[np.arange(2)[:,None],I]
Out[110]:
array([[ 7, 6, 8],
[11, 10, 9]])
THe first index has shape (2,1) which pairs with the (2,3) shape of I to select a (2,3) block of values.
How about this one line code? The idea is to enumerate both the rows and the row indices of the matrix, so you can access the corresponding rows in the indexing matrix.
import numpy as np
I = np.array([[1, 0, 2], [2, 1, 0]])
M = np.array([[6, 7, 8], [9, 10, 11]])
Result = np.array([row[I[i]] for i, row in enumerate(M)])
print(Result)
Output:
[[ 7 6 8]
[11 10 9]]
np.take_along_axis can also be used here to take values of M using indices I over axis=1:
>>> np.take_along_axis(M, I, axis=1)
array([[ 7, 6, 8],
[11, 10, 9]])

Sum each row of a numpy array with all rows of second numpy array (python)

I would like to know if there is any fast way to sum each row of a first array with all rows of a second array. In this case both arrays have the same number of colulmns. For instance if array1.shape = (n,c) and array2.shape = (m,c), the resulting array would be an array3.shape = ((n*m), c)
Look at the example below:
array1 = np.array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
array2 = np.array([[0, 1, 2],
[3, 4, 5]])
The result would be:
array3 = np.array([[0, 2, 4],
[3, 5, 7]
[3, 5, 7]
[6, 8, 10]
[6, 8, 10]
[9, 11, 13]])
The only way I see I can do this is to repeat each row of one of the arrays the number of rows of the other array. For instance, by doing np.repeat(array1, len(array2), axis=0) and then sum this array with array2. This is not very practical however if the number of rows is too big. The other way would be with a for loop but this is too slow.
Any other better way to do it..?
Thanks in advance.
Extend array1 to 3D so that it becomes broadcastable against 2D array2 and then perform broadcasted addition and a final reshape is needed for desired output -
In [30]: (array1[:,None,:] + array2).reshape(-1,array1.shape[1])
Out[30]:
array([[ 0, 2, 4],
[ 3, 5, 7],
[ 3, 5, 7],
[ 6, 8, 10],
[ 6, 8, 10],
[ 9, 11, 13]])
You could try the following inline code if you haven't already. This is the simplest and probably also the quickest on a single thread.
>>> import numpy as np
>>> array1 = np.array([[0, 1, 2],
... [3, 4, 5],
... [6, 7, 8]])
>>>
>>> array2 = np.array([[0, 1, 2],
... [3, 4, 5]])
>>> array3 = np.array([i+j for i in array1 for j in array2])
>>> array3
array([[ 0, 2, 4],
[ 3, 5, 7],
[ 3, 5, 7],
[ 6, 8, 10],
[ 6, 8, 10],
[ 9, 11, 13]])
>>>
If you are looking for speed up by treading, you could consider using CUDA or multithreading. This suggestion goes a bit out of scope of your question but gives you an idea of what can be done to speed up matrix operations.

How can I isolate rows in a 2d numpy matrix that match a specific criteria?

How can I isolate rows in a 2d numpy matrix that match a specific criteria? For example if I have some data and I only want to look at the rows where the 0 index has a value of 5 or less how would I retrieve those values?
I tried this approach:
import numpy as np
data = np.matrix([
[10, 8, 2],
[1, 4, 5],
[6, 5, 7],
[2, 2, 10]])
#My attempt to retrieve all rows where index 0 is less than 5
small_data = (data[:, 0] < 5)
The output is:
matrix([
[False],
[ True],
[False],
[ True]], dtype=bool)
However I'd like the output to be:
[[1, 4, 5],
[2, 2, 10]]
Another approach may be for me to loop through the matrix rows and if the 0 index is smaller than 5 append the row to a list but I am hoping there is a better way than that.
Note: I'm using Python 2.7.
First: Don't use np.matrix, use normal np.arrays.
import numpy as np
data = np.array([[10, 8, 2],
[1, 4, 5],
[6, 5, 7],
[2, 2, 10]])
Then you can always use boolean indexing (based on the boolean array you get when you do comparisons) to get the desired rows:
>>> data[data[:, 0] < 5]
array([[ 1, 4, 5],
[ 2, 2, 10]])
or integer array indexing:
>>> data[np.where(data[:, 0] < 5)]
array([[ 1, 4, 5],
[ 2, 2, 10]])
That way you got a logical array that you can use to select the desired rows.
>>> data = np.matrix([
[10, 8, 2],
[1, 4, 5],
[6, 5, 7],
[2, 2, 10]])
>>> data = np.array(data)
>>> data[(data[:, 0] < 5), :]
array([[ 1, 4, 5],
[ 2, 2, 10]])
You can also use np.squeeze to filter the rows.
>>> ind = np.squeeze(np.asarray(data [:,0]))<5
>>> data[ind,:]
array([[ 1, 4, 5],
[ 2, 2, 10]])
use the following code.
data[data[small_data,:]]
That would work

Selecting specific rows and columns from NumPy array

I've been going crazy trying to figure out what stupid thing I'm doing wrong here.
I'm using NumPy, and I have specific row indices and specific column indices that I want to select from. Here's the gist of my problem:
import numpy as np
a = np.arange(20).reshape((5,4))
# array([[ 0, 1, 2, 3],
# [ 4, 5, 6, 7],
# [ 8, 9, 10, 11],
# [12, 13, 14, 15],
# [16, 17, 18, 19]])
# If I select certain rows, it works
print a[[0, 1, 3], :]
# array([[ 0, 1, 2, 3],
# [ 4, 5, 6, 7],
# [12, 13, 14, 15]])
# If I select certain rows and a single column, it works
print a[[0, 1, 3], 2]
# array([ 2, 6, 14])
# But if I select certain rows AND certain columns, it fails
print a[[0,1,3], [0,2]]
# Traceback (most recent call last):
# File "<stdin>", line 1, in <module>
# ValueError: shape mismatch: objects cannot be broadcast to a single shape
Why is this happening? Surely I should be able to select the 1st, 2nd, and 4th rows, and 1st and 3rd columns? The result I'm expecting is:
a[[0,1,3], [0,2]] => [[0, 2],
[4, 6],
[12, 14]]
As Toan suggests, a simple hack would be to just select the rows first, and then select the columns over that.
>>> a[[0,1,3], :] # Returns the rows you want
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[12, 13, 14, 15]])
>>> a[[0,1,3], :][:, [0,2]] # Selects the columns you want as well
array([[ 0, 2],
[ 4, 6],
[12, 14]])
[Edit] The built-in method: np.ix_
I recently discovered that numpy gives you an in-built one-liner to doing exactly what #Jaime suggested, but without having to use broadcasting syntax (which suffers from lack of readability). From the docs:
Using ix_ one can quickly construct index arrays that will index the
cross product. a[np.ix_([1,3],[2,5])] returns the array [[a[1,2] a[1,5]], [a[3,2] a[3,5]]].
So you use it like this:
>>> a = np.arange(20).reshape((5,4))
>>> a[np.ix_([0,1,3], [0,2])]
array([[ 0, 2],
[ 4, 6],
[12, 14]])
And the way it works is that it takes care of aligning arrays the way Jaime suggested, so that broadcasting happens properly:
>>> np.ix_([0,1,3], [0,2])
(array([[0],
[1],
[3]]), array([[0, 2]]))
Also, as MikeC says in a comment, np.ix_ has the advantage of returning a view, which my first (pre-edit) answer did not. This means you can now assign to the indexed array:
>>> a[np.ix_([0,1,3], [0,2])] = -1
>>> a
array([[-1, 1, -1, 3],
[-1, 5, -1, 7],
[ 8, 9, 10, 11],
[-1, 13, -1, 15],
[16, 17, 18, 19]])
Fancy indexing requires you to provide all indices for each dimension. You are providing 3 indices for the first one, and only 2 for the second one, hence the error. You want to do something like this:
>>> a[[[0, 0], [1, 1], [3, 3]], [[0,2], [0,2], [0, 2]]]
array([[ 0, 2],
[ 4, 6],
[12, 14]])
That is of course a pain to write, so you can let broadcasting help you:
>>> a[[[0], [1], [3]], [0, 2]]
array([[ 0, 2],
[ 4, 6],
[12, 14]])
This is much simpler to do if you index with arrays, not lists:
>>> row_idx = np.array([0, 1, 3])
>>> col_idx = np.array([0, 2])
>>> a[row_idx[:, None], col_idx]
array([[ 0, 2],
[ 4, 6],
[12, 14]])
USE:
>>> a[[0,1,3]][:,[0,2]]
array([[ 0, 2],
[ 4, 6],
[12, 14]])
OR:
>>> a[[0,1,3],::2]
array([[ 0, 2],
[ 4, 6],
[12, 14]])
Using np.ix_ is the most convenient way to do it (as answered by others), but it also can be done as follows:
>>> rows = [0, 1, 3]
>>> cols = [0, 2]
>>> (a[rows].T)[cols].T
array([[ 0, 2],
[ 4, 6],
[12, 14]])

Categories

Resources