NumPy: split an array into sub-arrays by given numbers of elements - Python

For example, I have a point list
a = np.array([[0,0,0],
              [1,1,1],
              [2,2,2],
              [3,3,3],
              [4,4,4],
              [5,5,5],
              [6,6,6],
              [7,7,7],
              [8,8,8],
              [9,9,9]])
and I have another array that represents the number of elements
b = np.array([2,0,3,5])
how can I split array a according to the number of elements in array b so that I can get the output
[[[0,0,0],[1,1,1]],
[],
[[2,2,2],[3,3,3],[4,4,4]],
[[5,5,5],[6,6,6],[7,7,7],[8,8,8],[9,9,9]]]

You can use numpy.split using cumsum on b to get the split points:
out = np.split(a, b.cumsum()[:-1])
output:
[array([[0, 0, 0],
[1, 1, 1]]),
array([], shape=(0, 3), dtype=int64),
array([[2, 2, 2],
[3, 3, 3],
[4, 4, 4]]),
array([[5, 5, 5],
[6, 6, 6],
[7, 7, 7],
[8, 8, 8],
[9, 9, 9]])]
If you want lists:
out = [x.tolist() for x in np.split(a, b.cumsum()[:-1])]
output:
[[[0, 0, 0], [1, 1, 1]],
[],
[[2, 2, 2], [3, 3, 3], [4, 4, 4]],
[[5, 5, 5], [6, 6, 6], [7, 7, 7], [8, 8, 8], [9, 9, 9]]]
intermediate:
b.cumsum()[:-1]
# array([2, 2, 5])
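Putting the pieces together, here is a minimal runnable sketch (a is rebuilt with np.repeat purely for brevity; it is the same point list as above):

import numpy as np

a = np.repeat(np.arange(10), 3).reshape(10, 3)  # same as the point list above
b = np.array([2, 0, 3, 5])

# the split points are the running totals of b, without the final total
split_points = b.cumsum()[:-1]      # array([2, 2, 5])
chunks = np.split(a, split_points)  # one sub-array per count in b

for chunk in chunks:
    print(chunk.tolist())
# [[0, 0, 0], [1, 1, 1]]
# []
# [[2, 2, 2], [3, 3, 3], [4, 4, 4]]
# [[5, 5, 5], [6, 6, 6], [7, 7, 7], [8, 8, 8], [9, 9, 9]]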

Related

Delete rows of Numpy Array that contain combinations of certain numbers

I have an array containing the faces of a triangulation, which looks like this:
faces = np.array([[4, 0, 1],
                  [5, 4, 1],
                  [7, 5, 1],
                  [7, 5, 4],
                  [3, 0, 1],  # row to delete
                  [7, 3, 1],
                  [4, 2, 0],
                  [6, 4, 2],
                  [7, 6, 2],
                  [7, 6, 4],
                  [3, 2, 0],  # row to delete
                  [7, 3, 2]])
Each number in an entry is the index of a 3-dimensional vertex.
Now I want to delete the rows that only contain values from the following list:
indices = [0, 1, 2, 3]
How could I do this? I have tried a few options, but I can't manage to delete just the two faces I want to remove.
Assuming you have unique elements in each row, you could use:
faces_filtered = faces[~np.isin(faces, indices).all(1)]
output (initial rows 4 and 10 have been removed):
array([[4, 0, 1],
[5, 4, 1],
[7, 5, 1],
[7, 5, 4],
[7, 3, 1],
[4, 2, 0],
[6, 4, 2],
[7, 6, 2],
[7, 6, 4],
[7, 3, 2]])
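For completeness, a self-contained sketch of the same filter (same faces and indices as above), showing which rows the mask picks out:

import numpy as np

faces = np.array([[4, 0, 1], [5, 4, 1], [7, 5, 1], [7, 5, 4],
                  [3, 0, 1],   # row to delete
                  [7, 3, 1], [4, 2, 0], [6, 4, 2], [7, 6, 2], [7, 6, 4],
                  [3, 2, 0],   # row to delete
                  [7, 3, 2]])
indices = [0, 1, 2, 3]

# True for rows whose entries ALL come from `indices`
all_in = np.isin(faces, indices).all(axis=1)
print(all_in.nonzero()[0])   # [ 4 10] -> exactly the two marked rows
faces_filtered = faces[~all_in]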

Split a NumPy array into subarrays according to the values (not sorted, but grouped) of another array

Suppose I have two NumPy arrays
x = [[1, 2, 8],
     [2, 9, 1],
     [3, 8, 9],
     [4, 3, 5],
     [5, 2, 3],
     [6, 4, 7],
     [7, 2, 3],
     [8, 2, 2],
     [9, 5, 3],
     [10, 2, 3],
     [11, 2, 4]]
y = [0, 0, 1, 0, 1, 1, 2, 2, 2, 0, 0]
Note:
(The values in x are not sorted in any way; I chose this example to better illustrate the problem.)
(These are just examples of x and y. Both can contain arbitrarily many different numbers, but x always has exactly as many rows as y has values.)
I want to efficiently split the array x into sub-arrays according to the values in y.
My desired outputs would be
z_0 = [[1, 2, 8],
       [2, 9, 1],
       [4, 3, 5],
       [10, 2, 3],
       [11, 2, 4]]
z_1 = [[3, 8, 9],
       [5, 2, 3],
       [6, 4, 7]]
z_2 = [[7, 2, 3],
       [8, 2, 2],
       [9, 5, 3]]
Assuming that y starts with zero and is not sorted but grouped, what is the most efficient way to do this?
Note: This question is the unsorted version of this question:
Split a NumPy array into subarrays according to the values (sorted in ascending order) of another array
One way to solve this is to build up a list of filter indexes for each y value and then simply select those elements of x. For example:
z_0 = x[[i for i, v in enumerate(y) if v == 0]]
z_1 = x[[i for i, v in enumerate(y) if v == 1]]
z_2 = x[[i for i, v in enumerate(y) if v == 2]]
Output
array([[ 1, 2, 8],
[ 2, 9, 1],
[ 4, 3, 5],
[10, 2, 3],
[11, 2, 4]])
array([[3, 8, 9],
[5, 2, 3],
[6, 4, 7]])
array([[7, 2, 3],
[8, 2, 2],
[9, 5, 3]])
If you want to be more generic and support different sets of numbers in y, you could use a comprehension to produce a list of arrays e.g.
z = [x[[i for i, v in enumerate(y) if v == m]] for m in set(y)]
Output:
[array([[ 1, 2, 8],
[ 2, 9, 1],
[ 4, 3, 5],
[10, 2, 3],
[11, 2, 4]]),
array([[3, 8, 9],
[5, 2, 3],
[6, 4, 7]]),
array([[7, 2, 3],
[8, 2, 2],
[9, 5, 3]])]
If y is also an np.array and the same length as x you can simplify this to use boolean indexing:
z = [x[y==m] for m in set(y)]
Output is the same as above.
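As a small, self-contained sketch of that boolean-indexing version (using np.unique instead of set, so the groups come out in sorted label order):

import numpy as np

x = np.array([[1, 2, 8], [2, 9, 1], [3, 8, 9], [4, 3, 5], [5, 2, 3], [6, 4, 7],
              [7, 2, 3], [8, 2, 2], [9, 5, 3], [10, 2, 3], [11, 2, 4]])
y = np.array([0, 0, 1, 0, 1, 1, 2, 2, 2, 0, 0])

# one sub-array per distinct label in y
z = [x[y == label] for label in np.unique(y)]
print([len(group) for group in z])  # [5, 3, 3]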
Just use a list comprehension and boolean indexing:
x = np.array(x)
y = np.array(y)
z = [x[y == i] for i in range(y.max() + 1)]
z
Out[]:
[array([[ 1, 2, 8],
[ 2, 9, 1],
[ 4, 3, 5],
[10, 2, 3],
[11, 2, 4]]),
array([[3, 8, 9],
[5, 2, 3],
[6, 4, 7]]),
array([[7, 2, 3],
[8, 2, 2],
[9, 5, 3]])]
Slight variation.
from operator import itemgetter
label = itemgetter(1)
Associate the implied index information with the label: (index, label)
y1 = [thing for thing in enumerate(y)]
Sort on the label
y1.sort(key=label)
Group by label and construct the results
import itertools
d = {}
for key, group in itertools.groupby(y1, label):
    d[f'z{key}'] = [x[i] for i, k in group]
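Assembled into one runnable snippet (x here is the plain list of lists from the question, so x[i] picks a row):

import itertools
from operator import itemgetter

x = [[1, 2, 8], [2, 9, 1], [3, 8, 9], [4, 3, 5], [5, 2, 3], [6, 4, 7],
     [7, 2, 3], [8, 2, 2], [9, 5, 3], [10, 2, 3], [11, 2, 4]]
y = [0, 0, 1, 0, 1, 1, 2, 2, 2, 0, 0]

label = itemgetter(1)
y1 = sorted(enumerate(y), key=label)   # (index, label) pairs, grouped by label

d = {}
for key, group in itertools.groupby(y1, label):
    d[f'z{key}'] = [x[i] for i, _ in group]

print(d['z1'])  # [[3, 8, 9], [5, 2, 3], [6, 4, 7]]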
Pandas solution:
>>> import pandas as pd
>>> df = pd.DataFrame({'points': [thing for thing in x], 'cat': y})
>>> z = df.groupby('cat').agg(list)
>>> z
points
cat
0 [[1, 2, 8], [2, 9, 1], [4, 3, 5], [10, 2, 3], ...
1 [[3, 8, 9], [5, 2, 3], [6, 4, 7]]
2 [[7, 2, 3], [8, 2, 2], [9, 5, 3]]
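If you want plain NumPy arrays back out of the grouped DataFrame, one way (a sketch along the same lines, iterating the groupby directly) is:

import numpy as np
import pandas as pd

x = [[1, 2, 8], [2, 9, 1], [3, 8, 9], [4, 3, 5], [5, 2, 3], [6, 4, 7],
     [7, 2, 3], [8, 2, 2], [9, 5, 3], [10, 2, 3], [11, 2, 4]]
y = [0, 0, 1, 0, 1, 1, 2, 2, 2, 0, 0]

df = pd.DataFrame({'points': list(x), 'cat': y})
# one 2-D array per category, keyed by the value in y
arrays = {cat: np.array(g['points'].tolist()) for cat, g in df.groupby('cat')}
print(arrays[2])
# [[7 2 3]
#  [8 2 2]
#  [9 5 3]]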

row-wise Cartesian product between a 1d array and a 2d array

I think I'm missing something obvious. I want to find the Cartesian product of arr1 (a 1d numpy array) and the ROWS of arr2 (a 2d numpy array). So, if arr1 has 4 elements and arr2 has shape (5, 2), the output should have shape (20, 3) (see below).
import numpy as np
arr1 = np.array([1, 4, 7, 3])
arr2 = np.array([[0, 1],
                 [2, 3],
                 [4, 5],
                 [4, 0],
                 [9, 9]])
The desired output is:
arr3 = np.array([[1, 0, 1],
                 [1, 2, 3],
                 [1, 4, 5],
                 [1, 4, 0],
                 [1, 9, 9],
                 [4, 0, 1],
                 [4, 2, 3],
                 [4, 4, 5],
                 [4, 4, 0],
                 [4, 9, 9],
                 [7, 0, 1],
                 [7, 2, 3],
                 [7, 4, 5],
                 [7, 4, 0],
                 [7, 9, 9],
                 [3, 0, 1],
                 [3, 2, 3],
                 [3, 4, 5],
                 [3, 4, 0],
                 [3, 9, 9]])
I've been trying to use transpose and reshape with code like np.array(np.meshgrid(arr1,arr2)), but no success yet.
I'm hoping the solution can be generalized because I also need to deal with situations like this: get all combinations of the ROWS of a 2d (10, 2) array with the ROWS of a 2d (20, 5) array, giving an output array of shape (200, 7).
Here is a vectorized solution that works for your general case as well:
arr1 = np.array([[1, 4],
                 [7, 3]])
arr2 = np.array([[0, 1],
                 [2, 3],
                 [4, 5],
                 [4, 0],
                 [9, 9]])
np.hstack((np.repeat(arr1,len(arr2),0),np.stack((arr2,)*len(arr1)).reshape(-1,arr2.shape[1])))
output of shape (2,2)*(5,2)->(10,4):
[[1 4 0 1]
[1 4 2 3]
[1 4 4 5]
[1 4 4 0]
[1 4 9 9]
[7 3 0 1]
[7 3 2 3]
[7 3 4 5]
[7 3 4 0]
[7 3 9 9]]
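Wrapped up as a small reusable function (a sketch; np.tile plays the same role as the np.stack((arr2,)*len(arr1)) trick above):

import numpy as np

def row_cartesian(a, b):
    # All row pairings of 2-D arrays a (n, p) and b (m, q) -> shape (n*m, p+q)
    left = np.repeat(a, len(b), axis=0)   # each row of a repeated m times
    right = np.tile(b, (len(a), 1))       # b stacked n times
    return np.hstack((left, right))

arr1 = np.array([[1, 4], [7, 3]])
arr2 = np.array([[0, 1], [2, 3], [4, 5], [4, 0], [9, 9]])
print(row_cartesian(arr1, arr2).shape)   # (10, 4)

# the original 1-D case, after reshaping arr1 to a column
print(row_cartesian(np.array([1, 4, 7, 3]).reshape(-1, 1), arr2).shape)  # (20, 3)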
You can use hstack to add columns to arr2, and vstack to get the final array.
np.vstack(np.apply_along_axis(
    lambda x: np.hstack([np.repeat(x[0], arr2.shape[0]).reshape(-1, 1), arr2]),
    1,
    arr1[:, None]))
I think this should do it:
import numpy as np
arr0 = np.array([1, 4, 7, 3])
arr1 = np.reshape(arr0, (len(arr0),1))
arr2 = np.array([[0, 1],
                 [2, 3],
                 [4, 5],
                 [4, 0],
                 [9, 9]])
r1,c1 = arr1.shape
r2,c2 = arr2.shape
arrOut = np.zeros((r1,r2,c1+c2), dtype=arr1.dtype)
arrOut[:,:,:c1] = arr1[:,None,:]
arrOut[:,:,c1:] = arr2
arrOut.reshape(-1,c1+c2)
The output is:
array([[1, 0, 1],
[1, 2, 3],
[1, 4, 5],
[1, 4, 0],
[1, 9, 9],
[4, 0, 1],
[4, 2, 3],
[4, 4, 5],
[4, 4, 0],
[4, 9, 9],
[7, 0, 1],
[7, 2, 3],
[7, 4, 5],
[7, 4, 0],
[7, 9, 9],
[3, 0, 1],
[3, 2, 3],
[3, 4, 5],
[3, 4, 0],
[3, 9, 9]])

NumPy: optimize code by removing loops

I want to use this code on a very large array. The code takes a long time to execute and it is not efficient.
Is there any way to remove the loops and convert this code into an optimal form?
>>> import numpy as np
>>> x=np.random.randint(10, size=(4,5,3))
>>> x
array([[[3, 2, 6],
[4, 6, 6],
[3, 7, 9],
[6, 4, 2],
[9, 0, 1]],
[[9, 0, 4],
[1, 8, 9],
[6, 8, 1],
[9, 4, 5],
[1, 5, 2]],
[[6, 1, 6],
[1, 8, 8],
[3, 8, 3],
[7, 1, 0],
[7, 7, 0]],
[[5, 6, 6],
[8, 3, 1],
[0, 5, 4],
[6, 1, 2],
[5, 6, 1]]])
>>> y = []
>>> for i in range(x.shape[1]):
...     for j in range(x.shape[2]):
...         y.append(x[:, i, j].tolist())
>>> y
[[3, 9, 6, 5], [2, 0, 1, 6], [6, 4, 6, 6], [4, 1, 1, 8], [6, 8, 8, 3], [6, 9, 8, 1], [3, 6, 3, 0], [7, 8, 8, 5], [9, 1, 3, 4], [6, 9, 7, 6], [4, 4, 1, 1], [2, 5, 0, 2], [9, 1, 7, 5], [0, 5, 7, 6], [1, 2, 0, 1]]
You could permute axes with np.transpose and then reshape to 2D -
y = x.transpose(1,2,0).reshape(-1,x.shape[0])
Append .tolist() for list output.
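A quick self-check of that approach against the loop version (a small sketch):

import numpy as np

x = np.random.randint(10, size=(4, 5, 3))

# loop version from the question
y_loop = [x[:, i, j].tolist() for i in range(x.shape[1]) for j in range(x.shape[2])]

# vectorised version: move axis 0 to the end, then collapse the leading two axes
y_vec = x.transpose(1, 2, 0).reshape(-1, x.shape[0]).tolist()

assert y_vec == y_loop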
Yes, either use np.reshape(x, shape) or try np.ndarray.flatten(x, order='F') ('F' for Fortran style, i.e. column-first, as in your example).
Read the documentation to find out which parameters fit best. IMHO, ndarray.flatten is the more elegant option for you here. However, depending on the exact output you want, you might have to reshape the array first.

argmax on 2 axis for 3-d numpy array

I'd like to obtain a 1D array of indexes from a 3D matrix.
For instance given x = np.random.randint(10, size=(10,3,3)), I'd like to do something like np.argmax(x, axis=(1,2)) just like you can do with np.max, that is, obtain a 1D array of length 10 containing the indexes (0 to 8) of the maximums of each submatrix of size (3,3).
I have not found anything helpful so far, and I want to avoid looping over the first dimension (calling np.argmax on each submatrix), as it is quite big.
Cheers!
Reshape to merge those last two axes and then use np.argmax -
idx = x.reshape(x.shape[0],-1).argmax(-1)
out = np.unravel_index(idx, x.shape[-2:])
Sample run -
In [263]: x = np.random.randint(10, size=(4,3,3))
In [264]: x
Out[264]:
array([[[0, 9, 2],
[7, 7, 8],
[2, 5, 9]],
[[1, 7, 2],
[8, 9, 0],
[2, 8, 3]],
[[7, 5, 0],
[7, 1, 6],
[5, 1, 1]],
[[0, 7, 3],
[5, 4, 1],
[9, 8, 9]]])
In [265]: idx = x.reshape(x.shape[0],-1).argmax(-1)
In [266]: np.unravel_index(idx, x.shape[-2:])
Out[266]: (array([0, 1, 0, 2]), array([1, 1, 0, 0]))
If you meant getting the merged (flattened) index, then it's simpler -
x.reshape(x.shape[0],-1).argmax(1)
Sample run -
In [283]: x
Out[283]:
array([[[2, 3, 7],
[8, 1, 0],
[3, 6, 9]],
[[8, 0, 5],
[2, 2, 9],
[9, 0, 9]],
[[1, 9, 2],
[5, 0, 3],
[7, 2, 1]],
[[1, 6, 5],
[2, 3, 7],
[7, 4, 6]]])
In [284]: x.reshape(x.shape[0],-1).argmax(1)
Out[284]: array([8, 5, 1, 5])
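To sanity-check the reshape-based argmax against an explicit loop over the first dimension (a small sketch; ties resolve to the first occurrence in both versions):

import numpy as np

x = np.random.randint(10, size=(10, 3, 3))

# merged (flat) index of the max within each (3, 3) submatrix
flat_idx = x.reshape(x.shape[0], -1).argmax(axis=1)

# same thing with a plain loop
loop_idx = np.array([m.argmax() for m in x])
assert (flat_idx == loop_idx).all()

# 2-D coordinates within each submatrix, if those are needed instead
rows, cols = np.unravel_index(flat_idx, x.shape[-2:])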
