For example, I have a point list:
a = np.array([[0,0,0],
[1,1,1],
[2,2,2],
[3,3,3],
[4,4,4],
[5,5,5],
[6,6,6],
[7,7,7],
[8,8,8],
[9,9,9]])
and I have another array that gives the number of elements in each group:
b = np.array([2,0,3,5])
How can I split array a according to the element counts in array b, so that I get the following output?
[[[0,0,0],[1,1,1]],
[],
[[2,2,2],[3,3,3],[4,4,4]],
[[5,5,5],[6,6,6],[7,7,7],[8,8,8],[9,9,9]]]
You can use numpy.split using cumsum on b to get the split points:
out = np.split(a, b.cumsum()[:-1])
output:
[array([[0, 0, 0],
[1, 1, 1]]),
array([], shape=(0, 3), dtype=int64),
array([[2, 2, 2],
[3, 3, 3],
[4, 4, 4]]),
array([[5, 5, 5],
[6, 6, 6],
[7, 7, 7],
[8, 8, 8],
[9, 9, 9]])]
If you want lists:
out = [x.tolist() for x in np.split(a, b.cumsum()[:-1])]
output:
[[[0, 0, 0], [1, 1, 1]],
[],
[[2, 2, 2], [3, 3, 3], [4, 4, 4]],
[[5, 5, 5], [6, 6, 6], [7, 7, 7], [8, 8, 8], [9, 9, 9]]]
intermediate:
b.cumsum()[:-1]
# array([2, 2, 5])
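To make the role of the split points concrete, here is a minimal self-contained sketch of the same idea. The names split_points, chunks and chunk_lengths are only illustrative, and the sketch assumes that b sums to len(a).

```python
import numpy as np

# Rows [0,0,0], [1,1,1], ..., [9,9,9], as in the question
a = np.arange(10).repeat(3).reshape(10, 3)
b = np.array([2, 0, 3, 5])

# The split points are the running totals of b without the final total:
# np.split cuts *before* each index, so the last chunk is "everything left".
split_points = np.cumsum(b)[:-1]
chunks = np.split(a, split_points)

# Each chunk should have exactly as many rows as b asks for
chunk_lengths = [len(c) for c in chunks]
print(split_points, chunk_lengths)
```

Note that if b sums to less than len(a), the last chunk silently absorbs the remaining rows, so checking b.sum() == len(a) first may be worthwhile.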
I have a numpy array of shape (294, 62, 350). Along the third dimension (the 350), I need to combine every two columns into one longer one which would result in an array of shape (294, 124, 175). For example if I have this array:
a_3d_array = np.array([[[1, 2, 3, 6, 1, 2], [3, 4, 3, 6, 1, 4]],
[[5, 2, 2, 1, 4, 2], [2, 9, 4, 3, 2, 7]]])
The expected output would be:
expected_output = np.array([[[5, 2, 4], [2, 4, 2], [ 2, 1, 2], [9, 3, 7]],
[[1, 3, 1], [3, 3, 1], [2, 6, 2], [4, 6, 4]]])
Sorry, I'm new to Python and don't know how to approach this, so I have no attempt of my own to include here.
a_3d_array = np.array([[[1, 2, 3, 6, 1, 2], [3, 4, 3, 6, 1, 4]],
[[5, 2, 2, 1, 4, 2], [2, 9, 4, 3, 2, 7]]])
output = np.hstack([a_3d_array[:, :, ::2], a_3d_array[:, :, 1::2]])
To combine every N-th column:
N = 3
output = np.hstack([a_3d_array[:, :, idx::N] for idx in range(N)])
You can reshape and reverse the first dimension:
a_3d_array.reshape((2,4,3), order='F')[::-1]
If you don't know the shape:
x,y,z = a_3d_array.shape
a_3d_array.reshape((x,y*2,-1), order='F')[::-1]
output:
array([[[5, 2, 4],
[2, 4, 2],
[2, 1, 2],
[9, 3, 7]],
[[1, 3, 1],
[3, 3, 1],
[2, 6, 2],
[4, 6, 4]]])
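As a quick sanity check, the slicing approach and the Fortran-order reshape can be compared directly. This is only a sketch on the sample array from the question. The names by_slicing, by_reshape and same are illustrative, and the [::-1] reversal is left out on purpose so that both results keep the original order along the first axis.

```python
import numpy as np

a_3d_array = np.array([[[1, 2, 3, 6, 1, 2], [3, 4, 3, 6, 1, 4]],
                       [[5, 2, 2, 1, 4, 2], [2, 9, 4, 3, 2, 7]]])

# Slicing approach: even-indexed columns first, then odd-indexed columns,
# concatenated along axis 1 (np.hstack joins along the second axis)
by_slicing = np.hstack([a_3d_array[:, :, ::2], a_3d_array[:, :, 1::2]])

# Reshape approach: Fortran order lets the second axis absorb the column pairs
x, y, z = a_3d_array.shape
by_reshape = a_3d_array.reshape((x, y * 2, -1), order='F')

same = np.array_equal(by_slicing, by_reshape)
print(by_slicing.shape, same)
```

Both give shape (2, 4, 3) here. Applying [::-1] afterwards only swaps the two blocks along the first axis, which is what makes the result match the expected output in the question.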
Suppose I have two NumPy arrays
x = [[1, 2, 8],
[2, 9, 1],
[3, 8, 9],
[4, 3, 5],
[5, 2, 3],
[6, 4, 7],
[7, 2, 3],
[8, 2, 2],
[9, 5, 3],
[10, 2, 3],
[11, 2, 4]]
y = [0, 0, 1, 0, 1, 1, 2, 2, 2, 0, 0]
Note:
(The values in x are not sorted in any way; I chose this example only to make the problem easier to illustrate.)
(These are just example values. x can contain arbitrarily many rows and y arbitrarily many distinct labels, but x always has as many rows as y has entries.)
I want to efficiently split the array x into sub-arrays according to the values in y.
My desired outputs would be
z_0 = [[1, 2, 8],
[2, 9, 1],
[4, 3, 5],
[10, 2, 3],
[11, 2, 4]]
z_1 = [[3, 8, 9],
[5, 2, 3],
[6, 4, 7]]
z_2 = [[7, 2, 3],
[8, 2, 2],
[9, 5, 3]]
Assuming that y starts with zero and is not sorted but grouped, what is the most efficient way to do this?
Note: This question is the unsorted version of this question:
Split a NumPy array into subarrays according to the values (sorted in ascending order) of another array
One way to solve this is to build up a list of filter indexes for each y value and then simply select those elements of x. For example:
z_0 = x[[i for i, v in enumerate(y) if v == 0]]
z_1 = x[[i for i, v in enumerate(y) if v == 1]]
z_2 = x[[i for i, v in enumerate(y) if v == 2]]
Output
array([[ 1, 2, 8],
[ 2, 9, 1],
[ 4, 3, 5],
[10, 2, 3],
[11, 2, 4]])
array([[3, 8, 9],
[5, 2, 3],
[6, 4, 7]])
array([[7, 2, 3],
[8, 2, 2],
[9, 5, 3]])
If you want to be more generic and support different sets of numbers in y, you could use a comprehension to produce a list of arrays e.g.
z = [x[[i for i, v in enumerate(y) if v == m]] for m in set(y)]
Output:
[array([[ 1, 2, 8],
[ 2, 9, 1],
[ 4, 3, 5],
[10, 2, 3],
[11, 2, 4]]),
array([[3, 8, 9],
[5, 2, 3],
[6, 4, 7]]),
array([[7, 2, 3],
[8, 2, 2],
[9, 5, 3]])]
If y is also an np.array and the same length as x you can simplify this to use boolean indexing:
z = [x[y==m] for m in set(y)]
Output is the same as above.
Just use list comprehension and boolean indexing
x = np.array(x)
y = np.array(y)
z = [x[y == i] for i in range(y.max() + 1)]
z
Out[]:
[array([[ 1, 2, 8],
[ 2, 9, 1],
[ 4, 3, 5],
[10, 2, 3],
[11, 2, 4]]),
array([[3, 8, 9],
[5, 2, 3],
[6, 4, 7]]),
array([[7, 2, 3],
[8, 2, 2],
[9, 5, 3]])]
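A different technique, not taken from the answers above, is to sort once and then split. The sketch below assumes x and y are NumPy arrays of equal length. The names order, labels, counts and groups are illustrative. It uses a stable argsort so that rows keep their original relative order within each group.

```python
import numpy as np

x = np.array([[1, 2, 8], [2, 9, 1], [3, 8, 9], [4, 3, 5], [5, 2, 3], [6, 4, 7],
              [7, 2, 3], [8, 2, 2], [9, 5, 3], [10, 2, 3], [11, 2, 4]])
y = np.array([0, 0, 1, 0, 1, 1, 2, 2, 2, 0, 0])

# Stable sort of the labels: equal labels keep their original relative order
order = np.argsort(y, kind='stable')

# Group sizes, in ascending label order
labels, counts = np.unique(y, return_counts=True)

# Cut the reordered rows at the running totals of the group sizes
groups = np.split(x[order], np.cumsum(counts)[:-1])

for label, group in zip(labels, groups):
    print(label, group.tolist())
```

This does one sort instead of one boolean mask per label, which may help when y has many distinct values. For a handful of labels the list comprehension above is simpler.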
Slight variation.
from operator import itemgetter
label = itemgetter(1)
Associate the implied information with the label ... (index,label)
y1 = [thing for thing in enumerate(y)]
Sort on the label
y1.sort(key=label)
Group by label and construct the results
import itertools
d = {}
for key, group in itertools.groupby(y1, label):
    d[f'z{key}'] = [x[i] for i, k in group]
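Put together, the steps above can be run as one short script. This is a sketch that assumes x and y are plain Python lists. It uses sorted() rather than an in-place sort and a dict comprehension rather than the loop, and the name pairs is illustrative.

```python
from itertools import groupby
from operator import itemgetter

x = [[1, 2, 8], [2, 9, 1], [3, 8, 9], [4, 3, 5], [5, 2, 3], [6, 4, 7],
     [7, 2, 3], [8, 2, 2], [9, 5, 3], [10, 2, 3], [11, 2, 4]]
y = [0, 0, 1, 0, 1, 1, 2, 2, 2, 0, 0]

label = itemgetter(1)

# (index, label) pairs sorted by label; sorted() is stable, so the
# original row order is preserved inside each label group
pairs = sorted(enumerate(y), key=label)

# groupby only merges *adjacent* equal keys, hence the sort above
d = {f'z{key}': [x[i] for i, _ in group]
     for key, group in groupby(pairs, key=label)}

print(d['z1'])
```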
Pandas solution:
>>> import pandas as pd
>>> df = pd.DataFrame({'points':[thing for thing in x],'cat':y})
>>> z = df.groupby('cat').agg(list)
>>> z
points
cat
0 [[1, 2, 8], [2, 9, 1], [4, 3, 5], [10, 2, 3], ...
1 [[3, 8, 9], [5, 2, 3], [6, 4, 7]]
2 [[7, 2, 3], [8, 2, 2], [9, 5, 3]]
How do I use NumPy / Python array routines to do this?
E.g., if I have the array [[1,2,3,4]], the output should be
[[1,1,2,2,],
[1,1,2,2,],
[3,3,4,4,],
[3,3,4,4]]
Thus, the output is an array with double the row and column dimensions, and each element of the original array is repeated three more times, so that it fills a 2x2 block.
What I have so far is this
def operation(mat,step=2):
    result = np.array(mat,copy=True)
    result[::2,::2] = mat
    return result
This gives me array
[[ 98.+0.j 0.+0.j 40.+0.j 0.+0.j]
[ 0.+0.j 0.+0.j 0.+0.j 0.+0.j]
[ 29.+0.j 0.+0.j 54.+0.j 0.+0.j]
[ 0.+0.j 0.+0.j 0.+0.j 0.+0.j]]
for the input
[[98 40]
[29 54]]
The array will always be of even dimensions.
Use np.repeat():
In [9]: A = np.array([[1, 2, 3, 4]])
In [10]: np.repeat(np.repeat(A, 2).reshape(2, 4), 2, 0)
Out[10]:
array([[1, 1, 2, 2],
[1, 1, 2, 2],
[3, 3, 4, 4],
[3, 3, 4, 4]])
Explanation:
First, you can repeat the array items:
In [30]: np.repeat(A, 3)
Out[30]: array([1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4])
then all you need is reshaping the result (based on your expected result this can be different):
In [32]: np.repeat(A, 3).reshape(2, 3*2)
Out[32]:
array([[1, 1, 1, 2, 2, 2],
[3, 3, 3, 4, 4, 4]])
Now repeat the result along the first axis:
In [34]: np.repeat(np.repeat(A, 3).reshape(2, 3*2), 3, 0)
Out[34]:
array([[1, 1, 1, 2, 2, 2],
[1, 1, 1, 2, 2, 2],
[1, 1, 1, 2, 2, 2],
[3, 3, 3, 4, 4, 4],
[3, 3, 3, 4, 4, 4],
[3, 3, 3, 4, 4, 4]])
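If the input is already two-dimensional, like the 2x2 example in the question, the reshape step is not needed: repeating once along each axis should be enough. A minimal sketch, where m, step and upscaled are illustrative names:

```python
import numpy as np

m = np.array([[98, 40],
              [29, 54]])
step = 2

# Repeat every row `step` times, then every column `step` times
upscaled = np.repeat(np.repeat(m, step, axis=0), step, axis=1)
print(upscaled)
```

The result has shape (m.shape[0] * step, m.shape[1] * step), and each original element fills a step-by-step block.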
Another approach could be with np.kron -
np.kron(a.reshape(-1,2),np.ones((2,2),dtype=int))
Basically, we reshape the input into a 2D array whose second axis has length 2. np.kron then multiplies each element by the block np.ones((2,2),dtype=int), which replicates every element twice along the rows and twice along the columns.
Sample run -
In [45]: a
Out[45]: array([7, 5, 4, 2, 8, 6])
In [46]: np.kron(a.reshape(-1,2),np.ones((2,2),dtype=int))
Out[46]:
array([[7, 7, 5, 5],
[7, 7, 5, 5],
[4, 4, 2, 2],
[4, 4, 2, 2],
[8, 8, 6, 6],
[8, 8, 6, 6]])
If you would like to have 4 rows, use a.reshape(2,-1) instead.
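The same Kronecker-product idea can be applied directly to a 2D input such as the 2x2 example from the question. This is a sketch under that assumption. The names m, block and upscaled are illustrative, and the block of ones is given the dtype of m so that the result keeps the input's dtype.

```python
import numpy as np

m = np.array([[98, 40],
              [29, 54]])

# Each element of m is multiplied by this 2x2 block of ones,
# so it ends up filling a 2x2 patch of the result
block = np.ones((2, 2), dtype=m.dtype)
upscaled = np.kron(m, block)
print(upscaled)
```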
The better solution is to use numpy but you could use iteration also:
a = [[1, 2, 3, 4]]
v = iter(a[0])
b = []
for i in v:
    # take the next element too, so the input is consumed in pairs
    n = next(v)
    # emit two identical rows, each holding both values twice
    for j in range(2):
        b.append([i] * 2 + [n] * 2)
print(b)
# [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
A = np.array([[1,2,3],[4,1,3],[6,7,1]])
array([[1, 2, 3],
[4, 1, 3],
[6, 7, 1]])
I need to replace every 1 with 23, but only within a subset of the array.
The subset starts at index (1, 1) and ends at index (2, 2), so the result should be:
array([[1, 2, 3],
[4, 23, 3],
[6, 7, 23]])
>>> A = np.array([[1,2,3],[4,1,3],[6,7,1]])
array([[1, 2, 3],
[4, 1, 3],
[6, 7, 1]])
>>> b = A[1:,1:]==1
>>> A[1:,1:][b]=23
>>> A
array([[ 1, 2, 3],
[ 4, 23, 3],
[ 6, 7, 23]])
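This works because basic slicing returns a view rather than a copy, so assigning through the slice modifies A itself. A small sketch of the same idea with the view made explicit (sub is an illustrative name):

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 1, 3],
              [6, 7, 1]])

# Basic slicing gives a view onto rows 1.. and columns 1.. of A
sub = A[1:, 1:]

# Boolean-mask assignment on the view writes through to A;
# the 1 at A[0, 0] lies outside the view and is left alone
sub[sub == 1] = 23

print(A)
```

Note that this relies on basic (slice) indexing. Selecting the subset with an index list or boolean mask first would produce a copy, and the assignment would then not reach A.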