Starting from two numpy arrays:
A = np.array([[1, 2], [2, 3], [3, 4], [4, 5]])
B = np.array([[9, 8], [8, 7], [7, 6], [6, 5]])
I would like to create a new array C by picking, for each index, the row at that index from either A or B at random. The idea is that at each index of random_selector, if the value is higher than 0.1, we choose the same-index row from A; otherwise, we choose the same-index row from B.
random_selector = np.random.random(size=len(A))
C = np.where(random_selector > .1, A, B)
# example of desired result picking rows from respectively A, B, B, A:
# [[1, 2], [8, 7], [7, 6], [4, 5]]
Running the above code, however, produces the following error:
ValueError: operands could not be broadcast together with shapes (4,) (4,2) (4,2)
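The condition has shape (4,), which cannot be broadcast against the (4, 2) arrays A and B. Try adding a new dimension so that the condition has shape (4, 1):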
import numpy as np
A = np.array([[1, 2], [2, 3], [3, 4], [4, 5]])
B = np.array([[9, 8], [8, 7], [7, 6], [6, 5]])
random_selector = np.random.random(size=len(A))
C = np.where((random_selector > .1)[:, None], A, B)
print(C)
Output (of a single run)
[[1 2]
[8 7]
[3 4]
[4 5]]
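To see why the extra axis helps, here is a small sketch that replaces the random selector with a fixed boolean mask. The fixed mask and the names `mask` and `col_mask` are assumptions made here for reproducibility. It reproduces the "A, B, B, A" example from the question.

```python
import numpy as np

A = np.array([[1, 2], [2, 3], [3, 4], [4, 5]])
B = np.array([[9, 8], [8, 7], [7, 6], [6, 5]])

# Fixed mask (an assumption, used instead of random values):
# True picks the row from A, False picks the row from B.
mask = np.array([True, False, False, True])

# Shape (4,) becomes (4, 1), which broadcasts across both columns of each row.
col_mask = mask[:, None]

C = np.where(col_mask, A, B)
print(C.tolist())  # [[1, 2], [8, 7], [7, 6], [4, 5]]
```

With a shape of (4, 1), each mask value is reused for every column of its row, so whole rows are selected rather than individual elements.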
I want to split a 2D array into overlapping (sliding) windows of consecutive rows. I can do it with a loop, but it takes forever. Is there a way to do it without a loop, or much faster? Here is my code. "data" is my 2D array of shape (M, N), "seq" is my window size (e.g., 40), and size = data.shape[0] = M.
X = list()
for j in range(size):
    end_idx = j + seq
    # '>' rather than '>=' so the last full window is kept, as in the example below
    if end_idx > size:
        break
    seq_x = data[j:end_idx, :]
    X.append(seq_x)
final_data = np.array(X)
It will look like below:
data = [[0, 1]
[2, 3]
[3, 4]
[4, 5]
[5, 6]
[6, 7]
[7, 8]
[8, 9]
[9, 7]]
For a window of size w = 2 we have
res = [[[0, 1]
[2, 3]]
[[2, 3]
[3, 4]]
[[3, 4]
[4, 5]]
...
[[8, 9]
[9, 7]]]
Does anyone have an idea of how to do this so that it runs quickly?
import numpy as np
data = np.array([[0, 1],
[2, 3],
[3, 4],
[4, 5],
[5, 6],
[6, 7],
[7, 8],
[8, 9],
[9, 7]])
w = 2
window_width = data.shape[1]
out = np.lib.stride_tricks.sliding_window_view(data, window_shape=(w, window_width)).squeeze()
out:
array([[[0, 1],
[2, 3]],
[[2, 3],
[3, 4]],
...
[[7, 8],
[8, 9]],
[[8, 9],
[9, 7]]])
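As a possible variant, the window can be taken along axis 0 only, which avoids the squeeze() call. This is a sketch, assuming NumPy 1.20 or later (where sliding_window_view is available). The names `windows` and `loop_windows` are illustrative, and the loop is included only to check the result.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

data = np.array([[0, 1], [2, 3], [3, 4], [4, 5], [5, 6],
                 [6, 7], [7, 8], [8, 9], [9, 7]])
w = 2

# Sliding over axis 0 appends the window axis last: shape (M - w + 1, N, w).
# Transposing the last two axes gives (M - w + 1, w, N), i.e. w rows per window.
windows = sliding_window_view(data, window_shape=w, axis=0).transpose(0, 2, 1)

# Reference result built with a plain loop, for comparison only.
loop_windows = np.array([data[j:j + w] for j in range(len(data) - w + 1)])

print(windows.shape)                          # (8, 2, 2)
print(np.array_equal(windows, loop_windows))  # True
```

The result is a read-only view into the original array, so no data is copied until you call something like `.copy()`.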
Given the following numpy array:
>>> a = np.arange(9).reshape((3, 3))
>>> a
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
How can I get the list of all possible single-column deletions? So in this case:
array([[[1, 2],
[4, 5],
[7, 8]],
[[0, 2],
[3, 5],
[6, 8]],
[[0, 1],
[3, 4],
[6, 7]]])
You can use itertools.combinations:
>>> from itertools import combinations
>>> np.array([a[:, list(comb)] for comb in combinations(range(a.shape[1]), r=2)])
array([[[0, 1],
[3, 4],
[6, 7]],
[[0, 2],
[3, 5],
[6, 8]],
[[1, 2],
[4, 5],
[7, 8]]])
Alternatively you can create a list of needed column indices first and then use integer array indexing to pick up the required columns from the original array:
r = range(a.shape[1])
cols = [[j for j in r if i != j] for i in r]
cols
# [[1, 2], [0, 2], [0, 1]]
a[:, cols].swapaxes(0, 1)
#[[[1 2]
# [4 5]
# [7 8]]
#
# [[0 2]
# [3 5]
# [6 8]]
#
# [[0 1]
# [3 4]
# [6 7]]]
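Another option, sketched here as an alternative rather than taken from the answers above, is to call np.delete once per column. It is still a Python-level loop over columns, but it yields the arrays in the same order as the desired output in the question.

```python
import numpy as np

a = np.arange(9).reshape((3, 3))

# Drop column i for each i; np.delete returns a new array and leaves `a` unchanged.
deletions = np.array([np.delete(a, i, axis=1) for i in range(a.shape[1])])

print(deletions.shape)        # (3, 3, 2)
print(deletions[0].tolist())  # [[1, 2], [4, 5], [7, 8]]
```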
I have a 2D numpy array that I need to extract a subset of data from where the value of the 2nd column is higher than a certain value. What's the best way to do this?
E.g. given the array:
array1 = [[1, 5], [2, 6], [3, 7], [4, 8]]
I would want to extract all rows where the 2nd column was higher than 6, so I'd get:
[3, 7], [4, 8]
The simplest way is to index with a boolean mask:
a[a[:, 1] > 6]
Output:
array([[3, 7],
       [4, 8]])
where a is the data as a NumPy array (e.g. a = np.array(array1)).
Use numpy.where:
import numpy as np
a = np.array([[1, 5], [2, 6], [3, 7], [4, 8]])
# all rows where the second item is greater than 6:
print(a[np.where(a[:, 1] > 6)])
# output: [[3 7], [4 8]]
Use list comprehension:
array1 = [[1, 5], [2, 6], [3, 7], [4, 8]]
threshold = 6
print([elem for elem in array1 if elem[1] > threshold])
# [[3, 7], [4, 8]]
Or using numpy:
import numpy as np
array1 = np.array(array1)
print(array1[array1[:,1] > 6])
# array([[3, 7], [4, 8]])
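The NumPy answers above all rely on the same boolean mask. A short sketch showing the mask itself and confirming that plain mask indexing and np.where select the same rows (the variable names are illustrative):

```python
import numpy as np

a = np.array([[1, 5], [2, 6], [3, 7], [4, 8]])

# One boolean per row: is the value in the second column greater than 6?
mask = a[:, 1] > 6
print(mask.tolist())  # [False, False, True, True]

rows_by_mask = a[mask]             # boolean indexing
rows_by_where = a[np.where(mask)]  # np.where turns the mask into row indices first

print(rows_by_mask.tolist())       # [[3, 7], [4, 8]]
```

Both forms return the same rows, so the np.where call is optional here.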
I have two matrices of the same size, A and B. I want to use the columns of B to access the columns of A, on a per-column basis. For example,
A = np.array([[1, 4, 7],
[2, 5, 8],
[3, 6, 9]])
and
B = np.array([[0, 0, 2],
[1, 2, 1],
[2, 1, 0]])
I want something like:
A[B] = [[1, 4, 9],
[2, 6, 8],
[3, 5, 7]]
I.e., I've used the j'th column of B as indices to the j'th column of A.
Is there an efficient way of doing so?
Thanks!
You can use advanced indexing, pairing each row index in B with the index of its own column:
A[B, np.arange(A.shape[1])]
array([[1, 4, 9],
[2, 6, 8],
[3, 5, 7]])
Or with np.take_along_axis:
np.take_along_axis(A, B, axis=0)
array([[1, 4, 9],
[2, 6, 8],
[3, 5, 7]])
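The two approaches can be checked against each other on a non-square input. This is a minimal sketch with made-up 3x2 arrays (an assumption, not data from the question). The column index has to range over the number of columns, which is easy to miss when A is square.

```python
import numpy as np

# Hypothetical non-square example: 3 rows, 2 columns.
A = np.array([[1, 4],
              [2, 5],
              [3, 6]])
B = np.array([[2, 0],
              [0, 1],
              [1, 2]])

# result[i, j] = A[B[i, j], j]: the column index runs over A's columns.
by_indexing = A[B, np.arange(A.shape[1])]
by_take = np.take_along_axis(A, B, axis=0)

print(by_indexing.tolist())  # [[3, 4], [1, 5], [2, 6]]
```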
I have an array of size (100, 50). I need to generate an output array which represents a cartesian product of input array rows.
For simplification purposes, let's have an input array:
array([[2, 6, 5],
[7, 3, 6]])
As output I would like to have:
array([[2, 7],
[2, 3],
[2, 6],
[6, 7],
[6, 3],
[6, 6],
[5, 7],
[5, 3],
[5, 6]])
Note: itertools.product doesn't work here because of the size of the input. Also, all the other similar answers assume that the number of rows is smaller than 32, which is not the case here.
This question has been asked many times, for example here.
An input of shape (100, 50) is far too big: the Cartesian product of 100 rows with 50 elements each has 50^100 rows, which cannot fit in memory whatever method is used. Smaller inputs, however, can be handled.
Anyway, I prefer to use itertools for this kind of stuff:
import itertools
import numpy as np
a = np.array([[2, 6, 5], [7, 3, 6]])
np.array(list(itertools.product(*a)))
array([[2, 7],
[2, 3],
[2, 6],
[6, 7],
[6, 3],
[6, 6],
[5, 7],
[5, 3],
[5, 6]])
Alternatively, for two rows, you can use np.meshgrid:
import numpy as np
a = np.array([[2, 6, 5], [7, 3, 6]])
out = np.array(np.meshgrid(a[0], a[1])).T.reshape(-1,2)
print(out)
"""
prints
[[2 7]
[2 3]
[2 6]
[6 7]
[6 3]
[6 6]
[5 7]
[5 3]
[5 6]]
"""
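The meshgrid idea can be extended to more than two rows. The following is a sketch using a made-up 3-row input (an assumption for illustration), checked against itertools.product. It also shows how quickly the row count grows, which is why the (100, 50) case is out of reach.

```python
import itertools
import numpy as np

# Hypothetical 3-row input; each row contributes one coordinate of the product.
a = np.array([[2, 6, 5],
              [7, 3, 6],
              [1, 0, 4]])

# One grid per input row; indexing="ij" makes the last row vary fastest,
# which matches the ordering of itertools.product.
grids = np.meshgrid(*a, indexing="ij")
out = np.stack(grids, axis=-1).reshape(-1, a.shape[0])

# The product has (elements per row) ** (number of rows) rows.
n_rows = a.shape[1] ** a.shape[0]
print(out.shape, n_rows)  # (27, 3) 27

reference = np.array(list(itertools.product(*a)))
print(np.array_equal(out, reference))  # True
```

For an input of shape (100, 50) the same formula gives 50 ** 100 rows, so neither approach is practical at that size.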