I have two lists of arrays: one holds all my data and the other is a filter:
data= [array([[12., 2., 7.],
[9., -1., 0.],
[2., 0., 4.],
[14., 2., 1.2]]),
array([[11., 3., 1.],
[2., 5., 0.],
[1., -1., 1.],
[-8., 0., 6.],
[2., 3., 3.]])]
filter= [array([2., 0., 5.]),
array([3., -1.])]
For each array in my data, I want to keep only the rows whose second column (data[i][:,1]) is equal to one of the values in the corresponding filter array (any value of filter[i]). I want to have the following output:
out = [array([[12., 2., 7.],
[2., 0., 4.],
[14., 2., 1.2]]),
array([[11., 3., 1.],
[1., -1., 1.],
[2., 3., 3.]])]
I have tried the following code but it was not successful:
out=[]
for a in data:
    for b in filter:
        for c in b:
            if a[:,1]==c:
                bb=a[:,c]
                out=np.append(data,bb)
I appreciate any help and contribution in advance.
You could use np.isin to create the masks and then filter using those masks:
import numpy as np
data = [np.array([[12., 2., 7.],
[9., -1., 0.],
[2., 0., 4.],
[14., 2., 1.2]]),
np.array([[11., 3., 1.],
[2., 5., 0.],
[1., -1., 1.],
[-8., 0., 6.],
[2., 3., 3.]])]
filter_data = [np.array([2., 0., 5.]),
np.array([3., -1.])]
# compute the masks
masks = [np.isin(arr[:, 1], filter_arr) for arr, filter_arr in zip(data, filter_data)]
# filter using the masks
result = [arr[mask] for arr, mask in zip(data, masks)]
for filtered_arr in result:
    print(filtered_arr)
Output
[[12. 2. 7. ]
[ 2. 0. 4. ]
[14. 2. 1.2]]
[[11. 3. 1.]
[ 1. -1. 1.]
[ 2. 3. 3.]]
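np.isin compares values exactly. If the second column comes out of a floating-point computation, exact comparison can miss values that are only approximately equal. The following is a minimal sketch of a tolerance-based variant; the helper name keep_rows_close and the atol default are assumptions made for illustration, not part of the answer above.

```python
import numpy as np

data = [np.array([[12., 2., 7.], [9., -1., 0.], [2., 0., 4.], [14., 2., 1.2]]),
        np.array([[11., 3., 1.], [2., 5., 0.], [1., -1., 1.], [-8., 0., 6.], [2., 3., 3.]])]
filter_data = [np.array([2., 0., 5.]), np.array([3., -1.])]

def keep_rows_close(arr, allowed, column=1, atol=1e-8):
    """Hypothetical helper: keep rows whose `column` value is within atol of any allowed value."""
    # Broadcast to an (n_rows, n_allowed) table of pairwise comparisons.
    close = np.isclose(arr[:, column][:, None], allowed[None, :], rtol=0.0, atol=atol)
    # A row is kept if it matches at least one allowed value.
    return arr[close.any(axis=1)]

result_close = [keep_rows_close(arr, allowed) for arr, allowed in zip(data, filter_data)]
```

For the exact values in this example it keeps the same rows as the np.isin version.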
I have 2 numpy arrays:
one of shape (753,8,1) denoting 8 sequential actions of a customer
and other of shape (753,10) denoting 10 features of a training sample.
How can I combine these two such that:
all 10 features are appended to each of the 8 sequential actions of a training sample, that is, the combined final array should have a shape of (753,8,11).
Maybe something like this:
import numpy as np
# create dummy arrays
a = np.zeros((753, 8, 1))
b = np.arange(753*10).reshape(753, 10)
# make a new axis for b and repeat the values along axis 1
c = np.repeat(b[:, np.newaxis, :], 8, axis=1)
c.shape
>>> (753, 8, 10)
# now the first two axes of a and c have the same shape
# append the values in c to a along the last axis
result = np.append(a, c, axis=2)
result.shape
>>> (753, 8, 11)
result[0]
>>> array([[0., 0., 1., 2., 3., 4., 5., 6., 7., 8., 9.],
[0., 0., 1., 2., 3., 4., 5., 6., 7., 8., 9.],
[0., 0., 1., 2., 3., 4., 5., 6., 7., 8., 9.],
[0., 0., 1., 2., 3., 4., 5., 6., 7., 8., 9.],
[0., 0., 1., 2., 3., 4., 5., 6., 7., 8., 9.],
[0., 0., 1., 2., 3., 4., 5., 6., 7., 8., 9.],
[0., 0., 1., 2., 3., 4., 5., 6., 7., 8., 9.],
[0., 0., 1., 2., 3., 4., 5., 6., 7., 8., 9.]])
# values from b (0-9) have been appended to a (0)
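As a possible variation, np.broadcast_to can stand in for np.repeat so that the intermediate (753, 8, 10) array is only a read-only view, and np.concatenate does the joining. This is a sketch under the same dummy data as above; the names b_view and combined are chosen here for illustration.

```python
import numpy as np

a = np.zeros((753, 8, 1))
b = np.arange(753 * 10).reshape(753, 10)

# Add a length-1 axis to b, then broadcast it along the 8 actions without copying.
b_view = np.broadcast_to(b[:, np.newaxis, :], (753, 8, 10))

# Join along the last axis; this is where the data is actually copied.
combined = np.concatenate([a, b_view], axis=2)
print(combined.shape)  # (753, 8, 11)
```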
There are very similar questions already answered, but I don't really understand how to apply them to my case precisely.
I have an array of matrices and an array of vectors and I need element-wise dot product. Illustration:
In [1]: matrix1 = np.eye(5)
In [2]: matrix2 = np.eye(5) * 5
In [3]: matrices = np.array((matrix1,matrix2))
In [4]: matrices
Out[4]:
array([[[ 1., 0., 0., 0., 0.],
[ 0., 1., 0., 0., 0.],
[ 0., 0., 1., 0., 0.],
[ 0., 0., 0., 1., 0.],
[ 0., 0., 0., 0., 1.]],
[[ 5., 0., 0., 0., 0.],
[ 0., 5., 0., 0., 0.],
[ 0., 0., 5., 0., 0.],
[ 0., 0., 0., 5., 0.],
[ 0., 0., 0., 0., 5.]]])
In [5]: vectors = np.ones((5,2))
In [6]: vectors
Out[6]:
array([[ 1., 1.],
[ 1., 1.],
[ 1., 1.],
[ 1., 1.],
[ 1., 1.]])
In [9]: np.array([m @ v for m,v in zip(matrices, vectors.T)]).T
Out[9]:
array([[ 1., 5.],
[ 1., 5.],
[ 1., 5.],
[ 1., 5.],
[ 1., 5.]])
This last line is my desired output. Unfortunately it is very inefficient. For instance, matrices @ vectors, which computes unwanted dot products due to broadcasting (if I understand correctly, it returns the first matrix dotted with both vectors and the second matrix dotted with both vectors), is actually faster.
I guess np.einsum or np.tensordot might be helpful here but all my attempts have failed:
In [30]: np.einsum("i,j", matrices, vectors)
ValueError: operand has more dimensions than subscripts given in einstein sum, but no '...' ellipsis provided to broadcast the extra dimensions.
In [34]: np.tensordot(matrices, vectors, axes=(0,1))
Out[34]:
array([[[ 6., 6., 6., 6., 6.],
[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.]],
[[ 0., 0., 0., 0., 0.],
[ 6., 6., 6., 6., 6.],
[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.]],
[[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.],
[ 6., 6., 6., 6., 6.],
[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.]],
[[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.],
[ 6., 6., 6., 6., 6.],
[ 0., 0., 0., 0., 0.]],
[[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.],
[ 6., 6., 6., 6., 6.]]])
NB: my real-case scenario uses more complicated matrices than matrix1 and matrix2.
With np.einsum, you might use:
np.einsum("ijk,ki->ji", matrices, vectors)
#array([[ 1., 5.],
# [ 1., 5.],
# [ 1., 5.],
# [ 1., 5.],
# [ 1., 5.]])
You can use @ as follows:
matrices @ vectors.T[..., None]
# array([[[ 1.],
# [ 1.],
# [ 1.],
# [ 1.],
# [ 1.]],
# [[ 5.],
# [ 5.],
# [ 5.],
# [ 5.],
# [ 5.]]])
As we can see, it computes the right values but arranges them in the wrong layout.
Therefore, squeeze out the trailing axis and transpose:
(matrices @ vectors.T[..., None]).squeeze().T
# array([[ 1., 5.],
# [ 1., 5.],
# [ 1., 5.],
# [ 1., 5.],
# [ 1., 5.]])
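Since the real matrices are more complicated than scaled identities, it may be worth checking that the approaches agree on random input. The sketch below assumes 4 random 5x5 matrices and one vector per matrix (stored as columns, as in the question). It uses squeeze(-1) rather than a bare squeeze() so that only the trailing axis is removed, even if another dimension happens to have length 1.

```python
import numpy as np

rng = np.random.default_rng(0)
matrices = rng.normal(size=(4, 5, 5))   # stack of 4 matrices
vectors = rng.normal(size=(5, 4))       # column i pairs with matrices[i]

# Reference: explicit Python loop over (matrix, vector) pairs.
loop = np.array([m @ v for m, v in zip(matrices, vectors.T)]).T

# einsum: i indexes the pair, k is summed over, j is the output row.
via_einsum = np.einsum("ijk,ki->ji", matrices, vectors)

# Batched matmul: turn each vector into a (5, 1) column, then drop that axis.
via_matmul = (matrices @ vectors.T[..., None]).squeeze(-1).T

print(np.allclose(loop, via_einsum), np.allclose(loop, via_matmul))
```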
I have an application where I have to process thousands of 2D arrays. The result for each processed array is based on half of a king's move neighborhood in the original array. I'm trying to avoid loops if I can because of speed considerations. Here is an example NumPy array:
array([[ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 1., 1., 1., 0., 0., 0., 0.],
[ 0., 0., 0., 1., 5., 5., 5., 5., 0., 0., 0.],
[ 0., 1., 5., 5., 1., 1., 1., 1., 1., 0., 0.],
[ 5., 5., 5., 5., 1., 5., 1., 1., 1., 1., 0.],
[ 1., 1., 1., 5., 1., 1., 5., 5., 1., 1., 0.],
[ 5., 1., 5., 1., 1., 5., 5., 5., 1., 5., 0.],
[ 0., 5., 1., 5., 1., 1., 1., 1., 1., 0., 0.],
[ 0., 0., 1., 1., 1., 1., 1., 1., 0., 0., 0.],
[ 0., 0., 0., 1., 5., 5., 5., 0., 0., 0., 0.]])
At each element, I want the sum of the cell right above it, the upper-right diagonal cell, the cell to the immediate right and the lower-right diagonal cell. So, for the element at [6][0] I would want to sum 1 + 1 + 1 + 5.
Of course, I also have to handle the edge cases where one of the 4 cells does not exist. I have started by padding with zeros on the top and far right to manage some of that, but I'm stuck right now. Any advice would be much appreciated!
What you're doing can be viewed as performing a convolution with a particular convolution kernel. Here's a solution using the scipy convolve2d function:
import numpy as np
import scipy as sp
import scipy.signal
x = np.random.randint(5,size=(10,10))
kernel = np.array([[0,1,1],[0,0,1],[0,0,1]])
kernel = np.fliplr(np.flipud(kernel))
check = sp.signal.convolve2d(x,kernel,mode='same')
print(x)
print(check)
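For comparison, the same four-neighbor sum can be sketched in plain NumPy with zero padding and shifted slices, which also handles the edges. The function name half_kings_sum is an assumption made for this example; the array is the one from the question.

```python
import numpy as np

x = np.array([
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 5, 5, 5, 5, 0, 0, 0],
    [0, 1, 5, 5, 1, 1, 1, 1, 1, 0, 0],
    [5, 5, 5, 5, 1, 5, 1, 1, 1, 1, 0],
    [1, 1, 1, 5, 1, 1, 5, 5, 1, 1, 0],
    [5, 1, 5, 1, 1, 5, 5, 5, 1, 5, 0],
    [0, 5, 1, 5, 1, 1, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 5, 5, 5, 0, 0, 0, 0],
], dtype=float)

def half_kings_sum(arr):
    """Hypothetical helper: sum of the up, up-right, right and down-right neighbors."""
    # One ring of zeros so that missing neighbors at the edges contribute 0.
    p = np.pad(arr, 1, mode="constant")
    up = p[:-2, 1:-1]        # arr[i-1, j]
    up_right = p[:-2, 2:]    # arr[i-1, j+1]
    right = p[1:-1, 2:]      # arr[i,   j+1]
    down_right = p[2:, 2:]   # arr[i+1, j+1]
    return up + up_right + right + down_right

sums = half_kings_sum(x)
print(sums[6, 0])  # 8.0, i.e. 1 + 1 + 1 + 5
```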
I'd like to calculate the "cumulative minimum" array--basically, the minimum value of an array up to each index such as:
import numpy as np
nums = np.array([5.,3.,4.,2.,1.,1.,2.,0.])
cumulative_min = np.zeros(nums.size, dtype=float)
for i,num in enumerate(nums):
    cumulative_min[i] = np.min(nums[0:i+1])
This works (it returns the correct array([ 5., 3., 3., 2., 1., 1., 1., 0.])), but I'd like to avoid the for loop if I can. I thought it might be faster to construct a 2-d array and use the np.amin() function, but I needed a loop for that as well.
For any 2-argument NumPy universal function, its accumulate method is the cumulative version of that function. Thus, numpy.minimum.accumulate is what you're looking for:
>>> import numpy
>>> numpy.minimum.accumulate([5,4,6,10,3])
array([5, 4, 4, 4, 3])
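Applied to the nums array from the question, this can be sketched as follows; the loop from the question is kept only as a cross-check.

```python
import numpy as np

nums = np.array([5., 3., 4., 2., 1., 1., 2., 0.])

# Running minimum in a single vectorized call.
cumulative_min = np.minimum.accumulate(nums)
print(cumulative_min)  # [5. 3. 3. 2. 1. 1. 1. 0.]

# Cross-check against the explicit loop from the question.
looped = np.array([np.min(nums[:i + 1]) for i in range(nums.size)])
```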
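Create a matrix whose lower triangle (np.tril) is filled with the values of your array nums and whose strict upper triangle (np.triu with second parameter 1, so the diagonal stays free) is filled with the first element of the array. The first element already belongs to every prefix, so using it as filler never changes a row's minimum. (EDIT: an earlier version used the maximum of the array as filler; as noted in the comments, the first element is the better choice.)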
nums = np.array([5.,3.,4.,2.,1.,1.,2.,0.])
oneSquare = np.ones((nums.size, nums.size))
A = nums * np.tril(oneSquare)
B = np.triu(oneSquare, 1) * nums[0]
A, B
Out:
(array([[ 5., 0., 0., 0., 0., 0., 0., 0.],
[ 5., 3., 0., 0., 0., 0., 0., 0.],
[ 5., 3., 4., 0., 0., 0., 0., 0.],
[ 5., 3., 4., 2., 0., 0., 0., 0.],
[ 5., 3., 4., 2., 1., 0., 0., 0.],
[ 5., 3., 4., 2., 1., 1., 0., 0.],
[ 5., 3., 4., 2., 1., 1., 2., 0.],
[ 5., 3., 4., 2., 1., 1., 2., 0.]]),
array([[ 0., 5., 5., 5., 5., 5., 5., 5.],
[ 0., 0., 5., 5., 5., 5., 5., 5.],
[ 0., 0., 0., 5., 5., 5., 5., 5.],
[ 0., 0., 0., 0., 5., 5., 5., 5.],
[ 0., 0., 0., 0., 0., 5., 5., 5.],
[ 0., 0., 0., 0., 0., 0., 5., 5.],
[ 0., 0., 0., 0., 0., 0., 0., 5.],
[ 0., 0., 0., 0., 0., 0., 0., 0.]]))
Now take the minimum of each row:
(A+B).min(axis=1)
Out:
array([ 5., 3., 3., 2., 1., 1., 1., 0.])
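Note that this construction builds n x n intermediate arrays, so its memory use grows quadratically with the length of nums. A small sketch comparing it with np.minimum.accumulate on random input; the wrapper name cummin_triangular and the sample size are assumptions for illustration.

```python
import numpy as np

def cummin_triangular(nums):
    """Hypothetical wrapper around the triangular-matrix construction above."""
    n = nums.size
    ones = np.ones((n, n))
    lower = nums * np.tril(ones)          # row i holds nums[0..i], zeros elsewhere
    upper = np.triu(ones, 1) * nums[0]    # filler: nums[0] above the diagonal
    return (lower + upper).min(axis=1)

rng = np.random.default_rng(0)
sample = rng.normal(size=200)

# Both approaches should give the same running minimum.
same = np.allclose(cummin_triangular(sample), np.minimum.accumulate(sample))
print(same)
```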