I have a 2d numpy array and a 2d numpy subarray that I want to add to the original array based on a condition.
I know that you can add the 2d subarray to the array like this:
original_array[start_x:end_x, start_y:end_y] = sub_array
but I don't know how to efficiently add only the values of sub_array that are greater than 0.
Example:
original_array = np.array([[2, 2], [2, 2], [2, 2], [2, 2]])
sub_array = np.array([[0, 0], [1, 1], [0, 1], [0, 0]])
expected_result = np.array([[2, 2], [1, 1], [2, 1], [2, 2]])
You can build boolean masks from the condition and add the two masked arrays:
original_array * (sub_array <= 0) + sub_array * (sub_array > 0)
array([[2, 2],
       [1, 1],
       [2, 1],
       [2, 2]])
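For reference, a runnable version of this approach using the example arrays from the question:

```python
import numpy as np

original_array = np.array([[2, 2], [2, 2], [2, 2], [2, 2]])
sub_array = np.array([[0, 0], [1, 1], [0, 1], [0, 0]])

# Where sub_array > 0 take sub_array's value; elsewhere keep the original.
result = original_array * (sub_array <= 0) + sub_array * (sub_array > 0)
print(result)
# [[2 2]
#  [1 1]
#  [2 1]
#  [2 2]]
```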
Another approach is to use the np.where function as:
np.where(sub_array > 0, sub_array, original_array)
Output:
array([[2, 2],
       [1, 1],
       [2, 1],
       [2, 2]])
Try a boolean mask. Note that np.select([sub_array > 0], [sub_array]) fills the cells where the condition is false with 0, so assigning its result back would wipe out the original values there. Masked assignment avoids that, and since slicing returns a view, it updates original_array in place:
mask = sub_array > 0
original_array[start_x:end_x, start_y:end_y][mask] = sub_array[mask]
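A concrete sketch of masked assignment into a slice, with the offsets made up for illustration (the slice is a view, so the assignment writes through to the original array):

```python
import numpy as np

original_array = np.array([[2, 2], [2, 2], [2, 2], [2, 2]])
sub_array = np.array([[1, 1], [0, 1]])

# Hypothetical placement of the sub-array inside the original.
start_x, end_x, start_y, end_y = 1, 3, 0, 2

mask = sub_array > 0
# Only the positive entries of sub_array overwrite the original values.
original_array[start_x:end_x, start_y:end_y][mask] = sub_array[mask]
print(original_array)
# [[2 2]
#  [1 1]
#  [2 1]
#  [2 2]]
```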
Related
I have a numpy array with shape (100,9,17,2). I need to transform it to (100,15,17,2) adding zeros in missing cells. I created a zeros(100,6,17,2) but I can't merge them. Can you help me?
Use numpy's concatenate function.
result = np.concatenate([input_array, np.zeros((100, 6, 17, 2), dtype=input_array.dtype)], axis=1)
The axis argument specifies the dimension along which the arrays are joined.
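A quick shape check of that one-liner, using an array of ones as a stand-in for the real data:

```python
import numpy as np

# Stand-in for the real (100, 9, 17, 2) input.
input_array = np.ones((100, 9, 17, 2))

zeros = np.zeros((100, 6, 17, 2), dtype=input_array.dtype)
result = np.concatenate([input_array, zeros], axis=1)
print(result.shape)  # (100, 15, 17, 2)
```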
import numpy as np
a = np.array([[1, 2], [2, 2]])
b = np.array([[1, 2], [2, 2], [1, 2]])
print(a.shape, b.shape)
output: (2, 2) (3, 2)
Now, to concatenate both arrays, specify the axis as well:
c = np.concatenate((a,b), axis=0)
output: array([[1, 2],
               [2, 2],
               [1, 2],
               [2, 2],
               [1, 2]])
Very simple: you create an array of zeros using the reference shape, then insert the existing array (call it a, with shape (100, 9, 17, 2)) into the leading slots of the padded axis:
c = np.zeros((100, 15, 17, 2), dtype=a.dtype)
c[:, :a.shape[1], :, :] = a
c now contains the original values with zeros in the missing cells.
You can use numpy.apply_along_axis() and numpy.pad():
new_array = np.apply_along_axis(lambda x: np.pad(x,(0,6)), 1, input_array)
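As an aside, np.pad can also pad the whole array in one call by giving a (before, after) pair per axis, which avoids the Python-level loop inside apply_along_axis. A sketch with stand-in data:

```python
import numpy as np

# Stand-in for the real (100, 9, 17, 2) input.
input_array = np.ones((100, 9, 17, 2))

# Pad only axis 1: 0 cells before, 6 after. The default mode pads with zeros.
new_array = np.pad(input_array, ((0, 0), (0, 6), (0, 0), (0, 0)))
print(new_array.shape)  # (100, 15, 17, 2)
```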
I for example have the numpy array:
array1 = [[1,2, 3], [2,4, 0], [1,2, 9]]
Then I only want to select the rows which don't end with a 0 like:
[[1,2,3], [1,2,9]]
I tried to do this by:
return array1[array1[-1] != 0]
But I get a dimension error
boolean index did not match indexed array along dimension 0; dimension is 8 but corresponding boolean dimension is 5
I find this illogical: when I had an array of the form [1, 2, 3, 4, 0] and wanted to select the nonzero elements, I could just do array1[array1 != 0], and to me both bits of code seem to use the same logic. What am I doing wrong?
I think there is a mistake on your 'return' line.
Try with this:
return array1[array1[:, -1] != 0]
First, be sure that you are using an np.array. Then you should use proper indexing: [:, -1], not [-1].
array1 = np.array([[1, 2, 3], [2, 4, 0], [1, 2, 9]])
array1[array1[:, -1] != 0]
Output:
array([[1, 2, 3],
       [1, 2, 9]])
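To see why the two indexings differ, compare the masks they produce:

```python
import numpy as np

array1 = np.array([[1, 2, 3], [2, 4, 0], [1, 2, 9]])

# array1[-1] is the last ROW, so the mask has one entry per column:
print(array1[-1] != 0)     # [ True  True  True]

# array1[:, -1] is the last COLUMN, one entry per row -- the mask we need:
print(array1[:, -1] != 0)  # [ True False  True]
```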
Say I have the following numpy array:
arr = numpy.array([[0, 0], [1, 0], [2, 0], [3, 0]])
How do I add a single sub-array to each of the four sub-arrays? (Say I want to add [2, 1] to each of them; then the output should be [[2, 1], [3, 1], [4, 1], [5, 1]].)
I know that if it's a 1D array you can just write something like arr + 1 and it will add 1 to each element in arr, but what about in this case? I have yet to find relevant information in the documentation.
arr = np.array([item + [2, 1] for item in arr])
This should give you the result; item + [2, 1] adds element-wise, because each item is a NumPy row.
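Broadcasting makes the loop unnecessary: adding a 1-D array to a 2-D array adds it to every row. A minimal sketch:

```python
import numpy as np

arr = np.array([[0, 0], [1, 0], [2, 0], [3, 0]])

# [2, 1] is broadcast across all rows of arr.
result = arr + np.array([2, 1])
print(result)
# [[2 1]
#  [3 1]
#  [4 1]
#  [5 1]]
```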
I have a NumPy array with each row representing some (x, y, z) coordinate like so:
a = array([[0, 0, 1],
           [1, 1, 2],
           [4, 5, 1],
           [4, 5, 2]])
I also have another NumPy array with unique values of the z-coordinates of that array like so:
b = array([1, 2])
How can I apply a function, let's call it "f", to each of the groups of rows in a which correspond to the values in b? For example, the first value of b is 1 so I would get all rows of a which have a 1 in the z-coordinate. Then, I apply a function to all those values.
In the end, the output would be an array the same shape as b.
I'm trying to vectorize this to make it as fast as possible. Thanks!
Example of an expected output (assuming that f is count()):
c = array([2, 2])
because there are 2 rows in array a which have a z value of 1 in array b and also 2 rows in array a which have a z value of 2 in array b.
A trivial solution would be to iterate over array b like so:
for val in b:
    apply function to a based on val
    append to an array c
My attempt:
I tried doing something like this, but it just returns an empty array.
func(a[a[:, 2]==b])
The problem is that groups of rows with the same z can have different sizes, so you cannot stack them into one 3D numpy array and simply apply a function along the third dimension. One solution is a for-loop; another is np.split:
a = np.array([[0, 0, 1],
              [1, 1, 2],
              [4, 5, 1],
              [4, 5, 2],
              [4, 3, 1]])
a_sorted = a[a[:,2].argsort()]
inds = np.unique(a_sorted[:,2], return_index=True)[1]
a_split = np.split(a_sorted, inds)[1:]
# [array([[0, 0, 1],
#         [4, 5, 1],
#         [4, 3, 1]]),
#  array([[1, 1, 2],
#         [4, 5, 2]])]
f = np.sum # example of a function
result = list(map(f, a_split))
# [19, 15]
But imho the best solution is to use pandas and groupby as suggested by FBruzzesi. You can then convert the result to a numpy array.
EDIT: For completeness, here are the other two solutions
List comprehension:
b = np.unique(a[:,2])
result = [f(a[a[:,2] == z]) for z in b]
Pandas:
df = pd.DataFrame(a, columns=list('XYZ'))
result = df.groupby(['Z']).apply(lambda x: f(x.values)).tolist()
This is the performance plot I got for a = np.random.randint(0, 100, (n, 3)):
As you can see, approximately up to n = 10^5 the "split solution" is the fastest, but after that the pandas solution performs better.
If you are allowed to use pandas:
import pandas as pd
df = pd.DataFrame(a, columns=['x', 'y', 'z'])
df.groupby('z').agg(f)
Here f can be any custom function working on grouped data.
Numeric example:
a = np.array([[0, 0, 1],
              [1, 1, 2],
              [4, 5, 1],
              [4, 5, 2]])
df = pd.DataFrame(a, columns=['x', 'y', 'z'])
df.groupby('z').size()
z
1    2
2    2
dtype: int64
Note that .size is the way to count the number of rows per group.
To keep it in pure NumPy, maybe this can suit your case (it stacks cleanly here only because both groups have the same number of rows):
tmp = np.array([a[a[:,2]==i] for i in b])
tmp
array([[[0, 0, 1],
        [4, 5, 1]],

       [[1, 1, 2],
        [4, 5, 2]]])
which is an array containing one sub-array per group.
c = np.array([])
for x in np.nditer(b):
    c = np.append(c, np.where(a[:, 2] == x)[0].shape[0])
Output:
[2. 2.]
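For the counting case specifically, np.unique with return_counts=True replaces the loop entirely (this assumes b is exactly the set of distinct z values in a, as in the example):

```python
import numpy as np

a = np.array([[0, 0, 1],
              [1, 1, 2],
              [4, 5, 1],
              [4, 5, 2]])

# One call gives the distinct z values and how many rows carry each.
values, counts = np.unique(a[:, 2], return_counts=True)
print(values)  # [1 2]
print(counts)  # [2 2]
```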
So I have a numpy matrix, such as:
[[1, 2, 3],
 [0, 59, 2],
 [54, 4, 2]]
Now I want to find the indices of all the values greater than 50 (not just the maximum).
This should give me [1, 1] and [2, 0].
Other than iterating through, checking each value, keeping track of the indices that satisfy the condition, and returning those pairs, how do you do this efficiently?
You want to use either numpy.where or numpy.argwhere:
import numpy as np
A = np.array([[99, 2, 3],
              [0, 59, 2],
              [54, 4, 2]])
np.where(A > 50)
# (array([0, 1, 2]), array([0, 1, 0]))
np.argwhere(A > 50)
# array([[0, 0],
#        [1, 1],
#        [2, 0]])
How about a list comprehension?
ind = [[i, j] for i in range(A.shape[0]) for j in range(A.shape[1]) if A[i, j] > 50]
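A runnable check of the comprehension approach with the matrix from the question:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [0, 59, 2],
              [54, 4, 2]])

# Collect the [row, col] pairs whose values exceed 50.
ind = [[i, j] for i in range(A.shape[0]) for j in range(A.shape[1]) if A[i, j] > 50]
print(ind)  # [[1, 1], [2, 0]]
```

Note this runs a Python-level loop over every cell, so np.argwhere is preferable for large arrays.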