So I have two numpy arrays of arrays
a = [[[1, 2, 3, 4], [3, 3, 3, 3], [4, 4, 4, 4]]]
b = [[[0, 0, 4, 0], [0, 0, 0, 0], [0, 1, 0, 1]]]
Both arrays are always of the same size.
The result should be like
c = [[[1, 2, 4, 4], [3, 3, 3, 3], [4, 1, 4, 1]]]
How can I do that in a very fast way in numpy?
Use numpy.where:
import numpy as np
a = np.array([[1, 2, 3, 4], [3, 3, 3, 3], [4, 4, 4, 4]])
b = np.array([[0, 0, 4, 0], [0, 0, 0, 0], [0, 1, 0, 1]])
res = np.where(b == 0, a, b)
print(res)
Output
[[1 2 4 4]
[3 3 3 3]
[4 1 4 1]]
For optimal speed, use b itself as the condition: nonzero entries count as True, so no explicit comparison is needed.
Instead of
np.where(b == 0, a, b)
# array([[1, 2, 4, 4],
# [3, 3, 3, 3],
# [4, 1, 4, 1]])
timeit(lambda:np.where(b==0,a,b))
# 2.6133874990046024
better do
np.where(b,b,a)
# array([[1, 2, 4, 4],
# [3, 3, 3, 3],
# [4, 1, 4, 1]])
timeit(lambda:np.where(b,b,a))
# 1.5850481310044415
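To check that the two forms agree and to time them on your own machine, something along these lines may help. This is only a sketch: the repetition count is an arbitrary assumption, the timings will differ from the numbers quoted above, and the np.copyto variant is an extra in-place option that was not part of the answers.

```python
import timeit

import numpy as np

a = np.array([[1, 2, 3, 4], [3, 3, 3, 3], [4, 4, 4, 4]])
b = np.array([[0, 0, 4, 0], [0, 0, 0, 0], [0, 1, 0, 1]])

# Both forms pick b where it is nonzero and a elsewhere.
res_eq = np.where(b == 0, a, b)
res_direct = np.where(b, b, a)

# In-place variant (an assumption, not from the answers): overwrite a copy
# of a with b wherever b is nonzero.
res_inplace = a.copy()
np.copyto(res_inplace, b, where=b != 0)

# Rough timing; the repeat count is arbitrary.
t_eq = timeit.timeit(lambda: np.where(b == 0, a, b), number=10_000)
t_direct = timeit.timeit(lambda: np.where(b, b, a), number=10_000)
print(res_direct)
print(f"b == 0: {t_eq:.4f}s, b directly: {t_direct:.4f}s")
```

If a itself may be overwritten, calling np.copyto on a directly avoids allocating a result array.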
I have a large matrix where I want to permute (or shift) one row of it.
For example:
np.array([[1, 2, 3, 4],
[1, 2, 3, 4],
[1, 2, 3, 4],
[1, 2, 3, 4]])
And the desired shifting output is: (for the second row by 1, for that example)
np.array([[1, 2, 3, 4],
[2, 3, 4, 1],
[1, 2, 3, 4],
[1, 2, 3, 4]])
This can be done naively by extracting the row of interest, permuting it, and sticking it back into the matrix.
I want a better solution that is in-place and efficient.
How to shift desired row or column by n places?
How to permute (change the order as desired)?
Can this be done efficiently for more than 1 row? for example shift the i'th row i places forward:
np.array([[1, 2, 3, 4],
[2, 3, 4, 1],
[3, 4, 1, 2],
[4, 1, 2, 3]])
You can do it by indexing the rows you want and rolling them:
import numpy as np
a = np.array([[1, 2, 3, 4],
[1, 2, 3, 4],
[1, 2, 3, 4],
[1, 2, 3, 4]])
shift = 2
rows = [1, 3]
a[rows] = np.roll(a[rows], shift, axis=1)
array([[1, 2, 3, 4],
[3, 4, 1, 2],
[1, 2, 3, 4],
[3, 4, 1, 2]])
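Note that a positive shift in np.roll moves elements to the right, so the left shift shown in the question corresponds to a negative shift. For the last part of the question (a different shift for every row), one possible approach is to build the column indices with broadcasting and use advanced indexing. The helper name roll_rows_left is made up for this sketch, and it returns a new array rather than working in place.

```python
import numpy as np


def roll_rows_left(a, shifts):
    """Return a copy of 2-D array `a` with row i rolled left by shifts[i]."""
    a = np.asarray(a)
    shifts = np.asarray(shifts)
    n_rows, n_cols = a.shape
    # Column to read for every output position, wrapped around with modulo.
    cols = (np.arange(n_cols)[np.newaxis, :] + shifts[:, np.newaxis]) % n_cols
    rows = np.arange(n_rows)[:, np.newaxis]
    return a[rows, cols]


a = np.array([[1, 2, 3, 4]] * 4)
shifted = roll_rows_left(a, [0, 1, 2, 3])
print(shifted)
# [[1 2 3 4]
#  [2 3 4 1]
#  [3 4 1 2]
#  [4 1 2 3]]
```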
Suppose I have the following array:
a = [[1, 4, 2, 3]
[3, 1, 5, 4]
[4, 3, 1, 2]]
What I'd like to do is impose a maximum value on the array, but have that maximum vary by row. For instance if I wanted to limit the 1st and 3rd row to a maximum value of 3, and the 2nd row to a value of 4, I could create something like:
[[1, 3, 2, 3]
[3, 1, 4, 4]
[3, 3, 1, 2]]
Is there any better way than just looping over each row individually and setting it with 'nonzero'?
With numpy.clip (using the method version here):
a.clip(max=np.array([3, 4, 3])[:, None]) # np.clip(a, ...)
# array([[1, 3, 2, 3],
# [3, 1, 4, 4],
# [3, 3, 1, 2]])
Generalized:
def clip_2d_rows(a, maxs):
    maxs = np.asanyarray(maxs)
    if maxs.ndim == 1:
        # Turn a 1-D list of per-row limits into a column so it broadcasts across each row.
        maxs = maxs[:, np.newaxis]
    return np.clip(a, a_min=None, a_max=maxs)
You might be safer using the module-level function (np.clip) rather than the array method (np.ndarray.clip). The former takes a_max as its parameter, while the latter names the parameter max, which shadows the builtin and is never a great idea.
With masking -
In [50]: row_lims = np.array([3,4,3])
In [51]: np.where(a > row_lims[:,None], row_lims[:,None], a)
Out[51]:
array([[1, 3, 2, 3],
[3, 1, 4, 4],
[3, 3, 1, 2]])
With
>>> a
array([[1, 4, 2, 3],
[3, 1, 5, 4],
[4, 3, 1, 2]])
Say you have
>>> maxs = np.array([[3],[4],[3]])
>>> maxs
array([[3],
[4],
[3]])
What about doing
>>> a.clip(max=maxs)
array([[1, 3, 2, 3],
[3, 1, 4, 4],
[3, 3, 1, 2]])
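Since only an upper bound is needed here, np.minimum with a column of per-row limits should give the same result as the clip and where versions above. A small sketch, with row_max as an illustrative name and an in-place variant via out= added as an assumption:

```python
import numpy as np

a = np.array([[1, 4, 2, 3],
              [3, 1, 5, 4],
              [4, 3, 1, 2]])
row_max = np.array([3, 4, 3])

# row_max[:, None] has shape (3, 1), so it broadcasts across each row's columns.
capped = np.minimum(a, row_max[:, None])

# In-place variant: write the result back into the same array via out=.
capped_inplace = a.copy()
np.minimum(capped_inplace, row_max[:, None], out=capped_inplace)

print(capped)
# [[1 3 2 3]
#  [3 1 4 4]
#  [3 3 1 2]]
```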
I have an array of numpy arrays:
a = [[1, 2, 3, 4], [1, 2, 3, 5], [2, 5, 4, 3], [5, 2, 3, 1]]
I need to find and remove a particular list from a:
rem = [1,2,3,5]
numpy.delete(a,rem) does not return the correct results. I need to be able to return:
[[1, 2, 3, 4], [2, 5, 4, 3], [5, 2, 3, 1]]
is this possible with numpy?
A list comprehension can achieve this.
rem = [1,2,3,5]
a = [[1, 2, 3, 4], [1, 2, 3, 5], [2, 5, 4, 3], [5, 2, 3, 1]]
a = [x for x in a if x != rem]
outputs
[[1, 2, 3, 4], [2, 5, 4, 3], [5, 2, 3, 1]]
Numpy arrays do not support random deletion by element. Similar to strings in Python, you need to generate a new array to delete a single or multiple sub elements.
Given:
>>> a
array([[1, 2, 3, 4],
[1, 2, 3, 5],
[2, 5, 4, 3],
[5, 2, 3, 1]])
>>> rem
array([1, 2, 3, 5])
You can keep each non-matching subarray and create a new array from those:
>>> a=np.array([sa for sa in a if not np.all(sa==rem)])
>>> a
array([[1, 2, 3, 4],
[2, 5, 4, 3],
[5, 2, 3, 1]])
To use np.delete, you would use an index and not a match, so:
>>> a
array([[1, 2, 3, 4],
[1, 2, 3, 5],
[2, 5, 4, 3],
[5, 2, 3, 1]])
>>> np.delete(a, 1, 0) # delete element 1, axis 0
array([[1, 2, 3, 4],
[2, 5, 4, 3],
[5, 2, 3, 1]])
But you can't loop over the array and delete elements...
You can pass multiple elements to np.delete however and you just need to match sub elements:
>>> a
array([[1, 2, 3, 5],
[1, 2, 3, 5],
[2, 5, 4, 3],
[5, 2, 3, 1]])
>>> np.delete(a, [i for i, sa in enumerate(a) if np.all(sa==rem)], 0)
array([[2, 5, 4, 3],
[5, 2, 3, 1]])
And given that same a, you can have an all numpy solution by using np.where:
>>> np.delete(a, np.where((a == rem).all(axis=1)), 0)
array([[2, 5, 4, 3],
[5, 2, 3, 1]])
Did you try list remove?
In [84]: a = [[1, 2, 3, 4], [1, 2, 3, 5], [2, 5, 4, 3], [5, 2, 3, 1]]
In [85]: a
Out[85]: [[1, 2, 3, 4], [1, 2, 3, 5], [2, 5, 4, 3], [5, 2, 3, 1]]
In [86]: rem = [1,2,3,5]
In [87]: a.remove(rem)
In [88]: a
Out[88]: [[1, 2, 3, 4], [2, 5, 4, 3], [5, 2, 3, 1]]
remove matches on value.
np.delete works with an index, not value. Also it returns a copy; it does not act in place. And the result is an array, not a nested list (np.delete converts the input to an array before operating on it).
In [92]: a = [[1, 2, 3, 4], [1, 2, 3, 5], [2, 5, 4, 3], [5, 2, 3, 1]]
In [93]: a1=np.delete(a,1, axis=0)
In [94]: a1
Out[94]:
array([[1, 2, 3, 4],
[2, 5, 4, 3],
[5, 2, 3, 1]])
This is more like list pop:
In [96]: a = [[1, 2, 3, 4], [1, 2, 3, 5], [2, 5, 4, 3], [5, 2, 3, 1]]
In [97]: a.pop(1)
Out[97]: [1, 2, 3, 5]
In [98]: a
Out[98]: [[1, 2, 3, 4], [2, 5, 4, 3], [5, 2, 3, 1]]
To delete by value you first need to find the index of the desired row. With integer arrays that's not too hard. With floats it is trickier.
=========
But you don't need to use delete to do this in numpy; boolean indexing works:
In [119]: a = [[1, 2, 3, 4], [1, 2, 3, 5], [2, 5, 4, 3], [5, 2, 3, 1]]
In [120]: A = np.array(a) # got to work with array, not list
In [121]: rem=np.array([1,2,3,5])
Simple comparison; rem is broadcast to match the rows
In [122]: A==rem
Out[122]:
array([[ True, True, True, False],
[ True, True, True, True],
[False, False, False, False],
[False, True, True, False]], dtype=bool)
find the row where all elements match - this is the one we want to remove
In [123]: (A==rem).all(axis=1)
Out[123]: array([False, True, False, False], dtype=bool)
Negate it with ~ and use the result to index A:
In [124]: A[~(A==rem).all(axis=1),:]
Out[124]:
array([[1, 2, 3, 4],
[2, 5, 4, 3],
[5, 2, 3, 1]])
(the original A is not changed).
np.where can be used to convert the boolean (or its inverse) to indices. Sometimes that's handy, but usually it isn't required.
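The boolean-indexing steps above can be wrapped in a small helper. This is a sketch only: the name remove_rows and the tol parameter are invented for illustration, and using np.isclose is one possible way to handle the float case mentioned earlier, not something taken from the answers.

```python
import numpy as np


def remove_rows(a, rem, tol=None):
    """Return a new array without the rows of `a` that equal `rem`.

    With tol=None rows are compared exactly; otherwise elements are
    considered equal when they differ by at most `tol` (absolute).
    """
    a = np.asarray(a)
    rem = np.asarray(rem)
    if tol is None:
        match = (a == rem).all(axis=1)
    else:
        match = np.isclose(a, rem, rtol=0.0, atol=tol).all(axis=1)
    return a[~match]


A = np.array([[1, 2, 3, 4], [1, 2, 3, 5], [2, 5, 4, 3], [5, 2, 3, 1]])
kept = remove_rows(A, [1, 2, 3, 5])
print(kept)

# Float rows: 0.1 + 0.2 is not exactly 0.3, so an exact comparison misses it.
F = np.array([[0.1 + 0.2, 1.0], [0.5, 2.0]])
kept_exact = remove_rows(F, [0.3, 1.0])
kept_tol = remove_rows(F, [0.3, 1.0], tol=1e-9)
```

As with the indexing expression above, the input array is left unchanged.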
I have a large 2d array of vectors. I want to split this array into several arrays according to one of the vectors' elements or dimensions. Each run of consecutive rows with the same value in that column should become its own smaller array. For example, considering the third dimension or column:
orig = np.array([[1, 2, 3],
[3, 4, 3],
[5, 6, 4],
[7, 8, 4],
[9, 0, 4],
[8, 7, 3],
[6, 5, 3]])
I want to turn into three arrays consisting of rows 1,2 and 3,4,5 and 6,7:
>>> a
array([[1, 2, 3],
[3, 4, 3]])
>>> b
array([[5, 6, 4],
[7, 8, 4],
[9, 0, 4]])
>>> c
array([[8, 7, 3],
[6, 5, 3]])
I'm new to python and numpy. Any help would be greatly appreciated.
Regards
Mat
Edit: I reformatted the arrays to clarify the problem
Using np.split:
>>> a, b, c = np.split(orig, np.where(orig[:-1, 2] != orig[1:, 2])[0]+1)
>>> a
array([[1, 2, 3],
[3, 4, 3]])
>>> b
array([[5, 6, 4],
[7, 8, 4],
[9, 0, 4]])
>>> c
array([[8, 7, 3],
[6, 5, 3]])
Nothing fancy here, but this good old-fashioned loop should do the trick
import numpy as np
a = np.array([[1, 2, 3],
[1, 2, 3],
[1, 2, 4],
[1, 2, 4],
[1, 2, 4],
[1, 2, 3],
[1, 2, 3]])
groups = []
rows = a[0:1]  # start the first group as a 2-D slice holding the first row
prev = a[0][-1]  # grouping is assumed to be on the last column; change the index if that is not the case
for row in a[1:]:
    if row[-1] == prev:
        rows = np.vstack((rows, row))
    else:
        groups.append(rows)
        rows = row[np.newaxis]  # start a new 2-D group with this row
    prev = row[-1]
groups.append(rows)
print(groups)
## [array([[1, 2, 3],
## [1, 2, 3]]),
## array([[1, 2, 4],
## [1, 2, 4],
## [1, 2, 4]]),
## array([[1, 2, 3],
## [1, 2, 3]])]
If a looks like this:
array([[1, 1, 2, 3],
[2, 1, 2, 3],
[3, 1, 2, 4],
[4, 1, 2, 4],
[5, 1, 2, 4],
[6, 1, 2, 3],
[7, 1, 2, 3]])
then this
col = a[:, -1]
indices = np.where(col[:-1] != col[1:])[0] + 1
indices = np.concatenate(([0], indices, [len(a)]))
res = [a[start:end] for start, end in zip(indices[:-1], indices[1:])]
print(res)
results in:
[array([[1, 1, 2, 3],
[2, 1, 2, 3]]), array([[3, 1, 2, 4],
[4, 1, 2, 4],
[5, 1, 2, 4]]), array([[6, 1, 2, 3],
[7, 1, 2, 3]])]
Update: np.split() is much nicer. No need to add first and last index:
col = a[:, -1]
indices = np.where(col[:-1] != col[1:])[0] + 1
res = np.split(a, indices)
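Put together as a reusable function and run on the orig array from the question, the np.split approach might look like the sketch below. The function name split_on_change and its col parameter are made up for illustration.

```python
import numpy as np


def split_on_change(arr, col=-1):
    """Split a 2-D array into runs of consecutive rows sharing arr[:, col]."""
    arr = np.asarray(arr)
    key = arr[:, col]
    # Index of every row whose key differs from the row before it.
    boundaries = np.flatnonzero(key[1:] != key[:-1]) + 1
    return np.split(arr, boundaries)


orig = np.array([[1, 2, 3],
                 [3, 4, 3],
                 [5, 6, 4],
                 [7, 8, 4],
                 [9, 0, 4],
                 [8, 7, 3],
                 [6, 5, 3]])

parts = split_on_change(orig, col=2)
for part in parts:
    print(part.tolist())
# [[1, 2, 3], [3, 4, 3]]
# [[5, 6, 4], [7, 8, 4], [9, 0, 4]]
# [[8, 7, 3], [6, 5, 3]]
```

The number of groups depends on the data, so keeping the result as a list is usually safer than unpacking it into a fixed number of names.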