I have a question about finding maxima, or more precisely discontinuities, in a numpy array.
My example data looks like this:
a = np.array([3,4,5,8,7,6,5,4,1])
In general, I am interested in every maximum/jump in the data. For array a, I want to detect the 8, since it is a maximum (increasing values on its left and decreasing values on its right), and the 4, since the data drops sharply after it. So far I have used scipy.signal.argrelextrema
with np.greater to detect maxima, but I am not able to detect these jumps/discontinuities. For the data I am looking at, only jumps towards smaller values can occur, not the opposite. Is there an easy, pythonic way to detect these jumps?
Let's try this:
threshold = 1
a = np.array([3, 4, 5, 8, 7, 6, 5, 4, 1])
discontinuities_idx = np.where(abs(np.diff(a))>threshold)[0] + 1
np.diff(a) gives the difference between consecutive elements of a:
>>> array([ 1, 1, 3, -1, -1, -1, -1, -3])
Then np.where(abs(np.diff(a)) > threshold)[0] finds where the discontinuities are (absolute differences above the user-specified threshold). Finally, you may add +1 to compensate for the index shift introduced by the n=1 difference (see the np.diff keyword arguments), depending on which side of each discontinuity you want to index.
>>> discontinuities_idx
>>> array([3, 8])
>>> a[discontinuities_idx]
>>> array([8, 1])
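Since the question mentions that only jumps towards smaller values occur, you could also keep the sign and look only at large negative differences. A small sketch along the same lines, assuming the same example threshold of 1:

import numpy as np

a = np.array([3, 4, 5, 8, 7, 6, 5, 4, 1])
threshold = 1

# indices right after a drop larger than the threshold
drop_idx = np.where(np.diff(a) < -threshold)[0] + 1
print(drop_idx)     # [8]
print(a[drop_idx])  # [1]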
It sounds like mathematical analysis, where you define conditions like a'(x) > 0 or a'(x) < 0. You can express them as masks:
a = np.array([3,4,5,8,7,8,6,5,4,9,2,9,9,7])
mask1 = np.diff(a) > 0
mask2 = np.diff(a) < 0
>>> np.flatnonzero(mask1[:-1] & mask2[1:]) + 1
array([3, 5, 9], dtype=int64)
It returns the indices of the elements where a local maximum occurs.
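Applied to the array from the question, this picks up the peak at 8; the drop after the 4 would still need a separate threshold check on np.diff(a), as in the answer above. A quick sketch:

import numpy as np

a = np.array([3, 4, 5, 8, 7, 6, 5, 4, 1])
d = np.diff(a)
# positive slope followed by negative slope marks a local maximum
idx = np.flatnonzero((d[:-1] > 0) & (d[1:] < 0)) + 1
print(idx, a[idx])  # [3] [8]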
You can try this:
import numpy as np
import math
a = np.array([3,4,5,8,7,6,5,4,1])
MaxJump = np.diff(a)
print(MaxJump)
print(len(MaxJump))
MaxJump1 = []
for i in range(len(MaxJump)):
    MaxJump1.append(math.fabs(MaxJump[i]))
print(MaxJump1)
MaxJump3 = np.max(MaxJump1)
print(MaxJump3)
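The same maximum absolute jump can be computed in one vectorized line, without the intermediate list. A brief sketch:

import numpy as np

a = np.array([3, 4, 5, 8, 7, 6, 5, 4, 1])
# largest absolute difference between consecutive elements
max_jump = np.max(np.abs(np.diff(a)))
print(max_jump)  # 3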
Is there a way to copy values from one numpy masked array to another where ONLY the unmasked values are copied and the target values are left unchanged for masked source values? It seems like this should be handled automatically, but so far I haven't found a good way to do it. Right now I'm using ma.choose with the target region of the destination and the mask, but it really seems like there should be a better way given that the entire purpose of the masked array is to not operate on masked values automatically.
import numpy as np
from numpy import ma
x = ma.array([1, 2, 3, 4], mask=[0, 1, 1, 0])
y = np.array([5, 6, 7, 8])
You can copy only the unmasked values by indexing with the inverse of the mask:
y[~x.mask] = x[~x.mask]
which gives for y:
array([1, 6, 7, 4])
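If you prefer not to modify y in place, one alternative sketch (assuming a plain ndarray result is acceptable) is to build a new array with np.where on the mask:

import numpy as np
from numpy import ma

x = ma.array([1, 2, 3, 4], mask=[0, 1, 1, 0])
y = np.array([5, 6, 7, 8])

# take y where x is masked, otherwise take x's underlying data
z = np.where(x.mask, y, x.data)
print(z)  # [1 6 7 4]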
Consider two 1d numpy arrays.
import numpy as np
X = np.array([-43, 21, 4, 6, -1, 22, 8])
Y = np.array([13, 5, -12, 0])
I want to find the value(s) from X that have the minimum absolute difference with the value(s) from Y. In the example shown, the minimum absolute difference is 1, given by [[4, 5], [6, 5], [-1, 0]]. There are lots of resources on this site about finding minimum element of arrays, but that's not what I'm after.
For the present question, both starting arrays are 1d, though their sizes may differ. I'd also be interested, though, in tips about how to proceed if the starting arrays had different shapes. Is it simply a matter of flattening both and then proceeding as before?
You can calculate the absolute distance array and then find the minimum in that array. This method works for different X and Y lengths. If they are multi-dimensional, simply flatten them first (using X.flatten(), ...) and apply this solution to the flattened arrays:
If you want ALL pairs with minimum absolute distance:
#absolute distance between X and Y
dist = np.abs(X[:,None]-Y)
#elements of X with minimum absolute distance
X[np.where(dist==dist.min())[0]]
#corresponding elements of Y with minimum absolute distance
Y[np.where(dist==dist.min())[1]]
output:
[ 4 6 -1]
[5 5 0]
And if you want them in a single array format:
idx = np.where(dist==dist.min())
np.stack((X[idx[0]], Y[idx[1]])).T
[[ 4 5]
[ 6 5]
[-1 0]]
If you want only the first occurrence of the minimum absolute distance, a faster solution is:
X[dist.argmin()//Y.size]
Y[dist.argmin() % Y.size]
or equivalently another solution (which I think would be faster):
idx = np.unravel_index(np.argmin(dist), dist.shape)
X[idx[0]]
Y[idx[1]]
output:
4
5
Note: Another way of getting the absolute distance array is:
dist = np.abs(np.subtract.outer(X,Y))
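Note that the full distance matrix needs len(X) * len(Y) entries, which can get heavy for large arrays. If memory becomes an issue, a sorting-based sketch (not part of the answer above, just one option that should scale better) finds the minimum absolute difference with np.searchsorted:

import numpy as np

X = np.array([-43, 21, 4, 6, -1, 22, 8])
Y = np.array([13, 5, -12, 0])

# sort Y once, then compare each x only with its two nearest neighbours in sorted Y
Ys = np.sort(Y)
pos = np.searchsorted(Ys, X)
lo = np.clip(pos - 1, 0, len(Ys) - 1)
hi = np.clip(pos, 0, len(Ys) - 1)
d = np.minimum(np.abs(X - Ys[lo]), np.abs(X - Ys[hi]))
print(d.min())  # 1, the same minimum absolute difference as above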
Function argrelextrema from scipy.signal does not detect flat extrema.
Example:
import numpy as np
from scipy.signal import argrelextrema
data = np.array([ 0, 1, 2, 1, 0, 1, 3, 3, 1, 0 ])
argrelextrema(data, np.greater)
(array([2]),)
The first max (2) is detected; the second max (3, 3) is not.
Any workaround for this behaviour?
Thanks.
Short answer: Probably argrelextrema will not be flexible enough for your task. Consider writing your own function matching your needs.
Longer answer: Are you bound to use argrelextrema? If yes, then you can play around with the comparator and the order arguments of argrelextrema (see the reference).
For your simple example, it would be enough to choose np.greater_equal as the comparator.
>>> data = np.array([ 0, 1, 2, 1, 0, 1, 3, 3, 1, 0 ])
>>> print(argrelextrema(data, np.greater_equal,order=1))
(array([2, 6, 7]),)
Note however that in this way
>>> data = np.array([ 0, 1, 2, 1, 0, 1, 3, 3, 4, 1, 0 ])
>>> print(argrelextrema(data, np.greater_equal,order=1))
(array([2, 6, 8]),)
behaves differently than you would probably like, finding the first 3 and the 4 as maxima, since argrelextrema now sees everything as a maximum that is greater than or equal to its two nearest neighbors. You can now use the order argument to decide for how many neighbors this comparison must hold; choosing order=2 would change the example above to only find 4 as a maximum.
>>> print(argrelextrema(data, np.greater_equal,order=2))
(array([2, 8]),)
There is, however, a downside to this - let's change the data once more:
>>> data = np.array([ 0, 1, 2, 1, 0, 1, 3, 3, 4, 1, 5 ])
>>> print(argrelextrema(data, np.greater_equal,order=2))
(array([ 2, 10]),)
Adding another peak as the last value keeps you from finding the peak at 4, as argrelextrema now sees a second neighbor that is greater than 4 (which can be useful for noisy data, but is not necessarily the behavior expected in all cases).
Using argrelextrema, you will always be limited to binary operations between a fixed number of neighbors. Note, however, that all argrelextrema is doing in your example above is to return n, if data[n] > data[n-1] and data[n] > data[n+1]. You could easily implement this yourself, and then refine the rules, for example by checking the second neighbor in case that the first neighbor has the same value.
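A minimal sketch of such a hand-rolled rule (skipping over neighbours with the same value before comparing, as described above) could look like this:

import numpy as np

def flat_peaks(data):
    # indices n where data[n] is greater than the nearest non-equal neighbour on each side
    data = np.asarray(data)
    peaks = []
    for n in range(1, len(data) - 1):
        l = n - 1
        while l > 0 and data[l] == data[n]:          # walk left past equal values
            l -= 1
        r = n + 1
        while r < len(data) - 1 and data[r] == data[n]:  # walk right past equal values
            r += 1
        if data[n] > data[l] and data[n] > data[r]:
            peaks.append(n)
    return np.array(peaks)

data = np.array([0, 1, 2, 1, 0, 1, 3, 3, 1, 0])
print(flat_peaks(data))  # [2 6 7]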
For the sake of completeness, there seems to be a more elaborate function in scipy.signal, find_peaks_cwt. However, I have no experience using it and therefore cannot give you more details about it.
I'm really surprised that no one figured out an answer to this. All you need to do is preprocess the array to remove adjacent duplicates, and then you can run argrelextrema like so:
import numpy as np
from scipy.signal import argrelextrema
data = np.array([ 0, 1, 2, 1, 0, 1, 3, 3, 1, 0 ])
filter_table = [False] + list(np.equal(data[:-1], data[1:]))
data = np.array([x for idx, x in enumerate(data) if not filter_table[idx]])
argrelextrema(data, np.greater)
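Note that the indices returned this way refer to the de-duplicated array, not to the original data. If you need positions in the original array, one possible sketch is to remember which indices survive and map back through them:

import numpy as np
from scipy.signal import argrelextrema

data = np.array([0, 1, 2, 1, 0, 1, 3, 3, 1, 0])
# True where an element equals its left neighbour (these get dropped)
duplicate = np.concatenate(([False], data[:-1] == data[1:]))
kept = np.flatnonzero(~duplicate)                 # surviving original indices
peaks = argrelextrema(data[kept], np.greater)[0]  # indices into the de-duplicated array
print(kept[peaks])  # [2 6], positions in the original array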
I am trying to get the x and y coordinates of a given value in a numpy image array.
I can do it by running through the rows and columns manually with a for statement, but this seems rather slow and I am positive there is a better way to do this.
I was trying to modify a solution I found in this post: Finding the (x,y) indexes of specific (R,G,B) color values from images stored in NumPy ndarrays.
a = image
c = intensity_value
y_locs = np.where(np.all(a == c, axis=0))
x_locs = np.where(np.all(a == c, axis=1))
return np.int64(x_locs), np.int64(y_locs)
I use np.int64 to convert the values back to int64.
I was also looking at the numpy.where documentation.
I don't quite understand the problem. The axis parameter in all() should run over the colour channels (axis 2, or equivalently -1), not over the x and y indices. Then where() will give you the coordinates of the matching values in the image:
>>> # set up data
>>> image = np.zeros((5, 4, 3), dtype=int)
>>> image[2, 1, :] = [7, 6, 5]
>>> # find indices
>>> np.where(np.all(image == [7, 6, 5], axis=-1))
(array([2]), array([1]))
>>>
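If you want the matching positions as (row, column) pairs in a single array, np.argwhere on the same condition does that directly. A brief sketch using the setup from above:

import numpy as np

image = np.zeros((5, 4, 3), dtype=int)
image[2, 1, :] = [7, 6, 5]

# one (row, column) pair per pixel matching the colour
coords = np.argwhere(np.all(image == [7, 6, 5], axis=-1))
print(coords)  # [[2 1]]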
This is really just repeating the answer you linked to, but it is a bit too long for a comment. Maybe you could explain a bit more why you need to modify the previous answer? It doesn't seem like you need to.
To index the middle points of a numpy array, you can do this:
x = np.arange(10)
middle = x[len(x)//4:len(x)*3//4]
Is there a shorthand for indexing the middle of the array? E.g., the n or 2n elements closest to len(x)/2? Is there a nice n-dimensional version of this?
As cge said, the simplest way is to turn it into a lambda function, like so:
x = np.arange(10)
middle = lambda x: x[len(x)//4:len(x)*3//4]
or the n-dimensional way is:
middle = lambda x: x[tuple(slice(int(np.floor(d / 4)), int(np.ceil(3 * d / 4))) for d in x.shape)]
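For example, a quick check of the n-dimensional version on a small 2-D array:

import numpy as np

middle = lambda x: x[tuple(slice(int(np.floor(d / 4)), int(np.ceil(3 * d / 4))) for d in x.shape)]

x = np.arange(16).reshape(4, 4)
print(middle(x))
# [[ 5  6]
#  [ 9 10]]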
Late, but for everyone else running into this issue:
A much smoother way is to use numpy's take or put.
To address the middle of an array, you can use put to write to an n-dimensional array with a single flat index, and take to read from it in the same way.
Assuming your array has an odd number of elements, the middle of the array will be at half of its size. By using integer division (// instead of /) you won't get any problems here.
import numpy as np
arr = np.array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
# put a value to the center
np.put(arr, arr.size // 2, 999)
print(arr)
# take a value from the center
center = np.take(arr, arr.size // 2)
print(center)