Irregular slicing in NumPy - Python

Consider the NumPy array below
a = np.arange(20)
and the slicing requirement given below
b = np.array([[0, 4],
              [4, 9],
              [9, 15],
              [15, 19]])
How can I slice a based on the irregular slicing information in b? For example, something like:
np.mean(a[b[:,0]:b[:,1]])
I know how to achieve this with a loop statement, like
[np.mean(a[b[_][0]:b[_][1]]) for _ in range(len(b))]
but is there a way to avoid using loops?

You can use np.add.reduceat with flattened b as the indices:
np.add.reduceat(a, np.ravel(b))[::2]/np.diff(b, axis=1).ravel()
# array([ 1.5, 6. , 11.5, 16.5])
For comparison, the for-loop version:
[np.mean(a[b[_][0]:b[_][1]]) for _ in range(len(b))]
# [1.5, 6.0, 11.5, 16.5]
For more, see the first example in help(np.add.reduceat):
Examples
--------
To take the running sum of four successive values:
>>> np.add.reduceat(np.arange(8),[0,4, 1,5, 2,6, 3,7])[::2]
array([ 6, 10, 14, 18])

Let's try np.split.
>>> list(map(np.mean, np.split(a, b[:, 1])))
[1.5, 6.0, 11.5, 16.5, 19.0]
Using a list comprehension:
>>> [np.mean(x) for x in np.split(a, b[:, 1])]
[1.5, 6.0, 11.5, 16.5, 19.0]
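Note that splitting at the stop indices produces one extra trailing piece (a[19:] here, hence the final 19.0). If only the four requested ranges matter, the last piece can simply be dropped; a sketch, assuming b is a NumPy array:

```python
import numpy as np

a = np.arange(20)
b = np.array([[0, 4], [4, 9], [9, 15], [15, 19]])

# np.split at the stop indices returns len(b) + 1 pieces;
# the last piece lies past the final range, so discard it.
means = [float(np.mean(x)) for x in np.split(a, b[:, 1])[:-1]]
print(means)  # [1.5, 6.0, 11.5, 16.5]
```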

Using cumsum and np.diff
c = b[:, 1]
np.diff(np.append(0, a.cumsum()[c - 1])) / np.diff(np.append(0, c))
array([ 1.5, 6. , 11.5, 16.5])
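If the ranges in b were not contiguous (i.e. each start is not the previous stop), a prepended cumulative sum still gives all the slice sums in one shot; a sketch of that generalization:

```python
import numpy as np

a = np.arange(20)
b = np.array([[0, 4], [4, 9], [9, 15], [15, 19]])

# cs[i] == a[:i].sum(), so any slice sum a[start:stop].sum()
# is cs[stop] - cs[start], even for overlapping ranges.
cs = np.append(0, a.cumsum())
sums = cs[b[:, 1]] - cs[b[:, 0]]
means = sums / (b[:, 1] - b[:, 0])
# means: 1.5, 6.0, 11.5, 16.5
```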

Related

Conditional Numpy slicing and append to another array

I have two numpy arrays in Python.
vec_1 = np.array([2.3, 1.4, 7.3, 1.8, 0, 0, 0])
vec_2 = np.array([29, 7, 5.8, 2.4, 6.7, 5, 8])
I want a slice from vec_1 that consists of all the trailing 0's (except the last one) plus the preceding non-zero value, so that the slice from vec_1 would be:
[1.8, 0, 0]
The slice should replace the last x elements of vec_2, giving:
vec_2 = [29, 7, 5.8, 2.4, 1.8, 0, 0]
vec_2's last 3 elements in this example are replaced by the slice from vec_1. Lastly, how could this be made dynamic, so that the slice length is determined in step 1 and then the last x elements of vec_2 are replaced? Once a 0 is observed in vec_1, it is 0 from that point to the end of the array.
import numpy as np
vec_1 = np.array([2.3, 1.4, 7.3, 1.8, 0, 0, 0])
vec_2 = np.array([29, 7, 5.8, 2.4, 6.7, 5, 8])
## Slice from one element before the first 0 in vec_1; :-1 removes the trailing 0
vec_1_slice = vec_1[np.where(vec_1 == 0)[0][0] - 1:-1]
## Drop the last len(vec_1_slice) elements of vec_2, then append vec_1_slice
vec_2 = np.append(vec_2[:-len(vec_1_slice)], vec_1_slice)
Output
vec_2
Out[237]: array([29. , 7. , 5.8, 2.4, 1.8, 0. , 0. ])
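To make the step reusable, the same logic can be wrapped in a small function (a sketch; replace_tail is a hypothetical name, and it assumes vec_1 contains at least one zero preceded by a non-zero value):

```python
import numpy as np

def replace_tail(vec_1, vec_2):
    # Slice from the element before the first 0 up to (but not
    # including) the last element, then overwrite vec_2's tail with it.
    first_zero = np.flatnonzero(vec_1 == 0)[0]
    tail = vec_1[first_zero - 1:-1]
    return np.append(vec_2[:-len(tail)], tail)

vec_1 = np.array([2.3, 1.4, 7.3, 1.8, 0, 0, 0])
vec_2 = np.array([29, 7, 5.8, 2.4, 6.7, 5, 8])
result = replace_tail(vec_1, vec_2)
# result: [29, 7, 5.8, 2.4, 1.8, 0, 0]
```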

Indexes of first occurrences of zeroes in Python list

I have a list like this:
import numpy as np
myList = [0.0 , 0.0, 0.0, 2.0, 2.0, 0.0, 2.5, 0.0, 0.0, 0.0, 3.0, 0.0]
I can find the indices of the non-zero entries like below.
I = np.nonzero(myList)
for i in I:
    print(i)
Can I find the index of the first occurrence of each run of zeros? Something like below:
[0, 5, 7, 11]
Since this is tagged NumPy, here are two ways -
In [44]: m = np.r_[False,np.equal(myList,0)]
In [45]: np.flatnonzero(m[:-1]<m[1:])
Out[45]: array([ 0, 5, 7, 11])
If the input is an array, it becomes a bit easier to get the equivalent mask m -
a = np.array(myList)
m = np.r_[False,a==0]
Another way with np.diff for a one-liner -
In [46]: np.flatnonzero(np.diff(np.r_[0,np.equal(myList,0)])==1)
Out[46]: array([ 0, 5, 7, 11])
Easier again with array input a -
In [52]: np.flatnonzero(np.diff(np.r_[0,a==0])==1)
Out[52]: array([ 0, 5, 7, 11])
With plain Python:
i = True
r = []
l = [0.0 , 0.0, 0.0, 2.0, 2.0, 0.0, 2.5, 0.0, 0.0, 0.0, 3.0, 0.0]
for a in range(len(l)):
    if l[a] == 0:
        if i:
            r.append(a)
            i = False
    else:
        i = True
print(r)
Prints
[0, 5, 7, 11]
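The same scan can be written as a single comprehension by pairing each element with its predecessor (a sketch; the leading 1.0 is a sentinel meaning "the element before index 0 was non-zero"):

```python
l = [0.0, 0.0, 0.0, 2.0, 2.0, 0.0, 2.5, 0.0, 0.0, 0.0, 3.0, 0.0]

# A zero starts a run exactly when its predecessor is non-zero.
r = [i for i, (prev, cur) in enumerate(zip([1.0] + l, l))
     if cur == 0 and prev != 0]
print(r)  # [0, 5, 7, 11]
```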

Get average column value from list of arrays Python

I'm trying to get the average of the values in column 1 and column 2 of a list of arrays. I am using a dict called clusters indexed by clusterNo, which I iterate over.
print(kMeans.clusters[clusterNo])
When I print the dictionary it gives me this result:
[array([ 5.1, 3.5]), array([ 4.9, 3. ]), array([ 4.7, 3.2]), array([ 4.6, 3.1]), array([ 5. , 3.6])
etc etc..
I cannot figure out how to slice into columns and then get the average. Bear in mind they are float values, so I cannot simply avg() them.
Setup
>>> import numpy as np
>>> lst = [np.array([ 5.1, 3.5]), np.array([ 4.9, 3. ]), np.array([ 4.7, 3.2]), np.array([ 4.6, 3.1]), np.array([ 5. , 3.6])]
Solution
>>> np.mean(lst, axis=0)
array([4.86, 3.28])
However, having lst as an array might be advantageous if you need to do more calculations or array operations on that data.
>>> arr = np.array(lst)
>>> arr
array([[5.1, 3.5],
[4.9, 3. ],
[4.7, 3.2],
[4.6, 3.1],
[5. , 3.6]])
>>> arr.mean(axis=0)
array([4.86, 3.28])

How to declare an array in a format where consecutive zeros can be written as n*0, like in Fortran?

I have a Fortran array of the form
DATA ELEV /1.2,3.2,2*0.0,3.9,3*0.0/
which in Python would be
ELEV = [1.2, 3.2, 0.0, 0.0, 3.9, 0.0, 0.0, 0.0]
Notice how 2*0.0 is not 0.0 but instead 2 elements with the value 0.0.
Is there some way to use numpy or other Python methods (or libraries) to write it similarly in Python 3?
I essentially have arrays in the Fortran format which I want to use in my Python code, not merely represent.
Use the * unpacking generalizations (PEP 448) and list multiplication.
>>> [1.2, 3.2, *2*[0.0], 3.9, *3*[0.0]]
[1.2, 3.2, 0.0, 0.0, 3.9, 0.0, 0.0, 0.0]
You can also multiply strings and tuples in Python.
>>> 'abc'*3
'abcabcabc'
>>> (1, 2, 3)*2
(1, 2, 3, 1, 2, 3)
You can unpack any iterable, and this also works in tuple displays, etc.
>>> (1.2, 3.2, *'xy'*2, 3.9, *3*(0.0,), *'foo')
(1.2, 3.2, 'x', 'y', 'x', 'y', 3.9, 0.0, 0.0, 0.0, 'f', 'o', 'o')
Python's built-in lists already have very similar functionality:
[1.2, 3.2] + [0.0] * 2 + [3.9] + [0.0] * 3
results in
[1.2, 3.2, 0.0, 0.0, 3.9, 0.0, 0.0, 0.0]
Perhaps the most natural way of doing this in numpy is with the repeat function/method:
In [252]: a = np.array([1.2,3.2,0,3.9,0])
In [253]: b = a.repeat([1,1,2,1,3])
In [254]: b
Out[254]: array([1.2, 3.2, 0. , 0. , 3.9, 0. , 0. , 0. ])
or, if there are a lot of 0s, copy the nonzero values into an array of zeros:
In [255]: c = np.zeros(8, float)
In [256]: c[[0,1,4]] = [1.2,3.2,3.9]
In [257]: c
Out[257]: array([1.2, 3.2, 0. , 0. , 3.9, 0. , 0. , 0. ])
@fireball, I think it's better if you have a reusable code block (function) for this.
Here it is: get_nums(), which takes 2 parameters.
The first parameter, n, is the number we want to repeat, and the second parameter, freq, is the number of repetitions.
The function returns a list containing n repeated freq times, which we unpack in the calling statement using *.
This way you don't need to chain the + operator and split the list into multiple sub-lists, like below.
[65, 54] + [0.0] * 5 + [6, 9]
Please have a look at the Python code below and its output.
Try it online at http://rextester.com/AJDEUE76668
def get_nums(n, freq):
    l = [n] * freq
    return l

# TEST CASE 1
ELEV = [1.2, 3.2, *get_nums(0.0, 2), 3.9, *get_nums(0.0, 3)]
print(ELEV)
print()  # newline

# TEST CASE 2
arr = [45, *get_nums(1, 4), *get_nums(9, 3), 34, 99, *get_nums(7, 1), 12, 21, *get_nums(-1, 5)]
print(arr)
Output »
[1.2, 3.2, 0.0, 0.0, 3.9, 0.0, 0.0, 0.0]
[45, 1, 1, 1, 1, 9, 9, 9, 34, 99, 7, 12, 21, -1, -1, -1, -1, -1]

Find numpy array coordinates of neighboring maximum

I used the accepted answer in this question to obtain local maxima in a numpy array of 2 or more dimensions so I could assign labels to them. Now I would like to also assign these labels to neighboring cells in the array, depending on gradient – i.e. a cell gets the same label as the neighboring cell with the highest value. This way I can iteratively assign labels to my entire array.
Assume I have an array A like
>>> A = np.array([[ 1. ,  2. ,  2.2,  3.5],
...               [ 2.1,  2.4,  3. ,  3.3],
...               [ 1. ,  3. ,  3.2,  3. ],
...               [ 2. ,  4.1,  4. ,  2. ]])
Applying the maximum_filter I get
>>> scipy.ndimage.filters.maximum_filter(A, size=3)
array([[ 2.4, 3. , 3.5, 3.5],
[ 3. , 3.2, 3.5, 3.5],
[ 4.1, 4.1, 4.1, 4. ],
[ 4.1, 4.1, 4.1, 4. ]])
Now, for every cell in this array I would like to have the coordinates of the maximum found by the filter, i.e.
array([[[1,1],[1,2],[0,3],[0,3]],
[[2,1],[2,2],[0,3],[0,3]],
[[3,1],[3,1],[3,1],[3,2]],
[[3,1],[3,1],[3,1],[3,2]]])
I would then use these coordinates to assign my labels iteratively.
I can do it for two dimensions using loops, ignoring borders
highest_neighbor_coordinates = np.array([[(argmax2D(A[i-1:i+2, j-1:j+2])+np.array([i-1, j-1])) for j in range(1, A.shape[1]-1)] for i in range(1, A.shape[0]-1)])
but after seeing the many filter functions in scipy.ndimage I was hoping there would be a more elegant and extensible (to >=3 dimensions) solution.
We can pad the array with reflected elements to simulate the max-filter operation, get sliding windows on it with scikit-image's view_as_windows, compute the flattened argmax indices, and offset those with ranged values to translate them onto the global scale -
from skimage.util import view_as_windows as viewW
def window_argmax_global2D(A, size):
    hsize = (size-1)//2  # expects size as an odd number
    m,n = A.shape
    A1 = np.pad(A, (hsize,hsize), mode='reflect')
    idx = viewW(A1, (size,size)).reshape(-1,size**2).argmax(-1).reshape(m,n)
    r,c = np.unravel_index(idx, (size,size))
    rows = np.abs(r + np.arange(-hsize,m-hsize)[:,None])
    cols = np.abs(c + np.arange(-hsize,n-hsize))
    return rows, cols
Sample run -
In [201]: A
Out[201]:
array([[1. , 2. , 2.2, 3.5],
[2.1, 2.4, 3. , 3.3],
[1. , 3. , 3.2, 3. ],
[2. , 4.1, 4. , 2. ]])
In [202]: rows, cols = window_argmax_global2D(A, size=3)
In [203]: rows
Out[203]:
array([[1, 1, 0, 0],
[2, 2, 0, 0],
[3, 3, 3, 3],
[3, 3, 3, 3]])
In [204]: cols
Out[204]:
array([[1, 2, 3, 3],
[1, 2, 3, 3],
[1, 1, 1, 2],
[1, 1, 1, 2]])
Extending to n-dim
We would use np.ogrid for this extension part :
def window_argmax_global(A, size):
    hsize = (size-1)//2  # expects size as an odd number
    shp = A.shape
    N = A.ndim
    A1 = np.pad(A, (hsize,hsize), mode='reflect')
    idx = viewW(A1, ([size]*N)).reshape(-1,size**N).argmax(-1).reshape(shp)
    offsets = np.ogrid[tuple(map(slice, shp))]
    out = np.unravel_index(idx, ([size]*N))
    return [np.abs(i+j-hsize) for i,j in zip(out,offsets)]
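If scikit-image is unavailable, NumPy's own sliding_window_view (NumPy >= 1.20) can stand in for view_as_windows; a 2D sketch following the same padding-and-offset idea (window_argmax_sw is a hypothetical name):

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def window_argmax_sw(A, size):
    # Same scheme as above, using NumPy's sliding windows.
    hsize = (size - 1) // 2  # expects size as an odd number
    m, n = A.shape
    A1 = np.pad(A, hsize, mode='reflect')
    win = sliding_window_view(A1, (size, size))
    idx = win.reshape(m, n, size * size).argmax(-1)
    r, c = np.unravel_index(idx, (size, size))
    rows = np.abs(r + np.arange(-hsize, m - hsize)[:, None])
    cols = np.abs(c + np.arange(-hsize, n - hsize))
    return rows, cols

A = np.array([[1. , 2. , 2.2, 3.5],
              [2.1, 2.4, 3. , 3.3],
              [1. , 3. , 3.2, 3. ],
              [2. , 4.1, 4. , 2. ]])
rows, cols = window_argmax_sw(A, size=3)
```

On the sample array this reproduces the rows/cols arrays shown in the sample run above.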
