splitting an array where it meets the peak values - python

I hope you are doing well.
I have a very large numpy array and want to split it into several smaller ones. My array has three columns, and I want to split it wherever all the columns reach their peak values:
array = [[0, 0, 0],
[0, 0, 5],
[10, 5, 10],
[1, 1, 1],
[5, 5, 15],
[10, 8, 20],
[2, 0, 0],
[10, 10, 12],
[1, 2, 0],
[2, 5, 9]]
Now, I want to split it into four arrays:
sub_array_1=[[0, 0, 0],
[0, 0, 5],
[10, 5, 10]]
sub_array_2=[[1, 1, 1],
[5, 5, 15],
[10, 8, 20]]
sub_array_3=[[2, 0, 0],
[10, 10, 12]]
sub_array_4=[[1, 2, 0],
[2, 5, 9]]
I tried to do it in a for loop with if statements that give me a sub-array whenever each element of a row is bigger than the corresponding elements in both the row above and the row below. I also need to handle the last row:
import numpy as np
sub_array_1=np.array([])
for i in array:
    if array[i,:]>array[i+1,:] and array[i,:]>array[i+1,:]:
        vert_1=np.append(sub_array_1,array[0:i,:])
My code doesn't work, but it simply shows my idea.
I am quite new to Python and could not find a way to write my idea as code, so I would appreciate any help.
Cheers,
Ali

IIUC, one way using numpy.diff with numpy.array_split:
indices = np.argwhere(np.all(np.diff(array, axis=0) < 0, axis=1))
np.array_split(array, indices.ravel()+1, axis=0)
Output:
[array([[ 0, 0, 0],
[ 0, 0, 5],
[10, 5, 10]]),
array([[ 1, 1, 1],
[ 5, 5, 15],
[10, 8, 20]]),
array([[ 2, 0, 0],
[10, 10, 12]]),
array([[1, 2, 0],
[2, 5, 9]])]
np.all and np.diff find the rows in which every element has a negative difference with the next row (i.e. where the peak ends).
np.array_split then splits the given array at the peak locations found.
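As a self-contained check of the approach above, here is a minimal sketch using the array from the question. It swaps np.argwhere(...).ravel() for np.flatnonzero, which is a substitution of mine rather than part of the answer; the names drops, split_points, and parts are illustrative.

```python
import numpy as np

array = np.array([[0, 0, 0], [0, 0, 5], [10, 5, 10],
                  [1, 1, 1], [5, 5, 15], [10, 8, 20],
                  [2, 0, 0], [10, 10, 12],
                  [1, 2, 0], [2, 5, 9]])

# True at row i when every column decreases from row i to row i + 1,
# i.e. row i is the last row of a rising run.
drops = np.all(np.diff(array, axis=0) < 0, axis=1)

# Split *after* each such row, hence the + 1.
split_points = np.flatnonzero(drops) + 1
parts = np.array_split(array, split_points, axis=0)

for part in parts:
    print(part.tolist())
```

Note that a row only counts as a split point when all three columns drop at once; a row where just one or two columns decrease is left inside its sub-array.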

Related

Initialise a numpy array of a specific shape

I want to initialise a numpy array of a specific shape such that when I append numbers to it it will 'fill up' in that shape.
The length of the array will vary - and that is fine, I do not mind how long it is - but I want it to have 4 columns. Ideally something similar to the following:
array = np.array([:, 4])
print(array)
array = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
Again, the actual length of the array would not be defined. That way, if I were to append a different array, it would work as follows:
test_array = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]
array = np.append(array, test_array)
print(array)
array = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
Is there any way to do this?
If I understand your issue correctly, I think you do not need to initialize your array.
You should first check that your array size is divisible by 4.
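import numpy as np
test_array = np.asarray(test_array)  # the question's test_array is a plain list
l = test_array.shape[0]
cols = 4
rows = l // cols  # integer division: reshape needs integer dimensions
my_array = np.reshape(test_array, (rows, cols))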
The kind of behavior that you seek is unusual. You should explain why you need it. If you want something that grows readily, use a Python list. numpy arrays have a fixed size. Values can be assigned to an array in various ways, but to grow it, you need to create a new array with some version of concatenate. (Yes, there is a resize function/method, but that's not commonly used.)
I'll illustrate the value assignment options:
Initialize an array with a known size. In your case the 5 could be chosen larger than the number of rows you anticipate, and the 4 is the desired number of 'columns'.
In [1]: arr = np.zeros((5,4), dtype=int)
In [2]: arr
Out[2]:
array([[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0]])
Assign 4 values to one row:
In [3]: arr[0] = [1,2,3,4]
Assign 3 values starting at a given point in a flat view of the array:
In [4]: arr.flat[4:7] = [1,2,3]
In [5]: arr
Out[5]:
array([[1, 2, 3, 4],
[1, 2, 3, 0],
[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0]])
This array, while defined with shape (5,4), can be viewed as a (20,) 1d array. I had to choose the appropriate slice values in the flat view.
More commonly we assign values to a block of rows (or a variety of other indexed areas). arr[2:, :] is a (3,4) portion of arr, so we need to assign a (3,4) array to it (or an equivalent list structure). To get the full benefit of this sort of assignment you need to read up on broadcasting.
In [6]: arr[2:,:] = np.reshape(list(range(10,22)),(3,4))
In [7]: arr
Out[7]:
array([[ 1, 2, 3, 4],
[ 1, 2, 3, 0],
[10, 11, 12, 13],
[14, 15, 16, 17],
[18, 19, 20, 21]])
In [8]: arr.ravel()
Out[8]:
array([ 1, 2, 3, 4, 1, 2, 3, 0, 10, 11, 12, 13, 14, 15, 16, 17, 18,
19, 20, 21])
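The "use a Python list if it has to grow" advice above can be sketched as follows. This is a minimal illustration, not part of either answer: the chunks variable and its contents are made up, and it assumes the total number of values is a multiple of 4.

```python
import numpy as np

# Hypothetical incoming data, arriving in pieces of arbitrary length.
chunks = [[1, 2, 3, 4, 5, 6], [7, 8], [9, 10, 11, 12, 13, 14, 15, 16]]

values = []  # a plain list grows cheaply
for chunk in chunks:
    values.extend(chunk)

# Convert once at the end; -1 lets numpy infer the number of rows.
# reshape raises ValueError if len(values) is not a multiple of 4.
arr = np.array(values).reshape(-1, 4)
print(arr)
```

Converting once at the end avoids the repeated copying that calling np.append inside a loop would cause.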

Remove 2d slice from 3d numpy array

I need to remove the last arrays from a 3D numpy cube. I have:
a = np.array(
[[[1,2,3],
[4,5,6],
[7,8,9]],
[[9,8,7],
[6,5,4],
[3,2,1]],
[[0,0,0],
[0,0,0],
[0,0,0]],
[[0,0,0],
[0,0,0],
[0,0,0]]])
How do I remove the all-zero sub-arrays, like those at the bottom of the cube, using np.delete?
(I cannot simply remove all zero values, because there will be zeros in the data on the top side)
For a 3D cube, you can check for all-zero slices with all over the last two axes:
a = np.asarray(a)
a[~(a==0).all((2,1))]
array([[[1, 2, 3],
[4, 5, 6],
[7, 8, 9]],
[[9, 8, 7],
[6, 5, 4],
[3, 2, 1]]])
Here's one way to remove only the trailing all-zero slices, since the question says that zeros on the top side of the data must be kept -
a[:-(a==0).all((1,2))[::-1].argmin()]
Sample run -
In [80]: a
Out[80]:
array([[[0, 0, 0],
[0, 0, 0],
[0, 0, 0]],
[[9, 8, 7],
[6, 5, 4],
[3, 2, 1]],
[[0, 0, 0],
[0, 0, 0],
[0, 0, 0]],
[[0, 0, 0],
[0, 0, 0],
[0, 0, 0]]])
In [81]: a[:-(a==0).all((1,2))[::-1].argmin()]
Out[81]:
array([[[0, 0, 0],
[0, 0, 0],
[0, 0, 0]],
[[9, 8, 7],
[6, 5, 4],
[3, 2, 1]]])
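One caveat with the slicing expression above: if the array has no trailing all-zero slice, argmin returns 0 and a[:-0] is an empty array. The following is a hedged variant of my own (the helper name strip_trailing_zero_slices is hypothetical) that computes the end index explicitly so that case is handled as well.

```python
import numpy as np

def strip_trailing_zero_slices(a):
    """Drop all-zero 2D slices from the end of a 3D array only."""
    nonzero_slices = np.flatnonzero((a != 0).any(axis=(1, 2)))
    # End just past the last slice holding any non-zero value;
    # an entirely zero array yields an empty result.
    end = nonzero_slices[-1] + 1 if nonzero_slices.size else 0
    return a[:end]

a = np.array([[[0, 0, 0], [0, 0, 0], [0, 0, 0]],
              [[9, 8, 7], [6, 5, 4], [3, 2, 1]],
              [[0, 0, 0], [0, 0, 0], [0, 0, 0]],
              [[0, 0, 0], [0, 0, 0], [0, 0, 0]]])

trimmed = strip_trailing_zero_slices(a)
print(trimmed.shape)

# No trailing zero slice left: nothing more is removed.
print(strip_trailing_zero_slices(trimmed).shape)
```

The leading all-zero slice is kept, which matches the requirement in the question.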
If you know where they are already, the easiest thing to do is slice them off:
a[:-2]
Results in:
array([[[1, 2, 3],
[4, 5, 6],
[7, 8, 9]],
[[9, 8, 7],
[6, 5, 4],
[3, 2, 1]]])
Hope this helps,
a_new = []  # create an empty list
for item in a:
    if np.count_nonzero(item) != 0:  # keep only slices that contain a non-zero value
        a_new.append(item)  # append the 2D slice to the list
a_new = np.array(a_new)  # stack the kept slices back into a 3D array
Output:
array([[[1, 2, 3],
[4, 5, 6],
[7, 8, 9]],
[[9, 8, 7],
[6, 5, 4],
[3, 2, 1]]])
Use any and select :)
a=np.array([[[1,2,3],
[4,5,6],
[7,8,9]],
[[9,8,7],
[6,5,4],
[3,2,1]],
[[0,0,0],
[0,0,0],
[0,0,0]],
[[0,0,0],
[0,0,0],
[0,0,0]]])
a[a.any(axis=2).any(axis=1)]

Numpy 3D array arranging and reshaping

I have a 3D numpy array that I need to reshape and arrange. For example, I have:
x=np.array([np.array([np.array([1,0,1]),np.array([1,1,1]),np.array([0,1,0]),np.array([1,1,0])]),np.array([np.array([0,0,1]),np.array([0,0,0]),np.array([0,1,1]),np.array([1,0,0])]),np.array([np.array([1,0,0]),np.array([1,0,1]),np.array([1,1,1]),np.array([0,0,0])])])
It has shape (3,4,3); printing it gives:
array([[[1, 0, 1],
[1, 1, 1],
[0, 1, 0],
[1, 1, 0]],
[[0, 0, 1],
[0, 0, 0],
[0, 1, 1],
[1, 0, 0]],
[[1, 0, 0],
[1, 0, 1],
[1, 1, 1],
[0, 0, 0]]])
Now I need to reshape this array to shape (4,3,3) by selecting the same index in each subarray and putting them together, to end up with something like this:
array([[[1,0,1],[0,0,1],[1,0,0]],
[[1,1,1],[0,0,0],[1,0,1]],
[[0,1,0],[0,1,1],[1,1,1]],
[[1,1,0],[1,0,0],[0,0,0]]])
I tried reshape and all kinds of stacking, but nothing arranged the array the way I need. I know I can do it manually, but for large arrays that isn't an option.
Any help will be much appreciated.
Thanks
swapaxes will do what you want. That is, if your input array is x and your desired output is y, then
np.all(y==np.swapaxes(x, 1, 0))
should give True.
For higher dimensional arrays, transpose will accept a tuple of axis numbers to permute the axes:
import numpy as np
foo = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]])
foo.transpose(1, 0, 2)
result:
array([[[ 1, 2],
[ 5, 6],
[ 9, 10]],
[[ 3, 4],
[ 7, 8],
[11, 12]]])
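To check the swapaxes suggestion against the arrays in the question, here is a small sketch; x and the expected result are copied from the question, and the comparison with reshape is added only to show why reshape alone does not give this arrangement.

```python
import numpy as np

x = np.array([[[1, 0, 1], [1, 1, 1], [0, 1, 0], [1, 1, 0]],
              [[0, 0, 1], [0, 0, 0], [0, 1, 1], [1, 0, 0]],
              [[1, 0, 0], [1, 0, 1], [1, 1, 1], [0, 0, 0]]])

expected = np.array([[[1, 0, 1], [0, 0, 1], [1, 0, 0]],
                     [[1, 1, 1], [0, 0, 0], [1, 0, 1]],
                     [[0, 1, 0], [0, 1, 1], [1, 1, 1]],
                     [[1, 1, 0], [1, 0, 0], [0, 0, 0]]])

y = np.swapaxes(x, 0, 1)       # same as x.transpose(1, 0, 2)
print(y.shape)
print(np.array_equal(y, expected))

# reshape keeps the flat element order, so it only regroups rows.
print(np.array_equal(x.reshape(4, 3, 3), expected))
```

swapaxes exchanges the first two axes, so y[i, j] is x[j, i], which is exactly "the same index from each subarray".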

Find the lowest non-masked point with numpy efficiently

The application here is finding the "cloud base", but the principles apply anywhere. I have a numpy masked 3-D array (which we will say corresponds to a 3-D grid box with dimensions z, y, x), where I have masked out all points with a value of less than 0.1. What I want to find is, at every x,y point, the lowest z index (not the lowest value along z, but the smallest z coordinate) that is not masked out. I can think of a few trivial ways to do it, e.g.:
for x points:
    for y points:
        minz=-1
        for z points:
            if x,y,z is not masked:
                minz = z
                break
However, this seems really inefficient and I'm sure that there is a more efficient or more pythonic way to do this. What am I missing here?
Edit: I do not need to use masked arrays, but it seemed like the easiest way to ask the question. I can instead find the lowest point that meets a certain threshold without using masked arrays.
Edit 2: Idea for what I'm looking for (taking z=0 to be the lowest point):
input:
[[[0,1],
[1,5]],
[[3,3],
[2,4]],
[[2,1],
[4,9]]]
threshold: val >=3
output:
[[1,1],
[2,0]]
Assuming A as the input array, you could do -
np.where((A < thresh).all(0),-1,(A >= thresh).argmax(0))
Sample runs
Run #1:
In [87]: A
Out[87]:
array([[[0, 1],
[1, 5]],
[[3, 3],
[2, 4]],
[[2, 1],
[4, 9]]])
In [88]: thresh = 3
In [89]: np.where((A < thresh).all(0),-1,(A >= thresh).argmax(0))
Out[89]:
array([[1, 1],
[2, 0]])
Run #2:
In [82]: A
Out[82]:
array([[[17, 1, 2, 3],
[ 5, 13, 11, 2],
[ 9, 16, 11, 19],
[11, 16, 6, 3],
[15, 9, 14, 14]],
[[18, 19, 5, 8],
[13, 13, 17, 2],
[17, 12, 16, 0],
[19, 14, 12, 5],
[ 7, 8, 4, 7]],
[[10, 12, 11, 2],
[10, 18, 6, 15],
[ 4, 16, 0, 16],
[16, 18, 2, 1],
[10, 19, 9, 4]]])
In [83]: thresh = 10
In [84]: np.where((A < thresh).all(0),-1,(A >= thresh).argmax(0))
Out[84]:
array([[ 0, 1, 2, -1],
[ 1, 0, 0, 2],
[ 1, 0, 0, 0],
[ 0, 0, 1, -1],
[ 0, 2, 0, 0]])
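The one-liner above relies on argmax returning the index of the first True along an axis of a boolean array; np.where then substitutes -1 for columns with no True at all, where argmax would otherwise report 0. Here is a sketch using the Run #1 data, plus a masked-array variant since the question starts from one (the names ok, first_z, masked, valid, and first_z_masked are mine):

```python
import numpy as np

A = np.array([[[0, 1], [1, 5]],
              [[3, 3], [2, 4]],
              [[2, 1], [4, 9]]])
thresh = 3

ok = A >= thresh                      # boolean array, shape (z, y, x)
first_z = np.where(ok.any(axis=0), ok.argmax(axis=0), -1)
print(first_z)

# Same idea starting from a masked array: unmasked points are the valid ones.
masked = np.ma.masked_less(A, thresh)
valid = ~np.ma.getmaskarray(masked)
first_z_masked = np.where(valid.any(axis=0), valid.argmax(axis=0), -1)
print(np.array_equal(first_z, first_z_masked))
```

Using ok.any(axis=0) as the condition is equivalent to the (A < thresh).all(0) test in the answer, just stated the other way round.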

Selecting specific rows and columns from NumPy array

I've been going crazy trying to figure out what stupid thing I'm doing wrong here.
I'm using NumPy, and I have specific row indices and specific column indices that I want to select from. Here's the gist of my problem:
import numpy as np
a = np.arange(20).reshape((5,4))
# array([[ 0, 1, 2, 3],
# [ 4, 5, 6, 7],
# [ 8, 9, 10, 11],
# [12, 13, 14, 15],
# [16, 17, 18, 19]])
# If I select certain rows, it works
print(a[[0, 1, 3], :])
# array([[ 0, 1, 2, 3],
# [ 4, 5, 6, 7],
# [12, 13, 14, 15]])
# If I select certain rows and a single column, it works
print(a[[0, 1, 3], 2])
# array([ 2, 6, 14])
# But if I select certain rows AND certain columns, it fails
print(a[[0,1,3], [0,2]])
# Traceback (most recent call last):
# File "<stdin>", line 1, in <module>
# ValueError: shape mismatch: objects cannot be broadcast to a single shape
Why is this happening? Surely I should be able to select the 1st, 2nd, and 4th rows, and 1st and 3rd columns? The result I'm expecting is:
a[[0,1,3], [0,2]] => [[0, 2],
[4, 6],
[12, 14]]
As Toan suggests, a simple hack would be to just select the rows first, and then select the columns over that.
>>> a[[0,1,3], :] # Returns the rows you want
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[12, 13, 14, 15]])
>>> a[[0,1,3], :][:, [0,2]] # Selects the columns you want as well
array([[ 0, 2],
[ 4, 6],
[12, 14]])
[Edit] The built-in method: np.ix_
I recently discovered that numpy gives you a built-in one-liner for doing exactly what @Jaime suggested, but without having to use broadcasting syntax (which suffers from lack of readability). From the docs:
Using ix_ one can quickly construct index arrays that will index the
cross product. a[np.ix_([1,3],[2,5])] returns the array [[a[1,2] a[1,5]], [a[3,2] a[3,5]]].
So you use it like this:
>>> a = np.arange(20).reshape((5,4))
>>> a[np.ix_([0,1,3], [0,2])]
array([[ 0, 2],
[ 4, 6],
[12, 14]])
And the way it works is that it takes care of aligning arrays the way Jaime suggested, so that broadcasting happens properly:
>>> np.ix_([0,1,3], [0,2])
(array([[0],
[1],
[3]]), array([[0, 2]]))
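Also, as MikeC says in a comment, np.ix_ has the advantage of indexing in a single step, which my first (pre-edit) answer did not. Strictly speaking, reading a[np.ix_(...)] still returns a copy rather than a view, but because the indexing is not chained, you can assign to the indexed array: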
>>> a[np.ix_([0,1,3], [0,2])] = -1
>>> a
array([[-1, 1, -1, 3],
[-1, 5, -1, 7],
[ 8, 9, 10, 11],
[-1, 13, -1, 15],
[16, 17, 18, 19]])
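To make the assignment point concrete, here is a short sketch (variable names are illustrative). Assigning through a single np.ix_ index expression modifies the array, whereas assigning through the chained rows-then-columns form writes into a temporary array produced by the first indexing step, so the original stays unchanged.

```python
import numpy as np

rows, cols = [0, 1, 3], [0, 2]

a = np.arange(20).reshape((5, 4))
a[rows, :][:, cols] = -1            # writes into a temporary copy
print((a == -1).any())              # a itself is untouched

b = np.arange(20).reshape((5, 4))
b[np.ix_(rows, cols)] = -1          # one indexing step: b is modified
print(b)
```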
Fancy indexing requires you to provide all indices for each dimension. You are providing 3 indices for the first one, and only 2 for the second one, hence the error. You want to do something like this:
>>> a[[[0, 0], [1, 1], [3, 3]], [[0,2], [0,2], [0, 2]]]
array([[ 0, 2],
[ 4, 6],
[12, 14]])
That is of course a pain to write, so you can let broadcasting help you:
>>> a[[[0], [1], [3]], [0, 2]]
array([[ 0, 2],
[ 4, 6],
[12, 14]])
This is much simpler to do if you index with arrays, not lists:
>>> row_idx = np.array([0, 1, 3])
>>> col_idx = np.array([0, 2])
>>> a[row_idx[:, None], col_idx]
array([[ 0, 2],
[ 4, 6],
[12, 14]])
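The reason the column-vector form works is broadcasting: a (3, 1) row index combined with a (2,) column index broadcasts to a (3, 2) grid of index pairs. A small sketch to check the shapes and the equivalence with np.ix_:

```python
import numpy as np

a = np.arange(20).reshape((5, 4))
row_idx = np.array([0, 1, 3])
col_idx = np.array([0, 2])

# (3, 1) and (2,) broadcast to (3, 2): one (row, col) pair per output cell.
print(np.broadcast(row_idx[:, None], col_idx).shape)

sub = a[row_idx[:, None], col_idx]
print(np.array_equal(sub, a[np.ix_(row_idx, col_idx)]))
```

By contrast, the original a[[0,1,3], [0,2]] tries to broadcast shapes (3,) and (2,), which are incompatible, hence the error in the question.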
USE:
>>> a[[0,1,3]][:,[0,2]]
array([[ 0, 2],
[ 4, 6],
[12, 14]])
OR:
>>> a[[0,1,3],::2]
array([[ 0, 2],
[ 4, 6],
[12, 14]])
Using np.ix_ is the most convenient way to do it (as answered by others), but it also can be done as follows:
>>> rows = [0, 1, 3]
>>> cols = [0, 2]
>>> (a[rows].T)[cols].T
array([[ 0, 2],
[ 4, 6],
[12, 14]])
