Remove head and tail from numpy array PYTHON

I have a numpy.ndarray and want to remove the first h elements and the last t elements.
As far as I can see, the most general way is to slice:
h, t = 1, 1
my_array = [0,1,2,3,4,5]
middle = my_array[h:-t]
and middle is [1,2,3,4]. This is correct, but when I don't want to remove anything I set h = 0 and t = 0, and this returns an empty array. I know this is because of t = 0, and I also know that an if condition for this border case would solve it with my_array[h:], but I don't want that solution (my real problem is a little more complex, with more dimensions, and the code would become ugly).
Any ideas?

Instead, use
middle = my_array[h:len(my_array)-t]
For completeness, here's the trial run:
my_array = [0,1,2,3,4,5]
h,t = 0,0
middle = my_array[h:len(my_array)-t]
print(middle)
Output: [0, 1, 2, 3, 4, 5]
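To make the border case concrete, here is a minimal sketch comparing the original slice with the len-based one. The third variant (-t or None) is an alternative not taken from the answer above; it relies on None meaning "to the end" in a slice.

```python
import numpy as np

my_array = np.array([0, 1, 2, 3, 4, 5])
h, t = 0, 0

# -0 is just 0, so this slice stops at index 0 and is empty.
empty = my_array[h:-t]

# Measuring the stop from the front avoids the problem.
middle = my_array[h:len(my_array) - t]

# Alternative: a zero tail becomes None, which means "to the end".
middle_alt = my_array[h:-t or None]

print(empty)       # []
print(middle)      # [0 1 2 3 4 5]
print(middle_alt)  # [0 1 2 3 4 5]
```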
This example was just for a plain Python list. Since your ultimate goal is to work with multidimensional numpy arrays, the problem is a bit trickier. When you say you want to remove the first h elements and the last t elements, are we guaranteed that h and t are multiples of the row size, so that the result is a well-formed array?
I think the cleanest approach is to use the same slice, but first divide h and t by the number of elements per row. For example, in two dimensions:
import numpy
h = 3
t = 6
a = numpy.array([[ 1,  2,  3],
                 [ 4,  5,  6],
                 [ 7,  8,  9],
                 [10, 11, 12]])
d = numpy.prod(numpy.shape(a)[1:])  # elements per row, 3 here
mid_a = a[h//d : len(a) - t//d]
print(mid_a)
Output: [[4 5 6]]
I use floor division (//) in the indices because in Python 3 the / operator always returns a float, even when the denominator evenly divides the numerator, and a float cannot be used as an index.

The i:j notation can be replaced with a slice object, slice(i,j), and :j with slice(None,j), etc.:
In [55]: alist = [0,1,2,3,4,5]
In [56]: h,t=1,-1; alist[slice(h,t)]
Out[56]: [1, 2, 3, 4]
In [57]: h,t=None,-1; alist[slice(h,t)]
Out[57]: [0, 1, 2, 3, 4]
In [58]: h,t=None,None; alist[slice(h,t)]
Out[58]: [0, 1, 2, 3, 4, 5]
This works for lists and arrays. For multidimensional arrays, use a tuple of indices, which can include slice objects:
x[i:j, k:l]
x[(slice(i,j), Ellipsis, slice(k,l))]
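Combining the two ideas (a tuple of slice objects, and a stop measured from the front), a per-axis trim could be sketched as follows. The function name trim and its heads/tails parameters are hypothetical, chosen only for this illustration.

```python
import numpy as np

def trim(arr, heads, tails):
    """Drop heads[k] leading and tails[k] trailing entries along axis k.

    Hypothetical helper: axes beyond len(heads) are left untouched.
    """
    index = tuple(
        slice(h, n - t)          # stop counted from the front, so t == 0 is safe
        for h, t, n in zip(heads, tails, arr.shape)
    )
    return arr[index]

x = np.arange(20).reshape(4, 5)
print(trim(x, heads=(1, 0), tails=(1, 2)))
print(trim(x, heads=(0, 0), tails=(0, 0)).shape)  # (4, 5)
```

Because no branch on t == 0 is needed, the same expression covers every axis.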

Related

Python - How to remove some terms of a numpy array in specific intervals

Suppose I have the following array:
import numpy as np
x = np.array([1,2,3,4,5,
1,2,3,4,5,
1,2,3,4,5])
How can I remove terms at equally spaced intervals, so that the array shrinks accordingly? For example, I'd like to have:
x = [1,2,3,4,
1,2,3,4,
1,2,3,4]
Here the terms at positions 4, 9, and 14 were excluded (so one out of every 5 terms is dropped). If possible, I'd like code that works for an array of any length N. Thank you in advance!

In your case, you can simply run the code below after initializing the x array (as you did in your question):
x.reshape(3,5)[:,:4]
Output
array([[1, 2, 3, 4],
[1, 2, 3, 4],
[1, 2, 3, 4]])
If you want a vector rather than a matrix (such as the output above), you can call flatten on the result:
x.reshape(3,5)[:,:4].flatten()
Output
array([1, 2, 3, 4,
1, 2, 3, 4,
1, 2, 3, 4])
Explanation
Since x is a numpy array, we can use NumPy's built-in methods such as reshape, which rearranges the array into the requested shape. x is a vector of 15 elements, so x.reshape(3,5) gives a matrix with 3 rows and 5 columns. [:, :4] then selects the first four columns, and flatten turns the matrix back into a vector.
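For an array of another length, the same reshape idea can be written without hard-coding the row count. This is a minimal sketch, assuming the length is a multiple of the group size N; otherwise reshape raises a ValueError.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5] * 3)
N = 5  # group length; the last entry of each group is dropped

# -1 lets NumPy infer the number of rows from len(x) and N.
result = x.reshape(-1, N)[:, :-1].ravel()
print(result)  # [1 2 3 4 1 2 3 4 1 2 3 4]
```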

IIUC, you can use a boolean mask generated with the modulo (%) operator:
N = 5
mask = np.arange(len(x))%N != N-1
x[mask]
output: array([1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4])
This works even if the size of your array is not a multiple of N.
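As a quick check of that claim, the mask can be wrapped in a small function and applied to an array whose length is not a multiple of N. The function name drop_every_nth is an assumption made for this sketch.

```python
import numpy as np

def drop_every_nth(x, n):
    """Return x without the elements at positions n-1, 2n-1, 3n-1, ..."""
    x = np.asarray(x)
    keep = np.arange(len(x)) % n != n - 1
    return x[keep]

# 12 elements with n = 5: positions 4 and 9 (values 5 and 10) are dropped.
print(drop_every_nth(np.arange(1, 13), 5))
```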

array split with overlap [duplicate]

Let's say I have a list A
A = [1,2,3,4,5,6,7,8,9,10]
I would like to create a new list (say B) using the above list in the following order.
B = [[1,2,3], [3,4,5], [5,6,7], [7,8,9], [9,10,]]
i.e. the first 3 numbers as A[0,1,2] and the second 3 numbers as A[2,3,4] and so on.
I believe there is a function in numpy for such a kind of operation.

Simply use Python's built-in list comprehension with list-slicing to do this:
>>> A = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> size = 3
>>> step = 2
>>> A = [A[i : i + size] for i in range(0, len(A), step)]
This gives you what you're looking for:
>>> A
[[1, 2, 3], [3, 4, 5], [5, 6, 7], [7, 8, 9], [9, 10]]
But you'll have to write a couple of extra lines to make sure that your code doesn't break for unexpected values of size/step.
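One way those extra lines might look is sketched below. The function name windows and the full_only flag are hypothetical; the flag decides whether a short trailing chunk such as [9, 10] is kept.

```python
def windows(seq, size, step, full_only=False):
    """Split seq into chunks of length size, starting every step items."""
    if size < 1 or step < 1:
        raise ValueError("size and step must be positive")
    # With full_only, stop before any start that cannot fill a whole chunk.
    stop = len(seq) - size + 1 if full_only else len(seq)
    return [seq[i:i + size] for i in range(0, stop, step)]

A = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(windows(A, 3, 2))
print(windows(A, 3, 2, full_only=True))
```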

The 'duplicate' Partition array into N chunks with Numpy suggests np.split - that's fine for non-overlapping splits. The example (added after the close?) overlaps, one element across each subarray. Plus it pads with a 0.
How do you split a list into evenly sized chunks? has some good list answers, with various forms of generator or list comprehension, but at first glance I didn't see any that allow for overlaps, though with a clever use of iterators (such as itertools.tee) that should be possible.
We can blame this on poor question wording, but it is not a duplicate.
Working from the example and the comment:
Here my window size is 3, i.e. each split list should have 3 elements, so the first split is [1,2,3]. The step size is 2, so the second split starts from the 3rd element and is [3,4,5].
Here is an advanced solution using as_strided
In [64]: ast=np.lib.stride_tricks.as_strided # shorthand
In [65]: A=np.arange(1,12)
In [66]: ast(A,shape=[5,3],strides=(8,4))
Out[66]:
array([[ 1, 2, 3],
[ 3, 4, 5],
[ 5, 6, 7],
[ 7, 8, 9],
[ 9, 10, 11]])
I increased the range of A because I didn't want to deal with the 0 pad.
Choosing the target shape is easy: 5 sets of 3. Choosing the strides requires more knowledge about striding.
In [69]: A.strides
Out[69]: (4,)
The 1d striding, or stepping from one element to the next, is 4 bytes (the length of one element; this session used 4-byte integers, and the numbers double where the default integer is 8 bytes). The step from one row to the next is 2 elements of the original, or 2*4 bytes.
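To avoid hard-coding byte counts, the strides can be derived from the array itself. The following is a sketch under the assumption that A is 1-D; the variable names are illustrative only.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

A = np.arange(1, 12)
(step_bytes,) = A.strides        # bytes between neighbouring elements

size, step = 3, 2                # window length and hop between windows
n_rows = (len(A) - size) // step + 1   # only windows that fit inside A

windows = as_strided(
    A,
    shape=(n_rows, size),
    strides=(step * step_bytes, step_bytes),
)
print(windows)
```

Computing n_rows from the length keeps every window inside the original buffer, whatever the integer size on the platform.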
as_strided produces a view. Thus changing an element in it will affect the original, and may change overlapping values. Add .copy() to make a copy; math with the strided array will also produce a copy.
Changing the strides can give non-overlapping rows, but be careful about the shape: it is possible to access values outside of the original data buffer, as the 17 in the first example below shows.
In [82]: ast(A,shape=[4,3],strides=(12,4))
Out[82]:
array([[ 1, 2, 3],
[ 4, 5, 6],
[ 7, 8, 9],
[10, 11, 17]])
In [84]: ast(A,shape=[3,3],strides=(16,4))
Out[84]:
array([[ 1, 2, 3],
[ 5, 6, 7],
[ 9, 10, 11]])
edit
A newer function, sliding_window_view (added in NumPy 1.20), gives a safer version of as_strided.
np.lib.stride_tricks.sliding_window_view(np.arange(1,10),3)[::2]
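Applied to the 11-element array used earlier, that call could look like the sketch below (it assumes NumPy 1.20 or later). The result is a read-only view by default, so take a copy before modifying it.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

A = np.arange(1, 12)

# All length-3 windows (step 1), then keep every second one (step 2).
windows = sliding_window_view(A, 3)[::2]
print(windows)
```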

This function that I wrote may help you, although it only outputs filled chunks with a length of len_chunk:
def overlap(array, len_chunk, len_sep=1):
    """Returns a matrix of all full overlapping chunks of the input `array`, with a chunk
    length of `len_chunk` and a separation length of `len_sep`. Begins with the first full
    chunk in the array. """
    # np.int was removed from recent NumPy releases; the built-in int does the same job
    n_arrays = int(np.ceil((array.size - len_chunk + 1) / len_sep))
    array_matrix = np.tile(array, n_arrays).reshape(n_arrays, -1)
    columns = np.array(((len_sep*np.arange(0, n_arrays)).reshape(n_arrays, -1) + np.tile(
        np.arange(0, len_chunk), n_arrays).reshape(n_arrays, -1)), dtype=np.intp)
    rows = np.array((np.arange(n_arrays).reshape(n_arrays, -1) + np.tile(
        np.zeros(len_chunk), n_arrays).reshape(n_arrays, -1)), dtype=np.intp)
    return array_matrix[rows, columns]
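The same result can probably be reached more compactly with broadcasting. This is a sketch of an alternative, not the author's function; overlap_chunks is a hypothetical name.

```python
import numpy as np

def overlap_chunks(array, len_chunk, len_sep=1):
    """All full chunks of length len_chunk, starting every len_sep items."""
    array = np.asarray(array)
    # Start index of every chunk that fits completely inside the array.
    starts = np.arange(0, array.size - len_chunk + 1, len_sep)
    # Broadcasting a column of starts against a row of offsets
    # gives one row of indices per chunk.
    index = starts[:, None] + np.arange(len_chunk)
    return array[index]

print(overlap_chunks(np.arange(1, 11), 3, 2))
```

Unlike as_strided, fancy indexing returns a copy, so writing to the result does not touch the input.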

Numpy: How is this code different from each other?

Let r be an array of size M in which each element is a column index (less than N), and let P be an MxN array.
The following two snippets behave differently. Why?
1.
P[:, r] += 1
2.
for i in range(len(r)):
    P[i, r[i]] += 1

The first one selects an entire column for each element of r. The second selects just one element per row. You can directly compare the two cases like so:
>>> P = np.arange(12).reshape(4, 3)
>>> r = np.random.randint(0, 3, (4,))
>>> r
array([1, 1, 2, 0])
>>>
>>> P[:, r]
array([[ 1, 1, 2, 0],
[ 4, 4, 5, 3],
[ 7, 7, 8, 6],
[10, 10, 11, 9]])
>>> P[np.arange(4), r]
array([1, 4, 8, 9])
As you can see the second produces essentially the diagonal of the first.
You may profit from having a look at section "Combining advanced and basic indexing" in the numpy docs.

P[:, r] in the first snippet selects the entire first axis (: means every row, like 0:M) and, along the second axis, the columns listed in r.
P[i, r[i]] in the for loop selects only row i and column r[i], which is just a single element.
Once that's clear, it's no surprise that the two give different results.
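The difference is easy to see on an array of zeros. In this sketch the vectorized form P[np.arange(M), r] stands in for the loop of the second snippet; the values of r are arbitrary.

```python
import numpy as np

r = np.array([1, 1, 2, 0])
M, N = len(r), 3

P1 = np.zeros((M, N), dtype=int)
P1[:, r] += 1               # every column named in r, in every row

P2 = np.zeros((M, N), dtype=int)
P2[np.arange(M), r] += 1    # one element per row: P2[i, r[i]]

print(P1)
print(P2)
```

Note that column 1 appears twice in r but is still incremented only once in P1, because the in-place addition is applied once per selected location.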

Maintaining shape of output as of input after Boolean indexing in python

I'd like help with the following problem.
Suppose
X = [[1, 3, 0, 8],
     [1, 4, 6, 0],
     [2, 0, 7, 8]]
mask = (X != 0)
mask = [[T, T, F, T],
        [T, T, T, F],
        [T, F, T, T]]
X1 = X[(mask,np.newaxis)]
The output X1 has shape (9,1), but I want X1 to have shape (3,3), i.e. to keep the same layout as X apart from the masked entries:
X1 = [[1, 3, 8],
      [1, 4, 6],
      [2, 7, 8]]
Can someone help me, please? Thank you.
Every row of X will contain a zero, and I don't want to use reshape(). Here is the working code:
X= np.array([[1,3,0,8],[1,4,6,0],[2,0,7,8]])
mask = (X!=0)
X1=X[(mask,np.newaxis)]
The output X1 has shape (9,1). Is there any way for X1 to have shape (3,3), as mentioned?

I think you might want to start on something easier in Python, since your question doesn't even contain correct syntax. I'm hoping this was just a pseudocode attempt. However, here's some code to do the masking you want.
import numpy as np
X = np.array([1, 3, 0, 8, 1, 4, 6, 0, 2, 0, 7, 8])
indices_we_want = np.where(X != 0)[0] # Indices of the entries of X we want to keep
result = np.take(X, indices_we_want) # Filter by these indices
result = result.reshape(3, 3) # Reshape to desired result
print(result)
This code could be condensed considerably, but I wanted to show each step, as you did in your question, for clarity.
As pointed out in the comments section, the reshape typically isn't a good idea unless you somehow know that filtering out the 0s leaves exactly 9 elements. In the case you described we do know this, but for an arbitrary array we don't.

In [173]: x=[[1,3,0,8],[1,4,6,0],[2,0,7,8]]
In [174]: xa=np.array(x)
solution with reshape:
In [175]: xa[xa!=0].reshape(3,3)
Out[175]:
array([[1, 3, 8],
[1, 4, 6],
[2, 7, 8]])
a solution without reshape:
In [176]: np.array([i[i!=0] for i in xa])
Out[176]:
array([[1, 3, 8],
[1, 4, 6],
[2, 7, 8]])
Obviously both depend on there being exactly one deletion per row.
You aren't deleting a common column, and nothing in your code tells numpy that the result can be reshaped. So boolean indexing operates on the flattened array.
In [177]: xa[xa!=0]
Out[177]: array([1, 3, 8, 1, 4, 6, 2, 7, 8])
In [178]: xa.flat[xa.flat!=0]
Out[178]: array([1, 3, 8, 1, 4, 6, 2, 7, 8])
I could throw in an extra 0 and this indexing would still work the same, but the attempt to reshape the result to 3x3 would fail.
Keep in mind that the underlying data buffer is flat, 1d, and that it only displays as 2d because of the shape and strides attributes. Selecting elements (or skipping some) produces a copy, and a 1d copy is just as easy to make as a 2d one, or even faster. reshape doesn't change the data buffer, just the shape attribute.
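If the reshape route is acceptable, the "same number of deletions per row" condition can be checked explicitly before reshaping. The helper below is a hypothetical sketch (compress_rows is not a NumPy function) and assumes a 2-D input with at least one row.

```python
import numpy as np

def compress_rows(X, mask):
    """Keep the masked entries of each row; rows must keep equally many."""
    X = np.asarray(X)
    counts = mask.sum(axis=1)            # kept entries per row
    if not (counts == counts[0]).all():
        raise ValueError("rows keep different numbers of elements")
    # Boolean indexing yields a flat copy in row order, so it can be
    # folded back into one row per original row.
    return X[mask].reshape(X.shape[0], counts[0])

X = np.array([[1, 3, 0, 8], [1, 4, 6, 0], [2, 0, 7, 8]])
X1 = compress_rows(X, X != 0)
print(X1)
```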
