Deleting certain elements from a matrix - python

I have the following problem:
I have a matrix. Now, I want to delete one entry in each row of the matrix: In rows that contain a certain number (say 4) I want to delete the entry with that number, and in other rows I simply want to delete the last element.
E.g. if I have the matrix
import numpy as np
matrix=np.zeros((2,2))
matrix[0,0]=2
matrix[1,0]=4
matrix
which gives
2 0
4 0
after the deletion it should simply be
2
0
thanks for your help!

Assuming each row contains at most one 4, what you want to do is:
iterate over all rows, and if a row contains a 4, use np.roll on the part of the row starting at the 4 so that the 4 becomes the last element;
then delete the last column.
In rows that have a 4, this deletes the 4 and shifts the values that come after it one position to the left;
in rows that don't have a 4, it deletes the last element.
(I took the liberty of trying a slightly bigger matrix just to make sure the output is as expected.)
Try this:
import numpy as np
# Actual solution
def remove_in_rows(mat, num):
    for i, row in enumerate(mat):
        if num in row.tolist():
            index = row.tolist().index(num)
            mat[i][index:] = np.roll(row[index:], -1)
    return np.delete(mat, -1, 1)
# Just some example to demonstrate it works
matrix = np.array([[10 * y + x for x in range(6)] for y in range(6)])
matrix[1, 2] = 4
matrix[3, 3] = 4
matrix[4, 0] = 4
print("BEFORE:")
print(matrix)
matrix = remove_in_rows(matrix, 4)
print("AFTER:")
print(matrix)
Output:
BEFORE:
[[ 0 1 2 3 4 5]
[10 11 4 13 14 15]
[20 21 22 23 24 25]
[30 31 32 4 34 35]
[ 4 41 42 43 44 45]
[50 51 52 53 54 55]]
AFTER:
[[ 0 1 2 3 5]
[10 11 13 14 15]
[20 21 22 23 24]
[30 31 32 34 35]
[41 42 43 44 45]
[50 51 52 53 54]]
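The same idea can also be written without an explicit Python loop. The following is only a sketch: the function name remove_in_rows_masked is made up for illustration, and it keeps the assumption that each row contains at most one occurrence of the number (if there are several, only the first is dropped). It builds a boolean mask that marks one entry per row for removal and leaves the input array unmodified.

```python
import numpy as np

def remove_in_rows_masked(mat, num):
    """Drop one entry per row: the first `num` if present, else the last entry.

    Hypothetical helper, not from the original answer.
    """
    mat = np.asarray(mat)
    n_rows, n_cols = mat.shape
    hits = mat == num
    # Column to drop in each row: position of the first hit,
    # or the last column when the row has no hit.
    drop = np.where(hits.any(axis=1), hits.argmax(axis=1), n_cols - 1)
    keep = np.ones(mat.shape, dtype=bool)
    keep[np.arange(n_rows), drop] = False
    # Exactly one entry per row is removed, so the reshape is safe.
    return mat[keep].reshape(n_rows, n_cols - 1)

matrix = np.array([[2, 0], [4, 0]])
print(remove_in_rows_masked(matrix, 4))
```

For the small matrix from the question this keeps the 2 in the first row and the 0 in the second row.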

Related

Sum function using slices with a numpy array Python

Is there a way to index a numpy array with slices the way I would with a normal list? I want to move through the array in windows of 3 elements, advancing by one position each time, and sum each window. So the first sum would cover 1,2,3, the second 2,3,4, and so on. The code below gives me a scalar error. Is there a way to do this without a for loop?
import numpy as np
n = 3
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8,9, 10, 11,12, 13, 14, 15, 16, 17, 18, 19, 20, 21 ,22, 23, 24, 25])
start = np.arange(0, len(arr)-n, 1)
stop = np.arange(n-1, len(arr), 1)
sum_arr = np.sum(arr[start:stop])
I think this should work:
sum_arr = arr[1:-1] + arr[2:] + arr[:-2]
This creates an array that's two values shorter than arr, because the last two elements of arr don't have two following elements to form a full sum with.
If you wanted the array to be of the same length as the original arr, you could append two extra zeros to the arr array like so:
arr = np.append(arr, [0, 0])
sum_arr = arr[1:-1] + arr[2:] + arr[:-2]
To sum a sliding range of n elements you can use convolve1d with all weights set to 1. Use 'constant' boundary mode with the default fill value of 0. As the filter window is centered by default you need to adjust the length of the result at both ends.
import numpy as np
from scipy.ndimage import convolve1d
arr = np.arange(1,26)
for n in range(2,6):
    k,r = divmod(n, 2)
    print(n, convolve1d(arr, np.ones(n), mode='constant')[k+r-1:-k])
Result:
2 [ 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49]
3 [ 6 9 12 15 18 21 24 27 30 33 36 39 42 45 48 51 54 57 60 63 66 69 72]
4 [ 10 14 18 22 26 30 34 38 42 46 50 54 58 62 66 70 74 78 82 86 90 94]
5 [ 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100 105 110 115]
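If SciPy is not available, a similar result can probably be obtained with plain NumPy. This is a minimal sketch, not part of the answers above: the helper name sliding_sum is hypothetical, and it relies on np.convolve with mode='valid', which only returns positions where the window fits entirely inside the array, so no trimming at the ends is needed.

```python
import numpy as np

def sliding_sum(arr, n):
    """Sum every window of n consecutive elements (hypothetical helper)."""
    arr = np.asarray(arr)
    # A kernel of ones turns the convolution into a plain windowed sum.
    kernel = np.ones(n, dtype=arr.dtype)
    return np.convolve(arr, kernel, mode='valid')

arr = np.arange(1, 26)
print(sliding_sum(arr, 3))
```

For n = 3 the first window is 1 + 2 + 3 and the last is 23 + 24 + 25, so the result has len(arr) - n + 1 entries.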

numpy vectorized resampling like pandas DataFrame resample

I have a (4, 2000) numpy array and want to resample each column (N=4) every 5 elements with aggregations such as max, min, left, and right, which makes its shape (4, 400).
I can do this with pandas.DataFrame using .resample('5Min').agg(~), or with a numpy array and a for loop like result = [max(input[i:i+5]) for i in range(0, len(input), 5)]. However, this takes a long time with a large input array since it's not vectorized. Is there any way to do this with vectorized computation on a numpy array?
Here is another way that uses numpy strides under the hood (a is your array):
from skimage.util import view_as_blocks
a = view_as_blocks(a, (4,5))
Now, you can use methods/slicing for parameters you want:
#max
a.max(-1)[0].T
#min
a.min(-1)[0].T
#left
a[...,0][0].T
#right
a[...,-1][0].T
example:
a
#[[ 0 1 2 3 4 5 6 7 8 9]
# [10 11 12 13 14 15 16 17 18 19]
# [20 21 22 23 24 25 26 27 28 29]
# [30 31 32 33 34 35 36 37 38 39]]
output for max
#[[ 4 9]
# [14 19]
# [24 29]
# [34 39]]
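When the block size divides the array length exactly, a plain reshape may be enough and avoids the scikit-image dependency. The sketch below is an assumption-laden alternative rather than the answer's method: block_aggregate is a made-up name, and it raises an error when the number of columns is not a multiple of the block size.

```python
import numpy as np

def block_aggregate(a, block=5):
    """Aggregate consecutive blocks along axis 1 (hypothetical helper)."""
    a = np.asarray(a)
    rows, cols = a.shape
    if cols % block != 0:
        raise ValueError("number of columns must be a multiple of block")
    # Shape (rows, cols // block, block): the last axis holds one block.
    blocks = a.reshape(rows, cols // block, block)
    return {
        "max": blocks.max(axis=-1),
        "min": blocks.min(axis=-1),
        "left": blocks[..., 0],    # first element of each block
        "right": blocks[..., -1],  # last element of each block
    }

a = np.arange(40).reshape(4, 10)
result = block_aggregate(a, block=5)
print(result["max"])
```

For a (4, 2000) input and block=5, each entry of the returned dictionary has shape (4, 400).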

NUMPY: select even lines, last column

I am trying to learn numpy and I can't manage to complete this question: take the even lines, last column of the M matrix:
[[ 1 2 3 4 5]
[ 6 7 8 9 10]
[11 12 13 14 15]
[16 17 18 19 20]
[21 22 23 24 25]
[26 27 28 29 30]]
What I did : print(M[0:, -1, 2], '\n')
error: IndexError: too many indices for array
Why isn't this working ? I select all the lines with 0:, the last column with -1, with step 2.
Your array is 2-dimensional, but you're using three indices, as if the array had 3 dimensions. The step belongs inside the row slice (start:stop:step), not in a separate index. You can use this index to get what you want:
print(M[::2, -1])
Output:
[ 5 15 25]
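As a small self-contained illustration of the start:stop:step form, the snippet below rebuilds the matrix from the question (the use of np.arange and reshape to construct it is an assumption) and selects the last column of every second row, starting either at row 0 or at row 1.

```python
import numpy as np

# Rebuild the 6x5 matrix from the question.
M = np.arange(1, 31).reshape(6, 5)

# Rows 0, 2, 4 (step 2 starting at row 0), last column.
even_rows_last_col = M[::2, -1]

# Rows 1, 3, 5 (step 2 starting at row 1), last column.
odd_rows_last_col = M[1::2, -1]

print(even_rows_last_col)
print(odd_rows_last_col)
```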

How to efficiently subtract values from each column with numpy

I have a 2D array of shape (50,50). I need to subtract a value from each column of this array (skipping the first), which is calculated based on the index of the column. For example, using a for loop it would look something like this:
for idx in range(1, A[0, :].shape[0]):
    A[0, idx] -= idx * (...) # simple calculations with idx
Now, of course this works fine, but it's very slow and performance is critical for my application. I've tried computing the values to be subtracted using np.fromfunction() and then subtracting them from the original array, but the results are different from those obtained by the iterative for-loop subtraction:
func = lambda i, j: j * (...) #some simple calculations
subtraction_matrix = np.fromfunction(np.vectorize(func), (1,50))
A[0, 1:] -= subtraction_matrix
What am I doing wrong? Or is there some other method that would be better? Any help is appreciated!
All your code snippets indicate that you require the subtraction to happen only in the first row of A (though you've not explicitly mentioned that). So, I'm proceeding with that understanding.
Referring to your use of np.fromfunction(), you can use the subtraction_matrix as below:
A[0,1:] -= subtraction_matrix[1:]
Testing it out (assuming shape (5,5) instead of (50,50)):
import numpy as np
A = np.arange(25).reshape(5,5)
print (A)
func = lambda j: j * 10 #some simple calculations
subtraction_matrix = np.fromfunction(np.vectorize(func), (5,), dtype=A.dtype)
A[0,1:] -= subtraction_matrix[1:]
print (A)
Output:
[[ 0 1 2 3 4] # print(A), before subtraction
[ 5 6 7 8 9]
[10 11 12 13 14]
[15 16 17 18 19]
[20 21 22 23 24]]
[[ 0 -9 -18 -27 -36] # print(A), after subtraction
[ 5 6 7 8 9]
[ 10 11 12 13 14]
[ 15 16 17 18 19]
[ 20 21 22 23 24]]
If you want the subtraction to happen in all the rows of A, you just need to use the line A[:,1:] -= subtraction_matrix[1:], instead of the line A[0,1:] -= subtraction_matrix[1:]
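Since np.vectorize is essentially a Python-level loop, it may be faster to compute the per-column values directly from an index array. The following is a sketch under assumptions: the factor 10 stands in for the elided "simple calculations", and the name offsets is made up for illustration.

```python
import numpy as np

A = np.arange(25).reshape(5, 5)

# Column indices 0..4; the multiplication by 10 is a placeholder
# for whatever index-based calculation is actually needed.
idx = np.arange(A.shape[1])
offsets = idx * 10

# Subtract from the first row only, skipping column 0.
A[0, 1:] -= offsets[1:]
print(A)
```

Replacing A[0, 1:] with A[:, 1:] would apply the same per-column offsets to every row through broadcasting.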

In Python how do you split a list into evenly sized chunks starting with the last element from the previous chunk?

What would be the most pythonic way to convert a list like:
mylist = [0,1,2,3,4,5,6,7,8]
into chunks of n elements that always start with the last element of the previous chunk.
The last element of the last chunk should be identical to the first element of the first chunk to make the data structure circular.
Like:
[
[0,1,2,3],
[3,4,5,6],
[6,7,8,0],
]
This is under the assumption that len(mylist) % (n-1) == 0, so that it always works out evenly.
What about the straightforward solution?
splitlists = [mylist[i:i+n] for i in range(0, len(mylist), n-1)]
splitlists[-1].append(splitlists[0][0])
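Wrapped in a function, the straightforward solution could look like the sketch below. The name circular_chunks and the explicit check of the divisibility assumption are additions for illustration, not part of the original answer.

```python
def circular_chunks(items, n):
    """Split items into overlapping chunks of n elements, closing the circle.

    Hypothetical helper; assumes len(items) % (n - 1) == 0.
    """
    step = n - 1
    if n < 2 or len(items) % step != 0:
        raise ValueError("len(items) must be a multiple of n - 1, with n >= 2")
    if len(items) == 0:
        return []
    # Each chunk starts on the last element of the previous one.
    chunks = [list(items[i:i + n]) for i in range(0, len(items), step)]
    # The final chunk is one element short; wrap around to the first item.
    chunks[-1].append(items[0])
    return chunks

mylist = [0, 1, 2, 3, 4, 5, 6, 7, 8]
print(circular_chunks(mylist, 4))
```

Copying each slice with list() keeps the input list untouched when the wrap-around element is appended.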
A much less straightforward solution involving numpy (for the sake of overkill):
from numpy import arange, roll, column_stack
n = 4
values = arange(10, 26)
# values -> [10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25]
idx = arange(0, values.size, n) # [ 0 4 8 12]
idx = roll(idx, -1) # [ 4 8 12 0]
col = values[idx] # [14 18 22 10]
values = column_stack( (values.reshape(n, -1), col) )
Result:
[[10 11 12 13 14]
[14 15 16 17 18]
[18 19 20 21 22]
[22 23 24 25 10]]
