I need to find the minimum over all elements of the column that has the maximum column sum.
I do the following:
Create a random matrix
from numpy import *
a = random.rand(5,4)
Then calculate sum of each column and find index of the maximum element
c = a.sum(axis=0)
d = argmax(c)
Then I try to find the minimum number in this column, but I am not good with the syntax. I only know how to find the minimum element in the row with the given index:
e = min(a[d])
But how can I change it for columns?
You can extract the minimum value of a column as follows (using the variables you have indicated):
e=a[:,d].min()
Note that using
e=min(a[:,d])
falls back to Python's built-in min, which iterates over the array element by element and is slower than the NumPy method (thanks for pointing this out, @SaulloCastro).
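For reference, the whole sequence from the question and the answer can be put together as a minimal sketch. The fixed seed and the names rng and col_sums are assumptions added here so the run is reproducible; they are not part of the original post.

```python
import numpy as np

# Fixed seed (an assumption, not in the question) so the run is reproducible.
rng = np.random.default_rng(0)
a = rng.random((5, 4))

col_sums = a.sum(axis=0)        # one sum per column
d = int(np.argmax(col_sums))    # index of the column with the largest sum
e = a[:, d].min()               # smallest element in that column

print(d, e)
```

Here a[:, d] selects every row of column d, whereas a[d] selects row d, which is why the original min(a[d]) looked at the wrong axis.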
Is it possible to use .nlargest to get the two highest numbers in a set of numbers, but ensure that they are a given number of rows apart?
For example, in the following code I want to find the largest values but ensure that they are more than 5 rows apart from each other. Is there an easy way to do this?
data = {'Pressure' : [100,112,114,120,123,420,1222,132,123,333,123,1230,132,1,23,13,13,13,123,13,123,3,222,2303,1233,1233,1,1,30,20,40,401,10,40,12,122,1,12,333],
}
If I understand the question correctly, you need to output the largest value, and then the next largest value that's at least X rows apart from it (based on the index).
The first value is just df.Pressure.max(), and its index is df.Pressure.idxmax() (here df is the DataFrame built from your dictionary, e.g. df = pd.DataFrame(data)).
Second value is either before or after the first value's index:
max_before = df.Pressure.loc[:df.Pressure.idxmax() - X].max()
max_after = df.Pressure.loc[df.Pressure.idxmax() + X:].max()
second_value = max(max_before, max_after)
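One caveat with the slices above: when the first maximum sits within X rows of either end, one of the two slices is empty, its max() is NaN, and Python's max() may then return NaN. A possible way around that, sketched below, is to mask out every row closer than X and take the maximum of what remains. The shortened sample list, the value of X, and the names positions, mask and candidates are assumptions for illustration, and the sketch assumes the default 0..n-1 index.

```python
import numpy as np
import pandas as pd

# Shortened sample of the question's data (illustrative only).
df = pd.DataFrame({"Pressure": [100, 112, 114, 120, 123, 420, 1222, 132, 123, 333]})
X = 5  # minimum distance in rows (hypothetical value)

first_idx = df.Pressure.idxmax()
first_value = df.Pressure.loc[first_idx]

# Keep only rows at least X positions away from the first maximum.
positions = np.arange(len(df))
mask = np.abs(positions - first_idx) >= X
candidates = df.Pressure[mask]

# If nothing is far enough away, there is no valid second value.
second_value = candidates.max() if not candidates.empty else None

print(first_value, second_value)
```

Change >= to > if the two values must be strictly more than X rows apart.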
I understand that
np.argmax(np.max(x, axis=1))
returns the index of the row that contains the maximum value and
np.argmax(np.max(x, axis=0))
returns the index of the column that contains the maximum value.
But what if the matrix contained strings? How can I change the code so that it still finds the index of the largest value?
Also (if there's no way to do what I previously asked for), can I change the code so that the operation is only carried out on a sub-section of the matrix, for instance, on the bottom right '2x2' sub-matrix in this example:
array = [['D','F','J'],
         ['K',3,4],
         ['B',3,1]]
[[3,4],
 [3,1]]
Can you try first converting the column to a string dtype? If you take the min/max of a string column, it should compare the values as strings.
Although not efficient, this is one way to find the index of the maximum number in the original matrix using slices:
newmax=0
newmaxrow=0
newmaxcolumn=0
# visit rows 1-2 and columns 1-2 only (the bottom-right 2x2 block)
for i in range(1,3):
    for j, num in enumerate(array[i][1:], start=1):
        if num>newmax:
            newmax=num
            newmaxrow=i
            newmaxcolumn=j
Note: this method skips row 0 and column 0 by construction, so it will not find the largest number if it lies there, and because newmax starts at 0 it assumes at least one positive value.
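If NumPy is available, the same sub-matrix search might be written without explicit loops. This is only a sketch: it assumes the bottom-right block is purely numeric, and the names sub, r, c, row and col are introduced here for illustration.

```python
import numpy as np

array = [['D', 'F', 'J'],
         ['K', 3, 4],
         ['B', 3, 1]]

# Pull out the numeric bottom-right 2x2 block with plain list slicing.
sub = np.array([row[1:] for row in array[1:]])

# argmax works on the flattened block; unravel_index maps it back to (row, col).
r, c = np.unravel_index(np.argmax(sub), sub.shape)

# Add the slice offsets to get the position in the original matrix.
row, col = int(r) + 1, int(c) + 1

print(row, col, array[row][col])
```

Slicing before converting keeps the strings out of the NumPy array, so the block gets an integer dtype and the comparison stays numeric.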
I have a NumPy array 'A' of size 5000x10. I also have another number 'Num'. I want to apply the following to each row of A:
import numpy as np
np.max(np.where(Num > A[0,:]))
Is there a more Pythonic way to do this than writing a for loop?
You could use argmax -
A.shape[1] - 1 - (Num > A)[:,::-1].argmax(1)
Alternatively with cumsum and argmax -
(Num > A).cumsum(1).argmax(1)
Explanation: With np.max(np.where(..)), we are basically looking for the last occurrence of a match along each row of the comparison.
We can use argmax for this. But argmax on a boolean array gives us the first occurrence, not the last one. So one trick is to perform the comparison, flip the columns with [:,::-1], and then use argmax. The resulting column indices are then subtracted from the number of columns minus one (A.shape[1] - 1) to map them back to the original order.
The second approach is very similar to a related post, so quoting from it:
One of the uses of argmax is to get the ID of the first occurrence of the max element along an axis in an array. So, we get the cumsum along the rows and get the first max ID, which represents the last non-zero element. This is because cumsum on the leftover elements won't increase the sum value after that last non-zero element.
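A small check of both expressions on a toy array may help. The sample values and the names mask, flipped, cumsum_idx and result are assumptions for illustration. Note what happens when a row has no match at all: the original np.max(np.where(..)) raises an error on the empty result, while the two vectorized forms silently return the last column and column 0 respectively, so such rows are marked with -1 here.

```python
import numpy as np

A = np.array([[1, 5, 2, 9],
              [7, 8, 9, 9],    # no element is below Num in this row
              [0, 1, 2, 3]])
Num = 4

mask = Num > A

# First approach: flip the columns, take the first match, map the index back.
flipped = A.shape[1] - 1 - mask[:, ::-1].argmax(1)

# Second approach: the first position where the running count peaks.
cumsum_idx = mask.cumsum(1).argmax(1)

# Rows without any match are not meaningful in either result; flag them as -1.
result = np.where(mask.any(1), flipped, -1)

print(flipped, cumsum_idx, result)
```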
I want to find quantiles of element n in sublists.
Let's say I have (in reality it's much bigger):
List=[[[1,3,0,1],[1,2,0,1],[1,3,0,1]],[[2,2,1,0],[2,2,1,0],[2,2,1,0]]]
I want a way to find quantiles (like numpy.percentile) of the 2nd elements in the sublists [[1,3,1,1],[1,2,0,1],[9,3,2,1]] and in [[1,2,3,4],[0,2,0,0],[1,2,2,2]]. Then I want to take the maximum so I know which of those two subgroups has the highest value of the chosen quantile, and I also want to know the values that the other 3 constant elements (1st, 3rd and 4th) have at that maximum.
Here's one possible way. Assuming (as in your question)
List=[[[1,3,0,1],[1,2,0,1],[1,3,0,1]],[[2,2,1,0],[2,2,1,0],[2,2,1,0]]]
Then one can convert each first-level list to a NumPy array, which makes it easy to select the 2nd column, and apply the numpy.percentile function to that column. In short,
import numpy as np
quartiles = [np.percentile(np.array(l)[:,1], 25) for l in List]
which gives as output the first quartiles (25th percentiles) of each first-level list:
[2.5, 2.0]
One can then find the maximum with numpy.argmax:
am = np.argmax(quartiles)
and then use it to select the other 3 constant elements
other3 = [List[am][0][0], List[am][0][2], List[am][0][3]]
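The steps above could also be done on a single 3-D array instead of converting each group separately. This is a sketch that assumes every group has the same number of sublists and every sublist the same length; the name arr is introduced here for illustration.

```python
import numpy as np

List = [[[1, 3, 0, 1], [1, 2, 0, 1], [1, 3, 0, 1]],
        [[2, 2, 1, 0], [2, 2, 1, 0], [2, 2, 1, 0]]]

# Shape (2, 3, 4): groups, sublists per group, elements per sublist.
arr = np.array(List)

# 25th percentile of the 2nd element, computed per group.
quartiles = np.percentile(arr[:, :, 1], 25, axis=1)

# Group with the highest quartile, and its 1st, 3rd and 4th elements.
am = int(np.argmax(quartiles))
other3 = arr[am, 0, [0, 2, 3]].tolist()

print(quartiles, am, other3)
```

For a different quantile, replace 25 with the desired percentage.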