Numpy, change max value in each row to 1 without changing others - python

I'm trying to change the max value of each row to 1 and leave the others unchanged.
Each value is between 0 and 1.
I want to change this
>>> a = np.array([[0.5, 0.2, 0.1],
... [0.6, 0.3, 0.8],
... [0.3, 0.4, 0.2]])
into this
>>> new_a = np.array([[1, 0.2, 0.1],
... [0.6, 0.3, 1],
... [0.3, 1, 0.2]])
Is there a good solution for this problem, maybe using np.where (without a for loop)?

Use np.argmax with integer-array indexing:
>>> a[np.arange(len(a)), np.argmax(a, axis=1)] = 1
>>> a
array([[1. , 0.2, 0.1],
       [0.6, 0.3, 1. ],
       [0.3, 1. , 0.2]])
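If the original array should stay untouched, the same indexing can be applied to a copy. A minimal sketch (the helper name row_max_to_one is made up for illustration):

```python
import numpy as np

def row_max_to_one(arr):
    """Return a copy of a 2-D array with the first maximum of each row set to 1."""
    out = np.array(arr, dtype=float)  # np.array copies, so the caller's array is not modified
    rows = np.arange(out.shape[0])
    out[rows, out.argmax(axis=1)] = 1
    return out

a = np.array([[0.5, 0.2, 0.1],
              [0.6, 0.3, 0.8],
              [0.3, 0.4, 0.2]])
new_a = row_max_to_one(a)
```

Here a keeps its original values and new_a holds the modified copy.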

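The question differs from the desired output.
The author says he wants to replace the max value and leave the others, but actually he replaces the max value and some others.
This is the solution for replacing only the max value of the whole array (np.amax without an axis returns the global maximum, not one per row):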
np.where(arr == np.amax(arr), 1, arr)
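To make the difference concrete, here is a small sketch on the array from the question, comparing the global maximum with the per-row maximum (the variable names are just for illustration):

```python
import numpy as np

a = np.array([[0.5, 0.2, 0.1],
              [0.6, 0.3, 0.8],
              [0.3, 0.4, 0.2]])

# np.amax(a) with no axis is the single largest value in the whole array (0.8 here)
global_only = np.where(a == np.amax(a), 1, a)

# keepdims=True keeps the row maxima as a (3, 1) column, so they broadcast against a
per_row = np.where(a == np.amax(a, axis=1, keepdims=True), 1, a)
```

global_only changes only the 0.8, while per_row changes one value in every row.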

U12-Forward's answer does it perfectly. Here is another answer using numpy.where:
np.where(a == a.max(1, keepdims=True), 1, a)
# `a == a.max(1, keepdims=True)` -> for each row, find the elements equal to the max of that row
# `1` -> set them to `1`
# `a` -> the others remain the same

Here's a more detailed step by step process that gives us the desired output:
# input array
a = np.array([[0.5, 0.8, 0.1],
[0.8, 0.9, 0.6],
[0.4, 0.3, 12]])
# finding the max element for each row
# axis=1 is given because we want to find the max for each row
max_elements = np.amax(a, axis=1)
# this changes the shape of max_elements array so that it matches with input array(a)
# this shape change is done so that we can compare directly
max_elements = max_elements[:, None]
# this code is checking the main condition
# if the value in a row matches with the max element of that row, change it to 1
# else keep it the same
new_arr = np.where(a == max_elements, 1, a)
print(new_arr)

U12-Forward's and AcaNg's answers are perfect. Here's another way to do it using numpy.where:
new_a = np.where(a==[[i] for i in np.amax(a,axis=1)],1,a)
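One thing worth knowing when choosing between the answers: they behave differently when a row contains its maximum more than once. A small sketch with made-up data that has a tie in each row:

```python
import numpy as np

tied = np.array([[0.7, 0.7, 0.1],
                 [0.2, 0.9, 0.9]])  # made-up data with a tie in each row

# np.where marks every element equal to the row maximum
all_ties = np.where(tied == tied.max(axis=1, keepdims=True), 1, tied)

# argmax returns only the first maximum of each row
first_only = tied.copy()
first_only[np.arange(len(tied)), tied.argmax(axis=1)] = 1
```

If exactly one value per row must become 1, the argmax version is the safer choice.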

Related

Randomly get index of one of the maximum values in a PyTorch tensor

I need to perform something similar to the built-in torch.argmax() function on a one-dimensional tensor, but instead of picking the index of the first of the maximum values, I want to be able to pick a random index of one of the maximum values. For example:
my_tensor = torch.tensor([0.1, 0.2, 0.2, 0.1, 0.1, 0.2, 0.1])
index_1 = random_max_val_index_fn(my_tensor)
index_2 = random_max_val_index_fn(my_tensor)
print(f"{index_1}, {index_2}")
> 5, 1
You can get the indexes of all the maximums first and then choose randomly from them:
def rand_argmax(tens):
    max_inds, = torch.where(tens == tens.max())
    return np.random.choice(max_inds)
sample runs:
>>> my_tensor = torch.tensor([0.1, 0.2, 0.2, 0.1, 0.1, 0.2, 0.1])
>>> rand_argmax(my_tensor)
2
>>> rand_argmax(my_tensor)
5
>>> rand_argmax(my_tensor)
2
>>> rand_argmax(my_tensor)
1
I think this should work:
import numpy as np
import torch
your_tensor = torch.tensor([0.1, 0.2, 0.2, 0.1, 0.1, 0.2, 0.1])
argmaxes = np.argwhere(your_tensor==torch.max(your_tensor)).flatten()
rand_argmax = np.random.choice(argmaxes)
print(rand_argmax)
Note that np.random.choice samples with replacement by default; if you draw several indices at once and need them to be distinct, pass replace=False.
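For reference, the same idea can be sketched in plain NumPy with a seedable generator. This assumes the values are available as an array-like (for a CPU tensor, something like my_tensor.numpy() would do); the function name rand_argmax_np is made up for this example:

```python
import numpy as np

def rand_argmax_np(values, rng=None):
    """Return the index of one maximum of a 1-D array-like, chosen uniformly at random."""
    values = np.asarray(values)
    rng = np.random.default_rng() if rng is None else rng
    max_inds = np.flatnonzero(values == values.max())  # indices of all maxima
    return int(rng.choice(max_inds))

rng = np.random.default_rng(0)  # seeded only to make runs repeatable
data = [0.1, 0.2, 0.2, 0.1, 0.1, 0.2, 0.1]
picks = {rand_argmax_np(data, rng) for _ in range(200)}
```

Over many calls, picks should only ever contain the indices 1, 2 and 5.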

Summing three consecutive numbers when equal to or greater than 0 - Python

I am using numpy in Python
I have an array of numbers, for example:
arr = np.array([0.1, 1, 1.2, 0.5, -0.3, -0.2, 0.1, 0.5, 1])
If i is a position in the array, I want to create a function which creates a running sum of i and the two previous numbers, but only accumulating the number if it is equal to or greater than 0.
In other words, negative numbers in the array become equal to 0 when calculating the three number running sum.
For example, the answer I would be looking for here is
2.3, 2.7, 1.7, 0.5, 0.1, 0.6, 1.6
The new array has two fewer elements than the original array, as the calculation can't be completed for the first two numbers.
Thank you !
As Dani Mesejo answered, you can use stride tricks. You can use either clip or boolean indexing to handle the negative elements. Here is how the stride tricks work:
arr[arr<0]=0 sets all elements below 0 to 0.
as_strided takes the array, the expected shape of the view, (7,3), and the strides for the respective axes, (8,8). A stride is the number of bytes you have to move along axis 0 or axis 1 to reach the next element. With (8,8) you move 8 bytes (one float64) to get to the start of the next window (0.1->1->1.2->..., 7 windows in total) and 8 bytes to get to the next element inside a window (0.1->1->1.2, 3 elements in total). If you wanted each window to start 2 elements after the previous one, you would set the strides to (16,8), so the windows would start at 0.1->1.2->0->0.1->...
Use this function with caution! Always derive the strides parameter from x.strides, and make sure the shape keeps the view inside the original buffer, to avoid reading or corrupting memory outside the array.
Lastly, sum this array view over axis=1 to get your rolling sum.
arr = np.array([0.1, 1, 1.2, 0.5, -0.3, -0.2, 0.1, 0.5, 1])
w = 3 #rolling window
arr[arr<0]=0
shape = arr.shape[0]-w+1, w #Expected shape of view (7,3)
strides = arr.strides[0], arr.strides[0] #Strides (8,8) bytes
rolling = np.lib.stride_tricks.as_strided(arr, shape=shape, strides=strides)
rolling_sum = np.sum(rolling, axis=1)
rolling_sum
array([2.3, 2.7, 1.7, 0.5, 0.1, 0.6, 1.6])
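If you are on NumPy 1.20 or newer, sliding_window_view builds the same kind of windowed view without hand-written strides, which avoids the memory-safety pitfall mentioned above. A minimal sketch on the same data:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

arr = np.array([0.1, 1, 1.2, 0.5, -0.3, -0.2, 0.1, 0.5, 1])

# np.clip returns a new array, so arr itself is left unchanged
windows = sliding_window_view(np.clip(arr, 0, None), 3)  # shape (7, 3)
rolling_sum = windows.sum(axis=1)
```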
You could clip, roll and sum:
import numpy as np
def rolling_window(a, window):
    """Recipe from https://stackoverflow.com/q/6811183/4001592"""
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)
a = np.array([0.1, 1, 1.2, 0.5, -0.3, -0.2, 0.1, 0.5, 1])
res = rolling_window(np.clip(a, 0, a.max()), 3).sum(axis=1)
print(res)
Output
[2.3 2.7 1.7 0.5 0.1 0.6 1.6]
You may use np.correlate to sweep an array of 3 ones over the clipped arr to get the desired output:
In [20]: np.correlate(arr.clip(0), np.ones(3), mode='valid')
Out[20]: array([2.3, 2.7, 1.7, 0.5, 0.1, 0.6, 1.6])
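Another option in the same spirit is a cumulative sum: the sum of each window is the difference of two running totals. A rough sketch (the function name rolling_sum_nonneg is made up here), which works for any window size:

```python
import numpy as np

def rolling_sum_nonneg(arr, window):
    """Rolling sum over `window` consecutive elements, counting negatives as 0."""
    clipped = np.clip(arr, 0, None)
    # Prepend 0 so that csum[k] is the sum of the first k clipped elements
    csum = np.concatenate(([0.0], np.cumsum(clipped)))
    return csum[window:] - csum[:-window]

arr = np.array([0.1, 1, 1.2, 0.5, -0.3, -0.2, 0.1, 0.5, 1])
res = rolling_sum_nonneg(arr, 3)
```

Because it subtracts running totals, tiny floating-point differences from the other answers are possible on long arrays.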
arr = np.array([0.1, 1, 1.2, 0.5, -0.3, -0.2, 0.1, 0.5, 1])
def sum_3(x):
    collector = []
    for i in range(len(x)-2):
        collector.append(sum(x[i:i+3][x[i:i+3]>0]))
    return collector
#output
[2.3, 2.7, 1.7, 0.5, 0.1, 0.6, 1.6]
This is the easiest and most comprehensible way. For each window of 3 consecutive numbers, the boolean mask keeps only the positive values, so negative numbers contribute 0 to the sum, and the collector appends the result.
This version is specific to windows of 3, but you can generalize it:
def sum_any(x, n):
    collector = []
    for i in range(len(x)-(n-1)):
        collector.append(sum(x[i:i+n][x[i:i+n]>0]))
    return collector
Masked arrays and view_as_windows (which uses numpy strides under the hood) are built for this purpose:
from skimage.util import view_as_windows
arr = view_as_windows(arr, 3)
arr2 = np.ma.masked_array(arr, arr<0).sum(-1)
output:
[2.3 2.7 1.7 0.5 0.1 0.6 1.6]

Python: for a list of lists, get mean value in each position

I have a list of lists:
list_of_lists = []
list_1 = [-1, 0.67, 0.23, 0.11]
list_2 = [-1]
list_3 = [0.54, 0.24, -1]
list_4 = [0.2, 0.85, 0.8, 0.1, 0.9]
list_of_lists.append(list_1)
list_of_lists.append(list_2)
list_of_lists.append(list_3)
list_of_lists.append(list_4)
The position is meaningful. I want to return a list that contains the mean per position, excluding -1. That is, I want:
[(0.54+0.2)/2, (0.67+0.24+0.85)/3, (0.23+0.8)/2, (0.11+0.1)/2, 0.9/1]
which is actually:
[0.37, 0.5866666666666667, 0.515, 0.10500000000000001, 0.9]
How can I do this in a pythonic way?
EDIT:
I am working with Python 2.7, and I am not looking for the mean of each list; instead, I'm looking for the mean of 'all list elements at position 0 excluding -1', and the mean of 'all list elements at position 1 excluding -1', etc.
The reason I had:
[(0.54+0.2)/2, (0.67+0.24+0.85)/3, (0.23+0.8)/2, (0.11+0.1)/2, 0.9/1]
is that values in position 0 are -1, -1, 0.54, and 0.2, and I want to exclude -1; position 1 has 0.67, 0.24, and 0.85; position 2 has 0.23, -1, and 0.8, etc.
A solution without third-party libraries (Python 3.4+, since it relies on itertools.zip_longest and statistics.mean):
from itertools import zip_longest
from statistics import mean
def f(lst):
    return [mean(x for x in t if x != -1) for t in zip_longest(*lst, fillvalue=-1)]
>>> f(list_of_lists)
[0.37, 0.5866666666666667, 0.515, 0.10500000000000001, 0.9]
It uses itertools.zip_longest with fillvalue set to -1 to "transpose" the list and set missing values to -1 (will be ignored at the next step). Then, a generator expression and statistics.mean are used to filter out -1s and get the average.
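One caveat: statistics.mean raises an error if a position contains nothing but -1. A variant that returns None for such positions could look like the sketch below (the function name positional_means and the missing parameter are my own additions):

```python
from itertools import zip_longest

def positional_means(lists, missing=-1):
    """Mean per position, skipping `missing`; None where no valid value exists."""
    means = []
    for column in zip_longest(*lists, fillvalue=missing):
        valid = [x for x in column if x != missing]
        means.append(sum(valid) / len(valid) if valid else None)
    return means

list_of_lists = [[-1, 0.67, 0.23, 0.11],
                 [-1],
                 [0.54, 0.24, -1],
                 [0.2, 0.85, 0.8, 0.1, 0.9]]
result = positional_means(list_of_lists)
empty_case = positional_means([[-1, 0.5], [-1]])  # position 0 has no valid value
```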
Here is a vectorised numpy-based solution.
import numpy as np
a = [[-1, 0.67, 0.23, 0.11],
[-1],
[0.54, 0.24, -1],
[0.2, 0.85, 0.8, 0.1, 0.9]]
# first create non-jagged numpy array
b = -np.ones([len(a), max(map(len, a))])
for i, j in enumerate(a):
    b[i][0:len(j)] = j
# count negatives per column (for use later)
neg_count = [np.sum(b[:, i]==-1) for i in range(b.shape[1])]
# set negatives to 0
b[b==-1] = 0
# calculate means
means = [np.sum(b[:, i])/(b.shape[0]-neg_count[i]) \
if (b.shape[0]-neg_count[i]) != 0 else 0 \
for i in range(b.shape[1])]
# [0.37,
# 0.58666666666666667,
# 0.51500000000000001,
# 0.10500000000000001,
# 0.90000000000000002]
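The counting and dividing can also be left to np.nanmean by padding with NaN instead of -1. A sketch along those lines, assuming every position has at least one valid value (an all-NaN column would give NaN with a warning):

```python
import numpy as np

a = [[-1, 0.67, 0.23, 0.11],
     [-1],
     [0.54, 0.24, -1],
     [0.2, 0.85, 0.8, 0.1, 0.9]]

# Pad the jagged lists into a rectangular array filled with NaN
b = np.full((len(a), max(map(len, a))), np.nan)
for i, row in enumerate(a):
    b[i, :len(row)] = row

b[b == -1] = np.nan            # treat -1 as missing as well
means = np.nanmean(b, axis=0)  # column-wise mean that ignores NaN
```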
You can use the pandas module for this. The code would look like this:
import numpy as np
import pandas as pd
list_1 = [-1, 0.67, 0.23, 0.11,np.nan]
list_2 = [-1,np.nan,np.nan,np.nan,np.nan]
list_3 = [0.54, 0.24, -1,np.nan,np.nan]
list_4 = [0.2, 0.85, 0.8, 0.1, 0.9]
df=pd.DataFrame({"list_1":list_1,"list_2":list_2,"list_3":list_3,"list_4":list_4})
df=df.replace(-1,np.nan)
print(list(df.mean(axis=1)))

python mask matrix for selecting a list of vertices

I have a numpy matrix of booleans, whose shape is (N,N), e.g.:
[[True False False True]
[...]
[True True True False]]
and a numpy array of vertices, whose shape is (N,3), e.g:
[[0.1, 0.2, 0.3]
[0.4, 0.5, 0.6]
[0.7, 0.8, 0.9]
[1.0, 1.1, 1.2]]
I would like to compute a matrix, with shape (N, varying), in which each row is a list of vertices selected with each line of the boolean matrix.
From the examples above:
[[[0.1, 0.2, 0.3], [1.0, 1.1, 1.2]]
[...]
[[0.1, 0.2, 0.3],[0.4, 0.5, 0.6],[0.7, 0.8, 0.9]]]
Is it possible ?
Thanks in advance
Here's one approach after extracting rows, columns from the mask -
r,c = np.where(mask)
start = np.r_[0,np.flatnonzero(r[1:] != r[:-1])+1]
stop = np.r_[start[1:], r.size]
data_rep = data[c]
out = [data_rep[start[i]:stop[i]] for i in range(len(start))]
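A shorter variant of the same idea is to let np.split cut the selected vertices at the cumulative row counts. This is only a sketch; the two middle rows of the mask are made up, since the question elides them, and one is left empty on purpose to show that empty rows are preserved:

```python
import numpy as np

data = np.array([[0.1, 0.2, 0.3],
                 [0.4, 0.5, 0.6],
                 [0.7, 0.8, 0.9],
                 [1.0, 1.1, 1.2]])
mask = np.array([[True, False, False, True],
                 [False, True, False, False],   # hypothetical row
                 [False, False, False, False],  # hypothetical row with no selection
                 [True, True, True, False]])

r, c = np.nonzero(mask)      # indices come back in row-major order
counts = mask.sum(axis=1)    # number of selected vertices per mask row
out = np.split(data[c], np.cumsum(counts)[:-1])
```

out is a list with one array per mask row; the row without any True yields an array of shape (0, 3).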
Thanks Divakar !!
I tried your solution and it works fine.
However, I also tried a solution with a loop:
result = []
for i in range(len(data)):
    result.append(data[mask[i]])
and it's faster than doing:
result = extract_rows_using_mask(data, mask)
Weird, isn't it?

Generating a new order for a list of lists in python

I have a list of lists that I want to re-order:
qvalues = [[0.1, 0.3, 0.6],[0.7, 0.1, 0.2],[0.3, 0.4, 0.3],[0.1, 0.3, 0.6],[0.1, 0.3, 0.6],[0.1, 0.3, 0.6]]
I know how to reorder this list if I have a list with the order I want (example here). The tricky part is getting this order.
What I have is this:
locations = [(['Loc1','Loc1'], 3), (['Loc2'], 1), (['Loc3', 'Loc3', 'Loc3'], 2)]
This is a list of tuples, where the first element of each tuple is a list with the location name, repeated for each individual in that location, and the second element is the order these individuals are in on the qvalues list (qvalues[0] is 'Loc2', qvalues[1:4] are 'Loc3' and qvalues[4:6] are 'Loc1').
What I want is to change the order of the lists in qvalues to the order they show up in locations: First 'Loc1', then 'Loc2' and finally 'Loc3'.
This is just a small example, my real dataset has hundreds of individuals and 17 locations.
Thanks in advance for any help you may provide.
You will need to build a list of offsets and length instead of length and positions as provided in your locations list. Then, you’ll be able to reorder based on the answer you linked to:
qvalues = [[0.1, 0.3, 0.6],[0.7, 0.1, 0.2],[0.3, 0.4, 0.3],[0.1, 0.3, 0.6],[0.1, 0.3, 0.6],[0.1, 0.3, 0.6]]
locations = [(['Loc1','Loc1'], 3), (['Loc2'], 1), (['Loc3', 'Loc3', 'Loc3'], 2)]
locations_dict = {pos:(index,len(loc)) for index,(loc,pos) in enumerate(locations)}
# if python2: locations_dict = dict([(pos,(index,len(loc))) for index,(loc,pos) in enumerate(locations)])
offsets = [None]*len(locations)
def compute_offset(pos):
    # compute new offset from offset and length of previous position. End of recursion at position 1: we’re at the beginning of the list
    offset = sum(compute_offset(pos-1)) if pos > 1 else 0
    # get index at where to store current offset + length of current location
    index, length = locations_dict[pos]
    offsets[index] = (offset, length)
    return offsets[index]
compute_offset(len(locations))
qvalues = [qvalues[offset:offset+length] for offset,length in offsets]
You’ll end up with qvalues being a list of lists of lists instead of a "simple" list of lists. If you want to flatten it to keep your initial layout use this list comprehension instead:
qvalues = [value for offset,length in offsets for value in qvalues[offset:offset+length]]
Output with first version
[[[0.1, 0.3, 0.6], [0.1, 0.3, 0.6]], [[0.1, 0.3, 0.6]], [[0.7, 0.1, 0.2], [0.3, 0.4, 0.3], [0.1, 0.3, 0.6]]]
Output with second version
[[0.1, 0.3, 0.6], [0.1, 0.3, 0.6], [0.1, 0.3, 0.6], [0.7, 0.1, 0.2], [0.3, 0.4, 0.3], [0.1, 0.3, 0.6]]
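The offsets can also be computed without recursion, by walking the locations in their storage order and keeping a running start index. A sketch of that variant (the names by_position, slices and reordered are made up), producing the flattened layout:

```python
qvalues = [[0.1, 0.3, 0.6], [0.7, 0.1, 0.2], [0.3, 0.4, 0.3],
           [0.1, 0.3, 0.6], [0.1, 0.3, 0.6], [0.1, 0.3, 0.6]]
locations = [(['Loc1', 'Loc1'], 3), (['Loc2'], 1), (['Loc3', 'Loc3', 'Loc3'], 2)]

# Walk the locations in the order they are stored in qvalues (by position)
by_position = sorted(locations, key=lambda item: item[1])
slices = {}
start = 0
for names, pos in by_position:
    slices[pos] = slice(start, start + len(names))
    start += len(names)

# Emit the rows in the order the locations are listed
reordered = [row for names, pos in locations for row in qvalues[slices[pos]]]
```

For very deep location lists this also avoids hitting Python's recursion limit.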
