Pick lines with highest values from np.zeros - python

I have the following data structure:
Pl = np.zeros((7,2,7))
Pl[0,0,1]=1
Pl[1,0,2]=1
Pl[2,0,3]=1
Pl[5,0,6]=0.9
Pl[5,0,5]=0.1
...
Pl[5,1,4]=1
How can I get the entry that has a specified first index and the highest assigned value?
For example, for x=5 I want to get Pl[5,1,4]. I have seen max, but I can't specify the value of x with it.
Thank you!

import numpy as np
Pl = np.zeros((7, 2, 7))
Pl[0, 0, 1] = 1
Pl[1, 0, 2] = 1
Pl[2, 0, 3] = 1
Pl[5, 0, 6] = 0.9
Pl[5, 0, 5] = 0.1
Pl[5, 1, 4] = 1
res = np.array([5, *np.unravel_index(Pl[5].argmax(), Pl[5].shape)])
# array([5, 1, 4])
Here, Pl[5].argmax() returns the position of the maximal value in Pl[5]. It is a 1D integer index into the flattened array, but you can convert it to a 2D index of Pl[5] with np.unravel_index. Finally, the index along the zeroth dimension is still missing; we know it is 5, so just prepend it to build the result array.
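As a minimal sketch of the same steps, the lookup can be wrapped in a small helper. The function name argmax_index_for is hypothetical (it is not part of the answer above), and ties are resolved the way argmax resolves them, by returning the first maximum in flattened order.

```python
import numpy as np

def argmax_index_for(arr, x):
    """Return the full index tuple of the largest entry in arr[x]."""
    sub = arr[x]
    # argmax on an N-d array returns a position in the flattened array
    flat = sub.argmax()
    # unravel_index converts that flat position back to N-d coordinates
    rest = np.unravel_index(flat, sub.shape)
    return (x, *(int(i) for i in rest))

Pl = np.zeros((7, 2, 7))
Pl[5, 0, 6] = 0.9
Pl[5, 0, 5] = 0.1
Pl[5, 1, 4] = 1

idx = argmax_index_for(Pl, 5)
print(idx)      # (5, 1, 4)
print(Pl[idx])  # 1.0
```

Returning a tuple rather than an array means the result can be used directly as an index, as in Pl[idx].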

Related

How do I replace an element with another new element in the same index and move the previous element to the next index

I have this problem where I want to replace an element with a new element, and instead of removing the element I replaced, I just want it to move to the next index.
import numpy as np
empty_arr = [0] * 5
arr = np.array(empty_arr)
inserted = np.insert(empty_arr, 1, 3)
inserted = np.insert(empty_arr, 1, 4)
#Output:
[0 4 0 0 0 0]
I don't know the right syntax for this but I just want to replace element 3 with 4
#Expected Output:
[0 3 4 0 0 0] #move the element 4 to next index
You are placing the result of the first insertion in the inserted variable but you are starting over from the original array for the 2nd insertion and overriding the previous result.
You should start the 2nd insertion from the previous result:
inserted = np.insert(empty_arr, 1, 3)
inserted = np.insert(inserted, 1, 4)
BTW, do you have to use NumPy arrays for this? Regular Python lists seem better suited:
empty_arr = [0] * 5
empty_arr.insert(1,3)
empty_arr.insert(1,4)
print(empty_arr)
[0, 4, 3, 0, 0, 0, 0]
Note that if you want 4 to appear after 3 in your result, you either have to insert them in reverse order at index 1, or insert 4 at index 2 after inserting 3 at index 1.
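The two orderings described above can be sketched as follows. The variable names a, b and c are only illustrative. The third variant, passing both values to a single np.insert call, is an additional option that is not mentioned in the answer.

```python
import numpy as np

values = [0] * 5

# Option 1: insert in reverse order at the same index
a = list(values)
a.insert(1, 4)
a.insert(1, 3)

# Option 2: insert 3 at index 1, then 4 at index 2
b = list(values)
b.insert(1, 3)
b.insert(2, 4)

# Option 3 (NumPy): insert both values at index 1 in one call
c = np.insert(np.array(values), 1, [3, 4])

print(a)  # [0, 3, 4, 0, 0, 0, 0]
print(b)  # [0, 3, 4, 0, 0, 0, 0]
print(c)  # [0 3 4 0 0 0 0]
```

In every variant the result has seven elements, because each insertion lengthens the sequence by one.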
import numpy as np
empty_arr = [0] * 5
arr = np.array(empty_arr)
empty_arr = np.insert(empty_arr, 1, 3)
empty_arr = np.insert(empty_arr, 1, 4)
#output
array([0, 4, 3, 0, 0, 0, 0])

How to replace the N smallest elements in each row of numpy array?

I would like to replace the N smallest elements in each row with 0, such that the resulting array keeps the order and shape of the original array.
Specifically, if the original numpy array is:
import numpy as np
x = np.array([[0,50,20],[2,0,10],[1,1,0]])
And N = 2, I would like for the result to be the following:
x = np.array([[0,50,0],[0,0,10],[0,1,0]])
I tried the following, but in the last row it replaces 3 elements instead of 2 (because it replaces both 1s and not only one):
import numpy as np
N = 2
x = np.array([[0,50,20],[2,0,10],[1,1,0]])
x_sorted = np.sort(x , axis = 1)
x_sorted[:,N:] = 0
replace = x_sorted.copy()
final = np.where(np.isin(x,replace),0,x)
Note that this is a small example, and I would like it to work for a much bigger matrix.
Thanks for your time!
One way using numpy.argsort:
N = 2
x[x.argsort().argsort() < N] = 0
Output:
array([[ 0, 50,  0],
       [ 0,  0, 10],
       [ 0,  1,  0]])
Use numpy.argpartition to find the index of N smallest elements, and then use the index to replace values:
N = 2
idy = np.argpartition(x, N, axis=1)[:, :N]
x[np.arange(len(x))[:,None], idy] = 0
x
array([[ 0, 50,  0],
       [ 0,  0, 10],
       [ 1,  0,  0]])
Note that if there are ties, which of the tied values get replaced is not guaranteed; it depends on the algorithm used.
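If a deterministic choice among tied values is needed, one possible approach (a sketch, not taken from either answer) is a stable sort, which keeps tied values in their original left-to-right order. It uses np.argsort with kind="stable" and np.put_along_axis, and works on a copy so the original array is left untouched.

```python
import numpy as np

x = np.array([[0, 50, 20], [2, 0, 10], [1, 1, 0]])
N = 2

# A stable sort keeps tied values in their original left-to-right order,
# so among equal values the leftmost ones are picked first
order = np.argsort(x, axis=1, kind="stable")
smallest = order[:, :N]

# Write 0 at the selected column positions of each row
result = x.copy()
np.put_along_axis(result, smallest, 0, axis=1)
print(result.tolist())  # [[0, 50, 0], [0, 0, 10], [0, 1, 0]]
```

A full sort costs more than argpartition on wide rows, so this trades some speed for reproducible tie-breaking.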

Count the occurrences of each element in a numpy array where the elements are elementwise equal to another array?

I have two arrays like
[2,2,0,1,1,1,2] and [2,2,0,1,1,1,0]
I need to count (e.g. with bincount) the occurrences of each element in the first array at the positions where it equals the element at the same position in the second array.
So in this case, we get [1,3,2], because 0 occurs once in the same position of the arrays, 1 occurs three times in the same positions and 2 occurs twice in the same positions.
I tried this, but it did not give the desired result:
np.bincount(a[a==b])
Can someone help me?
You must convert your lists to NumPy arrays:
import numpy as np
a = np.array([2,2,0,1,1,1,2])
b = np.array([2,2,0,1,1,1,0])
np.bincount(a, weights=(a==b)) # array([1., 3., 2.])
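np.bincount with weights returns floating-point counts. If integer counts are preferred, a boolean mask is one possible alternative. This is a sketch, assuming both inputs are NumPy arrays of non-negative integers. The minlength argument reserves a slot for values that never match, so the result always has the same length.

```python
import numpy as np

a = np.array([2, 2, 0, 1, 1, 1, 2])
b = np.array([2, 2, 0, 1, 1, 1, 0])

# Keep only the elements of a that equal b at the same position
matches = a[a == b]

# minlength guarantees one bin per possible value of a,
# even if the largest value never matches
counts = np.bincount(matches, minlength=a.max() + 1)
print(counts.tolist())  # [1, 3, 2]
```

This is the expression from the question; it works once a and b are arrays rather than plain lists.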
from datatable import dt, f, by
df = dt.Frame(
col1=[2, 2, 0, 1, 1, 1, 2],
col2=[2, 2, 0, 1, 1, 1, 0]
)
df['equal'] = dt.ifelse(f.col1 == f.col2, 1, 0)
df_sub = df[:, {"sum": dt.sum(f.equal)}, by('col1')]
yourlist = df_sub['sum'].to_list()[0]
yourlist
[1, 3, 2]
array_1 = np.array([2,2,0,1,1,1,2])
array_2 = np.array([2,2,0,1,1,1,0])
# set up a bins array for the results:
if array_1.max() > array_2.max():
    bins = np.zeros(array_1.max()+1)
else:
    bins = np.zeros(array_2.max()+1)
# fill the bin values:
for val1, val2 in zip(array_1, array_2):
    if val1 == val2:
        bins[val1] += 1
# convert bins to a list with int values
bins = bins.astype(int).tolist()
And the results:
[1, 3, 2]

How to bin values in a list into categories

For example, take the double values in the first array [1.2,4.6,3.7,11.2,13,5,18.9,0.3,20.0,26.7,1].
Now I want to create a second array, based on the first one, with the states 1, 2 and 3.
For every value in the first array that is in the range [0,10), add the value 1 to the second array.
So the range [0,10) represents state 1,
the range [10,20) represents state 2,
and the range [20,30) represents state 3.
In the end, the second array would look like [1,1,1,2,2,2,1,3,3,1].
This is a transition state array that will help to build the transition matrix in Python.
If numpy is an option this is quite simple with np.digitize:
import numpy as np
a = np.array([1.2,4.6,3.7,11.2,13,5,18.9,0.3,20.0,26.7,1])
np.digitize(a, (0,10,20))
# array([1, 1, 1, 2, 2, 1, 2, 1, 3, 3, 1], dtype=int64)
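One detail worth checking is how np.digitize treats the bin edges and values outside the intended range. With the default right=False, each bin includes its left edge, values below the first edge map to 0, and everything at or above the last edge maps to len(bins). The sample values and the in_range guard below are illustrative assumptions, not part of the answer.

```python
import numpy as np

bins = (0, 10, 20)
# Illustrative values placed on and around the bin edges
a = np.array([-5.0, 0.0, 9.99, 10.0, 19.99, 20.0, 35.0])

states = np.digitize(a, bins)
print(states.tolist())  # [0, 1, 1, 2, 2, 3, 3]

# Hypothetical guard: flag values outside the intended range [0, 30)
in_range = (a >= 0) & (a < 30)
print(in_range.tolist())  # [False, True, True, True, True, True, False]
```

Note that 35.0 is silently assigned state 3 here. If that is not wanted, a fourth edge at 30 or a guard like the one above may be needed.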
If you do not want to use numpy (see yatu's solution) or want to explicitly see a basic pure Python implementation, check out the below:
arr = [1.2,4.6,3.7,11.2,13,5,18.9,0.3,20.0,26.7,1]
def get_state(el):
    if 0 <= el < 10:
        return 1
    elif 10 <= el < 20:
        return 2
    elif 20 <= el < 30:
        return 3
    else:
        raise Exception(f"Unexpected value: {el}")
res = [get_state(el) for el in arr]
# [1, 1, 1, 2, 2, 1, 2, 1, 3, 3, 1]
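Since the question mentions a transition matrix as the goal, here is a rough sketch of how the state array could feed into one. It assumes a first-order model in which each consecutive pair of states counts as one transition, and that rows are normalized to probabilities; neither assumption is stated in the question.

```python
import numpy as np

states = np.array([1, 1, 1, 2, 2, 1, 2, 1, 3, 3, 1])
n_states = 3

# counts[i, j] = number of times state i+1 is followed by state j+1
counts = np.zeros((n_states, n_states), dtype=int)
# States are 1-based, so shift them to 0-based indices;
# np.add.at accumulates correctly when index pairs repeat
np.add.at(counts, (states[:-1] - 1, states[1:] - 1), 1)
print(counts.tolist())  # [[2, 2, 1], [2, 1, 0], [1, 0, 1]]

# Row-normalize to get transition probabilities
# (assumes every state occurs at least once as a source state)
probs = counts / counts.sum(axis=1, keepdims=True)
```

A row that sums to zero would cause a division by zero, so real data may need a check before normalizing.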

How to randomly throw numbers onto a 2D board

I have a 50x50 2D board with empty cells. I want to fill 20% of the cells with 0, 30% with 1, 30% with 2 and 20% with 3. How can I randomly throw these 4 numbers onto the board with these percentages?
import numpy as np
from numpy import random
dim = 50
map = [[" " for i in range(dim)] for j in range(dim)]
print(map)
One way to get this kind of randomness would be to start with a random permutation of the numbers from 0 to the total number of cells you have minus one.
perm = np.random.permutation(2500)
Now split the permutation according to the proportions you want, and treat the entries of the permutation as indices into the array.
array = np.empty(2500)
p1 = int(0.2*2500)
p2 = int(0.3*2500)
p3 = int(0.3*2500)
array[perm[range(0, p1)]] = 0
array[perm[range(p1, p1 + p2)]] = 1
array[perm[range(p1 + p2, p1 + p2 + p3)]] = 2
array[perm[range(p1 + p2 + p3, 2500)]] = 3
array = array.reshape(50, 50)
This way you ensure the proportions for each number.
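A more compact variant of the same idea, as a sketch: build the exact multiset of cell values first and then shuffle it. It uses NumPy's Generator API (np.random.default_rng); the seed is arbitrary and only there for reproducibility, and the names counts and cells are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
dim = 50
n = dim * dim

# Exact number of cells for the values 0, 1, 2, 3 (20%, 30%, 30%, 20%)
counts = [round(p * n) for p in (0.2, 0.3, 0.3, 0.2)]

# One entry per cell, grouped by value, then shuffled and reshaped
cells = np.repeat([0, 1, 2, 3], counts)
board = rng.permutation(cells).reshape(dim, dim)

values, freq = np.unique(board, return_counts=True)
print(freq.tolist())  # [500, 750, 750, 500]
```

Because the values are fixed before shuffling, the proportions are exact whenever the counts add up to the number of cells.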
Since the percentages sum to 1, you can start with a board of zeros:
bsize = 50
board = np.zeros((bsize, bsize))
In this approach the board positions are interpreted as 1D positions, so we need a set of positions equivalent to 80% of all positions.
for i, pos in enumerate(np.random.choice(bsize**2, int(0.8*bsize**2), replace=False)):
    # the first 30% will be set to 1
    if i < int(0.3*bsize**2):
        board[pos//bsize][pos%bsize] = 1
    # the second 30% (between 30% and 60%) will be set to 2
    elif i < int(0.6*bsize**2):
        board[pos//bsize][pos%bsize] = 2
    # the remaining 20% (between 60% and 80%) will be set to 3
    else:
        board[pos//bsize][pos%bsize] = 3
At the end, the last 20% of positions remain zeros.
As suggested by alexis in the comments, this approach can be simplified by using the shuffle function from the random module:
from random import shuffle
bsize = 50
board = np.zeros((bsize, bsize))
l = list(range(bsize**2))
shuffle(l)
for i, pos in enumerate(l):
    # the first 30% will be set to 1
    if i < int(0.3*bsize**2):
        board[pos//bsize][pos%bsize] = 1
    # the second 30% (between 30% and 60%) will be set to 2
    elif i < int(0.6*bsize**2):
        board[pos//bsize][pos%bsize] = 2
    # the next 20% (between 60% and 80%) will be set to 3
    elif i < int(0.8*bsize**2):
        board[pos//bsize][pos%bsize] = 3
Again, the last 20% of positions remain zeros.
A different approach (admittedly probabilistic, so you won't get exact proportions as with the solution proposed by Brad Solomon):
import numpy as np
res = np.random.random((50, 50))
zeros = np.where(res <= 0.2, 0, 0)
ones = np.where(np.logical_and(res <= 0.5, res > 0.2), 1, 0)
twos = np.where(np.logical_and(res <= 0.8, res > 0.5), 2, 0)
threes = np.where(res > 0.8, 3, 0)
final_result = zeros + ones + twos + threes
Running
np.unique(final_result, return_counts=True)
yielded
(array([0, 1, 2, 3]), array([499, 756, 754, 491]))
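The same probabilistic behaviour can also be sketched with a single call that draws each cell independently with the given probabilities. This uses Generator.choice with its p argument; the seed is arbitrary. As with the code above, the resulting counts are only approximately 500/750/750/500.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed

# Each cell is drawn independently, so the proportions hold only on average
board = rng.choice([0, 1, 2, 3], size=(50, 50), p=[0.2, 0.3, 0.3, 0.2])

values, counts = np.unique(board, return_counts=True)
print(values, counts)
```

This is convenient when approximate proportions are acceptable. For exact counts, the values need to be fixed up front and shuffled instead.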
Here's an approach with np.random.choice to shuffle indices, then filling those indices with repeats of the inserted ints. It will fill the array in the exact proportions that you specify:
import numpy as np
np.random.seed(444)
board = np.zeros(50 * 50, dtype=np.uint8).flatten()
# The "20% cells with 0" can be ignored since that is the default.
#
# This will work as long as the proportions give "clean" ints
# (i.e. no remainder; 2500 * 0.2 is a clean 500). Otherwise, some rounding is needed.
rpt = (board.shape[0] * np.array([0.3, 0.3, 0.2])).astype(int)
repl = np.repeat([1, 2, 3], rpt)
idx = np.random.choice(board.shape[0], size=repl.size, replace=False)
board[idx] = repl
board = board.reshape((50, 50))
Resulting frequencies:
>>> np.unique(board, return_counts=True)
(array([0, 1, 2, 3], dtype=uint8), array([500, 750, 750, 500]))
>>> board
array([[1, 3, 2, ..., 3, 2, 2],
       [0, 0, 2, ..., 0, 2, 0],
       [1, 1, 1, ..., 2, 1, 0],
       ...,
       [1, 1, 2, ..., 2, 2, 2],
       [1, 2, 2, ..., 2, 1, 2],
       [2, 2, 2, ..., 1, 0, 1]], dtype=uint8)
Approach
Flatten the board. Easier to work with indices when the board is (temporarily) one-dimensional.
rpt is a 1d vector of the number of repeats per int. It gets "zipped" together with [1, 2, 3] to create repl, which is length 2000. (80% of the size of the board; you don't need to worry about the 0s in this example.)
The indices of the flattened array are effectively shuffled (idx), and the length of this shuffled array is constrained to the size of the replacement candidates. Lastly, those indices in the 1d board are filled with the replacements, after which it can be made 2d again.
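For board sizes where the proportions do not give whole numbers, the rounding mentioned in the code comment could look roughly like this. It is a sketch using a largest-remainder rule, which is an assumption on my part rather than something the answer prescribes. The helper name cell_counts is hypothetical, and the proportions are assumed to sum to 1.

```python
import numpy as np

def cell_counts(n_cells, proportions):
    """Integer counts per value that sum to n_cells (largest-remainder rule)."""
    exact = n_cells * np.asarray(proportions, dtype=float)
    counts = np.floor(exact).astype(int)
    leftover = int(n_cells - counts.sum())
    # Hand the leftover cells to the entries with the largest fractional parts
    order = np.argsort(exact - counts)[::-1]
    counts[order[:leftover]] += 1
    return counts

props = [0.2, 0.3, 0.3, 0.2]
print(cell_counts(2500, props).tolist())  # [500, 750, 750, 500]

odd = cell_counts(7 * 7, props)           # 49 cells: no clean split exists
print(int(odd.sum()))                     # 49
```

The resulting counts can then be passed to np.repeat in place of rpt, with the count for 0 either dropped or included depending on how the board is initialized.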
