Creating an M*N matrix of unique numbers from 1-90 - python

I want to develop an algorithm in python that returns an 18x9 matrix randomly populated with numbers from 1 to 90 using certain rules/conditions.
Rules
#1 - Maintain an 18x9 array of numbers between 1 and 90.
#2 - First column contains 1-10, second column contains 11-20, etc.
#3 - Each row must have 5 numbers. Other columns should be set to 0.
#4 - Numbers must be arranged in ascending order from top to bottom in a column.
What I have done so far:
import numpy as np
columns = 9
rows = 18
n_per_row = 5
matrix = np.zeros((rows, columns), dtype=int)
# Keep track of available places at each row.
available_places = {k: n_per_row for k in range(rows)}
# Shuffle order in which we fill the columns.
col_order = np.arange(columns)
np.random.shuffle(col_order)
for column in col_order:
    # Indices of available rows.
    indices = [c for c in available_places if available_places[c]]
    # Sample which row to use of the available.
    indices = np.random.choice(indices, size=min(len(indices), 10), replace=False)
    # print(indices)
    # Values for this column.
    col_values = np.random.choice(list(np.arange(1, 10+1)), size=min(len(indices), 10), replace=False) + column*10
    # Fill in ascending order.
    matrix[sorted(indices), column] = sorted(col_values)
    for idx in indices:
        available_places[idx] -= 1
print(matrix)
Result
[[ 0 0 0 31 0 51 0 71 81]
[ 1 11 0 0 0 52 61 72 0]
[ 0 0 21 32 41 0 62 73 0]
[ 0 0 22 33 0 0 0 74 82]
[ 0 12 23 0 42 0 63 0 83]
[ 2 13 24 34 0 53 0 0 0]
[ 3 0 0 0 43 54 64 0 84]
[ 4 0 0 0 44 55 65 0 85]
[ 5 14 0 0 45 0 66 0 86]
[ 6 0 25 35 46 0 0 75 0]
[ 7 15 26 36 0 0 0 0 87]
[ 8 16 0 0 47 0 0 76 88]
[ 0 17 27 0 48 0 0 77 89]
[ 0 18 0 0 49 56 67 78 0]
[ 9 0 28 39 0 57 0 79 0]
[ 0 0 29 0 50 58 68 80 0]
[ 0 19 30 40 0 59 69 0 0]
[10 20 0 0 0 60 70 0 90]]
Expected Result: https://images.meesho.com/images/products/56485141/snksv_512.jpg

Final result according to the 4 rules:
5 values per row
10 values per column, starting with 1, 11, 21, etc., in ascending order
(Note that these rules do not match the bingo ticket shown in the image.)
============ final matrix ===============
--------------------------------
[1, 11, 21, 31, 41, 0, 0, 0, 0]
[2, 12, 0, 32, 42, 0, 61, 0, 0]
[0, 13, 0, 33, 0, 0, 62, 71, 81]
[3, 0, 0, 34, 0, 0, 63, 72, 82]
[0, 0, 22, 0, 0, 51, 64, 73, 83]
[4, 14, 23, 35, 0, 52, 0, 0, 0]
[5, 0, 24, 0, 43, 53, 0, 0, 84]
[6, 15, 0, 36, 44, 54, 0, 0, 0]
[7, 0, 0, 37, 0, 0, 65, 74, 85]
[0, 0, 0, 0, 45, 55, 66, 75, 86]
[8, 16, 25, 0, 0, 0, 67, 76, 0]
[0, 0, 26, 0, 46, 56, 0, 77, 87]
[9, 17, 0, 0, 0, 0, 68, 78, 88]
[10, 18, 0, 0, 0, 57, 0, 79, 89]
[0, 19, 27, 38, 47, 0, 0, 80, 0]
[0, 20, 28, 39, 48, 58, 0, 0, 0]
[0, 0, 29, 0, 49, 59, 69, 0, 90]
[0, 0, 30, 40, 50, 60, 70, 0, 0]
--------------------------------
Principles:
First build a matrix of 0s and 1s, where each 1 is a placeholder for a future value.
Draw 0 or 1 at random for each cell, while tracking the number of 1s in each row and in each column so that the constraints are respected.
Because the random draws may place too few 1s early on, both constraints cannot always be satisfied on the first try. The program retries automatically and prints each attempt for observation (maximum seen in my tests: 10 loops; mean: <= 3 loops).
Once a satisfactory matrix of 0s and 1s is obtained, replace each 1 with the next value for its column.
A solution:
import random
# #1 - Maintain an array of 18x9 (between 1 and 90)
maxRow = 18
maxCol = 9
# #2 - First column contains 1-10, second column contains 11-20, etc.
# ie first column 1 start from 1 and have 10 entries, column 2 start from 11 and have 10 entries, etc.
origins = [i*10 +1 for i in range(maxCol)] #[1, 11, 21, 31, 41, 51, 61, 71, 81]
maxInCol = [10 for i in range(maxCol)] #[10, 10, 10, 10, 10, 10, 10, 10, 10]
# comfort : display matrix
def showMatrix():
    print('--------------------------------')
    for row in range(len(matrix)):
        print(matrix[row])
    print('--------------------------------')
# comfort : count #values in a col
def countInCol(col):
    count = 0
    for row in range(maxRow):
        count+=matrix[row][col]
    return count
# verify the rules of 5 per row and 10 per cols
def verify():
    ok = True
    showMatrix()
    # count elements in a col
    for col in range(maxCol):
        count = 0
        for row in range(maxRow):
            count+= matrix[row][col]
        if(count!= maxInCol[col]):
            print ('*** wrong # of elements in col {0} : {1} instead of {2}'.format(col, count,maxInCol[col]))
            ok = False
    # count elements in a row
    for row in range(maxRow):
        count = 0
        for col in range(maxCol):
            count+= matrix[row][col]
        if(count!=5):
            print('***** wrong # of elements in row {0} : {1}'.format(row, count))
            ok = False
    if (not ok): print( '********************************************')
    return ok
# -- main ----
# need to iterate in case of no more value to complete a col
tour = 1
maxTour = 100 #security limit
while True:
    # prepare a matrix of rows of cols of 0
    matrix = [[0 for i in range(maxCol)] for i in range(18)]
    # begin to fill some places with 1 instead of 0
    for row in range(maxRow):
        count = 0
        for col in range(maxCol):
            if (count==5): break # line is already full with 5 elt
            # random a 0 or 1
            placeHolder = random.choice([0,1])
            # if the remaining cols of this row needs to be 1 to complete at 5/row
            if (5-count) == (maxCol-col):
                placeHolder = 1 # must complete the row
            else:
                inCol = countInCol(col)
                # 10 places max in col
                if (inCol)==maxInCol[col]: placeHolder = 0 # this col is full
                # constraint : if the remaining rows of this col need to be 1 to complete the expected 10 values
                if(maxRow-row) == (maxInCol[col]-inCol): placeHolder = 1
            matrix[row][col] = placeHolder
            count+= placeHolder
    #-------- some case are not correct . prog loops
    if verify():
        print(' ok after {0} loop(s)'.format(tour))
        break
    # security infinite loop
    if (tour>=maxTour): break
    tour +=1
# now replace the placeholders by successive values per col
print('\n============ final matrix ===============')
for row in range(maxRow):
    for col in range(maxCol):
        if matrix[row][col]==1:
            matrix[row][col] = origins[col]
            origins[col]+=1
showMatrix()
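The rules from the question can also be checked mechanically on any finished matrix. The following is only a minimal sketch: the function name `check_ticket_matrix` and its parameters are hypothetical, and it assumes the matrix is a list of rows (or a 2D array) in which 0 marks an empty cell.

```python
def check_ticket_matrix(matrix, n_per_row=5, per_col=10):
    """Return a list of rule violations; an empty list means all checks passed."""
    problems = []
    n_cols = len(matrix[0])
    # Rule 3: every row holds exactly n_per_row non-zero numbers.
    for r, row in enumerate(matrix):
        filled = sum(1 for v in row if v != 0)
        if filled != n_per_row:
            problems.append("row %d: %d numbers instead of %d" % (r, filled, n_per_row))
    for c in range(n_cols):
        values = [row[c] for row in matrix if row[c] != 0]
        # Rule 2: column c may only hold values from c*per_col+1 to (c+1)*per_col.
        lo, hi = c * per_col + 1, (c + 1) * per_col
        if any(v < lo or v > hi for v in values):
            problems.append("column %d: value outside %d-%d" % (c, lo, hi))
        # Rule 4: values increase strictly from top to bottom (so they are also unique).
        if values != sorted(set(values)):
            problems.append("column %d: values not strictly ascending" % c)
    return problems
```

Calling `check_ticket_matrix(matrix)` after the final fill step should return an empty list when the generation succeeded; any returned message names the row or column that breaks a rule.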
HTH

Related

Transformation of the 3d numpy array

I have a 3d array and I need to set its right part to zero. For each 2d slice (n, :, :) of the array, the column index is taken from the vector b. This index defines the separating point between the left and right parts, as shown below.
a_before = [[[ 1 2 3 4]
[ 5 6 7 8]
[ 9 10 11 12]
[13 14 15 16]]
[[17 18 19 20]
[21 22 23 24]
[25 26 27 28]
[29 30 31 32]]
[[33 34 35 36]
[37 38 39 40]
[41 42 43 44]
[45 46 47 48]]]
a_before.shape = (3, 4, 4)
b = (2, 3, 1)
a_after_1 = [[[ 1 2 0 0]
[ 5 6 0 0]
[ 9 10 0 0]
[13 14 0 0]]
[[17 18 19 0]
[21 22 23 0]
[25 26 27 0]
[29 30 31 0]]
[[33 0 0 0]
[37 0 0 0]
[41 0 0 0]
[45 0 0 0]]]
After this, for each 2d slice (n, :, :) I have to take the column whose index is given by the vector c and multiply it by the corresponding value from the vector d.
c = (1, 2, 0)
d = (50, 100, 150)
a_after_2 = [[[ 1 100 0 0]
[ 5 300 0 0]
[ 9 500 0 0]
[13 700 0 0]]
[[17 18 1900 0]
[21 22 2300 0]
[25 26 2700 0]
[29 30 3100 0]]
[[4950 0 0 0]
[5550 0 0 0]
[6150 0 0 0]
[6750 0 0 0]]]
I did it but my version looks ugly. Maybe someone can help me.
P.S. I would like to avoid for loops and use only numpy methods.
Thank You.
Here's a version without loops. It uses b-1 as the index of the column to scale, which is equal to c for this data.
In [232]: A = np.arange(1,49).reshape(3,4,4)
In [233]: b = np.array([2,3,1])
In [234]: d = np.array([50,100,150])
In [235]: I,J = np.nonzero(b[:,None]<=np.arange(4))
In [236]: A[I,:,J]=0
In [237]: A[np.arange(3),:,b-1] *= d[:,None]
In [238]: A
Out[238]:
array([[[ 1, 100, 0, 0],
[ 5, 300, 0, 0],
[ 9, 500, 0, 0],
[ 13, 700, 0, 0]],
[[ 17, 18, 1900, 0],
[ 21, 22, 2300, 0],
[ 25, 26, 2700, 0],
[ 29, 30, 3100, 0]],
[[4950, 0, 0, 0],
[5550, 0, 0, 0],
[6150, 0, 0, 0],
[6750, 0, 0, 0]]])
Before I developed this, I wrote an iterative version. It helped me visualize the problem.
In [240]: Ac = np.arange(1,49).reshape(3,4,4)
In [241]: for i,v in enumerate(b):
     ...:     Ac[i,:,v:]=0
     ...:
In [242]: for i,(bi,di) in enumerate(zip(b,d)):
     ...:     Ac[i,:,bi-1]*=di
It may be easier to understand, and in that sense, less ugly!
The fact that your A has a middle dimension that is just going along for the ride complicates vectorizing the problem.
With a (3,4) 2d array, the solution is just:
In [251]: Ab = Ac[:,0,:]
In [252]: Ab[b[:,None]<=np.arange(4)]=0
In [253]: Ab[np.arange(3),b-1]*=d
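The same idea can be written with a broadcast mask and the question's `c` vector directly, rather than relying on `b-1`. This is a sketch under the question's shapes ((3, 4, 4) array, one entry of `b`, `c` and `d` per slice), not a tuned implementation; the name `keep` is introduced here for illustration.

```python
import numpy as np

a = np.arange(1, 49).reshape(3, 4, 4)
b = np.array([2, 3, 1])        # first column to zero out in each slice
c = np.array([1, 2, 0])        # column to scale in each slice
d = np.array([50, 100, 150])   # scale factor for each slice

# keep[n, j] is True where column j lies left of the split point b[n]
keep = np.arange(a.shape[2]) < b[:, None]
# a length-1 row axis lets the (3, 4) mask broadcast over every row of a slice
a = np.where(keep[:, None, :], a, 0)
# select column c[n] of slice n (shape (3, 4)) and scale it by d[n]
a[np.arange(len(c)), :, c] *= d[:, None]
print(a)
```

Adding the extra axis with `keep[:, None, :]` is what handles the middle dimension: the mask is defined per slice and per column, and broadcasting repeats it for all rows.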
Here it is:
import numpy as np
a = np.arange(1,49).reshape(3,4,4)
b = np.array([2,3,1])
c = np.array([1,2,0])
d = np.array([50,100,150])
for i in range(len(b)):
    a[i,:,b[i]:] = 0
for i,j in enumerate(c):
    a[i,:,j] = a[i,:,j]* d[i]
print(a)
#
[[[ 1 100 0 0]
[ 5 300 0 0]
[ 9 500 0 0]
[ 13 700 0 0]]
[[ 17 18 1900 0]
[ 21 22 2300 0]
[ 25 26 2700 0]
[ 29 30 3100 0]]
[[4950 0 0 0]
[5550 0 0 0]
[6150 0 0 0]
[6750 0 0 0]]]

How can I measure distance from a local minimum value in a numpy array?

I'm using skimage.morphology (scikit-image) to do an erosion on a two-dimensional array. I also need to find the distance from each cell to the minimum value identified by the erosion.
Example:
np.reshape(np.arange(1,126,step=5),[5,5])
array([[ 1, 6, 11, 16, 21],
[ 26, 31, 36, 41, 46],
[ 51, 56, 61, 66, 71],
[ 76, 81, 86, 91, 96],
[101, 106, 111, 116, 121]])
erosion(np.reshape(np.arange(1,126,step=5),[5,5]),selem=disk(3))
array([[ 1, 1, 1, 1, 6],
[ 1, 1, 1, 6, 11],
[ 1, 1, 1, 6, 11],
[ 1, 6, 11, 16, 21],
[26, 31, 36, 41, 46]])
Now what I want to do is also return an array that gives me the distance to the minimum like this:
array([[ 0, 1, 2, 3, 3],
[ 1, 1, 2, 3, 3],
[ 2, 2, 3, 3, 3],
[ 3, 3, 3, 3, 3],
[ 3, 3, 3, 3, 3]])
Is there a scikit tool that can do this? If not, any tips on how to efficiently achieve this result?
You can find the distances from the centre of your footprint using scipy.ndimage.distance_transform_cdt, then use SciPy's ndimage.generic_filter to return those values:
import numpy as np
from skimage.morphology import erosion, disk
from scipy import ndimage as ndi
input_arr = np.reshape(np.arange(1,126,step=5),[5,5])
footprint = disk(3)
def distance_from_min(values, distance_values):
    d = np.inf
    min_val = np.inf
    for i in range(len(values)):
        if values[i] <= min_val:
            min_val = values[i]
            d = distance_values[i]
    return d
full_footprint = np.ones_like(footprint, dtype=float)
full_footprint[tuple(i//2 for i in footprint.shape)] = 0
# use `ndi.distance_transform_edt` instead for the euclidean distance
distance_footprint = ndi.distance_transform_cdt(
    full_footprint, metric='taxicab'
)
# set values outside footprint to 0 for pretty-printing
distance_footprint[~footprint.astype(bool)] = 0
# then, extract it into values matching the values in generic_filter
distance_values = distance_footprint[footprint.astype(bool)]
output = ndi.generic_filter(
    input_arr.astype(float),
    distance_from_min,
    footprint=footprint,
    mode='constant',
    cval=np.inf,
    extra_arguments=(distance_values,),
)
print('input:\n', input_arr)
print('footprint:\n', footprint)
print('distance_footprint:\n', distance_footprint)
print('output:\n', output)
Which gives:
input:
[[ 1 6 11 16 21]
[ 26 31 36 41 46]
[ 51 56 61 66 71]
[ 76 81 86 91 96]
[101 106 111 116 121]]
footprint:
[[0 0 0 1 0 0 0]
[0 1 1 1 1 1 0]
[0 1 1 1 1 1 0]
[1 1 1 1 1 1 1]
[0 1 1 1 1 1 0]
[0 1 1 1 1 1 0]
[0 0 0 1 0 0 0]]
distance_footprint:
[[0 0 0 3 0 0 0]
[0 4 3 2 3 4 0]
[0 3 2 1 2 3 0]
[3 2 1 0 1 2 3]
[0 3 2 1 2 3 0]
[0 4 3 2 3 4 0]
[0 0 0 3 0 0 0]]
output:
[[0. 1. 2. 3. 3.]
[1. 2. 3. 3. 3.]
[2. 3. 4. 4. 4.]
[3. 3. 3. 3. 3.]
[3. 3. 3. 3. 3.]]
This function will be very slow, however. If you want to make it faster, you will need (a) a solution like Numba or Cython for the filter function, in conjunction with SciPy LowLevelCallables and (b) to hardcode the distance array into the distance function, because for LowLevelCallables it is more difficult to pass in extra arguments. Here is a full example with llc-tools, which you can install with pip install numba llc-tools.
import numpy as np
from scipy import ndimage as ndi
from skimage.morphology import erosion, disk
import llc
def filter_func_from_footprint(footprint):
    # first, create a footprint where the values are the distance from the
    # center
    full_footprint = np.ones_like(footprint, dtype=float)
    full_footprint[tuple(i//2 for i in footprint.shape)] = 0
    # use `ndi.distance_transform_edt` instead for the euclidean distance
    distance_footprint = ndi.distance_transform_cdt(
        full_footprint, metric='taxicab'
    )
    # then, extract it into values matching the values in generic_filter
    distance_footprint[~footprint.astype(bool)] = 0
    distance_values = distance_footprint[footprint.astype(bool)]
    # finally, create a filter function with the values hardcoded
    @llc.jit_filter_function
    def distance_from_min(values):
        d = np.inf
        min_val = np.inf
        for i in range(len(values)):
            if values[i] <= min_val:
                min_val = values[i]
                d = distance_values[i]
        return d
    return distance_from_min
if __name__ == '__main__':
    input_arr = np.reshape(np.arange(1,126,step=5),[5,5])
    footprint = disk(3)
    # footprint passed positionally: older scikit-image names it `selem`, newer `footprint`
    eroded = erosion(input_arr, footprint)
    filter_func = filter_func_from_footprint(footprint)
    result = ndi.generic_filter(
        # use input_arr.astype(float) when using euclidean dist
        input_arr,
        filter_func,
        footprint=disk(3),
        mode='constant',
        cval=np.inf,
    )
    print('input:\n', input_arr)
    print('output:\n', result)
Which gives:
input:
[[ 1 6 11 16 21]
[ 26 31 36 41 46]
[ 51 56 61 66 71]
[ 76 81 86 91 96]
[101 106 111 116 121]]
output:
[[0 1 2 3 3]
[1 2 3 3 3]
[2 3 4 4 4]
[3 3 3 3 3]
[3 3 3 3 3]]
For more reading on low-level callables and llc-tools, in addition to the LowLevelCallable documentation on the SciPy site (linked above, plus links therein), you can read these two blog posts I wrote a few years ago:
SciPy's new LowLevelCallable is a game-changer
Prettier LowLevelCallables with Numba JIT and decorators
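For the taxicab case, the distance footprint used in both versions above can also be built with plain NumPy index arithmetic. The sketch below is an illustration only: the function name `disk_and_taxicab_distance` is hypothetical, and the disk mask is constructed by hand as a stand-in for `skimage.morphology.disk`, using the rule that reproduces the 7x7 footprint printed above for radius 3.

```python
import numpy as np

def disk_and_taxicab_distance(radius):
    """Return (footprint, distance) for a disk of the given radius.

    distance holds the taxicab distance from the centre inside the
    footprint and 0 outside it (as in the printed distance_footprint).
    """
    # row and column offsets of every cell relative to the centre
    rr, cc = np.indices((2 * radius + 1, 2 * radius + 1)) - radius
    footprint = rr**2 + cc**2 <= radius**2
    distance = np.abs(rr) + np.abs(cc)
    return footprint, np.where(footprint, distance, 0)

footprint, distance_footprint = disk_and_taxicab_distance(3)
print(distance_footprint)
```

The values selected by `distance_footprint[footprint]` come out in row-major order, which is the same order used to extract `distance_values` in the answer.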

Pandas - Subtracting from columns via priority

A simple illustration of what I'm trying to do: given a set of payroll data, that has columns regular, over_time, double_time, lunch_break, I want to subtract the lunch_break column from the other time columns, in a specified order, until the lunch_break minutes are exhausted. For example, the lunch_break minutes should first come out of regular, then over_time, then double_time. So given the following data set:
import pandas as pd
payroll = [
{'regular': 120, 'over_time': 60, 'double_time': 0, 'lunch_break': 30},
{'regular': 15, 'over_time': 60, 'double_time': 30, 'lunch_break': 45},
{'regular': 15, 'over_time': 15, 'double_time': 120, 'lunch_break': 45},
{'regular': 0, 'over_time': 120, 'double_time': 120, 'lunch_break': 30}
]
payroll_df = pd.DataFrame(payroll)
I need the result of:
result = [
{'regular': 90, 'over_time': 60, 'double_time': 0}, # 30 from reg
{'regular': 0, 'over_time': 30, 'double_time': 30}, # 15 from reg, 30 from ovr
{'regular': 0, 'over_time': 0, 'double_time': 105}, # 15 from reg, 15 from ovr, 15 from dbl
{'regular': 0, 'over_time': 90, 'double_time': 120}, # 0 from reg, 30 from ovr
]
result_df = pd.DataFrame(result)
Is there a good way to do this using pandas?
Vectorized version
df = payroll_df.copy()
df['regular'] = df.regular - df['lunch_break']
df.loc[df.regular < 0, 'over_time'] += df[df.regular < 0].regular
df.loc[df.over_time < 0, 'double_time'] += df[df.over_time < 0].over_time
df[df < 0] = 0
print(df.drop(columns='lunch_break'))
regular over_time double_time
0 90 60 0
1 0 30 30
2 0 0 105
3 0 90 120
One way of doing it
import numpy as np
regular = np.where(payroll_df['regular']-payroll_df['lunch_break']>0, payroll_df['regular']-payroll_df['lunch_break'],0)
b=np.where(regular>0, payroll_df['over_time'],payroll_df['over_time']+(payroll_df['regular']-payroll_df['lunch_break']))
over_time = np.where(b>0,b,0)
double_time= np.where(b<0,payroll_df['double_time']+b,payroll_df['double_time'])
result_df = pd.DataFrame({'regular': regular,'over_time': over_time,'double_time': double_time})
result_df
output
regular over_time double_time
0 90 60 0
1 0 30 30
2 0 0 105
3 0 90 120
def subtract_lunch(row):
    remaining = row['lunch_break']
    for col in time_priority:
        if row[col] >= remaining:
            row[col] = row[col] - remaining
            break
        remaining = remaining - row[col]
        row[col] = 0
    return row[time_priority]
time_priority = ['regular', 'over_time', 'double_time']
payroll_df.apply(subtract_lunch, axis = 1)
You don't say how you want the case where lunch_break is larger than the others put together handled. My code just sets all the other columns to zero, but doesn't indicate the overage.
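The priority rule can also be expressed for any number of columns with a cumulative sum. This is a sketch rather than a drop-in replacement: the function name `subtract_by_priority` is hypothetical, it works on plain arrays whose columns are already in priority order, and (like the row-wise version) it silently drops any overage when the amount to subtract exceeds the row total.

```python
import numpy as np

def subtract_by_priority(times, to_subtract):
    """Subtract to_subtract[i] from row i of times, left column first."""
    times = np.asarray(times)
    to_subtract = np.asarray(to_subtract)
    # minutes available in the higher-priority columns left of each cell
    before = np.cumsum(times, axis=1) - times
    # amount still owed on reaching a column, limited to what the column holds
    taken = np.clip(to_subtract[:, None] - before, 0, times)
    return times - taken

times = [[120, 60, 0], [15, 60, 30], [15, 15, 120], [0, 120, 120]]
lunch = [30, 45, 45, 30]
print(subtract_by_priority(times, lunch))
```

With the DataFrame from the question, the inputs would be `payroll_df[time_priority].to_numpy()` and `payroll_df['lunch_break'].to_numpy()`.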

Python Randomly assign a list from a set number

What I'm trying to do is make a list that gets filled with different combinations of numbers (not necessarily even) that all add up to a predefined number.
For example, if I have a variable total = 50 and a list that holds 7 numbers, each time I generate and print the list in a loop, the results should be completely different, with some values being huge and others near empty or empty. I don't want any restrictions on the range of each value (it could come out as 0 or the entire 50, and next time the list may even be balanced).
Is this possible?
Thanks
EDIT: I've gotten to here, but it seems to prioritize the ending. How can I make each variable have an equal chance of high or low numbers?
import random
tot = 50
size = 7
s = 0
run = 7
num = {}
while run > 0:
    num[run] = random.randint(s,tot)
    tot -= num[run]
    run -= 1
print(str(num))
Disclaimer: I don't mind what this code is meant to be.
from random import randint, seed
seed(345)
def bizarre(total, slots):
    tot = total
    acc = []
    for _ in range(slots-1):
        r = randint(0,tot)
        tot -= r
        acc.append(r)
    acc.append(total-sum(acc))
    return acc
# testing code
for i in range(10):
    tot = randint(50,80)
    n = randint(5,10)
    b = bizarre(tot, n)
    print("%3d %3d %s -> %d" % (tot, n, b, sum(b)))
Output (from a sample run; the exact values depend on the Python version)
73 5 [73, 0, 0, 0, 0] -> 73
54 6 [36, 5, 9, 0, 3, 1] -> 54
60 7 [47, 6, 6, 1, 0, 0, 0] -> 60
69 7 [3, 48, 15, 3, 0, 0, 0] -> 69
72 8 [36, 18, 18, 0, 0, 0, 0, 0] -> 72
65 8 [17, 32, 13, 3, 0, 0, 0, 0] -> 65
54 7 [33, 13, 0, 2, 4, 1, 1] -> 54
54 6 [7, 11, 26, 3, 5, 2] -> 54
67 7 [62, 5, 0, 0, 0, 0, 0] -> 67
67 8 [28, 25, 1, 0, 10, 3, 0, 0] -> 67
If you want a list of n random numbers that add up to a variable x, create n-1 random numbers. Then the last number is the difference between x and the sum of the n-1 random numbers. For example, if you want a list of size three that adds up to 5, create two numbers randomly, say 1 and 2. 1+2 = 3 and 5-3 = 2, so the list is 1, 2, 2.
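Drawing the numbers one after another from what is left tends to favour the slots filled first. A different technique, shown here only as a sketch, is to pick n-1 random cut points between 0 and the total and use the gaps between neighbouring cuts as the values, so that no position is drawn from a shrinking remainder. The function name `random_split` is hypothetical, and no claim is made about the exact distribution of each slot.

```python
import random

def random_split(total, slots):
    """Split total into `slots` non-negative integers that sum to total."""
    # slots-1 cut points anywhere in [0, total]; repeated cuts give zero-size gaps
    cuts = sorted(random.randint(0, total) for _ in range(slots - 1))
    bounds = [0] + cuts + [total]
    # each value is the gap between two neighbouring bounds
    return [hi - lo for lo, hi in zip(bounds, bounds[1:])]

for _ in range(5):
    parts = random_split(50, 7)
    print(parts, sum(parts))
```

Because the gaps between sorted cut points always add up to the distance from 0 to the total, the sum is correct by construction and no final correction step is needed.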

Iterate over a matrix, sum over some rows and add the result to another array

Hi there I have the following matrix
[[ 47 43 51 81 54 81 52 54 31 46]
[ 35 21 30 16 37 11 35 30 39 37]
[ 8 17 11 2 5 4 11 9 17 10]
[ 5 9 4 0 1 1 0 3 9 3]
[ 2 7 2 0 0 0 0 1 2 1]
[215 149 299 199 159 325 179 249 249 199]
[ 27 49 24 4 21 8 35 15 45 25]
[100 100 100 100 100 100 100 100 100 100]]
I need to iterate over the matrix, summing all elements in rows 0, 1, 2, 3 and 4 only.
Example: I need
row_0_sum = 47+43+51+81....46
Furthermore, I need to store each row's sum in an array like this:
[row0_sum, row1_sum, row2_sum, row3_sum, row4_sum]
So far I have tried this code, but it's not doing the job:
mu = np.zeros(shape=(1,6))
#get an average
def standardize_ratings(matrix):
    sum = 0
    for i, eli in enumerate(matrix):
        for j, elj in enumerate(eli):
            if(i<5):
                sum = sum + matrix[i][j]
                if(j==elj.len -1):
                    mu[i] = sum
                    sum = 0
                    print "mu[i]="
                    print mu[i]
This just gives me an error: numpy.int32 object has no attribute 'len'
So can someone help me? What's the best way to do this, and which type of array in Python should I use to store the result? I'm new to Python but have done programming before.
Thanks
Make your data, matrix, a numpy.ndarray object, instead of a list of lists, and then just do matrix.sum(axis=1).
>>> import numpy as np
>>> matrix = np.asarray([[ 47, 43, 51, 81, 54, 81, 52, 54, 31, 46],
...                      [ 35, 21, 30, 16, 37, 11, 35, 30, 39, 37],
...                      [ 8, 17, 11, 2, 5, 4, 11, 9, 17, 10],
...                      [ 5, 9, 4, 0, 1, 1, 0, 3, 9, 3],
...                      [ 2, 7, 2, 0, 0, 0, 0, 1, 2, 1],
...                      [215, 149, 299, 199, 159, 325, 179, 249, 249, 199],
...                      [ 27, 49, 24, 4, 21, 8, 35, 15, 45, 25],
...                      [100, 100, 100, 100, 100, 100, 100, 100, 100, 100]])
>>> print(matrix.sum(axis=1))
[ 540 291 94 35 15 2222 253 1000]
To get the first five rows from the result, you can just do:
>>> row_sums = matrix.sum(axis=1)
>>> rows_0_through_4_sums = row_sums[:5]
>>> print(rows_0_through_4_sums)
[540 291 94 35 15]
Or, you can alternatively sub-select only those rows to begin with and only apply the summation to them:
>>> rows_0_through_4 = matrix[:5,:]
>>> print(rows_0_through_4.sum(axis=1))
[540 291 94 35 15]
Some helpful links will be:
NumPy for Matlab Users, if you are familiar with these things in Matlab/Octave
Slicing/Indexing in NumPy
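For comparison with the loop attempted in the question, here is a minimal pure-Python sketch of the same row sums. The error in the question arises because `elj` is a single number, which has no length; iterating over each row directly avoids needing a length at all. The function name `first_row_sums` is hypothetical.

```python
def first_row_sums(matrix, n_rows=5):
    """Return the sums of the first n_rows rows as a list."""
    sums = []
    for row in matrix[:n_rows]:      # only rows 0 .. n_rows-1
        total = 0
        for value in row:            # add up every element of this row
            total += value
        sums.append(total)
    return sums

matrix = [
    [47, 43, 51, 81, 54, 81, 52, 54, 31, 46],
    [35, 21, 30, 16, 37, 11, 35, 30, 39, 37],
    [8, 17, 11, 2, 5, 4, 11, 9, 17, 10],
    [5, 9, 4, 0, 1, 1, 0, 3, 9, 3],
    [2, 7, 2, 0, 0, 0, 0, 1, 2, 1],
    [215, 149, 299, 199, 159, 325, 179, 249, 249, 199],
    [27, 49, 24, 4, 21, 8, 35, 15, 45, 25],
    [100, 100, 100, 100, 100, 100, 100, 100, 100, 100],
]
print(first_row_sums(matrix))  # [540, 291, 94, 35, 15]
```

The NumPy version above remains the shorter and faster choice; this form only shows what the nested loop was trying to compute.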
