Assignments with minimum fractions - python

Say we have a number of elements E and a number of sets S.
We need to assign elements to sets so that:
All sets roughly contain the same number of elements (minimum
difference in set size between the smallest and largest set)
The number of elements per set should be as small as possible.
Each element needs to be assigned to at least a minimum % of sets of the total. This % is specified for each element (this
implies that elements are course be assigned to multiple sets
accordingly)
Note that (1) and (2) are problem objectives, and in some instances there is a tradeoff between them. I'm effectively looking for a mathematical formulation / solution that parameterizes this tradeoff. Meanwhile (3) is just a problem constraint.
How do we find an optimal assignment? Does this problem have a name in the literature? In case it matters, I'm specifically looking for a solution in Python.
As an example, say we have 3 sets and 10 elements, each of them specifying the min. fraction of sets as follows:
0 97.844356
1 48.006223
2 99.772135
3 16.899074
4 0.111023
5 1.028894
6 5.315590
7 100.000000
8 99.838698
9 93.323315

You could just rotate infinitely over the sets in order to determine the next set to assign to. Then for each element calculate how many sets it should be assigned to and then do the assignment accordingly:
from itertools import cycle
from math import ceil
elems = [
[0, 97.844356],
[1, 48.006223],
[2, 99.772135],
[3, 16.899074],
[4, 0.111023],
[5, 1.028894],
[6, 5.315590],
[7, 100.000000],
[8, 99.838698],
[9, 93.323315]
]
def assign(elements, n):
sets = [[] for _ in range(n)]
gen = (e for e, p in elements for _ in range(ceil(p*n/100)))
for s, e in zip(cycle(sets), gen):
s.append(e)
return sets
print(assign(elems, 3))
Output:
[[0, 1, 2, 4, 7, 8, 9], [0, 1, 2, 5, 7, 8, 9], [0, 2, 3, 6, 7, 8, 9]]
In above cycle is used to iterate infinitely over the target sets. gen is a generator that returns the minimum amount of elements to add based on the probabilities:
>>> n = 3
>>> gen = (e for e, p in elems for _ in range(ceil(p*n/100)))
>>> list(gen)
[0, 0, 0, 1, 1, 2, 2, 2, 3, 4, 5, 6, 7, 7, 7, 8, 8, 8, 9, 9, 9]
Finally zip is used to generate (target set, element) tuples which are then assigned within a loop.

Related

variation of binary search

Say that the target isn't in the list. Instead of outputting -1, I want it to output the rightmost value that's less than the target, followed by the next value (if there is one). In other words, I want to see the gap where the target would be.
If there was a list [1, 2, 4, 7, 7, 8, 9] and I was looking for 5, instead of outputting -1 I want it to output [4, 7]. How do I do that?
Use bisect.bisect to find the insertion point for 5 using binary search, and then take the elements before and after that point:
>>> import bisect
>>> a = [1, 2, 4, 7, 7, 8, 9]
>>> i = bisect.bisect(a, 5)
>>> a[i-1:i+1]
[4, 7]

How would I generate solutions to fill in a 3x3 matrix with the numbers 1-9 and all rows and columns adding up to 1 number?

I am trying to find a way to get as many solutions as possible for a 3x3 matrix to have all rows and columns add up to the same number. It must use the numbers 1-9. I figured out that I need to use 1 big, med, and small number in each row for it to work.
Ex:
2 4 9 = 15
6 8 1 = 15
7 3 5 = 15
= = =
15 15
I have a dict with the usable numbers grouped by size and a matrix with the 3 biggest numbers each of which are in a separate row because nothing would add up to them if they were in the same row.
nums = {
"small" : [1, 2, 3],
"med" : [4, 5, 6],
"big" : [7, 8, 9]
}
m = [
[0, 0, 9],
[0, 8, 0],
[7, 0, 0]
]
What would be the best way to find all possible solutions to this?
There are two problems to solve:
Generate new possible solutions
Verify that the solution is valid
Step one is easy; python includes a permutation function which will generate every single arrangement of numbers for you. Then you need to verify that the sums all agree. We can simplify that by using #JohanC's observation that each row and column must sum to 15.
from itertools import permutations
def all_sums():
# Generate all possible grids
r = range(1, 10)
grids = permutations(r)
# Only keep grids that are valid solutions
solutions = [g for g in grids if _all_sums_are_15(g)]
return solutions
def _all_sums_are_15(grid):
"""Check that each row and column of the grid sums to 15"""
return (_sum_is_15(grid, 0, 1, 2) and
_sum_is_15(grid, 3, 4, 5) and
_sum_is_15(grid, 6, 7, 8) and
_sum_is_15(grid, 0, 3, 6) and
_sum_is_15(grid, 1, 4, 7) and
_sum_is_15(grid, 2, 5, 8))
def _sum_is_15(grid, a, b, c):
"""Determine if the given 3 cells in the grid sum up to 15"""
sum_ = grid[a] + grid[b] + grid[c]
return sum_ == 15
if __name__ == '__main__':
for s in all_sums():
print(s)
First, note that the sum of each row/column must necessarily be 15, because the sum of the 3 rows together must be equal to the sum of the numbers from 1 to 9, so 45.
Here is a way to generate all 72 solutions using Z3, an open source SAT/SMT solver. Note that Z3 is a powerful solver for this kind of combinatorial problems, and probably a bit overkill for this specific one. But it can be used as an example of how such combinatorial problems can be handled, also much trickier ones. See e.g. this long list of examples.
from z3 import *
# get 9 integer variables for the matrix elements
M = [[Int(f"m{i}{j}") for j in range(3)] for i in range(3)]
# create a Z3 solver instance
s = Solver()
# all numbers must be between 1 and 9
s.add([And(M[i][j] >= 1, M[i][j] <= 9) for i in range(3) for j in range(3)])
# all the rows must sum to 15
s.add([And([Sum([M[i][j] for j in range(3)]) == 15]) for i in range(3)])
# all the columns must sum to 15
s.add([And([Sum([M[i][j] for i in range(3)]) == 15]) for j in range(3)])
# all 9 numbers must be distinct
s.add(Distinct([M[i][j] for i in range(3) for j in range(3)]))
res = s.check()
num_solutions = 0
while res == sat:
num_solutions += 1
m = s.model()
print(num_solutions, ":", [[m[M[i][j]] for j in range(3)] for i in range(3)])
# add a new condition that at least one of the elements must be different to the current solution
s.add(Or([m[M[i][j]].as_long() != M[i][j] for i in range(3) for j in range(3)]))
res = s.check()
Output:
1 : [[3, 4, 8], [5, 9, 1], [7, 2, 6]]
2 : [[5, 7, 3], [9, 2, 4], [1, 6, 8]]
3 : [[6, 8, 1], [7, 3, 5], [2, 4, 9]]
...
72 : [[7, 5, 3], [6, 1, 8], [2, 9, 4]]
All solutions are equivalent to each other. You can permute rows and or columns of a solution to get another. And you can mirror a solution. 3! row permutations times 3! column permutations times 2 for the mirroring, 72 altogether.

Comparing the order of specific elements in a python list

How to implement the following comparison using Python 2
Input is composed of two groups:
Different lists collected from Experiments.
All accepted sequences of some of these elements.
How to filter all those lists from input group 1. for which any of the accepted sequences from input group 2 is a proper subsequence?
For example:
Group two (defined by user):
x = [3, 1, 6]
y = [2, 1, 6]
z = [3, 4, 6]
Group one (from Experiments):
a = [1, 2, 3, 5, 6, 7]
b = [2, 1, 4, 3, 1, 8, 6]
c = [6, 3, 5, 7, 8, 4, 2, 6]
d = [1, 2, 1, 3, 4]
We accept b and c because x is a subsequence of b and z is a subsequence of c. And likewise we reject a and d because none of the x, y or z is a subsequence of either.
mysterious(a) should return [2,6] which is not acceptable as we didn't visit node 1 after 2
mysterious(b) should return [2,1,6] which is acceptable and so on
Another example(detailed):
To accept a certain set or list I need some elements that present some services.
ServiceA served by [ 3 , 2 ]
ServiceB served by [ 1 , 4 ]
ServiceC served by [ 6 ]
Total nodes available to end user [ 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 ]
We ask him to choose a set or list from the total nodes. We will only accept any combination when the nodes which serve services appear in correct order or sequence.
So user could choose any set with unlimited number of nodes as long as:
1. the nodes are member from the total nodes.
2. the order of services will be correct.
example [1,4,5,"Node serve A", 7,1,2, "Node serve B", "Node serve C"]
or so the general form to accept a list or set is:
[some elements, element serve service A, other elements, element service B, more elements, element service c, etc...]
and you can replace the node service element with any element from the correspondent set above
* If that not clear please let me know and will explain in more examples.
Example three:
Lets think in a factory with 10 machines. The product need three different processes to be manufactured.
Every machine can do some of these processes or all it differs.
Machine 1 can do process alpha, gama but not beta.
Machine 2 can do only process alpha
Every raw material arrives need to find route through machines with condition that by end it should have the three processes done.
The processes must be in order so first do Alpha, beta, then at end Gama.
we do route every time to avoid overloading machines.
so now I need a function to accept or reject certain route suggestions to enforce that every raw material go through the processes in correct order.
I can't of course make all possible combinations and then compare as this will consume time and can run for infinity.
Thanks
I wrote the following code for longest_subsequence
a = [1, 2, 3, 5, 6, 7]
b = [2, 1, 4, 3, 1, 8, 6]
c = [6, 3, 5, 7, 8, 4, 2, 6]
d = [1, 2, 1, 3, 4]
x = [3, 1, 6]
y = [2, 1, 6]
z = [3, 4, 6]
group1 = [a, b, c, d]
group2 = [x, y, z]
def longest_subsequence(experiment):
longest = []
for accepted_sequence in group2:
it = iter(experiment)
subsequence = []
for element in accepted_sequence:
if element in it:
subsequence.append(element)
else:
break
if subsequence == list(accepted_sequence):
return subsequence
longest = max(longest, subsequence, key=len)
return longest
for experiment in group1:
print(longest_subsequence(experiment))
It works by using element in iterator, which while scanning for next matching element, discards other elements in between.
The code finds the first longest subsequence in the group2. However since x precedes y and both of them are subsequences of b, x is printed for b:
[3]
[3, 1, 6]
[3, 4, 6]
[2, 1]
Your first example could be handled by the following code:
group1 = {
'x': [3, 1, 6],
'y': [2, 1, 6],
'z': [3, 4, 6],
}
group2 = {
'a': [1, 2, 3, 5, 6, 7],
'b': [2, 1, 4, 3, 1, 8, 6],
'c': [6, 3, 5, 7, 8, 4, 2, 6],
'd': [1, 2, 1, 3, 4],
}
def isSubsequence(sub, seq):
start = 0
for i in sub:
try:
start = seq[start:].index(i)
except ValueError:
return False
return True
for seqName, seq in group2.items(): # Change items for iteritems in Python2
for subName, sub in group1.items(): # Same than above
if isSubsequence(sub, seq):
print("{} is subsequence of {}".format(subName, seqName))
#break # Uncomment this line if you only want to find the first
This outputs:
z is subsequence of c
y is subsequence of b
x is subsequence of b
If the break is uncommented the last row would not appear.
As you can see it doesn't take order of the groups into account as they are stored in dicts, we could use other containers to keep the order if it matters.
isSubsequence() uses str.index() to avoid one of the loops in order to be faster and it also returns as soon as it knows a list is not a subsequence. If only a boolean context is needed, this is, if you only need if any group1 item is a subsequence of a group2 item and not which of them uncummenting the break is suggested.

random generation of list elements with rules

Let list l consists of n elements, where
each element should be either 0 or a positive integer less than or equal to r, and
the sum of list should be equal to m
Example:
Given n = 5, r = 4, m = 10
l = [4, 3, 2, 0, 1]
It is easy to fulfill rule(1), but I wonder if there is any good idea/algo to fulfill both rules?
Here's a simple brute force solution. Basically, you want to generate samples where random integers less than r are equally likely. Samples that do not meet the criteria (sum to m) are rejected.
import numpy as np
def rand_with_rules(n, r, m):
if n*(r-1) < m:
raise ValueError
while True:
l = np.random.randint(0, r, size=(n))
if l.sum() == m:
return l
Note that the rejection of samples will necessarily bias your "random" numbers. Because of your constraints, you can't have a purely set of random and certain will tend to be over or underrepresented.
Take, for example, the case n, r, m = 2, 3, 4. The only series that matches this criteria is (2, 2), so the likelihood of drawing 2 is 100% and 0% for other values.
In essence, this solution states that you have no prior knowledge of which integers are most likely. However, through the constraints, your posterior knowledge of the numbers will almost never be truly uniform.
There's probably a clever analytic solution for the distribution of integers given this constraint, which you could use to generate samples. But I don't know what it is!
Without validation that all rules can be meet at once, I suggest following solution
import random
def dis(n,m,r):
l = [0] * n
for inc in range(1,m):
l.sort()
for ind_less_r, less_r in enumerate(l):
if less_r > r:
break
l[random.randint(0,ind_less_r-1)] += 1
random.shuffle(l)
return l
for i in range(1,10):
print dis(10,50,9)
result is
[7, 7, 6, 1, 6, 3, 5, 4, 4, 6]
[5, 6, 7, 5, 4, 7, 4, 4, 4, 3]
[4, 3, 2, 4, 7, 7, 5, 7, 5, 5]
[4, 4, 5, 6, 4, 6, 6, 4, 5, 5]
[6, 6, 4, 6, 5, 6, 2, 5, 4, 5]
[2, 8, 4, 2, 6, 5, 4, 4, 6, 8]
[6, 6, 3, 4, 5, 5, 5, 5, 6, 4]
[6, 4, 5, 6, 7, 3, 1, 5, 6, 6]
[4, 5, 4, 7, 6, 6, 3, 2, 6, 6]
A reasonable interpretation would be to (i) find and count all unique lists satisfying the rules; and (ii) pick one of those lists at random with equal probability. The problem with this method is that it is complicated to code the list of lists.
A simpler way to do it (less code) is to expose each correct list to the same probability of being picked. That algorithm is:
check that m<=nr If not, return None or raise an error.
Loop: repeatedly generate lists with random numbers in [0,r]. Break the loop and return the first list of sum m.
Note: The tradeoff here for less code is potentially longer execution time. if r is large, or m is an improbable sum, this may take a while. We can mitigate this a little by checking the limits where the answer can only be zeros or r-1's.
from numpy.random import randint
def known_sum_random_list(size,limitint,knownsum):
if knownsum>size*(limitint-1):
return None
if knownsum==size*(limitint-1):
return size*[limitint-1]
s=0
l = size*[0]
while (s!=knownsum):
l = randint(0,limitint,size)
s = sum(l)
return l
for t in xrange(10):
print known_sum_random_list(5,4,10)
output:
[3 2 1 2 2]
[1 1 2 3 3]
[3 0 3 1 3]
[3 2 0 3 2]
[2 2 0 3 3]
[1 3 2 3 1]
[3 3 0 3 1]
[2 0 2 3 3]
[3 1 2 3 1]
[3 2 0 3 2]
Since you responded in comments that it can have many 0s and numbers can repeat, I infer that it need not be all that random. With that in mind, here is a basic solution without any loops or includes. It assumes n, r, and m have valid values and types. But such checks are simple enough to add, I'll edit them in upon request.
def create_list(n, r, m):
output = [r] * (m/r)
remainder = m - r * len(output)
if remainder != 0:
output += [m - r * len(output)]
output += [0] * (n - len(output))
return output
output = create_list(5, 4, 10) # gives [4, 4, 2, 0, 0]
output = create_list(5, 2, 10) # gives [2, 2, 2, 2, 2]
P.S. - request was for values less than r, but example showed values equaling r, so this is going by the example

Using a list to multiply a value?

n1 = 20*values[0]
n2 = 100*values[1]
print(n1,"\n",n2)
I've basically got a list above this, with a few values, only if I would have
currently if i have 3 in the first value and 7 in the second, 3 would show up 20 times, and 7 would show up 100 times,
however i would like it to be just multiplied.
Is there any way to do this without importing anything?
--edited again--
I Should have said this much sooner but i didn't realise inputted values would change anything, or factor into anything of this
top of the code i have:
for num in numbers:
values.append(num)
with values being an empty list, and "numbers" is the Input
I think this is what you want
>>> values = [3,7]
>>> n1 = 20 * [values[0]]
>>> n2 = 5 * [values[1]]
>>> print n1
[3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3]
>>> print n2
[7, 7, 7, 7, 7]
number * list_value = product
In [1752]: values = [3, 7]
In [1753]: 20 * values[0]
Out[1753]: 60
When you multiply by an int, the math functions take over. However, in order for repetition, just convert to a str then multiply:
n1 = 20*str(values[0])
n2 = 100*str(values[1])
print(n1,"\n",n2)
If you have a list of values and another list of numbers to multiply the corresponding values with:
values = [12, 14]
multiples = [20, 100]
Then, you can get what you want with:
result = [value * multiple for value, multiple in zip(values, multiples)]
See zip and list comprehensions.
If, on the other hand, you want to repeat the elements in values by the elements in multiples, you can do:
def repeat(values, repeats):
for value, rep in zip(values, repeats):
for _ in range(rep):
yield value
Now, you can use repeat() as a generator to do what you want:
>>> list(repeat([3, 7], [5, 10])
[3, 3, 3, 3, 3, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7]

Categories

Resources