Constraining Pulp Binary Variable Selection Based on Another Variable - python

I'm creating an optimisation script for Fantasy Football. It starts off quite easily- loading in players & their relevant details.
The key in this game is that 15 players can be selected in your squad but only 11 can be fielded per week.
What I would like to do is have 2 variables- one defining that the player is in your squad and a sub-variable that determines whether you put the player in your starting 11.
I have tried a few things- one broad solution is that have 2 unrelated variables. 1 that selects 11 starters and a second that selects 4 subs. This works well for 1 week, but for example one week Player A from your squad might be best starting and the next he's better on the bench. Therefore I would get a more optimal solution if I can make the starting 11 variable a subset of the squad variable.
I've attached the code defining the variables and my attempt at creating a constraint that would link them together. (there are other constraints that all successfully work. For example I can pick a starting 11 or a squad of 15 to maximize expected results without issue, but I cannot pick a starting 11 within a squad of 15.
#VECTORS OF BINARY DECISIONS VARIABLES
squad_variables = []
for rownum in ID:
variable = str('x' + str(rownum))
variable = pulp.LpVariable(str(variable), lowBound = 0, upBound = 1, cat= 'Integer')
squad_variables.append(variable)
xi_variables = []
for rownum in ID:
bariable = str('y' + str(rownum))
bariable = pulp.LpVariable(str(bariable), lowBound = 0, upBound = 1, cat= 'Integer')
xi_variables.append(bariable)
The code below is not working for this task and is the root of the problem..
#ID CONSTRAINTS (ie. only 15 unique id selection across both systems)
id_usage = ""
for rownum in ID:
for i, player in enumerate(squad_variables):
if rownum == i:
formula = max(1*xi_variables[rownum],(1*player))
id_usage += formula
prob += (id_usage ==15)
Any help would be greatly appreciated- perhaps this is simply a non-linear problem. Thank you :)

You want a constraint that says "if x[i] = 0 then y[i] = 0". The typical way to do this is through the constraint y[i] <= x[i]. Note that this only works if both variables are binary; otherwise a modified approach is necessary. I can't quite follow your PuLP code so I won't try to give you the code for this constraint, but I assume you'll be able to implement it once you understand the logic.

Related

Can you create an array of unknown values with CVXPY?

I am attempting to find an array of optimum values to help with selecting the best runtime for some motors in a factory.
At the moment I have created a simple version of what I would like to do, and it works for one.
# Create variables
running_time = cp.Variable(name='running_time')
max_running_time = 240 # in minutes
pump_flow_per_minute = 9.6*3600
required_volume = 2073600 # in litres
# Find the optimal solution
objective = cp.Minimize(running_time)
# Constraints
runtime_constraint = running_time <= max_running_time
volume_contraint = pump_flow_per_minute * running_time >= required_volume
constraints = [runtime_constraint, volume_contraint]
prob = cp.Problem(objective, constraints)
prob.solve()
print("The optimal solution is:", prob.value)
The issue I am having is translating this to multiple areas, where I would like to look at an array of max_running_times which could equate to a 24 hour period, and likewise with the required volumes.
Is there such a way to complete this trick with the use of CVXPY, I ultimately think the use of one problem is best as in the future adding onto this problem, the solution will directly cause effect on the other times of day selected. Is the solution best solved with the liked of loops?

Calculating probability of a Fantasy League in Python

Inspired by some projects, I have decided to work on a calculator project based on Python.
Essentially, I have 5 teams in a fantasy league, with points assigned to these teams based on their current standings. Teams A-E.
Assuming the league has 10 more matches to be played, my main aim is to calculate the probability that a team makes it to the top 3 in their league given the matches have a 33.3% of either going:
A win to the team (which adds 2 points to the winning team)
A lose to the team (which awards 0 points to the losing team)
A draw (which awards 1 point to both teams)
This also in turn means there will be 3^10 outcomes for the 10 matches to be played.
For each of these 3^10 scenarios, I will also compute how the final points table will look, and from there, I will be able to sort and figure out which are the top 3 teams in the fantasy league.
I've worked halfway through the project, as shown:
Points = { "A":12, "B":14, "C":8, "D":12, "E":6} #The current standings
RemainingMatches = [
A:B
B:D
C:E
A:E
D:C
B:D
C:D
A:E
C:E
D:C
]
n=len(RemainingMatches) # Number of matches left
combinations = pow(3,n) # Number of possible scenarios left assumes each game has 3 outcomes
print( "Number of remaining matches = %d" % n )
print( "Number of possible scenarios = %d" % combinations )
for i in range(0,combinations)
...
for i in range(0,n)
I am currently wondering how do I get these possible combinations to match a certain scenario? For example, when i = 0, it points to the first matchup where A wins, B losses. Hence, Points[A] = Points[A] + 2 etc. I know there will be a nested loop since I have to consider the other matches too. But how do I exactly map each scenario, and then nest it?
Apologies if I am being unclear here, struggling with this part at the moment.
Thinking Process:
3 outcomes per game.
for i to combinations:
#When i =1, means A win/B lost?
#When i =2, means B win/A lost?
#When i =3, means A/B drew?
for i to n:
#Go to next match?
Not exactly sure what is the logic flow for this kind of scenario. Thank you.
Here is a different way to write the code. If you knew in advance the outcome of each of the ten remaining games, you could compute which teams would finish in top three. This is done with the play_out function, which takes current standings, remaining games, and the known outcomes of future games.
Now all that remains is to loop over all possible future outcomes and tally the winning probabilities. This is done in the simulate_league function that takes in current standings and remaining games, and returns a probability that a given team finishes in top 3.
There may be situations where two teams are tied for the third place. In cases like this, the code below allows for four teams or more to be in "top 3". To use a different rule, you can change the top3scores = nlargest(3, set(pts.values())) line.
In terms of useful Python functions, itertools.product is used to generate all possible win/draw/loss outcomes, and heapq.nlargest is used to find the largest 3 scores out of a bunch. The collections.Counter class is used to count the number of possibilities in which a given team finishes in top 3.
from itertools import product
from heapq import nlargest
from collections import Counter
Points = {"A":12, "B":14, "C":8, "D":12, "E":6} # The current standings
RemainingMatches = ["A:B","B:D","C:E","A:E","D:C","B:D","C:D","A:E","C:E","D:C"]
# reformat remaining matches
RemainingMatches = [tuple(s.split(":")) for s in RemainingMatches]
def play_out(points, remaining_matches, winloss_schedule):
pts = points.copy()
# simulate remaining games given predetermine win/loss/draw outcomes
# given by winloss_schedule
for (team_a, team_b), outcome in zip(remaining_matches, winloss_schedule):
if outcome == -1:
pts[team_a] += 2
elif outcome == 0:
pts[team_a] += 1
pts[team_b] += 1
else:
pts[team_b] += 2
# top 3 scores (allows for ties)
top3scores = nlargest(3, set(pts.values()))
return {team: score in top3scores for team, score in pts.items()}
def simulate_league(points, remaining_matches):
top3counter = Counter()
for winloss_schedule in product([-1, 0, 1], repeat=len(remaining_matches)):
top3counter += play_out(points, remaining_matches, winloss_schedule)
total_possibilities = 3 ** len(remaining_matches)
return {team: top3count / total_possibilities
for team, top3count in top3counter.items()}
# simulate_league(Points, RemainingMatches)
# {'A': 0.9293637487510372,
# 'B': 0.9962573455943369,
# 'C': 0.5461057765584515,
# 'D': 0.975088485833799,
# 'E': 0.15439719554945894}

Pulp Integer Programming Constraint ignored

I am trying to solve a LpProblem with only boolean variables and Pulp seems to be ignoring some constraints. To give some context about the problem:
I want to find an optimal solution to the problem schools face when trying to create classroom groups. In this case, students are given a paper to write at most 5 other students and the school guarantees them that they will be together with at least one of those students. To see how I modeled this problem into an integer programming problem please refer to this question.
In that link you will see that my variables are defined as x_ij = 1 if student i will be together with student j, and x_i_j = 0 otherwise. Also, in that link I ask about the constraint that I am having trouble implementing with Pulp: if x_i_j = 1 and x_j_k = 1, then by transitive property, x_i_k = 1. In other words, if student i is with student j, and student j is with student k, then, student i will inherently be together with student k.
My objective is to maximize the sum of all the elements of the matrix obtained when performing a Hadamard product between the input matrix and the variables matrix. In other words, I want to contemplate as many of the student's requests as possible.
I will now provide some code snippets and screen captures that should help visualize the problem:
Inputs (just a sample: the real matrix is 37x37)
Output
As you can see in this last image, x_27 = 1 and x_37 = 1 but x_23 = 0 which doesn't make sense.
Here is how I define my variables
def define_variables():
variables = []
for i in range(AMOUNT_OF_STUDENTS):
row = []
for j in range(AMOUNT_OF_STUDENTS):
row.append(LpVariable(f"x_{i}_{j}", lowBound=0, upBound=1, cat='Integer'))
variables.append(row)
return variables
Here is how I define the transitive constraints
for i in range(len(variables)):
for j in range(i, len(variables)):
if i != j:
problem += variables[i][j] == variables[j][i] # Symmetry
for k in range(j, len(variables)):
if i < j < k < len(variables):
problem += variables[i][j] + variables[j][k] - variables[i][k] <= 1 # Transitive
problem += variables[i][j] + variables[i][k] - variables[j][k] <= 1
problem += variables[j][k] + variables[i][k] - variables[i][j] <= 1
When printing the LpProblem I see the constraint that is apparently not working:
As you can see in the output: x_2_7 = 1 and x_3_7 = 1. Therefore, to satisfy this constraint, x_2_3 should also be 1, but as you can also see in the output, it is 0.
Any ideas about what could be happening? I've been stuck for days and the problem seems to be modeled fine and it worked when I only had 8 students (64 variables). Now that I have 37 students (1369 variables) it seems to be behaving oddly. The solver arrives to a solution but it seems to be ignoring some constraints.
Any help is very much appreciated! Thank you in advance.
The constraint is working correctly. Find below the analysis:
(crossposted from github: https://github.com/coin-or/pulp/issues/377)
import pulp as pl
import pytups as pt
path = 'debugSolution.txt'
# import model
_vars, prob = pl.LpProblem.from_json(path)
# get all variables with non-zero value
vars_value = pt.SuperDict(_vars).vfilter(pl.value)
# export the lp
prob.writeLP('debugSolution.lp')
# the constraint you show in the SO problem is:
# _C3833: - x_2_3 + x_2_7 + x_3_7 <= 1
'x_2_7' in vars_value
# True, so x_2_7 has value 1
'x_3_7' in vars_value
# False, so x_3_7 has value 0
'x_2_3' in vars_value
# False, so x_2_3 has value 0
So -0 + 1 + 0 <= 1 means the constraint is respected. There must be a problem with bringing back the value of x_3_7 somewhere because you think is 1 when in pulp it's 0.
This is called a set partitioning problem and PuLP has an example in their documentation here.
In essence, instead of modeling your variables as indicators of whether student A is in the same class as student B, you'll define a mapping between a set of students and a set of classrooms. You can then apply your student preferences as either constraints or part of a maximization objective.

Python Data Structure Selection

Let's say I have a list of soccer players. For now, I only have four players. [Messi, Iniesta, Xavi, Neymar]. More players will be added later on. I want to keep track of the number of times these soccer players pass to each other during the course of a game. To keep track of the passes, I believe I'll need a data structure similar to this
Messi = {Iniesta: 4, Xavi: 5 , Neymar: 8}
Iniesta = {Messi: 4, Xavi: 10 , Neymar: 5}
Xavi = {Messi: 5, Iniesta: 10 , Neymar: 6}
Neymar = {Messi: 8, Iniesta: 5 , Xavi: 6}
Am I right to use a dictionary? If not, what data structure would be better suited? If yes, how do I approach this using a dictionary though? How do I address the issue of new players being included from time to time, and creating a dictionary for them as well.
As an example, If I get the first element in the list, List(i) in the first iteration is Messi, how do i use the value stored in it to create a dictionary with the name Messi. That is how do i get the line below.
Messi = [Iniesta: 4, Xavi: 5 , Neymar: 8]
It was suggested I try something like this
my_dynamic_vars = dict()
string = 'someString'
my_dynamic_vars.update({string: dict()})
Python and programming newbie here. Learning with experience as I go along. Thanks in advance for any help.
This is a fun question, and perhaps a good situation where something like a graph might be useful. You could implement a graph in python by simply using a dictionary whose keys are the names of the players and whose values are lists players that have been passed the ball.
passes = {
'Messi' : ['Iniesta', 'Xavi','Neymar', 'Xavi', 'Xavi'],
'Iniesta' : ['Messi','Xavi', 'Neymar','Messi', 'Xavi'],
'Xavi' : ['Messi','Neymar','Messi','Neymar'],
'Neymar' : ['Iniesta', 'Xavi','Iniesta', 'Xavi'],
}
To get the number of passes by any one player:
len(passes['Messi'])
To add a new pass to a particular player:
passes['Messi'].append('Xavi')
To count the number of times Messi passed to Xavi
passes['Messi'].count('Xavi')
To add a new player, just add him the first time he makes a pass
passes['Pele'] = ['Messi']
Now, he's also ready to have more passes 'appended' to him
passes['Pele'].append['Xavi']
What's great about this graph-like data structure is that not only do you have the number of passes preserved, but you also have information about each pass preserved (from Messi to Iniesta)
And here is a super bare-bones implementation of some functions which capture this behavior (I think a beginner should be able to grasp this stuff, let me know if anything below is a bit too confusing)
passes = {}
def new_pass(player1, player2):
# if p1 has no passes, create a new entry in the dict, else append to existing
if player1 not in passes:
passes[player1] = [player2]
else:
passes[player1].append(player2)
def total_passes(player1):
# if p1 has any passes, return the total number; otherewise return 0
total = len(passes[player1]) if player1 in passes else 0
return total
def total_passes_from_p1_to_p2(player1, player2):
# if p1 has any passes, count number of passes to player 2; otherwise return 0
total = passes[player1].count(player2) if player1 in passes else 0
return total
Ideally, you would be saving passes in some database that you could continuously update, but even without a database, you can add the following code and run it to get the idea:
# add some new passes!
new_pass('Messi', 'Xavi')
new_pass('Xavi', 'Iniesta')
new_pass('Iniesta', 'Messi')
new_pass('Messi', 'Iniesta')
new_pass('Iniesta', 'Messi')
# let's see where we currently stand
print total_passes('Messi')
print total_passes('Iniesta')
print total_passes_from_p1_to_p2('Messi', 'Xavi')
Hopefully you find this helpful; here's some more on python implementation of graphs from the python docs (this was a fun answer to write up, thanks!)
I suggest you construct a two dimensional square array. The array should have dimensions N x N. Each index represents a player. So the value at passes[i][j] is the number of times the player i passed to player j. The value passes[i][i] is always zero because a player can't pass to themselves
Here is an example.
players = ['Charles','Meow','Rebecca']
players = dict( zip(players,range(len(players)) ) )
rplayers = dict(zip(range(len(players)),players.keys()))
passes = []
for i in range(len(players)):
passes.append([ 0 for i in range(len(players))])
def pass_to(f,t):
passes[players[f]][players[t]] += 1
pass_to('Charles','Rebecca')
pass_to('Rebecca','Meow')
pass_to('Charles','Rebecca')
def showPasses():
for i in range(len(players)):
for j in range(len(players)):
print("%s passed to %s %d times" % ( rplayers[i],rplayers[j],passes[i][j],))
showPasses()

Infinite loop in simulation

I'm starting out in python.. The details I have written in the below.. It goes to an infinite loop and give me an error when I try to call the function inside itself.. Is this kind of recursion not allowed ?
Posting code below.. Thanks for all your help :)
The program assumes that we have 100 passengers boarding a plane. Assuming if the first one has lost his boarding pass, he finds a random seat and sits there. Then the other incoming passengers sit in their places if unoccupied or some other random seat if occupied.
The final aim is to find the probability with which the last passenger will not sit in his/her own seat. I haven't added the loop part yet which
would make it a proper simulation. The question above is actually a puzzle in probability. I am trying to verify the answer as I don't really follow the reasoning.
import random
from numpy import zeros
rand = zeros((100,3))
# The rows are : Passenger number , The seat he is occupying and if his designated seat is occupied. I am assuming that the passengers have seats which are same as the order in which they enter. so the 1st passenger enter has a designated seat number 1, 2nd to enter has no. 2 etc.
def cio(r): # Says if the seat is occupied ( 1 if occupied, 0 if not)
if rand[r][2]==1:
return 1
if rand[r][2]==0:
return 0
def assign(ini,mov): # The first is passenger no. and the second is the final seat he gets. So I keep on chaning the mov variable if the seat that he randomly picked was occupied too.
if cio(rand[mov][2])== 0 :
rand[mov][2] = 1
rand[mov][1] = ini
elif cio(rand[mov][2])== 1 :
mov2 = random.randint(0,99)
# print(mov2) Was used to debug.. didn't really help
assign(ini,mov2) # I get the error pointing to this line :(
# Defining the first passenger's stats.
rand[0][0] = 1
rand[0][1] = random.randint(1,100)
m = rand[0][1]
rand[m][2]= 1
for x in range(99):
rand[x+1][0] = x + 2
for x in range(99):
assign(x+1,x+1)
if rand[99][0]==rand[99][1] :
print(1);
else :
print(0);
Please tell me if y'all get the same error.. ALso tell me if I am breaking any rules coz thisi sthe first question I'm posting.. Sorry if it seems too long.
This is how it should've been...
The code does work fine in this case with the following mods :
def assign(ini,mov):
if cio(mov)== 0 : """Changed here"""
rand[mov][2] = 1
rand[mov][1] = ini
elif cio(mov)== 1 : """And here"""
mov2 = random.randint(0,99)
assign(ini,mov2)
I am using Python 2.6.6 on Windows 7, using a software from Enthought Academic Version of Python.
http://www.enthought.com/products/getepd.php
Also the answer to this puzzle is 0.5 which is actually what I am getting(almost) by running it 10000 times.
I didn't see it here but it had to be available online..
http://www.brightbubble.net/2010/07/10/100-passengers-and-plane-seats/
Recursion, while allowed, isn't your best first choice for this.
Python enforces an upper bound on recursive functions. It appears that your loop exceeds the upper bound.
You really want some kind of while loop in assign.
def assign(ini,mov):
"""The first is passenger no. and the second is the final seat he gets. So I keep on chaning the mov variable if the seat that he randomly picked was occupied too.
"""
while cio(rand[mov][2])== 1:
mov = random.randint(0,99)
assert cio(rand[mov][2])== 0
rand[mov][2] = 1
rand[mov][1] = ini
This may be more what you're trying to do.
Note the change to your comments. Triple-quoted string just after the def.
you may be able to find the exact solution using dynamic programming
http://en.wikipedia.org/wiki/Dynamic_programming
For this you will need to add memoization to your recursive function:
What is memoization and how can I use it in Python?
If you just want to estimate the probability using simulation with random numbers then I suggest you break out of your recursive function after a certain depth when the probability is getting really small because this will only change some of the smaller decimal places (most likely.. you may want to plot the change in result as you change the depth).
to measure the depth you could add an integer to your parameters:
f(depth):
if depth>10:
return something
else: f(depth+1)
the maximum recursion depth allowed by default is 1000 although you can change this you will just run out of memory before you get your answer

Categories

Resources