A recruiter wants to form a team covering a set of skills, picking the minimum number of people who together cover all the required skills.
N is the number of people and K is the number of distinct skills that must be covered. The list spec_skill = [[1,3],[0,1,2],[0,2,4]] gives the skills of each person: person 0 has skills 1 and 3, person 1 has skills 0, 1 and 2, and so on.
The code should output the size of the smallest team the recruiter could find (the minimum number of people) and the IDs of the people to recruit onto the team.
I implemented it with brute force as below, but since some inputs have thousands of entries, it seems it needs to be solved with a heuristic approach. An approximate answer is acceptable in this case.
Any suggestions on how to solve it with heuristic methods would be appreciated.
import itertools

N,K = 3,5
spec_skill = [[1,3],[0,1,2],[0,2,4]]
A = list(range(K))
set_a = set(A)
solved = False
# Try team sizes in increasing order, so the first cover found is a smallest one
for L in range(0, len(spec_skill)+1):
    for subset in itertools.combinations(spec_skill, L):
        s = set(item for sublist in subset for item in sublist)
        if set_a.issubset(s):
            print(str(len(subset)) + '\n' + ' '.join([str(spec_skill.index(item)) for item in subset]))
            solved = True
            break
    if solved: break
Here is my way of doing this. There might be potential optimization possibilities in the code, but the base idea should be understandable.
import random
import time

def man_power(lst, K, iterations=None, period=0):
    """
    Specify a fixed number of iterations
    or a period in seconds to limit the total computation time.
    """
    # mapping each sublist into a (sublist, original_index) tuple
    lst2 = [(lst[i], i) for i in range(len(lst))]
    mini_sample = [0]*(len(lst)+1)
    if period<0 or (period == 0 and iterations is None):
        raise AttributeError("You must specify iterations or a positive period")

    def shuffle_and_pick(lst, iterations):
        mini = [0]*len(lst)
        for _ in range(iterations):
            random.shuffle(lst2)
            skillset = set()
            chosen_ones = []
            idx = 0
            fullset = True
            # Breaks from the loop when all skillsets are found
            while len(skillset) < K:
                # No need to go further, we didn't find a better combination
                if len(chosen_ones) >= len(mini):
                    fullset = False
                    break
                before = len(skillset)
                skillset.update(lst2[idx][0])
                after = len(skillset)
                if after > before:
                    # We append with the original index of the sublist
                    chosen_ones.append(lst2[idx][1])
                idx += 1
            if fullset:
                mini = chosen_ones.copy()
        return mini

    # Estimates how many iterations we can do in the specified period
    if iterations is None:
        t0 = time.perf_counter()
        mini_sample = shuffle_and_pick(lst, 1)
        iterations = int(period / (time.perf_counter() - t0)) - 1
    mini_result = shuffle_and_pick(lst, iterations)
    if len(mini_sample)<len(mini_result):
        return mini_sample, len(mini_sample)
    else:
        return mini_result, len(mini_result)
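For comparison with the random-shuffle approach above, here is a minimal sketch of the classic greedy set-cover heuristic, which is a different technique from the one in this answer. It repeatedly picks the person who adds the most still-uncovered skills. The function name greedy_team is made up for illustration.

```python
def greedy_team(spec_skill, K):
    """Greedy set-cover heuristic (an approximation, not an exact method)."""
    uncovered = set(range(K))
    team = []
    while uncovered:
        # Person who covers the largest number of still-missing skills
        best = max(range(len(spec_skill)),
                   key=lambda i: len(uncovered & set(spec_skill[i])))
        gained = uncovered & set(spec_skill[best])
        if not gained:
            raise ValueError("the required skills cannot all be covered")
        team.append(best)
        uncovered -= gained
    return team

team = greedy_team([[1, 3], [0, 1, 2], [0, 2, 4]], 5)
print(len(team))
print(' '.join(map(str, team)))
```

On the three-person example this picks person 1 first and ends up with all three people, while the optimum is two (persons 0 and 2). That is the kind of gap to expect from an approximate method; combining it with a few random restarts, as in the code above, is one possible refinement.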
Problem
I'm implementing a generalized assignment problem in both LINGO (which I have experience using to model mathematical problems) and Or-tools, but the two give different results.
Brief explanation of my assignment problem
I have a set of houses (called 'object' in the model) that need to be built. Each house needs a set of resources. To supply these resources, there are 3 suppliers. The resource cost varies by supplier.
The model should assign those suppliers to the houses in order to minimize the total cost of assignments.
Model
Parameters
resource_cost_per_supplier[i,j]: cost of resource i of supplier j.
resource_cost_factor_per_object[i,j]: matrix that signals which resources are demanded by each object (cost factor > 0) and holds the cost factor of resource i demanded by object j. This factor is calculated from the duration of use of the resource during the construction of the object and from other contractual factors.
supplier_budget_limit[j]: budget limit of supplier j, expressed as a share of the total cost. Each supplier has a budget limit that should not be exceeded (it's in the contract).
supplier_budget_tolerance_margin_limit[j]: budget tolerance margin of supplier j. For the model to work, I had to create this tolerance margin, which is applied to the supplier budget limit to create an acceptable range of supplier cost.
object_demand_attended_per_supplier[i,j]: binary matrix that signals if the supplier i has all the resources required by object j.
Variables
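x[i,j]: binary variable that indicates whether supplier i is (1) or is not (0) assigned to object j.
supplier_cost[j]: variable that represents the cost of supplier j in the market share. Its value is given by: supplier_cost[j] = sum over resources i and objects k of resource_cost_per_supplier[i,j] * resource_cost_factor_per_object[i,k] * x[j,k].
total_cost: variable that represents the total cost of the market share. Its value is given by: total_cost = sum over suppliers j of supplier_cost[j].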
Objective function
min Z = total_cost
Constraints
1 - Ensure that each object j has exactly one supplier i.
2 - For each supplier j, the sum of the cost of all its assignments must be greater than or equal to the total cost times its budget limit minus the tolerance margin.
3 - For each supplier j, the sum of the cost of all its assignments must be less than or equal to the total cost times its budget limit plus the tolerance margin.
4 - Ensure that a supplier i is not assigned to an object j if supplier i cannot provide all the resources of object j.
5 - Ensure that variable x is binary for every supplier i and object j.
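Reading the constraints off the two implementations in the Code section, the whole model can be written compactly as follows. This is a reconstruction, not the original formulas. The short symbols are introduced here only for readability: r_{ij} is resource_cost_per_supplier[i,j], f_{ik} is resource_cost_factor_per_object[i,k], b_j and t_j are the budget limit and tolerance margin of supplier j, a_{jk} is object_demand_attended_per_supplier[j,k], c_j is supplier_cost[j], and x_{jk} is the assignment of supplier j to object k (the index order used in the code).

```latex
\begin{aligned}
\min\; & Z = \sum_{j} c_j, \qquad c_j = \sum_{i}\sum_{k} r_{ij}\, f_{ik}\, x_{jk} \\
\text{s.t.}\; & \sum_{j} x_{jk} = 1 && \forall k \\
& c_j \ge Z\,(b_j - t_j) && \forall j \\
& c_j \le Z\,(b_j + t_j) && \forall j \\
& x_{jk} \le a_{jk} && \forall j, k \\
& x_{jk} \in \{0, 1\} && \forall j, k
\end{aligned}
```

Because Z is itself a linear function of x, constraints 2 and 3 stay linear even though they multiply the total cost by a constant share.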
Code
Or-tools (Python)
from __future__ import print_function
from ortools.linear_solver import pywraplp
import pandas as pd
import numpy
###### [START] parameters ######
num_objects = 252 #Number of objects
num_resources = 35 #Number of resources (not every object will use all resources. It depends of the type of the object and other things)
num_suppliers = 3 #Number of suppliers
resource_cost_per_supplier = pd.read_csv('https://raw.githubusercontent.com/hrassis/divisao-mercado/master/input_prototype/resource_cost_per_supplier.csv', index_col = 0).to_numpy()
resource_cost_factor_per_object = pd.read_csv('https://raw.githubusercontent.com/hrassis/divisao-mercado/master/input_prototype/resource_cost_factor_per_object.csv', index_col = 0).to_numpy()
object_demand_attended_per_supplier = pd.read_csv('https://raw.githubusercontent.com/hrassis/divisao-mercado/master/input_prototype/object_demand_attended_per_supplier.csv', index_col = 0).to_numpy()
supplier_budget_limit = pd.read_csv('https://raw.githubusercontent.com/hrassis/divisao-mercado/master/input_prototype/supplier_budget_limit.csv', index_col = 0)['budget_limit'].values
supplier_budget_tolerance_margin_limit = pd.read_csv('https://raw.githubusercontent.com/hrassis/divisao-mercado/master/input_prototype/supplier_budget_tolerance_margin_limit.csv', index_col = 0)['tolerance_margin'].values
###### [END] parameters ######
###### [START] variables ######
#Assignment variable
x = {}
supplier_cost = []
#Total cost of market share
total_cost = 0
###### [END] variables ######
def main():
    #Declare the solver
    solver = pywraplp.Solver('GeneralizedAssignmentProblem', pywraplp.Solver.CBC_MIXED_INTEGER_PROGRAMMING)
    #Assignment variable
    #x = {}
    #Ensure that the assignment variable is binary
    for i in range(num_suppliers):
        for j in range(num_objects):
            x[i, j] = solver.BoolVar('x[%i,%i]' % (i,j))
    #Assigning an expression to each supplier_cost element
    for j in range(num_suppliers):
        supplier_cost.append(solver.Sum(solver.Sum(resource_cost_per_supplier[i,j] * resource_cost_factor_per_object[i,k] * x[j,k] for k in range(num_objects)) for i in range(num_resources)))
    #Total cost of market share
    total_cost = solver.Sum(supplier_cost[j] for j in range(num_suppliers))
    #Objective function
    solver.Minimize(total_cost)
    ###### [START] constraints ######
    # 1 - Ensure that each object will have only one supplier
    for j in range(num_objects):
        solver.Add(solver.Sum([x[i,j] for i in range(num_suppliers)]) == 1)
    # 2 - For each supplier j, the sum of the cost of all its allocations must be greater than or equal to its budget limit minus the tolerance margin
    for j in range(num_suppliers):
        solver.Add(supplier_cost[j] >= total_cost * (supplier_budget_limit[j] - supplier_budget_tolerance_margin_limit[j]))
    # 3 - For each supplier j, the sum of the cost of all its allocations must be less than or equal to its budget limit plus the tolerance margin
    for j in range(num_suppliers):
        solver.Add(supplier_cost[j] <= total_cost * (supplier_budget_limit[j] + supplier_budget_tolerance_margin_limit[j]))
    # 4 - Ensure that a supplier i will not be assigned to an object j if the supplier i cannot supply all resources demanded by object j
    for i in range(num_suppliers):
        for j in range(num_objects):
            solver.Add(x[i,j] - object_demand_attended_per_supplier[i,j] <= 0)
    ###### [END] constraints ######
    solution = solver.Solve()
    #Print the result
    if solution == pywraplp.Solver.OPTIMAL:
        print('------- Solution -------')
        print('Total cost =', round(total_cost.solution_value(), 2))
        for i in range(num_suppliers):
            print('-----')
            print('Supplier', i)
            print('-> cost:', round(supplier_cost[i].solution_value(), 2))
            print('-> cost percentage:', format(supplier_cost[i].solution_value()/total_cost.solution_value(),'.2%'))
            print('-> supplier budget limit:', format(supplier_budget_limit[i], '.0%'))
            print('-> supplier budget tolerance margin limit:', format(supplier_budget_tolerance_margin_limit[i], '.0%'))
            print('-> acceptable range: {0} <= cost percentage <= {1}'.format(format(supplier_budget_limit[i] - supplier_budget_tolerance_margin_limit[i], '.0%'), format(supplier_budget_limit[i] + supplier_budget_tolerance_margin_limit[i], '.0%')))
            # print('-> objects: {0}'.format(i))
    else:
        print('The problem does not have an optimal solution.')
    #Generate a result to consult
    #(rows are collected in a list because DataFrame.append was removed in pandas 2.0)
    rows = []
    for i in range(num_suppliers):
        for j in range(num_objects):
            rows.append({'object': j, 'supplier': i, 'cost': get_object_cost(j, i), 'assigned': x[i, j].solution_value()})
    assignment_result = pd.DataFrame(rows, columns=['object','supplier','cost','assigned'])
    assignment_result.to_excel('assignment_result.xlsx')

def get_object_cost(object_index, supplier_index):
    object_cost = 0.0
    for i in range(num_resources):
        object_cost = object_cost + resource_cost_factor_per_object[i,object_index] * resource_cost_per_supplier[i,supplier_index]
    return object_cost

#Run main
main()
LINGO
model:
title: LINGO;
data:
    !Number of objects;
    num_objects = @OLE('LINGO_input.xlsx',num_objects);
    !Number of resources (not every object will use all resources. It depends of the type of the object and other things);
    num_resources = @OLE('LINGO_input.xlsx',num_resources);
    !Number of suppliers;
    num_suppliers = @OLE('LINGO_input.xlsx',num_suppliers);
enddata
sets:
    suppliers/1..num_suppliers/:supplier_budget_limit,supplier_tolerance_margin_limit,supplier_cost;
    resources/1..num_resources/:;
    objects/1..num_objects/:;
    resources_suppliers(resources,suppliers):resource_cost_per_supplier;
    resources_objects(resources,objects):resource_cost_factor_per_object;
    suppliers_objects(suppliers,objects):x,object_demand_attended_supplier;
endsets
data:
    resource_cost_per_supplier = @OLE('LINGO_input.xlsx',resource_cost_per_supplier[cost]);
    resource_cost_factor_per_object = @OLE('LINGO_input.xlsx',resource_cost_factor_per_object[cost_factor]);
    supplier_budget_limit = @OLE('LINGO_input.xlsx',supplier_budget_limit[budget_limit_percentage]);
    supplier_tolerance_margin_limit = @OLE('LINGO_input.xlsx',supplier_budget_tolerance_margin_limit[budget_tolerance_percentage]);
    object_demand_attended_supplier = @OLE('LINGO_input.xlsx',object_demand_attended_per_supplier[supply_all_resources]);
enddata
!The array 'supplier_cost' was created to store the total cost of each supplier;
@FOR(suppliers(j):supplier_cost(j)= @SUM(resources(i):@SUM(objects(k):resource_cost_per_supplier(i,j)*resource_cost_factor_per_object(i,k)*x(j,k))));
!Total cost of market share;
total_cost = @SUM(suppliers(i):supplier_cost(i));
!Objective function;
min = total_cost;
!Ensure that each object will have only one supplier;
@FOR(objects(j):@SUM(suppliers(i):x(i,j))=1);
!For each supplier j, the sum of the cost of all your assignments must be greater than or equal to your budget limit minus the tolerance margin;
@FOR(suppliers(j):supplier_cost(j) >= total_cost*(supplier_budget_limit(j)-supplier_tolerance_margin_limit(j)));
!For each supplier j, the sum of the cost of all your assignments must be less than or equal to your budget limit plus the tolerance margin;
@FOR(suppliers(j):supplier_cost(j) <= total_cost*(supplier_budget_limit(j)+supplier_tolerance_margin_limit(j)));
!Ensure that a supplier j will not assigned to an object k if the supplier j can not supply all resources demanded by object k;
@FOR(suppliers(j):@FOR(objects(k):x(j,k)-object_demand_attended_supplier(j,k)<=0));
!Ensure that the assignment variable is binary;
@FOR(suppliers(i):@FOR(objects(j):@BIN(x(i,j))));
data:
    @OLE('LINGO_input.xlsx',output[assigned])=x;
    @OLE('LINGO_input.xlsx',objective_function_value)=total_cost;
    @OLE('LINGO_input.xlsx',supplier_cost)=supplier_cost;
enddata
Results
The picture below shows the comparative result between Or-Tools and LINGO. I emphasize that the data used by the two implementations were exactly the same and I checked all the data several times.
Note that there is a difference of 1.876,20 between the two implementations. LINGO, which uses a branch-and-bound algorithm, found a better solution than Or-Tools. The difference is caused by the assignment inconsistencies shown below.
Regarding the processing time of the algorithms, LINGO took around 14 min and Or-Tools less than 1 min.
All the data used in the two implementations are in this repository: https://github.com/hrassis/divisao-mercado. Data used by LINGO is in folder input_lingo and used by Or-Tools is in the folder input_prototype. In addition I uploaded the validation report.
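When two solvers disagree, one way to see which result to trust is to recompute the cost and re-check the constraints of each reported assignment outside of any solver. The sketch below does that with plain Python lists. The function name evaluate_assignment, the list-of-lists layout and the tiny data set are assumptions made for illustration, not part of the repository.

```python
def evaluate_assignment(assign, resource_cost, cost_factor, can_supply,
                        limit, margin, tol=1e-9):
    """assign[k] is the supplier chosen for object k.

    resource_cost[i][j]: cost of resource i at supplier j
    cost_factor[i][k]:   cost factor of resource i for object k
    can_supply[j][k]:    1 if supplier j can supply object k
    Returns (total_cost, supplier_cost, violations).
    """
    num_resources = len(resource_cost)
    num_suppliers = len(limit)
    supplier_cost = [0.0] * num_suppliers
    violations = []
    for k, j in enumerate(assign):
        if not can_supply[j][k]:
            violations.append("supplier %d cannot supply object %d" % (j, k))
        supplier_cost[j] += sum(resource_cost[i][j] * cost_factor[i][k]
                                for i in range(num_resources))
    total = sum(supplier_cost)
    for j in range(num_suppliers):
        low = total * (limit[j] - margin[j])
        high = total * (limit[j] + margin[j])
        if not (low - tol <= supplier_cost[j] <= high + tol):
            violations.append("supplier %d cost share outside its range" % j)
    return total, supplier_cost, violations

# Hypothetical toy data: 2 resources, 2 suppliers, 2 objects
total, per_supplier, problems = evaluate_assignment(
    assign=[0, 1],
    resource_cost=[[1.0, 2.0], [3.0, 1.0]],
    cost_factor=[[1.0, 0.0], [0.0, 2.0]],
    can_supply=[[1, 1], [1, 1]],
    limit=[0.5, 0.5],
    margin=[0.2, 0.2],
)
print(total, per_supplier, problems)
```

Feeding both the LINGO and the Or-Tools assignments through the same check would show whether the cheaper one is actually feasible, which separates a modelling difference from a solver problem.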
After "cheating" a bit:
solver.Add(x[1, 177] == 1)
solver.Add(x[0, 186] == 1)
solver.Add(x[0, 205] == 1)
solver.Add(x[2, 206] == 1)
solver.Add(x[2, 217] == 1)
solver.Add(x[2, 66] == 1)
solver.Add(x[2, 115] == 1)
solver.Add(x[1, 237] == 1)
The solver returns a better objective, so I believe there is a bug either in the CBC binary or in the OR-Tools interface to it (it sounds like the former).
Can you try using the CP-SAT solver?
There have been quite a few problems with CBC
https://github.com/google/or-tools/issues/1450
https://github.com/google/or-tools/issues/1525
I'm doing a Coursera discrete optimization course
in which a tool called MiniZinc is used to solve the problems.
I want to translate the class examples to Python, starting with this one:
I'm using this example code to reproduce the results:
from pulp import *
v = {'hammer':6, 'wrench':10, 'screwdriver':8, 'towel':40}
w = {'hammer':13, 'wrench':21, 'screwdriver':17, 'towel':100}
q = {'hammer':1000, 'wrench':400, 'screwdriver':500, 'towel':150}
limit = 1000
items = list(sorted(v.keys()))
# Create model
m = LpProblem("Knapsack", LpMaximize)
# Variables
x = LpVariable.dicts('x', items, lowBound=0, upBound=1, cat=LpInteger)
# Objective
m += sum(v[i]*x[i] for i in items)
# Constraint
m += sum(w[i]*x[i] for i in items) <= limit
# Optimize
m.solve()
# Print the status of the solved LP
print("Status = %s" % LpStatus[m.status])
# Print the value of the variables at the optimum
for i in items:
    print("%s = %f" % (x[i].name, x[i].varValue))
# Print the value of the objective
print("Objective = %f" % value(m.objective))
But this gives a wrong answer, since it takes at most one of each kind of item.
How can I add the amount available for each item (dict q) to the constraints?
You need to make two very small changes to your code. Firstly, you need to remove the upper bound you have set on your x variables. At the moment you have binary variables x[i], which can only be one or zero.
Secondly, you need to add constraints which effectively set a custom upper bound for each of the items. Working code and the resulting solution are below. As you can see, multiple wrenches (the highest v/w ratio) are chosen, with a single hammer to fill up the small amount of space left.
from pulp import *
v = {'hammer':6, 'wrench':10, 'screwdriver':8, 'towel':40}
w = {'hammer':13, 'wrench':21, 'screwdriver':17, 'towel':100}
q = {'hammer':1000, 'wrench':400, 'screwdriver':500, 'towel':150}
limit = 1000
items = list(sorted(v.keys()))
# Create model
m = LpProblem("Knapsack", LpMaximize)
# Variables
x = LpVariable.dicts('x', items, lowBound=0, cat=LpInteger)
# Objective
m += sum(v[i]*x[i] for i in items)
# Constraint
m += sum(w[i]*x[i] for i in items) <= limit
# Quantity constraint for each item:
for i in items:
    m += x[i] <= q[i]
# Optimize
m.solve()
# Print the status of the solved LP
print("Status = %s" % LpStatus[m.status])
# Print the value of the variables at the optimum
for i in items:
    print("%s = %f" % (x[i].name, x[i].varValue))
# Print the value of the objective
print("Objective = %f" % value(m.objective))
print("Total weight = %f" % sum([x[i].varValue*w[i] for i in items]))
Which returns:
Status = Optimal
x_hammer = 1.000000
x_screwdriver = 0.000000
x_towel = 0.000000
x_wrench = 47.000000
Objective = 476.000000
Total weight = 1000.000000
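As a cross-check that does not depend on a solver, the same bounded knapsack can be computed with a small dynamic program. This is a different technique from the PuLP model above, shown only as a sketch. It splits each item's available quantity into batches of 1, 2, 4, ... pieces and treats each batch as a single take-it-or-leave-it item. The function name bounded_knapsack is made up for illustration.

```python
def bounded_knapsack(values, weights, quantities, limit):
    """Maximum total value with total weight <= limit (integer weights)."""
    best = [0] * (limit + 1)   # best[c]: best value using capacity at most c
    for item in values:
        remaining = quantities[item]
        batch = 1
        while remaining > 0:
            take = min(batch, remaining)
            batch_value = take * values[item]
            batch_weight = take * weights[item]
            # Go through capacities downwards so each batch is used at most once
            for cap in range(limit, batch_weight - 1, -1):
                candidate = best[cap - batch_weight] + batch_value
                if candidate > best[cap]:
                    best[cap] = candidate
            remaining -= take
            batch *= 2
    return best[limit]

v = {'hammer': 6, 'wrench': 10, 'screwdriver': 8, 'towel': 40}
w = {'hammer': 13, 'wrench': 21, 'screwdriver': 17, 'towel': 100}
q = {'hammer': 1000, 'wrench': 400, 'screwdriver': 500, 'towel': 150}
print(bounded_knapsack(v, w, q, 1000))
```

It reports 476, matching the objective found by PuLP.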
I need help with the following problem for computer science
A clerk works in a store where the cost of each item is a positive integer number of dollars. So, for example,
something might cost $21, but nothing costs $9.99.
In order to make change a clerk has an unbounded number
of bills in each of the following denominations: $1, $2, $5, $10, and $20.
Write a procedure that takes two
arguments, the cost of an item and the amount paid, and prints how to make change using the smallest
possible number of bills.
Since I am also a beginner, I'll take it as Python practice. Please see the code below:
def pay_change(paid, cost):
    # set up the change and an empty dictionary for result
    change = paid - cost
    result = {}
    # get the result dictionary values for each bill
    n_twenty = change // 20
    result['$20'] = n_twenty
    rest = change % 20
    n_ten = rest // 10
    result['$10'] = n_ten
    rest = rest % 10
    n_five = rest // 5
    result['$5'] = n_five
    rest = rest % 5
    n_two = rest // 2
    result['$2'] = n_two
    rest = rest % 2
    n_one = rest // 1
    result['$1'] = n_one
    # print(result) if you want to check the result dictionary
    # present the result, do not show if value is 0
    for k, v in result.items():
        if v != 0:
            print('Need', v, 'bills of', k)
The logic is to start from the $20 bill and work down: // gives the number of bills of each denomination and % gives the amount that is still left. No matter what, we end up with a dictionary that says how many bills of each denomination are needed.
Then, for the denominations whose value is 0, we don't need to show them, so I wrote a for loop to examine the values in this dictionary.
OK, now I've simplified the code to avoid repeated snippets, and I am quite happy with it:
def pay_change(paid, price):
    # set up the change
    change = paid - price
    bills = ['$20', '$10', '$5', '$2', '$1']
    # create a function to calculate the change for each bill
    def f(x):
        # nonlocal (rather than global) keeps change inside pay_change
        nonlocal change
        result, change = divmod(change, x)
        return result
    temp = list(map(f, (20, 10, 5, 2, 1)))
    # generate the final result as a dictionary
    result = dict(zip(bills, temp))
    # present the result, do not show if value is 0
    for k, v in result.items():
        if v != 0:
            print('Need', v, 'bills of', k)
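The same greedy idea can also be written as a plain loop that returns the result instead of printing it, which makes it easier to test. This is only a sketch; the name make_change and the choice to raise an error when the amount paid is too small are assumptions, not part of the assignment. Taking the largest bill first gives the fewest bills for this particular set of denominations, but that is not true for arbitrary denominations.

```python
def make_change(cost, paid, denominations=(20, 10, 5, 2, 1)):
    """Return {denomination: number of bills} for the change, largest bills first."""
    change = paid - cost
    if change < 0:
        raise ValueError("amount paid is less than the cost")
    bills = {}
    for d in denominations:
        count, change = divmod(change, d)
        if count:
            bills[d] = count
    return bills

for bill, count in make_change(21, 100).items():
    print('Need', count, 'bills of', '$%d' % bill)
```

For a $21 item paid with $100, the change is $79: three $20 bills, one $10, one $5 and two $2.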
I have a store that contains items. Each item is either a component (which is atomic) or a product which consists of various components (but never of 2 or more of the same component).
Now, when I want to get a product out of the store, there are various scenarios:
The store contains the necessary number of the product.
The store contains components of which I can assemble the product.
The store contains products that share components with the required product. I can disassemble those and assemble the required item.
Any combination of the above.
Below you can see my code so far (getAssemblyPath). It does find a way to assemble the required item if it is possible, but it does not optimize the assembly path.
I want to optimize the path in two ways:
First, choose the path which takes the least number of assembly/disassembly actions.
Second, if there are various such paths, choose the path which leaves the least amount of disassembled components in the store.
Now, here I am at a complete loss of how to get this optimization done (I am not even sure if this is a question for SO or for Maths).
How can I alter getAssemblyPath so that it meets my optimization requirements?
My code so far:
#! /usr/bin/python

class Component:
    def __init__ (self, name): self.__name = name

    def __repr__ (self): return 'Component {}'.format (self.__name)

class Product:
    def __init__ (self, name, components):
        self.__name = name
        self.__components = components

    @property
    def components (self): return self.__components

    def __repr__ (self): return 'Product {}'.format (self.__name)

class Store:
    def __init__ (self): self.__items = {}

    def __iadd__ (self, item):
        item, count = item
        if not item in self.__items: self.__items [item] = 0
        self.__items [item] += count
        return self

    @property
    def items (self): return (item for item in self.__items.items () )

    @property
    def products (self): return ( (item, count) for item, count in self.__items.items () if isinstance (item, Product) )

    @property
    def components (self): return ( (item, count) for item, count in self.__items.items () if isinstance (item, Component) )

    def getAssemblyPath (self, product, count):
        if product in self.__items:
            take = min (count, self.__items [product] )
            print ('Take {} of {}'.format (take, product) )
            count -= take
            if not count: return
        components = dict ( (comp, count) for comp in product.components)
        for comp, count in self.components:
            if comp not in components: continue
            take = min (count, components [comp] )
            print ('Take {} of {}'.format (take, comp) )
            components [comp] -= take
            if not components [comp]: del components [comp]
        if not components: return
        for prod, count in self.products:
            if prod == product: continue
            shared = set (prod.components) & set (components.keys () )
            dis = min (max (components [comp] for comp in shared), count)
            print ('Disassemble {} of {}.'.format (dis, prod) )
            for comp in shared:
                print ('Take {} of {}.'.format (dis, comp) )
                # subtract what was just taken (dis), not the stale 'take' from the loop above
                components [comp] -= dis
                if not components [comp]: del components [comp]
            if not components: return
        print ('Missing components:')
        for comp, count in components.items ():
            print ('{} of {}.'.format (count, comp) )

c1 = Component ('alpha')
c2 = Component ('bravo')
c3 = Component ('charlie')
c4 = Component ('delta')
p1 = Product ('A', [c1, c2] )
p2 = Product ('B', [c1, c2, c3] )
p3 = Product ('C', [c1, c3, c4] )
store = Store ()
store += (c2, 100)
store += (c4, 100)
store += (p1, 100)
store += (p2, 100)
store += (p3, 10)
store.getAssemblyPath (p3, 20)
This outputs:
Take 10 of Product C
Take 10 of Component delta
Disassemble 10 of Product A.
Take 10 of Component alpha.
Disassemble 10 of Product B.
Take 10 of Component charlie.
Which works, but it does unnecessarily disassemble product A, as product B contains both of the required components alpha and charlie.
--
EDIT:
Answering the very sensible questions of Blckknght:
When you say you want "the least number of assembly/disassembly actions", do you mean the smallest number of items, or the smallest number of different products?
An "asm/disasm action" is the action of assembling or disassembling one product, no matter how many components are involved. I am looking for least number of touched items, no matter whether they be distinct or not.
That is, is it better to disassemble 20 of Product A than to disassemble 10 of Product A and an additional 5 of Product B?
The latter is closer to optimum.
Further, you say you want to avoid leaving many components behind, but in your current code all disassembled components that are not used by the requested Product are lost. Is that deliberate (that is, do you want to be throwing away the other components), or is it a bug?
The method getAssemblyPath only determines the path of how to get the items. It does not touch the actual store: at no point does it assign to self.__items. Think of it as a function that issues an order to the storekeeper describing what he must do in the (immediate) future in order to get the required amount of the required item out of his store.
--
EDIT 2:
The first obvious (or at least obvious to me) way to tackle this issue is to search first for those products that share the maximum number of components with the required product, as you get more required components out of each disassembly. But unfortunately this doesn't necessarily yield the optimum path. Take for instance:
Product A consisting of components α, β, γ, δ, ε and ζ.
Product B consisting of components α, β, η, δ, ε and θ.
Product C consisting of components α, β, γ, ι, κ and λ.
Product D consisting of components μ, ν, ξ, δ, ε and ζ.
We have in store 0 of A, 100 of B, 100 of C and 100 of D. We require 10 of A. Now if we look first for the product that shares the most components with A, we will find B. We disassemble 10 of B, getting 10 each of α, β, δ and ε. But then we need to disassemble 10 of C (to get γ) and 10 of D (to get ζ). That would be 40 actions (30 disassembling and 10 assembling).
But the optimum way would be to disassemble 10 of C and 10 of D (30 actions, 20 disassembling and 10 assembling).
--
EDIT 3:
You don't need to post python code to win the bounty. Just explain the algorithm to me and show that it does indeed yield the optimum path, or one of the optima if several exist.
Here is how I would solve this problem. I wanted to write code for this but I don't think I have time.
You can find an optimal solution recursively. Make a data structure that represents the state of the parts store and the current request. Now, for each part you need, make a series of recursive calls that try the various ways to fill the order. The key is that by trying a way to fill the order, you are getting part of the work done, so the recursive call is now a slightly simpler version of the same problem.
Here's a specific example, based on your example. We need to fill orders for product 3 (p3), which is made of components c1, c3, and c4. Our order is for 20 of p3, and we have 10 p3 in stock, so we trivially fill the order for the first 10 of p3. Now our order is for 10 of p3, but we can look at it as an order for 10 of c1, 10 of c3, and 10 of c4. For the first recursive call we disassemble a p1, fill an order for a single c1 and place an extra c2 in the store; so this recursive call is for 9 of c1, 10 of c3, and 10 of c4, with an updated availability in the store. For the second recursive call we disassemble a p2 (which is made of c1, c2, and c3), fill an order for a c1 and a c3, and put an extra c2 into the store; so this recursive call is for 9 of c1, 9 of c3, and 10 of c4, with an updated availability in the store.
Since each call reduces the problem, the recursive series of calls will terminate. The recursive calls should return a cost metric, which either signals that the call failed to find a solution or else signals how much the found solution cost; the function chooses the best solution by choosing the solution with the lowest cost.
I'm not sure, but you might be able to speed this up by memoizing the calls. Python has a really nifty builtin new in the 3.x series, functools.lru_cache(); since you tagged your question as "Python 3.2" this is available to you.
What is memoization and how can I use it in Python?
The memoization works by recognizing that the function has already been called with the same arguments, and just returning the same solution as before. So it is a cache mapping arguments to answers. If the arguments include non-essential data (like how many of component c2 are in the store) then the memoization is less likely to work. But if we imagine we have products p1 and p9, and p9 contains components c1 and c9, then for our purposes disassembling one of p1 or one of p9 should be equivalent: they have the same disassembly cost, and they both produce a component we need (c1) and one we don't need (c2 or c9). So if we get the recursive call arguments right, the memoization could just return an instant answer when we get around to trying p9, and it could save a lot of time.
Hmm, now that I think about it, we probably can't use functools.lru_cache() but we can just memoize on our own. We can make a cache of solutions: a dictionary mapping tuples to values, and build tuples that just have the arguments we want cached. Then in our function, the first thing we do is check the cache of solutions, and if this call is equivalent to a cached solution, just return it.
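To make the "cache of solutions keyed by tuples" idea concrete, here is a minimal sketch. It is deliberately simplified and is not the code discussed in this answer: it only counts disassembly actions, ignores loose components already in the store, and ignores the secondary goal of leaving few spare components behind. The function name min_disassemblies, the recipes dictionary and the one-letter component names are all made up for illustration.

```python
def min_disassemblies(needed, stock, recipes, cache=None):
    """needed:  dict component -> count still required
    stock:   dict product -> count available
    recipes: dict product -> frozenset of its components
    Returns the minimum number of disassembly actions, or None if impossible.
    """
    if cache is None:
        cache = {}
    needed = {c: n for c, n in needed.items() if n > 0}
    if not needed:
        return 0
    # Only the data that matters goes into the key, as sorted tuples
    key = (tuple(sorted(needed.items())), tuple(sorted(stock.items())))
    if key in cache:
        return cache[key]
    best = None
    for prod, available in stock.items():
        if available == 0 or recipes[prod].isdisjoint(needed):
            continue   # nothing left, or this product yields nothing we need
        new_stock = dict(stock)
        new_stock[prod] -= 1
        new_needed = {c: n - (1 if c in recipes[prod] else 0)
                      for c, n in needed.items()}
        sub = min_disassemblies(new_needed, new_stock, recipes, cache)
        if sub is not None and (best is None or sub + 1 < best):
            best = sub + 1
    cache[key] = best
    return best

# Scaled-down version of the A/B/C/D example from the question (one of each)
recipes = {'B': frozenset('abhdet'), 'C': frozenset('abcikl'), 'D': frozenset('mnxdef')}
needed = {c: 1 for c in 'abcdef'}          # components of one product A
print(min_disassemblies(needed, {'B': 1, 'C': 1, 'D': 1}, recipes))
```

On this scaled-down example it finds 2 disassemblies (C and D), skipping B even though B shares the most components with A. The state space still grows quickly with the counts involved, so this only illustrates the caching, not a scalable method.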
EDIT: Here's the code I have written so far. I haven't finished debugging it so it probably doesn't produce the correct answer yet (I'm not certain because it takes a long time and I haven't let it finish running). This version is passing in dictionaries, which won't work well with my ideas about memoizing, but I wanted to get a simple version working and then worry about speeding it up.
Also, this code takes apart products and adds them to the store as components, so the final solution will first say something like "Take apart 10 product A" and then it will say "Take 20 component alpha" or whatever. In other words, the component count could be considered high since it doesn't distinguish between components that were already in the store and components that were put there by disassembling products.
I'm out of time for now and won't work on it for a while, sorry.
#!/usr/bin/python3
class Component:
def __init__ (self, name): self.__name = name
#def __repr__ (self): return 'Component {}'.format (self.__name)
def __repr__ (self): return 'C_{}'.format (self.__name)
class Product:
def __init__ (self, name, components):
self.__name = name
self.__components = components
#property
def components (self): return self.__components
#def __repr__ (self): return 'Product {}'.format (self.__name)
def __repr__ (self): return 'P_{}'.format (self.__name)
class Store:
def __init__ (self): self.__items = {}
def __iadd__ (self, item):
item, count = item
if not item in self.__items: self.__items [item] = 0
self.__items [item] += count
return self
#property
def items (self): return (item for item in self.__items.items () )
#property
def products (self): return ( (item, count) for item, count in self.__items.items () if isinstance (item, Product) )
#property
def components (self): return ( (item, count) for item, count in self.__items.items () if isinstance (item, Component) )
def get_assembly_path (self, product, count):
store = self.__items.copy()
if product in store:
take = min (count, store [product] )
s_trivial = ('Take {} of {}'.format (take, product) )
count -= take
if not count:
print(s_trivial)
return
dict_decr(store, product, take)
product not in store
order = {item:count for item in product.components}
cost, solution = solver(order, store)
if cost is None:
print("No solution.")
return
print("Solution:")
print(s_trivial)
for item, count in solution.items():
if isinstance(item, Component):
print ('Take {} of {}'.format (count, item) )
else:
assert isinstance(item, Product)
print ('Disassemble {} of {}'.format (count, item) )
    def getAssemblyPath (self, product, count):
        if product in self.__items:
            take = min (count, self.__items [product] )
            print ('Take {} of {}'.format (take, product) )
            count -= take
            if not count: return
        components = dict ( (comp, count) for comp in product.components)
        for comp, count in self.components:
            if comp not in components: continue
            take = min (count, components [comp] )
            print ('Take {} of {}'.format (take, comp) )
            components [comp] -= take
            if not components [comp]: del components [comp]
            if not components: return
        for prod, count in self.products:
            if prod == product: continue
            shared = set (prod.components) & set (components.keys () )
            if not shared: continue  # this product has nothing we still need
            dis = min (max (components [comp] for comp in shared), count)
            print ('Disassemble {} of {}.'.format (dis, prod) )
            for comp in shared:
                print ('Take {} of {}.'.format (dis, comp) )
                components [comp] -= dis  # each disassembled product yields one comp
                if components [comp] <= 0: del components [comp]
                if not components: return
        print ('Missing components:')
        for comp, count in components.items ():
            print ('{} of {}.'.format (count, comp) )
def str_d(d):
    lst = list(d.items())
    lst.sort(key=str)
    return "{" + ", ".join("{}:{}".format(k, v) for (k, v) in lst) + "}"
def dict_incr(d, key, n):
    if key not in d:
        d[key] = n
    else:
        d[key] += n
def dict_decr(d, key, n):
    assert d[key] >= n
    d[key] -= n
    if d[key] == 0:
        del d[key]
def solver(order, store):
    """
    order is a dict mapping component:count
    store is a dict mapping item:count
    returns a tuple: (cost, solution)
    cost is a cost metric estimating the expense of the solution
    solution is a dict that maps item:count (how to fill the order)
    """
    print("DEBUG: solver: {} {}".format(str_d(order), str_d(store)))
    if not order:
        solution = {}
        cost = 0
        return (cost, solution)
    solutions = []
    for item in store:
        if not isinstance(item, Component):
            continue
        print("...considering: {}".format(item))
        if not item in order:
            continue
        else:
            o = order.copy()
            s = store.copy()
            dict_decr(o, item, 1)
            dict_decr(s, item, 1)
            if not o:
                # we have found a solution! Return it
                solution = {}
                solution[item] = 1
                cost = 1
                print("BASIS: solver: {} {} / {} {}".format(str_d(order), str_d(store), cost, str_d(solution)))
                return (cost, solution)
            else:
                cost, solution = solver(o, s)
                if cost is None:
                    continue # this was a dead end
                dict_incr(solution, item, 1)
                cost += 1
                solutions.append((cost, solution))
    for item in store:
        if not isinstance(item, Product):
            continue
        print("...Product components: {} {}".format(item, item.components))
        assert isinstance(item, Product)
        if any(c in order for c in item.components):
            print("...disassembling: {}".format(item))
            o = order.copy()
            s = store.copy()
            dict_decr(s, item, 1)
            for c in item.components:
                dict_incr(s, c, 1)
            cost, solution = solver(o, s)
            if cost is None:
                continue # this was a dead end
            # record the disassembly so the caller can report it
            dict_incr(solution, item, 1)
            cost += 1 # cost of disassembly
            solutions.append((cost, solution))
        else:
            print("DEBUG: ignoring {}".format(item))
    if not solutions:
        print("DEBUG: *dead end*")
        return (None, None)
    print("DEBUG: finding min of: {}".format(solutions))
    # compare by cost only; dicts cannot be ordered when two costs tie
    return min(solutions, key=lambda pair: pair[0])
c1 = Component ('alpha')
c2 = Component ('bravo')
c3 = Component ('charlie')
c4 = Component ('delta')
p1 = Product ('A', [c1, c2] )
p2 = Product ('B', [c1, c2, c3] )
p3 = Product ('C', [c1, c3, c4] )
store = Store ()
store += (c2, 100)
store += (c4, 100)
store += (p1, 100)
store += (p2, 100)
store += (p3, 10)
#store.getAssemblyPath (p3, 20)
store.get_assembly_path(p3, 20)
Optimal path for N products <=> optimal path for a single product.
Indeed, if we need to optimally assemble N units of product X, then after we optimally assemble one unit (using the current stock), the question becomes how to optimally assemble the remaining N-1 units of product X from the remaining stock.
=> Therefore, it is sufficient to provide an algorithm for optimally assembling ONE product X at a time.
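The reduction above can be sketched as a simple loop, taking the claim at face value. The callback `assemble_one` and the toy strategy `take_finished_c` are hypothetical names used only for illustration; a real one-product strategy would go in their place.

```python
def assemble_many(n, stock, assemble_one):
    """Assemble up to n products by repeating a one-product strategy.

    assemble_one(stock) is a hypothetical callback that returns
    (steps, remaining_stock) for one product, or None if none can be built.
    """
    all_steps = []
    for _ in range(n):
        result = assemble_one(stock)
        if result is None:
            break  # stock ran out before n products were built
        steps, stock = result
        all_steps.append(steps)
    return all_steps, stock


def take_finished_c(stock):
    """Toy one-product strategy: take a finished 'C' if one is in stock."""
    if stock.get('C', 0) == 0:
        return None
    remaining = dict(stock)  # leave the caller's stock untouched
    remaining['C'] -= 1
    return ['Take 1 of C'], remaining


steps, left = assemble_many(3, {'C': 2}, take_finished_c)
print(len(steps), left)
```

Only two of the three requested products can be built here, so the loop stops early and returns what it managed.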
Assume we need components x1,..,xn for the product (here we only include components that are not available as loose components in stock).
For each component xk, find all products that contain this component. We get one list of products per component: products A1(1),..,A1(i1) contain component x1, products A2(1),..,A2(i2) contain component x2, and so forth (a product can appear in several of the lists A1,A2,..,An).
If any of the lists is empty, there is no solution.
We need a minimal set of products such that every list contains at least one product from that set. The simplest, though not computationally efficient, solution is brute force: try all sets and pick the smallest:
Take the union of A1,..,An and call it A (each product appears in A only once).
a. Take a single product from A; if it is contained in all of A1,..,An, we need only one disassembly (this product).
b. Try all combinations of two products from A; if for some combination (a1,a2) each of the lists A1,..,An contains a1 or a2, it is a solution.
...
For sure, there is a solution at depth n: one product from each of the lists A1,..,An. If we found no solution before that, this is the best solution.
Now we only need to think about a better strategy than the brute-force check, which I think is possible. I need to think about it, but this brute-force approach is sure to find a strictly optimal solution.
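A minimal sketch of the brute-force search described above, for a single product. It assumes products and components are identified by plain strings and that `catalog` (a hypothetical name) maps each product in stock to the set of components it contains; stock counts are ignored because only one unit is assembled.

```python
from itertools import combinations


def min_disassembly_set(needed, catalog):
    """Return the smallest tuple of products that covers every needed
    component, or None if some component is contained in no product."""
    needed = set(needed)
    if not needed:
        return ()
    # lists[x] plays the role of A_k: the products containing component x
    lists = {x: {p for p, comps in catalog.items() if x in comps}
             for x in needed}
    if any(not products for products in lists.values()):
        return None
    candidates = sorted(set().union(*lists.values()))  # the union A
    for size in range(1, len(needed) + 1):
        for combo in combinations(candidates, size):
            chosen = set(combo)
            # every list must contain at least one chosen product
            if all(lists[x] & chosen for x in needed):
                return combo
    return None


catalog = {
    'A': {'alpha', 'bravo'},
    'B': {'alpha', 'bravo', 'charlie'},
    'C': {'alpha', 'charlie', 'delta'},
}
print(min_disassembly_set({'alpha', 'charlie'}, catalog))
print(min_disassembly_set({'bravo', 'delta'}, catalog))
```

Because sizes are tried in increasing order, the first combination that passes the check is a smallest one.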
EDIT:
A more accurate solution is to sort the lists by length. Then, when checking whether some set of K products is a solution, only the combinations of one item from each of the first K lists need to be checked; if there is no solution among them, there is no minimal set of depth K that solves the problem. That type of check should also not be that bad computationally. Perhaps it can work?
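In the same spirit of looking at the shortest lists first, here is a sketch of a related search that is exhaustive: always branch on the shortest list that is not covered yet, since any solution must contain one of its products. This is a swapped-in variant, not necessarily the exact check described in the EDIT, and both function names are hypothetical.

```python
def hitting_set_within(lists, budget, chosen=()):
    """Return a tuple of at most `budget` more products that, together with
    `chosen`, puts a product in every list; None if that is impossible."""
    unhit = [lst for lst in lists if not (set(lst) & set(chosen))]
    if not unhit:
        return chosen
    if budget == 0:
        return None
    # any valid set must contain a product from the shortest uncovered list
    shortest = min(unhit, key=len)
    for product in sorted(shortest):
        found = hitting_set_within(lists, budget - 1, chosen + (product,))
        if found is not None:
            return found
    return None


def smallest_hitting_set(lists):
    """Try depth K = 0, 1, 2, ... and return the first set found."""
    for k in range(len(lists) + 1):
        found = hitting_set_within(lists, k)
        if found is not None:
            return found
    return None  # some list is empty, so no set can cover it


print(smallest_hitting_set([{'A', 'B', 'C'}, {'B', 'C'}]))
print(smallest_hitting_set([{'A'}, {'A'}, {'B'}]))
```

The branching factor at each step is the length of the shortest uncovered list, which is why short lists are handled first.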
I think the key here is to establish the potential cost of each purchase case, so that the proper combination of purchase cases minimizes a cost function. (Then it simply reduces to a knapsack problem.)
What follows is probably not optimal, but here is an example of what I mean:
1. Any product that is the end product "costs" its actual cost (in currency).
2. Any component or product that can be assembled into the end product (given other separate products/components) but does not need to be disassembled costs its real price (in currency) plus a small tax (TBD).
3. Any component or product that can facilitate assembly of the end product but needs to be disassembled costs its price in currency plus a small tax for the assembly into the end product and another small tax for each disassembly needed (maybe the same value as the assembly tax?).
Note: these "taxes" apply to all sub-products that fall into the same case.
... and so on for other possible cases
Then, find all possible combinations of components and products available at the storefront that can be assembled into the end product. Place these "assembly lists" into a list sorted by cost, as determined by your chosen cost function. After that, start creating as many of the first (lowest-cost) "assembly list" as you can (by checking that all items in the assembly list are still available at the store, i.e. that you have not already used them for a previous assembly). Once you cannot create any more of this case, pop it from the list. Repeat until all the end products you need are "built".
Note: Every time you "assemble" an end product you will need to decrement a global counter for each product in the current "assembly list".
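The procedure above could look roughly like this. It is only a sketch: the function name `build_greedily`, the shape of `assembly_lists` and the cost values in the example are assumptions, and the costs are made up rather than derived from any real prices or taxes.

```python
from collections import Counter


def build_greedily(assembly_lists, stock, wanted):
    """assembly_lists: (cost, Counter of items consumed per end product).
    stock: mapping item -> available count. wanted: end products needed.
    Returns (plan, built); plan holds (cost, items, times) entries."""
    stock = Counter(stock)  # work on a copy of the store's counters
    plan = []
    built = 0
    # cheapest assembly list first
    for cost, items in sorted(assembly_lists, key=lambda entry: entry[0]):
        if built >= wanted:
            break
        if not items:
            continue
        # how often this list can still be built from the remaining stock
        possible = min(stock[item] // n for item, n in items.items())
        times = min(possible, wanted - built)
        if times <= 0:
            continue  # exhausted, move on to the next list
        for item, n in items.items():
            stock[item] -= n * times  # decrement the counters
        plan.append((cost, dict(items), times))
        built += times
    return plan, built


stock = {'C': 10, 'B': 100, 'delta': 100}
assembly_lists = [
    (3.5, Counter({'B': 1, 'delta': 1})),  # disassemble B, add delta
    (1.0, Counter({'C': 1})),              # take a finished C
]
plan, built = build_greedily(assembly_lists, stock, 20)
print(built, plan)
```

With 10 finished units of C in stock and 20 wanted, the cheap list is used up first and the remaining 10 units come from the more expensive one.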
Hope this gets the discussion moving in the right direction. Good luck!