Is there any way to implement fitness sharing/niching using DEAP? Specifically I'm looking for an implementation of the method defined here (Goldberg's fitness sharing) on page 98 of the pdf. If you know of any other methods that are in DEAP, that would be useful as well.
Thanks
Write your own selection routine. The default routines are in deap/tools/selection.py and might be useful as a guide to get started.
For example:
from operator import attrgetter

def selYourSelectionRoutine(individuals, k):
    """Select the *k* best individuals among the input *individuals*.

    :param individuals: A list of individuals to select from.
    :param k: The number of individuals to select.
    :returns: A list containing the k best individuals.
    """
    return sorted(individuals, key=attrgetter("fitness"), reverse=True)[:k]
Then use it with the rest of DEAP as the documentation prescribes:
toolbox.register("select", selYourSelectionRoutine, yourargs)
I've got one that does a probabilistic selection based on relative fitness, which I don't have the rights to share; it's only about 10-15 lines of Python, so it can be done and isn't crazy difficult.
I'm not aware of any publicly available implementations of that specific selection routine (yet).
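As a minimal self-contained sketch of how such a routine plugs in, here is the truncation select above run outside DEAP (the `Indiv` class is a hypothetical stand-in for a DEAP individual; real DEAP fitness objects compare like numbers, so a plain float works for the demo):

```python
from operator import attrgetter

class Indiv:
    """Hypothetical stand-in for a DEAP individual (demo only)."""
    def __init__(self, fitness):
        self.fitness = fitness

def selYourSelectionRoutine(individuals, k):
    """Select the k best individuals by descending fitness."""
    return sorted(individuals, key=attrgetter("fitness"), reverse=True)[:k]

pop = [Indiv(f) for f in [0.2, 0.9, 0.5, 0.7]]
best = selYourSelectionRoutine(pop, 2)
# best holds the individuals with fitness 0.9 and 0.7
```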
To do fitness sharing, you would have to define your own shared fitness function that depends on the entire population.
Assuming you already defined a fitness function, you could do the following:
from scipy.spatial import distance_matrix

def sharing(distance, sigma, alpha):
    res = 0
    if distance < sigma:
        res += 1 - (distance / sigma) ** alpha
    return res

def shared_fitness(individual, population, sigma, alpha):
    num = fitness(individual)[0]
    dists = distance_matrix([individual], population)[0]
    tmp = [sharing(d, sigma, alpha) for d in dists]
    den = sum(tmp)
    return num / den,
This shared fitness favors individuals with fewer neighbors. sigma is the radius within which neighbors penalize an individual's shared fitness. If sigma is bigger, your niches will be farther apart and you risk missing a local maximum. If sigma is smaller, you need a larger population and your algorithm will take longer to run. alpha controls how strongly nearby neighbors are penalized.
This shared fitness can then be registered in your toolbox like a regular fitness.
population = toolbox.population()
toolbox.register('evaluate', shared_fitness, population=population, sigma=0.1, alpha=1.)
After that, you can use a standard algorithm like $\mu + \lambda$, that will select offspring based on their fitness, to get niching.
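To see the niche-count effect in isolation, here is a small self-contained sketch (a toy one-dimensional `fitness` and hand-rolled absolute-value distances stand in for your real evaluation and scipy's `distance_matrix`; all names and numbers are illustrative):

```python
def fitness(x):
    return 1.0  # toy: every individual has the same raw fitness

def sharing(distance, sigma, alpha):
    # neighbors inside the radius sigma contribute to the niche count
    if distance < sigma:
        return 1 - (distance / sigma) ** alpha
    return 0

def shared_fitness(individual, population, sigma, alpha):
    raw = fitness(individual)
    niche_count = sum(sharing(abs(individual - other), sigma, alpha)
                      for other in population)
    return raw / niche_count  # the individual itself always contributes 1

crowded = [0.0, 0.01, 0.02]  # three individuals packed into one niche
lonely = [0.0, 5.0, 10.0]    # three well-separated individuals
f_crowded = shared_fitness(0.0, crowded, sigma=0.1, alpha=1.0)
f_lonely = shared_fitness(0.0, lonely, sigma=0.1, alpha=1.0)
# the crowded individual's shared fitness is lower than the isolated one's
```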
I'm trying to solve a dynamic food web with JiTCODE. One aspect of the network is that populations that fall below a threshold are set to zero, so I end up with a non-differentiable equation. Is there a way to implement that in JiTCODE?
Another similar problem is a Heaviside dependency of the network.
Example code:
import numpy as np
from jitcode import jitcode, y, t

def f():
    for i in range(N):
        if i < 5:
            #if y(N-1) > y(N-2):  # Heaviside: how to make the if-statement?
            #    yield (y(i)*y(N-2))**(0.02)
            #else:
            yield (y(i)*y(N-1))**(0.02)
        else:
            #if y(i) > thr:
            #    yield y(i)**(0.2)  # ?? how to set the population to 0 ??
            #else:
            yield y(i)**(0.3)

N = 10
thr = 0.0001
initial_value = np.zeros(N) + 1

ODE = jitcode(f)
ODE.set_integrator("vode", interpolate=True)
ODE.set_initial_value(initial_value, 0.0)
Python conditionals are evaluated during code generation, not during the simulation (which runs the generated code), so you cannot use them here. Instead you need to use special conditional objects that provide a reasonably smooth approximation of a step function (or build such a thing yourself):
def f():
    for i in range(N):
        if i < 5:
            yield (y(i) * conditional(y(N-1), y(N-2), y(N-2), y(N-1)))**0.2
        else:
            yield y(i)**conditional(y(i), thr, 0.2, 0.3)
For example, conditional(y(i), thr, 0.2, 0.3) can be treated as evaluating (at simulation time) to 0.2 if y(i) > thr and to 0.3 otherwise.
how to set the population to 0 ??
You cannot do such a discontinuous jump within JiTCODE or the framework of differential equations in general. Usually, you would use a sharp population decline to simulate this, possibly introducing a delay (and thus JiTCDDE). If you really need this, you can either:
Detect threshold crossings after each integration step and reinitialise the integrator with respective initial conditions. If you just want to fully kill populations that went below a reproductive threshold, this seems to be a valid solution.
Implement a binary-switch dynamical variable.
Also see this GitHub issue.
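If you want to "build such a thing yourself", one common smooth step uses tanh. The plain-Python sketch below shows the shape; `eps` is a hypothetical smoothing width, and in JiTCODE you would write the same expression with the symbolic y(i) rather than a number:

```python
import math

def smooth_conditional(x, threshold, value_if_above, value_if_below, eps=1e-3):
    """Smooth approximation of: value_if_above if x > threshold else value_if_below.
    The tanh ramp has a width of roughly eps around the threshold."""
    step = 0.5 * (1 + math.tanh((x - threshold) / eps))  # ~1 above, ~0 below
    return value_if_below + (value_if_above - value_if_below) * step

thr = 0.0001
# far above the threshold the exponent is ~0.2, far below it is ~0.3
```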
I have a system that simplifies to the following: a power generation and storage unit are being used to meet demand. The objective function is the cost to produce the power times the power produced. However, the power produced is stratified into bins of different costs, and the "clearing price" to produce power is the cost at the highest bin produced each hour:
import functools
import numpy as np
import pyomo.environ as pyo

T = np.arange(5, dtype=int)
produce_cap = 90  # MW
store_cap = 100   # MWh
store_init = 0    # MWh

m = pyo.ConcreteModel()
m.T = pyo.Set(initialize=T)  # time, hourly
m.produce = pyo.Var(m.T, within=pyo.NonNegativeReals, initialize=0)  # generation
m.store = pyo.Var(m.T, within=pyo.Reals, initialize=0)  # storage

stack = np.arange(10, 91, 20)       # cumulative sum of generation subdivisions
price = np.arange(0.9, 0.01, -0.2)  # marginal cost for each subdivision of generation
demand = np.asarray([35, 5, 75, 110, 15])  # load to meet

m.produce_cap = pyo.Constraint(m.T, rule=lambda m, t: m.produce[t] <= produce_cap)
m.store_max = pyo.Constraint(m.T, rule=lambda m, t: m.store[t] <= store_cap)
m.store_min = pyo.Constraint(m.T, rule=lambda m, t: m.store[t] >= -store_cap)

rule = lambda m, t: m.produce[t] + m.store[t] == demand[t]  # conservation rule
m.consv = pyo.Constraint(m.T, rule=rule)

# objective
def obj(stack, price, demand, m):
    cost = 0
    for t in m.T:
        idx = np.searchsorted(stack, m.produce[t])
        p = price[idx] if idx < len(price) else 1000  # penalty for exceeding production capability
        cost += m.produce[t] * p
    return cost

rule = functools.partial(obj, stack, price, demand)
m.objective = pyo.Objective(rule=rule, sense=pyo.minimize)
# more constraints added below ...
The problem seems to be in the objective function definition, using the np.searchsorted algorithm. The specific error is
Cannot create a compound inequality with identical upper and lower
bounds using strict inequalities: constraint infeasible:
produce[0] < produce[0] and 50.0 < produce[0]
If I try to implement my own searchsorted-like algorithm, I get a similar error. I gather the expression for the objective function that Pyomo is trying to create can't deal with this kind of table lookup, at least how I've implemented it. Is there another approach or reformulation I can consider?
There's a lot going on here.
The root cause is a conceptual misunderstanding of how Pyomo works: the rules for forming constraints and objectives are not callback functions that are called during optimization. Instead, the rules are functions that Pyomo calls to generate the model, and those rules are expected to return expression objects. Pyomo then passes those expressions to the underlying solver(s) through one of several standard intermediate formats (e.g., LP, NL, BAR, GMS formats). As a result, as a general rule, you should not have rules that have logic that is conditioned on the value of a variable (the rule can run, but the result will be a function of the initial variable value and will not be updated/changed during the optimization process).
For your specific example, the challenge is that searchsorted is iterating over the m.produce variable and comparing it to the cutpoints. That is causing Pyomo to start generating expression objects (through operator overloading). You are then running afoul of a (deprecated) feature where Pyomo allowed for generating compound (range) inequality expressions with a syntax like "lower <= m.x <= upper".
The solution is to reformulate your objective so that it returns an expression for the objective cost. There are several approaches, and the "best" one depends on the balance of the model and the actual shape of the cost curve. From your example, it looks like the cost curve is intended to be piecewise linear, so I would consider either directly reformulating the expression (using an intermediate variable and a set of constraints) or using Pyomo's "Piecewise" component for generating piecewise linear expressions.
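To make the shape concrete, here is a plain-Python evaluation of the clearing-price cost at a fixed, numeric production level (`bisect` stands in for `np.searchsorted`, and `clearing_cost` is a hypothetical helper). This is the step function the objective has to encode; with a Pyomo Var instead of a number, this lookup cannot run at model-build time:

```python
import bisect

stack = [10, 30, 50, 70, 90]       # upper edges of the generation bins
price = [0.9, 0.7, 0.5, 0.3, 0.1]  # clearing price at each bin

def clearing_cost(produce):
    """Cost at a numeric production level: price of the highest bin reached."""
    idx = bisect.bisect_left(stack, produce)
    p = price[idx] if idx < len(price) else 1000  # penalty beyond capacity
    return produce * p
```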
Given transport costs, per single unit of delivery, for a supermarket from three distribution centers to ten separate stores.
Note: please look in the #data section of my code for the data, which I'm not allowed to post in photo form. Also note that while my costs are a vector with 30 entries, each distribution centre can only access 10 of them: DC1's costs are entries 1-10, DC2's are entries 11-20, and so on.
I want to minimize the transport cost subject to each of the ten stores demand (in units of delivery).
This can be done by inspection, the minimum cost being $150313. The problem is implementing the solution with Python and Gurobi and producing the same result.
What I've tried is a somewhat sloppy model of the problem in Gurobi so far. I'm not sure how to correctly index and iterate through my sets that are required to produce a result.
This is my main problem: the objective function I define to minimize transport costs is not correct, as it produces a non-answer. The code "runs", though. If I change to maximization I just get an unbounded problem, so I feel like I am definitely not bringing the correct data/iterations over my sets into play.
My solution so far is quite small, so I feel like I can format it into the question and comment along the way.
from gurobipy import *
#Sets
Distro = ["DC0","DC1","DC2"]
Stores = ["S0", "S1", "S2", "S3", "S4", "S5", "S6", "S7", "S8", "S9"]
D = range(len(Distro))
S = range(len(Stores))
Here I define my sets of distribution centres and set of stores. I am not sure where or how to exactly define the D and S iteration variables to get a correct answer.
#Data
Demand = [10,16,11,8,8,18,11,20,13,12]
Costs = [1992,2666,977,1761,2933,1387,2307,1814,706,1162,
2471,2023,3096,2103,712,2304,1440,2180,2925,2432,
1642,2058,1533,1102,1970,908,1372,1317,1341,776]
Just a block of my relevant data. I am not sure if my cost data should be 3 separate sets, considering each distribution centre only has access to 10 costs and not 30. I also don't know whether there is a way to keep my costs as one set while making sure each centre only accesses the costs relevant to itself.
m = Model("WonderMarket")

#Variables
X = {}
for d in D:
    for s in S:
        X[d, s] = m.addVar()
Declaring my objective variable. Again, I'm blindly iterating at this point to produce something that works. I've never programmed before. But I'm learning and putting as much thought into this question as possible.
#set objective
m.setObjective(quicksum(Costs[s] * X[d, s] * Demand[s] for d in D for s in S), GRB.MINIMIZE)
My objective function attempts to multiply the cost of each delivery from a centre to a store, weighted by the store's demand, and make that total as small as possible. I do not have a nonzero constraint yet; I will need one eventually, but right now I have bigger fish to fry.
m.optimize()
I produce a model with 0 rows, 30 columns, and 0 nonzero entries that gives me a solution of 0. I need to set up my program so that I get the value that can easily be calculated by hand. I believe the issue is my general declaring of variables and my low knowledge of iteration and general "what goes where" issues. A lot of thinking for just a study exercise!
Appreciate anyone who has read all the way through. Thank you for any tips or help in advance.
Your objective is 0 because you have not defined any constraints. By default, all variables have a lower bound of 0, so minimizing an unconstrained problem puts all variables at this lower bound.
A few comments:
Unless you need the names for the distribution centers and stores, you could define them as follows:
D = 3
S = 10
Distro = range(D)
Stores = range(S)
You could define the costs as a 2-dimensional array, e.g.
Costs = [[1992,2666,977,1761,2933,1387,2307,1814,706,1162],
[2471,2023,3096,2103,712,2304,1440,2180,2925,2432],
[1642,2058,1533,1102,1970,908,1372,1317,1341,776]]
Then the cost of transportation from distribution center d to store s are stored in Costs[d][s].
You can add all variables at once and I assume you want them to be binary:
X = m.addVars(D, S, vtype=GRB.BINARY)
(or use Distro and Stores instead of D and S if you need to use the names).
Your definition of the objective function then becomes:
m.setObjective(quicksum(Costs[d][s] * X[d, s] * Demand[s] for d in Distro for s in Stores), GRB.MINIMIZE)
(This is all assuming that each store can only be delivered from one distribution center, but since your distribution centers do not have a maximal capacity this seems to be a fair assumption.)
You need constraints ensuring that the stores' demands are actually satisfied. For this it suffices to ensure that each store is being delivered from one distribution center, i.e., that for each s one X[d, s] is 1.
m.addConstrs(quicksum(X[d, s] for d in Distro) == 1 for s in Stores)
When I optimize this, I indeed get an optimal solution with value 150313.
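The by-inspection value can be reproduced in plain Python: since the distribution centres have no capacity limits, each store is simply served entirely by its cheapest centre:

```python
Demand = [10, 16, 11, 8, 8, 18, 11, 20, 13, 12]
Costs = [[1992, 2666, 977, 1761, 2933, 1387, 2307, 1814, 706, 1162],
         [2471, 2023, 3096, 2103, 712, 2304, 1440, 2180, 2925, 2432],
         [1642, 2058, 1533, 1102, 1970, 908, 1372, 1317, 1341, 776]]

# each store buys its full demand from the cheapest distribution centre
total = sum(min(Costs[d][s] for d in range(3)) * Demand[s] for s in range(10))
# total == 150313, matching the optimal objective value above
```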
Here is my question.
I'm dealing with one optimization problem using DEAP.
For now, I use toolbox.register("select", tools.selNSGA2) to select some fittest indivual to survive.
But I want to add some threshold by user-defined function.
Can the algorithm perform two steps of selection?
1. Select several individuals by tournament or the selNSGA2 method.
2. Eliminate several individuals by pre-defined thresholds.
This should work.
def myselect(pop, k, check):
    return [ind for ind in tools.selNSGA2(pop, k) if check(ind)]

def mycheck(ind):
    return True

toolbox.register("select", myselect, check=mycheck)
However, you will end up selecting <= k offspring.
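A self-contained sketch of the same two-step pattern, runnable without DEAP (a plain truncation select stands in for tools.selNSGA2, and the threshold check is a made-up example):

```python
def select_k_best(pop, k):
    # stand-in for tools.selNSGA2: plain truncation on a numeric fitness
    return sorted(pop, key=lambda ind: ind["fitness"], reverse=True)[:k]

def myselect(pop, k, check):
    # step 1: regular selection; step 2: eliminate by threshold
    return [ind for ind in select_k_best(pop, k) if check(ind)]

def mycheck(ind):
    return ind["fitness"] >= 0.5  # hypothetical user-defined threshold

pop = [{"fitness": f} for f in [0.9, 0.4, 0.7, 0.2, 0.6]]
survivors = myselect(pop, 4, mycheck)
# step 1 keeps 4 individuals; step 2 drops the 0.4 one, so 3 survive (<= k)
```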
I'm trying to use fitness/function sharing on a minimization function. I'm using the standard definition of the sharing function found here which then divides the fitness by the niche count. This will lower the fitness, proportional to the amount of individuals in its niche. However, in my case the lower the fitness the more fit the individual is. How can I make my fitness sharing function increase the fitness proportionally to the amount of individuals in its niche?
Here's the code:
def evalTSPsharing(individual, radius, pop):
    individualFitness = evalTSP(individual)[0]
    nicheCount = 0
    for ind in pop:
        distance = abs(individualFitness - evalTSP(ind)[0])
        if distance < radius:
            nicheCount += 1 - (distance / radius)
    return (individualFitness / nicheCount,)
I couldn't find a non-pdf of the paper, but here's a picture of the relevant parts. Again, this is from the link above.
The question is two years old now, but I'll give it a try: you can replace the division by the niche count with a multiplication, i.e.
individualFitness * nicheCount
instead of:
individualFitness / nicheCount