I have a constrained optimization problem where I have a number of
products I want to spend money on and estimate my total returns based on models
I built for each individual product.
I'm using scipy.optimzie.minimize to find the optimal spend given the
individual models output. The problem I'm having is that the optimizer
finishes with a "optimizer terminated successfully" flag, but very clearly
does not find the optimal solution. In fact, the output using the original
seed/x0 is better than the output produced from the optimizer. I put a print
statement in the objective function and you can see at one point it just drops off
a cliff. Does anyone have any idea why this would happen/how to fix it?
I've included a reduced version of my code below.
If my products are ['P1','P2',... 'P9'], and I have a model for each of them
# estimates returns on spend for each product
model1 = lambda money : func1(money *betas_a)
model2 = lambda money : func2(money,*betas_b)
... etc
where func is one of
def power_curve(x,beta1,beta2):
return beta1*x**beta2
def mm_curve(x,beta1,beta2):
"Beta2 >= 0"
return (beta1*x)/(1+beta2*x)
def dbl_exponential(x,beta1,beta2,beta3,beta4):
return beta1**(beta2**(x/beta4))*beta3
def neg_exp(x,beta1,beta2):
"Beta2 > 0"
return beta1*(1-np.exp(-beta2*x))
I now want to optimize my spend in each in order to maximize my total returns.
To do this is use scipy.optimize.minimize with a wrapper around the following function:
def budget(products, budget, betas, models):
"""
Given a budget distributed across each product, estimate total returns.
products = str: names of each individual product
budget = list-floats: amount of money/spend corresponding to each product
models = list-funcs: function to use to predict individual returns corresponding to each product
betas = dict: keys are product names - values are list of betas to feed to corresponding model
"""
results = []
target_total = 0 # total returns
assert len(products) == len(budget) == len(betas)
# for each product, calculate return using corresponding model
for v,b,m in zip(products,budget,models):
tpred = m(b,*betas[v])
target_total+=tpred
results.append({'val':v,'budget':b, 'tpred':tpred})
# if you watch this you can see it drops off dramatically towards the end
print(target_total)
return results, target_total
Minimum Reproducible Code:
import numpy as np
from scipy import optimize
### Setup inputs to the optimizer
vals = ['P1','P2','P3','P4','P5','P6','P7','P8','P9']
funcs = [dbl_exponential,
mm_curve,
power_curve,
mm_curve,
neg_exp,
power_curve,
power_curve,
mm_curve,
dbl_exponential]
betas = {'P1': [0.018631215601097723,0.6881958654622713,43.84956270498627,
1002.1010110475437],
'P2': [0.002871159199956573, 1.1388317502737174e-06],
'P3': [0.06863672099961649, 0.7295132426289046],
'P4': [0.009954885796211378, 3.857169894090025e-05],
'P5': [307.624705578708, 1.4454030580404746e-05],
'P6': [0.0875910297422766, 0.6848303282418671],
'P7': [0.12147343508583974, 0.6573539731442877],
'P8': [0.002789390181221983, 5.72554293489956e-07],
'P9': [0.02826834133593836,0.8999750236756555,1494.677373273538,
6529.1531372261725]
}
bounds = [(4953.474502264151, 14860.423506792453),
(48189.39647820754, 144568.18943462262),
(10243.922611886792, 30731.767835660376),
(6904.288592358491, 20712.865777075473),
(23440.199762641503, 70320.5992879245),
(44043.909679905664, 132131.729039717),
(9428.298255754717, 28284.89476726415),
(53644.56626556605, 160933.69879669815),
(8205.906018773589, 24617.718056320766)]
seed = [9906.949005,
96378.792956,
20487.845224,
13808.577185,
46880.399525,
88087.81936,
18856.596512,
107289.132531,
16411.812038]
wrapper = lambda b: -budget(vals,b,betas, funcs)[1] # negative to get *maximum* output
## Optimizer Settings
tol = 1e-16
maxiter = 10000
max_budget = 400000
# total spend can't exceed max budget
constraint = [{'type':'ineq', 'fun': lambda budget: max_budget-sum(budget)}]
# The output from the seed is better than the final "optimized" output
print('DEFAULT OUTPUT TOTAL:', wrapper(seed))
res = optimize.minimize(wrapper, seed, bounds = bounds,
tol = tol, constraints = (constraint),
options={'maxiter':maxiter})
print("Optimizer Func Output:", res.fun)
As with most of my issues, it turned out to be something silly. The sum of the seed values i passed was greater than the max_budget constraint I gave it. Hence x0 was incompatible with my constraints. Why scipy didn't produce a corresponding warning or error i'm not sure. But this turned out to be the problem.
Probably you are getting stuck in a local minima as your function seems to be quite non-linear. Have you tried to perhaps change the optimisation method to something like BFGS.
res = optimize.minimize(wrapper, seed, bounds = bounds, method='BFGS',
tol = tol, constraints = (constraint),
options={'maxiter':maxiter})
I've tried solving the problem using this algorithm, and got -78464.52052455483.
Hope it helps
Related
In the following code I want to optimize a wind farm using a penalty function.
Using the first function(newsite), I have defined the wind turbines numbers and layout. Then in the next function, after importing x0(c=x0=initial guess), for each range of 10 wind directions (wd) I took the c values for the mean wd of each range. For instance, for wd:[0,10] mean value is 5 and I took c values of wd=5 and put it for all wd in the range[0,10] and for each wind speed(ws). I have to mention that c is the value that shows that wind turbines are off or on(c=0 means wt is off). then I have defined operating according to the c, which means that if operating is 0,c=0 and that wt is off.
Then I defined the penalty function to optimize power output. indeed wherever TI_eff>0.14, I need to implement a penalty function so this function must be subtracted from the original power output. For instance, if sim_res.TI_eff[1][2][3] > 0.14, so I need to apply penalty function so curr_func[1][2][3]=sim_res.Power[1][2][3]-10000*(sim_res.TI_eff[1][2][3]-0.14)**2.
The problem is that I run this code but it did not give me any results and I waited for long hours, I think it was stuck in a loop that could not reach converge. so I want to know what is the problem?
import time
from py_wake.examples.data.hornsrev1 import V80
from py_wake.examples.data.hornsrev1 import Hornsrev1Site # We work with the Horns Rev 1 site, which comes already set up with PyWake.
from py_wake import BastankhahGaussian
from py_wake.turbulence_models import GCLTurbulence
from py_wake.deflection_models.jimenez import JimenezWakeDeflection
from scipy.optimize import minimize
from py_wake.wind_turbines.power_ct_functions import PowerCtFunctionList, PowerCtTabular
import numpy as np
def newSite(x,y):
xNew=np.array([x[0]+560*i for i in range(4)])
yNew=np.array([y[0]+560*i for i in range(4)])
x_newsite=np.array([xNew[0],xNew[0],xNew[0],xNew[1]])
y_newsite=np.array([yNew[0],yNew[1],yNew[2],yNew[0]])
return (x_newsite,y_newsite)
def wt_simulation(c):
c = c.reshape(4,360,23)
site = Hornsrev1Site()
x, y = site.initial_position.T
x_newsite,y_newsite=newSite(x,y)
windTurbines = V80()
for item in range(4):
for j in range(10,370,10):
for i in range(j-10,j):
c[item][i]=c[item][j-5]
windTurbines.powerCtFunction = PowerCtFunctionList(
key='operating',
powerCtFunction_lst=[PowerCtTabular(ws=[0, 100], power=[0, 0], power_unit='w', ct=[0, 0]), # 0=No power and ct
windTurbines.powerCtFunction], # 1=Normal operation
default_value=1)
operating = np.ones((4,360,23)) # shape=(#wt,wd,ws)
operating[c <= 0.5]=0
wf_model = BastankhahGaussian(site, windTurbines,deflectionModel=JimenezWakeDeflection(),turbulenceModel=GCLTurbulence())
# run wind farm simulation
sim_res = wf_model(
x_newsite, y_newsite, # wind turbine positions
h=None, # wind turbine heights (defaults to the heights defined in windTurbines)
wd=None, # Wind direction (defaults to site.default_wd (0,1,...,360 if not overriden))
ws=None, # Wind speed (defaults to site.default_ws (3,4,...,25m/s if not overriden))
operating=operating
)
curr_func=np.ones((4,360,23))
for i in range(4):
for l in range(360):
for k in range(23):
if sim_res.TI_eff[i][l][k]-0.14 > 0 :
curr_func[i][l][k]=sim_res.Power[i][l][k]-10000*(sim_res.TI_eff[i][l][k]-0.14)**2
else:
curr_func[i][l][k]=sim_res.Power[i][l][k]
return -float(np.sum(curr_func)) # negative because of scipy minimize
t0 = time.perf_counter()
def solve():
wt =4 # for V80
wd=360
ws=23
x0 = np.ones((wt,wd,ws)).reshape(-1) # initial value for c
b=(0,1)
bounds=np.full((wt,wd,ws,2),b).reshape(-1, 2)
res = minimize(wt_simulation, x0=x0, bounds=bounds)
return res
res=solve()
print(f'success status: {res.success}')
print(f'aep: {-res.fun}') # negative to get the true maximum aep
print(f'c values: {res.x}\n')
print(f'elapse: {round(time.perf_counter() - t0)}s')
sim_res=wt_simulation(res.x)
There are a number of things in your approach that are either wrong or incomprehensible to me. Just for fun I have tried your code. A few observations:
Your set of parameters (optimization variables) has a shape of (4, 360, 23), i.e. you are looking at 33,120 parameters. There is no nonlinear optimization algorithm that is going to give you any meaningful answer to a problem that big. Ever. But then again, you shouldn't be looking at SciPy optimize if your optimization variables should only assume 0/1 values.
Calling SciPy minimize like this:
res = minimize(wt_simulation, x0=x0, bounds=bounds)
Is going to select a nonlinear optimizer between BFGS, L-BFGS-B or SLSQP (according to the documentation at https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html)
Those algorithms are gradient-based, and since you're not providing a gradient of your objective function SciPy is going to calculate them numerically. Good luck with that when you have 33,000 parameters. Never going to finish.
At the beginning of your objective function you are doing this:
for item in range(4):
for j in range(10,370,10):
for i in range(j-10,j):
c[item][i]=c[item][j-5]
I don't understand why you're doing it but you are overriding the input values of c coming from the optimizer.
Your objective function takes 20-25 seconds to evaluate on my powerful workstation. Even if you had only 10-15 optimization parameters, it would take you several days to get any answer out of an optimizer. You have 33,000+ variables. No way.
I don't know why you are doing this and why you're doing it the way you're doing it. You should rethink your approach.
I'm trying to perform solve an optimization problem in python. At this point, I'm already familiar with Scipy, but I didn't manage to use it properly with a constraint of unique integer values.
The example below might better fit mlrose tag.
In a high level, I'm trying to create a Swiss pairing, I have seen a few articles and one of those suggested a "Penalty Matrix" and to minimize the penalty. Here is what I have done:
import mlrose
import numpy as np
# Create Penalty Matrix
x = np.round(np.random.rand(8,8)*50)
z = np.eye(8, dtype=int)*100 + x
print(z)
# fitness problem given a penalty matrix and an order
def pairing_fittness(order, panalty):
print(order)
order = np.array(order)
a = np.bincount(order)
order = order.reshape(-1, 2)
PF = 0
for pair in order:
print("{}, {}: {}".format(pair[0],pair[1], panalty[pair[0],pair[1]]))
PF = PF + panalty[pair[0],pair[1]]
print("Real PF: ",PF)
print("order penalty: {}".format((np.max(a) - 1) * 1000))
return (PF + (np.max(a) - 1) * 1000)
One of the problems this tries to solve is to create an array with unique values (The same player can not play twice in the same round) this is why the penalty for duplicated values is high (1000)
kwargs = {'panalty': z}
fitness_cust_problem_fun = mlrose.CustomFitness(pairing_fittness, **kwargs)
problem = mlrose.DiscreteOpt(length = 8,
fitness_fn = fitness_cust_problem_fun,
maximize = False,
max_val = 8,
)
best_state, best_fitness = mlrose.simulated_annealing(
problem,
max_attempts = 300,
max_iters = 100000,
random_state = 1)
print(best_state)
print(best_fitness)
No matter what I do, with over 6 variables it can not find a unique values array to optimize it. While I can do it in Excel (Solver > Evolutionary).
I'm looking for a better tool (I used scipy.optimize but I'm not sure it works well for integer problems, as I got better results with mlrose) or suggestions on how to improve my optimization problem so it is solvable.
I am trying to create a fairly generic genetic algorithm implementation I'm TensorFlow. I have an implementation that is slow and am trying to increase its speed. I will provide a really simple example of where the program is getting slow and would welcome recommendations of improving the speed of the current implementation.
Let us say that we create the following:
W = tf.Variable(tf.convert_to_tensor(Warr, dtype=tf.float32))
X = tf.placeholder(dtype=tf.float32, shape=(3, None))
y = tf.placeholder(dtype=tf.float32, shape=(None) )
And we want to find W for the condition:
Warr = np.array([[0.1, 0, 0]])
Xarr = np.random.random((3, 100))
yarr = np.dot(Warr, Xarr)
A naive implementation (like the one I have created) goes thus:
1 a cost function is created for this implementation:
yHat = tf.matmul(W, X)
costFunction = tf.reduce_mean( tf.sqrt((y - yHat)*(y - yHat)) )
Note that the cost function can be arbitrarily complex and is not known apriori. Hence, it is something that will be passed into a class. Note that the rest of the code are excerpts within a class, but the main idea is easy to follow:
2 A population is generated (within a class).
self.population = []
for i in tqdm(range(self.GAconfig['numChildren'])):
temp = []
for j, v in enumerate(locVars):
v = (v + (np.random.random( v.shape ) - 0.5) * 2)
v = tf.Variable(tf.convert_to_tensor(v, dtype=tf.float32))
temp.append(v)
self.population.append( temp )
Finding the cost function for this population is a rather arduous task. First copy the weights in the population into the original weight tensor and then calculate the original cost function:
for ps in self.population:
for i, v in enumerate(self.variables):
sess.run(tf.assign( self.variables[i], ps[i] ))
result = sess.run(self.costFunction, feed_dict={
self.X : X, self.y : y
})
This implementation is obviously slow. One possible way would be to to generate a set of cost function tensors rather than weight variables, which can all be updated at once.
However, this is the point at which I am not sure what a "good implementation" would be that can improve the speed of the current implementation. Ant help will be greatly appreciated ...
Note: The full implementation is available here:
https://github.com/sankhaMukherjee/tfNNGA
It is at its very early stages, so the code at the moment is very bad.
The implementation of the GA function can be found in the file src/lib/GA/GA.py
Crossover is found within the GA class
This is called from within the file src/moduleGA/moduleGA.py
Have you tried next functions?
copy_variable_to_graph
copy_op_to_graph
get_copied_op
source
I have a function Imaginary which describes a physics process and I want to fit this to a dataset x_interpolate, y_interpolate. The function is a form of a Lorentzian peak function and I have some initial values that are user given, except for f_peak (the peak location) which I find using a peak finding algorithm. All of the fit parameters, except for the offset, are expected to be positive and thus I have set bounds_I accordingly.
def Imaginary(freq, alpha, res, Ms, off):
numerator = (2*alpha*freq*res**2)
denominator = (4*(alpha*res*freq)**2) + (res**2 - freq**2)**2
Im = Ms*(numerator/denominator) + off
return Im
pI = np.array([alpha_init, f_peak, Ms_init, 0])
bounds_I = ([0,0,0,0, -np.inf], [np.inf,np.inf,np.inf, np.inf])
poptI, pcovI = curve_fit(Imaginary, x_interpolate, y_interpolate, pI, bounds=bounds_I)
In some situations I want to keep the parameter f_peak fixed during the fitting process. I tried an easy solution by changing bounds_I to:
bounds_I = ([0,f_peak+0.001,0,0, -np.inf], [np.inf,f_peak-0.001,np.inf, np.inf])
This is for many reasons not an optimal way of doing this so I was wondering if there is a more Pythonic way of doing this? Thank you for your help
If a parameter is fixed, it is not really a parameter, so it should be removed from the list of parameters. Define a model that has that parameter replaced by a fixed value, and fit that. Example below, simplified for brevity and to be self-contained:
x = np.arange(10)
y = np.sqrt(x)
def parabola(x, a, b, c):
return a*x**2 + b*x + c
fit1 = curve_fit(parabola, x, y) # [-0.02989396, 0.56204598, 0.25337086]
b_fixed = 0.5
fit2 = curve_fit(lambda x, a, c: parabola(x, a, b_fixed, c), x, y)
The second call to fit returns [-0.02350478, 0.35048631], which are the optimal values of a and c. The value of b was fixed at 0.5.
Of course, the parameter should be removed from the initial vector pI and the bounds as well.
You might find lmfit (https://lmfit.github.io/lmfit-py/) helpful. This library adds a higher-level interface to the scipy optimization routines, aiming for a more Pythonic approach to optimization and curve fitting. For example, it uses Parameter objects to allow setting bounds and fixing parameters without having to modify the objective or model function. For curve-fitting, it defines high level Model functions that can be used.
For you example, you could use your Imaginary function as you've written it with
from lmfit import Model
lmodel = Model(Imaginary)
and then create Parameters (lmfit will name the Parameter objects according to your function signature), providing initial values:
params = lmodel.make_params(alpha=alpha_init, res=f_peak, Ms=Ms_init, off=0)
By default all Parameters are unbound and will vary in the fit, but you can modify these attributes (without rewriting the model function):
params['alpha'].min = 0
params['res'].min = 0
params['Ms'].min = 0
You can set one (or more) of the parameters to not vary in the fit as with:
params['res'].vary = False
To be clear: this does not require altering the model function, making it much easier to change with is fixed, what bounds might be imposed, and so forth.
You would then perform the fit with the model and these parameters:
result = lmodel.fit(y_interpolate, params, freq=x_interpolate)
you can get a report of fit statistics, best-fit values and uncertainties for parameters with
print(result.fit_report())
The best fit Parameters will be held in result.params.
FWIW, lmfit also has builtin Models for many common forms, including Lorentzian and a Constant offset. So, you could construct this model as
from lmfit.models import LorentzianModel, ConstantModel
mymodel = LorentzianModel(prefix='l_') + ConstantModel()
params = mymodel.make_params()
which will have Parameters named l_amplitude, l_center, l_sigma, and c (where c is the constant) and the model will use the name x for the independent variable (your freq). This approach can become very convenient when you may want to change the functional form of the peaks or background, or when fitting multiple peaks to a spectrum.
I was able to solve this issue regarding arbitrary number of parameters and arbitrary positioning of the fixed parameters:
def d_fit(x, y, param, boundMi, boundMx, listparam):
Sparam, SboundMi, SboundMx = asarray([]), asarray([]), asarray([])
Nparam, NboundMi, NboundMx = asarray([]), asarray([]), asarray([])
for i in range(len(param)):
if(listparam[i] == 1):
Sparam = append(Sparam,asarray(param[i]))
SboundMi = append(SboundMi,asarray(boundMi[i]))
SboundMx = append(SboundMx,asarray(boundMx[i]))
else:
Nparam = append(Nparam,asarray(param[i]))
def funF(x, Sparam):
j = 0
for i in range(len(param)):
if(listparam[i] == 1):
param[i] = Sparam[i-j]
else:
param[i] = Nparam[j]
j = j + 1
return fun(x, param)
return curve_fit(lambda x, *Sparam: funF(x, Sparam), x, y, p0 = Sparam, bounds = (SboundMi,SboundMx))
In this case:
param = [a,b,c,...] # parameters array (any size)
boundMi = [min_a, min_b, min_c,...] # minimum allowable value of each parameter
boundMx = [max_a, max_b, max_c,...] # maximum allowable value of each parameter
listparam = [0,1,1,0,...] # 1 = fit and 0 = fix the corresponding parameter in the fit routine
and the root function is define as
def fun(x, param):
a,b,c,d.... = param
return a*b/c... # any function of the params a,b,c,d...
This way, you can change the root function and the number of parameters without changing the fit routine.
And, at any time, you can fix or let fit any parameter by changing "listparam".
Use like this:
popt, pcov = d_fit(x, y, param, boundMi, boundMx, listparam)
"popt" and "pcov" are 1D arrays of the size of the number of "1" in "listparam" bringing the results of the fitted parameters (best value and err matrix)
"param" will ramain an 1D array of the same size of the original (input) "param", HOWEVER IT WILL BE UPDATED AUTOMATICALLY TO THE FITTED VALUES (same as "popt") for the fitted values, keeping the fixed values according to "listparam"
Hope can be usefull!
Obs1: x = 1D-array independent values and y = 1D-array dependent values
Obs2: This is my first post. Please let me know if I can improove it!
I would like to optimize over all 30 by 30 matrices with entries that are 0 or 1. My objective function is the determinant. One way to do this would be some sort of stochastic gradient descent or simulated annealing.
I looked at scipy.optimize but it doesn't seem to support this sort of optimization as far as I can tell. scipy.optimize.basinhopping looked very tempting but it seems to require continuous variables.
Are there any tools in Python for this sort of general discrete optimization?
I think a genetic algorithm might work quite well in this case. Here's a quick example thrown together using deap, based loosely on their example here:
import numpy as np
import deap
from deap import algorithms, base, tools
import imp
class GeneticDetMinimizer(object):
def __init__(self, N=30, popsize=500):
# we want the creator module to be local to this instance, since
# creator.create() directly adds new classes to the module's globals()
# (yuck!)
cr = imp.load_module('cr', *imp.find_module('creator', deap.__path__))
self._cr = cr
self._cr.create("FitnessMin", base.Fitness, weights=(-1.0,))
self._cr.create("Individual", np.ndarray, fitness=self._cr.FitnessMin)
self._tb = base.Toolbox()
# an 'individual' consists of an (N^2,) flat numpy array of 0s and 1s
self.N = N
self.indiv_size = N * N
self._tb.register("attr_bool", np.random.random_integers, 0, 1)
self._tb.register("individual", tools.initRepeat, self._cr.Individual,
self._tb.attr_bool, n=self.indiv_size)
# the 'population' consists of a list of such individuals
self._tb.register("population", tools.initRepeat, list,
self._tb.individual)
self._tb.register("evaluate", self.fitness)
self._tb.register("mate", self.crossover)
self._tb.register("mutate", tools.mutFlipBit, indpb=0.025)
self._tb.register("select", tools.selTournament, tournsize=3)
# create an initial population, and initialize a hall-of-fame to store
# the best individual
self.pop = self._tb.population(n=popsize)
self.hof = tools.HallOfFame(1, similar=np.array_equal)
# print summary statistics for the population on each iteration
self.stats = tools.Statistics(lambda ind: ind.fitness.values)
self.stats.register("avg", np.mean)
self.stats.register("std", np.std)
self.stats.register("min", np.min)
self.stats.register("max", np.max)
def fitness(self, individual):
"""
assigns a fitness value to each individual, based on the determinant
"""
return np.linalg.det(individual.reshape(self.N, self.N)),
def crossover(self, ind1, ind2):
"""
randomly swaps a subset of array values between two individuals
"""
size = self.indiv_size
cx1 = np.random.random_integers(0, size - 2)
cx2 = np.random.random_integers(cx1, size - 1)
ind1[cx1:cx2], ind2[cx1:cx2] = (
ind2[cx1:cx2].copy(), ind1[cx1:cx2].copy())
return ind1, ind2
def run(self, ngen=int(1E6), mutation_rate=0.3, crossover_rate=0.7):
np.random.seed(seed)
pop, log = algorithms.eaSimple(self.pop, self._tb,
cxpb=crossover_rate,
mutpb=mutation_rate,
ngen=ngen,
stats=self.stats,
halloffame=self.hof)
self.log = log
return self.hof[0].reshape(self.N, self.N), log
if __name__ == "__main__":
np.random.seed(0)
gd = GeneticDetMinimizer()
best, log = gd.run()
It takes about 40 seconds to run 1000 generations on my laptop, which gets me from a minimum determinant value of about -5.7845x108 to -6.41504x1011. I haven't really played around much with the meta-parameters (population size, mutation rate, crossover rate etc.), so I'm sure it's possible to do a lot better.
Here's a greatly improved version that implements a much smarter crossover function that swaps blocks of rows or columns across individuals, and uses a cachetools.LRUCache to guarantee that each mutation step produces a novel configuration, and to skip evaluation of the determinant for configurations that have already been tried:
import numpy as np
import deap
from deap import algorithms, base, tools
import imp
from cachetools import LRUCache
# used to control the size of the cache so that it doesn't exceed system memory
MAX_MEM_BYTES = 11E9
class GeneticDetMinimizer(object):
def __init__(self, N=30, popsize=500, cachesize=None, seed=0):
# an 'individual' consists of an (N^2,) flat numpy array of 0s and 1s
self.N = N
self.indiv_size = N * N
if cachesize is None:
cachesize = int(np.ceil(8 * MAX_MEM_BYTES / self.indiv_size))
self._gen = np.random.RandomState(seed)
# we want the creator module to be local to this instance, since
# creator.create() directly adds new classes to the module's globals()
# (yuck!)
cr = imp.load_module('cr', *imp.find_module('creator', deap.__path__))
self._cr = cr
self._cr.create("FitnessMin", base.Fitness, weights=(-1.0,))
self._cr.create("Individual", np.ndarray, fitness=self._cr.FitnessMin)
self._tb = base.Toolbox()
self._tb.register("attr_bool", self.random_bool)
self._tb.register("individual", tools.initRepeat, self._cr.Individual,
self._tb.attr_bool, n=self.indiv_size)
# the 'population' consists of a list of such individuals
self._tb.register("population", tools.initRepeat, list,
self._tb.individual)
self._tb.register("evaluate", self.fitness)
self._tb.register("mate", self.crossover)
self._tb.register("mutate", self.mutate, rate=0.002)
self._tb.register("select", tools.selTournament, tournsize=3)
# create an initial population, and initialize a hall-of-fame to store
# the best individual
self.pop = self._tb.population(n=popsize)
self.hof = tools.HallOfFame(1, similar=np.array_equal)
# print summary statistics for the population on each iteration
self.stats = tools.Statistics(lambda ind: ind.fitness.values)
self.stats.register("avg", np.mean)
self.stats.register("std", np.std)
self.stats.register("min", np.min)
self.stats.register("max", np.max)
# keep track of configurations that have already been visited
self.tabu = LRUCache(cachesize)
def random_bool(self, *args):
return self._gen.rand(*args) < 0.5
def mutate(self, ind, rate=1E-3):
"""
mutate an individual by bit-flipping one or more randomly chosen
elements
"""
# ensure that each mutation always introduces a novel configuration
while np.packbits(ind.astype(np.uint8)).tostring() in self.tabu:
n_flip = self._gen.binomial(self.indiv_size, rate)
if not n_flip:
continue
idx = self._gen.random_integers(0, self.indiv_size - 1, n_flip)
ind[idx] = ~ind[idx]
return ind,
def fitness(self, individual):
"""
assigns a fitness value to each individual, based on the determinant
"""
h = np.packbits(individual.astype(np.uint8)).tostring()
# look up the fitness for this configuration if it has already been
# encountered
if h not in self.tabu:
fitness = np.linalg.det(individual.reshape(self.N, self.N))
self.tabu.update({h: fitness})
else:
fitness = self.tabu[h]
return fitness,
def crossover(self, ind1, ind2):
"""
randomly swaps a block of rows or columns between two individuals
"""
cx1 = self._gen.random_integers(0, self.N - 2)
cx2 = self._gen.random_integers(cx1, self.N - 1)
ind1.shape = ind2.shape = self.N, self.N
if self._gen.rand() < 0.5:
# row swap
ind1[cx1:cx2, :], ind2[cx1:cx2, :] = (
ind2[cx1:cx2, :].copy(), ind1[cx1:cx2, :].copy())
else:
# column swap
ind1[:, cx1:cx2], ind2[:, cx1:cx2] = (
ind2[:, cx1:cx2].copy(), ind1[:, cx1:cx2].copy())
ind1.shape = ind2.shape = self.indiv_size,
return ind1, ind2
def run(self, ngen=int(1E6), mutation_rate=0.3, crossover_rate=0.7):
pop, log = algorithms.eaSimple(self.pop, self._tb,
cxpb=crossover_rate,
mutpb=mutation_rate,
ngen=ngen,
stats=self.stats,
halloffame=self.hof)
self.log = log
return self.hof[0].reshape(self.N, self.N), log
if __name__ == "__main__":
np.random.seed(0)
gd = GeneticDetMinimizer(0)
best, log = gd.run()
My best score thus far is about -3.23718x1013 -3.92366x1013 after 10000 1000 generations, which takes about 45 seconds on my machine.
Based on the solution cthonicdaemon linked to in the comments, the maximum determinant for a 31x31 Hadamard matrix must be at least 75960984159088×230 ~= 8.1562x1022 (it's not yet proven whether that solution is optimal). The maximum determinant for an (n-1 x n-1) binary matrix is 21-n times the value for an (n x n) Hadamard matrix, i.e. 8.1562x1022 x 2-30 ~= 7.5961x1013, so the genetic algorithm gets within an order of magnitude of the current best known solution.
However, the fitness function seems to plateau around here, and I'm having a hard time breaking -4x1013. Since it's a heuristic search there is no guarantee that it will eventually find the global optimum.
I don't know of any straight-forward method for discrete optimization in scipy. One alternative is using the simanneal package from pip or github, which allows you to introduce your own move function, such that you can restrict it to moves within your domain:
import random
import numpy as np
import simanneal
class BinaryAnnealer(simanneal.Annealer):
def move(self):
# choose a random entry in the matrix
i = random.randrange(self.state.size)
# flip the entry 0 <=> 1
self.state.flat[i] = 1 - self.state.flat[i]
def energy(self):
# evaluate the function to minimize
return -np.linalg.det(self.state)
matrix = np.zeros((5, 5))
opt = BinaryAnnealer(matrix)
print(opt.anneal())
I have looked into this a bit.
A couple of things first off: 1) 56 million is the max value when the size of the matrix is 21x21, not 30x30:https://en.wikipedia.org/wiki/Hadamard%27s_maximal_determinant_problem#Connection_of_the_maximal_determinant_problems_for_.7B1.2C.C2.A0.E2.88.921.7D_and_.7B0.2C.C2.A01.7D_matrices.
But that is also an upper bound on -1, 1 matrices, not 1,0.
EDIT: Reading more carefully from that link:
The maximal determinants of {1, −1} matrices up to size n = 21 are given in the following table. Size 22 is the smallest open case. In the table, D(n) represents the maximal determinant divided by 2n−1. Equivalently, D(n) represents the maximal determinant of a {0, 1} matrix of size n−1.
So that table can be used for upper bounds, but remember they're divided by 2n−1. Also note that 22 is the smallest open case, so trying to find the maximum of a 30x30 matrix has not been done, and is not even close to being done just yet.
2) The reason David Zwicker's code gives an answer of 30 million is probably due to the fact that he's minimising. Not maximising.
return -np.linalg.det(self.state)
See how he's got the minus sign there?
3) Also, the solution space for this problem is very big. I calculate the number of different matrices to be 2^(30*30) i.e. in the order of 10^270. So looking at each matrix is simply impossible, and even look at most of them is too.
I have a bit of code here (adapted from David Zwicker's code) that runs, but I have no idea how close it is to the actual maximum. It takes around 45 mins to do 10 million iterations on my PC, or only about 2 mins for 1 mill iterations. I get a max value of around 3.4 billion. But again, I have no idea how close this is to the theoretical maximum.
import numpy as np
import random
import time
MATRIX_SIZE = 30
def Main():
startTime = time.time()
mat = np.zeros((MATRIX_SIZE, MATRIX_SIZE), dtype = int)
for i in range(MATRIX_SIZE):
for j in range(MATRIX_SIZE):
mat[i,j] = random.randrange(2)
print("Starting matrix:\n", mat)
maxDeterminant = 0
for i in range(1000000):
# choose a random entry in the matrix
x = random.randrange(MATRIX_SIZE)
y = random.randrange(MATRIX_SIZE)
mat[x,y] = 1 - mat[x,y]
#print(mat)
detValue = np.linalg.det(mat)
if detValue > maxDeterminant:
maxDeterminant = detValue
timeTakenStr = "\nTotal time to complete: " + str(round(time.time() - startTime, 4)) + " seconds"
print(timeTakenStr )
print(maxDeterminant)
Main()
Does this help?