PyGMO Batch fitness evaluation - python

My goal is to perform a parameter estimation (model calibration) using PyGMO. My model will be an external "black box" model (C code) outputting the objective function J to be minimized; J in this case will be the normalized root mean square error (NRMSE) between model outputs and measured data. To speed up the optimization (calibration) I would like to run my models/simulations on multiple cores/threads in parallel, so I would like to use a batch fitness evaluator (bfe) in PyGMO. I prepared a minimal example using a simple problem class in pure Python (no external model) based on the Rosenbrock problem:
#!/usr/bin/env python
# coding: utf-8
import numpy as np
from fmpy import read_model_description, extract, simulate_fmu, freeLibrary
from fmpy.fmi2 import FMU2Slave
import pygmo as pg
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
from matplotlib import cm
import time
#-------------------------------------------------------
def main():
    # Optimization
    # Define problem
    class my_problem:
        def __init__(self, dim):
            self.dim = dim

        def fitness(self, x):
            J = np.zeros((1,))
            for i in range(len(x) - 1):
                J[0] += 100.*(x[i + 1]-x[i]**2)**2+(1.-x[i])**2
            return J

        def get_bounds(self):
            return (np.full((self.dim,), -5.), np.full((self.dim,), 10.))

        def get_name(self):
            return "My implementation of the Rosenbrock problem"

        def get_extra_info(self):
            return "\nDimensions: " + str(self.dim)

        def batch_fitness(self, dvs):
            J = [123] * len(dvs)
            return J

    prob = pg.problem(my_problem(30))
    print('\n----------------------------------------------')
    print('\nProblem description: \n')
    print(prob)
    #-------------------------------------------------------
    dvs = pg.batch_random_decision_vector(prob, 1)
    print('\n----------------------------------------------')
    print('\nBatch fitness evaluation:')
    print('\ndvs length:' + str(len(dvs)))
    print('\ndvs:')
    print(dvs)
    udbfe = pg.default_bfe()
    b = pg.bfe(udbfe=udbfe)
    print('\nudbfe:')
    print(udbfe)
    print('\nbfe:')
    print(b)
    fvs = b(prob, dvs)
    print(fvs)
    #-------------------------------------------------------
    pop_size = 50
    gen_size = 1000
    algo = pg.algorithm(pg.sade(gen=gen_size))  # the algorithm: a self-adaptive form of differential evolution (sade, jDE variant)
    algo.set_verbosity(int(gen_size/10))        # log a line every gen_size/10 (here: 100) generations
    print('\n----------------------------------------------')
    print('\nOptimization:')
    start = time.time()
    pop = pg.population(prob, size=pop_size)    # the initial population
    pop = algo.evolve(pop)                      # the actual optimization process
    best_fitness = pop.get_f()[pop.best_idx()]  # getting the best individual in the population
    print('\n----------------------------------------------')
    print('\nResult:')
    print('\nBest fitness: ', best_fitness)     # print the best fitness value
    best_parameterset = pop.get_x()[pop.best_idx()]
    print('\nBest parameter set: ', best_parameterset)
    print('\nTime elapsed for optimization: ', time.time() - start, ' seconds\n')

if __name__ == '__main__':
    main()
When I try to run this code I get the following error:
Exception has occurred: ValueError
function: bfe_check_output_fvs
where: C:\projects\pagmo2\src\detail\bfe_impl.cpp, 103
what: An invalid result was produced by a batch fitness evaluation: the number of produced fitness vectors, 30, differs from the number of input decision vectors, 1
By deleting or commenting out these two lines:
fvs = b(prob, dvs)
print(fvs)
the script runs without errors.
My questions:
How do I use the batch fitness evaluation? (I know this is a new capability of PyGMO and the documentation is still being worked on.) Can anybody give a minimal example of how to implement this?
Is this the right way to go to speed up my model calibration problem, or should I use islands and archipelagos? If I understand correctly, the islands in an archipelago do not communicate with each other, right? So if one performs e.g. a particle swarm optimization and wants to evaluate several objective function calls simultaneously (in parallel), is the batch fitness evaluator the right choice?
Do I need to care about archipelagos and islands in this example? What exactly are they meant for? Is it worth running several optimizations with different initial x (the input to the objective function) and then taking the best solution? Is this a common approach in optimization with GAs?
I am very new to the field of optimization and PyGMO, so thanks for helping!

Is this the right way to go to speed up my model calibration problem, or should I use islands and archipelagos? If I understand correctly, the islands in an archipelago do not communicate with each other, right? So if one performs e.g. a particle swarm optimization and wants to evaluate several objective function calls simultaneously (in parallel), is the batch fitness evaluator the right choice?
There are two modes of parallelization in pagmo: the island model (i.e., coarse-grained parallelization) and the BFE machinery (i.e., fine-grained parallelization).
The island model works on any problem/algorithm combination, and it is based on the idea that multiple optimisations are run in parallel while exchanging information to accelerate the global convergence to a solution.
The BFE machinery, instead, parallelizes a single optimisation, and it requires explicit support in the solver to work. Currently in pagmo only a handful of solvers are able to take advantage of the BFE machinery. The BFE machinery can also be used to parallelise the initialisation of a population of individuals, which can be useful if your fitness function is particularly heavyweight; see the sketch below.
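A minimal sketch of that last point, assuming the `b` keyword of the population constructor (which, if I recall the pygmo signature correctly, accepts a bfe for the initial fitness evaluations):

import pygmo as pg

prob = pg.problem(pg.rosenbrock(30))
b = pg.bfe()  # default-constructed batch fitness evaluator
# The initial fitness evaluations for all 50 individuals are dispatched
# through the bfe (and can thus run in parallel) instead of one by one.
pop = pg.population(prob, size=50, b=b)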
Which parallelisation method is best for you depends on the properties of your problem. In my experience, users tend to prefer the BFE machinery (fine-grained parallelisation) if the fitness function is very heavy (e.g., it takes minutes or more to compute), because in such a situation fitness evaluations are so costly that in order to take advantage of the island model one would have to wait too long. The BFE is also in some sense easier to understand because you don't have to delve into the details of archipelagos, topologies, etc. On the other hand, the BFE works only with certain solvers (although we are trying to extend BFE support to other solvers as time goes by).
How do I use the batch fitness evaluation? (I know this is a new capability of PyGMO and the documentation is still being worked on.) Can anybody give a minimal example of how to implement this?
One way of using the BFE is what you did in your example, i.e., via the implementation of a batch_fitness() method in your problem. However, my suggestion would be to comment out the batch_fitness() method and try using one of the general-purpose batch fitness evaluators provided with pagmo. The easiest thing to do is to just default-construct an instance of the bfe class and then pass it to one of the algorithms that can use the BFE machinery. One such algorithm is nspso:
https://esa.github.io/pygmo2/algorithms.html#pygmo.nspso
So, something like this:
b = pg.bfe() # Construct a default BFE
uda = pg.nspso(gen = gen_size) # Construct the algorithm
uda.set_bfe(b) # Tell the UDA to use the BFE machinery
algo = pg.algorithm(uda) # Construct a pg.algorithm from the UDA
new_pop = algo.evolve(pop) # Evolve the population
This should use multiple processes to evaluate your fitness functions in parallel within the loop of the nspso algorithm.
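For completeness, if you do want to keep a batch_fitness() method: pagmo's contract is that dvs arrives as a single flattened 1-D array holding all the decision vectors back to back, and the return value must likewise be the flattened concatenation of the corresponding fitness vectors. That is exactly why your example fails: [123] * len(dvs) counts the 30 scalar components of one 30-dimensional decision vector, not the 1 decision vector itself. A minimal sketch of a conforming method for the my_problem class above (serial here; a real implementation would farm the loop out to worker processes):

def batch_fitness(self, dvs):
    # dvs is one flattened 1-D array containing n_dvs * self.dim values;
    # reshape so that each row is one decision vector.
    dvs = np.asarray(dvs).reshape(-1, self.dim)
    # Return one fitness value per decision vector (this is a
    # single-objective problem), again as a flattened array.
    fvs = np.zeros(dvs.shape[0])
    for k, x in enumerate(dvs):
        fvs[k] = self.fitness(x)[0]
    return fvs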
If you need more help, please come over to our public users/devs chat room, where you should get assistance rather quickly (normally):
https://gitter.im/pagmo2/Lobby

Related

Python Pandas: how to correctly speed up a class __init__ function with Numba?

I have a class that performs various mathematical calculations in a loop, and I want to speed up its processing with Numba. Right now I'm trying to apply Numba to the __init__ function. The class and its __init__ function look like this:
class Generic(object):
    # @numba.vectorize
    def __init__(self, N, N1, S1, S2):
        A = [['Time'], ['Time', 'Price'], ['Time', 'Qty'],
             ['Time', 'Open_interest'], ['Time', 'Operation', 'Quantity']]
        self.table = pd.DataFrame(pd.read_csv('Datasets\\RobotMath\\table_OI.csv'))
        self.Reader()
        for a in A:
            if 'Time' in a:
                self.df = pd.DataFrame(pd.read_csv(Ex2_Csv, usecols=a, parse_dates=[0]))
                self.df['Time'] = self.df['Time'].dt.floor("S", 0)
                self.df['Time'] = pd.to_datetime(self.df['Time']).dt.time
                if a == ['Time']:
                    self.Tik()
                elif a == ['Time', 'Price']:
                    self.Poc()
                    self.Pmm()
                    self.SredPrice()
                    self.Delta_Ema(N, N1)
                    self.Comulative(S1, S2)
                    self.M()
                elif a == ['Time', 'Qty']:
                    self.Volume()
                elif a == ['Time', 'Open_interest']:
                    self.Open_intrest()
                elif a == ['Time', 'Operation', 'Quantity']:
                    self.D()
                    # self.Dataset()
            else:
                print('Something went wrong', 'Set Error: {0}'.format(a))
All functions of the class are ordinary column calculations using Pandas.
Here are two of them for example:
def Tik(self):
    df2 = self.df.groupby('Time').value_counts(ascending=False)
    df2.to_csv('Datasets\\RobotMath\\Tik.csv')

def Poc(self):
    g = self.df.groupby('Time', sort=False)
    out = (g.last() - g.first()).reset_index()
    out.to_csv('Datasets\\RobotMath\\Poc.csv', index=False)
I tried to use Numba in different ways, but I got an error every time.
Is it possible to speed up the __init__ function itself? Or do I need to look for another way, one that can't avoid rewriting the class?
There's nothing special or magical about the __init__ function; at run time it's just another function like any other.
In terms of performance, your class is doing quite a lot here - it might be worth breaking down the timings of each component first to establish where your performance hang-ups lie.
For example, just reading in the files might be responsible for a fair amount of that time, and for Ex2_Csv you do that repeatedly within a loop, which is likely to be sub-optimal depending on the volume of data you're dealing with (one way around it is sketched below). Before targeting a fix (like Numba), it'd be prudent to identify which aspects of the code are, and are not, performing within expectations.
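As a sketch of that point (Ex2_Csv and the column lists reuse the names from your snippet), you could read the CSV once with the union of all needed columns and then slice per-task views:

import pandas as pd

# Read the file once with every column any task needs...
ALL_COLS = ['Time', 'Price', 'Qty', 'Open_interest', 'Operation', 'Quantity']
full = pd.read_csv(Ex2_Csv, usecols=ALL_COLS, parse_dates=[0])

# ...then hand each task a cheap column subset instead of re-reading from disk.
A = [['Time'], ['Time', 'Price'], ['Time', 'Qty'],
     ['Time', 'Open_interest'], ['Time', 'Operation', 'Quantity']]
for a in A:
    df = full[a].copy()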
You can gather that information in a number of ways, but the simplest might be to add in some tagged print statements that emit the elapsed time since the last print statement.
e.g.
import datetime

start = datetime.datetime.now()
print("starting", start)

# ... some block of code that does XXX ...

XXX_finish = datetime.datetime.now()
print("XXX finished at", XXX_finish, "taking", XXX_finish - start)

# ... and repeat to generate a runtime report showing timings for each block of code
Then you can break the runtime of your program down into feature-aligned chunks, and when tweaking you'll be able to see the effect directly. Profiling the runtime of your code like this can really help when making performance tweaks, and it sharpens focus on which tweaks are benefiting or harming specific areas of your code.
For example, in the section that performs the group-bys, timestamp outputs before and after let you compare running it with Numba turned on and then again with it turned off.
From there, with a little careful debugging, sensible logging of times and a bit of old-fashioned sleuthing (I tend to jot these kinds of things down with paper and pencil) you (and if you share your findings, we) ought to be in a better position to answer your question more precisely.
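As an alternative to hand-rolled timing prints, Python's built-in cProfile module can produce a similar breakdown automatically. A minimal sketch (assuming N, N1, S1, S2 are defined in the calling scope):

import cProfile
import pstats

# Profile the whole constructor call and dump the stats to a file...
cProfile.run('Generic(N, N1, S1, S2)', 'init_profile')

# ...then list the ten most expensive calls by cumulative time.
pstats.Stats('init_profile').sort_stats('cumulative').print_stats(10)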

Ax.dev does not do anything with search space constraints?

I found some articles online that mentioned Ax.dev's capability to cope with a constrained search space (e.g. dimension_x + dimension_y <= bound). However, in my experience Ax.dev ignores/violates all constraints. I have tried several different constraints on the Hartmann6d example. I assume Ax.dev models the constraints as soft constraints (not sure, though; it might as well be my coding skills...). So my first question is: does Ax.dev's SearchSpace use parameter_constraints as soft or hard constraints?
My second problem:
from ax import *

# number of parameters
...
c0 = SumConstraint(parameters=[some parameters], bound=some boundary)
c1 = ...
space = SearchSpace(parameters=[parameters], parameter_constraints=[c0, c1])
exp = SimpleExperiment(
    name='EXPERIMENT5',
    search_space=space,
    evaluation_function=black_box_function,
    objective_name='BLABLA',
    minimize=False,
)
sobol = Models.SOBOL(exp.search_space)
for i in range(10):
    exp.new_trial(generator_run=sobol.gen(1))
    exp.trials[len(exp.trials) - 1].run()
returns
SearchSpaceExhausted: Rejection sampling error (specified maximum draws (100000) exhausted, without finding sufficiently many (1) candidates). This likely means that there are no new points left in the search space.
I have not been able to find useful information about this, despite all the promising articles online touting Ax.dev's benefits (such as a constrained parameter space!) :(
Meta-comment: this is probably a better question for GitHub issues (there isn't much Ax help or documentation on Stack Overflow to my knowledge, but their GitHub issues are rich and generally have a lot of developer/community support).
Does Ax.dev SearchSpace use parameter_constraints as soft or hard constraint(s)?
I think parameter constraints are hard constraints (pretty sure that's the case at least for Sobol sampling, but I'm not sure about the Bayesian models).
Outcome constraints, on the other hand, are treated as soft penalties.
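To illustrate the distinction, a hedged sketch (the metric name is a placeholder; an OutcomeConstraint applies to a modelled outcome, not to the parameters themselves):

from ax import ComparisonOp, Metric, OutcomeConstraint

# A soft outcome constraint: the model is *encouraged* to keep the predicted
# value of "constraint_metric" below 10, whereas parameter constraints
# reject candidate points outright during generation.
oc = OutcomeConstraint(
    metric=Metric(name="constraint_metric"),
    op=ComparisonOp.LEQ,
    bound=10.0,
    relative=False,
)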
SearchSpaceExhausted: Rejection sampling error
Search: https://github.com/facebook/Ax/issues?q=is%3Aissue+sort%3Aupdated-desc+specified+maximum+draws+is%3Aclosed
--> https://github.com/facebook/Ax/issues/694
--> https://github.com/facebook/Ax/issues/694#issuecomment-987353936
Since it's on master, first I install the latest version in a conda environment:
pip install 'git+https://github.com/facebook/Ax.git#egg=ax-platform'
The relevant imports are:
from ax.modelbridge.generation_strategy import GenerationStrategy, GenerationStep
from ax.modelbridge.registry import Models
Then based on 1A. Manually configured generation strategy, I change the first GenerationStep model_kwargs from:
model_kwargs={"seed": 999}
to
model_kwargs={
    "seed": 999,
    "fallback_to_sample_polytope": True,
}
With the full generation strategy (gs) given by:
gs = GenerationStrategy(
    steps=[
        # 1. Initialization step (does not require pre-existing data and is
        # well-suited for initial sampling of the search space)
        GenerationStep(
            model=Models.SOBOL,
            num_trials=5,  # how many trials should be produced from this generation step
            min_trials_observed=3,  # how many trials need to be completed to move to next model
            max_parallelism=5,  # max parallelism for this step
            model_kwargs={
                "seed": 999,
                "fallback_to_sample_polytope": True,
            },  # any kwargs you want passed into the model
            model_gen_kwargs={},  # any kwargs you want passed to `modelbridge.gen`
        ),
        # 2. Bayesian optimization step (requires data obtained from previous phase and learns
        # from all data available at the time of each new candidate generation call)
        GenerationStep(
            model=Models.GPEI,
            num_trials=-1,  # no limitation on how many trials should be produced from this step
            max_parallelism=3,  # parallelism limit for this step, often lower than for Sobol
            # More on parallelism vs. required samples in BayesOpt:
            # https://ax.dev/docs/bayesopt.html#tradeoff-between-parallelism-and-total-number-of-trials
        ),
    ]
)
Finally, in the case of this issue, and as mentioned:
AxClient(generation_strategy=gs)
Or in the case of the Loop API:
optimize(..., generation_strategy=gs)
Seems to work well for my use-case; thank you! I'll try to update the other relevant issues soon.

Use solve_ivp's 'events' to check for convergence

Problem:
Assume a simple decay process as described by the following ODE:

def exponential_decay(t, y):
    return -0.5 * y
This can easily be integrated with the help of scipy's solve_ivp():

from scipy.integrate import solve_ivp

t_min = 0; t_max = 25; y0 = 1
sol = solve_ivp(exponential_decay, [t_min, t_max], [y0], dense_output=True)
The resulting solution might look like this: [plot of the decaying solution omitted]
Question:
I would like to use solve_ivp's event finder to check for convergence, in order to reduce the computational time spent after convergence is reached.
However, the required signature of an event function is:

event(t, y) -> float

where an event occurs when the returned value crosses zero.
Because event(t, y) only knows the current y(t), it cannot be used straightforwardly to implement standard convergence criteria, as these all require a series of y values.
So, to cut this short: is there a good way to use the event finder to check for convergence, or to make use of some kind of history of y(t) in the convergence tracker?
This seems like something that would be helpful in many applications
A (bad) way to do this that I found is to use a global variable inside event(t, y) that stores the successive (t, y(t)) pairs. However, this is not only extremely inelegant, it also offsets the computational efficiency provided by solve_ivp().
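For reference, a minimal sketch of that stateful-event idea, using a closure instead of a module-level global (the tolerance and bookkeeping are illustrative; as noted above this stays fragile, because the solver may evaluate the event function at trial points that are not monotonic in t):

from scipy.integrate import solve_ivp

def exponential_decay(t, y):
    return -0.5 * y

def make_convergence_event(tol=1e-6):
    history = []  # (t, y) samples seen by the event function so far

    def event(t, y):
        history.append((t, y[0]))
        if len(history) < 2:
            return 1.0  # no event possible on the first call
        # Goes negative once consecutive samples differ by less than tol.
        return abs(history[-1][1] - history[-2][1]) - tol

    return event

conv = make_convergence_event()
conv.terminal = True  # stop integrating when the event fires

sol = solve_ivp(exponential_decay, [0, 25], [1.0], events=conv, dense_output=True)
print(sol.t_events)

For this particular problem a history-free alternative also exists: convergence of a decay means the derivative tends to zero, so an event like abs(exponential_decay(t, y)[0]) - tol needs only the current state.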

High memory usage when doing direct transcription with sympy equations

I used sympy to derive, via Lagrange, the equations of motion of my 3-link robot. The resulting equations of motion, in the form theta_dot_dot = f(theta, theta_dot), turned out very complicated with A LOT of cos and sin terms. I then lambdified the functions for use with Drake, replacing all the sympy.sin and sympy.cos with drake.sin and drake.cos.
The final function can be evaluated numerically (i.e. given theta and theta_dot, find theta_dot_dot) somewhat efficiently, in the milliseconds range.
I then tried to use direct transcription to do trajectory optimization. Note that I did not use the DirectTranscription class; instead I manually added the necessary constraints.
The constraints are added roughly as follows:
for i in range(NUM_TIME_STEPS - 1):
    print("Adding constraints for t = " + str(i))
    tau = mp.NewContinuousVariables(3, "tau_%d" % i)
    next_state = mp.NewContinuousVariables(8, "state_%d" % (i + 1))
    for j in range(8):
        mp.AddConstraint(next_state[j] <= (state_over_time[i] + TIME_INTERVAL*derivs(state_over_time[i], tau))[j])
        mp.AddConstraint(next_state[j] >= (state_over_time[i] + TIME_INTERVAL*derivs(state_over_time[i], tau))[j])
    state_over_time[i+1] = next_state
    tau_over_time[i] = tau
The problem I'm facing is that each iteration of adding constraints increases my memory usage by around 70-100 MB. This means my number of time steps cannot exceed about 50 before the program crashes due to running out of memory.
I'm wondering what I can do to make trajectory optimization work for my robot. Obviously I can try to simplify (by hand or otherwise) the equations of motion... but is there anything else I can try? Is it even normal for the constraints to take up so much memory? Am I doing something very wrong here?
You're pushing Drake's symbolic engine through your complex equations. Making that better is a good goal, but you probably want to avoid it altogether by using the other overload of AddConstraint:
AddConstraint(your_method, lb, ub, vars)
https://drake.mit.edu/pydrake/pydrake.solvers.mathematicalprogram.html?highlight=addconstraint#pydrake.solvers.mathematicalprogram.MathematicalProgram.AddConstraint
That overload uses your Python code as a function pointer and evaluates it with autodiff instead of symbolic expressions; see the sketch below.
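A hedged sketch of that overload applied to the loop above (the names reuse the question's variables; it assumes state_over_time and tau_over_time hold the decision variables and that derivs can be evaluated on autodiff arrays):

import numpy as np

def dynamics_defect(vars):
    # vars stacks [state_i (8), tau_i (3), state_{i+1} (8)].
    state, tau, next_state = vars[:8], vars[8:11], vars[11:]
    # Euler defect: zero when next_state matches the integrated dynamics.
    return next_state - (state + TIME_INTERVAL * derivs(state, tau))

for i in range(NUM_TIME_STEPS - 1):
    vars_i = np.concatenate(
        (state_over_time[i], tau_over_time[i], state_over_time[i + 1]))
    # lb == ub == 0 makes this an equality constraint, replacing the two
    # symbolic inequalities per state component.
    mp.AddConstraint(dynamics_defect, lb=np.zeros(8), ub=np.zeros(8), vars=vars_i)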

Why is my sklearn t-sne function quitting before reaching its maximum iteration

I am trying to run a t-SNE analysis on a square distance matrix. These are the commands I am using:

from sklearn.manifold import TSNE

model = TSNE(n_components=2, perplexity=32, verbose=10, n_iter=1000, metric="precomputed")
embeddings = model.fit_transform(D)
This is the output I receive: [screenshot of the verbose t-SNE log omitted]
It looks like the program runs through 75 iterations, then calls it good and quits. When I plot the data from the t-SNE, it's basically just a single dense blob. Why is the program quitting early, and how can I make it run longer?
It's quitting because the exit condition has been reached.
Interpreting the log, the exit condition is probably a metric on the gradient, called the gradient norm here. If needed, check out the basics of gradient descent to understand the intuition: since every iteration takes a step in the direction of the negative gradient, tiny gradients do little to the objective (and are interpreted as: we found a local/global minimum).
It looks like (still interpreting your log only):
if np.linalg.norm(gradient) < 1e-4:
    return solution
There is no merit in doing more iterations for this parameterization of the optimization problem; the solution won't get better (in terms of minimization).
You can only try other parameters (resulting in other optimization problems).
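If the goal is simply to force more iterations, sklearn's TSNE exposes the relevant thresholds as the min_grad_norm and n_iter_without_progress parameters, so loosening them relaxes the exit conditions; whether that yields a better embedding is a separate question. A minimal sketch:

from sklearn.manifold import TSNE

model = TSNE(
    n_components=2,
    perplexity=32,
    verbose=10,
    n_iter=1000,
    metric="precomputed",
    init="random",                 # required with a precomputed metric in recent sklearn versions
    min_grad_norm=1e-10,           # default is 1e-7; smaller means a later exit
    n_iter_without_progress=1000,  # tolerate long plateaus
)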
