How to sample from multiple chains with GPflow?


I have recently started using GPflow to build a GP model. I used Hamiltonian Monte Carlo to sample from the posterior with a single chain.
My goal is to run multiple chains and perform convergence diagnostics.
This is my setup for one chain:
import gpflow
import tensorflow as tf
import tensorflow_probability as tfp
# `model`, `ci_niter` and `f64` are defined earlier (not shown here)
num_burnin_steps = ci_niter(100)
num_samples = ci_niter(500)
hmc_helper = gpflow.optimizers.SamplingHelper(
    model.log_posterior_density, model.trainable_parameters
)
hmc = tfp.mcmc.HamiltonianMonteCarlo(
    target_log_prob_fn=hmc_helper.target_log_prob_fn, num_leapfrog_steps=10, step_size=0.01
)
adaptive_hmc = tfp.mcmc.SimpleStepSizeAdaptation(
    hmc, num_adaptation_steps=10, target_accept_prob=f64(0.75), adaptation_rate=0.1
)
@tf.function
def run_chain_fn():
    return tfp.mcmc.sample_chain(
        num_results=num_samples,
        num_burnin_steps=num_burnin_steps,
        current_state=hmc_helper.current_state,
        kernel=adaptive_hmc,
        trace_fn=lambda _, pkr: pkr.inner_results.is_accepted,
    )
samples, _ = run_chain_fn()
constrained_samples = hmc_helper.convert_to_constrained_values(samples)
I cannot find any examples of how to modify this for multiple chains. Here is the closest example I can find that does the same thing. In that example, initial_state is altered to hold 10 chains. To do the same with my example I would need to alter hmc_helper.current_state, but I cannot figure out the right way to do this.
Any suggestions would be greatly appreciated.
I am new to Stack Overflow, so apologies if my question is not clear.
Thanks!
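Once samples from several chains are available (for example by running the single-chain setup above from different starting states, which is an assumption here and not a documented GPflow recipe), a common convergence diagnostic is the potential scale reduction factor, R-hat. The sketch below computes it in plain Python for one scalar parameter. The function name gelman_rubin_rhat and the synthetic chains are illustrative only. TensorFlow Probability also provides tfp.mcmc.potential_scale_reduction for the same purpose.

```python
import random
import statistics


def gelman_rubin_rhat(chains):
    """Potential scale reduction (R-hat) for one scalar parameter.

    `chains` is a list of equal-length lists of samples, one list per chain.
    """
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    chain_vars = [statistics.variance(c) for c in chains]  # sample variances
    within = statistics.fmean(chain_vars)            # W: mean within-chain variance
    between = n * statistics.variance(chain_means)   # B: between-chain variance
    var_hat = (n - 1) / n * within + between / n     # pooled variance estimate
    return (var_hat / within) ** 0.5


rng = random.Random(0)
# Synthetic stand-ins for MCMC output: four chains of 500 draws each.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(500)] for _ in range(4)]
# Chains centred at 0, 3, 6 and 9 mimic samplers stuck in different regions.
stuck = [[rng.gauss(3.0 * k, 1.0) for _ in range(500)] for k in range(4)]

rhat_mixed = gelman_rubin_rhat(mixed)
rhat_stuck = gelman_rubin_rhat(stuck)
print(rhat_mixed, rhat_stuck)
```

Values close to 1 suggest the chains agree with each other; values well above 1 suggest they have not converged to the same distribution.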

Related

running multiple ray Tuning in parallel using a search algorithm

I want to queue 200+ tuning jobs on my Ray cluster. Each needs to be guided by a search algorithm, as my actual objective function has 40+ parameters.
I can do this for a single job like this:
import ray
from ray import tune
from ray.tune import Tuner, TuneConfig
from ray.tune.search.optuna import OptunaSearch
ray.init()
def objective(config):
    ground_truth = [1, 2, 3, 4]
    yhat = [i * config['factor'] + config['constant'] for i in range(4)]
    abs_err = [abs(gt - yh) for gt, yh in zip(ground_truth, yhat)]
    mae = sum(abs_err) / len(abs_err)
    tune.report(mean_accuracy=mae)
config = {
    'factor': tune.quniform(0, 3, 1),
    'constant': tune.quniform(0, 3, 1)
}
algo = OptunaSearch()
tuner = tune.Tuner(
    objective,
    tune_config=TuneConfig(
        metric="mean_accuracy",
        mode="min",
        search_alg=algo,
        num_samples=100
    ),
    param_space=config
)
results = tuner.fit()
This works and gives the desired result for 1 of the 200 jobs.
Now I want to queue up to 200 jobs from a single run of a single script:
As far as I understood the documentation this is how that should work:
import ray
from ray import tune
ray.init()
def objective(config):
    ground_truth = [1, 2, 3, 4]
    yhat = [i * config['factor'] + config['constant'] for i in range(4)]
    abs_err = [abs(gt - yh) for gt, yh in zip(ground_truth, yhat)]
    mae = sum(abs_err) / len(abs_err)
    tune.report(mean_accuracy=mae)
config = {
    'factor': tune.quniform(0, 3, 1),
    'constant': tune.quniform(0, 3, 1)
}
experiments = []
for i in range(3):
    experiment_spec = tune.Experiment(
        name=f'{i}',
        run=objective,
        stop={"mean_accuracy": 0},
        config=config,
        num_samples=10
    )
    experiments.append(experiment_spec)
out = tune.run_experiments(experiments)
When I run this I get the message: Running with multiple concurrent experiments. All experiments will be using the same SearchAlgorithm.
I need to be able to specify the search algorithm, but I don't understand how. Additionally, these experiments appear to be part of one large optimization: out is a list of 30 objective objects. The parameter values are drawn from a uniform distribution, without the quantization (the q in quniform), although all 30 values fall in the specified range.
I must have misunderstood the purpose of run_experiments. Please help.

Why is my genetic algorithm not improving?

The fitness output is always 0 while my genetic algorithm is training. This would be normal at first, but it stays at 0 for the entire duration of the training and does not improve at all. I have set up the training so that the genetic algorithm trains on small increments of data at a time instead of the entire training data array. I am doing this because I want the genetic algorithm to train on the most recent data last. This is a simplified version of what I am trying...
def trainModels(coin):
    global endSliceTraining, startSliceTraining  # These are also used in the fitness function
    torch_ga = pygad.torchga.TorchGA(model=NeuralNetworkModel, num_solutions=10)
    while endSliceTraining < len(trainingData):
        ga_instance = pygad.GA(num_generations=10,
                               num_parents_mating=2,
                               initial_population=torch_ga.population_weights,
                               fitness_func=fitness_func,
                               parent_selection_type="sss",
                               mutation_type="random",
                               mutation_by_replacement=True)
        ga_instance.run()
        solution, solution_fitness, solution_idx = ga_instance.best_solution()
        startSliceTraining += 1
        endSliceTraining += 1
def fitness_func(solution, solution_idx):
    global startSliceTraining, endSliceTraining
    stats = numpy.array(trainingData)
    statsSlice = stats[startSliceTraining:endSliceTraining]
    for stats in statsSlice:
        action = [0, 0, 0]
        stats = torch.tensor(stats, dtype=torch.float)
        prediction = pygad.torchga.predict(model=NeuralNetworks[currentCoinIndex], solution=solution, data=stats)
        move = torch.argmax(prediction).item()
        action[move] = 1
        """
        I deleted the complicated part here. This area was not the problem. I hope what's above in this function is understandable
        """
    return ...
I'm thinking that maybe my parameters in pygad.GA are wrong, or that for some reason the neural network is not being transferred over to be used with the next set of data, but I don't know.
Help would be appreciated, thank you!

Stellargraph failing to work with data shuffle

When I ran StellarGraph's demo on graph classification using DGCNNs, I got the same result as in the demo.
However, I then tested what happens when I first shuffle the data using the following code:
shuffler = list(zip(graphs, graph_labels))
random.shuffle(shuffler)
graphs, graph_labels = zip(*shuffler)
The model didn't learn at all (accuracy of around 50%, matching the class distribution of the data).
Does anyone know why this happens? Maybe I shuffled in the wrong way? Or should the data be left unshuffled in the first place (and if so, why? That doesn't make any sense)? Or is it a bug in StellarGraph's implementation?

I found the problem. It wasn't anything to do with the shuffling algorithm, nor with StellarGraph's implementation. The problem was in the demo, at the following lines:
train_gen = gen.flow(
    list(train_graphs.index - 1),
    targets=train_graphs.values,
    batch_size=50,
    symmetric_normalization=False,
)
test_gen = gen.flow(
    list(test_graphs.index - 1),
    targets=test_graphs.values,
    batch_size=1,
    symmetric_normalization=False,
)
The problem was caused specifically by train_graphs.index - 1 and test_graphs.index - 1. The indices are already in the range 0 to n, so subtracting one from them shifts the graph data one position backwards, causing each data point to get the label of a different data point.
To fix this, simply change them to train_graphs.index and test_graphs.index without the -1 at the end.
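The effect of the stray - 1 can be reproduced with a toy example. The sketch below uses made-up graph names and labels (not the demo's data) and plain Python lists in place of the StellarGraph generator. It only illustrates how a one-position shift pairs most items with another item's label.

```python
# Hypothetical stand-ins for the graphs and their labels.
graphs = ["g0", "g1", "g2", "g3", "g4"]
labels = [0, 1, 1, 0, 1]
true_label = dict(zip(graphs, labels))

# After shuffling and rebuilding the containers, the index is already 0-based.
index = list(range(len(graphs)))


def count_mislabelled(graph_indices):
    """Pair the graphs selected by `graph_indices` with `labels`, position by
    position, and count how many graphs end up with the wrong label."""
    pairs = [(graphs[i], labels[pos]) for pos, i in enumerate(graph_indices)]
    return sum(true_label[g] != y for g, y in pairs)


# Correct: use the index as it is.
mismatches_correct = count_mislabelled(index)

# Buggy: subtract one. In plain Python, index -1 wraps around to the last item.
mismatches_shifted = count_mislabelled([i - 1 for i in index])

print(mismatches_correct, mismatches_shifted)
```

With shifted indices most graphs are trained against a neighbour's label, which is consistent with the accuracy collapsing to chance level.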

Tensorflow probability not giving the same results as PyMC3

I have previously used PyMC3 and am now looking to use TensorFlow Probability.
I have built a model in both, but unfortunately I am not getting the same answer. In fact, the answers are not even close.
import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions
# `mu_0`, `sigma_0`, `sigma` and `observed` are defined earlier (not shown here)
# definition of the joint_log_prob to evaluate samples
def joint_log_prob(data, proposal):
    prior = tfd.Normal(mu_0, sigma_0, name='prior')
    likelihood = tfd.Normal(proposal, sigma, name='likelihood')
    return (prior.log_prob(proposal) + tf.reduce_mean(likelihood.log_prob(data)))
proposal = 0
# define a closure on joint_log_prob
def unnormalized_log_posterior(proposal):
    return joint_log_prob(data=observed, proposal=proposal)
# define how to propose state
rwm = tfp.mcmc.NoUTurnSampler(
    target_log_prob_fn=unnormalized_log_posterior,
    max_tree_depth=100,
    step_size=0.1
)
# define initial state
initial_state = tf.constant(0., name='initial_state')
@tf.function
def run_chain(initial_state, num_results=7000, num_burnin_steps=2000, adaptation_steps=1):
    adaptive_kernel = tfp.mcmc.DualAveragingStepSizeAdaptation(
        rwm, num_adaptation_steps=adaptation_steps,
        step_size_setter_fn=lambda pkr, new_step_size: pkr._replace(step_size=new_step_size),
        step_size_getter_fn=lambda pkr: pkr.step_size,
        log_accept_prob_getter_fn=lambda pkr: pkr.log_accept_ratio,
    )
    return tfp.mcmc.sample_chain(
        num_results=num_results,
        num_burnin_steps=num_burnin_steps,
        current_state=initial_state,
        kernel=adaptive_kernel,
        trace_fn=lambda cs, kr: kr)
trace, kernel_results = run_chain(initial_state)
I am using the No-U-Turn sampler and have added some step size adaptation; without it, the result is pretty much the same.
I don't really know how to move forward.
Maybe the joint log probability is wrong?

You should use reduce_sum in your log_prob instead of reduce_mean. Otherwise you are effectively downweighting the likelihood by a factor equal to the size of your data set. This would cause the samples to look a lot more like the prior, which might be what you’re seeing in the plot.
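To see how much reduce_mean weakens the likelihood, consider the conjugate case of a normal prior on the mean with a known noise scale, where the posterior is available in closed form. The sketch below is plain Python, not TensorFlow Probability, and the data and prior values are made up for illustration. With a summed log-likelihood the data count as n observations; with an averaged one they count as a single observation.

```python
def normal_posterior(data, mu_0, sigma_0, sigma, reduce="sum"):
    """Closed-form posterior (mean, std) of a normal mean with known noise std.

    reduce="sum"  -> log-likelihood summed over the data (correct).
    reduce="mean" -> log-likelihood averaged over the data, which is
                     equivalent to observing the sample mean only once.
    """
    n = len(data)
    xbar = sum(data) / n
    weight = n if reduce == "sum" else 1  # effective number of observations
    precision = 1.0 / sigma_0**2 + weight / sigma**2
    mean = (mu_0 / sigma_0**2 + weight * xbar / sigma**2) / precision
    return mean, precision ** -0.5


# Made-up data clustered around 10, with a prior centred at 0.
data = [10.0 + 0.1 * (i % 7 - 3) for i in range(100)]
mean_sum, sd_sum = normal_posterior(data, mu_0=0.0, sigma_0=1.0, sigma=1.0, reduce="sum")
mean_avg, sd_avg = normal_posterior(data, mu_0=0.0, sigma_0=1.0, sigma=1.0, reduce="mean")

print(mean_sum, sd_sum)  # posterior dominated by the data
print(mean_avg, sd_avg)  # posterior pulled halfway back to the prior, and wider
```

In this toy setting the summed version places the posterior mean near the data, while the averaged version leaves it roughly midway between the prior mean and the data, with a much larger posterior standard deviation.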

Acceptance-rate in PyMC3 (Metropolis-Hastings)

Does anyone know how I can see the final acceptance-rate in PyMC3 (Metropolis-Hastings) ? Or in general, how can I see all the information that pymc3.sample() returns ?
Thanks

As an example, first set up the model:
import matplotlib.pyplot as plt
import numpy as np
import pymc3 as pm3
from scipy.stats import norm
sigma = 3  # Note this is the std of our data
data = norm(10, sigma).rvs(100)
mu_prior = 8
sigma_prior = 1.5  # Note this is our prior on the std of mu
plt.hist(data, bins=20)
plt.show()
basic_model = pm3.Model()
with basic_model:
    # Priors for unknown model parameters
    mu = pm3.Normal('Mean of Data', mu_prior, sigma_prior)
    # Likelihood (sampling distribution) of observations
    data_in = pm3.Normal('Y_obs', mu=mu, sd=sigma, observed=data)
Second, perform the simulation:
chain_length = 10000
with basic_model:
    # obtain starting values via MAP
    startvals = pm3.find_MAP(model=basic_model)
    # instantiate sampler
    step = pm3.Metropolis()
    # draw chain_length posterior samples
    trace = pm3.sample(chain_length, step=step, start=startvals)
Using the above example, the acceptance rate can be calculated this way:
accept = np.sum(trace['Mean of Data'][1:] != trace['Mean of Data'][:-1])
print("Acceptance Rate: ", accept/trace['Mean of Data'].shape[0])
(I found this solution in an online tutorial, but I don't quite understand it.)
Reference: Introduction to PyMC3
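The trick works because a rejected Metropolis proposal repeats the previous value in the trace, while an accepted proposal (drawn from a continuous distribution) almost surely changes it. Counting the positions where consecutive samples differ therefore counts the accepted proposals. The sketch below checks this on a hand-written random-walk Metropolis sampler in plain Python; the sampler, target and tuning values are illustrative assumptions and not PyMC3 internals.

```python
import math
import random


def metropolis(log_prob, x0, n_steps, proposal_sd, rng):
    """Random-walk Metropolis; returns the chain and the number of accepts."""
    chain = [x0]
    n_accepted = 0
    x, lp = x0, log_prob(x0)
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, proposal_sd)
        lp_new = log_prob(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        if rng.random() < math.exp(min(0.0, lp_new - lp)):
            x, lp = proposal, lp_new
            n_accepted += 1
        chain.append(x)  # a rejection repeats the previous value
    return chain, n_accepted


def log_target(x):
    # Unnormalised log-density of a Normal(10, 3) target (illustrative).
    return -0.5 * ((x - 10.0) / 3.0) ** 2


n_steps = 5000
chain, n_accepted = metropolis(log_target, 10.0, n_steps, 3.0, random.Random(1))

# Same idea as the trace-based formula: count where neighbours differ.
changes = sum(a != b for a, b in zip(chain[1:], chain[:-1]))
rate_from_trace = changes / (len(chain) - 1)
rate_from_counter = n_accepted / n_steps
print(rate_from_trace, rate_from_counter)
```

Note that the snippet above divides by the number of samples, not the number of transitions, which makes a negligible difference for long chains.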

I checked for the NUTS algorithm and found this solution on the PyMC3 forum:
trace.mean_tree_accept.mean()

If step = pymc3.Metropolis() is our sampler, we can get the final acceptance rate through
step.accepted
A tip for PyMC3 beginners like myself: after a variable or object, type a "." and hit the Tab key; you will see some interesting suggestions ;)
