Weights&Biases Sweep Keras K-Fold Validation - python

I'm using Weights & Biases cloud-hosted sweeps with Keras.
First, I create a new sweep within a W&B project with a config like the following:
description: LSTM Model
method: random
metric:
  goal: maximize
  name: val_accuracy
name: LSTM-Sweep
parameters:
  batch_size:
    distribution: int_uniform
    max: 128
    min: 32
  epochs:
    distribution: constant
    value: 200
  node_size1:
    distribution: categorical
    values:
      - 64
      - 128
      - 256
  node_size2:
    distribution: categorical
    values:
      - 64
      - 128
      - 256
  node_size3:
    distribution: categorical
    values:
      - 64
      - 128
      - 256
  node_size4:
    distribution: categorical
    values:
      - 64
      - 128
      - 256
  node_size5:
    distribution: categorical
    values:
      - 64
      - 128
      - 256
  num_layers:
    distribution: categorical
    values:
      - 1
      - 2
      - 3
  optimizer:
    distribution: categorical
    values:
      - Adam
      - Adamax
      - Adagrad
  path:
    distribution: constant
    value: "./path/to/data/"
program: sweep.py
project: SLR
My sweep.py file looks something like this:
# imports
init = wandb.init(project="my-project", reinit=True)
config = wandb.config

def main():
    skfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=7)
    cvscores = []
    group_id = wandb.util.generate_id()
    X, y = ...  # load data
    i = 0
    for train, test in skfold.split(X, y):
        i = i + 1
        run = wandb.init(group=group_id, reinit=True, name=group_id + "#" + str(i))
        model = ...  # build model
        model.fit([...], callbacks=[WandbCallback()])
        cvscores.append([...])
        wandb.join()

if __name__ == "__main__":
    main()
I start this with the wandb agent command from the folder containing sweep.py.
What I experienced with this setup is that the first wandb.init() call initializes a new run. Okay, I could just remove that. But when wandb.init() is called a second time, it seems to lose track of the sweep it is running in. Online, an empty run is listed in the sweep (because of the first wandb.init() call); all other runs are listed inside the project, but not in the sweep.
My goal is to have a run for each fold of the k-fold cross-validation. At least I thought this would be the right way of doing it.
Is there a different approach to combining sweeps with Keras k-fold cross-validation?

We put together an example of how to accomplish k-fold cross validation:
https://github.com/wandb/examples/tree/master/examples/wandb-sweeps/sweeps-cross-validation
The solution requires some contortions for the wandb library to spawn multiple jobs on behalf of a launched sweep job.
The basic idea is:
- The agent requests a new set of parameters from the cloud-hosted parameter server. This is the run called sweep_run in the main function.
- Information about which folds to process is sent over a multiprocessing queue to waiting processes.
- Each spawned process logs to its own run, organized with group and job_type to enable auto-grouping in the UI.
- When a process is finished, it sends its primary metric over a queue to the parent sweep run.
- The sweep run reads the metrics from the child runs and logs them to the sweep run, so that the sweep can use that result to influence future parameter choices and/or Hyperband early-termination optimizations.
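In outline, the structure looks roughly like the sketch below. This is a simplified sketch rather than the code from the linked example: the helper train_fold(), the metric name val_accuracy, and NUM_FOLDS are illustrative placeholders, and the real example also handles wandb environment details (e.g. keeping child runs out of the sweep) that are omitted here.

import multiprocessing
import wandb

NUM_FOLDS = 5  # illustrative

def train_fold(fold_index, config, group_id, results_queue):
    # Each fold runs in its own process and logs to its own run,
    # grouped so the UI can auto-group the folds together.
    run = wandb.init(
        group=group_id,
        job_type="fold",
        name=f"{group_id}-fold-{fold_index}",
        config=config,
        reinit=True,
    )
    val_accuracy = 0.0  # placeholder: build the model from config and train on this fold
    run.log({"val_accuracy": val_accuracy})
    results_queue.put(val_accuracy)
    run.finish()

def main():
    # This run is the one created by the sweep agent; it receives the
    # hyperparameters chosen by the parameter server.
    sweep_run = wandb.init()
    config = dict(sweep_run.config)
    group_id = sweep_run.id

    results_queue = multiprocessing.Queue()
    processes = []
    for fold in range(NUM_FOLDS):
        p = multiprocessing.Process(
            target=train_fold,
            args=(fold, config, group_id, results_queue),
        )
        p.start()
        processes.append(p)

    scores = [results_queue.get() for _ in range(NUM_FOLDS)]
    for p in processes:
        p.join()

    # Log the aggregated metric to the sweep run so the sweep can optimize it.
    sweep_run.log({"val_accuracy": sum(scores) / len(scores)})
    sweep_run.finish()

if __name__ == "__main__":
    main()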
Example visualizations of the sweep and k-fold grouping can be seen here:
Sweep: https://app.wandb.ai/jeffr/examples-sweeps-cross-validation/sweeps/vp0fsvku
K-fold Grouping: https://app.wandb.ai/jeffr/examples-sweeps-cross-validation/groups/vp0fsvku

Related

Is test data used in PyCaret time series (beta) completely unseen by the model(s)?

After checking the official documentation and examples, I am still confused about whether the test data passed to the setup function is completely unseen by the model(s).
from pycaret.datasets import get_data
from pycaret.internal.pycaret_experiment import TimeSeriesExperiment
# get data
y = get_data('airline', verbose=False)
# no of future steps to forecast
fh = 12 # or alternately fh = np.arange(1,13)
fold = 3
# setup
exp = TimeSeriesExperiment()
exp.setup(data=y, fh=fh, fold = fold)
exp.models()
which outputs a table describing the available models.
Also, looking at the CV graph, we can conclude that the test data set is not used during cross-validation. But since this isn't stated explicitly anywhere, I need concrete evidence.
[Figure: Train-Test split]
[Figure: Train CV splits]
If you notice the CV splits, they do not use the test data at all. So any step that uses cross-validation, such as create_model, tune_model, blend_model, or compare_models, will not use the test data at all for training.
Once you are happy with the models from these steps, you can finalize the model using finalize_model. In this case, whatever model you pass to finalize_model is trained on the complete dataset (train + test) so that you can make true future predictions.
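For example, the end-to-end flow might look roughly like the sketch below, continuing the experiment set up above; the model id "arima" is just an illustration.

# Continuing the exp created above; 'arima' is only an illustrative model id.
model = exp.create_model("arima")       # trained and cross-validated on the train split only
tuned = exp.tune_model(model)           # tuning also uses only the CV folds of the train split
final = exp.finalize_model(tuned)       # refit on the complete dataset (train + test)
preds = exp.predict_model(final)        # forecast the next fh periods beyond the data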

Stable Baselines: saving a PPO model and retraining it

Hello, I am using the Stable Baselines package (https://stable-baselines.readthedocs.io/), specifically PPO2, and I am not sure how to properly save my model. I trained it for 6 virtual days and got my average return to around 300. Then I decided that this was not enough, so I trained the model for another 6 days. But when I looked at the training statistics, the return per episode in the second training started at around 30. This suggests that it did not save all parameters.
This is how I use the package:
def make_env_init(env_id, rank, seed=0):
    """
    Utility function for multiprocessed env.

    :param env_id: (str) the environment ID
    :param seed: (int) the initial seed for RNG
    :param rank: (int) index of the subprocess
    """
    def env_init():
        # Important: use a different seed for each environment
        env = gym.make(env_id, connection=blt.DIRECT)
        env.seed(seed + rank)
        return env
    set_global_seeds(seed)
    return env_init

envs = VecNormalize(SubprocVecEnv([make_env_init(f'envs:{env_name}', i) for i in range(processes)]),
                    norm_reward=False)

if os.path.exists(folder / 'model_dump.zip'):
    model = PPO2.load(folder / 'model_dump.zip', envs, **ppo_kwards)
else:
    model = PPO2(MlpPolicy, envs, **ppo_kwards)

model.learn(total_timesteps=total_timesteps, callback=callback)
model.save(folder / 'model_dump.zip')
The way you save the model is correct. Training is not a monotonic process: it can also show much worse results after further training.
What you can do, first of all is to write logs of the progress:
model = PPO2(MlpPolicy, envs, tensorboard_log="./logs/progress_tensorboard/")
In order to see the log, run in terminal:
tensorboard --port 6004 --logdir ./logs/progress_tensorboard/
It will give you a link to the board, which you can then open in a browser (e.g. http://pc0259:6004/).
Secondly, you can make snapshots of the model every X steps:
from stable_baselines.common.callbacks import CheckpointCallback
# save a snapshot every 10,000 steps (save_freq expects an int)
checkpoint_callback = CheckpointCallback(save_freq=10_000, save_path='./model_checkpoints/')
model.learn(total_timesteps=total_timesteps, callback=[callback, checkpoint_callback])
Combining it with the log, you can pick up the model which performed best!
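If you go the checkpoint route, you can later reload whichever snapshot performed best and continue from there. A rough sketch is below; the timestep in the file name is illustrative, and it assumes the default CheckpointCallback naming pattern <name_prefix>_<num_timesteps>_steps.zip with name_prefix 'rl_model'.

# Reload a specific snapshot written by CheckpointCallback and keep training it.
# '5000000' below is just an illustrative timestep.
best_model = PPO2.load('./model_checkpoints/rl_model_5000000_steps.zip', envs, **ppo_kwards)
best_model.learn(total_timesteps=total_timesteps, callback=[callback, checkpoint_callback])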

Model prediction in data generator Keras

I'm working on a Keras model with images separated into patches.
I have quite a peculiar pipeline:
for i in range(n_iteration):
    print("Epoch:", i, "/", n_iteration)

    start = time.time()
    self.train_batch, self.validation_batch = self.get_batch()
    end = time.time()
    print("Time for loading: ", end - start)

    K.set_value(self.batch_source, self.train_batch[0][:self.batch_size])
    K.set_value(self.batch_target, self.train_batch[0][self.batch_size:])

    pred = self.model.predict(self.train_batch[0])
    K.set_value(self.gamma, self.compute_gamma(pred))

    hist = self.model.train_on_batch(self.train_batch[0], self.train_batch[1])
Based on my model's prediction at a time t (for a given batch), I need to compute a certain value named gamma. This value is then taken into account in my loss function, but it is not differentiable, so I cannot integrate its computation into the loss function itself.
When measuring the time needed for loading and training, it appears that the bottleneck is the loading phase.
My question is: is it possible to load several batches (with self.get_batch()) while computing the prediction and gamma and training on another batch?
I guess the idea would be to create some kind of queue in which I store my batches, but I don't really know how to do that.
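Roughly, what I imagine is something like the sketch below: a background thread keeps a few batches ready while the main loop trains. The names and queue size are placeholders, and it assumes get_batch is safe to call from another thread.

import queue
import threading

def prefetch_batches(get_batch, n_iteration, max_prefetch=4):
    """Yield batches while a background thread loads the next ones."""
    batch_queue = queue.Queue(maxsize=max_prefetch)

    def producer():
        for _ in range(n_iteration):
            batch_queue.put(get_batch())  # blocks when the queue is full

    threading.Thread(target=producer, daemon=True).start()
    for _ in range(n_iteration):
        yield batch_queue.get()

# usage inside the training loop:
# for train_batch, validation_batch in prefetch_batches(self.get_batch, n_iteration):
#     ...predict, compute gamma, train_on_batch as before...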
PS: in my get_batch function I'm accessing an HDF5 file; can that cause any trouble with multiprocessing?
Thank you in advance.

TensorFlow 1.10+ custom estimator early stopping with train_and_evaluate

Suppose you are training a custom tf.estimator.Estimator with tf.estimator.train_and_evaluate using a validation dataset in a setup similar to simlmx's:
classifier = tf.estimator.Estimator(
    model_fn=model_fn,
    model_dir=model_dir,
    params=params)

train_spec = tf.estimator.TrainSpec(
    input_fn=training_data_input_fn,
)

eval_spec = tf.estimator.EvalSpec(
    input_fn=validation_data_input_fn,
)

tf.estimator.train_and_evaluate(
    classifier,
    train_spec,
    eval_spec
)
Often, one uses a validation dataset to cut off training to prevent over-fitting when the loss continues to improve for the training dataset but not for the validation dataset.
Currently the tf.estimator.EvalSpec allows one to specify after how many steps (defaults to 100) to evaluate the model.
How can one (ideally without using tf.contrib functions) terminate training after n evaluation calls (n * steps) in which the evaluation loss does not improve, and then save the "best" model / checkpoint (as determined by the validation dataset) to a unique file name (e.g. best_validation.checkpoint)?
I understand your confusion now. The documentation for stop_if_no_decrease_hook states (emphasis mine):
max_steps_without_decrease: int, maximum number of training steps with no decrease in the given metric.
eval_dir: If set, directory containing summary files with eval metrics. By default, estimator.eval_dir() will be used.
Looking through the code of the hook (version 1.11), though, you find:
def stop_if_no_metric_improvement_fn():
    """Returns `True` if metric does not improve within max steps."""
    eval_results = read_eval_metrics(eval_dir)  # <<<<<<<<<<<<<<<<<<<<<<<
    best_val = None
    best_val_step = None
    for step, metrics in eval_results.items():  # <<<<<<<<<<<<<<<<<<<<<<<
        if step < min_steps:
            continue
        val = metrics[metric_name]
        if best_val is None or is_lhs_better(val, best_val):
            best_val = val
            best_val_step = step
        if step - best_val_step >= max_steps_without_improvement:  # <<<<<
            tf_logging.info(
                'No %s in metric "%s" for %s steps, which is greater than or equal '
                'to max steps (%s) configured for early stopping.',
                increase_or_decrease, metric_name, step - best_val_step,
                max_steps_without_improvement)
            return True
    return False
What the code does is load the evaluation results (produced with your EvalSpec parameters) and extract, for each evaluation record, the metrics and the global_step (or whichever other custom step you use to count) associated with it.
This is the source of the training steps part of the docs: the early stopping is not triggered according to the number of non-improving evaluations, but to the number of non-improving evals in a certain step range (which IMHO is a bit counter-intuitive).
So, to recap: Yes, the early-stopping hook uses the evaluation results to decide when it's time to cut the training, but you need to pass in the number of training steps you want to monitor and keep in mind how many evaluations will happen in that number of steps.
Examples with numbers to hopefully clarify more
Let's assume you're training indefinitely long having an evaluation every 1k steps. The specifics of how the evaluation runs is not relevant, as long as it runs every 1k steps producing a metric we want to monitor.
If you set the hook as hook = tf.contrib.estimator.stop_if_no_decrease_hook(my_estimator, 'my_metric_to_monitor', 10000) the hook will consider the evaluations happening in a range of 10k steps.
Since you're running 1 eval every 1k steps, this boils down to early-stopping if there's a sequence of 10 consecutive evals without any improvement.
If then you decide to rerun with evals every 2k steps, the hook will only consider a sequence of 5 consecutive evals without improvement.
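For reference, wiring the hook into train_and_evaluate would look roughly like the sketch below, reusing the input functions from the question; the monitored metric name and the step counts are just examples.

early_stopping = tf.contrib.estimator.stop_if_no_decrease_hook(
    classifier,
    metric_name='loss',                 # the eval metric to monitor
    max_steps_without_decrease=10000,   # a step range, not a number of evals
    min_steps=1000)

train_spec = tf.estimator.TrainSpec(
    input_fn=training_data_input_fn,
    hooks=[early_stopping])

eval_spec = tf.estimator.EvalSpec(
    input_fn=validation_data_input_fn)

tf.estimator.train_and_evaluate(classifier, train_spec, eval_spec)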
Keeping the best model
First of all, an important note: this has nothing to do with early stopping. The issue of keeping a copy of the best model throughout training and that of stopping training once performance starts degrading are completely unrelated.
Keeping the best model can be done very easily by defining a tf.estimator.BestExporter in your EvalSpec (snippet taken from the link):
serving_input_receiver_fn = ...  # define your serving_input_receiver_fn
exporter = tf.estimator.BestExporter(
    name="best_exporter",
    serving_input_receiver_fn=serving_input_receiver_fn,
    exports_to_keep=5)  # this will keep the 5 best checkpoints

eval_spec = [tf.estimator.EvalSpec(
    input_fn=eval_input_fn,
    steps=100,
    exporters=exporter,
    start_delay_secs=0,
    throttle_secs=5)]
If you don't know how to define the serving_input_receiver_fn, have a look here.
This allows you to keep the overall best 5 models you obtained, stored as SavedModels (which is the preferred way to store models at the moment).
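As a starting point, a minimal raw-tensor serving_input_receiver_fn might look roughly like this; the feature name 'x' and num_features are placeholders for your actual inputs.

def serving_input_receiver_fn():
    # Placeholders for the raw features the exported SavedModel will receive.
    features = {'x': tf.placeholder(dtype=tf.float32, shape=[None, num_features], name='x')}
    return tf.estimator.export.ServingInputReceiver(features, features)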

Predicting how long a scikit-learn classification will take to run

Is there a way to predict how long it will take to run a classifier from scikit-learn based on the parameters and dataset? I know, pretty meta, right?
Some classifiers/parameter combinations are quite fast, and some take so long that I eventually just kill the process. I'd like a way to estimate in advance how long it will take.
Alternatively, I'd accept some pointers on how to set common parameters to reduce the run time.
There are specific classes of classifiers and regressors that directly report the remaining time or the progress of your algorithm (number of iterations etc.). Most of this can be turned on by passing the verbose=2 (or any number > 1) option to the constructor of the individual model. Note: this behavior is as of sklearn 0.14; earlier versions have slightly different verbose output (still useful though).
The best examples of this are ensemble.RandomForestClassifier and ensemble.GradientBoostingClassifier, which print the number of trees built so far and the remaining time.
clf = ensemble.GradientBoostingClassifier(verbose=3)
clf.fit(X, y)
Out:
Iter Train Loss Remaining Time
1 0.0769 0.10s
...
Or
clf = ensemble.RandomForestClassifier(verbose=3)
clf.fit(X, y)
Out:
building tree 1 of 100
...
This progress information is fairly useful to estimate the total time.
Then there are other models like SVMs that print the number of optimization iterations completed, but do not directly report the remaining time.
clf = svm.SVC(verbose=2)
clf.fit(X, y)
Out:
*
optimization finished, #iter = 1
obj = -1.802585, rho = 0.000000
nSV = 2, nBSV = 2
...
As far as I know, linear models don't provide such diagnostic information.
Check this thread to know more about what the verbosity levels mean: scikit-learn fit remaining time
If you are using IPython, you can consider using the built-in magic commands such as %time and %timeit.
%time - Time execution of a Python statement or expression. The CPU and wall clock times are printed, and the value of the expression (if any) is returned. Note that under Win32, system time is always reported as 0, since it can not be measured.
%timeit - Time execution of a Python statement or expression using the timeit module.
Example:
In [4]: %timeit NMF(n_components=16, tol=1e-2).fit(X)
1 loops, best of 3: 1.7 s per loop
References:
https://ipython.readthedocs.io/en/stable/interactive/magics.html
http://scikit-learn.org/stable/developers/performance.html
We're actually working on a package that gives runtime estimates of scikit-learn fits.
You would basically run it right before calling algo.fit(X, y) to get an estimate of the runtime.
Here's a simple use case:
from scitime import Estimator
from sklearn.ensemble import RandomForestRegressor
import numpy as np

estimator = Estimator()
rf = RandomForestRegressor()
X, y = np.random.rand(100000, 10), np.random.rand(100000, 1)

# Run the estimation
estimation, lower_bound, upper_bound = estimator.time(rf, X, y)
Feel free to take a look!
