stable_baselines3 callback on each step - python

I am training a stable_baselines3 PPO agent and want to perform some task on every step. To do this, I'm using a callback CustomCallback with an _on_step method defined.
But it appears that _on_step is called only once every PPO.n_steps steps, so if the n_steps parameter is 1024, CustomCallback._on_step appears to be called only every 1024 steps.
How can I do something on every single step, instead of every PPO.n_steps steps?
from stable_baselines3 import PPO
from stable_baselines3.common.env_util import make_vec_env
from stable_baselines3.common.callbacks import BaseCallback

class CustomCallback(BaseCallback):
    def __init__(self, freq, verbose=0):
        super().__init__(verbose)
        self.freq = freq

    def _on_step(self):
        if self.n_calls % self.freq == 0:
            print('do something')
        return True

env = make_vec_env("CartPole-v1", n_envs=1)
model = PPO("MlpPolicy", env, n_steps=1024)
model.learn(
    total_timesteps=25000,
    callback=CustomCallback(freq=123),
)

Related

How do I get openai.gym.spaces.Dict state updated?

"AttributeError: 'dict' object has no attribute 'flatten'".
I get this error when I run the following code:
import math
from gym import Env
from gym.spaces import Discrete, Box, Dict, Tuple, MultiBinary, MultiDiscrete
from stable_baselines3 import PPO

screen_width = 900

class GameEnv(Env):
    def __init__(self):
        self.action_space = Discrete(5)
        observation_positions = Box(low=0, high=screen_width, shape=(2,))
        self.observation_space = Dict({'observation_positions': observation_positions})
        self.state = self.observation_space.sample()

    def step(self, action):
        self.state = self.observation_space.sample()

    def render(self):
        pass

    def reset(self):
        return self.state

env = GameEnv()
model = PPO('MlpPolicy', env, verbose=1,)
model.learn(total_timesteps=1000)
What do I have to change?
You may have to use MultiInputPolicy instead of MlpPolicy as the first parameter to the PPO class when using a Dict observation space:
model = PPO('MultiInputPolicy', env, verbose=1,)

Is it possible in OpenAI gym to persist the state as hidden and only make some variables visible to the players?

I want to create a game in which there is a lead time between the player's actions and their rewards/consequences. I would therefore like to not share the full observation with the player, but still persist it because it's needed later. Is there a way to do that?
If I create a variable in init and update it, it's visible to every instance of the game, so a player already knows a lot more than I'd like them to know.
A rough example for your requirement on Cartpole would be something like this:
import gym
from gym.utils import seeding
import numpy as np

class myEnv(gym.Env):
    def __init__(self, *args, **kwargs):
        """
        Define all the necessary stuff here
        """
        self.env = gym.make('CartPole-v1')  # add stuff here to define game params
        self.action_space = self.env.action_space
        self.observation_space = self.env.observation_space
        self.past_actions = []
        self.delay = 2  # to have a delay of two timesteps

    def reset(self):
        """
        Define the reset
        """
        self.observation = self.env.reset()
        return self.observation

    def step(self, action):
        """
        Add the delay of actions here
        """
        self.past_actions.append(action)  # to keep track of actions
        # reward, done and info are 0, False, {} for first two timesteps
        reward = 0
        done = False
        info = {}
        if len(self.past_actions) > self.delay:
            present_action = self.past_actions.pop(0)
            # change observation, reward, done, info
            # according to the action 'delay' timesteps ago
            self.observation, reward, done, info = self.env.step(present_action)
        return self.observation, reward, done, info

    def seed(self, seed=0):
        """
        Define seed method here
        """
        self.np_random, seed = seeding.np_random(seed)
        return self.env.seed(seed=seed)

    def render(self, mode="human", *args, **kwargs):
        """
        Define rendering method here
        """
        return self.env.render(*args, **kwargs)

    def close(self):
        """
        Define close method here
        """
        return self.env.close()
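On the worry about state being visible to every instance: attributes assigned on self are stored per instance, not shared across instances, and the player only sees whatever reset and step return. Below is a minimal sketch of that idea without gym, assuming a toy game where the reward for an action arrives delay steps later. DelayedEffectGame, _observe and the underscore-prefixed attributes are hypothetical names chosen for illustration.

```python
from collections import deque


class DelayedEffectGame:
    """Toy game (hypothetical): an action's reward arrives `delay` steps later."""

    def __init__(self, delay=2):
        self.delay = delay
        self.reset()

    def reset(self):
        # Full internal state lives on this instance only and is never returned.
        self._pending = deque()  # actions still waiting to take effect
        self._score = 0          # hidden running total
        self._t = 0
        return self._observe()

    def _observe(self):
        # Only what is returned here is visible to the player.
        return {"t": self._t}

    def step(self, action):
        self._pending.append(action)
        reward = 0
        if len(self._pending) > self.delay:
            # The action taken `delay` steps ago pays out now.
            reward = self._pending.popleft()
            self._score += reward
        self._t += 1
        return self._observe(), reward, False, {}
```

The hidden queue and score persist between steps, while the observation carries only the step counter; in a real environment the observation_space would be declared to match what _observe returns.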

Running a Python web scraper every hour [duplicate]

I'm looking for a library in Python which will provide at and cron like functionality.
I'd quite like to have a pure Python solution, rather than relying on tools installed on the box; this way I can run on machines with no cron.
For those unfamiliar with cron: you can schedule tasks based upon an expression like:
0 2 * * 7 /usr/bin/run-backup # run the backups at 0200 on Every Sunday
0 9-17/2 * * 1-5 /usr/bin/purge-temps # run the purge temps command, every 2 hours between 9am and 5pm on Mondays to Fridays.
The cron time expression syntax is less important, but I would like to have something with this sort of flexibility.
If there isn't something that does this for me out-the-box, any suggestions for the building blocks to make something like this would be gratefully received.
Edit
I'm not interested in launching processes, just "jobs" also written in Python - python functions. By necessity I think this would be a different thread, but not in a different process.
To this end, I'm looking for the expressivity of the cron time expression, but in Python.
Cron has been around for years, but I'm trying to be as portable as possible. I cannot rely on its presence.
If you're looking for something lightweight, check out schedule:
import schedule
import time

def job():
    print("I'm working...")

schedule.every(10).minutes.do(job)
schedule.every().hour.do(job)
schedule.every().day.at("10:30").do(job)

while True:
    schedule.run_pending()
    time.sleep(1)
Disclosure: I'm the author of that library.
You could just use normal Python argument passing syntax to specify your crontab. For example, suppose we define an Event class as below:
from datetime import datetime, timedelta
import time

# Some utility classes / functions first
class AllMatch(set):
    """Universal set - match everything"""
    def __contains__(self, item):
        return True

allMatch = AllMatch()

def conv_to_set(obj):
    """Allow a single integer to be provided in place of a set"""
    if isinstance(obj, int):  # Python 3 has no separate long type
        return {obj}  # Single item
    if not isinstance(obj, set):
        obj = set(obj)
    return obj

# The actual Event class
class Event(object):
    def __init__(self, action, min=allMatch, hour=allMatch,
                 day=allMatch, month=allMatch, dow=allMatch,
                 args=(), kwargs=None):
        self.mins = conv_to_set(min)
        self.hours = conv_to_set(hour)
        self.days = conv_to_set(day)
        self.months = conv_to_set(month)
        self.dow = conv_to_set(dow)
        self.action = action
        self.args = args
        # Avoid sharing one mutable default dict between events
        self.kwargs = kwargs if kwargs is not None else {}

    def matchtime(self, t):
        """Return True if this event should trigger at the specified datetime"""
        return ((t.minute in self.mins) and
                (t.hour in self.hours) and
                (t.day in self.days) and
                (t.month in self.months) and
                (t.weekday() in self.dow))

    def check(self, t):
        if self.matchtime(t):
            self.action(*self.args, **self.kwargs)
(Note: Not thoroughly tested)
Then your CronTab can be specified in normal python syntax as:
c = CronTab(
    Event(perform_backup, 0, 2, dow=6),
    Event(purge_temps, 0, range(9,18,2), dow=range(0,5))
)
This way you get the full power of Python's argument mechanics: you can mix positional and keyword arguments, and use symbolic names for weekdays and months.
The CronTab class would be defined as simply sleeping in minute increments, and calling check() on each event. (There are probably some subtleties with daylight savings time / timezones to be wary of though). Here's a quick implementation:
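class CronTab(object):
    def __init__(self, *events):
        self.events = events

    def run(self):
        # Start from the current time truncated to the minute
        t = datetime(*datetime.now().timetuple()[:5])
        while True:
            for e in self.events:
                e.check(t)
            t += timedelta(minutes=1)
            while datetime.now() < t:
                # total_seconds() keeps the fractional part, so the loop does
                # not spin on sleep(0) during the last second; max() guards
                # against a negative value if t passes between the two calls
                time.sleep(max(0.0, (t - datetime.now()).total_seconds()))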
A few things to note: Python's weekday() is zero-indexed starting from Monday (cron counts from Sunday = 0), while datetime months run from 1 to 12, as in cron. range also excludes its last element, hence cron syntax like "1-5" (Monday to Friday) becomes range(0,5), i.e. [0,1,2,3,4]. If you prefer cron syntax, parsing it shouldn't be too difficult.
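As a rough illustration of how such parsing might look, here is a minimal sketch that expands a single numeric cron field into a set of integers and maps cron's day-of-week numbering onto datetime.weekday(). The names parse_cron_field and cron_dow_to_python are hypothetical, and the sketch assumes purely numeric fields: month or weekday names and shortcuts such as @hourly are not handled.

```python
def parse_cron_field(field, lo, hi):
    """Expand one numeric cron field into a set of ints.

    `lo` and `hi` are the inclusive bounds of the field,
    e.g. 0 and 59 for minutes. Supports "*", "a", "a-b", lists
    separated by commas, and an optional "/step" suffix.
    """
    values = set()
    for part in field.split(","):
        rng, slash, step = part.partition("/")
        step = int(step) if slash else 1
        if rng == "*":
            start, end = lo, hi
        elif "-" in rng:
            a, b = rng.split("-")
            start, end = int(a), int(b)
        else:
            start = int(rng)
            # "5/10" runs from 5 up to the upper bound; a bare "5" is one value
            end = hi if slash else start
        if not (lo <= start <= end <= hi):
            raise ValueError("field %r out of range %d-%d" % (part, lo, hi))
        values.update(range(start, end + 1, step))
    return values


def cron_dow_to_python(dow):
    """Map cron's day of week (0 or 7 = Sunday) to datetime.weekday() (0 = Monday)."""
    return (dow - 1) % 7
```

The resulting sets can be passed directly as the minute, hour or day-of-week arguments of an Event, since conv_to_set accepts sets unchanged.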
More or less the same as above, but concurrent using gevent :)
"""Gevent based crontab implementation"""
from datetime import datetime, timedelta
import os
import gevent

# Some utility classes / functions first
def conv_to_set(obj):
    """Converts to set allowing single integer to be provided"""
    if isinstance(obj, int):  # Python 3 has no separate long type
        return {obj}  # Single item
    if not isinstance(obj, set):
        obj = set(obj)
    return obj

class AllMatch(set):
    """Universal set - match everything"""
    def __contains__(self, item):
        return True

allMatch = AllMatch()

class Event(object):
    """The Actual Event Class"""
    def __init__(self, action, minute=allMatch, hour=allMatch,
                 day=allMatch, month=allMatch, daysofweek=allMatch,
                 args=(), kwargs=None):
        self.mins = conv_to_set(minute)
        self.hours = conv_to_set(hour)
        self.days = conv_to_set(day)
        self.months = conv_to_set(month)
        self.daysofweek = conv_to_set(daysofweek)
        self.action = action
        self.args = args
        # Avoid sharing one mutable default dict between events
        self.kwargs = kwargs if kwargs is not None else {}

    def matchtime(self, t1):
        """Return True if this event should trigger at the specified datetime"""
        return ((t1.minute in self.mins) and
                (t1.hour in self.hours) and
                (t1.day in self.days) and
                (t1.month in self.months) and
                (t1.weekday() in self.daysofweek))

    def check(self, t):
        """Check and run action if needed"""
        if self.matchtime(t):
            self.action(*self.args, **self.kwargs)

class CronTab(object):
    """The crontab implementation"""
    def __init__(self, *events):
        self.events = events

    def _check(self):
        """Check all events in separate greenlets"""
        t1 = datetime(*datetime.now().timetuple()[:5])
        for event in self.events:
            gevent.spawn(event.check, t1)
        t1 += timedelta(minutes=1)
        s1 = (t1 - datetime.now()).seconds + 1
        print("Checking again in %s seconds" % s1)
        job = gevent.spawn_later(s1, self._check)

    def run(self):
        """Run the cron forever"""
        self._check()
        while True:
            gevent.sleep(60)

def test_task():
    """Just an example that sends a bell and asd to all terminals"""
    os.system('echo asd | wall')

cron = CronTab(
    Event(test_task, 22, 1),
    Event(test_task, 0, range(9,18,2), daysofweek=range(0,5)),
)
cron.run()
None of the listed solutions even attempt to parse a complex cron schedule string. So, here is my version, using croniter. Basic gist:
schedule = "*/5 * * * *" # Run every five minutes
nextRunTime = getNextCronRunTime(schedule)
while True:
    roundedDownTime = roundDownTime()
    if (roundedDownTime == nextRunTime):
        ####################################
        ### Do your periodic thing here. ###
        ####################################
        nextRunTime = getNextCronRunTime(schedule)
    elif (roundedDownTime > nextRunTime):
        # We missed an execution. Error. Re initialize.
        nextRunTime = getNextCronRunTime(schedule)
    sleepTillTopOfNextMinute()
Helper routines:
import time
from croniter import croniter
from datetime import datetime, timedelta

# Round time to the nearest multiple of dateDelta (one minute by default).
# Called just after waking at the top of a minute, this yields that minute.
def roundDownTime(dt=None, dateDelta=timedelta(minutes=1)):
    roundTo = dateDelta.total_seconds()
    if dt is None:
        dt = datetime.now()
    seconds = (dt - dt.min).seconds
    rounding = (seconds + roundTo / 2) // roundTo * roundTo
    return dt + timedelta(0, rounding - seconds, -dt.microsecond)

# Get next run time from now, based on schedule specified by cron string
def getNextCronRunTime(schedule):
    return croniter(schedule, datetime.now()).get_next(datetime)

# Sleep till the top of the next minute
def sleepTillTopOfNextMinute():
    t = datetime.now()
    sleeptime = 60 - (t.second + t.microsecond / 1000000.0)
    time.sleep(sleeptime)
I know there are a lot of answers, but another solution could be to go with decorators. This is an example that repeats a function every day at a specific time. The cool thing about this approach is that you only need to add the syntactic sugar to the function you want to schedule:
@repeatEveryDay(hour=6, minutes=30)
def sayHello(name):
    print(f"Hello {name}")

sayHello("Bob") # Now this function will be invoked every day at 6.30 a.m
And the decorator will look like:
import datetime
import functools
import time

def repeatEveryDay(hour, minutes=0, seconds=0):
    """
    Decorator that will run the decorated function every day at that hour, minutes and seconds.
    :param hour: 0-23
    :param minutes: 0-59 (Optional)
    :param seconds: 0-59 (Optional)
    """
    def decoratorRepeat(func):
        @functools.wraps(func)
        def wrapperRepeat(*args, **kwargs):
            def getLocalTime():
                return datetime.datetime.fromtimestamp(time.mktime(time.localtime()))
            # Interval between two calls: one day
            td = datetime.timedelta(days=1)
            # Get the datetime of the first function call
            if wrapperRepeat.nextSent is None:
                now = getLocalTime()
                wrapperRepeat.nextSent = datetime.datetime(now.year, now.month, now.day, hour, minutes, seconds)
                if wrapperRepeat.nextSent < now:
                    wrapperRepeat.nextSent += td
            # Waiting till next day
            while getLocalTime() < wrapperRepeat.nextSent:
                time.sleep(1)
            # Call the function
            func(*args, **kwargs)
            # Get the datetime of the next function call
            wrapperRepeat.nextSent += td
            # Wait for the next occurrence (this call never returns)
            wrapperRepeat(*args, **kwargs)
        wrapperRepeat.nextSent = None
        return wrapperRepeat
    return decoratorRepeat
I like how the pycron package solves this problem.
import pycron
import time

while True:
    if pycron.is_now('0 2 * * 0'):  # True Every Sunday at 02:00
        print('running backup')
        time.sleep(60)  # The process should take at least 60 sec
                        # to avoid running twice in one minute
    else:
        time.sleep(15)  # Check again in 15 seconds
There isn't a "pure python" way to do this because some other process would have to launch python in order to run your solution. Every platform will have one or twenty different ways to launch processes and monitor their progress. On unix platforms, cron is the old standard. On Mac OS X there is also launchd, which combines cron-like launching with watchdog functionality that can keep your process alive if that's what you want. Once python is running, then you can use the sched module to schedule tasks.
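For reference, sched from the standard library only schedules one-shot events, so a recurring job has to re-enter itself each time it runs. The following is a minimal sketch of that pattern; run_periodic and its parameters are hypothetical names, and the time and delay functions are injectable only so that the behaviour can be checked without real waiting.

```python
import sched
import time


def run_periodic(interval, action, repeats,
                 timefunc=time.monotonic, delayfunc=time.sleep):
    """Call `action()` every `interval` seconds, `repeats` times in total."""
    scheduler = sched.scheduler(timefunc, delayfunc)

    def tick(remaining):
        action()
        if remaining > 1:
            # sched events fire once, so the job schedules its own next run
            scheduler.enter(interval, 1, tick, argument=(remaining - 1,))

    scheduler.enter(interval, 1, tick, argument=(repeats,))
    scheduler.run()  # blocks until the event queue is empty
```

A call such as run_periodic(3600, scrape, repeats=24) would then invoke a function scrape once an hour for a day, blocking the calling thread while it waits.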
Another trivial solution would be:
from aqcron import At
from time import sleep
from datetime import datetime

# Event scheduling
event_1 = At(second=5)
event_2 = At(second=[0,20,40])

while True:
    now = datetime.now()
    # Event check
    if now in event_1: print("event_1")
    if now in event_2: print("event_2")
    sleep(1)
And the class aqcron.At is:
# aqcron.py
class At(object):
    def __init__(self, year=None, month=None,
                 day=None, weekday=None,
                 hour=None, minute=None,
                 second=None):
        loc = locals()
        loc.pop("self")
        # Keep only the fields that were actually specified
        self.at = dict((k, v) for k, v in loc.items() if v is not None)

    def __contains__(self, now):
        for k in self.at.keys():
            try:
                # Field given as a collection of allowed values
                if not getattr(now, k) in self.at[k]: return False
            except TypeError:
                # Field given as a single value
                if self.at[k] != getattr(now, k): return False
        return True
I don't know if something like that already exists. It would be easy to write your own with time, datetime and/or calendar modules, see http://docs.python.org/library/time.html
The only concern for a python solution is that your job needs to be always running and possibly be automatically "resurrected" after a reboot, something for which you do need to rely on system dependent solutions.

Update progress bar - MVP pattern

I'm studying the MVP pattern but having a hard time following its principles to update a progress bar in real time. As I understand it, the Presenter checks whether there's any update in the Model and then outputs the result, so there's no instantiation of the Presenter in the Model; only the Presenter should instantiate the Model and the View.
My question is: how should I update the progress bar by following the MVP principle?
I could of course call presenter.update_progress_bar(i, total) from Model, but then it would infringe the MVP principle.
Here's a minimal working example:
PS: for now, I'm using CLI.
/main.py
import modules

def main():
    modules.View(modules.Presenter).run()

if __name__ == "__main__":
    main()

/modules/__init__.py
from modules.Model.Model import Model
from modules.Model.progressbar import ProgressBar
from modules.View.View import View
from modules.Presenter.Presenter import Presenter

/modules/Model/Model.py
class Model:
    def __init__(self):
        pass

    def long_process(self):
        import time
        for i in range(10):
            time.sleep(0.1)
            print("Update the progress bar.")
        return True

/modules/Model/progressbar.py
# MIT license: https://gist.github.com/vladignatyev/06860ec2040cb497f0f3
import sys

class ProgressBar:
    def progress(count, total, status=''):
        bar_len = 60
        filled_len = int(round(bar_len * count / float(total)))
        percents = round(100.0 * count / float(total), 1)
        bar = '=' * filled_len + '-' * (bar_len - filled_len)
        sys.stdout.write('[%s] %s%s ...%s\r' % (bar, percents, '%', status))
        sys.stdout.flush()

/modules/View/View.py
import sys

class View:
    def __init__(self, presenter):
        self.presenter = presenter(self)

    def run(self):
        self.presenter.long_process()

    def update_progress_bar(self, msg):
        sys.stdout.write(msg)

    def hide_progress_bar(self, msg):
        sys.stdout.write(msg)

    def update_status(self, msg):
        print(msg)

/modules/Presenter/Presenter.py
class Presenter:
    def __init__(self, view):
        import modules
        self.model = modules.Model()
        self.view = view

    def long_process(self):
        if self.model.long_process():
            self.view.update_status('Long process finished correctly')
        else:
            self.view.update_status('error')

    def update_progress_bar(self, i, total):
        from modules import ProgressBar
        ProgressBar.progress(i, total)
        self.view.update_progress_bar(ProgressBar.progress(i, total))

    def end_progress_bar(self):
        self.view.end_progress_bar('\n')
I could do:
class Model:
    def __init__(self, presenter):
        self.presenter = presenter  # Violation of MVP

    def long_process(self):
        import time
        for i in range(10):
            time.sleep(0.1)
            self.presenter.update_progress_bar(i, 10)  # Violation of MVP
            print("Update the progress bar.")
        return True
But this is wrong since the Model now instantiates the Presenter. Any suggestions?
Use a callback:
import time

class Model:
    def long_process(self, notify=lambda current, total: None):
        for i in range(10):
            time.sleep(0.1)
            notify(i, 10)
        return True

class Presenter:
    def long_process(self):
        result = self.model.long_process(lambda c, t: self.update_progress_bar(c, t))
        if result:
            self.view.update_status('Long process finished correctly')
        else:
            self.view.update_status('error')
This keeps your model independent from the client code, while still allowing it (the model, I mean) to notify its caller.
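To see the whole flow end to end, here is a self-contained sketch of the callback approach in a single file. It rests on a few assumptions that are not in the original code: the Presenter receives the Model and View through its constructor, the bar is rendered with a simplified 20-character format, and the View keeps a rendered list purely so the output can be inspected. The run_demo helper is likewise hypothetical.

```python
import time


class Model:
    """Knows nothing about the presenter or the view; it only calls `notify`."""

    def long_process(self, notify=lambda current, total: None, steps=10, delay=0.0):
        for i in range(1, steps + 1):
            time.sleep(delay)   # stand-in for real work
            notify(i, steps)    # report progress to whoever is listening
        return True


class View:
    """CLI view: prints text and remembers what it printed."""

    def __init__(self):
        self.rendered = []

    def update_progress_bar(self, text):
        self.rendered.append(text)
        print(text, end="\r")

    def update_status(self, msg):
        self.rendered.append(msg)
        print("\n" + msg)


class Presenter:
    def __init__(self, model, view):
        self.model = model
        self.view = view

    def update_progress_bar(self, current, total, width=20):
        # Format the bar here and hand plain text to the view
        filled = width * current // total
        bar = "=" * filled + "-" * (width - filled)
        self.view.update_progress_bar("[%s] %d/%d" % (bar, current, total))

    def long_process(self):
        # The bound method is the callback; the model never sees the presenter
        ok = self.model.long_process(notify=self.update_progress_bar)
        self.view.update_status("Long process finished correctly" if ok else "error")


def run_demo():
    view = View()
    Presenter(Model(), view).long_process()
    return view.rendered
```

Passing the bound method self.update_progress_bar is equivalent to the lambda used above; the Model depends only on a callable taking (current, total).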
As a side note, there are quite a few things in your code that are totally unpythonic:
1/ you don't have to put each class in a distinct module (it's actually considered an antipattern in Python), and even less in nested submodules (Python Zen: "flat is better than nested").
2/ you don't have to use classes when a plain function is enough (hint: Python functions are objects... actually, everything in Python is an object) - your ProgressBar class has no state and only one method, so it could just be a plain function (Python Zen: "simple is better than complex").
3/ imports should be at the top of the module, not in functions (if you have to put them in a function to solve cyclic dependency issues then the proper solution is to rethink your design to avoid cyclic dependencies).
4/ module names should be all_lower

save model weights at the end of every N epochs

I'm training a NN and would like to save the model weights every N epochs for a prediction phase. Here is my draft code, inspired by @grovina's response here. Could you please make suggestions?
Thanks in advance.
from keras.callbacks import Callback

class WeightsSaver(Callback):
    def __init__(self, model, N):
        self.model = model
        self.N = N
        self.epoch = 0

    def on_batch_end(self, epoch, logs={}):
        if self.epoch % self.N == 0:
            name = 'weights%08d.h5' % self.epoch
            self.model.save_weights(name)
        self.epoch += 1
Then add it to the fit call, to save weights every 5 epochs:
model.fit(X_train, Y_train, callbacks=[WeightsSaver(model, 5)])
You shouldn't need to pass a model to the callback. It already has access to the model via its superclass. So remove the model argument from __init__ and the self.model = model line; you can access the current model via self.model regardless. You are also saving on every batch end, which is probably not what you want; you probably want on_epoch_end.
In any case, what you are doing can be done with the native ModelCheckpoint callback. You don't need to write a custom one. You can use it as follows:
mc = keras.callbacks.ModelCheckpoint('weights{epoch:08d}.h5',
                                     save_weights_only=True, period=5)
model.fit(X_train, Y_train, callbacks=[mc])
You should implement on_epoch_end rather than on_batch_end. Passing the model as an argument to __init__ is also redundant.
from keras.callbacks import Callback

class WeightsSaver(Callback):
    def __init__(self, N):
        super().__init__()  # let the base Callback set itself up
        self.N = N
        self.epoch = 0

    def on_epoch_end(self, epoch, logs=None):
        # Save the weights once every N epochs
        if self.epoch % self.N == 0:
            name = 'weights%08d.h5' % self.epoch
            self.model.save_weights(name)
        self.epoch += 1
