Locust failed to fire events when running as lib - python

The following code is from the tutorial. I only added code to fire the test_start event (I'm not sure I fire it in the right place) and to listen to both the init and test_start events.
import gevent
from locust import HttpUser, task, events
from locust.env import Environment
from locust.stats import stats_printer, stats_history
from locust.log import setup_logging
setup_logging("INFO", None)
class MyUser(HttpUser):
    host = "https://docs.locust.io"
    @task
    def t(self):
        self.client.get("/")
@events.init.add_listener
def on_locust_init(**kwargs):
    print("on locust init ...")
@events.test_start.add_listener
def on_test_start(**kwargs):
    print("on test start ...")
# setup Environment and Runner
env = Environment(user_classes=[MyUser])
runner = env.create_local_runner()
# start a WebUI instance
web_ui = env.create_web_ui("127.0.0.1", 8089)
# execute init event handlers (only really needed if you have registered any)
env.events.init.fire(environment=env, runner=runner, web_ui=web_ui)
# start a greenlet that periodically outputs the current stats
gevent.spawn(stats_printer(env.stats))
# start a greenlet that save current stats to history
gevent.spawn(stats_history, env.runner)
# start the test
runner.start(1, spawn_rate=1)
# execute test_start event handlers (only really needed if you have registered any)
env.events.test_start.fire(environment=env, runner=runner, web_ui=web_ui)
# in 10 seconds stop the runner
gevent.spawn_later(10, lambda: runner.quit())
# wait for the greenlets
runner.greenlet.join()
# stop the web server for good measures
web_ui.stop()
When I run it as a library (e.g. python use_as_lib.py), the two messages from the event listeners are not printed. But if I remove the run-as-lib code and run it as a tool (e.g. locust -f use_as_lib.py --headless -u 1 -r 1 -t=10s), the messages are printed in the console. It seems I'm missing something...
Here's my locust version.
locust 2.13.0 from /Users/myuser/workspace/tmp/try_python/venv/lib/python3.8/site-packages/locust (python 3.8.12)
Any ideas? Thanks!

After checking the source code a bit, I found that I need to pass locust.events when initializing the Environment().
# ...
env = Environment(user_classes=[MyUser], events=events)
I think the use_as_lib example needs to be updated.
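To illustrate why passing events=events matters, here is a minimal sketch that does not use Locust at all. The EventHook, Events and Environment classes below are toy stand-ins written for this example, and the assumption (based on the fix above) is that an Environment created without an events argument gets its own fresh set of hooks, so listeners registered on the module-level events object are never called.

```python
class EventHook:
    """Toy stand-in for an event hook: stores listeners and calls them on fire()."""

    def __init__(self):
        self._handlers = []

    def add_listener(self, handler):
        self._handlers.append(handler)
        return handler  # returning the handler lets add_listener work as a decorator

    def fire(self, **kwargs):
        for handler in self._handlers:
            handler(**kwargs)


class Events:
    """Toy container holding one hook per event name."""

    def __init__(self):
        self.init = EventHook()
        self.test_start = EventHook()


# Plays the role of the module-level `events` object that listeners attach to.
events = Events()


class Environment:
    """Toy environment (assumption: no `events` argument means new, empty hooks)."""

    def __init__(self, events=None):
        self.events = events if events is not None else Events()


calls = []


@events.test_start.add_listener
def on_test_start(**kwargs):
    calls.append("test_start")


# Separate hooks: the listener above is not registered here, so nothing happens.
Environment().events.test_start.fire()
calls_without_shared_events = list(calls)

# Shared hooks: the listener is found and called.
Environment(events=events).events.test_start.fire()
calls_with_shared_events = list(calls)

print(calls_without_shared_events)
print(calls_with_shared_events)
```

In this toy model the first fire() reaches no listener, while the second one does, which matches the behaviour described in the question and the fix.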

Related

Apscheduler only single instance running in background

I have a script that runs every 5 minutes and performs some actions (check for products back in stock and notify me when they are).
I only want a single instance of apscheduler running because I do not want a website being checked multiple times in a 5 minute window.
Here is my code:
import time
from datetime import datetime
import requests
from apscheduler.schedulers.background import BackgroundScheduler
sched = BackgroundScheduler()
def check1():
    requests.get("https://somewebsite.com/product-i-want")
    # check if item in stock
    # notify me
def check2():
    requests.get("https://someotherwebsite.com/another-product-i-want")
    # check if item in stock
    # notify me
def main():
    # Schedule jobs to run every 5min
    sched.add_job(check1, 'interval', minutes=5, max_instances=1)
    sched.add_job(check2, 'interval', minutes=5, max_instances=1)
    # Also run jobs on start
    for job in sched.get_jobs():
        job.modify(next_run_time=datetime.now())
    # Start jobs
    sched.start()
    # Keep-alive
    try:
        while True:
            time.sleep(2)
    except (KeyboardInterrupt, SystemExit):
        sched.shutdown()
if __name__ == '__main__':
    main()
And then I have a shell script that is run:
screen -X -S scrape quit # quit screen with name 'scrape'
screen -dmS scrape python3 Scrapers.py # create screen with name 'scrape' to run python script
I am constantly adding jobs to this script so I have a cronjob that calls the above shell script every hour to kill the current running script and restart it.
But having a cronjob call a script to refresh this python script is a little counterintuitive. My original thought was to give the job an id but it seems like sched.get_jobs() returns empty after you run sched.start().
Is my understanding of BackgroundScheduler completely incorrect? Is there a better way to achieve running only a single instance of certain apscheduler jobs even if the script crashes? I am using apscheduler V3.9.1 (latest).

Firing Event Hooks when running Locust as a library

I am trying to load test an API endpoint using the Locust library. I am running Locust as a library instead of using the locust command. I want to perform a global setup and a global teardown, so that a global state is created once at the start, used by all the users, and cleared at teardown (e.g. downloading S3 files once and then removing them at the end).
There are built-in event hooks for this, such as init and quitting, which work when running the locustfile with the locust command. But I am unable to trigger these events when running Locust as a library. From Locust's source code I can see that these events are fired in Locust's main.py, which is not executed when running as a library.
How can I add such events when running it as a library? I have tried the two approaches below. Is adding event listeners and manually calling event.fire() the correct approach, or is it better to create custom methods and call them directly instead of using events?
In general, should the init and quitting events be used for setting up a global state and clearing it at the end, or can test_start and test_stop be used in their place?
Source Code for reference:
Approach - 1 (Using event hooks)
import gevent
from locust import HttpUser, task, between
from locust.env import Environment
from locust.stats import stats_printer, stats_history
from locust.log import setup_logging
from locust import events
setup_logging("INFO", None)
def on_init(environment, **kwargs):
    print("Perform global setup to create a global state")
def on_quit(environment, **kwargs):
    print('Perform global teardown to clear the global state')
events.quitting.add_listener(on_quit)
events.init.add_listener(on_init)
class User(HttpUser):
    wait_time = between(1, 3)
    host = "https://docs.locust.io"
    @task
    def my_task(self):
        self.client.get("/")
    @task
    def task_404(self):
        self.client.get("/non-existing-path")
# setup Environment and Runner
env = Environment(user_classes=[User], events=events)
runner = env.create_local_runner()
### Fire init event once the environment and local runner have been instantiated
env.events.init.fire(environment=env, runner=runner) # Is it correct approach?
# start a WebUI instance
env.create_web_ui("127.0.0.1", 8089)
# start a greenlet that periodically outputs the current stats
gevent.spawn(stats_printer(env.stats))
# start a greenlet that save current stats to history
gevent.spawn(stats_history, env.runner)
# start the test
env.runner.start(1, spawn_rate=10)
# in 5 seconds stop the runner
gevent.spawn_later(5, lambda: env.runner.quit())
# wait for the greenlets
env.runner.greenlet.join()
### Fire quitting event when locust process is exiting
env.events.quitting.fire(environment=env, reverse=True) # Is it correct approach?
# stop the web server for good measures
env.web_ui.stop()
Approach - 2 (Creating custom methods and calling these directly)
import gevent
from locust import HttpUser, task, between
from locust.env import Environment
from locust.stats import stats_printer, stats_history
from locust.log import setup_logging
from locust import events
setup_logging("INFO", None)
class User(HttpUser):
    wait_time = between(1, 3)
    host = "https://docs.locust.io"
    @classmethod
    def perform_global_setup(cls):
        print("Perform global setup to create a global state")
    @classmethod
    def perform_global_teardown(cls):
        print('Perform global teardown to clear the global state')
    @task
    def my_task(self):
        self.client.get("/")
    @task
    def task_404(self):
        self.client.get("/non-existing-path")
# setup Environment and Runner
env = Environment(user_classes=[User])
runner = env.create_local_runner()
### Perform global setup
for cls in env.user_classes:
    cls.perform_global_setup() # Is it correct approach?
# start a WebUI instance
env.create_web_ui("127.0.0.1", 8089)
# start a greenlet that periodically outputs the current stats
gevent.spawn(stats_printer(env.stats))
# start a greenlet that save current stats to history
gevent.spawn(stats_history, env.runner)
# start the test
env.runner.start(1, spawn_rate=10)
# in 5 seconds stop the runner
gevent.spawn_later(5, lambda: env.runner.quit())
# wait for the greenlets
env.runner.greenlet.join()
### Perform global teardown
for cls in env.user_classes:
    cls.perform_global_teardown() # Is it correct approach?
# stop the web server for good measures
env.web_ui.stop()
Both approaches are fine. Using event hooks makes more sense if you think you might want to run in the normal (not as-a-library) way in the future, but if that is unlikely to happen then choose the approach that feels most natural to you.
init/quitting only differ from test_start/test_stop in a meaningful way when doing multiple runs in GUI mode (where test_start/test_stop may happen multiple times). Use whichever is appropriate for what your event handler does; there is no other guideline.
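The difference between the two pairs of events can be sketched with a toy model. Nothing below is Locust code: EventHook, build_hooks and simulate_session are made-up names, and the session loop only mimics the lifecycle described above (one init and one quitting per process, one test_start/test_stop pair per run).

```python
from collections import Counter


class EventHook:
    """Toy event hook: stores listeners and calls them on fire()."""

    def __init__(self):
        self._handlers = []

    def add_listener(self, handler):
        self._handlers.append(handler)

    def fire(self, **kwargs):
        for handler in self._handlers:
            handler(**kwargs)


def build_hooks(fired):
    """Create one hook per event name, each counting how often it fires."""
    hooks = {}
    for name in ("init", "test_start", "test_stop", "quitting"):
        hook = EventHook()

        def count(event_name=name, **kwargs):
            fired[event_name] += 1

        hook.add_listener(count)
        hooks[name] = hook
    return hooks


def simulate_session(hooks, runs):
    """Mimic one process lifetime containing several test runs."""
    hooks["init"].fire()
    for _ in range(runs):
        hooks["test_start"].fire()
        hooks["test_stop"].fire()
    hooks["quitting"].fire()


fired = Counter()
simulate_session(build_hooks(fired), runs=3)
print(dict(fired))
```

With three runs in one session, setup placed in an init listener happens once, while setup placed in a test_start listener happens three times. If you only ever do a single run as a library, the two choices behave the same.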

Using Pyramid events and multithreading

I'd like to use events subscription / notification together with multithreading. It sounds like it should just work in theory and the documentation doesn't include any warnings. The events should be synchronous, so no deferring either.
But in practice, when I notify off the main thread, nothing comes in:
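import logging.config
import sys
from threading import Thread
from pyramid.paster import bootstrap
from pyramid.threadlocal import get_current_registry
def run():
    logging.config.fileConfig(sys.argv[1])
    with bootstrap(sys.argv[1]) as env:
        get_current_registry().notify(FooEvent()) # <- works
        Thread(target=thread).start() # <- doesn't work
def thread():
    get_current_registry().notify(FooEvent())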
Is this not expected to work? Or am I doing something wrong?
I also tried the suggested solution. It doesn't print the expected event.
class Foo:
    pass
@subscriber(Foo)
def metric_report(event):
    print(event)
def run():
    with bootstrap(sys.argv[1]) as env:
        def foo(env):
            try:
                with env:
                    get_current_registry().notify(Foo())
            except Exception as e:
                print(e)
        t = Thread(target=foo, args=(env,))
        t.start()
        t.join()
get_current_registry() is trying to access the threadlocal variable Pyramid registers when processing requests or config to tell the thread what Pyramid app is currently active IN THAT THREAD. The gotcha here is that get_current_registry() always returns a registry, just not the one you want, so it's hard to see why it's not working.
When spawning a new thread, you need to register your Pyramid app as the current threadlocal. The best way to do this is with pyramid.scripting.prepare. The "easy" way is just to run bootstrap again in your thread. I'll show the "right" way though.
import sys
from threading import Thread
import pyramid.paster
import pyramid.scripting
from pyramid.threadlocal import get_current_registry
def run():
    pyramid.paster.setup_logging(sys.argv[1])
    get_current_registry().notify(FooEvent()) # doesn't work, just like in the thread
    with pyramid.paster.bootstrap(sys.argv[1]) as env:
        registry = env['registry']
        registry.notify(FooEvent()) # works
        get_current_registry().notify(FooEvent()) # works
        Thread(target=thread_main, args=(env['registry'],)).start()
def thread_main(registry):
    # works, but threadlocals are not set up if other code triggered by this
    # invokes get_current_request() or get_current_registry()
    registry.notify(FooEvent())
    # so let's set up threadlocals
    with pyramid.scripting.prepare(registry=registry) as env:
        registry.notify(FooEvent()) # works
        get_current_registry().notify(FooEvent()) # works
pyramid.scripting.prepare is what bootstrap uses under the hood, and is a lot more efficient than running bootstrap multiple times because it shares the registry and all of your app configuration instead of making a completely new copy of your app.
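The thread-local gotcha can be reproduced without Pyramid. The sketch below is only a toy model: RegistryManager, GLOBAL_REGISTRY and this get_current_registry are made-up stand-ins built on threading.local, assuming (as described above) that the lookup falls back to some default registry when nothing has been registered in the current thread.

```python
import threading

# Fallback returned when the current thread has no registry set (toy value).
GLOBAL_REGISTRY = "global registry (no app configured)"


class RegistryManager(threading.local):
    """Toy thread-local holder; __init__ runs again in every new thread."""

    def __init__(self):
        self.current = None


manager = RegistryManager()


def get_current_registry():
    """Always returns *a* registry, which is what hides the problem."""
    if manager.current is not None:
        return manager.current
    return GLOBAL_REGISTRY


def thread_main(results, registry=None):
    # Passing the registry explicitly and registering it in this thread
    # mirrors what the answer does with its prepare() call.
    if registry is not None:
        manager.current = registry
    results.append(get_current_registry())


# The main thread registers its app, like the bootstrap context does.
manager.current = "app registry"
main_result = get_current_registry()

results = []
for registry in (None, "app registry"):
    t = threading.Thread(target=thread_main, args=(results, registry))
    t.start()
    t.join()

print(main_result)
print(results)
```

The first thread never sees the value set in the main thread and silently gets the fallback; the second thread sees the app registry only because it was handed over and registered inside that thread.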
Is it just that the 'with' context applies only to the statement that creates the Thread() and does not propagate to the thread() function? That is, in the case that works, the get_current_registry call runs inside the 'with' env context, but this context does not propagate to the point where the thread runs get_current_registry. So you need to pass the env to thread(), perhaps by creating a simple callable class that takes the env in its __init__ method.
class X:
    def __init__(self, env):
        self.env = env
    def __call__(self):
        with self.env:
            get_current_registry().notify(FooEvent())
        return
def run():
    logging.config.fileConfig(sys.argv[1])
    with bootstrap(sys.argv[1]) as env:
        get_current_registry().notify(FooEvent())
        Thread(target=X(env)).start()

Function running in background all the time (and startup itself) in Django app

I created a simple Django app. Inside this app I have a single checkbox. I save the checkbox state to the database: if it's checked the database holds True, and if it's unchecked it holds False. There is no problem with this part. Now I have created a function that prints this checkbox state from the database every 10 seconds, all the time.
I put the function into the views.py file, and it looks like this:
def get_value():
    while True:
        value_change = TurnOnOff.objects.first()
        if value_change.turnOnOff:
            print("true")
        else:
            print("false")
        time.sleep(10)
The point is that the function should run all the time. For example, if I set checkbox = models.BooleanField(default=False) in models.py, then after running python manage.py runserver it should give me output like:
Performing system checks...
System check identified no issues (0 silenced).
January 04, 2019 - 09:19:47
Django version 2.1.3, using settings 'CMS.settings'
Starting development server at http://127.0.0.1:8000/
Quit the server with CTRL-BREAK.
true
true
true
true
Then if I visit the website and change the state, it should print false, obviously. But as you notice, the problem is how to start this function. It should run all the time, even if I haven't visited the website yet. This part confuses me. How do I do this properly?
I should admit that I tried some solutions:
put this function at the end of the manage.py file,
put this function into def ready(self),
create a middleware class and put the call there (example code below).
But these solutions don't work.
Middleware class:
class SimpleMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response
        get_value()
You can achieve this by using the AppConfig.ready() hook and combining it with a sub-process/thread.
Here is an example apps.py file (based on the tutorial Polls app):
import time
from multiprocessing import Process
from django.apps import AppConfig
from django import db
class TurnOnOffMonitor(Process):
    def __init__(self):
        super().__init__()
        self.daemon = True
    def run(self):
        # This import needs to be delayed. It needs to happen after apps are
        # loaded so we put it into the method here (it won't work as top-level
        # import)
        from .models import TurnOnOff
        # Because this is a subprocess, we must ensure that we get new
        # connections dedicated to this process to avoid interfering with the
        # main connections. Closing any existing connection *should* ensure
        # this.
        db.connections.close_all()
        # We can do an endless loop here because we flagged the process as
        # being a "daemon". This ensures it will exit when the parent exits
        while True:
            value_change = TurnOnOff.objects.first()
            if value_change.turnOnOff:
                print("true")
            else:
                print("false")
            time.sleep(10)
class PollsConfig(AppConfig):
    name = 'polls'
    def ready(self):
        monitor = TurnOnOffMonitor()
        monitor.start()
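The answer mentions a thread as an alternative to a sub-process. As a rough sketch of that variant, using only the standard library: PeriodicMonitor, read_state and on_value are hypothetical names, and read_state stands in for the database query (for example a function that returns TurnOnOff.objects.first().turnOnOff). It has not been wired into a Django project here.

```python
import threading


class PeriodicMonitor(threading.Thread):
    """Daemon thread that reports a boolean state at a fixed interval."""

    def __init__(self, read_state, interval, on_value=print):
        # daemon=True lets the interpreter exit without waiting for this thread.
        super().__init__(daemon=True)
        self._read_state = read_state    # hypothetical callable returning a bool
        self._interval = interval        # seconds between two reports
        self._on_value = on_value        # where to send "true"/"false"
        self._stop_event = threading.Event()

    def run(self):
        while True:
            self._on_value("true" if self._read_state() else "false")
            # Event.wait() acts as an interruptible sleep: it returns True
            # as soon as stop() is called, otherwise False after the timeout.
            if self._stop_event.wait(self._interval):
                break

    def stop(self):
        self._stop_event.set()
```

In ready() this could be started with something like PeriodicMonitor(read_state, 10).start(). Unlike a plain time.sleep(10) loop, the thread can be stopped promptly by calling stop().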
Celery is the thing that best suits your needs from what you've described.
Celery is an asynchronous task queue/job queue based on distributed message passing. It is focused on real-time operation, but supports scheduling as well.
The execution units, called tasks, are executed concurrently on a single or more worker servers using multiprocessing, Eventlet, or gevent. Tasks can execute asynchronously (in the background) or synchronously (wait until ready).
You need to create a task, run it periodically, and call it when you want to trigger it manually (in some view/controller).
NOTE: do not use time.sleep(10)

How to make Python apscheduler run in background

I want to make Python apscheduler run in the background. Here is my code:
from apscheduler.schedulers.background import BackgroundScheduler, BlockingScheduler
from datetime import datetime
import logging
import sys
logging.basicConfig(level=logging.DEBUG, stream=sys.stdout)
def singleton(cls, *args, **kw):
    instances = {}
    def _singleton(*args, **kw):
        if cls not in instances:
            instances[cls] = cls(*args, **kw)
        return instances[cls]
    return _singleton
@singleton
class MyScheduler(BackgroundScheduler):
    pass
def simple_task(timestamp):
    logging.info("RUNNING simple_task: %s" % timestamp)
scheduler = MyScheduler()
scheduler.start()
scheduler.add_job(simple_task, 'interval', seconds=5, args=[datetime.utcnow()])
When I run the command:
look:Python look$ python itger.py
I just got this:
INFO:apscheduler.scheduler:Scheduler started
DEBUG:apscheduler.scheduler:Looking for jobs to run
DEBUG:apscheduler.scheduler:No jobs; waiting until a job is added
INFO:apscheduler.scheduler:Added job "simple_task" to job store "default"
And ps:
ps -e | grep python
I just got 54615 ttys000 0:00.00 grep python
My problem is: how do I make the code run in the background, so that I can see it's running, or see it print a log line every 5 seconds as the code intends?
BackgroundScheduler runs in a background thread, which in this case, I guess, doesn't prevent the application's main thread from terminating.
Try adding this at the end of your application:
import time
print("Waiting to exit")
while True:
    time.sleep(1)
... and then terminate your application with CTRL+C.
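The effect can be reproduced with the standard library alone. In this sketch a daemon thread stands in for the scheduler's background thread (an assumption made for illustration; no apscheduler code is involved), and run_child is a made-up helper that runs a tiny script in a child Python process and returns what it printed.

```python
import subprocess
import sys
import textwrap

# Script run in a child process: a daemon thread prints "tick" every 0.5 s,
# and the main thread stays alive for {keep_alive} seconds before exiting.
CHILD = textwrap.dedent("""
    import threading, time

    def worker():
        while True:
            time.sleep(0.5)
            print("tick", flush=True)

    threading.Thread(target=worker, daemon=True).start()
    time.sleep({keep_alive})
""")


def run_child(keep_alive):
    """Run the child script and return everything it wrote to stdout."""
    code = CHILD.format(keep_alive=keep_alive)
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=30,
    )
    return result.stdout


if __name__ == "__main__":
    print(repr(run_child(0)))    # main thread exits at once
    print(repr(run_child(1.8)))  # main thread is kept alive for a while
```

When the main thread ends immediately, the child process exits before the background thread gets to print anything. Keeping the main thread alive, as the loop above does, is what gives the background thread time to run.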
Threads are no magical way of making code run and be managed by the OS. They are local to your process only, so if that process terminates or dies unexpectedly, so does your thread.
So the answer to your question is: don't use threads. Write your program in a normal fashion so you can invoke it on the command line, and then use an OS-based scheduler such as cron to schedule it.
Alternatively, if your program needs to run continuously, e.g. because it builds up caches that are expensive to recompute every 5 minutes, use a process supervisor such as supervisord to ensure that the program keeps running even after a reboot or crash.
