Manual job execution outside of schedule in APScheduler - python

I have a job which is scheduled using the cron scheduler in APScheduler to run a function of some sort for my Flask app.
I'd like to also be able to manually run this function, without interrupting the schedule that I also have set up.
For example, say the task is set to run once per day, I'd also like to run it manually whenever a user does a particular thing.
It is important that two instances of the job never run at the same time (which is why I'm not simply calling the function directly), so I'm trying to come up with an APScheduler-based solution that prevents the manual trigger from firing while the scheduled run is busy.

This is effectively a duplicate of this question: APScheduler how to trigger job now
Lars Blumberg's answer was the one that solved it for me. I used this line:
scheduler_object.get_job(job_id="my_job_id").modify(next_run_time=datetime.datetime.now())
This makes the particular job run immediately while keeping its existing schedule. If the scheduled job is already running, this will not trigger the job now (the behaviour I want), unless you have set max_instances to more than 1. Similarly, if you trigger the job manually and it is still running when the scheduled run fires, that run will not execute either, unless max_instances is greater than 1.
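For context, a rough sketch of how that line fits into a scheduler setup (the job id, the cron schedule, and run_my_task are illustrative names, not from the original post):
import datetime
from apscheduler.schedulers.background import BackgroundScheduler

def run_my_task():
    ...  # the work that must never run twice at once

scheduler = BackgroundScheduler()
# max_instances=1 (the default) is what prevents overlapping runs.
scheduler.add_job(run_my_task, "cron", hour=3, id="my_job_id", max_instances=1)
scheduler.start()

def trigger_now():
    # Pull the next run forward to "now"; the cron trigger then recomputes
    # the following run, so the regular schedule is preserved.
    scheduler.get_job("my_job_id").modify(next_run_time=datetime.datetime.now())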

Related

Django: Run task in background immediately

Similarly to this question, I don't understand how the django-background-tasks module works. What I want is a button that starts a task in the background so it doesn't block the front end. The task has to run as soon as the click occurs.
Reading the docs, I thought I only had to define my task as a function preceded by the @background(schedule=1) decorator, and that calling the function would then start the task one second after the click handler ran.
The documentation's example (a notify_user task with schedule=60) says that calling notify_user as normal will schedule the original function to be run 60 seconds from now.
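For reference, the decorator pattern being described looks roughly like this (a sketch with illustrative names, not the asker's actual code):
from background_task import background
from django.http import HttpResponse

@background(schedule=1)  # run roughly one second after it is called
def my_task(message):
    print("running in background:", message)

def button_click_view(request):
    # Calling the decorated function does not execute it here; it only
    # creates a Task row. A separate "python manage.py process_tasks"
    # worker has to be running to pick the task up and execute it.
    my_task("button clicked")
    return HttpResponse("task scheduled")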
However, this is not what happens. What happens is that a new task is created after each click (I could see that by querying the background_task table, thanks to this answer), but none of them ever runs.
The command python manage.py process_tasks seems to run one task defined previously before stopping. If no task has been defined, then it waits for a task to be defined for up to duration seconds (specified when running the command). This behavior is not what I expected while reading the documentation.
duration - Run task for this many seconds (0 or less to run forever) - default is 0
I don't think creating a crontab entry that calls python manage.py process_tasks every second is the right approach. Does this mean I have to call the command manually so it runs when I want? Isn't the command supposed to be running all the time so it handles every task when it is scheduled to run?
The fact that the command only runs one task per invocation also troubles me: if I defined 1000 small tasks to run in the background within a minute, would the command ever catch up?

How to check if celery task is already running before running it again with beat?

I have a periodic task scheduled to run every 10 minutes. Sometimes this task completes in 2-3 minutes, sometimes it takes 20 minutes.
Is there any way, using Celery Beat, to not start the task if the previous run hasn't completed yet? I don't see an option for it in the interval settings.
No, Celery Beat knows nothing about the running tasks.
One way to achieve what you are trying to do is to link the task to itself. apply_async(), for example, has the optional parameters link and link_error, which can be used to provide a signature (it can be a single task too) to run if the task finishes successfully (link) or unsuccessfully (link_error).
What I use is the following - I schedule the task to run frequently (say every 5 minutes), and I use a distributed lock to make sure only one instance of the task is ever running.
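A minimal sketch of that distributed-lock idea, here using a Redis lock (an assumption for illustration; the original answer does not say which lock implementation it uses):
import redis
from celery import shared_task

redis_client = redis.Redis()

def do_the_actual_work():
    ...  # the real body of the periodic job

@shared_task
def my_periodic_task():
    # Non-blocking attempt to take the lock; if another run still holds it,
    # skip this beat tick instead of piling up a second instance.
    lock = redis_client.lock("my-periodic-task-lock", timeout=30 * 60)
    if not lock.acquire(blocking=False):
        return
    try:
        do_the_actual_work()
    finally:
        lock.release()
The timeout acts as a safety net so the lock cannot be held forever if a worker dies mid-run.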
Finally a reminder - you can always implement your own scheduler, and use it in your beat configuration. I was thinking about doing this in the past for exactly the same thing you want, but decided that the solution I already have is good enough for me.
You can try this
It provides you with a singleton base class for your tasks.
I use Celery with Django models, and I implemented a boolean has_task_running flag at the model level. Then, with Celery signals, I set the flag to True when the before_task_publish signal is triggered and back to False when the task terminates. Not simple, but flexible.
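A rough sketch of that flag-and-signals approach (JobState, its has_task_running field, and the task name are illustrative; the original answer does not show its models):
from celery.signals import before_task_publish, task_postrun
from myapp.models import JobState  # hypothetical model with a BooleanField has_task_running

TASK_NAME = "myapp.tasks.long_running_task"

@before_task_publish.connect
def mark_running(sender=None, **kwargs):
    # For before_task_publish, sender is the task's name (a string).
    if sender == TASK_NAME:
        JobState.objects.filter(pk=1).update(has_task_running=True)

@task_postrun.connect
def mark_finished(sender=None, **kwargs):
    # For task_postrun, sender is the task object that just finished.
    if getattr(sender, "name", "") == TASK_NAME:
        JobState.objects.filter(pk=1).update(has_task_running=False)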

APScheduler using cron and instant triggers together

I'm writing an app for a Raspberry Pi. The app has to run periodic tasks and is also connected to the main server over socket.io to receive commands. I chose APScheduler for the periodic tasks because it lets me control task intervals dynamically. I use socketIO_client to receive cron statements from the server and apply them to the running jobs. Up to this point it works like a charm. Yet I need some more functionality.
Between the periodic runs, I want to trigger the same task from socket.io server events. I found a similar problem in this question on this site and applied the answer. Normally APScheduler is smart enough not to start a run before the previous one has finished, via coalesce=True and/or max_instances=1. But with the job.func() method, the job starts even though the previous run hasn't finished yet.
Basically, what I want is to run a function periodically and also be able to run it between intervals in response to server events. If a run has started, whether from the cron trigger or from a server event, any new run should be skipped until it finishes. Is there any way to do that?
Sorry, that is not currently possible natively with APScheduler. You'll have to create two jobs and share a lock object or something among them that will make sure they don't run simultaneously.
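A rough sketch of that idea: two entry points into the same function (the cron job and a one-off job added from the socket.io handler), guarded by a shared lock (names like do_work and on_server_command are illustrative):
import threading
from apscheduler.schedulers.background import BackgroundScheduler

lock = threading.Lock()

def do_work():
    ...  # the periodic task body

def guarded_task():
    # Skip this run entirely if another run (cron or manual) is in progress.
    if not lock.acquire(blocking=False):
        return
    try:
        do_work()
    finally:
        lock.release()

scheduler = BackgroundScheduler()
scheduler.add_job(guarded_task, "cron", minute="*/10", id="periodic")
scheduler.start()

def on_server_command(*args):
    # Manual trigger from the socket.io event: adding a job with no trigger
    # runs it once, immediately, through the same lock-guarded function.
    scheduler.add_job(guarded_task)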

How to persist scheduled jobs using APScheduler until they finish completely?

I'm using APScheduler with Python 2.7.6, with a BlockingScheduler to run scheduled jobs and SQLAlchemy as the persistent job store.
I want to schedule jobs and guarantee that they finish (that the function reaches its last line). Everything is working fine, but I see that when a job is started, it is removed from the database, even if the job has not finished the entire method.
Note: obviously, my jobs are stateless and can be re-executed in later program runs. That should not be an issue for this question.
What is the best way to persist a job until the complete function/method is executed using APScheduler?
I had a similar problem and was able to resolve it by using BackgroundScheduler instead of BlockingScheduler.
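A minimal sketch of that switch, assuming APScheduler 3.x and keeping an SQLAlchemy job store (the database URL and the job itself are illustrative):
from apscheduler.schedulers.background import BackgroundScheduler
from apscheduler.jobstores.sqlalchemy import SQLAlchemyJobStore

def my_function():
    ...  # the job body

scheduler = BackgroundScheduler(
    jobstores={"default": SQLAlchemyJobStore(url="sqlite:///jobs.sqlite")}
)
scheduler.add_job(my_function, "interval", hours=1, id="my_job", replace_existing=True)
# BackgroundScheduler runs in a thread, so the rest of the program keeps running.
scheduler.start()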

apscheduler - multiple instances

I have APScheduler running in Django and it appears to work... okay. In my project's __init__.py, I initialize the scheduler:
import atexit
from apscheduler.scheduler import Scheduler  # APScheduler 2.x
from django.conf import settings

scheduler = Scheduler(daemon=True)
print("\n\n\n\n\n\n\n\nstarting scheduler")
scheduler.configure({'apscheduler.jobstores.file.class': settings.APSCHEDULER['jobstores.file.class']})
scheduler.start()
# Shut the scheduler down cleanly when the process exits.
atexit.register(lambda: scheduler.shutdown(wait=False))
The first problem with this is that the print shows this code is executed twice. Secondly, in other applications, I'd like to reference the scheduler, but haven't a clue how to do that. If I get another instance of a scheduler, I believe it is a separate threadpool and not the one created here.
How do I get one and only one instance of APScheduler running?
How do I reference that instance in other apps?
That depends on how you ended up with two scheduler instances in the first place. Are you starting apscheduler in a worker thread/process? If you have more than one such worker, you're going to get multiple instances of the scheduler. So, you have to find a way to prevent the scheduler from being started more than once by either running it in a different process if possible, or adding some condition to the scheduler startup.
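For example, if the duplicate start comes from Django's development-server autoreloader (an assumption; the question does not say how the project is run), one common guard is to check the RUN_MAIN environment variable, which the autoreloader sets only in the process that actually serves requests:
import os
from apscheduler.scheduler import Scheduler  # as in the question's code

scheduler = Scheduler(daemon=True)

# runserver's autoreloader imports the project in two processes; only the
# child that serves requests has RUN_MAIN set to "true", so only it starts
# the scheduler.
if os.environ.get("RUN_MAIN") == "true":
    scheduler.start()
Note that this guard only helps under the development server; in production (e.g. under a WSGI server) you would need a different condition.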
As for referencing that instance in other apps: you don't. Variables are local to each process. The best you can do is to build some kind of remote execution system, either using a REST service or a remote control system like execnet or rpyc.
