I'm hosting a Flask application on Heroku (free tier) which acts as an API and reads from an SQLite database file.
When the project ran on my computer, I had scheduled Python scripts that would run every night and append new data to my SQLite database, which my Flask application could then use.
However, hosted on Heroku, I don't think I will be able to run my Flask application and a Python script 24/7. I know there is an alternative, APScheduler, which runs scheduled tasks as Python functions inside the Flask application. However, according to Heroku's free-tier guidelines, if there is no traffic to my page for 30 minutes, the application will "sleep." I'm assuming that means scheduled tasks will no longer run once the application is asleep, which defeats the purpose of using APScheduler.
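For reference, the APScheduler approach I was considering looks roughly like this (the job function is just a stand-in for my existing nightly append script):

from apscheduler.schedulers.background import BackgroundScheduler
from flask import Flask

app = Flask(__name__)

def append_nightly_data():
    # would open the SQLite file and append the new rows here
    pass

scheduler = BackgroundScheduler()
scheduler.add_job(append_nightly_data, "cron", hour=0)  # every night at midnight
scheduler.start()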
Are there any alternatives I could use to go about this?
I have a Python script that takes 20 minutes to run. I need to be able to trigger this script via my Azure .NET application.
I am looking for a possible cloud-based host to help me do this. Preferably Azure, but I am open to other options.
I have tried the following options:
Azure Functions
Assessment: Found too many limitations on code structure (e.g. Python files have to be organized in a certain way)
Azure Web App
Assessment: Works to create an endpoint but has timeout issues for long requests
Azure Virtual Machine (VM)
Assessment: I simulated a trigger by scheduling the script to run frequently on the VM. This is not a bad solution, but not ideal either.
What other viable options exist?
You can also use Azure WebJobs for this purpose.
It has a setting to specify the idle time (WEBJOBS_IDLE_TIMEOUT):
The value must be in seconds; for example, 3600 means the idle time before it times out is 1 hour. Note that this option affects all scheduled WebJobs under the Web App or Azure Function app.
Reference: https://jtabuloc.wordpress.com/2018/06/05/how-to-avoid-azure-webjob-idle-timeout-exception/
I have prototyped a system using Python on Linux. I am now designing the architecture to move to a web-based system. I will use Django to serve public and private admin pages. I also need a service running which will periodically run scripts, connect to the internet, and allow API messaging with an admin user. Thus there will be 3 components: web server, api_service, and database.
1) What is the best mechanism for deploying a Python api_service on the VM? My background is mainly C++/C#, and I would usually have deployed a C#-written service on the same VM as the web server and used some sort of TCP messaging wrapper for the API. My admin API code will be ad hoc Python scripts run from my machine to execute functionality in this service.
2) All my database code is written to an interface that presently uses flat files. Any database suggestions? PostgreSQL, MongoDB, ...
Many thanks in advance for helpful suggestions. I am an ex-Windows/C++/C# developer who now absolutely loves Python/Cython and needs a little help please ...
Right, am answering my own question. Have done a fair bit of research since posting.
2) PostgreSQL seems like a good choice. There seem to be no damning warnings against using it, and there is plenty of searchable help. I am therefore implementing concrete PostgreSQL classes that implement my serialization interfaces.
1) Rather than implement my own service in Python that sits on a remote machine, I am going to use Celery. RabbitMQ will act as the distributed TCP message wrapper. I can put the required functionality in Python scripts on the VM that Celery can find and execute as tasks. I can run these Celery tasks in 3 ways: i) a web request through Django can queue a task; ii) I can manually queue a remote Celery task from my machine by running a Python script; iii) I can use Celery Beat to schedule tasks periodically. This fits my needs perfectly, as I have a handful of daily/periodic tasks that can be scheduled plus a few rare maintenance tasks that I can fire off from my machine.
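As a rough illustration of how those three paths fit together (the broker URL, module name, and task name below are placeholders, not a finished design):

# tasks.py
from celery import Celery
from celery.schedules import crontab

app = Celery("api_service", broker="amqp://guest@localhost//")  # RabbitMQ as the broker

@app.task
def nightly_maintenance():
    # the actual work lives in my scripts on the VM
    pass

# iii) Celery Beat schedule: run the task every night at 02:00
app.conf.beat_schedule = {
    "nightly-maintenance": {
        "task": "tasks.nightly_maintenance",
        "schedule": crontab(hour=2, minute=0),
    },
}

For i) and ii), a Django view or an ad hoc script on my machine simply calls nightly_maintenance.delay() to queue the same task on RabbitMQ for a worker on the VM to pick up.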
To summarize then, where before I would have created a Windows service that handled both incoming TCP commands and scheduled behaviour, I can use RabbitMQ, Celery, Celery Beat, and Python scripts that sit on the VM.
Hope this helps anybody with a similar 'how to get started' problem .....
I am hoping to gain a basic understanding of scheduled task processes and why things like Celery are recommended for Flask.
My situation is a web-based tool which generates spreadsheets based on user input. I save those spreadsheets to a temp directory, and when the user clicks the "download" button, I use Flask's "send_from_directory" function to serve the file as an attachment. I need a background service to run every 15 minutes or so to clear the temp directory of all files older than 15 minutes.
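The download route is roughly this (simplified; the directory and names are placeholders):

import os
from flask import Flask, send_from_directory

app = Flask(__name__)
TEMP_DIR = os.path.join(app.root_path, "temp")

@app.route("/download/<filename>")
def download(filename):
    # serve a previously generated spreadsheet as an attachment
    return send_from_directory(TEMP_DIR, filename, as_attachment=True)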
My initial plan was a basic Python script running in a while True loop, but I did some research to find what people normally do, and everything recommends Celery or other task managers. I looked into Celery and found that I also need to learn about Redis, and I apparently need to host Redis in a Unix environment. This is a lot of trouble for a script that just deletes files every 15 minutes.
I'm developing my Flask app locally on Windows with the built-in development server and deploying to a virtual machine on the company intranet with IIS. I'm learning as I go, so please explain why this much machinery is needed to regularly call a script that simply deletes things. It seems like a vast overcomplication, but as I said, I'm trying to learn as I go, so I want to do/learn it correctly.
Thanks!
You wouldn't use Celery or Redis for this. A cron job would be perfectly appropriate.
Celery is for jobs that need to be run asynchronously but in response to events in the main server processes. For example, if a sign-up form requires sending an email notification, that would be scheduled and run via Celery so as not to block the main web response.
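For concreteness, the cleanup script the cron job would run can be as small as this (the directory path and schedule are assumptions, adjust to your setup):

# cleanup_temp.py -- run every 15 minutes, e.g. with the crontab entry:
#   */15 * * * * /usr/bin/python3 /srv/app/cleanup_temp.py
import os
import time

TEMP_DIR = "/srv/app/temp"   # the directory the Flask app writes spreadsheets to
MAX_AGE = 15 * 60            # seconds

def clean_temp_dir():
    now = time.time()
    for name in os.listdir(TEMP_DIR):
        path = os.path.join(TEMP_DIR, name)
        if os.path.isfile(path) and now - os.path.getmtime(path) > MAX_AGE:
            os.remove(path)

if __name__ == "__main__":
    clean_temp_dir()

Since you are deploying on Windows/IIS, the equivalent trigger would be Windows Task Scheduler rather than cron, but the script itself is the same.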
I have a Flask (Python) application running on Passenger which works absolutely fine when I test it and I'm the only user.
But as soon as I attempt to have several concurrent connections, each client waits forever for a response. I have tried it with 50 concurrent users, which seems like it should easily be supported.
The app is very simple, reading and writing to a SQLite database once or twice. (Concurrent access of SQLite by this small number of users is not a problem.)
What am I missing?
The Passenger docs make the following suggestion:
Passenger also supports the magic file 'tmp/always_restart.txt'. If this file exists, Passenger will restart your application after every request. This way you do not have to invoke the restart command often.
Activate this mechanism by creating the file:
$ mkdir -p tmp
$ touch tmp/always_restart.txt
This is great for development because it means you only need to save your Python files for the latest version of the app to be available to clients.
But it's terrible for production, because every client request restarts the Python app. This is major overhead for the server, so users are likely to time out before receiving a response.
Delete the file tmp/always_restart.txt and your concurrency limits will shoot up.
I have a web app running on Heroku using Flask and SQLAlchemy. I am now wondering how I can set up a scheduled task that runs daily and does some database-related work (deleting some rows, if you need to know :)
The Heroku documentation recommends using APScheduler, but I would like to do it with Heroku Scheduler. Regardless of that decision, I would like to know how I connect to my Postgres database in this scheduler task. I could not find any example or hint for that.
thanks for your time
Torsten
Heroku Scheduler will run any command you throw at it. The typical way would be to create a Python script/command as part of your Flask app. You can do something similar to http://flask-script.readthedocs.org/en/latest/. Then within the scheduler you would schedule it with something like:
python manage.py mytask
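As a rough sketch of what that command might look like (the app, model, and column names are placeholders; because the command runs inside your Flask app, it reuses the same SQLAlchemy/Postgres configuration):

# manage.py
from datetime import datetime, timedelta

from flask_script import Manager

from myapp import app, db
from myapp.models import Entry

manager = Manager(app)

@manager.command
def mytask():
    # daily cleanup: delete rows older than 30 days
    cutoff = datetime.utcnow() - timedelta(days=30)
    Entry.query.filter(Entry.created_at < cutoff).delete()
    db.session.commit()

if __name__ == "__main__":
    manager.run()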