How to separate Celery code into server and client side? - python

Imagine that I've written a Celery task and deployed the code to the server. However, when I want to send the task to the server from a client, I need to reuse the code written before.
So my question is: are there any methods to separate the code between server and client?

Try a web server like Flask that forwards requests to the Celery workers, or a server that reads from a queue (SQS, AMQP, ...) and does the same.
No matter which solution you choose, you end up with two services: the Celery worker itself and the "server" that calls the Celery tasks. They both share the same code but are launched with different command lines.
Alternatively, if the task code is small enough, you could just pull the git repository into your client code and call the task from there.
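As a rough sketch of the two-service setup (the task name tasks.add and the broker/backend URLs are assumptions, not from the question): the worker owns the task implementation, while the client only needs a Celery app pointed at the same broker and calls the task by name with send_task, so it never imports the worker code.
# worker side: tasks.py, started with `celery -A tasks worker`
from celery import Celery
app = Celery("tasks", broker="amqp://localhost")  # assumed broker URL
@app.task(name="tasks.add")
def add(x, y):
    return x + y
# client side: no task code imported, only the task name and the broker
from celery import Celery
client = Celery(broker="amqp://localhost", backend="rpc://")  # assumed broker and result backend
result = client.send_task("tasks.add", args=(2, 3))  # refer to the task by name
print(result.get(timeout=10))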

Related

How to use Celery's chunks in a remote call?

I'm using send_task to send remote calls from a web server to a server that is running Celery. I'm trying to break the tasks down into smaller chunks, and I know Celery has the chunks function.
However, I couldn't find any example of how this could be used in a remote call. The docs only show how it can be done if the Celery code resides in the same repository. Is this even possible in the first place? Any tips would help. Thanks!

Is it possible to run a FastAPI app from the command line?

We can run any Python script by doing:
python main.py
Is it possible to do the same if the script is a FastAPI application?
Something like:
python main.py GET /login.html
to call a GET method that returns a login.html page.
If not, how could I start a FastAPI application without using Uvicorn or another web server?
I would like to run the script only when necessary.
Thanks
FastAPI is designed to let you BUILD APIs which can be queried using an HTTP client, not to query those APIs yourself directly; however, technically I believe you could.
When you start the script you could start the FastAPI app in another process running in the background, then send a request to it.
import subprocess
import threading
import time
import requests

url = "http://localhost:8000/some_path"  # uvicorn's default host and port
# launch uvicorn as a subprocess in a background thread, discarding its log output
thread = threading.Thread(
    target=lambda: subprocess.check_output(["uvicorn", "main:app"], stderr=subprocess.DEVNULL)
)
thread.start()
time.sleep(2)  # crude wait for the server to start accepting connections
response = requests.get(url)
# do something with the response...
thread.join()
Obviously this snippet has MUCH room for improvement; for example, the thread will never actually end unless something goes wrong with the server process, so this is just a minimal example.
This method has the clear drawback of starting the API each time you want to run the command. A better approach would be to emulate applications such as Docker, in which you start up a local server daemon which you then ping from the command line app.
This would mean the API runs for much longer in the background, but typically these APIs are fairly light and you shouldn't notice any hit to your computer's performance. It also provides the benefit of multiple users being able to run the command at the same time.
If you used the first method, you might run into situations where user A sends a GET request, starting up the server and taking hold of the configured host/port combination. When user B tries to run the same command just after, they will find themselves unable to start the server and perform the request.
This will also allow you to eventually move the API to an external server with minimal effort down the line. All you would need to do is change the base url of the requests.
TL;DR: Run the FastAPI application as a daemon, and query the local server from the command line program instead.
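A minimal sketch of that command-line client, assuming the API daemon is already running locally on port 8000 (the address, and the idea of passing the HTTP method and path as arguments, are assumptions for illustration):
import sys
import requests

BASE_URL = "http://localhost:8000"  # assumed address of the locally running daemon

def main():
    # usage: python cli.py GET /login.html
    method, path = sys.argv[1], sys.argv[2]
    response = requests.request(method, BASE_URL + path)
    print(response.status_code)
    print(response.text)

if __name__ == "__main__":
    main()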

During a long-running process, will Flask be insensitive to new requests?

My Flask project takes in orders as POST requests from multiple online stores, saves those orders to a database, and forwards the purchase information to a service which delivers the product. Sometimes, the product is not set up in the final service and the request sits in my service's database in an "unresolved" state.
When the product is set up in the final service, I want to kick off a long-running (maybe a minute) process to send all "unresolved" orders to the final service. During this process, will Flask still be able to receive orders from the stores and continue processing as normal? If not, do I need to offload this to a task runner like rq?
I'm not worried about speed as much as I am about consistency. The items being purchased are tickets to a live event so as long as the order information is passed along before the event begins, it should make no difference to the customer.
There are a few different answers that are all valid in different situations. The quick answer is that a job queue like RQ is usually the right solution, especially in the long run as your project grows.
As long as the WSGI server has workers available, another request can be handled. Each worker handles one request at a time. The development server uses threads, so an unlimited number of workers are available (with the performance constraints of threads in Python). Production servers like Gunicorn can use multiple workers, and different types of workers such as threads, processes, or eventlets. If you want to run a task in response to an HTTP request and wait until the task is finished before sending a response, you'll need enough workers to block on those tasks in addition to handling regular requests.
@app.route("/admin/send-purchases")
def send_purchases():
    ...  # do stuff, wait for it to finish
    return "success"
However, the task you're describing seems like a cleanup task that should be run regardless of HTTP requests from a user. In that case, you should write a Flask CLI command and call it using cron or another scheduling system.
@app.cli.command()
def send_purchases():
    ...
    click.echo("done")

# crontab hourly job
0 * * * * env FLASK_APP=myapp /path/to/venv/bin/flask send-purchases
If you do want a user to initiate the task, but don't want to block a worker waiting for it to finish, then you want a task queue such as RQ or Celery. You could make a CLI command that submits the job too, to be able to trigger it on request and on a schedule.
@rq.job
def send_purchases():
    ...

@app.route("/admin/send-purchases", endpoint="send_purchases")
def send_purchases_view():
    send_purchases.queue()
    return "started"

@app.cli.command("send-purchases")
def send_purchases_command():
    send_purchases.queue()
    click.echo("started")
Flask's development server will spawn a new thread for each request. Similarly, production servers can be started with multiple workers.
You can run your app with gunicorn or similar with multiple processes. For example with four process workers:
gunicorn -w 4 app:app
For example with eventlet workers:
gunicorn -k eventlet app:app
See the docs on deploying in production as well: https://flask.palletsprojects.com/en/1.1.x/deploying/

How do I create rq workers for my flask application for users online? (using redis server) [closed]

I am confused about how rq workers work! I created a Flask web application that runs a background process through a Redis server and then returns the results. So far, during development, I have been starting rq worker processes on my machine from the command line. However, now I'm ready to launch my application online. But when a remote user tries to run the application online, where are the workers hosted?
I tried initiating them directly in my code using with Connection():, and it worked, but it also interrupted the process of starting the web application in my browser.
Thanks for any help!
EDIT: To make my question more specific, I have launched an rq worker using the command line on my machine, but when I initiate tasks on my remote Redis server (hosted on AWS EC2) the app stalls and there is no worker activity. Is there something wrong with the configuration of my rq workers, or is there another way I should be launching them? All I did was import rq and then run rq worker simulator, and that was enough for local development; if there is configuration that I need to change, do I do so on the command line?
EDIT 2: I got it working. I simply needed to run rq worker simulator --url {ec2 address}. Thanks Paul Becotte for the explanation pointing me in the right direction.
Generally with a task queue (I'm only vaguely familiar with rq, so excuse the lack of specifics, but I don't think they're necessary here) the idea is to split work out of the request/response cycle. You don't want requests that take 30 seconds, especially when you have other requests coming in!
In this world, you have a source of state (Redis) and a bunch of clients that use it. Your users connect to your API to send requests. Your API connects to the source of state. For this, there are probably two endpoints: start task and check status. Your API sends a message to the pub/sub on "start task" and looks for the result of that task when check status is called.
You also start X worker processes. They connect to Redis as well, as subscribers. When a task comes in, a worker picks it up and works it, then posts the result back to Redis. You run enough of these processes to keep the total cycle time low for your traffic.
You don't store any state at all in any of the workers or API endpoints, so you can scale them out horizontally: start as many as you want on as many machines as you want. The only thing you need to maintain then is your Redis server; make sure it is stable and backed up and everything (you can cluster Redis as well and get HA, but that's a longer discussion).
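As a rough sketch of that layout with rq (the Redis address, the queue name simulator, and the count_words task are assumptions for illustration): the web side enqueues the job against the shared Redis, and the workers, started separately and pointed at the same Redis URL, pick it up.
# tasks.py - the function the workers will execute
import time

def count_words(text):
    time.sleep(5)  # stand-in for the real background work
    return len(text.split())

# enqueue.py - run on the web/API side
from redis import Redis
from rq import Queue
from tasks import count_words

redis_conn = Redis.from_url("redis://my-ec2-host:6379")  # assumed remote Redis URL
q = Queue("simulator", connection=redis_conn)
job = q.enqueue(count_words, "some long text to process")
print(job.get_id())  # keep this id around to check the job's status later
The workers themselves are just separate processes started with rq worker simulator --url redis://my-ec2-host:6379, and you can run as many of them as you need on any machine that can reach both Redis and the tasks module.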

Do I need celery when I am using gevent?

I am working on a Django web app that has functions (say, for example, sync_files()) that take a long time to return. When I use gevent, my app does not block when sync_files() runs and other clients can connect and interact with the web app just fine.
My goal is to have the web app responsive to other clients and not block. I do not expect a zillion users to connect to my web app (perhaps 20 connections at most), and I do not want to set this up to become the next Twitter. My app is running on a VPS, so I need something lightweight.
So in the case listed above, is it redundant to use Celery when I am using gevent? Is there a specific advantage to using Celery? I prefer not to use Celery since it is yet another service that would be running on my machine.
edit: found out that Celery can run the worker pool on gevent. I think I am a little more unsure about the relationship between gevent & Celery.
In short, you do need Celery.
Even if you use gevent and have concurrency, the problem becomes the request timeout. Let's say your task takes 10 minutes to run, while the typical request timeout is around a minute. What will happen if you trigger the task directly within a view is that the server will start processing it, but after a minute the client (browser) will probably drop the connection since it will think the server is offline. As a result, your data can become corrupt, since you cannot be sure what will happen when the connection closes. Celery solves this because it triggers a background process which processes the task independently of the view. So the user gets the view response right away, and at the same time the server starts processing the task. That is the correct pattern for handling any scenario which requires a lot of processing.
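A minimal sketch of that pattern in Django with Celery (the module layout, the sync_files task body, and the view name are assumptions for illustration): the view only enqueues the job and returns immediately, while a Celery worker does the slow work outside the request cycle.
# tasks.py
from celery import shared_task

@shared_task
def sync_files():
    ...  # the long-running work lives here, executed by a Celery worker

# views.py
from django.http import JsonResponse
from .tasks import sync_files

def start_sync(request):
    sync_files.delay()  # returns immediately; a worker picks the task up
    return JsonResponse({"status": "started"})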
