Django synchronous task queue to external system - python

I have a Django application acting as a load balancer. It receives tasks and forwards them to an available server. All communication is handled as HTTP requests, i.e. Django posts an HTTP request to the external server, and the external server posts an HTTP request back to Django upon task completion. Unfortunately there's no way to check whether the external server is busy or not, so I need to wait for its notification.
As Django processes requests asynchronously, I need to build a synchronous task queue that monitors pending tasks and free servers, sends tasks to all available servers, and then waits until any of them reports back. I tried using Celery, but I'm not sure how to "wait" for a server to report back.
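One possible shape for this (a hypothetical sketch - the Server and Task models, their fields, and the view wiring are all assumptions, not part of the question): track pending tasks and server availability in the database, dispatch from one helper, and let the completion-callback view free the server and immediately dispatch the next pending task, so no request ever blocks waiting:

    # views.py - hypothetical sketch; Server(busy, url) and Task(status, payload,
    # server, created) are assumed models.
    import requests
    from django.db import transaction
    from django.http import HttpResponse

    from .models import Server, Task

    def dispatch_pending():
        """Pair pending tasks with free servers; never blocks waiting for results."""
        to_send = []
        with transaction.atomic():
            free_servers = list(Server.objects.select_for_update().filter(busy=False))
            pending = list(Task.objects.filter(status="pending")
                           .order_by("created")[:len(free_servers)])
            for server, task in zip(free_servers, pending):
                server.busy = True
                server.save()
                task.status, task.server = "running", server
                task.save()
                to_send.append((server.url, task))
        for url, task in to_send:  # POST outside the row locks
            requests.post(url, json={"task_id": task.id, "payload": task.payload})

    def submit_task(request):
        Task.objects.create(payload=request.body.decode(), status="pending")
        dispatch_pending()
        return HttpResponse("queued")

    def task_done(request, task_id):
        """Completion callback POSTed by the external server."""
        task = Task.objects.get(id=task_id)
        task.status = "done"
        task.save()
        Server.objects.filter(id=task.server_id).update(busy=False)
        dispatch_pending()  # a server just became free; hand it the next task
        return HttpResponse("ok")

The "queue" then lives in the database rather than in any waiting process, which fits Django's request/response model.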

Related

Architecture and interaction of the client, NestJS backend, and Python microservice

As the title says, I have a question about the architecture and interaction of the client, a NestJS backend, and a Python microservice. I need to develop a recognition service: a client application sends an HTTP request with an image to the NestJS backend, the backend contacts a third-party Python microservice, that service recognizes the text in the image (which can take a long time), and the client application should eventually receive the result. What is the best way to implement this?
My idea is to connect NestJS to the Python microservice via RabbitMQ: the client sends a POST HTTP request to the NestJS backend; the backend sends a task-creation message over RPC to the Python microservice; the microservice puts the task on a Redis RQ queue, returns the task id to the NestJS backend, and starts the long-running job; and the backend returns the task id to the client. After that, the client can send GET HTTP requests to the NestJS backend at some interval, and the backend forwards a status request to the microservice and returns the status to the client. Is this a good way to do it, or can this process be optimized or implemented more competently?
I think you're on the right track here.
Send the image to Nest via HTTP - yes.
Post the job to a Redis queue - yes; use NestJS's built-in queue handling (see the docs), which will also make it easier to consume the result of the job.
Instead of having your client poll for a result, check out Server-Sent Events.
Server-Sent Events are intended for exactly this use case.
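On the Python side, the enqueue-and-check-status flow described in the question maps directly onto Redis RQ (a minimal sketch; the queue name and the recognize_text task are assumptions):

    # microservice side - hypothetical sketch of the RQ flow described above
    from redis import Redis
    from rq import Queue
    from rq.job import Job

    redis_conn = Redis()
    q = Queue("recognition", connection=redis_conn)

    def create_task(image_bytes):
        # "tasks.recognize_text" is the assumed long-running OCR function
        job = q.enqueue("tasks.recognize_text", image_bytes, job_timeout=600)
        return job.id  # hand this id back to the NestJS backend / client

    def task_status(job_id):
        job = Job.fetch(job_id, connection=redis_conn)
        return {"status": job.get_status(), "result": job.result}

The task_status lookup can back either the polling GET endpoint or, as suggested above, an SSE stream that pushes the status once the job finishes.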

Networking with gRPC

I'm trying to improve network performance between gRPC client and server.
The client's network speed is 1 Gbps.
Assume my server takes 200 ms to respond, and I measure the latency on the client.
Now, if the server's processing time goes up, say to 700 ms per response, where will the requests accumulate? Will they stay in a queue on the client's side, or will they still be sent to the server and wait in a queue there?
In other words, does a gRPC client hold a queue of outstanding requests, or is every request always sent immediately - which would mean the measured latency does not depend on the server's processing time?
And is there a setting for it in grpc-python?
I suggest you check out the client-side and server-side interceptor classes.
Also, if you want to debug the requests, you can fire off multiple requests, either back-to-back or at a fixed interval, using JMeter or Postman Runner.
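As a concrete example, a client-side interceptor in grpc-python can timestamp each RPC and log when it completes (a minimal sketch; the channel target and generated stub are assumptions):

    import time
    import grpc

    class LatencyInterceptor(grpc.UnaryUnaryClientInterceptor):
        """Logs wall-clock completion time for each unary-unary RPC."""

        def intercept_unary_unary(self, continuation, client_call_details, request):
            start = time.monotonic()
            call = continuation(client_call_details, request)
            # The returned object is both a Call and a Future; log on completion
            call.add_done_callback(
                lambda _: print(f"{client_call_details.method}: "
                                f"{time.monotonic() - start:.3f}s"))
            return call

    channel = grpc.intercept_channel(grpc.insecure_channel("localhost:50051"),
                                     LatencyInterceptor())
    # stub = my_service_pb2_grpc.MyServiceStub(channel)  # generated stub, assumed

Comparing these client-side timings against server-side logs should show whether the extra 500 ms accumulates before or after the request leaves the client.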

How to implement request-reply (synchronous) messaging paradigm in Kafka?

I am going to use Kafka as a message broker in my application. The application is written entirely in Python. For one part of it (login and authentication), I need to implement a request-reply messaging system. In other words, the producer needs to get the response to its produced message from the consumer, synchronously.
Is this feasible using Kafka and its Python libraries (kafka-python, ...)?
I'm facing the same issue (request-reply for an HTTP hit in my case).
My first attempt was (100% Python):
start a consumer thread
publish the request message (including a request_id)
join the consumer thread
get the answer from the consumer thread
The consumer thread subscribes to the reply topic (seeked to the end) and processes received messages until it finds the request_id (modulo a timeout).
While this works for basic testing, unfortunately creating a KafkaConsumer object is slow (~300 ms), so it's not an option for a system with heavy traffic.
In addition, if your system handles request-replies in parallel (for example, multi-threaded the way a web server is), you'll need a KafkaConsumer dedicated to each request_id (basically by using the request_id as the consumer group) to avoid having the reply to a request published by thread A consumed (and ignored) by thread B.
So you can't recycle your KafkaConsumer here, and you have to pay the creation cost for each request (on top of the processing time on the backend).
If your request-reply processing is not parallel, you can try to keep the KafkaConsumer object available for the threads started to fetch answers.
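For reference, here is roughly what that first thread-based attempt looks like with kafka-python (a minimal sketch; the topic names, bootstrap server, and JSON message envelope are assumptions, and it still pays the KafkaConsumer creation cost on every request):

    import json
    import threading
    import uuid

    from kafka import KafkaConsumer, KafkaProducer

    BOOTSTRAP = "localhost:9092"  # assumed
    producer = KafkaProducer(bootstrap_servers=BOOTSTRAP,
                             value_serializer=lambda v: json.dumps(v).encode())

    def request_reply(payload, timeout_s=10):
        request_id = str(uuid.uuid4())
        holder = {}

        def consume():
            # Dedicated consumer per request, starting at the end of the topic
            consumer = KafkaConsumer("reply-topic",
                                     bootstrap_servers=BOOTSTRAP,
                                     auto_offset_reset="latest",
                                     group_id=request_id,  # one group per request
                                     consumer_timeout_ms=timeout_s * 1000)
            for msg in consumer:  # stops after consumer_timeout_ms of silence
                reply = json.loads(msg.value)
                if reply.get("request_id") == request_id:
                    holder["reply"] = reply
                    break
            consumer.close()

        t = threading.Thread(target=consume)
        t.start()
        producer.send("request-topic", {"request_id": request_id,
                                        "payload": payload})
        producer.flush()
        t.join(timeout_s)
        return holder.get("reply")  # None on timeout

Note there is also a race here: if the reply lands before the consumer's partition assignment completes, "latest" will skip it.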
The only solution I can see at this point, to avoid paying that consumer-creation cost, is to use a DB (relational/NoSQL):
the requestor stores the request_id in the DB (as local as possible) and publishes the request to Kafka
the requestor polls the DB until it finds the answer for that request_id
in parallel, a consumer process receives messages from the reply topic and stores the results in the DB
But I don't like polling... it will generate heavy load on the DB in a high-traffic system.
My 2 cents.

Celery - Reuse a broker connection with apply_async()

We have a Django app that uses Celery's apply_async() call to send tasks to our RabbitMQ server. The problem is that when thousands of requests come into the Django app, the apply_async() calls open thousands of new connections to the RabbitMQ server.
In the Celery documentation for apply_async, there is a connection parameter:
connection – Re-use existing broker connection instead of establishing a new one.
My question is, how can I use it in a Django app? I cannot find any examples of how to use this. We are running Django under Gunicorn; ideally we would like each worker to create one connection to the broker and re-use it between requests. That way, the number of connections opened on the broker is limited by the number of workers running.
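For what it's worth, one pattern for this (a minimal sketch built on Celery's broker connection pool; my_task and the module layout are assumptions) is to acquire a connection from app.pool and pass it to apply_async, with broker_pool_limit bounding how many connections each worker process keeps:

    # celery_app.py - assumed layout
    from celery import Celery

    app = Celery("myapp", broker="amqp://guest@localhost//")
    app.conf.broker_pool_limit = 1  # pooled connections kept per process

    # in a Django view
    from .celery_app import app
    from .tasks import my_task  # hypothetical task

    def enqueue(data):
        # Re-use a pooled connection instead of opening a fresh one per call
        with app.pool.acquire(block=True) as connection:
            my_task.apply_async(args=(data,), connection=connection)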

Django - listening to RabbitMQ synchronously, without Celery, in the same process as the web-bound Django

I need to implement a fairly simple Django server that serves some HTTP requests and also listens to a RabbitMQ message queue that streams information into the Django app (which should be written to the DB). The data must be written to the DB in a synchronized order, so I can't use the obvious Celery/RabbitMQ configuration. I was told there is no way to do this in the same Django project, since Django listens for HTTP requests in its own process and can't run another process to listen to RabbitMQ - forcing me to add another Python/Django project for the RabbitMQ/DB-write part, working with the same models the HTTP-bound Django project uses. You can smell the trouble with this configuration from here. Any ideas how to solve this?
Thanks!
If anyone else bumps into this problem:
The solution is to run a RabbitMQ consumer in a different process (but in the same Django codebase) from Django itself - not the process running through WSGI, etc.; you have to start it by itself.
The consumer connects to the appropriate RabbitMQ queues and writes the data into the Django models. The usual Django process(es) then act as a "read model" of the data inserted/updated/created/deleted as delivered by the message queue (RabbitMQ or other) from the remote process.
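A minimal sketch of that standalone consumer, written as a Django management command so it shares the codebase and models (the queue name, pika as the client library, and MyModel are assumptions); it is started on its own with python manage.py consume_rabbit, separately from the WSGI process:

    # myapp/management/commands/consume_rabbit.py - hypothetical sketch
    import json

    import pika
    from django.core.management.base import BaseCommand

    from myapp.models import MyModel  # the same models the web process uses

    class Command(BaseCommand):
        help = "Consume RabbitMQ messages and write them to the DB, in order."

        def handle(self, *args, **options):
            connection = pika.BlockingConnection(
                pika.ConnectionParameters("localhost"))
            channel = connection.channel()
            channel.queue_declare(queue="updates", durable=True)

            def callback(ch, method, properties, body):
                data = json.loads(body)
                MyModel.objects.create(**data)  # one write at a time, in queue order
                ch.basic_ack(delivery_tag=method.delivery_tag)

            # prefetch_count=1 keeps processing strictly sequential
            channel.basic_qos(prefetch_count=1)
            channel.basic_consume(queue="updates", on_message_callback=callback)
            channel.start_consuming()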
