Django: how best to perform API requests for large jobs - python

I need some direction as to how to achieve the following functionality using Django.
I want my application to let multiple users submit jobs that make calls to an API.
Each user job will require multiple API calls and will store the results in a database or a file.
Each user should be able to submit multiple jobs.
In case of a failure, such as the network being down or the API not returning results, I want the application to pause for a while and then resume completing that job.
Basically, I want the application to pick up from where it left off.
Any ideas on how I could implement this, any technologies I should be looking at (such as Celery), or even an open-source project where I can learn how to do this, would be a great help.

You can do this with RabbitMQ and Celery.
This post might be helpful.
https://medium.com/@ffreitasalves/executing-time-consuming-tasks-asynchronously-with-django-and-celery-8578eebab356
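To get the "pick up where it left off" behaviour, the usual trick is to checkpoint completed API calls as you go, so that a retried job skips work already done. Here is a minimal stdlib sketch of that idea; `fetch_fn`, the item IDs, and the checkpoint file layout are all placeholder assumptions, not part of any real API.

```python
import json
import os

# Sketch of a resumable job: completed call results are checkpointed to
# disk, so a re-run (e.g. a Celery retry) skips items already fetched.
# fetch_fn stands in for your real API client.

def run_job(job_id, items, fetch_fn, checkpoint_dir="."):
    path = os.path.join(checkpoint_dir, f"job_{job_id}.json")
    done = {}
    if os.path.exists(path):
        with open(path) as f:
            done = json.load(f)          # resume: load earlier results
    for item in items:
        if item in done:                 # already fetched on a previous run
            continue
        done[item] = fetch_fn(item)      # may raise on a network failure
        with open(path, "w") as f:
            json.dump(done, f)           # checkpoint after every call
    return done
```

In a Celery setup you would wrap this body in a task and let Celery handle the pause-and-retry part, e.g. with `@shared_task(bind=True, autoretry_for=(Exception,), retry_backoff=True)`; the checkpoint file (or a database table playing the same role) is what makes the retry resume rather than restart.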

Related

What should I use: Django non-global middleware or Django triggers?

My problem is basically that I am currently building a customized management system based on Django (3.1) and Python (3.7.9), in which I pull data from a third-party tool. The tool does not give me webhooks for all of the data I want for visualization and analysis.
The webhook gives me bits of information, and I have to perform a GET request to their API to fetch the remaining details if they are not already in my database. They require a successful webhook response within 5 seconds, otherwise a retry is triggered.
If I do the GET request inside the webhook handler, the 5-second limit is exceeded. The solutions I came up with were Django middleware and Django triggers, and I am a bit confused about which would be best suited to my problem.
Note: I cannot lower the Django version, as I have to use async functions.
This would be a good use case for a task queue like Celery.
django-triggers is an interface to the Celery scheduler, so it might be a good fit.
Keep in mind that Celery has to be run as a separate process next to Django.
Another popular task scheduler is rq-scheduler.
It offers a simple implementation using Redis as a message queue. Note that load-balanced/multi-instance applications are not easily set up with RQ.
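Either way, the pattern that beats the 5-second deadline is the same: the webhook handler only enqueues the payload and acks immediately, and the slow GET happens in a separate worker. Below is a stdlib sketch of that handoff; in a Django + Celery deployment the `queue.put()` would be a `task.delay()` call and the worker loop would be the Celery worker process. The payload shape and response dict are illustrative assumptions.

```python
import queue
import threading

# Ack-fast pattern: the webhook handler does no network I/O itself,
# so responding within 5 seconds is easy.

task_queue = queue.Queue()
results = []

def webhook_handler(payload):
    task_queue.put(payload)          # hand off immediately, no GET here
    return {"status": 200}           # ack within the 5-second window

def worker():
    while True:
        payload = task_queue.get()
        if payload is None:          # sentinel: stop the worker
            break
        # here you would GET the remaining details from the third-party
        # API and save them to the database
        results.append({"id": payload["id"], "fetched": True})
        task_queue.task_done()
```

Middleware is the wrong layer for this: it still runs inside the request/response cycle, so it would not buy you any time against the deadline.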

Consuming web services from multiple sources and saving to a DB

I am building an application in Django that collects hotel information from various sources and formats the data into a uniform format. After that I need to expose an API, using django-rest-framework, to give hotels access from web apps and devices.
So, for example, I have 4 sources:
[HotelPlus, xHotelService, HotelSignup, HotelSource]
Please let me know the best implementation practice in terms of Django. Being a PHP developer, I would prefer to do this by writing custom third-party services that implement a common interface, so adding more sources becomes easy. That way I only need to call an execute() method from the cron task and the rest is done by the service controller (fetching the feed and populating the database).
But I am new to Python and Django, so I don't have much idea whether creating services or middleware is the right fit for this task.
For fetching data from the sources you will need dedicated worker processes and a broker, so that your main Django process won't be blocked. You can use Celery for that; it already supports Django.
After writing the tasks that fetch and format the data, you will need a scheduler to call these tasks periodically. You can use Celery beat for that.
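The interface idea from the question translates directly to Python: each source subclasses a common base class, and the periodic task just iterates over the registered sources. A minimal sketch, where the class names come from the question but the field names and `normalize()` mapping are invented placeholders, not a real hotel schema:

```python
from abc import ABC, abstractmethod

# Each source only implements fetch(); execute() normalizes whatever the
# source returns into one uniform dict, as the question describes.

class HotelSource(ABC):
    @abstractmethod
    def fetch(self):
        """Return raw records from the remote feed."""

    def execute(self):
        return [self.normalize(r) for r in self.fetch()]

    @staticmethod
    def normalize(record):
        return {
            "name": record.get("name") or record.get("hotel_name"),
            "city": record.get("city") or record.get("location"),
        }

class HotelPlus(HotelSource):
    def fetch(self):
        # in reality: an HTTP request to the HotelPlus feed
        return [{"hotel_name": "Grand", "location": "Oslo"}]

class HotelSignup(HotelSource):
    def fetch(self):
        return [{"name": "Plaza", "city": "Rome"}]

SOURCES = [HotelPlus(), HotelSignup()]

def sync_all():
    """What a Celery beat task would call periodically."""
    records = []
    for source in SOURCES:
        records.extend(source.execute())   # then save to the database
    return records
```

Adding a fifth source is then just another subclass and an entry in `SOURCES`; the beat schedule does not change.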

How can I have a python module run asynchronously and receive calls from other modules?

So I'm currently working on adding a recommendation engine to a Django project and need to do some heavy processing (in an external module) for one of my view functions. This significantly slows down page load time because I have to load in some data, transform it, perform my calculations based on parameters sent by the request, and then return the suggestions to the view. This has to be done every time the view is loaded.
I was wondering if there was some way I could have the recommender module load and transform the data in memory, and then wait for parameters to be sent from the view, have calculations run on those parameters and then send it back to the view.
Any help would be greatly appreciated.
Celery is a task queue that really excels at this sort of thing.
It would allow you to do something like:
user makes a request to the view
the view starts an async task that does the heavy lifting, then returns to the user immediately
you can poll from JavaScript to see if the task is done and load the results when it is
It might not be quite the flow you're looking for, but Celery is definitely worth checking out.
Celery has a great Django package too, and it is extremely easy to use.
Rereading your question, I think it would also be possible to create a local web service around your recommendation engine. On startup it can load all the data into memory, and then you can just make requests to it from your Django app.

Choosing an application framework to handle offline analysis with web requests

I am trying to design a web based app at the moment, that involves requests being made by users to trigger analysis of their previously entered data. The background analysis could be done on the same machine as the web server or be run on remote machines, and should not significantly impede the performance of the website, so that other users can also make analysis requests while the background analysis is being done. The requests should go into some form of queueing system, and once an analysis is finished, the results should be returned and viewable by the user in their account.
Please could someone advise me of the most efficient framework to handle this project? I am currently working on Linux, the analysis software is written in Python, and I have previously designed dynamic sites using Django. Is there something compatible with this that could work?
Given your background and the analysis code already being written in Python, Django + Celery seems like an obvious candidate here. We're currently using this solution for a very processing-heavy app, with one front-end Django server, one dedicated database server, and two distinct Celery servers for the background processing. Having the Celery processes on distinct servers keeps the Django front-end responsive whatever the load on the Celery servers (and we can add new Celery servers if required).
So, well, I don't know if it's "the most efficient" solution, but it does work.
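The request flow the question describes (submit a request, queue the analysis, view the result later in your account) boils down to: create a job record, hand it to a worker, and let the user poll by job ID. A stdlib sketch of that shape, where the in-memory dict stands in for a database table and the thread for a Celery worker (possibly on another machine):

```python
import threading
import uuid

# Submit-then-poll flow: submit() returns immediately with a job ID;
# a background worker fills in the result; get_result() is what the
# user's account page would poll.

jobs = {}

def submit(analysis_fn, data):
    job_id = str(uuid.uuid4())
    jobs[job_id] = {"status": "pending", "result": None}

    def run():
        jobs[job_id]["result"] = analysis_fn(data)
        jobs[job_id]["status"] = "done"

    threading.Thread(target=run).start()
    return job_id            # returned to the user right away

def get_result(job_id):
    return jobs[job_id]
```

With Celery the job record is usually the task's result backend entry, and `submit()` collapses to `analysis_task.delay(data)`, which already returns an ID you can poll.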

Ruby on Rails frontend and server side processing in Python or Java.. HOW, What, Huh?

I am a data scientist and database veteran but a total rookie in web development, and I have just finished developing my first Ruby on Rails app. This app accepts data from users who submit it on my front-end web page and returns stats on the data submitted. Some users have been submitting way too much data; the app is getting slow, and I think I had better push the data crunching to a backend Python or Java app, not the database. I don't even know where to start. Any ideas on how best to architect this application? The job flow is: data is submitted from the front-end app, which pushes it to the backend, where my server app processes it and sends it back to my Ruby on Rails page. Any good tutorials that cover this? Please help!
What should I be reading up on?
It doesn't look like you need another app, but a different approach to how you process data. How about processing in the background? There are several gems to accomplish that.
Are you sure your database is well maintained and efficient (good indexes, normalized, clean, etc.)?
Or could you make use of message queues? You keep your Rails CRUD app, and the jobs are just added to a queue. Python scripts on the backend (or a different machine) read from the queue, process the data, then insert it back into the database, or add the results to a results queue, or wherever you want to read them from.
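Because the producer (Rails) and consumer (Python) are different languages, the queue payloads should be a language-neutral format like JSON. A sketch of one iteration of the Python-side worker, with `queue.Queue` standing in for a real broker (Redis, RabbitMQ) and `crunch()` as a placeholder for the real stats:

```python
import json
import queue

job_queue = queue.Queue()      # Rails pushes JSON job payloads here
result_queue = queue.Queue()   # Rails (or the DB) reads results here

def crunch(numbers):
    # placeholder for the heavy data crunching
    return {"count": len(numbers), "mean": sum(numbers) / len(numbers)}

def process_one():
    """One iteration of the worker loop on the Python side."""
    job = json.loads(job_queue.get())            # decode Rails' JSON
    stats = crunch(job["numbers"])
    result_queue.put(json.dumps({"job_id": job["job_id"], **stats}))
```

Carrying a `job_id` through the round trip is what lets the Rails side match each result back to the original submission.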
