I don't know if this is the right place to ask, but I am desperate for an answer.
The problem at hand is not the number of requests but the amount of time a single request takes. For each request, the server has to query about 12 different sources for data, and it can take up to 6 hours for the server to get it all. (Let's leave request timeouts out of this, because the server is not communicating directly with the client: it fetches messages from Kafka and then starts gathering the data from the sources.) I am supposed to come up with a scalable solution. Can anyone help me with this?
The problem doesn't end here:
Once the server gets the data, it has to push it to Kafka for further computation using Spark. The Streaming API will be used in this part.
I am open to any web framework or any scaling solution in Python.
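One way to keep this scalable is to run the 12 source queries concurrently per message and scale out by adding consumers to the same group (one per partition), so each 6-hour job ties up only one worker. A minimal sketch, assuming kafka-python, placeholder topic names, and a hypothetical fetch_source:

```python
import json
from concurrent.futures import ThreadPoolExecutor

from kafka import KafkaConsumer, KafkaProducer  # pip install kafka-python

SOURCES = ["source-1", "source-2"]  # placeholder for your 12 sources

def fetch_source(source, request):
    """Hypothetical: query one data source; may take hours."""
    ...

consumer = KafkaConsumer(
    "requests-topic",                          # placeholder topic name
    bootstrap_servers="localhost:9092",
    group_id="fetchers",
    max_poll_interval_ms=24 * 60 * 60 * 1000,  # allow very long processing
)
producer = KafkaProducer(bootstrap_servers="localhost:9092")

for message in consumer:
    request = json.loads(message.value)
    # Query all sources in parallel: total time is bounded by the slowest
    # source instead of the sum of all 12.
    with ThreadPoolExecutor(max_workers=len(SOURCES)) as pool:
        results = list(pool.map(lambda s: fetch_source(s, request), SOURCES))
    # Hand the combined result to Kafka for the Spark job downstream.
    producer.send("results-topic", json.dumps(results).encode("utf-8"))
```

Note the generous max_poll_interval_ms: without it, a consumer that spends hours on one message would be kicked out of the group and its partition rebalanced mid-job.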
Related
Watching the communication between the server and the client with Fiddler, I saw that a single click in Chrome triggered dozens of exchanges with the server.
In some cases, data from the response to the first request is included in the second request, and I wonder how that information is extracted and carried into the next request.
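That chaining is usually done by the page's JavaScript: it pulls a value (a token, a session id, etc.) out of the first response's body, cookies, or headers and sends it along in the next request, and Fiddler shows you where the value appears. A minimal sketch with the requests library (URLs and field names are made up):

```python
import requests

session = requests.Session()  # carries cookies across requests automatically

# First request: suppose the response body contains a token that the
# page's JavaScript will embed in the follow-up request.
first = session.get("https://example.com/start")
token = first.json()["token"]  # hypothetical field name

# Second request: include the extracted value, just as the browser does.
second = session.post("https://example.com/api/next", json={"token": token})
print(second.status_code, second.text)
```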
I am wondering about the available solutions to my problem. I need to retrieve data from an API every 200 ms (preferably) and save it to the database, since it will be processed further by another service. I wanted to base my solution on RabbitMQ and task queuing: an API from which you can create or delete a task that fetches data every 200 ms and adds it to the database. There may be several such tasks, though not many. While I know the latency associated with the database cannot be avoided, I do not know if the RabbitMQ solution is optimal in this case. Maybe someone has experience and can suggest a better approach? My API is based on Python and FastAPI.
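RabbitMQ would work, but for "several tasks, not many" a plain asyncio loop inside the FastAPI process may already be enough; a queue mostly buys you something once the fetchers have to survive restarts or run on other machines. A rough sketch, assuming httpx and a hypothetical save_to_db:

```python
import asyncio

import httpx
from fastapi import FastAPI

app = FastAPI()
tasks: dict[str, asyncio.Task] = {}

async def save_to_db(data):
    """Hypothetical stub: write the fetched payload to your database."""
    ...

async def poll_forever(url: str):
    async with httpx.AsyncClient() as client:
        while True:
            response = await client.get(url)
            await save_to_db(response.json())
            await asyncio.sleep(0.2)  # ~200 ms between fetches

@app.post("/tasks/{task_id}")
async def create_task(task_id: str, url: str):
    tasks[task_id] = asyncio.create_task(poll_forever(url))
    return {"status": "started"}

@app.delete("/tasks/{task_id}")
async def delete_task(task_id: str):
    tasks.pop(task_id).cancel()
    return {"status": "stopped"}
```

Note that sleep(0.2) does not account for the request's own duration; if you need a strict 200 ms cadence, schedule against the clock instead.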
I am building an application with a Flask API and React.
The first page of the app presents the user with an upload form. The user selects a file (700 MB) and clicks upload.
Once this is done, the backend:
Takes the file and unzips it
Runs an ML model
Returns a JSON object containing the right data
When this is over, React gets the JSON and renders a new page.
These three steps take more than 10 minutes, so I get a 500 error, which I believe is due to the request timing out.
I would like to know if there is a way to set the timeout to None.
I looked for some answers, and they suggest using Celery. However, I am not sure if this is the right approach for my task.
I second @TheIncorrigible's suggestion to solve this with some kind of event-driven architecture; what you are doing is the Web Worker architecture. Ref
Your problem reminds me of the AWS service called Control Tower, where launching a landing zone takes more than 10 minutes and AWS handles it gracefully. When you try to launch it, a banner says it is in progress and will take about an hour. In the console log I noticed they were using Promises (I'm not exactly sure how they achieve it or how long it can handle).
Maybe you could try using Promises in React for the asynchronous computations. I am no expert, but it looks like you can achieve this that way. You may watch this short video for a basic understanding.
There is also SignalR, which allows server code to send asynchronous notifications to client-side web applications. You can check whether it applies in your case: SignalR in Python discussion.
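Since Celery came up in the question: the usual pattern is to return a task id immediately and let React poll for the result, so no HTTP request ever runs for 10 minutes. A minimal sketch assuming a local Redis broker (route names and paths are illustrative):

```python
from celery import Celery
from flask import Flask, jsonify, request

app = Flask(__name__)
celery = Celery(__name__, broker="redis://localhost:6379/0",
                backend="redis://localhost:6379/0")

@celery.task
def process_upload(path):
    # Unzip the file and run the ML model here; this may take 10+ minutes,
    # but it runs in a Celery worker, not in the web request.
    return {"result": "..."}

@app.route("/upload", methods=["POST"])
def upload():
    f = request.files["file"]
    f.save("/tmp/upload.zip")                 # illustrative path
    task = process_upload.delay("/tmp/upload.zip")
    # Return immediately so the HTTP request never hits a timeout.
    return jsonify({"task_id": task.id}), 202

@app.route("/status/<task_id>")
def status(task_id):
    result = celery.AsyncResult(task_id)
    return jsonify({"state": result.state,
                    "data": result.result if result.ready() else None})
```

React then polls /status/<task_id> every few seconds and renders the new page once the state reaches SUCCESS.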
I have only been told to create a Pythonic web service. At the end of the day, I need to offer an HTTPS endpoint which receives JSON objects (from a POST request) and can process them and send JSON back to another web service.
To be able to receive POST requests from other services, what kind of information do I need?
I have seen some examples using httplib2, such as sending HTTP GET and POST requests to a given website like www.something.com. But in my case, since I do not know the IP address/URL of the data source, should I create a listener waiting for the incoming data? How do I achieve this?
I am really new to building Python web servers, and the requirement I was given is really vague. Thank you in advance for helping me break down this problem.
Take a look at the Flask framework; it can do everything you want and then some. I can especially recommend the Quickstart: A Minimal Application and the JSON Support pages.
Enabling the built-in debugger will help you a great deal as well.
All services listen for incoming connections, so you are right about that :-)
Good luck!
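A minimal sketch of such an endpoint, following the quickstart pages mentioned above (the route name and payload are illustrative; in production you would put TLS termination in front of it, e.g. nginx, to get the HTTPS part):

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/process", methods=["POST"])
def process():
    payload = request.get_json()      # JSON body of the incoming POST
    result = {"echo": payload}        # replace with your real processing
    return jsonify(result)            # JSON response back to the caller

if __name__ == "__main__":
    app.run(debug=True)  # enables the built-in debugger mentioned above
```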
I'm trying to test a web application using Selenium with Python. I've written a script to mimic a user: it logs in to the server, generates some reports, and so on. It is working fine.
Now I need to see how much time the server takes to process a specific request.
Is there a way to find that out from the same Python code?
Any alternative method is acceptable.
Note:
The server is in the same LAN
Also, I don't have privileges to do anything on the server side, so whatever I do has to happen from outside the server.
Any sort of help is appreciated. Thank you
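One client-side option that needs no server access is to read the browser's own Navigation Timing data from the same Selenium session. A sketch (the URL is a placeholder):

```python
from selenium import webdriver

driver = webdriver.Chrome()
driver.get("http://your-server/report")  # placeholder URL

# Milliseconds from sending the request to receiving the last byte.
elapsed_ms = driver.execute_script(
    "var t = window.performance.timing;"
    "return t.responseEnd - t.requestStart;"
)
print("Server handled the page request in", elapsed_ms, "ms")
driver.quit()
```

For individual XHR calls made after page load, performance.getEntriesByType('resource') exposes per-request timings the same way.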
Have you considered the W3C HTTP access log field time-taken? It reports the processing time of every single request, in milliseconds at worst; on some platforms the reported precision is finer. For a web server, an application server with an HTTP access layer, or an enterprise service bus with an HTTP access layer (for SOAP and REST calls) to be fully W3C standards compliant, this value must be available for inclusion in the HTTP access logs.
You will see every single request and the time required to process it, from the first byte received at the server to the last byte sent, minus the final TCP ACK at the end.
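For illustration, an entry in W3C extended log format with the field enabled might look like this (the values are made up):

```
#Fields: date time c-ip cs-method cs-uri-stem sc-status time-taken
2023-04-01 10:15:02 10.0.0.5 GET /reports/run 200 5321
```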