I have a Next.js index page that resizes an image. To do this, I have an api/run.js that executes a Python script and returns the result.
However, the Python script is resource intensive and takes about 3 minutes to return the result, so I want the runs to happen consecutively rather than concurrently.
My goal is that the same webpage can be accessed from multiple devices and all requests end up in the same queue.
How can I achieve this?
I tried using WebSockets and a MySQL database, but I realized that someone has to be actively on the website for it to work correctly; otherwise the queue stalls.
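For illustration only, here is a minimal sketch of one way to get a single shared queue: a standalone Python worker that polls a jobs table in the MySQL database (table, column, and script names below are placeholders) and runs the resize script one job at a time, regardless of whether anyone has the page open.

import subprocess
import time

import pymysql  # assumption: any MySQL driver would do here


def get_connection():
    # placeholder credentials for the MySQL database mentioned above
    return pymysql.connect(host="localhost", user="app", password="secret", database="resize")


def run_worker():
    while True:
        conn = get_connection()
        with conn.cursor() as cur:
            # pick the oldest pending job; api/run.js would only insert rows here
            cur.execute("SELECT id, image_path FROM jobs WHERE status = 'pending' ORDER BY id LIMIT 1")
            row = cur.fetchone()
            if row:
                job_id, image_path = row
                cur.execute("UPDATE jobs SET status = 'running' WHERE id = %s", (job_id,))
                conn.commit()
                subprocess.run(["python", "resize.py", image_path])  # the 3-minute script
                cur.execute("UPDATE jobs SET status = 'done' WHERE id = %s", (job_id,))
                conn.commit()
        conn.close()
        time.sleep(2)  # poll again shortly when the queue is empty


if __name__ == "__main__":
    run_worker()

With this layout, api/run.js only enqueues a row and reports status; the worker processes jobs strictly one after another.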
Before getting too far, I do not think that multiprocessing or Threading will work in this scenario.
I have written a Python worker that sends HTTP POST requests to gather data; when an item comes up that meets set conditions, it initiates a Selenium buyer. However, this script only handles a single item, and I need to be able to hit over 200 items at once with the same Python script, but with different parameters.
My plan is to create an online database that can be dynamically updated to reflect the up-to-date requested parameters, but I need to find a way to apply these values and run the same script over 200 times at once, 24/7.
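To illustrate that plan only (not a definitive implementation), here is a minimal sketch that pulls the parameter sets from some data source and launches a separate OS process of the same script for each one; fetch_parameter_sets and worker.py are placeholder names.

import subprocess


def fetch_parameter_sets():
    # hypothetical: read the up-to-date parameter rows from the online database
    return [
        {"item_id": "123", "max_price": "40"},
        {"item_id": "456", "max_price": "15"},
        # ...one entry per item, potentially 200+
    ]


def launch_workers():
    procs = []
    for params in fetch_parameter_sets():
        # each worker is a separate OS process running the same script
        # with its own arguments, so the runs happen side by side
        procs.append(subprocess.Popen(
            ["python", "worker.py", params["item_id"], params["max_price"]]
        ))
    return procs


if __name__ == "__main__":
    launch_workers()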
I created a scraper in Python that navigates a website. It pulls many links and then has to visit every link, pull the data, parse it, and store the result.
Is there an easy way to run that script distributed in the cloud (like AWS)?
Ideally, I would like something like this (it's probably more difficult, but just to give an idea):
run_in_the_cloud --number-of-instances 5 scraper.py
After the process is done, the instances are killed, so it does not cost more money.
I remember doing something similar with Hadoop and Java MapReduce a long time ago.
If you can put your scraper in a Docker image, it's relatively trivial to run and scale dockerized applications using AWS ECS Fargate: just create a task definition, point it at your container registry, then submit RunTask requests for however many instances you want. AWS Batch is another tool you could use to trivially parallelize container instances.
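For example, a rough sketch of kicking off several copies of the containerized scraper with boto3; the cluster name, task definition, and subnet are placeholders.

import boto3

ecs = boto3.client("ecs", region_name="us-east-1")


def run_scrapers(count):
    # RunTask starts at most 10 tasks per call, so batch larger requests
    remaining = count
    while remaining > 0:
        batch = min(remaining, 10)
        ecs.run_task(
            cluster="scraper-cluster",        # placeholder cluster name
            taskDefinition="scraper-task:1",  # placeholder task definition
            launchType="FARGATE",
            count=batch,
            networkConfiguration={
                "awsvpcConfiguration": {
                    "subnets": ["subnet-0123456789abcdef0"],  # placeholder subnet
                    "assignPublicIp": "ENABLED",
                }
            },
        )
        remaining -= batch


run_scrapers(5)

Because each task stops when scraper.py exits, the Fargate capacity goes away on its own, which matches the requirement of not paying once the run is done.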
I currently have a simple HTML page with three text input boxes and a button, running on Node.js. I am able to send values from the HTML page to the Python script as arguments (sys.argv) when executing the script through Node.js as a child process.
The Python script keeps generating values. Right now, I simply print those values, but can I grab these values and send them back to the webpage every second until the script stops running after about 3 minutes? If yes, how can I grab them?
I want to use Node.js because I want to use the pdfmake package (https://www.npmjs.com/package/pdfmake) from npm to generate reports from those values.
If you're already executing the Python script as a child process, simply capture the STDOUT stream from the process.
https://nodejs.org/api/child_process.html
https://nodejs.org/api/child_process.html#child_process_subprocess_stdout
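One thing worth checking on the Python side is output buffering: if the script never flushes, Node may only see the output in one burst at the end instead of once per second. A minimal sketch of the generating script, assuming one value per second (the computation itself is a placeholder):

import time


def next_value(step):
    # placeholder for whatever the real script computes
    return step * step


for step in range(180):  # roughly 3 minutes at one value per second
    print(next_value(step), flush=True)  # flush so the parent's stdout stream sees it immediately
    time.sleep(1)

On the Node side you would then listen for data events on subprocess.stdout, as described in the links above, and forward each chunk to the webpage.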
I do not have comment privileges yet and this may not be an exact answer, but have you looked into WebSockets? It looks like you should be able to emit the data from the Python code to the port you are hosting your web page on.
This is a question about architecture. Say I have a long-running process on a server, such as a machine learning model in the middle of training. Since this runs on an external machine, I would like a tool that lets me quickly check the results from time to time. So I thought the best way would be a website that connects to the process, for example using RPC, to display the results, since this allows me to always check in. Now the question is how the Django view should gather the information from the server process:
1) Using RPC calls, such as rpyc, directly in the views?
2) Using some kind of messaging queue, such as Celery?
3) Or in a completely different way that I am not seeing?
There are at least two possible ways to do this.
1) Implement your data-refreshing function as a view and call it via AJAX plus a JavaScript timer. Since the page containing that JavaScript keeps polling, it will fetch your data silently and update the page. However, this solution does not work well when you need to record all the data at a given frequency; the AJAX call and view only execute while the web page is open.
2) Use a messaging queue, as selcuk suggests. Alongside Celery, APScheduler is also a good choice because it's easier to install and use. You can implement a task queue (as a model) with a status field (queued/done/stopped/whatever), check the tasks at whatever frequency you want, save the data you retrieve, and do all the other stuff from there.
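A minimal sketch of the APScheduler variant, where fetch_results stands in for whatever RPC or status check you run against the training process:

import time

from apscheduler.schedulers.background import BackgroundScheduler


def fetch_results():
    # placeholder: call the training process (e.g. over rpyc) and
    # save the returned metrics to your task/result model
    print("polled training process")


scheduler = BackgroundScheduler()
scheduler.add_job(fetch_results, "interval", minutes=5)  # check at whatever frequency you want
scheduler.start()

try:
    while True:
        time.sleep(1)
except (KeyboardInterrupt, SystemExit):
    scheduler.shutdown()

In a Django project you would typically start the scheduler from a management command or an AppConfig.ready hook rather than a standalone script like this.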
I have just started with Python, although I have been programming in other languages over the past 30 years. I wanted to keep my first application simple, so I started out with a little home automation project hosted on a Raspberry Pi.
I got my code to work fine (controlling a valve, reading a flow sensor and showing some data on a display), but when I wanted to add some web interactivity it came to a sudden halt.
Most articles I have found on the subject suggest using the Flask framework to compose dynamic web pages. I have tried, and understood, the basics of Flask, but I just can't get around the issue that Flask blocks once I call the app.run function. The rest of my Python code waits for Flask to return, which never happens, i.e. no more water flow measurement, valve motor steering or display updating.
So, my basic question would be: what tool should I use in order to serve a simple dynamic web page (with very low load, like 1 request per week) in parallel with my application's main tasks (GPIO/pulse counting)? All this in the resource-constrained environment of a Raspberry Pi (3).
If you still suggest Flask (because it seems very close to target), how should I arrange my code to keep handling the real-world events, such as mentioned above?
(This last part might be tough answering without seeing the actual code, but maybe it's possible answering it in a "generic" way? Or pointing to existing examples that I might have missed while searching.)
You're on the right track with multithreading. If your monitoring code runs in a loop, you could define a function like
def monitoring_loop():
    while True:
        # do the monitoring
Then, before you call app.run(), start a thread that runs that function:
import threading
from wherever import monitoring_loop

monitoring_thread = threading.Thread(target=monitoring_loop)
monitoring_thread.start()
# app.run() and whatever else you want to do
Don't join the thread - you want it to keep running in parallel to your Flask app. If you joined it, it would block the main execution thread until it finished, which would be never, since it's running a while True loop.
To communicate between the monitoring thread and the rest of the program, you could use a queue to pass messages in a thread-safe way between them.
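For example, a minimal sketch under those assumptions, where read_flow_sensor is a placeholder for the real GPIO/pulse-counting code:

import time
from queue import Queue
from threading import Thread

from flask import Flask, jsonify

app = Flask(__name__)
readings = Queue()  # thread-safe channel between the monitoring thread and Flask


def read_flow_sensor():
    # hypothetical placeholder for the real sensor/GPIO code
    return 0.0


def monitoring_loop():
    while True:
        readings.put(read_flow_sensor())  # hand each reading over to the web side
        time.sleep(1)


@app.route("/flow")
def flow():
    # drain whatever readings have accumulated since the last request
    values = []
    while not readings.empty():
        values.append(readings.get())
    return jsonify(values)


if __name__ == "__main__":
    Thread(target=monitoring_loop).start()
    app.run()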
The way I would probably handle this is to split your program into two distinct, separately running programs.
One program handles the GPIO monitoring and communication, and the other program is your small Flask server. Since they run as separate processes, they won't block each other.
You can have the two processes communicate through a small database. The GPIO interface can periodically record flow measurements or other relevant data to a table in the database. It can also monitor another table in the database that might serve as a queue for requests.
Your Flask instance can query that same database to get the current statistics to return to the user, and can submit entries to the requests queue based on user input. (If the GPIO process updates that requests queue with the current status, the Flask process can report that back out.)
And as far as what kind of database to use on a little Raspberry Pi, consider sqlite3, which is a very small, lightweight, file-based database that is well supported in Python's standard library. (It doesn't require running a full "database server" process.)
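A minimal sketch of the shared-database idea; the file path, table names, and columns are just placeholders:

import sqlite3

# both processes open the same database file (path is just an example)
conn = sqlite3.connect("/home/pi/automation.db")
conn.execute("CREATE TABLE IF NOT EXISTS measurements (ts TEXT, flow REAL)")
conn.execute("CREATE TABLE IF NOT EXISTS requests (id INTEGER PRIMARY KEY, action TEXT, status TEXT)")
conn.commit()

# GPIO process: record a reading
flow_value = 12.5  # example value from the flow sensor
conn.execute("INSERT INTO measurements VALUES (datetime('now'), ?)", (flow_value,))
conn.commit()

# Flask process: fetch the latest readings to show on the page
rows = conn.execute("SELECT ts, flow FROM measurements ORDER BY ts DESC LIMIT 10").fetchall()
print(rows)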
Good luck with your project, it sounds like fun!
Hi, I was trying the connection with dronekit_sitl and I got the same issue: after 30 seconds the connection was closed. To get rid of that, there are two solutions:
Use the before_request decorator: here you define a method that will handle the connection before each request.
Use the before_first_request decorator: in this case the connection is made once, when the first request is called, and you can then handle the object in the other routes using a global variable.
For more information: https://pythonise.com/series/learning-flask/python-before-after-request
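A minimal sketch of the second option, using a global variable set in before_first_request; make_connection is a placeholder for the real dronekit/SITL connect call (note that recent Flask releases have deprecated before_first_request):

from flask import Flask, jsonify

app = Flask(__name__)
vehicle = None  # global handle shared by all routes


def make_connection():
    # hypothetical helper: replace with your dronekit/SITL connect call
    return object()


@app.before_first_request
def connect_once():
    # runs exactly once, when the first request arrives, and keeps the object alive
    global vehicle
    vehicle = make_connection()


@app.route("/status")
def status():
    # every other route can reuse the same connection through the global
    return jsonify(connected=vehicle is not None)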