I'm looking for the best way to invoke a Python script on a remote server, poll its status, and fetch its output. It is preferable for me to have this implementation in Python as well. The script takes a very long time to execute, and I want to check intermittently which actions are being performed. Any suggestions?
There are many options. 'Polling' in a tight loop is generally a bad idea, as it keeps the CPU busy for no reason.
You could have your script send you status changes.
You could have your script write its current status into a (remote) file (either overwriting it or appending to a log file) and then look into that file. This is probably the easiest way. You can monitor the file with tail -f file over the link; a minimal sketch follows below.
And there are many more - and more complicated - options.
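For example, a minimal sketch of that approach (the file names, host, and the use of paramiko for the SSH side are all just assumptions, not something from your setup):

# on the remote server: the long-running script appends status lines to a log file
import datetime

STATUS_FILE = "/tmp/long_task_status.log"   # hypothetical path

def report(msg):
    with open(STATUS_FILE, "a") as f:
        f.write(datetime.datetime.now().isoformat() + " " + msg + "\n")

report("step 1: loading data")
# ... long-running work ...
report("step 2: processing")
# ... more work ...
report("done")

# on your machine: follow that file over SSH from Python (assumes paramiko is installed)
import paramiko

client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect("remote.example.com", username="me")            # hypothetical host/user
_, stdout, _ = client.exec_command("tail -f " + STATUS_FILE)
for line in stdout:                                            # blocks until a new status line arrives
    print(line, end="")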
I have an Apache-based PHP website that:
Accepts a document uploaded by a user (typically a text file)
Runs a Python script over the uploaded file using shell_exec (this produces a text file output locally on disk); the PHP code that runs the shell_exec looks something like $result = shell_exec('python3 memory_hog_script.py user_uploaded_text_file.txt');
Shows the text output to the user
The Python script executed by PHP's shell_exec typically takes 5-10 seconds to run, but uses open source libraries that take up a lot of memory/CPU (e.g. 50-70%) while it is running.
More commonly now, I have multiple users uploading a file at the exact same time. When a handful of users upload a file at the same time, my CPU gets overloaded since a separate instance of the heavy memory hogging python file gets executed via shell_exec for each user. When this happens, the server crashes.
I'm trying to figure out what the best way is to handle this, and I had a few ideas but wasn't sure which one is the best approach / is feasible:
Before shell_exec gets run by PHP, check CPU usage and see if it is greater than a certain threshold (e.g. 70%); if so, wait 10 seconds and try again
Alternatively, I can check the database for current active uploads; if there are more than a certain threshold (e.g. 5 uploads), then do something
Create some sort of queueing system; I'm not sure what my options are here, but I imagine there is a way to separate the shell_exec out into some kind of managed queue - I'm very happy to look into new technologies here, so if there are any approaches you can introduce me to, I'd love to dig deeper!
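For illustration, here's a rough sketch of the kind of worker I imagine for option 3 (a simple file-based queue; all paths and names are made up, and I haven't tested this):

# worker.py - a single consumer that runs the heavy script on one file at a time
import os
import subprocess
import time

QUEUE_DIR = "/var/spool/myapp/queue"   # PHP would move uploads here instead of calling shell_exec
DONE_DIR = "/var/spool/myapp/done"     # the web page polls this directory for the result

while True:
    jobs = sorted(os.listdir(QUEUE_DIR))
    if not jobs:
        time.sleep(1)
        continue
    job = jobs[0]
    src = os.path.join(QUEUE_DIR, job)
    # only one instance of the memory-hogging script ever runs at a time
    result = subprocess.run(["python3", "memory_hog_script.py", src],
                            capture_output=True, text=True)
    with open(os.path.join(DONE_DIR, job + ".out"), "w") as out:
        out.write(result.stdout)
    os.remove(src)

The PHP side would then just drop the uploaded file into the queue directory and poll for the matching .out file (or a database row) instead of blocking on shell_exec.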
Any input would be much appreciated! Thank you :)
I am creating a test automation which uses an application without any interfaces. However, the application calls a batch script when it changes modes, so I am able to catch the mode transitions.
What I want to do is get the batch script to pass an input to my Python script (I have a state machine running in Python) at runtime, so that I can monitor the state of the application with Python instead of the batch file.
I am using a state machine similar to the one by Karn Saheb:
https://dev.to/karn/building-a-simple-state-machine-in-python
However, instead of changing states statically like:
device.on_event('event')
I want the python script to do something similar to:
while True:
    device.on_event(input())  # where the input is passed from the batch script:
REM state.bat
set CurrentState=%1
"magic code to pass CurrentState to python input()" %CurrentState%
I see that a solution would be to start the python script from the batch file every time it is called with the "event" and then save the current event in another file upon termination of the python script... But I want to avoid such handling and rather evaluate this during runtime.
Thank you in advance!
A reasonably portable way of doing this without ugly polling on temporary files is to use a socket: have the main process listen and have the batch file(s) start a small program that connects to the server and writes a message.
There are security considerations here: you can start by listening only to the loopback interface, with further authentication if the local machine should not be trusted.
If you have more than one of these processes, or if you need to handle the child dying before it issues its next report, you’ll have to use threads or something like select to unify the news from different input channels (e.g., waiting on the child to exit vs. waiting on news from the next batch file).
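A minimal sketch of that idea (the port number and the helper script name are arbitrary): the state-machine process listens on the loopback interface, and the batch file calls a tiny sender script to deliver the event:

# listener side, inside the state-machine script
import socket

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(("127.0.0.1", 5555))         # loopback only
server.listen(5)

while True:
    conn, _ = server.accept()
    event = conn.recv(1024).decode().strip()
    conn.close()
    device.on_event(event)               # device is the state machine from the question

# sender side: send_event.py, invoked from state.bat as
#   python send_event.py %CurrentState%
import socket
import sys

with socket.create_connection(("127.0.0.1", 5555)) as s:
    s.sendall(sys.argv[1].encode())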
I have a Python script running on a server through SSH with the following command:
nohup python3 python_script.py >> output.txt
It was running for a very long time and (probably) created useful output, so I want to force it to stop now. How do I make it write the output it has produced so far into the output.txt file?
The file was automatically created when I started running the script, but the size is zero (nothing has been written in it so far).
As Robert said in his comment, check that the output you are expecting to go to the file is actually making it there and not going to stderr instead. If the process is already running and has been for a long time without any response or any writes to your output file, I think there are 3 options:
It is generating output but it's not going where you are expecting (Robert's response)
It is generating output but it's buffered and for some reason has yet to be flushed
It hasn't generated any output
Option 3 is easy: wait longer. Options 1 & 2 are a little bit tricky. If you are expecting a significant amount of output from this process, you could check the memory consumption of the python instance running your script and see if it's growing or very large. Also you could use lsof to see if it has the file open and to get some idea what it's doing with it.
If you find that your output appears to be going somewhere like /dev/null, take a look at this answer on redirecting output for an existing process.
In the unlikely event that you have a huge buffer that hasn't been flushed, you could try using ps to get the PID and then use kill -STOP [PID] to pause the process and see where you can get using GDB.
Unless it would be extremely painful, I would probably just restart the whole thing, but add periodic flushing to the script, and maybe some extra status reporting so you can tell what is going on. It might help too if you could post your code, since there may be other options available in your situation depending on how the program is written.
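If you do end up restarting it, adding the flushing is straightforward; for example:

import sys

# simplest: flush on every status line
print("finished step 1", flush=True)

# or flush explicitly after a batch of writes
sys.stdout.write("processed 1000 records\n")
sys.stdout.flush()

You can also start the script unbuffered (nohup python3 -u python_script.py >> output.txt &) so every print reaches output.txt immediately.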
Python 2.6
My script needs to monitor some 1 GB files on an FTP server; whenever one of them is changed/modified, the script will download it to another place. The file names will remain unchanged: people will delete the original file on the FTP server first, then upload a newer version. My script will check file metadata like file size and date modified to see if there is any difference.
The question is: when the script checks the metadata, the new file may still be being uploaded. How do I handle this situation? Is there any file attribute that indicates upload status (like the file being locked)? Thanks.
There is no such attribute. You may be unable to GET such a file, but that depends on the server software. Also, file access flags may be set one way while the file is being uploaded and then changed when the upload is complete; or the incomplete file may have a modified name (e.g. original_filename.ext.part) -- it all depends on the server-side software used for the upload.
If you control the server, make your own metadata, e.g. create an empty flag file alongside the newly uploaded file when upload is finished.
In the general case, I'm afraid, the best you can do is monitor file size and consider the file completely uploaded if its size is not changing for a while. Make this interval sufficiently large (on the order of minutes).
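A minimal sketch of that size-stability check using ftplib (host, credentials, file name, and timing are placeholders):

from ftplib import FTP
import time

def wait_until_stable(host, user, password, path, interval=120, checks=3):
    # return once the reported size has stayed the same for `checks` polls in a row
    last = None
    stable = 0
    while stable < checks:
        ftp = FTP(host)
        ftp.login(user, password)
        ftp.voidcmd("TYPE I")              # SIZE is only reliable in binary mode on many servers
        size = ftp.size(path)
        ftp.quit()
        if size is not None and size == last:
            stable += 1
        else:
            stable = 0
        last = size
        time.sleep(interval)

wait_until_stable("ftp.example.com", "user", "secret", "big_file.bin")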
Your question leaves out a few details, but I'll try to answer.
If you're running your status-checker program on the same server that's running the FTP service:
1) Depending on your operating system: if you're using Linux and inotify is built into your kernel, you could use pyinotify to watch your upload directory -- inotify distinguishes between open, modify, and close events and lets you watch filesystem events asynchronously, so you're not polling constantly (a minimal sketch follows after point 2). OSX and Windows both have similar but differently implemented facilities.
2) You could implement a tail -f equivalent in Python on the FTP server's log to see when a new file is put on the server (if you're even logging that) and update when you see the related log messages.
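For option 1, a minimal pyinotify sketch (the watch path is a placeholder; IN_CLOSE_WRITE fires when the uploading process closes the file):

import pyinotify

class UploadHandler(pyinotify.ProcessEvent):
    def process_IN_CLOSE_WRITE(self, event):
        # the writer has closed the file, so the upload is most likely complete
        print("finished upload:", event.pathname)

wm = pyinotify.WatchManager()
wm.add_watch("/srv/ftp/uploads", pyinotify.IN_CLOSE_WRITE)   # placeholder upload directory
notifier = pyinotify.Notifier(wm, UploadHandler())
notifier.loop()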
If you're running your program remotely:
3) If your status-checking utility has to run on a host other than the FTP server, you'd have to poll the file for status and build in some logic to detect size changes. You can use the FTP 'SIZE' command for this, which returns an easily parseable string.
You'd have to put some logic into it such that if the file size gets smaller you assume it's being replaced, then wait for it to grow again until it stops growing and stays the same size for some duration. If the archive format lets you verify a checksum, you could then download it, verify the checksum, and re-upload it to the remote site.
I am working on a Django web application.
A function 'xyz' (it updates a variable) needs to be called every 2 minutes.
I want one HTTP request to start the daemon, which keeps calling xyz (every 2 minutes) until I send another HTTP request to stop it.
Appreciate your ideas.
Thanks
Vishal Rana
There are a number of ways to achieve this. Assuming you have the necessary server access, I would write a Python script that calls function xyz "outside" of your Django directory (though importing the necessary stuff) and that only runs if /var/run/django-stuff/my-daemon.run exists. Get cron to run this every two minutes.
Then, for your Django functions, your start function creates the above-mentioned file if it doesn't already exist and the stop function deletes it.
As I say, there are other ways to achieve this. You could have a Python script in a loop, sleeping for approximately 2 minutes each pass... etc. In either case, you're up against the fact that two Python scripts running in two different invocations of CPython (no idea if this is the case with mod_wsgi) cannot share state directly, so IPC between Python scripts is not simple: you need to use some sort of formal IPC (like semaphores, files, etc.) rather than just common variables (which won't work). A minimal sketch of the flag-file approach follows below.
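A sketch of that arrangement (paths, module names, and the crontab line are just examples):

# crontab entry (every two minutes):
#   */2 * * * * /usr/bin/python3 /opt/myproject/run_xyz.py

# run_xyz.py - only does anything while the flag file exists
import os

FLAG = "/var/run/django-stuff/my-daemon.run"

if os.path.exists(FLAG):
    from myproject.tasks import xyz   # hypothetical import; configure Django settings first if xyz needs the ORM
    xyz()

# views.py - the start/stop views just create or remove the flag file
import os
from django.http import HttpResponse

FLAG = "/var/run/django-stuff/my-daemon.run"

def start(request):
    open(FLAG, "a").close()
    return HttpResponse("started")

def stop(request):
    if os.path.exists(FLAG):
        os.remove(FLAG)
    return HttpResponse("stopped")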
Probably a little hacky, but you could try this:
Set up a crontab entry that runs a script every two minutes. This script will check for some sort of flag (file existence, contents of a file, etc.) on the disk to decide whether to run a given python module. The problem with this is it could take up to 1:59 to run the function the first time after it is started.
I think if you started a daemon in the view function it would keep the httpd worker process (and the connection) alive unless you figure out how to close the connection without terminating the Django view function. This could be very bad if you want to be able to do this in parallel for different users. Also, to kill the function this way, you would have to somehow know which python and/or httpd process to kill later so you don't kill all of them.
The real way to do it would be to code an actual daemon in whatever language and just make a system call to "/etc/init.d/daemon_name start" and "... stop" in the Django views. For this, you need to make sure your web server user has permission to execute the daemon.
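If you go that route, the Django side is just a pair of views making the system call (daemon_name is a placeholder, and giving the web server user permission to run it is up to you):

import subprocess
from django.http import HttpResponse

def start_daemon(request):
    subprocess.call(["/etc/init.d/daemon_name", "start"])   # placeholder init script
    return HttpResponse("daemon started")

def stop_daemon(request):
    subprocess.call(["/etc/init.d/daemon_name", "stop"])
    return HttpResponse("daemon stopped")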
If the easy solutions (loop in a script, crontab signaled by a temp file) are too fragile for your intended usage, you could use Twisted facilities for process handling and scheduling and networking. Your Django app (using a Twisted client) would simply communicate via TCP (locally) with the Twisted server.
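A rough sketch of what the Twisted side might look like (the port and message format are arbitrary): a LoopingCall runs xyz every two minutes, and a small TCP protocol starts or stops it when the Django views send "start" or "stop":

from twisted.internet import protocol, reactor, task

def xyz():
    print("updating the variable")         # placeholder for the real work

loop = task.LoopingCall(xyz)

class Control(protocol.Protocol):
    def dataReceived(self, data):
        cmd = data.strip().lower()
        if cmd == b"start" and not loop.running:
            loop.start(120)                 # run xyz immediately, then every 2 minutes
        elif cmd == b"stop" and loop.running:
            loop.stop()
        self.transport.loseConnection()

factory = protocol.Factory()
factory.protocol = Control
reactor.listenTCP(9999, factory, interface="127.0.0.1")   # accept local connections only
reactor.run()

The Django views would then just open a socket to 127.0.0.1:9999 and send b"start" or b"stop".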