change process state with python - python

I want to search for a process and show it,emacs for example,I use
`p = subprocess.Popen('ps -A | grep emacs',shell=True,stdout=subprocess.PIPE)`
to get the process, then how can I wake it up and show it?
in other words,the question shoud be : how python change the state of process?

In short, python has a pty module and look for the solution there.
This question is not that simple as it may look like.
It is simple to change the running state of a process by delivering corresponding signals but it is not simple to manipulate foreground/background properties.
When we talk about manipulating the foreground/background processes, we really talk about 'job control.' In UNIX environment, job control is achieved by coordination of several parts, including kernel, controlling terminal, current shell and almost every process invoked in that shell session. Telling a process to get back to foreground, you have to tell others to shut up and go to background simultaneously. See?
Let's come back to your question. There could be 2 answers to this, one is no way and the other is it could be done but how.
Why 2 answers?
Generally you cannot have job control unless you program for it. You also cannot use a simple pipe to achieve the coordination model which leads to job control mechanism; the reason is essential since you cannot deliver signals through a pipe. That's why the answer is no way, at least no way in a simple pipe implementation.
However, if you have enough patience to program terminal I/O, it still can be done with a lot of labor work. Concept is simple: you cheat your slave program, which is emacs in this example, that it has been attached to a real terminal having a true keyboard and a solid monitor standby, and you prepare your master program, which is the python script, to handle and relay necessary events from its controlling terminal to the slave's pseudo-terminal.
This schema is actually adopted by many terminal emulators. You just need to write another terminal emulator in your case... Wait! Does it have to be done with so much effort, always?
Luckily no.
Your shell manages all the stuff for you in an interactive scenario. You just tell shell to 'fg/bg' the task, quite easy in real life. The designated command combination can be found in shell's manpage. It could look like 'jobs -l | grep emacs' along with 'fg %1'. Nonetheless those combined commands cannot be invoked by a program. It's a different story since a program will start a new shell to interpret its commands and such a new shell cannot control the old running emacs because it doesn't have the privilege. Type it in with your keyboard and read it out on your monitor; that's an interactive scenario.
In an automation scenario, think twice before you employ a job control design because most automation scenarios do not require a job control schema. You need an editor here and a player there, that's all right, but just don't make them to stay "background" and pop to "foreground." They'd better exit when they complete their task.
But if you are unlucky to have to program job control in automation procedures, try to program pseudo-terminal master and slave I/O as well. They look like a sophisticated IPC mechanism and their details are OS-dependent. However this is the standard answer to your question; though annoying, I know.

you can get the output generated by this process, reading the stdout descriptor:
out = p.stdout.read()

Related

Starting process in Google Colab with Prefix "!" vs. "subprocess.Popen(..)"

I've been using Google Colab for a few weeks now and I've been wondering what the difference is between the two following commands (for example):
!ffmpeg ...
subprocess.Popen(['ffmpeg', ...
I was wondering because I ran into some issues when I started either of the commands above and then tried to stop execution midway. Both of them cancel on KeyboardInterrupt but I noticed that after that the runtime needs a factory reset because it somehow got stuck. Checking ps aux in the Linux console listed a process [ffmpeg] <defunct> which somehow still was running or at least blocking some ressources as it seemed.
I then did some research and came across some similar posts asking questions on how to terminate a subprocess correctly (1, 2, 3). Based on those posts I generally came to the conclusion that using the subprocess.Popen(..) variant obviously provides more flexibility when it comes to handling the subprocess: Defining different stdout procedures or reacting to different returncode etc. But I'm still unsure on what the first command above using the ! as prefix exactly does under the hood.
Using the first command is much easier and requires way less code to start this process. And assuming I don't need a lot of logic handling the process flow it would be a nice way to execute something like ffmpeg - if I were able to terminate it as expected. Even following the answers from the other posts using the 2nd command never got me to a point where I could terminate the process fully once started (even when using shell=False, process.kill() or process.wait() etc.). This got me frustrated, because restarting and re-initializing the Colab instance itself can take several minutes every time.
So, finally, I'd like to understand in more general terms what the difference is and was hoping that someone could enlighten me. Thanks!
! commands are executed by the notebook (or more specifically by the ipython interpreter), and are not valid Python commands. If the code you are writing needs to work outside of the notebook environment, you cannot use ! commands.
As you correctly note, you are unable to interact with the subprocess you launch via !; so it's also less flexible than an explicit subprocess call, though similar in this regard to subprocess.call
Like the documentation mentions, you should generally avoid the bare subprocess.Popen unless you specifically need the detailed flexibility it offers, at the price of having to duplicate the higher-level functionality which subprocess.run et al. already implement. The code to run a command and wait for it to finish is simply
subprocess.check_call(['ffmpeg', ... ])
with variations for capturing its output (check_output) and the more modern run which can easily replace all three of the legacy high-level calls, albeit with some added verbosity.

Python Communicate/Wait with a shell subprocess

Tried searching for the solution to this problem but due to there being a command Shell=True (don't think that is related to what I'm doing but I could well be wrong) it get's lots of hits that aren't seemingly useful.
Ok so the problem I is basically:
I'm running a Python script on a cluster. On the cluster the normal thing to do is to launch all codes/etc. via a shell script which is used to request the appropriate resources (maximum run time, nodes, processors per node, etc.) needed to run the job. This shell then calls the script and away it goes.
This isn't an issue, but the problem I have is my 'parent' code needs to wait for it's 'children' to run fully (and generate their data to be used by the parent) before continuing. This isn't a problem when I don't have the shell between it and the script but as it stands .communicate() and .wait() are 'satisfied' when the shell script is done. I need to to wait until the script(s) called by the shell are done.
I could botch it by putting a while loop in that needs certain files to exist before breaking, but this seems messy to me.
So my question is, is there a way I can get .communicate (idealy) or .wait or via some other (clean/nice) method to pause the parent code until the shell, and everything called by the shell, finishes running? Ideally (nearly essential tbh) is that this be done in the parent code alone.
I might not be explaining this very well so happy to provide more details if needed, and if somewhere else answers this I'm sorry, just point me thata way!

subprocess.call does not wait for the process to complete

Per Python documentation, subprocess.call should be blocking and wait for the subprocess to complete. In this code I am trying to convert few xls files to a new format by calling Libreoffice on command line. I assumed that the call to subprocess call is blocking but seems like I need to add an artificial delay after each call otherwise I miss few files in the out directory.
what am I doing wrong? and why do I need the delay?
from subprocess import call
for i in range(0,len(sorted_files)):
args = ['libreoffice', '-headless', '-convert-to',
'xls', "%s/%s.xls" %(sorted_files[i]['filename'],sorted_files[i]['filename']), '-outdir', 'out']
call(args)
var = raw_input("Enter something: ") # if comment this line I dont get all the files in out directory
EDIT It might be hard to find the answer through the comments below. I used unoconv for document conversion which is blocking and easy to work with from an script.
It's possible likely that libreoffice is implemented as some sort of daemon/intermediary process. The "daemon" will (effectively1) parse the commandline and then farm the work off to some other process, possibly detaching them so that it can exit immediately. (based on the -invisible option in the documentation I suspect strongly that this is indeed the case you have).
If this is the case, then your subprocess.call does do what it is advertised to do -- It waits for the daemon to complete before moving on. However, it doesn't do what you want which is to wait for all of the work to be completed. The only option you have in that scenario is to look to see if the daemon has a -wait option or similar.
1It is likely that we don't have an actual daemon here, only something which behaves similarly. See comments by abernert
The problem is that the soffice command-line tool (which libreoffice is either just a link to, or a further wrapper around) is just a "controller" for the real program soffice.bin. It finds a running copy of soffice.bin and/or creates on, tells it to do some work, and then quits.
So, call is doing exactly the right thing: it waits for libreoffice to quit.
But you don't want to wait for libreoffice to quit, you want to wait for soffice.bin to finish doing the work that libreoffice asked it to do.
It looks like what you're trying to do isn't possible to do directly. But it's possible to do indirectly.
The docs say that headless mode:
… allows using the application without user interface.
This special mode can be used when the application is controlled by external clients via the API.
In other words, the app doesn't quit after running some UNO strings/doing some conversions/whatever else you specify on the command line, it sits around waiting for more UNO commands from outside, while the launcher just runs as soon as it sends the appropriate commands to the app.
You probably have to use that above-mentioned external control API (UNO) directly.
See Scripting LibreOffice for the basics (although there's more info there about internal scripting than external), and the API documentation for details and examples.
But there may be an even simpler answer: unoconv is a simple command-line tool written using the UNO API that does exactly what you want. It starts up LibreOffice if necessary, sends it some commands, waits for the results, and then quits. So if you just use unoconv instead of libreoffice, call is all you need.
Also notice that unoconv is written in Python, and is designed to be used as a module. If you just import it, you can write your own (simpler, and use-case-specific) code to replace the "Main entrance" code, and not use subprocess at all. (Or, of course, you can tear apart the module and use the relevant code yourself, or just use it as a very nice piece of sample code for using UNO from Python.)
Also, the unoconv page linked above lists a variety of other similar tools, some that work via UNO and some that don't, so if it doesn't work for you, try the others.
If nothing else works, you could consider, e.g., creating a sentinel file and using a filesystem watch, so at least you'll be able to detect exactly when it's finished its work, instead of having to guess at a timeout. But that's a real last-ditch workaround that you shouldn't even consider until eliminating all of the other options.
If libreoffice is being using an intermediary (daemon) as mentioned by #mgilson, then one solution is to find out what program it's invoking, and then directly invoke it yourself.

Is there a simple way to launch a background task from a Python CGI script without waiting around for it to terminate?

In Windows, that is.
I think the answer to this question is that I need to create a Windows service. This seems ludicrously heavyweight for what I am trying to do.
I'm just trying to slap together a little prototype here for my manager, I'm not going to be responsible for productizing it... in fact, it may never even BE productized; it might just be something that a few researchers play around with.
I have a CGI script that receives a file for upload, stores it to a temporary location, then launches a background process to do some serious number-crunching on the file. Then some Javascript stuff sits around calling other CGI scripts to check on the status and update the page as needed.
All of this works, except the damn web server won't close the connection as long as the subrocess is running. I've done some searching, and it appears the answer on Unix is to make it a daemon, but I'm stuck on Windows right now and I guess the answer there is to make it a Windows service?!? This seems incredibly heavyweight to just, you know, launch a damn process and then close the server connection.
That's really the only way?
Edit: Okay, found a nifty little hack over here (the choice (3) that the guy gives):
How to completely background a process in Perl CGI under IIS
I was able to modify this to make it even simpler, and although this is a klugey solution, it is perfect for the quick-and-dirty little prototype I am trying to make.
So I initially had my main script doing this:
subprocess.Popen("python.exe","myscript.py","arg1","arg2")
Which doesn't work, as I've described. Instead, I now have my main script emit this little bit of Javascript which runs after the document is fully loaded:
$("#somecrap").load("launchBackgroundProcess.py", {arg1:"foo",arg2:"bar"});
And then launchBackgroundProcess.py does the subprocess.Popen.
This solution would never scale, since it still leaves the browser connection open during the entire time the background task is running. But since this little thinger I am whipping up might someday have two simultaneous users at most (even then I doubt it) resources are not a concern. This allows the user to see the main page and get the Javascript updates even though there is still an http connection hanging open for no good reason.
Thanks for the answers! If I'm ever asked to productize this, I'll take at the resources Profane recommends.
If you haven't much experience with windows programming and don't wish to peruse the MSDN docs-- I don't blame you-- you may want to try to pick up a copy of Mark Hammond's cannonical guide to all things python and windows. It somehow never goes out-of-date on many of these sorts of recurring questions. Instead of launching the process with the every-platform solution, you'd probably be better off using the win32process module. Chapter 17 of the Hammond book covers this extensively, but you could probably get all you need by downloading the pywin ide (I think it comes bundled in the windows extensions which you can download from pypi), and looking through the help docs it has on python's windows' api. Here's an example of using the api, from a project I was working on recently. It may in fact do some of what you want with a little adaptation. You'd probably want to focus on CreationFlags. In particular, win32process.DETACHED_PROCESS is "often used to execute console programs in the background." Many other flags are available and conveniently wrapped however.
if subprocess.mswindows:
su=subprocess.STARTUPINFO()
su.dwFlags |= subprocess._subprocess.STARTF_USESHOWWINDOW
process = subprocess.Popen(['program', 'flag', 'flag2'], bufsize=-1,
stdout=subprocess.PIPE, startupinfo=su)
Simplest, but not most efficient way would be to just run another python executable
from subprocess import Popen
Popen("python somescript.py")
You can just use a system call using the "start" windows command. This way your python script will not wait for the completion of the started program.
CGI scripts are run with standard output redirected, either directly to the TCP socket or to a pipe. Typically, the connection won't close until the handle, and all copies of it, are closed. By default, the subprocess will inherit a copy of the handle.
There are two ways to prevent the connection from waiting on the subprocess. One is to prevent the subprocess from inheriting the handle, the other is for the subprocess to close its copy of the handle when it starts.
If the subprocess is in Perl, I think you could close the handle very simply:
close(STDOUT);
If you want to prevent the subprocess from inheriting the handle, you could use the SetHandleInformation function (if you have access to the Win32 API) or set bInheritHandles to FALSE in the call to CreateProcess. Alternatively, close the handle before launching the subprocess.

Lock down a program so it has no access to outside files, like a virus scanner does

I would like to launch an untrusted application programmatically, so I want to remove the program's ability to access files, network, etc. Essentially, I want to restrict it so its only interface to the rest of the computer is stdin and stdout.
Can I do that? Preferably in a cross-platform way but I sort of expect to have to do it differently for each OS. I'm using Python, but I'm willing to write this part in a lower level or more platform integrated language if necessary.
The reason I need to do this is to write a distributed computing infrastructure. It needs to download a program, execute it, piping data to stdin, and returning data that it receives on stdout to the central server. But since the program it downloads is untrusted, I want to restrict it to only using stdin and stdout.
The short answer is no.
The long answer is not really. Consider a C program, in which the program opens a log file by grabbing the next available file descriptor. Your program, in order to stop this, would need to somehow monitor this, and block it. Depending on the robustness of the untrusted program, this could cause a fatal crash, or inhibit harmless functionality. There are many other similar issues to this one that make what you are trying to do hard.
I would recommend looking into sandboxing solutions already available. In particular, a virtual machine can be very useful for testing out untrusted code. If you can't find anything that meets your needs, your best bet is to probably deal with this at the kernel level, or with something a bit closer to the hardware such as C.
Yes, you can do this. You can run an inferior process through ptrace (essentially you act as a debugger) and you hook on system calls and determine whether they should be allowed or not.
codepad.org does this for instance, see: about codepad. It uses the geordi supervisor to execute the untrusted code.
You can run untrusted apps in chroot and block them from using network with an iptables rule (for example, owner --uid-owner match)
But really, virtual machine is more reliable and on modern hardware performance impact is negligible.

Categories

Resources