I'm trying to port a shell script to the much more readable python version. The original shell script starts several processes (utilities, monitors, etc.) in the background with "&". How can I achieve the same effect in python? I'd like these processes not to die when the python scripts complete. I am sure it's related to the concept of a daemon somehow, but I couldn't find how to do this easily.
While jkp's solution works, the newer way of doing things (and the way the documentation recommends) is to use the subprocess module. For simple commands it's equivalent, but it offers more options if you want to do something complicated.
Example for your case:
import subprocess
subprocess.Popen(["rm","-r","some.file"])
This will run rm -r some.file in the background. Note that calling .communicate() on the object returned from Popen will block until it completes, so don't do that if you want it to run in the background:
import subprocess
ls_output=subprocess.Popen(["sleep", "30"])
ls_output.communicate() # Will block for 30 seconds
See the documentation here.
Also, a point of clarification: "Background" as you use it here is purely a shell concept; technically, what you mean is that you want to spawn a process without blocking while you wait for it to complete. However, I've used "background" here to refer to shell-background-like behavior.
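Since the original question asks for children that keep running after the Python script exits, here is a minimal sketch of one common way to do that on POSIX systems (my own addition, not part of the answer above; the command name is a placeholder): start the child in its own session so it is not tied to the script's terminal or process group.

import subprocess

# Hypothetical long-running command; replace with your own utility or monitor.
subprocess.Popen(
    ["some_long_running_command"],
    stdout=subprocess.DEVNULL,   # don't tie the child's output to our pipes
    stderr=subprocess.DEVNULL,
    start_new_session=True,      # POSIX only, Python 3.2+: run the child in a new session
)
# The Python script can now exit; the child keeps running.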
Note: This answer is less current than it was when posted in 2009. Using the subprocess module shown in the other answers is now what the documentation recommends:
(Note that the subprocess module provides more powerful facilities for spawning new processes and retrieving their results; using that module is preferable to using these functions.)
If you want your process to start in the background you can either use system() and call it in the same way your shell script did, or you can spawn it:
import os
os.spawnl(os.P_DETACH, 'some_long_running_command')
(Note that os.P_DETACH is only available on Windows; on other platforms use the more portable os.P_NOWAIT flag.)
See the documentation here.
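For a rough idea of what the more portable variant looks like on a POSIX system (a sketch of mine, not from the original answer), os.spawnlp searches PATH and P_NOWAIT returns the child's PID immediately instead of waiting:

import os

# Start "sleep 30" without waiting for it; spawnlp looks the program up on PATH.
pid = os.spawnlp(os.P_NOWAIT, 'sleep', 'sleep', '30')
print('spawned child with pid', pid)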
You probably want the answer to "How to call an external command in Python".
The simplest approach is to use the os.system function, e.g.:
import os
os.system("some_command &")
Basically, whatever you pass to the system function will be executed the same as if you'd passed it to the shell in a script.
I found this here:
On Windows (XP), the parent process will not finish until longtask.py has finished its work. That is not what you want in a CGI script. The problem is not specific to Python; the PHP community has the same problem.
The solution is to pass the DETACHED_PROCESS process creation flag to the underlying CreateProcess call in the Windows API. If you happen to have pywin32 installed you can import the flag from the win32process module; otherwise you should define it yourself:
import subprocess
import sys

DETACHED_PROCESS = 0x00000008
pid = subprocess.Popen([sys.executable, "longtask.py"],
                       creationflags=DETACHED_PROCESS).pid
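As a side note, on Python 3.7 and later the flag is also exposed directly as subprocess.DETACHED_PROCESS (Windows only), so a sketch without pywin32 or a hand-defined constant could look like this (longtask.py is the same hypothetical script as above):

import subprocess
import sys

# DETACHED_PROCESS is provided by the subprocess module itself on Python 3.7+ (Windows).
pid = subprocess.Popen([sys.executable, "longtask.py"],
                       creationflags=subprocess.DETACHED_PROCESS).pid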
Use subprocess.Popen() with the close_fds=True parameter, which will allow the spawned subprocess to be detached from the Python process itself and continue running even after Python exits.
https://gist.github.com/yinjimmy/d6ad0742d03d54518e9f
import os, time, sys, subprocess

if len(sys.argv) == 2:
    # child: pretend to do some work, then speak (the 'say' command is macOS only)
    time.sleep(5)
    print('track end')
    if sys.platform == 'darwin':
        subprocess.Popen(['say', 'hello'])
else:
    # parent: spawn a detached child running this same script, then exit
    print('main begin')
    subprocess.Popen(['python', os.path.realpath(__file__), '0'], close_fds=True)
    print('main end')
Capture the output and run in the background with threading
As mentioned in this answer, if you capture the output with stdout= and then try to read() it, the read blocks. However, there are cases where you need this. For example, I wanted to launch two processes that talk to each other over a port, and save their stdout both to a log file and to my own stdout.
The threading module allows us to do that.
First, have a look at how to do the output redirection part alone in this question: Python Popen: Write to stdout AND log file simultaneously
Then:
main.py
#!/usr/bin/env python3

import os
import subprocess
import sys
import threading

def output_reader(proc, file):
    while True:
        byte = proc.stdout.read(1)
        if byte:
            sys.stdout.buffer.write(byte)
            sys.stdout.flush()
            file.buffer.write(byte)
        else:
            break

with subprocess.Popen(['./sleep.py', '0'], stdout=subprocess.PIPE, stderr=subprocess.PIPE) as proc1, \
     subprocess.Popen(['./sleep.py', '10'], stdout=subprocess.PIPE, stderr=subprocess.PIPE) as proc2, \
     open('log1.log', 'w') as file1, \
     open('log2.log', 'w') as file2:
    t1 = threading.Thread(target=output_reader, args=(proc1, file1))
    t2 = threading.Thread(target=output_reader, args=(proc2, file2))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
sleep.py
#!/usr/bin/env python3

import sys
import time

for i in range(4):
    print(i + int(sys.argv[1]))
    sys.stdout.flush()
    time.sleep(0.5)
After running:
./main.py
stdout is updated every 0.5 seconds, two lines at a time, and ends up containing:
0
10
1
11
2
12
3
13
and each log file contains the respective log for a given process.
Inspired by: https://eli.thegreenplace.net/2017/interacting-with-a-long-running-child-process-in-python/
Tested on Ubuntu 18.04, Python 3.6.7.
You probably want to start investigating the os module for forking child processes (open an interactive session and issue help(os)). The relevant functions are fork and the exec family. To give you an idea of how to start, put something like the following in a function that performs the fork (the function needs to take a list or tuple 'args' as an argument, containing the program's name and its parameters; you may also want to define stdin, stdout and stderr for the new process):
try:
    pid = os.fork()
except OSError as e:
    ## some debug output
    sys.exit(1)

if pid == 0:
    ## eventually use os.putenv(..) to set environment variables
    ## os.execv expects the full argument list, including args[0], as the new argv
    os.execv(args[0], args)
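To make the shape of that concrete, here is a hedged sketch (the function name and the example command are mine) of wrapping the snippet above in such a function and calling it:

import os
import sys

def spawn(args):
    """Fork, then exec args[0] with args as its argument list; the parent returns at once."""
    try:
        pid = os.fork()
    except OSError as e:
        sys.stderr.write('fork failed: %s\n' % e)
        sys.exit(1)
    if pid == 0:                  # child
        os.execvp(args[0], args)  # execvp searches PATH; plain execv needs a full path
    return pid                    # parent

spawn(['sleep', '30'])            # the child keeps running after this script exits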
You can use

import os

pid = os.fork()
if pid == 0:
    # in the child process: continue with the other code here
    ...

The child process (where os.fork() returns 0) keeps running in the background even after the parent exits.
I haven't tried this yet, but using .pyw files instead of .py files should help. On Windows, .pyw files are run by pythonw.exe, which has no console, so in theory the script should not show a console window and should behave like a background process.
Related
From Python I am calling a Java program:
os.system("java -jar example.jar run myFunction 'inFile.txt' 'outFile.txt' " )
This program processes a file and writes its output into 'outFile.txt'. The output depends on the information in 'inFile.txt'. While the input file is being processed and the output file written, 'outFile.txt' sometimes grows too large (tens of GBs); at that point I want to quit the current processing and move on to process another inFile.txt.
Is there a way to know that the outFile.txt being written has grown beyond, say, 10 GB?
Edit:
As suggested by Maksym, I am using the following code and it seems to be working. Thanks
import os
import subprocess
from time import sleep

outFileName = 'outFile.txt'
p = subprocess.Popen(["java", "-jar", "example.jar", "run", "myFunction",
                      "inFile.txt", "outFile.txt"])
rc = p.poll()  # returncode; None while the process is still running
while rc is None:
    sleep(1)
    if os.path.getsize(outFileName) < 1000000000:
        rc = p.poll()
        continue
    else:
        p.kill()
        break
Have a look at the subprocess module. Using Popen you can fork a process and kill it when you need to:
import subprocess
from time import sleep

p = subprocess.Popen(["java", "-jar", "example.jar", "run", "myFunction",
                      "inFile.txt", "outFile.txt"])
while not check_my_conditions():
    sleep(my_timeout)
p.kill()
Then, you can rotate your files and restart the process.
Instead of directly calling os.system, you should strongly consider using the built-in multiprocessing.Process class. It handles spawned processes much more gracefully.
You need to watch the output file periodically, using something like os.stat to check its size. You can then kill the original process (or do whatever else you want) when the threshold is exceeded.
Does the Java application write any output (for example, a count of records processed) to stdout or stderr while it runs? If so, you could invoke it using Python's Popen class (in the subprocess module) and estimate when it has processed 'too much'.
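As a rough sketch of that idea (my own illustration, assuming the hypothetical Java tool prints one line per processed record; the record limit is made up):

import subprocess

proc = subprocess.Popen(
    ["java", "-jar", "example.jar", "run", "myFunction", "inFile.txt", "outFile.txt"],
    stdout=subprocess.PIPE, universal_newlines=True)
records = 0
for line in proc.stdout:   # read progress lines as the tool runs
    records += 1
    if records > 1000000:  # assumed "too much" threshold
        proc.kill()
        break
proc.wait()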
I am trying to detect, from within a Python script, when an installation program finishes executing. Specifically, the application is the Oracle 10gR2 Database. Currently I am using the subprocess module with Popen. Ideally, I would simply use the wait() method to wait for the installation to finish; however, the documented command actually spawns child processes to handle the actual installation. Here is a sample of the failing code:
import os
import subprocess

# DATABASE_10GR2_TMP_PATH and ORACLE_DATABASE_10GR2_SILENT_RESPONSE are defined elsewhere
OUI_DATABASE_10GR2_SUBPROCESS = ['sudo',
                                 '-u',
                                 'oracle',
                                 os.path.join(DATABASE_10GR2_TMP_PATH,
                                              'database',
                                              'runInstaller'),
                                 '-ignoreSysPrereqs',
                                 '-silent',
                                 '-noconfig',
                                 '-responseFile ' + ORACLE_DATABASE_10GR2_SILENT_RESPONSE]
oracle_subprocess = subprocess.Popen(OUI_DATABASE_10GR2_SUBPROCESS)
oracle_subprocess.wait()
There is a similar question here: Killing a subprocess including its children from python, but the selected answer does not address the children issue; instead it instructs the user to call the application to be waited for directly. I am looking for a specific solution that waits for all children of the subprocess. What if there is an unknown number of subprocesses? I will select the answer that addresses the issue of waiting for all child subprocesses to finish.
More clarity on the failure: the child processes continue executing after the wait() call, since wait() only waits for the top-level process (in this case 'sudo'). Here is a simple diagram of the known child processes in this problem:
Python subprocess module -> Sudo -> runInstaller -> java -> (unknown)
Ok, here is a trick that will work only under Unix. It is similar to one of the answers to this question: Ensuring subprocesses are dead on exiting Python program. The idea is to create a new process group. You can then wait for all processes in the group to terminate.
import os
import subprocess

pid = os.fork()
if pid == 0:
    os.setpgrp()  # the child becomes the leader of a new process group
    oracle_subprocess = subprocess.Popen(OUI_DATABASE_10GR2_SUBPROCESS)
    oracle_subprocess.wait()
    os._exit(0)
else:
    os.waitpid(-pid, 0)  # wait for a child in the new process group
I have not tested this. It creates an extra subprocess to be the leader of the process group, but avoiding that is (I think) quite a bit more complicated.
I found this web page to be helpful as well. http://code.activestate.com/recipes/278731-creating-a-daemon-the-python-way/
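If a third-party dependency is acceptable, another option (a best-effort sketch of mine, not from this thread) is the psutil package, which can enumerate a process's descendants and wait on them even though they are not your own children. It snapshots the descendants while the top-level process is still alive, then waits for whatever remains:

import subprocess
import time

import psutil  # third-party: pip install psutil

oracle_subprocess = subprocess.Popen(OUI_DATABASE_10GR2_SUBPROCESS)
parent = psutil.Process(oracle_subprocess.pid)

descendants = []
while oracle_subprocess.poll() is None:  # top-level 'sudo' still running
    try:
        descendants = parent.children(recursive=True)  # refresh the snapshot
    except psutil.NoSuchProcess:
        break
    time.sleep(1)

# The top-level process has exited; wait for the descendants we last saw.
psutil.wait_procs(descendants)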
You can just use os.waitpid with the pid set to -1; this waits on any child process of the current process until it finishes:
import os
import sys
import subprocess
proc = subprocess.Popen([sys.executable,
                         '-c',
                         'import subprocess;'
                         'subprocess.Popen("sleep 5", shell=True).wait()'])
pid, status = os.waitpid(-1, 0)
print(pid, status)
This is the output of pstree <pid>, showing the chain of forked subprocesses:
python───python───sh───sleep
Hope this can help :)
Check out the following link http://www.oracle-wiki.net/startdocsruninstaller which details a flag you can use for the runInstaller command.
This flag is definitely available for 11gR2, but I do not have a 10g database to try it with the runInstaller packaged with that version.
Regards
Everywhere I look seems to say it's not possible to solve this in the general case. I've whipped up a library called 'pidmon' that combines some answers for Windows and Linux and might do what you need.
I'm planning to clean this up and put it on github, possibly called 'pidmon' or something like that. I'll post a link if/when I get it up.
EDIT: It's available at http://github.com/dbarnett/python-pidmon.
I made a special waitpid function that accepts a graft_func argument so that you can loosely define what sort of processes you want to wait for when they're not direct children:
import pidmon
pidmon.waitpid(oracle_subprocess.pid, recursive=True,
               graft_func=(lambda p: p.name == '???' and p.parent.pid == ???))
or, as a shotgun approach, to just wait for any processes started since the call to waitpid to stop again, do:
import pidmon
pidmon.waitpid(oracle_subprocess.pid, graft_func=(lambda p: True))
Note that this is still barely tested on Windows, where it also seems very slow (but did I mention it's on github, where it's easy to fork?). This should at least get you started, and if it works at all for you, I have plenty of ideas on how to optimize it.
Prior to this, I ran the two commands in a for loop, like:
for x in $set:
    command
In order to save time, I want to run these two commands at the same time, like the parallel method in a makefile.
Thanks
Lyn
The threading module won't give you much performance-wise because of the Global Interpreter Lock.
I think the best way to do this is to use the subprocess module and open each command with its own stdout.
import select
import subprocess

processes = {}
for cmd in ['cmd1', 'cmd2', 'cmd3']:
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    processes[p.stdout] = p

while len(processes):
    rfds, _, _ = select.select(processes.keys(), [], [])
    for fd in rfds:
        process = processes[fd]
        print(fd.read())
        process.poll()  # refresh returncode
        if process.returncode is not None:
            print("Process {0} returned with code {1}".format(process.pid, process.returncode))
            del processes[fd]
You basically have to use select to see which file descriptors are ready, and you have to check each process's returncode to see whether the read you just did was the one that found it had exited. A process essentially sits in a wait state until its stdout is closed. If you would like to do other things while you're waiting, you can put a timeout on select.select() so you stop waiting after that long; if rfds comes back with length 0, you know the timeout happened.
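For example, a small sketch of the timeout variant, reusing the processes dict and the select import from the loop above:

# Wait at most 5 seconds for output; an empty rfds list means the timeout fired.
rfds, _, _ = select.select(processes.keys(), [], [], 5.0)
if not rfds:
    print("no output for 5 seconds, doing some other work...")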
The twisted or select modules are probably what you're after.
If all you want to do is run a bunch of batch commands, a shell script like this:
#!/bin/sh
for i in command1 command2 command3; do
    $i &
done
might work better. Alternatively, a Makefile as you said.
Look at the threading module.
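To make that pointer concrete, here is a minimal sketch (mine, with placeholder command names) that runs two commands at the same time by giving each its own thread around subprocess.run (Python 3.5+):

import subprocess
import threading

cmds = [['command1', 'arg1'], ['command2', 'arg2']]  # placeholders for your two commands
threads = [threading.Thread(target=subprocess.run, args=(cmd,)) for cmd in cmds]
for t in threads:
    t.start()  # both commands are now running in parallel
for t in threads:
    t.join()   # wait for both to finish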
I have some code that executes a Unix shell command in the background from Python:
import subprocess
process = subprocess.Popen('find / > tmp.txt &',shell=True)
I need to detect when the process has finished and whether it completed successfully.
Please explain with sample code.
Tazim
There is no need for the &: the command is launched in a separate process, and runs independently.
If you want to wait until the process terminates, use wait():
import subprocess

process = subprocess.Popen('find / > tmp.txt', shell=True)
exitcode = process.wait()
if exitcode == 0:
    pass  # successful completion
else:
    pass  # an error happened
If your program can do something meaningful in the meantime, you can use poll() to determine if the process has finished.
Furthermore, instead of writing the output to a temporary file and then reading it from your Python program, you can read from the pipe directly. See the subprocess documentation for details.
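A small sketch of that last point (my illustration, not the asker's code): read find's output from a pipe instead of a temporary file.

import subprocess

process = subprocess.Popen(['find', '/'], stdout=subprocess.PIPE, universal_newlines=True)
output, _ = process.communicate()  # blocks until find finishes
if process.returncode == 0:
    print('find produced', len(output.splitlines()), 'lines')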
Don't use shell=True. It is bad for your health.
proc = subprocess.Popen(['find', '/'], stdout=open('tmp.txt', 'w'))
if proc.wait() == 0:
    pass
If you really need a file, use import tempfile instead of a hard-coded temporary file name. If you don't need the file, use a pipe (see the subprocess docs as Thomas suggests).
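A quick sketch of the tempfile suggestion, for completeness (the file-handling details are mine):

import subprocess
import tempfile

# Let the OS pick a unique, safe file name instead of hard-coding tmp.txt.
with tempfile.NamedTemporaryFile(mode='w', delete=False, suffix='.txt') as tmp:
    subprocess.run(['find', '/'], stdout=tmp)
    print('results written to', tmp.name)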
Also, don't write shell scripts in Python; instead of shelling out to find, use the os.walk function.