I've got an Apache2/web2py server running using the wsgi handler functionality. Within one of the controllers, I am trying to run an external executable to perform some processing on 2 files.
My approach to this is to use the subprocess module to kick off the executable. I have simplified the code to a bare-bones implementation with little success.
from subprocess import *
p = Popen(("echo", "Hello"), shell=False)
ret = p.wait()
print "Process ended with status %s" % ret
When running the above code on its own (creating a new file and running it from the Python command line), it works exactly as expected.
However, as soon as I place the exact same code into my web2py controller, the external process stops working. Instead of the process returning with code 0 as is expected in the above example, it always returns -6 and "Hello" is not printed to stdout.
After doing some digging, I found that a negative result from p.wait() implies that a signal caused the process to end abnormally. And, according to some docs I found, -6 corresponds to the SIGABRT signal.
I would have expected this signal to be a result of some poorly executed code in my child process. However, since this is only running echo (and since it works outside of web2py) I have my doubts that the child process is signalling itself.
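As an illustration of how a negative return code maps to a signal name, here is a minimal sketch in Python 3 (the snippets above are Python 2). The helper name describe_status is hypothetical, and the mapping assumes a POSIX system where Popen reports -N for a child killed by signal N.

```python
import signal


def describe_status(ret):
    """Turn a Popen return code into a readable description (POSIX)."""
    if ret < 0:
        # A negative value -N means the child was terminated by signal N.
        return "killed by %s" % signal.Signals(-ret).name
    return "exited with code %d" % ret


if __name__ == "__main__":
    print(describe_status(-6))
    print(describe_status(0))
```

This only names the signal; it does not say which process sent it.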
Is there some web2py limitation/configuration that causes Popen() requests to always fail? If so, how can I modify my logic so that the controller (or whatever) is actually able to spawn this external process?
** EDIT: Looks like web2py applications may not like the subprocess module. According to a reply in the web2py email group:
"You should not use subprocess in a web2py application (if you really need too, look into the admin/controllers/shell.py) but you can use it in a web2py program running from shell (web2py.py -R myprogram.py)."
I will be checking out some options based on the note here and see if any solution presents itself.
In the end, the best I was able to come up with involved setting up a simple XML-RPC server and calling the functions from that:
my_server.py
#my_server.py
from SimpleXMLRPCServer import SimpleXMLRPCServer, SimpleXMLRPCRequestHandler
from subprocess import *
def echo_fn():
    p = Popen(("echo", "hello"), shell=False)
    ret = p.wait()
    print "Process ended with status %s" % ret
    return True # RPC Server doesn't like to return None
def main():
    # Stock request handler; substitute a custom handler class if one is needed
    server = SimpleXMLRPCServer(("localhost", 12345), SimpleXMLRPCRequestHandler)
    server.register_function(echo_fn, "echo_fn")
    while True:
        server.handle_request()
if __name__ == "__main__":
    main()
web2py_controller.py
#web2py_controller.py
import xmlrpclib
def run_echo():
    # Ask the helper server to run the external process on our behalf
    proc_srvr = xmlrpclib.ServerProxy("http://localhost:12345")
    proc_srvr.echo_fn()
I'll be honest, I'm neither a Python nor a SimpleXMLRPCServer guru, so the overall code may not be up to best-practice standards. However, going this route did allow me to, in effect, call a subprocess from a controller in web2py.
(Note, this was a quick and dirty simplification of the code that I have in my project. I have not validated it is in a working state, so it may require some tweaks.)
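For readers on Python 3, the same idea can be sketched with the renamed standard-library modules xmlrpc.server and xmlrpc.client. This is a minimal sketch, not the code from the project above: the names run_echo and make_server are hypothetical, a one-line Python child stands in for echo so the example does not depend on the platform, and the server runs in a thread here only so that one file can show both sides.

```python
import subprocess
import sys
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def run_echo():
    """Run the external command and return its exit status."""
    # A tiny Python child stands in for the real executable.
    completed = subprocess.run(
        [sys.executable, "-c", "print('hello')"],
        stdout=subprocess.DEVNULL,
    )
    return completed.returncode


def make_server(host="127.0.0.1", port=0):
    """Create the helper server; port 0 lets the OS pick a free port."""
    server = SimpleXMLRPCServer((host, port), logRequests=False)
    server.register_function(run_echo, "run_echo")
    return server


if __name__ == "__main__":
    server = make_server()
    host, port = server.server_address
    threading.Thread(target=server.serve_forever, daemon=True).start()
    # This part is what the web2py controller would do.
    proxy = ServerProxy("http://%s:%d" % (host, port))
    print("Process ended with status %s" % proxy.run_echo())
    server.shutdown()
    server.server_close()
```

In a real deployment the server would be a separate long-running process outside Apache, as in the original approach.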
I am writing a microservice in Haskell and it seems that we'll need to call into a Python library. I know how to create and configure a process to do that from Haskell, but my Python is rusty. Here's the logic I am trying to implement:
The Haskell application initializes by creating a persistent subprocess (lifetime of the subprocess = lifetime of the parent process) running a minimized application serving the Python library.
The Haskell application receives a network request and sends over stdin exactly 1 chunk of data (i.e. bytestring or text) to the Python subprocess; it waits for -- blocking -- exactly 1 chunk of data to be received from the subprocess' stdout, collects the result and returns it as a response.
I've looked around and the closest solutions I was able to find were:
Running a Python program from Go and
Persistent python subprocess
Both handle only the part I know how to handle (i.e. calling into a Python subprocess) while not dealing with the details of the Python code run from the subprocess -- hence this question.
The obvious alternative would be to simply create, run and stop a subprocess whenever the Haskell application needs it, but the overhead is unpleasant.
I've tried something whose minimized version looks like:
-- From the Haskell parent process
{-# LANGUAGE OverloadedStrings #-}
import System.IO
import System.Process.Typed
configProc :: ProcessConfig Handle Handle ()
configProc =
  setStdin createPipe $
  setStdout createPipe $
  setStderr closed $
  setWorkingDir "/working/directory" $
  shell "python3 my_program.py"
startPyProc :: IO (Process Handle Handle ())
startPyProc = do
  p <- startProcess configProc
  hSetBuffering (getStdin p) NoBuffering
  hSetBuffering (getStdout p) NoBuffering
  pure p
main :: IO ()
main = do
  p <- startPyProc
  let stdin = getStdin p
      stdout = getStdout p
  hSetBuffering stdin NoBuffering
  hSetBuffering stdout NoBuffering
  -- hGetLine won't get anything before I call hClose
  -- making it impossible to stream over both stdin and stdout
  hPutStrLn stdin "foo" >> hClose stdin >> hGetLine stdout >>= print
# From the Python child process
import sys
if __name__ == '__main__':
    for line in sys.stdin:
        # do some work and finally...
        print(result)
One issue with this code is that I have not been able to send to stdin and receive from stdout without first closing the stdin handle, which makes the implementation unable to do what I want (send 1 chunk to stdin, block, read the result from stdout, rinse and repeat). Another potential issue is that the Python code might not be adequate at all for the specification I am trying to meet.
Got it fixed by simply replacing print(...) with print(..., flush=True). It appears that when Python's stdout is connected to a pipe rather than a terminal it defaults to block-buffering, so the reply stayed in the child's buffer and my call to hGetLine blocked while waiting for a line.
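The request/reply loop with an explicit flush can be sketched as follows. This is only an illustration under stated assumptions: a Python parent stands in for the Haskell one, the names CHILD_SOURCE, start_child and ask are hypothetical, and upper-casing the line is a placeholder for the call into the real library.

```python
import subprocess
import sys

# Child program: one reply line per request line, flushed immediately.
CHILD_SOURCE = r'''
import sys
for line in sys.stdin:
    print(line.rstrip("\n").upper(), flush=True)
'''


def start_child():
    """Start the persistent child with pipes on stdin and stdout."""
    return subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        universal_newlines=True,  # exchange text rather than bytes
    )


def ask(child, request):
    """Send one line, then block until exactly one reply line arrives."""
    child.stdin.write(request + "\n")
    child.stdin.flush()  # the parent has to flush its side as well
    return child.stdout.readline().rstrip("\n")


if __name__ == "__main__":
    child = start_child()
    print(ask(child, "foo"))
    print(ask(child, "bar"))
    child.stdin.close()  # end of input makes the child's loop finish
    child.wait()
```

The child stays alive across requests, and stdin is closed only once, at shutdown.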
I am in a bit of a pickle here. I have a Python script (gather.py) that gathers information from an .xml file and uploads it into a database in an infinite loop that sleeps for 60 seconds between passes; all of this runs locally. I am using Flask to run a webpage that will later pull information from the database, but at the moment all it does is display a sample page (main.py). I want main.py to start gather.py as a background process that won't prevent Flask from starting. I tried importing gather.py, but it blocks indefinitely and Flask never starts.
After Googling for a while, it seems that the best option is to use a task queue (Celery) and a message broker (RabbitMQ) to take care of this. That would be fine if the application had to do a lot of work in the background, but I only need it to do one or two things. So I did more digging and found posts stating that subprocess.Popen() could do the job. I tried using it and I don't think it failed, since it didn't raise any errors, but the database is empty. I confirmed that both gather.py and main.py work independently. I tried running the following code in IDLE:
subprocess.Popen([sys.executable, 'path\to\gather.py'], stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
and got this in return:
<subprocess.Popen object at 0x049A1CF0>
Now, I don't know what this means, I tried using .value and .attrib but understandably I get this:
AttributeError: 'Popen' object has no attribute 'value'
and
AttributeError: 'Popen' object has no attribute 'attrib'
Then I read on a StackOverflow post that stdout=subprocess.PIPE would cause the program to halt so, in a 'just in case' moment, I ran:
subprocess.Popen([sys.executable, 'path\to\gather.py'], stdout=subprocess.DEVNULL, stderr=subprocess.STDOUT)
and got this in return:
<subprocess.Popen object at 0x034A77D0>
Through this whole process the database tables have remained empty. I am new to the subprocess module, but all of this seems to check out and I can't figure out why it is not running gather.py. Is it because it has an infinite loop? If there is a better option, please let me know.
Python version: 3.4.4
PS. IDK if it'll matter but I am running a portable version of Python (PortableApps) on a Windows 10 PC. This is why I included sys.executable inside subprocess.Popen().
Solution 1 (all in python script):
Try to use Thread and Queue.
I do this:
from flask import Flask
from flask import request
import json
from queue import Queue  # Python 3 name; the module is called Queue on Python 2
from threading import Thread
import time
def task(q):
    q.put(0)
    t = time.time()
    while True:
        time.sleep(1)
        q.put(time.time() - t)
queue = Queue()
worker = Thread(target=task, args=(queue,))
worker.start()
app = Flask(__name__)
@app.route('/')
def message_from_queue():
    msg = "Running: calculate %f seconds" % queue.get()
    return msg
if __name__ == '__main__':
    app.run(host='0.0.0.0')
If you run this code, each access to '/' gets a value calculated by task in the background. Maybe you need to block until the task produces a value, but there isn't enough information in the question to tell. Of course, you need to refactor your gather.py so that a queue can be passed to it.
Solution 2 (using a system script):
For Windows, create a .bat file and run both scripts from there:
@echo off
start python "path\to\gather.py"
set FLASK_APP=app.py
flask run
This will start gather.py and then start the Flask server. If you use start /min python "path\to\gather.py", gather.py will run in a minimized window.
subprocess.Popen cannot launch a .py file directly, because Windows treats the script as a file and not as an executable. Without shell=True, Popen can only start executables (.exe files), so the interpreter has to be the program and the script its argument.
You can use:
os.system('python_file_path.py')
but it won't be a background process (depends on your script).
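A minimal sketch of starting a script in the background with subprocess.Popen and then checking on it, assuming the interpreter is passed explicitly as in the question. The helper name start_background is hypothetical, and an inline -c script stands in for the path to gather.py.

```python
import subprocess
import sys


def start_background(script_args):
    """Launch a Python child and return at once, without waiting for it."""
    return subprocess.Popen(
        [sys.executable] + list(script_args),
        stdout=subprocess.DEVNULL,  # discard output so no pipe can fill up
        stderr=subprocess.DEVNULL,
    )


if __name__ == "__main__":
    # "-c" with an inline script stands in for the path to gather.py.
    proc = start_background(["-c", "import time; time.sleep(2)"])
    # poll() returns None while the child is still running.
    print("still running:", proc.poll() is None)
    proc.wait()
    # A non-zero exit code here would mean the child stopped with an error.
    print("exit code:", proc.returncode)
```

The Popen object shown in the interpreter is only a handle to the child. If poll() returns a non-zero code shortly after launch, the script most likely crashed, and sending stderr to a file instead of DEVNULL would show why.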
I'm trying to debug a simple python application but no luck so far.
import multiprocessing
def worker(num):
    for a in range(0, 10):
        print a
if __name__ == '__main__':
    for i in range(5):
        p = multiprocessing.Process(target=worker, args=(i,))
        p.start()
I want to set a breakpoint inside the for-loop to track the values of 'a', but none of the tools that I tried are able to do that.
So far I have tried debugging with:
PyCharm, where I get the following error: ImportError: No module named pydevd - http://youtrack.jetbrains.com/issue/PY-6649. It looks like they are still working on a fix for this and, from what I understand, there is no ETA.
Winpdb - http://winpdb.org, but it simply won't go inside my 'worker' method and just prints the values of 'a'.
I would really appreciate any help with this!
I found it very useful to replace multiprocessing.Process() with threading.Thread() when I'm going to set breakpoints. Both classes have similar arguments so in most cases they are interchangeable.
Usually my scripts use Process() until I specify command line argument --debug which effectively replaces those calls with Thread(). That allows me to debug those scripts with pdb.
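A minimal sketch of such a switch, written for Python 3. The function names worker and run are hypothetical, and the queue type is swapped along with the worker class because a thread can share an ordinary queue.Queue while a process needs multiprocessing.Queue.

```python
import multiprocessing
import queue
import sys
import threading


def worker(num, results):
    # A pdb breakpoint placed here is reachable when running as a thread.
    results.put(num * 2)


def run(debug, count=3):
    """Start `count` workers as threads if debug is true, else as processes."""
    if debug:
        results = queue.Queue()
        factory = threading.Thread
    else:
        results = multiprocessing.Queue()
        factory = multiprocessing.Process
    tasks = [factory(target=worker, args=(i, results)) for i in range(count)]
    for task in tasks:
        task.start()
    collected = [results.get() for _ in tasks]
    for task in tasks:
        task.join()
    return sorted(collected)


if __name__ == "__main__":
    # Passing --debug on the command line selects the threaded variant.
    print(run("--debug" in sys.argv[1:]))
```

Threads share memory and the GIL, so timing and isolation differ from the multi-process run; the switch is meant for stepping through worker logic only.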
You should be able to do it with remote-pdb.
from multiprocessing import Pool
def test(thing):
    from remote_pdb import set_trace
    set_trace()
    s = thing*2
    print(s)
    return s
if __name__ == '__main__':
    with Pool(5) as p:
        print(p.map(test,['dog','cat','bird']))
Then just telnet to the port that's listed in the log.
Example:
RemotePdb session open at 127.0.0.1:54273, waiting for connection ...
telnet 127.0.0.1 54273
<telnet junk>
-> s = thing*2
(Pdb)
or
nc -tC 127.0.0.1 54273
-> s = thing * 2
(Pdb)
You should be able to debug the process at that point.
I copied everything in /Applications/PyCharm\ 2.6\ EAP.app/helpers/pydev/*.py to site-packages in my virtualenv and it worked for me (I'm debugging celery/kombu, breakpoints work as expected).
It would be great if regular pdb/ipdb would work with multiprocessing. If I can get away with it, I handle calls to multiprocessing serially if the number of configured processes is 1.
if processes == 1:
    for record in data:
        worker_function(record)  # call the worker in-process, one record at a time
else:
    pool.map(worker_function, data)
Then when debugging, configure the application to only use a single process. This doesn't cover all cases, especially when dealing with concurrency issues, but it might help.
I've rarely needed to use a traditional debugger when attempting to debug Python code, preferring instead to liberally sprinkle my code with trace statements. I'd change your code to the following:
import multiprocessing
import logging
def worker(num):
    for a in range(0, 10):
        logging.debug("(%d, %d)" % (num, a))
if __name__ == '__main__':
    logging.basicConfig(level=logging.DEBUG)
    for i in range(5):
        p = multiprocessing.Process(target=worker, args=(i,))
        logging.info("Starting process %d" % i)
        p.start()
In production, you disable the debug trace statements by setting the trace level to logging.WARNING so you only log warnings and errors.
There's a good basic and advanced logging tutorial on the official Python site.
If you are trying to debug multiple processes running simultaneously, as shown in your example, then there's no obvious way to do that from a single terminal: which process should get the keyboard input? Because of this, Python always connects sys.stdin in the child process to os.devnull. But this means that when the debugger tries to get input from stdin, it immediately reaches end-of-file and reports an error.
If you can limit yourself to one subprocess at a time, at least for debugging, then you could get around this by setting sys.stdin = open(0) to reopen the main stdin, as described here.
But if multiple subprocesses may be at breakpoints simultaneously, then you will need a different solution, since they would all end up fighting over input from the single terminal. In that case, RemotePdb is probably your best bet, as described by @OnionKnight.
WingIDE Pro provides this functionality right out-of-the-box.
No additional code (e.g., use of the traceback module) is needed. You just run your program, and the Wing debugger will not only print stdout from subprocesses, but it will break on errors in a subprocess and instantly create an interactive shell so you can debug the offending thread. It doesn't get any easier than this, and I know of no other IDE that exposes subprocesses in this way.
Yes, it's a commercial product. But I have yet to find any other IDE that provides a debugger to match. PyCharm Professional, Visual Studio Community, Komodo IDE - I've tried them all. WingIDE also leads in parsing source documentation as well, in my opinion. And the Eye Ease Green color scheme is something I can't live without now.
(Yes, I realize this question is 5+ years old. I'm answering it anyway.)
I am developing a wrapper around gdb using python. Basically, I just want to be able to detect a few setup annoyances up-front and be able to run a single command to invoke gdb, rather than a huge string I have to remember each time.
That said, there are two cases that I am using. The first, which works fine, is invoking gdb by creating a new process and attaching to it. Here's the code that I have for this one:
def spawnNewProcessInGDB():
    global gObjDir, gGDBProcess
    from subprocess import Popen
    from os.path import join
    import subprocess
    binLoc = join(gObjDir, 'dist')
    binLoc = join(binLoc, 'bin')
    binLoc = join(binLoc, 'mycommand')
    profileDir = join(gObjDir, '..')
    profileDir = join(profileDir, 'trash-profile')
    try:
        gGDBProcess = Popen(['gdb', '--args', binLoc, '-profile', profileDir], cwd=gObjDir)
        gGDBProcess.wait()
    except KeyboardInterrupt:
        # Send a termination signal to the GDB process, if it's running
        promptAndTerminate(gGDBProcess)
Now, if the user presses CTRL-C while this is running, it breaks (i.e. it forwards the CTRL-C to GDB). This is the behavior I want.
The second case is a bit more complicated. It might be the case that I already had this program running on my system and it crashed, but was caught. In this case, I want to be able to connect to it using gdb to get a stack trace (or perhaps I was already running it, and I simply now want to connect to the process that's already in memory).
As a convenience feature, I've created a mirror function, which will connect to a running process using gdb:
def connectToProcess(procNum):
    global gObjDir, gGDBProcess
    from subprocess import Popen
    import subprocess
    import signal
    print("Connecting to mycommand process number " + str(procNum) + "...")
    try:
        gGDBProcess = Popen(['gdb', '-p', procNum], cwd=gObjDir)
        gGDBProcess.wait()
    except KeyboardInterrupt:
        promptAndTerminate(gGDBProcess)
Again, this seems to work as expected. It starts gdb, I can set breakpoints, run the program, etc. The only catch is that it doesn't forward CTRL-C to gdb if I press it while the program is running. Instead, it jumps immediately to promptAndTerminate().
I'm wondering if anyone can see why this is happening - the two calls to subprocess.Popen() seem identical to me, albeit that one is running gdb in a different mode.
I have also tried replacing the call to subprocess.Popen() with the following:
gGDBProcess = Popen(['gdb', '-p', procNum], cwd=gObjDir, stdin=subprocess.PIPE)
but this leads to undesirable results as well, because it doesn't actually communicate anything to the child gdb process (e.g. if I type in c to start the program running again after it is broken upon connection from gdb, it doesn't do anything). Again, it terminates the running python process when I type CTRL-C.
Any help would be appreciated!
I am trying to detect when an installation program finishes executing from within a Python script. Specifically, the application is the Oracle 10gR2 Database. Currently I am using the subprocess module with Popen. Ideally, I would simply use the wait() method to wait for the installation to finish executing; however, the documented command spawns child processes to handle the actual installation. Here is a sample of the failing code:
import os
import subprocess
OUI_DATABASE_10GR2_SUBPROCESS = ['sudo',
                                 '-u',
                                 'oracle',
                                 os.path.join(DATABASE_10GR2_TMP_PATH,
                                              'database',
                                              'runInstaller'),
                                 '-ignoreSysPrereqs',
                                 '-silent',
                                 '-noconfig',
                                 '-responseFile '+ORACLE_DATABASE_10GR2_SILENT_RESPONSE]
oracle_subprocess = subprocess.Popen(OUI_DATABASE_10GR2_SUBPROCESS)
oracle_subprocess.wait()
There is a similar question here: Killing a subprocess including its children from python, but the selected answer does not address the children issue, instead it instructs the user to call directly the application to wait for. I am looking for a specific solution that will wait for all children of the subprocess. What if there are an unknown number of subprocesses? I will select the answer that addresses the issue of waiting for all children subprocesses to finish.
More clarity on failure: The child processes continue executing after the wait() command since that command only waits for the top level process (in this case it is 'sudo'). Here is a simple diagram of the known child processes in this problem:
Python subprocess module -> Sudo -> runInstaller -> java -> (unknown)
Ok, here is a trick that will work only under Unix. It is similar to one of the answers to this question: Ensuring subprocesses are dead on exiting Python program. The idea is to create a new process group. You can then wait for all processes in the group to terminate.
pid = os.fork()
if pid == 0:
    os.setpgrp()
    oracle_subprocess = subprocess.Popen(OUI_DATABASE_10GR2_SUBPROCESS)
    oracle_subprocess.wait()
    os._exit(0)
else:
    os.waitpid(-pid, 0)  # a negative pid selects that process group; 0 means no options
I have not tested this. It creates an extra subprocess to be the leader of the process group, but avoiding that is (I think) quite a bit more complicated.
I found this web page to be helpful as well. http://code.activestate.com/recipes/278731-creating-a-daemon-the-python-way/
You can just use os.waitpid with the pid set to -1; this waits for a child process of the current process to finish (in the example below, that child in turn waits for its own subprocess):
import os
import sys
import subprocess
proc = subprocess.Popen([sys.executable,
'-c',
'import subprocess;'
'subprocess.Popen("sleep 5", shell=True).wait()'])
pid, status = os.waitpid(-1, 0)
print pid, status
This is the result of pstree <pid> of different subprocess forked:
python───python───sh───sleep
Hope this can help :)
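Since a single os.waitpid(-1, 0) call returns as soon as one child has exited, waiting for every direct child takes a loop. The following is a minimal, Unix-only sketch in Python 3; the helper name wait_for_all_children is hypothetical. It reaps direct children only, so it does not by itself cover grandchildren such as the java process in the question.

```python
import os
import subprocess
import sys


def wait_for_all_children():
    """Reap every direct child of this process; return {pid: raw status}."""
    statuses = {}
    while True:
        try:
            # Blocks until any one child exits.
            pid, status = os.waitpid(-1, 0)
        except ChildProcessError:
            # Raised once there are no children left to wait for.
            return statuses
        statuses[pid] = status


if __name__ == "__main__":
    children = [
        subprocess.Popen([sys.executable, "-c", "import time; time.sleep(1)"])
        for _ in range(3)
    ]
    for pid, status in wait_for_all_children().items():
        print(pid, os.WEXITSTATUS(status))
```

Because the loop collects the exit statuses itself, the Popen objects no longer see them, so this pattern suits cases where only completion matters.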
Check out the following link http://www.oracle-wiki.net/startdocsruninstaller which details a flag you can use for the runInstaller command.
This flag is definitely available for 11gR2, but I have not got a 10g database to try out this flag for the runInstaller packaged with that version.
Regards
Everywhere I look seems to say it's not possible to solve this in the general case. I've whipped up a library called 'pidmon' that combines some answers for Windows and Linux and might do what you need.
I'm planning to clean this up and put it on github, possibly called 'pidmon' or something like that. I'll post a link if/when I get it up.
EDIT: It's available at http://github.com/dbarnett/python-pidmon.
I made a special waitpid function that accepts a graft_func argument so that you can loosely define what sort of processes you want to wait for when they're not direct children:
import pidmon
pidmon.waitpid(oracle_subprocess.pid, recursive=True,
graft_func=(lambda p: p.name == '???' and p.parent.pid == ???))
or, as a shotgun approach, to just wait for any processes started since the call to waitpid to stop again, do:
import pidmon
pidmon.waitpid(oracle_subprocess.pid, graft_func=(lambda p: True))
Note that this is still barely tested on Windows and seems very slow on Windows (but did I mention it's on github where it's easy to fork?). This should at least get you started, and if it works at all for you, I have plenty of ideas on how to optimize it.