Running and calling into a Python program as a persistent subprocess - python

I am writing a microservice in Haskell and it seems that we'll need to call into a Python library. I know how to create and configure a process to do that from Haskell, but my Python is rusty. Here's the logic I am trying to implement:
The Haskell application initializes by creating a persistent subprocess (lifetime of the subprocess = lifetime of the parent process) running a minimized application serving the Python library.
The Haskell application receives a network request and sends over stdin exactly 1 chunk of data (i.e. bytestring or text) to the Python subprocess; it waits for -- blocking -- exactly 1 chunk of data to be received from the subprocess' stdout, collects the result and returns it as a response.
I've looked around, and the closest solutions I was able to find were:
Running a Python program from Go and
Persistent python subprocess
Both handle only the part I know how to handle (i.e. calling into a Python subprocess) while not dealing with the details of the Python code run from the subprocess -- hence this question.
The obvious alternative would be to simply create, run and stop a subprocess whenever the Haskell application needs it, but the overhead is unpleasant.
I've tried something whose minimized version looks like:
-- From the Haskell parent process
{-# LANGUAGE OverloadedStrings #-}
import System.IO
import System.Process.Typed
configProc :: ProcessConfig Handle Handle ()
configProc =
  setStdin createPipe $
  setStdout createPipe $
  setStderr closed $
  setWorkingDir "/working/directory" $
  shell "python3 my_program.py"
startPyProc :: IO (Process Handle Handle ())
startPyProc = do
  p <- startProcess configProc
  hSetBuffering (getStdin p) NoBuffering
  hSetBuffering (getStdout p) NoBuffering
  pure p
main :: IO ()
main = do
  p <- startPyProc
  let stdin = getStdin p
      stdout = getStdout p
  hSetBuffering stdin NoBuffering
  hSetBuffering stdout NoBuffering
  -- hGetLine won't get anything before I call hClose
  -- making it impossible to stream over both stdin and stdout
  hPutStrLn stdin "foo" >> hClose stdin >> hGetLine stdout >>= print
# From the Python child process
import sys
if __name__ == '__main__':
    for line in sys.stdin:
        # do some work and finally...
        print(result)
One issue with this code is that I have not been able to send to stdin and receive from stdout without first closing the stdin handle, which makes the implementation unable to do what I want (send 1 chunk to stdin, block, read the result from stdout, rinse and repeat). Another potential issue is that the Python code might not be adequate at all for the specification I am trying to meet.

Got it fixed by simply replacing print(...) with print(..., flush=True). It appears that Python block-buffers stdout by default when it is connected to a pipe rather than a terminal, so the result never reached the parent and my call to hGetLine blocked while waiting for a line.
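For reference, a minimal sketch of the child-side loop with the flush in place. The names `serve` and `handle` are hypothetical, the one-line-per-request framing is an assumption carried over from the question, and the handler only upper-cases its input as a stand-in for the real library call.

```python
import sys


def handle(request: str) -> str:
    # Hypothetical stand-in for the call into the real Python library.
    return request.upper()


def serve(instream=sys.stdin, outstream=sys.stdout) -> None:
    # One request per input line, one response per output line.
    for line in instream:
        response = handle(line.rstrip("\n"))
        # flush=True hands the line to the parent immediately instead of
        # leaving it in the block buffer used when stdout is a pipe.
        print(response, file=outstream, flush=True)
```

Calling `serve()` at the bottom of `my_program.py` would turn this into the long-lived child process; the parent then writes one line and reads one line per request.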

Related

how do i redirect fifo to stdin using python either with subprocess or with pwntools?

As an example, I am trying to "imitate" the behaviour of the following set of commands in bash:
mkfifo named_pipe
/challenge/embryoio_level103 < named_pipe &
cat > named_pipe
In Python I have tried the following commands:
import os
import subprocess as sp
os.mkfifo("named_pipe",0777) #equivalent to mkfifo in bash..
fw = open("named_pipe",'w')
#at this point the system hangs...
My idea was to use subprocess.Popen and redirect stdout to fw...
then open named_pipe for reading and give it as input to cat (still using Popen).
I know it is a simple (and rather stupid) example, but I can't manage to make it work.
How would you implement such a simple scenario?
Hello fellow pwn college user! I just solved this level :)
open(path, flags) blocks execution. There are many similar stackoverflow Q&As, but I'll reiterate here. A pipe will not pass data until both ends are opened, which is why the process hangs (only 1 end was opened).
If you want to open without blocking, you may do so on certain operating systems (Unix works, Windows doesn't as far as I'm aware) using os.open with the flag os.O_NONBLOCK. I don't know what consequences there are, but be cautious of opening with nonblocking because you may try reading prematurely and there will be nothing to read (possibly leading to error, etc.).
Also, note that using the integer literal 0777 causes a syntax error, so I assume you mean 0o777 (max permissions), where the preceding 0o indicates octal. The default for os.mkfifo is 0o666, which is identical to 0o777 except for the execute flags, which are useless because pipes cannot be executed. Also, be aware that these permissions might not all be granted and when trying to set to 0o666, the permissions may actually be 0o644 (like in my case). I believe this is due to the umask, which can be changed and is used simply for security purposes, but more info can be found elsewhere.
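The non-blocking open mentioned above can be sketched as follows. This is Unix-only, the function name `fifo_roundtrip` is hypothetical, and it assumes the message is small enough to fit in the pipe buffer so the single write does not block.

```python
import os
import tempfile


def fifo_roundtrip(message: bytes) -> bytes:
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "named_pipe")
        os.mkfifo(path)  # default mode 0o666, further restricted by the umask
        # With O_NONBLOCK the read end opens at once, even with no writer yet.
        rfd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
        try:
            # A reader now exists, so a plain blocking open of the write end
            # returns immediately as well.
            wfd = os.open(path, os.O_WRONLY)
            try:
                os.write(wfd, message)
            finally:
                os.close(wfd)
            # The data is already sitting in the pipe buffer.
            return os.read(rfd, len(message))
        finally:
            os.close(rfd)
```

Reading from the non-blocking end before anything has been written raises BlockingIOError while a writer is connected, which is the premature-read hazard described above.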
For the blocking case, you can use the package multiprocessing like so:
import os
import subprocess as sp
from multiprocessing import Process
path = 'named_pipe'
os.mkfifo(path)
def read():
    sp.run("cat", stdin=open(path, "r"))
def write():
    sp.run(["echo", "hello world"], stdout=open(path, "w"))
if __name__ == "__main__":
    p_read = Process(target=read)
    p_write = Process(target=write)
    p_read.start()
    p_write.start()
    p_read.join()
    p_write.join()
    os.remove(path)
output:
hello world

How to obtain output from external progam and put it into a variable in Python

I am still fairly new to the Python world and know this should be an easy question to answer. I have a section of a Python script that calls a Perl script. The Perl script is a SOAP service that fetches data from a web page. Everything works and outputs what I want, but after a bit of trial and error I am still confused as to how I can capture the data in a Python variable instead of just printing it to the screen as it does now.
Any pointers appreciated!
Thank you,
Pablo
# SOAP SERVICE
# Fetch the perl script that will request the users email.
# This service will return a name, email, and certificate.
var = "soap.pl"
pipe = subprocess.Popen(["perl", "./soap.pl", var], stdin = subprocess.PIPE)
pipe.stdin.write(var)
print "\n"
pipe.stdin.close()
I am not sure what your code aims to do (with var in particular), but here are the basics.
There is the subprocess.check_output() function for this
import subprocess
out = subprocess.check_output(['ls', '-l'])
print out
If your Python is before 2.7 use Popen with the communicate() method
import subprocess
proc = subprocess.Popen(['ls', '-l'], stdout=subprocess.PIPE)
out, err = proc.communicate()
print out
You can instead iterate proc.stdout but it appears that you want all output in one variable.
In both cases you provide the program's arguments in the list.
Or add stdin if needed
proc = subprocess.Popen(['perl', 'script.pl', 'arg'],
                        stdin=subprocess.PIPE,
                        stdout=subprocess.PIPE)
The purpose of stdin = subprocess.PIPE is to be able to feed the STDIN of the process that is started, as it runs. Then you would do proc.stdin.write(string) and this writes to the invoked program's STDIN. That program generally waits on its STDIN and after you send a newline it gets everything written to it (since the last newline) and runs relevant processing.
If you simply need to pass parameters/arguments to the script at its invocation then that generally doesn't need nor involve its STDIN.
Since Python 3.5 the recommended method is subprocess.run(), with a very similar full signature, and similar operation, to that of the Popen constructor.
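A minimal sketch of the subprocess.run() route. The helper name `capture_stdout` is hypothetical, `capture_output` and `text` require Python 3.7 or later, and a Python one-liner stands in for the Perl script.

```python
import subprocess
import sys

# Stand-in command for illustration; the real call would look like
# ["perl", "./soap.pl", "arg"].
DEMO_CMD = [sys.executable, "-c", "print('hello')"]


def capture_stdout(cmd, stdin_text=None):
    # check=True raises CalledProcessError on a non-zero exit status.
    completed = subprocess.run(
        cmd,
        input=stdin_text,     # optional text fed to the child's stdin
        capture_output=True,  # collect stdout and stderr instead of printing
        text=True,            # decode bytes to str
        check=True,
    )
    return completed.stdout
```

The returned string is the whole of the child's standard output, so `out = capture_stdout(DEMO_CMD)` leaves the data in a variable rather than on the screen.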

Unable to open a Python subprocess in Web2py (SIGABRT)

I've got an Apache2/web2py server running using the wsgi handler functionality. Within one of the controllers, I am trying to run an external executable to perform some processing on 2 files.
My approach to this is to use the subprocess module to kick off the executable. I have simplified the code to a bare-bones implementation with little success.
from subprocess import *
p = Popen(("echo", "Hello"), shell=False)
ret = p.wait()
print "Process ended with status %s" % ret
When running the above code on its own (create new file and running via python command line), it works exactly as expected.
However, as soon as I place the exact same code into my web2py controller, the external process stops working. Instead of the process returning with code 0 as is expected in the above example, it always returns -6 and "Hello" is not printed to stdout.
After doing some digging, I found that negative results from p.wait() implies that a signal caused the process to end abnormally. And, according to some docs I found, -6 corresponds to the SIGABRT signal.
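As a quick way to check that mapping, a small sketch that turns a Popen return code into a readable description (Python 3; the helper name `describe_exit` is hypothetical):

```python
import signal


def describe_exit(returncode: int) -> str:
    # On POSIX, Popen reports death by signal N as a return code of -N.
    if returncode < 0:
        return "killed by " + signal.Signals(-returncode).name
    return "exited with status %d" % returncode
```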
I would have expected this signal to be a result of some poorly executed code in my child process. However, since this is only running echo (and since it works outside of web2py) I have my doubts that the child process is signalling itself.
Is there some web2py limitation/configuration that causes Popen() requests to always fail? If so, how can I modify my logic so that the controller (or whatever) is actually able to spawn this external process?
** EDIT: Looks like web2py applications may not like the subprocess module. According to a reply in the web2py email group:
"You should not use subprocess in a web2py application (if you really need too, look into the admin/controllers/shell.py) but you can use it in a web2py program running from shell (web2py.py -R myprogram.py)."
I will be checking out some options based on the note here and see if any solution presents itself.
In the end, the best I was able to come up with involved setting up a simple XML-RPC server and calling the functions through it:
my_server.py
#my_server.py
from SimpleXMLRPCServer import SimpleXMLRPCServer, SimpleXMLRPCRequestHandler
from subprocess import *
def echo_fn():
    p = Popen(("echo", "hello"), shell=False)
    ret = p.wait()
    print "Process ended with status %s" % ret
    return True # RPC Server doesn't like to return None
def main():
    # Stock request handler; swap in a custom one if needed
    server = SimpleXMLRPCServer(("localhost", 12345), SimpleXMLRPCRequestHandler)
    server.register_function(echo_fn, "echo_fn")
    while True:
        server.handle_request()
if __name__ == "__main__":
    main()
web2py_controller.py
#web2py_controller.py
import xmlrpclib
def run_echo():
    proc_srvr = xmlrpclib.ServerProxy("http://localhost:12345")
    proc_srvr.echo_fn()
I'll be honest, I'm not a Python nor SimpleRPCServer guru, so the overall code may not be up to best-practice standards. However, going this route did allow me to, in effect, call a subprocess from a controller in web2py.
(Note, this was a quick and dirty simplification of the code that I have in my project. I have not validated it is in a working state, so it may require some tweaks.)
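For readers on Python 3, the same idea can be sketched with `xmlrpc.server` and `xmlrpc.client`. This is an illustration under assumptions, not the code from the project: the helper names are hypothetical, a Python one-liner stands in for the external executable, and the server runs in a background thread only to keep the example self-contained, whereas in the setup described above it would be a separate process outside web2py.

```python
import subprocess
import sys
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def echo_fn(text):
    # Runs the external command inside the RPC server, not the web app.
    completed = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", text],
        capture_output=True,
        text=True,
    )
    return completed.stdout.strip()


def start_server(host="127.0.0.1", port=0):
    # Port 0 lets the OS choose a free port.
    server = SimpleXMLRPCServer((host, port), logRequests=False)
    server.register_function(echo_fn, "echo_fn")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def call_echo(server, text):
    # What the controller would do: one RPC call per request.
    host, port = server.server_address
    with ServerProxy("http://%s:%d" % (host, port)) as proxy:
        return proxy.echo_fn(text)
```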

Keeping a pipe to a process open

I have an app that reads in stuff from stdin and returns, after a newline, results to stdout
A simple (stupid) example:
$ app
Expand[(x+1)^2]<CR>
x^2 + 2*x + 1
100 - 4<CR>
96
Opening and closing the app requires a lot of initialization and clean-up (it's an interface to a Computer Algebra System), so I want to keep this to a minimum.
I want to open a pipe in Python to this process, write strings to its stdin and read the results from its stdout. Popen.communicate() doesn't work for this, as it closes the file handles, so the pipe would have to be reopened for every request.
I've tried something along the lines of this related question:
Communicate multiple times with a process without breaking the pipe? but I'm not sure how to wait for the output. It is also difficult to know a priori how long the app will take to finish processing the input at hand, so I don't want to make any assumptions. I guess most of my confusion comes from this question: Non-blocking read on a subprocess.PIPE in python where it is stated that mixing high and low level functions is not a good idea.
EDIT:
Sorry that I didn't give any code before, got interrupted. This is what I've tried so far and it seems to work, I'm just worried that something goes wrong unnoticed:
from subprocess import Popen, PIPE
pipe = Popen(["MathPipe"], stdin=PIPE, stdout=PIPE)
expressions = ["Expand[(x+1)^2]", "Integrate[Sin[x], {x,0,2*Pi}]"] # ...
for expr in expressions:
    pipe.stdin.write(expr)
    while True:
        line = pipe.stdout.readline()
        if line != '':
            print line
        # output of MathPipe is always terminated by ';'
        if ";" in line:
            break
Potential problems with this?
Using subprocess, you can't do this reliably. You might want to look at using the pexpect library. That won't work on Windows - if you're on Windows, try winpexpect.
Also, if you're trying to do mathematical stuff in Python, check out SAGE. They do a lot of work on interfacing with other open-source maths software, so there's a chance they've already done what you're trying to.
Perhaps you could pass stdin=subprocess.PIPE as an argument to subprocess.Popen. This will make the process' stdin available as a general file-like object:
import sys, subprocess
proc = subprocess.Popen(["mathematica <args>"], stdin=subprocess.PIPE,
                        stdout=sys.stdout, shell=True)
proc.stdin.write("Expand[ (x-1)^2 ]") # Write whatever to the process
proc.stdin.flush() # Ensure nothing is left in the buffer
proc.terminate() # Kill the process
This directs the subprocess' output directly to your python process' stdout. If you need to read the output and do some editing first, that is possible as well. Check out http://docs.python.org/library/subprocess.html#popen-objects.
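If the output has to be read back rather than passed through, a line-oriented exchange can be sketched like this. It only works under two assumptions: the application flushes its stdout after every reply, and every reply is exactly one line. If either fails, the read can block forever, which is the unreliability mentioned in the other answer. `LineClient` is a hypothetical name, and a small Python child stands in for the real application.

```python
import subprocess
import sys

# Stand-in for the real application: answers each input line with its
# upper-cased form and flushes after every reply.
CHILD_SOURCE = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    print(line.strip().upper(), flush=True)\n"
)
DEMO_CMD = [sys.executable, "-c", CHILD_SOURCE]


class LineClient:
    def __init__(self, cmd):
        self.proc = subprocess.Popen(
            cmd,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,  # line-buffered text pipes on the parent side
        )

    def ask(self, request: str) -> str:
        # Send one line, flush, then block until one line comes back.
        self.proc.stdin.write(request + "\n")
        self.proc.stdin.flush()
        return self.proc.stdout.readline().rstrip("\n")

    def close(self) -> None:
        # Closing stdin signals EOF, which ends the child's read loop.
        self.proc.stdin.close()
        self.proc.wait()
        self.proc.stdout.close()
```

The process is started once and reused for every call to `ask`, so the expensive initialization happens only at construction.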

subprocess.Popen.stdout - reading stdout in real-time (again)

Again, the same question.
The reason is - I still can't make it work after reading the following:
Real-time intercepting of stdout from another process in Python
Intercepting stdout of a subprocess while it is running
How do I get 'real-time' information back from a subprocess.Popen in python (2.5)
catching stdout in realtime from subprocess
My case is that I have a console app written in C; let's take for example this code in a loop:
tmp = 0.0;
printf("\ninput>>");
scanf_s("%f",&tmp);
printf ("\ninput was: %f",tmp);
It continuously reads some input and writes some output.
My python code to interact with it is the following:
p=subprocess.Popen([path],stdout=subprocess.PIPE,stdin=subprocess.PIPE)
p.stdin.write('12345\n')
for line in p.stdout:
    print(">>> " + str(line.rstrip()))
    p.stdout.flush()
So far, whenever I read from p.stdout it always waits until the process is terminated and then outputs an empty string. I've tried lots of stuff - but still the same result.
I tried Python 2.6 and 3.1, but the version doesn't matter - I just need to make it work somewhere.
Trying to write to and read from pipes to a sub-process is tricky because of the default buffering going on in both directions. It's extremely easy to get a deadlock where one or the other process (parent or child) is reading from an empty buffer, writing into a full buffer or doing a blocking read on a buffer that's awaiting data before the system libraries flush it.
For more modest amounts of data the Popen.communicate() method might be sufficient. However, for data that exceeds its buffering you'd probably get stalled processes (similar to what you're already seeing?)
You might want to look for details on using the fcntl module and making one or the other (or both) of your file descriptors non-blocking. In that case, of course, you'll have to wrap all reads and/or writes to those file descriptors in the appropriate exception handling to handle the "EWOULDBLOCK" events. (I don't remember the exact Python exception that's raised for these).
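A minimal sketch of the fcntl approach on Unix with Python 3, where a read that would block surfaces as BlockingIOError. The helper names are hypothetical, and a bare os.pipe() stands in for the subprocess pipe.

```python
import fcntl
import os


def set_nonblocking(fd: int) -> None:
    # Add O_NONBLOCK to the descriptor's existing status flags.
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)


def read_available(fd: int, size: int = 4096):
    # Returns the bytes that are ready, b"" at end of file, or None when
    # nothing is available yet (the EWOULDBLOCK/EAGAIN case).
    try:
        return os.read(fd, size)
    except BlockingIOError:
        return None
```

With a subprocess, the descriptor to pass would be `p.stdout.fileno()`.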
A completely different approach would be for your parent to use the select module and os.fork() ... and for the child process to execve() the target program after directly handling any file dup()ing. (Basically you'd be re-implementing parts of Popen(), but with different parent file descriptor (PIPE) handling.)
Incidentally, .communicate, at least in Python's 2.5 and 2.6 standard libraries, will only handle about 64K of remote data (on Linux and FreeBSD). This number may vary based on various factors (possibly including the build options used to compile your Python interpreter, or the version of libc being linked to it). It is NOT simply limited by available memory (despite J.F. Sebastian's assertion to the contrary) but is limited to a much smaller value.
Push reading from the pipe into a separate thread that signals when a chunk of output is available:
How can I read all availably data from subprocess.Popen.stdout (non blocking)?
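A minimal sketch of the reader-thread idea in Python 3. The names `start_reader` and `collect_lines` are hypothetical, and the demo child is started with `-u` so that its own output is unbuffered; a real child that block-buffers its stdout would still delay delivery.

```python
import queue
import subprocess
import sys
import threading

# Stand-in child that prints two lines; -u disables its output buffering.
DEMO_CMD = [sys.executable, "-u", "-c", "print('a'); print('b')"]


def start_reader(stream):
    # A background thread does the blocking reads and hands each line to
    # a queue; None marks the end of the stream.
    lines = queue.Queue()

    def pump():
        for line in stream:
            lines.put(line)
        lines.put(None)

    threading.Thread(target=pump, daemon=True).start()
    return lines


def collect_lines(cmd, timeout=10.0):
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
    lines = start_reader(proc.stdout)
    collected = []
    while True:
        line = lines.get(timeout=timeout)  # raises queue.Empty on timeout
        if line is None:
            break
        collected.append(line.rstrip("\n"))
    proc.wait()
    return collected
```

The main thread never blocks on the pipe itself; it waits on the queue, where a timeout is easy to apply.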
The bufsize=256 argument prevents 12345\n from being sent to the child process in a chunk smaller than 256 bytes, as it will be when omitting bufsize or inserting p.stdin.flush() after p.stdin.write(). Default behaviour is line-buffering.
In either case you should at least see one empty line before blocking as emitted by the first printf(\n...) in your example.
Your particular example doesn't require "real-time" interaction. The following works:
from subprocess import Popen, PIPE
p = Popen(["./a.out"], stdin=PIPE, stdout=PIPE)
output = p.communicate(b"12345")[0] # send input/read all output
print output,
where a.out is your example C program.
In general, for a dialog-based interaction with a subprocess you could use pexpect module (or its analogs on Windows):
import pexpect
child = pexpect.spawn("./a.out")
child.expect("input>>")
child.sendline("12345.67890") # send a number
child.expect(r"\d+\.\d+") # expect the number at the end
print float(child.after) # assert that we can parse it
child.close()
I had the same problem, and "proc.communicate()" does not solve it because it waits for the process to terminate.
So here is what is working for me, on Windows with Python 3.5.1 :
import subprocess as sp
import time
myProcess = sp.Popen(cmd, creationflags=sp.CREATE_NEW_PROCESS_GROUP, stdout=sp.PIPE, stderr=sp.STDOUT)
i = 0
while i < 40:
    i += 1
    time.sleep(.5)
    out = myProcess.stdout.readline().decode("utf-8").rstrip()
I guess creationflags and other arguments are not mandatory (but I don't have time to test), so this would be the minimal syntax :
myProcess = sp.Popen(cmd, stdout=sp.PIPE)
for i in range(40):
    time.sleep(.5)
    out = myProcess.stdout.readline()
