Python bash pipe

I want to pipe a Python script's output to a bash script. So far I have tried os.popen() and the subprocess module, and tried to give it a pipe, for example:
os.popen('echo "P 1 1 591336 4927369 1 321 " | v.in.ascii -zn out=abcx format=standard --overwrite')
but this didn't work. The values "591336" and "4927369" are variables that come from the Python script's output. When I run the same echo command and pipe manually in bash (even changing the values by hand), it works.
v.in.ascii -zn out=abcx format=standard --overwrite
the above part of the bash command is a part of Grass GIS
Can anyone help me?

You can just use print to output to stdout and pipe the Python process to the next process, e.g.
python myprogram.py | ...
Where myprogram.py might look like:
for x in something:
    print dosomething(x)

This works for me:
>>> stdin, stdout = os.popen2("echo %s | grep 'test'" % 'some test param')
>>> print stdout.read()
some test param
>>>

As of Python 2.6, the subprocess module is recommended instead of the deprecated os.popen. Here's an example:
from subprocess import Popen, PIPE
p = Popen(["v.in.ascii", "-zn", "out=abcx", "format=standard", "--overwrite"], stdin=PIPE)
p.stdin.write("P 1 1 591336 4927369 1 321\n")
p.stdin.close()
p.wait() # unless background execution preferred

I really like John Paulett's answer.
I think your echo example would work if you used os.system instead of os.popen.
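For example, a minimal sketch with os.system, reusing the command from the question (untested against GRASS):
import os

# os.system runs the string through the shell; output goes straight to the terminal
status = os.system('echo "P 1 1 591336 4927369 1 321" | '
                   'v.in.ascii -zn out=abcx format=standard --overwrite')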
One way to use popen here is like this:
f = os.popen("v.in.ascii -zn out=abcx format=standard --overwrite", 'w')
f.write("P 1 1 591336 4927369 1 321\n")
f.close()
(You have to specify the pipe is for writing.)


How do I execute the following shell command using the Python subprocess module?
echo "input data" | awk -f script.awk | sort > outfile.txt
The input data will come from a string, so I don't actually need echo. I've got this far; can anyone explain how I get it to pipe through sort too?
p_awk = subprocess.Popen(["awk", "-f", "script.awk"],
                         stdin=subprocess.PIPE,
                         stdout=file("outfile.txt", "w"))
p_awk.communicate("input data")
UPDATE: Note that while the accepted answer below doesn't actually answer the question as asked, I believe S.Lott is right and it's better to avoid having to solve that problem in the first place!
You'd be a little happier with the following.
import subprocess
awk_sort = subprocess.Popen("awk -f script.awk | sort > outfile.txt",
                            stdin=subprocess.PIPE, shell=True)
awk_sort.communicate(b"input data\n")
Delegate part of the work to the shell. Let it connect two processes with a pipeline.
You'd be a lot happier rewriting 'script.awk' into Python, eliminating awk and the pipeline.
Edit. Some of the reasons for suggesting that awk isn't helping.
[There are too many reasons to respond via comments.]
Awk is adding a step of no significant value. There's nothing unique about awk's processing that Python doesn't handle.
The pipelining from awk to sort may, for large sets of data, improve elapsed processing time. For short sets of data, it has no significant benefit. A quick measurement of awk >file ; sort file against awk | sort will reveal whether concurrency helps. With sort, it rarely helps, because sort is not a once-through filter.
The simplicity of "Python to sort" processing (instead of "Python to awk to sort") prevents the exact kind of questions being asked here.
Python -- while wordier than awk -- is also explicit where awk has certain implicit rules that are opaque to newbies, and confusing to non-specialists.
Awk (like the shell script itself) adds Yet Another Programming language. If all of this can be done in one language (Python), eliminating the shell and the awk programming eliminates two programming languages, allowing someone to focus on the value-producing parts of the task.
Bottom line: awk can't add significant value. In this case, awk is a net cost; it added enough complexity that it was necessary to ask this question. Removing awk will be a net gain.
Sidebar: Why building a pipeline (a | b) is so hard.
When the shell is confronted with a | b it has to do the following.
Fork a child process of the original shell. This will eventually become b.
Build an OS pipe (not a Python subprocess.PIPE): call os.pipe(), which returns two new file descriptors connected via a common buffer. At this point the process has stdin, stdout, and stderr from its parent, plus a file descriptor that will become "a's stdout" and "b's stdin".
Fork a child. The child replaces its stdout with the new a's stdout. Exec the a process.
The b child replaces its stdin with the new b's stdin. Exec the b process.
The b child waits for a to complete.
The parent is waiting for b to complete.
I think that the above can be used recursively to spawn a | b | c, but you have to implicitly parenthesize long pipelines, treating them as if they're a | (b | c).
Since Python has os.pipe(), os.exec() and os.fork(), and you can replace sys.stdin and sys.stdout, there's a way to do the above in pure Python. Indeed, you may be able to work out some shortcuts using os.pipe() and subprocess.Popen.
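For the curious, a rough sketch (untested, POSIX-only) of wiring a | b by hand with os.pipe(), os.fork(), and exec; "a" and "b" stand in for real program names:
import os

rd, wr = os.pipe()

pid_a = os.fork()
if pid_a == 0:          # child that will exec "a"
    os.dup2(wr, 1)      # a's stdout becomes the pipe's write end
    os.close(rd)
    os.close(wr)
    os.execvp("a", ["a"])

pid_b = os.fork()
if pid_b == 0:          # child that will exec "b"
    os.dup2(rd, 0)      # b's stdin becomes the pipe's read end
    os.close(rd)
    os.close(wr)
    os.execvp("b", ["b"])

os.close(rd)            # the parent keeps neither end open
os.close(wr)
os.waitpid(pid_a, 0)
os.waitpid(pid_b, 0)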
However, it's easier to delegate that operation to the shell.
import subprocess
some_string = b'input_data'
sort_out = open('outfile.txt', 'wb', 0)
sort_in = subprocess.Popen('sort', stdin=subprocess.PIPE, stdout=sort_out).stdin
subprocess.Popen(['awk', '-f', 'script.awk'], stdout=sort_in,
                 stdin=subprocess.PIPE).communicate(some_string)
To emulate a shell pipeline:
from subprocess import check_call
check_call('echo "input data" | a | b > outfile.txt', shell=True)
without invoking the shell (see 17.1.4.2. Replacing shell pipeline):
#!/usr/bin/env python
from subprocess import Popen, PIPE
a = Popen(["a"], stdin=PIPE, stdout=PIPE)
with a.stdin:
    with a.stdout, open("outfile.txt", "wb") as outfile:
        b = Popen(["b"], stdin=a.stdout, stdout=outfile)
    a.stdin.write(b"input data")
statuses = [a.wait(), b.wait()]  # both a.stdin/stdout are closed already
plumbum provides some syntax sugar:
#!/usr/bin/env python
from plumbum.cmd import a, b # magic
(a << "input data" | b > "outfile.txt")()
The analog of:
#!/bin/sh
echo "input data" | awk -f script.awk | sort > outfile.txt
is:
#!/usr/bin/env python
from plumbum.cmd import awk, sort
(awk["-f", "script.awk"] << "input data" | sort > "outfile.txt")()
The accepted answer sidesteps the actual question.
Here is a snippet that chains the output of multiple processes.
Note that it also prints the (somewhat) equivalent shell command so you can run it and make sure the output is correct.
#!/usr/bin/env python3
from subprocess import Popen, PIPE
# cmd1 : dd if=/dev/zero bs=1M count=100
# cmd2 : tee
# cmd3 : wc -c
cmd1 = ['dd', 'if=/dev/zero', 'bs=1M', 'count=100']
cmd2 = ['tee']
cmd3 = ['wc', '-c']
print(f"Shell style : {' '.join(cmd1)} | {' '.join(cmd2)} | {' '.join(cmd3)}")
p1 = Popen(cmd1, stdout=PIPE, stderr=PIPE) # stderr=PIPE optional, dd is chatty
p2 = Popen(cmd2, stdin=p1.stdout, stdout=PIPE)
p3 = Popen(cmd3, stdin=p2.stdout, stdout=PIPE)
print("Output from last process : " + (p3.communicate()[0]).decode())
# theoretically p1 and p2 may still be running; this ensures we collect their return codes
p1.wait()
p2.wait()
print("p1 return: ", p1.returncode)
print("p2 return: ", p2.returncode)
print("p3 return: ", p3.returncode)
http://www.python.org/doc/2.5.2/lib/node535.html covered this pretty well. Is there some part of this you didn't understand?
Your program would be pretty similar, but the second Popen would have stdout= to a file, and you wouldn't need the output of its .communicate().
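A minimal sketch of that suggestion for the awk | sort case from the question:
import subprocess

# second process writes straight to the file, so only the first
# process needs communicate() for the input data
outfile = open("outfile.txt", "w")
p_sort = subprocess.Popen(["sort"], stdin=subprocess.PIPE, stdout=outfile)
p_awk = subprocess.Popen(["awk", "-f", "script.awk"],
                         stdin=subprocess.PIPE, stdout=p_sort.stdin)
p_awk.communicate(b"input data\n")
p_sort.stdin.close()   # let sort see EOF once awk is done
p_sort.wait()
outfile.close()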
Inspired by @Cristian's answer. I ran into the same issue, but with a different command, so I'm putting up my tested example, which I believe could be helpful:
grep_proc = subprocess.Popen(["grep", "rabbitmq"],
                             stdin=subprocess.PIPE,
                             stdout=subprocess.PIPE)
subprocess.Popen(["ps", "aux"], stdout=grep_proc.stdin)
out, err = grep_proc.communicate()
This is tested.
What has been done
Declared the lazy grep execution with stdin coming from a pipe; it runs as the ps command executes and fills the pipe with its stdout.
Called the primary command ps with stdout directed to the pipe used by the grep command.
Called communicate() on the grep process to collect its stdout from the pipe.
I like this approach because it is the natural pipe concept, gently wrapped with subprocess interfaces.
The previous answers missed an important point. Replacing shell pipeline is basically correct, as pointed out by geocar. It is almost sufficient to run communicate on the last element of the pipe.
The remaining problem is passing the input data to the pipeline. With multiple subprocesses, a simple communicate(input_data) on the last element doesn't work: it hangs forever. You need to create a pipe and a child process manually, like this:
import os
import subprocess
input = """\
input data
more input
""" * 10

rd, wr = os.pipe()
if os.fork() != 0:  # parent
    os.close(wr)
else:               # child
    os.close(rd)
    os.write(wr, input)
    os.close(wr)
    exit()

p_awk = subprocess.Popen(["awk", "{ print $2; }"],
                         stdin=rd,
                         stdout=subprocess.PIPE)
p_sort = subprocess.Popen(["sort"],
                          stdin=p_awk.stdout,
                          stdout=subprocess.PIPE)
p_awk.stdout.close()
out, err = p_sort.communicate()
print(out.rstrip())
Now the child provides the input through the pipe, and the parent calls communicate(), which works as expected. With this approach, you can create arbitrary long pipelines without resorting to "delegating part of the work to the shell". Unfortunately the subprocess documentation doesn't mention this.
There are ways to achieve the same effect without pipes:
from tempfile import TemporaryFile
tf = TemporaryFile()
tf.write(input)
tf.seek(0, 0)
Now use stdin=tf for p_awk. It's a matter of taste what you prefer.
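Concretely, a sketch of the temp-file variant, reusing tf from above:
p_awk = subprocess.Popen(["awk", "{ print $2; }"],
                         stdin=tf,   # read input from the temp file
                         stdout=subprocess.PIPE)
p_sort = subprocess.Popen(["sort"],
                          stdin=p_awk.stdout,
                          stdout=subprocess.PIPE)
p_awk.stdout.close()
out, err = p_sort.communicate()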
The above is still not 100% equivalent to bash pipelines because the signal handling is different. You can see this if you add another pipe element that truncates the output of sort, e.g. head -n 10. With the code above, sort will print a "Broken pipe" error message to stderr. You won't see this message when you run the same pipeline in the shell. (That's the only difference though, the result in stdout is the same). The reason seems to be that python's Popen sets SIG_IGN for SIGPIPE, whereas the shell leaves it at SIG_DFL, and sort's signal handling is different in these two cases.
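If the shell-like behaviour matters, a sketch (POSIX-only) of the usual remedy: restore default SIGPIPE handling in the child via preexec_fn:
import signal
import subprocess

def restore_sigpipe():
    # Undo Python's SIG_IGN so the child dies silently on a broken pipe,
    # the way children of a shell do.
    signal.signal(signal.SIGPIPE, signal.SIG_DFL)

p_sort = subprocess.Popen(["sort"], stdin=subprocess.PIPE,
                          stdout=subprocess.PIPE,
                          preexec_fn=restore_sigpipe)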
EDIT: pipes is available on Windows but, crucially, doesn't appear to actually work on Windows. See comments below.
The Python standard library now includes the pipes module for handling this:
https://docs.python.org/2/library/pipes.html, https://docs.python.org/3.4/library/pipes.html
I'm not sure how long this module has been around, but this approach appears to be vastly simpler than mucking about with subprocess.
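For instance, a sketch of the awk | sort pipeline from the question using pipes.Template (untested, and per the edit above, don't expect it to work on Windows):
import pipes

t = pipes.Template()
# '--' marks a command that reads stdin and writes stdout
t.append('awk -f script.awk', '--')
t.append('sort', '--')
f = t.open('outfile.txt', 'w')  # writes flow through the pipeline into the file
f.write('input data\n')
f.close()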
For me, the approach below is the cleanest and easiest to read:
from subprocess import Popen, PIPE
def string_to_2_procs_to_file(input_s, first_cmd, second_cmd, output_filename):
    with open(output_filename, 'wb') as out_f:
        p2 = Popen(second_cmd, stdin=PIPE, stdout=out_f)
        p1 = Popen(first_cmd, stdout=p2.stdin, stdin=PIPE)
        p1.communicate(input=input_s.encode())  # communicate() expects bytes
        p1.wait()
        p2.stdin.close()
        p2.wait()
which can be called like so:
string_to_2_procs_to_file('input data', ['awk', '-f', 'script.awk'], ['sort'], 'output.txt')

python subprocess and passing in shell arguments

I'm trying to use Python's subprocess module to run a command that downloads a file, but it requires an argument in order to proceed. If I run the command standalone, it prompts me as shown below:
./goro-new export --branch=testing --file=corp/goro.sites/test/meta.json
Finding pages .........
The following pages will be exported from Goro to your local filesystem:
/goro.sites/test/meta.json -> /usr/local/home/$user/schools/goro.sites/test/meta.json
Export pages? [y/N]: y
Exporting 1 pages .............................................................................................................. 0% 0:00:03
Exported 1 pages in 3.66281s.
My question is, how do I answer the "y/N" in the "Export pages" part? I suspect I need to pass input to my subprocess, but I am a relative newcomer to Python, so I was hoping for some help. Below is a printout of my testing in the Python environment:
>>> import subprocess
>>> cmd = ['goro-new export --branch=test --file=corp/goro.sites/test/meta.json']
>>> p = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE,stderr=subprocess.PIPE, stdin=subprocess.PIPE)
>>> out, err = p.communicate()
>>> print out
Finding pages ....
The following pages will be exported from Goro to your local filesystem:
/goro.sites/test/meta.json -> /var/www/html/goro.sites/test/meta.json
Export pages? [y/N]:
How can I pass in the "y/N" so it can proceed?
You use the function you are already using, communicate(), and pass whatever you want as its input parameter. I cannot verify this works, but it should give you an idea:
>>> import subprocess
>>> cmd = ['goro-new export --branch=test --file=corp/goro.sites/test/meta.json']
>>> p = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE,stderr=subprocess.PIPE, stdin=subprocess.PIPE)
>>> out, err = p.communicate(input="y")
>>> print out
The easiest way to do this if you always want to answer yes (which I'm assuming you do) is with some bash: yes | python myscript.py. To do this directly in Python, you can make a new subprocess.Popen (say, called yes) with stdout=subprocess.PIPE, and set the stdin of p to yes.stdout, as sketched below. Reference: Python subprocess command with pipe
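A rough sketch of that wiring, assuming the goro-new command from the question:
import subprocess

# yes floods the pipeline with "y" lines, answering every prompt affirmatively
yes = subprocess.Popen(['yes'], stdout=subprocess.PIPE)
p = subprocess.Popen('goro-new export --branch=test --file=corp/goro.sites/test/meta.json',
                     shell=True, stdin=yes.stdout, stdout=subprocess.PIPE)
out, err = p.communicate()
yes.stdout.close()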

Executing shell program in Python without printing to screen

Is there a way to execute a shell program from Python that normally prints its output to the screen, and read that output into a variable without displaying anything on the screen?
This sounds a little bit confusing, so maybe I can explain it better by an example.
Let's say I have a program that prints something to the screen when executed
bash> ./my_prog
bash> "Hello World"
When I want to read the output into a variable in Python, I read that a good approach is to use the subprocess module like so:
my_var = subprocess.check_output("./my_prog", shell=True)
With this construct, I can get the program's output into my_var (here "Hello World"), however it is also printed to the screen when I run the Python script. Is there any way to suppress this? I couldn't find anything in the subprocess documentation, so maybe there is another module I could use for this purpose?
EDIT:
I just found out that commands.getoutput() lets me do this. But is there also a way to achieve the same effect with subprocess? I ask because I'm planning to make a Python 3 version at some point.
EDIT2: Particular Example
Excerpt from the python script:
oechem_utils_path = "/soft/linux64/openeye/examples/oechem-utilities/"\
    "openeye/toolkits/1.7.2.4/redhat-RHEL5-g++4.3-x64/examples/"\
    "oechem-utilities/"
rmsd_path = oechem_utils_path + "rmsd"

for file in lMol2:
    sReturn = subprocess.check_output(
        "{rmsd_exe} {rmsd_pars} -in {sIn} -ref {sRef}".format(
            rmsd_exe=sRmsdExe, rmsd_pars=sRmsdPars, sIn=file, sRef=sReference),
        shell=True)
    dRmsds[file] = sReturn
Screen output (note that not "everything" is printed to the screen, only part of the output; if I use commands.getoutput, everything works just fine):
/soft/linux64/openeye/examples/oechem-utilities/openeye/toolkits/1.7.2.4/redhat-RHEL5-g++4.3-x64/examples/oechem-utilities/rmsd: mols in: 1 out: 0
/soft/linux64/openeye/examples/oechem-utilities/openeye/toolkits/1.7.2.4/redhat-RHEL5-g++4.3-x64/examples/oechem-utilities/rmsd: confs in: 1 out: 0
/soft/linux64/openeye/examples/oechem-utilities/openeye/toolkits/1.7.2.4/redhat-RHEL5-g++4.3-x64/examples/oechem-utilities/rmsd - RMSD utility [OEChem 1.7.2]
/soft/linux64/openeye/examples/oechem-utilities/openeye/toolkits/1.7.2.4/redhat-RHEL5-g++4.3-x64/examples/oechem-utilities/rmsd: mols in: 1 out: 0
/soft/linux64/openeye/examples/oechem-utilities/openeye/toolkits/1.7.2.4/redhat-RHEL5-g++4.3-x64/examples/oechem-utilities/rmsd: confs in: 1 out: 0
To add to Ryan Haining's answer, you can also handle stderr to make sure nothing is printed to the screen:
p = subprocess.Popen(command, shell=True, stdin=subprocess.PIPE, stderr=subprocess.STDOUT, stdout=subprocess.PIPE, close_fds=True)
out,err = p.communicate()
If subprocess.check_output is not working for you, use a Popen object and a PIPE to capture the program's output in Python.
prog = subprocess.Popen('./myprog', shell=True, stdout=subprocess.PIPE)
output = prog.communicate()[0]
The .communicate() method will wait for the program to finish execution and then return a tuple of (stdout, stderr), which is why you want to take the [0] of it.
If you also want to capture stderr then add stderr=subprocess.PIPE to the creation of the Popen object.
If you wish to capture the output of prog while it is running instead of waiting for it to finish, you can call line = prog.stdout.readline() to read one line at a time, as sketched below. Note that readline blocks until a line is available.
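For example, a small sketch of that polling loop:
# read prog's output as it appears, line by line
while True:
    line = prog.stdout.readline()
    if not line:
        break  # EOF: the program closed its stdout
    print(line)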
I always used subprocess.Popen, which normally gives you no output.

Catching and outputting stderr at the same time with python's subprocess

(Using python 3.2 currently)
I need to be able to:
Run a command using subprocess
Both stdout and stderr of that command need to be printed to the terminal in real time (it doesn't matter if they both come out on stdout or stderr or whatever).
At the same time, I need a way to know if the command printed anything to stderr (and preferably what it printed).
I've played around with subprocess pipes, with strange pipe redirects in bash, and with tee, but so far haven't found anything that works. Is this something that's possible?
My solution:
import subprocess
process = subprocess.Popen("my command", shell=True,
                           stdout=None,             # print to terminal
                           stderr=subprocess.PIPE)
duplicator = subprocess.Popen("tee /dev/stderr", shell=True,  # duplicate the input stream
                              stdin=process.stderr,
                              stdout=subprocess.PIPE,  # catch the first process's error stream
                              stderr=None)             # print to terminal
error_stream = duplicator.stdout
print('error_stream.read() = ' + error_stream.read().decode())
Try something like this:
import os
cmd = 'for i in 1 2 3 4 5; do sleep 5; echo $i; done'
p = os.popen(cmd)
while True:
    output = p.readline()
    print(output)
    if not output:
        break
In python2 you can catch stderr easily as well by using popen3 like this:
i, o, err = os.popen3(cmd)
but there seems to be no such function in python3. If you don't find a way around this, try using subprocess.Popen directly, as described here: http://www.saltycrane.com/blog/2009/10/how-capture-stdout-in-real-time-python/
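A rough sketch of that subprocess approach, merging stderr into stdout and reading line by line as the command (cmd from above) runs:
import subprocess

p = subprocess.Popen(cmd, shell=True,
                     stdout=subprocess.PIPE,
                     stderr=subprocess.STDOUT)  # fold stderr into stdout
for line in iter(p.stdout.readline, b''):
    print(line.decode().rstrip())
p.wait()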

Output from subprocess.Popen

I have been writing some Python code, and in my code I was using the "commands" module.
The code was working as I intended, but then I noticed in the Python docs that commands has been deprecated and will be removed in Python 3, and that I should use "subprocess" instead.
"OK," I think, "I don't want my code to go straight to legacy status, so I should change that right now."
The thing is that subprocess.Popen seems to prepend a nasty string to the start of any output e.g.
<subprocess.Popen object at 0xb7394c8c>
All the examples I see have it there, it seems to be accepted as given that it is always there.
This code;
#!/usr/bin/python
import subprocess
output = subprocess.Popen("ls -al", shell=True)
print output
produces this;
<subprocess.Popen object at 0xb734b26c>
brettg#underworld:~/dev$ total 52
drwxr-xr-x 3 brettg brettg 4096 2011-05-27 12:38 .
drwxr-xr-x 21 brettg brettg 4096 2011-05-24 17:40 ..
<trunc>
Is this normal? If I use it as part of a larger program that outputs various formatted details to the console it messes everything up.
I'm using the command to obtain the IP address for an interface by using ifconfig along with various greps and awks to scrape the address.
Consider this code;
#!/usr/bin/python
import commands,subprocess
def new_get_ip(netif):
    address = subprocess.Popen("/sbin/ifconfig " + netif + " | grep inet | grep -v inet6 | awk '{print $2}' | sed 's/addr://'i", shell=True)
    return address

def old_get_ip(netif):
    address = commands.getoutput("/sbin/ifconfig " + netif + " | grep inet | grep -v inet6 | awk '{print $2}' | sed 's/addr://'i")
    return address
print "OLD IP is :",old_get_ip("eth0")
print ""
print "NEW IP is :",new_get_ip("eth0")
This returns;
brettg#underworld:~/dev$ ./IPAddress.py
OLD IP is : 10.48.16.60
NEW IP is : <subprocess.Popen object at 0xb744270c>
brettg#underworld:~/dev$ 10.48.16.60
Which is fugly to say the least.
Obviously I am missing something here. I am new to Python of course so I'm sure it is me doing the wrong thing but various google searches have been fruitless to this point.
What if I want cleaner output? Do I have to manually trim the offending output or am I invoking subprocess.Popen incorrectly?
The "ugly string" is what it should be printing. Python is correctly printing out the repr(subprocess.Popen(...)), just like what it would print if you said print(open('myfile.txt')).
Furthermore, Python has no knowledge of what is being output to stdout. The output you are seeing is not from Python, but from the process's stdout and stderr being redirected to your terminal as spam; it is not even going through the Python process. It's as if you ran a program someprogram & without redirecting its stdout and stderr to /dev/null, then tried to run another command: you'd occasionally see spam from the program. To repeat and clarify:
<subprocess.Popen object at 0xb734b26c> <-- output of python program
brettg#underworld:~/dev$ total 52 <-- spam from your shell, not from python
drwxr-xr-x 3 brettg brettg 4096 2011-05-27 12:38 . <-- spam from your shell, not from python
drwxr-xr-x 21 brettg brettg 4096 2011-05-24 17:40 .. <-- spam from your shell, not from python
...
In order to capture stdout, you must use the .communicate() function, like so:
#!/usr/bin/python
import subprocess
output = subprocess.Popen(["ls", "-a", "-l"], stdout=subprocess.PIPE).communicate()[0]
print output
Furthermore, you never want to use shell=True, as it is a security hole (a major security hole with unsanitized inputs, a minor one with no input because it allows local attacks by modifying the shell environment). For security reasons and also to avoid bugs, you generally want to pass in a list rather than a string. If you're lazy you can do "ls -al".split(), which is frowned upon, but it would be a security hole to do something like ("ls -l %s"%unsanitizedInput).split().
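For instance, a sketch of the list form with a hypothetical unsanitized value; each list element arrives as one argv entry, so no shell ever interprets it:
import subprocess

unsanitized_input = "; rm -rf ~"  # hypothetical hostile input
# stays a single, harmless argument because no shell is involved
p = subprocess.Popen(["ls", "-l", unsanitized_input],
                     stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = p.communicate()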
See the subprocess module documentation for more information.
Here is how to get stdout and stderr from a program using the subprocess module:
from subprocess import Popen, PIPE, STDOUT
cmd = 'echo Hello World'
p = Popen(cmd, shell=True, stdin=PIPE, stdout=PIPE, stderr=STDOUT, close_fds=True)
output = p.stdout.read()
print output
results:
b'Hello World\r\n'
You can run commands with PowerShell and see the results:
from subprocess import Popen, PIPE, STDOUT
cmd = 'powershell.exe ls'
p = Popen(cmd, shell=True, stdin=PIPE, stdout=PIPE, stderr=STDOUT, close_fds=True)
output = p.stdout.read()
The variable output does not contain a string; it holds the subprocess.Popen object itself. You don't need to print it. The code,
import subprocess
output = subprocess.Popen("ls -al", shell=True)
works perfectly, but without the ugly <subprocess.Popen object at 0xb734b26c> being printed.
