How to use subprocess in Python

I would like to execute the following shell command in Python: grep 'string' file | tail -1 | cut -c 1-3
I tried:
import subprocess
i = 1
while i < 1070:
    file = "sorted." + str(i) + ".txt"
    string = "2x"
    subprocess.call(grep 'string' file | tail -1 | cut -c 1-3)
    i = i + 1
Any help would be appreciated. Thanks.

First of all, whatever you pass to subprocess.call should be a string. As written, grep, tail and cut are treated as undefined Python names, so you need to turn the whole expression into a string. Since the search string for the grep command should be dynamic, you need to construct the final string before passing it as an argument into the function.
import subprocess
i = 1
while i < 1070:
    file = "sorted." + str(i) + ".txt"
    string = "2x"
    command_string = 'grep {0} {1} | tail -1 | cut -c 1-3'.format(string, file)
    subprocess.call(command_string)
    i = i + 1
You probably want to pass in an additional argument to subprocess.call: shell=True. The argument will make sure the command is executed through the shell.
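With that change, the call inside the loop above would look like this (a minimal sketch):
    # run the constructed pipeline through the shell so that | is interpreted
    subprocess.call(command_string, shell=True)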
Your command is using cut, so you probably want to retrieve the output of the subprocess. A better option is to create a Popen object and use its communicate method with output capturing turned on:
import subprocess
i = 1
while i < 1070:
    file = "sorted." + str(i) + ".txt"
    string = "2x"
    command_string = 'grep {0} {1} | tail -1 | cut -c 1-3'.format(string, file)
    p = subprocess.Popen(command_string, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    stdoutdata, stderrdata = p.communicate()
    # stdoutdata now contains the output of the shell commands and you can use it
    # in your program
    i = i + 1
EDIT: Here is how to store the data in a text file, as requested in the comment.
import subprocess
outputs = []
i = 1
while i < 1070:
    file = "sorted." + str(i) + ".txt"
    string = "2x"
    command_string = 'grep {0} {1} | tail -1 | cut -c 1-3'.format(string, file)
    p = subprocess.Popen(command_string, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
    stdoutdata, stderrdata = p.communicate()
    # stdoutdata now contains the output of the shell commands and you can use it
    # in your program, like writing the output to a file.
    outputs.append(stdoutdata)
    i = i + 1

with open('output.txt', 'w') as f:
    f.write('\n'.join(outputs))

Your command should be provided as a string.
In addition, if you want to get the output of your command, you can use the following:
subprocess.run("grep 'string' file | tail -1 | cut -c 1-3", shell=True, capture_output=True, check=True)
where capture_output (available in Python 3.7+) captures stdout and stderr on the returned object (which also carries returncode), and check=True raises an exception if your command fails.
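For example, the object returned by subprocess.run can be inspected like this (a minimal sketch; text=True is an extra convenience so stdout comes back as str instead of bytes):
import subprocess

result = subprocess.run("grep 'string' file | tail -1 | cut -c 1-3",
                        shell=True, capture_output=True, check=True, text=True)
print(result.returncode)  # 0, since check=True would have raised otherwise
print(result.stdout)      # first three characters of the last matching line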

subprocess expects the command either as a single string or as a list of arguments:
subprocess.call("grep '{}' {} | tail -1 | cut -c 1-3".format(string, file), shell=True)
shell=True is necessary because you are using shell-specific features like the pipe.
However, in this case it might be a lot easier to implement the entire program in pure python.
Note that if either string or file contains any special characters, including spaces or quotation marks, the command will not work and could in fact do a variety of unwanted things to your system. If you need it to work on more than these simple values, consider either a pure-Python solution, setting shell=False and using the array syntax with manual piping, or some form of escaping, as sketched below.
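A minimal sketch of the escaping option, assuming shlex.quote (Python 3.3+) and the same string and file variables as above:
import shlex
import subprocess

# quote each substituted value so spaces and quotes cannot break the command
command = "grep {0} {1} | tail -1 | cut -c 1-3".format(
    shlex.quote(string), shlex.quote(file))
subprocess.call(command, shell=True)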

Related

Issue specifying parameters to 'pstops' in subprocess.Popen

Issuing this command from the command line:
pdftops -paper A4 -nocenter opf.pdf - | pstops "1:0#0.8(0.5cm,13.5cm)" > test.ps
works fine. I tried to convert this to a parameter list for subprocess.Popen like this:
import subprocess as sp
path = 'opf.pdf'
ps = sp.Popen(
    ["pdftops",
     "-paper", "A4",
     "-nocenter",
     "{}".format(path),
     "-"],
    stdout = sp.PIPE)
pr = sp.Popen(
    ["pstops",
     "'1:0#0.8(0.5cm,13.5cm)'"],
    stdin = ps.stdout,
    stdout = sp.PIPE)
sp.Popen(
    ["lpr"],
    stdin = pr.stdout)
where path is the filename, opf.pdf. This produces an error in the second Popen:
0x23f2dd0age specification error:
pagespecs = [modulo:]spec
spec = [-]pageno[#scale][L|R|U|H|V][(xoff,yoff)][,spec|+spec]
modulo >= 1, 0 <= pageno < modulo
(sic). I suspect the 0x23f2dd0 somehow replaced the 'P'. Anyway, I suspect the problem is in the page spec 1:0#0.8(0.5cm,13.5cm), so I tried with and without the single quotes, and with (escaped) double quotes. I even tried shlex.quote, which produced the very exotic ''"'"'1:0#0.8(0.5cm,13.5cm)'"'"'', but still the same error.
What is causing this?
EDIT: As a last resort, I tried:
os.system(("pdftops -paper A4 -nocenter {} - | "
"pstops '1:0#0.8(1cm,13.5cm)' | "
"lpr").format(path))
which works perfectly. I'd still prefer the above Popen solution though.
Think about what the shell does with that argument (or use something like printf '%s\n' to get it to show you). We need to undo the shell quoting and replace it with Python quoting (which happens to be eerily similar):
pr = sp.Popen(
    ["pstops",
     "1:0#0.8(0.5cm,13.5cm)"],
    stdin = ps.stdout,
    stdout = sp.PIPE)
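For completeness, here is a sketch of the full pipeline from the question with only that argument changed; the stdout.close() calls are the usual SIGPIPE-propagation step from the subprocess docs and are an addition, not part of the original question:
import subprocess as sp

path = 'opf.pdf'
ps = sp.Popen(["pdftops", "-paper", "A4", "-nocenter", path, "-"],
              stdout=sp.PIPE)
pr = sp.Popen(["pstops", "1:0#0.8(0.5cm,13.5cm)"],
              stdin=ps.stdout, stdout=sp.PIPE)
ps.stdout.close()  # let pdftops receive SIGPIPE if pstops exits early
lp = sp.Popen(["lpr"], stdin=pr.stdout)
pr.stdout.close()  # let pstops receive SIGPIPE if lpr exits early
lp.wait()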

Converting command line syntax to os.system call or subprocess call in Python [duplicate]

How would one call a shell command from Python which contains a pipe and capture the output?
Suppose the command was something like:
cat file.log | tail -1
The Perl equivalent of what I am trying to do would be something like:
my $string = `cat file.log | tail -1`;
Use a subprocess.PIPE, as explained in the subprocess docs section "Replacing shell pipeline":
import subprocess
p1 = subprocess.Popen(["cat", "file.log"], stdout=subprocess.PIPE)
p2 = subprocess.Popen(["tail", "-1"], stdin=p1.stdout, stdout=subprocess.PIPE)
p1.stdout.close() # Allow p1 to receive a SIGPIPE if p2 exits.
output, err = p2.communicate()
Or, using the sh module, piping becomes composition of functions:
import sh
output = sh.tail(sh.cat('file.log'), '-1')
import subprocess
task = subprocess.Popen("cat file.log | tail -1", shell=True, stdout=subprocess.PIPE)
data = task.stdout.read()
assert task.wait() == 0
Note that this does not capture stderr. And if you want to capture stderr as well, you'll need to use task.communicate(); calling task.stdout.read() and then task.stderr.read() can deadlock if the buffer for stderr fills. If you want them combined, you should be able to use 2>&1 as part of the shell command.
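A minimal sketch of the communicate()-based variant (same command as above; stderr could also be merged into stdout by passing stderr=subprocess.STDOUT instead):
import subprocess

task = subprocess.Popen("cat file.log | tail -1", shell=True,
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = task.communicate()  # reads both streams without risking a deadlock
assert task.returncode == 0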
But given your exact case,
task = subprocess.Popen(['tail', '-1', 'file.log'], stdout=subprocess.PIPE)
data = task.stdout.read()
assert task.wait() == 0
avoids the need for the pipe at all.
This:
import subprocess
p = subprocess.Popen("cat file.log | tail -1", shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, stdin=subprocess.PIPE)
#for shell=False use absolute paths
p_stdout = p.stdout.read()
p_stderr = p.stderr.read()
print(p_stdout)
Or this should work:
import os
result = os.system("cat file.log | tail -1")
Another way similar to Popen would be:
command=r"""cat file.log | tail -1 """
output=subprocess.check_output(command, shell=True)
This is a fork from @chown's answer, with some improvements:
an alias for subprocess on import makes setting parameters easier
if you just want the output, you don't need to set stderr or stdin when calling Popen
for better formatting, it's recommended to decode the output
shell=True is necessary in order to have a shell interpret the command line
#!/usr/bin/python3
import subprocess as sp
p = sp.Popen("cat app.log | grep guido", shell=True, stdout=sp.PIPE)
output = p.stdout.read()
print(output.decode('utf-8'))
$ cat app.log
2017-10-14 22:34:12, User Removed [albert.wesker]
2017-10-26 18:14:02, User Removed [alexei.ivanovich]
2017-10-28 12:14:56, User Created [ivan.leon]
2017-11-14 09:22:07, User Created [guido.rossum]
$ python3 subproc.py
2017-11-14 09:22:07, User Created [guido.rossum]
A simple function to run a shell command with many pipes.
Usage:
res, err = eval_shell_cmd('pacman -Qii | grep MODIFIED | grep -v UN | cut -f 2')
Function:
import subprocess

def eval_shell_cmd(command, debug=False):
    """
    Eval shell command with pipes and return result
    :param command: Shell command
    :param debug: Debug flag
    :return: Result string
    """
    processes = command.split(' | ')
    if debug:
        print('Processes:', processes)
    for index, value in enumerate(processes):
        args = value.split(' ')
        if debug:
            print(index, args)
        if index == 0:
            p = subprocess.Popen(args, stdout=subprocess.PIPE)
        else:
            p = subprocess.Popen(args, stdin=p.stdout, stdout=subprocess.PIPE)
        if index == len(processes) - 1:
            result, error = p.communicate()
            return result.decode('utf-8'), error
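One design caveat: splitting each stage on single spaces breaks commands whose arguments contain quotes or embedded spaces; the standard library's shlex.split understands shell-style quoting and could be used for the inner split instead. A small illustration of the difference:
import shlex

# str.split(' ') versus shlex.split on a quoted argument
print('grep "foo bar" app.log'.split(' '))    # ['grep', '"foo', 'bar"', 'app.log']
print(shlex.split('grep "foo bar" app.log'))  # ['grep', 'foo bar', 'app.log']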

Escaping both types of quotes in subprocess.Popen call to awk

My subprocess call should be calling tabix 1kg.phase1.snp.bed.gz -B test.bed | awk '{FS="\t";OFS="\t"} $4 >= 10' but is giving me errors because it has both " and ' in it. I have tried using r for a raw string but I can't figure out the right combination to prevent errors. My current call looks like:
snp_tabix = subprocess.Popen(["tabix", tgp_snp, "-B", infile, "|", "awk", """'{FS="\t";OFS="\t"}""", "$4", ">=", maf_cut_off, r"'"], stdout=subprocess.PIPE)
Which gives the error TypeError: execv() arg 2 must contain only strings
r"'" is not the issue. Most likely you're passing maf_cut_off as an integer, which is incorrect. You should use str(maf_cut_off).
There are several issues. You are trying to execute a shell command (there is a pipe | in the command). So it won't work even if you convert all variables to strings.
You could execute it using shell:
from pipes import quote
from subprocess import check_output
cmd = r"""tabix %s -B %s | awk '{FS="\t";OFS="\t"} $4 >= %d'""" % (
quote(tgp_snp), quote(infile), maf_cut_off)
output = check_output(cmd, shell=True)
Or you could use the pipe recipe from subprocess' docs:
from subprocess import Popen, PIPE
tabix = Popen(["tabix", tgp_snp, "-B", infile], stdout=PIPE)
awk = Popen(["awk", r'{FS="\t";OFS="\t"} $4 >= %d' % maf_cut_off],
stdin=tabix.stdout, stdout=PIPE)
tabix.stdout.close() # allow tabix to receive a SIGPIPE if awk exits
output = awk.communicate()[0]
tabix.wait()
Or you could use plumbum that provides some syntax sugar for shell commands:
from plumbum.cmd import tabix, awk
cmd = tabix[tgp_snp, '-B', infile]
cmd |= awk[r'{FS="\t";OFS="\t"} $4 >= %d' % maf_cut_off]
output = cmd() # run it and get output
Another option is to reproduce the awk command in pure Python. To get all lines that have 4th field larger than or equal to maf_cut_off numerically (as an integer):
from subprocess import Popen, PIPE
tabix = Popen(["tabix", tgp_snp, "-B", infile], stdout=PIPE)
lines = []
for line in tabix.stdout:
    columns = line.split(b'\t', 4)
    if len(columns) > 3 and int(columns[3]) >= maf_cut_off:
        lines.append(line)
output = b''.join(lines)
tabix.communicate() # close streams, wait for the subprocess to exit
tgp_snp, infile should be strings and maf_cut_off should be an integer.
You could use bufsize=-1 (Popen()'s parameter) to improve time performance.

Python exec grep

I'm trying to grep the list of *.nasl files from OpenVAS that contain a certain port number.
I can do it directly in the terminal with the command:
egrep --only-match '111' /home/gwvm/Openvas/var/lib/openvas/plugins/*.nasl |cut -d ":" -f1
This command returns the names of all the nasl files that contain 111,
like:
/home/gwvm/Openvas/var/lib/openvas/plugins/SolarWinds_TFTP.nasl:111
/home/gwvm/Openvas/var/lib/openvas/plugins/trojan_horses.nasl:111
and after the cut:
/home/gwvm/Openvas/var/lib/openvas/plugins/SolarWinds_TFTP.nasl
/home/gwvm/Openvas/var/lib/openvas/plugins/trojan_horses.nasl
But when I run it from Python (3.1.3), the output gives me an error:
egrep:/home/gwvm/Openvas/var/lib/openvas/plugins/*.nasl: No such file or directory
I was thinking the problem was the "*.nasl", but when I try with an existing file I get the same result.
Here is the relevant part of the code:
command = ("egrep --only-match '"+ str(port[0]) +"' "+ openvas_directory["locate"]["nasl"] + '*.nasl' + ' |cut -d ":" -f1 ')
process=sp.Popen(command,shell=True, stdout= sp.PIPE)
or
exec(command)
I was also thinking of a bad construction, but when I print the command it gives me exactly what I want:
egrep --only-match '111' /home/gwvm/Openvas/var/lib/openvas/plugins/*.nasl |cut -d ":" -f1
If anyone has any ideas!
from subprocess import PIPE, Popen
x = Popen('egrep --only-match \'111\' /home/gwvm/Openvas/var/lib/openvas/plugins/*.nasl', stdout=PIPE, stderr=PIPE, shell=True)
y = Popen('cut -d ":" -f1', stdin=x.stdout, stdout=PIPE, stderr=PIPE, shell=True)
for row in y.stdout:
    print(row)
Or just use check_output()
And this is btw how you | in Popen ;)
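As mentioned above, check_output() can run the whole pipeline in one call (a minimal sketch, assuming the same paths as in the question):
from subprocess import check_output

output = check_output("egrep --only-match '111' "
                      "/home/gwvm/Openvas/var/lib/openvas/plugins/*.nasl"
                      " | cut -d ':' -f1", shell=True)
print(output.decode())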
Guidelines:
When using Popen, if you supply a command as a string, use shell=True.
If you however supply Popen with a list, e.g. ['ls', '-l'], then use shell=False; that's just how it works.
If you're piping data, execute two different Popens and use the output from the first command as stdin for the second command; this is equivalent to doing | in Linux. A sketch of this with list arguments follows this list.
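A sketch of the piping guideline above using list arguments (shell=False); note that the *.nasl globbing normally done by the shell has to be done in Python, here with the glob module:
import glob
from subprocess import PIPE, Popen

files = glob.glob('/home/gwvm/Openvas/var/lib/openvas/plugins/*.nasl')
grep = Popen(['egrep', '--only-match', '111'] + files, stdout=PIPE)
cut = Popen(['cut', '-d', ':', '-f1'], stdin=grep.stdout, stdout=PIPE)
grep.stdout.close()  # let egrep receive SIGPIPE if cut exits first
output = cut.communicate()[0]
print(output.decode())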

bash commands from within python

I'm looking for the best way to use bash commands from within python. What ways are there? I know of os.system and subprocess.Popen.
I have tried these:
bootfile = os.system("ls -l /jffs2/a.bin | cut -d '/' -f 4")
print bootfile
This returns a.bin as expected, but it also returns 0 afterwards and so prints:
a.bin
0
with bootfile now set to 0. The next time I print bootfile it just shows up as 0, which is the exit value I guess. How do I stop this value from interfering?
I have also tried:
bootfile = subprocess.Popen("ls -l /jffs2/a.bin | cut -d '/' -f 4")
print bootfile
but it seems to break the script; I get nothing returned at all. Have I done that right?
Also which of these is better and why? Are there other ways and what is the preferred way?
Using os.readlink (proposed by @kojiro) and os.path.basename to get only the file name:
os.path.basename(os.readlink('/jffs2/a.bin'))
kojiro's comment about os.readlink is probably what you want.
I am explaining what you were trying to implement.
os.system returns the exit status of the command it runs.
subprocess.Popen creates a pipe, so that you can capture the output of the command.
The line below will capture the output of the command:
bootfile = subprocess.Popen(["bash","-c","ls -l /jffs2/a.bin | cut -d '/' -f 4"], stdout=subprocess.PIPE).communicate()[0]
More details at http://docs.python.org/library/subprocess.html
The right answer, as #kojiro says, is:
os.readlink('/jffs2/a.bin')
But if you really wanted to do this the complicated way, then in Python 2.7:
cmd = "ls -l /jffs2/a.bin | cut -d '/' -f 4"
bootfile = subprocess.check_output(cmd, shell=True)
Or on older Pythons:
cmd = "ls -l /jffs2/a.bin | cut -d '/' -f 4"
p = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE)
bootfile = p.communicate()[0]
if p.returncode != 0:
    raise Exception('It failed')
