How to correctly escape special characters in python subprocess? - python

Im trying to run this bash command using python subprocess
find /Users/johndoe/sandbox -iname "*.py" | awk -F'/' '{ print $NF}'
output:-
helld.xl.py
parse_maillog.py
replace_pattern.py
split_text_match.py
ssh_bad_login.py
Here is what i have done in python2.7 way, but it gives the output where awk command filter is not working
>>> p1=subprocess.Popen(["find","/Users/johndoe/sandbox","-iname","*.py"],stdout=subprocess.PIPE)
>>> p2=subprocess.Popen(['awk','-F"/"','" {print $NF} "'],stdin=p1.stdout,stdout=subprocess.PIPE)
>>>p2.communicate()
('/Users/johndoe/sandbox/argparse.py\n/Users/johndoe/sandbox/custom_logic_substitute.py\n/Users/johndoe/sandbox/finditer_html_parse.py\n/Users/johndoe/sandbox/finditer_simple.py\n/Users/johndoe/sandbox/group_regex.py\n/Users/johndoe/sandbox/helo.py\n/Users/johndoe/sandbox/newdir/helld.xl.py\n/Users/johndoe/sandbox/parse_maillog.py\n/Users/johndoe/sandbox/replace_pattern.py\n/Users/johndoe/sandbox/split_text_match.py\n/Users/johndoe/sandbox/ssh_bad_login.py\n', None)
I could also get output by using p1 alone here like below,but i cant get the awk working here
list1=[]
result=p1.communicate()[0].split("\n")
for item in res:
a=item.rstrip('/').split('/')
list1.append(a[-1])
print list1

You are incorrectly passing in shell quoting (and extra shell quoting which isn't even required by the shell!) when you're not invoking a shell. Don't do that.
p2=subprocess.Popen(['awk', '-F/', '{print $NF}'], stdin=...
When you have shell=True you need extra quotes around some arguments to protect them from the shell, but there is no shell here, so putting them in is incorrect, and will cause parse errors by Awk.
However, you should almost never need to call Awk from Python, especially for trivial tasks which Python can easily do natively:
list1 = [line.split('/')[-1]
for line in subprocess.check_output(
["find", "/Users/johndoe/sandbox",
"-iname", "*.py"]).splitlines()]
In this particular case, note also that GNU find already has a facility to produce this result directly:
list1 = subprocess.check_output(
["find", "/Users/johndoe/sandbox",
"-iname", "*.py", "-printf", "%f\\n"]).splitlines()

Use this: p2.communicate()[0].split("\n").
It will output a list of lines.

if you don't have any reservation using shell=True , then this should be pretty simple solution
from subprocess import Popen
import subprocess
command='''
find /Users/johndoe/sandbox -iname "*.py" | awk -F'/' '{ print $NF}'
'''
process=Popen(command,shell=True,stdout=subprocess.PIPE)
result=process.communicate()
print result

Related

subprocess.call with command having embedded spaces and quotes

I would like to retrieve output from a shell command that contains spaces and quotes. It looks like this:
import subprocess
cmd = "docker logs nc1 2>&1 |grep mortality| awk '{print $1}'|sort|uniq"
subprocess.check_output(cmd)
This fails with "No such file or directory". What is the best/easiest way to pass commands such as these to subprocess?
The absolutely best solution here is to refactor the code to replace the entire tail of the pipeline with native Python code.
import subprocess
from collections import Counter
s = subprocess.run(
["docker", "logs", "nc1"],
text=True, capture_output=True, check=True)
count = Counter()
for line in s.stdout.splitlines():
if "mortality" in line:
count[line.split()[0]] += 1
for count, word in count.most_common():
print(count, word)
There are minor differences in how Counter objects resolve ties (if two words have the same count, the one which was seen first is returned first, rather than by sort order), but I'm guessing that's unimportant here.
I am also ignoring standard output from the subprocess; if you genuinely want to include output from error messages, too, just include s.stderr in the loop driver too.
However, my hunch is that you don't realize your code was doing that, which drives home the point nicely: Mixing shell script and Python raises the mainainability burden, because now you have to understand both shell script and Python to understand the code.
(And in terms of shell script style, I would definitely get rid of the useless grep by refactoring it into the Awk script, and probably also fold in the sort | uniq which has a trivial and more efficient replacement in Awk. But here, we are replacing all of that with Python code anyway.)
If you really wanted to stick to a pipeline, then you need to add shell=True to use shell features like redirection, pipes, and quoting. Without shell=True, Python looks for a command whose file name is the entire string you were passing in, which of course doesn't exist.

How to run advance Linux command on python script

I want to get the string output of the following linux command
systemctl show node_exporter |grep LoadState| awk '{split($0,a,"="); print a[2]}'
I tried with
import subprocess
output = subprocess.check_output("systemctl show node_exporter |grep LoadState| awk '{split($0,a,"="); print a[2]}'", shell=True)
but the output is,
output = subprocess.check_output("systemctl show node_exporter |grep LoadState| awk '{split($0,a,"="); print a[2]}'", shell=True)
SyntaxError: keyword can't be an expression
Well,
First of all, the function takes a list of strings as a command, not a single string. E.g.:
"ls -a -l" - wrong
["ls", "-a", "-l"] - good
Secondly. If the linux command is super complex or contains lots of lines - it makes sense to create a separate bash file e.g. command.sh, put your linux commands there and run the script from python with:
import subprocess
output = subprocess.check_output(["./command.sh"], shell=True)
You need to escape the double quotes (because they indicate the begin/end of the string):
import subprocess
output = subprocess.check_output("systemctl show node_exporter |grep LoadState| awk '{split($0,a,\"=\"); print a[2]}'", shell=True)

python variables in awk

I like python and I like awk too, and I know that can use it via subprocess or command library, BUT I want to use awk with variables defined before in python, like this simple example:
file = 'file_i_want_read.list'
awk '{print $0}' file > another_file
anybody know how can I do it or something similar?
The easy way to do this is to not use the shell, and instead just pass a list of arguments to subprocess, so file is just one of those arguments.
The only trick is that if you don't use the shell, you can't use shell features like redirection; you have to use the equivalent subprocess features. Like this:
with open('another_file', 'wb') as output:
subprocess.check_call(['awk', '{print $0}', file], stdout=output)
If you really want to use shell redirection instead, then you have to build a shell command line. That's mainly just a matter of using your favorite Python string manipulation methods. But you need to be careful to make sure to quote and/or escape thingsā€”e.g., if file might be file i want read.list, then that will show up as 4 separate arguments unless you put it in quotes. shlex.quote can do that for you. So:
cmdline = "awk '{print $0}' %s > another_file" % (shlex.quote(file),)
subprocess.check_call(cmdline, shell=True)

Formatting a command in python subprocess popen

I am trying to format the following awk command
awk -v OFS="\t" '{printf "chr%s\t%s\t%s\n", $1, $2-1, $2}' file1.txt > file2.txt
for use in python subprocess popen. However i am having a hard time formatting it. I have tried solutions suggested in similar answers but none of them worked. I have also tried using raw string literals. Also i would not like to use shell=True as this is not recommended
Edit according to comment:
The command i tried was
awk_command = """awk -v OFS="\t" '{printf "chr%s\t%s\t%s\n", $1, $2-1, $2}' file1.txt > file2.txt"""
command_execute = Popen(shlex.split(awk_command))
However i get the following error upon executing this
KeyError: 'printf "chr%s\t%s\t%s\n", $1, $2-1, $2'
googling the error suggests this happens when a value is requested for an undefined key but i do not understand its context here
> is the shell redirection operator. To implement it in Python, use stdout parameter:
#!/usr/bin/env python
import shlex
import subprocess
cmd = r"""awk -v OFS="\t" '{printf "chr%s\t%s\t%s\n", $1, $2-1, $2}'"""
with open('file2.txt', 'wb', 0) as output_file:
subprocess.check_call(shlex.split(cmd) + ["file1.txt"], stdout=output_file)
To avoid starting a separate process, you could implement this particular awk command in pure Python.
The simplest method, especially if you wish to keep the output redirection stuff, is to use subprocess with shell=True - then you only need to escape Python special characters. The line, as a whole, will be interpreted by the default shell.
WARNING: do not use this with untrusted input without sanitizing it first!
Alternatively, you can replace the command line with an argv-type sequence and feed that to subprocess instead. Then, you need to provide stuff as the program would see it:
remove all the shell-level escaping
remove the output redirection stuff and do the redirection yourself instead
Regarding the specific problems:
you didn't escape Python special characters in the string so \t and \n became the literal tab and newline (try to print awk_command)
using shlex.split is nothing different from shell=True - with an added unreliability since it cannot guarantee if would parse the string the same way your shell would in every case (not to mention the lack of transmutations the shell makes).
Specifically, it doesn't know or care about the special meaning of the redirection part:
>>> awk_command = """awk -v OFS="\\t" '{printf "chr%s\\t%s\\t%s\\n", $1, $2- 1, $2}' file1.txt > file2.txt"""
>>> shlex.split(awk_command)
['awk','-v','OFS=\\t','{printf "chr%s\\t%s\\t%s\\n", $1, $2-1, $2}','file1.txt','>','file2.txt']
So, if you wish to use shell=False, do construct the argument list yourself.

Escaping escape sequence in Python

I am kind of new to python. Goal is to execute a shell command using subprocess parse & retrive the printed output from shell. The execution errors out as shown in the sample output msg below. Also shown below is the sample code snippet
Code snippet:
testStr = "cat tst.txt | grep Location | sed -e '/.*Location: //g' "
print "testStr = "+testStr
testStrOut = subprocess.Popen([testStr],shell=True,stdout=subprocess.PIPE).communicate()[0]
Output:
testStr = cat tst.txt | grep Location | sed -e '/.*Location: //g'
cat: tst.txt: No such file or directory
sed: -e expression #1, char 15: unknown command: `/'
Is there a workaround or a function that could be used ?
Appreciate your help
Thanks
I suppose your main error is not python related. To be more precise, there are 3 of them:
You forgot to import subprocess.
It should be sed -e 's/.*Location: //g'. You wrote ///g instead of s///g.
tst.txt does not exist.
You should be passing testStr directly as the first argument, rather than enclosing it in a list. See subprocess.Popen, the paragraph that starts "On Unix, with shell=True: ...".

Categories

Resources