I'm having trouble translating a simple grep command into Python. I want to capture the output of the following command in a file or a list.
grep -c 'some thing' /home/user/* | grep -v :0
This is what I have, but it's not working at all...
thing = str(subprocess.Popen(['grep', '-c', 'some thing', '/home/user/*', '|', 'grep', '-v', ':0'], stdout=subprocess.PIPE)
Basically I need to search files in a directory and return a result if my string is missing from any of the files in the directory.
Working Code (Thanks!!):
thing = subprocess.Popen(('grep -c "some thing" /home/user/* | grep -v ":0"' ),shell=True, stdout=subprocess.PIPE)
The pipe | is a shell feature. You have to use Popen with shell=True to use it.
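As a minimal sketch of the shell=True approach that also puts the result into a list: the helper name shell_lines is made up for illustration, and the grep pipeline in the comment is the one from the question.

```python
import subprocess


def shell_lines(command):
    """Run a command line through the shell and return its stdout as a list of lines."""
    # shell=True hands the whole string to the shell, so | and * are interpreted there.
    proc = subprocess.run(command, shell=True,
                          stdout=subprocess.PIPE, universal_newlines=True)
    # run() is used without check=True on purpose: grep exits with status 1
    # when it selects no lines, which is not an error for this use.
    return proc.stdout.splitlines()


# Hypothetical usage with the pipeline from the question:
# results = shell_lines('grep -c "some thing" /home/user/* | grep -v ":0"')
```

Only pass trusted strings this way, since the shell interprets everything in the command.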
To emulate the shell pipeline in Python, see How do I use subprocess.Popen to connect multiple processes by pipes?:
#!/usr/bin/env python
import os
from glob import glob
from subprocess import Popen, PIPE
p1 = Popen(["grep", "-c", 'some thing'] + glob(os.path.expanduser('~/*')),
stdout=PIPE)
p2 = Popen(["grep", "-v", ":0"], stdin=p1.stdout)
p1.stdout.close()
p2.wait()
p1.wait()
To get output as a string, set stdout=PIPE and call output = p2.communicate()[0] instead of p2.wait().
To suppress error messages such as "grep: /home/user/dir: Is a directory", you could set stderr=DEVNULL.
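Putting those two remarks together, a sketch of the same pipeline that captures the output and hides error messages from the first command might look like this. The function name pipeline_output is an assumption for illustration, and DEVNULL requires Python 3.3 or later.

```python
from subprocess import Popen, PIPE, DEVNULL


def pipeline_output(first_cmd, second_cmd):
    """Emulate `first_cmd | second_cmd` without a shell; return stdout as text."""
    # stderr=DEVNULL hides messages such as "Is a directory" from the first command.
    p1 = Popen(first_cmd, stdout=PIPE, stderr=DEVNULL)
    p2 = Popen(second_cmd, stdin=p1.stdout, stdout=PIPE)
    p1.stdout.close()  # allow p1 to receive SIGPIPE if p2 exits first
    output = p2.communicate()[0]  # read everything and wait for p2
    p1.wait()
    return output.decode()


# Hypothetical usage; skip the call when the glob is empty, otherwise grep
# would wait for input on stdin:
# import os
# from glob import glob
# files = glob(os.path.expanduser('~/*'))
# if files:
#     print(pipeline_output(["grep", "-c", "some thing"] + files,
#                           ["grep", "-v", ":0"]))
```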
You could implement the pipeline in pure Python:
import os
from glob import glob
for name in glob(os.path.expanduser('~/*')):
    try:
        count = sum(1 for line in open(name, 'rb') if b'some thing' in line)
    except IOError:
        pass # ignore
    else:
        if count: # don't print zero counts
            print("%s:%d" % (name, count))
Related
I want to get only the wlan device name on a Linux system with Python. I can get the device name with shell scripting:
echo /sys/class/net/*/wireless | awk -F'/' '{ print $5 }'
So I want to do the same in Python with subprocess.
import shlex
import subprocess
def main():
    echo = shlex.split('echo /sys/class/net/*/wireless')
    echo_proc = subprocess.Popen(echo, shell=True, stdout=subprocess.PIPE)
    awk = shlex.split("awk -F'/' '{ print $5 }'")
    awk_proc = subprocess.Popen(awk, stdin=echo_proc.stdout)
    print(awk_proc.stdout)
But I get only None as output. If it is possible, I would prefer a solution with subprocess.run(). So I replaced Popen with run. But then I get the error message AttributeError: 'bytes' object has no attribute 'fileno'.
Glob patterns and pathname expansion by the shell can be a headache.
In my environment, the following snippet works:
import subprocess
subprocess.run('echo /sys/class/net/*/wireless', shell=True)
But the following returns an empty string:
import subprocess
subprocess.run(['echo', '/sys/class/net/*/wireless'], shell=True)
Then please try the following as a starting point:
import subprocess
subprocess.run(r'echo /sys/class/net/*/wireless | awk -F"/" "{ print \$5 }"', shell=True)
which will bring your desired output.
[Update]
If you want to assign the output above to a variable, please try:
import subprocess
proc = subprocess.run(r'echo /sys/class/net/*/wireless | awk -F"/" "{ print \$5 }"', shell=True, stdout=subprocess.PIPE)
wlan = proc.stdout.decode("utf8").rstrip("\n")
print(wlan)
BTW, if you are not tied to the subprocess module, why not do it natively in Python:
import glob
paths = glob.glob('/sys/class/net/*/wireless')  # avoid shadowing the built-in list
for path in paths:
    print(path.split('/')[4])
Hope this helps.
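On the AttributeError from the question: subprocess.run() returns a CompletedProcess whose stdout is a bytes object, not a file, so it cannot be passed as stdin= to a second call. A possible sketch that stays with run() and avoids the shell passes the captured bytes through input= instead. The helper name run_pipeline is an assumption for illustration.

```python
import subprocess


def run_pipeline(first_cmd, second_cmd):
    """Feed the captured stdout of first_cmd to second_cmd using subprocess.run()."""
    first = subprocess.run(first_cmd, stdout=subprocess.PIPE, check=True)
    # first.stdout is a bytes object, so it goes in through input=, not stdin=.
    second = subprocess.run(second_cmd, input=first.stdout,
                            stdout=subprocess.PIPE, check=True)
    return second.stdout.decode("utf8").rstrip("\n")


# Hypothetical usage; the glob is expanded in Python because no shell is involved:
# import glob
# paths = glob.glob('/sys/class/net/*/wireless')
# wlan = run_pipeline(['echo'] + paths, ['awk', '-F/', '{ print $5 }'])
```

Unlike a Popen-based pipeline, the two commands here run one after the other and the intermediate output is held in memory, which is fine for small outputs like this one.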
My subprocess call should be calling tabix 1kg.phase1.snp.bed.gz -B test.bed | awk '{FS="\t";OFS="\t"} $4 >= 10' but is giving me errors because it has both " and ' in it. I have tried using r for a raw string but I can't figure out the right combination to prevent errors. My current call looks like:
snp_tabix = subprocess.Popen(["tabix", tgp_snp, "-B", infile, "|", "awk", """'{FS="\t";OFS="\t"}""", "$4", ">=", maf_cut_off, r"'"], stdout=subprocess.PIPE)
Which gives the error TypeError: execv() arg 2 must contain only strings
r"'" is not the issue. Most likely you're passing maf_cut_off as an integer, which is incorrect. You should use str(maf_cut_off).
There are several issues. You are trying to execute a shell command (there is a pipe | in the command). So it won't work even if you convert all variables to strings.
You could execute it using shell:
from shlex import quote  # Python 3.3+; on Python 2 use: from pipes import quote
from subprocess import check_output
cmd = r"""tabix %s -B %s | awk '{FS="\t";OFS="\t"} $4 >= %d'""" % (
    quote(tgp_snp), quote(infile), maf_cut_off)
output = check_output(cmd, shell=True)
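To see what the quoting step does to the command line before it reaches the shell, here is a small sketch. The function name build_tabix_command and the file names are made up for illustration; tabix itself is not run.

```python
from shlex import quote


def build_tabix_command(tgp_snp, infile, maf_cut_off):
    """Build the shell command line as a string, quoting the file names."""
    return r"""tabix %s -B %s | awk '{FS="\t";OFS="\t"} $4 >= %d'""" % (
        quote(tgp_snp), quote(infile), maf_cut_off)


# A file name containing a space gets wrapped in single quotes;
# a plain name such as test.bed is left unchanged.
print(build_tabix_command("my snps.bed.gz", "test.bed", 10))
```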
Or you could use the pipe recipe from subprocess' docs:
from subprocess import Popen, PIPE
tabix = Popen(["tabix", tgp_snp, "-B", infile], stdout=PIPE)
awk = Popen(["awk", r'{FS="\t";OFS="\t"} $4 >= %d' % maf_cut_off],
stdin=tabix.stdout, stdout=PIPE)
tabix.stdout.close() # allow tabix to receive a SIGPIPE if awk exits
output = awk.communicate()[0]
tabix.wait()
Or you could use plumbum that provides some syntax sugar for shell commands:
from plumbum.cmd import tabix, awk
cmd = tabix[tgp_snp, '-B', infile]
cmd |= awk[r'{FS="\t";OFS="\t"} $4 >= %d' % maf_cut_off]
output = cmd() # run it and get output
Another option is to reproduce the awk command in pure Python. To get all lines that have 4th field larger than or equal to maf_cut_off numerically (as an integer):
from subprocess import Popen, PIPE
tabix = Popen(["tabix", tgp_snp, "-B", infile], stdout=PIPE)
lines = []
for line in tabix.stdout:
    columns = line.split(b'\t', 4)
    if len(columns) > 3 and int(columns[3]) >= maf_cut_off:
        lines.append(line)
output = b''.join(lines)
tabix.communicate() # close streams, wait for the subprocess to exit
tgp_snp, infile should be strings and maf_cut_off should be an integer.
You could use bufsize=-1 (Popen()'s parameter) to improve time performance.
I have the following command:
$ ffmpeg -i http://url/1video.mp4 2>&1 | perl -lane 'print $1 if /(\d+x\d+)/'
640x360
I'm trying to set the output of this command into a python variable. Here is what I have so far:
>>> from subprocess import Popen, PIPE
>>> p1 = Popen(['ffmpeg', '-i', 'http://url/1video.mp4', '2>&1'], stdout=PIPE)
>>> p2=Popen(['perl','-lane','print $1 if /(\d+x\d+)/'], stdin=p1.stdout, stdout=PIPE)
>>> dimensions = p2.communicate()[0]
''
What am I doing incorrectly here, and how would I get the correct value for dimensions?
In general, you can replace a shell pipeline with this pattern:
p1 = Popen(["dmesg"], stdout=PIPE)
p2 = Popen(["grep", "hda"], stdin=p1.stdout, stdout=PIPE)
p1.stdout.close() # Allow p1 to receive a SIGPIPE if p2 exits.
output = p2.communicate()[0]
However, in this case, no pipeline is necessary:
import subprocess
import shlex
import re
url = 'http://url/1video.mp4'
proc = subprocess.Popen(shlex.split('ffmpeg -i {f}'.format(f=url)),
                        stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE,
                        universal_newlines=True)  # text mode, so the str regex matches
dimensions = None
for line in proc.stderr:  # ffmpeg writes its stream info to stderr
    match = re.search(r'(\d+x\d+)', line)
    if match:
        dimensions = match.group(1)
        break
print(dimensions)
No need to call perl from within python.
If you have the output from ffmpeg in a variable (called ffmpeg_output here), you can do something like this:
print(re.search(r'(\d+x\d+)', ffmpeg_output).group())
Note the “shell” argument to subprocess.Popen: this specifies whether the command you pass is parsed by the shell or not.
That “2>&1” is one of those things that needs to be parsed by a shell, otherwise FFmpeg (like most programs) will try to treat it as a filename or option value.
The Python sequence that most closely mimics the original would probably be more like
p1 = subprocess.Popen("ffmpeg -i http://url/1video.mp4 2>&1", shell = True, stdout = subprocess.PIPE)
p2 = subprocess.Popen(r"perl -lane 'print $1 if /(\d+x\d+)/'", shell = True, stdin = p1.stdout, stdout = subprocess.PIPE)
dimensions = p2.communicate()[0]
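If you would rather not involve a shell at all, the effect of 2>&1 can be had with stderr=subprocess.STDOUT. A minimal sketch, assuming Python 3.5+ for subprocess.run(); the helper name first_match is invented for illustration, and the pattern must contain one capture group.

```python
import re
import subprocess


def first_match(cmd, pattern):
    """Run cmd with stderr merged into stdout and return the first regex group found."""
    # stderr=subprocess.STDOUT plays the role of the shell's 2>&1.
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                          universal_newlines=True)
    match = re.search(pattern, proc.stdout)
    return match.group(1) if match else None


# Hypothetical usage with the command from the question:
# dimensions = first_match(['ffmpeg', '-i', 'http://url/1video.mp4'], r'(\d+x\d+)')
```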
In Python I need to get the version of an external binary I need to call in my script.
Let's say that I want to use Wget in Python and I want to know its version.
I will call
os.system( "wget --version | grep Wget" )
and then I will parse the outputted string.
How do I capture the stdout of os.system in a string in Python?
One "old" way (Python 2 only; os.popen4 was removed in Python 3) is:
fin,fout=os.popen4("wget --version | grep Wget")
print fout.read()
The modern way is to use the subprocess module:
import subprocess
cmd = subprocess.Popen('wget --version', shell=True, stdout=subprocess.PIPE,
                       universal_newlines=True)  # text mode, so lines are str
for line in cmd.stdout:
    if "Wget" in line:
        print(line.rstrip())
Use the subprocess module:
from subprocess import Popen, PIPE
p1 = Popen(["wget", "--version"], stdout=PIPE)
p2 = Popen(["grep", "Wget"], stdin=p1.stdout, stdout=PIPE)
output = p2.communicate()[0]
Use subprocess instead.
If you are on *nix, I would recommend the commands module (Python 2 only; Python 3 offers subprocess.getstatusoutput instead).
import commands
status, res = commands.getstatusoutput("wget --version | grep Wget")
print status # Should be zero in case of success, otherwise would have an error code
print res # Contains stdout
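For Python 3, where the commands module and os.popen4 no longer exist, a sketch of the same idea that does the grep step in Python could look like the following. The helper name matching_lines is an assumption for illustration.

```python
import subprocess


def matching_lines(cmd, needle):
    """Return the lines of cmd's stdout that contain needle."""
    # check_output raises CalledProcessError if the command exits with a non-zero status.
    output = subprocess.check_output(cmd, universal_newlines=True)
    return [line for line in output.splitlines() if needle in line]


# Hypothetical usage: matching_lines(["wget", "--version"], "Wget")
```

Filtering in Python avoids both the shell and the extra grep process.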
I would like to replicate this in python:
gvimdiff <(hg cat file.txt) file.txt
(hg cat file.txt outputs the most recently committed version of file.txt)
I know how to pipe the file to gvimdiff, but it won't accept another file:
$ hg cat file.txt | gvimdiff file.txt -
Too many edit arguments: "-"
Getting to the python part...
# hgdiff.py
import subprocess
import sys
file = sys.argv[1]
subprocess.call(["gvimdiff", "<(hg cat %s)" % file, file])
When subprocess is called it merely passes <(hg cat file) onto gvimdiff as a filename.
So, is there any way to redirect a command as bash does?
For simplicity's sake just cat a file and redirect it to diff:
diff <(cat file.txt) file.txt
It can be done. As of Python 2.5, however, this mechanism is Linux-specific and not portable:
import subprocess
import sys
file = sys.argv[1]
p1 = subprocess.Popen(['hg', 'cat', file], stdout=subprocess.PIPE)
p2 = subprocess.Popen([
'gvimdiff',
'/proc/self/fd/%s' % p1.stdout.fileno(),
file])
p2.wait()
That said, in the specific case of diff, you can simply take one of the files from stdin, and remove the need to use the bash-alike functionality in question:
file = sys.argv[1]
p1 = subprocess.Popen(['hg', 'cat', file], stdout=subprocess.PIPE)
p2 = subprocess.Popen(['diff', '-', file], stdin=p1.stdout, stdout=subprocess.PIPE)
diff_text = p2.communicate()[0]
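A portable alternative, which is not process substitution itself but a plain temporary file standing in for it, is sketched below. The function name and its parameters are made up for illustration. The producer's output is written to a temporary file, whose name is then handed to the consumer as an ordinary filename argument.

```python
import os
import subprocess
import tempfile


def run_with_command_output_as_file(producer_cmd, consumer_cmd, extra_args=()):
    """Run consumer_cmd with a temp file holding producer_cmd's output as an argument."""
    # delete=False keeps the file around after the with block closes it.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        subprocess.check_call(producer_cmd, stdout=tmp)
    try:
        # The consumer sees: consumer_cmd... <temp file> extra_args...
        return subprocess.call(list(consumer_cmd) + [tmp.name] + list(extra_args))
    finally:
        os.unlink(tmp.name)  # clean up once the consumer has finished


# Hypothetical usage for the question. gvim normally detaches from the terminal,
# so -f keeps it in the foreground until the temp file is no longer needed:
# run_with_command_output_as_file(["hg", "cat", file], ["gvimdiff", "-f"], [file])
```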
There is also the commands module:
import commands
status, output = commands.getstatusoutput("gvimdiff <(hg cat file.txt) file.txt")
There is also the popen set of functions, if you want to actually grok the data from a command as it is running.
This is actually an example in the docs:
p1 = Popen(["dmesg"], stdout=PIPE)
p2 = Popen(["grep", "hda"], stdin=p1.stdout, stdout=PIPE)
output = p2.communicate()[0]
which means for you:
from subprocess import Popen, PIPE
import sys
file = sys.argv[1]
p1 = Popen(["hg", "cat", file], stdout=PIPE)
p2 = Popen(["gvimdiff", "file.txt"], stdin=p1.stdout, stdout=PIPE)
output = p2.communicate()[0]
This avoids the Linux-specific /proc/self/fd mechanism, so it should also work on other Unix systems such as Solaris and the BSDs (including macOS), and maybe even on Windows.
It just dawned on me that you are probably looking for one of the popen functions.
from: http://docs.python.org/lib/module-popen2.html
popen3(cmd[, bufsize[, mode]])
Executes cmd as a sub-process. Returns the file objects (child_stdout, child_stdin, child_stderr).
namaste,
Mark