I need to get the duration of a video for a Django application, so I'll have to do this in Python. But I'm really a beginner at this, so it would be nice if you could help.
This is what I got so far:
import subprocess
task = subprocess.Popen("avconv -i video.mp4 2>&1 | grep Duration | cut -d ' ' -f 4 | sed -r 's/([^\.]*)\..*/\1/'", shell=True, stdout=subprocess.PIPE)
time = task.communicate()[0]
print time
I want to solve it with avconv because I'm already using it at another point. The shell command works well so far and gives me output like:
HH:MM:SS.
But when I'm executing the Python code, I just get a non-interpretable symbol on the shell.
Thanks a lot already for your help!
Found a solution. The problem was the sed part:
import subprocess

task = subprocess.Popen("avconv -i video.mp4 2>&1 | grep Duration | cut -d ' ' -f 4 | sed -e 's/.\{4\}$//'", shell=True, stdout=subprocess.PIPE)
time = task.communicate()[0]
print time
Because it is always the same part, it was enough to just cut the last four characters. (The original sed expression likely broke because \1 inside a normal, non-raw Python string is an octal escape: sed received the control character \x01 instead of the literal backreference \1, which would explain the non-interpretable symbol.)
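As an aside, the same trimming can be done on the Python side, keeping the shell part simpler. A minimal sketch, assuming the avconv pipeline from above without the sed stage:

import subprocess

# take the raw field (e.g. '00:01:23.45,') and cut the last four characters in Python
task = subprocess.Popen("avconv -i video.mp4 2>&1 | grep Duration | cut -d ' ' -f 4",
                        shell=True, stdout=subprocess.PIPE)
raw = task.communicate()[0].strip()
duration = raw[:-4]  # '00:01:23.45,' -> '00:01:23'
print duration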
From the Python documentation:
Warning
Use communicate() rather than .stdin.write, .stdout.read or .stderr.read to avoid deadlocks due to any of the other OS pipe buffers filling up and blocking the child process.
So you should really use communicate() for that:
import subprocess
task = subprocess.Popen("avconv -i video.mp4 2>&1 | grep Duration | cut -d ' ' -f 4 | sed -r 's/([^\.]*)\..*/\1/'", shell=True, stdout=subprocess.PIPE)
time = task.communicate()[0]
print time
That way you can also catch the stderr messages, if any.
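Note that the 2>&1 in the shell command already merges stderr into stdout. To capture stderr separately instead, a sketch along these lines should work (avconv prints its stream info, including the Duration line, on stderr):

import subprocess

task = subprocess.Popen(["avconv", "-i", "video.mp4"],
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = task.communicate()  # the Duration line will be somewhere in err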
Related
I need help with killing an application in Linux.
As a manual process I can use the command ps -ef | grep "app_name" | awk '{print $2}'.
It gives me the job IDs, and then I kill each one with the command kill -9 jobid.
I want a Python script which can do this task.
I have written this code:
import os
os.system("ps -ef | grep app_name | awk '{print $2}'")
This prints the job IDs, but os.system() only returns the command's exit status (an "int"), not its output, so I am not able to kill the application.
Can you please help here?
Thank you
import os
import subprocess

# 'grep -v grep' keeps the grep process itself out of the ps output,
# otherwise its (already finished) PID would end up in the kill list
temp = subprocess.run("ps -ef | grep 'app_name' | grep -v grep | awk '{print $2}'",
                      shell=True, stdout=subprocess.PIPE)
job_ids = temp.stdout.decode("utf-8").strip().split("\n")
# sample job_ids will be: ['59899', '68977', '68979']
# convert them to integers
job_ids = list(map(int, job_ids))
# job_ids = [59899, 68977, 68979]
Then iterate through the job IDs and kill them, using os.kill():
for job_id in job_ids:
    os.kill(job_id, 9)
subprocess.run docs: https://docs.python.org/3/library/subprocess.html#subprocess.run
To kill a process in Python, call os.kill(pid, sig), with sig = 9 (the signal number for SIGKILL) and pid = the process ID (PID) to kill.
To get the process ID, use os.popen instead of os.system above. Alternatively, use subprocess.Popen(..., stdout=subprocess.PIPE). In both cases, call the .readline() method and convert its return value to an integer with int(...).
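A minimal sketch of that approach, assuming a single matching process (app_name is a placeholder; the grep -v grep stage keeps grep itself out of the match):

import os
import signal

line = os.popen("ps -ef | grep app_name | grep -v grep | awk '{print $2}'").readline()
pid = int(line)
os.kill(pid, signal.SIGKILL)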
I have an older Python 2.7.5 script which suddenly causes problems on Red Hat Enterprise Linux Server release 7.6 (Maipo). As far as I can tell, it runs fine on Red Hat Enterprise Linux Server release 7.4 (Maipo).
The script basically implements something like
cat /proc/cpuinfo | grep -m 1 -i 'cpu MHz'
by creating two subprocesses and piping the output of the first into the second (see the code example below). On the newer OS version, the cat processes stay open until the script terminates.
It seems that the pipe to grep somehow holds the cat process open, and I can't find any documentation on how to explicitly close it.
The issue can be reproduced by pasting this code into the Python CLI and then checking the ps process list for a static process 'cat /proc/cpuinfo'.
The code breaks down what's originally happening inside a loop, so please don't argue about its style. ;-)
import shlex
from subprocess import *
cmd1 = "cat /proc/cpuinfo"
cmd2 = "grep -m 1 -i 'cpu MHz'"
args1 = shlex.split(cmd1) # split into args
args2 = shlex.split(cmd2) # split into args
# first process uses default stdin
ps1 = Popen(args1, stdout=PIPE)
# then use the output of the previous process as stdin
ps2 = Popen(args2, stdin=ps1.stdout, stdout=PIPE)
out, err = ps2.communicate()
print(out)
Afterwards, check the process list in a second session(!) with:
ps -eF |grep -v grep|grep /proc/cpuinfo
On RHEL 7.4 I find no open process in the process list, whereas on RHEL 7.6, after some attempts, it looks like this:
[reinski@myhost ~]$ ps -eF |grep -v grep|grep /proc/cpuinfo
reinski 2422 89459 0 26993 356 142 18:46 pts/3 00:00:00 cat /proc/cpuinfo
reinski 2597 139605 0 26993 352 31 18:39 pts/3 00:00:00 cat /proc/cpuinfo
reinski 7809 139605 0 26993 352 86 18:03 pts/3 00:00:00 cat /proc/cpuinfo
These processes only disappear when I close the Python CLI, in which case I get errors like this (I left the formatting messed up, as it was):
cat: write error: Broken pipe
cat: write errorcat: write error: Broken pipe
: Broken pipe
Why does cat obviously still want to write to the pipe, even though it should already have output the whole of /proc/cpuinfo and terminated?
Or more importantly: how can I prevent this from happening?
Thanks for any help!
Example 2:
Given the suggestion from VPfB, it turned out that my example was a little unlucky, since the expected result can be achieved by a single grep command.
So here is a modified example that shows the problem with piping in another way:
import shlex
from subprocess import *
cmd1 = "grep -m 1 -i 'cpu MHz' /proc/cpuinfo"
cmd2 = "awk '{print $4}'"
args1 = shlex.split(cmd1) # split into args
args2 = shlex.split(cmd2) # split into args
# first process uses default stdin
ps1 = Popen(args1, stdout=PIPE)
# then use the output of the previous process as stdin
ps2 = Popen(args2, stdin=ps1.stdout, stdout=PIPE)
out, err = ps2.communicate()
print(out)
This time, the result is a single zombie process for the grep process (169731 is the PID of the Python session):
[reinski@myhost ~]$ ps -eF|grep 169731
reinski 169731 189499 0 37847 6024 198 17:51 pts/2 00:00:00 python
reinski 193999 169731 0 0 0 142 17:53 pts/2 00:00:00 [grep] <defunct>
So, is this just another symptom of the same problem, or am I doing something completely wrong here?
OK, it seems I just found a solution for the zombie processes staying open in the examples:
Simply do a
ps1.communicate()
It seems this is required to close the pipe properly.
I'd expect this to happen when the second process's communicate() is called and it reads the pipe from the first process.
Can someone maybe point out to me what I am missing here?
I am always willing to learn... ;-)
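For what it's worth, the subprocess documentation's recommended pattern for replacing a shell pipeline also closes the first process's stdout handle in the parent, so the first process can receive SIGPIPE if the consumer exits early, and then reaps both processes. A sketch applied to the cat | grep example:

from subprocess import Popen, PIPE

ps1 = Popen(["cat", "/proc/cpuinfo"], stdout=PIPE)
ps2 = Popen(["grep", "-m", "1", "-i", "cpu MHz"], stdin=ps1.stdout, stdout=PIPE)
ps1.stdout.close()  # allow ps1 to receive SIGPIPE if ps2 exits first
out, err = ps2.communicate()
ps1.wait()          # reap ps1 so no zombie is left behind
print(out)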
I have a script that saves videos of 5 seconds in length locally and emits the file name.
Here's the bash command:
ffmpeg -i http://0.0.0.0:8080/stream/video.mjpeg -vcodec copy -map 0 -f segment -segment_time 2 -loglevel 40 -segment_format mp4 capture-%05d.mp4 2>&1 | grep --line-buffered -Eo "segment:.+ended" | gawk -F "'" '{print $2; system("")}' | xargs -n1
If I run this command in the terminal, it returns the expected file name, like so:
capture-00001.mp4
Notice that the xargs command at the end easily lets me pass the file name to a Python script as a new argument. But now I want to execute this command within Python itself, specifically getting the file name with subprocess.
Here's what I've done so far. When running the script, the terminal prints the file name as expected, but it never gets passed as a string to fName. I've tried subprocess.check_output, but it never returns anything, since the command continuously captures videos and saves them locally.
FFMPEG_SCRIPT = r"""ffmpeg -i http://0.0.0.0:8080/stream/video.mjpeg -vcodec copy -map 0 -f segment -segment_time 2 -loglevel 40 -segment_format mp4 capture-%05d.mp4 2>&1 | grep --line-buffered -Eo "segment:.+ended" | gawk -F "'" '{print $2; system("")}' | xargs -n1 """
try:
    fName = subprocess.check_call(FFMPEG_SCRIPT, stderr=subprocess.STDOUT, shell=True).decode('utf-8')
    print(">>> {}".format(fName))
except subprocess.CalledProcessError as e:
    print(e.output)
import subprocess
from subprocess import PIPE
fName = subprocess.Popen("test.bat", stdin=PIPE, stdout=PIPE)
(stdout, stderr) = fName.communicate()
print(">>> {}".format(fName))
print(stdout)
test.bat is a simple echo yes, since I can't test with your script, and the resulting output of print(stdout) is b'yes\r\n'. If the file name is the only thing the script prints, it shouldn't be too hard to extract it.
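For instance, a one-line sketch of that extraction, assuming the file name is the only line printed:

fName = stdout.decode("utf-8").strip()  # e.g. b'capture-00001.mp4\r\n' -> 'capture-00001.mp4'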
That is a common problem when you pipe commands. The IO subsystem handles output differently on a terminal than on files or pipes. On a terminal, output is flushed on each newline; that does not happen on files or pipes unless the program specifically asks for an explicit flush.
It is not a problem in pipelines where the first command ends on an end of file, because everything is flushed before the command exits and the write end of the pipe is closed. So the next command sees an end of file and everything propagates smoothly.
But when the first program contains an infinite reading loop, its output is queued and nothing is flushed until the buffer is full, and buffers are huge on modern systems. In that case everything works, but you cannot see any output for a long time.
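Given that, one way to pick up the file names from the never-ending ffmpeg pipeline is to read its stdout line by line as output arrives, instead of waiting for the process to exit. A sketch, reusing the FFMPEG_SCRIPT string from the question (the grep stage already uses --line-buffered, so lines should come through promptly):

import subprocess

# read one line at a time as the pipeline produces it, rather than
# waiting for termination (the capture process never terminates)
proc = subprocess.Popen(FFMPEG_SCRIPT, shell=True, stdout=subprocess.PIPE)
for raw_line in iter(proc.stdout.readline, b""):
    fName = raw_line.decode("utf-8").strip()
    print(">>> {}".format(fName))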
Here is a version based on Alexandre Cox's proposal, with a piped command to check a mount:
import subprocess
fName = subprocess.Popen('mount | grep sda3 | cut -d " " -f 3', shell=True, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
(stdout, stderr) = fName.communicate()
print(">>> {}".format(fName))
print(stdout)
If the mount exists, the output is:
>>> <subprocess.Popen object at 0x7f9b5eb2fdd8>
b'/mnt/sda3\n'
Executing this in a shell gets me tangible results:
wget -O c1 --no-cache "http://some.website" | sed "1,259d" c1 | sed "4,2002d"
Doing this in Python gets me nothing:
subprocess.call(shlex.split("wget -O c1 --no-cache \"http://some.website/tofile\""))
c1 = open("c1",'w')
first = subprocess.Popen(shlex.split("sed \"1,259d\" c1"), stdout=subprocess.PIPE)
subprocess.Popen(shlex.split("sed \"4,2002d\""), stdin=first.stdout, stdout=c1)
c1.close()
Doing this also gets me no results:
c1.write(subprocess.Popen(shlex.split("sed \"4,2002d\""), stdin=first.stdout, stdout=subprocess.PIPE).communicate()[0])
By 'gets me nothing' I mean blank output in the file. Does anyone see anything out of the ordinary here?
I always use plumbum for running external commands. It provides a very intuitive interface and, of course, takes care of escaping for me.
It would look something like:
from plumbum.cmd import wget, sed
cmd1 = wget['-O', 'c1']['--no-cache']["http://some.website"]
cmd2 = sed["1,259d"]['c1'] | sed["4,2002d"]
print cmd1
cmd1() # run it
print cmd2
cmd2() # run it
The statement c1 = open("c1",'w') opens file c1 for writing and truncates any existing data, so everything wget wrote to the file gets erased before you call sed.
Anyway, I think shlex.split is generally awkward. I prefer to build the args list manually:
from subprocess import Popen, PIPE
p0 = Popen(['wget', '-O', '-', 'http://www.google.com'], stdout=PIPE)
p1 = Popen(['sed', '2,8d'], stdin=p0.stdout, stdout=PIPE)
with open('c1', 'w') as c1:
    p2 = Popen(['sed', '2,7d'], stdin=p1.stdout, stdout=c1)
    p2.wait()
However, there's no obvious reason a Python programmer should have to call out to sed. Python has string methods and regular expressions. Also, instead of wget you can use urllib2.urlopen.
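For illustration, a rough Python 2 equivalent of the question's two sed stages, using urllib2 and plain list slicing (the URL and line ranges are the placeholders from the question):

import urllib2

lines = urllib2.urlopen('http://some.website/tofile').read().splitlines(True)
del lines[0:259]   # sed "1,259d": drop lines 1-259 (sed counts from 1)
del lines[3:2002]  # sed "4,2002d": drop lines 4-2002 of what remains
with open('c1', 'w') as c1:
    c1.writelines(lines)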
Why not just do everything all in pipes and send the output to a file?
wget -O - "http://www.google.com" | sed "1,259d" | sed "4,2002d" > c1
Or if you don't want to send it to a file, and want it on stdout instead:
wget -O - "http://www.google.com" | sed "1,259d" | sed "4,2002d"
And if you want to do it in Python (note that the pipe characters are shell syntax, so the pipeline must run with shell=True instead of being tokenized with shlex.split):
pipe = subprocess.Popen("wget -O - \"http://www.google.com\" | sed \"1,259d\" | sed \"4,2002d\"", shell=True, stdout=subprocess.PIPE)
result = pipe.communicate()[0]
In the interest of making life easier for people who may be running into the same type of problem, I have decided to post the final revised code, which factors in the comments about c1 and the overwriting of data. Of particular interest is the usage of communicate(), which completely eliminated any manifestation of zombie processes, which were quite irritating. Also, I found it useful to use subprocess.call in portions where piping wasn't necessary. No wait() was necessary in the end. Ultimately, staying away from sed and wget is a good idea, especially with Python's inbuilt tools and urllib2.
import shlex
import subprocess

p0 = subprocess.call(shlex.split("wget -Oc1 --no-cache \"http://Some.website/tofile\""))
p1 = subprocess.Popen(shlex.split("sed \"1,261d\" c1"), stdout=subprocess.PIPE)
with open("cc1", 'w') as cc1:
    p2 = subprocess.Popen(shlex.split("sed \"3,2002d\""), stdin=p1.stdout, stdout=cc1)
    p2.communicate()
p1.communicate()
p3 = subprocess.call(shlex.split("mv cc1 c1"))
I've got a weird problem regarding egrep and pipes.
I tried to filter a stream containing lines that start with a topic name, such as
"TICK:this is a tick message\n"
When I try to use egrep to filter it:
./stream_generator | egrep 'TICK' | ./topic_processor
It seems that the topic_processor never receives any messages.
However, when I use the following Python script:
./stream_generator | python filter.py --topics TICK | ./topic_processor
everything looks fine.
I guess there needs to be a 'flush' mechanism for egrep as well. Is this correct?
Can anyone here give me a clue? Thanks a million!
import sys
from optparse import OptionParser

if __name__ == '__main__':
    parser = OptionParser()
    parser.add_option("-m", "--topics",
                      action="store", type="string", dest="topics")
    (opts, args) = parser.parse_args()
    topics = opts.topics.split(':')
    while True:
        s = sys.stdin.readline()
        for each in topics:
            if s[0:4] == each:
                sys.stdout.write(s)
                sys.stdout.flush()
Have you allowed the command ./stream_generator | egrep 'TICK' | ./topic_processor to run to completion? If the command has completed without producing output then the problem does not lie with buffering since, upon the termination of ./stream_generator, egrep will flush any of its buffers and in turn terminate.
Now, it is true that egrep will use heavy buffering when not outputting directly to a terminal (i.e. when outputting to a pipe or file), and it may appear for a while that egrep produces no output if not enough data has accumulated in egrep's buffer to warrant a flush. This behaviour can be changed in GNU egrep by using the --line-buffered option:
./stream_generator | egrep --line-buffered 'TICK' | ./topic_processor