Python ffmpeg subprocess never exits on Linux, works on Windows

I wonder if someone can help explain what is happening?
I run 2 subprocesses, 1 for ffprobe and 1 for ffmpeg.
popen = subprocess.Popen(ffprobecmd, stderr=subprocess.PIPE, shell=True)
And
popen = subprocess.Popen(ffmpegcmd, shell=True, stdout=subprocess.PIPE)
On both Windows and Linux the ffprobe command fires, finishes and gets removed from taskmanager/htop. But only on Windows does the same happen to ffmpeg. On Linux the command remains in htop...
Can anyone explain what is going on, if it matters and how I can stop it from happening please?
EDIT: Here are the commands...
ffprobecmd = 'ffprobe' + \
    ' -user_agent "' + request.headers['User-Agent'] + '"' + \
    ' -headers "Referer: ' + request.headers['Referer'] + '"' + \
    ' -timeout "5000000"' + \
    ' -v error -select_streams v -show_entries stream=height -of default=nw=1:nk=1' + \
    ' -i "' + request.url + '"'
and
ffmpegcmd = 'ffmpeg' + \
    ' -re' + \
    ' -user_agent "' + r.headers['User-Agent'] + '"' + \
    ' -headers "Referer: ' + r.headers['Referer'] + '"' + \
    ' -timeout "10"' + \
    ' -i "' + r.url + '"' + \
    ' -c copy' + \
    ' -f mpegts' + \
    ' pipe:'
EDIT: Here is an example that behaves as described...
import flask
from flask import Response
import subprocess
app = flask.Flask(__name__)
@app.route('/', methods=['GET'])
def go():
    def stream(ffmpegcmd):
        popen = subprocess.Popen(ffmpegcmd, stdout=subprocess.PIPE, shell=True)
        try:
            for stdout_line in iter(popen.stdout.readline, ""):
                yield stdout_line
        except GeneratorExit:
            raise
    url = "https://bitdash-a.akamaihd.net/content/MI201109210084_1/m3u8s/f08e80da-bf1d-4e3d-8899-f0f6155f6efa.m3u8"
    ffmpegcmd = 'ffmpeg' + \
        ' -re' + \
        ' -timeout "10"' + \
        ' -i "' + url + '"' + \
        ' -c copy' + \
        ' -f mpegts' + \
        ' pipe:'
    return Response(stream(ffmpegcmd))
if __name__ == '__main__':
    app.run(host= '0.0.0.0', port=5000)

You have the extra sh process because of shell=True, and your copies of ffmpeg are allowed to try to attach to the original terminal's stdin because you aren't overriding that file handle. To fix both issues, and some security bugs as well, switch to shell=False and set stdin=subprocess.DEVNULL. To stop zombies from being left behind, note the finally: block below, which calls popen.poll() to see whether the child has exited and popen.terminate() to tell it to exit if it hasn't:
#!/usr/bin/env python
import flask
from flask import Response
import subprocess
app = flask.Flask(__name__)
@app.route('/', methods=['GET'])
def go():
    def stream(ffmpegcmd):
        popen = subprocess.Popen(ffmpegcmd, stdin=subprocess.DEVNULL, stdout=subprocess.PIPE)
        try:
            # NOTE: consider reading fixed-sized blocks (4kb at least) at a time
            # instead of parsing binary streams into "lines".
            # The pipe yields bytes, so the end-of-stream sentinel must be b"".
            for stdout_line in iter(popen.stdout.readline, b""):
                yield stdout_line
        finally:
            if popen.poll() is None:
                popen.terminate()
                popen.wait() # yes, this can cause things to actually block
    url = "https://bitdash-a.akamaihd.net/content/MI201109210084_1/m3u8s/f08e80da-bf1d-4e3d-8899-f0f6155f6efa.m3u8"
    ffmpegcmd = [
        'ffmpeg',
        '-re',
        '-timeout', '10',
        '-i', url,
        '-c', 'copy',
        '-f', 'mpegts',
        'pipe:'
    ]
    return Response(stream(ffmpegcmd))
if __name__ == '__main__':
    app.run(host='127.0.0.1', port=5000)
Mind, it's not appropriate to be parsing a binary stream as a series of lines at all. It would be much more appropriate to use blocks (and to change your response headers so the browser knows to parse the content as a video).
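A minimal sketch of the block-based reading suggested above. The name stream_blocks and the 4096-byte default are assumptions for illustration, and the child process here is the Python interpreter standing in for ffmpeg so the sketch runs without ffmpeg installed:

```python
import subprocess
import sys


def stream_blocks(cmd, block_size=4096):
    """Yield the child's stdout as binary blocks of at most block_size bytes."""
    proc = subprocess.Popen(cmd, stdin=subprocess.DEVNULL, stdout=subprocess.PIPE)
    try:
        # read() returns b"" at end of stream, which stops the iterator.
        for block in iter(lambda: proc.stdout.read(block_size), b""):
            yield block
    finally:
        # Runs on normal exhaustion and when the consumer abandons the generator.
        if proc.poll() is None:
            proc.terminate()
        proc.stdout.close()
        proc.wait()


# Stand-in child (not ffmpeg): writes 10000 bytes of binary data to stdout.
demo_cmd = [sys.executable, "-c", "import sys; sys.stdout.buffer.write(b'x' * 10000)"]
blocks = list(stream_blocks(demo_cmd))
print(len(blocks), sum(len(b) for b in blocks))
```

In the Flask view this would probably be used as Response(stream_blocks(ffmpegcmd), mimetype="video/mp2t"), assuming video/mp2t is the content type you want for an MPEG transport stream.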

What type is the ffmpegcmd variable? Is it a string or a list/sequence?
Note that Windows and Linux/POSIX behave differently with the shell=True parameter enabled or disabled. It matters whether ffmpegcmd is a string or a list.
Direct excerpt from the documentation:
On POSIX with shell=True, the shell defaults to /bin/sh. If args is a
string, the string specifies the command to execute through the shell.
This means that the string must be formatted exactly as it would be
when typed at the shell prompt. This includes, for example, quoting or
backslash escaping filenames with spaces in them. If args is a
sequence, the first item specifies the command string, and any
additional items will be treated as additional arguments to the shell
itself. That is to say, Popen does the equivalent of:
Popen(['/bin/sh', '-c', args[0], args[1], ...])
On Windows with shell=True, the COMSPEC environment variable specifies
the default shell. The only time you need to specify shell=True on
Windows is when the command you wish to execute is built into the
shell (e.g. dir or copy). You do not need shell=True to run a batch
file or console-based executable.
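If the command already exists as a single string, one way to get the list form that shell=False expects is shlex.split from the standard library, which applies POSIX shell quoting rules without starting a shell. A small sketch; the URL is a hypothetical placeholder, not the stream from the question:

```python
import shlex

# Hypothetical URL standing in for the stream address in the question.
url = "https://example.com/stream.m3u8"
command_string = 'ffmpeg -re -timeout "10" -i "' + url + '" -c copy -f mpegts pipe:'

# Each quoted piece becomes one list item; the quotes themselves are removed.
ffmpegcmd = shlex.split(command_string)
print(ffmpegcmd)
```

The resulting list can be passed to subprocess.Popen without shell=True, which behaves the same way on Windows and POSIX for a plain executable.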

Related

Writing Python Script from Batch Script is not working for one command

I'm trying to convert a batch script into a Python script.
This is the batch script, which calls the Klocwork executable to build the specified project.
%KwPath%\Kwinject -o kwinjectmp.out msbuild %BaseProjPath%/CodingGuide.vcxproj /t:Rebuild /p:Configuration="Release" /p:Platform="x64" /p:CLToolExe=cl.exe /p:CLToolPath=%VSBinPath%
I have written an equivalent Python script for it.
args = KwPath + '\\Kwinject.exe sync -o ' + 'kwinjectmp.out' + 'msbuild ' + BaseProject + '\\' + ProjectFolder + '\\' + ProjectName + '/t:Rebuild /p:Configuration="Release" /p:Platform="x64" /p:CLToolExe=cl.exe /p:CLToolPath=' + VSBinPath
print(args)
subprocess.call(args, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
BaseProject, VSBinPath and KwPath are declared correctly, but the script does not run the way the batch script does: it produces no output and appears to do nothing.

The string concatenation drops the spaces between some arguments (for example, 'kwinjectmp.out' + 'msbuild ' produces kwinjectmp.outmsbuild, and the project name runs straight into /t:Rebuild). Passing the arguments as a list avoids this. Try:
import os
import subprocess
path1 = os.path.join(KwPath, 'Kwinject.exe')
path2 = os.path.join(BaseProject, ProjectFolder, ProjectName)
subprocess.call([
        path1,
        'sync',
        '-o', 'kwinjectmp.out',
        'msbuild',
        path2,
        '/t:Rebuild',
        '/p:Configuration="Release"',
        '/p:Platform="x64"',
        '/p:CLToolExe=cl.exe',
        '/p:CLToolPath=' + VSBinPath
    ],
    shell=True,
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT
)

python whitespace issue with 2 variables

Running Python 2.6.6 and whenever I try to use 2 variables which are paths in another variable, I get a whitespace error:
'C:\Program' is not recognized as an internal or external command,
operable program or batch file.
This is my code and the issue is with the cmd variable:
from subprocess import call, Popen, PIPE, STDOUT
example = '"C:\\Program Files\\Example\\test.cmd"'
output = '"C:\\test\\python\\reportFromPython.xml"'
cmd = example + " -T 'testing title' " + output
p = Popen(cmd, shell=True, stdin=PIPE, stdout=PIPE, stderr=STDOUT)
output = p.stdout.read()
print output
If I change
cmd = example + " -T 'testing title' " + output
to
cmd = example + " -T 'testing title' "
Then it works but I need the output portion... How can I get it working with both variables?

According to this answer, you don't need shell=True if you're running a .cmd file. Then you can pass in your arguments as a list:
cmd = [example, "-T", "'testing title'", output]
And the rest of the code would be the same except for the removal of shell=True.
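When a list is passed on Windows, subprocess builds the command line itself and adds quotes around arguments that contain spaces, so the paths do not need embedded quotes. A sketch of what that quoting produces, using subprocess.list2cmdline, the helper the module uses for this conversion. It is not part of the documented public API, so its use here is illustrative only; the paths are the ones from the question with the embedded quotes removed:

```python
import subprocess

# Paths from the question, without the extra embedded double quotes.
example = r'C:\Program Files\Example\test.cmd'
output = r'C:\test\python\reportFromPython.xml'

# Only arguments containing whitespace get wrapped in double quotes.
cmdline = subprocess.list2cmdline([example, '-T', 'testing title', output])
print(cmdline)
```

Note that the title is passed as 'testing title' without inner single quotes, on the assumption that the script should receive the bare text.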

subprocess.call using cygwin instead of cmd on Windows

I'm programming on Windows 7 and in one of my Python projects I need to call bedtools, which only works with Cygwin on Windows. I'm new to Cygwin, installed the default version + everything needed for bedtools and then used Cygwin to install bedtools by using make as described in the installation instructions.
$ tar -zxvf BEDTools.tar.gz
$ cd BEDTools-<version>
$ make
When I use the Cygwin terminal to call it manually like below, it works without problem and the output file contains the correct result.
bedtools_exe_path intersect -a gene_bed_file -b snp_bed_file -wa -wb > output_file
But when I use subprocess.call in my program it seems to use Windows cmd instead of Cygwin, which doesn't work.
arguments = [bedtools_exe_path, 'intersect', '-a', gene_bed_file, '-b',
             snp_bed_file, '-wa', '-wb', '>', output_file]
return_code = subprocess.call(arguments)
Results in no output file and a return code of 3221225781.
arguments = [bedtools_exe_path, 'intersect', '-a', gene_bed_file, '-b',
             snp_bed_file, '-wa', '-wb', '>', output_file]
return_code = subprocess.call(arguments, shell=True)
Results in an empty output file and a return code of 3221225781.
cygwin_bash_path = 'D:/Cygwin/bin/bash.exe'
arguments = [cygwin_bash_path, bedtools_exe_path, 'intersect', '-a', gene_bed_file, '-b',
             snp_bed_file, '-wa', '-wb', '>', output_file]
return_code = subprocess.call(arguments)
Results in no output file, a return code of 126 and
D:/BEDTools/bin/bedtools.exe: D:/BEDTools/bin/bedtools.exe: cannot execute binary file
arguments = [cygwin_bash_path, bedtools_exe_path, 'intersect', '-a', gene_bed_file, '-b',
             snp_bed_file, '-wa', '-wb', '>', output_file]
return_code = subprocess.call(arguments, shell=True)
Results in an empty output file, a return code of 126 and
D:/BEDTools/bin/bedtools.exe: D:/BEDTools/bin/bedtools.exe: cannot execute binary file
Any ideas how I can get it to work?

Imagine you want to run a Linux command from Windows. You could install Linux in a VM and run commands over ssh (PuTTY/plink on Windows):
#!/usr/bin/env python
import subprocess
cmd = [r'C:\path\to\plink.exe', '-ssh', 'user@vm_host', '/path/to/bedtools']
with open('output', 'wb', 0) as file:
    subprocess.check_call(cmd, stdout=file)
Cygwin provides a run command that lets you run commands directly:
cmd = [r'C:\cygwin\path\to\run.exe', '-p', '/path/to/', 'bedtools',
       '-wait', 'arg1', 'arg2']
Note: the Python script is run from Windows in both cases. bedtools is a Linux or Cygwin (non-Windows) command here, so you should provide POSIX paths.

The following works without a problem. The " does not need to be escaped inside a single-quoted Python string, so \" and " are equivalent here.
argument = ('sh -c \"' + bedtools_exe_path + ' intersect -a ' + gene_bed_file +
            ' -b ' + snp_bed_file + ' -wa -wb\"')
with open(output_file, 'w') as file:
    subprocess.call(argument, stdout=file)
Using the following works as well:
argument = ('bash -c \"' + bedtools_exe_path + ' intersect -a ' + gene_bed_file +
            ' -b ' + snp_bed_file + ' -wa -wb\"')
with open(output_file, 'w') as file:
    subprocess.call(argument, stdout=file)
With:
bedtools_exe_path = 'D:/BEDTools/bin/bedtools.exe'
gene_bed_file = 'output/gene.csv'
snp_bed_file = 'output/snps.csv'
output_file = 'output/intersect_gene_snp.bed'
Using the path to the cygwin bash.exe (D:/Cygwin/bin/bash.exe) instead of bash or sh does not work.
Thank you, eryksun, Padraic Cunningham and J.F. Sebastian.
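The attempts in the question put '>' and the output file name in the argument list. Without a shell, those reach the program as literal arguments, which is why the working versions pass an open file as stdout instead. A minimal sketch of that redirection; the child is the Python interpreter standing in for bedtools (an assumption so the sketch runs anywhere), and the output path is a hypothetical temporary file:

```python
import os
import subprocess
import sys
import tempfile

# Stand-in for bedtools: a child process that prints one tab-separated line.
cmd = [sys.executable, "-c", "print('chr1\\t100\\t200')"]

# Hypothetical output location, created in a fresh temporary directory.
output_file = os.path.join(tempfile.mkdtemp(), "intersect.bed")

# The open file handle replaces the shell's '> output_file' redirection.
with open(output_file, "wb") as out:
    return_code = subprocess.call(cmd, stdout=out)

with open(output_file) as result:
    content = result.read()
print(return_code, repr(content))
```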

Wrapping bash scripts in python

I just found this great wget wrapper and I'd like to rewrite it as a python script using the subprocess module. However it turns out to be quite tricky giving me all sorts of errors.
download()
{
    local url=$1
    echo -n " "
    wget --progress=dot $url 2>&1 | grep --line-buffered "%" | \
        sed -u -e "s,\.,,g" | awk '{printf("\b\b\b\b%4s", $2)}'
    echo -ne "\b\b\b\b"
    echo " DONE"
}
Then it can be called like this:
file="patch-2.6.37.gz"
echo -n "Downloading $file:"
download "http://www.kernel.org/pub/linux/kernel/v2.6/$file"
Any ideas?
Source: http://fitnr.com/showing-file-download-progress-using-wget.html

I think you're not far off. Mainly I'm wondering: why bother running pipes into grep, sed and awk when you can do all of that inside Python?
#! /usr/bin/env python
import re
import subprocess
import sys
TARGET_FILE = "linux-2.6.0.tar.xz"
TARGET_LINK = "http://www.kernel.org/pub/linux/kernel/v2.6/%s" % TARGET_FILE
wgetExecutable = '/usr/bin/wget'
wgetParameters = ['--progress=dot', TARGET_LINK]
wgetPopen = subprocess.Popen([wgetExecutable] + wgetParameters,
                             stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
for line in iter(wgetPopen.stdout.readline, b''):
    # The pipe yields bytes, so match with a bytes pattern.
    match = re.search(rb'\d+%', line)
    if match:
        sys.stdout.write('\b\b\b\b' + match.group(0).decode())
        sys.stdout.flush()
wgetPopen.stdout.close()
wgetPopen.wait()

If you are rewriting the script in Python, you could replace wget with urllib.request.urlretrieve() (urllib.urlretrieve() on Python 2) in this case:
#!/usr/bin/env python3
import os
import posixpath
import sys
import urllib.parse
import urllib.request
def url2filename(url):
    """Return basename corresponding to url.
    >>> url2filename('http://example.com/path/to/file?opt=1')
    'file'
    """
    urlpath = urllib.parse.urlsplit(url).path
    basename = posixpath.basename(urllib.parse.unquote(urlpath))
    if os.path.basename(basename) != basename:
        raise ValueError # refuse 'dir%5Cbasename.ext' on Windows
    return basename
def reporthook(blocknum, blocksize, totalsize):
    """Report download progress on stderr."""
    readsofar = blocknum * blocksize
    if totalsize > 0:
        percent = readsofar * 1e2 / totalsize
        s = "\r%5.1f%% %*d / %d" % (
            percent, len(str(totalsize)), readsofar, totalsize)
        sys.stderr.write(s)
        if readsofar >= totalsize: # near the end
            sys.stderr.write("\n")
    else: # total size is unknown
        sys.stderr.write("read %d\n" % (readsofar,))
url = sys.argv[1]
filename = sys.argv[2] if len(sys.argv) > 2 else url2filename(url)
urllib.request.urlretrieve(url, filename, reporthook)
Example:
$ python download-file.py http://example.com/path/to/file
It downloads the url to a file. If the file is not given then it uses basename from the url.
You could also run wget if you need it:
#!/usr/bin/env python
import sys
from subprocess import Popen, PIPE, STDOUT
def urlretrieve(url, filename=None, width=4):
    destination = ["-O", filename] if filename is not None else []
    p = Popen(["wget"] + destination + ["--progress=dot", url],
              stdout=PIPE, stderr=STDOUT, bufsize=1) # line-buffered (out side)
    for line in iter(p.stdout.readline, b''):
        if b'%' in line: # grep "%"
            line = line.replace(b'.', b'') # sed -u -e "s,\.,,g"
            percents = line.split(None, 2)[1].decode() # awk $2
            sys.stderr.write("\b"*width + percents.rjust(width))
    p.communicate() # close stdout, wait for child's exit
    print("\b"*width + "DONE")
url = sys.argv[1]
filename = sys.argv[2] if len(sys.argv) > 2 else None
urlretrieve(url, filename)
I have not noticed any buffering issues with this code.

I've done something like this before, and I'd love to share my code with you :)
#!/usr/bin/python2.7
# encoding=utf-8
import sys
import os
import datetime
SHEBANG = "#!/bin/bash\n\n"
def get_cmd(editor='vim', initial_cmd=""):
    from subprocess import call
    from tempfile import NamedTemporaryFile
    # Create the initial temporary file.
    with NamedTemporaryFile(delete=False) as tf:
        tfName = tf.name
        tf.write(initial_cmd)
    # Fire up the editor.
    if call([editor, tfName], shell=False) != 0:
        # Editor died or was killed.
        return None
    # Get the modified content.
    fd = open(tfName)
    res = fd.read()
    fd.close()
    os.remove(tfName)
    return res
def main():
    initial_cmd = "wget " + sys.argv[1]
    cmd = get_cmd(editor='vim', initial_cmd=initial_cmd)
    if len(sys.argv) > 1 and sys.argv[1] == 's':
        # keep the download information.
        t = datetime.datetime.now()
        filename = "swget_%02d%02d%02d%02d%02d" %\
            (t.month, t.day, t.hour, t.minute, t.second)
        with open(filename, 'w') as f:
            f.write(SHEBANG)
            f.write(cmd)
            f.close()
        os.chmod(filename, 0777)
    os.system(cmd)
main()
# Run this script with the optional argument 's'.
# Copy the command into the editor, then save and quit; the download
# will begin. If you used the argument 's', the script also creates
# another executable script that you can use to resume an interrupted
# download (if the server supports it).
So, basically, you just need to modify the value of initial_cmd. In your case, it's
wget --progress=dot $url 2>&1 | grep --line-buffered "%" | \
    sed -u -e "s,\.,,g" | awk '{printf("\b\b\b\b%4s", $2)}'
This script first creates a temp file, puts the shell commands in it, gives it execute permissions, and finally runs the temp file with the commands in it.

vim download.py
#!/usr/bin/env python
import subprocess
import os
sh_cmd = r"""
download()
{
local url=$1
echo -n " "
wget --progress=dot $url 2>&1 |
grep --line-buffered "%" |
sed -u -e "s,\.,,g" |
awk '{printf("\b\b\b\b%4s", $2)}'
echo -ne "\b\b\b\b"
echo " DONE"
}
download "http://www.kernel.org/pub/linux/kernel/v2.6/$file"
"""
cmd = 'sh'
p = subprocess.Popen(cmd,
                     shell=True,
                     stdin=subprocess.PIPE,
                     env=os.environ
                     )
# communicate() expects bytes on Python 3 because the pipe is opened in binary mode
p.communicate(input=sh_cmd.encode())
# or:
# p = subprocess.Popen(cmd,
#                      shell=True,
#                      stdin=subprocess.PIPE,
#                      env={'file':'xx'})
#
# p.communicate(input=sh_cmd.encode())
# or:
# p = subprocess.Popen(cmd, shell=True,
#                      stdin=subprocess.PIPE,
#                      stdout=subprocess.PIPE,
#                      stderr=subprocess.PIPE,
#                      env=os.environ)
# stdout, stderr = p.communicate(input=sh_cmd.encode())
Then you can call it like this:
file="xxx" python download.py

Put simply: if you have a script.sh file, you can execute it and print its return code:
import subprocess
process = subprocess.Popen('/path/to/script.sh', shell=True, stdout=subprocess.PIPE)
# communicate() drains stdout while waiting; wait() alone can block if the pipe fills up
output, _ = process.communicate()
print(process.returncode)

How to show the rsync --progress in web browser using Django?

I am writing a Python/Django application that transfers files from a server to the local machine using rsync. We will be dealing with large files, so a progress bar is mandatory. The --progress argument of the rsync command does this beautifully: the detailed progress is shown in the terminal. How can I show that progress in a web browser? Is there a hook function or something like that? Or can I store the progress in a log file and read it back every minute or so?

The basic principle is to run rsync in a subprocess, expose a web API, and get updates via JavaScript.
Here's an example.
import subprocess
import re
import sys
print('Dry run:')
cmd = 'rsync -az --stats --dry-run ' + sys.argv[1] + ' ' + sys.argv[2]
proc = subprocess.Popen(cmd,
                        shell=True,
                        stdin=subprocess.PIPE,
                        stdout=subprocess.PIPE,
                        universal_newlines=True)  # decode output to str
remainder = proc.communicate()[0]
mn = re.findall(r'Number of files: (\d+)', remainder)
total_files = int(mn[0])
print('Number of files: ' + str(total_files))
print('Real rsync:')
cmd = 'rsync -avz --progress ' + sys.argv[1] + ' ' + sys.argv[2]
proc = subprocess.Popen(cmd,
                        shell=True,
                        stdin=subprocess.PIPE,
                        stdout=subprocess.PIPE,
                        universal_newlines=True)  # decode output to str
while True:
    output = proc.stdout.readline()
    if not output:
        break  # rsync closed its stdout
    if 'to-check' in output:
        m = re.findall(r'to-check=(\d+)/(\d+)', output)
        progress = (100 * (int(m[0][1]) - int(m[0][0]))) // total_files
        sys.stdout.write('\rDone: ' + str(progress) + '%')
        sys.stdout.flush()
        if int(m[0][0]) == 0:
            break
print('\rFinished')
But this only shows the progress on standard output (stdout).
We can, however, modify this code to return the progress as JSON and make that output available through a progress web service/API that we create.
On the client side, we then write JavaScript (Ajax) that polls the progress web service/API from time to time and uses that information to update something on the page, e.g. a text message, the width of an image, or the color of a div.
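The JSON step could look roughly like the sketch below. The function name progress_json, the field names percent and remaining, and the sample output line are all assumptions for illustration; the percentage uses the same to-check arithmetic as the example above:

```python
import json
import re

TO_CHECK = re.compile(r'to-check=(\d+)/(\d+)')


def progress_json(line, total_files):
    """Return a JSON progress document for one rsync output line, or None."""
    match = TO_CHECK.search(line)
    if match is None:
        return None  # this line carries no progress information
    remaining, total_seen = int(match.group(1)), int(match.group(2))
    percent = 100 * (total_seen - remaining) // total_files
    return json.dumps({"percent": percent, "remaining": remaining})


# Hypothetical line in the style of rsync --progress output.
sample = "  1,234,567 100%  10.00MB/s  0:00:00 (xfr#5, to-check=5/10)"
print(progress_json(sample, total_files=10))
```

A Django view could store the latest document somewhere the polling endpoint can read it; how that state is shared between the rsync worker and the view is left open here.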
