Using ffmpeg to obtain video durations in python - python

I've installed ffprobe using the pip ffprobe command on my PC, and installed ffmpeg from here.
However, I'm still having trouble running the code listed here.
The code below fails with the following error:
SyntaxError: Non-ASCII character '\xe2' in file GetVideoDurations.py
on line 12, but no encoding declared; see
http://python.org/dev/peps/pep-0263/ for details
Does anyone know what's wrong? Am I not referencing the directories correctly? Do I need to make sure the .py and video files are in a specific location?
import subprocess
def getLength(filename):
    result = subprocess.Popen(["ffprobe", "filename"],
        stdout = subprocess.PIPE, stderr = subprocess.STDOUT)
    return [x for x in result.stdout.readlines() if "Duration" in x]
fileToWorkWith = ‪'C:\Users\PC\Desktop\Video.mkv'
getLength(fileToWorkWith)
Apologies if the question is somewhat basic. All I need is to be able to iterate over a group of video files and get their start time and end time.
Thank you!

There is no need to iterate through the output of FFprobe. There is one simple command which returns only the duration of the input file:
ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 <input_video>
You can use the following method instead to get the duration:
import subprocess
def get_length(input_video):
    result = subprocess.run(['ffprobe', '-v', 'error', '-show_entries', 'format=duration', '-of', 'default=noprint_wrappers=1:nokey=1', input_video], stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    return float(result.stdout)

I'd suggest using FFprobe (comes with FFmpeg).
The answer Chamath gave was pretty close, but ultimately failed for me.
Just as a note, I'm using Python 3.5 and 3.6 and this is what worked for me.
import subprocess
def get_duration(file):
    """Get the duration of a video using ffprobe."""
    cmd = 'ffprobe -i {} -show_entries format=duration -v quiet -of csv="p=0"'.format(file)
    output = subprocess.check_output(
        cmd,
        shell=True,  # Let this run in the shell
        stderr=subprocess.STDOUT
    )
    # return round(float(output))  # ugly, but rounds your seconds up or down
    return float(output)
If you want to throw this function into a class and use it in Django (1.8 - 1.11), just change one line and put this function into your class, like so:
def get_duration(file):
to:
def get_duration(self, file):
Note: Using a relative path worked for me locally, but the production server required an absolute path. You can use os.path.abspath(os.path.dirname(file)) to get the path to your video or audio file.

Using the python ffmpeg package (https://pypi.org/project/python-ffmpeg)
import ffmpeg
duration = ffmpeg.probe(local_file_path)["format"]["duration"]
where local_file_path is a relative or absolute path to your file.
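The duration in ffprobe's JSON output is reported as a string, so it usually needs converting before doing arithmetic on it. Below is a minimal sketch of that step on its own, assuming the JSON text has already been captured. The helper name duration_from_probe_json and the sample JSON are made up for illustration and are not output from a real file.

```python
import json


def duration_from_probe_json(probe_json_text):
    """Extract the container duration, in seconds, from ffprobe-style JSON text."""
    info = json.loads(probe_json_text)
    # ffprobe reports the duration as a string, so convert it explicitly.
    return float(info["format"]["duration"])


# Hypothetical sample shaped like `ffprobe -show_format -of json` output.
sample = '{"format": {"filename": "Video.mkv", "duration": "5831.830000"}}'
print(duration_from_probe_json(sample))
```

A missing "duration" key raises KeyError here, which makes files without a known duration easy to spot.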

I think Chamath's second comment answers the question: you have a strange character somewhere in your script, either because you are using a ` instead of a ' or you have a word with non-english accents, something like this.
As a remark, for what you are doing you can also try MoviePy, which parses the ffmpeg output like you do (though in the future I may switch to Chamath's ffprobe method, which looks cleaner):
import moviepy.editor as mp
duration = mp.VideoFileClip("my_video.mp4").duration

Updated solution using ffprobe, based on @llogan's guidance and the linked post:
import subprocess
def get_duration(input_video):
    cmd = ["ffprobe", "-i", input_video, "-show_entries", "format=duration",
           "-v", "quiet", "-sexagesimal", "-of", "csv=p=0"]
    return subprocess.check_output(cmd).decode("utf-8").strip()
Fragile solution, because it parses stderr output:
the stderr output from ffmpeg is not intended for machine parsing and
is considered fragile.
I got help from the following documentation (https://codingwithcody.com/2014/05/14/get-video-duration-with-ffmpeg-and-python/) and https://stackoverflow.com/a/6239379/2402577
Actually, sed is unnecessary: ffmpeg -i file.mp4 2>&1 | grep -o -P "(?<=Duration: ).*?(?=,)"
You can use the following method to get the duration in HH:MM:SS format:
import subprocess
def get_duration(input_video):
    # cmd: ffmpeg -i file.mkv 2>&1 | grep -o -P "(?<=Duration: ).*?(?=,)"
    p1 = subprocess.Popen(['ffmpeg', '-i', input_video], stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    p2 = subprocess.Popen(["grep", "-o", "-P", "(?<=Duration: ).*?(?=,)"], stdin=p1.stdout, stdout=subprocess.PIPE)
    p1.stdout.close()
    return p2.communicate()[0].decode("utf-8").strip()
Example output for both: 01:37:11.83

Have you tried adding the encoding? That error is typical of that, as Chamath said.
Add the utf-8 encoding to your script header:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
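Declaring the encoding silences the error, but the stray character stays in the file. The byte '\xe2' is the first byte of the UTF-8 encoding of characters such as curly quotes or the invisible U+202A mark, which often arrive with a path pasted from a file dialog. A small sketch for locating such characters follows. The function name find_non_ascii and the sample text are illustrative, not from the original script.

```python
def find_non_ascii(source_text):
    """Return (line_number, column, character) for every non-ASCII character."""
    hits = []
    for line_no, line in enumerate(source_text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                hits.append((line_no, col, ch))
    return hits


# Hypothetical two-line script with an invisible U+202A before the quote.
sample = "import subprocess\npath = \u202a'C:/Video.mkv'\n"
for line_no, col, ch in find_non_ascii(sample):
    print("line {}, column {}: U+{:04X}".format(line_no, col, ord(ch)))
```

To check a real script, read it with open(path, encoding="utf-8") and pass the text to the function, then delete the reported character in an editor.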

I like to build a shared library with ffmpeg, and load it in python.
C++ code:
#ifdef __WIN32__
#define LIB_CLASS __declspec(dllexport)
#else
#define LIB_CLASS
#endif
extern "C" {
#define __STDC_CONSTANT_MACROS
#include "libavformat/avformat.h"
}
extern "C" LIB_CLASS int64_t getDur(const char* url) {
    AVFormatContext* pFormatContext = avformat_alloc_context();
    if (avformat_open_input(&pFormatContext, url, NULL, NULL)) {
        avformat_free_context(pFormatContext);
        return -1;
    }
    int64_t t = pFormatContext->duration;
    avformat_close_input(&pFormatContext);
    avformat_free_context(pFormatContext);
    return t;
}
Then use gcc to compile it and get a shared library.
Python code:
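from ctypes import CDLL, c_char_p, c_longlong
lib = CDLL('/the/path/to/your/library')
getDur = lib.getDur
getDur.argtypes = [c_char_p]
getDur.restype = c_longlong
# In Python 3, pass the path as bytes; getDur returns -1 if the file cannot be opened
duration = getDur(b'the path/URL to your file')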
It works well in my python program.
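Note that AVFormatContext's duration field is expressed in AV_TIME_BASE units (microseconds), not seconds. A small sketch of the conversion on the Python side is below. The helper name raw_duration_to_seconds is an assumption, and it treats any negative value (the -1 returned by getDur on failure, or libavformat's "unknown" marker) as "no duration".

```python
AV_TIME_BASE = 1000000  # libavutil's internal time base: units per second


def raw_duration_to_seconds(raw):
    """Convert a duration in AV_TIME_BASE units to seconds, or None if unknown."""
    if raw < 0:
        # Covers the -1 error return of getDur and AV_NOPTS_VALUE (INT64_MIN).
        return None
    return raw / AV_TIME_BASE


print(raw_duration_to_seconds(5831830000))
print(raw_duration_to_seconds(-1))
```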

Python Code
<code>
import re
import subprocess
cmnd = ['/root/bin/ffmpeg', '-i', videopath]
process = subprocess.Popen(cmnd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
stdout, stderr = process.communicate()
# communicate() returns bytes in Python 3; decode before applying a str regex
output = stdout.decode('utf-8', errors='replace')
# Match the duration in H:M:S format
matches = re.search(r"Duration:\s{1}(?P<hours>\d+?):(?P<minutes>\d+?):(?P<seconds>\d+\.\d+?),", output, re.DOTALL).groupdict()
t_hour = matches['hours']
t_min = matches['minutes']
t_sec = matches['seconds']
t_hour_sec = int(t_hour) * 3600
t_min_sec = int(t_min) * 60
t_s_sec = int(round(float(t_sec)))
total_sec = t_hour_sec + t_min_sec + t_s_sec
# matches1 captures the frame rate of the video
matches1 = re.search(r'(\d+) fps', output)
frame_rate = matches1.group(0)  # the full match, e.g. "20 fps"
frame_rate = matches1.group(1)  # just the number, e.g. "20"
</code>

We can also use ffmpeg to get the duration of any video or audio file.
To install ffmpeg follow this link
import subprocess
import re
process = subprocess.Popen(['ffmpeg', '-i', path_of_video_file], stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
stdout, stderr = process.communicate()
# Decode the bytes output so the str regex also works in Python 3
matches = re.search(r"Duration:\s{1}(?P<hours>\d+?):(?P<minutes>\d+?):(?P<seconds>\d+\.\d+?),", stdout.decode('utf-8', errors='replace'), re.DOTALL).groupdict()
print (matches['hours'])
print (matches['minutes'])
print (matches['seconds'])
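The regex step can be separated from the subprocess call so it can be checked without ffmpeg installed. The sketch below converts a "Duration:" line to seconds and returns None when no duration is present (for example "Duration: N/A"). The function name and the sample line are illustrative. The sample only imitates the shape of ffmpeg's stderr banner, which is not a stable interface.

```python
import re

# Hours, two-digit minutes, and seconds with an optional fractional part.
DURATION_RE = re.compile(r"Duration:\s*(\d+):(\d{2}):(\d{2}(?:\.\d+)?)")


def parse_duration_seconds(ffmpeg_output):
    """Return the duration in seconds found in ffmpeg's text output, or None."""
    match = DURATION_RE.search(ffmpeg_output)
    if match is None:
        return None
    hours, minutes, seconds = match.groups()
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


# Hypothetical line shaped like ffmpeg's banner output.
sample = "  Duration: 01:37:11.83, start: 0.000000, bitrate: 1500 kb/s"
print(parse_duration_seconds(sample))
```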

Related

subprocess that prints to pseudo terminal is not using full terminal size

I have the following program that wraps top in a pseudo terminal and prints it back to the real terminal.
import os
import pty
import subprocess
import sys
import time
import select
stdout_master_fd, stdout_slave_fd = pty.openpty()
stderr_master_fd, stderr_slave_fd = pty.openpty()
p = subprocess.Popen(
"top",
shell=True,
stdout=stdout_slave_fd,
stderr=stderr_slave_fd,
close_fds=True
)
stdout_parts = []
while p.poll() is None:
    rlist, _, _ = select.select([stdout_master_fd, stderr_master_fd], [], [])
    for f in rlist:
        output = os.read(f, 1000)  # This is used because it doesn't block
        sys.stdout.write(output.decode("utf-8"))
        sys.stdout.flush()
    time.sleep(0.01)
This works well: control sequences are handled as expected. However, the subprocess is not using the full dimensions of the real terminal.
For comparison, running the above program:
And running top directly:
I didn't find any api of the pty library to suggest dimensions could be provided.
The dimensions I get in practice for the pseudo terminal are height of 24 lines and width of 80 columns, I'm assuming it might be hardcoded somewhere.
Reading on Emulate a number of columns for a program in the terminal I found the following working solution, at least on my environment (OSX and xterm)
echo LINES=$LINES COLUMNS=$COLUMNS TERM=$TERM
which comes to LINES=40 COLUMNS=203 TERM=xterm-256color in my shell. Then setting the following in the script gives the expected output:
p = subprocess.Popen(
"top",
shell=True,
stdout=stdout_slave_fd,
stderr=stderr_slave_fd,
close_fds=True,
env={
"LINES": "40",
"COLUMNS": "203",
"TERM": "xterm-256color"
}
)
@Mugen's answer pointed me in the right direction but did not quite work; here is what worked for me:
import os
import subprocess
my_env = os.environ.copy()
my_env["LINES"] = "40"
my_env["COLUMNS"] = "203"
result = subprocess.Popen(
cmd,
stdout= subprocess.PIPE,
env=my_env
).communicate()[0]
So I had to first get my entire environment variable with os library and then add the elements I needed to it.
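Instead of hardcoding 40 and 203, the current terminal size can be looked up at run time. The following is a sketch using shutil.get_terminal_size from the standard library. The helper name env_with_terminal_size is made up here, and the 80x24 fallback is an assumption for the case where no terminal is attached.

```python
import os
import shutil


def env_with_terminal_size(base_env=None):
    """Copy an environment and set LINES/COLUMNS from the current terminal."""
    env = dict(os.environ if base_env is None else base_env)
    # Falls back to 80x24 when the size cannot be determined (e.g. no tty).
    size = shutil.get_terminal_size(fallback=(80, 24))
    env["COLUMNS"] = str(size.columns)
    env["LINES"] = str(size.lines)
    return env


my_env = env_with_terminal_size()
print(my_env["LINES"], my_env["COLUMNS"])
```

The resulting dictionary can be passed as env=my_env to subprocess.Popen in the same way as above.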
The solutions provided by @leas and @Mugen did not work for me, but I eventually stumbled upon the ptyprocess Python module, which allows you to provide terminal dimensions when spawning a process.
For context, I am trying to use a Python script to run a PowerShell 7 script and capture the PowerShell script's output. The host OS is Ubuntu Linux 22.04.
My code looks something like this:
from ptyprocess import PtyProcessUnicode
# Run the PowerShell script
script_run_cmd = 'pwsh -file script.ps1 param1 param2'
p = PtyProcessUnicode.spawn(script_run_cmd.split(), dimensions=(24,130))
# Get all script output
script_output = []
while True:
    try:
        script_output.append(p.readline().rstrip())
    except EOFError:
        break
# Not sure if this is necessary
p.close()
I feel like there should be a class method to get all the output, but I couldn't find one and the above code works well for me.

Running a C executable inside a python program

I have written a C program that converts one file format to another. It takes one command-line argument: filestem.
I run it with: ./executable_file filestem > outputfile
and get the desired output in outputfile.
Now I want to run that executable from Python code.
This is what I am trying:
import subprocess
import sys
filestem = sys.argv[1];
subprocess.run(['/home/dev/executable_file', filestem , 'outputfile'])
But it does not create the output file. I think something needs to be added to handle the > redirection, but I can't figure out what. Please help.
subprocess.run has optional stdout argument, you might give it file handle, so in your case something like
import subprocess
import sys
filestem = sys.argv[1]
with open('outputfile','wb') as f:
    subprocess.run(['/home/dev/executable_file', filestem],stdout=f)
should work. I do not have ability to test it so please run it and write if it does work as intended
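The same pattern can be tried without the C executable by using the Python interpreter itself as a stand-in child process. This is only a sketch: run_to_file is a hypothetical helper name, and the child command and temporary path are placeholders for /home/dev/executable_file and outputfile.

```python
import os
import subprocess
import sys
import tempfile


def run_to_file(cmd, output_path):
    """Run cmd with its stdout written to output_path; return the exit code."""
    with open(output_path, "wb") as f:
        return subprocess.run(cmd, stdout=f).returncode


# Stand-in for the real executable: a child Python that prints one line.
child = [sys.executable, "-c", "print('converted data')"]
out_path = os.path.join(tempfile.mkdtemp(), "outputfile")
code = run_to_file(child, out_path)
with open(out_path, "rb") as f:
    print(code, f.read())
```

This is the equivalent of the shell's > redirection: the child writes straight into the file, and Python never buffers the output.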
You have several options:
NOTE - Tested in CentOS 7, using Python 2.7
1. Try pexpect:
"""Usage: executable_file argument ("ex. stack.py -lh")"""
import sys
import pexpect
filestem = sys.argv[1]
# Using ls -lh >> outputfile as an example
cmd = "ls {0} >> outputfile".format(filestem)
command_output, exitstatus = pexpect.run("/usr/bin/bash -c '{0}'".format(cmd), withexitstatus=True)
if exitstatus == 0:
    print(command_output)
else:
    print("Houston, we've had a problem.")
2. Run subprocess with shell=True (Not recommended):
"""Usage: executable_file argument ("ex. stack.py -lh")"""
import sys
import subprocess
filestem = sys.argv[1]
# Using ls -lh >> outputfile as an example
cmd = "ls {0} >> outputfile".format(filestem)
result = subprocess.check_output(cmd, shell=True)  # with shell=True, pass the command as one string; or use subprocess.call(cmd, shell=True)
print(result)
It works, but python.org frowns upon this, due to the chance of a shell injection: see "Security Considerations" in the subprocess documentation.
3. If you must use subprocess, run each command separately, capture the STDOUT of the previous command, and feed it to the STDIN of the next command (cmd1 and cmd2 stand for the two argument lists):
p1 = subprocess.Popen(cmd1, stdout=subprocess.PIPE)
stdout_data, stderr_data = p1.communicate()
p2 = subprocess.Popen(cmd2, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
stdout_data, stderr_data = p2.communicate(input=stdout_data)
etc...
Good luck with your code!
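For option 3, the two processes can also be connected directly, so the data streams from one to the other without being held in Python. Below is a sketch under the assumption that both commands are plain argument lists. pipe_two is a hypothetical helper, and the two child commands are Python one-liners standing in for real programs.

```python
import subprocess
import sys


def pipe_two(cmd1, cmd2):
    """Run cmd1 | cmd2 without a shell and return cmd2's stdout as bytes."""
    p1 = subprocess.Popen(cmd1, stdout=subprocess.PIPE)
    p2 = subprocess.Popen(cmd2, stdin=p1.stdout, stdout=subprocess.PIPE)
    p1.stdout.close()  # Let p1 notice a broken pipe if p2 exits early.
    output = p2.communicate()[0]
    p1.wait()
    return output


# Stand-ins for real programs: one prints two lines, the other sorts stdin.
producer = [sys.executable, "-c", "print('b'); print('a')"]
consumer = [sys.executable, "-c",
            "import sys; sys.stdout.write(''.join(sorted(sys.stdin)))"]
print(pipe_two(producer, consumer))
```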

How to hide console output of FFmpeg in Python?

I am working on a YouTube video downloader program in Python.
I want to encode the downloaded data to other media formats, so I use FFmpeg and ffmpeg-python (a package for using FFmpeg from Python).
Everything works, but how can I disable the FFmpeg output on the console?
Here is a picture of my program:
But this console window often appears when my program starts encoding, hiding the main GUI:
If you know a solution to my problem, please share it.
This is my first time asking on Stack Overflow.
Thanks in advance!
It has been 1 year and 8 months since you have asked this question, you might already have a solution for that. However, I found a solution to solve your problem.
You can solve this problem by modifying the original ffmpeg code when you package your python program.
First, find your ffmpeg lib folder, if you install with the default location, you can check your libs here: C:\Users\User\AppData\Local\Programs\Python\Python310\Lib\site-packages\ffmpeg.
Second, find _probe.py and modify the code. Here is the modified version; every change is noted in the comments. You need to add these arguments to the Popen call: shell=True, stdin=subprocess.PIPE.
import json
import subprocess
from ._run import Error
from ._utils import convert_kwargs_to_cmd_line_args
def probe(filename, cmd='ffprobe', **kwargs):
    """Run ffprobe on the specified file and return a JSON representation of the output.
    Raises:
        :class:`ffmpeg.Error`: if ffprobe returns a non-zero exit code,
            an :class:`Error` is returned with a generic error message.
            The stderr output can be retrieved by accessing the
            ``stderr`` property of the exception.
    """
    args = [cmd, '-show_format', '-show_streams', '-of', 'json']
    args += convert_kwargs_to_cmd_line_args(kwargs)
    args += [filename]
    # Original: p = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    # Popen add args: shell=True, stdin=subprocess.PIPE,
    p = subprocess.Popen(args, shell=True, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = p.communicate()
    if p.returncode != 0:
        raise Error('ffprobe', out, err)
    return json.loads(out.decode('utf-8'))
__all__ = ['probe']
Then, go to _run.py. You need to add shell=True, modify stdin=subprocess.PIPE or modify pipe_stdin=True (The code section below is just a part of the code):
@output_operator()
def run_async(
    stream_spec,
    cmd='ffmpeg',
    pipe_stdin=False,
    pipe_stdout=False,
    pipe_stderr=False,
    quiet=False,
    overwrite_output=False,
):
    """Asynchronously invoke ffmpeg for the supplied node graph.
Args:
pipe_stdin: if True, connect pipe to subprocess stdin (to be
used with ``pipe:`` ffmpeg inputs).
pipe_stdout: if True, connect pipe to subprocess stdout (to be
used with ``pipe:`` ffmpeg outputs).
pipe_stderr: if True, connect pipe to subprocess stderr.
quiet: shorthand for setting ``capture_stdout`` and
``capture_stderr``.
**kwargs: keyword-arguments passed to ``get_args()`` (e.g.
``overwrite_output=True``).
Returns:
A `subprocess Popen`_ object representing the child process.
Examples:
Run and stream input::
process = (
ffmpeg
.input('pipe:', format='rawvideo', pix_fmt='rgb24', s='{}x{}'.format(width, height))
.output(out_filename, pix_fmt='yuv420p')
.overwrite_output()
.run_async(pipe_stdin=True)
)
process.communicate(input=input_data)
Run and capture output::
process = (
ffmpeg
.input(in_filename)
.output('pipe:', format='rawvideo', pix_fmt='rgb24')
.run_async(pipe_stdout=True, pipe_stderr=True)
)
out, err = process.communicate()
Process video frame-by-frame using numpy::
process1 = (
ffmpeg
.input(in_filename)
.output('pipe:', format='rawvideo', pix_fmt='rgb24')
.run_async(pipe_stdout=True)
)
process2 = (
ffmpeg
.input('pipe:', format='rawvideo', pix_fmt='rgb24', s='{}x{}'.format(width, height))
.output(out_filename, pix_fmt='yuv420p')
.overwrite_output()
.run_async(pipe_stdin=True)
)
while True:
in_bytes = process1.stdout.read(width * height * 3)
if not in_bytes:
break
in_frame = (
np
.frombuffer(in_bytes, np.uint8)
.reshape([height, width, 3])
)
out_frame = in_frame * 0.3
process2.stdin.write(
frame
.astype(np.uint8)
.tobytes()
)
process2.stdin.close()
process1.wait()
process2.wait()
.. _subprocess Popen: https://docs.python.org/3/library/subprocess.html#popen-objects
"""
    args = compile(stream_spec, cmd, overwrite_output=overwrite_output)
    stdin_stream = subprocess.PIPE if pipe_stdin else None
    stdout_stream = subprocess.PIPE if pipe_stdout or quiet else None
    stderr_stream = subprocess.PIPE if pipe_stderr or quiet else None
    # Original: return subprocess.Popen(
    #     args, stdin=pipe_stdin, stdout=stdout_stream, stderr=stderr_stream)
    # Add shell=True, modify stdin=subprocess.PIPE or modify pipe_stdin=True
    return subprocess.Popen(
        args, shell=True, stdin=subprocess.PIPE, stdout=stdout_stream, stderr=stderr_stream
    )
Add "from subprocess import CREATE_NO_WINDOW" and use "creationflags=CREATE_NO_WINDOW" for Popen. Below is updated part of "_run.py" code from ffmpeg-python library, that worked for me.
from subprocess import CREATE_NO_WINDOW
@output_operator()
def run_async(
    stream_spec,
    cmd='ffmpeg',
    pipe_stdin=False,
    pipe_stdout=False,
    pipe_stderr=False,
    quiet=False,
    overwrite_output=False,
):
    args = compile(stream_spec, cmd, overwrite_output=overwrite_output)
    stdin_stream = subprocess.PIPE if pipe_stdin else None
    stdout_stream = subprocess.PIPE if pipe_stdout or quiet else None
    stderr_stream = subprocess.PIPE if pipe_stderr or quiet else None
    return subprocess.Popen(
        args, stdin=subprocess.PIPE, stdout=stdout_stream, stderr=stderr_stream, creationflags=CREATE_NO_WINDOW
    )
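One caveat with the CREATE_NO_WINDOW approach: the constant only exists in the subprocess module on Windows, so importing it fails elsewhere. The sketch below keeps the same idea portable. The helper name no_window_kwargs is made up, and the child command is a Python one-liner standing in for ffmpeg.

```python
import subprocess
import sys


def no_window_kwargs():
    """Extra Popen keyword arguments that hide the console window on Windows."""
    if sys.platform == "win32":
        # CREATE_NO_WINDOW is only defined in subprocess on Windows.
        return {"creationflags": subprocess.CREATE_NO_WINDOW}
    return {}


# Stand-in child process; a real call would pass the ffmpeg argument list.
proc = subprocess.Popen(
    [sys.executable, "-c", "print('ok')"],
    stdout=subprocess.PIPE,
    **no_window_kwargs()
)
out, _ = proc.communicate()
print(out)
```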
Bradley's answer worked for me to stop console flashes after compiling with PyInstaller. However, I wasn't comfortable modifying the ffmpeg-python library itself, since the changes would be overwritten by the next update from pip, and it felt a little hacky in general.
I ended up hi-jacking the functions to use within my class and used those directly, which also did the trick. I think it's safer but still carries its own risks if the library is updated in a way that conflicts with the hijacked functions.
"""Run OS command
Function to merge video and
subtitle file(s) into an MKV
"""
def run_os_command(self, os_command):
    subprocess.call(os_command, shell=True)
"""FFmpeg probe hi-jack
Customized arguments to Popen to
prevent console flashes after
compiled with PyInstaller
"""
def ffmpeg_probe(self, video_input_path):
    command = ['ffprobe', '-show_format', '-show_streams', '-of', 'json']
    command += [video_input_path]
    process = subprocess.Popen(
        command,
        shell=True,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE
    )
    out, err = process.communicate()
    if process.returncode != 0:
        raise Exception(f"ffprobe error: {err}")
    return json.loads(out.decode('utf-8'))
"""FFmpeg run hi-jack
Uses argument compiler from
library but alternate sub-
process method to run command
to prevent console flashes.
"""
def ffmpeg_run(self, stream):
    os_command = ffmpeg.compile(stream, 'ffmpeg', overwrite_output=True)
    return self.run_os_command(os_command)
Then to use
probe = ffmpeg_probe(video_input_path) # use like ffmpeg.probe()
ffmpeg_run(stream) # use like ffmpeg.run() can update the function if you pass more than stream

Python subprocess call to xpdf's pdftotext not working with encoding

I am trying to run pdftotext using python subprocess module.
import subprocess
pdf = r"path\to\file.pdf"
txt = r"path\to\out.txt"
pdftotext = r"path\to\pdftotext.exe"
cmd = [pdftotext, pdf, txt, '-enc UTF-8']
response = subprocess.check_output(cmd,
shell=True,
stderr=subprocess.STDOUT)
Traceback:
CalledProcessError: Command '['path\\to\\pdftotext.exe',
'path\\to\\file.pdf', 'path\\to\\out.txt', '-enc UTF-8']'
returned non-zero exit status 99
When I remove last argument '-enc UTF-8' from cmd, it works OK in python.
When I run pdftotext pdf txt -enc UTF-8 in cmd, it works ok.
What am I missing?
Thanks.
subprocess has some complicated rules for handling commands. From the docs:
The shell argument (which defaults to False) specifies whether to use
the shell as the program to execute. If shell is True, it is
recommended to pass args as a string rather than as a sequence.
More details explained in this answer here.
So, as the docs explain, you should convert your command to a string:
cmd = r"""{} "{}" "{}" -enc UTF-8""".format('pdftotext', pdf, txt)
Now, call subprocess as:
subprocess.call(cmd, shell=True, stderr=subprocess.STDOUT)
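An alternative that avoids the shell entirely is to keep the list form but make every token its own element. In the question, '-enc UTF-8' is a single list element, so it is likely handed to pdftotext as one argument containing a space rather than as an option and its value. The sketch below shows the list construction only. build_pdftotext_cmd is a hypothetical helper, and the paths are the placeholders from the question.

```python
def build_pdftotext_cmd(exe, pdf, txt, encoding="UTF-8"):
    """Build an argument list in which every token is a separate element."""
    # "-enc" and its value are two arguments, not one string with a space.
    return [exe, "-enc", encoding, pdf, txt]


cmd = build_pdftotext_cmd(r"path\to\pdftotext.exe",
                          r"path\to\file.pdf",
                          r"path\to\out.txt")
print(cmd)
# The list can then be run without shell=True, for example:
# subprocess.check_output(cmd, stderr=subprocess.STDOUT)
```

Because no shell is involved, paths containing spaces need no extra quoting.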

using os.popen3() to extract thumbnail for a video in python

I am using ffmpeg to extract a frame from a video. This works fine when I use ffmpeg from the command line, however, when I try to do the same thing using the python:
os.popen3('ffmpeg -i videoPath -an -ss 00:00:02 -an -r 1 -vframes 1 -y picturePath')
I have no idea on how to get the extracted image. So far, I get only text saying (ffmpeg version N-62039-gc00f368 Copyright (c) 2000....) which is what I see in the command line. Would you please guide through what I need to do to get the image extracted. Thank you.
I typically use the following function for that sort of task. It includes a timeout parameter for automatically aborting excessively long-running processes. That's often handy when it comes to user-uploaded content:
import subprocess
from threading import Timer
def run_with_timeout(cmd, sec):
    def kill_proc(p, killed):
        killed['val'] = True
        p.kill()
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    killed = {'val': False}
    timer = Timer(sec, kill_proc, [p, killed])
    timer.start()
    stdout, stderr = p.communicate()
    timer.cancel()
    return p.returncode, stdout, stderr, killed['val']
With that, you can simply call any shell command with options, which is run synchronously. That means the function waits until the process is finished. Therefore, when the function is done, the thumbnail is either created or an error is returned:
videoPath = '/path/to/video/source/file.mp4'
picturePath = '/path/to/output.jpg'
result = run_with_timeout(['ffmpeg', '-i', videoPath, '-an', '-ss', '00:00:02', '-r', '1', '-vframes', '1', '-y', picturePath], 30)
if result[0] != 0:
    print('error')
else:
    print('success')
    # do something with videoPath
Works in current Python versions 2.7.x.
os.popen3() is deprecated (and no longer exists in Python 3). You should use subprocess instead.
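On Python 3, subprocess.run accepts a timeout argument directly, which makes the Timer thread unnecessary. Below is a sketch of the same helper on that basis. The return shape mirrors the function above, except that the return code is None when the process was killed, which is a choice made here rather than anything from the original answer. The child commands are Python one-liners standing in for ffmpeg.

```python
import subprocess
import sys


def run_with_timeout(cmd, sec):
    """Run cmd; return (returncode, stdout, stderr, timed_out)."""
    try:
        done = subprocess.run(cmd, stdout=subprocess.PIPE,
                              stderr=subprocess.PIPE, timeout=sec)
    except subprocess.TimeoutExpired as exc:
        # run() has already killed the child and collected partial output.
        return None, exc.stdout, exc.stderr, True
    return done.returncode, done.stdout, done.stderr, False


# Stand-ins for ffmpeg: one child finishes at once, the other sleeps too long.
quick = run_with_timeout([sys.executable, "-c", "print('done')"], 30)
slow = run_with_timeout([sys.executable, "-c", "import time; time.sleep(30)"], 1)
print(quick[0], quick[3])
print(slow[0], slow[3])
```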
