Piped FFMPEG won't write frames correctly - python

I am using Python's Image module to load JPEGs and modify them. After I have a modified image, I want to load that image in to a video, using more modified images as frames in my video.
I have 3 programs written to do this:
ImEdit (My image editing module that I wrote)
VideoWriter (writes to an mp4 file using FFMPEG) and
VideoMaker (The file I'm using to do everything)
My VideoWriter looks like this...
import subprocess as sp
import os
import Image
FFMPEG_BIN = "ffmpeg"
class VideoWriter():
    def __init__(self, xsize=480, ysize=360, FPS=29,
                 outDir=None, outFile=None):
        if outDir is None:
            print("No specified output directory. Using default.")
            outDir = "./VideoOut"
        if outFile is None:
            print("No specified output file. Setting temporary.")
            outFile = "temp.mp4"
        if (outDir and outFile) is True:
            if os.path.exists(outDir + outFile):
                print("File path", outDir + outFile, "already exists:",
                      "change output filename or",
                      "overwriting will occur.")
        self.outDir = outDir
        self.outFile = outFile
        self.xsize, self.ysize, self.FPS = xsize, ysize, FPS
        self.buildWriter()

    def setOutFile(self, fileName):
        self.outFile = filename

    def setOutDir(self, dirName):
        self.outDir = dirName

    def buildWriter(self):
        commandWriter = [FFMPEG_BIN,
                         '-y',
                         '-f', 'rawvideo',
                         '-vcodec', 'mjpeg',
                         '-s', '480x360',
                         '-i', '-',
                         '-an',  # No audio
                         '-r', str(29),
                         './{}//{}'.format(self.outDir, self.outFile)]
        self.pW = sp.Popen(commandWriter,
                           stdin=sp.PIPE)

    def writeFrame(self, ImEditObj):
        stringData = ImEditObj.getIm().tostring()
        im = Image.fromstring("RGB", (309, 424), stringData)
        im.save(self.pW.stdin, "JPEG")
        self.pW.stdin.flush()

    def finish(self):
        self.pW.communicate()
        self.pW.stdin.close()
ImEditObj.getIm() returns an instance of a Python Image object
This code works to the extent that I can load one frame into the video, but no matter how many more calls to writeFrame I make, the video only ever ends up being one frame long. I have other code that successfully builds a video out of single frames, and that code is nearly identical to this. I don't know what difference there is that makes this code not work as intended where the other code does.
My question is...
How can I modify my VideoWriter class so that I can pass in an instance of Python's Image object and write that frame to an output file? I would also like to be able to write more than one frame to the video.
I've spent 5 hours or more trying to debug this, having not found anything helpful on the internet, so if I missed any StackOverflow questions that would point me in the right direction, those would be appreciated...
EDIT:
After a bit more debugging, the issue may have been that I was trying to write to a file that already existed; however, this doesn't make much sense given the -y flag in my commandWriter. The -y flag should overwrite any file that already exists. Any thoughts on that?

I suggest that you follow the OpenCV tutorial on writing videos. This is a very common way of writing video files from Python, so you should find many answers on the internet if you can't get certain things to work.
Note that the VideoWriter will discard (and won't write) any frames that are not exactly the same pixel size that you give it on initialization.
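For illustration, a minimal sketch of that OpenCV approach could look like the following (the frame size, fps, and file names are placeholders; cv2.VideoWriter silently drops frames whose size doesn't match the one given at construction):

import cv2

# Placeholder size/fps/paths; adjust to your own frames.
width, height, fps = 480, 360, 29
writer = cv2.VideoWriter("VideoOut/temp.mp4",
                         cv2.VideoWriter_fourcc(*"mp4v"),
                         fps, (width, height))

for frame_path in ["frame0.jpg", "frame1.jpg"]:   # hypothetical frame files
    frame = cv2.imread(frame_path)                # BGR uint8 array
    frame = cv2.resize(frame, (width, height))    # must match the writer's size
    writer.write(frame)

writer.release()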

Related

How to utilise ffmpeg to extract key frames from a video stream and only print the labels present within these frames?

So a bit of context, I'm using the TensorFlow object detection API for a project, and I've modified the visualization_utils file to print any present class labels to the terminal and then write them to a .txt file. From a bit of research I've come across FFmpeg, I'm wondering if there is a function I can use in FFmpeg so that it only prints and writes the class labels from keyframes within the video? - i.e. when there is a change in the video. At the moment it is printing all the class labels per frame even if there is no change, so I have duplicate numbers of labels even if there is no new object within the video. Following on from this, would I have to apply this keyframe filtering to an input video beforehand?
Thanks in advance!
I'm using opencv2 to capture my video input.
Please see below for code:
visualization_utils.py - inside the draw_bounding_box_on_image_array function:
# Write video output to file for evaluation.
f = open("ObjDecOutput.txt", "a")
print(display_str_list[0])
f.write(display_str_list[0])
Thought I'd just follow up on this: I ended up using ffmpeg's mpdecimate and setpts filters to remove duplicate and similar frames.
ffmpeg -i example.mp4 -vf mpdecimate=frac=1,setpts=N/FRAME_RATE/TB example_decimated.mp4
This, however, didn't solve the problem of duplicates within the file I was writing the labels to. To solve that, I appended each row in the file to a list, looped through it to remove runs of duplicated elements, and kept only the first occurrence of each run in a new list (a sketch of that loop follows).
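A minimal sketch of that deduplication step, assuming the labels are written one per line to the ObjDecOutput.txt file shown above, might look like this:

# Collapse consecutive duplicate labels, keeping the first of each run.
deduped = []
with open("ObjDecOutput.txt") as f:
    for line in f:
        label = line.strip()
        if not deduped or deduped[-1] != label:
            deduped.append(label)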
Finally, I found the solution here after a year. However, there is a small bug in the code converted from this script.
The fix is the additional check and frame["key_frame"] in the condition below.
import json
import subprocess

def get_frames_metadata(file):
    command = '"{ffexec}" -show_frames -print_format json "{filename}"'.format(ffexec='ffprobe', filename=file)
    response_json = subprocess.check_output(command, shell=True, stderr=None)
    frames = json.loads(response_json)["frames"]
    frames_metadata, frames_type, frames_type_bool = [], [], []
    for frame in frames:
        if frame["media_type"] == "video":
            video_frame = json.dumps(dict(frame), indent=4)
            frames_metadata.append(video_frame)
            frames_type.append(frame["pict_type"])
            if frame["pict_type"] == "I" and frame["key_frame"]:
                frames_type_bool.append(True)
            else:
                frames_type_bool.append(False)
    # print(frames_type)
    return frames_metadata, frames_type, frames_type_bool
The frame types are stored in frames_type, but don't trust it; the true keyframes are flagged in frames_type_bool.
I tested a clip for which I had two consecutive I-frames at the beginning, but avidemux was showing only one. So I checked the original code and found that some frames may have pict_type = I but key_frame = False. I thus fixed the code.
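As a quick sanity check, ffprobe can also print both fields directly from the command line (a sketch; Clip.mp4 is a placeholder):
ffprobe -select_streams v -show_frames -show_entries frame=pict_type,key_frame -of csv Clip.mp4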
After having frames_type_bool, you can extract the True indices and use opencv or imageio to extract only the keyframes.
This is how to use this function and imageio to show the keyframes:
import matplotlib.pyplot as plt
import imageio

filename = 'Clip.mp4'

# extract frame types
_, _, isKeyFrame = get_frames_metadata(filename)

# keep keyframes indices
keyframes_index = [i for i, b in enumerate(isKeyFrame) if b]

# open file
vid = imageio.get_reader(filename, 'ffmpeg')
for i in keyframes_index:
    image = vid.get_data(i)
    fig = plt.figure()
    fig.suptitle('image #{}'.format(i), fontsize=20)
    plt.imshow(image)
    plt.show()

python ghostscript not closing output file

I'm trying to turn PDF files with one or many pages into images for each page. This is very much like the question found here. In fact, I'm trying to use the code from @Idan Yacobi in that post to accomplish this. His code looks like this:
import ghostscript

def pdf2jpeg(pdf_input_path, jpeg_output_path):
    args = ["pdf2jpeg",  # actual value doesn't matter
            "-dNOPAUSE",
            "-sDEVICE=jpeg",
            "-r144",
            "-sOutputFile=" + jpeg_output_path,
            pdf_input_path]
    ghostscript.Ghostscript(*args)
When I run the code I get the following output from python:
##### 238647312 c_void_p(238647312L)
When I look at the folder where the new .jpg image is supposed to be created, there is a file there with the new name. However, when I attempt to open the file, the image preview says "Windows Photo Viewer can't open this picture because the picture is being edited in another program."
It seems that for some reason Ghostscript opened the file and wrote to it, but didn't close it after it was done. Is there any way I can force that to happen? Or, am I missing something else?
I already tried changing the last line above to the code below to explicitly close ghostscript after it was done.
GS = ghostscript.Ghostscript(*args)
GS.exit()
I was having the same problem where the image files were kept open, but when I looked into the ghostscript package's __init__.py file (found in PythonDirectory\Lib\site-packages\ghostscript\__init__.py), I saw that the exit method has a line commented out.
The gs.exit(self._instance) line is commented out by default, but when you uncomment it, the image files are closed.
def exit(self):
    global __instance__
    if self._initialized:
        print '#####', self._instance.value, __instance__
        if __instance__:
            gs.exit(self._instance)  # uncomment this line
        self._instance = None
        self._initialized = False
I was having this same problem while batching a large number of pdfs, and I believe I've isolated it to an issue with the python bindings for Ghostscript: like you said, the image file is not properly closed. To bypass this, I had to use an os system call. So given your example, the function and call would be replaced with:
os.system("gs -dNOPAUSE -sDEVICE=jpeg -r144 -sOutputFile=" + jpeg_output_path + ' ' + pdf_input_path)
You may need to change "gs" to "gswin32c" or "gswin64c" depending on your operating system. This may not be the most elegant solution, but it fixed the problem on my end.
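If you'd rather avoid building a shell string, a roughly equivalent call through subprocess might look like this (a sketch; the paths are placeholders, and -dBATCH is added so Ghostscript exits when it is done):

import subprocess

pdf_input_path = "input.pdf"          # placeholder paths
jpeg_output_path = "output-%03d.jpg"

subprocess.run(["gs",                 # or "gswin32c" / "gswin64c" on Windows
                "-dNOPAUSE", "-dBATCH",
                "-sDEVICE=jpeg", "-r144",
                "-sOutputFile=" + jpeg_output_path,
                pdf_input_path],
               check=True)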
My workaround was actually just to install an image printer and have Python print the PDF using the image printer instead, thus creating the desired jpeg image. Here's the code I used:
import win32api

def pdf_to_jpg(pdf_path):
    """
    Turn pdf into jpg image(s) using jpg printer
    :param pdf_path: Path of the PDF file to be converted
    """
    # print pdf to jpg using jpg printer
    tempprinter = "ImagePrinter Pro"
    printer = '"%s"' % tempprinter
    win32api.ShellExecute(0, "printto", pdf_path, printer, ".", 0)
I was having the same problem when running into a password protected PDF - ghostscript would crash and not close the PDF preventing me from deleting the PDF.
Kishan's solution was already applied for me and therefore it wouldn't help my problem.
I fixed it by importing GhostscriptError and instantiating an empty Ghostscript before a try/finally block like so:
from ghostscript import GhostscriptError
from ghostscript import Ghostscript

...

# in my decryptPDF function
GS = Ghostscript()
try:
    GS = Ghostscript(*args)
finally:
    GS.exit()

...

# in my function that runs decryptPDF function
try:
    if PDFencrypted(append_file_path):
        decryptPDF(append_file_path)
except GhostscriptError:
    remove(append_file_path)
    # more code to log and handle the skipped file

...
For those that stumble upon this with the same problem: I looked through the python ghostscript __init__ file and discovered the ghostscript.cleanup() function.
Therefore, I was able to solve the problem by adding this simple one-liner to the end of my script [or the end of the loop].
ghostscript.cleanup()
Hope it helps someone else because it frustrated me for quite a while.
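For example, when batching many PDFs the call can sit at the end of each loop iteration (a sketch reusing the pdf2jpeg function from earlier in this thread; the file list is a placeholder):

import ghostscript

for pdf_path in ["a.pdf", "b.pdf", "c.pdf"]:   # hypothetical batch
    pdf2jpeg(pdf_path, pdf_path.replace(".pdf", ".jpg"))
    ghostscript.cleanup()  # release the instance so the output file gets closed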

No audio when adding Mp3 to VideoFileClip MoviePy

I'm trying to add an mp3 audio file to a video clip that I'm creating out of images with MoviePy. When the script runs it creates the mp4 file and plays successfully, however there's no audio. I'm not really sure why and can't seem to find much documentation around this in general. MoviePy is pretty new to me, so any help would be appreciated - thank you!
def make_video(images):
    image_clips = []
    for img in images:
        if not os.path.exists(img):
            raise FileNotFoundError(img)
        ic = ImageClip(img).set_duration(3)
        image_clips.append(ic)
    video = concatenate(image_clips, method="compose")
    video.set_audio(AudioFileClip("audio.mp3"))
    video.write_videofile("mp4_with_audio.mp4", fps=60, codec="mpeg4")
This worked for me:
clip.write_videofile(out_path,
                     codec='libx264',
                     audio_codec='aac',
                     temp_audiofile='temp-audio.m4a',
                     remove_temp=True)
Found it here: https://github.com/Zulko/moviepy/issues/51
Admittedly, this question is old but comes high in search results for the problem. I had the same issue and think the solution can be clarified.
The line:
video.set_audio(AudioFileClip("audio.mp3"))
actually does not change the audio track of the "video" object, but returns a copy of the object with the new AudioFileClip attached to it.
That means that the method:
video.write_videofile("mp4_with_audio.mp4", fps=60, codec="mpeg4")
does not write the final file with the new audio track, since the "video" object remains unchanged.
Changing the script as per the below solved the issue for me.
video_with_new_audio = video.set_audio(AudioFileClip("audio.mp3"))
video_with_new_audio.write_videofile("mp4_with_audio.mp4", fps=60, codec="mpeg4")
See also the docs
Check the video mp4_with_audio.mp4 with VLC media player; I also had the same issue with QuickTime Player.
I ran into this problem too. I found a solution; try
video = video.set_audio(AudioFileClip("audio.mp3"))
I was doing something similar and found that moviepy 1.0.1 did not call ffmpeg with the right arguments to combine the video and audio for mp4 video. I solved this through a workaround using ffmpeg directly. It uses the temp audio file and video file from moviepy to create a final file. This is a similar question: Output video has no sound
Since you are working with mp3, you may need to have ffmpeg convert to aac, so this code does that.
This link helped me with ffmpeg: https://superuser.com/questions/277642/how-to-merge-audio-and-video-file-in-ffmpeg
video_with_new_audio = video.set_audio(AudioFileClip("audio.mp3"))
video_with_new_audio.write_videofile("temp_moviepy.mp4", temp_audiofile="tempaudio.m4a",
                                     codec="libx264", remove_temp=False, audio_codec='aac')

import subprocess as sp

ffmpeg_log = "ffmpeg_log.txt"  # log file for ffmpeg's stderr (name assumed; it was left undefined in the original snippet)

command = ['ffmpeg',
           '-y',  # approve output file overwrite
           '-i', "temp_moviepy.mp4",
           '-i', "tempaudio.m4a",
           '-c:v', 'copy',
           '-c:a', 'aac',  # to convert mp3 to aac
           '-shortest',
           "mp4_with_audio.mp4"]

with open(ffmpeg_log, 'w') as f:
    process = sp.Popen(command, stderr=f)
Use this:
video.write_videofile("output.mp4", fps=30, audio_codec="aac", audio_bitrate="192k")

having cv2.imread reading images from file objects or memory-stream-like data (here non-extracted tar)

I have a .tar file containing several hundreds of pictures (.png). I need to process them via opencv.
I am wondering whether - for efficiency reasons - it is possible to process them without touching the disk. In other words, I want to read the pictures from the memory stream related to the tar file.
Consider for instance
import tarfile
import cv2
tar0 = tarfile.open('mytar.tar')
im = cv2.imread( tar0.extractfile('fname.png').read() )
The last line doesn't work as imread expects a file name rather than a stream.
Consider that this way of reading directly from the tar stream can be achieved e.g. for text (see e.g. this SO question).
Any suggestion to open the stream with the correct png encoding?
Untarring to ramdisk is of course an option, although I was looking for something more cachable.
Thanks to the suggestion of @abarry and this SO answer I managed to find the answer.
Consider the following
import numpy as np

def get_np_array_from_tar_object(tar_extractfl):
    '''converts a buffer from a tar file into an np.array'''
    return np.asarray(
        bytearray(tar_extractfl.read()),
        dtype=np.uint8)

tar0 = tarfile.open('mytar.tar')
im0 = cv2.imdecode(
    get_np_array_from_tar_object(tar0.extractfile('fname.png')),
    0)
Perhaps use imdecode with a buffer coming out of the tar file? I haven't tried it but seems promising.

Converting PDF to images automatically

So the state I'm in released a bunch of data in PDF form, but to make matters worse, most (all?) of the PDFs appear to be letters typed in Office, printed/fax, and then scanned (our government at its best eh?). At first I thought I was crazy, but then I started seeing numerous pdfs that are 'tilted', like someone didn't get them on the scanner properly. So, I figured the next best thing to getting the actual text out of them, would be to turn each page into an image.
Obviously this needs to be automated, and I'd prefer to stick with Python if possible. If Ruby or Perl have some form of implementation that's just too awesome to pass up, I can go that route. I've tried pyPDF for text extraction, but that obviously didn't do me much good. I've tried swftools, but the images I'm getting from that are just shy of completely unusable. It just seems like the fonts get ruined in the conversion. I also don't really care about the image format on the way out, just as long as they're relatively lightweight and readable.
If the PDFs are truly scanned images, then you shouldn't convert the PDF to an image, you should extract the image from the PDF. Most likely, all of the data in the PDF is essentially one giant image, wrapped in PDF verbosity to make it readable in Acrobat.
You should try the simple expedient of simply finding the image in the PDF, and copying the bytes out: Extracting JPGs from PDFs. The code there is dead simple, and there are probably dozens of reasons it won't work on your PDF files. But if it does, you'll have a quick and painless way to get the image data out of the PDF files.
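As a rough illustration of that idea, a sketch that scans the raw PDF bytes for JPEG start and end markers might look like this (it assumes the scans are stored as baseline JPEGs inside the PDF streams, which is common for scanned documents but by no means guaranteed; the file names are placeholders):

# Pull out embedded JPEGs by scanning for SOI (FFD8) and EOI (FFD9) markers.
with open("scanned.pdf", "rb") as f:          # hypothetical input file
    data = f.read()

pos, count = 0, 0
while True:
    start = data.find(b"\xff\xd8\xff", pos)   # JPEG start-of-image
    if start < 0:
        break
    end = data.find(b"\xff\xd9", start)       # JPEG end-of-image
    if end < 0:
        break
    with open("page_{}.jpg".format(count), "wb") as out:
        out.write(data[start:end + 2])
    count += 1
    pos = end + 2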
You could call e.g. pdftoppm from the command-line (or using Python's subprocess module) and then convert the resulting PPM files to the desired format using e.g. ImageMagick (again, using subprocess or some bindings if they exist).
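A minimal sketch of that route might be (pdftoppm and ImageMagick's convert need to be installed; the file names are placeholders, and pdftoppm's output numbering/padding depends on the page count):

import subprocess

# Render each PDF page to a PPM file named like page-1.ppm, page-2.ppm, ...
subprocess.run(["pdftoppm", "-r", "150", "input.pdf", "page"], check=True)

# Convert one of the resulting PPM files to PNG with ImageMagick.
subprocess.run(["convert", "page-1.ppm", "page-1.png"], check=True)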
Ghostscript is ideal for converting PDF files to images. It is reliable and has many configurable options. It's also available under the GPL license or a commercial license. You can call it from the command line or use its native API (a minimal command-line sketch follows the links below). For more information:
Ghostscript Main Website
Ghostscript docs on Command line usage
Another stackoverflow thread that provides some examples of invoking Ghostscript's command line interface from Python
Ghostscript API Documentation
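For example, a typical command line for rendering each page of a PDF to a PNG might look like this (the resolution and file names are arbitrary choices):
gs -dNOPAUSE -dBATCH -sDEVICE=png16m -r150 -sOutputFile=page-%03d.png input.pdf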
Here's an alternative approach to turning a .pdf file into images: Use an image printer. I've successfully used the function below to "print" pdf's to jpeg images with ImagePrinter Pro. However, there are MANY image printers out there. Pick the one you like. Some of the code may need to be altered slightly based on the image printer you pick and the standard file saving format that image printer uses.
import win32api
import os
import time

def pdf_to_jpg(pdfPath, pages):
    # print pdf using jpg printer
    # 'pages' is the number of pages in the pdf
    filepath = pdfPath.rsplit('/', 1)[0]
    filename = pdfPath.rsplit('/', 1)[1]

    # print pdf to jpg using jpg printer
    tempprinter = "ImagePrinter Pro"
    printer = '"%s"' % tempprinter
    win32api.ShellExecute(0, "printto", filename, printer, ".", 0)

    # Add time delay to ensure pdf finishes printing to file first
    fileFound = False
    if pages > 1:
        jpgName = filename.split('.')[0] + '_' + str(pages - 1) + '.jpg'
    else:
        jpgName = filename.split('.')[0] + '.jpg'
    jpgPath = filepath + '/' + jpgName
    waitTime = 30
    for i in range(waitTime):
        if os.path.isfile(jpgPath):
            fileFound = True
            break
        else:
            time.sleep(1)

    # print Error if the file was never found
    if not fileFound:
        print("ERROR: " + jpgName + " wasn't found after " + str(waitTime) + " seconds")

    return jpgPath
The resulting jpgPath variable tells you the path location of the last jpeg page of the pdf printed. If you need to get another page, you can easily add some logic to modify the path to get prior pages.
in pdf_to_jpg(pdfPath)
      6     # 'pages' is the number of pages in the pdf
      7     filepath = pdfPath.rsplit('/', 1)[0]
----> 8     filename = pdfPath.rsplit('/', 1)[1]
      9
     10     # print pdf to jpg using jpg printer

IndexError: list index out of range
With Wand there are now excellent imagemagick bindings for Python that make this a very easy task.
Here is the code necessary for converting a single PDF file into a sequence of PNG images:
from wand.image import Image

input_path = "name_of_file.pdf"
output_name = "name_of_outfile_{index}.png"

source = Image(filename=input_path, resolution=300, width=2200)
images = source.sequence
for i in range(len(images)):
    Image(images[i]).save(filename=output_name.format(index=i))
