MoviePy write_videofile taking hours - python

I am trying to concatenate every clip in a directory as well as add an intro and outro. In the future I will also be adding editing functions such as zooms and rotations, which is why I am not directly calling ffmpeg, but instead using MoviePy.
Everything runs smoothly, until final_vid.write_videofile(). It first renders the audio at a fairly good speed.
chunk: 55%|█████▍ | 8721/15977 [00:14<00:10, 672.54it/s, now=None]
When it then tries to render the video, the speed drops massively, to an expected rendering time of 72 hours. This is running on a Ryzen 2600 with 16 GB of RAM, so I doubt the hardware is the bottleneck.
t: 0%| | 5/43473 [00:28<72:17:35, 5.99s/it, now=None]
I have tried this with different codecs, fps settings, logger off and multiple other settings. How would I go about speeding this up, since this cannot reasonably be the maximum speed of MoviePy?
Full code below:
import os
from datetime import datetime
from moviepy.editor import VideoFileClip, concatenate_videoclips

def edit(game):
    intro = VideoFileClip("intro.mp4")
    final_vid = intro
    game = game.replace(" ", "")
    game_treated = game.replace(" ", "%20")
    for clip_name in os.listdir("current_vids"):
        new_clip = VideoFileClip(os.path.join("current_vids", clip_name), target_resolution=(1920, 1080))
        final_vid = concatenate_videoclips(clips=[final_vid, new_clip], method="compose")
    outro = VideoFileClip("outro.mp4")
    final_vid = concatenate_videoclips(clips=(final_vid, outro), method="compose")
    final_vid.write_videofile(game + datetime.now().strftime("%Y-%m-%d") + ".mp4")
    # remove the source clips once the final video has been written
    for clip_name in os.listdir("current_vids"):
        os.remove(os.path.join("current_vids", clip_name))
    return game_treated + datetime.now().strftime("%Y-%m-%d") + ".mp4"
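One thing that may be worth trying, offered as an assumption rather than a confirmed diagnosis: each pass through the loop wraps the previous result in another "compose" concatenation, so every output frame may have to be resolved through one nested composite per clip. A minimal sketch that collects all clips first and concatenates once is shown here. The helper names ordered_clip_paths and build_final_video are hypothetical, and the import path assumes MoviePy 1.x.

```python
import os


def ordered_clip_paths(clip_dir, intro="intro.mp4", outro="outro.mp4"):
    """Return the intro, then every file in clip_dir (sorted by name), then the outro."""
    middle = [os.path.join(clip_dir, name) for name in sorted(os.listdir(clip_dir))]
    return [intro, *middle, outro]


def build_final_video(paths):
    """Open every path and concatenate all clips in a single call."""
    # Imported lazily so the path helper works without MoviePy installed.
    # Assumption: MoviePy 1.x, where these names live in moviepy.editor.
    from moviepy.editor import VideoFileClip, concatenate_videoclips

    clips = [VideoFileClip(p) for p in paths]
    return concatenate_videoclips(clips, method="compose")
```

It would be used as build_final_video(ordered_clip_paths("current_vids")).write_videofile("out.mp4"). Whether this removes the slowdown depends on the clips involved, so it is worth timing on a short sample first.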


Concatenate a video, image and audio using ffmpeg

I am trying to concatenate a group of images, each with associated audio, with a video clip at the start and end of the video. Whenever I concatenate an image with its associated audio, it doesn't play back correctly in VLC media player: the image is displayed for a single frame before the video cuts to black while the audio keeps playing. I came across a GitHub issue whose accepted solution is the one I implemented, but one of the comments there mentioned this same problem of incorrect playback, as well as an error on YouTube.
# Generates a clip from an image and a wav file, helper function for export_video
def generate_clip(img):
    transition_cond = os.path.exists("static/transitions/" + img + ".mp4")
    chart_path = os.path.exists("charts/" + img + ".png")
    if transition_cond:
        clip = ffmpeg.input("static/transitions/" + img + ".mp4")
    elif chart_path:
        clip = ffmpeg.input("charts/" + img + ".png")
    else:
        # fall back to the generic transition image
        clip = ffmpeg.input("static/transitions/Transition.jpg")
    audio_clip = ffmpeg.input("audio/" + img + ".wav")
    clip = ffmpeg.concat(clip, audio_clip, v=1, a=1)
    clip = ffmpeg.filter(clip, "setdar", "16/9")
    return clip
# Combines the charts from charts/ and the audio from audio/ to generate one final video that will be uploaded to YouTube
def export_video(CHARTS):
    clips = []
    intro = generate_clip("Intro")
    clips.append(intro)
    for key in CHARTS.keys():
        value = CHARTS.get(key)
        value.insert(0, key)
        subclip = []
        for img in value:
            subclip.append(generate_clip(img))
        concat_clip = ffmpeg.concat(*subclip)
        clips.append(concat_clip)
    outro = generate_clip("Outro")
    clips.append(outro)
    concat_clip = ffmpeg.concat(*clips)
It is unfortunate that the concat filter does not offer a shortest option like overlay does. Anyway, the issue here is that the image2 demuxer uses 25 fps by default, so a video stream containing only one image lasts just 1/25 of a second. There are several ways to address this, but you first need to get the duration of each paired audio file. To incorporate the duration information into the ffmpeg command, you can:
1. Use the tpad filter for each video (in series with setdar) to make the video duration match the audio. The padded amount should be 1/25 of a second less than the audio duration.
2. Specify the -loop 1 input option so the image loops (indefinitely), then add a -t {duration} input option to limit the number of loops. Be aware that the video duration may not be exact.
3. Specify -r {1/duration} as an input option so the image lasts as long as the audio, and use the fps filter on each input to set the output frame rate.
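As a rough illustration of the -loop 1 / -t approach described above, the sketch below reads the audio duration with Python's standard wave module and builds an ffmpeg argument list for a single image/audio pair. The function names and the choice of -pix_fmt yuv420p are assumptions made for the example, not part of the original answer, and the sketch only covers PCM WAV files.

```python
import wave


def wav_duration_seconds(wav_path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(wav_path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def still_image_clip_args(image_path, wav_path, out_path):
    """Build an ffmpeg argument list that shows one image for the audio's duration."""
    duration = wav_duration_seconds(wav_path)
    return [
        "ffmpeg",
        "-loop", "1",             # input option: repeat the single image
        "-t", f"{duration:.3f}",  # input option: stop reading the image after the audio length
        "-i", image_path,
        "-i", wav_path,
        "-vf", "setdar=16/9",
        "-pix_fmt", "yuv420p",    # assumption: widely compatible pixel format
        out_path,
    ]
```

The resulting list can be passed to subprocess.run(args, check=True). As the answer notes, the video duration produced this way may not match the audio exactly.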
I'm not familiar with ffmpeg-python, so I cannot provide a solution for it, but if you're interested, I'd be happy to post equivalent code using my ffmpegio package.
ffmpegio Solution
Here is how I'd code the 3rd solution with ffmpegio:
import ffmpegio
from os import path

def generate_clip(img):
    """
    Generates a clip from an image and a wav file,
    helper function for export_video
    """
    transition_cond = path.exists("static/transitions/" + img + ".mp4")
    chart_path = path.exists("charts/" + img + ".png")
    if transition_cond:
        video_file = "static/transitions/" + img + ".mp4"
    elif chart_path:
        video_file = "charts/" + img + ".png"
    else:
        video_file = "static/transitions/Transition.jpg"
    audio_file = "audio/" + img + ".wav"
    video_opts = {}
    if not transition_cond:
        # audio_streams_basic() returns audio duration in seconds as Fraction
        # set the "framerate" of the video to be the reciprocal
        info = ffmpegio.probe.audio_streams_basic(audio_file)
        video_opts["r"] = 1 / info[0]["duration"]
    return [(video_file, video_opts), (audio_file, None)]
def export_video(CHARTS):
    """
    Combines the charts from charts/ and the audio from audio/
    to generate one final video that will be uploaded to Youtube
    """
    # get all input files (video/audio pairs)
    clips = [
        generate_clip("Intro"),
        *(generate_clip(img) for key, value in CHARTS.items() for img in value),
        generate_clip("Outro"),
    ]
    # number of clips
    nclips = len(clips)
    # filter chains to set DAR and fps of all video streams
    vfilters = (f"[{2*n}:v]setdar=16/9,fps=30[v{n}]" for n in range(nclips))
    # concatenation filter input: [v0][1:a][v1][3:a][v2][5:a]...
    concatfilter = "".join((f"[v{n}][{2*n+1}:a]" for n in range(nclips))) + f"concat=n={nclips}:v=1:a=1[vout][aout]"
    # form the full filtergraph
    fg = ";".join((*vfilters, concatfilter))
    # set output file and options
    output = ("export/export.mp4", {"map": ["[vout]", "[aout]"]})
    # run ffmpeg
    ffmpegio.ffmpegprocess.run(
        {
            "inputs": [input for pair in clips for input in pair],
            "outputs": [output],
            "global_options": {"filter_complex": fg},
        }
    )
Since this code does not use the read/write features, ffmpegio-core package suffices:
pip install ffmpegio-core
Make sure that the FFmpeg binary can be found by ffmpegio. See the installation doc.
Here are the direct links to the documentation of the functions used:
ffmpeg_args dict argument
probe.audio_streams_basic (Ignore the documentation error: duration and start_time are both of Fraction type.)
The code has not been fully validated. If you encounter a problem, it might be easiest to post it on the GitHub Discussions to proceed.

Python New n frame video is heavier than the input video

I managed to write code that decimates my video, keeping only 1 frame out of 10, in order to make my neural network more efficient in the future for character recognition.
The new video, exit_video, is decimated correctly: it plays much faster than the original.
1: When I print the fps of the new video, I get 30 again despite the decimation.
2: Why is my new video larger? It is 50,000 KB, whereas the first one was 42,000 KB.
Thanks for your help
import cv2
#import os
import sys

video = cv2.VideoCapture("./video/inputvideo.mp4")
frameWidth = int(video.get(cv2.CAP_PROP_FRAME_WIDTH))
frameHeight = int(video.get(cv2.CAP_PROP_FRAME_HEIGHT))
frameFourcc = int(video.get(cv2.CAP_PROP_FOURCC))
success, image = video.read()
if not success:
    print('unable to read a frame')
fps = video.get(cv2.CAP_PROP_FPS)
print("input fps " + str(fps))
count = 0
exit_file = 'decimated_v1.mp4'
exit_video = cv2.VideoWriter(exit_file, frameFourcc, fps, (frameWidth, frameHeight))
while True:
    # keep one frame out of every 10
    if ((count % 10) == 0):
        exit_video.write(image)
    success, image = video.read()
    if not success:
        break
    count += 1
exit_video.release()
exit_video_info = cv2.VideoCapture("decimated_v1.mp4")
fps_sortie = exit_video_info.get(cv2.CAP_PROP_FPS)
print("output fps " + str(fps_sortie))
Decimating a video file that's not all intra frames will require re-encoding. Unless your input file is e.g. ProRes or MJPEG, that's likely going to be the case.
Since you're not setting encoding parameters, OpenCV likely ends up using defaults that produce a higher bitrate than your input file.
You'll probably have a better time using the FFmpeg tool and its select filter than using OpenCV.
ffmpeg -i ./video/inputvideo.mp4 -vf select='not(mod(n\,10))' ./decimated_v1.mp4
would be the basic syntax to use every tenth frame from the input; you can then add your desired encoding parameters such as -crf to adjust the H.264 rate factor – or, of course, you can change to a different codec altogether.
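If the command needs to be driven from Python, a small wrapper along these lines could build and run it. This is only a sketch: the function names, the libx264 codec choice and the default -crf value of 23 are assumptions for illustration. Passing the arguments as a list avoids the shell quoting around the filter expression.

```python
import subprocess


def decimate_command(src, dst, keep_every=10, crf=23):
    """Build an ffmpeg command that keeps every `keep_every`-th frame."""
    # The backslash escapes the comma inside the filter expression.
    select = f"select=not(mod(n\\,{keep_every}))"
    return ["ffmpeg", "-i", src, "-vf", select, "-c:v", "libx264", "-crf", str(crf), dst]


def decimate(src, dst, **kwargs):
    """Run the command; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(decimate_command(src, dst, **kwargs), check=True)
```

For example, decimate("./video/inputvideo.mp4", "./decimated_v1.mp4", crf=28) would trade some quality for a smaller file.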

Store image in memory then write to disk

This program snippet crops objects recognized by color out of the video frame and saves them to disk.
It distinguishes different objects by their full save path and objectID, because I want to store the pictures of each object in a separate folder.
This has worked well so far. However, for high-resolution images, the continuous writing to disk completely freezes the program.
I would like your help with temporarily storing the cropped images and their names in memory and writing them to disk at the end of the program.
I mean bypassing the cv.imwrite(os.path.join(path + cwd + str(objectID), fileName), crop_img) call for as long as the program is busy cropping the images.
path = os.getcwd()
cwd = "/Data/"
for (objectID, centroid) in objects.items():
    # draw both the ID of the object and the centroid of the
    # object on the output frame
    text = "ID {}".format(objectID)
    cv.putText(frame, text, (centroid[0], centroid[1] - 20),
               cv.FONT_HERSHEY_SIMPLEX, 0.5, (255, 0, 0), 2)
    cv.circle(frame, (centroid[0], centroid[1]), 2, (255, 0, 0), -1)
    # coordinates for cropping
    ext_left = centroid[0] - 70
    ext_right = centroid[0] + 70
    ext_top = centroid[1] - 70
    ext_bot = centroid[1] + 70
    crop_img = frame[ext_top:ext_bot, ext_left:ext_right]
    createFolder(path + cwd + str(objectID))
    fileName = '%s.jpg' % (str(objectID) + str(uuid.uuid4()))
    cv.imwrite(os.path.join(path + cwd + str(objectID), fileName), crop_img)
You could create a list and append each cropped image to it as you create them, like this:
crop_imgs = []
for (objectID, centroid) in objects.items():
    # compute ext_top, ext_bot, ext_left and ext_right as before
    crop_img = frame[ext_top:ext_bot, ext_left:ext_right]
    crop_imgs.append((objectID, crop_img))
We append a tuple of both the objectID and the image itself. You could also use a dict, if you prefer.
Then separate your writing loop here:
for (objectID, crop_img) in crop_imgs:
    createFolder(path + cwd + str(objectID))
    fileName = '%s.jpg' % (str(objectID) + str(uuid.uuid4()))
    cv.imwrite(os.path.join(path + cwd + str(objectID), fileName), crop_img)
However, consider the drawbacks of your proposal:
The overall runtime of your program will remain the same, but now you will not get intermediate results written to disk. If the program crashes, you'll lose everything, and can't resume it without starting over.
Unlike in a video file, the images will be stored in memory with no compression. There's no point in letting available memory go unused, but if you exhaust available memory, the operating system has to page the memory to disk, which will be slower than if you had just written out the compressed JPEGs at each step.
By the way, even without modifying the code, you could use a RAM disk, which is a virtual filesystem that exists only in RAM, then copy the results to your hard disk. The same caveats apply.
You could potentially get speed gains from using the threading or multiprocessing libraries to add processed video frames to a queue and have another thread/process perform the encoding to JPEG.
Another minor improvement: Use an incrementing number instead of a UUID. Generating lots of random numbers can be slow.
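The queue-based idea mentioned above could look roughly like the following sketch, which uses only the standard library. The names start_writer and stop_writer are hypothetical, and the save argument stands in for whatever actually writes the file (cv.imwrite in this question). The queue is bounded so that memory use stays capped if the writer falls behind.

```python
import queue
import threading


def start_writer(save, max_pending=64):
    """Start a background thread that calls save(path, image) for each queued item."""
    pending = queue.Queue(maxsize=max_pending)  # put() blocks when the queue is full

    def worker():
        while True:
            item = pending.get()
            if item is None:  # sentinel: no more work
                break
            path, image = item
            save(path, image)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return pending, thread


def stop_writer(pending, thread):
    """Signal the worker to finish and wait until everything queued has been saved."""
    pending.put(None)
    thread.join()
```

In the cropping loop this would be used as pending.put((os.path.join(path + cwd + str(objectID), fileName), crop_img.copy())), with stop_writer(pending, thread) called once at the end. The .copy() is there because a NumPy slice is a view into frame, so later drawing on the frame could otherwise change an image that is still waiting to be written.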

Generate 2d images of molecules from PubChem FTP data

Rather than crawl PubChem's website, I'd prefer to be nice and generate the images locally from the data on the PubChem FTP site.
The only problem is that I'm limited to OS X and Linux, and I can't seem to find a way of programmatically generating the 2D images that they have on their site.
Each compound page shows such an image under the heading "2D Structure".
That is what I'm trying to generate.
If you want something that works out of the box, I would suggest using molconvert from ChemAxon's Marvin, which is free for academics. It can be used easily from the command line and it supports plenty of input and output formats. So for your example it would be:
molconvert "png" -s "C1=CC(=C(C=C1[N+](=O)[O-])[N+](=O)[O-])Cl" -o cdnb.png
Resulting in the following image:
It also allows you to set parameters such as width, height, quality, background color and so on.
However, if you are a programmer, I would definitely recommend RDKit. The following code generates images for a pair of compounds given as SMILES.
from rdkit import Chem
from rdkit.Chem import Draw
ms_smis = [["C1=CC(=C(C=C1[N+](=O)[O-])[N+](=O)[O-])Cl", "cdnb"],
["C1=CC(=CC(=C1)N)C(=O)N", "3aminobenzamide"]]
ms = [[Chem.MolFromSmiles(x[0]), x[1]] for x in ms_smis]
for m in ms:
    Draw.MolToFile(m[0], m[1] + ".svg", size=(800, 800))
This gives you following images:
So I also emailed the PubChem guys and they got back to me very quickly with this response:
The only bulk access we have to images is through the download
You can request up to 50,000 images at a time.
Which is better than I was expecting, but still not amazing since it requires downloading things that I in theory could generate locally. So I'm leaving this question open until some kind soul writes an open source library to do the same.
I figure I might as well save people some time if they are doing the same thing as I am. I've created a Ruby gem built on Mechanize to automate the downloading of images. Please be kind to their servers and only download what you need.
gem install pubchem
An open source option is the Indigo Toolkit, which also has pre-compiled packages for Linux, Windows, and MacOS and language bindings for Python, Java, .NET, and C libraries. I chose the 1.4.0 beta.
I had a similar interest to yours in converting SMILES to 2D structures and adapted my Python to address your question and to capture timing information. It uses the PubChem FTP (Compound/Extras) download of CID-SMILES.gz. The following script is an implementation of a local SMILES-to-2D-structure converter that reads a range of rows from the PubChem CID-SMILES file of isomeric SMILES (which contains over 102 million compound records) and converts the SMILES to PNG images of the 2D structures. In three tests with 1000 SMILES-to-structure conversions, it took 35, 50, and 60 seconds to convert 1000 SMILES at file row offsets of 0, 100,000, and 10,000,000 on my Windows 10 laptop (Intel i7-7500U CPU, 2.70GHz) with a solid state drive and running Python 3.7.4. The 3000 files totaled 100 MB in size.
from indigo import *
from indigo.renderer import *
import subprocess
import datetime

def timerstart():
    # start timer and print time, return start time
    start = datetime.datetime.now()
    print("Start time =", start)
    return start

def timerstop(start):
    # end timer and print time and elapsed time, return elapsed time
    endtime = datetime.datetime.now()
    elapsed = endtime - start
    print("End time =", endtime)
    print("Elapsed time =", elapsed)
    return elapsed

numrecs = 1000
recoffset = 0 # 10000000 # record offset
starttime = timerstart()
indigo = Indigo()
renderer = IndigoRenderer(indigo)
# set render options
indigo.setOption("render-atom-color-property", "color")
indigo.setOption("render-coloring", True)
indigo.setOption("render-comment-position", "bottom")
indigo.setOption("render-comment-offset", "20")
indigo.setOption("render-background-color", 1.0, 1.0, 1.0)
indigo.setOption("render-output-format", "png")
# set data path (including data file) and output file path
datapath = r'../Download/CID-SMILES'
pngpath = r'./2D/'
# read subset of rows from data file
mycmd = "head -" + str(recoffset+numrecs) + " " + datapath + " | tail -" + str(numrecs)
(out, err) = subprocess.Popen(mycmd, stdout=subprocess.PIPE, shell=True).communicate()
lines = str(out.decode("utf-8")).split("\n")
count = 0
for line in lines:
    try:
        cols = line.split("\t") # split on tab
        key = cols[0] # cid in cols[0]
        smiles = cols[1] # smiles in cols[1]
        mol = indigo.loadMolecule(smiles)
        s = "CID=" + key
        indigo.setOption("render-comment", s)
        #indigo.setOption("render-image-size", 200, 250)
        #indigo.setOption("render-image-size", 400, 500)
        renderer.renderToFile(mol, pngpath + key + ".png")
        count += 1
    except Exception:
        # report the failing line (e.g. a trailing blank line) and keep going
        print("Error processing line after", str(count), ":", line)
elapsedtime = timerstop(starttime)
print("Converted", str(count), "SMILES to PNG")
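The head/tail pipeline in the script above relies on shell tools being available. As a possible alternative, the same slice of rows could be read directly in Python, including from the compressed CID-SMILES.gz download. This is a sketch only: read_cid_smiles is a hypothetical helper name, and it assumes each row is a CID and a SMILES string separated by a tab, as described above.

```python
import gzip
from itertools import islice


def read_cid_smiles(path, offset=0, count=1000):
    """Yield (cid, smiles) pairs for `count` rows starting at row `offset`."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8") as handle:
        for line in islice(handle, offset, offset + count):
            line = line.rstrip("\n")
            if not line:
                continue  # skip blank lines
            cid, smiles = line.split("\t", 1)
            yield cid, smiles
```

The rendering loop could then iterate with for key, smiles in read_cid_smiles(datapath, recoffset, numrecs). Rows before the offset still have to be read and skipped, so large offsets will take time here as well.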

How can I produce real-time audio output from music made with Music21?

How can I produce real-time audio output from music made with Music21? Failing that, how can I produce ANY audio output from music made with Music21 via open-source software? Thanks for the help.
As you've seen, music21 isn't designed to be a music playback system, but it IS designed to be embedded within other playback systems or to call them from within the system. We're not planning on putting too much work into playback systems (because of the hardware support, our being a tiny research lab, the work still needing to be done on musical analysis, etc.), but your solution is so elegant that it is now included in all versions of music21 (post v1.1) as the music21.midi.realtime module. Here's an example that takes music21's ability to dynamically allocate midi channels with different pitch-bend objects in order to simulate microtonal playback (a major problem for most midi playback):
# Set up a detuned piano
# (where each key has a random
# but consistent detuning from 30 cents flat to sharp)
# and play a Bach Chorale on it in real time.
from music21 import *
import random
keyDetune = []
for i in range(0, 127):
    keyDetune.append(random.randint(-30, 30))
b = corpus.parse('bach/bwv66.6')
for n in b.flat.notes:
    n.pitch.microtone = keyDetune[n.pitch.midi]
sp = midi.realtime.StreamPlayer(b)
sp.play()
The StreamPlayer's .play() function can also take busyFunction and busyArgs and busyWaitMilliseconds arguments which specify a function to call with arguments at most every busyWaitMilliseconds (could be more if your system is slower). There is also an endFunction and endArgs that will be called at the end, in case you want to set up some sort of threaded playback. -- Myke Cuthbert (Music21 creator)
So here's what I found out. Here's a python script that works on Windows XP. It needs pygame in addition to music21.
# Generates and Plays 2 Music21 Scores "on the fly".
# see way below for source notes
from music21 import *

# we create the music21 Bottom Part, and do this explicitly, one object at a time.
n1 = note.Note('e4')
n1.duration.type = 'whole'
n2 = note.Note('d4')
n2.duration.type = 'whole'
m1 = stream.Measure()
m2 = stream.Measure()
m1.append(n1)
m2.append(n2)
partLower = stream.Part()
partLower.append(m1)
partLower.append(m2)

# For the music21 Upper Part, we automate the note creation procedure
data1 = [('g4', 'quarter'), ('a4', 'quarter'), ('b4', 'quarter'), ('c#5', 'quarter')]
data2 = [('d5', 'whole')]
data = [data1, data2]
partUpper = stream.Part()

def makeUpperPart(data):
    for mData in data:
        m = stream.Measure()
        for pitchName, durType in mData:
            n = note.Note(pitchName)
            n.duration.type = durType
            m.append(n)
        partUpper.append(m)

makeUpperPart(data)

# Now, we can add both Part objects into a music21 Score object.
sCadence = stream.Score()
sCadence.insert(0, partUpper)
sCadence.insert(0, partLower)

# Now, let's play the MIDI of the sCadence Score [from memory, ie no file write necessary] using pygame
from io import BytesIO
# for music21 <= v.1.2:
if hasattr(sCadence, 'midiFile'):
    sCadence_mf = sCadence.midiFile
else: # for >= v.1.3:
    sCadence_mf = midi.translate.streamToMidiFile(sCadence)
sCadence_mStr = sCadence_mf.writestr()
sCadence_mStrFile = BytesIO(sCadence_mStr)

import pygame
freq = 44100 # audio CD quality
bitsize = -16 # signed 16 bit
channels = 2 # 1 is mono, 2 is stereo
buffer = 1024 # number of samples
pygame.mixer.init(freq, bitsize, channels, buffer)
# optional volume 0 to 1.0
pygame.mixer.music.set_volume(0.8)

def play_music(music_file):
    """
    stream music with mixer.music module in blocking manner
    this will stream the sound from disk while playing
    """
    clock = pygame.time.Clock()
    try:
        pygame.mixer.music.load(music_file)
        print("Music file %s loaded!" % music_file)
    except pygame.error:
        print("File %s not found! (%s)" % (music_file, pygame.get_error()))
        return
    pygame.mixer.music.play()
    while pygame.mixer.music.get_busy():
        # check if playback has finished
        clock.tick(30)

# play the midi file we just saved
play_music(sCadence_mStrFile)

# now let's make a new music21 Score by reversing the upperPart notes
data1.reverse()
data2 = [('d5', 'whole')]
data = [data1, data2]
partUpper = stream.Part()
makeUpperPart(data)
sCadence2 = stream.Score()
sCadence2.insert(0, partUpper)
sCadence2.insert(0, partLower)

# now let's play the new Score
if hasattr(sCadence2, 'midiFile'):
    sCadence2_mf = sCadence2.midiFile
else:
    sCadence2_mf = midi.translate.streamToMidiFile(sCadence2)
sCadence2_mStr = sCadence2_mf.writestr()
sCadence2_mStrFile = BytesIO(sCadence2_mStr)
play_music(sCadence2_mStrFile)

## There are 3 sources for this mashup:
# 1. Source for the Music21 Score Creation
# 2. Source for the Music21 MidiFile Class Behaviour
# 3. Source for the pygame player:

