MoviePy: multiple TextClips one after another - Python

I have an audio file and its script file, which looks like this:
One day I was playing Fortnite
solos until I got a strange fend
invites I never seen before I joned
it an he said time to play
a game I sruged it of like
no big deal I regreted doing that
but he kept saying time to play
over and over and over ugen
My goal is to make a video where the voice is followed by the matching text appearing on the screen (the next line appears, the previous one disappears). The way I do it is obviously wrong: it just renders all the lines stacked on top of each other at the beginning of the video. Also, I can't know in advance how many lines of script there will be, so writing texts[0], texts[1], ... by hand is not an option. Please send help!
My code:
videoclip = VideoFileClip("Satisfying Minecraft Parkour.mp4")
audioclip = AudioFileClip("audio.mp3")
new_audioclip = CompositeAudioClip([audioclip])
videoclip.audio = new_audioclip
texts = []
with open('text.txt', 'r') as f:
    for line in f:
        txt_clip = TextClip(line, fontsize=55, color='white')
        txt_clip = txt_clip.set_pos('center')
        txt_clip = txt_clip.set_duration(audio_in_seconds/len(str(text))*len(line))
        texts.append(txt_clip)
video = CompositeVideoClip([videoclip, texts[0]])
video = CompositeVideoClip([video, texts[1]])
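
A minimal sketch of one way to fix this, assuming MoviePy 1.x: give each TextClip a start time with set_start and a duration proportional to its share of the script, then composite everything in a single CompositeVideoClip. The per-character timing is a crude estimate, not taken from the question:

from moviepy.editor import VideoFileClip, AudioFileClip, TextClip, CompositeVideoClip

videoclip = VideoFileClip("Satisfying Minecraft Parkour.mp4")
audioclip = AudioFileClip("audio.mp3")
videoclip = videoclip.set_audio(audioclip)

with open("text.txt") as f:
    lines = [line.strip() for line in f if line.strip()]

total_chars = sum(len(line) for line in lines)
t = 0  # running start time of the current line
texts = []
for line in lines:
    # crude estimate: each line gets screen time proportional to its length
    duration = audioclip.duration * len(line) / total_chars
    txt = (TextClip(line, fontsize=55, color="white")
           .set_pos("center")
           .set_start(t)
           .set_duration(duration))
    texts.append(txt)
    t += duration

# one composite over all clips; non-overlapping start times make each line
# replace the previous one
video = CompositeVideoClip([videoclip, *texts])
video.write_videofile("out.mp4")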

Related

How to display text on the screen as it is said over the audio

As a personal project, I decided to create one of those Reddit text-to-speech bots.
I pulled all the data from Reddit with praw:
import praw, random

def scrapeData(subredditName):
    # Instantiate praw
    reddit = praw.Reddit()
    # Get subreddit
    subreddit = reddit.subreddit(subredditName)
    # Get a bunch of posts and convert them into a list
    posts = list(subreddit.new(limit=100))
    # Get a random index (fewer than 100 posts may be returned)
    randomNumber = random.randint(0, len(posts) - 1)
    # Store the post's title and description in variables
    postTitle = posts[randomNumber].title
    postDesc = posts[randomNumber].selftext
    return postTitle + " " + postDesc
Then I converted it to speech stored in an .mp3 file with Google Cloud Text-to-Speech:
from google.cloud import texttospeech

def convertTextToSpeech(textString):
    # Instantiate the TTS client (from_service_account_json is a classmethod)
    client = texttospeech.TextToSpeechClient.from_service_account_json("path/to/json")
    # Set text input to be synthesized
    synthesisInput = texttospeech.SynthesisInput(text=textString)
    # Build the voice request
    voice = texttospeech.VoiceSelectionParams(
        language_code="en-US",
        ssml_gender=texttospeech.SsmlVoiceGender.MALE)
    # Select the type of audio file
    audioConfig = texttospeech.AudioConfig(
        audio_encoding=texttospeech.AudioEncoding.MP3)
    # Perform the TTS request on the text input
    response = client.synthesize_speech(
        input=synthesisInput, voice=voice, audio_config=audioConfig)
    # Write the binary audio content to an mp3 file
    with open("output.mp3", "wb") as out:
        out.write(response.audio_content)
I've created an .mp4 with moviepy that has generic footage in the background with the audio synced over it,
from moviepy.editor import *
from moviepy.video.tools.subtitles import SubtitlesClip
# get video and audio source files
clip = VideoFileClip("background.mp4").subclip(20,30)
audio = AudioFileClip("output.mp3").subclip(0, 10)
# Set audio and create final video
videoClip = clip.set_audio(audio)
videoClip.write_videofile("output.mp4")
but my issue is I can't find a way to have only the current word or sentence displayed on screen as a subtitle, rather than the entire post.
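
One direction (not from the original post): moviepy's SubtitlesClip, already imported above, can overlay timed text if you have per-sentence timings. A minimal sketch, assuming a hypothetical subtitles.srt whose timings you would have to produce or estimate yourself:

from moviepy.editor import VideoFileClip, AudioFileClip, TextClip, CompositeVideoClip
from moviepy.video.tools.subtitles import SubtitlesClip

# each subtitle entry becomes a TextClip built by this generator
generator = lambda txt: TextClip(txt, fontsize=40, color="white")
subs = SubtitlesClip("subtitles.srt", generator)  # hypothetical .srt with sentence timings

clip = VideoFileClip("background.mp4").subclip(20, 30)
audio = AudioFileClip("output.mp3").subclip(0, 10)
video = CompositeVideoClip([clip, subs.set_pos(("center", "bottom"))])
video.set_audio(audio).write_videofile("output.mp4")

Only the text of the currently active subtitle is shown at any moment, which gives the sentence-by-sentence effect instead of the entire post.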

How to change volume of stem files while playing using python

I'm attempting to write a Python project that plays multiple parts of a song at the same time.
For background information, a song is split into "stems", and each stem is played simultaneously to recreate the full song. What I am trying to achieve is using potentiometers to control the volume of each stem, so that the user can mix songs differently. For a commercial analogue, Kanye West's Stem Player is what I am trying to achieve.
I can change the volume of the overlaid song at the end, but what I want to do is change the volume of each stem using a potentiometer while the song is playing. Is this even possible using pydub? Below is the code I have right now.
from pydub import AudioSegment
from pydub.playback import play

vocals = AudioSegment.from_file("walkin_vocals.mp3")
drums = AudioSegment.from_file("walkin_drums.mp3")
bass = AudioSegment.from_file("walkin_bass.mp3")

# mix the stems by overlaying them on top of each other
vocalsDrums = vocals.overlay(drums)
bassVocalsDrums = vocalsDrums.overlay(bass)

# reduce the gain of the whole mix by 20 dB
songQuiet = bassVocalsDrums - 20
play(songQuiet)
I solved this myself: I ended up using pyaudio instead of pydub.
With pyaudio, I was able to define a custom stream_callback function. Within this callback, I multiply each stem by a volume modifier, then sum the stems into one audio output.
def callback(in_data, frame_count, time_info, status):
    global drumsMod, vocalsMod, bassMod, otherMod
    # read the next block of frames from each stem's wave reader
    drums = drumsWF.readframes(frame_count)
    vocals = vocalsWF.readframes(frame_count)
    bass = bassWF.readframes(frame_count)
    other = otherWF.readframes(frame_count)
    # decode the raw bytes into 16-bit integer samples
    decodedDrums = numpy.frombuffer(drums, numpy.int16)
    decodedVocals = numpy.frombuffer(vocals, numpy.int16)
    decodedBass = numpy.frombuffer(bass, numpy.int16)
    decodedOther = numpy.frombuffer(other, numpy.int16)
    # scale each stem by its volume modifier and sum them into one signal
    newdata = (decodedDrums*drumsMod + decodedVocals*vocalsMod
               + decodedBass*bassMod + decodedOther*otherMod).astype(numpy.int16)
    return (newdata.tobytes(), pyaudio.paContinue)
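
For context, a sketch of how such a callback might be wired up; the four wave readers, the file names, and the initial modifier values are assumptions (and wave only reads .wav files, not .mp3):

import wave
import numpy
import pyaudio

# assumed stem files converted to .wav (the wave module cannot read .mp3)
drumsWF = wave.open("walkin_drums.wav", "rb")
vocalsWF = wave.open("walkin_vocals.wav", "rb")
bassWF = wave.open("walkin_bass.wav", "rb")
otherWF = wave.open("walkin_other.wav", "rb")

# volume modifiers; in the final project these would be driven by the potentiometers
drumsMod = vocalsMod = bassMod = otherMod = 1.0

p = pyaudio.PyAudio()
stream = p.open(format=p.get_format_from_width(drumsWF.getsampwidth()),
                channels=drumsWF.getnchannels(),
                rate=drumsWF.getframerate(),
                output=True,
                stream_callback=callback)
stream.start_stream()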

Print image and record audio input

I am very green at programming but wish to learn and develop.
I want to write a simple application that will be useful in linguistic treatments - but at first it is a simple demo.
The application is meant to display an image and record sound while it is shown.
There are a few variables - the interval and the image/sound/movie clip paths - taken from an external txt file (for now; later I would like to build a configuration creator with presaved setups).
The config file currently looks like:
10
path1
path2
...
The first line sets the interval in seconds; the following lines are paths to images, sounds, or movie clips (I tried with images for now).
#!/usr/bin/python
# main.py
import sys
from PyQt4 import QtGui, QtCore
from Tkinter import *
import numpy as np
import pyaudio
import wave
import time
from PIL import Image, ImageTk
import multiprocessing
import threading
from threading import Thread

master = Tk()

conf_file = open("conf.txt", "r")  # open conf file read-only
conf_lines = conf_file.readlines()
conf_file.close()
interwal = int(conf_lines[0])  # interval value from conf.txt (converted from string to int)
bodziec1 = conf_lines[1].strip()  # paths to stimulus files (img / audio / video)
bodziec2 = conf_lines[2].strip()
bodziec3 = conf_lines[3].strip()

CHUNK = 1024
FORMAT = pyaudio.paInt16
CHANNELS = 1
RATE = 44100
RECORD_SECONDS = interwal  # every stimulus gets its own audio record file for further work
timestr = time.strftime("%Y%m%d-%H%M%S")  # filename is year/month/day - hour/minute/second for easier systematization

def nagrywanie():  # recording action - found somewhere on the all-knowing web
    p = pyaudio.PyAudio()  # was misspelled "PyAudo" originally
    stream = p.open(format=FORMAT,
                    channels=CHANNELS,
                    rate=RATE,
                    input=True,
                    frames_per_buffer=CHUNK)
    print("* nagrywanie")  # info that recording has started
    frames = []
    for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
        data = stream.read(CHUNK)
        frames.append(data)
    print("* koniec nagrywania")  # info that recording has ended
    stream.stop_stream()
    stream.close()
    p.terminate()
    wf = wave.open(timestr + ".wav", 'wb')
    wf.setnchannels(CHANNELS)
    wf.setsampwidth(p.get_sample_size(FORMAT))
    wf.setframerate(RATE)
    wf.writeframes(b''.join(frames))
    wf.close()

def bod1():  # 1st stimulus to display / play
    image = Image.open(bodziec1)
    photo = ImageTk.PhotoImage(image)

def bod2():  # 2nd stimulus to display / play
    image = Image.open(bodziec2)
    photo = ImageTk.PhotoImage(image)

def bod3():  # 3rd stimulus to display / play
    image = Image.open(bodziec3)
    photo = ImageTk.PhotoImage(image)

def odpal():  # attempt to run display and recording at the same time
    Thread(target=bod1).start()
    Thread(target=nagrywanie).start()
    # Wait for the interval given in the first line of conf.txt
    time.sleep(interwal)
    # Terminate odpal - stop giving the impetus
    bod1.terminate()
    # Cleanup - ?? this part is also copied from the all-knowing internet
    p.join()

b = Button(master, text="OK", command=odpal)  # wanted the program to be easy for non-programmers, so a few buttons are necessary
b.pack()
mainloop()
When I asked a few programmers about the code, they said it is as simple as riding a bike, so I wanted to learn how to write it by myself.
I guess it is a piece of cake for professionals - a thousand thanks to those who are even willing to read this junk.
It takes a lot of time for me to understand and figure out the exact commands, which is why I am politely asking for help - not only for education but also for better diagnosis.
Excuse my language - English is not my native language.
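
Not from the original post, but here is a minimal sketch of the display-plus-record flow described above, assuming Python 2 with Tkinter, PIL, and pyaudio; the file names and the single-image flow are illustrative:

import threading, time, wave, pyaudio
import Tkinter as tk
from PIL import Image, ImageTk

# read the interval (seconds) and stimulus paths from the config file
with open("conf.txt") as f:
    lines = [l.strip() for l in f]
interval = int(lines[0])
image_paths = lines[1:]

def record(seconds, filename):
    # record mono 16-bit audio from the default input device
    p = pyaudio.PyAudio()
    stream = p.open(format=pyaudio.paInt16, channels=1, rate=44100,
                    input=True, frames_per_buffer=1024)
    frames = [stream.read(1024) for _ in range(int(44100 / 1024 * seconds))]
    stream.stop_stream(); stream.close(); p.terminate()
    wf = wave.open(filename, "wb")
    wf.setnchannels(1)
    wf.setsampwidth(pyaudio.get_sample_size(pyaudio.paInt16))
    wf.setframerate(44100)
    wf.writeframes(b"".join(frames))
    wf.close()

root = tk.Tk()
label = tk.Label(root)
label.pack()

def show_and_record():
    # display the first stimulus and record for `interval` seconds in parallel
    photo = ImageTk.PhotoImage(Image.open(image_paths[0]))
    label.configure(image=photo)
    label.image = photo  # keep a reference so Tk doesn't garbage-collect it
    name = time.strftime("%Y%m%d-%H%M%S") + ".wav"
    threading.Thread(target=record, args=(interval, name)).start()

tk.Button(root, text="OK", command=show_and_record).pack()
root.mainloop()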

could not convert string to float 9DOF Razor IMU

I am receiving a "could not convert string to float" error in this code. Could anyone take a look to see what is wrong, please? I am working with the 9DOF Razor IMU connected to the PC using an FTDI board, both from SparkFun. I am trying to see the x, y and z axes and the yaw, pitch and roll bars moving while I rotate my Razor, but I am receiving this error. It's the first project I'm working on, so everything I am doing is based on tutorials and blogs. English is not my mother language, so sorry about any English mistakes I've made. Thank you.
# This script needs VPhyton, pyserial and pywin modules
from visual import *
import serial
import string
import math
from time import time
grad2rad = 3.141592/180.0
# Check your COM port and baud rate
ser = serial.Serial(port='COM13',baudrate=57600, timeout=1)
# Main scene
scene=display(title="9DOF Razor IMU test")
scene.range=(1.2,1.2,1.2)
#scene.forward = (0,-1,-0.25)
scene.forward = (1,0,-0.25)
scene.up=(0,0,1)
# Second scene (Roll, Pitch, Yaw)
scene2 = display(title='9DOF Razor IMU test',x=0, y=0, width=500, height=200,center=(0,0,0), background=(0,0,0))
scene2.range=(1,1,1)
scene.width=500
scene.y=200
scene2.select()
#Roll, Pitch, Yaw
cil_roll = cylinder(pos=(-0.4,0,0),axis=(0.2,0,0),radius=0.01,color=color.red)
cil_roll2 = cylinder(pos=(-0.4,0,0),axis=(-0.2,0,0),radius=0.01,color=color.red)
cil_pitch = cylinder(pos=(0.1,0,0),axis=(0.2,0,0),radius=0.01,color=color.green)
cil_pitch2 = cylinder(pos=(0.1,0,0),axis=(-0.2,0,0),radius=0.01,color=color.green)
#cil_course = cylinder(pos=(0.6,0,0),axis=(0.2,0,0),radius=0.01,color=color.blue)
#cil_course2 = cylinder(pos=(0.6,0,0),axis=(-0.2,0,0),radius=0.01,color=color.blue)
arrow_course = arrow(pos=(0.6,0,0),color=color.cyan,axis=(-0.2,0,0), shaftwidth=0.02, fixedwidth=1)
#Roll,Pitch,Yaw labels
label(pos=(-0.4,0.3,0),text="Roll",box=0,opacity=0)
label(pos=(0.1,0.3,0),text="Pitch",box=0,opacity=0)
label(pos=(0.55,0.3,0),text="Yaw",box=0,opacity=0)
label(pos=(0.6,0.22,0),text="N",box=0,opacity=0,color=color.yellow)
label(pos=(0.6,-0.22,0),text="S",box=0,opacity=0,color=color.yellow)
label(pos=(0.38,0,0),text="W",box=0,opacity=0,color=color.yellow)
label(pos=(0.82,0,0),text="E",box=0,opacity=0,color=color.yellow)
label(pos=(0.75,0.15,0),height=7,text="NE",box=0,color=color.yellow)
label(pos=(0.45,0.15,0),height=7,text="NW",box=0,color=color.yellow)
label(pos=(0.75,-0.15,0),height=7,text="SE",box=0,color=color.yellow)
label(pos=(0.45,-0.15,0),height=7,text="SW",box=0,color=color.yellow)
L1 = label(pos=(-0.4,0.22,0),text="-",box=0,opacity=0)
L2 = label(pos=(0.1,0.22,0),text="-",box=0,opacity=0)
L3 = label(pos=(0.7,0.3,0),text="-",box=0,opacity=0)
# Main scene objects
scene.select()
# Reference axis (x,y,z)
arrow(color=color.green,axis=(1,0,0), shaftwidth=0.02, fixedwidth=1)
arrow(color=color.green,axis=(0,-1,0), shaftwidth=0.02 , fixedwidth=1)
arrow(color=color.green,axis=(0,0,-1), shaftwidth=0.02, fixedwidth=1)
# labels
label(pos=(0,0,0.8),text="9DOF Razor IMU test",box=0,opacity=0)
label(pos=(1,0,0),text="X",box=0,opacity=0)
label(pos=(0,-1,0),text="Y",box=0,opacity=0)
label(pos=(0,0,-1),text="Z",box=0,opacity=0)
# IMU object
platform = box(length=1, height=0.05, width=1, color=color.red)
p_line = box(length=1,height=0.08,width=0.1,color=color.yellow)
plat_arrow = arrow(color=color.green,axis=(1,0,0), shaftwidth=0.06, fixedwidth=1)
f = open("Serial"+str(time())+".txt", 'w')
roll=0
pitch=0
yaw=0
while 1:
    line = ser.readline()
    line = line.replace("!ANG:","")  # Delete "!ANG:"
    print line
    f.write(line)  # Write to the output log file
    words = string.split(line,",")  # Fields split
    if len(words) > 2:
        try:
            roll = float(words[0])*grad2rad
            pitch = float(words[1])*grad2rad
            yaw = float(words[2])*grad2rad
        except:
            print "Invalid line"
        axis=(cos(pitch)*cos(yaw),-cos(pitch)*sin(yaw),sin(pitch))
        up=(sin(roll)*sin(yaw)+cos(roll)*sin(pitch)*cos(yaw),sin(roll)*cos(yaw)-cos(roll)*sin(pitch)*sin(yaw),-cos(roll)*cos(pitch))
        platform.axis=axis
        platform.up=up
        platform.length=1.0
        platform.width=0.65
        plat_arrow.axis=axis
        plat_arrow.up=up
        plat_arrow.length=0.8
        p_line.axis=axis
        p_line.up=up
        cil_roll.axis=(0.2*cos(roll),0.2*sin(roll),0)
        cil_roll2.axis=(-0.2*cos(roll),-0.2*sin(roll),0)
        cil_pitch.axis=(0.2*cos(pitch),0.2*sin(pitch),0)
        cil_pitch2.axis=(-0.2*cos(pitch),-0.2*sin(pitch),0)
        arrow_course.axis=(0.2*sin(yaw),0.2*cos(yaw),0)
        L1.text = str(float(words[0]))
        L2.text = str(float(words[1]))
        L3.text = str(float(words[2]))
ser.close
f.close
You must be using some old tutorials if they are suggesting stuff like string.split(line, ",").
From your source, it seems the problem is here:
L1.text = str(float(words[0]))
L2.text = str(float(words[1]))
L3.text = str(float(words[2]))
You are trying to convert something to a float that Python can't parse as a number. From the looks of it, these are text labels, so why not try:
L1.text = words[0]
L2.text = words[1]
L3.text = words[2]
Some more tips:
words = line.split(',') # instead of: words = string.split(line,",")
ser.close() # close is a function, so you should call it.
f.close()
Make the following changes to the code. Replace the line
line = line.replace("!ANG:","")
with
line = line.replace("#YPR=","")
then replace the split line as suggested earlier, and that should do the trick!
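
Putting both answers together, the parsing part of the loop would look roughly like this (the #YPR= prefix comes from the second answer; this assumes the firmware emits lines such as #YPR=12.3,4.5,6.7):

line = ser.readline()
line = line.replace("#YPR=", "")  # strip the frame prefix the firmware sends
words = line.split(",")           # split the three comma-separated fields
if len(words) > 2:
    try:
        roll = float(words[0])*grad2rad
        pitch = float(words[1])*grad2rad
        yaw = float(words[2])*grad2rad
    except ValueError:
        print "Invalid line"
    # show the raw field text in the labels instead of re-converting to float
    L1.text = words[0]
    L2.text = words[1]
    L3.text = words[2]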

Use decodebin with adder

I'm trying to create an audio stream that has a constant audio source (in this case, audiotestsrc) to which I can occasionally add sounds from files (of various formats, which is why I'm using decodebin) through the play_file() method. I use an adder for that purpose. However, for some reason, I cannot add the second sound correctly. Not only does the program play the sound incorrectly, it also completely stops the original audiotestsrc. Here's my code so far:
import gst
import gobject
gobject.threads_init()

pipe = gst.Pipeline()
adder = gst.element_factory_make("adder", "adder")
first_sink = adder.get_request_pad('sink%d')
pipe.add(adder)

test = gst.element_factory_make("audiotestsrc", "test")
test.set_property('freq', 100)
pipe.add(test)
testsrc = test.get_pad("src")
testsrc.link(first_sink)

output = gst.element_factory_make("alsasink", "output")
pipe.add(output)
adder.link(output)

pipe.set_state(gst.STATE_PLAYING)
raw_input('Press key to play sound')

def play_file(filename):
    adder_sink = adder.get_request_pad('sink%d')
    audiofile = gst.element_factory_make('filesrc', 'audiofile')
    audiofile.set_property('location', filename)
    decoder = gst.element_factory_make('decodebin', 'decoder')
    def on_new_decoded_pad(element, pad, last):
        pad.link(adder_sink)
    decoder.connect('new-decoded-pad', on_new_decoded_pad)
    pipe.add(audiofile)
    pipe.add(decoder)
    audiofile.link(decoder)
    pipe.set_state(gst.STATE_PAUSED)
    pipe.set_state(gst.STATE_PLAYING)

play_file('sample.wav')
while True:
    pass
Thanks to moch on #gstreamer, I realized that all adder inputs must have the same format. I modified the above script so that the caps "audio/x-raw-int, endianness=(int)1234, channels=(int)1, width=(int)16, depth=(int)16, signed=(boolean)true, rate=(int)11025" (for example) are enforced before every input to the adder.
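
A sketch of one way to enforce that, using the old gst-python 0.10 API from the question; the audioconvert/audioresample/capsfilter chain and the helper name are illustrative, not from the original post:

caps = gst.Caps("audio/x-raw-int, endianness=(int)1234, channels=(int)1, "
                "width=(int)16, depth=(int)16, signed=(boolean)true, rate=(int)11025")

def link_through_caps(src_pad, adder_sink):
    # convert and resample each input to the common format before the adder
    convert = gst.element_factory_make("audioconvert")
    resample = gst.element_factory_make("audioresample")
    capsfilter = gst.element_factory_make("capsfilter")
    capsfilter.set_property("caps", caps)
    for element in (convert, resample, capsfilter):
        pipe.add(element)
        element.set_state(gst.STATE_PLAYING)
    src_pad.link(convert.get_pad("sink"))
    convert.link(resample)
    resample.link(capsfilter)
    capsfilter.get_pad("src").link(adder_sink)

In play_file(), the decodebin pad callback would then call link_through_caps(pad, adder_sink) instead of linking the decoded pad to the adder directly.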
