Overlay wav files at different start times using Pydub

I have a series of wav files that I would like to combine and export as a single wav using Pydub. I want the audio from each original file to start at a different time in the exported file: for example, the audio in audio_1.wav starts at time=0 while the audio in audio_2.wav starts at time=5, instead of both starting at time=0 as the overlay function does by default. Is there any way to do this? Below is the code I currently have for importing, overlaying, and exporting the audio files.
from pydub import AudioSegment
audio_1 = AudioSegment.from_file("audio_1.wav", format="wav")
audio_2 = AudioSegment.from_file("audio_2.wav", format="wav")
# Mix audio_2 into audio_1; both start at time=0
overlay = audio_1.overlay(audio_2)
file_handle = overlay.export("output2.wav", format="wav")

I didn't test it, but based on the documentation you may need overlay(..., position=5000), where position is given in milliseconds.
BTW: you can also add silence at the beginning to shift the audio:
silence_5_seconds = AudioSegment.silent(duration=5000)
audio_2 = silence_5_seconds + audio_2

Related

Why isn't silence placed in the output audio file with AudioSegment.silent?

The output of the file is unchanged from the source.
I expect the following to mute the audio for a length of one second at two seconds into the audio file.
Python version: 3.7
from pydub import AudioSegment
audio_file = "input_audio.mp3"
# Load audio file into pydub
audio = AudioSegment.from_mp3(audio_file)
# place one second of silence two seconds in to mute that portion
audio = audio.overlay(AudioSegment.silent(duration=1000, frame_rate=audio.frame_rate), position=2000)
# Save audio with word muted to new file
audio.export("output audio.mp3", format="mp3")

[Python][Moviepy] How to add a short silence in the end of an audio?

I'd like to include a short silence duration at the end of an audio clip. I haven't found any specific functions in the Moviepy documentation, so I've resorted to creating a muted audio file of 500ms and concatenating it with the original audio file.
In some cases, this concatenation will introduce a noticeable glitch at the intersection, and I haven't figured out why. I also realized by importing the concatenated audiofile to Audacity that Moviepy actually creates two audio tracks when concatenating.
Do you know a better way to add silence to the end of the clip, or maybe the reason why this glitch appears sometimes (in my experience about 1 every 4 instances)?
Here's my code:
from moviepy.editor import AudioFileClip, concatenate_audioclips
temp_audio = "original audio dir"
silence = "silence audio dir"
output = "output audio dir"  # placeholder; not defined in the original snippet
audio1 = AudioFileClip(temp_audio)  # original audio file
audio2 = AudioFileClip(silence)  # silence audio file
final_audio = concatenate_audioclips([audio1, audio2])
final_audio.write_audiofile(output)
I am currently using Python 3.9.5 and Moviepy 1.0.3

Maybe setting fps=44100 will work; that is the MP3 file's sampling frequency.

You can use the below solution for adding the silence at the end or start of audio:
from pydub import AudioSegment
orig_seg = AudioSegment.from_file('audio.wav')
silence_seg = AudioSegment.silent(duration=1000) # 1000 for 1 sec, 2000 for 2 secs
# for adding silence at the end of audio
combined_audio = orig_seg + silence_seg
# for adding silence at the start of audio
#combined_audio = silence_seg + orig_seg
combined_audio.export('new_audio.wav', format='wav')

Split Audio File with python Librosa

After splitting an audio file with Librosa, how can I save the resulting fragments as mp3 files?

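Can you just open individual files in binary write mode, like
fragment1 = open("x.mp3", "wb")
fragment2 = open("y.mp3", "wb")
and then write to each of those using what you have as variables? Note that the bytes you write must already be MP3-encoded; raw sample arrays need to go through an encoder first.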

Python, speech_recognition tool does not recognize .wav file

I have generated a .wav audio file containing some speech with some other interference speech in the background.
This code worked for me for a test .wav file:
import speech_recognition as sr
r = sr.Recognizer()
with sr.WavFile(wav_path) as source:
    audio = r.record(source)
text = r.recognize_google(audio)
If I use my .wav file, I get the following error:
ValueError: Audio file could not be read as PCM WAV, AIFF/AIFF-C, or Native FLAC; check if file is corrupted or in another format
The situation slightly improves if I save this .wav file with soundfile:
import soundfile as sf
wav, samplerate = sf.read(wav_path)
sf.write(saved_wav_path, wav, samplerate)
and then load the new saved_wav_path back into the first block of code, this time I get:
if not isinstance(actual_result, dict) or len(actual_result.get("alternative", [])) == 0: raise UnknownValueError()
The audio files were saved as
wavfile.write(wav_path, fs, data)
where wav_path = 'data.wav'. Any ideas?
SOLUTION:
Saving the audio data the following way generates the correct .wav files:
import wavio
wavio.write(wav_path, data, fs, sampwidth=2)

From a brief look at the code in the speech_recognition package, it appears that it uses wave from the Python standard library to read WAV files. Python's wave library does not handle floating point WAV files, so you'll have to ensure that you use speech_recognition with files that were saved in an integer format.
SciPy's function scipy.io.wavfile.write will create an integer file if you pass it an array of integers. So if data is a floating point numpy array, you could try this:
import numpy as np
from scipy.io import wavfile
# Convert `data` to 32 bit integers (assumes `data` is not all zeros):
y = (np.iinfo(np.int32).max * (data/np.abs(data).max())).astype(np.int32)
wavfile.write(wav_path, fs, y)
Then try to read that file with speech_recognition.
Alternatively, you could use wavio (a small library that I created) to save your data to a WAV file. It also uses Python's wave library to create its output, so speech_recognition should be able to read the files that it creates.
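
To check in advance whether a file is in a format that Python's wave module accepts, one option is simply to try opening it. This is a minimal sketch using only the standard library; is_pcm_wav is a hypothetical helper name, and the claim that speech_recognition depends on wave rests on the reading of its code described above.

```python
import os
import tempfile
import wave


def is_pcm_wav(path):
    """Return True if the standard-library wave module can open `path`.

    Hypothetical helper: wave raises wave.Error for formats it does not
    support (such as IEEE float WAV) and EOFError for truncated files.
    """
    try:
        with wave.open(path, "rb"):
            return True
    except (wave.Error, EOFError):
        return False


# Demo: a 16-bit PCM file written by the wave module itself passes the check.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
with wave.open(demo_path, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)  # 2 bytes per sample = 16-bit integers
    out.setframerate(16000)
    out.writeframes(b"\x00\x00" * 16000)  # one second of silence

print(is_pcm_wav(demo_path))  # True
```

If the check returns False for your own file, rewriting it in an integer format as described above is the likely fix.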

I couldn't figure out from its documentation what sampwidth should be for wavio; however, adding the line sounddevice.default.dtype = 'int32', 'int32' allowed sounddevice, scipy.io.wavfile.write / soundfile, and speech_recognition to finally work together. The default dtype for sounddevice was float32 for both input and output. I tried changing only the output, but it didn't work. Weirdly, Audacity still thinks the output files are float32. I am not suggesting this is a better solution, but it did work with both soundfile and scipy.
I also noticed another oddity. When I left sounddevice.default.dtype at the default [float32, float32], opened the resulting file in Audacity, and exported it again, the exported wav worked with speech_recognition. Audacity says its export is float32 with the same sample rate, so I don't fully understand. I am a noob, but I compared both files in a hex editor: the first 64 hex values are the same and then they differ, so it seems the header is the same. Both look very different from the file I made using int32 output, so there seems to be another factor at play...

Similar to Warren's answer, I was able to resolve this issue by rewriting the WAV file using pydub:
from pydub import AudioSegment
filename = "payload.wav" # File that already exists.
sound = AudioSegment.from_mp3(filename)
sound.export(filename, format="wav")

Mixing audio files in MoviePy

I've been writing a script using MoviePy. So far I've been able to import videos, clip them, add text, replace the audio and write a new file. It's been a great learning experience. My question is this:
The movie that I'm editing has audio attached. I'd like to be able to import an audio track and add it to the movie without replacing the original audio. In other words, I'd like to mix the new audio file with the audio that's attached to the video so both can be heard.
Does anyone know how to do this?
Thanks in advance!

I wrote my own version, but then I found this here:
new_audioclip = CompositeAudioClip([videoclip.audio, audioclip])
videoclip.audio = new_audioclip
So, create a CompositeAudioClip with the audio of the video clip and the new audio clip, then set the old videoclip's audio to the composite audio track.
Full working code:
from moviepy.editor import *
videoclip = VideoFileClip("filename.mp4")
audioclip = AudioFileClip("audioname.mp3")
new_audioclip = CompositeAudioClip([videoclip.audio, audioclip])
videoclip.audio = new_audioclip
videoclip.write_videofile("new_filename.mp4")
If you want to change an individual audioclip's volume, refer to audio.fx.volumex.
