Python, pydub splitting an audio file

Hi, I am using pydub to split an audio file by giving the ranges of the segments to take from the original.
What I have is:
from pydub import AudioSegment
sound_file = AudioSegment.from_mp3("C:\\audio file.mp3")
# milliseconds in the sound track
ranges = [(30000,40000),(50000,60000),(80000,90000),(100000,110000),(150000,180000)]
for x, y in ranges:
    new_file = sound_file[x:y]
    new_file.export("C:\\" + str(x) + "-" + str(y) + ".mp3", format="mp3")
This works well for the first 3 new files, but the remaining files are not cut at the requested positions.
Does the problem lie in the way I give the ranges?
Thank you.
Add-on:
Even in a simple case, for example
sound_file[150000:180000]
exported to an mp3 file, the export succeeds but contains the 50000:80000 part instead. It seems the range is not being read correctly.

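Try this, it might work. It slices the raw sample array, so the millisecond ranges are first converted to sample indices:
import pydub
import numpy as np
sound_file = pydub.AudioSegment.from_mp3("a.mp3")
# Samples of all channels, interleaved frame by frame
sound_file_Value = np.array(sound_file.get_array_of_samples())
# milliseconds in the sound track
ranges = [(30000,40000),(50000,60000),(80000,90000),(100000,110000),(150000,180000)]
for x, y in ranges:
    # Convert milliseconds to frame-aligned indices into the interleaved sample array
    start = int(x * sound_file.frame_rate / 1000) * sound_file.channels
    end = int(y * sound_file.frame_rate / 1000) * sound_file.channels
    new_file = sound_file_Value[start:end]
    song = pydub.AudioSegment(new_file.tobytes(), frame_rate=sound_file.frame_rate, sample_width=sound_file.sample_width, channels=sound_file.channels)
    song.export(str(x) + "-" + str(y) + ".mp3", format="mp3")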
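The array returned by get_array_of_samples() is indexed by sample, not by millisecond, and for stereo audio the channels are interleaved. A minimal sketch of the conversion, using a hypothetical helper name (ms_to_sample_index is not part of pydub) and plain integers so that it runs without pydub installed:

```python
def ms_to_sample_index(ms, frame_rate, channels):
    """Map a millisecond offset to an index into an interleaved sample array.

    Hypothetical helper for illustration; it is not part of pydub.
    """
    frame = ms * frame_rate // 1000   # whole frames before this offset
    return frame * channels           # each frame holds one sample per channel


# Example: the last range from the question, assuming 44.1 kHz stereo audio.
start = ms_to_sample_index(150000, frame_rate=44100, channels=2)
end = ms_to_sample_index(180000, frame_rate=44100, channels=2)
print(start, end)
```

Multiplying by the channel count after rounding down to a whole frame keeps every slice aligned on a frame boundary, so left and right samples do not get swapped.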

Related

Outputting lists into google sheets based on inputs from that sheet using python

I'm trying to automate a google sheet to take zip code inputs from one column and make a list of all zip codes within a 10 mile radius of that zip code and paste it back into the next column. I am currently running into two roadblocks.
In each row, it is only pasting the output from the first input. Assuming it has something to do with the list variables in the for loop.
There is a character per minute limit implemented into google sheets. I've read a little bit about using the time.sleep() function to rerun the program 2 minutes after it reaches the quota, or something of that nature. If anyone could help me with that implementation as well it would be greatly appreciated.
Here is my code so far:
import gspread
from pyzipcode import ZipCodeDatabase
zcdb = ZipCodeDatabase()
gc = gspread.service_account(filename='')
sh = gc.open('zipcodes')
worksheet = sh.worksheet("zipcodes")
values_list = worksheet.col_values(1)
for zips in values_list:
    rad = [z.zip for z in zcdb.get_zipcodes_around_radius(zips, 10)]
    zip1 = [zips]
    ziprad = zip1 + rad
    space = ", "
    str1 = space.join(ziprad)
    blist = range(1,43192)
    for j in blist:
        cell = 'B' + str(j)
        worksheet.update(cell, str1)
I really appreciate any help I can get here. I'm a comp sci dropout trying to remember some of the stuff I barely paid attention to.
Edit: This is where I've gotten to, and it seems as though my problem remains. Any help would be appreciated.
import gspread
import time
gc = gspread.service_account(filename='')
sh = gc.open('testzipsearcher')
worksheet = sh.worksheet("zip")
## list of each zip code in US
values_list = worksheet.col_values(1)
## create list of cells to be changed
b_list = list(range(1, 43192))
cellrange = ["B" + str(cells) for cells in b_list]
## create list of zip codes surrounding each zip code in US
radius = 10
def getZipCodes(values_list, radius):
    from pyzipcode import ZipCodeDatabase
    zcdb = ZipCodeDatabase()
    for zips in values_list:
        zipcodes = [z.zip for z in zcdb.get_zipcodes_around_radius(zips, radius)]
        space = ", "
        ziplist = space.join(zipcodes)
        return ziplist
ziplist = getZipCodes(values_list, radius)
## Execute
for i in cellrange:
    y = ziplist
    worksheet.update(i, y)
    time.sleep(1)
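The edited snippet above ends up with a single joined string that is then written to every cell in column B, which would match the symptom described. A minimal sketch of the per-row alternative follows. The names build_rows, lookup, and fake_db are hypothetical, and the lookup callable stands in for something like zcdb.get_zipcodes_around_radius so that the example runs without pyzipcode or gspread:

```python
def build_rows(zip_codes, lookup, radius=10):
    """Return one single-cell row per input zip code.

    `lookup` is a hypothetical callable standing in for a radius search;
    it must return a list of zip code strings.
    """
    rows = []
    for code in zip_codes:
        nearby = lookup(code, radius)
        rows.append([", ".join(nearby)])  # a fresh string for every row
    return rows


# Stand-in data so the sketch runs without pyzipcode or gspread.
fake_db = {"10001": ["10001", "10011"], "94105": ["94105", "94107", "94111"]}
rows = build_rows(["10001", "94105"], lambda code, radius: fake_db[code])
print(rows)

# With a real worksheet, all rows could then be sent in a single request, e.g.
# worksheet.update(range_name="B1:B2", values=rows)
# (assumption: keyword names as in recent gspread releases; check your version)
```

Writing the whole column in one request, rather than one cell per call, should also reduce how often the per-minute quota is hit.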

Concatenated audio clips come out broken

I am concatenating a couple of audio clips using moviepy, but roughly every other time there is a hiss or some other extra sound at the point where the files are joined. How can I fix this?
Code:
clips = []
for x in os.listdir(r"{}".format(cwd) + "/SpeechFolder"):
    clips.append(AudioFileClip(r"{}".format(cwd) + "/SpeechFolder/" + x))
speech = concatenate_audioclips(clips)

Extract Timestamps of an audio file when loud noises occur, python

I have an audio file in WAV format and would like to extract the timestamps at which the loudness is significantly high.
For example, in the speech commentary of a sports game, my goal is to identify the timestamps where the commentator shouts about a specific highlight in the ongoing game.
A Python solution is preferred.
Expected output:
start(seconds) end(seconds)
[0.81, 2.429] etc
from pydub.silence import detect_nonsilent
def target_amplitude(sound, target_dBFS):
    # Normalize the clip to the target average loudness
    diff_in_dBFS = target_dBFS - sound.dBFS
    return sound.apply_gain(diff_in_dBFS)
verified_sound = target_amplitude(vid_aud, -20.0)
nonsilent_data = detect_nonsilent(verified_sound, min_silence_len=500, silence_thresh=-20, seek_step=1)
time_list = []
for chunks in nonsilent_data:
    # Convert [start, end] from milliseconds to seconds
    chunk = [chunk / 1000 for chunk in chunks]
    time_list.append(chunk)

This is not actually very hard. The wave module can read a wave file, and numpy can tell you which array elements are outside of a range.
import wave
import numpy as np
w = wave.open('sound.wav')
sam = w.readframes(w.getnframes())
# Interpret the raw bytes as 16-bit samples (assumes a 16-bit wave file)
sam = np.frombuffer(sam, dtype=np.int16)
# Indices of samples beyond the loudness threshold in either direction
bigpos = np.where( sam > 20000 )
bigneg = np.where( sam < -20000 )
This assumes you have a wave file. If you have an MP3, you'll have to decode it first.
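Sample indices still have to be turned into the start and end times the question asks for. A minimal sketch of that step, assuming mono audio and using plain Python sequences so it runs without numpy (a NumPy array could be passed as sam.tolist()). The helper name loud_ranges and the max_gap parameter are illustrative choices, not part of any library:

```python
def loud_ranges(samples, sample_rate, threshold=20000, max_gap=0.5):
    """Return [start, end] times in seconds where |sample| exceeds threshold.

    Loud samples no more than max_gap seconds apart are merged into one range.
    """
    ranges = []
    for i, value in enumerate(samples):
        if abs(value) <= threshold:
            continue
        t = i / sample_rate
        if ranges and t - ranges[-1][1] <= max_gap:
            ranges[-1][1] = t          # extend the current loud range
        else:
            ranges.append([t, t])      # start a new loud range
    return ranges


# Synthetic mono signal at 10 samples per second: quiet, loud burst, quiet, one loud sample.
signal = [0] * 10 + [25000, -30000, 25000] + [0] * 20 + [-22000] + [0] * 5
print(loud_ranges(signal, sample_rate=10))
```

The threshold of 20000 mirrors the answer above; a suitable value depends on the recording and would need tuning.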

Split audio on timestamps librosa

I have an audio file and I want to split it every 2 seconds. Is there a way to do this with librosa?
So if I had a 60-second file, I would split it into 30 two-second files.

librosa is first and foremost a library for audio analysis, not audio synthesis or processing. The support for writing simple audio files is given (see here), but it is also stated there:
This function is deprecated in librosa 0.7.0. It will be removed in 0.8. Usage of write_wav should be replaced by soundfile.write.
Given this information, I'd rather use a tool like sox to split audio files.
From "Split mp3 file to TIME sec each using SoX":
You can run SoX like this:
sox file_in.mp3 file_out.mp3 trim 0 2 : newfile : restart
It will create a series of files with a 2-second chunk of the audio each.
If you'd rather stay within Python, you might want to use pysox for the job.

You can split your file using librosa by running the following code. I have added the necessary comments so that you can follow the steps carried out.
import librosa
# First load the file
audio, sr = librosa.load(file_name)
# Get number of samples for 2 seconds; replace 2 by any number
buffer = 2 * sr
samples_total = len(audio)
samples_wrote = 0
counter = 1
while samples_wrote < samples_total:
    # check if the buffer is not exceeding total samples
    if buffer > (samples_total - samples_wrote):
        buffer = samples_total - samples_wrote
    block = audio[samples_wrote : (samples_wrote + buffer)]
    out_filename = "split_" + str(counter) + "_" + file_name
    # Write 2 second segment
    librosa.output.write_wav(out_filename, block, sr)
    counter += 1
    samples_wrote += buffer
[Update]
librosa.output.write_wav() has been removed from librosa, so soundfile.write() has to be used instead.
Import the required library:
import soundfile as sf
and replace
librosa.output.write_wav(out_filename, block, sr)
with
sf.write(out_filename, block, sr)
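The boundary arithmetic in the loop above can be isolated into a small generator. This is a sketch with a hypothetical helper name (chunk_bounds); it only computes sample indices, so it runs without librosa or soundfile:

```python
def chunk_bounds(total_samples, sample_rate, seconds=2):
    """Yield (start, end) sample indices for consecutive chunks.

    The final chunk is shorter when the length is not an exact multiple.
    """
    step = int(seconds * sample_rate)
    for start in range(0, total_samples, step):
        yield start, min(start + step, total_samples)


# A 5-second signal at 4 samples per second, split every 2 seconds.
print(list(chunk_bounds(total_samples=20, sample_rate=4)))
```

Each pair can then be used to slice the loaded array, for example audio[start:end], before writing the block to its own file.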

Python - Overlay more than 3 WAV files end to end

I am trying to overlap the end of 1 wav file with 20% of the start of the next file. There are a variable number of files to overlap like this (usually around 5-6).
I have tried using pydub by expanding the following implementation for overlaying 2 wav files:
from pydub import AudioSegment
sound1 = AudioSegment.from_wav("/path/to/file1.wav")
sound2 = AudioSegment.from_wav("/path/to/file2.wav")
# mix sound2 with sound1, starting at 70% into sound1
output = sound1.overlay(sound2, position=0.7 * len(sound1))
# save the result
output.export("mixed_sounds.wav", format="wav")
And wrote the following program :
for i in range(0, len(files_to_combine) - 1):
    if 'full_wav' in locals():
        prev_wav = full_wav
    else:
        prev = files_to_combine[i]
        prev_wav = AudioSegment.from_wav(prev)
    next = files_to_combine[i+1]
    next_wav = AudioSegment.from_wav(next)
    new_wave = prev_wav.overlay(next_wav, position=len(prev_wav) - 0.3 * len(next_wav))
    new_wave.export('partial_wav.wav', format='wav')
    full_wav = AudioSegment.from_wav('partial_wav.wav')
However, when I look at the final wave file, only the first 2 files in the list files_to_combine were actually combined, not the rest. The idea was to keep rewriting partial_wav.wav until it finally contains the full, nearly end-to-end overlapped sound. To debug this, I stored new_wave in a different file for every combination. The first of these files is identical to the last: it contains only the first 2 wave files combined instead of the entire thing. Furthermore, I expected len(partial_wav) to increase gradually with every iteration. However, it remains the same after the first combination:
partial_wave : 237
partial_wave : 237
partial_wave : 237
partial_wave : 237
partial_wave : 237
MAIN QUESTION
How do I overlap the end of one wav file (about the last 30%) with the beginning of the next for more than 3 wave files?

I believe you can just keep cascading AudioSegments until you reach the final segment, as below.
Working Code:
from pydub import AudioSegment
from pydub.playback import play
sound1 = AudioSegment.from_wav("SineWave_440Hz.wav")
sound2 = AudioSegment.from_wav("SineWave_150Hz.wav")
sound3 = AudioSegment.from_wav("SineWave_660Hz.wav")
# mix sound2 with sound1, starting at 70% into sound1
tmpsound = sound1.overlay(sound2, position=0.7 * len(sound1))
# mix sound3 with sound1+sound2, starting at 30% into sound1+sound2
output = tmpsound.overlay(sound3, position=0.3 * len(tmpsound))
play(output)
output.export("mixed_sounds.wav", format="wav")
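Note that pydub's overlay() returns a segment of the same length as the segment it is called on, so anything that runs past the end of the base is cut off. That would explain a length that stops growing after the first combination. One way around it is to compute where each clip should start and mix everything onto a silent canvas of the full length. The following is a sketch under assumptions: overlap_layout and mix_end_to_end are hypothetical names, and the overlap is taken as 30% of each following clip, as in the question's code.

```python
def overlap_layout(lengths_ms, overlap=0.3):
    """Return (start positions, total length) in ms for end-to-end overlap.

    Each clip starts `overlap` * its own length before the previous clip ends.
    """
    positions, end = [], 0
    for length in lengths_ms:
        start = max(0, end - round(overlap * length))
        positions.append(start)
        end = max(end, start + length)
    return positions, end


def mix_end_to_end(segments, overlap=0.3):
    """Mix pydub AudioSegments onto a silent canvas long enough for all of them."""
    from pydub import AudioSegment  # imported here so overlap_layout works without pydub

    positions, total = overlap_layout([len(s) for s in segments], overlap)
    canvas = AudioSegment.silent(duration=total, frame_rate=segments[0].frame_rate)
    for segment, position in zip(segments, positions):
        canvas = canvas.overlay(segment, position=position)
    return canvas


# Three clips of 1000 ms, 1000 ms and 500 ms.
print(overlap_layout([1000, 1000, 500]))
```

With the files from the question, usage might look like mix_end_to_end([AudioSegment.from_wav(f) for f in files_to_combine]).export("full.wav", format="wav"), where the output file name is only an example.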
