Librosa adds noise to wav data

Librosa adds noise to wav data - python

I was programming a little something reading a file and playing it back. I need it to use librosa, it it's impossible i might be able to fix it. the simpleaudio bit is easier to replace. This is my code:
from pathlib import Path
import librosa
import simpleaudio as sa
def play_audio(audio, sampling_rate):
print("PLAYING AUDIO")
wave_obj = sa.WaveObject(audio, sample_rate=sampling_rate)
play_obj = wave_obj.play()
play_obj.wait_done()
in_fpath = Path("trump.wav")
original_wav, sampling_rate = librosa.load(in_fpath)
play_audio(original_wav, int(sampling_rate))
And it loads the data and plays it back, but it's just that if I play trump.wav file in Windows on the music player, it sounds like it should. When I do it in Python this way, however, it becomes EXTREMELY noisy. You can still hear what he is saying, but barely. Where is the problem? Librosa or SimpleAudio?

I have a suspicion. librosa.load returns an array with float data but simpleaudio needs integer data. Please try to change the dtype of audio:
import numpy as np
# [...]
audio *= 32767 / np.max(np.abs(audio)) # re-scaling
audio = audio.astype(np.int16) # change data type
wave_obj = sa.WaveObject(audio, sample_rate=sampling_rate)
# [...]
Also see the documentation of simpleaudio:
https://simpleaudio.readthedocs.io/en/latest/tutorial.html#using-numpy

I figured out the problem, i just needed to change the WaveObject or librosa.load (don't remember which) with a byte_rate (something with byte_) to 4 instead of the default of 2.

Related

Playing audio in jupyter, in a for loop

I have a ton of training data I need annotated, in order to do so I need to listen through a bunch of sound snippets and note what I hear. I wrote a small script for this in a notebook.
My main issue is that IPython display dosent show in loops. As an example:
import numpy
import IPython.display as ipd
sr = 22050# sample rate
T = 2.0# seconds
t = numpy.linspace(0, T, int(T*sr), endpoint=False)# time variable
x = 0.5*numpy.sin(2*numpy.pi*440*t)
ipd.Audio(x, rate=sr)
will show up with an audio box, and I will be able to play the sine wave.
But trying to play anything in a for loop yields nothing (such as:)
for i in range(10000000):
x = 0.5*numpy.sin(i*numpy.pi*440*t)
ipd.Audio(x, rate=sr)
If anyone has a good solution for looping through (and listening) a bunch of audio files (one at a time, since I need to loop through potentially hundreds of thousands sound snippets), I would be very much appreciative!

To display the audio files within the for loop, you need to use IPython.display.display with the Audio object like so:
import numpy
import IPython.display as ipd
for i in range(10000000):
x = 0.5*numpy.sin(i*numpy.pi*440*t)
ipd.display(ipd.Audio(x, rate=sr))

my answer was deleted. but if you want a continuous loop you can use the method i described here https://stackoverflow.com/a/73425194/664456
which is a = Audio(...); a.autoplay_attr = lambda: 'autoplay="autoplay" loop="loop"'

why no sound when I use CompositeAudioClip in python moviepy?

here is the python script.
from moviepy.editor import *
videoclip = VideoFileClip("1_0_1522314608.m4v")
audioclip = AudioFileClip("jam_1_Mix.mp3")
new_audioclip = CompositeAudioClip([videoclip.audio, audioclip])
videoclip.audio = new_audioclip
videoclip.write_videofile("new_filename.mp4")
then returned
[MoviePy] >>>> Building video new_filename.mp4 [MoviePy] Writing audio
in new_filenameTEMP_MPY_wvf_snd.mp3
100%|████████████████████████████████████████| 795/795 [00:01<00:00,
466.23it/s] [MoviePy] Done. [MoviePy] Writing video new_filename.mp4 100%|███████████████████████████████████████| 1072/1072 [01:26<00:00,
10.31it/s] [MoviePy] Done. [MoviePy] >>>> Video ready: new_filename.mp4
1_0_1522314608.m4v and jam_1_Mix.mp3 they both have sound.
but the new file new_filename.mp4 no sound.
did I do something wrong? please help. thank you.

I had similar issues. I found that sometimes the audio is in fact present, but will not be played by all players. (Quicktime does not work, but VLC does).
I finally use something like :
video_clip.set_audio(composite_audio).write_videofile(
composite_file_path,
fps=None,
codec="libx264",
audio_codec="aac",
bitrate=None,
audio=True,
audio_fps=44100,
preset='medium',
audio_nbytes=4,
audio_bitrate=None,
audio_bufsize=2000,
# temp_audiofile="/tmp/temp.m4a",
# remove_temp=False,
# write_logfile=True,
rewrite_audio=True,
verbose=True,
threads=None,
ffmpeg_params=None,
progress_bar=True)
A few remarks :
aac seems to work better for quicktime than mp3.
logfile file generation helps to diagnose what is happening behind the scenes
temp_audiofile can be tested on its own
set_audio() returns a new clip and leave the object on which it is called unchanged
Setting the audio clip duration to match the video helps :
composite_audio = composite_audio.subclip(0, video_clip.duration)
composite_audio.set_duration(video_clip.duration)

How to play a sound onto a input stream

I was wondering if it was possible to play a sound directly into a input from python. I am using linux, and with that I am using OSS, ALSA, and Pulseaudio

You can definitely play (and generate) sound with python
Here is a example code that generates sinewave, opens default Alsa playback device and plays sinewave through that
#!/usr/bin/env python3
import math
import struct
import alsaaudio
from itertools import *
def sine_wave(frequency=440.0, framerate=44100, amplitude=0.5):
"""Stolen from here: http://zacharydenton.com/generate-audio-with-python/"""
period = int(framerate / frequency)
if amplitude > 1.0: amplitude = 1.0
if amplitude < 0.0: amplitude = 0.0
lookup_table = [float(amplitude) * math.sin(2.0*math.pi*float(frequency)*(float(i%period)/float(framerate))) for i in range(period)]
return (lookup_table[i%period] for i in count(0))
sound_out = alsaaudio.PCM() # open default sound output
sound_out.setchannels(1) # use only one channel of audio (aka mono)
sound_out.setrate(44100) # how many samples per second
sound_out.setformat(alsaaudio.PCM_FORMAT_FLOAT_LE) # sample format
for sample in sine_wave():
# alsa only eats binnary data
b = struct.pack("<f", sample) # convert python float to binary float
sound_out.write(b)
or you can loopback microphone to your speakers
#!/usr/bin/env python3
import struct
import alsaaudio
sound_out = alsaaudio.PCM() # open default sound output
sound_out.setchannels(1) # use only one channel of audio (aka mono)
sound_out.setperiodsize(5) # buffer size, default is 32
sound_in = alsaaudio.PCM(type=alsaaudio.PCM_CAPTURE) # default recording device
sound_in.setchannels(1) # use only one channel of audio (aka mono)
sound_in.setperiodsize(5) # buffer size, default is 32
while True:
sample_lenght, sample = sound_in.read()
sound_out.write(sample)
much more examples can be found in python alsaaudio library http://pyalsaaudio.sourceforge.net/libalsaaudio.html

I guess it depends on what you would like to do with it after you got it "into" python.
I would definitely look at the scikits.audiolab library. That's what you might use if you wanted to draw up spectrograms of what ever sound you are trying process (I'm guessing that's what you want to do?).

Real-time procedural sounds with Python?

I want to do procedural sounds in Python, and instantly play them back rather than save them to a file. What should I be using for this? Can I just use built-in modules, or will I need anything extra?
I will probably want to change pitch, volume, things like that.

Using numpy along with scikits.audiolab should do the trick. audiolab has a play function which supports the ALSA and Core Audio backends.
Here's an example of how you might produce a simple sine wave using numpy:
from __future__ import division
import numpy as np
def testsignal(hz,amplitude = .5,seconds=5.,sr=44100.):
'''
Create a sine wave at hz for n seconds
'''
# cycles per sample
cps = hz / sr
# total samples
ts = seconds * sr
return amplitude * np.sin(np.arange(0,ts*cps,cps) * (2*np.pi))
To create five seconds of a sine wave at 440 hz and listen to it, you'd do:
>>> from scikits.audiolab import play
>>> samples = testsignal(440)
>>> play(samples)
Note that play is a blocking call. Control won't be returned to your code until the sound has completed playing.

Check out this Python wiki page. Particularly the "Music programming in Python" section.

Pyglet could be a good option as it embeds functions for audio procedural:
Pyglet/media_procedural

Python: what are the nearest Linux and OS X [edit: macOS] equivalents of winsound.Beep?

If one wishes to beep the speaker on Windows, Python 2 apparently provides a useful function: winsound.Beep(). The neat thing about this function is that it takes arguments specifying the exact frequency and duration of the beep. This is exactly what I want to do, except that I don't use Windows. So...
What are the nearest equivalents of winsound.Beep() for Linux and OS X [edit: macOS], bringing in as few dependencies as possible?
Please note that I want to be able to beep the speaker directly, not to play a sound file. Also, I need to be able to control the frequency and duration of the beep, so curses.beep() and print '\a' won't do. Lastly, I am aware that PyGame provides extensive sound capabilities, but given that I don't require any of PyGame's other functionality, that would seem like using a sledgehammer to crack a nut (and anyway, I'm trying to do away with dependencies as far as possible).
[Edited on 9 Feb 2023 to reflect the fact that OS X was renamed macOS a few years after this question was asked]

winsound is only for windows and I could not find any cross platform way to do this, other than print "/a". However, you cannot set the frequency and duration with this.
However, you can try the os.system command to do the same with the system command beep. Here is a snippet, which defines the function playsound in a platform independent way
try:
import winsound
except ImportError:
import os
def playsound(frequency,duration):
#apt-get install beep
os.system('beep -f %s -l %s' % (frequency,duration))
else:
def playsound(frequency,duration):
winsound.Beep(frequency,duration)
For more info, look at this blog
EDIT: You will need to install the beep package on linux to run the beep command. You can install by giving the command
sudo apt-get install beep

I found a potential solution here:
http://bytes.com/topic/python/answers/25217-beeping-under-linux
It involves writing directly to /dev/audio. Not sure how portable it is or if it even works at all - i'm not on a linux machine atm.
def beep(frequency, amplitude, duration):
sample = 8000
half_period = int(sample/frequency/2)
beep = chr(amplitude)*half_period+chr(0)*half_period
beep *= int(duration*frequency)
audio = file('/dev/audio', 'wb')
audio.write(beep)
audio.close()

This works on mac:
import numpy as np
import simpleaudio as sa
def sound(x,z):
frequency = x # Our played note will be 440 Hz
fs = 44100 # 44100 samples per second
seconds = z # Note duration of 3 seconds
# Generate array with seconds*sample_rate steps, ranging between 0 and seconds
t = np.linspace(0, seconds, seconds * fs, False)
# Generate a 440 Hz sine wave
note = np.sin(frequency * t * 2 * np.pi)
# Ensure that highest value is in 16-bit range
audio = note * (2**15 - 1) / np.max(np.abs(note))
# Convert to 16-bit data
audio = audio.astype(np.int16)
# Start playback
play_obj = sa.play_buffer(audio, 1, 2, fs)
# Wait for playback to finish before exiting
play_obj.wait_done()
sound(300,2)
sound(200,1)

The most light-weight cross-platform layer I can see is "PortAudio". This is used by R for instance in their package to wrap platform-specific driver calls into simple play/record of digitized waveforms as an array.
The good folk at M.I.T. produce a Python binding for this, but you will need to include the compiled .dll/.so for this to work. http://people.csail.mit.edu/hubert/pyaudio/
( libao is similar by Xiph the makers of Ogg/Vorbis , a wrapper pyao exists but this seems less widely used )
SoX is an excellent set of cross-platform tools with much more functionality for format conversion and reading files etc..
Using ctypes to make calls from Python to a driver is feasible but very messy, even the simplest legacy WinMM.

I've found 3 methods for Linux:
new method using the Linux evdev API, works with any user in the input group (example source code)
old method using fcntl and /dev/console (requires root priviledges) (example source code)
invoke the beep command directly with subprocess or os.system (slower and must be installed in the system).
See also my tone() function here with all the alternatives.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Librosa adds noise to wav data - python

I figured out the problem, i just needed to change the WaveObject or librosa.load (don't remember which) with a byte_rate (something with byte_) to 4 instead of the default of 2.

Related

Playing audio in jupyter, in a for loop

why no sound when I use CompositeAudioClip in python moviepy?

How to play a sound onto a input stream

Real-time procedural sounds with Python?

Python: what are the nearest Linux and OS X [edit: macOS] equivalents of winsound.Beep?

Categories

Resources