Analyzing ambient room volume - python

I am looking for a function or package that simply returns an integer corresponding to the ambient volume in the room.
I thought that many people might already have wanted such a function; however, searching the internet did not yield a result.
Any help is much appreciated!
Cheers!

This code does what I want:
import pyaudio
import numpy as np

CHUNK = 2 ** 11
RATE = 44100
THRESHOLD = 500  # tune this to your environment

p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16, channels=1, rate=RATE, input=True,
                frames_per_buffer=CHUNK)

while True:  # go for a few seconds
    data = np.frombuffer(stream.read(CHUNK), dtype=np.int16)
    peak = np.mean(np.abs(data))  # mean absolute amplitude of this chunk
    if peak > THRESHOLD:
        pass  # do stuff
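
Wrapped up as a reusable function, a minimal sketch could look like this (the function name ambient_volume and the 0..32767 scale are my own choices, not from any library; for 16-bit mono input the mean absolute amplitude is already a convenient integer loudness measure):

import pyaudio
import numpy as np

CHUNK = 2 ** 11
RATE = 44100

def ambient_volume(stream, chunk=CHUNK):
    """Read one chunk from an open input stream and return its mean
    absolute amplitude as an integer (0..32767 for 16-bit audio)."""
    data = np.frombuffer(stream.read(chunk), dtype=np.int16)
    return int(np.mean(np.abs(data)))

p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16, channels=1, rate=RATE, input=True,
                frames_per_buffer=CHUNK)
print(ambient_volume(stream))  # low values in a quiet room, higher when noisy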

Related

CHUNKING an audio signal with python

When I chunk an audio file (a sine wave in this case), the sound changes!
I want to stream out an audio signal (a sine wave). First of all, I tried streaming the whole original signal:
import numpy as np
import pyaudio as py
from scipy.io.wavfile import read

fs, y = read('Sinus_440Hz.wav')

p = py.PyAudio()
stream = p.open(format=p.get_format_from_width(y.dtype.itemsize),
                channels=1,
                rate=fs,
                output=True,
                frames_per_buffer=1024)

stream.write(y)  # outputs the original sound 1:1 (works fine)
stream.stop_stream()
stream.close()
p.terminate()  # terminate() belongs to the PyAudio instance, not the module
This works fine, and I hear the original sine wave without any artifacts or modifications.
I need to treat the data in chunks and then stream it out, so I did it this way:
import numpy as np
import pyaudio as py
from scipy.io.wavfile import read

fs, y = read('Sinus_440Hz.wav')
totalSamps = len(y)
sample = 128
seg = 0

p = py.PyAudio()
stream = p.open(format=p.get_format_from_width(y.dtype.itemsize),
                channels=1,
                rate=fs,
                output=True,
                frames_per_buffer=1024)

while True:
    inds = 1 + np.mod((np.arange(sample) + sample * seg), totalSamps)  # chunks of 128
    Output = y[inds]
    stream.write(Output)  # the signal is not the same and has a lot of artifacts!!
    seg = seg + 1

stream.stop_stream()
stream.close()
p.terminate()
I didn't alter the signal yet, and the sound of the sine wave has already changed.
Why is the signal changed although I didn't modify anything yet? I'm just splitting it into chunks
and streaming it out.
Thanks in advance!
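
One thing worth checking (an observation on the code above, not a confirmed answer): the 1 + in the index computation looks like a leftover from 1-based MATLAB indexing. It skips sample 0, and once the modulo wraps around it produces the index totalSamps, which is out of range for y. A sketch of plain 0-based contiguous chunking, converting each chunk to bytes explicitly, could look like this:

import numpy as np
import pyaudio as py
from scipy.io.wavfile import read

fs, y = read('Sinus_440Hz.wav')
sample = 128

p = py.PyAudio()
stream = p.open(format=p.get_format_from_width(y.dtype.itemsize),
                channels=1,
                rate=fs,
                output=True,
                frames_per_buffer=1024)

# walk the signal in contiguous 0-based chunks: no samples skipped or repeated
for start in range(0, len(y) - sample + 1, sample):
    chunk = y[start:start + sample]
    stream.write(chunk.tobytes())

stream.stop_stream()
stream.close()
p.terminate()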

Realtime step detection from wav file

My goal is to take a real-time audio stream and find the steps in it, so I can signal my lights to flash to them.
Right now I have this code:
import pyaudio
import numpy as np
import matplotlib.pyplot as plt

CHUNK = 2**5
RATE = 44100
LEN = 3

p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16, channels=1, rate=RATE, input=True, frames_per_buffer=CHUNK)
print(1)

frames = []
n = 0
for i in range(int(LEN * RATE / CHUNK)):  # go for LEN seconds
    n += 1
    # np.fromstring is deprecated for raw bytes; frombuffer reads them the same way
    data = np.frombuffer(stream.read(CHUNK), dtype=np.int16)
    num = 0
    for ii in data:
        num += abs(ii)
    print(num)
    frames.append(data)

stream.stop_stream()
stream.close()
p.terminate()

plt.figure(1)
plt.title("Signal Wave...")
plt.plot(frames)
open("frames.txt", "w").write(str(frames))
It takes the live audio stream created by PyAudio in this format
[[0,0,-1,0,0,0,0,-1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-1,0,0,0,0,0,0,0,0,0],[1,0,-1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-1,0,0,0,0,0,0,0,0,0]]
(this is depicting silence)
and adds all of the numbers together after they have gone through the abs() function (absolute value).
This gives an accurate(ish) representation of what a graph of the signal looks like.
I see the numbers getting larger, and the big jumps should be easy to detect, but the smaller jumps are almost indistinguishable from silence.
I found this answer that seems right, but I don't know how to use it.
Any help would be appreciated
Thanks!
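
For reference, a common technique for this kind of problem (a sketch of one possible approach, not the answer linked above) is to keep a running average of recent chunk loudness and flag a step whenever the current chunk jumps well above it; the history length and factor below are arbitrary tuning knobs:

import pyaudio
import numpy as np
from collections import deque

CHUNK = 2**5
RATE = 44100
LEN = 3

p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16, channels=1, rate=RATE, input=True, frames_per_buffer=CHUNK)

history = deque(maxlen=50)  # loudness of the most recent chunks
factor = 2.0                # how far above the running average counts as a step

for _ in range(int(LEN * RATE / CHUNK)):
    data = np.frombuffer(stream.read(CHUNK), dtype=np.int16)
    loudness = np.abs(data).mean()
    if history and loudness > factor * (sum(history) / len(history)):
        print("step detected:", loudness)  # e.g. trigger the lights here
    history.append(loudness)

stream.stop_stream()
stream.close()
p.terminate()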

How can I increase the volume of a byte array from PyAudio in Python?

I am streaming audio from a mic to my speaker, but I want to increase the volume of the sound live, and I can't figure out a way; I've searched Google for a while.
Here is my code:
import pyaudio

Chunk = 1024
AudioFormat = pyaudio.paInt16
Channels = 2
Rate = 44100

PortAudio = pyaudio.PyAudio()
sourceDevice = PortAudio.open(format=AudioFormat,
                              channels=Channels,
                              rate=Rate,
                              input=True,
                              input_device_index=2,
                              frames_per_buffer=Chunk
                              )
destinationDevice = PortAudio.open(format=AudioFormat,
                                   channels=Channels,
                                   rate=Rate,
                                   output=True,
                                   output_device_index=4,
                                   frames_per_buffer=Chunk
                                   )

while True:
    try:
        data = sourceDevice.read(Chunk)
    except OSError:
        # silence: Chunk frames * Channels channels * 2 bytes per 16-bit sample
        data = b'\x00' * (Chunk * Channels * 2)
    except IOError as ex:
        if ex.args[1] != pyaudio.paInputOverflowed:  # ex[1] only worked on Python 2
            raise
        data = b'\x00' * (Chunk * Channels * 2)
    # Doing something to data here to increase its volume
    data = data  # function here??
    destinationDevice.write(data, Chunk, exception_on_underflow=True)
An example of what the data variable holds is
(shortened quite a lot; the original is massive):
b'\xec\x00G\x01\xa7\x01\xbe\x01\x95\x00\xf7\x00+\x00\x91\x00\xa1\x01W\x01\xec\x01\x94\x01n\x00\xac\x00I\x00\xa4\x00\xfb\x00"\x01g\x00\x8d\x00*\x00m\x00\xde\x00\x04\x01\xb2\x00\xc7\x005\x00-\x00(\x01\xb0\x00\xec\x01Q\x01.'
You can use numpy to convert the raw data into numpy arrays, then multiply the array by a volume ratio and write it to the output stream.
from math import sqrt
import numpy as np

# ...

# convert the linear volume factor to a logarithmic scale (see explanation below)
volumeFactor = 2
multiplier = pow(2, (sqrt(sqrt(sqrt(volumeFactor))) * 192 - 192) / 6)

while True:
    try:
        data = sourceDevice.read(Chunk)
    except OSError:
        data = b'\x00' * (Chunk * Channels * 2)
    except IOError as ex:
        if ex.args[1] != pyaudio.paInputOverflowed:
            raise
        data = b'\x00' * (Chunk * Channels * 2)
    # np.fromstring is deprecated; frombuffer returns a read-only view,
    # so copy it before multiplying in place
    numpy_data = np.frombuffer(data, dtype=np.int16).copy()
    # scale the samples by the multiplier computed above
    np.multiply(numpy_data, multiplier, out=numpy_data, casting="unsafe")
    destinationDevice.write(numpy_data.tobytes(), Chunk, exception_on_underflow=True)
The concept is that audio data is conceptually an array of samples, each one with a value that depends on the bit depth. Standard digital audio (such as CD audio) is 44100 Hz, 16-bit, stereo, which means each second has 88200 samples (since it's stereo), with each sample occupying 2 bytes. If you change the value of each of those samples equally, you will actually change the volume.
Now, the problem is that perceived volume is not linear but logarithmic. So, if you want to get twice the volume, you can't just double the sample values.
I'm using a conversion I found some years ago (from Ardour's sliders, if I recall correctly), which should be accurate enough.
Be careful, though: you can easily reach very high levels, which will result in distorted sound.
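
To guard against that distortion, one option (my own addition, not part of the answer above) is to clip the scaled samples back into the 16-bit range before writing them out; anything beyond the int16 limits would otherwise wrap around and sound badly broken:

import numpy as np

def apply_gain(data: bytes, multiplier: float) -> bytes:
    """Scale 16-bit PCM samples, clipping to the int16 range to avoid wrap-around."""
    samples = np.frombuffer(data, dtype=np.int16).astype(np.float32)
    samples *= multiplier
    np.clip(samples, -32768, 32767, out=samples)
    return samples.astype(np.int16).tobytes()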

PyAudio Print Sound Level From 1 To 100

I am looking to get the sound coming from my microphone printed out as a loudness level from 1 to 100 using PyAudio. Currently my code just prints the raw sound, which is just numbers and letters; how would I turn it into a scale from 1 to 100? Here is my code so far:
import pyaudio
import threading

CHUNK = 1024
FORMAT = pyaudio.paInt16
CHANNEL = 1
RATE = 44100

pa = pyaudio.PyAudio()
stream = pa.open(format=FORMAT, channels=CHANNEL,
                 rate=RATE, input=True,
                 frames_per_buffer=CHUNK)

def getdata():
    threading.Timer(1, getdata).start()  # schedule this function to run again in a second
    audio_data = stream.read(CHUNK)
    print(audio_data)

getdata()
I am quite a beginner, so please explain things thoroughly. Thanks!
EDIT: Here is a small sample of what is output:
0\xdd\x00\xdf\x00\xd6\x00\xd4\x00\xd8\x00\xc3\x00\xb6\x00\xc5\x00\xd0\x00\xc1\x00\xbb\x00\xbf\x00\xc5\x00\xc6\x00\xcf\x00\xb7\x00\xb1\x00\xcb\x00\xc2\x00\xc8\x00\xc5\x00\xc6\x00\xbe\x00\xaa\x00\xac\x00\xb1\x00\xa8\x00\xa7\x00\xb3\x00\xaa\x00\xa6\x00\xaa\x00\xa4\x00\x98\x00\x92\x00\xa0\x00\x9a\x00\x99\x00\x95\x00\x9f\x00\xb0\x00\x90\x00\x94\x00\x91\x00\x98\x00\xa2\x00\xa3\x00\xaa\x00\x94\x00\x98\x00\xa1\x00\x9d\x00\x96\x00\x90\x00\x91\x00\x89\x00\x85\x00{\x00\x83\x00\x84\x00\x8b\x00\x85\x00|\x00z\x00\x83\x00\x88\x00\x89\x00\x8a\x00\x8b\x00\x84\x00\x8f\x00\x83\x00o\x00p\x00p\x00\x88\x00\x8c\x00\x8b\x00\x8d\x00\x89\x00y\x00r\x00s\x00w\x00q\x00a\x00q\x00i\x00
SOLVED: Found an answer here:
Pyaudio : how to check volume
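
For reference, a minimal sketch of the usual approach (my own summary, not the linked answer): compute the RMS loudness of each chunk and rescale it from the 16-bit sample range to 1-100:

import numpy as np

def level_1_to_100(audio_data: bytes) -> int:
    """Map one chunk of 16-bit PCM audio to a loudness level from 1 to 100."""
    samples = np.frombuffer(audio_data, dtype=np.int16).astype(np.float64)
    rms = np.sqrt(np.mean(samples ** 2))  # root-mean-square amplitude, 0..32767
    return min(int(round(rms / 32767 * 99)) + 1, 100)

In practice ambient sound only uses the bottom of that range, so you may want to rescale or take a logarithm depending on how sensitive the display should be.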

How to handle in_data in Pyaudio callback mode?

I'm doing a project on signal processing in Python. So far I've had a little success with the non-blocking mode, but it gave a considerable amount of delay and clipping in the output.
I want to implement a simple real-time audio filter using PyAudio and scipy.signal, but in the callback function provided in the PyAudio example, when I want to read in_data I can't process it. I've tried converting it in various ways, but with no success.
Here is the code for what I want to achieve (read data from the mic, filter it, and output it as soon as possible):
import pyaudio
import time
import numpy as np
import scipy.signal as signal

WIDTH = 2
CHANNELS = 2
RATE = 44100

p = pyaudio.PyAudio()
b, a = signal.iirdesign(0.03, 0.07, 5, 40)
fulldata = np.array([])

def callback(in_data, frame_count, time_info, status):
    data = signal.lfilter(b, a, in_data)  # fails: in_data is raw bytes, not samples
    return (data, pyaudio.paContinue)

stream = p.open(format=pyaudio.paFloat32,
                channels=CHANNELS,
                rate=RATE,
                output=True,
                input=True,
                stream_callback=callback)

stream.start_stream()
while stream.is_active():
    time.sleep(5)

stream.stop_stream()
stream.close()
p.terminate()
What is the right way to do this?
I found the answer to my question in the meantime; the callback looks like this:
def callback(in_data, frame_count, time_info, flag):
    global b, a, fulldata  # global variables for the filter coefficients and the array
    # decode the raw bytes into float32 samples (np.fromstring is deprecated)
    audio_data = np.frombuffer(in_data, dtype=np.float32)
    # do whatever with the data; in my case I want to hear it filtered in real time
    filtered = signal.filtfilt(b, a, audio_data, padlen=200).astype(np.float32)
    fulldata = np.append(fulldata, filtered)  # saves the filtered data in an array
    return (filtered.tobytes(), pyaudio.paContinue)
I had a similar issue trying to work with the PyAudio callback mode, but my requirements were:
Working with stereo output (2 channels).
Processing in real time.
Processing the input signal using an arbitrary impulse response that could change in the middle of the process.
I succeeded after a few tries, and here are fragments of my code (based on the PyAudio example found here):
import time
import pyaudio
import scipy.signal as ss
import numpy as np
import librosa

track1_data, track1_rate = librosa.load('path/to/wav/track1', sr=44.1e3, dtype=np.float64)
track2_data, track2_rate = librosa.load('path/to/wav/track2', sr=44.1e3, dtype=np.float64)
track3_data, track3_rate = librosa.load('path/to/wav/track3', sr=44.1e3, dtype=np.float64)

# instantiate PyAudio (1)
p = pyaudio.PyAudio()

count = 0
IR_left = first_IR_left    # Replace with actual IR
IR_right = first_IR_right  # Replace with actual IR

# define callback (2)
def callback(in_data, frame_count, time_info, status):
    global count
    track1_frame = track1_data[frame_count*count : frame_count*(count+1)]
    track2_frame = track2_data[frame_count*count : frame_count*(count+1)]
    track3_frame = track3_data[frame_count*count : frame_count*(count+1)]
    track1_left = ss.fftconvolve(track1_frame, IR_left)
    track1_right = ss.fftconvolve(track1_frame, IR_right)
    track2_left = ss.fftconvolve(track2_frame, IR_left)
    track2_right = ss.fftconvolve(track2_frame, IR_right)
    track3_left = ss.fftconvolve(track3_frame, IR_left)
    track3_right = ss.fftconvolve(track3_frame, IR_right)
    track_left = 1/3 * track1_left + 1/3 * track2_left + 1/3 * track3_left
    track_right = 1/3 * track1_right + 1/3 * track2_right + 1/3 * track3_right
    # interleave the two channels: even indices go right, odd indices go left
    ret_data = np.empty((track_left.size + track_right.size), dtype=track1_left.dtype)
    ret_data[1::2] = track_left
    ret_data[0::2] = track_right
    ret_data = ret_data.astype(np.float32).tobytes()
    count += 1
    return (ret_data, pyaudio.paContinue)

# open stream using callback (3)
stream = p.open(format=pyaudio.paFloat32,
                channels=2,
                rate=int(track1_rate),
                output=True,
                stream_callback=callback,
                frames_per_buffer=2**16)

# start the stream (4)
stream.start_stream()

# wait for stream to finish (5), cycling the impulse response every 10 seconds
while_count = 0
while stream.is_active():
    while_count += 1
    if while_count % 3 == 0:
        IR_left = first_IR_left    # Replace with actual IR
        IR_right = first_IR_right  # Replace with actual IR
    elif while_count % 3 == 1:
        IR_left = second_IR_left    # Replace with actual IR
        IR_right = second_IR_right  # Replace with actual IR
    elif while_count % 3 == 2:
        IR_left = third_IR_left    # Replace with actual IR
        IR_right = third_IR_right  # Replace with actual IR
    time.sleep(10)

# stop stream (6)
stream.stop_stream()
stream.close()

# close PyAudio (7)
p.terminate()
Here are some important reflections about the code above:
Working with librosa instead of wave allows me to use numpy arrays for processing, which is much better than the chunks of data from wave.readframes.
The data type you set in p.open(format=...) must match the format of the ret_data bytes, and PyAudio works with float32 at most.
Even-index bytes in ret_data go to the right headphone, and odd-index bytes go to the left one.
Just to clarify, this code sends the mix of three tracks to the audio output in stereo, and every 10 seconds it changes the impulse response and thus the filter being applied.
I used this for testing a 3D audio app I'm developing, so the impulse responses were Head Related Impulse Responses (HRIRs), which changed the position of the sound every 10 seconds.
EDIT:
This code had a problem: the output had noise at a frequency corresponding to the size of the frames (a higher frequency when the frames were smaller). I fixed that by manually doing an overlap-add of the frames. Basically, the convolution returns an array of size track_frame.size + IR.size - 1, so I separated that array into its first track_frame.size elements (which were then used for ret_data) and saved the last IR.size - 1 elements for later. Those saved elements are then added to the first IR.size - 1 elements of the next frame; the first frame adds zeros.
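
A minimal sketch of that overlap-add bookkeeping (my reconstruction of the description above, not the author's actual code; tail carries the saved IR.size - 1 samples between calls):

import numpy as np
import scipy.signal as ss

def convolve_frame(frame, ir, tail):
    """Convolve one frame with an impulse response using overlap-add.

    Returns (exactly frame.size output samples, the tail to carry into the next call).
    """
    full = ss.fftconvolve(frame, ir)   # length: frame.size + ir.size - 1
    full[:tail.size] += tail           # add the overlap saved from the previous frame
    out = full[:frame.size]            # emit one frame's worth of samples
    new_tail = full[frame.size:]       # save the remaining ir.size - 1 samples
    return out, new_tail

tail = np.zeros(0)  # the first frame has no saved overlap, so it "adds zeros"
# inside the callback, per channel: track_left, tail = convolve_frame(track_frame, IR_left, tail)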
