I'm recording live audio in 5 second clips with Python and want to cut out all sound below a certain frequency e.g. 10kHz. This is my script so far:
import pyaudio, wave, time, sys, os
from array import array
from scipy import signal
FORMAT=pyaudio.paInt16
CHANNELS=1
CHUNK=1024
RATE=44100
RECORD_SECONDS=5
def butter_highpass(cutoff, fs, order=5):
    nyq = 0.5 * fs
    normal_cutoff = cutoff / nyq
    b, a = signal.butter(order, normal_cutoff, btype='high', analog=False)
    return b, a
def butter_highpass_filter(data, cutoff, fs, order=5):
    b, a = butter_highpass(cutoff, fs, order=order)
    y = signal.filtfilt(b, a, data)
    return y
audio=pyaudio.PyAudio()
stream=audio.open(format=FORMAT,channels=CHANNELS,
                  rate=RATE,
                  input=True,
                  frames_per_buffer=CHUNK)
while True:
    # read data
    data=stream.read(CHUNK)
    data_chunk=array('h',data)
    data_chunk = butter_highpass_filter(data_chunk,10000,RATE)
    frames=[]
    for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
        data = stream.read(CHUNK)
        frames.append(data)
    # write to file
    words = ["RECORDING-", time.strftime("%Y%m%d-%H%M%S"), ".wav"]
    FILE_NAME= "".join(words)
    wavfile=wave.open(FILE_NAME,'wb')
    wavfile.setnchannels(CHANNELS)
    wavfile.setsampwidth(audio.get_sample_size(FORMAT))
    wavfile.setframerate(RATE)
    wavfile.writeframes(b''.join(frames))
    wavfile.close()
But this doesn't seem to work. I want to cut out all (or as much as possible of the) sound below the specified frequency. Why doesn't the filter I'm using seem to cut out the sound below 10kHz? How can I make it work?
Brief
The goal is to apply a brick-wall 10 kHz high-pass filter to audio, then save it. Audio is recorded continuously and saved in 5 second snippets to separate .wav files.
What we have so far
At the moment the current script:
declares a function that applies a Butterworth high-pass filter (butter_highpass_filter), whose output is an array of floating-point values
butter_highpass_filter uses signal.filtfilt
the input to the function is in signed short (16-bit integer) format (bug 1)
data_chunk=array('h',data)
data_chunk = butter_highpass_filter(data_chunk,10000,RATE)
data_chunk is never used, so the high-passed chunk of audio is never saved to file (bug 2)
data is read for 5 seconds' worth of audio
frames=[]
for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
data = stream.read(CHUNK)
frames.append(data)
stream.read is blocking, so this will wait until the correct amount of audio has been read.
collected data is then written to a wav file in the same format
words = ["RECORDING-", time.strftime("%Y%m%d-%H%M%S"), ".wav"]
FILE_NAME= "".join(words)
wavfile=wave.open(FILE_NAME,'wb')
wavfile.setnchannels(CHANNELS)
wavfile.setsampwidth(audio.get_sample_size(FORMAT))
wavfile.setframerate(RATE)
wavfile.writeframes(b''.join(frames))
wavfile.close()
Solution
The solution is multifaceted and requires several parts that are missing from the current script.
Also, to avoid further complications, a slightly different approach is taken from the one originally intended: rather than applying a filter in real time, the filter is applied to the recorded sample data before it is saved.
This:
removes the need to deal with filter state continuity
limits the need to cast back and forth between data types
Also, the outer infinite while loop has been removed, because timing and memory start to become an issue. The program can simply be re-run until the use case is clearer, in particular:
Why must the audio be high-pass filtered?
Why can't the filtering take place after all data is recorded (i.e. applied to the wav files)?
Why does the data have to be saved?
Are there limitations on data format, bit depth or sampling rate?
Until those are answered, there are too many possible routes, each with its own limitations, to cover realistically in a single answer.
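For reference, the "filter state continuity" mentioned above can be handled by carrying the internal filter state from one chunk to the next. The following is only a minimal sketch, not part of the script below: it assumes a causal, single-pass signal.lfilter (not the zero-phase filtfilt used elsewhere in this answer), and filter_chunks is a hypothetical helper name.

```python
import numpy as np
from scipy import signal

RATE = 44100
CHUNK = 1024

# Design the filter once (same design as butter_highpass in the question).
b, a = signal.butter(5, 10000.0 / (0.5 * RATE), btype='high')

def filter_chunks(chunks):
    """Filter an iterable of int16 byte-string chunks, carrying filter state between them.

    Hypothetical helper: `chunks` could be the values returned by stream.read(CHUNK).
    """
    zi = np.zeros(max(len(a), len(b)) - 1)  # filter starts "at rest"
    for raw in chunks:
        x = np.frombuffer(raw, dtype=np.int16).astype(np.float64)
        # Passing zi in and taking it back out keeps the output continuous
        # across chunk boundaries.
        y, zi = signal.lfilter(b, a, x, zi=zi)
        yield np.clip(y, -32768, 32767).astype(np.int16).tobytes()
```

Because the state is passed along, filtering chunk by chunk should give the same samples as filtering the whole recording in one call, at the cost of the phase shift that filtfilt avoids.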
Breakdown
The full process is:
Declare a function that takes floating point sample data as input and high pass filters it, with floating point data as output
concatenate 5 seconds of byte-string data from pyaudio into a single variable
unpack data as a 16-bit (signed short) format array of samples
scale samples to floating point format between 1.0 and -1.0
send data to high pass filter
scale filter samples in the range of 16-bit (signed short) format
pack 16-bit filtered samples into a byte string
write filtered byte-string data to a wav file.
Script
import pyaudio, wave, time, sys, os, struct
from array import array
from scipy import signal
# 1. Declare a function that takes floating point sample data as input and high pass
# filters with floating point data as output
# 2. concatenate 5 seconds of byte-string data from PyAudio into a single variable
# 3. unpack data as a 16-bit (signed short) format array of samples
# 4. scale samples to floating point format between 1.0 and -1.0
# 5. send data to high pass filter
# 6. scale filter samples in the range of 16-bit (signed short) format
# 7. pack 16-bit filtered samples into a byte string
# 8. write filtered byte-string data to a wav file.
# ---------------------------------------------------------------------
# 1. Declare High Pass Filter
def butter_highpass(cutoff: float, fs: float, order: int = 5) -> tuple:
    """
    Design a Butterworth high-pass filter
    :param cutoff: cutoff frequency (Hz)
    :param fs: sampling rate (Hz)
    :param order: filter order
    :return: tuple (b, a) of numerator and denominator coefficients of the IIR filter
    """
    nyq = 0.5 * fs
    normal_cutoff = cutoff / nyq
    b, a = signal.butter(order, normal_cutoff, btype='high', analog=False)
    return b, a
def butter_highpass_filter(data: [float], cutoff: float, fs: float, order: int = 5) -> [float]:
    """
    Apply a Butterworth high-pass filter to sample data in floating point format
    :param data: float sample data array
    :param cutoff: filter cutoff (Hz)
    :param fs: sampling rate of the sample data (Hz)
    :param order: filter order
    :return: floating point array of filtered sample data
    """
    b, a = butter_highpass(cutoff, fs, order=order)
    # filtfilt runs the filter forwards and backwards, giving zero phase shift
    y = signal.filtfilt(b, a, data)
    return y
# ---------------------------------------------------------------------
# Init Global Variables
FORMAT = pyaudio.paInt16
CHANNELS = 1
CHUNK = 1024
RATE = 44100
RECORD_SECONDS = 5
audio = pyaudio.PyAudio()
stream = audio.open(format=FORMAT,
                    channels=CHANNELS,
                    rate=RATE,
                    input=True,
                    frames_per_buffer=CHUNK)
# ---------------------------------------------------------------------
# Main Program
if __name__ == '__main__':
    # ---------------------------------------------------------------------
    # 2. concat 5 seconds of data into a single string
    frames = b''
    for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
        frames += stream.read(CHUNK)
    # ---------------------------------------------------------------------
    # 3. Unpack data as a 16-bit (signed short) array of samples
    sample_data = array('h', frames)
    # ---------------------------------------------------------------------
    # 4. scale samples to floating point format between 1.0 and -1.0
    samples = [sample_datum / (2**15) for sample_datum in sample_data]
    print("Max Amplitude:", max(samples))
    # ---------------------------------------------------------------------
    # 5. send data to high pass filter
    filtered_sample_data = butter_highpass_filter(samples, 10000.0, RATE)
    # filtered_sample_data = samples
    # ---------------------------------------------------------------------
    # 6. scale filter samples in the range of 16-bit (signed short) format
    # (2 ** 14) for headroom (very lazy)
    sample_data_16_bit = [int(sample * (2 ** 14)) for sample in filtered_sample_data]
    # ---------------------------------------------------------------------
    # 7. pack 16-bit filtered samples into a byte string
    raw_data = [struct.pack('h', sample) for sample in sample_data_16_bit]
    # ---------------------------------------------------------------------
    # 8. Write Wav
    file_name = "".join(["RECORDING-", time.strftime("%Y%m%d-%H%M%S"), ".wav"])
    wavfile = wave.open(file_name, 'wb')
    wavfile.setnchannels(CHANNELS)
    wavfile.setsampwidth(audio.get_sample_size(FORMAT))
    wavfile.setframerate(RATE)
    wavfile.writeframes(b''.join(raw_data))
    wavfile.close()
Comments
Potential additions include plotting the spectrum with matplotlib, normalising the filtered audio, and encapsulating the process in custom functions; these have been left as an exercise for the OP.
Below is included a screenshot of the audio spectrum after normalisation. As @marco noted in the comments, the filter does have a slope, which is expected. The slope can be made steeper by increasing the order of the filter.
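To illustrate the effect of the filter order, here is a hedged sketch that compares the attenuation at 5 kHz for two orders. It assumes the second-order-sections form (output='sos' with signal.sosfiltfilt), which tends to be numerically safer than (b, a) coefficients at higher orders; highpass_sos and gain_db are hypothetical helper names, not part of the script above.

```python
import numpy as np
from scipy import signal

RATE = 44100       # sampling rate used in the script above
CUTOFF = 10000.0   # high-pass cutoff in Hz

def highpass_sos(data, cutoff, fs, order=10):
    """Zero-phase Butterworth high-pass using second-order sections."""
    sos = signal.butter(order, cutoff / (0.5 * fs), btype='high', output='sos')
    return signal.sosfiltfilt(sos, data)

def gain_db(order, freq, cutoff=CUTOFF, fs=RATE):
    """Single-pass filter gain in dB at one frequency (in Hz)."""
    sos = signal.butter(order, cutoff / (0.5 * fs), btype='high', output='sos')
    w = 2 * np.pi * freq / fs                      # Hz -> radians per sample
    _, h = signal.sosfreqz(sos, worN=np.array([w]))
    return 20 * np.log10(abs(h[0]))

# Attenuation one octave below the cutoff for two filter orders.
print("order 5  @ 5 kHz: %.1f dB" % gain_db(5, 5000.0))
print("order 10 @ 5 kHz: %.1f dB" % gain_db(10, 5000.0))
```

A higher order gives a steeper slope below the cutoff, but no finite order produces a true brick wall.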
Related
I am trying to write an audio file using python's wave and numpy. So far I have the following and it works well:
import wave
import numpy as np
# set up WAV file parameters
num_channels = 1 # mono audio
sample_width = 1 # 8 bits(1 byte)/sample
sample_rate = 44.1e3 # 44.1k samples/second
frequency = 440 # 440 Hz
duration = 20 # play for this many seconds
num_samples = int(sample_rate * duration) # samples/seconds * seconds
# open WAV file and write data
with wave.open('sine8bit_2.wav', 'w') as wavfile:
    wavfile.setnchannels(num_channels)
    wavfile.setsampwidth(sample_width)
    wavfile.setframerate(sample_rate)
    t = np.linspace(0, duration, num_samples)
    data = (127*np.sin(2*np.pi*frequency*t)).astype(np.int8)
    wavfile.writeframes(data) # or data.tobytes() ??
My issue is that since I am using a high sampling rate, the num_samples variable might quickly become too large (9261000 samples for a 3 minute 30 seconds track say). Would using a numpy array this large be advisable? Is there a better way of going about this? Also is use of writeframes(.tobytes()) needed in this case because my code runs fine without it and it seems like extra overhead (especially if the arrays get too large).
Assuming you are only going to write a sine wave, you could very well create only one period as your data array and write that several times to the .wav file.
Using the parameters you provided, your data array is 8800 times smaller with that approach. Its size also no longer depends on the duration of your file!
import wave
import numpy as np
# set up WAV file parameters
num_channels = 1 # mono audio
sample_width = 1 # 8 bits(1 byte)/sample
sample_rate = 44.1e3 # 44.1k samples/second
frequency = 440 # 440 Hz
duration = 20 # play for this many seconds
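# Create a single period of sine wave.
# n is rounded to a whole number of samples, so the pitch is only approximately `frequency`.
n = round(sample_rate/frequency)
# endpoint=False so the last sample of one period is not repeated as the first sample of the next
t = np.linspace(0, 1/frequency, n, endpoint=False)
data = (127*np.sin(2*np.pi*frequency*t)).astype(np.int8)
periods = round(frequency*duration)
# open WAV file and write data
with wave.open('sine8bit_2.wav', 'w') as wavfile:
    wavfile.setnchannels(num_channels)
    wavfile.setsampwidth(sample_width)
    wavfile.setframerate(sample_rate)
    # write the same period over and over
    for _ in range(periods):
        wavfile.writeframes(data)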
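If the signal is not a pure sine, a related approach is to generate and write it in fixed-size chunks so that memory use stays bounded. The following is a minimal sketch under stated assumptions: write_sine_in_chunks and chunk_frames are hypothetical names, and the samples are shifted to unsigned 8-bit values because 8-bit PCM WAV data is stored unsigned (unlike the int8 data used above).

```python
import wave
import numpy as np

def write_sine_in_chunks(path, frequency=440.0, duration=20.0,
                         sample_rate=44100, chunk_frames=65536):
    """Write an 8-bit mono sine to `path` without holding the whole signal in memory."""
    total = int(sample_rate * duration)
    with wave.open(path, 'wb') as wavfile:
        wavfile.setnchannels(1)
        wavfile.setsampwidth(1)
        wavfile.setframerate(sample_rate)
        for start in range(0, total, chunk_frames):
            # Absolute sample indices keep the phase continuous across chunks.
            n = np.arange(start, min(start + chunk_frames, total))
            samples = 127 * np.sin(2 * np.pi * frequency * n / sample_rate)
            # Shift from -127..127 to 1..255 (unsigned 8-bit PCM).
            wavfile.writeframes((samples + 128).astype(np.uint8).tobytes())
    return total
```

Only chunk_frames samples exist at any time, so the array size no longer depends on the duration of the file.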
In the code below, I first open a wav file called output_test.wav. I then filter the noise from the signal using fftpack.
Problem: I'm trying to write the filtered signal, i.e. the filtered_sig array, to a wav file properly. Currently, when I open TestFiltered.wav, I get the error:
The item was encoded into a format not supported: 0xc00d5212
Upon further investigation it seems I'm not filtering noise correctly?
I think the error comes from the last 2 lines:
filteredwrite = np.fft.irfft(filtered_sig, axis=0)
wavfile.write('TestFiltered.wav', frame_rate, filteredwrite)
CODE:
import numpy as np
from scipy import fftpack
import pyaudio
import wave
from scipy.io import wavfile
def playback():
    CHUNK = 1024
    FORMAT = pyaudio.paInt16
    CHANNELS = 2
    RATE = 44100
    RECORD_SECONDS = 8
    WAVE_OUTPUT_FILENAME = "output.wav"
    filename = 'output_test.wav'
    # Set chunk size of 1024 samples per data frame
    chunk = 1024
    # Open the sound file
    wf = wave.open(filename, 'rb')
    frame_rate = wf.getframerate()
    wf_x = wf.readframes(-1)
    signal = np.frombuffer(wf_x, dtype='int16')
    #print("signalxx", signal)
    return [signal, frame_rate]
time_step = 0.5
# get the data
data = playback()
sig = data[0]
frame_rate = data[1]
# Return discrete Fourier transform of real or complex sequence
sig_fft = fftpack.fft(sig) # tranform the sin function
# Get Amplitude ?
Amplitude = np.abs(sig_fft) # np.abs() - calculate absolute value from a complex number a + ib
Power = Amplitude**2 # create a power spectrum by power of 2 of amplitude
# Get the (angle) base spectrum of these transform values i.e. sig_fft
Angle = np.angle(sig_fft) # Return the angle of the complex argument
# For each Amplitude and Power (of each element in the array?) - there is will be a corresponding difference in xxx
# This is will return the sampling frequecy or corresponding frequency of each of the (magnitude) i.e. Power
sample_freq = fftpack.fftfreq(sig.size, d=time_step)
print(Amplitude)
print(sample_freq)
# Because we would like to remove the noise we are concerned with peak freqence that contains the peak amplitude
Amp_Freq = np.array([Amplitude, sample_freq])
# Now we try to find the peak amplitude - so we try to extract
Amp_position = Amp_Freq[0,:].argmax()
peak_freq = Amp_Freq[1, Amp_position] # find the positions of max value position (Amplitude)
# print the position of max Amplitude
print("--", Amp_position)
# print the frequecies of those max amplitude
print(peak_freq)
high_freq_fft = sig_fft.copy()
# assign all the value the corresponding frequecies larger than the peak frequence - assign em 0 - cancel!! in the array (elements) (?)
high_freq_fft[np.abs(sample_freq) > peak_freq] = 0
print("yes:", high_freq_fft)
# Return discrete inverse Fourier transform of real or complex sequence
filtered_sig = fftpack.ifft(high_freq_fft)
# Using Fast Fourier Transform and inverse Fast Fourier Transform we can remove the noise from the frequency domain (that would be otherwise impossible to do in Time Domain) - done.
print("filtered noise: ", filtered_sig)
print("getiing frame rate $$", frame_rate)
filteredwrite = np.fft.irfft(filtered_sig, axis=0)
print (filteredwrite)
wavfile.write('TestFiltered.wav', frame_rate, filteredwrite)
Any ideas?
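A possible direction, offered only as a sketch: the code above applies np.fft.irfft to a signal that fftpack.ifft has already transformed back to the time domain, and the resulting floating-point array is what gets written to the file. The example below assumes a mono int16 signal and a fixed cutoff, uses rfft/irfft as a matching pair, takes the sample spacing as 1/frame_rate, and converts back to int16 before writing. fft_lowpass_int16 and cutoff_hz are hypothetical names.

```python
import numpy as np

def fft_lowpass_int16(sig, frame_rate, cutoff_hz):
    """Zero every FFT bin above cutoff_hz and return int16 samples.

    Assumes `sig` is a 1-D (mono) array of int16 samples.
    """
    spectrum = np.fft.rfft(sig)
    # d is the time between samples, i.e. 1 / frame_rate
    freqs = np.fft.rfftfreq(sig.size, d=1.0 / frame_rate)
    spectrum[freqs > cutoff_hz] = 0
    # irfft is the inverse of rfft; n restores the original length
    filtered = np.fft.irfft(spectrum, n=sig.size)
    # Round and clip so the result fits 16-bit PCM
    return np.clip(np.rint(filtered), -32768, 32767).astype(np.int16)
```

The returned array could then be passed to wavfile.write('TestFiltered.wav', frame_rate, ...), which should produce an ordinary 16-bit PCM file.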
I am streaming audio from a mic to my speaker. I want to increase the volume of the sound live, but I can't figure out a way to do it, and I've searched Google for a while.
Here is my code:
import pyaudio
Chunk = 1024
AudioFormat = pyaudio.paInt16
Channels = 2
Rate = 44100
PortAudio = pyaudio.PyAudio()
sourceDevice = PortAudio.open(format=AudioFormat,
                              channels=Channels,
                              rate=Rate,
                              input=True,
                              input_device_index=2,
                              frames_per_buffer=Chunk
                              )
destinationDevice = PortAudio.open(format=AudioFormat,
                                   channels=Channels,
                                   rate=Rate,
                                   output=True,
                                   output_device_index=4,
                                   frames_per_buffer=Chunk
                                   )
while True:
    try:
        data = sourceDevice.read(Chunk)
    except OSError:
        data = '\x00' * Chunk
    except IOError as ex:
        if ex[1] != pyaudio.paInputOverflowed:
            raise
        data = '\x00' * Chunk
    # Doing Something To Data Here To Incrase Volume Of It
    data = data # Function Here??
    destinationDevice.write(data, Chunk, exception_on_underflow=True)
An example of what the data variable contains:
(This is shortened by a lot; the original is massive.)
b'\xec\x00G\x01\xa7\x01\xbe\x01\x95\x00\xf7\x00+\x00\x91\x00\xa1\x01W\x01\xec\x01\x94\x01n\x00\xac\x00I\x00\xa4\x00\xfb\x00"\x01g\x00\x8d\x00*\x00m\x00\xde\x00\x04\x01\xb2\x00\xc7\x005\x00-\x00(\x01\xb0\x00\xec\x01Q\x01.'
You can use numpy to convert the raw data into numpy arrays, then multiply the array by a volume ratio and write it to the output stream.
from math import sqrt
import numpy as np
# ...
# convert the linear volume to a logarithmic scale (see explanation below)
volumeFactor = 2
multiplier = pow(2, (sqrt(sqrt(sqrt(volumeFactor))) * 192 - 192)/6)
# one chunk of silence: Chunk frames * Channels * 2 bytes per 16-bit sample
silence = b'\x00' * (Chunk * Channels * 2)
while True:
    try:
        data = sourceDevice.read(Chunk)
    except OSError:
        # on Python 3, IOError is an alias of OSError, so one handler covers both
        data = silence
    # Doing Something To Data Here To Increase Volume Of It
    # np.frombuffer returns a read-only view, so copy before modifying in place
    numpy_data = np.frombuffer(data, dtype=np.int16).copy()
    # scale the samples by the multiplier computed above
    np.multiply(numpy_data, multiplier,
                out=numpy_data, casting="unsafe")
    destinationDevice.write(numpy_data.tobytes(), Chunk, exception_on_underflow=True)
The concept is that audio data is an array of samples, each with a value that depends on the bit depth. Standard digital audio (such as CD audio) is 44.1 kHz, 16-bit, stereo, which means that each second has 88200 samples (since it's stereo), with each sample occupying 2 bytes (8 bits + 8 bits). If you scale the value of every sample by the same amount, you change the volume.
Now, the problem is that perceived volume is not linear, but logarithmic. So, if you want to get twice the volume, you can't just double sample values.
I'm using a conversion I found out some years ago (from Ardour sliders, if I recall correctly), which should be accurate enough.
Be careful, though, you could easily get very high levels, which will result in distorted sound.
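One way to limit the distortion mentioned above is to clip the scaled samples to the 16-bit range instead of letting them wrap around. This is a minimal sketch, assuming 16-bit PCM input; apply_gain is a hypothetical helper and not part of PyAudio.

```python
import numpy as np

def apply_gain(data: bytes, gain: float) -> bytes:
    """Scale 16-bit PCM samples by `gain`, clipping instead of wrapping around."""
    # Work in float so intermediate values can exceed the int16 range.
    samples = np.frombuffer(data, dtype=np.int16).astype(np.float64)
    scaled = np.clip(samples * gain, -32768, 32767)
    return scaled.astype(np.int16).tobytes()
```

In the loop above, this could replace the in-place multiply, e.g. destinationDevice.write(apply_gain(data, multiplier), Chunk). Clipping still distorts loud passages, but far less harshly than integer overflow does.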
My main task is to recognize a human humming from a microphone in real time. As the first step to recognizing signals in general, I have made a 5 seconds recording of a 440 Hz signal generated from an app on my phone and tried to detect the same frequency.
I used Audacity to plot and verify the spectrum from the same 440Hz wav file and I got this, which shows that 440Hz is indeed the dominant frequency :
(https://i.imgur.com/2UImEkR.png)
To do this with python, I use the PyAudio library and refer this blog. The code I have so far which I run with the wav file is this :
"""PyAudio Example: Play a WAVE file."""
import pyaudio
import wave
import sys
import struct
import numpy as np
import matplotlib.pyplot as plt
CHUNK = 1024
if len(sys.argv) < 2:
    print("Plays a wave file.\n\nUsage: %s filename.wav" % sys.argv[0])
    sys.exit(-1)
wf = wave.open(sys.argv[1], 'rb')
p = pyaudio.PyAudio()
stream = p.open(format=p.get_format_from_width(wf.getsampwidth()),
                channels=wf.getnchannels(),
                rate=wf.getframerate(),
                output=True)
data = wf.readframes(CHUNK)
i = 0
while data != '':
    i += 1
    data_unpacked = struct.unpack('{n}h'.format(n= len(data)/2 ), data)
    data_np = np.array(data_unpacked)
    data_fft = np.fft.fft(data_np)
    data_freq = np.abs(data_fft)/len(data_fft) # Dividing by length to normalize the amplitude as per https://www.mathworks.com/matlabcentral/answers/162846-amplitude-of-signal-after-fft-operation
    print("Chunk: {} max_freq: {}".format(i,np.argmax(data_freq)))
    fig = plt.figure()
    ax = fig.add_subplot(1,1,1)
    ax.plot(data_freq)
    ax.set_xscale('log')
    plt.show()
    stream.write(data)
    data = wf.readframes(CHUNK)
stream.stop_stream()
stream.close()
p.terminate()
In the output, I get that the max frequency is 10 for all the chunks and an example of one of the plots is :
(https://i.imgur.com/zsAXME5.png)
I had expected this value to be 440 instead of 10 for all the chunks. I admit I know very little about the theory of FFTs and I appreciate any help in letting my solve this.
EDIT:
The sampling rate is 44100. no. of channels is 2 and sample width is also 2.
Foreword
As xdurch0 pointed out, you are reading a kind of index instead of a frequency. If you want to do all the computation yourself, you need to compute your own frequency vector before plotting to get a consistent result. Reading this answer may help you towards the solution.
The frequency vector for FFT (half plane) is:
f = np.linspace(0, rate/2, N_fft//2)
Or (full plane):
f = np.linspace(-rate/2, rate/2, N_fft)
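As a small illustration of turning a bin index into a frequency, the sketch below uses NumPy's np.fft.rfftfreq helper on a synthetic mono 440 Hz tone. The tone merely stands in for one chunk of the recording; the chunk length and rate are taken from the question.

```python
import numpy as np

rate = 44100      # samples per second
n_fft = 1024      # chunk length, as in the question

# Synthetic mono 440 Hz tone standing in for one chunk of the recording.
t = np.arange(n_fft) / rate
chunk = np.sin(2 * np.pi * 440.0 * t)

spectrum = np.abs(np.fft.rfft(chunk)) / n_fft
freqs = np.fft.rfftfreq(n_fft, d=1.0 / rate)   # frequency of each bin in Hz

peak_bin = int(np.argmax(spectrum))            # an index, not a frequency
peak_hz = freqs[peak_bin]                      # the frequency that index maps to
print(peak_bin, peak_hz)
```

With these numbers each bin is rate / n_fft (about 43 Hz) wide, so a small index such as 10 is consistent with a tone near 440 Hz; the estimate can only be as fine as the bin width.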
On the other hand we can delegate most of the work to the excellent scipy.signal toolbox which aims to cope with this kind of problems (and many more).
MCVE
Using scipy package it is straight forward to get the desired result for a simple WAV file with a single frequency (source):
import numpy as np
from scipy import signal
from scipy.io import wavfile
import matplotlib.pyplot as plt
# Read the file (rate and data):
rate, data = wavfile.read('tone.wav') # See source
# Compute PSD:
f, P = signal.periodogram(data, rate) # Frequencies and PSD
# Display PSD:
fig, axe = plt.subplots()
axe.semilogy(f, P)
axe.set_xlim([0,500])
axe.set_ylim([1e-8, 1e10])
axe.set_xlabel(r'Frequency, $\nu$ $[\mathrm{Hz}]$')
axe.set_ylabel(r'PSD, $P$ $[\mathrm{AU^2Hz}^{-1}]$')
axe.set_title('Periodogram')
axe.grid(which='both')
Basically:
Read the wav file and get the sample rate (here 44.1kHz);
Compute the Power Spectrum Density and frequencies;
Then display it with matplotlib.
This outputs:
Find Peak
Then we can find the frequency of the first highest peak (P>1e-2, this criterion is subject to tuning) using find_peaks:
idx = signal.find_peaks(P, height=1e-2)[0][0]
f[idx] # 440.0 Hz
Putting all together it merely boils down to:
def freq(filename, setup={'height': 1e-2}):
    rate, data = wavfile.read(filename)
    f, P = signal.periodogram(data, rate)
    return f[signal.find_peaks(P, **setup)[0][0]]
Handling multiple channels
I tried this code with my wav file, and got the error for the line
axe.semilogy(f, Pxx_den) as follows : ValueError: x and y must have
same first dimension. I checked the shapes and f has (2,) while
Pxx_den has (220160,2). Also, the Pxx_den array seems to have all
zeros only.
A WAV file can hold multiple channels; most files are mono or stereo (the format allows up to 2**16 - 1 channels). The problem you point out occurs because your file has multiple channels (a stereo sample).
rate, data = wavfile.read('aaaah.wav') # Shape: (46447, 2), Rate: 48 kHz
It is not well documented, but signal.periodogram also operates on matrices, and its default axis is not consistent with the output of wavfile.read (the two work along different axes by default). So we need to orient the dimensions carefully (using the axis argument) when computing the PSD:
f, P = signal.periodogram(data, rate, axis=0, detrend='linear')
It also works with the transposed data, data.T, but then we need to transpose the result back.
Specifying the axis solves the issue: the frequency vector is correct and the PSD is no longer null everywhere (previously it was computed along axis=1, which has length 2; in your case it computed 220160 PSDs of 2-sample signals, whereas we wanted the converse).
The detrend argument ensures the signal has zero mean and removes its linear trend.
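The axis handling can be checked on synthetic data. The sketch below builds a two-channel array with the same (n_samples, n_channels) layout that wavfile.read returns; the tone frequencies and the rate are arbitrary choices for illustration.

```python
import numpy as np
from scipy import signal

rate = 48000
t = np.arange(rate) / rate                      # one second of samples
# Synthetic stereo data shaped like wavfile.read output: (n_samples, n_channels)
data = np.column_stack((np.sin(2 * np.pi * 440 * t),
                        np.sin(2 * np.pi * 1000 * t)))

# axis=0 computes one PSD per channel, along the sample axis.
f, P = signal.periodogram(data, rate, axis=0, detrend='linear')
peaks_hz = f[np.argmax(P, axis=0)]              # dominant frequency per channel
print(f.shape, P.shape, peaks_hz)
```

With axis=0, f has one entry per frequency bin and P has one column per channel, so each channel can be plotted or searched for peaks separately.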
Real application
This approach should work for real chunked samples, provided the chunks hold enough data (see the Nyquist-Shannon sampling theorem). In that case, data is a sub-sample of the signal (a chunk) and rate is kept constant, since it does not change during the process.
Chunks of size 2**10 seem to work; we can identify specific frequencies from them:
f, P = signal.periodogram(data[:2**10,:], rate, axis=0, detrend='linear') # Shapes: (513,) (513, 2)
idx0 = signal.find_peaks(P[:,0], threshold=0.01, distance=50)[0] # Peaks: [46.875, 2625., 13312.5, 16921.875] Hz
fig, axe = plt.subplots(2, 1, sharex=True, sharey=True)
axe[0].loglog(f, P[:,0])
axe[0].loglog(f[idx0], P[idx0,0], '.')
# [...]
At this point, the trickiest part is fine-tuning the find_peaks call to catch the desired frequencies. You may need to pre-filter your signal or post-process the PSD to make the identification easier.
The following code writes a simple sine at a frequency of 440 Hz to a mono WAV file. How should this code be changed in order to produce a stereo WAV file? The second channel should be at a different frequency.
import math
import wave
import struct
freq = 440.0
data_size = 40000
fname = "WaveTest.wav"
frate = 11025.0 # framerate as a float
amp = 64000.0 # multiplier for amplitude
sine_list_x = []
for x in range(data_size):
    sine_list_x.append(math.sin(2*math.pi*freq*(x/frate)))
wav_file = wave.open(fname, "w")
nchannels = 1
sampwidth = 2
framerate = int(frate)
nframes = data_size
comptype = "NONE"
compname = "not compressed"
wav_file.setparams((nchannels, sampwidth, framerate, nframes,
                    comptype, compname))
for s in sine_list_x:
    # write the audio frames to file
    wav_file.writeframes(struct.pack('h', int(s*amp/2)))
wav_file.close()
Build a parallel sine_list_y list with the other frequency / channel, set nchannels=2, and in the output loop use for s, t in zip(sine_list_x, sine_list_y): as the header clause, and a body with two writeframes calls -- one for s, one for t. IOW, corresponding frames for the two channels "alternate" in the file.
See e.g. this page for a thorough description of all possible WAV file formats, and I quote:
Multi-channel digital audio samples
are stored as interlaced wave data
which simply means that the audio
samples of a multi-channel (such as
stereo and surround) wave file are
stored by cycling through the audio
samples for each channel before
advancing to the next sample time.
This is done so that the audio files
can be played or streamed before the
entire file can be read. This is handy
when playing a large file from disk
(that may not completely fit into
memory) or streaming a file over the
Internet. The values in the diagram
below would be stored in a Wave file
in the order they are listed in the
Value column (top to bottom).
and the following table clearly shows the channels' samples going left, right, left, right, ...
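The interleaving described above can be sketched with the standard library alone. This is an illustrative variant rather than a drop-in edit of the question's code: write_stereo_sines is a hypothetical helper, and the second frequency (880 Hz) and the amplitude are arbitrary choices.

```python
import math
import struct
import wave

def write_stereo_sines(fname, freq_left=440.0, freq_right=880.0,
                       frate=11025, data_size=40000, amp=16000.0):
    """Write a 16-bit stereo WAV with a different sine in each channel."""
    frames = bytearray()
    for x in range(data_size):
        left = int(amp * math.sin(2 * math.pi * freq_left * x / frate))
        right = int(amp * math.sin(2 * math.pi * freq_right * x / frate))
        # One frame = left sample followed by right sample (little-endian int16).
        frames += struct.pack('<hh', left, right)
    with wave.open(fname, 'wb') as wav_file:
        wav_file.setnchannels(2)
        wav_file.setsampwidth(2)
        wav_file.setframerate(frate)
        wav_file.writeframes(bytes(frames))
```

Collecting the frames first and calling writeframes once also avoids one write call per sample.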
For an example producing a stereo .wav file, see the test_wave.py module.
The test produces an all-zero file.
You can modify by inserting alternating sample values.
nchannels = 2
sampwidth = 2
framerate = 8000
nframes = 100
# ...
def test_it(self):
    self.f = wave.open(TESTFN, 'wb')
    self.f.setnchannels(nchannels)
    self.f.setsampwidth(sampwidth)
    self.f.setframerate(framerate)
    self.f.setnframes(nframes)
    # writeframes expects bytes on Python 3
    output = b'\0' * nframes * nchannels * sampwidth
    self.f.writeframes(output)
    self.f.close()
Another option is to use the SciPy and NumPy libraries. In the below example, we produce a stereo wave file where the left channel has a low-frequency tone while the right channel has a higher-frequency tone. (Note: Use VLC player to play the audio)
To install SciPy, see: https://pypi.org/project/scipy/
import numpy as np
from scipy.io import wavfile
# User input
duration=5.0
toneFrequency_left=500 #Hz (20,000 Hz max value)
toneFrequency_right=1200 #Hz (20,000 Hz max value)
# Constants
samplingFrequency=48000
# Generate Tones
time_x=np.arange(0, duration, 1.0/float(samplingFrequency))
toneLeft_y=np.cos(2.0 * np.pi * toneFrequency_left * time_x)
toneRight_y=np.cos(2.0 * np.pi * toneFrequency_right * time_x)
# A 2D array where the left and right tones are contained in their respective rows
tone_y_stereo=np.vstack((toneLeft_y, toneRight_y))
# Reshape 2D array so that the left and right tones are contained in their respective columns
tone_y_stereo=tone_y_stereo.transpose()
# Produce an audio file that contains stereo sound
wavfile.write('stereoAudio.wav', samplingFrequency, tone_y_stereo)
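The example above passes a float64 array to wavfile.write, which stores floating-point samples; that may be why a particular player is recommended. As a hedged alternative, the sketch below scales the tones to 16-bit integers before writing, which should give a plain PCM file. to_int16, write_stereo_int16 and the peak headroom value are hypothetical choices, not part of the original answer.

```python
import numpy as np
from scipy.io import wavfile

def to_int16(samples, peak=0.8):
    """Scale float samples in [-1.0, 1.0] to int16, leaving some headroom."""
    clipped = np.clip(samples, -1.0, 1.0)
    return np.rint(clipped * peak * 32767).astype(np.int16)

def write_stereo_int16(path, rate=48000, duration=5.0,
                       f_left=500.0, f_right=1200.0):
    """Write a stereo file with one tone per channel as 16-bit PCM."""
    t = np.arange(0, duration, 1.0 / rate)
    # Columns are channels: shape (n_samples, 2)
    stereo = np.column_stack((np.cos(2.0 * np.pi * f_left * t),
                              np.cos(2.0 * np.pi * f_right * t)))
    wavfile.write(path, rate, to_int16(stereo))
```

Calling write_stereo_int16('stereoAudio16.wav') would produce a file comparable to the one above, with the left and right tones in separate columns.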
Environment Notes
Version Used
Python 3.7.1
SciPy 1.1.0