How to write stereo wav files in Python? - python

The following code writes a simple sine at frequency 400Hz to a mono WAV file. How should this code be changed in order to produce a stereo WAV file. The second channel should be in a different frequency.
import math
import wave
import struct
freq = 440.0
data_size = 40000
fname = "WaveTest.wav"
frate = 11025.0 # framerate as a float
amp = 64000.0 # multiplier for amplitude
sine_list_x = []
for x in range(data_size):
sine_list_x.append(math.sin(2*math.pi*freq*(x/frate)))
wav_file = wave.open(fname, "w")
nchannels = 1
sampwidth = 2
framerate = int(frate)
nframes = data_size
comptype = "NONE"
compname = "not compressed"
wav_file.setparams((nchannels, sampwidth, framerate, nframes,
comptype, compname))
for s in sine_list_x:
# write the audio frames to file
wav_file.writeframes(struct.pack('h', int(s*amp/2)))
wav_file.close()

Build a parallel sine_list_y list with the other frequency / channel, set nchannels=2, and in the output loop use for s, t in zip(sine_list_x, sine_list_y): as the header clause, and a body with two writeframes calls -- one for s, one for t. IOW, corresponding frames for the two channels "alternate" in the file.
See e.g. this page for a thorough description of all possible WAV file formats, and I quote:
Multi-channel digital audio samples
are stored as interlaced wave data
which simply means that the audio
samples of a multi-channel (such as
stereo and surround) wave file are
stored by cycling through the audio
samples for each channel before
advancing to the next sample time.
This is done so that the audio files
can be played or streamed before the
entire file can be read. This is handy
when playing a large file from disk
(that may not completely fit into
memory) or streaming a file over the
Internet. The values in the diagram
below would be stored in a Wave file
in the order they are listed in the
Value column (top to bottom).
and the following table clearly shows the channels' samples going left, right, left, right, ...

For an example producing a stereo .wav file, see the test_wave.py module.
The test produces an all-zero file.
You can modify by inserting alternating sample values.
nchannels = 2
sampwidth = 2
framerate = 8000
nframes = 100
# ...
def test_it(self):
self.f = wave.open(TESTFN, 'wb')
self.f.setnchannels(nchannels)
self.f.setsampwidth(sampwidth)
self.f.setframerate(framerate)
self.f.setnframes(nframes)
output = '\0' * nframes * nchannels * sampwidth
self.f.writeframes(output)
self.f.close()

Another option is to use the SciPy and NumPy libraries. In the below example, we produce a stereo wave file where the left channel has a low-frequency tone while the right channel has a higher-frequency tone. (Note: Use VLC player to play the audio)
To install SciPy, see: https://pypi.org/project/scipy/
import numpy as np
from scipy.io import wavfile
# User input
duration=5.0
toneFrequency_left=500 #Hz (20,000 Hz max value)
toneFrequency_right=1200 #Hz (20,000 Hz max value)
# Constants
samplingFrequency=48000
# Generate Tones
time_x=np.arange(0, duration, 1.0/float(samplingFrequency))
toneLeft_y=np.cos(2.0 * np.pi * toneFrequency_left * time_x)
toneRight_y=np.cos(2.0 * np.pi * toneFrequency_right * time_x)
# A 2D array where the left and right tones are contained in their respective rows
tone_y_stereo=np.vstack((toneLeft_y, toneRight_y))
# Reshape 2D array so that the left and right tones are contained in their respective columns
tone_y_stereo=tone_y_stereo.transpose()
# Produce an audio file that contains stereo sound
wavfile.write('stereoAudio.wav', samplingFrequency, tone_y_stereo)
Environment Notes
Version Used
Python 3.7.1
Python 3.7.1
SciPy 1.1.0

Related

Preferred way to write audio data to a WAV file?

I am trying to write an audio file using python's wave and numpy. So far I have the following and it works well:
import wave
import numpy as np
# set up WAV file parameters
num_channels = 1 # mono audio
sample_width = 1 # 8 bits(1 byte)/sample
sample_rate = 44.1e3 # 44.1k samples/second
frequency = 440 # 440 Hz
duration = 20 # play for this many seconds
num_samples = int(sample_rate * duration) # samples/seconds * seconds
# open WAV file and write data
with wave.open('sine8bit_2.wav', 'w') as wavfile:
wavfile.setnchannels(num_channels)
wavfile.setsampwidth(sample_width)
wavfile.setframerate(sample_rate)
t = np.linspace(0, duration, num_samples)
data = (127*np.sin(2*np.pi*frequency*t)).astype(np.int8)
wavfile.writeframes(data) # or data.tobytes() ??
My issue is that since I am using a high sampling rate, the num_samples variable might quickly become too large (9261000 samples for a 3 minute 30 seconds track say). Would using a numpy array this large be advisable? Is there a better way of going about this? Also is use of writeframes(.tobytes()) needed in this case because my code runs fine without it and it seems like extra overhead (especially if the arrays get too large).
Assuming you are only going to write a sine wave, you could very well create only one period as your data array and write that several times to the .wav file.
Using the parameters you provided, your data array is 8800 times smaller with that approach. Its size also no longer depends on the duration of your file!
import wave
import numpy as np
# set up WAV file parameters
num_channels = 1 # mono audio
sample_width = 1 # 8 bits(1 byte)/sample
sample_rate = 44.1e3 # 44.1k samples/second
frequency = 440 # 440 Hz
duration = 20 # play for this many seconds
# Create a single period of sine wave.
n = round(sample_rate/frequency)
t = np.linspace(0, 1/frequency, n)
data = (127*np.sin(2*np.pi*frequency*t)).astype(np.int8)
periods = round(frequency*duration)
# open WAV file and write data
with wave.open('sine8bit_2.wav', 'w') as wavfile:
wavfile.setnchannels(num_channels)
wavfile.setsampwidth(sample_width)
wavfile.setframerate(sample_rate)
for _ in range(periods):
wavfile.writeframes(data)

python handle wav file with wave and numpy

I'm trying to extract frequency from .wav files. So I'm using python wave and numpy, I'm almost done! But I face an error.. I followed this url's answer : Extracting frequencies from a wav file python
when I exract frequency from .wav file that created myself by following that answer, it succeed. However, when I exract frequency from .wav file that recorded by mic. it raised an error :
struct.error: unpack requires a buffer of 288768 bytes
following is my code
import wave
import struct
import numpy as np
if __name__ == '__main__':
wf = wave.open('test6.wav', 'rb')
frame = wf.getnframes()
data_size = wf.getnframes()
frate = wf.getframerate()
data = wf.readframes(data_size)
wf.close()
duration = frame / float(frate)
data = struct.unpack('{n}h'.format(n=data_size), data)
data = np.array(data)
w = np.fft.fft(data)
freqs = np.fft.fftfreq(len(w))
print(freqs.min(), freqs.max())
# (-0.5, 0.499975)
# Find the peak in the coefficients
idx = np.argmax(np.abs(w))
freq = freqs[idx]
freq_in_hertz = abs(freq * frate)
print('freqiency: ',freq_in_hertz)
print('duration: ',duration)
288768 in error message is exactly double of data_size.
So when I use data_size=wf.getnframes()*2, it does not raise error. But, it raise an error with file that created by code.
How can I solve this?
Given that the size of the buffer is exactly double data_size, I would guess that the .wav file you recorded with your mic has two channels instead of one. You can verify this by looking at the output of wf.getnchannels(). It should be 2 for your mic recording.
If this is the case, you can load just one channel of your mic recording by following this answer:
Read the data of a single channel from a stereo wave file in Python

Read the data of a single channel from a stereo wave file wave with 24-bit data in Python

I want to read the left and rigth channel.
import wave
origAudio = wave.open("6980.wav","r")
frameRate = origAudio.getframerate()
nChannels = origAudio.getnchannels()
sampWidth = origAudio.getsampwidth()
nbframe=origAudio.getnframes()
da = np.fromstring(origAudio.readframes(48000), dtype=np.int16)
origAudio.getparams()
the parametre
(2, 3, 48000, 2883584, 'NONE', 'not compressed')
Now I want to separate left and right channel with wave file in 24 bit data
You can use wavio, a small module that I wrote to read and write WAV files using numpy arrays. In your case:
import wavio
wav = wavio.read("6980.wav")
# wav.data is the numpy array of samples.
# wav.rate is the sampling rate.
# wav.sampwidth is the sample width, in bytes. For a 24 bit file,
# wav.sampwdith is 3.
left_channel = wav.data[:, 0]
right_channel = wav.data[:, 1]
wavio is on PyPi, and the source is on github at https://github.com/WarrenWeckesser/wavio.
The parameters tell you that you have 2 channels of data at 3 bytes per sample, at 48kHz. So when you say readframes(48000) you get one second of frames which you should probably read into a slightly different data structure:
da = np.fromstring(origAudio.readframes(48000), dtype=np.uint8)
Now you should have 48000 * 2 * 3 bytes, i.e. len(da). To take only the first channel you'd do this:
chan1 = np.zeros(48000, np.uint32)
chan1bytes = chan1.view(np.uint8)
chan1bytes[0::4] = da[0::6]
chan1bytes[1::4] = da[1::6]
chan1bytes[2::4] = da[2::6]
That is, you make an array of integers, one per sample, and copy the appropriate bytes over from the source data (you could try copying directly from the result of readframes() and skip creating da).

Python wave: Extracting stereo channels separately from *.wav-file

I have a 32 bit *.wav-file 44100Hz sampling. I use use wave and python struct.unpack to get the data to an array. However I want to get each of the two stereo channels as a separate array. How do I do this as simply as possible? This is the code I have:
def read_values(filename):
wave_file = open(filename, 'r')
nframes = wave_file.getnframes()
nchannels = wave_file.getnchannels()
sampling_frequency = wave_file.getframerate()
T = nframes / float(sampling_frequency)
read_frames = wave_file.readframes(nframes)
wave_file.close()
data = unpack("%dh" % nchannels*nframes, read_frames)
return T, data, nframes, nchannels, sampling_frequency
Are you able to (1) modify the code such that it returns two arrays, one for each channel, and (2) explain how the wave is structured, and how the functions which I used incorrectly is used correctly.
Left and right channel in stereo files are interleaved. You can find lots of information about this online, e.g. look at the figures here.
So if you already have your whole audio data in an list, you just have to get very 2nd sample for stereo, every 4th sample for quadro, etc. You can do this with list splicing:
data_per_channel = [data[offset::nchannels] for offset in range(nchannels)]

Reading *.wav files in Python

I need to analyze sound written in a .wav file. For that I need to transform this file into set of numbers (arrays, for example). I think I need to use the wave package. However, I do not know how exactly it works. For example I did the following:
import wave
w = wave.open('/usr/share/sounds/ekiga/voicemail.wav', 'r')
for i in range(w.getnframes()):
frame = w.readframes(i)
print frame
As a result of this code I expected to see sound pressure as function of time. In contrast I see a lot of strange, mysterious symbols (which are not hexadecimal numbers). Can anybody, pleas, help me with that?
Per the documentation, scipy.io.wavfile.read(somefile) returns a tuple of two items: the first is the sampling rate in samples per second, the second is a numpy array with all the data read from the file:
from scipy.io import wavfile
samplerate, data = wavfile.read('./output/audio.wav')
Using the struct module, you can take the wave frames (which are in 2's complementary binary between -32768 and 32767 (i.e. 0x8000 and 0x7FFF). This reads a MONO, 16-BIT, WAVE file. I found this webpage quite useful in formulating this:
import wave, struct
wavefile = wave.open('sine.wav', 'r')
length = wavefile.getnframes()
for i in range(0, length):
wavedata = wavefile.readframes(1)
data = struct.unpack("<h", wavedata)
print(int(data[0]))
This snippet reads 1 frame. To read more than one frame (e.g., 13), use
wavedata = wavefile.readframes(13)
data = struct.unpack("<13h", wavedata)
Different Python modules to read wav:
There is at least these following libraries to read wave audio files:
SoundFile
scipy.io.wavfile (from scipy)
wave (to read streams. Included in Python 2 and 3)
scikits.audiolab (unmaintained since 2010)
sounddevice (play and record sounds, good for streams and real-time)
pyglet
librosa (music and audio analysis)
madmom (strong focus on music information retrieval (MIR) tasks)
The most simple example:
This is a simple example with SoundFile:
import soundfile as sf
data, samplerate = sf.read('existing_file.wav')
Format of the output:
Warning, the data are not always in the same format, that depends on the library. For instance:
from scikits import audiolab
from scipy.io import wavfile
from sys import argv
for filepath in argv[1:]:
x, fs, nb_bits = audiolab.wavread(filepath)
print('Reading with scikits.audiolab.wavread:', x)
fs, x = wavfile.read(filepath)
print('Reading with scipy.io.wavfile.read:', x)
Output:
Reading with scikits.audiolab.wavread: [ 0. 0. 0. ..., -0.00097656 -0.00079346 -0.00097656]
Reading with scipy.io.wavfile.read: [ 0 0 0 ..., -32 -26 -32]
SoundFile and Audiolab return floats between -1 and 1 (as matab does, that is the convention for audio signals). Scipy and wave return integers, which you can convert to floats according to the number of bits of encoding, for example:
from scipy.io.wavfile import read as wavread
samplerate, x = wavread(audiofilename) # x is a numpy array of integers, representing the samples
# scale to -1.0 -- 1.0
if x.dtype == 'int16':
nb_bits = 16 # -> 16-bit wav files
elif x.dtype == 'int32':
nb_bits = 32 # -> 32-bit wav files
max_nb_bit = float(2 ** (nb_bits - 1))
samples = x / (max_nb_bit + 1) # samples is a numpy array of floats representing the samples
IMHO, the easiest way to get audio data from a sound file into a NumPy array is SoundFile:
import soundfile as sf
data, fs = sf.read('/usr/share/sounds/ekiga/voicemail.wav')
This also supports 24-bit files out of the box.
There are many sound file libraries available, I've written an overview where you can see a few pros and cons.
It also features a page explaining how to read a 24-bit wav file with the wave module.
You can accomplish this using the scikits.audiolab module. It requires NumPy and SciPy to function, and also libsndfile.
Note, I was only able to get it to work on Ubunutu and not on OSX.
from scikits.audiolab import wavread
filename = "testfile.wav"
data, sample_frequency,encoding = wavread(filename)
Now you have the wav data
If you want to procces an audio block by block, some of the given solutions are quite awful in the sense that they imply loading the whole audio into memory producing many cache misses and slowing down your program. python-wavefile provides some pythonic constructs to do NumPy block-by-block processing using efficient and transparent block management by means of generators. Other pythonic niceties are context manager for files, metadata as properties... and if you want the whole file interface, because you are developing a quick prototype and you don't care about efficency, the whole file interface is still there.
A simple example of processing would be:
import sys
from wavefile import WaveReader, WaveWriter
with WaveReader(sys.argv[1]) as r :
with WaveWriter(
'output.wav',
channels=r.channels,
samplerate=r.samplerate,
) as w :
# Just to set the metadata
w.metadata.title = r.metadata.title + " II"
w.metadata.artist = r.metadata.artist
# This is the prodessing loop
for data in r.read_iter(size=512) :
data[1] *= .8 # lower volume on the second channel
w.write(data)
The example reuses the same block to read the whole file, even in the case of the last block that usually is less than the required size. In this case you get an slice of the block. So trust the returned block length instead of using a hardcoded 512 size for any further processing.
If you're going to perform transfers on the waveform data then perhaps you should use SciPy, specifically scipy.io.wavfile.
Here's a Python 3 solution using the built in wave module [1], that works for n channels, and 8,16,24... bits.
import sys
import wave
def read_wav(path):
with wave.open(path, "rb") as wav:
nchannels, sampwidth, framerate, nframes, _, _ = wav.getparams()
print(wav.getparams(), "\nBits per sample =", sampwidth * 8)
signed = sampwidth > 1 # 8 bit wavs are unsigned
byteorder = sys.byteorder # wave module uses sys.byteorder for bytes
values = [] # e.g. for stereo, values[i] = [left_val, right_val]
for _ in range(nframes):
frame = wav.readframes(1) # read next frame
channel_vals = [] # mono has 1 channel, stereo 2, etc.
for channel in range(nchannels):
as_bytes = frame[channel * sampwidth: (channel + 1) * sampwidth]
as_int = int.from_bytes(as_bytes, byteorder, signed=signed)
channel_vals.append(as_int)
values.append(channel_vals)
return values, framerate
You can turn the result into a NumPy array.
import numpy as np
data, rate = read_wav(path)
data = np.array(data)
Note, I've tried to make it readable rather than fast. I found reading all the data at once was almost 2x faster. E.g.
with wave.open(path, "rb") as wav:
nchannels, sampwidth, framerate, nframes, _, _ = wav.getparams()
all_bytes = wav.readframes(-1)
framewidth = sampwidth * nchannels
frames = (all_bytes[i * framewidth: (i + 1) * framewidth]
for i in range(nframes))
for frame in frames:
...
Although python-soundfile is roughly 2 orders of magnitude faster (hard to approach this speed with pure CPython).
[1] https://docs.python.org/3/library/wave.html
My dear, as far as I understood what you are looking for, you are getting into a theory field called Digital Signal Processing (DSP). This engineering area comes from a simple analysis of discrete-time signals to complex adaptive filters. A nice idea is to think of the discrete-time signals as a vector, where each element of this vector is a sampled value of the original, continuous-time signal. Once you get the samples in a vector form, you can apply different digital signal techniques to this vector.
Unfortunately, on Python, moving from audio files to NumPy array vector is rather cumbersome, as you could notice... If you don't idolize one programming language over other, I highly suggest trying out MatLab/Octave. Matlab makes the samples access from files straightforward. audioread() makes this task to you :) And there are a lot of toolboxes designed specifically for DSP.
Nevertheless, if you really intend to get into Python for this, I'll give you a step-by-step to guide you.
1. Get the samples
The easiest way the get the samples from the .wav file is:
from scipy.io import wavfile
sampling_rate, samples = wavfile.read(f'/path/to/file.wav')
Alternatively, you could use the wave and struct package to get the samples:
import numpy as np
import wave, struct
wav_file = wave.open(f'/path/to/file.wav', 'rb')
# from .wav file to binary data in hexadecimal
binary_data = wav_file.readframes(wav_file.getnframes())
# from binary file to samples
s = np.array(struct.unpack('{n}h'.format(n=wav_file.getnframes()*wav_file.getnchannels()), binary_data))
Answering your question: binary_data is a bytes object, which is not human-readable and can only make sense to a machine. You can validate this statement typing type(binary_data). If you really want to understand a little bit more about this bunch of odd characters, click here.
If your audio is stereo (that is, has 2 channels), you can reshape this signal to achieve the same format obtained with scipy.io
s_like_scipy = s.reshape(-1, wav_file.getnchannels())
Each column is a chanell. In either way, the samples obtained from the .wav file can be used to plot and understand the temporal behavior of the signal.
In both alternatives, the samples obtained from the files are represented in the Linear Pulse Code Modulation (LPCM)
2. Do digital signal processing stuffs onto the audio samples
I'll leave that part up to you :) But this is a nice book to take you through DSP. Unfortunately, I don't know good books with Python, they are usually horrible books... But do not worry about it, the theory can be applied in the very same way using any programming language, as long as you domain that language.
Whatever the book you pick up, stick with the classical authors, such as Proakis, Oppenheim, and so on... Do not care about the language programming they use. For a more practical guide of DPS for audio using Python, see this page.
3. Play the filtered audio samples
import pyaudio
p = pyaudio.PyAudio()
stream = p.open(format = p.get_format_from_width(wav_file.getsampwidth()),
channels = wav_file.getnchannels(),
rate = wav_file.getframerate(),
output = True)
# from samples to the new binary file
new_binary_data = struct.pack('{}h'.format(len(s)), *s)
stream.write(new_binary_data)
where wav_file.getsampwidth() is the number of bytes per sample, and wav_file.getframerate() is the sampling rate. Just use the same parameters of the input audio.
4. Save the result in a new .wav file
wav_file=wave.open('/phat/to/new_file.wav', 'w')
wav_file.setparams((nchannels, sampwidth, sampling_rate, nframes, "NONE", "not compressed"))
for sample in s:
wav_file.writeframes(struct.pack('h', int(sample)))
where nchannels is the number of channels, sampwidth is the number of bytes per samples, sampling_rate is the sampling rate, nframes is the total number of samples.
I needed to read a 1-channel 24-bit WAV file. The post above by Nak was very useful. However, as mentioned above by basj 24-bit is not straightforward. I finally got it working using the following snippet:
from scipy.io import wavfile
TheFile = 'example24bit1channelFile.wav'
[fs, x] = wavfile.read(TheFile)
# convert the loaded data into a 24bit signal
nx = len(x)
ny = nx/3*4 # four 3-byte samples are contained in three int32 words
y = np.zeros((ny,), dtype=np.int32) # initialise array
# build the data left aligned in order to keep the sign bit operational.
# result will be factor 256 too high
y[0:ny:4] = ((x[0:nx:3] & 0x000000FF) << 8) | \
((x[0:nx:3] & 0x0000FF00) << 8) | ((x[0:nx:3] & 0x00FF0000) << 8)
y[1:ny:4] = ((x[0:nx:3] & 0xFF000000) >> 16) | \
((x[1:nx:3] & 0x000000FF) << 16) | ((x[1:nx:3] & 0x0000FF00) << 16)
y[2:ny:4] = ((x[1:nx:3] & 0x00FF0000) >> 8) | \
((x[1:nx:3] & 0xFF000000) >> 8) | ((x[2:nx:3] & 0x000000FF) << 24)
y[3:ny:4] = (x[2:nx:3] & 0x0000FF00) | \
(x[2:nx:3] & 0x00FF0000) | (x[2:nx:3] & 0xFF000000)
y = y/256 # correct for building 24 bit data left aligned in 32bit words
Some additional scaling is required if you need results between -1 and +1. Maybe some of you out there might find this useful
if its just two files and the sample rate is significantly high, you could just interleave them.
from scipy.io import wavfile
rate1,dat1 = wavfile.read(File1)
rate2,dat2 = wavfile.read(File2)
if len(dat2) > len(dat1):#swap shortest
temp = dat2
dat2 = dat1
dat1 = temp
output = dat1
for i in range(len(dat2)/2): output[i*2]=dat2[i*2]
wavfile.write(OUTPUT,rate,dat)
PyDub (http://pydub.com/) has not been mentioned and that should be fixed. IMO this is the most comprehensive library for reading audio files in Python right now, although not without its faults. Reading a wav file:
from pydub import AudioSegment
audio_file = AudioSegment.from_wav('path_to.wav')
# or
audio_file = AudioSegment.from_file('path_to.wav')
# do whatever you want with the audio, change bitrate, export, convert, read info, etc.
# Check out the API docs http://pydub.com/
PS. The example is about reading a wav file, but PyDub can handle a lot of various formats out of the box. The caveat is that it's based on both native Python wav support and ffmpeg, so you have to have ffmpeg installed and a lot of the pydub capabilities rely on the ffmpeg version. Usually if ffmpeg can do it, so can pydub (which is quite powerful).
Non-disclaimer: I'm not related to the project, but I am a heavy user.
u can also use simple import wavio library u also need have some basic knowledge of the sound.

Categories

Resources