looking for reading audio wave values in python

I'm trying to read and manipulate an audio file.
How can I read and manipulate the sample values of a WAV file using Python?

The SciPy libraries have great resources for this:
Writing and Reading:
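import numpy as np
from scipy.io import wavfile
fs = 44100  # sample rate in Hz; wavfile.write expects an integer
t = np.arange(0, 1.0, 1.0/fs)  # one second of sample times
f1 = 440
f2 = 600
# Sum of two sine tones, kept within [-1, 1]
x = 0.5*np.sin(2*np.pi*f1*t) + 0.5*np.sin(2*np.pi*f2*t)
fname = 'output.wav'
wavfile.write(fname, fs, x)  # float64 data is written as a floating-point WAV
fs, data = wavfile.read(fname)
print(fs, data[:10])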
Documentation:
http://docs.scipy.org/doc/scipy-0.14.0/reference/io.html#module-scipy.io.wavfile
Reading: http://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.io.wavfile.read.html#scipy.io.wavfile.read
Writing: http://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.io.wavfile.write.html#scipy.io.wavfile.write
This question has been asked before: Reading *.wav files in Python
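As a rough sketch of the "manipulate" half of the question, the snippet below reads 16-bit samples back as a NumPy array, halves their amplitude, and converts the result back to int16. The in-memory buffer, the 8000 Hz rate, and the 0.5 gain are illustrative assumptions and not part of the answer above.

```python
import io

import numpy as np
from scipy.io import wavfile

fs = 8000  # assumed sample rate for this illustration
t = np.arange(fs) / fs
tone = (0.5 * np.sin(2 * np.pi * 440 * t) * 32767).astype(np.int16)

# Write to an in-memory buffer instead of a file on disk.
buf = io.BytesIO()
wavfile.write(buf, fs, tone)
buf.seek(0)

rate, data = wavfile.read(buf)  # int16 array for 16-bit PCM data

# Work in float, apply a gain, then convert back to int16 with clipping.
samples = data.astype(np.float64) / 32768.0
quieter = samples * 0.5  # hypothetical gain value
out = np.clip(np.round(quieter * 32768.0), -32768, 32767).astype(np.int16)
```

Converting to float before scaling avoids integer overflow when the gain is greater than 1; the final clip guards the conversion back to int16.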

Related

How can I reverse a scipy.signal.spectrogram to audio with Python?

I have:
import librosa
from scipy import signal
import scipy.io.wavfile as sf
samples, sample_rate = sf.read(args.file)
nperseg = int(sample_rate * 0.001 * 20)
frequencies, times, spectrogram = signal.spectrogram(samples,
                                                     sample_rate,
                                                     nperseg=nperseg,
                                                     window=signal.hann(nperseg))
audio_signal = librosa.griffinlim(spectrogram)
print(audio_signal, audio_signal.shape)
sf.write('test.wav', audio_signal, sample_rate)
However, this produces a (near) empty sound file.

As @DrSpill mentioned, the return-value order of scipy.io.wavfile.read and the argument order of scipy.io.wavfile.write were wrong, and the librosa import was also not correct. This should do it:
import librosa
import numpy as np
import scipy.signal
import scipy.io.wavfile
# read file
file = "temp/processed_file.wav"
fs, sig = scipy.io.wavfile.read(file)
nperseg = int(fs * 0.001 * 20)
# process
# scipy.signal.hann was removed in newer SciPy releases; use scipy.signal.windows.hann
frequencies, times, spectrogram = scipy.signal.spectrogram(sig,
                                                           fs,
                                                           nperseg=nperseg,
                                                           window=scipy.signal.windows.hann(nperseg))
audio_signal = librosa.core.spectrum.griffinlim(spectrogram)
print(audio_signal, audio_signal.shape)
# write output
scipy.io.wavfile.write('test.wav', fs, np.array(audio_signal, dtype=np.int16))
Remark:
The resulting file had an accelerated tempo when I listened to it. I think this is due to your processing, but with some tweaking it should work.
A good alternative would be to use only librosa, like this:
import librosa
import numpy as np
# read file
file = "temp/processed_file.wav"
sig, fs = librosa.core.load(file, sr=8000)
# process
abs_spectrogram = np.abs(librosa.core.spectrum.stft(sig))
audio_signal = librosa.core.spectrum.griffinlim(abs_spectrogram)
print(audio_signal, audio_signal.shape)
# write output
librosa.output.write_wav('test2.wav', audio_signal, fs)
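One reason magnitude-only inversion is approximate is that a spectrogram discards phase, which Griffin-Lim then has to estimate. For comparison, here is a minimal sketch using scipy.signal.stft and scipy.signal.istft, which keep the complex values and so can reconstruct the signal almost exactly. The test tone and window length are assumptions chosen for illustration.

```python
import numpy as np
from scipy import signal

fs = 8000  # assumed sample rate
t = np.arange(fs) / fs
x = 0.5 * np.sin(2 * np.pi * 440 * t)  # synthetic stand-in for real audio

nperseg = 256  # assumed window length (default Hann window, 50% overlap)
f, frame_times, Zxx = signal.stft(x, fs=fs, nperseg=nperseg)

# Zxx is complex: magnitude and phase are both available here.
magnitude = np.abs(Zxx)

# Inverting the complex STFT recovers the waveform (up to padding at the end).
_, x_rec = signal.istft(Zxx, fs=fs, nperseg=nperseg)
x_rec = x_rec[: len(x)]
max_error = np.max(np.abs(x - x_rec))
```

If only `magnitude` is kept, the phase has to be estimated, which is the job Griffin-Lim does, and the window and hop settings used for inversion need to match those used for analysis.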

The librosa.output module was deprecated and has since been removed (as of librosa 0.8.0), so librosa.output.write_wav is no longer available. Use soundfile.write instead:
import numpy as np
import soundfile as sf
sf.write('stereo_file.wav', np.random.randn(10, 2), 44100, 'PCM_24')
# Per your code, you could try:
sf.write('test.wav', audio_signal, sample_rate, 'PCM_24')

How to create multichannel .WAV file in Python?

The .WAV format looks like it should allow more than two channels (nChannels).
But unfortunately scipy.io.wavfile only writes 1 or 2.
I don't really want to manually write my own Python WAV-writer, but I can't find anything out there.
Is there any code out there that does the job?

It turns out the documentation for scipy.io.wavfile is incorrect.
Looking at the source code, I can clearly see that it accepts an arbitrary number of channels.
The following code works:
import numpy as np
from scipy.io import wavfile
fs = 48000
nsamps = fs * 10
A, Csharp, E, G = 440.0, 554.365, 660.0, 783.991
def sine(freqHz):
    τ = 2 * np.pi
    return np.sin(
        np.linspace(0, τ * freqHz * nsamps / fs, nsamps, endpoint=False)
    )
A7_chord = np.array( [ sine(A), sine(Csharp), sine(E), sine(G) ] ).T
wavfile.write("A7--4channel.wav", fs, A7_chord)
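As a cross-check that does not depend on SciPy, the standard-library wave module also accepts more than two channels. The sketch below writes a short four-channel file to an in-memory buffer and reads the header back; the shortened duration, the 8000 Hz rate, and the variable names are assumptions made for illustration.

```python
import io
import math
import struct
import wave

fs = 8000  # assumed sample rate
freqs = [440.0, 554.365, 660.0, 783.991]  # A, C#, E, G as in the answer above
n = fs // 10  # 0.1 s of audio to keep the example small

# Interleave the samples: one 16-bit value per channel for every frame.
frames = bytearray()
for i in range(n):
    for f in freqs:
        sample = int(0.5 * 32767 * math.sin(2 * math.pi * f * i / fs))
        frames += struct.pack("<h", sample)

buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(len(freqs))
    w.setsampwidth(2)  # 2 bytes per sample
    w.setframerate(fs)
    w.writeframes(bytes(frames))

# Read the header back to confirm the channel count.
buf.seek(0)
with wave.open(buf, "rb") as r:
    nchannels = r.getnchannels()
    nframes = r.getnframes()
```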

Creating .wav file from bytes

I am reading bytes from wav audio downloaded from a URL. I would like to "reconstruct" these bytes into a .wav file. I have attempted the code below, but the resulting file is pretty much static. For example, when I download audio of myself speaking, the .wav file produced is static only, but I can hear slight alterations/distortions when I know the audio should be playing my voice. What am I doing wrong?
from pprint import pprint
import scipy.io.wavfile
import numpy
#download a wav audio recording from a url
>>>response = client.get_recording(r"someurl.com")
>>>pprint(response)
(b'RIFFv\xfc\x03\x00WAVEfmt \x10\x00\x00\x00\x01\x00\x01\x00\x80>\x00\x00'
...
b'\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff'
...
b'\xea\xff\xfd\xff\x10\x00\x0c\x00\xf0\xff\x06\x00\x10\x00\x06\x00'
...)
>>>a=bytearray(response)
>>>pprint(a)
bytearray(b'RIFFv\xfc\x03\x00WAVEfmt \x10\x00\x00\x00\x01\x00\x01\x00'
b'\x80>\x00\x00\x00}\x00\x00\x02\x00\x10\x00LISTJ\x00\x00\x00INFOINAM'
b'0\x00\x00\x00Conference d95ac842-08b7-4380-83ec-85ac6428cc41\x00'
b'IART\x06\x00\x00\x00Nexmo\x00data\x00\xfc\x03\x00\xff\xff'
b'\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff'
...
b'\x12\x00\xf6\xff\t\x00\xed\xff\xf6\xff\xfc\xff\xea\xff\xfd\xff'
...)
>>>b = numpy.array(a, dtype=numpy.int16)
>>>pprint(b)
array([ 82, 73, 70, ..., 255, 248, 255], dtype=int16)
>>>scipy.io.wavfile.write(r"C:\Users\somefolder\newwavfile.wav",
16000, b)

You can simply write the data in response to a file:
with open('myfile.wav', mode='bx') as f:
    f.write(response)
If you want to access the audio data as a NumPy array without writing it to a file first, you can do this with the soundfile module like this:
import io
import soundfile as sf
data, samplerate = sf.read(io.BytesIO(response))
See also this example: https://pysoundfile.readthedocs.io/en/0.9.0/#virtual-io
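The static in the question is consistent with how numpy.array treats a bytearray: every byte becomes its own sample, and the RIFF header bytes are included as audio. A small sketch of the difference between that and numpy.frombuffer, using made-up sample values:

```python
import numpy as np

# Four hypothetical 16-bit little-endian samples, serialized to 8 bytes.
original = np.array([0, -1, 1000, -1000], dtype="<i2")
raw = original.tobytes()

# What the question did: each byte (0..255) becomes one int16 element.
wrong = np.array(bytearray(raw), dtype=np.int16)

# Reinterpreting the same bytes two at a time recovers the samples.
right = np.frombuffer(raw, dtype="<i2")
```

Note that numpy.frombuffer only applies to the raw sample data; for a complete WAV file like `response`, the header still has to be parsed or skipped, which soundfile does for you.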

AudioSegment.from_raw() will also work when you have a continuous stream of bytes:
import io
from pydub import AudioSegment
# current_data is the stream of bytes that you receive
s = io.BytesIO(current_data)
# from_raw takes the audio parameters as keyword arguments
audio = AudioSegment.from_raw(s, sample_width=sample_width, frame_rate=frame_rate, channels=channels)
audio.export(filename, format='wav')

To add a WAV file header to raw audio bytes (adapted from the wave library):
import struct
def write_header(_bytes, _nchannels, _sampwidth, _framerate):
    WAVE_FORMAT_PCM = 0x0001
    initlength = len(_bytes)
    bytes_to_add = b'RIFF'
    # number of complete frames, and the data length they occupy
    _nframes = initlength // (_nchannels * _sampwidth)
    _datalength = _nframes * _nchannels * _sampwidth
    bytes_to_add += struct.pack('<L4s4sLHHLLHH4s',
                                36 + _datalength, b'WAVE', b'fmt ', 16,
                                WAVE_FORMAT_PCM, _nchannels, _framerate,
                                _nchannels * _framerate * _sampwidth,
                                _nchannels * _sampwidth,
                                _sampwidth * 8, b'data')
    bytes_to_add += struct.pack('<L', _datalength)
    return bytes_to_add + _bytes
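An alternative to packing the header by hand is to let the wave module write it into an in-memory buffer. This is a minimal sketch; the helper name pcm_to_wav_bytes and the sample parameters are hypothetical.

```python
import io
import wave


def pcm_to_wav_bytes(raw, nchannels, sampwidth, framerate):
    """Wrap raw PCM bytes in a WAV container (hypothetical helper)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(nchannels)
        w.setsampwidth(sampwidth)
        w.setframerate(framerate)
        w.writeframes(raw)
    return buf.getvalue()


# 16 bytes of made-up data: 8 mono frames at 2 bytes per sample.
raw = bytes(range(16))
wav_bytes = pcm_to_wav_bytes(raw, nchannels=1, sampwidth=2, framerate=16000)

# Parse the result again to confirm the header and payload.
with wave.open(io.BytesIO(wav_bytes), "rb") as r:
    params = r.getparams()
    recovered = r.readframes(r.getnframes())
```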

I faced the same problem while streaming, and I used the answers above to write a complete function.
In my case, the byte array came from streaming an audio file (the frontend), and the backend needed to process it as an ndarray.
This code simulates how the frontend sends the audio file as chunks that are accumulated into a byte array:
audio_file_path = 'offline_input/zoom283.wav'
chunk = 1024
wf = wave.open(audio_file_path, 'rb')
audio_input = b''
d = wf.readframes(chunk)
while len(d) > 0:
    # append the current chunk before reading the next one,
    # otherwise the first chunk is dropped
    audio_input = audio_input + d
    d = wf.readframes(chunk)
The required imports:
import io
import wave
import numpy as np
import scipy.io.wavfile
import soundfile as sf
from scipy.io.wavfile import write
Finally, the backend takes the byte array and converts it to an ndarray:
def convert_bytearray_to_wav_ndarray(input_bytearray: bytes, sampling_rate=16000):
    # wrap the raw 16-bit PCM bytes in an in-memory WAV file
    byte_io = io.BytesIO()
    write(byte_io, sampling_rate, np.frombuffer(input_bytearray, dtype=np.int16))
    output_wav = byte_io.read()
    # decode the in-memory WAV into an ndarray
    output, samplerate = sf.read(io.BytesIO(output_wav))
    return output
output = convert_bytearray_to_wav_ndarray(input_bytearray=audio_input)
The output represents the audio data to be processed by the backend.
To check that the audio has been received correctly, we write it to disk:
scipy.io.wavfile.write("output1.wav", 16000, output)
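The chunk accumulation can also be sketched with the standard library alone, collecting chunks in a list and joining them once at the end. The in-memory source file and its contents are stand-ins made up for this example.

```python
import io
import wave

# Build a small in-memory WAV as a stand-in for the streamed file.
pcm = bytes(range(256)) * 40  # 10240 bytes = 5120 mono 16-bit frames
src = io.BytesIO()
with wave.open(src, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(16000)
    w.writeframes(pcm)
src.seek(0)

chunk = 1024  # frames per simulated network chunk
chunks = []
with wave.open(src, "rb") as wf:
    # iter() with a sentinel stops at the first empty read.
    for d in iter(lambda: wf.readframes(chunk), b""):
        chunks.append(d)

# Join once at the end instead of concatenating bytes in the loop.
audio_input = b"".join(chunks)
```

Joining a list of chunks avoids repeatedly copying a growing bytes object, which matters for long streams.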

Writing WAV file using Python, Numpy array and WAVE module

I am trying to implement the Karplus-Strong algorithm.
Everything looks fine when I play the collected NumPy array (representing a guitar chord) in a Jupyter Notebook using Audio(y, rate=Fs).
Unfortunately, writing the NumPy array y to a WAV file with the wave module gives an incorrect result when I use the following Python code:
noise_output = wave.open('k-s.wav', 'w')
noise_output.setparams((1, 4, Fs, 0, 'NONE', 'not compressed'))
for i in range(0, len(y)):
    value = y[i]
    packed_value = struct.pack('f', value)
    noise_output.writeframes(packed_value)
noise_output.close()
Each element of y is
<type 'numpy.float64'>
How should I amend the writing loop in order write the WAV file correctly?
Some more information about the issue. Before writing to WAV, the first elements of the y array are:
[ 0.33659756 0.33659756 -0.43915295 -0.87036152 1.40708988 0.32123558
-0.6889402 1.9739982 -1.29587159 -0.12299964 2.18381762 0.82228042
0.24593503 -1.28067426 -0.67568838 -0.01843234 -1.830472 1.2729578
-0.56575346 0.55410736]
After writing the elements to the WAV file, closing it, and reading it back, I got this for the first 20 elements of the collected array:
[ 1051481732 1051481732 -1092560728 -1084305405 1068768133 1050966269
-1087349149 1073523705 -1079648481 -1107564740 1074512811 1062371576
1048303204 -1079775966 -1087571478 -1130954901 -1075163928 1067642952
-1089415880 1057872379]
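The large integers are what one would expect if the four bytes of each packed float are later decoded as a 32-bit integer sample. The wave module writes PCM data, so a sample width of 4 is presumably read back as int32. A small sketch of that reinterpretation, using the first value from the question:

```python
import struct

value = 0.33659756  # first element of y, as printed in the question

# struct.pack('f', ...) stores the value as 4 bytes of IEEE 754 float32.
packed = struct.pack("<f", value)

# Reading the same 4 bytes as a signed 32-bit integer gives a large number.
as_int32 = struct.unpack("<i", packed)[0]

# Reinterpreting the integer's bytes as float32 recovers the original value.
back_to_float = struct.unpack("<f", struct.pack("<i", as_int32))[0]
```

Here `as_int32` should land on (or within rounding of) the first integer listed in the question, which suggests the data was written intact and only the interpretation differs.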

Here are code samples to write a (stereo) wave file using the wave standard library.
I included two examples: one using numpy, and one that doesn't require any dependencies.
Using a numpy array
Note that if your data is in a numpy array, no need for the struct library.
import wave
import numpy as np
samplerate = 44100
# A note on the left channel for 1 second.
t = np.linspace(0, 1, samplerate)
left_channel = 0.5 * np.sin(2 * np.pi * 440.0 * t)
# Noise on the right channel.
right_channel = np.random.random(size=samplerate)
# Put the channels together; after the transpose the shape is (44100, 2).
audio = np.array([left_channel, right_channel]).T
# Convert to (little-endian) 16 bit integers.
audio = (audio * (2 ** 15 - 1)).astype("<h")
with wave.open("sound1.wav", "w") as f:
    # 2 Channels.
    f.setnchannels(2)
    # 2 bytes per sample.
    f.setsampwidth(2)
    f.setframerate(samplerate)
    f.writeframes(audio.tobytes())
Using a list
This is (almost) the same code but without using numpy. No external dependencies are required.
import math
import random
import struct
import wave
samplerate = 44100
left_channel = [
    0.5 * math.sin(2 * math.pi * 440.0 * i / samplerate) for i in range(samplerate)
]
right_channel = [random.random() for _ in range(samplerate)]
with wave.open("sound2.wav", "w") as f:
    f.setnchannels(2)
    f.setsampwidth(2)
    f.setframerate(samplerate)
    for samples in zip(left_channel, right_channel):
        for sample in samples:
            sample = int(sample * (2 ** 15 - 1))
            f.writeframes(struct.pack("<h", sample))
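The examples above assume samples in the range [-1, 1]. The array in the question contains values outside that range (for example 2.18), so one option is to normalize by the peak before converting to 16-bit integers. This is a sketch under that assumption; the Fs value and the in-memory buffer are illustrative.

```python
import io
import wave

import numpy as np

Fs = 8000  # assumed sample rate; the question does not state it
# A few of the values printed in the question.
y = np.array([0.33659756, 0.33659756, -0.43915295, -0.87036152,
              1.40708988, 2.18381762])

# Scale so the largest magnitude maps to full scale, then convert to int16.
peak = np.max(np.abs(y))
scaled = y / peak if peak > 0 else y
pcm = np.round(scaled * 32767).astype("<i2")

buf = io.BytesIO()
with wave.open(buf, "wb") as f:
    f.setnchannels(1)
    f.setsampwidth(2)  # 2 bytes per sample matches the int16 data
    f.setframerate(Fs)
    f.writeframes(pcm.tobytes())

# Read the frames back to confirm they match what was written.
buf.seek(0)
with wave.open(buf, "rb") as f:
    read_back = np.frombuffer(f.readframes(f.getnframes()), dtype="<i2")
```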

import scipy.io.wavfile
scipy.io.wavfile.write("karplus.wav", Fs, y)
Tada! AFAIK works with float64 and float32, and probably others. For stereo, shape must be (nb_samples, 2). See scipy.io.wavfile.write.

Read and write wave file to and from a file:
from scipy.io import wavfile
sampling_rate, data = wavfile.read(wpath)
wavfile.write('abc1.wav', sampling_rate, data)

How to program Power Spectrum of .wav file

So I am trying to calculate the power spectrum of noise I recorded from the sun from a .wav file it recorded to. So far my code is (NEW CODE FROM OLD POST):
import pyaudio
import sys
import struct
import numpy
from pylab import *
import wave
import pyfits
sundata = ('sun_noise_ouput.wav')
chunk = 1024
FORMAT = pyaudio.paInt16 # 16-bit integers
CHANNELS = 1
RATE = 25000
RECORD_SECONDS = 120
p = pyaudio.PyAudio()
# Convert to pair of bytes to numerical datatype
N = len(sundata)/2
data = numpy.zeros(N,dtype=float)
for i in range(N):
    data[i] = struct.unpack('h',sundata[2*i:2*(i+1)])[0]
column = pyfits.Column(name='integer data', array=data, format="J")
fitsoutput = pyfits.new_table([column])
fitsoutput.writeto('sun_noise_output.fits', clobber=True)
wf = wave.open('sun_noise_output.wav', 'wb')
wf.setnchannels(CHANNELS)
wf.setsampwidth(p.get_sample_size(FORMAT))
wf.setframerate(RATE)
wf.writeframes(sundata)
wf.close()
dataft = numpy.fft.fft(data)
powerspectrum = abs(dataft)**2
figure()
plot(range(N),data)
figure()
plot(range(N),powerspectrum)
show()
It may also help to note that when I try playing the file, it plays no audio and reports a length of 0:00 seconds.
Also, when I download a sample from NASA's homepage, there is no playback audio either. [Graphs omitted]
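For reference, a power spectrum of real-valued samples can be sketched with numpy.fft.rfft and numpy.fft.rfftfreq. The synthetic 1000 Hz tone below is a stand-in assumption; with a real recording, the samples would first have to be read from the file (for example with wave.open(...).readframes and numpy.frombuffer), whereas the snippet above appears to slice the filename string itself.

```python
import numpy as np

RATE = 25000  # sample rate from the question
N = RATE      # one second of samples
t = np.arange(N) / RATE
data = np.sin(2 * np.pi * 1000.0 * t)  # synthetic stand-in for the recording

# rfft returns only the non-negative frequencies of a real signal.
dataft = np.fft.rfft(data)
powerspectrum = np.abs(dataft) ** 2

# Frequency in Hz of each bin, so the plot axis is meaningful.
freqs = np.fft.rfftfreq(N, d=1.0 / RATE)
peak_freq = freqs[np.argmax(powerspectrum)]
```

Plotting `powerspectrum` against `freqs`, instead of against the sample index, labels the horizontal axis in hertz.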
