Binary string to wav file - python

I'm trying to write a wav upload function for my webapp. The front end portion seems to be working great. The problem is my backend (python). When it receives the binary data I'm not sure how to write it to a file. I tried using the basic write functon, and the sound is corrupt... Sounds like "gobbly-gook". Is there a special way to write wav files in Python?
Here is my backend... Not really much to it.
form = cgi.FieldStorage()
fileData = str(form.getvalue('data'))
with open("audio", 'w') as file:
file.write(fileData)
I even tried...
with open("audio", 'wb') as file:
file.write(fileData)
I am using aplay to play the sound, and I noticed that all the properties are messed up as well.
Before:
Signed 16 bit Little Endian, Rate 44100 Hz, Stereo
After upload:
Unsigned 8 bit, Rate 8000 Hz, Mono

Perhaps the wave module might help?
import wave
import struct
import numpy as np
rate = 44100
def sine_samples(freq, dur):
# Get (sample rate * duration) samples on X axis (between freq
# occilations of 2pi)
X = (2*np.pi*freq/rate) * np.arange(rate*dur)
# Get sine values for these X axis samples (as integers)
S = (32767*np.sin(X)).astype(int)
# Pack integers as signed "short" integers (-32767 to 32767)
as_packed_bytes = (map(lambda v:struct.pack('h',v), S))
return as_packed_bytes
def output_wave(path, frames):
# Python 3.X allows the use of the with statement
# with wave.open(path,'w') as output:
# # Set parameters for output WAV file
# output.setparams((2,2,rate,0,'NONE','not compressed'))
# output.writeframes(frames)
output = wave.open(path,'w')
output.setparams((2,2,rate,0,'NONE','not compressed'))
output.writeframes(frames)
output.close()
def output_sound(path, freq, dur):
# join the packed bytes into a single bytes frame
frames = b''.join(sine_samples(freq,dur))
# output frames to file
output_wave(path, frames)
output_sound('sine440.wav', 440, 2)
EDIT:
I think in your case, you might only need:
packedData = map(lambda v:struct.pack('h',v), fileData)
frames = b''.join(packedData)
output_wave('example.wav', frames)
In this case, you just need to know the sampling rate. Check the wave module for information on the other output file parameters (i.e. the arguments to the setparams method).

The code I pasted will write a wav file as long as the data isn't corrupt. It was not necessary to use the wave module.
with open("audio", 'w') as file:
file.write(fileData)
I was originally reading the file in Javascript as FileAPI.readAsBinaryString. I changed this to FileAPI.readAsDataURL, and then decoded it in python using base64.decode(). Once I decoded it I was able to just write the data to a file. The .wav file was in perfect condition.

Related

How to process a wav file for WebCRT VAD on Python

I'm learning how to use py-webrtcvad library for Python and trying to pick out segments containing speech on a .wav file. Right now, I don't know how to split a .wav files into smaller frames to put it into a vad object. Here's my code below
spf = wave.open("test4.wav", "rb")
signal = spf.readframes(-1)
vad = webrtcvad.Vad(3)
sample_rate = 16000
frame_duration = 10 # ms
#TODO: turn signal into segments and run the below print statement in a for loop
print('Contains speech: %s' % (vad.is_speech(signal, sample_rate)))
As I have read from https://github.com/wiseman/py-webrtcvad, the frames fed into the vad object must be 16-bit and either 10,20 or 30ms long. My sample is about 9s long and now I don't know how to split it into the appropriate frames. I have tested the example.py in the repo, but it is a bit too complicated for me so I tried to do a very simple example first.
Thank you.

Python3 modifying wav audio data correctly

Learning how to modify different types of audio files, .wav, .mp3, etcetera using Python3 using the wave module. Specifically .wav file format, in this regard for this question. Presently, I know there are ISO standards for audio formats, and any references for this subject are greatly appreciated regarding audio standards for the .wav file format as well on a side note.
But in terms of my question, simply ignoring the RIFF, FMT headers, in a .wav file using the Python3 wave module import.
Is there a more efficient way to skip the RIFF headers, other containers, and go straight to the data container to modify its contents?
This crude example simply is converting a two-channel audio .wav file to a single-channel audio .wav file while modifying all values to (0, 0).
import wave
import struct
# Open Files
inf = wave.open(r"piano2.wav", 'rb')
outf = wave.open(r"output.wav", 'wb')
# Input Parameters
ip = list(inf.getparams())
print('Input Parameters:', ip)
# Example Output: Input Parameters: [2, 2, 48000, 302712, 'NONE', 'not compressed']
# Output Parameters
op = ip[:]
op[0] = 1
outf.setparams(op)
number_of_channels, sample_width, frame_rate, number_of_frames, comp_type, comp_name = ip
format = '<{}h'.format(number_of_channels)
print('# Channels:', format)
# Read >> Second
for index in range(number_of_frames):
frame = inf.readframes(1)
data = struct.unpack(format, frame)
# Here, I change data to (0, 0), testing purposes
print('Before Audio Data:', data)
print('After Modifying Audio Data', (0, 0))
# Change Audio Data
data = (0, 0)
value = data[0]
value = (value * 2) // 3
outf.writeframes(struct.pack('<h', value))
# Close In File
inf.close()
# Close Out File
outf.close()
Is there a better practice or reference material if simply just modifying data segments of .wav files?
Say you wanted to literally add a sound at a specific timestamp, that would be a more appropriate result to my question.
Performance comparison
Let's examine first 3 ways to read WAVE files.
The slowest one - wave module
As you might have noticed already, wave module can be painfully slow. Consider this code:
import wave
import struct
wavefile = wave.open('your.wav', 'r') # check e.g. freesound.org for samples
length = wavefile.getnframes()
for i in range(0, length):
wavedata = wavefile.readframes(1)
data = struct.unpack("<h", wavedata)
For a WAVE as defined below:
Input File : 'audio.wav'
Channels : 1
Sample Rate : 48000
Precision : 16-bit
Duration : 00:09:35.71 = 27634080 samples ~ 43178.2 CDDA sectors
File Size : 55.3M
Bit Rate : 768k
Sample Encoding: 16-bit Signed Integer PCM
it took on average 27.7s to load the full audio. The flip side to the wave module it is that is available out of the box and will work on any system.
The convenient one - audiofile
A much more convenient and faster solution is e.g. audiofile. According to the project description, its focus is on reading speed.
import audiofile as af
signal, sampling_rate = af.read(audio.wav)
This gave me on average 33 ms to read the mentioned file.
The fastest one - numpy
If we decide to skip header (as OP asks) and go solely for speed, numpy is a great choice:
import numpy as np
byte_length = np.fromfile(filename, dtype=np.int32, count=1, offset=40)[0]
data = np.fromfile(filename, dtype=np.int16, count=byte_length // np.dtype(np.int16).itemsize, offset=44)
The header structure (that tells us what offset to use) is defined here.
The execution of that code takes ~6 ms, 5x less than the audioread. Naturally it comes with a price / preconditions: we need to know in advance what is the data type.
Modifying the audio
Once you have the audio in a numpy array, you can modify it at will, you can also decide to stream the file rather than reading everything at once. Be warned though: since sound is a wave, in a typical scenario simply injecting new data at arbitrary time t will lead to distortion of that audio (unless it was silence).
As for writing the stream back, "modifying the container" would be terribly slow in Python. That's why you should either use arrays or switch to a more suitable language (e.g. C).
If we go with arrays, we should mind that numpy knows nothing about the WAVE format and therefore we'd have to define the header ourselves and write individual bytes. Perfectly feasible exercise, but clunky. Luckily, scipy provides a convenient function that has the benefits of numpy speed (it uses numpy underneath), while making the code much more readable:
from scipy.io.wavfile import write
fs = np.fromfile('audio.wav', dtype=np.int32, count=1, offset=24)[0] # we need sample rate
with open('audio_out.wav', 'a') as fout:
new_data = data.append(np.zeros(2 * fs)) # append 2 seconds of zeros
write(fout, fs, new_data)
It could be done in a loop, where you read a chunk with numpy / scipy, modify the array (data) and write to the file (with a for append).

Trying to split audio into 20ms chunks and making a spectrogram but its looking weird

I basically have this audio file that is 16-bit PCM WAV mono at 44100hz and I'm trying to convert it into a spectrogram. But I want a spectrogram of the audio every 20ms (Trying this for speech recognition), but whenever I try to compare what I have to Audacity, its really different. I'm kind of new to python so I was trying to base this off of my java knowledge. Any help would be appreciated. I think I'm either splitting the read samples incorrectly (What I did was split it every 220 elements in the array since I believe Audio Data is just samples in the time domain to get it to 20ms audio)
Here's the code I have right now:
import librosa.display
import numpy
audioPath = 'C:\\Users\\pawar\\Desktop\\Resister.wav'
audioData, sampleRate = librosa.load(audioPath, sr=None)
print(sampleRate)
new = numpy.zeros(shape=(220, 1))
counter = 0
for i in range(0, len(audioData), 882):
new = (audioData[i: i + 882])
STFT = librosa.stft(new, n_fft=882)
print(type(STFT))
audioDatainDB = librosa.amplitude_to_db(abs(STFT))
print(type(audioDatainDB))
librosa.display.specshow(audioDatainDB, sr=sampleRate, x_axis='time', y_axis='hz')
#plt.figure(figsize=(20,10))
plt.show()
counter += 1
print("Your local debug print statement ", counter)
As for the values, well I was playing around with them quite a bit trying to get it to work. Doesn't seem to be of any luck ;/
Here's the output:
https://i.stack.imgur.com/EVntx.png
And here's what Audacity shows:
https://i.stack.imgur.com/GIGy8.png
I know its not 20ms in the audacity one but you can see the two don't look even a bit similar

Read and write stereo .wav file with python + metadatas

What's the easiest way to read and write a stereo .wav file in Python ?
Should I use scipy.io.wavfile.read ?
Should I use a 2-dimension array (how ?) in order to have x[n,j] where j is the channel number?
I also want to read/write metadatas stored in the wav file like the markers, MIDI root note (Soundforge, as well as other sound editors, can read/write this specific .wav metadata called "MIDI root note")
Thank you
PS : I already know how to do with a mono file :
from scipy.io.wavfile import read
(fs, x) = read('test.wav')
Here is an updated version of scipy.io.wavfile that adds:
24 bit .wav files support for read/write,
access to cue markers,
cue marker labels,
some other metadata like pitch (if defined), etc.
wavfile.py (enhanced)
Old (original) answer: a solution for only a part of the question (ie read stereo samples):
(fs, x) = read('stereo_small-file.wav')
print len(x.shape) # 1 if mono, 2 if stereo
# if stereo, x is a 2-dimensional array, so we can access both channels with :
print x[:,0]
print x[:,1]
Take a look at Pythons' wave module

Using Wave Python Module to Get and Write Audio

So, I'm trying to use the Python Wave module to get an audio file and basically get all of the frames from it, examine them, and then write them back to another file. I tried to output the sound that I'm reading to another file just now, but it came out either as noise, or as no sound at all. So, I'm pretty sure that I'm not analyzing the file and getting the correct frames...? I'm dealing with a stereo 16-bit sound file. While I could use a simpler file to just understand the process, I eventually want to be able to accept any kind of sound file to work with, so I need to understand what the problem is.
I also noted that 32-bit sound files wouldn't be read by the Wave module - it gave me an error of "Unknown Format". Any ideas about that? Is it something I can bypass so that I could at least, for example, read 32-bit audio files, even if I can only 'render' 16-bit files?
I'm somewhat aware that wave files are interleaved between the left and right channels (first sample's for the left channel, second's for the right, etc)., but how do I separate the channels? Here's my code. I cut out the output code to just see if I'm reading the files correctly. I'm using Python 2.7.2:
import scipy
import wave
import struct
import numpy
import pylab
fp = wave.open('./sinewave16.wav', 'rb') # Problem loading certain kinds of wave files in binary?
samplerate = fp.getframerate()
totalsamples = fp.getnframes()
fft_length = 2048 # Guess
num_fft = (totalsamples / fft_length) - 2
temp = numpy.zeros((num_fft, fft_length), float)
leftchannel = numpy.zeros((num_fft, fft_length), float)
rightchannel = numpy.zeros((num_fft, fft_length), float)
for i in range(num_fft):
tempb = fp.readframes(fft_length / fp.getnchannels() / fp.getsampwidth());
#tempb = fp.readframes(fft_length)
up = (struct.unpack("%dB"%(fft_length), tempb))
#up = (struct.unpack("%dB"%(fft_length * fp.getnchannels() * fp.getsampwidth()), tempb))
#print (len(up))
temp[i,:] = numpy.array(up, float) - 128.0
temp = temp * numpy.hamming(fft_length)
temp.shape = (-1, fp.getnchannels())
fftd = numpy.fft.rfft(temp)
pylab.plot(abs(fftd[:,1]))
pylab.show()
#Frequency of an FFT should be as follows:
#The first bin in the FFT is DC (0 Hz), the second bin is Fs / N, where Fs is the sample rate and N is the size of the FFT. The next bin is 2 * Fs / N. To express this in general terms, the nth bin is n * Fs / N.
# (It would appear to me that n * Fs / N gives you the hertz, and you can use sqrt(real portion of number*r + imaginary portion*i) to find the magnitude of the signal
Currently, this will load the sound file, unpack it into a struct, and plot the sound file so that I can look at it, but I don't think it's getting all of the audio file, or it's not getting it correctly. Am I reading the wave file into the struct correctly? Are there any up-to-date resources on using Python to read and analyze wave / audio files? Any help would be greatly appreciated.
Perhaps you should try the SciPy io.wavefile module:
http://docs.scipy.org/doc/scipy/reference/io.html

Categories

Resources