I've got some raw ADPCM compressed audio streams and I want to play them with pygame, but as far as I know this isn't possible with pygame. How can I decompress them with python to normal PCM streams (or something else pygame can play) and then play them with pygame?
I already tried the audioop module as it has got something that converts ADPCM to linear streams but I neither know what linear streams are nor how to use the function that converts them.
The short version: "Linear" is what you want.* So, the function you want is adpcm2lin.
How do you use it?
Almost everything in audioop works the same way: you loop over frames, and call a function on each frame. If your input data has some inherent frame size, like when you're reading from an MP3 file (using an external library), or your output library demands some specific frame size, you're a bit constrained on how you determine your frames. But when you're dealing with raw PCM formats, the frames are whatever size you want, from a single sample to the whole file.**
Let's do the whole file first, for simplicity:
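import audioop  # deprecated since Python 3.11, removed in 3.13

with open('spam.adpcm', 'rb') as f:
    adpcm = f.read()
# width=2 gives 16-bit output; None means "no previous decoder state"
pcm, _ = audioop.adpcm2lin(adpcm, 2, None)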
If your adpcm file is too big to load into memory and process all at once, you'll need to keep track of the state, so:
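# decode_blocks is an illustrative name; the default blocksize is arbitrary
def decode_blocks(path, blocksize=4096):
    state = None  # decoder state carried from one block to the next
    with open(path, 'rb') as f:
        while True:
            adpcm = f.read(blocksize)
            if not adpcm:
                return
            pcm, state = audioop.adpcm2lin(adpcm, 2, state)
            yield pcm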
Of course I'm assuming that you don't need to convert the sample rate or do anything else. If you do, any such conversions should come after the ADPCM decompression.***
* The long version: "Linear" means the samples are encoded directly, rather than mapped through another algorithm. For example, if you have a 16-bit A-to-D, and you save the audio in an 8-bit linear PCM file, you're just saving the top 8 bits of each sample. That gives you a very narrow dynamic range, so quieter sounds get lost in the noise. There are various companding algorithms that give you a much wider dynamic range for the same number of bits (at the cost of losing other information elsewhere, of course); see μ-law algorithm for details on how they work. But if you can stay in 16 bits, linear is fine.
** Actually, with 4-bit raw ADPCM, you really can't do a single sample… but you can do 2 samples, which is close enough.
*** If you're really picky, you might want to convert to 32-bit first, then do the work, then convert back to 16-bit to avoid accumulating losses. But when you're starting with 4-bit ADPCM, you aren't going for audiophile sound here.
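Once you have the linear PCM bytes, one way to get them into something pygame can load is to wrap them in a WAV container with the standard wave module. This is only a sketch: write_wav is a made-up helper name, and the sample rate and channel count are guesses, because a raw ADPCM stream has no header to tell you what they are.

```python
import wave

def write_wav(path, pcm, sample_rate=8000, channels=1, sample_width=2):
    """Wrap raw linear PCM bytes in a WAV container.

    sample_rate and channels are assumptions: raw ADPCM carries no header,
    so the real values must come from wherever the data originated.
    sample_width=2 matches the width passed to adpcm2lin.
    """
    with wave.open(path, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(sample_width)
        w.setframerate(sample_rate)
        w.writeframes(pcm)
```

After that, pygame.mixer.Sound(path) should be able to load the file. Alternatively, pygame.mixer.Sound(buffer=pcm) takes the bytes directly, provided the mixer was initialised with a matching frequency, sample size and channel count.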
Related
I'm building a simple Python application that involves altering the speed of an audio track.
(I acknowledge that changing the frame rate of an audio track also changes its pitch, and I do not care about the pitch being altered.)
I have tried using solution from abhi krishnan using pydub, which looks like this.
from pydub import AudioSegment
sound = AudioSegment.from_file(…)

def speed_change(sound, speed=1.0):
    # Manually override the frame_rate. This tells the computer how many
    # samples to play per second
    sound_with_altered_frame_rate = sound._spawn(sound.raw_data, overrides={
        "frame_rate": int(sound.frame_rate * speed)
    })
    # convert the sound with altered frame rate to a standard frame rate
    # so that regular playback programs will work right. They often only
    # know how to play audio at standard frame rate (like 44.1k)
    return sound_with_altered_frame_rate.set_frame_rate(sound.frame_rate)
However, the audio with changed speed sounds distorted, or crackly, which does not happen when I use Audacity to do the same thing. I hope to find a way to reproduce in Python how Audacity (or other digital audio editors) changes the speed of audio tracks.
I presume that the quality loss is caused by the original audio having a low frame rate (8kHz), and that .set_frame_rate(sound.frame_rate) tries to sample points of the sped-up audio at that original, low frame rate. Simple attempts at changing the frame rate of the original audio, of the audio with the altered frame rate, or of the exported audio didn't work out.
Is there a way in Pydub or in other Python modules to perform the task in the same way Audacity does?
Assuming you want to play the audio back at, say, 1.5x the speed of the original: this is equivalent to resampling the audio down to 2/3 of its original number of samples while pretending that the sampling rate hasn't changed. If that is what you are after, most DSP packages should support it (search for "audio resampling").
You can try scipy.signal.resample_poly()
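import numpy as np
from scipy.signal import resample_poly
samples = np.array(sound.get_array_of_samples(), dtype=float)  # raw_data is bytes; mono assumed
dec_data = resample_poly(samples, up=2, down=3)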
dec_data should have 2/3 as many samples as the original audio. If you play the dec_data samples at the sound's original sampling rate, you should get a sped-up version. The downside of resample_poly is that it needs a rational factor, and a large numerator or denominator makes the output less ideal. You can also try scipy's resample function, or look for other packages that support audio resampling.
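To make the "fewer samples at the same rate" idea concrete, here is a dependency-free sketch. Note that it uses plain linear interpolation, which is not what resample_poly does (that one applies a polyphase anti-aliasing filter), so expect lower quality. resample_linear is a name made up for this example.

```python
def resample_linear(samples, up, down):
    """Resample by the rational factor up/down using linear interpolation.

    Illustration only: there is no anti-aliasing filter here, unlike
    scipy.signal.resample_poly.
    """
    n_out = len(samples) * up // down
    out = []
    for i in range(n_out):
        pos = i * down / up              # position in the input, in samples
        j = int(pos)                     # input sample just before pos
        frac = pos - j                   # how far pos lies towards the next one
        nxt = samples[min(j + 1, len(samples) - 1)]
        out.append(samples[j] * (1 - frac) + nxt * frac)
    return out

# Six input samples become four, so playback at the same rate is 1.5x faster
print(resample_linear([0, 1, 2, 3, 4, 5], up=2, down=3))
```

For real audio you would most likely still want a filtered resampler, since dropping samples without low-pass filtering can alias, especially at a rate as low as 8kHz.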
I have been searching for and reading about approaches to representing raw binaries or executables as a spectrogram figure in Python. All I found was how to turn something that is already an audio file, such as a .wav, into a spectrogram
(https://fairyonice.github.io/implement-the-spectrogram-from-scratch-in-python.html)
(https://pythontic.com/visualization/signals/spectrogram)
My understanding is that any file on my computer is a list of 0's and 1's. So if I want to represent an executable as a spectrogram, it is important first to define the sampling rate and bit depth. I was thinking of using a sample rate of 8000 samples/second, with each sample being 1 byte. Then we can represent the generated wave signal and the spectrogram.
Please let me know if I am misunderstanding anything.
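The plan described above can be sketched without third-party modules: read each byte as one unsigned 8-bit sample, cut the sequence into frames, and take the magnitude of a DFT per frame. This is a minimal, slow sketch under my own assumptions (byte_spectrogram, frame_size and hop are names and values chosen for illustration, and no window function is applied), not a reference implementation.

```python
import cmath

def byte_spectrogram(data, frame_size=64, hop=32):
    """Treat each byte as one unsigned 8-bit sample and return a list of
    magnitude spectra, one per frame (naive DFT, O(n^2) per frame)."""
    # Centre the unsigned bytes around zero: 0..255 -> -128..127
    samples = [b - 128 for b in data]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        spectrum = []
        for k in range(frame_size // 2 + 1):   # bins from 0 Hz up to Nyquist
            acc = sum(x * cmath.exp(-2j * cmath.pi * k * n / frame_size)
                      for n, x in enumerate(frame))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames
```

The sampling rate never enters the computation; it only labels the axes. With 8000 samples/second and 64-sample frames, bin k corresponds to k * 8000 / 64 = k * 125 Hz, and each hop of 32 samples is 4 ms.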
I am attempting to write a program that detects the frequency of a sound in a .wav file. I would like to do this with exclusively native python, no third-party modules. I used the built-in read() and open() functions and got some strange results:
with open('pcm-test.wav', 'rb') as f:
    data = f.read(255)
    print(data)
When I run it, I get this:
>>>
RIFF$ÈWAVEfmt data
>>>
What am I doing wrong? Any advice would be appreciated. Thanks!
EDIT
I suppose I phrased this wrong. I'm looking for the frequency of the tone in the .wav file, not the sample rate. I have an algorithm for computing frequency based on an array of amplitudes, but I have no way of obtaining that array. I guess my question is: how can I get the raw amplitude data from the .wav file and store it as a list, tuple, etc.?
What am I doing wrong?
Nothing, everything works just fine. Next, parse the WAVE header (http://soundfile.sapp.org/doc/WaveFormat/) and get your sample rate.
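For the amplitude data asked about in the edit, the standard library's wave module can do the header parsing for you, and struct can turn the frame bytes into integers. A minimal sketch, assuming a 16-bit PCM file (read_samples is just a name picked for this example):

```python
import struct
import wave

def read_samples(path):
    """Return (sample_rate, samples) for a 16-bit PCM WAV file.

    For multi-channel files the samples are interleaved (L, R, L, R, ...).
    """
    with wave.open(path, "rb") as w:
        if w.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit samples")
        rate = w.getframerate()
        raw = w.readframes(w.getnframes())
    # WAV stores 16-bit samples as little-endian signed integers
    count = len(raw) // 2
    samples = list(struct.unpack("<%dh" % count, raw))
    return rate, samples
```

The returned list is what a frequency-detection routine would consume, and the sample rate is needed to convert its result into Hz.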
I wired up the MCP3008 ADC chip to an Electret Microphone and to my pi. I'm reading the input using bit-banging in python, and I'm getting an integer from 0-1023.
I followed this tutorial to do the bit-banging: https://learn.adafruit.com/reading-a-analog-in-and-controlling-audio-volume-with-the-raspberry-pi/connecting-the-cobbler-to-a-mcp3008
My question is: how do I take this integer and convert it to something meaningful? Can I somehow write these bytes to a file in python to get raw audio data that Audacity can play? Right now when I try to write the values they just show up as the integer instead of binary. I'm really new to python, and I've found this link for converting the raw data, but I'm having trouble generating the raw data first: Python open raw audio data file
I'm not even sure what these values represent, are they PCM data that I have to do math with related to time?
What you are doing here is sampling a time-varying analogue signal, so yes, the values you obtain are PCM, but with a huge caveat (see below). If you write them as a WAV file (possibly using this to help you), you will be able to open them in Audacity. You could convert the values either to unsigned 8-bit (by truncation) or to 16-bit signed with a shift and a subtraction.
The caveat is that PCM is the modulation of a sample clock with the signal. The clock signal in your case is the frequency with which you bit-bang the ADC.
Practically, it is very difficult to arrange for this to be regular in software, particularly when bit-banging the device from a high-level language such as Python. You need to sample at a rate of at least twice the bandwidth of the signal (Nyquist's law), so realistically 8kHz for telephone speech quality.
An irregular sample clock will also result in significant artefacts - which you will hear as distortion.
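The shift-and-subtract conversion to 16-bit signed could look roughly like this. It is a sketch, not tested against real hardware: adc_to_pcm16 and save_recording are names I made up, and sample_rate has to be whatever rate you actually managed to poll the ADC at.

```python
import struct
import wave

def adc_to_pcm16(values):
    """Convert 10-bit unsigned readings (0..1023) to 16-bit signed PCM bytes."""
    out = []
    for v in values:
        # Shift left by 6 to span the 16-bit range, then subtract the midpoint
        out.append((v << 6) - 32768)
    return struct.pack("<%dh" % len(out), *out)

def save_recording(path, values, sample_rate):
    """Write the readings as a mono 16-bit WAV file.

    sample_rate is an assumption supplied by the caller; if the real polling
    rate was different or irregular, playback will be pitched or distorted.
    """
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(sample_rate)
        w.writeframes(adc_to_pcm16(values))
```

A reading of 512 (mid-scale) maps to 0, a reading of 0 maps to -32768, and 1023 maps to 32704.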
I'm interested in precisely extracting portions of a PCM WAV file, down to the sample level. Most audio modules seem to rely on platform-specific audio libraries. I want to make this cross-platform, and speed is not an issue. Are there any native Python audio modules that can do this?
If not, I'll have to interpret the PCM binary. While I'm sure I can dig up the PCM specs fairly easily, and raw formats are easy enough to walk, I've never actually dealt with binary data in Python before. Are there any good resources that explain how to do this? Specifically relating to audio would just be icing.
I read the question and the answers and I feel that I must be missing something completely obvious, because nobody mentioned the following two modules:
audioop: manipulate raw audio data
wave: read and write WAV files
Perhaps I come from a parallel universe and Guido's time machine is actually a space-time machine :)
Should you need example code, feel free to ask.
PS Assuming 48kHz sampling rate, a video frame at 24/1.001==23.976023976… fps is 2002 audio samples long, and at 25fps it's 1920 audio samples long.
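Since example code was offered, here is a minimal sketch of sample-accurate extraction with the wave module alone. extract_frames is a name invented for this example; a "frame" in wave terms is one sample per channel.

```python
import wave

def extract_frames(src_path, dst_path, start, count):
    """Copy `count` frames beginning at frame index `start` into a new WAV."""
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        src.setpos(start)                 # seek to an exact frame
        frames = src.readframes(count)
    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)             # same channels, width and rate
        dst.writeframes(frames)           # header frame count is fixed up on close
```

With the numbers from the PS, the audio for video frame n at 25fps and 48kHz would be extract_frames(src, dst, n * 1920, 1920).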
I've only written a PCM reader in C++ and Java, but the format itself is fairly simple. A decent description can be found here: http://ccrma.stanford.edu/courses/422/projects/WaveFormat/
Past that you should be able to just read it in (binary file reading, http://www.johnny-lin.com/cdat_tips/tips_fileio/bin_array.html) and just deal with the resulting array. You may need to use some bit shifting to get the alignments correct (https://docs.python.org/reference/expressions.html#shifting-operations) but depending on how you read it in, you might not need to.
All of that said, I'd still lean towards David's approach.
Is it really important that your solution be pure Python, or would you accept something that can work with native audio libraries on various platforms (so it's effectively cross-platform)? There are several examples of the latter at http://wiki.python.org/moin/PythonInMusic
Seems like a combination of open(..., "rb"), struct module, and some details about the wav/riff file format (probably better reference out there) will do the job.
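A rough sketch of that combination, assuming the simplest possible layout (a 16-byte "fmt " chunk immediately followed by the "data" chunk, 44 header bytes in total). Real files can carry extra chunks, in which case you would have to walk them instead. parse_wav_header is a hypothetical helper name.

```python
import struct

def parse_wav_header(data):
    """Parse the first 44 bytes of a canonical PCM WAV file into a dict."""
    (riff, _size, wave_id, fmt_id, fmt_size, audio_format, channels,
     sample_rate, _byte_rate, _block_align, bits_per_sample,
     data_id, data_size) = struct.unpack("<4sI4s4sIHHIIHH4sI", data[:44])
    if riff != b"RIFF" or wave_id != b"WAVE" or fmt_id != b"fmt ":
        raise ValueError("not a RIFF/WAVE file")
    if fmt_size != 16 or data_id != b"data":
        raise ValueError("non-canonical header; walk the chunks instead")
    return {
        "audio_format": audio_format,      # 1 means uncompressed PCM
        "channels": channels,
        "sample_rate": sample_rate,
        "bits_per_sample": bits_per_sample,
        "data_size": data_size,            # bytes of sample data after the header
    }
```

Typical use would be to open the file with open(path, "rb"), pass f.read(44) to this function, and then read data_size bytes of samples and decode them with struct.unpack.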
Just curious, what do you intend on doing with the raw sample data?
I was looking this up and I found this: http://www.swharden.com/blog/2009-06-19-reading-pcm-audio-with-python/
It requires Numpy (and matplotlib if you want to graph it)
import numpy
# dtype 'h' = signed 16-bit samples in native byte order; the file is memory-mapped, not loaded
data = numpy.memmap("test.pcm", dtype='h', mode='r')
print("VALUES:", data)
Check out the original author's site for more details.