I wired up the MCP3008 ADC chip to an electret microphone and to my Raspberry Pi. I'm reading the input using bit-banging in Python, and I'm getting an integer from 0-1023.
I followed this tutorial to do the bit-banging: https://learn.adafruit.com/reading-a-analog-in-and-controlling-audio-volume-with-the-raspberry-pi/connecting-the-cobbler-to-a-mcp3008
My question is: how do I take this integer and convert it to something meaningful? Can I somehow write these bytes to a file in Python to get raw audio data that Audacity can play? Right now when I try to write the values they just show up as the integer instead of binary. I'm really new to Python, and I've found this link for converting the raw data, but I'm having trouble generating the raw data first: Python open raw audio data file
I'm not even sure what these values represent. Are they PCM data that I have to do math on, related to time?
What you are doing here is sampling a time-varying analogue signal, so yes, the values you obtain are PCM - but with a huge caveat (see below). If you write them as a WAV file (possibly using this to help you), you will be able to open them in Audacity. You could either convert the values to unsigned 8-bit (by truncating the two least-significant bits) or to 16-bit signed with a shift and a subtraction.
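A minimal sketch of the 16-bit signed conversion, using only the standard-library wave module (the function name, sample list, and sample rate here are illustrative assumptions, not something from the original post):

```python
import wave
import struct

def write_wav(path, samples, sample_rate=8000):
    """Write a list of 10-bit ADC readings (0-1023) as a 16-bit mono WAV.

    Each 10-bit value is shifted up by 6 bits to fill the 16-bit range,
    then re-centred around zero so the result is signed PCM, which
    Audacity can open directly.
    """
    with wave.open(path, 'wb') as wav:
        wav.setnchannels(1)          # mono
        wav.setsampwidth(2)          # 16-bit samples
        wav.setframerate(sample_rate)
        frames = b''.join(
            struct.pack('<h', (s << 6) - 32768) for s in samples
        )
        wav.writeframes(frames)

# hypothetical readings from the MCP3008
write_wav('mic.wav', [0, 512, 1023, 512], sample_rate=8000)
```

The sample rate you declare here must match the rate you actually bit-banged at, or the pitch will be wrong on playback.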
The caveat is that PCM is the modulation of a sample clock with the signal. The clock signal in your case is the frequency with which you bit-bang the ADC.
Practically, it is very difficult to arrange for this to be regular in software - particularly when bit-banging the device from a high-level language such as Python. You need to sample at twice the bandwidth of the signal (the Nyquist criterion) - so realistically 8 kHz for telephone-quality speech.
An irregular sample clock will also result in significant artefacts - which you will hear as distortion.
I have been searching for and reading about approaches to representing raw binaries or executables as a spectrogram figure in Python. What I found was how to turn an existing audio file, such as a .wav, into a spectrogram:
(https://fairyonice.github.io/implement-the-spectrogram-from-scratch-in-python.html)
(https://pythontic.com/visualization/signals/spectrogram)
My understanding is that any file on my computer is a list of 0's and 1's. So if I want to represent an executable as a spectrogram, it is important first to define a sampling rate and bit depth. I was thinking of using 8000 samples/second as the sample rate, with each sample being 1 byte. Then we can plot the generated wave signal and its spectrogram.
Please let me know if I am misunderstanding anything.
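That understanding is sound; a sketch of the idea with numpy, treating each byte of an arbitrary file as one unsigned 8-bit sample and computing a windowed STFT from scratch (the function name, frame size, and hop are my own illustrative choices):

```python
import numpy as np

def byte_spectrogram(raw, frame_size=256, hop=128):
    """Treat arbitrary raw bytes as unsigned 8-bit samples and compute
    a magnitude spectrogram with a plain Hann-windowed STFT.

    Returns an array of shape (n_frames, frame_size // 2 + 1).
    """
    # interpret each byte as one sample, centred around zero
    samples = np.frombuffer(raw, dtype=np.uint8).astype(np.float64) - 128.0
    window = np.hanning(frame_size)
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size] * window
        frames.append(np.abs(np.fft.rfft(frame)))
    return np.array(frames)

# e.g. the bytes of any file on disk:
# with open('/usr/bin/ls', 'rb') as f:
#     spec = byte_spectrogram(f.read())
```

The declared "sample rate" (8000 samples/second, say) only affects how you label the axes when plotting; the spectrogram values themselves depend only on the bytes.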
By using wave in Python we can read the .wav audio format and calculate the frequency and power of a signal. But I want to calculate the frequency of the .mp3 audio format directly. I've heard a little about Pysox. Is Pysox capable of reading frames, and can we calculate the FFT and frequency using it? Or is there other software that can calculate the frequency of an MP3 file using Python?
Your question has a few parts, but I'll give it a shot: you can get the raw audio data using pydub (the same thing the wave module gives you):
import pydub

sound = pydub.AudioSegment.from_mp3("/path/to/file.mp3")
raw_data = sound.raw_data   # the public property, rather than the private _data
(note that you'll need ffmpeg or avlib installed for the mp3 decoding)
From there you should be able to use numpy. This O'Reilly post may also help
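A sketch of the numpy step: unpack the raw PCM bytes into an array and pick out the strongest frequency bin with an FFT (the function name and parameters are my own; the commented pydub usage assumes ffmpeg/libav is installed):

```python
import numpy as np

def dominant_frequency(raw_data, sample_rate, sample_width=2, channels=1):
    """Estimate the strongest frequency in raw PCM bytes (such as those
    returned by pydub's AudioSegment.raw_data) using a numpy FFT.
    """
    dtype = {1: np.int8, 2: np.int16, 4: np.int32}[sample_width]
    samples = np.frombuffer(raw_data, dtype=dtype)
    if channels > 1:
        samples = samples[::channels]   # keep the first channel only
    spectrum = np.abs(np.fft.rfft(samples))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    return freqs[np.argmax(spectrum)]

# with an mp3 it would look like:
# sound = pydub.AudioSegment.from_mp3("/path/to/file.mp3")
# print(dominant_frequency(sound.raw_data, sound.frame_rate,
#                          sound.sample_width, sound.channels))
```

For anything beyond a single dominant tone you'd window the signal and look at the full spectrum, but this is the basic shape of the calculation.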
I've got some raw ADPCM compressed audio streams and I want to play them with pygame, but as far as I know this isn't possible with pygame. How can I decompress them with python to normal PCM streams (or something else pygame can play) and then play them with pygame?
I already tried the audioop module as it has got something that converts ADPCM to linear streams but I neither know what linear streams are nor how to use the function that converts them.
The short version: "Linear" is what you want.* So, the function you want is adpcm2lin.
How do you use it?
Almost everything in audioop works the same way: you loop over frames, and call a function on each frame. If your input data has some inherent frame size, like when you're reading from an MP3 file (using an external library), or your output library demands some specific frame size, you're a bit constrained on how you determine your frames. But when you're dealing with raw PCM formats, the frames are whatever size you want, from a single sample to the whole file.**
Let's do the whole file first, for simplicity:
import audioop

with open('spam.adpcm', 'rb') as f:
    adpcm = f.read()

pcm, _ = audioop.adpcm2lin(adpcm, 2, None)
If your adpcm file is too big to load into memory and process all at once, you'll need to keep track of the state, so:
def adpcm_chunks(path, blocksize=4096):
    with open(path, 'rb') as f:
        state = None
        while True:
            adpcm = f.read(blocksize)
            if not adpcm:
                return
            pcm, state = audioop.adpcm2lin(adpcm, 2, state)
            yield pcm
Of course I'm assuming that you don't need to convert the sample rate or do anything else. If you do, any such conversions should come after the ADPCM decompression.***
* The long version: "Linear" means the samples are encoded directly, rather than mapped through another algorithm. For example, if you have a 16-bit A-to-D and you save the audio in an 8-bit linear PCM file, you're just saving the top 8 bits of each sample. That gives you a very limited dynamic range, so quieter sounds get lost in the noise. There are various companding algorithms that give you a much wider dynamic range for the same number of bits (at the cost of losing other information elsewhere, of course); see the μ-law algorithm for details on how they work. But if you can stay in 16 bits, linear is fine.
** Actually, with 4-bit raw ADPCM, you really can't do a single sample… but you can do 2 samples, which is close enough.
*** If you're really picky, you might want to convert to 32-bit first, then do the work, then convert back to 16-bit to avoid accumulating losses. But when you're starting with 4-bit ADPCM, you aren't going for audiophile sound here.
The "wave" module in Python gives me a sequence of bytes that I can read as numbers. Let's say the sample rate of my file is 11025. Is there a 'header' in those bytes that specifies this? I know I can use the wave methods to get the sample rate, but I want to ask about the .wav file structure. Does it have a header? If I get those bytes, how do I know which ones are the music and which ones are information? If I played these numbers through a speaker 11025 times per second with intensity from 0 to 255, would I hear the sound just as it is in the file?
Thanks!
.wav files are actually RIFF files under the hood. The WAVE section contains both the format information and the waveform data. Reading the codec, sample rate, sample size, and sample polarity from the format information will allow you to play the waveform data assuming you support the codec used.
So, I'm planning on trying out making a light organ with an Arduino and Python, communicating over serial to control the brightness of several LEDs. The computer will use the microphone or a playing MP3 to generate the data.
I'm not so sure how to handle the audio processing. What's a good option for python that can take either a playing audio file or microphone data (I'd prefer the microphone), and then split it into different frequency ranges and write the intensity to variables? Do I need to worry about overtones if I use the microphone?
If you're not committed to using Python, you should also look at using PureData (PD) to handle the audio analysis. Interfacing PD to the Arduino is already a solved problem, and there are a lot of pre-existing components that make working with audio easy.
Try http://wiki.python.org/moin/Audio for links to various Python audio processing packages.
The audioop package has some basic waveform manipulation functions.
See also:
Detect and record a sound with python
Detect & Record Audio in Python
PortAudio has a Python interface that would let you read data from the microphone.
For the band splitting, you could use something like a band-pass filter feeding into an envelope follower -- one filter+follower for each frequency band of interest.
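One band of that filter+follower chain can be sketched in pure Python; this uses an RBJ-cookbook band-pass biquad and a simple attack/release envelope follower (the function name, Q, and time constants are illustrative assumptions):

```python
import math

def bandpass_envelope(samples, sample_rate, centre_hz,
                      q=2.0, attack=0.01, release=0.1):
    """One light-organ band: a band-pass biquad (RBJ cookbook,
    0 dB peak gain) followed by an attack/release envelope follower.
    Takes float samples; returns one envelope value per input sample.
    """
    w0 = 2 * math.pi * centre_hz / sample_rate
    alpha = math.sin(w0) / (2 * q)
    a0 = 1 + alpha
    b0, b1, b2 = alpha / a0, 0.0, -alpha / a0
    a1, a2 = -2 * math.cos(w0) / a0, (1 - alpha) / a0

    # one-pole smoothing coefficients for rising and falling levels
    up = 1 - math.exp(-1 / (attack * sample_rate))
    down = 1 - math.exp(-1 / (release * sample_rate))

    x1 = x2 = y1 = y2 = env = 0.0
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1, y2, y1 = x1, x, y1, y
        level = abs(y)
        env += (up if level > env else down) * (level - env)
        out.append(env)
    return out
```

Run one of these per frequency band of interest and send the latest envelope value of each band over serial as that LED's brightness.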