howto stream numpy array into pyaudio stream? - python

I'm writing a code that supposed to give some audio output to the user based on his action, and I want to generate the sound rather than having a fixed number of wav files to play. Now, what I'm doing is to generate the signal in numpy format, store the data in a wav file and then read the same file into pyaudio. I think this is redundant, however, I couldn't find a way to do that. My question is, can I stream a numpy array (or a regular list) directly into my the pyaudio to play?

If its just playback and does not need to be synchronised to anything then you can just do the following:
# Open stream with correct settings
stream = self.p.open(format=pyaudio.paFloat32,
channels=CHANNELS,
rate=48000,
output=True,
output_device_index=1
)
# Assuming you have a numpy array called samples
data = samples.astype(np.float32).tostring()
stream.write(data)
I use this method and it works fine for me. If you need to record at the same time then this won't work.

If you are just looking to generate audio tones then below code may be useful,
It does need pyaudio that can be installed as
pip install pyaudio
Sample Code
#Play a fixed frequency sound.
from __future__ import division
import math
import pyaudio
#See http://en.wikipedia.org/wiki/Bit_rate#Audio
BITRATE = 44100 #number of frames per second/frameset.
#See http://www.phy.mtu.edu/~suits/notefreqs.html
FREQUENCY = 2109.89 #Hz, waves per second, 261.63=C4-note.
LENGTH = 1.2 #seconds to play sound
NUMBEROFFRAMES = int(BITRATE * LENGTH)
RESTFRAMES = NUMBEROFFRAMES % BITRATE
WAVEDATA = ''
for x in xrange(NUMBEROFFRAMES):
WAVEDATA = WAVEDATA+chr(int(math.sin(x/((BITRATE/FREQUENCY)/math.pi))*127+128))
#fill remainder of frameset with silence
for x in xrange(RESTFRAMES):
WAVEDATA = WAVEDATA+chr(128)
p = pyaudio.PyAudio()
stream = p.open(format = p.get_format_from_width(1),
channels = 1,
rate = BITRATE,
output = True)
stream.write(WAVEDATA)
stream.stop_stream()
stream.close()
p.terminate()
Code is slightly modified from this askubuntu site

You can directly stream the data through pyaudio, there is no need to write and read a .wav file.
import pyaudio
p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paFloat32,
channels=1,
rate=44100,
frames_per_buffer=1024,
output=True,
output_device_index=1
)
samples = np.sin(np.arange(50000)/20)
stream.write(samples.astype(np.float32).tostring())
stream.close()

Related

PyAudio Recording and Playing Back in Real Time

I am trying to record audio from the microphone and then play that audio through the speakers. Eventually I want to modify the audio before playing it back, but I'm having trouble taking the input data and successfully play it back through the speakers.
The format for the input stream I'm using is Int16 and for the output stream is Float32. These were the only ones which made any sound at all (albeit a demonic one).
First I tried simply putting the input data into the output stream. This outputs a demonic sound:
import pyaudio
import numpy as np
import struct
FORMATIN = pyaudio.paInt16
FORMATOUT = pyaudio.paFloat32
CHANNELS = 1
RATE = 44100
CHUNK = 1024
audio = pyaudio.PyAudio()
# start Recording
streamIn = audio.open(format=FORMATIN, channels=CHANNELS,
rate=RATE, input=True, input_device_index=0,
frames_per_buffer=CHUNK)
streamOut = audio.open(format=FORMATOUT, channels=CHANNELS,
rate=RATE, output=True, input_device_index=0,
frames_per_buffer=CHUNK)
print("recording...")
while True:
in_data = streamIn.read(CHUNK)
streamOut.write(in_data)
in_data is as follows when printed:
1\x00\x12\x00\x0f\x00\x05\x00\x14\x00\x1e\x00\x16\x00\x14\x00\x12\x00\x10\x00\x02\x00\xf7\xff\xf7\xff\xd4\xff\xde\xff\xf8\xff\xd3\xff\xe9\xff\x14\x00#\x00Z\x00\xb9\xfft\xff\xce\x00\x93\x01\xc2\xff\xe4\xfe\x93\x00d\x00\xca\xff\x94\x01V\x01\xc8\xffS\x00t\x00\xc4\xffi\x00\xaf\x01l\x00\xdb\xfeM\xffw\xffp\x01\xf5\xffr\xfc\x97\x00~\x02S\x00\x97\x00v\x00\x87\xfe\xb7\xfc\x81\xff\xf6\x00\xef\x00\xc4\x03\x84\x02\x99\xfd`\xfc\xe2\x01b\x03\xda\xfe\xc4\xff\xfd\x00:\x00\xc6\x00\xf1\xfcV\xfd\xf0\x02\xdc\xff&\xff\xa1\x02\xc7\xff\xf5\xfe\xa9\xfe\x99\xfa\x06\xfdo\x04\xaa\x02\x8f\xfe\xec\x00\x1b\xffZ\xfe;\x01t\xfe<\xffd\x02<\x02\x04\x02\xcd\xfd\xe8\xfd\xf3\x00i\xfcD\xfa\x86\xfe\xb3\x01\xea\x00$\x00q\x00\x03\x022\x00d\xf9\x14\xfa\x86\xfdQ\xfd\xc5\xfe\x81\x02\xc2\x02=\x01\xfc\x00\xe5\xfd\t\xff\x93\xff\x83\xffd\x00(\xfeQ\xffM\x01\xb1\x01\xde\xfdE\xfd\xfe\xff\x00\x00\x06\x00\x02\xffV\xff\xcd\xffJ\xff\xfb\xfc\x86\xfd^\x00\x8d\x00\x91\xff\xb6\xfe\xf7\x00\x95\x01E\x00\x1b\xff9\xfe8\xff\xa7\xff\xd4\xff\xdd\xff\xb0\x00\x97\x01\xe8\x00\xa7\xff\xd8\xfe\x89\xff\x0c\x00\x81\xff\x81\xfe\xd1\xfeN\x00\x1a\x01\xcb\x00\x19\x00\x90\x00`\x00\x93\xff5\xff\x9b\xff\\\x00\x08\x00\xc0\xff,\x00\xc0\x00\xba\x00\x83\x00\x0f\x00\xf5\xffY\x00\x19\
Then I tried changing in_data to Float32, but that did not work either:
in_data = np.frombuffer(in_data, np.float32))
I tried various clipping and packing of the data, none of which worked:
in_data = np.clip(in_data, -2**15+1, 2**15-1)
in_data = struct.pack('d' * 1024, *in_data)
Does anyone know how to record audio from the microphone and then output it through speakers? Thank you.
Set FORMATOUT =FORMATIN.
Currently, your code does the following:
44100 times per second, a frame is recorded
each frame is a 16 bit signed number (16 bit LPCM). It takes 2 bytes to encode a frame. This is the FORMATIN = pyaudio.paInt16 setting you chose.
when 1024 frames have been recorded (this takes ca 23 ms), these are returned as a bytes-object in python. It consists of 2048 bytes. you call this variable in_data
then, you pass these 2048 bytes to the output device via the .write call.
the output device works in pyaudio.paFloat32, which means that it believes each frame is 32 bits (4 bytes). It concludes that you have provided 2048/4=512 frames for it to play back. the output unit is set to 44100Hz as well, so it takes ca 12 ms to play back. the values it plays back are a mess, since it tries to interpret integers as floats. both the bitrate and the encoding mismatches, and the sound in your speaker seems to be from tormented souls in the purgatory.
then the whole process repeats
matching the input and output format should resolve these issues.
Audio data with 16-bit signed integer format will have values between 32768 and -32767. Data with float (32 bit or 64 bit) will be in range 1.0 to -1.0.
I would recommend doing all processing in floating point in Python. So try to do in_data = (in_data / 32768), before processing or sending to the output.
if you useing linux you can put
os.system("pactl load-module module-loopback latency_msec=1")
at the beginning of the script and
os.system("/usr/bin/pulseaudio --kill")
at the end
pls tell me if it work now

Adding to a WAV file in Python

I want to add to a wav file, ideal would be from a numpy array. I tried the following code:
data = stream.read(CHUNK)
audio_numpy = numpy.frombuffer(data, dtype=numpy.int16)
scipy.io.wavfile.write(FILENAME, RATE, audio_numpy)
where stream is created by
p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16,
channels=CHANNELS,
rate=RATE,
input=True,
frames_per_buffer=CHUNK)
as I heard that scipy would add to the file and not overwrite it. Unfortunately however, it overwrites the file.
How can I append to a WAV file? The input comes from the microphone.
The WAV file should be accessed from ffmpeg later, so that the WAV file should not be written in total again, as this is also inefficient.

How to record microphone on macos with pyaudio?

I made a simple voice assistant in python with speech_recognition on Windows 10 and I wanted to copy the code for macOs too.
I downloaded PortAudio and PyAudio, the code runs fine but when i play the audio track I hear nothing :( (and the program not detect when I try to use the speech_recognition)
I guess it something with permissions and things like that... anyone have an idea?
( I also checked I use the right device index and I indeed use index 0 (The Mackbook built-in Microphone)
here is some code sample:
import pyaudio
import wave
chunk = 1024 # Record in chunks of 1024 samples
sample_format = pyaudio.paInt16 # 16 bits per sample
channels = 1
fs = 44100 # Record at 44100 samples per second
seconds = 3
filename = "output.wav"
p = pyaudio.PyAudio() # Create an interface to PortAudio
print('Recording')
stream = p.open(format=sample_format,
channels=channels,
rate=fs,
frames_per_buffer=chunk,
input=True)
frames = [] # Initialize array to store frames
# Store data in chunks for 3 seconds
for i in range(0, int(fs / chunk * seconds)):
data = stream.read(chunk)
frames.append(data)
# Stop and close the stream
stream.stop_stream()
stream.close()
# Terminate the PortAudio interface
p.terminate()
print('Finished recording')
# Save the recorded data as a WAV file
wf = wave.open(filename, 'wb')
wf.setnchannels(channels)
wf.setsampwidth(p.get_sample_size(sample_format))
wf.setframerate(fs)
wf.writeframes(b''.join(frames))
wf.close()
I found the answer!!!
The code actually worked fine all this time, the problem was that I used Visual Studio Code that for some reason messed up with the microphone permissions
Now I run the code through terminal with python [filename].py and its working great!

Audio Recording in Python

I want to record short audio clips from a USB microphone in Python. I have tried pyaudio, which seemed to fail communicating with ALSA, and alsaaudio, the code example of which produces an unreadable files.
So my question: What is the easiest way to record clips from a USB mic in Python?
This script records to test.wav while printing the current amplitute:
import alsaaudio, wave, numpy
inp = alsaaudio.PCM(alsaaudio.PCM_CAPTURE)
inp.setchannels(1)
inp.setrate(44100)
inp.setformat(alsaaudio.PCM_FORMAT_S16_LE)
inp.setperiodsize(1024)
w = wave.open('test.wav', 'w')
w.setnchannels(1)
w.setsampwidth(2)
w.setframerate(44100)
while True:
l, data = inp.read()
a = numpy.fromstring(data, dtype='int16')
print numpy.abs(a).mean()
w.writeframes(data)

record output sound in python

i want to programatically record sound coming out of my laptop in python. i found PyAudio and came up with the following program that accomplishes the task:
import pyaudio, wave, sys
chunk = 1024
FORMAT = pyaudio.paInt16
CHANNELS = 1
RATE = 44100
RECORD_SECONDS = 5
WAVE_OUTPUT_FILENAME = sys.argv[1]
p = pyaudio.PyAudio()
channel_map = (0, 1)
stream_info = pyaudio.PaMacCoreStreamInfo(
flags = pyaudio.PaMacCoreStreamInfo.paMacCorePlayNice,
channel_map = channel_map)
stream = p.open(format = FORMAT,
rate = RATE,
input = True,
input_host_api_specific_stream_info = stream_info,
channels = CHANNELS)
all = []
for i in range(0, RATE / chunk * RECORD_SECONDS):
data = stream.read(chunk)
all.append(data)
stream.close()
p.terminate()
data = ''.join(all)
wf = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
wf.setnchannels(CHANNELS)
wf.setsampwidth(p.get_sample_size(FORMAT))
wf.setframerate(RATE)
wf.writeframes(data)
wf.close()
the problem is i have to connect the headphone jack to the microphone jack. i tried replacing these lines:
input = True,
input_host_api_specific_stream_info = stream_info,
with these:
output = True,
output_host_api_specific_stream_info = stream_info,
but then i get this error:
Traceback (most recent call last):
File "./test.py", line 25, in
data = stream.read(chunk)
File "/Library/Python/2.5/site-packages/pyaudio.py", line 562, in read
paCanNotReadFromAnOutputOnlyStream)
IOError: [Errno Not input stream] -9975
is there a way to instantiate the PyAudio stream so that it inputs from the computer's output and i don't have to connect the headphone jack to the microphone? is there a better way to go about this? i'd prefer to stick w/ a python app and avoid cocoa.
You can install Soundflower, which allows you to create extra audio devices and route audio between them. This way you can define your system's output to the Soundflower device and read the audio from it using PyAudio.
You can also take a look at PyJack, an audio client for Jack.
unfortunately, theres no foolproof way to do it, but Audio Hijack and Wiretap are the best tools available for that.
I can give an answer by not using a programmatic way.
Panel > Sound > Recording >> enabling stereo mix.
This needs a driver support.
I also found that this makes my real sound echo.
At least this solves my problem.
Doing this programmatically will be tricky. Basically, it is not possible to intercept the audio traffic in front of your "output". Therefore, what you would have to do is create your own virtual audio device and make whatever application you want to capture play to that device.
Different third-party applications also mentioned by other people seem to provide such capabilities on MacOS. I can add Loopback but I have no experience with either of the tools.
Programmatically, you would have to mimic something exactly like that.
Check out this script at pycorder. It records output sound in python. I would consider modifying one of the functions to record output using an output stream with the sounddevice module. You can see documentation of sounddevice here. The script works and you can implement it to your needs, but replacing the recording system with an output stream would probably make the code less messy and more efficient.

Categories

Resources