I want to programmatically record the sound coming out of my laptop in Python. I found PyAudio and came up with the following program, which accomplishes the task:
import pyaudio, wave, sys
chunk = 1024
FORMAT = pyaudio.paInt16
CHANNELS = 1
RATE = 44100
RECORD_SECONDS = 5
WAVE_OUTPUT_FILENAME = sys.argv[1]
p = pyaudio.PyAudio()
channel_map = (0, 1)
stream_info = pyaudio.PaMacCoreStreamInfo(
    flags = pyaudio.PaMacCoreStreamInfo.paMacCorePlayNice,
    channel_map = channel_map)
stream = p.open(format = FORMAT,
                rate = RATE,
                input = True,
                input_host_api_specific_stream_info = stream_info,
                channels = CHANNELS)
all = []
for i in range(0, RATE / chunk * RECORD_SECONDS):
    data = stream.read(chunk)
    all.append(data)
stream.close()
p.terminate()
data = ''.join(all)
wf = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
wf.setnchannels(CHANNELS)
wf.setsampwidth(p.get_sample_size(FORMAT))
wf.setframerate(RATE)
wf.writeframes(data)
wf.close()
The problem is that I have to connect the headphone jack to the microphone jack. I tried replacing these lines:
input = True,
input_host_api_specific_stream_info = stream_info,
with these:
output = True,
output_host_api_specific_stream_info = stream_info,
But then I get this error:
Traceback (most recent call last):
File "./test.py", line 25, in
data = stream.read(chunk)
File "/Library/Python/2.5/site-packages/pyaudio.py", line 562, in read
paCanNotReadFromAnOutputOnlyStream)
IOError: [Errno Not input stream] -9975
Is there a way to instantiate the PyAudio stream so that it takes input from the computer's output and I don't have to connect the headphone jack to the microphone? Is there a better way to go about this? I'd prefer to stick with a Python app and avoid Cocoa.
You can install Soundflower, which allows you to create extra audio devices and route audio between them. That way you can set your system's output to the Soundflower device and read the audio from it using PyAudio.
You can also take a look at PyJack, an audio client for JACK.
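If you go the Soundflower route, a minimal sketch of the PyAudio side might look like the following (this assumes the virtual device shows up with "Soundflower" in its name; the lookup by name is only illustrative):

import pyaudio

p = pyaudio.PyAudio()

# Find the Soundflower virtual device by name (assumes Soundflower is installed
# and selected as the system output, so everything played ends up on it).
soundflower_index = None
for i in range(p.get_device_count()):
    info = p.get_device_info_by_index(i)
    if 'Soundflower' in info['name'] and info['maxInputChannels'] > 0:
        soundflower_index = i
        break

# Open it like a normal microphone and read frames from it as usual.
stream = p.open(format=pyaudio.paInt16,
                channels=2,
                rate=44100,
                input=True,
                input_device_index=soundflower_index)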
Unfortunately, there's no foolproof way to do it, but Audio Hijack and Wiretap are the best tools available for that.
I can give a non-programmatic answer:
Control Panel > Sound > Recording > enable Stereo Mix.
This needs driver support.
I also found that this makes my real sound echo, but at least it solves my problem.
Doing this programmatically will be tricky. Basically, it is not possible to intercept the audio traffic in front of your "output". Therefore, what you would have to do is create your own virtual audio device and make whatever application you want to capture play to that device.
Different third-party applications mentioned by other people seem to provide such capabilities on macOS. I can add Loopback, but I have no experience with any of these tools.
Programmatically, you would have to mimic something exactly like that.
Check out this script at pycorder. It records output sound in Python. I would consider modifying one of its functions to record output using an output stream with the sounddevice module; you can see the sounddevice documentation here. The script works and you can adapt it to your needs, but replacing the recording system with an output stream would probably make the code less messy and more efficient.
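For reference, a minimal recording sketch with sounddevice (plus the soundfile package for writing the WAV; both packages and the device choice are assumptions here, the device being whatever loopback or virtual device carries your system output):

import sounddevice as sd
import soundfile as sf

fs = 44100       # sample rate
duration = 5     # seconds to record
device = None    # set this to the index of your loopback/virtual device

# Record from the chosen device; sd.rec() fills a numpy array in the background.
recording = sd.rec(int(duration * fs), samplerate=fs, channels=2, device=device)
sd.wait()  # block until the recording is finished

# Write the result to a WAV file; soundfile takes care of the format details.
sf.write('output.wav', recording, fs)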
Related
I know the question does not seem to make sense, but let me explain.
I have voice changer software that changes the voice in real time. If I open Audacity and choose that software's microphone, I can speak and record with the voice changed.
Now what I want is this: I have an audio file already recorded, and I want to pass that file into that same microphone (to simulate me speaking) and save the output, with the voice changed, in another file. I made some attempts using PyAudio but had no success.
The idea here is to use a TTS module in Python to read a dataset I have with multiple lines, save the output in a file, and then pass that output to the microphone to change the voice and save it in another file. That way I can automate the creation of a new dataset with a new speaker to train a new TTS. The problem is that I am missing the way to pass a file to the microphone, to simulate me speaking into it, using an audio file that is already recorded instead.
Sorry if this was confusing; I did my best to explain. I hope someone can help me.
Thank you in advance!
This is what I have so far, but with no success.
import pyaudio
import wave

# Open the audio file
wf = wave.open("my_audio_file.wav", "rb")

# Open the output file
wf_out = wave.open("my_output_file.wav", "wb")

# Set the output file's format and parameters to match the input file
wf_out.setframerate(wf.getframerate())
wf_out.setsampwidth(wf.getsampwidth())
wf_out.setnchannels(wf.getnchannels())

# Open PyAudio
p = pyaudio.PyAudio()

# Create a stream to play the audio data
stream = p.open(format=p.get_format_from_width(wf.getsampwidth()),
                channels=wf.getnchannels(),
                rate=wf.getframerate(),
                output=True)

# Start streaming the audio data
stream.start_stream()

# Send the audio data to the stream and the output file
data = wf.readframes(1024)
while data != b"":  # readframes() returns bytes, so compare against b"", not ""
    stream.write(data)
    wf_out.writeframes(data)
    data = wf.readframes(1024)

# Stop the stream
stream.stop_stream()

# Close the stream, PyAudio, and the output file
stream.close()
p.terminate()
wf_out.close()
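One thing worth noting: with output=True and no device index, the stream above plays to the default speakers, not into the voice changer's virtual microphone, and the file written is just a copy of the input. To route the playback into the voice changer, you would normally play into a virtual audio device (something like VB-Cable; that name is only an example, not something from the original question) and select that device explicitly:

import pyaudio

p = pyaudio.PyAudio()

# List every device so you can spot the virtual cable's output index.
for i in range(p.get_device_count()):
    info = p.get_device_info_by_index(i)
    print(i, info['name'], 'outputs:', info['maxOutputChannels'])

# Then open the playback stream on that device instead of the default one, e.g.
# p.open(..., output=True, output_device_index=<index of the virtual cable>).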
I made a simple voice assistant in Python with speech_recognition on Windows 10, and I wanted to copy the code over to macOS too.
I downloaded PortAudio and PyAudio; the code runs fine, but when I play the recorded audio track I hear nothing :( (and the program does not detect anything when I try to use speech_recognition).
I guess it is something with permissions and things like that... does anyone have an idea?
(I also checked that I use the right device index, and I indeed use index 0, the MacBook built-in microphone.)
Here is a code sample:
import pyaudio
import wave
chunk = 1024 # Record in chunks of 1024 samples
sample_format = pyaudio.paInt16 # 16 bits per sample
channels = 1
fs = 44100 # Record at 44100 samples per second
seconds = 3
filename = "output.wav"
p = pyaudio.PyAudio() # Create an interface to PortAudio
print('Recording')
stream = p.open(format=sample_format,
                channels=channels,
                rate=fs,
                frames_per_buffer=chunk,
                input=True)
frames = []  # Initialize array to store frames
# Store data in chunks for 3 seconds
for i in range(0, int(fs / chunk * seconds)):
    data = stream.read(chunk)
    frames.append(data)
# Stop and close the stream
stream.stop_stream()
stream.close()
# Terminate the PortAudio interface
p.terminate()
print('Finished recording')
# Save the recorded data as a WAV file
wf = wave.open(filename, 'wb')
wf.setnchannels(channels)
wf.setsampwidth(p.get_sample_size(sample_format))
wf.setframerate(fs)
wf.writeframes(b''.join(frames))
wf.close()
I found the answer!!!
The code actually worked fine all this time; the problem was that I used Visual Studio Code, which for some reason messed up the microphone permissions.
Now I run the code through the terminal with python [filename].py and it works great!
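If you run into something similar, one quick sanity check (not part of the original snippet) is to look at the peak amplitude of the recorded frames right after the recording loop; a capture blocked by permissions typically comes back as all zeros:

import numpy as np

# Convert the recorded bytes back into 16-bit samples and inspect the peak level.
samples = np.frombuffer(b''.join(frames), dtype=np.int16)
print('peak amplitude:', np.abs(samples).max())  # 0 or close to it means nothing was captured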
I'm writing code that is supposed to give some audio output to the user based on his action, and I want to generate the sound rather than having a fixed number of WAV files to play. What I'm doing now is generating the signal as a numpy array, storing the data in a WAV file, and then reading the same file into PyAudio. I think this is redundant; however, I couldn't find a way around it. My question is: can I stream a numpy array (or a regular list) directly into PyAudio to play it?
If it's just playback and does not need to be synchronised to anything, then you can just do the following:
import numpy as np
import pyaudio

CHANNELS = 1  # adjust to your signal
p = pyaudio.PyAudio()

# Open stream with correct settings
stream = p.open(format=pyaudio.paFloat32,
                channels=CHANNELS,
                rate=48000,
                output=True,
                output_device_index=1
                )

# Assuming you have a numpy array called samples
data = samples.astype(np.float32).tobytes()
stream.write(data)
I use this method and it works fine for me. If you need to record at the same time then this won't work.
If you are just looking to generate audio tones, then the code below may be useful.
It does need pyaudio, which can be installed with
pip install pyaudio
Sample Code
#Play a fixed frequency sound.
from __future__ import division
import math
import pyaudio
#See http://en.wikipedia.org/wiki/Bit_rate#Audio
BITRATE = 44100 #number of frames per second/frameset.
#See http://www.phy.mtu.edu/~suits/notefreqs.html
FREQUENCY = 2109.89 #Hz, waves per second, 261.63=C4-note.
LENGTH = 1.2 #seconds to play sound
NUMBEROFFRAMES = int(BITRATE * LENGTH)
RESTFRAMES = NUMBEROFFRAMES % BITRATE
WAVEDATA = ''
for x in xrange(NUMBEROFFRAMES):
    WAVEDATA = WAVEDATA+chr(int(math.sin(x/((BITRATE/FREQUENCY)/math.pi))*127+128))
#fill remainder of frameset with silence
for x in xrange(RESTFRAMES):
    WAVEDATA = WAVEDATA+chr(128)
p = pyaudio.PyAudio()
stream = p.open(format = p.get_format_from_width(1),
                channels = 1,
                rate = BITRATE,
                output = True)
stream.write(WAVEDATA)
stream.stop_stream()
stream.close()
p.terminate()
The code is slightly modified from this askubuntu post.
You can directly stream the data through pyaudio, there is no need to write and read a .wav file.
import numpy as np
import pyaudio

p = pyaudio.PyAudio()

stream = p.open(format=pyaudio.paFloat32,
                channels=1,
                rate=44100,
                frames_per_buffer=1024,
                output=True,
                output_device_index=1
                )

samples = np.sin(np.arange(50000) / 20)
stream.write(samples.astype(np.float32).tobytes())

stream.close()
p.terminate()
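If you need playback to run in the background while you keep generating samples, PyAudio's callback mode is an alternative to the blocking write above; here is a minimal sketch (the chunking bookkeeping is just illustrative, not taken from the answer above):

import time
import numpy as np
import pyaudio

samples = np.sin(np.arange(50000) / 20).astype(np.float32)
pos = 0

def callback(in_data, frame_count, time_info, status):
    # Hand PyAudio the next frame_count samples each time it asks for more.
    global pos
    chunk = samples[pos:pos + frame_count]
    pos += frame_count
    flag = pyaudio.paContinue if len(chunk) == frame_count else pyaudio.paComplete
    return (chunk.tobytes(), flag)

p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paFloat32,
                channels=1,
                rate=44100,
                output=True,
                stream_callback=callback)
stream.start_stream()
while stream.is_active():
    time.sleep(0.1)  # the callback does the work; the main thread just waits
stream.stop_stream()
stream.close()
p.terminate()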
Using C or Python (Python preferred), how would I encode a binary file to audio that is then output through the headphone jack, and how would I decode that audio back to binary using input from the microphone jack? So far I have learned how to convert a text file to binary using Python. It would be similar to RTTY communication.
This is so that I can record data onto a cassette tape.
import binascii
a = open('/Users/kyle/Desktop/untitled folder/unix commands.txt', 'r')
f = open('/Users/kyle/Desktop/file_test.txt', 'w')
c = a.read()
b = bin(int(binascii.hexlify(c), 16))
f.write(b)
f.close()
From your comments, you want to process the binary data bit by bit, turning each bit into a high or low sound.
You still need to decide exactly what those high and low sounds are, and how long each one sounds for (and whether there's a gap in between, and so on). If you make it slow, like 1/4 of a second per sound, then you're treating them as notes. If you make it very fast, like 1/44100 of a second, you're treating them as samples. The human ear can't hear 44100 different sounds in a second; instead, it hears a single sound at up to 22050Hz.
Once you've made those decisions, there are two parts to your problem.
First, you have to generate a stream of samples—for example, a stream of 44100 16-bit integers for every second. For really simple things, like playing a chunk of a raw PCM file in 44k 16-bit mono format, or generating a square wave, this is trivial. For more complex cases, like playing a chunk of an MP3 file or synthesizing a sound out of sine waves and filters, you'll need some help. The audioop module, and a few others in the stdlib, can give you the basics; beyond that, you'll need to search PyPI for appropriate modules.
Second, you have to send that sample stream to the headphone jack. There's no built-in support for this in Python. On some platforms, you can do this just by opening a special file and writing to it. But, more generally, you will need to find a third-party library on PyPI.
The simpler modules work for one particular type of audio system. Mac and Windows each have their own standards, and Linux has a half dozen different ones. There are also some Python modules that talk to higher-level wrappers; you may have to install and set up the wrapper, but once you do that, your code will work on any system.
So, let's work through one really simple example. Let's say you've got PortAudio set up on your system, and you've installed PyAudio to talk to it. This code will play square waves of 441 Hz and 220.5 Hz (roughly the A above middle C and the A below it) for just under 1/4 of a second each (just because that's really easy).
import binascii
import pyaudio

a = open('/Users/kyle/Desktop/untitled folder/unix commands.txt', 'r')
c = a.read()
b = bin(int(binascii.hexlify(c), 16))

sample_stream = []
high_note = (b'\xFF'*100 + b'\0'*100) * 50
low_note = (b'\xFF'*50 + b'\0'*50) * 100

for bit in b[2:]:
    if bit == '1':
        sample_stream.extend(high_note)
    else:
        sample_stream.extend(low_note)

sample_buffer = b''.join(sample_stream)

p = pyaudio.PyAudio()
stream = p.open(format=p.get_format_from_width(1),  # 8-bit samples are 1 byte wide
                channels=1,
                rate=44100,
                output=True)
stream.write(sample_buffer)
So you want to transmit digital information using audio? Basically, you want to implement a modem in software (even if it is pure software, it is still called a modem).
A modem (MOdulator-DEModulator) is a device that modulates an analog carrier signal to encode digital information, and also demodulates such a carrier signal to decode the transmitted information. The goal is to produce a signal that can be transmitted easily and decoded to reproduce the original digital data. Modems can be used over any means of transmitting analog signals, from light emitting diodes to radio. [wikipedia]
There are modems everywhere you need to transmit data over an analog medium, be it sound, light or radio waves. Your TV remote probably is an infrared modem.
Modems implemented in pure software are called soft-modems. Most soft-modems I see in the wild are using some form of FSK modulation:
Frequency-shift keying (FSK) is a frequency modulation scheme in which digital information is transmitted through discrete frequency changes of a carrier wave. The simplest FSK is binary FSK (BFSK). BFSK uses a pair of discrete frequencies to transmit binary (0s and 1s) information. With this scheme, the "1" is called the mark frequency and the "0" is called the space frequency. [wikipedia]
There are very interesting applications for data transmission through the atmosphere via sound waves; I guess that is what shopkick uses to verify user presence.
For Python check the GnuRadio project.
For a C library, look at the work of Steve Underwood (but please don't contact him with silly questions). I used his soft-modem to bootstrap a FAX-to-email gateway for Asterisk (a fax transmission is not much more than a B/W TIFF file encoded in audio for transmission over a phone line).
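To make the FSK idea concrete, here is a minimal BFSK modulator sketch in Python; the bit rate and the mark/space frequencies are arbitrary Bell-103-style choices, not taken from any particular answer on this page:

import numpy as np

RATE = 44100               # samples per second
BAUD = 300                 # bits per second
MARK, SPACE = 1270, 1070   # tone frequencies in Hz for '1' and '0'

def modulate(bits):
    # Turn a string of '0'/'1' characters into float32 samples, one tone burst per bit.
    samples_per_bit = RATE // BAUD
    t = np.arange(samples_per_bit) / RATE
    out = []
    for bit in bits:
        freq = MARK if bit == '1' else SPACE
        out.append(np.sin(2 * np.pi * freq * t))
    return np.concatenate(out).astype(np.float32)

audio = modulate('101100111000')  # feed this to PyAudio like the float32 samples elsewhere on this page

Demodulation would be the mirror image: split the microphone signal into bit-length windows and decide, per window, whether the mark or the space frequency dominates (for example with a Goertzel filter or an FFT).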
So, here is Abarnert's code, updated to Python 3.
import binascii
import pyaudio

a = open('/home/ian/Python-programs/modem/testy_mcTest.txt', 'rb')
c = a.read()
b = bin(int(binascii.hexlify(c), 16))

sample_stream = []
high_note = (b'\xFF'*100 + b'\0'*100) * 50
low_note = (b'\xFF'*50 + b'\0'*50) * 100

for bit in b[2:]:
    if bit == '1':
        sample_stream.extend(high_note)
    else:
        sample_stream.extend(low_note)

# In Python 3, iterating over bytes yields ints, so rebuild the buffer with bytes()
sample_buffer = bytes(sample_stream)

p = pyaudio.PyAudio()
stream = p.open(format=p.get_format_from_width(1),
                channels=1,
                rate=44100,
                output=True)
stream.write(sample_buffer)

# stop stream (4)
stream.stop_stream()
stream.close()

# close PyAudio (5)
p.terminate()
If you're looking for a library that does this, I'd recommend libquiet. It uses an existing SDR library to perform its modulation, and it provides binaries that will offer you a pipe at one end and will feed the sound right to your soundcard using PortAudio at the other. It has GMSK for "over the air" low-bitrate transmissions and QAM for cable-based higher bitrates.
I want to record short audio clips from a USB microphone in Python. I have tried pyaudio, which seemed to fail communicating with ALSA, and alsaaudio, whose code example produces unreadable files.
So my question is: what is the easiest way to record clips from a USB mic in Python?
This script records to test.wav while printing the current amplitude:
import alsaaudio, wave, numpy
inp = alsaaudio.PCM(alsaaudio.PCM_CAPTURE)
inp.setchannels(1)
inp.setrate(44100)
inp.setformat(alsaaudio.PCM_FORMAT_S16_LE)
inp.setperiodsize(1024)
w = wave.open('test.wav', 'w')
w.setnchannels(1)
w.setsampwidth(2)
w.setframerate(44100)
while True:
    l, data = inp.read()
    a = numpy.fromstring(data, dtype='int16')
    print numpy.abs(a).mean()
    w.writeframes(data)
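Note that the while True loop above never exits and never closes the WAV file. If you only need a short clip, a variant like the following (same alsaaudio calls as above; the five-second length is an arbitrary choice) records a fixed amount and closes everything cleanly:

import alsaaudio, wave

SECONDS = 5
RATE = 44100

inp = alsaaudio.PCM(alsaaudio.PCM_CAPTURE)
inp.setchannels(1)
inp.setrate(RATE)
inp.setformat(alsaaudio.PCM_FORMAT_S16_LE)
inp.setperiodsize(1024)

w = wave.open('clip.wav', 'wb')
w.setnchannels(1)
w.setsampwidth(2)
w.setframerate(RATE)

frames_written = 0
while frames_written < RATE * SECONDS:
    length, data = inp.read()
    if length > 0:  # a non-positive length signals an overrun; skip that read
        w.writeframes(data)
        frames_written += length
w.close()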