Playing mp3 file through microphone with python - python

Is there a way, using only Python (and no external software), to play an MP3 file as if it were microphone input?
For example, I have an MP3 file, and a Python script would play it through my mic so that others in a voice room would hear it. This is just an example.
Of course, I have done some research. I found that I can use software to create a virtual device and then do a few things to get this result. But my question is whether it is possible without installing software, using only some kind of Python script.

It is possible, but not 100% in Python, as it requires installing other software. (As far as I know this specific answer only works on Windows; it should be similar on Linux with PulseAudio instead of VB-Audio Cable, but I'm not a daily Linux user, so I can't confirm.)
First download VB-Audio Cable from https://www.vb-audio.com/Cable/. This creates a "Virtual Audio Cable": programs play audio to its input device (which looks like a speaker) and it pipes that audio to its output device (which looks like a microphone).
Then run this command in cmd: pip install pygame==2.0.0.dev8 (or py -m pip install pygame==2.0.0.dev8, depending on your installation of Python). (The dev version is needed because this relies on some functions only available in SDL2, whereas the main branch uses SDL1.)
Then:
>>> from pygame._sdl2 import get_num_audio_devices, get_audio_device_name #Get playback device names
>>> from pygame import mixer #Playing sound
>>> mixer.init() #Initialize the mixer, this will allow the next command to work
>>> [get_audio_device_name(x, 0).decode() for x in range(get_num_audio_devices(0))] #Returns playback devices
['Headphones (Oculus Virtual Audio Device)', 'MONITOR (2- NVIDIA High Definition Audio)', 'Speakers (High Definition Audio Device)', 'Speakers (NVIDIA RTX Voice)', 'CABLE Input (VB-Audio Virtual Cable)']
>>> mixer.quit() #Quit the mixer as it's initialized on your main playback device
>>> mixer.init(devicename='CABLE Input (VB-Audio Virtual Cable)') #Initialize it with the correct device
>>> mixer.music.load("Megalovania.mp3") #Load the mp3
>>> mixer.music.play() #Play it
To stop the music do: mixer.music.stop()
Also, the music doesn't play through your speakers, so you will need another Python script or thread that handles playing it through your speakers. (If you want it to play on a button press, I recommend the Python library keyboard; its GitHub documentation is really good, and you should be able to figure it out on your own.)
PS: This took me a while to figure out, you're welcome.
PPS: I'm still trying to figure out a way to pipe your own mic through there as well, since this method obviously does not pipe your real microphone in too, but looking into the source code of pygame is making my head hurt because it is all written in C.
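Hard-coding the device name works, but the exact string can differ between machines. A minimal sketch of picking the cable out of the device list by keyword follows; pick_output_device is a hypothetical helper name (not part of pygame), and the list is simply the output copied from the session above.

```python
def pick_output_device(device_names, keyword="CABLE Input"):
    """Return the first device name containing keyword (case-insensitive), or None."""
    wanted = keyword.lower()
    for name in device_names:
        if wanted in name.lower():
            return name
    return None


# Device list copied from the interactive session above.
devices = [
    "Headphones (Oculus Virtual Audio Device)",
    "MONITOR (2- NVIDIA High Definition Audio)",
    "Speakers (High Definition Audio Device)",
    "Speakers (NVIDIA RTX Voice)",
    "CABLE Input (VB-Audio Virtual Cable)",
]

chosen = pick_output_device(devices)
print(chosen)
```

The result would then be passed on as mixer.init(devicename=chosen); if it is None, the virtual cable is probably not installed.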

If you meant how to play MP3 using Python, well, this is a broad question.
Is it possible without any dependencies? Yes, it is, but it is not worth it. Well, playing uncompressed audio is, but MP3 is another matter, as I'll explain below.
To play raw audio data from Python without installing pyaudio or pygame or similar, you first have to know the platform on which your script will run.
Then you implement a nice set of functions for choosing an audio device, setting up properties like sample rate, bit depth, mono/stereo..., feeding the stream to the audio card and stopping the playback.
It is not hard, but to do it you have to use ctypes on Windows and PyObjC on Mac, and Linux is a special case as it supports many audio systems (you would probably use sockets to connect to PulseAudio, pipe to some process like aplay/paplay/mpg123..., or exploit gstreamer).
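To make those stream properties concrete, here is a small sketch that only uses the standard library: it builds an uncompressed WAV in memory with an explicit sample rate, sample width and channel count. The tone parameters and the name sine_wav_bytes are made up for illustration, and it stops short of the platform-specific part (actually handing the bytes to an audio device).

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 44100   # frames per second
SAMPLE_WIDTH = 2      # bytes per sample (16-bit PCM)
CHANNELS = 1          # mono


def sine_wav_bytes(freq_hz=440.0, seconds=0.5, volume=0.5):
    """Return a complete WAV file (as bytes) holding a sine tone."""
    n_frames = int(SAMPLE_RATE * seconds)
    amplitude = int(volume * 32767)
    frames = b"".join(
        struct.pack("<h", int(amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)))
        for i in range(n_frames)
    )
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(CHANNELS)
        wf.setsampwidth(SAMPLE_WIDTH)
        wf.setframerate(SAMPLE_RATE)
        wf.writeframes(frames)
    return buf.getvalue()


wav_data = sine_wav_bytes()

# Read the header back to confirm the properties we set.
with wave.open(io.BytesIO(wav_data), "rb") as wf:
    print(wf.getnchannels(), wf.getsampwidth(), wf.getframerate(), wf.getnframes())
```

On Linux, bytes like these could for instance be piped to a process such as aplay, which is one of the options mentioned above.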
But why go through all this just to avoid dependencies, when there are nice libraries out there with simple interfaces for accessing and using audio devices?
PyAudio is a great one.
Well, that is your concern.
But playing MP3 without external libraries, in real time, from pure Python: it's not exactly impossible, but it is very hard to achieve, and as far as I know nobody has even tried doing it.
There is a pure Python MP3 decoder implementation, but it is 10 times slower than necessary for real-time audio playback. It could be optimized to nearly full speed, but nobody is interested in doing so.
It has mostly educational value and is used in cases where you do not need real-time speed.
This is what you should do:
Install pygame and use it to play MP3 directly
or:
Install PyAudio and some library that decodes MP3 (there are quite a few of them on pypi.python.org), then use it to decode the MP3 and feed the output to PyAudio.
There are some more possibilities, including pymedia, but I consider these the easiest solutions.
Okay, now that we have clarified what you really need, here is the answer.
I will leave the first answer intact, as you need that part too.
Now, you want to play audio to the recording stream, so that any application recording the audio input records the stuff that you are playing.
On Windows, this is called stereo mix and can be found in Volume Control, under audio input.
You choose stereo mix as your default input. Now, when you open a recording app which doesn't select its own input channel but uses the selected one (e.g. Skype), it will record everything coming out of your speakers and coming into your mic/line in.
I am not 100% sure whether this option appears on all Windows versions or whether it is a feature of the audio card you have.
I am positive that Creative and Realtek audio cards support it.
So, research this.
To select that option from Python, you have to connect to winmm.dll using ctypes and call the appropriate function. I do not know which one and with what arguments.
If this option is not present in volume control, there is nothing for it but to install a virtual audio card to do the loopback for you.
There might be such software that comes packaged as a library, so that you can use it from Python or whatever.
On Linux this should be easy using PulseAudio. I do not know how, but I know that you can do it, redirect the streams etc. There is a tutorial out there somewhere.
Then you can call that command from Python, to switch to this setup and to reset back to normal.
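One way this could look on Linux, as a sketch rather than a tested recipe: PulseAudio's pactl tool can load a module-null-sink, and whatever is played into that sink shows up on its monitor source (here virtmic.monitor), which a recording application could then select as its input. The sink name virtmic and the helper names are assumptions for illustration; only the command-building part is exercised below, since actually running it needs a PulseAudio server.

```python
import subprocess


def null_sink_command(sink_name="virtmic"):
    """Command asking PulseAudio to create a virtual (null) sink."""
    return ["pactl", "load-module", "module-null-sink", f"sink_name={sink_name}"]


def unload_command(module_id):
    """Command that removes a previously loaded module again."""
    return ["pactl", "unload-module", str(module_id)]


def create_null_sink(sink_name="virtmic"):
    """Run the load command; pactl prints the index of the new module."""
    result = subprocess.run(
        null_sink_command(sink_name), check=True, capture_output=True, text=True
    )
    return int(result.stdout.strip())


def remove_module(module_id):
    """Reset back to normal by unloading the module."""
    subprocess.run(unload_command(module_id), check=True)


# Only print the commands here; calling create_null_sink() requires PulseAudio.
print(" ".join(null_sink_command()))
print(" ".join(unload_command(42)))
```

A script would call create_null_sink() before playback, keep the returned module index, and pass it to remove_module() when done.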
On Mac, well, I really have no idea, but it should be possible.
If you want your MP3 to be played only to the recording stream, and not on your speakers at all, well on Windows, you will not be able to do that without a loopback audio device.
On Linux, I am sure you will be able to do it, and on Mac it should be possible, but how is the question.
I currently have no time to sniff around libraries etc. to provide you with some useful code, so you will have to do it yourself. But I hope my directions will help you.

Just an update on @PyPylia's answer for the benefit of anyone who struggled to implement this like I did.
Current Package Version: pygame 2.1.2 (SDL 2.0.18, Python 3.10.6)
Tested Systems: Windows 10 (21H2 - 19044.1288). (It should be the same process on Mac, but this is untested as of now.)
First, you'll need to download the VB-Cable Virtual Mic Driver for your respective platform and install it. This provides us with a virtual mic that'll allow us to pass audio we play on our machine as a microphone input when using a video calling software (Google Meet, Microsoft Teams, Zoom). After that, it's all handled through the pygame module's audio package.
To get the audio device list:
from pygame import mixer, _sdl2 as devicer
mixer.init() # Initialize the mixer, this will allow the next command to work
# Returns playback devices, Boolean value determines whether they are Input or Output devices.
print("Inputs:", devicer.audio.get_audio_device_names(True))
print("Outputs:", devicer.audio.get_audio_device_names(False))
mixer.quit() # Quit the mixer as it's initialized on your main playback device
For example, my device returns:
Inputs: ['Microphone (High Definition Audio Device)', 'CABLE Output (VB-Audio Virtual Cable)']
Outputs: ['Speakers (High Definition Audio Device)', 'CABLE Input (VB-Audio Virtual Cable)']
Then, to play back the audio:
import time
from pygame import mixer
mixer.init(devicename='CABLE Input (VB-Audio Virtual Cable)') # Initialize it with the correct device
mixer.music.load("Toby Fox - Megalovania.mp3") # Load the mp3
mixer.music.play() # Play it
while mixer.music.get_busy(): # Wait for music to finish playing
    time.sleep(1)
If you wish to play multiple tracks back to back, add the following code segments to the while loop above:
...
else:
    mixer.music.unload() # Unload the mp3 to free up system resources
    mixer.music.load("Sleeping at Last - Saturn.wav") # Load the wav
...
Then, on the other end, inside the relevant software, just change the microphone input from the default to CABLE Output (VB-Audio Virtual Cable) to have those on the other end hear the audio from the source.
If you're using a newer version of the package and some of the listed methods don't seem to work because of an AttributeError: module 'pygame' has no attribute {method_name}, use pyup and search for the method in question to see whether the way it is invoked has changed. This was the main reason @PyPylia's code snippet no longer works unless you use an older version of pygame.
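For playing several tracks back to back, the load/play/wait/unload steps above can be wrapped in a loop. The following is only a sketch: play_playlist is a hypothetical helper, and FakeMusic is a stand-in object so the example runs without pygame or an audio device. With pygame installed you would pass mixer.music instead, after calling mixer.init(devicename=...) as shown earlier.

```python
import time


def play_playlist(music, tracks, poll_seconds=1.0, sleep=time.sleep):
    """Play each track in order on a pygame.mixer.music-like object."""
    played = []
    for track in tracks:
        music.load(track)
        music.play()
        while music.get_busy():   # wait for the current track to finish
            sleep(poll_seconds)
        music.unload()            # free the file before loading the next one
        played.append(track)
    return played


class FakeMusic:
    """Stand-in for pygame.mixer.music, for demonstration only."""

    def __init__(self, busy_polls=2):
        self.busy_polls = busy_polls
        self.remaining = 0
        self.log = []

    def load(self, path):
        self.log.append(("load", path))

    def play(self):
        self.remaining = self.busy_polls
        self.log.append(("play",))

    def get_busy(self):
        if self.remaining > 0:
            self.remaining -= 1
            return True
        return False

    def unload(self):
        self.log.append(("unload",))


fake = FakeMusic()
done = play_playlist(
    fake,
    ["Toby Fox - Megalovania.mp3", "Sleeping at Last - Saturn.wav"],
    sleep=lambda seconds: None,  # skip real waiting in the demo
)
print(done)
```

Passing the music object in as a parameter keeps the loop independent of which device the mixer was initialized with.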

If you want to play an audio file in a local directory, you may follow this flow.
#!/usr/bin/env python
import os
import wave

import pyaudio

CHUNK = 4096
# Assumption: the WAV file sits next to this script (BASE_DIR was not defined in the original).
BASE_DIR = os.path.dirname(os.path.abspath(__file__))
output = os.path.join(BASE_DIR, "speech.wav")  # WAV file to play
wf = wave.open(output, 'rb')
p = pyaudio.PyAudio()
stream = p.open(format=p.get_format_from_width(wf.getsampwidth()),
                channels=wf.getnchannels(),
                rate=wf.getframerate(),
                output=True)
try:
    data = wf.readframes(CHUNK)
    while data:  # readframes() returns b'' at the end of the file
        stream.write(data)
        data = wf.readframes(CHUNK)
except KeyboardInterrupt:
    pass
print('Shutting down')
stream.close()
wf.close()
p.terminate()

Related

How to listen to audio output by application

Background
I'm working on a music player, and want any sort of audio visualizer to be part of it. I've already made a previous post specifically targeting python-vlc, but I guess not too many people know about it. Here is the post in question: How to get audio samples from python-VLC
What I need
I need a way to listen to audio output, preferably by application although it is not completely necessary. It must be a python module.
What I found
A bunch of tutorials regarding audio input from the microphone, which is NOT what I need
Two swharden.com tutorials with audio visualizers
https://swharden.com/blog/2016-07-19-realtime-audio-visualization-in-python/
Another one from 2013 but I can't find the link
PyAudio tutorials also using the microphone
Listening audio output with python
terminatorX on Linux; it uses the microphone
squishyball on Linux; it doesn't support all audio files
livefft; uses pyqt4 so I couldn't even test it
Alternatives
Any Python module or Linux terminal command that can provide a list of audio samples from any audio file, similar to pydub.AudioSegment.get_array_of_samples()
What I don't need
More suggestions to use pydub's get_array_of_samples()
Pydub doesn't read the audio file correctly and quickly gets out of sync from already playing audio
More tutorials of how to get audio from the microphone
Answers on how to strip audio from a video file with ffmpeg

playing sound without initial click in python

I am new to python and am attempting to build a simple alarm app (command line for now). This is using Python 3.6 and I am developing on Ubuntu 18.04. When I play a sound using pydub, playsound or simpleaudio, the sound is preceded by an annoying click, which I presume is meant to emulate the pressing of a button on a machine. I may have missed it in the docs, but do not see anything.
To be clear, this clicking does not exist in the sound file. I have put the play command in a loop to verify and only hear it on the initial play. For example:
# run pydub
sound = AudioSegment.from_file(f, format="wav")
# play(sound)
for _ in range(2):
    play(sound)
This happens regardless of playing wav, mp3 or flac.
FWIW - I have been unsuccessful using python-vlc and pygame. I fear spending much time only to continue to hear the "click".
So, the question is, how do I prevent the click or what library/module should I use to achieve playback of a snippet in such a simple app?
Probably the easiest way to eliminate a click in existing audio data is to put a very short fade-in at the beginning to ensure playback starts from a "zero crossing":
sound = AudioSegment.from_file(f, format="wav")
# 10 ms fade in
sound = sound.fade_in(duration=10)
for _ in range(2):
    play(sound)
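To show what such a fade-in does to the samples, here is a sketch on plain 16-bit mono PCM using only the standard library. It uses a linear ramp, which is an assumption about the shape; pydub's own fade_in may compute the gain differently. The constant test signal is made up so that the effect is easy to see.

```python
from array import array


def fade_in(samples, sample_rate, duration_ms=10):
    """Return a copy of 16-bit mono samples with a linear fade-in applied."""
    n_fade = min(len(samples), int(sample_rate * duration_ms / 1000))
    out = array("h", samples)
    for i in range(n_fade):
        # Gain rises from 0 at the first sample towards 1 at the end of the ramp.
        out[i] = int(samples[i] * i / n_fade)
    return out


# A signal that starts abruptly at a high level (this is what tends to click).
loud = array("h", [20000] * 1000)
faded = fade_in(loud, sample_rate=44100)

print(faded[0], faded[220], faded[441], faded[-1])
```

The first sample becomes zero and the level rises over the first 10 ms, after which the samples are untouched.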
I'm having the same issue on a MacBook Pro trying to play short sound clips with simpleaudio. If I play an array of zeros, I get an obnoxious click noise at the beginning and end as with this code. Built-in speakers, as well as air pods, show the same issue.
import simpleaudio as sa
import numpy as np
print("Play silence for 1 second.")
sample_rate = 44100
s = np.zeros(sample_rate).astype(np.int16)
sp = sa.play_buffer(s, 1, 1, sample_rate)
sp.wait_done()
print(" Done.")
[EDIT: Found my problem: the second 1 in the arguments to play_buffer() is the number of bytes per sample. It should have been 2 for a 16-bit sample size. Using 1 shouldn't break anything for this sample of zeros, but it probably forces the audio hardware to be reconfigured for 8-bit instead of 16-bit playback, and changing the configuration might well be the cause of the pop. When I switch to 2, there are no pops.]
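A quick way to see the mismatch described in the edit is to compute how long the same buffer would play under each setting. This sketch only does the arithmetic (buffer_duration is a hypothetical helper, not a simpleaudio function) and says nothing about why the hardware pops.

```python
def buffer_duration(num_bytes, channels, bytes_per_sample, sample_rate):
    """Seconds of audio a raw PCM buffer represents under the given format."""
    frames = num_bytes // (channels * bytes_per_sample)
    return frames / sample_rate


sample_rate = 44100
# One second of mono int16 samples occupies 2 bytes per sample.
num_bytes = sample_rate * 2

wrong = buffer_duration(num_bytes, channels=1, bytes_per_sample=1, sample_rate=sample_rate)
right = buffer_duration(num_bytes, channels=1, bytes_per_sample=2, sample_rate=sample_rate)
print(wrong, right)
```

Declaring 1 byte per sample makes the one-second int16 buffer look like two seconds of 8-bit audio, which matches the observation that the format passed to play_buffer() did not describe the data.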

Can I use AUX port in Raspberry Pi 3 (Model B) to plug a microphone to get audio signals in?

import speech_recognition as sr
r = sr.Recognizer()
with sr.Microphone() as source:
    audio = r.listen(source)
print(r.recognize_sphinx(audio))
When I run this code in Python on a Raspberry Pi 3 (Model B), it gives the following error.
OSError: No Default Input Device Available
What is the reason for this? Do I need a USB microphone to get the audio signals in, rather than using the microphone in earphones?
As designed by the Raspberry Pi's circuit layout, in short:
The 3.5mm Audio Jack on the Raspberry Pi models cannot be used as an audio input.
I'm not sure if you would want to anyways.
This means you have a couple of options on how you want to set up your microphone setup.
1. Using a small mic array (Like Alexa Echo or Google Home)
A lot of the time these kinds of systems are prototyped on Raspberry Pis or similar (see the official Alexa development kit). You can find replicas of the microphone arrays found on Google Home etc., specifically fitted for the Raspberry Pi. These include some added advanced features such as noise suppression, direction of sound source and other neat features I'll leave for you to explore yourself.
Here are three I found after googling (I'm sure if you look you can find more):
ReSpeaker 4-mic array
ReSpeaker 7-mic array
Matrix Creator
If you wanted high quality results for speech recognition I'd probably begin to look more down this route.
2. Using a normal USB microphone
Probably the most common approach is to get a standard USB microphone that has Raspberry Pi drivers and use this. I found one from Adafruit which I'm sure is just plug and play which could be nice and easy to get going with.
Again I'm sure you'll find plenty of other options online, these were just suggestions to get you started.
Hopefully this helps! :-)
What you could use is a USB microphone; these tend to install the required drivers and work out of the box more easily.
Source: https://www.raspberrypi.org/forums/viewtopic.php?t=188108
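Before buying hardware it can help to check which devices the Pi actually reports as inputs. The sketch below filters device-info dictionaries of the kind PyAudio returns (keys index, name, maxInputChannels); the two sample entries and the helper name input_devices are made up for illustration, so the example runs without PyAudio installed.

```python
def input_devices(device_infos):
    """Return (index, name) for every device that can record audio."""
    return [
        (info["index"], info["name"])
        for info in device_infos
        if info.get("maxInputChannels", 0) > 0
    ]


# Hypothetical device list; on a real system it could be gathered with:
#   p = pyaudio.PyAudio()
#   infos = [p.get_device_info_by_index(i) for i in range(p.get_device_count())]
sample_infos = [
    {"index": 0, "name": "Built-in headphone jack", "maxInputChannels": 0},
    {"index": 1, "name": "USB microphone", "maxInputChannels": 1},
]

found = input_devices(sample_infos)
print(found)
```

An empty result would be consistent with the "No Default Input Device Available" error from the question, since the 3.5mm jack is output only.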

How can I script video playback with output to multiple screens?

Background
I'm attempting to craft a simple video playback script for a small cinema that automates the playing of videos and control of the projector, sound and lighting systems. I have two video outputs, one goes to a monitor in the projection booth, and the other directly to the projector. I desire to play video (and only video) fullscreen to the projector while putting controls and a small (~1/4 screen) preview on the monitor. This will allow the projectionist to view the video being output and control the playback from the monitor in the booth while all the audience ever sees is the video output.
Problem
I am currently using Python to control VLC player (with libvlc Python bindings) to playback videos. I have everything working fine except that I can't figure out how to get a preview (direct copy) of the video being played fullscreen on the projector output into my GUI.
I have tried using the clone filter, but I can't get the cloned window to automagically appear full screen or in my GUI. The clone filter seems like the logical choice, but it seems to be VERY inflexible when it comes to specifying destination screens, fullscreen, etc. I must be able to open video windows full screen on the projector monitor. Professionalism is key and it would look bad if the projectionist had to drag a window over and double click on it when the movie started.
Currently Using:
Debian Linux
Python 2.7
wxPython
libvlc
I would like to continue using Python as I already have the code for controlling the projector, sound processor, lighting and curtain written and tested. I chose VLC because it really seems bulletproof when it comes to video playback but am not committed to its continued use. I also chose wxWidgets for my GUI as a result of past experience but I am not stuck on that either.
This describes the direct solution and does not concentrate on any alternative or the overall design of your application.
Because your application and VLC media player are separate processes, you cannot get what you want directly: there is no "shared memory" between the two applications. The best shot at "copying" the decoded frames from VLC is to send, for example, a raw video .mts stream (TS is usually used for this kind of use case) to e.g. udp://localhost:1234.
In your application, you will need to be able to receive the ts stream, "decode" it and display at the spot of interest.
To start, I would check whether you can do this using two VLC players that you control manually. Once the first VLC streams to UDP and outputs on the main display at the same time, and the other VLC player receives and plays the UDP stream, you can go on:
Find a player library that you can use directly in your wxPython application and check whether it can receive the UDP stream as well, e.g.
https://wxpython.org/Phoenix/docs/html/wx.media.MediaCtrl.html
This player lib for example requires gstreamer as a base.
As a result, the main display and the picture in your application might be a few seconds apart. To get around this latency, the best way that I currently know is WebRTC, but that is a much more complex setup than the above.
https://www.sipwise.org/news/technical/tv-over-webrt/
Of course, if you do some "encoding" for WebRTC or even for UDP, you would need a hardware encoder, e.g. Nvidia NVENC, to guarantee that the needed resources are always available.
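For the manual two-player experiment described above, the sending VLC needs a stream-output chain that both shows the video locally and sends a copy to UDP. The sketch below only builds the option strings; the helper names are made up, the chain follows VLC's duplicate/std stream-output syntax as I understand it, and host and port are the example values from above. Treat it as a starting point to verify against your VLC version.

```python
def duplicate_to_udp(host="localhost", port=1234):
    """Build a VLC --sout chain: display locally and send a TS copy over UDP."""
    return "#duplicate{dst=display,dst=std{access=udp,mux=ts,dst=%s:%d}}" % (host, port)


def receive_url(port=1234):
    """URL the second player opens to listen for the UDP stream."""
    return "udp://@:%d" % port


sout = duplicate_to_udp()
print("sender:   vlc movie.mp4 --sout '%s'" % sout)
print("receiver: vlc %s" % receive_url())
```

The same sout string could later be passed as a media option through the libvlc Python bindings once the manual test works.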
