TLDR: Is there a way to mix multiple input devices' audio streams in real-time in Python (ideally with Pyo)?
I'm trying to use Pyo to combine audio streams from multiple microphones (or virtual microphones, e.g., Loopback devices) into one output audio stream. I've tried using pyo.Input, but it seems to have a core limitation: in Input(x), x seems to refer to a channel, not an input device. The input device seems to be constrained by the server: for s = pyo.Server(duplex=1), s.setInputDevice(someDeviceIndex) seems to be the only way to choose which input device will be used, with x then referring to a channel of that device (e.g., for a stereo microphone).
I have not seen any information in Pyo's docs (or in what I've read about other libraries) about how to mix multiple input devices together. pyo.Mixer seems promising, but examples given in the docs don't demonstrate how to use multiple Inputs, just other sound sources (e.g., a sine wave).
Is there any way that I can access multiple input devices within the scope of one Server and mix them together? If Pyo is not a good option for this task, what other libraries (and/or vanilla Python) might be a good fit?
Resolved on the Pyo forums at this link.
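For anyone landing here later, here is a minimal sketch of the mixing side in Pyo. It assumes the microphones have first been combined into a single multichannel device at the OS level (e.g. a macOS Aggregate Device or a Loopback virtual device), so that each mic shows up as a channel of one input device; the device index below is a placeholder:

    from pyo import Server, Input, Mixer

    s = Server(duplex=1)
    s.setInputDevice(3)           # placeholder index of the aggregate/virtual device
    s.boot()
    s.start()

    mic_a = Input(chnl=0)         # first microphone -> channel 0 of the aggregate device
    mic_b = Input(chnl=1)         # second microphone -> channel 1

    mix = Mixer(outs=1, chnls=1)  # one mono output bus
    mix.addInput(0, mic_a)
    mix.addInput(1, mic_b)
    mix.setAmp(0, 0, 0.5)         # voice 0 -> bus 0 at half gain
    mix.setAmp(1, 0, 0.5)         # voice 1 -> bus 0 at half gain
    mix.out()

    s.gui(locals())               # keep the server running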
Related
I'm trying to build a virtual microphone that takes the physical microphone's input stream, modifies the audio using a neural network model written with Python, and replaces the sound stream with my program's output sound stream (in near realtime) so that other apps (e.g. Zoom, Skype, etc.) will receive my app's modified audio stream, for both Mac and Windows.
I've been reading for the last few hours and so far found several libraries that might (?) be able to work here: WebRTC, Soundflower, etc. Does anyone know if there's already a good library that can do or facilitate this? This might be wading into offtopic territory, but if this is doable entirely in Python, e.g. through eel or similar, does anyone know of any library that can do the same type of audio manipulation?
Thank you!
You can create a virtual microphone device using the official Core Audio APIs.
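If writing a Core Audio driver is more than you need, a lighter-weight route is to send the processed audio into an existing virtual device (e.g. Soundflower, which you already mentioned) and then pick that device as the microphone in Zoom/Skype. A rough, untested sketch using the sounddevice library; the device names and the process() function are placeholders, not part of the answer above:

    import sounddevice as sd

    def process(block):
        # placeholder for the neural-network inference step
        return block

    def callback(indata, outdata, frames, time, status):
        if status:
            print(status)
        outdata[:] = process(indata)

    # device=(input_name, output_name); list the names with sd.query_devices()
    with sd.Stream(device=("Built-in Microphone", "Soundflower (2ch)"),
                   samplerate=48000, blocksize=1024, channels=1,
                   dtype="float32", callback=callback):
        sd.sleep(60 * 1000)  # run for 60 seconds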
I want to make a program that runs a specific command whenever the system plays any sound. For example, when you receive a message on Facebook, a little alert sound plays; I want to recognise that 'peak'. How is this possible in Python?
Getting your audio data
I think what you are looking for is some way to loop back the system output so that your OS treats it as an input you can read. There are different ways of doing this, depending on your OS.
Since you mentioned in the comments that your OS is Windows 8.1, you can use a fork of PyAudio, PyAudio_portaudio: it is the normal PyAudio, extended to use WASAPI to loop your Windows system output back into something you can retrieve in Python.
Please see this other SO post on recording your system output with Python in case I missed anything, and thanks to #mate for posting the link to the PyAudio fork.
This is a quick explanation:
The official PyAudio build isn't able to record the output. But with Windows Vista and above, a new API, WASAPI, was introduced, which includes the ability to open a stream to an output device in loopback mode. In this mode the stream behaves like an input stream, with the ability to record the outgoing audio. To set the mode, one has to set a special flag (AUDCLNT_STREAMFLAGS_LOOPBACK, https://msdn.microsoft.com/de-de/library/windows/desktop/dd316551(v=vs.85).aspx). Since this flag is not supported in the official build, one needs to edit PortAudio as well as PyAudio to add loopback support.
New option: "as_loopback":(true|false)
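Putting that together, a rough, untested sketch of the capture loop with the fork might look like this. The as_loopback keyword is the fork-only option described above (it does not exist in official PyAudio), and the device index is a placeholder; on a real system you would loop over p.get_device_info_by_index() and pick your speakers/output device:

    import numpy as np
    import pyaudio  # the PyAudio_portaudio fork, not the official build

    CHUNK = 1024
    p = pyaudio.PyAudio()

    # Placeholder: inspect p.get_device_info_by_index(i) for every index
    # and pick the WASAPI output device you want to loop back.
    device = p.get_device_info_by_index(0)

    stream = p.open(format=pyaudio.paInt16,
                    channels=2,
                    rate=int(device["defaultSampleRate"]),
                    input=True,
                    input_device_index=device["index"],
                    frames_per_buffer=CHUNK,
                    as_loopback=True)  # the fork-only flag described above

    while True:
        # block now holds the system output audio as int16 samples
        block = np.frombuffer(stream.read(CHUNK), dtype=np.int16)
        # ...analyze the block here (see below)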
Analyzing your data
This will give you the data block by block (in the block size you specified). From there, you can do whatever DSP / peak analysis you like to work out which sound was played and what properties it has.
Here is a quick example to get you started on peak detection in Python. For more accurate results you could store the .wav files you want to recognize and perform a cross-correlation to check whether the same .wav file was played.
Cross correlation 1D Arrays (Mono Audio)
Cross correlation 2D Arrays (Stereo Audio)
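As a starting point for the cross-correlation idea, here is a minimal sketch assuming SciPy, a mono reference recording of the alert sound (the file name is a placeholder), and mono capture blocks; the threshold is something you would tune by experiment:

    import numpy as np
    from scipy.io import wavfile
    from scipy.signal import correlate

    # Reference recording of the alert you want to recognize (mono,
    # same sample rate as the capture stream).
    rate, reference = wavfile.read("facebook_alert.wav")
    reference = reference.astype(np.float32)
    reference /= np.max(np.abs(reference)) + 1e-9

    def contains_alert(block, threshold=0.6):
        """Return True if `block` correlates strongly with the stored alert."""
        block = block.astype(np.float32)
        block /= np.max(np.abs(block)) + 1e-9
        corr = correlate(block, reference, mode="full")
        # Normalize by the reference energy so a well-matched block scores ~1.0;
        # the threshold is a tuning parameter.
        score = np.max(np.abs(corr)) / (np.sum(reference ** 2) + 1e-9)
        return score > threshold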
If I have two keyboards (the default keyboard and an RFID reader) on my Linux machine, how can I tell in Python which device the input is coming from?
I can read the input using input(), but I need to distinguish between the two devices.
I'm assuming the RFID reader works over USB; if that's the case, it should show up as an input device just like the HID keyboard (I'm assuming HID, since you call it the default).
I'm using evdev in Python to do something similar myself.
You can find the documentation here: http://python-evdev.readthedocs.io/en/latest/tutorial.html#reading-events
It has lots of simple and useful examples, such as identifying devices and reading from multiple devices asynchronously.
I found it very easy to use.
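For reference, the multi-device pattern from that tutorial looks roughly like this (you need permission to read /dev/input, e.g. running as root or being in the input group); dev.name is what lets you tell the RFID reader apart from the keyboard:

    from select import select
    from evdev import InputDevice, list_devices, categorize, ecodes

    # Open every /dev/input/event* device; print the names once and note
    # which one is the keyboard and which one is the RFID reader.
    devices = {dev.fd: dev for dev in map(InputDevice, list_devices())}
    for dev in devices.values():
        print(dev.fd, dev.name)

    while True:
        readable, _, _ = select(devices, [], [])
        for fd in readable:
            for event in devices[fd].read():
                if event.type == ecodes.EV_KEY:
                    # devices[fd].name identifies which device produced this key
                    print(devices[fd].name, categorize(event))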
I am looking for a way to capture audio from a mic, transform it using a LADSPA/VST plugin, and publish the resulting sound as a stream available on an IP network. The type of filter applied and its parameters should be controlled via Python.
I am aware this is a pretty multi-layered question, but real-time sound processing in Python is terra incognita for me, so I am clutching at straws here, even on how to architect this solution.
I'm working on an audio mixing (DAW) web app and am considering using Python and the Python GStreamer bindings for the backend. I understand that I can contain the audio tracks of a single music project in a gst.Pipeline bin, but playback also appears to be controlled by this Pipeline.
Is it possible to create several "views" into the Pipeline representing the project, so that more than one client can grab an audio stream from this Pipeline at will, with the ability to seek?
If there is a better platform/library out there to use, I'd appreciate advice on that too. I'd prefer sticking to Python though, because my team members are already researching Python for other parts of this project.
Thanks very much!
You might want to look at Flumotion (www.flumotion.org). It is a Python-based streaming server built on GStreamer; you might be able to get implementation ideas from it for how to structure your application. It relies heavily on the Python library Twisted for its network handling.
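On the "several views" part specifically, the usual GStreamer building block for splitting one stream into multiple branches is the tee element, with each branch behind its own queue. A rough pygst 0.10 sketch of the idea, with placeholder elements (audiotestsrc stands in for the project's mixed audio, fakesink for a network sink serving a second client); note that this alone does not give each client an independent seek position:

    import gobject
    import pygst
    pygst.require("0.10")
    import gst

    pipeline = gst.Pipeline("project")

    src = gst.element_factory_make("audiotestsrc", "project-audio")
    tee = gst.element_factory_make("tee", "splitter")
    q1 = gst.element_factory_make("queue", "q1")
    q2 = gst.element_factory_make("queue", "q2")
    monitor = gst.element_factory_make("autoaudiosink", "monitor")
    client = gst.element_factory_make("fakesink", "client-stream")

    pipeline.add(src, tee, q1, q2, monitor, client)
    src.link(tee)
    tee.link(q1)
    q1.link(monitor)
    tee.link(q2)
    q2.link(client)

    pipeline.set_state(gst.STATE_PLAYING)
    gobject.MainLoop().run()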