I want to write a program that runs a specific command whenever the system plays any kind of sound. For example, when you receive a message on Facebook, you get a short alert sound. I want to recognise this ‘peak’. How is this possible in Python?
Getting your audio data
I think what you are looking for is some way to loop back the system output so that you can read it as if the OS treated it as an input. There are different ways of doing this, depending on your OS.
However, since you mentioned in the comments that your OS is Windows 8.1, you can use a fork of PyAudio, PyAudio_portaudio. It is the normal PyAudio, extended to use WASAPI to loop your Windows system output back into something you can retrieve in Python.
Please see this other SO post on recording your system output with Python in case I missed anything, and thanks to #mate for posting the link to the PyAudio fork.
This is a quick explanation:
The official PyAudio build isn't able to record the output. BUT with
Windows Vista and above, a new API, WASAPI was introduced, which
includes the ability to open a stream to an output device in loopback
mode. In this mode the stream will behave like an input stream, with
the ability to record the outgoing audio stream.
To set the mode, one has to set a special flag
(AUDCLNT_STREAMFLAGS_LOOPBACK,
https://msdn.microsoft.com/de-de/library/windows/desktop/dd316551(v=vs.85).aspx
). Since this flag is not supported in the official build one needs to
edit PortAudio as well as PyAudio, to add loopback support.
New option: "as_loopback":(true|false)
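As a rough sketch of how the quoted option might be used: the function below opens an output device as a loopback stream. It assumes the patched PyAudio fork is installed and that it accepts as_loopback as a keyword argument to open(); the official build does not. The function name open_loopback_stream, its parameters, and the default values are hypothetical and only there for illustration.

```python
def open_loopback_stream(device_index, rate=44100, channels=2, block_size=1024):
    """Open a WASAPI output device in loopback mode (patched PyAudio fork only).

    device_index is assumed to be the index of an *output* device that the
    fork exposes through WASAPI. All defaults here are illustrative.
    """
    import pyaudio  # imported lazily so this module loads without the fork

    pa = pyaudio.PyAudio()
    stream = pa.open(
        format=pyaudio.paInt16,           # 16-bit signed samples
        channels=channels,
        rate=rate,
        input=True,                       # a loopback stream is read like an input
        input_device_index=device_index,
        frames_per_buffer=block_size,
        as_loopback=True,                 # option added by the fork (assumption)
    )
    return pa, stream
```

Each call to stream.read(block_size) would then return one block of raw PCM bytes. The sample rate and channel count usually have to match what the output device is actually running at, so it is worth checking the device info before hard-coding them.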
Analyzing your data
This gives you the data block by block, in the block size you specified. From there, you can apply whatever DSP or peak analysis you like to work out which sound was played or what properties it has.
Here is a quick example to get you started on peak detection in Python. For more accurate results, you could store the .wav files you want to recognize and cross-correlate them against the captured audio to see whether the same file was played.
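A minimal peak-detection sketch, assuming each block is raw 16-bit mono PCM in native byte order (as a paInt16 stream would deliver). The function names block_rms and find_onsets and the threshold of 0.05 are made up for this example; a usable threshold depends on your system volume and has to be tuned.

```python
import math
from array import array

def block_rms(block):
    """Return the RMS level of a block of 16-bit mono PCM bytes, scaled to 0..1."""
    samples = array("h", block)  # "h" = signed 16-bit, native byte order
    if not samples:
        return 0.0
    mean_square = sum(s * s for s in samples) / len(samples)
    return math.sqrt(mean_square) / 32768.0

def find_onsets(blocks, threshold=0.05):
    """Return the indices of blocks where the level first rises above threshold."""
    onsets = []
    was_loud = False
    for index, block in enumerate(blocks):
        is_loud = block_rms(block) >= threshold
        if is_loud and not was_loud:
            onsets.append(index)  # quiet -> loud transition: a new sound started
        was_loud = is_loud
    return onsets
```

In a live setup you would call block_rms on every block read from the stream and run your command on each quiet-to-loud transition, rather than collecting the blocks in a list first.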
Cross correlation 1D Arrays (Mono Audio)
Cross correlation 2D Arrays (Stereo Audio)
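To make the cross-correlation idea concrete, here is a small pure-Python sketch for mono audio. It slides a stored template over the captured signal and reports where the normalized correlation is highest. The name best_match is hypothetical, and this is not the code from the linked posts. For real audio lengths, numpy.correlate or scipy.signal.correlate would be the usual choice because this loop is slow.

```python
import math

def best_match(signal, template):
    """Slide template over signal and return (offset, score) of the best match.

    score is the normalized cross-correlation, in the range -1..1.
    Returns (-1, 0.0) if no window correlates positively.
    """
    n = len(template)
    t_norm = math.sqrt(sum(t * t for t in template))
    best_offset, best_score = -1, 0.0
    for offset in range(len(signal) - n + 1):
        window = signal[offset:offset + n]
        w_norm = math.sqrt(sum(w * w for w in window))
        if t_norm == 0 or w_norm == 0:
            continue  # a silent window (or template) cannot match anything
        dot = sum(w * t for w, t in zip(window, template))
        score = dot / (w_norm * t_norm)
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset, best_score
```

A score close to 1.0 suggests the stored sound was played at that offset; how close is close enough is something to decide by experiment. For stereo data, one option is to mix both channels down to mono first, or to run the same comparison on each channel separately.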
Related
TLDR: Is there a way to mix multiple input devices' audio streams in real-time in Python (ideally with Pyo)?
I'm trying to use Pyo to combine audio streams from multiple microphones (or virtual microphones, e.g., Loopback devices) into one output audio stream. I've tried using pyo.Input, but it seems to have a core limitation: in Input(x), x seems to refer to a channel, not an input device. The input device seems to be constrained by the server: for s = pyo.Server(duplex=1), s.setInputDevice(someDeviceIndex) seems to be the only way to choose which input device will be used, with x then referring to a channel of that device (e.g., for a stereo microphone).
I have not seen any information in Pyo's docs (or in what I've read about other libraries) about how to mix multiple input devices together. pyo.Mixer seems promising, but examples given in the docs don't demonstrate how to use multiple Inputs, just other sound sources (e.g., a sine wave).
Is there any way that I can access multiple input devices within the scope of one Server and mix them together? If Pyo is not a good option for this task, what other libraries (and/or vanilla Python) might be a good fit?
Resolved on the Pyo forums at this link.
I'm looking to use Python SoundDevice to record the system audio. In other words, instead of using a microphone to detect sounds, I just want to pull into my sounddevice.InputStream() whatever would normally be coming out of the speakers of my computer.
I've found various techniques for doing this with PyAudio (here, for example), but I need to use SoundDevice for this. How can I access system audio?
I'm trying to build a virtual microphone that takes the physical microphone's input stream, modifies the audio using a neural network model written with Python, and replaces the sound stream with my program's output sound stream (in near realtime) so that other apps (e.g. Zoom, Skype, etc.) will receive my app's modified audio stream, for both Mac and Windows.
I've been reading for the last few hours and so far found several libraries that might (?) be able to work here: WebRTC, Soundflower, etc. Does anyone know if there's already a good library that can do or facilitate this? This might be wading into offtopic territory, but if this is doable entirely in Python, e.g. through eel or similar, does anyone know of any library that can do the same type of audio manipulation?
Thank you!
You can create a virtual microphone device using the official Core Audio APIs.
I'm trying to write something that captures the audio being played to the speakers/headphones/soundcard and checks whether audio is playing and what the longest silence is. This is to test an application while it runs and to see whether it stops playing audio after a certain point. As such, I don't need to know what the audio itself is, just whether or not audio is playing.
I need this to be fully programmatic (so not requiring the use of GUI tools or the like, to set up an environment). I know applications like projectM do this, I just can't for the life of me find anything anywhere that denotes how.
An audio level meter would also work for this, as would oscilloscope data or the like. I would really take any recommendation.
Here is a very similar question: record output sound in python
You could try to route your output to a new device with JACK and record it with PortAudio. There are Python bindings for PortAudio called PyAudio and for JACK called PyJack. I have never used the latter, but PyAudio works great.
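Once the output is being recorded block by block, measuring the longest silence is straightforward. The sketch below assumes 16-bit mono PCM blocks in native byte order; the function name longest_silence and the threshold of 500 are invented for this example and would need tuning against the noise floor of your setup.

```python
from array import array

def longest_silence(blocks, rate, threshold=500):
    """Return the longest run of silence, in seconds.

    blocks: iterable of 16-bit mono PCM byte strings.
    rate: sample rate in Hz.
    threshold: peak amplitude (0..32767) below which a block counts as silent.
    """
    longest = current = 0  # both counted in samples
    for block in blocks:
        samples = array("h", block)
        peak = max((abs(s) for s in samples), default=0)
        if peak < threshold:
            current += len(samples)      # extend the current silent run
            longest = max(longest, current)
        else:
            current = 0                  # sound detected, reset the run
    return longest / rate
```

The result is only accurate to one block, since a block counts as either fully silent or fully audible. A smaller block size gives finer resolution at the cost of more calls.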
Is it possible to get the system output audio (the exact same thing that goes through the speakers) and analyze it in real time with Python? My intention is to build a sound visualizer. I know that it is possible to access the microphone with pyaudio, but I was not able to access the sound card output in any way. I'm looking for a solution that works on Windows.
Thank you for reading.
I'm not sure how this project is doing these days; it has not been updated in a long time. PyVST allows you to run Python code in a VST inside a VST host, which makes it possible to handle realtime audio events.
You might want to look at http://code.google.com/p/pyo/ for some ideas about how to handle DSP data as well.