We have a Python module (using win32) that detects user mouse and keyboard activity via GetLastInputInfo and GetTickCount. How can we register voice activity in GetLastInputInfo?
Or could we add a synthesized input to update GetLastInputInfo every time the mic detects voice input? And can we do that without interrupting the user?
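One possibility (a hedged sketch, not something from the original module): Windows counts input injected with SendInput toward GetLastInputInfo, so sending a zero-delta relative mouse move whenever the mic detects voice should reset the idle timer without visibly moving the cursor or stealing focus. The struct layout below collapses the INPUT union to its largest member (MOUSEINPUT) for brevity; treat the whole thing as an untested assumption.

import ctypes
from ctypes import wintypes

MOUSEEVENTF_MOVE = 0x0001
INPUT_MOUSE = 0

class MOUSEINPUT(ctypes.Structure):
    _fields_ = [("dx", wintypes.LONG),
                ("dy", wintypes.LONG),
                ("mouseData", wintypes.DWORD),
                ("dwFlags", wintypes.DWORD),
                ("time", wintypes.DWORD),
                ("dwExtraInfo", ctypes.POINTER(wintypes.ULONG))]

class INPUT(ctypes.Structure):
    # NOTE: the real INPUT struct holds a union; MOUSEINPUT is its largest
    # member, so this simplified layout has the correct size for mouse input.
    _fields_ = [("type", wintypes.DWORD),
                ("mi", MOUSEINPUT)]

def nudge_idle_timer():
    """Send a zero-movement mouse event so GetLastInputInfo is refreshed."""
    inp = INPUT(type=INPUT_MOUSE,
                mi=MOUSEINPUT(dx=0, dy=0, mouseData=0,
                              dwFlags=MOUSEEVENTF_MOVE, time=0,
                              dwExtraInfo=None))
    ctypes.windll.user32.SendInput(1, ctypes.byref(inp), ctypes.sizeof(INPUT))

Calling nudge_idle_timer() from the voice-detection loop below would then keep GetLastInputInfo current while the user talks.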
Sample PyAudio code to detect user voice by volume:
import pyaudio
from array import array

audio = pyaudio.PyAudio()
FORMAT = pyaudio.paInt16
CHANNELS = 2
RATE = 44100
CHUNK = 1024

# recording prerequisites
stream = audio.open(format=FORMAT, channels=CHANNELS,
                    rate=RATE,
                    input=True,
                    frames_per_buffer=CHUNK)

while True:
    data = stream.read(CHUNK)
    data_chunk = array('h', data)
    vol = max(data_chunk)
    if vol >= 500:
        # voice detected from mic
        print("talking - {}".format(vol))
    else:
        print("-")
Sample code for detecting user input:
# code to get inactivity
import ctypes
from ctypes import Structure, POINTER, WINFUNCTYPE
from ctypes.wintypes import BOOL, UINT, DWORD

class LastInputInfo(Structure):
    _fields_ = [
        ("cbSize", UINT),
        ("dwTime", DWORD)
    ]

def _getLastInputTick() -> int:
    """
    retrieves the tick of the last input action
    :return: int
    """
    prototype = WINFUNCTYPE(BOOL, POINTER(LastInputInfo))
    paramflags = ((1, "lastinputinfo"), )
    c_GetLastInputInfo = prototype(("GetLastInputInfo", ctypes.windll.user32), paramflags)  # type: ignore
    l = LastInputInfo()
    l.cbSize = ctypes.sizeof(LastInputInfo)
    assert 0 != c_GetLastInputInfo(l)
    return l.dwTime

def _getTickCount() -> int:
    """
    :return: int
    current tick count
    """
    prototype = WINFUNCTYPE(DWORD)
    paramflags = ()
    c_GetTickCount = prototype(("GetTickCount", ctypes.windll.kernel32), paramflags)  # type: ignore
    return c_GetTickCount()

def seconds_since_last_input() -> float:
    """
    :return: float
    seconds elapsed since the last user input
    """
    return (_getTickCount() - _getLastInputTick()) / 1000
# inactivity threshold in seconds
inactive_seconds = 10
afk = False
while True:
    seconds_since_input = seconds_since_last_input()
    # becomes active
    if afk and seconds_since_input < inactive_seconds:
        afk = False
    # becomes afk
    elif not afk and seconds_since_input >= inactive_seconds:
        afk = True
    print("afk status: {}, seconds since last input: {}".format(afk, seconds_since_input))
If you want to do something without interrupting the user, you can use multithreading with the threading module.
If you want to save something in a variable that every thread can use, you can use queue.
This will run whatever you need to run in a different thread and save the result to a shared variable.
Import the modules:
import threading
import queue
Create a shared variable
shared_var = queue.Queue()
Create a function that checks for what you want (in this case audio) and edits the shared variable.
Edit the shared variable with shared_var.put(item)
(in this case, whenever audio is detected you can call audio_detected.put(True) and/or current_tick_count.put(tick_count), or something like that)
Create a thread and pass in the function you made to check:
thread = threading.Thread(target=function, args=arguments)
where target is the function you want to call in this new thread, and args are the arguments you need to pass into your function.
Start the new thread
thread.start()
On the main thread or a new thread, do what you want with that variable:
shared_var.get() will wait until something is added to shared_var and then return what was added.
Example code:
import threading
import queue
import time
text = queue.Queue()
def change(text):
    time.sleep(3)
    text.put("hello world")

thread = threading.Thread(target=change, args=(text,))
# ^ IMPORTANT! (,)
thread.start()

def display(text):
    text = text.get()  # This will wait until text has something inside, then return it
    print(text)

thread2 = threading.Thread(target=display, args=(text,))
# ^ IMPORTANT! (,)
thread2.start()
input() # To show it won't interrupt the user until the text has something
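To make this concrete for the original question, here is a sketch (assumptions flagged in the comments: the 500 volume threshold and stream parameters from the question, plus the seconds_since_last_input helper above) of a mic-watcher thread that reports voice activity through a queue, which the main thread can consume however it likes:

import threading
import queue
from array import array
import pyaudio

voice_events = queue.Queue()

def watch_mic():
    # assumption: same stream parameters as the question's sample code
    audio = pyaudio.PyAudio()
    stream = audio.open(format=pyaudio.paInt16, channels=2, rate=44100,
                        input=True, frames_per_buffer=1024)
    while True:
        data_chunk = array('h', stream.read(1024))
        if max(data_chunk) >= 500:   # same threshold as the question
            voice_events.put(True)   # signal: voice detected

threading.Thread(target=watch_mic, daemon=True).start()

while True:
    voice_events.get()  # blocks until the watcher reports voice
    # react here, e.g. treat this moment as "user activity"
    print("voice detected,", seconds_since_last_input(), "s since last input")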
I'm sorry if this answer isn't so clear. I'm not familiar with pyaudio and win32, but I do know threading and queue, so you can work from this and add your code. If you want, you could edit the answer with your code in it.
I hope this helps!
I am writing some code with three processes (spawned from the main). The first one uses asyncio to create three coroutines and switch between them. The last two processes run independently and generate two outputs that are used in one of the coroutines of the first process.
Communication is managed with multiprocessing.Queue(): the main puts the input data into queue_source_position_hrir_calculator and queue_source_position_cutoff_calculator, then these two queues are emptied by p2_hrir_computation_process and p3_cutoff_computation_process. These two processes put their computation results into two output queues, queue_computed_hrirs and queue_computed_cutoff.
Finally, these two queues are consumed by the asyncio process, in particular inside the input_parameters_coroutine function.
The full code is the following (I will highlight the key parts in following snippets):
import asyncio
import multiprocessing
import numpy as np
import time
from classes.HRIR_interpreter_min_phase_linear_interpolation import HRIR_interpreter_min_phase_linear_interpolation
from classes.object_renderer import ObjectRenderer
#Useful resources: https://bbc.github.io/cloudfit-public-docs/asyncio/asyncio-part-2
#https://realpython.com/async-io-python/
Fs = 44100
# region Async_IO functions
async def audio_input_coroutine(overlay):
for i in range(0,100):
print('Executing audio input coroutine')
print(overlay)
await asyncio.sleep(1/(Fs*4))
async def input_parameters_coroutine(overlay, queue_computed_hrirs,queue_computed_cutoff):
for i in range(0,10):
print('Executing audio input_parameters coroutine')
#print(overlay)
current_hrir = queue_computed_hrirs.get()
print('got current hrir')
current_cutoff = queue_computed_cutoff.get()
print('got current cutoff')
await asyncio.sleep(0.5)
async def audio_output_coroutine(overlay):
for i in range(0,10):
print('Executing audio_output coroutine')
#print(overlay)
await asyncio.sleep(0.5)
async def main_coroutine(overlay, queue_computed_hrirs,queue_computed_cutoff):
await asyncio.gather(audio_input_coroutine(overlay), input_parameters_coroutine(overlay, queue_computed_hrirs,queue_computed_cutoff), audio_output_coroutine(overlay))
def async_IO_main_process(queue_computed_hrirs,queue_computed_cutoff):
overlay = 10
asyncio.run(main_coroutine(overlay, queue_computed_hrirs,queue_computed_cutoff))
# endregion
# region HRIR_computation_process
def compute_hrir(queue_source_position, queue_computed_hrirs):
print('computing hrir')
SOFA_filename = '../HRTF_data/HUTUBS_min_phase.sofa'
# loading the simulated dataset using the support class HRIRInterpreter
HRIRInterpreter = HRIR_interpreter_min_phase_linear_interpolation(SOFA_filename=SOFA_filename)
# variable to check if I have other positions in my input queue
eof_source_position = False
# Un-comment following line to return when no more messages
while not eof_source_position:
#while True:
# print('inside while loop')
time.sleep(1)
# print('state of the queue', queue_source_position.empty())
if not eof_source_position:
position = queue_source_position.get()
if position is None:
eof_source_position = True # end of messages indicator
else:
required_IR = HRIRInterpreter.get_interpolated_IR(position[0], position[1], 1)
queue_computed_hrirs.put(required_IR)
# print('printing computed HRIR:', required_IR)
print('completed hrir computation, adding none to queue')
queue_computed_hrirs.put(None) # end of messages indicator
print('completed hrir process')
# endregion
# region cutoff_computation_process
def compute_cutoff(queue_source_position, queue_computed_cutoff):
print('computing cutoff')
cutoff = 20000
object_renderer = ObjectRenderer()
object_positions = np.array([(20, 0), (40, 0), (100, 0), (225, 0)])
eof_source_position = False
# Un-comment following line to return when no more messages
while not eof_source_position:
#while True:
time.sleep(1)
object_renderer.update_object_position(object_positions)
if not eof_source_position:
print('inside source position update')
source_position = queue_source_position.get()
if source_position is None: # end of messages indicator
eof_source_position = True
else:
cutoff = object_renderer.get_cutoff(azimuth=source_position[0], elevation=source_position[1])
queue_computed_cutoff.put(cutoff)
queue_computed_cutoff.put(None) # end of messages indicator
# endregion
if __name__ == "__main__":
import time
queue_source_position_hrir_calculator = multiprocessing.Queue()
queue_source_position_cutoff_calculator = multiprocessing.Queue()
queue_computed_hrirs = multiprocessing.Queue()
queue_computed_cutoff = multiprocessing.Queue()
i = 0.0
#Basically here I am writing a sequence of positions into the queue
#then I add a None value to detect when I am done with the simulation so the process can end
for _ in range(10):
# print('into main while-> source_position:', source_position[0])
source_position = np.array([i, 0.0])
queue_source_position_hrir_calculator.put(source_position)
queue_source_position_cutoff_calculator.put(source_position)
i += 10
queue_source_position_hrir_calculator.put(None) # "end of messages" indicator
queue_source_position_cutoff_calculator.put(None) # "end of messages" indicator
p1_async_IO_process = multiprocessing.Process(target=async_IO_main_process, args=(queue_computed_hrirs,queue_computed_cutoff)) #process that manages the ASYNC_IO coroutines between DMAs
p2_hrir_computation_process = multiprocessing.Process(target=compute_hrir, args=(queue_source_position_hrir_calculator, queue_computed_hrirs))
p3_cutoff_computation_process = multiprocessing.Process(target=compute_cutoff, args=(queue_source_position_cutoff_calculator, queue_computed_cutoff))
p1_async_IO_process.start()
p2_hrir_computation_process.start()
p3_cutoff_computation_process.start()
#temp cycle to join processes
#for _ in range(2):
# current_hrir = queue_computed_hrirs.get()
# current_cutoff = queue_computed_cutoff.get()
print('joining async_IO process')
p1_async_IO_process.join()
print('joined async_IO process')
#NB: to join a process, its queues must be empty. So before calling join on p2, I should get the values from the queue_computed_hrirs queue
print('joining hrir computation process')
p2_hrir_computation_process.join()
print('joined hrir computation process')
print('joining cutoff computation process')
p3_cutoff_computation_process.join()
print('joined cutoff computation process')
print("completed main")
The important part of the code is:
async def input_parameters_coroutine(overlay, queue_computed_hrirs,queue_computed_cutoff):
for i in range(0,10):
print('Executing audio input_parameters coroutine')
#print(overlay)
current_hrir = queue_computed_hrirs.get()
print('got current hrir')
current_cutoff = queue_computed_cutoff.get()
print('got current cutoff')
await asyncio.sleep(0.5)
This coroutine receives three inputs: overlay (a dummy variable I am using for future developments) and the two multiprocessing.Queue() objects, queue_computed_hrirs and queue_computed_cutoff.
At the moment my input_parameters_coroutine gets "stuck" while executing current_hrir = queue_computed_hrirs.get() and current_cutoff = queue_computed_cutoff.get(). I say "stuck" because the code works fine and completes its execution; the problem is that those two calls are blocking, so my coroutine stops until it has something to get from the queue.
What I would like to achieve is: try to execute current_hrir = queue_computed_hrirs.get(); if it is not possible at that moment, switch to another coroutine and let it do its work, then come back and check whether queue_computed_hrirs.get() can proceed; if yes, go on, if not, switch to another coroutine again.
I saw that there are some problems in making asyncio and multiprocessing communicate (What kind of problems (if any) would there be combining asyncio with multiprocessing?, Can I somehow share an asynchronous queue with a subprocess?) but I wasn't able to find a smart solution to my problem.
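One common workaround (a sketch under the assumption that the rest of the posted code stays unchanged) is to hand the blocking Queue.get to the default thread-pool executor with loop.run_in_executor; the event loop then runs the other coroutines while a worker thread waits on the queue:

import asyncio

async def input_parameters_coroutine(overlay, queue_computed_hrirs, queue_computed_cutoff):
    loop = asyncio.get_running_loop()
    for i in range(0, 10):
        print('Executing audio input_parameters coroutine')
        # A worker thread blocks on get(); this coroutine is suspended
        # until the result arrives, so the other coroutines keep running.
        current_hrir = await loop.run_in_executor(None, queue_computed_hrirs.get)
        print('got current hrir')
        current_cutoff = await loop.run_in_executor(None, queue_computed_cutoff.get)
        print('got current cutoff')
        await asyncio.sleep(0.5)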
I am new to Google APIs and web services. I only tried the Google Translate API once, and that worked fine. Now I want to use the Google Media Translation API to translate voice input. I followed the tutorial at https://cloud.google.com/translate/media/docs/streaming.
However, I cannot make it work. There is no error at run time, so I don't know where to look. Could you please help me identify the problem?
# [START media_translation_translate_from_mic]
from __future__ import division
import itertools
from google.cloud import mediatranslation as media
import pyaudio
from six.moves import queue
import os
os.environ["GOOGLE_APPLICATION_CREDENTIALS"]="/Users/Me/GoogleMT/TranslationAPI/MediaKey.json"
# Audio recording parameters
RATE = 16000
CHUNK = int(RATE / 10) # 100ms
SpeechEventType = media.StreamingTranslateSpeechResponse.SpeechEventType
class MicrophoneStream:
"""Opens a recording stream as a generator yielding the audio chunks."""
def __init__(self, rate, chunk):
self._rate = rate
self._chunk = chunk
# Create a thread-safe buffer of audio data
self._buff = queue.Queue()
self.closed = True
def __enter__(self):
self._audio_interface = pyaudio.PyAudio()
self._audio_stream = self._audio_interface.open(
format=pyaudio.paInt16,
channels=1, rate=self._rate,
input=True, frames_per_buffer=self._chunk,
# Run the audio stream asynchronously to fill the buffer object.
# This is necessary so that the input device's buffer doesn't
# overflow while the calling thread makes network requests, etc.
stream_callback=self._fill_buffer,
)
self.closed = False
return self
def __exit__(self, type=None, value=None, traceback=None):
self._audio_stream.stop_stream()
self._audio_stream.close()
self.closed = True
# Signal the generator to terminate so that the client's
# streaming_recognize method will not block the process termination.
self._buff.put(None)
self._audio_interface.terminate()
def _fill_buffer(self, in_data, frame_count, time_info, status_flags):
"""Continuously collect data from the audio stream, into the buffer."""
self._buff.put(in_data)
return None, pyaudio.paContinue
def exit(self):
self.__exit__()
def generator(self):
while not self.closed:
# Use a blocking get() to ensure there's at least one chunk of
# data, and stop iteration if the chunk is None, indicating the
# end of the audio stream.
chunk = self._buff.get()
if chunk is None:
return
data = [chunk]
# Now consume whatever other data's still buffered.
while True:
try:
chunk = self._buff.get(block=False)
if chunk is None:
return
data.append(chunk)
except queue.Empty:
break
yield b''.join(data)
def listen_print_loop(responses):
"""Iterates through server responses and prints them.
The responses passed is a generator that will block until a response
is provided by the server.
"""
translation = ''
source = ''
for response in responses:
# Once the transcription settles, the response contains the
# END_OF_SINGLE_UTTERANCE event.
if (response.speech_event_type ==
SpeechEventType.END_OF_SINGLE_UTTERANCE):
print(u'\nFinal translation: {0}'.format(translation))
print(u'Final recognition result: {0}'.format(source))
return 0
result = response.result
translation = result.text_translation_result.translation
source = result.recognition_result
print(u'\nPartial translation: {0}'.format(translation))
print(u'Partial recognition result: {0}'.format(source))
def do_translation_loop():
print('Begin speaking...')
client = media.SpeechTranslationServiceClient()
speech_config = media.TranslateSpeechConfig(
audio_encoding='linear16',
source_language_code='en-US',
target_language_code='ja')
config = media.StreamingTranslateSpeechConfig(
audio_config=speech_config, single_utterance=True)
# The first request contains the configuration.
# Note that audio_content is explicitly set to None.
first_request = media.StreamingTranslateSpeechRequest(
streaming_config=config, audio_content=None)
with MicrophoneStream(RATE, CHUNK) as stream:
audio_generator = stream.generator()
mic_requests = (media.StreamingTranslateSpeechRequest(
audio_content=content,
streaming_config=config)
for content in audio_generator)
requests = itertools.chain(iter([first_request]), mic_requests)
responses = client.streaming_translate_speech(requests)
# Print the translation responses as they arrive
result = listen_print_loop(responses)
if result == 0:
stream.exit()
def main():
while True:
print()
option = input('Press any key to translate or \'q\' to quit: ')
if option.lower() == 'q':
break
do_translation_loop()
if __name__ == '__main__':
main()
# [END media_translation_translate_from_mic]
The result is like this: no translation and no recognition result.
(Result screenshot not reproduced here.)
I was not sure if the problem was with my mic, so I tried similar example code from another Google tutorial that translates an audio file. The result was the same: no recognition result and no translation.
Did I miss something?
Thank you very much.
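One way to rule the microphone in or out (a hedged debugging sketch that reuses the MicrophoneStream class above and the volume check from the first question; it is not part of the Google sample) is to print the peak amplitude of each captured chunk before involving the API at all. If the peaks stay near zero, the problem is audio capture, not translation:

from array import array

with MicrophoneStream(RATE, CHUNK) as stream:
    for chunk in stream.generator():
        samples = array('h', chunk)             # 16-bit samples, matches paInt16
        print('peak amplitude:', max(samples))  # near zero -> mic delivers no audio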
I want to use the bleak library in Python to receive data from a Bluetooth Low Energy device. This part is working. My problem is that I don't know how to run this code in the background or in parallel.
Eventually, I want to build a tiny Python app that processes the data from the Bluetooth device. So bleak loops all the time, fetching data from the device and sending it to the main process, where it is processed and displayed.
For some reason, bleak does not run in a thread. Is it possible to use asyncio for this (since it is already used by bleak, maybe that is a good way to go)?
I checked out threads and multiprocessing, but I only found examples without processes that loop infinitely and send data. I'm totally new to parallelization and asynchronous processing. Maybe one of you can give a hint where to look for a proper solution for this case.
Below is my code so far (for now I just loop and print data).
import asyncio
import json
import time

from bleak import BleakClient
current_index = 0
time_array = [0] * 20
def TicTocGenerator():
# Generator that returns time differences
ti = 0 # initial time
tf = time.time() # final time
while True:
ti = tf
tf = time.time()
yield tf-ti # returns the time difference
TicToc = TicTocGenerator() # create an instance of the TicTocGen generator
# This will be the main function through which we define both tic() and toc()
def toc(tempBool=True):
# Prints the time difference yielded by generator instance TicToc
tempTimeInterval = next(TicToc)
global current_index
if tempBool:
#print( "Elapsed time: %f seconds.\n" %tempTimeInterval )
time_array[current_index] = tempTimeInterval
if current_index == 19:
current_index = 0
else:
current_index += 1
def tic():
# Records a time in TicToc, marks the beginning of a time interval
toc(False)
def Average(lst):
return sum(lst) / len(lst)
#address = "30:ae:a4:5d:bc:ba"
address = "CCA9907B-10EA-411E-9816-A5E247DCA0C7"
MODEL_NBR_UUID = "beb5483e-36e1-4688-b7f5-ea07361b26a8"
async def run(address, loop):
async with BleakClient(address, loop=loop) as client:
while True:
tic()
model_number = await client.read_gatt_char(MODEL_NBR_UUID)
toc()
json_payload=json.loads(model_number)
print()
print(json_payload)
print("Temp [°C]: "+"{:.2f}".format(json_payload["Temp"]))
print("Volt [V]: "+"{:.2f}".format(json_payload["Volt"]))
print("AngX: "+str(json_payload["AngX"]))
print("AngY: "+str(json_payload["AngY"]))
print("AngZ: "+str(json_payload["AngZ"]))
#print("Millis: {0}".format("".join(map(chr, model_number))))
print("Average [ms]: {:.1f}".format(Average(time_array)*1000))
loop = asyncio.get_event_loop()
loop.run_until_complete(run(address, loop))
I had to make a GUI for an app that automates FUOTA on multiple BLE devices, so my solution was to put the bleak loop in a separate thread in order to be able to use the tkinter mainloop in the main thread. You need to use asyncio.run_coroutine_threadsafe to schedule a new task from the main thread.
import asyncio
from threading import Thread
import tkinter as tk

from bleak import BleakScanner

async def scan():
    devices = await BleakScanner.discover()
    for device in devices:
        print(device)

def startScan():
    # call startScan() from the main thread
    asyncio.run_coroutine_threadsafe(scan(), loop)
if __name__ == "__main__":
window = tk.Tk()
# ...
loop = asyncio.get_event_loop()
def bleak_thread(loop):
asyncio.set_event_loop(loop)
loop.run_forever()
t = Thread(target=bleak_thread, args=(loop,))
t.start()
window.mainloop()
loop.call_soon_threadsafe(loop.stop)
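As a side note, asyncio.run_coroutine_threadsafe returns a concurrent.futures.Future, so the main (tkinter) thread can also collect a result from the bleak thread. A minimal sketch, assuming scan() is changed to return its findings instead of printing them:

async def scan_and_return():
    return await BleakScanner.discover()

def startScanWithResult():
    future = asyncio.run_coroutine_threadsafe(scan_and_return(), loop)
    # result() blocks the calling thread; use a timeout, or poll
    # future.done() from a tkinter after() callback to keep the GUI responsive.
    devices = future.result(timeout=10.0)
    for device in devices:
        print(device)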
I'm trying to create a simple app that loads WAV files (one for each note of a keyboard) and plays specific ones when a MIDI note is pressed (or played). So far, I've created a MIDI input stream using mido and an audio stream using pyaudio in two separate threads. The goal is for the MIDI stream to update the currently playing notes, and for the callback of the pyaudio stream to check for active notes and play them. The MIDI stream works fine, but my audio stream only seems to call the callback once, right when the script is started (print(notes)). Any idea how I can get the audio stream callback to be called continuously?
import wave
from io import BytesIO
import os
from mido import MidiFile
import pyaudio
from time import sleep
from threading import Thread
import numpy
# Pipe: active, released
# Rank: many pipes
# Stop: one or more ranks
# Manual: multiple ranks
# Organ: multiple manuals
pipes = []
notes = []
p = pyaudio.PyAudio()
def mapRange(num, inMin, inMax, outMin, outMax):
return int((num - inMin) * (outMax - outMin) / (inMax - inMin) + outMin)
def callback(in_data, frame_count, time_info, status):
    data = bytes(frame_count)
    print(notes)
    for note in notes:
        pipedata = bytes()
        if len(data) != 0:
            data1 = numpy.frombuffer(data, numpy.int16)
            data2 = numpy.frombuffer(note['sample'].readframes(frame_count), numpy.int16)
            pipedata = (data1 * 0.5 + data2 * 0.5).astype(numpy.int16)
        else:
            data2 = numpy.frombuffer(note['sample'].readframes(frame_count), numpy.int16)
            pipedata = data2.astype(numpy.int16)
        data = pipedata.tobytes()
    return (data, pyaudio.paContinue)
stream = p.open(format=pyaudio.paInt24,
channels=2,
rate=48000,
output=True,
stream_callback=callback,
start=True)
# start the stream (4)
stream.start_stream()
for root, dirs, files in os.walk("samples"):
for filename in files:
file_on_disk = open(os.path.join(root, filename), 'rb')
pipes.append(
{"sample": wave.open(BytesIO(file_on_disk.read()), 'rb')})
for msg in MidiFile('test.mid').play():
if msg.type == "note_on":
notes.append(pipes[mapRange(msg.note, 36, 96, 0, 56)])
print("on")
if msg.type == "note_off":
#notes[mapRange(msg.note, 36, 96, 0, 56)] = False
print("off")
# wait for stream to finish (5)
while stream.is_active():
sleep(0.1)
# stop stream (6)
stream.stop_stream()
stream.close()
# close PyAudio (7)
p.terminate()
I too faced this issue and found this question hoping for an answer; I ended up figuring it out myself.
The data returned by the callback must match the number of frames (the frames_per_buffer parameter in p.open). I see you didn't specify one, so I think the default is 1024.
The thing is, frames_per_buffer does not count bytes but actual frames, and a frame holds one sample per channel. Since you specify the format as pyaudio.paInt24, one sample is 3 bytes (24 / 8), and with channels=2 one frame is 6 bytes. So for 1024 frames your callback should be returning 6144 bytes, or the callback will not be called again.
If you were using blocking mode and not writing those 6144 bytes in stream.write(), it would result in a weird effect of slow and crackling audio.
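For illustration, a minimal sketch of a callback that always returns exactly the expected number of bytes for the stream parameters above (paInt24, two channels); mix_active_notes is a hypothetical stand-in for the questioner's mixing loop:

import pyaudio

BYTES_PER_SAMPLE = 3                           # paInt24 -> 24 / 8
CHANNELS = 2
BYTES_PER_FRAME = BYTES_PER_SAMPLE * CHANNELS  # 6 bytes per frame

def callback(in_data, frame_count, time_info, status):
    expected = frame_count * BYTES_PER_FRAME   # e.g. 1024 * 6 = 6144 bytes
    data = mix_active_notes(frame_count)       # hypothetical mixing helper
    if len(data) < expected:
        data += b'\x00' * (expected - len(data))  # pad the remainder with silence
    return (data, pyaudio.paContinue)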
I'm having some problems and I cannot seem to get my head around the concept.
What I am trying to do is this:
Have the microphone "listen" for voiced audio (above a particular threshold) and then start recording to a .wav file until the person has stopped speaking / the signal is no longer there. For example:
begin:
listen() -> nothing is being said
listen() -> nothing is being said
listen() -> VOICED - _BEGIN RECORDING_
listen() -> VOICED - _BEGIN RECORDING_
listen() -> UNVOICED - _END RECORDING_
end
I want to do this using threading: one thread constantly "listens" to the audio, and another thread begins recording when there is voiced data. But I cannot for the life of me figure out how to go about it. Here is my code so far:
import wave
import sys
import threading
from array import array
from sys import byteorder
try:
import pyaudio
CHECK_PYLIB = True
except ImportError:
CHECK_PYLIB = False
class Audio:
_chunk = 0.0
_format = 0.0
_channels = 0.0
_rate = 0.0
record_for = 0.0
stream = None
p = None
sample_width = None
THRESHOLD = 500
# initial constructor to accept params
def __init__(self, chunk, format, channels, rate):
    #### set data-types
    self._chunk = chunk
    self.format = pyaudio.paInt16
    self.channels = channels
    self.rate = rate
    self.p = pyaudio.PyAudio()

def open(self):
    # print "opened"
    self.stream = self.p.open(format=pyaudio.paInt16,
                              channels=2,
                              rate=44100,
                              input=True,
                              frames_per_buffer=1024)
    return True

def record(self):
    # create a new instance/thread to record the sound
    threading.Thread(target=self.listen).start()
def is_silence(self, snd_data):
    return max(snd_data) < self.THRESHOLD
def listen(self):
r = array('h')
while True:
snd_data = array('h', self.stream.read(self._chunk))
if byteorder == 'big':
snd_data.byteswap()
r.extend(snd_data)
return sample_width, r
I'm guessing that I could record in 5-second blocks and, if a block is deemed "voiced", start the thread until all the voice data has been captured. However, because it currently sits in while True:, I don't want to capture all of the audio up until the voiced commands; e.g. for "no voice", "no voice", "voice", "voice", "no voice", "no voice" I just want the "voice" parts inside the wav file. Anyone have any suggestions?
Thank you
EDIT:
import wave
import sys
import time
import threading
from array import array
from sys import byteorder
from queue import Queue, Full
import pyaudio
CHUNK_SIZE = 1024
MIN_VOLUME = 500
BUF_MAX_SIZE = 1024 * 10
process_g = 0
def main():
stopped = threading.Event()
q = Queue(maxsize=int(round(BUF_MAX_SIZE / CHUNK_SIZE)))
listen_t = threading.Thread(target=listen, args=(stopped, q))
listen_t.start()
process_g = threading.Thread(target=process, args=(stopped, q))
process_g.start()
try:
while True:
listen_t.join(0.1)
process_g.join(0.1)
except KeyboardInterrupt:
stopped.set()
listen_t.join()
process_g.join()
def process(stopped, q):
    while True:
        if stopped.wait(timeout=0):
            break
        print("I'm processing..")
        time.sleep(300)
def listen(stopped, q):
stream = pyaudio.PyAudio().open(
format = pyaudio.paInt16,
channels = 2,
rate = 44100,
input = True,
frames_per_buffer = 1024
)
while True:
    if stopped and stopped.wait(timeout=0):
        break
    try:
        print(process_g)
        for i in range(0, int(44100 / 1024 * 5)):
            data_chunk = array('h', stream.read(CHUNK_SIZE))
            vol = max(data_chunk)
            if vol >= MIN_VOLUME:
                print("WORDS..")
            else:
                print("Nothing..")
    except Full:
        pass
if __name__ == '__main__':
main()
Now, after every 5 seconds, I need the process function to execute and then process the data (pausing via time.sleep(10) while it does this), and then start the recording back up.
Having spent some time on it, I've come up with the following code that seems to do what you need, except for writing to file:
import threading
from array import array
from queue import Queue, Full
import pyaudio
CHUNK_SIZE = 1024
MIN_VOLUME = 500
# if the recording thread can't consume fast enough, the listener will start discarding
BUF_MAX_SIZE = CHUNK_SIZE * 10
def main():
stopped = threading.Event()
q = Queue(maxsize=int(round(BUF_MAX_SIZE / CHUNK_SIZE)))
listen_t = threading.Thread(target=listen, args=(stopped, q))
listen_t.start()
record_t = threading.Thread(target=record, args=(stopped, q))
record_t.start()
try:
while True:
listen_t.join(0.1)
record_t.join(0.1)
except KeyboardInterrupt:
stopped.set()
listen_t.join()
record_t.join()
def record(stopped, q):
    while True:
        if stopped.wait(timeout=0):
            break
        chunk = q.get()
        vol = max(chunk)
        if vol >= MIN_VOLUME:
            # TODO: write to file
            print("O", end=' ', flush=True)
        else:
            print("-", end=' ', flush=True)
def listen(stopped, q):
stream = pyaudio.PyAudio().open(
format=pyaudio.paInt16,
channels=2,
rate=44100,
input=True,
frames_per_buffer=1024,
)
while True:
if stopped.wait(timeout=0):
break
try:
q.put(array('h', stream.read(CHUNK_SIZE)))
except Full:
pass # discard
if __name__ == '__main__':
main()
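To fill in the TODO, a sketch of the file-writing part (assuming the same stream parameters used above: 16-bit samples, two channels, 44100 Hz; the output filename is made up):

import wave

def open_wav(path):
    wf = wave.open(path, 'wb')
    wf.setnchannels(2)      # matches channels=2 above
    wf.setsampwidth(2)      # paInt16 -> 2 bytes per sample
    wf.setframerate(44100)  # matches rate=44100 above
    return wf

def record(stopped, q):
    wf = open_wav('voiced.wav')  # hypothetical output filename
    try:
        while True:
            if stopped.wait(timeout=0):
                break
            chunk = q.get()          # an array('h') put there by listen()
            if max(chunk) >= MIN_VOLUME:
                wf.writeframes(chunk.tobytes())  # keep only the voiced chunks
                print("O", end=' ', flush=True)
            else:
                print("-", end=' ', flush=True)
    finally:
        wf.close()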
Look here:
https://github.com/jeysonmc/python-google-speech-scripts/blob/master/stt_google.py
It even converts WAV to FLAC and sends it to the Google Speech API; just delete the stt_google_wav function if you don't need it ;)