How to use Wave file as input in VOSK speech recognition?

How to use Wave file as input in VOSK speech recognition? - python

I have a project that needs to get a recorded file and then process by the code and extract the text from file and match the extracted file with the other text and verify it.
my problem is:
I can't use recorded file in code and it does'nt read the file
init function is the fundamental of code.
verify functtion confirm the matched speech and text.
import argparse
import json
import os
import queue
import random
import sys
from difflib import SequenceMatcher
import numpy as np
import sounddevice as sd
import vosk
q = queue.Queue()
def int_or_str(text):
"""Helper function for argument parsing."""
try:
return int(text)
except ValueError:
return text
def callback(indata, frames, time, status):
"""This is called (from a separate thread) for each audio block."""
if status:
print(status, file=sys.stderr)
q.put(bytes(indata))
def init():
parser = argparse.ArgumentParser(add_help=False)
parser.add_argument(
'-l', '--list-devices', action='store_true',
help='show list of audio devices and exit')
args, remaining = parser.parse_known_args()
if args.list_devices:
print(sd.query_devices())
parser.exit(0)
parser = argparse.ArgumentParser(
description=__doc__,
formatter_class=argparse.RawDescriptionHelpFormatter,
parents=[parser])
parser.add_argument(
'-f', '--filename', type=str, metavar='FILENAME',
help='audio file to store recording to')
parser.add_argument(
'-m', '--model', type=str, metavar='MODEL_PATH',
help='Path to the model')
parser.add_argument(
'-d', '--device', type=int_or_str,
help='input device (numeric ID or substring)')
parser.add_argument(
'-r', '--samplerate', type=int, help='sampling rate')
args = parser.parse_args(remaining)
try:
if args.model is None:
args.model = "model"
if not os.path.exists(args.model):
print("Please download a model for your language from https://alphacephei.com/vosk/models")
print("and unpack as 'model' in the current folder.")
parser.exit(0)
if args.samplerate is None:
device_info = sd.query_devices(args.device, 'input')
# soundfile expects an int, sounddevice provides a float:
args.samplerate = int(device_info['default_samplerate'])
model = vosk.Model(args.model)
if args.filename:
dump_fn = open(args.filename, "wb")
else:
dump_fn = None
except KeyboardInterrupt:
print('\nDone')
parser.exit(0)
except Exception as e:
parser.exit(type(e).__name__ + ': ' + str(e))
return model, args
def verify(random_sentence, model, args):
num, T_num, F_num, num_word = 0, 0, 0, 1
with sd.RawInputStream(samplerate=args.samplerate, blocksize=8000, device=args.device, dtype='int16',
channels=1, callback=callback):
rec = vosk.KaldiRecognizer(model, args.samplerate)
print("{}) ".format(num_word), random_sentence, end='\n')
print('=' * 30, end='\n')
run = True
while run:
data = q.get()
if rec.AcceptWaveform(data):
res = json.loads(rec.FinalResult())
res['text'] = res['text'].replace('ي', 'ی')
if SequenceMatcher(None, random_sentence, res['text']).ratio() > 0.65:
T_num, num, num_word += 1
else:
F_num, num, num_word += 1
run = False
print('=' * 30)
print('True Cases : {}\n False Cases : {}'.format(T_num, F_num))
if __name__ == "__main__":
model, args = init()
verify(random_sentences, model, args)

I have been working on a similar project. I modified the code from VOSK Git repo and wrote the following function that takes file name / path as the input and outputs the captured text. Sometimes, when there is a long pause (~seconds) in the audio file, the returned text would be an empty string. To remedy this problem, I had to write additional code that picks out the longest string that was captured. I could make do with this fix.
def get_text_from_voice(filename):
if not os.path.exists("model"):
print ("Please download the model from https://alphacephei.com/vosk/models and unpack as 'model' in the current folder.")
exit (1)
wf = wave.open(filename, "rb")
if wf.getnchannels() != 1 or wf.getsampwidth() != 2 or wf.getcomptype() != "NONE":
print ("Audio file must be WAV format mono PCM.")
exit (1)
model = Model("model")
rec = KaldiRecognizer(model, wf.getframerate())
rec.SetWords(True)
text_lst =[]
p_text_lst = []
p_str = []
len_p_str = []
while True:
data = wf.readframes(4000)
if len(data) == 0:
break
if rec.AcceptWaveform(data):
text_lst.append(rec.Result())
print(rec.Result())
else:
p_text_lst.append(rec.PartialResult())
print(rec.PartialResult())
if len(text_lst) !=0:
jd = json.loads(text_lst[0])
txt_str = jd["text"]
elif len(p_text_lst) !=0:
for i in range(0,len(p_text_lst)):
temp_txt_dict = json.loads(p_text_lst[i])
p_str.append(temp_txt_dict['partial'])
len_p_str = [len(p_str[j]) for j in range(0,len(p_str))]
max_val = max(len_p_str)
indx = len_p_str.index(max_val)
txt_str = p_str[indx]
else:
txt_str =''
return txt_str
Make sure that the correct model is present in the same directory or put in the path to the model. Also, note that VOSK accepts audio files only in wav mono PCM format.

Related

subprocess or multithreading or threading pool

I'm going to run the command line utility multiple times in parallel using Python.
I know that multithreading is better to use for I/O operations, multiprocessing - for CPU oriented operations.
But what should I use for parallel subprocess.run?
I also know that I can create a pool from the subprocess module, but how is it different from pools from the multiprocessing and threading modules? And why shouldn't I just put subprocess.run function into multiprocessing or threading pools?
Or maybe there are some criteria when it is better to put a utility run cmd into a pool of threads or processes?
(In my case, I'm going to run the "ffmpeg" utility)

In a situation like this, I tend to run subprocesses from a ThreadPoolExecutor, basically because it's easy.
Example (from here):
from datetime import datetime
from functools import partial
import argparse
import concurrent.futures as cf
import logging
import os
import subprocess as sp
import sys
__version__ = "2021.09.19"
def main():
"""
Entry point for dicom2jpg.
"""
args = setup()
if not args.fn:
logging.error("no files to process")
sys.exit(1)
if args.quality != 80:
logging.info(f"quality set to {args.quality}")
if args.level:
logging.info("applying level correction.")
convert_partial = partial(convert, quality=args.quality, level=args.level)
starttime = str(datetime.now())[:-7]
logging.info(f"started at {starttime}.")
with cf.ThreadPoolExecutor(max_workers=os.cpu_count()) as tp:
for infn, outfn, rv in tp.map(convert_partial, args.fn):
logging.info(f"finished conversion of {infn} to {outfn} (returned {rv})")
endtime = str(datetime.now())[:-7]
logging.info(f"completed at {endtime}.")
def setup():
"""Parse command-line arguments."""
parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument(
"--log",
default="warning",
choices=["debug", "info", "warning", "error"],
help="logging level (defaults to 'warning')",
)
parser.add_argument("-v", "--version", action="version", version=__version__)
parser.add_argument(
"-l",
"--level",
action="store_true",
default=False,
help="Correct color levels (default: no)",
)
parser.add_argument(
"-q", "--quality", type=int, default=80, help="JPEG quailty level (default: 80)"
)
parser.add_argument(
"fn", nargs="*", metavar="filename", help="DICOM files to process"
)
args = parser.parse_args(sys.argv[1:])
logging.basicConfig(
level=getattr(logging, args.log.upper(), None),
format="%(levelname)s: %(message)s",
)
logging.debug(f"command line arguments = {sys.argv}")
logging.debug(f"parsed arguments = {args}")
# Check for requisites
try:
sp.run(["convert"], stdout=sp.DEVNULL, stderr=sp.DEVNULL)
logging.info("found “convert”")
except FileNotFoundError:
logging.error("the program “convert” cannot be found")
sys.exit(1)
return args
def convert(filename, quality, level):
"""
Convert a DICOM file to a JPEG file.
Removing the blank areas from the Philips detector.
Arguments:
filename: name of the file to convert.
quality: JPEG quality to apply
level: Boolean to indicate whether level adustment should be done.
Returns:
Tuple of (input filename, output filename, convert return value)
"""
outname = filename.strip() + ".jpg"
size = "1574x2048"
args = [
"convert",
filename,
"-units",
"PixelsPerInch",
"-density",
"300",
"-depth",
"8",
"-crop",
size + "+232+0",
"-page",
size + "+0+0",
"-auto-gamma",
"-quality",
str(quality),
]
if level:
args += ["-level", "-35%,70%,0.5"]
args.append(outname)
cp = sp.run(args, stdout=sp.DEVNULL, stderr=sp.DEVNULL)
return (filename, outname, cp.returncode)
if __name__ == "__main__":
main()
Alternatively, you can manage a bunch of subprocesses (in the form of Popen objects) directly, as shown below.
(This was older code, now modified for Python 3)
import os
import sys
import subprocess
from multiprocessing import cpu_count
from time import sleep
def checkfor(args):
"""Make sure that a program necessary for using this script is
available.
Arguments:
args -- string or list of strings of commands. A single string may
not contain spaces.
"""
if isinstance(args, str):
if " " in args:
raise ValueError("No spaces in single command allowed.")
args = [args]
try:
with open("/dev/null", "w") as bb:
subprocess.check_call(args, stdout=bb, stderr=bb)
except Exception:
print("Required program '{}' not found! exiting.".format(args[0]))
sys.exit(1)
def startconvert(fname):
"""Use the convert(1) program from the ImageMagick suite to convert the
image and crop it."""
size = "1574x2048"
args = [
"convert",
fname,
"-units",
"PixelsPerInch",
"-density",
"300",
"-crop",
size + "+232+0",
"-page",
size + "+0+0",
fname + ".png",
]
with open("/dev/null") as bb:
p = subprocess.Popen(args, stdout=bb, stderr=bb)
print("Start processing", fname)
return (fname, p)
def manageprocs(proclist):
"""Check a list of subprocesses for processes that have ended and
remove them from the list.
"""
for it in proclist:
fn, pr = it
result = pr.poll()
if result is not None:
proclist.remove(it)
if result == 0:
print("Finished processing", fn)
else:
s = "The conversion of {} exited with error code {}."
print(s.format(fn, result))
sleep(0.5)
def main(argv):
"""Main program.
Keyword arguments:
argv -- command line arguments
"""
if len(argv) == 1:
path, binary = os.path.split(argv[0])
print("Usage: {} [file ...]".format(binary))
sys.exit(0)
del argv[0] # delete the name of the script.
checkfor("convert")
procs = []
maxprocs = cpu_count()
for ifile in argv:
while len(procs) == maxprocs:
manageprocs(procs)
procs.append(startconvert(ifile))
while len(procs) > 0:
manageprocs(procs)
# This is the main program ##
if __name__ == "__main__":
main(sys.argv)

How can one get a read/write speed for a given disk?

I have multiple disks mounted on my Windows 10 PC and a program that regularly makes i/o operations on these disks. I am trying to log the data on memory usage, cpu time and disk i/o speed at the moment of calling the function via a python script - somewhat similar to what Task Manager Performance window does. While I got cpu and memory covered by psutil module, I can't figure the simple way to retrieve the disk i/o speeds.
How would you advice to implement it?

Ok. Ensure to put in an empty directory, as it will get full of written, read files for testing.
rwrite.py
import sys
import ctypes
def run_as_admin(argv=None, debug=False):
shell32 = ctypes.windll.shell32
if argv is None and shell32.IsUserAnAdmin():
return True
if argv is None:
argv = sys.argv
if hasattr(sys, '_MEIPASS'):
# Support pyinstaller wrapped program.
arguments = map(unicode, argv[1:])
else:
arguments = map(unicode, argv)
argument_line = u' '.join(arguments)
executable = unicode(sys.executable)
if debug:
print 'Command line: ', executable, argument_line
ret = shell32.ShellExecuteW(None, u"runas", executable, argument_line, None, 1)
if int(ret) <= 32:
return False
return None
from __future__ import division, print_function # for compatability with py2
import os, sys
from random import shuffle
import argparse
import json
try: # if Python >= 3.3 use new high-res counter
from time import perf_counter as time
except ImportError: # else select highest available resolution counter
if sys.platform[:3] == 'win':
from time import clock as time
else:
from time import time
def get_args():
parser = argparse.ArgumentParser(description='Arguments', formatter_class = argparse.ArgumentDefaultsHelpFormatter)
parser.add_argument('-f', '--file',
required=False,
action='store',
default=os.getcwd(),
help='The file to read/write to')
parser.add_argument('-s', '--size',
required=False,
action='store',
type=int,
default=128,
help='Total MB to write')
parser.add_argument('-w', '--write-block-size',
required=False,
action='store',
type=int,
default=1024,
help='The block size for writing in bytes')
parser.add_argument('-r', '--read-block-size',
required=False,
action='store',
type=int,
default=512,
help='The block size for reading in bytes')
parser.add_argument('-j', '--json',
required=False,
action='store',
help='Output to json file')
args = parser.parse_args()
return args
class Benchmark:
def __init__(self, file,write_mb, write_block_kb, read_block_b):
self.file = file
self.write_mb = write_mb
self.write_block_kb = write_block_kb
self.read_block_b = read_block_b
wr_blocks = int(self.write_mb * 1024 / self.write_block_kb)
rd_blocks = int(self.write_mb * 1024 * 1024 / self.read_block_b)
self.write_results = self.write_test( 1024 * self.write_block_kb, wr_blocks)
self.read_results = self.read_test(self.read_block_b, rd_blocks)
def write_test(self, block_size, blocks_count, show_progress=True):
'''
Tests write speed by writing random blocks, at total quantity
of blocks_count, each at size of block_size bytes to disk.
Function returns a list of write times in sec of each block.
'''
f = os.open(self.file, os.O_CREAT | os.O_WRONLY, 0o777) # low-level I/O
took = []
for i in range(blocks_count):
if show_progress:
# dirty trick to actually print progress on each iteration
sys.stdout.write('\rWriting: {:.2f} %'.format(
(i + 1) * 100 / blocks_count))
sys.stdout.flush()
buff = os.urandom(block_size)
start = time()
os.write(f, buff)
os.fsync(f) # force write to disk
t = time() - start
took.append(t)
os.close(f)
return took
def read_test(self, block_size, blocks_count, show_progress=True):
'''
Performs read speed test by reading random offset blocks from
file, at maximum of blocks_count, each at size of block_size
bytes until the End Of File reached.
Returns a list of read times in sec of each block.
'''
f = os.open(self.file, os.O_RDONLY, 0o777) # low-level I/O
# generate random read positions
offsets = list(range(0, blocks_count * block_size, block_size))
shuffle(offsets)
took = []
for i, offset in enumerate(offsets, 1):
if show_progress and i % int(self.write_block_kb * 1024 / self.read_block_b) == 0:
# read is faster than write, so try to equalize print period
sys.stdout.write('\rReading: {:.2f} %'.format(
(i + 1) * 100 / blocks_count))
sys.stdout.flush()
start = time()
os.lseek(f, offset, os.SEEK_SET) # set position
buff = os.read(f, block_size) # read from position
t = time() - start
if not buff: break # if EOF reached
took.append(t)
os.close(f)
return took
def print_result(self):
result = ('\n\nWritten {} MB in {:.4f} s\nWrite speed is {:.2f} MB/s'
'\n max: {max:.2f}, min: {min:.2f}\n'.format(
self.write_mb, sum(self.write_results), self.write_mb / sum(self.write_results),
max=self.write_block_kb / (1024 * min(self.write_results)),
min=self.write_block_kb / (1024 * max(self.write_results))))
result += ('\nRead {} x {} B blocks in {:.4f} s\nRead speed is {:.2f} MB/s'
'\n max: {max:.2f}, min: {min:.2f}\n'.format(
len(self.read_results), self.read_block_b,
sum(self.read_results), self.write_mb / sum(self.read_results),
max=self.read_block_b / (1024 * 1024 * min(self.read_results)),
min=self.read_block_b / (1024 * 1024 * max(self.read_results))))
print(result)
print(ASCIIART)
def get_json_result(self,output_file):
results_json = {}
results_json["Written MB"] = self.write_mb
results_json["Write time (sec)"] = round(sum(self.write_results),2)
results_json["Write speed in MB/s"] = round(self.write_mb / sum(self.write_results),2)
results_json["Read blocks"] = len(self.read_results)
results_json["Read time (sec)"] = round(sum(self.read_results),2)
results_json["Read speed in MB/s"] = round(self.write_mb / sum(self.read_results),2)
with open(output_file,'w') as f:
json.dump(results_json,f)
def main():
args = get_args()
benchmark = Benchmark(args.file, args.size, args.write_block_size, args.read_block_size)
if args.json is not None:
benchmark.get_json_result(args.json)
else:
benchmark.print_result()
os.remove(args.file)
if __name__ == "__main__":
ret = run_as_admin()
if ret is True:
print 'I have admin privilege.'
raw_input('Press ENTER to exit.')
elif ret is None:
print 'I am elevating to admin privilege.'
raw_input('Press ENTER to exit.')
else:
print 'Error(ret=%d): cannot elevate privilege.' % (ret, )
main()

how to make my python code to run faster with Concurrent Futures library

This block of code here, fetches records from a mysql table with over 100k distinguished by unique ips. Then it takes set of ips in batch and processes on each of them.
The Problem that I face here is speed of execution and time. I tried running my script with ThreadPoolExecutor but it takes 10 hours to run over 100k records. So, to speed up I tried using ProcessPoolExecutor but it takes same amount of time.
Please suggest as what needs to be done to increase the execution speed of my script.
import sys
sys.path.insert(1, '/var/lib/hadoop-hdfs/Yogesh/pylib')
import argparse
import datetime
import sqlalchemy
import pymysql
import time
import concurrent.futures
from concurrent.futures import ProcessPoolExecutor
def get_remaining_details(db_engine, ip, device):
'''fetches model,network,region and size details for a particular device'''
try:
results = db_engine.execute("""select model,network,region_num,(select size from SSD_size where model=a.model)
from
SSD_fail_predictor a
where
inet_ntoa(machine_ip)='{0}' and device='{1}'""".
format(ip, device)).fetchone()
except Exception as exception:
sys.stderr.write("FAILED to get details from {0} with {} : {1}".format(
str(ip), str(device), str(exception)))
sys.exit(1)
return results
def insert_data(db_engine, adata, ip, device):
''' To insert the data passed into mysql'''
try:
results = db_engine.execute ("insert into ssd_dwpd values ('{0}','{1}',{days},{data_writes_gb},'{start_date}',{samples},{wpd_in_gb},'{model}',{size},{dwpd},'{network}',{region_num})".format(ip,device,**adata))
except Exception as exception:
sys.stderr.write("FAILED to insert data {0} into ssd_dwpd table: {1}".format(str(adata),str(exception)))
sys.exit(1)
return 1
def analyse_data(db_engine, ip, device):
''' How do we find drivers that have been replaced ?
first, we sort the values of all the devices in descending order
second, we create a pair of value using zip function which takes two adjacent values from sorted list
third, we take the first value from the list where the differnce is negative
so cutoff-1 is the highest index, after which is the replaced data'''
mod, net, reg, size = ('','',0,0)
details = get_remaining_details(db_engine,ip, device)
if (details == None):
with open("Void_SSD_Fail_Predictor.txt", 'a') as file1:
file1.write(ip + " : " + device + "\n")
return 1
else:
mod, net, reg, size = details
if size == None:
with open("Void_SSD_size.txt", 'a') as file2:
file2.write(ip + " : " + mod + "\n")
return 1
try:
attribute = []
results = db_engine.execute(
"select distinct attribute from lba_written where machineip='{0}' and device='{1}' order by time_stamp".format(str(ip), str(device)))
for x in results:
attribute.append(x[0])
where_clause = ''
if len(attribute) == 1:
where_clause = "machineip='{0}' and device='{1}' and value > 0 order by time_stamp".format(str(ip), str(device))
else:
attribute = attribute[-1]
where_clause = "machineip='{0}' and device='{1}' and attribute = '{2}' and value > 0 order by time_stamp".format(str(ip), str(device), attribute)
results = db_engine.execute(
"select (max(value)-min(value)) Value,count(1) as Sample,((max(time_stamp) - min(time_stamp)) / 86400) days,FROM_UNIXTIME(min(time_stamp)) start_day from lba_written where {wc}".format(wc=where_clause))
value,samples,days,start_date = results.fetchone()
if samples == 0:
with open("nullip.txt", 'a') as f:
f.write("IP: {0} with device: {1} has no samples with values > 0\n".format(ip, device))
return 1
if days > 1.0:
days = float(round(days,2))
else:
days = 1
if attribute == 'Host_Writes_32MiB':
data_writes_gb = round((value * 32) / 1024, 2)
else:
data_writes_gb = round((value * 512) / 1073741824, 2)
wpd_in_gb = round(data_writes_gb / days, 2)
dwpd = round(wpd_in_gb / size, 2)
except Exception as exception:
sys.stderr.write("For IP {0} Something is not correct here: {1}".format(
str(ip), str(exception)))
sys.exit(1)
return {
'data_writes_gb': data_writes_gb,
'days': days,
'network': net,
'start_date': start_date,
'samples': samples,
'wpd_in_gb': wpd_in_gb,
'model': mod,
'region_num': reg,
'size': size,
'dwpd': dwpd,
}
def get_devices_for_ip(db_engine, ip):
"""Fetches unique device for the passed ip"""
devices = []
try:
results = db_engine.execute(
"select distinct device from lba_written where machineip='{}'".format(str(ip)))
for row in results.fetchall():
devices.append(row[0])
except Exception as exception:
## Ideally script should never come here ##
sys.stderr.write("FAILED to get devices for {0} : {1}".format(
str(ip), str(exception)))
sys.exit(1)
return devices
def process_ips(ip):
""" Process IPs
Input:
IP address of the device
Output:
If updated 1 else 0
"""
db_engine = sqlalchemy.create_engine("mysql+pymysql://root:hdfs#xxx.xx.xx.xxx/hardware_perf")
devices = get_devices_for_ip(db_engine,ip)
for device in devices:
analysed_data = analyse_data(db_engine,ip, device)
#self.printv("Analysed data: {}".format(analysed_data))
if (isinstance(analysed_data, dict) == True):
op = insert_data(db_engine,
analysed_data, ip, device)
elif (analysed_data == 1):
op = 1
else:
op = 0
db_engine.dispose()
return op
def batcher(ips,seg_size):
for x in range(0, len(ips), seg_size):
yield ips[x:x + seg_size]
def printv(args, message):
""" Print verbose message
Print message only if VEBOSE is true
Input:
message (str): string with message
"""
if args.verbose:
print(str(message))
def get_ips_from_db(args):
'''Fetches distinct machine ips from db'''
db_engine = sqlalchemy.create_engine(
"mysql+pymysql://root:hdfs#xxx.xx.xx.xxx/hardware_perf")
ips = []
if args.limit:
sql = "select distinct machineip from lba_written limit {}".format(
int(args.limit))
else:
sql = "select distinct machineip from lba_written"
try:
results = db_engine.execute(sql)
for ip in results.fetchall():
ips.append(ip[0])
if not ips:
printv(args,"Got no device from lba_written table")
sys.exit(1)
except Exception as exception:
sys.stderr.write(
"FAILED to get IPs from db : {}".format(str(exception)))
sys.exit(1)
db_engine.dispose()
printv(args, "Number of Distinct IPs got : {}".format(len(ips)))
printv(args, "IPs are : {}".format(str(ips)))
return ips
def arg_parser():
parser = argparse.ArgumentParser(description="Calculate dwpd for SSD")
parser.add_argument(
'--host',
required=True,
help='host name of mysql server'
)
parser.add_argument(
'-u',
'--user',
required=True,
help='Username for login mysql server'
)
parser.add_argument(
'-p',
'--password',
help='password for login mysql'
)
parser.add_argument(
'--db',
required=True,
help='database name in mysql'
)
parser.add_argument(
'-l',
'--limit',
type=int,
help="limit to restrict number of IP's to be processed"
)
parser.add_argument(
'-v',
'--verbose',
default=0,
help='Verbose mode'
)
parser.add_argument(
'-bs',
'--batch_size',
type=int,
default=50,
help='Number of records to process in each batch'
)
parser.add_argument(
'-t',
'--threads',
type=int,
default = 8,
help='Number of threads'
)
return parser.parse_args()
def main():
args = arg_parser()
ips = get_ips_from_db(args)
for ips_batch in batcher(ips,args.batch_size):
with ProcessPoolExecutor (max_workers=args.threads) as executor:
try:
results = executor.map(process_ips, ips_batch)
except Exception as ex:
print ("timeout: {}".format(ex))
if __name__ == '__main__':
main()
print ("Finished")

Record and Recognize music from URL [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 6 years ago.
Improve this question
I am using an open source audio fingerprinting platform in python DeJavu that can recognize music from disk and from microphone.
I have tested the recognition from disk and it is amazing. 100% accuracy.
I seek assistance on how to add a class "BroadcastRecognizer"
This will recognize music from a URL online stream example online radio [http://bbcmedia.ic.llnwd.net/stream/bbcmedia_radio1_mf_p]
Because the music in the radio stream is constantly changing, would like to set it to recognize after every 10 seconds.
Here is the recognize.py
import dejavu.fingerprint as fingerprint
import dejavu.decoder as decoder
import numpy as np
import pyaudio
import time
class BaseRecognizer(object):
def __init__(self, dejavu):
self.dejavu = dejavu
self.Fs = fingerprint.DEFAULT_FS
def _recognize(self, *data):
matches = []
for d in data:
matches.extend(self.dejavu.find_matches(d, Fs=self.Fs))
return self.dejavu.align_matches(matches)
def recognize(self):
pass # base class does nothing
class FileRecognizer(BaseRecognizer):
def __init__(self, dejavu):
super(FileRecognizer, self).__init__(dejavu)
def recognize_file(self, filename):
frames, self.Fs, file_hash = decoder.read(filename, self.dejavu.limit)
t = time.time()
match = self._recognize(*frames)
t = time.time() - t
if match:
match['match_time'] = t
return match
def recognize(self, filename):
return self.recognize_file(filename)
class MicrophoneRecognizer(BaseRecognizer):
default_chunksize = 8192
default_format = pyaudio.paInt16
default_channels = 2
default_samplerate = 44100
def __init__(self, dejavu):
super(MicrophoneRecognizer, self).__init__(dejavu)
self.audio = pyaudio.PyAudio()
self.stream = None
self.data = []
self.channels = MicrophoneRecognizer.default_channels
self.chunksize = MicrophoneRecognizer.default_chunksize
self.samplerate = MicrophoneRecognizer.default_samplerate
self.recorded = False
def start_recording(self, channels=default_channels,
samplerate=default_samplerate,
chunksize=default_chunksize):
self.chunksize = chunksize
self.channels = channels
self.recorded = False
self.samplerate = samplerate
if self.stream:
self.stream.stop_stream()
self.stream.close()
self.stream = self.audio.open(
format=self.default_format,
channels=channels,
rate=samplerate,
input=True,
frames_per_buffer=chunksize,
)
self.data = [[] for i in range(channels)]
def process_recording(self):
data = self.stream.read(self.chunksize)
nums = np.fromstring(data, np.int16)
for c in range(self.channels):
self.data[c].extend(nums[c::self.channels])
def stop_recording(self):
self.stream.stop_stream()
self.stream.close()
self.stream = None
self.recorded = True
def recognize_recording(self):
if not self.recorded:
raise NoRecordingError("Recording was not complete/begun")
return self._recognize(*self.data)
def get_recorded_time(self):
return len(self.data[0]) / self.rate
def recognize(self, seconds=10):
self.start_recording()
for i in range(0, int(self.samplerate / self.chunksize
* seconds)):
self.process_recording()
self.stop_recording()
return self.recognize_recording()
class NoRecordingError(Exception):
pass
Here is the dejavu.py
import os``
import sys
import json
import warnings
import argparse
from dejavu import Dejavu
from dejavu.recognize import FileRecognizer
from dejavu.recognize import MicrophoneRecognizer
from argparse import RawTextHelpFormatter
warnings.filterwarnings("ignore")
DEFAULT_CONFIG_FILE = "dejavu.cnf.SAMPLE"
def init(configpath):
"""
Load config from a JSON file
"""
try:
with open(configpath) as f:
config = json.load(f)
except IOError as err:
print("Cannot open configuration: %s. Exiting" % (str(err)))
sys.exit(1)
# create a Dejavu instance
return Dejavu(config)
if __name__ == '__main__':
parser = argparse.ArgumentParser(
description="Dejavu: Audio Fingerprinting library",
formatter_class=RawTextHelpFormatter)
parser.add_argument('-c', '--config', nargs='?',
help='Path to configuration file\n'
'Usages: \n'
'--config /path/to/config-file\n')
parser.add_argument('-f', '--fingerprint', nargs='*',
help='Fingerprint files in a directory\n'
'Usages: \n'
'--fingerprint /path/to/directory extension\n'
'--fingerprint /path/to/directory')
parser.add_argument('-r', '--recognize', nargs=2,
help='Recognize what is '
'playing through the microphone\n'
'Usage: \n'
'--recognize mic number_of_seconds \n'
'--recognize file path/to/file \n')
args = parser.parse_args()
if not args.fingerprint and not args.recognize:
parser.print_help()
sys.exit(0)
config_file = args.config
if config_file is None:
config_file = DEFAULT_CONFIG_FILE
# print "Using default config file: %s" % (config_file)
djv = init(config_file)
if args.fingerprint:
# Fingerprint all files in a directory
if len(args.fingerprint) == 2:
directory = args.fingerprint[0]
extension = args.fingerprint[1]
print("Fingerprinting all .%s files in the %s directory"
% (extension, directory))
djv.fingerprint_directory(directory, ["." + extension], 4)
elif len(args.fingerprint) == 1:
filepath = args.fingerprint[0]
if os.path.isdir(filepath):
print("Please specify an extension if you'd like to fingerprint a directory!")
sys.exit(1)
djv.fingerprint_file(filepath)
elif args.recognize:
# Recognize audio source
song = None
source = args.recognize[0]
opt_arg = args.recognize[1]
if source in ('mic', 'microphone'):
song = djv.recognize(MicrophoneRecognizer, seconds=opt_arg)
elif source == 'file':
song = djv.recognize(FileRecognizer, opt_arg)
print(song)
sys.exit(0)

I still think that you need a discrete "piece" of audio, so you need a beginning and an end.
For what it is worth, start with something like this, which records a 10 second burst of audio, which you can then test against your finger-printed records.
Note: that this is bashed out for python 2, so you would have to edit it for it to run on python 3
import time, sys
import urllib2
url = "http://bbcmedia.ic.llnwd.net/stream/bbcmedia_radio1_mf_p"
print ("Connecting to "+url)
response = urllib2.urlopen(url, timeout=10.0)
fname = "Sample"+str(time.clock())[2:]+".wav"
f = open(fname, 'wb')
block_size = 1024
print ("Recording roughly 10 seconds of audio Now - Please wait")
limit = 10
start = time.time()
while time.time() - start < limit:
try:
audio = response.read(block_size)
if not audio:
break
f.write(audio)
sys.stdout.write('.')
sys.stdout.flush()
except Exception as e:
print ("Error "+str(e))
f.close()
sys.stdout.flush()
print("")
print ("10 seconds from "+url+" have been recorded in "+fname)
#
# here run the finger print test to identify the audio recorded
# using the sample you have downloaded in the file "fname"
#

import python with main method

I have a python script that have __main__ statement and took all values parametric.
I want to import and use it in my own script.
Actually I can import but don't know how to use it.
As you see below, __main__ is a bit complicated and rewriting it will take time because I even don't know what does most of code mean.
Want to know is there any way to import and use the code as a function?
import os
import sys
import time
import base64
from urllib2 import urlopen
from urllib2 import Request
from urllib2 import HTTPError
from urllib import urlencode
from urllib import quote
from exceptions import Exception
from email.mime.multipart import MIMEMultipart
from email.mime.base import MIMEBase
from email.mime.application import MIMEApplication
from email.encoders import encode_noop
from api_util import json2python, python2json
class MalformedResponse(Exception):
pass
class RequestError(Exception):
pass
class Client(object):
default_url = 'http://nova.astrometry.net/api/'
def __init__(self,
apiurl = default_url):
self.session = None
self.apiurl = apiurl
def get_url(self, service):
return self.apiurl + service
def send_request(self, service, args={}, file_args=None):
'''
service: string
args: dict
'''
if self.session is not None:
args.update({ 'session' : self.session })
print 'Python:', args
json = python2json(args)
print 'Sending json:', json
url = self.get_url(service)
print 'Sending to URL:', url
# If we're sending a file, format a multipart/form-data
if file_args is not None:
m1 = MIMEBase('text', 'plain')
m1.add_header('Content-disposition', 'form-data; name="request-json"')
m1.set_payload(json)
m2 = MIMEApplication(file_args[1],'octet-stream',encode_noop)
m2.add_header('Content-disposition',
'form-data; name="file"; filename="%s"' % file_args[0])
#msg.add_header('Content-Disposition', 'attachment',
# filename='bud.gif')
#msg.add_header('Content-Disposition', 'attachment',
# filename=('iso-8859-1', '', 'FuSballer.ppt'))
mp = MIMEMultipart('form-data', None, [m1, m2])
# Makie a custom generator to format it the way we need.
from cStringIO import StringIO
from email.generator import Generator
class MyGenerator(Generator):
def __init__(self, fp, root=True):
Generator.__init__(self, fp, mangle_from_=False,
maxheaderlen=0)
self.root = root
def _write_headers(self, msg):
# We don't want to write the top-level headers;
# they go into Request(headers) instead.
if self.root:
return
# We need to use \r\n line-terminator, but Generator
# doesn't provide the flexibility to override, so we
# have to copy-n-paste-n-modify.
for h, v in msg.items():
print >> self._fp, ('%s: %s\r\n' % (h,v)),
# A blank line always separates headers from body
print >> self._fp, '\r\n',
# The _write_multipart method calls "clone" for the
# subparts. We hijack that, setting root=False
def clone(self, fp):
return MyGenerator(fp, root=False)
fp = StringIO()
g = MyGenerator(fp)
g.flatten(mp)
data = fp.getvalue()
headers = {'Content-type': mp.get('Content-type')}
if False:
print 'Sending headers:'
print ' ', headers
print 'Sending data:'
print data[:1024].replace('\n', '\\n\n').replace('\r', '\\r')
if len(data) > 1024:
print '...'
print data[-256:].replace('\n', '\\n\n').replace('\r', '\\r')
print
else:
# Else send x-www-form-encoded
data = {'request-json': json}
print 'Sending form data:', data
data = urlencode(data)
print 'Sending data:', data
headers = {}
request = Request(url=url, headers=headers, data=data)
try:
f = urlopen(request)
txt = f.read()
print 'Got json:', txt
result = json2python(txt)
print 'Got result:', result
stat = result.get('status')
print 'Got status:', stat
if stat == 'error':
errstr = result.get('errormessage', '(none)')
raise RequestError('server error message: ' + errstr)
return result
except HTTPError, e:
print 'HTTPError', e
txt = e.read()
open('err.html', 'wb').write(txt)
print 'Wrote error text to err.html'
def login(self, apikey):
args = { 'apikey' : apikey }
result = self.send_request('login', args)
sess = result.get('session')
print 'Got session:', sess
if not sess:
raise RequestError('no session in result')
self.session = sess
def _get_upload_args(self, **kwargs):
args = {}
for key,default,typ in [('allow_commercial_use', 'd', str),
('allow_modifications', 'd', str),
('publicly_visible', 'y', str),
('scale_units', None, str),
('scale_type', None, str),
('scale_lower', None, float),
('scale_upper', None, float),
('scale_est', None, float),
('scale_err', None, float),
('center_ra', None, float),
('center_dec', None, float),
('radius', None, float),
('downsample_factor', None, int),
('tweak_order', None, int),
('crpix_center', None, bool),
# image_width, image_height
]:
if key in kwargs:
val = kwargs.pop(key)
val = typ(val)
args.update({key: val})
elif default is not None:
args.update({key: default})
print 'Upload args:', args
return args
def url_upload(self, url, **kwargs):
args = dict(url=url)
args.update(self._get_upload_args(**kwargs))
result = self.send_request('url_upload', args)
return result
def upload(self, fn, **kwargs):
args = self._get_upload_args(**kwargs)
try:
f = open(fn, 'rb')
result = self.send_request('upload', args, (fn, f.read()))
return result
except IOError:
print 'File %s does not exist' % fn
raise
def submission_images(self, subid):
result = self.send_request('submission_images', {'subid':subid})
return result.get('image_ids')
def overlay_plot(self, service, outfn, wcsfn, wcsext=0):
from astrometry.util import util as anutil
wcs = anutil.Tan(wcsfn, wcsext)
params = dict(crval1 = wcs.crval[0], crval2 = wcs.crval[1],
crpix1 = wcs.crpix[0], crpix2 = wcs.crpix[1],
cd11 = wcs.cd[0], cd12 = wcs.cd[1],
cd21 = wcs.cd[2], cd22 = wcs.cd[3],
imagew = wcs.imagew, imageh = wcs.imageh)
result = self.send_request(service, {'wcs':params})
print 'Result status:', result['status']
plotdata = result['plot']
plotdata = base64.b64decode(plotdata)
open(outfn, 'wb').write(plotdata)
print 'Wrote', outfn
def sdss_plot(self, outfn, wcsfn, wcsext=0):
return self.overlay_plot('sdss_image_for_wcs', outfn,
wcsfn, wcsext)
def galex_plot(self, outfn, wcsfn, wcsext=0):
return self.overlay_plot('galex_image_for_wcs', outfn,
wcsfn, wcsext)
def myjobs(self):
result = self.send_request('myjobs/')
return result['jobs']
def job_status(self, job_id, justdict=False):
result = self.send_request('jobs/%s' % job_id)
if justdict:
return result
stat = result.get('status')
if stat == 'success':
result = self.send_request('jobs/%s/calibration' % job_id)
print 'Calibration:', result
result = self.send_request('jobs/%s/tags' % job_id)
print 'Tags:', result
result = self.send_request('jobs/%s/machine_tags' % job_id)
print 'Machine Tags:', result
result = self.send_request('jobs/%s/objects_in_field' % job_id)
print 'Objects in field:', result
result = self.send_request('jobs/%s/annotations' % job_id)
print 'Annotations:', result
result = self.send_request('jobs/%s/info' % job_id)
print 'Calibration:', result
return stat
def sub_status(self, sub_id, justdict=False):
result = self.send_request('submissions/%s' % sub_id)
if justdict:
return result
return result.get('status')
def jobs_by_tag(self, tag, exact):
exact_option = 'exact=yes' if exact else ''
result = self.send_request(
'jobs_by_tag?query=%s&%s' % (quote(tag.strip()), exact_option),
{},
)
return result
if __name__ == '__main__':
import optparse
parser = optparse.OptionParser()
parser.add_option('--server', dest='server', default=Client.default_url,
help='Set server base URL (eg, %default)')
parser.add_option('--apikey', '-k', dest='apikey',
help='API key for Astrometry.net web service; if not given will check AN_API_KEY environment variable')
parser.add_option('--upload', '-u', dest='upload', help='Upload a file')
parser.add_option('--wait', '-w', dest='wait', action='store_true', help='After submitting, monitor job status')
parser.add_option('--wcs', dest='wcs', help='Download resulting wcs.fits file, saving to given filename; implies --wait if --urlupload or --upload')
parser.add_option('--kmz', dest='kmz', help='Download resulting kmz file, saving to given filename; implies --wait if --urlupload or --upload')
parser.add_option('--urlupload', '-U', dest='upload_url', help='Upload a file at specified url')
parser.add_option('--scale-units', dest='scale_units',
choices=('arcsecperpix', 'arcminwidth', 'degwidth', 'focalmm'), help='Units for scale estimate')
#parser.add_option('--scale-type', dest='scale_type',
# choices=('ul', 'ev'), help='Scale bounds: lower/upper or estimate/error')
parser.add_option('--scale-lower', dest='scale_lower', type=float, help='Scale lower-bound')
parser.add_option('--scale-upper', dest='scale_upper', type=float, help='Scale upper-bound')
parser.add_option('--scale-est', dest='scale_est', type=float, help='Scale estimate')
parser.add_option('--scale-err', dest='scale_err', type=float, help='Scale estimate error (in PERCENT), eg "10" if you estimate can be off by 10%')
parser.add_option('--ra', dest='center_ra', type=float, help='RA center')
parser.add_option('--dec', dest='center_dec', type=float, help='Dec center')
parser.add_option('--radius', dest='radius', type=float, help='Search radius around RA,Dec center')
parser.add_option('--downsample', dest='downsample_factor', type=int, help='Downsample image by this factor')
parser.add_option('--parity', dest='parity', choices=('0','1'), help='Parity (flip) of image')
parser.add_option('--tweak-order', dest='tweak_order', type=int, help='SIP distortion order (default: 2)')
parser.add_option('--crpix-center', dest='crpix_center', action='store_true', default=None, help='Set reference point to center of image?')
parser.add_option('--sdss', dest='sdss_wcs', nargs=2, help='Plot SDSS image for the given WCS file; write plot to given PNG filename')
parser.add_option('--galex', dest='galex_wcs', nargs=2, help='Plot GALEX image for the given WCS file; write plot to given PNG filename')
parser.add_option('--substatus', '-s', dest='sub_id', help='Get status of a submission')
parser.add_option('--jobstatus', '-j', dest='job_id', help='Get status of a job')
parser.add_option('--jobs', '-J', dest='myjobs', action='store_true', help='Get all my jobs')
parser.add_option('--jobsbyexacttag', '-T', dest='jobs_by_exact_tag', help='Get a list of jobs associated with a given tag--exact match')
parser.add_option('--jobsbytag', '-t', dest='jobs_by_tag', help='Get a list of jobs associated with a given tag')
parser.add_option( '--private', '-p',
dest='public',
action='store_const',
const='n',
default='y',
help='Hide this submission from other users')
parser.add_option('--allow_mod_sa','-m',
dest='allow_mod',
action='store_const',
const='sa',
default='d',
help='Select license to allow derivative works of submission, but only if shared under same conditions of original license')
parser.add_option('--no_mod','-M',
dest='allow_mod',
action='store_const',
const='n',
default='d',
help='Select license to disallow derivative works of submission')
parser.add_option('--no_commercial','-c',
dest='allow_commercial',
action='store_const',
const='n',
default='d',
help='Select license to disallow commercial use of submission')
opt,args = parser.parse_args()
if opt.apikey is None:
# try the environment
opt.apikey = os.environ.get('AN_API_KEY', None)
if opt.apikey is None:
parser.print_help()
print
print 'You must either specify --apikey or set AN_API_KEY'
sys.exit(-1)
args = {}
args['apiurl'] = opt.server
c = Client(**args)
c.login(opt.apikey)
if opt.upload or opt.upload_url:
if opt.wcs or opt.kmz:
opt.wait = True
kwargs = dict(
allow_commercial_use=opt.allow_commercial,
allow_modifications=opt.allow_mod,
publicly_visible=opt.public)
if opt.scale_lower and opt.scale_upper:
kwargs.update(scale_lower=opt.scale_lower,
scale_upper=opt.scale_upper,
scale_type='ul')
elif opt.scale_est and opt.scale_err:
kwargs.update(scale_est=opt.scale_est,
scale_err=opt.scale_err,
scale_type='ev')
elif opt.scale_lower or opt.scale_upper:
kwargs.update(scale_type='ul')
if opt.scale_lower:
kwargs.update(scale_lower=opt.scale_lower)
if opt.scale_upper:
kwargs.update(scale_upper=opt.scale_upper)
for key in ['scale_units', 'center_ra', 'center_dec', 'radius',
'downsample_factor', 'tweak_order', 'crpix_center',]:
if getattr(opt, key) is not None:
kwargs[key] = getattr(opt, key)
if opt.parity is not None:
kwargs.update(parity=int(opt.parity))
if opt.upload:
upres = c.upload(opt.upload, **kwargs)
if opt.upload_url:
upres = c.url_upload(opt.upload_url, **kwargs)
stat = upres['status']
if stat != 'success':
print 'Upload failed: status', stat
print upres
sys.exit(-1)
opt.sub_id = upres['subid']
if opt.wait:
if opt.job_id is None:
if opt.sub_id is None:
print "Can't --wait without a submission id or job id!"
sys.exit(-1)
while True:
stat = c.sub_status(opt.sub_id, justdict=True)
print 'Got status:', stat
jobs = stat.get('jobs', [])
if len(jobs):
for j in jobs:
if j is not None:
break
if j is not None:
print 'Selecting job id', j
opt.job_id = j
break
time.sleep(5)
success = False
while True:
stat = c.job_status(opt.job_id, justdict=True)
print 'Got job status:', stat
if stat.get('status','') in ['success']:
success = (stat['status'] == 'success')
break
time.sleep(5)
if success:
c.job_status(opt.job_id)
# result = c.send_request('jobs/%s/calibration' % opt.job_id)
# print 'Calibration:', result
# result = c.send_request('jobs/%s/tags' % opt.job_id)
# print 'Tags:', result
# result = c.send_request('jobs/%s/machine_tags' % opt.job_id)
# print 'Machine Tags:', result
# result = c.send_request('jobs/%s/objects_in_field' % opt.job_id)
# print 'Objects in field:', result
#result = c.send_request('jobs/%s/annotations' % opt.job_id)
#print 'Annotations:', result
retrieveurls = []
if opt.wcs:
# We don't need the API for this, just construct URL
url = opt.server.replace('/api/', '/wcs_file/%i' % opt.job_id)
retrieveurls.append((url, opt.wcs))
if opt.kmz:
url = opt.server.replace('/api/', '/kml_file/%i/' % opt.job_id)
retrieveurls.append((url, opt.kmz))
for url,fn in retrieveurls:
print 'Retrieving file from', url, 'to', fn
f = urlopen(url)
txt = f.read()
w = open(fn, 'wb')
w.write(txt)
w.close()
print 'Wrote to', fn
opt.job_id = None
opt.sub_id = None
if opt.sdss_wcs:
(wcsfn, outfn) = opt.sdss_wcs
c.sdss_plot(outfn, wcsfn)
if opt.galex_wcs:
(wcsfn, outfn) = opt.galex_wcs
c.galex_plot(outfn, wcsfn)
if opt.sub_id:
print c.sub_status(opt.sub_id)
if opt.job_id:
print c.job_status(opt.job_id)
#result = c.send_request('jobs/%s/annotations' % opt.job_id)
#print 'Annotations:', result
if opt.jobs_by_tag:
tag = opt.jobs_by_tag
print c.jobs_by_tag(tag, None)
if opt.jobs_by_exact_tag:
tag = opt.jobs_by_exact_tag
print c.jobs_by_tag(tag, 'yes')
if opt.myjobs:
jobs = c.myjobs()
print jobs
#print c.submission_images(1)

No, there is no clean way to do so. When the module is being imported, it's code is executed and all global variables are set as attributes to the module object. So if part of the code is not executed at all (is guarded by __main__ condition) there is no clean way to get access to that code. You can however run code of this module with substituted __name__ but that's very hackish.
You should refactor this module and move whole __main__ part into a method and call it like this:
def main():
do_everything()
if __name__ == '__main__':
main()
This way consumer apps will be able to run code without having to run it in a separate process.

Use the runpy module in the Python 3 Standard Library
See that data can be passed to and from the called script
# top.py
import runpy
import sys
sys.argv += ["another parameter"]
module_globals_dict = runpy.run_path("other_script.py",
init_globals = globals(), run_name="__main__")
print(module_globals_dict["return_value"])
# other_script.py
# Note we did not load sys module, it gets passed to this script
script_name = sys.argv[0]
print(f"Script {script_name} loaded")
if __name__ == "__main__":
params = sys.argv[1:]
print(f"Script {script_name} run with params: {params}")
return_value = f"{script_name} Done"

by what your saying you want to call a function in the script that is importing the module so try:
import __main__
__main__.myfunc()

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

How to use Wave file as input in VOSK speech recognition? - python

Related

subprocess or multithreading or threading pool

How can one get a read/write speed for a given disk?

how to make my python code to run faster with Concurrent Futures library

Record and Recognize music from URL [closed]

import python with main method

Categories

Resources

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

How to use Wave file as input in VOSK speech recognition? - python

Related

subprocess or multithreading or threading pool

How can one get a read/write speed for a given disk?

how to make my python code to run faster with Concurrent Futures library

Record and Recognize music from URL [closed]

import python with __main__ method

Categories

Resources

import python with main method