Why does speech_recognition not work after packing into an exe with PyInstaller? - python

I am having a problem with PyInstaller.
When I run the following code with Anaconda, it works fine (it automatically collects the audio files and writes their text to a document).
But if I build an exe with PyInstaller (one-file mode), the exe does not work.
I am wondering if there is a way to make this work.
Windows 10
Anaconda
Python 3.4
import pydub, os
import speech_recognition as sr

# Read each .wav file and write its text out to document2.txt.
def audio2text():
    with open('document2.txt', 'w', encoding="utf-8_sig") as docuFile:
        for Filename in os.listdir('.'):
            if not Filename.endswith('.wav'):
                continue  # skip non-wav files
            r = sr.Recognizer()
            with sr.AudioFile(Filename) as source:
                audio = r.record(source)
            text = r.recognize_google(audio, language='es-ES')
            docuFile.write(text)
            docuFile.write('\n\n')

audio2text()
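No answer was posted for this question, so the following is only a guess at one thing worth checking. The script reads '.' and writes document2.txt relative to the current working directory, which for an exe is not necessarily the folder the exe lives in. A minimal sketch, assuming PyInstaller's documented sys.frozen attribute; the helper name app_dir is hypothetical:

```python
import os
import sys

def app_dir():
    """Return the folder to look in for the .wav files."""
    if getattr(sys, 'frozen', False):
        # Running as a PyInstaller exe: sys.executable is the exe itself.
        return os.path.dirname(os.path.abspath(sys.executable))
    # Running as a plain script: keep the original behaviour (working directory).
    return os.path.abspath(os.getcwd())
```

With this, os.listdir(app_dir()) and os.path.join(app_dir(), 'document2.txt') would always refer to the folder next to the exe, whatever directory it was started from.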

Related

speech_recognition does not work with .pyw files

I'm creating a voice assistant with a graphical interface. I use speech_recognition to capture audio and convert it to text. The script works when I run it with the .py extension, but it doesn't work with .pyw.
I have searched a lot but can't find an answer to this problem.
I don't get any errors with .py, so I don't understand why it fails with .pyw.
import speech_recognition as sr
from speech_recognition import Microphone
device_index = 1
sample_rate = 48000
chunk_size = 1024
r = sr.Recognizer()
while True:
    with Microphone(device_index=device_index, sample_rate=sample_rate, chunk_size=chunk_size) as source:
        r.adjust_for_ambient_noise(source, duration=0.7)
        r.pause_threshold = 400
        try:
            audio = r.listen(source,timeout=None,phrase_time_limit=5)
            Input = str(r.recognize_google(audio,language="it-IT",pfilter=0,show_all=False,with_confidence=False)).lower()
            with open("Inputs.txt", 'a') as fp:
                fp.write(Input)
        except Exception as e:
            r = sr.Recognizer()
I found a solution to the problem. I don't know why, but with show_all=False it works only as .py, whereas with show_all=True it also works as .pyw:
audio = r.listen(source,timeout=None,phrase_time_limit=5)
data = r.recognize_google(audio,language="it-IT",pfilter=0,show_all=True,with_confidence=True)
Input = (data['alternative'][0]['transcript']).lower()
confidence = (data['alternative'][0]['confidence'])
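Indexing data['alternative'][0] directly will fail whenever nothing was recognized. As far as I know, recognize_google with show_all=True then returns an empty list instead of a dictionary. A small defensive sketch under that assumption; best_alternative is a hypothetical helper name, and the dictionary layout is taken from the lines above:

```python
def best_alternative(data):
    """Return (transcript, confidence) from a show_all=True result, or (None, None)."""
    # An empty list (or anything that is not a dict) means no speech was recognized.
    if not isinstance(data, dict):
        return None, None
    alternatives = data.get('alternative') or []
    if not alternatives:
        return None, None
    best = alternatives[0]
    transcript = best.get('transcript')
    if transcript is None:
        return None, None
    # The confidence key may be missing, so .get() is used instead of indexing.
    return transcript.lower(), best.get('confidence')
```

Usage would be Input, confidence = best_alternative(data), followed by a check that Input is not None before writing it to the file.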

Why is error shown when looping through .wav files in a directory but works OK when not looping?

The following runs successfully:
import speech_recognition as sr
filename = 'audiofiles/myaudiofile.wav'
# initialise the recognizer
r = sr.Recognizer()
with sr.AudioFile(filename) as source:
    # listen for the data (load audio to memory)
    audio_data = r.record(source)
    # recognize (convert from speech to text)
    text = r.recognize_google(audio_data)
    print(text)
...and outputs the text of the words spoken in the .wav file.
When I run the following code (to check it will work for multiple files which I'll soon add to this directory):
import os
directory = 'audiofiles'
for filename in os.listdir(directory):
    with sr.AudioFile(filename) as source:
        # listen for the data (load audio to memory)
        audio_data = r.record(source)
        # recognize (convert from speech to text)
        text = r.recognize_google(audio_data)
        print(text)
    print('---')
...the text is output correctly, but then it's followed by the error below. Why? How can I fix this?
ValueError: Audio file could not be read as PCM WAV, AIFF/AIFF-C, or Native FLAC; check if file is corrupted or in another format
There was a hidden file that wasn't a .wav file. So I added if filename.endswith('.wav'): like this:
for filename in os.listdir(directory):
    if filename.endswith('.wav'):
        with sr.AudioFile(filename) as source:
            # listen for the data (load audio to memory)
            audio_data = r.record(source)
            # recognize (convert from speech to text)
            text = r.recognize_google(audio_data)
            print(text)
        print('---')
...and it worked successfully.
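A slightly more defensive version of the same filter, as a sketch. Note that os.listdir() returns bare file names, so they are joined with the directory here. That way the loop does not depend on the current working directory. The helper name list_wav_paths is made up for illustration:

```python
import os

def list_wav_paths(directory):
    """Return the full paths of the .wav files in `directory`, sorted by name."""
    paths = []
    for name in sorted(os.listdir(directory)):
        # Skip hidden files (for example .DS_Store) and anything that is not a .wav file.
        if name.startswith('.') or not name.lower().endswith('.wav'):
            continue
        # os.listdir() gives bare names, so prepend the directory.
        paths.append(os.path.join(directory, name))
    return paths
```

The loop then becomes for path in list_wav_paths('audiofiles'): followed by with sr.AudioFile(path) as source: and the same body as before.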

OSError: cannot open resources error while running .exe file

I just wrote a barcode generator. The program reads data from an XML file and generates a barcode from it. Everything works fine when I run the .py file, but after I convert it to an .exe, it no longer generates the barcode and instead shows OSError: cannot open resources.
My code is:
import barcode
from barcode.writer import ImageWriter
import xml.dom.minidom
DOMTree = xml.dom.minidom.parse('codedata.xml')
ENVELOPE = DOMTree.documentElement
QRCODE = ENVELOPE.getElementsByTagName("QRCODE")[0]
INVOICE = QRCODE.getElementsByTagName('INVOICE')[0]
dataget = INVOICE.childNodes[0].data
with open('barcode_image.jpeg', 'wb') as f:
    barcode.Code128(str(dataget), writer=ImageWriter()).write(f)  # the error is raised on this line
I don't know what to do. Please help me.
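No answer is recorded here, so this is only a possible direction. Errors of this kind in a frozen program often mean that a data file the code needs (a font, for instance) was not bundled or is being looked up in the wrong place. A common pattern, sketched below, resolves data files against sys._MEIPASS, the folder PyInstaller unpacks a one-file build into. The helper name resource_path is hypothetical, and whether this is the actual cause of the barcode error is an assumption:

```python
import os
import sys

def resource_path(relative_path):
    """Return an absolute path to a data file, bundled or not.

    In a PyInstaller one-file build, bundled files are unpacked to a temporary
    folder whose path is stored in sys._MEIPASS. When running the plain .py
    file that attribute does not exist, so fall back to the current directory.
    """
    base_dir = getattr(sys, '_MEIPASS', os.path.abspath('.'))
    return os.path.join(base_dir, relative_path)
```

For this to help, the file also has to be included in the build, for example with PyInstaller's --add-data option. The XML file could then be opened with xml.dom.minidom.parse(resource_path('codedata.xml')).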

Saving audio from mp4 as wav file using Moviepy Audiofile

I have a video file named 'video.mp4'. I am trying to separate a section of audio from the video and save it as a wav file that can be used with other Python modules. I want to do this with MoviePy.
I pass parameters to the write_audiofile function, specifying the filename, fps, nbytes, buffersize, and codec.
Following the MoviePy AudioClip docs, I specified the codec as 'pcm_s32le' for a 32-bit wav file.
from moviepy.editor import *
sound = AudioFileClip("video.mp4")
newsound = sound.subclip("00:00:13","00:00:15") #audio from 13 to 15 seconds
newsound.write_audiofile("sound.wav", 44100, 2, 2000,"pcm_s32le")
This code generates a .wav file, named 'sound.wav'.
Opening the audio file in Audacity
The resulting file, sound.wav, can be opened in Audacity, however I run into problems when I try to use it as a wav file with other Python modules.
Playing the sound file in pygame
import pygame
pygame.mixer.init()
sound=pygame.mixer.Sound("sound.wav")
The third line gives the following error:
pygame.error: Unable to open file 'sound.wav'
Determining type of sound file using sndhdr.what()
import sndhdr
sndhdr.what("sound.wav")
sndhdr.what() returned None. According to the docs, this means the function failed to determine the type of sound data stored in the file.
Reading the file with Google Speech Recognition
import speech_recognition as sr
r = sr.Recognizer()
audio = "sound.wav"
with sr.AudioFile(audio) as source:
    audio = r.record(source)
text = r.recognize_google(audio)
print(text)
This code stops execution on the second to last line:
ValueError: Audio file could not be read as PCM WAV, AIFF/AIFF-C, or Native FLAC; check if file is corrupted or in another format
Why does the audio file open in Audacity, if sndhdr.what() can not recognize it as an audio file type?
How can I properly export a MoviePy AudioClip as a wav file?
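One way to narrow this down is to ask Python's standard wave module what it makes of the file. As far as I know, speech_recognition tries that module first when opening a WAV. Audacity decodes many more formats, so a file can open there and still be rejected here. A minimal sketch; describe_wav is a hypothetical helper name:

```python
import wave

def describe_wav(path):
    """Return (channels, sample_width_in_bytes, frame_rate), or None if unreadable."""
    try:
        with wave.open(path, 'rb') as wav_file:
            return (wav_file.getnchannels(),
                    wav_file.getsampwidth(),
                    wav_file.getframerate())
    except (wave.Error, EOFError):
        # Not a RIFF/WAVE file, a truncated file, or a layout the wave module cannot parse.
        return None
```

If describe_wav("sound.wav") returns None, the header written during export is the thing to look at, rather than pygame or speech_recognition.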
I had the same issue both with no codec specified and with codec='pcm_s32le'; the one that worked for me was pcm_s16le.
Note that I am using the "fr-FR" language; you should adapt it to your needs.
Here is the entire code:
# Python code to convert video to audio
import moviepy.editor as mp
import speech_recognition as sr
# Insert Local Video File Path
clip = mp.VideoFileClip("/tmp/data/test.mp4")
# Insert Local Audio File Path
clip.audio.write_audiofile("/tmp/data/test.wav", codec='pcm_s16le')
# initialize the recognizer
r = sr.Recognizer()
# open the file
with sr.AudioFile("/tmp/data/test.wav") as source:
    # listen for the data (load audio to memory)
    audio_data = r.record(source)
    # recognize (convert from speech to text)
    text = r.recognize_google(audio_data, language="fr-FR")
    print(text)
I had the same issue. I was trying to get an mp4 file from a URL, convert it to a wav file, and run Google Speech Recognition on it. Instead, I used pydub to handle the conversion and it worked! Here's a sample of the code:
import requests
import io
import speech_recognition as sr
from pydub import AudioSegment

# This function translates speech to text
def speech_to_text(file):
    recognizer = sr.Recognizer()
    audio = sr.AudioFile(file)
    with audio as source:
        speech = recognizer.record(source)
    try:
        # Call recognizer with audio and language
        text = recognizer.recognize_google(speech, language='pt-BR')
        print("Você disse: " + text)  # "You said: "
        return text
    # If the recognizer doesn't understand the audio
    except sr.UnknownValueError:
        print("Não entendi")  # "I didn't understand"

# Convert to wav and write audio.wav to disk
def mp4_to_wav(file):
    audio = AudioSegment.from_file(file, format="mp4")
    audio.export("audio.wav", format="wav")
    return audio

# Convert to wav and return it as an in-memory file object
def mp4_to_wav_mem(file):
    audio = AudioSegment.from_file_using_temporary_files(file, 'mp4')
    file = io.BytesIO()
    file = audio.export(file, format="wav")
    file.seek(0)
    return file

url = ''
r = requests.get(url, stream=True)
file = io.BytesIO(r.content)
file = mp4_to_wav_mem(file)
speech_to_text(file)
Note that I wrote two functions: mp4_to_wav and mp4_to_wav_mem. The only difference is that mp4_to_wav_mem handles everything in memory, while mp4_to_wav writes a .wav file to disk.
I read the MoviePy docs and found that the nbytes parameter should be consistent with the codec. nbytes is the sample width (2 for 16-bit sound, 4 for 32-bit sound). Hence, it is better to set nbytes=4 when you set codec='pcm_s32le'.
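A small sketch of that rule: the sample width is looked up from the codec so the two cannot drift apart. The helper names and the lookup table are illustrative only, and the moviepy.editor import assumes MoviePy 1.x, as used elsewhere in this thread:

```python
# Sample width in bytes for the two PCM codecs discussed in this thread.
PCM_SAMPLE_WIDTHS = {'pcm_s16le': 2, 'pcm_s32le': 4}

def wav_export_kwargs(codec='pcm_s16le', fps=44100):
    """Build write_audiofile() keyword arguments with nbytes matched to the codec."""
    if codec not in PCM_SAMPLE_WIDTHS:
        raise ValueError('unsupported codec for this sketch: ' + codec)
    return {'fps': fps, 'nbytes': PCM_SAMPLE_WIDTHS[codec], 'codec': codec}

def export_wav(video_path, wav_path, codec='pcm_s16le'):
    """Extract the audio track of video_path into wav_path."""
    # Imported here so the helper above can be used without MoviePy installed.
    from moviepy.editor import AudioFileClip
    clip = AudioFileClip(video_path)
    try:
        clip.write_audiofile(wav_path, **wav_export_kwargs(codec))
    finally:
        clip.close()
```

For example, export_wav("video.mp4", "sound.wav", codec="pcm_s32le") would pass nbytes=4 instead of the 2 used in the question.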
I think this is the right method:
import os
from moviepy.editor import AudioFileClip
PATH = "files/"
fileName = "nameOfYourFile.mp4"
newFileName = "nameOfTheNewFile"
Ext = "wav"
AudioFileClip(os.path.join(PATH, fileName)).write_audiofile(os.path.join(PATH, f"{newFileName}.{Ext}"))
I think this approach is very easy to understand.
from moviepy.editor import *
input_file = "../Database/myvoice.mp4"
output_file = "../Database/myvoice.wav"
sound = AudioFileClip(input_file)
sound.write_audiofile(output_file, fps=44100, nbytes=4, buffersize=2000, codec="pcm_s32le")  # nbytes=4 matches the 32-bit codec

MP3 to FLAC for Google's Speech API

I'm trying to find a simple way to send an MP3 to Google for speech recognition. Currently, I'm using a subprocess to call SoX, which converts it to a WAV. Then SpeechRecognition converts it again to FLAC. Ideally, I'd like a more portable (not OS-specific) way to decode the MP3 and send it without saving any intermediate files.
Here's what I have currently:
import speech_recognition as sr
import subprocess
import requests
audio = requests.get('http://somesite.com/some.mp3')
with open('/tmp/audio.mp3', 'wb') as file:
    file.write(audio.content)
subprocess.run(['sox', '/tmp/audio.mp3', '/tmp/audio.wav'])
r = sr.Recognizer()
with sr.WavFile('/tmp/audio.wav') as source:
    audio = r.record(source)
result = r.recognize_google(audio)
del r
I've tried directly using the FLAC binaries included in SpeechRecognition, but the output was just static. I'm not too keen on distributing binaries on Git, but I will if that is the only way.
Some important links:
SR's code for speech recognition
SR's code for WAV to FLAC
Edit
I'm considering distributing SoX in a way like the FLAC binaries were, one for each OS, if SoX's license allows it...
On second thought, software licenses are confusing and I don't want to mess with that.
I decided to go with this:
import subprocess
import requests
import shutil
import glob
import json
audio = requests.get('http://somesite.com/some.mp3')
# Find SoX on the PATH, falling back to the usual Windows install location
sox = shutil.which('sox') or glob.glob(r'C:\Program Files*\sox*\sox.exe')[0]
# Read MP3 from stdin and write 16 kHz FLAC to stdout.
# An argument list (no shell) keeps a SoX path containing spaces intact.
p = subprocess.Popen([sox, '-t', 'mp3', '-', '-t', 'flac', '-', 'rate', '16k'], stdin=subprocess.PIPE, stdout=subprocess.PIPE)
stdout, stderr = p.communicate(audio.content)
url = 'http://www.google.com/speech-api/v2/recognize?client=chromium&lang=en-US&key=AIzaSyBOti4mM-6x9WDnZIjIeyEU21OpBXqWBgw'
headers = {'Content-Type': 'audio/x-flac; rate=16000'}
response = requests.post(url, data=stdout, headers=headers).text
result = None
# The response holds one JSON object per line; take the first transcript found
for line in response.split('\n'):
    try:
        result = json.loads(line)['result'][0]['alternative'][0]['transcript']
        break
    except (ValueError, KeyError, IndexError):
        pass
This is more of a middle ground, I suppose, borrowing some things from the SR module. It requires the user to install SoX, but it should work on all operating systems and doesn't create any intermediate files. However, I have only tested it on Linux.
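The response-parsing loop can be pulled out into a small function so it can be tried without network access. This is only a sketch. The response layout (one JSON object per line, with result, alternative and transcript keys) is taken from the code above rather than from any official documentation, and first_transcript is a made-up name:

```python
import json

def first_transcript(response_text):
    """Return the first transcript in a line-delimited JSON response, or None."""
    for line in response_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        try:
            return json.loads(line)['result'][0]['alternative'][0]['transcript']
        except (ValueError, KeyError, IndexError, TypeError):
            # Not JSON, or a line with an empty or differently shaped result.
            continue
    return None
```

With this, result = first_transcript(response) replaces the for loop, and result stays None when no line contains a transcript.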
