Speech recognition in python to include the whole duration - python

In the following code, I have an issue with speech recognition. The WAV file is only 4 seconds long, but I only get a transcript for the last two seconds.
import speech_recognition as sr
audio_data = sr.AudioFile('f1.wav')
recognizer = sr.Recognizer()
with audio_data as file_audio:
    print("Start Transcribing File ...")
    recognizer.adjust_for_ambient_noise(file_audio)
    file_audio = recognizer.record(file_audio)
print(recognizer.recognize_google(file_audio))
Any idea how to fix that, or how to convert the audio file to WAV with specific settings that would let me get the whole transcript?
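One likely cause, assuming the library defaults: adjust_for_ambient_noise() reads from the source to calibrate (its duration argument defaults to 1 second), so a later record() on the same source starts after the part that was already consumed. The sketch below skips the calibration and compares the recorded length with the file length. The helper names wav_duration and transcribe_whole_file and the "en-US" default are illustrative assumptions, not part of the original code.

```python
import wave


def wav_duration(path):
    """Length of a PCM WAV file in seconds, computed from its header."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def transcribe_whole_file(path, language="en-US"):
    """Record the entire file, then send it to the Google Web Speech API."""
    # Third-party package (pip install SpeechRecognition), imported lazily
    # so that wav_duration() stays usable without it.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(path) as source:
        # No adjust_for_ambient_noise() here: it would read from the
        # beginning of the file before record() gets to it.
        audio = recognizer.record(source)

    # AudioFile delivers mono frames, so bytes / (rate * width) is seconds.
    recorded = len(audio.frame_data) / (audio.sample_rate * audio.sample_width)
    print(f"file: {wav_duration(path):.2f}s, recorded: {recorded:.2f}s")
    return recognizer.recognize_google(audio, language=language)
```

If the two durations printed by transcribe_whole_file match and the transcript is still short, the audio quality or the language setting is the more probable culprit.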

Related

Python speech_recognition can't read wav file

I want to use recognize_google to analyze the number in the wav file
try:
    temp = r.recognize_google(".//splitAudio//split.wav", language="zh-TW")
    print("You have said \n" + temp)
    print("Audio Recorded Successfully \n ")
except Exception as e:
    print("Error : " + str(e))
but I got the following error: Error : audio_data must be audio data
I have been looking for an answer for a while, but I can't figure it out.
Can anyone help? I'd really appreciate it, thanks.
I do not have time to install stuff and find a WAV file but here's what The Ultimate Guide To Speech Recognition With Python says re "Using record() to Capture Data From a File"
import speech_recognition as sr
r = sr.Recognizer()
harvard = sr.AudioFile('harvard.wav')
with harvard as source:
    audio = r.record(source)
r.recognize_google(audio)

Why is error shown when looping through .wav files in a directory but works OK when not looping?

The following runs successfully:
import speech_recognition as sr
filename = 'audiofiles/myaudiofile.wav'
# initialise the recognizer
r = sr.Recognizer()
with sr.AudioFile(filename) as source:
    # listen for the data (load audio to memory)
    audio_data = r.record(source)
    # recognize (convert from speech to text)
    text = r.recognize_google(audio_data)
    print(text)
...and outputs the text of the words spoken in the .wav file.
When I run the following code (to check it will work for multiple files which I'll soon add to this directory):
import os
directory = 'audiofiles'
for filename in os.listdir(directory):
    with sr.AudioFile(filename) as source:
        # listen for the data (load audio to memory)
        audio_data = r.record(source)
        # recognize (convert from speech to text)
        text = r.recognize_google(audio_data)
        print(text)
        print('---')
...the text is output correctly, but then it's followed by the error below. Why? How can I fix this?
ValueError: Audio file could not be read as PCM WAV, AIFF/AIFF-C, or Native FLAC; check if file is corrupted or in another format
There was a hidden file that wasn't a .wav file. So I added if filename.endswith('.wav'): like this:
for filename in os.listdir(directory):
    if filename.endswith('.wav'):
        with sr.AudioFile(filename) as source:
            # listen for the data (load audio to memory)
            audio_data = r.record(source)
            # recognize (convert from speech to text)
            text = r.recognize_google(audio_data)
            print(text)
            print('---')
...and it worked successfully.
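A variant of the same idea, sketched with pathlib. It yields full paths (os.listdir() returns bare file names, which only resolve when the script runs from inside that directory), matches the extension case-insensitively, and skips hidden files such as .DS_Store. The function names list_wav_files and transcribe_directory are hypothetical.

```python
from pathlib import Path


def list_wav_files(directory):
    """Return full paths of the .wav files in `directory`, skipping hidden files."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file()
        and p.suffix.lower() == ".wav"
        and not p.name.startswith(".")
    )


def transcribe_directory(directory):
    """Map each WAV file name to its transcript (None if nothing was understood)."""
    import speech_recognition as sr  # third-party, imported lazily

    r = sr.Recognizer()
    results = {}
    for path in list_wav_files(directory):
        with sr.AudioFile(str(path)) as source:
            audio_data = r.record(source)  # load the whole file into memory
        try:
            results[path.name] = r.recognize_google(audio_data)
        except sr.UnknownValueError:
            results[path.name] = None  # speech could not be understood
    return results
```

Filtering before opening keeps one stray file from aborting the whole batch after earlier files have already been transcribed.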

Convert sound from website to text in python

How can I convert sound from a website to text? When I click a button on the website it plays a sound, but I want to convert that sound to text using only the website and Python, without a microphone.
import speech_recognition as sr
r = sr.Recognizer()
with sr.AudioFile('my.wav') as source:
    audio_text = r.listen(source)
    try:
        text = r.recognize_google(audio_text)
        print('Converting audio transcripts into text ...')
        print(text)
    except:
        print('Sorry.. run again...')
Here is my code, but I don't have a WAV file, just the voice coming from the website that I'm trying to convert.
Example of what I'm trying to make:
When I click the button on the website it plays "hello", and Python gets the sound from the website and prints it.
Try downloading the file first. I don't know the location or format of your audio file, so this is a guess:
EDIT: I added a URL to a real audio file and it works, though it fails with poor-quality audio.
import requests
import speech_recognition as sr
def download(url, path):
    response = requests.get(url)  # get the response of the url
    with open(path, 'wb') as file:  # create the file
        file.write(response.content)  # write response contents to the file

def transcribe(path):
    r = sr.Recognizer()
    with sr.AudioFile(path) as source:
        audio_text = r.record(source)
    text = r.recognize_google(audio_text)
    print('Converting audio transcripts into text ...')
    return text
audio_url = 'https://google.github.io/tacotron/publications/parrotron/audio/norm_vctk/03_norm_input.wav'
audio_path = './speech.wav'
download(audio_url, audio_path)
audio_text = transcribe(audio_path)
print(audio_text)
Output
Converting audio transcripts into text ...
this is a huge confidence boost
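Websites often serve MP3 or OGG audio, or an HTML error page, under a URL that looks like an audio link, and saving that as speech.wav only postpones the failure to sr.AudioFile. A small guard, sketched here under the assumption that the target is a RIFF/WAVE file, checks the first bytes before writing anything. The names looks_like_wav and download_wav and the 30-second timeout are illustrative choices.

```python
def looks_like_wav(data):
    """True if the bytes start with a RIFF header whose form type is WAVE."""
    return len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WAVE"


def download_wav(url, path, timeout=30):
    """Download `url` to `path`, refusing anything that is not WAV data."""
    import requests  # third-party, imported lazily

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # raise on 4xx/5xx instead of saving an error page
    if not looks_like_wav(response.content):
        raise ValueError(f"{url} did not return WAV data; convert it first")
    with open(path, "wb") as file:
        file.write(response.content)
    return path
```

The check only looks at the container header, so a file that passes it can still use an encoding that the recognizer rejects.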

Saving audio from mp4 as wav file using Moviepy Audiofile

I have a video file named 'video.mp4'. I am trying to separate a section of audio from the video and save it as a wav file that can be used with other Python modules. I want to do this with MoviePy.
I pass parameters to the write_audiofile function, specifying the filename, fps, nbytes, buffersize, and codec.
Following the MoviePy AudioClip docs, I specified the codec as ‘pcm_s32le’ for a 32-bit wav file.
from moviepy.editor import *
sound = AudioFileClip("video.mp4")
newsound = sound.subclip("00:00:13","00:00:15") #audio from 13 to 15 seconds
newsound.write_audiofile("sound.wav", 44100, 2, 2000,"pcm_s32le")
This code generates a .wav file, named 'sound.wav'.
Opening the audio file in Audacity
The resulting file, sound.wav, can be opened in Audacity, however I run into problems when I try to use it as a wav file with other Python modules.
Playing the sound file in pygame
import pygame
pygame.mixer.init()
sound=pygame.mixer.Sound("sound.wav")
The third line gives the following error:
pygame.error: Unable to open file 'sound.wav'
Determining type of sound file using sndhdr.what()
import sndhdr
sndhdr.what("sound.wav")
The sndhdr.what() call returned None. According to the docs, when this happens, the method failed to determine the type of sound data stored in the file.
Reading the file with Google Speech Recognition
import speech_recognition as sr
r = sr.Recognizer()
audio = "sound.wav"
with sr.AudioFile(audio) as source:
    audio = r.record(source)
text = r.recognize_google(audio)
print(text)
This code stops execution on the second to last line:
ValueError: Audio file could not be read as PCM WAV, AIFF/AIFF-C, or Native FLAC; check if file is corrupted or in another format
Why does the audio file open in Audacity if sndhdr.what() cannot recognize it as an audio file type?
How can I properly export a MoviePy AudioClip as a wav file?
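One way to see what was actually written is to read the file's 'fmt ' chunk directly. The sketch below uses only the standard library and returns the format tag, channel count, sample rate, and bits per sample. In the WAV format, tag 1 means plain PCM, 3 means IEEE float, and 0xFFFE (65534) means WAVE_FORMAT_EXTENSIBLE. Whether a non-PCM tag is what trips up the Python readers here is an assumption to verify against the output, and read_wav_format is a hypothetical helper name.

```python
import struct


def read_wav_format(path):
    """Return (format_tag, channels, sample_rate, bits_per_sample) of a WAV file."""
    with open(path, "rb") as f:
        header = f.read(12)
        if len(header) < 12:
            raise ValueError("file too short to be a WAV file")
        riff, _size, wave_id = struct.unpack("<4sI4s", header)
        if riff != b"RIFF" or wave_id != b"WAVE":
            raise ValueError("not a RIFF/WAVE file")
        # Walk the chunks until the 'fmt ' chunk turns up.
        while True:
            chunk_header = f.read(8)
            if len(chunk_header) < 8:
                raise ValueError("no 'fmt ' chunk found")
            chunk_id, chunk_size = struct.unpack("<4sI", chunk_header)
            if chunk_id == b"fmt ":
                fmt = f.read(16)
                if len(fmt) < 16:
                    raise ValueError("truncated 'fmt ' chunk")
                tag, channels, rate, _byte_rate, _align, bits = struct.unpack(
                    "<HHIIHH", fmt
                )
                return tag, channels, rate, bits
            # Chunks are padded to an even number of bytes.
            f.seek(chunk_size + (chunk_size & 1), 1)
```

Calling read_wav_format("sound.wav") on the exported file and on a file that the other modules accept shows which header fields differ between them.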
I had the same issue with no codec specified and with codec='pcm_s32le'; the one that worked for me was pcm_s16le.
Note that I am using the "fr-FR" language; you should adapt it to your needs.
Here is the entire code:
# Python code to convert video to audio
import moviepy.editor as mp
import speech_recognition as sr
# Insert Local Video File Path
clip = mp.VideoFileClip("/tmp/data/test.mp4")
# Insert Local Audio File Path
clip.audio.write_audiofile("/tmp/data/test.wav",codec='pcm_s16le')
# initialize the recognizer
r = sr.Recognizer()
# open the file
with sr.AudioFile("/tmp/data/test.wav") as source:
    # listen for the data (load audio to memory)
    audio_data = r.record(source)
    # recognize (convert from speech to text)
    text = r.recognize_google(audio_data, language="fr-FR")
    print(text)
I had the same issue. I was trying to get an mp4 file from a URL, convert it into a wav file, and run Google Speech Recognition on it. I used pydub to handle the conversion instead, and it worked! Here's a sample of the code:
import requests
import io
import speech_recognition as sr
from pydub import AudioSegment
# This function transcribes speech to text
def speech_to_text(file):
    recognizer = sr.Recognizer()
    audio = sr.AudioFile(file)
    with audio as source:
        speech = recognizer.record(source)
    try:
        # Call recognizer with audio and language
        text = recognizer.recognize_google(speech, language='pt-BR')
        print("Você disse: " + text)  # "You said: "
        return text
    # If the recognizer doesn't understand the speech
    except sr.UnknownValueError:
        print("Não entendi")  # "I didn't understand"

def mp4_to_wav(file):
    audio = AudioSegment.from_file(file, format="mp4")
    audio.export("audio.wav", format="wav")
    return audio

def mp4_to_wav_mem(file):
    audio = AudioSegment.from_file_using_temporary_files(file, 'mp4')
    file = io.BytesIO()
    file = audio.export(file, format="wav")
    file.seek(0)
    return file

url = ''
r = requests.get(url, stream=True)
file = io.BytesIO(r.content)
file = mp4_to_wav_mem(file)
speech_to_text(file)
Note that I wrote two functions: mp4_to_wav and mp4_to_wav_mem. The only difference is that mp4_to_wav_mem handles everything in memory, while mp4_to_wav writes a .wav file to disk.
I read the MoviePy docs and found that the nbytes parameter should be consistent with the codec. nbytes is the sample width (2 for 16-bit sound, 4 for 32-bit sound). Hence, it is better to set nbytes=4 when you set codec='pcm_s32le'.
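To keep the two settings from drifting apart, the codec can be derived from the sample width instead of being typed separately. This is a minimal sketch that assumes the MoviePy 1.x import path (moviepy.editor) and its write_audiofile keyword names (fps, nbytes, codec). PCM_CODECS, pcm_codec_for, and export_wav are hypothetical names introduced for illustration.

```python
# Sample width in bytes -> matching signed little-endian PCM codec (ffmpeg naming).
PCM_CODECS = {2: "pcm_s16le", 4: "pcm_s32le"}


def pcm_codec_for(nbytes):
    """Return the PCM codec name that matches a sample width in bytes."""
    try:
        return PCM_CODECS[nbytes]
    except KeyError:
        raise ValueError(f"unsupported sample width: {nbytes} bytes") from None


def export_wav(video_path, wav_path, start, end, nbytes=2, fps=44100):
    """Write the audio between `start` and `end` of a video to a WAV file."""
    # Third-party import, assuming MoviePy 1.x; imported lazily.
    from moviepy.editor import AudioFileClip

    clip = AudioFileClip(video_path).subclip(start, end)
    try:
        clip.write_audiofile(
            wav_path, fps=fps, nbytes=nbytes, codec=pcm_codec_for(nbytes)
        )
    finally:
        clip.close()  # release the underlying ffmpeg reader
```

For example, export_wav("video.mp4", "sound.wav", "00:00:13", "00:00:15") would write 16-bit PCM, the variant reported above to work with speech_recognition.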
I think this is the right method:
import os
from moviepy.editor import AudioFileClip
PATH= "files/"
fileName = "nameOfYourFile.mp4"
newFileName = "nameOfTheNewFile"
Ext = "wav"
AudioFileClip(os.path.join(PATH, f"{fileName}")).write_audiofile(os.path.join(PATH, f"{newFileName}.{Ext}"))
I think this approach is very easy to understand.
from moviepy.editor import *
input_file = "../Database/myvoice.mp4"
output_file = "../Database/myvoice.wav"
sound = AudioFileClip(input_file)
# nbytes=4 matches the 32-bit codec
sound.write_audiofile(output_file, fps=44100, nbytes=4, buffersize=2000, codec="pcm_s32le")

How to convert speech to text in python input from audio file

Speech to text in Python using an audio file.
This is an answer to that question.
You have to install SpeechRecognition (PyAudio is only needed for microphone input).
The audio file should be in WAV format.
Here is the code for speech to text (input from an audio file):
import speech_recognition as sr
r = sr.Recognizer()
audio = 'trial.wav'
with sr.AudioFile(audio) as source:
    audio = r.record(source)
    print('Done!')
try:
    text = r.recognize_google(audio)
    print(text)
except Exception as e:
    print(e)
If you want to transcribe a different language, you can use the code below.
import speech_recognition as sr
r = sr.Recognizer()
with sr.AudioFile('Audio.wav') as source:
    audio = r.record(source)  # read the whole file
try:
    # language takes a BCP-47 tag such as "hi-IN" (Hindi, India)
    text = r.recognize_google(audio, language="hi-IN")
    print('working on...')
    print(text)
except Exception:
    print('Sorry.. run again..')
