I haven't found anything online, though the SpeechRecognition page on PyPI (https://pypi.python.org/pypi/SpeechRecognition/) clearly shows that there's a difference. I have been using Google Speech Recognition (not the Google Cloud Speech API) without an API key, and I am not sure whether it is okay to do so.
Problem statement: I need to transcribe speech to text in real time and label the speakers as speaker 1 and speaker 2 using the Azure Cognitive Speech service.
So far I have explored the Azure documentation on conversation transcription, which provides sample code for JavaScript and C#, but I was not able to find sample code in Python. Does that mean this Azure service is not available in Python?
Is the Azure conversation transcription service available in Python?
No, at present the Conversation Transcription SDK does not support Python.
The Conversation Transcription SDK supports only C# and JavaScript, and is available only in a few regions such as centralus, eastasia, eastus, and westeurope.
You can reach Microsoft here for support.
A C# example of using the Microsoft Cognitive Vision API on real-time video can be found here: https://learn.microsoft.com/en-us/azure/cognitive-services/computer-vision/vision-api-how-to-topics/howtoanalyzevideo_vision
I cannot find anything similar for Python.
How would I go about doing that in Python?
I am building a personal assistant that needs to speak back in Hindi. I find it odd that Google Cloud Text-to-Speech doesn't offer Hindi:
https://cloud.google.com/text-to-speech/docs/voices
while Google Translate speaks the result aloud if you translate from English to Hindi and click the speaker button:
https://translate.google.com/
I read online that
https://translate.google.com.vn/translate_tts?ie=UTF-8&q=ANYTHING_TEXT&tl=en&client=tw-ob
can do the trick, but that is for English. If I change tl to hi and replace ANYTHING_TEXT with Hindi text, it should work, but doing so gives me this:
https://translate.google.com.vn/translate_tts?ie=UTF-8&q=आप%20कैसे%20हैं&tl=hi&client=tw-ob
The audio it returns is unintelligible.
So, my questions are:
1) Why can't we access a Hindi voice through Google Cloud when Google Translate has one?
2) How can I work around this to get a Hindi voice working in my Python file?
3) Google Cloud offers the Translation API, but it only translates text and returns text, not audio. Is that true?
https://cloud.google.com/translate/docs/
1) It seems that Google has already developed a Hindi voice that is offered on a best-effort basis in a free service such as Google Translate. If it isn't good enough, as you point out, they may be improving it before supporting it in the paid TTS service.
2) You would have to use an alternative TTS service that supports Hindi until the language is supported by Google's TTS API.
3) Yes, the Translation API is only meant for text. You could use the output of the Translation API as input for the TTS API, but since the TTS API doesn't support the language you want, you may have to submit a feature request to Google Cloud's Public Issue Tracker to show interest in Hindi support for this service.
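A side note on the URL from the question: non-ASCII text such as Hindi has to be sent as percent-encoded UTF-8 in the query string. Below is a minimal sketch of building that URL with the standard library. It assumes the unofficial translate_tts endpoint quoted in the question still accepts these parameters; it is not a supported API and may change or refuse requests. The function name build_tts_url is made up for illustration.

```python
from urllib.parse import quote, urlencode

# Unofficial endpoint quoted in the question; not a supported Google API (assumption).
TTS_ENDPOINT = "https://translate.google.com.vn/translate_tts"


def build_tts_url(text: str, lang: str) -> str:
    """Return the endpoint URL with `text` percent-encoded as UTF-8."""
    params = {"ie": "UTF-8", "q": text, "tl": lang, "client": "tw-ob"}
    # quote_via=quote encodes spaces as %20 (the default would use '+').
    return TTS_ENDPOINT + "?" + urlencode(params, quote_via=quote)


if __name__ == "__main__":
    print(build_tts_url("आप कैसे हैं", "hi"))
```

Whether the returned audio is intelligible is a separate matter from the encoding; this only makes sure the request carries the text correctly.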
I am developing a Python application for real-time translation. I need to recognize speech in real time: as the user says something, that piece of audio is automatically sent to the Google Speech API and the text is returned. So I want the recognized text to appear immediately while the user is speaking.
I've found Streaming Speech Recognition, but it seems that I still need to record the full speech first and then send it to the server. Also, there are no examples of how to use it in Python.
Is it possible to do this with Google Speech API?
You can do it with the Google Speech API.
However, it has a 1-minute content limit.
Please check the link below:
https://cloud.google.com/speech/quotas
So you have to restart the stream every minute.
The link below has example Python code for microphone streaming:
https://cloud.google.com/speech/docs/streaming-recognize#speech-streaming-recognize-python
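The restart-every-minute approach can be sketched as splitting the stream of audio chunks into consecutive sessions that each stay within the limit. This is a minimal sketch using only the standard library; the function name, the chunk duration, and the 60-second default are illustrative assumptions, and the actual call to the Speech API is left out.

```python
import itertools
from typing import Iterable, Iterator


def stream_sessions(chunks: Iterable[bytes],
                    chunk_seconds: float,
                    limit_seconds: float = 60.0) -> Iterator[Iterator[bytes]]:
    """Yield one lazy iterator of chunks per session, each within limit_seconds."""
    source = iter(chunks)
    # Number of whole chunks that fit in one session (at least one).
    max_chunks = max(1, int(limit_seconds // chunk_seconds))
    while True:
        try:
            first = next(source)  # stop cleanly when the audio source is exhausted
        except StopIteration:
            return
        # The session shares `source`, so chunks are pulled only as they are consumed.
        yield itertools.chain([first], itertools.islice(source, max_chunks - 1))


if __name__ == "__main__":
    fake_audio = (bytes([i]) for i in range(14))  # stand-in for microphone chunks
    for number, session in enumerate(stream_sessions(fake_audio, chunk_seconds=10)):
        # A real program would hand `session` to one streaming request here.
        print(number, len(list(session)))
```

Each session iterator would feed one streaming request. A session must be fully consumed before the next one is requested, since all sessions read from the same underlying iterator.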
Check this link out:
https://github.com/Uberi/speech_recognition/blob/master/examples/microphone_recognition.py
This is an example of obtaining audio from the microphone. The library offers several recognition engines. In my experience, Sphinx recognition lacks accuracy, while Google Speech Recognition works very well.
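Given that difference in accuracy, one option is to try Google first and fall back to Sphinx only when the request fails. The following is a rough sketch of that idea, not code from the linked example. It assumes the SpeechRecognition package (with PyAudio for the microphone, and PocketSphinx for the offline engine) is installed; the helper names are made up for illustration.

```python
def transcribe_with_fallback(recognizer, audio,
                             methods=("recognize_google", "recognize_sphinx")):
    """Try each recognizer method in order; return (method_name, text) for the first success."""
    errors = []
    for name in methods:
        try:
            return name, getattr(recognizer, name)(audio)
        except Exception as exc:  # broad on purpose, e.g. sr.UnknownValueError or sr.RequestError
            errors.append(f"{name}: {exc!r}")
    raise RuntimeError("all recognizers failed: " + "; ".join(errors))


def listen_once():
    """Capture one phrase from the default microphone and transcribe it."""
    # Imported lazily so the helper above has no hard dependency on the package.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        audio = recognizer.listen(source)
    return transcribe_with_fallback(recognizer, audio)
```

Calling listen_once() returns a pair such as the name of the engine that succeeded and the recognized text.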
Working with the Google Speech API for real-time transcription is a bit cumbersome. You can use this repository for inspiration:
https://github.com/saharmor/realtime-transcription
It transcribes the client-side microphone audio in real time (disclaimer: I'm the author).
I want to write a desktop Python app using the Google Drive API, but the documentation on Google's website is a little messy and twisted together with the Drive SDK, which makes things complicated and makes it difficult to find useful material.
Can anyone point me to some useful docs and sample code?
Thanks.
When using Twisted, you will need to adopt a policy whereby every Drive API call is deferred to a thread or threadPool. I recently wrote a blog post about it here:
http://unpythonic.blogspot.com/2012/07/calling-google-drive-api-and-other.html
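The policy described above can be sketched as follows. This is not taken from the blog post; it is a minimal sketch that assumes `service` is a Drive client object built elsewhere with google-api-python-client, and that Twisted is installed. The function names are made up for illustration.

```python
def list_files_blocking(service):
    """Blocking Drive call; must not run on the reactor thread."""
    # `service` is assumed to be a google-api-python-client Drive resource.
    return service.files().list().execute()


def list_files_deferred(service):
    """Run the blocking call in Twisted's thread pool and return a Deferred."""
    # Imported lazily so the blocking helper can be used without Twisted.
    from twisted.internet import threads

    return threads.deferToThread(list_files_blocking, service)
```

Callbacks added to the returned Deferred run back on the reactor thread once the Drive call has finished, so the event loop is never blocked by network I/O.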