I am working on a machine learning chatbot project that uses Google's speech recognition API.
My problem is that when I say two or more sentences in one command, the speech recognition API returns all of them in one string, without any full stops or commas. As a result, it is hard to separate the sentences. For example, if I say:
Take a photo. Tell me about today's weather. Open Google Chrome.
the speech recognition API returns:
take a photo tell me about todays weather open Google Chrome
so my chatbot treats the full string as one sentence.
Is there any way to extract sentences from a string like the one above?
(BTW, I am using Python)
If you are going to say multiple commands, join them with a word like "and" and split the transcript on that word. Then loop through the resulting list and pass each item to your execute function.
If the variable command holds the transcript, split it with command.split(" and ").
I previously answered a similar question; take a look at it:
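A minimal sketch of that idea, assuming the speaker really does say "and" between commands. The names split_commands and execute are hypothetical placeholders, not part of any speech API:

```python
def split_commands(transcript, separator=" and "):
    """Split one transcript into individual commands on a spoken separator word."""
    parts = transcript.split(separator)
    # Drop surrounding whitespace and any empty pieces.
    return [part.strip() for part in parts if part.strip()]


def execute(command):
    """Hypothetical stand-in for the chatbot's own command handler."""
    print(f"Executing: {command}")


transcript = "take a photo and tell me about todays weather and open Google Chrome"
for command in split_commands(transcript):
    execute(command)
```

Note that this also splits commands that legitimately contain the word "and", so a less common separator word may be safer.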
https://stackoverflow.com/a/65872940/12279129
I think you could try different approaches to solve the problem:
A Naive solution
I don't know how your system works, but if you are only looking for a known set of commands, you could check whether each one appears as a substring of the full transcript.
For example:
input_str = "Take a photo turn on fan".lower()
if "take a photo" in input_str:
    print("Just took a photo!")
if "turn on fan" in input_str:
    print("Just turned the fan on!")
Of course, you could also pick a separator word (like "and" or "furthermore") and split on it.
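The substring check can be generalized so that the commands come back in the order they were spoken. This is only a sketch: KNOWN_COMMANDS and find_commands are hypothetical names, and only the first occurrence of each phrase is reported.

```python
# Hypothetical list of phrases the chatbot knows how to handle.
KNOWN_COMMANDS = ["take a photo", "turn on fan", "open google chrome"]


def find_commands(transcript, known_commands):
    """Return the known phrases found in the transcript, in spoken order."""
    text = transcript.lower()
    hits = []
    for phrase in known_commands:
        position = text.find(phrase)
        if position != -1:
            hits.append((position, phrase))
    # Sorting the (position, phrase) pairs orders them by where they occur.
    return [phrase for _, phrase in sorted(hits)]


print(find_commands("Open Google Chrome take a photo", KNOWN_COMMANDS))
```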
A more advanced solution
You could use an NLP library (e.g. spaCy) and perform part-of-speech tagging so that you can separate verbs from nouns, and so on.
After that you could use stemming or lemmatization to generalize the recognition further.
You could also add intermediate steps with other NLP techniques, such as stopword removal.
Try auto punctuation from API
You could try enabling automatic punctuation in the Speech-to-Text API and see whether it works well enough for you.
That's because Google Cloud Speech doesn't provide natural language understanding, so you are stuck parsing text transcripts.
You can of course create the natural language understanding component yourself, either by using simple regular expressions or using something like Rasa, but there's a smarter way, too.
Speechly provides you with everything you need to create voice user interfaces on Android, iOS or the web. It returns not only the transcript, but also actionable intents and entities that make it a lot easier to create something a bit more complex. The best part is that it's free for up to 20 hours a month.
You can see a very simple example of how it works, for instance for creating search experiences, here. However, the basic idea is always the same: create a model and test that it returns the correct intents for your speech input. After you are done, you integrate it into your app by looping through the returned results and, whenever you get the correct intent, reacting in your application as needed. It's actually very simple.
You can use the split method.
Say your string is A:
X = A.split('.')
This makes X a list whose items are the sentences. Note that it only works if the string actually contains full stops.
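As a rough sketch, assuming the transcript does contain full stops (for example because automatic punctuation was enabled), the split can be tidied up by trimming whitespace and dropping the empty item left after the final full stop. split_sentences is a hypothetical helper name:

```python
def split_sentences(text):
    """Split punctuated text on full stops, discarding empty pieces."""
    return [sentence.strip() for sentence in text.split(".") if sentence.strip()]


for sentence in split_sentences("Take a photo. Tell me about today's weather. Open Google Chrome."):
    print(sentence)
```

On the unpunctuated transcript from the question, this returns the whole string as a single item.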
Related
I have a Python project in mind but I'm not too sure where to start.
I want to do some text comparison between two blocks of text: a user should be able to input two blocks of text, and the program should identify the parts that differ.
I've seen this functionality in Git: when you make a change in a repo, it shows you the changes before you commit. This makes me think that I should be able to build something with similar functionality.
Any kind of insight would be greatly appreciated!
EDIT:
While searching I came across this Git repo online, and it's exactly what I'm looking for: a simple GUI where a user can load two different files and see the similarities or differences between them!
For others looking for something similar: https://github.com/yebrahim/pydiff
From my point of view, you can take the user input and store it in two strings, say str1 and str2. Then you can use the split() method, or word_tokenize() from an NLP library, to get all the words in each string.
If you want, you can also remove stopwords for a better comparison.
Now you can run a loop comparing each word and, to make the result clear, underline the words or parts of words that don't match.
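One possible way to do the word-by-word comparison is sketched below. It swaps the hand-written loop for difflib.SequenceMatcher from Python's standard library, which also copes with inserted or deleted words. The helper name diff_words and the sample strings are illustrative assumptions:

```python
import difflib


def diff_words(str1, str2):
    """Return (tag, words_from_str1, words_from_str2) for each span that differs."""
    words1 = str1.split()
    words2 = str2.split()
    matcher = difflib.SequenceMatcher(a=words1, b=words2, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # Tags are 'equal', 'replace', 'delete' or 'insert'; keep the non-equal ones.
        if tag != "equal":
            changes.append((tag, words1[i1:i2], words2[j1:j2]))
    return changes


for tag, old, new in diff_words("the quick brown fox", "the quick red fox jumps"):
    print(tag, old, new)
```

Each reported span could then be highlighted or underlined in whatever interface displays the two texts.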
I am currently working on a fairly small project in Python, where I am building a kind of voice assistant that interacts with some gaming APIs, like the Destiny 2 API.
The big problem I am running into is the recognition of usernames (gamertags) such as Ultra_Luck_y, which the speech_recognition module for Python that I am using clearly doesn't understand, so it just returns Ultra Lucky.
I also tried spelling it out, but the letters were automatically put together into words.
So my question is whether there is a solution (no matter how crappy), or whether I have to go about this a different way.
Thanks to Azamat Galimzhanov, I solved it with the radio alphabet, although I edited it so the words were a little shorter.
I simply put all the references in one JSON document and loaded them as a dictionary with:
import json
with open('usernames.txt') as json_file:
    alphabet = json.load(json_file)  # dict mapping each spoken code word to a character
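Once the dictionary is loaded, decoding a spelled-out name might look like the sketch below. The code words in ALPHABET are assumptions for illustration (the shortened words actually used are not shown), and decode_spelling is a hypothetical helper:

```python
# Hypothetical code words; in the real project these come from the JSON file.
ALPHABET = {
    "charlie": "c",
    "kilo": "k",
    "lima": "l",
    "uniform": "u",
    "yankee": "y",
    "under": "_",
}


def decode_spelling(transcript, alphabet):
    """Turn a transcript of spoken code words into the characters they stand for."""
    characters = []
    for word in transcript.lower().split():
        if word not in alphabet:
            raise ValueError(f"Unknown code word: {word!r}")
        characters.append(alphabet[word])
    return "".join(characters)


print(decode_spelling("Lima uniform Charlie kilo under yankee", ALPHABET))  # luck_y
```

Upper-case letters would need their own convention, such as a spoken "capital" prefix, which this sketch does not handle.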
I am working on a project based on natural language understanding.
What I am currently trying to do is link pronouns to their respective antecedents, and I am building a model for that. I have worked out the basic part, but to complete the task I need to understand the narrative of the sentence. So what I want is to check, using an API in Python, whether the noun and the object are associated with each other by the verb.
Example:
method(laptop, have, operating-system) = yes
method(program, have, operating-system) = No
method("he"/"proper_noun", play, football) = yes
method("he"/"proper_noun", play, college) = No
I've heard about NLTK's WordNet API, but I am not sure whether it can do this. Can it be used?
Also, I am kind of on a clock.
Any suggestions are welcome and appreciated.
Notes: I am using parsey-mcparseface to break the sentence. I could do the same with nltk but P-MPF is more accurate.
** Why isn't there an NLU tag available? **
Edit 1:
Thanks to alexis, I now know that the thing I am trying to do is called "Anaphora Resolution".
The name for what you want is "anaphora resolution", or "coreference resolution". It's a hard problem (probably harder than you realize; NLP tasks are like that), so unless your purpose is just to learn, I recommend you try some existing solutions. I don't know of an anaphora resolution module in NLTK itself, but you can find one as part of the Stanford CoreNLP suite.
See this question about how to interface to it from NLTK. (I haven't tried it myself.)
I have been wanting to create an application using the Microsoft Speech Recognition.
My application's users are expected to say abbreviations often, such as 'LHC' (for 'Large Hadron Collider') or 'CERN'. If I say those two in that order, my application returns:
You said: At age C.
You said: Cern
While it did work for 'CERN', it failed very badly for 'LHC'.
However, if I could make my own custom training files, I could easily place the term 'LHC' somewhere in there. Then, I could make the user access the Speech Control Panel and run my training file.
All the links I have found for this have been frustratingly useless, as they just say things like 'This is ----, you should try going to the ---- forum instead'.
If it does help, here is a list of the links:
http://compgroups.net/comp.speech.users/add-my-own-training/153194
https://groups.google.com/forum/#!topic/microsoft.public.speech.server/v58SH1ov22s
http://social.msdn.microsoft.com/Forums/en/servercorefordevelopers/thread/f7a35f3f-b352-464a-b264-e16eb4afd049
Is my problem even possible? Or are the training files themselves in a special format? If so, can that format be reproduced?
A solution that can also work on Windows XP would be ideal.
Thanks in advance!
P.S. If there are any libraries or modules out there already for this, could anyone point me to some? A Python or C/C++ solution would be splendid. Also, since I'd rather not post another question regarding this, is it possible to utilize the train utilities from command prompt (or without the GUI visible, but still having total command of all controls)?
Okay, pulling this from a thing I wrote three or four years ago now, but I believe you want to do something like this.
The grammar library is a trained system which can recognize words. You can create your own grammar library cued to specific words.
C#, sorry
using System.Speech;
using System.Speech.Recognition;
using System.Speech.AudioFormat;
SpeechRecognitionEngine sre = new SpeechRecognitionEngine();
string[] words = {"L H C", "CERN"};
Choices choices = new Choices(words);
GrammarBuilder gb = new GrammarBuilder(choices);
Grammar grammar = new Grammar(gb);
sre.LoadGrammar(grammar);
That is as far as I can get you. From the docs, it looks like you can define pronunciations somehow, so perhaps that way you could have LHC map directly to a single word. Here are the docs on the Grammar class - http://msdn.microsoft.com/en-us/library/system.speech.recognition.grammar.aspx
Small update - see example in their docs here http://msdn.microsoft.com/en-us/library/ms554228.aspx
I am trying to write a program that will give an apt title when an article (usually an abstract) is given. Is there any standard algorithm available?
If you want to do it by hand, you'd have to start with something like word frequency counting, then analyzing phrases that appear a lot or words that appear around each other. I have only briefly touched this topic in Java, but there seems to be a good book for Python that deals with text analysis:
Text Processing in Python
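As a starting point for the word frequency counting mentioned above, here is a minimal sketch using only the standard library. The stopword set is a small illustrative assumption, and top_terms is a hypothetical helper; picking a title from the top terms is left open:

```python
import re
from collections import Counter

# Small illustrative stopword set; a real one would be much longer.
STOPWORDS = {"a", "an", "and", "in", "into", "is", "of", "the", "to"}


def top_terms(text, n=5):
    """Return the n most frequent non-stopword terms with their counts."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(word for word in words if word not in STOPWORDS)
    return counts.most_common(n)


abstract = "Speech recognition turns speech into text. Speech is hard."
print(top_terms(abstract, n=1))
```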
OpenFTS, an open full-text search engine, has a Python interface called PyFTS.
Check it out. Maybe that's what you want.