I whipped up a basic voice assistant in Python as a self-learning project, using what I already know plus some research.
Link to the code is here
I basically convert the audio to text, split it into words, and look for keywords that trigger a response or an action. It is not very intelligent, but it works for the time being.
How else can I look for keywords? Is there a better, more efficient approach than a thousand lines of ifs and elifs?
Another problem: I built a GUI for this program so I could interact with it at the click of a button, but the window stops responding after I click the button. It turns out this is a known problem, and I don't know how to get around it because I don't understand threads, processes, and queues. I am hoping that someone can help me with this.
I would like to point out that if I have to do any learning for this project, I would be interested in doing that, since the idea behind this whole project is learning how to code or build an AI, which may sound stupid.
PS: I implemented, well, sort of, an always-listening feature by keeping the function in a while loop. I would also like a voice trigger to wake up the assistant. Any help with that would be much appreciated.
And also, help me set a name to this assistant, preferably female.
The code is here:
import os
import time
import random
import webbrowser
import tkinter as tk
from gtts import gTTS
from mutagen.mp3 import MP3
from PIL import ImageTk, Image
from playsound import playsound
import speech_recognition as sr
from weather import Weather, Unit
def startAssistant():
    # Keep listening until mainFunction() returns 0 (the user asked to exit).
    # Call it once per iteration and compare with ==, not `is`.
    while True:
        if mainFunction() == 0:
            break
def doNothing(): print("I don't do anything apart from printing this line of course!")
def mainFunction():
    f = open("assistant.txt", "a")

    # Printing what a user is saying for better user experience
    def say(text):
        print(text)
        f.write("\n" + text + "\n")
        return text

    # This function will take inputs to talk back
    def talkBack(text, recordingName):
        # Variable Declaration
        extension = ".mp3"
        # Synthesising the response as speech
        tts = gTTS(text=say(text), lang="en-us")
        # Saving the response files
        fileName = recordingName + extension
        audioPath = "audioFiles\\"
        responseFile = audioPath + fileName
        # Checking to see if the file is already created
        if not os.path.exists(responseFile):
            tts.save(responseFile)
        # Playing the audio
        playsound(responseFile)

    # Initialising things here
    recognizer = sr.Recognizer()
    microphone = sr.Microphone()
    # Asking for input and saving that
    with microphone as source:
        print("Speak:")
        audio = recognizer.listen(source)
    # Converting audio into text
    try:
        convertedAudio = recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        # If there is an UnknownValueError, this will kick in
        talkBack("I am sorry, I couldn't quite get what you said. Could you please say that again?", "UnknownValueError")
        return  # keep listening
    convertedAudioSplit = convertedAudio.split()
    # Printing what was picked up when the user Spoke and also logging it
    print("\n" + convertedAudio + "\n")
    f.write("\n" + convertedAudio + "\n")
    # Start of a conversation
    if "hello" in convertedAudioSplit:
        talkBack("Hi, how are you doing today?", "hello")
    # Wishing people based on the time of the day
    elif "morning" in convertedAudioSplit:
        talkBack("Good morning! The sun's shining bright, let's head out for a run. We'll get back and make a healthy breakfast for ourselves", "morning")
    elif "afternoon" in convertedAudioSplit:
        talkBack("Good afternoon! You must be hungry right about now, why don't you break for lunch?", "afternoon")
    elif "night" in convertedAudioSplit:
        talkBack("Nighty night sleepy pot! Get a good night's sleep while I learn more to be more helpful to you tomorrow.", "night")
    # Getting her information
    elif "doing" in convertedAudioSplit:
        talkBack("I am doing very good, Thank you for asking!", "doing")
    # Making the assistant open web browser with a URL
    elif "Google" in convertedAudioSplit:
        talkBack("Okay, lets get you to Google.", "google")
        # Opening the browser with the required URL
        webbrowser.open("https://www.google.com/", new = 1)
    # Brings the weather report
    elif "weather" in convertedAudioSplit:
        weatherVariable = Weather(unit=Unit.CELSIUS)
        location = weatherVariable.lookup_by_location('bangalore')
        condition = location.condition.text
        talkBack("It is {0} right now in Bengaluru.".format(condition), "weather")
    # Exiting the program on user's consent
    elif "exit" in convertedAudioSplit:
        talkBack("Sure, if that's what you want! I will miss you, have a good day.", "exit")
        return 0
    # When things go out of the box
    else:
        # Out of scope reply
        talkBack("I am a demo version. When you meet the completed me, you will be surprised.", "somethingElse")
        return 0
root = tk.Tk()
root.title("Voice Assistant")
mainFrame = tk.Frame(root, width = 1024, height = 720, bg = "turquoise", borderwidth = 5)
menu = tk.Menu(root)
root.config(menu=menu)
subMenu = tk.Menu(menu)
startButton = tk.Button(mainFrame, text="Interact", command = startAssistant)
startButton.place(relx = 0.5, rely = 1.0, anchor = tk.S)
menu.add_cascade(label="File", menu=subMenu)
subMenu.add_command(label="Do Nothing", command=doNothing)
subMenu.add_separator()
subMenu.add_command(label="Exit", command=root.quit)
mainFrame.pack()
root.mainloop()
One potential solution is to use a simpler GUI package. Perhaps the GUI package PySimpleGUI would be a fit. It could solve your GUI problem and free you up to work on the other portions of your project.
Check out the Chat Demo that implements a chat front-end. There's also a Chatterbot Demo that implements a front-end to the Chatterbot project.
You can start by copying that code and modifying it.
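The question also asks for an alternative to a long if/elif chain. One common option is a dispatch table: a dictionary that maps each keyword to a handler function. The sketch below is only an illustration; the names `HANDLERS`, `respond`, `greet`, `open_google` and `say_goodbye` are hypothetical and not part of the original program, and the handlers return the reply text instead of speaking it so that the lookup logic stays easy to test.

```python
def greet(words):
    return "Hi, how are you doing today?"


def open_google(words):
    # A real handler could also call
    # webbrowser.open("https://www.google.com/", new=1) here.
    return "Okay, let's get you to Google."


def say_goodbye(words):
    return "Sure, if that's what you want! I will miss you, have a good day."


# keyword -> handler; adding a command means adding one entry here
HANDLERS = {
    "hello": greet,
    "google": open_google,
    "exit": say_goodbye,
}

FALLBACK = "I am a demo version. When you meet the completed me, you will be surprised."


def respond(sentence):
    """Return the reply for the first known keyword in the sentence."""
    words = sentence.lower().split()
    for word in words:
        handler = HANDLERS.get(word)
        if handler is not None:
            return handler(words)
    return FALLBACK


print(respond("Hello there"))
print(respond("please open Google"))
```

Unlike the if/elif chain, the first keyword in the spoken sentence wins here, rather than the first branch in the code. The recognised text is lower-cased so that "Google" and "google" hit the same entry.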
When I run the code the window will appear after the code finished and if I add the main loop at the start the code won't run until I close the window. I want the window to update every time I add a label variable in my code. I searched on multiple docs but they all seem to give the same answer and it did not work.
import pyttsx3
import datetime
import wikipedia
import webbrowser
import os
import tkinter
import speech_recognition as sr
from notifypy import Notify
window = tkinter.Tk()
window.title("GUI")
window.geometry('500x500')
engine = pyttsx3.init('sapi5')
voices = engine.getProperty('voices')
engine.setProperty('voice', voices[0].id)
# print(voices)
def speak(audio):
    engine.say(audio)
    engine.runAndWait()

def wishme():
    hour = int(datetime.datetime.now().hour)
    if hour>=0 and hour<=12:
        speak("good morning sir")
        lab1 = tkinter.Label(window,text="Good morning sir").pack()
    elif hour>=12 and hour<=18:
        speak("good afternoon sir")
        lab2 = tkinter.Label(window,text="Good afternoon sir").pack()
    elif hour>=18 and hour<=22:
        speak("good evening sir")
        lab3 = tkinter.Label(window,text="Good evening sir").pack()
    else:
        speak("good night sir")
        lab4 = tkinter.Label(window,text="Good night sir").pack()
    lab5 = tkinter.Label(window,text="I am D bot,how may I help you").pack()
    speak("I am D bot,how may I help you")

def takecommand():
    r = sr.Recognizer()
    with sr.Microphone() as sourse:
        lab6 = tkinter.Label(window,text="listning...").pack()
        r.pause_threshold = 1
        audio = r.listen(sourse)
    try:
        lab7 = tkinter.Label(window,text="recognizing...").pack()
        query = r.recognize_google(audio,language='en-in')
        # print(query)
        lab8 = tkinter.Label(window,text=query).pack()
    except Exception as e:
        lab9 = tkinter.Label(window,text="Good morning").pack()
        lab10 = tkinter.Label(window,text="say that again please").pack()
        speak("say that again please")
        takecommand().lower
        return "none"
    return query

def wiki():
    # if 'wikipedia' in query:
    lab11 = tkinter.Label(window,text="searching wikipedia").pack()
    speak('searching wikipedia...')
    results = wikipedia.summary(query, sentences=2)
    lab12 = tkinter.Label(window,text="according to wikipedia").pack()
    speak("according to wikipedia")
    # print(results)
    lab13 = tkinter.Label(window,text=results).pack()
    speak(results)
    lab14 = tkinter.Label(window,text="check the notification for more details").pack()
    speak('check the notification for more details')
    notification = Notify()
    notification.title = "check out this website for more details"
    notification.message = 'https://en.wikipedia.org/wiki/Main_Page'
    notification.icon='G:\code projects\python\D bot\drone_115355.ico'
    notification.application_name="D bot"
    notification.send()
if __name__=="__main__":
    wishme()
    while True:
    # if 1:
        query = takecommand().lower()
        # query = "play music"
        if 'open youtube' in query:
            webbrowser.get(using=None).open_new_tab("https://youtube.com/")
            # C:\\Program Files\\Google\\Chrome\\Application\\chrome.exe
        # elif "close" in query:
        #     break
        elif 'open amazon' in query:
            webbrowser.get(using=None).open_new_tab("https://www.amazon.com/")
        elif 'open gmail' in query:
            webbrowser.get(using=None).open_new_tab("https://mail.google.com/mail/u/0/#inbox")
        elif 'open google' in query:
            google = "C:\\Program Files\\Google\\Chrome\\Application\\chrome.exe"
        elif 'open chrome' in query:
            google = "C:\\Program Files\\Google\\Chrome\\Application\\chrome.exe"
            os.startfile(google)
        elif 'open stack overflow' in query:
            webbrowser.get(using=None).open_new_tab("https://stackoverflow.com/")
        elif "what's the time" in query:
            strtime = datetime.datetime.now().strftime('%H:%M:%S')
            # print('the time is '+strtime)
            lab15 = tkinter.Label(window,text="Hello sir,nice to meet you,how may i help you"+strtime).pack()
            speak('the time is '+strtime)
window.mainloop()
I am so sorry the mainloop was not shown in the code. I have now edited the code.
All you need to do is add these two calls wherever you want the window to update:
window.update_idletasks()
window.update()
Programs written for event-driven GUI toolkits like tkinter are significantly different from standard Python scripts.
Once you have created a window, filled it with widgets and have initialized global data, you need to start the mainloop.
Without a running mainloop, there is no interaction with the GUI.
In essence, your program consists of a bunch of functions that are called from the mainloop, in response to the user operating controls, or timers expiring.
(I'm leaving out complications like threading on purpose, for the sake of simplicity.)
I don't think that pyttsx3 was written with event-driven GUIs in mind. So I suspect you will have to run it in a separate thread or process. Both threads and processes have their pros and cons.
If you are using a process to run pyttsx3, you have to explicitly send the data to the GUI process using interprocess communication. On the other hand, the sound collection process cannot interfere with the GUI.
You could probably relatively easily test it separately from the GUI.
If you use threads you might have issues with responsiveness because in CPython only one thread at a time can be executing Python bytecode.
On the other hand, transferring the data to the GUI is trivial since both live in the same address space.
In my experience, calling tkinter functions or methods from a second thread is possible if two conditions are met:
you are using Python 3, and
the tcl and tk used were built with support for threads enabled.
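The thread-based option described above can be sketched as follows. This is a minimal illustration, not the original program: `slow_job`, `drain` and `main` are hypothetical names, and `slow_job` merely stands in for blocking work such as listening or speaking. The worker thread never touches a widget; it only puts text on a queue, and the GUI thread polls that queue with `after`, so the mainloop keeps running.

```python
import queue
import threading
import time


def slow_job(out_queue):
    """Stand-in for blocking work such as listening or speaking."""
    for step in ("listening...", "recognizing...", "done"):
        time.sleep(0.1)
        out_queue.put(step)


def drain(in_queue):
    """Return all messages currently waiting, without blocking."""
    messages = []
    while True:
        try:
            messages.append(in_queue.get_nowait())
        except queue.Empty:
            return messages


def main():
    # Imported here so the helpers above can be used without a display.
    import tkinter

    window = tkinter.Tk()
    messages = queue.Queue()

    def poll():
        # Runs in the GUI thread: the only place widgets are touched.
        for text in drain(messages):
            tkinter.Label(window, text=text).pack()
        window.after(100, poll)  # check the queue again in 100 ms

    threading.Thread(target=slow_job, args=(messages,), daemon=True).start()
    poll()
    window.mainloop()
```

Calling `main()` should open a window in which the three labels appear one after another while the window stays responsive.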
I would like your opinion and support on an issue I am trying to overcome. This is the last piece of the puzzle for a small OCR-based project I am building. I am reading text from a live screen (using the Python script below) and can get the results logged into a file. However, the output only gets logged once I make the Python console window (in which the script prints the output) active/focused with the keyboard using alt+tab.
But doing this halts the software from which I am reading the text, breaking the whole process. Bringing the console window in front of that software defeats the script's purpose.
So, after searching other users' posts, I added code to keep the Python console window always on top no matter what the software is doing. I am still not able to keep the console window on top of the software's screen. The software uses the whole screen for its work.
Is there an alternative to this? How can I make the Python console stay on top of any other window, no matter what is on the screen? If that is not possible, please suggest an alternative.
import numpy as nm
from datetime import datetime
import pytesseract
import cv2
import PIL
from PIL import ImageGrab
import win32gui, win32process, win32con
import os
hwnd = win32gui.GetForegroundWindow()
win32gui.SetWindowPos(hwnd,win32con.HWND_TOPMOST,0,0,100,300,0)
#Define function for OCR to enable on multiple screens.
def imToString():
    # Path of tesseract executable
    pytesseract.pytesseract.tesseract_cmd ='C:\\Tesseract-OCR\\tesseract.exe'
    while(True):
        # ImageGrab-To capture the screen image in a loop.
        # Bbox used to capture a specific area.
        #screen base
        cap1 = PIL.ImageGrab.grab(bbox =(0, 917, 1913, 1065), include_layered_windows=False, all_screens=True)
        date = datetime.now().strftime("%Y-%m-%d %I:%M:%S")
        #str config - OCR Engine settings for ONLY text based capture.
        config1 = ('-l eng --oem 2 --psm 6')
        #configuring tesseract engine for OCR
        tess1 = pytesseract.image_to_string(
            cv2.cvtColor(nm.array(cap1), cv2.COLOR_BGR2GRAY),
            config=config1)
        #Defining log pattern to generate
        a = [ date, " State: ", tess1 ]
        #writing logging output to file
        file1 = open("C:\\Users\\User\\Desktop\\rev2.log", "a", encoding='UTF8')
        file1.writelines(a)
        file1.writelines("\n")
        file1.close()
        #OUTPUT on console for Logging verification
        print (date, "State: ", tess1)

# Calling the function
imToString()
By requirement, I am not allowed to use a keyboard while operating the screen. I am fairly new to Python and have been looking at similar solutions and adding them to the script to arrive at a proper solution.
Please advise.
Here is the tkinter method:
from tkinter import Tk, Text
import subprocess
import threading
from queue import Queue, Empty
filename = 'test.py'
def stream_from(queue):
    command = f'python {filename}'
    with subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE) as process:
        for out in process.stdout:
            queue.put(out.decode())

def update_text(queue):
    try:
        data = queue.get(block=False)
    except Empty:
        pass
    else:
        text.config(state='normal')
        text.insert('end', data.strip() + '\n')
        text.config(state='disabled')
    finally:
        root.after(100, update_text, queue)

def loop():
    root.attributes('-topmost', True)
    root.after(1, loop)
root = Tk()
text = Text(root, state='disabled')
text.pack()
data_queue = Queue()
threading.Thread(target=stream_from, args=(data_queue, ), daemon=True).start()
update_text(data_queue)
loop()
root.mainloop()
Just change the filename to the name of the file you are running and place this script in the same directory
Change delay_in_ms (delay in milliseconds so each 1000 units is one second) and see if that helps (could also leave the 10 seconds and see if it works now, if not there is still another thing to try)
If you are using print to output then this should work (tho I maybe didn't exactly get what you want), if you use logging then there is a slightly different solution
I'm new to the Python language and have been learning and working with it for 2 days now.
I'm writing code to send grbl files to my CNC machine.
My code:
def grbl_sturing(Gcode_file):
    print('running')
    lbl_Running = Label(root, text="running")
    lbl_Running.grid(row=0, column=2)
    #Grbl setup
    poort = serial.Serial('com11',115200)
    code = open(Gcode_file,'r');
    poort.write(b'\r\n\r\n')
    time.sleep(2)
    poort.flushInput()
    #sturing
    for line in code:
        l = line.strip()
        print ('Sending: ' + l)
        poort.write(l.encode() + b'\r\n')
        grbl_out = poort.readline()
        print (' : ' + str(grbl_out.strip()))
    #Grbl afsluiten
    code.close()
    poort.close()
So when I press a button in my tkinter window, this function is called. My intention was for a label and my cmd to tell me that the program is sending/running.
But when I press this button, my cmd shows this:
running
Sending: $H
: b'ALARM:9'
Don't mind the alarm; it appears because the CNC machine isn't powered.
In the cmd it works as intended, but in my tkinter window the job runs first, and only when it is done does the label show that it is running. Why does it do this, and how can I fix it? Thank you in advance.
PS: sorry for my bad English.
So, what is happening here is that you create the label, but the GUI doesn't redraw until control returns to the main loop. To make it update at that point in your function, call:
TK.update()
where TK is your Tk root object (root in your code). This forces the interface to update immediately instead of waiting for the main loop.
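The ordering that matters here (set the label, force a redraw, then start the blocking work) can be sketched without a GUI at all. In this minimal illustration `run_with_status` is a hypothetical helper, not part of tkinter; it receives the three steps as callables so that the order is easy to see and to check.

```python
def run_with_status(set_status, refresh, job):
    """Show a status, force a redraw, then run the blocking job."""
    set_status("running")
    refresh()  # in tkinter this would be root.update
    try:
        return job()
    finally:
        # Runs even if the job raises, so the label never stays stuck.
        set_status("done")
        refresh()


# Stand-ins that only record the order in which things happen.
log = []
result = run_with_status(
    log.append,
    lambda: log.append("<redraw>"),
    lambda: log.append("<job>") or "ok",
)
print(log)
```

In the program from the question, `set_status` could be something like `lambda s: lbl_Running.config(text=s)`, `refresh` could be `root.update`, and `job` would wrap the serial sending loop.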
I want to write the output in a separate notepad or ms word using python package keyboard.
import keyboard
keyboard.write('The quick brown fox jumps over the lazy dog.')
But it writes this sentence in the command prompt where I run the script, not in Notepad.
How can I make it control the other software?
You need to give your other application focus.
When searching for a method, I found this blog post, which shows how it can be done in Windows:
https://www.blog.pythonlibrary.org/2014/10/20/pywin32-how-to-bring-a-window-to-front/
import win32gui
def windowEnumerationHandler(hwnd, top_windows):
    # Collect (handle, title) for every top-level window
    top_windows.append((hwnd, win32gui.GetWindowText(hwnd)))

if __name__ == "__main__":
    results = []
    top_windows = []
    win32gui.EnumWindows(windowEnumerationHandler, top_windows)
    for i in top_windows:
        if "notepad" in i[1].lower():
            print(i)
            win32gui.ShowWindow(i[0],5)
            win32gui.SetForegroundWindow(i[0])
            break
After the application has focus, you can use your simulated key presses.
You can use the pywinauto package, which is more efficient and friendly:
from pywinauto.application import Application
app = Application(backend="uia").start('notepad.exe')
# describe the window inside Notepad.exe process
window = app.UntitledNotepad # or app['Untitled - Notepad'], its the same
# wait till the window is really open
window_ready = window.wait('visible')
# Write in some text
app.UntitledNotepad.Edit.type_keys("Hello world", with_spaces = True)
NOTE: Some lines are adapted from the documentation
Source code
import ctypes
import sys
from pywinauto.application import Application

def is_admin():
    try:
        return ctypes.windll.shell32.IsUserAnAdmin()
    except:
        return False

if is_admin():
    app = Application(backend='uia').start("C:\\Program Files (x86)\\Advantech\\AdamApax.NET Utility\\Program\\AdamNET.exe")
    win = app['Advantech Adam/Apax .NET Utility (Win32) Version 2.05.11 (B19)']
    win.wait('ready')
    win.menu_select("Setup->Refresh Serial and Ethernet")
    win.top_window().print_control_identifiers(filename="file.txt")
    # win.top_window().OKButton.click_input() ---------This is what I hope to do
else:
    # Relaunch this script with elevated rights
    ctypes.windll.shell32.ShellExecuteW(None, "runas", sys.executable, __file__, None, 1)
Problem Statement
I had to run this application with elevation rights. The above is my code. The problem is I can't identify the window (view in output image) that pops up after selection from menu. I need to close the window. Please excuse the line
win.top_window().print_control_identifiers(filename="file.txt")
It was meant to write the identifiers into a text file, because the structure of this code does not display the output for me to view. However, since nothing is appended, I guess pywinauto couldn't identify the dialog.
For a clearer understanding, please view the image (input) of when it selects the menu.
Input
Now, it pops up with this dialog (output)
Output
I've also used spy to identify the caption and it gives:
(Handle: 004E07D4,
Caption: Information,
Class: #32770(Dialog),
Style: 94C801C5)
Other things I've tried:
Besides using win.topwindow() to identify the dialog, I've used
win[Information].OKButton.click_input()
win[Information].OK.click_input()
win[Information].OK.close()
win[Information].OK.kill(soft=false)
win.Information.OKButton.click_input()
win.Information.OK.click_input()
win.Information.OK.close()
win.Information.OK.kill(soft=false)
app[Information] ...... curious if I could discover the new window from original application
I've also sent keys like enter, space, esc and alt-f4 to close the dialog, using libraries like keyboard, pynput and ctypes. It still doesn't work.
Link to download the same application: http://downloadt.advantech.com/download/downloadsr.aspx?File_Id=1-1NHAMZX
Any help would be greatly appreciated !
I finally found a thread that demonstrates how multithreading can solve this issue. I tried it myself and it works. My version is a little different, as a few parts of that code have been deprecated. Here is the link to the solution:
How to stop a warning dialog from halting execution of a Python program that's controlling it?
Here are the edits I made to solve the problem:
def is_admin():
    try:
        return ctypes.windll.shell32.IsUserAnAdmin()
    except:
        return False

if is_admin():
    # Background thread that closes every window titled `window_name`
    class ClearPopupThread(threading.Thread):
        def __init__(self, window_name, quit_event):
            threading.Thread.__init__(self)
            self.quit_event = quit_event
            self.window_name = window_name

        def run(self):
            while True:
                try:
                    handles = windows.find_windows(title=self.window_name)
                except windows.WindowNotFoundError:
                    pass
                else:
                    for hwnd in handles:
                        app = Application()
                        app.connect(handle=hwnd)
                        popup = app[self.window_name]
                        popup.close()
                if self.quit_event.is_set():
                    break
                time.sleep(1)

    quit_event = threading.Event()
    mythread = ClearPopupThread('Information', quit_event)
    mythread.start()
    application = Application(backend="uia").start("C:\\Program Files (x86)\\Advantech\\AdamApax.NET Utility\\Program\\AdamNET.exe")
    time.sleep(2)
    win = application['Advantech Adam/Apax .NET Utility (Win32) Version 2.05.11 (B19)']
    win.menu_select("Setup->Refresh Serial and Ethernet")
    quit_event.set()
else:
    ctypes.windll.shell32.ShellExecuteW(None, "runas", sys.executable, __file__, None, 1)
The best thing is that this solution works for every other dialog that halts the main script, and by adding more threads I can use it for different actions like clicking buttons or inserting values.
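The reusable part of this approach is a background thread that keeps performing some action until a quit event is set. A minimal sketch of just that pattern, without pywinauto, might look like the following. `PollingThread` and its `action` callback are hypothetical names; in the real script the action would be the code that finds and closes the dialog.

```python
import threading
import time


class PollingThread(threading.Thread):
    """Call `action` repeatedly until `quit_event` is set."""

    def __init__(self, action, quit_event, interval=0.05):
        super().__init__(daemon=True)
        self.action = action
        self.quit_event = quit_event
        self.interval = interval

    def run(self):
        while not self.quit_event.is_set():
            self.action()
            # wait() returns early once the event is set, so shutdown is prompt
            self.quit_event.wait(self.interval)


closed = []  # records each time the stand-in action runs
quit_event = threading.Event()
watcher = PollingThread(lambda: closed.append("Information"), quit_event)
watcher.start()
time.sleep(0.2)  # the main script would drive the application here
quit_event.set()
watcher.join()
print(len(closed), "polls before quitting")
```

Using `quit_event.wait(interval)` instead of `time.sleep` is a small design choice: the thread stops as soon as the event is set rather than finishing a full sleep first.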