I am having some trouble with a GUI that is freezing during a file save operation that is taking some time, and I'd love to understand why that is.
I've followed the instructions of Schollii's wonderful answer on a similar question, but there must be something I'm missing because I cannot get the GUI behaving as I expect.
The below example is not runnable, since it shows only the relevant parts, but hopefully it's enough to get a discussion going. Basically I have a main application class that generates some large data, and I need to save it to HDF5 format, but this takes some time. To leave the GUI responsive, the main class creates an object of the Saver class and a QThread to do the actual data saving (using moveToThread).
The output of this code is pretty much what I would expect (i.e. I see a message that the "saving thread" has a different thread id than the "main" thread) so I know that another thread is being created. The data is successfully saved, too, so that part is working correctly.
During the actual data saving however (which can take some minutes), the GUI freezes up and goes "Not responding" on Windows. Any clues as to what is going wrong?
Stdout during running:
outer thread "main" (#15108)
<__main__.Saver object at 0x0000027BEEFF3678> running SaveThread
Saving data from thread "saving_thread" (#13624)
Code sample:
from PyQt5 import QtCore, QtGui, QtWidgets
from PyQt5.QtCore import QThread, pyqtSignal, pyqtSlot, QObject
class MyApp(QtWidgets.QMainWindow, MyAppDesign.Ui_MainWindow):
def save_file(self):
self.save_name, _ = QtWidgets.\
QFileDialog.getSaveFileName(self)
QThread.currentThread().setObjectName('main')
outer_thread_name = QThread.currentThread().objectName()
outer_thread_id = int(QThread.currentThreadId())
# print debug info about main app thread:
print('outer thread "{}" (#{})'.format(outer_thread_name,
outer_thread_id))
# Create worker and thread to save the data
self.saver = Saver(self.data,
self.save_name,
self.compressionSlider.value())
self.save_thread = QThread()
self.save_thread.setObjectName('saving_thread')
self.saver.moveToThread(self.save_thread)
# Connect signals
self.saver.sig_done.connect(self.on_saver_done)
self.saver.sig_msg.connect(print)
self.save_thread.started.connect(self.saver.save_data)
self.save_thread.start())
#pyqtSlot(str)
def on_saver_done(self, filename):
print('Finished saving {}'.format(filename))
''' End Class '''
class Saver(QObject):
sig_done = pyqtSignal(str) # worker id: emitted at end of work()
sig_msg = pyqtSignal(str) # message to be shown to user
def __init__(self, data_to_save, filename, compression_level):
super().__init__()
self.data = data_to_save
self.filename = filename
self.compression_level = compression_level
#pyqtSlot()
def save_data(self):
thread_name = QThread.currentThread().objectName()
thread_id = int(QThread.currentThreadId())
self.sig_msg.emit('Saving data '
'from thread "{}" (#{})'.format(thread_name,
thread_id))
print(self, "running SaveThread")
h5f = h5py.File(self.filename, 'w')
h5f.create_dataset('data',
data=self.data,
compression='gzip',
compression_opts=self.compression_level)
h5f.close()
self.sig_done.emit(self.filename)
''' End Class '''
There are actually two issues here: (1) Qt's signals and slots mechanisms, and (2) h5py.
First, the signals/slots. These actually work by copying arguments passed to the signal, to avoid any race conditions. (This is just one of the reasons you see so many signals with pointer arguments in the Qt C++ code: copying a pointer is cheap.) Because you're generating the data in the main thread, it must be copied in the main thread's event loop. The data is obviously big enough for this to take some time, blocking the event loop from handling GUI events. If you instead (for testing purposes) generate the data inside the Saver.save_data() slot, the GUI remains responsive.
However, you'll now notice a small lag after the first "Saving data from thread..." message is printed, indicating that the main event loop is blocked during the actual save. This is where h5py comes in.
You're presumably importing h5py at the top of your file, which is the "correct" thing to do. I noticed that if you instead import h5py directly before you create the file, this goes away. My best guess is that the global interpreter lock is involved, as the h5py code is visible from both the main and saving threads. I would have expected that the main thread would be entirely inside Qt module code at this point, however, over which the GIL has no control. So, like I said, I'm not sure what causes the blocking here.
As far as solutions, to the extent you can do what I described here, that will alleviate the problem. Generating the data outside the main thread, if possible, is advisable. It may also be possible to pass some memoryview object, or a numpy.view object, to the saving thread, though you'll then have to deal with thread-synchronization yourself. Also, importing h5py inside the Saver.save_data() slot will help, but isn't feasible if you need the module elsewhere in the code.
Hope this helps!
Related
I have a PySide2 application that runs mostly well. However, from time to time, it exits itself without any error (stdout, stderr are empty, even when run from terminal - tried everything) while a worker thread sends updates to the GUI thread via signals.
To emit these signals however, I'm doing it in a way such that my libraries can just take callbacks and not even know they are actually emitting signals, and I was wondering whether this could be a potential source of crashes:
Gui code:
from mylib import do_threaded_work
MyClass(QObject):
sig = Signal(int)
# ... proper initialization and connection of self.sig to a QProgressBar setValue slot
def on_some_button_pressed(self):
threading.Thread(target=do_threaded_work, args=(self.sig.emit,), daemon=True).start()
mylib.py dummy code:
do_threaded_work(on_progress):
i = 0
while (True):
# ... do some work
on_progress(i)
i = min(100, i + 1)
As you see, I'm directly passing the emit function of my signal instance to the library, that calls it and thus should emit the signal. Is it OK to do that?
Sometimes, I pass the signal object self.sig as argument instead of self.sig.emit, then call argument.emit(...) in the library code. I assume it has the same effect?
The reason I ask is because I didn't find any counter argument stating not to do this in PySide2, and the official documentation on signal and slots is quite light here.
You guys have any input on this? Thanks!
I'm trying to create a chess game using tkinter. I don't have a huge experience in python programming, but I kind of find weird the philosophy of tkinter : if my assumptions are correct, it seems to me that using tkinter means setting it as the base of the project, and everything has to work around it. And what I mean by that is that using whatever code that is not 'wrapped' in the tkinter framework is a pain to deal with (you have to use the event system, you have to use the after method if you want to perform an action after starting the main loop, etc.)
I have a rather different view on that, and in my chess project I simply consider the tkinter display as a part of my rendering system, and the event system provided by tkinter as a part of my input parser system. That being said, I want to be able to easily change the renderer or the input parser, which means that I could want to detect input from the terminal (for instance by writing D2 D3) instead of moving the objects on the screen. I could also want to print the chessboard on the terminal instead of having a GUI.
More to the point, because tkinter blocks the thread through the mainloop method instead of looping in another thread, I have to put my Tk object in a different thread, so that I can run the rest of my program in parallel. And I'm having a tough time doing it, because my Tk variable contained by my thread needs to be accessed by my program, to update it for instance.
After quite a bit of research, I found that queues in python were synchronized, which means that if I put my Tk object in a queue, I could access it without any problem from the main thread. I tried to see if the following code was working :
import threading, queue
class VariableContainer(threading.Thread):
def __init__(self):
threading.Thread.__init__(self)
self.queue = queue.Queue()
def run(self):
self.queue.put("test")
container = VariableContainer()
container.start()
print(container.queue.get(False))
and it does ! The output is test.
However, if I replace my test string by a Tk object, like below :
import threading, queue
import tkinter
class VariableContainer(threading.Thread):
def __init__(self):
threading.Thread.__init__(self)
self.queue = queue.Queue()
def run(self):
root = tkinter.Tk()
self.queue.put(root)
root.mainloop() # whether I call the mainloop or not doesn't change anything
container = VariableContainer()
container.start()
print(container.queue.get(False))
then the print throws an error, stating that the queue is empty.
(Note that the code above is not the code of my program, it is just an exemple since posting sample codes from my project might be less clear)
Why?
The answer to the trivial question you actually asked: you have a race condition because you call Queue.get(block=False). Tk taking a lot longer to initialize, the main thread almost always wins and finds the queue still empty.
The real question is “How do I isolate my logic from the structure of my interface library?”. (While I understand the desire for a simple branch point between “read from the keyboard” and “wait for a mouse event”, it is considered more composable, in the face of large numbers of event types, sources, and handlers, to have one event dispatcher provided by the implementation. It can sometimes be driven one event at a time, but that composes less well with other logic than one might think.)
The usual answer to that is to make your logic a state machine rather than an algorithm. Mechanically, this means replacing local variables with attributes on an object and dividing the code into methods on its class (e.g., one call per “read from keyboard” in a monolithic implementation). Sometimes language features like coroutines can be used to make this transformation mostly transparent/automatic, but they’re not always a good fit. For example:
def algorithm(n):
tot=0
for i in range(n):
s=int(input("#%s:"%i))
tot+=(i+1)*(n-i)*s
print(tot)
class FSM(object):
def __init__(self,n):
self.n=n
self.i=self.tot=0
def send(self,s):
self.tot+=(self.i+1)*(self.n-self.i)*s
self.i+=1
def count(self): return self.i
def done(self): return self.i>=self.n
def get(self):
return self.tot
def coroutine(n): # old-style, not "async def"
tot=0
for i in range(n):
s=(yield)
tot+=(i+1)*(n-i)*s
yield tot
Having done this, it’s trivial to layer the traditional stream-driven I/O back on top, or to connect it to an event-driven system (be it a GUI or asyncio). For example:
def fsmClient(n):
fsm=FSM(n)
while not fsm.done():
fsm.send(int(input("#%s:"%fsm.count())))
return fsm.get()
def coClient(n):
co=coroutine(n)
first=True
while True:
ret=co.send(None if first else
int(input("#%s:"%fsm.count())))
if ret is not None:
co.close()
return ret
first=False
These clients can work with any state machine/coroutine using the same interface; it should be obvious how to instead supply values from a Tkinter input box or so.
I'll do my best to explain this issue in a clear way, it's come up as part of a much larger piece of software I'm developing for an A level project - a project that aims to create a simple version of a graphical programming system (think scratch made by monkeys with about 7 commands).
My trouble currently stems from the need to have an execution function running on a unique thread that is capable of interacting with a user interface that shows the results of executing the code blocks made by the user (written using the Tkinter libraries) on the main thread. This function is designed to go through a dynamic list that contains information on the user's "code" in a form that can be looped through and dealt with "line by line".
The issue occurs when the execution begins, and the threaded function attempts to call a function that is part of the user interface class. I have limited understanding of multi threading, so it's all too likely that I am breaking some important rules and doing things in ways that don't make sense, and help here would be great.
I have achieved close to the functionality I am after previously, but always with some errors coming up in different ways (mostly due to my original attempts opening a tkinter window in a second thread... a bad idea).
As far as I'm aware my current code works in terms of opening a second thread, opening the UI in the main thread, and beginning to run the execution function in the second thread. In order to explain this issue, I have created a small piece of code that works on the same basis, and produces the same "none type" error, I would use the original code, but it's bulky, and a lot more annoying to follow than below:
from tkinter import *
import threading
#Represents what would be my main code
class MainClass():
#attributes for instances of each of the other classes
outputUI = None
threadingObject = None
#attempt to open second thread and the output ui
def beginExecute(self):
self.threadingObject = ThreadingClass()
self.outputUI = OutputUI()
#called by function of the threaded class, attempts to refer to instance
#of "outputUI" created in the "begin execute" function
def execute(self):
return self.outputUI.functionThatReturns()
#class for the output ui - just a blank box
class OutputUI():
#constructor to make a window
def __init__(self):
root = Tk()
root.title = ("Window in main thread")
root.mainloop()
#function to return a string when called
def functionThatReturns(self):
return("I'm a real object, look I exist! Maybe")
#inherits from threading library, contains threading... (this is where my
#understanding gets more patchy)
class ThreadingClass(threading.Thread):
#constructor - create new thread, run the thread...
def __init__(self):
threading.Thread.__init__(self)
self.start()
#auto called by self.start() ^ (as far as I'm aware)
def run(self):
#attempt to run the main classes "execute" function
print(mainClass.execute())
#create instance of the main class, then attempt execution of some
#threading
mainClass = MainClass()
mainClass.beginExecute()
When this code is run, it produces the following result:
Exception in thread Thread-1:
Traceback (most recent call last):
File "C:\Python34\lib\threading.py", line 920, in _bootstrap_inner
self.run()
File "H:/Programs/Python/more more threading tests.py", line 33, in run
print(mainClass.execute())
File "H:/Programs/Python/more more threading tests.py", line 14, in execute
return self.outputUI.functionThatReturns()
AttributeError: 'NoneType' object has no attribute 'functionThatReturns'
I guess it should be noted that the tkinter window opens as I hoped, and the threading class does what it's supposed to, but does not appear to be aware of the existence of the output UI. I assume this is due to some part of object orientation and threading of which I am woefully under-informed.
So, is there a way in which I can call the function in the output ui from the threaded function? Or is there a work around to something similar?
It should be noted that I didn't put the creation of the output window in the init function of the main class, as I need to be able to create the output window and start threading etc as a result of another input.
Sorry if this doesn't make sense, shout at me and I'll try and fix it, but help would be greatly appreciated, cheers.
The problem here is your order of operations. You create a ThreadingClass in MainClass.beginExecute(), which calls self.start(). That calls run which ultimately tries to call a function on main.outputUI. But main.outputUI hasn't been initialized yet (that's the next line in beginExecute()). You could probably make it work just by reordering those two lines. But you probably don't want to be calling self.start() in the thread's __init__. That seems like poor form.
I have an application that has a GUI thread and many different worker threads. In this application, I have a functions.py module, which contains a lot of different "utility" functions that are used all over the application.
Yesterday the application has been released and some users (a minority, but still) has reported problems with the application crashing. I looked over my code and noticed a possible design flaw, and would like to check with the lovely people of SO and see if I am right and if this is indeed a flaw.
Suppose I have this defined in my functions.py module:
class Functions:
solveComputationSignal = Signal(str)
updateStatusSignal = Signal(int, str)
text = None
#classmethod
def setResultText(self, text):
self.text = text
#classmethod
def solveComputation(cls, platform, computation, param=None):
#Not the entirety of the method is listed here
result = urllib.urlopen(COMPUTATION_URL).read()
if param is None:
cls.solveComputationSignal.emit(result)
else:
cls.solveAlternateComputation(platform, computation)
while not self.text:
time.sleep(3)
return self.text if self.text else False
#classmethod
def updateCurrentStatus(cls, platform, statusText):
cls.updateStatusSignal.emit(platform, statusText)
I think these methods in themselves are fine. The two signals defined here are connected to in the GUI thread. The first signal pops-up a dialog in which the computation is presented. The GUI thread calls the setResultText() method and sets the resulting string as entered by the user (if anyone knows of a better way to wait until the user has inputted the text other than sleeping and waiting for self.text to become True, please let me know). The solveAlternateComputation is another method in the same class that solves the computation automatically, however, it too calls the setResultText() method that sets the resulting text.
The second signal updates the statusBar text of the main GUI as well.
What's worse is that I think the above design, while perhaps flawed, is not the problem.
The problem lies, I believe, in the way I call these methods, whihch is from the worker threads (note that I have multiple similar workers, all of which are different "platforms")
Assume I have this (and I do):
class WorkerPlatform1(QThread):
#Init and other methods are here
def run(self):
#Thread does its job here, but then when it needs to present the
#computation, instead of emitting a signal, this is what I do
self.f = functions.Functions
result = self.f.solveComputation(platform, computation)
if result:
#Go on with the task
else:
self.f.updateCurrentStatus(platform, "Error grabbing computation!")
In this case I think that my flaw is that the thread itself is not emitting any signals, but rather calling callables residing outside of that thread directly. Am I right in thinking that this could cause my application to crash? Although the faulty module is reported as QtGui4.dll
One more thing: both of these methods in the Functions class are accessed by many threads almost simultaneously. Is this even advisable - have methods residing outside of a thread be accessed by many threads all at the same time? Can it so happen that I "confuse" my program? The reason I am asking is because people who say that the application is not crashing report that, very often, the solveComputation() returns the incorrect text - not all the time, but very often. Since that COMPUTATION_URL's server can take some time to respond (even 10+ seconds), is it possible that, once a thread calls that method, while the urllib library is still waiting for server response, in that time another thread can call it, causing it to use a different COMPUTATION_URL, which will result in it returning an incorrect value on some cases?
Finally, I am thinking of solutions: for my first (crashing) problem, do you think the proper solution would be to directly emit a Signal from the thread itself, and then connect it in the GUI thread? Is that the right way to go about it?
Secondly, for the solveComputation returning incorrect values, would I solve it by moving that method (and accompanying methods) to every Worker class? then I could call them directly and hopefully have the correct response - or, dozens of different responses (since I have that many threads) - for every thread?
Thank you all and I apologize for the wall of text.
EDIT: I would like to add that when running in console with some users, this error appears QObject: Cannot create children for a parent that is in a different thread.
(Parent is QLabel(0x4795500), parent's thread is QThread(0x2d3fd90), current thread is WordpressCreator(0x49f0548)
Your design is flawed if you really are using your Functions class like this with classmethods storing results on class attributes, being shared amongst multiple workers. It should be using all instance methods, and each thread should be using an instance of this class:
class Functions(QObject):
solveComputationSignal = pyqtSignal(str)
updateStatusSignal = pyqtSignal(int, str)
def __init__(self, parent=None):
super(Functions, self).__init__(parent)
self.text = ""
def setResultText(self, text):
self.text = text
def solveComputation(self, platform, computation, param=None):
result = urllib.urlopen(COMPUTATION_URL).read()
if param is None:
self.solveComputationSignal.emit(result)
else:
self.solveAlternateComputation(platform, computation)
while not self.text:
time.sleep(3)
return self.text if self.text else False
def updateCurrentStatus(self, platform, statusText):
self.updateStatusSignal.emit(platform, statusText)
# worker_A
def run(self):
...
f = Functions()
# worker_B
def run(self):
...
f = Functions()
Also, for doing your urlopen, instead of doing sleeps to check for when it is ready, you can make use of the QNetworkAccessManager to make your requests and use signals to be notified when results are ready.
I would like to ask how to read a big file from disk and maintain the PyQt4 UI responsive (not blocked). I had moved the load of the file to a QThread subclass but my GUI thread get freezed. Any suggestions? I think it must be something with the GIL but I don't know how to sort it?
EDIT:
I am using vtkGDCMImageReader from the GDCM project to read a multiframe DICOM image and display it with vtk and pyqt4. I do this load in a different thread (QThread) but my app freeze until the image is loaded. here is an example code:
class ReadThread(QThread):
def __init__(self, file_name):
super(ReadThread, self).__init__(self)
self.file_name = file_name
self.reader.vtkgdcm.vtkGDCMImageReader()
def run(self):
self.reader.SetFileName(self.file_name)
self.reader.Update()
self.emit(QtCore.SIGNAL('image_loaded'), self.reader.GetOutput())
I'm guessing you're directly calling the run to begin the thread. That would make the GUI freeze because you're not activating the thread.
So you'd be missing the start there, which would indirectly and properly call the run:
thread = ReadThread()
thread.begin()
class ReadThread(QThread):
def __init__(self, file_name):
super(ReadThread, self).__init__(self)
self.file_name = file_name
self.reader.vtkgdcm.vtkGDCMImageReader()
def run(self):
self.reader.SetFileName(self.file_name)
self.reader.Update()
self.emit(QtCore.SIGNAL('image_loaded'), self.reader.GetOutput())
def begin(self):
self.start()
A bit late, but I think I can pinpoint the problem you face.
The image is large, and the unpacking is probably a CPU intensive task. This means that your GUI thread goes to sleep and the loading thread is CPU bound. At that point the loading thread has the GIL, and the GUI cannot start.
Even if you can get into the loading thread, and introduce a sleep(0), to alow the GUI to continue, this will not help on a mulit-core or multi-processor machine. What happens is the O/S has two threads and thinks it can run both. The loading thread is set up on (say) core 1 and the GUI can be loaded and run on (say) core 2. So after initiating the load and start of the GUI thread on core 2, the O/S resumes the loading thread on core 1 - which promtply grabs the GIL. Moments later the GUI thread is ready to start, and attempts to aquire the GIL, which fails. All it can do without the GIL is go back to sleep!
One solution is to insert a short (greater than zero) sleep in the background thread at strategic intervals, so that the GUI can run. This is not always possible.
Take a look at threading-in-a-pyqt-application-use-qt-threads-or-python-threads
Maybe create your reader object in the tread:
class ReadThread(QThread):
def __init__(self, file_name):
super(ReadThread, self).__init__(self)
self.file_name = file_name
- self.reader = vtkgdcm.vtkGDCMImageReader()
def run(self):
+ self.reader = vtkgdcm.vtkGDCMImageReader()
self.reader.SetFileName(self.file_name)
self.reader.Update()
self.emit(QtCore.SIGNAL('image_loaded'), self.reader.GetOutput())