Block main thread until python background thread finishes side-task - python

I have a threaded python application with a long-running mainloop in the background thread. This background mainloop is actually a call to pyglet.app.run(), which drives a GUI window and also can be configured to call other code periodically. I need a do_stuff(duration) function to be called at will from the main thread to trigger an animation in the GUI, wait for the animation to stop, and then return. The actual animation must be done in the background thread because the GUI library can't handle being driven by separate threads.
I believe I need to do something like this:
import threading
class StuffDoer(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
        self.max_n_times = 0
        self.total_n_times = 0
        self.paused_ev = threading.Event()
    def run(self):
        # this part is outside of my control
        while True:
            self._do_stuff()
            # do other stuff
    def _do_stuff(self):
        # this part is under my control
        if self.paused_ev.is_set():
            if self.max_n_times > self.total_n_times:
                self.paused_ev.clear()
        else:
            if self.total_n_times >= self.max_n_times:
                self.paused_ev.set()
        if not self.paused_ev.is_set():
            # do stuff that must execute in the background thread
            self.total_n_times += 1
sd = StuffDoer()
sd.start()
def do_stuff(n_times):
    sd.max_n_times += n_times
    sd.paused_ev.wait_for_clear()  # wait_for_clear() does not exist
    sd.paused_ev.wait()
    assert (sd.total_n_times == sd.max_n_times)
EDIT: use max_n_times instead of stop_time to clarify why Thread.join(duration) won't do the trick.
From the documentation for threading.Event:
wait([timeout])
Block until the internal flag is true.
If the internal flag is true on entry,
return immediately. Otherwise, block
until another thread calls set() to
set the flag to true, or until the
optional timeout occurs.
I've found I can get the behavior I'm looking for if I have a pair of events, paused_ev and not_paused_ev, and use not_paused_ev.wait(). I could almost just use Thread.join(duration), except it needs to only return precisely when the background thread actually registers that the time is up. Is there some other synchronization object or other strategy I should be using instead?
I'd also be open to arguments that I'm approaching this whole thing the wrong way, provided they're good arguments.
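One way to avoid needing something like wait_for_clear() is to let the requesting thread clear the event itself, under a lock, so that the background thread only ever sets it. The following is a minimal sketch under stated assumptions: the run() loop is a stand-in for pyglet.app.run(), and the names idle_ev and _lock, as well as the 1 ms sleep, are illustrative and not part of the original code.

```python
import threading
import time

class StuffDoer(threading.Thread):
    def __init__(self):
        super().__init__(daemon=True)
        self._lock = threading.Lock()      # hypothetical: guards the two counters
        self.max_n_times = 0
        self.total_n_times = 0
        self.idle_ev = threading.Event()   # hypothetical: set while nothing is pending
        self.idle_ev.set()

    def run(self):
        # Stand-in for the GUI main loop, which calls _do_stuff periodically.
        while True:
            self._do_stuff()
            time.sleep(0.001)

    def _do_stuff(self):
        # Never blocks for long, so it is safe to call from the GUI loop.
        with self._lock:
            if self.total_n_times < self.max_n_times:
                self.total_n_times += 1            # one animation step
                if self.total_n_times == self.max_n_times:
                    self.idle_ev.set()             # the request is finished

sd = StuffDoer()
sd.start()

def do_stuff(n_times):
    # Called from the main thread; returns once n_times steps have run.
    if n_times <= 0:
        return
    with sd._lock:
        sd.max_n_times += n_times
        sd.idle_ev.clear()     # cleared by the requester, never by the worker
    sd.idle_ev.wait()

do_stuff(3)
do_stuff(2)
print(sd.total_n_times, sd.max_n_times)
```

Because the event is cleared and the counter is raised under the same lock, the worker cannot set the event between those two steps, so wait() returns only after the requested steps have been counted.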

Hoping I get some revision or additional info from my comment, but I'm kind of wondering if you're not overworking things by subclassing Thread. You can do things like this:
from threading import Thread
class MyWorker(object):
    def __init__(self):
        t = Thread(target=self._do_work, name="Worker Owned Thread")
        t.daemon = True
        t.start()
    def _do_work(self):
        while True:
            # Something going on here, forever if necessary. This thread
            # will go away if the other non-daemon threads terminate, possibly
            # raising an exception depending on this function's body.
            pass
I find this makes more sense when the method you want to run is more appropriately a member function of some other class than the run method of a thread. Additionally, this saves you from having to encapsulate a bunch of business logic inside of a Thread. All IMO, of course.

It appears that your GUI animation thread is using a spin-lock in its while True loop. This can be prevented using thread-safe queues. Based on my reading of your question, this approach would be functionally equivalent and efficient.
I'm omitting some details of your code above which would not change. I'm also assuming here that the run() method which you do not control uses the self.stop_time value to do its work; otherwise there is no need for a threadsafe queue.
import time
from Queue import Queue  # Python 2; in Python 3 the module is named queue
from threading import Event
class StuffDoer:
    def __init__(self, inq, ready):
        self.inq = inq
        self.ready = ready
    def _do_stuff(self):
        self.ready.set()
        self.stop_time = self.inq.get()
GUIqueue = Queue()
control = Event()
sd = StuffDoer(GUIqueue, control)
def do_stuff(duration):
    control.clear()
    GUIqueue.put(time.time() + duration)
    control.wait()

I ended up using a Queue similar to what @wberry suggested, and making use of Queue.task_done and Queue.join:
import Queue
import threading
class StuffDoer(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
        self.setDaemon(True)
        self.max_n_times = 0
        self.total_n_times = 0
        self.do_queue = Queue.Queue()
    def run(self):
        # this part is outside of my control
        while True:
            self._do_stuff()
            # do other stuff
    def _do_stuff(self):
        # this part is under my control
        if self.total_n_times >= self.max_n_times:
            try:
                self.max_n_times += self.do_queue.get(block=False)
            except Queue.Empty:
                pass
        if self.max_n_times > self.total_n_times:
            # do stuff that must execute in the background thread
            self.total_n_times += 1
            if self.total_n_times >= self.max_n_times:
                self.do_queue.task_done()
sd = StuffDoer()
sd.start()
def do_stuff(n_times):
    sd.do_queue.put(n_times)
    sd.do_queue.join()
    assert (sd.total_n_times == sd.max_n_times)
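The put / task_done / join handshake used above can be seen in isolation in a smaller example. This is only a sketch, assuming Python 3 (where the module is named queue) and a worker that is allowed to block on get(), which the GUI loop in the question is not; the names jobs and results are made up for illustration.

```python
import queue
import threading

jobs = queue.Queue()
results = []

def worker():
    while True:
        n_times = jobs.get()              # blocks until a request arrives
        try:
            for _ in range(n_times):
                results.append("step")    # stand-in for the real work
        finally:
            jobs.task_done()              # exactly one task_done() per get()

threading.Thread(target=worker, daemon=True).start()

def do_stuff(n_times):
    jobs.put(n_times)
    jobs.join()    # returns once every put() has been matched by a task_done()

do_stuff(3)
do_stuff(2)
print(len(results))
```

The important detail is that task_done() is called once per item taken from the queue; calling it more often raises ValueError, and calling it less often leaves join() blocked.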

I made a solution based on @g.d.d.c's advice for this question. Here is my code:
import threading
import time
threads = []
# initializing aux thread(s) in the main thread ...
t = threading.Thread(target=ThreadF, args=(...))
#t.setDaemon(True) # I'm not sure whether this is really needed
t.start()
threads.append(t.ident)
# Block main thread
# any() is used instead of filter(): in Python 3, filter() returns an iterator, which is always truthy
while any(thread.ident in threads for thread in threading.enumerate()):
    time.sleep(10)
Also, you can use Thread.join to block the main thread; that is a better way.
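For comparison, here is roughly what the Thread.join variant looks like. It is a sketch with a made-up thread_f function standing in for ThreadF, and short sleeps standing in for real work.

```python
import threading
import time

finished = []

def thread_f(name, delay):
    # Hypothetical stand-in for ThreadF.
    time.sleep(delay)
    finished.append(name)

threads = [
    threading.Thread(target=thread_f, args=("worker-%d" % i, 0.05))
    for i in range(3)
]
for t in threads:
    t.start()
for t in threads:
    t.join()    # blocks the main thread until this worker has finished

print(sorted(finished))
```

Unlike the polling loop, join() returns as soon as the thread ends, so there is no 10 second granularity.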

Related

tkinter, master window does not loop after I start a thread [duplicate]

My interface is freezing when I press the button. I am using threading, but I am not sure why it is still hanging. Any help will be appreciated. Thanks in advance.
import queue
import time
import tkinter
from threading import Thread
class magic:
    def __init__(self):
        self.mainQueue = queue.Queue()
    def addItem(self, q):
        self.mainQueue.put(q)
    def startConverting(self, funcName):
        if funcName == "test":
            while not self.mainQueue.empty():
                t = Thread(target=self.threaded_function)
                t.start()
                t.join()
    def threaded_function(self):
        time.sleep(5)
        print(self.mainQueue.get())
m = magic()
def helloCallBack():
    m.addItem("asd")
    m.startConverting("test")  # this line of code is freezing
top = tkinter.Tk()
B = tkinter.Button(top, text="Hello", command=helloCallBack)
B.pack()
top.mainloop()
Here's a recipe for doing an asynchronous task with a tkinter-based GUI. I adapted it from a recipe in the cited book. You should be able to modify it to do what you need.
Keeping the GUI responsive requires not interfering with its mainloop(), for example by join()ing a background thread, which makes the GUI "hang" until the thread is finished. Instead, the recipe uses the universal after() widget method to poll a Queue at regular intervals.
# from "Python Cookbook 2nd Edition", section 11.9, page 439.
# Modified to work in Python 2 & 3.
from __future__ import print_function
try:
    import Tkinter as tk, time, threading, random, Queue as queue
except ModuleNotFoundError:  # Python 3
    import tkinter as tk, time, threading, random, queue
class GuiPart(object):
    def __init__(self, master, queue, end_command):
        self.queue = queue
        # Set up the GUI
        tk.Button(master, text='Done', command=end_command).pack()
        # Add more GUI stuff here depending on your specific needs
    def processIncoming(self):
        """ Handle all messages currently in the queue, if any. """
        while self.queue.qsize():
            try:
                msg = self.queue.get_nowait()
                # Check contents of message and do whatever is needed. As a
                # simple example, let's print it (in real life, you would
                # suitably update the GUI's display in a richer fashion).
                print(msg)
            except queue.Empty:
                # just on general principles, although we don't expect this
                # branch to be taken in this case, ignore this exception!
                pass
class ThreadedClient(object):
    """
    Launch the main part of the GUI and the worker thread. periodic_call()
    and end_application() could reside in the GUI part, but putting them
    here means that you have all the thread controls in a single place.
    """
    def __init__(self, master):
        """
        Start the GUI and the asynchronous threads. We are in the main
        (original) thread of the application, which will later be used by
        the GUI as well. We spawn a new thread for the worker (I/O).
        """
        self.master = master
        # Create the queue
        self.queue = queue.Queue()
        # Set up the GUI part
        self.gui = GuiPart(master, self.queue, self.end_application)
        # Set up the thread to do asynchronous I/O
        # More threads can also be created and used, if necessary
        self.running = True
        self.thread1 = threading.Thread(target=self.worker_thread1)
        self.thread1.start()
        # Start the periodic call in the GUI to check the queue
        self.periodic_call()
    def periodic_call(self):
        """ Check every 200 ms if there is something new in the queue. """
        self.master.after(200, self.periodic_call)
        self.gui.processIncoming()
        if not self.running:
            # This is the brutal stop of the system. You may want to do
            # some cleanup before actually shutting it down.
            import sys
            sys.exit(1)
    def worker_thread1(self):
        """
        This is where we handle the asynchronous I/O. For example, it may be
        a 'select()'. One important thing to remember is that the thread has
        to yield control pretty regularly, be it by select or otherwise.
        """
        while self.running:
            # To simulate asynchronous I/O, create a random number at random
            # intervals. Replace the following two lines with the real thing.
            time.sleep(rand.random() * 1.5)
            msg = rand.random()
            self.queue.put(msg)
    def end_application(self):
        self.running = False  # Stops worker_thread1 (invoked by "Done" button).
rand = random.Random()
root = tk.Tk()
client = ThreadedClient(root)
root.mainloop()
For anyone having a problem with sys.exit(1) in @martineau's code: if you replace sys.exit(1) with self.master.destroy(), the program ends gracefully. I lack the reputation to add a comment, hence the separate answer.

Python run a function on main thread from a background thread

Created a background thread this way
def listenReply(self):
    while self.SOCK_LISTENING:
        fromNodeRED = self.nodeRED_sock.recv(1024).decode()
        if fromNodeRED=="closeDoor":
            self.door_closed()
...
self.listenThread = Thread(target=self.listenReply, daemon=True)
self.SOCK_LISTENING = True
self.listenThread.start()
But self.door_closed() has some UI stuffs so that's no good. How do I call self.door_closed in main thread instead?
One thing to keep in mind is that each thread is a sequential execution of a single flow of code, starting from the function the thread was started on.
It doesn't make much sense to simply run something on an existing thread, since that thread is already executing something, and doing so would disrupt its current flow.
However, it's quite easy to communicate between threads and it's possible to implement a thread's code such that it simply receives functions/events from other threads which tell it what to do. This is commonly known as an event loop.
For example your main thread could look something like this
from queue import Queue
tasks = Queue()
def event_loop():
    while True:
        next_task = tasks.get()
        print('Executing function {} on main thread'.format(next_task))
        next_task()
In your other threads you could ask the main thread to run a function by simply adding it to the tasks queue:
def listenReply(self):
    while self.SOCK_LISTENING:
        fromNodeRED = self.nodeRED_sock.recv(1024).decode()
        if fromNodeRED=="closeDoor":
            tasks.put(door_closed)
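Put together, the two halves might look like the following self-contained sketch. The socket is replaced by a fixed list of messages, and the STOP sentinel, the log list and the caller variable are additions for illustration (the loop above runs forever); none of them come from the original code.

```python
import threading
from queue import Queue

tasks = Queue()
STOP = object()                       # hypothetical sentinel that ends the loop
log = []
caller = threading.current_thread()   # the thread that will run the event loop

def door_closed():
    # Record whether we really are running on the event-loop thread.
    log.append(threading.current_thread() is caller)

def listen_reply():
    # Background thread: stand-in for the socket-reading loop.
    for message in ["closeDoor", "ignored", "closeDoor"]:
        if message == "closeDoor":
            tasks.put(door_closed)
    tasks.put(STOP)

def event_loop():
    while True:
        next_task = tasks.get()       # blocks until another thread posts work
        if next_task is STOP:
            break
        next_task()

threading.Thread(target=listen_reply, daemon=True).start()
event_loop()
print(log)
```

With a real GUI toolkit you would normally not write this loop yourself; the toolkit's own main loop plays that role and offers a way to post work to it.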
You can use threading.Event() and set it whenever you receive "closeDoor" from recv.
For example:
import threading
g_should_close_door = threading.Event()
def listenReply(self):
    while self.SOCK_LISTENING:
        fromNodeRED = self.nodeRED_sock.recv(1024).decode()
        if fromNodeRED=="closeDoor":
            g_should_close_door.set()
...
self.listenThread = Thread(target=self.listenReply, daemon=True)
self.SOCK_LISTENING = True
self.listenThread.start()
# checked periodically in the main thread:
if g_should_close_door.is_set():
    door_closed()
    g_should_close_door.clear()
Solved it myself using PyQt's signals and slots.
class App(QWidget):
    socketSignal = QtCore.pyqtSignal(object)  # must be defined at class level
    # BG THREAD
    def listenReply(self):
        while self.SOCK_LISTENING:
            fromNodeRED = self.nodeRED_sock.recv(1024).decode()
            print(fromNodeRED)
            self.socketSignal.emit(fromNodeRED)
    def __init__(self):
        # .... somewhere in init of Main thread:
        self.socketSignal.connect(self.executeOnMain)
        self.listenThread = Thread(target=self.listenReply, daemon=True)
        self.SOCK_LISTENING = True
        self.listenThread.start()
        # ....
    def executeOnMain(self, data):
        if data=="closeDoor":
            self.door_closed()  # a method that changes the UI
Works great for me.

Interrupting a thread in Python with a KeyboardException in the main thread

I have a few classes that look more or less like this:
import threading
import time
class Foo():
    def __init__(self, interval, callbacks):
        self.thread = threading.Thread(target=self.loop)
        self.interval = interval
        self.thread_stop = threading.Event()
        self.callbacks = callbacks
    def loop(self):
        while not self.thread_stop.is_set():
            #do some stuff...
            for callback in self.callbacks:
                callback()
            time.sleep(self.interval)
    def start(self):
        self.thread.start()
    def kill(self):
        self.thread_stop.set()
Which I am using from my main thread like this:
interval = someinterval
callbacks = [some callbacks]
f = Foo(interval, callbacks)
try:
    f.start()
except KeyboardInterrupt:
    f.kill()
    raise
I would like a KeyboardInterrupt to kill the thread after all the callbacks have been completed, but before the loop repeats. Currently they are ignored and I have to resort to killing the terminal process that the program is running in.
I saw the idea of using threading.Event from this post, but it appears like I'm doing it incorrectly, and it's making working on this project a pretty large hassle.
I don't know if it may be relevant, but the callbacks I'm passing access data from the Internet and make heavy use of the retrying decorator to deal with unreliable connections.
EDIT
After everyone's help, the loop now looks like this inside Foo:
def thread_loop(self):
    while not self.thread_stop.is_set():
        # do some stuff
        # call the callbacks
        self.thread_stop.wait(self.interval)
This is kind of a solution, although it isn't ideal. This code runs on PythonAnywhere, and the account is priced by CPU time. I'll have to see how much this uses over the course of a day with the constant waking and sleeping of threads, but it at least solves the main issue.
I think your problem is that you have a try-except-block around f.start(), but that returns immediately, so you aren't going to catch KeyboardInterrupts after the thread was started.
You could try adding a while-loop at the bottom of your program like this:
f.start()
try:
    while True:
        time.sleep(0.1)
except KeyboardInterrupt:
    f.kill()
    raise
This isn't exactly the most elegant solution, but it should work.
Thanks to @shx2 and @jazzpi for putting together the two separate pieces of the puzzle.
So the final code is:
import threading
import time
class Foo():
    def __init__(self, interval, callbacks):
        self.thread = threading.Thread(target=self.loop)
        self.interval = interval
        self.thread_stop = threading.Event()
        self.callbacks = callbacks
    def loop(self):
        while not self.thread_stop.is_set():
            #do some stuff...
            for callback in self.callbacks:
                callback()
            self.thread_stop.wait(self.interval)
    def start(self):
        self.thread.start()
    def kill(self):
        self.thread_stop.set()
And then in main
interval = someinterval
callbacks = [some, callbacks]
f = Foo(interval, callbacks)
f.start()
try:
    while True:
        time.sleep(0.1)
except KeyboardInterrupt:
    f.kill()
    raise
@jazzpi's answer correctly addresses the issue you're having in the main thread.
As to the sleep in thread's loop, you can simply replace the call to sleep with a call to self.thread_stop.wait(self.interval).
This way, your thread wakes up as soon as the stop event is set, or after waiting (i.e. sleeping) for self.interval seconds. (Event docs)
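A quick way to convince yourself of the early wake-up is a sketch like the one below, where the 60 second interval and the ticks list are chosen purely for illustration.

```python
import threading
import time

stop = threading.Event()
ticks = []

def loop(interval):
    while not stop.is_set():
        ticks.append(time.monotonic())   # stand-in for the callbacks
        stop.wait(interval)              # sleeps up to `interval`, returns early once set

t = threading.Thread(target=loop, args=(60.0,))
t.start()
time.sleep(0.05)

started = time.monotonic()
stop.set()                 # wakes the thread even though its interval is 60 s
t.join(timeout=5)
elapsed = time.monotonic() - started
print(t.is_alive(), elapsed < 5)
```

With time.sleep(60) in place of stop.wait(60), the join would time out and the thread would still be alive.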

How to stop a looping thread in Python?

What's the proper way to tell a looping thread to stop looping?
I have a fairly simple program that pings a specified host in a separate threading.Thread class. In this class it sleeps 60 seconds, then runs again until the application quits.
I'd like to implement a 'Stop' button in my wx.Frame to ask the looping thread to stop. It doesn't need to end the thread right away, it can just stop looping once it wakes up.
Here is my threading class (note: I haven't implemented looping yet, but it would likely fall under the run method in PingAssets)
class PingAssets(threading.Thread):
    def __init__(self, threadNum, asset, window):
        threading.Thread.__init__(self)
        self.threadNum = threadNum
        self.window = window
        self.asset = asset
    def run(self):
        config = controller.getConfig()
        fmt = config['timefmt']
        start_time = datetime.now().strftime(fmt)
        try:
            if onlinecheck.check_status(self.asset):
                status = "online"
            else:
                status = "offline"
        except socket.gaierror:
            status = "an invalid asset tag."
        msg =("{}: {} is {}. \n".format(start_time, self.asset, status))
        wx.CallAfter(self.window.Logger, msg)
And in my wxPython Frame I have this function called from a Start button:
def CheckAsset(self, asset):
    self.count += 1
    thread = PingAssets(self.count, asset, self)
    self.threads.append(thread)
    thread.start()
Threaded stoppable function
Instead of subclassing threading.Thread, one can modify the function to allow stopping by a flag.
We need an object, accessible to the running function, on which we can set the flag to stop running.
We can use the threading.currentThread() object.
import threading
import time
def doit(arg):
    t = threading.currentThread()
    while getattr(t, "do_run", True):
        print ("working on %s" % arg)
        time.sleep(1)
    print("Stopping as you wish.")
def main():
    t = threading.Thread(target=doit, args=("task",))
    t.start()
    time.sleep(5)
    t.do_run = False
if __name__ == "__main__":
    main()
The trick is that the running thread can have additional properties attached. The solution builds on two assumptions:
the thread has a property "do_run" with default value True
the driving parent thread can set the "do_run" property of the started thread to False.
Running the code, we get the following output:
$ python stopthread.py
working on task
working on task
working on task
working on task
working on task
Stopping as you wish.
Pill to kill - using Event
Another alternative is to use a threading.Event as a function argument. It is False by default, but another thread can "set it" (to True), and the function can learn about it using the wait(timeout) function.
We can wait with zero timeout, but we can also use it as the sleeping timer (used below).
import threading
import time
def doit(stop_event, arg):
    while not stop_event.wait(1):
        print ("working on %s" % arg)
    print("Stopping as you wish.")
def main():
    pill2kill = threading.Event()
    t = threading.Thread(target=doit, args=(pill2kill, "task"))
    t.start()
    time.sleep(5)
    pill2kill.set()
    t.join()
Edit: I tried this in Python 3.6. stop_event.wait() blocks the event (and so the while loop) until release. It does not return a boolean value. Using stop_event.is_set() works instead.
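For what it's worth, the threading documentation states that Event.wait() has returned the value of the internal flag since Python 2.7 and 3.1 (before that it always returned None), so the while not stop_event.wait(1) loop should behave the same on 3.6. A minimal check, with short timeouts chosen only to keep it fast:

```python
import threading

ev = threading.Event()

timed_out = ev.wait(0.01)    # flag not set: returns False once the timeout expires
ev.set()
signalled = ev.wait(0.01)    # flag set: returns True immediately

print(timed_out, signalled)  # False True
```

Note that wait() with no timeout blocks until the flag is set, which may be what the edit above ran into.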
Stopping multiple threads with one pill
The advantage of the pill to kill is better seen if we have to stop multiple threads at once, as one pill will work for all.
The doit function will not change at all; only main handles the threads a bit differently.
def main():
    pill2kill = threading.Event()
    tasks = ["task ONE", "task TWO", "task THREE"]
    def thread_gen(pill2kill, tasks):
        for task in tasks:
            t = threading.Thread(target=doit, args=(pill2kill, task))
            yield t
    threads = list(thread_gen(pill2kill, tasks))
    for thread in threads:
        thread.start()
    time.sleep(5)
    pill2kill.set()
    for thread in threads:
        thread.join()
This has been asked before on Stack. See the following links:
Is there any way to kill a Thread in Python?
Stopping a thread after a certain amount of time
Basically you just need to set up the thread with a stop function that sets a sentinel value that the thread will check. In your case, you'll have something in your loop check the sentinel value to see if it has changed, and if it has, the loop can break and the thread can die.
I read the other questions on Stack but I was still a little confused on communicating across classes. Here is how I approached it:
I use a list to hold all my threads in the __init__ method of my wxFrame class: self.threads = []
As recommended in How to stop a looping thread in Python? I use a signal in my thread class which is set to True when initializing the threading class.
class PingAssets(threading.Thread):
    def __init__(self, threadNum, asset, window):
        threading.Thread.__init__(self)
        self.threadNum = threadNum
        self.window = window
        self.asset = asset
        self.signal = True
    def run(self):
        while self.signal:
            do_stuff()
            sleep()
and I can stop these threads by iterating over my threads:
def OnStop(self, e):
    for t in self.threads:
        t.signal = False
I had a different approach. I've sub-classed the Thread class and created an Event object in the constructor. Then I've written a custom join() method, which first sets this event and then calls the parent's version of itself.
Here is the class I'm using for serial port communication in a wxPython app:
import wx, threading, serial, Events, Queue
class PumpThread(threading.Thread):
    def __init__ (self, port, queue, parent):
        super(PumpThread, self).__init__()
        self.port = port
        self.queue = queue
        self.parent = parent
        self.serial = serial.Serial()
        self.serial.port = self.port
        self.serial.timeout = 0.5
        self.serial.baudrate = 9600
        self.serial.parity = 'N'
        self.stopRequest = threading.Event()
    def run (self):
        try:
            self.serial.open()
        except Exception as ex:
            print ("[ERROR]\tUnable to open port {}".format(self.port))
            print ("[ERROR]\t{}".format(ex))
            self.stopRequest.set()
        else:
            print ("[INFO]\tListening port {}".format(self.port))
            self.serial.write("FLOW?\r")
        while not self.stopRequest.is_set():
            msg = ''
            if not self.queue.empty():
                try:
                    command = self.queue.get()
                    self.serial.write(command)
                except Queue.Empty:
                    continue
            while self.serial.inWaiting():
                char = self.serial.read(1)
                if '\r' in char and len(msg) > 1:
                    char = ''
                    #~ print('[DATA]\t{}'.format(msg))
                    event = Events.PumpDataEvent(Events.SERIALRX, wx.ID_ANY, msg)
                    wx.PostEvent(self.parent, event)
                    msg = ''
                    break
                msg += char
        self.serial.close()
    def join (self, timeout=None):
        self.stopRequest.set()
        super(PumpThread, self).join(timeout)
    def SetPort (self, serial):
        self.serial = serial
    def Write (self, msg):
        if self.serial.is_open:
            self.queue.put(msg)
        else:
            print("[ERROR]\tPort {} is not open!".format(self.port))
    def Stop(self):
        if self.is_alive():
            self.join()
The Queue is used for sending messages to the port and main loop takes responses back. I've used no serial.readline() method, because of different end-line char, and I have found the usage of io classes to be too much fuss.
Depends on what you run in that thread.
If that's your code, then you can implement a stop condition (see other answers).
However, if what you want is to run someone else's code, then you should fork and start a process. Like this:
import multiprocessing
proc = multiprocessing.Process(target=your_proc_function, args=())
proc.start()
now, whenever you want to stop that process, send it a SIGTERM like this:
proc.terminate()
proc.join()
And it's not slow: fractions of a second.
Enjoy :)
My solution is:
import threading, time
def a():
    t = threading.currentThread()
    while getattr(t, "do_run", True):
        print('Do something')
        time.sleep(1)
def getThreadByName(name):
    threads = threading.enumerate() #Threads list
    for thread in threads:
        if thread.name == name:
            return thread
threading.Thread(target=a, name='228').start() #Init thread
t = getThreadByName('228') #Get thread by name
time.sleep(5)
t.do_run = False #Signal to stop thread
t.join()
I find it useful to have a class, derived from threading.Thread, to encapsulate my thread functionality. You simply provide your own main loop in an overridden version of run() in this class. Calling start() arranges for the object’s run() method to be invoked in a separate thread.
Inside the main loop, periodically check whether a threading.Event has been set. Such an event is thread-safe.
Inside this class, you have your own join() method that sets the stop event object before calling the join() method of the base class. It can optionally take a time value to pass to the base class's join() method to ensure your thread is terminated in a short amount of time.
import threading
import time
class MyThread(threading.Thread):
    def __init__(self, sleep_time=0.1):
        self._stop_event = threading.Event()
        self._sleep_time = sleep_time
        # call base class constructor
        super().__init__()
    def run(self):
        """main control loop"""
        while not self._stop_event.is_set():
            #do work
            print("hi")
            self._stop_event.wait(self._sleep_time)
    def join(self, timeout=None):
        """set stop event and join within a given time period"""
        self._stop_event.set()
        super().join(timeout)
if __name__ == "__main__":
    t = MyThread()
    t.start()
    time.sleep(5)
    t.join(1) #wait 1s max
Having a small sleep inside the main loop before checking the threading.Event is less CPU intensive than looping continuously. You can have a default sleep time (e.g. 0.1s), but you can also pass the value in the constructor.
Sometimes you don't have control over the running target. In those cases you can use signal.pthread_kill to send a stop signal. Note that the default action of SIGTSTP suspends the whole process, not just the one thread, which is why the shell reports the job as "Stopped" below.
from signal import pthread_kill, SIGTSTP
from threading import Thread
from itertools import count
from time import sleep
def target():
    for num in count():
        print(num)
        sleep(1)
thread = Thread(target=target)
thread.start()
sleep(5)
pthread_kill(thread.ident, SIGTSTP)
Result:
0
1
2
3
4
[14]+ Stopped

Python how to stop threading operations

I want to know how I can stop my program in the console with CTRL+C or something similar.
The problem is that there are two threads in my program. Thread one crawls the web and extracts some data, and thread two displays this data in a readable format for the user. Both parts share the same database. I run them like this:
from threading import Thread
import ResultsPresenter
def runSpider():
    Thread(target=initSpider).start()
    Thread(target=ResultsPresenter.runPresenter).start()
if __name__ == "__main__":
    runSpider()
How can I do that?
Ok so I created my own thread class :
import threading
class MyThread(threading.Thread):
    """Thread class with a stop() method. The thread itself has to check
    regularly for the stopped() condition."""
    def __init__(self):
        super(MyThread, self).__init__()
        # named _stop_event because Python 3's Thread already defines a _stop method
        self._stop_event = threading.Event()
    def stop(self):
        self._stop_event.set()
    def stopped(self):
        return self._stop_event.is_set()
OK so I will post here snippets of resultPresenter and crawler.
Here is the code of resultPresenter :
# configuration
DEBUG = False
DATABASE = database.__path__[0] + '/database.db'
app = Flask(__name__)
app.config.from_object(__name__)
app.config.from_envvar('CRAWLER_SETTINGS', silent=True)
def runPresenter():
    url = "http://127.0.0.1:5000"
    webbrowser.open_new(url)
    app.run()
There are also two more methods here that I omitted: one connects to the database and the other loads an HTML template to display the result. I repeat this until conditions are met or the user stops the program (which is what I am trying to implement). There are two other methods too: one gets the initial link from the command line and the second validates the arguments; if the arguments are invalid, I won't run the crawl() method.
Here is a short version of the crawler:
def crawl(initialLink, maxDepth):
    #here I am setting initial values, lists etc
    while not(depth >= maxDepth or len(pagesToCrawl) <= 0):
        #this is the main loop that stops when certain depth is
        #reached or there is nothing to crawl
        #Here I am popping urls from url queue, parse them and
        #insert interesting data into the database
    parser.close()
    sock.close()
    dataManager.closeConnection()
Here is the init file which starts those modules in threads:
import ResultsPresenter, MyThread, time, threading
def runSpider():
    MyThread.MyThread(target=initSpider).start()
    MyThread.MyThread(target=ResultsPresenter.runPresenter).start()
def initSpider():
    import Crawler
    import database.__init__
    import schemas.__init__
    import static.__init__
    import templates.__init__
    link, maxDepth = Crawler.getInitialLink()
    if link:
        Crawler.crawl(link, maxDepth)
killall = False
if __name__ == "__main__":
    # no "global" statement is needed at module level
    runSpider()
    while True:
        try:
            time.sleep(1)
        except:
            for thread in threading.enumerate():
                thread.stop()
            killall = True
            raise
Killing threads is not a good idea, since (as you already said) they may be performing some crucial operations on the database. Instead, you can define a global flag that signals to the threads that they should finish what they are doing and quit.
killall = False
import time
if __name__ == "__main__":
    # no "global" statement is needed at module level
    runSpider()
    while True:
        try:
            time.sleep(1)
        except:
            # send a signal to threads, for example:
            killall = True
            raise
and in each thread you check in a similar loop whether the killall variable is set to True. If it is, close all activity and quit the thread.
EDIT
First of all: the Exception is rather obvious. You are passing target argument to __init__, but you didn't declare it in __init__. Do it like this:
class MyThread(threading.Thread):
    def __init__(self, *args, **kwargs):
        super(MyThread, self).__init__(*args, **kwargs)
        # named _stop_event because Python 3's Thread already defines a _stop method
        self._stop_event = threading.Event()
And secondly: you are not using my code. As I said: set the flag and check it in the thread. When I say "thread" I actually mean the handler, i.e. ResultsPresenter.runPresenter or initSpider. Show us the code of one of these and I'll try to show you how to handle stopping.
EDIT 2
Assuming that the code of crawl function is in the same file (if it is not, then you have to import killall variable), you can do something like this
def crawl(initialLink, maxDepth):
    global killall
    # Initialization.
    while not killall and not(depth >= maxDepth or len(pagesToCrawl) <= 0):
        # note the killall variable in while loop!
        # the other code
    parser.close()
    sock.close()
    dataManager.closeConnection()
So basically you just say: "Hey, thread, quit the loop now!". Optionally you can literally break a loop:
while not(depth >= maxDepth or len(pagesToCrawl) <= 0):
    # some code
    if killall:
        break
Of course it will still take some time before it quits (has to finish the loop and close parser, socket, etc.), but it should quit safely. That's the idea at least.
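The same idea can be sketched with a threading.Event in place of the module-level boolean. This is an assumption-laden illustration, not the asker's crawler: crawl() here only pops numbers from a list, stop_requested and closed are made-up names, and a short sleep stands in for waiting on Ctrl+C.

```python
import threading
import time

stop_requested = threading.Event()   # plays the role of the killall flag
closed = []

def crawl(pages_to_crawl, max_depth):
    depth = 0
    try:
        while not stop_requested.is_set() and depth < max_depth and pages_to_crawl:
            pages_to_crawl.pop()
            depth += 1
            stop_requested.wait(0.01)    # stand-in for network I/O
    finally:
        closed.append("connection")      # stand-in for closing parser, socket, database

worker = threading.Thread(target=crawl, args=(list(range(1000)), 1000))
worker.start()
try:
    time.sleep(0.05)    # main thread; Ctrl+C would raise KeyboardInterrupt here
finally:
    stop_requested.set()    # ask the worker to finish its current step and quit
    worker.join()
print(closed)
```

One practical difference from a plain global: an Event is an object that is mutated in place, so a module that did from main import stop_requested still sees the change, whereas a rebound boolean imported the same way would not.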
Try this:
ps aux | grep python
Copy the id of the process you want to kill and:
kill -3 <process_id>
And in your code (adapted from here):
import signal
import sys
def signal_handler(signum, frame):
    print('You killed me!')
    sys.exit(0)
signal.signal(signal.SIGQUIT, signal_handler)
print('Kill me now')
signal.pause()
