I've been trying to find an implementation that looks like mine but I can't seem to find one.
Specifics: I retrieve some database records and want to process all of them in a maximum of 5 threads. But I want these threads to report any potential errors and then close the individual threads (or log them). So I want to push all the records onto a queue and have the threads fetch from the queue.
So far I have this.
class DatabaseRecordImporterThread(threading.Thread):
    def __init__(self, record_queue):
        super(DatabaseRecordImporterThread, self).__init__()
        self.record_queue = record_queue

    def run(self):
        try:
            record = self.record_queue.get()
            force_key_error(record)
        except Exception as e:
            print("Thread failed: ", e) # I want this to print to the main thread stdout
            logger.log(e) # I want this to log to a shared log file (with appending)

MAX_THREAD_COUNT = 5
jobs = queue.Queue()
workers = []
database_records_retrieved = database.get_records(query) # unimportant

# this is where i put all records on a queue
for record in database_records_retrieved:
    jobs.put(record)

for _ in range(MAX_THREAD_COUNT):
    worker = DatabaseRecordImporterThread(jobs)
    worker.start()
    workers.append(worker)

print('*** Main thread waiting')
jobs.join()
print('*** Done')
So the idea is that every thread gets the jobs queue and they are retrieving records from it and printing. Since the amount to process isn't predesignated (defined to do k records at a time or something), each thread will attempt to just process whatever is on the queue. However the output looks like this, when I force an error.
Thread failed: 'KeyError'
Thread failed: 'KeyError'
Thread failed: 'KeyError'
Thread failed: 'KeyError'
Thread failed: 'KeyError'
*** Main thread waiting
when no errors are reported the threads only read one record each:
(record)
(record)
(record)
(record)
(record)
*** Main thread waiting
In the normal Threading setup, I understand that you can setup a queue by doing something like this
Thread(target=function, args=(parameters, queue))
But when you use a class that inherits the Thread object, how do you set this up properly? I can't seem to figure it out. One of my assumptions is that the queue object is not shallow, so every new object created actually refers to the same queue in memory - is this true?
The threads are hanging, obviously, because they are not(?) daemon threads. Not only that, but it seems as though each thread reads only one record and then stops. There are some things I want to do but don't really understand how to do:
1. If all threads fail, the main thread should move on and say "*** Done."
2. The threads should continue processing the queue until it is empty.
In order to do (2), I probably need something in the main thread like while not queue.empty(), but then how would I make sure that I limit the threads to a maximum of 5?
I figured out the answer to the question. After doing a lot of research and some code reading, what needs to happen is the following:
The queue should not be checked for emptiness, since that introduces a race condition. Rather, the workers should run an infinite loop and keep trying to retrieve from the Queue.
Whenever a queue task is finished, the queue.task_done() method needs to be called to notify the join() call in the MainThread. The number of task_done() calls is matched against the number of put() calls, and join() returns once every enqueued item has been marked as done.
Using a queue for a fixed data size task is somewhat suboptimal. Instead of creating a queue that each thread reads off of, it would be better to simply partition the data into chunks of equal size and have each thread process its own list subset. This way we don't potentially get blocked by queue.get() waiting for a new element to be added. Something like, while True: if not queue.empty(): do_something()
Exception handling should still make a call to task_done() if we want to proceed past the failure. Whether a caught exception should end the whole thread is a design choice, but either way the element should still be marked as processed.
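The points above can be put together in a minimal sketch, assuming Python 3. The names process, RecordWorker, run_all and _STOP are hypothetical and only for illustration, and the stop sentinel is an addition that is not part of the original answer: it is one way to let the workers exit their infinite loop once the main thread has finished waiting.

```python
import queue
import threading

MAX_THREAD_COUNT = 5
_STOP = object()  # sentinel (assumption): tells a worker to leave its loop


def process(record):
    # Hypothetical stand-in for the real import step; fails on some input.
    if record % 4 == 0:
        raise KeyError(record)
    return record * 2


class RecordWorker(threading.Thread):
    def __init__(self, jobs, results, errors):
        super().__init__()
        self.jobs = jobs        # every worker references the same Queue object
        self.results = results
        self.errors = errors

    def run(self):
        while True:
            record = self.jobs.get()   # blocks until an item is available
            try:
                if record is _STOP:
                    return
                self.results.append(process(record))
            except Exception as exc:
                # Record the failure and keep the thread alive for the next item.
                self.errors.append((record, exc))
            finally:
                self.jobs.task_done()  # exactly one task_done() per get()


def run_all(records):
    jobs = queue.Queue()
    results, errors = [], []
    for record in records:
        jobs.put(record)
    workers = [RecordWorker(jobs, results, errors)
               for _ in range(MAX_THREAD_COUNT)]
    for w in workers:
        w.start()
    jobs.join()              # returns once every record has been marked done
    for _ in workers:
        jobs.put(_STOP)      # one sentinel per worker
    for w in workers:
        w.join()
    return results, errors
```

Calling run_all(range(10)) returns the processed values together with a list of (record, exception) pairs, so the main thread decides how to print or log failures instead of each worker doing it.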
Related
Let's assume I'm working with Python, although it's not really relevant.
I have a big array and I want to find whether element x is in the array.
However, when one of the threads finds the element, I want all the other threads to stop;
there is no point in them continuing to run. I want to continue the main program with the result.
What would be the right way to do this?
I want to minimize the CPU time of the other threads once I know that the element really exists.
In Python, you can create a thread-safe queue in the main thread and pass it to each worker thread. Each worker should search while the queue is empty() and then terminate. If the result is found, the lucky worker should put() it into the queue, causing all other workers to stop after their current iteration.
Example code (untested):
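from queue import Queue
from threading import Thread

class Worker(Thread):
    def __init__(self, queue):
        super().__init__()  # Thread.__init__ must run before start()
        self.queue = queue

    def run(self):
        # Keep searching until some worker has published a result.
        while self.queue.empty():
            result = search( ... )
            if result:
                self.queue.put(result)

def main():
    queue = Queue()
    workers = []
    for i in range(5):
        worker = Worker(queue)
        worker.start()
        workers.append(worker)
    result = queue.get()  # blocks until the first result arrives
    print(result)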
There are multiple ways. One of them is polling, in the caller's thread, a queue in which the spawned threads store their results. As soon as the first result appears, tell all running threads to stop.
Just note that in CPython only one thread can execute Python code at a time due to the Global Interpreter Lock (unless the work happens in a C extension that releases the lock). Also note that for searching in large data, a more appropriate data structure than an array should be used, like a binary tree.
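As an illustration of the stop-on-first-hit idea, here is a minimal sketch that swaps the queue for a threading.Event as the shared "found" flag. The function names and the chunking scheme are assumptions made for this example, not something from the answers above, and because of the GIL it demonstrates the signalling pattern rather than a speed-up.

```python
import threading


def search_chunk(chunk, target, offset, found, result):
    for i, value in enumerate(chunk):
        if found.is_set():        # another worker already succeeded
            return
        if value == target:
            result.append(offset + i)
            found.set()           # tell the other workers to stop
            return


def parallel_find(data, target, n_workers=4):
    found = threading.Event()
    result = []
    size = max(1, -(-len(data) // n_workers))  # ceiling division
    threads = []
    for start in range(0, len(data), size):
        t = threading.Thread(
            target=search_chunk,
            args=(data[start:start + size], target, start, found, result),
        )
        t.start()
        threads.append(t)
    for t in threads:
        t.join()
    return result[0] if result else None
```

Each worker checks the flag once per element, so the others stop at most one iteration after the hit. Checking less often (say every few thousand elements) would trade a slightly later stop for less overhead.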
I have a function which does some file writing. The semaphore limits the number of threads that can write at the same time to 2. The total number of threads is 3. How can I prevent starvation among the 3 threads? Is a queue an option for that?
import time
import threading

sema = threading.Semaphore(2)

def write_file(file,data):
    sema.acquire()
    try:
        f=open(file,"a")
        f.write(data)
        f.close()
    finally:
        sema.release()
I have to object to the accepted answer. It is true that Condition queues the waiters, but the more important part is what happens when a thread tries to acquire the Condition lock.
The order in which threads are released is not deterministic
The implementation may pick one at random, so the order in which blocked threads are awakened should not be relied on.
In the case of three threads, I agree that it's very unlikely that two are trying to acquire the lock at the same time (one working, one waiting, one acquiring the lock), but there might still be interference.
A good solution for your problem, IMO, would be a thread whose single purpose is to read your data from a queue and write it to a file. All other threads can write to the queue and continue working.
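A minimal sketch of that single-writer arrangement, assuming Python 3. The names writer, run_demo and _STOP are made up for this example: producers only put lines on a queue, and one dedicated thread owns the file.

```python
import queue
import threading

_STOP = object()  # sentinel (assumption): tells the writer thread to finish


def writer(path, lines):
    # The only thread that touches the file, so no semaphore is needed.
    with open(path, "a") as f:
        while True:
            item = lines.get()
            if item is _STOP:
                break
            f.write(item)


def run_demo(path, n_producers=3, per_producer=4):
    lines = queue.Queue()
    w = threading.Thread(target=writer, args=(path, lines))
    w.start()

    def produce(name):
        for i in range(per_producer):
            lines.put(f"{name}:{i}\n")   # returns immediately; no file I/O here

    producers = [threading.Thread(target=produce, args=(f"p{k}",))
                 for k in range(n_producers)]
    for p in producers:
        p.start()
    for p in producers:
        p.join()
    lines.put(_STOP)   # all data is queued before the sentinel
    w.join()
```

Because queue.Queue hands items out in the order they were put, no producer can be starved: its lines are written in their turn regardless of how busy the other producers are.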
If a thread is waiting to acquire the semaphore, one of the other two threads will eventually finish writing and release it.
You might be worried that, if there is a lot of writing going on, the writers could reacquire the semaphore before the waiting thread is notified. I think this cannot happen.
The Semaphore object in Python (2.7) uses a Condition. The Condition adds waiting threads (actually a lock, which the waiting thread is blocking on) to the end of a waiters list, and when notifying threads, the notified threads are taken from the beginning of the list. So the list acts like a FIFO queue.
It looks something like this:
def wait(self, timeout=None):
    self.__waiters.append(waiter)
    ...

def notify(self, n=1):
    ...
    waiters = self.__waiters[:n]
    for waiter in waiters:
        waiter.release()
    ...
My understanding, after reading the source code, is that Python's Semaphores are FIFO. I couldn't find any other information about this, so please correct me if I'm wrong.
I have a set of long-running processes in a typical "pub/sub" setup with queues for communication.
I would like to do two things, and I can't figure out how to accomplish both simultaneously:
1. Addition/removal of workers. For example, I want to be able to add extra consumers if I see that my pending queue size has grown too large.
2. Watchdog for my processes - I want to be notified if any of my producers or consumers crashes.
I can do (2) in isolation:
try:
    while True:
        for process in workers + consumers:
            if not process.is_alive():
                logger.critical("%-8s%s died!", process.pid, process.name)
        sleep(3)
except KeyboardInterrupt:
    # Python propagates CTRL+C to all workers, no need to terminate them
    logger.warning('Received CTRL+C, shutting down')
The above blocks, which prevents me from doing (1).
So I decided to move the code into its own process.
This doesn't work, because process.is_alive() only works for a parent checking the status of its children. In this case, the processes I want to check would be siblings instead of children.
I'm a bit stumped on how to proceed. How can my main process support changes to subprocesses while also monitoring subprocesses?
multiprocessing.Pool actually has a watchdog built-in already. It runs a thread that checks every 0.1 seconds to see if a worker has died. If it has, it starts a new one to take its place:
def _handle_workers(pool):
    thread = threading.current_thread()

    # Keep maintaining workers until the cache gets drained, unless the pool
    # is terminated.
    while thread._state == RUN or (pool._cache and thread._state != TERMINATE):
        pool._maintain_pool()
        time.sleep(0.1)
    # send sentinel to stop workers
    pool._taskqueue.put(None)
    debug('worker handler exiting')

def _maintain_pool(self):
    """Clean up any exited workers and start replacements for them.
    """
    if self._join_exited_workers():
        self._repopulate_pool()
This is primarily used to implement the maxtasksperchild keyword argument, and is actually problematic in some cases. If a process dies while a map or apply command is running, and that process is in the middle of handling a task associated with that call, it will never finish. See this question for more information about that behavior.
That said, if you just want to know that a process has died, you can just create a thread (not a process) that monitors the pids of all the processes in the pool, and if the pids in the list ever change, you know a process has crashed:
def monitor_pids(pool):
    pids = [p.pid for p in pool._pool]
    while True:
        new_pids = [p.pid for p in pool._pool]
        if new_pids != pids:
            print("A worker died")
            pids = new_pids
        time.sleep(3)
Edit:
If you're rolling your own Pool implementation, you can just take a cue from multiprocessing.Pool, and run your monitoring code in a background thread in the parent process. The checks to see if the processes are still running are quick, so the time lost to the background thread taking the GIL should be negligible. Consider that the multiprocessing.Pool watchdog is running every 0.1 seconds! Running yours every 3 seconds shouldn't cause any problems.
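A minimal sketch of such a background watchdog thread, under a few assumptions: watch, on_death and stop are hypothetical names, the list contains objects with an is_alive() method (such as multiprocessing.Process instances), and every process in the list has already been started.

```python
import threading


def watch(processes, on_death, stop, interval=3.0):
    """Call on_death(proc) once for each process that is no longer alive."""
    reported = set()
    while not stop.is_set():
        # Iterate over a copy: the main thread may add or remove workers.
        for proc in list(processes):
            if id(proc) not in reported and not proc.is_alive():
                reported.add(id(proc))
                on_death(proc)
        stop.wait(interval)   # sleeps, but wakes immediately on stop.set()
```

In the parent process this could be started with threading.Thread(target=watch, args=(workers, report, stop), daemon=True).start(), where workers is the same list the main thread appends new consumers to. The main thread then stays free to add or remove workers while the watchdog keeps checking.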
I am working on a project where I have a pool of workers. I am not using the built-in multiprocessing.Pool, but have created my own process pool.
The way it works is that I have created two instances of multiprocessing.Queue - one for sending work tasks to the workers and another to receive the results back.
Each worker just sits in a permanently running loop like this:
while True:
    try:
        request = self.request_queue.get(True, 5)
    except Queue.Empty:
        continue
    else:
        result = request.callable(*request.args, **request.kwargs)
        self.results_queue.put((request, result))
There is also some error-handling code, but I have left it out for brevity. Each worker process has daemon set to 1.
I wish to properly shutdown the main process and all child worker processes. My experiences so far (doing Ctrl+C):
With no special implementations, each child process stops/crashes with a KeyboardInterrupt traceback, but the main process does not exit and has to be killed (sudo kill -9).
If I implement a signal handler for the child processes, set to ignore SIGINTs, the main thread shows the KeyboardInterrupt traceback but nothing happens either way.
If I implement a signal handler for the child processes and the main process, I can see that the signal handler is called in the main process, but calling sys.exit() does not seem to have any effect.
I am looking for a "best practice" way of handling this. I also read somewhere that shutting down processes that were interacting with Queues and Pipes might cause them to deadlock with other processes (due to the Semaphores and other stuff used internally).
My current approach would be the following:
- Find a way to send an internal signal to each process (using a separate command queue or similar) that will terminate their main loop.
- Implement a signal handler for the main loop that sends the shutdown command. The child processes will have a child handler that sets them to ignore the signal.
Is this the right approach?
The thing to watch out for is that there may still be messages in the queues when you want to shut down, so your processes need a way to drain their input queues cleanly. Assuming that your main process is the one that will recognize that it is time to shut down, you could do this:
Send a sentinel to each worker process. This is a special message (frequently None) that can never look like a normal message. After the sentinel, flush and close the queue to each worker process.
In your worker processes use code similar to the following pseudocode:
while True: # Your main processing loop
    msg = inqueue.dequeue() # A blocking wait
    if msg is None:
        break
    do_something()
outqueue.flush()
outqueue.close()
If it is possible that several processes could be sending messages on the inqueue, you will need a more sophisticated approach. This sample, taken from the source code for the _monitor method of logging.handlers.QueueListener in Python 3.2 or later, shows one possibility.
def _monitor(self):
    """
    Monitor the queue for records, and ask the handler
    to deal with them.

    This method runs on a separate, internal thread.
    The thread will terminate if it sees a sentinel object in the queue.
    """
    q = self.queue
    has_task_done = hasattr(q, 'task_done')
    # self._stop is a multiprocessing.Event object that has been set by the
    # main process as part of the shutdown processing, before sending
    # the sentinel
    while not self._stop.isSet():
        try:
            record = self.dequeue(True)
            if record is self._sentinel:
                break
            self.handle(record)
            if has_task_done:
                q.task_done()
        except queue.Empty:
            pass
    # There might still be records in the queue.
    while True:
        try:
            record = self.dequeue(False)
            if record is self._sentinel:
                break
            self.handle(record)
            if has_task_done:
                q.task_done()
        except queue.Empty:
            break
I have a program which spawns 4 threads. These threads need to stay running indefinitely, and if one of them crashes I need to know so I can restart it.
My idea is to use a list of 4 numbers as countdown timers and pass it to each thread through a queue. Then all each thread has to do is reset its own slot in the list while the main thread counts the values down.
So the queue will never be empty, and a value can only reach 0 if its thread has stopped resetting it. If this happens, the main thread knows its child hasn't responded and can act accordingly.
But every time I .get() from the queue, it makes it empty, so I have to get from the queue, store into a variable, modify the variable and put it back in the queue.
Is it fine to use the queue like this for a watchdog?
If you're using Threads, you could regularly check through threading.enumerate to make sure that you have the correct number and kind of threads running.
But, also, passing things into a Queue that gets returned from a thread is a technique that I have at least seen used to make sure that threads are still running. So, if I'm understanding you correctly, what you're doing isn't completely crazy.
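The threading.enumerate check mentioned above could look roughly like this. It is a sketch that assumes each watched thread was created with a known name; missing_threads is a hypothetical helper, not a library function.

```python
import threading


def missing_threads(expected_names):
    """Return the expected thread names that are not currently alive."""
    alive = {t.name for t in threading.enumerate()}  # lists only live threads
    return sorted(set(expected_names) - alive)
```

The main thread can call this periodically and restart whatever it returns. Note that it only detects threads that have exited, not threads that are alive but stuck.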
Your "thread must reset its sentinel occasionally" idea might make more sense as a list of Queues, one per Thread, that each Thread is expected to respond to as soon as possible. This depends on whether your Threads are actually doing processing-intensive work, or are just backgrounded for interface reasons. If they're not spending all their time doing math, you could do something like:
import time
from queue import Queue, Empty
from threading import Thread

def guarded_thread(sentinal_queue, *args):
    while True:
        try:
            sentinal_queue.get_nowait()
            sentinal_queue.put('got it')
        except Empty:
            # we just want to make sure that we respond if we have been
            # pinged
            pass
        # do actual work with other args

def main(arguments):
    # MAX_TIMEOUT and PING_INTERVAL are constants you define (in seconds)
    queues = [Queue() for q in range(4)]
    threads = [(Thread(target=guarded_thread, args=(queue, args)), queue)
               for queue, args in zip(queues, arguments)]
    for thread, queue in threads:
        thread.start()
    while True:
        for thread, queue in threads:
            queue.put(True)
        for thread, queue in threads:
            try:
                response = queue.get(True, MAX_TIMEOUT)
                if response != 'got it':
                    pass  # either re-send or restart the thread
            except Empty:
                pass  # restart the thread
        time.sleep(PING_INTERVAL)
Note that you could also use separate request/response queues to avoid having different kinds of sentinel values. Which one would look less crazy depends on your actual code.
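A minimal sketch of the separate request/response queue variant, assuming Python 3. The names guarded_worker, is_responsive, pings and pongs are hypothetical, and the short sleep only stands in for a slice of real work.

```python
import queue
import threading
import time


def guarded_worker(pings, pongs, stop):
    while not stop.is_set():
        try:
            pongs.put(pings.get_nowait())  # echo the ping token back
        except queue.Empty:
            pass                           # nobody pinged us; carry on
        time.sleep(0.01)                   # placeholder for one slice of real work


def is_responsive(pings, pongs, timeout):
    """Ping the worker and wait up to `timeout` seconds for the matching reply."""
    token = object()
    pings.put(token)
    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        try:
            if pongs.get(timeout=remaining) is token:
                return True                # stale replies are simply skipped
        except queue.Empty:
            return False
```

With two queues the main thread can never read back its own ping, and matching on the token means a late reply to an earlier ping is not mistaken for the current one.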