I have a Python app that runs a pinball machine. It needs to run at a fairly consistent loop rate to do pinball-type things, but I also need to be able to load images and sounds at various points throughout the games. I don't have enough memory to pre-load all the sound files I need for the entire game, so I want to use an additional thread (or threads) to load them in the background while the main game loop continues on.
Using Python's threading module is easy enough, as is using a Queue.Queue to maintain a list of assets that need to be loaded. My question is whether it's "ok" (for lack of a better word) to have the asset loader thread always running, or whether I should just create the thread when I need it and let it end when I'm done. In my case the pinball machine—and my Python app—will be on and running for many hours (or days) at a time.
All of the examples of Python threading I've found tend to be for apps that do something and then end, versus creating (potentially) temporary threads for a long-running app.
In my case I think I have two options:
Option 1, where the loader thread runs forever:
self.loader_queue = Queue.Queue()

def loader_thread(self):
    while True:
        do_my_work(self.loader_queue.get())
Option 2, where the loader thread ends when the queue is empty:
def loader_thread(self):
    while not self.loader_queue.empty():
        do_my_work(self.loader_queue.get())
Obviously I've left some things out: some try: blocks, and a method for creating the thread in Option 2. But I think these snippets explain my two options.
The real question about Option 1 is: is it "bad" because then I'm wasting half of Python's execution cycles while the loader thread just spins and does nothing for the 99.99% of the time the queue is empty?
Or is this a case where I should use the first option with self.loader_queue.get(block=True)? I assume that if my loader thread is just blocking while waiting for an item in the Queue, that's an efficient type of wait and I won't be wasting a bunch of cycles?
Thanks!
Brian
The default for Queue.get is to block, which is what you need:
Remove and return an item from the queue. If optional args block is true and timeout is None (the default), block if necessary until an item is available. If timeout is a positive number, it blocks at most timeout seconds and raises the Empty exception if no item was available within that time. Otherwise (block is false), return an item if one is immediately available, else raise the Empty exception (timeout is ignored in that case).
This way the loop body runs once for each item in the queue, and the thread is blocked while the queue is empty.
You can actually test this yourself by doing something visible (like printing some output) in the while loop.
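For reference, a minimal sketch of Option 1 with a blocking get() (Python 3 module names; the asset path and the body of do_my_work are placeholders):

import threading
import queue  # named Queue in Python 2, as in the snippets above

loader_queue = queue.Queue()

def do_my_work(asset_path):
    print('loading', asset_path)   # stand-in for the real image/sound load

def loader_thread():
    while True:
        do_my_work(loader_queue.get())   # get() blocks until an item arrives

# daemon=True lets the app exit even while the loader is blocked in get()
threading.Thread(target=loader_thread, daemon=True).start()

loader_queue.put('sounds/bumper.wav')   # hypothetical asset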
Option 1 is good if you are waiting for items, since Option 2 may terminate before you get them (if the loader is fast enough).
The thread is probably not taking up enough resources to be considered an optimization candidate. And since it blocks when it shouldn't be running, Option 1 seems to be the way to go.
Related
I have a program that uses threads to start another thread once a certain threshold is reached. Right now the second thread is being started multiple times. I implemented a lock but I don't think I did it right.
for i in range(max_threads):
    t1 = Thread(target=grab_queue)
    t1.start()
In grab_queue, I have:
...
rows.append(resultJson)
if len(rows.value()) >= 250:
    with Lock():
        row_thread = Thread(target=insert_rows, kwargs={'rows': rows.value()})
        row_thread.start()
        rows.reset()
This starts another thread to process the list of rows. I would like to make sure that as soon as one thread hits the if condition, the other threads won't run that block, so that extra threads to process the list of rows aren't started.
Your lock is covering the wrong portion of the code. You have a race condition between the check for the size of rows, and the portion of the code where you reset the rows. Given that the lock is taken only after the size check, two threads could easily both decide that the array has grown too large, and only then would the lock kick in to serialize the resetting of the array. "Serialize" in this case means that the task would still be performed twice, once by each thread, but it would happen in succession rather than in parallel.
The correct code could look like this:
rows.append(resultJson)
with grow_lock:
    if len(rows.value()) >= 250:
        row_thread = Thread(target=insert_rows, kwargs={'rows': rows.value()})
        row_thread.start()
        rows.reset()
There is another issue with the code as shown in the question: if Lock() refers to threading.Lock, it is creating and locking a new lock on each invocation, and in each thread! A lock protects a resource shared among threads, and to perform that function, the lock must itself be shared. To fix the problem, instantiate the lock once and pass it to the thread's target function.
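For illustration, a minimal sketch of that fix (max_threads and the empty worker body are placeholders; grab_queue and grow_lock follow the snippets above):

from threading import Thread, Lock

max_threads = 4        # hypothetical worker count
grow_lock = Lock()     # instantiated once, shared by every worker

def grab_queue(lock):
    # ... build resultJson and append it to rows ...
    with lock:         # every thread synchronizes on the same Lock object
        pass           # size check and flush go here, as shown above

for _ in range(max_threads):
    Thread(target=grab_queue, args=(grow_lock,)).start()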
Taking a step back, your code implements a custom thread pool. Getting that right and covering all the corner cases takes a lot of work, testing, and debugging. There are production-tested modules specialized for that purpose, such as the multiprocessing module shipped with Python (which supports both process and thread pools), and it is a good idea to get acquainted with them before reimplementing their functionality. See, for example, this article for an accessible introduction to multiprocessing-based thread pools.
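As a hedged sketch, the thread-pool variant could look roughly like this (insert_rows is a stand-in here and the batches are made up; multiprocessing.pool.ThreadPool is the thread-backed pool mentioned above):

from multiprocessing.pool import ThreadPool

def insert_rows(rows):
    print('inserting a batch of', len(rows), 'rows')  # placeholder

batches = [['row'] * 250, ['row'] * 250]  # hypothetical pre-built batches
pool = ThreadPool(processes=4)            # four worker threads
pool.map(insert_rows, batches)            # blocks until every batch is done
pool.close()
pool.join()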
In this case, say I wanted to wait on a condition to happen, that may happen at any random time.
while True:
    if condition:
        pass  # do whatever
    else:
        pass
As you can see, pass will just happen until the condition is True. But while the condition isn't True, the CPU is pegged spinning through pass, causing high CPU usage, when I simply want it to wait until the condition occurs. How may I do this?
See Busy_loop#Busy-waiting_alternatives:
Most operating systems and threading libraries provide a variety of system calls that will block the process on an event, such as lock acquisition, timer changes, I/O availability or signals.
Basically, to wait for something, you have two options (same as IRL):
Check for it periodically with a reasonable interval (this is called "polling")
Make the event you're waiting for notify you: invoke (or, as a special case, unblock) your code somehow (this is called "event handling" or "notifications". For system calls that block, "blocking call" or "synchronous call" or call-specific terms are typically used instead)
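For the simple in-process case, threading.Event covers the second option; a minimal sketch contrasting the two approaches (the worker and the two-second delay are made up for illustration):

import threading
import time

ready = threading.Event()

def worker():
    time.sleep(2)    # simulate the thing being waited on
    ready.set()      # notify the waiter

threading.Thread(target=worker).start()

# Polling: wakes up every 0.1 s to check the flag
# while not ready.is_set():
#     time.sleep(0.1)

# Blocking wait: sleeps in the OS until set() is called, using no CPU
ready.wait()
print('condition occurred')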
As already mentioned, you can (a) poll, i.e. check for the condition and, if it is not true, wait for some time interval before checking again. If your condition is an external event, you can arrange a blocking wait for the state to change. Or you can take a look at the publish/subscribe model, pubsub, where your code registers an interest in a given item and other parts of the code publish the item.
This is not really a Python problem. Optimally, you want to put your process to sleep and wait for some sort of signal that the action has occurred, which will use no CPU while waiting. So it's not so much a case of writing Python code as of figuring out what mechanism makes condition true, and then waiting on that.
If the condition is a simple flag set by another thread in your program rather than an external resource, you need to go back and learn from scratch how threading works.
Only if the thing that you're waiting for does not provide any sort of push notification that you can wait on should you consider polling it in a loop. A sleep will help reduce the CPU load but not eliminate it and it will also increase the response latency as the sleep has to complete before you can commence processing.
As for waiting on events, an event-driven paradigm might be what you want unless your program is utterly trivial. Python has the Twisted framework for this.
First method:
import threading
import time

def keepalive():
    while True:
        print 'Alive.'
        time.sleep(200)

threading.Thread(target=keepalive).start()
Second method:
import threading

def keepalive():
    print 'Alive.'
    threading.Timer(200, keepalive).start()

threading.Timer(200, keepalive).start()
Which method takes up more RAM? And in the second method, does the thread end after being activated, or does it remain in memory and start a new thread (multiple threads)?
Timer creates a new thread object for each started timer, so it certainly needs more resources when creating and garbage collecting these objects.
As each thread exits immediately after it has spawned another, active_count stays constant, but new Threads are constantly created and destroyed, which causes overhead. I'd say the first method is definitely better.
Although you won't really see much difference unless the interval is very small.
Here's an example of how to test this yourself:
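A minimal sketch of such a test (with the interval shrunk to 0.2 s so effects show quickly; threading.active_count() reveals whether threads pile up, and running one method at a time gives a clean reading):

import threading
import time

def method_one():
    while True:
        time.sleep(0.2)   # interval shrunk from 200 s

def method_two():
    threading.Timer(0.2, method_two).start()

# run one method at a time for a clean reading
threading.Thread(target=method_one).start()
# method_two()

for _ in range(5):
    time.sleep(1)
    print(threading.active_count())   # stays constant under both methods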
And in the second method, does the thread end after being activated? or does it remain in the memory and start a new thread? (multiple threads)
import threading

def keepalive():
    print 'Alive.'
    threading.Timer(200, keepalive).start()
    print threading.active_count()

threading.Timer(200, keepalive).start()
I also changed the 200 to .2 so it wouldn't take as long.
The thread count was 3 forever.
Then I did this:
top -pid 24767
The #TH column never changed.
So, there's your answer: we don't have enough info to know whether Python maintains a single timer thread for all of the timers, or ends and cleans up the thread as soon as the timer fires, but we can be sure the threads don't stick around and pile up. (If you do want to know which of those is happening, you can, e.g., print the thread ids.)
An alternative way to find out is to look at the source. As the documentation says, "Timer is a subclass of Thread and as such also functions as an example of creating custom threads". The fact that it's a subclass of Thread already tells you that each Timer is a Thread. And the fact that it "functions as an example" implies that it ought to be easy to read. If you click the link from the documentation to the source, you can see how trivial it is. Most of the work is done by Event, but that's in the same source file, and it's almost as simple. Effectively, it just creates a condition variable, waits on it (so it blocks until it times out, or you notify the condition by calling cancel), then quits.
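As a rough sketch of what that source boils down to (not the actual stdlib code, just its shape):

import threading

class SimpleTimer(threading.Thread):
    # roughly what threading.Timer does: wait on an Event, then fire or quit
    def __init__(self, interval, function):
        threading.Thread.__init__(self)
        self.interval = interval
        self.function = function
        self.finished = threading.Event()

    def cancel(self):
        self.finished.set()              # wakes the wait() below early

    def run(self):
        self.finished.wait(self.interval)
        if not self.finished.is_set():   # timed out without being cancelled
            self.function()
        self.finished.set()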
The reason I'm answering one sub-question and explaining how I did it, rather than answering each sub-question, is because I think it would be more useful for you to walk through the same steps.
On further reflection, this probably isn't a question to be decided by optimization in the first place:
If you have a simple, synchronous program that needs to do nothing for 200 seconds, make a blocking call to sleep. Or, even simpler, just do the job and quit, and pick an external tool to schedule your script to run every 200s.
On the other hand, if your program is inherently asynchronous—especially if you've already got threads, signal handlers, and/or an event loop—there's just no way you're going to get sleep to work. If Timer is too inefficient, go to PyPI or ActiveState and find a better timer that lets you schedule repeatable timers (or even multiple timers) with a single instance and thread. (Or, if you're using signals, use signal.alarm or setitimer, and if you're using an event loop, build the timer into your main loop.)
I can't think of any use case where sleep and Timer would both be serious contenders.
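For instance, a repeating timer that reuses one long-lived thread might be sketched like this (not a library API; Event.wait with a timeout doubles as an interruptible sleep):

import threading

class RepeatingTimer(threading.Thread):
    # one thread, many firings
    def __init__(self, interval, function):
        threading.Thread.__init__(self)
        self.daemon = True
        self.interval = interval
        self.function = function
        self.stopped = threading.Event()

    def cancel(self):
        self.stopped.set()

    def run(self):
        # wait() returns False on timeout (fire again), True once cancelled
        while not self.stopped.wait(self.interval):
            self.function()

timer = RepeatingTimer(0.2, lambda: print('tick'))  # hypothetical callback
timer.start()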
What is the best way to continuously repeat the execution of a given function at a fixed interval while being able to terminate the executor (thread or process) immediately?
Basically I know two approaches:
use multiprocessing and a function with an infinite cycle and time.sleep at the end. Processing is terminated with process.terminate() in any state.
use threading and constantly recreate timers at the end of the thread function. Processing is terminated by timer.cancel() while sleeping.
(Both “in any state” and “while sleeping” are fine, even though the latter may not be immediate.) The problem is that I have to use both multiprocessing and threading: the latter appears not to work on ARM (some fuzzy interaction of the Python interpreter and vim; outside of vim everything is fine) (I was using the second approach there, have not tried threading+cycle; no code is currently left), and the former spawns way too many processes, which I would like not to see unless really required. This leads to having to code two different approaches, even though threading with a cycle would be just a few more imports and drop-in replacements of all the multiprocessing stuff wrapped in if/else (except that there is no thread.terminate()). Is there some better way to do the job?
The code currently in use is here (with a cycle for both jobs), but I do not think it will be of much use in answering the question.
Update: The reason why I am using this solution is functions that display file status (and some other things, like the branch) of version control systems in the vim statusline. These statuses must be updated, but updating them immediately cannot be done without using hooks, and I have no idea how to set hooks temporarily and remove them on vim quit without possibly spoiling the user's configuration. Thus the standard solution is a cache expiring after N seconds. But when the cache expires I need to make an expensive shell call, and the delay is noticeable, the more so the heavier the IO load is. What I am implementing now is updating values for viewed buffers every N seconds in a separate process, so the delays bother that process and not me. Threads are likely to work as well, because the GIL does not affect calls to external programs.
I'm not clear on why a single long-lived thread that loops infinitely over the tasks wouldn't work for you? Or why you end up with many processes in the multiprocess option?
My immediate reaction would have been a single thread with a queue to feed it things to do. But I may be misunderstanding the problem.
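A sketch of that idea: one long-lived worker fed by a queue, with a sentinel for prompt shutdown (update_status is a stand-in for the expensive shell call):

import threading
import queue

STOP = object()          # sentinel: putting this on the queue ends the worker
jobs = queue.Queue()

def update_status(path):
    print('refreshing status for', path)   # stand-in for the shell call

def worker():
    while True:
        job = jobs.get()
        if job is STOP:
            break
        update_status(job)

threading.Thread(target=worker, daemon=True).start()

jobs.put('/path/to/buffer')   # hypothetical job
jobs.put(STOP)                # terminates the worker promptly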
I do not know how to do it simply and/or cleanly in Python, but I was wondering if you couldn't take advantage of an existing system scheduler, e.g. crontab on *nix systems. There is a Python API for it that might satisfy your needs.
I'm new to Python and making some headway with threading. I'm doing some music file conversion and want to be able to utilize the multiple cores on my machine (one active conversion thread per core).
class EncodeThread(threading.Thread):
    # this is hacked together a bit, but should give you an idea
    def run(self):
        decode = subprocess.Popen(["flac", "--decode", "--stdout", self.src],
                                  stdout=subprocess.PIPE)
        encode = subprocess.Popen(["lame", "--quiet", "-", self.dest],
                                  stdin=decode.stdout)
        encode.communicate()

# some other code puts these threads with various src/dest pairs in a list

for proc in threads:  # `threads` is my list of `threading.Thread` objects
    proc.start()
Everything works, all the files get encoded, bravo! ... however, all the processes spawn immediately, yet I only want to run two at a time (one for each core). As soon as one is finished, I want it to move on to the next one on the list until the list is done, then continue with the rest of the program.
How do I do this?
(I've looked at the thread pool and queue functions but I can't find a simple answer.)
Edit: maybe I should add that each of my threads is using subprocess.Popen to run a separate command line decoder (flac) piped to stdout which is fed into a command line encoder (lame/mp3).
If you want to limit the number of parallel threads, use a semaphore:
threadLimiter = threading.BoundedSemaphore(maximumNumberOfThreads)

class EncodeThread(threading.Thread):
    def run(self):
        threadLimiter.acquire()
        try:
            <your code here>
        finally:
            threadLimiter.release()
Start all threads at once. All but maximumNumberOfThreads will wait in threadLimiter.acquire() and a waiting thread will only continue once another thread goes through threadLimiter.release().
"Each of my threads is using subprocess.Popen to run a separate command line [process]".
Why have a bunch of threads manage a bunch of processes? That's exactly what an OS does for you. Why micro-manage what the OS already manages?
Rather than fool around with threads overseeing processes, just fork off processes. Your process table probably can't handle 2000 processes, but it can handle a few dozen (maybe a few hundred) pretty easily.
You want to have more work than your CPUs can possibly handle queued up. The real question is one of memory -- not processes or threads. If the sum of all the active data for all the processes exceeds physical memory, then data has to be swapped, and that will slow you down.
If your processes have a fairly small memory footprint, you can have lots and lots running. If your processes have a large memory footprint, you can't have very many running.
If you're using the default "cpython" version then this won't help you, because only one thread can execute at a time; look up Global Interpreter Lock. Instead, I'd suggest looking at the multiprocessing module in Python 2.6 -- it makes parallel programming a cinch. You can create a Pool object with 2*num_threads processes, and give it a bunch of tasks to do. It will execute up to 2*num_threads tasks at a time, until all are done.
At work I have recently migrated a bunch of Python XML tools (a differ, xpath grepper, and bulk xslt transformer) to use this, and have had very nice results with two processes per processor.
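A sketch of that approach, assuming a Pool of 2*cpu_count workers (encode re-creates the flac-to-lame pipeline from the question; the file pairs are made up):

import multiprocessing
import subprocess

def encode(pair):
    # the flac | lame pipeline from the question
    src, dest = pair
    decode = subprocess.Popen(["flac", "--decode", "--stdout", src],
                              stdout=subprocess.PIPE)
    subprocess.Popen(["lame", "--quiet", "-", dest],
                     stdin=decode.stdout).communicate()

if __name__ == '__main__':
    pairs = [("a.flac", "a.mp3"), ("b.flac", "b.mp3")]   # hypothetical files
    pool = multiprocessing.Pool(processes=2 * multiprocessing.cpu_count())
    pool.map(encode, pairs)   # runs at most 2*cpu_count tasks at once
    pool.close()
    pool.join()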
It looks to me that what you want is a pool of some sort, and in that pool you would like to have n threads where n == the number of processors on your system. You would then have another thread whose only job was to feed jobs into a queue which the worker threads could pick up and process as they became free (so for a dual-core machine, you'd have three threads, but the main thread would be doing very little).
As you are new to Python, though, I'll assume you don't know about the GIL and its side-effects with regard to threading. If you read the article I linked you will soon understand why traditional multithreading solutions are not always the best in the Python world. Instead you should consider using the multiprocessing module (new in Python 2.6; in 2.5 you can use this backport) to achieve the same effect. It side-steps the issue of the GIL by using multiple processes as if they were threads within the same application. There are some restrictions about how you share data (you are working in different memory spaces) but actually this is no bad thing: it just encourages good practice such as minimising the contact points between threads (or processes in this case).
In your case you are probably interested in using a pool, as specified here.
Short answer: don't use threads.
For a working example, you can look at something I've recently tossed together at work. It's a little wrapper around ssh which runs a configurable number of Popen() subprocesses. I've posted it at: Bitbucket: classh (Cluster Admin's ssh Wrapper).
As noted, I don't use threads; I just spawn off the children, loop over them calling their .poll() methods and checking for timeouts (also configurable), and replenish the pool as I gather the results. I've played with different sleep() values, and in the past I've written a version (before the subprocess module was added to Python) which used the signal module (SIGCHLD and SIGALRM) and the os.fork() and os.execve() functions, doing my own pipe and file descriptor plumbing, etc.
In my case I'm incrementally printing results as I gather them ... and remembering all of them to summarize at the end (when all the jobs have completed or been killed for exceeding the timeout).
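A rough sketch of that pattern (not the classh code itself; the host list, command, and limits are made up):

import subprocess
import time

hosts = ['host%d' % i for i in range(10)]   # hypothetical host list
MAX_RUNNING, TIMEOUT = 3, 30                # made-up limits
running, results = {}, {}

while hosts or running:
    # replenish the pool up to the limit
    while hosts and len(running) < MAX_RUNNING:
        host = hosts.pop()
        proc = subprocess.Popen(['ssh', host, 'uptime'],
                                stdout=subprocess.PIPE)
        running[host] = (proc, time.time())
    # reap finished children; kill any past the timeout
    for host, (proc, started) in list(running.items()):
        if proc.poll() is not None:
            results[host] = proc.returncode
            del running[host]
        elif time.time() - started > TIMEOUT:
            proc.kill()
            results[host] = 'timeout'
            del running[host]
    time.sleep(0.1)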
I ran that, as posted, on a list of 25,000 internal hosts (many of which are down, retired, located internationally, not accessible to my test account etc). It completed the job in just over two hours and had no issues. (There were about 60 of them that were timeouts due to systems in degenerate/thrashing states -- proving that my timeout handling works correctly).
So I know this model works reliably. Running 100 concurrent ssh processes with this code doesn't seem to cause any noticeable impact. (It's a moderately old FreeBSD box.) I used to run the old (pre-subprocess) version with 100 concurrent processes on my old 512MB laptop without problems, too.
(BTW: I plan to clean this up and add features to it; feel free to contribute or to clone off your own branch of it; that's what Bitbucket.org is for).
I am not an expert in this, but I have read something about locks. This article might help you out.
Hope this helps
I would like to add something, just as a reference for others looking to do something similar, but who might have coded things different from the OP. This question was the first one I came across when searching and the chosen answer pointed me in the right direction. Just trying to give something back.
import threading
import time

maximumNumberOfThreads = 2
threadLimiter = threading.BoundedSemaphore(maximumNumberOfThreads)

def simulateThread(a, b):
    threadLimiter.acquire()
    try:
        # do some stuff
        c = a + b
        print('a + b = ', c)
        time.sleep(3)
    except NameError:  # or some other type of error
        print('some error')
    finally:
        # release exactly once, whether or not an error occurred;
        # releasing in both except and finally would over-release the
        # BoundedSemaphore and raise ValueError
        threadLimiter.release()
threads = []
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]
for i in range(len(sample)):
    thread = threading.Thread(target=simulateThread, args=(sample[i], 2))
    thread.daemon = True
    threads.append(thread)
    thread.start()

for thread in threads:
    thread.join()
This basically follows what you will find on this site:
https://www.kite.com/python/docs/threading.BoundedSemaphore