I would like to find a mechanism to easily report the progress of a Python thread. For example, if my thread had a counter, I would like to know the value of the counter once in a while, but, importantly, I only need to know the latest value, not every value that's ever gone by.
What I imagine to be the simplest solution is a single-value Queue, where every time I put a new value on it in the thread, it replaces the old value with the new one. Then when I do a get in the main program, it would only return the latest value.
Because I don't know how to do the above, instead what I do is put every counter value in a queue, and when I get, I read all the values until there are no more, and just keep the last. But this seems far from ideal, in that I'm filling the queue with thousands of values that I don't care about.
Here's an example of what I do now:
from threading import Thread
from Queue import Queue, Empty
from time import sleep

N = 1000

def fast(q):
    count = 0
    while count < N:
        sleep(.02)
        count += 1
        q.put(count)

def slow(q):
    while 1:
        sleep(5)  # sleep for a long time
        # read last item in queue
        val = None
        while 1:  # read all elements of queue, only saving last
            try:
                val = q.get(block=False)
            except Empty:
                break
        print val  # the last element read from the queue
        if val == N:
            break

if __name__ == "__main__":
    q = Queue()
    fast_thread = Thread(target=fast, args=(q,))
    fast_thread.start()
    slow(q)
    fast_thread.join()
My question is, is there a better approach?
Just use a global variable and a threading.Lock to protect it during assignments:
import threading
from time import sleep

N = 1000
value = 0

def fast(lock):
    global value
    count = 0
    while count < N:
        sleep(.02)
        count += 1
        with lock:
            value = count

def slow():
    while 1:
        sleep(5)  # sleep for a long time
        print value  # read current value
        if value == N:
            break

if __name__ == "__main__":
    lock = threading.Lock()
    fast_thread = threading.Thread(target=fast, args=(lock,))
    fast_thread.start()
    slow()
    fast_thread.join()
yields (something like)
249
498
747
997
1000
As Don Question points out, if there is only one thread modifying value, then
actually no lock is needed in the fast function. And as dano points out, if you want to
ensure that the value printed in slow is the same value used in the
if-statement, then a lock is needed in the slow function.
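For example, a locked read in slow might look like this (a minimal sketch of dano's point; the snapshot guarantees the printed value is the one tested, and __main__ would call slow(lock) accordingly):

def slow(lock):
    while 1:
        sleep(5)  # sleep for a long time
        with lock:
            val = value  # take a consistent snapshot of the shared value
        print val
        if val == N:
            break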
For more on when locks are needed, see Thread Synchronization Mechanisms in Python.
Just use a deque with a maximum length of 1. It will just keep your latest value.
So, instead of:
q = Queue()
use:
from collections import deque
q = deque(maxlen=1)
To read from the deque, there's no get method, so you'll have to do something like:
val = None
try:
    val = q[0]
except IndexError:
    pass
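Putting it together, the example from the question might look like this with a deque (a sketch in the question's Python 2 style; deque appends are documented as thread-safe):

from collections import deque
from threading import Thread
from time import sleep

N = 1000

def fast(q):
    count = 0
    while count < N:
        sleep(.02)
        count += 1
        q.append(count)  # with maxlen=1, the old value is dropped automatically

def slow(q):
    while 1:
        sleep(5)  # sleep for a long time
        try:
            val = q[0]  # peek at the latest value
        except IndexError:
            val = None
        print val
        if val == N:
            break

if __name__ == "__main__":
    q = deque(maxlen=1)
    fast_thread = Thread(target=fast, args=(q,))
    fast_thread.start()
    slow(q)
    fast_thread.join()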
In your special case, you may be over-complicating the issue. If your variable is just some kind of progress indicator for a single thread, and only this thread actually changes the variable, then it's completely safe to use a shared object to communicate the progress, as long as all other threads only read.
I guess we have all read too many (rightful) warnings about race conditions and other pitfalls of shared state in concurrent programming, so we tend to overthink and add more precaution than is sometimes needed.
You could basically share a pre-constructed dict:
thread_progress = dict.fromkeys(list_of_threads, progress_start_value)
or manually:
thread_progress = {thread: progress_value, ...}
without further precaution as long as no thread changes the dict-keys.
This way you can track the progress of multiple threads over one dict. The only condition is not to change the dict once the threading has started. This means the dict must contain all threads BEFORE the first child thread starts; otherwise you must use a Lock before writing to the dict. By "changing the dict" I mean all operations regarding its keys. You may change the associated value of a key, because that's at the next level of indirection. A sketch of this follows.
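As a sketch (the worker names are illustrative), each thread gets access to the dict and updates only the value under its own pre-created key:

import threading
from time import sleep

def work(name, thread_progress):
    for i in range(100):
        sleep(.01)                     # simulate work
        thread_progress[name] = i + 1  # only this thread ever writes this key

thread_progress = dict.fromkeys(['worker-1', 'worker-2'], 0)  # all keys exist up front
threads = [threading.Thread(target=work, args=(n, thread_progress))
           for n in thread_progress]
for t in threads:
    t.start()
sleep(.5)
print thread_progress  # e.g. {'worker-1': 42, 'worker-2': 40}
for t in threads:
    t.join()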
Update:
The underlying problem is shared state, which is already a problem in linear programs but a nightmare in concurrent ones.
For example: imagine a global (shared) variable sv and two functions G(ood) and B(ad) in a linear program. Both functions calculate a result depending on sv, but B unintentionally changes sv. Now you are wondering why the heck G doesn't do what it should, despite you not finding any error in your function G, even after you tested it in isolation and it was perfectly fine.
Now imagine the same scenario in a concurrent program, with two threads A and B. Both threads increment the shared state/variable sv by one.
without locking (current value of sv in parentheses):
sv = 0
A reads sv (0)
B reads sv (0)
A inc sv (0)
B inc sv (0)
A writes sv (1)
B writes sv (1)
sv == 1 # should be 2!
Finding the source of the problem is a pure nightmare, because it can also succeed sometimes! More often than not, A would actually finish before B even starts to read sv, but then your problem just seems to behave non-deterministically or erratically and is even harder to find. In contrast to my linear example, both threads are "good", but they nevertheless don't behave as intended.
with locking:
sv = 0
l = lock (for access on sv)
A tries to acquire lock for sv -> success (0)
B tries to acquire lock for sv -> failure, blocked by A (0)
A reads sv (0)
B blocked (0)
A inc sv (0)
B blocked (0)
A writes sv (1)
B blocked (1)
A releases lock on sv (1)
B tries to acquire lock for sv -> success (1)
...
sv == 2
I hope my little example explained the underlying problem of accessing a shared state, and why making write operations (including the preceding read operation) atomic through locking is necessary.
Regarding my advice of a pre-initialized dict: this is a mere precaution, for two reasons:
1. If you iterate over the threads in a for-loop, the loop may raise an exception if a thread adds or removes an entry to/from the dict while the loop is still running, because it is then unclear what the next key should be.
2. Thread A reads the dict and gets interrupted by thread B, which adds an entry and finishes. Thread A resumes, but doesn't have the dict thread B changed, and writes the pre-B state back together with its own changes. Thread B's changes are lost.
BTW, my proposed solution wouldn't work as written, because of the immutability of the primitive types. But this can easily be fixed by making them mutable, e.g. by encapsulating them in a list or a special Progress object, or even simpler: give the thread function access to the thread_progress dict.
Explanation by example:
t = Thread()
progress = 0 # progress points to the object `0`
dict[t] = progress # dict[t] points now to object `0`
progress = 1 # progress points to object `1`
dict[t] # dict[t] still points to object `0`
better:
t = Thread()
t.progress = 0
dict[thread_id] = t
t.progress = 1
dict[thread_id].progress == 1
The GIL wiki states that:
In CPython, the global interpreter lock, or GIL, is a mutex that prevents multiple native threads from executing Python bytecodes at once. This lock is necessary mainly because CPython's memory management is not thread-safe.
When multiple threads try to do some operation on a shared variable at the same time, we need to synchronise the threads to avoid race conditions. We achieve this by acquiring a lock.
But since Python uses the GIL, only one thread is allowed to execute Python's bytecode at a time, so this problem should never be faced in Python programs - or so I thought :(. But I saw an article about thread synchronisation in Python with a code snippet that causes race conditions.
https://www.geeksforgeeks.org/multithreading-in-python-set-2-synchronization/
Can someone please explain me how this is possible?
Code
import threading

# global variable x
x = 0

def increment():
    """
    function to increment global variable x
    """
    global x
    x += 1

def thread_task():
    """
    task for thread
    calls increment function 100000 times.
    """
    for _ in range(100000):
        increment()

def main_task():
    global x
    # setting global variable x as 0
    x = 0
    # creating threads
    t1 = threading.Thread(target=thread_task)
    t2 = threading.Thread(target=thread_task)
    # start threads
    t1.start()
    t2.start()
    # wait until threads finish their job
    t1.join()
    t2.join()

if __name__ == "__main__":
    for i in range(10):
        main_task()
        print("Iteration {0}: x = {1}".format(i, x))
Output:
Iteration 0: x = 175005
Iteration 1: x = 200000
Iteration 2: x = 200000
Iteration 3: x = 169432
Iteration 4: x = 153316
Iteration 5: x = 200000
Iteration 6: x = 167322
Iteration 7: x = 200000
Iteration 8: x = 169917
Iteration 9: x = 153589
Only one thread at a time can execute bytecode. That ensures memory allocation and primitive objects like lists, dicts and sets are always kept consistent without the need for any explicit control on the Python side of the code.
However, += 1 is not atomic, integers being immutable objects: it fetches the previous value from the variable, creates (or gets a reference to) a new object which is the result of the operation, and then stores that value in the original global variable. The bytecode for this can be seen with the help of the dis module:
In [2]: import dis

In [3]: global counter

In [4]: counter = 0

In [5]: def inc():
   ...:     global counter
   ...:     counter += 1
   ...:

In [6]: dis.dis(inc)
  1           0 RESUME                   0

  3           2 LOAD_GLOBAL              0 (counter)
             14 LOAD_CONST               1 (1)
             16 BINARY_OP               13 (+=)
             20 STORE_GLOBAL             0 (counter)
             22 LOAD_CONST               0 (None)
             24 RETURN_VALUE
And the running thread can change arbitrarily between each of these bytecode instructions.
So, for this kind of concurrency, one has to resort, as in lower-level code, to a lock - the inc function should be like this:
In [7]: from threading import Lock

In [8]: inc_lock = Lock()

In [9]: def inc():
   ...:     global counter
   ...:     with inc_lock:
   ...:         counter += 1
   ...:
So, this will ensure no other thread will run bytecode while performing the whole counter += 1 part.
(The disassemble here would be significantly lengthier, but it has to do with the semantics of the with block, not with the lock, so, not related to the problem we are looking at. The lock can be acquired through other means as well - a with block is just the most convenient.)
Also, this is one of the greatest advantages of async code when compared to threaded parallelism: in async code one's Python code will always run without being interrupted unless there is an explicit deferring of the flow to the controlling loop - by using an await or one of the various async <command> patterns.
since Python uses the GIL, only one thread is allowed to execute Python's bytecode
All of the threads in a Python program must be able to execute byte codes, but at any one moment in time, only one thread can have a lock on the GIL. The threads in a program continually take turns locking and unlocking it as needed.
When multiple threads try to do some operation on a shared variable at the same time, we need to synchronise the threads to avoid race conditions.
"Race condition" is kind of a low-level idea. There is a higher-level way to understand why we need mutexes.
Threads communicate through shared variables. Imagine we're selling seats to a show in a theater. We've got a list of seats that have already been sold, we've got a list of seats that are still available, and we've got some number of pending transactions which have seats "on hold." At any given instant in time, if we count all of the seats in all of those different places, they'd better add up to the number of seats in the theater—a constant number.
Computer scientists call that property an invariant. The number of seats in the theater never varies, and we always want all the seats that we know about to add up to that number.
The problem is, how can you sell a seat without breaking the invariant? You can't. You can't write code that moves a seat from one category to another in a single, atomic operation. Computer hardware doesn't have an operation for that. We have to use a sequence of simpler operations to move some object from one list to another. And, if one thread tries to count the seats while some other thread is half-way done performing that sequence, then the first thread will get the wrong number of seats. The invariant is "broken."
Mutexes solve the problem. If every thread that can temporarily break the invariant only ever does it while keeping a certain mutex locked, and if every other thread that cares about the invariant only ever checks it while keeping the same mutex locked, then no thread will ever see the broken invariant other than the one thread that is doing it on purpose.
You can talk about "invariants," or you can talk about "race conditions," but which feels right, depends on the complexity of the program. If it's a complicated system, then it often makes sense to describe the need for a mutex at a high level—by describing the invariant that the mutex protects. If it's a really simple problem (e.g., like incrementing a counter) then it feels better to talk about the "race condition" that the mutex averts. But they're really just two different ways of thinking about the same thing.
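As a toy sketch of the theater example (all names here are illustrative):

import threading

TOTAL = 100
available = list(range(TOTAL))  # seat numbers still for sale
sold = []                       # seat numbers already sold
seats_lock = threading.Lock()   # protects the invariant: len(available) + len(sold) == TOTAL

def sell_one_seat():
    with seats_lock:  # the invariant may be broken only while we hold the lock
        if available:
            seat = available.pop()  # step 1: remove from one list...
            sold.append(seat)       # step 2: ...and add to the other

def count_seats():
    with seats_lock:  # never observe a half-done transfer
        return len(available) + len(sold)  # always equals TOTAL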
I am running multiple processes from single python code:
Code Snippet:
while 1:
    if sqsObject.msgCount() > 0:
        ReadyMsg = sqsObject.readM2Q()
        if ReadyMsg == 0:
            continue
        fileName = ReadyMsg['fileName']
        dirName = ReadyMsg['dirName']
        uuid = ReadyMsg['uid']
        guid = ReadyMsg['guid']
        callback = ReadyMsg['callbackurl']
        # print ("Trigger Algorithm Process")
        if countProcess < maxProcess:
            try:
                retValue = Process(target=dosomething, args=(dirName, uuid, guid, callback))
                processArray.append(retValue)
                retValue.start()
                countProcess = countProcess + 1
            except:
                print "Cannot Run Process"
        else:
            for i in range(len(processArray)):
                if processArray[i].is_alive() == True:
                    continue
                else:
                    try:
                        # print 'Restart Process'
                        processArray[i] = Process(target=dosomething, args=(dirName, uuid, guid, callback))
                        processArray[i].start()
                    except:
                        print "Cannot Run Process"
    else:  # No more request to service
        for i in range(len(processArray)):
            if processArray[i].is_alive() == True:
                processRunning = 1
                break
            else:
                continue
        if processRunning == 0:
            countProcess = 0
        else:
            processRunning = 0
Here I am reading messages from the queue and creating a process to run the algorithm on each message. I am putting an upper limit of maxProcess on the number of processes. Hence, after reaching maxProcess, I want to reuse the processArray slots that are not alive, by checking is_alive().
This runs fine for a smaller number of processes; however, for a large number of messages, say 100, memory consumption goes through the roof. I am thinking I have a leak from reusing the process slots.
Not sure what is wrong in the process.
Thank you in advance for spotting an error or wise advice.
Your code is, in a word, weird :-)
It's not an MCVE, so no one else can test it, but just looking at it, you have this (slightly simplified) structure in the inner loop:
if count < limit:
    ... start a new process, and increment count ...
else:
    ... do things that can potentially start even more processes
    ... (but never, ever, decrease count)
which seems unwise at best.
There are no invocations of a process instance's join(), anywhere. (We'll get back to the outer loop and its else case in a bit.)
Let's look more closely at the inner loop's else case code:
for i in range(len(processArray)):
    if (processArray[i].is_alive() == True):
Leaving aside the unnecessary == True test—which is a bit of a risk, since the is_alive() method does not specifically promise to return True and False, just something that works boolean-ly—consider this description from the documentation (this link goes to py2k docs but py3k is the same, and your print statements imply your code is py2k anyway):
is_alive()
Return whether the process is alive.
Roughly, a process object is alive from the moment the start() method returns until the child process terminates.
Since we can't see the code for dosomething, it's hard to say whether these things ever terminate. Probably they do (by exiting), but if they don't, or don't soon enough, we could get problems here, where we just drop the message we pulled off the queue in the outer loop.
If they do terminate, we just drop the process reference from the array, by overwriting it:
processArray[i] = Process(...)
The previous value in processArray[i] is discarded. It's not clear if you may have saved this anywhere else, but if you have not, the Process instance gets discarded, and now it is actually impossible to call its join() method.
Some Python data structures tend to clean themselves up when abandoned (e.g., open streams flush output and close as needed), but the multiprocessing code appears not to auto-join() its children. So this could be the, or a, source of the problem.
Finally, whenever we do get to the else case in the outer loop, we have the same somewhat odd search for any alive processes—which, incidentally, can be written more clearly as:
if any(p.is_alive() for p in processArray):
as long as we don't care about which particular ones are alive, and which are not—and if none report themselves as alive, we reset the count, but never do anything with the variable processArray, so that each processArray[i] still holds the identity of the Process instance. (So at least we could call join on each of these, excluding any lost by overwriting.)
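A minimal fix along those lines (a sketch, not a drop-in replacement) would join each dead process before overwriting its slot:

for i in range(len(processArray)):
    if not processArray[i].is_alive():
        processArray[i].join()  # reap the finished child before dropping the reference
        processArray[i] = Process(target=dosomething,
                                  args=(dirName, uuid, guid, callback))
        processArray[i].start()
        break  # one freed slot is enough for the current message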
Rather than building your own Pool yourself, you are probably better off using multiprocessing.Pool and its apply and apply_async methods, as in miraculixx's answer.
Not sure what is wrong in the process.
It appears you are creating as many processes as there are messages, even when the maxProcess count is reached.
I am thinking I have leak by reusing the process slots.
There is no need to manage the processes yourself. Just use a process pool:
# before your while loop starts
from multiprocessing import Pool

pool = Pool(processes=max_process)
while 1:
    ...
    # instead of creating a new Process
    res = pool.apply_async(dosomething,
                           args=(dirName, uuid, guid, callback))

# after the while loop has finished
# -- wait to finish
pool.close()
pool.join()
Ways to submit jobs
Note that the Pool class supports several ways to submit jobs:
apply_async - one message at a time
map_async - a chunk of messages at a time
If messages arrive fast enough, it might be better to collect several of them (say 10 or 100 at a time, depending on the actual processing done) and use map_async to submit a "mini-batch" to the target function:
...
while True:
    messages = []
    # build mini-batch of messages
    while len(messages) < batch_size:
        ...  # get message
        messages.append((dirName, uuid, guid, callback))
    # note: map_async passes each tuple as a single argument to dosomething
    pool.map_async(dosomething, messages)
To avoid memory leaks left by dosomething you can ask the Pool to restart a process after it has consumed some number of messages:
max_tasks = 5 # some sensible number
Pool(max_processes, maxtasksperchild=max_tasks)
Going distributed
If with this approach the memory capacity is still exceeded, consider using a distributed approach, i.e. add more machines. Using Celery that would be pretty straightforward, coming from the above:
# tasks.py
@task
def dosomething(...):
    ...  # same code as before

# driver.py
while True:
    ...  # get messages as before
    res = dosomething.apply_async(args=(dirName, uuid, guid, callback))
https://docs.python.org/3/library/multiprocessing.html#multiprocessing.Array
What I’m trying to do
Create an array in MainProcess and send it through inheritance to any subsequent child processes. The child processes will change the array. The parent process will look out for the changes and act accordingly.
The problem
The parent process does not "see" any changes done by the child processes. However, the child processes do "see" the changes, i.e. if child 1 adds an item then child 2 will see that item, etc.
This is true for sARRAY and iARRAY, and iVALUE.
BUT
While the parent process seems to be oblivious to the array values it does take notice of the changes done to the iVALUE.
I don’t understand what I’m doing wrong.
UPDATE 2 https://stackoverflow.com/a/6455704/1267259
The main source of confusion is that multiprocessing uses separate processes and not threads. This means that any changes to object state
made by the children aren't automatically visible to the parent.
To clarify. What I want to do is possible, right?
https://stackoverflow.com/a/26554759/1267259
I mean, that's the purpose of multiprocessing's Array and Value, to communicate between child and parent processes? And iVALUE works, so...
I’ve found this Shared Array not shared correctly in python multiprocessing
But I don’t understand the answer "Assigning to values that have meaning in all processes seems to help:"
UPDATE 1
Found
Python : multiprocessing and Array of c_char_p
> "the assignment to arr[i] points arr[i] to a memory address that was
only meaningful to the subprocess making the assignment. The other
subprocesses retrieve garbage when looking at that address."
As I understand it this doesn't apply to this problem. The assignment
by one subprocess to the array does make sense to the other
subprocesses in this case. But why doesn't it make sense for the main
process?
And I am aware of "managers" but it feels like Array should suffice for this use case. I've read the manual but obviously I don't seem to get it.
UPDATE 3 Indeed, this works
manager = multiprocessing.Manager()
manage = manager.list(range(3))
So...
What am I doing wrong?
import multiprocessing
import ctypes
from time import sleep

class MainProcess:
    # keep track of process
    iVALUE = multiprocessing.Value('i', -1)  # this works
    # keep track of items
    sARRAY = multiprocessing.Array(ctypes.c_wchar_p, 1024)  # this works between child processes
    iARRAY = multiprocessing.Array(ctypes.c_int, 3)  # this works between child processes
    pLOCK = multiprocessing.Lock()

    def __init__(self):
        # create an index for each process
        self.sARRAY.value = [None] * 3
        self.iARRAY.value = [None] * 3

    def InitProcess(self):
        # list of items to process
        items = []
        item = (i for i in items)
        with multiprocessing.Pool(3) as pool:
            # main loop: keep looking for updated values
            while True:
                try:
                    pool.apply_async(self.worker, (next(item),), callback=eat_finished_cake)
                except StopIteration:
                    pass
                print(self.sARRAY)  # yields [None][None][None]
                print(self.iARRAY)  # yields [None][None][None]
                print(self.iVALUE)  # yields 1-3
            pool.close()
            pool.join()

    def worker(self, item):
        with self.pLOCK:
            self.iVALUE.value += 1
            self.sARRAY.value[self.iVALUE.value] = item  # value: 'item 1'
            self.iARRAY.value[self.iVALUE.value] = 2
        # on next child process run
        print(self.iVALUE.value)  # prints 1
        print(self.sARRAY.value)  # prints ['item 1'][None][None]
        print(self.iARRAY.value)  # prints [2][None][None]
        sleep(0.5)
        ...
        with self.pLOCK:
            self.iVALUE.value -= 1
UPDATE 4
Changing
pool.apply_async(self.worker, (next(item),))
To
x = pool.apply_async(self.worker, (next(item),))
print(x.get())
Or
x = pool.apply(self.worker, (next(item),))
print(x)
And having self.worker() return self.iARRAY.value or self.sARRAY.value does give back a variable with the updated value. This is not what I want to achieve, though; this wouldn't even require the use of Array to achieve...
So I need to clarify: in self.worker() I'm doing important heavy lifting that can take a long time, and I need to send information back to the main process, e.g. the progress, before the finished return value is sent to the callback.
I don't expect the finished worker result to be returned to the main method; that is to be handled by the callback function. I see now that omitting the callback in the code example could have given a different impression, sorry.
UPDATE 5
Re: Use numpy array in shared memory for multiprocessing
I've seen that answer and tried a variation of it using initializer() with a global var, passing the array through initargs, with no luck. I don't understand the use of numpy and of closing() there, and that code doesn't seem to access "arr" inside main(), although shared_arr is used, but only after p.join().
As far as I can see, the array is declared, then turned into a numpy array and inherited through init(x). My code should have the same behavior as that code so far.
One major difference seems to be how the array is accessed
I've only succeeded in setting and getting array values using the attribute value; when I tried
self.iARRAY[0] = 1 # instead of iARRAY.value = [None] * 3
self.iARRAY[1] = 1
self.iARRAY[2] = 1
print(self.iARRAY) # prints <SynchronizedArray wrapper for <multiprocessing.sharedctypes.c_int_Array_3 object at 0x7f9cfa8538c8>>
And I can't find a method to access and check the values (the attribute "value" gives an unknown method error)
Another major difference from that code is the prevention of data copying using the get_obj().
Isn't this a numpy issue?
assert np.allclose(((-1)**M)*tonumpyarray(shared_arr), arr_orig)
Not sure how to make use of that.
def worker(self, item):
    with self.pLOCK:
        self.iVALUE.value += 1
        self.sARRAY.value[self.iVALUE.value] = item  # value: 'item 1'
    with self.iARRAY.get_lock():
        arr = self.iARRAY.get_obj()
        arr[self.iVALUE.value] = 2  # and now ???
    sleep(0.5)
    ...
    with self.pLOCK:
        self.iVALUE.value -= 1
UPDATE 6
I've tried using multiprocessing.Process() instead of Pool() but the result is the same.
correct way to declare the global variable (in this case class attribute)
iARRAY = multiprocessing.Array(ctypes.c_int, range(3))
correct way to set value
self.iARRAY[n] = x
correct way to get value
self.iARRAY[n]
Not sure why the examples I've seen used Array(ctypes.c_int, 3) and iARRAY.value[n], but in this case that was wrong.
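Pulling these corrections together, here is a minimal runnable sketch (illustrative names; using Process instead of Pool) in which the parent does see the children's writes:

import ctypes
import multiprocessing

def worker(shared, n):
    shared[n] = n * 10  # child writes through the shared memory

if __name__ == '__main__':
    iARRAY = multiprocessing.Array(ctypes.c_int, range(3))
    ps = [multiprocessing.Process(target=worker, args=(iARRAY, i))
          for i in range(3)]
    for p in ps:
        p.start()
    for p in ps:
        p.join()
    print(iARRAY[:])  # the parent sees the children's writes: [0, 10, 20]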
This is your problem:
while True:
    try:
        pool.apply_async(self.worker, (next(item),))
    except StopIteration:
        pass
    print(self.sARRAY)  # yields [None][None][None]
    print(self.iARRAY)  # yields [None][None][None]
    print(self.iVALUE)  # yields 1-3
The function pool.apply_async() starts the subprocess running and returns immediately. You don't appear to be waiting for the workers to finish. For that, you might consider using a barrier.
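For example, instead of a barrier, one simple option (a sketch reusing the question's names) is to keep the AsyncResult handles that apply_async returns and wait on them before printing:

results = []
while True:
    try:
        results.append(pool.apply_async(self.worker, (next(item),)))
    except StopIteration:
        break  # note: `pass` here would spin forever
for r in results:
    r.get()  # blocks until that worker finishes (and re-raises its errors)
print(self.iARRAY[:])  # now reflects the workers' writes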
I have a function that must never be called with the same value simultaneously from two threads. To enforce this, I have a defaultdict that spawns new threading.Locks for a given key. Thus, my code looks similar to this:
from collections import defaultdict
import threading

lock_dict = defaultdict(threading.Lock)

def f(x):
    with lock_dict[x]:
        print "Locked for value x"
The problem is that I cannot figure out how to safely delete the lock from the defaultdict once it's no longer needed. Without doing this, my program has a memory leak that becomes noticeable when f is called with many different values of x.
I cannot simply del lock_dict[x] at the end of f, because in the scenario that another thread is waiting for the lock, then the second thread will lock a lock that's no longer associated with lock_dict[x], and thus two threads could end up simultaneously calling f with the same value of x.
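Concretely, the unsafe variant would look like this (a sketch; do_work is a placeholder):

def f(x):
    with lock_dict[x]:
        do_work(x)  # placeholder for the real work
    # UNSAFE: a thread already blocked in `with lock_dict[x]` above still
    # references the old lock, while new callers get a fresh lock from the
    # defaultdict - so two threads can end up inside f(x) at the same time
    del lock_dict[x]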
I'd use a different approach:
fcond = threading.Condition()
fargs = set()

def f(x):
    with fcond:
        while x in fargs:
            fcond.wait()
        fargs.add(x)  # this thread has exclusive rights to use `x`

    # do useful stuff with x
    # any other thread trying to call f(x) will
    # block in the .wait() above

    with fcond:
        fargs.remove(x)  # we're done with x
        fcond.notify_all()  # let blocked threads (if any) proceed
Conditions have a learning curve, but once it's climbed they make it much easier to write correct thread-safe, race-free code.
Thread safety of the original code
@JimMischel asked in a comment whether the original's use of defaultdict was subject to races. Good question!
The answer is - alas - "you'll have to stare at your specific Python's implementation".
Assuming the CPython implementation: if any of the code invoked by defaultdict to supply a default invokes Python code, or C code that releases the GIL (global interpreter lock), then 2 (or more) threads could "simultaneously" invoke lock_dict[x] with the same x not already in the dict, and:
Thread 1 sees that x isn't in the dict, gets a lock, then loses its timeslice (before setting x in the dict).
Thread 2 sees that x isn't in the dict, and also gets a lock.
One of those thread's locks ends up in the dict, but both threads execute f(x).
Staring at the source for 3.4.0a4+ (the current development head), defaultdict and threading.Lock are both implemented by C code that doesn't release the GIL. I don't recall whether earlier versions did or didn't, at various times, implement all or parts of defaultdict or threading.Lock in Python.
My suggested alternative code is full of stuff implemented in Python (all threading.Condition methods), but is race-free by design - even if you're using an old version of Python with sets also implemented in Python (the set is only accessed under the protection of the condition variable's lock).
One lock per argument
Without conditions, this seems to be much harder. In the original approach, I believe you need to keep a count of threads wanting to use x, and you need a lock to protect those counts and to protect the dictionary. The best code I've come up with for that is so long-winded that it seems sanest to put it in a context manager. To use, create an argument locker per function that needs it:
farglocker = ArgLocker() # for function `f()`
and then the body of f() can be coded simply:
def f(x):
    with farglocker(x):
        # only one thread at a time can run with argument `x`
        ...  # do useful stuff with x
Of course the condition approach could also be wrapped in a context manager. Here's the code:
import threading

class ArgLocker:
    def __init__(self):
        self.xs = dict()  # maps x to (lock, count) pair
        self.lock = threading.Lock()

    def __call__(self, x):
        return AllMine(self.xs, self.lock, x)

class AllMine:
    def __init__(self, xs, lock, x):
        self.xs = xs
        self.lock = lock
        self.x = x

    def __enter__(self):
        x = self.x
        with self.lock:
            xlock = self.xs.get(x)
            if xlock is None:
                xlock = threading.Lock()
                xlock.acquire()
                count = 0
            else:
                xlock, count = xlock
            self.xs[x] = xlock, count + 1
        if count:  # x was already known - wait for it
            xlock.acquire()
        assert xlock.locked()

    def __exit__(self, *args):
        x = self.x
        with self.lock:
            xlock, count = self.xs[x]
            assert xlock.locked()
            assert count > 0
            count -= 1
            if count:
                self.xs[x] = xlock, count
            else:
                del self.xs[x]
            xlock.release()
So which way is better? Using conditions ;-) That way is "almost obviously correct", but the lock-per-argument (LPA) approach is a bit of a head-scratcher. The LPA approach does have the advantage that when a thread is done with x, the only threads allowed to proceed are those wanting to use the same x; using conditions, the .notify_all() wakes all threads blocked waiting on any argument. But unless there's very heavy contention among threads trying to use the same arguments, this isn't going to matter much: using conditions, the threads woken up that aren't waiting on x stay awake only long enough to see that x in fargs is true, and then immediately block (.wait()) again.
I was wondering if it would be possible to start a new thread and update its argument when this argument gets a new value in the main part of the program, so something like this:
import thread
import time

i = 0

def foo(i):
    print i
    time.sleep(5)

thread.start_new_thread(foo, (i,))
while True:
    i = i + 1
Thanks a lot for any help!
An argument is just a value, like anything else. Passing the value just makes a new reference to the same value, and if you mutate that value, every reference will see it.
The fact that both the global variable and the function parameter have the same name isn't relevant here, and is a little confusing, so I'm going to rename one of them. Also, your foo function only does that print once (possibly before you even increment the value), then sleeps for 5 seconds, then finishes. You probably wanted a loop there; otherwise, you can't actually tell whether things are working or not.
So, here's an example:
import thread
import time

i = []

def foo(j):
    while True:
        print j
        time.sleep(5)

thread.start_new_thread(foo, (i,))
while True:
    i.append(1)
So, why doesn't your code work? Well, i = i+1 isn't mutating the value 0, it's assigning a new value, 0 + 1, to i. The foo function still has a reference to the old value, 0, which is unchanged.
Since integers are immutable, you can't directly solve this problem. But you can indirectly solve it very easily: replace the integer with some kind of wrapper that is mutable.
For example, you can write an IntegerHolder class with set and get methods; when you call i.set(i.get() + 1), and the other reference calls i.get(), it will see the new value.
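A minimal sketch of such a holder (with a lock added, since i.set(i.get() + 1) from two threads could otherwise still lose updates):

import threading

class IntegerHolder(object):
    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()
    def get(self):
        with self._lock:
            return self._value
    def set(self, value):
        with self._lock:
            self._value = value
    def increment(self):
        # a combined read-modify-write, so no update can be lost
        with self._lock:
            self._value += 1
            return self._value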
Or you can just use a list as a holder. Lists are mutable, and hold zero or more elements. When you do i[0] = i[0] + 1, that replaces i[0] with a new integer value, but i is still the same list value, and that's what the other reference is pointing at. So:
i = [0]

def foo(j):
    print j[0]
    time.sleep(5)

thread.start_new_thread(foo, (i,))
while True:
    i[0] = i[0] + 1
This may seem a little hacky, but it's actually a pretty common Python idiom.
Meanwhile, the fact that foo is running in another thread creates another problem.
In theory, threads run simultaneously, and there's no ordering of any data accesses between them. Your main thread could be running on core 0, and working on a copy of i that's in core 0's cache, while your foo thread is running on core 1, and working on a different copy of i that's in core 1's cache, and there is nothing in your code to force the caches to get synchronized.
In practice, you will often get away with this, especially in CPython. But to actually know when you can get away with it, you have to learn how the Global Interpreter Lock works, and how the interpreter handles variables, and (in some cases) even how your platform's cache coherency and your C implementation's memory model and so on work. So, you shouldn't rely on it. The right thing to do is to use some kind of synchronization mechanism to guard access to i.
As a side note, you should also almost never use thread instead of threading, so I'm going to switch that as well.
import threading
import time

i = [0]
lock = threading.Lock()

def foo(j):
    while True:
        with lock:
            print j[0]
        time.sleep(5)

t = threading.Thread(target=foo, args=(i,))
t.start()
while True:
    with lock:
        i[0] = i[0] + 1
One last thing: If you create a thread, you need to join it later, or you can't quit cleanly. But your foo thread never exits, so if you try to join it, you'll just block forever.
For simple cases like this, there's a simple solution. Before calling t.start(), do t.daemon = True. This means when your main thread quits, the background thread will be automatically killed at some arbitrary point. That's obviously a bad thing if it's, say, writing to a file or a database. But in your case, it's not doing anything persistent or dangerous.
For more realistic cases, you generally want to create some way to signal between the two threads. Often you've already got something for the thread to wait on—a Queue, a file object or collection of them (via select), etc. If not, just create a flag variable protected by a lock (or condition or whatever is appropriate).
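For instance, a stop flag can be as simple as a threading.Event (a minimal sketch):

import threading
import time

stop = threading.Event()

def worker():
    while not stop.is_set():
        time.sleep(0.1)  # do a chunk of work, then re-check the flag

t = threading.Thread(target=worker)
t.start()
time.sleep(1)
stop.set()  # signal the thread to exit...
t.join()    # ...so joining now returns promptly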
Try globals.
import thread
import time

i = 0

def foo():
    print i
    time.sleep(5)

thread.start_new_thread(foo, ())
while True:
    i = i + 1
You could also pass a hash holding the variables you need.
args = {'i': 0}

def foo(args):
    print args['i']
    time.sleep(5)

thread.start_new_thread(foo, (args,))
while True:
    args['i'] = args['i'] + 1
You might also want to use a thread lock.
import thread
import time

lock = thread.allocate_lock()
args = {'i': 0}

def foo(args):
    with lock:
        print args['i']
    time.sleep(5)

thread.start_new_thread(foo, (args,))
while True:
    with lock:
        args['i'] = args['i'] + 1
Hope this helped.