Dynamically allocating and destroying mutexes? - Python

I have an application that's built on top of Eventlet.
I'm trying to write a decent decorator for synchronizing calls to certain methods across threads.
The decorator currently looks something like this:
from eventlet import semaphore  # assumed import; the question says the app is built on Eventlet

_semaphores_semaphore = semaphore.Semaphore()
_semaphores = {}

def synchronized(name):
    def wrap(f):
        def inner(*args, **kwargs):
            # Grab the lock protecting _semaphores.
            with _semaphores_semaphore:
                # If the named semaphore does not yet exist, create it.
                if name not in _semaphores:
                    _semaphores[name] = semaphore.Semaphore()
                sem = _semaphores[name]
            with sem:
                return f(*args, **kwargs)
        return inner
    return wrap
This works fine and looks thread safe to me, although I'm a bit rusty on thread safety and locking.
The problem is a specific, existing use of semaphores elsewhere in the application, which I want to convert to this decorator, that creates semaphores on the fly. Based on user input, it has to create a file. It checks a dict for an existing semaphore for that file; if there is none, it creates one, then locks it. Once it's done and has released the lock, it checks whether the semaphore has been locked again (by another green thread in the meantime), and if not, it deletes the semaphore. This code was written assuming green threads and is safe in that context, but I can't work out how to convert it to use my decorator.
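For clarity, that existing pattern looks roughly like this (a paraphrase with hypothetical names, not the actual application code); it is only safe because green threads are never preempted between the release and the final check:

file_semaphores = {}

def process_file(path):
    if path not in file_semaphores:
        file_semaphores[path] = semaphore.Semaphore()
    sem = file_semaphores[path]
    with sem:
        write_file(path)  # hypothetical: the actual work on the file
    # Safe under cooperative scheduling only: no other green thread can
    # have run between releasing sem above and this check.
    if sem.balance == 1:  # nobody holds it and nobody is waiting
        del file_semaphores[path]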
If I don't care about cleaning up the possibly-never-to-be-used-again semaphores (there could be hundreds of thousands of these), I'm fine. If I do want to clean them up, I'm not sure what to do.
To delete the semaphore, it seems obvious that I need to be holding _semaphores_semaphore, since I'm manipulating the _semaphores dict. I also have to do something with the specific semaphore, but everything I can think of seems to be racy:
* While inside the "with sem:" block, I could grab the _semaphores_semaphore and delete sem from _semaphores. However, other threads might be blocked waiting for it (at "with sem:"), and if a new thread comes along wanting to touch the same resource, it will not find the same semaphore in _semaphores but will instead create a new one => fail.
* I could improve this slightly by checking the balance of sem to see if another thread is already waiting for me to release it. If so, leave it alone; if not, delete it. This way, the last thread waiting to act on the resource deletes it. However, if a thread has just left the "with _semaphores_semaphore:" block but hasn't yet made it to "with sem:", I have the same problem as before => fail.
It feels like I'm missing something obvious, but I can't work out what it is.

I think you might be able to solve it with a reader-writer lock (a.k.a. shared-exclusive lock) on the _semaphores dict.
This is untested code that shows the principle; an RWLock implementation can be found at e.g. http://code.activestate.com/recipes/413393-multiple-reader-one-writer-mrow-resource-locking/
_semaphores_rwlock = RWLock()
_semaphores = {}

def synchronized(name):
    def wrap(f):
        def inner(*args, **kwargs):
            lock = _semaphores_rwlock.reader()
            # If the named semaphore does not yet exist, create it. We
            # must not request the writer lock while still holding the
            # reader lock (that would deadlock), so release the reader
            # first and re-check after reacquiring, since another writer
            # may have created the semaphore in the meantime.
            if name not in _semaphores:
                lock.release()
                lock = _semaphores_rwlock.writer()
                if name not in _semaphores:
                    _semaphores[name] = semaphore.Semaphore()
            sem = _semaphores[name]
            with sem:
                retval = f(*args, **kwargs)
            lock.release()
            return retval
        return inner
    return wrap
When you want to clean up, you do:
wlock = _semaphores_rwlock.writer()  # this might take a while; it waits for all readers to release
cleanup(_semaphores)
wlock.release()

mchro's answer didn't work for me since it blocks all threads on a single semaphore whenever one thread needs to create a new semaphore.
The answer that I came up with is to keep counters of occupants between the two transactions with _semaphores (which are both done behind the same mutex):
A: get semaphore
A1: danger zone
B: "with sem:" block etc.
C: clean up semaphore
The problem is knowing how many threads are between A and C. The semaphore's own counter doesn't tell you that, since someone may be in A1. The answer is to keep a counter of entrants along with each semaphore in _semaphores, increment it at A, decrement it at C; if it's at 0, you know there's no one else in A-C with the same key and you can safely delete it.
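A minimal sketch of that entrant counter, reusing the question's eventlet semaphore module (the list-as-entry layout is my own illustration):

_semaphores_semaphore = semaphore.Semaphore()
_semaphores = {}  # name -> [semaphore, number of threads between A and C]

def synchronized(name):
    def wrap(f):
        def inner(*args, **kwargs):
            # A: get or create the semaphore and count ourselves in.
            with _semaphores_semaphore:
                if name not in _semaphores:
                    _semaphores[name] = [semaphore.Semaphore(), 0]
                entry = _semaphores[name]
                entry[1] += 1
            try:
                # A1/B: the danger zone, then the critical section.
                with entry[0]:
                    return f(*args, **kwargs)
            finally:
                # C: count ourselves out; the last one out deletes the entry.
                with _semaphores_semaphore:
                    entry[1] -= 1
                    if entry[1] == 0:
                        del _semaphores[name]
        return inner
    return wrap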

Related

Call method on many objects in parallel

I wanted to use concurrency in Python for the first time. So I started reading a lot about Python concurrency (GIL, threads vs processes, multiprocessing vs concurrent.futures vs ...) and saw a lot of convoluted examples, even in examples using the high-level concurrent.futures library.
So I decided to just start trying stuff and was surprised with the very, very simple code I ended up with:
from concurrent.futures import ThreadPoolExecutor

class WebHostChecker(object):
    def __init__(self, websites):
        self.webhosts = []
        for website in websites:
            self.webhosts.append(WebHost(website))

    def __iter__(self):
        return iter(self.webhosts)

    def check_all(self):
        # sequential:
        #for webhost in self:
        #    webhost.check()
        # threaded:
        with ThreadPoolExecutor(max_workers=10) as executor:
            executor.map(lambda webhost: webhost.check(), self.webhosts)

class WebHost(object):
    def __init__(self, hostname):
        self.hostname = hostname

    def check(self):
        print("Checking {}".format(self.hostname))
        self.check_dns()  # only modifies internal state, i.e.: sets self.dns
        self.check_http() # only modifies internal state, i.e.: sets self.http
Using the classes looks like this:
webhostchecker = WebHostChecker(["urla.com", "urlb.com"])
webhostchecker.check_all() # -> this calls .check() on all WebHost instances in parallel
The relevant multiprocessing/threading code is only 3 lines. I barely had to modify my existing code, which is what I had hoped for when I first wrote the sequential version, but had started to doubt after reading the many examples online.
And... it works! :)
It perfectly distributes the IO-waiting among multiple threads and runs in less than 1/3 of the time of the original program.
So, now, my question(s):
* What am I missing here?
* Could I implement this differently? (Should I?)
* Why are other examples so convoluted? (Although I must say I couldn't find an exact example doing a method call on multiple objects.)
* Will this code get me in trouble when I expand my program with features/code I cannot predict right now?
I think I already know of one potential problem, and it would be nice if someone could confirm my reasoning: if WebHost.check() also becomes CPU bound, I won't be able to simply swap ThreadPoolExecutor for ProcessPoolExecutor, because every worker process will get cloned versions of the WebHost instances, and I would have to code something to sync those clones back to the originals. Right?
Any insights/comments/remarks/improvements/... that can bring me to greater understanding will be much appreciated! :)
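On that last point, the reasoning is essentially right: a process pool pickles the instances into the workers, so mutations there never reach the parent's objects. A hypothetical sketch (not from the question) of what a process-based version would need; note that a lambda can't be pickled, so the work moves to a module-level function, and the mutated copies are shipped back and collected:

from concurrent.futures import ProcessPoolExecutor

def check_and_return(webhost):
    webhost.check()  # mutates the copy living in the worker process
    return webhost   # ship the mutated copy back to the parent

def check_all_processes(webhosts):
    with ProcessPoolExecutor(max_workers=4) as executor:
        # The originals are untouched; use the returned copies instead.
        return list(executor.map(check_and_return, webhosts))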
Ok, so I'll add my own first gotcha:
If webhost.check() raises an Exception, the thread just ends and self.dns and/or self.http might NOT have been set. However, with the current code, you won't see the Exception UNLESS you also access the executor.map() results! This left me wondering why some objects raised AttributeErrors after running check_all() :)
This can easily be fixed by just evaluating every result (which is always None, because I'm not letting .check() return anything). You can do it after all threads have run, or during. I chose to let exceptions be raised during (i.e. within the with statement), so the program stops at the first unexpected error:
def check_all(self):
    with ThreadPoolExecutor(max_workers=10) as executor:
        # this alone works, but does not raise any exceptions from the threads:
        #executor.map(lambda webhost: webhost.check(), self.webhosts)
        for i in executor.map(lambda webhost: webhost.check(), self.webhosts):
            pass
I guess I could also use list(executor.map(lambda webhost: webhost.check(), self.webhosts)) but that would unnecessarily use up memory.

Thread blocks in an RLock

I have this implementation:
import threading

def mlock(f):
    '''Method lock. Uses the instance's lock to execute the method.'''
    def wrapper(self, *args, **kwargs):
        with self._lock:
            res = f(self, *args, **kwargs)
        return res
    return wrapper

class Lockable(object):
    def __init__(self):
        self._lock = threading.RLock()
Which I use in several places, for example:
class Fifo(Lockable):
    '''Implementation of a FIFO. It will grow until the given maxsize;
    then it will drop the head to add new elements.'''
    def __init__(self, maxsize, name='FIFO', data=None, inserted=0, dropped=0):
        self.maxsize = maxsize
        self.name = name
        self.inserted = inserted
        self.dropped = dropped
        self._fifo = []
        self._cnt = None
        Lockable.__init__(self)
        if data:
            for d in data:
                self.put(d)

    @mlock
    def __len__(self):
        length = len(self._fifo)
        return length
    ...
The application is quite complex, but it works well. Just to make sure, I have been doing stress tests of the running service, and I find that it sometimes (rarely) deadlocks in the mlock. I assume another thread is holding the lock and not releasing it. How can I debug this? Please note that:
* it is very difficult to reproduce: I need hours of testing to get a deadlock
* the application is running in the background
* once it deadlocks, I cannot interact with it anymore
I would like to know:
* which thread is holding the lock?
* why is it not being released? I am using a context manager to acquire the lock, so it should always be released. Where is the bug?!
What options do I have to further debug this?
I have been checking whether there is any way of knowing which thread is holding an RLock, but it seems there is no API for this.
I don't think there's an easy solution for this, but it can be done with some work.
Personally, I've found the following useful (albeit in C++).
Start by creating a Lockable base that tracks threads' interactions with it. A Lockable object will use an additional (non-recursive) lock for protecting a dictionary mapping thread ids to their interactions with it:
* When a thread tries to lock, it (locks and) creates an entry.
* When it acquires the lock, it (locks and) modifies the entry.
* When it releases the lock, it (locks and) removes the entry.
Additionally, a Lockable object will have a low-priority watchdog thread that wakes up very rarely (once every several minutes) and checks for indications of a deadlock (approximated by the event that one thread has been holding the lock for a long time while at least one other thread has been waiting for it).
The entry for a thread should therefore include:
* the operation's time
* the stacktrace info leading to the operation.
The problem is that this can alter the relative timing of threads, which might cause your program to go into different execution paths than it normally does.
Here you need to get creative. You might need to also induce (random) time lapses in these (and possibly other) operations.
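A minimal Python sketch of the tracking idea (all names hypothetical). It simplifies the scheme above by recording only the current holder rather than the full per-thread dictionary, and leaves out the watchdog thread, which would just call report() periodically:

import threading
import time
import traceback

class TrackedRLock(object):
    def __init__(self):
        self._lock = threading.RLock()
        self._meta = threading.Lock()  # non-recursive lock guarding the fields below
        self._owner = None             # name of the thread holding _lock
        self._stack = None             # stack trace of the outermost acquire
        self._since = None             # time of the outermost acquire
        self._depth = 0                # recursion depth, since _lock is an RLock

    def __enter__(self):
        self._lock.acquire()
        with self._meta:
            self._depth += 1
            if self._depth == 1:       # record only the outermost acquire
                self._owner = threading.current_thread().name
                self._stack = "".join(traceback.format_stack())
                self._since = time.time()
        return self

    def __exit__(self, *exc_info):
        with self._meta:
            self._depth -= 1
            if self._depth == 0:
                self._owner = self._stack = self._since = None
        self._lock.release()

    def report(self, threshold=60):
        # Approximates "deadlocked" as "held for a long time".
        with self._meta:
            if self._owner is not None and time.time() - self._since > threshold:
                print("%s has held the lock for %.0fs; acquired at:\n%s"
                      % (self._owner, time.time() - self._since, self._stack))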

Locking with Tornado and multiple instances

I'm fairly new to Python and Tornado, so please forgive me if I'm overcomplicating a long-solved problem, but I didn't find much else out there.
I'm running multiple Tornado instances (multiple instances per server, multiple servers) for an application and have some tasks that only one instance should perform, such as scheduling certain events in the application. Instead of running a dedicated instance that performs this task, I'd like to have an opportunistic approach where the first instance that tries gets to do the job.
Part of my solution is a database-based locking mechanism (MongoDB findAndUpdate). The code below seems to work just fine, but I'd like some advice on whether this is a good solution or whether there are ready-made locking and task distribution solutions out there for Tornado.
This is the decorator that acquires the lock when entering the function and releases it afterwards:
def locking(fn):
    @tornado.gen.engine
    def wrapped(wself, *args, **kwargs):
        @tornado.gen.engine
        def wrapped_callback(*cargs, **ckwargs):
            logging.info("release lock")
            yield tornado.gen.Task(lock.release_lock)
            logging.info("release lock done")
            original_callback(*cargs, **ckwargs)

        logging.info("acquire lock")
        yield tornado.gen.Task(model.SchedulerLock.initialize_lock, area_id=wself.area_id)
        lock = yield tornado.gen.Task(model.SchedulerLock.acquire_lock, area_id=wself.area_id)
        if lock:
            logging.info("acquire lock done")
            original_callback = kwargs['callback']
            kwargs['callback'] = wrapped_callback
            fn(wself, *args, **kwargs)
        else:
            logging.info("acquire lock not possible, postponed")
            ioloop = tornado.ioloop.IOLoop.instance()
            ioloop.add_timeout(datetime.timedelta(seconds=2),
                               functools.partial(wrapped, wself, *args, **kwargs))
    return wrapped
The acquire_lock method returns the lock object, or False if the lock could not be acquired.
Any thoughts on this? I know that the lock is only half of the solution, as I also need a mechanism that ensures that a one-off task only gets done once. However, this can be achieved very similarly. Is there anything that solves the problem more elegantly?
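For reference, the database half of such a lock can be a single atomic update. A minimal sketch using blocking pymongo (collection and field names are hypothetical; the question's model.SchedulerLock presumably wraps something similar, and real Tornado code would go through an async driver):

import datetime
import pymongo

locks = pymongo.MongoClient().myapp.scheduler_locks  # hypothetical collection

def initialize_lock(area_id):
    # Make sure the lock document exists, unlocked, before first use.
    locks.update_one({"_id": area_id},
                     {"$setOnInsert": {"locked": False}},
                     upsert=True)

def acquire_lock(area_id):
    # Atomically flip locked: False -> True. Returns the document if we
    # won the race, or None if another instance already holds the lock.
    return locks.find_one_and_update(
        {"_id": area_id, "locked": False},
        {"$set": {"locked": True,
                  "locked_at": datetime.datetime.utcnow()}})

def release_lock(area_id):
    locks.update_one({"_id": area_id}, {"$set": {"locked": False}})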

Is this an insane implementation of producer consumer type thing?

# file1.py
import subprocess
import threading

class _Producer(object):
    def __init__(self):
        self.chunksize = 6220800
        with open('/dev/zero') as f:
            self.thing = f.read(self.chunksize)
        self.n = 0
        self.start()

    def start(self):
        def produce():
            self._proc = subprocess.Popen(['producer_proc'], stdout=subprocess.PIPE)
            while True:
                self.thing = self._proc.stdout.read(self.chunksize)
                if len(self.thing) != self.chunksize:
                    msg = 'Expected {0} bytes. Read {1} bytes'.format(
                        self.chunksize, len(self.thing))
                    raise Exception(msg)
                self.n += 1

        t = threading.Thread(target=produce)
        t.daemon = True
        t.start()
        self._thread = t

    def stop(self):
        if self._thread.is_alive():
            self._proc.terminate()
            self._thread.join(1)

producer = _Producer()
producer.start()
I have written some code more or less like the above design, and now I want to be able to consume the output of producer_proc in other files by going:
# some_other_file.py
import file1
my_thing = file1.producer.thing
Multiple other consumers might be grabbing a reference to file1.producer.thing, and they all need to consume the output of the same producer_proc. The producer_proc should never be blocked. Is this a sane implementation? Does the Python GIL make it thread safe, or do I need to reimplement it using a Queue to get data off the worker thread? Do consumers need to explicitly make a copy of the thing?
I guess I am trying to implement something like the producer/consumer pattern or the observer pattern, but I'm not really clear on all the technical details of design patterns.
* A single producer is constantly making things
* Multiple consumers use things at arbitrary times
* producer.thing should be replaced by a fresh thing as soon as the new one is available; most things will go unused, but that's OK
* It's OK for multiple consumers to read the same thing, or to read the same thing twice in succession. They only want to be sure they have the most recent thing when they ask for it, not some stale old thing.
* A consumer should be able to keep using a thing as long as they have it in scope, even though the producer may have already overwritten its self.thing with a fresh new thing.
Given your (unusual!) requirements, your implementation seems correct. In particular,
If you're only updating one attribute, the Python GIL should be sufficient. Single bytecode instructions are atomic.
If you do anything more complex, add locking! It's basically harmless anyway - if you cared about performance or multicore scalability, you probably wouldn't be using Python!
In particular, be aware that self.thing and self.n in this code are updated in separate bytecode instructions. The GIL could be released/acquired between them, so you can't get a consistent view of the two unless you add locking. If you're not going to do that, I'd suggest removing self.n as an "attractive nuisance" (easily misused), or at least adding a comment/docstring with this caveat.
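If you do want both values, a minimal sketch of that locking (class and method names are hypothetical) would give consumers a consistent (thing, n) snapshot:

import threading

class LockedProducer(object):
    def __init__(self):
        self._lock = threading.Lock()
        self.thing = None
        self.n = 0

    def publish(self, thing):
        # Called by the producer thread for each fresh chunk.
        with self._lock:
            self.thing = thing
            self.n += 1

    def latest(self):
        # Consumers get a matching pair; the string itself is immutable,
        # so the reference stays valid after the lock is released.
        with self._lock:
            return self.thing, self.n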
Consumers don't need to make a copy. You're not ever mutating a particular object pointed to by self.thing (and couldn't with string objects; they're immutable) and Python is garbage-collected, so as long as a consumer grabbed a reference to it, it can keep accessing it without worrying too much about what other threads are doing. The worst that could happen is your program using a lot of memory from several generations of self.thing being kept alive.
I'm a bit curious where your requirements came from. In particular, that you don't care if a thing is never used or used many times.

Is this Python code thread safe?

import time
import threading

class test(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
        self.doSkip = False
        self.count = 0

    def run(self):
        while self.count < 9:
            self.work()

    def skip(self):
        self.doSkip = True

    def work(self):
        self.count += 1
        time.sleep(1)
        if self.doSkip:
            print "skipped"
            self.doSkip = False
            return
        print self.count

t = test()
t.start()
while t.count < 9:
    time.sleep(2)
    t.skip()
Thread-safe in which way? I don't see any part you might want to protect here.
skip may reset the doSkip at any time, so there's not much point in locking it. You don't have any resources that are accessed at the same time - so IMHO nothing can be corrupted / unsafe in this code.
The only part that might behave differently depending on locking/counting is how many skips you expect per call to .skip(). If you want to ensure that every skip() results in one skipped call to .work(), you should change doSkip into a counter that is protected by a lock on both increment and compare/decrement. Currently one thread might turn doSkip on after the check but before the doSkip reset. It doesn't matter in this example, but in a real situation (with more code) it might make a difference.
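A minimal sketch of that counter, reusing the question's names where possible (shouldSkip is a hypothetical helper that work() would call):

import threading

class test(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
        self.skipCount = 0
        self._skipLock = threading.Lock()
        self.count = 0

    def skip(self):
        with self._skipLock:
            self.skipCount += 1  # every skip() is recorded, none lost

    def shouldSkip(self):
        # Called from work(); consumes exactly one pending skip.
        with self._skipLock:
            if self.skipCount > 0:
                self.skipCount -= 1
                return True
            return False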
Whenever the test of a mutex boolean (e.g. if self.doSkip) is separate from the set of that boolean, you will probably have threading problems.
The rule is that your thread will get swapped out at the most inconvenient time. That is, after the test and before the set. Moving them closer together reduces the window for screw-ups but does not eliminate them. You almost always need a specially created mechanism from the language or kernel to fully close that window.
The threading library has Semaphores that can be used to synchronize threads and/or create critical sections of code.
Apparently there isn't any critical resource, so I'd say it's thread-safe.
But as usual you can't predict in which order the two threads will be blocked/run by the scheduler.
This is, and will stay, thread safe as long as you don't share data between threads.
If another thread needs to read/write data from/to your thread class, then this won't be thread safe unless you protect the data with some synchronization mechanism (like locks).
To elaborate on DanM's answer, conceivably this could happen:
Thread 1: t.skip()
Thread 2: if self.doSkip: print 'skipped'
Thread 1: t.skip()
Thread 2: self.doSkip = False
etc.
In other words, while you might expect to see one "skipped" for every call to t.skip(), this sequence of events would violate that.
However, because of your sleep() calls, I think this sequence of events is actually impossible.
(unless your computer is running really slowly)
