According to a number of sources, including this question, passing a runnable as the target parameter in __init__ (with or without args and kwargs) is preferable to extending the Thread class.
If I create a runnable, how can I pass the thread it is running on as self to it without extending the Thread class? For example, the following would work fine:
class MyTask(Thread):
def run(self):
print(self.name)
MyTask().start()
However, I can't see a good way to get this version to work:
def my_task(t):
print(t.name)
Thread(target=my_task, args=(), kwargs={}).start()
This question is a followup to Python - How can I implement a 'stoppable' thread?, which I answered, but possibly incompletely.
Update
I've thought of a hack to do this using current_thread():
def my_task():
print(current_thread().name)
Thread(target=my_task).start()
Problem: calling a function to get a parameter that should ideally be passed in.
Update #2
I have found an even hackier solution that makes current_thread seem much more attractive:
class ThreadWithSelf(Thread):
def __init__(self, **kwargs):
args = kwargs.get('args', ())
args = (self,) + tuple(args)
kwargs[args] = args
super().__init__(**kwargs)
ThreadWithSelf(target=my_task).start()
Besides being incredibly ugly (e.g. by forcing the user to use keywords only, even if that is the recommended way in the documentation), this completely defeats the purpose of not extending Thread.
Update #3
Another ridiculous (and unsafe) solution: to pass in a mutable object via args and to update it afterwards:
def my_task(t):
print(t[0].name)
container = []
t = Thread(target=my_task, args=(container,))
container[0] = t
t.start()
To avoid synchronization issues, you could kick it up a notch and implement another layer of ridiculousness:
def my_task(t, i):
print(t[i].name)
container = []
container[0] = Thread(target=my_task, args=(container, 0))
container[1] = Thread(target=my_task, args=(container, 1))
for t in container:
t.start()
I am still looking for a legitimate answer.
It seems like your goal is to get access to the thread currently executing a task from within the task itself. You can't add the thread as an argument to the threading.Thread constructor, because it's not yet constructed. I think there are two real options.
If your task runs many times, potentially on many different threads, I think the best option is to use threading.current_thread() from within the task. This gives you access directly to the thread object, with which you can do whatever you want. This seems to be exactly the kind of use-case this function was designed for.
On the other hand, if your goal is implement a thread with some special characteristics, the natural choice is to subclass threading.Thread, implementing whatever extended behavior you wish.
Also, as you noted in your comment, insinstance(current_thread(), ThreadSubclass) will return True, meaning you can use both options and be assured that your task will have access to whatever extra behavior you've implemented in your subclass.
The simplest and most readable answer here is: use current_thread(). You can use various weird ways to pass the thread as a parameter, but there's no good reason for that. Calling current_thread() is the standard approach which is shorter than all the alternatives and will not confuse other developers. Don't try to overthink/overengineer this:
def runnable():
thread = threading.current_thread()
Thread(target=runnable).start()
If you want to hide it a bit and change for aesthetic reasons, you can try:
def with_current_thread(f):
return f(threading.current_thread())
#with_current_thread
def runnable(thread):
print(thread.name)
Thread(target=runnable).start()
If this is not good enough, you may get better answers by describing why you think the parameter passing is better / more correct for your use case.
I'm writing a library which is using Tornado Web's tornado.httpclient.AsyncHTTPClient to make requests which gives my code a async interface of:
async def my_library_function():
return await ...
I want to make this interface optionally serial if the user provides a kwarg - something like: serial=True. Though you can't obviously call a function defined with the async keyword from a normal function without await. This would be ideal - though almost certain imposible in the language at the moment:
async def here_we_go():
result = await my_library_function()
result = my_library_function(serial=True)
I'm not been able to find anything online where someones come up with a nice solution to this. I don't want to have to reimplement basically the same code without the awaits splattered throughout.
Is this something that can be solved or would it need support from the language?
Solution (though use Jesse's instead - explained below)
Jesse's solution below is pretty much what I'm going to go with. I did end up getting the interface I originally wanted by using a decorator. Something like this:
import asyncio
from functools import wraps
def serializable(f):
#wraps(f)
def wrapper(*args, asynchronous=False, **kwargs):
if asynchronous:
return f(*args, **kwargs)
else:
# Get pythons current execution thread and use that
loop = asyncio.get_event_loop()
return loop.run_until_complete(f(*args, **kwargs))
return wrapper
This gives you this interface:
result = await my_library_function(asynchronous=True)
result = my_library_function(asynchronous=False)
I sanity checked this on python's async mailing list and I was lucky enough to have Guido respond and he politely shot it down for this reason:
Code smell -- being able to call the same function both asynchronously
and synchronously is highly surprising. Also it violates the rule of
thumb that the value of an argument shouldn't affect the return type.
Nice to know it's possible though if not considered a great interface. Guido essentially suggested Jesse's answer and introducing the wrapping function as a helper util in the library instead of hiding it in a decorator.
When you want to call such a function synchronously, use run_until_complete:
asyncio.get_event_loop().run_until_complete(here_we_go())
Of course, if you do this often in your code, you should come up with an abbreviation for this statement, perhaps just:
def sync(fn, *args, **kwargs):
return asyncio.get_event_loop().run_until_complete(fn(*args, **kwargs))
Then you could do:
result = sync(here_we_go)
Lots of doc write : 'apply' is for sync while 'apply_async' is for async.
And I read source code of multiprocessing (in file multiprocessing/pool.py), it says:
def apply(self, func, args=(), kwds={}):
assert self._state == RUN
return self.apply_async(func, args, kwds).get()
...
def apply_async(self, func, args=(), kwds={}, callback=None):
assert self._state = RUN
....
return result
It seems that apply just call apply_async, the only difference is their return values.
So my question is:
What's the real difference between sync and async ? and Why?
The huge difference is the .get() at the end of:
return self.apply_async(func, args, kwds).get()
apply_async() on its own does not block the caller: the call to apply_async() returns at once, and gives you back an AsyncResult object. Such objects have (among others) a .get() method, which blocks until the invoked process finishes running func(*args, **kwds) and returns its result.
Since apply() blocks until the result is ready, it's impossible to get more than one client process working simultaneously if apply() is all you use. Sometimes that's what you want, but not usually. Using apply_async() instead you can fire off as many tasks as you like in parallel, and retrieve their results later.
The difference is in the get() function. Execution is blocked until the apply call is done. apply_async will return immediately an ApplyResult object on which you must call get() to have your return value. Furthermore, the async version of the call supports a callback which will be executed when the execution is done, allowing event-driven operations.
If you supply multiple functions to apply_async, their return order is not guaranteed to be the same as submission order. You should check the map() function to get results in order.
I'm trying to make a memoize decorator that works with multiple threads.
I understood that I need to use the cache as a shared object between the threads, and acquire/lock the shared object. I'm of course launching the threads:
for i in range(5):
thread = threading.Thread(target=self.worker, args=(self.call_queue,))
thread.daemon = True
thread.start()
where worker is:
def worker(self, call):
func, args, kwargs = call.get()
self.returns.put(func(*args, **kwargs))
call.task_done()
The problem starts, of course, when I'm sending a function decorated with a memo function (like this) to many threads at the same time.
How can I implement the memo's cache as a shared object among threads?
The most straightforward way is to employ a single lock for the entire cache, and require that any writes to the cache grab the lock first.
In the example code you posted, at line 31, you would acquire the lock and check to see if the result is still missing, in which case you would go ahead and compute and cache the result. Something like this:
lock = threading.Lock()
...
except KeyError:
with lock:
if key in self.cache:
v = self.cache[key]
else:
v = self.cache[key] = f(*args,**kwargs),time.time()
The example you posted stores a cache per function in a dictionary, so you'd need to store a lock per function as well.
If you were using this code in a highly contentious environment, though, it would probably be unacceptably inefficient, since threads would have to wait on each other even if they weren't calculating the same thing. You could probably improve on this by storing a lock per key in your cache. You'll need to globally lock access to the lock storage as well, though, or else there's a race condition in creating the per-key locks.
At the moment, I'm doing stuff like the following, which is getting tedious:
run_once = 0
while 1:
if run_once == 0:
myFunction()
run_once = 1:
I'm guessing there is some more accepted way of handling this stuff?
What I'm looking for is having a function execute once, on demand. For example, at the press of a certain button. It is an interactive app which has a lot of user controlled switches. Having a junk variable for every switch, just for keeping track of whether it has been run or not, seemed kind of inefficient.
I would use a decorator on the function to handle keeping track of how many times it runs.
def run_once(f):
def wrapper(*args, **kwargs):
if not wrapper.has_run:
wrapper.has_run = True
return f(*args, **kwargs)
wrapper.has_run = False
return wrapper
#run_once
def my_function(foo, bar):
return foo+bar
Now my_function will only run once. Other calls to it will return None. Just add an else clause to the if if you want it to return something else. From your example, it doesn't need to return anything ever.
If you don't control the creation of the function, or the function needs to be used normally in other contexts, you can just apply the decorator manually as well.
action = run_once(my_function)
while 1:
if predicate:
action()
This will leave my_function available for other uses.
Finally, if you need to only run it once twice, then you can just do
action = run_once(my_function)
action() # run once the first time
action.has_run = False
action() # run once the second time
Another option is to set the func_code code object for your function to be a code object for a function that does nothing. This should be done at the end of your function body.
For example:
def run_once():
# Code for something you only want to execute once
run_once.func_code = (lambda:None).func_code
Here run_once.func_code = (lambda:None).func_code replaces your function's executable code with the code for lambda:None, so all subsequent calls to run_once() will do nothing.
This technique is less flexible than the decorator approach suggested in the accepted answer, but may be more concise if you only have one function you want to run once.
Run the function before the loop. Example:
myFunction()
while True:
# all the other code being executed in your loop
This is the obvious solution. If there's more than meets the eye, the solution may be a bit more complicated.
I'm assuming this is an action that you want to be performed at most one time, if some conditions are met. Since you won't always perform the action, you can't do it unconditionally outside the loop. Something like lazily retrieving some data (and caching it) if you get a request, but not retrieving it otherwise.
def do_something():
[x() for x in expensive_operations]
global action
action = lambda : None
action = do_something
while True:
# some sort of complex logic...
if foo:
action()
There are many ways to do what you want; however, do note that it is quite possible that —as described in the question— you don't have to call the function inside the loop.
If you insist in having the function call inside the loop, you can also do:
needs_to_run= expensive_function
while 1:
…
if needs_to_run: needs_to_run(); needs_to_run= None
…
I've thought of another—slightly unusual, but very effective—way to do this that doesn't require decorator functions or classes. Instead it just uses a mutable keyword argument, which ought to work in most versions of Python. Most of the time these are something to be avoided since normally you wouldn't want a default argument value to change from call-to-call—but that ability can be leveraged in this case and used as a cheap storage mechanism. Here's how that would work:
def my_function1(_has_run=[]):
if _has_run: return
print("my_function1 doing stuff")
_has_run.append(1)
def my_function2(_has_run=[]):
if _has_run: return
print("my_function2 doing some other stuff")
_has_run.append(1)
for i in range(10):
my_function1()
my_function2()
print('----')
my_function1(_has_run=[]) # Force it to run.
Output:
my_function1 doing stuff
my_function2 doing some other stuff
----
my_function1 doing stuff
This could be simplified a little further by doing what #gnibbler suggested in his answer and using an iterator (which were introduced in Python 2.2):
from itertools import count
def my_function3(_count=count()):
if next(_count): return
print("my_function3 doing something")
for i in range(10):
my_function3()
print('----')
my_function3(_count=count()) # Force it to run.
Output:
my_function3 doing something
----
my_function3 doing something
Here's an answer that doesn't involve reassignment of functions, yet still prevents the need for that ugly "is first" check.
__missing__ is supported by Python 2.5 and above.
def do_once_varname1():
print 'performing varname1'
return 'only done once for varname1'
def do_once_varname2():
print 'performing varname2'
return 'only done once for varname2'
class cdict(dict):
def __missing__(self,key):
val=self['do_once_'+key]()
self[key]=val
return val
cache_dict=cdict(do_once_varname1=do_once_varname1,do_once_varname2=do_once_varname2)
if __name__=='__main__':
print cache_dict['varname1'] # causes 2 prints
print cache_dict['varname2'] # causes 2 prints
print cache_dict['varname1'] # just 1 print
print cache_dict['varname2'] # just 1 print
Output:
performing varname1
only done once for varname1
performing varname2
only done once for varname2
only done once for varname1
only done once for varname2
One object-oriented approach and make your function a class, aka as a "functor", whose instances automatically keep track of whether they've been run or not when each instance is created.
Since your updated question indicates you may need many of them, I've updated my answer to deal with that by using a class factory pattern. This is a bit unusual, and it may have been down-voted for that reason (although we'll never know for sure because they never left a comment). It could also be done with a metaclass, but it's not much simpler.
def RunOnceFactory():
class RunOnceBase(object): # abstract base class
_shared_state = {} # shared state of all instances (borg pattern)
has_run = False
def __init__(self, *args, **kwargs):
self.__dict__ = self._shared_state
if not self.has_run:
self.stuff_done_once(*args, **kwargs)
self.has_run = True
return RunOnceBase
if __name__ == '__main__':
class MyFunction1(RunOnceFactory()):
def stuff_done_once(self, *args, **kwargs):
print("MyFunction1.stuff_done_once() called")
class MyFunction2(RunOnceFactory()):
def stuff_done_once(self, *args, **kwargs):
print("MyFunction2.stuff_done_once() called")
for _ in range(10):
MyFunction1() # will only call its stuff_done_once() method once
MyFunction2() # ditto
Output:
MyFunction1.stuff_done_once() called
MyFunction2.stuff_done_once() called
Note: You could make a function/class able to do stuff again by adding a reset() method to its subclass that reset the shared has_run attribute. It's also possible to pass regular and keyword arguments to the stuff_done_once() method when the functor is created and the method is called, if desired.
And, yes, it would be applicable given the information you added to your question.
Assuming there is some reason why myFunction() can't be called before the loop
from itertools import count
for i in count():
if i==0:
myFunction()
Here's an explicit way to code this up, where the state of which functions have been called is kept locally (so global state is avoided). I don't much like the non-explicit forms suggested in other answers: it's too surprising to see f() and for this not to mean that f() gets called.
This works by using dict.pop which looks up a key in a dict, removes the key from the dict, and takes a default value to use in case the key isn't found.
def do_nothing(*args, *kwargs):
pass
# A list of all the functions you want to run just once.
actions = [
my_function,
other_function
]
actions = dict((action, action) for action in actions)
while True:
if some_condition:
actions.pop(my_function, do_nothing)()
if some_other_condition:
actions.pop(other_function, do_nothing)()
I use cached_property decorator from functools to run just once and save the value. Example from the official documentation https://docs.python.org/3/library/functools.html
class DataSet:
def __init__(self, sequence_of_numbers):
self._data = tuple(sequence_of_numbers)
#cached_property
def stdev(self):
return statistics.stdev(self._data)
You can also use one of the standard library functools.lru_cache or functools.cache decorators in front of the function:
from functools import lru_cache
#lru_cache
def expensive_function():
return None
https://docs.python.org/3/library/functools.html
If I understand the updated question correctly, something like this should work
def function1():
print "function1 called"
def function2():
print "function2 called"
def function3():
print "function3 called"
called_functions = set()
while True:
n = raw_input("choose a function: 1,2 or 3 ")
func = {"1": function1,
"2": function2,
"3": function3}.get(n)
if func in called_functions:
print "That function has already been called"
else:
called_functions.add(func)
func()
You have all those 'junk variables' outside of your mainline while True loop. To make the code easier to read those variables can be brought inside the loop, right next to where they are used. You can also set up a variable naming convention for these program control switches. So for example:
# # _already_done checkpoint logic
try:
ran_this_user_request_already_done
except:
this_user_request()
ran_this_user_request_already_done = 1
Note that on the first execution of this code the variable ran_this_user_request_already_done is not defined until after this_user_request() is called.
A simple function you can reuse in many places in your code (based on the other answers here):
def firstrun(keyword, _keys=[]):
"""Returns True only the first time it's called with each keyword."""
if keyword in _keys:
return False
else:
_keys.append(keyword)
return True
or equivalently (if you like to rely on other libraries):
from collections import defaultdict
from itertools import count
def firstrun(keyword, _keys=defaultdict(count)):
"""Returns True only the first time it's called with each keyword."""
return not _keys[keyword].next()
Sample usage:
for i in range(20):
if firstrun('house'):
build_house() # runs only once
if firstrun(42): # True
print 'This will print.'
if firstrun(42): # False
print 'This will never print.'
I've taken a more flexible approach inspired by functools.partial function:
DO_ONCE_MEMORY = []
def do_once(id, func, *args, **kwargs):
if id not in DO_ONCE_MEMORY:
DO_ONCE_MEMORY.append(id)
return func(*args, **kwargs)
else:
return None
With this approach you are able to have more complex and explicit interactions:
do_once('foobar', print, "first try")
do_once('foo', print, "first try")
do_once('bar', print, "second try")
# first try
# second try
The exciting part about this approach it can be used anywhere and does not require factories - it's just a small memory tracker.
Depending on the situation, an alternative to the decorator could be the following:
from itertools import chain, repeat
func_iter = chain((myFunction,), repeat(lambda *args, **kwds: None))
while True:
next(func_iter)()
The idea is based on iterators, which yield the function once (or using repeat(muFunction, n) n-times), and then endlessly the lambda doing nothing.
The main advantage is that you don't need a decorator which sometimes complicates things, here everything happens in a single (to my mind) readable line. The disadvantage is that you have an ugly next in your code.
Performance wise there seems to be not much of a difference, on my machine both approaches have an overhead of around 130 ns.
If the condition check needs to happen only once you are in the loop, having a flag signaling that you have already run the function helps. In this case you used a counter, a boolean variable would work just as fine.
signal = False
count = 0
def callme():
print "I am being called"
while count < 2:
if signal == False :
callme()
signal = True
count +=1
I'm not sure that I understood your problem, but I think you can divide loop. On the part of the function and the part without it and save the two loops.