I have spent the last few hours searching for a way to have a class start one of its methods in a new thread as soon as it is instantiated.
I could run something like this:

from threading import Thread
from time import sleep

x = myClass()

def updater():
    while True:
        x.update()
        sleep(0.01)

update_thread = Thread(target=updater)
update_thread.daemon = True
update_thread.start()
A more elegant way would be to have the class do it itself in __init__ when it is instantiated. Imagine having 10 instances of that class...
Until now I couldn't find a working solution for this problem...
The actual class is a timer, and the method is an update method that updates all the counter's variables. As this class also has to run functions at a given time, it is important that the time updates are not blocked by the main thread.
Any help is much appreciated. Thanks in advance.
You can subclass Thread directly in this specific case:
from threading import Thread
from time import sleep

class MyClass(Thread):
    def __init__(self):
        super(MyClass, self).__init__()
        self.daemon = True
        self.cancelled = False
        # do other initialization here

    def run(self):
        """Overridden Thread.run, runs the update
        method once every 10 milliseconds."""
        while not self.cancelled:
            self.update()
            sleep(0.01)

    def cancel(self):
        """End this timer thread"""
        self.cancelled = True

    def update(self):
        """Update the counters"""
        pass

my_class_instance = MyClass()
# explicit start is better than implicit start in the constructor
my_class_instance.start()
# you can stop the thread with
my_class_instance.cancel()
In order to run a function (or member function) in a thread, use this:

from threading import Thread

th = Thread(target=some_func)
th.daemon = True
th.start()
Comparing this with deriving from Thread, it has the advantage that you don't export all of Thread's public functions as your own public functions. Actually, you don't even need to write a class to use this code: self.function or global_function are both equally usable as the target here.
I'd also consider using a context manager to start/stop the thread, otherwise the thread might stay alive longer than necessary, leading to resource leaks and errors on shutdown. Since you're putting this into a class, start the thread in __enter__ and join with it in __exit__.
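A minimal sketch of that context-manager idea (the class name `UpdaterContext`, the `Event`-based stop flag, and the 10 ms interval are illustrative assumptions, not from the original code):

```python
import threading
import time

class UpdaterContext:
    """Owns an update thread whose lifetime is bound to a with-block."""

    def __init__(self):
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self.ticks = 0

    def _loop(self):
        # runs in the background until __exit__ sets the stop flag
        while not self._stop.is_set():
            self.update()
            time.sleep(0.01)

    def update(self):
        self.ticks += 1

    def __enter__(self):
        self._thread.start()
        return self

    def __exit__(self, exc_type, exc, tb):
        self._stop.set()       # ask the thread to stop...
        self._thread.join()    # ...and wait for it, so nothing leaks
        return False

with UpdaterContext() as u:
    time.sleep(0.1)

print(u.ticks > 0)
```

The thread can never outlive the `with` block, even if the body raises, because `__exit__` always joins it.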
This is the code:

class foo:
    def levelOne(self):
        def worker(self, i):
            print('doing hard work')
        def writer(self):
            print('writing workers work')

session = foo()
threads = list()
for i in range(0, 5):
    thread = threading.Thread(target=session.levelOne.worker, args=(i,))
    thread.start()
    threads.append(thread)

writerThread = threading.Thread(target=session.levelOne.writer)
writerThread.start()

for thread in threads:
    thread.join()
writerThread.join()
5 workers should do the job and the writer should collect their results.
The error I get is: session object has no attribute worker
workers are actually testers that do a certain work in different "areas" while writer is keeping track of them without making my workers return any result.
It's important for this algorithm to be divided on layers like "levelOne", "levelTwo" etc. because they will all work together. This is the main reason why I keep the threading outside the class instead of the levelOne method.
Please help me understand where I'm wrong.
You certainly don't get "session object has no attribute worker" as the error message with the code you posted - the error should be "'function' object has no attribute 'worker'". And actually I don't know why you'd expect anything else - names defined within a function are local variables (hint: Python functions are objects just like any other); they do not become attributes of the function.
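A minimal reproduction of what actually happens (a stripped-down copy of the question's structure):

```python
class foo:
    def levelOne(self):
        # worker and writer are *local variables* of levelOne;
        # they never become attributes of the method or the instance
        def worker(self, i):
            print('doing hard work')
        def writer(self):
            print('writing workers work')

session = foo()

try:
    session.levelOne.worker    # the same lookup the Thread target performs
except AttributeError as e:
    err = e

print(err)
```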
It's important for this algorithm to be divided on layers like "levelOne", "levelTwo"
Well, possibly, but that's not the proper design. If you want foo to be nothing but a namespace, and levelOne, levelTwo etc. to be instances of some type having both a writer and a worker method, then you need to 1/ define your LevelXXX as classes, 2/ build instances of those classes as attributes of your foo class, i.e.:
class LevelOne():
    def worker(self, i):
        # ...

    def writer(self):
        # ...

class foo():
    levelOne = LevelOne()
Now whether this is the correct design for your use case is not guaranteed in any way, but it's impossible to design a proper solution without knowing anything about the problem...
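Wired into the question's threading code, that design might look like this sketch (the worker/writer bodies are placeholder prints, as in the question):

```python
import threading

class LevelOne:
    def worker(self, i):
        print('doing hard work', i)

    def writer(self):
        print('writing workers work')

class foo:
    levelOne = LevelOne()   # an instance as class attribute, i.e. a namespace

session = foo()

threads = []
for i in range(5):
    # session.levelOne.worker is now a real bound method, so this works
    t = threading.Thread(target=session.levelOne.worker, args=(i,))
    t.start()
    threads.append(t)

writerThread = threading.Thread(target=session.levelOne.writer)
writerThread.start()

for t in threads:
    t.join()
writerThread.join()
```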
If it's possible, could you explain why trying to access worker and writer as shown in the question's code is bad design?
Well, for the mere reason that it doesn't work, to start with, obviously xD.
Note that you could return the "worker" and "writer" functions from the levelOne method, i.e.:

class foo:
    def levelOne(self):
        def worker(self, i):
            print('doing hard work')

        def writer(self):
            print('writing workers work')

        return worker, writer

session = foo()
worker, writer = session.levelOne()
# etc
but this is both convoluted (assuming the point is to let worker and writer share self, which is much more simply done using a proper LevelOne class and making worker and writer methods of this class) and inefficient (def is an executable statement, so with your solution the worker and writer functions are created anew - which is not free - on each call).
I have two classes that need to pass data between each other. The first class instantiates the second. The second class needs to be able to pass information back to the first. However, I cannot instantiate ClassOne again from ClassTwo. Both classes run off a shared timer where they poll different things, so while they share the timer, they cannot share the objects they poll.
My current solution (which works) is to pass a method to ClassTwo that is used to send data back up, but I feel this might be a bit hacky and the wrong way to go about it.
class classOne():
    def __init__(self, timer):
        self.classTwo = classTwo(self.process_alerts, timer)
        self.classTwo.start()

    def process_alerts(self, alert_msg):
        print(alert_msg)

class classTwo():
    def __init__(self, process_alerts, timer):
        self.process_alerts = process_alerts  # <----- classOne's method

    def start(self):
        # check for alerts:
        #     if alert found:
        self.alert(alert_msg)

    def alert(self, alert_msg):
        self.process_alerts(alert_msg)  # <----- classOne's method
Thank you for your time.
Nothing prevents you from passing the current ClassOne instance (self) to its own ClassTwo instance:
class ClassOne(object):
    def __init__(self):
        self.two = ClassTwo(self)
        self.two.start()

    def process_alert(self, msg):
        print(msg)

class ClassTwo(object):
    def __init__(self, parent):
        self.parent = parent

    def start(self):
        while True:
            if self.has_message():
                self.parent.process_alert(self.get_message())
Note that in this context "parent" means it's a containment relationship ("has a"), it has nothing to do with inheritance ("is a").
If what bugs you is that ClassOne is responsible for instantiating ClassTwo (which indeed introduces strong coupling), you can change ClassOne so it takes a factory:
class ClassOne(object):
    def __init__(self, factory):
        self.other = factory(self)
        self.other.start()
    # etc
and then pass ClassTwo as the factory:
c1 = ClassOne(ClassTwo)
So you can actually pass anything that returns an object with the right interface (which makes unit testing easier).
Or - at least in your (I assume stripped-down) example - you could just make ClassOne pass itself to ClassTwo.start() and explicitly pass the ClassTwo instance to ClassOne, i.e.:
class ClassOne(object):
    def __init__(self, other):
        other.start(self)

    def process_alert(self, msg):
        print(msg)

class ClassTwo(object):
    def start(self, parent):
        while True:
            if self.has_message():
                parent.process_alert(self.get_message())

c2 = ClassTwo()
c1 = ClassOne(c2)
Or, even simpler, remove the call to ClassTwo.start from ClassOne, and you don't need any reference to a ClassTwo instance in ClassOne.__init__ at all:
class ClassOne(object):
    def process_alert(self, msg):
        print(msg)

class ClassTwo(object):
    def start(self, parent):
        while True:
            if self.has_message():
                parent.process_alert(self.get_message())

c1 = ClassOne()
c2 = ClassTwo()
c2.start(c1)
which is as decoupled as it can be, but it only works if ClassTwo needs the ClassOne instance only in start() (and in methods called from start), and ClassOne doesn't need to keep a reference to the ClassTwo instance either.
You could remove/minimize the coupling between the classes! I found this sort of architecture maps really well to sharing data by communicating across a Queue.
By using Queue you can decouple the two classes. The producer (ClassTwo) can check for messages, and publish them to a queue. It no longer needs to know how to correctly instantiate a class or interact with it, it just passes a message.
Then a ClassOne instance could pull messages from the queue, as they become available. This also lends well to scaling each instance independent of each other.
ClassTwo -> publish to queue -> Class One pulls from queue.
This also helps with testing as the two classes are completely isolated, you can provide a Queue to either class.
Queues also usually provide operations that support blocking until message becomes available, so you don't have to manage timeouts.
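A minimal sketch of that queue-based design (the message-checking logic is stubbed out with a fixed list, and the `None` sentinel used to signal "no more alerts" is an illustrative choice):

```python
import queue
import threading

class ClassTwo:
    """Producer: only knows how to publish messages to a queue."""
    def __init__(self, q):
        self.q = q

    def start(self, messages):
        for msg in messages:     # stand-in for "check for alerts"
            self.q.put(msg)
        self.q.put(None)         # sentinel: no more alerts

class ClassOne:
    """Consumer: only knows how to pull messages from a queue."""
    def __init__(self, q):
        self.q = q
        self.received = []

    def start(self):
        while True:
            msg = self.q.get()   # blocks until a message is available
            if msg is None:
                break
            self.received.append(msg)

q = queue.Queue()
producer = ClassTwo(q)
consumer = ClassOne(q)

t = threading.Thread(target=consumer.start)
t.start()
producer.start(['alert 1', 'alert 2'])
t.join()
print(consumer.received)
```

Neither class holds a reference to the other; each can be tested in isolation by handing it a queue you control.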
I have a subclass of threading.Thread. After instantiating it, it runs forever in the background.
class MyThread(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
        self.daemon = True
        self.start()

    def run(self):
        while True:
            <do something>
If I were to instantiate the thread from within another class, I would normally do so with
self.my_thread = MyThread()
In cases when I never thereafter have to access the thread, I have long wondered whether I can instead instantiate it simply with
MyThread()
(i.e., instantiate it without holding a reference). Will the thread eventually be garbage collected because there is no reference holding it?
It doesn't matter. You can test it easily with del self.my_thread, and you will see the thread continue running even though you deleted the only reference and forced garbage collection. That said, it is usually a good idea to hold a reference (so that you can set flags and whatnot for the other thread, although shared memory may be sufficient).
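A quick way to convince yourself (a sketch; the `ticks` list and the timing values are arbitrary): the thread keeps running even after its only reference is deleted, because the threading machinery itself keeps live threads registered:

```python
import threading
import time

ticks = []
stop = threading.Event()

def background():
    # appends to a shared list until told to stop
    while not stop.is_set():
        ticks.append(1)
        time.sleep(0.01)

t = threading.Thread(target=background, daemon=True)
t.start()
del t                     # drop the only reference to the Thread object

time.sleep(0.1)
before = len(ticks)
time.sleep(0.1)
after = len(ticks)

stop.set()
print(after > before)     # the loop kept running with no reference held
```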
Is there a simple way to use Multiprocessing to do the equivalent of this?
for sim in sim_list:
    sim.run()
where the elements of sim_list are "simulation" objects and run() is a method of the simulation class which does modify the attributes of the objects. E.g.:
class simulation:
    def __init__(self):
        self.state = {'done': False}
        self.cmd = "program"

    def run(self):
        subprocess.call(self.cmd)
        self.state['done'] = True
All the sim in sim_list are independent, so the strategy does not have to be thread safe.
I tried the following, which is obviously flawed because the argument is passed by deepcopy and is not modified in-place.
from multiprocessing import Process

for sim in sim_list:
    b = Process(target=simulation.run, args=[sim])
    b.start()
    b.join()
One way to do what you want is to have your computing class (simulation in your case) be a subclass of Process. When initialized properly, instances of this class will run in separate processes and you can set off a group of them from a list just like you wanted.
Here's an example, building on what you wrote above:
import multiprocessing
import os
import random
import sys

class simulation(multiprocessing.Process):
    def __init__(self, name):
        # must call this before anything else
        multiprocessing.Process.__init__(self)
        # then any other initialization
        self.name = name
        self.number = 0.0
        sys.stdout.write('[%s] created: %f\n' % (self.name, self.number))

    def run(self):
        sys.stdout.write('[%s] running ... process id: %s\n'
                         % (self.name, os.getpid()))
        self.number = random.uniform(0.0, 10.0)
        sys.stdout.write('[%s] completed: %f\n' % (self.name, self.number))
Then just make a list of objects and start each one with a loop:
sim_list = []
sim_list.append(simulation('foo'))
sim_list.append(simulation('bar'))

for sim in sim_list:
    sim.start()
When you run this you should see each object run in its own process. Don't forget to call Process.__init__(self) as the very first thing in your class initialization, before anything else.
Obviously I've not included any interprocess communication in this example; you'll have to add that if your situation requires it (it wasn't clear from your question whether you needed it or not).
This approach works well for me, and I'm not aware of any drawbacks. If anyone knows of hidden dangers which I've overlooked, please let me know.
I hope this helps.
For those who will be working with large data sets, a pool mapped over an iterable would be the solution here. Note that the workers operate on copies, so the function should return the modified object:

import multiprocessing as mp

def run_sim(sim):
    sim.run()
    return sim          # return the modified copy from the worker process

pool = mp.Pool(mp.cpu_count())
sim_list = pool.map(run_sim, sim_list)
Say I derive from threading.Thread:
from threading import Thread
import time

class Worker(Thread):
    def start(self):
        self.running = True
        Thread.start(self)

    def terminate(self):
        self.running = False
        self.join()

    def run(self):
        while self.running:
            print("running")
            time.sleep(1)
Any instance of this class whose thread has been started must have that thread actively terminated before the instance can be garbage collected (the thread itself holds a reference to the object). This is a problem, because it completely defeats the purpose of garbage collection. The idea would be to have some object encapsulating a thread, and when the last reference to that object goes out of scope, the destructor gets called for thread termination and cleanup. Thus a destructor
def __del__(self):
    self.terminate()
will not do the trick.
The only way I see to nicely encapsulate threads is by using the low-level thread built-in module and weakref weak references. Or I may be missing something fundamental. So is there a nicer way than tangling things up in weakref spaghetti code?
How about using a wrapper class (which has-a Thread rather than is-a Thread)?
eg:
class WorkerWrapper:
    def __init__(self):
        self.worker = Worker()
        self.worker.start()

    def __del__(self):
        self.worker.terminate()
And then use these wrapper classes in client code, rather than threads directly.
Or perhaps I miss something (:
To add an answer inspired by @datenwolf's comment, here is another way to do it that deals with the object being deleted or the parent thread ending:
import threading
import time
import weakref

class Foo(object):
    def __init__(self):
        self.main_thread = threading.current_thread()
        self.initialised = threading.Event()
        # pass a weak reference so the thread does not keep the object alive
        self.t = threading.Thread(target=Foo.threaded_func,
                                  args=(weakref.proxy(self), ))
        self.t.start()
        while not self.initialised.is_set():
            # This loop is necessary to stop the main thread doing anything
            # until the exception handler in threaded_func can deal with the
            # object being deleted.
            pass

    def __del__(self):
        print('self:', self, self.main_thread.is_alive())
        self.t.join()

    def threaded_func(self):
        self.initialised.set()
        try:
            while True:
                print(time.time())
                if not self.main_thread.is_alive():
                    print('Main thread ended')
                    break
                time.sleep(1)
        except ReferenceError:
            print('Foo object deleted')

foo = Foo()
del foo
foo = Foo()
I guess you are a convert from C++, where a lot of meaning can be attached to the scopes of variables, equalling the lifetimes of those variables. This is not the case for Python, or for garbage-collected languages in general.
Scope != lifetime, simply because garbage collection occurs whenever the interpreter gets around to it, not on scope boundaries. Especially as you are trying to do asynchronous stuff with it, the raised hairs on your neck should vibrate to the clamour of all the warning bells in your head!
You can do stuff with the lifetime of objects, using 'del'.
(In fact, if you read the source of the CPython garbage collector module, the obvious (and somewhat funny) disdain for objects with finalizers (__del__ methods) expressed there should tell everybody to rely on the lifetime of an object only when necessary.)
You could use sys.getrefcount(self) to find out when to leave the loop in your thread. But I can hardly recommend that (just try out what numbers it returns - you won't be happy; to see who holds what, check gc.get_referrers(self)).
The reference count may/will depend on garbage collection as well.
Besides, tying the runtime of a thread of execution to scopes/lifetimes of objects is an error 99% of the time. Not even Boost does it. It goes out of its RAII way to define something called a 'detached' thread.
http://www.boost.org/doc/libs/1_55_0/doc/html/thread/thread_management.html