Class object affecting other class object during time.sleep() - Python

I have a class in Python that I run as several separate instances:
class worker():
    def __init__(self, x, y):
        ...
    def run(self):
        ...

instance1 = worker(x, y)
instance1.run()
instance2 = worker(x, y)
instance2.run()
The problem is that when instance1 hits time.sleep(), it blocks instance2 as well. How do I make them independent? Preferably without multiprocessing. Thank you!
Different example:
__author__ = 'user'
import time

class test():
    def __init__(self, message):
        self.message = message

    def run(self):
        while True:
            print self.message
            time.sleep(5)

if __name__ == '__main__':
    test1 = test("PRINT-1")
    test1.run()
    test2 = test("PRINT-2")
    test2.run()

You can use Celery to run tasks in parallel. It's easy to implement.
See an example:
import time
from celery import task, group

@task
def make_task():
    time.sleep(5)
    return True

def execute_tasks():
    result = group([make_task.s(), make_task.s()]).apply_async()  # execute both tasks
    print result.get()  # print the results

It looks like you've half-followed a tutorial on parallel code. Nothing in your current test class will cause it to run in parallel, but with just some minor tweaks you can do so with either threads or processes.
Here's a version that makes the class inherit from threading.Thread:
import threading
import time

class TestThreaded(threading.Thread):
    def __init__(self, x, y):
        super().__init__()
        self.x = x
        self.y = y

    def run(self):
        for i in range(self.x):
            time.sleep(self.y)
            print((i+1)*self.y)
You can use it like this:
t0 = TestThreaded(8, 3)
t1 = TestThreaded(6, 4)
t0.start()
t1.start()
t0.join()
t1.join()
Both threads in this example will count to 24 over a span of 24 seconds. The first thread will count by threes, the second thread will count by fours. The timings will be closely synched at 12 and 24 seconds (depending on your computer's exact timings they may get printed on the same line).
Note that we're calling the start method inherited from the Thread class, not the run method we defined above. The threading code will call run for us, in the spawned thread.
You can get an equivalent multiprocessing version by using multiprocessing.Process as the base class instead of threading.Thread. The only difference is that you'll spawn child processes instead of child threads. For CPU limited work in Python, processes are better than threads because they're not limited by the Global Interpreter Lock, which makes it impossible for two threads to run Python code at the same time. The downside is higher overhead, both during startup, and when communicating between processes.
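For reference, here is a sketch of that multiprocessing version (same class body; the TestProcess name and the __main__ guard are my additions, and the guard is required on platforms that spawn a fresh interpreter for each child process):
import multiprocessing
import time

class TestProcess(multiprocessing.Process):
    def __init__(self, x, y):
        super().__init__()
        self.x = x
        self.y = y

    def run(self):
        for i in range(self.x):
            time.sleep(self.y)
            print((i+1)*self.y)

if __name__ == '__main__':
    p0 = TestProcess(8, 3)
    p1 = TestProcess(6, 4)
    p0.start()
    p1.start()
    p0.join()
    p1.join()
As with the threaded version, call start (not run), so the work happens in the child process rather than in the caller.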

Related

Python: CPU intensive tasks on multiple threads

Suppose I have this class:
class Foo:
    def __init__(self):
        self.task1_dict = {}
        self.task2_dict = {}

    def task1(self):
        for i in range(10000000):
            # update self.task1_dict
            ...

    def task2(self):
        for i in range(10000000):
            # update self.task2_dict
            ...

    def run(self):
        self.task1()
        self.task2()
Task 1 and task 2 are both CPU intensive tasks and are non-IO. They are also independent so you can assume that running them concurrently is thread safe.
For now, my class runs the tasks sequentially, and I want to change it so the tasks run in parallel on multiple threads. I'm using the ThreadPoolExecutor from the concurrent.futures package.
from concurrent.futures import ThreadPoolExecutor

class Foo:
    ...
    def run(self):
        with ThreadPoolExecutor() as executor:
            executor.submit(self.task1)
            executor.submit(self.task2)
The problem is when I call the run method the run time does not decrease at all and even slightly increases compared to the sequential version. I'm guessing that this is because of the GIL allowing only one thread to run at a time. Is there any way that I can parallelise this program? Maybe a way to overcome the GIL and run the 2 methods on 2 threads? I have considered switching to ProcessPoolExecutor, but I cannot call the methods since class methods are not picklable. Also if I use multiprocessing, Python will create multiple instances of Foo and self.task1_dict and self.task2_dict would not be updated accordingly.
You can use multiprocessing shared memory as explained here
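For what it's worth, here is a rough sketch of that idea using a multiprocessing.Manager dict together with ProcessPoolExecutor; the module-level task functions, the placeholder loop bodies, and the bulk update at the end are assumptions added for illustration, not part of the original class:
from concurrent.futures import ProcessPoolExecutor
from multiprocessing import Manager

def task1(shared):
    local = {}
    for i in range(10000000):
        local[i % 100] = i            # placeholder CPU-bound work
    shared.update(local)              # one bulk update keeps the Manager overhead low

def task2(shared):
    local = {}
    for i in range(10000000):
        local[i % 100] = i * 2        # placeholder CPU-bound work
    shared.update(local)

class Foo:
    def __init__(self):
        self.task1_dict = {}
        self.task2_dict = {}

    def run(self):
        with Manager() as manager:
            d1, d2 = manager.dict(), manager.dict()
            with ProcessPoolExecutor() as executor:
                f1 = executor.submit(task1, d1)
                f2 = executor.submit(task2, d2)
                f1.result()           # propagate any exception from the workers
                f2.result()
            # copy the shared dicts back into plain dicts on the instance
            self.task1_dict = dict(d1)
            self.task2_dict = dict(d2)

if __name__ == '__main__':
    Foo().run()
Whether this beats the sequential version depends on how much synchronisation you need: frequent small writes to a Manager proxy are expensive, which is why the sketch fills a local dict first and pushes it across in one call.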

Updating object attributes from within module run in multiprocessor process

I am relatively new to Python and definitely new to multiprocessing. I'm following this question/answer for the structure of my multiprocessing, but in func_A I call a module function and pass a class instance as one of the arguments. In the module, I change an attribute of that object, and I would like the main program to see the change and update the user with the attribute's value. The child processes run for a very long time, so I need the main program to provide updates while they run.
My suspicion is that I'm not understanding namespace/object scoping or something similar, but from what I've read, passing an object (an instance of a class?) to a module as an argument passes a reference to the object and not a copy. I would have thought this meant that changing the attributes of the object in the child process/module would have changed the attributes in the main program object (since they're the same object). Or am I confusing things?
The code for my main program:
# MainProgram.py
import multiprocessing as mp
import time
from time import sleep
import sys
from datetime import datetime

import myModule

MYOBJECTNAMES = ['name1', 'name2']

class myClass:
    def __init__(self, name):
        self.name = name
        self.value = 0

myObjects = []
for n in MYOBJECTNAMES:
    myObjects.append(myClass(n))

def func_A(process_number, queue):
    start = datetime.now()
    print("Process {} (object: {}) started at {}".format(process_number, myObjects[process_number].name, start))
    myModule.Eval(myObjects[process_number])
    sys.stdout.flush()

def multiproc_master():
    queue = mp.Queue()
    proceed = mp.Event()
    processes = [mp.Process(target=func_A, args=(x, queue)) for x in range(len(myObjects))]
    for p in processes:
        p.start()
    for i in range(100):
        for o in myObjects:
            print("In main: Value of {} is {}".format(o.name, o.value))
        sleep(10)
    for p in processes:
        p.join()

if __name__ == '__main__':
    split_jobs = multiproc_master()
    print(split_jobs)
The code for my module program:
# myModule.py
from time import sleep

def Eval(myObject):
    for i in range(100):
        myObject.value += 1
        print("In module: Value of {} is {}".format(myObject.name, myObject.value))
        sleep(5)
That question/answer you linked to was probably a poor choice to use as a template, as it does many things that your code doesn't require (much less use).
I think your biggest misconception about how multiprocessing works is thinking that all the code is running in the same address-space. The main task runs in its own, and there are separate ones for each subtask. The way your code is written, each of them will end up with its own separate myObjects list. That's why the main task doesn't see any of the changes made by any of the other tasks.
While there are ways to share objects using the multiprocessing module, doing so often introduces significant overhead: keeping them in sync across all the processes requires a lot of work "under the covers" to make them merely seem shared, since the separate address-spaces mean they can't actually be. This overhead frequently cancels out any speed gained by parallel processing.
As stated in the documentation: "when doing concurrent programming it is usually best to avoid using shared state as far as possible".
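Applied to the code above, a rough sketch of that advice would be to report progress back through the queue that func_A already receives instead of mutating myObjects in the child; the extra callback argument to myModule.Eval below is a hypothetical change, added only for illustration:
import queue as stdlib_queue

def func_A(process_number, q):
    obj = myObjects[process_number]
    # hypothetical: Eval accepts a callback and calls it after each update
    myModule.Eval(obj, lambda value: q.put((obj.name, value)))

def multiproc_master():
    q = mp.Queue()
    processes = [mp.Process(target=func_A, args=(x, q)) for x in range(len(myObjects))]
    for p in processes:
        p.start()
    while any(p.is_alive() for p in processes):
        try:
            name, value = q.get(timeout=1)
            print("In main: Value of {} is {}".format(name, value))
        except stdlib_queue.Empty:
            pass
    # drain anything still left in the queue after the children exit
    while not q.empty():
        name, value = q.get_nowait()
        print("In main: Value of {} is {}".format(name, value))
    for p in processes:
        p.join()
The lambda is created inside the child process, so nothing unpicklable has to cross the process boundary; only plain (name, value) tuples travel over the queue.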

How to minimize memory usage for a mostly sleeping thread that may need to be killed

I have a thread in my process that has a few characteristics:
1. it mostly sleeps
2. it only toggles a few booleans; it doesn't interact with any data that could be messed up by abrupt killing (e.g. an external database)
3. it may need to be killed at any time from another thread
A rough approximation is:
import time
import threading

class Foo():
    def __init__(self):
        self.my_bool = True

    def bar(self):
        while True:
            time.sleep(60*30)
            self.my_bool = not self.my_bool

foo = Foo()
t = threading.Thread(target=foo.bar)
t.start()
According to this highly voted question/answer, I should not kill it abruptly. Does that still apply here, since killing it clearly isn't going to ruin my data collection or cleanup? Specifically, in my kill function, I will manually clean up the processes that interact with the boolean.
I can imagine some other options such as this:
import time
import threading
import datetime

class Foo():
    def __init__(self):
        self.my_bool = True
        self.killed = False

    def bar(self):
        while True and not self.killed:
            next_toggle_time = datetime.datetime.now() + datetime.timedelta(minutes=15)
            print 'next toggle time: ', next_toggle_time
            while datetime.datetime.now() < next_toggle_time and not self.killed:
                time.sleep(0.1)
            self.my_bool = not self.my_bool
This doesn't appear to use much memory (0.1% of a MacBook Pro) but it feels kind of ugly and I'm planning to use this on computers with much more limited memory. Is there a better way?
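One common alternative, sketched below with the same structure, is to wait on a threading.Event with a timeout: the thread blocks without busy-polling and wakes up as soon as another thread sets the event (the kill method name here is my own):
import threading

class Foo():
    def __init__(self):
        self.my_bool = True
        self._stop = threading.Event()

    def bar(self):
        # wait() returns False on timeout and True as soon as the event is set
        while not self._stop.wait(timeout=60*30):
            self.my_bool = not self.my_bool

    def kill(self):
        self._stop.set()

foo = Foo()
t = threading.Thread(target=foo.bar)
t.start()
# later, from another thread:
# foo.kill()
# t.join()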

Simple example of Multiprocessing and multi-threading

I have the following code:
class SplunkUKAnalyser(object):
    def __init__
    def method1
    def method2
    def method2
    ...

class SplunkDEAnalyser(SplunkUKAnalyser):
    def __init__ (Over-ridden)
    def method1 (Over-ridden)
    def method2
    def method2
    ...

def perform_uk_analysis():
    my_uk_analyser = SplunkUKAnalyser()

def perform_de_analysis():
    my_de_analyser = SplunkDEAnalyser()
It all works well if I just execute the below:
perform_uk_analysis()
perform_de_analysis()
How can I make it so that the last two statements are executed concurrently (using multiprocessing and/or multi-threading)?
From my test it seems that the second statement executes even though the first statement has not finished completely but I would like to incorporate true concurrency.
Any additional advice is much appreciated.
Many thanks in advance.
Because of the GIL (Global Interpreter Lock), you cannot achieve 'true concurrency' with threading.
However, using multiprocessing to concurrently run multiple tasks is easy:
import multiprocessing
process1 = multiprocessing.Process(target=perform_uk_analysis)
process2 = multiprocessing.Process(target=perform_de_analysis)
# you can optionally daemonize the process
process2.daemon = True
# run the tasks concurrently
process1.start()
process2.start()
# you can optionally wait for a process to finish
process2.join()
For tasks that run the same function with different arguments, consider using multiprocessing.Pool, an even more convenient solution.
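For example, here is a sketch of that Pool variant; the perform_analysis wrapper taking a country code is an assumption added only to show the shape of the call:
import multiprocessing

def perform_analysis(country):
    # hypothetical wrapper so both analyses share one entry point
    if country == 'uk':
        perform_uk_analysis()
    else:
        perform_de_analysis()

if __name__ == '__main__':
    pool = multiprocessing.Pool(processes=2)
    pool.map(perform_analysis, ['uk', 'de'])  # blocks until both analyses finish
    pool.close()
    pool.join()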

Concurrent execution of QThread or QRunnable objects in Python

I have the following code:
from PySide.QtCore import *
import time

class GUI(object):
    IDLIST = [i for i in xrange(20)]
    UNUSEDIDS = [i for i in xrange(20)]

    def __init__(self):
        print "GUI CLASS INITIALIZED!"
        worker = Worker()
        worker2 = Worker2()
        threadpool = QThreadPool()
        threadpool.setMaxThreadCount(10)
        for i in xrange(5):
            #Alternate between the two
            #threadpool.start(worker)
            #worker2.start()
            pass  # (loop body left empty here; uncomment one of the two lines above)

    @classmethod
    def delegator(self):
        """Irrelevant to the question, I need this method for something else"""
        USEDIDS = []
        toUse = self.UNUSEDIDS[0]
        USEDIDS.append(toUse)
        self.UNUSEDIDS.pop(0)
        return toUse

class Worker(QRunnable):
    def __init__(self, parent=None):
        super(Worker, self).__init__(parent)

    def run(self):
        #idInUse = getattr(GUI, "delegator")
        idInUse = GUI.delegator()
        print "Hello world from QRunnable", idInUse
        #time.sleep(5)

class Worker2(QThread):
    def __init__(self, parent=None):
        super(Worker2, self).__init__(parent)

    def run(self):
        idInUse = GUI.delegator()
        print "Hello world from QThread", idInUse

s = time.time()
GUI()
print "Done in %s" % ((time.time()-s) * 1000)
I think the desired effect is obvious from the code: I want the "Hello world from QThread/QRunnable" messages to be printed. Since I am writing a multi-threaded application, my GUI __init__ contains the loop that starts the concurrent threads.
The thing is that with QRunnable it works just fine: all 5 threads I specified get executed at once, concurrently. With QThread, however, that is not the case. Instead, I get the following error:
QThread: Destroyed while thread is still running
And it is not executed at all.
Normally I would not at all mind using the QRunnable, however, it does not derive from QObject (so I can't directly emit signals though I can construct a QObject() within it) and also it does not have the .stop() method which I badly need. Googling revealed that there is no way to stop a QRunnable from executing? On the other hand, QThread has both of these methods that I need.
So I guess my question is either how to make multiple same QThreads run concurrently, or how to terminate an execution of a QRunnable?
(also please bear in mind that the python's built-in threading module is out of the question)
The "QThread: Destroyed while thread is still running" exception happens because you never wait for your threads to finish, and you don't keep any reference to them (neither to worker, worker2 nor threadpool), so when your __init__ finishes they get destroyed.
If you keep a reference to these objects, it should work:
def __init__(self):
    print "GUI CLASS INITIALIZED!"
    self.worker = Worker()
    self.worker2 = Worker2()
    self.threadpool = QThreadPool()
    self.threadpool.setMaxThreadCount(10)
    for i in xrange(5):
        #Alternate between the two
        self.threadpool.start(self.worker)
        # this is wrong, by the way!
        # you should create 5 workers, not call start 5 times...
        self.worker2.start()
Calling the wait/waitForDone methods on the thread/pool is even better.
For a QThreadPool this implicitly happens when its (C++) destructor is called; if that weren't the case, your program wouldn't have worked with QRunnables in the first place either. For a QThread nothing like this happens, and it's even mentioned in the documentation that this will probably result in a crash. So it's better to explicitly wait for the threads to finish...
Also, I hope you already know this.
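As a rough sketch of the point made in the comments above (one QRunnable per loop iteration, references kept, and the pool waited on), something along these lines; the loop body is my own illustration, not the original code:
def __init__(self):
    print "GUI CLASS INITIALIZED!"
    self.threadpool = QThreadPool()
    self.threadpool.setMaxThreadCount(10)
    self.workers = []
    for i in xrange(5):
        w = Worker()
        self.workers.append(w)        # keep a Python-side reference to each runnable
        self.threadpool.start(w)
    self.threadpool.waitForDone()     # block until every runnable has finished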
