I just started getting familiar with multiprocessing in Python and got stuck on a problem I can't solve the way I want, and I can't find any clear information on whether what I'm trying is even properly solvable.
What I'm trying to do is something similar to the following:
import time
from multiprocessing import Process, Event, Queue
from threading import Thread

class Main:
    def __init__(self):
        self.task_queue = Queue()
        self.process = MyProcess(self.task_queue)
        self.process.start()

    def execute_script(self, code):
        ProcessCommunication(code, self.task_queue).start()

class ProcessCommunication(Thread):
    def __init__(self, script, task_queue):
        super().__init__()
        self.script = script
        self.script_queue = task_queue
        self.script_end_event = Event()

    def run(self):
        self.script_queue.put((self.script, self.script_end_event))
        while not self.script_end_event.is_set():
            time.sleep(0.1)

class MyProcess(Process):
    class ExecutionThread(Thread):
        def __init__(self, code, end_event):
            super().__init__()
            self.code = code
            self.event = end_event

        def run(self):
            exec(compile(self.code, '<string>', 'exec'))
            self.event.set()

    def __init__(self, task_queue):
        super().__init__(name="TEST_PROCESS")
        self.task_queue = task_queue
        self.status = None

    def run(self):
        while True:
            if not self.task_queue.empty():
                script, end_event = self.task_queue.get()
                if script is None:
                    break
                self.ExecutionThread(script, end_event).start()
So I would like to have one separate process, running for the whole lifetime of my main process, that executes user-written scripts in an environment with restricted privileges and a restricted namespace. It should also protect the main process from potential endless loops without sleeps, which would load a CPU core too heavily.
Example code using this structure could look something like this:
if __name__ == '__main__':
    main_class = Main()
    main_class.execute_script("print(1)")
The main process can start several scripts simultaneously, and I would like to pass an event, together with the execution request, to the process so that the main process gets notified whenever one of the scripts finishes.
However, multiprocessing queues do not like events being passed through them and raise the following error:
'RuntimeError: Semaphore objects should only be shared between processes through inheritance'
As I create another event with every execution request, I can't pass them at instantiation of the process.
I came up with one way to solve this: pass an identifier together with the code, and set up another queue which is fed that identifier whenever the end event would be set. However, using events seems much more elegant to me, and I wonder if there is a solution I have not thought of yet.
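For reference, a minimal sketch of that identifier-based workaround; names like done_queue, submit and watch are illustrative, not from the code above:

import itertools
from multiprocessing import Queue
from threading import Event

task_queue = Queue()   # (script_id, code) pairs sent to the worker process
done_queue = Queue()   # the worker puts script_id back when a script finishes
pending = {}           # script_id -> local threading.Event
_ids = itertools.count()

def submit(code):
    script_id = next(_ids)
    pending[script_id] = Event()
    task_queue.put((script_id, code))
    return pending[script_id]

# a single watcher thread in the main process turns completion
# messages back into local events (the worker would put the
# script_id on done_queue after exec() finishes):
def watch():
    while True:
        script_id = done_queue.get()
        pending.pop(script_id).set()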
Related
I created a background thread this way:
def listenReply(self):
    while self.SOCK_LISTENING:
        fromNodeRED = self.nodeRED_sock.recv(1024).decode()
        if fromNodeRED == "closeDoor":
            self.door_closed()
...
self.listenThread = Thread(target=self.listenReply, daemon=True)
self.SOCK_LISTENING = True
self.listenThread.start()
But self.door_closed() has some UI stuff, so that's no good. How do I call self.door_closed in the main thread instead?
One thing to keep in mind is that each thread is a sequential execution of a single flow of code, starting from the function the thread was started on.
It doesn't make much sense to simply run something on an existing thread, since that thread is already executing something, and doing so would disrupt its current flow.
However, it's quite easy to communicate between threads and it's possible to implement a thread's code such that it simply receives functions/events from other threads which tell it what to do. This is commonly known as an event loop.
For example, your main thread could look something like this:
from queue import Queue

tasks = Queue()

def event_loop():
    while True:
        next_task = tasks.get()
        print('Executing function {} on main thread'.format(next_task))
        next_task()
In your other threads you could ask the main thread to run a function by simply adding it to the tasks queue:
def listenReply(self):
    while self.SOCK_LISTENING:
        fromNodeRED = self.nodeRED_sock.recv(1024).decode()
        if fromNodeRED == "closeDoor":
            tasks.put(door_closed)
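What the example leaves implicit is that the main thread has to end up inside event_loop(). A plausible wiring sketch, where listenReply stands in for whatever background function feeds the queue:

from threading import Thread

# start the background listener, then hand the main thread over to
# the event loop so the queued functions actually get executed
listen_thread = Thread(target=listenReply, daemon=True)
listen_thread.start()
event_loop()  # blocks; runs whatever other threads put into `tasks`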
You can use threading.Event() and set it whenever you receive "closeDoor" from recv.
For example:
g_should_close_door = threading.Event()

def listenReply(self):
    while self.SOCK_LISTENING:
        fromNodeRED = self.nodeRED_sock.recv(1024).decode()
        if fromNodeRED == "closeDoor":
            g_should_close_door.set()
...
self.listenThread = Thread(target=self.listenReply, daemon=True)
self.SOCK_LISTENING = True
self.listenThread.start()

if g_should_close_door.is_set():
    door_closed()
    g_should_close_door.clear()
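Note that the is_set() check has to run repeatedly on the main thread. A minimal polling sketch, assuming no UI framework (a real GUI would use its framework's timer instead of a blocking loop):

import time

# poll the flag on the main thread; sleep so we don't burn a core
while True:
    if g_should_close_door.is_set():
        door_closed()
        g_should_close_door.clear()
    time.sleep(0.05)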
Solved it myself using signals and slots of PyQt.
class App(QWidget):
    socketSignal = QtCore.pyqtSignal(object)  # must be defined at class level

    # BG THREAD
    def listenReply(self):
        while self.SOCK_LISTENING:
            fromNodeRED = self.nodeRED_sock.recv(1024).decode()
            print(fromNodeRED)
            self.socketSignal.emit(fromNodeRED)

.... somewhere in __init__ of the main thread:

        self.socketSignal.connect(self.executeOnMain)
        self.listenThread = Thread(target=self.listenReply, daemon=True)
        self.SOCK_LISTENING = True
        self.listenThread.start()
....

    def executeOnMain(self, data):
        if data == "closeDoor":
            self.door_closed()  # a method that changes the UI
Works great for me.
I want to have a main program that works like a console, from which I can launch other processes (infinite loops) and kill them selectively whenever certain commands are entered.
For that I created this class:
class RunInThread(threading.Thread):
    def __init__(self, function):
        self.function = function
        self.kill_pill = threading.Event()
        threading.Thread.__init__(self)

    def start(self):  # This is controversial.
        self.__init__(self.function)
        threading.Thread.start(self)

    def stop(self):
        self.kill_pill.set()

    def run(self):
        while not self.kill_pill.is_set():
            self.function()
The documentation for threading.Thread says that only the __init__() and run() methods should be overridden.
Is there any clear issue with my code? It works the way I intended, but since it's going to be running for long periods of time, I need to make sure I'm not creating any memory problems.
EDIT:
What about this solution?
class StoppableThread(threading.Thread):
    # threading.Thread subclass that can be stopped with the stop() method.
    def __init__(self, function):
        threading.Thread.__init__(self)
        self.function = function
        self.kill_pill = threading.Event()

    def stop(self):
        self.kill_pill.set()

    def run(self):
        while not self.kill_pill.is_set():
            self.function()

class RunInThread():
    def __init__(self, function, prnt=False):
        self.function = function
        self.running = False
        self.prnt = prnt

    def start(self):
        if not self.running:
            self.thread = StoppableThread(self.function)
            self.thread.start()
            self.running = True
        else:
            if self.prnt:
                print('Thread already running.')

    def stop(self):
        self.thread.stop()
        self.running = False
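For what it's worth, a hypothetical usage sketch of this second version; tick is an illustrative stand-in for the looped function:

import time

def tick():
    print('working...')
    time.sleep(1)

runner = RunInThread(tick)
runner.start()
time.sleep(5)
runner.stop()    # sets the kill pill; the loop exits after the current tick
runner.start()   # fine here: a brand-new StoppableThread object is created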
If you want to find out what could break, I'd suggest looking into the implementation of the Thread class.
Among other things, Thread.__init__() initialises an Event() object to detect thread startup and shutdown, manages cleanup hooks/callbacks and some internal lock objects, and registers the thread in a list so you can introspect running threads. By calling Thread.__init__() again, these variables get reinitialised, which screws up the internal mechanisms behind many of these features.
What could go wrong? I didn't test any of these, but from skimming through threading.py, these are some of the things I expect could break:
- your Python process will now be running an OS thread that doesn't show up in threading.enumerate()
- multiple OS threads will now get the same Thread object back from current_thread(), which will likely also break thread-locals and anything that depends on them
- Thread.join() depends on some internal locks, so it would likely become unsafe to call
- unhandled exceptions can go to the wrong exception hook handler
- the register_at_fork and shutdown handlers will likely get confused
In other words, don't try to be sneaky: create a new Thread object for each thread you want to start.
There's a good reason the Thread class goes to some effort to prevent you from accidentally calling start() twice. Don't try to subvert this.
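As a minimal illustration of that advice: restarting work means constructing a fresh Thread object each time, never calling start() twice on the same one:

import threading

def worker():
    print('doing work')

t = threading.Thread(target=worker)
t.start()
t.join()

# t.start() again would raise RuntimeError; create a new object instead
t2 = threading.Thread(target=worker)
t2.start()
t2.join()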
For a couple of weeks I have been trying to solve a problem with the multiprocessing module in Python (2.7.x).
Idea:
Let's have a message queue (RabbitMQ in our case). Create a listener on that queue, and on each message spawn a task which will process that message.
Problem:
Everything works fine, but after a couple hundred tasks, some sub-processes become zombies, which is the main problem.
We also have some limitations (such as a max number of tasks per machine), which in the end leads to the machine no longer processing any tasks.
Current implementation:
I created a minimal example which should explain our approach:
# -*- coding: utf-8 -*-
from multiprocessing import Process
import signal
from threading import Lock

class Task(Process):
    def __init__(self, data):
        super(Task, self).__init__()
        self.data = data

    def run(self):
        # reset SIGCHLD to default handling in the subprocess
        # (the parent's handler is inherited otherwise)
        signal.signal(signal.SIGCHLD, signal.SIG_DFL)
        self.do_job()  # long job there

    def do_job(self):
        # very long job
        pass

class MQListener(object):
    def __init__(self):
        self.tasks = []
        self.tasks_lock = Lock()
        self.register_signal_handler()
        mq = RabbitMQ()
        mq.listen("task_queue", self.on_message)

    def register_signal_handler(self):
        signal.signal(signal.SIGCHLD, self.on_signal_received)

    def on_signal_received(self, *_):
        self._check_existing_processes()

    def on_message(self, message):
        # ack message and create task
        task = Task(message)
        with self.tasks_lock:
            self.tasks.append(task)
        task.start()

    def _check_existing_processes(self):
        """
        Go over all created tasks; if one is no longer alive,
        join it and remove it from the tasks collection.
        """
        try:
            with self.tasks_lock:
                running_tasks = []
                for w in self.tasks:
                    if not w.is_alive():
                        w.join()
                    else:
                        running_tasks.append(w)
                self.tasks = running_tasks
        except Exception:
            # log
            pass

if __name__ == '__main__':
    m = MQListener()
I'm quite open to using a library for this; if you can recommend one, that would be great as well.
Using SIGCHLD to catch child process termination has quite a few gotchas. The signal handler runs asynchronously, and multiple SIGCHLD signals might get aggregated into a single delivery.
In short, it's better not to use it unless you're really aware of how it works.
Your program has another issue as well: what happens if you get 10000 messages at once? You'll spawn 10000 processes at the same time and kill your machine.
You could use a process Pool and let it handle all these issues for you.
from multiprocessing import Pool

class MQListener(object):
    def __init__(self):
        self.pool = Pool()
        self.rabbitclient = RabbitMQ()

    def new_message(self, message):
        self.pool.apply_async(do_job, args=(message,))

    def run(self):
        self.rabbitclient.listen("task_queue", self.new_message)

app = MQListener()
app.run()
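If resource build-up is still a concern, the Pool constructor accepts bounds that may help here; both parameters below are part of the standard multiprocessing API (maxtasksperchild exists since Python 2.7):

from multiprocessing import Pool

# cap concurrency at 4 worker processes and recycle each worker
# after 100 tasks to limit gradual resource build-up
pool = Pool(processes=4, maxtasksperchild=100)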
I want to make a simple server-like program which runs in a loop, reading and processing messages sent to it. But when I start it, like Server().start(), it obviously runs in a loop forever. Is there a way to run it in the background and feed it data, which will then be processed?
class Server:
    def __init__(self):
        self.messages = []
        self.running = False

    def start(self):
        self.running = True
        self.work()

    def send_mess(self, message):
        self.messages.append(message)

    def handle_mess(self):
        for mess in self.messages:
            self.do_calculations(mess)

    def work(self):
        while self.running:
            self.handle_mess(self)
            self.do_some_important_stuff()

    def do_some_important_stuff():
        pass

    def do_calculations():
        pass
It seems like you could use the Thread class from the threading module.
You use it by inheriting from it and redefining the run method; calling obj.start() then makes run execute in parallel.
Roughly, your class can be defined like this (I made some corrections to some methods so that it runs):
import threading

class Server(threading.Thread):
    def __init__(self):
        super(Server, self).__init__()
        self.messages = []
        self.running = False

    def run(self):  # renamed from start to run
        self.running = True
        self.work()

    def send_mess(self, message):
        self.messages.append(message)

    def handle_mess(self):
        for mess in self.messages:
            self.do_calculations(mess)

    def work(self):
        while self.running:
            self.handle_mess()
            self.do_some_important_stuff()

    def do_some_important_stuff(self):
        pass

    def do_calculations(self, mess):
        pass

s = Server()
s.start()  # run() is now executing in another thread
s.join()   # wait for it to finish
IMPORTANT: copying @Alfe's comment, which I found extremely useful:
One MUST point out that by entering the world of concurrency (by threading) one opens a nasty can of worms. OP, you really should read a little more about the concurrency problems which occur in parallel environments; otherwise you are bound to end up, sooner or later, with a serious problem you have no clue how to solve. See that you understand Queue.Queue (queue.Queue in Python 3) and the things in threading like Events, Locks and Semaphores, and what they are good for.
Hope this helps!
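To make that comment concrete, here is a hedged sketch of the Server from this question with the shared messages list replaced by a thread-safe queue.Queue (stub method bodies, Python 3 naming):

import queue      # Queue in Python 2
import threading

class Server(threading.Thread):
    def __init__(self):
        super(Server, self).__init__()
        self.messages = queue.Queue()
        self.running = False

    def send_mess(self, message):
        self.messages.put(message)  # safe to call from any thread

    def handle_mess(self):
        # drain everything that has arrived, without blocking
        while True:
            try:
                mess = self.messages.get_nowait()
            except queue.Empty:
                break
            self.do_calculations(mess)

    def run(self):
        self.running = True
        while self.running:
            self.handle_mess()

    def do_calculations(self, mess):
        pass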
An easy way would be:
def start(self):
    self.running = True
    thread = Thread(target=self.work)
    thread.start()
To start just one background thread (another way is to extend the threading.Thread class).
Or:
def work(self):
    while self.running:
        message = self.handle_mess()  # gets a message

        def threaded_part(m):
            self.do_some_important_stuff(m)
            self.do_other_important_stuff(m)

        thread = Thread(target=threaded_part, args=(message,))
        thread.start()
To start a thread for each message you receive. Anyway, a thread pool would probably be better; see the sketch below.
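For that thread-pool variant, the standard concurrent.futures module is a natural fit; a sketch assuming handle_mess returns a single message:

from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=4)

def work(self):
    while self.running:
        message = self.handle_mess()  # gets a message
        # the pool reuses a bounded set of threads instead of
        # spawning a new one per message
        executor.submit(self.do_some_important_stuff, message)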
I have a thread class, and inside it I want to create a thread function that does its job concurrently with the thread instance. Is this possible, and if yes, how?
The run function of the thread class does a job exactly every x seconds. I want to create a thread function to do a job in parallel with the run function.
class Concurrent(threading.Thread):
    def __init__(self, consType, consTemp):
        # something
        pass

    def run(self):
        # make foo run as a thread
        pass

    def foo(self):
        # something
        pass
If not, think about the case below. Is it possible, and how?
class Concurrent(threading.Thread):
    def __init__(self, consType, consTemp):
        # something
        pass

    def run(self):
        # make foo run as a thread
        pass

    def foo():
        # something
        pass
If anything is unclear, please tell me and I will try to re-edit.
Just launch another thread. You already know how to create and start them, so simply write another subclass of Thread and start() it alongside the ones you already have.
Change def foo() into a Thread subclass with run() instead of foo().
First of all, I suggest that you reconsider using threads. In most cases in Python you should use multiprocessing instead, because of Python's GIL (unless you are using Jython or IronPython).
If I understood you correctly, just open another thread inside the thread you already opened:
import threading

class FooThread(threading.Thread):
    def __init__(self, consType, consTemp):
        super(FooThread, self).__init__()
        self.consType = consType
        self.consTemp = consTemp

    def run(self):
        print 'FooThread - I just started'
        # here will be the implementation of the foo function

class Concurrent(threading.Thread):
    def __init__(self, consType, consTemp):
        super(Concurrent, self).__init__()
        self.consType = consType
        self.consTemp = consTemp

    def run(self):
        print 'Concurrent - I just started'
        threadFoo = FooThread('consType', 'consTemp')
        threadFoo.start()
        # do something every X seconds

if __name__ == '__main__':
    thread = Concurrent('consType', 'consTemp')
    thread.start()
The output of the program will be:
Concurrent - I just started
FooThread - I just started