I am writing a framework in Python 3.8 on Debian that will launch some multiprocessing processes. I want the configuration and launching of the processes to be done in functions OTHER than the main. The main file will be written by the end user of the framework and they should not need to know about these processes. Hence I tried to put the code that configures and launches the processes in helper functions or class methods that the main will call.
What I'm finding is as soon as the launcher function / method exits the processes die. This is even though the launcher functions / methods run (I think) in the same process as main which is still running. I have put a long time.sleep in the launcher functions / methods right before they exit and it seems the processes are alive for that long.
I tried setting the 'daemon' flag but that doesn't seem to solve it. If this is truly a limitation of multiprocessing I can instruct the users of my framework to always put some boiler-plate launcher code in their file, but it seems clunky. All help is appreciated!
The function that launches the processes was returning only the processes to main and NOT managed queues that the processes were using. I changed to return a dict with the processes AND the managed queues and everything works. This is even though main does NOT use the queues.
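A minimal sketch of the fix described above, assuming the queues were created with multiprocessing.Manager(). The names launch_workers, stop_workers and _worker are hypothetical and not taken from the original framework. One plausible explanation, which the post does not confirm, is that the manager process is shut down once the manager and its queue proxies are garbage collected, so the launcher has to hand those references back to the caller:

```python
import multiprocessing


def _worker(task_queue, result_queue):
    """Hypothetical worker: doubles numbers until it receives None."""
    for item in iter(task_queue.get, None):
        result_queue.put(item * 2)


def launch_workers(count=2):
    """Start workers and return every handle the caller must keep alive."""
    manager = multiprocessing.Manager()
    tasks, results = manager.Queue(), manager.Queue()
    procs = [
        multiprocessing.Process(target=_worker, args=(tasks, results))
        for _ in range(count)
    ]
    for proc in procs:
        proc.start()
    # Returning the manager and queues keeps references to them alive
    # after this function exits, even if the caller never uses them.
    return {
        "manager": manager,
        "tasks": tasks,
        "results": results,
        "processes": procs,
    }


def stop_workers(handles):
    """Send one sentinel per worker, wait for them, then stop the manager."""
    for _ in handles["processes"]:
        handles["tasks"].put(None)
    for proc in handles["processes"]:
        proc.join()
    handles["manager"].shutdown()


if __name__ == "__main__":
    handles = launch_workers()
    for n in range(4):
        handles["tasks"].put(n)
    print(sorted(handles["results"].get() for _ in range(4)))
    stop_workers(handles)
```

The end user's main only needs to hold on to the returned dict; it never has to touch the queues directly.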
The title doesn't explain much, so here are the details.
I have a function that should run in a separate .py file.
I have a Python file that takes some variables, waits for an event to happen, and then continues until it finishes. You can think of it as an event listener (web socket).
This file produces no output; it just runs some functions and exits when the event finishes. Running one file is no problem, but I want to run more than 10 of them at the same time, for different purposes but for the same event. This causes problems: some of them don't work or miss the event.
I do this by opening 10 terminals (cmd or shell). I think the problem comes from running this much event handling in separate shells, and in the future I might run more than 10 files, maybe 50-100.
So what I tried:
I tried a single file using threading (multi-threading) and it didn't work.
My goal
I want help running as many of these files as I need without missing events or slowing down the system. I am open to ideas.
concurrent.futures could be a good option; it executes a piece of
code in another thread or process. See the documentation:
https://docs.python.org/3/library/concurrent.futures.html.
The threading library that comes with Python lets you start
the same function multiple times in different threads without having
to wait for each one to finish. See the documentation:
https://docs.python.org/3/library/threading.html.
The multiprocessing library offers a similar API that lets you do the
same in multiple processes. Documentation:
https://docs.python.org/3/library/multiprocessing.html. One
difference is that Python threads all live inside a
single interpreter process, where CPython's GIL lets only one of them
run Python code at a time. With multiprocessing you start several
processes, each with its own interpreter, so they are less likely to
slow each other down.
The code you run in a process or a thread has to be in a defined function. It seems your code is in a separate .py file (a module), so you have to import it first (https://docs.python.org/3/tutorial/modules.html). One file then manages the threads or processes in a loop, another holds the code that listens for the event, and only one terminal is needed to start them all.
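The layout described above can be sketched as follows. This is only an illustration under assumptions: listen and run_listeners are hypothetical names, a threading.Event stands in for the real web socket event, and in practice listen would be imported from the separate listener module rather than defined here.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def listen(name, event, timeout=5.0):
    """Hypothetical listener: block until the shared event fires."""
    fired = event.wait(timeout)
    return f"{name}: {'handled' if fired else 'timed out'}"


def run_listeners(count, event):
    """Start `count` listeners in one process and collect their results."""
    # max_workers=count so every listener gets its own thread and none
    # has to wait for another one to finish before it starts listening.
    with ThreadPoolExecutor(max_workers=count) as pool:
        futures = [
            pool.submit(listen, f"listener-{i}", event) for i in range(count)
        ]
        event.set()  # stand-in for the real external event
        return [future.result() for future in futures]


if __name__ == "__main__":
    for line in run_listeners(10, threading.Event()):
        print(line)
```

Swapping ThreadPoolExecutor for ProcessPoolExecutor would move the listeners into separate processes, although a plain threading.Event cannot be shared that way and would need a multiprocessing equivalent.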
You can use multithreading.
On the page below you will find some very useful examples of what you want. I recommend concurrent.futures, because in newer versions of Python 3 you may run into some bugs when using threading directly.
https://realpython.com/intro-to-python-threading/#using-a-threadpoolexecutor
I'm adding python scripting support to an application.
This application has an API which is not thread safe, and I cannot change this aspect.
One requirement I have is being able to run multiple independent scripts, thus I have to run sub-interpreters in separate threads.
Although, due to the GIL in CPython, no more than one thread runs concurrently, whatever thread holds the GIL will still run concurrently with the main thread, and this will cause problems due to the thread-unsafe API of the application.
To summarize: I'm looking for a way to run all python code (__main__, threads, every sub-interpreter) in the main thread.
How can this be solved?
Should the main thread always hold the GIL, and have a function that, in a cooperative-multitasking fashion, would release it and reacquire it x milliseconds later, thus allowing the interpreter to do some work? This doesn't look right: such a function would consume x milliseconds even when Python has no work to do.
I am a bit confused with multiprocessing. I have a video processing script which can be run from the command line or launched from a PySide application using a subprocess call. The script seems to run fine from the command line and basically initializes a pool of workers which each process a separate video file.
When I run the program, however, the OS tells me my program is not responding. I would like to make use of all the cores on my system for multiprocessing, but I would also like to prevent this annoyance. What should I do to get around this? Do I start the initial script in a thread or something?
As you mention PySide, I assume your program is a GUI one. In a GUI program all long-running processing must occur in a worker thread if you want to keep the UI responsive. So yes, the initial script must be started in a thread distinct from the main thread (the main one is reserved for the UI).
So, I've got an application that uses Twisted + Stomper as a STOMP client which farms out work to a multiprocessing.Pool of workers.
This appears to work ok when I just use a python script to fire this up, which (simplified) looks something like this:
# stompclient.py
logging.config.fileConfig(config_path)
logger = logging.getLogger(__name__)
# Add observer to make Twisted log via python
twisted.python.log.PythonLoggingObserver().start()
# initialize the process pool. (child processes get forked off immediately)
pool = multiprocessing.Pool(processes=processes)
StompClientFactory.username = username
StompClientFactory.password = password
StompClientFactory.destination = destination
reactor.connectTCP(host, port, StompClientFactory())
reactor.run()
As this gets packaged for deployment, I thought I would take advantage of the twistd script and run this from a tac file.
Here's my very-similar-looking tac file:
# stompclient.tac
logging.config.fileConfig(config_path)
logger = logging.getLogger(__name__)
# Add observer to make Twisted log via python
twisted.python.log.PythonLoggingObserver().start()
# initialize the process pool. (child processes get forked off immediately)
pool = multiprocessing.Pool(processes=processes)
StompClientFactory.username = username
StompClientFactory.password = password
StompClientFactory.destination = destination
application = service.Application('myapp')
service = internet.TCPClient(host, port, StompClientFactory())
service.setServiceParent(application)
For the sake of illustration, I have collapsed or changed a few details; hopefully they were not the essence of the problem. For example, my app has a plugin system, the pool is initialized by a separate method, and then work is delegated to the pool using pool.apply_async() passing one of my plugin's process() methods.
So, if I run the script (stompclient.py), everything works as expected.
It also appears to work OK if I run twistd in non-daemon mode (-n):
twistd -noy stompclient.tac
however, it does not work when I run in daemon mode:
twistd -oy stompclient.tac
The application appears to start up OK, but when it attempts to fork off work, it just hangs. By "hangs", I mean that it appears that the child process is never asked to do anything and the parent (that called pool.apply_async()) just sits there waiting for the response to return.
I'm sure that I'm doing something stupid with Twisted + multiprocessing, but I'm really hoping that someone can explain to me the flaw in my approach.
Thanks in advance!
Since the difference between your working invocation and your non-working invocation is only the "-n" option, it seems most likely that the problem is caused by the daemonization process (which "-n" prevents from happening).
On POSIX, one of the steps involved in daemonization is forking and having the parent exit. Among other things, this has the consequence of having your code run in a different process than the one in which the .tac file was evaluated. This also re-arranges the child/parent relationship of processes which were started in the .tac file, as your pool of multiprocessing processes was.
The multiprocessing pool's processes start off with the twistd process you start as their parent. However, when that process exits as part of daemonization, their parent becomes the system init process. This may cause some problems, although probably not the hanging problem you described. There are probably other similarly low-level implementation details which normally allow the multiprocessing module to work but which are disrupted by the daemonization process.
Fortunately, avoiding this strange interaction should be straightforward. Twisted's service APIs allow you to run code after daemonization has completed. If you use these APIs, then you can delay the initialization of the multiprocessing module's process pool until after daemonization and hopefully avoid the problem. Here's an example of what that might look like:
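    import multiprocessing

    from twisted.application.service import Service

    class MultiprocessingService(Service):
        def startService(self):
            # Called after daemonization, so the pool's workers are
            # forked from the process that will actually use them.
            Service.startService(self)
            # `processes` is the worker count defined earlier in the .tac file.
            self.pool = multiprocessing.Pool(processes=processes)

    # `application` is the service.Application created in the .tac file.
    MultiprocessingService().setServiceParent(application)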
Now, separately, you may also run into problems relating to clean up of the multiprocessing module's child processes, or possibly issues with processes created with Twisted's process creation API, reactor.spawnProcess. This is because part of dealing with child processes correctly generally involves handling the SIGCHLD signal. Twisted and multiprocessing aren't going to be cooperating in this regard, though, so one of them is going to get notified of all children exiting and the other will never be notified. If you don't use Twisted's API for creating child processes at all, then this may be okay for you - but you might want to check to make sure any signal handler the multiprocessing module tries to install actually "wins" and doesn't get replaced by Twisted's own handler.
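One way to carry out the check suggested above is to ask Python which SIGCHLD handler is currently installed. The sketch below is POSIX only and uses just the standard signal module; describe_sigchld_handler is a hypothetical helper name, and where to call it (for example once the reactor is running and the pool exists) is an assumption about your setup.

```python
import signal


def describe_sigchld_handler():
    """Report what is currently installed for SIGCHLD (POSIX only)."""
    handler = signal.getsignal(signal.SIGCHLD)
    if handler == signal.SIG_DFL:
        return "default"
    if handler == signal.SIG_IGN:
        return "ignored"
    if handler is None:
        # getsignal() returns None when the handler was not set from Python.
        return "installed outside Python"
    return f"python handler: {getattr(handler, '__name__', repr(handler))}"


if __name__ == "__main__":
    print(describe_sigchld_handler())
```

Logging this value at a few points during startup shows whether a handler gets replaced along the way.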
A possible idea for you...
When running in daemon mode, twistd will close stdin, stdout and stderr. Does anything your clients do read from or write to these?
Python has been really bumpy for me: the last time I created a GUI client, it seemed to hang whenever it spawned a process, called a shell script, or called an outside application.
This has been my major problem with Python since then, and now I'm starting a new project. Can someone give me pointers and a word of advice on keeping my Python GUI application interactive while it spawns another process?
Simplest (not necessarily "best" in an abstract sense): spawn the subprocess in a separate thread and communicate results back to the main thread via a Queue.Queue instance (queue.Queue in Python 3). The main thread must periodically check that queue to see whether the results have arrived, but periodic polling isn't hard to arrange in any event loop.
Your main GUI thread will freeze if you spawn off a process and wait for it to complete. Often, you can simply use subprocess and poll it now and then for completion rather than waiting for it to finish. This will keep your GUI from freezing.
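A minimal sketch of the thread-plus-queue approach, written for Python 3 and independent of any GUI toolkit. The names run_in_background and poll_results are hypothetical; in a real application poll_results would be called from a timer callback of the GUI event loop, which is an assumption about how your toolkit schedules periodic work.

```python
import queue
import subprocess
import sys
import threading


def run_in_background(cmd, results):
    """Run `cmd` in a worker thread and post its outcome to `results`."""
    def target():
        done = subprocess.run(cmd, capture_output=True, text=True)
        results.put((done.returncode, done.stdout))

    thread = threading.Thread(target=target, daemon=True)
    thread.start()
    return thread


def poll_results(results):
    """Non-blocking check; returns None while no result is available."""
    try:
        return results.get_nowait()
    except queue.Empty:
        return None


if __name__ == "__main__":
    results = queue.Queue()
    worker = run_in_background([sys.executable, "-c", "print('done')"], results)
    worker.join()  # a real GUI would keep running its event loop instead
    print(poll_results(results))
```

The second suggestion works without a thread at all: start the command with subprocess.Popen and call its poll() method from the same kind of timer, which returns None until the process has exited.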