I have a python script which attempts to communicate with a python daemon. When the original script is invoked, it checks to see if the daemon exists. If the daemon exists, the original script writes to a named pipe to communicate with the daemon. If the daemon does not exist, the original script attempts to create a daemon using DaemonContext and then writes to the named pipe.
Pseudo-code of the original script:
from daemon import DaemonContext

if daemon_exists():
    pass
else:
    with DaemonContext():
        create_daemon()

communicate_with_daemon()
The problem is that when the daemon is created, the parent process is killed (i.e. communicate_with_daemon will never be executed). This prevents the original script from creating a daemon and communicating with it.
According to this answer, this problem is a limitation of the python-daemon library. How would I get around this?
Thanks.
What you're describing is not a limitation, but the definition of how a daemon process works.
[…] the parent process is killed (i.e. communicate_with_daemon will never be executed).
Yes, that's right; the daemon process detaches from what started it. That's what makes the process a daemon.
However, this statement is not true:
This prevents the original script from creating a daemon and communicating with it.
There are numerous other ways to communicate between processes. The general name for this is Inter-Process Communication. The solutions are many, and which you choose depends on the constraints of your application.
For example, you could open a socket at a known path and preserve that open file; you could open a network port and communicate through the loopback interface; you could arrange a "drop box" for communication through a file on the local filesystem, a database, or the like; etc.
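To make the socket option concrete, here is a minimal sketch of a Unix-domain-socket version of that daemon/script pair; the socket path, message format, and function names are invented for illustration and are not taken from the original code:

import os
import socket

SOCK_PATH = "/tmp/mydaemon.sock"   # hypothetical well-known path

# Daemon side: bind a Unix domain socket and service requests.
def daemon_loop():
    if os.path.exists(SOCK_PATH):
        os.unlink(SOCK_PATH)
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(SOCK_PATH)
    server.listen(1)
    while True:
        conn, _ = server.accept()
        request = conn.recv(4096)
        conn.sendall(b"ack: " + request)
        conn.close()

# Client side: any later invocation of the script connects and exchanges one message,
# regardless of which process originally started the daemon.
def communicate_with_daemon(message):
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client.connect(SOCK_PATH)
    client.sendall(message)
    reply = client.recv(4096)
    client.close()
    return reply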
Related
I am writing a web application in Python (Django) that will execute tasks/processes on the side, typically network scans. I would like the user to be able to terminate a scan and view its status or results in real time.
I thought one of the best ways to do this is to have a job manager daemon that is a stand-alone process, which:
Accepts new jobs via a TCP connection.
Accepts user-commands, typically to terminate or restart a process.
Reports on the status of a job.
I am struggling with the structure of this code. I am thinking that a TCP port on the daemon process will accept new jobs. It will then call os.fork(), and that child will itself call os.fork(). The second fork will perform an os.execv() for nmap. The first fork will monitor the second fork (how?) and, when it completes, report back to the master daemon that it has ended. The first fork must also be able to terminate the second child process.
How does that sound? Has a structure like this already been done? I would hate to reinvent the wheel.
Finally, how would the first child know that the second child, the one running os.execv(), has terminated? Or whether it's still running? I would hate to continuously poll a list of processes.
And as I've said, this must be done in Python.
I opted for a fork-based approach. This approach is "wrong", but it works and fulfills my needs.
https://gist.github.com/FarhansCode/a0f27469142b6afaa6c2
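The gist itself isn't reproduced here, but purely as an illustration of the fork/exec/wait structure described in the question (not the actual contents of that gist), a sketch might look like this; report_to_master is a hypothetical callback for notifying the master daemon:

import os
import signal

def run_job(argv):
    # First fork: a monitor that supervises the scanner.
    monitor_pid = os.fork()
    if monitor_pid == 0:
        # Second fork: the child that becomes nmap.
        worker_pid = os.fork()
        if worker_pid == 0:
            os.execv(argv[0], argv)           # replaces this process with nmap
        # The monitor blocks here until the worker exits (no polling needed).
        _, status = os.waitpid(worker_pid, 0)
        report_to_master(worker_pid, status)  # hypothetical "job finished" callback
        os._exit(0)
    return monitor_pid                        # the master daemon keeps this PID

def terminate_job(worker_pid):
    # Terminating a scan is just a signal to the worker process.
    os.kill(worker_pid, signal.SIGTERM)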
I have two scripts: "autorun.py" and "main.py". I added "autorun.py" as a service to the autorun in my Linux system. It works perfectly!
Now my question: "main.py" runs forever, so when I launch it from my autorun script, "autorun.py" never terminates either! So when I do
sudo service autorun-test start
the command also never finishes!
How can I run "main.py" and then exit? And, to finish it up, how can I then stop "main.py" when "autorun.py" is launched with the parameter "stop"? (This is how all other services work, I think.)
EDIT:
Solution:
import os
import sys
import daemon

if sys.argv[1] == "start":
    print "Starting..."
    # Daemonize, then run main.py inside the detached process.
    with daemon.DaemonContext(working_directory="/home/pi/python"):
        execfile("main.py")
else:
    # "stop": read the PID that main.py recorded and kill that process.
    pid = int(open("/home/pi/python/main.pid").read())
    try:
        os.kill(pid, 9)
        print "Stopped!"
    except OSError:
        print "No process with PID " + str(pid)
First, if you're trying to create a system daemon, you almost certainly want to follow PEP 3143, and you almost certainly want to use the daemon module to do that for you.
When I want to launch "main.py" from my autorun script, and "main.py" will run forever, "autorun.py" never terminates as well!
You didn't say how you're running it. If you're doing anything that launches main.py as a child and waits (or, worse, tries to import/execfile/etc. in the same process), you can't do that. Either autorun.py has to launch and detach main.py (or do so indirectly via some external tool), or main.py has to daemonize when launched.
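For the first option, launching and detaching, one possible sketch is below; the interpreter path is a guess, the directory is the one from the solution above, and this is not a full daemonization, it only puts main.py in its own session so autorun.py can return immediately:

import os
import subprocess

# Launch main.py detached: it gets its own session, so autorun.py
# (and the service command that invoked it) can return right away.
subprocess.Popen(
    ["/usr/bin/python", "/home/pi/python/main.py"],
    cwd="/home/pi/python",
    preexec_fn=os.setsid,   # detach from autorun.py's session
    close_fds=True,
)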
how can I then stop "main.py" when "autorun.py" is launched with the parameter "stop" ?
You need some form of inter-process communication (IPC), and some way for autorun to find the right IPC channel to use.
If you're building a network server, the right answer might be to connect to it as a client. But otherwise, the simplest thing to do is kill the process with a signal.
If you're using the daemon module, it can easily map signals to callbacks. Or, if you don't need any cleanup, just use SIGTERM, which by default will abruptly terminate. If neither of those applies, you will have to set up a custom signal handler (and within that handler do something useful—e.g., set a flag that your main code checks periodically).
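With the daemon module, that mapping is passed to DaemonContext as signal_map. A minimal sketch building on the solution above, where cleanup() stands in for whatever shutdown work the daemon needs:

import signal
import daemon

def shutdown(signum, frame):
    # Flush/close whatever needs cleanup, then exit.
    cleanup()                      # hypothetical cleanup routine
    raise SystemExit(0)

with daemon.DaemonContext(working_directory="/home/pi/python",
                          signal_map={signal.SIGTERM: shutdown}):
    execfile("main.py")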
How do you know what process to send the signal to? The standard way to do this is to have main.py record its PID in a pidfile at startup. You read that pidfile, and signal whatever process is specified there. (If you get an error because there is no process with that PID, that just means the daemon already quit for some reason—possibly because of an unhandled exception, or even a segfault. You may want to log that, but treat the "stop" as successful otherwise.) Again, if you're using daemon, it does the pidfile stuff for you; if not, you have to do it yourself.
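The corresponding "stop" logic, sketched without any library help; the pidfile path is the one from the solution above, and SIGTERM is used instead of 9 so a handler like the one sketched earlier gets a chance to run:

import errno
import os
import signal

def stop(pidfile="/home/pi/python/main.pid"):
    pid = int(open(pidfile).read().strip())
    try:
        os.kill(pid, signal.SIGTERM)   # graceful: lets any SIGTERM handler run
    except OSError as e:
        if e.errno == errno.ESRCH:
            # No such process: the daemon already quit; treat "stop" as successful.
            pass
        else:
            raise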
You may want to take a look at the service scripts for daemons that came with your computer. They're probably all written in bash rather than Python, but it's not that hard to figure out what they're doing. Or… just use one of them as a skeleton, in which case you don't really need any bash knowledge; it's just search-and-replace on the name.
If your distro has LSB-style init functions, you can use something like this example. That one does a whole lot more than you need to, but it's a good example of all of the details. Or do it all from scratch with something like this example. This one is doing the pidfile management and the backgrounding from the service script (turning a non-daemon program into a daemon), which you don't need if you're using daemon properly, and it's using SIGHUP instead of SIGTERM. You can google yourself for other examples of init.d service scripts.
But again, if you're just trying to do this for your own system, the best thing to do is look inside the /etc/init.d on your distro. There will be dozens of examples there, and 90% of them will be exactly the same except for the name of the daemon.
I'm using the Jython 2.5.1 implementation of Python to write a script that repeatedly invokes another process via subprocess.Popen and uses PIPE to pipe stdout and stderr to the parent process and stdin to the child process. After several hundred loop iterations, I seem to run out of file descriptors.
The Python subprocess documentation mentions very little about freeing file descriptors, other than the close_fds option, which isn't described very clearly (Why should there be any file descriptors besides 0, 1 and 2 open in the first place?). I'm assuming that in CPython, reference counting takes care of the resource freeing issue. What's the proper way to make sure all descriptors get freed when one is done with a Popen object in Jython?
Edit: Just in case it makes a difference, this is a multithreaded program, so there are several Popen processes running simultaneously.
This only answers part of your question, but my understanding is that, when you spawn a new process, it normally inherits all the handles of the parent process. That includes such things as open files and sockets that you're listening on.
On UNIX, that's a side-effect of using 'fork', which duplicates the current process and all of its handles before loading the new executable. On Windows it's more explicit, but Python does it anyway, to try to match the behavior across platforms as much as possible.
The close_fds option, when True, closes all these inherited handles after spawning the subprocess, so the new executable starts with a clean slate. But if your subprocesses are run one at a time, and terminate when they're done, then this shouldn't be the problem.
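In practice, for a loop like yours, that means passing close_fds=True and making sure each Popen's pipes are closed and the child is reaped before the next iteration, rather than leaving it to Jython's garbage collector. A rough sketch, with cmd and payload as placeholders:

import subprocess

def run_once(cmd, payload):
    p = subprocess.Popen(cmd,
                         stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE,
                         close_fds=True)   # don't hand our other descriptors to the child
    # communicate() writes stdin, drains stdout/stderr, closes all three pipes,
    # and waits for the child, so nothing is left for the GC to clean up.
    out, err = p.communicate(payload)
    return p.returncode, out, err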
This post describes how to keep a child process alive in a BASH script:
How do I write a bash script to restart a process if it dies?
This worked great for calling another BASH script.
However, I tried executing something similar where the child process is a Python script, daemon.py, which creates a forked child process that runs in the background:
#!/bin/bash
PYTHON=/usr/bin/python2.6

function myprocess {
    $PYTHON daemon.py start
}

NOW=$(date +"%b-%d-%y")

until myprocess; do
    echo "$NOW Prog crashed. Restarting..." >> error.txt
    sleep 1
done
Now the behaviour is completely different. It seems the Python script is no longer a child of the bash script but seems to have 'taken over' the bash script's PID - so there is no longer a bash wrapper around the called script... why?
A daemon process double-forks; that is the key point of daemonizing itself -- so the PID that the parent process sees is of no value (that process goes away very soon after the child process starts).
Therefore, a daemon process should write its PID to a file in a "well-known location" where by convention the parent process knows where to read it from; with this (traditional) approach, the parent process, if it wants to act as a restarting watchdog, can simply read the daemon process's PID from the well-known location and periodically check if the daemon is still alive, and restart it when needed.
It takes some care in execution, of course (a "stale" PID will stay in the "well-known location" file for a while, and the parent must take that into account), and there are possible variants (the daemon could emit a "heartbeat" -- via UDP broadcast or the like -- so that the parent can detect not just dead daemons, but also ones that are "stuck forever", e.g. due to a deadlock, since they stop sending their heartbeat), but that's the general idea.
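A minimal sketch of that watchdog side; the pidfile path and restart command are placeholders, and signal 0 delivers nothing, it only checks that the process still exists:

import os
import time

PIDFILE = "/var/run/mydaemon.pid"            # hypothetical well-known location

def daemon_alive():
    try:
        pid = int(open(PIDFILE).read().strip())
    except (IOError, ValueError):
        return False                         # no pidfile yet, or a stale/garbled one
    try:
        os.kill(pid, 0)                      # existence check only, delivers nothing
        return True
    except OSError:
        return False                         # no process with that PID

while True:
    if not daemon_alive():
        os.system("python daemon.py start")  # placeholder restart command
    time.sleep(5)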
You should look at Python Enhancement Proposal 3143 (PEP 3143). In it, Ben Finney suggests including a daemon library in the Python standard library. He goes over lots of very good information about daemons and it is a pretty easy read. The reference implementation is here.
It seems that the behavior is completely different because here your "daemon.py" is launched in the background as a daemon.
In the other link you pointed to, the process being watched is not a daemon; it does not start in the background. The launcher simply waits forever for the child process to stop.
There are several ways to overcome this. The classical one is the way @Alex explains, using a pid file in a conventional place.
Another way could be to build the watchdog inside your running daemon and daemonize the watchdog... this would simulate a well-behaved process that does not break at random (something that shouldn't occur anyway)...
Make use of https://github.com/ut0mt8/simple-ha .
simple-ha
Tired of keepalived, corosync, pacemaker, heartbeat or whatever? Here is a simple daemon which ensures a heartbeat between two hosts. One is active and the other is backup, launching a script when changing state. Simple implementation, KISS. Production ready (at least it works for me :)
Life will be too easy!
So, I've got an application that uses Twisted + Stomper as a STOMP client which farms out work to a multiprocessing.Pool of workers.
This appears to work ok when I just use a python script to fire this up, which (simplified) looks something like this:
# stompclient.py
import logging.config
import multiprocessing
import twisted.python.log
from twisted.internet import reactor

# (config_path, connection settings, and StompClientFactory come from elsewhere in the real app)
logging.config.fileConfig(config_path)
logger = logging.getLogger(__name__)

# Add observer to make Twisted log via python
twisted.python.log.PythonLoggingObserver().start()

# initialize the process pool. (child processes get forked off immediately)
pool = multiprocessing.Pool(processes=processes)

StompClientFactory.username = username
StompClientFactory.password = password
StompClientFactory.destination = destination

reactor.connectTCP(host, port, StompClientFactory())
reactor.run()
As this gets packaged for deployment, I thought I would take advantage of the twistd script and run this from a tac file.
Here's my very-similar-looking tac file:
# stompclient.tac
import logging.config
import multiprocessing
import twisted.python.log
from twisted.application import internet, service

# (config_path, connection settings, and StompClientFactory come from elsewhere in the real app)
logging.config.fileConfig(config_path)
logger = logging.getLogger(__name__)

# Add observer to make Twisted log via python
twisted.python.log.PythonLoggingObserver().start()

# initialize the process pool. (child processes get forked off immediately)
pool = multiprocessing.Pool(processes=processes)

StompClientFactory.username = username
StompClientFactory.password = password
StompClientFactory.destination = destination

application = service.Application('myapp')
service = internet.TCPClient(host, port, StompClientFactory())
service.setServiceParent(application)
For the sake of illustration, I have collapsed or changed a few details; hopefully they were not the essence of the problem. For example, my app has a plugin system, the pool is initialized by a separate method, and then work is delegated to the pool using pool.apply_async() passing one of my plugin's process() methods.
So, if I run the script (stompclient.py), everything works as expected.
It also appears to work OK if I run twistd in non-daemon mode (-n):
twistd -noy stompclient.tac
however, it does not work when I run in daemon mode:
twistd -oy stompclient.tac
The application appears to start up OK, but when it attempts to fork off work, it just hangs. By "hangs", I mean that it appears that the child process is never asked to do anything and the parent (that called pool.apply_async()) just sits there waiting for the response to return.
I'm sure that I'm doing something stupid with Twisted + multiprocessing, but I'm really hoping that someone can explain to me the flaw in my approach.
Thanks in advance!
Since the difference between your working invocation and your non-working invocation is only the "-n" option, it seems most likely that the problem is caused by the daemonization process (which "-n" prevents from happening).
On POSIX, one of the steps involved in daemonization is forking and having the parent exit. Among other things, this has the consequence of having your code run in a different process than the one in which the .tac file was evaluated. This also re-arranges the child/parent relationship of processes which were started in the .tac file - as your pool of multiprocessing processes were.
The multiprocessing pool's processes start off with the twistd process you start as their parent. However, when that process exits as part of daemonization, their parent becomes the system init process. This may cause some problems, although probably not the hanging problem you described. There are probably other similarly low-level implementation details which normally allow the multiprocessing module to work but which are disrupted by the daemonization process.
Fortunately, avoiding this strange interaction should be straightforward. Twisted's service APIs allow you to run code after daemonization has completed. If you use these APIs, then you can delay the initialization of the multiprocessing module's process pool until after daemonization and hopefully avoid the problem. Here's an example of what that might look like:
from twisted.application.service import Service

class MultiprocessingService(Service):
    def startService(self):
        self.pool = multiprocessing.Pool(processes=processes)

MultiprocessingService().setServiceParent(application)
Now, separately, you may also run into problems relating to clean up of the multiprocessing module's child processes, or possibly issues with processes created with Twisted's process creation API, reactor.spawnProcess. This is because part of dealing with child processes correctly generally involves handling the SIGCHLD signal. Twisted and multiprocessing aren't going to be cooperating in this regard, though, so one of them is going to get notified of all children exiting and the other will never be notified. If you don't use Twisted's API for creating child processes at all, then this may be okay for you - but you might want to check to make sure any signal handler the multiprocessing module tries to install actually "wins" and doesn't get replaced by Twisted's own handler.
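One cheap way to check who "wins" is to log the installed handler once everything has started, for example from a startService method like the one above; signal.getsignal only inspects, it does not change anything:

import logging
import signal

# Report which SIGCHLD handler survived startup
# (Twisted's, multiprocessing's, or the default).
logging.getLogger(__name__).info(
    "SIGCHLD handler after startup: %r", signal.getsignal(signal.SIGCHLD))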
A possible idea for you...
When running in daemon mode, twistd will close stdin, stdout, and stderr. Does anything your client code does read from or write to these?