What does this daemonize method do? - python

I was looking around on GitHub when I stumbled across a method called daemonize() in a reverse shell example (source).
What I don't quite understand is what it does in this context. Wouldn't running this code from the command line like this: python example.py & achieve the same thing?
Daemonize method source:
import os
import sys

def daemonize():
    pid = os.fork()
    if pid > 0:
        sys.exit(0)  # Exit first parent
    pid = os.fork()
    if pid > 0:
        sys.exit(0)  # Exit second parent

A background process, i.e. running python2.7 <file>.py with the & control operator, is not the same thing as a true daemon process.
A true daemon process:
Runs in the background. This also happens with &, and that is where the similarity ends.
Is not in the same process group as the terminal, so when the terminal closes, the daemon does not die with it. This does not happen with & - the process stays attached to its session; it is simply moved to the background.
Properly closes all inherited file descriptors (including stdin, stdout and stderr) so that nothing ties it back to the parent. Again, this does not happen with & - the process will still write to the terminal.
Ideally can only be killed by SIGKILL, not SIGHUP. Running with & leaves your process killable by SIGHUP.
All of this, however, is pedantry. Few tasks really require you to go to the extreme these properties demand - a background task spawned in a new terminal using screen can usually do the same job, though less efficiently, and you may as well call that a daemon in the sense that it is a long-running background task. The only real difference between that and a true daemon is that the latter tries to close off every avenue of potential death.
The code you saw simply forks the current process. Essentially, it clones the current process, kills its parent and 'acts in the background' by simply being a separate process that does not block the current execution - a bit of an ugly hack, if you ask me, but it works.
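For comparison, here is a minimal sketch of a fuller daemonization routine, assuming a POSIX system. This is the textbook double-fork recipe rather than the exact code from the linked example; the os.setsid() call between the forks is what actually detaches the process from its controlling terminal:
import os
import sys

def daemonize():
    if os.fork() > 0:
        sys.exit(0)        # exit the first parent
    os.setsid()            # new session: no controlling terminal
    if os.fork() > 0:
        sys.exit(0)        # exit the second parent; the survivor is not a
                           # session leader, so it can never reacquire a terminal
    os.chdir("/")          # don't keep any mounted filesystem busy
    os.umask(0)
    devnull = os.open(os.devnull, os.O_RDWR)
    for fd in (0, 1, 2):   # point stdin/stdout/stderr at /dev/null
        os.dup2(devnull, fd)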

Have a look at Orphan Processes and Daemon Process. A process without a parent becomes a child of init (pid 1).
When it comes time to shut down a group of processes, say all the children of a bash instance, the OS sends a SIGHUP to the children of that bash. An orphan, whether forced as in this case or created by some accident, won't get that treatment and will stay around longer.

Related

When a parent NodeJS process exits, causing child processes to exit, how is that communicated?

index.js
const childProcess = require("child_process");
childProcess.spawn("python", ["main.py"]);
main.py
import time
while True:
    time.sleep(1)
When running the NodeJS process with node index.js, it runs forever since the Python child process it spawns runs forever.
When the NodeJS process is ended by x-ing out of the Command Prompt, the Python process ends as well, which is desired, but how can you run some cleanup code in the Python code before it exits?
Previous attempts
Looked in the documentation for child_process.spawn for how this termination is communicated from parent to child, perhaps by a signal. Didn't find it.
In Python, used signal.signal(signal.SIGTERM, handler) (and likewise signal.SIGINT); a sketch of this attempt appears after this question. Didn't get handler to run (though ending the NodeJS process with Ctrl-C instead of closing the window did get the SIGINT handler to run, even though I'm not explicitly forwarding input from the NodeJS process to the Python child process).
Lastly, though this reproducible example is valid and much simpler, my real-life use case involves Electron, so in case that introduces a complication (or a solution), I figured I'd mention it.
Windows 10, NodeJS 12.7.0, Python 3.8.3
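For reference, a minimal sketch of the signal-handler attempt described above; the handler body is illustrative, not from the original code:
import signal
import sys
import time

def handler(signum, frame):
    # Hypothetical cleanup; replace with real teardown logic.
    print("cleaning up before exit", flush=True)
    sys.exit(0)

signal.signal(signal.SIGTERM, handler)  # reportedly never fires on window close
signal.signal(signal.SIGINT, handler)   # does fire when the parent is ended with Ctrl-C

while True:
    time.sleep(1)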

Do processes need to be stopped manually

I'm new to multiprocessing in Python, so I'm in doubt. My first idea was to use threads, but then I read about the GIL and moved to multiprocessing.
My question is, when I start a process like this:
from multiprocessing import Process

t1 = Process(target=run, args=lot)
t1.start()
do I need to stop it somehow from the main process, or does it shut down when the run() method finishes?
I know that things like join() exist, but I'm scheduling a job every n minutes, starting a couple of processes in parallel, and this goes on until stopped, so I don't really need to wait for the processes to finish.
Yes: when t1.start() happens, the child process executes the function specified in target (i.e. run). Once that function completes, the process exits automatically.
You can check this by listing the running processes; on Linux, for example:
ps aux | grep python or ps aux | grep "program_name.py"
While your target is running, the count will be higher.
To wait until a process has completed its work and exited, use the join() method, but in your case it is not required.
More examples are here: https://pymotw.com/2/multiprocessing/basics.html
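A minimal sketch illustrating this, assuming a run function like the one below (the names are illustrative):
from multiprocessing import Process
import time

def run(*lot):
    time.sleep(2)          # simulated work
    print("done with", lot)

if __name__ == "__main__":
    t1 = Process(target=run, args=("job-1",))
    t1.start()             # the child runs run() and exits when it returns
    time.sleep(3)
    print(t1.is_alive())   # False: the child already finished on its own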
Well, the GIL is not a big problem when you are not doing much computation but rather networking or file I/O: execution of the program blocks and control flow is given to the kernel until the input/output operation is performed, and another thread can then run in Python.
If, however, you are doing more CPU-consuming work, you actually should go for multiprocessing.
The join() method is used for synchronization, so when the main process relies on data processed by another process it is important to use it; otherwise it is not. Your operating system will handle things like closing child processes in a safe manner.
EDIT: check this discussion for more details.
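For the case where the main process does rely on the child's data, a small sketch of join() together with a Queue (the Queue usage is illustrative, not from the question):
from multiprocessing import Process, Queue

def run(q):
    q.put(42)  # some result the main process needs

if __name__ == "__main__":
    q = Queue()
    p = Process(target=run, args=(q,))
    p.start()
    result = q.get()  # blocks until the child has produced the data
    p.join()          # join matters here: main relies on the child's work
    print(result)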

Windows processes; how to find orphans from known dead parents?

This question may be slightly academic; but yet something puzzling.
Assume that on a Windows (8+) machine you have a process (in this case a service), say proc_0. I can send a request to it to run something specific. This is done by proc_0 starting (say) proc_a. proc_a then spawns proc_b, which in turn may spawn proc_c, which again may spawn proc_d.
We'll end up in this process tree:
proc_0
|_proc_a
|____proc_b
|______proc_c
|________proc_d
Let's say I have influence only over what happens in proc_b.
OK, what we want is: if proc_a dies, every child of proc_a should die too.
The problem arises if proc_0 kills proc_a. On Windows this orphans proc_b, and it (and its children) stays alive.
Now, proc_b, the one and only process whose behaviour I control, is actually watching what happens, i.e. if proc_a dies, it will kill its children. Fine. proc_b may know about proc_c's child proc_d, so it kills that, and then kills proc_c.
But here it gets theoretical... before proc_c actually gets killed by proc_b, proc_c spawns another child, proc_z. proc_b now has no idea of proc_z, which will be orphaned when proc_c eventually dies (remember, I cannot control how proc_c behaves, or whether it gets killed the hard way for some other reason). Now the processes look like this:
proc_0
proc_b (orphaned, but I know)
proc_z (orphaned, but I dunno)
proc_b I can terminate myself, but is there any way to detect that proc_z was actually started by a process I knew about and just killed (i.e. that it should die too)?
OK, I can find the orphaned proc_z, but how can I decide whether it is OK to kill? proc_z will have no parent process, but it might have an old parent PID. Even if I knew that this PID was the one I had from proc_c, I have no way to tell whether proc_z was actually started by proc_c or by a completely different process that also died, where the OS just happened to reuse the PID of my dead proc_c. (A sketch of one heuristic appears after this question.)
To relate this to reality: this is roughly what I see on a Windows Jenkins slave, which is what caused me to insert proc_b, because cancelling a job leaves subprocesses running. There might be an update to Jenkins on this, but the "theoretical" problem - deciding whether to kill this orphan or not - is still in play, I think.
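There is no bulletproof answer, but here is a sketch of one heuristic, using the third-party psutil library (an assumption; the question does not mention it). It compares the candidate's recorded parent PID with creation times: a real child must have been created after its parent, which narrows, though does not fully close, the PID-reuse window:
import psutil  # third-party: pip install psutil

def find_orphans_of(dead_parent_pid, dead_parent_create_time):
    # Return processes whose recorded parent is the PID we just killed.
    # A candidate created before the dead parent cannot be its child; it
    # was spawned by an earlier process that merely shared the same PID.
    orphans = []
    for proc in psutil.process_iter(["pid", "ppid", "create_time"]):
        info = proc.info
        if (info["ppid"] == dead_parent_pid
                and info["create_time"] >= dead_parent_create_time):
            orphans.append(proc)
    return orphans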

Python Multiprocessing respawn crashed processes

I want to create some worker processes and, if they crash due to an exception, I would like them to respawn. Aside from the is_alive method in the multiprocessing module, I can't seem to find a way to do this.
This would require me to iterate over all the running processes (after a sleep) and check if they are alive. This is essentially a busy loop; I was wondering if there was a better solution that would wake up my program in the event that any one of my worker processes crashes. Once it wakes up, I would like to log the exception that crashed my worker and launch another process.
Polling to see if the child processes are alive should work fine, since it's a low-overhead check and you don't need to check that often.
The first answer to this (similar) question has a Python code example: Multi-server monitor/auto restarter in python
You can wrap your worker code in try/except blocks, where the except pushes a message onto a pipe before re-raising. Of course, polling isn't really worse than this, and it's simpler.
If you're on a unix-like system, your main program can be notified of dead children by installing a signal handler. Look up your operating system's documentation on signal(), especially SIGCHLD. I'm afraid I don't remember whether Windows covers SIGCHLD with its very limited POSIX signal support.
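A minimal sketch of the polling approach, assuming worker is the function you want supervised (names and intervals are illustrative):
import time
from multiprocessing import Process

def worker():
    # Hypothetical worker body; it may return or raise on its own schedule.
    time.sleep(10)

def supervise(num_workers=4, poll_interval=5):
    workers = [Process(target=worker) for _ in range(num_workers)]
    for p in workers:
        p.start()
    while True:
        time.sleep(poll_interval)
        for i, p in enumerate(workers):
            if not p.is_alive():
                # A non-zero exitcode means the child died with an
                # unhandled exception or a signal: log it and respawn.
                print("worker %d exited with code %s; respawning" % (i, p.exitcode))
                workers[i] = Process(target=worker)
                workers[i].start()

if __name__ == "__main__":
    supervise()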

How to auto-restart a python script on fail?

This post describes how to keep a child process alive in a BASH script:
How do I write a bash script to restart a process if it dies?
This worked great for calling another BASH script.
However, I tried executing something similar where the child process is a Python script, daemon.py, which forks a child process that runs in the background:
#!/bin/bash
PYTHON=/usr/bin/python2.6
function myprocess {
    $PYTHON daemon.py start
}
NOW=$(date +"%b-%d-%y")
until myprocess; do
    echo "$NOW Prog crashed. Restarting..." >> error.txt
    sleep 1
done
Now the behaviour is completely different. It seems the Python script is no longer a child of the bash script but seems to have 'taken over' the bash script's PID, so there is no longer a bash wrapper around the called script... why?
A daemon process double-forks; that double fork is the key point of daemonizing itself. So the PID that the parent process holds is of no value: that process goes away very soon after the child process starts.
Therefore, a daemon process should write its PID to a file in a "well-known location" where by convention the parent process knows where to read it from; with this (traditional) approach, the parent process, if it wants to act as a restarting watchdog, can simply read the daemon process's PID from the well-known location and periodically check if the daemon is still alive, and restart it when needed.
It takes some care in execution, of course (a "stale" PID will stay in the "well known location" file for a while and the parent must take that into account), and there are possible variants (the daemon could emit a "heartbeat" so that the parent can detect not just dead daemons, but also ones that are "stuck forever", e.g. due to a deadlock, since they stop giving their "heartbeat" [[via UDP broadcast or the like]] -- etc etc), but that's the general idea.
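A minimal sketch of such a watchdog, assuming a POSIX system; the pidfile path and the restart callback are illustrative:
import os
import time

PIDFILE = "/var/run/mydaemon.pid"  # hypothetical "well-known location"

def daemon_is_alive():
    try:
        with open(PIDFILE) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        return False          # no pidfile, or a garbled one: treat as dead
    try:
        os.kill(pid, 0)       # signal 0 checks existence without killing
        return True
    except ProcessLookupError:
        return False          # stale pidfile: the PID is gone
    except PermissionError:
        return True           # the PID exists but belongs to another user

def watchdog(restart, interval=10):
    while True:
        if not daemon_is_alive():
            restart()         # caller-supplied callable that relaunches the daemon
        time.sleep(interval)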
You should look at Python Enhancement Proposal 3143 (PEP 3143) here. In it, Ben Finney suggests including a daemon library in the Python standard library. He goes over LOTS of very good information about daemons, and it is a pretty easy read. The reference implementation is here.
It seems that the behavior is completely different because here your daemon.py is launched in the background as a daemon.
In the other link you pointed to, the process being watched is not a daemon; it does not start in the background. The launcher simply waits forever for the child process to stop.
There are several ways to overcome this. The classical one is the way @Alex explains, using a PID file in a conventional place.
Another way could be to build the watchdog inside your running daemon and daemonize the watchdog... this would simulate a correct process that does not break at random (something that shouldn't occur anyway)...
Make use of https://github.com/ut0mt8/simple-ha.
simple-ha
Tired of keepalived, corosync, pacemaker, heartbeat or whatever? Here is a simple daemon which ensures a heartbeat between two hosts. One is active and the other is backup, launching a script when changing state. Simple implementation, KISS. Production ready (at least it works for me :)
Life will be easy!
