Daemonising a Python project that uses Twisted - python

If we use PEP 3143 and its reference implementation http://pypi.python.org/pypi/python-daemon
then it seems impossible to have Twisted working, since during daemonisation ALL possible file descriptors are explicitly closed, which includes pipes.
When Twisted calls os.pipe() and then writes to it, it gets a bad file descriptor.
As I see it, daemonising as this PEP prescribes it is simply not suited for networking?
And that's probably the reason why twistd exists.
Edit:
I should point out that the question is more "Why does the PEP effectively make it impossible to create a network application?" rather than "How do I do it?".
Twisted breaks these rules in order to work.

It doesn't close all the open file descriptors: just the ones not in the files_preserve attribute. You could probably coerce this to work by figuring out the FD of the waker and all open sockets in the reactor and then passing that to files_preserve... but why bother? Just use twistd and have twisted daemonize itself.
Better yet, use twistd -n and let your process get monitored by some other system tool, and don't bother with daemonization at all.
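For illustration, a minimal .tac file that twistd can daemonize for you (the echo protocol here is only a placeholder):

# service.tac -- run with `twistd -y service.tac`, or `twistd -ny service.tac`
# to stay in the foreground under a process supervisor.
from twisted.application import internet, service
from twisted.internet import protocol

class Echo(protocol.Protocol):
    def dataReceived(self, data):
        self.transport.write(data)

application = service.Application("echo")
internet.TCPServer(8000, protocol.Factory.forProtocol(Echo)).setServiceParent(application)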

Feel free to use this daemon http://www.jejik.com/articles/2007/02/a_simple_unix_linux_daemon_in_python/
For how to mix it with Twisted, see here:
http://michael-xiii.blogspot.com/2011/10/twisted.html (warning: the text is in Russian, but the Python code speaks for itself)

supervisord + upstart

The practice of closing all open file descriptors is a consequence of the possibility that the daemonizing process inherits some open files from the parent process. For example, you can open dozens of files in one process (with, say, os.open()) and then invoke a sub-process that inherits them. As a subprocess, you probably don't have an easy way to know which of the inherited file descriptors are useful (unless you pass that along with command line arguments), and you certainly don't want stdin, stdout or stderr, so it's perfectly reasonable to close all open files before doing anything else.
Then a daemonizing process will take some additional steps to become a daemon (as laid out in the PEP).
Once the process is fully detached from any kind of terminal, it can start opening files and connections as it needs. It'll open its log files, its configuration files, and its network connections.
Others have mentioned that twisted, via the twistd tool, already does a pretty good job of all of this, and you don't need to use an extra module. If you don't want to use twistd (for some reason) but you do want to use twisted, you could use something external, but you should daemonize first, then import twisted and the rest of your application code, and open network connections last.
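A rough sketch of that ordering, assuming python-daemon is the external daemonizer (the myapp module and build_service factory below are hypothetical):

import daemon

# Detach first, while no Twisted pipes or sockets exist yet...
with daemon.DaemonContext():
    # ...then import Twisted and the application code, and open connections last.
    from twisted.internet import reactor
    from myapp import build_service   # hypothetical application factory
    build_service(reactor)            # open log files, config, sockets here
    reactor.run()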

Related

Python 3 Sockets - Can I keep a socket open while stopping and re-running a program?

I've been scratching my head trying to figure out if this is possible.
I have a server program running with about 30 different socket connections to it from all over the country. I need to update this server program now, and although the client devices will automatically reconnect, that's not totally reliable.
I was wondering if there is a way of saving the socket object to a file, then loading it back up when the server restarts? Or forcefully keeping a socket open even after the program stops? That way the clients would never disconnect at all.
Could really do with hot-swappable code here!
Solution 1.
It can be done with some process magic, at least under Linux (although I do believe a similar Windows API exists). First of all, note that sockets cannot be stored in a file. These objects are temporary by their nature. But you can keep them alive in a separate process. Have a look at this:
Can I open a socket and pass it to another process in Linux
So one way to accomplish this is the following:
Create a "keeper" process at some point (make sure that the process is not a child of the main process so that it stays alive when the main process is gone)
Send all sockets to the keeper process via sendmsg() with SCM_RIGHTS (see the sketch after these steps)
Shutdown the main process
Do whatever update you have to
Start the main process again
Retrieve sockets from the keeper process
Shutdown the keeper process
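A compressed illustration of steps 2 and 6, using the Python 3.9+ convenience wrappers around sendmsg()/SCM_RIGHTS. In the real setup the two ends would live in separate processes joined by a Unix-domain socket; the in-process socketpair here is just to show the mechanics:

import socket

# Both "ends" in one script, purely for demonstration.
main_end, keeper_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

# The socket we want to survive the restart (placeholder: any listening socket).
precious = socket.create_server(("127.0.0.1", 0))

# "Main process": ship the descriptor to the keeper.
socket.send_fds(main_end, [b"listener"], [precious.fileno()])

# "Keeper process": receive it and rebuild a socket object around the same fd.
msg, fds, flags, addr = socket.recv_fds(keeper_end, 1024, maxfds=1)
rescued = socket.socket(fileno=fds[0])   # same underlying OS socket
print(msg, rescued.getsockname())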
However, this solution is quite difficult to maintain. You have two separate processes, and it is unclear which is the master and which is the slave, so you would probably need yet another master process on top. Things get nasty very quickly, not to mention security issues.
Solution 2.
Reloading modules, as suggested by @gavinb, might be a solution. Note however that in practice this often breaks the app. You never know what those modules do under the hood unless you know the code of every single Python file you use. Plus it imposes some restrictions on modules, i.e. they have to be reloadable. For example, some modules use inline caching, which makes reloading difficult.
Also, once a module is loaded in another module, that module keeps a reference to it. So you not only have to reload it, but also update the references in every other module that loaded it earlier. The maintenance costs rise very quickly unless you thought about it at the beginning of the project (so that every import is encapsulated for easy reload). And bugs caused by two different versions of a module running in the same process are (I imagine; I've never been in this situation) extremely difficult to find.
Anyway, I would avoid that.
Solution 3.
So this is an XY problem. Instead of saving sockets, how about you put a proxy in front of the main server? IMO this is the safest and at the same time simplest solution. The proxy would communicate with the main server (for example over Unix domain sockets), buffer the data, and automatically reconnect to the main server once it is available again. Perhaps you could even reuse some existing tech, e.g. nginx.
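As a very rough sketch of the proxy idea (the backend socket path and port number are made up, and unlike a production proxy this one only retries at connect time and does no buffering):

import asyncio

BACKEND = "/tmp/main-server.sock"   # hypothetical Unix socket of the real server

async def pump(src, dst):
    # Copy bytes in one direction until EOF.
    while data := await src.read(4096):
        dst.write(data)
        await dst.drain()
    dst.close()

async def handle_client(reader, writer):
    # Keep retrying the backend so the client's TCP connection survives
    # a restart of the main server.
    while True:
        try:
            b_reader, b_writer = await asyncio.open_unix_connection(BACKEND)
            break
        except OSError:
            await asyncio.sleep(1)
    await asyncio.gather(pump(reader, b_writer), pump(b_reader, writer))

async def main():
    server = await asyncio.start_server(handle_client, "0.0.0.0", 9000)
    async with server:
        await server.serve_forever()

asyncio.run(main())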
No, the sockets are special file handles that belong to the process. If you close the process, the runtime will force close any open files/sockets. This is not Python specific; it is just how operating systems manage resources.
Now what you can do however is dynamically reload one or more modules while keeping the process active. It might take some careful management when you have open sockets, but in theory it should be possible. So yes, hot swappable code is actually supported by Python.
Do some reading and research on "dynamic reloading". The importlib module in Python 3 provides the reload function which is used to:
Reload a previously imported module. The argument must be a module object, so it must have been successfully imported before. This is useful if you have edited the module source file using an external editor and want to try out the new version without leaving the Python interpreter.
I think your critical question is how to hot reload.
And as mentioned by @gavinb, you can import importlib and then use importlib.reload(module) to reload a module dynamically.
Be careful: the argument to reload() must be a module object.
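A minimal sketch of what that looks like (mymodule is a hypothetical module of your own):

import importlib
import mymodule              # hypothetical module you want to hot-swap

# ... edit mymodule.py on disk while the process keeps running ...

mymodule = importlib.reload(mymodule)   # re-executes the module's code
# Caveat: objects created from the old module (instances, references held by
# other modules) still point at the old code; only fresh lookups see the new one.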

How to do local IPC without leaking handles (cross platform)?

How can I initiate IPC with a child process without letting it inherit all handles? To make it more interesting, this should work on Windows as well as Unix.
The background: I am writing a library that interfaces with a 3rd-party shared library (let's just call it IT) which in turn contains global data (that really should be objects!). I want to have multiple instances of this global data. As far as I understand, I have two options to solve this:
create a cython module that links against a static variant of IT, then copy and import the module whenever I want a new instance. Analogously, I could copy IT but that's even more work to create a ctypes interface.
spawn a subprocess that loads IT and establish an IPC connection to it.
There are a few reasons to use (2):
I am not sure if (1) is reliable in any way, and it feels like a bad idea (what happens with all the extra modules when the application exits in an uncontrolled way?).
boxing IT into a separate process might actually be a good idea anyway for security reasons: IT deals with potentially unsafe input and IT's code quality isn't overly good, so I'd rather not have any secure resources open when running it.
there will probably be lots of need for this kind of IPC in future applications
So what are my options? I have already looked into:
multiprocessing.Process at first looked nice, until I realized that the new process gets a copy of all my handles. Needless to say, this is quite problematic, since resources can then no longer be reliably freed by closing them in the parent process, plus the security issues mentioned earlier.
Use os.closerange within a multiprocessing.Process to close all handles manually - except for the Pipe I'm interested in. Does os.closerange close only files, or does it take care of other types of resources as well? If so, how can I determine the range, given the Pipe object?
subprocess.Popen(.., close_fds=True, stdin=PIPE, stdout=PIPE) works fine on unix but isn't possible on win32.
Named pipes are very different on win32 and unix. Are there any libraries that abstract over their usage?
Sockets. Promising, especially since there are handy RPC libraries that can work with sockets. On the other hand, I fear that this may cause a whole bunch of security issues. Are sockets that I have determined to be of local origin (sock.getpeername()[0] == '127.0.0.1') secure against tampering?
Are there any possibilities that I have overlooked?
To round up: the main question is how to establish secure IPC with a child process on Windows and Unix? But please don't hesitate to answer if you only know answers to part of the problem.
Thanks for taking the time to read it!
It seems that on Python >= 3.4, subprocess.Popen(..., stdin=PIPE, stdout=PIPE, close_fds=False) is a possible option. This is due to a change that makes all opened file descriptors non-inheritable by default. To be more precise, they will be automatically closed on execv (so you still can't use multiprocessing.Process); see PEP 446.
This is also a valid option for other python versions:
on windows, HANDLEs are created non-inheritable by default, so you will leak only handles that were made inheritable explicitly
on POSIX/python<=3.3 you can still use os.closerange to close open file descriptors after spawning the subprocess
for a corresponding example see:
https://github.com/coldfix/python-ipc-test
The most useful combinations are:
stdio:pickle
pro: completely cross-platform in my tests
pro: fastest option (with 2)
con: stdin/stdout can not be redirected independently
inherit_unidir:pickle
pro: you can redirect STDIO streams independently
pro: fastest option together with stdio:pickle
con: very low level platform specific code
socket:sockpipe
pro: cross-platform with little effort
con: there is a short period when "attackers" may connect to the port; you could require a pass-phrase or something to prevent that from happening
con: slightly slower than the alternatives on Windows (factor 1.6 in my measurements)
con: when not using AF_UNIX, there are unpredictable performance hits on Linux
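For a rough idea of what the stdio:pickle combination looks like, here is a sketch (the child script name and message format are invented):

import pickle
import subprocess
import sys

child = subprocess.Popen(
    [sys.executable, "child.py"],           # hypothetical child script
    stdin=subprocess.PIPE, stdout=subprocess.PIPE, close_fds=False,
)
pickle.dump({"cmd": "ping"}, child.stdin)   # Popen pipes are binary by default
child.stdin.flush()
reply = pickle.load(child.stdout)
print(reply)

# child.py would mirror this on sys.stdin.buffer / sys.stdout.buffer:
#   import pickle, sys
#   request = pickle.load(sys.stdin.buffer)
#   pickle.dump({"cmd": request["cmd"], "ok": True}, sys.stdout.buffer)
#   sys.stdout.buffer.flush()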

How do I implement a simple cross platform Python daemon?

I would like to have my Python program run in the background as a daemon, on either Windows or Unix. I see that the python-daemon package is for Unix only; is there an alternative for cross platform? If possible, I would like to keep the code as simple as I can.
In Windows it's called a "service" and you could implement it pretty easily e.g. with the win32serviceutil module, part of pywin32. Unfortunately the two "mental models" -- service vs daemon -- are very different in detail, even though they serve similar purposes, and I know of no Python facade that tries to unify them into a single framework.
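For reference, a standard pywin32 service skeleton looks roughly like this (the service name and class are placeholders):

import win32event
import win32service
import win32serviceutil

class MyService(win32serviceutil.ServiceFramework):
    _svc_name_ = "MyService"                 # placeholder names
    _svc_display_name_ = "My Python Service"

    def __init__(self, args):
        win32serviceutil.ServiceFramework.__init__(self, args)
        self.stop_event = win32event.CreateEvent(None, 0, 0, None)

    def SvcStop(self):
        self.ReportServiceStatus(win32service.SERVICE_STOP_PENDING)
        win32event.SetEvent(self.stop_event)

    def SvcDoRun(self):
        # Main "daemon" work would go here; this skeleton just waits for stop.
        win32event.WaitForSingleObject(self.stop_event, win32event.INFINITE)

if __name__ == "__main__":
    # Supports `install`, `start`, `stop`, `remove`, `debug`, ...
    win32serviceutil.HandleCommandLine(MyService)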
This question is 6 years old, but I had the same problem, and the existing answers weren't cross-platform enough for my use case. Though Windows services are often used in similar ways as Unix daemons, at the end of the day they differ substantially, and "the devil's in the details". Long story short, I set out to try and find something that allows me to run the exact same application code on both Unix and Windows, while fulfilling the expectations for a well-behaved Unix daemon (which is better explained elsewhere) as best as possible on both platforms:
Close open file descriptors (typically all of them, but some applications may need to protect some descriptors from closure)
Change the working directory for the process to a suitable location to prevent "Directory Busy" errors
Change the file access creation mask (os.umask in the Python world)
Move the application into the background and make it dissociate itself from the initiating process
Completely divorce from the terminal, including redirecting STDIN, STDOUT, and STDERR to different streams (often DEVNULL), and prevent reacquisition of a controlling terminal
Handle signals, in particular, SIGTERM.
The fundamental problem with cross-platform daemonization is that Windows, as an operating system, really doesn't support the notion of a daemon: applications that start from a terminal (or in any other interactive context, including launching from Explorer, etc) will continue to run with a visible window, unless the controlling application (in this example, Python) has included a windowless GUI. Furthermore, Windows signal handling is woefully inadequate, and attempts to send signals to an independent Python process (as opposed to a subprocess, which would not survive terminal closure) will almost always result in the immediate exit of that Python process without any cleanup (no finally:, no atexit, no __del__, etc).
Windows services (though a viable alternative in many cases) were basically out of the question for me: they aren't cross-platform, and they're going to require code modification. pythonw.exe (a windowless version of Python that ships with all recent Windows Python binaries) is closer, but it still doesn't quite make the cut: in particular, it fails to improve the situation for signal handling, and you still cannot easily launch a pythonw.exe application from the terminal and interact with it during startup (for example, to deliver dynamic startup arguments to your script, say, perhaps, a password, file path, etc), before "daemonizing".
In the end, I settled on using subprocess.Popen with the creationflags=subprocess.CREATE_NEW_PROCESS_GROUP keyword to create an independent, windowless process:
import subprocess

independent_process = subprocess.Popen(
    '/path/to/pythonw.exe /path/to/file.py',
    creationflags=subprocess.CREATE_NEW_PROCESS_GROUP
)
However, that still left me with the added challenge of startup communications and signal handling. Without going into a ton of detail, for the former, my strategy was (a sketch follows this list):
pickle the important parts of the launching process' namespace
Store that in a tempfile
Add the path to that file in the daughter process' environment before launching
Extract and return the namespace from the "daemonization" function
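A sketch of that hand-off (the environment variable name, payload keys and daemon_entry.py are all invented for illustration; the creation flag is Windows-only):

import os
import pickle
import subprocess
import sys
import tempfile

payload = {"password": "hunter2", "config_path": "C:/path/to/config"}  # placeholders

fd, state_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    pickle.dump(payload, f)

env = dict(os.environ, DAEMON_STATE_FILE=state_path)
subprocess.Popen(
    [sys.executable, "daemon_entry.py"],   # or pythonw.exe for a windowless child
    env=env,
    creationflags=subprocess.CREATE_NEW_PROCESS_GROUP,
)

# daemon_entry.py would then do roughly:
#   with open(os.environ["DAEMON_STATE_FILE"], "rb") as f:
#       namespace = pickle.load(f)
#   os.remove(os.environ["DAEMON_STATE_FILE"])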
For signal handling I had to get a bit more creative. Within the "daemonized" process:
Ignore signals in the daemon process, since, as mentioned, they all terminate the process immediately and without cleanup
Create a new thread to manage signal handling
That thread launches daughter signal-handling processes and waits for them to complete
External applications send signals to the daughter signal-handling process, causing it to terminate and complete
Those processes then use the signal number as their return code
The signal handling thread reads the return code, and then calls either a user-defined signal handler, or uses a ctypes API to raise an appropriate exception within the Python main thread
Rinse and repeat for new signals
That all being said, for anyone encountering this problem in the future, I've rolled a library called daemoniker that wraps both proper Unix daemonization and the above Windows strategy into a unified facade. The cross-platform API looks like this:
from daemoniker import Daemonizer

with Daemonizer() as (is_setup, daemonizer):
    if is_setup:
        # This code is run before daemonization.
        do_things_here()

    # We need to explicitly pass resources to the daemon; other variables
    # may not be correct
    is_parent, my_arg1, my_arg2 = daemonizer(
        path_to_pid_file,
        my_arg1,
        my_arg2
    )

    if is_parent:
        # Run code in the parent after daemonization
        parent_only_code()

# We are now daemonized, and the parent just exited.
code_continues_here()
Two options come to mind:
Port your program into a windows service. You can probably share much of your code between the two implementations.
Does your program really use any daemon functionality? If not, you could rewrite it as a simple server that runs in the background, manages communications through sockets, and performs its tasks. It will probably consume more system resources than a daemon would, but it would be quite platform independent.
In general the concept of a daemon is Unix specific, in particular expected behaviour with respect to file creation masks, process hierarchy, and signal handling.
You may find PEP 3143 useful wherein a proposed continuation of python-daemon is considered for Python 3.2, and many related daemonizing modules and implementations are discussed.
The reason it's Unix only is that daemons are a Unix-specific concept, i.e. a background process initiated by the OS and usually running as a child of the root PID.
Windows has no direct equivalent of a Unix daemon; the closest thing I can think of is a Windows Service.
There's a program called pythonservice.exe for Windows. I'm not sure if it's supported on all versions of Python, though.

Example for using Python Twisted with File Descriptors

I'm looking to use twisted to control communication across Linux pipes (os.pipe()) and FIFOs (os.mkfifo()) between a master process and a set of slave processes. While I'm positive that it's possible to use twisted for these types of file descriptors (after all, twisted is great for TCP sockets, which *nix abstracts away as file descriptors), I cannot find any examples of this type of usage. Does anyone have any links, sample code, or advice?
You can use reactor.spawnProcess to set up arbitrary file descriptor mappings between a parent process and a child process it spawns. For example, to run a program and give it two extra output descriptors (in addition to stdin, stdout, and stderr) with which it can send bytes back to the parent process, you would do something like this:
reactor.spawnProcess(protocol, executable, args,
                     childFDs={0: 'w', 1: 'r', 2: 'r', 3: 'r', 4: 'r'})
The reactor will take care of creating the pipes for you, and will call childDataReceived on the ProcessProtocol you pass in when data is read from them. See the spawnProcess API docs for details.
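A small sketch of the receiving side, assuming a hypothetical child.py that writes to descriptors 3 and 4:

from twisted.internet import protocol, reactor

class ExtraFDProtocol(protocol.ProcessProtocol):
    def childDataReceived(self, childFD, data):
        # childFD is the descriptor number in the child (1, 2, 3 or 4 here).
        print("fd %d sent %r" % (childFD, data))

    def processEnded(self, reason):
        reactor.stop()

reactor.spawnProcess(
    ExtraFDProtocol(), "/usr/bin/python3", ["python3", "child.py"],
    childFDs={0: 'w', 1: 'r', 2: 'r', 3: 'r', 4: 'r'},
)
reactor.run()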
If you're also using Twisted on the child end, then you mostly want to be looking at twisted.internet.stdio. stdiodemo.py and stdin.py in the core examples will show you how to use that module.
Twisted does not have anything built in for asynchronous file I/O. Someone wrote a libaio wrapper for it, but it has not been touched for a long time, and I have no idea if it still works.
In the worst case you could use select to see if there's anything available to read, but that won't help you with writing.

Creating a new terminal/shell window to simply display text

I want to pipe [edit: real-time text] the output of several subprocesses (sometimes chained, sometimes parallel) to a single terminal/tty window that is not the active Python shell (be it an IDE, the command line, or a running script using tkinter). IPython is not an option; I need something that comes with the standard install. Prefer an OS-agnostic solution, but it needs to work on XP/Vista.
I'll post what I've tried already if you want it, but it's embarrassing.
A good solution in Unix would be named pipes. I know you asked about Windows, but there might be a similar approach in Windows, or this might be helpful for someone else.
on terminal 1:
mkfifo /tmp/display_data
myapp >> /tmp/display_data
on terminal 2 (bash):
tail -f /tmp/display_data
Edit: changed terminal 2 command to use "tail -f" instead of infinite loop.
You say "pipe" so I assume you're dealing with text output from the subprocesses. A simple solution may be to just write output to files?
e.g. in the subprocess:
Redirect output to %TEMP%\output.txt
On exit, copy output.txt to a directory your main process is watching.
In the main process:
Every second, examine directory for new files.
When files found, process and remove them.
You could encode the subprocess name in the output filename so you know how to process it.
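The watching side could be as simple as this sketch (the directory path and filename pattern are made up):

import glob
import os
import time

WATCH_DIR = r"C:\temp\subprocess_output"    # made-up path

while True:
    for path in glob.glob(os.path.join(WATCH_DIR, "*.txt")):
        with open(path) as f:
            print(f.read(), end="")         # "process" the file: here, just print it
        os.remove(path)
    time.sleep(1)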
You could make a producer-consumer system, where lines are sent over a socket (nothing fancy here).
The consumer would be a multithreaded socket server listening for connections and putting all incoming lines into a Queue. A separate thread would get items from the queue and print them to the console. The program can be run from the cmd console or from the Eclipse console as an external tool without much trouble.
From your point of view, it should be real-time. As a bonus, you can place producers and consumers on separate boxes. Producers can even form a network.
Some examples of socket programming with Python can be found here. Look here for a TCP echo server example and here for a TCP "hello world" socket client.
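A compact sketch of such a consumer, using only the standard library (the port number is arbitrary):

import queue
import socketserver
import threading

lines = queue.Queue()

class LineHandler(socketserver.StreamRequestHandler):
    # Each producer connection pushes its lines onto the shared queue.
    def handle(self):
        for raw in self.rfile:
            lines.put(raw.decode(errors="replace").rstrip("\n"))

def printer():
    # Single consumer thread: the only place that writes to the console.
    while True:
        print(lines.get())

threading.Thread(target=printer, daemon=True).start()
with socketserver.ThreadingTCPServer(("127.0.0.1", 5140), LineHandler) as server:
    server.serve_forever()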
There is also an extension for Windows that enables the use of named pipes.
On Linux (possibly Cygwin?) you could just tail -f the named fifo.
Good luck!
