Examples for writing a daemon or a service in Linux - python

I have been looking at daemons for Linux such as httpd and have also looked at some code that can be used as a skeleton. I have done a fair amount of research and now I want to practice writing one. However, I'm not sure what I could use a daemon for. Any good examples/ideas that I can try to implement?
I was thinking of using a daemon along with libnotify on Ubuntu to have pop-up notifications of select tweets.
Is this a bad example for implementing a daemon?
Will you even need a daemon for this?
Can this be implemented as a service rather than a daemon?

First: PEP 3143 tries to enumerate all of the fiddly details you have to get right to write a daemon in Python. And it specifies a library that takes care of those details for you.
The PEP was deferred, at least in part because the community felt it was really the job of POSIX or some Linux standards group to first define exactly what is essential to being a daemon before Python could take its own position on how to implement one. But it's still a great guide. And the reference implementation of that proposed library lives on as python-daemon, which you can install from PyPI.
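To make that concrete, here is a minimal sketch using the python-daemon package; run_notifier is a hypothetical stand-in for the tweet-watching loop, not something the library provides.

    # Minimal sketch with python-daemon (pip install python-daemon).
    # run_notifier() is a placeholder for the actual tweet/libnotify loop.
    import time
    import daemon

    def run_notifier():
        while True:
            # ...poll for new tweets and raise libnotify popups here...
            time.sleep(60)

    with daemon.DaemonContext():
        # Inside this block the process has been detached from the terminal,
        # with working directory, umask, and standard streams reset roughly
        # as the PEP describes.
        run_notifier()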
Meanwhile, the really interesting question for this project isn't so much service vs. daemon, as root vs. user. Do you want a single process that keeps track of all users' twitter accounts, and sends notifications to anyone who's logged in? Just a per-user process? Or maybe both, a single process watching all the tweets, then sending notifications via user processes?
Of course you don't really need a daemon or service for this. For example, it could be a GUI app whose main window is a configuration dialog, which keeps running (maybe with a traybar thingy) even when you close the config dialog, and it would work just as well. The question isn't whether you need a daemon, but whether it's more appropriate. Which really is a design choice.

Related

Create a daemon background service?

I'm trying to create a background service in Python. The service will be called from another Python program. It needs to run as a daemon process because it uses a heavy object (300 MB) that has to be loaded into memory beforehand. I've had a look at python-daemon and still haven't found out how to do it. In particular, I know how to make a daemon run and periodically do some stuff by itself, but I don't know how to make it callable from another program. Could you please give some help?
I had a similar situation when I wanted to access a big binary matrix from a web app.
Of course there are many solutions, but I used Redis, a popular in-memory database/cache system, to store and access my object successfully. It has practical Python bindings (several roughly equivalent wrapper libraries).
The main advantage is that when the service goes down, a copy of the data still remains on disk. I also noticed that once in place, it could be used for other things in my app (for instance, Celery offers it as a backend), and indeed for other services in any other, unrelated program.
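As an illustration of the idea (not my exact code), a sketch with the redis-py client might look like this; the key name and file name are placeholders, and a local Redis server is assumed to be running:

    # Park a large pickled object in Redis so that other processes can
    # fetch it without reloading it from scratch each time.
    import pickle
    import redis

    r = redis.Redis(host='localhost', port=6379)

    # One-off loader process:
    with open('matrix.bin', 'rb') as f:           # placeholder file name
        big_matrix = pickle.load(f)
    r.set('big_matrix', pickle.dumps(big_matrix))

    # Any other program, later:
    matrix = pickle.loads(r.get('big_matrix'))

Each client still pays the deserialization cost, but nothing has to keep the 300 MB object resident inside a hand-rolled daemon.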

Confused about DBus

Ok, so, I might be missing the plot a bit here, but would really like some help. I am quite new to development etc. and have now come to a point where I need to implement DBus (or some other inter-program communication). I am finding the concept a bit hard to understand though.
My implementation will be to use an HTML website to change certain variables used in another program, thus allowing the program to change its behaviour dynamically. I am doing this on a Raspberry Pi running Raspbian. I am running a web server to host my website, and this is where the confusion comes in.
As far as I understand, DBus runs a service which allows you to call methods from a program in another program. So does this mean that my website needs to run a DBUS service to allow me to call methods from it into my program? To complicate things a bit more, I am coding in Python, so I am not sure if I can run a Python script on my website that would allow me to run a DBUS service. Would it be better to use JavaScript?
For me, the most logical solution would be to run a single DBUS service that somehow imports methods from different programs and can be queried by others who want to run those methods. Is that possible?
Help would be appreciated!
Thank you in advance!
So does this mean that my website needs to run a DBUS service to allow me to call methods from it into my program?
A D-Bus background process (a daemon) would run on your web server, yes. In fact D-Bus provides two daemons: a system daemon, which permits objects to receive system information (printer availability, for example), and a general user application-to-application IPC daemon. It is the second daemon that you would use for different applications to communicate.
I am coding in Python, so I am not sure if I can run a Python script on my website that would allow me to run a DBUS service.
There is no problem using Python; D-Bus has bindings for many languages (e.g. Java, Perl, Ruby, C++, Python). D-Bus objects can be mapped to Python objects.
the most logical solution would be to run a single DBUS service that somehow imports methods from different programs and can be queried by others who want to run those methods. Is that possible?
Correct. D-Bus provides a mechanism by which a client process creates one or more D-Bus objects that allow it to offer services to other D-Bus-aware processes.
This sounds like you should write an isolated D-Bus service to act as a datastore, and communicate synchronously with it in your scripts to write and read values. You can use shelve to persist the values between service invocations.
In the tutorial, the "Making method calls" section covers synchronous calls, and "Exporting objects" covers writing most of the D-Bus service.
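A rough sketch of such a datastore service with the dbus-python bindings could look like the following; the bus name com.example.DataStore, the object path, and the shelve file are made-up examples, not anything prescribed by D-Bus:

    # Sketch of a small D-Bus "datastore" service (dbus-python + GLib).
    import dbus
    import dbus.service
    import shelve
    from dbus.mainloop.glib import DBusGMainLoop
    from gi.repository import GLib

    class DataStore(dbus.service.Object):
        def __init__(self):
            bus_name = dbus.service.BusName('com.example.DataStore',
                                            bus=dbus.SessionBus())
            dbus.service.Object.__init__(self, bus_name, '/com/example/DataStore')
            self.db = shelve.open('/tmp/datastore.db')  # survives restarts

        @dbus.service.method('com.example.DataStore',
                             in_signature='ss', out_signature='')
        def Set(self, key, value):
            self.db[str(key)] = str(value)
            self.db.sync()

        @dbus.service.method('com.example.DataStore',
                             in_signature='s', out_signature='s')
        def Get(self, key):
            return self.db.get(str(key), '')

    DBusGMainLoop(set_as_default=True)
    DataStore()
    GLib.MainLoop().run()

The web-facing script then makes a plain synchronous call, no main loop needed:

    store = dbus.SessionBus().get_object('com.example.DataStore',
                                         '/com/example/DataStore')
    store.Set('threshold', '42', dbus_interface='com.example.DataStore')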

Is there a good way to split a python program into independent modules?

I'm trying to do some machinery automation with python, but I've run into a problem.
I have code that does the actual control, code that logs, code that provides a GUI, and some other modules, all being called from a single script.
The issue is that an error in one module halts all the others. So, for instance, a bug in the GUI will kill the control systems.
I want to be able to have the modules run independently, so one can crash, be restarted, be patched, etc without halting the others.
The only way I can find to make that work is to store the variables in an SQL database, or files or something.
Is there a way for one Python script to sort of... debug another, so that one script can read or change the variables in the other? I can't find a way to do that which also allows the scripts to be started and stopped independently.
Does anyone have any ideas or advice?
A fairly effective way to do this is to use message passing. Each of your modules are independent, but they can send and receive messages to each other. A very good reference on the many ways to achieve this in Python is the Python wiki page for parallel processing.
A generic strategy
Split your program into pieces where there are servers and clients. You could then use middleware such as 0MQ, Apache ActiveMQ or RabbitMQ to send data between different parts of the system.
In this case, your GUI could send a message to the log parser server telling it to begin work. Once it's done, the log parser broadcasts a message to anyone interested, telling the world where to find the results. The GUI could be a subscriber to the channel that the log parser publishes to. Once it receives the message, it opens up the results file and displays whatever the user is interested in.
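For instance, with pyzmq the publish/subscribe half of that could be sketched like this (the port, topic, and message format are arbitrary choices, and the two halves would live in separate processes):

    import zmq

    # Log parser process: announce that results are ready.
    ctx = zmq.Context()
    pub = ctx.socket(zmq.PUB)
    pub.bind('tcp://127.0.0.1:5556')
    pub.send_string('results /tmp/parsed.log')      # "topic payload"

    # GUI process: listen for announcements on the same channel.
    ctx = zmq.Context()
    sub = ctx.socket(zmq.SUB)
    sub.connect('tcp://127.0.0.1:5556')
    sub.setsockopt_string(zmq.SUBSCRIBE, 'results')
    topic, results_path = sub.recv_string().split(' ', 1)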
Serialization and deserialization speed is important also. You want to minimise the overhead for communicating. Google Protocol Buffers and Apache Thrift are effective tools here.
You will also need some form of supervision strategy to prevent a failure in one of the servers from blocking everything. supervisord will restart things for you and is quite easy to configure. Again, it is only one of many options in this space.
Overkill much?
It sounds like you have created a simple utility. The multiprocessing module is an excellent way to have different bits of the program running fairly independently. You still apply the same strategy (message passing, no shared state, supervision), but with different tactics.
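For example, a control loop fed by a multiprocessing queue keeps the GUI and the control code in separate processes while still letting them exchange messages (the names here are illustrative):

    from multiprocessing import Process, Queue

    def control_loop(inbox):
        # The control process only reads commands from its queue, so a
        # crash in the GUI process cannot touch its state directly.
        while True:
            cmd = inbox.get()
            if cmd == 'stop':
                break
            print('control received:', cmd)

    if __name__ == '__main__':
        commands = Queue()
        ctl = Process(target=control_loop, args=(commands,))
        ctl.start()
        commands.put('set_speed 10')   # this would come from the GUI
        commands.put('stop')
        ctl.join()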
You want multiple independent processes, and you want them to talk to each other. Hence: read up on what methods of inter-process communication are available on your OS. I recommend sockets (generic, will work over a network and with different OSs). You can easily invent a simple (maybe HTTP-like) protocol on top of TCP, perhaps with JSON for the messages. There is a bunch of classes shipped with the Python distribution to make this easy (SocketServer.ThreadingMixIn, SocketServer.TCPServer, etc.).
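A tiny sketch of that using the standard library (the module is SocketServer in Python 2 and socketserver in Python 3; the port and message format are arbitrary):

    import json
    import socketserver

    class CommandHandler(socketserver.StreamRequestHandler):
        def handle(self):
            # One JSON object per line; a real server would dispatch on it.
            request = json.loads(self.rfile.readline())
            reply = {'ok': True, 'echo': request}
            self.wfile.write((json.dumps(reply) + '\n').encode())

    class Server(socketserver.ThreadingMixIn, socketserver.TCPServer):
        allow_reuse_address = True

    if __name__ == '__main__':
        with Server(('127.0.0.1', 9999), CommandHandler) as srv:
            srv.serve_forever()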

Best practice for Python process control

This is my first hack at doing any system-level programming (mostly a LAMP/PHP, specifically Drupal, web dev up to this point).
Because of the availability of a library with a very specific feature, I am using Python for an upcoming project. I need to run, restart as needed, monitor, and respond to the output of multiple Python script processes, ideally controlled via an HTTP API from another master program, which keeps a database of processes that need to be running and some metadata about those processes (parameters, pid, etc.). I'm planning on building this master program in PHP, as I have far more experience in it, hence the desire for a nice HTTP API.
Is there some best practice for this type of system? Some initial research lead me to supervisord (which has XML-RPC built in, apparently), but I thought I'd check the wisdom of the masses who've actually been down this road before moving forward with testing.
I can't say I have been down this road, but I am working to go down this road. I would look into the multiprocessing libraries for Python; they include network-transparent facilities. A couple of routes you could take with those:
1. Create a process that controls all of the other processes. Make this process a server you can control with your PHP (a rough sketch follows the list below).
2. Determine how to get PHP to communicate with these networked Python processes. They may still need to be launched from a central Python process, however.
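As a hedged sketch of the first route: multiprocessing's managers are network transparent, so the controlling process can expose a callable that other Python processes invoke over a socket. The address, authkey, and function names below are placeholders:

    from multiprocessing.managers import BaseManager

    running = {}   # metadata about the processes being managed

    def start_job(name):
        # A real controller would spawn and track a worker process here.
        running[name] = 'started'
        return running[name]

    class ControlManager(BaseManager):
        pass

    ControlManager.register('start_job', callable=start_job)

    if __name__ == '__main__':
        mgr = ControlManager(address=('127.0.0.1', 50000), authkey=b'secret')
        mgr.get_server().serve_forever()

    # A client (another Python process) connects with the same class:
    #     mgr = ControlManager(address=('127.0.0.1', 50000), authkey=b'secret')
    #     mgr.connect()
    #     mgr.start_job('logger')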

Safely executing user-submitted python code on the server

I am looking into starting a project which involves executing python code that the user enters via a HTML form. I know this can be potentially lethal (exec), but I have seen it done successfully in at least one instance.
I sent an email off to the developers of the Python Challenge and I was told they are using a solution they came up with themselves, and they only let on that they are using "security features provided by the operating system" and that "the operating system [Linux] provides most of the security you need if you know how to use it."
Would anyone know of a safe and secure way to go about doing this? I thought about spawning a new VM for every submission, but that would have way too much overhead and be pretty near impossible to implement efficiently.
On a modern Linux, in addition to chroot(2) you can restrict the process further by using clone(2) instead of fork(2). There are several interesting clone(2) flags:
CLONE_NEWIPC (new namespace for semaphores, shared memory, message queues)
CLONE_NEWNET (new network namespace - nice one)
CLONE_NEWNS (new set of mountpoints)
CLONE_NEWPID (new set of process identifiers)
CLONE_NEWUTS (new hostname, domainname, etc)
Previously this functionality was implemented in OpenVZ and then merged upstream, so there is no need for a patched kernel anymore.
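Python doesn't expose clone(2) directly, but as a sketch, the same namespace flags can be applied to the current process with unshare(2) via ctypes (this needs root/CAP_SYS_ADMIN; the flag values are the ones from the Linux headers):

    import ctypes
    import os

    CLONE_NEWNS  = 0x00020000   # new mount namespace
    CLONE_NEWUTS = 0x04000000   # new hostname/domainname
    CLONE_NEWIPC = 0x08000000   # new IPC namespace
    CLONE_NEWPID = 0x20000000   # children get a new PID namespace
    CLONE_NEWNET = 0x40000000   # new, empty network namespace

    libc = ctypes.CDLL('libc.so.6', use_errno=True)
    flags = (CLONE_NEWNS | CLONE_NEWUTS | CLONE_NEWIPC |
             CLONE_NEWPID | CLONE_NEWNET)
    if libc.unshare(flags) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))

    # Anything exec'd from this point on sees only the new namespaces, e.g.
    # os.execvp('python', ['python', 'untrusted_script.py'])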
http://codepad.org/about has implemented such a system successfully (as a public code pasting/running service!)
codepad.org is an online compiler/interpreter, and a simple collaboration tool. It's a pastebin that executes code for you. [...]
How it works
Code execution is handled by a supervisor based on geordi. The strategy is to run everything under ptrace, with many system calls disallowed or ignored. Compilers and final executables are both executed in a chroot jail, with strict resource limits. The supervisor is written in Haskell.
[...]
When your app is remote code execution, you have to expect security problems. Rather than rely on just the chroot and ptrace supervisor, I've taken some additional precautions:
The supervisor processes run on virtual machines, which are firewalled such that they are incapable of making outgoing connections.
The machines that run the virtual machines are also heavily firewalled, and restored from their source images periodically.
If you run the script as user nobody (on Linux), it can write practically nowhere and cannot read any data whose permissions are set up properly. But it could still cause a DoS attack by, for example:
filling up /tmp
eating all RAM
eating all CPU
Furthermore, outside network connections can be opened, etcetera etcetera. You can probably lock all these down with kernel limits, but you are bound to forget something.
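Some of those limits can at least be set from Python itself with the stdlib resource module before exec'ing the untrusted code; a sketch follows (the numbers are arbitrary, and note that this does nothing about network access):

    import resource
    import subprocess

    def limits():
        resource.setrlimit(resource.RLIMIT_CPU, (5, 5))                  # 5 s of CPU
        resource.setrlimit(resource.RLIMIT_AS, (256 << 20, 256 << 20))   # 256 MB address space
        resource.setrlimit(resource.RLIMIT_FSIZE, (1 << 20, 1 << 20))    # 1 MB max file size
        resource.setrlimit(resource.RLIMIT_NPROC, (20, 20))              # cap fork bombs

    subprocess.run(['python', 'untrusted_script.py'],
                   preexec_fn=limits, timeout=10)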
So I think that a virtual machine with no access to the network or the real hard drive would be the only (reasonably) safe route. Perhaps the developers of the Python Challenge use KVM which is, in principle, "provided by the operating system".
For efficiency, you could run all submissions in the same VM. That saves you much overhead, and in the worst-case scenario they only hamper each other, but not your server.
Using chroot (Wikipedia) may be part of the solution, e.g. combined with ulimit and some other common (or custom) tools.
