Adding a Python daemon to systemd

The essence:
I have created a daemon to manage some tasks on a remote platform.
It is written in python and accepts start, stop and restart arguments.
While trying to add it to systemd (so it starts on system startup, is stopped on shutdown, etc.) I encountered a problem:
It seems that systemd sees the daemon as running, but I am not sure it actually works, because restarting it or requesting its status returns an error:
[user@centos ~]# systemctl restart mydaemon
Failed to restart mydaemon.service: Unit mydaemon.service failed to load: No such file or directory.
[user@centos ~]# systemctl status mydaemon
● mydaemon.service
Loaded: not-found (Reason: No such file or directory)
Active: inactive (dead)
The specifics:
The code itself follows the well-known example by Sander Marechal with very few changes. By itself it works without any problems, and properly reacts to all accepted arguments. The pid is saved in /tmp/my-daemon.pid.
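For reference, a minimal sketch of that kind of daemon, assuming the classic double-fork pattern from Sander Marechal's example and the same PID file path (the real script does more; this only shows the start/stop/restart plumbing):
import atexit
import os
import signal
import sys
import time

PIDFILE = "/tmp/my-daemon.pid"

def daemonize():
    # First fork: let the parent exit so the shell gets control back
    if os.fork() > 0:
        sys.exit(0)
    os.setsid()
    # Second fork: make sure the daemon can never reacquire a controlling terminal
    if os.fork() > 0:
        sys.exit(0)
    # Record our PID so systemd (PIDFile=) and the stop command can find us
    with open(PIDFILE, "w") as f:
        f.write("%d\n" % os.getpid())
    atexit.register(lambda: os.path.exists(PIDFILE) and os.remove(PIDFILE))

def run():
    # Placeholder for the real work loop on the remote platform
    while True:
        time.sleep(10)

def start():
    daemonize()
    run()

def stop():
    try:
        with open(PIDFILE) as f:
            pid = int(f.read().strip())
        os.kill(pid, signal.SIGTERM)
    except (IOError, OSError):
        pass  # not running or stale PID file

if __name__ == "__main__":
    if sys.argv[1:] == ["start"]:
        start()
    elif sys.argv[1:] == ["stop"]:
        stop()
    elif sys.argv[1:] == ["restart"]:
        stop()
        start()
    else:
        print("usage: mydaemon_v01.py start|stop|restart")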
The systemd service file is in the user daemons directory, /usr/lib/systemd/user/mydaemon.service, and its contents are as follows:
[Unit]
Description=The user daemon
[Service]
Type=forking
ExecStart=/usr/bin/python /home/frcr/mydaemon_v01.py start
ExecStop=/usr/bin/python /home/frcr/mydaemon_v01.py stop
RestartSec=5
TimeoutSec=60
RuntimeMaxSec=infinity
Restart=always
PIDFile=/tmp/my-daemon.pid
[Install]
WantedBy=multi-user.target
systemctl reports its status as active, but only if given the PID:
[user@centos ~]# systemctl status 9177
● session-481.scope - Session 481 of user user
Loaded: loaded
Drop-In: /run/systemd/system/session-481.scope.d
└─50-After-systemd-logind\x2eservice.conf, 50-After-systemd-user-sessions\x2eservice.conf, 50-Description.conf, 50-SendSIGHUP.conf, 50-Slice.conf
Active: active (running) since Tue 2016-05-17 06:24:51 EDT; 1h 43min ago
CGroup: /user.slice/user-0.slice/session-481.scope
├─8815 sshd: root@pts/0
├─8817 -bash
├─9177 python /home/user/mydaemon_v01.py start
└─9357 systemctl status 9177
I have seen a similar question here on Stack Overflow, but it doesn't seem to have the solution to my problem.
I assume I am missing something very obvious due to my sheer lack of experience with systemd, and I'd be extremely grateful if somebody could point it out or show me the right direction. Thanks in advance, and please forgive my English.

Enabling the daemon with a full path name worked around the issue, but there is a better solution.
The issue was that the service file was in a user directory but was being started as a system service. However, /usr/lib was not the right place to add new service files anyway: that directory is for files shipped as part of operating-system packages. The correct directory for a new system service is /etc/systemd/system; see the related docs about systemd paths.
You still want to enable the service to make sure it gets loaded at boot time.
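For example (a sketch assuming the unit file shown above and root privileges):
# move the unit out of the per-user directory into the system directory
sudo mv /usr/lib/systemd/user/mydaemon.service /etc/systemd/system/mydaemon.service
# make systemd pick up the new unit file
sudo systemctl daemon-reload
# enable it by name so it starts at boot, then start it now
sudo systemctl enable mydaemon.service
sudo systemctl start mydaemon.service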

After some additional googling I found a solution: I had actually forgotten to enable the daemon in systemd:
[root@centos ~]# systemctl enable /usr/lib/systemd/user/mydaemon.service
Created symlink from /etc/systemd/system/multi-user.target.wants/mydaemon.service to /usr/lib/systemd/user/mydaemon.service.
Created symlink from /etc/systemd/system/mydaemon.service to /usr/lib/systemd/user/mydaemon.service.
It is also worth mentioning that the absolute path is required.
The only thing left is to make systemd reload its configuration:
[root@centos ~]# systemctl daemon-reload
After that the service is added, and
[root@centos ~]# systemctl start mydaemon
[root@centos ~]# systemctl restart mydaemon
[root@centos ~]# systemctl stop mydaemon
all work perfectly.

Related

Python script does not restart properly with systemd

I am learning how to restart a Python script in case of error (following this tutorial, with some small tweaks to filenames and the like). First things first, the files involved:
/home/myuser/Desktop/test/test.py:
from datetime import datetime
from time import sleep

path = "/home/dec13666/Desktop/test/log.txt"

while True:
    with open(path, "a") as f:
        now = datetime.now().strftime("%d/%m/%Y %H:%M:%S")
        f.write(now + "\n")
        f.close()
    sleep(1)

raise Exception("Error Simulation!!!")
davidcustom.service:
[Unit]
Description=Python Script Made By Me
After=multi-user.target
[Service]
RestartSec=10
Restart=always
ExecStart=python3 /home/myuser/Desktop/test/test.py
[Install]
WantedBy=multi-user.target
Finally, the commands run:
sudo nano /etc/systemd/system/davidcustom.service
sudo systemctl daemon-reload
sudo systemctl enable davidcustom.service
sudo systemctl start davidcustom.service
sudo systemctl status davidcustom.service
The message I am getting:
● davidcustom.service - Python Script Made By Me
Loaded: loaded (/etc/systemd/system/davidcustom.service; enabled; vendor preset: enabled)
Active: activating (auto-restart) (Result: exit-code) since Sun 2022-09-04 14:12:12 EDT; 8s ago
Process: 3341 ExecStart=python3 /home/myuser/Desktop/test/test.py (code=exited, status=1/FAILURE)
Main PID: 3341 (code=exited, status=1/FAILURE)
CPU: 100ms
Notes:
When I run test.py manually it works OK, but when that Python script is run from a service (as seen here), it generates that error.
I have tried setting User=myusername and Type=simple in the [Service] section of davidcustom.service, with no difference in the results.
What am I doing wrong?
ExecStart requires an absolute path to the executable; it doesn't search $PATH. So use
ExecStart=/usr/bin/python3 /home/myuser/Desktop/test/test.py
(assuming that's where python3 is installed -- you can use type python3 to get the actual location).
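For example, in a bash session on a machine where python3 lives in /usr/bin:
$ type python3
python3 is /usr/bin/python3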
I had inadvertently added some wrong indentation in the original code (corrected by now in the original post). Fixing that solved my issue; when running sudo systemctl status davidcustom.service, the service is now active:
● davidcustom.service - Python Script Made By Dave
Loaded: loaded (/etc/systemd/system/davidcustom.service; enabled; vendor preset: enabled)
Active: active (running) since Sun 2022-09-04 16:16:02 EDT; 23ms ago
Main PID: 4287 (python3)
Tasks: 1 (limit: 14215)
Memory: 3.0M
CPU: 16ms
CGroup: /system.slice/davidcustom.service
└─4287 python3 /home/dec13666/Desktop/test/test.py
Other suggestions worth keeping in mind, though NOT necessary in my case (thanks @Barmar & @tdelaney):
Use an absolute path in ExecStart (easily obtained by running the which python command).
If you want the service to run as your own User and/or Group, explicitly set those parameters.
Both of the previous suggestions go in the [Service] section of your service file:
[Service]
RestartSec=10
Restart=always
ExecStart=/usr/bin/python3 /home/myuser/Desktop/test/test.py
User=youruser
Group=yourgroup
Read the comments in the original post for other options.
Thanks.

Systemd stuck in the ExecStartPre step

I've made a very simple Python program which extracts the latest news posted on a website and sends it to me via Telegram. The program works perfectly when I launch the command below in the console:
/usr/bin/python3.9 /home/dietpi/news/news.py
However, when I try to automate it with systemd (to automatically restart it if there is any bug or similar), I noticed the service is blocked in the ExecStartPre step forever:
[Unit]
Description=News Service
Wants=network.target
After=network.target
[Service]
ExecStartPre=/bin/sleep 10
ExecStart=/usr/bin/python3.9 /home/dietpi/news/news.py
Restart=always
[Install]
WantedBy=multi-user.target
I added the ExecStartPre command to give the Pi time to set up the network properly before launching the program (I noticed a failure occurs otherwise, as the program starts too quickly and generates an error).
When I reboot the Pi, here is what I see when I check the status of the services (using the command systemctl --type=service):
UNIT LOAD ACTIVE SUB JOB DESCRIPTION
news.service loaded activating start-pre start News Service
When I look at this service in more detail, here is what I get (using the command sudo systemctl status news.service):
● news.service - News Service
Loaded: loaded (/etc/systemd/system/news.service; enabled; vendor preset: enabled)
Active: activating (start-pre) since Fri 2022-02-04 17:03:58 GMT; 2s ago
Cntrl PID: 552 (sleep)
Tasks: 1 (limit: 4915)
CPU: 4ms
CGroup: /system.slice/news.service
└─552 /bin/sleep 10
Feb 04 17:03:58 DietPi systemd[1]: Starting News Service...
If I run this command multiple times, I see the "activating" timer count up to 10s and then start again from 0s, which shows I am stuck in the ExecStartPre step :(
If you have any idea how to solve this issue, it would be much appreciated :)
Try to create your own sleep script:
sleep.py:
import time
import sys

if __name__ == '__main__':
    time.sleep(10)
    sys.exit()
In your systemd unit:
ExecStartPre=/usr/bin/python3.9 /home/dietpi/news/sleep.py
Personally, I prefer to use supervisord to launch my Python scripts as services:
/etc/supervisor/conf.d/news.conf:
[program:news]
command = /usr/bin/python3.9 news.py
directory = /home/dietpi/news/
user = dietpi
autostart = true
autorestart = true
stdout_logfile = /var/log/supervisor/news.log
redirect_stderr = true
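If the underlying problem is that news.py runs before the network is up, another option (not part of the original setup, just a sketch assuming a hypothetical fetch URL) is to retry inside the script instead of relying on a fixed sleep:
import time
import urllib.error
import urllib.request

URL = "https://example.com/news"  # placeholder, not the real news site

def fetch_news():
    # hypothetical stand-in for the real scraping / Telegram logic
    with urllib.request.urlopen(URL, timeout=10) as resp:
        return resp.read()

def main():
    # retry a few times at startup so a not-yet-ready network is not fatal
    for attempt in range(10):
        try:
            fetch_news()
            return
        except (urllib.error.URLError, OSError):
            time.sleep(5)

if __name__ == "__main__":
    main()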

Why does supervisor make the celery worker change from RUNNING to STARTING all the time?

Background
The system is CentOS 7, which ships with Python 2.x; it has 1 GB of memory and a single core.
I installed Python 3.x, which I run with the python3 command.
The django-celery project runs in a Python 3.x virtualenv, and I have already set it up with nginx, uwsgi and mariadb. At least I think so, since no errors happened there.
I am trying to use supervisor to control the django-celery worker, like below:
command=env/bin/python project/manage.py celeryd -l INFO -n worker_%(process_num)s
numprocs=4
process_name=projects_worker_%(process_num)s
stdout_logfile=logfile.log
stderr_logfile=logfile_err.log
I have also configured celery events and celery beat; that part works well, no errors. The error comes from the worker part.
When I set numprocs greater than 1, it runs at first: when I do supervisorctl status, all processes are RUNNING.
But when I run the same command again to check the status, some processes have changed to STARTING.
Trying a few more times, I found that the workers' status keeps changing from RUNNING to STARTING and back to RUNNING, without stopping.
When I check supervisor's logfile at /tmp/supervisor.log, it shows lines like:
exit status 1; not expected
entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
'project_worker_0' with pid 2284
Maybe this shows why the workers change status all the time.
What's more, when I change numprocs to 1, the worker still fails. The worker's log shows me:
stale pidfile exists. Removing it
But I did not point the worker at any pidfile path. I only found the events and beat pidfiles under /, and no worker pidfile. I also tried find / -name '*.pid' to look for a pidfile for the worker or celeryd, but none existed.
Question
Firstly, I want to deploy the project, so is there any other way to deploy django-celery's worker from the virtualenv?
Secondly, if anyone can tell me why this happens, I would prefer to keep using supervisor to deploy the celery part. Can anyone help me with it?
PS
Any of your thoughts may be helpful to me, best wishes!
Finally, I solved this problem last night.
About the reason
I had the project running successfully on a Windows 10 system, but I did not check it again after moving the project to CentOS 7. The command env/bin/python project/manage.py celeryd could not run successfully, so supervisor kept starting a process that failed shortly after.
Why could the command not succeed? I had pip-installed all the packages needed, but it showed the error below:
Running a worker with superuser privileges when the worker accepts messages serialized with pickle is a very bad idea!
If you really want to continue then you have to set the C_FORCE_ROOT
environment variable (but please think about this before you do).
User information: uid=0 euid=0 gid=0 egid=0
I searched some blogs about this error and found the answer:
export C_FORCE_ROOT='true'  # in the CentOS environment
Actions to solve it (after meeting an error like this):
Add export C_FORCE_ROOT='true' to the CentOS environment file and source it.
Check that the command env/bin/python project/manage.py celeryd now runs successfully.
Restart supervisord. Attention: not supervisorctl reload, which only reloads the .conf file, not the environment. Kill the supervisord process and start it again with supervisord -c xx.conf (ps aux | grep supervisord, then kill -9 the process number; be careful).
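An alternative, not from the original post but a sketch assuming the same program options as above, is to set the variable (or, better, a non-root user) directly in the supervisor program section so it does not depend on the shell environment:
[program:projects_worker]
command=env/bin/python project/manage.py celeryd -l INFO -n worker_%(process_num)s
numprocs=4
process_name=projects_worker_%(process_num)s
user=celeryuser                  ; assumed non-root user, which avoids C_FORCE_ROOT entirely
environment=C_FORCE_ROOT="true"  ; only needed if the worker really must run as root
stdout_logfile=logfile.log
stderr_logfile=logfile_err.log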

Configuring supervisor on Amazon EC2 gives spawn error: unknown error making dispatchers for 'app_name': EACCES

I've seen this post, Unable to start service with nohup due to 'INFO spawnerr: unknown error making dispatchers for 'app_name': EACCES', and tried the answer, but it doesn't work.
I'm using the Amazon AMI, and since Amazon doesn't have apt-get, I had to use easy_install to install supervisor. Here is my /etc/supervisord.conf:
[program:awesome]
command = /srv/awesome/www/app.py
directory = /srv/awesome/www
user = ec2-user
startsecs = 3
redirect_stderr = true
stdout_logfile_maxbytes = 50MB
stdout_logfile_backups = 10
stdout_logfile = /srv/awesome/log/app.log
My app files are placed under /srv/awesome/www/ with the owner set to ec2-user, which is the same user reported by whoami. I first ran
supervisord -c /etc/supervisord.conf
which gave me
Another program is already listening on a port that one of our HTTP servers is configured to use. Shut this program down first before starting supervisord.
I entered the command
sudo unlink /tmp/supervisor.sock
which resolved it, then I did
supervisorctl start awesome
which produced the spawn error. I've tried reloading, stopping and starting, but none of it works.
I switched to Ubuntu instead of the Amazon AMI and everything worked out.

nginx and supervisor setup in Ubuntu

I'm using a django-gunicorn-nginx setup, following this tutorial: http://ijcdigital.com/blog/django-gunicorn-and-nginx-setup/ Up to the nginx setup, everything works. Then I installed supervisor and configured it; after rebooting my server it shows 502 Bad Gateway. I'm using Ubuntu 12.04 LTS.
/etc/supervisor/conf.d/qlimp.conf
[program: qlimp]
directory = /home/nirmal/project/qlimp/qlimp.sh
user = nirmal
command = /home/nirmal/project/qlimp/qlimp.sh
stdout_logfile = /path/to/supervisor/log/file/logfile.log
stderr_logfile = /path/to/supervisor/log/file/error-logfile.log
Then I restarted supervisor and ran the command supervisorctl start qlimp, and I'm getting this error:
unix:///var/run/supervisor.sock no such file
Is there any problem in my supervisor setup?
Thanks!
That there is no socket file probably means that supervisor isn't running. A reason that it isn't running might be that your qlimp.conf file has some sort of error in it. If you do a
sudo service supervisor start
you can see whether or not this is the case. If supervisor is already running, it will say. And if it is catching an error, it will usually give you a more helpful error message than supervisorctl.
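Running supervisord in the foreground is another quick way to see configuration errors directly (a sketch assuming the stock Debian/Ubuntu config path):
# -n / --nodaemon keeps supervisord in the foreground and prints its log to the terminal
sudo supervisord -n -c /etc/supervisor/supervisord.conf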
I have met the same issue, and after several attempts, here is the solution:
1. First remove the apt-get supervisor version:
sudo apt-get remove supervisor
2. Kill the leftover supervisor process:
sudo ps -ef | grep supervisor
3. Then get the newest version (the apt-get version was 3.0a8):
sudo easy_install supervisor==3.0b2 (or the pip install equivalent)
4. Echo the config file (root permission):
echo_supervisord_conf > /etc/supervisord.conf
5. Start supervisord:
sudo supervisord
6. Enter supervisorctl:
sudo supervisorctl
Everything is done! Have fun!
Try this
cd /etc/supervisor
sudo supervisord
sudo supervisorctl restart all
Are you sure that supervisord is installed and running? Is there a socket file present at /var/run/supervisor.sock?
The error indicates that supervisorctl, the control CLI, cannot reach the UNIX socket to communicate with supervisord, the daemon.
You could also check /etc/supervisor/supervisord.conf and see if the values for the unix_http_server and supervisorctl sections match.
Note that this is an Ubuntu-level problem, not a problem with Python, Django or nginx, and as such this question probably belongs on ServerFault.
On Ubuntu 16+ this seems to be caused by the switch to systemd; this workaround may fix it on new servers:
# Make sure Supervisor comes up after a reboot.
$ sudo systemctl enable supervisor
# Bring Supervisor up right now.
$ sudo systemctl start supervisor
and then check the status of your supervisor program (iconic.conf in my example):
$ sudo supervisorctl status iconic
PS: Make sure gunicorn itself does not have any problems running.
The error may be because you don't have the required privileges.
You may be able to fix it this way: open your terminal, run vim /etc/supervisord.conf to edit the file, and look for the lines
[unix_http_server]
;file=/tmp/supervisor.sock ; (the path to the socket file)
;chmod=0700 ; socket file mode (default 0700)
and delete the semicolon at the start of ;file=/tmp/supervisor.sock and ;chmod=0700, then restart your supervisord. I suggest you try it.
Make sure that in /etc/supervisor.conf the following two sections exist:
[unix_http_server]
file=/tmp/supervisor.sock ; path to your socket file
[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface
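It also helps to check that the [supervisorctl] section points at the same socket, for example (a sketch using the same /tmp path as above):
[supervisorctl]
serverurl=unix:///tmp/supervisor.sock ; must match the file= path in [unix_http_server]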
You can do something like this:
sudo touch /var/run/supervisor.sock
sudo chmod 777 /var/run/supervisor.sock
sudo service supervisor restart
It definitely works, try this.
In my case, Supervisor was not running. To spot the issue I ran:
sudo systemctl status supervisor.service
The problem was that I had my logs pointing to a non-existing directory, so I just had to create it.
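For example, if stdout_logfile points into a directory that does not exist yet (the path below is only illustrative):
sudo mkdir -p /var/log/myapp
sudo service supervisor restart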
I hope it helps :)
touch /var/run/supervisor.sock
sudo supervisord -c /etc/supervisor/supervisord.conf
and after
supervisorctl restart all
If you want to find the supervisord process:
ps -ef | grep supervisord
and if you want to kill the process:
kill -s SIGTERM 2503
Create a conf file and add the lines below.
Remember that in order to run Nginx under supervisor, you must disable the autostart on system boot that was enabled when you installed Nginx:
https://askubuntu.com/questions/177041/nginx-disable-autostart
Note: all processes managed by supervisor must run in "daemon off" mode (i.e. in the foreground) in order to work with supervisor.
[program:nginx]
command=/usr/sbin/nginx -g "daemon off;"
autostart=true
autorestart=true
startretries=5
stopasgroup=true
stopsignal=QUIT
numprocs=1
startsecs=0
process_name=WebServer(Nginx)
stderr_logfile=/var/log/nginx/error.log
stderr_logfile_maxbytes=10MB
stdout_logfile=/var/log/nginx/access.log
stdout_logfile_maxbytes=10MB
sudo supervisorctl reread && sudo supervisorctl update
I have faced this error several times.
If the server is a newly created instance and is facing this issue:
It might be because of a wrong config or a mistake made during setup, or because supervisor is not enabled.
Try restarting supervisor and reconnecting to EC2,
or
try reinstalling supervisor (@Scen),
or
try the approaches mentioned by @Yuvaraj Loganathan and @Dinesh Sunny. But mostly you might end up creating a new instance.
If the server was running perfectly for a long time but then suddenly stopped
and threw
unix:///var/run/supervisor.sock no such file on sudo supervisorctl status:
It may be due to high memory or disk usage; in my case usage was at 99% of 7.69 GB.
You can see that usage figure in the banner shown when you connect to EC2 via SSH or PuTTY.
You can upgrade your EC2 instance, or delete extra files such as logs (/var/log) and archives to free up space. But be careful not to delete any system files.
Restart supervisor
sudo service supervisor restart
Check sudo supervisorctl status
