Keep a process running 24/7 on EC2 - Python

I have a script that I need to be running 24/7. However, I cannot seem to get EC2 to stop killing the process.
I've tried daemonizing it with python-daemon,
I've tried nohup,
I've tried adding & at the end of the command to make it a background process,
and I've tried screen to assign it to a virtual session.
All of these work temporarily, but when I check an hour later with ps aux | grep python, the process is no longer running.
I've looked at the output and at the nohup.out file to see if it's crashing because of an error, but there is no error or output at all.
I've used signal handlers:
signal.signal(signal.SIGINT, exit_gracefully)
signal.signal(signal.SIGTERM, exit_gracefully)
Still nothing. I suspected there might be an error escaping my tests, but I ran the script in an open console for an hour and it worked perfectly, so something unexpected must be happening that is unrelated to my code.
An excerpt:
import signal
import time

from daemon import DaemonContext
import schedule

global exit_now

def exit_gracefully(*args):
    global exit_now
    exit_now = True

def run_server():
    global exit_now

    schedule.every(3).seconds.do(lambda: a_function(param1, param2))
    schedule.every(3).hours.do(lambda: another_function(param1, param2))
    schedule.every(10).minutes.do(function_with_no_params)

    signal.signal(signal.SIGINT, exit_gracefully)
    signal.signal(signal.SIGTERM, exit_gracefully)

    exit_now = False
    while not exit_now:
        schedule.run_pending()
        time.sleep(1)

    my_model.backup()
    print("Processes successfully stopped")

if __name__ == "__main__":
    with DaemonContext():
        run_server()
Edit: I tried adding a log file to both my daemon and my nohup run to catch and print whatever's breaking it, but upon exit there's no output at all. Additionally, I tried disowning the background process, but that didn't work either.
The OS is Amazon Linux. Here's a link to the codebase in case you would like to reproduce it: https://github.com/DavidTeju/Tweet-Generator
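
One diagnostic worth noting: python-daemon's DaemonContext accepts stdout and stderr file objects, so anything the daemonized process prints (including an uncaught traceback) can be captured to a file. A minimal, untested sketch reusing the run_server from the excerpt above:

from daemon import DaemonContext

# Keep whatever the daemon writes, including a final traceback,
# in files that survive the process.
out = open('/tmp/server.out', 'w+')
err = open('/tmp/server.err', 'w+')

if __name__ == "__main__":
    with DaemonContext(stdout=out, stderr=err):
        run_server()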

Related

Apscheduler keeps spawned processes alive for no reason

EDIT: I found the issue. It was a problem with PyCharm. I ran the .py outside of PyCharm and it worked as expected. In PyCharm I enabled "Emulate terminal in output console" and it now also works there...
Expectations:
Apscheduler spawns a thread that checks a website for something.
If that something is found (possibly several of them), the thread spawns one process per item to download it.
After five seconds the next check thread spawns, while the earlier downloads may continue in the background.
Problem:
The spawned processes never cease to exist, which breaks other parts of the code (not included), because I need to check whether the processes are done, etc.
If I use a simple time.sleep(5) instead (see code), it works as expected.
No, I cannot set max_instances to 1, because that would stop the scheduled job from running whenever there is an active download process.
Code:
import datetime
import multiprocessing

from apscheduler.schedulers.background import BackgroundScheduler

class DownloadThread(multiprocessing.Process):
    def __init__(self):
        super().__init__()
        print("Process started")

def main():
    print(multiprocessing.active_children())
    # prints: [<DownloadThread name='DownloadThread-1' pid=3188 parent=7088 started daemon>,
    #          <DownloadThread name='DownloadThread-3' pid=12228 parent=7088 started daemon>,
    #          <DownloadThread name='DownloadThread-2' pid=13544 parent=7088 started daemon>,
    #          ...
    #         ]
    new_process = DownloadThread()
    new_process.daemon = True
    new_process.start()
    new_process.join()

if __name__ == '__main__':
    sched = BackgroundScheduler()
    sched.add_job(main, 'interval', args=(), seconds=5, max_instances=999,
                  next_run_time=datetime.datetime.now())
    sched.start()

    while True:
        # main()        # works. Processes despawn.
        # time.sleep(5)
        input()
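
As an aside, multiprocessing.active_children() is documented to have the side effect of joining any child process that has already finished, so calling it periodically is one way to reap completed downloads without blocking. A minimal sketch (reap_finished is a hypothetical helper, not part of the code above):

import multiprocessing

def reap_finished():
    # active_children() joins already-finished children as a side effect,
    # so completed processes are reaped instead of lingering.
    running = multiprocessing.active_children()
    print("%d download processes still running" % len(running))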

Python daemon thread exits but process still runs in the background

I am using Python 2.7, and a Python thread doesn't kill its spawned process after the main program exits (I'm checking this with the ps -ax command on an Ubuntu machine).
I have the below thread class:
import os
import threading

class captureLogs(threading.Thread):
    '''
    Initialize the constructor.
    '''
    def __init__(self, deviceIp, fileTag):
        super(captureLogs, self).__init__()
        self._stop = threading.Event()
        self.deviceIp = deviceIp
        self.fileTag = fileTag

    def stop(self):
        self._stop.set()

    def stopped(self):
        return self._stop.isSet()

    '''
    Define the run method.
    '''
    def run(self):
        '''
        Make the thread capture logs.
        '''
        # os.system blocks until the shell command exits, so this thread
        # lives for as long as the adb logcat process does.
        cmdTorun = "adb logcat > " + self.deviceIp + '_' + self.fileTag + '.log'
        os.system(cmdTorun)
And I am creating the thread in another file, sample.py:
import logCapture
import os
import time

c = logCapture.captureLogs('100.21.143.168', 'somefile')
c.setDaemon(True)
c.start()
print "Started the log capture, now sleeping. Is this a daemon?", c.isDaemon()
time.sleep(5)
print "Sleep time is over"
c.stop()
print "Calling stop was successful:", c.stopped()
print "Thread is now completed and main program exiting"
I get the below output from the command line:
Started the log capture, now sleeping. Is this a daemon? True
Sleep time is over
Calling stop was successful: True
Thread is now completed and main program exiting
And sample.py exits.
But when I run the below command in a terminal,
ps -ax | grep "adb"
I still see the process running. (I am killing them manually for now using kill -9 17681 17682.)
Not sure what I am missing here.
My questions are:
1) Why is the process still alive when I already stopped it in my program?
2) Will it cause any problems if I don't bother about it?
3) Is there any better way to capture logs using a thread and monitor them?
EDIT: As suggested by @Bug Killer, I added the below method to my thread class,
def getProcessID(self):
    return os.getpid()
and used os.kill(c.getProcessID(), SIGTERM) in my sample.py. The program doesn't exit at all.
That is likely because you are using os.system in your thread. The process spawned by os.system will stay alive even after the thread is killed; in fact, it will stay alive forever unless you explicitly terminate it in your code or by hand (which it sounds like you are ultimately doing), or until the spawned process exits on its own. You can do this instead:
import atexit
import subprocess
deviceIp = '100.21.143.168'
fileTag = 'somefile'
# this is spawned in the background, so no threading code is needed
cmdTorun = "adb logcat > " + deviceIp +'_'+fileTag+'.log'
proc = subprocess.Popen(cmdTorun, shell=True)
# or register proc.kill if you feel like living on the edge
atexit.register(proc.terminate)
# Here is where all the other awesome code goes
Since all you are doing is spawning a process, creating a thread to do it is overkill and only complicates your program logic. Just spawn the process in the background as shown above and then let atexit terminate it when your program exits. And/or call proc.terminate explicitly; it should be fine to call repeatedly (much like close on a file object) so having atexit call it again later shouldn't hurt anything.
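
As a usage note for question 3, the Popen object also makes monitoring straightforward; a short hypothetical sketch, continuing from the proc above:

# poll() returns None while the process is still running,
# and the exit code once it has finished.
if proc.poll() is None:
    print "adb logcat is still running"
else:
    print "adb logcat exited with code", proc.returncode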

How to make Python apscheduler run in background

I want to make Python apscheduler run in the background; here is my code:
from apscheduler.schedulers.background import BackgroundScheduler
from datetime import datetime
import logging
import sys

logging.basicConfig(level=logging.DEBUG, stream=sys.stdout)

def singleton(cls, *args, **kw):
    instances = {}
    def _singleton(*args, **kw):
        if cls not in instances:
            instances[cls] = cls(*args, **kw)
        return instances[cls]
    return _singleton

@singleton
class MyScheduler(BackgroundScheduler):
    pass

def simple_task(timestamp):
    logging.info("RUNNING simple_task: %s" % timestamp)

scheduler = MyScheduler()
scheduler.start()
scheduler.add_job(simple_task, 'interval', seconds=5, args=[datetime.utcnow()])
When I run the command:
look:Python look$ python itger.py
I just got this:
INFO:apscheduler.scheduler:Scheduler started
DEBUG:apscheduler.scheduler:Looking for jobs to run
DEBUG:apscheduler.scheduler:No jobs; waiting until a job is added
INFO:apscheduler.scheduler:Added job "simple_task" to job store "default"
And ps:
ps -e | grep python
I just got 54615 ttys000 0:00.00 grep python
My problem is: how do I make this code run in the background, and how can I see that it's running, e.g. by having it print a log line every 5 seconds?
BackgroundScheduler runs in a background thread, which in this case, I guess, doesn't prevent the application's main thread from terminating.
Try adding this at the end of your application:
import time
print("Waiting to exit")
while True:
time.sleep(1)
... and then terminate your application with CTRL+C.
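
An alternative sketch, assuming nothing else needs the main thread: use a BlockingScheduler, whose start() call keeps the process alive by itself instead of returning:

from datetime import datetime
import logging
import sys

from apscheduler.schedulers.blocking import BlockingScheduler

logging.basicConfig(level=logging.DEBUG, stream=sys.stdout)

def simple_task(timestamp):
    logging.info("RUNNING simple_task: %s" % timestamp)

scheduler = BlockingScheduler()
scheduler.add_job(simple_task, 'interval', seconds=5, args=[datetime.utcnow()])
scheduler.start()  # blocks here; stop with Ctrl+C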
Threads are not a magical way of having code run and be managed by the OS. They are local to your process only, so if that process terminates or dies unexpectedly, so does your thread.
So the answer to your question is: don't use threads. Write your program in a normal fashion so you can invoke it from your command line, and then use an OS-based scheduler such as cron to schedule it.
Alternatively, if your program needs to run continuously because it e.g. builds up caches that are expensive to re-compute every 5 minutes, use a process supervisor such as supervisord to ensure the program continues executing even after a reboot or crash.

eventlet hangs when using futures.ProcessPoolExecutor

I'm using Ubuntu 12.04 Server x64, Python 2.7.3, futures==2.1.5, eventlet==0.14.0
Did anybody hit the same problem?
import eventlet
import futures
import random

# eventlet.monkey_patch()  # Uncomment this line and it will hang!

def test():
    if random.choice((True, False)):
        raise Exception()
    print "OK this time"

def done(f):
    print "done callback!"

def main():
    with futures.ProcessPoolExecutor() as executor:
        fu = []
        for i in xrange(6):
            f = executor.submit(test)
            f.add_done_callback(done)
            fu.append(f)
        futures.wait(fu, return_when=futures.ALL_COMPLETED)

if __name__ == '__main__':
    main()
This code, if you uncomment the line, will hang. I can only stop it by pressing Ctrl+C. In this case the following KeyboardInterrupt traceback is printed: https://gist.github.com/max-lobur/8526120#file-traceback
This works fine with ThreadPoolExecutor.
Any feedback is appreciated
I think the KeyboardInterrupt reaches the child processes because under Ubuntu and most Linux systems a new process is created with fork: the new process is started from the current one and shares the same stdin, stdout, and stderr, so the KeyboardInterrupt can reach the child process. Maybe someone who knows more can add their thoughts.
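
An untested workaround sketch: eventlet.monkey_patch() accepts keyword arguments that limit which modules get patched, and leaving os and thread unpatched keeps the primitives that ProcessPoolExecutor's worker-management machinery relies on. Whether this avoids the hang in this exact setup is an assumption:

import eventlet

# Patch networking only; leave os/thread untouched so that fork() and
# ProcessPoolExecutor's internal threads behave normally.
eventlet.monkey_patch(socket=True, select=True, time=True,
                      os=False, thread=False)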

Trapping and handling taskkill in Windows Python

I am using Python 2.6.6 for Windows (on Windows XP SP3) with pywin32-218.
In my Python application, I have a second thread (apart from the main thread) which spawns a subprocess to run another Windows executable.
My problem is that when the main process (python.exe) is killed (e.g. using taskkill), I want to terminate the subprocess (calc.exe) and perform some cleanup.
I tried various methods (atexit, signal and win32api.SetConsoleCtrlHandler), but none of them seem to be able to trap the taskkill signal.
My code is as follows (test.py):
import sys
import os
import signal
import win32api
import atexit
import time
import threading
import subprocess

class SecondThread(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
        self.secondProcess = None

    def run(self):
        secondCommand = [r'C:\WINDOWS\system32\calc.exe']
        self.secondProcess = subprocess.Popen(secondCommand)
        print 'calc.exe running'
        self.secondProcess.wait()
        print 'calc.exe stopped'
        # do cleanup here

    def stop(self):
        if self.secondProcess and self.secondProcess.returncode is None:
            self.secondProcess.kill()

secondThread = SecondThread()

def main():
    secondThread.start()

def cleanup():
    print 'cleaning up'
    secondThread.stop()
    print 'cleaned up'

atexit.register(cleanup)

def handleSignal(signalNum, frame):
    print 'handleSignal'
    cleanup()
    sys.exit(0)

for signalNum in (signal.SIGINT, signal.SIGILL, signal.SIGABRT, signal.SIGFPE,
                  signal.SIGSEGV, signal.SIGTERM):
    signal.signal(signalNum, handleSignal)

def handleConsoleCtrl(signalNum):
    print 'handleConsoleCtrl'
    cleanup()

win32api.SetConsoleCtrlHandler(handleConsoleCtrl, True)

main()
The application is launched using
python.exe test.py
The console then prints "calc.exe running", the Calculator application runs, and in Process Explorer I can see calc.exe as a sub-process of python.exe.
Then I kill the main process using
taskkill /pid XXXX /f
(where XXXX is the PID for python.exe)
What happens after this is that the command prompt returns without further output (i.e. none of "cleaning up", "handleSignal" or "handleConsoleCtrl" gets printed), the Calculator application continues running, and from Process Explorer, python.exe is no longer running but calc.exe has re-parented itself.
Taskkill (normally) sends WM_CLOSE. If your application is console-only and has no window, you can't receive a bare WM_CLOSE message, although you can get CTRL_CLOSE_EVENT via a handler set by SetConsoleCtrlHandler (which happens when your controlling terminal window is closed).
If you have to stick with taskkill (rather than using a different program to send a Ctrl-C), one solution is to set the aforementioned handler and ensure your application has its own terminal window (e.g. by using start.exe "" <yourprog> to invoke it). See https://stackoverflow.com/a/23197789/4513656 for details and alternatives.
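
A hedged sketch of the own-terminal-window approach driven from Python itself: relaunch the script with the CREATE_NEW_CONSOLE creation flag (the constant below is Windows' documented 0x00000010; the relaunch logic is illustrative only, not from the original post):

import subprocess
import sys

CREATE_NEW_CONSOLE = 0x00000010  # same value as win32process.CREATE_NEW_CONSOLE

# Give the child its own console window so CTRL_CLOSE_EVENT can reach
# the handler installed via SetConsoleCtrlHandler.
subprocess.Popen([sys.executable, 'test.py'],
                 creationflags=CREATE_NEW_CONSOLE)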
