I am trying to create a module init and a module mydaemon in Python 2.7 under Debian 7.
The module init checks the requirements such as db connections etc. Then, mydaemon runs in a thread and uses the database to do things and write a logfile.
The problem is that when I set the thread as a daemon, the logging and the function call fail.
If the thread is not a daemon, everything works fine.
Where am I going wrong, or what would be a better approach?
init.py
import mydaemon, threading
print 'start'
t = threading.Thread( target = mydaemon.start, args = () )
t.daemon = True # error here
t.start()
mydaemon.py
import logging
def start():
    work()
    return
def work():
    logging.basicConfig( filename = 'mylog.log', level = logging.DEBUG )
    logging.info('foo log')
    print 'foo console'
    return
My colleague found another method, using the external python-daemon module:
http://www.gavinj.net/2012/06/building-python-daemon-process.html
The tutorial contains some errors, so read the comments too ;-)
Making it a daemon means the background thread dies as soon as the main app closes. Your code 'works' as is; simply add a pause to init.py to see this behavior:
...
t.start()
import time
time.sleep(1)
This is discussed in more detail at http://pymotw.com/2/threading/#daemon-vs-non-daemon-threads.
The simple way to fix this is to join the thread.
import mydaemon, threading
print 'start'
t = threading.Thread( target = mydaemon.start, args = () )
t.daemon = True # error here
t.start()
t.join()
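To see why the join matters, here is a minimal sketch. It is written for Python 3 (unlike the Python 2.7 code above), and the work function, the results list and run_worker are placeholders I made up for illustration, not part of the original modules. It logs to the console instead of mylog.log to stay free of side effects.

```python
import logging
import threading

results = []  # hypothetical stand-in for the daemon's side effects


def work():
    # Stand-in for mydaemon.work(): log a line and record that we ran.
    logging.info("foo log")
    results.append("foo console")


def run_worker(wait=True):
    """Start work() in a daemon thread; optionally wait for it to finish."""
    t = threading.Thread(target=work)
    t.daemon = True   # the thread will not keep the interpreter alive
    t.start()
    if wait:
        t.join()      # without this, the main thread may exit first
    return t


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    run_worker()
    print(results)
```

With wait=False the main thread can reach the end of the script before work() has run, which is the behavior described in the question.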
Related
Right now, I'm using subprocess to run a long-running job in the background. For multiple reasons (PyInstaller + AWS CLI) I can't use subprocess anymore.
Is there an easy way to achieve the same thing as below? That is, running a long-running Python function in a multiprocessing pool (or something else) and processing its stdout/stderr in real time?
import subprocess
process = subprocess.Popen(
    ["python", "long-job.py"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    shell=True,
)
while True:
    out = process.stdout.read(2000).decode()
    if not out:
        err = process.stderr.read().decode()
    else:
        err = ""
    if (out == "" or err == "") and process.poll() is not None:
        break
    live_stdout_process(out)
Thanks
Getting it cross-platform is messy. First of all, the Windows implementation of non-blocking pipes is neither user-friendly nor portable.
One option is to have your application read its command-line arguments and conditionally execute a file; you then get to keep using subprocess, since you will be launching yourself with a different argument.
But to keep it to multiprocessing:
The output must be logged to queues instead of pipes.
You need the child to execute a Python file; this can be done using runpy to execute the file as __main__.
This runpy function should run under a multiprocessing child, and that child must first redirect its stdout and stderr in the initializer.
When an error happens, your main application must catch it. But if it is too busy reading the output it won't be able to wait for the error, so a child thread has to start the multiprocessing child and wait for the error.
The main process has to create the queues, launch the child thread, and read the output.
Putting it all together:
import multiprocessing
from multiprocessing import Queue
import sys
import concurrent.futures
import threading
import traceback
import runpy
import time
class StdoutQueueWrapper:
    def __init__(self,queue:Queue):
        self._queue = queue
    def write(self,text):
        self._queue.put(text)
    def flush(self):
        pass
def function_to_run():
    # runpy.run_path("long-job.py",run_name="__main__") # run long-job.py
    print("hello") # print something
    raise ValueError # error out
def initializer(stdout_queue: Queue,stderr_queue: Queue):
    sys.stdout = StdoutQueueWrapper(stdout_queue)
    sys.stderr = StdoutQueueWrapper(stderr_queue)
def thread_function(child_stdout_queue,child_stderr_queue):
    with concurrent.futures.ProcessPoolExecutor(1, initializer=initializer,
                                                initargs=(child_stdout_queue, child_stderr_queue)) as pool:
        result = pool.submit(function_to_run)
        try:
            result.result()
        except Exception as e:
            child_stderr_queue.put(traceback.format_exc())
if __name__ == "__main__":
    child_stdout_queue = multiprocessing.Queue()
    child_stderr_queue = multiprocessing.Queue()
    child_thread = threading.Thread(target=thread_function,args=(child_stdout_queue,child_stderr_queue),daemon=True)
    child_thread.start()
    while True:
        while not child_stdout_queue.empty():
            var = child_stdout_queue.get()
            print(var,end='')
        while not child_stderr_queue.empty():
            var = child_stderr_queue.get()
            print(var,end='')
        if not child_thread.is_alive():
            break
        time.sleep(0.01) # check output every 0.01 seconds
Note that a direct consequence of running as a multiprocess is that if the child runs into a segmentation fault or some other unrecoverable error, the parent will also die; hence running yourself under subprocess might be the better option if segfaults are expected.
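The "launch yourself with a different argument" option mentioned above could look roughly like the sketch below. The --run-job flag and the function names are invented for this example. The sys.frozen check is the usual way to detect a PyInstaller bundle, where sys.executable is the application itself rather than a Python interpreter.

```python
import runpy
import subprocess
import sys

JOB_FLAG = "--run-job"  # hypothetical flag name, chosen for this sketch


def build_command(script_path):
    """Command line that relaunches this program in 'job' mode."""
    if getattr(sys, "frozen", False):
        # PyInstaller bundle: sys.executable is the application itself.
        return [sys.executable, JOB_FLAG, script_path]
    # Plain interpreter: pass this script to the interpreter again.
    return [sys.executable, sys.argv[0], JOB_FLAG, script_path]


def dispatch(argv):
    """Run the job and return True if argv asks for job mode."""
    if len(argv) >= 2 and argv[0] == JOB_FLAG:
        runpy.run_path(argv[1], run_name="__main__")
        return True
    return False


def launch_job(script_path):
    """Start the job as a child process with piped output."""
    return subprocess.Popen(
        build_command(script_path),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )


if __name__ == "__main__":
    if not dispatch(sys.argv[1:]):
        print("normal application start")
```

The parent would call launch_job("long-job.py") and read process.stdout as in the question; the relaunched copy sees the flag, runs the file through runpy and exits.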
I'm on Ubuntu 16.04.6 LTS with python-2.7.12. I'm not an expert in Python, but I have to maintain some code. Here is a snippet:
from threading import Thread
...
class Shell(cmd.Cmd):
    ...
    def do_start(self, line):
        threads = []
        t = Thread(target=traffic(line, arg1, arg2, arg3))
        threads.append(t)
        t.start()
        t.join()
    ...
if __name__ == '__main__':
    global config
    global args
    args = parse_args()
    config = configparser.ConfigParser()
    config.read(args.FILE)
    s = Shell()
    ...
So it starts a small command-line shell where I can execute some commands. It works, but it blocks the CLI while the thread runs, so I googled and thought that adding t.setDaemon(True) would help. I tried it both before and after t.start(), and it had no effect. Is it not supported in this version, or am I doing something wrong?
Thanks.
The t.join() makes the main thread wait for the one created, so the CLI is blocked.
If you want to run your CLI and not block the terminal, you need to run it in the background.
If you run on Linux, you can simply use the & sign.
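Apart from the join(), it is worth noting that Thread(target=traffic(...)) calls traffic() immediately in the calling thread and passes its return value as the target, so the work never runs in the background at all. A minimal sketch of the non-blocking form follows, written for Python 3; the traffic body and start_in_background are placeholders, since the real function is not shown in the question.

```python
import threading
import time


def traffic(line, results):
    # Hypothetical stand-in for the real traffic() function.
    time.sleep(0.05)
    results.append(line)


def start_in_background(line, results):
    # Pass the callable and its arguments separately; writing
    # target=traffic(line, results) would run traffic() right here
    # in the calling thread and hand its return value to Thread.
    t = threading.Thread(target=traffic, args=(line, results))
    t.daemon = True   # do not keep the process alive on exit
    t.start()         # no join(): the caller continues immediately
    return t


if __name__ == "__main__":
    done = []
    worker = start_in_background("start eth0", done)
    print("prompt is free while the worker runs:", worker.is_alive())
    worker.join()     # only for the demo, so the output is complete
    print(done)
```

Inside do_start() the same pattern would apply: build the thread with target= and args=, start it, and return without joining so the cmd loop gets its prompt back.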
I am testing Python threading with the following script:
import threading
class FirstThread (threading.Thread):
    def run (self):
        while True:
            print 'first'
class SecondThread (threading.Thread):
    def run (self):
        while True:
            print 'second'
FirstThread().start()
SecondThread().start()
This is running in Python 2.7 on Kubuntu 11.10. Ctrl+C will not kill it. I also tried adding a handler for system signals, but that did not help:
import signal
import sys
def signal_handler(signal, frame):
    sys.exit(0)
signal.signal(signal.SIGINT, signal_handler)
To kill the process I am killing it by PID after sending the program to the background with Ctrl+Z, which isn't being ignored. Why is Ctrl+C being ignored so persistently? How can I resolve this?
Ctrl+C terminates the main thread, but because your threads aren't in daemon mode, they keep running, and that keeps the process alive. We can make them daemons:
f = FirstThread()
f.daemon = True
f.start()
s = SecondThread()
s.daemon = True
s.start()
But then there's another problem - once the main thread has started your threads, there's nothing else for it to do. So it exits, and the threads are destroyed instantly. So let's keep the main thread alive:
import time
while True:
    time.sleep(1)
Now it will keep printing 'first' and 'second' until you hit Ctrl+C.
Edit: as commenters have pointed out, the daemon threads may not get a chance to clean up things like temporary files. If you need that, then catch the KeyboardInterrupt on the main thread and have it co-ordinate cleanup and shutdown. But in many cases, letting daemon threads die suddenly is probably good enough.
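For the cleanup case described in the edit, one possible shape is a shared threading.Event that the main thread sets when it catches KeyboardInterrupt. This is only a sketch for Python 3; worker, run and the cleaned_up list are names made up here, and the demo simulates Ctrl+C so that it terminates on its own.

```python
import threading
import time


def worker(name, stop, cleaned_up):
    try:
        while not stop.is_set():
            # ... one unit of real work would go here ...
            stop.wait(0.1)            # sleep, but wake early on shutdown
    finally:
        cleaned_up.append(name)       # e.g. delete temporary files


def run(names, main_loop, cleaned_up):
    stop = threading.Event()
    threads = [threading.Thread(target=worker, args=(n, stop, cleaned_up))
               for n in names]
    for t in threads:
        t.start()
    try:
        main_loop()                   # in real use: sleep until Ctrl+C
    except KeyboardInterrupt:
        pass
    finally:
        stop.set()                    # ask every worker to finish
        for t in threads:
            t.join()                  # wait for their cleanup to run
    return threads


if __name__ == "__main__":
    def fake_ctrl_c():
        time.sleep(0.3)
        raise KeyboardInterrupt       # simulates the user pressing Ctrl+C

    log = []
    run(["first", "second"], fake_ctrl_c, log)
    print(sorted(log))
```

The threads here are not daemons, so the process only exits after each worker has gone through its finally block.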
KeyboardInterrupt and signals are only seen by the process (i.e. the main thread). Have a look at "Ctrl-c i.e. KeyboardInterrupt to kill threads in python".
I think it's best to call join() on your threads when you expect them to die. I've taken the liberty of changing your loops so that they can end (you can add whatever cleanup is required there as well). The variable die is checked on each pass, and when it's True, the thread's loop ends.
import threading
import time
class MyThread (threading.Thread):
    die = False
    def __init__(self, name):
        threading.Thread.__init__(self)
        self.name = name
    def run (self):
        while not self.die:
            time.sleep(1)
            print (self.name)
    def join(self):
        self.die = True
        super().join()
if __name__ == '__main__':
    f = MyThread('first')
    f.start()
    s = MyThread('second')
    s.start()
    try:
        while True:
            time.sleep(2)
    except KeyboardInterrupt:
        f.join()
        s.join()
An improved version of @Thomas K's answer:
Define a helper function is_any_thread_alive(), following this gist, so that main() terminates automatically once the threads have finished.
Example code:
import threading
import time
def job1():
    ...
def job2():
    ...
def is_any_thread_alive(threads):
    return True in [t.is_alive() for t in threads]
if __name__ == "__main__":
    ...
    t1 = threading.Thread(target=job1,daemon=True)
    t2 = threading.Thread(target=job2,daemon=True)
    t1.start()
    t2.start()
    while is_any_thread_alive([t1,t2]):
        time.sleep(0)
One simple 'gotcha' to beware of: are you sure CAPS LOCK isn't on?
I was running a Python script in the Thonny IDE on a Pi4. With CAPS LOCK on, Ctrl+Shift+C is passed to the keyboard buffer, not Ctrl+C.
I am having a problem collecting logs from the following script.
Once I set SLEEP_TIME to too "small" a value, the LoggingThread
threads somehow block the logging module. The script freezes on the logging request
in the action function. If SLEEP_TIME is about 0.1, the script collects
all log messages as I expect.
I tried to follow this answer, but it does not solve my problem.
import multiprocessing
import threading
import logging
import time
SLEEP_TIME = 0.000001
logger = logging.getLogger()
ch = logging.StreamHandler()
ch.setFormatter(logging.Formatter('%(asctime)s %(levelname)s %(funcName)s(): %(message)s'))
ch.setLevel(logging.DEBUG)
logger.setLevel(logging.DEBUG)
logger.addHandler(ch)
class LoggingThread(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
    def run(self):
        while True:
            logger.debug('LoggingThread: {}'.format(self))
            time.sleep(SLEEP_TIME)
def action(i):
    logger.debug('action: {}'.format(i))
def do_parallel_job():
    processes = multiprocessing.cpu_count()
    pool = multiprocessing.Pool(processes=processes)
    for i in range(20):
        pool.apply_async(action, args=(i,))
    pool.close()
    pool.join()
if __name__ == '__main__':
    logger.debug('START')
    #
    # multithread part
    #
    for _ in range(10):
        lt = LoggingThread()
        lt.setDaemon(True)
        lt.start()
    #
    # multiprocess part
    #
    do_parallel_job()
    logger.debug('FINISH')
How to use logging module in multiprocess and multithread scripts?
This is probably bug 6721.
The problem is common in any situation where you have locks, threads and forks. If thread 1 holds a lock while thread 2 calls fork, the forked process will contain only thread 2, and the lock will be held forever. In your case, that is logging.StreamHandler.lock.
A fix can be found here (permalink) for the logging module. Note that you need to take care of any other locks, too.
I ran into a similar issue recently while using the logging module together with the Pathos multiprocessing library. I'm still not 100% sure, but it seems that in my case the problem was caused by the logging handler trying to reuse a lock object from within different processes.
I was able to fix it with a simple wrapper around the default logging handler:
import threading
from collections import defaultdict
from multiprocessing import current_process
import colorlog
class ProcessSafeHandler(colorlog.StreamHandler):
    def __init__(self):
        super().__init__()
        self._locks = defaultdict(lambda: threading.RLock())
    def acquire(self):
        current_process_id = current_process().pid
        self._locks[current_process_id].acquire()
    def release(self):
        current_process_id = current_process().pid
        self._locks[current_process_id].release()
By default, multiprocessing will fork() the process in the pool when running on Linux. The resulting subprocess will lose all running threads except for the main one. So if you're on Linux, that's the problem.
First action item: You shouldn't ever use the fork()-based pool; see https://pythonspeed.com/articles/python-multiprocessing/ and https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods.
On Windows, and I think newer versions of Python on macOS, the "spawn"-based pool is used. This is also what you ought to use on Linux. In this setup, a new Python process is started. As you would expect, the new process doesn't have any of the threads from the parent process, because it's a new process.
Second action item: you'll want to have logging setup done in each subprocess in the pool; the logging setup for the parent process isn't sufficient to get logs from the worker processes. You do this with the initializer keyword argument to Pool, e.g. write a function called setup_logging() and then do pool = multiprocessing.Pool(initializer=setup_logging) (https://docs.python.org/3/library/multiprocessing.html#module-multiprocessing.pool).
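Both action items together might look like the sketch below (Python 3). The body of setup_logging and the log format are my own assumptions; only the initializer keyword and the "spawn" start method come from the linked documentation.

```python
import logging
import multiprocessing


def setup_logging():
    """Configure logging inside one process (parent or pool worker)."""
    root = logging.getLogger()
    if not root.handlers:                      # avoid duplicate handlers
        handler = logging.StreamHandler()
        handler.setFormatter(logging.Formatter(
            "%(asctime)s %(processName)s %(levelname)s %(message)s"))
        root.addHandler(handler)
    root.setLevel(logging.DEBUG)


def action(i):
    logging.getLogger().debug("action: %s", i)
    return i


def do_parallel_job(n=20):
    # "spawn" starts fresh interpreters, so no lock held by a parent
    # thread can be inherited in a locked state.
    ctx = multiprocessing.get_context("spawn")
    with ctx.Pool(initializer=setup_logging) as pool:
        return pool.map(action, range(n))
```

Call setup_logging() and do_parallel_job() from under an if __name__ == '__main__': guard, as in the original script; with "spawn" the guard is required because each worker re-imports the main module.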
I'm new to Python and I'm writing a script that
includes some timed routines.
My current approach is to instantiate a class
that includes those Timers (from: threading.Timer),
but I don't want the script to return when it gets to the
end of the function:
import mytimer
timer = mytimer()
Suppose I have a simple script like that one. All it
does is instantiate a mytimer object which performs a series
of timed activities.
In order for the application not to exit, I could use Qt like this:
from PyQt4.QtCore import QCoreApplication
import mytimer
import sys
def main():
    app = QCoreApplication(sys.argv)
    timer = mytimer()
    sys.exit(app.exec_())
if __name__ == '__main__':
    main()
This way, the sys.exit() call won't return immediately, and the
timer would just keep doing its thing 'forever' in background.
Although this is a solution I've used before, using Qt just for this doesn't
feel right to me.
So my question is: is there any way to accomplish this using standard Python?
Thanks
Create a function in your script which tests a select or poll object to terminate a loop. Check out serve_forever in SocketServer.py from the standard library as an example.
A Google search for "python timer" finds:
http://docs.python.org/library/sched.html
http://docs.python.org/release/2.5.2/lib/timer-objects.html
The sched module seems to be exactly what you need.
Example:
>>> import sched, time
>>> s = sched.scheduler(time.time, time.sleep)
>>> def print_time(): print "From print_time", time.time()
...
>>> def print_some_times():
...     print time.time()
...     s.enter(5, 1, print_time, ())
...     s.enter(10, 1, print_time, ())
...     s.run()
...     print time.time()
...
>>> print_some_times()
930343690.257
From print_time 930343695.274
From print_time 930343700.273
930343700.276
Once you have built your queue of times for things to happen, you just call the .run() method on your sched instance, and it will automatically wait until the queue is emptied, then will complete. So you can just put s.run() as the last thing in your script, and it will automatically exit only when the timed tasks are all done.
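The same idea as a script rather than an interpreter session, in Python 3 syntax. The task names and the short delays are invented for this sketch; the point is only that s.run() at the end keeps the script alive until the queue is empty.

```python
import sched
import time

scheduler = sched.scheduler(time.monotonic, time.sleep)
fired = []  # names of the tasks that have run, in order


def task(name):
    fired.append(name)


def schedule_tasks(delays):
    """Queue one task per (name, delay-in-seconds) pair."""
    for name, delay in delays:
        scheduler.enter(delay, 1, task, argument=(name,))


if __name__ == "__main__":
    schedule_tasks([("second", 0.2), ("first", 0.1)])
    scheduler.run()     # blocks until the queue is empty, then returns
    print(fired)
```

A task that should repeat can call scheduler.enter() again from inside itself before returning, which keeps the queue non-empty and run() blocked.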
import mytimer
import sys
from threading import Lock
lock = Lock()
lock.acquire() # put lock into locked state
def main():
    timer = mytimer()
    lock.acquire() # blocks until someone calls lock.release()
if __name__ == '__main__':
    main()
If you want a clean exit, you can just make mytimer() call lock.release() at some point.
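A variation on the lock trick uses threading.Event, which reads a little more naturally as a "finished" signal. In this Python 3 sketch start_timers is a made-up stand-in for mytimer (which the question does not show): it re-arms a threading.Timer a few times and then sets the event.

```python
import threading


def start_timers(done, interval=0.05, repeats=3):
    """Hypothetical stand-in for mytimer: fire a few times, then finish."""
    ticks = []

    def tick():
        ticks.append(len(ticks) + 1)
        if len(ticks) < repeats:
            threading.Timer(interval, tick).start()   # re-arm the timer
        else:
            done.set()                                # signal clean exit

    threading.Timer(interval, tick).start()
    return ticks


def main():
    done = threading.Event()
    ticks = start_timers(done)
    done.wait()          # blocks like lock.acquire(), until done.set()
    return ticks


if __name__ == "__main__":
    print(main())
```

If the timers should run 'forever', simply never call done.set(); done.wait() then blocks until the process is interrupted.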