Logging inside Threads When Logger is Already Configured - python

EDIT: Repo with all code (branch "daemon"). The question is regarding the code in the file linked to.
My main program configures logging like this (options have been simplified):
logging.basicConfig(level='DEBUG', filename="/some/directory/cstash.log")
Part of my application starts a daemon, for which I use the daemon package:
with daemon.DaemonContext(
    pidfile=daemon.pidfile.PIDLockFile(self.pid_file),
    stderr=self.log_file,
    stdout=self.log_file
):
    self.watch_files()
where self.log_file is a file I've opened for writing.
When I start the application, I get:
--- Logging error ---
Traceback (most recent call last):
File "/Users/afraz/.pyenv/versions/3.7.2/lib/python3.7/logging/__init__.py", line 1038, in emit
self.flush()
File "/Users/afraz/.pyenv/versions/3.7.2/lib/python3.7/logging/__init__.py", line 1018, in flush
self.stream.flush()
OSError: [Errno 9] Bad file descriptor
If I switch off the logging to a file in the daemon, the logging in my main application works, and if I turn off the logging to a file in my main application, the logging in the daemon works. If I set them up to log to a file (even different files), I get the error above.

After trying many things, here's what worked:
def process_wrapper():
    with self.click_context:
        self.process_queue()

def watch_wrapper():
    with self.click_context:
        self.watch_files()

with daemon.DaemonContext(
    pidfile=daemon.pidfile.PIDLockFile(self.pid_file),
    files_preserve=[logger.handlers[0].stream.fileno()],
    stderr=self.log_file,
    stdout=self.log_file
):
    logging.info("Started cstash daemon")
    while True:
        threading.Thread(target=process_wrapper).start()
        time.sleep(5)
        threading.Thread(target=watch_wrapper).start()
There were two main things wrong:
daemon.DaemonContext needs files_preserve set to the file logging handler's descriptor, so that it doesn't close the file once the daemon context is entered. This is the actual fix for the original problem; a stripped-down sketch of it follows below.
Additionally, both methods needed to run in separate threads, not just one. The while True loop in the main thread was preventing the other method from running, so putting both into separate threads lets them run concurrently.
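For reference, a minimal sketch of just the files_preserve fix, outside the full application (the log file path is the one from the question; the pid file path is a placeholder):
import logging
import daemon
import daemon.pidfile

logging.basicConfig(level='DEBUG', filename="/some/directory/cstash.log")
logger = logging.getLogger()

# Preserve the log file's descriptor across daemonisation; otherwise
# DaemonContext closes it and every later logging call fails with EBADF.
with daemon.DaemonContext(
        pidfile=daemon.pidfile.PIDLockFile("/tmp/cstash.pid"),
        files_preserve=[logger.handlers[0].stream.fileno()]):
    logging.info("logging still works inside the daemon context")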

Related

The child process exception is not visible when BrokenProcessPool is raised

I have the following code:
import asyncio
from concurrent.futures import ProcessPoolExecutor
from typing import Awaitable  # used in the return annotation below

PROCESS_POOL_EXECUTOR = ProcessPoolExecutor(max_workers=2)

def run_in_process(blocking_task, *args) -> Awaitable:
    event_loop = asyncio.get_event_loop()
    return event_loop.run_in_executor(PROCESS_POOL_EXECUTOR, blocking_task, *args)
It has been working fine until I added an additional volume and mounted it in an EC2 instance. After I did that, it is raising the following exception:
File "/proc/self/fd/3/repo/utils/asyncio.py", line 63, in run_in_process
return event_loop.run_in_executor(PROCESS_POOL_EXECUTOR, blocking_task, *args)
File "/conda/lib/python3.8/asyncio/base_events.py", line 783, in run_in_executor
executor.submit(func, *args), loop=self)
File "/conda/lib/python3.8/concurrent/futures/process.py", line 629, in submit
raise BrokenProcessPool(self._broken)
concurrent.futures.process.BrokenProcessPool: A child process terminated abruptly, the process pool is not usable anymore
There is nothing except this log. If I understood correctly, this means that the worker raised some exception and that's why the child process terminated. But I don't see that child process exception. That's why I have no idea what is going wrong.
It is probably related to that additional volume and mounting because it works without that new volume. I just don't know what exactly is going wrong.
I tried to run the code in ipython and it worked just fine there too.
I understand this is a poor question because it isn't reproducible, but maybe someone has seen this before and has some idea.
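No answer was given here, but one way to surface a Python-level exception from the worker (a debugging sketch, not part of the original code; _traced is a hypothetical helper) is to wrap the submitted callable so the traceback is printed from inside the child before the pool reports BrokenProcessPool. If the child is killed outright (for example by the OOM killer), there is no Python traceback to capture and this will not help:
import asyncio
import traceback
from concurrent.futures import ProcessPoolExecutor
from typing import Awaitable

PROCESS_POOL_EXECUTOR = ProcessPoolExecutor(max_workers=2)

def _traced(fn, *args):
    # Runs inside the worker process: print the full traceback to the
    # worker's stderr, since the parent may only see BrokenProcessPool.
    try:
        return fn(*args)
    except BaseException:
        traceback.print_exc()
        raise

def run_in_process(blocking_task, *args) -> Awaitable:
    event_loop = asyncio.get_event_loop()
    return event_loop.run_in_executor(PROCESS_POOL_EXECUTOR, _traced, blocking_task, *args)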

kivy logging; 'Too many logfile, remove them'

When using kivy logging like this:
from kivy.logger import Logger
from kivy.config import Config
Config.set('kivy', 'log_enable', 1)
Config.set('kivy', 'log_dir', '/home/dude/folder')
Config.set('kivy', 'log_level', 'debug')
Config.set('kivy', 'log_name', 'my_file.log')
Config.write()
Logger.debug('main:switching stuff on')
Logger.info('socket:send command to raspberry')
I always get the error:
[ERROR ] Error while activating FileHandler logger
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/kivy/logger.py", line 220, in emit
self._configure()
File "/usr/lib/python2.7/dist-packages/kivy/logger.py", line 171, in _configure
raise Exception('Too many logfile, remove them')
Exception: Too many logfile, remove them
... even after removing any file with this name.
What am I missing here?
I also get the error when running bigger programs which actually contain kivy widgets and apps.
Setting the correct options in the /home/.kivy/config.ini file solved the problem. All existing log files in the configured directory have to be removed first: the Kivy logger does not append to an existing file, it just raises the error above.
This is because the Kivy Logger does not keep a single file when dumping stdout into a text file (possibly because it calls the handler multiple times; I'm not sure). Instead it will occasionally create another file with a number appended (replacing %_ in the log name) and continue logging there. However, if for whatever reason it cannot append this number, or the count reaches over 10,000, it raises the above exception and exits the app. Great if you want to avoid flooding, as the notes in the Kivy code indicate; not so great when you are not aware of Kivy's naming convention.
Kivy Logger -> https://kivy.org/doc/stable/_modules/kivy/logger.html
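A hedged sketch of that naming convention (the directory is the one from the question; adding the %_ placeholder to the log name is an assumption about how to avoid the error, based on the pattern substitution described in the Kivy source linked above):
from kivy.config import Config

# %_ is replaced by an incrementing number, so Kivy can create a new numbered
# log file each run instead of refusing once the fixed name already exists.
Config.set('kivy', 'log_enable', 1)
Config.set('kivy', 'log_dir', '/home/dude/folder')
Config.set('kivy', 'log_level', 'debug')
Config.set('kivy', 'log_name', 'my_file_%_.log')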

File cannot be found when I use Fabric put in multi-threading to copy a temp file created by tempfile.mkstemp to a remote file

Python Version: 2.6.4
Fabric Version: 1.9.0
I have an automation testing framework that executes cases in parallel using threading.Thread (3 threads in my case).
Each thread worker uses Fabric put (we add a wrapper around this function, though) to copy a temporary file created by tempfile.mkstemp to a remote file.
The problem is that it always gives me an error that the file cannot be found; from the exception, the error happens during the 'put'.
Here is the code that does the 'put':
MyShell.py (parent class of MyFabShell):
def putFileContents(self, file, contents):
    fd, tmpFile = tempfile.mkstemp()
    os.system('chmod 777 %s' % tmpFile)
    contentsFile = open(tmpFile, 'w')
    contentsFile.write(contents)
    contentsFile.close()
    dstTmpFile = self.genTempFile()
    localShell = getLocalShell()
    self.get(localShell, tmpFile, dstTmpFile)  # use local shell to put tmpFile to dstTmpFile
    os.close(fd)
    os.remove(tmpFile)
    #os.remove(tmpFile)
MyFabShell.py:
def get(self, srcShell, srcPath, dstPath):
    srcShell.put(self, srcPath, dstPath)

def put(self, dstShell, srcPath, dstPath):
    if not self.pathExists(srcPath):  # line 158
        raise Exception("Cannot put <%s>, does not exist." % srcPath)
    # do fabric get/put in the following
    # ...
The call of put results in an error:
...
self.shell.putFileContents(configFile, contents)
File "/path/to/MyShell.py", line 401, in putFileContents
self.get(localShell, tmpFile, dstTmpFile)
File "/path/to/MyFabShell.py", line 134, in get
srcShell.put( self, srcPath, dstPath )
File "/path/to/myFabShell.py", line 158, in put
raise Exception( "Cannot put <%s>, does not exist." % srcPath )
Exception: Cannot put </tmp/tmpwt3hoO>, does not exist.
I initially suspected the file might be removed during the put, so I commented out os.remove. However, I got the same error again.
From the exception log, it should not be a problem with the Fabric put itself, since the exception is thrown before the Fabric get/put is executed.
Is mkstemp NOT safe when multithreading is involved? The documentation says "There are no race conditions in the file’s creation." Or does my case fail because of the GIL? I suspect this because when I use only one thread, everything works fine.
Could anyone give me a clue about this error? I have been struggling with the problem for a while :(
My problem was solved when I joined the threads explicitly. None of my threads are daemon threads, and each thread does a lot of I/O (e.g. file writes/reads). Joining explicitly makes sure each thread's job is completed.
I am still not sure of the root cause of my problem... the temp file is actually there, but a thread complains it "cannot find" it when multiple threads are working together. The only guess I can give is:
When a thread A does an I/O operation, it releases the GIL so that thread B (or C, D...) can acquire it during A's I/O time. The problem might happen during that I/O window, because thread A is not running in the Python interpreter any more... that might be why the file "cannot be found" by A. However, when we join A explicitly, A will always complete its job after re-acquiring the GIL (Global Interpreter Lock). A sketch of the explicit join follows.
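A minimal sketch of the explicit join described above (worker and test_cases are hypothetical stand-ins for the framework's thread targets):
import threading

def run_cases(worker, test_cases):
    threads = [threading.Thread(target=worker, args=(case,)) for case in test_cases]
    for t in threads:
        t.start()
    # Joining makes the main thread wait until every worker has finished its
    # file I/O (mkstemp, write, put, remove) before anything is cleaned up.
    for t in threads:
        t.join()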

Listening on a Twisted TCP connection with python-daemon gives a bad file descriptor

I'm trying to create a program which:
forks at start using multiprocessing
uses python-daemon in the forked process to fork again into the background
opens a Twisted listening TCP port in the resulting background process
The reason I need to fork the process before launching python-daemon is that I want the starting process to stay alive (by default python-daemon kills the parent process).
So far my code is:
from twisted.web import xmlrpc, server
from twisted.internet import reactor
from daemon import daemon
import multiprocessing
import os
import logging

class RemotePart(object):
    def setup(self):
        self.commands = CommandPart()
        reactor.listenTCP(9091, server.Site(self.commands))

class CommandPart(xmlrpc.XMLRPC, object):
    def __init__(self):
        super(CommandPart, self).__init__()

    def xmlrpc_stop(self):
        return True

class ServerPart(object):
    def __init__(self):
        self.logger = logging.getLogger("server")
        self.logger.info("ServerPart.__init__()")

    def start_second_daemon(self):
        self.logger.info("start_second_daemon()")
        daemon_context = daemon.DaemonContext(detach_process=True)
        daemon_context.stdout = open(
            name="log.txt",
            mode='w+',
            buffering=0
        )
        daemon_context.stderr = open(
            name="log.txt",
            mode='w+',
            buffering=0
        )
        daemon_context.working_directory = os.getcwd()
        daemon_context.open()
        self.inside_daemon()

    def inside_daemon(self):
        self.logger.setLevel(0)
        self.logger.info("inside daemon")
        self.remote = RemotePart()
        self.remote.setup()
        reactor.run()

class ClientPart(object):
    def __init__(self):
        logging.basicConfig(level=0)
        self.logger = logging.getLogger("client")
        self.logger.info("ClientPart.__init__()")

    def start_daemon(self):
        self.logger.info("start_daemon()")
        start_second_daemon()

    def launch_daemon(self):
        self.logger.info("launch_daemon()")
        server = ServerPart()
        p = multiprocessing.Process(target=server.start_second_daemon())
        p.start()
        p.join()

if __name__ == '__main__':
    client = ClientPart()
    client.launch_daemon()
Starting the process seems to work:
INFO:client:ClientPart.__init__()
INFO:client:launch_daemon()
INFO:server:ServerPart.__init__()
INFO:server:start_second_daemon()
But looking at the log file of the background process, Twisted cannot open the TCP port:
INFO:server:inside daemon
Traceback (most recent call last):
File "forking_test.py", line 74, in <module>
client.launch_daemon()
File "forking_test.py", line 68, in launch_daemon
p = multiprocessing.Process(target=server.start_second_daemon())
File "forking_test.py", line 45, in start_second_daemon
self.inside_daemon()
File "forking_test.py", line 51, in inside_daemon
self.remote.setup()
File "forking_test.py", line 12, in setup
reactor.listenTCP(9091, server.Site(self.commands))
File "/usr/lib/python2.7/site-packages/twisted/internet/posixbase.py", line 482, in listenTCP
p.startListening()
File "/usr/lib/python2.7/site-packages/twisted/internet/tcp.py", line 1004, in startListening
self.startReading()
File "/usr/lib/python2.7/site-packages/twisted/internet/abstract.py", line 429, in startReading
self.reactor.addReader(self)
File "/usr/lib/python2.7/site-packages/twisted/internet/epollreactor.py", line 247, in addReader
EPOLLIN, EPOLLOUT)
File "/usr/lib/python2.7/site-packages/twisted/internet/epollreactor.py", line 233, in _add
self._poller.register(fd, flags)
IOError: [Errno 9] Bad file descriptor
Any idea? It seems python-daemon closes all the file descriptors of the background process when it starts; could the problem come from this behaviour?
There are lots of reasons why calling fork and then running some arbitrary library code doesn't work. It would be hard to list them all here, but generally it's not cool to do. My guess as to what's specifically happening here is that something within multiprocessing is closing the "waker" file descriptor that lets Twisted communicate with its thread pool, but I can't be completely sure.
If you were to re-write this to:
Use spawnProcess instead of multiprocessing
Use twistd instead of python-daemonize
the interactions would be far less surprising, because you'd be using process-spawning and daemonization code specifically designed to work with Twisted, instead of two things with lots of accidental platform interactions (calling fork, serializing things over pipes with pickle, calling setsid and setuid and changing controlling terminal and session leader at various times).
(And actually I would recommend integrating with your platform's daemon management tools, like upstart or launchd or systemd or a cross-platform one like runit rather than depending on any daemonization code, including that in twistd, but I would need to know more about your application to know what to recommend.)
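As a rough illustration of the twistd suggestion (not code from the answer, and the file name is hypothetical), the XML-RPC site could be described in a .tac file and daemonised with "twistd -y service.tac", which handles the listening port, daemonisation and logging together:
# service.tac -- run with: twistd -y service.tac
from twisted.application import internet, service
from twisted.web import server, xmlrpc

class CommandPart(xmlrpc.XMLRPC):
    def xmlrpc_stop(self):
        return True

# twistd does the backgrounding, pid file and log handling itself, so no
# python-daemon / multiprocessing double fork is needed.
application = service.Application("command-daemon")
internet.TCPServer(9091, server.Site(CommandPart())).setServiceParent(application)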

Python multiprocessing, ValueError: I/O operation on closed file

I'm having a problem with the Python multiprocessing package. Below is a simple example code that illustrates my problem.
import multiprocessing as mp
import time

def test_file(f):
    f.write("Testing...\n")
    print f.name
    return None

if __name__ == "__main__":
    f = open("test.txt", 'w')
    proc = mp.Process(target=test_file, args=[f])
    proc.start()
    proc.join()
When I run this, I get the following error.
Process Process-1:
Traceback (most recent call last):
File "C:\Python27\lib\multiprocessing\process.py", line 258, in _bootstrap
self.run()
File "C:\Python27\lib\multiprocessing\process.py", line 114, in run
self.target(*self._args, **self._kwargs)
File "C:\Users\Ray\Google Drive\Programming\Python\tests\follow_test.py", line 24, in test_file
f.write("Testing...\n")
ValueError: I/O operation on closed file
Press any key to continue . . .
It seems that somehow the file handle is 'lost' during the creation of the new process. Could someone please explain what's going on?
I had similar issues in the past. I'm not sure whether it is done within the multiprocessing module or whether open sets the close-on-exec flag by default, but I know for sure that file handles opened in the main process are closed in the multiprocessing children.
The obvious workaround is to pass the filename as a parameter to the child process's init function and open it once within each child (if using a pool), or to pass it as a parameter to the target function and open/close it on each invocation. The former requires the use of a global to store the file handle (not a good thing) - unless someone can show me how to avoid that :) - and the latter can incur a performance hit (but can be used with multiprocessing.Process directly).
Example of the former:
filehandle = None

def child_init(filename):
    global filehandle
    filehandle = open(filename,...)
    ../..

def child_target(args):
    ../..

if __name__ == '__main__':
    # some code which defines filename
    proc = multiprocessing.Pool(processes=1, initializer=child_init, initargs=[filename])
    proc.apply(child_target, args)
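For completeness, a sketch of the latter approach (passing the filename to the target and opening the file inside the child; test.txt is the same file name as in the question):
import multiprocessing as mp

def test_file(filename):
    # The file is opened in the child, so no handle has to survive the fork/spawn.
    with open(filename, 'a') as f:
        f.write("Testing...\n")

if __name__ == "__main__":
    proc = mp.Process(target=test_file, args=["test.txt"])
    proc.start()
    proc.join()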
