Python does not release file handles to log file

I have an application which has to run a number of simulation runs. I want to set up a logging mechanism where all log records are logged to a general.log, and all logs for a single simulation run go to run00001.log, and so on. For this I have defined a class Run; in __init__() a new file handler is added for the run log.
The problem is that the log files for the runs never get released, so after a number of runs the available handles are exhausted and the run crashes.
I've set up some routines to test this, as follows.
The main routine:
import sys
import traceback

import Model

try:
    myrun = Model.Run('20130315150340_run_49295')
    ha = raw_input('enter')
    myrun.log.info("some info")
except:
    traceback.print_exc(file=sys.stdout)
    ha = raw_input('enter3')
The class Run is defined in module Model as follows
import logging
import os

class Run(object):
    """ Implements the functionality of a single run. """
    def __init__(self, runid):
        self.logdir = "."
        self.runid = runid
        self.logFile = os.path.join(self.logdir, self.runid + '.log')
        self.log = logging.getLogger('Run' + self.runid)
        myformatter = logging.Formatter('%(asctime)s %(name)-12s %(levelname)-8s %(message)s')
        myhandler = logging.FileHandler(self.logFile)
        myhandler.setLevel(logging.INFO)
        myhandler.setFormatter(myformatter)
        self.log.addHandler(myhandler)
Then I use Process Explorer to follow the file handles, and I see the run logs' handles appear but never disappear.
Is there a way I can force them to be released?

You need to call .close() on the file handler.
When your Run class completes, call:
handlers = self.log.handlers[:]
for handler in handlers:
    self.log.removeHandler(handler)
    handler.close()
A file handler will automatically re-open the configured filename every time a new log message arrives, so calling handler.close() may sometimes appear futile. Removing the handler from the logger stops future log records from being sent to it; in the above code we do this first, to avoid an untimely log message from another thread reopening the handler.
Another answer here suggests you use logging.shutdown(). However, all that logging.shutdown() does is call handler.flush() and handler.close(), and I'd not recommend using it. It leaves the logging module in a state where you can't reliably call logging.shutdown() again.
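Applied to the Run class from the question, this could go in a small cleanup method (a sketch; the method name close() is my own choice, not part of the original code):

class Run(object):
    # ... __init__ as in the question ...

    def close(self):
        # Detach and close every handler on this run's logger,
        # releasing the file handle of runNNNNN.log.
        for handler in self.log.handlers[:]:
            self.log.removeHandler(handler)
            handler.close()

Calling myrun.close() when a simulation run finishes should release its handle, so the run logs no longer accumulate in Process Explorer.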

You can also shut down the logging completely. In that case, the file handles are released:
logging.shutdown()
It closes the open handles of all configured logging handlers.
I needed it to be able to delete a log file after a unit test is finished and I was able to delete it right after the call to the logging.shutdown() method.
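For example, at the end of a test it could look roughly like this (a sketch; test.log is just an example filename):

import logging
import os

logging.basicConfig(filename='test.log', level=logging.INFO)
logging.info('running the test')

logging.shutdown()     # flushes and closes all handlers, releasing the file
os.remove('test.log')  # the file can now be deleted, even on Windows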

Most likely we'll only want to .close() the FileHandlers and not the others, so the accepted answer could be modified slightly, like this:
for handler in self.log.handlers:
    if isinstance(handler, logging.FileHandler):
        handler.close()
Also, for the simpler cases where we just have a logger configured with logging.basicConfig(), the handlers can be retrieved by calling logging.getLogger().handlers.
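For instance, with a basicConfig-based setup the same cleanup could look like this (a sketch; example.log is just an example filename):

import logging

logging.basicConfig(filename='example.log', level=logging.INFO)
logging.info('some message')

# close only the file-based handlers attached to the root logger
for handler in logging.getLogger().handlers:
    if isinstance(handler, logging.FileHandler):
        handler.close()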

Enable module's logger

I'm seeing a behavior that I have no way of explaining... Here's my simplified setup:
module x:
import logging

logger = logging.getLogger('x')

def test_debugging():
    logger.debug('Debugging')
test for module x:
import logging
import unittest

from x import test_debugging

class TestX(unittest.TestCase):
    def test_test_debugging(self):
        test_debugging()

if __name__ == '__main__':
    logger = logging.getLogger('x')
    logger.setLevel(logging.DEBUG)
    # logging.debug('another test')
    unittest.main()
If I uncomment the logging.debug('another test') line I can also see the log from x. Note, it is not a typo, I'm calling debug on logging, not on the logger from module x. And if I call debug on logger, I don't see logs.
What is this, I can't even?..
In your setup, you didn't actually configure logging. Although the configuration can be pretty complex, it would suffice to set the log level in your example:
if __name__ == '__main__':
    # note I configured logging, setting e.g. the level globally
    logging.basicConfig(level=logging.DEBUG)
    logger = logging.getLogger('x')
    logger.setLevel(logging.DEBUG)
    unittest.main()
This will create a simple StreamHandler with a predefined output format that prints all the log records to the console (standard error, by default). I suggest you quickly look over the relevant docs for more info.
Why did it work with the logging.debug call? Because the module-level logging.{debug,info,warning,error} functions call logging.basicConfig internally if the root logger has no handlers, so once you have called logging.debug, you have configured logging implicitly.
Edit: let's take a quick look under the hood at what the logging.{debug,info,warning,error} functions actually do. Take the following snippet:
import logging
logger = logging.getLogger('mylogger')
logger.warning('hello world')
If you run the snippet, hello world will not be printed (and this is correct!). Why not? It's because you didn't actually specify how the log records should be treated - should they be printed to stdout, or maybe printed to a file, or maybe sent to some server that will email them to the recipients? The logger mylogger will receive the log record hello world, but it doesn't know yet what to do with it. So, to actually print the record, let's do some configuration for the logger:
import logging
logger = logging.getLogger('mylogger')
formatter = logging.Formatter('Logger received message %(message)s at time %(asctime)s')
handler = logging.StreamHandler()
handler.setFormatter(formatter)
logger.addHandler(handler)
logger.warning('hello world')
We have now attached a handler that handles the record by printing it to the console in the format specified by formatter. Now the record hello world will be printed. We could attach more handlers and the record would be handled by each of them. Example: try attaching another StreamHandler and you will notice that the record is now printed twice.
So, what's with the logging functions now? If you have some simple program that has only one logger that should print the messages and that's all, you can replace the manual configuration by using convenience logging functions:
import logging
logging.warning('hello world')
This will configure the root logger to print the messages to the console by adding a StreamHandler to it with a default formatter, so you don't have to configure it yourself. After that, it will tell the root logger to process the record hello world. Merely a convenience, nothing more. If you want to trigger this basic configuration of the root logger explicitly, issue
logging.basicConfig()
with or without the additional configuration parameters.
Now, let's go through my first code snippet once again:
logging.basicConfig(level=logging.DEBUG)
After this line, the root logger will print all log records with level DEBUG and higher to the command line.
logger = logging.getLogger('x')
logger.setLevel(logging.DEBUG)
We did not configure this logger explicitly, so why are the records still being printed? This is because by default, any logger will propagate the log records to the root logger. So the logger x does not print the records - it has not been configured for that, but it will pass the record further up to the root logger that knows how to print the records.
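To illustrate the propagation (a minimal sketch; the logger name 'x' matches the question's module):

import logging

logging.basicConfig(level=logging.DEBUG)  # handlers are attached to the root logger only

logger = logging.getLogger('x')           # no handlers attached to 'x' itself
logger.debug('Debugging')                 # still printed: the record propagates up to the root logger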

python-daemon doesn't call the start function

I've been following this example to implement a python daemon, and it seems to be somewhat working, but only the reconfigure function is called.
This is the code I've been using:
import signal

import daemon
import lockfile

import manager

context = daemon.DaemonContext(
    working_directory='/home/debian/station',
    pidfile=lockfile.FileLock('/var/run/station.pid'))

context.signal_map = {
    signal.SIGTERM: manager.Manager.program_terminate,
    signal.SIGHUP: 'terminate',
    signal.SIGUSR1: manager.Manager.program_reload_configuration,
}

manager.Manager.program_configure()

with context:
    manager.Manager.program_start()
This is the code in the Manager class:
import logging

class Manager(object):
    @staticmethod
    def program_configure():
        logging.info('Configuring program')

    @staticmethod
    def program_reload_configuration():
        logging.info('Reloading configuration')

    @staticmethod
    def program_start():
        global Instance
        logging.info('Program started')
        Instance = Manager()
        Instance.run()

    @staticmethod
    def program_terminate():
        logging.info('Terminating')
And the log shows only:
INFO:root:Configuring program
For some reason program_start() isn't being called.
program_configure() is called every time the Python file is imported, so that explains the log line, but why isn't program_start() called?
I start the daemon by typing sudo service station.sh start and the line that runs the script is:
python $DAEMON start
EDIT:
After reading a bit, I've realized that the program probably exits or hangs in context.__enter__() (the with statement calls that), but I have no clue what is causing this.
The problem wasn't that python-daemon wasn't calling the functions - it's the logging that didn't work.
When the daemon creates a new process it doesn't transfer all the file handles from the parent process, so the logs aren't written. See this question for more info.
The solution to that is to use the files_preserve property like so:
# Set up the logger
LOG_LEVEL = logging.DEBUG
logger = logging.getLogger()
logger.setLevel(LOG_LEVEL)
fh = logging.FileHandler(LOG_FILENAME)
logger.addHandler(fh)

# Now create the context, and tell it to preserve the log file's handle
context = daemon.DaemonContext(
    working_directory='/home/debian/station',
    pidfile=lockfile.FileLock('/var/run/station.pid'),
    files_preserve=[fh.stream],
)

Python logging in multiple modules

I want to write a logger which I can use in multiple modules. I must be able to enable and disable it from one place. And it must be reusable.
Following is the scenario.
switch_module.py
class Brocade(object):
    def __init__(self, ip, username, password):
        ...

    def connect(self):
        ...

    def disconnect(self):
        ...

    def switch_show(self):
        ...
switch_module_library.py
import switch_module

class Keyword_Mapper(object):
    def __init__(self, keyword_to_execute):
        self._brocade_object = switch_module.Brocade(ip, username, password)
        ...

    def map_keyword_to_command(self):
        ...
application_gui.py
class GUI:
    # I can open a file containing keywords for the Brocade switch
    # in this GUI in a tab and tree widget (it uses PyQt, which I don't know).
    # Each tab is a QThread and there could be multiple tabs.
    # Each tab is accompanied by an execute button.
    # On pressing execute it will read the strings/keywords from the file,
    # create an object of the Keyword_Mapper class and call its
    # map_keyword_to_command method, execute the command on the Brocade
    # switch and log the results. Currently I am logging the result
    # only from the Keyword_Mapper class.
The problem I have is how to write a logger which can be enabled and disabled at will, and which must log to one file as well as to the console from all three modules.
I tried writing a global logger in __init__.py and importing it in all three modules, giving it a common name so that they all log to the same file, but then I ran into trouble because of the multi-threading. Later I created a logger that logs to a file with the thread id in its name, so that I can have one log per thread.
What if I am required to log to a different file rather than the same one?
I have gone through the Python logging documentation but am unable to get a clear picture of how to write a proper, reusable logging setup.
I have gone through this link:
Is it better to use root logger or named logger in Python
but because the GUI was created by someone else using PyQt, and because of the multi-threading, I am unable to get my head around the logging here.
In my project I only use the root logger (I don't have the time to create named loggers, even though it would be nice). So if you don't want to use a named logger, here is a quick solution.
I created a function to set up the logger quickly:
import logging

def initLogger(level=logging.DEBUG):
    if level == logging.DEBUG:
        # Display more stuff when in debug mode
        logging.basicConfig(
            format='%(levelname)s-%(module)s:%(lineno)d-%(funcName)s: %(message)s',
            level=level)
    else:
        # Display less stuff for info mode
        logging.basicConfig(format='%(levelname)s: %(message)s', level=level)
I created a package for it so that I can import it anywhere.
Then, in my top level I have:
import logging

import LoggingTools

if __name__ == '__main__':
    # Configure the logger
    LoggingTools.initLogger(logging.DEBUG)
    #LoggingTools.initLogger(logging.INFO)
Depending on whether I am debugging or not, I use the corresponding statement.
Then in each of the other files, I just use logging directly:
import logging

class MyClass():
    def __init__(self):
        logging.debug("Debug message")
        logging.info("Info message")
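If the records should also go to a file as well as the console, as the question asks, basicConfig can be given both handlers explicitly. A minimal sketch (the handlers argument requires Python 3.3+, and app.log is just an example filename):

import logging

logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s %(name)s %(levelname)s: %(message)s',
    handlers=[
        logging.FileHandler('app.log'),  # example log file
        logging.StreamHandler(),         # console (stderr)
    ])

logging.info("goes to both app.log and the console")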

Including previously logged events in a new logging handler

Please consider the following example:
import logging
import sys

# create a logger object:
logger = logging.getLogger("MyLogger")

# define a logging handler for the standard output:
stdoutHandler = logging.StreamHandler(sys.stdout)
logger.addHandler(stdoutHandler)

# ...
# initialization code with several logging events (for example, loading a configuration file to a 'conf' object)
# ...
logger.info("Log event 1")

# after configuration is loaded, a new logging handler is defined for a log file:
fileHandler = logging.FileHandler(conf.get("main", "log_file"), 'w')
logger.addHandler(fileHandler)
logger.info("Log event 2")
With this example, "Log event 1" does not appear in the log file (only in stdout).
The log file is inevitably initialized after "Log event 1" (because it's dependent on the configuration).
My question is:
How do I include previously logged events (such as "Log event 1") in a new logging handler (such as the file handler in the example)?
My solution for the question:
Define a MemoryHandler to handle all the events prior to the definition of the FileHandler.
When the FileHandler is defined, set it as the flush target of the MemoryHandler and flush it.
import logging
import logging.handlers

# create a logger object:
logger = logging.getLogger("MyLogger")

# define a memory handler:
memHandler = logging.handlers.MemoryHandler(capacity=1024*10)
logger.addHandler(memHandler)

# ...
# initialization code with several logging events (for example, loading a configuration file to a 'conf' object)
# everything is logged by the memory handler
# ...

# after configuration is loaded, a new logging handler is defined for a log file:
fileHandler = logging.FileHandler(conf.get("main", "log_file"), 'w')

# flush the memory handler into the new file handler:
memHandler.setTarget(fileHandler)
memHandler.flush()
memHandler.close()
logger.removeHandler(memHandler)
logger.addHandler(fileHandler)
This works for me, so I'm accepting this as the correct answer, until a more elegant answer comes along.

Logging in worker threads spawned from a pylons application does not seem to work

I have a pylons application where, under certain circumstances, I want to spawn multiple worker threads to process items in a queue. Right now we aren't making use of a ThreadPool (that would be ideal, but we'll add that in later). The main problem is that the worker threads' logging does not get written to the log files.
When I run the code outside of the pylons application the logging works fine, so I think it's something to do with the pylons log handler, but I'm not sure what.
Here is a basic example of the code (trimmed down):
import logging
log = logging.getLogger(__name__)

import sys
from Queue import Queue
from threading import Thread, activeCount

def run(input, worker, args=None, simulteneousWorkerLimit=None):
    queue = Queue()
    threads = []
    if args is not None:
        if len(args) > 0:
            args = list(args)
            args = [worker, queue] + args
            args = tuple(args)
    else:
        args = (worker, queue)
    # start threads
    for i in range(4):
        t = Thread(target = __thread, args = args)
        t.daemon = True
        t.start()
        threads.append(t)
    # add ThreadTermSignal
    inputData = list(input)
    inputData.extend([ThreadTermSignal] * 4)
    # put in the queue
    for data in inputData:
        queue.put(data)
    # block until all contents are downloaded
    queue.join()
    log.critical("** A log line that appears fine **")
    del queue
    for thread in threads:
        del thread
    del threads

class ThreadTermSignal(object):
    pass

def __thread(worker, queue, *args):
    try:
        while True:
            data = queue.get()
            if data is ThreadTermSignal:
                sys.exit()
            try:
                log.critical("** I don't appear when run under pylons **")
            finally:
                queue.task_done()
    except SystemExit:
        queue.task_done()
        pass
Note that the log line within the run function shows up in the log files, but the log line within the worker function (which runs in a spawned thread) does not appear. Any help would be appreciated. Thanks.
** EDIT: I should mention that I tried passing in the "log" variable to the worker thread as well as redefining a new "log" variable within the thread and neither worked.
** EDIT: Adding the configuration used for the pylons application (which comes out of the INI file). So the snippet below is from the INI file.
[loggers]
keys = root

[handlers]
keys = wsgierrors

[formatters]
keys = generic

[logger_root]
level = WARNING
handlers = wsgierrors

[handler_console]
class = StreamHandler
args = (sys.stderr,)
level = WARNING
formatter = generic

[handler_wsgierrors]
class = pylons.log.WSGIErrorsHandler
args = ()
level = WARNING
format = generic
One thing to note about logging is that if an exception occurs while emitting a log event (for whatever reason) the exception is typically swallowed, and not allowed to potentially bring down an application just because of a logging error. (It depends on the handlers used and the value of logging.raiseExceptions). So there are a couple of things to check:
That your formatting of log messages is dead simple, perhaps just using %(message)s until you find the problem.
Check that Pylons hasn't turned logging off or messed with the handlers for whatever reason. You haven't posted your logging initialization code so I'm not sure what handlers, etc. you're using. You can print log.getEffectiveLevel() to see if logging verbosity has been turned right down (unlikely for CRITICAL, but you never know).
If you put in print statements alongside your log statements, do they produce output how you'd expect them to?
Update: I'm aware of the restriction about mod_wsgi and printing, but that only applies to sys.stdout. You can e.g.
print >> sys.stderr, some_data
or
print >> open('/tmp/somefile', 'a'), some_data
without any problem.
Also: you should be aware that a call to logging.config.fileConfig() (which is presumably how the configuration you described is implemented) disables any existing loggers unless they are explicitly named in, or are descendants of loggers explicitly named in, the configuration file. While this might seem odd, it's because a configuration is intended to replace any existing configuration rather than augment it; and since threads might be pointing to existing loggers, they're disabled rather than deleted. You can check a logger's disabled attribute to see if fileConfig() has disabled the logger - that could be your problem.
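If fileConfig() turns out to be the culprit, one option is to keep pre-existing loggers enabled when the configuration is loaded. A sketch, assuming you control the call that loads the configuration; logging.ini stands in for your actual config file, and the disable_existing_loggers keyword needs Python 2.6+:

import logging
import logging.config

log = logging.getLogger(__name__)   # logger created before the config is loaded

# disable_existing_loggers=False keeps previously created loggers enabled
logging.config.fileConfig('logging.ini', disable_existing_loggers=False)

print log.disabled   # False; with the default (True) this logger would have been disabled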
You can also try to pass the log variable to the thread through the arguments (args).
