Guarantee all logs go to file - python

I need to set up logging so it will always log to a file, no matter what level a user or another programmer might set with setLevel. For instance, I have this logger set up:
initialize_logging.py
import logging
filename="./files/logs/attribution.log", level=logging.INFO)
logger = logging.getLogger('attribution')
formatter = logging.Formatter('%(asctime)s - %(levelname)s: %(message)s')
fh = logging.FileHandler('./files/logs/attribution.log')
fh.setLevel(logging.DEBUG)
fh.setFormatter(formatter)
ch = logging.StreamHandler()
ch.setLevel(logging.WARNING)
ch.setFormatter(formatter)
logger.setLevel(logging.DEBUG)
logger.addHandler(fh)
logger.addHandler(ch)
Then I have this file:
main.py
%load_ext autotime
import initialize_logging
import logging
logger = logging.getLogger('attribution')
logger.setLevel(logging.WARNING)
logger.info('TEST')
The above results in nothing being logged to my file and nothing being output.
However, in main.py, if I setLevel(logging.INFO), then everything gets written to the file and to stderr (I'm using a Jupyter notebook, so it prints on screen immediately).
The behavior I would like is that the user of my notebook can call setLevel to determine what they want to see printed on screen, but no matter what, all logs get sent to the log file.
How do I do this?

Create a Logger subclass that is enabled for all levels and set it as the logger class with logging.setLoggerClass. Override isEnabledFor to return True regardless of the Logger's level. Override setLevel to find all of the Logger's StreamHandlers and set their level to match the Logger's. The behavior you want then relies on the levels set on the handlers.
logging_init.py
import logging

class AllLevels(logging.Logger):
    def isEnabledFor(self, level):
        """
        Is this logger enabled for level 'level'? Always, by design.
        """
        return True

    def setLevel(self, level):
        super().setLevel(level)
        for h in self.handlers:
            # FileHandler subclasses StreamHandler, so exclude it here:
            # only console-style handlers should follow the logger's level.
            if isinstance(h, logging.StreamHandler) and not isinstance(h, logging.FileHandler):
                h.setLevel(self.level)

filename = "attribution.log"
logging.setLoggerClass(AllLevels)
logger = logging.getLogger('attribution')
formatter = logging.Formatter('%(asctime)s - %(levelname)s: %(message)s')
fh = logging.FileHandler(filename)
fh.setLevel(logging.DEBUG)
fh.setFormatter(formatter)
ch = logging.StreamHandler()
ch.setLevel(logging.WARNING)
ch.setFormatter(formatter)
logger.setLevel(logging.DEBUG)
logger.addHandler(fh)
logger.addHandler(ch)
Caveat: I have not tried to think of all the ways this could go wrong; I only attempted to create the behavior.
Maybe instead of setting all StreamHandlers to the same level as the logger, only set a StreamHandler with a specific name.
Nothing will stop someone from inspecting the logger, finding the handlers, and arbitrarily changing their levels. Maybe use a FileHandler subclass with its setLevel function overridden:
class AlwaysFileHandler(logging.FileHandler):
    def setLevel(self, level):
        """
        Set the logging level of this handler to DEBUG.
        """
        self.level = logging.DEBUG
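Putting the pieces together, here is a minimal sketch of the resulting usage (the file name and logger name are carried over from the snippets above; it assumes AllLevels and AlwaysFileHandler are defined as shown):
import logging

logging.setLoggerClass(AllLevels)
logger = logging.getLogger('attribution')

fh = AlwaysFileHandler('attribution.log')
fh.setFormatter(logging.Formatter('%(asctime)s - %(levelname)s: %(message)s'))
logger.addHandler(fh)

ch = logging.StreamHandler()
ch.setFormatter(logging.Formatter('%(asctime)s - %(levelname)s: %(message)s'))
logger.addHandler(ch)

# A user raising the level only affects the console handler;
# AlwaysFileHandler ignores setLevel and keeps logging everything.
logger.setLevel(logging.WARNING)
logger.info('written to attribution.log, but not shown on screen')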


Dynamically change level of python logging

In my project, I have set up multiple pipelines (~20). I want to implement logging for each of these pipelines and redirect each one to a different file.
I have created a class GenericLogger as below:
import logging
import logging.handlers

class GenericLogger(object):
    def __init__(self, pipeline):
        self.name = pipeline

    def get_logger(self):
        logger = logging.getLogger(self.name)
        log_file = "{0}.log".format(self.name)
        console_handler = logging.StreamHandler()
        file_handler = logging.handlers.RotatingFileHandler(log_file, maxBytes=LOGS_FILE_SIZE, backupCount=3)
        file_format = logging.Formatter('%(asctime)s: %(levelname)s: %(name)s: %(message)s', datefmt="%Y-%m-%d %H:%M:%S")
        console_format = logging.Formatter('%(asctime)s: %(levelname)s: %(name)s: %(message)s', datefmt="%Y-%m-%d %H:%M:%S")
        console_handler.setFormatter(console_format)
        file_handler.setFormatter(file_format)
        logger.addHandler(file_handler)
        logger.addHandler(console_handler)
        logger.setLevel(logging.INFO)
        return logger
I am importing this class in my pipelines, getting the logger, and using it as below:
logger_helper = GenericLogger('pipeline_name')
logger = logger_helper.get_logger()
logger.warning("Something happened")
Flow of pipeline:
Once triggered, they run continuously at an interval of T minutes. Currently, to avoid handlers piling up on the logger after each complete execution, I am using logger.handlers = [] and then creating a new logger instance on the next iteration.
Questions:
1) How can I dynamically change the log level for each pipeline separately? If I am using logging.ini, is creating static handlers/formatters for each pipeline necessary, or is there something I can do dynamically? I don't know much about this.
2) Is the above implementation of the logger correct, or is creating a class for the logger something that should not be done?
To answer your two points:
You could have a mapping between pipeline name and level, and after creating the logger for a pipeline, you can set its level appropriately.
You don't need multiple console handlers, do you? I'd just create one console handler and attach it to the root logger. Likewise, you don't need to create multiple identical file and console formatters - just make one of each. In fact, since they all apparently use identical format strings, you just need one formatter instance. Avoid creating a class like GenericLogger.
Thus, something like:
formatter = logging.Formatter(...)
console_handler = logging.StreamHandler()
console_handler.setFormatter(formatter)
logging.getLogger().addHandler(console_handler)

for name, level in name_to_level.items():
    logger = logging.getLogger(name)
    logger.setLevel(level)
    file_handler = logging.handlers.RotatingFileHandler(...)
    file_handler.setFormatter(formatter)
    logger.addHandler(file_handler)
should do what you need.
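A runnable version of that sketch, with hypothetical pipeline names, levels, and rotation settings filled in for illustration:
import logging
import logging.handlers

# Hypothetical mapping; in practice this could come from a config file
# or logging.ini and be reloaded to change levels dynamically.
name_to_level = {
    'pipeline_a': logging.DEBUG,
    'pipeline_b': logging.INFO,
    'pipeline_c': logging.WARNING,
}

formatter = logging.Formatter('%(asctime)s: %(levelname)s: %(name)s: %(message)s',
                              datefmt="%Y-%m-%d %H:%M:%S")

# One console handler on the root logger serves every pipeline.
console_handler = logging.StreamHandler()
console_handler.setFormatter(formatter)
logging.getLogger().addHandler(console_handler)

# One rotating file handler per pipeline.
for name, level in name_to_level.items():
    logger = logging.getLogger(name)
    logger.setLevel(level)
    file_handler = logging.handlers.RotatingFileHandler(
        "{0}.log".format(name), maxBytes=10**6, backupCount=3)
    file_handler.setFormatter(formatter)
    logger.addHandler(file_handler)

logging.getLogger('pipeline_a').debug('logged: pipeline_a is at DEBUG')
logging.getLogger('pipeline_c').info('dropped: pipeline_c is at WARNING')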

Logging with two handlers - one to file and one to stderr

I have the following code to create two logging handlers. The goal is to have the stream_handler write only to stderr and the file_handler write only to the file.
In the code below, stream_handler writes to the file as well, so I have duplicate log entries whenever I log a message. How can I modify it so that stream_handler does not write to the file?
import logging
from logging.handlers import TimedRotatingFileHandler

def create_timed_rotating_log(
        name='log',
        path='logs/',
        when='D',
        interval=1,
        backupCount=30,
        form='%(asctime)s | %(name)s | %(levelname)s: %(message)s',
        sterr=False,
        verbose=False):
    logger = logging.getLogger(name)
    formatter = logging.Formatter(form)
    logger.setLevel(logging.DEBUG)
    if sterr:
        stream_handler = logging.StreamHandler()
        stream_handler.setLevel(logging.DEBUG if verbose else logging.ERROR)
        stream_handler.setFormatter(formatter)
        logger.addHandler(stream_handler)
    file_handler = TimedRotatingFileHandler(filename=path + name,
                                            when=when,
                                            interval=interval,
                                            backupCount=backupCount)
    file_handler.setFormatter(formatter)
    file_handler.setLevel(logging.DEBUG if verbose else logging.ERROR)
    logger.addHandler(file_handler)
    return logger
It looks like you have considered the messages going to both handlers to be duplication. If you want the file not to receive messages when sterr=True, so that they go only to stderr, you need to put the code that adds the file handler in an else branch so that it is only added when sterr is False.
Specifically:
if sterr:
    stream_handler = logging.StreamHandler()
    stream_handler.setLevel(logging.DEBUG if verbose else logging.ERROR)
    stream_handler.setFormatter(formatter)
    logger.addHandler(stream_handler)
else:
    file_handler = TimedRotatingFileHandler(filename=path + name,
                                            when=when,
                                            interval=interval,
                                            backupCount=backupCount)
    file_handler.setFormatter(formatter)
    file_handler.setLevel(logging.DEBUG if verbose else logging.ERROR)
    logger.addHandler(file_handler)
I had considered this possible, but less likely than the more common issue of setting up handlers multiple times. The way you are supposed to use logging is to add the handlers to the root logger in the main module, and then in every other module only get a logger for the current module via logger = logging.getLogger(__name__). That line and import logging are the only logging code that should appear in any module except the one containing if __name__ == "__main__":, which should set up logging right after that check.
The only way you will have duplicates is if you call that function twice in your program. The issue isn't these two handlers but the other two handlers from the second call to this setup function.
You need to add the handlers just once, to the root logger, not once for every logger in each module.
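For reference, a minimal sketch of that recommended layout (the module name mymodule and the file names are assumptions for illustration):
main.py
import logging
from logging.handlers import TimedRotatingFileHandler

import mymodule

if __name__ == "__main__":
    # Configure handlers once, on the root logger.
    handler = TimedRotatingFileHandler('app.log', when='D', backupCount=30)
    handler.setFormatter(logging.Formatter('%(asctime)s | %(name)s | %(levelname)s: %(message)s'))
    root = logging.getLogger()
    root.addHandler(handler)
    root.setLevel(logging.DEBUG)
    mymodule.do_work()
mymodule.py
import logging

logger = logging.getLogger(__name__)  # the only logging setup a module needs

def do_work():
    logger.info('propagates up to the root logger and its handlers')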
I'm not exactly sure why this is the case, but adding the StreamHandler first caused it to also write to the file. I moved the StreamHandler to after adding the TimedRotatingFileHandler, and this resolved the problem.

Python logging in several modules outputs twice

I'm not very familiar with Python logging and I am trying to get it to log output to the console. I've gotten it to work, but it seems that the output is seen twice in the console, and I'm not sure why. I've looked at the other questions asked here about similar situations, but I could not find anything that helped me.
I have three modules; let's call them ma, mb, and mc. A main module imports these three modules and calls their functions.
In each module I set up a logger like so:
ma.py
import logging
logger = logging.getLogger('test.ma') #test.mb for mb.py/test.mc for mc.py
logger.setLevel(logging.DEBUG)
console_log = logging.StreamHandler()
console_log.setLevel(logging.DEBUG)
formatter = logging.Formatter('%(asctime)s - %(name)s - %(message)s')
console_log.setFormatter(formatter)
logger.addHandler(console_log)
...
...
#right before the end of each module, after I'm finished logging what I need.
logger.removeHandler(console_log)
mb.py
import logging
logger = logging.getLogger('test.mb')
logger.setLevel(logging.DEBUG)
console_log = logging.StreamHandler()
console_log.setLevel(logging.DEBUG)
formatter = logging.Formatter('%(asctime)s - %(name)s - %(message)s')
console_log.setFormatter(formatter)
logger.addHandler(console_log)
...
...
#end of file
logger.removeHandler(console_log)
mc.py
import logging
logger = logging.getLogger('test.mc')
logger.setLevel(logging.DEBUG)
console_log = logging.StreamHandler()
console_log.setLevel(logging.DEBUG)
formatter = logging.Formatter('%(asctime)s - %(name)s - %(message)s')
console_log.setFormatter(formatter)
logger.addHandler(console_log)
...
...
#end of file
logger.removeHandler(console_log)
The main problem I'm having is that output is printed twice, and for some parts of the program it is not formatted. Any help would be appreciated, thanks!
A better approach than a global logger is to use a function that wraps the call to logging.getLogger:
import logging

def get_logger(name):
    logger = logging.getLogger(name)
    if not logger.handlers:
        # Prevent logging from propagating to the root logger
        logger.propagate = False
        console = logging.StreamHandler()
        logger.addHandler(console)
        formatter = logging.Formatter('%(asctime)s - %(name)s - %(message)s')
        console.setFormatter(formatter)
    return logger

def main():
    logger = get_logger(__name__)
    logger.setLevel(logging.DEBUG)
    logger.info("This is an info message")
    logger.error("This is an error message")
    logger.debug("This is a debug message")

if __name__ == '__main__':
    main()
Output
2017-03-09 12:02:41,083 - __main__ - This is an info message
2017-03-09 12:02:41,083 - __main__ - This is an error message
2017-03-09 12:02:41,083 - __main__ - This is a debug message
Note: Using a function also allows us to set logger.propagate to False to prevent log messages from being sent to the root logger. This is almost certainly the cause of the duplicated output you are seeing.
Update: After calling get_logger, the desired level must be set by calling logger.setLevel. If this is not done, only messages of WARNING level and above will be seen.
To log from multiple modules to one global file, create a logging instance and then get that instance in whichever modules you like. For example, in main I would have:
import logging
logging.basicConfig(filename='logfile.log', level=logging.DEBUG)
formatter = logging.Formatter('%(asctime)s - %(name)s - %(message)s')
logger = logging.getLogger('Global Log')
Then, in every module that I want to be able to append to this log, I have:
import logging
logger = logging.getLogger('Global Log')
Now every time you access the logger in your module, you are accessing the same logging instance. This way, you only have to set up and format the logger once.
More about logging instances in the docs:
https://docs.python.org/2/library/logging.html#logging.getLogger
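Concretely, a minimal sketch of this pattern across two files (the module name worker is an assumption; the format string is passed to basicConfig so that it applies to the file handler):
main.py
import logging

import worker

logging.basicConfig(filename='logfile.log', level=logging.DEBUG,
                    format='%(asctime)s - %(name)s - %(message)s')
logger = logging.getLogger('Global Log')
logger.info('message from main')
worker.run()
worker.py
import logging

logger = logging.getLogger('Global Log')  # same instance as in main

def run():
    logger.info('message from worker, written to the same file')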
In my case I was seeing logging twice due to the way /dev/log redirects records to the rsyslog daemon I have running locally.
Using a
handler = logging.handlers.SysLogHandler(address='/run/systemd/journal/syslog')
solved my problem of double logging.

Logging using elasticsearch-py

I would like to add logging to my Python script that uses elasticsearch-py. In particular, I want to have three logs:
General log: log INFO and above, both to stdout and to a file.
ES log: ES-related messages only, sent to a file.
ES tracing log: extended ES logging (curl queries and their output, for instance), only to a file.
Here is what I have so far:
import logging
import logging.handlers

es_logger = logging.getLogger('elasticsearch')
es_logger.setLevel(logging.INFO)
es_logger_handler = logging.handlers.RotatingFileHandler('top-camps-base.log',
                                                         maxBytes=0.5*10**9,
                                                         backupCount=3)
es_logger.addHandler(es_logger_handler)

es_tracer = logging.getLogger('elasticsearch.trace')
es_tracer.setLevel(logging.DEBUG)
es_tracer_handler = logging.handlers.RotatingFileHandler('top-camps-full.log',
                                                         maxBytes=0.5*10**9,
                                                         backupCount=3)
es_tracer.addHandler(es_tracer_handler)

logger = logging.getLogger('mainLog')
logger.setLevel(logging.DEBUG)

# create file handler
fileHandler = logging.handlers.RotatingFileHandler('top-camps.log',
                                                   maxBytes=10**6,
                                                   backupCount=3)
fileHandler.setLevel(logging.INFO)

# create console handler
consoleHandler = logging.StreamHandler()
consoleHandler.setLevel(logging.INFO)

# create formatter and add it to the handlers
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
consoleHandler.setFormatter(formatter)
fileHandler.setFormatter(formatter)

# add the handlers to logger
logger.addHandler(consoleHandler)
logger.addHandler(fileHandler)
My problem is that INFO messages from es_logger are also displayed on the terminal, even though the log messages are saved to the right files!
If I remove the part related to logger, then the ES logging works fine, i.e. messages are only saved to the corresponding files. But then I don't have the other part.... What is it that I'm doing wrong with the last part of the settings?
Edit
Possible hint: In the sources of elasticsearch-py there's a logger named logger. Could it be that it conflicts with mine? I tried changing the name of logger to main_logger in the lines above, but it didn't help.
Possible hint 2: If I replace logger = logging.getLogger('mainLog') with logger = logging.getLogger(), then the format of es_logger's console output changes and becomes identical to the one defined in the snippet.
I think you are being hit by the somewhat confusing logger hierarchy propagation. Everything logged to "elasticsearch.trace" that passes the log level of that logger will propagate first to the "elasticsearch" logger and then to the root ("") logger. Note that once a message passes the log level of the "elasticsearch.trace" logger, the levels of the parents ("elasticsearch" and root) are not checked; all messages are sent to their handlers. (The handlers themselves have log levels that do apply.)
Consider the following example that illustrates the issue, and a possible solution:
import logging
# The following line will basicConfig() the root handler
logging.info('DUMMY - NOT SEEN')
ll = logging.getLogger('foo')
ll.setLevel('DEBUG')
ll.addHandler(logging.StreamHandler())
ll.debug('msg1')
ll.propagate = False
ll.debug('msg2')
Output:
msg1
DEBUG:foo:msg1
msg2
You see that "msg1" is logged both by the "foo" logger and by its parent, the root logger (as "DEBUG:foo:msg1"). Then, when propagation is turned off with ll.propagate = False before "msg2", the root logger no longer logs it. Now, if you were to comment out the first line (logging.info('DUMMY - NOT SEEN')), the behavior would change: the root logger line would not be shown. This is because the logging module's top-level functions info(), debug(), etc. configure the root logger with a handler when no handler has yet been defined. That is also why you see different behavior in your example when you modify the root logger by doing logger = logging.getLogger().
I can't see anything in your code that touches the root logger, but as you see, a stray logging.info() or the like in your code or in library code would cause a handler to be added.
So, to answer your question: I would set logger.propagate = False on the loggers where that makes sense for you and, where you do want propagation, check that the log levels of the handlers themselves are as you want them.
Here is an attempt:
es_logger = logging.getLogger('elasticsearch')
es_logger.propagate = False
es_logger.setLevel(logging.INFO)
es_logger_handler = logging.handlers.RotatingFileHandler('top-camps-base.log',
                                                         maxBytes=0.5*10**9,
                                                         backupCount=3)
es_logger.addHandler(es_logger_handler)

es_tracer = logging.getLogger('elasticsearch.trace')
es_tracer.propagate = False
es_tracer.setLevel(logging.DEBUG)
es_tracer_handler = logging.handlers.RotatingFileHandler('top-camps-full.log',
                                                         maxBytes=0.5*10**9,
                                                         backupCount=3)
es_tracer.addHandler(es_tracer_handler)

logger = logging.getLogger('mainLog')
logger.propagate = False
logger.setLevel(logging.DEBUG)

# create file handler
fileHandler = logging.handlers.RotatingFileHandler('top-camps.log',
                                                   maxBytes=10**6,
                                                   backupCount=3)
fileHandler.setLevel(logging.INFO)

# create console handler
consoleHandler = logging.StreamHandler()
consoleHandler.setLevel(logging.INFO)

# create formatter and add it to the handlers
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
consoleHandler.setFormatter(formatter)
fileHandler.setFormatter(formatter)

# add the handlers to logger
logger.addHandler(consoleHandler)
logger.addHandler(fileHandler)
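With propagate set to False on each logger, a quick sanity check (hypothetical messages, assuming the setup above has run) behaves as intended:
logging.getLogger('elasticsearch').info('goes to top-camps-base.log only')
logging.getLogger('elasticsearch.trace').debug('goes to top-camps-full.log only')
logging.getLogger('mainLog').info('goes to the console and top-camps.log')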

Suspend the formatting of the logger, then go back to it

I have a logging configuration where I log to a file and to the console:
logging.basicConfig(filename=logfile, filemode='w',
                    level=numlevel,
                    format='%(asctime)s - %(levelname)s - %(name)s:%(funcName)s - %(message)s')

# add console messages
console = logging.StreamHandler()
console.setLevel(logging.INFO)
consoleformatter = logging.Formatter('%(asctime)s - %(levelname)s - %(message)s')
console.setFormatter(consoleformatter)
logging.getLogger('').addHandler(console)
At some point in my script, I need to interact with the user by printing a summary and asking for confirmation. The summary is currently produced by print calls in a loop. I would like to suspend the current format of the console logs so that I can print one big block of text with a question at the end and wait for user input. But I still want all of this to be logged to file!
The function that does this is in a module, where I tried the following :
logger = logging.getLogger(__name__)

def summaryfunc():
    logger.info('normal logging business')
    clearformatter = logging.Formatter('%(message)s')
    logger.setFormatter(clearformatter)
    logger.info('\n##########################################')
    logger.info('Summary starts here')
Which yields the error: AttributeError: 'Logger' object has no attribute 'setFormatter'
I understand that a logger is a logger, not a handler, but I'm not sure how to get things to work...
EDIT:
Following the answers, my problem turned into: how can I suspend logging to the console while interacting with the user, while still being able to log to file, i.e. suspend only the StreamHandler? Since this is happening in a module, the specifics of the handlers are defined elsewhere, so here is how I did it:
logger.debug('Normal logging to file and console')
root_logger = logging.getLogger()
stream_handler = root_logger.handlers[1]
root_logger.removeHandler(stream_handler)
print('User interaction')
logger.info('Logging to file only')
root_logger.addHandler(stream_handler)
logger.info('Back to logging to both file and console')
This relies on the StreamHandler always being the second item in the list returned by handlers, but I believe this is the case because the handlers appear in the order I added them to the root logger...
I agree with Vinay that you should use print for normal program output and only use logging for logging purposes. However, if you still want to switch formatting in the middle, then switch back, here is how to do it:
import logging

def summarize():
    console_handler.setFormatter(logging.Formatter('%(message)s'))
    logger.info('Here is my report')
    console_handler.setFormatter(console_formatter)

numlevel = logging.DEBUG
logfile = 's2.log'

logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)

console_handler = logging.StreamHandler()
console_formatter = logging.Formatter('%(asctime)s - %(levelname)s - %(message)s')
console_handler.setFormatter(console_formatter)
logger.addHandler(console_handler)

file_handler = logging.FileHandler(filename=logfile, mode='w')
file_formatter = logging.Formatter('%(asctime)s - %(levelname)s - %(name)s:%(funcName)s - %(message)s')
file_handler.setFormatter(file_formatter)
logger.addHandler(file_handler)

logger.info('Before summary')
summarize()
logger.info('After summary')
Discussion
The script creates a logger object and assigns two handlers to it: one for the console and one for the file.
In the function summarize(), I switch in a new formatter for the console handler, do some logging, then switch back.
Again, let me remind you that you should not use logging to display normal program output.
Update
If you want to suppress console logging and then turn it back on, here is a suggestion:
def interact():
    # Remove the console handler
    for handler in logger.handlers:
        if not isinstance(handler, logging.FileHandler):
            saved_handler = handler
            logger.removeHandler(handler)
            break
    # Interact
    logger.info('to file only')
    # Add the console handler back
    logger.addHandler(saved_handler)
Note that I did not test the handlers against logging.StreamHandler, since logging.FileHandler is derived from logging.StreamHandler. Instead, I removed the handlers that are not FileHandlers, saving the removed handler beforehand for later restoration.
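If this suspend-and-restore dance is needed in more than one place, it can also be packaged as a context manager. A minimal sketch (the name console_suspended is an assumption, not part of the original answer):
import contextlib
import logging

@contextlib.contextmanager
def console_suspended(logger):
    """Temporarily detach every non-FileHandler from the given logger."""
    saved = [h for h in logger.handlers
             if not isinstance(h, logging.FileHandler)]
    for h in saved:
        logger.removeHandler(h)
    try:
        yield
    finally:
        for h in saved:
            logger.addHandler(h)

# Usage:
# with console_suspended(logger):
#     print('big summary block for the user')
#     logger.info('recorded in the file only')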
Update 2: logger names
In the main script, if you have:
logger = logging.getLogger(__name__)  # __name__ == '__main__'
then in a module, you do:
logger = logging.getLogger(__name__)  # __name__ == the module's name, not '__main__'
The problem is that in the script __name__ == '__main__', while in the module __name__ == <the module's name>, not '__main__'. In order to achieve consistency, you will need to make up some name and use it in both places:
logger = logging.getLogger('MyScript')
Logging shouldn't be used to provide the actual output of your program - a program is supposed to run the same way if logging is turned off completely. So I would suggest that it's better to do what you were doing before, i.e. prints in a loop.
