How to get all previous logs written during current run (python logging) - python

I'm using the python logging module. How can I get all of the previously outputted logs that have been written-out by logger since the application started?
Let's say I have some large application. When the application first starts, it sets-up logging with something like this
import loggging
logging.basicConfig(
filename = '/path/to/log/file.log,
filemode = 'a',
format = '%(asctime)s,%(msecs)d %(name)s %(levelname)s %(message)s',
datefmt = '%H:%M:%S',
level = logging.DEBUG
)
logging.info("===============================================================================")
logging.info( "INFO: Starting Application Logging" )
A few hours or days later I want to be able to get the history of all of the log messages ever written by logging and store that to a variable.
How can I get the history of messages written by the python logging module since the application started?

If you're writing to a log file, then you can simply get the path to the log file and then read its contents
import logging
logger = logging.getLogger( __name__ )
# attempt to get debug log contents
if logger.root.hasHandlers():
logfile_path = logger.root.handlers[0].baseFilename
with open(logfile_path) as log_file:
log_history = log_file.read()

Related

Create log file named after filename of caller script

I have a logger.py file which initialises logging.
import logging
logger = logging.getLogger(__name__)
def logger_init():
import os
import inspect
global logger
logger.setLevel(logging.DEBUG)
ch = logging.StreamHandler()
ch.setLevel(logging.DEBUG)
logger.addHandler(ch)
fh = logging.FileHandler(os.getcwd() + os.path.basename(__file__) + ".log")
fh.setLevel(level=logging.DEBUG)
logger.addHandler(fh)
return None
logger_init()
I have another script caller.py that calls the logger.
from logger import *
logger.info("test log")
What happens is a log file called logger.log will be created containing the logged messages.
What I want is the name of this log file to be named after the caller script filename. So, in this case, the created log file should have the name caller.log instead.
I am using python 3.7
It is immensely helpful to consolidate logging to one location. I learned this the hard way. It is easier to debug when events are sorted by time and it is thread-safe to log to the same file. There are solutions for multiprocessing logging.
The log format can, then, contain the module name, function name and even line number from where the log call was made. This is invaluable. You can find a list of attributes you can include automatically in a log message here.
Example format:
format='[%(asctime)s] [%(module)s.%(funcName)s] [%(levelname)s] %(message)s
Example log message
[2019-04-03 12:29:48,351] [caller.work_func] [INFO] Completed task 1.
You can get the filename of the main script from the first item in sys.argv, but if you want to get the caller module not the main script, check the answers on this question.

Python logging multiple modules each module log to separate files

I have a main script and multiple modules. Right now I have logging setup where all the logging from all modules go into the same log file. It gets hard to debug when its all in one file. So I would like to separate each module into its own log file. I would also like to see the requests module each module uses into the log of the module that used it. I dont know if this is even possible. I searched everywhere and tried everything I could think of to do it but it always comes back to logging everything into one file or setup logging in each module and from my main module initiate the script instead of import as a module.
main.py
import logging, logging.handlers
import other_script.py
console_debug = True
log = logging.getLogger()
def setup_logging():
filelog = logging.handlers.TimedRotatingFileHandler(path+'logs/api/api.log',
when='midnight', interval=1, backupCount=3)
filelog.setLevel(logging.DEBUG)
fileformatter = logging.Formatter('%(asctime)s %(name)-15s %(levelname)-8s %(message)s')
filelog.setFormatter(fileformatter)
log.addHandler(filelog)
if console_debug:
console = logging.StreamHandler()
console.setLevel(logging.DEBUG)
formatter = logging.Formatter('%(name)-15s: %(levelname)-8s %(message)s')
console.setFormatter(formatter)
log.addHandler(console)
if __name__ == '__main__':
setup_logging()
other_script.py
import requests
import logging
log = logging.getLogger(__name__)
One very basic concept of python logging is that every file, stream or other place that logs go is equivalent to one Handler. So if you want every module to log to a different file you will have to give every module it's own handler. This can also be done from a central place. In your main.py you could add this to make the other_script module log to a separate file:
other_logger = logging.getLogger('other_script')
other_logger.addHandler(logging.FileHandler('other_file'))
other_logger.propagate = False
The last line is only required if you add a handler to the root logger. If you keep propagate at the default True you will have all logs be sent to the root loggers handlers too. In your scenario it might be better to not even use the root logger at all, and use a specific named logger like getLogger('__main__') in main.

Python logging create handlers in wrapper script for logger in imported module

I am writing an API (python 2.7.x), and I have a worker script for it which does nothing on its own but can be wrapped by a variety of higher level scripts (ie one that feeds the worker data from csv, one from dB etc). The current task requires me to:
log INFO+ to console
log a certain set of INFO+ events to a .csv file
log ALL events to a distinct .log file
I've distilled my code to the following examples:
# SuperExample.py
import logging
import SubExample
def main():
logging.basicConfig(level=logging.INFO)
verbose_log = 'debug.log'
data_log = 'data.csv'
format_string = '%(asctime)s::%(name)s::%(levelname)s::%(message)s'
formatter = logging.Formatter(format_string)
# verbose log is a typical event log used for debugging
verbose = logging.FileHandler(verbose_log, mode='w')
verbose.setLevel(logging.DEBUG)
verbose.setFormatter(formatter)
SubExample.logger.addHandler(verbose)
# data log will eventually have a different formatter and a filter in
# order to get a narrow set of events, formatted for post-processing ease
data = logging.FileHandler(data_log, mode='w')
data.setLevel(logging.INFO)
data.setFormatter(formatter)
SubExample.logger.addHandler(data)
logging.info('Started')
SubExample.do_something()
logging.info('Finished')
if __name__ == '__main__':
main()
and
# SubExample.py
import logging
logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)
def do_something():
logger.debug('hey look I am doing something!')
logger.debug('now I am doing something else!!')
logger.info('this is my result!!!')
which gives me what I want in my files, but gives me this in my console:
INFO:root:Started
DEBUG:SubExample:hey look I am doing something!
DEBUG:SubExample:now I am doing something else!!
INFO:SubExample:this is my result!!!
INFO:root:Finished
I've read about the logging module and it's best practices, but very little of the example code works exactly the way its described when libraries get involved. So, my first question is: is this a basically sane approach? I haven't actually seen anyone else attach handlers to the subscript logger from the wrapper script, but it seems to do what I want.
And my second question is why do the DEBUG statements get into the console? I would think that logging.basicConfig(level=logging.INFO) should prevent this?
In the SuperExample.py file, I removed the basicConfig step and instead did this:
# SuperExample.py
import logging
import SubExample
def main():
logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)
console = logging.StreamHandler()
console.setLevel(logging.INFO)
logger.addHandler(console)
...
...
logger.info('Started')
...
logger.info('Finished')
In the SubExample.py file:
# SubExample.py
import logging
logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)
console = logging.StreamHandler()
console.setLevel(logging.INFO)
logger.addHandler(console)
def do_something():
....
Rest of the code is same as yours. When I run SuperExample.py, this is the output:
test_project ~$ python SuperExample.py
Started
this is my result!!!
Finished
The debug.log file has this:
2017-10-25 16:18:08,292::SubExample::DEBUG::hey look I am doing something!
2017-10-25 16:18:08,292::SubExample::DEBUG::now I am doing something else!!
2017-10-25 16:18:08,292::SubExample::INFO::this is my result!!!
The data.csv file has this:
2017-10-25 16:18:08,292::SubExample::INFO::this is my result!!!
So, it seems like the right way to do this is to add a StreamHandler to the logger in each of your modules and set it's level to what you want logged from there to the console. Also, whenever you do logging.getLogger(), you HAVE to set that logger's level to get the expected behavior from the handlers.

Struggling to get logfile output in PySpark

Following the question here: How do I log from my Python Spark script, I have been struggling to get:
a) All output into a log file.
b) Writing out to a log file from pyspark
For a) I use the following changes to the config file:
# Set everything to be logged to the console
log4j.rootCategory=ALL, file
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
log4j.appender.file=org.apache.log4j.RollingFileAppender
log4j.appender.file.File=/home/xxx/spark-1.6.1/logging.log
log4j.appender.file.MaxFileSize=5000MB
log4j.appender.file.MaxBackupIndex=10
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
This produces output and now for b) I would like to add my own input to logging from pyspark, but I cannot find any output written to the logs. Here is the code I am using:
import logging
logger = logging.getLogger('py4j')
#print(logger.handlers)
sh = logging.StreamHandler(sys.stdout)
sh.setLevel(logging.DEBUG)
logger.addHandler(sh)
logger.info("TESTING.....")
I can find output in the logfile, but no "TESTING...." I have also tried using the existing logger stream but this does not work either.
import logging
logger = logging.getLogger('py4j')
logger.info("TESTING.....")
Works in my configuration:
log4jLogger = sc._jvm.org.apache.log4j
LOGGER = log4jLogger.LogManager.getLogger(__name__)
LOGGER.info("Hello logger...")
All output into a log file & Writing out to a log file from pyspark
import os
import sys
import logging
import logging.handlers
log = logging.getLogger(__name_)
handler = logging.FileHandler("spam.log")
formatter = logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
handler.setFormatter(formatter)
log.addHandler(handler)
sys.stderr.write = log.error
sys.stdout.write = log.info
(will log every error in "spam.log" in the same directory, nothing will be on console/stdout)
(will log every info in "spam.log" in the same directory,nothing will be on console/stdout)
to print output error/info in both file as well as in console remove above two line.
Happy Coding Cheers!!!

Python logging over multiple files

I've read through the logging module documentation and whilst I may have missed something obvious, the code I've got doesn't appear to be working as intended. I'm using Python 2.6.4.
My program consists of several different python files, from which I want to send logging messages to a text file and, potentially, the screen. I imagine this is a common thing to do so I'm messing this up somewhere.
What my code is doing at the minute is logging to the text file correctly, kinda. But logging to the screen is being duplicated, one with the specified formatting, and one without. Also, when I turn off the screen output, I'm still getting the text printed once, which I don't want - I just want it to be logged to the file.
Anyway, some code:
#logger.py
import logging
from logging.handlers import RotatingFileHandler
import os
def setup_logging(logdir=None, scrnlog=True, txtlog=True, loglevel=logging.DEBUG):
logdir = os.path.abspath(logdir)
if not os.path.exists(logdir):
os.mkdir(logdir)
log = logging.getLogger('stumbler')
log.setLevel(loglevel)
log_formatter = logging.Formatter("%(asctime)s - %(levelname)s :: %(message)s")
if txtlog:
txt_handler = RotatingFileHandler(os.path.join(logdir, "Stumbler.log"), backupCount=5)
txt_handler.doRollover()
txt_handler.setFormatter(log_formatter)
log.addHandler(txt_handler)
log.info("Logger initialised.")
if scrnlog:
console_handler = logging.StreamHandler()
console_handler.setFormatter(log_formatter)
log.addHandler(console_handler)
Nothing unusual there.
#core.py
import logging
corelog = logging.getLogger('stumbler.core') # From what I understand of the docs, this should work :/
class Stumbler:
[...]
corelog.debug("Messages and rainbows...")
The screen output shows how this is being duplicated:
2010-01-08 22:57:07,587 - DEBUG :: SCANZIP: Checking zip contents, file: testscandir/testdir1/music.mp3
DEBUG:stumbler.core:SCANZIP: Checking zip contents, file: testscandir/testdir1/music.mp3
2010-01-08 22:57:07,587 - DEBUG :: SCANZIP: Checking zip contents, file: testscandir/testdir2/subdir/executable.exe
DEBUG:stumbler.core:SCANZIP: Checking zip contents, file: testscandir/testdir2/subdir/executable.exe
Although the textfile is getting the correctly formatted output, turning the screen logging off in logger.py still has the incorrectly formatted output displayed.
From what I understand of the docs, calling corelog.debug(), seeing as corelog is a child of the "stumbler" logger, it should use that formatting and output the logs as such.
Apologies for the essay over such a trivial issue.
TL;DR: How do I do logging from multiple files?
Are you sure no other logging setup is being done in anything you import.
The incorrect output in your console logs look like the default configuration for a logger, so something else may be setting that up.
Running this quick test script:
import logging
from logging.handlers import RotatingFileHandler
import os
def setup_logging(logdir=None, scrnlog=True, txtlog=True, loglevel=logging.DEBUG):
logdir = os.path.abspath(logdir)
if not os.path.exists(logdir):
os.mkdir(logdir)
log = logging.getLogger('stumbler')
log.setLevel(loglevel)
log_formatter = logging.Formatter("%(asctime)s - %(levelname)s :: %(message)s")
if txtlog:
txt_handler = RotatingFileHandler(os.path.join(logdir, "Stumbler.log"), backupCount=5)
txt_handler.doRollover()
txt_handler.setFormatter(log_formatter)
log.addHandler(txt_handler)
log.info("Logger initialised.")
if scrnlog:
console_handler = logging.StreamHandler()
console_handler.setFormatter(log_formatter)
log.addHandler(console_handler)
setup_logging('/tmp/logs')
corelog = logging.getLogger('stumbler.core')
corelog.debug("Messages and rainbows...")
yields this result:
2010-01-08 15:39:25,335 - DEBUG ::
Messages and rainbows...
and in my /tmp/logs/Stumbler.log
2010-01-08 15:39:25,335 - INFO ::
Logger initialised. 2010-01-08
15:39:25,335 - DEBUG :: Messages and
rainbows...
This worked as expected when I ran it in python 2.4, 2.5, and 2.6.4

Categories

Resources