Following the question here: How do I log from my Python Spark script, I have been struggling to get:
a) All output into a log file.
b) Writing out to a log file from pyspark
For a) I use the following changes to the config file:
# Set everything to be logged to the console
log4j.rootCategory=ALL, file
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
log4j.appender.file=org.apache.log4j.RollingFileAppender
log4j.appender.file.File=/home/xxx/spark-1.6.1/logging.log
log4j.appender.file.MaxFileSize=5000MB
log4j.appender.file.MaxBackupIndex=10
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
This produces output and now for b) I would like to add my own input to logging from pyspark, but I cannot find any output written to the logs. Here is the code I am using:
import logging
logger = logging.getLogger('py4j')
#print(logger.handlers)
sh = logging.StreamHandler(sys.stdout)
sh.setLevel(logging.DEBUG)
logger.addHandler(sh)
logger.info("TESTING.....")
I can find output in the logfile, but no "TESTING...." I have also tried using the existing logger stream but this does not work either.
import logging
logger = logging.getLogger('py4j')
logger.info("TESTING.....")
Works in my configuration:
log4jLogger = sc._jvm.org.apache.log4j
LOGGER = log4jLogger.LogManager.getLogger(__name__)
LOGGER.info("Hello logger...")
All output into a log file & Writing out to a log file from pyspark
import os
import sys
import logging
import logging.handlers
log = logging.getLogger(__name_)
handler = logging.FileHandler("spam.log")
formatter = logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
handler.setFormatter(formatter)
log.addHandler(handler)
sys.stderr.write = log.error
sys.stdout.write = log.info
(will log every error in "spam.log" in the same directory, nothing will be on console/stdout)
(will log every info in "spam.log" in the same directory,nothing will be on console/stdout)
to print output error/info in both file as well as in console remove above two line.
Happy Coding Cheers!!!
Related
I'm trying to understand the python logging module and am currently running into an issue. I'm adding lines to my own code to log to a file, but I'm using the discord bot module that also outputs data to the console.
I'd like to log both my own stuff and the module logs to a file, creating a new file every day.
I can output my own logs to a file using the TimedRotatingFileHandler, then add that same filename to basicConfig, but then my own logs get into the file 2x for each log action I do.
I can also get everything to log to the file in basicConfig, but I have no idea how to create a TimedRotationFileHandler to that, so I don't know how to make a log file for each day using that method.
Anyone who can help me out with this? Thanks a bunch!
from logging.handlers import TimedRotatingFileHandler
def createLog(name):
logging.basicConfig(level=logging.DEBUG, filename=name, filemode="a", format="%(asctime)s - %(levelname)s - %(message)s")
logfile = logging.getLogger(name)
if (logfile.hasHandlers()):
logfile.handlers.clear()
else:
print("No handlers")
handler = TimedRotatingFileHandler(name, when="midnight", interval=1)
formatter = logging.Formatter("%(asctime)s - %(levelname)s - %(message)s")
handler.setFormatter(formatter)
handler.suffix = "%Y%m%d"
logfile.addHandler(handler)
logfile.setLevel(logging.DEBUG)
return logfile```
LOG_DIR_NAME = {log file name}
MAX_DAYS = { maximum number of days }
formatter = logging.Formatter('%(asctime)s | %(lineno)3d | %(levelname)5s | %(message)s')
logHandler = handlers.TimedRotatingFileHandler(LOG_DIR_NAME + logfile_name, when="midnight", interval=1, backupCount=MAX_DAYS)
Hope it will be helpful to you
or you can also check this answerPython: How to create log file everyday using logging module?
I'm using the python logging module. How can I get all of the previously outputted logs that have been written-out by logger since the application started?
Let's say I have some large application. When the application first starts, it sets-up logging with something like this
import loggging
logging.basicConfig(
filename = '/path/to/log/file.log,
filemode = 'a',
format = '%(asctime)s,%(msecs)d %(name)s %(levelname)s %(message)s',
datefmt = '%H:%M:%S',
level = logging.DEBUG
)
logging.info("===============================================================================")
logging.info( "INFO: Starting Application Logging" )
A few hours or days later I want to be able to get the history of all of the log messages ever written by logging and store that to a variable.
How can I get the history of messages written by the python logging module since the application started?
If you're writing to a log file, then you can simply get the path to the log file and then read its contents
import logging
logger = logging.getLogger( __name__ )
# attempt to get debug log contents
if logger.root.hasHandlers():
logfile_path = logger.root.handlers[0].baseFilename
with open(logfile_path) as log_file:
log_history = log_file.read()
I have a file named helper.py
import logging
import os
from json import load
def get_config(value):
with open('config.json','r') as f:
result=load(f)[value]
return result
def get_logger(name,level):
logpath=get_config("log_path")
if not os.path.exists(logpath):
os.mkdir(logpath)
logger = logging.getLogger(name)
if not bool(logger.handlers):
formatter = logging.Formatter('%(asctime)s.%(msecs)03d - %(name)s - %(levelname)s - %(message)s',datefmt='%Y-%m-%d %H:%M:%S')
fh = logging.FileHandler(os.path.join(logpath,f'{get_config("log_file_name")}.log'),mode="w",encoding='utf-8')
fh.setFormatter(formatter)
logger.addHandler(fh)
ch = logging.StreamHandler()
ch.setFormatter(formatter)
ch.setLevel(level)
logger.addHandler(ch)
return logger
LOGGER=get_logger("MyLogger",logging.INFO)
This is config.json:
{
"save_path" : "results/",
"log_path" : "logs/",
"log_file_name" : "MyLog"
}
let's say I am using LOGGER from helper it in x.py
from helper import LOGGER
logger=LOGGER
def div(x,y):
try:
logger.info("inside div")
return x/y
except Exception as e:
logger.error(f"div failed due to {e.message if 'message' in dir(e) else e}")
I am using this LOGGER in other files by importing helper.LOGGER for logging purposes but it's not printing anything on the console nor writing in a log file
My attempt:
I tried adding sys.stdout in StreamHandler() It doesn't worked
Then I tried setting the level of fh but nothing works
I tried adding basicConfig() instead of fileHandler() but then printing to console using print() and the output of logs is not coming in the correct order
Kindly let me know where I go wrong
Any help is appreciated :)
Thanks :)
You are not setting the level on the LOGGER, which by default is warning. This is why your info level log is not appearing. The Python documentation has a flow chart illustrating when a log will be logged: https://docs.python.org/3/howto/logging.html#logging-flow
The first thing it does, is that before a logger sends a log to their handlers it checks if the level is enabled for the logger. You should add logger.setLevel(level) in your get_logger().
I am trying to add in multiple log files into a Logs folder, but instead of having to change the code each time you start the program, i want to make the log file's name "Log(the time).log". I'm using logger at the moment, but i can switch. I've also imported time.
Edit: Here is some of the code i am using:
import logging
logger = logging.getLogger('k')
hdlr = logging.FileHandler('Path to the log file/log.log')
formatter = logging.Formatter('At %(asctime)s, KPY returned %(message)s at level %(levelname)s
hdlr.setFormatter(formatter)
logger.addHandler(hdlr)
logger.setLevel(logging.DEBUG)
logger.info('hello')
import logging
import time
fname = "Log({the_time}).log".format(the_time=time.time())
logging.basicConfig(level=logging.DEBUG, filename=fname)
logging.info('hello')
You should do it when you set the FileHandler for the logging object. Use datetime instead of time so that you can include the date for each instance of the log in order to differentiate logs on different days at the same time.
fh = logging.FileHandler("Log"+str(datetime.datetime.now())+'.log')
fh.setLevel(logging.DEBUG)
I got help from another website.
You have to change the hdlr to:
({FOLDER LOCATION}/Logs/log{}.log'.format(datetime.datetime.strftime(datetime.datetime.now(), '%Y%m%d%H%M%S_%f')))
I don't know why this code prints to the screen, but not to the file? File "example1.log" is created, but nothing is written there.
#!/usr/bin/env python3
import logging
logging.basicConfig(level=logging.DEBUG,
format='%(asctime)s %(message)s',
handlers=[logging.FileHandler("example1.log"),
logging.StreamHandler()])
logging.debug('This message should go to the log file and to the console')
logging.info('So should this')
logging.warning('And this, too')
I have "bypassed" this problem by creating a logging object, but it keeps bugging me why basicConfig() approach failed?
PS. If I change basicConfig call to:
logging.basicConfig(level=logging.DEBUG,
filename="example2.log",
format='%(asctime)s %(message)s',
handlers=[logging.StreamHandler()])
then all logs are in the file and nothing is displayed in the console.
Try this working fine(tested in python 2.7) for both console and file
# set up logging to file
logging.basicConfig(
filename='log_file_name.log',
level=logging.INFO,
format= '[%(asctime)s] {%(pathname)s:%(lineno)d} %(levelname)s - %(message)s',
datefmt='%H:%M:%S'
)
# set up logging to console
console = logging.StreamHandler()
console.setLevel(logging.DEBUG)
# set a format which is simpler for console use
formatter = logging.Formatter('%(name)-12s: %(levelname)-8s %(message)s')
console.setFormatter(formatter)
# add the handler to the root logger
logging.getLogger('').addHandler(console)
logger = logging.getLogger(__name__)
I can't reproduce it on Python 3.3. The messages are written both to the screen and the 'example2.log'. On Python <3.3 it creates the file but it is empty.
The code:
from logging_tree import printout # pip install logging_tree
printout()
shows that FileHandler() is not attached to the root logger on Python <3.3.
The docs for logging.basicConfig() say that handlers argument is added in Python 3.3. The handlers argument isn't mentioned in Python 3.2 documentation.
Another technique using the basicConfig is to setup all your handlers in the statement and retrieve them after the fact, as in...
import logging
logging.basicConfig(level=logging.DEBUG,
format='%(asctime)s %(levelname)s %(module)s %(funcName)s %(message)s',
handlers=[logging.FileHandler("my_log.log", mode='w'),
logging.StreamHandler()])
stream_handler = [h for h in logging.root.handlers if isinstance(h , logging.StreamHandler)][0]
stream_handler.setLevel(logging.INFO)
More sensibly though is to construct your stream handler instance outside and configure them as standalone objects that you pass to the handlers list as in...
import logging
stream_handler = logging.StreamHandler()
stream_handler.setLevel(logging.INFO)
logging.basicConfig(level=logging.DEBUG,
format='%(asctime)s %(levelname)s %(module)s %(funcName)s %(message)s',
handlers=[logging.FileHandler("my_log.log", mode='w'),
stream_handler])
In the example below, you can specify the log destination based on its level. For example, the code below lets all logs over the INFO level go to the log file, and all above ERROR level goes to the console.
import logging
logging.root.handlers = []
logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s', level=logging.INFO , filename='ex.log')
# set up logging to console
console = logging.StreamHandler()
console.setLevel(logging.ERROR)
# set a format which is simpler for console use
formatter = logging.Formatter('%(asctime)s : %(levelname)s : %(message)s')
console.setFormatter(formatter)
logging.getLogger("").addHandler(console)
logging.debug('debug')
logging.info('info')
logging.warning('warning')
logging.error('error')
logging.exception('exp')
WOOAH!
I just spent about 20 minutes being baffled by this.
Eventually I worked out that the StreamHandler was outputting to stderr, not stdout (in a 'Doze DOS screen these have the same font colour!).
This resulted in me being able to run the code perfectly OK and get sensible results, but in a pytest function things going awry. Until I changed from this:
out, _ = capsys.readouterr()
assert 'test message check on console' in out, f'out was |{out}|'
to this:
_, err = capsys.readouterr()
assert 'test message check on console' in err, f'err was |{err}|'
NB the constructor of StreamHandler is
class logging.StreamHandler(stream=None)
and, as the docs say, "If stream is specified, the instance will use it for logging output; otherwise, sys.stderr will be used."
NB it seems that supplying the level keyword does not run setLevel on the handlers: you'd need to iterate on the resulting handlers and run setLevel on each, if it matters to you.
This is a ValueError if FileHandler and StreamHandler both are present in BasicConfig function
https://docs.python.org/3/library/logging.html#logging.basicConfig
See image below: