Multiline log records in syslog - python

So I've configured my Python application to log to syslog with Python's SysLogHandler, and everything works fine, except for multi-line handling. It's not that I badly need to emit multiline log records (I do a little), but I need to be able to read Python's exceptions. I'm using Ubuntu with rsyslog 4.2.0. This is what I'm getting:
Mar 28 20:11:59 telemachos root: ERROR 'EXCEPTION'#012Traceback (most recent call last):#012 File "./test.py", line 22, in <module>#012 foo()#012 File "./test.py", line 13, in foo#012 bar()#012 File "./test.py", line 16, in bar#012 bla()#012 File "./test.py", line 19, in bla#012 raise Exception("EXCEPTION!")#012Exception: EXCEPTION!
Test code in case you need it:
import logging
from logging.handlers import SysLogHandler

logger = logging.getLogger()
logger.setLevel(logging.INFO)

syslog = SysLogHandler(address='/dev/log', facility='local0')
formatter = logging.Formatter('%(name)s: %(levelname)s %(message)r')
syslog.setFormatter(formatter)
logger.addHandler(syslog)

def foo():
    bar()

def bar():
    bla()

def bla():
    raise Exception("EXCEPTION!")

try:
    foo()
except:
    logger.exception("EXCEPTION")

Alternatively, if you want to keep each syslog record intact on one line for parsing, you can just translate the escape sequences when viewing the log:
tail -f /var/log/syslog | sed 's/#012/\n\t/g'

OK, figured it out finally...
rsyslog by default escapes all control characters (ASCII < 32), and this includes newlines (as well as tabs and others).
$EscapeControlCharactersOnReceive:
This directive instructs rsyslogd to replace control characters during reception of the message. The intent is to provide a way to stop non-printable messages from entering the syslog system as a whole. If this option is turned on, all control characters are converted to a 3-digit octal number and prefixed with the $ControlCharacterEscapePrefix character ('\' by default). For example, if the BEL character (ctrl-g) is included in the message, it would be converted to '\007'.
You can simply add this to your rsyslog config to turn it off:
$EscapeControlCharactersOnReceive off
or, with the "new" advanced syntax:
global(parser.escapeControlCharactersOnReceive="off")
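On Ubuntu, the directive goes in /etc/rsyslog.conf (or a drop-in file under /etc/rsyslog.d/), and rsyslog has to be restarted before the change takes effect:
sudo service rsyslog restart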

Another option would be to subclass the SysLogHandler and override emit() - you could then call the superclass emit() for each line in the text you're sent. Something like:
from logging import LogRecord
from logging.handlers import SysLogHandler

class MultilineSysLogHandler(SysLogHandler):
    def emit(self, record):
        if '\n' in record.msg:
            # re-wrap dict args so each per-line record still formats correctly
            record_args = [record.args] if isinstance(record.args, dict) else record.args
            for single_line in record.msg.split('\n'):
                single_line_record = LogRecord(
                    name=record.name,
                    level=record.levelno,
                    pathname=record.pathname,
                    lineno=record.lineno,  # LogRecord requires lineno; the original snippet omitted it
                    msg=single_line,
                    args=record_args,
                    exc_info=record.exc_info,
                    func=record.funcName
                )
                super(MultilineSysLogHandler, self).emit(single_line_record)
        else:
            super(MultilineSysLogHandler, self).emit(record)
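A quick usage sketch (the handler address, facility, and formatter are assumed to match the question's setup):
import logging

handler = MultilineSysLogHandler(address='/dev/log', facility='local0')
handler.setFormatter(logging.Formatter('%(name)s: %(levelname)s %(message)s'))

logger = logging.getLogger()
logger.setLevel(logging.INFO)
logger.addHandler(handler)

logger.info("first line\nsecond line")  # emitted as two separate syslog records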

Related

Prevent one of the logging handlers for specific messages

I monitor my script with the logging module of the Python standard library, and I send the log records both to the console with a StreamHandler and to a file with a FileHandler.
I would like to have the option to disable a handler for a LogRecord independently of its severity. For example, for a specific LogRecord I would like the option not to send it to the file destination or to the console (by passing a parameter).
I have found that the library has the Filter class for that purpose (described as a finer-grained way to filter records), but I haven't figured out how to do it.
Any ideas how to do this in a consistent way?
Finally, it is quite easy. I used a function as a Handler.filter, as suggested in the comments.
This is a working example:
from pathlib import Path
import logging
from logging import LogRecord

def build_handler_filters(handler: str):
    def handler_filter(record: LogRecord):
        if hasattr(record, 'block'):
            if record.block == handler:
                return False
        return True
    return handler_filter

ch = logging.StreamHandler()
ch.addFilter(build_handler_filters('console'))
fh = logging.FileHandler(Path('/tmp/test.log'))
fh.addFilter(build_handler_filters('file'))

mylogger = logging.getLogger(__name__)
mylogger.setLevel(logging.DEBUG)
mylogger.addHandler(ch)
mylogger.addHandler(fh)
When the logger is called, the message is sent to both the console and the file, i.e.
mylogger.info('msg')
To block, for example, the file destination, the logger should be called with the extra argument, like this:
mylogger.info('msg only to console', extra={'block': 'file'})
Disabling the console is analogous.
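Concretely, the same extra key with the other handler name blocks the console instead:
mylogger.info('msg only to file', extra={'block': 'console'})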

How to encode all logged messages as utf-8 in Python

I have a little logger function that returns a logger set up with potentially two handlers, logging to a RotatingFileHandler and to sys.stdout simultaneously.
import os, logging, sys
from logging.handlers import RotatingFileHandler
from config import *

def get_logger(filename, log_level_stdout=logging.WARNING, log_level_file=logging.INFO, echo=True):
    logger = logging.getLogger(__name__)
    if not os.path.exists(PATH + '/Logs'):
        os.mkdir(PATH + '/Logs')
    logger.setLevel(logging.DEBUG)
    if echo:
        prn_handler = logging.StreamHandler(sys.stdout)
        prn_handler.setFormatter(logging.Formatter('%(asctime)s %(levelname)s: %(message)s'))
        prn_handler.setLevel(log_level_stdout)
        logger.addHandler(prn_handler)
    file_handler = RotatingFileHandler(PATH + '/Logs/' + filename, maxBytes=1048576, backupCount=3)
    file_handler.setFormatter(logging.Formatter('%(asctime)s %(levelname)s: %(message)s'))
    file_handler.setLevel(log_level_file)
    logger.addHandler(file_handler)
    return logger
This works fine in general, but certain strings being logged appear to be encoded in cp1252 and throw a (non-fatal) error when the logger tries to print them to stdout. It should be noted that the very same characters print just fine in the error message itself. Logging them to a file also causes no issues; it's only the console (sys.stdout) that throws this error.
--- Logging error ---
Traceback (most recent call last):
  File "C:\Program Files\Python38\lib\logging\__init__.py", line 1084, in emit
    stream.write(msg + self.terminator)
  File "C:\Program Files\Python38\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u1ecd' in position 65: character maps to <undefined>
Call stack:
  File "script.py", line 147, in <module>
    logger.info(f"F-String with a name in it: '{name}'.")
Message: "F-String with a name in it: 'Heimstọð'."
Arguments: ()
A fix for this has been to encode every single message as utf-8 in the code that calls the logger function, like this:
logger.info((f"F-String with a name in it: '{name}'.").encode('utf8'))
However, I feel like this is neither elegant nor efficient.
It should also be noted that the logging of the file works just fine and I already tried setting the PYTHONIOENCODING to utf-8 in the system variables of Windows without any noticeable effect.
Update:
Turns out I'm stupid. Just because an error message is printed in the console doesn't mean that printing to the console is the cause of the error. I was looking into the answers to the other question that was suggested to me here, and after a while realized that nothing I did to the "if echo" part of the function had any impact on the result. The last check was commenting out the whole block, and I still got the error. That's when I realized the issue was in fact caused by not enforcing UTF-8 when writing to the file. Adding the simple kwarg encoding='utf-8' to the RotatingFileHandler, as suggested by @michael-ruth, fixed the issue for me.
P.S. I'm not sure how to handle this case because, while that answer fixed my problem, it wasn't really what I was asking for, since I originally misunderstood the root cause. I'll still accept it as the solution and upvote both answers. I'll also edit the question so as not to mislead future readers into believing it answers a question it doesn't really address.
Set the encoding while instantiating the handler instead of encoding the message explicitly.
file_handler = RotatingFileHandler(
    PATH + '/Logs/' + filename,
    maxBytes=1048576,
    backupCount=3,
    encoding='utf-8'
)
help(RotatingFileHandler) is your best friend.
Help on class RotatingFileHandler in module logging.handlers:
class RotatingFileHandler(BaseRotatingHandler)
| RotatingFileHandler(filename, mode='a', maxBytes=0, backupCount=0, encoding=None, delay=False)
Can you check the encoding of sys.stdout (sys.stdout.encoding)?
If it's not 'utf-8', this answer may help to reconfigure the encoding.
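On Python 3.7 or newer, a minimal sketch of that reconfiguration (assuming sys.stdout is a regular text stream):
import sys

# TextIOWrapper.reconfigure() is available from Python 3.7 onwards
if sys.stdout.encoding.lower() != 'utf-8':
    sys.stdout.reconfigure(encoding='utf-8')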

python: logging all errors into single log file

I am coding a tool in Python and I want to put all the errors, and only the errors (computations which didn't go through as they should have), into a single log file. Additionally, I would like different text in the error log file for each section of my code, in order to make the log easy to interpret. How do I code this? Much appreciation to anyone who can help with this!
Check out the python module logging. This is a core module for unifying logging not only in your own project but potentially in third party modules too.
For a minimal logging file example, this is taken directly from the documentation:
import logging
logging.basicConfig(filename='example.log',level=logging.DEBUG)
logging.debug('This message should go to the log file')
logging.info('So should this')
logging.warning('And this, too')
Which results in the contents of example.log:
DEBUG:root:This message should go to the log file
INFO:root:So should this
WARNING:root:And this, too
However, I personally recommend using the yaml configuration method (requires pyyaml):
# logging_config.yml
version: 1
disable_existing_loggers: False
formatters:
  standard:
    format: '%(asctime)s [%(levelname)s] %(name)s - %(message)s'
handlers:
  console:
    class: logging.StreamHandler
    level: INFO
    formatter: standard
    stream: ext://sys.stdout
  file:
    class: logging.FileHandler
    level: DEBUG
    formatter: standard
    filename: output.log
  email:
    class: logging.handlers.SMTPHandler
    level: WARNING
    mailhost: smtp.gmail.com
    fromaddr: to@address.co.uk
    toaddrs: to@address.co.uk
    subject: Oh no, something's gone wrong!
    credentials: [email, password]
    secure: []
root:
  level: DEBUG
  handlers: [console, file, email]
  propagate: True
Then to use, for example:
import logging.config
import yaml

with open('logging_config.yml', 'r') as config:
    logging.config.dictConfig(yaml.safe_load(config))

logger = logging.getLogger(__name__)

logger.info("This will go to the console and the file")
logger.debug("This will only go to the file")
logger.error("This will go everywhere")

try:
    list = [1, 2, 3]
    print(list[10])
except IndexError:
    logger.exception("This will also go everywhere")
This prints:
2018-07-18 13:29:21,434 [INFO] __main__ - This will go to the console and the file
2018-07-18 13:29:21,434 [ERROR] __main__ - This will go everywhere
2018-07-18 13:29:21,434 [ERROR] __main__ - This will also go everywhere
Traceback (most recent call last):
  File "C:/Users/Chris/Desktop/python_scratchpad/a.py", line 16, in <module>
    print(list[10])
IndexError: list index out of range
While the contents of the log file are:
2018-07-18 13:35:55,423 [INFO] __main__ - This will go to the console and the file
2018-07-18 13:35:55,424 [DEBUG] __main__ - This will only go to the file
2018-07-18 13:35:55,424 [ERROR] __main__ - This will go everywhere
2018-07-18 13:35:55,424 [ERROR] __main__ - This will also go everywhere
Traceback (most recent call last):
  File "C:/Users/Chris/Desktop/python_scratchpad/a.py", line 15, in <module>
    print(list[10])
IndexError: list index out of range
Of course, you can add or remove handlers, formatters, etc, or do all of this in code (see the Python documentation) but this is my starting point whenever I use logging in a project. I find it helpful to have the configuration in a dedicated config file rather than polluting my project with defining logging in code.
If I understand the question correctly, the request was to capture only the errors in a dedicated log file, and I would do that differently.
I would stick to the BKM that all modules in the package define their own logger objects (logger = logging.getLogger(__name__)).
I'd leave them without any handlers; whenever they emit, they will look up the hierarchy tree for handlers to actually take care of the emitted messages.
At the root logger, I would add a dedicated FileHandler(filename='errors.log') and set the log level of that handler to logging.ERROR.
That means whenever a logger from the package emits something, this dedicated file handler will discard anything below ERROR and will log only ERROR and CRITICAL messages to the file.
You could still add a global StreamHandler and a regular FileHandler to your root logger. Since you won't change their log levels, they will stay at logging.NOTSET and will log everything that is emitted from the loggers in the package.
And to answer the second part of the question: handlers can define their own formatting, so for the handler that handles only the errors you could set the formatter to something like %(name)s::%(funcName)s:%(lineno)d - %(message)s, which will print:
the logger name (and if you used the convention of defining loggers in every *.py file via __name__, then name will hold the hierarchical path to your module file, e.g. my_pkg.my_sub_pkg.module),
funcName, which holds the function from which the log was emitted, and
lineno, the line number in the module file where the log was emitted.
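A minimal sketch of that layout (the file name errors.log and the error formatter are taken from the answer above; the other handler choices are assumptions):
import logging

# in every module of the package: a bare logger, no handlers attached
logger = logging.getLogger(__name__)

# once, at application start-up:
root = logging.getLogger()
root.setLevel(logging.DEBUG)

# dedicated errors-only file: discards everything below ERROR
error_handler = logging.FileHandler(filename='errors.log')
error_handler.setLevel(logging.ERROR)
error_handler.setFormatter(logging.Formatter(
    '%(name)s::%(funcName)s:%(lineno)d - %(message)s'))
root.addHandler(error_handler)

# optional: a StreamHandler left at NOTSET logs everything
root.addHandler(logging.StreamHandler())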

How to find a spurious print statement?

I'm debugging a large Python codebase. Somewhere, a piece of code is printing {} to console, presumably this is some old debugging code that's been left in by accident.
As this is the only console output that doesn't go through logger, is there any way I can find the culprit? Perhaps by redefining what the print statement does, so I can cause an exception?
Try redirecting sys.stdout to a custom stream handler (see Redirect stdout to a file in Python?), where you can override the write() method.
Try something like this:
import io
import sys
import traceback

class TestableIO(io.BytesIO):
    def __init__(self, old_stream, initial_bytes=None):
        super(TestableIO, self).__init__(initial_bytes)
        self.old_stream = old_stream

    def write(self, bytes):
        if 'bb' in bytes:
            traceback.print_stack(file=self.old_stream)
        self.old_stream.write(bytes)

sys.stdout = TestableIO(sys.stdout)
sys.stderr = TestableIO(sys.stderr)

print('aa')
print('bb')
print('cc')
Then you will get a nice traceback:
λ python test.py
aa
  File "test.py", line 22, in <module>
    print('bb')
  File "test.py", line 14, in write
    traceback.print_stack(file=self.old_stream)
bb
cc
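To hunt the {} from the question, the same trick applies with the stray output as the trigger instead of 'bb'. A sketch reusing the class above (BraceHunter is a hypothetical name, and this assumes the spurious text is exactly '{}'):
class BraceHunter(TestableIO):
    # same trick, but triggered by the spurious '{}' output
    def write(self, data):
        if '{}' in data:
            traceback.print_stack(file=self.old_stream)
        self.old_stream.write(data)

sys.stdout = BraceHunter(sys.stdout)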

Logging module: too many open file descriptors

I am using the Python logging module to print logs to a file, but I encountered a "too many open file descriptors" issue. I did remember to close the log file handlers, but the issue was still there.
Below is my code
class LogService(object):
    __instance = None

    def __init__(self):
        self.__logger = logging.getLogger('ddd')
        self.__handler = logging.FileHandler('/var/log/ddd/ddd.log')
        self.__formatter = logging.Formatter('%(asctime)s %(levelname)s %(message)s')
        self.__handler.setFormatter(self.__formatter)
        #self.__logger.addHandler(self.__handler)

    @classmethod
    def getInstance(cls):
        if cls.__instance == None:
            cls.__instance = LogService()
        return cls.__instance

    # log Error
    def logError(self, msg):
        self.__logger.addHandler(self.__handler)
        self.__logger.setLevel(logging.ERROR)
        self.__logger.error(msg)
        # Remember to close the file handler
        self.closeHandler()

    # log Warning
    def logWarning(self, msg):
        self.__logger.addHandler(self.__handler)
        self.__logger.setLevel(logging.WARNING)
        self.__logger.warn(msg)
        # Remember to close the file handler
        self.closeHandler()

    # log Info
    def logInfo(self, msg):
        self.__logger.addHandler(self.__handler)
        self.__logger.setLevel(logging.INFO)
        self.__logger.info(msg)
        # Remember to close the file handler
        self.closeHandler()

    def closeHandler(self):
        self.__logger.removeHandler(self.__handler)
        self.__handler.close()
And after running this code for a while, the following showed that there were too many open file descriptors.
[root@my-centos ~]# lsof | grep ddd | wc -l
11555
No no. The usage is far simpler
import logging
logging.basicConfig()
logger = logging.getLogger("mylogger")
logger.info("test")
logger.debug("test")
In your case you are adding the handler on every logging operation, which is at the very least overkill.
Check the documentation https://docs.python.org/2/library/logging.html
Each time you log anything, you add another instance of the handler.
Yes, you close it every time. But this just means it takes slightly longer to blow up. Closing it doesn't remove it from the logger.
The first message, you have one handler, so you open one file descriptor and then close it.
The next message, you have two handlers, so you open two file descriptors and close them.
The next message, you open three file descriptors and close them.
And so on, until you're opening more file descriptors than you're allowed to, and you get an error.
The solution is simply not to do that: add the handler once, when the logger is created.
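A minimal sketch of that fix, keeping the singleton wrapper from the question but attaching the handler exactly once:
import logging

class LogService(object):
    __instance = None

    def __init__(self):
        self.__logger = logging.getLogger('ddd')
        self.__logger.setLevel(logging.INFO)
        handler = logging.FileHandler('/var/log/ddd/ddd.log')
        handler.setFormatter(logging.Formatter('%(asctime)s %(levelname)s %(message)s'))
        self.__logger.addHandler(handler)  # added once, never removed

    @classmethod
    def getInstance(cls):
        if cls.__instance is None:
            cls.__instance = LogService()
        return cls.__instance

    def logError(self, msg):
        self.__logger.error(msg)

    def logWarning(self, msg):
        self.__logger.warning(msg)

    def logInfo(self, msg):
        self.__logger.info(msg)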
