How to log all print statements - python

My program is quite big and I want all of its print statements to be logged, so I implemented the following:
import sys
F = open('testy.txt','w')
sys.stdout = F
if app.button_press() == True and app.return_data():
    data = app.return_data()
    main(data[0],data[1],data[2],data[3],data[4],data[5],data[6],data[7],data[8])
F.close()
This is what I used to do with small programs that had only a few print statements, but this program has a few hundred of them, and it freezes when I run it. I think the sheer number of print statements causes a memory overflow. How can I log all my print statements to a .txt file without affecting the program's functionality?

You shouldn't be using print statements for logging, nor should you be redirecting standard out (or any other standard stream) in production code. Rather, you should be using the logging module to write messages, and setting up whatever handlers, probably a file handler or rotating file handler in your case, that you need to record your log messages to the right place.
If you already have too much existing code printing log messages to refactor in one sitting, I'd suggest implementing logging, using it for all logging going forward, setting up a file-like object that shunts standard out to logging to capture existing log messages, then refactoring out all your print-based logging over time.
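As a minimal sketch of the "file-like object that shunts standard out to logging" idea: the class name StreamToLogger and the logger names below are made up for illustration, and the sketch assumes the target logger does not itself write to sys.stdout (otherwise the redirect would feed back into itself).

```python
import contextlib
import logging


class StreamToLogger:
    """File-like object that forwards written text to a logger (hypothetical helper)."""

    def __init__(self, logger, level=logging.INFO):
        self.logger = logger
        self.level = level
        self._buffer = ""

    def write(self, text):
        # print() can deliver one message in several pieces, so buffer
        # the text until a complete line is available.
        self._buffer += text
        while "\n" in self._buffer:
            line, self._buffer = self._buffer.split("\n", 1)
            if line:
                self.logger.log(self.level, line)
        return len(text)

    def flush(self):
        # Emit any trailing text that was not newline-terminated.
        if self._buffer:
            self.logger.log(self.level, self._buffer)
            self._buffer = ""


def run_with_prints_logged(func, logger):
    """Call func() while its print output is routed to the given logger."""
    stream = StreamToLogger(logger)
    with contextlib.redirect_stdout(stream):
        result = func()
    stream.flush()
    return result
```

You would configure logging once (for example with a file handler) and then call run_with_prints_logged(main_entry_point, logging.getLogger("legacy.print")), where main_entry_point stands for whatever function currently does the printing.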

Related

Python - How to log records to both console and stdout

I'm trying to replace all print statements from the old test framework with Python logging.
I need to use a simple logging.debug('something') way to write records to both console and test report.
I'm using a custom logger for each test module like this:
logger.py
import logging
from sys import stdout
def setup_logger(filename):
    logger = logging.getLogger(filename)
    logger.setLevel(logging.DEBUG)
    stdout_handler = logging.StreamHandler(stdout)
    logger.addHandler(stdout_handler)
    return logger
test_file.py
from logger import setup_logger
logger = setup_logger(__name__)
logger.debug('something')
Currently I can write to the console but I don't see those records in sys.stdout when I'm generating test reports. I'm using HtmlTestRunner for generating test reports and it looks like it doesn't receive stdout records from tests here:
self._stdout_data = sys.stdout.getvalue()
https://github.com/oldani/HtmlTestRunner/blob/master/HtmlTestRunner/result.py#L181
I'm not very familiar with logging and I don't understand what might be an issue here.
Is sys.stdout something completely different from the stdout where my custom logger writes messages?
Is there an easy way to solve this issue?
I read other similar questions about this (so my custom logger was written after reading those) but my issue is a bit different as I'm using the report generator which is parsing sys.stdout.
UPD: Or maybe I'm already doing this right but I need to read not the sys.stdout but something else to find my records when generating test report?
UPD2: Found a similar old question which was never answered. Maybe there's no quick solution for that.
View passed testcases and show log messages in html report when using python html-testrunner??
It looks like the only good way to solve this is to write a function that does two things: logs to the console (I ended up just using loguru, which is simple and clear) and prints the same message (solely to support the old test reports). Of course it might look weird that someone needs output data in success reports as well, but it's not my personal choice.
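A possible alternative with the standard library only, offered as a sketch rather than a confirmed fix for HtmlTestRunner: logging.StreamHandler(stdout) keeps the stream object it was given at creation time, so if the test runner later swaps sys.stdout for a buffer (the sys.stdout.getvalue() call above suggests it does), the handler keeps writing to the old stream. The hypothetical CurrentStdoutHandler below looks up sys.stdout on every record instead.

```python
import logging
import sys


class CurrentStdoutHandler(logging.StreamHandler):
    """Hypothetical handler that writes to whatever sys.stdout is right now."""

    def emit(self, record):
        # Re-read sys.stdout for every record instead of keeping the
        # object that was captured when the handler was created.
        self.stream = sys.stdout
        super().emit(record)


def setup_logger(name):
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    if not logger.handlers:  # avoid stacking handlers on repeated calls
        logger.addHandler(CurrentStdoutHandler())
    return logger
```

With this variant, logger.debug('something') lands in whichever object sys.stdout points to when the test runs, which is also what a plain print would do.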

Python, read log exception from logging.exception

I use the scrapy library for scraping and it handles all the logging for me. The log file is very big and I was hoping for another way than to read through the whole file. I'm only interested in the exceptions.
Is it possible to get the raw logging.exception output and read from that without creating another log file?
Use the Python logging facility, which is what Scrapy uses. You need to customize it at the program entry point, for example in the __init__.py file or anywhere else that runs before any logging calls.
What you need to do is set different handlers for different levels (ERROR, WARNING, ...). There is no EXCEPTION level; you might be confusing it with the ERROR or CRITICAL levels.
This is the "Logging How To", you should check it.
Your solution would look like this:
import logging
# This is the logger Scrapy uses throughout the app (the root logger).
logger = logging.getLogger()
# FileHandler requires a path; 'errors.log' is only a placeholder name.
my_exception_handler = logging.FileHandler('errors.log')
my_exception_handler.setLevel(logging.ERROR)
logger.addHandler(my_exception_handler)
Finally, if you really mean exceptions rather than the ERROR or CRITICAL levels, you'll need to subclass your exception class or override it so that it uses the same logging approach I just explained.
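If the goal is to avoid a second log file entirely, one option is a handler that keeps matching records in memory. This is a sketch, not something Scrapy provides: ExceptionCollector is a made-up name, and it relies on the fact that logging.exception() logs at ERROR level and attaches exc_info to the record.

```python
import logging


class ExceptionCollector(logging.Handler):
    """Hypothetical handler that keeps only records carrying exception info."""

    def __init__(self):
        super().__init__(level=logging.ERROR)
        self.formatted = []

    def emit(self, record):
        # logging.exception() (or exc_info=True) sets record.exc_info;
        # plain error() calls leave it empty and are skipped here.
        if record.exc_info:
            self.formatted.append(self.format(record))


collector = ExceptionCollector()
logging.getLogger().addHandler(collector)  # attach to the root logger
```

After the run, collector.formatted holds one string per logged exception, traceback included, and can be inspected or printed without touching the big log file.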

How do I define a different logger for an imported module in Python?

I'm using Advanced Python Scheduler in a Python script. The main program defines a log by calling logging.basicConfig with the file name of the log that I want. This log is also set to "DEBUG" as the logging level, since that's what I need at present for my script.
Unfortunately, because logging.basicConfig has been set up in this manner, apscheduler writes its log entries to the same log file. There are an awful lot of these, especially since I have one scheduled task that runs every minute.
Is there any way to redirect apscheduler's log output to another log file (without changing apscheduler's code) while using my log file for my own script? I.e. is there a way to change the file name for each module's output within my script?
I tried reading the module page and the HOWTO for logging, but could not find an answer to this.
Set the logger level for apscheduler to your desired value (e.g. WARNING, to avoid seeing DEBUG and INFO messages from apscheduler) like this:
logging.getLogger('apscheduler').setLevel(logging.WARNING)
You will still get messages for WARNING and higher severities. To direct messages from apscheduler into a separate file, use
aplogger = logging.getLogger('apscheduler')
aplogger.propagate = False
aplogger.setLevel(logging.WARNING) # or whatever
aphandler = logging.FileHandler(...) # as per what you want
aplogger.addHandler(aphandler)
Ensure the above code is only called once (otherwise you will add multiple FileHandler instances - probably not what you want).
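A sketch of the same steps wrapped in a function that is safe to call more than once. The helper name route_logger_to_file and the format string are assumptions for illustration; the path is whatever you would pass to FileHandler.

```python
import logging
import os


def route_logger_to_file(logger_name, path, level=logging.WARNING):
    """Send one logger's records to its own file (hypothetical helper)."""
    logger = logging.getLogger(logger_name)
    logger.setLevel(level)
    logger.propagate = False  # keep these records out of the root handlers

    # Add the FileHandler only once, even if this function runs again.
    target = os.path.abspath(path)
    for handler in logger.handlers:
        if isinstance(handler, logging.FileHandler) and handler.baseFilename == target:
            return logger

    handler = logging.FileHandler(path)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return logger
```

For the question above this would be called as route_logger_to_file('apscheduler', some_path) right after logging.basicConfig; child loggers such as those inside the apscheduler package propagate up to it and end up in the same file.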
Maybe you want to call logging.getLogger("apscheduler") and set up its log file there? See this answer: https://stackoverflow.com/a/2031557/782168

Python- How to flush the log? (django)

I'm working with Django-nonrel on Google App Engine, which forces me to use logging.debug() instead of print().
The "logging" module is provided by Django, but I'm having a rough time using it instead of print().
For example, if I need to verify the content held in the variable x, I will put
logging.debug('x is: %s' % x). But if the program crashes soon after (without flushing the stream), then it never gets printed.
So for debugging, I need debug() to be flushed before the program exits on error, and this is not happening.
I think this may work for you, assuming you're only using one (or the default) handler:
>>> import logging
>>> logger = logging.getLogger()
>>> logging.debug('wat wat')
>>> logger.handlers[0].flush()
It's kind of frowned upon in the documentation, though.
Application code should not directly instantiate and use instances of Handler. Instead, the Handler class is a base class that defines the interface that all handlers should have and establishes some default behavior that child classes can use (or override).
http://docs.python.org/2/howto/logging.html#handler-basic
And it could be a performance drain, but if you're really stuck, this may help with your debugging.
If the use case is that you have a python program that should flush its logs when exiting, use logging.shutdown().
From the python documentation:
logging.shutdown()
Informs the logging system to perform an orderly
shutdown by flushing and closing all handlers. This should be called
at application exit and no further use of the logging system should be
made after this call.
Django logging relies on the standard python logging module.
This module has a module-level function, logging.shutdown(), which flushes all of the handlers and shuts down the logging system (i.e. logging can no longer be used after it is called).
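To see the flushing effect of logging.shutdown(), here is a small sketch using a buffering handler from the standard library. The StringIO sink and the logger name flush_demo are stand-ins chosen for the example; in a real application the target would be a file or network handler.

```python
import io
import logging
import logging.handlers

sink = io.StringIO()  # stands in for the real log destination
target = logging.StreamHandler(sink)
# MemoryHandler holds records until its buffer is full, a record at
# flushLevel (ERROR by default) arrives, or it is flushed or closed.
buffered = logging.handlers.MemoryHandler(capacity=100, target=target)

logger = logging.getLogger("flush_demo")
logger.setLevel(logging.DEBUG)
logger.propagate = False
logger.addHandler(buffered)

logger.debug("x is: %s", 42)
before_shutdown = sink.getvalue()  # still empty: the record is buffered

logging.shutdown()  # flushes and closes every handler
after_shutdown = sink.getvalue()   # now contains the debug message
```

The logging module also registers shutdown() to run at normal interpreter exit, so an explicit call mostly matters when you want the flush at a specific point, for example in a finally block around the main entry point.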
Inspecting the code of this function shows that currently (python 2.7) the logging module holds a list of weak references to all handlers in a module-level variable called _handlerList so all of the handlers can be flushed by doing something like
[h_weak_ref().flush() for h_weak_ref in logging._handlerList]
Because this solution uses the internals of the module, @Mike's solution above is better, but it relies on having access to a logger. It can be generalized as follows:
[h.flush() for h in my_logger.handlers]
A simple function that always works for recording your debug messages while programming. Don't use it in production, since it will not rotate the log file:
def make_log(message):
    import datetime
    with open('mylogfile.log','a') as f:
        f.write(f"{datetime.datetime.now()} {message}\n")
Then use it as:
make_log('my message to register')
When you go to production, just comment out the last two lines:
def make_log(message):
    import datetime
    #with open('mylogfile.log','a') as f:
    #    f.write(f"{datetime.datetime.now()} {message}\n")

In python, why use logging instead of print?

For simple debugging in a complex project is there a reason to use the python logger instead of print? What about other use-cases? Is there an accepted best use-case for each (especially when you're only looking for stdout)?
I've always heard that this is a "best practice" but I haven't been able to figure out why.
The logging package has a lot of useful features:
Easy to see where and when (even what line no.) a logging call is being made from.
You can log to files, sockets, pretty much anything, all at the same time.
You can differentiate your logging based on severity.
Print doesn't have any of these.
Also, if your project is meant to be imported by other Python tools, it's bad practice for your package to print things to stdout, since the user likely won't know where the print messages are coming from. With logging, users of your package can choose whether or not they want to propagate logging messages from your tool.
One of the biggest advantages of proper logging is that you can categorize messages and turn them on or off depending on what you need. For example, it might be useful to turn on debugging level messages for a certain part of the project, but tone it down for other parts, so as not to be taken over by information overload and to easily concentrate on the task for which you need logging.
Also, logs are configurable. You can easily filter them, send them to files, format them, add timestamps, and any other things you might need on a global basis. Print statements are not easily managed.
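A short sketch of what that configurability looks like in practice. The logger names myapp.parser and myapp.network are invented for the example, and a StringIO stream stands in for a file or the console.

```python
import io
import logging

stream = io.StringIO()  # stand-in for a log file or the console
handler = logging.StreamHandler(stream)
# One format string adds timestamp, severity, logger name and call site
# to every message.
handler.setFormatter(logging.Formatter(
    "%(asctime)s %(levelname)s %(name)s %(filename)s:%(lineno)d %(message)s"
))

app_logger = logging.getLogger("myapp")
app_logger.addHandler(handler)
app_logger.propagate = False

# Turn debugging on for one part of the project and tone down another.
parser_log = logging.getLogger("myapp.parser")
parser_log.setLevel(logging.DEBUG)
network_log = logging.getLogger("myapp.network")
network_log.setLevel(logging.WARNING)

parser_log.debug("token stream: %s", [1, 2, 3])
network_log.debug("socket opened")          # suppressed by the WARNING level
network_log.warning("retrying connection")

output = stream.getvalue()
```

Changing what gets recorded is then a matter of adjusting levels or swapping the handler, without touching the individual logging calls.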
Print statements are sort of the worst of both worlds, combining the negative aspects of an online debugger with diagnostic instrumentation. You have to modify the program but you don't get more, useful code from it.
An online debugger allows you to inspect the state of a running program. The nice thing about a real debugger is that you don't have to modify the source, neither before nor after the debugging session; you just load the program into the debugger, tell the debugger where you want to look, and you're all set.
Instrumenting the application might take some work up front, modifying the source code in some way, but the resulting diagnostic output can have enormous amounts of detail, and can be turned on or off to a very specific degree. The Python logging module can show not just the message logged, but also the file and function that called it, a traceback if there was one, the actual time that the message was emitted, and so on. More than that, diagnostic instrumentation need never be removed; it's just as valid and useful when the program is finished and in production as it was the day it was added. But it can have its output stuck in a log file where it's not likely to annoy anyone, or the log level can be turned down to keep all but the most urgent messages out.
Anticipating the need or use for a debugger is really no harder than using ipython while you're testing, and becoming familiar with the commands it uses to control the built-in pdb debugger.
When you find yourself thinking that a print statement might be easier than using pdb (as it often is), you'll find that using a logger leaves your program in a much easier state to work on than if you use and later remove print statements.
I have my editor configured to highlight print statements as syntax errors, and logging statements as comments, since that's about how I regard them.
In brief, the advantages of logging libraries outweigh print for the following reasons:
Control what’s emitted
Define what types of information you want to include in your logs
Configure how it looks when it’s emitted
Most importantly, set the destination for your logs
In detail, segmenting log events by severity level is a good way to sift through which log messages may be most relevant at a given time. A log event’s severity level also gives you an indication of how worried you should be when you see a particular message. For instance, you can divide log events into debug, info, warning, error, and critical. Timing can be everything when you’re trying to understand what went wrong with an application. You want to know the answers to questions like:
“Was this happening before or after my database connection died?”
“Exactly when did that request come in?”
Furthermore, it is easy to see where a log has occurred through line number and filename or method name even in which thread.
Here's a functional logging library for Python named loguru.
If you use logging then the person responsible for deployment can configure the logger to send it to a custom location, with custom information. If you only print, then that's all they get.
Logging essentially creates a searchable plain text database of print outputs with other meta data (timestamp, loglevel, line number, process etc.).
This is pure gold, I can run egrep over the log file after the python script has run.
I can tune my egrep pattern search to pick exactly what I am interested in and ignore the rest. This reduction of cognitive load and freedom to pick my egrep pattern later on by trial and error is the key benefit for me.
tail -f mylogfile.log | egrep "key_word1|key_word2"
Now throw in other cool things that print can't do (sending to socket, setting debug levels, logrotate, adding meta data etc.), you have every reason to prefer logging over plain print statements.
I tend to use print statements because it's lazy and easy, and adding logging needs some boilerplate code. But hey, we have yasnippets (Emacs), UltiSnips (Vim), and other templating tools, so why give up logging for plain print statements!?
I would add to all the other advantages mentioned that the print function is buffered in its standard configuration. The flush may occur only when the output buffer fills up or the program exits.
This is true for any program launched in a non-interactive shell (CodeBuild or GitLab CI, for instance) or whose output is redirected.
If for any reason the program is killed (kill -9, hard reset of the computer, …), you may be missing some lines of output if you used print for logging.
However, the logging library's stream handlers flush after every record, so logs written to stderr and stdout appear immediately.
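The difference can be reproduced without a real pipe. In this sketch an io.TextIOWrapper over a BytesIO plays the role of a redirected, block-buffered stdout; the variable names and the logger name buffer_demo are made up for the example.

```python
import io
import logging

raw = io.BytesIO()  # what has actually reached the "outside world"
# Without line buffering, small writes sit in the wrapper's internal
# buffer, much like stdout when it is redirected to a file or pipe.
buffered_stdout = io.TextIOWrapper(raw, encoding="utf-8")

print("printed", file=buffered_stdout)
after_print = raw.getvalue()  # the printed line is still pending

logger = logging.getLogger("buffer_demo")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(logging.StreamHandler(buffered_stdout))

logger.info("logged")
after_log = raw.getvalue()  # StreamHandler flushed the stream after emit
```

If you have to stay with print for a while, print(..., flush=True) forces the same immediate write for an individual call.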
