convert python print statements to logging

convert python print statements to logging - python

I'm working on a large codebase that uses print statements for logging rather than python logging. I'm wondering if there is a recommended for converting all these print statements to calls to logging.info? Many of these prints are spread over several lines and thus any solution needs to handle those cases and hopefully would maintain formatting.
I've looked into python rope but that doesn't seem to have the facility to convert calls to statement like print to a function call.

You could use 2to3 and only apply the fix for print statement -> print function.
2to3 --fix=print [yourfiles] # just displays the diff on stdout
2to3 --fix=print [yourfiles] --write # also saves the changes to disk
This should automatically handle all those strange cases, and then converting print functions to logging functions should be a straightforward find-and-replace with, e.g., sed.
If you don't have the shortcut for the 2to3 script for some reason, run lib2to3 as a module instead:
python -m lib2to3 --fix=print .

just add this few line before your code starts and it will log everything it prints. I think you are looking for something like this.
import logging
import sys
class writer :
def __init__(self, *writers) :
self.writers = writers
def write(self, text) :
logging.warning(text)
saved = sys.stdout
sys.stdout = writer(sys.stdout)
print "There you go."
print "There you go2."

Related

Apply function decorator on print function across all files without having to import and/or reapply?

Edit: My first attempt at asking this might be a bit unfocused/poorly worded here's a better explanation of what I'm trying to do:
I'm trying to modify the default behavior of the print function for the entire environment python is running in without having to modify each file that's being run.
I'm attempting to decorate the print function (I know there are many ways to do this such as overriding it but that's not really the question I'm asking) so I can have it print out some debugging information and force it to always flush. I did that like so:
def modify_print(func):
# I made this so that output always gets flushed as it won't by default
# within the environment I'm using, I also wanted it to print out some
# debugging information, doesn't really matter much in the context of this
# question
def modified_print(*args,**kwargs):
return func(f"some debug prefix: ",flush=True,*args,**kwargs)
return modified_print
print = modify_print(print)
print("Hello world") # Prints "some debug prefix Hello World"
However what I'm trying to do is modify this behavior throughout my entire application. I know I can manually decorate/override/import the print function in each file however I'm wondering if there is some way I can globally configure my python environment to decorate this function everywhere. The only way I can think to do this would be to edit the python source code and build the modified version.
EDIT:
Here's the behavior I wanted implemented, thank you Match for your help.
It prints out the line number and filename everywhere you call a print function within your python environment. This means you don't have to import or override anything manually in all of your files.
https://gist.github.com/MichaelScript/444cbe5b74dce2c01a151d60b714ac3a
import site
import os
import pathlib
# Big thanks to Match on StackOverflow for helping me with this
# see https://stackoverflow.com/a/48713998/5614280
# This is some cool hackery to overwrite the default functionality of
# the builtin print function within your entire python environment
# to display the file name and the line number as well as always flush
# the output. It works by creating a custom user script and placing it
# within the user's sitepackages file and then overwriting the builtin.
# You can disable this behavior by running python with the '-s' flag.
# We could probably swap this out by reading the text from a python file
# which would make it easier to maintain larger modifications to builtins
# or a set of files to make this more portable or to modify the behavior
# of more builtins for debugging purposes.
customize_script = """
from inspect import getframeinfo,stack
def debug_printer(func):
# I made this so that output always gets flushed as it won't by default
# within the environment I'm running it in. Also it will print the
# file name and line number of where the print occurs
def debug_print(*args,**kwargs):
frame = getframeinfo(stack()[1][0])
return func(f"{frame.filename} : {frame.lineno} ", flush=True,*args,**kwargs)
return debug_print
__builtins__['print'] = debug_printer(print)
"""
# Creating the user site dir if it doesn't already exist and writing our
# custom behavior modifications
pathlib.Path(site.USER_SITE).mkdir(parents=True, exist_ok=True)
custom_file = os.path.join(site.USER_SITE,"usercustomize.py")
with open(custom_file,'w+') as f:
f.write(customize_script)

You can use usercustomize script from the site module to achieve something like this.
First, find out where your user site-packages directory is:
python3 -c "import site; print(site.USER_SITE)"
/home/foo/.local/lib/python3.6/site-packages
Next, in that directory, create a script called usercustomize.py - this script will now be run first whenever python is run.
One* way to replace print is to override the __builtins__ dict and replace it with a new method - something like:
from functools import partial
old_print = __builtins__['print']
__builtins__['print'] = partial(old_print, "Debug prefix: ", flush=True)
Drop this into the usercustomize.py script and you should see all python scripts from then on being overridden. You can temporarily disable calling this script by calling python with the -s flag.
*(Not sure if this is the correct way of doing this - there may be a better way - but the main point is that you can use usercustomize to deliver whatever method you choose).

There's no real reason to define a decorator here, because you are only intending to apply it to a single, predetermined function. Just define your modified print function directly, wrapping it around __builtins__.print to avoid recursion.
def print(*args, **kwargs):
__builtins.__print(f"some debug prefix: ", flush=True, *args, **kwargs)
print("Hello world") # Prints "some debug prefix Hello World"
You can use functools.partial to simplify this.
import functools
print = functools.partial(__builtins.__print, f"some debug prefix: ", flush=True)

Deciphering large program flow in Python

I'm in the process of learning how a large (356-file), convoluted Python program is set up. Besides manually reading through and parsing the code, are there any good methods for following program flow?
There are two methods which I think would be useful:
Something similar to Bash's "set -x"
Something that displays which file outputs each line of output
Are there any methods to do the above, or any other ways that you have found useful?

I don't know if this is actually a good idea, but since I actually wrote a hook to display the file and line before each line of output to stdout, I might as well give it to you…
import inspect, sys
class WrapStdout(object):
_stdout = sys.stdout
def write(self, buf):
frame = sys._getframe(1)
try:
f = inspect.getsourcefile(frame)
except TypeError:
f = 'unknown'
l = frame.f_lineno
self._stdout.write('{}:{}:{}'.format(f, l, buf))
def flush(self):
self._stdout.flush()
sys.stdout = WrapStdout()
Just save that as a module, and after you import it, every chunk of stdout will be prefixed with file and line number.
Of course this will get pretty ugly if:
Anyone tries to print partial lines (using stdout.write directly, or print magic comma in 2.x, or end='' in 3.x).
You mix Unicode and non-Unicode in 2.x.
Any of the source files have long pathnames.
etc.
But all the tricky deep-Python-magic bits are there; you can build on top of it pretty easily.

Could be very tedious, but using a debugger to trace the flow of execution, instruction by instruction could probably help you to some extent.
import pdb
pdb.set_trace()

You could look for a cross reference program. There is an old program called pyxr that does this. The aim of cross reference is to let you know how classes refer to each other. Some of the IDE's also do this sort of thing.

I'd recommend running the program inside an IDE like pydev or pycharm. Being able to stop the program and inspect its state can be very helpful.

How do I export the output of Python's built-in help() function

I've got a python package which outputs considerable help text from: help(package)
I would like to export this help text to a file, in the format in which it's displayed by help(package)
How might I go about this?

pydoc.render_doc(thing) to get thing's help text as a string. Other parts of pydoc like pydoc.text and pydoc.html can help you write it to a file.
Using the -w modifier in linux will write the output to a html in the current directory, for example;
pydoc -w Rpi.GPIO
Puts all the help() text that would be presented from the command help(Rpi.GPIO) into a nicely formatted file Rpi.GPIO.html, in the current directory of the shell

This is a bit hackish (and there's probably a better solution somewhere), but this works:
import sys
import pydoc
def output_help_to_file(filepath, request):
f = open(filepath, 'w')
sys.stdout = f
pydoc.help(request)
f.close()
sys.stdout = sys.__stdout__
return
And then...
>>> output_help_to_file(r'test.txt', 're')

An old question but the newer recommended generic solution (for Python 3.4+) for writing the output of functions that print() to terminal is using contextlib.redirect_stdout:
import contextlib
def write_help(func, out_file):
with open(out_file, 'w') as f:
with contextlib.redirect_stdout(f):
help(func)
Usage example:
write_help(int, 'test.txt')

To get a "clean" text output, just as the built-in help() would deliver, and suitable for exporting to a file or anything else, you can use the following:
>>> import pydoc
>>> pydoc.render_doc(len, renderer=pydoc.plaintext)
'Python Library Documentation: built-in function len in module builtins\n\nlen(obj, /)\n Return the number of items in a container.\n'

If you do help(help) you'll see:
Help on _Helper in module site object:
class _Helper(__builtin__.object)
| Define the builtin 'help'.
| This is a wrapper around pydoc.help (with a twist).
[rest snipped]
So - you should be looking at the pydoc module - there's going to be a method or methods that return what help(something) does as a string...

Selected answer didn't work for me, so I did a little more searching and found something that worked on Daniweb. Credit goes to vegaseat. https://www.daniweb.com/programming/software-development/threads/20774/starting-python/8#post1306519
# simplified version of sending help() output to a file
import sys
# save present stdout
out = sys.stdout
fname = "help_print7.txt"
# set stdout to file handle
sys.stdout = open(fname, "w")
# run your help code
# its console output goes to the file now
help("print")
sys.stdout.close()
# reset stdout
sys.stdout = out

The simplest way to do that is via using
sys module
it opens a data stream between the operation system and it's self , it grab the data from the help module then save it in external file
file="str.txt";file1="list.txt"
out=sys.stdout
sys.stdout=open('str_document','w')
help(str)
sys.stdout.close

The cleanest way
Assuming help(os)
Step 1 - In Python Console
import pydoc
pydoc.render_doc(os, renderer=pydoc.plaintext)`
#this will display a string containing help(os) output
Step 2 - Copy string
Step 3 - On a Terminal
echo "copied string" | tee somefile.txt
If you want to write Class information in a text file. Follow below steps
Insert pdb hook somewhere in the Class and run file
import pdb; pdb.set_trace()
Perform step 1 to 3 stated above

In Windows, just open up a Windows Command Line window, go to the Lib subfolder of your Python installation, and type
python pydoc.py moduleName.memberName > c:\myFolder\memberName.txt
to put the documentation for the property or method memberName in moduleName into the file memberName.txt. If you want an object further down the hierarchy of the module, just put more dots. For example
python pydoc.py wx.lib.agw.ultimatelistctrl > c:\myFolder\UltimateListCtrl.txt
to put the documentation on the UltimateListCtrl control in the agw package in the wxPython package into UltimateListCtrl.txt.

pydoc already provides the needed feature, a very well-designed feature that all question-answering systems should have. The pydoc.Helper.init has an output object, all output being sent there. If you use your own output object, you can do whatever you want. For example:
class OUTPUT():
def __init__(self):
self.results = []
def write(self,text):
self.results += [text]
def flush(self):
pass
def print_(self):
for x in self.results: print(x)
def return_(self):
return self.results
def clear_(self):
self.results = []
when passed as
O = OUTPUT() # Necessarily to remember results, but see below.
help = pydoc.Helper(O)
will store all results in the OUTPUT instance. Of course, beginning with O = OUTPUT() is not the best idea (see below). render_doc is not the central output point; output is. I wanted OUTPUT so I could keep large outputs from disappearing from the screen using something like Mark Lutz' "More". A different OUTPUT would allow you to write to files.
You could also add a "return" to the end of the class pydoc.Helper to return the information you want. Something like:
if self.output_: return self.output_
should work, or
if self.output_: return self.output.return_()
All of this is possible because pydoc is well-designed. It is hidden because the definition of help leaves out the input and output arguments.

Using the command line we can get the output directly and pipe it to whatever is useful.
python -m pydoc ./my_module_file.py
-- the ./ is important, it tells pydoc to look at your local file and not attempt to import from somewhere else.
If you're on the mac you can pipe the output to pbcopy and paste it into a documentation tool of your choice.
python -m pydoc ./my_module_file.py | pbcopy

How can I strip Python logging calls without commenting them out?

Today I was thinking about a Python project I wrote about a year back where I used logging pretty extensively. I remember having to comment out a lot of logging calls in inner-loop-like scenarios (the 90% code) because of the overhead (hotshot indicated it was one of my biggest bottlenecks).
I wonder now if there's some canonical way to programmatically strip out logging calls in Python applications without commenting and uncommenting all the time. I'd think you could use inspection/recompilation or bytecode manipulation to do something like this and target only the code objects that are causing bottlenecks. This way, you could add a manipulator as a post-compilation step and use a centralized configuration file, like so:
[Leave ERROR and above]
my_module.SomeClass.method_with_lots_of_warn_calls
[Leave WARN and above]
my_module.SomeOtherClass.method_with_lots_of_info_calls
[Leave INFO and above]
my_module.SomeWeirdClass.method_with_lots_of_debug_calls
Of course, you'd want to use it sparingly and probably with per-function granularity -- only for code objects that have shown logging to be a bottleneck. Anybody know of anything like this?
Note: There are a few things that make this more difficult to do in a performant manner because of dynamic typing and late binding. For example, any calls to a method named debug may have to be wrapped with an if not isinstance(log, Logger). In any case, I'm assuming all of the minor details can be overcome, either by a gentleman's agreement or some run-time checking. :-)

What about using logging.disable?
I've also found I had to use logging.isEnabledFor if the logging message is expensive to create.

Use pypreprocessor
Which can also be found on PYPI (Python Package Index) and be fetched using pip.
Here's a basic usage example:
from pypreprocessor import pypreprocessor
pypreprocessor.parse()
#define nologging
#ifdef nologging
...logging code you'd usually comment out manually...
#endif
Essentially, the preprocessor comments out code the way you were doing it manually before. It just does it on the fly conditionally depending on what you define.
You can also remove all of the preprocessor directives and commented out code from the postprocessed code by adding 'pypreprocessor.removeMeta = True' between the import and
parse() statements.
The bytecode output (.pyc) file will contain the optimized output.
SideNote: pypreprocessor is compatible with python2x and python3k.
Disclaimer: I'm the author of pypreprocessor.

I've also seen assert used in this fashion.
assert logging.warn('disable me with the -O option') is None
(I'm guessing that warn always returns none.. if not, you'll get an AssertionError
But really that's just a funny way of doing this:
if __debug__: logging.warn('disable me with the -O option')
When you run a script with that line in it with the -O option, the line will be removed from the optimized .pyo code. If, instead, you had your own variable, like in the following, you will have a conditional that is always executed (no matter what value the variable is), although a conditional should execute quicker than a function call:
my_debug = True
...
if my_debug: logging.warn('disable me by setting my_debug = False')
so if my understanding of debug is correct, it seems like a nice way to get rid of unnecessary logging calls. The flipside is that it also disables all of your asserts, so it is a problem if you need the asserts.

As an imperfect shortcut, how about mocking out logging in specific modules using something like MiniMock?
For example, if my_module.py was:
import logging
class C(object):
def __init__(self, *args, **kw):
logging.info("Instantiating")
You would replace your use of my_module with:
from minimock import Mock
import my_module
my_module.logging = Mock('logging')
c = my_module.C()
You'd only have to do this once, before the initial import of the module.
Getting the level specific behaviour would be simple enough by mocking specific methods, or having logging.getLogger return a mock object with some methods impotent and others delegating to the real logging module.
In practice, you'd probably want to replace MiniMock with something simpler and faster; at the very least something which doesn't print usage to stdout! Of course, this doesn't handle the problem of module A importing logging from module B (and hence A also importing the log granularity of B)...
This will never be as fast as not running the log statements at all, but should be much faster than going all the way into the depths of the logging module only to discover this record shouldn't be logged after all.

You could try something like this:
# Create something that accepts anything
class Fake(object):
def __getattr__(self, key):
return self
def __call__(self, *args, **kwargs):
return True
# Replace the logging module
import sys
sys.modules["logging"] = Fake()
It essentially replaces (or initially fills in) the space for the logging module with an instance of Fake which simply takes in anything. You must run the above code (just once!) before the logging module is attempted to be used anywhere. Here is a test:
import logging
logging.basicConfig(level=logging.DEBUG,
format='%(asctime)s %(levelname)-8s %(message)s',
datefmt='%a, %d %b %Y %H:%M:%S',
filename='/temp/myapp.log',
filemode='w')
logging.debug('A debug message')
logging.info('Some information')
logging.warning('A shot across the bows')
With the above, nothing at all was logged, as was to be expected.

I'd use some fancy logging decorator, or a bunch of them:
def doLogging(logTreshold):
def logFunction(aFunc):
def innerFunc(*args, **kwargs):
if LOGLEVEL >= logTreshold:
print ">>Called %s at %s"%(aFunc.__name__, time.strftime("%H:%M:%S"))
print ">>Parameters: ", args, kwargs if kwargs else ""
try:
return aFunc(*args, **kwargs)
finally:
print ">>%s took %s"%(aFunc.__name__, time.strftime("%H:%M:%S"))
return innerFunc
return logFunction
All you need is to declare LOGLEVEL constant in each module (or just globally and just import it in all modules) and then you can use it like this:
#doLogging(2.5)
def myPreciousFunction(one, two, three=4):
print "I'm doing some fancy computations :-)"
return
And if LOGLEVEL is no less than 2.5 you'll get output like this:
>>Called myPreciousFunction at 18:49:13
>>Parameters: (1, 2)
I'm doing some fancy computations :-)
>>myPreciousFunction took 18:49:13
As you can see, some work is needed for better handling of kwargs, so the default values will be printed if they are present, but that's another question.
You should probably use some logger module instead of raw print statements, but I wanted to focus on the decorator idea and avoid making code too long.
Anyway - with such decorator you get function-level logging, arbitrarily many log levels, ease of application to new function, and to disable logging you only need to set LOGLEVEL. And you can define different output streams/files for each function if you wish. You can write doLogging as:
def doLogging(logThreshold, outStream=sys.stdout):
.....
print >>outStream, ">>Called %s at %s" etc.
And utilize log files defined on a per-function basis.

This is an issue in my project as well--logging ends up on profiler reports pretty consistently.
I've used the _ast module before in a fork of PyFlakes (http://github.com/kevinw/pyflakes) ... and it is definitely possible to do what you suggest in your question--to inspect and inject guards before calls to logging methods (with your acknowledged caveat that you'd have to do some runtime type checking). See http://pyside.blogspot.com/2008/03/ast-compilation-from-python.html for a simple example.
Edit: I just noticed MetaPython on my planetpython.org feed--the example use case is removing log statements at import time.
Maybe the best solution would be for someone to reimplement logging as a C module, but I wouldn't be the first to jump at such an...opportunity :p

:-) We used to call that a preprocessor and although C's preprocessor had some of those capablities, the "king of the hill" was the preprocessor for IBM mainframe PL/I. It provided extensive language support in the preprocessor (full assignments, conditionals, looping, etc.) and it was possible to write "programs that wrote programs" using just the PL/I PP.
I wrote many applications with full-blown sophisticated program and data tracing (we didn't have a decent debugger for a back-end process at that time) for use in development and testing which then, when compiled with the appropriate "runtime flag" simply stripped all the tracing code out cleanly without any performance impact.
I think the decorator idea is a good one. You can write a decorator to wrap the functions that need logging. Then, for runtime distribution, the decorator is turned into a "no-op" which eliminates the debugging statements.
Jon R

I am doing a project currently that uses extensive logging for testing logic and execution times for a data analysis API using the Pandas library.
I found this string with a similar concern - e.g. what is the overhead on the logging.debug statements even if the logging.basicConfig level is set to level=logging.WARNING
I have resorted to writing the following script to comment out or uncomment the debug logging prior to deployment:
import os
import fileinput
comment = True
# exclude files or directories matching string
fil_dir_exclude = ["__","_archive",".pyc"]
if comment :
## Variables to comment
source_str = 'logging.debug'
replace_str = '#logging.debug'
else :
## Variables to uncomment
source_str = '#logging.debug'
replace_str = 'logging.debug'
# walk through directories
for root, dirs, files in os.walk('root/directory') :
# where files exist
if files:
# for each file
for file_single in files :
# build full file name
file_name = os.path.join(root,file_single)
# exclude files with matching string
if not any(exclude_str in file_name for exclude_str in fil_dir_exclude) :
# replace string in line
for line in fileinput.input(file_name, inplace=True):
print "%s" % (line.replace(source_str, replace_str)),
This is a file recursion that excludes files based on a list of criteria and performs an in place replace based on an answer found here: Search and replace a line in a file in Python

I like the 'if __debug_' solution except that putting it in front of every call is a bit distracting and ugly. I had this same problem and overcame it by writing a script which automatically parses your source files and replaces logging statements with pass statements (and commented out copies of the logging statements). It can also undo this conversion.
I use it when I deploy new code to a production environment when there are lots of logging statements which I don't need in a production setting and they are affecting performance.
You can find the script here: http://dound.com/2010/02/python-logging-performance/

What is the best way to toggle python prints?

I'm running Python 2.4 in a game engine and I want to be able to turn off all prints if needed. For example I'd like to have the prints on for a debug build, and then turned off for a release build.
It's also imperative that it's as transparent as possible.
My solution to this in the C code of the engine is having the printf function inside a vararg macro, and defining that to do nothing in a release build.
This is my current solution:
DebugPrints = True
def PRINT (*args):
global DebugPrints
if DebugPrints:
string = ""
for arg in args:
string += " " + str(arg)
print string
It makes it easy to toggle print outs, but there is possibly a better way to format the string. My main issue is that this is actually adding a lot more function calls to the program.
I'm wondering if there is anything you can do to how the print keyword works?

yes, you can assign sys.stdout to whatever you want. Create a little class with a write method that does nothing:
class DevNull(object):
def write(self, arg):
pass
import sys
sys.stdout = DevNull()
print "this goes to nirvana!"
With the same technique you can also have your prints logged to a file by setting sys.stdout to an opened file object.

I know an answer has already been marked as correct, but Python has a debug flag that provides a cleaner solution. You use it like this:
if __debug__:
print "whoa"
If you invoke Python with -O or -OO (as you normally would for a release build), __debug__ is set to False. What's even better is that __debug__ is a special case for the interpreter; it will actually strip out that code when it writes the pyc/pyo files, making the resulting code smaller/faster. Note that you can't assign values to __debug__, so it's entirely based off those command-line arguments.

The logging module is the "best" way.. although I quite often just use something simple like..
class MyLogger:
def _displayMessage(self, message, level = None):
# This can be modified easily
if level is not None:
print "[%s] %s" % (level, message
else:
print "[default] %s" % (message)
def debug(self, message):
self._displayMessage(message, level = "debug")
def info(self, message):
self._displayMessage(message, level = "info")
log = MyLogger()
log.info("test")

Don't use print, but make a console class which handles all printing. Simply make calls to the console and the console can decide whether or not to actually print them. A console class is also useful for things like error and warning messages or redirecting where the output goes.

If you really want to toggle printing:
>>> import sys
>>> import os
>>> print 'foo'
foo
>>> origstdout = sys.stdout
>>> sys.stdout = open(os.devnull, 'w')
>>> print 'foo'
>>> sys.stdout = origstdout
>>> print 'foo'
foo
However, I recommend you only use print for throwaway code. Use logging for real apps like the original question describes. It allows for a sliding scale of verbosity, so you can have only the important logging, or more verbose logging of usually less important details.
>>> import logging
>>> logging.basicConfig(level=logging.DEBUG)
>>> logging.debug('foo')
DEBUG:root:foo
For example, usage might be to put debugging logs throughout your code. Then to silence them, change your level to a higher level, one of INFO, WARN or WARNING, ERROR, or CRITICAL or FATAL
>>> logging.root.setLevel(logging.INFO)
>>> logging.debug('foo')
>>>
In a script you'll want to just set this in the basicConfig like this:
import logging
logging.basicConfig(level=logging.INFO)
logging.debug('foo') # this will be silent
More sophisticated usage of logging can be found in the docs.

I've answered this question on a different post but was the question was marked duplicate:
anyhow, here's what I would do:
from __future__ import print_function
DebugPrints = 0
def print(*args, **kwargs):
if DebugPrints:
return __builtins__.print(*args, **kwargs)
print('foo') # doesn't get printed
DebugPrints = 1
print('bar') # gets printed
sadly you can't keep the py2 print syntax print 'foo'

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.