I have a Python script that runs 24hrs a day.
A module from this script is using variables values that I wish to change from time to time, without having to stop the script, edit the module file, then launch the script again (I need to avoid interruptions as much as I can).
I thought about storing the variables in a separate file, and the module would, when needed, fetch the new values from the file and use them.
Pickle seemed a solution but is not human readable and therefore not easily changeable. Maybe a JSON file, or another .py file I import over again ?
Another advantage of doing so, for me, is that in case of interruption (eg. server restart), I can resume the script with the latest variable values if I load them from a separate file.
Is there a recommended way of doing such things ?
Something along the lines :
# variables file:
variable1 = 10
variable2 = 25
# main file:
while True:
import variables
print('Sum:', str(variable1+variable2))
time.sleep(60)
An easy way to maintain a text file with variables would be the YAML format. This answer explains how to use it, basically:
import yaml
stream = open("vars.yaml", "r")
docs = yaml.load_all(stream)
If you have more than a few variables, it may be good to check the file descriptor to see if the file was recently updated, and only re-load variables when there was a change in the file.
import os
last_updated = os.path.getmtime('vars.yaml')
Finally, since you want avoid interruption of the script, it may be good to have the script catch any errors in the YAML file and warn the user, instead of just throwing an exception and die. But also remember that "errors should never pass silently". What is the best approach here would depend on your use-case.
I seem to be running into a problem when I am logging data after invoking another module in an application I am working on. I'd like assistance in understanding what may be happening here.
To replicate the issue, I have developed the following script...
#!/usr/bin/python
import sys
import logging
from oletools.olevba import VBA_Parser, VBA_Scanner
from cloghandler import ConcurrentRotatingFileHandler
# set up logger for application
dbg_h = logging.getLogger('dbg_log')
dbglog = '%s' % 'dbg.log'
dbg_rotateHandler = ConcurrentRotatingFileHandler(dbglog, "a")
dbg_h.addHandler(dbg_rotateHandler)
dbg_h.setLevel(logging.ERROR)
# read some document as a buffer
buff = sys.stdin.read()
# generate issue
dbg_h.error('Before call to module....')
vba = VBA_Parser('None', data=buff)
dbg_h.error('After call to module....')
When I run this, I get the following...
cat somedocument.doc | ./replicate.py
ERROR:dbg_log:After call to module....
For some reason, my last dbg_h logger write attempt is getting output to the console as well as getting written to my dbg.log file? This only appears to happen AFTER the call to VBA_Parser.
cat dbg.log
Before call to module....
After call to module....
Anyone have any idea as to why this might be happening? I reviewed the source code of olevba and did not see anything that stuck out to me specifically.
Could this be a problem I should raise with the module author? Or am I doing something wrong with how I am using the cloghandler?
The oletools codebase is littered with calls to the root logger though calls to logging.debug(...), logging.error(...), and so on. Since the author didn't bother to configure the root logger, the default behavior is to dump to sys.stderr. Since sys.stderr defaults to the console when running from the command line, you get what you're seeing.
You should contact the author of oletools since they're not using the logging system effectively. Ideally they would use a named logger and push the messages to that logger. As a work-around to suppress the messages you could configure the root logger to use your handler.
# Set a handler
logger.root.addHandler(dbg_rotateHandler)
Be aware that this may lead to duplicated log messages.
I am new to Python. I'm using Vim with Python-mode to edit and test my code and noticed that if I run the same code more than once, the logging file will only be updated the first time the code is run. For example below is a piece of code called "testlogging.py"
#!/usr/bin/python2.7
import os.path
import logging
filename_logging = os.path.join(os.path.dirname(__file__), "testlogging.log")
logging.basicConfig(filename=filename_logging, filemode='w',
level=logging.DEBUG)
logging.info('Aye')
If I open a new gvim session and run this code with python-mode, then I would get a file called "testlogging.log" with the content
INFO:root:Aye
Looks promising! But if I delete the log file and run the code in pythong-mode again, then the log file won't be re-created. If at this time I run the code in a terminal like this
./testlogging.py
Then the log file would be generated again!
I checked with Python documentation and noticed this line in the logging tutorial (https://docs.python.org/2.7/howto/logging.html#logging-to-a-file):
A very common situation is that of recording logging events in a file, so let’s look at that next. Be sure to try the following in a newly-started Python interpreter, and don’t just continue from the session described above:...
So I guess this problem with logging file only updated once has something to do with python-mode remaining in the same interpreter when I run a code a second time. So my question is: is there anyway to solve this problem, by fiddling with the logging module, by putting something in the code to reset the interpreter or by telling Python-mode to reset it?
I am also curious why the logging module requires a newly-started interpreter to work...
Thanks for your help in advance.
The log file is not recreated because the logging module still has the old log file open and will continue to write to it (even though you've deleted it). The solution is to force the logging module to release all acquired resources before running your code again in the same interpreter:
# configure logging
# log a message
# delete log file
logging.shutdown()
# configure logging
# log a message (log file will be recreated)
In other words, call logging.shutdown() at the end of your code and then you can re-run it within the same interpreter and it will work as expected (re-creating the log file on every run).
You opened log file with "w" mode. "w" means write data from begining, so you saw only log of last execution.
This is why, you see same contents on log file.
You should correct your code on fifth to seventh line to follows.
filename_logging = os.path.join(os.path.dirname(__file__), "testlogging.log")
logging.basicConfig(filename=filename_logging, filemode='a',
level=logging.DEBUG)
Code above use "a" mode, i.e. append mode, so new log data will be added at the end of log file.
I want to modify the python logging to log the time elapsed from when the script started (or when the logging module was initialized).
000:00 DEBUG bla bla
000:01 ERROR wow! ... this is measured in minutes:seconds
I found the relativeCreated variable in the logging module, but this is givving me millisecond accuracy which would only spam the logs, making harder to see where the time goes or how log it took to run.
How can I do this?
You can use a logging.Formatter subclass which gets the relativeCreated from the LogRecord passed to its format method, format it as mins:secs and output it in the format string e.g. using a different placeholder in the format string, such as %(adjustedTime)s:
class MyFormatter(logging.Formatter):
def format(self, record):
record.adjustedTime = format_as_mins_and_secs(record.relativeCreated)
return super(MyFormatter, self).format(record)
where you can define format_as_mins_and_secs to return the formatted relative time.
First that pops up in my mind .
Calculate time delta between time script started and time that is now.
Use datetime module for that (do calc to epoh secs and back to timestamp) .
Insert results by loggin module into message section.
I do not put any code for now to inspire you to read more about modules mentioned ).
I'm not sure you can do this using the logging python module (I could be wrong).
If you need the run-time of the application, though, I would say you should capture the time at application start in a variable (using the time module). Once the time has been captured, you can then pass that variable to the logger to calculate the difference when hitting an error.
Today I was thinking about a Python project I wrote about a year back where I used logging pretty extensively. I remember having to comment out a lot of logging calls in inner-loop-like scenarios (the 90% code) because of the overhead (hotshot indicated it was one of my biggest bottlenecks).
I wonder now if there's some canonical way to programmatically strip out logging calls in Python applications without commenting and uncommenting all the time. I'd think you could use inspection/recompilation or bytecode manipulation to do something like this and target only the code objects that are causing bottlenecks. This way, you could add a manipulator as a post-compilation step and use a centralized configuration file, like so:
[Leave ERROR and above]
my_module.SomeClass.method_with_lots_of_warn_calls
[Leave WARN and above]
my_module.SomeOtherClass.method_with_lots_of_info_calls
[Leave INFO and above]
my_module.SomeWeirdClass.method_with_lots_of_debug_calls
Of course, you'd want to use it sparingly and probably with per-function granularity -- only for code objects that have shown logging to be a bottleneck. Anybody know of anything like this?
Note: There are a few things that make this more difficult to do in a performant manner because of dynamic typing and late binding. For example, any calls to a method named debug may have to be wrapped with an if not isinstance(log, Logger). In any case, I'm assuming all of the minor details can be overcome, either by a gentleman's agreement or some run-time checking. :-)
What about using logging.disable?
I've also found I had to use logging.isEnabledFor if the logging message is expensive to create.
Use pypreprocessor
Which can also be found on PYPI (Python Package Index) and be fetched using pip.
Here's a basic usage example:
from pypreprocessor import pypreprocessor
pypreprocessor.parse()
#define nologging
#ifdef nologging
...logging code you'd usually comment out manually...
#endif
Essentially, the preprocessor comments out code the way you were doing it manually before. It just does it on the fly conditionally depending on what you define.
You can also remove all of the preprocessor directives and commented out code from the postprocessed code by adding 'pypreprocessor.removeMeta = True' between the import and
parse() statements.
The bytecode output (.pyc) file will contain the optimized output.
SideNote: pypreprocessor is compatible with python2x and python3k.
Disclaimer: I'm the author of pypreprocessor.
I've also seen assert used in this fashion.
assert logging.warn('disable me with the -O option') is None
(I'm guessing that warn always returns none.. if not, you'll get an AssertionError
But really that's just a funny way of doing this:
if __debug__: logging.warn('disable me with the -O option')
When you run a script with that line in it with the -O option, the line will be removed from the optimized .pyo code. If, instead, you had your own variable, like in the following, you will have a conditional that is always executed (no matter what value the variable is), although a conditional should execute quicker than a function call:
my_debug = True
...
if my_debug: logging.warn('disable me by setting my_debug = False')
so if my understanding of debug is correct, it seems like a nice way to get rid of unnecessary logging calls. The flipside is that it also disables all of your asserts, so it is a problem if you need the asserts.
As an imperfect shortcut, how about mocking out logging in specific modules using something like MiniMock?
For example, if my_module.py was:
import logging
class C(object):
def __init__(self, *args, **kw):
logging.info("Instantiating")
You would replace your use of my_module with:
from minimock import Mock
import my_module
my_module.logging = Mock('logging')
c = my_module.C()
You'd only have to do this once, before the initial import of the module.
Getting the level specific behaviour would be simple enough by mocking specific methods, or having logging.getLogger return a mock object with some methods impotent and others delegating to the real logging module.
In practice, you'd probably want to replace MiniMock with something simpler and faster; at the very least something which doesn't print usage to stdout! Of course, this doesn't handle the problem of module A importing logging from module B (and hence A also importing the log granularity of B)...
This will never be as fast as not running the log statements at all, but should be much faster than going all the way into the depths of the logging module only to discover this record shouldn't be logged after all.
You could try something like this:
# Create something that accepts anything
class Fake(object):
def __getattr__(self, key):
return self
def __call__(self, *args, **kwargs):
return True
# Replace the logging module
import sys
sys.modules["logging"] = Fake()
It essentially replaces (or initially fills in) the space for the logging module with an instance of Fake which simply takes in anything. You must run the above code (just once!) before the logging module is attempted to be used anywhere. Here is a test:
import logging
logging.basicConfig(level=logging.DEBUG,
format='%(asctime)s %(levelname)-8s %(message)s',
datefmt='%a, %d %b %Y %H:%M:%S',
filename='/temp/myapp.log',
filemode='w')
logging.debug('A debug message')
logging.info('Some information')
logging.warning('A shot across the bows')
With the above, nothing at all was logged, as was to be expected.
I'd use some fancy logging decorator, or a bunch of them:
def doLogging(logTreshold):
def logFunction(aFunc):
def innerFunc(*args, **kwargs):
if LOGLEVEL >= logTreshold:
print ">>Called %s at %s"%(aFunc.__name__, time.strftime("%H:%M:%S"))
print ">>Parameters: ", args, kwargs if kwargs else ""
try:
return aFunc(*args, **kwargs)
finally:
print ">>%s took %s"%(aFunc.__name__, time.strftime("%H:%M:%S"))
return innerFunc
return logFunction
All you need is to declare LOGLEVEL constant in each module (or just globally and just import it in all modules) and then you can use it like this:
#doLogging(2.5)
def myPreciousFunction(one, two, three=4):
print "I'm doing some fancy computations :-)"
return
And if LOGLEVEL is no less than 2.5 you'll get output like this:
>>Called myPreciousFunction at 18:49:13
>>Parameters: (1, 2)
I'm doing some fancy computations :-)
>>myPreciousFunction took 18:49:13
As you can see, some work is needed for better handling of kwargs, so the default values will be printed if they are present, but that's another question.
You should probably use some logger module instead of raw print statements, but I wanted to focus on the decorator idea and avoid making code too long.
Anyway - with such decorator you get function-level logging, arbitrarily many log levels, ease of application to new function, and to disable logging you only need to set LOGLEVEL. And you can define different output streams/files for each function if you wish. You can write doLogging as:
def doLogging(logThreshold, outStream=sys.stdout):
.....
print >>outStream, ">>Called %s at %s" etc.
And utilize log files defined on a per-function basis.
This is an issue in my project as well--logging ends up on profiler reports pretty consistently.
I've used the _ast module before in a fork of PyFlakes (http://github.com/kevinw/pyflakes) ... and it is definitely possible to do what you suggest in your question--to inspect and inject guards before calls to logging methods (with your acknowledged caveat that you'd have to do some runtime type checking). See http://pyside.blogspot.com/2008/03/ast-compilation-from-python.html for a simple example.
Edit: I just noticed MetaPython on my planetpython.org feed--the example use case is removing log statements at import time.
Maybe the best solution would be for someone to reimplement logging as a C module, but I wouldn't be the first to jump at such an...opportunity :p
:-) We used to call that a preprocessor and although C's preprocessor had some of those capablities, the "king of the hill" was the preprocessor for IBM mainframe PL/I. It provided extensive language support in the preprocessor (full assignments, conditionals, looping, etc.) and it was possible to write "programs that wrote programs" using just the PL/I PP.
I wrote many applications with full-blown sophisticated program and data tracing (we didn't have a decent debugger for a back-end process at that time) for use in development and testing which then, when compiled with the appropriate "runtime flag" simply stripped all the tracing code out cleanly without any performance impact.
I think the decorator idea is a good one. You can write a decorator to wrap the functions that need logging. Then, for runtime distribution, the decorator is turned into a "no-op" which eliminates the debugging statements.
Jon R
I am doing a project currently that uses extensive logging for testing logic and execution times for a data analysis API using the Pandas library.
I found this string with a similar concern - e.g. what is the overhead on the logging.debug statements even if the logging.basicConfig level is set to level=logging.WARNING
I have resorted to writing the following script to comment out or uncomment the debug logging prior to deployment:
import os
import fileinput
comment = True
# exclude files or directories matching string
fil_dir_exclude = ["__","_archive",".pyc"]
if comment :
## Variables to comment
source_str = 'logging.debug'
replace_str = '#logging.debug'
else :
## Variables to uncomment
source_str = '#logging.debug'
replace_str = 'logging.debug'
# walk through directories
for root, dirs, files in os.walk('root/directory') :
# where files exist
if files:
# for each file
for file_single in files :
# build full file name
file_name = os.path.join(root,file_single)
# exclude files with matching string
if not any(exclude_str in file_name for exclude_str in fil_dir_exclude) :
# replace string in line
for line in fileinput.input(file_name, inplace=True):
print "%s" % (line.replace(source_str, replace_str)),
This is a file recursion that excludes files based on a list of criteria and performs an in place replace based on an answer found here: Search and replace a line in a file in Python
I like the 'if __debug_' solution except that putting it in front of every call is a bit distracting and ugly. I had this same problem and overcame it by writing a script which automatically parses your source files and replaces logging statements with pass statements (and commented out copies of the logging statements). It can also undo this conversion.
I use it when I deploy new code to a production environment when there are lots of logging statements which I don't need in a production setting and they are affecting performance.
You can find the script here: http://dound.com/2010/02/python-logging-performance/