I seem to be running into a problem when I am logging data after invoking another module in an application I am working on. I'd like assistance in understanding what may be happening here.
To replicate the issue, I have developed the following script...
#!/usr/bin/python
import sys
import logging
from oletools.olevba import VBA_Parser, VBA_Scanner
from cloghandler import ConcurrentRotatingFileHandler
# set up logger for application
dbg_h = logging.getLogger('dbg_log')
dbglog = 'dbg.log'
dbg_rotateHandler = ConcurrentRotatingFileHandler(dbglog, "a")
dbg_h.addHandler(dbg_rotateHandler)
dbg_h.setLevel(logging.ERROR)
# read some document as a buffer
buff = sys.stdin.read()
# generate issue
dbg_h.error('Before call to module....')
vba = VBA_Parser('None', data=buff)
dbg_h.error('After call to module....')
When I run this, I get the following...
cat somedocument.doc | ./replicate.py
ERROR:dbg_log:After call to module....
For some reason, my last dbg_h logger write attempt is being output to the console as well as being written to my dbg.log file. This only appears to happen AFTER the call to VBA_Parser.
cat dbg.log
Before call to module....
After call to module....
Anyone have any idea as to why this might be happening? I reviewed the source code of olevba and did not see anything that stuck out to me specifically.
Could this be a problem I should raise with the module author? Or am I doing something wrong with how I am using the cloghandler?
The oletools codebase is littered with calls to the root logger through calls to logging.debug(...), logging.error(...), and so on. Since the author didn't configure the root logger, the default behavior is to dump to sys.stderr, and since sys.stderr defaults to the console when running from the command line, you get what you're seeing.
You should contact the author of oletools since they're not using the logging system effectively. Ideally they would use a named logger and push the messages to that logger. As a work-around to suppress the messages you could configure the root logger to use your handler.
# Attach your handler to the root logger
logging.root.addHandler(dbg_rotateHandler)
Be aware that this may lead to duplicated log messages.
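Here is a self-contained sketch of that work-around; a plain FileHandler stands in for the ConcurrentRotatingFileHandler from the question:

```python
import logging

# Give the root logger a real handler so oletools' module-level
# logging.error(...) calls go to the file instead of falling back
# to stderr. FileHandler stands in for ConcurrentRotatingFileHandler.
root = logging.getLogger()
root.addHandler(logging.FileHandler('dbg.log'))
root.setLevel(logging.ERROR)  # drop oletools chatter below ERROR

logging.error('this now lands in dbg.log, not on the console')
```

Because your own dbg_h logger propagates to the root logger, this is also where the duplicated messages mentioned above can come from.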
Related
As recommended by Sentry's docs [1][2] for their new unified python sdk (sentry_sdk), I've configured it with my Django application to capture events on all exceptions or "error"-level logs:
import sentry_sdk
import logging
from sentry_sdk.integrations.django import DjangoIntegration
from sentry_sdk.integrations.logging import LoggingIntegration
sentry_logging = LoggingIntegration(
    level=logging.DEBUG,
    event_level=logging.ERROR
)
sentry_sdk.init(
    dsn="{{sentry_dsn}}",
    integrations=[DjangoIntegration(), sentry_logging]
)
However, since this hooks directly into python's logging module and internal exception handling, it means anything that uses this Django environment will be shipping events to sentry. There are some tasks (such as interactive manage.py commands, or work in the REPL) that need the Django environment, but for which I don't want events created in Sentry.
Is there a way to indicate to sentry that I'd like it to not capture events from exceptions or logging calls for the current task? Or a way to temporarily disable it after it's been globally configured?
You can run sentry_sdk.init() (notably without a DSN) again to disable the SDK.
The accepted answer to call init with no args did not work for me.
I had to explicitly pass an empty DSN:
import sentry_sdk
sentry_sdk.init(dsn="")
Tested with sentry_sdk 0.19.1:
>>> import sentry_sdk
>>> sentry_sdk.Hub.current.client.dsn
"https://...redacted...#sentry.io/..."
>>> sentry_sdk.init(dsn="")
>>> sentry_sdk.Hub.current.client.dsn
''
After calling init with no args, the DSN remained unchanged.
Maybe there’s a better way, but in any file you can import logging and disable it like so: logging.disable(logging.CRITICAL). This disables logging at or below the given level; since CRITICAL is the highest level, it disables all logging.
According to the Sentry docs for .NET (https://getsentry.github.io/sentry-dotnet/api/Sentry.SentrySdk.html#Sentry_SentrySdk_Init_System_String_), "An empty string is interpreted as a disabled SDK". Yes, I know this is a Python question, but their API is generally consistent across languages.
Very late to the party, but for anyone that comes to this from April 2021 you can simply:
import sentry_sdk
sentry_sdk.Hub.current.stop_auto_session_tracking()
The call to stop_auto_session_tracking() will stop Sentry errors as long as you place it (1) after sentry_sdk.init and (2) before any errors occur in your application.
I have a concurrently-run program where I want to create a log for each child process. I'll first describe my setup and then the issue I'm facing. Here are my primary modules:
mp_handler.py:
import logging
import multiprocessing as mp
def mp_handler(target, args_list):
    # configure a logger and file handler for each child process
    for args in args_list:
        logger_id = args[0]  # first arg suffices to id a process, in my case
        logger = logging.getLogger(logger_id)
        handler = logging.FileHandler(logger_id + '.log')
        logger.setLevel(logging.INFO)
        logger.addHandler(handler)
    mp.set_start_method('spawn')  # bug fix, see below
    # build each process
    for args in args_list:
        p = mp.Process(target=target, args=args)
        p.start()
mp_worker.py:
import logging
from deco_module import deco
from my_module import function_with_open_cv
@deco
def mp_worker(args):
    logger_id = args[0]
    logger = logging.getLogger(logger_id)
    logger.info("Information about process %s" % logger_id)
    # do a lot of stuff with openCV3
    function_with_open_cv(args)  # also logs to this child's log file
deco_module.py: this module does some exception handling. I have no idea why it might interfere, but I figured I'd include it just in case.
from functools import wraps
import logging
def deco(function):
    @wraps(function)
    def wrapper(*args):
        logger_id = args[0]
        logger = logging.getLogger(logger_id)
        try:
            function(*args)
        except Exception:
            logger.info('a message in case the child fails.')
    return wrapper
Now, on to my issue. I was getting the error described in this post: https://github.com/opencv/opencv/issues/5150. Hence, I wrote the mp.set_start_method('spawn') line in mp_handler().
After debugging, however, I found that that line was causing the logger = logging.getLogger(logger_id) line in mp_worker() to create a NEW logger as opposed to getting the one created in the parent, i.e. mp_handler(). I was able to see this by printing hex(id(logger)) in both the parent and child modules and observing that the locations in memory are different. Indeed, writing mp.set_start_method('fork') avoids this issue (this makes rough sense to me, as my understanding is that spawn creates a new interpreter, and with it a new space for the logger).
main problem: So, the problem is, how do I work around the fact that I need the start method to be set to 'spawn' for the sake of OpenCV but need to toggle it off in order for log communication between modules (i.e. in order for mp_worker to recognize its correct logger_id in order to log to the correct file)? As part of good practice, I want to keep all logging configs out of the children and submodules alike.
secondary problem: supposing I ignore the fact that I need OpenCV and set the method to 'fork'. In this case I noticed that none of the logging.info() statements in the function_with_open_cv() function ever get to the log! So, supposing your recommendation does involve setting it to fork, what is the work-around here? EDIT: FIXED! This was also being caused by OpenCV. So the problem still stands... how do I use a spawn process and not lose my logger ID?
Thank you so much!
You shouldn't configure logging before a process is spawned, but after. See the documentation for an example of how to do it correctly. This applies to Python 3, but if you need to run it under Python 2, you can use the logutils package, which provides QueueListener and QueueHandler classes.
The logging cookbook contains more example code relating to using logging with multiprocessing.
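A minimal sketch of the cookbook pattern (the worker function and two-process setup are illustrative, not taken from the question): the parent runs a QueueListener that owns the real file handler, and each spawned child configures only a QueueHandler pointing at a shared queue, so no handler setup needs to survive the spawn.

```python
import logging
import logging.handlers
import multiprocessing as mp

def worker(queue, name):
    # Runs in the spawned child: the only configuration it needs is a
    # QueueHandler that forwards records back over the shared queue.
    root = logging.getLogger()
    root.addHandler(logging.handlers.QueueHandler(queue))
    root.setLevel(logging.INFO)
    logging.getLogger(name).info("hello from %s", name)

if __name__ == "__main__":
    mp.set_start_method("spawn")
    queue = mp.Queue()
    # The listener runs in the parent and owns the real handler; it could
    # just as well dispatch to one FileHandler per logger name.
    listener = logging.handlers.QueueListener(
        queue, logging.FileHandler("children.log"))
    listener.start()
    procs = [mp.Process(target=worker, args=(queue, "proc%d" % i))
             for i in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    listener.stop()
```

Because the children never touch file handlers directly, this also keeps all logging configuration out of the worker modules, as the question asks.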
I have written a small piece of code in Python, I was trying to run the code from the command line as follows
python -c MyClass.py
The contents of MyClass.py are as follows:
#!/usr/bin/python
import logging

class MyClass():
    data = None
    logger = logging.getLogger(__name__)

    def __init__(self, data):
        self.data = data

if __name__ == "__main__":
    a = MyClass("Data")
    a.logger.info("This is message")
The above code fails with the following error
Traceback (most recent call last):
File "<string>", line 1, in <module>
NameError: name 'MyClass' is not defined
Mind you, when I ran this from the Python IDE, IDLE, I saw no issues and the code ran as expected, so I am confused as to what I am missing. I saw several variants of this question, but the problem I am facing seems specific to me, so please help.
The -c flag is to run the passed string as Python code. python -c MyClass.py is equivalent to opening a fresh interpreter and entering MyClass.py. When you do that, MyClass isn't defined, so it fails. To run a script from your system's terminal, simply use python MyClass.py.
The reason your code fails is that you are using wrong command line option:
-c command
Specify the command to execute (see next section).
This terminates the option list (following options
are passed as arguments to the command).
So, what you give after -c is evaluated as a statement, like python -c "print('Hello!')". So, with -c MyClass.py you are literally trying to access an attribute named py from an object named MyClass, which is likely not what you are trying to do.
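To make the failure concrete, here is what that string evaluates to (a hypothetical snippet, not part of the original question):

```python
# What `python -c MyClass.py` actually executes: the string "MyClass.py"
# as Python source, i.e. look up the name MyClass, then its .py attribute.
try:
    eval("MyClass.py")
except NameError as e:
    print(e)  # name 'MyClass' is not defined
```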
To execute source code from file MyClass.py, you should omit -c: python MyClass.py.
However, your code won't work either because of two reasons:
1. You are using non-configured logger instance
2. Your logging level is way too low to notice the error.
For example, replace logger.info with logger.warning — you will immediately get a No handlers could be found for logger "__main__" error. Searching the reference for this message leads us to section 15.6.1.6, Configuring Logging for a Library. The reason is that you are using a logger that has not been configured, so it has no idea what to do with the generated messages.
To address this, in your main project file you can configure your logger as suggested in section 15.6.1.5. Configuring Logging, using either your own handler, or reusing some library-provided shortcuts for common tasks, as per section 15.6.3. Useful Handlers.
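For instance, a minimal configuration in the entry-point file (the format string here is just an example) could look like this:

```python
import logging

# basicConfig attaches a StreamHandler to the root logger and sets its
# level, which is enough for messages from named loggers to appear.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(name)s %(levelname)s: %(message)s")

logger = logging.getLogger(__name__)
logger.info("This is message")  # now actually reaches the console
```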
python -c "import MyClass" will work, but why not run it with a command python MyClass.py?
When run as "python MyClass.py", I don't see any output printed to the console, hence I was forced to use python -c MyClass.py.
It does work; because of the logging level setting, you just can't see the log message. You can set the log level, or use a higher-severity call such as logger.error("msg").
Either way, a.logger.error("This is message") will get No handlers could be found for logger "__main__". It's clear that there are no handlers for the "__main__" logger, so you need to configure one, or use the root logger via logging.error("message").
Good luck!
I'm trying to broadcast my error log with a SocketHandler, and it works great if I put this in the main.py of my App Engine project, but if I just write a simple standalone Python script and run it, it doesn't seem to successfully broadcast the logs.
import logging
import logging.handlers
sh = logging.handlers.SocketHandler('localhost', logging.handlers.DEFAULT_TCP_LOGGING_PORT)
rootLogger = logging.getLogger('')
rootLogger.addHandler(sh)
logging.info(['Test', 'This', 'List'])
logging.info(dict(Test = 'Blah', Test2 = 'Blah2', Test3 = dict(Test4 = 'Blah4')))
I'm at a loss as to how to go about debugging this. Any ideas?
The default level for logging is WARNING, so INFO messages will not be seen by default. You need a rootLogger.setLevel(logging.INFO) to see INFO messages, for example, or set the level to logging.DEBUG to see all messages.
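For example, the script from the question should work once the root logger's level is lowered (a sketch; it still assumes the socket server from the docs is listening on localhost):

```python
import logging
import logging.handlers

sh = logging.handlers.SocketHandler(
    'localhost', logging.handlers.DEFAULT_TCP_LOGGING_PORT)
rootLogger = logging.getLogger('')
rootLogger.addHandler(sh)
rootLogger.setLevel(logging.INFO)  # without this, INFO records are dropped

logging.info(['Test', 'This', 'List'])
```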
Update: It's hard to see how logging.error() would fail to output where logging.info() does, unless there is a filter involved or an error during the logging.error() call. Please post a runnable script which demonstrates the issue when run with the logging socket server described in the docs.
Further update: logging doesn't automatically capture exceptions. To log an exception, the usual thing for the developer to do is to use a try: except: block and to call logging.exception(...) in the exception handling code.
try:
    print blah
except KeyError:
    traceback.print_exc()
I used to debug like this. I'd print to the console. Now, I want to log everything instead of print, since Apache doesn't allow printing. So, how do I log this entire traceback?
You can use python's logging mechanism:
import logging
...
logger = logging.getLogger("blabla")
...
try:
    print blah  # You can use logger.debug("blah") instead of print
except KeyError:
    logger.exception("An error occurred")
This will print the stack trace and will work with apache.
If you're running Django's trunk version (or 1.3 when it's released), there are a number of default logging configurations built in which are integrated with Python's standard logging module. For that, all you need to do is import logging, call logger = logging.getLogger(__name__) and then call logger.exception(msg) and you'll get both your message and a stack trace. Docs for Django's logging functionality and Python's logger.exception method would be handy references.
If you don't want to use Python's logging module, you can also import sys and write to sys.stderr rather than using print. On the command line this goes to the screen, when running under Apache it'll go into your error logs.
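A sketch of that approach (Python 3 syntax; do_work is a made-up stand-in for the code that raises):

```python
import sys
import traceback

def do_work():
    # Hypothetical stand-in for the failing code.
    raise KeyError('blah')

try:
    do_work()
except KeyError:
    # print_exc writes the traceback to sys.stderr by default; under
    # Apache, stderr ends up in the server's error log.
    traceback.print_exc(file=sys.stderr)
```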