I am trying to set up logging when using IPython parallel. Specifically, I would like to redirect log messages from the engines to the client. So, rather than having each engine log individually to its own log file, as in "IPython.parallel - can I write my own log into the engine logs?", I am looking for something like "How should I log while using multiprocessing in Python?"
Based on reviewing the IPython code base, I have the impression that the way to do this would be to register a zmq.log.handlers.PUBHandler with the logging module (see the documentation in iploggerapp.py). I have tried this in various ways, but none seem to work. I also tried to register a logger via IPython.parallel.util.connect_engine_logger, but this does not appear to do anything either.
Update
I have made some progress on this problem. If I specify c.IPEngineApp.log_url in ipengine_config.py, then the logger of the IPython application gets the appropriate EnginePUBHandler. I checked this via
%%px
from IPython.config import Application
log = Application.instance().log
print(log.handlers)
which indicated that the application logger has an EnginePUBHandler for each engine. Next, I can start the iplogger app in a separate terminal and see the log messages from each engine.
However, what I would like to achieve is to see these log messages in the notebook, rather than in a separate terminal. I have tried starting iplogger from within the notebook via a system call, but this crashes.
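For reference, this is roughly the relevant line in ipengine_config.py (the URL here is only a placeholder; it has to match whatever address the iplogger app listens on):

# ipengine_config.py -- sketch; the log_url value is a placeholder
c = get_config()
c.IPEngineApp.log_url = 'tcp://127.0.0.1:20202'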
Related
I use Python's standard logging extensively in my libraries within my Robot Framework test suites. These log messages appear in the RF log as expected, save for two problems:
1) Some of the libraries create threads. Log messages from these additional threads do not reach the RF log.
2) For each library I follow the standard practice of creating a logging channel named after the module/class (self._logger = logging.getLogger(__name__)), but I cannot seem to format the logging in any way that gets these channel names to appear in the RF logs.
If I run these libraries from regular Python scripts instead of RF, I get log messages from the additional threads, and I can format all messages to show the channel names. So there is some issue specific to running them within RF.
I'm using RF 3 and Python 3, running under Raspbian.
Quote from the Robot Framework User Guide:
"Messages logged by non-main threads using the normal logging methods from programmatic logging APIs are silently ignored."
http://robotframework.org/robotframework/latest/RobotFrameworkUserGuide.html#communication-when-using-threads
You could try this user-contributed module as a workaround:
https://github.com/robotframework/robotbackgroundlogger
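Roughly, it is used as a drop-in replacement for robot.api.logger: messages logged from worker threads are buffered, and the main thread flushes them into the RF log. A sketch based on the project's documentation (not tested here):

from robotbackgroundlogger import BackgroundLogger

logger = BackgroundLogger()

# Worker threads use the same API as robot.api.logger.
logger.info('message from a background thread')

# Later, in the main thread, flush the buffered messages to the RF log.
logger.log_background_messages()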
Updated:
Turns out, this is not a function of cron. I get the same behavior when running the script from the command line, provided it actually has a record to process and communicates with ElasticSearch.
I have a cron job that runs a Python script which uses pyelasticsearch to index some documents in an ElasticSearch instance. The script works fine from the command line, but when run via cron, it produces this error:
No handlers could be found for logger "elasticsearch.trace"
Clearly there's some logging configuration issue that only crops up when run under cron, but I'm not clear what it is. Any insight?
I solved this by explicitly configuring a handler for the elasticsearch.trace logger, as I saw in examples from the pyelasticsearch repo.
After importing pyelasticsearch, set up a handler like so:
import logging

tracer = logging.getLogger('elasticsearch.trace')
tracer.setLevel(logging.INFO)
tracer.addHandler(logging.FileHandler('/tmp/es_trace.log'))
I'm not interested in keeping the trace logs, so I used the near-at-hand Django NullHandler:
import logging
from django.utils.log import NullHandler

tracer = logging.getLogger('elasticsearch.trace')
tracer.setLevel(logging.INFO)
tracer.addHandler(NullHandler())
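Note that on Python 2.7+/3.1+ the standard library ships its own NullHandler, so the Django import is not strictly necessary:

import logging

tracer = logging.getLogger('elasticsearch.trace')
tracer.setLevel(logging.INFO)
tracer.addHandler(logging.NullHandler())  # stdlib equivalent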
In an application I want to collect messages related to some dedicated part of the processing, and then show these messages later at the user's request. I would like to record severity (e.g. info, warning), the time of the message, etc., so I considered using the Python standard logging module to collect the messages and related information. However, I don't want these messages to go to a console or file.
Is there a way to create a Python logger, using logging, where the messages are kept internally (in memory) only, until read out by the application? I would expect the code to start like:
log = logging.getLogger('my_logger')
# ... some config of log for internal use only; not to console
log.error('Just some error')
# ... some code to get/clear the messages logged so far
I have tried to look in "logging — Logging facility for Python", but most examples are for immediate output to a file or the console, so an example of internal logging, or a reference, would be appreciated.
You should just use another handler. You could use a StreamHandler over an io.StringIO, which would simply log to memory:
import io
import logging

log = logging.getLogger('my_logger')
memlog = io.StringIO()
log.addHandler(logging.StreamHandler(memlog))
All logging sent to log can then be retrieved with memlog.getvalue().
Of course, this is just a simple handler that concatenates everything into one single string, although since Python 3.2 each record is terminated, by default with a \n. For more specific requirements, you could have a look at QueueHandler or implement a dedicated handler.
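For instance, a QueueHandler keeps the full LogRecord objects, so the severity and timestamp asked about in the question remain available. A minimal sketch:

import logging
import queue
from logging.handlers import QueueHandler

log = logging.getLogger('my_logger')
q = queue.Queue()
log.addHandler(QueueHandler(q))

log.error('Just some error')

# Read out (and thereby clear) the records collected so far.
while not q.empty():
    record = q.get_nowait()
    print(record.levelname, record.created, record.getMessage())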
References: logging.handlers in the Python Standard Library reference manual.
I have a multi-module package in Python. One of the modules is essentially a command-line application; we'll call that one the "top level" module. The other module has three classes in it, which are essentially the backend of the application.
The top-level module, in the __init__ of its class, calls logging.basicConfig to log debug messages to a file, then adds a console handler for info and above. The backend classes just use getLogger(classname), because when the application runs in full, the backend will be called by the top-level command-line frontend, so logging will already be configured.
In the Test class (subclassed from unittest.TestCase and run via nose), I simply run testfixtures.LogCapture() in setUp and testfixtures.LogCapture.uninstall_all() in tearDown, and all the logging is captured just fine, with no effort.
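Concretely, the frontend test setup looks roughly like this (the class name is illustrative):

import unittest
import testfixtures

class FrontendTest(unittest.TestCase):
    def setUp(self):
        # Instantiating LogCapture installs it and starts recording.
        self.log_capture = testfixtures.LogCapture()

    def tearDown(self):
        testfixtures.LogCapture.uninstall_all()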
In the backend test file, I tried to do the same thing: testfixtures.LogCapture() in setUp, uninstall_all in tearDown. However, all the INFO-level log messages still print when I'm running the unit tests for the backend.
Any help on
1) why log capture works for the frontend but not the backend, and
2) an elegant way to log and capture logs in my backend classes without explicitly setting up logging in those files
would be amazing.
I fixed the same issue by setting disable_existing_loggers to False when reconfiguring logging: the previously created logger was being disabled, which prevented it from propagating its records to the root logger.
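In dictConfig terms, that looks something like this (the handler details are only illustrative):

import logging.config

logging.config.dictConfig({
    'version': 1,
    # Keep loggers created before this call (e.g. module-level
    # getLogger(...) calls in the backend) enabled, so their records
    # still propagate to the root logger and can be captured.
    'disable_existing_loggers': False,
    'handlers': {
        'console': {'class': 'logging.StreamHandler'},
    },
    'root': {'handlers': ['console'], 'level': 'INFO'},
})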
I am trying to run a map-reduce job, but I am unable to find my log files when the job runs. I am using a Hadoop Streaming job with Python, and I use Python's logging module to log messages. When I run the mapper on a file directly with cat, the log file is created:
cat file | ./mapper.py
But when I run the job via Hadoop, I cannot find the log file. My logging setup looks like this:
import os,logging
logging.basicConfig(filename="myApp.log", level=logging.INFO)
logging.info("app start")
##
##logic with log messages
##
logging.info("app complete")
But I cannot find the myApp.log file anywhere. Is the log data stored somewhere, or does Hadoop ignore application logging completely? I have searched for my log items in the userlogs folder too, but they don't appear to be there.
I work with vast amounts of data where random items are not making it to the next stage; this is a very big issue on our side, so I am trying to use logging to debug my application.
Any help is appreciated.
I believe you are logging to stdout? If so, you should definitely log to stderr instead, or create your own custom stream.
With Hadoop Streaming, stdout is the stream dedicated to passing key-value pairs between mappers/reducers and to emitting results, so you should not log anything to it.
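A minimal sketch of a mapper that logs to stderr (the word-count style logic is just an illustration):

#!/usr/bin/env python
import sys
import logging

# Send log records to stderr; Hadoop collects each task's stderr
# under its attempt directory in userlogs.
logging.basicConfig(stream=sys.stderr, level=logging.INFO)
logging.info("mapper start")

for line in sys.stdin:
    word = line.strip()
    # stdout stays reserved for key/value pairs.
    print("%s\t1" % word)

logging.info("mapper complete")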