Logging to separate files in Python

I'm using Python's logging module. I've initialized it as:
import logging
logger = logging.getLogger(__name__)
in each of my modules. Then, in the main file:
logging.basicConfig(level=logging.INFO,filename="log.txt")
Now, in the app I'm also using WSGIServer from gevent. Its initializer takes a log argument where I can pass a logger instance. Since this is an HTTP server, it's very verbose.
I would like to log all of my app's regular logs to "log.txt" and WSGIServer's logs to "http-log.txt".
I tried this:
logging.basicConfig(level=logging.INFO,filename="log.txt")
logger = logging.getLogger(__name__)
httpLogger = logging.getLogger("HTTP")
httpLogger.addHandler(logging.FileHandler("http-log.txt"))
httpLogger.addFilter(logging.Filter("HTTP"))
http_server = WSGIServer(('0.0.0.0', int(config['ApiPort'])), app, log=httpLogger)
This logs all HTTP messages into http-log.txt, but also to the main logger.
How can I send all but HTTP messages to the default logger (log.txt), and HTTP messages only to http-log.txt?
EDIT: Since people are quick to point out that Logging to two files with different settings already has an answer: please read the linked answer and you'll see they don't use basicConfig but rather initialize each logger separately. That is not how I'm using the logging module.

Add the following line to disable propagation:
httpLogger.propagate = False
Then, it will no longer propagate messages to its ancestors' handlers, which include the root logger for which you have set up the general log file.
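Putting it together, a minimal sketch of the whole setup, reusing the names from the question (so app and config['ApiPort'] are assumed to exist, and WSGIServer is assumed to come from gevent.pywsgi):
import logging
from gevent.pywsgi import WSGIServer

# root logger: everything that propagates ends up in log.txt
logging.basicConfig(level=logging.INFO, filename="log.txt")
logger = logging.getLogger(__name__)

# dedicated HTTP logger: writes to http-log.txt and does not propagate to the root
httpLogger = logging.getLogger("HTTP")
httpLogger.setLevel(logging.INFO)
httpLogger.addHandler(logging.FileHandler("http-log.txt"))
httpLogger.propagate = False

http_server = WSGIServer(('0.0.0.0', int(config['ApiPort'])), app, log=httpLogger)
With propagation disabled, the Filter("HTTP") from the question is no longer needed, since the handler on httpLogger only ever sees records from that logger anyway.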

Related

GCP Cloud Functions printing extra blank line after every print statement

I have a Cloud Function running Python 3.7 runtime triggered from a Pub/Sub Topic.
In the code, I have places where I use print() to write logs. However, when I go to the logs tab of my function, I see that an extra blank line is added after each log. I would like to remove these, since this is basically doubling my usage of the Logging API.
I have tried using print(message, end="") but this did not remove the blank lines.
Thanks in advance.
Although I have not found the root cause of the blank lines, I was able to resolve this by using the google-cloud-logging library, as suggested by John in the comments on my question.
Resulting code is as below:
import os
import google.cloud.logging
import logging

# set up the Cloud Logging client when running on GCP
if not os.environ.get("DEVELOPMENT"):  # custom environment variable, set only locally
    # only on GCP
    logging_client = google.cloud.logging.Client()
    logging_client.setup_logging()

# define logger
logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)  # minimum logging level for the logger

# define a handler only locally:
# on Cloud Functions there is already a handler attached to the logger,
# and adding another one there results in duplicate logging with severity = ERROR
if os.environ.get("DEVELOPMENT"):
    console_handler = logging.StreamHandler()  # handler that writes to the stream
    console_handler.setLevel(logging.DEBUG)  # minimum logging level for the handler
    # add handler to logger
    logger.addHandler(console_handler)
def my_function():
logger.info('info')
This code will:
- not send logs to GCP when the function is executed locally
- print INFO and DEBUG logs both locally and on GCP
Thank you both for your suggestions.
Instead of using print, use a logger:
import logging
import logging.handlers as handlers
logger = logging.getLogger("YourCloudFunctionLoggerName")
logger.setLevel(logging.DEBUG)
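To actually get those records into Cloud Logging with the right severity, the Cloud Logging handler still has to be attached, e.g. via the client's setup_logging() as in the answer above. A minimal sketch (the logger name and the Pub/Sub entry-point signature are just examples):
import logging
import google.cloud.logging

# attach the Cloud Logging handler to the root logger (same as in the accepted answer)
client = google.cloud.logging.Client()
client.setup_logging()

logger = logging.getLogger("YourCloudFunctionLoggerName")
logger.setLevel(logging.DEBUG)

def handle_message(event, context):  # hypothetical Pub/Sub-triggered entry point
    logger.info("processing message")  # arrives in Cloud Logging with severity INFO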

How to redirect another library's console logging messages to a file, in Python

The fastAPI library that I import for an API I have written writes many logging.INFO-level messages to the console, which I would like either to redirect to a file-based log or, ideally, to send to both console and file. (My console shows a stream of fastAPI logging events.)
So I've tried to implement this Stack Overflow answer ("Easy-peasy with Python 3.3 and above"), but the log file it creates ("api_screen.log") is always empty.
# -------------------------- logging ----------------------------
import logging

logging_file = "api_screen.log"
logging_level = logging.INFO
logging_format = ' %(message)s'
logging_handlers = [logging.FileHandler(logging_file), logging.StreamHandler()]
logging.basicConfig(level=logging_level, format=logging_format, handlers=logging_handlers)
logging.info("------logging test------")
Even though my own "------logging test------" message does appear on the console among the other fastAPI logs. The file itself does get created, but it has size zero.
So what do I need to do also to get the file logging working?
There are multiple issues here. First and most importantly: basicConfig does nothing if the root logger is already configured, and fastAPI has already configured it. So the handlers you are creating are never used. When you call logging.info() you are sending a record to the root logger, which gets printed because fastAPI has added a handler to it. You are also not setting the level on your handlers. Try this code instead of what you currently have:
import logging

logging_file = "api_screen.log"
logging_level = logging.INFO

logging_fh = logging.FileHandler(logging_file)
logging_sh = logging.StreamHandler()
logging_fh.setLevel(logging_level)
logging_sh.setLevel(logging_level)

root_logger = logging.getLogger()
root_logger.addHandler(logging_fh)
root_logger.addHandler(logging_sh)

logging.info('--test--')
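As an aside: on Python 3.8+ you could instead keep basicConfig and pass force=True, which removes and closes any handlers already attached to the root logger before applying your configuration. A sketch of that alternative, using the same names as the question:
import logging

logging.basicConfig(
    level=logging.INFO,
    format=' %(message)s',
    handlers=[logging.FileHandler("api_screen.log"), logging.StreamHandler()],
    force=True,  # Python 3.8+: discard whatever handlers were already configured on the root logger
)
logging.info("------logging test------")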

Duplicate log entries with Google Cloud Stackdriver logging of Python code on Kubernetes Engine

I have a simple Python app running in a container on Google Kubernetes Engine. I am trying to connect the standard Python logging to Google Stackdriver logging using this guide. I have almost succeeded, but I am getting duplicate log entries with one always at the 'error' level...
Screenshot of Stackdriver logs showing duplicate entries
This is my Python code that sets up the logging according to the above guide:
import webapp2
from paste import httpserver
import rpc
# Imports the Google Cloud client library
import google.cloud.logging
# Instantiates a client
client = google.cloud.logging.Client()
# Connects the logger to the root logging handler; by default this captures
# all logs at INFO level and higher
client.setup_logging()
app = webapp2.WSGIApplication([('/rpc/([A-Za-z]+)', rpc.RpcHandler),], debug=True)
httpserver.serve(app, host='0.0.0.0', port='80')
Here's the code that triggers the logs from the screenshot:
import logging
logging.info("INFO Entering PostEchoPost...")
logging.warning("WARNING Entering PostEchoPost...")
logging.error("ERROR Entering PostEchoPost...")
logging.critical("CRITICAL Entering PostEchoPost...")
Here is the full Stackdriver log, expanded from the screenshot, with an incorrectly interpreted ERROR level:
{
  insertId: "1mk4fkaga4m63w1"
  labels: {
    compute.googleapis.com/resource_name: "gke-alg-microservice-default-pool-xxxxxxxxxx-ttnz"
    container.googleapis.com/namespace_name: "default"
    container.googleapis.com/pod_name: "esp-alg-xxxxxxxxxx-xj2p2"
    container.googleapis.com/stream: "stderr"
  }
  logName: "projects/projectname/logs/algorithm"
  receiveTimestamp: "2018-01-03T12:18:22.479058645Z"
  resource: {
    labels: {
      cluster_name: "alg-microservice"
      container_name: "alg"
      instance_id: "703849119xxxxxxxxxx"
      namespace_id: "default"
      pod_id: "esp-alg-xxxxxxxxxx-xj2p2"
      project_id: "projectname"
      zone: "europe-west1-b"
    }
    type: "container"
  }
  severity: "ERROR"
  textPayload: "INFO Entering PostEchoPost...
  "
  timestamp: "2018-01-03T12:18:20Z"
}
Here is the the full Stackdriver log, expanded from the screenshot, with a correctly interpreted INFO level:
{
  insertId: "1mk4fkaga4m63w0"
  jsonPayload: {
    message: "INFO Entering PostEchoPost..."
    thread: 140348659595008
  }
  labels: {
    compute.googleapis.com/resource_name: "gke-alg-microservi-default-pool-xxxxxxxxxx-ttnz"
    container.googleapis.com/namespace_name: "default"
    container.googleapis.com/pod_name: "esp-alg-xxxxxxxxxx-xj2p2"
    container.googleapis.com/stream: "stderr"
  }
  logName: "projects/projectname/logs/algorithm"
  receiveTimestamp: "2018-01-03T12:18:22.479058645Z"
  resource: {
    labels: {
      cluster_name: "alg-microservice"
      container_name: "alg"
      instance_id: "703849119xxxxxxxxxx"
      namespace_id: "default"
      pod_id: "esp-alg-xxxxxxxxxx-xj2p2"
      project_id: "projectname"
      zone: "europe-west1-b"
    }
    type: "container"
  }
  severity: "INFO"
  timestamp: "2018-01-03T12:18:20.260099887Z"
}
So, this entry might be the key:
container.googleapis.com/stream: "stderr"
It looks like, in addition to my logging set-up working, all logs from the container are being sent to stderr, and I believe that by default, at least on Kubernetes Engine, all stdout/stderr output is picked up by Google Stackdriver via fluentd... Having said that, I'm out of my depth at this point.
Any ideas why I am getting these duplicate entries?
I solved this problem by overwriting the handlers property on my root logger immediately after calling the setup_logging method:
import logging
from google.cloud import logging as gcp_logging
from google.cloud.logging.handlers import CloudLoggingHandler, ContainerEngineHandler, AppEngineHandler
logging_client = gcp_logging.Client()
logging_client.setup_logging(log_level=logging.INFO)
root_logger = logging.getLogger()
# use the GCP handler ONLY in order to prevent logs from getting written to STDERR
root_logger.handlers = [
    handler for handler in root_logger.handlers
    if isinstance(handler, (CloudLoggingHandler, ContainerEngineHandler, AppEngineHandler))
]
To elaborate on this a bit, the client.setup_logging method sets up 2 handlers, a normal logging.StreamHandler and also a GCP-specific handler. So, logs will go to both stderr and Cloud Logging. You need to remove the stream handler from the handlers list to prevent the duplication.
EDIT:
I have filed an issue with Google to add an argument to make this less hacky.
The problem is in the way the logging client initializes the root logger:
logger = logging.getLogger()
logger.setLevel(log_level)
logger.addHandler(handler)
logger.addHandler(logging.StreamHandler())
It adds a default stream handler in addition to the Stackdriver handler.
My workaround for now is to initialize the appropriate Stackdriver handler manually:
# this manually sets up a logger compatible with GKE/fluentd,
# since LoggingClient automatically adds another StreamHandler - so
# log records would otherwise be duplicated
import sys
import logging

from google.cloud.logging.handlers import ContainerEngineHandler

level = logging.INFO  # whichever level you need

formatter = logging.Formatter("%(message)s")
handler = ContainerEngineHandler(stream=sys.stderr)
handler.setFormatter(formatter)
handler.setLevel(level)

root = logging.getLogger()
root.addHandler(handler)
root.setLevel(level)
I'm writing in 2022, shortly after v3.0.0 of google-cloud-logging was released, and this issue cropped up for me too (albeit almost certainly for a different reason).
Debugging
The most useful thing I did on the way to debugging it was stick the following in my code:
import logging
...
root_logger = logging.getLogger() # no arguments = return the root logger
print(root_logger.handlers, flush=True) # tell me what handlers are attached
...
If you're getting duplicate logs, it seems certain that it's because you've got multiple handlers attached to your logger, and Stackdriver is catching logs from both of them! To be fair, that is Stackdriver's job; it's just a pity that google-cloud-logging can't sort this out by default.
The good news is that Stackdriver will also catch the print statement (which goes to the STDOUT stream). In my case, the following list of handlers was logged: [<StreamHandler <stderr> (NOTSET)>, <StructuredLogHandler <stderr> (NOTSET)>]. So: two handlers were attached to the root logger.
Fixing it
You might be able to find that your code is attaching the handler somewhere else, and simply remove that part. But it may instead be the case that e.g. a dependency is setting up the extra handler, something I wrestled with.
I used a solution based on the answer written by Andy Carlson. Keeping it general/extensible:
import google.cloud.logging
import logging
def is_cloud_handler(handler: logging.Handler) -> bool:
    """
    is_cloud_handler

    Returns True or False depending on whether the input is a
    google-cloud-logging handler class
    """
    accepted_handlers = (
        google.cloud.logging.handlers.StructuredLogHandler,
        google.cloud.logging.handlers.CloudLoggingHandler,
        google.cloud.logging.handlers.ContainerEngineHandler,
        google.cloud.logging.handlers.AppEngineHandler,
    )
    return isinstance(handler, accepted_handlers)

def set_up_logging():
    # here we assume you'll be using the basic logging methods
    # logging.info, logging.warn etc. which invoke the root logger
    client = google.cloud.logging.Client()
    client.setup_logging()
    root_logger = logging.getLogger()
    root_logger.handlers = [h for h in root_logger.handlers if is_cloud_handler(h)]
More context
For those who find this solution confusing
In Python there is a separation between 'loggers' and 'handlers': loggers generate logs, and handlers decide what happens to them. Thus, you can attach multiple handlers to the same logger (in case you want multiple things to happen to the logs from that logger).
The google-cloud-logging library suggests that you run its setup_logging method and then just use the basic logging methods of the built-in logging library to create your logs. These are: logging.debug, logging.info, logging.warning, logging.error, and logging.critical (in escalating order of urgency).
All logging.Logger instances have the same methods, including a special Logger instance called the root logger. If you look at the source code for the basic logging methods, they simply call these methods on this root logger.
It's possible to set up specific Loggers, which is standard practice to demarcate logs generated by different areas of an application (rather than sending everything via the root logger). This is done using logging.getLogger("name-of-logger"). However, logging.getLogger() with no argument returns the root logger.
Meanwhile, the purpose of the google.cloud.logging.Client.setup_logging method is to attach a special log handler to the root logger. Thus, logs created using logging.info etc. will be handled by a google-cloud-logging handler. But you have to make sure no other handlers are also attached to the root logger.
Fortunately, Loggers have a property, .handlers, which is a list of attached log handlers. In this solution we just edit that list to ensure we have just one handler.
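As a small, generic illustration of that separation (the logger and file names here are made up):
import logging

log = logging.getLogger("my-app")   # a named logger (not the root logger)
log.setLevel(logging.INFO)

# two handlers on one logger: every record is handled twice
log.addHandler(logging.StreamHandler())            # write to the console
log.addHandler(logging.FileHandler("my-app.log"))  # and also to a file

log.info("this line appears on the console and in my-app.log")
print(log.handlers)  # [<StreamHandler ...>, <FileHandler ...>]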

Disabling logging in 'transaction' package (Pyramid app)

When the debugging level of the main logger in a Pyramid app is set to DEBUG, the transaction package spews lots of pointless debug messages.
In nosetests I can disable that this way:
import logging
from transaction._compat import get_thread_ident

txn_logger = logging.getLogger("txn.%d" % get_thread_ident())
txn_logger.setLevel(logging.WARN)
However, in a Pyramid app the infrastructure adds a "scoped session" to each HTTP request, which obviously means get_thread_ident() is different every time.
Is there some way of disabling that globally without repeating above in every single Pyramid view?
Simply turn off logging for the txn parent logger in your logging config.
[loggers]
keys = transactions, ...
[logger_transactions]
level = WARN
handlers =
qualname = txn
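If you would rather do it in code than in the INI file, setting the level once on the parent "txn" logger should have the same effect, because the per-thread "txn.<ident>" loggers inherit their effective level from it. A minimal sketch:
import logging

# all "txn.<thread-ident>" loggers are children of "txn", so this quiets them globally
logging.getLogger("txn").setLevel(logging.WARN)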

How to add logging to classes in a module at a custom level

I'm trying to replace an old method of logging information with standard Python logging to a file. The application already has a main log file which captures INFO and DEBUG messages, so I would like the legacy messages at a lower level that isn't captured by the main log.
App structure:
- mysite
- legacy
  -- items
    --- item1.py
  -- __init__.py
  -- engine.py
Within item1.py and engine.py are calls to an old debug() function, which I'd like to be logged to legacy.log but not have appear in the mysite.log file.
Ideally the way this works is to create a wrapper with a debug function which does the logging at the new level, and I've read that this requires extending logging.Logger.
So in legacy/__init__.py I've written:
import logging

LEGACY_DEBUG_LVL = 5

class LegacyLogger(logging.Logger):
    """
    Extend the Logger to introduce the new legacy logging.
    """

    def legacydebug(self, msg, *args, **kwargs):
        """
        Log messages from legacy provided they are strings.

        @param msg: message to log
        @type msg:
        """
        if isinstance(msg, str):
            self._log(LEGACY_DEBUG_LVL, msg, args)

logging.Logger.legacydebug = LegacyLogger.legacydebug

logger = logging.getLogger('legacy')
logger.setLevel(LEGACY_DEBUG_LVL)
logger.addHandler(logging.FileHandler('legacy.log'))
logging.addLevelName(LEGACY_DEBUG_LVL, "legacydebug")
And from engine.py and item1.py I can just do:
from . import logger
debug = logger.legacydebug
At the moment I'm seeing messages logged to both logs. Is this the correct approach for what I want to achieve? I've got a talent for over-complicating things and missing the simple stuff!
EDIT:
Logging in the main application settings is set up like this:
# settings.py
import logging

logging.captureWarnings(True)

logger = logging.getLogger()
logger.addHandler(logging.NullHandler())
logger.addHandler(logging.FileHandler('mysite.log'))
if DEBUG:
    # If we're running in debug mode, write logs to stdout as well:
    logger.addHandler(logging.StreamHandler())
    logger.setLevel(logging.DEBUG)
else:
    logger.setLevel(logging.INFO)
When using multiple loggers, the logging module implicitly creates them in a tree structure. The structure is defined by the logger names: a logger named 'animal' will be the parent of loggers called 'animal.cat' and 'animal.dog'.
In your case, the unnamed (root) logger configured in settings.py is an ancestor of the logger named 'legacy'. The root logger will receive the messages sent through the 'legacy' logger and write them into mysite.log.
Try giving the unnamed logger a name, such as 'mysite', so that the 'legacy' logger is no longer one of its descendants.
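A minimal sketch of that suggestion, adapted from the settings.py in the question (handler details trimmed):
# settings.py -- attach the app's handlers to a named logger instead of the root logger
import logging

logger = logging.getLogger('mysite')
logger.addHandler(logging.FileHandler('mysite.log'))
logger.setLevel(logging.INFO)

# legacy/__init__.py -- 'legacy' is not a descendant of 'mysite', so its records
# propagate only up to the (handler-less) root logger and never reach mysite.log
legacy_logger = logging.getLogger('legacy')
legacy_logger.addHandler(logging.FileHandler('legacy.log'))
Alternatively, setting propagate = False on the 'legacy' logger (as in the first question above) would also keep its records away from any handlers configured on the root logger.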
