Doing the equivalent of log_struct in python logger - python

In the Google example, it gives the following:
logger.log_struct({
    'message': 'My second entry',
    'weather': 'partly cloudy',
})
How would I do the equivalent with Python's logging module? For example:
import logging
log.info(
    msg='My second entry',
    extra={'weather': 'partly cloudy'}
)
When I view this in stackdriver, the extra fields aren't getting parsed properly:
2018-11-12 15:41:12.366 PST
My second entry
{
  insertId: "1de1tqqft3x3ri"
  jsonPayload: {
    message: "My second entry"
    python_logger: "Xhdoo8x"
  }
  logName: "projects/Xhdoo8x/logs/python"
  receiveTimestamp: "2018-11-12T23:41:12.366883466Z"
  resource: {…}
  severity: "INFO"
  timestamp: "2018-11-12T23:41:12.366883466Z"
}
How would I do that?
The closest I'm able to do now is:
log.handlers[-1].client.logger('').log_struct("...")
But this still requires a second call...

Current solution:
Update 1 - user Seth Nickell improved my proposed solution, so I have updated this answer, as his method is superior. The following is based on his response on GitHub:
https://github.com/snickell/google_structlog
pip install google-structlog
Used like so:
import google_structlog
google_structlog.setup(log_name="here-is-mylilapp")
# Now you can use structlog to get searchable json details in stackdriver...
import structlog
logger = structlog.get_logger()
logger.error("Uhoh, something bad did", moreinfo="it was bad", years_back_luck=5)
# Of course, you can still use plain ol' logging stdlib to get { "message": ... } objects
import logging
logger = logging.getLogger("yoyo")
logger.error("Regular logging calls will work happily too")
# Now you can search stackdriver with the query:
# logName: 'here-is-mylilapp'
Original answer:
Based on an answer from this GitHub thread, I use the following bodge to log custom objects as the info payload. It derives from the original _Worker.enqueue and adds support for passing custom fields.
import datetime

from google.cloud.logging import _helpers
from google.cloud.logging.handlers.transports.background_thread import _Worker

def my_enqueue(self, record, message, resource=None, labels=None, trace=None, span_id=None):
    queue_entry = {
        "info": {"message": message, "python_logger": record.name},
        "severity": _helpers._normalize_severity(record.levelno),
        "resource": resource,
        "labels": labels,
        "trace": trace,
        "span_id": span_id,
        "timestamp": datetime.datetime.utcfromtimestamp(record.created),
    }
    # fields passed via `extra` end up as attributes on the LogRecord
    if hasattr(record, 'custom_fields'):
        queue_entry['info']['custom_fields'] = record.custom_fields
    self._queue.put_nowait(queue_entry)

_Worker.enqueue = my_enqueue
Then:
import logging

from google.cloud import logging as google_logging
from google.cloud.logging.handlers import CloudLoggingHandler

logger = logging.getLogger('my_log_client')
logger.addHandler(CloudLoggingHandler(google_logging.Client(), 'my_log_client'))
logger.info('hello', extra={'custom_fields': {'foo': 1, 'bar': {'tzar': 3}}})
The resulting log entry includes the custom_fields in its payload, which then makes it much easier to filter according to these custom_fields.
Let's admit this is not good programming, though until this functionality is officially supported there doesn't seem to be much else that can be done.

This is not currently possible; see this open feature request on google-cloud-python for more details.

Official docs: Setting Up Cloud Logging for Python
You can write logs to Cloud Logging from Python applications either:
by using the Python logging handler included with the Logging client library, or
by using the Cloud Logging API client library for Python directly.
I did not get the Python logging module to export jsonPayload, but the cloud logging library for Python works:
google-cloud-logging >= v3.0.0 can do it.
There is no need for a Python logging workaround anymore; the only things you need are Python >= 3.6 and
pip install google-cloud-logging
Then you can write logs to GCP with:
import os

import google.cloud.logging

# have the environment variable ready:
# GOOGLE_APPLICATION_CREDENTIALS
# Then:
GOOGLE_APPLICATION_CREDENTIALS = os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")
client = google.cloud.logging.Client.from_service_account_json(
    GOOGLE_APPLICATION_CREDENTIALS
)
log_name = "test"
gcloud_logger = client.logger(log_name)
# jsonPayload (entry is a dict, see the example further below)
gcloud_logger.log_struct(entry)
# textPayload
gcloud_logger.log_text('hello')
# generic (infers whether it becomes a jsonPayload or a textPayload)
gcloud_logger.log(entry)  # or: gcloud_logger.log('hello')
You can run these commands in a Python file outside of GCP and reach GCP with more or less a one-liner; you only need the credentials.
You can even use the gcloud logger for printing and logging in one go; see Writing structured logs (untested).
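Roughly, that docs example looks like this when adapted to the gcloud_logger above (just a sketch, untested here as noted; the payload fields are the sample values from the docs):
# sketch based on the "Writing structured logs" docs page, reusing gcloud_logger from above
gcloud_logger.log_struct(
    {
        "message": "My second entry",
        "weather": "partly cloudy",
    },
    severity="INFO",
)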
Python logging to log jsonPayload into GCP logs (TL/DR)
I could not get this to run!
You can also use the built-in logging module of Python with the workaround mentioned in the other answer, but I did not get it to work.
It will not work if you pass a dictionary or its json.dumps() directly as a parameter, because then you just get a string dump of the whole dictionary, which you cannot read as a JSON tree.
It also did not work for me when I used logger.info() to log the jsonPayload / json.dumps() in an example parameter called extras.
import json
import logging
import os

import google.cloud.logging
from google.cloud.logging.handlers import CloudLoggingHandler, setup_logging
#...
# https://googleapis.dev/python/logging/latest/stdlib-usage.html
GOOGLE_APPLICATION_CREDENTIALS = os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")
client = google.cloud.logging.Client.from_service_account_json(
    GOOGLE_APPLICATION_CREDENTIALS
)
log_name = "test"
handler = CloudLoggingHandler(client, name=log_name)
setup_logging(handler)
logger = logging.getLogger()
logger.setLevel(logging.INFO)  # Set default level.
#...
# Complete a structured log entry.
entry = dict(
    severity="NOTICE",
    message="This is the default display field.",
    # Log viewer accesses 'component' as jsonPayload.component.
    component="arbitrary-property",
)
# Python logging to log jsonPayload into GCP logs
# (note: the stdlib keyword is actually 'extra', not 'extras';
# this call did not work for me either way)
logger.info('hello', extras=json.dumps(entry))
I also tried the google-structlog solution of the other answer, which only threw this error:
google_structlog.setup(log_name=log_name) TypeError: setup() got an unexpected keyword argument 'log_name'
I used Python v3.10.2 and
google-auth==1.35.0
google-cloud-core==1.7.2
google-cloud-logging==1.15.0
googleapis-common-protos==1.54.0
Research steps gcloud logging (TL/DR)
Following the fixed issue (see the merge and close at the end) on GitHub at googleapis/python-logging, "Logging: support sending structured logs to stackdriver via stdlib 'logging'. #13",
you find "feat!: support json logs #316":
This PR adds full support for JSON logs, along with standard text
logs. Now, users can call logging.error({'a':'b'}), and they will get
a JsonPayload in Cloud Logging, Or call logging.error('test') to
receive a TextPayload
As part of this change, I added a generic logger.log() function, which
serves as a generic entry-point instead of logger.log_text or
logger.log_struct. It will infer which log function is meant based on
the input type
Previously, the library would attach the python logger name as part of
the JSON payload for each log. Now, that information will be attached
as a label instead, giving users full control of the log payload
fields
Fixes #186, #263, #13
With the main new version listing the new feature:
chore(main): release 3.0.0 #473
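Based on the PR description quoted above, the v3 stdlib route should boil down to something like the following sketch (the dict contents are placeholders, and setup_logging assumes credentials are available in the environment):
import logging

import google.cloud.logging

# Attach the Cloud Logging handler to the root logger (assumes default credentials).
client = google.cloud.logging.Client()
client.setup_logging()

# Per the quoted PR, a dict argument should arrive as a jsonPayload...
logging.error({"a": "b"})
# ...and a plain string as a textPayload.
logging.error("test")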

Related

I want to make Cloud Run log the input it receives

When I receive an input such as a URL, I want to log it in Cloud Run so that I can see my input, using Python.
You can log the input URL in Python using the built-in logging module. You can use this example:
import logging

def log_url(url):
    logging.basicConfig(filename='url.log', level=logging.INFO)
    logging.info('Received URL: %s', url)

if __name__ == '__main__':
    url = 'https://www.example.com'
    log_url(url)
This code will log the input URL in a file named url.log in the same directory as the script. The logging level is set to logging.INFO, so messages at INFO level and above will be logged. You can adjust the logging level as needed to control the verbosity of the log output.
To log the input URL in the cloud, you will need to configure the logging backend to write to a cloud-based log service. The specific service you use will depend on the cloud provider you are using. For example, if you are using Google Cloud, you could use Stackdriver Logging to store and manage your logs.
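For example, with Google Cloud a minimal sketch (assuming the google-cloud-logging package is installed and credentials are available, as they are by default on Cloud Run) routes the standard logging module to Cloud Logging and reuses the same log_url function:
import logging

import google.cloud.logging

# Send stdlib logging records to Cloud Logging instead of a local file
# (assumes default credentials, which Cloud Run provides automatically).
client = google.cloud.logging.Client()
client.setup_logging()

def log_url(url):
    logging.info('Received URL: %s', url)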

How to send logs to GCP using StructuredLogHandler with jsonPayload?

I am trying to send logs (with jsonPayload) to GCP using StructuredLogHandler with the Python code below.
import logging

import google.cloud.logging
from google.cloud.logging.handlers import StructuredLogHandler

rootlogger = logging.getLogger()
client = google.cloud.logging.Client(credentials=xxx, project=xxx)
h = StructuredLogHandler()
rootlogger.addHandler(h)
logger = logging.getLogger('test')
logger.warning('warning')
I see that logs are being printed on console (in json format) but the logs are not sent to GCP Log Explorer. Can someone help?
Since v3 of the Python Cloud Logging library it's now easier than ever, as it integrates with the Python standard logging module via client.setup_logging:
import logging
import google.cloud.logging
# Instantiate a client
client = google.cloud.logging.Client()
# Retrieves a Cloud Logging handler based on the environment
# you're running in and integrates the handler with the
# Python logging module. By default this captures all logs
# at INFO level and higher
client.setup_logging()
Name your logger as usual. E.g.
logger = logging.getLogger('test')
Then, if you want to send structured log messages to Cloud Logging, you can use one of two methods:
Use the json_fields extra argument:
data_dict = {"hello": "world"}
logging.info("message field", extra={"json_fields": data_dict})
Use a JSON-parseable string (requires importing the json module):
import json
data_dict = {"hello": "world"}
logging.info(json.dumps(data_dict))
This will see your log messages sent to Google Cloud, with the JSON payload available under the jsonPayload field of the expanded log entry.

How to log the UUID of an API request payload on every log message with Python?

I'm working on a microservice where I need to log the UUID received in a payload to a cloud-based logger. But one module calls another module, so for every submodule function to log messages with the UUID, I need to pass the UUID to every function I call, and that seems like a bad idea. Is there a better way of doing this?
Illustration of what I'm trying to achieve:
def call_bar(uuid):
    app.logger.info(f"Something coming with uuid: {uuid}")

def call_foo(uuid):
    # ...
    app.logger.info(f"Something coming with uuid: {uuid}")
    call_bar(uuid)

def flask_view():
    # i get a uuid here
    app.logger.info(f"Something coming with uuid: {uuid}")
    call_foo(uuid)
The issue with the above code is that I need to explicitly pass the value to every function I call.
The Logger component of AWS Lambda Powertools does this for us on AWS; the code setup is pretty minimal:
from aws_lambda_powertools import Logger

logger = Logger()

def flask_view():
    request_uuid = i_get_a_uuid_here()
    logger.append_keys(uuid=request_uuid)
    call_foo()

def call_foo():
    logger.info("foo happened")
This would get you a JSON log line that, in addition to a bunch of other fields, has the following keys:
{
    "message": "foo happened",
    "uuid": "<request_uuid>"
}
If you're not on AWS Lambda, I'd take a look at structlog: its log.bind(uuid=uuid) works similarly, allowing you to add keys that get used for subsequent log entries.
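For illustration, here is a rough sketch of that idea using structlog's contextvars helpers instead of log.bind, so the UUID does not have to be passed between functions (this assumes a recent structlog where merge_contextvars is part of the default processor chain):
import uuid

import structlog

logger = structlog.get_logger()

def call_foo():
    # The uuid bound below is merged into this entry automatically.
    logger.info("foo happened")

def flask_view():
    request_uuid = str(uuid.uuid4())
    # Bind the uuid once for the current context; no explicit passing needed.
    structlog.contextvars.bind_contextvars(uuid=request_uuid)
    call_foo()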

Duplicate log entries with Google Cloud Stackdriver logging of Python code on Kubernetes Engine

I have a simple Python app running in a container on Google Kubernetes Engine. I am trying to connect the standard Python logging to Google Stackdriver logging using this guide. I have almost succeeded, but I am getting duplicate log entries with one always at the 'error' level...
[Screenshot of Stackdriver logs showing duplicate entries]
This is my Python code that sets up the logging according to the above guide:
import webapp2
from paste import httpserver
import rpc
# Imports the Google Cloud client library
import google.cloud.logging
# Instantiates a client
client = google.cloud.logging.Client()
# Connects the logger to the root logging handler; by default this captures
# all logs at INFO level and higher
client.setup_logging()
app = webapp2.WSGIApplication([('/rpc/([A-Za-z]+)', rpc.RpcHandler),], debug=True)
httpserver.serve(app, host='0.0.0.0', port='80')
Here's the code that triggers the logs from the screenshot:
import logging
logging.info("INFO Entering PostEchoPost...")
logging.warning("WARNING Entering PostEchoPost...")
logging.error("ERROR Entering PostEchoPost...")
logging.critical("CRITICAL Entering PostEchoPost...")
Here is the full Stackdriver log, expanded from the screenshot, with an incorrectly interpreted ERROR level:
{
  insertId: "1mk4fkaga4m63w1"
  labels: {
    compute.googleapis.com/resource_name: "gke-alg-microservice-default-pool-xxxxxxxxxx-ttnz"
    container.googleapis.com/namespace_name: "default"
    container.googleapis.com/pod_name: "esp-alg-xxxxxxxxxx-xj2p2"
    container.googleapis.com/stream: "stderr"
  }
  logName: "projects/projectname/logs/algorithm"
  receiveTimestamp: "2018-01-03T12:18:22.479058645Z"
  resource: {
    labels: {
      cluster_name: "alg-microservice"
      container_name: "alg"
      instance_id: "703849119xxxxxxxxxx"
      namespace_id: "default"
      pod_id: "esp-alg-xxxxxxxxxx-xj2p2"
      project_id: "projectname"
      zone: "europe-west1-b"
    }
    type: "container"
  }
  severity: "ERROR"
  textPayload: "INFO Entering PostEchoPost...
"
  timestamp: "2018-01-03T12:18:20Z"
}
Here is the full Stackdriver log, expanded from the screenshot, with a correctly interpreted INFO level:
{
  insertId: "1mk4fkaga4m63w0"
  jsonPayload: {
    message: "INFO Entering PostEchoPost..."
    thread: 140348659595008
  }
  labels: {
    compute.googleapis.com/resource_name: "gke-alg-microservi-default-pool-xxxxxxxxxx-ttnz"
    container.googleapis.com/namespace_name: "default"
    container.googleapis.com/pod_name: "esp-alg-xxxxxxxxxx-xj2p2"
    container.googleapis.com/stream: "stderr"
  }
  logName: "projects/projectname/logs/algorithm"
  receiveTimestamp: "2018-01-03T12:18:22.479058645Z"
  resource: {
    labels: {
      cluster_name: "alg-microservice"
      container_name: "alg"
      instance_id: "703849119xxxxxxxxxx"
      namespace_id: "default"
      pod_id: "esp-alg-xxxxxxxxxx-xj2p2"
      project_id: "projectname"
      zone: "europe-west1-b"
    }
    type: "container"
  }
  severity: "INFO"
  timestamp: "2018-01-03T12:18:20.260099887Z"
}
So, this entry might be the key:
container.googleapis.com/stream: "stderr"
It looks like in addition to my logging set-up working, all logs from the container are being sent to stderr in the container, and I believe that by default, at least on Kubernetes Container Engine, all stdout/stderr output is picked up by Google Stackdriver via FluentD... Having said that, I'm out of my depth at this point.
Any ideas why I am getting these duplicate entries?
I solved this problem by overwriting the handlers property on my root logger immediately after calling the setup_logging method:
import logging
from google.cloud import logging as gcp_logging
from google.cloud.logging.handlers import CloudLoggingHandler, ContainerEngineHandler, AppEngineHandler
logging_client = gcp_logging.Client()
logging_client.setup_logging(log_level=logging.INFO)
root_logger = logging.getLogger()
# use the GCP handler ONLY in order to prevent logs from getting written to STDERR
root_logger.handlers = [handler
                        for handler in root_logger.handlers
                        if isinstance(handler, (CloudLoggingHandler, ContainerEngineHandler, AppEngineHandler))]
To elaborate on this a bit, the client.setup_logging method sets up 2 handlers, a normal logging.StreamHandler and also a GCP-specific handler. So, logs will go to both stderr and Cloud Logging. You need to remove the stream handler from the handlers list to prevent the duplication.
EDIT:
I have filed an issue with Google to add an argument to make this less hacky.
The problem is in the way the logging client initializes the root logger:
logger = logging.getLogger()
logger.setLevel(log_level)
logger.addHandler(handler)
logger.addHandler(logging.StreamHandler())
It adds a default stream handler in addition to the Stackdriver handler.
My workaround for now is to initialize the appropriate Stackdriver handler manually:
import logging
import sys

from google.cloud.logging.handlers import ContainerEngineHandler

# this basically manually sets up a logger compatible with GKE/fluentd,
# as LoggingClient automatically adds another StreamHandler - so
# log records are duplicated
level = logging.INFO  # or whichever level you need
formatter = logging.Formatter("%(message)s")
handler = ContainerEngineHandler(stream=sys.stderr)
handler.setFormatter(formatter)
handler.setLevel(level)
root = logging.getLogger()
root.addHandler(handler)
root.setLevel(level)
I'm writing in 2022, shortly after v3.0.0 of google-cloud-logging was released, and this issue cropped up for me too (albeit almost certainly for a different reason).
Debugging
The most useful thing I did on the way to debugging it was to stick the following in my code:
import logging
...
root_logger = logging.getLogger() # no arguments = return the root logger
print(root_logger.handlers, flush=True) # tell me what handlers are attached
...
If you're getting duplicate logs, it seems certain that it's because you've got multiple handlers attached to your logger, and Stackdriver is catching logs from both of them! To be fair, that is Stackdriver's job; it's just a pity that google-cloud-logging can't sort this out by default.
The good news is that Stackdriver will also catch the print statement (which goes to the STDOUT stream). In my case, the following list of handlers was logged: [<StreamHandler <stderr> (NOTSET)>, <StructuredLogHandler <stderr> (NOTSET)>]. So: two handlers were attached to the root logger.
Fixing it
You might be able to find that your code is attaching the handler somewhere else, and simply remove that part. But it may instead be the case that e.g. a dependency is setting up the extra handler, something I wrestled with.
I used a solution based on the answer written by Andy Carlson. Keeping it general/extensible:
import google.cloud.logging
import logging
def is_cloud_handler(handler: logging.Handler) -> bool:
    """
    is_cloud_handler

    Returns True or False depending on whether the input is a
    google-cloud-logging handler class
    """
    accepted_handlers = (
        google.cloud.logging.handlers.StructuredLogHandler,
        google.cloud.logging.handlers.CloudLoggingHandler,
        google.cloud.logging.handlers.ContainerEngineHandler,
        google.cloud.logging.handlers.AppEngineHandler,
    )
    return isinstance(handler, accepted_handlers)

def set_up_logging():
    # here we assume you'll be using the basic logging methods
    # logging.info, logging.warn etc. which invoke the root logger
    client = google.cloud.logging.Client()
    client.setup_logging()
    root_logger = logging.getLogger()
    root_logger.handlers = [h for h in root_logger.handlers if is_cloud_handler(h)]
More context
For those who find this solution confusing
In Python there is a separation between 'loggers' and 'handlers': loggers generate logs, and handlers decide what happens to them. Thus, you can attach multiple handlers to the same logger (in case you want multiple things to happen to the logs from that logger).
The google-cloud-logging library suggests that you run its setup_logging method and then just use the basic logging methods of the built in logging library to create your logs. These are: logging.debug, logging.info, logging.warning, logging.error, and logging.critical (in escalating order of urgency).
All logging.Logger instances have the same methods, including a special Logger instance called the root logger. If you look at the source code for the basic logging methods, they simply call these methods on this root logger.
It's possible to set up specific Loggers, which is standard practice to demarcate logs generated by different areas of an application (rather than sending everything via the root logger). This is done using logging.getLogger("name-of-logger"). However, logging.getLogger() with no argument returns the root logger.
Meanwhile, the purpose of the google.cloud.logging.Client.setup_logging method is to attach a special log handler to the root logger. Thus, logs created using logging.info etc. will be handled by a google-cloud-logging handler. But you have to make sure no other handlers are also attached to the root logger.
Fortunately, Loggers have a property, .handlers, which is a list of attached log handlers. In this solution we just edit that list to ensure we have just one handler.

Display warning in console output in Django App

I want to add a warning to an old function-based view alerting users that we are moving to class-based views. I see these warnings all the time in Django, such as RemovedInDjango19Warning. It's really cool.
import warnings

def old_view(request, *args, **kwargs):
    print('I can see this message in my terminal output!')
    warnings.warn("But not this message.", DeprecationWarning, stacklevel=2)
    view = NewClassBasedView.as_view()
    return view(request, *args, **kwargs)
I see the print statement in my terminal output but not the warning itself. I get other warnings from Django in the terminal (like the aforementioned RemovedInDjango19Warning), but not this one.
Am I missing something? Do I have to turn warnings on somehow at an import specific level (even though it is working broadly for Django)?
Assuming you are using Django 1.8 and you have no Logging configurations, you can read from the Django documentation:
Django uses Python’s builtin logging module to perform system logging. The usage of this module is discussed in detail in Python’s own documentation.
So you can do:
# import the logging library
import logging

# Get an instance of a logger
logger = logging.getLogger(__name__)

def my_view(request, arg1, arg):
    ...
    if bad_mojo:
        # Log an error message
        logger.error('Something went wrong!')
The logging module is very useful and has many features; I suggest you take a look at the Python logging documentation. You can, for example, do:
FORMAT = '%(asctime)-15s %(clientip)s %(user)-8s %(message)s'
logging.basicConfig(format=FORMAT)
d = {'clientip': '192.168.0.1', 'user': 'fbloggs'}
logger = logging.getLogger('tcpserver')
logger.warning('Protocol problem: %s', 'connection reset', extra=d)
That outputs to:
2006-02-08 22:20:02,165 192.168.0.1 fbloggs Protocol problem: connection reset
This lets you format all your messages and show pretty awesome messages to your users.
If you want to use the warnings module, you have to know that Python by default does not display all warnings raised in the application. There are two ways to change this behaviour:
By parameters: You have to call Python with the -Wd parameter to load the default filter, so you can run python -Wd manage.py runserver to start the development server.
By program: You need to call the warnings.simplefilter('default') function just once. You can call this function from anywhere, but you have to be sure that this line will be executed before any call to warnings.warn; in my tests I placed it at the beginning of the settings.py file, but I am not sure that was the best place. The __init__.py file of the project package would be a nice place, as shown in the sketch below.
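A minimal sketch of the programmatic option, placed for example in the project package's __init__.py as suggested above:
# e.g. in the project package's __init__.py, before any warnings.warn() call
import warnings

# the 'default' action prints each matching warning once per unique location
warnings.simplefilter('default')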
