Print message from inside AWS SageMaker endpoint - python

I've set up an endpoint for a model in AWS SageMaker. I built my own container and am trying to debug some path-related errors. To do this I need to print some information so I know what's going on.
Is there a way to print a message either to the terminal or to AWS CloudWatch from inside the endpoint?
Update:
I'm now trying to get the logger to work; I'm doing the following:
import logging
logger = logging.getLogger()
logger.setLevel(logging.DEBUG)
logger.debug("message")
Ideally I want this to show up in CloudWatch, but currently it does not.
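Since SageMaker forwards a container's stdout/stderr to CloudWatch (under the endpoint's log group, /aws/sagemaker/Endpoints/<endpoint-name>), one sketch is to attach a StreamHandler that writes to stdout; the handler and format below are just one way to do it, not anything SageMaker requires:
import logging
import sys

# SageMaker ships the container's stdout/stderr to CloudWatch,
# so a handler that writes to stdout is enough to make messages visible.
logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)
handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
logger.addHandler(handler)

logger.debug("debug message from inside the endpoint container")
Plain print(..., flush=True) statements end up in the same log stream, which is often enough for quick path debugging.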

Related

Logging on either GCP or local

Suppose there is a system that runs on GCP but, as a backup, can also be run locally.
When running on the cloud, Stackdriver is pretty straightforward.
However, I need my system to push to Stackdriver when on the cloud and, when not on the cloud, use the local Python logger.
I also don't want to include any logic to do so; this should be automatic.
When logging, log straight to the Python/local logger.
If on GCP -> also push these to Stackdriver.
I could write logic to implement this, but that is bad practice. Surely there is a direct way of getting this to work.
Example
import google.cloud.logging
import logging

# Attach the Stackdriver handler to the root logger.
client = google.cloud.logging.Client()
client.setup_logging()

# Also write records to a local file.
cl = logging.getLogger()
file_handler = logging.FileHandler('file.log')
cl.addHandler(file_handler)

logging.info("INFO!")
This basically logs to the Python logger and then 'always' uploads to the cloud logger. How can I avoid explicitly adding import google.cloud.logging, so that if Stackdriver is installed the logs go there directly? Is that even possible? If not, can someone explain how this would be handled from a best-practices perspective?
Attempt 1 [works]
Created /etc/google-fluentd/config.d/workflow_log.conf
<source>
  @type tail
  format none
  path /var/log/this_log.log
  pos_file /var/lib/google-fluentd/pos/this_log.pos
  read_from_head true
  tag workflow-log
</source>
Created /var/log/this_log.log
pos_file /var/lib/google-fluentd/pos/this_log.pos exists
import logging

cl = logging.getLogger()
cl.setLevel(logging.INFO)  # the root logger defaults to WARNING, so INFO records would otherwise be dropped
file_handler = logging.FileHandler('/var/log/this_log.log')
file_handler.setFormatter(logging.Formatter("%(asctime)s;%(levelname)s;%(message)s"))
cl.addHandler(file_handler)

logging.info("info_log")
logging.error("error_log")
This works! Look for your logs under the specific VM instance, not under global > python.
Fortunately, this is a story that is already handled. Stackdriver Logging is a very versatile logging framework. However, there are a variety of logging APIs, and Google's intent was not that you have to rewrite all your existing applications to leverage the native Stackdriver Logging APIs. Instead, you can use a logging API of your choice (including standard and de facto APIs), and these logging APIs will then map to Stackdriver. If the application runs outside a GCP environment, or you simply wish to switch to an alternate log collector, it does not have to be re-coded or recompiled.
A list of the logging APIs available for different languages can be found at Setting Up Language Runtimes and this includes Setting Up Stackdriver Logging for Python.
For Python, at runtime you have a configuration property (e.g. an environment variable) that declares whether or not you wish to use Stackdriver. If it is set to true, then ... and only then ... would you execute the logic that sets up the native Python logging for Stackdriver; otherwise that logic is not called and you have no dependency on Stackdriver.
A possible piece of code might be:
import os

if os.environ.get('USE_STACKDRIVER') == 'true':
    import google.cloud.logging
    client = google.cloud.logging.Client()
    client.setup_logging()
You do not need to specifically enable or use Stackdriver in your program. You can use the Python logger and write to any file you want. However, the Stackdriver Logging agent only tails specific log files, which means you need to manually configure it to pick up "your" log files.
In your example, you are writing to file.log. Modify /etc/google-fluentd/config.d/mylogfile.conf to include the following. You will need to specify the full path for file.log and not just the file name. In this example, I named it /var/log/mylogfile.log. This example also assumes that your logs start each line with a date.
<source>
  @type tail
  # Parse the timestamp, but still collect the entire line as 'message'
  format /^(?<message>(?<time>[^ ]*\s*[^ ]* [^ ]*) .*)$/
  path /var/log/mylogfile.log
  pos_file /var/lib/google-fluentd/pos/mylogfile.log.pos
  read_from_head true
  tag auth
</source>
For more information read the following document:
Stackdriver - Configuring the Agent
Now your program will run outside GCP, and when it runs on a configured instance it will also log to Stackdriver.
Note: I would do the opposite of what you have asked. I would always use Stackdriver. When not running in GCP, I would manually set up Stackdriver on my desktop, local server, etc., and continue to log to Stackdriver.
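A minimal sketch of that "always use Stackdriver" setup, assuming you have exported a service-account key for use outside GCP (the key path below is a placeholder):
import logging

import google.cloud.logging

# Outside GCP, authenticate with an exported service-account key file
# (the path below is a placeholder).
client = google.cloud.logging.Client.from_service_account_json('/path/to/key.json')
client.setup_logging()

# The standard logging module now also ships records to Stackdriver.
logging.info("logged from outside GCP")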

How to disable the python log `URL being requested` `Refreshing access_token`

I have Python code that downloads files from Google Drive using the googleapiclient API. The code uses a logger to print information, and I set the base log level to INFO. However, some unwanted logger output comes from the API calls. Specifically, the unwanted log lines are:
2019-02-12 03:52:21,269 INFO URL being requested:
2019-02-12 03:52:21,091 INFO Starting new HTTP connection
2019-02-12 03:52:19,691 INFO Attempting refresh to obtain initial access_token
From what I googled, logging.getLogger("requests").setLevel(logging.WARNING) seems to mute the Starting new HTTP connection messages. But how can I mute the other two?
Grepping through my virtual environment, I was able to see that it's the googleapiclient library that's creating the logger. You can mute those messages with
logging.getLogger("googleapiclient.discovery").setLevel(logging.WARNING)
Or at the module level
logging.getLogger("googleapiclient").setLevel(logging.WARNING)

Missing logs and delay in logs when using BlobStorageRotatingFileHandler

I have a Flask API hosted on Azure, and I am using the azure_storage_logging.handlers package to send API runtime logs to Azure Storage. I am using BlobStorageRotatingFileHandler for this.
I receive a few logs in my storage account. However, a huge number of logs are missing. My API is very CPU-intensive.
Please let me know how to solve this problem.
def fun_logging(id, logfilename, loggername):
    mystorageaccountname = STORAGE_ACCOUNT_NAME
    mystorageaccountkey = STORAGE_ACCOUNT_KEY
    mystoragecontainer = STORAGE_CONTAINER
    utctime = asctime(gmtime())

    logger = logging.getLogger(loggername)
    logger.setLevel(logging.DEBUG)

    log_formatter = logging.Formatter('%(utctime)s - %(id)s - %(levelname)s - %(message)s')
    azure_blob_handler = BlobStorageRotatingFileHandler(
        filename=logfilename,
        account_name=mystorageaccountname,
        account_key=mystorageaccountkey,
        delay=False,
        maxBytes=10000,
        container=mystoragecontainer)
    azure_blob_handler.setLevel(logging.DEBUG)
    azure_blob_handler.setFormatter(log_formatter)

    if logger.hasHandlers():
        logger.handlers.clear()
    logger.addHandler(azure_blob_handler)

    logger = logging.LoggerAdapter(logger, {'id': id, 'utctime': utctime})
    return logger

####### Calling the function
logger = fun_logging(id, 'Logs//xyz.log', 'xyz')
logger.info(Result.log)  ## the variable I am logging
I searched for the BlobStorageRotatingFileHandler you used and found the GitHub repo michiya/azure-storage-logging. I'm not sure whether that is what your Flask project uses. If so, for writing a huge number of logs I advise using TableStorageHandler instead of BlobStorageRotatingFileHandler.
After reviewing the code of this repo, the issue is that it uses a BlockBlob to store an entire log file at once, rather than appending each log record to an AppendBlob. So it is not suitable for your scenario with massive logs. Azure recommends using AppendBlob for logging with Blob Storage; see the quote below from the documentation.
Append blobs are used for logging, such as when you want to write to a file and then keep adding more information. Most objects stored in Blob storage are block blobs.
So you can try to use Append Blobs via the Azure Blob Storage SDK for Python to wrap a logging API yourself. Otherwise, Azure Table Storage is a good choice for logging. For recording massive numbers of logs, the best practice is to write logs to Event Hubs and then use other services, such as Stream Analytics, to filter and transfer the data into Blob Storage.
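As a rough sketch of the TableStorageHandler route (based on the same michiya/azure-storage-logging package; the account name, key, table name and logger name below are placeholders, and the exact constructor arguments should be checked against the version you have installed):
import logging

from azure_storage_logging.handlers import TableStorageHandler

# Send each log record as a row to an Azure Table instead of rewriting a blob.
table_handler = TableStorageHandler(
    account_name='mystorageaccountname',  # placeholder
    account_key='mystorageaccountkey',    # placeholder
    table='apilogs')                      # placeholder table name
table_handler.setLevel(logging.DEBUG)
table_handler.setFormatter(logging.Formatter('%(asctime)s - %(levelname)s - %(message)s'))

logger = logging.getLogger('xyz')
logger.setLevel(logging.DEBUG)
logger.addHandler(table_handler)

logger.info('logged to Azure Table Storage')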
Hope it helps.

GAE flexible and Stackdriver log severity levels with python

I'm running a Python 3 application in the Google App Engine flexible environment. I'm using the default Python root logger and would like to be able to view the log levels like in this image.
When deploying an application with the following log configuration in GAE using gunicorn:
import logging
logging.basicConfig(level=logging.INFO)
logging.warn('Warning level message')
All log entries show up under 'Any log level' in GAE, independent of their severity level.
I tried using the Stackdriver logging client for python:
import logging
from google.cloud import logging as gcloud_logging
client = gcloud_logging.Client()
client.setup_logging(logging.INFO)
logging.basicConfig(level=logging.INFO)
logging.warn('Warning level message')
When executing locally, this logs to Stackdriver (under 'global') with the log levels actually functioning properly. However, when I deploy the application to GAE, the log entries still show up under 'Any log level'. Also, the logging under 'global' does not seem to work.
Is there an easy way to get the log severity levels to work with the App Engine flexible environment and Python 3?
Have you tried integrating Stackdriver Logging with the Python logging module?
From https://gcloud-python.readthedocs.io/en/latest/logging/usage.html#integration-with-python-logging-module:
import logging
import google.cloud.logging

client = google.cloud.logging.Client()
handler = client.get_default_handler()
cloud_logger = logging.getLogger('cloudLogger')
cloud_logger.setLevel(logging.INFO)
cloud_logger.addHandler(handler)
cloud_logger.error('bad news')

Google Stackdriver logging in App Engine (Python) using Redis queue

I want to log to Stackdriver Logging from App Engine using a Redis queue. So I'm using a Redis server, Redis Queue (RQ) and Python logging to do this. Here's my code:
import logging
import time

from redis import Redis
from rq import Queue

class SomeClass():
    def log_using_redis(self, text):
        # Log the message and also append it to a local file.
        log_text = logging.warn(text)
        f = open("stack_log.txt", "a+")
        f.write(str(text))
        return "logged Successfully using redis"

    def get(self):
        text = 'Hello, Logged Successfully!' + time.strftime('%a, %d %b %Y %H:%M:%S %Z(%z)')
        redis_conn = Redis()
        q = Queue(connection=redis_conn)
        job = q.enqueue(self.log_using_redis, text)
        print(job.result)
When I run the RQ worker I get some output in the terminal, but I can't find where the logs are being stored.
If I log directly without using Redis, the logs are stored under Global in the Logging section of Google Cloud. The queue is working properly; to check, I've been appending the text to a file.
It seems the logging isn't working. If it is being logged, where can I find my logs on Google Cloud?
Taking into account that you are using the Python client library, use the print() function to obtain the desired results. I don't know whether you are testing the application locally or have deployed it.
If you are testing the application locally: print() function output can be found in the Cloud Shell.
If you have deployed the application: go to the GCP console, App Engine, Services. Select the service in which you have deployed your application. On the right side, click on Tools and select "Logs". This will redirect the page to your app logs.
More precise logging can be set up using Stackdriver Logging for Python, where the severity level can be defined. This can help you manage your application or identify events of interest. Find example code here.
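A minimal sketch of that approach for the queued job, assuming the RQ worker process has GCP credentials available (the function below is a simplified, illustrative version of the question's log_using_redis):
import logging

import google.cloud.logging

# Run once in the worker process so queued jobs ship their records to Stackdriver.
client = google.cloud.logging.Client()
client.setup_logging(log_level=logging.INFO)

def log_using_redis(text):
    # This record is handled by the Stackdriver handler attached above.
    logging.warning(text)
    return "logged successfully using redis"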
You might find the Stackdriver Logging agent useful: an application based on fluentd that runs on your virtual machine (VM) instances. The Logging agent is pre-configured to send logs from VM instances to Stackdriver Logging, and there are source and configuration files available for Redis.
If you want a more general overview, the official App Engine flexible environment logs documentation can help you understand the different logs that are available.
