Python logging.getLogger not working in AWS Glue python shell job - python

I am trying to set up a logger for my AWS Glue job using Python's logging module. The Glue job type is "Python shell" and it uses Python 3.
Logging works fine if I instantiate the logger without a name, but if I give the logger a name it no longer works, and I get an error that says: Log stream not found.
I have the following code in an example Glue job:
import sys
import logging

# Version 1 - this works fine
logger = logging.getLogger()
log_format = "[%(asctime)s] %(levelname)-8s %(message)s"

# Version 2 - this fails
logger = logging.getLogger(name="foobar")
log_format = "[%(name)s] %(asctime)s %(levelname)-8s %(message)s"

date_format = "%a, %d %b %Y %H:%M:%S %Z"
log_stream = sys.stdout

# Remove any pre-existing handlers so basicConfig takes effect
if logger.handlers:
    for handler in logger.handlers:
        logger.removeHandler(handler)

logging.basicConfig(level=logging.INFO, format=log_format,
                    stream=log_stream, datefmt=date_format)

logger.info("This is a test.")
Note that I'm removing the handlers based on this post.
If I instantiate the logger using Version 1 of the code, it runs successfully and I am able to view the logs, as well as query them in CloudWatch.
If I run Version 2, giving the logger a name, the Glue job still succeeds. However, if I try to view the logs, I get the following error message:
Log stream not found
The log stream jr_f137743545d3d242618ac95d859b9146fd15d15a0aadce64d8f3ba991ffed012 could not be found. Check if it was correctly created and retry.
And I am also not able to query these logs in CloudWatch.
I have tried running this code locally using Python 3.6.0, and both versions work. Additionally, both versions of this logging code work inside a Lambda function. They only fail in Glue.

This code worked for me:
import sys
import logging

root = logging.getLogger()
root.setLevel(logging.DEBUG)

handler = logging.StreamHandler(sys.stdout)
handler.setLevel(logging.DEBUG)
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
handler.setFormatter(formatter)
root.addHandler(handler)

root.info("check")

I had a similar issue but fixed it with a combination of correct roles and looking in the right place in CloudWatch. Make sure you're using the GlueServiceRole. Both Steve's and your logging code are fine, but the place you are taken to in CloudWatch when you click the "Logs" button in Glue isn't the correct log group.
Go back into Log groups, then into /aws-glue/python-jobs/error; that is where the logger writes to, while stdout writes to /aws-glue/python-jobs/output. It's not a very intuitive setup, writing the logs to the error log group, but hey ho, I'm sure there is a way of configuring it to write where expected.
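If you want a named logger's messages to land in the /aws-glue/python-jobs/output stream alongside print output, one option (a sketch based only on the observation above that stdout ends up in the output folder, not on Glue-specific documentation) is to attach a stdout handler to that logger yourself:
import sys
import logging

logger = logging.getLogger("foobar")
logger.setLevel(logging.INFO)

# Send this logger's records to stdout so they land in
# /aws-glue/python-jobs/output rather than the error stream.
handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(logging.Formatter("[%(name)s] %(asctime)s %(levelname)-8s %(message)s"))
logger.addHandler(handler)
logger.propagate = False  # avoid duplicate records via the root logger

logger.info("This should appear in the output log stream.")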

You should be able to name the log stream by using the following (replace "logger-name-here" with your desired log stream name):
import logging

MSG_FORMAT = '%(asctime)s %(levelname)s %(name)s: %(message)s'
DATETIME_FORMAT = '%Y-%m-%d %H:%M:%S'

logging.basicConfig(format=MSG_FORMAT, datefmt=DATETIME_FORMAT)
logger = logging.getLogger("logger-name-here")  # replace with your desired log stream name
logger.setLevel(logging.INFO)

logger.info("Test log message")

Related

Colored logs in a .log file in Python: not working

I wanted to have colored log files for one of my programs, so I tried the colored logging from one of the answers to a Stack Overflow question. It works great in the Python terminal, but the log file shows strange characters and no color, as shown below.
In Terminal: (screenshot of the colored output)
In Log File: (screenshot of the strange characters, with no color)
I have tried many ways, but I still get the same result. Can anyone help me? Below is my code.
import coloredlogs
import logging
# Create a logger object.
logger = logging.getLogger(__name__)
# Create a filehandler object
fh = logging.FileHandler('spam.log')
fh.setLevel(logging.DEBUG)
# Create a ColoredFormatter to use as formatter for the FileHandler
formatter = coloredlogs.ColoredFormatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
fh.setFormatter(formatter)
logger.addHandler(fh)
# Install the coloredlogs module on the root logger
coloredlogs.install(level='DEBUG')
logger.debug("this is a debugging message")
logger.info("this is an informational message")
logger.warning("this is a warning message")
logger.error("this is an error message")
logger.critical("this is a critical message")
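For context, the strange characters in the file are most likely the ANSI colour escape codes that ColoredFormatter embeds in each record; a plain text file has no way to render them. A minimal sketch of one workaround, assuming you only need colour on the console, is to give the FileHandler a plain Formatter and keep coloredlogs for the console:
import logging
import coloredlogs

logger = logging.getLogger(__name__)

# Plain formatter for the file, so no ANSI escape sequences end up in spam.log
fh = logging.FileHandler('spam.log')
fh.setLevel(logging.DEBUG)
fh.setFormatter(logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s'))
logger.addHandler(fh)

# Colored output on the console only
coloredlogs.install(level='DEBUG')

logger.debug("this is a debugging message")
logger.info("this is an informational message")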

Python's logging with severity level is different from Heroku console's severity level

I'm trying to implement logging with severity levels in my Python (FastAPI) web app and deploy it on Heroku. However, with Python's logging module configured as below:
import logging
import sys

logging.basicConfig(
    format="[%(asctime)s] %(filename)s:%(lineno)-5d %(levelname)-8s: %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S",
    level=logging.INFO,
    handlers=[
        logging.StreamHandler(sys.stdout)
    ]
)
logger = logging.getLogger("plant-pot-backend")
@router.post('/add')
async def create(new_pot: PotHttpReq):
    try:
        pot_id = new_pot.id
        new_pot = new_pot_registration(pot_id)
        pots_collection.document(pot_id).set(new_pot.dict())
        logger.warning("New pot added: {}".format(pot_id))
        return {"success": True}
    except Exception as e:
        print(e)
        return f"An Error Occurred: {e}"
Specifically, I want to get the same type of logging format as shown in the output line with the green INFO, so that I can also filter by severity level through log drains like Papertrail.
However, even with the Python logging config above, the stdout output is still categorized under INFO even though it is a WARNING log, as seen in the picture.
If everything ends up as an INFO-type log in the Heroku console and logs, I can't filter it in log drains.
Any help on this?
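One thing that may be worth trying (an assumption about how Heroku and log drains classify output, not something confirmed in this thread): send WARNING and above to stderr instead of stdout, since the two streams are sometimes treated with different severities. A minimal sketch of level-based stream routing in plain Python logging:
import logging
import sys

class MaxLevelFilter(logging.Filter):
    """Only pass records strictly below a given level."""
    def __init__(self, max_level):
        super().__init__()
        self.max_level = max_level

    def filter(self, record):
        return record.levelno < self.max_level

stdout_handler = logging.StreamHandler(sys.stdout)
stdout_handler.addFilter(MaxLevelFilter(logging.WARNING))  # DEBUG and INFO only

stderr_handler = logging.StreamHandler(sys.stderr)
stderr_handler.setLevel(logging.WARNING)                   # WARNING and above

logging.basicConfig(
    format="[%(asctime)s] %(levelname)-8s: %(message)s",
    level=logging.INFO,
    handlers=[stdout_handler, stderr_handler],
)

logging.getLogger("plant-pot-backend").warning("routed to stderr")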

Why does my Kedro logging file stay empty? Am I missing a step?

I am using Kedro, but I can't get my log file to be written to. I am following the tutorial. The log file was created but is still empty.
Steps done:
Configured logging
import logging
from logging.handlers import TimedRotatingFileHandler

class ProjectContext(KedroContext):
    def _setup_logging(self) -> None:
        log = logging.getLogger(__name__)
        handler = TimedRotatingFileHandler(filename='logs/mypipeline.log', when='d', interval=1)
        f_format = logging.Formatter('%(asctime)s %(levelname)s %(funcName)s %(lineno)d %(message)s')
        handler.setFormatter(f_format)
        log.addHandler(handler)
        log.setLevel(logging.DEBUG)
Use logging (in my nodes.py file)
import logging
log = logging.getLogger(__name__)
log.warning("Issue warning")
log.info("Send information")
And after running the pipeline, the log file is created but stays empty.
Any advice?
OK, problem solved! I was missing the logger definition in the logging.yml file! Thank you guys for your support!
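For anyone hitting the same thing: Kedro applies conf/base/logging.yml through Python's standard dictConfig, so the logger name defined there has to match (or be an ancestor of) the __name__ used in nodes.py, otherwise the records never reach your handler. The names below (the module path "my_project.nodes" and the handler name) are hypothetical placeholders; a plain-Python sketch of what the equivalent configuration amounts to:
import logging.config

logging.config.dictConfig({
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "pipeline": {"format": "%(asctime)s %(levelname)s %(funcName)s %(lineno)d %(message)s"},
    },
    "handlers": {
        "pipeline_file": {
            "class": "logging.handlers.TimedRotatingFileHandler",
            "filename": "logs/mypipeline.log",
            "when": "d",
            "interval": 1,
            "formatter": "pipeline",
        },
    },
    "loggers": {
        # Must match the logger name used in nodes.py, i.e. its __name__
        "my_project.nodes": {
            "level": "INFO",
            "handlers": ["pipeline_file"],
            "propagate": False,
        },
    },
})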

Prevent Python logger from printing to console

I'm getting mad at Python's logging module, because I really have no idea anymore why the logger is printing the log messages to the console (at DEBUG level, even though I set my FileHandler to INFO). The log file is produced correctly.
But I don't want any logger information on the console.
Here is my configuration for the logger:
import logging

template_name = "testing"
fh = logging.FileHandler(filename="testing.log")
fr = logging.Formatter("%(asctime)s,%(msecs)d %(name)s %(levelname)s %(message)s")
fh.setFormatter(fr)
fh.setLevel(logging.INFO)
logger = logging.getLogger(template_name)
# logger.propagate = False # this helps nothing
logger.addHandler(fh)
Would be nice if anybody could help me out :)
I found this question because I ran into a similar issue: after I removed logging.basicConfig(), it started printing all logs to the console.
In my case, I needed to change the filename on every run in order to save the log files to different directories. Adding the basic config at the top (even if the initial filename is never used) solved the issue for me (I can't explain why, sorry).
What helped me was adding the basic configuration in the beginning:
logging.basicConfig(filename=filename,
                    format='%(levelname)s - %(asctime)s - %(name)s - %(message)s',
                    filemode='w',
                    level=logging.INFO)
Then changing the filename by adding a handler in each run:
file_handler = logging.FileHandler(path_log + f'/log_run_{c_run}.log')
formatter = logging.Formatter('%(asctime)s : %(levelname)s : %(name)s : %(message)s')
file_handler.setFormatter(formatter)
logger_TS.addHandler(file_handler)
Also, a curious side note: if I don't set the formatter (file_handler.setFormatter(formatter)) before adding the handler with the new filename, the initially configured formatting (levelname, time, etc.) is missing from the log files.
So the key is to call logging.basicConfig first, then add the handler, as @StressedBoi69420 indicated.
Hope that helps a bit.
You should be able to add a StreamHandler for stdout and set that handler's log level to a value above 50 (the standard log levels are 50 and below).
Example of how I'd do it...
import logging
import sys

console_log_level = 100

logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s,%(msecs)d %(name)s %(levelname)s %(message)s",
                    filename="testing.log",
                    filemode="w")

console = logging.StreamHandler(sys.stdout)
console.setLevel(console_log_level)

root_logger = logging.getLogger("")
root_logger.addHandler(console)

logging.debug("debug log message")
logging.info("info log message")
logging.warning("warning log message")
logging.error("error log message")
logging.critical("critical log message")
Contents of testing.log...
2019-11-21 12:53:02,426,426 root INFO info log message
2019-11-21 12:53:02,426,426 root WARNING warning log message
2019-11-21 12:53:02,426,426 root ERROR error log message
2019-11-21 12:53:02,426,426 root CRITICAL critical log message
Note: the only reason I have the console_log_level variable is that I pulled most of this code from a helper function I use that sets the console log level based on an argument value. That way, if I want to make the script "quiet", I can change the console log level via a command-line argument to the script.
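As a rough sketch of that pattern (the flag name and default below are made up for illustration, not taken from the original answer):
import argparse
import logging
import sys

parser = argparse.ArgumentParser()
# Hypothetical flag: pass a large value (e.g. 100) to silence console output
parser.add_argument("--console-log-level", type=int, default=logging.WARNING)
args = parser.parse_args()

logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s %(name)s %(levelname)s %(message)s",
                    filename="testing.log",
                    filemode="w")

console = logging.StreamHandler(sys.stdout)
console.setLevel(args.console_log_level)
logging.getLogger("").addHandler(console)

logging.info("always in the file, on the console only if the level allows it")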

Python logging caches?

I have a function in my Python package which returns a logger:
import logging

def get_logger(logger_name, log_level='DEBUG'):
    """Setup and return a logger."""
    log = logging.getLogger(logger_name)
    log.setLevel(log_level)
    formatter = logging.Formatter('[%(levelname)s] %(message)s')
    handler = logging.StreamHandler()
    handler.setFormatter(formatter)
    log.addHandler(handler)
    return log
I use this logger in my modules and submodules:
from tgtk.logger import get_logger
log = get_logger(__name__, 'DEBUG')
so I can use it via
log.debug("Debug message")
log.info("Info message")
etc. This has worked perfectly so far, but today I encountered a weird problem: on one machine there is no output at all. Every log.xxx call is simply "ignored", whatever level is set. I have had similar issues in the past, and I remember that it somehow started working again after I renamed the logger, but this time that does not help.
Is there some caching going on, or what is happening? The scripts are exactly the same (synced over SVN).
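There is no caching as such, but logging.getLogger(name) does return the same logger object every time it is called with the same name, so repeated calls to get_logger() keep stacking additional handlers on it. That does not by itself explain the silence on one machine (something like a logging.disable() call or a different effective level elsewhere in the process could), but guarding against duplicate handlers makes the behaviour easier to reason about. A minimal sketch of such a guard, as an adjustment to the function above:
import logging

def get_logger(logger_name, log_level='DEBUG'):
    """Set up and return a logger, adding the handler only once."""
    log = logging.getLogger(logger_name)
    log.setLevel(log_level)
    if not log.handlers:  # same name -> same logger object, so don't re-add handlers
        formatter = logging.Formatter('[%(levelname)s] %(message)s')
        handler = logging.StreamHandler()
        handler.setFormatter(formatter)
        log.addHandler(handler)
    return log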
