Greengrass V2 generates extra `stdout` logs - python

I deploy Greengrass components into my EC2 instance. The deployed Greengrass components have been generating logs that wrap around my Python log.
What is causing the "wrapping" around it? How can I remove these wraps?
For example, in the log below, the parts in bold wrap the original Python log.
The part in emphasis is generated by my Python log formatter.
2022-12-13T23:59:56.926Z [INFO] (Copier) com.bolt-data.iot.RulesEngineCore: stdout. [2022-12-13 23:59:56,925][DEBUG ][iot-ipc] checking redis pub-sub health (io_thread[140047617824320]:pub_sub_redis:_connection_health_nanny_task:61).
{scriptName=services.com.bolt-data.iot.RulesEngineCore.lifecycle.Run, serviceName=com.bolt-data.iot.RulesEngineCore, currentState=RUNNING}
The following is my python log formatter.
import logging
from sys import stdout

formatter = logging.Formatter(
    fmt="[%(asctime)s][%(levelname)-7s][%(name)s] %(message)s (%(threadName)s[%(thread)d]:%(module)s:%(funcName)s:%(lineno)d)"
)
# TODO: when we're running in the lambda function, don't stream to stdout
_handler = logging.StreamHandler(stream=stdout)
_handler.setLevel(get_level_from_environment())
_handler.setFormatter(formatter)

By default, Greengrass Nucleus captures the stdout and stderr streams of the processes it manages, including custom components. It then outputs each line of the logs with the prefix and suffix you have highlighted in bold. This cannot be changed. You can, however, switch the log format from TEXT to JSON, which can make the logs easier for a machine to parse (see the `logging.format` option in the Greengrass Nucleus component configuration).
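For reference, the format switch is a configuration merge on the `aws.greengrass.Nucleus` component in your deployment; a minimal sketch of the merged configuration document (check the Nucleus documentation for your version):

```json
{
  "logging": {
    "format": "JSON"
  }
}
```

With JSON format each log line becomes a structured record, so the prefix/suffix still exist but as named fields you can filter on.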
import logging
from logging.handlers import RotatingFileHandler

logger = logging.getLogger(__name__)
_handler = RotatingFileHandler("mylog.log")
# additional _handler configuration
logger.addHandler(_handler)
If you want to output a log containing only what your application generates, change the logger configuration to write to a file, as above. You can write the file in the component's work folder or in another location, such as /var/log.

Related

Save console output in txt file as it happens

I want to save my console output in a text file, but I want it to be as it happens so that if the program crashes, logs will be saved.
Do you have some ideas?
I can't just specify a file in the logger because I have a lot of different loggers that are printing to the console.
I think you can indeed use a logger by just adding a file handler; see the documentation for the logging module.
As an example, you can use something like this, which logs both to the terminal and to a file:
import logging
from pathlib import Path

root_path = <YOUR PATH>
log_level = logging.DEBUG

# Print to the terminal
logging.root.setLevel(log_level)
formatter = logging.Formatter("%(asctime)s | %(levelname)s | %(message)s", "%Y-%m-%d %H:%M:%S")
stream = logging.StreamHandler()
stream.setLevel(log_level)
stream.setFormatter(formatter)
log = logging.getLogger("pythonConfig")
if not log.hasHandlers():
    log.setLevel(log_level)
    log.addHandler(stream)

# file handler:
file_handler = logging.FileHandler(Path(root_path / "process.log"), mode="w")
file_handler.setLevel(log_level)
file_handler.setFormatter(formatter)
log.addHandler(file_handler)
log.info("test")
If you have multiple loggers, you can still use this solution, since loggers can inherit from each other: just put the handler on the root logger and make sure the other loggers propagate to it.
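A minimal sketch of that root-logger approach (the logger names here are illustrative): attach the handler once to the root logger, and records from every child logger propagate up to it by default.

```python
import logging

# Attach a single handler to the root logger.
root_handler = logging.StreamHandler()
root_handler.setFormatter(logging.Formatter("%(name)s | %(levelname)s | %(message)s"))
logging.getLogger().addHandler(root_handler)
logging.getLogger().setLevel(logging.INFO)

# Child loggers need no handlers of their own: their records
# propagate up to the root logger's handler automatically.
logging.getLogger("app.db").info("db connected")
logging.getLogger("app.web").warning("slow request")
```

Propagation is on by default, so this works without touching the existing loggers; only a logger that sets `propagate = False` would opt out.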
As an alternative you can use nohup command which will keep the process running even if the terminal closes and will return the outputs to the desired location:
nohup python main.py > log_file.out &
There are many ways to do this, but they are not all equally suitable (maintainability, ease of use, reinventing the wheel, etc.).
If you don't mind using your operating system's built-ins you can:
forward the standard output and error streams to a file of your choice with python3 -u ./myscript.py > outputfile.txt 2>&1.
forward the standard output and error streams to a file of your choice AND display them in the console too with python3 -u ./myscript.py 2>&1 | tee outputfile.txt. The -u option makes the output unbuffered (i.e. what's put in the pipe goes out immediately).
If you want to do it from the Python side you can:
use the logging module to output the generated logs to a file handle instead of the standard output.
override the stdout and stderr streams defined in sys (sys.stdout and sys.stderr) so that they point to an opened file handle of your choice. For instance sys.stdout = open("log-stdout.txt", "w").
As a personal preference: the simpler, the better. The logging module is made for the purpose of logging and provides all the necessary mechanisms to achieve what you want, so I would suggest you stick with it. The logging module documentation also provides many examples, from simple use to more complex and advanced use.
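A short sketch of the stream-override option from the bullets above (the file name is illustrative); the standard library's contextlib.redirect_stdout does the same thing temporarily and restores the stream afterwards:

```python
from contextlib import redirect_stdout

# Permanent override, as described above:
#   import sys
#   sys.stdout = open("log-stdout.txt", "w")

# Scoped override: stdout points at the file only inside the block.
with open("log-stdout.txt", "w") as f:
    with redirect_stdout(f):
        print("this line goes to the file")
print("this line goes to the console again")
```

The scoped form is usually safer: if the program crashes inside the block, the file is still flushed and closed, and stdout is restored for any crash output.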

Unable to use logging library for kafka logs

I'm relatively new to confluent-kafka python and I am having an hard time figuring out how to send the logs from the kafka library to the program logger.
I am using a SerializingProducer and, according to the docs here: https://docs.confluent.io/platform/current/clients/confluent-kafka-python/html/index.html#serde-producer I should be able to pass a logger as config entry.
Something like:
logger = logging.getLogger()
producer_config = {
"bootstrap.servers" : "https....",
...,
"logger" : logger
}
However, this does not seem to work as intended. The only log visible is the one produced directly by the program. Removing the line
"logger" : logger
allows the kafka library to correctly show the logs in the console.
My producer looks as follows:
log = logging.getLogger(__name__)

class MyProducer(confluent_kafka.SerializingProducer):
    def __init__(self, topic, producer_config, source_folder) -> None:
        producer_config["logger"] = log
        super(MyProducer, self).__init__(producer_config)
        log.info(f"Producer initialized with the following conf:\n{producer_config}")
Any idea on why it happens and how to fix this?
Other information available:
kafka library logs to stderr
passing a Handler directly rather than a Logger does not fix the problem; instead it duplicates the program's output
Thanks in advance.

Logging works locally but not on remote

I set up a basic Python logger that writes to a log file and to stdout. When I run my program locally, log messages from logging.info appear as expected in the file and in the console. However, when I run the same program remotely via ssh -n user@server python main.py, neither the console nor the file shows any logging.info messages.
This is the code used to set up the logger:
def setup_logging(model_type, dataset):
    file_name = dataset + "_" + model_type + time.strftime("_%Y%m%d_%H%M")
    logging.basicConfig(
        level=logging.INFO,
        format="[%(levelname)-5.5s %(asctime)s] %(message)s",
        datefmt='%H:%M:%S',
        handlers=[
            logging.FileHandler("log/{0}.log".format(file_name)),
            logging.StreamHandler()
        ])
I already tried the following things:
Sending a message to logging.warning: those appear as expected on the root logger. However, even without setting up the logger and falling back to the defaults, logging.info messages do not show up.
The file and folder permissions seem to be alright, and an empty file is created on disk.
Using print works as usual as well.
If you look into the source code of the basicConfig function, you will see that the handlers are applied only when there are no handlers on the root logger:
_acquireLock()
try:
    force = kwargs.pop('force', False)
    if force:
        for h in root.handlers[:]:
            root.removeHandler(h)
            h.close()
    if len(root.handlers) == 0:
        handlers = kwargs.pop("handlers", None)
        if handlers is None:
            ...
I think one of the libraries you use configures logging on import. As you can see from the snippet above, one solution is to pass the force=True argument.
A possible disadvantage is that several popular data-science libraries keep a reference to the loggers they configure, so when you reconfigure logging yourself, their old loggers still hold the old handlers and do not see your changes. In that case you will also need to clear the handlers on those loggers.

robot framework info message not printing

My Python code generates a log file using the logging framework, and all INFO messages are captured in the log file. I integrated my program with Robot Framework, and now the log file is not generated; instead, the INFO messages are printed in log.html. I understand this is because Robot's existing logger is being called, so INFO messages are directed to log.html. I don't want that behavior: I still want the user-defined log file to be generated separately with just the INFO-level messages.
How can I achieve this?
Python Code --> Logging Library --> "Log File"
RobotFramework --> Python Code --> Logging Library --> by default "log.html"
When you run your Python code directly, it lets you set the log file name.
But when you run it under Robot Framework, the output file is by default log.html (Robot internally uses the same logging library you are using), so your logging configuration is overridden by Robot Framework's.
That is why you see the messages in log.html instead of your file.
You can also refer Robot Framework not creating file or writing to it
Hope it helps!
The issue has been fixed now; it was a very minor one. I am still analyzing it more deeply and will update when I am clear on the exact cause.
This was the module that I used:
def call_logger(logger_name, logFile):
    level = logging.INFO
    l = logging.getLogger(logger_name)
    if not getattr(l, 'handler_set', None):
        formatter = logging.Formatter('%(asctime)s : %(message)s')
        fileHandler = logging.FileHandler(logFile, mode='a')
        fileHandler.setFormatter(formatter)
        streamHandler = logging.StreamHandler()
        streamHandler.setFormatter(formatter)
        l.setLevel(level)
        l.addHandler(fileHandler)
        l.addHandler(streamHandler)
        l.handler_set = True
When I changed the parameter name "logFile" to "log_file", it worked.
It looks like "logFile" was colliding with a built-in Robot keyword.

Python Logging Start-up (verbose) to memory, then dumping subsets of that based on configuration settings: Any example?

I'm writing a program with some fairly complicated configuration settings (with options from the environment, from the command line, and from a couple of possible configuration files). I'd like to enable very verbose "debug" level logging to memory during all this configuration reading and initialization --- and then I'd like to selectively dump some, possibly all, of that into their final logging destinations based on some of those configuration settings, command line switches and environment values.
Does anyone here know of a good open source example I could look at where someone's already done this sort of thing?
It looks like the Logging: MemoryHandler should be able to do this ... initially with a target=None and then calling .setTarget() method after the configuration is parsed. But the question is, can I set the MemoryHandler loglevel to DEBUG, then set a target with a different (usually less verbose) loglevel, and then simply .flush() the MemoryHandler into this other handler (effectively throwing away all of the overly verbose entries in the process)?
That won't quite work, because MemoryHandler's flush() method doesn't check record levels before sending them to the target; all buffered records are sent. However, you could use a filter on the target handler, as in this example:
import logging, logging.handlers

class WarningFilter(logging.Filter):
    def filter(self, record):
        return record.levelno >= logging.WARNING

logger = logging.getLogger('foo')
mh = logging.handlers.MemoryHandler(1000)
logger.setLevel(logging.DEBUG)
logger.addHandler(mh)
logger.debug('bar')
logger.warning('baz')

sh = logging.StreamHandler()
sh.setLevel(logging.WARNING)
sh.addFilter(WarningFilter())
mh.setTarget(sh)
mh.flush()
When run, you should just see baz printed.
