Hi, I have set up Airflow locally using docker-compose. I am on a Mac and the Airflow Docker image is apache/airflow:2.1.0. The task logs contain a lot of asterisks, as shown below. I need help rectifying it; I searched a lot but could not find anything.
*** Reading local file: /opt/airflow/logs/bi_sf_snowflake/git_clone/2021-07-09T11:41:24.189880+00:00/1.log
[2021-07-09 11:41:28,633] {logging_mixin.py:104} WARNING - ***-***-***-*** ***L***o***g***g***i***n***g*** ***e***r***r***o***r*** ***-***-***-***
[2021-07-09 11:41:28,634] {logging_mixin.py:104} WARNING - ***T***r***a***c***e***b***a***c***k*** ***(***m***o***s***t*** ***r***e***c***e***n***t*** ***c***a***l***l*** ***l***a***s***t***)***:***
[2021-07-09 11:41:28,635] {logging_mixin.py:104} WARNING - *** *** ***F***i***l***e*** ***"***/***u***s***r***/***l***o***c***a***l***/***l***i***b***/***p***y***t***h***o***n***3***.***6***/***l***o***g***g***i***n***g***/***_***_***i***n***i***t***_***_***.***p***y***"***,*** ***l***i***n***e*** ***9***9***4***,*** ***i***n*** ***e***m***i***t***
The docker-compose file starts as below -
version: '3'
x-airflow-common:
  &airflow-common
  image: ${AIRFLOW_IMAGE_NAME:-mc-airflow:Dockerfile}
  environment:
    &airflow-common-env
    AIRFLOW__CORE__EXECUTOR: CeleryExecutor
    AIRFLOW__CORE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow
    AIRFLOW__CELERY__RESULT_BACKEND: db+postgresql://airflow:airflow@postgres/airflow
    AIRFLOW__CELERY__BROKER_URL: redis://:@redis:6379/0
    AIRFLOW__CORE__FERNET_KEY: ''
    AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION: 'true'
    AIRFLOW__CORE__LOAD_EXAMPLES: 'true'
    AIRFLOW__API__AUTH_BACKEND: 'airflow.api.auth.backend.basic_auth'
  volumes:
    - ./dags/:/opt/airflow/dags
    - ./logs:/opt/airflow/logs
    - ./plugins:/opt/airflow/plugins
  user: "${AIRFLOW_UID:-50000}:${AIRFLOW_GID:-50000}"
  depends_on:
    redis:
      condition: service_healthy
    postgres:
      condition: service_healthy
  command: bash -c pip
The Airflow config has the following log settings:
logging_level = INFO
# Logging level for Flask-appbuilder UI.
#
# Supported values: ``CRITICAL``, ``ERROR``, ``WARNING``, ``INFO``, ``DEBUG``.
fab_logging_level = WARN
# Logging class
# Specify the class that will specify the logging configuration
# This class has to be on the python classpath
# Example: logging_config_class = my.path.default_local_settings.LOGGING_CONFIG
logging_config_class =
# Flag to enable/disable Colored logs in Console
# Colour the logs when the controlling terminal is a TTY.
colored_console_log = False
# Log format for when Colored logs is enabled
colored_log_format = [%%(blue)s%%(asctime)s%%(reset)s] {%%(blue)s%%(filename)s:%%(reset)s%%(lineno)d} %%(log_color)s%%(levelname)s%%(reset)s - %%(log_color)s%%(message)s%%(reset)s
colored_formatter_class = airflow.utils.log.colored_log.CustomTTYColoredFormatter
# Format of Log line
log_format = [%%(asctime)s] {%%(filename)s:%%(lineno)d} %%(levelname)s - %%(message)s
simple_log_format = %%(asctime)s %%(levelname)s - %%(message)s
# Specify prefix pattern like mentioned below with stream handler TaskHandlerWithCustomFormatter
# Example: task_log_prefix_template = {ti.dag_id}-{ti.task_id}-{execution_date}-{try_number}
task_log_prefix_template =
# Formatting for how airflow generates file names/paths for each task run.
log_filename_template = {{ ti.dag_id }}/{{ ti.task_id }}/{{ ts }}/{{ try_number }}.log
# Formatting for how airflow generates file names for log
log_processor_filename_template = {{ filename }}.log
# full path of dag_processor_manager logfile
dag_processor_manager_log_location = /opt/airflow/logs/dag_processor_manager/dag_processor_manager.log
# Name of handler to read task instance logs.
# Defaults to use ``task`` handler.
task_log_reader = task
# A comma-separated list of third-party logger names that will be configured to print messages to
# consoles.
# Example: extra_loggers = connexion,sqlalchemy
extra_loggers =
This issue has already been fixed in Airflow 2.1.1. It was caused by the secrets masker, which incorrectly masked the empty string when a connection had an empty password: masking "no character" effectively inserts *** between every character, which is exactly the pattern in the log above.
The ways to fix it:
migrate to the latest released Airflow (best)
disable secrets masking (a new feature in Airflow 2.1.0); see the snippet below
find the connection that has an empty password and set it to some non-empty password (usually the password value is not used for those connections, so you can set it to any random string of characters)
The issue in question is here: https://github.com/apache/airflow/issues/16007
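If you go with disabling the masking, it is controlled by the hide_sensitive_var_conn_fields option in the [core] section, which was introduced in 2.1.0 together with the masker. A minimal sketch of turning it off through the compose file's common environment block; double-check the option name against the docs for your exact Airflow version:
  environment:
    &airflow-common-env
    AIRFLOW__CORE__EXECUTOR: CeleryExecutor
    # ... existing variables unchanged ...
    # assumption: setting this to 'false' disables the secrets masker that is mangling the task logs
    AIRFLOW__CORE__HIDE_SENSITIVE_VAR_CONN_FIELDS: 'false'
Keep in mind this also turns off masking of genuinely sensitive values, so upgrading to 2.1.1+ remains the preferable fix.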
Related
I have an application which I run with docker-compose. Now I'd like to add logging to my application, so I added this image to my compose file:
syslog-ng:
  image: lscr.io/linuxserver/syslog-ng:latest
  container_name: syslog-ng
  environment:
    - PUID=1000
    - PGID=1000
    - TZ=Europe/London
  volumes:
    - ./syslog-ng.conf:/config/syslog-ng.conf
    - /var/log/test_logs:/var/log
  ports:
    - 514:5514/udp
    - 601:6601/tcp
    - 6514:6514/tcp
  restart: unless-stopped
The syslog-ng.conf is located in the root of my repo, just like the docker-compose.yaml. These are the contents of the conf file; it is copied from here:
#############################################################################
# Default syslog-ng.conf file which collects all local logs into a
# single file called /var/log/booth_logs tailored to container usage.
#version: 3.35
#include "scl.conf"
source s_local {
    internal();
};
source s_network_tcp {
    syslog(transport(tcp) port(6601));
};
source s_network_udp {
    syslog(transport(udp) port(5514));
};
destination d_local {
    file("/var/log/test_logs");
    file("/var/log/test_logs-kv.log" template("$ISODATE $HOST $(format-welf --scope all-nv-pairs)\n") frac-digits(3));
};
log {
    source(s_local);
    source(s_network_tcp);
    source(s_network_udp);
    destination(d_local);
};
This is the service that contains the file test_script.py, from which I try to send the logs:
test_service:
  image: "another_service:latest"
  depends_on:
    - another_service
    - syslog-ng
  container_name: another_service
  volumes:
    - ./syslog-ng.conf:/config/syslog-ng.conf
  command: python app/test_script.py
And this is how I've been trying to test this in `test_script.py`:
import logging
import logging.handlers
# configure the syslog handler
syslog = logging.handlers.SysLogHandler(address=('syslog-ng', 514))
# create a logger
logger = logging.getLogger('my_logger')
logger.setLevel(logging.DEBUG)
# add the syslog handler to the logger
logger.addHandler(syslog)
# use the logger
logger.debug('This is a debug message')
logger.info('This is an info message')
logger.warning('This is a warning message')
logger.error('This is an error message')
logger.critical('This is a critical message')
I don't get any errors and all the services start normally, but no logs are saved. I've checked the container's /var/log and my computer's /var/log/test_logs. Any ideas what I'm missing here?
So recently I moved my app into a Docker container.
I noticed that the log streams of the log group changed their names to some random hash.
Before moving to Docker: (screenshot omitted)
After moving to Docker: (screenshot omitted)
The logger in each file is initialized as
logger = logging.getLogger(__name__)
The logger's config is set up inside the __main__ with
def setup_logger(config_file):
    with open(config_file) as log_config:
        config_yml = log_config.read()
        config_dict = yaml.safe_load(config_yml)
        logging.config.dictConfig(config_dict)
with the config loaded from this file
version: 1
disable_existing_loggers: False
formatters:
  json:
    format: "[%(asctime)s] %(process)d %(levelname)s %(name)s:%(funcName)s:%(lineno)s - %(message)s"
  plaintext:
    format: "%(asctime)s %(levelname)s %(name)s - %(message)s"
    datefmt: "%Y-%m-%d %H:%M:%S"
handlers:
  console:
    class: logging.StreamHandler
    formatter: plaintext
    level: INFO
    stream: ext://sys.stdout
root:
  level: DEBUG
  propagate: True
  handlers: [console]
The docker image is run with the flags
--log-driver=awslogs \
--log-opt awslogs-group=XXXXX \
--log-opt awslogs-create-group=true \
Is there a way to keep the original log stream names?
That's how the awslogs driver works.
Per the documentation, you can control the name somewhat using the awslogs-stream-prefix option:
The awslogs-stream-prefix option allows you to associate a log stream with the specified prefix, the container name, and the ID of the Amazon ECS task to which the container belongs. If you specify a prefix with this option, then the log stream takes the following format:
prefix-name/container-name/ecs-task-id
If you don't specify a prefix with this option, then the log stream is named after the container ID that is assigned by the Docker daemon on the container instance. Because it is difficult to trace logs back to the container that sent them with just the Docker container ID (which is only available on the container instance), we recommend that you specify a prefix with this option.
You cannot change this behavior if you're using the awslogs driver. The only option would be to disable the log driver and use the AWS SDK to put the events into CloudWatch manually, but I don't think that'd be a good idea.
To be clear, your container settings/code don't affect the stream name at all when using awslogs - the log driver is just redirecting all of the container's STDOUT to CloudWatch.
I have a Flask app that I am loading through Docker, and when I try to access the application on localhost:8000 I get the error message in the subject line. I believe the issue is that the Flask application is not recognizing my application's SECRET_KEY, but I'm not sure how to fix it.
Here is my app structure (condensed for clarity):
config/
-- settings.py
instance/
-- settings.py
myapp/
-- app.py
blueprints/
user/
-- models.py
.env
docker-compose
Dockerfile
My app-factory function looks like this in app.py:
def create_app(settings_override=None):
    """
    Create a Flask application using the app factory pattern.

    :param settings_override: Override settings
    :return: Flask app
    """
    app = Flask(__name__, instance_relative_config=True)

    app.config.from_object('config.settings')
    app.config.from_pyfile('settings.py', silent=True)

    if settings_override:
        app.config.update(settings_override)

    app.register_blueprint(admin)
    app.register_blueprint(page)
    app.register_blueprint(contact)
    app.register_blueprint(user)
    extensions(app)
    authentication(app, User)

    return app
The error is being triggered in the function called authentication:
def authentication(app, user_model):
    """
    Initialize the Flask-Login extension (mutates the app passed in).

    :param app: Flask application instance
    :param user_model: Model that contains the authentication information
    :type user_model: SQLAlchemy model
    :return: None
    """
    login_manager.login_view = 'user.login'

    @login_manager.user_loader
    def load_user(uid):
        return user_model.query.get(uid)

    @login_manager.token_loader
    def load_token(token):
        duration = app.config['REMEMBER_COOKIE_DURATION'].total_seconds()
        serializer = URLSafeTimedSerializer(app.secret_key)

        data = serializer.loads(token, max_age=duration)
        user_uid = data[0]

        return user_model.query.get(user_uid)
The error is raised at the line data = serializer.loads(token, max_age=duration).
The token is usually generated from the secret_key of the application.
Here are some examples from my User class where a token is generated:
def serialize_token(self, expiration=3600):
    """
    Sign and create a token that can be used for things such as resetting
    a password or other tasks that involve a one off token.

    :param expiration: Seconds until it expires, defaults to 1 hour
    :type expiration: int
    :return: JSON
    """
    private_key = current_app.config['SECRET_KEY']

    serializer = TimedJSONWebSignatureSerializer(private_key, expiration)
    return serializer.dumps({'user_email': self.email}).decode('utf-8')
The SECRET_KEY variable is being set from the settings.py file in my config folder. Here are its contents:
from datetime import timedelta
DEBUG = True
SERVER_NAME = 'localhost:8000'
SECRET_KEY = 'insecurekeyfordev'
# Flask-Mail.
MAIL_DEFAULT_SENDER = 'contact@local.host'
MAIL_SERVER = 'smtp.gmail.com'
MAIL_PORT = 587
MAIL_USE_TLS = True
MAIL_USE_SSL = False
MAIL_USERNAME = 'you#gmail.com'
MAIL_PASSWORD = 'awesomepassword'
# Celery.
CELERY_BROKER_URL = 'redis://:devpassword@redis:6379/0'
CELERY_RESULT_BACKEND = 'redis://:devpassword@redis:6379/0'
CELERY_ACCEPT_CONTENT = ['json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_REDIS_MAX_CONNECTIONS = 5
# SQLAlchemy.
db_uri = 'postgresql://snakeeyes:devpassword@postgres:5432/snakeeyes'
SQLALCHEMY_DATABASE_URI = db_uri
SQLALCHEMY_TRACK_MODIFICATIONS = False
# User.
SEED_ADMIN_EMAIL = 'dev#local.host'
SEED_ADMIN_PASSWORD = 'devpassword'
REMEMBER_COOKIE_DURATION = timedelta(days=90)
I don't know why this information isn't loading correctly in the app, but when I run docker-compose up --build I get the error message in the title.
If it's at all useful, here are the contents of my Docker files.
docker-compose.yml
version: '2'

services:
  postgres:
    image: 'postgres:9.5'
    env_file:
      - '.env'
    volumes:
      - 'postgres:/var/lib/postgresql/data'
    ports:
      - '5432:5432'

  redis:
    image: 'redis:3.0-alpine'
    command: redis-server --requirepass devpassword
    volumes:
      - 'redis:/var/lib/redis/data'
    ports:
      - '6379:6379'

  website:
    build: .
    command: >
      gunicorn -b 0.0.0.0:8000
        --access-logfile -
        --reload
        "snakeeyes.app:create_app()"
    env_file:
      - '.env'
    volumes:
      - '.:/snakeeyes'
    ports:
      - '8000:8000'

  celery:
    build: .
    command: celery worker -l info -A snakeeyes.blueprints.contact.tasks
    env_file:
      - '.env'
    volumes:
      - '.:/snakeeyes'

volumes:
  postgres:
  redis:
And my Dockerfile:
FROM python:3.7.5-slim-buster
MAINTAINER My Name <myname@gmail.com>
RUN apt-get update && apt-get install -qq -y \
    build-essential libpq-dev --no-install-recommends
ENV INSTALL_PATH /myapp
RUN mkdir -p $INSTALL_PATH
WORKDIR $INSTALL_PATH
COPY requirements.txt requirements.txt
RUN pip install -r requirements.txt
COPY . .
RUN pip install --editable .
CMD gunicorn -b 0.0.0.0:8000 --access-logfile - "myapp.app:create_app()"
I have two files.
The first is the TCP server; the second is the Flask app. They belong to one project, but they run inside separate Docker containers.
They should write logs to the same file, since they are part of the same project.
I tried to create my own logging library and import it into both files.
I have tried lots of things.
First, I deleted the code below:
if (logger.hasHandlers()):
    logger.handlers.clear()
When I delete it, I get the same logs twice.
My structure:
docker-compose
docker file
loggingLib.py
app.py
tcp.py
requirements.txt
...
My latest logging code:
from logging.handlers import RotatingFileHandler
from datetime import datetime
import logging
import time
import os, os.path

project_name = "proje_name"

def get_logger():
    if not os.path.exists("logs/"):
        os.makedirs("logs/")
    now = datetime.now()
    file_name = now.strftime(project_name + '-%H-%M-%d-%m-%Y.log')
    log_handler = RotatingFileHandler('logs/' + file_name, mode='a', maxBytes=10000000, backupCount=50)
    formatter = logging.Formatter('%(asctime)s - %(levelname)s - %(funcName)s - %(message)s ', '%d-%b-%y %H:%M:%S')
    formatter.converter = time.gmtime
    log_handler.setFormatter(formatter)
    logger = logging.getLogger(__name__)
    logger.setLevel(level=logging.INFO)
    if (logger.hasHandlers()):
        logger.handlers.clear()
    logger.addHandler(log_handler)
    return logger
It is working, but only in one file. If app.py runs first, only it writes logs; the other file doesn't write any logs.
Anything that directly uses files – config files, log files, data files – is a little trickier to manage in Docker than running locally. For logs in particular, it's usually better to set your process to log directly to stdout. Docker will collect the logs, and you can review them with docker logs. In this setup, without changing your code, you can configure Docker to send the logs somewhere else or use a log collector like fluentd or logstash to manage the logs.
In your Python code, you usually will want to configure the detailed logging setup at the top level, on the root logger
import logging

def main():
    logging.basicConfig(
        format='%(asctime)s - %(levelname)s - %(funcName)s - %(message)s ',
        datefmt='%d-%b-%y %H:%M:%S',
        level=logging.INFO
    )
    ...
and in each individual module you can just get a local logger, which will inherit the root logger's setup
import logging
LOGGER = logging.getLogger(__name__)
With its default setup, Docker will capture log messages into JSON files on disk. If you generate a large amount of log messages in a long-running container, it can lead to local disk exhaustion (it will have no effect on memory available to processes). The Docker logging documentation advises using the local file logging driver, which does automatic log rotation. In a Compose setup you can specify logging: options:
version: '3.8'
services:
  app:
    image: ...
    logging:
      driver: local
You can also configure log rotation on the default JSON File logging driver:
version: '3.8'
services:
  app:
    image: ...
    logging:
      driver: json-file  # default, can be omitted
      options:
        max-size: "10m"
        max-file: "50"
You "shouldn't" directly access the logs, but they are in a fairly stable format in /var/lib/docker, and tools like fluentd and logstash know how to collect them.
If you ever decide to run this application in a cluster environment like Kubernetes, that will have its own log-management system, but again designed around containers that directly log to their stdout. You would be able to run this application unmodified in Kubernetes, with appropriate cluster-level configuration to forward the logs somewhere. Retrieving a log file from opaque storage in a remote cluster can be tricky to set up.
I'm developing a custom network module for Ansible 2.5.2 for MikroTik RouterOS.
I want to change the remote username under the hood, e.g. if in my inventory I have ansible_user: ansible, I would like to change it to ansible_user: ansible+cet.
I have seen modules that do this in action plugins, like here: CFSworks/ansible-routeros.
However, it seems that in Ansible 2.5 with the network_cli connection type this is no longer the case. When I add the following sample action plugin to the plugins/action directory, nothing happens – I don't see "routeros action plugin is called" in the debug console:
class ActionModule(_ActionModule):

    def run(self, tmp=None, task_vars=None):
        display.debug('routeros action plugin is called')
        result = super(ActionModule, self).run(task_vars=task_vars)
        return result
This is my playbook:
- hosts: routeros
  gather_facts: no
  connection: network_cli
  tasks:
    - ...
And this is my inventory:
routeros:
  hosts:
    example:
      ansible_host: 192.168.88.1
      ansible_user: ansible
      # ansible_user: ansible+cet <-- i want to add '+cet' on the fly
      ansible_ssh_pass: ansible
      ansible_network_os: routeros
What is the proper way to replace the connection user in Ansible 2.5 with network_cli connection type?