Airflow on Docker - Path issue - python

Working with airflow I try simple DAG work.
I wrote custom operators and other files that I want to import into the main file where the DAG logic is.
Here the folder's structure :
├── airflow.cfg
├── dags
│   ├── __init__.py
│   ├── dag.py
│   └── sql_statements.sql
├── docker-compose.yaml
├── environment.yml
└── plugins
├── __init__.py
└── operators
├── __init__.py
├── facts_calculator.py
├── has_rows.py
└── s3_to_redshift.py
I setup the volume right in the compose file since I can see them when I log into the container's terminal.
I followed some tutorials online from where I have added some __init__.py.
The 2 none empty __init__ are into
/plugins/operators:
from operators.facts_calculator import FactsCalculatorOperator
from operators.has_rows import HasRowsOperator
from operators.s3_to_redshift import S3ToRedshiftOperator
__all__ = [
'FactsCalculatorOperator',
'HasRowsOperator',
'S3ToRedshiftOperator'
]
/plugins:
from airflow.plugins_manager import AirflowPlugin
import operators
# Defining the plugin class
class CustomPlugin(AirflowPlugin):
name = "custom_plugin"
# A list of class(es) derived from BaseOperator
operators = [
operators.FactsCalculatorOperator,
operators.HasRowsOperator,
operators.S3ToRedshiftOperator
]
# A list of class(es) derived from BaseHook
hooks = []
# A list of class(es) derived from BaseExecutor
executors = []
# A list of references to inject into the macros namespace
macros = []
# A list of objects created from a class derived
# from flask_admin.BaseView
admin_views = []
# A list of Blueprint object created from flask.Blueprint
flask_blueprints = []
# A list of menu links (flask_admin.base.MenuLink)
menu_links = []
But I keep getting errors from my IDE (saying No module named operators or Unresolved reference operators inside the operator's __init__).
Since everything fails to launch on the webserver.
Any idea how to set this up ? Where I'm wrong ?

Are you using the puckel's image?
If you are, you need to uncomment the # - ./plugins:/usr/local/airflow/plugins lines (may there are more than one) in the docker-compose files (either Local or Celery). The rest of your setup looks fine to me.

Related

How to Refactor Module using python rope?

I have my project structure as following
├── app
│   ├── Country
│   │   └── views.py
│   ├── Customer
│   │   └── views.py
Where the module 'Country' folder is what I tried to rename it to 'Countries' and every occurrence it is used, and it is also imported in Customer/views.py as well.
from app.Country.views import *
....
According to this tutorial Refactoring Python Applications for Simplicity, I tried it as below:
>>> from rope.base.project import Project
>>>
>>> proj = Project('app')
>>>
>>> Country = proj.get_folder('Country')
>>>
>>> from rope.refactor.rename import Rename
>>>
>>> change = Rename(proj, Country).get_changes('Countries')
>>> proj.do(change)
After executing the script, the module folder 'Country' was changed to 'Countries' but its instance where it is used in Customer/views.py does not change accordingly, the import statement in Customer/views.py is still
from app.Country.views import *
I expected it should change to from app.Countries.views import * after refactoring, but it did not.
Is there anything else I should do to refactor this successfully? Thanks.
You could use proj.get_module('app.Country').get_resource() to rename module.
from rope.base.project import Project
from rope.refactor.rename import Rename
proj = Project('app')
country = proj.get_module('app.Country').get_resource()
change = Rename(proj, country).get_changes('Countries')
print(change.get_description())
If you happen to work in a virtual environment and/or Django (as the views.py files suggest), you may need to define your PYTHONPATH variable before you start python.
>>> export PYTHONPATH=<path-to-app-folder>:<path-to-virtualen-bin>:<other-paths-used-by-your-project>
>>> python
Then (code from AnnieFromTaiwan is valid, as well as yours I guess, but did not test it):
from rope.base.project import Project
from rope.refactor.rename import Rename
proj = Project('app')
country = proj.get_module('app.Country').get_resource()
change = Rename(proj, country).get_changes('Countries')
proj.do(change)

Proper way to include builds from react apps into Flask

I have a site developed entirely using flask. It uses a Blueprint called dashboards, which hosts some views to access different visualizations, i.e dashboards/<string: dashboard_name>
I store my dashboards with an id, a name and the group it belongs to, so users can only access the visualizations from the group they also belong to.
(This is my unexperienced approach to solve this problem, so I'm also open to suggestions of a better design pattern to achieve this.)
My project kinda looks like this
app
├── dashboards
│   ├── __init__.py
│   └── views.py
├── __init__.py
├── models
│   ├── dashboard.py
│   └── __init__.p
├── static
│   ├── css
│   │   ├── styles.css
│   └── js
│   └── scripts.js
└── templates
├── base.html
└── dashboards
   ├── cat.html
   ├── clock.html
   ├── index.html
   ├── private.html
   └── public.html
my views for the Dashboard blueprint
# app/dashboards/views.py
from ..models.dashboard import Dashboard
#dashboards.route('/<string:name>')
def view_dashboard(name):
# a dictionary with dashboards available to current_user
dashboards = get_dashboards()
for dashboard in dashboards.values():
for d in dashboard:
if name == d.name:
route = 'dashboards/{dashboard}.html'.format(dashboard = d.name)
return render_template(route, dashboard=d)
else:
return render_template('404.html'), 404
and for the Dashboard blueprint
# app/dashboards/__init__.py
from flask import Blueprint
dashboards = Blueprint('dashboards', __name__)
from . import views
finally, I use the create_app pattern
# app/__init__.py
# some imports happening here
def create_app(config_name):
app = Flask(__name__)
from .dashboards import dashboards as dashboards_blueprint
app.register_blueprint(dashboards_blueprint, url_prefix='/dashboards')
return app
One level above the app, there is a manage.py that calls the create_app method, passing some configuration attributes. I believe this is a common pattern in Flask. I also believe this is not relevant to my question.
Particularly, if one deploys a react app using the create-react-app package, i.e. using npm run build, the output is a folder /build that contains the static files necessary to run the app. This folder has, for instance the following structure
├── build
│   ├── asset-manifest.json
│   ├── favicon.ico
│   ├── index.html
│   ├── manifest.json
│   ├── service-worker.js
│   └── static
│   ├── css
│   │   ├── main.<some-hash-1>.css
│   │   └── main.<some-hash-1>.css.map
│   └── js
│   ├── main.<some-hash-2>.js
│   └── main.<some-hash-2>.js.map
other files here
Now if want to plug this react-app using Flask, what I do now is to move the files in /css and /js of /build to the /static folder inside /app, and rename index.html from /build into let's say 'react-dashboard.html' and move it to templates/dashboards.
This is a very dumb approach to make basic apps working, but I don't know where to place the other files from /build, and lest about a better design pattern.
From this README I get that I can tweak the blueprint definition and move my template and static folder inside app/dashboards, but I still don't know where other files as service-worker.js, and manifests and so on.
What is the proper way to "mount" my deployed react app on my Flask application? I understand that reorganizing files from /build is something not desired.
I've solved it tweaking the blueprint
bp = Blueprint(name='myblueprint',
import_name=__name__,
static_folder='static/myblueprint',
template_folder='templates',
static_url_path='/static', # this is what myblueprint.static will point to
url_prefix='/myblueprint',
subdomain=None,
url_defaults=None,
root_path=None)
With this, my static paths inside the blueprint will point to /myblueprint/static, and I will need to modify my create-react-app build folder.
replace calls to static in every *.css, *.js and .html in /build to /myblueprint/static
rearrange folders inside /build to match my blueprint static folder.
Move /build/static to /blueprint/static
Set my own jinja template in /blueprint/templates/index.html that calls these files. Make sure this file points to the static assets in /myblueprint/static and not to /build/static.
I've created a gulpfile that will automate this process,
var gulp = require('gulp');
var replace = require('gulp-replace');
var rename = require('gulp-rename');
// script to deploy react build into flask blueprint directory
// it replaces values and folder names
// README
// how to use this:
// in the react app, run npm run build
// copy the resulting /build folder in /myblueprint
// create a template of a template in /templates/base as index.default.html
// install gulp, gulp-replace, gulp-rename
// run gulp from /myblueprint
// replace routes inside css files
gulp.task('othertemplates', function(){
gulp.src(['./build/static/css/main.*.css'])
.pipe(replace('url(/static/media/', 'url(/myblueprint/static/media/'))
.pipe(gulp.dest('static/myblueprint/css/'));
});
// replace routes inside js files
gulp.task('jsthingtemplates', function(){
gulp.src(['./build/static/js/main.*.js'])
.pipe(replace('static/media/', 'myblueprint/static/media/'))
.pipe(gulp.dest('static/myblueprint/js/'));
});
// move other files
gulp.task('mediastuff', function(){
gulp.src(['./build/static/media/*'])
.pipe(gulp.dest('static/myblueprint/media/'));
gulp.src(['./build/asset-manifest.json'])
.pipe(gulp.dest('static/myblueprint/'));
});
// replace hashes in assets in the jinja template
gulp.task('jinjareplace', function(){
var fs = require('fs');
var assets = JSON.parse(fs.readFileSync('./build/asset-manifest.json'));
// extract hashes
let jsHashNew = assets['main.js'].match('[a-z0-9]{8}')
let cssHashNew = assets['main.css'].match('[a-z0-9]{8}')
gulp.src(['./templates/myblueprint/base/index.default.html'])
.pipe(replace('css/main.cssHash.css', `css/main.${cssHashNew}.css`))
.pipe(replace('js/main.jsHash.js', `js/main.${jsHashNew}.js`))
.pipe(rename('index.html'))
.pipe(gulp.dest('templates/myblueprint/'));
});
gulp.task('default', ['othertemplates', 'jsthingtemplates', 'mediastuff', 'jinjareplace'])
with this, you can run gulp from the same path as gulpfile.js and should have everything covered.

how to write a custom assert Python

I am planning to split my one big test file to smaller test based on the part of code it tests. And I have a custom assert function for one of my tests. If I split them out to a separate file, how should I import them to other test files.
TestSchemacase:
class TestSchemaCase(unittest.TestCase):
"""
This will test our schema against the JSONTransformer output
just to make sure the schema matches the model
"""
# pylint: disable=too-many-public-methods
_base_dir = os.path.realpath(os.path.dirname(__file__))
def assertJSONValidates(self, schema, data):
"""
This function asserts the validation works as expected
Args:
schema(dict): The schema to test against
data(dict): The data to validate using the schema
"""
# pylint: disable=invalid-name
validator = jsonschema.Draft4Validator(schema)
self.assertIsNone(validator.validate(data))
def assertJSONValidateFails(self, schema, data):
"""
This function will assertRaises an ValidationError Exception
is raised.
Args:
schema(dict): The schema to validate from
data(dict): The data to validate using the schema
"""
# pylint: disable=invalid-name
validator = jsonschema.Draft4Validator(schema)
with self.assertRaises(jsonschema.ValidationError):
validator.validate(data)
My question is,1. When I try to import them I get an Import error with No module name found. I am breaking the TestValidation to mentioned small files. 2. I know I can raise Validation error in the assertJSONValidateFails but what should I return in case of validation pass.
tests/schema
├── TestSchemaCase.py
├── TestValidation.py
├── __init__.py
└── models
├── Fields
│   ├── TestImplemen.py
│   ├── TestRes.py
│   └── __init__.py
├── Values
│   ├── TestInk.py
│   ├── TestAlue.py
│   └── __init__.py
└── __init__.py
3.And is this how we should inherit them?
class TestRes(unittest.TestCase, TestSchemaCase):
Thanks for your time. Sorry for the big post
I did see the post, But that doesn't solve the problem.
I would suggest using a test framework that doesn't force you to put your tests in classes, like pytest.

Flask: App structure to avoid circular imports

I am following the book Mastering flask's recommended file structure.
(The name of my project is Paw)
In Paw/paw/__init__.py:
def create_app(object_name):
app = Flask(__name__)
app.config.from_object(object_name)
db.init_app(app)
robot = LoggingWeRobot(token='banana', enable_session=True)
robot.init_app(app, endpoint='werobot', rule='/wechat')
attach_debugging_logger(app)
app.register_blueprint(poll_blueprint)
app.register_blueprint(wechat_blueprint)
return app
Note that the robot variable is actually needed in my blueprint, wechat, found in: Paw/paw/controllers/wechat.py
#robot.handler
def request_logging_middleware(message, session):
app.logger.debug("\n%s", request.data)
return False # This allows other handlers to continue execution
So my problem is that my blueprint has no access to the robot variable. However, the robot variable should be created in create_app in Paw/paw/__init__.py because I am trying to follow the application factory pattern.
Any recommendation on how to fix this? My project can be found here and I am trying to follow this application structure
Simply use the same pattern you are using for db - create robot elsewhere and import it into your Paw/paw/__init__.py file, just as you do with db:
import db from models
import robot from wechat_setup
# wechat_setup is where you would invoke
# robot = LoggingWeRobot(token='banana', enable_session=True)
def create_app(object_name):
app = Flask(__name__)
app.config.from_object(object_name)
db.init_app(app)
robot.init_app(app, endpoint='werobot', rule='/wechat')
I usually put project global variables in one file (say gvars.py).
Then the project structure will be some sort like this:
.
├── etc
│   └── webapp.py
├── models
│   └── common.py
├── views
│   └── common.py
├── gvars.py
└── webapp.py
In other files we just do this:
from gvars import db, robot # or other variables

Flask teardown request in context of blueprint

I would like to access an sqlite3 database from a Flask application (without using Flask-SQLAlchemy, since I require fts4 functionality). I am using Flask blueprints, and I am not sure where to put the following functions (shamelessly copied from a response to this stackoverflow question):
def request_has_connection():
return hasattr(flask.g, 'dbconn')
def get_request_connection():
if not request_has_connection():
flask.g.dbconn = sqlite3.connect(DATABASE)
# Do something to make this connection transactional.
# I'm not familiar enough with SQLite to know what that is.
return flask.g.dbconn
#app.teardown_request
def close_db_connection(ex):
if request_has_connection():
conn = get_request_connection()
# Rollback
# Alternatively, you could automatically commit if ex is None
# and rollback otherwise, but I question the wisdom
# of automatically committing.
conn.close()
My file structure is:
app
├── __init__.py
├── main
│   ├── forms.py
│   ├── __init__.py
│   ├── views.py
├── models.py
├── static
└── templates
├── base.html
├── index.html
└── login.html
I want the request_has_connection() and get_request_connection() functions accessible from all view functions and maybe from models.py as well. Right now, I'm thinking they all belong in my blueprint init.py, which currently contains:
from flask import Blueprint
main = Blueprint('main',__name__)
from . import views
and that my request teardown function would be registered as
#main.teardown_request
def close_db_connection(ex):
<blah-blah-blah>
Is this right?

Categories

Resources