Folder structure of flask app with machine learning component - python

I am developing a flask based web app. A user can enter the car specifications and will get predictions of the price of the car based on a machine learning model.
I was following many tutorials on how to create a web app but I feel confused on where to put configurations on machine learning component and how to structure the code correctly.
I have the following folder structure of my project:
├── webapp
│ ├── app
│ ├── static
│ ├── templates
│ ├── routes.py
│ ├── utils.py --> utils function that are used in 'routes.py'
│ ├── src
│ ├── ml_utils.py --> functions for machine learning component
│ ├── else stuff
in routes.py:
from flask import Flask, request, render_template
from sklearn.externals import joblib
import numpy as np
from app.utils import find_freshest_model, convert_to_float, process_features_info_for_option_html, create_features
from src.ml_utils import load_features_info
app = Flask(__name__)
#app.route('/')
def home():
return render_template('index.html', car_type=option_values['car type'])
#app.route('/', methods=['POST'])
def predict():
#order features in a correct way according to order in features_info
features = create_features(request, features_info)
prediction = model.predict(features)
return render_template('index.html',
prediction_text='Your predicted car price is {} Euro'.format(prediction), quality=option_values['quality'])
if __name__ == '__main__':
model_file = find_freshest_model()
features_info = load_features_info() # containts correct order of the features and categorization of features (numerical, categorical)
option_values = process_features_info_for_option_html(features_info['features_dummy'])
model = joblib.load(model_file)
app.run(host='0.0.0.0', debug=True)
Task and Questions:
I want to prepare my app for production and to structure it better.
Should I put in init.py following code?
Regarding the code under if __name__ == '__main__':. Should I create a class modelConfigs, put it into models.py? In init.py I import modelConfigs and will initialize it routes.py.
models.py
from src.ml_utils import load_features_info
from sklearn.externals import joblib
from app.utils import find_freshest_model, process_features_info_for_option_html
class ModelConfigs:
__tablename__ = 'modelConfigs'
model = joblib.load(find_freshest_model())
features_info = load_features_info()
option_values = process_features_info_for_option_html(features_info['features_dummy'])
init.py:
from flask import Flask
app = Flask(__name__)
from app.models import ModelConfigs
model_config = ModelConfigs
from app import routes
routes.py:
from flask import request, render_template
import numpy as np
from app.utils import create_features
from app import app, model_config
#app.route('/')
def home():
return render_template('index.html', car_type=option_values['car type'])
#app.route('/', methods=['POST'])
def predict():
features = create_features(request, model_config.features_info)
prediction = np.expm1(model_config.model.predict(features))
return render_template('index.html',
prediction_text='Your predicted car price is {} Euro'.format(prediction), quality=option_values['quality'])

Flask has a feature called, "Blueprints," which allows you, separate your application into several folders so that routes and views can be kept more neatly in their own folder, yet allow one python file to call on them each individually as needed.
I mention this because it's one of the parts of Flask that Flask likes to highlight. What this does is allows you to keep your project structure cleaner, yet at the same time completely customizeable. Hypothetically you could build your own machine learning pipeline right within a blueprint, I can't think of a specific application for that - but who knows? The capability is there.
Outside of blueprints, from what I have read in a lot of different blogs and practice myself, generally for a small machine learning project, you might start off with just throwing a few things into /src since that's the, "source" folder where you put the types of things that go on the server. However you may quickly outgrow that, and need to separate your /src folder into several different folders which actually represent a legitimate data science project or pipeline structure of some type.
One way you might consider structuring your folder would be the following:
└── src
│ ├── features
│ ├── preparation
│ ├── preprocessing
│ ├── evaluation
│ └── js
└── tests
│ └── unit_tests
└── models
│ └── retrained_models
└── data
│ └── raw_data
│ └── processed_data
│ └── user_input_data
└── pipeline
│ └── model_retraining_automation_scripts
└── docs
└── Documentation
└── Notebooks
All of the above assumes you are storing everything on the server itself, which is restrictive from a data engineering perspective. Servers typically are more expensive, they run on SSD's or whatever. So if you start to grow in size and have massive amounts of data, you would need to perhaps store that in a document store such as AWS S3. If that's the case, you could think about that folder structure above to keep various operations and .py files that would perform similar functions on a small scale and translate that into a larger scale, however you might need to start storing training models as actual binaries in a database, or growing even larger, in S3 buckets, with some way of tracking what is going where, presumably a relational database.

Related

How to serve create-react-app from Flask?

How can I serve the deliverables of create-react-app via Flask?
After npm run build, this is roughly my folder tree:
src/
├── service/
│ └── app.py
└── content/
├── static
│ ├── css
│ │ └── many css files...
│ ├── js
│ │ └── many js files...
│ └── media
│ └── some images...
├── index.html
├── service-worker.js
└── manifest.json
That's the server's code:
app = Flask(__name__, template_folder='../content', static_folder='../content')
#app.route('/')
def home():
return flask.render_template('index.html')
#app.route('/static/<path:path>')
def static_files(path):
return flask.send_from_directory('../content/static', path)
if __name__ == '__main__':
app.run(debug=True)
The main html, index.html is served successfully. So are all the files under content/static/.
The files under content/ however (except index.html), are not delivered at all (error 404, not found).
Assuming you're just trying to serve a bunch of static files, that could probably be done more efficiently with a webserver like nginx.
However if you do wish to use Flask, the simplest way to do this with your exisiting directory structure is:
from flask import Flask, render_template
app = Flask(__name__,
static_url_path='',
static_folder='../content')
#app.route('/')
def index_redir():
# Reached if the user hits example.com/ instead of example.com/index.html
return render_template('index.html')
This essentially serves everything in the contents/ directory statically at the / endpoint (thanks to the static_url_path being set to an empty string)
Of course there's also the index_redir function which will render index.html if the user actually hits example.com/ instead of example.com/index.html.
This also avoids defining the static_files function from your code, as flask has the functionality to serve static files out of the box.
Again this stuff is probably better suited to nginx in a production env.

Proper way to include builds from react apps into Flask

I have a site developed entirely using flask. It uses a Blueprint called dashboards, which hosts some views to access different visualizations, i.e dashboards/<string: dashboard_name>
I store my dashboards with an id, a name and the group it belongs to, so users can only access the visualizations from the group they also belong to.
(This is my unexperienced approach to solve this problem, so I'm also open to suggestions of a better design pattern to achieve this.)
My project kinda looks like this
app
├── dashboards
│   ├── __init__.py
│   └── views.py
├── __init__.py
├── models
│   ├── dashboard.py
│   └── __init__.p
├── static
│   ├── css
│   │   ├── styles.css
│   └── js
│   └── scripts.js
└── templates
├── base.html
└── dashboards
   ├── cat.html
   ├── clock.html
   ├── index.html
   ├── private.html
   └── public.html
my views for the Dashboard blueprint
# app/dashboards/views.py
from ..models.dashboard import Dashboard
#dashboards.route('/<string:name>')
def view_dashboard(name):
# a dictionary with dashboards available to current_user
dashboards = get_dashboards()
for dashboard in dashboards.values():
for d in dashboard:
if name == d.name:
route = 'dashboards/{dashboard}.html'.format(dashboard = d.name)
return render_template(route, dashboard=d)
else:
return render_template('404.html'), 404
and for the Dashboard blueprint
# app/dashboards/__init__.py
from flask import Blueprint
dashboards = Blueprint('dashboards', __name__)
from . import views
finally, I use the create_app pattern
# app/__init__.py
# some imports happening here
def create_app(config_name):
app = Flask(__name__)
from .dashboards import dashboards as dashboards_blueprint
app.register_blueprint(dashboards_blueprint, url_prefix='/dashboards')
return app
One level above the app, there is a manage.py that calls the create_app method, passing some configuration attributes. I believe this is a common pattern in Flask. I also believe this is not relevant to my question.
Particularly, if one deploys a react app using the create-react-app package, i.e. using npm run build, the output is a folder /build that contains the static files necessary to run the app. This folder has, for instance the following structure
├── build
│   ├── asset-manifest.json
│   ├── favicon.ico
│   ├── index.html
│   ├── manifest.json
│   ├── service-worker.js
│   └── static
│   ├── css
│   │   ├── main.<some-hash-1>.css
│   │   └── main.<some-hash-1>.css.map
│   └── js
│   ├── main.<some-hash-2>.js
│   └── main.<some-hash-2>.js.map
other files here
Now if want to plug this react-app using Flask, what I do now is to move the files in /css and /js of /build to the /static folder inside /app, and rename index.html from /build into let's say 'react-dashboard.html' and move it to templates/dashboards.
This is a very dumb approach to make basic apps working, but I don't know where to place the other files from /build, and lest about a better design pattern.
From this README I get that I can tweak the blueprint definition and move my template and static folder inside app/dashboards, but I still don't know where other files as service-worker.js, and manifests and so on.
What is the proper way to "mount" my deployed react app on my Flask application? I understand that reorganizing files from /build is something not desired.
I've solved it tweaking the blueprint
bp = Blueprint(name='myblueprint',
import_name=__name__,
static_folder='static/myblueprint',
template_folder='templates',
static_url_path='/static', # this is what myblueprint.static will point to
url_prefix='/myblueprint',
subdomain=None,
url_defaults=None,
root_path=None)
With this, my static paths inside the blueprint will point to /myblueprint/static, and I will need to modify my create-react-app build folder.
replace calls to static in every *.css, *.js and .html in /build to /myblueprint/static
rearrange folders inside /build to match my blueprint static folder.
Move /build/static to /blueprint/static
Set my own jinja template in /blueprint/templates/index.html that calls these files. Make sure this file points to the static assets in /myblueprint/static and not to /build/static.
I've created a gulpfile that will automate this process,
var gulp = require('gulp');
var replace = require('gulp-replace');
var rename = require('gulp-rename');
// script to deploy react build into flask blueprint directory
// it replaces values and folder names
// README
// how to use this:
// in the react app, run npm run build
// copy the resulting /build folder in /myblueprint
// create a template of a template in /templates/base as index.default.html
// install gulp, gulp-replace, gulp-rename
// run gulp from /myblueprint
// replace routes inside css files
gulp.task('othertemplates', function(){
gulp.src(['./build/static/css/main.*.css'])
.pipe(replace('url(/static/media/', 'url(/myblueprint/static/media/'))
.pipe(gulp.dest('static/myblueprint/css/'));
});
// replace routes inside js files
gulp.task('jsthingtemplates', function(){
gulp.src(['./build/static/js/main.*.js'])
.pipe(replace('static/media/', 'myblueprint/static/media/'))
.pipe(gulp.dest('static/myblueprint/js/'));
});
// move other files
gulp.task('mediastuff', function(){
gulp.src(['./build/static/media/*'])
.pipe(gulp.dest('static/myblueprint/media/'));
gulp.src(['./build/asset-manifest.json'])
.pipe(gulp.dest('static/myblueprint/'));
});
// replace hashes in assets in the jinja template
gulp.task('jinjareplace', function(){
var fs = require('fs');
var assets = JSON.parse(fs.readFileSync('./build/asset-manifest.json'));
// extract hashes
let jsHashNew = assets['main.js'].match('[a-z0-9]{8}')
let cssHashNew = assets['main.css'].match('[a-z0-9]{8}')
gulp.src(['./templates/myblueprint/base/index.default.html'])
.pipe(replace('css/main.cssHash.css', `css/main.${cssHashNew}.css`))
.pipe(replace('js/main.jsHash.js', `js/main.${jsHashNew}.js`))
.pipe(rename('index.html'))
.pipe(gulp.dest('templates/myblueprint/'));
});
gulp.task('default', ['othertemplates', 'jsthingtemplates', 'mediastuff', 'jinjareplace'])
with this, you can run gulp from the same path as gulpfile.js and should have everything covered.

Flask: App structure to avoid circular imports

I am following the book Mastering flask's recommended file structure.
(The name of my project is Paw)
In Paw/paw/__init__.py:
def create_app(object_name):
app = Flask(__name__)
app.config.from_object(object_name)
db.init_app(app)
robot = LoggingWeRobot(token='banana', enable_session=True)
robot.init_app(app, endpoint='werobot', rule='/wechat')
attach_debugging_logger(app)
app.register_blueprint(poll_blueprint)
app.register_blueprint(wechat_blueprint)
return app
Note that the robot variable is actually needed in my blueprint, wechat, found in: Paw/paw/controllers/wechat.py
#robot.handler
def request_logging_middleware(message, session):
app.logger.debug("\n%s", request.data)
return False # This allows other handlers to continue execution
So my problem is that my blueprint has no access to the robot variable. However, the robot variable should be created in create_app in Paw/paw/__init__.py because I am trying to follow the application factory pattern.
Any recommendation on how to fix this? My project can be found here and I am trying to follow this application structure
Simply use the same pattern you are using for db - create robot elsewhere and import it into your Paw/paw/__init__.py file, just as you do with db:
import db from models
import robot from wechat_setup
# wechat_setup is where you would invoke
# robot = LoggingWeRobot(token='banana', enable_session=True)
def create_app(object_name):
app = Flask(__name__)
app.config.from_object(object_name)
db.init_app(app)
robot.init_app(app, endpoint='werobot', rule='/wechat')
I usually put project global variables in one file (say gvars.py).
Then the project structure will be some sort like this:
.
├── etc
│   └── webapp.py
├── models
│   └── common.py
├── views
│   └── common.py
├── gvars.py
└── webapp.py
In other files we just do this:
from gvars import db, robot # or other variables

Using flask-script together with template-filter

I have a flask application which uses jinja2 template filters. An example of template filter is as follows:
#app.template_filter('int_date')
def format_datetime(date):
if date:
return utc_time.localize(date).astimezone(london_time).strftime('%Y-%m-%d %H:%M')
else:
return date
This works fine if we have an app instantiated before a decorator is defined, however if we are using an app factory combined with flask-script manager, then we don't have an instantiated app. For example:
def create_my_app(config=None):
app = Flask(__name__)
if config:
app.config.from_pyfile(config)
return app
manager = Manager(create_my_app)
manager.add_option("-c", "--config", dest="config", required=False)
#manager.command
def mycommand(app):
app.do_something()
Manager accepts either an instantiated app or an app factory, so at first glance it appears that we can do this:
app = create_my_app()
#app.template_filter('int_date')
....
manager = Manager(app)
The problem with this solution is that the manager then ignores the option, since the app has already been configured during instantiation. So how is someone supposed to use template filters together with the flask-script extension?
This is where blueprints come into play. I would define a blueprint core and put all my custom template filters in say core/filters.py.
To register filters to an application in flask when using blueprints you need to use app_template_filter instead of template_filter. This way you can still use the decorator pattern to register filters and use the application factory approach.
A typical directory layout for an application using blueprint might look something like:
├── app
│   ├── blog
│   │   ├── __init__.py # blog blueprint instance
│   │   └── routes.py # core filters can be used here
│   ├── core
│   │   ├── __init__.py # core blueprint instance
│   │   ├── filters.py # define filters here
│   │   └── routes.py # any core views are defined here
│   └── __init__.py # create_app is defined here & blueprint registered
└── manage.py # application is configured and created here
For a minimal working example of this approach see: https://github.com/iiSeymour/app_factory
The solution can be found here, which states they are two ways one can define a jinja template filter. Thus, instead of defining a decorator outside the factory, one can modify the jinja_env instead. This can be done in the app factory, for example:
def format_datetime(date):
if date:
return utc_time.localize(date).astimezone(london_time).strftime('%Y-%m-%d %H:%M')
else:
return date
def create_app(production=False):
app = Flask(__name__)
....
# Register Jinja2 filters
app.jinja_env.filters['datetime'] = format_datetime
manager = Manager(create_app)
...

Flask teardown request in context of blueprint

I would like to access an sqlite3 database from a Flask application (without using Flask-SQLAlchemy, since I require fts4 functionality). I am using Flask blueprints, and I am not sure where to put the following functions (shamelessly copied from a response to this stackoverflow question):
def request_has_connection():
return hasattr(flask.g, 'dbconn')
def get_request_connection():
if not request_has_connection():
flask.g.dbconn = sqlite3.connect(DATABASE)
# Do something to make this connection transactional.
# I'm not familiar enough with SQLite to know what that is.
return flask.g.dbconn
#app.teardown_request
def close_db_connection(ex):
if request_has_connection():
conn = get_request_connection()
# Rollback
# Alternatively, you could automatically commit if ex is None
# and rollback otherwise, but I question the wisdom
# of automatically committing.
conn.close()
My file structure is:
app
├── __init__.py
├── main
│   ├── forms.py
│   ├── __init__.py
│   ├── views.py
├── models.py
├── static
└── templates
├── base.html
├── index.html
└── login.html
I want the request_has_connection() and get_request_connection() functions accessible from all view functions and maybe from models.py as well. Right now, I'm thinking they all belong in my blueprint init.py, which currently contains:
from flask import Blueprint
main = Blueprint('main',__name__)
from . import views
and that my request teardown function would be registered as
#main.teardown_request
def close_db_connection(ex):
<blah-blah-blah>
Is this right?

Categories

Resources