How to refactor a module using Python rope?

I have my project structure as follows:
├── app
│   ├── Country
│   │   └── views.py
│   ├── Customer
│   │   └── views.py
I am trying to rename the module folder 'Country' to 'Countries', along with every occurrence where it is used. It is also imported in Customer/views.py:
from app.Country.views import *
....
Following the tutorial Refactoring Python Applications for Simplicity, I tried it as below:
>>> from rope.base.project import Project
>>> proj = Project('app')
>>> Country = proj.get_folder('Country')
>>> from rope.refactor.rename import Rename
>>> change = Rename(proj, Country).get_changes('Countries')
>>> proj.do(change)
After executing the script, the module folder 'Country' was changed to 'Countries', but the occurrences where it is used in Customer/views.py did not change accordingly; the import statement in Customer/views.py is still
from app.Country.views import *
I expected it should change to from app.Countries.views import * after refactoring, but it did not.
Is there anything else I should do to refactor this successfully? Thanks.

You could use proj.get_module('app.Country').get_resource() to rename the module.
from rope.base.project import Project
from rope.refactor.rename import Rename
proj = Project('app')
country = proj.get_module('app.Country').get_resource()
change = Rename(proj, country).get_changes('Countries')
print(change.get_description())
proj.do(change)  # apply the change after reviewing the description
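As a minimal, self-contained sketch of the whole flow (the throwaway layout and file contents below are illustrative, not the original project), renaming the package through its resource also rewrites the import in Customer/views.py:

```python
import os
import tempfile

from rope.base.project import Project
from rope.refactor.rename import Rename

# Build a throwaway project mirroring the layout in the question.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "Country"))
os.makedirs(os.path.join(root, "Customer"))
open(os.path.join(root, "Country", "__init__.py"), "w").close()
open(os.path.join(root, "Customer", "__init__.py"), "w").close()
with open(os.path.join(root, "Country", "views.py"), "w") as f:
    f.write("x = 1\n")
with open(os.path.join(root, "Customer", "views.py"), "w") as f:
    f.write("from Country.views import x\n")

proj = Project(root)
country = proj.get_resource("Country")  # the package resource, not a bare folder name
changes = Rename(proj, country).get_changes("Countries")
print(changes.get_description())        # preview first
proj.do(changes)                        # then apply

# The import in Customer/views.py is now updated as well.
with open(os.path.join(root, "Customer", "views.py")) as f:
    print(f.read())
```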

If you happen to work in a virtual environment and/or with Django (as the views.py files suggest), you may need to define your PYTHONPATH variable before you start Python.
$ export PYTHONPATH=<path-to-app-folder>:<path-to-virtualenv-bin>:<other-paths-used-by-your-project>
$ python
Then (the code from AnnieFromTaiwan is valid, as is yours I guess, but I did not test it):
from rope.base.project import Project
from rope.refactor.rename import Rename
proj = Project('app')
country = proj.get_module('app.Country').get_resource()
change = Rename(proj, country).get_changes('Countries')
proj.do(change)

Related

Folder structure of flask app with machine learning component

I am developing a Flask-based web app. A user can enter car specifications and will get a prediction of the price of the car based on a machine learning model.
I was following many tutorials on how to create a web app, but I feel confused about where to put the configuration of the machine learning component and how to structure the code correctly.
I have the following folder structure of my project:
├── webapp
│   ├── app
│   ├── static
│   ├── templates
│   ├── routes.py
│   ├── utils.py --> utils functions that are used in 'routes.py'
│   └── src
│       ├── ml_utils.py --> functions for the machine learning component
│       └── else stuff
in routes.py:
from flask import Flask, request, render_template
from sklearn.externals import joblib
import numpy as np
from app.utils import find_freshest_model, convert_to_float, process_features_info_for_option_html, create_features
from src.ml_utils import load_features_info

app = Flask(__name__)

@app.route('/')
def home():
    return render_template('index.html', car_type=option_values['car type'])

@app.route('/', methods=['POST'])
def predict():
    # order the features correctly according to the order in features_info
    features = create_features(request, features_info)
    prediction = model.predict(features)
    return render_template('index.html',
                           prediction_text='Your predicted car price is {} Euro'.format(prediction),
                           quality=option_values['quality'])

if __name__ == '__main__':
    model_file = find_freshest_model()
    features_info = load_features_info()  # contains the correct order of the features and the categorization of features (numerical, categorical)
    option_values = process_features_info_for_option_html(features_info['features_dummy'])
    model = joblib.load(model_file)
    app.run(host='0.0.0.0', debug=True)
Task and Questions:
I want to prepare my app for production and to structure it better.
Should I put the following code in __init__.py?
Regarding the code under if __name__ == '__main__': should I create a class ModelConfigs and put it into models.py? In __init__.py I would import ModelConfigs and then use it in routes.py.
models.py
from src.ml_utils import load_features_info
from sklearn.externals import joblib
from app.utils import find_freshest_model, process_features_info_for_option_html
class ModelConfigs:
    __tablename__ = 'modelConfigs'
    model = joblib.load(find_freshest_model())
    features_info = load_features_info()
    option_values = process_features_info_for_option_html(features_info['features_dummy'])
__init__.py:
from flask import Flask
app = Flask(__name__)
from app.models import ModelConfigs
model_config = ModelConfigs
from app import routes
routes.py:
from flask import request, render_template
import numpy as np
from app.utils import create_features
from app import app, model_config

@app.route('/')
def home():
    return render_template('index.html', car_type=model_config.option_values['car type'])

@app.route('/', methods=['POST'])
def predict():
    features = create_features(request, model_config.features_info)
    prediction = np.expm1(model_config.model.predict(features))
    return render_template('index.html',
                           prediction_text='Your predicted car price is {} Euro'.format(prediction),
                           quality=model_config.option_values['quality'])
Flask has a feature called "Blueprints," which allows you to separate your application into several folders so that routes and views can be kept neatly in their own folders, while one Python file can still call on each of them as needed.
I mention this because it is one of the parts of Flask that Flask likes to highlight. It lets you keep your project structure cleaner, yet at the same time completely customizable. Hypothetically you could build your own machine learning pipeline right within a blueprint; I can't think of a specific application for that, but who knows? The capability is there.
Outside of blueprints, from what I have read in a lot of different blogs and from my own practice, for a small machine learning project you might start off by just throwing a few things into /src, since that is the "source" folder where you put the kinds of things that run on the server. However, you may quickly outgrow that and need to separate your /src folder into several different folders that represent a legitimate data science project or pipeline structure of some type.
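To illustrate the idea, here is a minimal blueprint sketch (the names predict_bp and the route body are hypothetical, not from your project); the blueprint keeps its routes in its own module and the app factory just registers it:

```python
from flask import Blueprint, Flask

# Hypothetical blueprint; in a real project this would live in its own
# folder, e.g. webapp/predict/routes.py.
predict_bp = Blueprint("predict", __name__)

@predict_bp.route("/predict")
def predict():
    # A real app would call model.predict(features) here.
    return "predicted price: 10000 Euro"

def create_app():
    app = Flask(__name__)
    app.register_blueprint(predict_bp)  # routes stay in the blueprint module
    return app

client = create_app().test_client()
print(client.get("/predict").data.decode())  # predicted price: 10000 Euro
```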
One way you might consider structuring your folder would be the following:
├── src
│   ├── features
│   ├── preparation
│   ├── preprocessing
│   ├── evaluation
│   └── js
├── tests
│   └── unit_tests
├── models
│   └── retrained_models
├── data
│   ├── raw_data
│   ├── processed_data
│   └── user_input_data
├── pipeline
│   └── model_retraining_automation_scripts
└── docs
    ├── Documentation
    └── Notebooks
All of the above assumes you are storing everything on the server itself, which is restrictive from a data engineering perspective: servers are typically more expensive and run on SSDs. If you start to grow in size and accumulate massive amounts of data, you may need to store it in a document store such as AWS S3. In that case, you could keep the folder structure above for the various operations and .py files that perform similar functions on a small scale, and translate that to a larger scale; however, you might need to start storing trained models as actual binaries in a database or, growing even larger, in S3 buckets, with some way of tracking what is going where, presumably a relational database.

Airflow on Docker - Path issue

Working with Airflow, I am trying to get a simple DAG to work.
I wrote custom operators and other files that I want to import into the main file where the DAG logic is.
Here is the folder structure:
├── airflow.cfg
├── dags
│   ├── __init__.py
│   ├── dag.py
│   └── sql_statements.sql
├── docker-compose.yaml
├── environment.yml
└── plugins
    ├── __init__.py
    └── operators
        ├── __init__.py
        ├── facts_calculator.py
        ├── has_rows.py
        └── s3_to_redshift.py
I set up the volumes correctly in the compose file, since I can see them when I log into the container's terminal.
I followed some tutorials online, from which I added some __init__.py files.
The two non-empty __init__.py files are in:
/plugins/operators:
from operators.facts_calculator import FactsCalculatorOperator
from operators.has_rows import HasRowsOperator
from operators.s3_to_redshift import S3ToRedshiftOperator
__all__ = [
    'FactsCalculatorOperator',
    'HasRowsOperator',
    'S3ToRedshiftOperator'
]
/plugins:
from airflow.plugins_manager import AirflowPlugin
import operators
# Defining the plugin class
class CustomPlugin(AirflowPlugin):
    name = "custom_plugin"
    # A list of class(es) derived from BaseOperator
    operators = [
        operators.FactsCalculatorOperator,
        operators.HasRowsOperator,
        operators.S3ToRedshiftOperator
    ]
    # A list of class(es) derived from BaseHook
    hooks = []
    # A list of class(es) derived from BaseExecutor
    executors = []
    # A list of references to inject into the macros namespace
    macros = []
    # A list of objects created from a class derived
    # from flask_admin.BaseView
    admin_views = []
    # A list of Blueprint objects created from flask.Blueprint
    flask_blueprints = []
    # A list of menu links (flask_admin.base.MenuLink)
    menu_links = []
But I keep getting errors from my IDE (saying No module named operators or Unresolved reference operators inside the operators package's __init__.py), and everything fails to launch on the webserver.
Any idea how to set this up? Where am I wrong?
Are you using puckel's image?
If you are, you need to uncomment the # - ./plugins:/usr/local/airflow/plugins lines (there may be more than one) in the docker-compose files (either Local or Celery). The rest of your setup looks fine to me.
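For reference, the relevant part of the docker-compose volumes section looks roughly like this once uncommented (exact paths may differ between versions of puckel's image, so treat this as a sketch):

```yaml
volumes:
  - ./dags:/usr/local/airflow/dags
  # Uncomment to include custom plugins
  - ./plugins:/usr/local/airflow/plugins
```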

Python directory structure for modules

I have the following directory and file structure in my current directory:
├── alpha
│   ├── A.py
│   ├── B.py
│   ├── Base.py
│   ├── C.py
│   └── __init__.py
└── main.py
Each file under the alpha/ directory is its own class, and each of those classes inherits the Base class in Base.py. Right now, I can do something like this in main.py:
from alpha.A import *
from alpha.B import *
from alpha.C import *
A()
B()
C()
And it works fine. However, if I wanted to add a file and class "D" and then use D() in main.py, I'd have to go into main.py and add "from alpha.D import *". Is there any way to do an import in my main file so that it imports EVERYTHING under the alpha directory?
It depends on what you are trying to do with the objects; one possible solution could be:
import importlib
import os

for file in os.listdir("alpha"):
    if file.endswith(".py") and not file.startswith("_") and not file.startswith("Base"):
        class_name = os.path.splitext(file)[0]
        module_name = "alpha" + '.' + class_name
        loaded_module = importlib.import_module(module_name)
        loaded_class = getattr(loaded_module, class_name)
        class_instance = loaded_class()
Importing everything with * is not good practice, so if each of your files has only one class, importing that class explicitly is "cleaner" (in your case the class name matches the module name, so it is easy to derive).
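Another option, sketched under the assumption that every module in alpha defines exactly one class named after the file: let alpha/__init__.py re-export the classes itself, so main.py only ever needs from alpha import A, B, .... The demo below builds a throwaway package on disk just to keep the example self-contained and runnable:

```python
import os
import sys
import tempfile

# Build a throwaway "alpha" package to demonstrate the pattern.
root = tempfile.mkdtemp()
pkg = os.path.join(root, "alpha")
os.mkdir(pkg)
for name in ("A", "B"):
    with open(os.path.join(pkg, name + ".py"), "w") as f:
        f.write("class {0}:\n    label = '{0}'\n".format(name))

# The interesting part: what alpha/__init__.py would contain.
# It walks its own modules and re-exports the class named after each file.
init_code = (
    "import importlib, pkgutil\n"
    "for _finder, _name, _is_pkg in pkgutil.iter_modules(__path__):\n"
    "    if _name != 'Base':\n"
    "        _mod = importlib.import_module('.' + _name, __name__)\n"
    "        globals()[_name] = getattr(_mod, _name)\n"
)
with open(os.path.join(pkg, "__init__.py"), "w") as f:
    f.write(init_code)

sys.path.insert(0, root)
import alpha

# New files dropped into alpha/ become available without touching main.py.
print(alpha.A().label, alpha.B().label)  # A B
```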

Flask: App structure to avoid circular imports

I am following the book Mastering flask's recommended file structure.
(The name of my project is Paw)
In Paw/paw/__init__.py:
def create_app(object_name):
    app = Flask(__name__)
    app.config.from_object(object_name)
    db.init_app(app)
    robot = LoggingWeRobot(token='banana', enable_session=True)
    robot.init_app(app, endpoint='werobot', rule='/wechat')
    attach_debugging_logger(app)
    app.register_blueprint(poll_blueprint)
    app.register_blueprint(wechat_blueprint)
    return app
Note that the robot variable is actually needed in my blueprint, wechat, found in: Paw/paw/controllers/wechat.py
@robot.handler
def request_logging_middleware(message, session):
    app.logger.debug("\n%s", request.data)
    return False  # This allows other handlers to continue execution
So my problem is that my blueprint has no access to the robot variable. However, the robot variable should be created in create_app in Paw/paw/__init__.py because I am trying to follow the application factory pattern.
Any recommendation on how to fix this? My project can be found here and I am trying to follow this application structure
Simply use the same pattern you are using for db: create robot elsewhere and import it into your Paw/paw/__init__.py file, just as you do with db:
from models import db
from wechat_setup import robot
# wechat_setup is where you would invoke
# robot = LoggingWeRobot(token='banana', enable_session=True)

def create_app(object_name):
    app = Flask(__name__)
    app.config.from_object(object_name)
    db.init_app(app)
    robot.init_app(app, endpoint='werobot', rule='/wechat')
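The shape of this pattern can be shown with a stub standing in for LoggingWeRobot (so the sketch runs without Flask or WeRobot installed; the dict is only a stand-in for the Flask app object):

```python
# Stub extension with the same init_app shape as db or LoggingWeRobot.
class StubRobot:
    def __init__(self):
        self.app = None          # created unbound, at module level
    def init_app(self, app, **options):
        self.app = app           # bound to an app inside the factory

# module level (e.g. wechat_setup.py): create the unbound instance
robot = StubRobot()

# application factory (e.g. paw/__init__.py): bind it to the app it builds
def create_app(name):
    app = {"name": name}         # stand-in for Flask(name)
    robot.init_app(app, endpoint="werobot", rule="/wechat")
    return app

app = create_app("paw")
print(robot.app is app)  # True
```

Because robot exists at import time, blueprints can import it directly, while the binding to a concrete app still happens inside the factory.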
I usually put project-global variables in one file (say gvars.py).
Then the project structure will look something like this:
.
├── etc
│   └── webapp.py
├── models
│   └── common.py
├── views
│   └── common.py
├── gvars.py
└── webapp.py
In other files we just do this:
from gvars import db, robot # or other variables

Getting "No module named" error

EDIT: I do have __init__.py; it was generated when Django made the app. I'm just adding a file to the directory. The script is just trying to do from npage.models import Page, Level, Section, Edge, but it cannot for some odd reason :(
EDIT 2: Directory Structure + Code Snippet:
└── npage/ <-- My package?
├── __init__.py
├── __pycache__/
│   ├── __init__.cpython-34.pyc
│   └── models.cpython-34.pyc
├── admin.py
├── forms.py
├── middleware.py
├── mixins.py
├── models.py <-- My module?
├── restructure.py <-- from package.module import classes (Edge,Page, etc.)
├── tests.py
├── urls.py
└── views.py
Code snippet from restructure.py,
from npage.models import Edge
import MySQLdb

def ordered_pages_for(level):
    """ return sorted list of pages represented as dictionary objects """
    # the rest of the code just converts a linked list of pages into an Edge set and a table of Pages
    # ...

def build_edge_table_of_pages(pagedicts_in_order):
    # prev and next are Page instances (None, 4) where 4 is the first page.pk
    prev0 = None
    for pd in pagedicts_in_order:
        # make a page next=page(pd)
        next0 = page(pd)
        # add an edge prev<-->next (prev will be None at first)
        e = Edge.create(prev = prev0, next = next0)
        e.save()
        # save prev = next (now prev is not None, it's a Page.)
        prev0 = next0
    # make edge prev<-->None
    e = Edge.create(prev = prev0, next = None)
    e.save()
I wrote a script to import a database table into a new Django-model-defined structure.
I placed the script inside an app called 'npage'. views.py contains from npage.models import * and it works fine. I have the same line in my script, which is in the same directory, but it says there is no module named npage. What am I missing here?
(env)glitch:npage nathann$ python restructure.py
Traceback (most recent call last):
File "restructure.py", line 32, in <module>
from npage.models import * # whhhyyy???
ImportError: No module named npage.models
I tried doing a relative import and it gave me this:
Traceback (most recent call last):
File "restructure.py", line 32, in <module>
from .models import * # whhhyyy???
ValueError: Attempted relative import in non-package
I found my issue. It was simple.
import os
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "ipals.settings")
# And abracadabra I can import everything now.
from npage.models import Page, Edge, Level
Where ipals is the name of my project. I had to do this so that Django recognized my npage app. Please comment on this answer so I can edit and expand on why this works. From my perspective, I simply had an environment issue: my script was executing within the wrong context.
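One caveat worth adding (my note, not from the original answer): on Django 1.7 and later, a standalone script also needs django.setup() after setting the environment variable, otherwise importing models raises AppRegistryNotReady. This fragment cannot run outside the project, so treat it as a sketch:

```python
import os

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "ipals.settings")

import django
django.setup()  # populates the app registry (required on Django >= 1.7)

from npage.models import Page, Edge, Level
```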
