As a beginning programmer (really more of a scripter), I'm struggling with my growing collection of scripts and modules.
Currently, small scripts live in a general folder that is added to my PYTHONPATH. Specific projects get their own subfolder, but increasingly I try to write the general parts of those projects as modules that can be used by other projects. The problem is that I cannot just import them unless their subfolder is also on my PYTHONPATH.
I don't know how to organise all this, so I'd be happy to get your hints and recommendations on organising (Python) code. Thanks!
Be sure to read about Packages, specifically the part about creating a file named __init__.py in directories you want to import from.
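As a minimal sketch (all names made up), a directory becomes importable as a package once it contains an __init__.py:
mytools/
    __init__.py    (can be empty)
    helpers.py
Any script that has the parent directory of mytools on its search path can then do:
import mytools.helpers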
I have a git repository named MyCompany in my home directory. There's a link from /usr/local/lib/python2.7/site-packages/ to it. Inside MyCompany are package1, package2, package3, etc. In my code, I write import MyCompany.package1.modulefoo. Python looks in site-packages and finds MyCompany. Then it finds the package1 subdirectory with an __init__.py file in it - yay, a package! Then it imports the modulefoo.py file in that directory, and I'm off and running.
General-purpose custom modules should go to ~/.local/lib/pythonX.Y/site-packages, or to /usr/local/lib/pythonX.Y/site-packages if they should be available to everyone. Both paths are automatically on Python's module search path (sys.path), with no need to set $PYTHONPATH. (The former has been available since Python 2.6 – see PEP 370 – Per-user site-packages directory.)
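If you're not sure where the per-user directory lives on your machine, Python can report it; a quick check (the exact path will differ per system):
import site

# Prints the per-user site-packages directory,
# e.g. /home/you/.local/lib/python2.7/site-packages
print(site.getusersitepackages())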
I'm using Python to develop a few company-specific applications. There is a custom shared module (a "library") that describes some data and algorithms, and there are dozens of Python scripts that work with this library. There are quite a lot of these files, so they are organized in subfolders:
myproject
    apps
        main_apps
            app1.py
            app2.py
            ...
        utils
            util1.py
            util2.py
            ...
    library
        __init__.py
        submodule1
            __init__.py
            file1.py
            ...
        submodule2
            ...
Users want to run these scripts by simply going, say, to myproject\utils and launching "py util2.py some_params". Many of these users are developers, so quite often they want to edit the library and immediately re-run the scripts with the updated code. There are also some 3rd-party libraries used by this project, and I want to make sure that everyone is using the same versions of these libs.
Now, there are two key problems I encountered:
how to reference (library) from (apps)?
how to manage 3rd party dependencies?
The first problem is familiar to many Python developers and has been asked on SO many times: it's quite difficult to instruct Python to import a package from "....\library". I tested several different approaches, but it seems that Python is reluctant to search for packages anywhere but the standard library locations and the folder of the script itself.
A relative import doesn't work, since the script is not part of the library (and even if it were, this still doesn't work when the script is executed directly, unless it's placed in the "root" project folder, which I'd like to avoid)
Placing a .pth file (as one might think from reading this document) in the script folder apparently doesn't have any effect
Of course, direct meddling with sys.path works, but boilerplate like the following in each and every script file looks quite terrible:
import sys, os.path

# Walk two levels up from this script's folder to reach the project root
here = os.path.dirname(os.path.realpath(__file__))
module_root = os.path.abspath(os.path.join(here, '../..'))
sys.path.append(module_root)

import my_library
I realize that this happens because Python wants my library to be properly "installed", and that would indeed be the only right way to go had the library been developed separately from the scripts that use it. Unfortunately, that's not the case, and I think that re-doing the "installation" of the library each time it changes would be quite inconvenient and error-prone.
The second problem is straightforward. Someone adds a new 3rd-party module to our app/lib, and everyone else starts seeing import problems once they update their apps. Several branches of development, different moments when users run pip install, a few rollbacks - and eventually everyone ends up using different versions of 3rd-party modules. In my case, things are additionally complicated by the fact that many devs still work a lot with older Python 2.x code, while I'd like to move on to Python 3.x.
While looking for a possible solution to my problems, I found the truly excellent virtual environments feature of Python. Things looked quite bright:
Create a venv for myproject
Distribute a requirements.txt file as part of the app and provide a script that populates the venv accordingly
Symlink my own library into the venv's site-packages folder so it will always be found by Python
This solution looked quite natural & robust: I'm explicitly setting up an environment for my project and placing whatever I need into this venv, including my own lib, which I can still edit on the fly. And it does indeed work. But calling activate.bat to make this Python environment active, and another batch file to deactivate it, is a mess, especially on Windows. The boilerplate that edits sys.path looks terrible, but at least it doesn't interfere with the UX the way this potential fix does.
So here are the questions I want to ask:
Is there a way to bind particular python venv to particular folders so python launcher will automatically use this venv for scripts from these folders?
Is there a better alternative way to handle this situation that I'm missing?
Environment for my project is Python 3.6 running on Windows 10.
I think I finally found a reasonable answer. It's enough to add a shebang line pointing to the Python interpreter in the venv, e.g.
#!../../venv/Scripts/python
The full project structure will look like this:
myproject
    apps
        main_apps
            app1.py (with shebang)
            app2.py (with shebang)
            ...
        utils
            util1.py (with shebang)
            util2.py (with shebang)
            ...
    library
        __init__.py
        submodule1
            __init__.py
            file1.py
            ...
        submodule2
            ...
    venv
        (python interpreter, 3rd-party modules)
        (symlink to library)
    requirements.txt
    init_environment.bat
and things work like this:
venv is a virtual Python environment with everything the project needs
init_environment.bat is a script that populates the venv according to requirements.txt and places a symlink to my library into the venv's site-packages
all scripts start with a shebang line pointing (with a relative path) to the venv interpreter
The result is a full custom environment with all the libs, including my own, and the scripts that use it all have very natural imports. The Python launcher will also automatically pick Python 3.6 as the interpreter & load the relevant modules whenever any user-facing script in the project is launched from the console or Windows Explorer.
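For illustration, the top of one of the scripts might look like this (my_library and do_something are placeholders for the real shared code):
#!../../venv/Scripts/python
import my_library             # resolved via the symlink in venv's site-packages

my_library.do_something()     # hypothetical entry point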
Cons:
The relative shebang won't work if a script is called from another folder
Users will still have to manually run init_environment.bat to update the virtual environment according to requirements.txt
The init_environment script on Windows requires elevated privileges to make a symlink (but hopefully that strange MS decision will be fixed with the upcoming Win10 update in April '17)
However, I can live with these limitations. Hope this helps others facing similar problems.
It would still be nice to hear other options (as answers) and critiques (as comments) too.
tl;dr I have a directory of common files outside of my various project directories. What is the pythonic way of using/importing these common files inside my projects, and for building them into an output directory.
Background:
I'm in school and taking a data structures class that uses Python as the language. I'm learning the language as I take the class but am having some issues trying to maintain a shared code base.
In all of the other languages I've used, both compiled and interpreted, there has been a fairly intuitive way of being able to keep shared modules separate from the code that is using them so that updating a shared module doesn't require updates to the calling code.
This is how I initially had my directory structure organized:
/.../Projects
    Assignment_1
        __init__.py
        classA.py
        classB.py
    Assignment_2
        __init__.py
        classC.py
    (etc)
After realizing that much of the functionality of classA and classB would be required later on, I reorganized to this:
/.../Projects
    Common
        Sorters
            __init__.py
            BubbleSort.py
            MergeSort.py
        __init__.py
        SimpleProfiler.py
    Assignment_1
        __init__.py
        main.py
    Assignment_2
        __init__.py
        main.py
My issue is that I can't find a good way of importing things like SimpleProfiler or MergeSort from main.py. Right now I'm manually copying all of the Common files into each assignment, which is bad.
I understand that one possible solution is to update the path to include the Common folder from within each main.py file, but I've also read that this is very hacky and isn't encouraged.
Another Stackoverflow answer to a similar question suggested that the user structure everything under one large project. I tried this but still couldn't import modules from one sibling into another sibling.
My other issue is how to package everything together when submitting the assignment. In other languages it was easy to implement a build script that would scan the main project for any imports, then copy (flatten) those imported files into a single output directory which I could then compress and submit for grading. I'm using PyCharm, but can't seem to find a way to reference the imports as part of the build process. Is there any kind of script for this? Whatever the solution is, I need to be able to submit the project in such a way that all the instructor has to do is call a single python file (such as main.py)
This issue isn't unique to a school setting, but seems universal to most programming projects. So, what is the pythonic way of managing a shared code base and for building that shared code into a final project?
[Disclaimer: I think it is better to use the PYTHONPATH environment variable]
I can think of two very similar alternatives:
/.../Projects
    Common
        Sorters
            __init__.py
            BubbleSort.py
            MergeSort.py
        __init__.py
        SimpleProfiler.py
    assignment_1.py
    assignment_2.py
If you use, from assignment_1.py, the import from Common.Sorters.BubbleSort import bubble_sort, it will just work. This is because, by default, Python puts the directory of the invoked script on the module search path. This is assuming that you are invoking the assignment_* scripts directly.
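As a minimal sketch, assignment_1.py could then look like this (bubble_sort is a hypothetical function defined in BubbleSort.py):
from Common.Sorters.BubbleSort import bubble_sort

# Assuming bubble_sort returns the sorted list
print(bubble_sort([3, 1, 2]))   # [1, 2, 3]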
The other alternative would be:
/.../Projects
    Common
        Sorters
            __init__.py
            BubbleSort.py
            MergeSort.py
        __init__.py
        SimpleProfiler.py
    Assignment_1
        __init__.py
        __main__.py
    Assignment_2
        __init__.py
        __main__.py
And invoking the assignments like so: python -m Assignment_1 (from the Projects folder). By default, "executing" a module like that will load its __main__.py code. (This is not a rigorous explanation, although the official one is a bit short.)
It works for the same reason as before: the Python interpreter adds the current directory to the module search path.
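A minimal sketch of Assignment_1/__main__.py (again with the hypothetical bubble_sort), run as python -m Assignment_1 from the Projects folder:
from Common.Sorters.BubbleSort import bubble_sort

print(bubble_sort([5, 4, 3, 2, 1]))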
Try setting the PYTHONPATH environment variable to your directory.
Python searches for imported modules in the directories listed in sys.path; the first entry is the directory of the script being run, and the directories named in PYTHONPATH are searched next.
On the minimum end, make a PyCharm run configuration that sets your PYTHONPATH before executing and includes the other directory. That way you don't need to modify sys.path in your code.
Closer to the "perfect" end, make your other directory into a Python package with a setup.py. Then, using the interpreter from your project, run python path/to/other/dir/setup.py develop to bring your separately-developed package into the consuming project.
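A minimal setup.py sketch for that separately-developed directory (the name is a placeholder):
from setuptools import setup, find_packages

setup(
    name='common-utils',       # placeholder project name
    version='0.1',
    packages=find_packages(),  # finds Common and Common.Sorters via their __init__.py files
)
The develop command installs a link to the source instead of copying it, so later edits to the shared code are picked up without reinstalling.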
I have a directory with a python package as follows:
--docs/index.rst
--docs/...
--app/__init__.py
--app/foo.py
and I'm using Sphinx with autodoc to document the app (in Python 3.3).
Now, in the conf.py (inside docs/), I have
sys.path.insert(0, os.path.abspath('../app'))
I cd into docs/, run
make html
which gives me
SystemError: Parent module '' not loaded, cannot perform relative import
for all the modules that have a
from .foo import Bar
I have a clean virtualenv installation of Sphinx using
pip install Sphinx
after I created the (clean) environment for python 3.3.
What am I missing?
I was moving the project from Python 2.* to Python 3.* when this happened. The whole project works, except for this...
Your app directory is a package. A package is a directory with an __init__.py and other files inside it.
If you put a package directory on your sys.path, all kinds of things go wrong.
Let's take an example:
root/
    app/
        __init__.py
        spam.py
        eggs.py
If you have root on your sys.path (because it's your current working directory, or because you do it explicitly, or because you've installed things correctly to your site-packages), then app is a package, app.spam is a module, and, within app.eggs, .spam is that module. So, everything works.
If you have app on your sys.path, then app is not a package, spam is a module, and, within eggs, .spam isn't anything. So, you can't use relative imports.
If you have both on your sys.path, then app is a package, spam and app.spam are both different modules (with the same contents, executed twice), and within app.eggs, .spam is a module, but within eggs, .spam isn't anything. This will cause you no end of problems.
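You can observe that last pitfall directly; with both root and root/app on sys.path, a quick check (not something you'd want in real code) shows two distinct module objects built from the same file:
import spam        # found via root/app on sys.path
import app.spam    # found via root on sys.path

# Same source file, two separate module objects, executed twice
print(spam is app.spam)   # False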
So, most likely, the fix you want is this:
sys.path.insert(0, os.path.abspath('..'))
If there are other packages, or directories full of Python code that aren't packages, in .. that you don't want to autodoc (e.g., a tests directory with tests/test_spam.py), then you will need to restructure your directories to put app into some directory that doesn't have any other Python code in it, like this:
root/
    src/
        app/
    tests/
    doc/
Alternatively, if you didn't want app to be a package, but rather to be a sys.path root directory, then kill the __init__.py, and leave app directly in sys.path. But in that case, you can't use intra-package relative imports; all of the modules in app are top-level modules, and have to be imported as such.
The Packages section of the tutorial (and the rest of the chapter above it) explains some of this, but there's probably better introductory documentation out there.
For full details, in 3.3+, The import system has everything, nicely organized; for older versions, the reference docs are muddy, incomplete, and scattered; you have to start at The import statement, and then read The Knights Who Say Neeeow ... Wum ... Ping! (which is basically a PEP but 1.5 didn't have PEPs yet), and possibly even the ni documentation, if you can find it, plus various PEPs and minor change log entries that explain how things have changed between 1.5 and 2.7 or 3.2 or whatever.
I do a lot of work on different projects (I'm a scientist) in a fairly standardised directory structure. e.g.:
project
    /analyses/
    /lib
    /doc
    /results
    /bin
I put all my various utility scripts in /bin/ because cleanliness is next to godliness. However, I have to hard code paths (e.g. ../../x/y/z) and then I have to run things within ./bin/ or they break.
I've used Django and that has /manage.py which runs various django-things and automatically handles the path. I've also used fabric to run various user defined functions.
Question: how do I do something similar, and what's the best way? I can easily write something in /manage.py to inject the root dir into sys.path etc., but then I'd like to be able to do "./manage.py foo", which would run /bin/foo.py. Or is it possible to get fabric to call executables from a certain directory?
Basically - I want something easy and low maintenance. I want to be able to drop an executable script/file/whatever into ./bin/ and not have to deal with path issues or import issues.
What is the best way to do this?
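For concreteness, the kind of dispatcher I have in mind would be a rough sketch like this (all names hypothetical):
#!/usr/bin/env python
# manage.py: run './manage.py foo args...' to execute bin/foo.py from anywhere
import os
import subprocess
import sys

ROOT = os.path.dirname(os.path.abspath(__file__))

def main():
    if len(sys.argv) < 2:
        sys.exit('usage: manage.py <command> [args...]')
    script = os.path.join(ROOT, 'bin', sys.argv[1] + '.py')
    # Expose lib/ on PYTHONPATH so the bin/ scripts can import from it
    env = dict(os.environ, PYTHONPATH=os.path.join(ROOT, 'lib'))
    sys.exit(subprocess.call([sys.executable, script] + sys.argv[2:], env=env))

if __name__ == '__main__':
    main()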
Keep Execution at TLD
In general, try to keep your runtime at top-level. This will straighten out your imports tremendously.
If you have to do a lot of import addressing with relative imports, there's probably a better way.
Modifying The Path
Other posters have mentioned PYTHONPATH. That's a great way to do it permanently in your shell.
If you don't want to, or aren't able to, manipulate PYTHONPATH directly, you can modify sys.path to get yourself out of relative import hell.
Using sys.path.append
sys.path is just a list internally. You can append to it to add stuff to into your path.
Say I'm in /bin and there's a library markdown in lib/. You can append a relative path with sys.path to import what you want.
import sys

# Make ../lib importable (the path is relative to the current working directory)
sys.path.append('../lib')

import markdown

print(markdown.markdown("""
Hello world!
------------
"""))
Word to the wise: don't get too crazy with your sys.path additions. Keep your scheme simple to avoid a lot of confusion.
Overly eager imports can sometimes lead to cases where a Python module ends up importing itself, at which point execution will halt!
Using Packages and __init__.py
Another great trick is creating Python packages by adding __init__.py files. __init__.py is executed when the package is imported, before any of the modules inside it, so it's a great way to set up imports across the entire directory. That makes it an ideal spot for sys.path hackery.
You don't even need to add anything to the file. It's sufficient to just do touch __init__.py at the console to make a directory a package.
See this SO post for a more concrete example.
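For example, a hypothetical bin/__init__.py that makes a sibling lib/ directory importable for every module under bin/:
import os
import sys

# Resolve lib/ relative to this file rather than the current working directory
_lib = os.path.join(os.path.dirname(os.path.abspath(__file__)), '..', 'lib')
sys.path.append(os.path.abspath(_lib))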
In a shell script that you source (not run) in your current shell, you set the following environment variables:
export PATH=$PATH:$PROJECTDIR/bin
export PYTHONPATH=$PROJECTDIR/lib
Then you put your Python modules and package tree in your project's ./lib directory. Python automatically adds the directories listed in PYTHONPATH to sys.path.
Then you can run any top-level script from the shell without specifying the path, and any imports from your library modules are looked for in the lib directory.
I recommend very simple top-level scripts, such as:
#!/usr/bin/python
# Thin wrapper: all the real logic lives in the mytool module found via PYTHONPATH
import sys
import mytool

mytool.main(sys.argv)
Then you never have to change the wrapper: you just edit the module code, and you also benefit from the byte-code caching.
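The module behind the wrapper carries all the real logic; a minimal mytool.py sketch:
import sys

def main(argv=None):
    # All the real work lives here; the wrapper script never changes.
    argv = sys.argv if argv is None else argv
    print('running with arguments:', argv[1:])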
You can easily achieve your goals by creating a mini package that hosts each one of your projects. Use PasteScript to create a simple project skeleton. And to make it executable, just install it via setup.py develop. Now your bin scripts just need to import the entry point to this package and execute it.
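As a sketch, the mini package's setup.py could declare a console entry point like this (names are hypothetical; the target should be callable with no arguments, which the mytool.main above satisfies):
from setuptools import setup

setup(
    name='myproject-tools',    # placeholder name
    version='0.1',
    py_modules=['mytool'],
    entry_points={
        # After 'setup.py develop', a 'mytool' command appears on PATH
        'console_scripts': ['mytool = mytool:main'],
    },
)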
I'm new to Django and was looking for advice on where to place my shared library. I'm planning on creating classes that I want to use across all my apps within the project. Where would be the best location to place them?
e.g. abstract models
regards,
We usually set our projects up like this:
/site/
    __init__.py
    manage.py
    settings.py
    urls.py
    /apps/
        __init__.py
        /appA/
            __init__.py
        /appB/
            __init__.py
    /lib/
        __init__.py
        /django-lib/
            __init__.py
        /shared-lib/
            __init__.py
Just make sure your site directory is on your python path:
import sys
sys.path.append('/path/to/site/')
Also, ensure that an __init__.py exists in site, apps, and lib so they can be treated as packages using dot-notation imports (import site.lib.shared-lib)
Edit:
In answer to your question regarding your python path, it all has to do with where your 'manage.py' or equivalent file resides. If it is under the /site/ directory (next to apps and lib) then the PYTHONPATH should be fine.
You need to make sure that every directory contains an empty file called __init__.py. This tells Python to treat that directory as a package. See the new and improved ASCII art above.
What I learned from Two Scoops of Django is that you put the shared code inside a Django app that you create on your own called core.
Using the core app is then similar to cross-using classes from any other app.
To learn more, go to Chapter 28 of the book titled: "What About Those Random Utilities?"
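For example, another app's models.py might pull shared pieces from core like this (TimeStampedModel is a hypothetical abstract model defined in core):
from django.db import models

from core.models import TimeStampedModel  # hypothetical shared abstract base

class Article(TimeStampedModel):
    title = models.CharField(max_length=200)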
If the shared library is going to be used in other projects as well, you can make an installer using distutils.
But if it's only for the apps in the project, you can take AntonioP's answer.
Remember that the root of your project (the folder that contains manage.py) is always on your Python path when you run your Django project, so you can create a folder named deps or dependencies or extra or anything you like to hold your shared libraries.
Wherever they are, you can import them as long as they're on the Python path.