I'm trying to make an application in Python that will be run from a USB drive on a variety of different computers. Let's just assume for a second that the client computer has Python installed and put that to one side.
My application uses cherrypy to launch a local web server, but it's safe to assume that this won't be already installed on the computer it's run on.
What would be the best way of addressing this problem as transparently as possible?
It should be usable by a non-technical user and they shouldn't have to worry about installing any dependencies themselves.
At the moment, I've packaged a copy of the cherrypy source with my app and if the program detects that it isn't already installed, it runs the setup.py but with the --user flag, avoiding any permissions issues.
This doesn't seem like great practice.
Is it possible to just include the built package with my app? I've tried doing this with something like the following:
/myapp/libs/cherrypy
I copied the cherrypy directory from the CherryPy archive. This is the pre-build source, so it probably won't work. I also tried an alternative where I ran setup.py (which created a build directory) and copied those files instead.
/myapp/somefile.py
import myapp.libs.cherrypy as cherrypy
In theory, this should work... But it doesn't. When I try to run the server (using cherrypy.quickstart()) it runs two instances of the server, which obviously starts causing problems.
So, the question(s): First off, am I going about this completely the wrong way? How can I make a third-party package available to my app?
This is due to PYTHONPATH issues. I would recommend using virtual envs and pip as standard when working with packages you've imported or obtained externally.
Some great notes here: https://python-guide.readthedocs.org/en/latest/
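As a concrete illustration for the vendored-copy case in the question: prepend the bundled libs directory to sys.path and import the package under its canonical top-level name. Importing it as myapp.libs.cherrypy creates a second module object alongside the one CherryPy's own internal "import cherrypy" statements load, which would explain the two server instances. A minimal sketch, assuming the /myapp/libs/cherrypy layout from the question:

# somefile.py -- minimal sketch, assuming the vendored copy sits in myapp/libs/cherrypy
import os
import sys

# Put the bundled libs directory at the front of the search path so that
# "import cherrypy" resolves to the vendored copy under its usual name.
here = os.path.dirname(os.path.abspath(__file__))
sys.path.insert(0, os.path.join(here, 'libs'))

import cherrypy  # the bundled package, imported exactly once

class Root(object):
    @cherrypy.expose
    def index(self):
        return "Hello from the USB app!"

if __name__ == '__main__':
    cherrypy.quickstart(Root())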
If you want to import your own code, I'd set your PYTHONPATH (in the case below, the dev_folder) to a root development directory and follow this structure...
dev_folder \
    - project_name \
        - main_script.py
        - helper.py
        - library1 \
            - __init__.py
            - lib1.py
        - library2 \
            - __init__.py
            - lib2.py
You'd obviously come up with better names for the library folders/packages :-)
Hope this helps.
Related
I have a few questions regarding Python setup scripts, or rather how to properly set up a module (as I am doing this for the first time and am kind of struggling).
For simplicity I just post a link to the corresponding github repository rather than explaining the project in detail. I am fully aware that the project as it is will not work (e.g. the file constants.py is missing) but for starters I would like the "structure" to work.
There are two main components in this project, pymap and agb, which depend on each other (which should not be a problem, I guess). I would also like to use scripts located in the bin/ directory, which of course use the pymap and agb modules. For installation I use sudo ./setup.py develop, which does install the modules, as I can now use them in a python3 shell. The line import pymap.pymap_gui will throw an error (since constants.py is not yet in the project), but the import itself can be resolved.
When, on the other hand, calling the scripts via pymap.py, the same import cannot even be resolved:
ModuleNotFoundError: No module named 'pymap.pymap_gui'; 'pymap' is not a package
How can this be, even though the import works perfectly fine from a python3 shell?
Furthermore, where can I improve my project structure? Is my setup the way to go (never mind the messy code and the not-yet-working project itself; I am viewing this from a structural perspective)?
The problem was simply that the module had the same name as the script (pymap and pymap.py). Sorry for bothering!
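For anyone hitting the same error: the directory of the script being run is placed at the front of sys.path, so a script named pymap.py shadows the installed pymap package. A quick diagnostic (just a sketch) is to check which file the name actually resolves to:

import pymap
print(pymap.__file__)  # if this prints .../bin/pymap.py rather than the
                       # installed package, the script is shadowing the package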
I'm using Python to develop a few company-specific applications. There is a custom shared module ("library") that describes some data and algorithms, and there are dozens of Python scripts that work with this library. There are quite a lot of these files, so they are organized in subfolders:
myproject
    apps
        main_apps
            app1.py
            app2.py
            ...
    utils
        util1.py
        util2.py
        ...
    library
        __init__.py
        submodule1
            __init__.py
            file1.py
            ...
        submodule2
            ...
Users want to run these scripts by simply going, say, to myproject\utils and launching "py util2.py some_params". Many of these users are developers, so quite often they want to edit a library and immediately re-run scripts with updated code. There are also some 3rd party libraries used by this project and I want to make sure that everyone is using the same versions of these libs.
Now, there are two key problems I encountered:
how to reference (library) from (apps)?
how to manage 3rd party dependencies?
The first problem is familiar to many Python developers and has been asked on SO many times: it's quite difficult to instruct Python to import a package from "....\library". I tested several different approaches, but it seems that Python is reluctant to search for packages anywhere but the standard library locations or the folder of the script itself.
A relative import doesn't work, since the script is not part of the library (and even if it were, this still wouldn't work when the script is executed directly, unless it's placed in the "root" project folder, which I'd like to avoid)
Placing a .pth file in the script folder (as one might expect to work from reading this document) apparently doesn't have any effect
Of course, directly meddling with sys.path works, but boilerplate like the following in each and every script file looks quite terrible
import sys, os.path

# Walk two levels up from this script's folder to the project root
here = os.path.dirname(os.path.realpath(__file__))
module_root = os.path.abspath(os.path.join(here, '../..'))
sys.path.append(module_root)

import my_library
I realize that this happens because Python wants my library to be properly "installed", and that would indeed be the right way to go had this library been developed separately from the scripts that use it. But unfortunately that's not the case, and I think that re-doing the "installation" of the library each time it changes is going to be quite inconvenient and error-prone.
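For what it's worth, a properly "installed" library doesn't have to mean re-installing after every change: pip's editable mode installs a link to the source tree, so edits take effect immediately. A sketch, assuming a minimal setup.py is added at the library root (the distribution name is a placeholder):

# setup.py -- minimal sketch; the distribution name is a placeholder
from setuptools import setup, find_packages

setup(
    name='my-library',
    version='0.1',
    packages=find_packages(),
)

After a one-time pip install -e . from that folder (in whatever environment the scripts use), the library stays importable while you keep editing it in place.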
The second problem is straightforward. Someone adds a new 3rd-party module to our app/lib and everyone else starts seeing import problems once they update their apps. Several branches of development, different moments when users run pip install, a few rollbacks, and everyone eventually ends up using different versions of 3rd-party modules. In my case, things are additionally complicated by the fact that many devs still work a lot with older Python 2.x code, while I'd like to move on to Python 3.x.
While looking for a possible solution to my problems, I found Python's truly excellent virtual environments feature. Things looked quite bright:
Create a venv for myproject
Distribute a requirements.txt file as part of the app and provide a script that populates the venv accordingly
Symlink my own library into the venv site-packages folder so it will always be detected by Python
This solution looked quite natural and robust. I'm explicitly setting up my own environment for my project and placing whatever I need into this venv, including my own lib, which I can still edit on the fly. And it does indeed work. But calling activate.bat to make this Python environment active and another batch file to deactivate it is a mess, especially on Windows. Boilerplate code that edits sys.path looks terrible, but at least it doesn't interfere with the UX the way this potential fix does.
So there's a question that I want to ask.
Is there a way to bind a particular Python venv to particular folders so the Python launcher will automatically use this venv for scripts in those folders?
Is there a better alternative way to handle this situation that I'm missing?
The environment for my project is Python 3.6 running on Windows 10.
I think I finally found a reasonable answer. It's enough to just add a shebang line pointing to the Python interpreter in the venv, e.g.
#!../../venv/Scripts/python
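For illustration, the top of app1.py might then look like this (printing sys.executable is just a sanity check I'm adding, not part of the recipe):

#!../../venv/Scripts/python
# The Windows py launcher reads the shebang above and runs this script
# with the venv's interpreter, so imports resolve against the venv.
import sys
print(sys.executable)  # should point inside venv when run as "py app1.py"
                       # from the apps\main_apps folder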
The full project structure will look like this
myproject
    apps
        main_apps
            app1.py (with shebang)
            app2.py (with shebang)
            ...
    utils
        util1.py (with shebang)
        util2.py (with shebang)
        ...
    library
        __init__.py
        submodule1
            __init__.py
            file1.py
            ...
        submodule2
            ...
    venv
        (python interpreter, 3rd party modules)
        (symlink to library)
    requirements.txt
    init_environment.bat
and things work like this:
venv is a virtual Python environment with everything the project needs
init_environment.bat is a script that populates the venv according to requirements.txt and places a symlink to my library into the venv site-packages (a sketch of such a script follows below)
all scripts start with a shebang line pointing (with a relative path) to the venv interpreter
There's a full custom environment with all the libs, including my own, and the scripts that use it all have very natural imports. The Python launcher will also automatically pick Python 3.6 as the interpreter and load the relevant modules whenever any user-facing script in my project is launched from the console or Windows Explorer.
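A hedged sketch of what init_environment.bat might contain (the exact commands and paths are my assumptions, not the author's script; run from the myproject root):

rem init_environment.bat -- minimal sketch
py -3.6 -m venv venv
venv\Scripts\python -m pip install -r requirements.txt
rem the symlink is the step that needs elevated privileges (see cons below)
mklink /D venv\Lib\site-packages\library "%CD%\library"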
Cons:
The relative shebang won't work if a script is called from another folder
Users will still have to manually run init_environment.bat to update the virtual environment according to requirements.txt
The init_environment script on Windows requires elevated privileges to make the symlink (but hopefully that strange MS decision will be fixed with the upcoming Win10 update in April 2017)
However, I can live with these limitations. Hope this helps others looking into similar problems.
It would still be nice to hear other options (as answers) and critiques (as comments) too.
I'm currently working on a Python-based Google App Engine project. Specifically, I'm using Flask for the application. I'm wondering what the accepted method of including external Python modules is, specifically when it comes to the repository. From what I can tell, including other people's code in my repository is bad form for several reasons. However, other people will be working on the same repository, so we should be using the same external modules to ensure the same results.
Specifically, I need to include Flask (and its dependencies) to my application. The easiest way to do this with Google App Engine is just to throw them into the root level:
MyProject
    app.yaml
    main.py
    MyApp
    Flask
    ...
What is the proper way to bring in these external modules in such a project? Both a generalized answer and one specific to my case would be useful. Also, any other related recommendations would be appreciated. Thank you very much.
While it is indeed possible to include third-party libraries as submodules or symlinks to external repositories, in practice it's not a good idea. Here are two scenarios showing what could go wrong:
If the third-party library releases a new version that breaks functionality, you will have to either make all the changes necessary to meet the new requirements, or find the previous version to keep things working and cut the external connection. Usually this happens when you are very close to a deadline.
If the third-party library releases a new version and one of your colleagues upgrades and makes all the necessary changes to support it, the code on your side will be broken until you upgrade as well.
The above problems are much more visible in big projects with lots of dependencies, and as more people join the project in the long run, it becomes a huge problem! I could come up with more examples, but I think you can see the point.
Your best option is to include the external libraries in your repository, which also has the advantage that you can get the whole project up and running on a new machine without many dependencies. There are many ways to organize your third-party libraries, but all of them need to be included at the same level as, or deeper than, your app.yaml file. As @dragonx mentioned, include only the core library code.
Also, don't be afraid of putting stuff into your repository: space is not an issue today, and these libraries usually don't update that often, so your repository won't grow too much over time.
Since you mentioned Flask on Google App Engine, you can check out my gae-init project, where you can see in practice how the external libraries are organised.
You're actually asking two questions here.
How do I include the external library in my GAE project?
You've got the right idea. Whatever way you go about it, you must somehow include Flask and its dependencies in the root of your GAE project. One way is to put a copy directly in there.
The second way is to use a symbolic link to the folder that contains the external library. I'm not sure about Flask, but oftentimes external repos contain the actual library code in a subdirectory, so you often don't want the root of the repo in your GAE app, just the root of the actual source. In this case, it's easier to put in a symlink that points to the source folder.
How do I manage external libraries in my source repo?
This is a harder question to answer since it depends what source control tool you're using. Yes, you do want to have everyone use the same versions of external libraries, and they should be included in your source control somehow.
If you're using git, git submodule is the way to go. It's a bit confusing to start with but it'll get the job done.
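For example, adding Flask this way might look like the following (the repository URL is illustrative):

git submodule add https://github.com/pallets/flask.git thirdparty/flask
git submodule update --init --recursive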
I'd recommend a repo structure that looks something like this
repo/
    thirdparty/
        flask/
        other_dependency/
        another_dependency/
    README.TXT
    setup.py
    src/
        app/
            app.yaml
            your_source.py
            softlink_to_flask
            softlink_to_other_dependency
            softlink_to_another_dependency_src
In this example you keep the source to your external libraries in the thirdparty folder. These may be git submodules. In the app folder you have your source, and softlinks to the appropriate files that are actually needed for your app to run. In this case, the actual code for another_dependency may be in the another_dependency/src folder rather than the actual root of another dependency. This way you don't need to include the unnecessary files in your deployment folder, but you can still keep the entire library in your repo.
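For instance, the Flask softlink from the layout above might be created like this (Unix form shown; the relative target resolves from the link's directory, and the inner flask/ path is an assumption about where the package lives in that repo):

ln -s ../../thirdparty/flask/flask src/app/softlink_to_flask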
You can't just create a requirements.txt file and hand it to GAE. Your code must include all pure-Python libraries that your project uses and that aren't provided by GAE (https://developers.google.com/appengine/docs/python/tools/libraries27).
If you look at the Flask deployment examples for GAE (http://flask.pocoo.org/docs/quickstart/#deploying-to-a-web-server and https://github.com/kamalgill/flask-appengine-template), you can find dependencies like flask, werkzeug, etc., and all of these dependencies must be pushed to the GAE server.
So I see three solutions:
Use local requirements for local development and write a custom build step that downloads all dependencies, bundles them with your application, and uploads everything to the GAE server.
Add tools for local deployment, run when you first set up the project, that put the required libraries alongside your application (don't forget about .gitignore).
Use something like git submodules pointing at the dependencies' repositories.
There are two cases for using third-party Python packages in a Google App Engine project:
If your library is one of GAE's runtime-provided third-party libraries,
just add it to your app.yaml file under libraries:
libraries:
- name: package_name
  version: latest
Then import it in your code:
import package_name
Sometimes you need to install the package with
pip install package_name
Make sure you're using the right interpreter; by running
pip freeze
you can verify that the package was installed into the right path.
Otherwise, if GAE does not support your library, you need to download it manually and save it locally under the root/Lib directory:
either through Git
or through pip (pip install package_name -t path/to/your/Lib/dir)
After that, we should declare the Lib directory as a source directory in PyCharm:
pycharm->preferences->Project Structure
Choose the Lib directory and mark it as a source folder.
Then, import it.
import package_name
Note that when you're doing the import, you're choosing the local path and not your Python path.
In general, it's recommended to have a requirements.txt file that includes all the used package names; PyCharm will then recognize uninstalled packages and suggest installing them.
Good Luck
I have a Django project that uses a lot of 3rd-party apps, so I wanted to decide between two approaches for managing my situation:
I can use [virtualenv + pip] along with pip freeze output as a requirements file to manage my project dependencies.
This way I don't have to worry about the apps, but I can't have them committed with my code to SVN.
I can have a lib folder in my SVN structure, have my apps sit there, and add that folder to sys.path.
This way, my dependencies can be committed to SVN, but I have to manage sys.path myself.
Which way should I proceed ?
What are the pros and cons of each approach ?
Update:
Method 1 disadvantage: difficult to work with App Engine.
This has been an unanswered question (at least for me) so far. There has been some discussion on this recently:
https://plus.google.com/u/0/104537541227697934010/posts/a4kJ9e1UUqE
Ian Bicking said this in the comments:
I do think we can do something that incorporates both systems. I
posted a recipe for handling this earlier, for instance (I suppose I
should clean it up and repost it). You can handle libraries in a very
similar way in Python, while still using the tools we have to manage
those libraries. You can do that, but it's not at all obvious how to
do that, so people tend to rely on things like reinstalling packages
at deploy time.
http://tarekziade.wordpress.com/2012/02/10/defining-a-wsgi-app-deployment-standard/
The first approach seems the most common among Python devs. When I first started doing development in Django, it felt a bit weird, since in PHP it's quite common to check third-party libs into the project repo. But as Ian Bicking said in the linked post, PHP-style deployment leaves out things such as non-portable libraries. You don't want to package things such as mysqldb or PIL into your project; those are better handled by tools like pip or distribute.
So this is what I'm using currently.
All projects have a virtualenv directory at the project root. We name it .env and ignore it in VCS. The first thing a dev does when starting development is to initialize this virtualenv and install all requirements specified in the requirements.txt file. I prefer having the virtualenv inside the project dir so that it's obvious to developers, rather than having it in some other place such as $HOME/.virtualenv and then doing source $HOME/.virtualenv/project_name/bin/activate to activate the environment. Instead, developers interact with the virtualenv by invoking the env executables directly from the project root, such as:
.env/bin/python
.env/bin/python manage.py runserver
To deploy, we have a Fabric script that first exports our project directory together with the .env directory into a tarball, then copies the tarball to the live server, untars it in the deployment dir, and does some other tasks like restarting the server. When we untar the tarball on the live server, the Fabric script makes sure to run virtualenv once again so that all the shebang paths in .env/bin get fixed. This means we don't have to reinstall dependencies on the live server. The Fabric workflow for deployment looks like:
fab create_release:1.1 # create release-1.1.tar.gz
fab deploy:1.1 # copy release-1.1.tar.gz to live server and do the deployment tasks
fab deploy:1.1,reset_env=1 # same as above but recreate virtualenv and re-install all dependencies
fab deploy:1.1,update_pkg=1 # only reinstall deps but do not destroy previous virtualenv like above
We also do not install the project src into the virtualenv using setup.py, but instead add its path to sys.path. So when deploying under mod_wsgi, we have to specify two paths in our vhost config for mod_wsgi, something like:
WSGIDaemonProcess project1 user=joe group=joe processes=1 threads=25 python-path=/path/to/project1/.env/lib/python2.6/site-packages:/path/to/project1/src
In short:
We still use pip+virtualenv to manage dependencies.
We don't have to reinstall requirements when deploying.
We have to maintain the sys.path entries a bit.
Virtualenv and pip are fantastic for working on multiple django projects on one machine. However, if you only have one project that you are editing, it is not necessary to use virtualenv.
I want to use a couple of third-party Django packages for my application, so I want to install each package locally for that application (or for my whole project).
The Python custom-installation instructions look a bit scary; how do I do this as simply as possible?
You can just put the library module into your project folder, because your project folder will be on the PYTHONPATH automatically when running via manage.py runserver, and your WSGI script will point to it when running in production.
Usually all Python packages are laid out like this:

package directory
    module directory
    ... other files/dirs like README, Manifest and so on
What needs to be in your Django project folder is only the module directory from the example above, not the rest of the package.
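For example (all names hypothetical): if a downloaded package unpacks to

somepackage-1.0/
    somepackage/
        __init__.py
        models.py
    README
    setup.py

then only the inner somepackage/ directory goes into your Django project folder.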
Use virtualenv.
That is by far the best solution for installing custom packages for each project.
Just use --home; there's nothing scary about it.
Don't forget to make your PYTHONPATH point there, though.
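A minimal sketch of that route (the target directory is just an example; with the --home scheme, pure-Python modules land under <home>/lib/python):

python setup.py install --home=/path/of/your/choice
export PYTHONPATH=/path/of/your/choice/lib/python:$PYTHONPATH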