Python web app that can download itself - python

I'm writing a small web app that I'd like to include the ability to download itself. The ideal solution would be for users to be able to "pip install" the full app but that users of the app would be able to download a version of it to use themselves (perhaps with reduced functionality or without some of the less essential dependencies).
I'm currently using Bottle as I'd like to keep everything as close to the standard library as possible. Users could be on different platforms or Python versions, which are other reasons for minimising the use of extra modules. (Though I'll assume 2.7 or 3.3 will be in use regardless of platform).
My current thinking is to have the app use __file__ or similar and zip itself up. It could also use setuptools/distribute and call sdist on itself. Users could then execute the zip file, or install the app using the source distribution. (ideally I'd like to provide both of these options).
The app would include aggressive import checking to fallback to available modules, with Bottle being the only requirement (and would be included in the downloaded file).
Can anyone think of a robust approach to providing this functionality?
Update: users of the app cannot be guaranteed to have internet access at all times, hence the requirement for being able to download a version of the app from someone who as previously installed it. Python experience cannot be assumed either, hence the idea of letting users run python -m myApp.zip to run their own version.
Update II: as the level of python experience also cannot be guaranteed I'd want the simplest way for a user to get a mostly working version of the app. Experienced users would then be free to 'upgrade' the app by installing their own choice of additional modules. The vast majority of these would be different servers to host the app from (CherryPy, Twisted, etc) and so would not strictly count as a dependency but a "nice to have".
Update III: based on the answer below I will look into a PyPI/buildout based solution but would still be interested in whether there is a specific solution to the above approach.

Just package your app and put it on PyPI. Trying to automatically package the code running on the server seems over-engineered. Then you can let people use pip to install your app. In your app, provide a link to the PyPI page.
Then you can also add dependencies in the setup.py, and pip will install them for you. It seems like you are trying to build your own packaging infrastructure, but don't have to. Use what's out there.

Related

GAE - Including external python modules without adding them to the repository?

I'm current working on a python based Google App Engine project. Specifically, I'm using Flask for the application. I'm wondering what the accepted method of including external python modules is, specifically when it comes to the repository. From what I can tell, including other people's code in my repository is bad form for several reasons. However, other people will be working on the same repository, so we should be using the same external modules to insure the same results.
Specifically, I need to include Flask (and its dependencies) to my application. The easiest way to do this with Google App Engine is just to throw them into the root level:
MyProject
app.yaml
main.py
MyApp
Flask
...
What is the proper way to bring in these external modules in such a project? Both a generalized answer and one specific to my case would be useful. Also, any other related recommendations would be appreciated. Thank you much.
While it is indeed possible to include third party libraries as submodules or symlinks from external repositories, in practice it's not a good idea. Here are two scenarios on what could go wrong:
If the third party library releases a new version that breaks the functionality, you will have to either make all the necessary changes to meet the new requirements or simply find the previous version to keep working and break the external connection. Usually this happens when you are very close to deadlines.
If the third party library releases a new version and one of your colleagues is upgraded and made all the necessary changes to support the new version, on your side the code will be broken until you will upgrade as well.
The above examples are much more visible in big projects with lots of dependencies and as more people joining the project in the long run it becomes a huge problem! I could come up with more examples, but I think you can see the point.
Your best option is to include the external libraries into your repository, which also has the advantage that you are able to have the whole project up and running on a new machine without many dependencies. There are many ways on how to organize your third party libraries and all of them needs to be included on the same or deeper level with your app.yaml file. Just as #dragonx mentioned include only the core library code.
Also do not afraid putting stuff into your repository cause space is not an issue today and these libraries usually not updating that often so your repository size is not getting too much bigger over time.
Since you mentioned Flask on Google App Engine, you can check out my gae-init project, where you can see in practice how the external libraries are organised.
You're actually asking two questions here.
How do I include the external library in my GAE project?
You've got the right idea. Whatever way you go about it, you must somehow include Flask and its dependencies in the root of your GAE project. One way is to put a copy directly in there.
The second way is to use a symbolic link to the folder that contains the external library. I'm not sure about Flask, but often times external repos contain the actual library code in a subdirectory - so often you don't want the root of the repo in your GAE app, just the root of the actual source. In this case, it's easier to put a symlink that links to the source folder.
How do I manage external libraries in my source repo?
This is a harder question to answer since it depends what source control tool you're using. Yes, you do want to have everyone use the same versions of external libraries, and they should be included in your source control somehow.
If you're using git, git submodule is the way to go. It's a bit confusing to start with but it'll get the job done.
I'd recommend a repo structure that looks something like this
repo/
thirdparty/
flask/
other_dependency/
another_dependency/
README.TXT
setup.py
src/
app/
app.yaml
your_source.py
softlink_to_flask
softlink_to_other_dependency
softlink_to_another_dependency_src
In this example you keep the source to your external libraries in the thirdparty folder. These may be git submodules. In the app folder you have your source, and softlinks to the appropriate files that are actually needed for your app to run. In this case, the actual code for another_dependency may be in the another_dependency/src folder rather than the actual root of another dependency. This way you don't need to include the unnecessary files in your deployment folder, but you can still keep the entire library in your repo.
You can't just create requirements.txt and put it to GAE. Your code must include all pure python libraries that used your project and doesn't supported by GAE (https://developers.google.com/appengine/docs/python/tools/libraries27).
If you look at flask deploy example for GAE (http://flask.pocoo.org/docs/quickstart/#deploying-to-a-web-server and https://github.com/kamalgill/flask-appengine-template) you can find some dependencies like flask, werkzeug and etc. and all this dependencies you must push to GAE server.
So I see three solutions:
Use local requirements for local development and make custom build function that will download all dependencies, put with your application and upload to GAE server.
Add tools for local deployment when you just start project that put required libraries with your application (don't forget about .gitignore).
Use something like git submodules to requirements repositories.
There is two case for using python third party packages in google app engine project:
If your library is one of the supported runtime-provided third-party libraries of GAE section
just add it to your app.yml file under libraries
libraries:
- name: package_name
version: latest
Add your code
import pack_name
Sometimes you need to install the package with
pip install package_name
Make sure you're using the right interpreter, by using
pip freeze
you can make sure the package is installed successfully to the right path.
Otherwise, if GAE does not support you library, you need to download it manually and save it locally under root/Lib directory:
or through GIT
or through pip (pip install package_name -t path/to/your/Lib/dir)
After that, we should declare Lib directory as source dir in pycharm
pycharm->preferences->Project Structure
Choose Lib directory and mark it as source.
Then, import it.
import pack_name
Pay attention that when you're doing the import, you choosing the local path and not your python path.
In general, that's recommended to have requirements.txt file, that includes all the used packages names, and then the pycharm will recognize the uninstalled packages and suggest you to install them.
Good Luck

python packages: how to depend on the latest version of a separate package

I'm developing coding a test django site, which I keep in a bitbucket repository in order to be able to deploy it easily on a remote server, and possible share development with a friend. I use hg for version control.
The site depends on a 3rd party app (django-registration), which I needed to customize for my site, so I forked the original app and created a 2nd repository for it (the idea being that this way I can keep up with updates on the original, which wouldnt be possible if I just pasted the code into my main site, plus add to my own custom code) (You can see some more details on this question)
My question is, how do I specify requirements on my setup.py file so that when I install my django site I get the latest version of my fork for the 3rd party app (I use distribute rather than setuptools in case that makes a difference)?
I have tried this:
install_requires = ['django', 'django-registration'],
dependency_links = ['https://myuser#bitbucket.org/myuser/django-registration#egg=django_registration']
but this gets me the latest named version on the original trunk (so not even the tip version)
Using a pip requirements file however works well:
hg+https://myuser#bitbucket.org/myuser/django-registration#egg=django-registration
gets me the latest version from my fork.
Is there a way to get this same behaviour directly from the setup.py file, without having to install first the code for the site, then running pip install -r requirements.txt?
This question is very informative, but seems to suggest I should depend on version 'dev' or the 3rd party package, which doesn't work (I guess there would have to be a specific version tagged as dev for that)
Also I'm a complete newbie in packaging / distribute / setuptools, so dont hold back spelling out the steps :)
Maybe I should change the setup.py file on my fork of the 3rd party app, and make sure it mentions a version number. Generally I'm curious to know what's a source distribution, as opposed to simply having my code on a public repository, and what would be a binary distribution in my case (an egg file?), and whether that would be any more practical for me when deploying remotely / have my friend deploy on his pc. And also would like to know how do I tag a version on my repository for setup.py to refer to it, is it simply a version control tag (hg in my case)?. Feel free to comment on any details you think are important for the starter packager :)
Thanks!
put this:
dependency_links=['https://bitbucket.org/abraneo/django-registration/get/tip.tar.gz#egg=django-registration']
in the dependency_links you have to pass a download url like that one.
"abraneo" is a guy who forked this project too, replace his name by yours.

Good way to Django-based website, installing prerequisites if needed

Consider a website build using python and django. In many cases it uses 3rd party modules beside standard python library - such as pytz, South, timezones or debug toolbar.
What is standard or just convenient way to deploy such application to production hosting with all the prerequisites (timezones, etc) installed automatically?
I'm new to python, and sorry if this question is lame.
There are at least two options available. Jacob Kaplan-Moss, one of the co founders of Django has written about packaging an application using buildout and djangorecipe. There is also the versatile fabric. You should be able to tackle your problem using either of these alone or in combination with some custom scripts.
Fabric is definitely a nice way to accomplish this. There is a fairly extensive blog write up on the process at http://www.caktusgroup.com/blog/2010/04/22/basic-django-deployment-with-virtualenv-fabric-pip-and-rsync/.
The key to fabric is "fabfile.py" - there's an example of one that does a deployment at http://bitbucket.org/copelco/caktus-deployment/src/tip/example-django-project/caktus_website/fabfile.py.
The variation of this that I've used to deploy to a Linode instance is http://gist.github.com/556508
You can either use a deployment solution like fabric (http://fabfile.org/) or you can try to package the entire thing up into a python egg with dependencies that will be automatically installed when you easy_install it. See http://mxm-mad-science.blogspot.com/2008/02/python-eggs-simple-introduction.html for a simple introduction to python eggs.

Creating portable Django apps - help needed

I'm building a Django app, which I comfortably run (test :)) on a Ubuntu Linux host. I would like to package the app without source code and distribute it to another production machine. Ideally the app could be run by ./runapp command which starts a CherryPy server that runs the python/django code.
I've discovered several ways of doing this:
Distributing the .pyc files only and building and installing all the requirements on target machine.
Using one of the many tools to package Python apps into a distributable package.
I'm really gunning for nr.2 option, I'd like to have my Django app contained, so it's possible to distribute it without needing to install or configure additional things. Searching the interwebs provided me with more questions than answers and a very sour taste that Django packing is an arcane art that everybody knows but nobody speaks about. :)
I've tried Freeze (fails), Cx_freeze (easy install version fails, repository version works, but the app output fails) and red up on dbuilder.py (which is supposed to work but doesn't work really - I guess). If I understand correctly most problems originate form the way that Django imports modules (example) but I have no idea how to solve it.
I'll be more than happy if anyone can provide any pointers or good resources online regarding packing/distributing standalone Django applications.
I suggest you base your distro on setuptools (a tool that enhances the standard Python distro mechanizm distutils).
Using setuptools, you should be able to create a Python egg containing your application. The egg's metadata can contain a list of dependencies that will be automatically installed by easy_install (can include Django + any third-party modules/packages that you use).
setuptools/distutils distros can include scripts that will be installed to /usr/bin, so that's how you can include your runapp script.
If you're not familiar with virtualenv, I suggest you take a look at that as well. It is a way to create isolated Python environments, it will be very useful for testing your distro.
Here's a blog post with some info on virtualenv, as well as a discussion about a couple of other nice to know tools: Tools of the Modern Python Hacker: Virtualenv, Fabric and Pip
The --noreload option will stop Django auto-detecting which modules have changed. I don't know if that will fix it, but it might.
Another option (and it's not ideal) is to obscure some of your core functionality by packaging it as a dll, which your plain text code will call.

How might I handle development versions of Python packages without relying on SCM?

One issue that comes up during Pinax development is dealing with development versions of external apps. I am trying to come up with a solution that doesn't involve bringing in the version control systems. Reason being I'd rather not have to install all the possible version control systems on my system (or force that upon contributors) and deal the problems that might arise during environment creation.
Take this situation (knowing how Pinax works will be beneficial to understanding):
We are beginning development on a new version of Pinax. The previous version has a pip requirements file with explicit versions set. A bug comes in for an external app that we'd like to get resolved. To get that bug fix in Pinax the current process is to simply make a minor release of the app assuming we have control of the app. Apps we don't have control we just deal with the release cycle of the app author or force them to make releases ;-) I am not too fond of constantly making minor releases for bug fixes as in some cases I'd like to be working on new features for apps as well. Of course branching the older version is what we do and then do backports as we need.
I'd love to hear some thoughts on this.
Could you handle this using the "==dev" version specifier? If the distribution's page on PyPI includes a link to a .tgz of the current dev version (such as both github and bitbucket provide automatically) and you append "#egg=project_name-dev" to the link, both easy_install and pip will use that .tgz if ==dev is requested.
This doesn't allow you to pin to anything more specific than "most recent tip/head", but in a lot of cases that might be good enough?
I meant to mention that the solution I had considered before asking was to put up a Pinax PyPI and make development releases on it. We could put up an instance of chishop. We are already using pip's --find-links to point at pypi.pinaxproject.com for packages we've had to release ourselves.
Most open source distributors (the Debians, Ubuntu's, MacPorts, et al) use some sort of patch management mechanism. So something like: import the base source code for each package as released, as a tar ball, or as a SCM snapshot. Then manage any necessary modifications on top of it using a patch manager, like quilt or Mercurial's Queues. Then bundle up each external package with any applied patches in a consistent format. Or have URLs to the base packages and URLs to the individual patches and have them applied during installation. That's essentially what MacPorts does.
EDIT: To take it one step further, you could then version control the set of patches across all of the external packages and make that available as a unit. That's quite easy to do with Mercurial Queues. Then you've simplified the problem to just publishing one set of patches using one SCM system, with the patches applied locally as above or available for developers to pull and apply to their copies of the base release packages.
EDIT: I am not sure I am reading your question correctly so the following may not answer your question directly.
Something I've considered, but haven't tested, is using pip's freeze bundle feature. Perhaps using that and distributing the bundle with Pinax would work? My only concern would be how different OS's are handled. For example, I've never used pip on Windows, so I wouldn't know how a bundle would interact there.
The full idea I hope to try is creating a paver script that controls management of the bundles, making it easy for users to upgrade to newer versions. This would require a bit of scaffolding though.
One other option may be you keeping a mirror of the apps you don't control, in a consistent vcs, and then distributing your mirrored versions. This would take away the need for "everyone" to have many different programs installed.
Other than that, it seems the only real solution is what you guys are doing, there isn't a hassle-free way that I've been able to find.

Categories

Resources