Reading setup.py via API to test requirements

Reading setup.py via API to test requirements - python

I'm working on a plugin system and was thinking of simply using a setup.py file for each 'plugin' since it's already an existing dependency. The thing is, I need a way to test requirements.
Is there already an existing API in place for this, or would it make more sense just to roll a custom system and check it manually?

setup.py is a script, and you can't generally parse that to figure out requirements, especially since some setup scripts will change the requirements depending on the python version used to run them.
There is an upcoming standard that will fix this: PEP 345. At this point in time very few packages make use of this though. For more information on this topic you can look at the distutils-sig list archives where this topic has come up several times.
Have you looked at egg entry points? They basically implement a plugin system that you can use directly. This stackoverflow question has some information that might be interesting.

Related

Integrate multiple Python projects

I have 2 Python projects:
Proj1(/var/www/proj1)
venv
requirments.txt
app
fun.py
fun2.py
app2
pdf.py
somefun2.py
Proj2(/var/www/proj2)
venv
requirments.txt
another
anotherfun.py
anotherfun2.py
someanother
someanotherfun.py
pdfproj2.py
Both work individually and both have different set of requirements.
Lets say pdf.py from proj1 has a function generate which will generate some PDFs. It will take all other modules(app/fun2 etc) in same project for it.
Now what I want is this functionality(pdf.py->generate) I want to call in pdfproj2.py in proj2.
How is this possible?
Nb: I am not using any frameworks like flask/django etc

There are at least three approaches.
1. external call
Change nothing. Pretty much.
Command line callers are already able to take advantage
of $ python proj1/app2/pdf.py arg..., invoking generate().
Arrange for proj2pdf.py to fork off a subprocess
and do exactly that.
Nothing changes in project1,
since its public API already supports this use case.
Notice that you might need to carefully finesse the PATH & PYTHONPATH
env vars, as part of correctly invoking that pdf.py command.
That's the sort of setup that conda and venv are good at.
2. merge projects
This is the quick-n-dirty approach. I do not recommend it.
Create project3, and incorporate source code from both existing projects.
Take the union of all library dependencies.
Now you can call generate() in the same address space,
the same process, as the calling python code.
Downside is: ugliness.
The bigger project's codebase is not as easily maintainable.
3. packaging
The "right" way to make generate() available to project2,
or to any project, is to package it up.
Pretend you're going to publish on pypi.
Doesn't matter if you actually do,
let's just prepare for such a possibility.
Create setup.py or similar, maybe use setuptools,
and create a wheel (or at least a tar) of project1.
There are many ways to do this, and best practices
continue to evolve, so I won't delve into details here.
Now you can list project1 as a dependency in project2's requirements.txt,
and import it just like any other dep. Problem solved!
This is the best approach.
It does involve a bit of work, and a gentle learning curve.

Python web app that can download itself

I'm writing a small web app that I'd like to include the ability to download itself. The ideal solution would be for users to be able to "pip install" the full app but that users of the app would be able to download a version of it to use themselves (perhaps with reduced functionality or without some of the less essential dependencies).
I'm currently using Bottle as I'd like to keep everything as close to the standard library as possible. Users could be on different platforms or Python versions, which are other reasons for minimising the use of extra modules. (Though I'll assume 2.7 or 3.3 will be in use regardless of platform).
My current thinking is to have the app use __file__ or similar and zip itself up. It could also use setuptools/distribute and call sdist on itself. Users could then execute the zip file, or install the app using the source distribution. (ideally I'd like to provide both of these options).
The app would include aggressive import checking to fallback to available modules, with Bottle being the only requirement (and would be included in the downloaded file).
Can anyone think of a robust approach to providing this functionality?
Update: users of the app cannot be guaranteed to have internet access at all times, hence the requirement for being able to download a version of the app from someone who as previously installed it. Python experience cannot be assumed either, hence the idea of letting users run python -m myApp.zip to run their own version.
Update II: as the level of python experience also cannot be guaranteed I'd want the simplest way for a user to get a mostly working version of the app. Experienced users would then be free to 'upgrade' the app by installing their own choice of additional modules. The vast majority of these would be different servers to host the app from (CherryPy, Twisted, etc) and so would not strictly count as a dependency but a "nice to have".
Update III: based on the answer below I will look into a PyPI/buildout based solution but would still be interested in whether there is a specific solution to the above approach.

Just package your app and put it on PyPI. Trying to automatically package the code running on the server seems over-engineered. Then you can let people use pip to install your app. In your app, provide a link to the PyPI page.
Then you can also add dependencies in the setup.py, and pip will install them for you. It seems like you are trying to build your own packaging infrastructure, but don't have to. Use what's out there.

What are the use cases for a Python distribution?

I'm developing a distribution for the Python package I'm writing so I can post
it on PyPI. It's my first time working with distutils, setuptools, distribute,
pip, setup.py and all that and I'm struggling a bit with a learning curve
that's quite a bit steeper than I anticipated :)
I was having a little trouble getting some of my test data files to be
included in the tarball by specifying them in the data_files parameter in setup.py until I came across a different post here that pointed me
toward the MANIFEST.in file. Just then I snapped to the notion that what you
include in the tarball/zip (using MANIFEST.in) and what gets installed in a
user's Python environment when they do easy_install or whatever (based on what
you specify in setup.py) are two very different things; in general there being
a lot more in the tarball than actually gets installed.
This immediately triggered a code-smell for me and the realization that there
must be more than one use case for a distribution; I had been fixated on the
only one I've really participated in, using easy_install or pip to install a
library. And then I realized I was developing work product where I had only a
partial understanding of the end-users I was developing for.
So my question is this: "What are the use cases for a Python distribution
other than installing it in one's Python environment? Who else am I serving
with this distribution and what do they care most about?"
Here are some of the working issues I haven't figured out yet that bear on the
answer:
Is it a sensible thing to include everything that's under source control
(git) in the source distribution? In the age of github, does anyone download
a source distribution to get access to the full project source? Or should I
just post a link to my github repo? Won't including everything bloat the
distribution and make it take longer to download for folks who just want to
install it?
I'm going to host the documentation on readthedocs.org. Does it make any
sense for me to include HTML versions of the docs in the source
distribution?
Does anyone use python setup.py test to run tests on a source
distribution? If so, what role are they in and what situation are they in? I
don't know if I should bother with making that work and if I do, who to make
it work for.

Some things that you might want to include in the source distribution but maybe not install include:
the package's license
a test suite
the documentation (possibly a processed form like HTML in addition to the source)
possibly any additional scripts used to build the source distribution
Quite often this will be the majority or all of what you are managing in version control and possibly a few generated files.
The main reason why you would do this when those files are available online or through version control is so that people know they have the version of the docs or tests that matches the code they're running.
If you only host the most recent version of the docs online, then they might not be useful to someone who has to use an older version for some reason. And the test suite on the tip in version control may not be compatible with the version of the code in the source distribution (e.g. if it tests features added since then). To get the right version of the docs or tests, they would need to comb through version control looking for a tag that corresponds to the source distribution (assuming the developers bothered tagging the tree). Having the files available in the source distribution avoids this problem.
As for people wanting to run the test suite, I have a number of my Python modules packaged in various Linux distributions and occasionally get bug reports related to test failures in their environments. I've also used the test suites of other people's modules when I encounter a bug and want to check whether the external code is behaving as the author expects in my environment.

writing an api for python that can be installed using setup.py method

I am new at writing APIs in python, in any language for that matter. I was hoping to get pointers on how i can create an API that can be installed using setup.py method and used in other python projects. Something similar to the twitterapi.
I have already created and coded all the methods i want to include in the API. I just need to know how to implement the installation so other can use my code to leverage ideas they may have. Or if i need to format the code a certain way to facilitate installation.
I learn best with examples or tutorials.
Thanks so much.

It's worth noting that this part of python is undergoing some changes right now. It's all a bit messy. The most current overview I know of is the Hitchhiker's Guide to Packaging: http://guide.python-distribute.org/
The current state of packaging section is important: http://guide.python-distribute.org/introduction.html#current-state-of-packaging

The python packaging world is a mess (like poswald said). Here's a brief overview along with a bunch of pointers. Your basic problem (using setup.py etc.) is solved by reading the distutils guide which msw has mentioned in his comment.
Now for the dirt. The basic infrastructure of the distribution modules which is in the Python standard library is distutils referred to above. It's limited in some ways and so a series of extensions was written on top of it called setuptools. Setuptools along with actually increasing the functionality provided a command line "installer" called "easy_install".
Setuptools maintenance was not too great and so it was forked and a more active branch called "distribute" was setup and it is the preferred alternative right now. In addition to this, a replacement for easy_install named pip was created which was more modular and useful.
Now there's a huge project going which attempts to fold in all changes from distribute and stuff into a unified library that will go into the stdlib. It's tentatively called "distutils2".

Organizing Python projects with shared packages

What is the best way to organize and develop a project composed of many small scripts sharing one (or more) larger Python libraries?
We have a bunch of programs in our repository that all use the same libraries stored in the same repository. So in other words, a layout like
trunk
libs
python
utilities
projects
projA
projB
When the official runs of our programs are done, we want to record what version of the code was used. For our C++ executables, things are simple because as long as the working copy is clean at compile time, everything is fine. (And since we get the version number programmatically, it must be a working copy, not an export.) For Python scripts, things are more complicated.
The problem is that, often one project (e.g. projA) will be running, and projB will need to be updated. This could cause the working copy revision to appear mixed to projA during runtime. (The code takes hours to run, and can be used as inputs for processes that take days to run, hence the strong traceability goal.)
My current workaround is, if necessary, check out another copy of the trunk to a different location, and run off there. But then I need to remember to change my PYTHONPATH to point to the second version of lib/python, not the one in the first tree.
There's not likely to be a perfect answer. But there must be a better way.
Should we be using subversion keywords to store the revision number, which would allow the data user to export files? Should we be using virtualenv? Should we be going more towards a packaging and installation mechanism? Setuptools is the standard, but I've read mixed things about it, and it seems designed for non-developer end users (of which we have none).

The much better solution involves not storing all your projects and their shared dependencies in the same repository.
Use one repository for each project, and externals for the shared libraries.
Make use of tags in the shared library repositories, so consumer projects may use exactly the version they need in their external.
Edit: (just copying this from my comment) use virtualenv if you need to provide isolated runtime environments for the different apps on the same server. Then each environment can contain a unique version of the library it needs.

If I'm understanding your question properly, then you definitely want virtualenv. Add in some virtualenvwrapper goodness to make it that much better.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.