So far I have only seen requirements.txt entries like Django==2.0. Now I came across this style of writing: Django>=1.8,<2.1.99
Can you explain to me what it means?
requirements.txt is a file where one specifies dependencies. For example, your program here depends on Django (you probably do not want to implement Django yourself).
If you only write a custom application and do not plan to export it (for example as a library) to other programmers, you can pin the version of the library, for example Django==2.0.1. Then you can always assume (given pip manages to install the correct package) that your environment will have the correct version, and thus that if you follow the matching documentation, no problems will (well, should) arise.
If, however, you implement a library, for example mygreatdjangolibrary, then you probably do not want to pin the version: it would mean that everybody who wants to use your library has to install Django==2.0.1. Imagine they want a feature that is only available in Django 2.1; then, if they follow the dependencies strictly, they cannot have it: your library requires 2.0.1. This is of course not manageable.
So typically in a library, one aims to give the user as much freedom as possible. Ideally, your library would work regardless of the Django version the user installed.
Unfortunately, this would result in a lot of trouble for the library developer. Imagine that you have to take into account that a user can be on anything from Django 1.1 up to Django 2.1. Over the years, several features have been introduced that the library then cannot use, since the programmer should be conservative and take into account that these features may not exist in the Django version the user installed.
It gets even worse since Django went through some refactoring: some features were later removed, so we cannot simply program against Django 1.1 and hope that everything works out.
So in that case, it makes sense to specify a range of versions we support. For example, we can read the documentation of Django 2.0, look at the release notes to see whether something relevant changed in Django 2.1, and let tox run our tests against both versions (see the sketch below). We can then specify a range like Django>=2.0,<2.1.99.
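For illustration, a minimal tox.ini along those lines could look like this (just a sketch; the Python version, factor names and test runner are my own assumptions, not something from the question):

[tox]
envlist = py36-django{20,21}

[testenv]
deps =
    django20: Django>=2.0,<2.1
    django21: Django>=2.1,<2.2
    pytest
commands = pytest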
This is also important if you depend on several libraries that share a common requirement. Say, for example, you want to install a library liba and a library libb; both depend on Django, but the two specify different ranges, for example:
liba:
Django>=1.10, <2.1
libb:
Django>=1.9, <1.11
This thus means that we can only install a Django version that satisfies both ranges, i.e. Django>=1.10,<1.11.
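If you want to check such an intersection programmatically, here is a small sketch using the third-party packaging library (my own addition, not something pip does for you):

from packaging.specifiers import SpecifierSet
from packaging.version import Version

# liba's and libb's Django ranges from above, combined
combined = SpecifierSet(">=1.10,<2.1") & SpecifierSet(">=1.9,<1.11")

for candidate in ["1.9", "1.10", "1.10.8", "1.11", "2.0"]:
    print(candidate, Version(candidate) in combined)
# only the 1.10.x candidates satisfy both ranges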
The above easily gets even more complex, since liba and libb of course have versions as well, for example:
liba-0.1:
Django>=1.10, <2.1
liba-0.2:
Django>=1.11, <2.1
liba-0.3:
Django>=1.11, <2.2
libb-0.1:
Django>=1.5, <1.8
libb-0.2:
Django>=1.10, <2.0
So if we now want to install any liba and any libb, we need to find versions of liba and libb that "allow" us to install some Django version, and that is not trivial: if, for example, we picked libb-0.1, then there is no version of liba that supports an "overlapping" Django version.
To the best of my knowledge, pip currently has no dependency resolution algorithm. It looks at the specification, each time aims to pick the most recent version that satisfies the constraints, and recursively installs the dependencies of these packages.
Therefore it is up to the user to make sure that (sub)dependencies do not conflict: if we specified liba libb==0.1, then pip would probably install Django 2.1, and only then find out that libb cannot work with it.
There are some dependency resolution programs, but the problem turns out to be quite hard (it is NP-hard, if I recall correctly). That means that, for a given dependency tree, it can take years to find a valid configuration.
Related
I have a question about Python packaging.
My project is a library, and I would like to have a demo of how to use this library in the form of an application with a GUI.
I'm thinking of doing the following, using install_requires from setuptools:
1 package for the library
1 package for the demo with GUI (auto-installed when the 1st package is, or that installs the library when the demo is installed)
Is that overkill, or is it a decent way to approach it?
Basically I don't want to bloat my library with an entire demo application.
edit:
I probably need to give a bit more context to explain the situation better.
Basically my Library is an API to represent data and interact with it. But since I’m only offering the representation part, it doesn’t actually do anything when you interact with it.
The idea behind the demo is to create an entire environment in the form of an application that will provide contextual data to represent -> represent it -> then connect it back to the environment, so that when the user interacts with the representation it actually does something, such as modifying the data.
So in that demo, most of the content is context-based and irrelevant to the library, and that's what I'm referring to when I say clutter/bloat. It's only there to provide a context for the user and to have a finished product that actually does something when they test it.
Also I realize that over time I could grow that demo even more to show more possibilities of implementation without having to change the library.
That’s why I want to separate it.
So basically I want to always install the latest compatible version of the demo when the library is installed and also potentially be able to separately update the demo to the latest version if needed.
I hope this makes sense :)
Cheers,
If a package provides some kind of optional feature that isn't part of the core functionality, it is usually gated behind an extra. This only addresses the dependencies your project has; the demo code would always be part of your library. That's usually OK though, because dependencies are where the really big chunks of bandwidth go.
Just to have an example to go on, the popular pyparsing package offers, in its version 3, an extra called diagrams. The code that makes diagrams work is part of the codebase and will be downloaded when installing pyparsing even without the extra. But its ~400 lines of code are nicely split off from the core so that they can't possibly interfere with it. It's not clutter if it can safely be ignored.
The additional packages needed to make it work are a different story. On my machine, running pip install pyparsing==3.0.0a2 takes up 648kB of space. Running pip install pyparsing[diagrams]==3.0.0a2 takes up 1872kB in total, nearly three times as much. And this isn't even an extreme example, it just happened to be the last thing I installed that had an extra.
Summed up, if your optional feature doesn't introduce any additional dependencies, it's best practice to just include the feature. If it does introduce dependencies (e.g. some gui_lib_for_demo), split them off through extras, and maybe do something like this in your demo code
try:
    import gui_lib_for_demo
except ImportError:
    print("If you want to run the demo, please install this library with "
          "`pip install my_lib[demo]` instead.")
# actual demo code starts here
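For completeness, the extra itself is declared on the packaging side. A minimal sketch with setuptools, reusing the hypothetical names my_lib and gui_lib_for_demo from above:

from setuptools import setup, find_packages

setup(
    name="my_lib",
    version="0.1.0",
    packages=find_packages(),
    # the demo's extra dependency is only installed when the user
    # explicitly asks for it with `pip install my_lib[demo]`
    extras_require={
        "demo": ["gui_lib_for_demo"],
    },
)

Users who just want the library run pip install my_lib; those who want the demo run pip install my_lib[demo].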
Question
How can I install the lowest possible version of a dependency using pip (or any other tool, for that matter), given some set of version specifiers? So if I were to specify requests>=2.0,<3.0, I would actually want to install requests==2.0.
Motivation
When creating Python packages you probably want to enable users to interact with them in a variety of environments. This means that you'll often claim to support many versions of a given dependency. For example, one might claim that their package supports requests>=2.0. The problem here is that to responsibly make this claim we need some way of testing that our package works not only with requests==2.0 but also with the latest version and everything in between.
Unfortunately I don't think there's a good solution for testing one's support for all possible combinations of dependencies, but you can at least try to test the maximum and minimum versions of your dependencies.
To solve this problem I usually create a requirements.min.txt file that contains the lowest version of all my dependencies, so that I can install those when I want to test that my changes haven't broken my support for older versions. Thus, if I claimed to support requests>=2.0, my requirements.min.txt file would have requests==2.0.
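For example, with the requests>=2.0 claim from above, the two files could look like this (the upper bound is just illustrative):

requirements.txt:
requests>=2.0,<3.0
requirements.min.txt:
requests==2.0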
This solution to the problem of testing against one's minimum dependency set feels messy, though. Has anyone else found a good solution?
not a full answer, but something that might help you get started...
pip install pkg_name_here==
This will output all the versions available on PyPI, which you can then capture into a string/list and split out; you can then install whichever ones fall between your min/max supported versions using some kind of conditional statement (see the sketch below).
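A rough sketch of that idea (it parses pip's error output, whose exact wording may differ between pip versions, and uses the third-party packaging library; both are assumptions of mine):

import re
import subprocess

from packaging.specifiers import SpecifierSet
from packaging.version import Version

# Ask pip for an impossible version so it lists the available ones.
result = subprocess.run(
    ["pip", "install", "requests=="],
    capture_output=True, text=True,
)
match = re.search(r"from versions: ([^)]*)\)", result.stderr)
versions = [Version(v.strip()) for v in match.group(1).split(",")]

# Keep only the versions inside our declared range and pick the lowest.
wanted = SpecifierSet(">=2.0,<3.0")
lowest = min(v for v in versions if v in wanted)

subprocess.run(["pip", "install", f"requests=={lowest}"])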
Such a feature does not yet exist in pip (at the time of writing).
However, there is an ongoing discussion https://github.com/pypa/pip/issues/8085 and PR https://github.com/pypa/pip/pull/11336 to implement it as --version-selection=min.
I hope it will be merged eventually.
I am working on a shared computer. It is a workhorse for computations across the department. The problem we have run into is controlling the versions of imported modules. Take, for example, Biopython: some people require an older version of Biopython (1.58), and yet others require the latest Biopython (1.61). How would I have both versions of the module installed side by side, and how does one specifically access a particular version? I ask because sometimes these APIs change and break old scripts for other people (or they expect certain functionality that is no longer there).
I understand that one could install the module locally (i.e. per user) and specifically direct Python to that module. Is there another way to handle this? Or would everyone have to set an export PYTHONPATH before using it?
I'm not sure if it's possible to change the active installed versions of a given module. Given my understanding of how imports and site-packages work, I'm leaning towards no.
Have you considered using virtualenv, though?
With virtualenv, you could create multiple shared environments -- one for Biopython 1.58, another for 1.61, another for whatever other special situations you need. They don't need to be locked down to a particular user, so while it would take more space than you'd like, it could take less space than everyone having their own Python environment.
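A sketch of what that could look like on the shared machine (the paths are purely illustrative):

virtualenv /opt/envs/biopython-1.58
/opt/envs/biopython-1.58/bin/pip install biopython==1.58
virtualenv /opt/envs/biopython-1.61
/opt/envs/biopython-1.61/bin/pip install biopython==1.61
# each user activates whichever environment their scripts need:
source /opt/envs/biopython-1.58/bin/activate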
It sounds like you're doing scientific computing. You should use Anaconda, and make particular note of the conda tool, documented here.
Conda uses hard links whenever possible to avoid copies of the same files. It also manages non-python binary modules in a much better way than virtualenv (virtualenv chokes on VTK, for example).
I'm developing a distribution for the Python package I'm writing so I can post
it on PyPI. It's my first time working with distutils, setuptools, distribute,
pip, setup.py and all that and I'm struggling a bit with a learning curve
that's quite a bit steeper than I anticipated :)
I was having a little trouble getting some of my test data files to be
included in the tarball by specifying them in the data_files parameter in setup.py until I came across a different post here that pointed me
toward the MANIFEST.in file. Just then I snapped to the notion that what you
include in the tarball/zip (using MANIFEST.in) and what gets installed in a
user's Python environment when they do easy_install or whatever (based on what
you specify in setup.py) are two very different things; in general there is
a lot more in the tarball than actually gets installed.
This immediately triggered a code-smell for me and the realization that there
must be more than one use case for a distribution; I had been fixated on the
only one I've really participated in, using easy_install or pip to install a
library. And then I realized I was developing work product where I had only a
partial understanding of the end-users I was developing for.
So my question is this: "What are the use cases for a Python distribution
other than installing it in one's Python environment? Who else am I serving
with this distribution and what do they care most about?"
Here are some of the working issues I haven't figured out yet that bear on the
answer:
Is it a sensible thing to include everything that's under source control
(git) in the source distribution? In the age of github, does anyone download
a source distribution to get access to the full project source? Or should I
just post a link to my github repo? Won't including everything bloat the
distribution and make it take longer to download for folks who just want to
install it?
I'm going to host the documentation on readthedocs.org. Does it make any
sense for me to include HTML versions of the docs in the source
distribution?
Does anyone use python setup.py test to run tests on a source
distribution? If so, what role are they in and what situation are they in? I
don't know if I should bother with making that work and if I do, who to make
it work for.
Some things that you might want to include in the source distribution but maybe not install include:
the package's license
a test suite
the documentation (possibly a processed form like HTML in addition to the source)
possibly any additional scripts used to build the source distribution
Quite often this will be the majority or all of what you are managing in version control, plus possibly a few generated files.
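A MANIFEST.in covering the items above might look something like this (the file and directory names are just illustrative):

include LICENSE
recursive-include tests *.py
recursive-include docs *.rst *.html
recursive-include build_scripts *.sh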
The main reason why you would do this when those files are available online or through version control is so that people know they have the version of the docs or tests that matches the code they're running.
If you only host the most recent version of the docs online, then they might not be useful to someone who has to use an older version for some reason. And the test suite on the tip in version control may not be compatible with the version of the code in the source distribution (e.g. if it tests features added since then). To get the right version of the docs or tests, they would need to comb through version control looking for a tag that corresponds to the source distribution (assuming the developers bothered tagging the tree). Having the files available in the source distribution avoids this problem.
As for people wanting to run the test suite, I have a number of my Python modules packaged in various Linux distributions and occasionally get bug reports related to test failures in their environments. I've also used the test suites of other people's modules when I encounter a bug and want to check whether the external code is behaving as the author expects in my environment.
One issue that comes up during Pinax development is dealing with development versions of external apps. I am trying to come up with a solution that doesn't involve bringing in the version control systems. The reason being, I'd rather not have to install all the possible version control systems on my system (or force that upon contributors) and deal with the problems that might arise during environment creation.
Take this situation (knowing how Pinax works will be beneficial to understanding):
We are beginning development on a new version of Pinax. The previous version has a pip requirements file with explicit versions pinned. A bug report comes in for an external app that we'd like to get resolved. To get that bug fix into Pinax, the current process is simply to make a minor release of the app, assuming we have control of the app. For apps we don't control, we just deal with the release cycle of the app author, or force them to make releases ;-) I am not too fond of constantly making minor releases for bug fixes, as in some cases I'd also like to be working on new features for the apps. Of course, branching the older version and then backporting as we need is what we do.
I'd love to hear some thoughts on this.
Could you handle this using the "==dev" version specifier? If the distribution's page on PyPI includes a link to a .tgz of the current dev version (such as both github and bitbucket provide automatically) and you append "#egg=project_name-dev" to the link, both easy_install and pip will use that .tgz if ==dev is requested.
This doesn't allow you to pin to anything more specific than "most recent tip/head", but in a lot of cases that might be good enough?
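As a rough sketch of how the consuming side could look (the package name and URL are made up; at the time, setuptools' dependency_links was one way to supply the tarball link if it is not already on the project's PyPI page):

from setuptools import setup

setup(
    name="pinax",
    version="0.7.dev1",
    # ask for the "dev" version of the external app
    install_requires=["some_external_app==dev"],
    dependency_links=[
        # tarball link with the #egg= fragment described above
        "http://github.com/someuser/some_external_app/tarball/master"
        "#egg=some_external_app-dev",
    ],
)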
I meant to mention that the solution I had considered before asking was to put up a Pinax PyPI and make development releases on it. We could put up an instance of chishop. We are already using pip's --find-links to point at pypi.pinaxproject.com for packages we've had to release ourselves.
Most open source distributors (the Debians, Ubuntus, MacPorts, et al.) use some sort of patch management mechanism. So something like: import the base source code for each package as released, as a tarball or as an SCM snapshot. Then manage any necessary modifications on top of it using a patch manager, like quilt or Mercurial Queues. Then bundle up each external package with any applied patches in a consistent format. Or have URLs to the base packages and URLs to the individual patches and have them applied during installation. That's essentially what MacPorts does.
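With quilt, for example, the per-package workflow is roughly this (a sketch; the package and patch names are made up):

cd some_external_app-1.2
quilt new fix-unicode-bug.patch
quilt add some_external_app/models.py
# edit the file, then record the change into the patch
quilt refresh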
EDIT: To take it one step further, you could then version control the set of patches across all of the external packages and make that available as a unit. That's quite easy to do with Mercurial Queues. Then you've simplified the problem to just publishing one set of patches using one SCM system, with the patches applied locally as above or available for developers to pull and apply to their copies of the base release packages.
EDIT: I am not sure I am reading your question correctly so the following may not answer your question directly.
Something I've considered, but haven't tested, is using pip's freeze bundle feature. Perhaps using that and distributing the bundle with Pinax would work? My only concern would be how different OS's are handled. For example, I've never used pip on Windows, so I wouldn't know how a bundle would interact there.
The full idea I hope to try is creating a paver script that controls management of the bundles, making it easy for users to upgrade to newer versions. This would require a bit of scaffolding though.
One other option may be to keep a mirror of the apps you don't control in a consistent VCS, and then distribute your mirrored versions. This would remove the need for "everyone" to have many different programs installed.
Other than that, it seems the only real solution is what you guys are already doing; there isn't a hassle-free way that I've been able to find.