I've read in a few places that, generally, Python doesn't guarantee backward compatibility, meaning that a newer version of Python may break code that worked fine on earlier versions. If so, how can I, as a developer, know which versions of Python can run my code successfully? Are there any rules or guarantees here? Or should I just tell my users: run this with Python 3.8 (for example), no more, no less?
99% of the time, if it works on Python 3.x, it'll work on 3.y where y >= x. Enabling warnings when running your code on the older version will surface DeprecationWarnings when you use a feature that's deprecated (and therefore likely to change or be removed in a later Python version). Aside from that, you can read the What's New docs for each version between the known-good version and the later version, in particular the Deprecated and Removed sections of each.
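For example, two ways to make those warnings impossible to miss from the command line (yourscript.py is a placeholder name):

# Turn DeprecationWarnings into hard errors:
python -W error::DeprecationWarning yourscript.py

# Or enable development mode (Python 3.7+), which shows warnings
# that are otherwise hidden by default:
python -X dev yourscript.py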
Beyond that, the only solution is good unit and component tests (you are using those, right? 😉) that you rerun on newer releases to verify stuff still works & behavior doesn't change.
According to PEP 387, section "Making Incompatible Changes", before an incompatible change is made, a deprecation warning should appear in at least two minor Python versions of the same major version, or one minor version in an older major version. After that, it's fair game, in principle. This made me cringe with regard to safety: who knows whether people run airplanes on Python, and whether they always read the python-dev list. So if you have something that passes unit tests with 100% coverage and no deprecation warnings, your code should be safe for the next two minor releases.
You can avoid this issue and many others by containerizing your deployments.
tox is great for running unit tests against multiple Python versions. That’s useful for at least 2 major cases:
You want to ensure compatibility for a certain set of Python versions, say 3.7+, and to be told if you make any breaking changes.
You don’t really know what versions your code supports, but want to establish a baseline of supported versions for future work.
I don’t use it for internal projects where I can control the environment my code will be running in. It’s lovely for people publishing apps or libraries to PyPI, though.
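For the first case, a minimal tox.ini sketch, assuming a pytest-based suite (adjust envlist to the versions you intend to support):

[tox]
envlist = py37, py38, py39, py310

[testenv]
deps = pytest
commands = pytest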
I have a set of scripts and utility modules that were written for a recent version of Python 3. Now suddenly, I have a need to make sure that all this code works properly under an older version of Python 3. I can't get the user to update to a more recent Python version -- that's not an option. So I need to identify all the instances where I've used some functionality that was introduced since the old version they have installed, so I can remove it or develop workarounds.
Approach #1: eyeball all the code and compare against documentation. Not ideal when there's this much code to look at.
Approach #2: create a virtual environment locally based on the old version in question using pyenv, run everything, see where it fails, and make fixes. I'm doing this anyway, because backporting to the older Python will also mean going backwards in a number of needed third-party modules from PyPI, and I'll need to make sure the suite still functions properly. But I don't think it's a good way to identify all my version incompatibilities, because much of the code is only exercised by particular characteristics of the input data, and it'd be hard to make sure I exercise all of it (I don't yet have good unit tests that ensure every line gets executed).
Approach #3: in my virtual environment based on the older version, I used pyenv to install the pylint module, then used this pylint to check my code. It ran, but it didn't identify issues with standard-library calls. For example, I know that several of my functions call subprocess.run() with the capture_output= Boolean argument, which didn't become available until version 3.7. I expected the 3.6 pylint run to spot this and yell at me, but it didn't. Does pylint not check standard-library calls against definitions?
Anyway, this is all I've thought of so far. Any ideas gratefully appreciated. Thanks.
If you want to use pylint to check 3.6 code, the most effective way is to use a 3.6 interpreter and environment and then run pylint in it. If you want to use the latest pylint version, you can use the --py-version option with 3.6, but this will probably catch fewer issues: pylint will not check against what you would actually have in Python 3.6, only some known hard-coded issues (for example, f-strings when targeting Python 3.5, but not missing arguments to subprocess.run).
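Concretely, the two approaches look something like this (mypackage/ is a placeholder, and --py-version needs a reasonably recent pylint):

# Most reliable: run pylint under an actual 3.6 interpreter
python3.6 -m pip install pylint
python3.6 -m pylint mypackage/

# Weaker: run a current pylint but declare the target version
pylint --py-version=3.6 mypackage/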
As noted in the comments, the real issue is that you do not have a proper test suite, so the question is how can you get one cheaply.
Adding unit tests can be time-consuming. Before doing that, you can add actual end-to-end tests (which will cost some computational time and give longer feedback cycles, but will be easier to implement): simply run the program with the version of Python it currently works on, store the results, then add a test that checks you can reproduce the same results.
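A minimal sketch of such a golden-master test (pytest-style; the script name, input file, and expected-output file are hypothetical placeholders):

import subprocess
from pathlib import Path

def test_end_to_end_golden_master():
    # Run the program exactly as a user would and capture its output.
    result = subprocess.run(
        ["python", "my_script.py", "tests/data/sample_input.txt"],
        capture_output=True, text=True, check=True,
    )
    # Compare against output recorded on the known-good Python version.
    expected = Path("tests/golden/sample_output.txt").read_text()
    assert result.stdout == expected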
This kind of test is usually expensive to maintain (each time you update the behavior, you have to update the recorded results). However, they are a safeguard against regressions, and they allow you to perform heavy refactoring on legacy code in order to move to a more testable structure.
In your case, these end-to-end tests will allow you to test the actual application (not only parts of it) against several versions of Python.
Once you have a better test suite, you can then decide whether these heavy end-to-end tests are worth keeping, based on the maintenance burden of the test suite. (Let's not forget that the test suite should not slow down your development; if it becomes the bottleneck, you should rethink your testing.)
What will take time is generating good input data for your end-to-end tests. A coverage tool will help you with that (you might even spot unreachable code thanks to it). If there are parts of your code that you don't manage to reach, I wouldn't worry about them at first: that code is unlikely to be reached by your client either (and if it is, and it fails at the client's site, make sure you have proper logging in place so you can add the case to your test suite).
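For example, with the coverage.py tool (script and input names are placeholders):

# Run the program under coverage, then list the lines that never executed
coverage run my_script.py tests/data/sample_input.txt
coverage report -m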
I want to develop and test my project on the up-to-date version of Python 2.7 (say 2.7.18), but I want my project to remain fully usable on earlier versions of 2.7 (say 2.7.7). Setting up many variants of 2.7 locally and/or on CI for testing can be redundant.
So there are the following questions about compatibility of 2.7.X.
Can there be any changes in syntax that break working code?
Can there be any changes in available standard imports, for example, can some imports from __future__ be unavailable in earlier versions?
Since I have to distribute compiled Python files (.pyc, compiled via py_compile module), I'm also wondering if there can be any changes in Python bytecode which block code execution in earlier versions.
I guess if all the answers are "no", I can develop and test my project only on a single 2.7 version without worries.
I've tried searching for this but without success. Please share your experience and/or links.
UPD 1: I should have clearly said from the beginning that it's not my desire to use 2.7, it's a requirement from the environment.
At least Python 2.7.9 introduced massive changes to the 'ssl' module, so code using SSL that was written against 2.7.18 will fail on Python older than 2.7.9. So a clear "yes" to number 2.
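If you depend on such a change, a cheap safety net is a runtime version guard; a minimal sketch (the 2.7.9 threshold comes from the ssl overhaul above):

import sys

# Fail fast instead of crashing later deep inside the ssl module.
if sys.version_info < (2, 7, 9):
    sys.exit("This tool requires Python 2.7.9 or newer (ssl module changes).")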
In general, compatibility for most projects works the other way around: use the oldest version you need to support and work upwards from old to new, not downwards from newer to older. I don't know of any software project that makes guarantees in the other direction.
Note that Python 2.7 dropped out of support with 2.7.18, so unless you use a compatible version like PyPy (https://www.pypy.org/) your freshly developed project will run on outdated Python versions from the start.
If you want to provide a shrink-wrapped product, maybe have a look at the usual solutions for this, like pyinstaller (https://www.pyinstaller.org/) or freeze (https://wiki.python.org/moin/Freeze)
Number 3 may work if you study the list of bytecode opcodes, which don't change that much over time (https://github.com/python/cpython/commits/2.7/Include/opcode.h), but I have no idea whether the on-disk format changed.
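One cheap check on that front: every .pyc begins with a 4-byte magic number identifying the bytecode version, and an interpreter refuses .pyc files whose magic doesn't match; it normally stays constant within a minor series such as 2.7.x. A sketch (module.pyc is a placeholder):

import imp  # Python 2 API; on Python 3.4+ use importlib.util.MAGIC_NUMBER instead

with open("module.pyc", "rb") as f:
    file_magic = f.read(4)

# True if this .pyc was compiled for the same bytecode version
# as the interpreter running the check.
print(file_magic == imp.get_magic())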
I'm looking at some software that wants to bring in Python 3.6 for use in an environment where 3.5 is the standard. Reading Python's documentation, I can't find anything about whether:
3.5 is representative of a semantic version number
3.6 would represent a forwards compatible upgrade (ie: code written for a 3.5 runtime is guaranteed to work in a 3.6 runtime)
The fact that this page about porting to 3.7 exists strongly suggests no, but I can't find official docs on what the version numbers mean (if anything, à la Linux kernel versioning).
In the more general sense - is there a PEP around compatibility standards within the 3.X release stream?
The short answer is "No", the long answer is "They strive for something close to it".
As a rule, micro versions match semantic-versioning rules: they're not supposed to break anything or add features, just fix bugs. This isn't always the case (e.g. 3.5.1 broke vars() on namedtuple, because the old behavior caused a bug considered worse than the break), but it's very rare for code (especially pure-Python code, as opposed to C extensions) to break across a micro boundary.
Minor versions mostly "add features", but they will also make backwards incompatible changes with prior warning. For example, async and await became keywords in Python 3.7, which meant code using them as variable names broke, but with warnings enabled, you would have seen a DeprecationWarning in 3.6. Many syntax changes are initially introduced as optional imports from the special __future__ module, with documented timelines for becoming the default behavior.
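As an illustration of that __future__ mechanism (a hedged example, not tied to any one release): postponed evaluation of annotations (PEP 563) could be enabled per module from Python 3.7 on, well before it was slated to become the default behavior:

from __future__ import annotations  # opt in to the future behavior in this module only

# Annotations are now stored as strings rather than evaluated at
# definition time, so a forward reference like this works:
class Node:
    def link(self, other: Node) -> Node:  # "Node" isn't bound yet; fine under PEP 563
        self.next = other
        return other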
None of the changes made in minor releases are broad changes; I doubt any individual deprecation or syntax change has affected even 1% of existing source code, but it does happen. If you've got a hundred third party dependencies, and you're jumping a minor version or two, there is a non-trivial chance that one of them will be broken by the change (example: pika prior to 0.12 used async as a variable name, and broke on Python 3.7; they released new versions that fixed the bug, but of course, moving from 0.11 and lower to 0.12 and higher changed their own API in ways that might break your code).
Major versions are roughly as you'd expect; backwards incompatible changes are expected/allowed (though they're generally not made frivolously; the bigger the change, the bigger the benefit).
Point is, it's close to semantic versioning, but in the interests of not having major releases every few years, while also not letting the language stagnate due to strict compatibility constraints, minor releases are allowed to break small amounts of existing code as long as there is warning (typically in the form of actual warnings from code using deprecated behavior, notes on the What's New documentation, and sometimes __future__ support to ease the migration path).
This is all officially documented (with slightly less detail) in their Development Cycle documentation:
To clarify terminology, Python uses a major.minor.micro nomenclature for production-ready releases. So for Python 3.1.2 final, that is a major version of 3, a minor version of 1, and a micro version of 2.
new major versions are exceptional; they only come when strongly incompatible changes are deemed necessary, and are planned very long in advance;
new minor versions are feature releases; they get released annually, from the current in-development branch;
new micro versions are bugfix releases; they get released roughly every 2 months; they are prepared in maintenance branches.
Here's the document on updating to 3.6.
If you had, for example, open(apath, 'U+') in your code in 3.5, it would fail in 3.6. So, clearly, Python 3.6 is not entirely backwards compatible to every usage in 3.5.
Realistically, you will need to test, although I feel fairly comfortable telling the average Stack Overflow reader, from almost any area, that this upgrade should be safe to do.
As for Semantic Versioning, specifically, Python does not follow it, but it isn't entirely agnostic to the meaning of major, minor and bugfix releases. Python guidelines for its developers can be found here.
To clarify terminology, Python uses a major.minor.micro nomenclature for production-ready releases. So for Python 3.1.2 final, that is a major version of 3, a minor version of 1, and a micro version of 2.
new major versions are exceptional; they only come when strongly incompatible changes are deemed necessary, and are planned very long in advance;
new minor versions are feature releases; they get released annually, from the current in-development branch;
new micro versions are bugfix releases; they get released roughly every 2 months; they are prepared in maintenance branches.
Also read PEP 440, which is about versioning modules rather than releases of Python itself, but is still relevant to the philosophy of the ecosystem.
I'm currently working at a big firm where we need to convert a big, old Python 2 Django project to Python 3. I've done a lot of research, but I'm still not able to find a definitive answer about which versions of Python and Django are best suited for the conversion.
The old version currently uses Python 2.7.16 and Django 1.9.13.
Can anyone suggest the best-suited versions of Python and Django to target when converting this project from Python 2 to Python 3?
I thought I'd add a bit to the strategy advocated by Wim's answer - get the appropriate version of Django working on both 2.7 and 3.x first - and outline some tactics that worked for me.
Python 2.7 is your escape pod, until you pull the trigger on 3.x
your tests should run on both
don't use any 3.x specific features, like f-strings
first Python 3.x; only later Django 2.x, which doesn't run on 2.7
start early, don't over analyze, but avoid the big bang approach
file by file at first.
start with the lowest level code, like utility libraries, that you have test suites for.
if possible, try to gradually merge your changes to the 2.7 production branches and keep your 3.x porting code up to date with prod changes.
Which minor version of Django to start with?
My criterion here is that Django migrations can be fairly involved (and actually require more thinking than the 2=>3 work). So I would move to the latest and greatest 1.11; that way you're already providing some value to your 2.7 users. There are probably a good number of pre-2.x compatibility shims in 1.11, and you'll be getting its 2.x deprecation warnings.
Which minor version of Python 3.x to start with?
Best to consider all angles, such as the availability of your 3rd party libs, support from your CI/devops suite and availability on your chosen server OS images. You could always install 3.8 and try a pip install of your requirements.txt by itself, for example.
Leverage git (or whatever scm you use) and virtualenv.
separate requirements.txt files, but...
if you have a file-based git repo, you can point each venv at the same codeline with pip install -e <your directory>. That means that, in two different terminals, you can run 2.7 and 3.x against the same unit test(s).
you could even run 2.7 and 3.x Django servers side-by-side on different ports and point say Firefox and Chrome at them.
commit often (on the porting branch at least) and learn about git bisect.
make use of 2to3
Yes, it will break 2.7 code and Django if you let it. So...
run it in preview mode or against a single file. see what it breaks but also see what it did right.
throttle it to only those conversions that don't break 2.7 or Django. print x => print(x) and except Exception, e => except Exception as e are two no-brainers.
This is what my throttled command looked like:
2to3 $tgt -w -f except -f raise -f next -f funcattrs -f print
run it file-by-file until you are really confident.
use sed or awk rather than your editor for bulk conversions.
The advantage is that, as you become more aware of your apps' specific concerns, you can build a suite of changes that can be run on either one file or many files and do most of the work without breaking 2.7 or Django. Apply this after your suitably-throttled 2to3 pass. That leaves you with residual cleanups in your editor and getting your tests to pass.
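For instance, one hedged bulk edit of that kind (GNU sed; myapp is a placeholder). It's safe on both sides because dict.items() also exists on 2.7, merely returning a list there:

# Bulk-convert dict.iteritems() -> dict.items() across an app
sed -i 's/\.iteritems()/.items()/g' myapp/*.py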
(optional) start running black on 2.7 code.
black, a code formatter, uses Python 3 ASTs for its analysis. It doesn't try to run your code, but it will flag syntax errors that prevent it from getting to the AST stage. You'll have to work some pip install global magic to get there, though, and you have to buy into black's usefulness.
Other people have done it - learn from them.
Listening to #155 Practical steps for moving to Python 3 should give you some ideas of the work. Look at the show links for it. They love to talk up the Instagram(?) move which involved a gradual adjustment of running 2.7 code to 3.x syntax on a common codebase, and on the same git branch, until pull-the-trigger day.
See also The Conservative Python 3 Porting Guide
and Instagram Makes a Smooth Move to Python 3 - The New Stack
Conclusion
Your time to Django 1.11 EOL (April 2020) is rather short, so if you have 2+ dev resources to throw at it, I'd consider doing the following in parallel:
DEV#1: start out on a Django 1.11 bump (the theory being that Django 1.11 is probably best positioned as a jump off point to Django 2.x), using 2.7.
DEV#2: get started on porting your non-Django utility code to Python 3.6/3.7. Since the code stays 2.7-compatible at this point, merge it into #1 as you go.
See how both tasks proceed, assess what the Django related project risk is and what the Python 3 pain looks like. You're already missing the Python 2.7 EOL, but an obsolete web framework is probably more dangerous than legacy Python 2.7, at least for a few months. So I wouldn't wait too long to start migrating off Django 1.9 and your work doing so won't be wasted. As you see the progress, you'll start seeing the project risks better.
Your initial 2to3 progress will be slow, but the tooling and guidance is good enough that you'll quickly pick up speed so don't overthink it before starting to gather experience. The Django side depends on your exposure to breaking changes in the framework which is why I think it's best to start early.
P.S. (controversial/personal opinion) I didn't use six or other canned 2-to-3 bridge libraries much.
It's not because I don't trust it - it's brilliant for 3rd party libs - but rather that I didn't want to add a complex permanent dependency (and I was too lazy to read its doc). I'd been writing 2.7 code in 3.x compatible syntax for a long time so I didn't really feel the need to use them. Your mileage may vary and don't set out on this path if it seems like a lot of work.
Instead, I created a py223.py (57 LOC incl. comments) with this type of content, most of which is concerned with workarounds for deprecations and name changes in the standard library.
try:
    basestring_ = basestring
except NameError:
    basestring_ = str

try:
    cmp_ = cmp
except NameError:
    # from http://portingguide.readthedocs.io/en/latest/comparisons.html
    def cmp_(x, y):
        """Replacement for built-in function cmp that was removed in Python 3."""
        return (x > y) - (x < y)
Then import from that py223 to work around those specific concerns. Later on I'll just ditch the import and change those weird isinstance(x, basestring_) calls to isinstance(x, str), but I know in advance there's little to worry about.
My suggestion is to first upgrade to Django==1.11.26, which is the most recent version of Django that supports both Python 2 and Python 3. Stay on your current version of Python 2.7 for now.
Read carefully the release notes for 1.10.x and 1.11.x, checking for deprecations and fixing anything in your 1.9.x code that stopped working. Things WILL break. Django moves fast. For a large Django project, there may be many code changes required, and if you're using a lot of 3rd-party plugins or libraries you may have to juggle their versions around. Some of your 3rd-party dependencies will probably have been abandoned entirely, so you'll have to find replacements or remove the features.
To find the release notes for each version upgrade, just google "What's new in Django X.Y". The hits will meticulously document all the deprecations and changes:
https://docs.djangoproject.com/en/2.2/releases/1.10/
https://docs.djangoproject.com/en/2.2/releases/1.11/
Once the webapp appears to be working fine on Django 1.11, with all tests passing (you do have a test suite, right?), then you can do the Python 3 conversion, whilst keeping the Django version the same. Django 1.11 supports up to Python 3.7, so that would be a good version to target. Expect unicode issues all over the place, since the implicit conversions between bytes and text are gone now, and many Python 2 webapps relied on them.
Once the project appears to be working fine on Django 1.11 and Python 3.7, then you can think about upgrading to Django 3.0, following the same process as before - reading the release notes, making the necessary changes, running the test suite, and checking out the webapp in a dev server manually.
I would upgrade to py3 first. You'll need to look at setup.py in the Django repo on the stable/1.9.x branch (https://github.com/django/django/blob/stable/1.9.x/setup.py) to figure out that the py3 versions supported are 3.4 (dead) and 3.5.
Once you're on py3.5 and Django 1.9 you can upgrade one at a time until you get to the version you want to end at. E.g. Django 1.11 supports py3.5 and py3.7, so
py27/dj1.9 -> py35/dj1.9 -> py35/dj1.11 -> py37/dj1.11 -> ... -> py37/dj2.2
dj2.2 is the first version supporting py3.8, but I would probably stop at py37/dj2.2 if you're working in a normally conservative environment.
If you have other packages you'll need to find version combinations that will work together on each step. Having a plan is key, and upgrading only one component at a time will usually end up saving you time.
The future library (https://python-future.org/) will help you with many icky situations while you need code to run on both py27 and 3.x. six is great too. I would avoid rolling your own compatibility layer (why reinvent the wheel?)
If at all possible, try to get your unit test coverage up to 75-85% before starting, and definitely set up automatic testing on both "from" and "to" versions for each upgrade step. Make sure you read and fix all warnings from Django before upgrading to the next version -- Django cares very little about backward compatibility, so I would normally suggest hitting every minor version on the upgrade path (or at least make sure you read the "backwards incompatibilities" and deprecation lists for each minor version).
Good luck! (We're upgrading a 300+ kloc code base from py27/dj1.7 right now, so I feel your pain. ;-)
I had the same kind of issue with my project, and I went with Python 3.7.5 and Django 2.2.7.
I wouldn't go with the latest Python (3.8) or the latest Django (3.0), because for any bug you hit in the very latest versions there's a chance you won't find an established solution yet.
You should try to shoot for the current versions: Python 3.8 and Django 3.0. The six library will help with some convention changes. Either way you're going to have to do some refactoring, so you might as well end up current.
Suppose I've developed a general-purpose end user utility written in Python. Previously, I had just one version available which was suitable for Python later than version 2.3 or so. It was sufficient to say, "download Python if you need to, then run this script". There was just one version of the script in source control (I'm using Git) to keep track of.
With Python 3, this is no longer necessarily true. For the foreseeable future, I will need to simultaneously develop two different versions, one suitable for Python 2.x and one suitable for Python 3.x. From a development perspective, I can think of a few options:
Maintain two different scripts in the same branch, making improvements to both simultaneously.
Maintain two separate branches, and merge common changes back and forth as development proceeds.
Maintain just one version of the script, plus check in a patch file that converts the script from one version to the other. When enough changes have been made that the patch no longer applies cleanly, resolve the conflicts and create a new patch.
I am currently leaning toward option 3, as the first two would involve a lot of error-prone tedium. But option 3 seems messy and my source control system is supposed to be managing patches for me.
For distribution packaging, there are more options to choose from:
Offer two different download packages, one suitable for Python 2 and one suitable for Python 3 (the user will have to know to download the correct one for whatever version of Python they have).
Offer one download package, with two different scripts inside (and then the user has to know to run the correct one).
One download package with two version-specific scripts, and a small stub loader that can run in both Python versions, that runs the correct script for the Python version installed.
Again I am currently leaning toward option 3 here, although I haven't tried to develop such a stub loader yet.
Any other ideas?
Edit: my original answer was based on the state of 2009, with Python 2.6 and 3.0 as the current versions. Now, with Python 2.7 and 3.3, there are other options. In particular, it is now quite feasible to use a single code base for Python 2 and Python 3.
See Porting Python 2 Code to Python 3
Original answer:
The official recommendation says:
For porting existing Python 2.5 or 2.6 source code to Python 3.0, the best strategy is the following:
(Prerequisite:) Start with excellent test coverage.
Port to Python 2.6. This should be no more work than the average port from Python 2.x to Python 2.(x+1). Make sure all your tests pass.
(Still using 2.6:) Turn on the -3 command line switch. This enables warnings about features that will be removed (or change) in 3.0. Run your test suite again, and fix code that you get warnings about until there are no warnings left, and all your tests still pass.
Run the 2to3 source-to-source translator over your source code tree. (See 2to3 - Automated Python 2 to 3 code translation for more on this tool.) Run the result of the translation under Python 3.0. Manually fix up any remaining issues, fixing problems until all tests pass again.
It is not recommended to try to write source code that runs unchanged under both Python 2.6 and 3.0; you'd have to use a very contorted coding style, e.g. avoiding print statements, metaclasses, and much more. If you are maintaining a library that needs to support both Python 2.6 and Python 3.0, the best approach is to modify step 3 above by editing the 2.6 version of the source code and running the 2to3 translator again, rather than editing the 3.0 version of the source code.
Ideally, you would end up with a single version, that is 2.6 compatible and can be translated to 3.0 using 2to3. In practice, you might not be able to achieve this goal completely. So you might need some manual modifications to get it to work under 3.0.
I would maintain these modifications in a branch, like your option 2. However, rather than maintaining the final 3.0-compatible version in this branch, I would consider applying the manual modifications before the 2to3 translation, and putting this modified 2.6 code into your branch. The advantage of this method is that the difference between this branch and the 2.6 trunk would be rather small, consisting only of the manual changes, not the changes made by 2to3. This way, the separate branches should be easier to maintain and merge, and you should be able to benefit from future improvements in 2to3.
Alternatively, take a bit of a "wait and see" approach. Proceed with your porting only so far as you can go with a single 2.6 version plus 2to3 translation, and postpone the remaining manual modification until you really need a 3.0 version. Maybe by this time, you don't need any manual tweaks anymore...
For development, option 3 is too cumbersome. Maintaining two branches is the easiest way, although how to do that will vary between VCSes. Many DVCSes will be happier with separate repos (with a common ancestry to help merging), and a centralized VCS will probably be easier to work with using two branches. Option 1 is possible, but you may miss something to merge, and it's a bit more error-prone IMO.
For distribution, I'd use option 3 as well if possible. All three options are valid anyway, and I have seen variations on these models from time to time.
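For what it's worth, a minimal stub-loader sketch for that distribution option (file names are hypothetical); it deliberately uses only syntax that parses on both Python 2 and Python 3:

import os
import subprocess
import sys

# Pick the script matching whichever interpreter the user ran.
SCRIPT = "myutil_py3.py" if sys.version_info[0] >= 3 else "myutil_py2.py"
HERE = os.path.dirname(os.path.abspath(__file__))

# Re-invoke the same interpreter on the version-specific script,
# passing through any command-line arguments.
sys.exit(subprocess.call([sys.executable, os.path.join(HERE, SCRIPT)] + sys.argv[1:]))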
I don't think I'd take this path at all. It's painful whichever way you look at it. Really, unless there's strong commercial interest in keeping both versions simultaneously, this is more headache than gain.
I think it makes more sense to just keep developing for 2.x for now, at least for a few months, up to a year. At some point it will simply be time to declare a final, stable version for 2.x and develop the next ones for 3.x+.
For example, I won't switch to 3.x until some of the major frameworks go that way: PyQt, matplotlib, numpy, and some others. And I don't really mind if at some point they stop 2.x support and just start developing for 3.x, because I'll know that in a short time I'll be able to switch to 3.x too.
I would start by migrating to 2.6, which is very close to python 3.0. You might even want to wait for 2.7, which will be even closer to python 3.0.
And then, once you have migrated to 2.6 (or 2.7), I suggest you simply keep just one version of the script, with things like "if PY3K: ... else: ..." in the rare places where it is mandatory. Of course it's not the kind of code we developers like to write, but then you don't have to worry about managing multiple scripts, branches, patches, or distributions, which would be a nightmare.
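A minimal sketch of what such a guard might look like (PY3K is just a conventional name, not a built-in):

import sys

PY3K = sys.version_info[0] >= 3

# Example: alias the text type once so the rest of the script
# stays version-agnostic.
if PY3K:
    text_type = str
else:
    text_type = unicode  # the name only exists on Python 2; this branch never runs on 3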
Whatever you choose, make sure you have thorough tests with 100% code coverage.
Good luck!
Whichever development option is chosen, most potential issues could be alleviated with thorough unit testing to ensure that the two versions produce matching output. That said, option 2 seems most natural to me: applying changes from one source tree to another is a task (most) version control systems were designed for, so why not take advantage of the tools they provide to ease this.
For distribution, it is difficult to say without 'knowing your audience'. Power Python users would probably appreciate not having to download two copies of your software, yet for a more general user base it should probably 'just work'.