how to handle byte-like values with Python 2 and 3 [duplicate]

A Django website I maintain currently uses Python 2.7 but I know that I'll have to upgrade it to Python 3 in a couple of months. If I'm writing code right now that has to work in Python 2, is there a Pythonic way to write it such that it would also work in Python 3 without any changes if I know what the syntax is going to be in Python 3? Ideally I'd like the code to continue to work even after the upgrade without changing it but it would be easy for me to spot where I've done this in the codebase so that I can change the code when I have time. Here's an example of what I'm talking about:
# Python 2 uses 'iteritems'
def log_dict(**kwargs):
    for key, value in kwargs.iteritems():
        log.info("{0}: {1}".format(key, value))
# Python 3 uses 'items'
def log_dict(**kwargs):
    for key, value in kwargs.items():
        log.info("{0}: {1}".format(key, value))

There is official documentation suggesting ways to do this. That documentation has changed over time as the situation has changed, so it's worth going directly to the source (especially if you're reading this answer a year or two after it was written).
It's also worth reading the Conservative Python 3 Porting Guide and skimming Nick Coghlan's Python 3 Q&A, especially this section.
Going back in time from early 2018:
futurize
The current official suggestions are:
Only worry about supporting Python 2.7
Make sure you have good test coverage (coverage.py can help; pip install coverage)
Learn the differences between Python 2 & 3
Use Futurize (or Modernize) to update your code (e.g. pip install future)
Use Pylint to help make sure you don’t regress on your Python 3 support (pip install pylint)
Use caniusepython3 to find out which of your dependencies are blocking your use of Python 3 (pip install caniusepython3)
Once your dependencies are no longer blocking you, use continuous integration to make sure you stay compatible with Python 2 & 3 (tox can help test against multiple versions of Python; pip install tox)
Consider using optional static type checking to make sure your type usage works in both Python 2 & 3 (e.g. use mypy to check your typing under both Python 2 & Python 3).
Notice the last suggestion. Guido and another of the core devs have both been heavily involved in leading large teams to port large 2.7 codebases to 3.x, and found mypy to be very helpful (especially in dealing with bytes-vs.-unicode issues). In fact, that's a large part of the reason gradual static typing is now an official part of the language.
You also almost certainly want to use all of the future statements available in 2.7. This is so obvious that they seem to have forgotten to mention it in the docs, but, besides making your life easier (e.g., you can write print function calls), futurize and modernize (and six and sixer) require it.
six
The documentation is aimed at people making an irreversible transition to Python 3 in the near future. If you're planning to stick with dual-version code for a long time, you might be better off following the previous recommendations, which largely revolved around using six instead of futurize. Six covers more of the differences between the two languages, and also makes you write code that's explicit about being dual-version instead of being as close to Python 3 as possible while still running in 2.7. But the downside is that you're effectively doing two ports—one from 2.7 to six-based dual-version code, and then, later, from 3.x-only six code to 3.x-only "native" code.
2to3
The original recommended answer was to use 2to3, a tool that can automatically convert Python 2 code to Python 3 code, or guide you in doing so. If you want your code to work in both, you need to deliver Python 2 code, then run 2to3 at installation time to port it to Python 3. Which means you need to test your code both ways, and usually modify it so that it still works in 2.7 but also works in 3.x after 2to3, which isn't always easy to work out. This turns out to not be feasible for most non-trivial projects, so it's no longer recommended by the core devs—but it is still built in with Python 2.7 and 3.x, and getting updates.
There are also two variations on 2to3: sixer auto-ports your Python 2.7 code to dual-version code that uses six, and 3to2 lets you write your code for Python 3 and auto-port back to 2.7 at install time. Both of these were popular for a time, but don't seem to be used much anymore; modernize and futurize, respectively, are their main successors.
For your specific question,
kwargs.items() will work on both, if you don't mind a minor performance cost in 2.7.
2to3 can automatically change that iteritems to items at install time on 3.x.
futurize can be used to do either of the above.
six will allow you to write six.iteritems(kwargs), which will do iteritems in 2.7 and items in 3.x.
six will also allow you to write six.viewitems(kwargs), which will do viewitems in 2.7 (which is identical to what items does in 3.x, rather than just similar).
modernize and sixer will automatically change that kwargs.iteritems() to six.iteritems(kwargs).
3to2 will let you write kwargs.items() and automatically convert it to viewitems at install time on 2.x.
mypy can verify that you're just using the result as a general iterable (rather than specifically as an iterator), so changing to viewitems or items leaves your code still correctly typed.
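As a rough illustration of what the six.iteritems option above does, here is a minimal sketch that uses only the standard library. The names PY2, iteritems, and format_entries are hypothetical, chosen for this example; they are not part of six or of the question's code.

```python
import sys

PY2 = sys.version_info[0] == 2


def iteritems(mapping):
    """Iterate over (key, value) pairs lazily on both Python 2 and 3."""
    if PY2:
        # Python 2: dict.iteritems() avoids building an intermediate list.
        return mapping.iteritems()
    # Python 3: dict.items() is already a lightweight view.
    return iter(mapping.items())


def format_entries(**kwargs):
    """Format keyword arguments as sorted 'key: value' strings."""
    pairs = sorted(iteritems(kwargs))
    return ["{0}: {1}".format(key, value) for key, value in pairs]


print(format_entries(imported=25, skipped=12))
```

In a real project, six.iteritems(kwargs) or plain kwargs.items() is probably the simpler choice; the sketch only shows that the version switch is a small amount of code.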

You can import features from the __future__ module:
from __future__ import ...
The available features, the release in which each became optional, the release in which each became mandatory, and what each one does:
feature           optional in  mandatory in  effect
nested_scopes     2.1.0b1      2.2           PEP 227: Statically Nested Scopes
generators        2.2.0a1      2.3           PEP 255: Simple Generators
division          2.2.0a2      3.0           PEP 238: Changing the Division Operator
absolute_import   2.5.0a1      3.0           PEP 328: Imports: Multi-Line and Absolute/Relative
with_statement    2.5.0a1      2.6           PEP 343: The "with" Statement
print_function    2.6.0a2      3.0           PEP 3105: Make print a function
unicode_literals  2.6.0a2      3.0           PEP 3112: Bytes literals in Python 3000
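The release information listed above can also be read from the interpreter itself, since each feature in the __future__ module records when it became optional and mandatory. The helper name feature_releases below is made up for this sketch.

```python
import __future__


def feature_releases(name):
    """Return (optional_release, mandatory_release) for a __future__ feature."""
    feature = getattr(__future__, name)
    return feature.getOptionalRelease(), feature.getMandatoryRelease()


for name in ("division", "print_function", "unicode_literals"):
    optional, mandatory = feature_releases(name)
    print("{0}: optional in {1}, mandatory in {2}".format(
        name, optional[:3], mandatory[:3]))
```

Each release is a five-element tuple in the same layout as sys.version_info, so slicing off the first three elements gives major, minor, and micro.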

Making your Django project compatible with both Python versions consists of the following steps:
Add from __future__ import unicode_literals at the top of each module and then use usual quotes without a u prefix for Unicode strings and a b prefix for bytestrings.
To ensure that a value is bytestring, use the django.utils.encoding.smart_bytes function. To ensure that a value is Unicode, use the django.utils.encoding.smart_text or django.utils.encoding.force_text function.
In your models use __str__ method instead of __unicode__ and add the python_2_unicode_compatible decorator.
# models.py
# -*- coding: UTF-8 -*-
from __future__ import unicode_literals
from django.db import models
from django.utils.translation import ugettext_lazy as _
from django.utils.encoding import python_2_unicode_compatible
@python_2_unicode_compatible
class NewsArticle(models.Model):
    title = models.CharField(_("Title"), max_length=200)
    content = models.TextField(_("Content"))
    def __str__(self):
        return self.title
    class Meta:
        verbose_name = _("News Article")
        verbose_name_plural = _("News Articles")
To iterate through dictionaries, use iteritems(), iterkeys(), and itervalues() from django.utils.six. Take a look at the following:
from django.utils.six import iteritems
d = {"imported": 25, "skipped": 12, "deleted": 3}
for k, v in iteritems(d):
    print("{0}: {1}".format(k, v))
When catching exceptions, use the as keyword, as follows:
try:
    article = NewsArticle.objects.get(slug="hello-world")
except NewsArticle.DoesNotExist as exc:
    pass
except NewsArticle.MultipleObjectsReturned as exc:
    pass
Use django.utils.six to check the type of a value as shown in the following:
from django.utils import six
isinstance(val, six.string_types) # previously basestring
isinstance(val, six.text_type) # previously unicode
isinstance(val, bytes) # previously str
isinstance(val, six.integer_types) # previously (int, long)
Use range from django.utils.six.moves instead of xrange, as follows:
from django.utils.six.moves import range
for i in range(1, 11):
    print(i)
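To make the type-check aliases listed above less opaque, here is a sketch of roughly how such aliases can be defined with the standard library alone. This is an approximation for illustration, not the actual source of six or django.utils.six, and the functions is_string and classify are hypothetical.

```python
import sys

if sys.version_info[0] == 2:
    # These names exist only on Python 2; this branch never runs on Python 3.
    string_types = (basestring,)  # noqa: F821
    text_type = unicode  # noqa: F821
    integer_types = (int, long)  # noqa: F821
else:
    string_types = (str,)
    text_type = str
    integer_types = (int,)


def is_string(val):
    """True for any string type (str or unicode on 2, str on 3)."""
    return isinstance(val, string_types)


def classify(val):
    """Name the broad category of a value the same way on Python 2 and 3."""
    if isinstance(val, text_type):
        return "text"
    if isinstance(val, bytes):
        return "bytes"
    if isinstance(val, integer_types):
        return "integer"
    return "other"


print(classify(u"abc"), classify(b"abc"), classify(10))
```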

In addition to importing from __future__, there is also the six project, which aims to provide API compatibility between Python 2 and Python 3: https://pypi.org/project/six/.
Your example code could be made compatible with both versions 2 and 3:
import six
for key, value in six.iteritems(kwargs):
    log.info("{0}: {1}".format(key, value))
There are still things that won't be compatible between 2 and 3, such as f-strings, which are a syntax error in Python 2.

There are a few different tools that will help you make sure you are writing Python 2/3 compatible code.
If you are interested in porting Python 2 code to Python 3, then the 2to3 program that comes with the standard library will try to convert a Python 2 program to Python 3.
https://docs.python.org/2/library/2to3.html
Another great tool is pylint. Pylint is a Python linter that describes issues to you without fixing them. If you pip install pylint in a Python 3 environment, it will analyze your code based on Python 3's rules. If you use Python 2 to install pylint, it will do the same but with Python 2's rules.
There are other popular and similar tools like flake8 or autopep8, but I am not familiar with them enough to advertise them.

Using six together with __future__ imports is a good rule of thumb, and is enough to make the coming migration easy.
Add this at the top of every Python 2 file:
from __future__ import absolute_import, unicode_literals
Use the following when working with strings, iteration, metaclasses, and so on:
isinstance(sth, six.string_types)
six.iteritems(some_dict)
@six.add_metaclass(Meta)
For everything else, see the six reference.
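Metaclasses are one of the places where Python 2 and 3 syntax cannot be reconciled directly, which is why a helper is needed. The sketch below shows one standard-library-only way to get a metaclass applied under both versions, similar in spirit to the six helpers but not their implementation. Registry, Base, and Plugin are hypothetical names.

```python
class Registry(type):
    """Metaclass that records the name of every class it creates."""

    created = []

    def __init__(cls, name, bases, namespace):
        super(Registry, cls).__init__(name, bases, namespace)
        Registry.created.append(name)


# Calling the metaclass directly is valid syntax on both Python 2 and 3,
# unlike `__metaclass__ = ...` (2 only) or `class X(metaclass=...)` (3 only).
Base = Registry("Base", (object,), {})


class Plugin(Base):
    """Inherits its metaclass from Base."""


print(Registry.created)
```

Subclasses of Base pick up the metaclass through ordinary inheritance, so only the one base class needs the unusual spelling.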

Related

What is the correct way (if any) to use Python 2 and 3 libraries in the same program?

I wish to write a Python script that needs to do task 'A' and task 'B'. Luckily there are existing Python modules for both tasks, but unfortunately the library that can do task 'A' is Python 2 only, and the library that can do task 'B' is Python 3 only.
In my case the libraries are small and permissively-licensed enough that I could probably convert them both to Python 3 without much difficulty. But I'm wondering what is the "right" thing to do in this situation - is there some special way in which a module written in Python 2 can be imported directly into a Python 3 program, for example?

The "right" way is to translate the Py2-only module to Py3 and offer the translation upstream with a pull request (or equivalent approach for non-git upstream repos). Seriously. Horrible hacks to make py2 and py3 packages work together are not worth the effort.

I presume you know of tools such as 2to3, that aim to make the job of porting code to py3k easier, just repeating it here for others' reference.
In situations where I have to use libraries from both Python 3 and Python 2, I've been able to work around it using the subprocess module. Alternatively, I've gotten around this issue with shell scripts that pipe output from the Python 2 script to the Python 3 script and vice versa. This of course covers only a tiny fraction of use cases, but if you're transferring text (or maybe even picklable objects) between 2 and 3, it (or a more thought-out variant) should work.
To the best of my knowledge, there isn't a best practice when it comes to mixing versions of python.
I present to you an ugly hack
Consider the following simple toy example, involving three files:
# py2.py
# file uses python2, here illustrated by the print statement
def hello_world():
    print 'hello world'
if __name__ == '__main__':
    hello_world()
# py3.py
# there's nothing py3 about this, but lets assume that there is,
# and that this is a library that will work only on python3
def count_words(phrase):
    return len(phrase.split())
# controller.py
# main script that coordinates the work, written in python3
# calls the python2 library through subprocess module
# the limitation here is that every function needed has to have a script
# associated with it that accepts command line arguments.
import subprocess
import py3
if __name__ == '__main__':
    phrase = subprocess.check_output('python py2.py', shell=True)
    num_words = py3.count_words(phrase)
    print(num_words)
# If I run the following in bash, it outputs `2`
hals-halbook: toy hal$ python3 controller.py
2
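A slightly more structured variant of the same subprocess idea is to exchange JSON over stdin and stdout, so that arguments and results survive the trip between interpreters. This is only a sketch under assumptions: WORKER_SOURCE and call_worker are hypothetical names, and sys.executable stands in for the path to the other interpreter so that the example runs on a single installation.

```python
import json
import subprocess
import sys

# Hypothetical worker source; in practice this would be a script file
# run by the other interpreter (for example a Python 2 binary).
WORKER_SOURCE = (
    "import json, sys\n"
    "request = json.load(sys.stdin)\n"
    "reply = {'words': len(request['phrase'].split())}\n"
    "sys.stdout.write(json.dumps(reply))\n"
)


def call_worker(interpreter, source, request):
    """Run `source` under `interpreter`, exchanging JSON over stdin/stdout."""
    process = subprocess.Popen(
        [interpreter, "-c", source],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    output, _ = process.communicate(json.dumps(request).encode("utf-8"))
    if process.returncode != 0:
        raise RuntimeError("worker exited with status %d" % process.returncode)
    return json.loads(output.decode("utf-8"))


# sys.executable stands in for the path to the other interpreter.
result = call_worker(sys.executable, WORKER_SOURCE, {"phrase": "hello world"})
print(result["words"])
```

JSON keeps the boundary explicit about text versus bytes, which is where most 2-versus-3 surprises tend to appear.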

pylint "Undefined variable" in module written in C++/SIP

I export several native C++ classes to Python using SIP. I don't use the resulting maplib_sip.pyd module directly, but rather wrap it in a Python package, pymaplib:
# pymaplib/__init__.py
# Make all of maplib_sip available in pymaplib.
from maplib_sip import *
...
def parse_coordinate(coord_str):
    ...
    # LatLon is a class imported from maplib_sip.
    return LatLon(lat_float, lon_float)
Pylint doesn't recognize that LatLon comes from maplib_sip:
error pymaplib parse_coordinate 40 15 Undefined variable 'LatLon'
Unfortunately, the same happens for all the classes from maplib_sip, as well as for most of the code from wxPython (Phoenix) that I use. This effectively makes Pylint worthless for me, as the amount of spurious errors dwarfs the real problems.
additional-builtins doesn't work that well for my problem:
# Both of these don't remove the error:
additional-builtins=maplib_sip.LatLon
additional-builtins=pymaplib.LatLon
# This does remove the error in pymaplib:
additional-builtins=LatLon
# But users of pymaplib still give an error:
# Module 'pymaplib' has no 'LatLon' member
How do I deal with this? Can I somehow tell pylint that maplib_sip.LatLon actually exists? Even better, can it somehow figure that out itself via introspection (which works in IPython, for example)?
I'd rather not have to disable the undefined variable checks, since that's one of the huge benefits of pylint for me.
Program versions:
Pylint 1.2.1,
astroid 1.1.1, common 0.61.0,
Python 3.3.3 [32 bit] on Windows7

You may want to try the new --ignored-modules option, though I'm not sure it will work in your case unless you also stop using import * (which would probably be a good idea anyway, as pylint has probably already told you ;).
Instead, use a short import name, e.g. import maplib_sip as mls, and then the prefixed name, e.g. mls.LatLon, where needed.
Note, though, that the original problem is worth reporting on the pylint tracker (https://bitbucket.org/logilab/pylint/issues) so that someone can investigate why pylint doesn't find the members of your SIP-exported module.

unavailable assertion methods in python 3.1 unittest

I'm new to python programming and especially to unit-testing framework.
For some reason, working with PyDev (Python 3.1 interpreter), I cannot use all of the new assert methods (such as assertRegexpMatches, etc.).
Here's some example code:
class TestParser(unittest.TestCase):
    def testskipCommentAndSpaces(self):
        if os.path.isfile(sys.argv[1]):
            #self.vmFilesListPath = sys.argv[1]
            vmFilesListPath = sys.argv[1]
        else:
            #self.vmFilesListPath = get_all_vm_files(sys.argv[1])
            vmFilesListPath = get_all_vm_files(sys.argv[1])
        #parser = Parser(self.vmFilesListPath)
        parser = Parser(vmFilesListPath)
        commands = parser.getCommands()
        for command in commands:
            for token in commands:
                p=re.search(r"(////)",str(token))
                self.assertNotRegexpMatches(str(token),p)
What I get is: AttributeError: 'TestParser' object has no attribute 'assertNotRegexpMatches'
Needless to say, hasattr(self, 'assertNotRegexpMatches') returns False, while the "simple" assert methods work fine.
I'm sure the interpreter is set to 3.1, i.e. the correct version I need (I also have Python 2.7 installed on my system).
Thank you for your help,
Igor.L

While the unittest module in Python 3.1 had an assertRegexpMatches method, there is no documented assertNotRegexpMatches. In Python 3.2, assertRegexpMatches was renamed to assertRegex and the complementary assertNotRegex was added.
Note that Python 3.1 is obsolete and no longer maintained other than critical security fixes. There have been many features, fixes, and major performance improvements added in Python 3.2 and now 3.3 which was just released. Consider upgrading to one of them.
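On Python 3.2 or later, the renamed methods can be used as in the sketch below. The test class and sample strings are made up for illustration. Note that the second argument is a pattern (a string or compiled regular expression), not the match object returned by re.search.

```python
import unittest


class CommentTest(unittest.TestCase):
    def test_no_quadruple_slash(self):
        # The second argument is a pattern, not a match object.
        self.assertNotRegex("push constant 7", r"////")

    def test_comment_prefix(self):
        self.assertRegex("// a comment", r"^//")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(CommentTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```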

How to write Python code that is able to properly require a minimal python version?

I would like to see if there is any way of requiring a minimal python version.
I have several python modules that are requiring Python 2.6 due to the new exception handling (as keyword).
It looks like even if I check the Python version at the beginning of my script, the code will not run, because the interpreter fails inside the module and throws an ugly system error instead of telling the user to use a newer Python.

You can take advantage of the fact that Python will do the right thing when comparing tuples:
#!/usr/bin/python
import sys
MIN_PYTHON = (2, 6)
if sys.version_info < MIN_PYTHON:
    sys.exit("Python %s.%s or later is required.\n" % MIN_PYTHON)
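The comparison above relies on tuples being compared element by element. The sketch below isolates that behaviour in a hypothetical helper, is_too_old, so it can be tried with version tuples other than the running interpreter's.

```python
def is_too_old(version_info, minimum):
    """True when version_info sorts before the minimum (major, minor) tuple."""
    # Tuples compare element by element; a shorter tuple that matches the
    # start of a longer one sorts first.
    return tuple(version_info) < tuple(minimum)


print(is_too_old((2, 5, 4, "final", 0), (2, 6)))
print(is_too_old((2, 6, 0, "final", 0), (2, 6)))
print(is_too_old((3, 2, 0, "final", 0), (2, 6)))
```

Because the comparison stops at the first differing element, the string field ("final") is never compared with a number in these cases.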

You should not use any Python 2.6 features inside the script itself. Also, you must do your version check before importing any of the modules requiring a new Python version.
E.g. start your script like so:
#!/usr/bin/env python
import sys
if sys.version_info[0] != 2 or sys.version_info[1] < 6:
    print("This script requires Python version 2.6")
    sys.exit(1)
# rest of script, including real initial imports, here

Starting with version 9.0.0, pip supports the Requires-Python field in a distribution's metadata, which setuptools can write starting with version 24.2.0. This feature is available through the python_requires keyword argument to the setup function.
Example (in setup.py):
setup(
    ...
    python_requires='>=2.5,<2.7',
    ...
)
To take advantage of this feature, you first have to package the project or script, if that hasn't been done already. This is very easy in the typical case and is worth doing anyway, as it lets users easily install, use, and uninstall the project or script. See the Python Packaging User Guide for details.

import sys
if sys.hexversion < 0x02060000:
    sys.exit("Python 2.6 or newer is required to run this program.")
import module_requiring_26
Also the cool part about this is that it can be included inside the __init__ file or the module.
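The constant 0x02060000 works because sys.hexversion packs the version into a single integer, one byte each for major, minor, and micro, followed by a byte for release level and serial. A small sketch of decoding it (decode_hexversion is a hypothetical helper name):

```python
import sys


def decode_hexversion(hexversion):
    """Split a sys.hexversion-style integer into (major, minor, micro)."""
    major = (hexversion >> 24) & 0xFF
    minor = (hexversion >> 16) & 0xFF
    micro = (hexversion >> 8) & 0xFF
    return major, minor, micro


print(decode_hexversion(0x02060000))
print(decode_hexversion(sys.hexversion) == tuple(sys.version_info[:3]))
```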

I used to have a more complicated approach for supporting both Python2 and Python3, but I no longer try to support Python2, so now I just use:
import sys
MIN_PYTHON = (3, 7)
assert sys.version_info >= MIN_PYTHON, f"requires Python {'.'.join([str(n) for n in MIN_PYTHON])} or newer"
If the version check fails, you get a traceback with something like:
AssertionError: requires Python 3.7 or newer
at the bottom.

To complement the existing, helpful answers:
You may want to write scripts that run with both Python 2.x and 3.x, and require a minimum version for each.
For instance, if your code uses the argparse module, you need at least 2.7 (with a 2.x Python) or at least 3.2 (with a 3.x Python).
The following snippet implements such a check; the only thing that needs adapting to a different, but analogous scenario are the MIN_VERSION_PY2=... and MIN_VERSION_PY3=... assignments.
As has been noted: this should be placed at the top of the script, before any other import statements.
import sys
MIN_VERSION_PY2 = (2, 7) # min. 2.x version as major, minor[, micro] tuple
MIN_VERSION_PY3 = (3, 2) # min. 3.x version
# This is generic code that uses the tuples defined above.
if (sys.version_info[0] == 2 and sys.version_info < MIN_VERSION_PY2
        or
        sys.version_info[0] == 3 and sys.version_info < MIN_VERSION_PY3):
    sys.exit(
        "ERROR: This script requires Python 2.x >= %s or Python 3.x >= %s;"
        " you're running %s." % (
            '.'.join(map(str, MIN_VERSION_PY2)),
            '.'.join(map(str, MIN_VERSION_PY3)),
            '.'.join(map(str, sys.version_info))
        )
    )
If the version requirements aren't met, something like the following message is printed to stderr and the script exits with exit code 1.
ERROR: This script requires Python 2.x >= 2.7 or Python 3.x >= 3.2; you're running 2.6.2.final.0.
Note: This is a substantially rewritten version of an earlier, needlessly complicated answer, after realizing - thanks to Arkady's helpful answer - that comparison operators such as > can directly be applied to tuples.

I'm guessing you have something like:
import module_foo
...
import sys
# check sys.version
but module_foo requires a particular version as well? This being the case, it is perfectly valid to rearrange your code thus:
import sys
# check sys.version
import module_foo
Python does not require that imports, aside from from __future__ import [something] be at the top of your code.

I need to make sure I'm using Python 3.5 (or, eventually, higher). I monkeyed around on my own and then I thought to ask SO - but I've not been impressed with the answers (sorry, y'all ::smile::). Rather than giving up, I came up with the approach below. I've tested various combinations of the min_python and max_python specification tuples and it seems to work nicely:
Putting this code into a __init__.py is attractive:
Avoids polluting many modules with a redundant version check
Placing this at the top of a package hierarchy further supports the DRY principle, assuming the entire hierarchy abides by the same Python version constraints
Takes advantage of a place (file) where I can use the most portable Python code (e.g. Python 1 ???) for the check logic and still write my real modules in the code version I want
If I have other package-init stuff that is not "All Python Versions Ever" compatible, I can shovel it into another module, e.g. __init_p3__.py as shown in the sample's commented-out final line. Don't forget to replace the pkgname place holder with the appropriate package name.
If you don't want a min (or max), just set it to = ()
If you only care about the major version, just use a "one-ple", e.g. = (3, ) Don't forget the comma, otherwise (3) is just a parenthesized (trivial) expression evaluating to a single int
You can specify finer min/max than just one or two version levels, e.g. = (3, 4, 1)
There will be only one "Consider running as" suggestion when the max isn't actually greater than the min, either because max is an empty tuple (a "none-ple"?), or has fewer elements.
NOTE: I'm not much of a Windoze programmer, so the text_cmd_min and text_cmd_max values are oriented for *Nix systems. If you fix up the code to work in other environments (e.g. Windoze or some particular *Nix variant), then please post. (Ideally, a single super-smartly code block will suffice for all environments, but I'm happy with my *Nix only solution for now.)
PS: I'm somewhat new to Python, and I don't have an interpreter with a version lower than 2.7.9.final.0, so it's tough to test my code against earlier variants. On the other hand, does anyone really care? (That's an actual question I have: in what (real-ish) context would I need to deal with the "Graceful Wrong-Python-Version" problem for interpreters prior to 2.7.9?)
__init__.py
'''Verify the Python Interpreter version is in range required by this package'''
min_python = (3, 5)
max_python = (3, )
import sys
if (sys.version_info[:len(min_python)] < min_python) or (sys.version_info[:len(max_python)] > max_python):
    text_have = '.'.join("%s" % n for n in sys.version_info)
    text_min = '.'.join("%d" % n for n in min_python) if min_python else None
    text_max = '.'.join("%d" % n for n in max_python) if max_python else None
    text_cmd_min = 'python' + text_min + ' ' + " ".join("'%s'" % a for a in sys.argv) if min_python else None
    text_cmd_max = 'python' + text_max + ' ' + " ".join("'%s'" % a for a in sys.argv) if max_python > min_python else None
    sys.stderr.write("Using Python version: " + text_have + "\n")
    if min_python: sys.stderr.write(" - Min required: " + text_min + "\n")
    if max_python: sys.stderr.write(" - Max allowed : " + text_max + "\n")
    sys.stderr.write("\n")
    sys.stderr.write("Consider running as:\n\n")
    if text_cmd_min: sys.stderr.write(text_cmd_min + "\n")
    if text_cmd_max: sys.stderr.write(text_cmd_max + "\n")
    sys.stderr.write("\n")
    sys.exit(9)
# import pkgname.__init_p3__

Rather than indexing, you could always do this:
import platform
# Note the trailing comma: ('2.6.6') is just a string, so `in` would do a substring test.
if platform.python_version() not in ('2.6.6',):
    raise RuntimeError('Not right version')
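An exact string match like the one above rejects every other release, including later patch releases. If a minimum version is what is actually wanted, one option is to parse the string into a tuple and compare that. This is a sketch; major_minor and MIN_VERSION are hypothetical names, and the minimum shown is only an example value.

```python
import platform
import re


def major_minor(version_string):
    """Extract (major, minor) as integers from a string such as '2.6.6'."""
    match = re.match(r"(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError("Unrecognised version string: %r" % version_string)
    return int(match.group(1)), int(match.group(2))


MIN_VERSION = (2, 6)  # hypothetical minimum, chosen for illustration

if major_minor(platform.python_version()) < MIN_VERSION:
    raise RuntimeError("Python %d.%d or later is required" % MIN_VERSION)
```

Comparing integers rather than strings matters here: as strings, "3.10" sorts before "3.9".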

Code changes from Python 2.6 to 3.x

I am trying to get pywbem working in Python 3.2 (it works fine in 2.6) but the build fails on this part of code in mof_compiler.py:
  File "pywbem-0.7.0\mof_compiler.py", line 1341
    print s
          ^
SyntaxError: invalid syntax
It's a macro, defined like this:
def _print_logger(s):
    print s
I don't understand why this is invalid, please explain how to do the same in Python 3.2.
Note: I have little or no experience with Python.
PS: I have already made some small changes to the code for 3.2, such as changing
except CIMError, ce:
to
except CIMError as ce:
Based on Lennart Regebro's answer, here are some other changes I found (placing them here since they may be useful to others):
exec "import %s as lextab" % tabfile -> exec ("import %s as lextab" % tabfile)
raise ValueError,"Expected a string" -> raise ValueError("Expected a string")

That's not a macro, it's a function definition, and in Python 3 the print statement is now a function. So do print(s) instead.
The list of changes between Python 2 and Python 3 is here: http://docs.python.org/release/3.0.1/whatsnew/3.0.html
It's not so easy to read, but I don't know if there is a better one online (although books exist).
If you are going to use Python 3, you would probably do well to get a Python 3 book. There are a couple of them out now. Or at least refer to the Python 3 documentation: http://docs.python.org/release/3.2/ It has a decent tutorial.
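Putting the changes from this thread together, the Python 3 spellings look roughly like the sketch below. Apart from _print_logger, the function names are hypothetical, "json" is only a stand-in module name, and importlib.import_module is offered as one possible alternative to the exec-based import, not as what pywbem itself does.

```python
import importlib


def _print_logger(s):
    print(s)  # print is a function in Python 3


def require_string(value):
    if not isinstance(value, str):
        # Python 2 spelling: raise ValueError, "Expected a string"
        raise ValueError("Expected a string")
    return value


def load_table(module_name):
    # One alternative to exec("import %s as lextab" % tabfile)
    return importlib.import_module(module_name)


try:
    require_string(42)
except ValueError as err:  # Python 2 spelling: except ValueError, err:
    message = str(err)

lextab = load_table("json")  # "json" is a stand-in module name
_print_logger(message)
```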

One of the most visible changes in Python 3 is that print is no longer a statement but a function, so you have to use parentheses when calling it: print(s)
Also, if you have your Python2 code, just use 2to3 which can do a source to source translation of your python2 to python3, which can fix most of the syntax level changes for you like the above problems. 2to3 is installed with python3 binary.

Sorry for answering an old question, but I just recently wanted to get PyWBEM running under Python 3, so I forked it, made the required changes, and removed a Python 2.x dependency (M2Crypto) from it for the 3.x series. Here's the source from GitHub:
https://github.com/deejross/python3-pywbem
Quick note, this supports Python 2.6, 2.7, and 3.4+
