Python 2 will implicitly convert str to unicode in some circumstances. This conversion will sometimes throw a UnicodeError depending on what you try to do with the resulting value. I don't know the exact semantics, but it's something I'd like to avoid.
Is it possible to use another type besides unicode or a command-line argument similar to --strict-optional (http://mypy-lang.blogspot.co.uk/2016/07/mypy-043-released.html) to cause programs using this implicit conversion to fail to type check?
def returns_string_not_unicode():
    # type: () -> str
    return u"a"

def returns_unicode_not_string():
    # type: () -> unicode
    return "a"
In this example, only the function returns_string_not_unicode fails to type check.
$ mypy --py2 unicode.py
unicode.py: note: In function "returns_string_not_unicode":
unicode.py:3: error: Incompatible return value type (got "unicode", expected "str")
I would like both of them to fail to typecheck.
EDIT:
type: () -> bytes seems to be treated the same way as str
def returns_string_not_unicode():
    # type: () -> bytes
    return u"a"
This is, unfortunately, an ongoing and currently unresolved issue -- see https://github.com/python/mypy/issues/1141 and https://github.com/python/typing/issues/208.
A partial fix is to use typing.Text which is (unfortunately) currently undocumented (I'll work on fixing that though). It's aliased to str in Python 3 and to unicode in Python 2. It won't resolve your actual issue or cause the second function to fail to typecheck, but it does make it a bit easier to write types compatible with both Python 2 and Python 3.
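For instance, a function meant to work on text under both Python 2 and Python 3 can be annotated once with Text (a minimal sketch; under Python 3, Text is simply str):

```python
from typing import Text

def greet(name):
    # type: (Text) -> Text
    # Text aliases unicode on Python 2 and str on Python 3,
    # so this single annotation covers both interpreters.
    return u"Hello, " + name
```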
In the meantime, you can hack together a partial workaround by using the recently-implemented NewType feature -- it lets you define a pseudo-subclass with minimal runtime cost, which you can use to approximate the functionality you're looking for:
from typing import NewType, Text
# Tell mypy to treat 'Unicode' as a subtype of `Text`, which is
# aliased to 'unicode' in Python 2 and 'str' (aka unicode) in Python 3
Unicode = NewType('Unicode', Text)
def unicode_not_str(a: Unicode) -> Unicode:
    return a
# my_unicode is still the original string at runtime, but Mypy
# treats it as having a distinct type from `str` and `unicode`.
my_unicode = Unicode(u"some string")
unicode_not_str(my_unicode) # typechecks
unicode_not_str("foo") # fails
unicode_not_str(u"foo") # fails, unfortunately
unicode_not_str(Unicode("bar")) # works, unfortunately
It's not perfect, but if you're principled about when you elevate a string into being treated as being of your custom Unicode type, you can get something approximating the type safety you're looking for with minimal runtime cost until the bytes/str/unicode issue is settled.
Note that NewType was added as of mypy version 0.4.4; if you're on an older release, you'll need to install mypy from the master branch on Github to use it.
Related
Is there a tool that can check if the arguments listed in the docstring match the signature of the function call? It should be able to deal with numpy-style docstrings.
I am regularly using R CMD CHECK, which finds documentation/code mismatches in R and this is quite helpful. It would be very good to have something similar in Python, but I did not find anything yet.
I just created a tool to achieve this, called pydoctest.
It will attempt to infer the types in your docstrings (not just compare them lexically) and report mismatches in the number of arguments, argument names, argument types, and return types; it can also (optionally) raise an error for missing docstrings, and more.
It currently supports google, sphinx and numpy docstring format, but can rather easily be extended with other formats.
Example:
def func_type_mismatch(self, a: int) -> int:
    """[summary]

    Args:
        a (float): [description]  <-- float is not int

    Returns:
        int: [description]
    """
    pass
Running pydoctest on this function gives this output:
Function: <function IncorrectTestClass.func_type_mismatch at 0x7f9a8b52c8c8> FAIL | Argument type differ. Argument 'a' was expected (from signature) to have type '<class 'int'>', but has (in docs) type '<class 'float'>'
Edit (June 2021): I've started development of a vscode-extension that runs pydoctest and highlights the errors.
https://marketplace.visualstudio.com/items?itemName=JeppeRask.pydoctest
I was trying to find the same, so I wrote docsig
pip install docsig
then just run docsig . and it will check this for you
/path/to/project
-----------------------
def function(✖*args) -> ✓List[str]:
"""...
:param None: ✖
:return: ✓
"""
E103: parameters missing
There are 8 other errors so far
I am stuck trying to understand bounding of TypeVar when using it in two different ways:
Enums = TypeVar("Enums", Enum1, Enum2)
Enums = TypeVar("Enums", bound=Union[Enum1, Enum2])
Here is the code I am using:
#!/usr/bin/env python3.6
"""Figuring out why enum is saying incompatible return type."""
from enum import IntEnum, EnumMeta
from typing import TypeVar, Union
class Enum1(IntEnum):
    MEMBER1 = 1
    MEMBER2 = 2

class Enum2(IntEnum):
    MEMBER3 = 3
    MEMBER4 = 4

# Enums = TypeVar("Enums", bound=Union[Enum1, Enum2])  # Case 1... Success
Enums = TypeVar("Enums", Enum1, Enum2)  # Case 2... error: Incompatible return value

def _enum_to_num(val: int, cast_enum: EnumMeta) -> Enums:
    return cast_enum(val)

def get_some_enum(val: int) -> Enum1:
    return _enum_to_num(val, Enum1)

def get_another_enum(val: int) -> Enum2:
    return _enum_to_num(val, Enum2)  # line 35
When running mypy==0.770:
Case 1: Success: no issues found
Case 2: 35: error: Incompatible return value type (got "Enum1", expected "Enum2")
This case is very similar to this question: Difference between TypeVar('T', A, B) and TypeVar('T', bound=Union[A, B])
The answer explains when using case 1(bound=Union[Enum1, Enum2]), the following is legal:
Union[Enum1, Enum2]
Enum1
Enum2
And when using case 2 (A, B), the following is legal:
Enum1
Enum2
However, I don't think that answer explains my problem, since I am not using the Union case.
Can anyone please tell me what's going on?
I think the error occurs because the type checker does not have enough information to infer the return type from the types of the input arguments (though mypy's handling here could arguably be improved).
Suppose you have a simple generic function:
Enums = TypeVar("Enums", Enum1, Enum2)
def add(x: Enums, y: Enums) -> Enums:
    return x
The type checker can infer the return type from the types of the input arguments:
add(Enum2.MEMBER3, Enum2.MEMBER4) # ok, return Enum2
add(Enum1.MEMBER1, Enum1.MEMBER2) # ok, return Enum1
add(Enum2.MEMBER3, Enum1.MEMBER2) # not ok
Look at your function _enum_to_num again, the type checker has no means to infer the return type, it just doesn't know what type will be returned because it doesn't know what type will be returned by cast_enum:
def _enum_to_num(val: int, cast_enum: EnumMeta) -> Enums:
    return cast_enum(val)
The idea of static type checking is that it evaluates the code without execution, it investigates the types of variables, not the dynamic values. By looking at the type of cast_enum, that is EnumMeta, the type checker cannot tell whether cast_enum will return Enums or not. Looks like it just assumes it will return Enum1, and it causes the error in _enum_to_num(val, Enum2).
You know that _enum_to_num(val, Enum2) will return Enum2 because you know the value of cast_enum is Enum2. The value is something that type checker doesn't touch in general. It may be confusing, the value of the variable cast_enum is Enum2, while the type of cast_enum is EnumMeta, although Enum2 is a type.
This issue can be solved by telling the type checker that types will be passed through cast_enum using typing.Type:
from typing import TypeVar, Union, Type
...
def _enum_to_num(val: int, cast_enum: Type[Enums]) -> Enums:
    return cast_enum(val)
The error will disappear because now the type checker can infer the return type.
First I'll write a little about what mypy sees and reports, then follow with the question of whether this is a mypy bug.
The message:
Incompatible return value type (got "Enum1", expected "Enum2")
means, roughly, that an Enum2 (or a subtype of it) is expected: Enum2 is the declared return type of get_another_enum(). However, mypy thinks the call to _enum_to_num() returns an Enum1.
The "roughly" part is because there are exceptions to type checking when a type is unbound, or is an Any or Union type; but that doesn't apply in this example.
Mypy decides that the function cast_enum() in _enum_to_num() is returning the first type listed in Enums — I guess as a static type checker, it has to pick one and that's what it does.
So if you switch the order around in the Enums assignment and write:
Enums = TypeVar("Enums", Enum2, Enum1) # Case 2... error: Incompatible return value
Then line 35 will succeed, but the return in get_some_enum() will fail with the message:
error: Incompatible return value type (got "Enum2", expected "Enum1")
As to whether this is a mypy bug, it is hard to tell...
There is no dynamic type error that you can find here using the type() or isinstance() functions; running the code works as expected too.
On the other hand, Python never checks the return type, whether at compile time or at run time: you could change the return annotation of _enum_to_num() to None and that would still be valid as far as the Python interpreter is concerned.
The question then comes down to: in the static type system imposed by mypy, is this a bug? (I don't think PEP 484, PEP 526, or any of the other typing PEPs try to address this.)
Someone more qualified should answer the question of whether this is a bug that should be caught by a static analyzer, mypy in particular.
See Ken Hung's answer for a way to be more explicit and remove mypy's error.
I'm trying to extract some nice human-readable type hints from Python functions, but typing.get_type_hints() is returning something more complex/less readable than I was expecting.
For example:
import typing
def say_something(something: str = None):
    print(something)
typing.get_type_hints(say_something)
What I want it to give me is:
{'something': Optional[str]}
And according to the docs, that is indeed what should be happening:
If necessary, Optional[t] is added for function and method annotations if a default value equal to None is set.
But what it actually returns is this, which is equivalent but less readable:
{'something': typing.Union[str, NoneType]}
Other than manually substituting typing.Union[{x}, NoneType] with Optional[{x}], is there any way to get these nicer type hints in Python 3? (I'm on 3.7.5 specifically.)
AFAIK Optional[smth] will always return Union[smth, NoneType]; there is basically no separate Optional type at runtime, it's just an alias/shorthand.
If we take a look at typing module source
https://github.com/python/cpython/blob/df8913f7c48d267efd662e8ffd9496595115eee8/Lib/typing.py#L369-L371
we can see that when we are calling Optional[smth] what gets returned is Union[smth, NoneType].
We can also check this in REPL
>>> from typing import Optional
>>> Optional[str]
typing.Union[str, NoneType]
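If you want the nicer rendering anyway, one workaround is to post-process the hints yourself. The sketch below relies on the __origin__/__args__ attributes of typing objects, which are implementation details (stable in practice on 3.7+, but not a documented API):

```python
import typing

def pretty_hint(tp):
    # Render Union[X, NoneType] as "Optional[X]"; otherwise fall back to str(tp).
    args = getattr(tp, "__args__", ())
    if getattr(tp, "__origin__", None) is typing.Union and type(None) in args:
        rest = [a for a in args if a is not type(None)]
        if len(rest) == 1:
            return "Optional[{}]".format(getattr(rest[0], "__name__", str(rest[0])))
    return str(tp)

def say_something(something: str = None):
    print(something)

hints = {name: pretty_hint(tp)
         for name, tp in typing.get_type_hints(say_something).items()}
# hints is {'something': 'Optional[str]'}
```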
To rephrase my question below:
What is the point of annotating the parameter as a string when it does not convert an integer input to a string?
I understand we can use str() to convert an integer to a string, but that's not the answer I'm looking for.
After running the code by passing gss as the parameter,
I received 1. However, when I look up the type of this result, it shows as NoneType.
Why is this not a string?
>>> gss = 1
>>> convert_gss_to_str(gss)
1
>>> type(convert_gss_to_str(gss))
<class 'NoneType'>
I ran the code below thinking that the integer 1 would be converted to the string '1'.
However, I received this error:
TypeError: convert_gss() missing 1 required positional argument: 'gss'
Any suggestion what I am doing wrong?
gss = 1

def convert_gss_to_str(gss: str):
    print(gss)

convert_gss_to_str()
def convert_gss_to_str(gss: str):
    print(gss)
This function takes one non-optional parameter gss and does not return anything, so its return type is NoneType. If you want to do an actual conversion, you can use the builtin function str() as suggested by Sawel.
Type conversion is not really necessary for print() as it will print integers anyways
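To make the NoneType result concrete: print() returns None, so the function does too. A hypothetical variant (the _fixed name is mine, not from the question) that returns the converted value instead of printing it behaves as the asker expected:

```python
def convert_gss_to_str(gss: str):
    # Only prints; the implicit return value is None.
    print(gss)

result = convert_gss_to_str(1)
# result is None, which is why type(...) reported NoneType.

def convert_gss_to_str_fixed(gss) -> str:
    # Actually convert and return, instead of printing.
    return str(gss)
```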
def convert_gss_to_str(gss: str):
    ...
It's just a type hint.
For more information, read PEP484
While these annotations are available at runtime through the usual annotations attribute, no type checking happens at runtime. Instead, the proposal assumes the existence of a separate off-line type checker which users can run over their source code voluntarily. Essentially, such a type checker acts as a very powerful linter.
Just use the builtin function str.
>>> gss = 1
>>> str(gss)
'1'
I have a Python function that takes a numeric argument that must be an integer in order for it behave correctly. What is the preferred way of verifying this in Python?
My first reaction is to do something like this:
def isInteger(n):
    return int(n) == n
But I can't help thinking that this is 1) expensive 2) ugly and 3) subject to the tender mercies of machine epsilon.
Does Python provide any native means of type checking variables? Or is this considered to be a violation of the language's dynamically typed design?
EDIT: since a number of people have asked - the application in question works with IPv4 prefixes, sourcing data from flat text files. If any input is parsed into a float, that record should be viewed as malformed and ignored.
isinstance(n, int)
If you need to know whether it's definitely an actual int and not a subclass of int (generally you shouldn't need to do this):
type(n) is int
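The difference between the two checks shows up with subclasses of int; bool, for example, is a built-in int subclass:

```python
n = True  # bool is a subclass of int

# isinstance accepts subclasses; the exact type check does not.
assert isinstance(n, int)   # passes: a bool is an int
assert type(n) is not int   # the strict check rejects it
assert type(n) is bool
```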
this:
return int(n) == n
isn't such a good idea, as cross-type comparisons can be true - notably int(3.0)==3.0
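To see that cross-type comparison in action, a float with an integral value slips straight through the check:

```python
def isInteger(n):
    return int(n) == n

# int(3.0) == 3.0 compares an int to a float and is True,
# so a float passes the "is an integer" test.
assert isInteger(3.0)
assert not isInteger(3.5)
assert isInteger(3)
```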
Yeah, as Evan said, don't type check. Just try to use the value:
def myintfunction(value):
    """ Please pass an integer """
    return 2 + value
That doesn't have a typecheck. It is much better! Let's see what happens when I try it:
>>> myintfunction(5)
7
That works, because it is an integer. Hm. Let's try some text.
>>> myintfunction('text')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 3, in myintfunction
TypeError: unsupported operand type(s) for +: 'int' and 'str'
It shows an error, TypeError, which is what it should do anyway. If the caller wants to catch that, it is possible.
What would you do if you did a typecheck? Show an error right? So you don't have to typecheck because the error is already showing up automatically.
Plus since you didn't typecheck, you have your function working with other types:
Floats:
>>> print myintfunction(2.2)
4.2
Complex numbers:
>>> print myintfunction(5j)
(2+5j)
Decimals:
>>> import decimal
>>> myintfunction(decimal.Decimal('15'))
Decimal("17")
Even completely arbitrary objects that can add numbers!
>>> class MyAdderClass(object):
... def __radd__(self, value):
... print 'got some value: ', value
... return 25
...
>>> m = MyAdderClass()
>>> print myintfunction(m)
got some value: 2
25
So you clearly get nothing by typechecking. And lose a lot.
UPDATE:
Since you've edited the question, it is now clear that your application calls some upstream routine that makes sense only with ints.
That being the case, I still think you should pass the parameter as received to the upstream function. The upstream function will deal with it correctly e.g. raising an error if it needs to. I highly doubt that your function that deals with IPs will behave strangely if you pass it a float. If you can give us the name of the library we can check that for you.
But... If the upstream function will behave incorrectly and kill some kids if you pass it a float (I still highly doubt it), then just call int() on it:
def myintfunction(value):
    """ Please pass an integer """
    return upstreamfunction(int(value))
You're still not typechecking, so you get most benefits of not typechecking.
If even after all that, you really want to type check, despite it reducing your application's readability and performance for absolutely no benefit, use an assert to do it.
assert isinstance(...)
assert type() is xxxx
That way we can turn off asserts and remove this <sarcasm>feature</sarcasm> from the program by calling it as
python -OO program.py
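Concretely, the assert-based check might look like this (halve is a made-up example function; the check silently disappears when Python runs with -O or -OO):

```python
def halve(n):
    # Guard stripped entirely under `python -O`, so it costs
    # nothing in optimized runs -- and protects nothing, either.
    assert isinstance(n, int), "n must be an int"
    return n // 2
```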
Python now supports gradual typing via the typing module and mypy. The typing module is a part of the stdlib as of Python 3.5 and can be downloaded from PyPi if you need backports for Python 2 or previous version of Python 3. You can install mypy by running pip install mypy from the command line.
In short, if you want to verify that some function takes in an int, a float, and returns a string, you would annotate your function like so:
def foo(param1: int, param2: float) -> str:
    return "testing {0} {1}".format(param1, param2)
If your file was named test.py, you could then typecheck once you've installed mypy by running mypy test.py from the command line.
If you're using an older version of Python without support for function annotations, you can use type comments to accomplish the same effect:
def foo(param1, param2):
    # type: (int, float) -> str
    return "testing {0} {1}".format(param1, param2)
You use the same command mypy test.py for Python 3 files, and mypy --py2 test.py for Python 2 files.
The type annotations are ignored entirely by the Python interpreter at runtime, so they impose minimal to no overhead -- the usual workflow is to work on your code and run mypy periodically to catch mistakes and errors. Some IDEs, such as PyCharm, will understand type hints and can alert you to problems and type mismatches in your code while you're directly editing.
If, for some reason, you need the types to be checked at runtime (perhaps you need to validate a lot of input?), you should follow the advice listed in the other answers -- e.g. use isinstance, issubclass, and the like. There are also some libraries such as enforce that attempt to perform typechecking (respecting your type annotations) at runtime, though I'm uncertain how production-ready they are as of time of writing.
For more information and details, see the mypy website, the mypy FAQ, and PEP 484.
if type(n) is int
This checks if n is a Python int, and only an int. It won't accept subclasses of int.
Type-checking, however, does not fit the "Python way". It's better to use n as an int, and if that throws an exception, catch it and act upon it.
Don't type check. The whole point of duck typing is that you shouldn't have to. For instance, what if someone did something like this:
class MyInt(int):
# ... extra stuff ...
Programming in Python and performing typechecking as you might in other languages does seem like choosing a screwdriver to bang a nail in with. It is more elegant to use Python's exception handling features.
From an interactive command line, you can run a statement like:
int('sometext')
That will generate an error - ipython tells me:
<type 'exceptions.ValueError'>: invalid literal for int() with base 10: 'sometext'
Now you can write some code like:
try:
    int(myvar) + 50
except ValueError:
    print "Not a number"
That can be customised to perform whatever operations are required AND to catch any errors that are expected. It looks a bit convoluted but fits the syntax and idioms of Python and results in very readable code (once you become used to speaking Python).
I would be tempted to do something like:
def check_and_convert(x):
    x = int(x)
    assert 0 <= x <= 255, "must be between 0 and 255 (inclusive)"
    return x

class IPv4(object):
    """IPv4 CIDR prefixes are A.B.C.D/E where A-D are
    integers in the range 0-255, and E is an int
    in the range 0-32."""
    def __init__(self, a, b, c, d, e=0):
        self.a = check_and_convert(a)
        self.b = check_and_convert(b)
        self.c = check_and_convert(c)
        self.d = check_and_convert(d)
        e = int(e)
        assert 0 <= e <= 32, "must be between 0 and 32 (inclusive)"
        self.e = e
That way, anything can be passed in, yet you only ever store valid integers.
how about:
def ip(string):
    subs = string.split('.')
    if len(subs) != 4:
        raise ValueError("incorrect input")
    out = tuple(int(v) for v in subs if 0 <= int(v) <= 255)
    if len(out) != 4:
        raise ValueError("incorrect input")
    return out
of course there is the standard isinstance(3, int) function ...
For those who are looking to do this with an assert statement (note that assert is a statement, not a function), here is how you can place a variable type check in your code without defining any additional functions. This will stop your code from running if the assertion fails:
assert type(x) == int
If no error is raised, the code continues to run. Other than that, the unittest module is a very useful tool for these sorts of things.