Is it possible to reference function parameters in Python's function annotation? - python

I'd like to be able to say
def f(param) -> type(param): return param
but I get NameError: name 'param' is not defined. The key thing here is that the return type is a function of a function parameter. I have glanced through PEP 3107 (https://www.python.org/dev/peps/pep-3107/), but I don't see any precise description of what comprises a valid annotation expression.
I would accept an answer which explains why exactly this is not possible at the moment, i.e., does it not fit into the current annotation paradigm, or is there a technical problem with it?

There are a few issues with the type(param) approach.
First off, as Oleh mentioned in his answer, all annotations must be valid at the time of the function's definition. In an example like yours, you could potentially have problems due to variable shadowing.
param = 10

def f(param) -> type(param):
    return param

f('a')
Since the variable param is of type int, the function's annotation is essentially read as f(param: Any) -> int. So when you pass in the argument 'a', f returns a str, which is inconsistent with the annotation. Admittedly this example is contrived, but from a language design standpoint it is something to be careful about.
Instead, as jonrsharpe mentioned, often the best way to reference the generic types of parameters is with type variables.
This can be done using the typing.TypeVar class.
from typing import TypeVar

T = TypeVar('T')

def f(param: T) -> T:
    return param
This means that static checkers won't need to actually access the type of param, just check at check time that there is a way to consider both param and the return value to be of the same type. I say consider the same type because you will sometimes only be asserting that they both implement the same abstract base class/interface, like numbers.Real.
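For instance, here is a minimal sketch of constraining a type variable to such an interface (the bound= keyword is standard typing; the function name require_positive is made up for illustration):

import numbers
from typing import TypeVar

# hypothetical sketch: R can be any type implementing the numbers.Real ABC
R = TypeVar('R', bound=numbers.Real)

def require_positive(value: R) -> R:
    # int, float, Fraction, ... are all accepted, and the checker
    # keeps the concrete type of whatever was passed in
    if value <= 0:
        raise ValueError('expected a positive number')
    return value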
You can then use type variables in generic types:
from typing import List, TypeVar

T = TypeVar('T')

def total(items: List[T]) -> List[T]:
    return [f(item) for item in items]
Using type variables and generics can be better because it adds information and allows for a little more flexibility (as explained in the example with numbers.Real). For instance, the ability to use List[T] is really important. In your case, type(param) would only give you list, not a parameterized type like List[T], so using type(param) would actually lose information, not add it.
Therefore, it is a better idea to stick to using type variables and generic types instead.
TL;DR:
Due to variable shadowing, type(param) could lead to inconsistent annotations.
Since you are sometimes thinking of the types in your system in terms of interfaces (abstract base classes in Python) rather than concrete types, it can be better to rely on ABCs and type variables.
Using type(param) could lose information that would be provided by generics.

Let's take a glance at PEP 484 - Type Hints, section "Acceptable type hints":
Annotations must be valid expressions that evaluate without raising exceptions at the time the function is defined (but see below for forward references).
Annotations should be kept simple or static analysis tools may not be able to interpret the values. For example, dynamically computed types are unlikely to be understood. (This is an intentionally somewhat vague requirement, specific inclusions and exclusions may be added to future versions of this PEP as warranted by the discussion.)
I'd say that your approach is quite interesting and may be useful for static analysis. But if we accept the PEPs as a source of explanation for the current annotation paradigm, the highlighted text explains why the return type can't be determined dynamically at the time the function is called.

Related

How can i type hint the init params are the same as fields in a dataclass?

Let us say I have a custom use case, and I need to dynamically create or define the __init__ method for a dataclass.
For example, say I will need to decorate it like @dataclass(init=False) and then modify the __init__() method to take keyword arguments, like **kwargs. However, in the kwargs object, I only check for the presence of known dataclass fields, and set these attributes accordingly (example below).
I would like to type hint to my IDE (PyCharm) that the modified __init__ only accepts the listed dataclass fields as parameters or keyword arguments. I am unsure if there is a way to approach this, using the typing library or otherwise. I know that PY 3.11 has dataclass transforms planned, which may or may not do what I am looking for (my gut feeling is no).
Here is a sample code I was playing around with, which is a basic case which illustrates problem I am having:
from dataclasses import dataclass

# get value from input source (can be a file or anything else)
def get_value_from_src(_name: str, tp: type):
    return tp()  # dummy value

@dataclass
class MyClass:
    foo: str
    apple: int

    def __init__(self, **kwargs):
        for name, tp in self.__annotations__.items():
            if name in kwargs:
                value = kwargs[name]
            else:
                # here is where I would normally have the logic
                # to read the value from another input source
                value = get_value_from_src(name, tp)

                if value is None:
                    raise ValueError

            setattr(self, name, value)

c = MyClass(apple=None)
print(c)

c = MyClass(foo='bar',  # here, I would like to auto-complete the name
                        # when I start typing `apple`
            )
print(c)
If we assume that the number or names of the fields are not fixed, I am curious if there could be a generic approach which would basically say to type checkers, "the __init__ of this class accepts only (optional) keyword arguments that match up with the fields defined in the dataclass itself".
Addendums, based on notes in comments below:
Passing @dataclass(kw_only=True) won't work because, imagine, I am writing this for a library and need to support Python 3.7+. Also, kw_only has no effect when a custom __init__() is implemented, as in this case.
The above is just a stub __init__ method. It could have more complex logic, such as setting attributes based on a file source, for example. Basically, the above is just a sample implementation of a larger use case.
I can't update each field to foo: Optional[str] = None because that part would be implemented in user code, which I would not have any control over. Also, annotating it in this way doesn't make sense when you know a custom __init__() method will be generated for you - meaning not by dataclasses. Lastly, setting a default for each field just so that the class can be instantiated without arguments, like MyClass(), doesn't seem like the best idea to me.
It would not work to let dataclasses auto-generate an __init__, and instead implement a __post_init__(). This would not work because I need to be able to construct the class without arguments, like MyClass(), as the field values will be set from another input source (think a local file or elsewhere); this means that all fields would be required, so annotating them as Optional would be fallacious in this case. I still need to support the user passing optional keyword arguments, but these **kwargs will always match up with the dataclass field names, and so I desire some way for auto-completion to work with my IDE (PyCharm).
Hope this post clarifies the expectations and desired result. If there are any questions or anything that is a bit vague, please let me know.
What you are describing is impossible in theory and unlikely to be viable in practice.
TL;DR
Type checkers don't run your code, they just read it. A dynamic type annotation is a contradiction in terms.
Theory
As I am sure you know, the term static type checker is not coincidental. A static type checker is not executing the code you write. It just parses it and infers types according to its own internal logic, by applying certain rules to a graph that it derives from your code.
This is important because unlike some other languages, Python is dynamically typed, which as you know means that the type of a "thing" (variable) can completely change at any point. In general, there is theoretically no way of knowing the type of all variables in your code, without actually stepping through the entire algorithm, which is to say running the code.
As a silly but illustrative example, you could decide to put the name of a type into a text file to be read at runtime and then used to annotate some variable in your code. Could you do that with valid Python code and typing? Sure. But I think it is beyond clear, that static type checkers will never know the type of that variable.
Why your proposition won't work
Abstracting away all the dataclass stuff and the possible logic inside your __init__ method, what you are asking boils down to the following.
"I want to define a method (__init__), but the types of its parameters will only be known at runtime."
Why am I claiming that? I mean, you do annotate the types of the class' attributes, right? So there you have the types!
Sure, but these have -- in general -- nothing whatsoever to do with the arguments you could pass to the __init__ method, as you yourself point out. You want the __init__ method to accept arbitrary keyword-arguments. Yet you also want a static type checker to infer which types are allowed/expected there.
To connect the two (attribute types and method parameter types), you could of course write some kind of logic. You could even implement it in a way that enforces adherence to those types. That logic could read the type annotations of the class attributes, match up the **kwargs and raise TypeError if one of them doesn't match up. This is entirely possible and you almost implemented that already in your example code. But this only works at runtime!
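A minimal sketch of that runtime-only logic (the helper name validate_kwargs is hypothetical and not part of the original post):

def validate_kwargs(obj, kwargs: dict) -> None:
    # runtime-only check: compare each keyword argument against the annotated
    # field types (assuming the annotations are plain classes such as str or int);
    # a static checker learns nothing from this
    cls_annotations = type(obj).__annotations__
    for name, value in kwargs.items():
        if name not in cls_annotations:
            raise TypeError(f'unexpected keyword argument {name!r}')
        if not isinstance(value, cls_annotations[name]):
            raise TypeError(f'{name!r} should be {cls_annotations[name].__name__}, '
                            f'got {type(value).__name__}')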
Again, a static type checker has no way to infer that, especially since your desired class is supposed to just be a base class and any descendant can introduce its own attributes/types at any point.
But dataclasses work, don't they?
You could argue that this dynamic way of annotating the __init__ method works with dataclasses. So why are they so different? Why are they correctly inferred, but your proposed code can't?
The answer is, they aren't.
Even dataclasses don't have any magical way of telling a static type checker which parameter types the __init__ method is to expect, even though they do annotate them, when they dynamically construct the method in _init_fn.
The only reason mypy correctly infers those types, is because they implemented a separate plugin just for dataclasses. Meaning it works because they read through PEP 557 and hand-crafted a plugin for mypy that specifically facilitates type inference based on the rules described there.
You can see the magic happening in the DataclassTransformer.transform method. You cannot generalize this behavior to arbitrary code, which is why they had to write a whole plugin just for this.
I am not familiar enough with how PyCharm does its type checking, but I strongly suspect they used something similar.
So you could argue that dataclasses are "cheating" with regards to static type checking. Though I am certainly not complaining.
Pragmatic solution
Even something as "high-profile" as Pydantic, which I personally love and use extensively, requires its own mypy plugin to realize the __init__ type inference properly. For PyCharm they have their own separate Pydantic plugin, without which the internal type checker cannot provide those nice auto-suggestions for initialization etc.
That approach would be your best bet, if you really want to take this further. Just be aware that this will be (in the best sense of the word) a hack to allow specific type checkers to catch "errors" that they otherwise would have no way of catching.
The reason I argue that it is unlikely to be viable is because it will essentially blow up the amount of work for your project to also cover the specific hacks for those type checkers that you want to satisfy. If you are committed enough and have the resources, go for it.
Conclusion
I am not trying to discourage you. But it is important to know the limitations enforced by the environment. It's either dynamic types and hacky imperfect type checking (still love mypy), or static types and no "kwargs can be anything" behavior.
Hope this makes sense. Please let me know, if I made any errors. This is just based on my understanding of typing in Python.
For
It would not work to let dataclasses auto-generate an __init__, and instead implement a __post_init__(). This would not work because I need to be able to construct the class without arguments, like MyClass(), as the field values will be set from another input source (think a local file or elsewhere); this means that all fields would be required, so annotating them as Optional would be fallacious in this case. I still need to support the user passing optional keyword arguments, but these **kwargs will always match up with the dataclass field names, and so I desire some way for auto-completion to work with my IDE (PyCharm).
dataclasses.field + default_factory can be a solution.
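For instance, a minimal sketch of that idea (it assumes the field declarations can be edited, which the quoted addendum below rules out):

from dataclasses import dataclass, field

@dataclass
class MyClass:
    # every field gets a factory, so MyClass() works without arguments
    foo: str = field(default_factory=str)
    apple: int = field(default_factory=int)

MyClass()                    # MyClass(foo='', apple=0)
MyClass(foo='bar', apple=3)  # keyword arguments still auto-complete in the IDE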
But, it seems that dataclass field declarations are implemented in user code:
I can't update each field to foo: Optional[str] = None because that part would be implemented in user code, which I would not have any control over. Also, annotating it in this way doesn't make sense when you know a custom __init__() method will be generated for you - meaning not by dataclasses. Lastly, setting a default for each field just so that the class can be instantiated without arguments, like MyClass(), don't seem like the best idea to me.
If your IDE supports ParamSpec, there is a workaround: it is not correct (it cannot pass a static type checker), but it gives auto-completion:
from typing import Callable, Iterable, TypeVar, ParamSpec
from dataclasses import dataclass

T = TypeVar('T')
P = ParamSpec('P')

# user defined dataclass
@dataclass
class MyClass:
    foo: str
    apple: int

def wrap(factory: Callable[P, T], annotations: Iterable[tuple[str, type]]) -> Callable[P, T]:
    def default_factory(**kwargs):
        for name, type_ in annotations:
            kwargs.setdefault(name, type_())
        return factory(**kwargs)
    return default_factory

WrappedMyClass = wrap(MyClass, MyClass.__annotations__.items())
WrappedMyClass()  # Okay
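For instance (a hypothetical call, just to show what the wrapper advertises to the IDE):

# the wrapped callable carries MyClass's parameter spec, so the IDE can
# suggest `foo` and `apple` as keyword arguments here
obj = WrappedMyClass(foo='bar', apple=3)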

Typing for repl/infinite-loop type [duplicate]

Python's new type hinting feature allows us to type hint that a function returns None...
def some_func() -> None:
    pass
... or to leave the return type unspecified, which the PEP dictates should cause static analysers to assume that any return type is possible:
Any function without annotations should be treated as having the most general type possible
However, how should I type hint that a function will never return? For instance, what is the correct way to type hint the return value of these two functions?
def loop_forever():
    while True:
        print('This function never returns because it loops forever')

def always_explode():
    raise Exception('This function never returns because it always raises')
Neither specifying -> None nor leaving the return type unspecified seems correct in these cases.
Even though the "PEP 484 - Type Hints" standard is mentioned both in the question and in the answer, nobody quotes its section The NoReturn type, which covers your question.
Quote:
The typing module provides a special type NoReturn to annotate functions that never return normally. For example, a function that unconditionally raises an exception:
from typing import NoReturn

def stop() -> NoReturn:
    raise RuntimeError('no way')
The section also provides examples of wrong usages. Though it doesn't cover functions with an endless loop, in type theory both equally satisfy the "never returns" meaning expressed by that special type.
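So, presumably, the same annotation fits the looping variant from the question (a sketch; mypy accepts it, since the body can never fall through to an implicit return):

from typing import NoReturn

def loop_forever() -> NoReturn:
    while True:
        print('This function never returns because it loops forever')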
In July 2016, there was no answer to this question yet (now there is NoReturn; see the new accepted answer). These were some of the reasons:
When a function doesn't return, there is no return value (not even None) that a type could be assigned to. So you are not actually trying to annotate a type; you are trying to annotate the absence of a type.
The type hinting PEP has only just been adopted in the standard, as of Python version 3.5. In addition, the PEP only advises on what type annotations should look like, while being intentionally vague on how to use them. So there is no standard telling us how to do anything in particular, beyond the examples.
The PEP has a section Acceptable type hints stating the following:
Annotations must be valid expressions that evaluate without raising exceptions at the time the function is defined (but see below for forward references).
Annotations should be kept simple or static analysis tools may not be able to interpret the values. For example, dynamically computed types are unlikely to be understood. (This is an intentionally somewhat vague requirement, specific inclusions and exclusions may be added to future versions of this PEP as warranted by the discussion.)
So it tries to discourage you from doing overly creative things, like throwing an exception inside a return type hint in order to signal that a function never returns.
Regarding exceptions, the PEP states the following:
No syntax for listing explicitly raised exceptions is proposed. Currently the only known use case for this feature is documentational, in which case the recommendation is to put this information in a docstring.
There is a recommendation on type comments, in which you have more freedom, but even that section doesn't discuss how to document the absence of a type.
There is one thing you could try in a slightly different situation, when you want to hint that a parameter or a return value of some "normal" function should be a callable that never returns. The syntax is Callable[[ArgTypes...], ReturnType], so you could just omit the return type, as in Callable[[ArgTypes...]]. However, this doesn't conform to the recommended syntax, so strictly speaking it isn't an acceptable type hint. Type checkers will likely choke on it.
Conclusion: you are ahead of your time. This may be disappointing, but there is an advantage for you, too: you can still influence how non-returning functions should be annotated. Maybe this will be an excuse for you to get involved in the standardisation process. :-)
I have two suggestions.
Allow omitting the return type in a Callable hint and allow the type of anything to be forward hinted. This would result in the following syntax:
always_explode: Callable[[]]

def always_explode():
    raise Exception('This function never returns because it always raises')
Introduce a bottom type like in Haskell:
def always_explode() -> ⊥:
    raise Exception('This function never returns because it always raises')
These two suggestions could be combined.
From Python 3.11, the new bottom type typing.Never should be used to type functions that don't return, as in:
from typing import Never

def always_explode() -> Never:
    raise
This replaces typing.NoReturn
... to make the intended meaning more explicit.
I'm guessing at some point they'll deprecate NoReturn in that context, since both are valid in 3.11.

What should the type annotation for a python property setter's argument be?

Does Python have a stance on the PEP-484 type annotation for a property setter's argument? I see two options, both of which seem valid (according to me, and to mypy).
Consider:
from dataclasses import dataclass
from typing import Any

@dataclass
class Foo:
    _bar: int = 1

    @property
    def bar(self) -> int:
        return self._bar

    @bar.setter
    def bar(self, value) -> None:
        self._bar = value
The question is:
Should @bar.setter's value argument be typed with typing.Any or with int?
On one hand, within the setter, having the expected type hint would be nice for performing validations, but on the other hand, the incoming value could be of any type.
One thing of note, though; mypy does warn about the incorrect assignment to a property setter:
f = Foo()
f.bar = 2      # Ok
f.bar = "baz"  # Incompatible types in assignment (expression has type "str", variable has type "int")
I believe this comes from the revealed type of Foo.bar being an int, not from the type of the value argument of @bar.setter.
I searched through the python/cpython and python/typeshed projects for examples, but didn't come up with anything definitive.
I'm very experienced with modern Python, and am comfortable reading CPython sources (be it in C or Python itself). An answer that references a PEP, or includes input from a CPython or mypy maintainer, would be ideal.
This question lends itself very well to being strongly opinion based; however, I think there may be a stronger argument for mutators annotated with the expected type (i.e. def bar(self, value: int) -> None:). First, annotations were implemented to aid static analysis rather than to provide any real runtime benefit (they currently do not, to my knowledge). From the PEP 484 rationale:
Of these goals, static analysis is the most important. This includes support for off-line type checkers such as mypy, as well as providing a standard notation that can be used by IDEs for code completion and refactoring.
If type annotations are largely meant to benefit static analysis, linting, etc., it would make sense that you would want to catch passing in the wrong type there, rather than potentially discovering at runtime that you have not handled the parameter properly with type checks using isinstance, for example.
This would also mean that we can do more with less, since the more specific int annotation would remove the need for us to add those type guards:
from typing import Any

def bigger_fun(n: Any) -> None:
    if isinstance(n, float):
        ...  # do something...
    else:
        ...  # do something else...

def smaller_fun(n: int) -> None:
    ...  # do something
You will know exactly what type you will receive and how to handle it, rather than needing multiple conditional branches to first cast the parameter to an expected type before operating on it. This will allow you to make your mutators as slim as possible, with only minimal internal logic / processing.
If you were to pass it the wrong type, your IDE or static analysis tool will at the very least warn you when passing a float for smaller_fun for example. On the other hand, using Any might produce unexpected behavior for some types, which introduces runtime bugs which could be difficult to track down.
Now, more specifically to your question, the same PEP touches upon the use of @property annotations in The Meaning of Annotations:
Type checkers are expected to attempt to infer as much information as necessary. The minimum requirement is to handle the builtin decorators @property, @staticmethod and @classmethod.
This means that you can expect the @property decorator to function normally, without any special treatment.
While Python is at heart a dynamically typed language, methods like a mutator are very strongly tied to a specific value (and therefore type) and should really do only one thing rather than one of many things. So while it probably makes sense for a comparison method like __gt__, which will likely perform different operations for different types, to take an Any value, a mutator should take as narrow a scope as possible.
Finally, even though type hints are not and probably should never be mandatory, all of the most popular Python IDEs, such as PyCharm, automatically support type hints. They will often give warnings even when another programmer may not be annotating types but the type can be safely inferred. This means that even when using a library with type hints, mutators with an int annotation will still be more informative and useful to the end user than an Any annotation.
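Putting that together, a sketch of the property from the question with the setter annotated with the expected type (the validation inside the setter is illustrative, not taken from the question):

from dataclasses import dataclass

@dataclass
class Foo:
    _bar: int = 1

    @property
    def bar(self) -> int:
        return self._bar

    @bar.setter
    def bar(self, value: int) -> None:
        # the narrow annotation lets mypy flag `f.bar = "baz"` at check time,
        # so the runtime guard can stay minimal
        if value < 0:
            raise ValueError('bar must be non-negative')
        self._bar = value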

Python type annotation for arbitrary list?

I would like to annotate an argument of a function to indicate that a list is expected as argument. However I would like to keep the base type of the list unspecified. Is there a way to do this? Ie use a placeholder like below?
def my_func(li: List[any])
Edit: is it possible to use a template, i.e. something like:
def union(li: List[List[T]])-> List[T]:
However I would like to keep the base type of the list unspecified. Is there a way to do this? Ie use a placeholder like below?
def my_func(li: List[any])
Yes. But you want Any:
Special type indicating an unconstrained type.
Every type is compatible with Any.
Any is compatible with every type.
is it possible to use a template:
Ie something like:
def union(li: List[List[T]])-> List[T]:
Yes. Although these are called generics, not templates (because they're not actually something like C++ templates that provide a Turing-complete compile-time language; they're just simple generic types).
The only problem is that generic types require type variables, and there's no builtin type variable named T. But it's easy enough to create one, as shown in the docs, and of course T is the conventional "first generic parameter" typevar:
T = TypeVar('T')
… and then you can use it:
def union(li: List[List[T]])-> List[T]:
If you already know C++ templates or Haskell parameterized types or Java generics or whatever, it's tempting to just jump in and start writing Python type annotations assuming you can guess what they mean. But really, you need to read at least the first few sections of the docs, or PEP 483 and the various other linked PEPs. Otherwise, you're going to guess all kinds of things wrong (not just what Any is called and how to declare TypeVars, but also probably what the parameters of Tuple are, how covariance works, which generic types are structurally vs. nominally checked, etc.).
While we're at it, unless you really need the input to be a List, you probably want Sequence[Sequence[T]] or Iterable[Sequence[T]] or similar.
You can find all the details in Classes, functions, and decorators, but in general, anything from collections.abc that seems like it ought to have a generic counterpart in typing does.
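For instance, a minimal sketch of the union function written against the more permissive input type (the flattening body is only illustrative):

from typing import Iterable, List, Sequence, TypeVar

T = TypeVar('T')

def union(li: Iterable[Sequence[T]]) -> List[T]:
    # accepts lists, tuples, generators of sequences, ... and still
    # returns a concrete List[T] (here the inner sequences are simply flattened)
    return [item for seq in li for item in seq]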

Type hints: when to annotate

I'm using type hints and mypy more and more. I however have some questions about when I should explicitly annotate a declaration, and when the type can be determined automatically by mypy.
Ex:
def assign_volume(self, volume: float) -> None:
    self._volume = volume * 1000
Should I write
self._volume: float = volume * 1000
In this case?
Now if I have the following function:
def return_volume(self) -> float:
    return self._volume
and somewhere in my code:
my_volume = return_volume()
Should I write:
my_volume: float = return_volume()
Mypy (and PEP 484 in general) is designed so that in the most ideal case, you only need to add type annotations to the "boundaries" or "interfaces" of your code.
For example, you basically must add annotations/type metadata in the following places:
The parameter and return types of functions and methods.
Any object fields (assuming the types of your fields are not inferrable just by looking at your constructor)
When you inherit a class. For example, if you specifically want to subclass a dict of ints to strs, you should do class MyClass(Dict[int, str]): ..., not class MyClass(dict): ....
These are all examples of "boundaries" of your code. Type hints on parameter/return types let the caller of the function make sure they're calling it correctly, type hints on fields let the caller know they're using the object correctly, etc...
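A small sketch of what those boundaries look like in practice (the class and its names are made up for illustration):

from typing import Dict, List

class Inventory(Dict[str, int]):  # boundary: the subclassed generic is explicit
    def restock(self, names: List[str], amount: int) -> None:  # boundary: parameters and return
        for name in names:
            # local variables need no annotations: mypy infers str and int here
            current = self.get(name, 0)
            self[name] = current + amount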
Mypy (and other PEP 484 compliant tools) will then use that information and try to infer the types of everything else. This behavior is designed to roughly mimic how humans read code: once you know what types are being passed in, for example, it's usually pretty easy to understand what the rest of the code does.
After all, Python is a language that was designed from the start to be readable! We don't need to scatter type hints everywhere to enhance our understanding of what the code does.
Of course, mypy (and other PEP 484-compliant tools) aren't perfect, and sometimes they might not correctly infer what the type of some local variable will be. In that case, you might need to add a type hint to help mypy along. Ethan's answer gives a good overview of some common cases to watch out for. (Interestingly, these cases also tend to be examples of where a human reader might struggle to understand your code!)
So, to put everything together, the general recommendation is to:
Add type hints to all of the "boundaries" of your code, like function parameters and return types.
Default to not annotating variables. If mypy is unable to infer what type some variable should be, add an annotation to help it.
If you find yourself needing to annotate lots of variables to make mypy happy, consider refactoring your code. If mypy is getting confused easily, a human reader is also likely to get confused easily.
So, to go back to your examples, you would not add type hints in either case. Both a human reader and mypy can tell that your _volume field must be a float: it's immediately obvious that this must be the case, since the parameter is a float and multiplying a float by an int will always produce another float.
Similarly, you would not add an annotation to your my_volume variable. Since return_volume() has type hints, it's trivially easy to see what type it's returning and understand that my_volume is of type float. (And if you make a mistake and accidentally think it's something other than a float, then mypy will catch that for you.)
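A sketch of how little annotation that actually requires (the class name Tank is made up; only the boundaries carry hints):

class Tank:
    def assign_volume(self, volume: float) -> None:
        # _volume is inferred as float: float * int is always a float
        self._volume = volume * 1000

    def return_volume(self) -> float:
        return self._volume

my_volume = Tank().return_volume()  # inferred as float, no annotation needed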
Mypy does some pretty advanced type inference. Usually, you do not need to annotate variables. The mypy documentation [1] says this about inference:
Mypy considers the initial assignment as the definition of a variable. If you do not explicitly specify the type of the variable, mypy infers the type based on the static type of the value expression
The general rule of thumb then is "annotate variables whose types are not inferrable at their initial assignment".
Here are some examples:
Empty containers. If I define a as a = [], mypy will not know what types are valid in the list a.
Optional types. Oftentimes, if I define an Optional type, I will assign the variable to None. For example, if I do a = None, mypy will infer that a has type NoneType, if you want to assign a to 5 later on, you need to annotate it: a: Optional[int] = None.
Complex nested containers. For example, if you have a dictionary with both list and string values, mypy might, for example, infer Dict[str, Any]. You may need to annotate it to be more accurate.
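A compact sketch of those three cases (the variable names are arbitrary):

from typing import Dict, List, Optional, Union

names: List[str] = []        # empty container: tell mypy the element type
count: Optional[int] = None  # later `count = 5` is then allowed
# complex nested container: annotate it instead of settling for Dict[str, Any]
payload: Dict[str, Union[List[int], str]] = {'items': [1, 2], 'label': 'x'}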
Of course there are many more cases.
In your examples, mypy can infer the types of the expressions.
[1] https://mypy.readthedocs.io/en/latest/type_inference_and_annotations.html
For myself, I started to write type hints everywhere it is possible. It isn't slower at all, and it makes things easier if you go back to your old code in the future. So there is no negative aspect to using them as much as possible, except for the size of your Python file.
