Programmatic generation of Literal options - python

MyPy's Literal type can be super useful for defining available options. Is it possible to generate a literal type programmatically, e.g. from a canonical registry?
e.g.
class Dispatcher():
    func_reg = {
        'f1': my_func,
        'f2': new_func,
        'f3': shoe_func,
    }

    def dispatch(cls, func_name: Literal[*func_reg.keys()]) -> Whatever:
        pass

Unfortunately, the answer is no.
According to the mypy documentation:
Literal types may contain one or more literal bools, ints, strs, bytes, and enum values. However, literal types cannot contain arbitrary expressions: types like Literal[my_string.trim()], Literal[x > 3], or Literal[3j + 4] are all illegal.

As @BrokenBenchmark notes, it is not possible to auto-generate Literal types. However, if the end goal is just to require specific values generated from some kind of function registry, we can hack it with enum.Enum.
To quote PEP 586:
rather than entirely special-casing enums, we can instead treat them
as being approximately equivalent to the union of their values...
the Status enum could be treated as being approximately equivalent to Literal[Status.SUCCESS, Status.INVALID_DATA, Status.FATAL_ERROR]
Here, functions are "registered" by adding an enum value that is an exact uppercasing of the function name to the FuncNames enum. This is not a pretty or robust solution, but it runs, it supports single-location registration of a function for type-checked dispatch, and mypy handles the required enum values as expected.
from enum import Enum, auto

def f():
    return "f"

def g():
    return "g"

def h():
    return "h"

class Dispatcher():
    # Build the enum used to register the functions
    class FuncNames(Enum):
        """
        The enum names here _must_ be exact uppercase-ings of the function
        names. The names will be lowercased and evaluated to register their
        associated functions.
        """
        F = auto()
        G = auto()
        H = auto()

    # NOTE: The functional syntax works just as well
    # FuncNames = Enum('FuncNames', 'F G H')

    # Comprehensions can't access names defined in the class block,
    # so use a standard for loop
    func_reg = dict()
    for name in list(FuncNames):
        func_reg[eval(f"FuncNames.{name.name}")] = eval(str(name.name).lower())

    @classmethod
    def dispatch(cls, func_name: FuncNames):
        """
        Prints the return from a registered function.
        Can only be called with an item from FuncNames.
        """
        print(cls.func_reg[func_name]())
Dispatcher.dispatch(Dispatcher.FuncNames.F)
Dispatcher.dispatch(Dispatcher.FuncNames.G)
Dispatcher.dispatch(Dispatcher.FuncNames.H)
# Dispatcher.dispatch(Dispatcher.FuncNames.I) -> "FuncNames has no attribute I"
# Dispatcher.dispatch(Dispatcher2.FuncNames) -> "incompatible type"
# Dispatcher.dispatch('MyPy hates me!') -> "incompatible type"
Interestingly, though it feels cleaner to generate the enum itself from a list of the functions themselves, MyPy chokes on this.
class Dispatcher2():
    # Build an enum used to register these (the actual functions)
    funcs_to_register = [f, g, h]
    enum_names = [func.__name__.upper() for func in funcs_to_register]
    joined = ' '.join(enum_names)
    FuncNames = Enum('FuncNames', joined)

    func_reg = dict()
    for name in enum_names:
        func_reg[eval(f"FuncNames.{name}")] = eval(name.lower())

    @classmethod
    def dispatch(cls, func_name: FuncNames):
        """
        Prints the return from a registered function.
        Can only be called with an item from FuncNames.
        """
        print(cls.func_reg[func_name]())
Dispatcher2.dispatch(Dispatcher2.FuncNames.F)
Dispatcher2.dispatch(Dispatcher2.FuncNames.G)
Dispatcher2.dispatch(Dispatcher2.FuncNames.H)
The above runs as expected, but mypy presumably can't infer the values present in the enum unless it is statically defined, so it errors.
> mypy enums_typing.py
enums_typing.py:19: error: Enum() expects a string, tuple, list or dict literal as the second argument
enums_typing.py:36: error: "Type[FuncNames]" has no attribute "F"
enums_typing.py:37: error: "Type[FuncNames]" has no attribute "G"
enums_typing.py:38: error: "Type[FuncNames]" has no attribute "H"
Found 4 errors in 1 file (checked 1 source file)
TLDR:
In order to define a fixed set of choices that MyPy can check against, you must define them statically. It may then be possible to use those statically-defined choices to programmatically build your function registry.
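For completeness, the derivation can also run in the other direction: statically define the Literal, then build the runtime registry from it with typing.get_args (Python 3.8+). A minimal sketch, with hypothetical functions f1/f2/f3 named to match the literal values:
from typing import Callable, Dict, Literal, get_args

FuncName = Literal['f1', 'f2', 'f3']

def f1() -> str: return 'one'
def f2() -> str: return 'two'
def f3() -> str: return 'three'

# The Literal is the canonical registry; the dict is derived from it at
# import time, so the two can't silently drift apart.
func_reg: Dict[str, Callable[[], str]] = {
    name: globals()[name] for name in get_args(FuncName)
}

def dispatch(func_name: FuncName) -> str:
    return func_reg[func_name]()

dispatch('f1')    # OK
dispatch('oops')  # mypy: incompatible type "Literal['oops']"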


Type hinting a list of specific strings in Python

I have a function as shown below:
from typing import List, Literal, Union

def foo(possible_values: List[Union[Literal['abcd', 'efgh', 'ijkl']]]):
    return {}
This is how I want the code to behave: whenever the possible_values parameter gets values other than ["abcd", "efgh", "ijkl"], it should throw an error.
E.g.:
res = foo(possible_values=["abc", "efgh"])
should throw an error, as "abc" is not defined in the function signature.
However,
res = foo(possible_values=["abcd", "efgh"])
should work fine, as those are a subset of what is defined.
Currently, with the above code, it just accepts any arbitrary list of strings.
If you want to constrain values to a predefined set, you might want to use Enum. As others have mentioned, type hints are not enforced at runtime in Python, so you'll have to either implement the check in your function's body or use a library that performs annotation-based validation. Here is an example.
from typing import List
from enum import Enum

# Let's define your possible values as an enumeration. Note that it also
# inherits from str, which allows using its members in comparisons as if
# they were strings
class PossibleValues(str, Enum):
    abcd = 'abcd'
    efgh = 'efgh'
    ijkl = 'ijkl'
Now your function. Note the type-hinting.
def foo(possible_values: List[PossibleValues]):
    # We unroll the enum as a set, and check that possible_values is a subset of it
    if not set(PossibleValues).issuperset(possible_values):
        raise ValueError(f'Only {[v.value for v in PossibleValues]} are allowed.')
    # Do whatever you need to do
    return {}
Now when you use it:
foo(['abcd', 'efgh'])
# output: {}
foo(['abc', 'efgh'])
# ValueError: Only ['abcd', 'efgh', 'ijkl'] are allowed.
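If you'd rather keep the Literal annotation (so mypy also flags bad call sites statically), a sketch of the same runtime check driven by typing.get_args (Python 3.8+):
from typing import List, Literal, get_args

PossibleValue = Literal['abcd', 'efgh', 'ijkl']

def foo(possible_values: List[PossibleValue]):
    # get_args recovers ('abcd', 'efgh', 'ijkl') from the Literal, so the
    # runtime check can never drift out of sync with the annotation
    allowed = set(get_args(PossibleValue))
    invalid = set(possible_values) - allowed
    if invalid:
        raise ValueError(f'Only {sorted(allowed)} are allowed, got {sorted(invalid)}.')
    return {}

foo(['abcd', 'efgh'])  # {} -- and mypy also flags foo(['abc']) statically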

How to define pairs or tuples of acceptable types to generic protocol

Below is a rough example of building a dispatch interface for handling, e.g., messages coming in over the network. Each message contains a header with a MessageIdentifier, and the application is able to define various callbacks for the corresponding parsed ConcreteMessage types.
The crux of the dispatch is defining the associations between MessageIdentifiers and ConcreteMessages, defined in handle_message below. At runtime, this is type-safe because it's not possible for a callback to be called with anything other than its appropriate type. But I'm wondering if it's possible to constrain the allowed pairs of identifiers and concrete messages for mypy, ideally to show an error when a callback is called with the incorrect type.
from enum import Enum
from typing import Callable, Optional, TypeVar, cast
from typing_extensions import Literal, Protocol

class MessageType(Enum):
    FOO = 0
    BAR = 1
    BAZ = 2

class FooMessage: ...
class BarMessage: ...
class BazMessage: ...

ConcreteMessage = TypeVar("ConcreteMessage", FooMessage, BarMessage, BazMessage)
MessageIdentifier = TypeVar(
    "MessageIdentifier",
    Literal[MessageType.FOO],
    Literal[MessageType.BAR],
    Literal[MessageType.BAZ],
)

class MessageHandler(Protocol):
    def __getitem__(self, item: MessageIdentifier) -> Callable[[ConcreteMessage], Optional[bool]]: ...

def handle_foo(foo: FooMessage) -> Optional[bool]: ...
def handle_bar(bar: BarMessage) -> Optional[bool]: ...

handle_message = cast(MessageHandler, {
    MessageType.FOO: handle_foo,
    MessageType.BAR: handle_bar,
})
# MessageIdentifier constrains correctly
func = handle_message[5] # error: Value of type variable "MessageIdentifier" of "__getitem__" of "MessageHandler" cannot be "Literal[5]"
func = handle_message[MessageType.FOO] # good
# ConcreteMessage constrains correctly
func(FooMessage()) # good
func(5) # error: Value of type variable "ConcreteMessage" of function cannot be "int"
func(BarMessage()) # good - but would ideally be an error
Said another way, rather than the MessageHandler accepting the cross-product of ConcreteMessage, MessageIdentifier (as if Tuple[ConcreteMessage, MessageIdentifier]), can I explicitly define the pairs of types that are acceptable for the Protocol?
Bonus points if you can also determine why the cast to MessageHandler is necessary. When defined as follows:
handle_message: MessageHandler = {
    MessageType.FOO: handle_foo,
    MessageType.BAR: handle_bar,
}
mypy reports Dict entry 0 has incompatible type "Literal[MessageType.FOO]": "Callable[[FooMessage], Optional[bool]]"; expected "MessageIdentifier": "Callable[[ConcreteMessage], Optional[bool]]".
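For what it's worth, one way to pin each identifier to its own callback type, rather than accepting their cross-product, is to overload __getitem__ in the protocol. A sketch against the question's own definitions (the TypeVars are no longer needed, though the cast still is, since dict's __getitem__ returns the union of the value types):
from typing import Callable, Optional, cast, overload
from typing_extensions import Literal, Protocol

# Reuses MessageType, FooMessage, BarMessage, handle_foo and handle_bar from above.
class PairedMessageHandler(Protocol):
    @overload
    def __getitem__(self, item: Literal[MessageType.FOO]) -> Callable[[FooMessage], Optional[bool]]: ...
    @overload
    def __getitem__(self, item: Literal[MessageType.BAR]) -> Callable[[BarMessage], Optional[bool]]: ...
    def __getitem__(self, item): ...

handle_message = cast(PairedMessageHandler, {
    MessageType.FOO: handle_foo,
    MessageType.BAR: handle_bar,
})

func = handle_message[MessageType.FOO]
func(FooMessage())  # OK
func(BarMessage())  # error: incompatible type "BarMessage"; expected "FooMessage"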

Is there an "Addable" protocol or abstract base class in Python? If not how would one define it?

The typing module contains many protocols and abstract base classes that formally specify protocols which are informally described in the data model, so they can be used for type hints.
However I was unable to find such a protocol or abstract base class for objects that support __add__. Is there any formal specification of such a protocol? If not how would such an implementation look like?
Update:
Since I'm interested in such a class for the purpose of typing, such a class would only be useful if it's fully typed itself, like the examples in the typing module.
You could define one yourself using the abc module. The ABCMeta metaclass provided there allows you to define a __subclasshook__, in which you can check for methods such as __add__. If that method is defined for a given class, the class is then considered a subclass of this ABC.
from abc import ABC

class Addable(ABC):
    @classmethod
    def __subclasshook__(cls, C):
        if cls is Addable:
            if any("__add__" in B.__dict__ for B in C.__mro__):
                return True
        return NotImplemented

class Adder():
    def __init__(self, x):
        self.x = x

    def __add__(self, x):
        return x + self.x

inst = Adder(5)
# >>> isinstance(inst, Addable)
# True
# >>> issubclass(Adder, Addable)
# True
To my knowledge there is no pre-defined Addable protocol. You can define one yourself:
from typing import Protocol, TypeVar

T = TypeVar("T")

class Addable(Protocol):
    def __add__(self: T, other: T) -> T: ...
This protocol requires that both summands and the result share a common ancestor type.
The protocol can then be used as follows:
Tadd = TypeVar("Tadd", bound=Addable)

def my_sum(*args: Tadd, acc: Tadd) -> Tadd:
    res = acc
    for value in args:
        res += value
    return res
my_sum("a", "b", "c", acc="") # correct, returns string "abc"
my_sum(1, 2, 6, acc=0) # correct, returns int 9
my_sum(1, 2.0, 6, acc=0) # correct, returns float 9.0
my_sum(True, False, False, acc=False) # correct, returns bool 1 (mypy reveal_type returns bool, running it in python leads to result 1)
my_sum(True, False, 1, acc=1.0) # incorrect IMHO, but not detected by mypy, returns float 3.0
my_sum(1, 2, 6, acc="") # incorrect, detected by mypy
my_sum(1, 2, "6", acc=0) # incorrect, detected by mypy
Such a protocol does exist. Just not in the code that's usually executed...
There is a special package called typeshed that type checkers use to add type hints to code that ships without them... The typeshed package doesn't exist at runtime, though.
So you can use _typeshed.SupportsAdd during type checking - but to check during execution you'd need to test for the __add__ method dynamically, or implement the SupportsAdd protocol yourself...
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from _typeshed import SupportsAdd

# SupportsAdd must always be written in quotation marks when used later...
def require_adding_str_results_in_int(a_obj: "SupportsAdd[str, int]"):
    # the type checker guarantees that this should work now
    assert type(a_obj + "a str") == int
    # to check during runtime though you'd probably use
    assert hasattr(a_obj, "__add__")
Hope that helped...
Most likely the best way is to implement your own protocol and use the @runtime_checkable decorator. You can look at the source code of typeshed for inspiration.
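A minimal sketch of that suggestion (note that isinstance against a runtime-checkable protocol only checks that the method exists, not its signature):
from typing import Protocol, runtime_checkable

@runtime_checkable
class SupportsAdd(Protocol):
    def __add__(self, other): ...

assert isinstance(3, SupportsAdd)             # int defines __add__
assert not isinstance(object(), SupportsAdd)  # plain object does not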

Python: Naming types and assigning variables outside a function from within a function

I'd like to be able to easily create new types, plus (optionally) add some information about them (say some docs, and a set of variable names they often come under).
The straightforward way to do this would be:
from typing import Any, NewType, Union, List, Iterable, Optional

Key = NewType('Key', Any)
Key._aka = set(['key', 'k'])

Val = NewType('Val', Union[int, float, List[Union[int, float]]])
Val.__doc__ = "A number or list of numbers."
But there are two reasons I don't like this:
I have to copy-paste the name of the new type I'm making three times (not D.R.Y. and prone to mistakes)
I don't like to "externalize" the assignment of optional additional information (_aka and __doc__)
So I came up with this:
from typing import Any, NewType, Union, List, Iterable, Optional

def new_type(name, tp, doc: Optional[str] = None, aka: Optional[Iterable] = None):
    """Make a new type with (optional) doc and (optional) aka, a set of var names it often appears as"""
    new_tp = NewType(name, tp)
    if doc is not None:
        setattr(new_tp, '__doc__', doc)
    if aka is not None:
        setattr(new_tp, '_aka', set(aka))
    globals()[name] = new_tp  # is this dangerous? Does scope need to be considered more carefully?
which then gives me the interface I'd like:
new_type('Key', Any, aka=['key', 'k'])
new_type('Val', Union[int, float, List[Union[int, float]]], doc="A number or list of numbers.")
But I'm not sure of that globals()[name] = new_tp thing. It seems it would be benign if I'm defining my types at the top level of a module, but I'm not sure how it would fare in some edge-case nested-scope situation.
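One concrete wrinkle with that globals() call, sketched with a hypothetical typedefs module: globals() inside new_type always refers to the module where new_type is defined, never to the caller's module.
# typedefs.py -- the (hypothetical) module that defines new_type from above

# app.py
import typedefs

typedefs.new_type('Key', int)

print(typedefs.Key)  # the new name landed in typedefs' namespace...
# print(Key)         # ...while this line would raise NameError here in app.py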
The normal way you create a new type is to just write a new class:
class Key:
    def __init__(self, key: object) -> None:
        self._key = key
        self._aka = set(['key', 'k'])

class SqlString(str):
    """A custom string which has been verified to be valid SQL."""
Note that this approach avoids the DRY and scoping concerns that you had.
You use NewType only when you don't want to add any extra attributes or a docstring -- doing Foo = NewType('Foo', X) is basically like doing class Foo(X): pass, except with slightly less runtime overhead.
(More precisely, type checkers will treat Foo = NewType('Foo', X) as if it were written like class Foo(X): pass, but at runtime what actually happens is Foo = lambda x: x -- Foo is just the identity function.)
We do run into a complication with your second Union example, however, since Unions are not subclassable (which makes that NewType illegal, as per PEP 484).
Instead, I would personally just do the following:
# A number or list of numbers
Val = Union[int, float, List[Union[int, float]]]
IMO since types are supposed to be invisible at runtime, I think it makes sense to just not bother attaching runtime-available documentation.

Class that acts as mapping for **unpacking

Without subclassing dict, what would a class need to be considered a mapping, so that it can be passed to a method with **?
from abc import ABCMeta

class uobj:
    __metaclass__ = ABCMeta

uobj.register(dict)

def f(**k): return k

o = uobj()
f(**o)
# outputs: f() argument after ** must be a mapping, not uobj
At least to the point where it throws errors about missing mapping functionality, so I can begin implementing.
I reviewed emulating container types, but simply defining the magic methods has no effect, and using ABCMeta to register it as a dict passes issubclass assertions but fails isinstance(o, dict). Ideally, I don't even want to use ABCMeta.
The __getitem__() and keys() methods will suffice:
>>> class D:
...     def keys(self):
...         return ['a', 'b']
...     def __getitem__(self, key):
...         return key.upper()
...
>>> def f(**kwds):
...     print(kwds)
...
>>> f(**D())
{'a': 'A', 'b': 'B'}
If you're trying to create a Mapping — not just satisfy the requirements for passing to a function — then you really should inherit from collections.abc.Mapping. As described in the documentation, you need to implement just:
__getitem__
__len__
__iter__
The Mixin will implement everything else for you: __contains__, keys, items, values, get, __eq__, and __ne__.
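A minimal sketch of that approach, assuming a hypothetical UpperMapping class that mirrors the D example above:
from collections.abc import Mapping

class UpperMapping(Mapping):
    """A concrete Mapping: only the three abstract methods are defined."""
    def __init__(self, *keys):
        self._keys = list(keys)

    def __getitem__(self, key):
        if key in self._keys:
            return key.upper()
        raise KeyError(key)

    def __len__(self):
        return len(self._keys)

    def __iter__(self):
        return iter(self._keys)

def f(**kwds):
    return kwds

f(**UpperMapping('a', 'b'))  # {'a': 'A', 'b': 'B'}
'a' in UpperMapping('a')     # True -- __contains__ comes from the mixin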
The answer can be found by digging through the source.
When attempting to use a non-mapping object with **, the following error is given:
TypeError: 'Foo' object is not a mapping
If we search CPython's source for that error, we can find the code that causes that error to be raised:
case TARGET(DICT_UPDATE): {
    PyObject *update = POP();
    PyObject *dict = PEEK(oparg);
    if (PyDict_Update(dict, update) < 0) {
        if (_PyErr_ExceptionMatches(tstate, PyExc_AttributeError)) {
            _PyErr_Format(tstate, PyExc_TypeError,
                          "'%.200s' object is not a mapping",
                          Py_TYPE(update)->tp_name);
PyDict_Update is actually dict_merge, and the error is thrown when dict_merge returns a negative number. If we check the source for dict_merge, we can see what leads to -1 being returned:
/* We accept for the argument either a concrete dictionary object,
 * or an abstract "mapping" object.  For the former, we can do
 * things quite efficiently.  For the latter, we only require that
 * PyMapping_Keys() and PyObject_GetItem() be supported.
 */
if (a == NULL || !PyDict_Check(a) || b == NULL) {
    PyErr_BadInternalCall();
    return -1;
The key part being:
For the latter, we only require that PyMapping_Keys() and PyObject_GetItem() be supported.
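To see that requirement from the failure side, a quick sketch of what happens when keys() is missing (class name hypothetical):
# __getitem__ alone is not enough -- without keys(), the AttributeError
# raised inside PyDict_Update is converted to the TypeError seen above
class HasGetItemOnly:
    def __getitem__(self, key):
        return key.upper()

def f(**kwds):
    return kwds

f(**HasGetItemOnly())  # TypeError: 'HasGetItemOnly' object is not a mapping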
