In a perfect world, I could just do this:
ScoreBaseType = Union[bool, int, float]
ScoreComplexType = Union[ScoreBaseType, Dict[str, ScoreBaseType]]
But, that says a ScoreComplexType is either a ScoreBaseType or a dictionary which allows multiple types of values... not what I want.
The following looks like it should work to me, but it doesn't:
ScoreBaseTypeList = [bool, int, float]
ScoreBaseType = Union[*ScoreBaseTypeList] # pycharm says "can't use starred expression here"
ScoreDictType = reduce(lambda lhs,rhs: Union[lhs, rhs], map(lambda x: Dict[str, x], ScoreBaseTypeList))
ScoreComplexType = Union[ScoreBaseType, ScoreDictType]
Is there any way I can do something like the above without having to go through this tedium?
ScoreComplexType = Union[bool, int, float,
Dict[str, bool],
Dict[str, int],
Dict[str, float]]
Edit: More fleshed out desired usage example:
# these strings are completely arbitrary and determined at runtime. Used as keys in nested dictionaries.
CatalogStr = NewType('CatalogStr', str)
DatasetStr = NewType('DatasetStr', str)
ScoreTypeStr = NewType('ScoreTypeStr', str)
ScoreBaseType = Union[bool, int, float]
ScoreDictType = Dict[ScoreTypeStr, 'ScoreBaseTypeVar']
ScoreComplexType = Union['ScoreBaseTypeVar', ScoreDictType]
ScoreBaseTypeVar = TypeVar('ScoreBaseTypeVar', bound=ScoreBaseType)
ScoreComplexTypeVar = TypeVar('ScoreComplexTypeVar', bound=ScoreComplexType) # errors: "constraints cannot be parameterized by type variables"
class EvalBase(ABC, Generic[ScoreComplexTypeVar]):
    def __init__(self) -> None:
        self.scores: Dict[CatalogStr,
                          Dict[DatasetStr,
                               ScoreComplexTypeVar]
                          ] = {}

class EvalExample(EvalBase[Dict[float]]):  # can't do this either
    ...
Edit 2:
It occurs to me that I could simplify a LOT of my type hinting if I used tuples instead of nested dictionaries. This seems to maybe work? I've only tried it in the below toy example and haven't yet tried adapting all my code.
# These are used to make typing hints easier to understand
CatalogStr = NewType('CatalogStr', str) # A str corresponding to the name of a catalog
DatasetStr = NewType('DatasetStr', str) # A str corresponding to the name of a dataset
ScoreTypeStr = NewType('ScoreTypeStr', str) # A str corresponding to the label for a ScoreType
ScoreBaseType = Union[bool, int, float]
SimpleScoreDictKey = Tuple[CatalogStr, DatasetStr]
ComplexScoreDictKey = Tuple[CatalogStr, DatasetStr, ScoreTypeStr]
ScoreKey = Union[SimpleScoreDictKey, ComplexScoreDictKey]
ScoreKeyTypeVar = TypeVar('ScoreKeyTypeVar', bound=ScoreKey)
ScoreDictType = Dict[ScoreKey, ScoreBaseType]
# These are used for Generics in classes
DatasetTypeVar = TypeVar('DatasetTypeVar', bound='Dataset') # Must match a type inherited from Dataset
ScoreBaseTypeVar = TypeVar('ScoreBaseTypeVar', bound=ScoreBaseType)
class EvalBase(ABC, Generic[ScoreBaseTypeVar, ScoreKeyTypeVar]):
    def __init__(self):
        self.score: ScoreDictType = {}

class EvalExample(EvalBase[float, ComplexScoreDictKey]):
    ...
Although then what would the equivalent of this be? Seems like I might have to store a couple lists of keys in order to iterate?
for catalog_name in self.catalog_list:
    for dataset_name in self.scores[catalog_name]:
        for score in self.scores[catalog_name][dataset_name]:
            ...
You may need to use TypeVars to express this, but without an example of how you intend to use it, it's hard to say.
An example of how this would be used for typing a return value dependent on input:
ScoreBaseType = Union[bool, int, float]
ScoreTypeVar = TypeVar('ScoreTypeVar', bound=ScoreBaseType)
ScoreDictType = Union[ScoreTypeVar, Dict[str, ScoreTypeVar]]
def scoring_func(scores: Iterable[ScoreTypeVar]) -> ScoreDictType[ScoreTypeVar]:
    ...
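As a minimal runnable sketch of the TypeVar approach above: the alias name ScoreOrDict, the function summarize, and its min/max behaviour are illustrative assumptions, not part of the original answer. The point is only that the same type variable ties the input element type to the output.

```python
from typing import Dict, Iterable, TypeVar, Union

ScoreBaseType = Union[bool, int, float]
ScoreTypeVar = TypeVar("ScoreTypeVar", bound=ScoreBaseType)

# Generic alias: subscripting it with X yields Union[X, Dict[str, X]].
ScoreOrDict = Union[ScoreTypeVar, Dict[str, ScoreTypeVar]]


def summarize(scores: Iterable[ScoreTypeVar]) -> ScoreOrDict[ScoreTypeVar]:
    """Hypothetical scorer: return a lone score as-is, else a min/max dict."""
    values = list(scores)
    if len(values) == 1:
        return values[0]
    return {"min": min(values), "max": max(values)}


single = summarize([0.5])       # a bare float
several = summarize([1, 4, 2])  # a dict of ints
```

With this shape, a checker should infer the float form for the first call and the int form for the second, rather than a mixed dictionary.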
If you're not doing this based on input values though, you probably want
ScoreBaseType = Union[bool, int, float]
ScoreDictTypes = Union[Dict[str, bool], Dict[str, int], Dict[str, float]]
ScoreComplexType = Union[ScoreBaseType, ScoreDictTypes]
Depending on how you are handling the types, you may also be able to use SupportsInt or SupportsFloat types rather than both int and float
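A short sketch of how code consuming the explicit union above might look. The helper name total is hypothetical; the isinstance check is what lets a type checker tell the dictionary form apart from the bare score.

```python
from typing import Dict, Union

ScoreBaseType = Union[bool, int, float]
ScoreDictTypes = Union[Dict[str, bool], Dict[str, int], Dict[str, float]]
ScoreComplexType = Union[ScoreBaseType, ScoreDictTypes]


def total(score: ScoreComplexType) -> float:
    """Hypothetical helper: collapse either form of score into one number."""
    if isinstance(score, dict):
        # Dictionary form: add up the labelled scores.
        return float(sum(score.values()))
    # Bare bool, int, or float.
    return float(score)
```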
Edit: (Additional Info Based on the edited OP below)
Since you are typing an ABC with this, it may be sufficient to type the base class using Dict[str, Any] and constrain subclasses further.
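One way this could look, as a sketch only: the class FloatEval and the evaluate method are made-up names, and whether a given checker accepts the narrowed attribute annotation in the subclass is an assumption worth verifying against your own tooling.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class EvalBase(ABC):
    def __init__(self) -> None:
        # Loosely typed in the abstract base class.
        self.scores: Dict[str, Any] = {}

    @abstractmethod
    def evaluate(self, name: str) -> None:
        ...


class FloatEval(EvalBase):
    def __init__(self) -> None:
        super().__init__()
        # The subclass narrows the value type (hypothetical example).
        self.scores: Dict[str, float] = {}

    def evaluate(self, name: str) -> None:
        self.scores[name] = 0.5
```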
If it isn't, you are going to have very verbose type definitions, and there isn't much alternative, as mypy currently has some issues resolving some classes of programmatically generated types, even when operating on constants.
mypy also doesn't have support for recursive type aliases at this time (though there is a potential of support for them being added, it's not currently planned), so for readability, you'd need to define the allowed types for each potential level of nesting, and then collect those into a type representing the full nested structure.
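A sketch of the "one alias per nesting level" idea, built from the inside out. The alias names DatasetScores and CatalogScores and the sample data are assumptions chosen to mirror the question's catalog/dataset structure.

```python
from typing import Dict, Union

ScoreBaseType = Union[bool, int, float]
ScoreDictTypes = Union[Dict[str, bool], Dict[str, int], Dict[str, float]]

# Innermost level: a bare score or a homogeneous dict of labelled scores.
ScoreComplexType = Union[ScoreBaseType, ScoreDictTypes]
# Next level out: dataset name -> score(s).
DatasetScores = Dict[str, ScoreComplexType]
# Outermost level: catalog name -> datasets.
CatalogScores = Dict[str, DatasetScores]

scores: CatalogScores = {
    "catalog_a": {
        "dataset_1": 0.9,
        "dataset_2": {"precision": 0.8, "recall": 1.0},
    },
}
```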
Related
In Python, we can create aliases or custom types like following:
MyKindOfFunction = Callable[[int, int], int] # alias
AlternativeType = NewType('AlternativeType', Callable[[int, int], int]) # new type
Consider the following function:
def myfunc(a: int, b: int) -> int:
    return a + b
Obviously, the signature matches the above definitions. Is there a more convenient way of notating this than annotating the parameters manually? I am unhappy with the way it is annotated for the following reasons:
It does not check against AlternativeType, as AlternativeType is a subclass of Callable[[int, int], int].
I am using many higher-level functions that have equivalent signatures but are logically of different types, and I would like to be able to differentiate between them.
I have a generic lookup function, that mostly returns TypeA, but sometimes can return TypeB:
Types = Union[TypeA,TypeB]
def get_hashed_value(
    key: str, table: Dict[str, Types]
) -> Types:
    return table.get(key)
and I use it in two less-generic functions:
def get_valueA(key: str) -> TypeA:
return get_hashed_value(key, A_dict) # A_dict: Dict[str, TypeA]
and
def get_valueB(key: str) -> TypeB:
return get_hashed_value(key, B_dict) # B_dict: Dict[str, TypeB]
what is the best way to handle typing on this?
since get_hashed_value can return either TypeA or TypeB, the return statement in the get_* functions throws a typing exception (during my linting)
there’s more logic in these methods, and I need the separate get_* functions, so I can’t just collapse all the usages
it would be really nice to have explicit return types on the get_* functions
it feels like a bad practice to duplicate get_hashed_value, just to get around the typing issue
it feels bad to just ignore typing every time get_hashed_value is called
Thanks for your help! Also I am sure this has been asked before, but I had trouble finding the answer. :\
Interestingly, this doesn't produce a type warning for me (in PyCharm). I'm not sure why it isn't warning on what's comparable to a "downcast", but PyCharm isn't perfect.
Regardless, this seems like a job that's more suited for a TypeVar than a Union:
from typing import TypeVar, Dict

T = TypeVar("T", TypeA, TypeB)  # A generic type that can only be a TypeA or TypeB

# And the T stays consistent from the input to the output
def get_hashed_value(key: str, table: Dict[str, T]) -> T:
    return table.get(key)

# Which means that if you feed it a Dict[str, TypeA], it will expect a TypeA return
def get_valueA(key: str) -> TypeA:
    return get_hashed_value(key, A_dict)

# And if you feed it a Dict[str, TypeB], it will expect a TypeB return
def get_valueB(key: str) -> TypeB:
    return get_hashed_value(key, B_dict)
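For reference, a self-contained version of the same idea that can be run and type-checked. TypeA, TypeB, and the two dictionaries are empty stand-ins invented for this sketch. It also indexes with table[key] instead of table.get(key), since get is typed as returning an optional value and a strict checker may object to returning that as T.

```python
from typing import Dict, TypeVar


class TypeA:  # hypothetical stand-in for the real TypeA
    ...


class TypeB:  # hypothetical stand-in for the real TypeB
    ...


T = TypeVar("T", TypeA, TypeB)


def get_hashed_value(key: str, table: Dict[str, T]) -> T:
    # Indexing returns T; a missing key raises KeyError instead of returning None.
    return table[key]


A_dict: Dict[str, TypeA] = {"a": TypeA()}
B_dict: Dict[str, TypeB] = {"b": TypeB()}


def get_valueA(key: str) -> TypeA:
    return get_hashed_value(key, A_dict)


def get_valueB(key: str) -> TypeB:
    return get_hashed_value(key, B_dict)
```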
I'd like to be able to easily create new types, plus (optionally) add some information about them (say some docs, and a set of variable names they often come under).
The straightforward way to do this would be:
from typing import Any, NewType, Union, List, Iterable, Optional
Key = NewType('Key', Any)
Key._aka = set(['key', 'k'])
Val = NewType('Val', Union[int, float, List[Union[int, float]]])
Val.__doc__ = "A number or list of numbers."
But there's two reasons I don't like this:
I have to copy paste the name of the new type I'm making three times (not D.R.Y. and prone to mistakes)
I don't like to "externalize" the assignment of optional additional information (_aka and __doc__)
So I came up with this:
from typing import Any, NewType, Union, List, Iterable, Optional
def new_type(name, tp, doc: Optional[str]=None, aka: Optional[Iterable]=None):
    """Make a new type with (optional) doc and (optional) aka, set of var names it often appears as"""
    new_tp = NewType(name, tp)
    if doc is not None:
        setattr(new_tp, '__doc__', doc)
    if aka is not None:
        setattr(new_tp, '_aka', set(aka))
    globals()[name] = new_tp  # is this dangerous? Does scope need to be considered more carefully?
which then gives me the interface I'd like:
new_type('Key', Any, aka=['key', 'k'])
new_type('Val', Union[int, float, List[Union[int, float]]], doc="A number or list of numbers.")
But I'm not sure about that globals()[name] = new_tp thing. It seems it would be benign if I'm defining my types at the top level of a module, but I'm not sure how it would fare in some edge-case nested-scope situation.
The normal way you create a new type is to just write a new class:
class Key:
    def __init__(self, key: object) -> None:
        self._key = key
        self._aka = set(['key', 'k'])

class SqlString(str):
    """A custom string which has been verified to be valid SQL."""
Note that this approach avoids the DRY and scoping concerns that you had.
You use NewType only for when you don't want to add any extra attributes or a docstring -- doing Foo = NewType('Foo', X) is basically like doing class Foo(X): pass except with slightly less runtime overhead.
(More precisely, type checkers will treat Foo = NewType('Foo', X) as if it were written like class Foo(X): pass, but at runtime what actually happens is Foo = lambda x: x -- Foo is the identity function.)
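A quick way to see the runtime side of this for yourself. UserId is a made-up name for the sketch.

```python
from typing import NewType

UserId = NewType("UserId", int)  # hypothetical example type

uid = UserId(42)

# Calling the NewType just hands back its argument: no wrapper object exists.
is_plain_int = type(uid) is int
```

To a type checker, uid is a UserId and cannot be passed where some other NewType over int is expected. At runtime it is simply the int 42.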
We do run into a complication with your second Union example, however, since Unions are not subclassable (which makes that NewType illegal, as per PEP 484).
Instead, I would personally just do the following:
# A number or list of numbers
Val = Union[int, float, List[Union[int, float]]]
IMO since types are supposed to be invisible at runtime, I think it makes sense to just not bother attaching runtime-available documentation.
I am trying to wrap my head around generic type hints. Reading over this section in PEP 483, I got the impression that in
SENSOR_TYPE = TypeVar("SENSOR_TYPE")
EXP_A = Tuple[SENSOR_TYPE, float]
class EXP_B(Tuple[SENSOR_TYPE, float]):
    ...
EXP_A and EXP_B should identify the same type. In PyCharm #PC-181.4203.547, however, only EXP_B works as expected. Upon investigation, I noticed that EXP_B features a __dict__ member while EXP_A doesn't.
That got me to wonder, are both kinds of type definition actually meant to be synonymous?
Edit: My initial goal was to design a generic class EXP of 2-tuples where the second element is always a float and the first element type is variable. I want to use instances of this generic class as follows
from typing import TypeVar, Tuple, Generic
T = TypeVar("T")
class EXP_A(Tuple[T, float]):
    ...

EXP_B = Tuple[T, float]
V = TypeVar("V")

class MyClass(Generic[V]):
    def get_value_a(self, t: EXP_A[V]) -> V:
        return t[0]

    def get_value_b(self, t: EXP_B[V]) -> V:
        return t[0]

class StrClass(MyClass[str]):
    pass
instance = "a", .5
sc = StrClass()
a: str = sc.get_value_a(instance)
b: str = sc.get_value_b(instance)
(The section on user defined generic types in PEP 484 describes this definition of EXP as equivalent to EXP_B in my original code example.)
The problem is that PyCharm complains about the type of instance as a parameter:
"Expected type EXP (matched generic type EXP[V]), got Tuple[str, float] instead." With EXP = Tuple[T, float] instead, it says: "Expected type 'Tuple[Any]' (matched generic type Tuple[V]), got Tuple[str, float] instead."
I followed @Michael0c2a's advice, headed over to the python typing gitter chat, and asked the question there. The answer was that the example is correct.
From this, I conclude that
EXP_A and EXP_B are indeed defining the same kind of types
PyCharm as of build #PC-182.4323.49 just doesn't deal with generic type annotations very well.
Consider the following code sample:
from typing import Dict, Union
def count_chars(string) -> Dict[str, Union[str, bool, int]]:
    result = {}  # type: Dict[str, Union[str, bool, int]]
    if isinstance(string, str) is False:
        result["success"] = False
        result["message"] = "Invalid argument"
    else:
        result["success"] = True
        result["result"] = len(string)
    return result

def get_square(integer: int) -> int:
    return integer * integer

def validate_str(string: str) -> bool:
    check_count = count_chars(string)
    if check_count["success"] is False:
        print(check_count["message"])
        return False
    str_len_square = get_square(check_count["result"])
    return bool(str_len_square > 42)

result = validate_str("Lorem ipsum")
When running mypy against this code, the following error is returned:
error: Argument 1 to "get_square" has incompatible type "Union[str, bool, int]"; expected "int"
and I'm not sure how I could avoid this error without using Dict[str, Any] as the return type of the first function or installing the 'TypedDict' mypy extension. Is mypy actually 'right' and my code isn't type safe, or should this be considered a mypy bug?
Mypy is correct here -- if the values in your dict can be strs, ints, or bools, then strictly speaking we can't assume check_count["result"] will always evaluate to exactly an int.
You have a few ways of resolving this. The first way is to actually just check the type of check_count["result"] to see if it's an int. You can do this using an assert:
assert isinstance(check_count["result"], int)
str_len_square = get_square(check_count["result"])
...or perhaps an if statement:
if isinstance(check_count["result"], int):
    str_len_square = get_square(check_count["result"])
else:
    # Throw some kind of exception here?
    raise TypeError("result is not an int")
Mypy understands type checks of this form in asserts and if statements (to a limited extent).
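A self-contained sketch of the narrowing approach. The function name square_of_result and the alias Value are invented here. Binding the looked-up value to a local variable first is a cautious choice on my part, since narrowing a plain variable is the most widely supported case.

```python
from typing import Dict, Union

Value = Union[str, bool, int]


def get_square(integer: int) -> int:
    return integer * integer


def square_of_result(check_count: Dict[str, Value]) -> int:
    result = check_count["result"]
    # bool is a subclass of int, so it is excluded explicitly here.
    if isinstance(result, int) and not isinstance(result, bool):
        return get_square(result)  # result is known to be an int on this branch
    raise TypeError("expected an int under 'result'")
```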
However, it can get tedious scattering these checks throughout your code. So, it might be best to actually just give up on using dicts and switch to using classes.
That is, define a class:
class Result:
    def __init__(self, success: bool, message: str) -> None:
        self.success = success
        self.message = message
...and return an instance of that instead.
This is slightly more inconvenient in that if your goal is to ultimately return/manipulate json, you now need to write code to convert this class from/to json, but it does let you avoid type-related errors.
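The conversion code does not have to be large. A minimal sketch, assuming the two fields shown above are all that needs to round-trip; the method names to_json and from_json are my own choice.

```python
import json


class Result:
    def __init__(self, success: bool, message: str) -> None:
        self.success = success
        self.message = message

    def to_json(self) -> str:
        # Serialise only the fields we know about.
        return json.dumps({"success": self.success, "message": self.message})

    @classmethod
    def from_json(cls, text: str) -> "Result":
        data = json.loads(text)
        return cls(bool(data["success"]), str(data["message"]))


encoded = Result(False, "Invalid argument").to_json()
decoded = Result.from_json(encoded)
```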
Defining a custom class can get slightly tedious, so you can try using the NamedTuple type instead:
from typing import NamedTuple
Result = NamedTuple('Result', [('success', bool), ('message', str)])
# Use Result as a regular class
You still need to write the tuple -> json code, and iirc namedtuples (both the regular version from the collections module and this typed variant) are less performant than classes, but perhaps that doesn't matter for your use case.
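On Python 3.6 and later, the same NamedTuple can also be written with class syntax, and the built-in _asdict() method covers most of the tuple -> json step. A small sketch with made-up field values:

```python
import json
from typing import NamedTuple


class Result(NamedTuple):
    success: bool
    message: str


r = Result(success=True, message="ok")

# _asdict() maps field names to values, which json.dumps can serialise directly.
payload = json.dumps(r._asdict())
```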