Python type hints: TypeVar dependency

I have defined a generic function using TypeVar to describe a process that has a common structure.
I want to get a value from data keyed by a Literal string, like the following function:
from typing import Literal, TypeVar

data: dict[Literal["cat", "dog"], int] = {
    "cat": 0,
    "dog": 1,
}

Key = TypeVar("Key")
Value = TypeVar("Value")

def get_value(key: Key, data: dict[Key, Value]) -> Value:
    return data[key]  # actually more complex logic

cat = get_value("cat", data)    # Correct! data has "cat"
bird = get_value("bird", data)  # Wrong! data doesn't have "bird", but it passes the type check :(
However, the type annotation seems to be interpreted as Callable[[str, dict[str, int]], int].
I want the type annotation to behave like Callable[[Literal["cat", "dog"], dict[Literal["cat", "dog"], int]], int].
So I tried to declare a dependency between TypeVars like this, but it is an error:
Key = TypeVar("Key")
Value = TypeVar("Value")
Label = TypeVar("Label", Key)  # error: TypeVar bound type cannot be generic

def get_value(label: Label, data: dict[Key, Value]) -> Value:
    return data[label]
Question
How can I declare a dependency between TypeVars, or is there another approach to get this type annotation?

Related

Python list.sort(key=lambda x: ...) type hints

I am sorting a list of dicts based on a key, like below:
def my_function() -> list[dict]:
    data: list[dict] = []
    # Populate data ...
    if condition:
        data.sort(key=lambda x: x["position"])
    return data
However, mypy complains about Returning Any from function declared to return "Union[SupportsDunderLT[Any], SupportsDunderGT[Any]]". Is it possible to update the above snippet so that mypy doesn't raise a no-any-return error?
EDIT
Versions: Python 3.10.9 and mypy 1.0.0 (compiled: yes)
Hope this code is helpful:
from typing import List, Dict

def my_function() -> List[Dict[str, int]]:
    data: List[Dict[str, int]] = []
    # Populate data ...
    if condition:
        data.sort(key=lambda x: x["position"])
    return data
The answer by @SisodiaMonu should work. However, it seems that your example uses the dict more like a JS object, where every key has semantic meaning. For such cases there is typing.TypedDict, which allows you to annotate each dict key with its own type. This matters if your dict can contain objects of other types: if it's {'position': 1, 'key': 'foo'}, the type would have been dict[str, int | str], and mypy would point out an invalid comparison (int | str is not comparable). With TypedDict, this problem doesn't arise:
from typing import TypedDict

class MyItem(TypedDict):
    position: int
    key: str

condition = True

def my_function() -> list[MyItem]:
    data: list[MyItem] = []
    # Populate data ...
    if condition:
        data.sort(key=lambda x: x["position"])
    return data
You can try this solution in playground.

Is it possible to automatically convert a Union type to only one type with pydantic?

Given the following data model:
from typing import List, Union
from pydantic import BaseModel

class Demo(BaseModel):
    id: Union[int, str]
    files: Union[str, List[str]]
Is there a way to tell pydantic to always convert id to str and files to List[str] automatically when I access them, instead of doing it manually every time?
Pydantic has validation logic built in for most of the common types out there. This includes str. It just so happens that the default string validator simply coerces values of type int, float or Decimal to str by default. (see the str_validator source)
This means that even if you annotate id as str but pass an int value, the model will initialize properly without a validation error, and the id value will be the str version of that value. (e.g. str(42) gives "42")
list also has a default validator built in, but in this case it may not be what you want. If it encounters a non-list value but sees that it is a sequence (or a generator), it again coerces it to a list. (see the list_validator source) In this case, since the value you might pass will be a str, and a str is a sequence, the outcome would be a list of single-character strings from the initial string. (e.g. list("abc") gives ["a", "b", "c"])
So for list[str] you will likely need your own custom pre=True validator to perform whatever you deem necessary with the str value to turn it into a list[str].
Example:
from pydantic import BaseModel, validator

class Demo(BaseModel):
    id: str
    files: list[str]

    @validator("files", pre=True)
    def str_to_list_of_str(cls, v: object) -> object:
        if isinstance(v, str):
            return v.split(",")
        return v

if __name__ == "__main__":
    obj = Demo.parse_obj({"id": 42, "files": "foo,bar,baz"})
    print(obj)
    print(type(obj.id), type(obj.files))
Output:
id='42' files=['foo', 'bar', 'baz']
<class 'str'> <class 'list'>
As you can see, you don't even need any additional logic for the id field: if your values are int, they end up as str on the model instance.
I figured out how to make this work after getting help from the maintainer. The key point is to remove Union from the type definition and use a pre-processing hook to convert the value before validation. Here is the sample code:
from typing import List
from pydantic import BaseModel, validator

class Demo(BaseModel):
    id: str
    files: List[str]

    @validator('id', pre=True)
    def id_must_be_str(cls, v):
        if isinstance(v, int):
            v = str(v)
        return v

    @validator('files', pre=True)
    def files_must_be_list_of_str(cls, v):
        if isinstance(v, str):
            v = [v]
        return v

obj = Demo.parse_obj({'id': 1, 'files': '/data/1.txt'})
print(type(obj.id))
print(type(obj.files))

Dynamically Setting the Output Type of a Python Function

Okay, I have a set of dictionaries, each containing a key/ID and a function. These dictionaries are used as identifiers that point to a specific cache key and to the function that updates that cache key's associated data.
from typing import Callable, List, TypedDict

class CacheIdentifier(TypedDict):
    key: str
    func: Callable

def function_to_fetch_cache_data() -> List[str]:
    return ["a", "b", "c"]

IDENTIFIER: CacheIdentifier = {
    "key": "my_cache_key",
    "func": function_to_fetch_cache_data,
}
Now, the thing is, I have a function called load_or_set_cache which takes an identifier object and, like the name says, checks whether there is data associated with a cache key; it fetches that data if it exists, otherwise it uses the func member of the given CacheIdentifier to fetch new data.
def load_or_set_cache(identifier: CacheIdentifier) -> Any:
    # Logic to check if the data exists in the cache or not
    if not cached:
        cache_data = identifier["func"]()
        cache_key = identifier["key"]
        cache.set(cache_key, cache_data, TTL=3600)
    else:
        cache_key = identifier["key"]
        cache_data = cache.get(cache_key)
    return cache_data

cache_data = load_or_set_cache(IDENTIFIER)
The thing is, load_or_set_cache returns the data that was fetched and stored in the cache, and as you can expect, the type of that data varies with the return type of each identifier's function. In my example above, if function_to_fetch_cache_data has a return type of List[str], then load_or_set_cache should have the same return type, causing cache_data to have a List[str] type.
Currently the output type of the load_or_set_cache function is just set to Any, is there any way I could dynamically change the output type of the function depending on the output type of the associated func argument found in each cache identifier?
I've tried playing with TypeVars but don't feel like they really suit what I want to do.
I think Pedro has the right idea. Making your CacheIdentifier generic in terms of the func return type works, but not with a TypedDict:
from collections.abc import Callable
from typing import Generic, TypeVar

T = TypeVar("T")

class CacheIdentifier(Generic[T]):
    key: str
    func: Callable[..., T]

    def __init__(self, key: str, func: Callable[..., T]) -> None:
        self.key = key
        self.func = func

def load_or_set_cache(identifier: CacheIdentifier[T]) -> T:
    return identifier.func()

def main() -> None:
    def function_to_fetch_cache_data() -> list[str]:
        return ["a", "b", "c"]

    ident = CacheIdentifier("my_cache_key", func=function_to_fetch_cache_data)
    cache_data = load_or_set_cache(ident)
    reveal_type(cache_data)
Mypy output:
note: Revealed type is "builtins.list[builtins.str]"
Python currently does not support defining generic TypedDict subclasses.
I think your question was confusing some people, not least because you were talking about "dynamically" setting the output type, yet expecting it to be understood by a static type checker. That makes no sense: nothing about this is dynamic, and the types are all known before the program runs.
I recommend you read through this section of PEP 484 (but really the entire thing).

Type hinting function with two use cases

I am trying to fully type hint a function that ensures that an element is in a given dictionary, and then checks that the element type is what the user expects it to be. My initial implementation works correctly, and is shown below
from typing import Any, Dict, Type, TypeVar

T = TypeVar("T")

def check_and_validate_element_in_dict(
    element_name: str, dictionary: Dict[str, Any], element_type: Type[T]
) -> T:
    assert element_name in dictionary
    element = dictionary[element_name]
    assert isinstance(element, element_type)
    return element
it allows me to replace this
assert "key1" in _dict
key1 = _dict["key1"]
assert isinstance(key1, type1)
assert "key2" in _dict
key2 = _dict["key2"]
assert isinstance(key2, type2)
with this
key1 = check_and_validate_element_in_dict("key1", _dict, type1)
key2 = check_and_validate_element_in_dict("key2", _dict, type2)
Now, this only works if there is a single element type to test, like int, str, etc.
I also want to be able to test against multiple different types in my function, like
isinstance(element, (int, dict))
isinstance(element, (float, type(None)))
The issue here is type hinting the function so that it understands that if element_type is a single type T, the return value is T, but if element_type is a tuple of e.g. two types T and U, the return value will be either T or U.
I guess it is possible, but since I'm still a newbie in the type hinting area I'll need some help!
Edit:
I tried making the function support either a single type or a tuple of two different types as a base case, so I updated element_type to be
element_type: Union[Type[T], Tuple[Type[T], Type[T]]]
now the return element statement gets flagged by mypy with the error:
Returning Any from function declared to return "T"
this also raises a question: do I need to indicate each different input type as a new TypeVar? In such case, the element_type definition becomes
# using U = TypeVar("U")
def ...(..., element_type: Union[Type[T], Tuple[Type[T], Type[U]]]) -> Union[T, U]:
In this case the issue keeps being
Returning Any from function declared to return "T"
You can use typing.overload, which allows you to register multiple different signatures for one function. A function decorated with @overload is ignored by Python at runtime, so you can leave the body of these functions empty by just putting an ellipsis ... in the body. These implementations are just for the type checker; you have to make sure that there is at least one "true" implementation of the function that is not decorated with @overload.
from typing import TypeVar, overload, Any, Union, Dict, Type, Tuple

t0 = TypeVar('t0')
t1 = TypeVar('t1')

@overload
def check_and_validate_element_in_dict(
    element_name: str,
    dictionary: Dict[str, Any],
    element_type: Type[t0]
) -> t0:
    """
    Signature of the function if exactly one type
    is supplied to argument element_type
    """
    ...

@overload
def check_and_validate_element_in_dict(
    element_name: str,
    dictionary: Dict[str, Any],
    element_type: Tuple[Type[t0], Type[t1]]
) -> Union[t0, t1]:
    """
    Signature of the function if a tuple of exactly two types
    is supplied to argument element_type
    """
    ...

def check_and_validate_element_in_dict(
    element_name: str,
    dictionary: Dict[str, Any],
    element_type: Any
) -> Any:
    """Concrete implementation of the function"""
    assert element_name in dictionary
    element = dictionary[element_name]
    assert isinstance(element, element_type)
    return element
This feels like a deeply imperfect solution, however, as it doesn't cover a tuple of arbitrary length being passed to your element_type argument. It only works if you know the length of your tuple will be one of (for example) 2, 3 or 4; you can then provide an overloaded implementation for each of those situations. I'd definitely be interested if anybody can think of a better solution.
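For illustration, here is a condensed, runnable sketch of the overload approach above, with both call shapes (the shortened name `check` is mine, just for brevity):

```python
from typing import Any, Dict, Tuple, Type, TypeVar, Union, overload

t0 = TypeVar("t0")
t1 = TypeVar("t1")

@overload
def check(name: str, d: Dict[str, Any], tp: Type[t0]) -> t0: ...
@overload
def check(name: str, d: Dict[str, Any], tp: Tuple[Type[t0], Type[t1]]) -> Union[t0, t1]: ...

def check(name: str, d: Dict[str, Any], tp: Any) -> Any:
    """Concrete implementation shared by both overloads."""
    assert name in d
    element = d[name]
    assert isinstance(element, tp)
    return element

d: Dict[str, Any] = {"a": 1, "b": None}
x = check("a", d, int)                  # checker sees: int
y = check("b", d, (float, type(None)))  # checker sees: float | None
```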

Python building complex mypy types

In a perfect world, I could just do this:
ScoreBaseType = Union[bool, int, float]
ScoreComplexType = Union[ScoreBaseType, Dict[str, ScoreBaseType]]
But, that says a ScoreComplexType is either a ScoreBaseType or a dictionary which allows multiple types of values... not what I want.
The following looks like it should work to me, but it doesn't:
ScoreBaseTypeList = [bool, int, float]
ScoreBaseType = Union[*ScoreBaseTypeList] # pycharm says "can't use starred expression here"
ScoreDictType = reduce(lambda lhs,rhs: Union[lhs, rhs], map(lambda x: Dict[str, x], ScoreBaseTypeList))
ScoreComplexType = Union[ScoreBaseType, ScoreDictType]
Is there any way I can do something like the above without having to go through this tedium?
ScoreComplexType = Union[bool, int, float,
                     Dict[str, bool],
                     Dict[str, int],
                     Dict[str, float]]
Edit: More fleshed out desired usage example:
# these strings are completely arbitrary and determined at runtime; used as keys in nested dictionaries
CatalogStr = NewType('CatalogStr', str)
DatasetStr = NewType('DatasetStr', str)
ScoreTypeStr = NewType('ScoreTypeStr', str)

ScoreBaseType = Union[bool, int, float]
ScoreDictType = Dict[ScoreTypeStr, 'ScoreBaseTypeVar']
ScoreComplexType = Union['ScoreBaseTypeVar', ScoreDictType]
ScoreBaseTypeVar = TypeVar('ScoreBaseTypeVar', bound=ScoreBaseType)
ScoreComplexTypeVar = TypeVar('ScoreComplexTypeVar', bound=ScoreComplexType)  # errors: "constraints cannot be parameterized by type variables"

class EvalBase(ABC, Generic[ScoreComplexTypeVar]):
    def __init__(self) -> None:
        self.scores: Dict[CatalogStr,
                          Dict[DatasetStr,
                               ScoreComplexTypeVar]
                          ] = {}

class EvalExample(EvalBase[Dict[float]]):  # can't do this either
    ...
Edit 2:
It occurs to me that I could simplify a lot of my type hinting if I used tuples instead of nested dictionaries. This seems to maybe work? I've only tried it in the toy example below and haven't yet tried adapting all my code.
# These are used to make typing hints easier to understand
CatalogStr = NewType('CatalogStr', str)  # A str corresponding to the name of a catalog
DatasetStr = NewType('DatasetStr', str)  # A str corresponding to the name of a dataset
ScoreTypeStr = NewType('ScoreTypeStr', str)  # A str corresponding to the label for a ScoreType

ScoreBaseType = Union[bool, int, float]
SimpleScoreDictKey = Tuple[CatalogStr, DatasetStr]
ComplexScoreDictKey = Tuple[CatalogStr, DatasetStr, ScoreTypeStr]
ScoreKey = Union[SimpleScoreDictKey, ComplexScoreDictKey]
ScoreKeyTypeVar = TypeVar('ScoreKeyTypeVar', bound=ScoreKey)
ScoreDictType = Dict[ScoreKey, ScoreBaseType]

# These are used for Generics in classes
DatasetTypeVar = TypeVar('DatasetTypeVar', bound='Dataset')  # Must match a type inherited from Dataset
ScoreBaseTypeVar = TypeVar('ScoreBaseTypeVar', bound=ScoreBaseType)

class EvalBase(ABC, Generic[ScoreBaseTypeVar, ScoreKeyTypeVar]):
    def __init__(self):
        self.score: ScoreDictType = {}

class EvalExample(EvalBase[float, ComplexScoreDictKey]):
    ...
Although then what would the equivalent of this be? It seems like I might have to store a couple of lists of keys in order to iterate?
for catalog_name in self.catalog_list:
    for dataset_name in self.scores[catalog_name]:
        for score in self.scores[catalog_name][dataset_name]:
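One way the nested iteration could be recovered from a flat, tuple-keyed dict is to unpack the key while iterating items (a sketch; the catalog/dataset names here are hypothetical):

```python
from typing import Dict, Tuple, Union

# Hypothetical flat score dict keyed by (catalog, dataset, score_type) tuples
scores: Dict[Tuple[str, str, str], Union[bool, int, float]] = {
    ("catalog_a", "dataset_1", "accuracy"): 0.9,
    ("catalog_a", "dataset_1", "passed"): True,
}

# Tuple unpacking in the for-target replaces the three nested loops
for (catalog_name, dataset_name, score_type), value in scores.items():
    print(catalog_name, dataset_name, score_type, value)
```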
You may need to use TypeVars to express this, but without an example of how you intend to use it, it's hard to say.
An example of how this would be used for typing a return value that depends on the input:
ScoreBaseType = Union[bool, int, float]
ScoreTypeVar = TypeVar('ScoreTypeVar', bound=ScoreBaseType)
ScoreDictType = Union[ScoreTypeVar, Dict[str, ScoreTypeVar]]

def scoring_func(scores: Iterable[ScoreTypeVar]) -> ScoreDictType[ScoreTypeVar]:
    ...
If you're not doing this based on input values though, you probably want
ScoreBaseType = Union[bool, int, float]
ScoreDictTypes = Union[Dict[str, bool], Dict[str, int], Dict[str, float]]
ScoreComplexType = Union[ScoreBaseType, ScoreDictTypes]
Depending on how you are handling the types, you may also be able to use SupportsInt or SupportsFloat types rather than both int and float
Edit: (Additional Info Based on the edited OP below)
Since you are typing an ABC with this, it may be sufficient to type the base class using Dict[str, Any] and constrain subclasses further.
If it isn't, you are going to have very verbose type definitions, and there isn't much alternative, as mypy currently has some issues resolving some classes of programmatically generated types, even when operating on constants.
mypy also doesn't have support for recursive type aliases at this time (though there is a potential of support for them being added, it's not currently planned), so for readability, you'd need to define the allowed types for each potential level of nesting, and then collect those into a type representing the full nested structure.
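(A side note on that last point, as an assumption about newer tooling: more recent mypy versions enable recursive type aliases by default, so an arbitrarily nested score structure can now be named directly instead of spelling out each level. A minimal sketch:)

```python
from typing import Dict, Union

ScoreBaseType = Union[bool, int, float]
# Recursive alias: a score, or a dict of further nested scores.
# Accepted by recent mypy; older versions reject the self-reference.
NestedScore = Union[ScoreBaseType, Dict[str, "NestedScore"]]

nested: "NestedScore" = {"catalog": {"dataset": {"accuracy": 0.9}}}
```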
