Class that acts as mapping for **unpacking - python

Without subclassing dict, what would a class need in order to be considered a mapping, so that it can be passed to a method with **?
from abc import ABCMeta

class uobj:
    __metaclass__ = ABCMeta

uobj.register(dict)

def f(**k): return k

o = uobj()
f(**o)
# outputs: f() argument after ** must be a mapping, not uobj
At least to the point where it throws errors about missing mapping functionality, so I can begin implementing.
I reviewed "Emulating container types", but simply defining the magic methods has no effect, and using ABCMeta to register it against dict validates subclass assertions but fails isinstance(o, dict). Ideally, I don't even want to use ABCMeta.

The __getitem__() and keys() methods will suffice:
>>> class D:
...     def keys(self):
...         return ['a', 'b']
...     def __getitem__(self, key):
...         return key.upper()
...
>>> def f(**kwds):
...     print(kwds)
...
>>> f(**D())
{'a': 'A', 'b': 'B'}

If you're trying to create a Mapping — not just satisfy the requirements for passing to a function — then you really should inherit from collections.abc.Mapping. As described in the documentation, you need to implement just:
__getitem__
__len__
__iter__
The Mixin will implement everything else for you: __contains__, keys, items, values, get, __eq__, and __ne__.
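For example, here is a minimal sketch of such a Mapping (the class name LowerMapping and its lowercasing behaviour are purely illustrative):

from collections.abc import Mapping

class LowerMapping(Mapping):
    """Read-only mapping that lowercases its values (illustrative)."""
    def __init__(self, data):
        self._data = dict(data)

    # The three required abstract methods:
    def __getitem__(self, key):
        return self._data[key].lower()

    def __len__(self):
        return len(self._data)

    def __iter__(self):
        return iter(self._data)

m = LowerMapping({'a': 'X', 'b': 'Y'})
print(m['a'])      # 'x'
print('a' in m)    # True  (supplied by the mixin)
print(dict(m))     # {'a': 'x', 'b': 'y'}

def f(**kwargs):
    return kwargs

print(f(**m))      # {'a': 'x', 'b': 'y'} -- works with ** as well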

The answer can be found by digging through the source.
When attempting to use a non-mapping object with **, the following error is given:
TypeError: 'Foo' object is not a mapping
If we search CPython's source for that error, we can find the code that causes that error to be raised:
case TARGET(DICT_UPDATE): {
    PyObject *update = POP();
    PyObject *dict = PEEK(oparg);
    if (PyDict_Update(dict, update) < 0) {
        if (_PyErr_ExceptionMatches(tstate, PyExc_AttributeError)) {
            _PyErr_Format(tstate, PyExc_TypeError,
                          "'%.200s' object is not a mapping",
                          Py_TYPE(update)->tp_name);
PyDict_Update is actually dict_merge, and the error is thrown when dict_merge returns a negative number. If we check the source for dict_merge, we can see what leads to -1 being returned:
/* We accept for the argument either a concrete dictionary object,
 * or an abstract "mapping" object.  For the former, we can do
 * things quite efficiently.  For the latter, we only require that
 * PyMapping_Keys() and PyObject_GetItem() be supported.
 */
if (a == NULL || !PyDict_Check(a) || b == NULL) {
    PyErr_BadInternalCall();
    return -1;
}
The key part being:
For the latter, we only require that PyMapping_Keys() and PyObject_GetItem() be supported.
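Translated back to Python, that requirement is easy to check. In the sketch below, Bare and WithKeys are throwaway illustration classes; only the second has keys() and __getitem__, and only it survives ** (here in a dict display, matching the error above):

class Bare:
    pass

class WithKeys:
    def keys(self):
        return ('a',)
    def __getitem__(self, key):
        return 1

try:
    {**Bare()}
except TypeError as e:
    print(e)            # 'Bare' object is not a mapping

print({**WithKeys()})   # {'a': 1}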

Programmatic generation of Literal options

MyPy's Literal type can be super useful for defining available options. Is it possible to generate a literal type programmatically, e.g. from a canonical registry?
e.g.
class Dispatcher():
    func_reg = {
        'f1': my_func,
        'f2': new_func,
        'f3': shoe_func,
    }

    def dispatch(cls, func_name: Literal[*func_reg.keys()]) -> Whatever:
        pass
Unfortunately, the answer is no.
According to the mypy documentation:
Literal types may contain one or more literal bools, ints, strs, bytes, and enum values. However, literal types cannot contain arbitrary expressions: types like Literal[my_string.trim()], Literal[x > 3], or Literal[3j + 4] are all illegal.
As @BrokenBenchmark notes, it is not possible to auto-generate Literal types. However, if the end goal is just to require specific values generated from some kind of function registry, we can hack it with enum.Enum.
To quote PEP 586:
rather than entirely special-casing enums, we can instead treat them
as being approximately equivalent to the union of their values...
the Status enum could be treated as being approximately equivalent to Literal[Status.SUCCESS, Status.INVALID_DATA, Status.FATAL_ERROR]
Here, functions are "registered" by adding an enum value that is an exact uppercasing of the function name to the FuncNames enum. This is not a pretty or robust solution, but it runs, it supports single-location registration of a function for type-checked dispatch, and mypy handles the required enum values as expected.
from enum import Enum, auto

def f():
    return "f"

def g():
    return "g"

def h():
    return "h"

class Dispatcher():
    # Build the enum used to register the functions
    class FuncNames(Enum):
        """
        The enum names here _must_ be exact uppercase-ings of the function
        names. The names will be lowercased and evaluated to register their
        associated functions.
        """
        F = auto()
        G = auto()
        H = auto()
        # NOTE: The functional syntax works just as well
        # FuncNames = Enum('FuncNames', 'F G H')

    # Comprehensions can't access names defined in the class block,
    # so use a standard for loop
    func_reg = dict()
    for name in list(FuncNames):
        func_reg[eval(f"FuncNames.{name.name}")] = eval(str(name.name).lower())

    @classmethod
    def dispatch(cls, func_name: FuncNames):
        """
        Prints the return from a registered function.
        Can only be called with an item from FuncNames.
        """
        print(cls.func_reg[func_name]())

Dispatcher.dispatch(Dispatcher.FuncNames.F)
Dispatcher.dispatch(Dispatcher.FuncNames.G)
Dispatcher.dispatch(Dispatcher.FuncNames.H)

# Dispatcher.dispatch(Dispatcher.FuncNames.I)  -> "FuncNames has no attribute I"
# Dispatcher.dispatch(Dispatcher2.FuncNames)   -> "incompatible type"
# Dispatcher.dispatch('MyPy hates me!')        -> "incompatible type"
Interestingly, though it feels cleaner to generate the enum from a list of the functions themselves, MyPy chokes on this.
class Dispatcher2():
    # Build an enum used to register these (the actual functions)
    funcs_to_register = [f, g, h]
    enum_names = [func.__name__.upper() for func in funcs_to_register]
    joined = ' '.join(enum_names)
    FuncNames = Enum('FuncNames', joined)

    func_reg = dict()
    for name in enum_names:
        func_reg[eval(f"FuncNames.{name}")] = eval(name.lower())

    @classmethod
    def dispatch(cls, func_name: FuncNames):
        """
        Prints the return from a registered function.
        Can only be called with an item from FuncNames.
        """
        print(cls.func_reg[func_name]())

Dispatcher2.dispatch(Dispatcher2.FuncNames.F)
Dispatcher2.dispatch(Dispatcher2.FuncNames.G)
Dispatcher2.dispatch(Dispatcher2.FuncNames.H)
The above runs as expected, but mypy presumably can't infer the values present in the enum unless it is statically defined, so it errors.
> mypy enums_typing.py
enums_typing.py:19: error: Enum() expects a string, tuple, list or dict literal as the second argument
enums_typing.py:36: error: "Type[FuncNames]" has no attribute "F"
enums_typing.py:37: error: "Type[FuncNames]" has no attribute "G"
enums_typing.py:38: error: "Type[FuncNames]" has no attribute "H"
Found 4 errors in 1 file (checked 1 source file)
TLDR:
In order to define a fixed set of choices that MyPy can check against, you must define them statically. It may then be possible to use those statically-defined choices to programmatically build your function registry.
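A minimal sketch of that approach (the names FuncName and func_reg are illustrative, not from the question): the Literal is spelled out statically so mypy can check call sites, and the runtime registry is derived from it with typing.get_args, keeping a single source of truth.

from typing import Literal, get_args

# The fixed set of choices, written out statically so mypy can see it.
FuncName = Literal['f', 'g', 'h']

def f() -> str: return "f"
def g() -> str: return "g"
def h() -> str: return "h"

# Build the registry *from* the static Literal, so the valid names
# live in exactly one place.
func_reg = {name: func for name, func in zip(get_args(FuncName), (f, g, h))}

def dispatch(func_name: FuncName) -> str:
    return func_reg[func_name]()

print(dispatch('f'))   # OK, prints "f"
# dispatch('nope')     # mypy: incompatible type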

IntEnum subclass does not compare properly

I'm trying to subclass IntEnum so that members' values start at a certain number, with the values of subsequent members set automatically. This is my class:
class Abc(IntEnum):
    def __init__(self, n=100):
        super().__init__()
        self._value_ = n + len(self.__class__.__members__)

    A = ()  # 100
    B = ()  # 101

Abc.A == Abc.B  # expects False, but gets True
As shown above, the comparison between the members is not correct. When printing out Abc.__dict__, I noticed that its _value2member_map_ does not look correct either.
mappingproxy({'A': <Abc.A: 100>,
              'B': <Abc.B: 101>,
              '__doc__': 'An enumeration.',
              '__init__': <function __main__.Abc.__init__>,
              '__module__': '__main__',
              '__new__': <function enum.Enum.__new__>,
              '_generate_next_value_': <function enum.Enum._generate_next_value_>,
              '_member_map_': OrderedDict([('A', <Abc.A: 100>),
                                           ('B', <Abc.B: 101>)]),
              '_member_names_': ['A', 'B'],
              '_member_type_': int,
              '_value2member_map_': {0: <Abc.B: 101>}})
Notice how '_value2member_map_' has key 0 instead of the expected values 100 and 101. I must be missing something in the init function, but I could not figure out how to properly do what I intended. Any help is appreciated.
Thank you.
First, there's a more idiomatic—and dead simple—way to do what you seem to be trying to do:
class Abc(IntEnum):
    A = 100
    B = auto()
Or, given that you're putting 100 and 101 in as comments anyway, live code is always better than comments:
class Abc(IntEnum):
    A = 100
    B = 101
The fact that you're not doing either of those is a signal to the reader that you're probably going to do something more complicated. Except that, as far as I can tell, you aren't, so this is misleading.
Plus, you're combining two patterns that have directly opposite connotations: as the docs say, using the () idiom "signifies to the user that these values are not important", but using IntEnum obviously means that the numeric values of these enumeration constants are not just important but the whole point of them.
Not only that, but the user has to read through your method code to figure out what those important numeric values are, instead of just immediately reading them off.
Anyway, if you want to get this to work, the problem is that replacing _value_ after initialization isn't documented to do any good, and in fact it doesn't.
What you want to override is __new__, not __init__, as in the auto-numbering example in the docs.
But there are two differences here (both related to the fact that you're using IntEnum instead of Enum):
You cannot call object.__new__, because an IntEnum is an int, and object.__new__ can't be used on instances of builtin types like int. You can figure out the right base class dynamically from looking through cls's mro, or you can just hardcode int here.
You don't need an intermediate base class here to do the work. (You might still want one if you were going to create multiple auto-numbered IntEnums, of course.)
So:
class Abc(IntEnum):
    def __new__(cls, n=100):
        value = len(cls.__members__) + n
        obj = int.__new__(cls, value)
        obj._value_ = value
        return obj

    A = ()
    B = ()
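With __new__ doing the numbering, the members get distinct values and the comparison behaves as expected:

>>> Abc.A
<Abc.A: 100>
>>> Abc.B
<Abc.B: 101>
>>> Abc.A == Abc.B
False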

Why is the dictionary key being converted to an inherited class type?

My code looks something like this:
class SomeClass(str):
    pass

some_dict = {'s': 42}

>>> type(list(some_dict.keys())[0])
<class 'str'>
>>> s = SomeClass('s')
>>> some_dict[s] = 40
>>> some_dict  # expected: two different key-value pairs
{'s': 40}
>>> type(list(some_dict.keys())[0])
<class 'str'>
Why did Python convert the object s to the string "s" while updating the dictionary some_dict?
Whilst the hash value is related, it is not the main factor.
It is equality that is more important here. That is, objects may have the same hash value and not be equal, but equal objects must have the same hash value (though this is not strictly enforced). Otherwise you will end up with some strange bugs when using dict and set.
Since you have not defined the __eq__ method on SomeClass, you inherit the one from str. Python's builtins are built to allow subclassing, so str.__eq__ returns True if the objects are equal as strings, even when their types differ; e.g. 's' == SomeClass('s') is True. Thus it is right and proper that 's' and SomeClass('s') are equivalent as keys to a dictionary.
To get the behaviour you want, you must redefine the __eq__ dunder method to take type into account. However, when you define a custom __eq__, Python stops giving you the automatic __hash__ dunder method, so you must redefine that as well. In this case we can just reuse str.__hash__.
class SomeClass(str):
    def __eq__(self, other):
        return (
            type(self) is SomeClass
            and type(other) is SomeClass
            and super().__eq__(other)
        )
    # Defining __eq__ sets __hash__ to None, so restore str's
    __hash__ = str.__hash__

d = {'s': 1}
d[SomeClass('s')] = 2
assert len(d) == 2
print(d)
prints: {'s': 1, 's': 2} (two visually identical keys, since SomeClass inherits str's repr)
This is a really good question. When putting a (key, value) pair into a dict, the dict first uses the hash of the key to locate a slot, and if a key with the same hash is already present, it compares the two objects. If they are equal (__eq__ returns True), the dict updates the value in place, which is why your code encounters this behavior.
Since SomeClass does not modify anything it inherits, 's' and SomeClass('s') have the same hash code, and 's'.__eq__(SomeClass('s')) returns True.
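Both claims are easy to verify on the original, unmodified subclass:

>>> s = SomeClass('s')   # the subclass as posted, without __eq__ overridden
>>> hash(s) == hash('s')
True
>>> s == 's'
True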

Overriding special methods on builtin types

Can magic methods be overridden outside of a class?
When I do something like this
def __int__(x):
    return x + 5

a = 5
print(int(a))
it prints '5' instead of '10'. Am I doing something wrong, or can magic methods simply not be overridden outside of a class?
Short answer: not really.
You cannot arbitrarily change the behaviour of int(), a builtin function (which internally calls __int__()), on builtin types such as int.
You can however change the behaviour of custom objects like this:
Example:
class Foo(object):
    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        # Mutates in place and (implicitly) returns None
        self.value += other

    def __repr__(self):
        return "<Foo(value={0:d})>".format(self.value)
Demo:
>>> x = Foo(5)
>>> x + 5
>>> x
<Foo(value=10)>
This implements two special methods:
__repr__(), which gets called by repr()
__add__(), which gets called by the + operator
Update: As per the comments above, technically you can redefine the builtin function int. Example:
def int(x):
    return x + 5

int(5)  # returns 10
However this is not recommended and does not change the overall behaviour of the object x.
Update #2: The reason you cannot change the behaviour of builtin types (without modifying the underlying source or using Cython or ctypes) is that builtin types in Python are not exposed or mutable to the user, unlike in homoiconic languages (see: Homoiconicity). Even then, I'm not really sure you can with Cython/ctypes; but the real question is "Why do you want to do this?"
Update #3: See Python's documentation on Data Model (object.__complex__ for example).
You can redefine a top-level __int__ function, but nobody ever calls that.
As implied in the Data Model documentation, when you write int(x), that calls x.__int__(), not __int__(x).
And even that isn't really true. First, __int__ is a special method, meaning int is allowed to call type(x).__int__(x) rather than x.__int__(), but that doesn't matter here. Second, int isn't required to call __int__ unless you give it something that isn't already an int (and you call it with the one-argument form). So it's as if int were written like this:
def int(x, base=None):
    if base is not None:
        return do_basey_stuff(x, base)
    if isinstance(x, int):
        return x
    return type(x).__int__(x)
So, there is no way to change what int(5) will do… short of just shadowing the builtin int function with a different builtin/global/local function of the same name, of course.
But what if you wanted to, say, change int(5.5)? That's not an int, so it's going to call float.__int__(5.5). So, all we have to do is monkeypatch that, right?
Well, yes, except that Python allows builtin types to be immutable, and most of the builtin types in CPython are. So, if you try it:
>>> _real_float_int = float.__int__
>>> def _float_int(self):
...     return _real_float_int(self) + 5
...
>>> _float_int(5.5)
10
>>> float.__int__ = _float_int
Traceback (most recent call last):
  ...
TypeError: can't set attributes of built-in/extension type 'float'
However, if you're defining your own types, that's a different story:
>>> class MyFloat(float):
...     def __int__(self):
...         return super().__int__() + 5
...
>>> f = MyFloat(5.5)
>>> int(f)
10

overloaded __iter__ is bypassed when deriving from dict

Trying to create a custom case-insensitive dictionary, I came across the following inconvenient and (from my point of view) unexpected behaviour: if deriving a class from dict, the overloaded __iter__, keys, values functions are ignored when converting back to dict. I have condensed it to the following test case:
import collections

class Dict(dict):
    def __init__(self):
        super(Dict, self).__init__(x=1)
    def __getitem__(self, key):
        return 2
    def values(self):
        return 3
    def __iter__(self):
        yield 'y'
    def keys(self):
        return 'z'
    if hasattr(collections.MutableMapping, 'items'):
        items = collections.MutableMapping.items
    if hasattr(collections.MutableMapping, 'iteritems'):
        iteritems = collections.MutableMapping.iteritems

d = Dict()
print(dict(d))          # {'x': 1}
print(dict(d.items()))  # {'y': 2}
The values for keys, values and __iter__, __getitem__ are inconsistent only to demonstrate which methods are actually called.
The documentation for dict.__init__ says:
If a positional argument is given and it is a mapping object, a
dictionary is created with the same key-value pairs as the mapping
object. Otherwise, the positional argument must be an iterator object.
I guess it has something to do with the first sentence and maybe with optimizations for builtin dictionaries.
Why exactly does the call to dict(d) not use any of keys, __iter__?
Is it possible to overload the 'mapping' somehow to force the dict constructor to use my presentation of key-value pairs?
Why did I use this? For a case-insensitive but -preserving dictionary, I wanted to:
store (lowercase => (original_case, value)) internally, while appearing as (any_case => value).
derive from dict in order to work with some external library code that uses isinstance checks
not use 2 dictionary lookups: lower_case=>original_case, followed by original_case=>value (this is the solution which I am doing now instead)
If you are interested in the application case: here is the corresponding branch.
In the file dictobject.c, you can see the relevant code at line 1795 ff.:
static int
dict_update_common(PyObject *self, PyObject *args, PyObject *kwds, char *methname)
{
    PyObject *arg = NULL;
    int result = 0;

    if (!PyArg_UnpackTuple(args, methname, 0, 1, &arg))
        result = -1;
    else if (arg != NULL) {
        _Py_IDENTIFIER(keys);
        if (_PyObject_HasAttrId(arg, &PyId_keys))
            result = PyDict_Merge(self, arg, 1);
        else
            result = PyDict_MergeFromSeq2(self, arg, 1);
    }
    if (result == 0 && kwds != NULL) {
        if (PyArg_ValidateKeywordArguments(kwds))
            result = PyDict_Merge(self, kwds, 1);
        else
            result = -1;
    }
    return result;
}
This tells us that if the object has an attribute keys, the code that gets called is a mere merge (PyDict_Merge). The code called there (l. 1915 ff.) makes a distinction between real dicts and other objects. In the case of real dicts, the items are read out with PyDict_GetItem(), which is the innermost interface to the object and doesn't bother using any user-defined methods.
So instead of inheriting from dict, you should use UserDict (collections.UserDict in Python 3).
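For example, a rough sketch of the case-insensitive, case-preserving mapping described in the question, built on collections.UserDict (the class name and internal storage scheme here are illustrative, not taken from the linked branch):

from collections import UserDict

class CaseInsensitiveDict(UserDict):
    """Case-insensitive but case-preserving mapping (illustrative sketch).

    Internally stores lowercased_key -> (original_key, value),
    while appearing as original_key -> value.
    """
    def __setitem__(self, key, value):
        self.data[key.lower()] = (key, value)

    def __getitem__(self, key):
        return self.data[key.lower()][1]

    def __delitem__(self, key):
        del self.data[key.lower()]

    def __iter__(self):
        # Yield the original-case keys, not the internal lowercased ones
        return (original for original, _ in self.data.values())

    def __contains__(self, key):
        return key.lower() in self.data

d = CaseInsensitiveDict()
d['Content-Type'] = 'text/html'
print(d['content-TYPE'])    # text/html
print('CONTENT-type' in d)  # True
print(dict(d))              # {'Content-Type': 'text/html'} -- keys()/__getitem__ are honored here

The trade-off, as noted below, is that isinstance(d, dict) is False.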
Is it possible to overload the 'mapping' somehow to force the dict constructor to use my presentation of key-value pairs?
No.
dict being a core builtin type, redefining its semantics would certainly cause outright breakage elsewhere.
If you've got a library whose use of dict you can't override, that's tough, but redefining the language's primitives isn't the answer. You'd probably find it irksome if someone screwed with the commutative property of integer addition behind your back; that's why they can't.
And with regard to your comment "UserDict (correctly) gives False in isinstance(d, dict) checks", of course it does because it isn't a dict and dict has very specific invariants which UserDict can't assure.
