Implement two internal Python types simultaneously - python

I'm trying to change the return type of a function from set to list. For a smooth transition, the idea was to do an in-place deprecation and temporarily return a type that is both a set and a list. But I'm not sure whether it is possible to derive from two internal Python types, because:
>>> class ListSet(list, set):
...     pass
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: multiple bases have instance lay-out conflict
In general the goal is to have a type that behaves like a sorted set (I think then it would be pretty much the same as a list and set) but also works with type checks (and ideally also works with MyPy):
thing = SomeClass.method_with_updated_return_type()
print(isinstance(thing, set)) # True
print(isinstance(thing, list)) # True
My idea would otherwise have been to override the __instancecheck__ method of ListSet's metaclass, to something like
class Meta(type):
    def __instancecheck__(self, instance):
        return isinstance(instance, (set, list))

class ListSet(metaclass=Meta):
    # implement all set and list methods
    ...
Is something like this possible in Python?

So, no: nothing short of inheriting directly from set or list will make an instance of a custom class return True for such an isinstance call. There are several reasons why one class can't inherit from both at the same time, and even if you coded such a class in native code, or modified it with ctypes so that it would return True for such an isinstance check, the resulting class would likely crash your Python interpreter when used.
On the other hand, the mechanisms provided by Abstract Base Classes and the __instancecheck__, __subclasscheck__, __subclasshook__ and register methods allow one to answer True when asked whether an instance of any class is an instance of the custom class making use of these methods. That is: your custom class could answer, if asked, that an arbitrary list or set is an instance of itself, as in myobj = set((1, 2, 3)); isinstance(myobj, MySpecialClass) -> True. But not the other way around: isinstance(MySpecialClass(), set) will always return False.
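A minimal sketch of that asymmetry, reusing the metaclass idea from the question (names are illustrative):

class Meta(type):
    def __instancecheck__(cls, instance):
        # we control the answer for checks *against* our class...
        return isinstance(instance, (set, list))

class MySpecialClass(metaclass=Meta):
    pass

print(isinstance({1, 2, 3}, MySpecialClass))  # True: our hook is consulted
print(isinstance(MySpecialClass(), set))      # False: set's check, not ours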
What does exist to support a similar mechanism is the recommendation to code against protocols, not specific classes. That is, any well-written code should always do isinstance(obj, collections.abc.Set) or isinstance(obj, collections.abc.Sequence) and never isinstance(..., set) (or list). Then anyone can register a custom class as a subclass of those, and the test will be True:
from collections.abc import MutableSet, Sequence

class MyClass(MutableSet):
    # NB: don't try to inherit _also_ from Sequence here; see below.
    # Code the mandatory methods for MutableSet according to
    # https://docs.python.org/3/library/collections.abc.html,
    # plus customize all mandatory _and_ derived methods
    # for a mutable sequence, in order to get your
    # desired "ordered mutable set" behavior here.
    ...

# After the class body, do:
Sequence.register(MyClass)
The call to Sequence.register registers MyClass as a virtual subclass of Sequence, and any well-behaved code that tests for the protocol via a collections.abc.Sequence instance check (or, better yet, simply uses the object as needed and lets incorrect objects fail at runtime) will just work. If you can get rid of any isinstance checks, just coding an appropriate implementation of MutableSet can give you an "ordered set" that works like you'd like, with no worries about arbitrary checks for the object type.
That is not hard at all: you can just implement the needed methods, initialize a list holding the actual data in your class's __init__, update that list in all content-modifying calls of the Set API, and iterate over the list:
from collections.abc import MutableSet

class OrderedSet(MutableSet):
    def __init__(self, initial=()):
        self.data = list()
        self.update(initial)

    def update(self, items):
        for item in items:
            self.add(item)

    def __contains__(self, item):
        return item in self.data

    def __iter__(self):
        return iter(self.data)

    def __len__(self):
        return len(self.data)

    def add(self, item):
        if item not in self.data:
            self.data.append(item)

    def discard(self, item):
        # discard() must not raise if the item is absent, per the Set API
        if item in self.data:
            self.data.remove(item)

    def __repr__(self):
        return f"OrderedSet({self.data!r})"
If you can't change hardcoded tests for instances of "set" or "list", however, there is nothing that you can do.

Related

How to make a field of a class be the same type as a parameter in the constructor

I'm a fully "static typed" man and a beginner in python and I want to make a datatype in python, which saves prev-next states of some objects. The structure is like this:
class PrevCurr:
    def __init__(self, obj_initial):
        self.previous = obj_initial
        self.current = obj_initial
No problem if my obj_initial is something like [None, None] or so. But imagine I need to wrap this class around different big types, like dictionaries/lists/sets of 1000+ elements and/or user-defined classes. Then I would like to have something like this:
class PrevCurr:
    def __init__(self, obj_type, *args, **kwargs):
        '''make self.previous and self.current of the same type as obj_type'''
        self.previous = ...
        self.current = ...
My question is: how can I be sure that the fields (or is there another name in Python for them?) are set to the same type as the object I want to wrap? My idea, presented above, is that I could somehow pass the type and additional info about it (like size, etc.) as parameters instead of the object itself, to save memory. But how do I then ensure that my fields are set to exactly the same type as the object? AFAIK I cannot pass constructors as parameters in Python, and I couldn't find any sort of generic programming in Python for this task. Or maybe I've got the idea fully wrong.
You're thinking as you would in a static-typed language like C or Java, where you have to pre-allocate memory for objects of a specific type before assigning anything to them. This is the wrong approach in python.
Instead, let's consider what you want: a class that can represent any single type, but which, once initialized, can only represent that particular type. In Java, you would use a generic for this, but python doesn't have those. What we can do in python is to make sure that only objects of the correct type can be assigned to it.
The idiomatic way of doing something like this in python is throwing an error at runtime if the programmer uses the class incorrectly; there's not really a good way for a static typechecker to throw a compile-time error, like it might in Java. I present the following for your consideration:
class PrevCurr:
    @property
    def previous(self):
        return self._previous

    @previous.setter
    def previous(self, value):
        if not isinstance(value, self._type):
            raise ValueError(f"Previous value must be of type {self._type}")
        self._previous = value

    @property
    def current(self):
        return self._current

    @current.setter
    def current(self, value):
        if not isinstance(value, self._type):
            raise ValueError(f"Current value must be of type {self._type}")
        self._current = value

    def __init__(self, obj_initial):
        self._type = type(obj_initial)
        self._previous = obj_initial
        self._current = obj_initial
Here, we have two external-facing attributes: previous and current, as you have in your current example. Because we want specific behaviors when setting these, we declare them with the @property decorator. Their actual values are held in the 'private' variables _previous and _current, respectively.
Upon initialization, we check the type of the initial object, and store that type on the instance as the 'private' variable _type.
Then, each time something (even the class's instance itself) tries to set instance.previous or instance.current, we redirect it to the appropriate function. In this function, we check whether the object to be set is the same type as what we initialized the class with. If not, we throw an error.
If you're storing, for example, a list or other collection, then I don't think there's any reasonable way to ensure that the entire list remains the same type (or, indeed, to make any assumption about the contents of the list, since python itself doesn't. They're all <type 'list'>).
One possible workaround would be to use a metaclass, overriding its __instancecheck__() method, to create a subclass of dict that only holds a specific type of object, and then initialize your PrevCurr instance with one of those.
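A rough sketch of that idea (TypedDict here is a hypothetical name): a dict subclass that rejects values of the wrong type on assignment. Note that, as demonstrated further down this page, dict's own __init__ and update bypass an overridden __setitem__, so this only guards plain d[k] = v assignments.

class TypedDict(dict):
    """Hypothetical sketch: a dict whose values must share one type."""
    def __init__(self, value_type):
        super().__init__()
        self.value_type = value_type

    def __setitem__(self, key, value):
        if not isinstance(value, self.value_type):
            raise TypeError(f"values must be of type {self.value_type.__name__}")
        super().__setitem__(key, value)

d = TypedDict(int)
d["a"] = 1       # fine
d["b"] = "oops"  # raises TypeError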
I presume from the name of the class that you have some method .update() that copies over the current value to previous, and assigns a new current value that must be the same type. In that case you might want to play around with the getters and setters to make assigning directly to previous harder.
Example of usage:
>>> a = PrevCurr(3)
>>> a.previous = 2
>>> a.previous = 4.5
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 8, in previous
ValueError: Previous value must be of type <class 'int'>

How to create a Python class that is a subclass of another class, but fails issubclass and/or isinstance tests?

I know this is probably bad design, but I've run into a case where I need to create a subclass Derived of a class Base on-the-fly, and make instances of Derived fail the issubclass(Derived, Base) or isinstance(derived_obj, Base) checks (i.e. return False).
I've tried a number of approaches, but none succeeded:
Creating a property named __class__ in Derived (https://stackoverflow.com/a/42958013/4909228). This can only be used to make the checks return True.
Overriding the __instancecheck__ and __subclasscheck__ methods of Base. This doesn't work because CPython only calls these methods when conventional checks return False.
Assigning the __class__ attribute during __init__. This is no longer allowed in Python 3.6+.
Making Derived subclass object and assigning all its attributes and methods (including special methods) to that of Base. This doesn't work because certain methods (e.g. __init__) cannot be called on an instance that is not a subclass of Base.
Can this possibly be done in Python? The approach could be interpreter specific (code is only run in CPython), and only needs to target Python versions 3.6+.
To illustrate a potential usage of this requirement, consider the following function:
def map_structure(fn, obj):
    if isinstance(obj, list):
        return [map_structure(fn, x) for x in obj]
    if isinstance(obj, dict):
        return {k: map_structure(fn, v) for k, v in obj.items()}
    # check whether `obj` is some other collection type
    ...
    # `obj` must be a singleton object, apply `fn` on it
    return fn(obj)
This method generalizes map to work on arbitrarily nested structures. However, in some cases we don't want to traverse a certain nested structure, for instance:
# `struct` is a user-provided structure; we create a list for each element
struct_list = map_structure(lambda x: [x], struct)
# somehow add stuff into the lists
...
# now we want to know how many elements are in each list, so we want to
# prevent `map_structure` from traversing the inner-most lists
struct_len = map_structure(len, struct_list)
If the said functionality can be implemented, then the above could be changed to:
pseudo_list = create_fake_subclass(list)
struct_list = map_structure(lambda x: pseudo_list([x]), struct)
# ... and the rest of code should work
Overriding the __instancecheck__ and __subclasscheck__ methods of Base. This doesn't work because CPython only calls these methods when conventional checks return False.
This statement is a misconception. These hooks are to be defined on the metaclass, not on a base class (docs).
>>> class Meta(type):
...     def __instancecheck__(self, instance):
...         print("instancecheck", self, instance)
...         return False
...     def __subclasscheck__(self, subclass):
...         print("subclasscheck", self, subclass)
...         return False
...
>>> class Base(metaclass=Meta):
...     pass
...
>>> class Derived(Base):
...     pass
...
>>> obj = Derived()
>>> isinstance(obj, Base)
instancecheck <class '__main__.Base'> <__main__.Derived object at 0xcafef00d>
False
>>> issubclass(Derived, Base)
subclasscheck <class '__main__.Base'> <class '__main__.Derived'>
False
Be aware of the CPython performance optimizations which prevent custom instance check hooks from being called in some special cases (see here for details). In particular, you can't strong-arm the return value of isinstance(obj, Derived), because of a CPython fast path taken when there is an exact type match.
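Continuing the session above, that fast path can be seen directly: the hook prints nothing and its False cannot override the result, because type(obj) is exactly Derived:

>>> isinstance(obj, Derived)
True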
As a final note, I agree with the commenters that this doesn't sound like a very promising design. It seems like you should consider using composition over inheritance in this case.
As others have pointed out: use composition.
Make a class that is not recursively mapped:
class NotMapped:
    pass
Then use composition and derive your classes from it:
class Derived(NotMapped):
    pass
Then add a case at the beginning of your function:
def map_structure(fn, obj):
    if isinstance(obj, NotMapped):
        return obj
    # special-case container types etc.
    ...
    return fn(obj)
Use this together with multiple inheritance as a sort of mixin. Create convenience constructors for container types, so NotMapped([1, 2, 3]) works as expected; a sketch follows below.
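One workable sketch of those pieces put together (NotMappedList is a hypothetical name; note this variant applies fn to NotMapped leaves instead of returning them untouched, so the len example from the question works):

class NotMapped:
    pass

class NotMappedList(NotMapped, list):
    """A real list that map_structure treats as an opaque leaf."""

def map_structure(fn, obj):
    if isinstance(obj, NotMapped):  # must come before the list check
        return fn(obj)
    if isinstance(obj, list):
        return [map_structure(fn, x) for x in obj]
    if isinstance(obj, dict):
        return {k: map_structure(fn, v) for k, v in obj.items()}
    return fn(obj)

struct = {"a": 1, "b": [2, 3]}
struct_list = map_structure(lambda x: NotMappedList([x]), struct)
struct_len = map_structure(len, struct_list)
print(struct_len)  # {'a': 1, 'b': [1, 1]}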

How to create a static class property that returns an instance of the class itself?

I wrote a class that can handle integers with arbitrary precision (just for learning purposes). The class takes a string representation of an integer and converts it into an instance of BigInt for further calculations.
Often you need the numbers Zero and One, so I thought it would be helpful if the class could return these. I tried the following:
class BigInt():
    zero = BigInt("0")

    def __init__(self, value):
        ####yada-yada####
        ...
This doesn't work. Error: "name 'BigInt' is not defined"
Then I tried the following:
class BigInt():
    __zero = None

    @staticmethod
    def zero():
        if BigInt.__zero is None:
            BigInt.__zero = BigInt('0')
        return BigInt.__zero

    def __init__(self, value):
        ####yada-yada####
        ...
This actually works very well. What I don't like is that zero is a method (and thus has to be called with BigInt.zero()) which is counterintuitive since it should just refer to a fixed value.
So I tried changing zero to become a property, but then writing BigInt.zero returns the property object itself instead of a BigInt, because of the decorator used. That instance cannot be used for calculations because of the wrong type.
Is there a way around this issue?
A static property...? We call a static property an "attribute". This is not Java, Python is a dynamically typed language and such a construct would be really overcomplicating matters.
Just do this, setting a class attribute:
class BigInt:
def __init__(self, value):
...
BigInt.zero = BigInt("0")
If you want it to be entirely defined within the class, do it using a decorator (but be aware it's just a more fancy way of writing the same thing).
def add_zero(cls):
cls.zero = cls("0")
return cls
#add_zero
class BigInt:
...
The question is contradictory: static and property don't go together in this way. Static attributes in Python are simply ones that are only assigned once, and the language itself includes a very large number of these (most strings are interned, small integers are pre-constructed, etc.; see e.g. the string module). The easiest approach is to assign the attributes right after the class is constructed, as wim illustrates:
class Foo:
    ...

Foo.first = Foo()
...
Or, as he further suggested, use a class decorator to perform the assignments, which is functionally the same as the above. A decorator is, effectively, a function that is given the decorated function (or class) as an argument and must return an object to replace it; this may be the original, say, modified with some annotations, or may be something entirely different. The original (decorated) function may or may not be called, as appropriate for the decorator.
def preload(**values):
    def inner(cls):
        for k, v in values.items():
            setattr(cls, k, cls(v))
        return cls
    return inner
This can then be used dynamically:
@preload(zero=0, one=1)
class Foo:
    ...
If the purpose is to save some time on common integer values, a defaultdict mapping integers to constructed BigInts could be useful as a form of caching and streamlined construction / singleton storage. (E.g. BigInt.numbers[27])
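A sketch of that caching idea, assuming the BigInt('0')-style constructor from the question. Since a defaultdict factory doesn't receive the missing key, a dict subclass with __missing__ stands in for it here (BigIntCache is a hypothetical name):

class BigIntCache(dict):
    """Cache mapping plain ints to BigInt singletons, built on demand."""
    def __missing__(self, key):
        value = self[key] = BigInt(str(key))
        return value

BigInt.numbers = BigIntCache()
zero = BigInt.numbers[0]          # constructed on first access
assert BigInt.numbers[0] is zero  # cached thereafter, e.g. BigInt.numbers[27]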
However, the problem of utilizing @property at the class level intrigued me, so I did some digging. It is entirely possible to make use of descriptor protocol objects (which the @property decorator returns) at the class level, if you punt the attribute up the object model hierarchy, to the metaclass.
class Foo(type):
    @property
    def bar(cls):
        print("I'm a", cls)
        return 27

class Bar(metaclass=Foo):
    ...

>>> Bar.bar
I'm a <class '__main__.Bar'>
27
Notably, this attribute is not accessible from instances:
>>> Bar().bar
AttributeError: 'Bar' object has no attribute 'bar'
Hope this helps!

Let a class behave like it's a list in Python

I have a class which is essentially a collection/list of things. But I want to add some extra functions to this list. What I would like is the following:
I have an instance li = MyFancyList(). Variable li should behave as if it were a list whenever I use it as one: [e for e in li], li.extend(...), for e in li.
Plus it should have some special functions like li.fancyPrint(), li.getAMetric(), li.getName().
I currently use the following approach:
class MyFancyList:
    def __iter__(self):
        return iter(self.li)  # __iter__ must return an iterator, not the list
    def fancyFunc(self):
        # do something fancy
        ...
This is OK for usage as an iterator, like [e for e in li], but I do not get the full list behavior, like li.extend(...).
A first guess is to have MyFancyList inherit from list. But is that the recommended pythonic way to do it? If yes, what is there to consider? If no, what would be a better approach?
If you want only part of the list behavior, use composition (i.e. your instances hold a reference to an actual list) and implement only the methods necessary for the behavior you desire. These methods should delegate the work to the actual list any instance of your class holds a reference to, for example:
def __getitem__(self, item):
    return self.li[item]  # delegate to li.__getitem__
Implementing __getitem__ alone will give you a surprising amount of features, for example iteration and slicing.
>>> class WrappedList:
...     def __init__(self, lst):
...         self._lst = lst
...     def __getitem__(self, item):
...         return self._lst[item]
...
>>> w = WrappedList([1, 2, 3])
>>> for x in w:
...     x
...
1
2
3
>>> w[1:]
[2, 3]
If you want the full behavior of a list, inherit from collections.UserList. UserList is a full Python implementation of the list datatype.
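For example, a minimal sketch (fancy_print is an illustrative method name):

from collections import UserList

class MyFancyList(UserList):
    def fancy_print(self):
        # UserList keeps the underlying items in self.data
        for i, item in enumerate(self.data):
            print(f"{i}: {item!r}")

li = MyFancyList([1, 2, 3])
li.extend([4, 5])    # full list behaviour out of the box
li.fancy_print()
print(type(li[1:]))  # slices come back as MyFancyList, not list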
So why not inherit from list directly?
One major problem with inheriting directly from list (or any other builtin written in C) is that the code of the builtins may or may not call special methods overridden in classes defined by the user. Here's a relevant excerpt from the pypy docs:
Officially, CPython has no rule at all for when exactly overridden method of subclasses of built-in types get implicitly called or not. As an approximation, these methods are never called by other built-in methods of the same object. For example, an overridden __getitem__ in a subclass of dict will not be called by e.g. the built-in get method.
Another quote, from Luciano Ramalho's Fluent Python, page 351:
Subclassing built-in types like dict or list or str directly is error-prone because the built-in methods mostly ignore user-defined overrides. Instead of subclassing the built-ins, derive your classes from UserDict, UserList and UserString from the collections module, which are designed to be easily extended.
... and more, page 370+:
Misbehaving built-ins: bug or feature?
The built-in dict , list and str types are essential building blocks of Python itself, so
they must be fast — any performance issues in them would severely impact pretty much
everything else. That’s why CPython adopted the shortcuts that cause their built-in
methods to misbehave by not cooperating with methods overridden by subclasses.
After playing around a bit, the issues with the list builtin seem to be less critical (I tried to break it in Python 3.4 for a while but did not find a really obvious unexpected behavior), but I still wanted to post a demonstration of what can happen in principle, so here's one with a dict and a UserDict:
>>> from collections import UserDict
>>> class MyDict(dict):
...     def __setitem__(self, key, value):
...         super().__setitem__(key, [value])
...
>>> d = MyDict(a=1)
>>> d
{'a': 1}
>>> class MyUserDict(UserDict):
...     def __setitem__(self, key, value):
...         super().__setitem__(key, [value])
...
>>> m = MyUserDict(a=1)
>>> m
{'a': [1]}
As you can see, the __init__ method from dict ignored the overridden __setitem__ method, while the __init__ method from our UserDict did not.
The simplest solution here is to inherit from the list class:
class MyFancyList(list):
    def fancyFunc(self):
        # do something fancy
        ...
You can then use the MyFancyList type as a list, along with its specific methods.
Inheritance introduces a strong coupling between your object and list. The approach you implemented is basically a proxy object.
Which way to go depends heavily on how you will use the object. If it has to be a list, then inheritance is probably a good choice.
EDIT: as pointed out by @acdr, some methods that return a list copy should be overridden in order to return a MyFancyList instead of a list.
A simple way to implement that:
class MyFancyList(list):
    def fancyFunc(self):
        # do something fancy
        ...

    def __add__(self, *args, **kwargs):
        return MyFancyList(super().__add__(*args, **kwargs))
If you don't want to redefine every method of list, I suggest you the following approach:
class MyList:
    def __init__(self, list_):
        self.li = list_
    def __getattr__(self, method):
        return getattr(self.li, method)
This would make methods like append, extend and so on, work out of the box. Beware, however, that magic methods (e.g. __len__, __getitem__ etc.) are not going to work in this case, so you should at least redeclare them like this:
class MyList:
    def __init__(self, list_):
        self.li = list_
    def __getattr__(self, method):
        return getattr(self.li, method)
    def __len__(self):
        return len(self.li)
    def __getitem__(self, item):
        return self.li[item]
    def fancyPrint(self):
        # do whatever you want...
        ...
Please note that in this case, if you want to override a method of list (extend, for instance), you can just declare your own so that the call won't pass through the __getattr__ method. For instance:
class MyList:
    def __init__(self, list_):
        self.li = list_
    def __getattr__(self, method):
        return getattr(self.li, method)
    def __len__(self):
        return len(self.li)
    def __getitem__(self, item):
        return self.li[item]
    def fancyPrint(self):
        # do whatever you want...
        ...
    def extend(self, list_):
        # your own version of extend
        ...
Based on the two example methods you included in your post (fancyPrint, getAMetric), it doesn't seem that you need to store any extra state in your lists. If this is the case, you're best off simply declaring these as free functions and ignoring subtyping altogether; this completely avoids problems like list vs UserList, fragile edge cases like return types for __add__, unexpected Liskov issues, &c. Instead, you can write your functions, write your unit tests for their output, and rest assured that everything will work exactly as intended.
As an added benefit, this means your functions will work with any iterable types (such as generator expressions) without any extra effort.
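A sketch of that suggestion, with illustrative names mirroring the question:

from typing import Iterable

def fancy_print(items: Iterable) -> None:
    for item in items:
        print(f"* {item!r}")

def get_a_metric(items: Iterable) -> int:
    # hypothetical metric: here, just the number of items
    return sum(1 for _ in items)

fancy_print([1, 2, 3])                    # works on plain lists...
print(get_a_metric(x for x in range(5)))  # ...and on generator expressions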

How to fake type with Python

I recently developed a class named DocumentWrapper around some ORM document object in Python to transparently add some features to it without changing its interface in any way.
I just have one issue with this. Let's say I have some User object wrapped in it. Calling isinstance(some_var, User) will return False because some_var indeed is an instance of DocumentWrapper.
Is there any way to fake the type of an object in Python to have the same call return True?
You can use the __instancecheck__ magic method to override the default isinstance behaviour:
@classmethod
def __instancecheck__(cls, instance):
    return isinstance(instance, User)
(Note that, as another answer on this page points out, CPython consults this hook on the metaclass, not on the class itself.) This is only if you want your object to be a transparent wrapper; that is, if you want a DocumentWrapper to behave like a User. Otherwise, just expose the wrapped object as an attribute.
This is a Python 3 addition; it came with abstract base classes. You can't do the same in Python 2.
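A tiny sketch of the non-transparent alternative mentioned above (the attribute name is illustrative):

class DocumentWrapper:
    def __init__(self, wrapped):
        self.wrapped = wrapped  # expose the wrapped object directly

doc = DocumentWrapper(User())
print(isinstance(doc.wrapped, User))  # True: check the exposed attribute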
Override __class__ in your wrapper class DocumentWrapper:
class DocumentWrapper(object):
    @property
    def __class__(self):
        return User
>>> isinstance(DocumentWrapper(), User)
True
This way no modifications to the wrapped class User are needed.
Python Mock does the same (see mock.py:612 in mock-2.0.0, couldn't find sources online to link to, sorry).
Testing the type of an object is usually an antipattern in python. In some cases it makes sense to test the "duck type" of the object, something like:
hasattr(some_var, "username")
But even that can mislead: there are reasons why that expression might return False even though a wrapper uses some __getattribute__ magic to correctly proxy the attribute.
It's usually preferable for a variable to take only a single type (and possibly None). Different behaviours based on different inputs should be achieved by passing the optionally typed data in different variables, something like this:
def dosomething(some_user=None, some_otherthing=None):
    if some_user is not None:
        # do the "User" type action
        ...
    elif some_otherthing is not None:
        # etc...
        ...
    else:
        raise ValueError("not enough arguments")
Of course, this all assumes you have some level of control over the code that is doing the type checking. Suppose it isn't so: for isinstance() to return True, the class must appear in the MRO of the instance's type, or its metaclass must define an __instancecheck__. Since you don't control either of those things for the wrapped class, you have to resort to some shenanigans on the instance. Do something like this:
def wrap_user(instance):
    class wrapped_user(type(instance)):
        __metaclass__ = type
        def __init__(self):
            pass
        def __getattribute__(self, attr):
            self_dict = object.__getattribute__(type(self), '__dict__')
            if attr in self_dict:
                return self_dict[attr]
            return getattr(instance, attr)
        def extra_feature(self, foo):
            return instance.username + foo  # or whatever
    return wrapped_user()
What we're doing is creating a new class dynamically at the time we need to wrap the instance, actually inheriting from the wrapped object's __class__. We also go to the extra trouble of overriding the __metaclass__, in case the original had some extra behaviors we don't want to trigger (like looking for a database table with a certain class name). A nice convenience of this style is that we never have to create any instance attributes on the wrapper class; there is no self.wrapped_object, since that value is present at class creation time.
Edit: As pointed out in comments, the above only works for some simple types. If you need to proxy more elaborate attributes on the target object (say, methods), see the following answer: Python - Faking Type Continued
Here is a solution using a metaclass, but you need to modify the wrapped classes:
>>> import abc
>>> class DocumentWrapper:
...     def __init__(self, wrapped_obj):
...         self.wrapped_obj = wrapped_obj
...
>>> class MetaWrapper(abc.ABCMeta):
...     def __instancecheck__(self, instance):
...         try:
...             return isinstance(instance.wrapped_obj, self)
...         except AttributeError:
...             return isinstance(instance, self)
...
>>> class User(metaclass=MetaWrapper):
...     pass
...
>>> user = DocumentWrapper(User())
>>> isinstance(user, User)
True
>>> class User2:
...     pass
...
>>> user2 = DocumentWrapper(User2())
>>> isinstance(user2, User2)
False
It sounds like you want to test the type of the object your DocumentWrapper wraps, not the type of the DocumentWrapper itself. If that's right, then the interface to DocumentWrapper needs to expose that type. You might add a method to your DocumentWrapper class that returns the type of the wrapped object, for instance. But I don't think that making the call to isinstance ambiguous, by making it return True when it's not, is the right way to solve this.
The best way is to inherit DocumentWrapper from User itself, or to use the mix-in pattern and do multiple inheritance from several classes:
class DocumentWrapper(User, object)
You can also fake isinstance() results by manipulating obj.__class__, but this is deep-level magic and should not be done.
