Making copies of built-in classes - python

I'm trying to write a function that creates new classes from existing classes without modifying the original.
Simple solution (based on this answer)
def class_operator(cls):
    namespace = dict(vars(cls))
    ...  # modifying namespace
    return type(cls.__qualname__, cls.__bases__, namespace)
works fine for everything except type itself:
>>> class_operator(type)
Traceback (most recent call last):
File "<input>", line 1, in <module>
TypeError: type __qualname__ must be a str, not getset_descriptor
Tested on Python 3.2-Python 3.6.
(I know that in the current version, modifying mutable attributes in the namespace object will change the original class, but that is not the issue here.)
Update
Even if we remove the __qualname__ entry from the namespace (when there is one)
def class_operator(cls):
    namespace = dict(vars(cls))
    namespace.pop('__qualname__', None)
    return type(cls.__qualname__, cls.__bases__, namespace)
the resulting object doesn't behave like the original type:
>>> type_copy = class_operator(type)
>>> type_copy is type
False
>>> type_copy('')
Traceback (most recent call last):
File "<input>", line 1, in <module>
TypeError: descriptor '__init__' for 'type' objects doesn't apply to 'type' object
>>> type_copy('empty', (), {})
Traceback (most recent call last):
File "<input>", line 1, in <module>
TypeError: descriptor '__init__' for 'type' objects doesn't apply to 'type' object
Why?
Can someone explain what mechanism in Python's internals prevents copying the type class (and many other built-in classes)?

The problem here is that type has a __qualname__ entry in its __dict__, and that entry is a getset descriptor (similar to a property) rather than a string:
>>> type.__qualname__
'type'
>>> vars(type)['__qualname__']
<attribute '__qualname__' of 'type' objects>
And trying to assign a non-string to the __qualname__ of a class throws an exception:
>>> class C: pass
...
>>> C.__qualname__ = 'Foo' # works
>>> C.__qualname__ = 3 # doesn't work
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: can only assign string to C.__qualname__, not 'int'
This is why it's necessary to remove the __qualname__ from the __dict__.
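To see the two views of __qualname__ side by side, here is a minimal Python 3 sketch. It uses only built-ins; the variable name descr is just for illustration.

```python
# The raw entry in type's namespace is a descriptor, not a string.
descr = vars(type)['__qualname__']
print(type(descr).__name__)       # getset_descriptor

# Normal attribute access runs the descriptor and yields a string.
print(type.__qualname__)          # type

# The same descriptor serves every class, since every class is an instance of type.
print(descr.__get__(int, type))   # int
```

Passing the raw descriptor to type() in a namespace is what triggers the "must be a str" error from the question.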
As for the reason why your type_copy isn't callable: This is because type.__call__ rejects anything that isn't a subclass of type. This is true for both the 3-argument form:
>>> type.__call__(type, 'x', (), {})
<class '__main__.x'>
>>> type.__call__(type_copy, 'x', (), {})
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: descriptor '__init__' for 'type' objects doesn't apply to 'type' object
As well as the single-argument form, which actually only works with type as its first argument:
>>> type.__call__(type, 3)
<class 'int'>
>>> type.__call__(type_copy, 3)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: type.__new__() takes exactly 3 arguments (1 given)
This isn't easy to circumvent. Fixing the 3-argument form is simple enough: We make the copy an empty subclass of type.
>>> type_copy = type('type_copy', (type,), {})
>>> type_copy('MyClass', (), {})
<class '__main__.MyClass'>
But the single-argument form of type is much peskier, since it only works if the first argument is type. We can implement a custom __call__ method, but that method must be written in the metaclass, which means type(type_copy) will be different from type(type).
>>> class TypeCopyMeta(type):
...     def __call__(self, *args):
...         if len(args) == 1:
...             return type(*args)
...         return super().__call__(*args)
...
>>> type_copy = TypeCopyMeta('type_copy', (type,), {})
>>> type_copy(3) # works
<class 'int'>
>>> type_copy('MyClass', (), {}) # also works
<class '__main__.MyClass'>
>>> type(type), type(type_copy) # but they're not identical
(<class 'type'>, <class '__main__.TypeCopyMeta'>)
There are two reasons why type is so difficult to copy:
1. It's implemented in C. You'll run into similar problems if you try to copy other builtin types like int or str.
2. type is an instance of itself:
>>> type(type)
<class 'type'>
This is something that's usually not possible. It blurs the line between class and instance. It's a chaotic accumulation of instance and class attributes. This is why __qualname__ is a string when accessed as type.__qualname__ but a descriptor when accessed as vars(type)['__qualname__'].
As you can see, it's not possible to make a perfect copy of type. Each implementation has different tradeoffs.
The easy solution is to make a subclass of type, which doesn't support the single-argument type(some_object) call:
import builtins
def copy_class(cls):
    # if it's a builtin class, copy it by subclassing
    if getattr(builtins, cls.__name__, None) is cls:
        namespace = {}
        bases = (cls,)
    else:
        namespace = dict(vars(cls))
        bases = cls.__bases__
    cls_copy = type(cls.__name__, bases, namespace)
    cls_copy.__qualname__ = cls.__qualname__
    return cls_copy
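A short usage sketch of the subclassing approach follows. The function name copy_class_sketch and the Point class are hypothetical. Popping '__dict__' and '__weakref__' is an addition that is not in the answer above: those entries are descriptors created for the original class, so this variant lets type() generate fresh ones for the copy.

```python
import builtins

def copy_class_sketch(cls):
    # Builtin classes are "copied" by creating an empty subclass.
    if getattr(builtins, cls.__name__, None) is cls:
        namespace, bases = {}, (cls,)
    else:
        namespace = dict(vars(cls))
        # Assumption: drop descriptors tied to the original class
        # so that type() creates new ones for the copy.
        namespace.pop('__dict__', None)
        namespace.pop('__weakref__', None)
        bases = cls.__bases__
    cls_copy = type(cls.__name__, bases, namespace)
    cls_copy.__qualname__ = cls.__qualname__
    return cls_copy

class Point:
    dims = 2
    def norm1(self):
        return abs(self.x) + abs(self.y)

PointCopy = copy_class_sketch(Point)
PointCopy.dims = 3            # rebinding on the copy leaves Point untouched

p = PointCopy()
p.x, p.y = 1, -2
print(Point.dims, PointCopy.dims)   # 2 3
print(p.norm1())                    # 3

IntCopy = copy_class_sketch(int)
print(IntCopy(5) + 1)               # 6
```

Note that the namespace copy is shallow, so mutable class attributes are still shared between the original and the copy.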
The elaborate solution is to make a custom metaclass:
import builtins
def copy_class(cls):
    if cls is type:
        namespace = {}
        bases = (cls,)
        class metaclass(type):
            def __call__(self, *args):
                if len(args) == 1:
                    return type(*args)
                return super().__call__(*args)
        metaclass.__name__ = type.__name__
        metaclass.__qualname__ = type.__qualname__
    # if it's a builtin class, copy it by subclassing
    elif getattr(builtins, cls.__name__, None) is cls:
        namespace = {}
        bases = (cls,)
        metaclass = type
    else:
        namespace = dict(vars(cls))
        bases = cls.__bases__
        metaclass = type
    cls_copy = metaclass(cls.__name__, bases, namespace)
    cls_copy.__qualname__ = cls.__qualname__
    return cls_copy

Related

Monkey-patching, duck-typing and argument self

When I try to monkey-patch a class with a method from another class, it doesn't work because the argument self isn't of the right type.
For example, let's say I like the result of the method __str__ provided by the fancy class A:
class A:
    def __init__(self, val):
        self.val=val
    def __str__(self):
        return "Fancy formatted %s"%self.val
and would like to reuse it for a boring class B:
class B:
    def __init__(self, val):
        self.val=val
That means:
>>> b=B("B")
>>> #first try:
>>> B.__str__=A.__str__
>>> str(b)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unbound method __str__() must be called with A instance as first argument (got nothing instead)
>>> #second try:
>>> B.__str__= lambda self: A.__str__(self)
>>> str(b)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 1, in <lambda>
TypeError: unbound method __str__() must be called with A instance as first argument (got B instance instead)
So in both cases it doesn't work because the argument self should be an instance of class A, but evidently isn't.
It would be nice to find a way to do the monkey-patching, but my actual question is: why is it necessary for the implicit parameter self to be an instance of the "right" class, rather than just relying on duck-typing?
Because of the way methods are attached to class objects in Python 2, the actual function object is hidden behind an unbound method, but you can access it using the im_func (aka __func__) attribute:
>>> B.__str__ = A.__str__.__func__
>>> str(B('stuff'))
'Fancy formatted stuff'
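For comparison, here is a minimal sketch of the same patch under Python 3, where unbound methods no longer exist and A.__str__ is already a plain function. The classes are the ones from the question.

```python
class A:
    def __init__(self, val):
        self.val = val

    def __str__(self):
        return "Fancy formatted %s" % self.val


class B:
    def __init__(self, val):
        self.val = val


# In Python 3 the attribute looked up on the class is the function itself,
# so it can be assigned directly and duck-typing applies to self.
B.__str__ = A.__str__
print(str(B("B")))  # Fancy formatted B
```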
Arguably, a better way to do this is using new-style classes and inheritance.
class MyStrMixin(object):
    def __str__(self):
        return "Fancy formatted %s" % self.val
Then inherit from MyStrMixin in both A and B, and just let the MRO do its thing.
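The mixin approach can be sketched as follows; the bodies of A and B are taken from the question, and only the base class is new.

```python
class MyStrMixin(object):
    def __str__(self):
        return "Fancy formatted %s" % self.val


class A(MyStrMixin):
    def __init__(self, val):
        self.val = val


class B(MyStrMixin):
    def __init__(self, val):
        self.val = val


# Both classes find __str__ on the mixin through the MRO.
print(str(A("a")))  # Fancy formatted a
print(str(B("b")))  # Fancy formatted b
```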

Is there any way to create a Python class method that does NOT pollute the attribute namespace of its instances?

I want to provide a method that can be used on a Python 2.7 class object, but does not pollute the attribute namespace of its instances. Is there any way to do this?
>>> class Foo(object):
...     @classmethod
...     def ugh(cls):
...         return 33
...
>>> Foo.ugh()
33
>>> foo = Foo()
>>> foo.ugh()
33
You could subclass the classmethod descriptor:
class classonly(classmethod):
    def __get__(self, obj, type):
        # compare against None so that falsy instances are rejected too
        if obj is not None: raise AttributeError
        return super(classonly, self).__get__(obj, type)
This is how it would behave:
class C(object):
    @classonly
    def foo(cls):
        return 42
>>> C.foo()
42
>>> c=C()
>>> c.foo()
AttributeError
Attribute access desugars to a descriptor call (more precisely, the default implementation of __getattribute__ invokes it):
>>> C.__dict__['foo'].__get__(None, C)
<bound method C.foo of <class '__main__.C'>>
>>> C.__dict__['foo'].__get__(c, type(c))
AttributeError
Required reading: Data Model — Implementing Descriptors and Descriptor HowTo Guide.
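A self-contained Python 3 sketch of the class-only descriptor is shown below. The error message text is an assumption added for readability; the rest follows the answer.

```python
class classonly(classmethod):
    """A classmethod that refuses to be looked up through an instance."""

    def __get__(self, obj, objtype=None):
        # obj is None when the attribute is accessed on the class itself.
        if obj is not None:
            raise AttributeError("class-only method accessed through an instance")
        return super().__get__(obj, objtype)


class C:
    @classonly
    def foo(cls):
        return 42


print(C.foo())              # 42
print(hasattr(C(), 'foo'))  # False
```

Because hasattr only catches AttributeError, raising that exception from __get__ is what makes the method invisible on instances.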
ugh is not in the namespace:
>>> foo.__dict__
{}
but the rules for attribute lookup fall back to the type of the instance for missing names. You can override Foo.__getattribute__ to prevent this.
class Foo(object):
    @classmethod
    def ugh(cls):
        return 33
    def __getattribute__(self, name):
        if name == 'ugh':
            raise AttributeError("Access to class method 'ugh' block from instance")
        return super(Foo,self).__getattribute__(name)
This produces:
>>> foo = Foo()
>>> foo.ugh()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "tmp.py", line 8, in __getattribute__
raise AttributeError("Access to class method 'ugh' block from instance")
AttributeError: Access to class method 'ugh' block from instance
>>> Foo.ugh()
33
You must use __getattribute__, which is called unconditionally on any attribute access, rather than __getattr__, which is only called after the normal lookup (which includes checking the type's namespace) fails.
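The difference between the two hooks can be seen in a small Python 3 sketch. The class names Demo and Strict and the attribute on_class are made up for illustration.

```python
class Demo:
    on_class = "found by normal lookup"

    def __getattr__(self, name):
        # Runs only after the normal lookup has already failed.
        return "fallback for %s" % name


class Strict:
    on_class = "found by normal lookup"

    def __getattribute__(self, name):
        # Runs on every attribute access made through an instance.
        if name == "on_class":
            raise AttributeError(name)
        return super().__getattribute__(name)


d = Demo()
print(d.on_class)                     # found by normal lookup
print(d.missing)                      # fallback for missing
print(hasattr(Strict(), "on_class"))  # False
print(Strict.on_class)                # found by normal lookup
```

Demo.__getattr__ never gets a chance to hide on_class, which is why the answer overrides __getattribute__ instead.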
Python has quasi-private names that use name mangling to reduce accidental access. Inside a class body, identifiers of the form __name are rewritten to _ClassName__name. Python applies the rewrite when compiling the methods of that class, using that class's own name, so code in a subclass gets a different mangled name.
I can use the private method in a class
>>> class A(object):
...     def __private(self):
...         print('boo')
...     def hello(self):
...         self.__private()
...
>>>
>>> A().hello()
boo
But not outside the class
>>> A().__private()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'A' object has no attribute '__private'
>>>
Or in subclasses
>>> class B(A):
...     def hello2(self):
...         self.__private()
...
>>>
>>> B().hello()
boo
>>> B().hello2()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 3, in hello2
AttributeError: 'B' object has no attribute '_B__private'
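The mangled name is still an ordinary attribute, so this is a convention rather than real hiding. A minimal sketch, with the method returning the string instead of printing it so the result is easy to check:

```python
class A:
    def __private(self):
        return 'boo'


a = A()
# String lookups are never mangled, so the short name is not found...
print(hasattr(a, '__private'))    # False
# ...but the mangled name is reachable from anywhere.
print(hasattr(a, '_A__private'))  # True
print(a._A__private())            # boo
```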
Yes, you can create the method in the metaclass.
class FooMeta(type):
    # No @classmethod here
    def ugh(cls):
        return 33
class Foo(object):
    __metaclass__ = FooMeta
Foo.ugh() # returns 33
Foo().ugh() # AttributeError
Note that metaclasses are a power feature, and their use is discouraged if unnecessary. In particular, multiple inheritance requires special care if the parent classes have different metaclasses.
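The __metaclass__ attribute is Python 2 syntax. Under Python 3 the same idea would be written with the metaclass keyword, roughly like this:

```python
class FooMeta(type):
    # A plain method on the metaclass is visible on the class,
    # but not on instances of the class.
    def ugh(cls):
        return 33


class Foo(metaclass=FooMeta):
    pass


print(Foo.ugh())              # 33
print(hasattr(Foo(), 'ugh'))  # False
```

Instance lookup searches type(obj) and its bases, never the metaclass, which is why ugh stays out of the instance namespace.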

How to determine which Python class provides attributes when inheriting

What's the easiest way to determine which Python class defines an attribute when inheriting? For example, say I have:
class A(object):
    defined_in_A = 123
class B(A):
    pass
a = A()
b = B()
and I wanted this code to pass:
assert hasattr(a, 'defined_in_A')
assert hasattr(A, 'defined_in_A')
assert hasattr(b, 'defined_in_A')
assert hasattr(B, 'defined_in_A')
assert defines_attribute(A, 'defined_in_A')
assert not defines_attribute(B, 'defined_in_A')
How would I implement the fictional defines_attribute function? My first thought would be to walk through the entire inheritance chain, and use hasattr to check for the attribute's existence, with the deepest match assumed to be the definer. Is there a simpler way?
(Almost) every Python object has its own instance variables (the instance variables of a class object are what we usually call class variables). To get them as a dictionary, use the vars function and check for membership in it:
>>> "defined_in_A" in vars(A)
True
>>> "defined_in_A" in vars(B)
False
>>> "defined_in_A" in vars(a) or "defined_in_A" in vars(b)
False
The issue is that this does not work on instances of classes that use __slots__, or on built-in objects, since those store instance variables differently:
class A(object):
    __slots__ = ("x","y")
    defined_in_A = 123
>>> A.x
<member 'x' of 'A' objects>
>>> "x" in vars(a)
Traceback (most recent call last):
File "<pyshell#5>", line 1, in <module>
"x" in vars(a)
TypeError: vars() argument must have __dict__ attribute
>>> vars(1) #or floats or strings will raise the same error
Traceback (most recent call last):
...
TypeError: vars() argument must have __dict__ attribute
I'm not sure there is a simple workaround for this case.
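One possible sketch of defines_attribute checks the class namespace rather than the instance, since a class always has a __dict__ even when its instances use __slots__. The helper defining_class is a hypothetical extra that walks the MRO; neither function considers attributes stored on an individual instance.

```python
def defines_attribute(cls, name):
    """True if cls itself (not a base class) defines name."""
    return name in vars(cls)


def defining_class(obj, name):
    """Return the first class in the MRO whose namespace holds name, or None."""
    cls = obj if isinstance(obj, type) else type(obj)
    for klass in cls.__mro__:
        if name in vars(klass):
            return klass
    return None


class A(object):
    __slots__ = ("x", "y")
    defined_in_A = 123


class B(A):
    pass


print(defines_attribute(A, 'defined_in_A'))     # True
print(defines_attribute(B, 'defined_in_A'))     # False
print(defining_class(B(), 'defined_in_A') is A) # True
print(defining_class(B, 'x') is A)              # True
```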

How do you hide from hasattr?

Let's say a function looks at an object and checks if it has a function a_method:
def func(obj):
    if hasattr(obj, 'a_method'):
        ...
    else:
        ...
I have an object whose class defines a_method, but I want to hide it from hasattr. I don't want to change the implementation of func to achieve this hiding, so what hack can I do to solve this problem?
If the method is defined on the class you appear to be able to remove it from the __dict__ for the class. This prevents lookups (hasattr will return false). You can still use the function if you keep a reference to it when you remove it (like the example) - just remember that you have to pass in an instance of the class for self, it's not being called with the implied self.
>>> class A:
...     def meth(self):
...         print "In method."
...
...
>>>
>>> a = A()
>>> a.meth
<bound method A.meth of <__main__.A instance at 0x0218AB48>>
>>> fn = A.__dict__.pop('meth')
>>> hasattr(a, 'meth')
False
>>> a.meth
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: A instance has no attribute 'meth'
>>> fn()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: meth() takes exactly 1 argument (0 given)
>>> fn(a)
In method.
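The session above relies on a Python 2 old-style class, whose __dict__ is a plain dict. For new-style classes and for Python 3, the class __dict__ is a read-only mappingproxy, so a comparable sketch would delete the attribute through the class instead:

```python
class A:
    def meth(self):
        return "In method."


a = A()
fn = vars(A)['meth']   # keep a reference to the plain function
del A.meth             # vars(A) has no pop(), so delete via the class

print(hasattr(a, 'meth'))  # False
print(fn(a))               # In method.
```

This removes the method for every instance of A, so it only fits cases where nothing else depends on a.meth.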
You could redefine the hasattr function. Below is an example.
saved_hasattr = hasattr
def hasattr(obj, method):
    if method == 'MY_METHOD':
        return False
    else:
        return saved_hasattr(obj, method)
Note that you probably want to implement more detailed checks than just checking the method name. For example checking the object type might be beneficial.
Try this:
class Test(object):
    def __hideme(self):
        print 'hidden'
t = Test()
print hasattr(t,"__hideme") #prints False....
I believe this works because of the double-underscore name mangling, which hides class members from the outside world. Unless someone has a strong argument against this, I'd think this is way better than popping stuff off from __dict__. Thoughts?

Creating immutable types in Python using __new__()

My question is pretty simple, I have:
class upperstr(str):
    def __new__(cls, arg):
        return str.__new__(cls, str(arg).upper())
Why, if my __new__() method directly uses an instance of an immutable type (str), are instances of my new type (upperstr) mutable?
>>> s = str("text")
>>> "__dict__" in dir(s)
False
>>> s = upperstr("text")
>>> "__dict__" in dir(s)
True
At what stage does the interpreter set the __dict__ attribute on upperstr instances if I'm only overriding the __new__() method?
Thanks!
Instances of user-defined classes in Python have a __dict__ attribute by default, even if you don't override anything at all:
>>> x = object()
>>> x.__dict__
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'object' object has no attribute '__dict__'
>>> class MyObject(object):
...     pass
...
>>> x = MyObject()
>>> x.__dict__
{}
If you don't want a new-style class to have a __dict__, use __slots__ (documentation, related SO thread):
>>> class MyObject(object):
...     __slots__ = []
...
>>> x = MyObject()
>>> x.__dict__
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'MyObject' object has no attribute '__dict__'
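Applied to the class from the question, an empty __slots__ would look roughly like this (Python 3 sketch; the attribute name note is made up for the demonstration):

```python
class upperstr(str):
    __slots__ = ()  # no per-instance __dict__, so no new attributes

    def __new__(cls, arg):
        return str.__new__(cls, str(arg).upper())


s = upperstr("text")
print(s)                        # TEXT
print(hasattr(s, '__dict__'))   # False

try:
    s.note = 1
except AttributeError:
    print("no attribute assignment")
```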
