I am a bit puzzled right now about the following:
import weakref

class A:
    def __init__(self, p):
        self.p = p

    def __repr__(self):
        return f"{type(self).__name__}(p={self.p!r})"

a = A(1)
proxy_a = weakref.proxy(a)

print(repr(proxy_a))
# '<weakproxy at 0x7f2ea2fc1b80 to A at 0x7f2ea2fee610>'
print(proxy_a.__repr__())
# 'A(p=1)'
Why does repr(proxy_a) return a representation of the proxy while proxy_a.__repr__() returns the representation of the original object? Shouldn't the two calls boil down to the same thing? And which __repr__ implementation is actually called by using repr(proxy_a)?
repr(proxy_a) calls the default C implementation of repr for the weakref.proxy object, while proxy_a.__repr__() is forwarded to the __repr__ of the underlying a object.
Yes, I would expect them to execute the same code. But then, don't we also expect the proxy to forward attribute lookups and method calls to the proxied object? At the same time, I would want to be able to see that a proxy is a proxy, so the repr(proxy_a) result makes sense too. So it is not even clear what the right behaviour should be.
Information on this is scarce, but it looks like weakref.proxy objects do not replace the original objects in a completely transparent way, contrary to common expectations.
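Interestingly, the asymmetry seems specific to repr: in CPython the proxy type forwards str() to the referent but implements repr() itself. A minimal sketch:

import weakref

class A:
    def __repr__(self):
        return "A()"

a = A()
p = weakref.proxy(a)
print(str(p))    # 'A()' -- str is forwarded to the referent
print(repr(p))   # '<weakproxy at 0x... to A at 0x...>' -- the proxy's own repr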
I've added a few print lines to make things clearer. Note, in the last line, that it is possible to reach the weakly referenced object and its methods through the __self__ attribute of a bound method obtained from the proxy.
import weakref

class A:
    def __init__(self, p):
        self.__p = p

    def __repr__(self):
        return f"{type(self).__name__}(p={self.__p!r})"

a = A(1)
proxy_a = weakref.proxy(a)

print(repr(proxy_a))
# '<weakproxy at 0x7f2ea2fc1b80 to A at 0x7f2ea2fee610>'
print(proxy_a.__repr__())
# 'A(p=1)'
print(proxy_a.__repr__)
# <bound method A.__repr__ of A(p=1)>
print(type(proxy_a))
# <class 'weakproxy'>
print(type(proxy_a.__repr__.__self__))
# <class '__main__.A'>
print(proxy_a.__repr__.__self__.__repr__())
# A(p=1)
See also this complete thread, python-dereferencing-weakproxy, and some of the entries in this (old) thread from the Python bug tracker, where weakref.proxy and method delegation are mentioned in several places.
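As a practical workaround, if you need the referent itself rather than a transparent stand-in, a plain weakref.ref makes the dereference explicit (a minimal sketch):

import weakref

class A:
    def __repr__(self):
        return "A()"

a = A()
r = weakref.ref(a)   # a weak *reference*, not a proxy
obj = r()            # explicit dereference; None if `a` has been collected
print(repr(obj))     # 'A()' -- the referent's own repr, no proxy in between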
Related
The documentation about super() found on the Python website says it returns a proxy object that delegates method calls to a parent or sibling class. The information found in super considered super, how does super() work with multiple inheritance, and super considered harmful explains that, in fact, the next method in the MRO is used. My question is: what happens if super(object, self).some_method() is used? Since object typically appears at the end of an MRO list, I would guess the search hits the end immediately and raises an exception. But in fact, it seems that the methods of the proxy itself are called, as shown by super(object, self).__repr__() showing the super object itself. I wonder whether super() with object simply does not delegate at all.
If this is the case, I wonder whether any reliable material ever mentions it, and whether it applies to other Python implementations.
class X(object):
    def __init__(self):
        # This shows [X, object].
        print X.mro()
        # This shows a bunch of attributes that a super object can have.
        print dir(super(object, self))
        # This shows something similar to <super object at xxx>.
        print super(object, self)
        # This fails with `super() takes at least one argument`.
        try:
            super(object, self).__init__()
        except TypeError:
            pass
        # This shows something like <super <class 'object'>, <'X' object>>.
        print super(object, self).__repr__()
        # This shows the repr() of object, like <'X' object at xxx>.
        print super(X, self).__repr__()

if __name__ == '__main__':
    X()
If super doesn't find something while looking through the method resolution order (MRO) to delegate to (or if you're looking for the attribute __class__) it will check its own attributes.
Because object is always the last type in the MRO (at least to my knowledge it's always the last one) you effectively disabled the delegation and it will only check the super instance.
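To see the difference side by side (a quick sketch):

class X(object):
    pass

x = X()
print(super(X, x).__repr__())       # delegates to object.__repr__:
                                    # <__main__.X object at 0x...>
print(super(object, x).__repr__())  # MRO exhausted, so super answers itself:
                                    # <super: <class 'object'>, <X object>>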
I found the question really interesting, so I went to the source code of super, in particular the delegation part (super.__getattribute__, in CPython 3.6.5), and translated it (roughly) to pure Python, accompanied by some additional comments of my own:
class MySuper(object):
    def __init__(self, klass, instance):
        self.__thisclass__ = klass
        self.__self__ = instance
        self.__self_class__ = type(instance)

    def __repr__(self):
        # That's not in the original implementation, it's here for fun.
        return 'hoho'

    def __getattribute__(self, name):
        su_type = object.__getattribute__(self, '__thisclass__')
        su_obj = object.__getattribute__(self, '__self__')
        su_obj_type = object.__getattribute__(self, '__self_class__')
        starttype = su_obj_type

        # If asked for the __class__ don't go looking for it in the MRO!
        if name == '__class__':
            return object.__getattribute__(self, '__class__')

        mro = starttype.mro()
        n = len(mro)
        # Find the passed-in class in the MRO (the last entry is never
        # checked because it is skipped anyway).
        for i in range(0, n - 1):
            if mro[i] is su_type:
                break
        else:
            # Not found: the C loop leaves i == n - 1 in this case.
            i = n - 1
        # Skip past the passed-in class itself, matching the extra i++
        # after the loop in the C code.
        i += 1
        # We're at the end of the MRO. Check if super has this attribute.
        if i >= n:
            return object.__getattribute__(self, name)

        # Go up the MRO.
        while True:
            tmp = mro[i]
            dict_ = tmp.__dict__
            try:
                res = dict_[name]
            except KeyError:
                pass
            else:
                # We found a match, now go through the descriptor protocol
                # so that we get a bound method (or whatever is applicable)
                # for this attribute.
                f = type(res).__get__
                res = f(res, None if su_obj is starttype else su_obj, starttype)
                return res
            i += 1
            # Not really the nicest construct, but it's a do-while loop
            # in the C code and I feel like this is the closest Python
            # representation of that.
            if i < n:
                continue
            else:
                break

        return object.__getattribute__(self, name)
As you can see there are some ways you could end up looking up the attribute on super:
If you're looking for the __class__ attribute (a quick demo follows this list)
If you reached the end of the MRO immediately (by passing in object as first argument)!
If __getattribute__ couldn't find a match in the remaining MRO.
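A quick demonstration of the first case (a sketch):

class X(object):
    pass

x = X()
s = super(X, x)
print(s.__class__)   # <class 'super'> -- answered by super itself, never delegated
print(x.__class__)   # <class '__main__.X'>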
Actually, because MySuper works like super, you can use it instead (at least as far as attribute delegation is concerned):
class X(object):
    def __init__(self):
        print(MySuper(object, self).__repr__())

X()
That will print the hoho from the MySuper.__repr__. Feel free to experiment with that code by inserting some prints to follow the control flow.
I wonder whether any reliable material ever mentions it, and whether it applies to other Python implementations.
What I said above was based on my observations of the CPython 3.6 source, but I think it shouldn't be too different for other Python versions given that the other Python implementations (often) follow CPython.
In fact I also checked:
CPython 2
PyPy (Python 2)
IronPython (Python 2)
And all of them return the __repr__ of super.
Note that Python follows the "we are all consenting adults" style, so I would be surprised if someone bothered to formalize such unusual usage. I mean, who would try to delegate to a method of a sibling or parent class of object (the "ultimate" parent class)?
super defines a few of its own attributes and needs a way to provide access to them. First, it uses the __dunder__ style, which Python reserves for itself: no library or application should define names that start and end with double underscores. This means the super object can be confident that nothing will clash with its attributes __self__, __self_class__ and __thisclass__. So if it searches the MRO and doesn't find the requested attribute, it falls back on trying to find the attribute on the super object itself. For instance:
>>> class A:
...     pass
...
>>> class B(A):
...     pass
...
>>> s = super(A, B())
>>> s.__self__
<__main__.B object at 0x03BE4E70>
>>> s.__self_class__
<class '__main__.B'>
>>> s.__thisclass__
<class '__main__.A'>
Since you have specified object as the type to start looking beyond, and because object is always the last type in the MRO, there is no possible candidate from which to fetch the method or attribute. In this situation super behaves as if it had tried various types looking for the name but didn't find one, so it tries to fetch the attribute from itself. However, since the super object is also an object, it has access to __init__, __repr__ and everything else object defines. And so super returns its own __init__ and __repr__ methods to you.
This is kind of a situation of "ask a silly question (of super) and get a silly answer". That is, super should only ever be called with the first argument being the class in which the function was defined. When you call it with object, you get undefined behaviour.
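For contrast, a minimal sketch of the intended usage, where the first argument is the class the method is defined in:

class Base(object):
    def greet(self):
        return "Base"

class Child(Base):
    def greet(self):
        # Correct: name the class this method is defined in.
        return "Child -> " + super(Child, self).greet()

print(Child().greet())   # Child -> Base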
I want to call a method depending on an event code. How can I do that in Python?
I wrote the code below, but it doesn't work, and I suspect I am only one step away from making it work.
class Thing(object):
    @classmethod
    def do1(cls):
        print 'do1'

    @classmethod
    def do2(cls):
        print 'do2'

    eventToMethod = {'1': do1,
                     '2': do2}

    @classmethod
    def onEvent(cls, name):
        method = cls.eventToMethod.get(name)
        if method != None:
            method()

Thing.onEvent('1')
However, I get the following error and have no idea how to call classmethods in a Pythonic way.
TypeError: 'classmethod' object is not callable
Can you help with this simple problem?
You need to make some changes to eventToMethod first: don't store do1 and do2 in it; store strings instead. You can always access class attributes using strings. The problem with storing references to do1 and do2 in the dictionary is that they are not bound methods yet (they're simply classmethod objects, i.e. non-data descriptors) when you store them; they only become bound class methods when accessed through the class, after the class definition has completed.
eventToMethod = {'1': 'do1',
                 '2': 'do2'}

And then use getattr to get the method:

@classmethod
def onEvent(cls, name):
    method = getattr(cls, cls.eventToMethod.get(name))
    ...
Note that you can also directly pass 'do1' to onEvent instead of keeping a dictionary to store names and then simply use:
method = getattr(cls, name)
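Putting it together, a minimal runnable version of the string-based dispatch might look like this (a sketch, keeping the question's Python 2 style):

class Thing(object):
    @classmethod
    def do1(cls):
        print 'do1'

    @classmethod
    def do2(cls):
        print 'do2'

    eventToMethod = {'1': 'do1',
                     '2': 'do2'}

    @classmethod
    def onEvent(cls, name):
        method_name = cls.eventToMethod.get(name)
        if method_name is not None:
            getattr(cls, method_name)()

Thing.onEvent('1')  # prints: do1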
You can still get away with your current approach if you call the __get__ method of the do1 and do2 descriptors explicitly:
method = cls.eventToMethod.get(name)
if method is not None:
    method.__get__(None, cls)()
This works because this is exactly what Python does under the hood: classmethod is a non-data descriptor, and when you do Thing.do1, Python actually calls the __get__ method of do1 with the first argument as None and the second as the type:
>>> Thing.do1.__get__(None, Thing)
<bound method type.do1 of <class '__main__.Thing'>>
>>> Thing.do1
<bound method type.do1 of <class '__main__.Thing'>>
>>> Thing.eventToMethod['1'].__get__(None, Thing)  # Using OP's code.
<bound method type.do1 of <class '__main__.Thing'>>
While I understand that this doesn't answer your question directly, I thought it might be useful to see an alternative.
Often it's possible to use reflection to calculate the correct method at runtime. For example:
@classmethod
def onEvent(cls, name):
    try:
        method = getattr(cls, 'do%s' % name)
    except AttributeError:
        return
    method()
This approach may be useful if you are able to follow a strict naming convention in your methods (as in the example, where you seem to prefix them with 'do'). It's similar to how PyUnit detects the set of test cases to run.
This avoids the need to maintain a dict, which may get out of sync with the actual methods on the object. It also arguably leads to clearer code.
It's also worth pointing out that if you are attempting to do some kind of event-driven programming, there are libraries/frameworks that help facilitate this:
Example:
#!/usr/bin/env python
from circuits import Component, Event

class Thing(Component):
    def do1(self):
        print("do1")

    def do2(self):
        print("do2")

    def started(self, manager):
        self.fire(Event.create("do1"))
        self.fire(Event.create("do2"))
        raise SystemExit(0)

Thing().run()
Output:
$ python foo.py
do1
do2
Disclaimer: I'm the author of circuits
I have some working code (a library) of which, in some situations, I only need a small subset of the functionality.
Thinking of a simpler case, the code (library) is a class that takes a few parameters when initializing.
For my limited use case, many of those parameters are not vital as they are not directly used in the internal calculation (some parameters are only used when I call particular methods of the object), while it is very hard to prepare those parameters properly.
So I am wondering if there is any easy way to know which parameters are essential without fully analyzing the library code (which is too complicated). For example, I could pass fake parameters to the API, and an exception would be raised only if they are actually used.
For example, I can pass in some_parameter = None for a some_parameter that I guess won't be used. Then, whenever the library tries to access some_parameter.some_field, an exception is raised, so I can look into the issue and substitute the actual parameter. However, this changes the behavior of the library if the code itself accepts None as a parameter.
Is there any established approach to this problem? I don't mind false positives, as I can always look into the problem and manually check whether the library's usage of the fake parameter is trivial.
For those suggesting I read the documentation and code: I don't have documentation! And the code is legacy code left by previous developers.
Update
@sapi: yes, I would like to use the proxy pattern/object; I will investigate this topic further.
"A virtual proxy is a placeholder for "expensive to create" objects. The real object is only created when a client first requests/accesses the object."
I am assuming all classes in question are new-style. This is always the case if you are using Python 3; in Python 2, they must extend from object. You can check a class with isinstance(MyClass, type). For the remainder of my answer, I will assume Python 3, since it was not specified. If you are using Python 2, make sure to extend from object where no other base class is specified.
If those conditions hold, you can write a descriptor that raises an exception whenever it is accessed:
class ParameterUsed(Exception):
    pass

class UsageDescriptor:
    def __init__(self, name):
        super(UsageDescriptor, self).__init__()
        self.name = name

    def __get__(self, instance, owner):
        raise ParameterUsed(self.name)

    def __set__(self, instance, value):
        # Ignore sets if the value is None.
        if value is not None:
            raise ParameterUsed(self.name)

    def __delete__(self, instance):
        # Ignore deletes.
        pass
I will assume we are using this class as an example:
class Example:
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def use_a(self):
        print(self.a)

    def use_b(self):
        print(self.b)
If we want to see if a is used anywhere, extend the class and put an instance of our descriptor on the class:
class ExtExample(Example):
    a = UsageDescriptor('a')
Now if we were to try to use the class, we can see which methods use a:
>>> example = ExtExample(None, None)
>>> example.use_a()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ParameterUsed: a
>>> example.use_b()
None
Here, we can see that use_a tried to use a (raising an exception because it did), but use_b did not (it completed successfully).
This approach works more generally than sapi’s does: in particular, sapi’s approach will only detect an attribute being accessed on the object. But there are plenty of things you can do that do not access attributes on that object. This approach, rather than detecting attributes being accessed on that object, detects the object itself being accessed.
Depending on what you're looking to achieve, you may be able to pass in a proxy object which throws an exception when accessed.
For example:
class ObjectUsedException(Exception):
pass
class ErrorOnUseProxy(object):
def __getattr__(self, name):
raise ObjectUsedException('Tried to access %s'%name)
Of course, that approach will fail in two pretty common situations:
if the library itself checks whether the attribute exists (e.g., to provide some default value)
if it's treated as a primitive (float, string, etc.), though you could modify this approach to take that into account (see the sketch below)
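A quick illustration of the second pitfall, assuming Python 3 (implicit special-method lookup goes through the type and bypasses __getattr__, so the proxy raises the "wrong" exception):

proxy = ErrorOnUseProxy()

try:
    proxy.some_attr            # ordinary attribute access: our exception fires
except ObjectUsedException as e:
    print(e)                   # Tried to access some_attr

try:
    float(proxy)               # implicit lookup of __float__ skips __getattr__
except TypeError as e:
    print(e)                   # TypeError from float(), not ObjectUsedException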
I believe the simplest and least intrusive way is to turn the parameters into properties:
import sys

class Foo(object):
    def __init__(self):
        pass

    @property
    def a(self):
        print >>sys.stderr, 'Accessing parameter a'
        return 1

bar = Foo()
print bar.a == 1
This prints True to stdout, and Accessing parameter a to stderr. You would have to tweak it to allow the class to change the value.
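One possible tweak along those lines (a sketch, keeping the answer's Python 2 style; the backing attribute _a is my own naming) that also allows and logs writes:

import sys

class Foo(object):
    def __init__(self):
        self._a = 1

    @property
    def a(self):
        print >>sys.stderr, 'Accessing parameter a'
        return self._a

    @a.setter
    def a(self, value):
        print >>sys.stderr, 'Setting parameter a'
        self._a = value

bar = Foo()
bar.a = 2          # logs 'Setting parameter a'
print bar.a == 2   # logs 'Accessing parameter a', prints True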
I was going to ask "How to pickle a class that inherits from dict and defines __slots__". Then I realized the utterly mind-wrenching solution in class B below actually works...
import pickle

class A(dict):
    __slots__ = ["porridge"]
    def __init__(self, porridge):
        self.porridge = porridge

class B(A):
    __slots__ = ["porridge"]
    def __getstate__(self):
        # Returning the very item being pickled in 'self'??
        return self, self.porridge
    def __setstate__(self, state):
        print "__setstate__(%s) type(%s, %s)" % (state, type(state[0]),
                                                 type(state[1]))
        self.update(state[0])
        self.porridge = state[1]
Here is some output:
>>> saved = pickle.dumps(A(10))
TypeError: a class that defines __slots__ without defining __getstate__ cannot be pickled
>>> b = B('delicious')
>>> b['butter'] = 'yes please'
>>> loaded = pickle.loads(pickle.dumps(b))
__setstate__(({'butter': 'yes please'}, 'delicious')) type(<class '__main__.B'>, <type 'str'>)
>>> b
{'butter': 'yes please'}
>>> b.porridge
'delicious'
So basically, pickle cannot pickle a class that defines __slots__ without also defining __getstate__. That is a problem if the class inherits from dict: how do you return the contents of the instance without returning self, the very instance pickle is already trying to pickle and cannot pickle without calling __getstate__? Notice how __setstate__ actually receives an instance of B as part of the state.
Well, it works... but can someone explain why? Is it a feature or a bug?
Maybe I'm a bit late to the party, but this question didn't get an answer that actually explains what's happening, so here we go.
Here's a quick summary for those who don't want to read this whole post (it got a bit long...):
You don't need to take care of the contained dict instance in __getstate__() -- pickle will do this for you.
If you include self in the state anyway, pickle's cycle detection will prevent an infinite loop.
Writing __getstate__() and __setstate__() methods for custom classes derived from dict
Let's start with the right way to write the __getstate__() and __setstate__() methods of your class. You don't need to take care of pickling the contents of the dict instance contained in B instances -- pickle knows how to deal with dictionaries and will do this for you. So this implementation will be enough:
class B(A):
    __slots__ = ["porridge"]
    def __getstate__(self):
        return self.porridge
    def __setstate__(self, state):
        self.porridge = state
Example:
>>> a = B("oats")
>>> a[42] = "answer"
>>> b = pickle.loads(pickle.dumps(a))
>>> b
{42: 'answer'}
>>> b.porridge
'oats'
What's happening in your implementation?
Why does your implementation work as well, and what's happening under the hood? That's a bit more involved, but -- once we know that the dictionary gets pickled anyway -- not too hard to figure out. If the pickle module encounters an instance of a user-defined class, it calls the __reduce__() method of this class, which in turn calls __getstate__() (actually, it usually calls the __reduce_ex__() method, but that does not matter here). Let's define B again as you originally did, i.e. using the "recursive" definition of __getstate__(), and let's see what we get when calling __reduce__() for an instance of B now:
>>> a = B("oats")
>>> a[42] = "answer"
>>> a.__reduce__()
(<function _reconstructor at 0xb7478454>,
(<class '__main__.B'>, <type 'dict'>, {42: 'answer'}),
({42: 'answer'}, 'oats'))
As we can see from the documentation of __reduce__(), the method returns a tuple of 2 to 5 elements. The first element is a function that will be called to reconstruct the instance when unpickling, the second element is the tuple of arguments that will be passed to this function, and the third element is the return value of __getstate__(). We can already see that the dictionary information is included twice. The function _reconstructor() is an internal function of the copy_reg module that reconstructs the base class before __setstate__() is called when unpickling. (Have a look at the source code of this function if you like -- it's short!)
Now the pickler needs to pickle the return value of a.__reduce__(). It basically pickles the three elements of this tuple one after the other. The second element is a tuple again, and its items are also pickled one after the other. The third item of this inner tuple (i.e. a.__reduce__()[1][2]) is of type dict and is pickled using the internal pickler for dictionaries. The third element of the outer tuple (i.e. a.__reduce__()[2]) is also a tuple again, consisting of the B instance itself and a string. When pickling the B instance, the cycle detection of the pickle module kicks in: pickle realises this exact instance has already been dealt with, and only stores a reference to its id() instead of really pickling it -- this is why no infinite loop occurs.
When unpickling this mess again, the unpickler first reads the reconstruction function and its arguments from the stream. The function is called, resulting in an B instance with the dictionary part already initialised. Next, the unpickler reads the state. It encounters a tuple consisting of a reference to an already unpickled object -- namely our instance of B -- and a string, "oats". This tuple now is passed to B.__setstate__(). The first element of state and self are the same object now, as can be seen by adding the line
print self is state[0]
to your __setstate__() implementation (it prints True!). The line
self.update(state[0])
consequently simply updates the instance with itself.
Here's the thinking as I understand it. If your class uses __slots__, it's a way to guarantee that there aren't any unexpected attributes. Unlike a regular Python object, one implemented with slots cannot have attributes dynamically added to it.
When Python unserializes an object with __slots__, it doesn't want to just make an assumption that whatever attributes were in the serialized version are compatible with your runtime class. So it punts that off to you, and you can implement __getstate__ and __setstate__.
But the way you implemented your __getstate__ and __setstate__, you appear to be circumventing that check. Here's the code that raises that exception:
try:
    getstate = self.__getstate__
except AttributeError:
    if getattr(self, "__slots__", None):
        raise TypeError("a class that defines __slots__ without "
                        "defining __getstate__ cannot be pickled")
    try:
        dict = self.__dict__
    except AttributeError:
        dict = None
else:
    dict = getstate()
In a roundabout way, you're telling the pickle module to set its objections aside and serialize and deserialize your objects as normal.
That may or may not be a good idea -- I'm not sure. But I think that could come back to bite you if, for example, you change your class definition and then unserialize an object with a different set of attributes than what your runtime class expects.
That's why, especially when using slots, your __getstate__ and __setstate__ should be more explicit. I would be explicit and make clear that you're just sending the dictionary key/values back and forth, like this:
class B(A):
    __slots__ = ["porridge"]
    def __getstate__(self):
        return dict(self), self.porridge
    def __setstate__(self, state):
        self.update(state[0])
        self.porridge = state[1]
Notice the dict(self) -- it casts your object to a plain dict, which makes sure that the first element in your state tuple is only your dictionary data.
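A quick round trip with this explicit version behaves the same as before (a sketch; the output assumes Python 2, matching the rest of this thread):

>>> b = B('delicious')
>>> b['butter'] = 'yes please'
>>> loaded = pickle.loads(pickle.dumps(b))
>>> loaded
{'butter': 'yes please'}
>>> loaded.porridge
'delicious'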
I have a project where I'm trying to use weakrefs with callbacks, and I don't understand what I'm doing wrong. I have created a simplified test that shows the exact behavior I'm confused by.
Why is it that in this test test_a works as expected, but the weakref for self.MyCallbackB disappears between class initialization and the call to test_b? I thought that as long as the instance (a) exists, the reference to self.MyCallbackB should exist, but it doesn't.
import weakref

class A(object):
    def __init__(self):
        def MyCallbackA():
            print 'MyCallbackA'
        self.MyCallbackA = MyCallbackA

        self._testA = weakref.proxy(self.MyCallbackA)
        self._testB = weakref.proxy(self.MyCallbackB)

    def MyCallbackB(self):
        print 'MyCallbackB'

    def test_a(self):
        self._testA()

    def test_b(self):
        self._testB()

if __name__ == '__main__':
    a = A()
    a.test_a()
    a.test_b()
You want a WeakMethod.
An explanation why your solution doesn't work can be found in the discussion of the recipe:
Normal weakref.refs to bound methods don't quite work the way one expects, because bound methods are first-class objects; weakrefs to bound methods are dead-on-arrival unless some other strong reference to the same bound method exists.
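For what it's worth, modern Python (3.4+) ships this recipe in the standard library as weakref.WeakMethod. A minimal sketch:

import weakref

class A:
    def callback(self):
        print('callback')

a = A()
wm = weakref.WeakMethod(a.callback)
m = wm()           # re-creates the bound method while `a` is alive
if m is not None:
    m()            # prints: callback
del a
print(wm())        # None -- the instance is gone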
According to the documentation for the weakref module:

In the following, the term referent means the object which is referred to by a weak reference.

A weak reference to an object is not enough to keep the object alive: when the only remaining references to a referent are weak references, garbage collection is free to destroy the referent and reuse its memory for something else.
What's happening with MyCallbackA is that you are holding a reference to it in the instances of A, thanks to:
self.MyCallbackA = MyCallbackA
Now, there is no reference to the bound method MyCallbackB in your code. It is held only in a.__class__.__dict__ as a plain function. Basically, a bound method is created (and returned to you) every time you do self.methodName. (AFAIK, a bound method works like a property, using a read-only descriptor, at least for new-style classes. I am sure something similar, i.e. without descriptors, happens for old-style classes; I'll leave it to someone more experienced to verify that claim.) So self.MyCallbackB dies as soon as the weakref is created, because there is no strong reference to it!
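You can see the core fact directly: each access through the instance creates a fresh bound-method object, while the function stored on the instance in __init__ stays put (a quick check, using the class above):

>>> a = A()
>>> a.MyCallbackB is a.MyCallbackB   # a new bound method on every access
False
>>> a.MyCallbackA is a.MyCallbackA   # stored on the instance, so stable
True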
My conclusions are based on:
import weakref

# trace is called when the object is deleted - see the weakref docs.
def trace(x):
    print "Del MycallbackB"

class A(object):
    def __init__(self):
        def MyCallbackA():
            print 'MyCallbackA'
        self.MyCallbackA = MyCallbackA
        self._testA = weakref.proxy(self.MyCallbackA)

        print "Create MyCallbackB"
        # To fix it, do -
        # self.MyCallbackB = self.MyCallbackB
        # The name on the LHS could be anything, even foo!
        self._testB = weakref.proxy(self.MyCallbackB, trace)
        print "Done playing with MyCallbackB"

    def MyCallbackB(self):
        print 'MyCallbackB'

    def test_a(self):
        self._testA()

    def test_b(self):
        self._testB()

if __name__ == '__main__':
    a = A()
    #print a.__class__.__dict__["MyCallbackB"]
    a.test_a()
Output
Create MyCallbackB
Del MycallbackB
Done playing with MyCallbackB
MyCallbackA
Note: I tried verifying this for old-style classes. It turned out that print a.test_a.__get__ outputs

<method-wrapper '__get__' of instancemethod object at 0xb7d7ffcc>

for both new- and old-style classes. So it may not really be a descriptor, just something descriptor-like. In any case, the point is that a bound-method object is created when you access an instance method through self, and unless you maintain a strong reference to it, it will be deleted.
The other answers address the why in the original question, but either don't provide a workaround or refer to external sites.
After working through several other posts on StackExchange on this topic, many of which are marked as duplicates of this question, I finally arrived at a succinct workaround. When I know the nature of the object I'm dealing with, I use the weakref module; when I might instead be dealing with a bound method (as happens in my code with event callbacks), I now use the following WeakRef class as a drop-in replacement for weakref.ref(). I've tested this with Python 2.4 through Python 2.7, but not on Python 3.x.
import weakref

class WeakRef:
    def __init__(self, item):
        try:
            # Bound method: keep weak references to the underlying
            # function and to the instance separately.
            self.method = weakref.ref(item.im_func)
            self.instance = weakref.ref(item.im_self)
        except AttributeError:
            # Not a bound method: fall back to an ordinary weakref.
            self.reference = weakref.ref(item)
        else:
            self.reference = None

    def __call__(self):
        if self.reference is not None:
            return self.reference()
        instance = self.instance()
        if instance is None:
            return None
        method = self.method()
        return getattr(instance, method.__name__)
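A hypothetical usage sketch (Python 2, since im_func/im_self predate Python 3; the Target class is my own example):

class Target(object):
    def handler(self):
        print 'handler called'

t = Target()
ref = WeakRef(t.handler)   # safe: does not keep `t` alive
cb = ref()                 # re-binds the method if `t` still exists
if cb is not None:
    cb()                   # prints: handler called
del t, cb
print ref()                # None once the instance has been collected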