Why do methods include strong-references to instances/classes? - python

If you examine a method in Python, you'll find the im_class and im_self attributes. If you look closer, you'll see that these are strong references!
Maybe I'm alone in this, but the way I figure it, if methods depend on their respective class or instance (i.e., the self argument), then the method should "go down with its ship", no? Why would the authors choose to store strong references in the method objects instead of weak references? This forces users who want to avoid circular references to use workarounds. Does anyone have any use cases where strongly referencing the class/instance is preferable?
Example:
from weakref import proxy
class Foo(object):
    def func(self):
        pass
>>> foo = Foo()
>>> func = foo.func
>>> _foo = proxy(foo)
>>> func.im_self is _foo
False

This is only true of method wrappers, which are bound to a specific instance of a class. Using a strong reference avoids interesting surprises; for example, for a class X, func = X().foo; func() does what you'd expect (with a weak reference, the X() would be deleted and the subsequent func() call would fail unexpectedly).
If you want to have "weak referenced methods", the simplest way is to pass around a tuple of (function looked up on the class, weak reference to the instance), e.g. (X.foo, weakref.ref(x)) instead of x.foo.
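A minimal sketch of that tuple approach, written for Python 3. The class X, its method foo, and the helper call_weak are hypothetical names used only for illustration; the helper simply returns None once the instance is gone.

```python
import gc
import weakref


class X:
    def foo(self):
        return "foo called"


def call_weak(pair):
    """Call the function half of the pair if the instance is still alive."""
    func, ref = pair
    obj = ref()          # dereference the weak reference
    if obj is None:      # the instance has already been collected
        return None
    return func(obj)


x = X()
pair = (X.foo, weakref.ref(x))   # keeps the function alive, but not the instance

result_alive = call_weak(pair)   # the instance still exists here

del x
gc.collect()                     # not needed on CPython, harmless elsewhere
result_dead = call_weak(pair)    # the weak reference is now dead
```

On Python 3.4 and later, the standard library also offers weakref.WeakMethod, which wraps a bound method along similar lines.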

Can a dynamically added function access the owner object in python?

I'm making a program in python in which specific instances of an object must be decorated with new functions built at runtime.
I've seen very simple examples of adding functions to objects through MethodType:
import types
def foo():
    print("foo")
class A:
    bar = "bar"
a = A()
a.foo = types.MethodType(foo, a)
But none of the examples I've seen show how a function added in this manner can refer to the new owner's attributes. As far as I know, even though this binds the foo() function to the instance a, foo() must still be a pure function, and cannot contain references to anything local.
In my case, I need functions to change attributes of the object they are added to. Here are two examples of the kind of thing I need to be able to do:
class A:
    foo = "foo"
    def printme():
        print(foo)
def nofoo():
    foo = "bar"
def printBar():
    if foo != "foo":
        self.printme()
I would then need a way to add a copy of a nofoo() or printBar() to an A object in such a way that they can access the object attributes named foo and the function named printme() correctly.
So, is this possible? Is there a way to do this kind of programming in vanilla Python? Or, at least, is there a programming pattern that achieves this kind of behavior?
P.S.: In my system, I also add attributes dynamically to objects. Your first thought then might be "How can I ever be sure that the object I'm adding the nofoo() function to actually has an attribute named foo?", but I also have a fairly robust tag system that makes sure that I never try to add a nofoo() function to an object that hasn't a foo variable. The reason I mention this is that solutions that look at the class definition aren't very useful to me.
As said in the comments, your function actually must take at least one parameter: self, the instance the method is being called on. The self parameter can be used as it would be used in a normal instance method. Here is an example:
>>> from types import MethodType
>>>
>>> class Class:
        def method(self):
            print('method run')
>>> cls = Class()
>>>
>>> def func(self): # must accept one argument, `self`
        self.method()
>>> cls.func = MethodType(func, cls)
>>> cls.func()
method run
>>>
Without your function accepting self, an exception would be raised:
>>> def func():
        self.method()
>>> cls.func = MethodType(func, cls)
>>> cls.func()
Traceback (most recent call last):
  File "<pyshell#21>", line 1, in <module>
    cls.func()
TypeError: func() takes 0 positional arguments but 1 was given
>>>
import types
class A:
    def __init__(self):
        self.foo = "foo"
    def printme(self):
        print(self.foo)
def nofoo(self):
    self.foo = "bar"
a = A()  # create the instance the new method will be attached to
a.nofoo = types.MethodType(nofoo, a)
a.nofoo()
a.printme()
prints
bar
It's not entirely clear what you're trying to do, and I'm worried that whatever it is may be a bad idea. However, I can explain how to do what you're asking, even if it isn't what you want, or should want. I'll point out that it's very uncommon to want to do the second version below, and even rarer to want to do the third version, but Python does allow them both, because "even rarer than very uncommon" still isn't "never". And, in the same spirit…
The short answer is "yes". A dynamically-added method can access the owner object exactly the same way a normal method can.
First, here's a normal, non-dynamic method:
class C:
    def meth(self):
        return self.x
c = C()
c.x = 3
c.meth()
Obviously, with a normal method like this, when you call c.meth(), the c ends up as the value of the self parameter, so self.x is c.x, which is 3.
Now, here's how you dynamically add a method to a class:
class C:
    pass
c = C()
c.x = 3
def meth(self):
    print(self.x)
C.meth = meth
c.meth()
This is actually doing exactly the same thing. (Well, we've left another name for the same function object sitting around in globals, but that's the only difference.) If C.meth is the same function it was in the first version, then obviously whatever magic made c.meth() work in the first version will do the exact same thing here.
(This used to be slightly more complicated in Python 2, because of unbound methods, and classic classes too… but fortunately you don't have to worry about that.)
Finally, here's how you dynamically add a method to an instance:
import types
class C:
    pass
c = C()
c.x = 3
def meth(self):
    print(self.x)
c.meth = types.MethodType(meth, c)
c.meth()
Here, you actually have to know the magic that makes c.meth() work in the first two cases. So read the Descriptor HOWTO. After that, it should be obvious.
But if you just want to pretend that Guido is a wizard (Raymond definitely is a wizard) and it's magic… Well, in the first two versions, Guido's magic wand creates a special bound method object whenever you ask for c.meth, but even he isn't magical enough to do that when C.meth doesn't exist. But we can painstakingly create that same bound method object and store it as c.meth. After that, we're going to get the same thing we stored whenever we ask for c.meth, which we explicitly built as the same thing we got in the first two examples, so it'll obviously do the same thing.
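As a rough illustration of that descriptor "magic" (a sketch for Python 3, not a description of the interpreter's internals): plain functions have a __get__ method, and calling it by hand produces the same kind of bound method object that types.MethodType builds. The variable names below are made up for the example.

```python
import types


class C:
    pass


def meth(self):
    return self.x


c = C()
c.x = 3

# What attribute lookup does for a function found on the class:
bound_via_descriptor = meth.__get__(c, C)

# What the instance-level example builds by hand:
bound_via_methodtype = types.MethodType(meth, c)

# Both objects remember the instance and the underlying function.
same_self = bound_via_descriptor.__self__ is bound_via_methodtype.__self__
same_func = bound_via_descriptor.__func__ is bound_via_methodtype.__func__
```

Calling either bound object passes c as self, which is why both return c.x.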
But what if we did this:
class C:
    pass
c = C()
c.x = 3
def meth(self):
    print(self.x)
c.meth = meth
c.meth(c)
Here, you're not letting Guido do his descriptor magic to create c.meth, and you're not doing it manually, you're just sticking a regular function there. Which means if you want anything to show up as the self parameter, you have to explicitly pass it as an argument, as in that silly c.meth(c) line at the end. But if you're willing to do that, then even this one works. No matter how self ends up as c, self.x is going to be c.x.

Properties seem to set to the same value for all objects (Python) [duplicate]

What is the difference between class and instance variables in Python?
class Complex:
    a = 1
and
class Complex:
    def __init__(self):
        self.a = 1
Using the call: x = Complex().a in both cases assigns x to 1.
A more in-depth answer about __init__() and self will be appreciated.
When you write a class block, you create class attributes (or class variables). All the names you assign in the class block, including methods you define with def, become class attributes.
After a class instance is created, anything with a reference to the instance can create instance attributes on it. Inside methods, the "current" instance is almost always bound to the name self, which is why you are thinking of these as "self variables". Usually in object-oriented design, the code attached to a class is supposed to have control over the attributes of instances of that class, so almost all instance attribute assignment is done inside methods, using the reference to the instance received in the self parameter of the method.
Class attributes are often compared to static variables (or methods) as found in languages like Java, C#, or C++. However, if you want to aim for deeper understanding I would avoid thinking of class attributes as "the same" as static variables. While they are often used for the same purposes, the underlying concept is quite different. More on this in the "advanced" section below the line.
An example!
class SomeClass:
    def __init__(self):
        self.foo = 'I am an instance attribute called foo'
        self.foo_list = []
    bar = 'I am a class attribute called bar'
    bar_list = []
After executing this block, there is a class SomeClass, with 3 class attributes: __init__, bar, and bar_list.
Then we'll create an instance:
instance = SomeClass()
When this happens, SomeClass's __init__ method is executed, receiving the new instance in its self parameter. This method creates two instance attributes: foo and foo_list. Then this instance is assigned into the instance variable, so it's bound to a thing with those two instance attributes: foo and foo_list.
But:
print instance.bar
gives:
I am a class attribute called bar
How did this happen? When we try to retrieve an attribute through the dot syntax, and the attribute doesn't exist, Python goes through a bunch of steps to try and fulfill your request anyway. The next thing it will try is to look at the class attributes of the class of your instance. In this case, it found an attribute bar in SomeClass, so it returned that.
That's also how method calls work by the way. When you call mylist.append(5), for example, mylist doesn't have an attribute named append. But the class of mylist does, and it's bound to a method object. That method object is returned by the mylist.append bit, and then the (5) bit calls the method with the argument 5.
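A small sketch of the two steps described above, for Python 3. The names in_class and bound_append are chosen only for this illustration.

```python
mylist = []

# The method lives in the class's namespace, not on the list object itself.
in_class = 'append' in vars(list)

# Step 1: the attribute lookup returns a method object bound to mylist.
bound_append = mylist.append

# Step 2: calling that object appends to the list it is bound to.
bound_append(5)
```

After this runs, mylist holds the appended value even though append was never called through the dot syntax a second time.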
The way this is useful is that all instances of SomeClass will have access to the same bar attribute. We could create a million instances, but we only need to store that one string in memory, because they can all find it.
But you have to be a bit careful. Have a look at the following operations:
sc1 = SomeClass()
sc1.foo_list.append(1)
sc1.bar_list.append(2)
sc2 = SomeClass()
sc2.foo_list.append(10)
sc2.bar_list.append(20)
print sc1.foo_list
print sc1.bar_list
print sc2.foo_list
print sc2.bar_list
What do you think this prints?
[1]
[2, 20]
[10]
[2, 20]
This is because each instance has its own copy of foo_list, so they were appended to separately. But all instances share access to the same bar_list. So when we did sc1.bar_list.append(2) it affected sc2, even though sc2 didn't exist yet! And likewise sc2.bar_list.append(20) affected the bar_list retrieved through sc1. This is often not what you want.
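The sharing can be checked directly with identity tests. This is a trimmed-down Python 3 sketch of the class above; the result variable names are made up for the example. It also shows that assigning through an instance, as opposed to mutating, creates a separate instance attribute.

```python
class SomeClass:
    bar_list = []                 # one list, stored on the class

    def __init__(self):
        self.foo_list = []        # a fresh list for every instance


sc1 = SomeClass()
sc2 = SomeClass()

# Both instances reach the very same list object through the class.
same_bar = sc1.bar_list is sc2.bar_list is SomeClass.bar_list
# Each instance has its own foo_list.
same_foo = sc1.foo_list is sc2.foo_list

# Assignment through an instance does not touch the class attribute;
# it creates an instance attribute that shadows it.
sc1.bar_list = ['only mine']
shadowed = 'bar_list' in vars(sc1)
class_untouched = SomeClass.bar_list == [] and sc2.bar_list == []
```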
Advanced study follows. :)
To really grok Python, coming from traditional statically typed OO-languages like Java and C#, you have to learn to rethink classes a little bit.
In Java, a class isn't really a thing in its own right. When you write a class you're more declaring a bunch of things that all instances of that class have in common. At runtime, there's only instances (and static methods/variables, but those are really just global variables and functions in a namespace associated with a class, nothing to do with OO really). Classes are the way you write down in your source code what the instances will be like at runtime; they only "exist" in your source code, not in the running program.
In Python, a class is nothing special. It's an object just like anything else. So "class attributes" are in fact exactly the same thing as "instance attributes"; in reality there's just "attributes". The only reason for drawing a distinction is that we tend to use objects which are classes differently from objects which are not classes. The underlying machinery is all the same. This is why I say it would be a mistake to think of class attributes as static variables from other languages.
But the thing that really makes Python classes different from Java-style classes is that just like any other object each class is an instance of some class!
In Python, most classes are instances of a builtin class called type. It is this class that controls the common behaviour of classes, and makes all the OO stuff the way it does. The default OO way of having instances of classes that have their own attributes, and have common methods/attributes defined by their class, is just a protocol in Python. You can change most aspects of it if you want. If you've ever heard of using a metaclass, all that is is defining a class that is an instance of a different class than type.
The only really "special" thing about classes (aside from all the builtin machinery to make them work the way they do by default), is the class block syntax, to make it easier for you to create instances of type. This:
class Foo(BaseFoo):
    def __init__(self, foo):
        self.foo = foo
    z = 28
is roughly equivalent to the following:
def __init__(self, foo):
    self.foo = foo
classdict = {'__init__': __init__, 'z': 28}
Foo = type('Foo', (BaseFoo,), classdict)
And it will arrange for all the contents of classdict to become attributes of the object that gets created.
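Here is a runnable Python 3 version of that equivalence. BaseFoo is not defined in the answer, so an empty stand-in class is assumed here purely so the snippet can execute.

```python
class BaseFoo:          # hypothetical stand-in for the answer's base class
    pass


def __init__(self, foo):
    self.foo = foo


classdict = {'__init__': __init__, 'z': 28}

# Three-argument type(): class name, tuple of bases, namespace dictionary.
Foo = type('Foo', (BaseFoo,), classdict)

f = Foo(5)
```

The resulting Foo behaves like a class written with the class block syntax: it can be instantiated, it inherits from BaseFoo, and z is reachable through both the class and its instances.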
So then it becomes almost trivial to see that you can access a class attribute by Class.attribute just as easily as i = Class(); i.attribute. Both i and Class are objects, and objects have attributes. This also makes it easy to understand how you can modify a class after it's been created; just assign its attributes the same way you would with any other object!
In fact, instances have no particular special relationship with the class used to create them. The way Python knows which class to search for attributes that aren't found in the instance is by the hidden __class__ attribute. Which you can read to find out what class this is an instance of, just as with any other attribute: c = some_instance.__class__. Now you have a variable c bound to a class, even though it probably doesn't have the same name as the class. You can use this to access class attributes, or even call it to create more instances of it (even though you don't know what class it is!).
And you can even assign to i.__class__ to change what class it is an instance of! If you do this, nothing in particular happens immediately. It's not earth-shattering. All that it means is that when you look up attributes that don't exist in the instance, Python will go look at the new contents of __class__. Since that includes most methods, and methods usually expect the instance they're operating on to be in certain states, this usually results in errors if you do it at random, and it's very confusing, but it can be done. If you're very careful, the thing you store in __class__ doesn't even have to be a class object; all Python's going to do with it is look up attributes under certain circumstances, so all you need is an object that has the right kind of attributes (some caveats aside where Python does get picky about things being classes or instances of a particular class).
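A small sketch of reassigning __class__ in Python 3, using two made-up classes, Duck and Robot. It also illustrates one of the caveats: Python 3 refuses to store a non-class object in __class__ and raises TypeError.

```python
class Duck:
    def speak(self):
        return "Quack"


class Robot:
    def speak(self):
        return "Beep"


d = Duck()
before = d.speak()      # looked up on Duck

d.__class__ = Robot     # nothing else about the object changes
after = d.speak()       # now looked up on Robot

# Python 3 is picky here: the new value must itself be a class.
try:
    d.__class__ = 42
    rejected = False
except TypeError:
    rejected = True
```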
That's probably enough for now. Hopefully (if you've even read this far) I haven't confused you too much. Python is neat when you learn how it works. :)
What you're calling an "instance" variable isn't actually an instance variable; it's a class variable. See the language reference about classes.
In your example, the a appears to be an instance variable because it is immutable. Its nature as a class variable can be seen in the case when you assign a mutable object:
>>> class Complex:
...     a = []
...
>>> b = Complex()
>>> c = Complex()
>>>
>>> # What do they look like?
>>> b.a
[]
>>> c.a
[]
>>>
>>> # Change b...
>>> b.a.append('Hello')
>>> b.a
['Hello']
>>> # What does c look like?
>>> c.a
['Hello']
If you used self, then it would be a true instance variable, and thus each instance would have its own unique a. An object's __init__ function is called when a new instance is created, and self is a reference to that instance.

Is accessing class variables via an instance documented?

In Python, class variables can be accessed via that class instance:
>>> class A(object):
...     x = 4
...
>>> a = A()
>>> a.x
4
It's easy to show that a.x is really resolved to A.x, not copied to an instance during construction:
>>> A.x = 5
>>> a.x
5
Despite the fact that this behavior is well known and widely used, I couldn't find any definitive documentation covering it. The closest I could find in Python docs was the section on classes:
class MyClass:
    """A simple example class"""
    i = 12345
    def f(self):
        return 'hello world'
[snip]
... By definition, all attributes of a class that are function objects define corresponding methods of its instances. So in our example, x.f is a valid method reference, since MyClass.f is a function, but x.i is not, since MyClass.i is not. ...
However, this part talks specifically about methods so it's probably not relevant to the general case.
My question is, is this documented? Can I rely on this behavior?
See the "Classes" and "Class instances" parts of the Python data model documentation:
A class has a namespace implemented by a dictionary object. Class
attribute references are translated to lookups in this dictionary,
e.g., C.x is translated to C.__dict__["x"] (although for new-style classes in particular there are a number of hooks which allow for other means of locating attributes).
...
A class instance is created by calling a class object (see above). A
class instance has a namespace implemented as a dictionary which is
the first place in which attribute references are searched. When an
attribute is not found there, and the instance’s class has an
attribute by that name, the search continues with the class
attributes.
Generally, this usage is fine, except the special cases mentioned as "for new-style classes in particular there are a number of hooks which allow for other means of locating attributes".
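The lookup order quoted above (instance dictionary first, then the class) can be observed with a short Python 3 sketch. The result variable names are made up for the example.

```python
class A:
    x = 4


a = A()

found_on_instance = 'x' in vars(a)   # the instance dictionary is empty
value_from_class = a.x               # so the lookup falls back to A.x

a.x = 10                             # creates an entry in the instance dictionary
value_shadowed = a.x                 # the instance entry is found first
class_value = A.x                    # the class attribute is unchanged

del a.x                              # removes only the instance entry
value_after_del = a.x                # the class attribute is visible again
```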
Not only can you rely on this behavior, you constantly do.
Think about methods. A method is merely a function that has been made a class attribute. You then look it up on the instance.
>>> def foo(self, x):
...     print "foo:", self, x
...
>>> class C(object):
...     method = foo # What a weird way to write this! But perhaps illustrative?
...
>>> C().method("hello")
foo: <__main__.C object at 0xadad50> hello
In the case of objects like functions, this isn't a plain lookup, but some magic occurs to pass self automatically. You may have used other objects that are meant to be stored as class attributes and looked up on the instance; properties are an example (check out the property builtin if you're not familiar with it.)
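For readers unfamiliar with property, here is a minimal Python 3 sketch. Circle, radius, and diameter are hypothetical names chosen for illustration. The property object is stored as a class attribute, yet it is normally read through an instance.

```python
class Circle:
    def __init__(self, radius):
        self.radius = radius

    @property
    def diameter(self):
        # Computed on every access, using the instance it was looked up on.
        return 2 * self.radius


c = Circle(3)

# The property object itself lives in the class namespace...
stored_on_class = isinstance(vars(Circle)['diameter'], property)
# ...and is absent from the instance dictionary.
stored_on_instance = 'diameter' in vars(c)

d = c.diameter   # no parentheses: the lookup itself runs the function
```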
As okm notes, the way this works is described in the data model reference (including information about and links to more information about the magic that makes methods and properties work). The Data Model page is by far the most useful part of the Language Reference; it also includes among other things documentation about almost all the __foo__ methods and names.

Explaining the 'self' variable to a beginner [duplicate]

I'm pretty much ignorant of OOP jargon and concepts. I know conceptually what an object is, and that objects have methods. I even understand that in python, classes are objects! That's cool, I just don't know what it means. It isn't clicking with me.
I'm currently trying to understand a few detailed answers that I think will illuminate my understanding of python:
What does the "yield" keyword do in Python?
What is a metaclass in Python?
In the first answer, the author uses the following code as an example:
>>> class Bank(): # let's create a bank, building ATMs
...     crisis = False
...     def create_atm(self) :
...         while not self.crisis :
...             yield "$100"
I don't immediately grok what self is pointing to. This is definitely a symptom of not understanding classes, which I will work on at some point. To clarify, in
>>> def func():
...     for i in range(3):
...         print i
I understand that i points to an item in the list range(3) which, since it is in a function, isn't global. But what does self "point to"?
I'll try to clear up some confusion about classes and objects for you first. Let's look at this block of code:
>>> class Bank(): # let's create a bank, building ATMs
...     crisis = False
...     def create_atm(self) :
...         while not self.crisis :
...             yield "$100"
The comment there is a bit deceptive. The above code does not "create" a bank. It defines what a bank is. A bank is something which has a property called crisis, and a function create_atm. That's what the above code says.
Now let's actually create a bank:
>>> x = Bank()
There, x is now a bank. x has a property crisis and a function create_atm. Calling x.create_atm() in Python is the same as calling Bank.create_atm(x), so now self refers to x. If you add another bank called y, calling y.create_atm() will know to look at y's value of crisis, not x's, since in that function self refers to y.
self is just a naming convention, but it is very good to stick with it. It's still worth pointing out that the code above is equivalent to:
>>> class Bank(): # let's create a bank, building ATMs
...     crisis = False
...     def create_atm(thisbank) :
...         while not thisbank.crisis :
...             yield "$100"
It may help you to think of the obj.method(arg1, arg2) invocation syntax as purely syntactic sugar for calling method(obj, arg1, arg2) (except that method is looked up via obj's type, and isn't global).
If you view it that way, obj is the first argument to the function, which traditionally is named self in the parameter list. (You can, in fact, name it something else, and your code will work correctly, but other Python coders will frown at you.)
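A short Python 3 sketch of that equivalence, reusing the Bank class from the question. The second bank y and the result variable names are made up for the example.

```python
class Bank:
    crisis = False

    def create_atm(self):
        while not self.crisis:
            yield "$100"


x = Bank()
y = Bank()
y.crisis = True          # instance attribute on y only; x still sees the class value

# The two spellings call the same function with the same first argument.
via_instance = next(x.create_atm())
via_class = next(Bank.create_atm(x))

# Inside the call made through y, self is y, so the loop never runs.
y_notes = list(y.create_atm())
```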
"self" is the instance object automatically passed to the class instance's method when called, to identify the instance that called it. "self" is used to access other attributes or methods of the object from inside the method. (methods are basically just functions that belong to a class)
"self" does not need to be used when calling a method when you already have an available instance.
Accessing the "some_attribute" attribute from inside a method:
class MyClass(object):
    some_attribute = "hello"
    def some_method(self, some_string):
        print self.some_attribute + " " + some_string
Accessing the "some_attribute" attribute from an existing instance:
>>> # create the instance
>>> inst = MyClass()
>>>
>>> # accessing the attribute
>>> inst.some_attribute
'hello'
>>>
>>> # calling the instance's method
>>> inst.some_method("world") # In addition to "world", inst is *automatically* passed here as the first argument to "some_method".
hello world
>>>
Here is a little code to demonstrate that self is the same as the instance:
>>> class MyClass(object):
...     def whoami(self, inst):
...         print self is inst
...
>>> local_instance = MyClass()
>>> local_instance.whoami(local_instance)
True
As mentioned by others, it's named "self" by convention, but it could be named anything.
self refers to the current instance of Bank. When you create a new Bank, and call create_atm on it, self will be implicitly passed by python, and will refer to the bank you created.
"I don't immediately grok what self is pointing to. This is definitely a symptom of not understanding classes, which I will work on at some point."
self is an argument passed in to the function. In Python, this first argument is implicitly the object that the method was invoked on. In other words:
class Bar(object):
    def someMethod(self):
        return self.field
bar = Bar()
bar.field = "foo"  # set the attribute so the calls below succeed
bar.someMethod()
Bar.someMethod(bar)
These last two lines have equivalent behavior. (Unless bar refers to an object of a subclass of Bar -- then someMethod() might refer to a different function object.)
Note that you can name the "special" first argument anything you want -- self is just a convention for methods.
"I understand that i points to an item in the list range(3) which, since it is in a function, isn't global. But what does self 'point to'?"
The name self does not exist in the context of that function. Attempting to use it would raise a NameError.
Example transcript:
>>> class Bar(object):
...     def someMethod(self):
...         return self.field
...
>>> bar = Bar()
>>> bar.field = "foo"
>>> bar.someMethod()
'foo'
>>> Bar.someMethod(bar)
'foo'
>>> def fn(i):
...     return self
...
>>> fn(0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in fn
NameError: global name 'self' is not defined
The reason "self" is there (by convention) is that when the Python runtime sees a call of the form Object.Method(Param1,Param2), it calls Method with parameters (Object,Param1,Param2). So if you call that first parameter "self", everyone will know what you are talking about.
The reason you have to do this is the subject of another question.
As for metaclasses, they are rarely used. You might want to look at
http://python-history.blogspot.com/2009/04/metaclasses-and-extension-classes-aka.html, where the original author and current Benevolent Dictator For Life of Python explains what they are and how they came to be. He also has a nice article on some possible uses, but most people never use metaclasses directly.
One Rubyist's perspective (Ruby is my first programming language so I apologize for whatever oversimplified, potentially wrong abstractions I'm about to use)
as far as I can tell, the dot operator, for example:
os.path
is such that os gets passed into path() as its first variable "invisibly"
It is as if os.path is REALLY this:
path(os)
If there were a daisy chain I'd imagine that this:
os.path.filename
Would be sort of like this in reality*:
filename(path(os))
Here comes the offensive part
So with the self variable all that's doing is allowing the CLASS METHOD (from a rubyist perspective python 'instance methods' appear to be class methods...) to act as an instance method by getting an instance passed into it as its first variable (via the "sneaky" dot method above) which is called self by convention. Were self not an instance
c = ClassName()
c.methodname
but the class itself:
ClassName.methodname
the class would get passed in rather than the instance.
OK, also important to remember is that the __init__ method is called "magic" by some. So don't worry about what gets passed in to generate a new instance. To be honest, it's probably nil.
self refers to an instance of the class.

is it ever useful to define a class method with a reference to self not called 'self' in Python?

I'm teaching myself Python and I see the following in Dive into Python section 5.3:
By convention, the first argument of any Python class method (the reference to the current instance) is called self. This argument fills the role of the reserved word this in C++ or Java, but self is not a reserved word in Python, merely a naming convention. Nonetheless, please don't call it anything but self; this is a very strong convention.
Considering that self is not a Python keyword, I'm guessing that it can sometimes be useful to use something else. Are there any such cases? If not, why is it not a keyword?
No, unless you want to confuse every other programmer that looks at your code after you write it. self is not a keyword because it is an identifier. It could have been a keyword and the fact that it isn't one was a design decision.
As a side observation, note that Pilgrim is committing a common misuse of terms here: a class method is quite a different thing from an instance method, which is what he's talking about here. As wikipedia puts it, "a method is a subroutine that is exclusively associated either with a class (in which case it is called a class method or a static method) or with an object (in which case it is an instance method).". Python's built-ins include a staticmethod type, to make static methods, and a classmethod type, to make class methods, each generally used as a decorator; if you don't use either, a def in a class body makes an instance method. E.g.:
>>> class X(object):
...     def noclass(self): print self
...     @classmethod
...     def withclass(cls): print cls
...
>>> x = X()
>>> x.noclass()
<__main__.X object at 0x698d0>
>>> x.withclass()
<class '__main__.X'>
>>>
As you see, the instance method noclass gets the instance as its argument, but the class method withclass gets the class instead.
So it would be extremely confusing and misleading to use self as the name of the first parameter of a class method: the convention in this case is instead to use cls, as in my example above. While this IS just a convention, there is no real good reason for violating it -- any more than there would be, say, for naming a variable number_of_cats if the purpose of the variable is counting dogs!-)
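For comparison, here is a Python 3 sketch that puts the three kinds of method side by side, including the staticmethod case mentioned above. The method names are made up for illustration.

```python
class X:
    def instance_method(self):
        return self              # receives the instance

    @classmethod
    def class_method(cls):
        return cls               # receives the class, even when called on an instance

    @staticmethod
    def static_method():
        return "no implicit argument"


x = X()

got_instance = x.instance_method() is x
got_class_via_instance = x.class_method() is X
got_class_via_class = X.class_method() is X
static_result = x.static_method()
```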
The only case of this I've seen is when you define a function outside of a class definition, and then assign it to the class, e.g.:
class Foo(object):
    def bar(self):
        # Do something with 'self'
        pass
def baz(inst):
    return inst.bar()
Foo.baz = baz
In this case, self is a little strange to use, because the function could be applied to many classes. Most often I've seen inst or cls used instead.
I once had some code like (and I apologize for lack of creativity in the example):
class Animal:
    def __init__(self, volume=1):
        self.volume = volume
        self.description = "Animal"
    def Sound(self):
        pass
    def GetADog(self, newvolume):
        class Dog(Animal):
            def Sound(this):
                return self.description + ": " + ("woof" * this.volume)
        return Dog(newvolume)
Then we have output like:
>>> a = Animal(3)
>>> d = a.GetADog(2)
>>> d.Sound()
'Animal: woofwoof'
I wasn't sure if self within the Dog class would shadow self within the Animal class, so I opted to make Dog's reference the word "this" instead. In my opinion and for that particular application, that was more clear to me.
Because it is a convention, not language syntax. There is a Python style guide that people who program in Python follow. This way libraries have a familiar look and feel. Python places a lot of emphasis on readability, and consistency is an important part of this.
I think that the main reason self is used by convention rather than being a Python keyword is because it's simpler to have all methods/functions take arguments the same way rather than having to put together different argument forms for functions, class methods, instance methods, etc.
Note that if you have an actual class method (i.e. one defined using the classmethod decorator), the convention is to use "cls" instead of "self".
