Python OOP: inefficient to put methods in classes? - python

I usually use classes similarly to how one might use namedtuple (except of course that the attributes are mutable). Moreover, I try to put lengthy functions in classes that won't be instantiated as frequently, to help conserve memory.
From a memory point of view, is it inefficient to put functions in classes, if it is expected that the class will be instantiated often? Keeping aside that it's good design to compartmentalize functionality, should this be something to be worried about?

Methods don't add any weight to an instance of your class. The method itself only exists once and is parameterized in terms of the object on which it operates. That's why you have a self parameter.

Python doesn't maintain pointers directly to its methods in instances of new-style classes. Instead, it maintains a single pointer to the parent class. Consider the following example:
class Foo:
    def bar(self):
        print('hello')
f = Foo()
f.bar()
In order to dispatch the bar method from the instance f, two lookups need to be made. Instead of f containing a method table to look for bar, f contains a reference to the class object Foo. Foo contains the method table, where it calls bar with f as the first argument. So f.bar() can be rewritten as
Foo.bar(f)
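As a rough check of the equivalence described above, the sketch below (assuming Python 3, with bar changed to return its string purely so the two results can be compared) looks at where bar is actually stored and calls it both ways.

```python
class Foo:
    def bar(self):
        # Returns instead of printing so the results can be compared (illustrative change).
        return 'hello'

f = Foo()

# The instance stores only its own data; 'bar' lives in the class namespace.
instance_has_bar = 'bar' in vars(f)
class_has_bar = 'bar' in vars(Foo)

# Both spellings end up calling the same function with f as self.
via_instance = f.bar()
via_class = Foo.bar(f)

print(instance_has_bar, class_has_bar, via_instance == via_class)
```

The instance dictionary stays empty here because nothing was ever assigned on f; the method is found only by following the reference to Foo.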

Instances of a class have one pointer that refers to the class; all other features of the class are unique and accessed through that pointer. Something like
foo.bar()
really translates to something like
foo.__class__.bar(foo)
so methods are unique, long-lived objects belonging to the class that take the instance as an argument when called.

Each object has its own copy of its data members, whereas the member functions are shared. Only one copy of each member function is created, separate from all objects of the class, and all objects of the class share that one copy.
The whole point of OOP is to combine data and functions together. Without OOP, the data cannot be reused, only the functions can be reused.

Related

Class member function

Say I have some class, with a member function that is not used too often, but whose definition is quite lengthy
class foo:
    # ...
    def fn(self):
        print('This function is called rarely but its definition is quite lengthy')
    # ...
At some point in my program I want to create millions of instances of class foo, and I want this to take as little space in memory as possible. Is the lengthy function fn somehow also copied a million times? If so, it would be better to define an external function that takes an instance as input. If it is not copied a million times, I would rather keep it as a member function.
An instance method is in fact a member of the class. When the Python interpreter sees a construct like obj.method(params, ...), it (more or less) translates it to (obj.__class__).method(obj, params, ...). It looks for a member named method in the class of obj and calls it after prepending (a reference to) the object itself.
TL;DR: the methods are not copied into instance objects, so you can safely keep your lengthy function as a method.
Methods declared at the class level are shared across all instances, just like class variables, so you don't have to worry about methods taking more memory when more instances are instantiated.
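One way to convince yourself of this is sketched below. The class names WithMethod and WithoutMethod and the body of fn are made up for illustration, and the sizes reported by sys.getsizeof are a CPython implementation detail, so treat the printed numbers as indicative only.

```python
import sys

class WithMethod:
    def __init__(self, value):
        self.value = value

    def fn(self):
        # Stand-in for a lengthy, rarely called method (hypothetical body).
        return self.value * 2

class WithoutMethod:
    def __init__(self, value):
        self.value = value

instances = [WithMethod(i) for i in range(1000)]

# Every instance resolves fn to the single function object stored on the class.
shared = all(obj.fn.__func__ is WithMethod.fn for obj in instances)

# No instance carries its own entry for fn.
copied = any('fn' in vars(obj) for obj in instances)

# Shallow per-instance sizes, printed for comparison (implementation dependent).
print(shared, copied, sys.getsizeof(WithMethod(1)), sys.getsizeof(WithoutMethod(1)))
```

If memory per instance really is the concern, the attributes stored on each instance are what matter, not the number or length of the methods.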

Calling instance method using class definition in Python

Lately, I've been studying Python's class instantiation process to really understand what happens under the hood when creating a class instance. But, while playing around with test code, I came across something I don't understand.
Consider this dummy class
class Foo():
    def test(self):
        print("I'm using test()")
Normally, if I wanted to use Foo.test instance method, I would go and create an instance of Foo and call it explicitly like so,
foo_inst = Foo()
foo_inst.test()
>>>> I'm using test()
But, I found that calling it that way ends up with the same result,
Foo.test(Foo)
>>>> I'm using test()
Here I don't actually create an instance, but I'm still accessing Foo's instance method. Why and how is this working in the context of Python ? I mean self normally refers to the current instance of the class, but I'm not technically creating a class instance in this case.
print(Foo()) #This is a Foo object
>>>><__main__.Foo object at ...>
print(Foo) #This is not
>>>> <class '__main__.Foo'>
Props to everyone that led me there in the comments section.
The answer to this question relies on two fundamentals of Python:
Duck-typing
Everything is an object
Indeed, even if self is Python's idiom to reference the current class instance, you can technically pass whatever object you want because of how Python handles typing.
Now, the other confusion that brought me here is that I wasn't creating an object in my second example. But, the thing is, Foo is already an object internally.
This can be tested empirically like so,
print(type(Foo))
<class 'type'>
So, we now know that Foo is an instance of class type and can therefore be passed as self even though it is not an instance of itself.
Basically, if I were to manipulate self as if it were a Foo object in my test method, I would run into problems when calling it as in my second example.
A few notes on your question (and answer). First, everything really is an object. Even a class is an object, so there is a class of the class (called a metaclass), which is type in this case.
Second, and more relevant to your case: methods are, more or less, class attributes, not instance attributes. In Python, when you have an object obj that is an instance of Class and you access obj.x, Python first looks in obj and then in Class. That's what happens when you access a method from an instance. Methods are just special class attributes, so they can be accessed from both the instance and the class. And since you are not using any instance attributes of the self that should be passed to the test(self) function, the object that is passed is irrelevant.
To understand this in depth, you should read about the descriptor protocol, if you are not familiar with it. It explains a lot about how things work in Python. It allows Python classes and objects to be essentially dictionaries with some special attributes (very similar to JavaScript objects and methods).
Regarding the class instantiation, see about __new__ and metaclasses.
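The behaviour discussed in this thread can be reproduced with a short sketch, assuming Python 3. The uses_state method and the value attribute are hypothetical additions that show what happens once a method really depends on instance state, and test returns its string instead of printing it so the results can be compared.

```python
class Foo:
    def test(self):
        return "I'm using test()"

    def uses_state(self):
        # Relies on an instance attribute, so self must actually provide it.
        return self.value

# Accessed through the class, test is a plain function taking one argument.
kind = type(Foo.test).__name__

# Any object can be passed as self as long as the body never inspects it.
result_with_class = Foo.test(Foo)
result_with_string = Foo.test("anything")

# Once the body touches self.value, passing the class itself fails.
try:
    Foo.uses_state(Foo)
    failed = False
except AttributeError:
    failed = True

print(kind, result_with_class, result_with_string, failed)
```

On Python 2 the first call would have raised a TypeError instead, because unbound methods there checked the type of their first argument.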

Properties seem to set to the same value for all objects (Python) [duplicate]

What is the difference between class and instance variables in Python?
class Complex:
    a = 1
and
class Complex:
    def __init__(self):
        self.a = 1
Using the call: x = Complex().a in both cases assigns x to 1.
A more in-depth answer about __init__() and self will be appreciated.
When you write a class block, you create class attributes (or class variables). All the names you assign in the class block, including methods you define with def become class attributes.
After a class instance is created, anything with a reference to the instance can create instance attributes on it. Inside methods, the "current" instance is almost always bound to the name self, which is why you are thinking of these as "self variables". Usually in object-oriented design, the code attached to a class is supposed to have control over the attributes of instances of that class, so almost all instance attribute assignment is done inside methods, using the reference to the instance received in the self parameter of the method.
Class attributes are often compared to static variables (or methods) as found in languages like Java, C#, or C++. However, if you want to aim for deeper understanding I would avoid thinking of class attributes as "the same" as static variables. While they are often used for the same purposes, the underlying concept is quite different. More on this in the "advanced" section below the line.
An example!
class SomeClass:
    def __init__(self):
        self.foo = 'I am an instance attribute called foo'
        self.foo_list = []
    bar = 'I am a class attribute called bar'
    bar_list = []
After executing this block, there is a class SomeClass, with 3 class attributes: __init__, bar, and bar_list.
Then we'll create an instance:
instance = SomeClass()
When this happens, SomeClass's __init__ method is executed, receiving the new instance in its self parameter. This method creates two instance attributes: foo and foo_list. Then this instance is assigned into the instance variable, so it's bound to a thing with those two instance attributes: foo and foo_list.
But:
print(instance.bar)
gives:
I am a class attribute called bar
How did this happen? When we try to retrieve an attribute through the dot syntax, and the attribute doesn't exist, Python goes through a bunch of steps to try and fulfill your request anyway. The next thing it will try is to look at the class attributes of the class of your instance. In this case, it found an attribute bar in SomeClass, so it returned that.
That's also how method calls work by the way. When you call mylist.append(5), for example, mylist doesn't have an attribute named append. But the class of mylist does, and it's bound to a method object. That method object is returned by the mylist.append bit, and then the (5) bit calls the method with the argument 5.
The way this is useful is that all instances of SomeClass will have access to the same bar attribute. We could create a million instances, but we only need to store that one string in memory, because they can all find it.
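The fallback from instance to class can be observed directly. The sketch below is a trimmed-down variant of SomeClass (the list attributes are left out) and also shows that assigning through an instance creates a separate instance attribute rather than changing the shared one.

```python
class SomeClass:
    bar = 'I am a class attribute called bar'

    def __init__(self):
        self.foo = 'I am an instance attribute called foo'

first = SomeClass()
second = SomeClass()

# bar is stored once, on the class; instances find it by falling back to the class.
stored_on_instance = 'bar' in vars(first)
same_object = first.bar is second.bar

# Assignment through an instance creates an instance attribute that shadows
# the class attribute for that one instance only.
first.bar = 'shadowed'

print(stored_on_instance, same_object)
print(first.bar, '|', second.bar, '|', SomeClass.bar)
```

After the assignment, first has its own bar while second and the class still see the original string.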
But you have to be a bit careful. Have a look at the following operations:
sc1 = SomeClass()
sc1.foo_list.append(1)
sc1.bar_list.append(2)
sc2 = SomeClass()
sc2.foo_list.append(10)
sc2.bar_list.append(20)
print(sc1.foo_list)
print(sc1.bar_list)
print(sc2.foo_list)
print(sc2.bar_list)
What do you think this prints?
[1]
[2, 20]
[10]
[2, 20]
This is because each instance has its own copy of foo_list, so they were appended to separately. But all instances share access to the same bar_list. So when we did sc1.bar_list.append(2) it affected sc2, even though sc2 didn't exist yet! And likewise sc2.bar_list.append(20) affected the bar_list retrieved through sc1. This is often not what you want.
Advanced study follows. :)
To really grok Python, coming from traditional statically typed OO-languages like Java and C#, you have to learn to rethink classes a little bit.
In Java, a class isn't really a thing in its own right. When you write a class you're more declaring a bunch of things that all instances of that class have in common. At runtime, there's only instances (and static methods/variables, but those are really just global variables and functions in a namespace associated with a class, nothing to do with OO really). Classes are the way you write down in your source code what the instances will be like at runtime; they only "exist" in your source code, not in the running program.
In Python, a class is nothing special. It's an object just like anything else. So "class attributes" are in fact exactly the same thing as "instance attributes"; in reality there's just "attributes". The only reason for drawing a distinction is that we tend to use objects which are classes differently from objects which are not classes. The underlying machinery is all the same. This is why I say it would be a mistake to think of class attributes as static variables from other languages.
But the thing that really makes Python classes different from Java-style classes is that just like any other object each class is an instance of some class!
In Python, most classes are instances of a builtin class called type. It is this class that controls the common behaviour of classes, and makes all the OO stuff the way it does. The default OO way of having instances of classes that have their own attributes, and have common methods/attributes defined by their class, is just a protocol in Python. You can change most aspects of it if you want. If you've ever heard of using a metaclass, all that is is defining a class that is an instance of a different class than type.
The only really "special" thing about classes (aside from all the builtin machinery to make them work the way they do by default) is the class block syntax, which makes it easier for you to create instances of type. This:
class Foo(BaseFoo):
    def __init__(self, foo):
        self.foo = foo
    z = 28
is roughly equivalent to the following:
def __init__(self, foo):
    self.foo = foo
classdict = {'__init__': __init__, 'z': 28}
Foo = type('Foo', (BaseFoo,), classdict)
And it will arrange for all the contents of classdict to become attributes of the object that gets created.
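A runnable version of the three-argument type call might look like the sketch below. BaseFoo is not defined in the answer, so a minimal stand-in with a hypothetical describe method is assumed here.

```python
class BaseFoo:
    # Minimal stand-in for the undefined base class (assumption for illustration).
    def describe(self):
        return 'foo=%r z=%r' % (self.foo, self.z)

def __init__(self, foo):
    self.foo = foo

# Build the class by calling type(name, bases, namespace) directly.
classdict = {'__init__': __init__, 'z': 28}
Foo = type('Foo', (BaseFoo,), classdict)

obj = Foo(5)
print(Foo.__name__, Foo.__bases__ == (BaseFoo,), obj.describe())
```

The resulting Foo behaves like a class written with a class block: it inherits from BaseFoo, exposes z as a class attribute, and runs __init__ on instantiation.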
So then it becomes almost trivial to see that you can access a class attribute by Class.attribute just as easily as i = Class(); i.attribute. Both i and Class are objects, and objects have attributes. This also makes it easy to understand how you can modify a class after it's been created; just assign its attributes the same way you would with any other object!
In fact, instances have no particular special relationship with the class used to create them. The way Python knows which class to search for attributes that aren't found in the instance is by the hidden __class__ attribute. Which you can read to find out what class this is an instance of, just as with any other attribute: c = some_instance.__class__. Now you have a variable c bound to a class, even though it probably doesn't have the same name as the class. You can use this to access class attributes, or even call it to create more instances of it (even though you don't know what class it is!).
And you can even assign to i.__class__ to change what class it is an instance of! If you do this, nothing in particular happens immediately. It's not earth-shattering. All that it means is that when you look up attributes that don't exist in the instance, Python will go look at the new contents of __class__. Since that includes most methods, and methods usually expect the instance they're operating on to be in certain states, this usually results in errors if you do it at random, and it's very confusing, but it can be done. If you're very careful, the thing you store in __class__ doesn't even have to be a class object; all Python's going to do with it is look up attributes under certain circumstances, so all you need is an object that has the right kind of attributes (some caveats aside where Python does get picky about things being classes or instances of a particular class).
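As a small illustration of reassigning __class__, the sketch below uses two hypothetical classes, Dog and Cat, that were chosen to be layout-compatible (plain classes without __slots__). It is a demonstration of the mechanism, not a recommended practice.

```python
class Dog:
    def speak(self):
        return 'Woof'

class Cat:
    def speak(self):
        return 'Meow'

pet = Dog()
pet.name = 'Rex'          # instance attribute, stored on the object itself
before = pet.speak()

# Point the instance at a different class; its own attributes are untouched.
pet.__class__ = Cat
after = pet.speak()

print(before, after, pet.name, type(pet).__name__)
```

Only the method lookup changes: name stays on the instance, while speak is now resolved through Cat.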
That's probably enough for now. Hopefully (if you've even read this far) I haven't confused you too much. Python is neat when you learn how it works. :)
What you're calling an "instance" variable isn't actually an instance variable; it's a class variable. See the language reference about classes.
In your example, a appears to be an instance variable because it is immutable. Its nature as a class variable can be seen when you assign a mutable object:
>>> class Complex:
...     a = []
...
>>> b = Complex()
>>> c = Complex()
>>>
>>> # What do they look like?
>>> b.a
[]
>>> c.a
[]
>>>
>>> # Change b...
>>> b.a.append('Hello')
>>> b.a
['Hello']
>>> # What does c look like?
>>> c.a
['Hello']
If you used self, then it would be a true instance variable, and thus each instance would have its own unique a. An object's __init__ function is called when a new instance is created, and self is a reference to that instance.
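For contrast with the shared-list session above, here is a minimal sketch of the same experiment with the list created in __init__:

```python
class Complex:
    def __init__(self):
        # A fresh list is created for every instance.
        self.a = []

b = Complex()
c = Complex()

b.a.append('Hello')
print(b.a, c.a)
```

This time appending through b leaves c.a empty, because each instance holds a reference to its own list.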

Python memory allocation, when using bound, static or class functions?

I am curious about this: what actually happens to the Python objects once you create a class that contains each one of these functions?
Looking at some examples, I see that the bound, static, or class function is in fact creating a class object, which is the one that contains all 3 functions.
Is this always true, no matter which function I call? And is the parent object class (object in this case, but it can be anything I think) always called, since the constructor in my class is invoking it implicitly?
class myclass(object):
    a = 1
    b = True
    def myfunct(self, b):
        return (self.a + b)
    @staticmethod
    def staticfunct(b):
        print(b)
    @classmethod
    def classfunct(cls, b):
        cls.a = b
Since it was not clear: what is the lifecycle for this object class, when I use it as following?
from mymodule import myclass
class1 = myclass()
class1.staticfunct(4)
class1.classfunct(3)
class1.myfunct
In the case of the static function, the myclass object gets allocated and then the function is run, but the class and bound methods are not generated?
In the case of the class function, is it the same as above?
In the case of the bound function, is everything in the class allocated?
The class statement creates the class. That is an object which has all three functions, but the first (myfunct) is unbound and cannot be called without an instance object of this class.
The instances of this class (in case you create them) will have bound versions of this function and references to the static and the class functions.
So, both the class and the instances have all three functions.
None of these functions create a class object, though. That is done by the class statement. (To be precise: When the interpreter completes the class creation, i. e. the class does not yet exist when the functions inside it are created; mind boggling, but seldom necessary to know.)
If you do not override the __init__() function, it will be inherited and called for each created instance, yes.
Since it was not clear: what is the lifecycle for this object class,
when I use it as following?
from mymodule import myclass
This will create the class, and code for all functions. They will be classmethod, staticmethod, and method (which you can see by using type() on them)
class1 = myclass()
This will create an instance of the class, which has a dictionary and a lot of other stuff. It doesn't do anything to your methods though.
class1.staticfunct(4)
This calls your staticfunct.
class1.classfunct(3)
This calls your classfunct.
class1.myfunct
This will create a new object that is a bound myfunct method of class1. It is often useful to bind this to a variable if you are going to be calling it over and over. But this bound method has normal lifetime.
Here is an example you might find illustrative:
>>> class foo(object):
...     def bar(self):
...         pass
...
>>> x = foo()
>>> x.bar is x.bar
False
Every time you access x.bar, it creates a new bound method object.
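The binding step can also be performed by hand through the descriptor protocol, which is roughly what attribute access does behind the scenes. The sketch below assumes CPython and Python 3; bar returns a value only so that the calls can be checked.

```python
class foo(object):
    def bar(self):
        return 'called'

x = foo()

# The class namespace holds a plain function; __get__ binds it to an instance.
function = foo.__dict__['bar']
manual = function.__get__(x, foo)

# The hand-made bound method compares equal to x.bar but is a distinct object.
equal = manual == x.bar
identical = manual is x.bar

# Storing the bound method once avoids re-creating it on every call.
cached = x.bar
results = [cached() for _ in range(3)]

print(equal, identical, results)
```

Caching the bound method in a local variable, as done with cached, is the pattern the answer above refers to for calls made over and over.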
And another example showing class methods:
>>> class foo(object):
...     @classmethod
...     def bar(cls):
...         pass
...
>>> foo.bar
<bound method type.bar of <class '__main__.foo'>>
Your class myclass actually has four methods that are important: the three you explicitly coded and the constructor, __init__ which is inherited from object. Only the constructor creates a new instance. So in your code one instance is created, which you have named class1 (a poor choice of name).
myfunct creates a new integer by adding class1.a to 4. The lifecycle of class1 is not affected, nor are variables class1.a, class1.b, myclass.a or myclass.b.
staticfunct just prints something, and the attributes of myclass and class1 are irrelevant.
classfunct modifies the variable myclass.a. It has no effect on the lifecycle or state of class1.
The variable myclass.b is never used or accessed at all; the variables named b in the individual functions refer to the values passed in the function's arguments.
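The effects described above can be checked with a corrected copy of the class. In this sketch staticfunct returns its argument instead of printing it (an illustrative change), and myfunct is actually called with 4, which the original snippet stops short of doing.

```python
class myclass(object):
    a = 1
    b = True

    def myfunct(self, b):
        return self.a + b

    @staticmethod
    def staticfunct(b):
        return b

    @classmethod
    def classfunct(cls, b):
        cls.a = b

class1 = myclass()
static_result = class1.staticfunct(4)

# classfunct rebinds the class attribute; the instance gains nothing of its own.
class1.classfunct(3)
instance_state = dict(vars(class1))

# myfunct reads a through the instance, which falls back to the class attribute.
total = class1.myfunct(4)

print(static_result, myclass.a, class1.a, instance_state, total, myclass.b)
```

The empty instance dictionary confirms that class1.a is just the class attribute seen through the instance.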
Additional info added based on the OP's comments:
Everything in Python is an object, including the basic data types (ints, strings, floats, etc.). That includes the class itself (a class object), every method (a method object) and every instance you create. Once created, each object remains alive until every reference to it disappears; then it is garbage-collected.
So in your example, when the interpreter reaches the end of the class statement body an object named "myclass" exists, and additional objects exist for each of its members (myclass.a, myclass.b, myclass.myfunct, myclass.staticfunct etc.) There is also some overhead for each object; most objects have a member named __dict__ and a few others. When you instantiate an instance of myclass, named "class1", another new object is created. But there are no new method objects created, and no instance variables since you don't have any of those. class1.a is a pseudonym for myclass.a and similarly for the methods.
If you want to get rid of an object, i.e., have it garbage-collected, you need to eliminate all references to it. In the case of global variables you can use the "del" statement for this purpose:
A = myclass()
del A
Will create a new instance and immediately delete it, releasing its resources for garbage collection. Of course you then cannot subsequently use the object, for example print(A) will now give you an exception.
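To watch an instance disappear after del, a weak reference can serve as an observer that does not keep the object alive. This is a sketch under the assumption of CPython, where reference counting normally frees the object immediately; the explicit gc.collect() call is there only as a precaution.

```python
import gc
import weakref

class myclass(object):
    a = 1

A = myclass()

# A weak reference lets us observe the object without adding a strong reference.
watcher = weakref.ref(A)
alive_before = watcher() is not None

del A          # removes the only strong reference
gc.collect()   # precaution; CPython usually frees the object right away

alive_after = watcher() is not None
print(alive_before, alive_after)
```

Note that del removes a name, not an object; the instance goes away only because no other reference to it remains.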

python memory consumption and performance related to classes

I am curious about the memory consumption and performance of Python in relation to nested classes vs. class attributes.
Suppose I have classes called OtherClass, ClassA, ClassB, and ClassC, where OtherClass needs access to a limited set of attributes of ClassA-C. Assume ClassA-C are large classes with many attributes, methods, and properties. Which one of these scenarios is more efficient?
Option 1:
class OtherClass(object):
    def __init__(self, classa, classb, classc):
        self.classa = classa
        self.classb = classb
        self.classc = classc
Option 2:
class OtherClass(object):
    def __init__(self, classa_id, classa_atr1, classa_atr2,
                 classb_id, classb_atr1, classb_atr2,
                 classc_id, classc_atr1, classc_atr2):
        self.classa_id = classa_id
        self.classb_id = classb_id
        self.classc_id = classc_id
        self.classa_atr1 = classa_atr1
        self.classb_atr1 = classb_atr1
        self.classc_atr1 = classc_atr1
        self.classa_atr2 = classa_atr2
        self.classb_atr2 = classb_atr2
        self.classc_atr2 = classc_atr2
I imagine option 1 is better, since the 3 attributes will simply link to the class instance already existing in memory. Where option 2 is adding 6 additional attributes per instance to memory. Is this correct?
TL;DR
My answer is that you should prefer option 1 for its simplicity and better OOP design, and avoid premature optimization.
The Rest
I think the efficiency question here is dwarfed by how difficult it will be to maintain your second option in the future. If one object needs to use attributes of another object (your example code uses a form of composition), then it should have those objects as attributes, rather than creating extra references directly to the object attributes it needs. Your first option is the way to go. The first option supports encapsulation; option 2 very clearly violates it. (Granted, encapsulation isn't as strongly enforced in Python as in some languages, like Java, but it's still a good principle to follow.)
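To make the memory side of option 1 concrete: storing an object as an attribute stores a reference to it, not a copy. The sketch below uses a cut-down, hypothetical ClassA with made-up attribute values and a single-argument OtherClass.

```python
class ClassA(object):
    def __init__(self):
        # Hypothetical attributes standing in for a large class.
        self.id = 1
        self.atr1 = 'x'
        self.atr2 = list(range(1000))

class OtherClass(object):
    def __init__(self, classa):
        # Keeps a reference to the existing object; nothing is copied.
        self.classa = classa

a = ClassA()
other = OtherClass(a)

same_object = other.classa is a

# Because it is the same object, later changes are visible through both names.
a.atr1 = 'changed'
seen_through_other = other.classa.atr1

print(same_object, seen_through_other)
```

Each OtherClass instance in this sketch therefore costs one extra reference per wrapped object, regardless of how large that object is.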
The only efficiency-related reason you should prefer number two is if you find your code is slow, you profile, and your profiling shows that these extra lookups are indeed your bottleneck. Then you could consider sacrificing things like ease of maintenance for the speed you need. It is possible that the extra layer of references (foo = self.classa.bar() vs. foo = self.bar()) could slow things down if you're using them in tight loops, but it's not likely.
In fact, I would go one step further and say you should modify your code so that OtherClass actually instantiates the object it needs, rather than having them passed in. With Option 1, if I want to use OtherClass, I have to do this:
classa_obj = ClassA(classa_init_args)
classb_obj = ClassB(classb_init_args)
classc_obj = ClassC(classc_init_args)
otherclass_obj = OtherClass(classa_obj, classb_obj, classc_obj)
That's too much setup required just to instantiate OtherClass. Instead, change OtherClass to this:
class OtherClass(object):
    def __init__(self, classa_init_args, classb_init_args, classc_init_args):
        self.classa = ClassA(classa_init_args)
        self.classb = ClassB(classb_init_args)
        self.classc = ClassC(classc_init_args)
Now instantiating an OtherClass object is simply this:
otherclass_obj = OtherClass(classa_init_args, classb_init_args, classc_init_args)
If possible, another option is to restructure your classes so that you don't even have to instantiate the other classes. Have a look at class attributes and the classmethod decorator. That allows you to do things like this:
class foo(object):
    bar = 2
    @classmethod
    def frobble(cls):
        return "I didn't even have to be instantiated!"
print(foo.bar)
print(foo.frobble())
This code prints this:
2
I didn't even have to be instantiated!
If your OtherClass uses attributes or methods of classa, classb, and classc that don't need to be tied to an instance of those classes, consider using them directly via class methods and attributes instead of instantiating the objects. That would actually save you the most memory by avoiding the creation of entire objects.
