Say I have some class, with a member function that is not used too often, but whose definition is quite lengthy
class foo:
    # ...
    def fn(self):
        print('This function is called rarely but its definition is quite lengthy')
    # ...
At some point in my program I want to create millions of instances of class foo, and I want this to take as little space in memory as possible. Is the lengthy function fn somehow also copied a million times? In that case it would be better to define an external function and pass it an instance as input. If it is not copied a million times, I would rather keep it as a member function.
An instance method is in fact a member of the class. When the Python interpreter sees a construct like obj.method(params, ...), it (more or less) translates it as (obj.__class__).method(obj, params, ...). It looks for a method member in the class of obj and calls it after prepending a reference to the object itself.
TL;DR: methods are not copied into instance objects, so you can safely keep your lengthy function as a method.
Methods declared at the class level are shared across all instances, just like class variables, so you don't have to worry about methods taking more memory when more instances are instantiated.
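A quick interactive check makes this concrete, reusing the foo class from the question (the specific lookups are just for illustration):

a = foo()
b = foo()
print(a.fn.__func__ is b.fn.__func__)  # True: both bound methods wrap the same function object
print('fn' in a.__dict__)              # False: the instances hold no copy of fn
print('fn' in foo.__dict__)            # True: the single copy lives on the class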
I'm looking at the source code for a trie implementation
On lines 80-85:
def keys(self, prefix=[]):
    return self.__keys__(prefix)

def __keys__(self, prefix=[], seen=[]):
    result = []
    etc.
What is def __keys__? Is that a magic object that is self-created? If so, is this poor code? Or does __keys__ exist as a standard Python magic method? I can't find it anywhere in the Python documentation, though.
Why is it legal for the function to call self.__keys__ before def __keys__ has even been defined? Wouldn't def __keys__ have to go before def keys (since keys calls __keys__)?
For your second question: it is legal. The functions of a class are defined when the class itself gets defined, so you can be sure both functions exist before keys() is called. The same logic applies to normal functions; we can do -
>>> def a():
...     b()
...
>>> def b():
...     print("In B()")
...
>>> a()
In B()
This is legal because both a() and b() are defined before a() is called. It would only fail if you tried to call a() before b() had been defined. Note that defining a function does not automatically call it, and Python does not check at definition time whether the names used inside a function exist; that check happens at runtime, when the function is called, and a missing name raises a NameError.
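For contrast, here is a sketch of the failing case: a() is called before b() exists, and the error only shows up at call time:

>>> def a():
...     b()
...
>>> a()
Traceback (most recent call last):
  ...
NameError: name 'b' is not defined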
For your first question, I do not know of any magic method called __keys__(), and I cannot find it in the documentation either.
All of the real "magic methods" are in the data model documentation; __keys__ isn't one of them. The style guide says:
Never invent such names; only use them as documented.
so yes, making up a new one is bad form (the convention would have been to call it _keys).
The second part of your question doesn't quite make sense; even if this weren't a class, there would be no need to define methods and functions in the order they're called. As long as they exist by the time the call is actually made, it's not a problem. I tend to define public methods before private ones, even though the former may call the latter, simply for the reader's convenience.
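As a minimal sketch of that convention, here is how the trie methods could be laid out with the single-underscore name suggested above (the bodies are placeholders, not the original implementation):

class Trie:
    def keys(self, prefix=None):
        # _keys is defined further down; the name is only looked up
        # when keys() actually runs, so the order does not matter
        return self._keys(prefix or [])

    def _keys(self, prefix, seen=None):
        return []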
There is no magic method named __keys__(), so as you suspected this is just poor naming.
The code in the class definition can be in any order. All that matters is that the definition has been made by the time the actual call is made downstream.
There is no magic method named __keys__, so it's just a poor choice of name. Looking at the code, the author simply wanted a private method that is used internally, and also from the public method keys. As you can see, __keys__ accepts an additional argument.
About the second question: there is no need to define the functions in the same order they are called. Both are defined by the time the call is actually made, because the whole class body has already been executed.
The compilation of a class in Python is done way before the class is instantiated.
Whenever the class type is created, the body of the class block is compiled and executed, and the resulting function objects (and any classmethod/staticmethod objects) are stored in the type's __dict__. They are not copied into instances; when an attribute is looked up on an instance and not found there, the lookup falls back to the class, and a function found on the class is turned into a bound method on the fly.
Therefore, at the moment of calling instance.keys(), both keys and __keys__ are already defined on the class and reachable from the instance.
Also, there is no __keys__ method in the data model, as far as I know.
I am curious about this: what actually happens to the Python objects once you create a class that contains each one of these functions?
Looking at some examples, I see that either the bound, static or class function is in fact creating a class object, which is the one that contains all three functions.
Is this always true, no matter which function I call? And is the parent class (object in this case, but it can be anything, I think) always called, since the constructor in my class invokes it implicitly?
class myclass(object):
    a = 1
    b = True

    def myfunct(self, b):
        return (self.a + b)

    @staticmethod
    def staticfunct(b):
        print(b)

    @classmethod
    def classfunct(cls, b):
        cls.a = b
Since it was not clear: what is the lifecycle of this class object when I use it as follows?
from mymodule import myclass
class1 = myclass()
class1.staticfunct(4)
class1.classfunct(3)
class1.myfunct
In the case of the static function, does the myclass object get allocated and the function run, but the class and bound methods are not generated?
In the case of the class function, is it the same as above?
In the case of the bound function, is everything in the class allocated?
The class statement creates the class. That is an object which has all three functions, but the first (myfunct) is unbound and cannot be called without an instance object of this class.
The instances of this class (in case you create them) will have bound versions of this function and references to the static and the class functions.
So, both the class and the instances have all three functions.
None of these functions creates a class object, though. That is done by the class statement. (To be precise: by the interpreter when it completes the class creation, i.e. the class does not yet exist while the functions inside it are being created; mind-boggling, but seldom necessary to know.)
If you do not override the __init__() function, it will be inherited and called for each created instance, yes.
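You can check the inherited constructor directly (assuming the corrected myclass definition from the question):

print(myclass.__init__ is object.__init__)  # True: __init__ is inherited from object, not redefined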
Since it was not clear: what is the lifecycle of this class object when I use it as follows?
from mymodule import myclass
This will create the class, and code for all functions. They will be classmethod, staticmethod, and method (which you can see by using type() on them)
class1 = myclass()
This will create an instance of the class, which has a dictionary and a lot of other stuff. It doesn't do anything to your methods though.
class1.staticfunct(4)
This calls your staticfunct.
class1.classfunct(3)
This calls your classfunct.
class1.myfunct
This will create a new object that is a bound myfunct method of class1. It is often useful to bind this to a variable if you are going to be calling it over and over. But this bound method has a normal lifetime like any other object.
Here is an example you might find illustrative:
>>> class foo(object):
...     def bar(self):
...         pass
...
>>> x = foo()
>>> x.bar is x.bar
False
Every time you access x.bar, it creates a new bound method object.
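Following the note above about binding the method to a variable, a small sketch:

bar = x.bar   # build the bound method object once
bar()         # ...and reuse it; no new method object is created per call
bar()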
And another example showing class methods:
>>> class foo(object):
...     @classmethod
...     def bar(cls):
...         pass
...
>>> foo.bar
<bound method type.bar of <class '__main__.foo'>>
Your class myclass actually has four methods that are important: the three you explicitly coded and the constructor, __init__ which is inherited from object. Only the constructor creates a new instance. So in your code one instance is created, which you have named class1 (a poor choice of name).
myfunct, when called, creates a new integer by adding its argument b to class1.a. The lifecycle of class1 is not affected, nor are the variables class1.a, class1.b, myclass.a or myclass.b.
staticfunct just prints something, and the attributes of myclass and class1 are irrelevant.
classfunct modifies the variable myclass.a. It has no effect on the lifecycle or state of class1.
The variable myclass.b is never used or accessed at all; the variables named b in the individual functions refer to the values passed in the function's arguments.
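Assuming the corrected myclass definition from the question, a short sketch of what classfunct changes (the instance name c is made up for illustration):

c = myclass()
myclass.classfunct(7)   # rebinds the class attribute myclass.a
print(myclass.a)        # 7
print(c.a)              # 7: the instance sees the class attribute
c.a = 99                # this creates a separate instance attribute on c
print(myclass.a)        # still 7: the class attribute is untouched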
Additional info added based on the OP's comments:
Everything in Python is an object, including the basic data types (ints, strings, floats, etc.). That includes the class itself (a class object), every method (a method object) and every instance you create. Once created, each object remains alive until every reference to it disappears; then it is garbage-collected.
So in your example, when the interpreter reaches the end of the class statement body, an object named "myclass" exists, and additional objects exist for each of its members (myclass.a, myclass.b, myclass.myfunct, myclass.staticfunct etc.). There is also some overhead for each object; most objects have a member named __dict__ and a few others. When you instantiate an instance of myclass named "class1", another new object is created. But no new method objects are created, and no instance variables either, since you don't have any of those. class1.a is just a name that resolves to myclass.a, and similarly for the methods.
If you want to get rid of an object, i.e., have it garbage-collected, you need to eliminate all references to it. In the case of global variables you can use the "del" statement for this purpose:
A = myclass()
del A
will create a new instance and immediately delete it, releasing its resources for garbage collection. Of course you then cannot use the object afterwards; for example, print(A) will now give you an exception.
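In CPython you can observe the collection with a weak reference (a small sketch; weakref is only used here to watch the instance disappear):

import weakref

A = myclass()
r = weakref.ref(A)   # a weak reference does not keep the instance alive
del A                # drop the only strong reference
print(r())           # None in CPython: the instance has been collected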
I usually use classes similarly to how one might use namedtuple (except of course that the attributes are mutable). Moreover, I try to put lengthy functions in classes that won't be instantiated as frequently, to help conserve memory.
From a memory point of view, is it inefficient to put functions in classes, if it is expected that the class will be instantiated often? Keeping aside that it's good design to compartmentalize functionality, should this be something to be worried about?
Methods don't add any weight to an instance of your class. The method itself only exists once and is parameterized in terms of the object on which it operates. That's why you have a self parameter.
Python doesn't maintain pointers directly to its methods in instances of new-style classes. Instead, it maintains a single pointer to the parent class. Consider the following example:
class Foo(object):
    def bar(self):
        print('hello')

f = Foo()
f.bar()
In order to dispatch the bar method from the instance f, two lookups need to be made. Instead of f containing a method table to look for bar, f contains a reference to the class object Foo. Foo contains the method table, where it calls bar with f as the first argument. So f.bar() can be rewritten as
Foo.bar(f)
Instances of a class have one pointer that refers to the class; all other features of the class are unique and accessed through that pointer. Something like
foo.bar()
really translates to something like
foo.__class__.bar(foo)
so methods are unique, long-lived objects belonging to the class that take the instance as an argument when called.
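A short check of that claim, using a throwaway Foo class (a Python 3 sketch, not code from the question):

class Foo(object):
    def bar(self):
        return 'hello'

f = Foo()
print(f.bar.__func__ is Foo.__dict__['bar'])  # True: the bound method wraps the one function stored on the class
print(Foo.bar(f) == f.bar())                  # True: the two call forms run the same code on f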
Each object has its own copy of the data members, whereas the member functions are shared. Python keeps one copy of the member functions, separate from all objects of the class, and all objects of the class share this one copy.
The whole point of OOP is to combine data and the functions that operate on it. Without OOP, the data and the functions are kept separate, and only the functions can easily be reused.
I am writing a framework to be used by people who know some Python. I have settled on some syntax, and it makes sense to me and them to use something like this, where Base is the Base class that implements the framework.
class A(Base):
    @decorator1
    @decorator2
    @decorator3
    def f(self):
        pass

    @decorator4
    def f(self):
        pass

    @decorator5
    def g(self):
        pass
All my framework is implemented via Base's metaclass. This setup is appropriate for my use case, because all these user-defined classes have a rich inheritance graph. I expect the user to implement some of the methods, or just leave it with pass. Much of the information that the user is giving here is in the decorators. This allows me to avoid other solutions where the user would have to monkey-patch, give less structured dictionaries, and things like that.
My problem here is that f is defined twice by the user (with good reason), and this should be handled by my framework. Unfortunately, by the time this gets to the metaclass's __new__ method, the dictionary of attributes contains only one key f. My idea was to use yet another decorator, such as @duplicate, for the user to signal that this is happening, so that the two f's are wrapped differently and don't overwrite each other. Can something like this work?
You should use a namespace to distinguish the different fs.
Heed the advice of the "Zen of Python":
Namespaces are one honking great idea -- let's do more of those!
Yes, you could, but only with an ugly hack, and only by storing the function with a new name.
You'd have to use the sys._getframe() function to retrieve the local namespace of the calling frame. That local namespace is the class-in-construction, and adding items to that namespace means they'll end up in the class dictionary passed to your metaclass.
The following retrieves that namespace:
callframe = sys._getframe(1)
namespace = callframe.f_locals
namespace is a dict-like object, just like locals(). You can now store something in that namespace (like __function_definitions__ or similar) to add extra references to the functions.
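Putting that together, a rough sketch of such a decorator; the names duplicate and __function_definitions__ are made up for illustration, and the metaclass's __new__ would then read the extra key from the attribute dictionary:

import sys

def duplicate(func):
    # Frame 1 is the class body currently being executed; its locals
    # are (roughly) the mapping that becomes the class dictionary.
    namespace = sys._getframe(1).f_locals
    # Stash every decorated version under a side-channel key so that a
    # later def of the same name cannot erase the earlier ones.
    namespace.setdefault('__function_definitions__', []).append(func)
    return func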
You might be thinking of Java, with method overloading and argument signatures, but this is Python and you cannot do that. The second f() will overwrite the first f(), and you end up with only one f(). The class namespace is a dictionary and you cannot have duplicate keys.
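You can see the overwrite directly:

class A(object):
    def f(self):
        return 1
    def f(self):      # the name f is simply rebound in the class namespace
        return 2

print(A().f())        # 2: only the second definition survives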
Being probably one of the worst OOP programmers on the planet, I've been reading through a lot of example code to help 'get' what a class can be used for. Recently I found this example:
class NextClass:                            # define class
    def printer(self, text):                # define method
        self.message = text                 # change instance
        print self.message                  # access instance

x = NextClass()                             # make instance
x.printer('instance call')                  # call its method
print x.message                             # instance changed

NextClass.printer(x, 'class call')          # direct class call
print x.message                             # instance changed again
It doesn't appear there is any difference between what the direct class call does and what the instance call does; but it goes against the Zen to include features like that without some use for them. So if there is a difference, what is it? Performance? Overhead reduction? Maybe readability?
There is no difference. instance.method(...) is class.method(instance, ...). But this doesn't go against the Zen, since it says (emphasis mine):
There should be one-- and preferably only one --obvious way to do it.
The second way is possible, and everyone with a good knowledge of Python should know that (and why), but it's a non-obvious way of doing it, and nobody uses it in real code.
So why is it that way? It's just how methods work in any language - a method is some code that operates on an object/instance (and possibly more arguments). Except that usually the instance is supplied implicitly (e.g. this in C++/Java/D) - but since the Zen says "explicit is better than implicit", self is explicitly a parameter of every method, which inevitably allows this. Explicitly prohibiting it would be pointless.
And apart from that, the fact that methods are not forced to (implicitly) take an instance allows class methods and static methods to be defined without special treatment in the language - the former is just a method that expects a class instead of an instance, and the latter is just a method that doesn't expect an instance at all.
In this situation, there is no difference. In both calls, you are supplying an instance of that class:
x.printer('instance call') # you supplied x and then called its printer method
NextClass.printer(x, 'class call') # you supplied x as a parameter this time
Normally, though, I wouldn't write the second form very often. I usually think of any method that operates on an instance as an instance method. Things like:
car.drive('place')
car.refuel()
car.impound()
And, I use class methods to operate more generally (I'm struggling to describe this):
Car.numberintheworld # returns the number of cars in the world (or your program)
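As a rough sketch of what such a class method could look like (the _count attribute and the method bodies are assumptions, not part of the original example):

class Car(object):
    _count = 0                 # shared, class-level state

    def __init__(self):
        Car._count += 1        # every new instance bumps the shared counter

    def drive(self, place):    # instance method: operates on one car
        self.place = place

    @classmethod
    def numberintheworld(cls): # class method: reports on all cars
        return cls._count

Car(); Car()
print(Car.numberintheworld())  # 2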
Here's some more help, for your reading.
This might give you a better clarification.
When an instance attribute is referenced that isn't a data attribute, its class is searched. If the name denotes a valid class attribute that is a function object, a method object is created by packing (pointers to) the instance object and the function object just found together in an abstract object: this is the method object. When the method object is called with an argument list, a new argument list is constructed from the instance object and the argument list, and the function object is called with this new argument list.
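In terms of the NextClass example above, the "packing" the docs describe happens at the moment you write x.printer; the resulting method object can be held in a variable and called later (a small sketch building on the quoted code):

x = NextClass()
m = x.printer            # a method object packing x together with NextClass.printer
m('bound call')          # equivalent to NextClass.printer(x, 'bound call')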