I just conducted an interesting test:
~$ python3 # I also conducted this on python 2.7.6, with the same result
Python 3.4.0 (default, Apr 11 2014, 13:05:11)
[GCC 4.8.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> class Foo(object):
...     def __add__(self, other):
...         global add_calls
...         add_calls += 1
...         return Foo()
...     def __iadd__(self, other):
...         return self
...
>>> add_calls = 0
>>> a = list(map(lambda x:Foo(), range(6)))
>>> a[0] + a[1] + a[2]
<__main__.Foo object at 0x7fb588e6c400>
>>> add_calls
2
>>> add_calls = 0
>>> sum(a, Foo())
<__main__.Foo object at 0x7fb588e6c4a8>
>>> add_calls
6
Obviously, the __iadd__ method is more efficient than the __add__ method, since it does not require allocating a new object. If the objects being added were sufficiently complicated, the unnecessary new objects could create huge bottlenecks in my code.
I would expect that, in an expression like a[0] + a[1] + a[2], the first operation would call __add__, and the second operation would call __iadd__ on the newly created object.
Why doesn't Python optimize this?
The __add__ method is free to return a different type of object, while __iadd__, if it uses in-place semantics, should return self. The two are not required to return the same type of object, so sum() cannot rely on the special semantics of __iadd__.
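For example (an illustrative sketch, not from the original post), nothing stops __add__ from producing an entirely different type than its operands:

class Meters(object):
    def __init__(self, value):
        self.value = value
    def __add__(self, other):
        # deliberately returns a plain float, not a Meters instance
        return float(self.value + other.value)

If sum() silently switched to __iadd__ after the first addition, it would be relying on in-place semantics that classes like this never promised.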
You can use the functools.reduce() function to implement your desired functionality yourself:
from functools import reduce
sum_with_inplace_semantics = reduce(Foo.__iadd__, a, Foo())
Demo:
>>> from functools import reduce
>>> class Foo(object):
...     def __add__(self, other):
...         global add_calls
...         add_calls += 1
...         return Foo()
...     def __iadd__(self, other):
...         global iadd_calls
...         iadd_calls += 1
...         return self
...
>>> a = [Foo() for _ in range(6)]
>>> result = Foo()
>>> add_calls = iadd_calls = 0
>>> reduce(Foo.__iadd__, a, result) is result
True
>>> add_calls, iadd_calls
(0, 6)
Martijn's answer provides an excellent workaround, but I feel the need to summarize the bits and pieces of answers scattered throughout the comments:
The sum function is primarily used for immutable types. Performing all additions except the first in-place would be a performance improvement for objects that have an __iadd__ method, but checking for the __iadd__ method would cause a performance loss in the more typical case. Special cases aren't special enough to break the rules.
I also stated that __add__ should probably only be called once in a + b + c: a + b would create a temporary object, and tmp.__iadd__(c) would then be called before returning it. However, this would violate the principle of least surprise.
Since you are writing your class anyway, you know its __add__ can return the same object as well, don't you?
And therefore you can make your optimized code work with both the + operator and the built-in sum:
>>> class Foo(object):
...     def __add__(self, other):
...         global add_calls
...         add_calls += 1
...         return self
(Just beware of passing your objects to third party functions that expect "+" to return a new object.)
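For instance, with the mutating __add__ above, the result of + aliases the left operand, which will surprise code that expects a fresh object:

>>> a, b = Foo(), Foo()
>>> c = a + b
>>> c is a  # c is not a new object; mutating c mutates a
True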
I have seen source code where more than one method is called on an object, e.g. x.y().z(). Can someone please explain this to me? Does this mean that z() is called inside y(), or what?
This calls the method y() on the object x, then calls the method z() on the result of y(); the value of the entire expression is the result of z().
For example
friendsFavePizzaTopping = person.getBestFriend().getFavoritePizzaTopping()
This would result in friendsFavePizzaTopping being the person's best friend's favorite pizza topping.
Important to note: getBestFriend() must return an object that has the method getFavoritePizzaTopping(). If it does not, an AttributeError will be thrown.
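If you are not sure the intermediate object has the method, you can guard the second call (a hedged sketch reusing the hypothetical names above):

best_friend = person.getBestFriend()
if hasattr(best_friend, 'getFavoritePizzaTopping'):
    friendsFavePizzaTopping = best_friend.getFavoritePizzaTopping()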
Each method is evaluated in turn, left to right. Consider:
>>> s='HELLO'
>>> s.lower()
'hello'
>>> s='HELLO '
>>> s.lower()
'hello '
>>> s.lower().strip()
'hello'
>>> s.lower().strip().upper()
'HELLO'
>>> s.lower().strip().upper().replace('H', 'h')
'hELLO'
The requirement is that the object to the left in the chain must provide the method called on the right. Often that means the objects are similar types -- or at least share compatible methods or an understood cast.
As an example, consider this class:
class Foo:
    def __init__(self, name):
        self.name = name

    def m1(self):
        return Foo(self.name + '=>m1')

    def m2(self):
        return Foo(self.name + '=>m2')

    def __repr__(self):
        return '{}: {}'.format(id(self), self.name)

    def m3(self):
        return .25  # return is no longer a Foo
Notice that, as with an immutable type, each method returns a new object (either a new Foo for m1 and m2, or a new float for m3). Now create a Foo and try those methods:
>>> foo = Foo('init')
>>> foo
4463545376: init
>>> foo.m1()
4463545304: init=>m1
^^^^ different object id
>>> foo
4463545376: init
^^^^ foo still the same because you need to assign the result to change it
Now assign:
>>> foo=foo.m1().m2()
>>> foo
4464102576: init=>m1=>m2
Now use m3() and it will be a float; not a Foo anymore:
>>> foo=foo.m1().m2().m3()
>>> foo
0.25
Now a float -- can't use foo methods anymore:
>>> foo.m1()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'float' object has no attribute 'm1'
But you can use float methods:
>>> foo.as_integer_ratio()
(1, 4)
In the case of:
x.y().z()
You're almost always looking at immutable objects. Mutable objects don't return anything that would HAVE a function like that (for the most part, but I'm simplifying). For instance...
class x:
    def __init__(self):
        self.y_done = False
        self.z_done = False

    def y(self):
        new_x = x()
        new_x.y_done = True
        return new_x

    def z(self):
        new_x = x()
        new_x.z_done = True
        return new_x
You can see that each of x.y and x.z returns an x object. That object is used to make the next call in the chain; e.g. in x.y().z(), z is not called on x, but on the result of x.y().
x.y().z() =>
tmp = x.y()
result = tmp.z()
In @dawg's excellent example, he's using strings (which are immutable in Python) whose methods return strings.
string = 'hello'
string.upper() # returns a NEW string with value "HELLO"
string.upper().replace("E","O") # returns a NEW string that's based off "HELLO"
string.upper().replace("E","O") + "W"
# "HOLLOW"
The . "operator" is Python syntax for attribute access. x.y is (nearly) identical to
getattr(x, 'y')
so x.y() is (nearly) identical to
getattr(x, 'y')()
(I say "nearly identical" because it's possible to customize attribute access for a user-defined class. From here on out, I'll assume no such customization is done, and you can assume that x.y is in fact identical to getattr(x, 'y').)
If the thing that x.y() returns has an attribute z such that
foo = getattr(x, 'y')
bar = getattr(foo(), 'z')
is legal, then you can chain the calls together without needing the name foo in the middle:
bar = getattr(getattr(x, 'y')(), 'z')
Converting back to dot notation gives you
bar = getattr(x.y(), 'z')
or simply
bar = x.y().z()
x.y().z() means that the object x has a method y(), and the result of x.y() has a method z(). If you first want to apply y() to x and then apply z() to the result, you write x.y().z(). This is like:
val = x.y()
result = val.z()
Example:
my_dict = {'key':'value'}
my_dict is a dict object. my_dict.get('key') returns 'value', which is a str object. Now I can apply any str method to it, like:
my_dict.get('key').upper()
This will return 'VALUE'.
That is (sometimes) a sign of bad code.
It violates the Law of Demeter. Here is a quote from Wikipedia explaining what is meant:
Each unit should have only limited knowledge about other units: only units "closely" related to the current unit.
Each unit should only talk to its friends; don't talk to strangers.
Only talk to your immediate friends.
Suppose you have a car, which itself has an engine:
class Car:
    def __init__(self):
        self._engine = None

    @property
    def engine(self):
        return self._engine

    @engine.setter
    def engine(self, value):
        self._engine = value

class Porsche_engine:
    def start(self):
        print("starting")
So if you make a new car and set its engine to a Porsche engine, you could do the following:
>>> from car import *
>>> c=Car()
>>> e=Porsche_engine()
>>> c.engine=e
>>> c.engine.start()
starting
If you are making this call from another object, that object not only has knowledge of the Car, but also of its Engine, which is bad design.
Additionally: if you do not know whether a Car has an engine, calling start directly
>>> c=Car()
>>> c.engine.start()
may result in an error:
AttributeError: 'NoneType' object has no attribute 'start'
Edit:
To avoid (further) misunderstandings and misreadings of what I am saying:
There are two usages:
1) as I pointed out, objects calling methods on other objects returned from a third object is a violation of the LoD. This is one way to read the question.
2) an exception to that is method chaining, which is not bad design.
And a better design would be for the Car itself to have a start() method which delegates to the engine.
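A minimal sketch of that delegating design (illustrative only):

class Car:
    def __init__(self, engine=None):
        self.engine = engine

    def start(self):
        # the Car talks only to its immediate friend, the engine
        if self.engine is None:
            raise RuntimeError('no engine installed')
        self.engine.start()

Now callers write Car(Porsche_engine()).start() and need no knowledge of the engine at all.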
[The code in the original version was badly messed up. Even after I fixed the code, several highly confusing typos remained in the post. I believe I finally fixed all of them too. Profuse apologies.]
The two calls to alias below produce different outputs, because the object associated with the variable my_own_id changes between the two calls:
>>> def my_own_id():
...     me = my_own_id
...     return id(me)
...
>>> alias = my_own_id
>>> alias()
4301701560
>>> my_own_id = None
>>> alias()
4296513024
What can I assign to me in the definition of my_own_id so that its output remains invariant wrt subsequent re-definitions of the my_own_id variable? (IOW, so that the internal me variable always refers to the same function object?)
(I can get the current frame (with inspect.currentframe()), but it contains only a reference to the current code object, not to the current function.)
P.S. The motivation for this question is only to know Python better.
It seems that referring to my_own_id looks up the name 'my_own_id' in the global namespace dictionary at call time, so what is found is whatever the name used in the function definition is currently bound to. Since that name can be rebound to different values, the value retrieved can also change. If you make me a default argument, you can bind it to the function itself right after the function definition to keep a reference to the actual function object.
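A minimal sketch of that idea (the default must be bound right after the def statement, since the function object does not exist yet while its own definition is executing):

def my_own_id(me=None):
    return id(me)

# bind the default to the function object itself, once, at definition time
my_own_id.__defaults__ = (my_own_id,)

alias = my_own_id
my_own_id = None
print(alias())  # still the id of the original function object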
You could use this decorator which implicitly passes the original function itself as the first argument.
>>> from functools import wraps
>>> def save_id(func):
...     @wraps(func)
...     def wrapper(*args, **kwargs):
...         return func(func, *args, **kwargs)
...     return wrapper
...
>>> @save_id
... def my_own_id(me):  # me is passed implicitly by save_id
...     return id(me)
...
>>> alias = my_own_id
>>> alias()
40775152
>>> my_own_id = 'foo'
>>> alias()
40775152
Indeed, if you rely only on the function name, and that name is overwritten in the global variable space (of the module where the function was defined), a reference using the function's own name will fail.
The easier, more maintainable way is to write a decorator that provides a nonlocal variable containing a reference to the function itself.
from functools import wraps

def know_thyself(func):
    @wraps(func)
    def new_func(*args, **kwargs):
        my_own_id = func
        return func(*args, **kwargs)
    return new_func
And can be used as:
>>> @know_thyself
... def my_own_id():
...     me = my_own_id
...     return id(me)
...
There is another possible approach, far from being this clean, using frame introspection and rebuilding a new function reusing the same code object. I had used it in this post about a self-referential lambda expression in Python:
http://metapython.blogspot.com.br/2010/11/recursive-lambda-functions.html
Well, if you don't mind calling a function (to get the desired function into the global scope), you can wrap the function to protect its definition:
>>> def be_known():
...     global my_own_id
...     def _my_own_id():
...         return id(_my_own_id)
...     my_own_id = _my_own_id
...
>>> be_known()
>>> my_own_id()
140685505972568
>>> alias, my_own_id = my_own_id, None
>>> alias()
140685505972568
Note that the protected function must refer to itself by the inner name _my_own_id, not the global name.
The decorator approach is probably the best one. Here are some more for fun:
Hijack one of the function arguments to provide a static variable.
def fn(fnid=None):
    print "My id:", fnid

fn.func_defaults = (id(fn),)
There are a few ways to get the current function here: Python code to get current function into a variable?; most of these involve searching for currentframe().f_code in a variety of places. These work without any modification to the original function.
import inspect

def _this_fn():
    try:
        frame = inspect.currentframe().f_back
        code = frame.f_code
        return frame.f_globals[code.co_name]
    finally:
        del code
        del frame

def myfunc(*parms):
    print _this_fn()
>>> myfunc(1)
<function myfunc at 0x036265F0>
>>> myfunc
<function myfunc at 0x036265F0>
It's due to scope
>>> def foo():
...     x = foo
...     print x
...
>>> foo()
<function foo at 0x10836e938>
>>> alias = foo
>>> alias()
<function foo at 0x10836e938>
>>> foo = None
>>> alias()
None
>>> foo = []
>>> alias()
[]
>>> del foo
>>> alias()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 2, in foo
NameError: global name 'foo' is not defined
>>>
Luke had an idea but didn't appear to develop it: use a mutable default parameter to hold the value in the function object. Default parameter values are evaluated only once, when the function is defined, and retain their previous value after that.
>>> def my_own_id(me=[None]):
...     if not me[0]:
...         me[0] = my_own_id
...     return id(me[0])
...
>>> alias = my_own_id
>>> alias()
40330928
>>> my_own_id = None
>>> alias()
40330928
This requires care on your part never to call the function with an argument.
I want to make two functions equal to each other, like this:
def fn_maker(fn_signature):
    def _fn():
        pass
    _fn.signature = fn_signature
    return _fn
# test equality of two function instances based on the equality of their signature values
>>> fa = fn_maker(1)
>>> fb = fn_maker(1)
>>> fc = fn_maker(2)
>>> fa == fb # should be True, same signature values
True
>>> fa == fc # should be False, different signature values
False
How should I do it? I know I could probably override __eq__ and __ne__ if fa, fb, fc were instances of some class, but here __eq__ is not in dir(fa), and adding it to that list doesn't work.
I figured out a workaround using a cache, e.g.:
def fn_maker(fn_signature):
    if fn_signature in fn_maker.cache:
        return fn_maker.cache[fn_signature]
    def _fn():
        pass
    _fn.signature = fn_signature
    fn_maker.cache[fn_signature] = _fn
    return _fn
fn_maker.cache = {}
This way there is a guarantee that there is only one function object per signature value (kind of like a singleton). But I am really looking for a neater solution.
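For what it's worth, on Python 3.2+ the same memoization can be written with functools.lru_cache (assuming the signature values are hashable):

import functools

@functools.lru_cache(maxsize=None)
def fn_maker(fn_signature):
    def _fn():
        pass
    _fn.signature = fn_signature
    return _fn

# equal signatures now yield the identical function object, so ==
# (which is identity for functions) behaves as desired
assert fn_maker(1) == fn_maker(1)
assert fn_maker(1) != fn_maker(2)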
If you turned your functions into instances of some class that overrides __call__() as well as the comparison operators, it will be very easy to achieve the semantics you want.
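A minimal sketch of that approach (the names SignedFn and fn_maker are illustrative):

class SignedFn(object):
    def __init__(self, signature):
        self.signature = signature

    def __call__(self):
        pass  # whatever body your factory would generate

    def __eq__(self, other):
        return isinstance(other, SignedFn) and self.signature == other.signature

    def __ne__(self, other):
        return not self == other

    def __hash__(self):
        return hash(self.signature)

def fn_maker(fn_signature):
    return SignedFn(fn_signature)

assert fn_maker(1) == fn_maker(1)
assert fn_maker(1) != fn_maker(2)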
It is not possible to override the __eq__ implementation for functions (tested with Python 2.7)
>>> def f():
... pass
...
>>> class A(object):
... pass
...
>>> a = A()
>>> a == f
False
>>> setattr(A, '__eq__', lambda x,y: True)
>>> a == f
True
>>> setattr(f.__class__, '__eq__', lambda x,y: True)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: can't set attributes of built-in/extension type 'function'
I don't think it's possible.
But overriding __call__ seems a nice solution to me.
Consider the following (broken) code:
import functools

class Foo(object):
    def __init__(self):
        def f(a, self, b):
            print a + b
        self.g = functools.partial(f, 1)

x = Foo()
x.g(2)
What I want to do is take the function f and partially apply it, resulting in a function g(self, b). I would like to use this function as a method; however, this does not currently work, and instead I get the error
Traceback (most recent call last):
File "test.py", line 8, in <module>
x.g(2)
TypeError: f() takes exactly 3 arguments (2 given)
Doing x.g(x, 2) works, however, so it seems the issue is that g is considered a "normal" function instead of a method of the class. Is there a way to get x.g to behave like a method (i.e. implicitly pass the self parameter) instead of a function?
There are two issues at hand here. First, for a function to be turned into a method it must be stored on the class, not the instance. A demonstration:
class Foo(object):
    def a(*args):
        print 'a', args

def b(*args):
    print 'b', args
Foo.b = b

x = Foo()

def c(*args):
    print 'c', args
x.c = c
So a is a function defined in the class definition, b is a function assigned to the class afterwards, and c is a function assigned to the instance. Take a look at what happens when we call them:
>>> x.a('a will have "self"')
a (<__main__.Foo object at 0x100425ed0>, 'a will have "self"')
>>> x.b('as will b')
b (<__main__.Foo object at 0x100425ed0>, 'as will b')
>>> x.c('c will only receive this string')
c ('c will only receive this string',)
As you can see there is little difference between a function defined along with the class, and one assigned to it later. I believe there is actually no difference as long as there is no metaclass involved, but that is for another time.
The second problem comes from how a function is actually turned into a method in the first place; the function type implements the descriptor protocol. (See the docs for details.) In a nutshell, the function type has a special __get__ method which is called when you perform an attribute lookup on the class itself. Instead of you getting the function object, the __get__ method of that function object is called, and that returns a bound method object (which is what supplies the self argument).
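You can watch the descriptor protocol do this by calling __get__ by hand (a quick illustrative session; the address will vary):

>>> def f(*args):
...     print 'f', args
...
>>> class C(object):
...     pass
...
>>> f.__get__(C(), C)
<bound method C.f of <__main__.C object at 0x10042f2d0>>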
Why is this a problem? Because the functools.partial object is not a descriptor!
>>> import functools
>>> def f(*args):
...     print 'f', args
...
>>> g = functools.partial(f, 1, 2, 3)
>>> g
<functools.partial object at 0x10042f2b8>
>>> g.__get__
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'functools.partial' object has no attribute '__get__'
There are a number of options you have at this point. You can explicitly supply the self argument to the partial:
import functools
class Foo(object):
    def __init__(self):
        def f(self, a, b):
            print a + b
        self.g = functools.partial(f, self, 1)
x = Foo()
x.g(2)
...or you could embed self and the value of a in a closure:
class Foo(object):
    def __init__(self):
        a = 1
        def f(b):
            print a + b
        self.g = f
x = Foo()
x.g(2)
These solutions are of course assuming that there is an as-yet-unspecified reason for assigning the method in the constructor like this, as you can very easily just define a method directly on the class to do what you are doing here.
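For comparison, the direct version (assuming the partially-applied value can simply be written into the method body):

class Foo(object):
    def g(self, b):
        print 1 + b

x = Foo()
x.g(2)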
Edit: Here is an idea for a solution assuming the functions may be created for the class, instead of the instance:
class Foo(object):
    pass

def make_binding(name):
    def f(self, *args):
        print 'Do %s with %s given %r.' % (name, self, args)
    return f

for name in 'foo', 'bar', 'baz':
    setattr(Foo, name, make_binding(name))

f = Foo()
f.foo(1, 2, 3)
f.bar('some input')
f.baz()
Gives you:
Do foo with <__main__.Foo object at 0x10053e3d0> given (1, 2, 3).
Do bar with <__main__.Foo object at 0x10053e3d0> given ('some input',).
Do baz with <__main__.Foo object at 0x10053e3d0> given ().
This will work, but I'm not sure if it is what you are looking for:
import functools

class Foo(object):
    def __init__(self):
        def f(a, self, b):
            print a + b
        self.g = functools.partial(f, 1, self)  # <= passing `self` also

x = Foo()
x.g(2)
This is simply a concrete example of what I believe is the most correct (and therefore Pythonic :) way to solve this -- as the best solution (definition on a class!) was never revealed -- @MikeBoers' explanations are otherwise solid.
I've used this pattern quite a bit (recently for a proxied API), and it has survived untold production hours without the slightest irregularity.
from functools import update_wrapper
from functools import partial
from types import MethodType
class Basic(object):
    def add(self, **kwds):
        print sum(kwds.values())
Basic.add_to_one = MethodType(
update_wrapper(partial(Basic.add, a=1), Basic.add),
None,
Basic,
)
x = Basic()
x.add(a=1, b=9)
x.add_to_one(b=9)
...yields:
10
10
...the key take-home point here is MethodType(func, inst, cls), which creates an unbound method from another callable (you can even use this to chain/bind instance methods to unrelated classes... when instantiated and called, the original instance method will receive BOTH self objects!)
Note the exclusive use of keyword arguments! While there might be a better way to handle this, positional arguments are generally a PITA because the placement of self becomes less predictable. Also, in my experience anyway, using *args and **kwds in the bottom-most function has proven very useful later on.
functools.partialmethod() has been available since Python 3.4 for this purpose.
import functools

class Foo(object):
    def f(self, a, b):
        print(a + b)
    # partialmethod must live on the class; when x.g(2) is called,
    # self is inserted before the frozen argument, i.e. f(x, 1, 2)
    g = functools.partialmethod(f, 1)

x = Foo()
x.g(2)  # prints 3