I'd like to combine a list of instances of a class for which the __add__ method is defined.
That is, I have a list of instances L = [A, B, C, D] and I want their sum E = A + B + C + D, but generalized so that instead of the + syntax I could do something like E = sum(L).
What function should I use to do that? Is the __add__ method adequate, or do I need to define a different method (e.g. __iadd__) in order to accomplish this?
(If this turns out to be a duplicate, how should I be asking the question?)
import operator
from functools import reduce  # needed on Python 3; reduce is a builtin on Python 2
reduce(operator.add, L)
sum starts from 0 by default, so it will try to add an int to an instance of your class. Define __radd__ so that, for example, int + Foo(1) is defined:
class Foo(object):
    def __init__(self, val):
        self.val = val
    def __add__(self, other):
        return self.val + other.val
    def __radd__(self, other):
        return other + self.val
A = Foo(1)
B = Foo(2)
L = [A,B]
print(A+B)
# 3
print(sum(L))
# 3
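The example above returns plain ints from __add__. If the sum should itself be an instance of the class, one option is sketched below. It assumes Python 3 and uses a hypothetical Money class that is not part of the original answer: __add__ returns a new instance, and __radd__ only handles the int 0 that sum starts with.

```python
class Money:
    """Hypothetical value class used only for illustration."""

    def __init__(self, val):
        self.val = val

    def __add__(self, other):
        if isinstance(other, Money):
            return Money(self.val + other.val)
        return NotImplemented

    def __radd__(self, other):
        # sum() starts with the int 0, so 0 + Money(...) ends up here.
        if other == 0:
            return self
        return NotImplemented


items = [Money(1), Money(2), Money(3)]
total = sum(items)                       # relies on __radd__ for the initial 0
total_with_start = sum(items, Money(0))  # explicit start value, no __radd__ needed
print(total.val, total_with_start.val)
```

Passing an explicit start value is probably the simpler choice when you control the call site, since the class then needs no special case for 0.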
Ignore my previous answer, it was wrong.
The reduce function allows you to apply any binary function or method to all the elements of a sequence. So, you could write:
reduce(YourClass.__add__, sequence)
If not all objects in the sequence are instances of the same class, then instead use this:
import operator
reduce(operator.add, sequence)
Or this:
reduce(lambda x, y: x + y, sequence)
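For readers on Python 3, here is a minimal self-contained sketch of the reduce approach. The Vec class is hypothetical and only serves as something with __add__ defined; the one real difference from the snippets above is that reduce has to be imported from functools.

```python
from functools import reduce  # reduce is no longer a builtin on Python 3
import operator


class Vec:
    """Hypothetical 2-D vector used only for illustration."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __add__(self, other):
        return Vec(self.x + other.x, self.y + other.y)


vectors = [Vec(1, 2), Vec(3, 4), Vec(5, 6)]
total = reduce(operator.add, vectors)
print(total.x, total.y)

# An empty sequence raises TypeError unless an initial value is supplied.
empty_total = reduce(operator.add, [], Vec(0, 0))
print(empty_total.x, empty_total.y)
```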
When comparing tuples of objects, apparently the object's __eq__ method is called first and then its __cmp__ method:
import timeit
setup = """
import random
import string
import operator
random.seed('slartibartfast')
d={}
class A(object):
    eq_calls = 0
    cmp_calls = 0
    def __init__(self):
        self.s = ''.join(random.choice(string.ascii_uppercase) for _ in
                         range(16))
    def __hash__(self): return hash(self.s)
    def __eq__(self, other):
        self.__class__.eq_calls += 1
        return self.s == other.s
    def __ne__(self, other): return self.s != other.s
    def __cmp__(self, other):
        self.__class__.cmp_calls += 1
        return cmp(self.s ,other.s)
for i in range(1000): d[A()] = 0"""
print min(timeit.Timer("""
for k,v in sorted(d.iteritems()): pass
print A.eq_calls
print A.cmp_calls""", setup=setup).repeat(1, 1))
print min(timeit.Timer("""
for k,v in sorted(d.iteritems(),key=operator.itemgetter(0)): pass
print A.eq_calls
print A.cmp_calls""", setup=setup).repeat(1, 1))
Prints:
8605
8605
0.0172435735131
0
8605
0.0103719966418
So in the second case, where we compare the keys (that is, the A instances) directly, __eq__ is not called, while in the first case the first elements of the tuples are apparently compared via __eq__ and then via __cmp__. But why are they not compared directly via __cmp__? What I really don't get is the default behavior of sorted in the absence of a cmp or key parameter.
This is just how tuple comparison is implemented: tuplerichcompare
It searches for the first index where the items differ and then compares the items at that index. That's why you see an __eq__ call followed by a __cmp__ call.
Moreover, if you do not implement __eq__ for A, you will see that __cmp__ is called twice: once for equality and once for comparison.
For instance,
print min(timeit.Timer("""
l = list()
for i in range(5):
    l.append((A(),A(),A()))
    l[-1][0].s='foo'
    l[-1][1].s='foo2'
for _ in sorted(l): pass
print A.eq_calls
print A.cmp_calls""", setup=setup).repeat(1, 1))
prints out 24 and 8 calls respectively (the exact number clearly depends on random seed but in this case they will always have a ratio of 3)
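The same two-step behavior can be observed on Python 3, where __cmp__ no longer exists and __lt__ takes its place. The sketch below assumes CPython and uses a hypothetical Tracked class that records which comparison methods run when two tuples are compared.

```python
class Tracked:
    """Hypothetical class that records which comparison methods run."""

    calls = []

    def __init__(self, s):
        self.s = s

    def __eq__(self, other):
        Tracked.calls.append("eq")
        return self.s == other.s

    def __lt__(self, other):
        Tracked.calls.append("lt")
        return self.s < other.s


# Tuple comparison first looks for the first differing position with ==,
# then applies the requested operator (<) to the items at that position.
result = (Tracked("a"), 0) < (Tracked("b"), 0)
print(result, Tracked.calls)
```

Sorting with key=operator.itemgetter(0) avoids the extra equality call because the items are then compared directly rather than through the tuple.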
I want to sum multiple attributes at a time in a single loop:
class Some(object):
    def __init__(self, acounter, bcounter):
        self.acounter = acounter
        self.bcounter = bcounter
someList = [Some(x, x) for x in range(10)]
Can I do something simpler and faster than this?
atotal = sum([x.acounter for x in someList])
btotal = sum([x.bcounter for x in someList])
First off - sum doesn't need a list - you can use a generator expression instead:
atotal = sum(x.acounter for x in someList)
You could write a helper function that iterates over the list only once and looks up each attribute in turn for every item, e.g.:
def multisum(iterable, *attributes, **kwargs):
    sums = dict.fromkeys(attributes, kwargs.get('start', 0))
    for it in iterable:
        for attr in attributes:
            sums[attr] += getattr(it, attr)
    return sums
counts = multisum(someList, 'acounter', 'bcounter')
# {'bcounter': 45, 'acounter': 45}
Another alternative (which may not be faster) is to overload the addition operator for your class:
class Some(object):
    def __init__(self, acounter, bcounter):
        self.acounter = acounter
        self.bcounter = bcounter
    def __add__(self, other):
        if isinstance(other, self.__class__):
            return Some(self.acounter+other.acounter, self.bcounter+other.bcounter)
        elif isinstance(other, int):
            return self
        else:
            raise TypeError("useful message")
    __radd__ = __add__
somelist = [Some(x, x) for x in range(10)]
combined = sum(somelist)
print combined.acounter
print combined.bcounter
This way sum returns a Some object.
I doubt that this is really faster, but you can do it like this.
First define padd (for "pair add") via:
def padd(p1,p2):
    return (p1[0]+p2[0],p1[1]+p2[1])
For example, padd((1,4), (5,10)) = (6,14)
Then use reduce:
atotal, btotal = reduce(padd, ((x.acounter,x.bcounter) for x in someList))
In Python 3 you need to import reduce from functools, but in Python 2 it can be used directly as a builtin.
On edit: For more than 2 attributes you can replace padd by vadd ("vector add") which can handle tuples of arbitrary dimensions:
def vadd(v1,v2):
    return tuple(x+y for x,y in zip(v1,v2))
For just 2 attributes it is probably more efficient to hard-wire in the dimension since there is less function-call overhead.
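Put together for Python 3, the reduce approach might look like the sketch below. The Some class mirrors the one in the question; the second counter is set to 2 * x here (an assumption for illustration) so the two totals are distinguishable, and the (0, 0) initial value is an addition that is not in the original answer.

```python
from functools import reduce


class Some:
    def __init__(self, acounter, bcounter):
        self.acounter = acounter
        self.bcounter = bcounter


def vadd(v1, v2):
    """Add two equal-length tuples element by element."""
    return tuple(x + y for x, y in zip(v1, v2))


some_list = [Some(x, 2 * x) for x in range(10)]
pairs = ((s.acounter, s.bcounter) for s in some_list)
# The initial value (0, 0) makes an empty list yield (0, 0) instead of raising TypeError.
atotal, btotal = reduce(vadd, pairs, (0, 0))
print(atotal, btotal)
```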
Use this line to accumulate all of the attributes that you wish to sum.
>>> A = ((s.acounter,s.bcounter) for s in someList)
Then use this trick from https://stackoverflow.com/a/19343/47078 to make separate lists of each attribute by themselves.
>>> [sum(x) for x in zip(*A)]
[45, 45]
You can obviously combine the lines, but I thought breaking it apart would be easier to follow here.
And based on this answer, you can make it much more readable by defining an unzip(iterable) function.
def unzip(iterable):
    return zip(*iterable)
[sum(x) for x in unzip((s.acounter,s.bcounter) for s in someList)]
What is the difference between using a special method and just defining a normal class method? I was reading this site which lists a lot of them.
For example it gives a class like this.
class Word(str):
    '''Class for words, defining comparison based on word length.'''
    def __new__(cls, word):
        # Note that we have to use __new__. This is because str is an immutable
        # type, so we have to initialize it early (at creation)
        if ' ' in word:
            print "Value contains spaces. Truncating to first space."
            word = word[:word.index(' ')] # Word is now all chars before first space
        return str.__new__(cls, word)
    def __gt__(self, other):
        return len(self) > len(other)
    def __lt__(self, other):
        return len(self) < len(other)
    def __ge__(self, other):
        return len(self) >= len(other)
    def __le__(self, other):
        return len(self) <= len(other)
For each of those special methods why can't I just make a normal method instead, what are they doing different? I think I just need a fundamental explanation that I can't find, thanks.
It is the Pythonic way to write a comparison like this:
word1 = Word('first')
word2 = Word('second')
if word1 > word2:
    pass
instead of calling a comparator method directly:
class NotMagicWord(str):
    def is_greater(self, other):
        return len(self) > len(other)
word1 = NotMagicWord('first')
word2 = NotMagicWord('second')
if word1.is_greater(word2):
    pass
The same goes for all the other magic methods. For example, you define a __len__ method to tell Python an object's length through the built-in len function. Magic methods are called implicitly during standard operations such as binary operators, object calls, comparisons, and many others. A Guide to Python's Magic Methods is really good; read it to see what behavior you can give your objects. It is similar to operator overloading in C++, if you are familiar with that.
A method like __gt__ is called when you use comparison operators in your code. Writing something like
value1 > value2
is roughly equivalent to writing
value1.__gt__(value2)
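A small sketch can make the operator-to-method mapping concrete. It assumes Python 3 and a hypothetical Length class; it also shows why the equivalence is only approximate, since Python looks the method up on the type and can fall back to the other operand's reflected method.

```python
class Length:
    """Hypothetical wrapper that compares by length."""

    def __init__(self, text):
        self.text = text

    def __gt__(self, other):
        if not isinstance(other, Length):
            return NotImplemented  # lets Python try the other operand instead
        return len(self.text) > len(other.text)


a, b = Length("second"), Length("first")
via_operator = a > b
via_method = type(a).__gt__(a, b)
print(via_operator, via_method)

# Length defines no __lt__, so b < a falls back to the reflected a.__gt__(b).
reflected = b < a
print(reflected)
```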
"Magic methods" are used by Python to implement a lot of its underlying structure.
For example, let's say I have a simple class to represent an (x, y) coordinate pair:
class Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y
So, __init__ would be an example of one of these "magic methods" -- it allows me to automatically initialize the class by simply doing Point(3, 2). I could write this without using magic methods by creating my own "init" function, but then I would need to make an explicit method call to initialize my class:
class Point(object):
    def init(self, x, y):
        self.x = x
        self.y = y
        return self
p = Point().init(x, y)
Let's take another example -- if I wanted to compare two point variables, I could do:
class Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def __eq__(self, other):
        return self.x == other.x and self.y == other.y
This lets me compare two points by doing p1 == p2. In contrast, if I made this a normal eq method, I would have to be more explicit by doing p1.eq(p2).
Basically, magic methods are Python's way of implementing a lot of its syntactic sugar in a way that allows it to be easily customizable by programmers.
For example, I could construct a class that pretends to be a function by implementing __call__:
class Foobar(object):
    def __init__(self, a):
        self.a = a
    def __call__(self, b):
        return self.a + b
f = Foobar(3)
print f(4) # prints 7
Without the magic method, I would have to manually do f.call(4), which means I can no longer pretend the object is a function.
Special methods are handled specially by the rest of the Python language. For example, if you try to compare two Word instances with <, the __lt__ method of Word will be called to determine the result.
The magic methods are called when you use <, ==, > to compare the objects. functools has a helper called total_ordering that will fill in the missing comparison methods if you just define __eq__ and __gt__.
Because str already has all the comparison operations defined, it's necessary to add them as a mixin if you want to take advantage of total_ordering
from functools import total_ordering
@total_ordering
class OrderByLen(object):
    def __eq__(self, other):
        return len(self) == len(other)
    def __gt__(self, other):
        return len(self) > len(other)
class Word(OrderByLen, str):
    '''Class for words, defining comparison based on word length.'''
    def __new__(cls, word):
        # Note that we have to use __new__. This is because str is an immutable
        # type, so we have to initialize it early (at creation)
        if ' ' in word:
            print "Value contains spaces. Truncating to first space."
            word = word[:word.index(' ')] # Word is now all chars before first space
        return str.__new__(cls, word)
print Word('cat') < Word('dog') # False
print Word('cat') > Word('dog') # False
print Word('cat') == Word('dog') # True
print Word('cat') <= Word('elephant') # True
print Word('cat') >= Word('elephant') # False
I am building a string class that behaves like a regular string except that the addition operator returns the sum of the lengths of the two strings instead of concatenating them, and the multiplication operator returns the product of the lengths of the two strings. So I was planning on doing
class myStr(string):
    def __add__(self):
        return len(string) + len (input)
At least that is what I have for the first part, but that is apparently not correct. Can someone help me correct it?
You need to derive from str, and you can use len(self) to get the length of the current instance. You also need to give __add__ a parameter for the other operand of the + operator.
class myStr(str):
    def __add__(self, other):
        return len(self) + len(other)
Demo:
>>> class myStr(str):
...     def __add__(self, other):
...         return len(self) + len(other)
...
>>> foo = myStr('foo')
>>> foo
'foo'
>>> foo + 'bar'
6
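The question also asks for a multiplication operator that returns the product of the two lengths, which the demo above does not cover. A minimal sketch, assuming Python 3 and that the right-hand operand is always something with a length, could extend the same class like this:

```python
class myStr(str):
    def __add__(self, other):
        return len(self) + len(other)

    def __mul__(self, other):
        # Assumes the other operand has a length, so repetition such as
        # myStr('foo') * 3 is no longer available on this class.
        return len(self) * len(other)


foo = myStr('foo')
print(foo + 'bar')    # 3 + 3
print(foo * 'quux')   # 3 * 4
```

Only expressions with a myStr on the left are affected; 'bar' + foo would still concatenate unless __radd__ is defined as well.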
string is not a class. It's not anything*. There is no context where len(string) will work unless you define string.
Secondly, __add__ does not have an input parameter.
You need to fix both of these issues.
* You could import a module called string, but it's not something that just exists in global scope.
I'm writing a function that exponentiates an object, i.e. given a and n, returns a**n. Since a need not be a built-in type, the function accepts, as a keyword argument, a function to perform multiplications. If undefined, it defaults to the object's __mul__ method, i.e. the object itself is expected to have multiplication defined. That part is sort of easy:
def bin_pow(a, n, **kwargs) :
    mul = kwargs.pop('mul',None)
    if mul is None :
        mul = lambda x,y : x*y
The thing is that in the process of calculating a**n there are a lot of intermediate squarings, and there often are more efficient ways to compute them than simply multiplying the object by itself. It is easy to define another function that computes the square and pass it as another keyword argument, something like:
def bin_pow(a, n, **kwargs) :
    mul = kwargs.pop('mul',None)
    sqr = kwargs.pop('sqr',None)
    if mul is None :
        mul = lambda x,y : x*y
    if sqr is None :
        sqr = lambda x : mul(x,x)
The problem here comes if the function to square the object is not a standalone function, but is a method of the object being exponentiated, which would be a very reasonable thing to do. The only way of doing this I can think of is something like this:
import inspect
def bin_pow(a, n, **kwargs) :
    mul = kwargs.pop('mul',None)
    sqr = kwargs.pop('sqr',None)
    if mul is None :
        mul = lambda x,y : x*y
    if sqr is None :
        sqr = lambda x : mul(x,x)
    elif inspect.isfunction(sqr) == False : # if not a function, it is a string
        sqr_name = sqr # keep the method name; sqr itself is rebound on the next line
        sqr = lambda x : eval('x.'+sqr_name+'()')
It does work, but I find it an extremely inelegant way of doing things... My mastery of OOP is limited, but if there was a way to have sqr point to the class's function, not to an instance's one, then I could get away with something like sqr = lambda x : sqr(x), or maybe sqr = lambda x: x.sqr(). Can this be done? Is there any other more Pythonic way?
You can call unbound methods with the instance as the first parameter:
class A(int):
    def sqr(self):
        return A(self*self)
sqr = A.sqr
a = A(5)
print sqr(a) # Prints 25
So in your case you don't actually need to do anything specific, just the following:
bin_pow(a, n, sqr=A.sqr)
Be aware that this is early binding, so if you have a subclass B that overrides sqr then still A.sqr is called. For late binding you can use a lambda at the callsite:
bin_pow(a, n, sqr=lambda x: x.sqr())
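To show how the pieces fit together, here is a sketch of a complete bin_pow using exponentiation by squaring, written for Python 3. The loop body is an assumption, since the question only shows the argument handling, and plain keyword arguments stand in for **kwargs. The A class follows the answer above, with a call counter added for illustration.

```python
import operator


def bin_pow(a, n, mul=None, sqr=None):
    """Raise a to the positive integer power n by repeated squaring."""
    if n < 1:
        raise ValueError("this sketch only handles n >= 1")
    if mul is None:
        mul = operator.mul
    if sqr is None:
        sqr = lambda x: mul(x, x)
    result = None
    base = a
    while True:
        if n & 1:
            # Multiply the current power of a into the result.
            result = base if result is None else mul(result, base)
        n >>= 1
        if n == 0:
            return result
        base = sqr(base)


class A(int):
    sqr_calls = 0  # counts how often the custom squaring method runs

    def sqr(self):
        A.sqr_calls += 1
        return A(int(self) * int(self))


default_result = bin_pow(3, 5)                          # default mul and sqr
early_bound = bin_pow(A(2), 10, sqr=A.sqr)              # always calls A.sqr
late_bound = bin_pow(A(2), 10, sqr=lambda x: x.sqr())   # respects overrides in subclasses
print(default_result, early_bound, late_bound, A.sqr_calls)
```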
Here's how I'd do it:
import operator
def bin_pow(a, n, **kwargs) :
    pow_function = kwargs.pop('pow' ,None)
    if pow_function is None:
        pow_function = operator.pow
    return pow_function(a, n)
That's the fastest way. See also the documentation for object.__pow__ and the operator module.
Now, to pass an object's method you can pass it directly; there is no need to pass a string with its name. In fact, never use strings for this kind of thing. Using the object directly is much better.
If you want the unbound method, you can pass it just as well:
class MyClass(object):
    def mymethod(self, other):
        return do_something_with_self_and_other
m = MyClass()
n = MyClass()
bin_pow(m, n, pow=MyClass.mymethod)
If you want a class method, just pass it instead:
class MyClass(object):
    @classmethod
    def mymethod(cls, x, y):
        return do_something_with_x_and_y
m = MyClass()
n = MyClass()
bin_pow(m, n, pow=MyClass.mymethod)
If you want to call the class's method, and not the (possibly overridden) instance's method, you can do
instance.__class__.method(instance)
instead of
instance.method()
I'm not sure though if that's what you want.
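The difference between the two calls only shows up when an attribute on the instance shadows the method. A minimal sketch, assuming Python 3 and a hypothetical Number class:

```python
class Number:
    def __init__(self, value):
        self.value = value

    def sqr(self):
        return self.value * self.value


n = Number(3)
n.sqr = lambda: "overridden on the instance"  # shadows the class attribute

from_instance = n.sqr()          # finds the instance attribute first
from_class = type(n).sqr(n)      # goes straight to the class's method
print(from_instance, from_class)
```

Note that type(n).sqr still resolves to a subclass's override if n is an instance of a subclass; it only bypasses attributes set on the instance itself.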
If I understand the design goals of the library function, you want to provide a library "power" function which will raise any object passed to it to the Nth power. But you also want to provide a "shortcut" for efficiency.
The design goals seem a little odd: Python already defines the __mul__ method to allow the designer of a class to multiply it by an arbitrary value, and the __pow__ method to allow the designer of a class to support raising it to a power. If I were building this, I'd expect and require the users to have a __mul__ method, and I'd do something like this:
def bin_or_pow(a, x):
    pow_func = getattr(a, '__pow__', None)
    if pow_func is None:
        def pow_func(n):
            v = 1
            for i in xrange(n):
                v = a * v
            return v
    return pow_func(x)
That will let you do the following:
class Multable(object):
    def __init__(self, x):
        self.x = x
    def __mul__(self, n):
        print 'in mul'
        n = getattr(n, 'x', n)
        return type(self)(self.x * n)
class Powable(Multable):
    def __pow__(self, n):
        print 'in pow'
        n = getattr(n, 'x', n)
        return type(self)(self.x ** n)
print bin_or_pow(5, 3)
print
print bin_or_pow(Multable(5), 5).x
print
print bin_or_pow(Powable(5), 5).x
... and you get ...
125
in mul
in mul
in mul
in mul
in mul
3125
in pow
3125
I understand it's the sqr-bit at the end you want to fix. If so, I suggest getattr. Example:
class SquarableThingy:
    def __init__(self, i):
        self.i = i
    def squarify(self):
        return self.i**2
class MultipliableThingy:
    def __init__(self, i):
        self.i = i
    def __mul__(self, other):
        return self.i * other.i
x = SquarableThingy(3)
y = MultipliableThingy(4)
z = 5
sqr = 'squarify'
sqrFunX = getattr(x, sqr, lambda: x*x)
sqrFunY = getattr(y, sqr, lambda: y*y)
sqrFunZ = getattr(z, sqr, lambda: z*z)
assert sqrFunX() == 9
assert sqrFunY() == 16
assert sqrFunZ() == 25