Comparing two containers for identity of their contents - python

I have a method that returns a set of objects, and I'm writing a unit test for this method. Is there a generic, tidy and idiomatic way of comparing these for identity (rather than equality)? Or do I need to write a suitable implementation myself?
An example (somewhat contrived to keep it simple):
class Foo(object):
    def has_some_property(self):
        ...

class Container(object):
    def __init__(self):
        self.foo_set = set()

    def add_foo(self, foo):
        self.foo_set.add(foo)

    def foo_objects_that_have_property(self):
        return set([foo for foo in self.foo_set if foo.has_some_property()])
import unittest

class TestCase(unittest.TestCase):
    def testFoo(self):
        c = Container()
        x, y, z = Foo(), Foo(), Foo()
        ...
        self.assertContentIdentity(c.foo_objects_that_have_property(), set([x, y]))
Importantly, testing here for equality won't do, since mutating the objects returned by foo_objects_that_have_property() may lead to inconsistent results depending on how those objects are used differently in Container even if they are "equal" at the time of the test.

The best I can come up with is:
@staticmethod
def set_id(c):
    return set([id(e) for e in c])

def assertContentIdentity(self, a, b):
    self.assertEqual(self.set_id(a), self.set_id(b))
However, this is specialised for sets and can't deal with nested containers.
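One way to generalize it (the helper names below are my own, not a standard API) is to recurse through the container types you expect and compare id()s only at the leaves:

```python
def deep_id(obj):
    # Recursively replace each element of a known container type with
    # its id(); anything else is treated as a leaf object.
    if isinstance(obj, (list, tuple)):
        return tuple(deep_id(e) for e in obj)
    if isinstance(obj, (set, frozenset)):
        return frozenset(deep_id(e) for e in obj)
    if isinstance(obj, dict):
        return frozenset((deep_id(k), deep_id(v)) for k, v in obj.items())
    return id(obj)

def assert_content_identity(a, b):
    assert deep_id(a) == deep_id(b)
```

Equal-but-distinct objects produce different id()s, so nested containers now compare by identity rather than equality.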

A simple, albeit not the most efficient, way to do it:
def assertContentIdentity(set1, set2):
    set1 = set([id(a) for a in set1])
    set2 = set([id(a) for a in set2])
    assert set1 == set2

x is y won't work here since that would tell me that the sets are different, which I know already. I want to know if the objects that they contain are the same objects or different objects.
Then you need to write your own function, like
set([id(x) for x in X]) == set([id(y) for y in Y])

Related

specify comparator function in python dictionary

Is there a way to pass in a custom equality comparison function when creating a python dictionary so that it doesn't use the default __eq__ or hash comparators? I'm hoping there is a way to do so but wasn't able to find it so far.
edit: I am looking for a way to provide different definitions of object equality for classes that I defined. Like:
class A:
    def __init__(self, a, b):
        self.a = a
        self.b = b

def c1(a1, a2):  # assume both are A objects
    return a1.a == a2.a

def c2(a1, a2):
    return a1.b == a2.b

# is this possible?
d1 = dict(cmp=c1)
d2 = dict(cmp=c2)
I know I can override __eq__ etc in my class definition but I can only override it once. In Java I can use TreeMap and I am looking for the equivalent in Python.
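There is no such cmp argument, but a common workaround is to wrap keys in a small adapter that hashes and compares via a key function of your choice (the KeyedBy class below is an illustrative sketch, not a standard API):

```python
class KeyedBy:
    """Wrap obj so dict lookups hash and compare it via the
    supplied key function (illustrative helper)."""
    def __init__(self, obj, key):
        self.obj = obj
        self.key = key

    def __hash__(self):
        return hash(self.key(self.obj))

    def __eq__(self, other):
        return self.key(self.obj) == other.key(other.obj)

class A:
    def __init__(self, a, b):
        self.a = a
        self.b = b

by_a = lambda o: o.a            # compare A objects by their .a attribute
d1 = {}
d1[KeyedBy(A(1, 2), by_a)] = "first"
d1[KeyedBy(A(1, 99), by_a)] = "second"   # same .a, so this overwrites
```

Since both keys have a == 1, the dict treats them as the same key and ends up with a single entry mapped to "second".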

__init__ using method v lambda

When initializing an attribute in a class, is there a reason to avoid using a lambda expression or list comprehension, preferring a method or vice versa?
In example:
With
class Foo():
    def __init__(self, data):
        self.data = data
List Comprehension
class BarListComp():
    def __init__(self, listOfFoo):
        self.data = [fooInst.data for fooInst in listOfFoo if fooInst.data % 2 == 0]
Lambda
class BarLambda():
    def __init__(self, listOfFoo):
        self.data = list(filter(lambda x: x % 2 == 0, map(lambda y: y.data, listOfFoo)))
Method
class BarMethod():
    def __init__(self, listOfFoo):
        self.data = self.genData(listOfFoo)

    @staticmethod
    def genData(listOfFoo):
        # this could also just be the earlier list comprehension
        listOut = []
        for elem in listOfFoo:
            if elem.data % 2 == 0:
                listOut.append(elem.data)
        return listOut
(Please note these might not be the greatest examples, and the needed processing could be much more complicated)
Is there a preferred method? At what point does initialization become complex enough that it warrants being split off into a separate method?
Just to add to @NPE's comment (because I truly agree with his statement),
is there a reason to avoid using a lambda expression or list comprehension, preferring a method or vice versa?
The answer is Yes and No.
My ambiguous answer is a consequence of the reality: choose whatever is more readable for your specific case.
Now, opinions aside, for your example above I think it would be hard to argue that the lambda solution should be favored. At a glance, it is difficult to really get what is going on. So I'd say pick the method option or the list comprehension (which would be my go-to solution).

Python: Passing mutable(?) object to method

I'm trying to implement a class with a method that calls another method with an object that's part of the class where the lowest method mutates the object. My implementation is a little more complicated, so I'll post just some dummy code so you can see what I'm talking about:
class test:
    def __init__(self, list):
        self.obj = list

    def mult(self, x, n):
        x = x * n

    def numtimes(self, n):
        self.mult(self.obj, n)
Now, if I create an object of this type and run the numtimes method, it won't update self.obj:
m = test([1,2,3,4])
m.numtimes(3)
m.obj #returns [1,2,3,4]
Whereas I'd like it to give me [1,2,3,4,1,2,3,4,1,2,3,4]
Basically, I need to pass self.obj to the mult method and have it mutate self.obj so that when I call m.obj, I'll get [1,2,3,4,1,2,3,4,1,2,3,4] instead of [1,2,3,4].
I feel like this is just a matter of understanding how python passes objects as arguments to methods (like it's making a copy of the object, and instead I need to use a pointer), but maybe not. I'm new to python and could really use some help here.
Thanks in advance!!
Allow me to take on the bigger subject of mutability.
Lists are mutable objects, and support both mutable operations and immutable operations: that is, operations that change the list in place, and operations that return a new list. Tuples, by contrast, support only immutable operations.
So, to multiply a list, you can choose two methods:
a *= b
This is a mutable operation, that will change 'a' in-place.
a = a * b
This is an immutable operation. It will evaluate 'a*b', create a new list with the correct value, and assign 'a' to that new list.
Here, already, lies a solution to your problem. But, I suggest you read on a bit. When you pass around lists (and other objects) as parameters, you are only passing a new reference, or "pointer" to that same list. So running mutable operations on that list will also change the one that you passed. The result might be a very subtle bug, when you write:
>>> my_list = [1,2,3]
>>> t = test(my_list)
>>> t.numtimes(2)
>>> my_list
[1,2,3,1,2,3] # Not what you intended, probably!
So here's my final recommendation. You can choose to use mutable operations, that's fine. But then create a new copy from your arguments, as such:
def __init__(self, l):
    self.obj = list(l)
OR use immutable operations, and reassign the result to self:
def mult(self, x, n):
    self.obj = x * n
Or do both, there's no harm in being extra safe :)
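Putting both suggestions together, a minimal sketch (the Test name is illustrative):

```python
class Test:
    def __init__(self, seq):
        self.obj = list(seq)   # defensive copy: caller's list stays untouched

    def numtimes(self, n):
        self.obj *= n          # mutable operation on our own copy

src = [1, 2, 3, 4]
t = Test(src)
t.numtimes(3)
```

After this, t.obj is [1, 2, 3, 4] repeated three times, while src is still [1, 2, 3, 4].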
The multiplication x * n creates a new instance and does not alter the existing list. See here:
a = [1]
print(id(a))
a = a * 2
print(id(a))
This should work:
class test:
    def __init__(self, list):
        self.obj = list

    def mult(_, x, n):
        x *= n

    def numtimes(self, n):
        self.mult(self.obj, n)

Using map on methods in Python

I have some classes in Python:
class Class1:
    def method(self):
        return 1

class Class2:
    def method(self):
        return 2
and a list myList whose elements are all either instances of Class1 or Class2. I'd like to create a new list whose elements are the return values of method called on each element of myList. I have tried using a "virtual" base class
class Class0:
    def method(self):
        return 0

class Class1(Class0):
    def method(self):
        return 1

class Class2(Class0):
    def method(self):
        return 2
But if I try map(Class0.method, myList) I just get [0, 0, 0, ...]. I'm a bit new to Python, and I hear that "duck typing" is preferred to actual inheritance, so maybe this is the wrong approach. Of course, I can do
[myList[index].method() for index in xrange(len(myList))]
but I like the brevity of map. Is there a way to still use map for this?
You can use
map(lambda e: e.method(), myList)
But I think this is better:
[e.method() for e in myList]
PS.: I don't think there is ever a need for range(len(collection)).
The operator.methodcaller tool is exactly what you're looking for:
from operator import methodcaller
map(methodcaller("method"), myList)
Alternatively you can use a list comprehension:
[obj.method() for obj in myList]
This is best:
[o.method() for o in myList]
Map seems to be favored by people pining for Haskell or Lisp, but Python has fine iterative structures you can use instead.

How to call same method for a list of objects?

Suppose code like this:
class Base:
    def start(self):
        pass

    def stop(self):
        pass

class A(Base):
    def start(self):
        ...  # do something for A

    def stop(self):
        ...  # do something for A

class B(Base):
    def start(self):
        ...  # do something for B

    def stop(self):
        ...  # do something for B

a1 = A(); a2 = A()
b1 = B(); b2 = B()
all = [a1, b1, b2, a2, ...]
Now I want to call methods start and stop (maybe also others) for each object in the list all. Is there any elegant way for doing this except of writing a bunch of functions like
def start_all(all):
    for item in all:
        item.start()

def stop_all(all):
    ...
This will work:
all = [a1, b1, b2, a2, ...]
map(lambda x: x.start(), all)
A simple example:
all = ["MILK", "BREAD", "EGGS"]
map(lambda x: x.lower(), all)
['milk', 'bread', 'eggs']
and in Python 3:
all = ["MILK", "BREAD", "EGGS"]
list(map(lambda x: x.lower(), all))
['milk', 'bread', 'eggs']
It seems like there would be a more Pythonic way of doing this, but I haven't found it yet.
I use "map" sometimes if I'm calling the same function (not a method) on a bunch of objects:
map(do_something, a_list_of_objects)
This replaces a bunch of code that looks like this:
do_something(a)
do_something(b)
do_something(c)
...
But can also be achieved with a pedestrian "for" loop:
for obj in a_list_of_objects:
do_something(obj)
The downside is that a) you're creating a list as a return value from "map" that's just being thrown out and b) it might be more confusing than just the simple loop variant.
You could also use a list comprehension, but that's a bit abusive as well (once again, creating a throw-away list):
[ do_something(x) for x in a_list_of_objects ]
For methods, I suppose either of these would work (with the same reservations):
map(lambda x: x.method_call(), a_list_of_objects)
or
[ x.method_call() for x in a_list_of_objects ]
So, in reality, I think the pedestrian (yet effective) "for" loop is probably your best bet.
The approach
for item in all:
item.start()
is simple, easy, readable, and concise. This is the main approach Python provides for this operation. You can certainly encapsulate it in a function if that helps something. Defining a special function for this for general use is likely to be less clear than just writing out the for loop.
The *_all() functions are so simple that for a few methods I'd just write the functions. If you have lots of identical functions, you can write a generic function:
def apply_on_all(seq, method, *args, **kwargs):
    for obj in seq:
        getattr(obj, method)(*args, **kwargs)
Or create a function factory:
def create_all_applier(method, doc=None):
    def on_all(seq, *args, **kwargs):
        for obj in seq:
            getattr(obj, method)(*args, **kwargs)
    on_all.__doc__ = doc
    return on_all

start_all = create_all_applier('start', "Start all instances")
stop_all = create_all_applier('stop', "Stop all instances")
...
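For illustration, here is how the generic helper above could be exercised with a made-up Player class (the class and its method bodies are hypothetical, used only for the demo):

```python
def apply_on_all(seq, method, *args, **kwargs):
    # Call the named method on every object in seq.
    for obj in seq:
        getattr(obj, method)(*args, **kwargs)

class Player:
    # Hypothetical class used only to demonstrate the helper.
    def __init__(self):
        self.log = []

    def start(self):
        self.log.append('started')

    def stop(self):
        self.log.append('stopped')

players = [Player(), Player()]
apply_on_all(players, 'start')
apply_on_all(players, 'stop')
```

After the two calls, each player's log is ['started', 'stopped'].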
maybe map, but since you don't want to make a list, you can write your own...
def call_for_all(f, seq):
    for i in seq:
        f(i)
then you can do:
call_for_all(lambda x: x.start(), all)
call_for_all(lambda x: x.stop(), all)
by the way, all is a built-in function, don't overwrite it ;-)
Starting in Python 2.6 there is an operator.methodcaller function.
So you can get something more elegant (and fast):
from operator import methodcaller
map(methodcaller('method_name'), list_of_objects)
Taking @Ants Aasmas' answer one step further, you can create a wrapper that takes any method call and forwards it to all elements of a given list:
class AllOf:
    def __init__(self, elements):
        self.elements = elements

    def __getattr__(self, attr):
        def on_all(*args, **kwargs):
            for obj in self.elements:
                getattr(obj, attr)(*args, **kwargs)
        return on_all
That class can then be used like this:
class Foo:
    def __init__(self, val="quux!"):
        self.val = val

    def foo(self):
        print "foo: " + self.val

a = [Foo("foo"), Foo("bar"), Foo()]
AllOf(a).foo()
Which produces the following output:
foo: foo
foo: bar
foo: quux!
With some work and ingenuity it could probably be enhanced to handle attributes as well (returning a list of attribute values).
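That enhancement might look something like the following sketch (AllOfCollect is my own name, not part of the original answer):

```python
class AllOfCollect:
    # Like AllOf above, but collects and returns each call's result.
    def __init__(self, elements):
        self.elements = elements

    def __getattr__(self, attr):
        def on_all(*args, **kwargs):
            return [getattr(obj, attr)(*args, **kwargs)
                    for obj in self.elements]
        return on_all

results = AllOfCollect(["MILK", "Bread"]).lower()
# results == ['milk', 'bread']
```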
If you would like to have a generic function while avoiding referring to method name using strings, you can write something like that:
def apply_on_all(seq, method, *args, **kwargs):
    for obj in seq:
        getattr(obj, method.__name__)(*args, **kwargs)

# to call:
apply_on_all(all, A.start)
Similar to other answers but has the advantage of only using explicit attribute lookup (i.e. A.start). This can eliminate refactoring errors, i.e. it's easy to rename the start method and forget to change the strings that refer to this method.
The best solution, in my opinion, depends on whether you need the result of the method and whether your method takes any arguments except self.
If you don't need the result, I would simply write a for loop:
for instance in lst:
    instance.start()
If you need the result, but method takes no arguments, I would use map:
strs = ['A', 'B', 'C']
lower_strs = list(map(str.lower, strs))  # ['a', 'b', 'c']
And finally, if you need the result and method does take some arguments, list comprehension would work great:
strs = ['aq', 'bq', 'cq']
qx_strs = [i.replace('q', 'x') for i in strs]  # ['ax', 'bx', 'cx']
