I have a simple custom object that represents a custom tag, which a user can attach to another object.
I want to store tags in a set, because I want to avoid duplicates and because order doesn't matter.
Each tag contains the values "name" and "description". Later on I might add other variables, but the key identifier for a tag is "name".
I want to check whether a tag is equal to another either by tag.name == other.name or against a string, as in tag == 'whatever'.
I want users to be able to edit tags, including renaming them.
I have defined the object like this and everything worked as expected:
class Tag:
    def __init__(self, name, description=""):
        self.name = name
        self.description = description

    def __str__(self):
        return self.name

    def __repr__(self):
        return self.name

    def __eq__(self, other):
        if isinstance(other, Tag):
            return self.name == other.name
        else:
            return self.name == other

    def __hash__(self):
        return hash(self.name)
The problem appeared when I tried to change the tag name:
blue_tag = Tag("blue")
tags = {blue_tag}
blue_tag in tags # returns True as expected
"blue" in tags # returns True as expected
blue_tag.name = "navy"
"navy" in tags # returns False. Why?
I don't understand why. The tag is correctly renamed when I do print(tags). The id of the blue_tag object is also the same, and the hash of the name is also the same.
Everywhere I looked, including the official Python documentation, I found only basic information: that the in operator checks whether an item is present in a container, and that to define a custom container I need to define methods like __contains__. But I don't want to create a custom set class.
The closest thing I found was a question here on SO:
Custom class object and "in" set operator
But it still didn't solve the problem.
The problem is that, in the class above, changing a tag's name attribute changes its hash, and the hash of an object must not change after it has been added to a set or used as a dictionary key.
If two objects are "equal", they must have the same hash value. Since you want your tags to be comparable by name, the hash has to be derived from the name, which implies that a tag's name can't change while the tag is stored in a set. In other words, you can't simply add another immutable attribute to your class and base the hash value on that instead of the name, because tags with equal names would then no longer be guaranteed to have equal hashes.
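To make the failure concrete, here is a minimal sketch that reuses the Tag class from the question (trimmed to __eq__ and __hash__). The variable names are only for illustration. It shows that after the rename the set can no longer find the tag by hash lookup, under either the new or the old name, even though the tag is still physically stored in the set:

```python
class Tag:
    def __init__(self, name, description=""):
        self.name = name
        self.description = description

    def __eq__(self, other):
        if isinstance(other, Tag):
            return self.name == other.name
        return self.name == other

    def __hash__(self):
        return hash(self.name)


blue_tag = Tag("blue")
tags = {blue_tag}          # stored under hash("blue")

blue_tag.name = "navy"     # the stored hash is now stale

# Lookup by the new name probes with hash("navy"), which does not match
# the hash recorded when the tag was inserted.
found_by_new_name = "navy" in tags

# Lookup by the old name matches the recorded hash, but the equality
# check then compares "navy" == "blue" and fails.
found_by_old_name = "blue" in tags

# Iteration bypasses hashing, so the tag is still visible this way.
found_by_iteration = any(tag == "navy" for tag in tags)

print(found_by_new_name, found_by_old_name, found_by_iteration)
```

This is why print(tags) shows the renamed tag while the membership test reports that it is missing.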
The workaround I see in this case is to give your Tag class a special "add_to_set" method that tracks the sets the tag belongs to, and to turn name into a property, so that whenever name is changed the Tag removes itself from all the sets it belongs to and re-adds itself afterwards. The re-inserted tag would then behave as expected.
Making this work properly in parallel code would take somewhat more work, as another thread could use the sets during the renaming. If that is not a problem, then what is needed is:
class Tag:
    def __init__(self, name, description=""):
        self.sets = []
        self.name = name
        self.description = description

    ...  # other methods as in your code

    def __hash__(self):
        return hash(self.name)

    def add_to_set(self, set_):
        self.sets.append(set_)
        set_.add(self)

    def remove_from_set(self, set_):
        self.sets.remove(set_)
        set_.remove(self)

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        # WARNING: this is as thread unsafe as it gets! Do not use this class
        # in multi-threaded code. (async is ok)
        try:
            for set_ in self.sets:
                set_.remove(self)
            self._name = value
        finally:
            for set_ in self.sets:
                set_.add(self)
And now:
In [17]: a = Tag("blue")
In [18]: b = set()
In [19]: a.add_to_set(b)
In [20]: a in b
Out[20]: True
In [21]: b
Out[21]: {blue}
In [22]: a.name = "mauve"
In [23]: b
Out[23]: {mauve}
In [24]: a in b
Out[24]: True
It is possible to specialize a set class that would automatically call the add_to_set and remove_from_set methods for you as well, but this is likely enough.
Suppose we have the following class hierarchy:
class ClassA:
    @property
    def foo(self): return "hello"

class ClassB(ClassA):
    @property
    def bar(self): return "world"
If I explore __dict__ on ClassB like so, I only see the bar attribute:
for name, _ in ClassB.__dict__.items():
    if name.startswith("__"):
        continue
    print(name)
Output is bar
I can roll my own means to get attributes on not only the specified type but its ancestors. However, my question is whether there's already a way in python for me to do this without re-inventing a wheel.
def return_attributes_including_inherited(type):
    results = []
    return_attributes_including_inherited_helper(type, results)
    return results

def return_attributes_including_inherited_helper(type, attributes):
    for name, attribute_as_object in type.__dict__.items():
        if name.startswith("__"):
            continue
        attributes.append(name)
    for base_type in type.__bases__:
        return_attributes_including_inherited_helper(base_type, attributes)
Running my code as follows...
for attribute_name in return_attributes_including_inherited(ClassB):
    print(attribute_name)
... gives back both bar and foo.
Note that I'm simplifying some things: name collisions, using items() when for this example I could use dict, skipping over anything that starts with __, ignoring the possibility that two ancestors themselves have a common ancestor, etc.
EDIT1 - I tried to keep the example simple. But I really want both the attribute name and the attribute reference for each class and ancestor class. One of the answers below has me on a better track, I'll post some better code when I get it to work.
EDIT2 - This does what I want and is very succinct. It's based on Eli's answer below.
def get_attributes(type):
    attributes = set(type.__dict__.items())
    for type in type.__mro__:
        attributes.update(type.__dict__.items())
    return attributes
It gives back both the attribute names and their references.
EDIT3 - One of the answers below suggested using inspect.getmembers. This appears very useful because it's like __dict__, only it operates on ancestor classes as well.
Since a large part of what I was trying to do was find attributes marked with a particular descriptor, and include ancestors classes, here is some code that would help do that in case it helps anyone:
import inspect

class MyCustomDescriptor:
    # This is greatly oversimplified
    def __init__(self, foo, bar):
        self._foo = foo
        self._bar = bar

    def __call__(self, decorated_function):
        return self

    def __get__(self, instance, type):
        if not instance:
            return self
        return 10

class ClassA:
    @property
    def foo(self): return "hello"

    @MyCustomDescriptor(foo="a", bar="b")
    def bar(self): pass

    @MyCustomDescriptor(foo="c", bar="d")
    def baz(self): pass

class ClassB(ClassA):
    @property
    def something_we_dont_care_about(self): return "world"

    @MyCustomDescriptor(foo="e", bar="f")
    def blah(self): pass

# This will get attributes on the specified type (class) that are of matching_attribute_type. It just returns the attributes themselves, not their names.
def get_attributes_of_matching_type(type, matching_attribute_type):
    return_value = []
    for member in inspect.getmembers(type):
        member_name = member[0]
        member_instance = member[1]
        if isinstance(member_instance, matching_attribute_type):
            return_value.append(member_instance)
    return return_value

# This will return a dictionary of name & instance of attributes on type that are of matching_attribute_type (useful when you're looking for attributes marked with a particular descriptor)
def get_attribute_name_and_instance_of_matching_type(type, matching_attribute_type):
    return_value = {}
    # Inspect the type that was passed in rather than a hard-coded ClassB.
    for member in inspect.getmembers(type):
        member_name = member[0]
        member_instance = member[1]
        if isinstance(member_instance, matching_attribute_type):
            return_value[member_name] = member_instance
    return return_value
You should use python's inspect module for any such introspective capabilities.
.
.
>>> class ClassC(ClassB):
...     def baz(self):
...         return "hiya"
...
>>> import inspect
>>> for attr in inspect.getmembers(ClassC):
...     print attr
...
('__doc__', None)
('__module__', '__main__')
('bar', <property object at 0x10046bf70>)
('baz', <unbound method ClassC.baz>)
('foo', <property object at 0x10046bf18>)
Read more about the inspect module here.
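The session above is from Python 2. As a rough Python 3 sketch of the same idea, inspect.getmembers also accepts an optional predicate as its second argument, which narrows the result to the members you care about. The class definitions are repeated here only so the example runs on its own:

```python
import inspect


class ClassA:
    @property
    def foo(self):
        return "hello"


class ClassB(ClassA):
    @property
    def bar(self):
        return "world"


class ClassC(ClassB):
    def baz(self):
        return "hiya"


# getmembers returns (name, value) pairs sorted by name; the predicate
# is called with each value and keeps only the members it accepts.
property_names = [
    name
    for name, _ in inspect.getmembers(ClassC, lambda m: isinstance(m, property))
]
function_names = [name for name, _ in inspect.getmembers(ClassC, inspect.isfunction)]

print(property_names)
print(function_names)
```

Inherited properties show up alongside the ones defined on ClassC itself, which is what the question was after.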
You want to use dir:
for attr in dir(ClassB):
    print(attr)
Sadly there isn't a single composite object. Every attribute access for a (normal) Python object first checks obj.__dict__, then the attributes of all its base classes; while there are some internal caches and optimizations, there isn't a single object you can access.
That said, one thing that could improve your code is to use cls.__mro__ instead of cls.__bases__. Instead of only the class's immediate parents, cls.__mro__ contains ALL the ancestors of the class, in the exact order Python would search them, with each common ancestor occurring only once. That would also allow your type-searching method to be non-recursive. Loosely...
def get_attrs(obj):
    attrs = set(obj.__dict__)
    for cls in obj.__class__.__mro__:
        attrs.update(cls.__dict__)
    return sorted(attrs)
... does a fair approximation of the default dir(obj) implementation.
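As a small illustration of the point about common ancestors, here is a sketch with a made-up diamond hierarchy (Base, Left, Right and Child are hypothetical names, as is public_class_attrs). It walks __mro__ once, without recursion, and each ancestor is visited exactly one time:

```python
class Base:
    @property
    def shared(self):
        return "base"


class Left(Base):
    @property
    def left(self):
        return "left"


class Right(Base):
    @property
    def right(self):
        return "right"


class Child(Left, Right):
    @property
    def own(self):
        return "child"


def public_class_attrs(cls):
    """Collect non-dunder attribute names from cls and all of its ancestors."""
    names = set()
    for klass in cls.__mro__:
        names.update(n for n in vars(klass) if not n.startswith("__"))
    return sorted(names)


mro_names = [klass.__name__ for klass in Child.__mro__]
attrs = public_class_attrs(Child)

print(mro_names)
print(attrs)
```

Base appears only once in the MRO even though both Left and Right inherit from it, so there is no need to track which ancestors were already processed.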
Here is a function I wrote back in the day. The best answer is to use the inspect module, as using __dict__ gives us ALL functions (ours + inherited) and (ALL?) data members AND properties, whereas inspect gives us enough information to weed out what we don't want.
import inspect
import types

def _inspect(a, skipFunctionsAlways=True, skipMagic=True):
    """inspects object attributes, removing all the standard methods from 'object',
    and (optionally) __magic__ cruft.

    By default this routine skips __magic__ functions, but if you want these on
    pass False in as the skipMagic parameter.

    By default this routine skips functions, but if you want to see all the functions,
    pass in False to the skipFunctionsAlways parameter. This works together with the
    skipMagic parameter: if the latter is True, you won't see __magic__ methods.
    If skipFunctionsAlways = False and skipMagic = False, you'll see all the __magic__
    methods declared for the object - including __magic__ functions declared by Object

    NOT meant to be a comprehensive list of every object attribute - instead, a
    list of every object attribute WE (not Python) defined. For a complete list
    of everything call inspect.getmembers directly"""
    objType = type(object)

    def weWantIt(obj):
        #return type(a) != objType
        output = True
        if (skipFunctionsAlways):
            output = not ( inspect.isbuiltin(obj) ) #not a built in
        asStr = ""
        if isinstance(obj, types.MethodType):
            if skipFunctionsAlways: #never mind, we don't want it, get out.
                return False
            else:
                asStr = obj.__name__
                #get just the name of the function, we don't want the whole name, because we don't want to take something like:
                #bound method LotsOfThings.bob of <__main__.LotsOfThings object at 0x103dc70>
                #to be a special method because its module name is special
                #WD-rpw 02-23-2008
                #TODO: it would be great to be able to separate out superclass methods
                #maybe by getting the class out of the method then seeing if that attribute is in that class?
        else:
            asStr = str(obj)
        if (skipMagic):
            output = (asStr.find("__") == -1 ) #not a __something__
        return (output)

    for value in inspect.getmembers( a, weWantIt ):
        yield value
{k: getattr(ClassB, k) for k in dir(ClassB)}
Proper values (instead of <property object...>) will be presented when using a ClassB instance.
And of course you can filter this by adding a condition such as if not k.startswith('__') at the end.
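A short sketch of both variants, with the filter applied. The classes are repeated from the question so the snippet is self-contained, and class_level and instance_level are names chosen just for this example:

```python
class ClassA:
    @property
    def foo(self):
        return "hello"


class ClassB(ClassA):
    @property
    def bar(self):
        return "world"


# On the class, getattr returns the property objects themselves.
class_level = {
    k: getattr(ClassB, k) for k in dir(ClassB) if not k.startswith("__")
}

# On an instance, getattr runs each property getter and returns its value.
instance = ClassB()
instance_level = {
    k: getattr(instance, k) for k in dir(instance) if not k.startswith("__")
}

print(sorted(class_level))
print(instance_level)
```

Keep in mind that the instance variant actually calls every property getter, which may matter if a getter is expensive or has side effects.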
I have created a class that has a property with a setter. There are 2 different ways to use this class, so the values of some of the object components may be set in a different order depending on the scenario (i.e. I don't want to set them during __init__). I have included a non-property with its own set function here to illustrate what is and isn't working.
class Dumbclass(object):
    def __init__(self):
        self.name = None
        self.__priority = None

    @property
    def priority(self):
        return self.__priority

    @priority.setter
    def priority(self, p):
        self.__priority = p

    def set_name(self, name):
        self.name = "dumb " + name

    def all_the_things(self, name, priority=100):
        self.set_name(name)
        self.priority(priority)
        print(self.name)
        print(self.priority)
When I run the following, it returns TypeError: 'NoneType' object is not callable. I investigated with pdb, and found that it was calling the getter instead of the setter at self.priority(priority).
if __name__ == '__main__':
    d1 = Dumbclass()
    d1.all_the_things("class 1")
    d2 = Dumbclass()
    d2.all_the_things("class 2", 4)
What's going on here?
Short Answer: Please change your line self.priority(priority) to self.priority = priority
Explanation: Setter is called only when you assign something to the attribute. In your code, you are not doing an assignment operation.
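A minimal sketch of the difference, using a hypothetical Task class rather than your Dumbclass. Assignment goes through the setter, while "calling" the property first runs the getter and then tries to call whatever value the getter returned:

```python
class Task:
    def __init__(self):
        self._priority = None

    @property
    def priority(self):
        return self._priority

    @priority.setter
    def priority(self, value):
        self._priority = value


task = Task()
task.priority = 4          # assignment: invokes the setter
current = task.priority    # plain access: invokes the getter

try:
    # The getter runs first and returns 4, then Python tries to call 4(10).
    task.priority(10)
except TypeError as exc:
    error_message = str(exc)

print(current)
print(error_message)
```

In your code the getter returned None at that point, which is why the message mentions 'NoneType' instead of 'int'.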
Here are some nice references if you want to understand more:
Python descriptors
How does the @property decorator work?
Real Life Example
You are facing this issue due to trying to treat priority as a method in your all_the_things method. At this point it is already a property, so you assign to it like a variable:
def all_the_things(self, name, priority=100):
    self.set_name(name)
    self.priority = priority
Here, I am attempting to mock up a social media profile as a class "Profile", in which you have name, a group of friends, and the ability to add and remove friends. There is a method that I would like to make, that when invoked, will print the list of friends in alphabetical order.
The issue: I get a warning that I cannot sort an unsortable type. Python is seeing my instance variable as a "Profile object", rather than a list that I can sort and print.
Here is my code:
class Profile(object):
    """
    Represent a person's social profile

    Argument:
    name (string): a person's name - assumed to uniquely identify a person

    Attributes:
    name (string): a person's name - assumed to uniquely identify a person
    statuses (list): a list containing a person's statuses - initialized to []
    friends (set): set of friends for the given person.
                   it is the set of profile objects representing these friends.
    """

    def __init__(self, name):
        self.name = name
        self.friends = set()
        self.statuses = []

    def __str__(self):
        return self.name + " is " + self.get_last_status()

    def update_status(self, status):
        self.statuses.append(status)
        return self

    def get_last_status(self):
        if len(self.statuses) == 0:
            return "None"
        else:
            return self.statuses[-1]

    def add_friend(self, friend_profile):
        self.friends.add(friend_profile)
        friend_profile.friends.add(self)
        return self

    def get_friends(self):
        if len(self.friends) == 0:
            return "None"
        else:
            friends_lst = list(self.friends)
            return sorted(friends_lst)
After I fill out a list of friends (from a test module) and invoke the get_friends method, python tells me:
File "/home/tjm/Documents/CS021/social.py", line 84, in get_friends
return sorted(friends_lst)
TypeError: unorderable types: Profile() < Profile()
Why can't I simply typecast the object to get it in list form? What should I be doing instead so that get_friends will return an alphabetically sorted list of friends?
Python's sorting functions compare instances with the < operator, so sorted() needs your class to define __lt__. The other rich comparison methods (__eq__, __ne__, __le__, __gt__, __ge__) are worth defining as well so that all comparisons behave consistently. You need to define those methods in order to make your instances orderable.
For performance reasons, I'd recommend defining an integer attribute on your class, such as id, and comparing on that instead of name, which carries string comparison overhead.
class Profile(object):
    def __eq__(self, profile):
        return self.id == profile.id  # The id attribute is made up for this example.

    def __lt__(self, profile):
        return self.id < profile.id

    def __hash__(self):
        return hash(self.id)

    ...
Alternatively, you can pass a key function to sort algorithm if you don't want to bother yourself overriding those methods:
>>> friend_list = [<Profile: id=120>, <Profile: id=121>, <Profile: id=115>]
>>> friend_list.sort(key=lambda p: p.id, reverse=True)
Using operator.attrgetter:
>>> import operator
>>> new_friend_list = sorted(friend_list, key=operator.attrgetter('id'))
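Since the question asks for alphabetical order, the same key-function approach can be pointed at name instead of the made-up id. Here is a trimmed sketch of the Profile class from the question with only the pieces needed for sorting (the sample names are invented):

```python
from operator import attrgetter


class Profile:
    def __init__(self, name):
        self.name = name
        self.friends = set()

    def add_friend(self, friend_profile):
        self.friends.add(friend_profile)
        friend_profile.friends.add(self)
        return self

    def get_friends(self):
        # sorted() accepts any iterable, so the set needs no list() conversion;
        # the key function tells it to compare the name strings, not the objects.
        return sorted(self.friends, key=attrgetter("name"))


alice = Profile("Alice")
alice.add_friend(Profile("Zed")).add_friend(Profile("Bob"))

friend_names = [friend.name for friend in alice.get_friends()]
print(friend_names)
```

One design difference from the original: this sketch returns an empty list when there are no friends rather than the string "None", so callers always get the same type back.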
I think I'll take a crack at this. First, here's the code:
from collections import namedtuple

class Profile(namedtuple("Profile", "name")):
    def __init__(self, name):
        # don't set self.name, it's already set!
        self.friends = set()
        self.statuses = []

    # ... and all the rest the same. Only the base class changes.
What we've done here is create a class with the shape of a tuple. As such, it's orderable, hashable, and so on. You could even drop your __str__() method, since namedtuple provides a readable __repr__() that str() falls back to.
So class A has a collection of class B and class B has some properties.
class A(object):
    bs = []

class B(object):
    propertyA
    propertyB
I need to be able to traverse the collection from the root of the aggregate and find all differences between two aggregates.
So, for example, one instance of A can differ from another by having an additional B instance or by lacking some B instance. And I need to do this recursively for every B that they have in common.
A and B are value objects, so their identity depends completely on their attributes.
Right now I have three classes to encapsulate differences:
ElementExistsDifference
ElementNotExistsDifference
ElementPropertyDifference.
Which are defined as:
from abc import ABCMeta, abstractmethod

__author__ = 'michael'

class Differ():
    def __init__(self, one_item, another_item, difference):
        self.another_item = another_item
        self.one_item = one_item
        self.difference = difference

    def __repr__(self, *args, **kwargs):
        one_str = str(self.one_item)
        two_str = str(self.another_item)
        diff_str = str(self.difference)
        return "{} differ from {} by {}".format(one_str, two_str, diff_str)

class AbstractDifference(metaclass=ABCMeta):
    @abstractmethod
    def compensate(self, db_api):
        pass

class NoDifference(AbstractDifference):
    def compensate(self, db_api):
        pass

    def __repr__(self):
        return "Nothing"

class ItemDifference(AbstractDifference):
    @abstractmethod
    def compensate(self, db_api):
        pass

    def __init__(self, item):
        self.item = item

    def __repr__(self):
        return str(self.item)

class ExistsDifference(ItemDifference):
    def compensate(self, api):
        pass

    def __repr__(self):
        # Reuse the parent's representation of the item.
        item = super().__repr__()
        return "Existence of {}".format(item)

class NotExistsDifference(ItemDifference):
    def compensate(self, api):
        pass

    def __repr__(self):
        item = super().__repr__()
        return "Absence of {}".format(item)

class ItemPropertyDifference(AbstractDifference):
    def compensate(self, api):
        pass

    def __init__(self, property):
        self.property = property

    def __repr__(self):
        return str(self.property)
Any suggestions on how to do this?
It looks like you have thought about this problem a bit, considering the classes you have created. I am going to describe my own thought process regarding this problem and then define similar classes. Hopefully this will help you complete your solution.
You have value objects which have attributes. Those attributes can be plain values or they can be value objects.
You need to compare two trees to determine the differences between the two trees. You have defined many classes which you hope to annotate the differences in structure. I think that this is a good approach, however I think that you only need two classes to represent the difference.
class Match(object):
    def __init__(self, one, two):
        self.one = one
        self.two = two

    def add_comparison(self, attribute, comparison):
        """ This adds the Match or Difference class for the attribute """
        setattr(self, attribute, comparison)

class Difference(object):
    def __init__(self, one, two):
        self.one = one
        self.two = two
These two classes represent two objects or values which either match or differ.
If two value objects are present for the same attribute in their parent then they are considered to match. If the value objects have completely different attributes then they still match, but the attributes differ. The match holds references to the actual values so you can print them as required.
If two values are present for the same attribute in their parent and their values are equal then they are considered to match. Since they are not value objects there is no further comparison done.
If an attribute is missing, or an attribute differs substantially (i.e. one is a value object and another is a plain value) then they are considered to differ.
With these classes and rules we can then define the actual matcher. I have strived to use clear method names, but I have not implemented all the methods that this code uses. You will have to complete them. Hopefully that will not be too hard!
def generate_match_tree(one, two):
    # The is_value_object function should be able to handle None
    if not is_value_object(one) and not is_value_object(two):
        # If you need to handle lists then you would do that here
        if one == two:
            return Match(one, two)
        else:
            return Difference(one, two)
    if not is_value_object(one) or not is_value_object(two):
        return Difference(one, two)
    # Here we know one and two are both value objects, so they match.
    # We must consider the attributes of them now...
    result = Match(one, two)
    for attribute in set(get_all_attributes(one)) | set(get_all_attributes(two)):
        # An attribute may exist on only one side; a missing one is treated as None.
        a_one = getattr(one, attribute, None)
        a_two = getattr(two, attribute, None)
        result.add_comparison(attribute, generate_match_tree(a_one, a_two))
    return result
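To show how the pieces could fit together, here is one possible way to fill in the missing helpers. Everything beyond Match, Difference and generate_match_tree is an assumption made for this sketch: the ValueObject marker class, the vars()-based get_all_attributes, and the A and B stand-ins with named attributes instead of a list. Treat it as an illustration of the rules above, not a finished design:

```python
class Match:
    def __init__(self, one, two):
        self.one = one
        self.two = two

    def add_comparison(self, attribute, comparison):
        setattr(self, attribute, comparison)


class Difference:
    def __init__(self, one, two):
        self.one = one
        self.two = two


class ValueObject:
    """Hypothetical marker base class identifying value objects."""


def is_value_object(obj):
    # None and plain values are simply not instances of ValueObject.
    return isinstance(obj, ValueObject)


def get_all_attributes(obj):
    # Assumption: every attribute of interest lives in the instance __dict__.
    return list(vars(obj))


def generate_match_tree(one, two):
    if not is_value_object(one) and not is_value_object(two):
        return Match(one, two) if one == two else Difference(one, two)
    if not is_value_object(one) or not is_value_object(two):
        return Difference(one, two)
    result = Match(one, two)
    for attribute in set(get_all_attributes(one)) | set(get_all_attributes(two)):
        a_one = getattr(one, attribute, None)  # missing on one side -> None
        a_two = getattr(two, attribute, None)
        result.add_comparison(attribute, generate_match_tree(a_one, a_two))
    return result


class B(ValueObject):
    def __init__(self, property_a, property_b):
        self.property_a = property_a
        self.property_b = property_b


class A(ValueObject):
    def __init__(self, **named_bs):
        for name, b in named_bs.items():
            setattr(self, name, b)


left = A(first=B(1, 2), second=B(3, 4))
right = A(first=B(1, 5), third=B(6, 7))
tree = generate_match_tree(left, right)

print(type(tree.first.property_b).__name__)  # the differing property
print(type(tree.second).__name__)            # present only on the left
```

Note that add_comparison stores results with setattr, so a value object with attributes literally named one or two would overwrite the Match's own references; a dictionary of children would avoid that if it matters for your data.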
Let me know if you need any more help.