Python classes with __hash__ support (depending on the instance) - python

I have a mutable class in Python which I would like to be able to "freeze"; at that point it's immutable and can therefore have a __hash__ function.
My concern is: will having the __hash__ function present make Python behave strangely, because it may check for the existence of a hash function?
I realize I could use a subclass that has a hash function and copy the instance to that subtype. But I'm interested to know if having an optional hash function is supported by Python.
In the example below it works in basic cases (but may fail in others).
Note: This assumes you don't touch _var or _is_frozen directly and only use access methods.
Note: it's probably more Pythonic not to use this method and instead have a FrozenMyVar class, but I'm curious whether this can be considered supported in Python or not.
class MyVar:
    __slots__ = ("_var", "_is_frozen")
    def __init__(self, var):
        self._var = var
        self._is_frozen = False
    def freeze(self):
        self._is_frozen = True
    def __hash__(self):
        if not self._is_frozen:
            raise TypeError("%r not hashable (freeze first)" % type(self))
        return hash(self._var)
    def __eq__(self, other):
        try:
            return self.var == other.var
        except AttributeError:
            return NotImplemented
    @property
    def var(self):
        return self._var
    @var.setter
    def var(self, value):
        if self._is_frozen:
            raise AttributeError("%r is frozen" % type(self))
        self._var = value
# ------------
# Verify Usage
v = MyVar(10)
v.var = 9
try:
    hash(v)
except TypeError:
    print("Hash fails on un-frozen instance")
v.freeze()
try:
    v.var = 11
except AttributeError:
    print("Assignment fails on frozen instance")
print("Hash is", hash(v))
Adding a note on the real-world use case: we have a linear math module with Vector/Matrix/Quaternion/Euler classes. In some cases we want to have, e.g., a "set of matrices" or a "dict with vector keys". It's always possible to expand them into tuples, but they take up more memory and lose their ability to behave as our own math types, so the ability to freeze them is attractive.

The original example didn't quite work "sensibly", because the class had __hash__ but not __eq__, and as https://docs.python.org/3/reference/datamodel.html#object.hash says, "If a class does not define an __eq__() method it should not define a __hash__() operation either". But the OP's edit fixed that side issue.
This done, if the class and its instances are indeed used with the discipline outlined, behavior should comply with the specs: instances are "born unhashable" but "become hashable" -- "irreversibly" given said discipline, and only, of course, if their self._var is in turn hashable -- once their freeze method is called.
Of course collections.abc.Hashable (spelled collections.Hashable before Python 3.3; that alias was removed in Python 3.10) will "mis-classify" unfrozen instances (as it only checks for the presence of __hash__, not its actual working), but that is hardly unique behavior:
>>> import collections.abc
>>> isinstance((1, [2,3], 4), collections.abc.Hashable)
True
>>> hash((1, [2,3], 4))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'
That tuple does appear "hashable", like all tuples (since its type does define __hash__) -- but if you in fact try hashing it, you nevertheless get a TypeError, as one of the items is a list (making the whole not actually hashable!-). Not-yet-frozen instances of the OP's class would behave similarly to such a tuple.
An alternative which does avoid this little glitch (yet doesn't require potentially onerous copies of data) is to model the "freezing" as the instance "changing type in-place", e.g...:
class MyVar(object):
    _is_frozen = False
    def __init__(self, var):
        self._var = var
    def freeze(self):
        self.__class__ = FrozenMyVar
    def __eq__(self, other):
        try:
            return self.var == other.var
        except AttributeError:
            return NotImplemented
    __hash__ = None
    @property
    def var(self):
        return self._var
    @var.setter
    def var(self, value):
        if self._is_frozen:
            raise AttributeError("%r is frozen" % type(self))
        self._var = value
class FrozenMyVar(MyVar):
    _is_frozen = True
    def __hash__(self):
        return hash(self._var)
This behaves essentially like the original example (I've removed the "slots" to avoid issues with object layout differs errors on __class__ assignment) but may be considered an improved object model since "changing type in-place" models well such irreversible changes in behavior (and as a small side effect collections.Hashable now behaves impeccably:-).
The concept of an object "changing type in-place" freaks some out because few languages indeed would even tolerate it, and even in Python of course it's a rare thing to have a practical use case for such an obscure feature of the language. However, use cases do exist -- which is why __class__ assignment is indeed supported!-)
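As a rough illustration of the "changing type in-place" idea, here is a minimal self-contained sketch. It is a trimmed variant of the classes above, not the exact code, and the names before, after and lookup exist only for the demo. It records what collections.abc.Hashable reports before and after freeze(), then uses the frozen instance as a dict key:

```python
from collections.abc import Hashable


class MyVar:
    __hash__ = None  # unfrozen instances are explicitly unhashable

    def __init__(self, var):
        self._var = var

    def freeze(self):
        # Swap the instance's type in place; no data is copied.
        self.__class__ = FrozenMyVar

    def __eq__(self, other):
        if not isinstance(other, MyVar):
            return NotImplemented
        return self._var == other._var


class FrozenMyVar(MyVar):
    def __hash__(self):
        return hash(self._var)


v = MyVar(10)
before = isinstance(v, Hashable)  # checked while still a plain MyVar
v.freeze()
after = isinstance(v, Hashable)   # checked after the type swap
lookup = {v: "ten"}               # works only because v is now hashable
```

Because the sketch drops __slots__, both classes share the same object layout, which is what lets the __class__ assignment succeed.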

Related

Preferred method to get a class variable inside a class in Python. Custom method vs self [duplicate]

I'm doing it like:
def set_property(property,value):
def get_property(property):
or
object.property = value
value = object.property
What's the pythonic way to use getters and setters?
Try this: Python Property
The sample code is:
class C(object):
    def __init__(self):
        self._x = None
    @property
    def x(self):
        """I'm the 'x' property."""
        print("getter of x called")
        return self._x
    @x.setter
    def x(self, value):
        print("setter of x called")
        self._x = value
    @x.deleter
    def x(self):
        print("deleter of x called")
        del self._x
c = C()
c.x = 'foo'  # setter called
foo = c.x    # getter called
del c.x      # deleter called
What's the pythonic way to use getters and setters?
The "Pythonic" way is not to use "getters" and "setters", but to use plain attributes, like the question demonstrates, and del for deleting (but the names are changed to protect the innocent... builtins):
value = 'something'
obj.attribute = value
value = obj.attribute
del obj.attribute
If later, you want to modify the setting and getting, you can do so without having to alter user code, by using the property decorator:
class Obj:
    """property demo"""
    #
    @property            # first decorate the getter method
    def attribute(self): # This getter method name is *the* name
        return self._attribute
    #
    @attribute.setter    # the property decorates with `.setter` now
    def attribute(self, value):   # name, e.g. "attribute", is the same
        self._attribute = value   # the "value" name isn't special
    #
    @attribute.deleter     # decorate with `.deleter`
    def attribute(self):   # again, the method name is the same
        del self._attribute
(Each decorator usage copies and updates the prior property object, so note that you should use the same name for each set, get, and delete function/method.)
After defining the above, the original setting, getting, and deleting code is the same:
obj = Obj()
obj.attribute = value
the_value = obj.attribute
del obj.attribute
You should avoid this:
def set_property(property,value):
def get_property(property):
Firstly, the above doesn't work, because you don't provide an argument for the instance that the property would be set to (usually self), which would be:
class Obj:
    def set_property(self, property, value): # don't do this
        ...
    def get_property(self, property):        # don't do this either
        ...
Secondly, this duplicates the purpose of two special methods, __setattr__ and __getattr__.
Thirdly, we also have the setattr and getattr builtin functions.
setattr(object, 'property_name', value)
getattr(object, 'property_name', default_value) # default is optional
The @property decorator is for creating getters and setters.
For example, we could modify the setting behavior to place restrictions on the value being set:
class Protective(object):
    @property
    def protected_value(self):
        return self._protected_value
    @protected_value.setter
    def protected_value(self, value):
        if acceptable(value): # e.g. type or range check
            self._protected_value = value
In general, we want to avoid using property and just use direct attributes.
This is what is expected by users of Python. Following the rule of least-surprise, you should try to give your users what they expect unless you have a very compelling reason to the contrary.
Demonstration
For example, say we needed our object's protected attribute to be an integer between 0 and 100 inclusive, and prevent its deletion, with appropriate messages to inform the user of its proper usage:
class Protective(object):
    """protected property demo"""
    #
    def __init__(self, start_protected_value=0):
        self.protected_value = start_protected_value
    #
    @property
    def protected_value(self):
        return self._protected_value
    #
    @protected_value.setter
    def protected_value(self, value):
        if value != int(value):
            raise TypeError("protected_value must be an integer")
        if 0 <= value <= 100:
            self._protected_value = int(value)
        else:
            raise ValueError("protected_value must be " +
                             "between 0 and 100 inclusive")
    #
    @protected_value.deleter
    def protected_value(self):
        raise AttributeError("do not delete, protected_value can be set to 0")
(Note that __init__ refers to self.protected_value but the property methods refer to self._protected_value. This is so that __init__ uses the property through the public API, ensuring it is "protected".)
And usage:
>>> p1 = Protective(3)
>>> p1.protected_value
3
>>> p1 = Protective(5.0)
>>> p1.protected_value
5
>>> p2 = Protective(-5)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in __init__
  File "<stdin>", line 15, in protected_value
ValueError: protected_value must be between 0 and 100 inclusive
>>> p1.protected_value = 7.3
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 17, in protected_value
TypeError: protected_value must be an integer
>>> p1.protected_value = 101
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 15, in protected_value
ValueError: protected_value must be between 0 and 100 inclusive
>>> del p1.protected_value
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 18, in protected_value
AttributeError: do not delete, protected_value can be set to 0
Do the names matter?
Yes they do. .setter and .deleter make copies of the original property. This allows subclasses to properly modify behavior without altering the behavior in the parent.
class Obj:
    """property demo"""
    #
    @property
    def get_only(self):
        return self._attribute
    #
    @get_only.setter
    def get_or_set(self, value):
        self._attribute = value
    #
    @get_or_set.deleter
    def get_set_or_delete(self):
        del self._attribute
Now for this to work, you have to use the respective names:
obj = Obj()
# obj.get_only = 'value' # would error
obj.get_or_set = 'value'
obj.get_set_or_delete = 'new value'
the_value = obj.get_only
del obj.get_set_or_delete
# del obj.get_or_set # would error
I'm not sure where this would be useful, but the use-case is if you want a get, set, and/or delete-only property. Probably best to stick to semantically same property having the same name.
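To make the "copies" point concrete, here is a small sketch (the names value, getter_only and the result variables are made up for the demo). It keeps a second name bound to the property object as it was before .setter was applied, and checks that .setter returned a new property object rather than modifying the old one:

```python
class Obj:
    @property
    def value(self):
        return self._value

    # Keep a second name bound to the getter-only property object.
    getter_only = value

    @value.setter
    def value(self, new):
        self._value = new


# Looking a property up on the class returns the property object itself.
old_prop = Obj.getter_only
new_prop = Obj.value

same_object = old_prop is new_prop              # did .setter mutate in place?
old_has_setter = old_prop.fset is not None      # did the original gain a setter?
shares_getter = old_prop.fget is new_prop.fget  # was the getter carried over?

obj = Obj()
obj.value = 3                   # goes through the copy that has a setter
via_old_name = obj.getter_only  # the old object can still read the value
```

Under these assumptions the two property objects are distinct, the original has no setter, and both share the same getter function, which is why reusing one name for getter, setter and deleter is the usual choice.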
Conclusion
Start with simple attributes.
If you later need functionality around the setting, getting, and deleting, you can add it with the property decorator.
Avoid functions named set_... and get_... - that's what properties are for.
In [1]: class test(object):
            def __init__(self):
                self.pants = 'pants'
            @property
            def p(self):
                return self.pants
            @p.setter
            def p(self, value):
                self.pants = value * 2
   ....:
In [2]: t = test()
In [3]: t.p
Out[3]: 'pants'
In [4]: t.p = 10
In [5]: t.p
Out[5]: 20
Using @property and @attribute.setter helps you not only to use the "pythonic" way but also to check the validity of attributes both while creating the object and when altering it.
class Person(object):
    def __init__(self, p_name=None):
        self.name = p_name
    @property
    def name(self):
        return self._name
    @name.setter
    def name(self, new_name):
        if type(new_name) == str:  # type checking for name property
            self._name = new_name
        else:
            raise Exception("Invalid value for name")
By this, you actually 'hide' _name attribute from client developers and also perform checks on name property type. Note that by following this approach even during the initiation the setter gets called. So:
p = Person(12)
Will lead to:
Exception: Invalid value for name
But:
>>> p = Person('Mike')
>>> print(p.name)
Mike
>>> p.name = 'George'
>>> print(p.name)
George
>>> p.name = 2.3  # Causes an exception
This is an old question but the topic is very important and always current. In case anyone wants to go beyond simple getters/setters, I have written an article about superpowered properties in Python with support for slots, observability and reduced boilerplate code.
from objects import properties, self_properties
class Car:
    with properties(locals(), 'meta') as meta:
        @meta.prop(read_only=True)
        def brand(self) -> str:
            """Brand"""
        @meta.prop(read_only=True)
        def max_speed(self) -> float:
            """Maximum car speed"""
        @meta.prop(listener='_on_acceleration')
        def speed(self) -> float:
            """Speed of the car"""
            return 0  # Default stopped
        @meta.prop(listener='_on_off_listener')
        def on(self) -> bool:
            """Engine state"""
            return False
    def __init__(self, brand: str, max_speed: float = 200):
        self_properties(self, locals())
    def _on_off_listener(self, prop, old, on):
        if on:
            print(f"{self.brand} Turned on, Runnnnnn")
        else:
            self._speed = 0
            print(f"{self.brand} Turned off.")
    def _on_acceleration(self, prop, old, speed):
        if self.on:
            if speed > self.max_speed:
                print(f"{self.brand} {speed}km/h Bang! Engine exploded!")
                self.on = False
            else:
                print(f"{self.brand} New speed: {speed}km/h")
        else:
            print(f"{self.brand} Car is off, no speed change")
This class can be used like this:
mycar = Car('Ford')
# Car is turned off
for speed in range(0, 300, 50):
mycar.speed = speed
# Car is turned on
mycar.on = True
for speed in range(0, 350, 50):
mycar.speed = speed
This code will produce the following output:
Ford Car is off, no speed change
Ford Car is off, no speed change
Ford Car is off, no speed change
Ford Car is off, no speed change
Ford Car is off, no speed change
Ford Car is off, no speed change
Ford Turned on, Runnnnnn
Ford New speed: 0km/h
Ford New speed: 50km/h
Ford New speed: 100km/h
Ford New speed: 150km/h
Ford New speed: 200km/h
Ford 250km/h Bang! Engine exploded!
Ford Turned off.
Ford Car is off, no speed change
More info about how and why here: https://mnesarco.github.io/blog/2020/07/23/python-metaprogramming-properties-on-steroids
Properties are pretty useful since you can use them with plain assignment while still including validation. In the code below, the decorators @property and @<property_name>.setter create the methods:
# Python program displaying the use of @property
class AgeSet:
    def __init__(self):
        self._age = 0
    # using property decorator a getter function
    @property
    def age(self):
        print("getter method called")
        return self._age
    # a setter function
    @age.setter
    def age(self, a):
        if a < 18:
            raise ValueError("Sorry your age is below eligibility criteria")
        print("setter method called")
        self._age = a
pkj = AgeSet()
pkj.age = int(input("set the age using setter: "))
print(pkj.age)
There are more details in this post I wrote about this as well: https://pythonhowtoprogram.com/how-to-create-getter-setter-class-properties-in-python-3/
You can use accessors/mutators (i.e. @attr.setter and @property) or not, but the most important thing is to be consistent!
If you're using @property to simply access an attribute, e.g.
class myClass:
    def __init__(self, a):
        self._a = a
    @property
    def a(self):
        return self._a
use it to access every* attribute! It would be a bad practice to access some attributes using @property and leave some other properties public (i.e. name without an underscore) without an accessor, e.g. do not do
class myClass:
    def __init__(self, a, b):
        self._a = a
        self.b = b
    @property
    def a(self):
        return self._a
Note that self.b does not have an explicit accessor here even though it's public.
Similarly with setters (or mutators), feel free to use @attribute.setter but be consistent! When you do e.g.
class myClass:
    def __init__(self, a, b):
        self.a = a
        self.b = b
    a = property()  # write-only: no getter is defined
    @a.setter
    def a(self, value):
        self._a = value
It's hard for me to guess your intention. On one hand you're saying that both a and b are public (no leading underscore in their names) so I should theoretically be allowed to access/mutate (get/set) both. But then you specify an explicit mutator only for a, which tells me that maybe I should not be able to set b. Since you've provided an explicit mutator, I am not sure if the lack of an explicit accessor (@property) means I should not be able to access either of those variables or you were simply being frugal in using @property.
*The exception is when you explicitly want to make some variables accessible or mutable but not both, or you want to perform some additional logic when accessing or mutating an attribute. This is when I am personally using @property and @attribute.setter (otherwise no explicit accessors/mutators for public attributes).
Lastly, PEP8 and Google Style Guide suggestions:
PEP8, Designing for Inheritance says:
For simple public data attributes, it is best to expose just the attribute name, without complicated accessor/mutator methods. Keep in mind that Python provides an easy path to future enhancement, should you find that a simple data attribute needs to grow functional behavior. In that case, use properties to hide functional implementation behind simple data attribute access syntax.
On the other hand, according to Google Style Guide Python Language Rules/Properties the recommendation is to:
Use properties in new code to access or set data where you would normally have used simple, lightweight accessor or setter methods. Properties should be created with the @property decorator.
The pros of this approach:
Readability is increased by eliminating explicit get and set method calls for simple attribute access. Allows calculations to be lazy. Considered the Pythonic way to maintain the interface of a class. In terms of performance, allowing properties bypasses needing trivial accessor methods when a direct variable access is reasonable. This also allows accessor methods to be added in the future without breaking the interface.
and cons:
Must inherit from object in Python 2. Can hide side-effects much like operator overloading. Can be confusing for subclasses.
You can use the magic methods __getattribute__ and __setattr__.
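class MyClass:
    def __init__(self, attrvalue):
        self.myattr = attrvalue
    def __getattribute__(self, attr):
        if attr == "myattr":
            pass  # Getter logic for myattr goes here
        return object.__getattribute__(self, attr)
    def __setattr__(self, attr, value):
        if attr == "myattr":
            pass  # Setter logic for myattr goes here
        object.__setattr__(self, attr, value)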
Be aware that __getattr__ and __getattribute__ are not the same. __getattr__ is only invoked when the attribute is not found.

How to use python collections for custom classes

Still somewhat perplexed by Python and its magic functional programming, so I tend to find myself writing code that leans towards the Java paradigm of programming as opposed to idiomatic Python.
My question is somewhat related to: How do I make a custom class a collection in Python
The only difference is I have nested objects (using composition). The VirtualPage object is comprised of a list of PhysicalPage objects. I have a function which can take a list of PhysicalPage objects and coalesce all of the details into a single named tuple I call PageBoundary. Essentially it's a serialization function which can spit out a tuple comprised of an integer range which represents the physical page and the line number in the page. From this I can easily sort and order VirtualPages among one another (that's the idea at least):
PageBoundary = collections.namedtuple('PageBoundary', 'begin end')
I also have a function which can take a PageBoundary namedtuple and de-serialize or expand the tuple into a list of PhysicalPages. It's preferable that these two data storage classes not change as it will break any downstream code.
Here is a snippet of my custom python2.7 class. It is composed of a lot of things, one of which is a list containing PhysicalPage objects:
class VirtualPage(object):
    def __init__(self, _physical_pages=list()):
        self.physical_pages = _physical_pages
class PhysicalPage(object):
    # class variables: number of digits each attribute gets
    _PAGE_PAD, _LINE_PAD = 10, 12
    def __init__(self, _page_num=-1):
        self.page_num = _page_num
        self.begin_line_num = -1
        self.end_line_num = -1
    def get_canonical_begin(self):
        # note: tmp_line_num is not defined anywhere in this snippet
        return int(''.join([str(self.page_num).zfill(PhysicalPage._PAGE_PAD),
                            str(tmp_line_num).zfill(PhysicalPage._LINE_PAD)]))
    def get_canonical_end(self):
        pass  # see get_canonical_begin() implementation
    def get_canonical_page_boundaries(self):
        return PageBoundary(self.get_canonical_begin(), self.get_canonical_end())
I would like to leverage some templated collection (from the python collections module) to easily sort and compare as list or set of VirtualPage classes. Also would like some advice on the layout of my data storage classes: VirtualPage and PhysicalPage.
Given either a sequence of VirtualPages or as in the example below:
vp_1 = VirtualPage(list_of_physical_pages)
vp_1_copy = VirtualPage(list_of_physical_pages)
vp_2 = VirtualPage(list_of_other_physical_pages)
I want to easily answer questions like this:
>>> vp_2 in vp_1
False
>>> vp_2 < vp_1
True
>>> vp_1 == vp_1_copy
True
Right off the bat it seems obvious that the VirtualPage class needs to call get_canonical_page_boundaries or even implement the function itself. At a minimum it should loop over its PhysicalPage list to implement the required functions (__lt__() and __eq__()) so I can compare between VirtualPages.
1.) Currently I'm struggling with implementing some of the comparison functions. One big obstacle is how to compare a tuple. Do I create my own __lt__() function by creating a custom class which extends some type of collection:
import collections as col
import functools
@functools.total_ordering
class AbstractVirtualPageContainer(col.MutableSet):
    def __lt__(self, other):
        '''What type would other be?
        Make comparison by first normalizing to a comparable type: PageBoundary
        '''
        pass
2.) Should the comparison function implementation exist in the VirtualPage class instead?
I was leaning towards some type of Set data structure, as the data I'm modeling has the concept of uniqueness: i.e. physical page values cannot overlap and to some extent act as a linked list. Also, would setter or getter functions, implemented via @ decorator functions, be of any use here?
I think you want something like the code below. Not tested; certainly not tested for your application or with your data, YMMV, etc.
from collections import namedtuple
# PageBoundary is a subclass of named tuple with special relational
# operators. __le__ and __ge__ are left undefined because they don't
# make sense for this class.
class PageBoundary(namedtuple('PageBoundary', 'begin end')):
    # to prevent making an instance dict (See namedtuple docs)
    __slots__ = ()
    def __lt__(self, other):
        return self.end < other.begin
    def __eq__(self, other):
        # you can put in an assertion if you are concerned the
        # method might be called with the wrong type object
        assert isinstance(other, PageBoundary), "Wrong type for other"
        return self.begin == other.begin and self.end == other.end
    def __ne__(self, other):
        return not self == other
    def __gt__(self, other):
        return other < self
class PhysicalPage(object):
    # class variables: number of digits each attribute gets
    _PAGE_PAD, _LINE_PAD = 10, 12
    def __init__(self, page_num):
        self.page_num = page_num
        # single leading underscore is 'private' by convention
        # not enforced by the language
        self._begin = self.page_num * 10**PhysicalPage._LINE_PAD + tmp_line_num
        #self._end = ...however you calculate this...              ^ not defined yet
        self.begin_line_num = -1
        self.end_line_num = -1
    # this serves the purpose of a `getter`, but looks just like
    # a normal class member access. used like x = page.begin
    @property
    def begin(self):
        return self._begin
    @property
    def end(self):
        return self._end
    def __lt__(self, other):
        assert(isinstance(other, PhysicalPage))
        return self._end < other._begin
    def __eq__(self, other):
        assert(isinstance(other, PhysicalPage))
        # parentheses are needed: without them this returns a 3-tuple
        return (self._begin, self._end) == (other._begin, other._end)
    def __ne__(self, other):
        return not self == other
    def __gt__(self, other):
        return other < self
class VirtualPage(object):
    def __init__(self, physical_pages=None):
        self.physical_pages = sorted(physical_pages) if physical_pages else []
    def __lt__(self, other):
        if self.physical_pages and other.physical_pages:
            return self.physical_pages[-1].end < other.physical_pages[0].begin
        else:
            raise ValueError
    def __eq__(self, other):
        if self.physical_pages and other.physical_pages:
            return self.physical_pages == other.physical_pages
        else:
            raise ValueError
    def __gt__(self, other):
        return other < self
And a few observations:
Although there is no such thing as "private" members in Python classes, it is a convention to begin a variable name with a single underscore, _, to indicate it is not part of the public interface of the class / module / etc. So, naming the parameters of public methods with a leading '_' doesn't seem correct, e.g., def __init__(self, _page_num=-1).
Python generally doesn't use setters / getters; just use the attributes directly. If attribute values need to be calculated, or some other processing is needed, use the @property decorator (as shown for PhysicalPage.begin above).
It's generally not a good idea to initialize a default function argument with a mutable object. def __init__(self, physical_pages=list()) does not initialize physical_pages with a new empty list each time; rather, it uses the same list every time. If the list is modified, at the next function call physical_pages will be initialized with the modified list. See VirtualPage's initializer for an alternative.
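The last observation can be seen in a few lines. This is only an illustrative sketch; append_shared and append_fresh are hypothetical helper names, not part of the code above:

```python
def append_shared(item, bucket=[]):
    # The default list is created once, when the function is defined,
    # so every call that omits bucket reuses the same list object.
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    # A new list is created on each call that omits bucket.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


shared_first = append_shared(1)
shared_second = append_shared(2)  # same list object as shared_first
fresh_first = append_fresh(1)
fresh_second = append_fresh(2)    # independent of fresh_first
```

The None sentinel is the same pattern used by VirtualPage's initializer above.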

Overriding Python's __eq__ method: isinstance & __eq__ methods return False

I'm new to Python from the Java world.
I have written a Python class called "Instance" with 3 properties (attribute, value, and class). I want to override the "__eq__" method and also the "__hash__" method, using the "attribute" and "value" properties for object comparison. I instantiated two objects with the same values, however they return as not equal.
Code is below, class Instance:
'''Class of type Instance'''
class Instance(object):
    __attribute = None;
    __value = None;
    __classification = None;
    #constructor
    def __init__(self,attribute,value,classification):
        self.attribute = attribute;
        self.value = value;
        self.classification = classification;
    #setters & getters
    def setAttribute(self,attribute):
        self.attribute = attribute
    def setValue(self,value):
        self.value = value
    def setClassification(self,classification):
        self.classification = classification
    def getAttribute(self):
        return self.attribute;
    def getValue(self):
        return self.value
    def getClassification(self):
        return self.classification
    def __eq__(self, other):
        #if self & other are the same instance & attribute & value equal
        return isinstance(self,other) and (self.attribute == other.attribute) and (self.value == other.value)
    def __hash__(self):
        return hash((self.attribute, self.value))
I'm instantiating in , another Python module called Testing:
if __name__ == '__main__':
    pass
from Instance import *
instance1 = Instance('sameValue', 1,'Iris-setosa')
instance2 = Instance('sameValue', 1,'Iris-setosa')
if (instance1 is instance2):
    print "equals"
else:
    print "not equals"
The program returns: not equals.
Your first problem is isinstance(self, other) isn't asking whether self and other are both instances of compatible types, or whether they're the same instance (as your comment says), it's asking whether self is an instance of the type other. Since other isn't even a type, the answer is always false.
You probably wanted isinstance(self, type(other)). Or maybe something more complicated, like isinstance(self, type(other)) or isinstance(other, type(self)).
Or maybe you don't really want this at all; even for equality testing, duck typing is often a good idea. If other has the same attributes as self, and also hashes to the same value, is that good enough? The answer may be no… but you definitely should ask the question.
Your second problem is a misunderstanding of is:
if (instance1 is instance2):
    print "equals"
else:
    print "not equals"
The whole point of is is that it's asking whether these are the same object, not whether these two (possibly distinct) objects are equal to each other. For example:
>>> a = []
>>> b = []
>>> a == b
True
>>> a is b
False
They're both empty lists, so they're equal to each other, but they're two different empty lists, which is why you can do this:
>>> a.append(0)
>>> b
[]
And the same is true with your class. Each Instance that you create is going to be a different, separate instance—even if they're all equal.
The __eq__ method that you define customizes the == operator. There is no way to customize the is operator.
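Putting both points together, one possible corrected version might look like the sketch below. It is written for Python 3 and picks the simple isinstance(other, Instance) check, which is only one of the options discussed above; the result variables are there just to show the outcome.

```python
class Instance:
    def __init__(self, attribute, value, classification):
        self.attribute = attribute
        self.value = value
        self.classification = classification

    def __eq__(self, other):
        # Ask whether other is an Instance, not whether self is an
        # instance of other.
        if not isinstance(other, Instance):
            return NotImplemented
        return (self.attribute, self.value) == (other.attribute, other.value)

    def __hash__(self):
        # hash() takes a single argument, so pack the fields into a tuple.
        return hash((self.attribute, self.value))


instance1 = Instance('sameValue', 1, 'Iris-setosa')
instance2 = Instance('sameValue', 1, 'Iris-setosa')

equal = instance1 == instance2              # uses __eq__
identical = instance1 is instance2          # identity, cannot be customized
unique_count = len({instance1, instance2})  # uses __hash__ and __eq__
```

Here the two objects compare equal and collapse to one set element, yet `is` still reports them as distinct objects.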

Is there a way to access __dict__ (or something like it) that includes base classes?

Suppose we have the following class hierarchy:
class ClassA:
    @property
    def foo(self): return "hello"
class ClassB(ClassA):
    @property
    def bar(self): return "world"
If I explore __dict__ on ClassB like so, I only see the bar attribute:
for name,_ in ClassB.__dict__.items():
    if name.startswith("__"):
        continue
    print(name)
Output is bar
I can roll my own means to get attributes on not only the specified type but its ancestors. However, my question is whether there's already a way in python for me to do this without re-inventing a wheel.
def return_attributes_including_inherited(type):
    results = []
    return_attributes_including_inherited_helper(type,results)
    return results
def return_attributes_including_inherited_helper(type,attributes):
    for name,attribute_as_object in type.__dict__.items():
        if name.startswith("__"):
            continue
        attributes.append(name)
    for base_type in type.__bases__:
        return_attributes_including_inherited_helper(base_type,attributes)
Running my code as follows...
for attribute_name in return_attributes_including_inherited(ClassB):
    print(attribute_name)
... gives back both bar and foo.
Note that I'm simplifying some things: name collisions, using items() when for this example I could use dict, skipping over anything that starts with __, ignoring the possibility that two ancestors themselves have a common ancestor, etc.
EDIT1 - I tried to keep the example simple. But I really want both the attribute name and the attribute reference for each class and ancestor class. One of the answers below has me on a better track, I'll post some better code when I get it to work.
EDIT2 - This does what I want and is very succinct. It's based on Eli's answer below.
def get_attributes(type):
    attributes = set(type.__dict__.items())
    for type in type.__mro__:
        attributes.update(type.__dict__.items())
    return attributes
It gives back both the attribute names and their references.
EDIT3 - One of the answers below suggested using inspect.getmembers. This appears very useful because it's like __dict__, only it operates on ancestor classes as well.
Since a large part of what I was trying to do was find attributes marked with a particular descriptor, and include ancestors classes, here is some code that would help do that in case it helps anyone:
class MyCustomDescriptor:
# This is greatly oversimplified
def __init__(self,foo,bar):
self._foo = foo
self._bar = bar
pass
def __call__(self,decorated_function):
return self
def __get__(self,instance,type):
if not instance:
return self
return 10
class ClassA:
#property
def foo(self): return "hello"
#MyCustomDescriptor(foo="a",bar="b")
def bar(self): pass
#MyCustomDescriptor(foo="c",bar="d")
def baz(self): pass
class ClassB(ClassA):
#property
def something_we_dont_care_about(self): return "world"
#MyCustomDescriptor(foo="e",bar="f")
def blah(self): pass
import inspect
# This will get attributes on the specified type (class) that are of matching_attribute_type. It just returns the attributes themselves, not their names.
def get_attributes_of_matching_type(type, matching_attribute_type):
    return_value = []
    for member_name, member_instance in inspect.getmembers(type):
        if isinstance(member_instance, matching_attribute_type):
            return_value.append(member_instance)
    return return_value
# This will return a dictionary of name & instance of attributes on type that are of matching_attribute_type (useful when you're looking for attributes marked with a particular descriptor)
def get_attribute_name_and_instance_of_matching_type(type, matching_attribute_type):
    return_value = {}
    # Inspect the type that was passed in (the original hard-coded ClassB here).
    for member_name, member_instance in inspect.getmembers(type):
        if isinstance(member_instance, matching_attribute_type):
            return_value[member_name] = member_instance
    return return_value
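As a usage sketch of the idea above (Python 3 syntax; `Marker`, `Base`, `Child` and `marked_attributes` are hypothetical names standing in for the classes and helpers in the question), `inspect.getmembers` also accepts a predicate, so the filtering can be pushed into the call itself:

```python
import inspect


class Marker:
    """Hypothetical stand-in for MyCustomDescriptor."""

    def __init__(self, foo, bar):
        self.foo = foo
        self.bar = bar

    def __call__(self, decorated_function):
        # Used as @Marker(...): the descriptor replaces the decorated function.
        return self

    def __get__(self, instance, owner):
        # Accessed on the class: hand back the descriptor so it can be inspected.
        if instance is None:
            return self
        return 10


class Base:
    @Marker(foo="a", bar="b")
    def bar(self):
        pass


class Child(Base):
    @Marker(foo="e", bar="f")
    def blah(self):
        pass


def marked_attributes(cls, marker_type):
    # getmembers covers cls and its ancestors; the predicate keeps marker instances only.
    return dict(inspect.getmembers(cls, lambda member: isinstance(member, marker_type)))


found = marked_attributes(Child, Marker)
print(sorted(found))  # names of marked attributes on Child and on Base
```

This only works because `__get__` returns the descriptor itself when accessed on the class; a descriptor that returns something else there would have to be found through each class's `__dict__` instead.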
You should use python's inspect module for any such introspective capabilities.
.
.
>>> class ClassC(ClassB):
...     def baz(self):
...         return "hiya"
...
>>> import inspect
>>> for attr in inspect.getmembers(ClassC):
... print attr
...
('__doc__', None)
('__module__', '__main__')
('bar', <property object at 0x10046bf70>)
('baz', <unbound method ClassC.baz>)
('foo', <property object at 0x10046bf18>)
Read more about the inspect module here.
You want to use dir:
for attr in dir(ClassB):
    print attr
Sadly there isn't a single composite object. Every attribute access for a (normal) Python object first checks obj.__dict__, then the attributes of all its base classes; while there are some internal caches and optimizations, there isn't a single object you can access.
That said, one thing that could improve your code is to use cls.__mro__ instead of cls.__bases__. Instead of only the class's immediate parents, cls.__mro__ contains ALL the ancestors of the class, in the exact order Python would search them, with each common ancestor occurring only once. That would also allow your type-searching method to be non-recursive. Loosely...
def get_attrs(obj):
    attrs = set(obj.__dict__)
    for cls in obj.__class__.__mro__:
        attrs.update(cls.__dict__)
    return sorted(attrs)
... does a fair approximation of the default dir(obj) implementation.
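A quick way to see how close that approximation gets is to compare it with `dir` on a small hierarchy. This is only a sketch: the classes `A` and `B` are made up for illustration, and the `getattr(..., "__dict__", {})` fallback is an addition so that objects without an instance `__dict__` do not raise.

```python
class A:
    x = 1


class B(A):
    y = 2


def get_attrs(obj):
    # Instance attributes first (objects using __slots__ may have no __dict__).
    attrs = set(getattr(obj, "__dict__", {}))
    # Then every class in the method resolution order, each visited once.
    for cls in type(obj).__mro__:
        attrs.update(cls.__dict__)
    return sorted(attrs)


b = B()
b.z = 3

names = get_attrs(b)
print([n for n in names if not n.startswith("__")])
# Anything dir() reports that the MRO walk did not find:
print(set(dir(b)) - set(names))
```

Classes that define their own `__dir__` can make the two results diverge, which is one reason to prefer `dir` or `inspect.getmembers` over a hand-rolled walk.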
Here is a function I wrote, back in the day. The best answer is to use the inspect module: __dict__ gives us ALL functions (ours + inherited) and (ALL?) data members AND properties, whereas inspect gives us enough information to weed out what we don't want.
import inspect
import types

def _inspect(a, skipFunctionsAlways=True, skipMagic = True):
    """inspects object attributes, removing all the standard methods from 'object',
    and (optionally) __magic__ cruft.
    By default this routine skips __magic__ functions, but if you want these on
    pass False in as the skipMagic parameter.
    By default this routine skips functions, but if you want to see all the functions,
    pass in False to the skipFunctionsAlways parameter. This works together with the
    skipMagic parameter: if the latter is True, you won't see __magic__ methods.
    If skipFunctionsAlways = False and skipMagic = False, you'll see all the __magic__
    methods declared for the object - including __magic__ functions declared by Object
    NOT meant to be a comprehensive list of every object attribute - instead, a
    list of every object attribute WE (not Python) defined. For a complete list
    of everything call inspect.getmembers directly"""
    objType = type(object)
    def weWantIt(obj):
        #return type(a) != objType
        output= True
        if (skipFunctionsAlways):
            output = not ( inspect.isbuiltin(obj) ) #not a built in
        asStr = ""
        if isinstance(obj, types.MethodType):
            if skipFunctionsAlways: #never mind, we don't want it, get out.
                return False
            else:
                asStr = obj.__name__
                #get just the name of the function, we don't want the whole name, because we don't want to take something like:
                #bound method LotsOfThings.bob of <__main__.LotsOfThings object at 0x103dc70>
                #to be a special method because it's module name is special
                #WD-rpw 02-23-2008
                #TODO: it would be great to be able to separate out superclass methods
                #maybe by getting the class out of the method then seeing if that attribute is in that class?
        else:
            asStr = str(obj)
        if (skipMagic):
            output = (asStr.find("__") == -1 ) #not a __something__
        return (output)
    for value in inspect.getmembers( a, weWantIt ):
        yield value
{k: getattr(ClassB, k) for k in dir(ClassB)}
Actual values (instead of <property object...>) are returned when you pass a ClassB instance rather than the class.
And of course you can filter the result by adding a condition such as if not k.startswith('__') at the end of the comprehension.
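To make the class-versus-instance difference concrete, here is a small sketch in Python 3. The two classes are simplified stand-ins for the ones in the question, and `public_attrs` is a hypothetical helper name:

```python
class ClassA:
    @property
    def foo(self):
        return "hello"


class ClassB(ClassA):
    @property
    def something(self):
        return "world"


def public_attrs(obj):
    # dir() already includes inherited names; the filter drops dunder attributes.
    return {k: getattr(obj, k) for k in dir(obj) if not k.startswith("__")}


on_class = public_attrs(ClassB)        # values are the property objects themselves
on_instance = public_attrs(ClassB())   # values are what the properties return

print(on_instance)
```

Note that calling `getattr` on an instance runs each property getter, so this form is best kept to properties that are cheap and free of side effects.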

Python metaclass for enforcing immutability of custom types

Having searched for a way to enforce immutability of custom types and not having found a satisfactory answer I came up with my own shot at a solution in form of a metaclass:
class ImmutableTypeException( Exception ): pass

class Immutable( type ):
    '''
    Enforce some aspects of the immutability contract for new-style classes:
    - attributes must not be created, modified or deleted after object construction
    - immutable types must implement __eq__ and __hash__
    '''
    def __new__( meta, classname, bases, classDict ):
        instance = type.__new__( meta, classname, bases, classDict )
        # Make sure __eq__ and __hash__ have been implemented by the immutable type.
        # In the case of __hash__ also make sure the object default implementation has been overridden.
        # TODO: the check for eq and hash functions could probably be done more directly and thus more efficiently
        # (hasattr does not seem to traverse the type hierarchy)
        if not '__eq__' in dir( instance ):
            raise ImmutableTypeException( 'Immutable types must implement __eq__.' )
        if not '__hash__' in dir( instance ):
            raise ImmutableTypeException( 'Immutable types must implement __hash__.' )
        if _methodFromObjectType( instance.__hash__ ):
            raise ImmutableTypeException( 'Immutable types must override object.__hash__.' )
        instance.__setattr__ = _setattr
        instance.__delattr__ = _delattr
        return instance
    def __call__( self, *args, **kwargs ):
        obj = type.__call__( self, *args, **kwargs )
        obj.__immutable__ = True
        return obj

def _setattr( self, attr, value ):
    if '__immutable__' in self.__dict__ and self.__immutable__:
        raise AttributeError( "'%s' must not be modified because '%s' is immutable" % ( attr, self ) )
    object.__setattr__( self, attr, value )

def _delattr( self, attr ):
    raise AttributeError( "'%s' must not be deleted because '%s' is immutable" % ( attr, self ) )

def _methodFromObjectType( method ):
    '''
    Return True if the given method has been defined by object, False otherwise.
    '''
    try:
        # TODO: Are we exploiting an implementation detail here? Find better solution!
        return isinstance( method.__objclass__, object )
    except:
        return False
However, while the general approach seems to be working rather well there are still some iffy implementation details (also see TODO comments in code):
How do I check if a particular method has been implemented anywhere in the type hierarchy?
How do I check which type is the origin of a method declaration (i.e. as part of which type a method has been defined)?
Special methods are always looked up on the type, not the instance. So hasattr must also be applied to the type. E.g.:
>>> class A(object): pass
...
>>> class B(A): __eq__ = lambda *_: 1
...
>>> class C(B): pass
...
>>> c = C()
>>> hasattr(type(c), '__eq__')
True
Checking hasattr(c, '__eq__') would be misleading as it might erroneously "catch" a per-instance attribute __eq__ defined in c itself, which would not act as a special method (note that in the specific case of __eq__ you'll always see a True result from hasattr, because ancestor class object defines it, and inheritance can only ever "add" attributes, never "subtract" any;-).
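The point about per-instance attributes can be checked with a special method that object does not define. The sketch below uses `__len__` for that reason (Python 3; the class name `Plain` is made up for illustration):

```python
class Plain:
    pass


p = Plain()
# A per-instance attribute that merely looks like a special method.
p.__len__ = lambda: 3

on_instance = hasattr(p, "__len__")      # finds the instance attribute
on_type = hasattr(type(p), "__len__")    # the type does not define __len__

try:
    len(p)
    len_worked = True
except TypeError:
    # len() looks __len__ up on type(p), so the instance attribute is ignored.
    len_worked = False

print(on_instance, on_type, len_worked)
```

So `hasattr(type(c), name)` is the check that matches what the interpreter will actually do.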
To check which ancestor class first defined an attribute (and thus which exact definition will be used when the lookup is only on the type):
import inspect
def whichancestor(c, attname):
    for ancestor in inspect.getmro(type(c)):
        if attname in ancestor.__dict__:
            return ancestor
    return None
It's best to use inspect for such tasks, as it will work more broadly than a direct access of the __mro__ attribute on type(c).
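Applied to the original question, a helper like this can tell whether `__hash__` still comes from object. A minimal sketch in Python 3, with hypothetical classes `Base`, `Derived` and `Untouched`:

```python
import inspect


def whichancestor(c, attname):
    # Walk the MRO of c's type and return the first class defining attname.
    for ancestor in inspect.getmro(type(c)):
        if attname in ancestor.__dict__:
            return ancestor
    return None


class Base:
    def __hash__(self):
        return 42


class Derived(Base):
    pass


class Untouched:
    pass


hash_owner = whichancestor(Derived(), "__hash__")      # defined on Base
default_owner = whichancestor(Untouched(), "__hash__")  # still object's version
missing_owner = whichancestor(Untouched(), "no_such_attribute")

print(hash_owner.__name__, default_owner.__name__, missing_owner)
```

A metaclass check could then reject any class for which the owner of `__hash__` is object. One caveat for Python 3: a class that defines `__eq__` without `__hash__` gets `__hash__ = None` in its own `__dict__`, so the owner would be that class and the value would need checking as well.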
This metaclass enforces "shallow" immutability. For example, it doesn't prevent
immutable_obj.attr.attrs_attr = new_value
immutable_obj.attr[2] = new_value
Depending on whether attrs_attr is owned by the object or not, that might be considered a violation of true immutability. E.g. it might result in the following, which should not happen for an immutable type:
>>> a = ImmutableClass(value)
>>> b = ImmutableClass(value)
>>> c = a
>>> a == b
True
>>> b == c
True
>>> a.attr.attrs_attr = new_value
>>> b == c
False
Possibly you could fix that deficiency by overriding getattr as well to return some kind of immutable wrapper for any attribute it returns. It might be complicated. Blocking direct setattr calls could be done, but what about methods of the attribute that set its attributes in their code? I can think of ideas but it will get pretty meta, alright.
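The shallow-immutability problem is easy to reproduce without the metaclass. In this sketch (Python 3) the class `Frozen` is a hypothetical stand-in that blocks attribute assignment by hand, and it deliberately stores a mutable list:

```python
class Frozen:
    def __init__(self, items):
        # Bypass our own __setattr__ once, during construction.
        object.__setattr__(self, "items", items)

    def __setattr__(self, name, value):
        raise AttributeError("%r must not be modified" % name)

    def __eq__(self, other):
        return isinstance(other, Frozen) and self.items == other.items

    def __hash__(self):
        return hash(tuple(self.items))


a = Frozen([1, 2])
b = Frozen([1, 2])
equal_before = (a == b)

try:
    a.items = [9]            # rebinding the attribute is blocked...
    rebinding_blocked = False
except AttributeError:
    rebinding_blocked = True

a.items.append(3)            # ...but mutating the object it refers to is not
equal_after = (a == b)

print(equal_before, rebinding_blocked, equal_after)
```

The simplest defence is usually to store only immutable values (for example converting the list to a tuple in `__init__`) rather than trying to wrap every attribute on access.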
Also, I thought this would be a clever use of your class:
class Tuple(list):
    __metaclass__ = Immutable
But it didn't make a tuple as I had hoped.
>>> t = Tuple([1,2,3])
>>> t.append(4)
>>> t
[1, 2, 3, 4]
>>> u = t
>>> t += (5,)
>>> t
[1, 2, 3, 4, 5]
>>> u
[1, 2, 3, 4, 5]
I guess list's methods are mostly or completely implemented at the C level, so I suppose your metaclass has no opportunity to intercept state changes in them.
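The same effect can be shown without the metaclass: blocking `__setattr__` on a list subclass does nothing for `append`, because `append` changes the list's internal storage and never assigns an attribute. A sketch in Python 3, with `LockedList` as a hypothetical name:

```python
class LockedList(list):
    def __setattr__(self, name, value):
        raise AttributeError("attributes of LockedList are read-only")


t = LockedList([1, 2, 3])

try:
    t.extra = "nope"         # attribute assignment goes through __setattr__
    attribute_blocked = False
except AttributeError:
    attribute_blocked = True

t.append(4)                  # no attribute is assigned, so nothing intercepts this
t += (5,)                    # in-place extend, likewise untouched

print(attribute_blocked, t)
```

Making such a type read-only would mean overriding every mutating method (`append`, `extend`, `__setitem__`, `__iadd__` and so on), at which point subclassing tuple is the more natural choice.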
