Why can I reassign dict.update but not dict.__setitem__ - python

I'm trying to modify a third-party dict class to make it immutable after a certain point.
With most classes, I can assign to method slots to modify behavior.
However, this doesn't seem possible with all methods in all classes. In particular for dict, I can reassign update, but not __setitem__.
Why? How are they different?
For example:
class Freezable(object):
    def _not_modifiable(self, *args, **kw):
        raise NotImplementedError()

    def freeze(self):
        """
        Disallow mutating methods from now on.
        """
        print "FREEZE"
        self.__setitem__ = self._not_modifiable
        self.update = self._not_modifiable
        # ... others
        return self

class MyDict(dict, Freezable):
    pass

d = MyDict()
d.freeze()
print d.__setitem__  # <bound method MyDict._not_modifiable of {}>
d[2] = 3             # no error -- this is incorrect.
d.update({4: 5})     # raise NotImplementedError

Note that you can define the class __setitem__, e.g.:
def __setitem__(self, key, value):
    # compare with ==: freeze() stored a bound method, so an `is` check
    # against the plain Freezable._not_modifiable function would never match
    if self.update == self._not_modifiable:
        raise TypeError('{} has been frozen'.format(id(self)))
    dict.__setitem__(self, key, value)
(This method is a bit clumsy; there are other options. But it's one way to make it work even though Python calls the class's __setitem__ directly.)
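The underlying reason is that for the implicit invocation d[key] = value, Python looks __setitem__ up on the type of d, not on the instance, so the assignment to self.__setitem__ in freeze() is never consulted; ordinary methods such as update are found through the instance first, which is why that assignment does take effect. Here is a minimal class-level sketch of the same idea (the class and method names are illustrative, not from the original post), keying all mutators off a single frozen flag:

import sys  # not required; shown only to emphasise this is plain Python

class FreezableDict(dict):
    _frozen = False  # class-level default; freeze() shadows it per instance

    def freeze(self):
        self._frozen = True
        return self

    def _check_frozen(self):
        if self._frozen:
            raise TypeError('{} has been frozen'.format(id(self)))

    # The special method lives on the class: for the subscript syntax,
    # CPython looks __setitem__ up on type(d), so assigning to
    # d.__setitem__ on the instance has no effect on d[k] = v.
    def __setitem__(self, key, value):
        self._check_frozen()
        dict.__setitem__(self, key, value)

    def __delitem__(self, key):
        self._check_frozen()
        dict.__delitem__(self, key)

    def update(self, *args, **kw):
        self._check_frozen()
        dict.update(self, *args, **kw)

d = FreezableDict()
d[1] = 2            # fine while unfrozen
d.freeze()
# d[2] = 3          # now raises TypeError
# d.update({4: 5})  # so does this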


Overriding getters and setters for attributes from a list of strings

The aim is to provide some strings in a list as attributes of a class. The class shall have not only attributes, but also the respective getter and setter methods. In some other class inherited from that some of those setters need to be overridden.
To this end I came up with the following. Using setattr in a loop over the list of strings, an attribute and the respective methods are created. Concerning this first part, the code works as expected.
However I am not able to override the setters in an inheriting class.
class Base():
    attributes = ["attr{}".format(i) for i in range(100)]

    def __init__(self):
        _get = lambda a: lambda: getattr(self, a)
        _set = lambda a: lambda v: setattr(self, a, v)
        for attr in self.attributes:
            setattr(self, attr, None)
            setattr(self, "get_" + attr, _get(attr))
            setattr(self, "set_" + attr, _set(attr))

class Child(Base):
    def __init__(self):
        super().__init__()
        #setattr(self, "set_attr4", set_attr4)

    # Here I want to override one of the setters to perform typechecking
    def set_attr4(self, v):
        print("This being printed would probably solve the problem.")
        if type(v) == bool:
            super().set_attr4(v)
        else:
            raise ValueError("attr4 must be a boolean")

if __name__ == "__main__":
    b = Base()
    b.attr2 = 5
    print(b.get_attr2())
    b.set_attr3(55)
    print(b.get_attr3())
    c = Child()
    c.set_attr4("SomeString")
    print(c.get_attr4())
The output here is
5
55
SomeString
The expected output would however be
5
55
This being printed would probably solve the problem.
ValueError("attr4 must be a boolean")
So somehow the set_attr4 method is never called; which I guess is expected, because __init__ is called after the class structure is read in. But I am at loss on how else to override those methods. I tried to add setattr(self, "set_attr4", set_attr4) (the commented line in the code above) but to no avail.
Or, more generally, there is property, which is usually used for creating getters and setters. But I don't think I understand how to apply it in a case where the getters and setters are created dynamically and need to be overridden by a child.
Is there any solution to this?
Update due to comments: It was pointed out by several people that using getters/setters in python may not be a good style and that they may usually not be needed. While this is definitely something to keep in mind, the background of this question is that I'm extending an old existing code which uses getters/setters throughout. I hence do not wish to change the style and let the user (this project only has some 20 users in total, but still...) suddenly change the way they access properties within the API.
However any future reader of this may consider that the getter/setter approach is at least questionable.
Metaclasses to the rescue!
class Meta(type):
    def __init__(cls, name, bases, dct):
        for attr in cls.attributes:
            if not hasattr(cls, attr):
                setattr(cls, attr, None)
                setattr(cls, f'get_{attr}', cls._get(attr))
                setattr(cls, f'set_{attr}', cls._set(attr))

class Base(metaclass=Meta):
    attributes = ["attr{}".format(i) for i in range(100)]
    _get = lambda a: lambda self: getattr(self, a)
    _set = lambda a: lambda self, v: setattr(self, a, v)
    # the rest of your code goes here
This is pretty self-explanatory: make attributes, _get, _set class variables (so that you can access them without class instantiation), then let the metaclass set everything up for you.
Your original __init__ only runs when an instance is created, which happens after the Child class body has been executed, so its per-instance setattr calls override whatever the subclass specified.
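As a quick sanity check (not part of the original answer), a setter written in a subclass body now survives, because the metaclass only fills in names a class does not already have:

class Child(Base):
    # defined in the class body, so the metaclass's hasattr() check
    # skips it and it shadows the generated Base.set_attr4
    def set_attr4(self, v):
        if not isinstance(v, bool):
            raise ValueError("attr4 must be a boolean")
        super().set_attr4(v)

c = Child()
c.set_attr4(True)
print(c.get_attr4())         # True
# c.set_attr4("SomeString")  # would raise ValueError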
The minimal change needed to fix the problem is to check whether the attribute has already been set:
class Base():
    attributes = ["attr{}".format(i) for i in range(100)]

    def __init__(self):
        _get = lambda a: lambda: getattr(self, a)
        _set = lambda a: lambda v: setattr(self, a, v)
        for attr in self.attributes:
            setattr(self, attr, None)
            if not hasattr(self, "get_" + attr):
                setattr(self, "get_" + attr, _get(attr))
            if not hasattr(self, "set_" + attr):
                setattr(self, "set_" + attr, _set(attr))
However, I do not see the point in doing it this way. This creates a new getter and setter for each instance of Base. I would rather create them on the class. That can be done with a class decorator, with a metaclass, in the body of the class itself, or in some other way.
For example, this is ugly, but simple:
class Base():
    attributes = ["attr{}".format(i) for i in range(100)]
    for attr in attributes:
        exec(f"get_{attr} = lambda self: self.{attr}")
        exec(f"set_{attr} = lambda self, value: setattr(self, '{attr}', value)")
    del attr
This is better:
class Base:
    pass

attributes = ["attr{}".format(i) for i in range(100)]

for attr in attributes:
    # bind attr at definition time via a default argument; otherwise every
    # lambda would close over the final value of the loop variable
    setattr(Base, f"get_{attr}", lambda self, attr=attr: getattr(self, attr))
    setattr(Base, f"set_{attr}", lambda self, value, attr=attr: setattr(self, attr, value))
You're right about the problem. The creation of your Base instance happens after the Child class defines set_attr4. Since Base creates its getters/setters dynamically, it just blasts over Child's version upon instantiation.
One alternative way (in addition to the other answers) is to create the Child's getters/setters dynamically too. The idea here is to go for "convention over configuration" and just prefix methods you want to override with override_. Here's an example:
class Child(Base):
    def __init__(self):
        super().__init__()
        overrides = [override for override in dir(self) if override.startswith("override_")]
        for override in overrides:
            base_name = override.split("override_")[-1]
            setattr(self, base_name, getattr(self, override))

    # Here I want to override one of the setters to perform typechecking
    def override_set_attr4(self, v):
        print("This being printed would probably solve the problem.")
        if type(v) == bool:
            super().set_attr4(v)
        else:
            raise ValueError("attr4 must be a boolean")  # Added "raise" to this, otherwise we just return None...
which outputs:
5
55
This being printed would probably solve the problem.
Traceback (most recent call last):
File ".\stack.py", line 39, in <module>
c.set_attr4("SomeString")
File ".\stack.py", line 29, in override_set_attr4
raise ValueError("attr4 must be a boolean") # Added "raise" to this, overwise we just return None...
ValueError: attr4 must be a boolean
Advantages here are that the Base doesn't have to know about the Child class. In the other answers, there's very subtle Base/Child coupling going on. It also might not be desirable to touch the Base class at all (violation of the Open/Closed principle).
Disadvantages are that "convention over configuration" to avoid a true inheritance mechanism is a bit clunky and unintuitive. The override_ function is also still hanging around on the Child instance (which you may or may not care about).
I think the real problem here is that you're trying to define getters and setters in such a fashion. We usually don't even want getters/setters in Python. This definitely feels like an X/Y problem, but maybe it isn't. You have a lot of rep, so I'm not going to give you some pedantic spiel about it. Even so, maybe take a step back and think about what you're really trying to do and consider alternatives.
The problem here is that you're creating the "methods" on the instance of the Base class (__init__ only runs when an instance is created).
Inheritance is resolved on classes, before you instantiate, and never looks at instance attributes.
In other words, when you try to override the method, it hasn't even been created yet.
A solution is to create them on the class rather than on the instance inside __init__:
def _create_getter(attr):
    def _get(self):
        return getattr(self, attr)
    return _get

def _create_setter(attr):
    def _set(self, value):
        return setattr(self, attr, value)
    return _set

class Base():
    attributes = ["attr{}".format(i) for i in range(100)]

for attr in Base.attributes:
    setattr(Base, 'get_' + attr, _create_getter(attr))
    setattr(Base, 'set_' + attr, _create_setter(attr))
Then inheriting will work normally:
class Child(Base):
    def set_attr4(self, v):
        print("This being printed would probably solve the problem.")
        if type(v) == bool:
            super().set_attr4(v)
        else:
            raise ValueError("attr4 must be a boolean")

if __name__ == "__main__":
    b = Base()
    b.attr2 = 5
    print(b.get_attr2())
    b.set_attr3(55)
    print(b.get_attr3())
    c = Child()
    c.set_attr4("SomeString")
    print(c.get_attr4())
You could also just not do it - make your Base class as normal, and make setters only for the attributes you want, in the child class:
class Base:
    pass

class Child(Base):
    @property
    def attr4(self):
        return self._attr4

    @attr4.setter
    def attr4(self, new_v):
        if not isinstance(new_v, bool):
            raise TypeError('Not bool')
        self._attr4 = new_v
Testing:
c = Child()
c.attr3 = 2     # works fine even without any setter
c.attr4 = True  # works fine, runs the setter
c.attr4 = 3     # raises TypeError

why python's pickle is not serializing a method as default argument?

I am trying to use pickle to transfer Python objects over the wire between 2 servers. I created a simple class that subclasses dict, and I am trying to use pickle for the marshalling:
def value_is_not_none(value):
    return value is not None

class CustomDict(dict):
    def __init__(self, cond=lambda x: x is not None):
        super().__init__()
        self.cond = cond

    def __setitem__(self, key, value):
        if self.cond(value):
            dict.__setitem__(self, key, value)
I first tried to use pickle for the marshalling, but when I un-marshalled I received an error related to the lambda expression.
Then I tried to do the marshalling with dill but it seemed the __init__ was not called.
Then I tried again with pickle, but passed the value_is_not_none() function as the cond parameter - again __init__() did not seem to be invoked, and the un-marshalling failed in __setitem__() (cond is None).
Why is that? what am I missing here?
If I try to run the following code:
obj = CustomDict(cond=value_is_not_none)
obj['hello'] = ['world']
payload = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
obj2 = pickle.loads(payload)
it fails with
AttributeError: 'CustomDict' object has no attribute 'cond'
This is a different question than: Python, cPickle, pickling lambda functions
as I tried using dill with lambda and it failed to work, and I also tried passing a function and it also failed.
pickle is loading your dictionary data before it has restored the attributes on your instance. As such the self.cond attribute is not yet set when __setitem__ is called for the dictionary key-value pairs.
Note that pickle will never call __init__; instead it'll create an entirely blank instance and restore the __dict__ attribute namespace on that directly.
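A quick way to see this for yourself (not part of the original answer) is to put a print in __init__ and note that it never fires during loads:

import pickle

class Probe(dict):
    def __init__(self):
        super().__init__()
        print("__init__ called")

p = Probe()                               # prints "__init__ called"
restored = pickle.loads(pickle.dumps(p))  # prints nothing: __init__ is skipped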
You have two options:
Option 1: default to cond=None and ignore the condition if it is still set to None:
class CustomDict(dict):
    def __init__(self, cond=None):
        super().__init__()
        self.cond = cond

    def __setitem__(self, key, value):
        if getattr(self, 'cond', None) is None or self.cond(value):
            dict.__setitem__(self, key, value)
The getattr() there is needed because a blank instance has no cond attribute at all (it is not set to None, the attribute is entirely missing). You could add cond = None to the class:
class CustomDict(dict):
    cond = None
and then just test for if self.cond is None or self.cond(value):.
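Under this first option the round trip works, because the items are re-inserted before cond is restored and the guard simply skips the check while cond is still missing or None. A short illustrative check, assuming the CustomDict above and the value_is_not_none function from the question are in scope:

import pickle

obj = CustomDict(cond=value_is_not_none)
obj['hello'] = 'world'
obj['empty'] = None          # filtered out by cond

payload = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
obj2 = pickle.loads(payload)             # no AttributeError this time
print(obj2)                              # {'hello': 'world'}
print(obj2.cond is value_is_not_none)    # True: cond is restored afterwards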
Option 2: define a custom __reduce__ method to control how the initial object is created when restored:
def _default_cond(v): return v is not None

class CustomDict(dict):
    def __init__(self, cond=_default_cond):
        super().__init__()
        self.cond = cond

    def __setitem__(self, key, value):
        if self.cond(value):
            dict.__setitem__(self, key, value)

    def __reduce__(self):
        return (CustomDict, (self.cond,), None, None, iter(self.items()))
__reduce__ is expected to return a tuple with:
A callable that can be pickled directly (here the class does fine)
A tuple of positional arguments for that callable; on unpickling the first element is called passing in the second as arguments, so by setting this to (self.cond,) we ensure that the new instance is created with cond passed in as an argument and now CustomDict.__init__() will be called.
The next 2 positions are for a __setstate__ method (ignored here) and for list-like types, so we set these to None.
The last element is an iterator for the key-value pairs that pickle then will restore for us.
Note that I replaced the default value for cond with a function here too so you don't have to rely on dill for the pickling.
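And a quick check of the __reduce__ approach (again not from the original answer): the positional-argument tuple means __init__ runs with cond on unpickling, and the item iterator then repopulates the dict through __setitem__:

import pickle

obj = CustomDict()                       # uses _default_cond
obj['hello'] = 'world'
obj['empty'] = None                      # dropped by the condition

obj2 = pickle.loads(pickle.dumps(obj))
print(obj2)                              # {'hello': 'world'}
print(obj2.cond is _default_cond)        # True: __init__ ran with cond passed in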

Python @property (setter method) that is restricted to setting data only in the __init__ method

I would like to set up an object that imports some raw_data during the initialization phase (i.e. during the __init__() method). However, I would like to make it read-only from that point on. I was thinking of using a setter property for self.raw_data with the following logic:
@raw_data.setter
def raw_data(self, dataframe):
    <IF calling from __init__>?
    self.__raw_data = dataframe
Is there a way for the setter method to know if it is being called from within __init__?
Blocking all other attempts to change the data.
Nothing you do in the raw_data setter is going to stop direct assignment to __raw_data. I would recommend not defining a setter and using __raw_data for initialization. This will block writes to raw_data, but not __raw_data.
If you want stricter enforcement, then by design, you don't have many options. One option is to write your class in C or Cython. The other option is easier, but it has awkward side effects. That option is to subclass an immutable built-in type, such as tuple, and create pre-initialized instances with __new__ instead of mutating them into an initialized state with __init__:
class Immutable(tuple):
    __slots__ = []  # Prevents creation of instance __dict__

    def __new__(cls, *whatever_args):
        attribute1 = compute_it_however()
        attribute2 = likewise()
        # tuple.__new__ takes a single iterable of the values to store
        return super(Immutable, cls).__new__(cls, (attribute1, attribute2))

    @property
    def attribute1(self):
        return self[0]

    @property
    def attribute2(self):
        return self[1]
This makes your objects immutable, but the awkward side effect is that your objects are now tuples.
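For instance, a concrete (hypothetical, Python 3) variant of that pattern which loads its raw data once in __new__ and exposes it read-only might look like this:

class FrozenRecord(tuple):
    __slots__ = []  # no instance __dict__, so no stray attributes either

    def __new__(cls, raw_data):
        # do whatever loading/parsing is needed here, exactly once
        return super().__new__(cls, (tuple(raw_data),))

    @property
    def raw_data(self):
        return self[0]

r = FrozenRecord([1, 2, 3])
print(r.raw_data)      # (1, 2, 3)
# r.raw_data = []      # AttributeError: can't set attribute
# r.other = 1          # AttributeError: no __dict__ thanks to __slots__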
The closest you can get is to only allow setting self._raw_data if it hasn't been set yet, i.e.:
class Foo(object):
    def __init__(self, dataframe):
        self.raw_data = dataframe

    @property
    def raw_data(self):
        return getattr(self, '_raw_data', None)

    @raw_data.setter
    def raw_data(self, dataframe):
        if hasattr(self, '_raw_data'):
            raise AttributeError("Attribute is read-only")
        self._raw_data = dataframe
Which makes the setter mostly useless, so you'd get the same result with less code by skipping it (which makes the property read-only):
class Foo(object):
    def __init__(self, dataframe):
        self._raw_data = dataframe

    @property
    def raw_data(self):
        return self._raw_data
But beware that none of these solutions will prevent you from directly setting _raw_data.
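A quick illustrative check of the shorter version above, including that caveat:

f = Foo([1, 2, 3])
print(f.raw_data)       # [1, 2, 3]
try:
    f.raw_data = []     # no setter defined, so this fails
except AttributeError as exc:
    print(exc)
f._raw_data = []        # ...but the underlying attribute is still writable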

python __getattribute__ override and @property decorator

I had to write a class of some sort that overrides __getattribute__.
Basically, my class is a container which saves every user-added property to self._meta, which is a dictionary.
from collections import OrderedDict

class Container(object):
    def __init__(self, **kwargs):
        super(Container, self).__setattr__('_meta', OrderedDict())
        #self._meta = OrderedDict()
        super(Container, self).__setattr__('_hasattr', lambda key: key in self._meta)
        for attr, value in kwargs.iteritems():
            self._meta[attr] = value

    def __getattribute__(self, key):
        try:
            return super(Container, self).__getattribute__(key)
        except:
            if key in self._meta:
                return self._meta[key]
            else:
                raise AttributeError, key

    def __setattr__(self, key, value):
        self._meta[key] = value
#usage:
>>> a = Container()
>>> a
<__main__.Container object at 0x0000000002B2DA58>
>>> a.abc = 1 #set an attribute
>>> a._meta
OrderedDict([('abc', 1)]) #attribute is in ._meta dictionary
I have some classes which inherit the Container base class, and some of their methods have the @property decorator.
class Response(Container):
    @property
    def rawtext(self):
        if self._hasattr("value") and self.value is not None:
            _raw = self.__repr__()
            _raw += "|%s" % (self.value.encode("utf-8"))
            return _raw
The problem is that .rawtext isn't accessible (I get an AttributeError). Every key in ._meta is accessible, and every attribute added by __setattr__ of the object base class is accessible, but properties created with the @property decorator are not. I think it has to do with my way of overriding __getattribute__ in the Container base class. What should I do to make properties from @property accessible?
I think you should probably look at __getattr__ instead of __getattribute__ here. The difference is this: __getattribute__ is called unconditionally if it exists -- __getattr__ is only called if Python can't find the attribute via other means.
I completely agree with mgilson. If you want sample code which is equivalent to yours but works well with properties, you can try:
class Container(object):
    def __init__(self, **kwargs):
        self._meta = OrderedDict()
        #self._hasattr = lambda key: key in self._meta #???
        for attr, value in kwargs.iteritems():
            self._meta[attr] = value

    def __getattr__(self, key):
        try:
            return self._meta[key]
        except KeyError:
            raise AttributeError(key)

    def __setattr__(self, key, value):
        if key in ('_meta', '_hasattr'):
            super(Container, self).__setattr__(key, value)
        else:
            self._meta[key] = value
I really do not understand your _hasattr attribute. You put it as an attribute but it's actually a function that has access to self... shouldn't it be a method?
Actually I think you should simply use the built-in function hasattr:
class Response(Container):
    @property
    def rawtext(self):
        if hasattr(self, 'value') and self.value is not None:
            _raw = self.__repr__()
            _raw += "|%s" % (self.value.encode("utf-8"))
            return _raw
Note that hasattr(container, attr) will return True also for _meta.
Another thing that puzzles me is why you use an OrderedDict. You iterate over kwargs, whose iteration order is arbitrary since it's a normal dict, and add the items to the OrderedDict, so _meta ends up holding the values in arbitrary order anyway.
If you aren't sure whether you need to have a specific order or not, simply use dict and eventually swap to OrderedDict later.
By the way: never ever use a bare try: ... except: without specifying the exception to catch. In your code you actually wanted to catch only AttributeError, so you should have done:
try:
    return super(Container, self).__getattribute__(key)
except AttributeError:
    #stuff

OO design: an object that can be exported to a "row", while accessing header names, without repeating myself

Sorry, badly worded title. I hope a simple example will make it clear. Here's the easiest way to do what I want to do:
import csv

class Lemon(object):
    headers = ['ripeness', 'colour', 'juiciness', 'seeds?']

    def to_row(self):
        return [self.ripeness, self.colour, self.juiciness, self.seeds > 0]

def save_lemons(lemonset):
    f = open('lemons.csv', 'w')
    out = csv.writer(f)
    out.writerow(Lemon.headers)
    for lemon in lemonset:
        out.writerow(lemon.to_row())
This works alright for this small example, but I feel like I'm "repeating myself" in the Lemon class. And in the actual code I'm trying to write (where the number of variables I'm exporting is ~50 rather than 4, and where to_row calls a number of private methods that do a bunch of weird calculations), it becomes awkward.
As I write the code to generate a row, I need to constantly refer to the "headers" variable to make sure I'm building my list in the correct order. If I want to change the variables being outputted, I need to make sure to_row and headers are being changed in parallel (exactly the kind of thing that DRY is meant to prevent, right?).
Is there a better way I could design this code? I've been playing with function decorators, but nothing has stuck. Ideally I should still be able to get at the headers without having a particular lemon instance (i.e. it should be a class variable or class method), and I don't want to have a separate method for each variable.
In this case, getattr() is your friend: it allows you to get a variable based on a string name. For example:
def to_row(self):
    return [getattr(self, head) for head in self.headers]
EDIT: to properly use the header seeds?, you would need to set that attribute on the object, e.g. call setattr(self, 'seeds?', self.seeds > 0) right above the return statement.
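Another lightweight variant (not from either answer, shown here as a Python 3 sketch) is to keep a single ordered spec that maps each header to how its value is computed; headers and to_row are then both derived from it, so they cannot drift apart:

import csv

class Lemon(object):
    # single source of truth: each header paired with how to compute its value
    _columns = [
        ('ripeness',  lambda self: self.ripeness),
        ('colour',    lambda self: self.colour),
        ('juiciness', lambda self: self.juiciness),
        ('seeds?',    lambda self: self.seeds > 0),
    ]
    headers = [name for name, _ in _columns]

    def to_row(self):
        return [getter(self) for _, getter in self._columns]

def save_lemons(lemonset, path='lemons.csv'):
    with open(path, 'w', newline='') as f:
        out = csv.writer(f)
        out.writerow(Lemon.headers)
        for lemon in lemonset:
            out.writerow(lemon.to_row())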
We could use some metaclass shenanigans to do this...
In Python 2, attributes are passed to the metaclass in a dict, without preserving order. We'll also want a base class to work with so we can distinguish class attributes that should be mapped into the row. In Python 3, we could dispense with just about all of this base descriptor class.
import itertools
import functools

@functools.total_ordering
class DryDescriptor(object):
    _order_gen = itertools.count()

    def __init__(self, alias=None):
        self.alias = alias
        self.order = next(self._order_gen)

    def __lt__(self, other):
        return self.order < other.order
We will want a python descriptor for every attribute we wish to map into the
row. slots are a nice way to get data descriptors without much work. One
caveat, though, we'll have to manually remove the helper instance to make the
real slot descriptor visible.
class slot(DryDescriptor):
    def annotate(self, attr, attrs):
        del attrs[attr]
        self.attr = attr
        attrs.setdefault('__slots__', []).append(attr)

    def annotate_class(self, cls):
        if self.alias is not None:
            setattr(cls, self.alias, getattr(cls, self.attr))
For computed fields, we can memoize results. Memoizing off of the annotated instance is tricky to do without a memory leak, so we need weakref. Alternatively, we could have arranged for another slot just to store the cached value. This also isn't quite thread safe, but it's pretty close.
import weakref

class memo(DryDescriptor):
    _memo = None

    def __call__(self, method):
        self.getter = method
        return self

    def annotate(self, attr, attrs):
        if self.alias is not None:
            attrs[self.alias] = self

    def annotate_class(self, cls):
        pass

    def __get__(self, instance, owner):
        if instance is None:
            return self
        if self._memo is None:
            self._memo = weakref.WeakKeyDictionary()
        try:
            return self._memo[instance]
        except KeyError:
            return self._memo.setdefault(instance, self.getter(instance))
On the metaclass, all of the descriptors we created above are found, sorted by
creation order, and instructed to annotate the newly created class. This does
not correctly treat derived classes and could use some other conveniences like
an __init__ for all the slots.
class DryMeta(type):
    def __new__(mcls, name, bases, attrs):
        descriptors = sorted((value, key)
                             for key, value
                             in attrs.iteritems()
                             if isinstance(value, DryDescriptor))
        for descriptor, attr in descriptors:
            descriptor.annotate(attr, attrs)
        cls = type.__new__(mcls, name, bases, attrs)
        for descriptor, attr in descriptors:
            descriptor.annotate_class(cls)
        cls._header_descriptors = [getattr(cls, attr) for descriptor, attr in descriptors]
        return cls
Finally, we want a base class to inherit from so that we can have a to_row
method. This just invokes all of the __get__s for all of the respective
descriptors, in order.
class DryBase(object):
    __metaclass__ = DryMeta

    def to_row(self):
        cls = type(self)
        return [desc.__get__(self, cls) for desc in cls._header_descriptors]
Assuming all of that is tucked away, out of sight, the definition of a class that uses this feature is mostly free of repetition. The only shortcoming is that, to be practical, every field needs a Python-friendly name, hence the alias keyword to associate 'seeds?' with has_seeds.
class ADryRow(DryBase):
    __slots__ = ['seeds']
    ripeness = slot()
    colour = slot()
    juiciness = slot()

    @memo(alias='seeds?')
    def has_seeds(self):
        print "Expensive!!!"
        return self.seeds > 0
>>> my_row = ADryRow()
>>> my_row.ripeness = "tart"
>>> my_row.colour = "#8C2"
>>> my_row.juiciness = 0.3479
>>> my_row.seeds = 19
>>>
>>> print my_row.to_row()
Expensive!!!
['tart', '#8C2', 0.3479, True]
>>> print my_row.to_row()
['tart', '#8C2', 0.3479, True]
