I'm working on a project and I have a dictionary in Python. Due to this being an "append-only" dictionary, I need a way to disable .pop, .popitem, .clear, etc. Is this possible in python?
I have tried:
mydict = dict()
#attempt 1:
mydict.pop = None
#Get error pop is read-only
#attempt 2:
del mydict.pop
#Get error pop is read-only
I have tried this on all delete methods with the same results.
Using inheritance to remove functionality violates Liskov substitution (a hypothetical dict subclass would behave erroneously when treated as a dict instance). And besides, I can easily just dict.clear(my_subclassed_dict_instance). The thing you're asking for is not, in Python terminology, a dict and shouldn't masquerade as a subclass of one.
What you're looking for is a fresh class entirely. You want a class that contains a dict, not one that is a dict.
from collections.abc import Mapping

class MyAppendOnlyContainer(Mapping):
    def __init__(self, **kwargs):
        self._impl = kwargs

    def __getitem__(self, key):
        return self._impl[key]

    def __setitem__(self, key, value):
        self._impl[key] = value

    def __iter__(self):
        return iter(self._impl)

    def __len__(self):
        return len(self._impl)

    # Plus whatever other functionality you need ...
Note that collections.abc.Mapping requires __getitem__, __iter__, and __len__ (which your use case faithfully implements) and gives you get, __contains__, and several other useful dict-like helper functions. What you have is not a dict. What you have is a Mapping with benefits.
Subclass the dict() class:
class NewDict(dict):
    def clear(self):
        pass
    def pop(self, *args):  # Note: returns None
        pass
    def popitem(self, *args):
        pass
Now when you create your dict, create it using:
myDict = NewDict()
I quickly dumped this into the command line interpreter and it seems to work.
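One caveat, sketched below under the assumption that the subclass looks like NewDict above: the overrides only stop calls made through the instance. Item deletion with del and the unbound dict methods still remove data, which is the point made in the other answer.

```python
class NewDict(dict):
    def clear(self):
        pass

    def pop(self, *args):
        pass

    def popitem(self, *args):
        pass


d = NewDict(a=1, b=2)

# Calls through the instance hit the no-op overrides.
d.clear()
d.pop("a")
size_after_overrides = len(d)

# __delitem__ was not overridden, so del still works ...
del d["a"]
size_after_del = len(d)

# ... and the base-class method can be called directly on the instance.
dict.clear(d)
size_after_bypass = len(d)

print(size_after_overrides, size_after_del, size_after_bypass)
```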
Related
I would like to subclass list and trigger an event (data checking) every time any change happens to the data. Here is an example subclass:
class MyList(list):
    def __init__(self, sequence):
        super().__init__(sequence)
        self._test()

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self._test()

    def append(self, value):
        super().append(value)
        self._test()

    def _test(self):
        """ Some kind of check on the data. """
        if not self == sorted(self):
            raise ValueError("List not sorted.")
Here, I am overriding the methods __init__, __setitem__ and append to perform the check whenever the data changes. I think this approach is undesirable, so my question is: is there a possibility of triggering the data check automatically whenever any kind of mutation happens to the underlying data structure?
As you say, this is not the best way to go about it. To correctly implement this, you'd need to know about every method that can change the list.
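A small sketch of how easily a mutation slips past such overrides. CheckedList is a hypothetical, cut-down version of the class in the question that only guards append:

```python
class CheckedList(list):
    def append(self, value):
        super().append(value)
        self._test()

    def _test(self):
        if list(self) != sorted(self):
            raise ValueError("List not sorted.")


lst = CheckedList([1, 2, 3])

# None of these go through append, so no check runs:
lst.extend([0])
lst += [-1]
lst.insert(0, 99)

is_sorted = list(lst) == sorted(lst)
print(list(lst), is_sorted)
```

The list is now out of order and nothing complained; the error only surfaces on the next call to append.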
The way to go is to implement your own list (or rather a mutable sequence). The best way to do this is to use the abstract base classes that Python provides in the collections.abc module. You only have to implement a small number of methods, and the base class derives the rest from them.
For your specific example, this would be something like this:
from collections.abc import MutableSequence

class MyList(MutableSequence):
    def __init__(self, iterable=()):
        self._list = list(iterable)

    def __getitem__(self, key):
        return self._list.__getitem__(key)

    def __setitem__(self, key, item):
        self._list.__setitem__(key, item)
        # trigger change handler

    def __delitem__(self, key):
        self._list.__delitem__(key)
        # trigger change handler

    def __len__(self):
        return self._list.__len__()

    def insert(self, index, item):
        self._list.insert(index, item)
        # trigger change handler
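As a rough illustration, here is the same skeleton with the sortedness check from the question plugged in as the change handler. The names SortedCheckedList and _check are made up for this sketch. The check runs after each mutation, as in the question, so an offending item stays in the list when the error is raised.

```python
from collections.abc import MutableSequence


class SortedCheckedList(MutableSequence):
    def __init__(self, iterable=()):
        self._list = list(iterable)
        self._check()

    def _check(self):
        # Change handler: runs after every mutation.
        if self._list != sorted(self._list):
            raise ValueError("List not sorted.")

    def __getitem__(self, key):
        return self._list[key]

    def __setitem__(self, key, item):
        self._list[key] = item
        self._check()

    def __delitem__(self, key):
        del self._list[key]
        self._check()

    def __len__(self):
        return len(self._list)

    def insert(self, index, item):
        self._list.insert(index, item)
        self._check()


s = SortedCheckedList([1, 2, 3])
s.append(4)        # mixin method, implemented via insert
s.extend([5, 6])   # mixin method, implemented via append

try:
    s.extend([0])  # breaks the ordering, so the handler raises
except ValueError as exc:
    error = str(exc)

print(list(s), error)
```

append and extend were never written by hand here; they come from MutableSequence and end up in insert, which is why the handler fires for them too.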
Performance
Some methods are slow in their default implementation. For example __contains__ is defined in the Sequence class as follows:
def __contains__(self, value):
    for v in self:
        if v is value or v == value:
            return True
    return False
Depending on your class, you might be able to implement this faster. However, performance is often less important than writing code which is easy to understand. It can also make writing a class harder, because you're then responsible for implementing the methods correctly.
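For a wrapper around a real list, one possible override is to hand the membership test to the wrapped list, as in the sketch below. ListBackedSequence is a hypothetical name, and whether the difference matters in practice depends on your data, so measure before relying on it.

```python
from collections.abc import Sequence


class ListBackedSequence(Sequence):
    def __init__(self, iterable=()):
        self._list = list(iterable)

    def __getitem__(self, index):
        return self._list[index]

    def __len__(self):
        return len(self._list)

    def __contains__(self, value):
        # Delegate to the wrapped list instead of the generic
        # Python-level loop inherited from Sequence.
        return value in self._list


seq = ListBackedSequence(range(1000))
found = 999 in seq
absent = -1 in seq
print(found, absent)
```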
I have a class which is essentially a collection/list of things. But I want to add some extra functions to this list. What I would like, is the following:
I have an instance li = MyFancyList(). The variable li should behave as if it were a list whenever I use it as a list: [e for e in li], li.extend(...), for e in li.
Plus it should have some special functions like li.fancyPrint(), li.getAMetric(), li.getName().
I currently use the following approach:
class MyFancyList:
    def __iter__(self):
        return iter(self.li)  # self.li is the underlying list
    def fancyFunc(self):
        # do something fancy
        pass
This is OK for use as an iterable, as in [e for e in li], but I do not get the full list behavior, such as li.extend(...).
A first guess is to inherit list into MyFancyList. But is that the recommended pythonic way to do? If yes, what is to consider? If no, what would be a better approach?
If you want only part of the list behavior, use composition (i.e. your instances hold a reference to an actual list) and implement only the methods necessary for the behavior you desire. These methods should delegate the work to the actual list any instance of your class holds a reference to, for example:
def __getitem__(self, item):
    return self.li[item] # delegate to li.__getitem__
Implementing __getitem__ alone will give you a surprising amount of features, for example iteration and slicing.
>>> class WrappedList:
...     def __init__(self, lst):
...         self._lst = lst
...     def __getitem__(self, item):
...         return self._lst[item]
...
>>> w = WrappedList([1, 2, 3])
>>> for x in w:
...     x
...
1
2
3
>>> w[1:]
[2, 3]
If you want the full behavior of a list, inherit from collections.UserList. UserList is a full Python implementation of the list datatype.
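A short sketch of what that looks like in practice. FancyList and get_a_metric are hypothetical names, and the metric itself is just a placeholder:

```python
from collections import UserList


class FancyList(UserList):
    def get_a_metric(self):
        # Placeholder metric; self.data is the wrapped plain list.
        return len(self.data)


li = FancyList([3, 1, 2])
li.extend([5, 4])   # regular list methods are available
li.sort()

combined = li + [6]  # concatenation keeps the subclass type

print(list(li), type(combined).__name__, combined.get_a_metric())
```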
So why not inherit from list directly?
One major problem with inheriting directly from list (or any other builtin written in C) is that the code of the builtins may or may not call special methods overridden in classes defined by the user. Here's a relevant excerpt from the pypy docs:
Officially, CPython has no rule at all for when exactly overridden method of subclasses of built-in types get implicitly called or not. As an approximation, these methods are never called by other built-in methods of the same object. For example, an overridden __getitem__ in a subclass of dict will not be called by e.g. the built-in get method.
Another quote, from Luciano Ramalho's Fluent Python, page 351:
Subclassing built-in types like dict or list or str directly is error-prone because the built-in methods mostly ignore user-defined overrides. Instead of subclassing the built-ins, derive your classes from UserDict, UserList and UserString from the collections module, which are designed to be easily extended.
... and more, page 370+:
Misbehaving built-ins: bug or feature?
The built-in dict, list and str types are essential building blocks of Python itself, so
they must be fast — any performance issues in them would severely impact pretty much
everything else. That’s why CPython adopted the shortcuts that cause their built-in
methods to misbehave by not cooperating with methods overridden by subclasses.
After playing around a bit, the issues with the list builtin seem to be less critical (I tried to break it in Python 3.4 for a while but did not find a really obvious unexpected behavior), but I still wanted to post a demonstration of what can happen in principle, so here's one with a dict and a UserDict:
>>> from collections import UserDict
>>> class MyDict(dict):
...     def __setitem__(self, key, value):
...         super().__setitem__(key, [value])
...
>>> d = MyDict(a=1)
>>> d
{'a': 1}
>>> class MyUserDict(UserDict):
...     def __setitem__(self, key, value):
...         super().__setitem__(key, [value])
...
>>> m = MyUserDict(a=1)
>>> m
{'a': [1]}
As you can see, the __init__ method from dict ignored the overridden __setitem__ method, while the __init__ method from our UserDict did not.
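The same split shows up in other methods, not just __init__. The sketch below uses two hypothetical classes that upper-case keys on assignment; it reflects CPython's behavior, which the quotes above describe as an implementation shortcut rather than a guaranteed rule.

```python
from collections import UserDict


class UpperDict(dict):
    def __setitem__(self, key, value):
        super().__setitem__(key.upper(), value)


class UpperUserDict(UserDict):
    def __setitem__(self, key, value):
        super().__setitem__(key.upper(), value)


d = UpperDict()
d["a"] = 1             # goes through the override
d.update(b=2)          # built-in update skips the override
d.setdefault("c", 3)   # so does setdefault

u = UpperUserDict()
u["a"] = 1
u.update(b=2)          # UserDict routes these through __setitem__
u.setdefault("c", 3)

dict_keys = list(d)
userdict_keys = list(u)
print(dict_keys, userdict_keys)
```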
The simplest solution here is to inherit from list class:
class MyFancyList(list):
    def fancyFunc(self):
        # do something fancy
        pass
You can then use MyFancyList type as a list, and use its specific methods.
Inheritance introduces a strong coupling between your object and list. The approach you implement is basically a proxy object.
Which approach to choose depends heavily on how you will use the object. If it has to be a list, then inheritance is probably a good choice.
EDIT: as pointed out by @acdr, methods that return a copy of the list should be overridden so that they return a MyFancyList instead of a list.
A simple way to implement that:
class MyFancyList(list):
    def fancyFunc(self):
        # do something fancy
        pass
    def __add__(self, *args, **kwargs):
        return MyFancyList(super().__add__(*args, **kwargs))
If you don't want to redefine every method of list, I suggest the following approach:
class MyList:
    def __init__(self, list_):
        self.li = list_
    def __getattr__(self, method):
        return getattr(self.li, method)
This makes methods like append, extend and so on work out of the box. Beware, however, that magic methods (e.g. __len__, __getitem__ etc.) are not going to work in this case: Python looks them up on the class rather than on the instance, so __getattr__ is never consulted for them. You should at least redeclare them like this:
class MyList:
    def __init__(self, list_):
        self.li = list_
    def __getattr__(self, method):
        return getattr(self.li, method)
    def __len__(self):
        return len(self.li)
    def __getitem__(self, item):
        return self.li[item]
    def fancyPrint(self):
        # do whatever you want...
        pass
Please note, that in this case if you want to override a method of list (extend, for instance), you can just declare your own so that the call won't pass through the __getattr__ method. For instance:
class MyList:
    def __init__(self, list_):
        self.li = list_
    def __getattr__(self, method):
        return getattr(self.li, method)
    def __len__(self):
        return len(self.li)
    def __getitem__(self, item):
        return self.li[item]
    def fancyPrint(self):
        # do whatever you want...
        pass
    def extend(self, list_):
        # your own version of extend
        pass
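To see why the magic methods have to be spelled out, here is a small sketch with a hypothetical Proxy class that relies on __getattr__ alone:

```python
class Proxy:
    def __init__(self, list_):
        self.li = list_

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        return getattr(self.li, name)


p = Proxy([1, 2, 3])
p.append(4)                  # ordinary method: delegated via __getattr__

explicit_len = p.__len__()   # explicit attribute access is delegated too

try:
    len(p)                   # implicit lookup happens on the class
except TypeError:
    implicit_len_works = False
else:
    implicit_len_works = True

print(p.li, explicit_len, implicit_len_works)
```

len(p) fails even though p.__len__() succeeds, because the built-in function looks for __len__ on type(p) and never falls back to __getattr__.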
Based on the example methods you included in your post (fancyPrint, getAMetric), it doesn't seem that you need to store any extra state in your lists. If this is the case, you're best off simply declaring these as free functions and ignoring subtyping altogether; this completely avoids problems like list vs UserList, fragile edge cases like return types for __add__, unexpected Liskov issues, etc. Instead, you can write your functions, write your unit tests for their output, and rest assured that everything will work exactly as intended.
As an added benefit, this means your functions will work with any iterable types (such as generator expressions) without any extra effort.
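A sketch of what such free functions might look like. The names and the choice of metric (the mean) are placeholders, since the question does not say what the real methods compute:

```python
def get_a_metric(items):
    """Hypothetical metric: the mean of the items, 0.0 if there are none."""
    values = list(items)
    return sum(values) / len(values) if values else 0.0


def fancy_format(items):
    """Hypothetical pretty-printer: join the items with a separator."""
    return " | ".join(str(item) for item in items)


metric_from_list = get_a_metric([1, 2, 3])
metric_from_generator = get_a_metric(x * x for x in range(4))
text_from_tuple = fancy_format((1, 2, 3))

print(metric_from_list, metric_from_generator, text_from_tuple)
```

The same two functions accept a list, a tuple or a generator expression without any subclass involved.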
I have subclassed the dict and need to detect all its modifications.
(I know I cannot detect an in-place modification of a stored value. That's OK.)
My code:
    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self.modified = True

    def __delitem__(self, key):
        super().__delitem__(key)
        self.modified = True
The problem is it works only for a straightforward assignment or deletion. It does not detect changes made by pop(), popitem(), clear() and update().
Why are __setitem__ and __delitem__ bypassed when items are added or deleted? Do I have to redefine all those methods (pop, etc.) as well?
For this usage, you should not subclass the dict class, but instead use the abstract base classes from the collections.abc module of the Python standard library.
You should subclass the MutableMapping abstract class and override the following methods: __getitem__, __setitem__, __delitem__, __iter__ and __len__,
all that by using an inner dict. The abstract base class ensures that all other methods will use those ones.
from collections.abc import MutableMapping

class MyDict(MutableMapping):
    def __init__(self):
        self.d = {}
        # other initializations ...

    def __setitem__(self, key, value):
        self.d[key] = value
        self.modified = True

    ...
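For reference, a complete version might look like the sketch below. The class name TrackedDict, the constructor arguments and the flag_after helper are assumptions made for this example; only the five required methods come from the description above.

```python
from collections.abc import MutableMapping


class TrackedDict(MutableMapping):
    def __init__(self, *args, **kwargs):
        self._d = {}
        self.update(*args, **kwargs)  # mixin update, uses __setitem__
        self.modified = False         # start clean after initial fill

    def __getitem__(self, key):
        return self._d[key]

    def __setitem__(self, key, value):
        self._d[key] = value
        self.modified = True

    def __delitem__(self, key):
        del self._d[key]
        self.modified = True

    def __iter__(self):
        return iter(self._d)

    def __len__(self):
        return len(self._d)


def flag_after(operation):
    """Run one operation on a fresh instance and report the flag."""
    t = TrackedDict(a=1, b=2)
    operation(t)
    return t.modified


results = {
    "pop": flag_after(lambda t: t.pop("a")),
    "popitem": flag_after(lambda t: t.popitem()),
    "clear": flag_after(lambda t: t.clear()),
    "update": flag_after(lambda t: t.update(c=3)),
    "get": flag_after(lambda t: t.get("a")),
}
print(results)
```

pop, popitem, clear and update are inherited mixin methods, and each of them sets the flag because it is built on __setitem__ or __delitem__. A read such as get leaves it untouched.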
In a dict subclass, pop, popitem, clear and update are not implemented through __setitem__ and __delitem__.
You must redefine them as well.
I suggest looking at the OrderedDict implementation.
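If you do stay with a dict subclass, the redefinitions could be sketched as follows. ModifiedDict is a hypothetical name and the list of methods is an assumption about what counts as a mutation; newer Python versions add further mutating operations (for example the |= operator), so treat it as a starting point rather than a complete list.

```python
class ModifiedDict(dict):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.modified = False

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self.modified = True

    def __delitem__(self, key):
        super().__delitem__(key)
        self.modified = True

    def pop(self, *args):
        result = super().pop(*args)
        # Coarse: also flags pop(key, default) on a missing key.
        self.modified = True
        return result

    def popitem(self):
        result = super().popitem()
        self.modified = True
        return result

    def clear(self):
        super().clear()
        self.modified = True

    def update(self, *args, **kwargs):
        super().update(*args, **kwargs)
        self.modified = True

    def setdefault(self, key, default=None):
        if key not in self:
            self.modified = True
        return super().setdefault(key, default)


d = ModifiedDict(a=1, b=2)
flag_initially = d.modified
popped = d.pop("a")
flag_after_pop = d.modified
print(flag_initially, popped, flag_after_pop)
```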
I am a fairly average Python programmer and have not built any complex or major application with Python before ... I was reading about new-style classes and came across something new to me that I am trying to understand: the unification of data types and classes.
class defaultdict(dict):
    def __init__(self, default=None):
        dict.__init__(self)
        self.default = default

    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            return self.default
But what really confuses me is why they would unify them ... I can't picture any reason that makes it so important. I'd be glad if anybody could shed some light on this. Thank you.
The primary reason was to allow for built-in types to be subclassed in the same way user-created classes could be. Prior to new-style classes, to create a dict-like class, you needed to subclass from a specially designed UserDict class, or produce a custom class that provided the full dict protocol. Now, you can just do class MySpecialDict(dict): and override the methods you want to modify.
For the full rundown, see PEP 252 - Making Types Look More Like Classes
For an example, here's a dict subclass that logs modifications to it:
def log(msg):
    ...

class LoggingDict(dict):
    def __setitem__(self, key, value):
        super(LoggingDict, self).__setitem__(key, value)
        log('Updated: {}={}'.format(key, value))
Any instance of LoggingDict can be used wherever a regular dict is expected:
def add_value_to_dict(d, key, value):
    d[key] = value

logging_dict = LoggingDict()
add_value_to_dict(logging_dict, 'testkey', 'testvalue')
If you used a function instead of LoggingDict:
def log_value(d, key, value):
    log('Updated: {}={}'.format(key, value))

mydict = dict()
How would you pass mydict to add_value_to_dict and have it log the addition without having to make add_value_to_dict know about log_value?
This question is worded poorly, but I couldn't think of a better way to put it. It is probably also an easy question, but it's hard to describe well enough to search for. I'm coding in Python, for the record.
I'm trying to create a new class that inherits from the list type. It's supposed to be a list of instances of another class I defined, where one of that class's attributes is an int. When the list is sorted, I want it to sort based on those ints. When I redefine sort on my class, however, how do I make it use the original sort from the list type?
If you're just asking "how do I call my superclass's sort method from my sort method?", that's done using super.
For example, if you just want to make the key in sort default to int in Python 2.7:
class MyList(list):
    def sort(self, cmp=None, key=None, reverse=False):
        if cmp is None and key is None:
            key = int
        return super(MyList, self).sort(cmp, key, reverse)
In 3.3, it's even simpler with magic super (also, there's no cmp parameter to sort, and key and reverse must be passed as keyword arguments):
class MyList(list):
    def sort(self, key=None, reverse=False):
        if key is None:
            key = int
        return super().sort(key=key, reverse=reverse)
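Applied to the situation in the question (a list of objects that each carry an int), a Python 3 sketch could look like this. Entry, rank and EntryList are made-up names, since the question does not show the actual classes:

```python
class Entry:
    """Hypothetical element type with an int attribute."""

    def __init__(self, name, rank):
        self.name = name
        self.rank = rank


class EntryList(list):
    def sort(self, *, key=None, reverse=False):
        if key is None:
            # Default: order by the int attribute.
            key = lambda entry: entry.rank
        super().sort(key=key, reverse=reverse)


entries = EntryList([Entry("c", 3), Entry("a", 1), Entry("b", 2)])

entries.sort()
names_ascending = [e.name for e in entries]

entries.sort(reverse=True)
names_descending = [e.name for e in entries]

print(names_ascending, names_descending)
```

Note that the built-in sorted() does not go through this override; it only affects calls to the sort method on the instance.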
If you're asking how to monkeypatch in a method that calls the old method, that's a bit trickier… but the key is that functions and methods are first-class values, so you can save them in variables, attributes, etc.
In 2.7:
import types

old_sort = MyList.sort
def new_sort(self, cmp=None, key=None, reverse=False):
    if cmp is None and key is None:
        key = int
    return old_sort(self, cmp, key, reverse)
MyList.sort = types.UnboundMethodType(new_sort, None, MyList)
Again, 3.3 is simpler, this time because unbound methods are the same thing as functions:
old_sort = MyList.sort
def new_sort(self, key=None, reverse=False):
    if key is None:
        key = int
    return old_sort(self, key=key, reverse=reverse)
MyList.sort = new_sort
If you know exactly what parent class has the overridden method you want to invoke (I'm assuming in this case it's list and sort, respectively), you can invoke the parent class's method in the child class like so:
def sort(self, ...):
    ...
    list.sort(self, ...)
    ...