I'm running Python 2.7.10.
I need to intercept changes in a list. By "change" I mean anything that modifies the list in the shallow sense (the list is not changed if it consists of the same objects in the same order, regardless of the state of those objects; otherwise, it is). I don't need to find out how the list has changed, only that it has. So I just make sure I can detect that, and let the base method do its work. This is my test program:
class List(list):
    def __init__(self, data):
        list.__init__(self, data)
        print '__init__(', data, '):', self
    def __getitem__(self, key):
        print 'calling __getitem__(', self, ',', key, ')',
        r = list.__getitem__(self, key)
        print '-->', r
        return r
    def __setitem__(self, key, data):
        print 'before __setitem__:', self
        list.__setitem__(self, key, data)
        print 'after __setitem__(', key, ',', data, '):', self
    def __delitem__(self, key):
        print 'before __delitem__:', self
        list.__delitem__(self, key)
        print 'after __delitem__(', key, '):', self
l = List([0,1,2,3,4,5,6,7]) #1
x = l[5] #2
l[3] = 33 #3
x = l[3:7] #4
del l[3] #5
l[0:4]=[55,66,77,88] #6
l.append(8) #7
Cases #1, #2, #3, and #5 work as I expected; #4, #6, and #7 don't. The program prints:
__init__( [0, 1, 2, 3, 4, 5, 6, 7] ): [0, 1, 2, 3, 4, 5, 6, 7]
calling __getitem__( [0, 1, 2, 3, 4, 5, 6, 7] , 5 ) --> 5
before __setitem__: [0, 1, 2, 3, 4, 5, 6, 7]
after __setitem__( 3 , 33 ): [0, 1, 2, 33, 4, 5, 6, 7]
before __delitem__: [0, 1, 2, 33, 4, 5, 6, 7]
after __delitem__( 3 ): [0, 1, 2, 4, 5, 6, 7]
I'm not terribly surprised by #7: append is probably implemented in an ad-hoc way. But for #4 and #6 I am confused. The __getitem__ documentation says: "Called to implement evaluation of self[key]. For sequence types, the accepted keys should be integers and slice objects." (my emphasis). And for __setitem__: "Same note as for __getitem__()", which I take to mean that the key can also be a slice.
What's wrong with my reasoning? I'm prepared, if necessary, to override every list-modifying method (append, extend, insert, pop, etc.), but what should I override to catch something like #6?
I am aware of the existence of __setslice__, etc. But those methods have been deprecated since 2.0...
Hmmm. I read the docs for __getslice__, __setslice__, etc. again, and I find this bone-chilling statement:
"(However, built-in types in CPython currently still implement __getslice__(). Therefore, you have to override it in derived classes when implementing slicing.)"
Is this the explanation? Is this saying "Well, the methods are deprecated, but in order to achieve the same functionality in 2.7.10 as you had in 2.0 you still have to override them"? Alas, then why did you deprecate them? How will things work in the future? Is there a "list" class - that I am not aware of - that I could extend and would not present this inconvenience? What do I really need to override to make sure I catch every list-modifying operation?
The problem is that you're subclassing a builtin, and so have to deal with a few wrinkles. Before I delve into that issue, I'll go straight to the "modern" way:
How will things work in the future? Is there a "list" class - that I am not aware of - that I could extend and would not present this inconvenience?
Yes, there's the stdlib Abstract Base Classes. You can avoid the ugly complications caused by subclassing builtin list by using the ABCs instead. For something list-like, try subclassing MutableSequence:
from collections import MutableSequence

class MyList(MutableSequence):
    ...
Now you should only need to deal with __getitem__ and friends for slicing behaviour.
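A minimal sketch of that approach (imports shown for Python 3's collections.abc; in 2.7 use from collections import MutableSequence as above; the class and attribute names are mine). The five methods below are all you implement; append, extend, pop, remove, __iadd__ and friends come for free and are routed through them, so one set of hooks sees every modification:

```python
from collections.abc import MutableSequence

class MyList(MutableSequence):
    def __init__(self, iterable=()):
        self.data = list(iterable)

    def __len__(self):
        return len(self.data)

    def __getitem__(self, index):            # index may be an int or a slice
        return self.data[index]

    def __setitem__(self, index, value):     # also receives slice keys
        self.data[index] = value

    def __delitem__(self, index):
        del self.data[index]

    def insert(self, index, value):          # append() is routed through here
        self.data.insert(index, value)

m = MyList([0, 1, 2, 3])
m.append(4)          # handled via insert()
m[1:3] = [9, 9]      # handled via __setitem__ with a slice key
print(m.data)        # [0, 9, 9, 3, 4]
```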
If you want to push ahead with subclassing the builtin list, read on...
Your guess is correct: you will need to override __getslice__ and __setslice__. The language reference explains why, and you have already quoted it:
However, built-in types in CPython currently still implement __getslice__(). Therefore, you have to override it in derived classes when implementing slicing.
Note that l[3:7] will hook into __getslice__, whereas the otherwise equivalent l[3:7:] will hook into __getitem__, so you have to handle the possibility of receiving slices in both methods... groan!
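As for how things work in the future: in Python 3 the __*slice__ methods are gone entirely, and both l[5] and l[3:7] funnel through __getitem__. A quick sketch (Python 3; the class name is mine) that records every key it receives:

```python
# Python 3: all indexing and slicing goes through __getitem__,
# so a single override catches both integer keys and slice keys.
class LoggingList(list):
    def __init__(self, data):
        super().__init__(data)
        self.seen = []          # record of every key passed in

    def __getitem__(self, key):
        self.seen.append(key)
        return super().__getitem__(key)

l = LoggingList(range(8))
l[5]        # integer key
l[3:7]      # slice key, no __getslice__ involved
print(l.seen)   # [5, slice(3, 7, None)]
```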
Related
I am creating a class that inherits from collections.UserList that has some functionality very similar to NumPy's ndarray (just for exercise purposes). I've run into a bit of a roadblock regarding recursive functions involving the modification of class attributes:
Let's take the flatten method, for example:
class Array(UserList):
    def __init__(self, initlist):
        self.data = initlist

    def flatten(self):
        # recursive function
        ...
Above, you can see that there is a singular parameter in the flatten method, being the required self parameter. Ideally, a recursive function should take a parameter which is passed recursively through the function. So, for example, it might take a lst parameter, making the signature:
Array.flatten(self, lst)
This solves the problem of having to set lst to self.data, which would not work recursively anyway, because self.data itself wouldn't be changed. However, having that parameter in the signature is going to be ugly in use and will hinder the experience of an end user calling the function.
So, this is the solution I've come up with:
def flatten(self):
    self.data = self.__flatten(self.data)

def __flatten(self, lst):
    ...
    return result
Another solution could be to nest __flatten in flatten, like so:
def flatten(self):
    def __flatten(lst):
        ...
        return result
    self.data = __flatten(self.data)
However, I'm not sure if nesting would be the most readable as flatten is not the only recursive function in my class, so it could get messy pretty quickly.
Does anyone have any other suggestions? I'd love to know your thoughts, thank you!
A recursive method need not take any extra parameters that are logically unnecessary for the method to work from the caller's perspective; the self parameter is enough for recursion on a "child" element to work, because when you call the method on the child, the child is bound to self in the recursive call. Here is an example:
from itertools import chain

class MyArray:
    def __init__(self, data):
        self.data = [
            MyArray(x) if isinstance(x, list) else x
            for x in data]

    def flatten(self):
        return chain.from_iterable(
            x.flatten() if isinstance(x, MyArray) else (x,)
            for x in self.data)
Usage:
>>> a = MyArray([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
>>> list(a.flatten())
[1, 2, 3, 4, 5, 6, 7, 8]
Since UserList is an iterable, you can use a helper function to flatten nested iterables, which can deal likewise with lists and Array objects:
from collections import UserList
from collections.abc import Iterable

def flatten_iterable(iterable):
    for item in iterable:
        if isinstance(item, Iterable):
            yield from flatten_iterable(item)
        else:
            yield item

class Array(UserList):
    def __init__(self, initlist):
        self.data = initlist

    def flatten(self):
        self.data = list(flatten_iterable(self.data))
a = Array([[1, 2], [3, 4]])
a.flatten(); print(a) # prints [1, 2, 3, 4]
b = Array([Array([1, 2]), Array([3, 4])])
b.flatten(); print(b) # prints [1, 2, 3, 4]
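One caveat worth noting (my addition, not part of the answer above): str is itself an Iterable, so if the nested data can ever contain strings, the bare isinstance check recurses forever ('a' yields 'a' yields 'a'...). A common guard is to exclude str/bytes explicitly:

```python
from collections.abc import Iterable

def flatten_iterable(iterable):
    for item in iterable:
        # guard against str/bytes, which are Iterable and would
        # otherwise cause infinite recursion on single characters
        if isinstance(item, Iterable) and not isinstance(item, (str, bytes)):
            yield from flatten_iterable(item)
        else:
            yield item

print(list(flatten_iterable([['a', 'b'], ['c', ['d']]])))  # ['a', 'b', 'c', 'd']
```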
barrier of abstraction
Write array as a separate module. flatten can be generic like the example implementation here. This differs from a_guest's answer in that only lists are flattened, not all iterables. This is a choice you get to make as the module author -
# array.py
from collections import UserList

def flatten(t):  # generic function
    if isinstance(t, list):
        for v in t:
            yield from flatten(v)
    else:
        yield t

class array(UserList):
    def flatten(self):
        return list(flatten(self.data))  # specialization of generic function
why modules are important
Don't forget you are the module user too! You get to reap the benefits from both sides of the abstraction barrier created by the module -
As the author, you can easily expand, modify, and test your module without worrying about breaking other parts of your program
As the user, you can rely on the module's features without having to think about how the module is written or what the underlying data structures might be
# main.py
from array import array
t = array([1,[2,3],4,[5,[6,[7]]]]) # <- what is "array"?
print(t.flatten())
[1, 2, 3, 4, 5, 6, 7]
As the user, we don't have to answer "what is array?" any more than you have to answer "what is dict?" or "what is iter?" We use these features without having to understand their implementation details. Their internals may change over time, but if the interface stays the same, our programs will continue to work without requiring change.
reusability
Good programs are reusable in many ways. See Python's built-in functions for proof of this, or see the guiding principles of the Unix philosophy -
Write programs that do one thing and do it well.
Write programs to work together.
If you want to use flatten in other areas of your program, you can reuse it easily -
# otherscript.py
from array import flatten
result = flatten(something)
Typically, all methods of a class take at least one argument, called self, so that they can reference the actual object the method is called on.
If you don't need self in your function but still want to include the function in a class, you can use @staticmethod and just write a normal function, like this:
class Array(UserList):
    def __init__(self, initlist):
        self.data = initlist

    @staticmethod
    def flatten():
        # recursive function
        ...
Basically, @staticmethod allows you to make any function a method that can be called on a class or on an instance of the class (object). So you can do this:
arr = Array([1, 2])
arr.flatten()
as well as this:
Array.flatten()
Here is some further reference from the Python docs: https://docs.python.org/3/library/functions.html#staticmethod
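As sketched above, a zero-argument static flatten() has nothing to operate on; a workable variant of the same idea (my sketch, Python 3; the helper name _flatten is hypothetical) keeps the user-facing method parameterless and moves the recursion into a static helper:

```python
from collections import UserList

class Array(UserList):
    def flatten(self):
        # public method: no extra parameter in the user-facing signature
        self.data = list(Array._flatten(self.data))

    @staticmethod
    def _flatten(lst):  # static helper: no self needed for the recursion
        for item in lst:
            if isinstance(item, list):
                yield from Array._flatten(item)
            else:
                yield item

a = Array([1, [2, [3]], 4])
a.flatten()
print(a.data)  # [1, 2, 3, 4]
```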
This feels like a fairly simple concept I'm trying to do.
Just as an example:
Say I have a list [1, 2, 3, 4]
That changes to [2, 3, 4, 1]
I need to be able to identify the change so that I can represent and update the data in JSON without updating the entire list.
Bit of background - This is for use in MIDI, the actual lists can be quite a bit longer than this, and the JSON can be nested with varying complexity. There also may be more than a single change occurring at once. It's not going to be possible to update the entire JSON or nested lists due to time complexity. I am doing it this way currently but in order to expand I need to be able to identify when a specific change occurs and have some way of representing this. It needs to be doable in Python 2 WITHOUT any external packages as it's being used in a Python installation that's embedded within a DAW (Ableton Live).
Does anyone know of anything that may help with this problem? Any help or reading material would be greatly appreciated.
EDIT:
I've tried looping over both lists and comparing the values, but this detects a change in every position, which is no faster than just resending the whole list; it's potentially much slower, since I run two nested for loops first and THEN still send the entire list out over MIDI.
How about this: make a class that tracks its changes. For example:
#from collections.abc import MutableSequence  # this for Python 3.3+
from collections import MutableSequence

class TrackingList(MutableSequence):
    """list that tracks its changes"""

    def __init__(self, iterable=()):
        self.data = list(iterable)
        self.changes = []

    def __len__(self):
        return len(self.data)

    def __getitem__(self, index):
        return self.data[index]

    def __setitem__(self, index, value):
        self.data[index] = value
        self.changes.append(("set", index, value))

    def __delitem__(self, index):
        del self.data[index]
        self.changes.append(("del", index))

    def insert(self, index, value):
        self.data.insert(index, value)
        self.changes.append(("insert", index, value))

    def __str__(self):
        return str(self.data)
example use
>>> tl=TrackingList([1,2,3,4])
>>> print(tl)
[1, 2, 3, 4]
>>> tl.changes
[]
>>> tl[0],tl[-1] = tl[-1],tl[0]
>>> print(tl)
[4, 2, 3, 1]
>>> tl.changes
[('set', 0, 4), ('set', -1, 1)]
>>> tl.append(32)
>>> tl.changes
[('set', 0, 4), ('set', -1, 1), ('insert', 4, 32)]
>>> print(tl)
[4, 2, 3, 1, 32]
>>>
The collections.abc module makes it easy to write container classes, and you get a bunch of methods for free; in the case of MutableSequence those are: append, reverse, extend, pop, remove, __iadd__, __contains__, __iter__, __reversed__, index, and count.
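Since the goal is to avoid resending the whole list, the recorded change log can drive the update on the receiving side. A hypothetical replay helper (apply_changes is my name, not part of the answer above) that applies the deltas to a plain mirror list:

```python
def apply_changes(target, changes):
    """Replay a TrackingList-style change log against a plain list."""
    for change in changes:
        op = change[0]
        if op == "set":
            _, index, value = change
            target[index] = value
        elif op == "del":
            _, index = change
            del target[index]
        elif op == "insert":
            _, index, value = change
            target.insert(index, value)
    return target

mirror = [1, 2, 3, 4]
log = [("set", 0, 4), ("set", -1, 1), ("insert", 4, 32)]
print(apply_changes(mirror, log))  # [4, 2, 3, 1, 32]
```

Only the log entries need to cross the wire; after replaying them, the remote copy matches the TrackingList's data.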
I'm working through the Building Skills in Object Oriented Design in python and am on the wheel section for roulette. We've created a "Bin" class as an extended class from frozenset which will represent each of the positions on the roulette wheel. We then create a tuple of 38 empty "Bins", and now have to create class methods to be able to add odds/outcomes to the Bins.
My problem is that I've not been able to create a method that modifies the Bin in place without the result reverting to the frozenset class.
My desired output is to have:
class Bin(frozenset):
    def add(self, other):
        ....do union of Bin class....

one = Bin(1, 2, 3)
two = Bin(4, 5)
one.add(two)
print(one)
>>> Bin(1, 2, 3, 4, 5)
Stuff I've tried
Extending the frozenset class with no methods defined/overridden
class Bin(frozenset):
    pass

one = Bin([1,2,3])
two = Bin([4,5,6])
print(one|two)
print(type(one|two))
Which returns
frozenset({1, 2, 3, 4, 5, 6})
<class 'frozenset'>
I would have expected that by extending the class and using one of the extended methods that the output would remain as the "Bin" class.
I've also tried overriding the __ror__ and union methods, with the same result. I've also tried to write a method that brute-forces the desired output; however, this does not allow me to change the tuple of Bins, as it doesn't operate in place:
class Bin(frozenset):
    def add(self, other):
        self = Bin(self|other)
        return self

one = Bin([1,2,3])
two = Bin([4,5,6])
one.add(two)
print(one)
Which returns
Bin({1, 2, 3})
Any insight into where my thinking is falling down, and/or recommendations of further reading, would be great.
frozenset.__or__ (which is called by the default implementation of Bin.__or__ when 'triggered' by one | two) has no idea that frozenset was subclassed by Bin, and that it should return a Bin instance.
You should implement Bin.__or__ and force it to return a Bin instance:
class Bin(frozenset):
    def __or__(self, other):
        # be wary of infinite recursion if using | incorrectly here,
        # better to use the underlying __or__
        return Bin(super().__or__(other))

one = Bin([1, 2, 3])
two = Bin([4, 5, 6])
print(one | two)
print(type(one | two))
Outputs
Bin({1, 2, 3, 4, 5, 6})
<class '__main__.Bin'>
You need to do something like this (to avoid infinite recursion):
class Bin(frozenset):
    def __or__(self, other):
        return Bin(frozenset(self) | other)
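Note that the same wrapping is needed for every other operation you want to stay Bin-typed: the named methods like union also return plain frozensets unless you wrap them. A sketch (Python 3):

```python
class Bin(frozenset):
    def __or__(self, other):
        # rebuild as a plain frozenset first to avoid recursing into this method
        return Bin(frozenset(self) | other)

    def union(self, *others):
        # frozenset.union on a subclass still returns a plain frozenset,
        # so wrap its result too
        return Bin(frozenset.union(self, *others))

one = Bin([1, 2, 3])
two = Bin([4, 5, 6])
print(type(one | two).__name__)        # Bin
print(type(one.union(two)).__name__)   # Bin
```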
What I am looking for is a way to do the following in Python 2.7:
oldlist = list

class list(oldlist):
    def append(self, object):
        super(list, self).append(object)
        return self

    def sort(self, cmp=None, key=None, reverse=False):
        super(list, self).sort(cmp, key, reverse)
        return self

__builtins__.list = list

print list([3, 4, 1, 2]).append(5)
print list([3, 4, 1, 2]).append(5).sort()
print list([3, 4, 1, 2]).append(5).sort(reverse=True)
print list([3, 4, 1, 2]).append(5).sort()[0]
print [3, 4, 1, 2].append(5)
print [3, 4, 1, 2].append(5).sort()
print [3, 4, 1, 2].append(5).sort(reverse=True)
print [3, 4, 1, 2].append(5).sort()[0]
This actually prints:
[3, 4, 1, 2, 5]
[1, 2, 3, 4, 5]
[5, 4, 3, 2, 1]
1
None
...
AttributeError: 'NoneType' object has no attribute 'sort'
It should print:
[3, 4, 1, 2, 5]
[1, 2, 3, 4, 5]
[5, 4, 3, 2, 1]
1
[3, 4, 1, 2, 5]
[1, 2, 3, 4, 5]
[5, 4, 3, 2, 1]
1
I know it can be dangerous to edit a builtin class, but some of these methods really return nothing; does any Python script actually expect them to return something? So what's the problem?
For now, I find it much simpler to do:
filterfiles(myfilelist.sort())
than doing :
myfilelist.sort()
filterfiles(myfilelist)
And it permits seeing the results in interactive mode (instead of nothing).
One thing I don't understand is that when we write {1:1, 2:2}, Python turns the dict literal into a dict object. I know I can't change Python to make it an instance of mydict, but is there a way to change the builtin dict directly, even if it requires some hacky way?
No, it’s simply not possible. Literals, that means any literal (strings, numbers, lists, dicts), are part of the Python syntax. The objects they represent are created by the parser at a very low level, long before you can change anything with Python code.
There is another important thing though. The built-in objects are all implemented in native code; they don’t actually exist as Python objects in the Python environment. For that purpose, things like __builtins__.dict provide a reference to the native dictionary type. When objects are created from literals, however, the real native type is used directly, so __builtins__.dict is never consulted. As such, changing __builtins__.dict will not affect literals at all. It will only change the environment, where those references actually matter.
You can imagine this situation like this:
# native code
class _InternalSpecialType:
    pass

def makeSpecialType():  # this is our "literal" evaluator
    return _InternalSpecialType()

# public interface
SpecialType = _InternalSpecialType

# then in the Python code
class NewSpecialType(SpecialType):
    pass

SpecialType = NewSpecialType

# using a "literal"
x = makeSpecialType()
print(type(x))  # _InternalSpecialType
So no, you can’t change what the literal uses under the hood. It’s simply impossible. If you want an object of a different type, you will always have to create it explicitly. And then it’s best to name the type differently than the original type, to avoid confusion (and incompatibility between the changed type and literals).
And finally, about methods of built-in types not allowing chaining: Just live with it. Guido knowingly decided against it, and usually, Guido has good reasons you can trust, so it’s likely for the better (see also this answer).
I'll explain how to solve the problem you have, rather than how to implement the solution you're after:
Write filterfiles(sorted(myfilelist)).
Methods that return None do so by design: In this case, to avoid inadvertently sorting a list in-place (and losing its current ordering) when you really wanted a sorted copy. Python already provides functional alternatives for such cases, like sorted() in this case, when it makes sense to. Note that sorted() does not modify its argument.
If you do find a use case for which no functional alternative is provided, I would recommend you get around it in the same way: Write a function (not method) that returns what you want to see. (But check out python's functools module first).
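If you do want chaining, a less invasive pattern than reassigning __builtins__.list is an explicit wrapper class you opt into (FluentList is my name; Python 3 sketch, so there is no cmp parameter):

```python
class FluentList(list):
    """A list whose mutating methods return self, enabling chaining."""

    def append(self, obj):
        list.append(self, obj)
        return self

    def sort(self, **kwargs):
        list.sort(self, **kwargs)
        return self

print(FluentList([3, 4, 1, 2]).append(5).sort())              # [1, 2, 3, 4, 5]
print(FluentList([3, 4, 1, 2]).append(5).sort(reverse=True))  # [5, 4, 3, 2, 1]
```

Literals still produce plain lists, of course, so [3, 4, 1, 2].append(5) keeps returning None; the wrapper only helps where you construct a FluentList explicitly.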
I'm trying to get the information from a slice. Here's the start of my function. (I have tried "elif isinstance(key, slice):" for the fourth line and can't get that to work)
def __getitem__(self, key):
    if isinstance(key, (int, long)):
        #do stuff if an int
    elif #item is slice
        #do stuff if a slice
If I call obj[4:6] and print the key variable inside the function, it prints "slice(4, 6, None)".
How do I parse out the 4 and 6 values? What I'm trying to do is use the data from the list inside the function.
>>> slice(4,5).start
4
>>> slice(4,5).stop
5
>>> slice(4,5).step #None
One particularly useful method of the slice object is the indices method:
>>> slice(4,5).indices(12)
(4, 5, 1)
You might use it like this:
for i in range(*my_slice.indices(len(self))):
    print self[i]
Note that this really shines with negative indices or steps:
>>> slice(4,-5).indices(12)
(4, 7, 1)
>>> print range(*slice(None,None,-1).indices(12))
[11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
If you want the info from the slice object, access its attributes start, stop, and step. These attributes are documented here.
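Putting those attributes together, a minimal __getitem__ that handles both ints and slices might look like this (Seq is a hypothetical class; written for Python 3, where int and long are unified):

```python
class Seq:
    def __init__(self, data):
        self.data = list(data)

    def __len__(self):
        return len(self.data)

    def __getitem__(self, key):
        if isinstance(key, slice):
            # indices() normalizes None and negative values against len(self)
            start, stop, step = key.indices(len(self))
            return [self.data[i] for i in range(start, stop, step)]
        return self.data[key]       # plain integer index

s = Seq(range(10))
print(s[4:6])    # [4, 5]
print(s[::-1])   # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
```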