Overwriting builtin Python classes

What I am looking for is a way to do that in python 2.7
oldlist = list
class list(oldlist):
    def append(self, object):
        super(list, self).append(object)
        return self
    def sort(self, cmp=None, key=None, reverse=False):
        super(list, self).sort(cmp, key, reverse)
        return self
__builtins__.list=list
print list([3, 4, 1, 2]).append(5)
print list([3, 4, 1, 2]).append(5).sort()
print list([3, 4, 1, 2]).append(5).sort(reverse=True)
print list([3, 4, 1, 2]).append(5).sort()[0]
print [3, 4, 1, 2].append(5)
print [3, 4, 1, 2].append(5).sort()
print [3, 4, 1, 2].append(5).sort(reverse=True)
print [3, 4, 1, 2].append(5).sort()[0]
Actually prints:
[3, 4, 1, 2, 5]
[1, 2, 3, 4, 5]
[5, 4, 3, 2, 1]
1
None
...
AttributeError: 'NoneType' object has no attribute 'sort'
Should print:
[3, 4, 1, 2, 5]
[1, 2, 3, 4, 5]
[5, 4, 3, 2, 1]
1
[3, 4, 1, 2, 5]
[1, 2, 3, 4, 5]
[5, 4, 3, 2, 1]
1
I know it can be dangerous to modify builtin classes, but these methods return nothing, and no Python script actually expects them to return something, so what is the problem?
For now, I think it is much simpler to write:
filterfiles(myfilelist.sort())
than:
myfilelist.sort()
filterfiles(myfilelist)
It also lets me see the result in interactive mode (instead of nothing).

One thing I don't understand: when we write {1:1, 2:2}, Python turns the dict literal into a dict object. I know I can't make Python create an instance of mydict instead, but is there a way to change the builtin dict directly, even if it takes some hacky approach?
No, it’s simply not possible. Literals, that means any literal (strings, numbers, lists, dicts), are part of the Python syntax. The objects they represent are created from the parser at a very low level, long before you can change anything with Python code.
There is another important thing though. The built-in objects are all implemented in native code; they are not defined by Python code in the Python environment. For that purpose, names like __builtins__.dict provide a reference to the native dictionary type. When objects are created from literals, the real native type is used directly, so __builtins__.dict is never accessed. As such, changing __builtins__.dict will not affect literals at all. It will only change the environment, where these references actually matter.
You can imagine this situation like this:
# native code
class _InternalSpecialType:
    pass
def makeSpecialType(): # this is our “literal” evaluator
    return _InternalSpecialType()
# public interface
SpecialType = _InternalSpecialType
# then in the Python code
class NewSpecialType(SpecialType):
    pass
SpecialType = NewSpecialType
# using a “literal”
x = makeSpecialType()
print(type(x)) # _InternalSpecialType
So no, you can’t change what the literal uses under the hood. It’s simply impossible. If you want to have an object of a different type, you will always have to create it explicitly. And then it’s best to name the type differently than the original type to avoid confusion (and incompatibility between the changed type and literals).
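The effect can be checked directly. The following is a minimal sketch, written for Python 3 (where the module is named builtins; in Python 2.7 it is __builtin__). ChainList is a hypothetical name chosen for this illustration, and the original type is restored at the end so the rest of the program is unaffected.

```python
import builtins

class ChainList(list):
    """Hypothetical list subclass whose append() returns self."""
    def append(self, item):
        super().append(item)
        return self

original_list = builtins.list
builtins.list = ChainList          # rebind the public name only
try:
    via_name = list([3, 1, 2])     # looks up the name "list" -> ChainList
    via_literal = [3, 1, 2]        # literal: built by the interpreter itself
    name_type = type(via_name)
    literal_type = type(via_literal)
finally:
    builtins.list = original_list  # always undo the rebinding

print(name_type.__name__)     # ChainList
print(literal_type.__name__)  # list
```

Only code that goes through the name list sees the replacement; the literal still produces the native type.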
And finally, about methods of built-in types not allowing chaining: Just live with it. Guido knowingly decided against it, and usually, Guido has good reasons you can trust, so it’s likely for the better (see also this answer).

I'll explain how to solve the problem you have, rather than how to implement the solution you're after:
Write filterfiles(sorted(myfilelist)).
Methods that return None do so by design: In this case, to avoid inadvertently sorting a list in-place (and losing its current ordering) when you really wanted a sorted copy. Python already provides functional alternatives for such cases, like sorted() in this case, when it makes sense to. Note that sorted() does not modify its argument.
If you do find a use case for which no functional alternative is provided, I would recommend you get around it in the same way: Write a function (not method) that returns what you want to see. (But check out python's functools module first).
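As a rough sketch of that advice, a small helper function can stand in for a chainable append. The name appended is made up for this example; it is not part of the standard library.

```python
def appended(items, item):
    """Return a new list with item added; the input is left untouched."""
    return list(items) + [item]

files = [3, 4, 1, 2]

# Functions compose where the in-place methods would return None.
result = sorted(appended(files, 5), reverse=True)

print(result)  # [5, 4, 3, 2, 1]
print(files)   # [3, 4, 1, 2]
```

Because both appended() and sorted() return new lists, the original list keeps its ordering.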

Related

How to tell when an item in a list has changed position in Python

This feels like a fairly simple concept I'm trying to do.
Just as an example:
Say I have a list [1, 2, 3, 4]
That changes to [2, 3, 4, 1]
I need to be able to identify the change so that I can represent and update the data in JSON without updating the entire list.
Bit of background - This is for use in MIDI, the actual lists can be quite a bit longer than this, and the JSON can be nested with varying complexity. There also may be more than a single change occurring at once. It's not going to be possible to update the entire JSON or nested lists due to time complexity. I am doing it this way currently but in order to expand I need to be able to identify when a specific change occurs and have some way of representing this. It needs to be doable in Python 2 WITHOUT any external packages as it's being used in a Python installation that's embedded within a DAW (Ableton Live).
Does anyone know of anything that may help with this problem? Any help or reading material would be greatly appreciated.
EDIT:
I've tried looping over both lists and comparing the values but this detects it as a change in all values which is no faster than just resending the whole list, potentially much slower as I've got two nested for loops first THEN still send the entire list out over MIDI.
How about this: make a class that tracks its changes, for example:
#from collections.abc import MutableSequence #this for python 3.3+
from collections import MutableSequence
class TrackingList(MutableSequence):
    """list that tracks its changes"""
    def __init__(self,iterable=()):
        self.data = list(iterable)
        self.changes =[]
    def __len__(self):
        return len(self.data)
    def __getitem__(self,index):
        return self.data[index]
    def __setitem__(self,index,value):
        self.data[index]=value
        self.changes.append(("set",index,value))
    def __delitem__(self,index):
        del self.data[index]
        self.changes.append(("del",index))
    def insert(self,index,value):
        self.data.insert(index,value)
        self.changes.append(("insert",index,value))
    def __str__(self):
        return str(self.data)
example use
>>> tl=TrackingList([1,2,3,4])
>>> print(tl)
[1, 2, 3, 4]
>>> tl.changes
[]
>>> tl[0],tl[-1] = tl[-1],tl[0]
>>> print(tl)
[4, 2, 3, 1]
>>> tl.changes
[('set', 0, 4), ('set', -1, 1)]
>>> tl.append(32)
>>> tl.changes
[('set', 0, 4), ('set', -1, 1), ('insert', 4, 32)]
>>> print(tl)
[4, 2, 3, 1, 32]
>>>
The collections.abc module makes it easy to build container classes, and you get a bunch of methods for free. In the case of MutableSequence those are: append, reverse, extend, pop, remove, __iadd__, __contains__, __iter__, __reversed__, index, and count.
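One possible way to use the recorded changes is to replay them on the receiving side instead of resending the whole list. The sketch below assumes the tuple format used by TrackingList.changes above; apply_changes is a hypothetical helper, not something the answer defines.

```python
def apply_changes(target, changes):
    """Replay ("set" / "del" / "insert") records onto target, in order."""
    for change in changes:
        op = change[0]
        if op == "set":
            _, index, value = change
            target[index] = value
        elif op == "del":
            _, index = change
            del target[index]
        elif op == "insert":
            _, index, value = change
            target.insert(index, value)
        else:
            raise ValueError("unknown change type: %r" % (op,))

# A copy of the original list, e.g. the state held by the receiver.
mirror = [1, 2, 3, 4]
# The change log from the example session above.
recorded = [("set", 0, 4), ("set", -1, 1), ("insert", 4, 32)]

apply_changes(mirror, recorded)
print(mirror)  # [4, 2, 3, 1, 32]
```

The records have to be applied in the order they were logged, since indices refer to the state of the list at the time of each change.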

Arguments in a function

I understand that when defining a function, you can provide a parameter, like so:
def func(lst):
    # code
I also understand that when you are defining a function, you can provide multiple parameters using the *args syntax, like so:
def func(*lst):
    # code
I have a problem where I have to define a function that sorts a list and removes any duplicates.
Here's what I did:
def func(lst):
    return sorted(set(lst))
The website that I was doing this practice problem (edabit.com) tested my code, like so:
Test.assert_equals(func([1, 3, 3, 5, 5]), [1, 3, 5])
The test ran successfully, and my code was correct. But here's where I got confused, the test provided multiple parameters (1, 3, 3, 5, 5), I didn't use the *args syntax, yet somehow it ran successfully.
Isn't it supposed to give me an error, saying something like func() takes exactly 1 argument (5 given)?
When I provided the *args syntax, it told me TypeError: unhashable type:'list'
My guess is that this probably happened because the test didn't call the function, instead they used the assert keyword. Is my guess correct?
No, you gave a single argument of type list. If you have
a = [1,2,3,4]
you have a list. Calling
f(a)
and f([1, 2, 3, 4]) is the same thing. Notice the [ ] brackets.
If, however, you were to call f(1, 2, 3, 4), that would be a mistake.
Also: the assert keyword still calls the function. It has no way of not calling it, since the call is passed in as an expression.
If you call f(g(5))
then f is called with the result of g, not with the function call itself.
You passed in a list of numbers, which is one argument:
func([1, 3, 3, 5, 5])
vs.
func(1, 3, 3, 5, 5)
It looks like they did only pass a single argument to your function.
Test.assert_equals(func([1, 3, 3, 5, 5]), [1, 3, 5])
The list [1, 3, 3, 5, 5] is passed as the single argument to func(), then the list [1, 3, 5] is passed as the second argument to assert_equals().
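The difference between the two signatures, and the source of the TypeError mentioned in the question, can be reproduced with a short sketch. The function names here are made up for the illustration.

```python
def unique_sorted(lst):
    # One parameter: the caller passes a single list.
    return sorted(set(lst))

def unique_sorted_varargs(*lst):
    # *lst collects all positional arguments into a tuple.
    return sorted(set(lst))

one_arg = unique_sorted([1, 3, 3, 5, 5])          # one list argument
many_args = unique_sorted_varargs(1, 3, 3, 5, 5)  # five int arguments

# Passing a list to the *args version makes lst == ([1, 3, 3, 5, 5],),
# so set() tries to hash the inner list and fails.
try:
    unique_sorted_varargs([1, 3, 3, 5, 5])
    error = None
except TypeError as exc:
    error = str(exc)

print(one_arg)    # [1, 3, 5]
print(many_args)  # [1, 3, 5]
print(error)
```

So the test site called the function with one argument (a list), which is exactly what the version without *args expects.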

How do __getitem__, __setitem__, work with slices?

I'm running Python 2.7.10.
I need to intercept changes in a list. By "change" I mean anything that modifies the list in the shallow sense (the list is not changed if it consists of the same objects in the same order, regardless of the state of those objects; otherwise, it is). I don't need to find out how the list has changed, only that it has. So I just make sure I can detect that, and let the base method do its work. This is my test program:
class List(list):
    def __init__(self, data):
        list.__init__(self, data)
        print '__init__(', data, '):', self
    def __getitem__(self, key):
        print 'calling __getitem__(', self, ',', key, ')',
        r = list.__getitem__(self, key)
        print '-->', r
        return r
    def __setitem__(self, key, data):
        print 'before __setitem__:', self
        list.__setitem__(self, key, data)
        print 'after __setitem__(', key, ',', data, '):', self
    def __delitem__(self, key):
        print 'before __delitem__:', self
        list.__delitem__(self, key)
        print 'after __delitem__(', key, '):', self
l = List([0,1,2,3,4,5,6,7]) #1
x = l[5] #2
l[3] = 33 #3
x = l[3:7] #4
del l[3] #5
l[0:4]=[55,66,77,88] #6
l.append(8) #7
Cases #1, #2, #3, and #5 work as I expected; #4, #6, and #7 don't. The program prints:
__init__( [0, 1, 2, 3, 4, 5, 6, 7] ): [0, 1, 2, 3, 4, 5, 6, 7]
calling __getitem__( [0, 1, 2, 3, 4, 5, 6, 7] , 5 ) --> 5
before __setitem__: [0, 1, 2, 3, 4, 5, 6, 7]
after __setitem__( 3 , 33 ): [0, 1, 2, 33, 4, 5, 6, 7]
before __delitem__: [0, 1, 2, 33, 4, 5, 6, 7]
after __delitem__( 3 ): [0, 1, 2, 4, 5, 6, 7]
I'm not terribly surprised by #7: append is probably implemented in an ad-hoc way. But for #4 and #6 I am confused. The __getitem__ documentation says: "Called to implement evaluation of self[key]. For sequence types, the accepted keys should be integers and slice objects." (my emphasis). And for __setitem__: "Same note as for __getitem__()", which I take to mean that key can also be a slice.
What's wrong with my reasoning? I'm prepared, if necessary, to override every list-modifying method (append, extend, insert, pop, etc.), but what should I override to catch something like #6?
I am aware of the existence of __setslice__, etc. But those methods are deprecated since 2.0 ...
Hmmm. I read again the docs for __getslice__, __setslice__, etc., and I find this bone-chilling statement:
"(However, built-in types in CPython currently still implement __getslice__(). Therefore, you have to override it in derived classes when implementing slicing.)"
Is this the explanation? Is this saying "Well, the methods are deprecated, but in order to achieve the same functionality in 2.7.10 as you had in 2.0 you still have to override them"? Alas, then why did you deprecate them? How will things work in the future? Is there a "list" class - that I am not aware of - that I could extend and would not present this inconvenience? What do I really need to override to make sure I catch every list-modifying operation?
The problem is that you're subclassing a builtin, and so have to deal with a few wrinkles. Before I delve into that issue, I'll go straight to the "modern" way:
How will things work in the future? Is there a "list" class - that I am not aware of - that I could extend and would not present this inconvenience?
Yes, there's the stdlib Abstract Base Classes. You can avoid the ugly complications caused by subclassing builtin list by using the ABCs instead. For something list-like, try subclassing MutableSequence:
from collections import MutableSequence
class MyList(MutableSequence):
    ...
Now you should only need to deal with __getitem__ and friends for slicing behaviour.
If you want to push ahead with subclassing the builtin list, read on...
Your guess is correct, you will need to override __getslice__ and __setslice__. The language reference explains why and you already saw that:
However, built-in types in CPython currently still implement __getslice__(). Therefore, you have to override it in derived classes when implementing slicing.
Note that l[3:7] will hook into __getslice__, whereas the otherwise equivalent l[3:7:] will hook into __getitem__, so you have to handle the possibility of receiving slices in both methods... groan!
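To make the MutableSequence route concrete, here is a minimal sketch of a list-like class that only records that it was modified. It is written for Python 3, where the import is collections.abc (in Python 2.7 it would be from collections import MutableSequence). NotifyingList and its modified counter are hypothetical names chosen for this example.

```python
from collections.abc import MutableSequence

class NotifyingList(MutableSequence):
    """List-like container that counts shallow modifications."""
    def __init__(self, iterable=()):
        self._data = list(iterable)
        self.modified = 0

    def __len__(self):
        return len(self._data)

    def __getitem__(self, key):
        # key may be an int or a slice; both are delegated unchanged.
        return self._data[key]

    def __setitem__(self, key, value):
        self._data[key] = value
        self.modified += 1

    def __delitem__(self, key):
        del self._data[key]
        self.modified += 1

    def insert(self, index, value):
        self._data.insert(index, value)
        self.modified += 1

l = NotifyingList([0, 1, 2, 3, 4, 5, 6, 7])
l[3] = 33                    # item assignment
l[0:4] = [55, 66, 77, 88]    # slice assignment, same __setitem__
del l[3]                     # item deletion
l.append(8)                  # mixin method, implemented via insert()

print(list(l))     # [55, 66, 77, 4, 5, 6, 7, 8]
print(l.modified)  # 4
```

Since the class does not inherit from the builtin list, the mixin methods such as append, extend and pop are built on the five methods defined here, so every modification passes through one of them. Note that extend calls append once per item, so it would bump the counter several times.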

Python objects returned more than once by gc.get_referents()

I'm using gc module (Python 2.7.3 on Ubuntu 12.10) to analyze object references.
Starting with the following code:
import gc
a = [1,2,3]
b = [1,2,3,4,5]
print(gc.get_referents(a,b))
Obtaining the result:
[3, 2, 1, 5, 4, 3, 2, 1]
It seems that an object is referenced more than once from the list returned by gc.get_referents(). Using set in the following way:
print(set(gc.get_referents(a,b)))
I get something like the union of the list of referents of a and b:
set([1, 2, 3, 4, 5])
I'd like to know if this is the correct way to get the correct number of objects referred to by a list of objects.
If you want to get all objects that are referred to without duplicates, yes,
set(gc.get_referents(a, b))
will give you those.
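One caveat worth keeping in mind: set() removes duplicates by equality and hash, not by identity, and it fails on unhashable referents such as nested lists. If the goal is to count distinct objects, a possible variant (a sketch, assuming CPython's gc module as in the question) keys the referents by id() instead:

```python
import gc

a = [1, 2, 3]
b = [1, 2, 3, 4, 5]

referents = gc.get_referents(a, b)

# Key by id() so that deduplication is by object identity and also
# works for unhashable referents.
unique_by_identity = list({id(obj): obj for obj in referents}.values())

print(len(referents))
print(sorted(unique_by_identity))
```

For the small integers in this example both approaches give the same five objects, because CPython shares one object per small integer value.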

Mapping func over dictionary

How might one map a function over certain values in a dictionary and also update those values in the dictionary?
dic1 = { 1 : [1, 2, 3, 4], 2 : [2, 3, 5, 5], 3 : [6, 3, 7, 2] ... }
map(func, (data[col] for data in dic1.itervalues()))
This is sort of what I'm looking for, but I need a way to reinsert the new func(val) back into each respective slot in the dict. The function works fine, and printed it returns all the proper index values with the func applied, but I can't think of a good way to update the dictionary. Any ideas?
You don't want to use map for updating any kind of sequence; that's not what it's for. map is for generating a new sequence:
dict2 = dict(map(func, dict1.iteritems()))
Of course func has to take a (key, old_value) and return (key, new_value) for this to work as-is. If it just returns, say, new_value, you need to wrap it up in some way. But at that point, you're probably better off with a dictionary comprehension than a map call and a dict constructor:
dict2 = {key: func(value) for key, value in dict1.iteritems()}
If you want to use map, and you want to mutate, you could do it by creating a function that updates things, like this:
from functools import partial
def func_wrapped(d, key):
    d[key] = func(d[key])
map(partial(func_wrapped, dict1), dict1)
(This could even be done as a one-liner by using partial with d.__setitem__ if you really wanted.)
But that's a silly thing to do. For one thing, it means you're building a list of values just to throw them away. (Or, in Python 3, you're not actually doing anything unless you write some code that iterates over the map iterator.) But more importantly, you're making things harder for yourself for no good reason. If you don't need to modify things in-place, don't do it. If you do need to modify things in place, use a for loop.
PS, I'm not saying this was a silly question to ask. There are other languages that do have map-like mutating functions, so it wouldn't be a silly thing to do in, say, C++. It's just not pythonic, and you wouldn't know that without asking.
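For completeness, the plain for loop recommended above might look like the following sketch. The body of func is a placeholder invented for the example (it doubles each number); the dictionary values are taken from the question.

```python
def func(values):
    # Placeholder transformation for illustration only.
    return [v * 2 for v in values]

dict1 = {1: [1, 2, 3, 4], 2: [2, 3, 5, 5], 3: [6, 3, 7, 2]}

# Reassigning values for existing keys does not change the dict's size,
# so it is safe to do while iterating over the keys.
for key in dict1:
    dict1[key] = func(dict1[key])

print(dict1[1])  # [2, 4, 6, 8]
```

The loop states the intent (update each slot in place) directly, without building a throwaway list of results.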
Your function can mutate the list:
>>> d = {1:[1],2:[2],3:[3]}
>>> def func(lst):
...     lst.append(lst[0])
...     return lst
...
>>> x = map(func,d.values())
>>> x
[[1, 1], [2, 2], [3, 3]]
>>> d
{1: [1, 1], 2: [2, 2], 3: [3, 3]}
However, please note that this really isn't idiomatic Python and should be considered for instructional/educational purposes only. Usually, if a function mutates its arguments, it's polite to have it return None.
