"subtracting" strings, classes in python - python

Learning about classes in python. I want the difference between two strings, a sort of subtraction. eg:
a = "abcdef"
b ="abcde"
c = a - b
This would give the output f.
I was looking at this class and I am new to this so would like some clarification on how it works.
class MyStr(str):
    def __init__(self, val):
        return str.__init__(self, val)
    def __sub__(self, other):
        if self.count(other) > 0:
            return self.replace(other, '', 1)
        else:
            return self
and this will work in the following way:
>>> a = MyStr('thethethethethe')
>>> b = a - 'the'
>>> a
'thethethethethe'
>>> b
'thethethethe'
>>> b = a - 2 * 'the'
>>> b
'thethethe'
So a string is passed to the class and the constructor is called __init__. This runs the constructor and an object is returned, which contains the value of the string? Then a new subtraction function is created, so that when you use - with the MyStr object it is just defining how subtract works with that class? When sub is called with a string, count is used to check if that string is a substring of the object created. If that is the case, the first occurrence of the passed string is removed. Is this understanding correct?
Edit: basically this class could be reduced to:
class MyStr(str):
    def __sub__(self, other):
        return self.replace(other, '', 1)

Yes, your understanding is entirely correct.
Python will call a .__sub__() method if present on the left-hand operand; if that method is missing or returns NotImplemented, Python then tries the corresponding .__rsub__() method on the right-hand operand.
See emulating numeric types for a list of hooks Python supports for providing more arithmetic operators.
Note that the .count() call is redundant; .replace() will not fail if the other string is not present; the whole function could be simplified to:
def __sub__(self, other):
    return self.replace(other, '', 1)
The reverse version would be:
def __rsub__(self, other):
    return other.replace(self, '', 1)
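One detail the snippets above leave open: str.replace() returns a plain str, so the result of a subtraction is no longer a MyStr and a second - on it would fail. A minimal sketch (Python 3 syntax), assuming you want chained subtraction to keep working, wraps the result back into the subclass:

```python
class MyStr(str):
    """str subclass whose '-' removes the first occurrence of a substring."""

    def __sub__(self, other):
        # str.replace returns a plain str, so wrap it to stay a MyStr
        return MyStr(self.replace(other, '', 1))

    def __rsub__(self, other):
        # reached for `plain_str - MyStr(...)`, because str defines no __sub__
        return MyStr(other.replace(self, '', 1))


a = MyStr('thethethe')
b = a - 'the' - 'the'          # chaining works: each step returns a MyStr
c = 'abcdef' - MyStr('abcde')  # handled by __rsub__
print(b, c)
```

Whether the result should be a MyStr or a plain str is a design choice; the original class returns a plain str.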

Related

adding class elements to list elements [duplicate]

I am trying to understand how __add__ works:
class MyNum:
    def __init__(self,num):
        self.num=num
    def __add__(self,other):
        return MyNum(self.num+other.num)
    def __str__(self):
        return str(self.num)
If I put them in a list
d=[MyNum(i) for i in range(10)]
this works
t=MyNum(0)
for n in d:
    t=t+n
print t
But this does not:
print sum(d)
TypeError: unsupported operand type(s) for +: 'int' and 'instance'
What am I doing wrong? How can I get the sum() to work?
My problem is how to use the sum on a list of objects that support the __add__, need to keep it as generic as possible.
You need to define __radd__ as well to get this to work.
__radd__ is reverse add. When Python evaluates x + y it first attempts to call x.__add__(y). If that method is missing or returns NotImplemented, it falls back to y.__radd__(x).
This allows you to override addition by only touching one class. Consider for example how Python would have to evaluate 0 + x. A call to 0.__add__(x) is attempted but int knows nothing about your class. You can't very well change the __add__ method in int, hence the need for __radd__. I suppose it is a form of dependency inversion.
As Steven pointed out, sum accumulates from left to right, starting from 0. So the very first addition (0 + d[0]) is the only one that needs __radd__. As a nice exercise you could check that this is the case!
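The exercise can be sketched as follows (Python 3 syntax). The calls list is a hypothetical addition used only to record which hook Python invoked; it is not part of the original class:

```python
class MyNum:
    calls = []  # hypothetical trace of the hooks Python invoked, in order

    def __init__(self, num):
        self.num = num

    def __add__(self, other):
        if not isinstance(other, MyNum):
            return NotImplemented
        MyNum.calls.append('__add__')
        return MyNum(self.num + other.num)

    def __radd__(self, other):
        # reached for `0 + MyNum(...)` once int.__add__ returns NotImplemented
        MyNum.calls.append('__radd__')
        return MyNum(other + self.num)


total = sum([MyNum(i) for i in range(4)])
print(total.num, MyNum.calls)
```

Only the first entry in the trace is __radd__; every later addition has a MyNum on the left and goes through __add__.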
>>> help(sum)
Help on built-in function sum in module __builtin__:

sum(...)
    sum(sequence[, start]) -> value

    Returns the sum of a sequence of numbers (NOT strings) plus the value
    of parameter 'start' (which defaults to 0). When the sequence is
    empty, returns start.
In other words, provide a start value:
sum(d, MyNum(0))
Edit pasted from my below comment:
sum works with a default start value of the integer zero. Your MyNum class as written does not know how to add itself to integers. To solve this you have two options. Either you can provide a start value to sum that has the same type as your class, or you can implement __radd__, which Python calls when adding values of differing types (such as when the first value in d is added to the default start value of zero).
I would not rely on sum() with a start value, because of the pitfall shown below:
In [51]: x = sum(d, MyNum(2))
In [52]: x.num
Out[52]: 47
You might wonder why you get 47 when you expected the sum to start from the second MyNum(), skip the first, and run to the end, which would give 44 (sum(range(2, 10))).
The truth is that 2 is not treated as a start position; it is simply added to the result:
sum(range(10)) + 2
Use __radd__ instead.
The corrected code is below. Also note that Python calls __radd__ only when the object on the right side of the + is an instance of your class, e.g. 2 + obj1.
#!/usr/bin/env python
class MyNum:
    def __init__(self,num):
        self.num=num
    def __add__(self,other):
        return MyNum(self.num+other.num)
    def __radd__(self,other):
        return MyNum(self.num+other)
    def __str__(self):
        return str(self.num)
d=[MyNum(i) for i in range(10)]
print sum(d) ## Prints 45
d=[MyNum(i) for i in range(2, 10)]
print sum(d) ## Prints 44
print sum(d,MyNum(2)) ## Prints 46 - adding 2 to the last value (44+2)
class MyNum:
    def __init__(self,num):
        self.num=num
    def __add__(self,other):
        # += is a statement, so update first and then return the instance
        self.num += other.num
        return self
    def __str__(self):
        return str(self.num)
one = MyNum(1)
two = MyNum(2)
one + two
print(two.num)
Another option is reduce (functools.reduce in Python 3.x).
from functools import reduce
from operator import add
d=[MyNum(i) for i in range(10)]
my_sum = reduce(add,d)
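For completeness, here is a self-contained sketch of the reduce approach in Python 3. The third argument to reduce is an addition not present in the snippet above: it supplies an initial value so that an empty list does not raise TypeError.

```python
from functools import reduce
from operator import add


class MyNum:
    def __init__(self, num):
        self.num = num

    def __add__(self, other):
        return MyNum(self.num + other.num)


d = [MyNum(i) for i in range(10)]
my_sum = reduce(add, d)                 # ((d[0] + d[1]) + d[2]) + ...
empty_sum = reduce(add, [], MyNum(0))   # initial value covers the empty case
print(my_sum.num, empty_sum.num)
```

Because reduce starts from the first element rather than from the integer 0, no __radd__ is needed here.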

How to remove `duplicates' in list of instances

I have a list of instances of a certain class. This list contains `duplicates', in the sense that duplicates share the exact same attributes. I want to remove the duplicates from this list.
I can check whether two instances share the same attributes by using
class MyClass:
    def __eq__(self, other) :
        return self.__dict__ == other.__dict__
I could of course iterate through the whole list of instances and compare them element by element to remove duplicates, but I was wondering if there is a more pythonic way to do this, preferably using the in operator + list comprehension.
sets (no order)
A set cannot contain duplicate elements, so list(set(content)) will deduplicate a list. This is reasonably efficient and is probably one of the better ways to do it. You will need to define a __hash__ method for your class, though: it must return the same value for equal elements, and should return different values for unequal elements wherever possible so that lookups stay fast. Beyond that rule, the hash value may change between runs without causing issues.
index function (stable order)
You could do lambda l: [l[index] for index in range(len(l)) if index == l.index(l[index])]. This only keeps elements that are the first in the list.
in operator (stable order)
def uniquify(content):
    result = []
    for element in content:
        if element not in result:
            result.append(element)
    return result
This will keep appending elements to the output list unless they are already in the output list.
A little more on the set approach. You can safely implement a hash by delegating to a tuple's hash - just hash a tuple of all the attributes you want to look at. You will also need to define an __eq__ that behaves properly.
class MyClass:
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c
    def __eq__(self, other):
        return (self.a, self.b, self.c) == (other.a, other.b, other.c)
    def __hash__(self):
        return hash((self.a, self.b, self.c))
    def __repr__(self):
        return "MyClass({!r}, {!r}, {!r})".format(self.a, self.b, self.c)
As you're doing so much tuple construction, you could just make your class iterable:
    def __iter__(self):
        return iter((self.a, self.b, self.c))
This enables you to call tuple on self instead of laboriously doing .a, .b, .c etc.
You can then do something like this:
def unordered_elim(l):
    return list(set(l))
If you want to preserve ordering, you can use an OrderedDict instead:
from collections import OrderedDict
def ordered_elim(l):
    return list(OrderedDict.fromkeys(l).keys())
This should be faster than using in or index, while still preserving ordering. You can test it something like this:
data = [MyClass("this", "is a", "duplicate"),
        MyClass("first", "unique", "datum"),
        MyClass("this", "is a", "duplicate"),
        MyClass("second", "unique", "datum")]
print(unordered_elim(data))
print(ordered_elim(data))
With this output:
[MyClass('first', 'unique', 'datum'), MyClass('second', 'unique', 'datum'), MyClass('this', 'is a', 'duplicate')]
[MyClass('this', 'is a', 'duplicate'), MyClass('first', 'unique', 'datum'), MyClass('second', 'unique', 'datum')]
NB if any of your attributes aren't hashable, this won't work, and you'll either need to work around it (change a list to a tuple) or use a slow, n ^ 2 approach like in.
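One possible workaround for unhashable attributes, sketched under the assumption that each attribute can be converted to something hashable: deduplicate by a key function instead of hashing the instances themselves. The names unique_by and Record are hypothetical and used only for illustration.

```python
def unique_by(items, key):
    """Return items without duplicates, keeping first occurrences in order."""
    seen = set()
    result = []
    for item in items:
        k = key(item)  # must be hashable, e.g. a tuple of attributes
        if k not in seen:
            seen.add(k)
            result.append(item)
    return result


class Record:
    def __init__(self, name, tags):
        self.name = name
        self.tags = tags  # a list, so the attributes are not all hashable


records = [Record("a", [1, 2]), Record("b", [3]), Record("a", [1, 2])]
# convert the list attribute to a tuple inside the key
unique = unique_by(records, key=lambda r: (r.name, tuple(r.tags)))
print([r.name for r in unique])
```

This keeps the set-based speed and stable ordering without requiring __hash__ on the class.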

similar use of operator overloading in python functions

Is there a way to avoid repeating a variable in a function call, the way augmented assignment does for operators?
e.g.
a += 1
Instead of a = a + 1
in
a = max(a, some_other_variable)
The max() function is just an example.
NOTE:
My intent here is not to use the variable 'a' again, if possible. These two examples are different and not related to each other.
e.g.
a = some_function(a, b)
Here, the value returned from some_function() is assigned back to variable 'a'.
Unless 'a' is a class variable, I cannot access it inside some_function(). Is there a way to write 'a' only once?
You cannot supplement Python's set of operators and statements directly in the Python code. However, you can write a wrapper that uses Python's language services to write a Pythonesque DSL which includes the operators you want.
I feel like you want something along these lines ...
>>> class Foo(object):
...     def __iadd__(self, other):
...         return max(self.num, other)
...     def __init__(self, num):
...         self.num = num
...
>>> a = Foo(5)
>>> a += 4
>>> print a
5
>>> a = Foo(4)
>>> a += 6
>>> a
6
But please note that I would consider this use of __iadd__ to be very impolite. Having __iadd__ return something other than self is generally inconsiderate if the type is mutable.
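For contrast, a more conventional __iadd__ mutates the object and returns self, so the name stays bound to the same instance. The sketch below keeps the max() semantics from the example above purely for comparison; MaxBox is a hypothetical name.

```python
class MaxBox:
    def __init__(self, num):
        self.num = num

    def __iadd__(self, other):
        # update in place and return self, so `a` remains a MaxBox
        self.num = max(self.num, other)
        return self


a = MaxBox(5)
same = a
a += 4   # max(5, 4) -> num stays 5
a += 6   # max(5, 6) -> num becomes 6
print(a.num, a is same)
```

Here a is still the same MaxBox after both operations, whereas in the Foo example it is rebound to a plain int.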
Instead of overloading an operator in a like the other answer, you could create a partial-like object for the second part. (I used the left shift operator for "coolness")
import functools
class partial(functools.partial):
    def __rlshift__(self, val):
        return self(val)
and use like this:
>>> a = 10
>>> a <<= partial(max, 20)
>>> a
20
So you don't need to mess with your variable types to execute the operation. Also you will not need to declare a new class for every function.
PS: Beware that the actual execution is max(20, a).
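The argument order only matters for functions that are not symmetric. A small sketch using operator.sub (my choice of example, not from the answer) makes the effect visible:

```python
import functools
import operator


class partial(functools.partial):
    def __rlshift__(self, val):
        # the left operand is appended after the arguments already bound
        return self(val)


a = 10
a <<= partial(operator.sub, 25)        # operator.sub(25, a), i.e. 25 - 10

b = 10
b <<= partial(lambda val: val - 25)    # nothing bound, so val is b itself
print(a, b)
```

If the variable has to be the first argument, wrapping the call in a lambda as in the second case is one way around it.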

Classes: Alternative way of adding len()

I am building a string Class that behaves like a regular string class except that the addition operator returns the sum of the lengths of the two strings instead of concatenating them. And then a multiplication operator returns the products of the length of the two strings. So I was planning on doing
class myStr(string):
    def __add__(self):
        return len(string) + len (input)
at least that is what I have for the first part but that is apparently not correct. Can someone help me correct it.
You need to derive from str, and you can use len(self) to get the length of the current instance. You also need to give __add__ a parameter for the other operand of the + operator.
class myStr(str):
    def __add__(self, other):
        return len(self) + len(other)
Demo:
>>> class myStr(str):
...     def __add__(self, other):
...         return len(self) + len(other)
...
>>> foo = myStr('foo')
>>> foo
'foo'
>>> foo + 'bar'
6
string is not a class. It's not anything*. There is no context where len(string) will work unless you define string.
Secondly, __add__ does not have an input parameter.
You need to fix both of these issues.
* You could import a module called string, but it's not something that just exists in global scope.
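The question also asks for a multiplication operator that returns the product of the two lengths, which the answers above do not cover. A minimal sketch following the same pattern, assuming both operands are strings:

```python
class myStr(str):
    def __add__(self, other):
        # sum of the two lengths instead of concatenation
        return len(self) + len(other)

    def __mul__(self, other):
        # product of the two lengths instead of repetition
        return len(self) * len(other)


foo = myStr('foo')
print(foo + 'bar', foo * 'quux')
```

Note that this replaces the usual repetition behaviour, so foo * 3 raises TypeError because an int has no len().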
