How to "fool" duck typing in Python - python

Suppose I had a class A:
class A:
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def sum(self):
        return self.x + self.y
And I defined a factory method called factory:
def factory(x, y):
    class B: pass
    b = B()
    setattr(b, 'x', x)
    setattr(b, 'y', y)
    B.__name__ = 'A'
    return b
Now, if I do print(type(A(1, 2))) and print(type(factory(1, 2))), they will show that these are different types. And if I try to do factory(1, 2).sum(), I'll get an exception. But type(A).__name__ and type(factory(1, 2)).__name__ are equivalent, and if I do A.sum(factory(1, 2)) I'll get 3, as if I were calling it on an A. So, my question is this:
What would I need to do here to make factory(1, 2).sum() work without defining sum on B or doing inheritance?

I think you're fundamentally misunderstanding the factory pattern, and possibly getting confused with how interfaces work in Python. Either that, or I am fundamentally confused by the question. Either way, there's some sorting out we need to do.
What would I need to do here to make factory(1, 2).sum() work without defining sum on B or doing inheritance?
Just return an A instead of some other type:
def factory(x, y):
    return A(x, y)
then
print(factory(1,2).sum())
will output 3 as expected. But that's kind of a useless factory: you could just call A(x, y) and be done with it!
Some notes:
You typically use a "factory" (or factory pattern) when you have easily "nameable" types that may be non-trivial to construct. Consider scipy.interpolate.interp1d: it has a kind option, which is basically an enum for all the different strategies you might use to do an interpolation. This is, in essence, a factory (but hidden inside the function for ease of use). You could imagine this being standalone, so you'd call your "strategy" factory and then pass the result on to the interp1d call. However, doing it inline is a common pattern in Python. Observe: these strategies are easy to "name" but somewhat hard to construct in general (you can imagine it would be annoying to have to pass in a function that does linear interpolation as opposed to just doing kind='linear'). That's what makes the factory pattern useful.
If you don't know what A is up front, then it's definitely not the factory pattern you'd want to apply. Furthermore, if you don't know what you're serializing/deserializing, it would be impossible to call it or use it. You'd have to know that, or have some way of inferring it.
Interfaces in Python are not enforced the way they are in languages like Java/C++. That's the spirit of duck typing. If an interface does something like call x.sum(), then it doesn't matter what type x actually is; it just has to have a method called sum(). If it acts like the "sum" duck and quacks like the "sum" duck, then it is the "sum" duck from Python's perspective. It doesn't matter if x is a numpy array or an A, it'll work all the same. In Java/C++, code like that won't compile unless the compiler is absolutely certain that x has the method sum defined. Fortunately Python isn't like that, so you can even define it on the fly (which maybe you were trying to do with B). Either way, interfaces are a much different concept in Python than in other mainstream languages.
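If the goal really is to keep B unrelated to A, "defining it on the fly" could look roughly like the sketch below. This is not what the answer above recommends, only one possible reading of the question: it uses types.MethodType from the standard library to bind A's sum function to a single instance, so B itself never gets a sum attribute and never inherits from A.

```python
import types


class A:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def sum(self):
        return self.x + self.y


def factory(x, y):
    class B:
        pass

    b = B()
    b.x = x
    b.y = y
    # Bind A's plain function to this one instance only.
    # B gains no class-level "sum" and does not inherit from A.
    b.sum = types.MethodType(A.sum, b)
    return b


obj = factory(1, 2)
print(obj.sum())           # 3
print(isinstance(obj, A))  # False
```

This works only because A.sum touches nothing but self.x and self.y, which is exactly the duck-typing point: the object just has to have the attributes the code uses.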
P.S.
But, type(A).__name__ and type(factory(1, 2)).__name__ are equivalent
Of course they are, you explicitly do this when you say B.__name__ = 'A'. So I'm not sure what you were trying to get at there...
HTH!

Related

Where to put a function that acts on two instances of a specific class

This is really a design question and I would like to know a bit of what design patterns to use.
I have a module, let's say curves.py that defines a Bezier class. Then I want to write a function intersection which uses a recursive algorithm to find the intersections between two instances of Bezier.
What options do I have for where to put this function? What are some best practices in this case? Currently I have written the function in the module itself (and not as a method of the class).
So currently I have something like:
def intersections(inst1, inst2): ...
class Bezier: ...
and I can call the function by passing two instances:
from curves import Bezier, intersections
a = Bezier()
b = Bezier()
result = intersections(a, b)
However, another option (that I can think of) would be to make intersection a method of the class. In this case I would instead use
a.intersections(b)
For me the first choice makes a bit more sense since it feels more natural to call intersections(a, b) than a.intersections(b). However, the other option feels more natural in the sense that the function intersection really only acts on Bezier instances and this feels more encapsulated.
Do you think one of these is better than the other, and in that case, for what reasons? Are there any other design options to use here? Are there any best practices?
As an example, you can compare how the builtin set class does this:
intersection(*others)
set & other & ...
Return a new set with elements common to the set and all others.
So intersection is defined as a regular instance method on the class that takes another (or multiple) sets and returns the intersection, and it can be called as a.intersection(b).
However, due to the standard mechanics of how instance methods work, you can also spell it set.intersection(a, b) and in practice you'll see this quite often since like you say it feels more natural.
You can also override the __and__ method so this becomes available as a & b.
In terms of ease of use, putting it on the class is also friendlier, because you can just import the Bezier class and have all associated features available automatically, and they're also discoverable via help(Bezier).
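To make the three spellings concrete, here is a minimal sketch of a Bezier class that exposes intersections as a method and also through the & operator. The constructor argument control_points and the body of intersections are placeholders invented for illustration; the real recursive algorithm from the question would go there.

```python
class Bezier:
    def __init__(self, control_points):
        # control_points is a hypothetical representation, used only for the demo.
        self.control_points = list(control_points)

    def intersections(self, other):
        # Placeholder logic: report control points shared by both curves.
        # The real recursive intersection algorithm would replace this.
        return [p for p in self.control_points if p in other.control_points]

    def __and__(self, other):
        # Returning NotImplemented lets Python raise a clean TypeError
        # for operands that are not Bezier instances.
        if not isinstance(other, Bezier):
            return NotImplemented
        return self.intersections(other)


a = Bezier([(0, 0), (1, 2), (3, 0)])
b = Bezier([(0, 3), (1, 2), (3, 3)])

print(a.intersections(b))         # method call
print(Bezier.intersections(a, b)) # same method, function-style spelling
print(a & b)                      # operator spelling via __and__
```

All three calls go through the same method, so choosing the method form does not rule out the function-style call that felt more natural in the question.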

How do I overload `+` on a NamedTuple

How do I create a custom overloading of plus for a named tuple in 3.5? I know there's some new syntax in 3.6 for this, can you do it in 3.5? I want it also to pass mypy checks.
from typing import NamedTuple
Point = NamedTuple('Point', [('x', int),
                             ('y', int)])
def joinPoints(a: Point, b: Point) -> Point:
    return Point(x=a.x + b.x, y=a.y + b.y)
q = Point(1, 2)
r = Point(3, 4)
s = joinPoints(q, r)
t = q + r  # HOW DO I MAKE THIS GO?
# s should equal t
As a note, what the new, class-based syntax for defining typed namedtuples in Python 3.6 is ultimately doing at runtime is basically a bunch of metaprogramming hijinkery to make a custom class, which happens to also contain your custom __add__ method, if you included one.
Since we can't have that syntax in Python 3.5, the best you can really do is to just implement all that boilerplate yourself, I'm afraid.
Remember, namedtuples are basically meant to be a convenient way of defining simple classes (that subclass tuple), nothing more. If you want anything more complex, you're actually going to need to implement it yourself.
In any case, setting aside types completely, there isn't a super clean way of doing what you're trying to do at runtime, much less with types (at least, to the best of my knowledge). I guess one sort of clean way would be to manually patch the Point class after you define it, like so:
from typing import NamedTuple
Point = NamedTuple('Point', [('x', int), ('y', int)])
def add(self: Point, other: Point) -> Point:
    return Point(self.x + other.x, self.y + other.y)
Point.__add__ = add
a = Point(1, 2)
b = Point(3, 4)
print(a + b)  # prints 'Point(x=4, y=6)'
However, you'd have to give up on using mypy then -- mypy makes the simplifying (and usually reasonable) assumption that a class's attributes and type signatures will not change after that class has been defined, and so will disallow assigning a new method to Point and will consequently throw an error on the last line.
Perhaps there's some cleverer approach (maybe using the abc module somehow?) that ends up satisfying you, the Python runtime, and mypy, but I'm currently not aware of such an approach.
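One alternative that avoids patching the class after the fact is to subclass the generated namedtuple and define __add__ in the subclass body. The sketch below is an assumption, not something the answer above proposes: it works at runtime, but whether mypy accepts it has not been checked here, and mypy may well report the narrowed signature as an incompatible override of tuple.__add__. The name _PointBase is made up for the example.

```python
from typing import NamedTuple

# Hypothetical base name; the functional NamedTuple syntax works on Python 3.5.
_PointBase = NamedTuple('_PointBase', [('x', int), ('y', int)])


class Point(_PointBase):
    __slots__ = ()  # keep instances as lightweight as a plain namedtuple

    def __add__(self, other: 'Point') -> 'Point':
        # Field-wise addition instead of tuple concatenation.
        return Point(self.x + other.x, self.y + other.y)


q = Point(1, 2)
r = Point(3, 4)
t = q + r
print(t)  # Point(x=4, y=6)
```

Note that this replaces the usual tuple concatenation behaviour of +, so code that relies on q + r producing a four-element tuple would break.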

Inequalities as python parameters

Firstly, sorry for the bad title of this question; I simply don't know a better one.
If you have a better one, tell me!
So my problem is that I would like to write a simplex solver in Python by myself to deeply understand how they work.
Therefore, I would like to have something like this in my program:
m.addConstr(x[0] <= 7)
Which basically should add a constraint to my model m.
This works in Gurobi which is just wonderful cause it's so easy to read.
The problem is that x[0] has to be an object where I myself can define what should happen when there is an inequality or equality or whatever, right?
I am happy to figure out most of the stuff by myself would just like to get an idea how this works.
It looks like you want to overload the comparison operators of whatever objects you're working with. So if Foo is the class of x[0] in your example, then you could write it like this:
class Foo:
    def __gt__(self, other):
        # construct and return some kind of constraint object
        ...
    def __lt__(self, other):
        # likewise
        ...
These special methods (__gt__, __ge__, __lt__, __le__, __ne__ and __eq__) are called for the left-hand object in a comparison relation. So if you have x > y, then the __gt__ method on x is called with y as an argument.
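As a rough illustration of how m.addConstr(x[0] <= 7) could be wired up, here is a minimal sketch. The classes Var, Constraint and Model are hypothetical names invented for this example and have nothing to do with Gurobi's actual implementation; the only point is that __le__ and __ge__ return a constraint object instead of a boolean.

```python
class Constraint:
    """Records 'var sense bound', e.g. x0 <= 7."""

    def __init__(self, var, sense, bound):
        self.var = var
        self.sense = sense
        self.bound = bound

    def __repr__(self):
        return "Constraint({} {} {})".format(self.var.name, self.sense, self.bound)


class Var:
    def __init__(self, name):
        self.name = name

    def __le__(self, other):
        # x <= 7 builds a constraint object rather than returning True/False.
        return Constraint(self, "<=", other)

    def __ge__(self, other):
        return Constraint(self, ">=", other)


class Model:
    def __init__(self):
        self.constraints = []

    def addConstr(self, constraint):
        self.constraints.append(constraint)


m = Model()
x = [Var("x0"), Var("x1")]
m.addConstr(x[0] <= 7)
m.addConstr(x[1] >= 2)
print(m.constraints)  # [Constraint(x0 <= 7), Constraint(x1 >= 2)]
```

Writing the bound on the left also works: for 7 >= x[0], int does not know how to compare with Var, so Python falls back to the reflected method, which is x[0].__le__(7).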
I don't think it should be your first concern to come up with an elegant input syntax. You should rather implement the simplex algorithm first.
I suggest you handle the input by writing a parser for the two standard formats for linear programming problems: .lp and .mps
If you still want to know how to implement proper expression handling in Python I recommend you have a look at PySCIPOpt since this is exactly doing what you want and you can inspect the entire source code.

Proper use of class constants in Python

This question specifically relates to the use of the class constants ABOVE and BELOW in the sample code below.
I have a few different classes in different modules that look like this:
class MyClass(object):
    ABOVE = 1
    BELOW = 0
    def __init__(self):
        self.my_numbers = [1,2,3,4,5]
    def find_thing_in_direction(self, direction, cutoff):
        if direction == self.ABOVE:
            return [n for n in self.my_numbers if n > cutoff]
        else:
            return [n for n in self.my_numbers if n < cutoff]
my_class = MyClass()
my_var = my_class.find_thing_in_direction(MyClass.ABOVE, 3)
If I have a handful of classes scattered across different modules that each have their own ABOVE and BELOW, should I extract these constants to somewhere, or is it better to keep the constants within their own classes?
Is there a more Pythonic way to do this instead of using these class constants?
It seems you're using classes as namespaces for your constants. You should ask yourself whether the ABOVE and BELOW constants actually mean something different in each class.
If a differentiation is required (not just a numeric difference, but a semantic one as well), then storing them in the class they belong to is the best approach. On the other hand, if they have the same semantics in every class, then you're not sticking to the DRY principle and you're duplicating code.
One solution is to store them at module level, or to create a class merely to contain the constants.
EDIT:
Based on the OP's comment, I've realized that I overlooked the fact that ABOVE and BELOW are not really parametric constants but just strongly typed names (i.e. an enumeration).
Therefore I think the accepted answer is the correct one :)
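For reference, treating ABOVE and BELOW as an enumeration could look like the sketch below, using the standard enum module (Python 3.4+). This is only one way to do it and is not necessarily what the accepted answer proposed; the name Direction is made up for the example.

```python
from enum import Enum


class Direction(Enum):
    # Defined once at module level and shared by every class that needs it.
    ABOVE = 1
    BELOW = 0


class MyClass:
    def __init__(self):
        self.my_numbers = [1, 2, 3, 4, 5]

    def find_thing_in_direction(self, direction, cutoff):
        if direction is Direction.ABOVE:
            return [n for n in self.my_numbers if n > cutoff]
        if direction is Direction.BELOW:
            return [n for n in self.my_numbers if n < cutoff]
        # Anything else (e.g. a bare 1 or 0) is rejected instead of
        # silently falling into the "below" branch.
        raise ValueError("expected a Direction, got {!r}".format(direction))


my_class = MyClass()
print(my_class.find_thing_in_direction(Direction.ABOVE, 3))  # [4, 5]
print(my_class.find_thing_in_direction(Direction.BELOW, 3))  # [1, 2]
```

Compared with the plain integer constants, an enum member does not compare equal to 1 or 0, so passing the wrong kind of value is caught early.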
Old answer:
If the number of constants is small, it really boils down to preference; if you have a lot of them, however, namespacing by classes is probably a good idea.
Also, do you have inheritance? If yes, do you override the constant values in subclasses? If so, you obviously need to keep them inside your classes.
Also, my_class.find_thing_in_direction(MyClass.ABOVE, 3) is smelly: find_thing_in_direction should most probably refer to its own class' ABOVE constant directly.
Also, SomeClass is a really bad class name :)
For your specific method find_thing_in_direction, the direction param would be better as a bool flag named something like reverse, just like what the builtin sorted function does:
def find_thing_in_direction(self, reverse, cutoff):
    if not reverse:
        pass
    else:
        pass
This way you don't have to use class attributes.

Defining "overloaded" functions in python

I really like the syntax of the "magic methods" or whatever they are called in Python, like
class foo:
    def __add__(self, other):  # It can be called like c = a + b
        pass
The call
c = a + b
is then translated to
a.__add__(b)
Is it possible to mimic such behaviour for "non-magic" functions? In numerical computations I need the Kronecker product, and am eager to have "kron" function such that
kron(a,b)
is in fact
a.kron(b)?
The use case is: I have two similar classes, say, matrix and vector, both having Kronecker product. I would like to call them
a = matrix()
b = matrix()
c = kron(a,b)
a = vector()
b = vector()
c = kron(a,b)
matrix and vector classes are defined in one .py file, thus share the common namespace. So, what is the best (Pythonic?) way to implement functions like above? Possible solutions:
1) Have one kron() function and do type check
2) Have different namespaces
3) ?
The python default operator methods (__add__ and such) are hard-wired; python will look for them because the operator implementations look for them.
However, there is nothing stopping you from defining a kron function that does the same thing; look for __kron__ or __rkron__ on the objects passed to it:
def kron(a, b):
    if hasattr(a, '__kron__'):
        return a.__kron__(b)
    if hasattr(b, '__rkron__'):
        return b.__rkron__(a)
    # Default kron implementation here
    return complex_operation_on_a_and_b(a, b)
What you're describing is multiple dispatch or multimethods. Magic methods is one way to implement them, but it's actually more usual to have an object that you can register type-specific implementations on.
For example, http://pypi.python.org/pypi/multimethod/ will let you write
@multimethod(matrix, matrix)
def kron(lhs, rhs):
    pass
@multimethod(vector, vector)
def kron(lhs, rhs):
    pass
It's quite easy to write a multimethod decorator yourself; the BDFL describes a typical implementation in an article. The idea is that the multimethod decorator associates the type signature and method with the method name in a registry, and replaces the method with a generated method that performs type lookup to find the best match.
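The registry idea described above can be sketched in a few lines. This is a simplified illustration written for this page, not the code of the multimethod package or of the article mentioned: it only matches exact argument types (no subclass or "best match" lookup), and the names _registry and dispatcher, as well as the stand-in matrix and vector classes, are assumptions.

```python
_registry = {}  # function name -> {(type, type, ...): implementation}


def multimethod(*types):
    def register(func):
        # All functions sharing a name share one table of implementations.
        implementations = _registry.setdefault(func.__name__, {})
        implementations[types] = func

        def dispatcher(*args):
            key = tuple(type(arg) for arg in args)
            try:
                impl = implementations[key]
            except KeyError:
                raise TypeError(
                    "no implementation of {} for {}".format(func.__name__, key)
                )
            return impl(*args)

        dispatcher.__name__ = func.__name__
        return dispatcher

    return register


class matrix:  # stand-in classes for the demo
    pass


class vector:
    pass


@multimethod(matrix, matrix)
def kron(lhs, rhs):
    return "matrix kron"


@multimethod(vector, vector)
def kron(lhs, rhs):
    return "vector kron"


print(kron(matrix(), matrix()))  # matrix kron
print(kron(vector(), vector()))  # vector kron
```

The second def kron replaces the first dispatcher with a new one, but both read from the same table, so every registered signature stays reachable. Calling kron(matrix(), vector()) raises TypeError because no implementation was registered for that pair.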
Technically speaking, implementing something similar to the "standard" operator (and operator-like - think len() etc) behaviour is not difficult:
def kron(a, b):
    if hasattr(a, '__kron__'):
        return a.__kron__(b)
    elif hasattr(b, '__kron__'):
        return b.__kron__(a)
    else:
        raise TypeError("your error message here")
Now you just have to add a __kron__(self, other) method on the relevant types (assuming you have control over these types or they don't use slots or whatever else that would prevent adding methods outside the class statement's body).
Now I'd not use a __magic__ naming scheme as in my above snippet since this is supposed to be reserved for the language itself.
Another solution would be to maintain a mapping from types to type-specific functions and have the "generic" kron function look up the mapping, i.e.:
# kron.py
from somewhere import Matrix, Vector
def matrix_kron(a, b):
    ...  # code here
def vector_kron(a, b):
    ...  # code here
# Keys are the classes themselves, so type(a) can be looked up directly.
KRON_IMPLEMENTATIONS = {
    Matrix: matrix_kron,
    Vector: vector_kron,
}
def kron(a, b):
    for typ in (type(a), type(b)):
        implementation = KRON_IMPLEMENTATIONS.get(typ)
        if implementation:
            return implementation(a, b)
    raise TypeError("your message here")
This solution doesn't work well with inheritance but it is "less surprising": it requires neither monkeypatching nor a __magic__ name etc.
I think having one single function that delegates the actual computation is a nice way to do it. If the Kronecker product only works on two similar classes, you can even do the type checking in the function:
def kron(a, b):
    if type(a) != type(b):
        raise TypeError('expected two instances of the same class, got %s and %s' % (type(a), type(b)))
    return a._kron_(b)
Then, you just need to define a _kron_ method on the class. This is only a basic example; you might want to improve it to handle more gracefully the cases where a class doesn't have the _kron_ method, or to handle subclasses.
Binary operations in the standard library usually have a reverse dual (__add__ and __radd__), but since your operator only works for objects of the same type, it isn't useful here.
