numpy.ndarray: converting to a "normal" class - python

[Python 3]
I like ndarray but I find it annoying to use.
Here's one problem I face. I want to write class Array that will inherit much of the functionality of ndarray, but has only one way to be instantiated: as a zero-filled array of a certain size. I was hoping to write:
class Array(numpy.ndarray):
    def __init__(self, size):
        pass  # What goes here?
I'd like to call super().__init__ with some parameters to create a zero-filled array, but it won't work since ndarray uses a global function numpy.zeros (rather than a constructor) to create a zero-filled array.
Questions:
Why does ndarray use global (module) functions instead of constructors in many cases? It is a big annoyance if I'm trying to reuse them in an object-oriented setting.
What's the best way to define class Array that I need? Should I just manually populate ndarray with zeroes, or is there any way to reuse the zeros function?

Why does ndarray use global (module) functions instead of constructors in many cases?
To be compatible/similar to Matlab, where functions like zeros or ones originally came from.
Global factory functions are quick to write and easy to understand. What should the semantics of a constructor be; for example, how would you express a simple zeros or empty or ones with one single constructor? Such factory functions are in fact quite common in other programming languages as well.
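To get constructor-like ergonomics without overloading a single constructor, one option is classmethod factories. A minimal sketch (the Grid class and its helpers are hypothetical, purely for illustration):
class Grid:
    def __init__(self, data):
        self._data = list(data)

    @classmethod
    def zeros(cls, size):
        # analogous to numpy.zeros(size)
        return cls([0.0] * size)

    @classmethod
    def ones(cls, size):
        # analogous to numpy.ones(size)
        return cls([1.0] * size)

g = Grid.zeros(5)  # one class, several named ways to build it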
What's the best way to define class Array that I need?
import numpy

class Array(numpy.ndarray):
    def __new__(cls, size):
        # Allocate via ndarray.__new__, then zero-fill. Using cls (rather
        # than hard-coding Array) keeps further subclassing possible.
        result = numpy.ndarray.__new__(cls, size)
        result.fill(0)
        return result

arr = Array(5)

def test(a):
    print type(a), a

test(arr)
test(arr[2:4])
test(arr.view(int))

arr[2:4] = 5.5

test(arr)
test(arr[2:4])
test(arr.view(int))
Note that this is Python 2, but it would require only small modifications to work with Python 3.
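For reference, a minimal sketch of the same idea in Python 3 (the only changes are zero-argument super() and the print function):
import numpy

class Array(numpy.ndarray):
    def __new__(cls, size):
        result = super().__new__(cls, size)
        result.fill(0)  # zero-fill, as numpy.zeros would
        return result

def test(a):
    print(type(a), a)

test(Array(5))       # <class '__main__.Array'> [0. 0. 0. 0. 0.]
test(Array(5)[2:4])  # slicing preserves the subclass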

If you don't like the ndarray interface, then don't inherit it. You can define your own interface and delegate the rest to ndarray and numpy.
import functools
import numpy as np

class Array(object):
    def __init__(self, size):
        self._array = np.zeros(size)

    def __getattr__(self, attr):
        try:
            return getattr(self._array, attr)
        except AttributeError:
            # extend the interface to all functions from numpy
            f = getattr(np, attr, None)
            if hasattr(f, '__call__'):
                return functools.partial(f, self._array)
            else:
                raise AttributeError(attr)

    def allzero(self):
        return np.allclose(self._array, 0)
a = Array(10)

# ndarray has no sometrue() method, but the numpy.sometrue() function is
# equivalent to the any() method that ndarray does have.
assert a.sometrue() == a.any() == False
assert a.allzero()

try:
    a.non_existent
except AttributeError:
    pass
else:
    assert 0
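One caveat with this delegation approach: Python looks up special methods on the type rather than the instance, so __getattr__ is never consulted for operators. A quick sketch of the consequence:
a = Array(10)
print(a.sum())  # 0.0 -- ordinary attribute access delegates fine

try:
    a + 1  # Array defines no __add__, and __getattr__ is bypassed
except TypeError as e:
    print(e)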

Inheriting from ndarray is a little bit tricky. ndarray does not even define an __init__(self, ...) method, so it can't be called from a subclass, but there are reasons for that. Please see the numpy documentation on subclassing.
By the way, could you be more specific about your particular needs? It's still quite easy to cook up a class (utilizing ndarray) for your own needs, but a subclass of ndarray that passes through all the numpy machinery is quite a different issue.
@Philipp: It will be called by Python, but not by numpy. There are three ways to instantiate an ndarray, and guidelines on how to handle all of the cases are given in that doc.
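For reference, the pattern from numpy's subclassing documentation looks roughly like this; it covers all three instantiation paths (explicit construction, view casting, and new-from-template):
import numpy as np

class InfoArray(np.ndarray):
    def __new__(cls, input_array, info=None):
        # Explicit construction: cast the input to a view of this class.
        obj = np.asarray(input_array).view(cls)
        obj.info = info
        return obj

    def __array_finalize__(self, obj):
        if obj is None:
            return  # explicit construction; __new__ already set attributes
        # View casting or new-from-template: inherit from the source array.
        self.info = getattr(obj, 'info', None)

a = InfoArray([1, 2, 3], info='metres')
print(a[1:].info)  # 'metres' -- propagated via __array_finalize__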

Related

In Python, can you change how a method from class 1 acts on class 2 from within class 2?

Basically I have a class which subclasses ndarray and has additional information. When I call np.asarray() on my object, it returns just the numpy array and destroys my additional information.
My question is then this: Is there a way in Python to change how np.asarray() acts on my subclass of ndarray from within my subclass? I don't want to change numpy of course, and I do not want to go through every instance where np.asarray() is called to take care of this.
Short answer: No. Numpy's asarray() doesn't check, for example, whether a special method exists on the class of its argument, so it doesn't provide a way to override its behaviour.
Long answer: It's not possible from your subclass, but you can hot-patch the numpy module in your module-level code to replace the asarray function with your own wrapper. This is a very hacky solution and I don't recommend it, but it may work for you.
_real_asarray = np.asarray

def _new_asarray(a, dtype=None, order=None):
    if isinstance(a, MyClass):
        # special handling here; e.g. return the instance unchanged
        return a
    else:
        return _real_asarray(a, dtype, order)

np.asarray = _new_asarray
No. Numpy's asarray() is coded to instantiate a regular numpy array, and you can't change that without editing asarray() itself or changing the caller's code to call your special method instead of asarray().
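A related point for code you do control: numpy.asanyarray passes ndarray subclasses through unchanged, whereas asarray strips them down to the base class:
import numpy as np

class MyClass(np.ndarray):
    pass

a = np.zeros(3).view(MyClass)
print(type(np.asarray(a)))     # <class 'numpy.ndarray'> -- subclass stripped
print(type(np.asanyarray(a)))  # <class '__main__.MyClass'> -- subclass kept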

Python numpy.ndarray subclasses and zero rank arrays

I am trying to create a subclass of numpy.ndarray. It is very simple, and is just a numpy array with some extra attributes and methods that manipulate those attributes. For the most part, it works fine, however I have a problem when using reductions like np.sum.
First off, I have read both Subclassing ndarray and Zero-Rank Arrays.
It seems that when I create a subclass of ndarray it behaves differently with respect to zero-rank array -> scalar conversion.
In this example I just use the simplest possible derived class, one that doesn't actually do anything:
class XArray(np.ndarray):
    pass

x = np.eye(2)
y = x.view(type=XArray)
print type(np.sum(x)), type(np.sum(y))

<type 'numpy.float64'> <type '__main__.XArray'>
The former is a numpy scalar, the latter is a zero-rank array of my subclass. Overriding __new__ and __array_finalize__ as documented in the array subclassing guide doesn't change this behavior.
First, my problem: this breaks object-orientedness. XArray instances cannot be substituted for ndarray instances transparently without breaking lots of code.
I can fix this by overriding the __array_wrap__ method:
class XArray(np.ndarray):
    def __array_wrap__(self, obj):
        if len(obj.shape) == 0:
            return obj[()]  # unwrap zero-rank arrays to scalars
        else:
            return np.ndarray.__array_wrap__(self, obj)

a = np.sum(np.eye(2).view(XArray))
print type(a)

<type 'numpy.float64'>
I am fine with this, except for two questions:
Is this the right place to do this special-casing? I can't figure out where this conversion happens for normal numpy arrays, so I can't tell where it should happen for my derived class.
Is this enough to make my subclass work, or am I going to keep having compatibility problems? Should I just abandon the idea of subclassing ndarray?
The goal here is to be 100% compatible with regular numpy arrays. It is OK and expected that some operations will lose the derived type information and return an ndarray base class; I am fine with that. I just can't have code written to operate on ndarrays break.

function of a function (property) python

I have a Python class with functions and properties like this:
@property
def xcoords(self):
    ' Returns numpy array. '
    try:
        return self.x_coords
    except AttributeError:
        self.x_coords = self._read_coords('x')
        return self.x_coords

def _read_coords(self, type):
    # read lots of stuff from big file
    return array
This allows me to do this: data.xcoords, nice and simple.
I want to keep this as it is, however I want to define functions which allow me to do this:
data.xcoords.mm
data.xcoords.in
How do I do it? I also want these function to work for other properties of the class such as data.zcoords.mm.
If you really want xcoords to return a numpy array, then people may not expect the value of xcoords to have mm and in_ methods. You should think about whether mm and in_ are really properties of the arrays themselves, or if they are properties of the class you're defining. In the latter case, I would recommend against subclassing ndarray -- just define them as methods of the containing class.
On the other hand, if these are definitely properties of the thing returned by xcoords, then subclassing ndarray is a reasonable approach. Be sure to get it right by defining __new__ and __array_finalize__ as discussed in the docs.
To decide whether you should subclass ndarray, you might consider whether you can see yourself reusing this class elsewhere in your program. (You don't actually have to use it elsewhere, right now -- you just have to be able to see yourself reusing it at some point.) If you can't, then these are probably properties of the containing class. The line of reasoning here is that -- thinking in terms of functions -- if you have a short function foo and a short function bar, and know you will never call them any other way than foo(bar(x)), you might be better off writing foo_bar instead. The same logic applies to classes.
Finally, as larsmans pointed out, in is a keyword in python, and so isn't available for use in this case (which is why I used in_ above).
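For illustration, a sketch of the subclassing route. The Coords name, and the assumption that the stored values are metres, are hypothetical:
import numpy as np

class Coords(np.ndarray):
    def __new__(cls, input_array):
        return np.asarray(input_array).view(cls)

    def __array_finalize__(self, obj):
        pass  # no extra attributes to propagate in this sketch

    @property
    def mm(self):
        return np.asarray(self) * 1000.0  # metres -> millimetres

    @property
    def in_(self):
        return np.asarray(self) / 0.0254  # metres -> inches

xcoords = Coords([0.1, 0.2, 0.3])
print(xcoords.mm)  # [100. 200. 300.]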

Defining "overloaded" functions in python

I really like the syntax of the "magic methods" or whatever they are called in Python, like
class foo:
    def __add__(self, other):  # It can be called like c = a + b
        pass
The call
c = a + b
is then translated to
a.__add__(b)
Is it possible to mimic such behaviour for "non-magic" functions? In numerical computations I need the Kronecker product, and I am eager to have a "kron" function such that
kron(a, b)
is in fact
a.kron(b)
The use case is: I have two similar classes, say, matrix and vector, both having Kronecker product. I would like to call them
a = matrix()
b = matrix()
c = kron(a,b)
a = vector()
b = vector()
c = kron(a,b)
The matrix and vector classes are defined in one .py file and thus share a common namespace. So, what is the best (Pythonic?) way to implement functions like the above? Possible solutions:
1) Have one kron() function and do a type check
2) Have different namespaces
3) ?
The python default operator methods (__add__ and such) are hard-wired; python will look for them because the operator implementations look for them.
However, there is nothing stopping you from defining a kron function that does the same thing; look for __kron__ or __rkron__ on the objects passed to it:
def kron(a, b):
    if hasattr(a, '__kron__'):
        return a.__kron__(b)
    if hasattr(b, '__rkron__'):
        return b.__rkron__(a)
    # Default kron implementation here
    return complex_operation_on_a_and_b(a, b)
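Hypothetical usage of that protocol (the matrix and vector bodies below are placeholders, not real Kronecker products):
class matrix(object):
    def __kron__(self, other):
        return 'matrix kron result'  # real computation would go here

class vector(object):
    def __rkron__(self, other):
        return 'vector rkron result'

print(kron(matrix(), matrix()))  # dispatches to matrix.__kron__
print(kron(object(), vector()))  # falls back to vector.__rkron__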
What you're describing is multiple dispatch or multimethods. Magic methods is one way to implement them, but it's actually more usual to have an object that you can register type-specific implementations on.
For example, http://pypi.python.org/pypi/multimethod/ will let you write
@multimethod(matrix, matrix)
def kron(lhs, rhs):
    pass

@multimethod(vector, vector)
def kron(lhs, rhs):
    pass
It's quite easy to write a multimethod decorator yourself; the BDFL describes a typical implementation in an article. The idea is that the multimethod decorator associates the type signature and method with the method name in a registry, and replaces the method with a generated method that performs type lookup to find the best match.
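A minimal sketch of that registry idea (this is illustrative, not the actual multimethod package's code):
_registry = {}

def multimethod(*types):
    def register(func):
        # Record (type signature -> implementation) under the function name.
        table = _registry.setdefault(func.__name__, {})
        table[types] = func
        def dispatcher(*args):
            # Exact-type lookup only; inheritance is not handled here.
            impl = table.get(tuple(type(arg) for arg in args))
            if impl is None:
                raise TypeError('no match for %r' % (args,))
            return impl(*args)
        return dispatcher
    return register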
Technically speaking, implementing something similar to the "standard" operator (and operator-like - think len() etc) behaviour is not difficult:
def kron(a, b):
    if hasattr(a, '__kron__'):
        return a.__kron__(b)
    elif hasattr(b, '__kron__'):
        return b.__kron__(a)
    else:
        raise TypeError("your error message here")
Now you just have to add a __kron__(self, other) method on the relevant types (assuming you have control over these types or they don't use slots or whatever else that would prevent adding methods outside the class statement's body).
That said, I'd not use a __magic__ naming scheme as in the snippet above, since such names are supposed to be reserved for the language itself.
Another solution would be to maintain a type-to-function mapping and have the "generic" kron function look up the mapping, i.e.:
# kron.py
from somewhere import Matrix, Vector

def matrix_kron(a, b):
    pass  # code here

def vector_kron(a, b):
    pass  # code here

# Map the classes themselves (not their names) to implementations,
# so the lookup by type(a) below actually matches.
KRON_IMPLEMENTATIONS = {
    Matrix: matrix_kron,
    Vector: vector_kron,
}

def kron(a, b):
    for typ in (type(a), type(b)):
        implementation = KRON_IMPLEMENTATIONS.get(typ)
        if implementation is not None:
            return implementation(a, b)
    raise TypeError("your message here")
This solution doesn't work well with inheritance, but it is "less surprising": it requires neither monkey-patching nor __magic__ names.
I think having one single function that delegates the actual computation is a nice way to do it. If the Kronecker product only works on two similar classes, you can even do the type checking in the function:
def kron(a, b):
    if type(a) != type(b):
        raise TypeError('expected two instances of the same class, got %s and %s' % (type(a), type(b)))
    return a._kron_(b)
Then you just need to define a _kron_ method on the class. This is only a basic example; you might want to improve it to handle more gracefully the case where a class doesn't have a _kron_ method, or to handle subclasses (a sketch of that follows below).
Binary operations in the standard library usually have a reverse dual (__add__ and __radd__), but since your operator only works on same-type objects, that isn't useful here.
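A sketch of those improvements, assuming instances of a subclass should be acceptable wherever the base class is:
def kron(a, b):
    # Accept subclasses: b may be an instance of type(a), or vice versa.
    if not (isinstance(b, type(a)) or isinstance(a, type(b))):
        raise TypeError('expected compatible classes, got %s and %s'
                        % (type(a), type(b)))
    if not hasattr(a, '_kron_'):
        raise TypeError('%s does not implement _kron_' % type(a))
    return a._kron_(b)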

numpy coercion problem for left-sided binary operator

I am implementing an array-like object that should be interoperable with standard numpy arrays. I just hit an annoying problem that narrows down to the following:
class MyArray( object ):
    def __rmul__( self, other ):
        return MyArray()  # value not important for current purpose
from numpy import array
print array([1,2,3]) * MyArray()
This yields the following output:
[<__main__.MyArray instance at 0x91903ec>
<__main__.MyArray instance at 0x919038c>
<__main__.MyArray instance at 0x919042c>]
Clearly, rather than calling MyArray().__rmul__( array([1,2,3]) ) as I had hoped, __rmul__ is called for every individual element of the array, and the results are wrapped in an object array. This seems to me non-compliant with Python's coercion rules. More importantly, it renders my left multiplication useless.
Does anybody know a way around this?
(I thought I could fix it using __coerce__, but the linked document explains that it is no longer invoked in response to binary operators...)
It turns out that numpy offers a simple fix for this problem. The following code works as intended.
class MyArray( object ):
    __array_priority__ = 1.  # <- fixes the problem

    def __rmul__( self, other ):
        return MyArray()
More information can be found here.
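To illustrate what the fix buys you: ndarray's own __array_priority__ is 0.0, so any higher value makes ndarray.__mul__ return NotImplemented, and Python then falls back to MyArray.__rmul__:
from numpy import array

# One MyArray comes back instead of an object array of element-wise results.
print type(array([1,2,3]) * MyArray())
# <class '__main__.MyArray'>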
