I'm working with matrices for a project I'm writing in Python. I know that a lot of libraries already exist for manipulating matrices but I'm writing my own so I know exactly what's going on under the hood.
So I have a Matrix base class and a Vector subclass. Both work as expected individually but I'd like a Matrix to be a Vector if initialized with a single line or column.
I tried something like self = Vector(...) when the Matrix is initialized with the right size. But that doesn't seem to affect the object. I also thought of calling the __init__() method of the Vector class but that doesn't suffice because what I want most importantly are the Vector's methods.
Is there a pythonic way of dealing with a situation like this?
This can be done, although it might not be the best way to do it. After all, if the Matrix class is instantiated, one expects the result to be a Matrix instance.
One way of achieving that is to customize the constructor of the Matrix class:
class Matrix:
    def __new__(cls, nrows, ncols):
        # A single-row matrix is created as a Vector instead.
        if nrows == 1:
            inst = super(Matrix, cls).__new__(Vector)
        else:
            inst = super(Matrix, cls).__new__(cls)
        inst.nrows = nrows
        inst.ncols = ncols
        return inst

    def __repr__(self):
        return '{}(nrows={}, ncols={})'.format(
            self.__class__.__name__, self.nrows, self.ncols)

class Vector(Matrix):
    pass
Demo:
>>> Matrix(2, 5)
Matrix(nrows=2, ncols=5)
>>> Matrix(1, 5)
Vector(nrows=1, ncols=5)
Mind that instances are actually created inside the __new__() method, while __init__() is used for initializing the newly created instance.
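Also, as mentioned in a comment below by @Blckknght, creating a Vector instance through the Matrix class can lead to unwanted surprises. Python only calls __init__() automatically when the object returned by __new__() is an instance of the class being called, so Vector's __init__() runs with the arguments given to Matrix(...) if Vector subclasses Matrix, and is skipped entirely (and would have to be called manually) if it does not.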
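To make the __init__() behaviour concrete, here is a minimal sketch. The `calls` list, the `Unrelated` class and the `Odd` class are made up purely for illustration; only Matrix and Vector come from the question.

```python
calls = []  # records which __init__ methods actually ran


class Matrix:
    def __new__(cls, nrows, ncols):
        # Pick the class to instantiate; a single row becomes a Vector.
        target = Vector if nrows == 1 else cls
        return super().__new__(target)

    def __init__(self, nrows, ncols):
        calls.append(("Matrix.__init__", type(self).__name__))
        self.nrows, self.ncols = nrows, ncols


class Vector(Matrix):
    def __init__(self, nrows, ncols):
        calls.append(("Vector.__init__", type(self).__name__))
        super().__init__(nrows, ncols)


class Unrelated:
    def __init__(self):
        calls.append(("Unrelated.__init__", "Unrelated"))


class Odd:
    def __new__(cls):
        # Returns an object that is NOT an instance of Odd,
        # so Python will not call any __init__ on it.
        return object.__new__(Unrelated)


v = Matrix(1, 5)  # a Vector; Vector.__init__ runs because Vector is a Matrix
o = Odd()         # an Unrelated; no __init__ runs at all
```

After running this, `calls` contains only the two entries produced by `Matrix(1, 5)`, and `o` is an uninitialized `Unrelated` instance.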
Depending on your use case, though, it might thus be better to keep things clean and just use a factory for instance creation:
class Matrix:
    def __init__(self, nrows, ncols):
        self.nrows = nrows
        self.ncols = ncols

    def __repr__(self):
        return '{}(nrows={}, ncols={})'.format(
            self.__class__.__name__, self.nrows, self.ncols)

class Vector(Matrix):
    pass

def make_matrix(nrows, ncols):
    if nrows == 1:
        return Vector(nrows, ncols)
    return Matrix(nrows, ncols)
Demo:
>>> make_matrix(1, 5)
Vector(nrows=1, ncols=5)
>>> make_matrix(2, 5)
Matrix(nrows=2, ncols=5)
Of course make_matrix() could also be implemented as a (class/static) method of the Matrix class, but that would make the parent class more tightly coupled with one of its child classes...
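For comparison, a class-method version might look like the sketch below. The method name `create` is an assumption, and the check on both dimensions follows the "single line or column" wording of the question rather than the answer's `nrows == 1` test. Note how Matrix now has to refer to Vector by name, which is the coupling mentioned above.

```python
class Matrix:
    def __init__(self, nrows, ncols):
        self.nrows = nrows
        self.ncols = ncols

    @classmethod
    def create(cls, nrows, ncols):
        # Hypothetical factory: the parent class must know its child here.
        if 1 in (nrows, ncols):
            return Vector(nrows, ncols)
        return cls(nrows, ncols)

    def __repr__(self):
        return '{}(nrows={}, ncols={})'.format(
            self.__class__.__name__, self.nrows, self.ncols)


class Vector(Matrix):
    pass


row = Matrix.create(1, 5)   # Vector
col = Matrix.create(3, 1)   # Vector
full = Matrix.create(2, 5)  # Matrix
```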
Related
I have a three dimensional dataset where the 1st dimension gives the type of the variable and the 2nd and 3rd dimensions are spatial indexes. I am attempting to make this data more user friendly by creating a subclass of ndarray containing the data, but with attributes that have sensible names that point to the appropriate variable dimension. One of the variable types is temperature, which I would like to represent with the attribute .T. I attempt to set it like this:
self.T = self[8,:,:]
However, this clashes with the underlying numpy attribute for transposing an array. Normally, overriding a class attribute is trivial, however in this case I get an exception when I try to re-write the attribute. The following is a minimal example of the same problem:
import numpy as np

class foo(np.ndarray):
    def __new__(cls, input_array):
        obj = np.asarray(input_array).view(cls)
        obj.T = 100.0
        return obj

foo([1,2,3,4])
results in:
Traceback (most recent call last):
  File "tmp.py", line 9, in <module>
    foo([1,2,3,4])
  File "tmp.py", line 6, in __new__
    obj.T = 100.0
AttributeError: attribute 'T' of 'numpy.ndarray' objects is not writable
I have tried using setattr(obj, 'T', 100.0) to set the attribute, but the result is the same.
Obviously, I could just give up and name my attribute .temperature, or something else. However .T will be much more eloquent for the subsequent mathematical expressions which will be done with these data objects. How can I force python/numpy to override this attribute?
For np.matrix subclass, as defined in np.matrixlib.defmatrix:
@property
def T(self):
    """
    Returns the transpose of the matrix.
    ....
    """
    return self.transpose()
T is not a conventional attribute that lives in a __dict__ or __slots__. In fact, you can see this immediately because the result of T changes if you modify the shape or contents of an array.
Since ndarray is a class written in C, it has special descriptors for the dynamic attributes it exposes. T is one of these dynamic attributes, defined as a PyGetSetDef structure. You can't override it by simple assignment, because there is nothing to assign to, but you can make a descriptor that overrides it at the class level.
As @hpaulj's answer suggests, the simplest solution may be to use a property to implement the descriptor protocol for you:
import numpy as np

class foo(np.ndarray):
    @property
    def T(self):
        return self[8, :, :]
More complicated alternatives would be to make your own descriptor type, or even to extend the class in C and write your own PyGetSetDef structure. It all depends on what you are trying to achieve.
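As a rough sketch of the "own descriptor type" alternative: the names `Channel` and `Field` below are hypothetical, and the 9x2x2 array is made-up test data. The descriptor is placed on the subclass, so attribute lookup finds it before ndarray's built-in T.

```python
import numpy as np


class Channel:
    """Hypothetical descriptor exposing one slice along axis 0."""

    def __init__(self, index):
        self.index = index

    def __get__(self, obj, objtype=None):
        if obj is None:
            # Accessed on the class itself: return the descriptor.
            return self
        # Return a plain ndarray view of the selected variable.
        return np.asarray(obj)[self.index]


class Field(np.ndarray):
    # Shadows ndarray.T at the class level.
    T = Channel(8)

    def __new__(cls, input_array):
        return np.asarray(input_array).view(cls)


data = np.arange(36).reshape(9, 2, 2)
f = Field(data)
temperature = f.T          # the 2x2 slice at index 8
transposed = f.transpose() # the transpose is still available by method
```

One descriptor class can then be reused for every variable type (for example `P = Channel(3)`) instead of writing one property per attribute.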
Following Mad Physicist and hpaulj's lead, the solution to my minimal working example is:
import numpy as np

class foo(np.ndarray):
    def __new__(cls, input_array):
        obj = np.asarray(input_array).view(cls)
        return obj

    @property
    def T(self):
        return 100.0

x = foo([1,2,3,4])
print("T is", x.T)
Which results in:
T is 100.0
Sometimes I find it useful to split a bigger class into smaller classes, each with its own methods and attributes, and then access them by assigning an instance of the smaller class to an attribute of the bigger class. This lets me organize the class better: when I work in the console I can use nested dot notation instead of seeing a long list of attributes. For instance, I have an instrument with some parameters that can be grouped together and a method that is linked to these parameters. I would structure the class like this:
class params(object):
    def __init__(self,P,I,D):
        self.P = P
        self.I = I
        self.D = D

    def compute_PID(self):
        pass

class instrument(object):
    def __init__(self,name,SN,P,I,D):
        self.name = name
        self.SN = SN
        self.params = params(P,I,D)

    def switch_on(self):
        pass

myinstrument = instrument('blender','123',45,4,3)
myinstrument.params.P
Is there any drawback to this design pattern? I imagine that it requires more memory, but working with dot notation makes things easier compared to a dictionary.
I'm struggling to understand why my simple code behaves like this. I create two instances, a and b, that each take an array as an argument. Then I define a method to change the array of one instance, but both get changed. Any idea why this happens, and how can I stop the method from changing the other instance?
import numpy as np

class Test:
    def __init__(self, arg):
        self.arg=arg

    def change(self,i,j,new):
        self.arg[i][j]=new

array=np.array([[11,12,13]])
a=Test(array)
b=Test(array)

#prints the same as expected
print(a.arg)
print(b.arg)
print()

a.change(0,0,3)

#still prints the same, even though I did
#not change b.arg
print(a.arg)
print(b.arg)
Because you assigned the same array object to both instances: self.arg=arg stores a reference, not a copy. You can use np.array(x, copy=True) or x.copy() to generate a new array object:
array = np.array([[11,12,13]])
a = Test(array.copy())
b = Test(np.array(array, copy=True))
Alternatively, if your arg is always a np.array, you could do it in the __init__ method (as noted by roganjosh in the comments):
class Test:
    def __init__(self, arg):
        self.arg = np.array(arg, copy=True)
    ...
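A quick way to check that the copy really separates the instances is np.shares_memory(). The sketch below reuses the Test class from the question with the copying constructor; nothing else is assumed.

```python
import numpy as np


class Test:
    def __init__(self, arg):
        # Copy on construction so each instance owns its data.
        self.arg = np.array(arg, copy=True)

    def change(self, i, j, new):
        self.arg[i][j] = new


array = np.array([[11, 12, 13]])
a = Test(array)
b = Test(array)

a.change(0, 0, 3)

print(a.arg)                            # only a's copy was modified
print(b.arg)                            # b keeps the original values
print(np.shares_memory(a.arg, b.arg))   # False: separate buffers
```

The original `array` is left untouched as well, since neither instance holds a reference to it any more.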
In the snippet below, how do I avoid computing the following numpy variables mask, zbar, te , ro and rvol in the procedures Get_mask, Get_K_Shell_Te etc? These variables are large arrays and I have to define at least six more procedures that reuse them. It looks like what I am doing is not a good idea and is slow.
import numpy as np

# this computes various quantities related to the shell in a object oriented way
class Shell_Data:
    def __init__(self, data):
        self.data = data

    def Get_mask(self):
        zbar=self.data['z2a1']
        y=self.data['y']*1000
        mask= np.logical_and(zbar >= 16 ,zbar<= 19 )
        return self.mask

    def Get_Shell_Te(self):
        self.mask =self.Get_mask()
        te =self.data['te'][self.mask]
        ro =self.data['ro'][self.mask]
        rvol =self.data['rvol'][self.mask]
        self.Shell_Te=np.sum(te*ro/rvol)/(np.sum(ro/rvol))
        print "Shell Temperature= %0.3f eV" % (self.Shell_Te)
        return self.Shell_Te

    def Get_Shell_ro(self):
        mask =self.Get_mask()
        te =self.data['te'][mask]
        ro =self.data['ro'][mask]
        rvol =self.data['rvol'][mask]
        radk =self.data['radk'][mask]
        self.Shell_ro=np.sum(radk*ro/rvol)/np.sum(radk/rvol)
        return self.Shell_ro
zbar depends on self.data. If you update self.data, you likely have to re-compute it.
If you can make your data immutable, you can compute these values once, e.g. in the constructor.
If you want to avoid calculating the mask data until it's actually required, you can cache the value, like so:
class Shell_Data(...):
    def __init__(self,...):
        self.cached_mask = None
        ...

    # @property makes an access to self.mask
    # to actually return the result of a call to self.mask()
    @property
    def mask(self):
        if self.cached_mask is None: # Not yet calculated.
            self.cached_mask = self.computeMask()
        return self.cached_mask

    def computeMask(self):
        zbar = ...
        ...
        return np.logical_and(...)

    def someComputation(self):
        # The first access to self.mask will compute it.
        te = self.data['te'][self.mask]
        # The second access will just reuse the same result.
        ro = self.data['ro'][self.mask]
        ...
If you have to mutate self.data, you can cache the computed data, and re-calculate it only when self.data changes. E.g. if you had a setData() method for that, you could recalculate the mask in it, or just set self.cached_mask to None.
(Also, read about instance variables again.
Every method receives the parameter named self, the instance of the object for which it is called. You can access all its instance variables as self.something, exactly the way you access instance variables (and methods) when an object is not called self. If you set an instance variable in one method, you can just access it in another (e.g. self.mask), no need to return it. If you return something from a method, likely it's not worth storing as an instance variable, like self.mask.)
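Putting the caching and invalidation ideas together, a runnable sketch could look like this. The names `ShellData`, `set_data` and `mask_computations` are assumptions made for the example (the counter exists only to show how often the mask is built), and the small arrays are made-up test data.

```python
import numpy as np


class ShellData:
    def __init__(self, data):
        self._data = data
        self._mask = None
        self.mask_computations = 0  # demonstration counter only

    def set_data(self, data):
        # Replacing the data invalidates the cached mask.
        self._data = data
        self._mask = None

    @property
    def mask(self):
        if self._mask is None:  # not yet calculated
            zbar = self._data['z2a1']
            self._mask = np.logical_and(zbar >= 16, zbar <= 19)
            self.mask_computations += 1
        return self._mask

    def shell_te(self):
        # Every access to self.mask reuses the cached array.
        te = self._data['te'][self.mask]
        ro = self._data['ro'][self.mask]
        rvol = self._data['rvol'][self.mask]
        return np.sum(te * ro / rvol) / np.sum(ro / rvol)


data = {
    'z2a1': np.array([15.0, 16.0, 18.0, 20.0]),
    'te': np.array([1.0, 2.0, 4.0, 8.0]),
    'ro': np.ones(4),
    'rvol': np.ones(4),
}
shell = ShellData(data)
first = shell.shell_te()
second = shell.shell_te()          # mask is not recomputed
count_before = shell.mask_computations
shell.set_data(data)               # cache dropped
third = shell.shell_te()           # mask is rebuilt once
```

If the data never changes after construction, functools.cached_property from the standard library (Python 3.8+) would be a shorter way to get the same compute-once behaviour.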
I have a class in Python that acts as a front-end to a C library. This library performs simulations and handles very large arrays of data. The library passes forward a ctypes array and my wrapper converts it into a proper numpy.ndarray.
class SomeClass(object):
    @property
    def arr(self):
        return numpy.array(self._lib.get_arr())
However, in order to make sure that memory problems don't occur, I keep the ndarray data separate from the library data, so changing the ndarray does not cause a change in the true array being used by the library. However, I can pass along a new array of the same shape and overwrite the library's held array.
    @arr.setter
    def arr(self, new_arr):
        self._lib.set_arr(new_arr.ctypes)
So, I can interact with the array like so:
x = SomeClass()
a = x.arr
a[0] += 1
x.arr = a
My desire is to simplify this even more by allowing the syntax to simply be x.arr[0] += 1, which would be more readable and use fewer variables. I am not exactly sure how to go about creating such a wrapper (I have very little experience making wrapper classes/functions) that mimics properties but allows item access as in my example.
How would I go about making such a wrapper class? Is there a better way to accomplish this goal? If you have any advice or reading that could help I would appreciate it very much.
This could work. Array is a proxy for the Numpy/C array:
import numpy

class Array(object):
    def __init__(self):
        # self._lib = ...  (handle to the C library goes here)
        self.np_array = numpy.array(self._lib.get_arr())

    def __getitem__(self, key):
        # Refresh the local copy from the library before reading.
        self.np_array = numpy.array(self._lib.get_arr())
        return self.np_array.__getitem__(key)

    def __setitem__(self, key, value):
        # Update the local copy, then push it back to the library.
        self.np_array.__setitem__(key, value)
        self._lib.set_arr(self.np_array.ctypes)

    def __getattr__(self, name):
        """Delegate to NumPy array."""
        try:
            return getattr(self.np_array, name)
        except AttributeError:
            raise AttributeError(
                "'Array' object has no attribute {}".format(name))
Should behave like this:
>>> a = Array()
>>> a[1]
1
>>> a[1] = 10
>>> a[1]
10
The 10 should end up in your C array too.
I think your descriptor should return an instance of a list-like class that knows about self._lib and updates it during normal operations such as append, __setitem__, __getitem__, etc.
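A minimal sketch of that idea follows. `FakeLib` is a made-up stand-in for the C library (it stores a plain list so the example stays self-contained), and `ArrayProxy` is a hypothetical name; a real wrapper would convert to and from NumPy/ctypes arrays inside the proxy instead.

```python
class FakeLib:
    """Hypothetical stand-in for the C library; holds a plain list."""

    def __init__(self):
        self._data = [0.0, 1.0, 2.0]

    def get_arr(self):
        return list(self._data)  # hands out a copy, like the real wrapper

    def set_arr(self, values):
        self._data = list(values)


class ArrayProxy:
    """List-like view that writes every item assignment back to the library."""

    def __init__(self, lib):
        self._lib = lib

    def __getitem__(self, key):
        return self._lib.get_arr()[key]

    def __setitem__(self, key, value):
        arr = self._lib.get_arr()
        arr[key] = value
        self._lib.set_arr(arr)

    def __len__(self):
        return len(self._lib.get_arr())


class SomeClass:
    def __init__(self):
        self._lib = FakeLib()

    @property
    def arr(self):
        return ArrayProxy(self._lib)

    @arr.setter
    def arr(self, new_arr):
        self._lib.set_arr(new_arr)


x = SomeClass()
x.arr[0] += 1  # read through the proxy, add 1, write back to the library
```

Each item assignment here copies the whole array out and back, so for very large arrays it would be worth batching changes and using the setter once instead.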