How do I avoid recomputing variables in this python class? - python

In the snippet below, how do I avoid recomputing the numpy variables mask, zbar, te, ro and rvol in the methods Get_mask, Get_Shell_Te, etc.? These variables are large arrays, and I have to define at least six more methods that reuse them. It looks like what I am doing is not a good idea and is slow.
import numpy as np

# This computes various quantities related to the shell in an object-oriented way.
class Shell_Data:
    def __init__(self, data):
        self.data = data

    def Get_mask(self):
        zbar = self.data['z2a1']
        y = self.data['y'] * 1000
        self.mask = np.logical_and(zbar >= 16, zbar <= 19)
        return self.mask

    def Get_Shell_Te(self):
        self.mask = self.Get_mask()
        te = self.data['te'][self.mask]
        ro = self.data['ro'][self.mask]
        rvol = self.data['rvol'][self.mask]
        self.Shell_Te = np.sum(te * ro / rvol) / (np.sum(ro / rvol))
        print "Shell Temperature= %0.3f eV" % (self.Shell_Te)
        return self.Shell_Te

    def Get_Shell_ro(self):
        mask = self.Get_mask()
        te = self.data['te'][mask]
        ro = self.data['ro'][mask]
        rvol = self.data['rvol'][mask]
        radk = self.data['radk'][mask]
        self.Shell_ro = np.sum(radk * ro / rvol) / np.sum(radk / rvol)
        return self.Shell_ro

zbar depends on self.data. If you update self.data, you likely have to re-compute it.
If you can make your data immutable, you can compute these values once, e.g. in the constructor.
If you want to avoid calculating the mask data until it's actually required, you can cache the value, like so:
class Shell_Data(...):
    def __init__(self, ...):
        self.cached_mask = None
        ...

    # @property makes an access to self.mask
    # actually return the result of a call to the mask() method.
    @property
    def mask(self):
        if self.cached_mask is None:  # Not yet calculated.
            self.cached_mask = self.computeMask()
        return self.cached_mask

    def computeMask(self):
        zbar = ...
        ...
        return np.logical_and(...)

    def someComputation(self):
        # The first access to self.mask will compute it.
        te = self.data['te'][self.mask]
        # The second access will just reuse the same result.
        ro = self.data['ro'][self.mask]
        ...
def someComputation(self):
# The first access to self.mask will compute it.
te = self.data['te'][self.mask]
# The second access will just reuse the same result.
ro = self.data['ro'][self.mask]
...
If you have to mutate self.data, you can cache the computed data, and re-calculate it only when self.data changes. E.g. if you had a setData() method for that, you could recalculate the mask in it, or just set self.cached_mask to None.
(Also, read up on instance variables again.
Every method receives the parameter named self, the instance of the object on which it is called. You can access all of its instance variables as self.something, exactly the way you access instance variables (and methods) on any other object. If you set an instance variable in one method, you can simply access it in another (e.g. self.mask); there is no need to return it. Conversely, if you return something from a method, it is probably not worth also storing it as an instance variable like self.mask.)
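Putting the above together, here is a minimal sketch of how Shell_Data itself could cache the mask and invalidate it when the data changes; the set_data method name is just an illustration and not part of the original code:

import numpy as np

class Shell_Data(object):
    def __init__(self, data):
        self.data = data
        self.cached_mask = None

    @property
    def mask(self):
        # Computed on first access, then reused by every other method.
        if self.cached_mask is None:
            zbar = self.data['z2a1']
            self.cached_mask = np.logical_and(zbar >= 16, zbar <= 19)
        return self.cached_mask

    def set_data(self, data):
        # Replacing the data invalidates the cached mask.
        self.data = data
        self.cached_mask = None

    def Get_Shell_Te(self):
        te = self.data['te'][self.mask]
        ro = self.data['ro'][self.mask]
        rvol = self.data['rvol'][self.mask]
        self.Shell_Te = np.sum(te * ro / rvol) / np.sum(ro / rvol)
        return self.Shell_Te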

Related

Loop through functions in a class

I have several functions inside a class that apply augmentations to a numpy image array. I would like to know how to loop through all of them and apply them. For example:
class Augmentation():
    def rotation(data):
        return rotated_image

    def shear(data):
        return sheared_image

    def elasticity(data):
        return enlarged_image

A = Augmentation()
My end result should be a stack of all my functions' outputs. For example: my data is (64, 64) in shape, so after all my augmentations I should have a final numpy array of shape (12, 64, 64). I currently tried creating different functions and then used
stack_of_images = np.stack(f1, f2, f3, ...., f12)
stack_of_images.shape = (12, 64, 64)
I am using 12 different functions to augment numpy image arrays. I insert 1 image (64, 64) and I get 12 images stacked (12, 64, 64).
You can do this by accessing the attribute dictionary of the type.
You can either get it with vars(Augmentation) or Augmentation.__dict__. Then, just iterate through the dict, and check for functions with callable.
NOTE: querying vars(A) or A.__dict__ (note it's the instance, not the class) will NOT include anything defined in the class, and in this case would be just {}. You don't even have to create an instance in the first place.
NOTE 2: It seems like you should tag all methods with the @staticmethod decorator instead. Otherwise calling any method on an instance, like A.shear(), would pass A as data, which is most likely not desired.
class foo:
    @staticmethod
    def bar(data):
        ...
Example:
methods = []
for attrname, attrvalue in vars(Augmentation).items():
    if callable(attrvalue):
        methods.append(attrvalue)

print([i.__name__ for i in methods])
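As a minimal sketch of the stacking step, assuming each augmentation is a @staticmethod that takes a (64, 64) array and returns one of the same shape (apply_all is a made-up helper name, not a library function):

import numpy as np

def apply_all(augmenter_cls, image):
    # Collect the augmentation functions defined on the class.
    # getattr unwraps @staticmethod, so each entry is directly callable.
    funcs = [getattr(augmenter_cls, name)
             for name in vars(augmenter_cls)
             if not name.startswith('__') and callable(getattr(augmenter_cls, name))]
    # Apply every augmentation to the same input image and stack the results.
    return np.stack([f(image) for f in funcs])

image = np.random.rand(64, 64)
stacked = apply_all(Augmentation, image)  # shape: (number_of_methods, 64, 64)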

Subclass Initialization Issue with Data not from Superclass

I'm having an issue trying to make a rectangle class that is subclassed from a Points class, which (bear with me) is subclassed from np.ndarray.
I'm aware (or so I think) of the nuances of subclassing np.ndarray. What I'm trying to achieve is being able to construct a rectangle from a center point, height, width, and rotation, without needing to directly feed in discrete points. Below is my implementation of the rectangle constructor; the @init_args wrapper takes type-hinted inputs and calls the class constructor, which works. It appears that it isn't even getting into the __init__ method and jumps straight to the Points __new__() method.
class Points(np.ndarray):
    def __new__(cls, list_of_points):
        if isinstance(list_of_points, np.ndarray):
            obj = list_of_points
        else:
            obj = np.vstack(list_of_points).view(cls)
        return obj

class Rectangle(Points):
    @init_args
    def __init__(self, rect_attributes: RectAttributes):
        self.points = np.zeros((4, 2))
        print('in rect constructor')
        self.attributes = rect_attributes
        self.construct_rect_from_attributes()
        super().__init__(self.points)
I don't think I really need the super().__init__ call, I was just hoping that it would delay the Points constructor. What I really need is just to be able to initialize the rectangle with attributes that aren't contained in the superclass.
The comment by chepner makes sense. Another thing that's confusing to me is that you want Rectangle to be a subclass of ndarray, yet you assign it a self.points (initialized to zeros) which is a different ndarray. ndarray.__new__ itself constructs the array and its data, so your Rectangle is essentially already constructed from the list_of_points passed to Points.__new__.
Meanwhile you have a self.points that has no association with the points passed to the constructor.
From your example it's not clear what rect_attributes is supposed to be or what construct_rect_from_attributes() is supposed to do, but I think what you probably want is something like this (and IMO you should allow construction of a Rectangle from a list of points as well, but that's up to your requirements):
class Rectangle(Points):
    # put in the type hints anything else array-like
    def __new__(cls, data: Union[list, np.ndarray, RectAttributes]):
        if isinstance(data, RectAttributes):
            # assuming you want to store the original attributes somewhere
            attributes = data
            # rect_from_attributes should be a classmethod and
            # should return an array of points
            data = cls.rect_from_attributes(data)
        else:
            # a default value if the rect was constructed another
            # way, or inversely construct a RectAttributes from points
            attributes = ...
        obj = super().__new__(cls, data)
        obj.attributes = attributes
        return obj
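For completeness, rect_from_attributes could compute the four corner points from the attributes. Here is a minimal sketch, assuming RectAttributes is a simple container holding center, width, height and rotation; its real definition isn't shown in the question:

import numpy as np

class RectAttributes:
    # Hypothetical container; the real definition isn't shown in the question.
    def __init__(self, center, width, height, rotation=0.0):
        self.center = np.asarray(center, dtype=float)
        self.width = width
        self.height = height
        self.rotation = rotation  # in radians

# Inside Rectangle:
@classmethod
def rect_from_attributes(cls, attrs):
    w, h = attrs.width / 2.0, attrs.height / 2.0
    # Corners of an axis-aligned rectangle centred on the origin.
    corners = np.array([[-w, -h], [w, -h], [w, h], [-w, h]])
    # Rotate, then translate to the requested centre.
    c, s = np.cos(attrs.rotation), np.sin(attrs.rotation)
    rotation = np.array([[c, -s], [s, c]])
    return corners.dot(rotation.T) + attrs.center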

Instance variable as function of other instance variables

Is it possible to define an instance variable in a class as a function of another? I haven't gotten it to work unless you redefine the "function instance variable" all the time.
Basically you could have a scenario where you have one instance variable that is a list of integers, and want to have the sum of these as an instance variable, that automatically redefines every time the list is updated.
Is this possible?
class Example:
    list_variable = []
    sum_variable = sum(list_variable)

    def __init__(self, list_variable):
        self.list_variable = list_variable
        return
This will result in sum_variable = 0 unless you change it.
I understand that this is far from a major issue, you could either define sum_variable as a method or redefine it every time you change list_variable, I'm just wondering if it's possible to skip those things/steps.
Python offers the property decorator, which lets you use sum_variable syntactically just like the attribute in your example:
class Example:
    list_variable = []

    def __init__(self, list_variable):
        self.list_variable = list_variable
        return

    @property
    def sum_variable(self):
        return sum(self.list_variable)
e = Example(list_variable=[10, 20, 30])
e.sum_variable # returns 60
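Because the sum is recomputed on every access, it automatically reflects later changes to the list:

e.list_variable.append(40)
e.sum_variable  # now returns 100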

Creation and persistence of a class property in Python

I am writing a simple class to retrieve a signal x digitized at constant sampling rate Fs. Digitization begins at time t0. Given the signal length N = len(x), the sampling rate, and the initial time, the signal's time base is uniquely determined. I rarely need to access the time base, but I would like an easy means of doing so when needed. Below, I've implemented a minimal working example of my desired time-base functionality using a property() decorator:
import numpy as np

class Signal(object):
    def __init__(self, x, Fs, t0):
        self.x = x
        self.Fs = Fs
        self.t0 = t0
        return

    @property
    def t(self):
        return self.t0 + (np.arange(len(self.x)) / self.Fs)
I'd like to know about the creation and "persistence" of the time-base property Signal.t. Take the example use case below:
x = np.arange(10)
Fs = 1.
t0 = 0.
sig = Signal(x, Fs, t0)
print sig.t
When is the time-base array t generated? During initialization or dynamically when print sig.t is called? If the sig.t attribute is dynamically calculated, will it persist beyond the print command? (i.e. has memory been allocated to store the time base as an object attribute?).
While the above is a trivial example, my typical signals are very large, and I do not want the memory overhead of creating and storing the time base for every signal. I'd like an easy means of dynamically generating the time base on an as-needed basis; however, the time base should not persist as an object attribute after its one-off use (e.g. for creating a plot of the raw signal).
If the property() decorator does not provide this desired functionality (i.e. minimal memory overhead, ease of use on an as-needed basis), what should I be using? Simply a class method? Or is there a different, more optimal solution? Thanks!
Every time you access sig.t, the t function you decorated with @property is (re)run, and the result is used as the value of sig.t. That means the array is created on demand and never stored on the sig object.
You seem to want this, but I'd be wary about it. Property accesses are generally expected to be cheap, and this property isn't. Consider making an ordinary method instead.
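A minimal sketch of that alternative; the method name time_base is made up for illustration:

import numpy as np

class Signal(object):
    def __init__(self, x, Fs, t0):
        self.x = x
        self.Fs = Fs
        self.t0 = t0

    def time_base(self):
        # Recomputed on every call; nothing extra is stored on the instance.
        return self.t0 + (np.arange(len(self.x)) / self.Fs)

sig = Signal(np.arange(10), 1., 0.)
t = sig.time_base()  # explicit call each time the time base is needed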
When is the time-base array t generated?
When it is used, i.e. when you write print sig.t.
If the sig.t attribute is dynamically calculated, will it persist beyond the print command? (i.e. has memory been allocated to store the time base as an object attribute?).
Nope. The next time your code references sig.t, a new object will be created.
If the property() decorator does not provide this desired functionality (i.e. minimal memory overhead, ease of use on an as-needed basis), what should I be using? Simply a class method? Or is there a different, more optimal solution? Thanks!
There are differing opinions here I suspect... you can modify the code so that you cache the value and return the same thing each call:
class Signal(object):
    def __init__(self, x, Fs, t0):
        self.x = x
        self.Fs = Fs
        self.t0 = t0
        self._t = None
        return

    @property
    def t(self):
        if self._t is not None:
            return self._t
        self._t = self.t0 + (np.arange(len(self.x)) / self.Fs)
        return self._t
But here you have no way of telling the class that t should be re-computed unless you make a setter...
If t isn't going to change after initialization, then why not just make it a plain public attribute?
class Signal(object):
    def __init__(self, x, Fs, t0):
        self.x = x
        self.Fs = Fs
        self.t0 = t0
        self.t = self.t0 + (np.arange(len(self.x)) / self.Fs)
In your example, you are dynamically generating the Signal.t value every time you access the attribute t, because you are essentially calling the t() getter to access it. So you are not storing the value, just returning it.
Whenever the @property decorator is used in a class, it is quite common for that function to act as a "getter" for a "private" (not really) variable, and sometimes there is a "setter" for that "private" variable.
By "private" I really mean attributes that are named to reflect that they should not be accessed directly. However, in Python you can access any attribute, so there aren't truly private variables, since Python objects can be changed quite easily.
If you want to store your values, then you should do something like this.
import numpy as np

class Signal(object):
    def __init__(self, x, Fs, t0):
        self.x = x
        self.Fs = Fs
        self.t0 = t0
        self._t = None
        return

    @property
    def t(self):
        if self._t is None:
            self._t = self.t0 + (np.arange(len(self.x)) / self.Fs)
        return self._t

    @t.setter
    def t(self, value):
        self._t = value
The above example will only calculate it once and store it inside _t, but you get the point. When the @property decorator is used, there is usually an underlying variable that is used to retrieve and store the value. Hope that helps.

How do I downcast in python

I have two classes - one which inherits from the other. I want to know how to cast to (or create a new variable of) the subclass. I have searched around a bit, and mostly 'downcasting' like this seems to be frowned upon, and there are some slightly dodgy workarounds like setting instance.__class__ - though this doesn't seem like a nice way to go.
eg.
http://www.gossamer-threads.com/lists/python/python/871571
http://code.activestate.com/lists/python-list/311043/
Sub-question: is downcasting really that bad? If so, why?
I have a simplified code example below - basically I have some code that creates a Peak object after having done some analysis of x, y data. Outside this code I know that the data is PSD data (power spectral density), so it has some extra attributes. How do I downcast from Peak to Psd_Peak?
"""
Two classes
"""
import numpy as np
class Peak(object) :
"""
Object for holding information about a peak
"""
def __init__(self,
index,
xlowerbound = None,
xupperbound = None,
xvalue= None,
yvalue= None
):
self.index = index # peak index is index of x and y value in psd_array
self.xlowerbound = xlowerbound
self.xupperbound = xupperbound
self.xvalue = xvalue
self.yvalue = yvalue
class Psd_Peak(Peak) :
"""
Object for holding information about a peak in psd spectrum
Holds a few other values over and above the Peak object.
"""
def __init__(self,
index,
xlowerbound = None,
xupperbound = None,
xvalue= None,
yvalue= None,
depth = None,
ampest = None
):
super(Psd_Peak, self).__init__(index,
xlowerbound,
xupperbound,
xvalue,
yvalue)
self.depth = depth
self.ampest = ampest
self.depthresidual = None
self.depthrsquared = None
def peakfind(xdata,ydata) :
'''
Does some stuff.... returns a peak.
'''
return Peak(1,
0,
1,
.5,
10)
# Find a peak in the data.
p = peakfind(np.random.rand(10),np.random.rand(10))
# Actually the data i used was PSD -
# so I want to add some more values tot he object
p_psd = ????????????
edit
Thanks for the contributions.... I'm afraid I was feeling rather downcast (geddit?), since the answers thus far seem to suggest I spend time hard-coding converters from one class type to another. I have come up with a more automatic way of doing this - basically looping through the attributes of the class and transferring them from one to the other. How does this smell to people - is it a reasonable thing to do, or does it spell trouble ahead?
def downcast_convert(ancestor, descendent):
    """
    automatic downcast conversion.....
    (NOTE - not type-safe -
    if ancestor isn't a super class of descendent, it may well break)
    """
    for name, value in vars(ancestor).iteritems():
        # print "setting descendent", name, ": ", value, "ancestor", name
        setattr(descendent, name, value)
    return descendent
You don't actually "cast" objects in Python. Instead you generally convert them -- take the old object, create a new one, throw the old one away. For this to work, the class of the new object must be designed to take an instance of the old object in its __init__ method and do the appropriate thing (sometimes, if a class can accept more than one kind of object when creating it, it will have alternate constructors for that purpose).
You can indeed change the class of an instance by pointing its __class__ attribute to a different class, but that class may not work properly with the instance. Furthermore, this practice is IMHO a "smell" indicating that you should probably be taking a different approach.
In practice, you almost never need to worry about types in Python. (With obvious exceptions: for example, trying to add two objects. Even in such cases, the checks are as broad as possible; here, Python would check for a numeric type, or a type that can be converted to a number, rather than a specific type.) Thus it rarely matters what the actual class of an object is, as long as it has the attributes and methods that whatever code is using it needs.
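For instance, a minimal sketch of such an alternate constructor for the classes in the question (from_peak is a made-up name, not an existing API; the depth and ampest values are arbitrary):

class Psd_Peak(Peak):
    # ... __init__ exactly as in the question ...

    @classmethod
    def from_peak(cls, peak, depth=None, ampest=None):
        # Build a Psd_Peak from an existing Peak plus the PSD-specific extras.
        return cls(peak.index,
                   peak.xlowerbound,
                   peak.xupperbound,
                   peak.xvalue,
                   peak.yvalue,
                   depth=depth,
                   ampest=ampest)

# Find a peak, then convert it:
p = peakfind(np.random.rand(10), np.random.rand(10))
p_psd = Psd_Peak.from_peak(p, depth=111, ampest=222)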
See the following example. Also, be sure to obey the LSP (Liskov Substitution Principle).
class ToBeCastedObj:
    def __init__(self, *args, **kwargs):
        pass  # whatever you want to state

    # original methods
    # ...

class CastedObj(ToBeCastedObj):
    def __init__(self, *args, **kwargs):
        pass  # whatever you want to state

    @classmethod
    def cast(cls, to_be_casted_obj):
        casted_obj = cls()
        casted_obj.__dict__ = to_be_casted_obj.__dict__
        return casted_obj

    # new methods you want to add
    # ...
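A quick usage sketch of that cast classmethod:

original = ToBeCastedObj()
casted = CastedObj.cast(original)  # a CastedObj sharing original's __dict__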
This isn't a downcasting problem (IMHO). peakfind() creates a Peak object - it can't be downcast because it's not a Psd_Peak object - and later you want to create a Psd_Peak object from it. In something like C++, you'd likely rely on the default copy constructor - but that's not going to work, even in C++, because your Psd_Peak class requires more parameters in its constructor. In any case, Python doesn't have a copy constructor, so you end up with the rather verbose (fred=fred, jane=jane) stuff.
A good solution may be to create an object factory and pass the type of Peak object you want to peakfind(), letting it create the right one for you.
def peak_factory(peak_type, index, *args, **kw):
    """Create Peak objects

    peak_type   Type of peak object wanted
                (you could list types)
    index       index
                (you could list params for the various types)
    """
    # optionally sanity check parameters here

    # create object of desired type and return
    return peak_type(index, *args, **kw)

def peakfind(peak_type, xdata, ydata, **kw):
    # do some stuff...
    return peak_factory(peak_type,
                        1,
                        0,
                        1,
                        .5,
                        10,
                        **kw)

# Find a peak in the data.
p = peakfind(Psd_Peak, np.random.rand(10), np.random.rand(10), depth=111, ampest=222)
