Subclassing and extending numpy.ndarray

I need some basic data class representations and I want to use existing numpy classes, since they already offer great functionality.
However, I'm not sure if this is the way to do it (although it works so far). So here is an example:
The Position class should act like a simple numpy.array, but it should map the attributes .x, .y and .z to the three array components. I overrode the __new__ method, which returns an ndarray view of the initial array. To allow access to and modification of the array, I defined properties along with setters for each component.
import numpy as np
class Position(np.ndarray):
    """Represents a point in a 3D space
    Adds setters and getters for x, y and z to the ndarray.
    """
    def __new__(cls, input_array=(np.nan, np.nan, np.nan)):
        obj = np.asarray(input_array).view(cls)
        return obj
    @property
    def x(self):
        return self[0]
    @x.setter
    def x(self, value):
        self[0] = value
    @property
    def y(self):
        return self[1]
    @y.setter
    def y(self, value):
        self[1] = value
    @property
    def z(self):
        return self[2]
    @z.setter
    def z(self, value):
        self[2] = value
However, this seems like a bit too much code for such basic logic, and I'm wondering whether I'm doing it the "correct" way. I also need a bunch of other classes, like Direction, which will have quite a few other features (auto-norm on change etc.), and before I start integrating numpy, I thought I'd ask you…

I would argue ndarray is the wrong choice here; you probably want a simple namedtuple.
>>> import collections
>>> Position = collections.namedtuple('Positions', 'x y z')
>>> p = Position(1, 2, 3)
>>> p
Positions(x=1, y=2, z=3)
You could get the unpacking like so
>>> x, y, z = p
>>> x, y, z
(1, 2, 3)
>>>
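If the extra behaviour mentioned in the question is still needed, a namedtuple can carry methods by subclassing it. The following is only a minimal sketch; the norm() helper is a hypothetical example and does not come from the question:

```python
import collections
import math


class Position(collections.namedtuple('Position', 'x y z')):
    """Immutable 3D point with named fields (illustrative sketch)."""

    __slots__ = ()  # no per-instance __dict__, so instances stay as light as a tuple

    def norm(self):
        # Hypothetical helper: Euclidean length of the position vector.
        return math.sqrt(self.x ** 2 + self.y ** 2 + self.z ** 2)


p = Position(3.0, 4.0, 0.0)
x, y, z = p          # unpacking still works
length = p.norm()    # 5.0
```

Note that such a Position stays immutable, so the setters from the question have no direct equivalent here.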

Related

Creating a set of instances involving float

I have a class holding a list of floats x (and y, generated from x, so if x is equivalent, y is also equivalent). Once initialized, the instance does not change. I would like to make a set of the instances (to use .add()), so I tried to make the class hashable:
class X:
    def __init__(self,x,y):
        self.x = x
        self.y = y
    def __hash__(self):
        return hash(tuple(self.x))
    def __eq__(self,other):
        return (
            self.__class__ == other.__class__ and
            self.x == other.x
        )
But because of the floating point inaccuracy, the set will recognize two very close instances as different. I would like to set the __eq__ to be something like
    def __eq__(self,other):
        diff = np.max(np.asarray(self.x)-np.asarray(other.x))
        if diff<1e-6:
            return True
        else:
            return False
but this does not solve the floating point problem.
I could use a tuple (x,y) for this problem, but I do not need to compare y, and the real class I work on is a little more complicated.

You could use math.isclose from the math module in the standard library to compare floats, and, perhaps, round (or truncate) the value used to produce the hash to the number of decimals used by default by isclose. (this last value could be parametrized)
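import math
class X:
    def __init__(self,x,y):
        self.x = x
        self.y = y
    def __hash__(self):
        # x is a list, so round each element; 9 decimals matches the default rel_tol of math.isclose
        return hash(tuple(round(v, 9) for v in self.x))
    def __eq__(self,other):
        # math.isclose compares two scalars, so compare the lists element by element
        return (
            self.__class__ == other.__class__ and
            len(self.x) == len(other.x) and
            all(math.isclose(a, b) for a, b in zip(self.x, other.x))
        )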
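One caveat with combining a closeness-based __eq__ and a rounding-based __hash__: a set relies on equal objects having equal hashes, and two values that compare close can still round to different digits. A possible way around this, sketched here under the assumption that a fixed number of decimals is an acceptable tolerance, is to derive both methods from the same rounded key. The names DIGITS and _key are hypothetical:

```python
class X:
    DIGITS = 6  # assumed tolerance: values are compared after rounding to 6 decimals

    def __init__(self, x, y):
        self.x = list(x)
        self.y = list(y)

    def _key(self):
        # Hypothetical helper: one rounded tuple used for both hashing and equality.
        return tuple(round(v, self.DIGITS) for v in self.x)

    def __hash__(self):
        return hash(self._key())

    def __eq__(self, other):
        return self.__class__ == other.__class__ and self._key() == other._key()


a = X([0.1 + 0.2, 1.0], [0, 0])
b = X([0.3, 1.0], [1, 1])
unique = {a, b}   # a and b collapse into a single entry
```

Values that sit right on a rounding boundary can still land in different buckets, so this trades exact tolerance semantics for a consistent hash/equality pair.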

Change the underlying data representation with the descriptor protocol

Suppose I have an existing class, for example doing some mathematical stuff:
import math
class Vector:
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def norm(self):
        return math.sqrt(math.pow(self.x, 2) + math.pow(self.y, 2))
Now, for some reason, I'd like Python not to store the members x and y like ordinary variables. I'd rather have Python store them internally as strings, or store them in a dedicated buffer, maybe for interoperability with some C code. So (for the string case) I built the following descriptor:
class MyStringMemory(object):
    def __init__(self, convert):
        self.convert = convert
    def __get__(self, obj, objtype):
        print('Read')
        return self.convert(self.prop)
    def __set__(self, obj, val):
        print('Write')
        self.prop = str(val)
    def __delete__(self, obj):
        print('Delete')
And I wrap the existing vector class in a new class where members x and y become MyStringMemory:
class StringVector(Vector):
    def __init__(self, x, y):
        self.x = x
        self.y = y
    x = MyStringMemory(float)
    y = MyStringMemory(float)
Finally, some driving code:
v = StringVector(1, 2)
print(v.norm())
v.x, v.y = 10, 20
print(v.norm())
After all, I replaced the internal representation of x and y to be strings without any change in the original class, but still with its full functionality.
I just wonder: Will that concept work universally or do I run into serious pitfalls? As I said, the main idea is to store the data into a specific buffer location that is later on accessed by a C code.
Edit: The intention of what I'm doing is as follows. Currently, I have a nicely working program where some physical objects, all of type MyPhysicalObj interact with each other. The code inside the objects is vectorized with Numpy. Now I'd also like to vectorize some code over all objects. For example, each object has an energy that is computed by a complicated vectorized code per-object. Now I'd like to sum up all energies. I can iterate over all objects and sum up, but that's slow. So I'd rather have that property energy for each object automatically stored into a globally predefined buffer, and I can just use numpy.sum over that buffer.

There is one pitfall regarding python descriptors.
Using your code, you will reference the same value, stored in StringVector.x.prop and StringVector.y.prop respectively:
v1 = StringVector(1, 2)
print('current StringVector "x": ', StringVector.__dict__['x'].prop)
v2 = StringVector(3, 4)
print('current StringVector "x": ', StringVector.__dict__['x'].prop)
print(v1.x)
print(v2.x)
will have the following output:
Write
Write
current StringVector "x": 1
Write
Write
current StringVector "x": 3
Read
3.0
Read
3.0
I suppose this is not what you want. To store a unique value per object, inside the object itself, make the following changes:
class MyNewStringMemory(object):
    def __init__(self, convert, name):
        self.convert = convert
        self.name = '_' + name
    def __get__(self, obj, objtype):
        print('Read')
        return self.convert(getattr(obj, self.name))
    def __set__(self, obj, val):
        print('Write')
        setattr(obj, self.name, str(val))
    def __delete__(self, obj):
        print('Delete')
class StringVector(Vector):
    def __init__(self, x, y):
        self.x = x
        self.y = y
    x = MyNewStringMemory(float, 'x')
    y = MyNewStringMemory(float, 'y')
v1 = StringVector(1, 2)
v2 = StringVector(3, 4)
print(v1.x, type(v1.x))
print(v1._x, type(v1._x))
print(v2.x, type(v2.x))
print(v2._x, type(v2._x))
Output:
Write
Write
Write
Write
Read
Read
1.0 <class 'float'>
1 <class 'str'>
Read
Read
3.0 <class 'float'>
3 <class 'str'>
Also, you could definitely save the data in a centralized store by using the descriptor's __set__ method.
Refer to this document: https://docs.python.org/3/howto/descriptor.html
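The centralized-store idea could look roughly like the following. This is only a sketch under stated assumptions: the descriptor name BufferedFloat is hypothetical, the shared buffer is a standard-library array of doubles standing in for whatever buffer the C code would read, and __set_name__ requires Python 3.6 or later. Each instance remembers only its slot index; the value itself lives in the shared buffer.

```python
from array import array


class BufferedFloat:
    """Hypothetical descriptor keeping every instance's value in one shared buffer."""

    def __init__(self):
        self.buffer = array('d')  # contiguous storage of C doubles

    def __set_name__(self, owner, name):
        # Name of the per-instance attribute that remembers the slot index.
        self.slot_attr = '_' + name + '_slot'

    def __get__(self, obj, objtype=None):
        if obj is None:           # accessed on the class: expose the descriptor itself
            return self
        return self.buffer[getattr(obj, self.slot_attr)]

    def __set__(self, obj, value):
        slot = getattr(obj, self.slot_attr, None)
        if slot is None:
            # First write for this instance: claim the next free slot.
            setattr(obj, self.slot_attr, len(self.buffer))
            self.buffer.append(float(value))
        else:
            self.buffer[slot] = float(value)


class MyPhysicalObj:
    energy = BufferedFloat()

    def __init__(self, energy):
        self.energy = energy


objs = [MyPhysicalObj(e) for e in (1.5, 2.5, 4.0)]
objs[1].energy = 3.5
total = sum(MyPhysicalObj.energy.buffer)  # one pass over the shared buffer
```

In this sketch slots are never released when an object is garbage-collected, so a real implementation would need some form of slot recycling.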

If you need a generic converter ('convert') like the one you wrote, this is the way to go.
The biggest downside will be performance when you need to create a lot of instances (I assume you might, since the class is called Vector). This will be slow, since Python class instantiation is slow.
In that case you might consider using namedtuple; the docs show a scenario similar to yours.
As a side note: if possible, why not create a dict with the string representations of x and y in the __init__ method, and then keep using x and y as normal variables without all the converting?

Properly Implementing Python Star Operator for a Custom Class

I have a Python class called Point, that is basically a holder for an x and y value with added functionality for finding distance, angle, and such with another Point.
For passing a point to some other function that may require the x and y to be separate, I would like to be able to use the * operator to unpack my Point to just the separate x, y values.
I have found that this is possible if I override __getitem__ and raise a StopIteration exception for any index beyond 1, with x corresponding to 0 and y to 1.
However it doesn't seem proper to raise a StopIteration when a ValueError/KeyError would be more appropriate for values beyond 1.
Does anyone know of the correct way to implement the * operator for a custom class? Preferably, a way that does not raise StopIteration through __getitem__?

You can implement the same by overriding the __iter__ magic method, like this
class Point(object):
    def __init__(self, x, y):
        self.x, self.y = x, y
    def __iter__(self):
        # yields the attribute values in sorted attribute-name order: x, then y
        return (self.__dict__[item] for item in sorted(self.__dict__))
def printer(x, y):
    print(x, y)
printer(*Point(2, 3))
Output
2 3
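Sorting __dict__ works here only because the attribute names happen to sort into the desired order, and any attribute added later would also be yielded. A minimal alternative sketch is a generator-based __iter__ that names the coordinates explicitly. The as_pair function is a hypothetical consumer used only for illustration:

```python
class Point(object):
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __iter__(self):
        # Yield the coordinates in a fixed, explicit order.
        yield self.x
        yield self.y


def as_pair(x, y):
    # Hypothetical consumer that needs x and y as separate arguments.
    return (x, y)


result = as_pair(*Point(2, 3))
```

Because unpacking with * only needs an iterable, no __getitem__ and no manually raised StopIteration are required.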

Here's another way to do it that uses __dict__ but gives you precise control over the order without having to perform a sort on the keys for every access:
def __iter__(self): return (self.__dict__[item] for item in 'xy')
Of course, you could stash a sorted tuple, list or string of keys somewhere, but I think using a literal makes sense here.
And while I'm at it, here's one way to do the setter & getter methods.
def __getitem__(self, key): return (self.x, self.y)[key]
def __setitem__(self, key, val): setattr(self, 'xy'[key], val)

python class Vector, change from 3dimension to ndimension

I made this class that computes some operations for 3d vectors. Is there any way I can change the code so that it computes the same operations but for vectors of any dimension n?
import sys
class Vector:
    def __init__(self,x,y,z):
        self.x= x
        self.y= y
        self.z= z
    def __repr__(self):
        return "(%f,%f,%f)"%(self.x,self.y,self.z)
    def __add__(self,other):
        return (self.x+other.x,self.y+other.y,self.z+other.z)
    def __sub__(self,other):
        return (self.x-other.x,self.y-other.y,self.z-other.z)
    def __norm__(self):
        return (self.x**2+self.y**2+self.z**2)**0.5
    def escalar(self,other):
        return (self.x*other.x+self.y*other.y+self.z*other.z)
    def __mod__(self,other):
        return (self.x%other.x,self.y%other.y,self.z%other.z)
    def __neg__(self):
        return (-self.x,-self.y,-self.z)

As an example, for a n dimensional vector, something like
class Vector:
    def __init__(self, components):
        self.components = components # components should be a list
    def __add__(self, other):
        assert len(other.components) == len(self.components)
        added_components = []
        for i in range(len(self.components)):
            added_components.append(self.components[i] + other.components[i])
        return Vector(added_components)
    def dimensions(self):
        return len(self.components)
would be possible. Note that the __add__ override returns a new Vector instance, not a tuple as in your case. Then adapt your other methods likewise.
There are more 'clever' ways of adding elements from two lists, into a third. You should really not do it this way if performance is an issue though (or in any other case but an exercise, IMO). Look into numpy.

Use a list to store the coefficients rather than explicit variables. For negating, adding, subtracting etc. you just iterate over the lists.
In terms of initialisation, you need to use *args for the input. Have a look at this post for an explanation of how it works: https://stackoverflow.com/a/3394898/1178052
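Putting the two suggestions together (a list of coefficients plus *args in the constructor), an n-dimensional version might be sketched as follows. The _check helper is a hypothetical addition, and the methods return Vector instances rather than tuples, which differs from the code in the question:

```python
class Vector:
    def __init__(self, *components):
        # *components collects any number of positional arguments into a tuple.
        self.components = list(components)

    def __repr__(self):
        return "(" + ",".join("%f" % c for c in self.components) + ")"

    def _check(self, other):
        # Hypothetical helper: both operands must have the same dimension.
        if len(self.components) != len(other.components):
            raise ValueError("dimension mismatch")

    def __add__(self, other):
        self._check(other)
        return Vector(*(a + b for a, b in zip(self.components, other.components)))

    def __sub__(self, other):
        self._check(other)
        return Vector(*(a - b for a, b in zip(self.components, other.components)))

    def __neg__(self):
        return Vector(*(-a for a in self.components))

    def norm(self):
        return sum(a * a for a in self.components) ** 0.5

    def escalar(self, other):
        # Scalar (dot) product, keeping the method name from the question.
        self._check(other)
        return sum(a * b for a, b in zip(self.components, other.components))


u = Vector(1, 2, 3, 4)
v = Vector(4, 3, 2, 1)
total = u + v
dot = u.escalar(v)
```

The remaining operations from the question, such as __mod__, follow the same zip pattern.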

setitem() implementation in Python for Point(x,y) class

I'm trying to make a Point class in python. I already have some of the functions, like __str__ or __getitem__, implemented, and it works great.
The only problem I'm facing is that my implementation of __setitem__ does not work; the others are doing fine.
Here is my Point class, and the last function is my __setitem__():
class point(object):
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y
    def __str__(self):
        return "point(%s,%s)" % (self.x, self.y)
    def __getitem__(self, item):
        return (self.x, self.y)[item]
    def __setitem__(self, x, y):
        [self.x, self.y][x] = y
It should work like this:
p = point(2, 3)
p[0] = 1 # sets the x coordinate to 1
p[1] = 10 # sets the y coordinate to 10
Am I even right, should __setitem__() work like this?
Thanks!

Let self.data and only self.data hold the coordinate values.
If self.x and self.y were to also store these values there is a chance self.data and self.x or self.y will not get updated consistently.
Instead, make x and y properties that look up their values from self.data.
class Point(object):
    def __init__(self,x=0,y=0):
        self.data=[x, y]
    def __str__(self):
        return "point(%s,%s)"%(self.x,self.y)
    def __getitem__(self,item):
        return self.data[item]
    def __setitem__(self, idx, value):
        self.data[idx] = value
    @property
    def x(self):
        return self.data[0]
    @property
    def y(self):
        return self.data[1]
The statement
[self.x, self.y][x]=y
is interesting but problematic. Let's pick it apart:
[self.x, self.y] causes Python to build a new list, with values self.x and self.y.
somelist[x]=y causes Python to assign value y to the xth index of somelist. So this new list somelist gets updated. But this has no effect on self.data, self.x or self.y. That is why your original code was not working.

This is a pretty old post, but the solution to your problem is very simple:
class point(object):
    def __init__(self,x=0,y=0):
        self.x=x
        self.y=y
    def __str__(self):
        return "point(%s,%s)"%(self.x,self.y)
    def __getitem__(self,item):
        return self.__dict__[item]
    def __setitem__(self,item,value):
        self.__dict__[item] = value
Each instance has its own dictionary (__dict__) holding the attributes set on it. So with this you can call:
In [26]: p=point(1,1)
In [27]: print p
point(1,1)
In [28]: p['x']=2
In [29]: print p
point(2,1)
In [30]: p['y']=5
In [31]: print p
point(2,5)
It is more readable than your index-like reference.

Let's strip this down to the bare minimum:
x, y = 2, 3
[x, y][0] = 1
print(x)
This will print out 2.
Why?
Well, [x, y] is a brand-new list containing two elements. When you do reassign its first member to 1, that just changes the brand-new list, so its first element is now 1 instead of 2. It doesn't turn the number 2 into the number 1.
Since your code is essentially identical to this, it has the same problem. As long as your variables have immutable values, you can't mutate the variables.
You could fix it by doing something like this:
x, y = [2], [3]
[x, y][0][0] = 1
print(x[0])
Now you'll get 1.
Why? Well, [x, y] is a new list with two elements, each of which is a list. You're not replacing its first element with something else, you're replacing the first element of its first element with something else. But its first element is the same list as x, so you're also replacing x's first element with something else.
If this is a bit hard to keep straight in your head… well, that's usually a sign that you're doing something you probably shouldn't be. (Also, the fact that you're using x for a parameter that means "select x or y" and y for a parameter that means "new value" makes it a whole lot more confusing…)
There are many simpler ways to do the same thing:
Use an if/else statement instead of trying to get fancy.
Use a single list instead of two integer values: self.values[x] = y. (That's unutbu's answer.)
Use a dict instead of two integer values: self.values['xy'[x]] = y.
Use setattr(self, 'xy'[x], y).
Use a namedtuple instead of trying to build the same thing yourself.
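As an illustration of the setattr option from the list above, here is a minimal sketch. The _fields tuple is a hypothetical name chosen for this example; it plays the role of the 'xy' string literal:

```python
class Point(object):
    _fields = ('x', 'y')  # hypothetical name: maps index 0 -> 'x', 1 -> 'y'

    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

    def __getitem__(self, index):
        return getattr(self, self._fields[index])

    def __setitem__(self, index, value):
        # Indexing the tuple raises IndexError for an out-of-range index.
        setattr(self, self._fields[index], value)

    def __str__(self):
        return "point(%s,%s)" % (self.x, self.y)


p = Point(2, 3)
p[0] = 1    # sets the x coordinate to 1
p[1] = 10   # sets the y coordinate to 10
text = str(p)
```

Unlike the temporary list in the question, setattr rebinds the attribute on the instance itself, so the change is visible through p.x and p.y.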

This works in Python 2.6, and I guess it works in 2.7 as well.
The __setitem__ method accepts 3 arguments (self, index, value).
In this case we use index as an int to retrieve the name of the coordinate from the __slots__ tuple (check the documentation of __slots__; it is really useful for performance).
Remember that with __slots__ only the x and y attributes are allowed! So:
p = Point()
p.x = 2
print(p.x) # 2.0
p.z = 4 # AttributeError
print(p.z) # AttributeError
This way is faster than using the @property decorator (when you start to have 10000+ instances):
class Point(object):
    @property
    def x(self):
        return self._data[0] # where self._data = [x, y]
    ...
so this is my tip for you :)
class Point(object):
    __slots__ = ('x', 'y') # Coordinates
    def __init__(self, x=0, y=0):
        '''
        You can use the constructor in many ways:
        Point() - void arguments
        Point(0, 1) - passing two arguments
        Point(x=0, y=1) - passing keywords arguments
        Point(**{'x': 0, 'y': 1}) - unpacking a dictionary
        Point(*[0, 1]) - unpacking a list or a tuple (or a generic iterable)
        Point(*Point(0, 1)) - copy constructor (unpack the point itself)
        '''
        self.x = x
        self.y = y
    def __setattr__(self, attr, value):
        object.__setattr__(self, attr, float(value))
    def __getitem__(self, index):
        '''
        p = Point()
        p[0] # is the same as self.x
        p[1] # is the same as self.y
        '''
        return self.__getattribute__(self.__slots__[index])
    def __setitem__(self, index, value):
        '''
        p = Point()
        p[0] = 1
        p[1] = -1
        print(repr(p)) # <Point (1.000000, -1.000000)>
        '''
        self.__setattr__(self.__slots__[index], value) # converted to float automatically by __setattr__
    def __len__(self):
        '''
        p = Point()
        print(len(p)) # 2
        '''
        return 2
    def __iter__(self):
        '''
        allow you to iterate
        p = Point()
        for coord in p:
            print(coord)
        for i in range(len(p)):
            print(p[i])
        '''
        return iter([self.x, self.y])
    def __str__(self):
        return "(%f, %f)" % (self.x, self.y)
    def __repr__(self):
        return "<Point %s>" % self

You may find it a lot easier to use namedtuple for this:
from collections import namedtuple
Point = namedtuple('Point', ['x', 'y'])
fred = Point(1.0, -1.0)
#Result: Point(x=1.0, y=-1.0)
The main drawback is that you can't poke values into a namedtuple: it's immutable. In most applications that's a feature, not a bug.
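To make the immutability concrete, the short sketch below shows that attribute assignment on a namedtuple is rejected, and that the standard _replace method returns a modified copy instead. The variable names assigned and moved are illustrative only:

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])
fred = Point(1.0, -1.0)

try:
    fred.x = 2.0               # assignment is rejected on a namedtuple
    assigned = True
except AttributeError:
    assigned = False

moved = fred._replace(x=2.0)   # builds a new Point; fred itself is unchanged
```

So code written in the p[0] = 1 style from the question would have to rebind the name, for example p = p._replace(x=1).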

What's happening in __setitem__ is that it builds a temporary list, sets the value, then throws away this list without changing self.x or self.y. Try this for __setitem__:
    def __setitem__(self,coord,val):
        if coord == 0:
            self.x = val
        else:
            self.y = val
This is quite an abuse of __setitem__, however... I'd advise figuring out a different way of setting the x/y coordinates if possible. Using p.x and p.y is going to be much faster than p[0] and p[1] pretty much no matter how you implement it.

Here's an example:
from collections import namedtuple
Deck = namedtuple('cards',['suits','values'])
class FrenchDeck(object):
    deck = [str(i) for i in range(2,11)]+list('JQKA')
    suits = "heart clubs spades diamond".split()
    def __init__(self):
        self.totaldecks = [Deck(each,every) for each in self.suits for every in self.deck]
    def __len__(self):
        return len(self.totaldecks)
    def __getitem__(self,index):
        return self.totaldecks[index]
    def __setitem__(self,key,value):
        self.totaldecks[key] = value
CardDeck = FrenchDeck()
CardDeck[0] = "asdd" # needs __setitem__()
print(CardDeck[0])
If you don't use the __setitem__(), you will get an error
TypeError: 'FrenchDeck' object does not support item assignment
