How to make the elements of a NumPy array property settable? - python

I have a property of a Python object that returns an array.
Now, I can set the setter of that property such that the whole array is settable.
However, I'm missing how to make the elements by themselves settable through the property.
I would expect from a user perspective (given an empty SomeClass class):
>>> x = SomeClass()
>>> x.array = [1, 2, 3]
>>> x.array[1] = 4
>>> print (x.array)
[1, 4, 3]
Now, suppose that SomeClass.array is a property defined as
class SomeClass(object):
    def __init__(self, a):
        self._a = a
    @property
    def array(self):
        return self._a
    @array.setter
    def array(self, a):
        self._a = a
Everything still works as above. Also if I force simple NumPy arrays on the setter.
However, if I replace the return self._a with a NumPy function (that goes in a vectorised way through the elements) and I replace self._a = a with the inverse function, of course the entry does not get set anymore.
Example:
import numpy as np
class SomeClass(object):
    def __init__(self, a):
        self._a = np.array(a)
    @property
    def array(self):
        return np.sqrt(self._a)
    @array.setter
    def array(self, a):
        self._a = np.power(a, 2)
Now, the user sees the following output:
>>> x = SomeClass([1, 4, 9])
>>> print (x.array)
array([1., 2., 3.])
>>> x.array[1] = 13
>>> print (x.array) # would expect an array([1., 13., 3.]) now!
array([1., 2., 3.])
I think I understand where the problem comes from (the array that NumPy creates during the operation gets its element changed but it doesn't have an effect on the stored array).
What would be a proper implementation of SomeClass to make single elements of the array write-accessible individually and thus settable as well?
Thanks a lot for your hints and help,
TheXMA
The points @Jaime made below his answer helped me a lot! Thanks!

Since arrays are mutable objects, the individual items are settable even without a setter function:
>>> class A(object):
...     def __init__(self, a):
...         self._a = np.asarray(a)
...     @property
...     def arr(self):
...         return self._a
...
>>> a = A([1,2,3])
>>> a.arr
array([1, 2, 3])
>>> a.arr = [4,5,6] # There is no setter...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: can't set attribute
>>> a.arr[1] = 7 # ...but the array is mutable
>>> a.arr
array([1, 7, 3])
This is one of the uses of tuples vs. lists, since the latter are mutable, but the former aren't. Anyway, to answer your question: making individual items settable is easy, as long as your getter returns the object itself.
The fancier behavior in your second example doesn't seem easy to get in any simple way. I think you could make it happen by having your SomeClass.array property return an instance of a custom class that either subclasses ndarray or wraps an instance of it. Either way would be a lot of nontrivial work.
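A full ndarray replacement is a lot of work, but if only item reads and writes need to round-trip through the transformation, a thin wrapper may be enough. The following is a minimal sketch under that assumption, not a complete solution: the proxy class `_SqrtView` is a hypothetical name, it keeps a reference to the stored array, and it applies the square root on reads and the square on writes.

```python
import numpy as np


class _SqrtView:
    """Hypothetical proxy: reads return sqrt of the stored values, writes store squares."""

    def __init__(self, stored):
        self._stored = stored  # a reference to the owner's array, not a copy

    def __getitem__(self, index):
        return np.sqrt(self._stored[index])

    def __setitem__(self, index, value):
        # Apply the inverse transformation before writing into the stored array.
        self._stored[index] = np.power(value, 2)

    def __array__(self, dtype=None, copy=None):
        # Lets np.asarray(view) produce the transformed values.
        result = np.sqrt(self._stored)
        return result if dtype is None else result.astype(dtype)

    def __repr__(self):
        return repr(np.sqrt(self._stored))


class SomeClass:
    def __init__(self, a):
        self._a = np.array(a, dtype=float)

    @property
    def array(self):
        return _SqrtView(self._a)

    @array.setter
    def array(self, a):
        self._a = np.power(np.asarray(a, dtype=float), 2)


x = SomeClass([1, 4, 9])
x.array[1] = 13          # stored as 169 in x._a
print(x.array)           # shows the transformed values, with 13 at index 1
```

Note that this sketch only forwards indexing. Arithmetic such as `x.array + 1` is not supported on the proxy; convert with `np.asarray(x.array)` first.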

Related

Convert list of objects to numpy array without coercing objects into array

I want to turn a Python list of objects into a NumPy array of objects for easier manipulation of the list, e.g. index slicing. However, NumPy coerces list-like objects into arrays during conversion:
import numpy as np
class ListLike(object):
    def __init__(self, list):
        self.list = list
        self.otherdata = 'foo'
    def __getitem__(self, i):
        return self.list[i]
    def __len__(self):
        return len(self.list)
o1 = ListLike([1,2])
o2 = ListLike([3,4])
listOfObj = [o1, o2]
numpyArray = np.array(listOfObj)
>>> numpyArray
array([[1, 2],
       [3, 4]])
>>> type(numpyArray[0])
<class 'numpy.ndarray'>
This means I lose the objects and other data and methods associated with them
>>> numpyArray[0].otherdata
AttributeError: 'numpy.ndarray' object has no attribute 'otherdata'
How do I get this result instead?
>>> numpyArray
array([(<__main__.ListLike object at 0x000000002786DAF0>, <__main__.ListLike object at 0x000000002E56B700>)], dtype=object)
>>> type(numpyArray[0])
<class '__main__.ListLike'>
It would be even better if I could also control the depth of the coercion with something like this
o3 = ListLike([o1,o2])
numpyArray = np.array([o3, o3], depth=1)
>>> numpyArray
array([[(<__main__.ListLike object at 0x000000002786DAF0>, <__main__.ListLike object at 0x000000002E56B700>)], [(<__main__.ListLike object at 0x000000002786DAF0>, <__main__.ListLike object at 0x000000002E56B700>)]], dtype=object)
In [35]: arr = np.empty(2, object)
In [36]: arr[:] = listOfObj
In [37]: arr
Out[37]:
array([<__main__.ListLike object at 0x7f3b9c6f1df0>,
       <__main__.ListLike object at 0x7f3b9c6f19d0>], dtype=object)
Element methods/attributes have to be accessed with a list comprehension, just as with the list:
In [39]: arr[0].otherdata
Out[39]: 'foo'
In [40]: [a.otherdata for a in listOfObj]
Out[40]: ['foo', 'foo']
In [41]: [a.otherdata for a in arr]
Out[41]: ['foo', 'foo']
frompyfunc could also be used for this, though timing is about the same:
In [44]: np.frompyfunc(lambda a: a.otherdata,1,1)(arr)
Out[44]: array(['foo', 'foo'], dtype=object)
I think you want to achieve this:
class ListLike(np.array):
    def __init__(self, _list):
        super().__init__(_list)
        self.otherdata = 'foo'
But you can't subclass np.array, because it is a built-in function rather than a class, though you could try subclassing np.ndarray.
Instead, I've managed to write:
class ListLike(object):
    def __init__(self, _list):
        self._list = _list
        self.otherdata = 'foo'
    def __getitem__(self, key): return self._list[key]
    def __str__(self): return self._list.__str__()
    #Note 1
    #def __len__(self): return self._list.__len__()
Tested with:
o1 = ListLike([1,2])
o2 = ListLike([3,4])
numpyArray = np.array([o1, o2], dtype=object)
#Note 2
#numpyArray = np.array([o1, o2, 888], dtype=object)
print(numpyArray[0], numpyArray[0].otherdata, end='.\t')
print("Expected: [1,2] foo")
Note 1
This works as long as you don't implement a __len__ method for the ListLike class.
Note 2
If you insist on implementing __len__, then append an extra item to the list on this line (a bit of sorcery), for example:
An empty item: '', None, []
An instance with a different length: ListLike([7])
Another object: 5, ... (Ellipsis), 'hello', 0.1
(I couldn't figure out why np.array doesn't honor dtype=object here.)

Can I use numpy array with dtype=object to share list of arbitrary type across class instances?

Here is what I want to achieve:
I have a class MyData that holds some sort of data.
Now in class A I store a sequence of data with attribute name data_list and type List[MyData].
Suppose MyData instances are data with different type index. Class A is a management class. I need A to hold all the data to implement sampling uniformly from all data.
But some other operations that are type-specific also need to be done. So a base class B and derived classes B1, B2, ... are designed to account for each type of data. An instance of class A has a list of B instances as a member, each storing the data points of one type. Code that illustrates this: B.data_list = A.data_list[start_index:start_index+offset].
A has methods that return some of the data, and B has methods that may modify some of the data.
Now here is the problem: I need to pass data by reference, so that any modification by a member function of B is also visible from the side of A.
If I use a Python built-in list to store the data, modifications by B won't be visible to A. I did some experiments using np.array(data_list, dtype=object), and it seemed to work. But I'm not familiar with this kind of usage and am not sure whether it works for data of any type, whether there will be performance concerns, etc.
Any suggestions or alternatives? Thanks!!
Illustrating code:
class A:
    def __init__(self, data_list, n_segment):
        self.data_list = data_list
        data_count = len(data_list)
        segment_length=data_count // n_segment
        self.segments = [self.data_list[segment_length*i:segment_length*(i+1)] for i in range(n_segment)]
        self.Bs = [B(segment) for segment in self.segments]
    def __getitem__(self, item):
        return self.data_list[item]
class B:
    def __init__(self, data_list):
        self.data_list = data_list
    def modify(self, index, data):
        self.data_list[index]=data
A_data_list = [1,2,3,4,5,6,7,8,9]
A_instance = A(A_data_list, n_segment=3)
print(A_instance[0]) # get 1
A_instance.Bs[0].modify(0,2) # modify A[0] to be 2
print(A_instance[0]) # still get 1
Note that in the above example changing A_data_list to numpy array will solve my problem, but in my case elements in list are objects which cannot be stacked into numpy arrays.
In class A, the segments are all copies of portions of data_list, because slicing a Python list creates a new list. Thus the data_list held by each item of Bs is a copy as well. When you modify values through A.Bs, the copies are modified, but not the corresponding elements of A.data_list.
With NumPy, basic slicing returns views that share memory with the original array instead. So when a value is modified, it affects both A.Bs and A.data_list. It is still bad form though.
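To make the copy semantics of list slicing concrete, here is a small sketch. The `Item` class is a hypothetical stand-in for the question's MyData objects. It shows that a slice is a new list holding references to the same objects: rebinding a slot in the slice is invisible to the original list, while mutating an object in place is visible through both.

```python
class Item:
    """Hypothetical stand-in for the question's MyData objects."""

    def __init__(self, value):
        self.value = value


data_list = [Item(1), Item(2), Item(3), Item(4)]
segment = data_list[0:2]  # a new list that references the same Item objects

# Rebinding a slot of the slice only changes the slice.
segment[0] = Item(99)
print(data_list[0].value)  # 1

# Mutating an object in place is visible through both lists.
segment[1].value = 42
print(data_list[1].value)  # 42
```

So if B only ever mutates the objects it holds, plain list slices already behave "by reference"; the problem appears only when B replaces an element outright.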
Here is how to fix your classes so that the proper values are modified:
class A:
    def __init__(self, data_list, n_segment):
        self.data_list = data_list
        data_count = len(data_list)
        segment_length = data_count // n_segment
        r = range(0, (n_segment + 1) * segment_length, segment_length)
        slices = [slice(i, j) for i, j in zip(r, r[1:])]
        self.Bs = [B(self.data_list, slice_) for slice_ in slices]
    def __getitem__(self, item):
        return self.data_list[item]
class B:
    def __init__(self, data_list, slice_):
        self.data_list = data_list
        self.data_slice = slice_
    def modify(self, index, data):
        a_ix = list(range(*self.data_slice.indices(len(self.data_list))))[index]
        self.data_list[a_ix] = data
Test:
A_data_list = [1,2,3,4,5,6,7,8,9]
a = A(A_data_list, n_segment=3)
>>> a[0]
1
a.Bs[0].modify(0, 2) # modify A[0] to be 2
>>> a[0]
2
a.Bs[1].modify(1, -5)
>>> vars(a)
{'data_list': [2, 2, 3, 4, -5, 6, 7, 8, 9],
... }
a.Bs[2].modify(-1, -1) # modify last element of segment #2
>>> vars(a)
{'data_list': [2, 2, 3, 4, -5, 6, 7, 8, -1],
... }
>>> a.Bs[0].modify(3, 0)
IndexError: ... list index out of range
Note: This updated answer would also deal with arbitrary slices, including, hypothetically, ones with a step greater than 1.
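As a side note on the index translation, a range object supports indexing directly, so the intermediate list is not strictly needed. The sketch below isolates that step in a hypothetical helper named `resolve_index`; it is an illustration of the idea, not part of the classes above.

```python
def resolve_index(slice_, length, index):
    """Map an index within a segment to an index into the full list.

    slice_.indices(length) normalizes the slice to (start, stop, step);
    indexing the resulting range handles negative indices and steps.
    """
    return range(*slice_.indices(length))[index]


data = list("abcdefghi")

middle = slice(3, 6)
print(resolve_index(middle, len(data), 0))    # 3
print(resolve_index(middle, len(data), -1))   # 5

stepped = slice(1, None, 3)                   # covers indices 1, 4, 7
print(resolve_index(stepped, len(data), 2))   # 7
```

An index outside the segment still raises IndexError, since the range itself is bounded by the slice.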

"Pythonic" class level recursion?

I am creating a class that inherits from collections.UserList that has some functionality very similar to NumPy's ndarray (just for exercise purposes). I've run into a bit of a roadblock regarding recursive functions involving the modification of class attributes:
Let's take the flatten method, for example:
class Array(UserList):
    def __init__(self, initlist):
        self.data = initlist
    def flatten(self):
        # recursive function
        ...
Above, you can see that there is a singular parameter in the flatten method, being the required self parameter. Ideally, a recursive function should take a parameter which is passed recursively through the function. So, for example, it might take a lst parameter, making the signature:
Array.flatten(self, lst)
This solves the problem of having to set lst to self.data, which consequently will not work recursively, because self.data won't be changed. However, having that parameter in the function is going to be ugly in use and hinder the user experience of an end user who may be using the function.
So, this is the solution I've come up with:
def flatten(self):
    self.data = self.__flatten(self.data)
def __flatten(self, lst):
    ...
    return result
Another solution could be to nest __flatten in flatten, like so:
def flatten(self):
    def __flatten(lst):
        ...
        return result
    self.data = __flatten(self.data)
However, I'm not sure if nesting would be the most readable as flatten is not the only recursive function in my class, so it could get messy pretty quickly.
Does anyone have any other suggestions? I'd love to know your thoughts, thank you!
A recursive method need not take any extra parameters that are logically unnecessary for the method to work from the caller's perspective; the self parameter is enough for recursion on a "child" element to work, because when you call the method on the child, the child is bound to self in the recursive call. Here is an example:
from itertools import chain
class MyArray:
    def __init__(self, data):
        self.data = [
            MyArray(x) if isinstance(x, list) else x
            for x in data]
    def flatten(self):
        return chain.from_iterable(
            x.flatten() if isinstance(x, MyArray) else (x,)
            for x in self.data)
Usage:
>>> a = MyArray([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
>>> list(a.flatten())
[1, 2, 3, 4, 5, 6, 7, 8]
Since UserList is an iterable, you can use a helper function to flatten nested iterables, which can deal likewise with lists and Array objects:
from collections import UserList
from collections.abc import Iterable
def flatten_iterable(iterable):
    for item in iterable:
        if isinstance(item, Iterable):
            yield from flatten_iterable(item)
        else:
            yield item
class Array(UserList):
    def __init__(self, initlist):
        self.data = initlist
    def flatten(self):
        self.data = list(flatten_iterable(self.data))
a = Array([[1, 2], [3, 4]])
a.flatten(); print(a) # prints [1, 2, 3, 4]
b = Array([Array([1, 2]), Array([3, 4])])
b.flatten(); print(b) # prints [1, 2, 3, 4]
barrier of abstraction
Write array as a separate module. flatten can be generic like the example implementation here. This differs from a_guest's answer in that only lists are flattened, not all iterables. This is a choice you get to make as the module author -
# array.py
from collections import UserList
def flatten(t): # generic function
    if isinstance(t, list):
        for v in t:
            yield from flatten(v)
    else:
        yield t
class array(UserList):
    def flatten(self):
        return list(flatten(self.data)) # specialization of generic function
why modules are important
Don't forget you are the module user too! You get to reap the benefits from both sides of the abstraction barrier created by the module -
As the author, you can easily expand, modify, and test your module without worrying about breaking other parts of your program
As the user, you can rely on the module's features without having to think about how the module is written or what the underlying data structures might be
# main.py
from array import array
t = array([1,[2,3],4,[5,[6,[7]]]]) # <- what is "array"?
print(t.flatten())
[1, 2, 3, 4, 5, 6, 7]
As the user, we don't have to answer "what is array?" any more than we have to answer "what is dict?" or "what is iter?" We use these features without having to understand their implementation details. Their internals may change over time, but if the interface stays the same, our programs will continue to work without requiring change.
reusability
Good programs are reusable in many ways. See Python's built-in functions for proof of this, or see the guiding principles of the Unix philosophy -
Write programs that do one thing and do it well.
Write programs to work together.
If we want to use flatten in other areas of our program, we can reuse it easily -
# otherscript.py
from array import flatten
result = flatten(something)
Typically, all methods of a class have at least one argument which is called self in order to be able to reference the actual object this method is called on.
If you don't need self in your function, but you still want to include it in a class, you can use @staticmethod and just include a normal function like this:
class Array(UserList):
    def __init__(self, initlist):
        self.data = initlist
    @staticmethod
    def flatten():
        # recursive function
        ...
Basically, @staticmethod allows you to make any function a method that can be called on a class or an instance of a class (object). So you can do this:
arr = Array([])
arr.flatten()
as well as this:
Array.flatten()
Here is some further reference from the Python docs: https://docs.python.org/3/library/functions.html#staticmethod
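Applied to the original question, a static method fits best as the recursive helper that takes the list explicitly, while a regular method stays the public entry point. This is a minimal sketch under that assumption; the helper name `_flatten` and the list-only flattening rule are illustrative choices, not something the question prescribes.

```python
from collections import UserList


class Array(UserList):
    @staticmethod
    def _flatten(lst):
        """Recursively flatten nested lists; needs no instance state, hence no self."""
        result = []
        for item in lst:
            if isinstance(item, list):
                result.extend(Array._flatten(item))
            else:
                result.append(item)
        return result

    def flatten(self):
        # Public method: no extra parameter for the caller to supply.
        self.data = self._flatten(self.data)


arr = Array([1, [2, [3, 4]], 5])
arr.flatten()
print(arr)  # [1, 2, 3, 4, 5]
```

The caller only ever sees `arr.flatten()`, and the recursion parameter stays an implementation detail.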

Storing passed data in object twice with `attrs` package

I am creating a data provider class that will hold data, perform transformations and make it available to other classes.
If the user creates an instance of this class and passes some data at instantiation, I would like to store it twice: once for all transformations and once as a copy of the original data. Let's assume the data itself has a copy method.
I am using the attrs package to create classes, but would also be interested in best approaches to this in general (perhaps there is a better way of getting what I am after?)
Here is what I have so far:
@attr.s
class DataContainer(object):
    """Interface for managing data. Reads and writes data, acts as a provider to other classes.
    """
    data = attr.ib(default=attr.Factory(list))
    data_copy = data.copy()
    def my_func(self, param1='all'):
        """Do something useful"""
        return param1
This doesn't work: AttributeError: '_CountingAttr' object has no attribute 'copy'
I also cannot call data_copy = self.data.copy(), I get the error: NameError: name 'self' is not defined.
The working equivalent without the attrs package would be:
class DataContainer(object):
    """Interface for managing data. Reads and writes data, acts as a provider to other classes.
    """
    def __init__(self, data):
        "Init method, saving passed data and a backup copy"
        self.data = data
        self.data_copy = data
EDIT:
As pointed out by @hynek, my simple init method above needs to be corrected to make an actual copy of the data: i.e. self.data_copy = data.copy(). Otherwise both self.data and self.data_copy would point to the same object.
You can do two things here.
The first one you've found yourself: you use __attrs_post_init__.
The second one is to have a default:
>>> import attr
>>> @attr.s
... class C:
...     x = attr.ib()
...     _x_backup = attr.ib()
...     @_x_backup.default
...     def _copy_x(self):
...         return self.x.copy()
>>> l = [1, 2, 3]
>>> i = C(l)
>>> i
C(x=[1, 2, 3], _x_backup=[1, 2, 3])
>>> i.x.append(4)
>>> i
C(x=[1, 2, 3, 4], _x_backup=[1, 2, 3])
JFTR, your example of
def __init__(self, data):
    self.data = data
    self.data_copy = data
is wrong because you’d assign the same object twice which means that modifying self.data also modifies self.data_copy and vice versa.
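The aliasing problem is easy to reproduce without attrs. The sketch below is a plain-class illustration, with `copy.copy` from the standard library swapped in for `data.copy()` so that it also works for objects that lack a `copy` method. That swap is my assumption, not something the question requires.

```python
import copy


class DataContainer:
    def __init__(self, data):
        self.data = data                  # same object as the caller's data
        self.data_copy = copy.copy(data)  # independent shallow copy


original = [1, 2, 3]
container = DataContainer(original)

container.data.append(4)
print(original)             # [1, 2, 3, 4]  (container.data is the same list)
print(container.data_copy)  # [1, 2, 3]     (the backup is unaffected)
```

For nested data, a shallow copy still shares the inner objects; `copy.deepcopy` would be the safer choice there.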
After looking through the documentation a little more deeply (scroll right to the bottom), I found that there is a kind of post-init hook for classes that are created by attrs.
You can just include a special __attrs_post_init__ method that can do the more complicated things one might want to do in an __init__ method, beyond simple assignment.
Here is my final working code:
In [1]: @attr.s
   ...: class DataContainer(object):
   ...:     """Interface for managing data. Reads and writes data,
   ...:     acts as a provider to other classes.
   ...:     """
   ...:
   ...:     data = attr.ib()
   ...:
   ...:     def __attrs_post_init__(self):
   ...:         """Perform additional init work on instantiation.
   ...:         Make a copy of the raw input data.
   ...:         """
   ...:         self.data_copy = self.data.copy()
In [2]: some_data = np.array([[1, 2, 3], [4, 5, 6]])
In [3]: foo = DataContainer(some_data)
In [4]: foo.data
Out[5]:
array([[1, 2, 3],
       [4, 5, 6]])
In [6]: foo.data_copy
Out[7]:
array([[1, 2, 3],
       [4, 5, 6]])
Just to be doubly sure, I checked to see that the two attributes are not referencing the same object. In this case they are not, which is likely thanks to the copy method on the NumPy array.
In [8]: foo.data[0,0] = 999
In [9]: foo.data
Out[10]:
array([[999,   2,   3],
       [  4,   5,   6]])
In [11]: foo.data_copy
Out[12]:
array([[1, 2, 3],
       [4, 5, 6]])
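Since the question also asked about general approaches, it may be worth noting that the standard library's dataclasses module offers an analogous hook, `__post_init__`. The following is a sketch of that alternative technique, not of attrs itself; the use of `copy.deepcopy` and of nested lists in place of a NumPy array are my assumptions for a dependency-free example.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class DataContainer:
    data: list
    # Not an __init__ argument; filled in by __post_init__ below.
    data_copy: list = field(init=False, repr=False)

    def __post_init__(self):
        # Runs right after the generated __init__ has assigned self.data.
        self.data_copy = copy.deepcopy(self.data)


foo = DataContainer([[1, 2, 3], [4, 5, 6]])
foo.data[0][0] = 999
print(foo.data)       # [[999, 2, 3], [4, 5, 6]]
print(foo.data_copy)  # [[1, 2, 3], [4, 5, 6]]
```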

Shuffling a list of functions with random.shuffle

I have some functions:
def feeling():
    ...
def homesick():
    ...
def miss():
    ...
I'd like to put them in a list, shuffle them, and call each of them in succession:
import random
prompts = [feeling, homesick, miss]
My idea was to call each function like this:
random.shuffle(prompts)()
But this throws a
TypeError: 'NoneType' object is not callable
What am I doing wrong, and how can I get this to work?
Your task is to choose one of these functions at random. Here's a small demo that does what you're trying to do, but correctly.
>>> f = [sum, max, min]
>>> random.shuffle(f)
>>> f.pop()([1, 2, 3]) # looks like we picked max. Alternatively, `f[0](...)`
3
Or, if it's only one function you want, there's no need to use random.shuffle at all. Use random.choice instead.
>>> random.choice(f)([1, 2, 3])
3
Why Your Error Occurs
random.shuffle performs an in-place shuffle, as the docs mention.
>>> y = list(range(10))
>>> random.shuffle(y)
>>> y
[6, 3, 4, 1, 5, 8, 9, 0, 7, 2]
So, when you call the function, expect nothing in return. In other words, expect None in return.
Further, calling () on an object invokes its __call__ method. Since NoneType objects do not have such a method, this errors out with TypeError. For an object to be callable, you'd need -
class Foo:
    def __init__(self, x):
        self.x = x
    def __call__(self, y):
        return self.x + y
>>> f = Foo(10)
>>> f(20)
30
As an exercise, try removing __call__ and rerunning the code. Calling f(20) should then raise the same kind of TypeError, this time reporting that the 'Foo' object is not callable.
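Coming back to the original goal of calling every function once in random order, here is a minimal sketch. The function bodies are placeholders of my own, since the question elides them; the point is only that the shuffle and the calls are separate steps.

```python
import random


# Placeholder bodies; the question does not show what these functions do.
def feeling():
    return "feeling"


def homesick():
    return "homesick"


def miss():
    return "miss"


prompts = [feeling, homesick, miss]

random.shuffle(prompts)  # shuffles the list in place and returns None
results = [prompt() for prompt in prompts]  # call each function in its new order

# If the original order must be preserved, random.sample returns a new list.
shuffled = random.sample(prompts, k=len(prompts))
```

Either way, the list of functions is what gets called, never the return value of `random.shuffle`.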
