When I create a numpy array from a list of sublists of equal length, it is implicitly converted to a 2d array of shape (len(list), len(sub_list)):
>>> np.array([[1,2], [1,2]],dtype=object).shape
(2, 2)
But when I pass variable length sublists it creates a vector of length len(list):
>>> np.array([[1,2], [1,2,3]],dtype=object).shape
(2,)
How can I get a vector output when the sublists are the same length (i.e. make the first case behave like the second)?
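Here you go: create an empty array with np.empty and dtype=np.ndarray (numpy stores it as dtype=object, as the output below shows), then assign the elements one at a time.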
Simple example below (with 5 elements):
In [1]: arr = np.empty((5,), dtype=np.ndarray)
In [2]: arr.shape
Out[2]: (5,)
In [3]: arr[0]=np.array([1,2])
In [4]: arr[1]=np.array([2,3])
In [5]: arr[2]=np.array([1,2,3,4])
In [6]: arr
Out[6]:
array([array([1, 2]), array([2, 3]), array([1, 2, 3, 4]), None, None],
dtype=object)
You can create an array of objects of the desired size, and then set the elements like so:
elements = [np.array([1,2]), np.array([1,2])]
arr = np.empty(len(elements), dtype='object')
arr[:] = elements
But if you try to cast to an array directly with a list of arrays/lists of the same length, numpy will implicitly convert it into a multidimensional array.
np.array([[1,2], [1,2]],dtype=object)[0].shape
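The pre-allocate-then-assign pattern above can be wrapped in a small helper. This is only a sketch: the name to_object_vector is made up for illustration and is not part of numpy.

```python
import numpy as np

def to_object_vector(items):
    """Return a 1-d object array with one element per item (hypothetical helper)."""
    # Pre-allocating fixes the shape at (len(items),) before numpy sees the sublists.
    out = np.empty(len(items), dtype=object)
    # Slice assignment stores each sublist as a single object.
    out[:] = items
    return out

same = to_object_vector([[1, 2], [1, 2]])
ragged = to_object_vector([[1, 2], [1, 2, 3]])
print(same.shape, ragged.shape)
```

Both calls should give a vector, whether or not the sublists have equal length.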
I'm trying to see if there is a prettier way to create (i.e. force the creation of) a 1d numpy array from another list/array of objects. These objects, however, may have entries that are themselves iterable (so they can be lists, tuples, etc. but can also be more arbitrary objects).
So to make things really simple, let me consider the following scenario:
a=[(1,2), (3,4), (3,5)]
b=np.array(a, dtype=object)
b.shape # gives (3,2), but I would like to have (3,1) or (3,)
I was wondering if there is a nice pythonic/numpy'ish way to force b to have a shape (3,), and the iterable structure of the elements of a to be neglected in b. Right now I do this:
a=[(1,2), (3,4), (3,5)]
b=np.empty(len(a), dtype=object)
for i,x in enumerate(a):
    b[i]=x
b.shape # gives (3,) this is what i want.
which works, but is a bit ugly. I could not find a nicer way to do this that is more built into numpy. Any ideas?
(more context: what I really need to do is reshuffle the dimensions of b in various ways, hence I don't want b to know anything about the dimensions of its elements if they are iterable).
Thanks!
In [60]: b = np.empty(3, object)
You don't need to iterate when assigning from a list:
In [61]: b[:] = [(1,2),(3,4),(3,5)]
In [62]: b
Out[62]: array([(1, 2), (3, 4), (3, 5)], dtype=object)
In [63]: b.shape
Out[63]: (3,)
For an array it doesn't work:
In [64]: b[:] = np.array([(1,2),(3,4),(3,5)])
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-64-3042dce1f885> in <module>
----> 1 b[:] = np.array([(1,2),(3,4),(3,5)])
ValueError: could not broadcast input array from shape (3,2) into shape (3)
You may have to use iteration in the array case:
In [66]: for i,n in enumerate(np.array([(1,2),(3,4),(3,5)])):
    ...:     b[i] = n
    ...:
In [67]: b
Out[67]: array([array([1, 2]), array([3, 4]), array([3, 5])], dtype=object)
Keep in mind that object dtype arrays are a bit of a fallback option. np.array(...) tries to create a multidimensional array if possible (with numeric dtype). Making an object dtype array is done only if that isn't possible. And for some combinations of shapes, it throws up its hands and raises an error.
Turning that array into a list of arrays with list() also works (same speed):
In [92]: b[:] = list(np.array([(1,2),(3,4),(3,5)]))
In [93]: b
Out[93]: array([array([1, 2]), array([3, 4]), array([3, 5])], dtype=object)
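As for the reshuffling mentioned in the question, here is a small sketch of why the 1-d object array helps: once the tuples are opaque elements, reshape and transpose only rearrange the outer dimension. The fourth tuple is made up so that the vector reshapes evenly.

```python
import numpy as np

a = [(1, 2), (3, 4), (3, 5), (5, 6)]  # last tuple is an invented extra entry
b = np.empty(len(a), dtype=object)
b[:] = a                              # shape (4,); each element is a tuple

grid = b.reshape(2, 2)                # only the outer dimension is rearranged
flipped = grid.T                      # tuples move around, their contents are untouched
print(grid.shape, flipped[0, 1])
```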
Given the following numpy array:
arr = np.array([
[1,2,3],
[4,5,6],
[7,8,9]
])
I want to delete an element and get back:
arr = np.array([
[1,2,3],
[4,6],
[7,8,9]
])
I want to delete 5 from this array, i.e. delete only arr[1][1]. When I use del arr[i][j], it raises ValueError: cannot delete array elements, and the numpy documentation is not clear to me on this case.
Similarly how to add an element to some rows in the same array?
To be specific, I get this error when I read an image with opencv:
rgb_image = cv2.imread("image.png")
The del operation gives me the error above, and I couldn't make it work with np.delete(...).
To quote the documentation on the numpy array (ndarray):
An ndarray is a (usually fixed-size) multidimensional container of items of the same type and size.
So you cannot have rows of different lengths if you want to use the ndarray data structure (with all of its optimizations).
A possible workaround is to have an array of lists
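>>> arr=np.array([
[1,2,3],
[4,5,6],
[7,8,9],
[]
], dtype=object)
(note the empty row, which keeps numpy from building a regular 2d array; recent numpy versions also require dtype=object for rows of unequal length)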
so you can delete an element from one of the lists
>>> arr
array([list([1, 2, 3]), list([4, 5, 6]), list([7, 8, 9]), list([])],
dtype=object)
>>> arr[1]=np.delete(arr[1], [1], axis=0)
>>> arr
array([list([1, 2, 3]), array([4, 6]), list([7, 8, 9]), list([])],
dtype=object)
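The question also asks how to add an element to a row. A minimal sketch of both operations, assuming each row is stored as its own ndarray inside a 1-d object array (the variable name rows is only for illustration):

```python
import numpy as np

rows = np.empty(3, dtype=object)
for i, r in enumerate([[1, 2, 3], [4, 5, 6], [7, 8, 9]]):
    rows[i] = np.array(r)          # each element is an independent 1-d array

rows[1] = np.delete(rows[1], 1)    # drop the 5; np.delete returns a new array
rows[2] = np.append(rows[2], 10)   # add an element; np.append also returns a new array
print(rows)
```

Each delete or append copies the affected row, which is fine for small data but not something to do in a tight loop.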
I think one way would be to convert the np.array to a list, remove the element, and convert back to an np.array, like this:
arr = arr.tolist()
arr[1].pop(1)
arr = np.array(arr, dtype=object)  # rows now differ in length, so object dtype is needed
Edit:
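This seems to be the proper numpy way: delete from the flattened array, then split it back into rows:
flat = np.delete(arr, 4)  # with no axis given, arr is flattened; flat index 4 is the 5
rows = np.split(flat, [3, 5])  # list of arrays: [1, 2, 3], [4, 6], [7, 8, 9]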
Edit2:
This doesn't seem to be any less time-consuming, but you could try it this way:
arr = np.empty(3, dtype=object)
arr[:] = [1,2,3], [4,5,6], [7,8,9]
arr[1].remove(5)
First convert the array into a list using
new_list = list(old_array) (this creates a list of arrays).
Now you can perform any list operation, such as pop or filter, to remove whatever elements you want.
Finally, once you have your filtered list, convert it back to an array using
new_array = np.array(new_list). (The new array keeps the dimensions of the old one only if every row still has the same length; for rows of different lengths, pass dtype=object.)
I have a list of tuples; in each tuple, one element is an object and the other is generic. When I run np.asarray(list_in) the result is a 2D array, with each tuple converted into a row. However, I would like to obtain a 1D array made of tuples.
I could pass a dtype to force it and it works well if I try this minimalistic example
a = [(1,2),(3,4)]
b = np.asarray(a,dtype=('float,float'))
print(b)
[( 1., 2.) ( 3., 4.)]
But how do I take the first element of the list and construct a proper dtype out of it? type(list_in[0]) returns tuple, and passing this to asarray does not work.
With this list of tuples you can make 3 kinds of arrays:
In [420]: a = [(1,2),(3,4)]
2d array, with dtype inferred from the inputs (but it could also be specified as something like float). Inputs match in size.
In [421]: np.array(a)
Out[421]:
array([[1, 2],
[3, 4]])
Structured array. 1d with 2 fields. Field indexing by name. Input must be a list of tuples (not list of lists):
In [422]: np.array(a, dtype='i,i')
Out[422]:
array([(1, 2), (3, 4)],
dtype=[('f0', '<i4'), ('f1', '<i4')])
In [423]: _['f0']
Out[423]: array([1, 3], dtype=int32)
In the structured array, input and display uses tuples, but the data is not actually stored as tuples. The values are packed as bytes - in this case 8 bytes representing 2 integers.
Object array. This is 1d with tuple contents. Contents could be anything else. This is an enhanced/debased list.
In [424]: A = np.empty((2,), dtype=object)
In [425]: A[:] = a
In [426]: A
Out[426]: array([(1, 2), (3, 4)], dtype=object)
In [427]: A.shape
Out[427]: (2,)
In [428]: A[1]
Out[428]: (3, 4)
Out[428] is an actual tuple. Trying to modify it, A[1][0]=30, raises an error.
In this last case A = np.empty(2, dtype=tuple) does the same thing. Anything other than integer, float, string, etc. is 'converted' to 'object'.
Simply specifying object dtype doesn't help. The result is 2d with numeric elements (but stored as object pointers).
In [429]: np.array(a, dtype=object)
Out[429]:
array([[1, 2],
[3, 4]], dtype=object)
In [430]: _.shape
Out[430]: (2, 2)
More on making an object dtype array at
numpy ravel on inconsistent dimensional object
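For the original question (deriving a dtype from the first tuple), one possible sketch is to build a structured dtype field by field. The helper name dtype_from_first is hypothetical, and the f0, f1, ... field names are chosen here to mirror numpy's defaults.

```python
import numpy as np

def dtype_from_first(list_in):
    """Build a structured dtype from the first tuple (hypothetical helper)."""
    first = list_in[0]
    # For plain scalars np.asarray(v).dtype is an int/float/str dtype;
    # arbitrary Python objects give dtype object.
    return np.dtype([(f"f{i}", np.asarray(v).dtype) for i, v in enumerate(first)])

list_in = [(1, 2.5), (3, 4.5)]
arr = np.array(list_in, dtype=dtype_from_first(list_in))
print(arr.shape, arr.dtype.names)
```

This assumes every tuple has the same layout as the first one; mixed layouts would need a different approach, such as the object array shown above.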
Why can't I index an ndarray using a list of tuple indices like so?
idx = [(x1, y1), ... (xn, yn)]
X[idx]
Instead I have to do something unwieldy like
idx2 = numpy.array(idx)
X[idx2[:, 0], idx2[:, 1]] # or more generally:
X[tuple(numpy.vsplit(idx2.T, 1)[0])]
Is there a simpler, more pythonic way?
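You can use a list of tuples, but the convention is different from what you want. numpy expects a list of row indices, followed by a list of column indices. You, apparently, want to specify a list of (x,y) pairs. (In NumPy 1.23 and later the outer sequence must be a tuple, e.g. X[tuple(idx)]; a plain list is treated as a single array index.)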
http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#integer-array-indexing
The relevant section in the documentation is 'integer array indexing'.
Here's an example, seeking 3 points in a 2d array. (2 points in 2d can be confusing):
In [223]: idx
Out[223]: [(0, 1, 1), (2, 3, 0)]
In [224]: X[idx]
Out[224]: array([2, 7, 4])
Using your style of xy pairs of indices:
In [230]: idx1 = [(0,2),(1,3),(1,0)]
In [231]: [X[i] for i in idx1]
Out[231]: [2, 7, 4]
In [240]: X[tuple(np.array(idx1).T)]
Out[240]: array([2, 7, 4])
X[tuple(zip(*idx1))] is another way of doing the conversion. The tuple() is optional in Python2. zip(*...) is a Python idiom that reverses the nesting of a list of lists.
You are on the right track with:
In [242]: idx2=np.array(idx1)
In [243]: X[idx2[:,0], idx2[:,1]]
Out[243]: array([2, 7, 4])
My tuple() is just a bit more compact (and not necessarily more 'pythonic'). Given the numpy convention, some sort of conversion is necessary.
(Should we check what works with n-dimensions and m-points?)
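A quick check of the pair-to-index conversion, including a 3-d case. The arrays here are assumptions: X = np.arange(12).reshape(3, 4) happens to reproduce the outputs shown above, while Y and pts are made up for the n-dimensional check.

```python
import numpy as np

X = np.arange(12).reshape(3, 4)        # assumed; consistent with the outputs above
idx1 = [(0, 2), (1, 3), (1, 0)]        # (row, col) pairs

by_zip = X[tuple(zip(*idx1))]          # zip(*...) turns pairs into (rows, cols)
by_array = X[tuple(np.array(idx1).T)]  # same conversion via a transpose

Y = np.arange(24).reshape(2, 3, 4)     # made-up 3-d array
pts = [(0, 1, 2), (1, 2, 3)]           # two points, three coordinates each
picked = Y[tuple(zip(*pts))]
print(by_zip, by_array, picked)
```

The same tuple(zip(*...)) conversion works for any number of dimensions, as long as each point supplies one coordinate per axis.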
Use a tuple of NumPy arrays which can be directly passed to index your array:
index = tuple(np.array(list(zip(*index_tuple))))
new_array = list(prev_array[index])
I have an array of distances a = np.array([20.5 ,5.3 ,60.7 ,3.0 ], 'double') and I need the indices of the sorted array (for example [3, 1, 0, 2], for a.sort()). Is there a function in Numpy to do that?
Yes, there's the numpy.argsort(a) function, which does exactly what you're asking for. You can also call argsort as a method on an ndarray object like so: a.argsort().
Here's a link to the documentation: http://docs.scipy.org/doc/numpy/reference/generated/numpy.argsort.html#numpy.argsort
Here's an example, for reference and convenience:
# create an array
a = np.array([5,2,3])
# np.sort - returns a sorted copy of the array
np.sort(a)
# array([2, 3, 5])
# argsort - returns the indices that would sort the array
np.argsort(a)
# array([1, 2, 0])
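Applied to the array from the question, a short sketch (the names order and sorted_a are only for illustration):

```python
import numpy as np

a = np.array([20.5, 5.3, 60.7, 3.0], 'double')
order = np.argsort(a)   # indices that would sort a in ascending order
sorted_a = a[order]     # indexing with them gives the same values as np.sort(a)
print(order, sorted_a)
```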