Is there any `itemgetter` equivalent for `numpy.ndarray`?

When I call itemgetter with a numpy.ndarray, I get a tuple.
In [1]: import numpy as np
In [2]: import operator as op
In [3]: ar = np.array([1,2,3,4,5])
In [4]: op.itemgetter(1,3)(ar)
Out[4]: (2, 4)
I wonder if there's any numpy function that's like itemgetter but returns an ndarray instead.

With numpy arrays you can access multiple indices directly by indexing into the array with a list:
>>> x
array([1, 2, 3, 4, 5])
>>> x[[1, 3]]
array([2, 4])

This might not answer the question directly, but I'd do something like
ar[[1,3]]
to get back a numpy.ndarray object with the required elements.
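If an itemgetter-style callable is still wanted, a thin wrapper around list indexing is one way to get it. The sketch below is only illustrative: `array_itemgetter` is a hypothetical helper name, not a NumPy function, while `np.take` is an existing NumPy call that performs the same selection.

```python
import numpy as np


def array_itemgetter(*indices):
    """Hypothetical helper: like operator.itemgetter, but returns an ndarray."""
    idx = list(indices)

    def getter(arr):
        # Indexing with a list of integers selects those positions
        # and returns a new ndarray.
        return np.asarray(arr)[idx]

    return getter


ar = np.array([1, 2, 3, 4, 5])
picked = array_itemgetter(1, 3)(ar)   # array([2, 4])
taken = np.take(ar, [1, 3])           # same selection via np.take
```

Unlike `operator.itemgetter`, this always returns an array, even when only one index is given.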

Related

Numpy array of numpy arrays

When I create a numpy array from a list of sublists of equal length, it is implicitly converted to a (len(list), len(sub_list)) 2D array:
>>> np.array([[1,2], [1,2]],dtype=object).shape
(2, 2)
But when I pass variable length sublists it creates a vector of length len(list):
>>> np.array([[1,2], [1,2,3]],dtype=object).shape
(2,)
How can I get a vector output when the sublists are the same length (i.e. make the first case behave like the second)?
Here you go: create the array with np.empty and dtype=np.ndarray (which NumPy treats as dtype=object, as the output below shows), then assign the elements one by one.
Simple example below (with 5 elements):
In [1]: arr = np.empty((5,), dtype=np.ndarray)
In [2]: arr.shape
Out[2]: (5,)
In [3]: arr[0]=np.array([1,2])
In [4]: arr[1]=np.array([2,3])
In [5]: arr[2]=np.array([1,2,3,4])
In [6]: arr
Out[6]:
array([array([1, 2]), array([2, 3]), array([1, 2, 3, 4]), None, None],
      dtype=object)
You can create an array of objects of the desired size, and then set the elements like so:
elements = [np.array([1,2]), np.array([1,2])]
arr = np.empty(len(elements), dtype='object')
arr[:] = elements
But if you try to cast to an array directly with a list of arrays/lists of the same length, numpy will implicitly convert it into a multidimensional array.
np.array([[1,2], [1,2]],dtype=object)[0].shape
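The "allocate first, fill afterwards" idea from the answers above can be wrapped in a small function. This is a minimal sketch, assuming a plain loop is acceptable; `as_object_vector` is a hypothetical name introduced here for illustration, not part of NumPy.

```python
import numpy as np


def as_object_vector(items):
    """Hypothetical helper: store each item in one slot of a 1-D object array."""
    out = np.empty(len(items), dtype=object)
    for i, item in enumerate(items):
        # Element-wise assignment stores the item itself,
        # so NumPy never tries to stack the sublists into 2-D.
        out[i] = item
    return out


vec = as_object_vector([[1, 2], [1, 2]])
print(vec.shape)   # (2,)
```

Assigning slot by slot sidesteps any question of how NumPy interprets a nested sequence, at the cost of a Python-level loop.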

What's the difference between a numpy matrix and a numpy.matrixlib.defmatrix.matrix?

I read in a csv file, did some manipulations, and got an object x. When I call type(x), it returns numpy.matrixlib.defmatrix.matrix. I have never seen this type before and am wondering if there is any difference between it and the commonly seen 'numpy matrix'. Thanks!
They are the same:
In [1]: import numpy as np
In [2]: np.matrix
Out[2]: numpy.matrixlib.defmatrix.matrix
In [3]: id(np.matrix)
Out[3]: 4300190472
In [4]: id(np.matrixlib.defmatrix.matrix)
Out[4]: 4300190472
In [5]: a = np.matrix([[1, 2], [3, 4]])
In [6]: type(a)
Out[6]: numpy.matrixlib.defmatrix.matrix
numpy.matrix is a more convenient name for numpy.matrixlib.defmatrix.matrix.
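The same check can be written without comparing id() values by hand. A minimal sketch, assuming the submodule numpy.matrixlib.defmatrix is importable in the installed NumPy version:

```python
import numpy as np
import numpy.matrixlib.defmatrix as defmatrix

# `is` tests object identity, which is what comparing id() values does.
same_class = np.matrix is defmatrix.matrix

# The matrix class is itself a subclass of ndarray.
is_ndarray_subclass = issubclass(np.matrix, np.ndarray)

print(same_class, is_ndarray_subclass)
```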

How to sort a list based on the output of numpy's argsort function

I have a list like this:
myList = [10,30,40,20,50]
Now I use numpy's argsort function to get the indices for the sorted list:
import numpy as np
so = np.argsort(myList)
which gives me the output:
array([0, 3, 1, 2, 4])
When I want to sort an array using so it works fine:
myArray = np.array([1,2,3,4,5])
myArray[so]
array([1, 4, 2, 3, 5])
But when I apply it to another list, it does not work but throws an error
myList2 = [1,2,3,4,5]
myList2[so]
TypeError: only integer arrays with one element can be converted to an index
How can I now use so to sort another list without using a for-loop and without converting this list to an array first?
myList2 is a normal Python list, and it does not support that kind of indexing.
You would either need to convert it to a numpy.array, for example:
In [8]: np.array(myList2)[so]
Out[8]: array([1, 4, 2, 3, 5])
Or you can use a list comprehension:
In [7]: [myList2[i] for i in so]
Out[7]: [1, 4, 2, 3, 5]
You can't. You have to convert it to an array then back.
myListSorted = list(np.array(myList)[so])
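Edit: I ran a benchmark comparing NumPy indexing to the list comprehension. With 100 elements, NumPy indexing was roughly 29x faster in this run (about 0.43 s versus 12.3 s for timeit's default 1,000,000 calls). Note that the NumPy timing starts from an existing array, so it does not include converting the list to an array and back.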
>>> from timeit import timeit
>>> import numpy as np
>>> myList = list(np.random.rand(100))
>>> so = np.argsort(myList) #converts list to NumPy internally
>>> timeit(lambda: [myList[i] for i in so])
12.29590070003178
>>> myArray = np.random.rand(100)
>>> so = np.argsort(myArray)
>>> timeit(lambda: myArray[so])
0.42915570305194706
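For the original constraint (no explicit for-loop and no conversion of the second list to an array), operator.itemgetter is one more option. This is a sketch using the question's data, not a benchmarked recommendation; the names `order` and `reordered` are introduced here for illustration.

```python
from operator import itemgetter

import numpy as np

my_list = [10, 30, 40, 20, 50]
my_list2 = [1, 2, 3, 4, 5]

# tolist() turns the index array into plain Python ints.
order = np.argsort(my_list).tolist()

# itemgetter(*order) picks the items in that order and returns a tuple
# (when given two or more indices).
reordered = list(itemgetter(*order)(my_list2))
print(reordered)   # [1, 4, 2, 3, 5]
```

One caveat: with a single index, itemgetter returns the bare element instead of a tuple, so a one-element list would need separate handling.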

Can I avoid using `asmatrix`?

Is there any way for me to create matrices directly and not have to use asmatrix? From what I can see, all of the typical matrix functions (ones, rand, etc) in Numpy return arrays, not matrices, which means (according to the documentation) that asmatrix will copy the data. Is there any way to avoid this?
According to the documentation:
Unlike matrix, asmatrix does not make a copy if the input is already a
matrix or an ndarray. Equivalent to matrix(data, copy=False).
So, asmatrix does not copy the data if it doesn't need to:
>>> import numpy as np
>>> a = np.arange(9).reshape((3,3))
>>> b = np.asmatrix(a)
>>> b.base is a
True
>>> a[0] = 3
>>> b
matrix([[3, 3, 3],
        [3, 4, 5],
        [6, 7, 8]])
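Another way to see the difference between asmatrix and matrix is to ask NumPy whether the result shares memory with the input. A minimal sketch using np.shares_memory; the variable names are illustrative:

```python
import numpy as np

a = np.arange(9).reshape((3, 3))

view = np.asmatrix(a)   # no copy: a matrix view onto a's data
copied = np.matrix(a)   # matrix() copies by default (copy=True)

shares_view = np.shares_memory(a, view)
shares_copy = np.shares_memory(a, copied)
print(shares_view, shares_copy)   # True False
```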

Convert NumPy array to Python list

How do I convert a NumPy array into a Python List?
Use tolist():
>>> import numpy as np
>>> np.array([[1,2,3],[4,5,6]]).tolist()
[[1, 2, 3], [4, 5, 6]]
Note that this converts the values from whatever numpy type they may have (e.g. np.int32 or np.float32) to the "nearest compatible Python type" (in a list). If you want to preserve the numpy data types, you could call list() on your array instead, and you'll end up with a list of numpy scalars. (Thanks to Mr_and_Mrs_D for pointing that out in a comment.)
c = np.array([[1,2,3],[4,5,6]])
list(c.flatten())
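The type difference between the two approaches can be checked directly. A small sketch, assuming an integer array like the one above; the variable names are illustrative:

```python
import numpy as np

c = np.array([[1, 2, 3], [4, 5, 6]])

python_values = c.flatten().tolist()   # plain Python ints
numpy_values = list(c.flatten())       # NumPy integer scalars

print(type(python_values[0]))   # <class 'int'>
print(type(numpy_values[0]))    # a NumPy integer type, e.g. numpy.int64
```

Note that calling list() on the 2D array itself, without flatten(), would give a list of 1D row arrays instead of scalars.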
The numpy .tolist method produces nested lists if the numpy array is 2D.
If a flat list is desired, the method below works.
import numpy as np
from itertools import chain
a = [1,2,3,4,5,6,7,8,9]
print(type(a), len(a), a)
npa = np.asarray(a)
print(type(npa), npa.shape, "\n", npa)
npa = npa.reshape((3, 3))
print(type(npa), npa.shape, "\n", npa)
# chain.from_iterable concatenates the rows into one flat sequence
a = list(chain.from_iterable(npa))
print(type(a), len(a), a)
tolist() also works on the nested array behind a pandas DataFrame:
import pandas as pd
my_list = [0,1,2,3,4,5,4,3,2,1,0]
my_dt = pd.DataFrame(my_list)
# each row comes back as a one-element list, so take its first item
new_list = [i[0] for i in my_dt.values.tolist()]
print(type(my_list),type(my_dt),type(new_list))
Another option
c = np.array([[1,2,3],[4,5,6]])
c.ravel()
#>> array([1, 2, 3, 4, 5, 6])
# or
c.ravel().tolist()
#>> [1, 2, 3, 4, 5, 6]
also works.
The easiest way to convert an array to a list is the array's tolist() method:
import numpy as np
# 2D array to list (names cannot start with a digit in Python)
array_2d = np.array([[1,2,3],[8,9,10]])
list_2d = array_2d.tolist()
To check the data type, you can use the following:
type(list_2d)
