Row-wise indexing in Numpy

Row-wise indexing in Numpy - python

I have two matrices, A and B:
A = array([[2., 13., 25., 1.], [ 18., 5., 1., 25.]])
B = array([[2, 1], [0, 3]])
I want to index each row of A with each row of B, producing the slice:
array([[25., 13.], [18., 25.]])
That is, I essentially want something like:
array([A[i,b] for i,b in enumerate(B)])
Is there a way to fancy-index this directly? The best I can do is this "flat-hack":
A.flat[B + arange(0,A.size,A.shape[1])[:,None]]

#Ophion's answer is great, and deserves the credit, but I wanted to add some explanation, and offer a more intuitive construction.
Instead of rotating B and then rotating the result back, it's better to just rotate the arange. I think this gives the most intuitive solution, even if it takes more characters:
A[((0,),(1,)), B]
or equivalently
A[np.arange(2)[:, None], B]
This works because what's really going on here, is you're making an i array and a j array, each of which have the same shape as your desired result.
i = np.array([[0, 0],
[1, 1]])
j = B
But you can use just
i = np.array([[0],
[1]])
Because it will broadcast to match B (this is what np.arange(2)[:,None] gives).
Finally, to make it more general (not knowing 2 as the arange size), you could also generate i from B with
i = np.indices(B.shape)[0]
however you build i and j, you just call it like
>>> A[i, j]
array([[ 25., 13.],
[ 18., 25.]])

Not pretty but:
A[np.arange(2),B.T].T
array([[ 25., 13.],
[ 18., 25.]])

Related

Outer minimum vectorization in numpy follow up

This is a follow-up to my previous question.
Given an NxM matrix A, I want to efficiently obtain the NxN matrix whose ith row is the sum along the 2nd axis of the result of applying np.minimum between A and the ith row of A.
Using a for loop,
> A = np.array([[1, 2], [3, 4], [5,6]])
> output = np.zeros(shape=(A.shape[0], A.shape[0]))
> for i in range(A.shape[0]):
output[i] = np.sum(np.minimum(A, A[i]), axis=1)
> output
array([[ 3., 3., 3.],
[ 3., 7., 7.],
[ 3., 7., 11.]])
Is is possible to optimize this further without the for loop?
Edit: I would also like to do it without allocating an MxMxN tensor because of memory constraints.

instead of a for loop. Using the NumPy minimum and sum functions, you can compute the desired matrix output as follows:
output = np.sum(np.minimum(A[:, None], A), axis=2)

Python : Mapping values to other values without gap

I have the following question. Is there somekind of method with numpy or scipy , which I can use to get an given unsorted array like this
a = np.array([0,0,1,1,4,4,4,4,5,1891,7]) #could be any number here
to something where the numbers are interpolated/mapped , there is no gap between the values and they are in the same order like before?:
[0,0,1,1,2,2,2,2,3,5,4]
EDIT
Is it furthermore possible to swap/shuffle the numbers after the mapping, so that
[0,0,1,1,2,2,2,2,3,5,4]
become something like:
[0,0,3,3,5,5,5,5,4,1,2]

Edit: I'm not sure what the etiquette is here (should this be a separate answer?), but this is actually directly obtainable from np.unique.
>>> u, indices = np.unique(a, return_inverse=True)
>>> indices
array([0, 0, 1, 1, 2, 2, 2, 2, 3, 5, 4])
Original answer: This isn't too hard to do in plain python by building a dictionary of what index each value of the array would map to:
x = np.sort(np.unique(a))
index_dict = {j: i for i, j in enumerate(x)}
[index_dict[i] for i in a]

Seems you need to rank (dense) your array, in which case use scipy.stats.rankdata:
from scipy.stats import rankdata
rankdata(a, 'dense')-1
# array([ 0., 0., 1., 1., 2., 2., 2., 2., 3., 5., 4.])

Combine numpy arrays to form a matrix

This seems like it should be straightforward, but I can't figure it out.
Data source is a two column, comma delimited input file with these contents:
6,10
5,9
8,13
...
And my code is:
import numpy as np
data = np.loadtxt("data.txt", delimiter=",")
m = len(data)
x = np.reshape(data[:,0], (m,1))
y = np.ones((m,1))
z = np.matrix([x,y])
Which gives me this error:
Users/acpigeon/.virtualenvs/ipynb/lib/python2.7/site-packages/numpy-1.9.0.dev_297f54b-py2.7-macosx-10.9-intel.egg/numpy/matrixlib/defmatrix.pyc in __new__(subtype, data, dtype, copy)
270 shape = arr.shape
271 if (ndim > 2):
--> 272 raise ValueError("matrix must be 2-dimensional")
273 elif ndim == 0:
274 shape = (1, 1)
ValueError: matrix must be 2-dimensional
No amount of reshaping seems to get this to work, so I'm either missing something really simple or there's a better way to do this.
EDIT:
Would have been helpful to specify the output I am looking for. Here is a line of code that generates the desired result:
In [1]: np.matrix([[5,1],[6,1],[8,1]])
Out[1]:
matrix([[5, 1],
[6, 1],
[8, 1]])

The desired output can be generated this way:
In [12]: np.array((data[:, 0], np.ones(m))).transpose()
Out[12]:
array([[ 6., 1.],
[ 5., 1.],
[ 8., 1.]])
The above is copied from ipython and so has ipython style prompts.
Answer to previous version
To eliminate the error, replace:
x = np.reshape(data[:, 0], (m, 1))
with:
x = data[:, 0]
The former line produces a 2-dimensional matrix and that is what causes the error message. The latter produces a 1-D array with the same data.

Or how about first turning the array into a matrix, and then change the last column to 1?
In [2]: data=np.loadtxt('stack23859379.txt',delimiter=',')
In [3]: np.matrix(data)
Out[3]:
matrix([[ 6., 10.],
[ 5., 9.],
[ 8., 13.]])
In [4]: z = np.matrix(data)
In [5]: z[:,1]=1
In [6]: z
Out[6]:
matrix([[ 6., 1.],
[ 5., 1.],
[ 8., 1.]])

Finding minimum value element-wise in three 2d submatrices

I'm trying to produce a color mapping of the convergence of a polynomial's roots in complex space. In order to do this, I have created a grid of points and applied Newton's method to those points, in order to find to which complex root they each converge. This gives me a 2d array of complex numbers, the elements of which denote the point to which they converge, within some tolerance. I want to be able to match the numbers in that matrix to an element-wise color mapping.
I have done this by iterating over the array and computing colors element-by-element, but it is very slow, and seems it would benefit from vectorizing. Here's my code so far:
def colorvec(rootsmatrix, known_roots):
dim = len(known_roots)
dist = ndarray((dim, nx, ny))
for i in range(len(known_roots)):
dist[i] = abs(rootsmatrix-known_roots[i])
This creates a 3d array with the distances of each point's computed root to each of the actual roots. It looks something like this, except with 75 000 000 elements.
[ [ [6e-15 7e-15 0.5]
[1.5 5e-15 0.5] #submatrix 1
[0.75 0.98 0.78] ]
[ [1.5 0.75 0.5]
[8e-15 5e-15 0.8] #submatrix 2
[0.75 0.98 0.78] ]
[ [1.25 0.5 5e-15]
[0.5 0.64 4e-15] #submatrix 3
[5e-15 4e-15 7e-15] ]
I want to take dist, and return the 1st dimension argument (i.e., 1, 2, or 3) for each 2nd- and 3rd-dimension argument, for which dist is minimum. That will be my color mapping. For example, comparing the element (0,0) of each of the 3 submatrices would yield that color(0,0) = 0. Similarly, color(1,1) = 0 and color (2,2) = 2. I want to be able to do this for the entire color matrix.
I haven't been able to find a way to do this using numpy.argmin, but I could be missing something. If there's another way to do this, I'd be happy to hear, especially if it doesn't involve loops. I'm making ~25MP images here, so looping takes fully 25 minutes to assign colors.
Thanks in advance for your advice!

You can pass an axis argument to argmin. You want to minimize along the first axis (what you're calling 'submatrices'), which is axis=0:
dist.argmin(0)
dist = array([[[ 6.00e-15, 7.00e-15, 5.00e-01],
[ 1.50e+00, 5.00e-15, 5.00e-01],
[ 7.50e-01, 9.80e-01, 7.80e-01]],
[[ 1.50e+00, 7.50e-01, 5.00e-01],
[ 8.00e-15, 5.00e-15, 8.00e-01],
[ 7.50e-01, 9.80e-01, 7.80e-01]],
[[ 1.25e+00, 5.00e-01, 5.00e-15],
[ 5.00e-01, 6.40e-01, 4.00e-15],
[ 5.00e-15, 4.00e-15, 7.00e-15]]])
dist.argmin(0)
#array([[0, 0, 2],
# [1, 0, 2],
# [2, 2, 2]])
This of course gives you 0, 1, 2 as the returns, if you want 1, 2, 3 as stated, use:
dist.argmin(0) + 1
#array([[1, 1, 3],
# [2, 1, 3],
# [3, 3, 3]])
Finally, if you actually want the minimum value itself (instead of which 'submatrix' it comes from), you can just use dist.min(0):
dist.min(0)
#array([[ 6.00e-15, 7.00e-15, 5.00e-15],
# [ 8.00e-15, 5.00e-15, 4.00e-15],
# [ 5.00e-15, 4.00e-15, 7.00e-15]])
If you want to use the minimum location from the dist matrix to pull a value out of another matrix, it's a little tricky, but you can use
minloc = dist.argmin(0)
other[dist.argmin(0), np.arange(dist.shape[1])[:, None], np.arange(dist.shape[2])]
Note that if other=dist this gives the same output as just calling dist.min(0):
dist[dist.argmin(0), np.arange(dist.shape[1])[:, None], np.arange(dist.shape[2])]
#array([[ 6.00e-15, 7.00e-15, 5.00e-15],
# [ 8.00e-15, 5.00e-15, 4.00e-15],
# [ 5.00e-15, 4.00e-15, 7.00e-15]])
or if other just says which submatrix it is, you get the same thing back:
other = np.ones((3,3,3))*np.arange(1,4).reshape(3,1,1)
other
#array([[[ 1., 1., 1.],
# [ 1., 1., 1.],
# [ 1., 1., 1.]],
# [[ 2., 2., 2.],
# [ 2., 2., 2.],
# [ 2., 2., 2.]],
# [[ 3., 3., 3.],
# [ 3., 3., 3.],
# [ 3., 3., 3.]]])
other[dist.argmin(0), np.arange(dist.shape[1])[:, None], np.arange(dist.shape[2])]
#array([[ 1., 1., 3.],
# [ 2., 1., 3.],
# [ 3., 3., 3.]])
As an unrelated note, you can rewrite colorvec without that loop, assuming rootsmatrix.shape is (nx, ny) and known_roots.shape is (dim,)
def colorvec(rootsmatrix, known_roots):
dist = abs(rootsmatrix - known_roots[:, None, None])
where known_roots[:, None, None] is the same as known_roots.reshape(len(known_roots), 1, 1) and causes it to broadcast with rootsmatrix

Unsuccessful append to an empty NumPy array

I am trying to fill an empty(not np.empty!) array with values using append but I am gettin error:
My code is as follows:
import numpy as np
result=np.asarray([np.asarray([]),np.asarray([])])
result[0]=np.append([result[0]],[1,2])
And I am getting:
ValueError: could not broadcast input array from shape (2) into shape (0)

I might understand the question incorrectly, but if you want to declare an array of a certain shape but with nothing inside, the following might be helpful:
Initialise empty array:
>>> a = np.zeros((0,3)) #or np.empty((0,3)) or np.array([]).reshape(0,3)
>>> a
array([], shape=(0, 3), dtype=float64)
Now you can use this array to append rows of similar shape to it. Remember that a numpy array is immutable, so a new array is created for each iteration:
>>> for i in range(3):
... a = np.vstack([a, [i,i,i]])
...
>>> a
array([[ 0., 0., 0.],
[ 1., 1., 1.],
[ 2., 2., 2.]])
np.vstack and np.hstack is the most common method for combining numpy arrays, but coming from Matlab I prefer np.r_ and np.c_:
Concatenate 1d:
>>> a = np.zeros(0)
>>> for i in range(3):
... a = np.r_[a, [i, i, i]]
...
>>> a
array([ 0., 0., 0., 1., 1., 1., 2., 2., 2.])
Concatenate rows:
>>> a = np.zeros((0,3))
>>> for i in range(3):
... a = np.r_[a, [[i,i,i]]]
...
>>> a
array([[ 0., 0., 0.],
[ 1., 1., 1.],
[ 2., 2., 2.]])
Concatenate columns:
>>> a = np.zeros((3,0))
>>> for i in range(3):
... a = np.c_[a, [[i],[i],[i]]]
...
>>> a
array([[ 0., 1., 2.],
[ 0., 1., 2.],
[ 0., 1., 2.]])

numpy.append is pretty different from list.append in python. I know that's thrown off a few programers new to numpy. numpy.append is more like concatenate, it makes a new array and fills it with the values from the old array and the new value(s) to be appended. For example:
import numpy
old = numpy.array([1, 2, 3, 4])
new = numpy.append(old, 5)
print old
# [1, 2, 3, 4]
print new
# [1, 2, 3, 4, 5]
new = numpy.append(new, [6, 7])
print new
# [1, 2, 3, 4, 5, 6, 7]
I think you might be able to achieve your goal by doing something like:
result = numpy.zeros((10,))
result[0:2] = [1, 2]
# Or
result = numpy.zeros((10, 2))
result[0, :] = [1, 2]
Update:
If you need to create a numpy array using loop, and you don't know ahead of time what the final size of the array will be, you can do something like:
import numpy as np
a = np.array([0., 1.])
b = np.array([2., 3.])
temp = []
while True:
rnd = random.randint(0, 100)
if rnd > 50:
temp.append(a)
else:
temp.append(b)
if rnd == 0:
break
result = np.array(temp)
In my example result will be an (N, 2) array, where N is the number of times the loop ran, but obviously you can adjust it to your needs.
new update
The error you're seeing has nothing to do with types, it has to do with the shape of the numpy arrays you're trying to concatenate. If you do np.append(a, b) the shapes of a and b need to match. If you append an (2, n) and (n,) you'll get a (3, n) array. Your code is trying to append a (1, 0) to a (2,). Those shapes don't match so you get an error.

This error arise from the fact that you are trying to define an object of shape (0,) as an object of shape (2,). If you append what you want without forcing it to be equal to result[0] there is no any issue:
b = np.append([result[0]], [1,2])
But when you define result[0] = b you are equating objects of different shapes, and you can not do this. What are you trying to do?

Here's the result of running your code in Ipython. Note that result is a (2,0) array, 2 rows, 0 columns, 0 elements. The append produces a (2,) array. result[0] is (0,) array. Your error message has to do with trying to assign that 2 item array into a size 0 slot. Since result is dtype=float64, only scalars can be assigned to its elements.
In [65]: result=np.asarray([np.asarray([]),np.asarray([])])
In [66]: result
Out[66]: array([], shape=(2, 0), dtype=float64)
In [67]: result[0]
Out[67]: array([], dtype=float64)
In [68]: np.append(result[0],[1,2])
Out[68]: array([ 1., 2.])
np.array is not a Python list. All elements of an array are the same type (as specified by the dtype). Notice also that result is not an array of arrays.
Result could also have been built as
ll = [[],[]]
result = np.array(ll)
while
ll[0] = [1,2]
# ll = [[1,2],[]]
the same is not true for result.
np.zeros((2,0)) also produces your result.
Actually there's another quirk to result.
result[0] = 1
does not change the values of result. It accepts the assignment, but since it has 0 columns, there is no place to put the 1. This assignment would work in result was created as np.zeros((2,1)). But that still can't accept a list.
But if result has 2 columns, then you can assign a 2 element list to one of its rows.
result = np.zeros((2,2))
result[0] # == [0,0]
result[0] = [1,2]
What exactly do you want result to look like after the append operation?

numpy.append always copies the array before appending the new values. Your code is equivalent to the following:
import numpy as np
result = np.zeros((2,0))
new_result = np.append([result[0]],[1,2])
result[0] = new_result # ERROR: has shape (2,0), new_result has shape (2,)
Perhaps you mean to do this?
import numpy as np
result = np.zeros((2,0))
result = np.append([result[0]],[1,2])

SO thread 'Multiply two arrays element wise, where one of the arrays has arrays as elements' has an example of constructing an array from arrays. If the subarrays are the same size, numpy makes a 2d array. But if they differ in length, it makes an array with dtype=object, and the subarrays retain their identity.
Following that, you could do something like this:
In [5]: result=np.array([np.zeros((1)),np.zeros((2))])
In [6]: result
Out[6]: array([array([ 0.]), array([ 0., 0.])], dtype=object)
In [7]: np.append([result[0]],[1,2])
Out[7]: array([ 0., 1., 2.])
In [8]: result[0]
Out[8]: array([ 0.])
In [9]: result[0]=np.append([result[0]],[1,2])
In [10]: result
Out[10]: array([array([ 0., 1., 2.]), array([ 0., 0.])], dtype=object)
However, I don't offhand see what advantages this has over a pure Python list or lists. It does not work like a 2d array. For example I have to use result[0][1], not result[0,1]. If the subarrays are all the same length, I have to use np.array(result.tolist()) to produce a 2d array.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Row-wise indexing in Numpy - python

Not pretty but: A[np.arange(2),B.T].T array([[ 25., 13.], [ 18., 25.]])

Related

Outer minimum vectorization in numpy follow up

Python : Mapping values to other values without gap

Combine numpy arrays to form a matrix

Finding minimum value element-wise in three 2d submatrices

Unsuccessful append to an empty NumPy array

Categories

Resources