Absolute difference of two NumPy arrays - python

Is there an efficient way/function to subtract one matrix from another and write the absolute values into a new matrix?
I can do it entry by entry but for big matrices, this will be fairly slow...
For example:
X = [[12,7,3],
[4 ,5,6],
[7 ,8,9]]
Y = [[5,8,1],
[6,7,3],
[4,5,9]]
for i in range(len(r_0)):
    for j in range(len(r)):
        delta_r[i][j]= sqrt((r[i][j])**2 - (r_0[i][j])**2)

If you want the absolute element-wise difference between both matrices, you can easily subtract them with NumPy and use numpy.absolute on the resulting matrix.
import numpy as np
X = [[12,7,3],
[4 ,5,6],
[7 ,8,9]]
Y = [[5,8,1],
[6,7,3],
[4,5,9]]
result = np.absolute(np.array(X) - np.array(Y))
Outputs:
[[7 1 2]
[2 2 3]
[3 3 0]]
Alternatively (although unnecessary), if you were required to do so in native Python you could zip the dimensions together in a nested list comprehension.
result = [[abs(a-b) for a, b in zip(xrow, yrow)]
for xrow, yrow in zip(X,Y)]
Outputs:
[[7, 1, 2], [2, 2, 3], [3, 3, 0]]

Doing this becomes trivial if you cast your 2D arrays to numpy arrays:
import numpy as np
X = [[12, 7, 3],
[4, 5, 6],
[7, 8, 9]]
Y = [[5, 8, 1],
[6, 7, 3],
[4, 5, 9]]
X, Y = map(np.array, (X, Y))
result = X - Y
NumPy is designed to work easily and efficiently with matrices.
Also, you spoke about subtracting matrices, but you also seemed to want to square the individual elements and then take the square root of the result. This is also easy with NumPy:
result = np.sqrt((X ** 2) - (Y ** 2))
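One caveat with the square-root form, sketched here on the question's X and Y: wherever X**2 - Y**2 is negative, np.sqrt returns nan (and emits a RuntimeWarning), whereas the plain absolute difference is always defined. The names abs_diff, diff_sq, and root are illustrative only, and the warning is silenced with np.errstate just to keep the output clean.

```python
import numpy as np

X = np.array([[12, 7, 3],
              [4, 5, 6],
              [7, 8, 9]])
Y = np.array([[5, 8, 1],
              [6, 7, 3],
              [4, 5, 9]])

# Element-wise absolute difference: always defined
abs_diff = np.abs(X - Y)

# Difference of squares: negative wherever |X| < |Y|
diff_sq = X ** 2 - Y ** 2

# The square root of a negative entry is nan; suppress the warning for this demo
with np.errstate(invalid="ignore"):
    root = np.sqrt(diff_sq)

print(abs_diff)
print(np.isnan(root))  # True marks entries where X**2 - Y**2 < 0
```

Whether those nan entries are acceptable depends on what delta_r is supposed to represent, which the question does not say.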

I recommend using NumPy:
import numpy
X = numpy.array([
    [12, 7, 3],
    [4, 5, 6],
    [7, 8, 9]
])
Y = numpy.array([
    [5, 8, 1],
    [6, 7, 3],
    [4, 5, 9]
])
delta_r = numpy.sqrt(X ** 2 - Y ** 2)

Related

Use numpy to stack combinations of a 1D and 2D array

I have 2 numpy arrays, one 2D and the other 1D, for example like this:
import numpy as np
a = np.array(
[
[1, 2],
[3, 4],
[5, 6]
]
)
b = np.array(
[7, 8, 9, 10]
)
I want to get all possible combinations of the elements in a and b, treating a like a 1D array, so that it leaves the rows in a intact, but also joins the rows in a with the items in b. It would look something like this:
>>> combine1d(a, b)
[ [1 2 7] [1 2 8] [1 2 9] [1 2 10]
[3 4 7] [3 4 8] [3 4 9] [3 4 10]
[5 6 7] [5 6 8] [5 6 9] [5 6 10] ]
I know that there are slow solutions for this (like a for loop), but I need a fast solution to this as I am working with datasets with millions of integers.
Any ideas?
This is one of those cases where it's easier to build a higher dimensional object, and then fix the axes when you're done. The first two dimensions are the length of b and the length of a. The third dimension is the number of elements in each row of a plus 1. We can then use broadcasting to fill in this array.
x, y = a.shape
z, = b.shape
result = np.empty((z, x, y + 1))
result[...,:y] = a
result[...,y] = b[:,None]
At this point, to get the exact answer you asked for, you'll need to swap the first two axes, and then merge those two axes into a single axis.
result.swapaxes(0, 1).reshape(-1, y + 1)
An hour later. . . .
I realized by being a little bit more clever, I didn't need to swap axes. This also has the nice benefit that the result is a contiguous array.
def convert1d(a, b):
    x, y = a.shape
    z, = b.shape
    result = np.empty((x, z, y + 1))
    result[...,:y] = a[:,None,:]
    result[...,y] = b
    return result.reshape(-1, y + 1)
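For comparison, here is a sketch of a different technique (np.repeat plus np.tile, not the broadcasting approach above) that produces the same row order. The function name combine1d comes from the question; left and right are names made up for this example. Because np.empty defaults to float64, the broadcasting version returns floats, while this variant keeps the integer dtype of the inputs.

```python
import numpy as np

def combine1d(a, b):
    # Repeat each row of a once per element of b: row0, row0, ..., row1, row1, ...
    left = np.repeat(a, len(b), axis=0)
    # Cycle through b once per row of a: b0, b1, ..., b0, b1, ...
    right = np.tile(b, len(a))
    # Append right as an extra column
    return np.column_stack((left, right))

a = np.array([[1, 2], [3, 4], [5, 6]])
b = np.array([7, 8, 9, 10])
result = combine1d(a, b)
print(result)
```

I have not benchmarked the two versions against each other, so treat this as an alternative to try rather than a faster one.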
This is a very "scotch tape" solution:
import numpy as np
a = np.array(
[
[1, 2],
[3, 4],
[5, 6]
]
)
b = np.array(
[7, 8, 9, 10]
)
z = []
# Loop over the rows of a first so the order matches the desired output
for y in a:
    for x in b:
        z.append(np.append(y, x))
np.array(z).reshape(3, 4, 3)
You can use np.c_ to join the two arrays column-wise. I also used np.full to build a column from each element of the second array (b). The result looks like this:
result = [np.c_[a, np.full((a.shape[0],1), x)] for x in b]
result
Output
[array([[1, 2, 7],
[3, 4, 7],
[5, 6, 7]]),
array([[1, 2, 8],
[3, 4, 8],
[5, 6, 8]]),
array([[1, 2, 9],
[3, 4, 9],
[5, 6, 9]]),
array([[ 1, 2, 10],
[ 3, 4, 10],
[ 5, 6, 10]])]
The output might look a bit messy, but it contains the same rows as your desired output, grouped by the elements of b. To check, you can run the line below to see the first element of the result list:
print(result[0])
Output
[[1 2 7]
 [3 4 7]
 [5 6 7]]

How to sort a 2d numpy array based on their value when put into a function in python

Let's say I have a NumPy array:
[[7 2]
[7 3]
[2 8]
[4 3]
[5 5]]
Where the 0th index is the x value and the 1st index is the y value. How do I sort these values so that when I put them into the function:
(x^2 + y - 11)^2 + (x + y^2 - 7)^2, they get sorted in ascending order of the results? The sorted values would look like this:
[[4 3]
[5 5]
[7 2]
[7 3]
[2 8]]
The arrays can have duplicates.
One of my ideas would be to use the .argsort() method, though I don't know how I could implement that.
Thanks!
You can apply your function to each row (along axis 1) to get a one-dimensional array of function values. Passing that result to np.argsort() will give you the proper sorting indices:
import numpy as np
a = np.array([
    [7, 2],
    [7, 3],
    [2, 8],
    [4, 3],
    [5, 5]]
)
def my_func(row):
    x, y = row
    # Includes the - 7 term from the function in the question
    return (x ** 2 + y - 11) ** 2 + (x + y ** 2 - 7) ** 2
f = np.apply_along_axis(my_func, 1, a)
# array([1616, 1762, 3482,  100,  890])
indices = np.argsort(f)
# array([3, 4, 0, 1, 2])
a[indices]
# array([[4, 3],
#        [5, 5],
#        [7, 2],
#        [7, 3],
#        [2, 8]])
Per @mozway's comment, this is significantly faster since it allows NumPy to vectorize the function:
x, y = a.T
aa = (x ** 2 + y - 11) ** 2 + (x + y ** 2 - 7) ** 2
indices = np.argsort(aa)
a[indices]
with the same result.
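A self-contained sketch of the vectorized approach, assuming the function exactly as written in the question (including the - 7 term). The helper name objective and the choice of kind="stable" are my additions: a stable sort keeps rows with equal function values in their original order, which may matter since the arrays can contain duplicates.

```python
import numpy as np

def objective(x, y):
    # Function from the question, evaluated element-wise on arrays
    return (x ** 2 + y - 11) ** 2 + (x + y ** 2 - 7) ** 2

a = np.array([[7, 2],
              [7, 3],
              [2, 8],
              [4, 3],
              [5, 5]])

# Column 0 holds the x values, column 1 the y values
values = objective(a[:, 0], a[:, 1])

# Stable sort: ties keep their original relative order
order = np.argsort(values, kind="stable")
sorted_a = a[order]
print(sorted_a)
```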
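So this works:
def f(x, y):
    return (x**2 + y - 11)**2 + (x + y**2 - 7)**2
def sortTuples(TupleList):
    # One output slot per input pair (instead of a hard-coded five)
    output = [0] * len(TupleList)
    indexList = []
    for i in TupleList:
        x = i[0]
        y = i[1]
        indexList.append(f(x, y))
    indexList.sort()
    for i in TupleList:
        # Caveat: index() returns the first match, so pairs with equal
        # f values land in the same slot and overwrite each other
        output[indexList.index(f(i[0], i[1]))] = i
    return output
Hope you find a nicer way to do this!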
At least for small arrays, sorted is competitive with np.argsort (especially if lists suffice for your task):
out = sorted(arr.tolist(), key=lambda x: (x[0]**2+x[1]-11)**2+(x[0]+x[1]**2-7)**2)
Output:
[[4, 3], [5, 5], [7, 2], [7, 3], [2, 8]]

python numpy `np.take` with 2 dimensional array

I'm trying to take a list of elements from a 2D NumPy array given a list of coordinates, and I want to avoid using a loop. I saw that np.take works with 1D arrays, but I can't make it work with 2D arrays.
Example:
a = np.array([[1,2,3], [4,5,6]])
print(a)
# [[1 2 3]
# [4 5 6]]
np.take(a, [[1,2]])
# gives [[2 3]] but I want just [6]
I want to avoid a loop because I think it will be slower (I need speed). But if you can persuade me that a loop is as fast as an existing NumPy solution, I can go with it.
If I understand it correctly, you have a list of coordinates like this:
coords = [[y0, x0], [y1, x1], ...]
To get the values of array a at these coordinates you need:
a[[y0, y1, ...], [x0, x1, ...]]
So a[coords] will not work. One way to do it is:
Y = [c[0] for c in coords]
X = [c[1] for c in coords]
or
Y = np.transpose(coords)[0]
X = np.transpose(coords)[1]
Then
a[Y, X]
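As a concrete sketch of the steps above, using the array from the question: the coordinate values here are made up for illustration, and the names rows, cols, picked, and flat_idx are mine. The second half shows one way to get the same result through np.take, by first converting the 2D coordinates to flat indices with np.ravel_multi_index.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])

# (row, col) pairs; example values chosen for this sketch
coords = np.array([[1, 2], [0, 0], [1, 0]])

# Transposing a (k, 2) array yields one index array per axis
rows, cols = coords.T
picked = a[rows, cols]

# Equivalent lookup with np.take on the flattened array
flat_idx = np.ravel_multi_index((rows, cols), a.shape)
picked_take = np.take(a, flat_idx)

print(picked)
print(picked_take)
```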
Does fancy indexing do what you want? By default (axis=None), np.take works on the flattened array.
import numpy as np
a = np.arange(1, 10).reshape(3,3)
a
# array([[1, 2, 3],
# [4, 5, 6],
# [7, 8, 9]])
rows = [ 1,1,2,0]
cols = [ 0,1,1,2]
# Use the indices to access items in a
a[rows, cols]
# array([4, 5, 8, 3])
a[1,0], a[1,1], a[2,1], a[0,2]
# (4, 5, 8, 3)

How to get max (top) N values across entire numpy matrix

I want to get the top N (maximal) args & values across an entire numpy matrix, as opposed to across a single dimension (rows / columns).
Example input (with N=3):
import numpy as np
mat = np.matrix([[9,8, 1, 2], [3, 7, 2, 5], [0, 3, 6, 2], [0, 2, 1, 5]])
print(mat)
[[9 8 1 2]
[3 7 2 5]
[0 3 6 2]
[0 2 1 5]]
Desired output: [9, 8, 7]
Since max isn't transitive across a single dimension, going by rows or columns doesn't work.
# by rows, no 8
np.squeeze(np.asarray(mat.max(1).reshape(-1)))[:3]
array([9, 7, 6])
# by cols, no 7
np.squeeze(np.asarray(mat.max(0)))[:3]
array([9, 8, 6])
I have code that works, but looks really clunky to me.
# reshape into single vector
mat_as_vector = np.squeeze(np.asarray(mat.reshape(-1)))
# get top 3 arg positions
top3_args = mat_as_vector.argsort()[::-1][:3]
# subset the reshaped matrix
top3_vals = mat_as_vector[top3_args]
print(top3_vals)
[9 8 7]
Would appreciate any shorter way / more efficient way / magic numpy function to do this!
Using numpy.partition() is significantly faster than performing a full sort for this purpose:
np.partition(np.asarray(mat), mat.size - N, axis=None)[-N:]
assuming 1 <= N <= mat.size.
If you also need the top N values sorted, sort the previous result (you will presumably be sorting a much smaller array than the original one):
np.sort(np.partition(np.asarray(mat), mat.size - N, axis=None)[-N:])
If you need the result sorted from largest to smallest, append [::-1] to the previous command:
np.sort(np.partition(np.asarray(mat), mat.size - N, axis=None)[-N:])[::-1]
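The question also asks for the positions of the top values. A sketch of one way to get them, using np.argpartition and np.unravel_index in place of np.partition: it assumes a plain ndarray (for an np.matrix, convert with np.asarray(mat) first), and the names flat, top_idx, rows, and cols are illustrative.

```python
import numpy as np

mat = np.array([[9, 8, 1, 2],
                [3, 7, 2, 5],
                [0, 3, 6, 2],
                [0, 2, 1, 5]])
N = 3

flat = mat.ravel()

# Flat indices of the N largest values, in no particular order
top_idx = np.argpartition(flat, flat.size - N)[-N:]

# Order those N indices from largest value to smallest
top_idx = top_idx[np.argsort(flat[top_idx])[::-1]]

# Convert flat indices back to (row, col) positions
rows, cols = np.unravel_index(top_idx, mat.shape)
top_vals = flat[top_idx]

print(top_vals)
print(list(zip(rows.tolist(), cols.tolist())))
```

If several entries tie for the N-th place, which of them is returned is not something this sketch controls.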
One way may be with flatten and sorted and slice top n values:
sorted(mat.flatten().tolist()[0], reverse=True)[:3]
Result:
[9, 8, 7]
The idea is from this answer: How to get indices of N maximum values in a numpy array?
import numpy as np
import heapq
mat = np.matrix([[9,8, 1, 2], [3, 7, 2, 5], [0, 3, 6, 2], [0, 2, 1, 5]])
ind = heapq.nlargest(3, range(mat.size), mat.take)
print(mat.take(ind).tolist()[0])
Output
[9, 8, 7]

Trouble vectorizing code

I'm having a hard time doing this. I have two m x n matrices (A and B) and I need to multiply every column of A by the rows of B to generate an m x (n*n) matrix. I guess that wasn't a very clear explanation, so here is an example:
A =
[1 2
3 4]
B =
[5 6
7 8]
I wish to have:
[[5 6] [10 12]
[21 24] [28 32]]
I was able to do it using for loops, but I want to avoid them as much as possible. I'm using NumPy for all of this, and all data is stored as np.array.
Maybe:
>>> A = np.array([[1,2],[3,4]])
>>> B = np.array([[5,6],[7,8]])
>>> (A * B[None, :].T).T
array([[[ 5, 6],
[21, 24]],
[[10, 12],
[28, 32]]])
where we use None to add an extra dimension to B, and a few transpositions to get the alignment right.
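The same product can be written with explicit index names, which may be easier to read than the transpositions. In this sketch, out[k, j, i] = A[j, k] * B[j, i]; the einsum subscripts and the final reshape into the m x (n*n) layout from the question are my reading of the desired output, not something the answer states.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# Broadcasting form: out[k, j, i] = A[j, k] * B[j, i]
out = A.T[:, :, None] * B[None, :, :]

# Same computation spelled out with einsum
same = np.einsum("jk,ji->kji", A, B)

# Flatten to m x (n*n): for each row j, concatenate the blocks for every column k of A
flat = out.transpose(1, 0, 2).reshape(A.shape[0], -1)

print(out)
print(flat)
```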
If I understand you right, you want basic (m x n) matrix multiplication, right? Use numpy.dot():
>>> a = [[1, 0], [0, 1]]
>>> b = [[4, 1], [2, 2]]
>>> np.dot(a, b)
array([[4, 1],
[2, 2]])
