Change dictionary into label array - python

So I have a 2d array
X = [[ 7.3571296 0.49626 ]
[-0.7747436 3.14599 ]
[ 3.7817762 4.1808457 ]
[ 4.5332413 6.8228664 ]
[ 7.4655724 -0.11392868]
[ 2.416418 4.692072 ]]
and a cluster label array.
y = [1 3 2 2 1 3]
Then I have an algorithm that can predict the label of the 2d array.
Z = {1: array([[ 7.3571296 0.49626 ],
[ 7.4655724 -0.11392868]]),
2: array([[ 3.7817762 4.1808457 ],
[ 2.416418 4.692072 ]]),
3: array([[-0.7747436 3.14599 ],
[ 4.5332413 6.8228664 ]])}
I want to match my predicted label with original label to know my algorithm's accuracy. But how can I extract the dictionary format into label array format? (i.e. y_pred = [1 3 2 3 1 2])

You can use the keys() method of the dictionary and cast it to list.
import numpy as np
Z = {1: np.asarray([[7.3571296, 0.49626], [7.4655724, -0.11392868]]),
     2: np.asarray([[3.7817762, 4.1808457], [2.416418, 4.692072]]),
     3: np.asarray([[-0.7747436, 3.14599], [4.5332413, 6.8228664]])}
print(list(Z.keys()))  # [1, 2, 3]
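Note that list(Z.keys()) returns only the distinct labels, not one label per row of X. Below is a minimal sketch of one way to build y_pred. It assumes that every row stored in Z is an (almost) exact copy of a row in X and that no row appears under two labels; X is retyped from the question.

```python
import numpy as np

X = np.array([[7.3571296, 0.49626], [-0.7747436, 3.14599],
              [3.7817762, 4.1808457], [4.5332413, 6.8228664],
              [7.4655724, -0.11392868], [2.416418, 4.692072]])
Z = {1: np.array([[7.3571296, 0.49626], [7.4655724, -0.11392868]]),
     2: np.array([[3.7817762, 4.1808457], [2.416418, 4.692072]]),
     3: np.array([[-0.7747436, 3.14599], [4.5332413, 6.8228664]])}

y_pred = np.zeros(len(X), dtype=int)
for label, rows in Z.items():
    # matches[i, j] is True when X[i] equals rows[j] in every column
    matches = np.isclose(X[:, None, :], rows[None, :, :]).all(axis=2)
    # every row of X that matches some row of this cluster gets its label
    y_pred[matches.any(axis=1)] = label

print(y_pred)  # [1 3 2 3 1 2]
```

The resulting y_pred can then be compared element-wise with y, keeping in mind that cluster labels from an unsupervised algorithm may be permuted relative to the original ones.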

Related

Filtering an np array using values from each row

I have an array with the shape (29,2):
arr = [[ 405.95168576 1033. ]
[ 406.23572583 1033. ]
[ 407.49812423 1028. ]
[ 402.66145728 1029. ]
[ 404.11080846 1032. ]
[ 401.75897118 1033. ]
[ 402.29352509 1029. ]
[ 402.34504752 1024. ]
[ 402.69938672 1027. ]
[ 400.55298544 1029. ]
[ 401.41432112 1027. ]
[ 400.89318038 1027. ]
[ 401.07444532 1029. ]
[ 400.43212193 1033. ]
[ 400.38178995 1027. ]
[ 399.89895625 1025. ]
[ 399.88394127 1031. ]
[ 399.97766298 1021. ]
[ 399.68084993 1027. ]
[ 399.65810987 1029. ]
[ 399.40565484 1020. ]
[ 399.34339145 1023. ]
[ 399.39613518 1019. ]
[ 399.37733697 1020. ]
[ 399.38314402 1020. ]
[ 399.47479381 1025. ]
[ 399.44134998 1025. ]
[ 399.43511907 1020. ]
[ 399.40346787 1020. ]]
For each row, I would like to find whether its arr[:,0] value is the maximum over all rows whose arr[:,1] value is less than or equal to that row's arr[:,1] value.
At the moment, I have the following code, which produces the correct result:
import numpy as np
res = np.array([])
for i in range(arr.shape[0]):
    print(np.max(arr[:,0][arr[:,1] <= arr[i][1]]))
    if arr[i][0] >= np.max(arr[:,0][arr[:,1] <= arr[i][1]]):
        res = np.hstack((res, True))
    else:
        res = np.hstack((res, False))
print(res)
Is there a way to perform this operation in pure numpy, i.e. without using the loop?
The following approach:
uses np.lexsort to order the array first by the second column ascending, then by the first column descending
uses np.maximum.accumulate to calculate the accumulated maxima
reverses the sorted order back to the original order to be able to compare
import numpy as np
arr = np.array([[405.95168576, 1033], [406.23572583, 1033], [407.49812423, 1028], [402.66145728, 1029], [404.11080846, 1032], [401.75897118, 1033], [402.29352509, 1029], [402.34504752, 1024], [402.69938672, 1027], [400.55298544, 1029], [401.41432112, 1027], [400.89318038, 1027], [401.07444532, 1029], [400.43212193, 1033], [400.38178995, 1027], [399.89895625, 1025], [399.88394127, 1031], [399.97766298, 1021], [399.68084993, 1027], [399.65810987, 1029], [399.40565484, 1020], [399.34339145, 1023], [399.39613518, 1019], [399.37733697, 1020], [399.38314402, 1020], [399.47479381, 1025], [399.44134998, 1025], [399.43511907, 1020], [399.40346787, 1020]])
# sort on arr[:,1] ascending then on arr[:,0] descending, return the indices
ind_sorted = np.lexsort((-arr[:, 0], arr[:, 1]))
# calculate the accumulated maxima of the sorted list
max_per_level_sorted = np.maximum.accumulate(arr[ind_sorted, 0])
# get the ordering that maps the sorted values back to the originals
reverse_sorted = np.argsort(ind_sorted)
# get the maxima in the order of the original array
max_per_level = max_per_level_sorted[reverse_sorted]
res = arr[:, 0] >= max_per_level
print(res.astype(int)) # res is a boolean array, show it as integers
[0 0 1 0 0 0 0 1 1 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0]
If you really want to, you can compress all this together:
ind_sorted = np.lexsort((-arr[:, 0], arr[:, 1]))
res = arr[:, 0] >= np.maximum.accumulate(arr[ind_sorted, 0])[np.argsort(ind_sorted)]
print(res.astype(int))
Here is a visualization:
import matplotlib.pyplot as plt
plt.scatter(arr[:, 1], arr[:, 0], color='dodgerblue')
plt.scatter(arr[res, 1], arr[res, 0], fc='none', ec='crimson', lw=2, s=100, marker='H')
plt.show()
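As a cross-check for small inputs, the same condition can also be written with plain broadcasting. This is only a sketch: it builds an n-by-n comparison matrix, so memory grows quadratically with the number of rows, and the five-row array below is made-up illustration data rather than the array from the question.

```python
import numpy as np

# Made-up data: column 0 holds the values, column 1 holds the levels
arr = np.array([[5.0, 3], [7.0, 1], [6.0, 2], [4.0, 3], [8.0, 3]])
values, levels = arr[:, 0], arr[:, 1]

# eligible[i, j] is True when row j's level is <= row i's level
eligible = levels[None, :] <= levels[:, None]
# for each row i, the maximum value among its eligible rows
max_per_row = np.where(eligible, values[None, :], -np.inf).max(axis=1)
res = values >= max_per_row

print(res.astype(int))  # [0 1 0 0 1]
```

For large arrays the lexsort approach is likely the better fit, since it avoids the quadratic intermediate.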

appending and formatting multi dimensional arrays in python Numpy

I want to write code that appends a row to the two-dimensional array order, depending on the last element of its first column, order[-1, 0]. If that element is 0, it should append the row (1, 10000), i.e. 1 in the first column and 10000 in the second. If that element is 1, it should append the row (2, 20000). How could I write such code without using a for loop or a list comprehension?
import numpy as np
order = np.array([[ 0, 38846],
[ 1, 51599],
[ 0, 51599],
[ 1, 52598],
[ 0, 290480],
[ 1, 335368],
[ 0, 335916]])
Expected Output
#if the last element of the first column is 1
[[ 0, 38846]
[ 1, 51599]
[ 0, 51599]
[ 1, 52598]
[ 0, 290480]
[ 1, 335368]
[ 0, 335916]
[ 2, 20000]]
#if the last element of the first column is 0
[[ 0 38846]
[ 1 51599]
[ 0 51599]
[ 1 52598]
[ 0 290480]
[ 1 335368]
[ 0 335916]
[ 1 10000]]
def extend(order):
    if order[-1, 0] == 0:
        return np.concatenate([order, np.array([[1, 10000]])], axis=0)
    elif order[-1, 0] == 1:
        return np.concatenate([order, np.array([[2, 20000]])], axis=0)
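The same branching can also be written as a lookup table. This is a sketch under the assumption that only the two cases from the question (last value 0 or 1) can occur. NEXT_ROW and extend_lookup are hypothetical names, and any other last value raises a KeyError rather than returning None.

```python
import numpy as np

# Hypothetical lookup: last value of the first column -> row to append
NEXT_ROW = {0: [1, 10000], 1: [2, 20000]}

def extend_lookup(order):
    """Append the row selected by the last element of the first column."""
    new_row = NEXT_ROW[int(order[-1, 0])]
    return np.vstack([order, new_row])

order = np.array([[0, 38846], [1, 51599], [0, 51599], [1, 52598],
                  [0, 290480], [1, 335368], [0, 335916]])
once = extend_lookup(order)   # last row becomes [1, 10000]
twice = extend_lookup(once)   # last row becomes [2, 20000]
print(twice[-2:])
```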

python - numpy fancy broadcasting for special case riddle

I want to do some force calculations between vertices, and because the forces are symmetrical I have a list of vertex pairs that need those forces added. I am sure it's possible with fancy indexing, but so far I can only get it to work with a slow Python for-loop. For symmetry reasons, the right-hand side of the index array needs a negative sign when adding the forces.
Consider the vertex index array:
>>> I = np.array([[0,1],[1,2],[2,0]])
I = [[0 1]
[1 2]
[2 0]]
and the x,y forces array for each pair:
>>> F = np.array([[3,6],[4,7],[5,8]])
F = [[3 6]
[4 7]
[5 8]]
the wanted operation could be described as:
"vertex #0 sums the force vectors (3,6) and (-5,-8),
vertex #1 sums the force vectors (-3,-6) and (4,7),
vertex #2 sums the force vectors (-4,-7) and (5,8)"
Desired results:
    [ 3  6 ]   [ 0  0 ]   [-5 -8 ]   [-2 -2 ]  //resulting force vertex #0
A = [-3 -6 ] + [ 4  7 ] + [ 0  0 ] = [ 1  1 ]  //resulting force vertex #1
    [ 0  0 ]   [-4 -7 ]   [ 5  8 ]   [ 1  1 ]  //resulting force vertex #2
edit:
my ugly for-loop solution:
import numpy as np
I = np.array([[0,1],[1,2],[2,0]])
F = np.array([[3,6],[4,7],[5,8]])
A = np.zeros((3,2))
A_x = np.zeros((3,2))
A_y = np.zeros((3,2))
for row in range(0,len(F)):
    A_x[I[row][0],0]= F[row][0]
    A_x[I[row][1],1]= -F[row][0]
    A_y[I[row][0],0]= F[row][1]
    A_y[I[row][1],1]= -F[row][1]
A = np.hstack((np.sum(A_x,axis=1).reshape((3,1)),np.sum(A_y,axis=1).reshape((3,1))))
print(A)
A= [[-2. -2.]
[ 1. 1.]
[ 1. 1.]]
Your current "push-style" interpretation of I is
For row-index k in I, take the forces from F[k] and add/subtract them to out[I[k], :]
I = np.array([[0,1],[1,2],[2,0]])
out = np.zeros_like(F)
for k, d in enumerate(I):
    out[d[0], :] += F[k]
    out[d[1], :] -= F[k]
out
# array([[-2, -2],
#        [ 1,  1],
#        [ 1,  1]])
However, you can also turn the meaning of I on its head and make it "pull-style", so it says
For row-index k in I, set vertex out[k] to be the difference of F[I[k]]
I = np.array([[0,2],[1,0],[2,1]])
out = np.zeros_like(F)
for k, d in enumerate(I):
    out[k, :] = F[d[0], :] - F[d[1], :]
out
# array([[-2, -2],
#        [ 1,  1],
#        [ 1,  1]])
In which case the operation simplifies quite easily to mere fancy indexing:
out = F[I[:, 0], :] - F[I[:, 1], :]
# array([[-2, -2],
# [ 1, 1],
# [ 1, 1]])
You can preallocate an array to hold the shuffled forces and then use the index like so:
>>> N = I.max() + 1
>>> out = np.zeros((N, 2, 2), F.dtype)
>>> out[I, [1, 0]] = F[:, None, :]
>>> np.diff(out, axis=1).squeeze()
array([[-2, -2],
[ 1, 1],
[ 1, 1]])
or, equivalently,
>>> out = np.zeros((2, N, 2), F.dtype)
>>> out[[[1], [0]], I.T] = F
>>> np.diff(out, axis=0).squeeze()
array([[-2, -2],
[ 1, 1],
[ 1, 1]])
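Both assignment-based variants above store one force per slot, so they rely on each vertex appearing at most once in each column of I. If a vertex can appear in several pairs, an unbuffered accumulation with np.add.at and np.subtract.at may be a safer choice. A minimal sketch using the data from the question:

```python
import numpy as np

I = np.array([[0, 1], [1, 2], [2, 0]])
F = np.array([[3, 6], [4, 7], [5, 8]])

# one row of accumulated force per vertex
out = np.zeros((I.max() + 1, F.shape[1]), dtype=F.dtype)
# add each pair's force to its first vertex ...
np.add.at(out, I[:, 0], F)
# ... and subtract it from its second vertex
np.subtract.at(out, I[:, 1], F)

print(out)
# [[-2 -2]
#  [ 1  1]
#  [ 1  1]]
```

Unlike out[idx] += F, the .at methods accumulate correctly when the same index occurs more than once.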
The way I understand the question, the values in the I array represent the vertex number, or the name of the vertex. They are not an actual positional index. Based on this thought, I have a different solution that uses the original I array. It is not entirely loop-free, but it should be OK for a reasonable number of vertices:
I = np.array([[0,1],[1,2],[2,0]])
F = np.array([[3,6],[4,7],[5,8]])
pos = I[:, 0]
neg = I[:, 1]
A = np.zeros_like(F)
unique = np.unique(I)
for i, vertex_number in enumerate(unique):
    A[i] = F[np.where(pos==vertex_number)] - F[np.where(neg==vertex_number)]
# produces the expected result
# [[-2 -2]
#  [ 1  1]
#  [ 1  1]]
Maybe this loop can also be replaced by some numpy magic.

Extracting coordinates from two numpy arrays

Say you have two numpy arrays: A = [x1,x2,x3,x4,x5], which holds all the x coordinates, and B = [y1,y2,y3,y4,y5], which holds the y coordinates. How would one "extract" a set of coordinates, e.g. (x1,y1), so that I could actually do something with it? Could I use a for loop or something similar? I can't seem to find any good examples, so if you could direct me to some or show me some I would be grateful.
Not sure if that's what you're looking for. But you can use numpy.concatenate. You just have to add a fake dimension before with [:,None] :
import numpy as np
a = np.array([1,2,3,4,5])
b = np.array([6,7,8,9,10])
arr_2d = np.concatenate([a[:,None],b[:,None]], axis=1)
print(arr_2d)
# [[ 1  6]
#  [ 2  7]
#  [ 3  8]
#  [ 4  9]
#  [ 5 10]]
Once you have generated a 2D array you can just use arr_2d[i] to get the i-th set of coordinates.
import numpy as np
a = np.array([1, 2, 3, 4, 5])
b = np.array([6, 7, 8, 9, 10])
print(np.hstack([a[:, np.newaxis], b[:, np.newaxis]]))
[[ 1 6]
[ 2 7]
[ 3 8]
[ 4 9]
[ 5 10]]
As @user2314737 said in a comment, you could do it manually by simply grabbing the same element from each array, like so:
a = np.array([1,2,3])
b = np.array([4,5,6])
index = 2 #completely arbitrary index choice
#as individual values
pointA = a[index]
pointB = b[index]
#or in tuple form
point = (a[index], b[index])
If you need all of them converted to coordinate form, then @Nuageux's answer is probably better.
Let's say you have x = np.array([ 0.48, 0.51, -0.43, 2.46, -0.91]) and y = np.array([ 0.97, -1.07, 0.62, -0.92, -1.25])
Then you can use the zip function
zip(x,y)
This will create an iterator (a zip object in Python 3). Turn this iterator into a list and turn the result into a numpy array
np.array(list(zip(x,y)))
the result will look like this
array([[ 0.48, 0.97],
[ 0.51, -1.07],
[-0.43, 0.62],
[ 2.46, -0.92],
[-0.91, -1.25]])
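As a further option, np.column_stack builds the same 2D array directly from the two 1D arrays, without going through a Python list. A short sketch reusing the x and y values above; the names points and first are just for illustration:

```python
import numpy as np

x = np.array([0.48, 0.51, -0.43, 2.46, -0.91])
y = np.array([0.97, -1.07, 0.62, -0.92, -1.25])

# each row of points is one (x, y) coordinate pair
points = np.column_stack((x, y))

first = tuple(points[0])  # the pair (x1, y1)
for px, py in points:     # iterate over all coordinate pairs
    print(px, py)
```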

Numpy change all elements on condition on multidimensional array

I want to change a pixel to [0,0,0] if its color is blue, and to [255,255,255] otherwise. The code below works, but is extremely slow:
for row in range(w):
    for col in range(h):
        if np.array_equal(image[row][col], [255,0,0]):
            image[row][col] = (0,0,0)
        else:
            image[row][col] = (255,255,255)
I know np.where works for single dimensional arrays, but how can I use that function to replace stuff for a 3 dimensional object?
Since you brought up numpy.where, this is how you'd do it using numpy.where:
import numpy as np
# Make an example image
image = np.random.randint(0, 255, (10, 10, 3))
image[2, 2, :] = [255, 0, 0]
# Define the color you're looking for
pattern = np.array([255, 0, 0])
# Make a mask to use with where
mask = (image == pattern).all(axis=2)
newshape = mask.shape + (1,)
mask = mask.reshape(newshape)
# Finish it off
image = np.where(mask, [0, 0, 0], [255, 255, 255])
The reshape is there so that numpy will apply broadcasting: the mask of shape (rows, cols, 1) is broadcast against the 3-element color arrays.
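For comparison, the same kind of mask can also be used for direct boolean-index assignment, which avoids the reshape. A small sketch on a made-up 4x4 image (the fill value 7 is arbitrary, and result is a hypothetical name):

```python
import numpy as np

# Made-up example image with a single pixel of the target color
image = np.full((4, 4, 3), 7, dtype=np.uint8)
image[2, 2] = [255, 0, 0]
pattern = np.array([255, 0, 0], dtype=np.uint8)

# mask[r, c] is True where all three channels match the pattern
mask = (image == pattern).all(axis=2)

# start from an all-white image, then black out the matching pixels
result = np.full_like(image, 255)
result[mask] = 0

print(result[2, 2])  # [0 0 0]
print(result[0, 0])  # [255 255 255]
```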
The simplest thing you could do is just multiply the element you want to zero out by zero. An example of this for a three-dimensional array is shown below.
import numpy as np
x = np.array([[[1, 2, 3], [2, 3, 4]], [[1, 2, 3], [2, 3, 4]], [[1, 2, 3], [2, 3, 4]], [[1, 2, 3], [2, 3, 4]]])
print(x)
if 1:
    x[0] = x[0] * 0
print(x)
This will yield two printouts:
[[[1 2 3]
[2 3 4]]
[[1 2 3]
[2 3 4]]...
and
[[[0 0 0]
[0 0 0]]
[[1 2 3]
[2 3 4]]...
This method will work for both image[row], and image[row][column] in your example. Your example reworked would look like:
for row in range(w):
    for col in range(h):
        if np.array_equal(image[row][col], [255,0,0]):
            image[row][col] = 0
        else:
            image[row][col] = 255
