Modify NumPy array in loops - python

I have a problem with array manipulation in NumPy. If I create two arrays x and y, and do
x = x - y
I get what I expect, that is each element of y is subtracted from the corresponding element of x, and thus x is modified.
However, if I put this in a loop:
m = np.array([[1,2,3],[1,2,3]])
y = np.array([1, 1, 1])
for i in m:
    i = i - y
the matrix m remains unaltered. I am sure I am missing something very basic... How can I change the array m in a loop?

This has nothing to do with NumPy arrays in particular; it is how Python handles the assignment
i = i - y
i - y creates a new array. Assigning it to the name i rebinds i to that new array, so i no longer refers to the row of m it referred to before, and m itself is left untouched.
The following code does what you want:
for idx, i in enumerate(m):
    m[idx] = i - y
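A minimal sketch of the difference between rebinding the loop variable and assigning into m by index. The names row, view, and unchanged are only illustrative; np.shares_memory is used here to show which arrays still point at the data of m.

```python
import numpy as np

m = np.array([[1, 2, 3], [1, 2, 3]])
y = np.array([1, 1, 1])

for row in m:
    view = row                 # still a view of one row of m
    row = row - y              # builds a new array and rebinds the name only
    assert np.shares_memory(view, m)
    assert not np.shares_memory(row, m)

unchanged = m.copy()           # m is still [[1, 2, 3], [1, 2, 3]] at this point

for idx, row in enumerate(m):
    m[idx] = row - y           # indexed assignment writes into m itself

print(unchanged)
print(m)
```

After the second loop, m holds the subtracted values, while unchanged keeps the state m had after the first loop.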

Update: I realize that the easiest thing is to do
m = m-y
This does directly what I expected!

If you cannot avoid looping over the array but still want to change each row, use in-place subtraction. Each i is a view of a row of m, and -= modifies that view in place instead of rebinding the name:
m = np.array([[1,2,3],[1,2,3]])
y = np.array([1, 1, 1])
for i in m:
    i -= y

Numpy array assignment by boolean indices array

I have a very large array, but I'll use a smaller one to explain.
Given source array X
X = [ [1,1,1,1],
[2,2,2,2],
[3,3,3,3]]
A target array with the same size Y
Y = [ [-1,-1,-1,-1],
[-2,-2,-2,-2],
[-3,-3,-3,-3]]
And an assignment array IDX:
IDX = [ [1,0,0,0],
[0,0,1,0],
[0,1,0,1]]
I want to assign Y to X by IDX - Only assign where IDX==1
In this case, something like:
X[IDX] = Y[IDX]
will result in:
X = [ [-1,1,1,1],
[2,2,-2,2],
[3,-3,3,-3]]
How can this be done efficiently (not a for-loop) in numpy/pandas?
Thx
If IDX is a NumPy array of Boolean type, and X and Y are NumPy arrays then your intuition works:
X = np.array(X)
Y = np.array(Y)
IDX = np.array(IDX).astype(bool)
X[IDX] = Y[IDX]
This changes X in place.
If you don't want to do all this type casting, or don't want to overwrite X, then np.where() does what you want in one go. It returns a new array and leaves X unchanged:
np.where(IDX==1, Y, X)
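A short sketch of both approaches on the arrays from the question. The names mask, result, and X_inplace are only illustrative. It also shows why the cast to bool matters: with an integer 0/1 array, NumPy treats the values as row indices rather than as a mask.

```python
import numpy as np

X = np.array([[1, 1, 1, 1],
              [2, 2, 2, 2],
              [3, 3, 3, 3]])
Y = -X
IDX = np.array([[1, 0, 0, 0],
                [0, 0, 1, 0],
                [0, 1, 0, 1]])

mask = IDX.astype(bool)

# Out of place: take Y where the mask is True, X elsewhere. X is not modified.
result = np.where(mask, Y, X)

# In place, on a copy so the original X stays available for comparison.
X_inplace = X.copy()
X_inplace[mask] = Y[mask]

# Indexing with the integer array selects whole rows 0 and 1 instead of masking.
print(X[IDX].shape)   # (3, 4, 4)
print(result)
```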

Avoiding for loop with numpy and function parameter

I am trying to get good at NumPy and want to know if I can use values in existing arrays as indices, passed through a function, to pick values out of another array. I can do this:
def somefun(i):
    return i+1
x = np.array([2, 4, 5])
k_labs = np.arange(100)
k_labs2 = k_labs[somefun(x[:])]
But how do I handle the case where x is a 2D array and I want to use one column at a time, such as x[:, i], as the index argument for a function, without using for loops?
For example:
x = np.array([[2, 4, 5],[7, 8, 9]])
def somefun(i):
    return i+1
k_labs = np.arange(100)
k_labs2 = k_labs[somefun(x[:, i])]
EDIT ITERATION 2
To get the gist of what I am trying to accomplish, see the code below. In the function pred, I wanted to rewrite the commented-out lines in a NumPy fashion that might work better. I have some problems with the two lines I put in instead, though: I get a broadcasting error in the function distance, at the line where I assign the norms to a variable.
class kNN:
    def __init__(self, X_train : np.array, label_train, val = None):
        self.X = X_train#X[:-1, :]
        self.labels = label_train#X[-1, :]
        #self.k = k
        self.kNN_4all = None #np.zeros(self.X.shape[1])
    def distance(self, x1):
        x1 = np.tile(x1, (self.X.shape[1], 1)) #creates a matrix of len of X with copies of x1 vector for easy matrix subtraction.
        dists = np.linalg.norm(x1 - self.X.T, axis = 1) #Flips to find linalg.norm for all the axis
        return dists
    def k_nearest(self, x_vec, k):
        k_nearest = self.distance(x_vec)
        k_nearest = np.argsort(k_nearest)[ :k]
        kNN_labs = np.zeros(k_nearest.shape)
        kNN_labs[:] = self.labels[k_nearest[:]]
        unique, vote = np.unique(kNN_labs, return_counts=True)
        return unique[np.argmax(vote)]
    def pred(self, X_test, k):
        self.kNN_4all = np.zeros(X_test.shape[1])
        self.kNN_4all = self.k_nearest(X_test[:, :], k)
        #for i in range(X_test.shape[1]):
        #    NewLabel = self.k_nearest(X_test[:, i], k) #defines x_vec in matrix X
        #    self.kNN_4all[i] = NewLabel
        #return self.kNN_4all
    def prec(self, labels_val):
        elem_equal = (self.kNN_4all == labels_val).astype(int).flatten()
        prec = np.sum(elem_equal)/elem_equal.shape
        return 1 - prec[0]
X_train = X[:, :100]
labs_train = labs[:100]
pilot = kNN(X_train, labs_train)
pilot.pred(X[:,100:200], 10)
pilot.prec(labs[100:200])
I get the following error:
ValueError: operands could not be broadcast together with shapes (78400,100) (100,784)
As we can see from the code, k_nearest(self, x_vec, k) takes a single 1D subarray, so passing a full matrix X causes the broadcasting error, since the functions called within k_nearest rely on receiving only a 1D subarray.
I don't know if it really is possible to avoid for loops here. What I want is for NumPy to step through the 1D subarrays, pass each one as the argument of a function, and assign the result of each call to a different cell of another array, in this case self.kNN_4all.
x = np.array([[2, 4, 5], [7, 8, 9], [33, 50, 71]])
x = x + 1
k_labs = np.arange(100)
ttt = k_labs[x]
print(ttt)
ttt is an array holding the values of k_labs at the positions given by x, which acts as an array of indices. Its rows are accessed as usual, for example:
print(ttt[1])  # [ 8  9 10]
If you only need the values for a single row of indices (for example x[2]), index with that row directly:
x = np.array([[2, 4, 5], [7, 8, 9], [33, 50, 71]])
x = x + 1
k_labs = np.arange(100)
print(k_labs[x[2]])
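For the kNN case in the question, one way to drop the loop over test columns is to compute all pairwise distances with broadcasting and then index the label array with the resulting 2D index array, as above. This is only a sketch under the question's layout (one sample per column); knn_predict and the toy data are hypothetical and not part of the original class.

```python
import numpy as np

def knn_predict(X_train, labels_train, X_test, k):
    """Majority-vote label for every column of X_test (hypothetical helper)."""
    # Arrays are (n_features, n_samples), as in the question.
    diff = X_test.T[:, None, :] - X_train.T[None, :, :]   # (n_test, n_train, n_features)
    dists = np.linalg.norm(diff, axis=2)                  # (n_test, n_train)
    nearest = np.argsort(dists, axis=1)[:, :k]            # indices of the k nearest, per test sample
    neighbour_labels = labels_train[nearest]              # (n_test, k), integer-array indexing
    classes = np.unique(labels_train)
    # Count votes per class without a Python loop.
    votes = (neighbour_labels[:, :, None] == classes).sum(axis=1)  # (n_test, n_classes)
    return classes[votes.argmax(axis=1)]

# Toy data, made up for illustration: two clusters, four training samples.
X_train = np.array([[0, 0, 10, 10],
                    [0, 1, 10, 11]], dtype=float)
labels_train = np.array([0, 0, 1, 1])
X_test = np.array([[0.2, 9.0],
                   [0.1, 10.0]])

preds = knn_predict(X_train, labels_train, X_test, k=3)
print(preds)
```

The intermediate diff array holds n_test * n_train * n_features values, so for large inputs it may be necessary to process the test samples in chunks.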

Creating Density/Heatmap Plot from Coordinates and Magnitude in Python

I have some data which is the number of readings at each point on a 5x10 grid, which is in the format of;
X = [1, 2, 3, 4,..., 5]
Y = [1, 1, 1, 1,...,10]
Z = [9,8,14,0,89,...,0]
I would like to plot this as a heatmap/density map from above, but all of the matplotlib graphs (incl. contourf) that I have found are requiring a 2D array for Z and I don't understand why.
EDIT: I have now collected the actual coordinates that I want to plot, which are not as regular as the ones above. They are:
X = [8,7,7,7,8,8,8,9,9.5,9.5,9.5,11,11,11,10.5,
10.5,10.5,10.5,9,9,8, 8,8,8,6.5,6.5,1,2.5,4.5,
4.5,2,2,2,3,3,3,4,4.5,4.5,4.5,4.5,3.5,2.5,2.5,
1,1,1,2,2,2]
Y = [5.5,7.5,8,9,9,8,7.5,6,6.5,8,9,9,8,6.5,5.5,
5,3.5,2,2,1,2,3.5,5,1,1,2,4.5,4.5,4.5,4,3,
2,1,1,2,3,4.5,3.5,2.5,1.5,1,5.5,5.5,6,7,8,9,
9,8,7]
z = [286,257,75,38,785,3074,1878,1212,2501,1518,419,33,
3343,1808,3233,5943,10511,3593,1086,139,565,61,61,
189,155,105,120,225,682,416,30632,2035,165,6777,
7223,465,2510,7128,2296,1659,1358,204,295,854,7838,
122,5206,6516,221,282]
From what I understand you can't use floats in a np.array so I have tried to multiply all values by 10 so that they are all integers, but I am still running into some issues. Am I trying to do something that will not work?
They expect a 2D array because they use the row and column index to set the position of each value. For example, if array[2, 3] = 5, then the heatmap shows the value 5 at x = 2, y = 3.
So, let's try transforming your current data into a single array (np.zeros is used so that any cell without a reading shows 0):
>>> array = np.zeros((len(set(X)), len(set(Y))))
>>> for x, y, z in zip(X, Y, Z):
...     array[x-1, y-1] = z
If X, Y and Z are NumPy integer arrays, the loop can be replaced with a single indexed assignment:
>>> array = np.zeros((X.max(), Y.max()))
>>> array[X - 1, Y - 1] = Z
And now just plot the array as you prefer:
>>> plt.imshow(array, cmap="hot", interpolation="nearest")
>>> plt.show()
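For the irregular, non-integer coordinates in the question's edit, index-based filling no longer applies, because floats cannot be used as array indices (floats as array values are fine, so no scaling by 10 is needed). One possible alternative, not taken from the answer above, is to bin the points onto a grid with np.histogram2d and its weights argument. The sample points and bin choices below are made up for illustration.

```python
import numpy as np

# Made-up scattered readings: coordinates need not be integers or form a regular grid.
x = np.array([1.0, 2.5, 2.5, 4.5])
y = np.array([1.0, 1.0, 1.0, 3.5])
z = np.array([10, 20, 5, 7])

# Sum the readings that fall into each cell of a 4 x 3 grid.
grid, x_edges, y_edges = np.histogram2d(
    x, y,
    bins=(4, 3),
    range=[[1, 5], [1, 4]],
    weights=z,
)

print(grid.shape)   # (4, 3): first axis follows x, second follows y
print(grid)
```

The resulting grid can be passed to plt.imshow; since its first axis follows x, plotting grid.T with origin="lower" puts x on the horizontal axis.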

List Comprehension to Create Vector Twice the Square of a Column

I need to write a list comprehension to create a vector twice the square of the middle column of a matrix. (My matrix x = [[1,2,3],[4,5,6],[7,8,9]].) Problem is, I know how to extract the middle column BUT I don't know how to square it or double the square. Any help would be greatly appreciated (...still learning but trying my best)!
x = np.array([[1,2,3],[4,5,6],[7,8,9]])
print(x)
z = [b[1] for b in x]
print(z)
To create a vector twice the square of the column:
import numpy as np
x = np.array([[1,2,3],[4,5,6],[7,8,9]])
print(x)
with a list comprehension: (not recommended)
z = [2*b[1]**2 for b in x]
print(z)
The output is a python list:
[8, 50, 128]
using numpy indexing: (recommended)
z = 2 * x[:,1] ** 2
print(z)
The output is a numpy array:
[ 8 50 128]

Numpy array: concatenate arrays and integers

In my Python program I concatenate several integers and an array. It would be intuitive if this would work:
x,y,z = 1,2,np.array([3,3,3])
np.concatenate((x,y,z))
However, instead all ints have to be converted to np.arrays:
x,y,z = 1,2,np.array([3,3,3])
np.concatenate((np.array([x]),np.array([y]),z))
Especially if you have many variables this manual converting is tedious. The problem is that x and y are 0-dimensional arrays, while z is 1-dimensional. Is there any way to do the concatenation without the converting?
They just have to be sequence objects, not necessarily numpy arrays:
x,y,z = 1,2,np.array([3,3,3])
np.concatenate(([x],[y],z))
# array([1, 2, 3, 3, 3])
Numpy also does have an insert function that will do this:
x,y,z = 1,2,np.array([3,3,3])
np.insert(z, [0,0], [x, y])
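Two further options that accept scalars directly, shown as a small sketch: np.hstack promotes each input to at least one dimension before concatenating, and np.r_ builds an array from a mix of scalars and arrays. The names stacked and joined are only illustrative.

```python
import numpy as np

x, y, z = 1, 2, np.array([3, 3, 3])

stacked = np.hstack((x, y, z))   # scalars are treated as length-1 arrays
joined = np.r_[x, y, z]          # index-style concatenation along the first axis

print(stacked)
print(joined)
```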
I'll add that if you're just trying to add integers to a list, you don't need NumPy to do it:
x,y,z = 1,2,[3,3,3]
z = [x] + [y] + z
or
x,y,z = 1,2,[3,3,3]
[x, y] + z
or
x,y,z = 1,2,[3,3,3]
z.insert(0, y)
z.insert(0, x)
