Generating Random vectors and matrices for weights and bias - python

I am trying to estimate the chance of fire from two sensor readings, x1 and x2.
y=1
For this, I am trying to generate random vectors and matrices for weights and bias but I get an error.
import numpy as np
np.random.seed(seed=123)
w1 = np.random.rand(4,2)
b1 = 4*1
x = np.array([0.4, 0.32])
z1 = np.dot(w1,x) + b1
a1 = 1 / (1+np.exp(-z1))
np.random.seed(seed=123)
w2 = np.random.rand(1,4)
b2 = 1*1
z2 = np.dot(w2,x) + b2
a2 = 1 /(1+np.exp(-z2))
But I get the error below:
----> 1 z2 = np.dot(w2,x) + b2
2 a2 = np.tanh(Z1)
3 print(a2)
ValueError: shapes (2,4) and (2,) not aligned: 4 (dim 1) != 2 (dim 0)
I am not able to figure out how to solve this.

The answer is in the error: you are trying to multiply w2 by x, and their shapes are not compatible for multiplication.
Matrix w2 has 1 row and 4 columns:
>>> w2 = np.random.rand(1,4)
>>> w2.shape
(1, 4)
Vector x has 2 entries:
>>> x = np.array([0.4, 0.32])
>>> x.shape
(2,)
Therefore, you cannot multiply them: two matrices can be multiplied only if the number of columns in the first, w2, equals the number of rows in the second, x (for a 1-D array, its length). Here, as the error says, 4 (dim 1) != 2 (dim 0).
You can solve this by either giving x four entries or giving w2 two columns.
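If the intent is a two-layer network, a likely fix is to feed the hidden activations a1 (length 4) into the second layer instead of the raw input x. The sketch below assumes that intent, and it also assumes one random bias per unit, which the question's title suggests but its code does not do:

```python
import numpy as np

np.random.seed(123)

x = np.array([0.4, 0.32])        # two sensor readings, shape (2,)

# Hidden layer: 4 units, each with 2 input weights.
w1 = np.random.rand(4, 2)
b1 = np.random.rand(4)           # assumption: one random bias per hidden unit
z1 = np.dot(w1, x) + b1          # (4, 2) dot (2,) -> (4,)
a1 = 1 / (1 + np.exp(-z1))       # sigmoid activation

# Output layer: 1 unit with 4 input weights, fed with a1 rather than x.
w2 = np.random.rand(1, 4)
b2 = np.random.rand(1)           # assumption: random bias for the output unit
z2 = np.dot(w2, a1) + b2         # (1, 4) dot (4,) -> (1,)
a2 = 1 / (1 + np.exp(-z2))

print(a1.shape, a2.shape)
```

Because a1 has four entries, it lines up with the four columns of w2, and a2 is a single value between 0 and 1.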
Hope this helps.

Related

Converting pandas.core.series.Series to dataframe with multiple column names

My toy example is as follows:
import numpy as np
from sklearn.datasets import load_iris
import pandas as pd
### prepare data
Xy = np.c_[load_iris(return_X_y=True)]
mycol = ['x1','x2','x3','x4','group']
df = pd.DataFrame(data=Xy, columns=mycol)
dat = df.iloc[:100,:].copy() #only consider two species; copy to avoid SettingWithCopyWarning
dat['group'] = dat.group.apply(lambda x: 1 if x ==0 else 2) #two species means two groups
dat.shape
dat.head()
### Linear discriminant analysis procedure
G1 = dat.iloc[:50,:-1]; x1_bar = G1.mean(); S1 = G1.cov(); n1 = G1.shape[0]
G2 = dat.iloc[50:,:-1]; x2_bar = G2.mean(); S2 = G2.cov(); n2 = G2.shape[0]
Sp = (n1-1)/(n1+n2-2)*S1 + (n2-1)/(n1+n2-2)*S2
a = np.linalg.inv(Sp).dot(x1_bar-x2_bar); u_bar = (x1_bar + x2_bar)/2
m = a.T.dot(u_bar); print("Linear discriminant boundary is {} ".format(m))
def my_lda(x):
    y = a.T.dot(x)
    pred = 1 if y >= m else 2
    return y.round(4), pred
xx = dat.iloc[:,:-1]
xxa = xx.agg(my_lda, axis=1)
xxa.shape
type(xxa)
Here xxa is a pandas.core.series.Series with shape (100,). Each element of xxa is a tuple with two values. I want to convert xxa to a pd.DataFrame with 100 rows x 2 columns, so I tried
xxa_df1 = pd.DataFrame(data=xxa, columns=['y','pred'])
which gives ValueError: Shape of passed values is (100, 1), indices imply (100, 2).
Then I continue to try
xxa2 = xxa.to_frame()
# xxa2 = pd.DataFrame(xxa) #equals `xxa.to_frame()`
xxa_df2 = pd.DataFrame(data=xxa2, columns=['y','pred'])
and xxa_df2 presents all NaN with 100 rows x 2 columns. What should I do next?
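Each element of xxa is a (y, pred) tuple, so convert the Series to a list of tuples with Series.tolist() and build the DataFrame from that: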
xxa_df1 = pd.DataFrame(data=xxa.tolist(), columns=['y','pred'])
print(xxa_df1)
          y  pred
0   42.0080     1
1   32.3859     1
2   37.5566     1
3   31.0958     1
4   43.5050     1
..      ...   ...
95 -56.9613     2
96 -61.8481     2
97 -62.4983     2
98 -38.6006     2
99 -61.4737     2
[100 rows x 2 columns]
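To see why this works on a smaller scale, here is a minimal sketch with a made-up three-element Series of tuples (the values are illustrative, not taken from the iris data). Passing the Series directly gives pandas one object column, whereas a list of tuples is read as rows:

```python
import pandas as pd

# Hypothetical Series whose elements are (y, pred) tuples.
s = pd.Series([(1.5, 1), (-2.0, 2), (0.25, 1)])

# A list of tuples is interpreted row by row, one column per tuple position.
# Passing index=s.index keeps the original row labels.
df = pd.DataFrame(s.tolist(), columns=['y', 'pred'], index=s.index)
print(df)
```

The earlier attempt with to_frame() returned NaN because the resulting frame has a single column named 0, and asking for columns 'y' and 'pred' selects two columns that do not exist.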

np.linalg.norm AxisError: axis 1 is out of bounds for array of dimension 1

I want to use np.linalg.norm to calculate the norm of a row vector, and then use this norm to normalize the row vector, as I wrote in the code. I give an initial value to the vector x, but after I run this code I always get:
AxisError: axis 1 is out of bounds for array of dimension 1
So I'm confused and don't know what's the problem. Here is my code:
import numpy as np
def normalizeRows(x):
    x_norm = np.linalg.norm(x, axis=1, keepdims=True)
    x_normalized = x / x_norm
    return x_normalized
x = np.array([1, 2, 3])
print(normalizeRows(x))
And here is the error:
---------------------------------------------------------------------------
AxisError Traceback (most recent call last)
<ipython-input-16-155a4cdd9bf8> in <module>
10
11 x = np.array([1, 2, 3])
---> 12 print(normalizeRows(x))
13
14
<ipython-input-16-155a4cdd9bf8> in normalizeRows(x)
4
5 def normalizeRows(x):
----> 6 x_norm = np.linalg.norm(x, axis=1, keepdims=True)
7 x_normalized = x / x_norm
8
<__array_function__ internals> in norm(*args, **kwargs)
d:\programs\python39\lib\site-packages\numpy\linalg\linalg.py in norm(x, ord, axis, keepdims)
2558 # special case for speedup
2559 s = (x.conj() * x).real
-> 2560 return sqrt(add.reduce(s, axis=axis, keepdims=keepdims))
2561 # None of the str-type keywords for ord ('fro', 'nuc')
2562 # are valid for vectors
AxisError: axis 1 is out of bounds for array of dimension 1
Could somebody tell me why this is wrong and how to fix it? Thanks!
This raises an AxisError because your array is 1-D, so axis 0 is its only axis. Axis 1 exists only for arrays with two or more dimensions:
"In a multi-dimensional NumPy array, axis 1 is the second axis. When we're talking about 2-d and multi-dimensional arrays, axis 1 is the axis that runs horizontally across the columns" (quoted from the linked source)
For a 1-D array, all you have to do is change the axis to 0:
import numpy as np
def normalizeRows(x):
    x_norm = np.linalg.norm(x, axis=0, keepdims=True)
    x_normalized = x / x_norm
    return x_normalized
x = np.array([1, 2, 3])
print(normalizeRows(x))
[0.26726124 0.53452248 0.80178373]
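Note that axis=0 on a 2-D array would normalize columns, not rows. If the function should handle both a single vector and a matrix of row vectors, one option is axis=-1, which always refers to the last axis. This is a sketch under that assumption; the name normalize_rows is illustrative, and zero-length vectors are not handled:

```python
import numpy as np

def normalize_rows(x):
    """Scale each row (or a single 1-D vector) to unit Euclidean length."""
    x = np.asarray(x, dtype=float)
    # axis=-1 is the only axis of a 1-D array and the row axis of a 2-D array.
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / norms

vec = normalize_rows(np.array([1, 2, 3]))
mat = normalize_rows(np.array([[3, 4], [6, 8]]))
print(vec)
print(mat)
```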

Python Numpy: ValueError: shapes (200,2) and (1,2) not aligned: 2 (dim 1) != 1 (dim 0)

I have two NumPy arrays, a1 and W2, and I want to compute their dot product:
z2 = a1.dot(W2)
The shape of a1 is (200,2) and the shape of W2 is (1, 2). Why do I get the error ValueError: shapes (200,2) and (1,2) not aligned: 2 (dim 1) != 1 (dim 0)?
The shapes do not satisfy the condition for matrix multiplication in a1.dot(W2).
The number of rows of W2 must equal the number of columns of a1. Here a1 has 2 columns but W2 has only 1 row.
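One way to make the shapes line up is to transpose W2, assuming W2 is meant to hold one row of weights for a single output unit. The arrays below are placeholders with the same shapes as in the question:

```python
import numpy as np

a1 = np.ones((200, 2))            # placeholder data with the question's shape
W2 = np.array([[0.5, -0.25]])     # placeholder weights, shape (1, 2)

# W2.T has shape (2, 1), so (200, 2) dot (2, 1) -> (200, 1).
z2 = a1.dot(W2.T)
print(z2.shape)
```

Whether the transpose is the right fix depends on how W2 is supposed to be laid out; the alternative is to create W2 with shape (2, 1) in the first place.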

Subtract all pairs of values from two arrays

I have two vectors, v1 and v2. I'd like to subtract each value of v2 from each value of v1 and store the results in another vector. I also would like to work with very large vectors (e.g. 1e6 size), so I think I should be using numpy for performance.
Up until now I have:
import numpy
v1 = numpy.random.uniform(-1, 1, size=100)  # size must be an int, not the float 1e2
v2 = numpy.random.uniform(-1, 1, size=100)
vdiff = []
for value in v1:
    vdiff.extend([value - v2])
This creates a list with 100 entries, each entry being an array of size 100. I don't know if this is the most efficient way to do this though.
I'd like to calculate the 1e4 desired values very fast with the smallest object size (memory wise) possible.
You're not going to have much fun with the giant arrays that you mentioned: two vectors of size 1e6 produce 1e12 pairwise differences, which is about 8 TB as float64. But if you have more reasonably sized arrays (small enough that the result fits in memory), the best way to do this is with broadcasting.
import numpy as np
a = np.array(range(5, 10))
b = np.array(range(2, 6))
res = a[None, :] - b[:, None]
print(res)
# [[3 4 5 6 7]
# [2 3 4 5 6]
# [1 2 3 4 5]
# [0 1 2 3 4]]
np.subtract.outer
You can use np.ufunc.outer with np.subtract and then transpose:
a = np.array(range(5, 10))
b = np.array(range(2, 6))
res1 = np.subtract.outer(a, b).T
res2 = a[None, :] - b[:, None]
assert np.array_equal(res1, res2)
Performance is comparable between the two methods.
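When the full result is too large to hold at once, one possible workaround is to compute it in row blocks and process each block before moving on. This is only a sketch: the helper name pairwise_diff_blocks and the block size are made up for illustration, and here the rows follow v1 (no transpose), so entry [i, j] is v1[i] - v2[j]:

```python
import numpy as np

def pairwise_diff_blocks(v1, v2, block_size=1000):
    """Yield consecutive row blocks of the matrix v1[:, None] - v2[None, :]."""
    for start in range(0, len(v1), block_size):
        # Each block has shape (<= block_size, len(v2)).
        yield v1[start:start + block_size, None] - v2[None, :]

rng = np.random.default_rng(0)
v1 = rng.uniform(-1, 1, size=100)
v2 = rng.uniform(-1, 1, size=100)

# For small inputs the blocks can be stacked and compared with the full result.
full = np.subtract.outer(v1, v2)
stacked = np.vstack(list(pairwise_diff_blocks(v1, v2, block_size=32)))
print(full.shape, np.array_equal(full, stacked))
```

With large inputs you would reduce or write out each block inside the loop instead of stacking them, since stacking rebuilds the full matrix.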

Calculate the values of dependent variable using multivariate linear regression with numpy

I am trying to implement multivariate linear regression using numpy. There are several questions in this forum about it, but none seems to answer my question. I have the following independent variables (X1, X2, X3, X4) and dependent variable Y. I want to predict the values of Y'.
X1  X2  X3  X4  Y  Y'
1   0   1   0   1  ?   // ? -> referring to this value as y'1
0   0   1   1   0  ?   // ? -> referring to this value as y'2
0   1   0   1   0  ?   // ? -> referring to this value as y'3
0   0   0   1   1  ?   // ? -> referring to this value as y'4
1   0   1   1   0  ?   // ? -> referring to this value as y'5
So, I am using numpy as:
>>> X1 = np.array([1,0,0,0,1])
>>> X2 = np.array([0,0,1,0,0])
>>> X3 = np.array([1,1,0,0,1])
>>> X4 = np.array([0,1,1,1,1])
>>> Y = np.array([1,0,0,1,0])
>>> x = np.array([X1,X2,X3,X4], np.int32)
>>> n = np.max(x.shape)
>>> X = np.vstack([np.ones(n), x]).T
>>> print(np.linalg.lstsq(X, Y, rcond=None)[0])
[ 2.00000000e+00 -2.22044605e-16 -1.00000000e+00 -1.00000000e+00 -1.00000000e+00]
So, I have the equation y = a + b1.x1 + b2.x2 + b3.x3 + b4.x4. From the output above, I have the values of a, b1, b2, b3, b4.
So, how do I calculate the values of Y', which are y'1, y'2, y'3, y'4, y'5, from the above coefficient values?
The point of OLS is to fit parameters based on data you have and use that to predict a new Y. Try ...
>>> import numpy as np
>>> X = np.array([[1,0,1,0], [0,0,1,1], [0,1,0,1], [0,0,0,1], [1,0,1,1]])
>>> Y = np.array([1,0,0,1,0]).reshape((5,1))
>>> b = np.linalg.inv((X.T).dot(X)).dot(X.T).dot(Y)
>>> b
out [1]: array([[0.666], [-0.333], [-0.333], [0.333]])
Then use this to predict a new Y given 4 new X's. Also, if your Y's are binary (all zeros and ones), you should look at using Logistic Regression.
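Prediction is then just a matrix product of the inputs with the fitted coefficients. The sketch below uses np.linalg.lstsq instead of the explicit inverse (it solves the same least-squares problem, without an intercept, as in the answer above); X_new is a hypothetical new observation added for illustration:

```python
import numpy as np

X = np.array([[1, 0, 1, 0],
              [0, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 0, 0, 1],
              [1, 0, 1, 1]], dtype=float)
Y = np.array([1, 0, 0, 1, 0], dtype=float)

# Fit the coefficients (no intercept column, matching the answer above).
b, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Fitted values y'1..y'5 for the training rows.
Y_pred = X @ b

# Prediction for a new row of four X values (hypothetical example).
X_new = np.array([[1, 1, 0, 0]], dtype=float)
y_new = X_new @ b

print(b)
print(Y_pred)
print(y_new)
```

To include the intercept a from the question, prepend a column of ones to both X and X_new before fitting and predicting.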
