How to concatenate two vectors of different dimensions - Python

If X = [[ 1. 1. 1. 1. 1. 1.]] and Y = [[ 0. 0. 0. 0.]], how can I concatenate the two vectors along the column axis to form a single vector?
I did the following but it didn't work:
import tensorflow as tf
X = tf.constant(1.0, shape=[1, 6])
Y = tf.zeros(shape=[1,4])
XY = tf.concat((X, Y), axis=0)  # fails: tensors must match in every dimension except axis 0
sess = tf.Session()
print(sess.run(XY))

If you want to concatenate them along axis 0, their sizes along every other axis must be equal, which is not the case here.
Assuming that you don't want that,
you need to set axis = 1 in the tf.concat method.
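A minimal corrected sketch, keeping the question's TF1 session API:
import tensorflow as tf

X = tf.constant(1.0, shape=[1, 6])
Y = tf.zeros(shape=[1, 4])
XY = tf.concat((X, Y), axis=1)  # (1, 6) and (1, 4) concatenate to (1, 10)

sess = tf.Session()
print(sess.run(XY))
# [[1. 1. 1. 1. 1. 1. 0. 0. 0. 0.]]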

Related

Why np.corrcoef is not working as expected with two vectors of dimension two?

I am trying to calculate the Pearson correlation coefficient between two vectors of dimension two using np.corrcoef. When the dimension of the vectors is different from two, it works fine; see for example:
import numpy as np
x = np.random.uniform(-10, 10, 3)
y = np.random.uniform(-10, 10, 3)
print(x, y)
print(np.corrcoef(x,y))
Output:
[-6.59840638 -1.81100446 5.6158669 ] [ 6.7200348 -7.0373677 -2.11395157]
[[ 1. -0.53299763]
[-0.53299763 1. ]]
However, when the dimension is exactly two, the correlation comes out wrong, taking only the values 1 or -1:
import numpy as np
x = np.random.uniform(-10, 10, 2)
y = np.random.uniform(-10, 10, 2)
print(x, y)
print(np.corrcoef(x,y))
Output 1:
[-2.61268708 8.32602293] [6.42020314 3.43806504]
[[ 1. -1.]
[-1. 1.]]
Output 2:
[ 5.04249697 -3.6599369 ] [6.12936665 3.15827974]
[[1. 1.]
[1. 1.]]
Output 3:
[7.33503682 7.7145613 ] [-9.54304108 7.43840944]
[[1. 1.]
[1. 1.]]
Question: What's happening and how to solve it?
There are a couple of misunderstandings leading to your confusion:
I'll use row-major order, as numpy does: "Each row of x represents a variable, and each column a single observation of all those variables."
The Pearson correlation coefficient describes the linear relationship between two variables. If you only have two observations for each variable, you can always fit a line exactly through the two points; after normalization, you'll therefore always get 1 or -1.
A covariance or correlation matrix is usually calculated amongst the components of a random vector X = (X1, ..., Xn).T. When you say you want the correlation between two vectors, it is unclear whether you actually want the cross-correlation between X and Y, in which case you need np.correlate.
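A quick sketch illustrating both points (the sample values are taken from Output 1 above):
import numpy as np

x = np.array([-2.61268708, 8.32602293])
y = np.array([6.42020314, 3.43806504])

# With two observations, the centered vectors are always of the form [-d, +d],
# so the normalized dot product can only come out as +1 or -1:
xc, yc = x - x.mean(), y - y.mean()
r = xc @ yc / (np.linalg.norm(xc) * np.linalg.norm(yc))
print(r)  # -1.0

# If you want the cross-correlation of the two sequences instead:
print(np.correlate(x, y, mode='full'))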

Multiply same numpy array with scalars multiple times

I have a 3D NumPy array of size (9,9,200) and a 2D array of size (200,200).
For each output channel, I want to take the 200 input channels of shape (9,9,1), multiply each one by one scalar from a single row of the 2D array to get an array (9,9,200), and average it such that the resultant array is (9,9,1).
Basically, if there are n channels in an input array, I want each channel multiplied n times and averaged - and this should happen for all channels. Is there an efficient way to do so?
So far what I have is this -
import numpy as np
arr = np.random.rand(9,9,200)
nchannel = arr.shape[-1]
transform = np.array([np.random.uniform(low=0.0, high=1.0, size=(nchannel,)) for i in range(nchannel)])
for channel in range(nchannel):
    # The below line needs optimization
    temp = [arr[:,:,i] * transform[channel][i] for i in range(nchannel)]
    arr[:,:,channel] = np.sum(temp, axis=0)/nchannel
Edit: A sample image (omitted here) demonstrated what I am looking for, with nchannel = 3.
The input image is arr; the final image is the transformed arr.
EDIT:
import numpy as np
n_channels = 3
scalar_size = 2
t = np.ones((n_channels,scalar_size,scalar_size)) # scalar array
m = np.random.random((n_channels,n_channels)) # letters array
print(m)
print(t)
m_av = np.mean(m, axis=1)
print(m_av)
for i in range(n_channels):
    t[i] = t[i] * m_av[i]  # scale each scalar block by its row average
print(t)
output:
[[0.04601533 0.05851365 0.03893352]
[0.7954655 0.08505869 0.83033369]
[0.59557455 0.09632997 0.63723506]]
[[[1. 1.]
[1. 1.]]
[[1. 1.]
[1. 1.]]
[[1. 1.]
[1. 1.]]]
[0.04782083 0.57028596 0.44304653]
[[[0.04782083 0.04782083]
[0.04782083 0.04782083]]
[[0.57028596 0.57028596]
[0.57028596 0.57028596]]
[[0.44304653 0.44304653]
[0.44304653 0.44304653]]]
What you're asking for is a simple matrix multiplication along the last axis:
import numpy as np
arr = np.random.rand(9,9,200)
transform = np.random.uniform(size=(200, 200)) / 200  # dividing by 200 bakes the averaging into the transform
arr = arr @ transform  # (9, 9, 200) @ (200, 200) -> (9, 9, 200)
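Equivalently, np.einsum makes the contraction explicit. A sketch, assuming the question's orientation where transform[channel] holds the scalars for one output channel (i.e. the transpose of the product above), with the averaging written out explicitly:
import numpy as np

arr = np.random.rand(9, 9, 200)
transform = np.random.uniform(size=(200, 200))
nchannel = arr.shape[-1]

# out[h, w, c] = mean over i of arr[h, w, i] * transform[c, i]
out = np.einsum('hwi,ci->hwc', arr, transform) / nchannel
assert np.allclose(out, arr @ transform.T / nchannel)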

Convert probability vector into target vector in python?

I am doing logistic regression on the iris dataset from sklearn; I know the math and am trying to implement it. At the final step, I get a prediction vector, which represents the probability of each data point belonging to class 1 or class 2 (binary classification).
Now I want to turn this prediction vector into a target vector: if the probability is greater than 50%, the corresponding data point belongs to class 1, otherwise class 2. Use 0 to represent class 1 and 1 for class 2.
I know there is a for-loop version of this that loops through the whole vector. But when the size gets large, a for loop is very expensive, so I want to do it more efficiently with vectorized numpy operations.
Any suggestion on the faster method?
import numpy as np
a = np.matrix('0.1 0.82')
print(a)
a[a > 0.5] = 1
a[a <= 0.5] = 0
print(a)
Output:
[[ 0.1 0.82]]
[[ 0. 1.]]
Update:
import numpy as np
a = np.matrix('0.1 0.82')
print(a)
a = np.where(a > 0.5, 1, 0)
print(a)
A more general solution for a 2D array that holds many vectors with many classes:
import numpy as np
a = np.array([[.5, .3, .2],
              [.1, .2, .7],
              [ 1,  0,  0]])
idx = np.argmax(a, axis=-1)
a = np.zeros(a.shape)
a[np.arange(a.shape[0]), idx] = 1
print(a)
Output:
[[1. 0. 0.]
[0. 0. 1.]
[1. 0. 0.]]
Option 1: If you are doing binary classification and have a 1d prediction vector, then your solution is numpy.round:
prob = model.predict(X_test)
Y = np.round(prob)
Option 2: If you have an n-dimensional one-hot prediction matrix but want labels, then you can use numpy.argmax. This returns a 1d vector of labels:
prob = model.predict(X_test)
y = np.argmax(prob, axis=1)
In case you want to proceed with a confusion matrix etc. afterwards and get back the original format of a target variable in scikit, array([1 0 ... 1]), you can use:
a = clf.predict_proba(X_test)[:,1]
a = np.where(a>0.5, 1, 0)
The [:,1] refers to the second class (in my case: 1); the first class in my case was 0.
For multi-class, or as a more generalized solution, use
np.argmax(y_hat, axis=1)

How to change chunks of data in a numpy array

I have a large numpy 1-dimensional array of data in Python and want entries x (500) to y (520) to be changed to equal 1. I could use a for loop, but is there a neater, faster numpy way of doing this?
for x in range(500, 520):
    numpyArray[x] = 1.
That is the for loop that could be used, but it seems like there could be a function in numpy that I'm missing - I'd rather not use the masked arrays that numpy offers.
You can use slicing with [] to assign to a range of elements:
import numpy as np
a = np.ones(10)
print(a) # Original array
# [ 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
startindex = 2
endindex = 4
a[startindex:endindex] = 0
print(a) # modified array
# [ 1. 1. 0. 0. 1. 1. 1. 1. 1. 1.]
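Applied to the question's case (a sketch; the array of zeros stands in for the question's large 1-D data array):
import numpy as np
numpyArray = np.zeros(1000)
numpyArray[500:520] = 1.0  # entries 500..519; the stop index is exclusive
print(numpyArray[498:522])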

Simple implementation of NumPy cov (covariance) function

I was trying to implement the numpy.cov() function as given here: numpy cov (covariance) function, what exactly does it compute?, but I am getting some bizarre results. Please correct me:
import numpy as np
def my_covar(X):
    X -= X.mean(axis=0)
    N = X.shape[1]
    return np.dot(X, X.T.conj())/float(N-1)
X = np.asarray([[1.0,1.0],[2.0,2.0],[3.0,3.0]])
## Run NumPy's implementation
print(np.cov(X))
"""
NumPy's output:
[[ 0. 0. 0.]
[ 0. 0. 0.]
[ 0. 0. 0.]]
"""
## Run my implementation
print(my_covar(X))
"""
My output:
[[ 2. 0. -2.]
[ 0. 0. 0.]
[ -2. 0. 2.]]
"""
What is going wrong?
Both your function and np.cov (by default) assume that the rows of X correspond to variables, and the columns correspond to observations.
When you center X by subtracting the mean, you need to compute the mean over observations, i.e. the columns of X rather than the rows:
X -= X.mean(axis=1)[:, None]
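Putting it together, a corrected sketch that matches np.cov (copying X first so the caller's array is not mutated in place):
import numpy as np

def my_covar(X):
    X = X - X.mean(axis=1)[:, None]  # center each variable (row) over its observations
    N = X.shape[1]                   # number of observations (columns)
    return np.dot(X, X.T.conj()) / float(N - 1)

X = np.random.rand(3, 5)  # 3 variables, 5 observations
print(np.allclose(my_covar(X), np.cov(X)))  # True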
