I'm working on a linear regression problem with Pytorch (y=A*x, where the dimensions of A are 2x2). I wrote the following code. I don't know why the loss doesn't change... Can someone help me ?
Thanks,
Thomas
import torch
import numpy as np
from scipy.integrate import odeint
from matplotlib import pyplot as plt
from torch.autograd import Variable
def EDP(X,t):
X_0=-2*X[0]
X_1=-2*X[1]
grad=np.array([X_0,X_1])
return grad
T=np.arange(0,10,0.1)
X_train=odeint(EDP,[10,20],T)
Y_train=np.zeros_like(X_train)
for i in range(Y_train.shape[0]):
Y_train[i,:]=np.dot(np.array([[2,0],[0,2]]),X_train[i,:])
print(X_train,Y_train)
X_train=torch.Tensor(X_train)
torch.transpose(X_train,0,1)
Y_train=torch.Tensor(Y_train)
print(X_train.shape)
import torch.nn as nn
class LinearRegression(torch.nn.Module):
def __init__(self):
super(LinearRegression, self).__init__()
self.linear = torch.nn.Linear(2,2,bias = False) # bias is default True
def forward(self, x):
y_pred = self.linear(x)
return y_pred
criterion = torch.nn.MSELoss()
optimizer = torch.optim.SGD(our_model.parameters(), lr = 0.0001)
our_model = LinearRegression()
x_train = X_train
y_train = Y_train
#x_train.requires_grad=True
print(x_train.shape)
print(y_train.shape)
ntrain=10
for t in range(ntrain):
y_pred=our_model(x_train)
loss=criterion(y_train,y_pred)
loss.backward()
optimizer.step()
optimizer.zero_grad()
print(t,loss)
print(our_model.linear.weight)
In my laptop it worked ...
since you are running it on just 10 epochs ...and using lr = 0.0001 ,you wont see it in just 10 epochs.
i did this optimizer = torch.optim.SGD(our_model.parameters(), lr = 0.01) (increased lr )which actually decreased the loss in just 10 epochs
Related
I couldn't find anything helpful about plotting the process of converging test and training data of Sklearn.neural_network.MLPrgressor. I found that there is loss_curve_ attribute, but what about validation data?
I have built a simple model in which both inputs and outputs are randomly selected (say x = numpy.linspace(0, numpy.pi, 100), y = numpy.sin(x). I wrote this one to obtain variation of sklearn.metrics.mean_squared_error for a different number of hidden layers.
How can I overcome this problem?
from sklearn.preprocessing import RobustScaler
inputs /= 10
ERE /= 10
scaler = RobustScaler()
inputs = scaler.fit_transform(inputs)
X_train, X_test, y_train, y_test = train_test_split(inputs, ERE,
train_size=0.8,
random_state=123)
from sklearn.neural_network import MLPRegressor
hidden_layer_size = (10, )
activation = "tanh"
solver = "adam"
alpha = 1e-4
batch_size = 6
learning_rate = "adaptive"
learning_rate_init = 1e-4
power_t = "sgd"
max_iter = 1000
shuffle = True
random_state = 123
verbose = True
early_stopping = True
validation_fraction = 0.15
n_iter_no_change = 35
from sklearn.metrics import mean_squared_error as mse
import numpy as np
error_scores = np.zeros(shape = (11,))
for _iterator, hidden_layer_size in enumerate(range(1, 110, 10)):
mlr = MLPRegressor(hidden_layer_sizes=hidden_layer_size,
activation=activation,
solver=solver,
batch_size=batch_size,
learning_rate=learning_rate,
learning_rate_init=learning_rate_init,
shuffle=shuffle,
random_state=random_state,
early_stopping=early_stopping,
validation_fraction=validation_fraction,
n_iter_no_change=n_iter_no_change,
alpha=alpha)
mlr.fit(X_train, y_train)
error_scores[_iterator] = mse(y_test, mlr.predict(X_test))
Class MLPrgressor (well, BaseMultilayerPerceptron really) has an undocumented validation_scores_ attribute which keeps track of scores on validation data. However, it is only populated if you pass True as parameter early_stopping when initialising the solver object.
I'm trying to implement a Bayesian neural network for genomic predictions. My X is a matrix that is scaled and gets normalized so that the values are between 0 and 1. The y is a vector of values that are again normalized so that the values are between 0 and 1.
The network seems to learn as seen here:
But, when I try to make predictions these look strange and seem to behave randomly. While the true values of y are distributed between 0 and 1. The predicted values are between ~ 0.4 - 0.6 and my R2 is negative. The MSE is around 0.02, what seems not to bad, but might be caused by the fact that the range of the predictions is quite narrow.
I'm a bit running out of ideas what could be wrong. Any suggestions are appreciated :).
I've also tried to predict the training data. That is also not working and I'm getting a negative R2.
X has the dimensions (5000,500) and y (5000,)
Increasing the number of hidden layers (up to 3) and units (upt to 128) doesn't change anything.
# Import necessary packages:
import sys
from os.path import join
import warnings
warnings.filterwarnings('ignore')
from IPython import display
import tensorflow as tf
from tensorflow import keras
import tensorflow_probability as tfp
import kerastuner as kt
from keras import backend as K
from keras import activations, initializers
from keras.layers import Layer
import tensorflow_docs as tfdocs
import tensorflow_docs.modeling
import tensorflow_docs.plots
import numpy as np
import numpy.ma as ma
import pandas as pd
import seaborn as sns
import matplotlib.pylab as plt
import time
import tempfile
import math
import statsmodels.api as sm
from sklearn.model_selection import train_test_split, KFold
from sklearn.metrics import r2_score,mean_absolute_error
from sklearn.linear_model import LinearRegression, BayesianRidge
from sklearn.utils import shuffle
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler, scale
from pandas_plink import read_plink
from pandas_plink import read_plink1_bin
from pandas_plink import get_data_folder
tfd = tfp.distributions
# Set random seed and start timer
np.random.seed(12345)
start = time.time()
### functions
def get_optimizer():
return tf.keras.optimizers.SGD()
def get_callbacks():
return [
#tfdocs.modeling.EpochDots(),
tf.keras.callbacks.EarlyStopping(monitor='val_loss', min_delta=1, patience=500, restore_best_weights=True),
]
def normalize_data(df):
return (df - df.min())/(df.max() - df.min())
def compile_model(model, optimizer=None):
if optimizer is None:
optimizer = get_optimizer()
model.compile(optimizer=optimizer,
loss=keras.losses.MeanSquaredError())
return model
def MSE(test,pred):
sqr_err = np.subtract(test,pred)**2
return sqr_err.mean()
# ## load & preprocess data
# ### load genotype data
G = np.genfromtxt("genotype.txt")
G[np.isnan(G)] = 0.
G = normalize_data(G)
print(G.mean())
print(G.var())
# ### load phenotype data
traits = np.genfromtxt("phenotype.txt")
traits = normalize_data(traits)
print(traits.mean())
print(traits.var())
# ### split training and validation set
train_X, test_X, train_y, test_y = train_test_split(G, traits, test_size = 0.2, random_state = 42)
X = np.concatenate((train_X, test_X), axis=0)
y = np.concatenate((train_y, test_y), axis=0)
# ### parameter definition
N = G.shape[0]
p = G.shape[1]
NUM_FOLDS = 5
kfold = KFold(n_splits=NUM_FOLDS, shuffle=True)
INPUT_SHAPE = X.shape[1]
OUTPUT_SHAPE = y.shape[0]
BATCH_SIZE = 32
STEPS_PER_EPOCH = math.ceil((X.shape[0]*(1-1/NUM_FOLDS)*0.8)/BATCH_SIZE)
MAX_EPOCHS = 5000
df = pd.DataFrame(columns = ['method','MSE','R2'])
histories = {}
# Specify the surrogate posterior over `keras.layers.Dense` `kernel` and `bias`.
def posterior_mean_field(kernel_size, bias_size=0, dtype=None):
n = kernel_size + bias_size
c = np.log(np.expm1(1.))
return tf.keras.Sequential([
tfp.layers.VariableLayer(2 * n, dtype=dtype),
tfp.layers.DistributionLambda(lambda t: tfd.Independent(
tfd.Normal(loc=t[..., :n],
scale=1e-5 + tf.nn.softplus(c + t[..., n:])),
reinterpreted_batch_ndims=1)),
])
# Specify the prior over `keras.layers.Dense` `kernel` and `bias`.
def prior_trainable(kernel_size, bias_size=0, dtype=None):
n = kernel_size + bias_size
return tf.keras.Sequential([
tfp.layers.VariableLayer(n, dtype=dtype),
tfp.layers.DistributionLambda(lambda t: tfd.Independent(
tfd.Normal(loc=t, scale=1),
reinterpreted_batch_ndims=1)),
])
def neg_log_likelihood(y_true, y_pred, sigma=1.0):
dist = tfp.distributions.Normal(loc=y_pred, scale=sigma)
return K.sum(-dist.log_prob(y_true))
#neg_log_likelihood = lambda y, p_y: -p_y.log_prob(y)
kl_loss_weight = 1.0 / STEPS_PER_EPOCH
histories = {}
fold_no = 1
for train, test in kfold.split(X, y):
model = tf.keras.Sequential([
keras.layers.InputLayer(input_shape=(INPUT_SHAPE,)),
tfp.layers.DenseVariational(units=32,
make_posterior_fn=posterior_mean_field,
make_prior_fn=prior_trainable,
kl_weight=kl_loss_weight,
activation='sigmoid'),
tfp.layers.DenseVariational(units=1,
make_posterior_fn=posterior_mean_field,
make_prior_fn=prior_trainable,
kl_weight=kl_loss_weight,
activation='sigmoid'),
tfp.layers.DistributionLambda(lambda t: tfd.Normal(loc=t, scale=1)),
])
model.compile(loss=neg_log_likelihood, optimizer=tf.keras.optimizers.Adam(lr=0.0001), metrics=['mse'])
history = model.fit(X[train], y[train],
validation_split = 0.2,
steps_per_epoch = STEPS_PER_EPOCH,
epochs=MAX_EPOCHS,
callbacks=get_callbacks(),
verbose=0)
histories['BNN2_'+str(fold_no)] = history
y_pred_list = []
for i in range(500):
y_pred = model.predict(X[test])
y_pred_list.append(y_pred)
y_preds = np.concatenate(y_pred_list, axis=1)
y_mean = np.mean(y_preds, axis=1)
m_err = MSE(y[test],y_mean)
r2_acc = r2_score(y[test],y_mean)
df = df.append({'MSE':m_err, 'R2':r2_acc, 'method':'BNN2'}, ignore_index=True)
fold_no = fold_no + 1
df.to_csv("results.csv")
I am making a neural network model using pytorch.
I built a simple and shallow 3 layer model by referring to the tutorial.
However, training is random despite using the same model and script.
In other words, it can be seen that the loss does not drop about once out of 4, so it is not trained. I don't know why the model is shallow and unstable. I would be grateful if someone with the same experience as me or who has solved the problem can advise.
enter image description here
It's same script running result.
1 out of 4 times don't trained.
but I used same script and model.
The value of the input tensor is the same in both the case of learning and the case of not learning.
my script is under here. and x input shape is [10000, 1]
import os
import pandas as pd
from sklearn.preprocessing import StandardScaler
import numpy as np
import torch.nn as nn
import torch.optim as optim
from sklearn.preprocessing import StandardScaler
import sys
import torch
from sklearn.preprocessing import StandardScaler
import re
os.chdir("...")
F1 = os.listdir(os.getcwd())
print(F1)
df = pd.read_excel('10000.xlsx', sheet_name=1)
Ang_tilt = torch.from_numpy(df['Ang_tilt'].values).unsqueeze(dim=1).float()
x_list = [Ang_tilt]
nb_epochs = 3000
import sys
#from aug_data_processing import *
import torch.nn as nn
import torch.optim as optim
from sklearn.preprocessing import StandardScaler
from matplotlib import pyplot as plt
########################################
####################model
#print(x_list)
net = Net(x_dim=Ang_tilt.size()[1])
criterion = nn.MSELoss()
# optimizer = torch.optim.Adam(model.parameters(), lr=1e-9)
optimizer = torch.optim.SGD(net.parameters(), lr=1e-6, momentum=0.7)
losses = []
################forward
for step in range(nb_epochs + 1):
scaler = StandardScaler()
Ang_tilt = scaler.fit_transform(Ang_tilt)
Ang_tilt = torch.from_numpy(Ang_tilt).float()
#print(x_list[i])
prediction = net(Ang_tilt)
#print(prediction)
loss = criterion(input=prediction, target=y_label)
optimizer.zero_grad()
losses.append(loss.item())
loss.backward()
optimizer.step()
#print(Ang_tilt)
plt.title('3_layer_NN_loss_pre+post')
plt.xlabel('epoch')
plt.ylabel('losses')
plt.plot(range(nb_epochs+1), losses)
plt.show()
torch.save(obj=net, f='aug.pt')
And this is Network
from torch import nn
from torch.nn import functional as F
import torch
import torch
from torch.autograd import Variable
'''
x_dim = dimension을 바로
'''
class Net(nn.Module):
def __init__(self, x_dim):
super(Net, self).__init__()
self.fc1 = nn.Linear(x_dim, 150)
self.fc2 = nn.Linear(150, 100)
self.fc3 = nn.Linear(100, 40)
self.fc4 = nn.Linear(40,1)
self.dropout = nn.Dropout(p=0.5)
torch.nn.init.xavier_uniform_(self.fc1.weight)
torch.nn.init.xavier_uniform_(self.fc2.weight)
torch.nn.init.xavier_uniform_(self.fc3.weight)
torch.nn.init.xavier_uniform_(self.fc4.weight)
def forward(self, x):
x = F.relu(self.fc1(x))
x = F.relu(self.fc2(x))
x = F.relu(self.fc3(x))
x = F.relu(self.fc4(x))
return x
I found it by printing my parameters.
When not training, weight is so low.
so I changed model structrue than solved.
I'm working with some code that classifies the infamous dog vs cat image classification using a ResNet-18 model and I'd like to extend it to be able to classify for greater than two image categories (like dog vs cat vs hamster vs ....). In particular I've got 5 categories. I'm new at transfer learning and I'm not sure what I have to change in my code to make this work.
import torch
import numpy as np
import torch.nn.functional as F
from torch.nn import Linear
from torch.utils.data import DataLoader, random_split
from torch.optim import Adam
from torchvision.transforms import Compose, Resize, ToTensor
from torchvision.datasets import ImageFolder
from torchvision.models import resnet18
from matplotlib import pyplot as plt
import random
transform = Compose([Resize((128,128)), ToTensor()])
ds = ImageFolder("*Image_Folder*", transform=transform)
ds_train, ds_val = random_split(ds, [3250, 1073])
dl_train = DataLoader(ds_train, batch_size= 32, shuffle=True)
dl_val = DataLoader(ds_val, batch_size= len(ds_val), shuffle= True)
model = resnet18(pretrained=True)
model.requires_grad_(False)
model.fc = Linear(model.fc.in_features, 5)
X_val, y_val = next(dl_val.__iter__())
opt = torch.optim.Adam(model.parameters(), lr=0.001)
def accuracy(yy, y):
return torch.mean(1.0*(yy == y))
X_val.shape, y_val.shape
y_val = y_val.reshape(-1, 1).float()
for epoch in range(10):
losses = []
accs = []
losses_val = []
accs_val = []
model.train()
for X, y in dl_train:
y = y.reshape(-1, 1).float()
yy = torch.sigmoid(model(X))
loss = F.binary_cross_entropy(yy, y)
losses.append(loss.item())
loss.backward()
opt.step()
opt.zero_grad()
acc = accuracy(torch.round(yy), y)
accs.append(acc.item())
model.eval()
with torch.no_grad():
yy_val = torch.sigmoid(model(X_val))
loss_val = F.binary_cross_entropy(yy_val, y_val)
losses_val.append(loss_val.item())
acc_val = accuracy(torch.round(yy_val), y_val)
accs_val.append(acc_val.item())
print(f"Epoch {epoch}: t-loss = {np.mean(losses):.4f}, t-acc = {np.mean(accs):.4f}, v-loss = {loss_val:.4f}, v-acc = {acc_val:.4f}")
I believe the code is fine up to the for loop, however it could be something I need to add or alter. Currently the line loss = F.binary_cross_entropy(yy, y) is what's giving me an error ValueError: Using a target size (torch.Size([32, 1])) that is different to the input size (torch.Size([32, 5])) is deprecated. Please ensure they have the same size.
This is the data I'm working from: https://www.kaggle.com/alxmamaev/flowers-recognition
Binary Cross Entropy is a loss function designed for binary classification tasks.
In order to convert this model into one capable of 5-class classification, in addition to changing the final layer's width to 5, you need to change the loss function to a multinomial scorer e.g. CrossEntropyLoss().
I wrote a small
"Linear Regression Neural Network Tensorflow Keras Python program"
Input dataset is
y = mx + c straight line data.
Predicted y values are not correct and are giving horizontal line kind of
values, instead of a line with some slope.
I ran this program on Windows laptop with tensorflow, Keras and
Jupyter notebook.
What to do to fix this program please?
Thanks and best regards,
SSJ
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
n2 = 50
count = 20
n4 = n2 + count
p = 100
m = 10
c = 5
x = np.linspace(n2, n4, p)
y = m * x + c
x
y
plt.scatter(x,y)
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.layers.experimental import preprocessing
x_normalizer = preprocessing.Normalization(input_shape=[1,])
x_normalizer.adapt(x)
x_normalized = x_normalizer(x)
y_normalizer = preprocessing.Normalization(input_shape=[1,])
y_normalizer.adapt(y)
y_normalized = x_normalizer(y)
y_model = tf.keras.Sequential([
y_normalizer,
layers.Dense(1)
])
y_model.compile(optimizer='rmsprop', loss='mse', metrics = ['mae'])
y_hist = y_model.fit(x, y, epochs=100, verbose=0, validation_split = 0.2)
hist = pd.DataFrame(y_hist.history)
hist['epoch'] = y_hist.epoch
hist.head()
hist.tail()
xin = [51,53,59,64]
ypred = y_model.predict(xin)
ypred
plt.scatter(x, y)
plt.scatter(xin, ypred, color = 'r')
plt.grid(linestyle = '--')
Use StandardScaler instead of Normalization
Normalizer acts row-wise and StandardScaler column-wise.
Normalizer does not remove the mean and scale by deviation but scales
the whole row to unit norm.
Found here: Difference between StandardScaler and Normalizer
This is how you can process the data:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from sklearn.preprocessing import StandardScaler
x = np.linspace(50, 70, 100).reshape(-1, 1)
y = 10 * x + 5
x_standard_scaler = StandardScaler().fit(x)
y_standard_scaler = StandardScaler().fit(y)
x_scaled = x_standard_scaler.transform(x)
y_scaled = y_standard_scaler.transform(y)
Remember that you need two separate scalers for x and y so don't use the same object for that. Also if you want to use that scaler to process new data for testing, save the scaler in some variable. A good practice is to not refit the scaler again on test data.
model = Sequential([
Dense(1, input_dim=1, activation='linear'),
])
model.compile(optimizer='rmsprop', loss='mse')
history = model.fit(x_scaled, y_scaled, epochs=1000, verbose=0, validation_split = 0.2).history
pd.DataFrame(history).plot()
plt.show()
As you can see the model is converging. Its worth to plot the loss history which helps to tell if your model is learning or not.
x_test = np.linspace(20, 100, 10).reshape(-1, 1)
y_test = 10 * x_test + 5
x_test_scaled = x_standard_scaler.transform(x_test)
y_test_scaled = y_standard_scaler.transform(y_test)
If you have a test data that you want to use for validation or just predict it, remember to use standard scaler again, but without fitting. It should be fitted on train data only in most cases.
y_test_pred_scaled = model.predict(x_test_scaled)
y_test_pred = y_standard_scaler.inverse_transform(y_test_pred_scaled)
plt.scatter(x_test, y_test, s=30, label='true')
plt.scatter(x_test, y_test_pred, s=15, label='pred')
plt.legend()
plt.show()
If you want to get your prediction rescaled back to its original range use inverse_transform. Notice that prediction on x_test after rescaling is very close to y_test.