I have done a linear regression problem with the Boston dataset and obtained the following result:
the loss value does not change as I increase the number of epochs. What is the reason for this mistake? Please help me.
import pandas as pd
import torch
import numpy as np
import torch.nn as nn
from sklearn import preprocessing
training_set=pd.read_csv('boston_data.csv')
training_set=training_set.to_numpy()
test_set=test_set.to_numpy()
inputs=training_set[:,0:13]
inputs=preprocessing.normalize(inputs)
target=training_set[:,13:14]
target=preprocessing.normalize(target)
inputs=torch.from_numpy(inputs)
target=torch.from_numpy(target)
test_set=torch.from_numpy(test_set)
w=torch.randn(13,1,requires_grad=True)
b=torch.randn(404,1,requires_grad=True)
def model(x):
    return x @ w + b
pred=model(inputs.float())
def loss_MSE(x, y):
    ras = x - y
    return torch.sum(ras * ras) / ras.numel()
for i in range(100):
    pred = model(inputs.float())
    loss = loss_MSE(target, pred)
    loss.backward()
    with torch.no_grad():
        w -= w.grad * 1e-5
        b -= b.grad * 1e-5
        w.grad.zero_()
        b.grad.zero_()
    print(loss)
Welcome to Stack Overflow.
Your main loop is fine (you could have made your life much easier, however; you should probably read this), but your learning rate (1e-5) is most likely way too low.
I tried with a small dummy problem; it was solved very quickly with a learning rate of ~1e-2, and would take tremendously longer with 1e-5. It does converge eventually, but only after far more than 100 epochs. You mentioned you tried increasing the number of epochs, but did not write how many you actually ran the experiment with.
Please try increasing this parameter (learning rate) to see whether it solves your issue. You can also try removing the division by numel(), which will have the same effect (the division is also applied to the gradients).
Next time, please provide a small minimal example that can be run and helps reproduce your error. Here most of your code is data loading, which could be replaced with two lines of dummy data generation.
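For illustration, here is a minimal runnable sketch of the same kind of loop on made-up dummy data (not your Boston file) with a learning rate of 1e-2; the shapes are only placeholders:
import torch
# dummy regression data standing in for the Boston features/target
inputs = torch.randn(404, 13)
target = inputs @ torch.randn(13, 1) + 0.1 * torch.randn(404, 1)
w = torch.randn(13, 1, requires_grad=True)
b = torch.randn(1, requires_grad=True)
def model(x):
    return x @ w + b
def loss_MSE(x, y):
    ras = x - y
    return torch.sum(ras * ras) / ras.numel()
for i in range(100):
    loss = loss_MSE(target, model(inputs))
    loss.backward()
    with torch.no_grad():
        w -= w.grad * 1e-2  # larger learning rate than 1e-5
        b -= b.grad * 1e-2
        w.grad.zero_()
        b.grad.zero_()
print(loss)
With this learning rate the loss drops within a few dozen iterations; with 1e-5 it barely moves over 100 iterations.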
I'm trying to fit a quantile regression model to my input data. I would like to use sklearn, but I am getting a memory allocation error when I try to fit the model. The same data with the statsmodels equivalent function is working fine.
The error I get is the following:
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 55.9 GiB for an array with shape (86636, 86636) and data type float64
It doesn't make any sense; my X and y have shapes (86636, 4) and (86636, 1) respectively.
Here's my script:
import pandas as pd
import statsmodels.api as sm
from sklearn.linear_model import QuantileRegressor
training_df = pd.read_csv("/path/to/training_df.csv") # 86,000 rows
FEATURES = [
    "feature_1",
    "feature_2",
    "feature_3",
    "feature_4",
]
TARGET = "target"
# STATSMODELS WORKS FINE WITH 86,000, RUNS IN 2-3 SECONDS.
model_statsmodels = sm.QuantReg(training_df[TARGET], training_df[FEATURES]).fit(q=0.5)
# SKLEARN GIVES A MEMORY ALLOCATION ERROR, OR TAKES MINUTES TO RUN IF I SIGNIFICANTLY TRIM THE DATA TO < 1000 ROWS.
model_sklearn = QuantileRegressor(quantile=0.5, alpha=0)
model_sklearn.fit(training_df[FEATURES], training_df[TARGET])
I've checked the sklearn documentation and am pretty sure my inputs are fine as DataFrames; I get the same issues with NumPy arrays. So I am not sure what the issue is. Is it possible there's an issue with something under the hood?
[Here][1] is the scikit-learn documentation for QuantileRegressor.
Many thanks for any help / ideas.
[1]: https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.QuantileRegressor.html
The sklearn QuantileRegressor class uses linear programming to solve the quantile regression problem, which is much more computationally expensive than the iteratively reweighted least squares used by the statsmodels QuantReg class.
Here is a github issue for the same problem: https://github.com/scikit-learn/scikit-learn/issues/22922
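One workaround discussed in that issue is to switch to the HiGHS solver. This is only a sketch under assumptions not in your post (that your scikit-learn version exposes the solver parameter and that SciPy >= 1.6 is installed):
from sklearn.linear_model import QuantileRegressor
# the HiGHS solvers work with sparse problem formulations, which should avoid
# the huge dense allocation triggered by the older interior-point default
model_sklearn = QuantileRegressor(quantile=0.5, alpha=0, solver="highs")
model_sklearn.fit(training_df[FEATURES], training_df[TARGET])
Even with HiGHS, expect the fit to be noticeably slower than statsmodels' QuantReg on ~86,000 rows.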
I am a college student who has just started learning Python, doesn't have a lot of coding experience, and has been experimenting with TensorFlow out of curiosity. I know that I should become more fluent in Python before attempting such an ambitious project, but I really want to learn about this experimentally.
So my goal is to take a pre-formatted CSV that has the RSI, RS, MACD, and signal of a stock, and also whether the price of that stock increased the next day (in relation to the previous day). Whether or not it increased is represented by a 1 or a 0 (1 being an increase, 0 being no change or a decrease), so everything is an integer. What I am trying to find is what combination of these indicators leads to the increase. The increase or lack of one is indicated by the class.
So far I have trained the model and tested my test set and gotten it to be 89% accurate, but what I am trying to do is print the combination of values that the model found leads to the increase. So how might I print the Spy_features that result in Spy_labels (Class) = 1 from the already trained model?
If any more information is needed, I will gladly provide it; I just feel like I've hit a wall with this aspect of my first project. Most of all I really would like to learn more about Python and machine learning, so more explanation of how to go about something like this would be greatly appreciated.
import tensorflow as tf
import numpy as np
import pandas as pd
from tensorflow.keras import layers
from tensorflow.keras.layers.experimental import preprocessing
import matplotlib.pyplot as plt
from IPython.display import clear_output
from six.moves import urllib
#Data files
SPY_Training = pd.read_csv(r'C:\Users\matth\Downloads\Spy_training(noheader).csv',names=["RSI","RS","MACD","signal","Class"])
SPY_Test = pd.read_csv(r'C:\Users\matth\Downloads\Spy_test(noheader).csv',names=["RSI","RS","MACD","signal","Class"])
SPY_Test.shape[0], SPY_Training.shape[0]
SPY_Training.head()
SpyTest_features = SPY_Test.copy()
SpyTest_labels = SpyTest_features.pop('Class')
Spy_features = SPY_Training.copy()
Spy_labels = Spy_features.pop('Class')
Spy_features = np.array(Spy_features)
Spy_features
print(Spy_features)
Spy_model = tf.keras.Sequential([
    layers.Dense(64),
    layers.Dense(1)
])
Spy_model.compile(loss=tf.losses.MeanSquaredError(),
                  optimizer=tf.optimizers.Adam(), metrics=['accuracy'])
Spy_model.fit(Spy_features, Spy_labels, epochs=10)
test_loss, test_acc = Spy_model.evaluate(SpyTest_features, SpyTest_labels, verbose=2)
print('Test accuracy:', test_acc)
In neural networks, every value has an effect on the outcome (either positive or negative).
All four of your inputs are directly connected to 64 hidden units, which are then directly connected to the single output unit. So one cannot say in advance which value is positively or negatively affecting the output.
You can use matplotlib/seaborn to understand the data by plotting the values against each other.
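For example (just a sketch, assuming the SPY_Training DataFrame from the question with its 'Class' column is already loaded), a pairwise plot of the four indicators coloured by the class shows which ranges of values tend to coincide with an increase:
import seaborn as sns
import matplotlib.pyplot as plt
# scatter/histogram matrix of the indicators, coloured by whether the price rose
sns.pairplot(SPY_Training, hue="Class", vars=["RSI", "RS", "MACD", "signal"])
plt.show()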
I applied this tutorial https://github.com/Hvass-Labs/TensorFlow-Tutorials/blob/master/23_Time-Series-Prediction.ipynb (on a different dataset). The tutorial did not compute the mean squared error of the individual outputs, so I added the following line in the comparison function:
mean_squared_error(signal_true,signal_pred)
but the loss and MSE from the predictions were different from the loss and MSE reported by model.evaluate on the test data. The errors from model.evaluate (loss, MAE, MSE) on the test set:
[0.013499056920409203, 0.07980187237262726, 0.013792216777801514]
The errors for the individual targets (outputs):
Target0 0.167851388666284
Target1 0.6068108648555771
Target2 0.1710370357827747
Target3 2.747463225418181
Target4 1.7965991690103074
Target5 0.9065426398192563
I think it might be a problem in training the model, but I could not find where exactly it is. I would really appreciate your help.
Thanks
There are a number of reasons that you can have differences between the loss for training and evaluation.
Certain ops, such as batch normalization, are disabled during prediction; this can make a big difference with certain architectures, although it generally isn't supposed to if you're using batch norm correctly.
MSE for training is averaged over the entire epoch, while evaluation only happens on the latest "best" version of the model (a quick check for this is sketched below).
It could be due to differences in the datasets if the split isn't random.
You may be using different metrics without realizing it.
I'm not sure exactly what problem you're running into, but it can be caused by a lot of different things and it's often difficult to debug.
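One quick sanity check for the epoch-averaging effect mentioned above (a sketch, assuming a compiled Keras model named model and training arrays x_train/y_train, which are not part of the original post): re-evaluate the trained model on the training data and compare that value with the last epoch's logged loss.
history = model.fit(x_train, y_train, epochs=10, verbose=0)
last_epoch_loss = history.history['loss'][-1]  # averaged over the whole epoch
final_train_loss = model.evaluate(x_train, y_train, verbose=0, return_dict=True)['loss']  # final weights only
print(last_epoch_loss, final_train_loss)
If those two numbers differ noticeably, part of the train/evaluation gap is just the averaging, not a training problem.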
I had the same problem and found a solution. Hopefully this is the same problem you encountered.
It turns out that model.predict doesn't return predictions in the same order as generator.labels, and that is why the MSE was much larger when I attempted to calculate it manually (using the scikit-learn metric function).
>>> model.evaluate(valid_generator, return_dict=True)['mean_squared_error']
13.17293930053711
>>> mean_squared_error(valid_generator.labels, model.predict(valid_generator)[:,0])
91.1225401637833
My quick and dirty solution:
valid_generator.reset()  # necessary for starting from the first batch
all_labels = []
all_pred = []
for i in range(len(valid_generator)):  # necessary for avoiding an infinite loop
    x = next(valid_generator)
    pred_i = model.predict(x[0])[:, 0]
    labels_i = x[1]
    all_labels.append(labels_i)
    all_pred.append(pred_i)
    print(np.shape(pred_i), np.shape(labels_i))
cat_labels = np.concatenate(all_labels)
cat_pred = np.concatenate(all_pred)
The result:
>>> mean_squared_error(cat_labels, cat_pred)
13.172956865002352
This can be done much more elegantly, but was enough for me to confirm my hypothesis of the problem and regain some sanity.
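A more elegant option (a sketch with hypothetical names: datagen, valid_df and the column names are not from the original code) is to build the evaluation generator with shuffle=False, so that model.predict returns rows in the same order as .labels:
from sklearn.metrics import mean_squared_error
# shuffle=False keeps the prediction order aligned with ordered_generator.labels
ordered_generator = datagen.flow_from_dataframe(
    valid_df, x_col="filename", y_col="target",
    class_mode="raw", shuffle=False)
preds = model.predict(ordered_generator)[:, 0]
print(mean_squared_error(ordered_generator.labels, preds))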
First of all, I tried to perform dimensionality reduction on my n_samples x 53 data using scikit-learn's Kernel PCA with a precomputed kernel. The code worked without any issues when I first tried it with 50 samples. However, when I increased the number of samples to 100, I suddenly got the following message.
Process finished with exit code -1073740940 (0xC0000374)
Here's the detail of what I want to do:
I want to obtain the optimum value of the kernel function hyperparameter for my Kernel PCA function, defined as follows.
from sklearn.decomposition.kernel_pca import KernelPCA as drm
from somewhere import costfunction
from somewhere_else import customkernel
def kpcafun(w, X):
    # X is sample
    # w is hyperparam
    n_princomp = 2
    drmodel = drm(n_princomp, kernel='precomputed')
    k_matrix = customkernel(X, X, w)
    transformed_x = drmodel.fit_transform(k_matrix)
    cost = costfunction(transformed_x)
    return cost
Therefore, to optimize the hyperparams I used the following code.
from scipy.optimize import minimize
# assume that wstart and optimbound are already defined
res = minimize(kpcafun, wstart, method='L-BFGS-B', bounds=optimbound, args=(X))
The strange thing is that when I tried to debug the first 10 iterations of the optimization process, nothing strange happened; all values of the variables seemed normal. But when I turned off the breakpoints and let the program continue, the message appeared without any error notification.
Does anyone know what might be wrong with my code? Or does anyone have tips for resolving a problem like this?
Thanks
I have a problem with a GARCH model in Python. My code looks as follows:
import sys
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from arch import arch_model
sys.setrecursionlimit(1800)
spotmarket = pd.read_excel("./data/external/Spotmarket.xlsx", index=True)
l = spotmarket['Price'].pct_change().dropna()
returns = 100 * l
returns.plot()
plt.show()
model=arch_model(returns, vol='Garch', p=1, o=0, q=1, dist='Normal')
results=model.fit()
print(results.summary())
The first part of the code works well. I have end-of-day prices in a separate Excel table and want to model them with a GARCH model. The problem is that I get the error message The optimizer returned code 9. The message is:
Iteration limit exceeded
See scipy.optimize.fmin_slsqp for code meaning.
Has someone an idea, how I can handle the problem with the iteration limit? Thank you!
Reading the source code (here), you can pass additional parameters to the fit method. Internally, scipy.optimize.minimize (doc) is called, and the parameters of interest to you are probably maxiter and ftol.
Try manually changing the default values (maxiter=100 and ftol=1e-06) to new ones that might lead to convergence. Example:
results = model.fit(options={'maxiter': 200})
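A slightly fuller sketch (the option names are passed straight through to SciPy's SLSQP optimizer, and convergence_flag is simply the optimizer's exit code):
results = model.fit(disp='off', options={'maxiter': 500, 'ftol': 1e-08})
print(results.convergence_flag)  # 0 means the optimizer reported success
print(results.summary())
If the flag is still non-zero, rescaling the returns (as you already do with the *100) or loosening ftol further are the usual next steps.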