I am working through my first machine-learning exercise.
It is a system that predicts monthly temperatures.
train_t holds the temperatures and train_x holds the weights for each data point.
However, I have a question about how train_x is initialized.
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
from pprint import pprint

x = tf.placeholder(tf.float32, [None, 5])
w = tf.Variable(tf.zeros([5, 1]))
y = tf.matmul(x, w)
t = tf.placeholder(tf.float32, [None, 1])
loss = tf.reduce_sum(tf.square(y - t))
train_step = tf.train.AdamOptimizer().minimize(loss)

sess = tf.Session()
sess.run(tf.initialize_all_variables())

train_t = np.array([5.2, 5.7, 8.6, 14.9, 18.2, 20.4, 25.5, 26.4, 22.8, 17.5, 11.1, 6.6])  # monthly temperature
train_t = train_t.reshape([12, 1])

train_x = np.zeros([12, 5])
for row, month in enumerate(range(1, 13)):
    for col, n in enumerate(range(0, 5)):
        train_x[row][col] = month**n  ## why initialize like this??

i = 0
for _ in range(10000):
    i += 1
    sess.run(train_step, feed_dict={x: train_x, t: train_t})
    if i % 1000 == 0:
        loss_val = sess.run(loss, feed_dict={x: train_x, t: train_t})
        print('Step: %d, Loss: %f' % (i, loss_val))

w_val = sess.run(w)
pprint(w_val)

def predict(x):
    result = 0.0
    for n in range(0, 5):
        result += w_val[n][0] * x**n
    return result

fig = plt.figure()
subplot = fig.add_subplot(1, 1, 1)
subplot.set_xlim(1, 12)
subplot.scatter(range(1, 13), train_t)
linex = np.linspace(1, 12, 100)
liney = predict(linex)
subplot.plot(linex, liney)
However, I don't understand this part:

for row, month in enumerate(range(1, 13)):
    for col, n in enumerate(range(0, 5)):
        train_x[row][col] = month**n  ## why initialize like this??

What does this mean?
There is no comment about it in my book.
Why is train_x initialized like this?
In fact, this block of code:

train_t = np.array([5.2, 5.7, 8.6, 14.9, 18.2, 20.4, 25.5, 26.4, 22.8, 17.5, 11.1, 6.6])  # monthly temperature
train_t = train_t.reshape([12, 1])

train_x = np.zeros([12, 5])
for row, month in enumerate(range(1, 13)):
    for col, n in enumerate(range(0, 5)):
        train_x[row][col] = month**n

is the generation of your data. It initializes train_t and train_x, which are the data that will be fed into the placeholders t and x.
train_t is a tensor of temperatures.
train_x is a tensor of features (the powers of each month's number) associated with each temperature.
Together they constitute the dataset.
Both train_x and train_t are arrays with your training data. The array train_t holds the targets of your model, while train_x contains the features fed as input to your model.
The weights of your model (the ones that are trained) are in w, the only tf.Variable in your code, which here is initialized to zeros with tf.zeros([5, 1]).
The model you are training is a degree-4 polynomial (4 being the largest exponent produced by range(0, 5)) in the variable month, which ranges over range(1, 13). That snippet of code generates the degree-4 polynomial features from the scalar variable month.
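To make that concrete, here is a minimal NumPy-only sketch (separate from the book's code) showing that the double loop builds exactly the matrix of month powers, so that tf.matmul(x, w) evaluates a polynomial in the month for all twelve months at once:

import numpy as np

months = np.arange(1, 13)

# Same matrix as the double loop: column n holds month**n for n = 0..4
train_x = np.vander(months, N=5, increasing=True).astype(np.float64)
print(train_x[0])  # [ 1.  1.  1.  1.  1.]  -> January:  1**0 .. 1**4
print(train_x[2])  # [ 1.  3.  9. 27. 81.]  -> March:    3**0 .. 3**4

# With a weight column w of shape (5, 1), train_x @ w computes
# w0 + w1*m + w2*m**2 + w3*m**3 + w4*m**4 for every month m at once,
# which is what tf.matmul(x, w) does inside the training graph.
w = np.array([[1.0], [0.5], [0.1], [0.0], [0.0]])  # arbitrary example coefficients
print(train_x @ w)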
Related
I'm currently trying to build a "simple" LSTM model that takes historical Bitcoin data, learns from it, and then tries to predict the future X steps in advance.
I've built it on the idea that A + B + C = D, so B + C + D should give E. (I think that's a very simple idea behind an LSTM model. I might be wrong, however; I'm pretty new to this.)
I managed to build the basics in Python (I'm fairly new to Python), but something seems off with the predictions. For some reason, many of the predictions I test or make end up flatlining. I have a theory on why, but I have no idea whether it's correct and even less idea of how to solve it.
My theory is that within a sequence the model learns to put more importance/weight on the last value, because with Bitcoin prices the future price (in 1 minute) is probably pretty close to the price now. That's why the predicted values keep getting closer to the real value, eventually becoming equal and thus flatlining in the graph. (I don't know if that makes sense, but that's what I thought anyway.)
I've also added a screenshot of my graph from a few days ago. Almost all predictions end up similar to this graph; this one is just a more extreme example, as a demonstration.
Here is my code. Can someone please explain why it flatlines and what I did wrong?
import numpy as np
from matplotlib import pyplot
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers import Dropout
import yfinance as yf
from sklearn.preprocessing import MinMaxScaler
from math import sqrt
from sklearn.metrics import mean_squared_error
# Create output sets X + Y from a given input set
# with inputset : a 1-dimensional list of floats
# with N : the number of lookback values to use for X
# with Gap : the number of points skipped between X and Y
# Y: is equal to the input (although the first N values are missing)
# X: for each y of Y a corresponding set of size N is created,
#    composed of the N values preceding y.
def create_lookback(inputset, n=1, gap=0):
    print("create_lookback with n=%d gap=%d" % (n, gap))
    print(" - length of inputset = %d" % len(inputset))
    dataX, dataY = [], []
    for i in range(len(inputset) - (n + gap)):
        a = inputset[i:(i + n), 0]
        dataX.append(a)
        dataY.append(inputset[i + n + gap, 0])
    print(" - length of dataY = %d" % len(dataY))
    data_x = np.array(dataX)
    xret = data_x.reshape(data_x.shape[0], 1, data_x.shape[1])
    return xret, np.array(dataY)
# Train model based on given training-set + Test-set
def create_model(trainX, trainY, testX, testY):
    model = Sequential()
    model.add(LSTM(units=100, input_shape=(trainX.shape[1], trainX.shape[2])))
    model.add(Dropout(0.2))
    #model.add(LSTM(30, return_sequences=True))
    #model.add(Dropout(0.1))
    model.add(Dense(1))
    model.compile(loss='mae', optimizer='adam')
    history = model.fit(trainX, trainY, epochs=100, batch_size=5, validation_data=(testX, testY), verbose=1, shuffle=False)
    return model
# Evaluate given X / Y set.
# - Calculate RMSE
# - Generate visual line-plot to screen
def show_result(scaler, yhat, setY, txt):
    print("Show %s result" % txt)
    yhat_inverse = scaler.inverse_transform(yhat.reshape(-1, 1))
    testY_inverse = scaler.inverse_transform(setY.reshape(-1, 1))
    if len(testY_inverse) == len(yhat_inverse):
        rmse = sqrt(mean_squared_error(testY_inverse, yhat_inverse))
        print(' RMSE %s : %.3f' % (txt, rmse))
    pyplot.plot(yhat_inverse, label='predict ' + txt)
    pyplot.plot(testY_inverse, label='actual ' + txt, alpha=0.5)
    pyplot.legend()
    pyplot.show()
# Extrapoleer is Dutch for extrapolate
def extrapoleer(i, model, tup, toekomst):
    if i == 0:
        return
    setX = np.array([[tup]])
    y = model.predict(setX)
    y_float = y[0][0]
    tup_new = np.append(tup[1:], y_float)
    toekomst.append(y_float)
    extrapoleer(i - 1, model, tup_new, toekomst)
# --- end of defined functions
# -- start of main flow
data_grid_1 = yf.download('BTC-USD', start="2021-04-14",end="2021-04-15", interval="1m");
data_grid_2 = yf.download('BTC-USD', period="12h", interval="1m");
dataset_1 = data_grid_1.iloc[:, 1:2].values
dataset_2 = data_grid_2.iloc[:, 1:2].values
scaler = MinMaxScaler(feature_range = (0, 1))
scaled = scaler.fit_transform(dataset_1)
# 70% of dataset_1 is used to train ; 30% to test
train_size = int(len(scaled) * 0.7)
test_size = len(scaled) - train_size
train, test = scaled[0:train_size,:], scaled[train_size:len(scaled),:]
print("train: %d test: %d" % (len(train), len(test)))
scaled_2 = scaler.fit_transform(dataset_2)
look_back_n = 3
look_back_gap = 0
trainX, trainY = create_lookback(train, look_back_n, look_back_gap)
testX, testY = create_lookback(test, look_back_n, look_back_gap)
testX_2, testY_2 = create_lookback(scaled_2, look_back_n, look_back_gap)
model = create_model(trainX,trainY,testX,testY)
yhat_1 = model.predict(testX)
yhat_2 = model.predict(testX_2)
show_result(scaler,yhat_1,testY,"test")
show_result(scaler,yhat_2,testY_2,"test2")
last_n = testY_2[-look_back_n:]
# toekomst = future in Dutch
toekomst = []
# aantal = amount in Dutch; this indicates the number of steps you want to predict into the future
aantal = 30
extrapoleer(aantal, model, last_n, toekomst)
print("Resultaat van %d voorspelde punten in de toekomst: " % aantal)  # "Result of %d predicted points in the future:"
print(toekomst)
yhat_2_plus = np.append(yhat_2,toekomst)
show_result(scaler,yhat_2_plus,testY_2,"test2-plus")
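One way to sanity-check the flatlining theory above is to compare the model against a naive persistence baseline that simply repeats the last observed price. This is only a sketch; it reuses the scaler, testX_2, testY_2 and yhat_2 objects from the code above. If the LSTM's RMSE is not clearly lower than the baseline's, the model has effectively learned to copy the previous value.

# Naive persistence baseline: predict "next price = last price in the window".
# testX_2 has shape (samples, 1, look_back_n), so the last element of each
# lookback window is the most recent scaled price.
persistence = testX_2[:, 0, -1]

persistence_inverse = scaler.inverse_transform(persistence.reshape(-1, 1))
actual_inverse = scaler.inverse_transform(testY_2.reshape(-1, 1))
lstm_inverse = scaler.inverse_transform(yhat_2.reshape(-1, 1))

print('RMSE persistence baseline: %.3f' % sqrt(mean_squared_error(actual_inverse, persistence_inverse)))
print('RMSE LSTM model          : %.3f' % sqrt(mean_squared_error(actual_inverse, lstm_inverse)))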
I followed the guide at https://stackabuse.com/time-series-prediction-using-lstm-with-pytorch-in-python/ in order to set up an LSTM model for predictions using torch.
Once I had imported the needed modules (torch, torch.nn, functional, optim, etc.), I set up my index list with quantities for 52 weeks.
idxlist = []
for x in range(0, len(inputslist)):
    idxlist.append(a - datetime.timedelta(weeks=x))
in_data = pd.DataFrame(columns=['date','qty'])
in_data.date = idxlist
in_data.qty = inputslist
all_data = in_data['qty'].values.astype(float)
Then, I defined the test size and set up a scaler to normalize data:
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler(feature_range=(-1,1))
train_data_normalized = scaler.fit_transform(train_data.reshape(-1,1))
train_data_normalized = torch.FloatTensor(train_data_normalized).view(-1)
#train window
train_window = 12
Then I created sequences:
def create_inout_sequences(input_data, tw):
    inout_seq = []
    L = len(input_data)  ## 40
    for i in range(L - tw):  ## 32 (if tw = 8)
        train_seq = input_data[i:i+tw]
        train_label = input_data[i+tw:i+tw+1]
        inout_seq.append((train_seq, train_label))
    return inout_seq

train_inout_seq = create_inout_sequences(train_data_normalized, train_window)
print(len(train_inout_seq))  # len(train) - train_window
Then I set up the Net class for the LSTM model. After setting up model = LSTM(), optimizing with optim.Adam, and switching to model.eval(), I obtain the inverse transform with:
actual_predictions = scaler.inverse_transform(np.array(test_inputs[train_window:] ).reshape(-1, 1))
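The Net class itself is not shown in the question; a minimal PyTorch LSTM regressor of the kind that guide builds looks roughly like the sketch below (the class and attribute names here are illustrative, not the asker's actual code):

import torch
import torch.nn as nn

class LSTM(nn.Module):
    # Sketch of a single-layer LSTM followed by a linear head; hyperparameters are placeholders.
    def __init__(self, input_size=1, hidden_layer_size=100, output_size=1):
        super().__init__()
        self.hidden_layer_size = hidden_layer_size
        self.lstm = nn.LSTM(input_size, hidden_layer_size)
        self.linear = nn.Linear(hidden_layer_size, output_size)
        self.hidden_cell = (torch.zeros(1, 1, hidden_layer_size),
                            torch.zeros(1, 1, hidden_layer_size))

    def forward(self, input_seq):
        lstm_out, self.hidden_cell = self.lstm(input_seq.view(len(input_seq), 1, -1),
                                               self.hidden_cell)
        predictions = self.linear(lstm_out.view(len(input_seq), -1))
        return predictions[-1]  # predict the value following the sequence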
However, plotting the predicted values of the last 12 weeks against the actual values, I obtain the attached graph.
Why is this prediction so far below the actual values, even though it is periodic?
import matplotlib.pyplot as plt
import tensorflow as tf
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
df = pd.read_csv("FuelConsumption.csv")
df = df[['ENGINE SIZE','CYLINDERS', 'Mcity', 'Mhwy', 'Mcmb', 'McmbMPG', 'CO2']]
features = np.asanyarray(df[['ENGINE SIZE','CYLINDERS', 'Mcity', 'Mhwy', 'Mcmb', 'McmbMPG']])
label = np.asanyarray(df[['CO2']])
mu = np.mean(features,axis=0)
sigma = np.std(features,axis=0)
feature_normalized = (features - mu)/sigma
n_training_samples = feature_normalized.shape[0]
n_dim = feature_normalized.shape[1]
feature_reshaped = np.reshape(features,[n_training_samples,n_dim])
label_reshaped = np.reshape(label,[n_training_samples,1])
train_X, test_X, train_Y, test_Y = train_test_split(feature_reshaped, label_reshaped, shuffle=True, test_size=0.25, random_state=42)
print("shape of training input = ", train_X.shape)
print("shape of training output = ", train_Y.shape)
numFeatures = train_X.shape[1]
print("Number of features = ", numFeatures)
numLabels = train_Y.shape[1]
print("Number of labels = ", numLabels)
learning_rate = 0.01
training_epochs = 1000
X = tf.placeholder(tf.float32,[None,numFeatures])
Y = tf.placeholder(tf.float32,[None,numLabels])
W = tf.Variable(tf.ones([numFeatures,numLabels]))
B = tf.Variable(tf.ones([1,numLabels]))
init = tf.global_variables_initializer()
Y_model = tf.add(tf.matmul(X, W, name="apply_weights"), B, name="add_bias")
cost = tf.reduce_mean(tf.square(Y_model - Y))
training_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
sess = tf.Session()
sess.run(init)
loss_values = []
train_data = []
for epoch in range(training_epochs):
    _, loss_val, a_val, b_val = sess.run([training_step, cost, W, B], feed_dict={X: train_X, Y: train_Y})
    loss_values.append(loss_val)
    if epoch % 20 == 0:
        print(epoch, loss_val, a_val, b_val)
        train_data.append([a_val, b_val])
plt.plot(loss_values, '--')
plt.show()
I am trying to predict CO2 emissions from multiple variables such as cylinders, mileage, engine size, etc. using linear regression. I used the above code in TensorFlow. When I try to run the model, the loss value, weights, and biases are updated only for about 20 iterations, and after that they become infinity (NaN). Is the problem with the code or with the choice of cost function/optimizer?
I have plotted the loss values (see the attached plot).
If only engine size and cylinders are used as features, the result is good. Adding any of the other features (mileage in city, mileage on highway, mileage combined) results in the problem mentioned above. I am attaching the scatter plot of Mcity and Mhwy. Is it a problem with the data itself? Please take a look at the scatter plot of mileage in the city and on the highway.
I'm trying to find the best set of clusters in my data by taking, over many k-means trials in TensorFlow, the result with the lowest average distance.
But my code doesn't update the initial centroids in each trial, so all the results are the same.
Here's my code1 - tensor_kmeans.py
import numpy as np
import pandas as pd
import random
import tensorflow as tf
from tensorflow.contrib.factorization import KMeans
from sklearn import metrics
import imp
import pickle
# load as DataFrame
pkl = 'fasttext_words_k.pkl'
with open(pkl, 'rb') as f:
    unique_words_in_fasttext = pickle.load(f).T

vector = []
for i in range(len(unique_words_in_fasttext)):
    vector.append(list(unique_words_in_fasttext.iloc[i, :]))
vector = [np.array(f) for f in vector ]
# Import data
full_data_x = vector
# Parameters
num_steps = 100 # Total steps to train
batch_size = 1024 # The number of samples per batch
n_clusters = 1300 # The number of clusters
num_classes = 100 # The 10 digits
num_rows = 13074
num_features = 300 # Each image is 28x28 pixels
### tensor kmeans ###
# Input images
X = tf.placeholder(tf.float32, shape=[None , num_features])
# Labels (for assigning a label to a centroid and testing)
# Y = tf.placeholder(tf.float32, shape=[None, num_classes])
# K-Means Parameters
kmeans = KMeans(inputs=X, num_clusters=n_clusters, distance_metric='cosine',
                use_mini_batch=True, initial_clusters="random")
# Build KMeans graph
training_graph = kmeans.training_graph()
if len(training_graph) > 6:  # TensorFlow 1.4+
    (all_scores, cluster_idx, scores, cluster_centers_initialized,
     cluster_centers_var, init_op, train_op) = training_graph
else:
    (all_scores, cluster_idx, scores, cluster_centers_initialized,
     init_op, train_op) = training_graph

cluster_idx = cluster_idx[0]  # fix for cluster_idx being a tuple
avg_distance = tf.reduce_mean(scores)
# Initialize the variables (i.e. assign their default value)
init_vars = tf.global_variables_initializer()
# Start TensorFlow session
sess = tf.Session()
# Run the initializer
sess.run(init_vars, feed_dict={X: full_data_x})
sess.run(init_op, feed_dict={X: full_data_x})
# Training
for i in range(1, num_steps + 1):
    _, d, idx = sess.run([train_op, avg_distance, cluster_idx],
                         feed_dict={X: full_data_x})
    if i % 10 == 0 or i == 1:
        print("Step %i, Avg Distance: %f" % (i, d))
labels = list(range(num_rows))
# Assign a label to each centroid
# Count total number of labels per centroid, using the label of each training
# sample to their closest centroid (given by 'idx')
counts = np.zeros(shape=(n_clusters, num_classes))
for i in range(len(idx)):
    counts[idx[i]] += labels[i]
# Assign the most frequent label to the centroid
labels_map = [np.argmax(c) for c in counts]
labels_map = tf.convert_to_tensor(labels_map)
# Evaluation ops
# Lookup: centroid_id -> label
cluster_label = tf.nn.embedding_lookup(labels_map, cluster_idx)
# assign variables
cluster_list_k = idx
And here's the code outside code1.
k_li = []
rotation = 50
best_labels = []
best_k = -1

for i in range(rotation):
    import tensor_kmeans
    k_li.append(tensor_kmeans.k)
    if len(k_li) > 0:
        for i in range(len(k_li)):
            if k_li[i] > best_k:
                best_labels = tensor_kmeans.cluster_list_k
                best_k = k_li[i]
    tensor_kmeans = imp.reload(tensor_kmeans)
Where is the problem?
I'm waiting for your answer, thank you.
Each time you call KMeans() you should use a new random_seed, i.e.

kmeans = KMeans(inputs=X, num_clusters=n_clusters, distance_metric='cosine',
                use_mini_batch=True, initial_clusters="random", random_seed=SOME_NEW_VALUE)

Otherwise the function KMeans() will assume random_seed=0, so that the results are reproducible (i.e. the results are always the same).
A simple way to resolve your issue would be to make a function out of code1 - tensor_kmeans.py, then call this function with a new random_seed (as an input parameter) for each trial.
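A sketch of that refactoring (assuming the same imports, full_data_x, num_features, n_clusters and num_steps as in code1, and a TensorFlow 1.4+ training_graph() tuple) could look like this:

def run_kmeans_trial(random_seed):
    # One k-means trial with its own seed; returns (avg distance, cluster assignments).
    tf.reset_default_graph()  # build a fresh graph for every trial
    X = tf.placeholder(tf.float32, shape=[None, num_features])
    kmeans = KMeans(inputs=X, num_clusters=n_clusters, distance_metric='cosine',
                    use_mini_batch=True, initial_clusters="random",
                    random_seed=random_seed)
    (all_scores, cluster_idx, scores, cluster_centers_initialized,
     cluster_centers_var, init_op, train_op) = kmeans.training_graph()
    cluster_idx = cluster_idx[0]
    avg_distance = tf.reduce_mean(scores)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer(), feed_dict={X: full_data_x})
        sess.run(init_op, feed_dict={X: full_data_x})
        for _ in range(num_steps):
            _, d, idx = sess.run([train_op, avg_distance, cluster_idx],
                                 feed_dict={X: full_data_x})
    return d, idx

# Keep the trial with the lowest average distance.
results = [run_kmeans_trial(seed) for seed in range(50)]
best_d, best_labels = min(results, key=lambda r: r[0])

This avoids the import/imp.reload trick entirely: every call builds a fresh graph with a different seed, so each trial starts from different initial centroids.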
I am trying to make a simple MLP that predicts the values of the pixels of an image, following the original blog post.
Here's my earlier attempt using Keras in Python (see the linked question).
I've tried to do the same in TensorFlow, but I am getting very large output values (~10^12) when they should be less than 1.
Here's my code:
import numpy as np
import cv2
from random import shuffle
import tensorflow as tf
'''
Image preprocessing
'''
image_file = cv2.imread("Mona Lisa.jpg")
h = image_file.shape[0]
w = image_file.shape[1]
preX = []
preY = []
for i in xrange(h):
    for j in xrange(w):
        preX.append([i, j])
        preY.append(image_file[i, j, :].astype('float32') / 255.0)

print preX[:5], preY[:5]
zipped = [i for i in zip(preX,preY)]
shuffle(zipped)
X_train = np.array([i for (i,j) in zipped]).astype('float32')
Y_train = np.array([j for (i,j) in zipped]).astype('float32')
print X_train[:10], Y_train[:10]
'''
Tensorflow code
'''
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)
x = tf.placeholder(tf.float32, shape=[None,2])
y = tf.placeholder(tf.float32, shape=[None,3])
'''
Layers
'''
w1 = weight_variable([2,300])
b1 = bias_variable([300])
L1 = tf.nn.relu(tf.matmul(X_train,w1)+b1)
w2 = weight_variable([300,3])
b2 = bias_variable([3])
y_model = tf.matmul(L1,w2)+b2
'''
Training
'''
# criterion
MSE = tf.reduce_mean(tf.square(tf.sub(y,y_model)))
# trainer
train_op = tf.train.GradientDescentOptimizer(learning_rate = 0.01).minimize(MSE)
nb_epochs = 10
init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)
cost = 0
for i in range(nb_epochs):
    sess.run(train_op, feed_dict={x: X_train, y: Y_train})
    cost += sess.run(MSE, feed_dict={x: X_train, y: Y_train})
cost /= nb_epochs
print cost
'''
Prediction
'''
pred = sess.run(y_model,feed_dict = {x:X_train})*255.0
print pred[:10]
output_image = []
index = 0
h = image_file.shape[0]
w = image_file.shape[1]
for i in xrange(h):
    row = []
    for j in xrange(w):
        row.append(pred[index])
        index += 1
    row = np.array(row)
    output_image.append(row)
output_image = np.array(output_image)
output_image = output_image.astype('uint8')
cv2.imwrite('out_mona_300x3_tf.png',output_image)
First of all, I think that instead of running the train_op and then the MSE
you can run both ops in a list and reduce your computational cost significantly.
for i in range(nb_epochs):
    mse_val, _ = sess.run([MSE, train_op], feed_dict={x: X_train, y: Y_train})
    cost += mse_val
Secondly, I suggest always logging your cost so you can see what is going on during the training phase: either print it out manually or use TensorBoard to log your cost and plot it (you can find examples on the official TF page).
You can also monitor your weights to see that they aren't blowing up.
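For example, a minimal TF1-style logging sketch for the question's graph, using the MSE, w1 and w2 tensors defined above and a hypothetical log directory ./logs:

# Scalar summary for the cost, histograms for the weights.
tf.summary.scalar('MSE', MSE)
tf.summary.histogram('w1', w1)
tf.summary.histogram('w2', w2)
merged = tf.summary.merge_all()
writer = tf.summary.FileWriter('./logs', sess.graph)

for i in range(nb_epochs):
    summary, _ = sess.run([merged, train_op], feed_dict={x: X_train, y: Y_train})
    writer.add_summary(summary, i)  # then inspect with: tensorboard --logdir ./logs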
A few things you can try:
Reduce the learning rate and add regularization to the weights (a sketch of both follows this list).
Check that your training set (the pixels) really consists of the values you expect it to.
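As a sketch of the first suggestion applied to the loss in the question's code (the 1e-4 learning rate and 1e-3 regularization strength are arbitrary starting values, not tuned):

# MSE plus a small L2 penalty on both weight matrices, trained with a lower learning rate.
l2_penalty = 1e-3 * (tf.nn.l2_loss(w1) + tf.nn.l2_loss(w2))
loss_reg = MSE + l2_penalty
train_op = tf.train.GradientDescentOptimizer(learning_rate=1e-4).minimize(loss_reg)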
You give the input-layer weights and the output-layer weights the same names w and b, so it seems something goes wrong in the gradient-descent procedure. Actually, I'm surprised TensorFlow doesn't issue an error or at least a warning (or am I missing something?).