What I am trying to achieve.
I am trying to predict the opening price of Natural Gas ("NG Open") from multiple input parameters, per the table below. I have followed some tutorials, but they don't explain the reason behind a particular format. The code works after multiple rounds of trial and error, but I need some understanding of how the data is reshaped.
Dataset - only a few lines:
Contract NGLast NGOpen NGHigh NGLow NGVolumes COOpen COHigh COLow
2018-12-01 4.487 4.50 4.60 4.03 100,000 56.00 58.00 50.00
2019-01-01 4.450 4.52 4.61 4.11 93000 51.00 53.00 45.00
Code
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from keras.layers import Dense
from keras.models import Sequential
from keras.layers import LSTM
import datetime
from keras import metrics
from sklearn.preprocessing import MinMaxScaler
data = pd.read_excel(r"C:\Futures\Futures.xls")
data['Contract'] = pd.to_datetime(data['Contract'],unit='s').dt.date
data['NG Last'] = data['NG Last'].str.rstrip('s')
data['CO Last'] = data['CO Last'].str.rstrip('s')
COHigh = np.array([data.iloc[:,8]])
COLow = np.array([data.iloc[:,9]])
NGLast = np.array([data.iloc[:,1]])
NGOpen = np.array([data.iloc[:,2]])
NGHigh = np.array([data.iloc[:,3]])
X = np.concatenate([COHigh,COLow, NGLast,NGOpen], axis =0)
X = np.transpose(X)
Y = NGHigh
Y = np.transpose(Y)
scaler = MinMaxScaler()
scaler.fit(X)
X = scaler.transform(X)
scaler.fit(Y)
Y = scaler.transform(Y)
**X = np.reshape(X,(X.shape[0],1,X.shape[1]))**
print(X.shape)
model = Sequential()
**model.add(LSTM(100,activation='tanh',input_shape=(1,4),** recurrent_activation='hard_sigmoid'))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='rmsprop', metrics = [metrics.mae])
model.fit(X,Y,epochs = 10,batch_size=1,verbose=2)
Predict = model.predict(X,verbose=1)
Question
What is the reasoning behind the code marked with asterisks above?
1> I have four columns as input, so shouldn't it be X = np.reshape(X,(X.shape[0],1,X.shape[1],X.shape[2],X.shape[3])), and so on for all the columns considered as inputs?
2> I need an explanation of the parameters in the line below: model.add(LSTM(100,activation='tanh',input_shape=(1,4), recurrent_activation='hard_sigmoid'))
Your data is currently an array of shape (x, 4), where x is the number of rows. So with the toy data you provide, X.shape returns (2, 4). If you look at the LSTM line later on, you'll note that it expects a tensor of shape (1, 4) -- that's the input_shape parameter. The np.reshape line is what gets you there: it converts your 2-D array into a 3-D one. Again, with your example, X.shape will return (2, 1, 4) after the reshape. Basically, you now have a list of length 2 of (1, 4) arrays, which matches what the LSTM layer wants.
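To make the reshape concrete, here is a small standalone sketch with made-up numbers (not the questioner's file) showing the shapes before and after:

import numpy as np

# two rows, four input columns, as in the toy data
X = np.array([[58.0, 50.0, 4.487, 4.50],
              [53.0, 45.0, 4.450, 4.52]])
print(X.shape)   # (2, 4): (samples, features)

# LSTM layers expect (samples, timesteps, features), so insert a timestep axis of length 1
X = np.reshape(X, (X.shape[0], 1, X.shape[1]))
print(X.shape)   # (2, 1, 4): each sample is now one (1, 4) block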
I'd recommend taking a look at the Keras documentation on the LSTM layer, but here's basically what's happening. 100 is the number of units (or neurons) that this layer will have. input_shape is as noted above. The activation is the function used to calculate the output of each neuron; tanh is pretty standard in this basic setup, but others are possible. Take a look at the list of activations that Keras provides out of the box. The recurrent_activation is the function applied to the LSTM's internal gates (the input, forget and output gates) rather than to its output.
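For reference, here is a small illustrative sketch (not part of the original answer) that spells out what each argument to the LSTM layer means in this setup:

from keras.models import Sequential
from keras.layers import LSTM

probe = Sequential()
probe.add(LSTM(100,                                   # 100 units: each sample becomes a length-100 vector
               activation='tanh',                     # applied when computing the cell output
               recurrent_activation='hard_sigmoid',   # applied to the input, forget and output gates
               input_shape=(1, 4)))                   # (timesteps, features) of each sample
probe.summary()   # output shape: (None, 100)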
Related
I have the following piece of code, which is a simplification of my actual code:
#!/usr/bin/env python3
import tensorflow as tf
import re
import numpy as np
import pandas as pd
from tensorflow.keras.preprocessing.text import Tokenizer #text to vector.
from tensorflow.keras.layers import Dense, Embedding, LSTM, SpatialDropout1D
from tensorflow.keras.models import Sequential
DATASIZE = 1000
model = Sequential()
model.add(Embedding(1000, 120, input_length = DATASIZE))
model.add(SpatialDropout1D(0.4))
model.add(LSTM(176, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(2,activation='sigmoid'))
model.compile(loss = 'binary_crossentropy', optimizer='adam', metrics = ['accuracy'])
print(model.summary())
training = [[0 for x in range(10)] for x in range (DATASIZE)] #random value
label = [[1 for x in range(10)] for x in range (DATASIZE)] #random value
model.fit(training, label, epochs = 5, batch_size=32, verbose = 'auto')
What I need:
My end goal is to be able to check whether a given input vector (which is a numerical representation of data) is positive or negative, so this is a binary classification problem. Here all 1000 vectors are 10 digits long, but they could be much longer or much shorter; the length might vary throughout the dataset.
When running this I get the following error:
ValueError: logits and labels must have the same shape ((None, 2) vs (None, 10))
How do I have to structure my vectors in order to not get this error and actually correctly fit the model?
EDIT:
I could change the model.add(Dense(2,activation='sigmoid')) call to model.add(Dense(10,activation='sigmoid')). But this doesn't make much sense to me, as the first parameter of Dense is the actual number of output possibilities. In my case there are only 2 possibilities: positive or negative. So randomly changing this to 10 makes the program run, but doesn't make sense to me. And I am not even sure it is using my 1000 training vectors....
Your last Dense layer has 2 neurons, while each label in your dataset is a vector of length 10. The output of the last layer must have the same shape as the labels, which means 10 units in your case. Just replace 2 with 10 in your last Dense layer.
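If the intent really is two classes (positive or negative), an alternative to widening the output layer is to reshape the labels instead. Below is a minimal sketch, not part of the answer above, assuming each sample is a sequence of 10 token ids and each label is a single 0/1; the data is random and purely illustrative:

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, SpatialDropout1D, LSTM, Dense

DATASIZE = 1000
SEQ_LEN = 10   # length of each input vector, not the dataset size

model = Sequential()
model.add(Embedding(1000, 120, input_length=SEQ_LEN))
model.add(SpatialDropout1D(0.4))
model.add(LSTM(176, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(1, activation='sigmoid'))          # one unit: probability of the positive class
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

training = np.random.randint(0, 1000, size=(DATASIZE, SEQ_LEN))  # 1000 sequences of 10 token ids
label = np.random.randint(0, 2, size=(DATASIZE, 1))              # one 0/1 label per sequence
model.fit(training, label, epochs=5, batch_size=32)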
I wish to examine the values of a tensor after the mask is applied to it.
Here is a truncated part of the model. I assign temp = x so that later I can print temp to check the exact values.
This is a 4-class classification model using acoustic features. Assume I have data of shape (1000, 50, 136), i.e. (batch, timesteps, features).
The objective is to check whether the model is studying the features by timestep. In other words, we wish to confirm that the model is learning from a slice like the red rectangle in the picture. Logically, that is how a Keras LSTM layer works, but the confusion matrix produced is quite different when a parameter changes (e.g. Dense units). The validation accuracy stays at 45%, so we would like to visualize the model.
The proposed idea is to print out the first step of the first batch and print out the input to the model. If they are the same, then the model is learning the right way (all 136 features of a timestep at once) instead of processing (50, 1) timesteps of a single feature at a time.
input_feature = Input(shape=(X_train.shape[1],X_train.shape[2]))
x = Masking(mask_value=0)(input_feature)
temp = x
x = Dense(Dense_unit,kernel_regularizer=l2(dense_reg), activation='relu')(x)
I have tried tf.print(), which gave me AttributeError: 'Tensor' object has no attribute '_datatype_enum'.
Following Get output from a non final keras model layer, as suggested by Lescurel:
model2 = Model(inputs=[input_attention, input_feature], outputs=model.get_layer('masking')).output
print(model2.predict(X_test))
AttributeError: 'Masking' object has no attribute 'op'
You want the output after the mask is applied.
Lescurel's link in the comment shows how to do that.
This link to GitHub does, too.
You need to make a new model that takes as inputs the input of your model and as outputs the output of the masking layer.
I tested it with some made-up code derived from your snippets.
import numpy as np
from keras import Input
from keras.layers import Masking, Dense
from keras.regularizers import l2
from keras.models import Sequential, Model
X_train = np.random.rand(4,3,2)
Dense_unit = 1
dense_reg = 0.01
mdl = Sequential()
mdl.add(Input(shape=(X_train.shape[1],X_train.shape[2]),name='input_feature'))
mdl.add(Masking(mask_value=0,name='masking'))
mdl.add(Dense(Dense_unit,kernel_regularizer=l2(dense_reg),activation='relu',name='output_feature'))
mdl.summary()
mdl2mask = Model(inputs=mdl.input,outputs=mdl.get_layer("masking").output)
maskoutput = mdl2mask.predict(X_train)
mdloutput = mdl.predict(X_train)
maskoutput # print output after/of masking
mdloutput # print output of mdl
maskoutput.shape #(4, 3, 2): masking has the shape of the layer before (input here)
mdloutput.shape #(4, 3, 1): shape of the output of dense
I have some data that is of shape 10000 x 1440 x 8 where 10000 is the number of days, 1440 the number of minutes and 8 is the number of features.
For each day, i.e. each submatrix of size 1440 x 8, I wish to train an autoencoder and extract the weights from the second layer, such that my output will be a matrix of size 10000 x 8.
I can do this in a loop with
import numpy as np
from keras.layers import Input, Dense
from keras import regularizers, models, optimizers
data = np.random.random(size=(10000,1440,8))
def AE(y, epochs=100, learning_rate=1e-4, regularization=5e-4):
    input = Input(shape=(y.shape[1],))
    encoded = Dense(1, activation='relu',
                    kernel_regularizer=regularizers.l2(regularization))(input)
    decoded = Dense(y.shape[1], activation='relu',
                    kernel_regularizer=regularizers.l2(regularization))(encoded)
    autoencoder = models.Model(input, decoded)
    autoencoder.compile(optimizer=optimizers.Adam(lr=learning_rate), loss='mean_squared_error')
    autoencoder.fit(y, y, epochs=epochs, batch_size=10, shuffle=False)
    (w1, b1, w2, b2) = autoencoder.get_weights()
    return (w1, b1, w2, b2)

lst = []
for i in range(data.shape[0]):
    y = data[i]
    (_, _, w2, _) = AE(y)
    lst.append(w2[0])
output = np.array(lst)
However, this feels very wasteful: surely I must be able to just pass the 3-D data to the autoencoder and retrieve what I want. But if I try to modify the shape of the input to input = Input(shape=(y.shape[1],y.shape[2]))
I get an error
ValueError: Dimensions must be equal, but are 1440 and 8 for '{{node
mean_squared_error/SquaredDifference}} =
SquaredDifference[T=DT_FLOAT](model_778/dense_1558/Relu,
IteratorGetNext:1)' with input shapes: [?,1440,1440], [?,1440,8].
Any pointers on how to get the shape right?
Simply reshape your data like so and call the function:
data = data.reshape(data.shape[0]*data.shape[1], -1)
(w1, b1, w2, b2) = AE(data)
print(w2.shape)
Your first layer of the NN is a Dense layer, and you can only pass two-dimensional data into it: one dimension is the batch size and the other is the feature vector. When you use the data the way you are using it, you are treating each data point independently, which means you can join the first two axes together and pass the result to the NN. Note, however, that you would still need to modify the code so that you are not passing the entire dataset to the NN at once: you need to split the data into batches and loop over those before passing it on. And honestly, that's the same as what you are doing now, so your looping is not as bad as you think it is for what you are trying to do.
However, also note that you have time series data, and treating each datapoint as an independent point doesn't really make sense. You need an LSTM layer or something similar to learn the time series encoding.
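As a rough illustration of that suggestion (a minimal sketch with placeholder layer sizes, not the answerer's code), an LSTM autoencoder that treats each day as one sequence of 1440 timesteps with 8 features could look like this:

import numpy as np
from keras.layers import Input, LSTM, RepeatVector, TimeDistributed, Dense
from keras.models import Model

timesteps, n_features = 1440, 8
inp = Input(shape=(timesteps, n_features))
code = LSTM(8)(inp)                           # compress a whole day into an 8-dimensional code
x = RepeatVector(timesteps)(code)             # repeat the code for every timestep
x = LSTM(8, return_sequences=True)(x)
out = TimeDistributed(Dense(n_features))(x)   # reconstruct the 8 features per minute
autoencoder = Model(inp, out)
autoencoder.compile(optimizer='adam', loss='mean_squared_error')

demo = np.random.random(size=(100, timesteps, n_features))  # small random demo set
autoencoder.fit(demo, demo, epochs=1, batch_size=10)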
As the question says, I can only predict from my model with model.predict_on_batch(). If I use model.predict(), Keras tries to concatenate everything together and that doesn't work.
For my application (a sequence-to-sequence model) it is faster to do the grouping on the fly. But even if I had done it in Pandas and then only used the Dataset for the padded batches, .predict() still wouldn't work, would it?
If I can get predict_on_batch to work, then that's good enough. But I can only predict on the first batch of the Dataset. How do I get predictions for the rest? I can't loop over the Dataset, and I can't consume it...
Here's a smaller code example. The group is the same as the labels, but in the real world they are obviously two different things. There are 3 classes, a maximum of 2 values in a sequence, and 2 rows of data per batch. There are a lot of comments, and I nicked parts of the windowing from somewhere on Stack Overflow. I hope it is fairly legible to most.
If you have any other suggestions on how to improve the code, please comment. But no, that's not what my model looks like at all. So suggestions for that part probably aren't helpful.
EDIT: Tensorflow version 2.1.0
import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Bidirectional, Masking, Input, Dense, GRU
import random
import numpy as np
random.seed(100)
# input data
feature = list(range(3, 14))
# shuffle data
random.shuffle(feature)
# make label from feature data, +1 because we are padding with zero
label = [feat // 5 +1 for feat in feature]
group = label[:]
# random.shuffle(group)
max_group = 2
batch_size = 2
print('Data:')
print(*zip(group, feature, label), sep='\n')
# make dataset from data arrays
ds = tf.data.Dataset.zip((tf.data.Dataset.from_tensor_slices({'group': group, 'feature': feature}),
tf.data.Dataset.from_tensor_slices({'label': label})))
# group by window
ds = ds.apply(tf.data.experimental.group_by_window(
# use feature as key (you may have to use tf.reshape(x['group'], []) instead of tf.cast)
key_func=lambda x, y: tf.cast(x['group'], tf.int64),
# convert each window to a batch
reduce_func=lambda _, window: window.batch(max_group),
# use batch size as window size
window_size=max_group))
# shuffle at most 100k rows, but commented out because we don't want to predict on shuffled data
# ds = ds.shuffle(int(1e5))
ds = ds.padded_batch(batch_size,
padded_shapes=({s: (None,) for s in ['group', 'feature']},
{s: (None,) for s in ['label']}))
# show dataset contents
print('Result:')
for element in ds:
    print(element)
# Keras matches the name in the input to the tensor names in the first part of ds
inp = Input(shape=(None,), name='feature')
# RNNs require an additional rank, even if it is a degenerate dimension
duck = tf.expand_dims(inp, axis=-1)
rnn = GRU(32, return_sequences=True)(duck)
# again Keras matches names
out = Dense(max(label)+1, activation='softmax', name='label')(rnn)
model = Model(inputs=inp, outputs=out)
model.summary()
model.compile(loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(ds, epochs=3)
model.predict_on_batch(ds)
You can iterate over the dataset, like so, remembering what is "x" and what is "y" in typical notation:
for item in ds:
    xi, yi = item
    pi = model.predict_on_batch(xi)
    print(xi["group"].shape, pi.shape)
Of course, this predicts on each dataset element (each padded batch) individually. Otherwise you'd have to define the batches yourself, by batching matching shapes together, as the batch size itself is allowed to be variable.
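If you want to keep all the predictions rather than just print their shapes, a small extension of the same idea (keeping them in a list, since the padded batches can have different sequence lengths) would be:

# collect per-batch predictions; shapes can differ between batches, so keep them in a list
predictions = [model.predict_on_batch(xi) for xi, _ in ds]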
I built an RNN using Keras, but when I want to use sequences with different numbers of time steps I get an error and can't get it to work.
Here is my example with dummy data:
from numpy import array
import numpy as np
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers import Dropout
from sklearn.preprocessing import MinMaxScaler
from keras import optimizers
X=array(
[
[#first sample
[0,2],[1,2],[2,2] # three time steps and 2 features
]
,
[# sample 2
[0,2],[1,2],[2,2] # three time steps and 2 features
]
,
[# sample 3
[7,2], [9,2], [4,2] # three time steps and 2 features
]
,
[# sample 4
[2,2], [5,2], [4,2],[7,9] # four steps and 2 features
]
]
)
Y=np.array([1,2,3,4])
model = Sequential()
model.add(LSTM(8, input_shape=(None, 2),return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(32,return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(128,return_sequences=False))
model.add(Dropout(0.2))
model.add(Dense(58, activation='softmax'))
optimize=optimizers.RMSprop(lr=0.001, rho=0.9, epsilon=None, decay=0.0)
model.compile(optimize,loss='sparse_categorical_crossentropy',metrics=['accuracy'])
model.summary()
model.fit(X,Y,batch_size=1,epochs=50,shuffle=True,verbose=2)
As you can see from the code, I have 4 sequences and 2 features in each sequence.
The last sequence has 4 time steps instead of 3, and that is the problem: if I change it to 3 time steps, the code works correctly.
But I want it to work with varying numbers of time steps. How can I achieve that without using padding or masking?
I have read different topics describing different solutions, but I can't get it to work in the above example.
When I try to run the above code I get this error:
ValueError: Error when checking input: expected lstm_1_input to have 3 dimensions, but got array with shape (4, 1)
Your X is not a valid array. A numpy array must be rectangular, not jagged, and Keras can only take valid numpy arrays as inputs. You have two choices:
Feed the samples into your model one at a time, i.e. use a batch_size of 1 and use train_on_batch or fit_generator rather than plain fit (see the sketch below). Note that this removes all vectorization-related speed optimizations and will slow your training down to a crawl if you have a lot of data.
Pad your training set so that all samples have the same time dimension. 0-padding shouldn't really affect the performance of your model. This is the recommended method.
See this thread for more details.
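If you go with the first option, here is a minimal sketch (an illustration, not the only way to write it) that reuses np and the model compiled above and keeps the samples in a plain Python list instead of one jagged array, so no padding or masking is needed:

# keep the sequences as a Python list; each one can have its own number of time steps
X_list = [
    [[0, 2], [1, 2], [2, 2]],          # 3 time steps, 2 features
    [[0, 2], [1, 2], [2, 2]],
    [[7, 2], [9, 2], [4, 2]],
    [[2, 2], [5, 2], [4, 2], [7, 9]],  # 4 time steps, 2 features
]
Y_list = [1, 2, 3, 4]

for epoch in range(50):
    for xi, yi in zip(X_list, Y_list):
        xb = np.array(xi, dtype=np.float32)[np.newaxis, ...]  # shape (1, timesteps, 2)
        yb = np.array([yi])                                    # shape (1,)
        model.train_on_batch(xb, yb)                           # one variable-length sample per update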