I'm trying to build an LSTM autoencoder with the goal of getting a fixed-size vector from a sequence that represents the sequence as well as possible. This autoencoder consists of two parts:
LSTM Encoder: Takes a sequence and returns an output vector (return_sequences = False)
LSTM Decoder: Takes an output vector and returns a sequence (return_sequences = True)
So, in the end, the encoder is a many-to-one LSTM and the decoder is a one-to-many LSTM.
[Figure: a many-to-one LSTM encoder feeding a one-to-many LSTM decoder. Image source: Andrej Karpathy]
On a high level, the code looks like this (similar to what is described here):
encoder = Model(...)
decoder = Model(...)
autoencoder = Model(encoder.inputs, decoder(encoder(encoder.inputs)))
autoencoder.compile(loss='binary_crossentropy',
                    optimizer='adam',
                    metrics=['accuracy'])
autoencoder.fit(data, data,
                batch_size=100,
                epochs=1500)
The shape (number of training examples, sequence length, input dimension) of the data array is (1200, 10, 5) and looks like this:
array([[[1, 0, 0, 0, 0],
        [0, 1, 0, 0, 0],
        [0, 0, 1, 0, 0],
        ...,
        [0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0]],
       ...])
Problem: I am not sure how to proceed, especially how to integrate the LSTM layers into the Model and how to get the decoder to generate a sequence from a vector.
I am using Keras with the TensorFlow backend.
EDIT: If someone wants to try it out, here is my procedure to generate random sequences with moving ones (including padding):
import random

def getNotSoRandomList(x):
    rlen = 8
    rlist = [0 for _ in range(rlen)]
    if x <= 7:
        rlist[x] = 1
    return rlist

sequence = [[getNotSoRandomList(x) for x in range(round(random.uniform(0, 10)))]
            for y in range(5000)]
### Padding afterwards
from keras.preprocessing import sequence as seq
data = seq.pad_sequences(
    sequences=sequence,
    padding='post',
    maxlen=None,
    truncating='post',
    value=0.
)
The models can be built any way you want. If I understood correctly, you just want to know how to create models with LSTMs?
Using LSTMs
Well, first you have to define what your encoded vector looks like. Suppose you want it to be an array of 20 elements, a one-dimensional vector. So, shape (None, 20). The size is up to you, and there is no clear rule for knowing the ideal one.
And your input must be three-dimensional, such as your (1200, 10, 5). In Keras summaries and error messages it will be shown as (None, 10, 5), where "None" represents the batch size, which can vary each time you train/predict.
There are many ways to do this, but, suppose you want only one LSTM layer:
from keras.layers import *
from keras.models import Model
inpE = Input((10, 5)) # here, you don't define the batch size
outE = LSTM(units=20, return_sequences=False)(inpE) # add optional parameters as needed
This is enough for a very very simple encoder resulting in an array with 20 elements (but you can stack more layers if you want). Let's create the model:
encoder = Model(inpE,outE)
Now, for the decoder, it gets less obvious. You don't have an actual sequence anymore, but a static, meaningful vector. You may still want to use LSTMs; they will treat the vector as a sequence.
But here, since the input has shape (None,20), you must first reshape it to some 3-dimensional array in order to attach an LSTM layer next.
The way you will reshape it is entirely up to you. 20 steps of 1 element? 1 step of 20 elements? 10 steps of 2 elements? Who knows?
inpD = Input((20,))
outD = Reshape((10,2))(inpD) #supposing 10 steps of 2 elements
It's important to notice that if you don't have 10 steps anymore, you won't be able to just enable "return_sequences" and get the output you want; you'll have to work a little. Actually, it's not necessary to use "return_sequences" or even to use LSTMs at all, but you may do so.
Since the reshape above has 10 timesteps (intentionally), it is fine to use "return_sequences", because the result will have 10 timesteps (like the initial input).
outD1 = LSTM(5, return_sequences=True)(outD) # add optional parameters as needed
# 5 units because we want a (None, 10, 5) output
You could work in many other ways, such as simply creating a 50-unit LSTM without returning sequences and then reshaping the result:
alternativeOut = LSTM(50, return_sequences=False)(outD)
alternativeOut = Reshape((10,5))(alternativeOut)
And our model goes:
decoder = Model(inpD,outD1)
alternativeDecoder = Model(inpD,alternativeOut)
After that, you unite the models as in your code and train the autoencoder.
All three models share the same weights, so you can get the encoder's results just by using its predict method.
encoderPredictions = encoder.predict(data)
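Putting the pieces together, a minimal sketch of uniting encoder and decoder might look like this (illustrative, assuming the (10, 5) input and the 10x2 reshape used above, and mirroring the Model-chaining pattern from the question):

from keras.layers import Input, LSTM, Reshape
from keras.models import Model

# encoder: (None, 10, 5) -> (None, 20)
inpE = Input((10, 5))
outE = LSTM(units=20, return_sequences=False)(inpE)
encoder = Model(inpE, outE)

# decoder: (None, 20) -> (None, 10, 5)
inpD = Input((20,))
outD = Reshape((10, 2))(inpD)   # 10 steps of 2 elements
outD1 = LSTM(5, return_sequences=True)(outD)
decoder = Model(inpD, outD1)

# autoencoder: chain them, then train on (data, data) as in the question
autoencoder = Model(inpE, decoder(encoder(inpE)))
autoencoder.compile(loss='binary_crossentropy', optimizer='adam')
# autoencoder.fit(data, data, batch_size=100, epochs=1500)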
What I often see with LSTMs for generating sequences is something like predicting the next element.
You take just a few elements of the sequence and try to predict the next element; then you take another segment shifted one step forward, and so on. This may be helpful for generating sequences.
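A minimal sketch of that windowing idea (purely illustrative; the helper name and window length are assumptions, not from the original answer):

import numpy as np

def make_windows(series, window=3):
    # build (window, next-element) training pairs from one sequence
    X, y = [], []
    for i in range(len(series) - window):
        X.append(series[i:i + window])
        y.append(series[i + window])
    return np.array(X), np.array(y)

X, y = make_windows(np.arange(10))
# X[0] = [0, 1, 2], y[0] = 3, and so on, one step forward each time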
You can find a simple example of a sequence-to-sequence autoencoder here: https://blog.keras.io/building-autoencoders-in-keras.html
Here is an example
Let's create synthetic data consisting of a few sequences. The idea is to look at these sequences through the lens of an autoencoder, in other words, lowering the dimension and summarizing them into a fixed length.
import numpy as np
import pandas as pd
from sklearn import preprocessing
from tensorflow.keras.preprocessing.sequence import pad_sequences

# define input sequence (ragged, hence dtype=object)
sequence = np.array([[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9],
                     [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8],
                     [0.2, 0.4, 0.6, 0.8],
                     [0.3, 0.6, 0.9, 1.2]], dtype=object)
# prepare to normalize: one column per sequence, NaN-padded
x = pd.DataFrame(sequence.tolist()).T.values
scaler = preprocessing.StandardScaler()
x_scaled = scaler.fit_transform(x)
sequence_normalized = [col[~np.isnan(col)] for col in x_scaled.T]
# make sure to use dtype='float32' in padding, otherwise the floating-point
# values are truncated to integers; pad the normalized sequences
sequence = pad_sequences(sequence_normalized, padding='post', dtype='float32')
# reshape input into [samples, timesteps, features]
n_obs = len(sequence)
n_in = 9
sequence = sequence.reshape((n_obs, n_in, 1))
Let's devise a simple LSTM autoencoder.
from tensorflow.keras.layers import Input, LSTM, RepeatVector, TimeDistributed, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.utils import plot_model

# define encoder
visible = Input(shape=(n_in, 1))
encoder = LSTM(2, activation='relu')(visible)
# define reconstruction decoder
decoder1 = RepeatVector(n_in)(encoder)
decoder1 = LSTM(100, activation='relu', return_sequences=True)(decoder1)
decoder1 = TimeDistributed(Dense(1))(decoder1)
# tie it together
myModel = Model(inputs=visible, outputs=decoder1)
# summarize layers
print(myModel.summary())
myModel.compile(optimizer='adam', loss='mse')
history = myModel.fit(sequence, sequence,
                      epochs=400,
                      verbose=0,
                      validation_split=0.1,
                      shuffle=True)
plot_model(myModel, show_shapes=True, to_file='reconstruct_lstm_autoencoder.png')
# demonstrate recreation
yhat = myModel.predict(sequence, verbose=0)
import matplotlib.pyplot as plt
#plot our loss
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model train vs validation loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc='upper right')
plt.show()
Let's extract the standalone encoder
# use our trained encoder layer (myModel.layers[1], the LSTM(2)) to encode the training input
encoder_layer = myModel.layers[1]
encoded_input = Input(shape=(9, 1))
encoder_standalone = Model(encoded_input, encoder_layer(encoded_input))
# we are interested in seeing what the encoded sequences of length 2
# (the dimension of the encoder) look like
out = encoder_standalone.predict(sequence)
f = plt.figure()
myx = out[:,0]
myy = out[:,1]
s = plt.scatter(myx, myy)
for i, txt in enumerate(out[:, 0]):
    plt.annotate(i + 1, (myx[i], myy[i]))
And here is the representation of the sequences: [scatter plot of the four sequences in the 2-D encoded space]
Related
I'm building an LSTM model with variable-length arrays as input. Many resources recommended padding, i.e. inserting 0s until all input arrays have the same length, and then applying Masking so the model ignores the 0s.
However, after many training runs, I feel that Masking does not work as expected; the padded 0s in the input still affect the model's predictions.
After concatenating all sequences into one array, my training data looks like below without padding:
X        y
[1 2 3]  4
[2 3 4]  5
[3 4 5]  6
...
My python implementation:
import numpy as np
import pandas as pd
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM, Masking
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.preprocessing.sequence import TimeseriesGenerator
""" Raw Training Input """
arr = np.array([
    [1, 2, 3, 4, 5, 6],
    [5, 6, 7],
    [11, 12, 13, 14]
], dtype=object)
timesteps = 3
n_features = 1
maxlen = 6
""" Padding """
padded_arr = pad_sequences(arr, maxlen=maxlen, padding='pre', truncating='pre')
""" Concatenate all sequences into one array """
sequence = np.concatenate(padded_arr)
sequence = sequence.reshape((len(sequence), n_features))
# print(sequence)
""" Training Data Generator """
generator = TimeseriesGenerator(sequence, sequence, length=timesteps, batch_size=1)
""" Check Generator """
for i in range(len(generator)):
    x, y = generator[i]
    print('%s => %s' % (x, y))
""" Build Model """
model = Sequential()
model.add(Masking(mask_value=0.0, input_shape=(timesteps, n_features))) # masking to ignore padded 0
model.add(LSTM(1024, activation='relu', input_shape=(timesteps, n_features), return_sequences=False))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
model.fit(generator, steps_per_epoch=1, epochs=1000, verbose=10)
""" Prediction """
x_input = np.array([2,3,4]).reshape((1, timesteps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat) # here I'm expecting 5 because the input is [2, 3, 4]
For the prediction, I input [2,3,4] and most of the time I keep getting values very far away from the expected value (= 5)
I'm wondering if I missed out on something or simply because the model architecture was not correctly tuned.
I want to understand why the model is not predicting correctly. Is the Masking the issue or is it something else?
The problem is that the batch size is 1 and there is also just one step per epoch. As a result, no meaningful gradient can be calculated. You have to put all the training data into one batch, and then you should get good results:
""" Training Data Generator """
generator = TimeseriesGenerator(sequence, sequence, length=timesteps,
batch_size=15)
(Alternatively, you could leave the batch size at 1 but change steps_per_epoch to len(generator). This seems to work with the Adam optimizer, but not with SGD, and it is much slower.)
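A minimal sketch of that alternative, reusing the model and data from the question:

""" Alternative: batch size 1, but one pass over the whole generator per epoch """
generator = TimeseriesGenerator(sequence, sequence, length=timesteps, batch_size=1)
model.fit(generator, steps_per_epoch=len(generator), epochs=1000, verbose=0)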
I have a dataset where the data are measurements that I took. I am able to load them with torch.load(). What I would like to do is normalize my data, since the values vary so much. I have a transform variable that should be able to normalize the data, which I found online, but I don't know where it goes or how to implement it. Can anyone help? Thanks in advance.
Basically the code is as follows:
allCellsData = torch.load("/Users/andresmoreno/Documents/allData_tensor.pt")
labels = torch.load('/Users/andresmoreno/Documents/labelFile.pt')
allCellsData = allCellsData.to(dtype=torch.float)
Transform variable
transform = transforms.Compose([
    # to tensor
    # transforms.ToTensor(),
    # normalize: from [0, 1] to [-1, 1]
    # parameters: [means], [stds]
    transforms.Normalize(
        (0.5, 0.5, 0.5),
        (0.5, 0.5, 0.5)
    )
])
my_dataset = TensorDataset(allCellsData, labels)
# batch size
batch_size = 64
# create train dataloader
trainLoader = torch.utils.data.DataLoader(
    my_dataset,
    batch_size=batch_size,
    shuffle=True,
    num_workers=2
)
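One possible approach, as a minimal sketch rather than a definitive answer: TensorDataset has no transform hook, so the tensor can be standardized once, before the dataset is built (the per-feature axis choice dim=0 is an assumption about the data layout):

import torch
from torch.utils.data import TensorDataset

# sketch: standardize the whole tensor up front, since TensorDataset
# applies no transforms; the per-feature axis (dim=0) is an assumption
mean = allCellsData.mean(dim=0, keepdim=True)
std = allCellsData.std(dim=0, keepdim=True)
allCellsDataNorm = (allCellsData - mean) / (std + 1e-8)  # guard against zero std

my_dataset = TensorDataset(allCellsDataNorm, labels)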
I have an input that is a time series of 5 dimensions:
a = [[8,3],[2], [4,5],[1], [9,1],[2], ...] # 100 timestamps in total. For each element, dims 0 and 1 are numerical data and dim 2 is a numerical encoding of a category. This is per sample; there are 3200 samples.
The category has 3 possible values (0,1,2)
I want to build a NN such that the last dimension (the category) will go through an embedding layer with output size 8, and then will be concatenated back to the first two dims (the numerical data).
So, this will be something like:
input1 = keras.layers.Input(shape=(2,)) #the numerical features
input2 = keras.layers.Input(shape=(1,)) #the encoding of the categories; this part will be embedded to 8 dims
x2 = Embedding(input_dim=1, output_dim = 8)(input2) #apply it to every timestamp and take only dim 2, i.e. [2], [1], [2]
x = concatenate([input1,x2]) #will get 10 dims at each timepoint, still 100 timepoints
x = LSTM(units=24)(x) #the input has 10 dims/features at each timepoint, total 100 timepoints per sample
x = Dense(1, activation='sigmoid')(x)
model = Model(inputs=[input1, input2], outputs=[x]) # input1 is a 1-D vec of width 2; input2 is a 1-D vec of width 1 that goes through the embedding
model.compile(
    loss='binary_crossentropy',
    optimizer='adam',
    metrics=['acc']
)
How can I do it (preferably in Keras)?
My problem is how to apply the embedding to every time point.
Meaning, if I have 1000 timepoints with 3 dims each, I need to convert them to 1000 timepoints with 8 dims each (the embedding layer should transform input2 from (1000x1) to (1000x8)).
There are a couple of issues you are having here.
First let me give you a working example and explain along the way how to solve your issues.
Imports and Data Generation
import tensorflow as tf
import numpy as np
from tensorflow.keras import layers
from tensorflow.keras.models import Model
num_timesteps = 100
max_features_values = [100, 100, 3]
num_observations = 2
input_list = [[[np.random.randint(0, v) for _ in range(num_timesteps)]
               for v in max_features_values]
              for _ in range(num_observations)]
input_arr = np.array(input_list)  # shape (2, 3, 100)
In order to use an embedding we need the vocabulary size as input_dim, as stated in the Embedding documentation.
Embedding and Concatenation
voc_size = len(np.unique(input_arr[:, 2, :])) + 1 # 4
Now we need to create the inputs. Inputs should be of size [None, 2, num_timesteps] and [None, 1, num_timesteps], where the first dimension is flexible and will be filled with the number of observations we pass in. Let's apply the embedding right after that, using the previously calculated voc_size.
inp1 = layers.Input(shape=(2, num_timesteps)) # TensorShape([None, 2, 100])
inp2 = layers.Input(shape=(1, num_timesteps)) # TensorShape([None, 1, 100])
x2 = layers.Embedding(input_dim=voc_size, output_dim=8)(inp2) # TensorShape([None, 1, 100, 8])
x2_reshaped = tf.transpose(tf.squeeze(x2, axis=1), [0, 2, 1]) # TensorShape([None, 8, 100])
The embedding output cannot simply be concatenated with inp1, since all dimensions must match except the one along the concatenation axis, and unfortunately the shapes do not match. Therefore we reshape x2: we squeeze out the singleton axis and then transpose.
Now we can concatenate without any issue and everything works in a straightforward fashion:
x = layers.concatenate([inp1, x2_reshaped], axis=1)
x = layers.LSTM(32)(x)
x = layers.Dense(1, activation='sigmoid')(x)
model = Model(inputs=[inp1, inp2], outputs=[x])
Check on Dummy Example
inp1_np = input_arr[:, :2, :]
inp2_np = input_arr[:, 2:, :]
model.predict([inp1_np, inp2_np])
# Output:
# array([[0.544262 ],
#        [0.6157502]], dtype=float32)
# This outputs values between 0 and 1, just as expected.
In case you are not looking for embeddings the way they are usually used in Keras (positive integers mapping to dense vectors), you might be looking for some sort of unprojection or basis expansion, in which 3 dimensions get mapped (embedded) to 8 and the result is concatenated. This can be done using the kernel trick or other methods, but it also happens implicitly in neural networks with non-linearities.
As such, you can do something like this, following a similar format to pythonic833's answer because it was a good one (but with timestamps in the middle dimension, per the Keras LSTM documentation, which asks for [batch, timesteps, feature]):
Input generation
import tensorflow as tf
import numpy as np
from tensorflow.keras import layers
from tensorflow.keras.models import Model
num_timesteps = 100
num_features = 5
num_observations = 2
input_list = [[[np.random.randint(1, 100) for _ in range(num_features)]
               for _ in range(num_timesteps)]
              for _ in range(num_observations)]
input_arr = np.array(input_list)  # shape (2, 100, 5)
Model construction
Then you can process the inputs:
input1 = layers.Input(shape=(num_timesteps, 2,))
input2 = layers.Input(shape=(num_timesteps, 3))
x2 = layers.Dense(8, activation='relu')(input2)
x = layers.concatenate([input1,x2], axis=2) # This produces tensors of shape (None, 100, 10)
x = layers.LSTM(units=24)(x)
x = layers.Dense(1, activation='sigmoid')(x)
model = Model(inputs=[input1, input2] , outputs=[x])
model.compile(
loss='binary_crossentropy',
optimizer='adam',
metrics=['acc']
)
Results
inp1_np = input_arr[:, :, :2]
inp2_np = input_arr[:, :, 2:]
model.predict([inp1_np, inp2_np])
which produces
array([[0.44117224],
[0.23611131]], dtype=float32)
Other explanations about basis expansion to check out:
https://stats.stackexchange.com/questions/527258/embedding-data-into-a-larger-dimension-space
https://www.reddit.com/r/MachineLearning/comments/2ffejw/why_dont_researchers_use_the_kernel_method_in/
I encountered many hardships when trying to fit a CNN (U-Net) to my tif training images in Python.
I have the following structure to my data:
X/
  0/
    [Images] (tif, 3-band, 128x128, values ∈ [0, 255])
X_val/
  0/
    [Images] (tif, 3-band, 128x128, values ∈ [0, 255])
y/
  0/
    [Images] (tif, 1-band, 128x128, values ∈ [0, 255])
y_val/
  0/
    [Images] (tif, 1-band, 128x128, values ∈ [0, 255])
Starting with this data, I defined ImageDataGenerators:
import tensorflow as tf
from tensorflow import keras as ks
from matplotlib import pyplot as plt
import numpy as np
bs = 10 # batch size
args_col = {"data_format": "channels_last",
            "brightness_range": [0.5, 1.5]}

args_aug = {"rotation_range": 365,
            "width_shift_range": 0.05,
            "height_shift_range": 0.05,
            "horizontal_flip": True,
            "vertical_flip": True,
            "fill_mode": "constant",
            "featurewise_std_normalization": False,
            "featurewise_center": False}

args_flow = {"color_mode": "rgb",
             "class_mode": "sparse",
             "batch_size": bs,
             "target_size": (128, 128),
             "seed": 42}
# train generator
X_generator = ks.preprocessing.image.ImageDataGenerator(rescale=1.0/255.0,
                                                        **args_aug,
                                                        **args_col)
X_gen = X_generator.flow_from_directory(directory="my/directory/X",
                                        **args_flow)
y_generator = ks.preprocessing.image.ImageDataGenerator(**args_aug,
                                                        cval=NoDataValue)  # NoDataValue: mask fill value, defined elsewhere
y_gen = y_generator.flow_from_directory(directory="my/directory/y",
                                        **args_flow, color_mode="grayscale")
train_generator = zip(X_gen, y_gen)
# val generator
X_val_generator = ks.preprocessing.image.ImageDataGenerator(rescale=1.0/255.0)
X_val_gen = X_val_generator.flow_from_directory(directory="my/directory/X_val",
                                                **args_flow)
y_val_generator = ks.preprocessing.image.ImageDataGenerator()
y_val_gen = y_val_generator.flow_from_directory(directory="my/directory/y_val",
                                                **args_flow, color_mode="grayscale")
val_generator = zip(X_val_gen, y_val_gen)
Using this generator, I can create pairs of training images and corresponding masks and visualize them like this:
X, y = next(train_generator)
X_test = X[0][0]
y_test = y[0][0]
plt.subplot(1, 2, 1)
plt.imshow(np.array(X_test))
plt.subplot(1, 2, 2)
plt.imshow(np.array(y_test))
Resulting in: [figure: a training image and its corresponding mask, plotted side by side]
However, I cannot train a U-Net, as I intended:
When I define a U-Net based on an example from the internet (or basically any other example of a U-Net I've found) as my model and then do the following:
model.compile(optimizer = "adam", loss = "sparse_categorical_crossentropy", metrics = ["accuracy"])
model.fit(train_generator, epochs = 5, steps_per_epoch = 10, validation_data = val_generator)
it will fail with the error:
ValueError: Layer model expects 1 input(s), but it received 2 input tensors. Inputs received: [<tf.Tensor 'IteratorGetNext:0' shape=(None, None, None, None) dtype=float32>, <tf.Tensor 'ExpandDims:0' shape=(None, 1) dtype=float32>]
I tried other loss functions and other class_mode arguments, but it always failed with some error related to the dimensions of the input data or the data passed between layers. Another example (when setting class_mode = None):
InvalidArgumentError: logits and labels must have the same first dimension, got logits shape [16384,1] and labels shape [49152]
I just started getting into CNNs and Python, so I have no clue what to try next or how to resolve these errors. I was pretty sure I used the correct loss function, which often seems to be the problem when similar errors occur (I have multiple classes, hence the "sparse_categorical_crossentropy").
Any ideas how to solve this and make the data fit the expected CNN input (or the other way round, depending on what the problem is)?
Note:
My ImageDataGenerator outputs a pair of images (X and y) with the following format (I noticed I had to set color_mode to "grayscale" for the masks (y)):
I used keras.layers.Input(shape = (128, 128, 3)) in the example U-Net, since the keras documentation states shape = "A shape tuple (integers), not including the batch size".
I found the answer to this particular problem. Amongst other issues, "class_mode" has to be set to None for this kind of model. With that set, the second array in both X and y is no longer produced by the ImageDataGenerator, so X and y are interpreted as the data and the mask (which is what we want) in the combined generator. Otherwise, X_val_gen already produces the tuple shown in the screenshot, where the second entry is interpreted as the class; that would make sense in a classification problem with images spread over various folders, each labeled with a class ID.
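A hedged sketch of what the corrected generator setup might look like, reusing the question's placeholder directories and generators and dropping the class labels via class_mode=None:

# with class_mode=None, flow_from_directory yields only image batches (no class
# array), so zipping the X and y generators produces (image, mask) pairs
args_flow = {"class_mode": None,
             "batch_size": bs,
             "target_size": (128, 128),
             "seed": 42}  # identical seed keeps images and masks aligned

X_gen = X_generator.flow_from_directory(directory="my/directory/X",
                                        color_mode="rgb", **args_flow)
y_gen = y_generator.flow_from_directory(directory="my/directory/y",
                                        color_mode="grayscale", **args_flow)
train_generator = zip(X_gen, y_gen)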
I have just started developing a simple classifier in TensorFlow, and I've started from this example on the TensorFlow site: https://www.tensorflow.org/tutorials/keras/basic_classification
Now I want my model to take images like these as features: [three example images of shapes]
These images should have, as corresponding labels, three arrays: [1,0], [3,0] and [1,3].
My problem is: how can I load these kinds of labels into the model (i.e. labels that are arrays rather than a single scalar)?
When I try it as in the example below, all I get is an error message, which I won't report here because it stems from my lack of knowledge about what I'm trying to do.
Additional question: what should the last layer look like? How many neurons should it have?
Here is the code:
import tensorflow as tf
from tensorflow import keras
import skimage
from skimage.color import rgb2gray
import csv
import numpy as np
names = ['Cerchio', 'Quadrato', 'Stella']
images = []
labels = [[]]
test_images = []
test_labels = [[]]
final_images = []
for i in range(1, 501):
    images.append(skimage.data.imread("{0}.bmp".format(i)))
for i in range(501, 601):
    test_images.append(skimage.data.imread("{0}.bmp".format(i)))
for i in range(601, 701):
    final_images.append(skimage.data.imread("{0}.bmp".format(i)))

file = open("labels.csv", "rU")
reader = csv.reader(file, delimiter=",")
for row in reader:
    for i in range(0, 499):
        if int(row[i]) < 10:
            labels.append([int(int(row[i]) / 10), 0])
        else:
            labels.append([int(int(row[i]) / 10), int(row[i]) % 10])
    for i in range(500, 600):
        if int(row[i]) < 10:
            test_labels.append([int(int(row[i]) / 10), 0])
        else:
            test_labels.append([int(int(row[i]) / 10), int(row[i]) % 10])
file.close()
images28 = np.array(images)
images28 = rgb2gray(images28)
test_images28 = np.array(test_images)
test_images28 = rgb2gray(test_images28)
final_images28 = np.array(final_images)
final_images28 = rgb2gray(final_images28)
labels = np.array(labels)
test_labels = np.array(test_labels)
print(labels)
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 56)),
    keras.layers.Dense(128, activation=tf.nn.relu),
    keras.layers.Dense(4, activation=tf.nn.softmax)
])
model.compile(optimizer=tf.train.AdamOptimizer(),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(images28, labels, epochs=5)
test_loss, test_acc = model.evaluate(test_images28, test_labels)
print('Test accuracy:', test_acc)
a = input()
img = final_images28[int(a)]
print(img.shape)
img = (np.expand_dims(img, 0))
print(img.shape)
predictions_single = model.predict(img)
print(predictions_single)
print(names[np.argmax(predictions_single)])
One way is to just map the array labels to an index, like [[0,0],[0,0],[0,0]] -> 0, [[1,0],[0,0],[0,0]] -> 1, ... etc. You'll have 3^6 = 729 possible labels. If the shapes in the images are standard, you can probably use the simplest classifier with no hidden layers, so it will have dim1 x dim2 x 729 trainable weights. If they are not standard, you will be better off using convolutional layers.
You could probably also use a fully convolutional model for this problem, returning a 3-dimensional tensor as output. In that case you could use multidimensional labels, but then you'd have to write a custom loss function for it.
After Googling around and toying with my program, I found the solution: a multi-hot encoded array.
In this array, if I have a position for a circle, a square, a star and the blank space (hence a 4-position array), I can feed my model labels that have a '1' in each corresponding position.
E.g. (referring to the example above):
[1, 0, 1, 0]
[1, 0, 0, 1]
[0, 0, 1, 1]
This did work perfectly.
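A minimal sketch of that idea (the helper, the exact value-to-position mapping, and the sigmoid/binary-crossentropy head are illustrative assumptions, since multi-hot targets need an independent probability per class rather than a single softmax):

import numpy as np
from tensorflow import keras

def to_multi_hot(label_pair, num_classes=4):
    # set a '1' at each class position present in the label;
    # the value-to-position mapping here is a hypothetical example
    vec = np.zeros(num_classes, dtype='float32')
    for c in label_pair:
        vec[c] = 1.0
    return vec

multi_hot_labels = np.array([to_multi_hot(p) for p in labels])

# hypothetical head for multi-hot targets: sigmoid units + binary crossentropy
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 56)),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(4, activation='sigmoid')
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# model.fit(images28, multi_hot_labels, epochs=5)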