Merging layers on Keras (dot product) - python

I've been following Towards Data Science's tutorial about word2vec and skip-gram models, but I stumbled upon a problem that I cannot solve, despite searching about it for hours and trying a lot of unsuccessful solutions.
https://towardsdatascience.com/understanding-feature-engineering-part-4-deep-learning-methods-for-text-data-96c44370bbfa
The step that shows how to build the skip-gram model architecture seems to be deprecated because it uses the Merge layer from keras.layers.
I've seen many discussions about it, and most answers say that you now need to use the Functional API of Keras to merge layers. The problem is that I'm a total beginner in Keras and have no idea how to translate my code from Sequential to Functional. Here's the code that the author used (and that I copied):
from keras.layers import Merge
from keras.layers.core import Dense, Reshape
from keras.layers.embeddings import Embedding
from keras.models import Sequential
# build skip-gram architecture
word_model = Sequential()
word_model.add(Embedding(vocab_size, embed_size,
                         embeddings_initializer="glorot_uniform",
                         input_length=1))
word_model.add(Reshape((embed_size, )))
context_model = Sequential()
context_model.add(Embedding(vocab_size, embed_size,
                            embeddings_initializer="glorot_uniform",
                            input_length=1))
context_model.add(Reshape((embed_size,)))
model = Sequential()
model.add(Merge([word_model, context_model], mode="dot"))
model.add(Dense(1, kernel_initializer="glorot_uniform", activation="sigmoid"))
model.compile(loss="mean_squared_error", optimizer="rmsprop")
# view model summary
print(model.summary())
# visualize model structure
from IPython.display import SVG
from keras.utils.vis_utils import model_to_dot
SVG(model_to_dot(model, show_shapes=True, show_layer_names=False,
                 rankdir='TB').create(prog='dot', format='svg'))
And when I run the block, the following error is shown:
ImportError Traceback (most recent call last)
<ipython-input-79-80d604373468> in <module>()
----> 1 from keras.layers import Merge
2 from keras.layers.core import Dense, Reshape
3 from keras.layers.embeddings import Embedding
4 from keras.models import Sequential
5
ImportError: cannot import name 'Merge'
What I'm asking here is some guidance on how to transform this Sequential into a Functional API structure.

This did indeed change. For a dot product, you can now use the dot layer:
from keras.layers import dot
...
dot_product = dot([target, context], axes=1, normalize=False)
...
You have to set the axes parameter according to your data, of course. If you set normalize=True, this gives the cosine proximity. For more information, see the documentation.
To learn about the functional API, there is a good guide in the Keras documentation. It's not difficult to switch if you already understand the sequential API.
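As a rough illustration of what the dot layer computes for two batches of vectors with axes=1, here is a NumPy sketch. The helper name batch_dot and the sample arrays are made up for this example and are not part of Keras; the sketch assumes both inputs have shape (batch, dim).

```python
import numpy as np

def batch_dot(a, b, normalize=False):
    """Row-wise dot product of two (batch, dim) arrays, returning shape (batch, 1)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if normalize:
        # L2-normalize each row first, so the result is a cosine similarity.
        a = a / np.linalg.norm(a, axis=1, keepdims=True)
        b = b / np.linalg.norm(b, axis=1, keepdims=True)
    # Multiply element-wise and sum over the feature axis (axis 1).
    return np.sum(a * b, axis=1, keepdims=True)

# Hypothetical target and context vectors (batch of 2, dimension 3).
target = np.array([[1.0, 2.0, 2.0], [0.0, 3.0, 4.0]])
context = np.array([[2.0, 4.0, 4.0], [3.0, 0.0, 0.0]])

print(batch_dot(target, context))                  # [[18.], [0.]]
print(batch_dot(target, context, normalize=True))  # [[1.], [0.]]
```

The first pair of rows points in the same direction, so the normalized result is 1; the second pair is orthogonal, so both results are 0.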

Merge seems to be deprecated, so instead of Merge, use dot directly on the embedding outputs (and not on models). Use the code below.
from keras.layers import Input
from keras.models import Model
from keras.layers.embeddings import Embedding
from keras.layers.core import Dense, Reshape
from keras.layers import dot
input_target = Input((1,))
input_context = Input((1,))
embedding = Embedding(vocab_size, embed_size, input_length=1, name='embedding')
word_embedding = embedding(input_target)
word_embedding = Reshape((embed_size, 1))(word_embedding)
context_embedding = embedding(input_context)
context_embedding = Reshape((embed_size, 1))(context_embedding)
# now perform the dot product operation
dot_product = dot([word_embedding, context_embedding], axes=1)
dot_product = Reshape((1,))(dot_product)
# add the sigmoid output layer
output = Dense(1, activation='sigmoid')(dot_product)
# Keras 2 expects the plural keyword arguments `inputs` and `outputs`
model = Model(inputs=[input_target, input_context], outputs=output)
model.compile(loss='mean_squared_error', optimizer='rmsprop')
# view model summary
print(model.summary())
# visualize model structure
from IPython.display import SVG
from keras.utils.vis_utils import model_to_dot
SVG(model_to_dot(model, show_shapes=True, show_layer_names=False,
                 rankdir='TB').create(prog='dot', format='svg'))

Related

Get values of KerasTensor

Is there a way to get the values of a Keras Tensor as a numpy array?
A normal K.eval() does not work and results in:
AttributeError: 'KerasTensor' object has no attribute 'numpy'
I am using eager execution.
I need this to access and check the outputs of some layers of my sequential model. Example: (out2 = K.eval(cnn.layers[2].output))
Minimal example:
import numpy
import tensorflow.keras.models as models
import tensorflow.keras.layers as layers
from keras import backend as K
cnn = models.Sequential([
    layers.Conv2D(filters=64, kernel_size=3, activation='relu',
                  kernel_initializer='he_uniform', padding='same',
                  input_shape=(32,32,3))],
    name='cnn')
res = cnn(numpy.random.random((1,32,32,3)))
print(K.eval(cnn.layers[0].output))
Do not mix keras with tensorflow.keras. For example, this import is the problem:
from keras import backend as K  # wrong: do not import anything from keras and mix it with tensorflow.keras
To get intermediate outputs from a model, build a second model that ends at the layer you want:
import numpy as np
import tensorflow as tf
# for example, to get the intermediate outputs from model.layers[2]
temp_model = tf.keras.Model(model.input, model.layers[2].output)
print(temp_model(np.random.rand(1, 32, 32, 3)).numpy())
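For reference, here is a self-contained sketch of the same idea that uses only tensorflow.keras. The layer choices and the name feature_model are assumptions for illustration (the question's model has a single Conv2D layer, so a pooling layer is added here just to have something after it), and the sketch assumes a TensorFlow 2.x installation.

```python
import numpy as np
import tensorflow as tf

# A small stand-in for the question's model, built only from tensorflow.keras.
cnn = tf.keras.Sequential([
    tf.keras.Input(shape=(32, 32, 3)),
    tf.keras.layers.Conv2D(64, 3, activation="relu", padding="same"),
    tf.keras.layers.MaxPooling2D(),
], name="cnn")

# Second model that shares the layers but stops at the first Conv2D output.
feature_model = tf.keras.Model(inputs=cnn.inputs, outputs=cnn.layers[0].output)

batch = np.random.rand(1, 32, 32, 3).astype("float32")
features = feature_model(batch).numpy()  # concrete values as a NumPy array
print(features.shape)  # (1, 32, 32, 64)
```

Calling the sub-model on real data returns an eager tensor, which is why .numpy() works here while it fails on the symbolic layer.output.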

Simple CNN model four image inputs to detect two classes

I am working on a project where I must use four different images as inputs. I run the inputs through a simple model and detect two classes. I am really struggling with the actual setup of the model.
I'm not sure if I am on the right track. I haven't been able to run the code since I am still unsure of the architecture of the model. The code below is my model as it is currently set up. I already have all the images: I took one image and split it into four. Then, using the four images, I want to detect one of two classes. If this doesn't make sense or I am going in the wrong direction with this, please help.
# import the necessary packages
from keras.models import Sequential
from keras.layers.normalization import BatchNormalization
from keras.layers.convolutional import Conv2D
from keras.layers.convolutional import MaxPooling2D
from keras.layers.core import Activation
from keras.layers.core import Dropout
from keras.layers.core import Dense
from keras.layers import Flatten
from keras.layers import Input
from keras.models import Model
from keras.layers import Dense ,LSTM,concatenate,Input,Flatten
from keras.models import Sequential
from keras.preprocessing import image
from keras.layers import Dense, InputLayer, Conv2D, MaxPool2D, Flatten
import keras
# define four sets of inputs
inputA = Input(shape=(200, 200, 3))
inputB = Input(shape=(200, 200, 3))
inputC = Input(shape=(200, 200, 3))
inputD = Input(shape=(200, 200, 3))
# merge all input images
merged = keras.layers.Concatenate(axis=1)([inputA, inputB, inputC, inputD])
# the first branch operates on the first input through the fourth input
dense1 = keras.layers.Conv2D(16, (2, 2), activation='relu')(merged)
output = keras.layers.Conv2D(16, (2, 2), activation='relu')(dense1)
# apply a FC layer and then a regression prediction on the
# combined outputs
z = keras.layers.Dense(128, activation="relu")(output)
z = keras.layers.Dense(9, activation="softmax")(z)
# then output a single value then our model will accept the inputs of the two branches and
# model = Model(inputs=[tt.output, y.output, t.output, w.output], outputs=z)
model.compile(loss='categorical_crossentropy', optimizer="adam", metrics=['accuracy'])
model.summary()
I think the model is set up correctly to run using four images as inputs, but now I need to set up the classes. My idea is to set up a CNN that looks at four different pictures of the same tree and uses those four images to detect the type of tree. I want all four images to be for one tree. After detecting the tree, I move on to the next four images for a different tree.
Thanks everyone I am very grateful for all your inputs and help.

What does concatenate layers do in Keras multitask?

I am implementing a simple multitask model in Keras. I used the code given in the documentation under the heading of shared layers.
I know that in multitask learning, we share some of the initial layers in our model and the final layers are made individual to the specific tasks as per the link.
I have following two cases in keras API where in the first, I am using keras.layers.concatenate while in the other, I am not using any keras.layers.concatenate.
I am posting the codes as well as the models for each case as follows.
Case-1 code
import keras
from keras.layers import Input, LSTM, Dense
from keras.models import Model
from keras.models import Sequential
from keras.layers import Dense
from keras.utils.vis_utils import plot_model
tweet_a = Input(shape=(280, 256))
tweet_b = Input(shape=(280, 256))
# This layer can take as input a matrix
# and will return a vector of size 64
shared_lstm = LSTM(64)
# When we reuse the same layer instance
# multiple times, the weights of the layer
# are also being reused
# (it is effectively *the same* layer)
encoded_a = shared_lstm(tweet_a)
encoded_b = shared_lstm(tweet_b)
# We can then concatenate the two vectors:
merged_vector = keras.layers.concatenate([encoded_a, encoded_b], axis=-1)
# And add a logistic regression on top
predictions1 = Dense(1, activation='sigmoid')(merged_vector)
predictions2 = Dense(1, activation='sigmoid')(merged_vector)
# We define a trainable model linking the
# tweet inputs to the predictions
model = Model(inputs=[tweet_a, tweet_b], outputs=[predictions1, predictions2])
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])
Case-1 Model
Case-2 code
import keras
from keras.layers import Input, LSTM, Dense
from keras.models import Model
from keras.models import Sequential
from keras.layers import Dense
from keras.utils.vis_utils import plot_model
tweet_a = Input(shape=(280, 256))
tweet_b = Input(shape=(280, 256))
# This layer can take as input a matrix
# and will return a vector of size 64
shared_lstm = LSTM(64)
# When we reuse the same layer instance
# multiple times, the weights of the layer
# are also being reused
# (it is effectively *the same* layer)
encoded_a = shared_lstm(tweet_a)
encoded_b = shared_lstm(tweet_b)
# And add a logistic regression on top
predictions1 = Dense(1, activation='sigmoid')(encoded_a )
predictions2 = Dense(1, activation='sigmoid')(encoded_b)
# We define a trainable model linking the
# tweet inputs to the predictions
model = Model(inputs=[tweet_a, tweet_b], outputs=[predictions1, predictions2])
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])
Case-2 Model
In both cases, only the LSTM layer is shared. In case-1, we have keras.layers.concatenate, but in case-2, we don't.
My question is: which one is multitasking, case-1 or case-2? Moreover, what is the function of keras.layers.concatenate in case-1?
Both are multi-task models, as this only depends on whether there are multiple outputs with one task associated with each output.
The difference is that your first model explicitly concatenates features produced by the shared layer, so both output tasks can consider information from both inputs. The second model only has connections from one input directly to one of the outputs, without considering the other input. The only link between the two branches here is that they share the LSTM weights.
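The difference can be checked numerically with a small NumPy sketch. This is not the Keras model from the question: the shared LSTM is replaced by a single shared weight matrix with a tanh, and all weights and inputs are arbitrary fixed values chosen only for illustration.

```python
import numpy as np

# Stand-in for the shared LSTM: one weight matrix reused for both inputs.
shared_w = np.linspace(-0.5, 0.5, 32).reshape(8, 4)

def encode(x):
    return np.tanh(x @ shared_w)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

head_concat_w = np.full((8, 1), 0.5)  # case 1: the head sees 4 + 4 features
head_single_w = np.full((4, 1), 0.5)  # case 2: the head sees 4 features

def case1_pred1(a, b):
    # The concatenation feeds features from both inputs into the head.
    merged = np.concatenate([encode(a), encode(b)], axis=-1)
    return sigmoid(merged @ head_concat_w)

def case2_pred1(a, b):
    # The head is wired to encode(a) only; b never reaches it.
    return sigmoid(encode(a) @ head_single_w)

a = np.ones((1, 8))
b = np.zeros((1, 8))
b_changed = (np.arange(8) / 8.0).reshape(1, 8)

print(case1_pred1(a, b), case1_pred1(a, b_changed))  # the two values differ
print(case2_pred1(a, b), case2_pred1(a, b_changed))  # the two values are identical
```

Changing the second input moves the first prediction only in the concatenated variant, which matches the description above.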

Keras - Translation from Sequential to Functional API

I've been following Towards Data Science's tutorial about word2vec and skip-gram models, but I stumbled upon a problem that I cannot solve, despite searching about it a lot and trying multiple unsuccessful solutions.
https://towardsdatascience.com/understanding-feature-engineering-part-4-deep-learning-methods-for-text-data-96c44370bbfa
The step that shows how to build the skip-gram model architecture seems to be deprecated because it uses the Merge layer from keras.layers.
What I tried to do was translate his piece of code - which is implemented in the Sequential API of Keras - to the Functional API to solve the deprecation of the Merge layer, by replacing it with the keras.layers.Dot layer. However, I'm still stuck in this step of merging the two models (word and context) into the final model, whose architecture must be like this:
Here's the code that the author used:
from keras.layers import Merge
from keras.layers.core import Dense, Reshape
from keras.layers.embeddings import Embedding
from keras.models import Sequential
# build skip-gram architecture
word_model = Sequential()
word_model.add(Embedding(vocab_size, embed_size,
                         embeddings_initializer="glorot_uniform",
                         input_length=1))
word_model.add(Reshape((embed_size, )))
context_model = Sequential()
context_model.add(Embedding(vocab_size, embed_size,
                            embeddings_initializer="glorot_uniform",
                            input_length=1))
context_model.add(Reshape((embed_size,)))
model = Sequential()
model.add(Merge([word_model, context_model], mode="dot"))
model.add(Dense(1, kernel_initializer="glorot_uniform", activation="sigmoid"))
model.compile(loss="mean_squared_error", optimizer="rmsprop")
And here is my attempt to translate the Sequential code implementation into the Functional one:
from keras import models
from keras import layers
from keras import Input, Model
word_input = Input(shape=(1,))
word_x = layers.Embedding(vocab_size, embed_size, embeddings_initializer='glorot_uniform')(word_input)
word_reshape = layers.Reshape((embed_size,))(word_x)
word_model = Model(word_input, word_reshape)
context_input = Input(shape=(1,))
context_x = layers.Embedding(vocab_size, embed_size, embeddings_initializer='glorot_uniform')(context_input)
context_reshape = layers.Reshape((embed_size,))(context_x)
context_model = Model(context_input, context_reshape)
model_input = layers.dot([word_model, context_model], axes=1, normalize=False)
model_output = layers.Dense(1, kernel_initializer='glorot_uniform', activation='sigmoid')
model = Model(model_input, model_output)
However, when executed, the following error is returned:
ValueError: Layer dot_5 was called with an input that isn't a symbolic
tensor. Received type: . Full
input: [,
]. All inputs to
the layer should be tensors.
I'm a total beginner with the Functional API of Keras. I would be grateful for some guidance on how to feed the context and word models into the dot layer to achieve the architecture in the image.
You are passing Model instances to the layer. However, as the error suggests, you need to pass Keras tensors (i.e. outputs of layers or models) to layers in Keras. You have two options here. One is to use the .output attribute of the Model instance like this:
dot_output = layers.dot([word_model.output, context_model.output], axes=1, normalize=False)
or equivalently, you can use the output tensors directly:
dot_output = layers.dot([word_reshape, context_reshape], axes=1, normalize=False)
Further, you need to apply the Dense layer to dot_output, and pass the Input layer instances as the inputs of Model. Therefore:
model_output = layers.Dense(1, kernel_initializer='glorot_uniform',
                            activation='sigmoid')(dot_output)
model = Model([word_input, context_input], model_output)
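Putting the pieces of this answer together, a complete version might look like the sketch below. It assumes TensorFlow 2.x with tensorflow.keras; the values of vocab_size and embed_size and the sample index arrays are placeholders invented for the example, and layers.Dot(axes=1) is used as the class form of layers.dot.

```python
import numpy as np
from tensorflow.keras import Input, Model, layers

vocab_size, embed_size = 50, 8  # placeholder sizes for illustration

# Word branch: index -> embedding -> flat vector of length embed_size.
word_input = Input(shape=(1,))
word_x = layers.Embedding(vocab_size, embed_size,
                          embeddings_initializer="glorot_uniform")(word_input)
word_reshape = layers.Reshape((embed_size,))(word_x)

# Context branch with its own, separate embedding.
context_input = Input(shape=(1,))
context_x = layers.Embedding(vocab_size, embed_size,
                             embeddings_initializer="glorot_uniform")(context_input)
context_reshape = layers.Reshape((embed_size,))(context_x)

# Merge the two output tensors (not the models) with a dot product.
dot_output = layers.Dot(axes=1)([word_reshape, context_reshape])
model_output = layers.Dense(1, kernel_initializer="glorot_uniform",
                            activation="sigmoid")(dot_output)

model = Model([word_input, context_input], model_output)
model.compile(loss="mean_squared_error", optimizer="rmsprop")

# Hypothetical (word, context) index pairs to check the wiring.
words = np.array([[1], [2], [3]])
contexts = np.array([[4], [5], [6]])
scores = model.predict([words, contexts], verbose=0)
print(scores.shape)  # (3, 1)
```

Each row of scores is a sigmoid output between 0 and 1 for one (word, context) pair.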

How to use multi threading in keras/tensorflow when fitting a model?

I have a CPU with 20 cores and I am trying to use all the cores to fit a model. I set a tf session with intra_op_parallelism_threads=20 and called model.fit within the same tf session.
The python process utilizes 2000% CPU (as stated by top). However, when comparing the following code with single core configuration (intra_op_parallelism_threads=1) I get the same learning rate.
from keras.layers import Dense, Activation, Dropout
from keras.layers import Input, Conv1D
import numpy as np
from keras.layers.merge import concatenate
from keras.models import Model
import tensorflow as tf
from keras.backend import tensorflow_backend as K
with tf.Session(config=tf.ConfigProto(intra_op_parallelism_threads=20)) as sess:
    K.set_session(sess)
    size=20
    batch_size=16
    def xor_data_generator():
        while True:
            data1 = np.random.choice([0, 1], size=(batch_size, size,size))
            data2 = np.random.choice([0, 1], size=(batch_size, size,size))
            labels = np.bitwise_xor(data1, data2)
            yield ([data1, data2], np.array(labels))
    a = Input(shape=(size,size))
    b = Input(shape=(size,size))
    merged = concatenate([a, b])
    hidden = Dense(2*size)(merged)
    conv1 = Conv1D(filters=size*16, kernel_size=1, activation='relu')(hidden)
    hidden = Dropout(0.1)(conv1)
    outputs = Dense(size, activation='sigmoid')(hidden)
    model = Model(inputs=[a, b], outputs=outputs)
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    model.fit_generator(xor_data_generator(), steps_per_epoch=batch_size, epochs=10000)
Note that I can't use multi_gpu_model, because I have a system with only 20 CPU cores.
How can I distribute model.fit_generator(xor_data_generator(), steps_per_epoch=batch_size, epochs=10000) on different cores simultaneously?
Have a look at Keras' Sequence object to write your custom generator. It is the underlying object that ImageDataGenerator uses to yield image data. The docs contain boilerplate code that you can adapt. If you use it, you can set the use_multiprocessing argument of fit_generator() to True. See also this answer.
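A minimal sketch of such a Sequence for the XOR data in the question could look like this. The class name XorSequence and its constructor arguments are made up for the example, and the inputs are returned as a tuple rather than the list used in the question. Where the multiprocessing options are passed depends on the Keras version (older releases take them in fit_generator() or fit(), while newer ones appear to expect them in the Sequence constructor), so check the documentation for the version you have installed.

```python
import numpy as np
import tensorflow as tf

class XorSequence(tf.keras.utils.Sequence):
    """Serves XOR batches by index, mirroring xor_data_generator from the question."""

    def __init__(self, steps, batch_size=16, size=20, **kwargs):
        super().__init__(**kwargs)
        self.steps = steps
        self.batch_size = batch_size
        self.size = size

    def __len__(self):
        # Number of batches per epoch.
        return self.steps

    def __getitem__(self, index):
        # Each call builds one independent batch, so workers can run in parallel.
        shape = (self.batch_size, self.size, self.size)
        data1 = np.random.choice([0, 1], size=shape)
        data2 = np.random.choice([0, 1], size=shape)
        labels = np.bitwise_xor(data1, data2)
        return (data1, data2), labels

sequence = XorSequence(steps=16)
(x1, x2), y = sequence[0]
print(len(sequence), x1.shape, y.shape)  # 16 (16, 20, 20) (16, 20, 20)
```

Because batches are addressed by index instead of pulled from a single Python generator, Keras can hand different indices to different worker processes.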
