Can I concatenate different shape tensors with keras? - python

I'm working on a machine learning project and I need to merge (concatenate) two tensors that have different shapes.
For more details:
We're trying to concatenate a matrix of tokens with a one-hot matrix. The tokens pass through an embedding layer, so we get a weights matrix with a shape like (100, 10, 300).
Finally we need to merge the one-hot matrix and the weights matrix like this:
(100, 300) and (100, 10, 300) to become (100, 11, 300)
That is, we want to prepend each of the 100 one-hot vectors to the corresponding weight matrix, like (1, 300) + (1, 10, 300) giving a merged sample with shape (1, 11, 300).
I actually achieved this manually with a loop, but it takes too much time, so I wanted to know whether this is possible with Keras or something similar.
This is the function I wrote; it does what I want, but if it can be done in a way that doesn't take so much time, that would be ideal.
import numpy as np
from tqdm import tqdm
from tensorflow.keras.preprocessing.sequence import pad_sequences

def join_demo_sentence(X, Demo, embedding, max_length):
    X = pad_sequences(X, maxlen=max_length, padding='post')
    Demo = pad_sequences(Demo, maxlen=300, padding='post')
    joined = []
    for i, sequence in tqdm(enumerate(X), desc='Joining'):
        demo = Demo[i]
        # Look up the embedding vectors for this sequence of token ids.
        sequence = embedding.get_weights()[0][sequence]
        # Insert the demo one-hot vector as the first row of the sequence.
        join = np.insert(sequence.T, 0, demo, axis=1)
        joined.append(join.T)
    X = np.asarray(joined)
    return X
That function loops through the matrix to join the demo one-hot values with the sentence tokens, so in the final result each sentence has the demographic one-hot vector in the first position.
I'm learning Keras, so I think there should be a way to do this with keras.layers.Concatenate.

Is this what you need?
import tensorflow as tf

a = tf.random.uniform((100, 10, 300))
b = tf.random.uniform((100, 300))
b = b[:, tf.newaxis, :]      # add a second axis: (100, 1, 300)
res = tf.concat((a, b), -2)  # (100, 11, 300); use tf.concat((b, a), -2) to put the one-hot vector first
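Since the question mentions keras.layers.Concatenate, here is a minimal sketch of the same join expressed with Keras layers (the shapes are taken from the question; the input names are illustrative only):
import tensorflow as tf

tokens_in = tf.keras.Input(shape=(10, 300))   # embedded tokens, as in the question
demo_in = tf.keras.Input(shape=(300,))        # one-hot demographic vector
demo_3d = tf.keras.layers.Reshape((1, 300))(demo_in)
merged = tf.keras.layers.Concatenate(axis=1)([demo_3d, tokens_in])  # (None, 11, 300)
model = tf.keras.Model([tokens_in, demo_in], merged)
In a full model, tokens_in would normally be the output of the Embedding layer rather than a separate input, so the join happens inside the graph instead of in a Python loop.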

Related

Group elements in ndarray by index

I have an image dataset of 1000 images, for which I have created embeddings. Each image has 512 embeddings of 256 dimensions each, so the per-image ndarray has shape (512, 256) and the full array has shape (1000, 512, 256).
Now, for each of the 512 embedding positions, I want to create a group of observations by collecting that embedding from every image: first the group of all first embeddings, then all second embeddings, and so on up to the 512th.
How would I go about creating these groups?
You can achieve that as follows:
import numpy as np

groups = []
for i in range(512):
    # Select the i-th embedding from each image: shape (1000, 256)
    group = embeddings[:, i, :]
    groups.append(group)
groups = np.array(groups)  # shape (512, 1000, 256)
A vectorized alternative that avoids the Python loop entirely is a single transpose, which produces the same (512, 1000, 256) array of groups:
groups = np.transpose(embeddings, (1, 0, 2))
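As a quick sanity check (with hypothetical random data), the single transpose produces exactly the groups the loop builds:
import numpy as np

embeddings = np.random.rand(1000, 512, 256)            # hypothetical data
groups = np.transpose(embeddings, (1, 0, 2))           # (512, 1000, 256); a view, no copy
assert np.array_equal(groups[0], embeddings[:, 0, :])  # group 0 = first embedding of every image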

Stacking tensors

Imagine that you have a batch of two sequences of embeddings, each embedding of dimension 4, where the two sequences have different lengths. You want to compute a function of the two sequences for each pair in the batch. To this end, you write something like:
import tensorflow as tf
# suppose the batch size is 1 for simplicity
a = tf.random.normal((1, 100, 4))
b = tf.random.normal((1, 57, 4))
batch = tf.ragged.stack((a, b), axis=1)
This works okay, yielding a tensor with batch.shape == TensorShape([1, None, None, 4]). Now, why does the following raise an IndexError?
batch[0][0]

Is there a way that feeding a input to layers parallelly in PyTorch?

I'm implementing ELMo with PyTorch. I want to feed word matrices through ELMo's CNNs, which have different filter sizes, and I'm looking for an efficient way to do it. Here's my code:
# fill the empty tensor iteratively
batch_size = word.size(0)
y = torch.zeros(batch_size, self.kernel_dim)
cnt = 0
for kernel in self.kernels:
    temp = kernel(word)
    pooled = torch.max(temp, dim=2)[0]
    y[:, cnt:cnt + pooled.size(1)] = pooled
    cnt += pooled.size(1)
# Using torch.cat
y = []
for kernel in self.kernels:
    temp = kernel(word)
    y.append(torch.max(temp, dim=2)[0])  # max pooling
y = torch.cat(y, dim=1)
I have two questions. First: is there a way to feed the input to the layers in parallel? That would let me avoid the for loop and make the code more efficient. Second: which is faster, using torch.cat or filling a preallocated tensor?
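For the second question, one rough way to compare the two variants is simply to time them; the Conv1d sizes and input shape below are hypothetical, since the question does not give them:
import time
import torch
import torch.nn as nn

# Hypothetical setup mirroring the question: a few Conv1d "kernels"
# with different output sizes, applied to a (batch, channels, width) input.
word = torch.randn(32, 16, 50)
kernels = nn.ModuleList([nn.Conv1d(16, c, k) for c, k in [(32, 2), (64, 3), (128, 4)]])
kernel_dim = sum(k.out_channels for k in kernels)

def fill_preallocated():
    y = torch.zeros(word.size(0), kernel_dim)
    cnt = 0
    for kernel in kernels:
        pooled = torch.max(kernel(word), dim=2)[0]
        y[:, cnt:cnt + pooled.size(1)] = pooled
        cnt += pooled.size(1)
    return y

def with_cat():
    return torch.cat([torch.max(k(word), dim=2)[0] for k in kernels], dim=1)

for fn in (fill_preallocated, with_cat):
    start = time.perf_counter()
    for _ in range(100):
        fn()
    print(fn.__name__, round(time.perf_counter() - start, 4))
The convolutions usually dominate the runtime, so it is worth measuring with your own shapes before optimizing the concatenation step.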

Keras multi-output data reshape for LSTM model

I have a Keras LSTM model that contains multiple outputs.
The model is defined as follows:
outputs = []
main_input = Input(shape=(seq_length, feature_cnt), name='main_input')
lstm = LSTM(32, return_sequences=True)(main_input)
for _ in range(output_branches):  # output_branches is the number of output branches of the model
    prediction = LSTM(8, return_sequences=False)(lstm)
    out = Dense(1)(prediction)
    outputs.append(out)
model = Model(inputs=main_input, outputs=outputs)
model.compile(optimizer='rmsprop', loss='mse')
I have a problem when reshaping the output data.
The code for reshaping the output data is:
y=y.reshape((len(y),output_branches,1))
I got the following error:
ValueError: Error when checking model target: the list of Numpy arrays
that you are passing to your model is not the size the model expected.
Expected to see 5 array(s), but instead got the following list of 1
arrays: [array([[[0.29670931],
[0.16652206],
[0.25114482],
[0.36952324],
[0.09429612]],
[[0.16652206],
[0.25114482],
[0.36952324],
[0.09429612],...
How can I correctly reshape the output data?
It depends on how y is structured initially. Here I assume that y contains one single-valued label per output branch for each sequence in the batch.
When there are multiple inputs/outputs, model.fit() expects a corresponding list of input/output arrays. In the following fully reproducible example, np.split(y, output_branches, axis=-1) does exactly this: it splits the single output array into a list of separate outputs, one (batch_size, 1) array per branch:
import tensorflow as tf
import numpy as np
tf.enable_eager_execution()
batch_size = 100
seq_length = 10
feature_cnt = 5
output_branches = 3
# Say we've got:
# - 100-element batch
# - of 10-element sequences
# - where each element of a sequence is a vector describing 5 features.
X = np.random.random_sample([batch_size, seq_length, feature_cnt])
# Every sequence of a batch is labelled with `output_branches` labels.
y = np.random.random_sample([batch_size, output_branches])
# Here y.shape == (100, 3)
# Here we split the last axis of y (output_branches) into `output_branches` separate lists.
y = np.split(y, output_branches, axis=-1)
# Here y is not a numpy matrix anymore, but a list of matrices.
# E.g. y[0].shape == (100, 1); y[1].shape == (100, 1), etc.
outputs = []
main_input = tf.keras.layers.Input(shape=(seq_length, feature_cnt), name='main_input')
lstm = tf.keras.layers.LSTM(32, return_sequences=True)(main_input)
for _ in range(output_branches):
    prediction = tf.keras.layers.LSTM(8, return_sequences=False)(lstm)
    out = tf.keras.layers.Dense(1)(prediction)
    outputs.append(out)
model = tf.keras.models.Model(inputs=main_input, outputs=outputs)
model.compile(optimizer='rmsprop', loss='mse')
model.fit(X, y)
You might need to play around with the axes, as you didn't specify exactly how your data look.
EDIT:
As the author is looking for an answer drawing from official sources, this is mentioned in the Keras functional API guide (not explicitly though; it only describes what a Dataset should yield, and hence what kind of input structure model.fit() expects):
When calling fit with a Dataset object, it should yield either a tuple of lists like ([title_data, body_data, tags_data], [priority_targets, dept_targets]) or a tuple of dictionaries like ({'title': title_data, 'body': body_data, 'tags': tags_data}, {'priority': priority_targets, 'department': dept_targets}).
Since you have output_branches outputs, your output data must be a list with the same number of arrays.
Basically, if the output values sit along the middle dimension, as your reshape suggests:
y = [y[:, i] for i in range(output_branches)]
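For reference, a small sketch (with hypothetical random data) showing how this list comprehension relates to the np.split call used in the example above: both yield one array per branch, differing only in a trailing axis of size 1.
import numpy as np

output_branches = 3
y = np.random.random_sample([100, output_branches])

as_split = np.split(y, output_branches, axis=-1)      # each element has shape (100, 1)
as_list = [y[:, i] for i in range(output_branches)]   # each element has shape (100,)

# ravel() drops the trailing axis of size 1 so the two forms can be compared.
assert all(np.array_equal(a.ravel(), b) for a, b in zip(as_split, as_list))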

Apply a shared Embedding layer on a set of documents in keras

I am trying to create a model that predicts the order of a certain set of documents given a certain query. My idea is to use a shared embedding layer for both the query and the documents, then merge the two "branches" using a cosine similarity between each document and the query (via a custom Lambda layer). The loss function would then compute the difference between the expected position and the predicted similarity.
My question is: is there a way to create embeddings for a set of textual features (provided that they have the same length)?
I can properly transform my query into a "doc2vec-like embedding" by applying Embedding + Convolution1D + GlobalMaxPooling1D, but I had no luck using the same strategy on the set of documents (and reshaping plus 2D convolutions don't really make sense to me, given that I am working with textual data).
Note that one constraint I have is that I need to use the same Embedding layer for both my query and the set of documents (I am using Keras' functional API to do so).
[EDIT, adding sample code]
Q = Input(shape=(5, ))     # each query is made of 5 words
T = Input(shape=(50, 50))  # each search result is made of 50 words and 50 docs
emb = Embedding(
    max_val,
    embedding_dims,
    dropout=embedding_dropout
)
left = emb(Q)
left = Convolution1D(nb_filter=5,
                     filter_length=5,
                     border_mode='valid',
                     activation='relu',
                     subsample_length=1)(left)
left = GlobalMaxPooling1D()(left)
print(left)
right = emb(T)  # <-- this is my problem, I don't really know what to do/apply here

def merger(vests):
    x, y = vests
    x = K.l2_normalize(x, axis=0)   # Normalize rows
    y = K.l2_normalize(y, axis=-1)  # Normalize the vector
    return tf.matmul(x, y)  # obviously throws an error because of mismatching matrix ranks

def cos_dist_output_shape(shapes):
    shape1, shape2 = shapes
    return (50, 1)

merger_f = Lambda(merger)
predictions = merge([left, right], output_shape=cos_dist_output_shape, mode=merger_f)
model = Model(input=[Q, T], output=predictions)

def custom_objective(y_true, y_pred):
    ordered_output = tf.cast(tf.nn.top_k(y_pred)[1], tf.float32)  # returns the indices of the top values
    return K.mean(K.square(ordered_output - y_true), axis=-1)

model.compile(optimizer='adam', loss=custom_objective)
[SOLUTION] Thanks to Nassim Ben: use TimeDistributed to apply a layer repeatedly across the extra document dimension, like this:
right = TimeDistributed(emb)(T)
right = TimeDistributed(Convolution1D(nb_filter=5,
                                      filter_length=5,
                                      border_mode='valid',
                                      activation='relu',
                                      subsample_length=1))(right)
right = TimeDistributed(GlobalMaxPooling1D())(right)
Alright. If I understand the situation correctly, you have 50 text snippets of length 50 that you want to embed.
After the word embeddings, you find yourself with a tensor T of shape (50, 50, emb_size).
What I would do is use an LSTM layer inside a TimeDistributed wrapper, adding this line after emb(T):
right = TimeDistributed(LSTM(5))(right)
This applies the same LSTM to each of the 50 documents and outputs its final state (of length 5) at the end of each document. The shape of right after this step is (50, 5): each document is now embedded in a length-5 vector.
The advantage of TimeDistributed is that the LSTM applied to each document shares the same weights, so every document is 'treated' the same way. You can find documentation about LSTM and TimeDistributed in the Keras docs.
I hope this helps a bit.
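For completeness, a minimal runnable sketch of the shapes involved, written against tf.keras with hypothetical values for the vocabulary size (1000) and embedding dimension (8), since max_val and embedding_dims are not given in the question:
import tensorflow as tf

T = tf.keras.Input(shape=(50, 50))              # 50 documents of 50 word indices each
right = tf.keras.layers.Embedding(1000, 8)(T)   # (None, 50, 50, 8)
right = tf.keras.layers.TimeDistributed(
    tf.keras.layers.LSTM(5))(right)             # (None, 50, 5): one 5-d vector per document
model = tf.keras.Model(T, right)
model.summary()
Note that in current tf.keras the Embedding layer can typically handle the extra document axis on its own, so the TimeDistributed wrapper is only strictly needed around the LSTM (or around the Convolution1D + GlobalMaxPooling1D stack from the question).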
