Every time I run GPT-2, I receive this message. Is there a way to make it go away?
Some weights of GPT2LMHeadModel were not initialized from the model checkpoint at gpt2 and are newly initialized: ['h.0.attn.masked_bias', 'h.1.attn.masked_bias', 'h.2.attn.masked_bias', 'h.3.attn.masked_bias', 'h.4.attn.masked_bias', 'h.5.attn.masked_bias', 'h.6.attn.masked_bias', 'h.7.attn.masked_bias', 'h.8.attn.masked_bias', 'h.9.attn.masked_bias', 'h.10.attn.masked_bias', 'h.11.attn.masked_bias', 'lm_head.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Yes. You need to change the log level before you import anything from the transformers library:
import logging
logging.basicConfig(level='ERROR')
from transformers import GPT2LMHeadModel
model = GPT2LMHeadModel.from_pretrained('gpt2')
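The mechanism behind this answer can be sketched with the standard library alone. The logger name demo_lib.modeling below is a hypothetical stand-in for a library logger, and the in-memory stream is only there to make the effect visible. Note that logging.basicConfig does nothing when the root logger already has handlers, which is presumably why the answer configures logging before the import. Recent transformers releases also ship their own helper, transformers.logging.set_verbosity_error(), which may be the more reliable route depending on the installed version.

```python
import io
import logging

# Stand-in for a library logger; the name "demo_lib.modeling" is hypothetical.
lib_logger = logging.getLogger("demo_lib.modeling")

# Send root-logger output to memory so the effect is easy to inspect.
# force=True (Python 3.8+) replaces any handlers that were already installed.
stream = io.StringIO()
logging.basicConfig(level="ERROR", stream=stream, force=True)

lib_logger.warning("Some weights were newly initialized")  # below ERROR: dropped
lib_logger.error("Checkpoint file is missing")             # ERROR: kept

captured = stream.getvalue()
print(captured)
```

Only the error line reaches the stream; the warning is filtered out because the library logger inherits the root logger's ERROR level.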
I am trying to JIT-trace and save my PyTorch model from the segmentation_models_pytorch package, but I am getting an error: "Could not export Python function call 'SwishImplementation'. Remove calls to Python functions before export. Did you forget to add @script or @script_method annotation? If this is a nn.ModuleList, add it to __constants__". It only happens when I use the efficientnet backbone. How can I get the save() function to work? I need to be able to use the model in a C++ application.
import torch
import segmentation_models_pytorch as smp
model = smp.Unet('efficientnet-b7')
model.eval()
input = torch.randn((1,3,224,224))
torch_out = model(input)
model = torch.jit.trace(model,input)
trace_out = model(input)
model.save('model.pt')
The Unet model from the segmentation_models_pytorch module uses an EfficientNet encoder, which by default uses a MemoryEfficientSwish module that cannot be exported. To fix the error, change all instances of MemoryEfficientSwish to Swish before tracing and saving the model.
You can iterate through the modules of the Unet model, and if a module is an instance of EfficientNet, call .set_swish(memory_efficient=False) on it.
After that, you can load the state_dict, and then trace and save the model.
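The iteration described above could look roughly like the sketch below. The helper name disable_memory_efficient_swish is made up for illustration. It also swaps the isinstance check for a duck-typed hasattr check, so the sketch needs no imports. It assumes the model exposes modules() the way torch.nn.Module does, and that the EfficientNet encoder provides set_swish(memory_efficient=...).

```python
def disable_memory_efficient_swish(model):
    """Switch every submodule that offers set_swish to the plain Swish.

    Duck-typed on purpose: instead of checking isinstance(..., EfficientNet),
    it looks for a set_swish method, so this sketch imports nothing.
    """
    switched = 0
    # nn.Module.modules() yields the module itself and every submodule.
    for module in model.modules():
        if hasattr(module, "set_swish"):
            module.set_swish(memory_efficient=False)
            switched += 1
    return switched


# Intended use with the question's model (not executed here):
#   model = smp.Unet('efficientnet-b7')
#   model.eval()
#   disable_memory_efficient_swish(model)
#   traced = torch.jit.trace(model, torch.randn(1, 3, 224, 224))
#   traced.save('model.pt')
```

The return value is the number of modules that were switched, which gives a quick check that the encoder was actually found.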
I have successfully trained a Keras model like:
import tensorflow as tf
from keras_segmentation.models.unet import vgg_unet
# Initiate the model
model = vgg_unet(n_classes=50, input_height=512, input_width=608)
# Train
model.train(
    train_images=train_images,
    train_annotations=train_annotations,
    checkpoints_path="/tmp/vgg_unet_1", epochs=5
)
And saved it in hdf5 format with:
tf.keras.models.save_model(model,'my_model.hdf5')
Then I load my model with
model=tf.keras.models.load_model('my_model.hdf5')
Finally I want to make a segmentation prediction on a new image with
out = model.predict_segmentation(
    inp=image_to_test,
    out_fname="/tmp/out.png"
)
I am getting the following error:
AttributeError: 'Functional' object has no attribute 'predict_segmentation'
What am I doing wrong?
Is the problem in how I save my model or in how I load it?
Thanks!
predict_segmentation isn't a function available in normal Keras models. It looks like it was added after the model was created in the keras_segmentation library, which might be why Keras couldn't load it again.
I think you have 2 options for this.
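You could manually add the method back to the loaded model, using the line from the library code I linked to (this needs "from types import MethodType" and "import keras_segmentation.predict"):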
model.predict_segmentation = MethodType(keras_segmentation.predict.predict, model)
Alternatively, you could create a new vgg_unet with the same arguments when you reload the model, and transfer the weights from your hdf5 file to that model, as suggested in the Keras documentation:
model = vgg_unet(n_classes=50, input_height=512, input_width=608)
model.load_weights('my_model.hdf5')
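To see why the first option works, here is a minimal sketch of binding a plain function to a single instance with types.MethodType. The LoadedModel class and the predict_segmentation function are hypothetical stand-ins for the reloaded Keras model and for keras_segmentation.predict.predict; only MethodType itself is the real mechanism.

```python
from types import MethodType


class LoadedModel:
    """Stand-in for the object returned by load_model (hypothetical)."""
    n_classes = 50


def predict_segmentation(model, inp=None):
    """Stand-in for the library's predict function; the model comes first."""
    return f"segmenting {inp} into {model.n_classes} classes"


model = LoadedModel()
# Bind the plain function to this one instance so it can be called as a method.
model.predict_segmentation = MethodType(predict_segmentation, model)

result = model.predict_segmentation(inp="image.png")
print(result)
```

The binding lives on the instance only, which fits the explanation above: a freshly loaded model is a new object and does not carry the method until it is attached again.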
I am trying to use this huggingface model and have been following the example provided, but I am getting an error when loading the tokenizer:
from transformers import AutoTokenizer
task = 'sentiment'
MODEL = f"cardiffnlp/twitter-roberta-base-{task}"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
OSError: Can't load tokenizer for 'cardiffnlp/twitter-roberta-base-sentiment'. Make sure that:
'cardiffnlp/twitter-roberta-base-sentiment' is a correct model identifier listed on 'https://huggingface.co/models'
or 'cardiffnlp/twitter-roberta-base-sentiment' is the correct path to a directory containing relevant tokenizer files
What I find very weird is that I was able to run my script several times but ran into an error after some time, while I don't recall changing anything in the meantime. Does anyone know what's the solution here?
EDIT: Here is my entire script:
from transformers import AutoTokenizer
from transformers import AutoModelForSequenceClassification
from transformers import TFAutoModelForSequenceClassification
import numpy as np
from scipy.special import softmax
import csv
import urllib.request
task = 'sentiment'
MODEL = f"nlptown/bert-base-multilingual-uncased-{task}"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
labels = ['very_negative', 'negative', 'neutral', 'positive', 'very_positive']
model = AutoModelForSequenceClassification.from_pretrained(MODEL)
model.save_pretrained(MODEL)
text = "I love you"
encoded_input = tokenizer(text, return_tensors='pt')
output = model(**encoded_input)
scores = output[0][0].detach().numpy()
scores = softmax(scores)
print(scores)
The error seems to start happening when I run model.save_pretrained(MODEL), but this might be a coincidence.
I just came across this same issue. It seems like a bug with model.save_pretrained(), as you noted.
I was able to resolve it by deleting the directory where the model had been saved (cardiffnlp/) and running the script again without model.save_pretrained().
Not sure what your application is. For me, re-downloading the model each time takes ~5s and that is acceptable.
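One plausible explanation, not confirmed in this thread, is that model.save_pretrained(MODEL) creates a local directory with the same name as the hub identifier, and that later from_pretrained(MODEL) calls pick up that directory, which holds model files but no tokenizer files. The sketch below only imitates that situation with the standard library. The describe helper is hypothetical, and the file names are illustrative rather than the exact set the library writes.

```python
import os
import tempfile

MODEL = "cardiffnlp/twitter-roberta-base-sentiment"


def describe(identifier):
    """Report how a loader that prefers local paths would see the identifier."""
    if os.path.isdir(identifier):
        return "local directory: " + ", ".join(sorted(os.listdir(identifier)))
    return "hub identifier"


with tempfile.TemporaryDirectory() as workdir:
    previous = os.getcwd()
    os.chdir(workdir)
    try:
        before = describe(MODEL)
        # Imitate model.save_pretrained(MODEL): model files only, no tokenizer.
        os.makedirs(MODEL)
        for name in ("config.json", "pytorch_model.bin"):
            open(os.path.join(MODEL, name), "w").close()
        after = describe(MODEL)
    finally:
        os.chdir(previous)

print(before)
print(after)
```

If this is indeed the cause, saving to a directory whose name differs from the hub identifier, or also calling tokenizer.save_pretrained(MODEL), should avoid the clash without re-downloading every time.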
I'm trying to prune a pre-trained MobileNetV2 model and I got the error below. I searched online but couldn't make sense of it. I'm running on Google Colab.
These are my imports.
import tensorflow as tf
import tensorflow_model_optimization as tfmot
import tensorflow_datasets as tfds
from tensorflow import keras
import os
import numpy as np
import matplotlib.pyplot as plt
import tempfile
import zipfile
This is my code.
model_1 = keras.Sequential([
    basemodel,
    keras.layers.GlobalAveragePooling2D(),
    keras.layers.Dense(1)
])
model_1.compile(optimizer='adam',
                loss=keras.losses.BinaryCrossentropy(from_logits=True),
                metrics=['accuracy'])
model_1.fit(train_batches,
            epochs=5,
            validation_data=valid_batches)
prune_low_magnitude = tfmot.sparsity.keras.prune_low_magnitude
pruning_params = {
    'pruning_schedule': tfmot.sparsity.keras.PolynomialDecay(initial_sparsity=0.50,
                                                             final_sparsity=0.80,
                                                             begin_step=0,
                                                             end_step=end_step)
}
model_2 = prune_low_magnitude(model_1, **pruning_params)
model_2.compile(optimizer='adam',
                loss=keras.losses.BinaryCrossentropy(from_logits=True),
                metrics=['accuracy'])
This is the error I get.
---> 12 model_2 = prune_low_magnitude(model, **pruning_params)
ValueError: Please initialize `Prune` with a supported layer. Layers should either be a `PrunableLayer` instance, or should be supported by the PruneRegistry. You passed: <class 'tensorflow.python.keras.engine.training.Model'>
I believe you are following the Pruning in Keras example and jumped to the "Fine-tune pre-trained model with pruning" section without setting your prunable layers. You have to re-instantiate the model and mark the layers you wish to prune. Follow this guide for further information on how to set prunable layers:
https://www.tensorflow.org/model_optimization/guide/pruning/comprehensive_guide.md
I faced the same issue with TensorFlow 2.2.0.
Updating TensorFlow to 2.3.0 solved the issue for me; I think support for this was added in 2.3.0.
One thing I found is that the experimental preprocessing I had added to my model was causing this error. I had it at the beginning of my model to generate more training samples, but the Keras pruning code doesn't accept subclassed models like this. Similarly, it doesn't accept experimental preprocessing such as the centering of the image that I use. Removing the preprocessing from the model solved the issue for me.
def classificationModel(trainImgs, testImgs):
    L2_lambda = 0.01
    data_augmentation = tf.keras.Sequential(
        [ layers.experimental.preprocessing.RandomFlip("horizontal", input_shape=IM_DIMS),
          layers.experimental.preprocessing.RandomRotation(0.1),
          layers.experimental.preprocessing.RandomZoom(0.1),])
    model = tf.keras.Sequential()
    model.add(data_augmentation)
    model.add(layers.experimental.preprocessing.Rescaling(1./255, input_shape=IM_DIMS))
    ...
Saving the model as below and reloading worked for me.
_, keras_file = tempfile.mkstemp('.h5')
tf.keras.models.save_model(model, keras_file, include_optimizer=False)
print('Saved baseline model to:', keras_file)
Had the same problem today; it's the following error.
If you don't want the layer to be pruned or don't care for it, you can use this code to only prune the prunable layers in a model:
import tensorflow as tf
import tensorflow_model_optimization as tfmot
from tensorflow_model_optimization.python.core.sparsity.keras import prunable_layer
from tensorflow_model_optimization.python.core.sparsity.keras import prune_registry
def apply_pruning_to_prunable_layers(layer):
    # Wrap only the layers that the pruning API can handle; return the rest unchanged.
    if isinstance(layer, prunable_layer.PrunableLayer) or hasattr(layer, 'get_prunable_weights') or prune_registry.PruneRegistry.supports(layer):
        return tfmot.sparsity.keras.prune_low_magnitude(layer)
    print("Not Prunable: ", layer)
    return layer
model_for_pruning = tf.keras.models.clone_model(
    base_model,
    clone_function=apply_pruning_to_prunable_layers
)
I don't understand how to use TensorBoard to visualize the training of my Keras network.
I have already launched TensorBoard from the command line: tensorboard --logdir=/run1
But it raises this error:
No dashboards are active for the current data set. Probable causes:
You haven’t written any data to your event files.
TensorBoard can’t find your event files.
import tensorflow as tf
from tensorflow import keras
import numpy as np
# Create the array of data
train_data = [[1.0,2.0,3.0],[4.0,5.0,6.0]]
train_data_np = np.asarray(train_data)
train_label = [[1,2,3],[4,5,6]]
train_label_np = np.asarray(train_label)
### Build the model
model = keras.Sequential([
    keras.layers.Dense(3,input_shape =(3,2)),
    keras.layers.Dense(3,activation=tf.nn.sigmoid)
])
model.compile(optimizer='sgd',loss='sparse_categorical_crossentropy',metrics=['accuracy'])
#Train the model
tensorboard = keras.callbacks.TensorBoard(log_dir="run1")
model.fit(train_data_np,train_label_np,epochs=10,callbacks=[tensorboard])
#test the model
restest = model.evaluate(test_data_np,test_label_np)
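Adding a formal answer here: the problem is the logdir argument passed to TensorBoard. The callback writes to the relative directory run1, while --logdir=/run1 points at an absolute path at the filesystem root, so TensorBoard finds no event files there. Remove the slash at the beginning of the directory, and start TensorBoard from the directory the script was run in: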
tensorboard --logdir=run1
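The difference between the two spellings can be checked with pathlib. This is a small sketch; the working directory /home/user/project is a hypothetical example, and PurePosixPath is used so the result does not depend on the machine running it.

```python
from pathlib import PurePosixPath

# Hypothetical directory from which both the script and TensorBoard are run.
cwd = PurePosixPath("/home/user/project")

written_by_callback = cwd / "run1"   # TensorBoard(log_dir="run1")
read_with_slash = cwd / "/run1"      # tensorboard --logdir=/run1
read_without_slash = cwd / "run1"    # tensorboard --logdir=run1

print(written_by_callback)                        # /home/user/project/run1
print(read_with_slash)                            # /run1
print(read_without_slash == written_by_callback)  # True
```

Joining with a path that starts with a slash discards the working directory, which is why the leading slash sends TensorBoard to a different place than the one the callback wrote to.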