How to use XLMRoberta in fine-tuning - python

I ran into two problems while fine-tuning my model.
I am trying to use X_1 and X_2 as paired inputs for a regression task.
The corpus contains several different languages.
HTTPError: 404 Client Error: Not Found for url: https://huggingface.co/xlm-roberta-base/resolve/main/tf_model.h5
During handling of the above exception, another exception occurred:
OSError Traceback (most recent call last)
/tmp/ipykernel_33/2123064688.py in <module>
55 # )
56
---> 57 model = TFXLMRobertaForSequenceClassification.from_pretrained('xlm-roberta-base',num_labels=1)
OSError: Can't load weights for 'xlm-roberta-base'. Make sure that:
- 'xlm-roberta-base' is a correct model identifier listed on 'https://huggingface.co/models'
(make sure 'xlm-roberta-base' is not a path to a local directory with something else, in that case)
- or 'xlm-roberta-base' is the correct path to a directory containing a file named one of tf_model.h5, pytorch_model.bin.
This is my code:
import tensorflow as tf
from transformers import XLMRobertaTokenizerFast, TFXLMRobertaForSequenceClassification
tokenizer = XLMRobertaTokenizerFast.from_pretrained('xlm-roberta-base')
train_encoding = tokenizer(X_train_1, X_train_2, truncation=True, padding=True)
val_encoding = tokenizer(X_val_1, X_val_2, truncation=True, padding=True)
train_dataset = tf.data.Dataset.from_tensor_slices(
(dict(train_encoding), y_train)
)
val_dataset = tf.data.Dataset.from_tensor_slices(
(dict(val_encoding), y_val)
)
model = TFXLMRobertaForSequenceClassification.from_pretrained('xlm-roberta-base', num_labels=1)

There are several things worth knowing before diving deep into Hugging Face transformers.
The preferred framework for working with Hugging Face transformers is PyTorch.
For several widely used models you may find a TensorFlow version alongside, but not for all of them. The 404 error above means that no tf_model.h5 file was available for xlm-roberta-base.
Fortunately, there are ways to convert PyTorch checkpoints to TensorFlow and vice versa.
Finally, how to fix the code (pick one of the options):
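from transformers import XLMRobertaTokenizerFast, XLMRobertaForSequenceClassification, TFXLMRobertaForSequenceClassification
# Option 1: switch to PyTorch (num_labels=1 gives a single regression output)
tokenizer = XLMRobertaTokenizerFast.from_pretrained('xlm-roberta-base')
model = XLMRobertaForSequenceClassification.from_pretrained('xlm-roberta-base', num_labels=1)
# Option 2: use an unofficial TensorFlow checkpoint
model = TFXLMRobertaForSequenceClassification.from_pretrained('jplu/tf-xlm-roberta-base', num_labels=1)
# Option 3: convert the PyTorch checkpoint to TensorFlow on load (not recommended!)
# Assumption: from_pt=True is the call this option refers to; it needs PyTorch installed as well.
model = TFXLMRobertaForSequenceClassification.from_pretrained('xlm-roberta-base', from_pt=True, num_labels=1)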

Related

SwishImplementation error when saving jit trace

I am trying to jit trace and save my PyTorch model from the segmentation models package, but I am getting an error: "Could not export Python function call 'SwishImplementation'. Remove calls to Python functions before export. Did you forget to add @script or @script_method annotation? If this is a nn.ModuleList, add it to __constants__". It only happens when I use the efficientnet backbone. How can I get the save() function to work? I need to be able to use the model in a C++ application.
import torch
import segmentation_models_pytorch as smp
model = smp.Unet('efficientnet-b7')
model.eval()
input = torch.randn((1,3,224,224))
torch_out = model(input)
model = torch.jit.trace(model,input)
trace_out = model(input)
model.save('model.pt')
The UNET model from the segmentation_models_pytorch module uses an EfficientNet, which uses a MemoryEfficientSwish module. To fix the error, change all instances of MemoryEfficientSwish to Swish before saving the model.
You can iterate through the UNET model, and if the module is an instance of EfficientNet, call the function .set_swish(memory_efficient = False).
After that, you can load the state_dict, and then trace and save the model.
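The iteration described above could look roughly like the sketch below. It is duck-typed rather than checking for the EfficientNet class directly, so the helper name use_plain_swish is hypothetical and the only library call it relies on is the set_swish(memory_efficient=False) method mentioned in this answer.

```python
def use_plain_swish(model):
    """Call set_swish(memory_efficient=False) on every submodule that offers it.

    `model` is expected to expose .modules() like a torch.nn.Module.
    Returns the number of submodules that were switched.
    """
    switched = 0
    for module in model.modules():
        set_swish = getattr(module, "set_swish", None)
        if callable(set_swish):
            # Replace MemoryEfficientSwish with the plain, traceable Swish.
            set_swish(memory_efficient=False)
            switched += 1
    return switched
```

With the packages from the question installed, the intended use would be to call use_plain_swish(model) right after model = smp.Unet('efficientnet-b7') and model.eval(), and only then run torch.jit.trace(model, input) and model.save('model.pt').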

Yolov5s on Openvino

I have trained a model using yolov5 and it is working just fine.
My ultimate goal is to use a model that I have trained on custom data (to detect the hook and bucket) in the Openvino framework.
To achieve this, I first exported the best version of the model to the appropriate Openvino format, using the following command:
!python export.py --weights runs/train/yolov5s24/weights/best.pt --include openvino --dynamic --simplify
The export completed successfully and generated 3 files: best.xml, best.bin, best.mapping.
Now I would like to load it using the Openvino framework, and to do that I am following this pipeline:
1. Create Core object
1.1. (Optional) Load extensions
2. Read a model from a drive
2.1. (Optional) Perform model preprocessing
3. Load the model to the device
4. Create an inference request
5. Fill input tensors with data
6. Start inference
7. Process the inference results
1 Create Core
import numpy as np
import openvino.inference_engine as ie
core = ie.IECore()
2 Read a model from a drive
path_to_xml_file = 'models/best_openvino_model/best.xml'
path_to_bin_file = 'models/best_openvino_model/best.bin'
network = core.read_network(model=path_to_xml_file, weights=path_to_bin_file)
3 Load the Model to the Device
# Load network to the device and create infer requests
exec_network = core.load_network(network, "CPU", num_requests=4)
And here I am getting an error:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-21-7c9ba5f53484> in <module>
1 # Load network to the device and create infer requests
----> 2 exec_network = core.load_network(network, "CPU", num_requests=4)
ie_api.pyx in openvino.inference_engine.ie_api.IECore.load_network()
ie_api.pyx in openvino.inference_engine.ie_api.IECore.load_network()
RuntimeError: Check 'std::get<0>(valid)' failed at inference/src/ie_core.cpp:1414:
InferenceEngine::Core::LoadNetwork doesn't support inputs having dynamic shapes. Use ov::Core::compile_model API instead. Dynamic inputs are :{ input:'images,images', shape={?,3,?,?}}
I am using the OpenVINO™ Development Tools, release 2022.1.
The files to reproduce the error are here.
This error is expected because the model contains a dynamic shape. Such a model can be executed using the ov::Core::compile_model API in OpenVINO 2022.1.
You can refer to class ov::CompiledModel and Dynamic Shape for more information.
You are using API 1.0 (openvino.inference_engine), which does not support dynamic shapes.
You need to use API 2.0 (openvino.runtime) for dynamic shapes.
Here is an example of how to run inference with dynamic shapes using API 2.0:
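import numpy as np
from openvino.runtime import Core
core = Core()
model = core.read_model("model.xml")
compiled_model = core.compile_model(model, "CPU")
# Any concrete shape that fits {?,3,?,?} can be passed; 1x3x640x640 is only an assumed example.
input_data = np.zeros((1, 3, 640, 640), dtype=np.float32)
output = compiled_model.infer_new_request({0: input_data})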
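The shape {?,3,?,?} in the error message means that batch size, height and width are left open while the channel dimension is fixed at 3. As a rough illustration in plain Python, independent of OpenVINO, a shape can be checked against such a pattern before it is passed to the model. The helper matches_dynamic_shape and the 640x640 size are assumptions made for this sketch, not part of the OpenVINO API.

```python
def matches_dynamic_shape(pattern, shape):
    """Return True if a concrete shape fits a pattern where None marks a dynamic dimension."""
    if len(pattern) != len(shape):
        return False
    return all(p is None or p == s for p, s in zip(pattern, shape))


# {?,3,?,?} from the error message, written with None for each '?'.
YOLO_INPUT = (None, 3, None, None)

print(matches_dynamic_shape(YOLO_INPUT, (1, 3, 640, 640)))  # True: NCHW with 3 channels
print(matches_dynamic_shape(YOLO_INPUT, (1, 640, 640, 3)))  # False: channels are in the last position
```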
See also: yolov5-openvino (yolov5 on openvino).

Problem with loading language_model_learner fastai

I have a problem with the fastai library. My code is below:
import os
import pandas as pd
import fastai
from fastai import *
from fastai.text import *
lab = df.columns[0]
data_lm = TextLMDataBunch.from_csv(r'/AWD', 'data.csv', label_cols = lab, text_cols = ['text'])
data_clas = TextClasDataBunch.from_csv(r'/AWD', 'data.csv', vocab = data_lm.train_ds.vocab, bs = 256,label_cols = lab, text_cols=['text'])
data_lm.save('data_lm_export.pkl')
data_clas.save('data_clas.pkl')
learn = language_model_learner(data_lm,AWD_LSTM,drop_mult = 0.3)
learn.lr_find()
learn.recorder.plot(skip_end=10)
learn.fit_one_cycle(10,1e-2,moms=(0.8,0.7))
learn.save('fit_head')
learn.load('fit_head')
My data is quite big, so each epoch in fit_one_cycle takes about 6 hours. My resources only allow me to train the model in a SLURM job limited to 70 hours, so the whole script would be cancelled before it finishes. I wanted to divide my script into pieces, where the first and longest part trains the model and saves fit_head. That part went fine, but when I then tried to load the model to continue training, I got this error:
RuntimeError: Error(s) in loading state_dict for SequentialRNN:
size mismatch for 0.encoder.weight: copying a param with shape torch.Size([54376, 400]) from checkpoint, the shape in current model is torch.Size([54720, 400]).
I have checked similar problems in GitHub issues and Stack Overflow posts and tried the suggested solutions, such as the one below, but I could not find anything useful.
data_clas.vocab.stoi = data_lm.vocab.stoi
data_clas.vocab.itos = data_lm.vocab.itos
Is there any way to load the trained model without running into this issue?
When you call learn.save(), only the model weights are saved to disk, not the model architecture or the data the learner was built from.
To train the model in a different session you must first define the model itself, using the same code as before, and then call learn.load(model_path). Since your data is quite heavy, you can build this new learner from a very small subset (~16 records) of your data, provided it ends up with the same vocabulary as the original run: the encoder weight has shape [vocabulary size, 400], and that first dimension is exactly what the size mismatch in your error is about.
After loading, you can swap in the full training data with learn.data.train_dl = new_dl and resume training.
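To see which parameters disagree before calling learn.load(), the shapes in the checkpoint can be compared with the shapes in the freshly built model. The sketch below works on plain dictionaries mapping parameter names to shapes; the helper name find_shape_mismatches is hypothetical, and the example shapes are the ones from the error message in the question.

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """List parameters present in both mappings whose shapes differ."""
    mismatches = []
    for name, saved in checkpoint_shapes.items():
        current = model_shapes.get(name)
        if current is not None and tuple(saved) != tuple(current):
            mismatches.append((name, tuple(saved), tuple(current)))
    return mismatches


# Shapes taken from the error message in the question.
checkpoint_shapes = {"0.encoder.weight": (54376, 400)}
model_shapes = {"0.encoder.weight": (54720, 400)}

for name, saved, current in find_shape_mismatches(checkpoint_shapes, model_shapes):
    print(f"{name}: checkpoint {saved} vs model {current}")
```

With PyTorch state dicts, such a mapping can be built as {k: tuple(v.shape) for k, v in state_dict.items()}. A difference in the first dimension of the encoder weight points to two different vocabularies rather than to a problem with the saved file.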

weights = 'noisy-student' ValueError: The `weights` argument should be either `None`, `imagenet`, or the path to the weights file to be loaded

I am using Google Colab and I want to use the weights of EfficientNet Noisy Student. https://www.kaggle.com/c/bengaliai-cv19/discussion/132894
First, I installed the package via:
!pip install git+https://github.com/qubvel/efficientnet
Then I tried the code found on the site mentioned above:
import efficientnet.keras as eff
model = eff.EfficientNetB0(weights='noisy-student')
And got this ValueError:
ValueError: The `weights` argument should be either `None` (random initialization), `imagenet` (pre-training on ImageNet), or the path to the weights file to be loaded.
Does someone know how to fix this?
You could download the weights from here.
And load them manually like this:
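import efficientnet.keras as eff
path_to_weights = "/..your..path../efficientnet-b5_noisy-student_notop.h5"
# weights=None skips the default ImageNet weights; the model variant must match the downloaded file (B5 here)
model = eff.EfficientNetB5(include_top=False, weights=None)
model.load_weights(path_to_weights, by_name=True)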

TFLearn: Error in loading 2 different saved models one after another

I have 2 different neural network models, trained and saved using TFLearn. When I run each script on its own, the saved models are loaded properly. I need a system where the second model is called after the output of the first model.
But when I try to load the second model after the first model has been loaded, it gives me the following error:
NotFoundError (see above for traceback): Key val_loss_2 not found in checkpoint
[[Node: save_6/RestoreV2_42 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save_6/Const_0_0, save_6/RestoreV2_42/tensor_names, save_6/RestoreV2_42/shape_and_slices)]]
The second model is properly loaded if I comment out the loading of the first model, or if I run the 2 scripts separately. Any idea why this error is happening?
The code structure is something like this:
from second_model_file import check_second_model
def run_first_model(input):
    features = convert_to_features(input)
    model = tflearn.DNN(get_model())
    model.load("model1_path/model1") # relative path
    pred = model.predict(features)
    ...
    if pred == certain_value:
        check_second_model()
The second_model_file.py is something similar:
def check_second_model():
    input_var = get_input_var()
    model2 = tflearn.DNN(regression_model())
    model2.load("model2_path/model2") # relative path
    pred = model2.predict(input_var)
    #other stuff ......
The models have been saved in different folders, so each has its own checkpoint file.
Well, okay, I found the solution. It was hidden in the discussion on this thread.
I used tf.reset_default_graph() before building the second network and model and it worked. Hope this helps someone else too.
New code:
import tensorflow as tf
def check_second_model():
    input_var = get_input_var()
    tf.reset_default_graph()
    model2 = tflearn.DNN(regression_model())
    model2.load("model2_path/model2") # relative path
    pred = model2.predict(input_var)
Though I intuitively understand why this solution works, I would be happy if someone could explain why it is designed this way.
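One way to picture it: both models are built in the same default graph, and a graph keeps variable names unique by adding a numeric suffix to repeated names. The second model's variables then carry suffixed names that never existed in its checkpoint, which would explain a missing key such as val_loss_2. The sketch below is a toy simulation in plain Python, not TensorFlow code; ToyGraph and build_model are hypothetical stand-ins that only illustrate the naming effect.

```python
class ToyGraph:
    """Toy stand-in for a default graph that keeps variable names unique."""

    def __init__(self):
        self._counts = {}

    def unique_name(self, name):
        seen = self._counts.get(name, 0)
        self._counts[name] = seen + 1
        # The first use keeps the plain name, later uses get a numeric suffix.
        return name if seen == 0 else f"{name}_{seen}"


def build_model(graph):
    """Hypothetical model that registers a single bookkeeping variable."""
    return [graph.unique_name("val_loss")]


# Names stored when the second model was trained and saved in its own script.
checkpoint_keys = {"val_loss"}

shared = ToyGraph()
build_model(shared)                   # first model takes the plain name
second = build_model(shared)          # second model gets a suffixed name
missing_shared = [n for n in second if n not in checkpoint_keys]

fresh = ToyGraph()                    # plays the role of resetting the default graph
second_after_reset = build_model(fresh)
missing_after_reset = [n for n in second_after_reset if n not in checkpoint_keys]

print(missing_shared)       # ['val_loss_1']
print(missing_after_reset)  # []
```

In the toy version the suffixed name is missing from the checkpoint only when the graph is shared, which mirrors why starting from an empty graph before building the second network lets the load succeed.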
