BioBERT pre-trained word embeddings as input to a CNN using TensorFlow - Python

I have a textual dataset of 15,000 rows with a label for each row. Since my dataset is in the clinical domain, I want to compute BioBERT pre-trained word embeddings for the text using TensorFlow and then use them as the input to a CNN for prediction. Can anyone show me how to implement this on a text field using Python and TensorFlow?
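One possible shape for the CNN half of this pipeline is sketched below. It assumes the BioBERT token embeddings have already been computed and stored as a float array of shape (rows, tokens, 768), where 768 is the hidden size of BERT-base models such as BioBERT-Base. The random array is only a stand-in for those embeddings, and SEQ_LEN, NUM_CLASSES, the filter count and the dropout rate are illustrative assumptions, not values taken from the question.

```python
import numpy as np
import tensorflow as tf

SEQ_LEN = 64      # assumed maximum number of tokens per row
EMB_DIM = 768     # hidden size of BERT-base models such as BioBERT-Base
NUM_CLASSES = 2   # assumed number of labels


def build_cnn(seq_len=SEQ_LEN, emb_dim=EMB_DIM, num_classes=NUM_CLASSES):
    """1D CNN that classifies a sequence of precomputed token embeddings."""
    inputs = tf.keras.Input(shape=(seq_len, emb_dim))
    # Slide 128 filters over windows of 3 consecutive token vectors.
    x = tf.keras.layers.Conv1D(128, kernel_size=3, activation="relu")(inputs)
    # Keep the strongest response of each filter across the sequence.
    x = tf.keras.layers.GlobalMaxPooling1D()(x)
    x = tf.keras.layers.Dropout(0.3)(x)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)
    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


model = build_cnn()

# Placeholder data: random numbers standing in for real BioBERT embeddings.
fake_embeddings = np.random.rand(8, SEQ_LEN, EMB_DIM).astype("float32")
fake_labels = np.random.randint(0, NUM_CLASSES, size=(8,))

model.fit(fake_embeddings, fake_labels, epochs=1, batch_size=4, verbose=0)
probs = model.predict(fake_embeddings, verbose=0)
```

In a real run, fake_embeddings would be replaced by the last hidden states produced by a BioBERT checkpoint, padded or truncated to SEQ_LEN tokens per row.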

Related

What if labels contain both multi-label and single-label? (NLP - Sentence classification problem)

Currently I'm building a sentence classification model.
Input data X: app review data (text)
Y: the topic of the review, drawn from ['performance', 'ability', 'usability', 'etc']
I took a pre-trained BERT model and fine-tuned it using the Hugging Face Trainer().
Most of the data (90%) has a single label. However, some rows have multiple labels (e.g. ['performance', 'ability'] or ['performance', 'usability', 'etc']).
When I used multi-label classification, the performance was not good, so I filtered out the rows with multiple labels.
Is there any other way to use both the multi-label and the single-label rows?
Thanks.
(1) When I loaded the pre-trained model, I set it to multi-label classification, but its performance was very poor.
(2) Therefore, I currently filter out the rows with multiple labels.
(3) I'm curious whether there is any other way to use both the multi-label and the single-label rows.
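One way to keep both kinds of rows is to treat a single label as a multi-label target with exactly one class switched on. The sketch below encodes every row as a multi-hot float vector (the target format expected by a sigmoid / binary cross-entropy setup) and decodes logits with a threshold. The function names, the 0.5 threshold and the fallback to the top class are illustrative assumptions, not part of any library.

```python
import math

CLASSES = ["performance", "ability", "usability", "etc"]


def to_multi_hot(labels, classes=CLASSES):
    """Encode one label or a list of labels as a 0/1 float vector."""
    if isinstance(labels, str):
        labels = [labels]  # a single label is a one-element label set
    unknown = set(labels) - set(classes)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1.0 if c in labels else 0.0 for c in classes]


def decode(logits, classes=CLASSES, threshold=0.5):
    """Turn per-class logits into labels via independent sigmoids."""
    probs = [1.0 / (1.0 + math.exp(-z)) for z in logits]
    picked = [c for c, p in zip(classes, probs) if p >= threshold]
    if not picked:
        # Assumed fallback: every review has at least one label,
        # so return the most probable class when none pass the threshold.
        best = max(range(len(classes)), key=lambda i: probs[i])
        picked = [classes[best]]
    return picked


rows = [
    "performance",
    ["performance", "ability"],
    ["performance", "usability", "etc"],
]
targets = [to_multi_hot(r) for r in rows]
```

With targets in this form, the single-label and multi-label rows can be trained together instead of being filtered out.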

Deep learning multi-task model for multiple datasets

I am implementing a multi-task speech emotion recognition model in PyTorch. Many researchers use a single-input, multi-output model for speech emotion recognition, but I want to build a multi-input, multi-output model. The examples I have seen of this kind of model use different datasets as input, with speech emotion recognition and gender recognition as the output tasks. Although they are called multi-input, multi-output models, they concatenate the inputs in later layers, which I think makes them effectively single-input.
I want to implement a model that takes several datasets as input, where the input features of each dataset are used to predict that dataset's labels independently. In other words, each output task has its own input features, and the input features of each dataset correspond to exactly one output task. How can I build such a model in PyTorch or Keras? Perhaps it would work if I do not concatenate the features from the different datasets, but I am not sure whether that is right.

Fine-tune universal-sentence-encoder embeddings

I am new to NLP and neural networks. I want to do topic analysis on a dataset of reviews of our product. I tried the universal-sentence-encoder together with top2vec, and they do a good job. However, the similarity score is low most of the time because the combination of words is unique. I want to retrain the model to capture the similarity in my dataset (about 40k reviews).
Is there a way to do so? I think this is called unsupervised fine-tuning.
I am aware of the Keras API for importing the transformer as a layer, but I do not know how to continue from there.
Question 1: What layers or losses could I add to capture the similarity between my reviews and output a vector embedding of the text?
I also read about siamese networks. If I understood correctly, I can use the universal encoder twice as the shared network, then add a similarity layer such as keras.layers.Dot.
Question 2: If I train the model on every possible pair of my reviews, can I then use the output of the first embedding layers (with the new weights) as the embedding of the text?
Question 3: If the answer to question 2 is yes, both embeddings in the siamese network should be almost identical for the same text. Right?
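On question 3, a small sketch may help. In a siamese setup the two branches are the same encoder with the same weights, so a deterministic encoder maps the same text to exactly the same vector in either branch. The toy encoder below (one random linear layer with tanh) is only a stand-in for the universal-sentence-encoder, and the 4-dimensional input vectors are made-up placeholders for featurized reviews.

```python
import math
import random


def make_encoder(in_dim, out_dim, seed=0):
    """Return a toy encoder; both siamese branches share this one function."""
    rng = random.Random(seed)
    weights = [
        [rng.uniform(-1.0, 1.0) for _ in range(in_dim)]
        for _ in range(out_dim)
    ]

    def encode(x):
        # One linear layer followed by tanh, applied with the shared weights.
        return [
            math.tanh(sum(w * xi for w, xi in zip(row, x)))
            for row in weights
        ]

    return encode


def cosine(a, b):
    """Cosine similarity, the quantity a normalized dot-product layer computes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


encode = make_encoder(in_dim=4, out_dim=3)

review_a = [0.2, 0.9, 0.1, 0.4]  # made-up input features
review_b = [0.8, 0.1, 0.7, 0.3]

same = cosine(encode(review_a), encode(review_a))       # same text in both branches
different = cosine(encode(review_a), encode(review_b))  # two different texts
```

Because there is only one set of weights, the embedding used after training is simply the output of encode for a single text.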

Object detection - How to detect and extract features using CNN and classify them using a classifier?

I have an image classification problem where the number of classes increases over time, and when a new class is created I want to train the model only on images of the new class. I know this is not possible with a plain CNN, so I used transfer learning: a Keras pretrained model extracts the image features, but instead of replacing the last (classification) layers with new layers, I feed the features to a Random Forest, which can handle a growing number of classes. I achieved an accuracy of 86% using InceptionResnetV2 trained on the ImageNet dataset, which is good for now.
Now I want to do the same for an object detection problem. How can I achieve this? Can I use the TensorFlow Object Detection API?
Is it possible to replace the last layers of a pretrained CNN that uses a detection algorithm such as Faster-RCNN or SSD with a random forest?
Yes, you could implement the approach described above using the TensorFlow Object Detection API. You could also use your trained InceptionResnetV2 model as a feature extractor. The TensorFlow Object Detection API already has an InceptionResnetV2 feature extractor trained on the COCO dataset. It is available at https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md
If you want to provide or create a custom feature extractor, see https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/defining_your_own_model.md
If you are new to the TensorFlow Object Detection API, please follow this tutorial:
https://github.com/EdjeElectronics/TensorFlow-Object-Detection-API-Tutorial-Train-Multiple-Objects-Windows-10
Hope this helps.

Difference between pre-trained word embedding and training word embedding in keras

I am new to Deep Learning and I want to explore Deep Learning for NLP. I went through word embeddings and tested them in gensim word2vec. I also heard about pre-trained models. I am confused about the difference between pre-trained models and training the model yourself, and how to use the results.
I want to apply it in keras because I do not want to write formulas and all in Theano or Tensorflow.
When you train word2vec with gensim, the result is a representation of the words in your vocabulary as vectors. The dimension of these vectors is the size of the hidden (projection) layer of the neural network.
The pre-trained word2vec models simply contain a list of those vectors that were pre-trained on a large corpus. You will find pre-trained vectors of various sizes.
How do you use those vector representations? That depends on what you want to do. Some interesting properties have been shown for these vectors: for example, the vector for 'king' - 'man' + 'woman' will often have the vector for 'queen' as its closest match. You may also consider using the word vectors as input to another neural network or computational model.
Gensim is a highly optimized library for the CBOW and skip-gram algorithms, but if you really want to set up the neural network yourself, you will first have to learn the structure of CBOW and skip-gram and then how to code them, for example in Keras. This should not be particularly complex, and a Google search on these subjects should turn up plenty of material to help you along.
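The analogy property of word vectors (king - man + woman landing closest to queen) can be illustrated with a few hand-made vectors. The 3-dimensional vectors below are invented for the example and do not come from any trained model. With real gensim vectors the equivalent query is most_similar(positive=['king', 'woman'], negative=['man']).

```python
import math

# Invented toy vectors; the dimensions loosely mean (royalty, gender, fruit).
VECTORS = {
    "man":   [0.0, 1.0, 0.0],
    "woman": [0.0, -1.0, 0.0],
    "king":  [1.0, 1.0, 0.0],
    "queen": [1.0, -1.0, 0.0],
    "apple": [0.0, 0.0, 1.0],
}


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(positive, negative, vectors=VECTORS):
    """Return the word closest to sum(positive) - sum(negative)."""
    dim = len(next(iter(vectors.values())))
    target = [0.0] * dim
    for word in positive:
        target = [t + v for t, v in zip(target, vectors[word])]
    for word in negative:
        target = [t - v for t, v in zip(target, vectors[word])]
    # Exclude the query words themselves, as is usual for analogy queries.
    candidates = [w for w in vectors if w not in positive + negative]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


result = analogy(positive=["king", "woman"], negative=["man"])
print(result)  # queen
```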
