I am building image search using a one-shot model because I have very little data per class.
I am following this tutorial.
I have already prepared the data pipeline and trained the model, but I don't understand how to do single-image prediction, which is normally done with model.predict.
I tried the following code but I think I am missing something.
import cv2
import numpy as np

# load the query image and bring it to the model's input shape (105, 105, 1)
img1 = cv2.imread("./images_evaluation/test.jpg", cv2.IMREAD_GRAYSCALE)
img1 = np.expand_dims(cv2.resize(img1, (105, 105)), axis=2)

# build a support set and compare the query image against every support image
(test_image_names, train_image_names) = generate_oneshot_validation_trials(dataset, 20)
train_images = get_images(train_image_names, IMAGE_SHAPE)
images = np.tile(img1, (len(train_images), 1, 1, 1))
preds = siamese_model1.predict([images, train_images])
pred_idx = np.argmax(preds, axis=0)[0]
pred_char_name = train_image_names[pred_idx].split('/')[-2]
print(pred_char_name)
Here I get a different prediction on every run. What is the reason?
I followed this tutorial: https://github.com/lih0905/PyTorch_Study/blob/master/8)%20TorchVision%200.3%20Object%20Detection%20finetuning%20tutorial.ipynb and, as you can see, there are output images at the end of the page and they look pretty good.
I did the same, but my output images are cloudy and look like garbage.
I trained Mask R-CNN and saved the models, and here I'm loading epoch-2.pt to see the results.
This is my code:
PATH = '/home/Nezz/Train/ArT/models_vjezba/epoch-2.pt'
#model = torch.load(PATH)
#model.to(device)
model.load_state_dict(torch.load(PATH))
# pick one image from the test set
img, _ = dataset_test[21]
# put the model in evaluation mode
model.eval()
#evaluate(model, data_loader_test, device=device)
with torch.no_grad():
    prediction = model([img.to(device)])
#print(prediction)
imaag = Image.fromarray(img.mul(255).permute(1, 2, 0).byte().numpy())
imag = Image.fromarray(prediction[0]['masks'][0, 0].mul(255).byte().cpu().numpy())
imag.show()
imaag.show()
Output: (the resulting mask image, which looks cloudy as described above)
I trained a model, and now I would like to use it to detect objects in images. Using the DefaultPredictor only the bounding boxes are returned, but I need the masks. I saw that you can also perform inference with this method:
model.eval()
with torch.no_grad():
    outputs = model(inputs)
I think that's what I should use. The problem is that I don't know how to build the inputs, starting from the images.
import os
import cv2
import torch
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.modeling import build_model
from detectron2.checkpoint import DetectionCheckpointer
from detectron2.structures import ImageList

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/"
                                               "mask_rcnn_R_101_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")
cfg.SOLVER.IMS_PER_BATCH = 1
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1  # only has one class
cfg.INPUT.FORMAT = "BGR"
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.7  # set the testing threshold for this model

# build the model and load the trained weights
model = build_model(cfg)
DetectionCheckpointer(model).load("output/model_final.pth")
model.eval()  # make sure it's in eval mode

image = cv2.imread("/kaggle/working/detectron2/images/73-ab1.jpg")
height, width = image.shape[:2]
image = torch.as_tensor(image.astype("float32").transpose(2, 0, 1))
image = ImageList.from_tensors([image])

with torch.no_grad():
    inputs = image
    outputs = model(inputs)
Unfortunately, I think I'm doing something wrong. Can someone enlighten me?
See the Model Input Format documentation for the built-in models.
Basically, the model in your code is not expecting an ImageList object, but a list of dicts where each dict needs to provide specific information about one image, as explained in the documentation linked above.
So, your inference code needs to be corrected to the following.
image = cv2.imread("/kaggle/working/detectron2/images/73-ab1.jpg")
height, width = image.shape[:2]
image = torch.as_tensor(image.astype("float32").transpose(2, 0, 1))
inputs = [{"image": image, "height": height, "width": width}]
with torch.no_grad():
    outputs = model(inputs)
You can also see this in the code - the forward method of the GeneralizedRCNN class.
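Once the model runs with these inputs, the predicted masks can be read from the returned instances. A minimal sketch, assuming the standard detectron2 instance-segmentation output fields:

# outputs is a list with one dict per input image; the "instances" field
# holds the detections, including one mask per detected object
instances = outputs[0]["instances"].to("cpu")
masks = instances.pred_masks.numpy()   # shape: (num_detections, height, width)
scores = instances.scores.numpy()
print("detected", len(instances), "objects")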
I am trying to predict values by loading a saved version of my model.
Here is the code for it:
def classifier(img, weights_file):
    # Load the model
    model = tf.lite.TFLiteConverter.from_keras_model(weights_file)
    # Create the array of the right shape to feed into the keras model
    data = np.ndarray(shape=(1, 200, 200, 3), dtype=np.float32)
    image = img
    # image sizing
    size = (200, 200)
    image = ImageOps.fit(image, size, Image.ANTIALIAS)
    # turn the image into a numpy array
    image_array = np.asarray(image)
    # Normalize the image
    normalized_image_array = image_array.astype(np.float32) / 255
    # Load the image into the array
    data[0] = normalized_image_array
    # run the inference
    prediction_percentage = model.predict(data)
    prediction = prediction_percentage.round()
    return prediction, prediction_percentage
This throws the error "'TFLiteKerasModelConverterV2' object has no attribute 'predict'".
Can anyone please tell me what I can change here?
You are creating a TFLiteConverter object from your weights file. The correct way to load the model weights is with load_weights. Try:
model.load_weights(weights_file)
However, you would first need to define the model the same way you did when training it. If you have saved your model in the SavedModel format, use
model = tf.keras.models.load_model(weights_file)
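Putting that together, the classifier function could look roughly like this. This is a sketch that assumes the model was saved with model.save() (the question does not say how it was saved) and that img is a PIL image:

import numpy as np
import tensorflow as tf
from PIL import Image, ImageOps

def classifier(img, saved_model_path):
    # load the full model (architecture + weights) saved with model.save()
    model = tf.keras.models.load_model(saved_model_path)
    # resize and normalize the image exactly as during training
    image = ImageOps.fit(img, (200, 200), Image.ANTIALIAS)
    data = np.asarray(image, dtype=np.float32)[np.newaxis, ...] / 255
    prediction_percentage = model.predict(data)
    prediction = prediction_percentage.round()
    return prediction, prediction_percentage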
I have a Keras model that I have trained, evaluated, and even tested. Now I am trying to feed three new test images into the model.
I run the images through the same preprocessor I used to create the training data, and then do exactly the same thing to the single images that I did for the testing data. But it gives me this error:
Error when checking model input: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 1 array(s), but instead got the following list of 2 arrays:
I don't know what is wrong with it.
So this is how I test the model successfully.
y_pred = []
y_true = []
for i in range(0, len(test_x1)):
    x1 = test_x1[i]
    x2 = test_x2[i]
    x1 = np.expand_dims(x1, axis=0)
    x2 = np.expand_dims(x2, axis=0)
    y_true.append(np.argmax(test_y[i]))
    pred = model.predict([x1, x2])
    y_pred.append(make_binary(pred))
This is the preprocessing method I used for both images
def create_features(file, image_dir, base_model):
    img_path = os.path.join(image_dir, file)
    img = image.load_img(img_path, target_size=(224, 224))
    img = image.img_to_array(img)
    x = resnet50.preprocess_input(img)
    x = np.array([x])
    feature = base_model.predict(x)
    return feature
And this is the way I am processing the new images:
IMAGE_DIR = 'Data'
img1 = 'test1.jpg'
img2 = 'test2.jpg'
img3 = 'test3.jpg'
img1_feat = create_features(img1, IMAGE_DIR, model)
img2_feat = create_features(img2, IMAGE_DIR, model)
img3_feat = create_features(img3, IMAGE_DIR, model)
Now when I compare a training feature with a new image feature, they have the same shape and type:
x1 = test_x1[0]
x1 = np.expand_dims(x1, axis=0)
print(x1.shape)
print(type(x1))
print(img1_feat.shape)
print(type(img1_feat))
(1, 1, 1000)
<class 'numpy.ndarray'>
(1, 1, 1000)
<class 'numpy.ndarray'>
And then I try to make a prediction from it
pred1 = model.predict([img1_feat, img2_feat])
But that results in an error.
I figured out what was wrong thanks to @Matias Valdenegro and @Mukul.
I was doing this in an IPython notebook, and after going around for a few epochs I found out that the model occasionally gets overwritten by an imported ResNet model from another class.
Thanks to everyone for the help. I didn't think about using model.summary() because I didn't really think the model had changed.
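A quick sanity check along those lines, as a sketch (the exact assertions are assumptions about the two-input Siamese setup, not from the original code): print the summary and the number of inputs before predicting, so an accidentally overwritten model shows up immediately.

# confirm the model still has the two feature inputs the setup expects
model.summary()
print("number of inputs:", len(model.inputs))   # should be 2 for [x1, x2]
print("input shapes:", [tuple(t.shape) for t in model.inputs])
assert len(model.inputs) == 2, "model was overwritten by a single-input model"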
I started working with TensorFlow a few weeks ago and am struggling with the input queue right now.
What I want to do is the following: I have a folder with 477 temporal, greyscale images. I want to, e.g., take the first 3 images and stack them together (=> 600, 600, 3), so that I get a single example with 3 channels. Next I want to take the fourth image and use it as a label (just 1 channel => 600, 600, 1). Then I want to pass both to tf.train.batch and create batches.
I think I found a solution, see code below. But I was wondering if there is a more fashionable solution.
My actual question is: what happens at the end of the queue? Since I'm always picking 4 images from the queue (3 for input, 1 for label) and I have 477 images in my queue, things don't work out evenly. Does TensorFlow then just fill up my queue again and continue (so if there is 1 image left in the queue, it takes this image, fills up the queue again, and takes 2 more images to get the desired 3)? Or do I need a number of images divisible by 4 in my folder for a proper solution?
def read_image(filename_queue):
    reader = tf.WholeFileReader()
    _, value = reader.read(filename_queue)
    image = tf.image.decode_png(value, dtype=tf.uint8)
    image = tf.cast(image, tf.float32)
    image = tf.image.resize_images(image, [600, 600])
    return image
def input_pipeline(file_names, batch_size, num_epochs=None):
    filename_queue = tf.train.string_input_producer(file_names, num_epochs=num_epochs, shuffle=False)
    image1 = read_image(filename_queue)
    image2 = read_image(filename_queue)
    image3 = read_image(filename_queue)
    image = tf.concat([image1, image2, image3], axis=2)
    label = read_image(filename_queue)
    # Reshape is necessary, otherwise I get an error..
    image = tf.reshape(image, [600, 600, 3])
    label = tf.reshape(label, [600, 600, 1])
    min_after_dequeue = 200
    capacity = min_after_dequeue + 12 * batch_size
    image_batch, label_batch = tf.train.batch([image, label],
                                              batch_size=batch_size,
                                              num_threads=12,
                                              capacity=capacity)
    return image_batch, label_batch
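For reference, these batches would be consumed roughly like this (a sketch assuming a standard TF 1.x session with queue runners, and that file_names is the list of image paths):

images, labels = input_pipeline(file_names, batch_size=4)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(tf.local_variables_initializer())  # needed when num_epochs is set
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    # pull one batch of stacked inputs and labels
    img_batch, lbl_batch = sess.run([images, labels])
    print(img_batch.shape, lbl_batch.shape)  # (4, 600, 600, 3) (4, 600, 600, 1)
    coord.request_stop()
    coord.join(threads)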
Thanks for any help!
"But I was wondering if there is a more fashionable solution"
Yes! There is a better and faster solution. First, redesign your dataset: since you want to combine 3 grey images into 1 RGB image for training, prepare a dataset of RGB images from the grey images ahead of time (it will save a whole lot of time during training).
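One way to do that offline, as a rough sketch (the directory layout and file naming here are assumptions): stack every 3 consecutive greyscale frames into the channels of one RGB image and keep the 4th frame as its label.

import os
import numpy as np
from PIL import Image

def build_rgb_dataset(src_dir, dst_dir):
    """Group the frames in 4s: frames 1-3 become one RGB input, frame 4 becomes the label."""
    files = sorted(os.listdir(src_dir))
    os.makedirs(dst_dir, exist_ok=True)
    for out_idx, i in enumerate(range(0, len(files) - 3, 4)):
        channels = [np.array(Image.open(os.path.join(src_dir, f)).convert("L"))
                    for f in files[i:i + 3]]
        rgb = np.stack(channels, axis=-1)  # (600, 600, 3)
        Image.fromarray(rgb).save(os.path.join(dst_dir, "input_%05d.png" % out_idx))
        Image.open(os.path.join(src_dir, files[i + 3])).convert("L") \
             .save(os.path.join(dst_dir, "label_%05d.png" % out_idx))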
Then redesign the way you retrieve the data:
# retrieve the image filename and the corresponding label filename at the same time;
# with num_epochs=None the producer cycles indefinitely, so it covers whatever the training run needs
filename_queue = tf.train.slice_input_producer([file_names_images_list, corresponding_file_names_label_list],
                                               num_epochs=None, shuffle=False)
image = read_image(filename_queue[0])
label = read_image(filename_queue[1])
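For that to work, read_image has to decode from a filename tensor (which is what slice_input_producer yields) rather than from a reader queue. A minimal adaptation of the question's function, under that assumption:

def read_image(filename):
    # filename is a string tensor produced by tf.train.slice_input_producer
    value = tf.read_file(filename)
    image = tf.image.decode_png(value, dtype=tf.uint8)
    image = tf.cast(image, tf.float32)
    image = tf.image.resize_images(image, [600, 600])
    return image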