Show image from fetched data using OpenCV - Python

I've been using datasets from sklearn, and I want to show an image from 'MNIST original' using cv2.imshow.
Here is part of my code:
dataset = datasets.fetch_mldata('MNIST original')
features = np.array(dataset.data, 'int16')
labels = np.array(dataset.target, 'int')
list_hog_fd = []
deskewed_images = []
for img in features:
    cv2.imshow("digit", img)
    deskewed_images.append(deskew(img))
"digit" window appears but it is definitely not an digit image. How can I access real image from dataset?

Shape
MNIST image data is generally distributed and used as a 1D vector of 784 values.
However, in order to show it as an image, you need to convert it to a 2D matrix of 28x28 values.
Simply using img = img.reshape(28, 28) might work in your case.
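A minimal sketch based on that, using the features array from the question (the uint8 cast assumes the usual 0-255 MNIST pixel range, and cv2.waitKey is needed for the window to actually render):
import cv2
import numpy as np

# take one flattened 784-value sample, reshape it to 28x28, and cast for display
img = features[0].reshape(28, 28).astype(np.uint8)

cv2.imshow("digit", img)
cv2.waitKey(0)
cv2.destroyAllWindows()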

Related

How to crop out the annotation box and everything within it?

After running yolov8, the algorithm annotated the following picture: Density-Area
My goal is to crop out a large number of these pictures to use in the further analysis. So, I want everything within the bounding box saved, and everything else outside of it removed.
I tried using torch, numpy, cv2, and PIL but haven't been successful.
import torch
import torchvision
from PIL import Image
# Load the image
image = Image.open("path to .jpg")
# Define the model and download the pre-trained weights
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True, weights=None)
# Set the model to evaluation mode
model.eval()
# Transform the image to a tensor
transform = torchvision.transforms.ToTensor()
image_tensor = transform(image)
# Make predictions on the image using the model
predictions = model([image_tensor])
# Extract the bounding boxes and object labels from the predictions
boxes = predictions[0]['boxes'].tolist()
labels = predictions[0]['labels'].tolist()
# Crop the image for each object detected
for i in range(len(boxes)):
    bbox = tuple(boxes[i])
    object_label = labels[i]
    object_image = image.crop(bbox)
    object_image.save(f"image_save.jpg")
The image is just an nd-array, so you can use array indexing to perform the cropping you want.
For example, I assume your bounding boxes are of the form [xmin, ymin, xmax, ymax].
import cv2

for i in range(len(boxes)):
    object_label = labels[i]
    # unpack the box coordinates and round them to integer pixel indices
    xmin, ymin, xmax, ymax = [int(round(c)) for c in boxes[i]]
    # index the image tensor as [channels, rows, cols]
    crop = image_tensor[:, ymin:ymax, xmin:xmax]
    # permute color dimension last
    crop = crop.permute(1, 2, 0)
    # convert from tensor to numpy array (ToTensor gives values in [0, 1])
    crop = crop.data.numpy()
    # swap from RGB to BGR and rescale to 0-255 (per opencv convention)
    crop = (crop[:, :, ::-1] * 255).astype("uint8")
    # save each crop under its own filename
    cv2.imwrite("output_image_{}.jpg".format(i), crop)
I'm sure you could accomplish this working directly with the PIL image objects as well (see the sketch below), but more generally, in response to your comment: no, you cannot crop an image without providing the coordinates of the cropping bounding box.
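For reference, a minimal sketch of the PIL route mentioned above, using the image and boxes variables from the question (Image.crop takes a (left, upper, right, lower) box, and the output filenames are placeholders):
# crop directly with PIL, saving one file per detected box
for i, box in enumerate(boxes):
    xmin, ymin, xmax, ymax = box
    crop = image.crop((xmin, ymin, xmax, ymax))
    crop.save("crop_{}.jpg".format(i))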

How to convert a data array to show easy-to-understand results

I tried to make an algorithm using Teachable Machine that receives a picture and decides whether it falls under one of two categories (e.g. dogs or humans). After I exported the generated code, I couldn't make sense of how to turn the results, which are given as an array, into something anyone can understand. So far it only shows a list of two numbers (e.g. [[0.00058185 0.99941814]], the first number being dogs and the second humans). I want it to show which of the two numbers means dog and which means human, along with the percentage of each, or to show only the most probable one.
Here's the code:
import tensorflow.keras
from PIL import Image, ImageOps
import numpy as np
from decimal import Decimal
# Disable scientific notation for clarity
np.set_printoptions(suppress=True)
# Load the model
model = tensorflow.keras.models.load_model('keras_model.h5')
# Create the array of the right shape to feed into the keras model
# The 'length' or number of images you can put into the array is
# determined by the first position in the shape tuple, in this case 1.
data = np.ndarray(shape=(1, 224, 224, 3), dtype=np.float32)
# Replace this with the path to your image
image = Image.open('test_photo.jpg')
#resize the image to a 224x224 with the same strategy as in TM2:
#resizing the image to be at least 224x224 and then cropping from the center
size = (224, 224)
image = ImageOps.fit(image, size, Image.ANTIALIAS)
#turn the image into a numpy array
image_array = np.asarray(image)
# display the resized image
image.show()
# Normalize the image
normalized_image_array = (image_array.astype(np.float32) / 127.0) - 1
# Load the image into the array
data[0] = normalized_image_array
# run the inference
prediction = model.predict(data)
print(prediction)
input('Press ENTER to exit')
Using argmax and max does what you want:
"Prediction is {} with {}% probability".format(["dog", "human"][np.argmax(prediction)], round(np.max(prediction)*100,2))
'Prediction is human with 99.94% probability'
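A slightly fuller sketch, using the prediction array from the code above (the class order, dog then human, is taken from the question):
class_names = ["dog", "human"]  # order assumed from the question
probs = prediction[0]

# report every class with its probability
for name, p in zip(class_names, probs):
    print("{}: {:.2f}%".format(name, p * 100))

# or report only the most probable class
best = np.argmax(probs)
print("Prediction is {} with {:.2f}% probability".format(class_names[best], probs[best] * 100))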

Why do I get completely black labels when converting slices of nifti data to PNG images?

I am quite new to deep learning and image segmentation tasks.
I want to train a 2D U-Net on 3D nifti data (CT scans) by taking the center 50 slices of every case. I saved the images and labels as PNG, but the labels are completely black. My goal is to predict the tumor region (Y) given a CT scan slice (X).
What am I doing wrong?
My code:
labels = []
for i in range(0, 100):
    seg = Path("/content/drive/My Drive/") / "case_{:05d}".format(i) / "segmentation.nii.gz"
    seg = nib.load(seg)
    seg = seg.get_data()
    n_i, n_j, n_k = seg.shape
    seg = seg[int(((n_i-1)/2)-25):int(((n_i-1)/2)+25), :, :]
    for i in range(seg.shape[0]):
        labels.append(seg[i, :, :])
        i += 1
labels = np.array(labels)
for i in range(labels.shape[0]):
    label = labels[i, :, :]
    imsave('/content/drive/My Drive/labels/labels_slice_{:05d}.png'.format(i), label)
    i += 1
I do the same for the images and get the following .png file:
image_slice_00680
But for the labels I only get a completely black image.
The data types of the image and label are float64 and uint16, respectively.
Example segmentation nifti file
I checked your nifti file in both 3D Slicer and nipy and the file is perfectly fine. Out of 608 slices in this segmentation file, only slices 181-397 have positive values, so you are supposed to get completely black images for the rest.
This short snippet allows me to save the positive example at the 300th slice:
import nibabel
import matplotlib.pyplot as plt
seg = nibabel.load("D:/Downloads/segmentation.nii.gz")
data = seg.get_fdata()
layer = data[300,:,:]
plt.imsave("D:/Downloads/seg.png", layer, cmap='gray')
Let me know if you can replicate this using the code above.
Also, I know it was not part of the question, but you should strongly consider staying with the nifti (or NRRD) format instead of converting to PNG files:
1) When you save to PNG, you lose a lot of information from the CT scan. You are essentially rescaling CT values that often range from -2000 to +2000 into a 0-255 pixel range.
2) Similarly with segmentation masks: in your nifti files the segmented region is saved as "1" and the background as "0". When you save that to PNG, it gets rescaled to 0-255 and you have to convert it back again for network training.
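If you do keep PNGs for a quick visual check, a minimal sketch (assuming a 0/1 segmentation slice named label, as in the loop above, and matplotlib for the I/O):
import numpy as np
import matplotlib.pyplot as plt

# pin the colour limits so a 0/1 mask maps explicitly to black/white in the PNG
plt.imsave('labels_slice_visible.png', label, cmap='gray', vmin=0, vmax=1)

# when loading such a PNG back for training, threshold it to recover the 0/1 mask
loaded = plt.imread('labels_slice_visible.png')   # RGBA floats in [0, 1]
mask = (loaded[..., 0] > 0.5).astype(np.uint8)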

How to extract patches of the same size from images of different sizes and batch them together with the TensorFlow Dataset API?

I am trying to build a TensorFlow Dataset API (TF version 1.8) input pipeline for a set of images of different sizes. To do this, I am extracting patches of the same size from the images and feeding them to my neural net.
The problem is that in tf.extract_image_patches, the patches get stored in the channel dimension. As each image has a different size, the number of patches differs per image, so the shape of each resulting tensor is different and I can't batch them together with the Dataset API.
Can someone suggest changes to my modify_image function below to tackle this issue?
I guess separating the patches into different images and then batching them together would work, but I can't figure out how to do that.
I want to scan the whole image, so randomly selecting an equal number of patches won't work for me.
def modify_image(image):
    '''add preprocessing functions here'''
    image = tf.expand_dims(image, 0)
    image = tf.extract_image_patches(
        image,
        ksizes=[1, patch_size, patch_size, 1],
        strides=[1, patch_size, patch_size, 1],
        rates=[1, 1, 1, 1],
        padding='SAME',
        name=None
    )
    image = tf.reshape(image, shape=[-1, patch_size, patch_size, 1])
    return image

def parse_function(image, labels):
    image = tf.read_file(image)
    image = tf.image.decode_image(image)
    labels = tf.read_file(labels)
    labels = tf.image.decode_image(labels)
    image = modify_image(image)
    labels = modify_image(labels)
    return image, labels

def list_files(directory):
    files = glob.glob(directory)
    return files

def load_dataset(img_dir, labels_dir):
    images = list_files(img_dir)
    images = tf.constant(images)
    labels = list_files(labels_dir)
    labels = tf.constant(labels)
    dataset = tf.data.Dataset.from_tensor_slices((images, labels))
    dataset = dataset.map(parse_function)
    return dataset

def make_batches(home_dir, img_dir, labels_dir, batch_size):
    img_dir = home_dir + img_dir
    labels_dir = home_dir + labels_dir
    dataset = load_dataset(img_dir, labels_dir)
    batched_dataset = dataset.batch(batch_size)
    return batched_dataset
The tf.contrib.data.unbatch() transformation might be helpful here, as it can separate the patches from a single image into different elements:
dataset = tf.data.Dataset.from_tensor_slices((images, labels))
dataset = dataset.map(parse_function)
patches_dataset = dataset.apply(tf.contrib.data.unbatch())
batched_dataset = patches_dataset.batch(batch_size)
Note that for tf.contrib.data.unbatch() to work, the number of patches in an image must match the number of elements/rows in labels. For example, if each patch should get the same label, you could achieve this by modifying parse_function() as follows to tf.tile() the labels an appropriate number of times:
def parse_function(images, labels):
    # ...
    return image, tf.tile([labels], tf.shape(image)[0:1])
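A hedged sketch of how the batched patches could then be consumed in TF 1.8, assuming the load_dataset function from the question (the glob patterns and batch size are placeholders):
# assumes load_dataset(...) is defined as in the question
dataset = load_dataset("/data/images/*.png", "/data/labels/*.png")
patches_dataset = dataset.apply(tf.contrib.data.unbatch())
batched_dataset = patches_dataset.batch(32)

# standard TF 1.x consumption via a one-shot iterator
iterator = batched_dataset.make_one_shot_iterator()
image_batch, label_batch = iterator.get_next()

with tf.Session() as sess:
    imgs, lbls = sess.run([image_batch, label_batch])
    print(imgs.shape)  # (32, patch_size, patch_size, 1)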

Build the feature matrix and label vector:

I have a dataset “Digit”. The dataset includes 1797 small images (8x8 pixels), each containing a hand-written digit (0-9). Each image is considered a data sample with its pixels as features. Thus, to build the feature table you have to convert each 8x8 image into a row of the feature matrix, with 64 feature columns for the 64 pixels. How do I build a feature matrix and label vector for it?
You can follow the scikit-learn tutorial on supervised learning, where they use the Digits dataset
http://scikit-learn.org/stable/tutorial/basic/tutorial.html#loading-an-example-dataset
with more detail here. If you load the dataset as in the example, you can simply reshape the images:
from sklearn import datasets
digits = datasets.load_digits()
# To apply a classifier on this data, we need to flatten the image, to
# turn the data in a (samples, feature) matrix:
n_samples = len(digits.images)
data = digits.images.reshape((n_samples, -1))
This makes data a 2D matrix, with n_samples rows and as many columns as needed to fit the flattened image.
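The label vector comes from the same object; a minimal sketch building both pieces, using the digits loaded above:
# feature matrix: one flattened 64-pixel row per 8x8 image
X = digits.images.reshape((len(digits.images), -1))  # shape (1797, 64)

# label vector: the digit 0-9 for each row
y = digits.target                                     # shape (1797,)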
If you're using numpy and cv2 you can do the following:
import numpy as np
import cv2

fname = "image1.jpg"
image = cv2.imread(fname, cv2.IMREAD_GRAYSCALE)  # shape (8, 8)
feature = image.reshape(64)                      # shape (64,)
To read a bunch of images and load them into a feature matrix (a numpy array), you can do the following:
N = 10  # number of images
data = np.zeros((N, 64))
for index in range(N):
    # read the current image and convert it to a feature row, as above
    # (the filename pattern here is just an example)
    image = cv2.imread("image{}.jpg".format(index + 1), cv2.IMREAD_GRAYSCALE)
    data[index] = image.reshape(64)
Each row of your data matrix is now one example (a 64 dim list of features).
Does this help?
The label vector can just be a 1D numpy array, i.e. labels = np.zeros(N)
EDIT:
There are a number of ways to read images:
(1) img = cv2.imread(filename)
(2) using matplotlib:
import matplotlib.image as mpimg
img = mpimg.imread(filename)
(3) using PIL (or Pillow):
from PIL import Image
img = Image.open(filename)
It pays to check the shape of the image after it has been read, so that you know the channel/width/height ordering is what your application expects.
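A quick sketch of that shape check across the three readers above (the filename is a placeholder; note that cv2 returns BGR channel order, while matplotlib and PIL give RGB):
import numpy as np
import cv2
import matplotlib.image as mpimg
from PIL import Image

filename = "image1.jpg"  # placeholder path

print(cv2.imread(filename).shape)              # e.g. (height, width, 3), BGR
print(mpimg.imread(filename).shape)            # e.g. (height, width, 3), RGB
print(np.asarray(Image.open(filename)).shape)  # e.g. (height, width, 3), RGB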
