Generating 3D image using stack of 2D images - python

I'm trying to generate a 3D image from a stack of 2D grayscale images in Python. I currently have the images, the mask, and the mask output. I tried creating an ndarray by adding an axis to my images, but this didn't seem to work.
This is what I wrote:
import numpy as np
import numpy.ma as ma
import cv2
from skimage.color import rgb2gray
from skimage.draw import polygon2mask

# load images
images_gray = []
#x, y = images[0].shape
#z = len(frames)
#threeD = np.ndarray([x, y, z]) #3D
threeD = []
for i in range(len(images)):
    frame = cv2.imread(path + '/images/' + str(i))
    # convert to grayscale then save
    images_gray.append(rgb2gray(frame))
    # create a polygon
    coordinates = coord[i]
    coordinates = [[y, x] for [x, y] in coordinates]  # change order for polygon2mask
    polygon = np.array(coordinates)
    # create a mask
    mask = polygon2mask(images_gray[i].shape, polygon)
    # apply mask
    result = ma.masked_array(images_gray[i], np.invert(mask))
    temp = result[..., np.newaxis]
    threeD.append(temp)
The resulting shape of threeD is (#of frames, image height, image width, 1). I don't know where the 1 comes from, and I also expected the order to be (x, y, z) = (image height, image width, #of frames). The output is wrong, and I wasn't able to view it with plt because I got a TypeError about an invalid shape.
For the z direction, I thought about using a value of 0.1 to represent the slice thickness, but I'm not sure how to set that up.
I'm also not sure whether my approach is correct; do I have to create a point cloud instead? A mesh? Any suggestions?
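For context (my reading of the code, not a posted answer): result[..., np.newaxis] appends a length-1 axis to each 2D slice, which is where the trailing 1 comes from, and converting the Python list to an array puts the frame index first. A minimal sketch of stacking 2D slices into a (height, width, #frames) volume, where masked_slices is just a stand-in for the masked results built in the loop:
import numpy as np

# Stand-in for the list of masked 2D slices produced in the loop above
masked_slices = [np.random.rand(4, 5) for _ in range(3)]

# Stack along a new last axis: shape becomes (height, width, n_frames)
volume = np.stack(masked_slices, axis=-1)
print(volume.shape)  # (4, 5, 3)
# np.ma.stack does the same while preserving the mask of masked arrays

# A slice thickness such as 0.1 is usually kept as voxel-spacing metadata
# rather than baked into the array, e.g.
spacing = (1.0, 1.0, 0.1)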

Related

Extracting separate images from YOLO bounding box coordinates

I have a set of images and their corresponding YOLO coordinates. Now I want to extract the objects that these YOLO coordinates denote into separate images.
But these coordinates are in floating-point notation, so I'm not able to use slicing directly.
Here is a sample image, and the corresponding YOLO coordinates are:
labels = [0.536328, 0.5, 0.349219, 0.611111]
I read my image as follows:
image = cv2.imread('frame0.jpg')
Then I wanted to use something like image[y:y+h, x:x+w], as I had seen in a similar question. But the variables are floats, so I tried to convert them into integers using the dimensions of the image (1280 x 720), like this:
object = [int(label[0]*720), int(label[1]*720), int(label[2]*1280), int(label[3]*1280)]
x,y,w,h = object
But it doesn't extract the part of the image correctly, as you can see in the extracted image.
This is part of my training dataset and I had cropped these parts earlier using some tools, so there should not be any errors in my labels. Also, all the images are incorrectly cropped this way; I have shown the output for one of them.
Thanks a lot in advance. Any suggestions would be really helpful!
The labels need to be normalized differently: since the x and y are with respect to the center of the screen, they're actually multiplied by W/2 and H/2, respectively. Also, the width and height have to be multiplied by W and H, respectively; they're currently both being normalized by W (1280). Here's how I solved it:
import cv2
import matplotlib.pyplot as plt
label = [0.536328, 0.5, 0.349219, 0.611111]
img = cv2.imread('P6A4J.jpg')
H, W, _ = img.shape
object = [int(label[0]*W/2), int(label[1]*H/2), int(label[2]*W), int(label[3]*H)]
x,y,w,h = object
plt.subplot(1,2,1)
plt.imshow(img)
plt.subplot(1,2,2)
plt.imshow(img[y:y+h, x:x+w])
plt.show()
Output: a side-by-side plot of the original image and the extracted crop.
Hope this helps!
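As a side note, not part of the answers above: YOLO labels are commonly documented as (x_center, y_center, width, height), each normalized by the full image width or height, so another way to recover the crop is to denormalize and then shift from the box center to the top-left corner. A sketch using the label and file name from the answer:
import cv2

label = [0.536328, 0.5, 0.349219, 0.611111]  # (x_center, y_center, width, height), normalized
img = cv2.imread('P6A4J.jpg')
H, W, _ = img.shape

w = int(label[2] * W)          # denormalize width and height
h = int(label[3] * H)
x = int(label[0] * W - w / 2)  # shift from box center to top-left corner
y = int(label[1] * H - h / 2)

crop = img[y:y+h, x:x+w]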
Alternatively, YOLOv5's detect.py can save the crops for you:
python detect.py --save-crop
Crops will be saved under runs/detect/exp/crops, with a directory for each class detected.
https://github.com/ultralytics/yolov5/issues/5412

affine transformation using nearest neighbor in python

I want to apply an affine transformation and afterwards use nearest-neighbor interpolation, while keeping the same dimensions for the input and output images. I use, for example, the scaling transformation T = [[2,0,0],[0,2,0],[0,0,1]]. Any idea how I can fill the black pixels with nearest-neighbor values? I tried giving them the minimum value of the neighbors' intensities. For example, if a pixel has neighbors [55,22,44,11,22,55,23,231], I give it the minimum intensity: 11. But the result is not clear at all.
import numpy as np
from matplotlib import pyplot as plt
# Import the original image and initialize the output image
img = plt.imread('/home/left/Desktop/computerVision/SET1/brain0030slice150_101x101.png')
outImg = np.zeros_like(img)
# Dimensions of the input image and the output image (the same dimensions)
(width, height) = (img.shape[0], img.shape[1])
# Initialize the transformation matrix
T = np.array([[2,0,0], [0,2,0], [0,0,1]])
# Make an array with the input image (x, y) coordinates and add a [0 0 ... 1] row
coords = np.indices((width, height), 'uint8').reshape(2, -1)
coords = np.vstack((coords, np.zeros(coords.shape[1], 'uint8')))
output = T @ coords
# Arrays of x and y coordinates of the output image within the image dimensions
x_array, y_array = output[0], output[1]
indices = np.where((x_array >= 0) & (x_array < width) & (y_array >= 0) & (y_array < height))
# Final coordinates of the output image
fx, fy = x_array[indices], y_array[indices]
# Final output image after the affine transformation
outImg[fx, fy] = img[fx, fy]
The input image is:
The output image after scaling is:
Well, you could simply use the OpenCV resize function:
import cv2
new_image = cv2.resize(image, new_dim, interpolation=cv2.INTER_AREA)
It will do the resize and fill in the empty pixels in one go.
More on cv2.resize.
If you need to do it manually, then you could simply detect dark pixels in the resized image and change their value to the mean of the 4 neighbouring pixels (for example; it depends on your required algorithm).
See: nearest neighbour, bilinear, bicubic, etc.
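Not part of the original answers, but if the goal is to fill the gaps without a post-processing pass, a common approach is inverse mapping: loop over the output pixels, apply the inverse transform, and round to the nearest input pixel, so every output pixel receives a value. A rough sketch under that assumption (affine_nearest is a hypothetical helper, not from the question):
import numpy as np

def affine_nearest(img, T):
    # Inverse mapping with nearest-neighbor sampling: every output pixel looks up
    # the input pixel it came from, so no output pixel is left black.
    h, w = img.shape[:2]
    out = np.zeros_like(img)
    T_inv = np.linalg.inv(T)
    ys, xs = np.indices((h, w))
    coords = np.stack([ys.ravel(), xs.ravel(), np.ones(h * w)])  # homogeneous coordinates
    src = T_inv @ coords
    sy = np.rint(src[0]).astype(int)  # nearest source row
    sx = np.rint(src[1]).astype(int)  # nearest source column
    valid = (sy >= 0) & (sy < h) & (sx >= 0) & (sx < w)
    out[ys.ravel()[valid], xs.ravel()[valid]] = img[sy[valid], sx[valid]]
    return out

T = np.array([[2, 0, 0], [0, 2, 0], [0, 0, 1]], dtype=float)
img = np.arange(101 * 101, dtype=float).reshape(101, 101)
scaled = affine_nearest(img, T)  # same shape as the input, no holes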

Orientation of binary image seems shifted when extracting coordinates of nonzero pixels

I have a pretty large binary JPG file. The idea is to extract a small part and find shapes; later on I want to make a shapefile of those shapes. The steps I take:
1. Load the image and crop it
2. Blur the image
3. Skeletonize the image
4. Extract the coordinates of the nonzero values of the skeletonized image
5. Do some interpolation (not shown in the code)
6. Convert the coordinates (I just added the difference) so that they match the coordinates they originally had inside the large image (before cropping)
7. Extract georeference data from a TIFF file of the same size as the original JPG, bind it to the shapes, and create the shapefile
My final shapefile turned out to be flipped. So I checked throughout my code, and the orientation of the image appears to change after I extract the coordinates. When I used imshow to show the matrices, the orientation was fine. When I used scatter to show the coordinates of the points, the orientation was off.
from osgeo import gdal
import numpy as np
import cv2
from skimage.morphology import skeletonize
import matplotlib.pyplot as plt

ds = gdal.Open('path_to/binary.jpg')
image = ds.ReadAsArray()
image_crop = image[28500:30000, 25500:28000]

def blur(image, resblur):
    blur = cv2.blur(image, (resblur, resblur))
    blur = (blur - np.min(blur)) / (np.max(blur) - np.min(blur))
    blur[blur <= np.mean(blur)] = 0
    blur[blur != 0] = 1
    return blur

blur = blur(image_crop, 50)
plt.imshow(blur)

def skel_coordinates(blur):
    skel = skeletonize(blur)
    skel = np.uint8(skel)
    y, x = np.nonzero(skel)
    coords = np.transpose(np.vstack([x, y]))
    return skel, coords

skel, coords = skel_coordinates(blur)
plt.imshow(skel)
plt.scatter(coords[:, 0], coords[:, 1])

x = x + 25500
y = y + 28500
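No answer was posted for this question here, but one likely explanation (stated as an assumption, not a confirmed fix) is that imshow draws row 0 at the top of the axes, while scatter/plot puts the origin at the bottom-left, so the same points look vertically flipped. Inverting the y-axis when plotting the coordinates on their own usually makes the two views agree:
import numpy as np
import matplotlib.pyplot as plt

# Stand-in for the coords returned by skel_coordinates() above:
# column 0 is x (image column), column 1 is y (image row)
coords = np.array([[10, 5], [20, 5], [30, 40]])

plt.scatter(coords[:, 0], coords[:, 1], s=1)
plt.gca().invert_yaxis()        # image rows grow downward, so flip the y-axis
plt.gca().set_aspect('equal')   # match imshow's square pixels
plt.show()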

Flip/Transpose image efficiently along y=x

I want to flip an image along the y=x axis, like so.
I've made this function to do what I want, but I was wondering if there's a more optimized way to do it. The function I made is a bit slow when working with big images.
import numpy as np

def flipImage(img):
    # Get image dimensions
    h, w = img.shape[:2]
    # Create an output image with swapped dimensions
    imgYX = np.zeros((w, h, 3), np.uint8)
    for y in range(w):
        for x in range(h):
            imgYX[y, x, :] = img[x, y, :]  # Flip pixels along y=x
    return imgYX
Simply swap the first two axes that correspond to the height and width -
img.swapaxes(0,1) # or np.swapaxes(img,0,1)
We can permute axes with transpose as well -
img.transpose(1,0,2) # or np.transpose(img,(1,0,2))
We can also roll axes for the same effect -
np.rollaxis(img,0,-1)
We use the same trick when working with images in MATLAB.
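A quick sanity check, not from the original answer, comparing the one-liners against each other on a random test image:
import numpy as np

img = np.random.randint(0, 256, (120, 200, 3), np.uint8)

flipped = img.swapaxes(0, 1)
assert flipped.shape == (200, 120, 3)
assert np.array_equal(flipped, img.transpose(1, 0, 2))
assert np.array_equal(flipped, np.rollaxis(img, 0, -1))
assert np.array_equal(flipped[5, 7], img[7, 5])  # pixel (x, y) ends up at (y, x)
# Note: swapaxes/transpose return views; call .copy() if a separate array is needed.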

numpy.where() on grayscale frame, wrong indices?

I'm trying to implement a basic RANSAC algorithm for detecting a circle in a grayscale image.
The problem is that after I threshold the image and search for non-zero pixels, I get the right shape, but the points are somehow displaced from their original positions:
import cv2
import numpy as n
import matplotlib.pyplot as plt

video = cv2.VideoCapture('../video/01_CMP.avi')
video.set(cv2.CAP_PROP_POS_FRAMES, 200)
succ, frame = video.read()
frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
frame = cv2.normalize(frame, frame, alpha=0, norm_type=cv2.NORM_MINMAX, beta=255)
ret, frame = cv2.threshold(frame, 35, 255, cv2.THRESH_BINARY)
points = n.where(frame > 0)  # Thresholded pixels
# Orient the points into an (n, 2) shape,
# needed because of the arguments of circle.points_distance()
points = n.transpose(n.vstack([points[0], points[1]]))
plt.imshow(frame, cmap='gray')
plt.plot(points[:, 0], points[:, 1], 'wo')
video.release()
What am I missing here?
OpenCV uses a NumPy ndarray to represent an image; axis 0 of the array is vertical and corresponds to the Y axis of the image.
So, to plot the points you need: plt.plot(points[:,1], points[:,0], 'wo')
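Equivalently (my addition, not part of the answer), the array can be built in (x, y) order up front so the original plotting call works unchanged; a small self-contained sketch with a stand-in frame:
import numpy as np

# Stand-in for the thresholded video frame from the question
frame = np.zeros((100, 100), np.uint8)
frame[30:40, 60:70] = 255

ys, xs = np.nonzero(frame)          # nonzero returns (row, column) = (y, x)
points = np.column_stack([xs, ys])  # store as (x, y) pairs instead
# Now plt.plot(points[:, 0], points[:, 1], 'wo') lines up with plt.imshow(frame)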
