I am trying to crop an image of a piece of card/paper or such so that the card/paper is in focus. I tried the below code but the problem is that it works only when the object in question is alone in the picture. If it is a blank background with nothing else in it- the cropping is flawless, otherwise it does not work as expected.
I am attempting create a system which crops different kinds of images and puts them through a classifier and then extracts text from them.
import cv2
import numpy as np
filenames = "img.jpg"
img = cv2.imread(filenames)
blurred = cv2.blur(img, (3,3))
canny = cv2.Canny(blurred, 50, 200)
## find the non-zero min-max coords of canny
pts = np.argwhere(canny>0)
y1,x1 = pts.min(axis=0)
y2,x2 = pts.max(axis=0)
## crop the region
cropped = img[y1:y2, x1:x2]
filename_cropped = filenames.split('.')
filename_cropped[0] = filename_cropped[0] + '_cropped'
filename_cropped = '.'.join(filename_cropped)
cv2.imwrite(filename_cropped, cropped)
An sample image that works is
Something that does not work is
Can anyone help with this?
The first image works because the entire images besides your target is empty. Canny will also give other results when there is more in the image.
If you are looking for those specific cards I suggest you try to use some colour filtering first. You can try to filer for the blue/purple hue of the card.
Increasing the canny threshold could also work, but you will always still be finding the hand as well in this image unless you add some colour filtering.
You can also try Sobel edge detection . This will probably highlight the instant edges of the card pretty well. But then again, it will also show the hand, so you can't just take all the Sobel/Canny outputs. You need to add processing before it that isolates the card, or after it that can find the rectangular shape of the card in the sobel/canny.
Related
I am trying to recognize hand written digits. Say that I have the following image:
My target is to smooth the extremal features of the contours, and keep only the shape of the white trace like below:
I first applied cv2.THRESH_BINARY_INV to remove the noise.
Now I tried applying cv2.erode() with np.ones((5,5)) as the kernel, but the resulting figure still had the extremal points.
I think applying cv2.findContours() may help to get the desired shape, but I am going to end up with two contours, one for the inner and another for the outer part. Any ideas will be much appreciated!
Edit:
Thanks to #stateMachine, I managed to get a skeleton of the digit. I applied cv2.ximgproc.thinning(), followed by cv2.GaussianBlur() and cv2.MORPH_CLOSE. If the extremal points of this image can be smoothened a bit then it would be perfect. I am still open to any ideas :)
Maybe what you are looking for is the shape's skeleton. The skeleton is part of OpenCV's extended image processing module (pip install opencv-contrib-python). You can compute the skeleton of your image like this:
# Imports:
import cv2
# Image path
path = "D://opencvImages//"
fileName = "OKwfZ.png"
# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)
# To Grayscale:
grayscaleImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
# Compute the skeleton:
skeleton = cv2.ximgproc.thinning(grayscaleImage, None, 1)
cv2.imshow("Skeleton", skeleton)
cv2.waitKey(0)
This is the result:
The skeleton normalizes the thickness of the image to 1 pixel. If you need a thicker line you can apply some dilations.
I have been learning how to implement pretrained yolo using pytorch, and I want to display the output image using openCV's cv2.imshow() method.
The output image can be displayed using .show() function and saved using .save() function, I however want to display it using cv2.imshow(), and for that I would need the image in the form of a numpy array.
I'm unaware about how we do that or even if that is at all possible.
Here's the code for it.
import torch
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)
imgs = ['img.png'] # batch of images
results = model(imgs)
results.print()
results.show() # or .save(), shows/saves the same image with bounding boxes around detected objects
# Show 'results' using openCV's cv2.imshow() method?
results.xyxy[0] # img1 predictions (tensor)
print(results.pandas().xyxy[0]) # img1 predictions (pandas)
A longer way of solving this problem would be to create bounding boxes ourselves over the detected objects in the image and display it, but consider me lazy :p .
I am lazy like you :) and you can display the bounding boxes without the need to draw them manually. When you call results.save() it will save a version of the image with the boxes to this folder 'runs/detect/exp/' Then you can display that image using cv2.
results.save()
import cv2
img = cv2.imread("runs/detect/exp/zidane.jpg")
#cv2.imgshow does not work on Google collab so this is a work around.
# You should get the same results if you use cv2.imgshow
from google.colab.patches import cv2_imshow
cv2_imshow(img)
had the same issue, so I wrote a small method to do so quickly draw the image without saving it.
def drawRectangles(image, dfResults):
for index, row in dfResults.iterrows():
print( (row['xmin'], row['ymin']))
image = cv2.rectangle(image, (row['xmin'], row['ymin']), (row['xmax'], row['ymax']), (255, 0, 0), 2)
cv2_imshow(image)
_
results = model(image)
dfResults = results.pandas().xyxy[0]
self.drawRectangles(image, dfResults[['xmin', 'ymin', 'xmax','ymax']].astype(int))
Just open the image via cv2 and then ad rectangles by drawing a rectangle via the given points from your result arrays?
import cv2
img = cv2.imread("image_path")
link how to draw a rectangle with cv2
#ChristophRackwitz
here u have it
cv2.rectangle(image, start_point, end_point, color, thickness)
I have two questions.
I am working with openCv and python and I am trying to have an image's contours. I am succesfull at that but when I try to se what is the difference between when I use cv2.drawContorus() functions and directly edit image with cv2.findContours() without sending a copy of original image as the source parameter. I have tried on some images but I couldnt see anything even happenning.
I am trying to get the contours of a square I created with paint square tool. But when I try with cv2.CHAIN_APPROX_SIMPLE method, it gives me coordinates of 6 points which none of the combinations from them is suitable for my square. Why does it do like that?
Can someone explain?
Here is my code for both problems:
import cv2
import numpy as np
image = cv2.imread(r"C:\Users\fazil\Desktop\12.png")
gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
gray = cv2.Canny(gray,75,200)
gray = cv2.threshold(gray,127,255,cv2.THRESH_BINARY_INV)[1]
cv2.imshow("s",gray)
contours, hiearchy = cv2.findContours(gray,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
print(contours[1])
cv2.drawContours(image,contours,1,(45,67,89),5)
cv2.imshow("k",gray)
cv2.imshow("j",image)
cv2.waitKey(0)
cv2.destroyAllWindows()
I'm writing a script in Python for my image processing class, which should read a directory for images, display them, and then I will eventually add additional code to perform Otsu thresholding on these images. I can get a reference image to display properly to include Otsu thresholding; however, I run into trouble when I attempt to display the remaining images in the directory. I am not sure that my images are being read from the directory correctly, as I am trying to store them in an array; however, I can see the output window displays grey squares which correspond to the dimensions of the actual image resolutions, which suggests that they are being at least partly read correctly.
I've already attempted to isolate the script to load images and display them into a separate file and running it. I was concerned that the successful processing of my sample image (which included a black/white binarization) was somehow affecting my image display later. This was not the case, as running a separate script produced the same grey square output.
****Update****
I've managed to tweak the below script(not yet updated) to run almost correctly. By writing the full filepath directly for each file, I can get the output to display correctly. It appears there is some issue with loading images into an array, best I can tell; a potential workaround for future testing is importing file locations as a string array, and implementing that vs. loading images into an array directly.
import cv2 as cv
import numpy as np
from PIL import Image
import glob
from matplotlib import pyplot as plot
import time
image=cv.imread('Fig ref.jpg')
image2=cv.cvtColor(image, cv.COLOR_RGB2GRAY)
cv.imshow('Image', image)
# global thresholding
ret1,th1 = cv.threshold(image2,127,255,cv.THRESH_BINARY)
# Otsu's thresholding
ret2,th2 = cv.threshold(image2,0,255,cv.THRESH_BINARY+cv.THRESH_OTSU)
# Otsu's thresholding after Gaussian filtering
blur = cv.GaussianBlur(image2,(5,5),0)
ret3,th3 = cv.threshold(blur,0,255,cv.THRESH_BINARY+cv.THRESH_OTSU)
# plot all the images and their histograms
images = [image2, 0, th1,
image2, 0, th2,
blur, 0, th3]
titles = ['Original Noisy Image','Histogram','Global Thresholding (v=127)',
'Original Noisy Image','Histogram',"Otsu's Thresholding",
'Gaussian filtered Image','Histogram',"Otsu's Thresholding"]
for i in range(3):
plot.subplot(3,3,i*3+1),plot.imshow(images[i*3],'gray')
plot.title(titles[i*3]), plot.xticks([]), plot.yticks([])
plot.subplot(3,3,i*3+2),plot.hist(images[i*3].ravel(),256)
plot.title(titles[i*3+1]), plot.xticks([]), plot.yticks([])
plot.subplot(3,3,i*3+3),plot.imshow(images[i*3+2],'gray')
plot.title(titles[i*3+2]), plot.xticks([]), plot.yticks([])
plot.show()
imageFolderPath = 'D:\Google Drive\Engineering\Senior Year\Image processing\Image processing group work'
imagePath = glob.glob(imageFolderPath + '/*.JPG')
im_array = np.array( [np.array(Image.open(img).convert('RGB')) for img in imagePath] )
temp=cv.imread("D:\Google Drive\Engineering\Senior Year\Image processing\Image processing group work\Fig ref.jpg")
cv.imshow('image', temp)
time.sleep(15)
for i in range(9):
cv.imshow('Image', im_array[i])
time.sleep(2)
plot.subplot(3,3,i*3+3),plot.imshow(images[i*3+2],'gray'): The second argument says you use gray color map. Get rid of it and you would get color displays.
I'm trying to detect the white dots in the following image using OpenCV and Python.
I tried using the function cv2.HoughCircles but without any success.
Do I need to use a different method?
This is my code:
import cv2, cv
import numpy as np
import sys
if len(sys.argv)>1:
filename = sys.argv[1]
else:
filename = 'p.png'
img_gray = cv2.imread(filename,cv2.CV_LOAD_IMAGE_GRAYSCALE)
if img_gray==None:
print "cannot open ",filename
else:
img = cv2.GaussianBlur(img_gray, (0,0), 2)
cimg = cv2.cvtColor(img,cv2.COLOR_GRAY2BGR)
circles = cv2.HoughCircles(img,cv2.cv.CV_HOUGH_GRADIENT,4,10,param1=200,param2=100,minRadius=3,maxRadius=100)
if circles:
circles = np.uint16(np.around(circles))
for i in circles[0,:]:
cv2.circle(cimg,(i[0],i[1]),i[2],(0,255,0),1)
cv2.circle(cimg,(i[0],i[1]),2,(0,0,255),3)
cv2.imshow('detected circles',cimg)
cv2.waitKey(0)
cv2.destroyAllWindows()
If you can reproduce a morphological reconstruction in OpenCV, you can easily build a h-dome transform which simplifies the task significantly. Otherwise, a simple threshold on a gaussian filtering might be enough too.
Binarize[FillingTransform[GaussianFilter[f, 2], 0.4, Padding -> 1]]
The gaussian filtering was done in the code above to effectively suppress the noise around the border of the input, which would remain after the h-dome transform otherwise.
Next there is the result of a simple threshold after a gaussian filtering (Binarize[GaussianFilter[f, 2], 0.5]) as well another result that is given by a direct binarization using Kapur's thresholding method (see the paper "A new method for gray-level picture thresholding using the entropy of the histogram" (which is no longer a new method, it is from 1985)):
The right image above has a lot of small points all over the border (which cannot be seen at this image resolution), but is fully automatic. From these 3 options, only the second one is already present in OpenCV.
I think a median filter will improve your image. Try to experiment with some kernels, 3x3 or 7x7. Then after that some (local) thresholding algorithm will get you shapes. You can either you HoughCircles, or just find contours and check them for roundness.
Convert the image to binary image using a suitable threshold technique (Otsu might help). Then use morphological operations like erosion to make circles smaller and then you can easily find their centers.