I would like to create a panoramic image by combining 2 images in which the same feature, a plus sign, appears.
I've used cv2.xfeatures2d.SIFT_create() to find keypoints in the images; however, it doesn't detect the plus symbol very well. Is there some way I can improve this by making it search specifically for a plus-shaped feature?
import cv2

image1 = cv2.imread('example_image.png')
# Convert to greyscale first; SIFT works on single-channel images.
grey_image1 = cv2.cvtColor(image1, cv2.COLOR_BGR2GRAY)

sift = cv2.xfeatures2d.SIFT_create()
kp = sift.detect(grey_image1, None)
kp_image = cv2.drawKeypoints(grey_image1, kp, None)

def showimage(image, name="No name given"):
    cv2.imshow(name, image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

showimage(kp_image)
The source image is here, the second image to align is here. Here is the resulting image from the code above. This is an example of the desired output, made using GIMP and manually aligning the two images (the second image will need to be transformed to fit properly).
NB I'm open to using other approaches outside of OpenCV/Python to solve this problem.
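One alternative worth trying, since the plus sign has a fixed shape, is template matching with cv2.matchTemplate. Below is a minimal sketch, assuming you crop a small template image containing just the plus sign ('plus_template.png' is a hypothetical filename):

import cv2

image = cv2.imread('example_image.png', cv2.IMREAD_GRAYSCALE)
template = cv2.imread('plus_template.png', cv2.IMREAD_GRAYSCALE)  # small crop of just the plus sign

# Normalised cross-correlation is fairly robust to moderate brightness changes.
result = cv2.matchTemplate(image, template, cv2.TM_CCOEFF_NORMED)
_, max_val, _, max_loc = cv2.minMaxLoc(result)
print("best match at", max_loc, "with score", max_val)

Note that matchTemplate is not scale- or rotation-invariant, so this works best if the plus sign appears at roughly the same size and orientation in both images.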
Related
I am trying to recognize handwritten digits. Say that I have the following image:
My goal is to smooth out the extremal features of the contours and keep only the shape of the white trace, like below:
I first applied thresholding with cv2.THRESH_BINARY_INV to remove the noise.
Then I tried applying cv2.erode() with np.ones((5,5)) as the kernel, but the resulting figure still had the extremal points.
I think applying cv2.findContours() may help to get the desired shape, but I would end up with two contours, one for the inner and another for the outer part. Any ideas would be much appreciated!
Edit:
Thanks to @stateMachine, I managed to get a skeleton of the digit. I applied cv2.ximgproc.thinning(), followed by cv2.GaussianBlur() and cv2.MORPH_CLOSE. If the extremal points of this image can be smoothed a bit more it would be perfect. I am still open to any ideas :)
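For reference, a minimal sketch of the pipeline described above ('digit.png' is a hypothetical filename; the kernel sizes are assumptions to tune):

import cv2

img = cv2.imread('digit.png', cv2.IMREAD_GRAYSCALE)
_, binary = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)  # thinning expects a binary image
thinned = cv2.ximgproc.thinning(binary)
blurred = cv2.GaussianBlur(thinned, (5, 5), 0)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
closed = cv2.morphologyEx(blurred, cv2.MORPH_CLOSE, kernel)
cv2.imwrite('digit_smoothed.png', closed)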
Maybe what you are looking for is the shape's skeleton. The skeleton is part of OpenCV's extended image processing module (pip install opencv-contrib-python). You can compute the skeleton of your image like this:
# Imports:
import cv2
# Image path
path = "D://opencvImages//"
fileName = "OKwfZ.png"
# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)
# To Grayscale:
grayscaleImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
# Compute the skeleton:
skeleton = cv2.ximgproc.thinning(grayscaleImage, None, 1)
cv2.imshow("Skeleton", skeleton)
cv2.waitKey(0)
This is the result:
The skeleton normalizes the thickness of the image to 1 pixel. If you need a thicker line you can apply some dilations.
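For example, a couple of dilations on the skeleton (the kernel size and iteration count are assumptions to tune):

import cv2

# Assuming `skeleton` comes from the snippet above.
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
thicker = cv2.dilate(skeleton, kernel, iterations=2)
cv2.imshow("Thicker skeleton", thicker)
cv2.waitKey(0)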
I am trying to crop an image of a piece of card/paper or similar so that only the card/paper remains. I tried the code below, but the problem is that it only works when the object in question is alone in the picture. If the background is blank with nothing else in it, the cropping is flawless; otherwise it does not work as expected.
I am attempting to create a system which crops different kinds of images, puts them through a classifier, and then extracts text from them.
import cv2
import numpy as np
filenames = "img.jpg"
img = cv2.imread(filenames)
blurred = cv2.blur(img, (3,3))
canny = cv2.Canny(blurred, 50, 200)
## find the non-zero min-max coords of canny
pts = np.argwhere(canny>0)
y1,x1 = pts.min(axis=0)
y2,x2 = pts.max(axis=0)
## crop the region
cropped = img[y1:y2, x1:x2]
filename_cropped = filenames.split('.')
filename_cropped[0] = filename_cropped[0] + '_cropped'
filename_cropped = '.'.join(filename_cropped)
cv2.imwrite(filename_cropped, cropped)
A sample image that works is:
Something that does not work is:
Can anyone help with this?
The first image works because the entire image besides your target is empty. Canny will give different results when there is more in the image.
If you are looking for those specific cards, I suggest you try some colour filtering first. You can try to filter for the blue/purple hue of the card.
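A rough sketch of that idea in HSV space (the hue bounds are assumptions you will need to tune for your images):

import cv2
import numpy as np

img = cv2.imread('img.jpg')
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# Hypothetical blue/purple range; tune the bounds for your card.
lower = np.array([100, 50, 50])
upper = np.array([150, 255, 255])
mask = cv2.inRange(hsv, lower, upper)

# Crop to the largest matching region.
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
if contours:
    x, y, w, h = cv2.boundingRect(max(contours, key=cv2.contourArea))
    cropped = img[y:y+h, x:x+w]
    cv2.imwrite('img_cropped.jpg', cropped)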
Increasing the Canny threshold could also work, but you will still pick up the hand in this image unless you add some colour filtering.
You can also try Sobel edge detection. This will probably highlight the edges of the card pretty well. But then again, it will also show the hand, so you can't just take the Sobel/Canny output as-is. You need to add processing before it that isolates the card, or after it that can find the rectangular shape of the card in the Sobel/Canny result.
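For completeness, a minimal Sobel sketch (not a full solution on its own, since it will still show the hand):

import cv2

img = cv2.imread('img.jpg', cv2.IMREAD_GRAYSCALE)
# Gradients in x and y, combined into an edge-magnitude image.
sobel_x = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)
sobel_y = cv2.Sobel(img, cv2.CV_64F, 0, 1, ksize=3)
edges = cv2.convertScaleAbs(cv2.magnitude(sobel_x, sobel_y))
cv2.imwrite('sobel_edges.jpg', edges)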
I have two questions.
I am working with OpenCV and Python and I am trying to obtain an image's contours. I am successful at that, but when I try to see the difference between using cv2.drawContours() and letting cv2.findContours() edit the image directly (by not sending a copy of the original image as the source parameter), I cannot see anything happening on any of the images I tried.
I am trying to get the contours of a square I created with Paint's square tool. But when I use the cv2.CHAIN_APPROX_SIMPLE method, it gives me the coordinates of 6 points, and no combination of them fits my square. Why does it do that?
Can someone explain?
Here is my code for both problems:
import cv2
import numpy as np

image = cv2.imread(r"C:\Users\fazil\Desktop\12.png")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.Canny(gray, 75, 200)
gray = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY_INV)[1]
cv2.imshow("s", gray)

contours, hierarchy = cv2.findContours(gray, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
print(contours[1])
cv2.drawContours(image, contours, 1, (45, 67, 89), 5)

cv2.imshow("k", gray)
cv2.imshow("j", image)
cv2.waitKey(0)
cv2.destroyAllWindows()
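Regarding the first question, one quick check is to compare the input array before and after the call. Note that since OpenCV 3.2, cv2.findContours no longer modifies the source image, which would explain seeing nothing happen:

import cv2
import numpy as np

# Assuming `gray` comes from the snippet above.
gray_before = gray.copy()
contours, hierarchy = cv2.findContours(gray, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
print("input modified:", not np.array_equal(gray_before, gray))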
I am trying to recognize the text in a captcha, but I have not been able to. I am using Python 3, OpenCV and Tesseract.
The simplified code is:
import cv2
import pytesseract

img_path = "path"
img = cv2.imread(img_path)
img = cv2.resize(img, None, fx=2, fy=2, interpolation=cv2.INTER_LINEAR)
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
print(pytesseract.image_to_string(img))
I think I should remove the colour lines first, then leave the text alone, and maybe change the brightness and contrast. What filters could I apply?
These are some images to recognize.
For recognising captcha text using pytesseract, you can do the following:
Prepare a custom training set to train your Tesseract instance to recognise a specific font [optional].
The captcha images need some pre-processing (such as: apply a black & white filter > scale up > blur > morphological transformation + adaptive threshold) to enhance the text and reduce the noise/lines; a sketch of such a pipeline is below.
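(The block size, kernel size and thresholds here are assumptions to tune per captcha; 'captcha.png' is a hypothetical filename.)

import cv2

img = cv2.imread('captcha.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = cv2.resize(gray, None, fx=2, fy=2, interpolation=cv2.INTER_LINEAR)  # scale up
gray = cv2.GaussianBlur(gray, (3, 3), 0)  # soften thin lines
thresh = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                               cv2.THRESH_BINARY, 31, 10)
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2, 2))
clean = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel)  # opening removes small specks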
For removing lines: in the sample images only the text is black and there are no black lines, so you can simply convert each non-black pixel to white using PIL or OpenCV; you can even use a specific algorithm like the Hough Line Transform to detect and remove lines. For example:
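(The threshold of 60 below is an assumption; tune it per image.)

import cv2
import numpy as np

img = cv2.imread('captcha.png')  # hypothetical filename
mask = np.all(img < 60, axis=2)  # True where a pixel is near-black in all channels
result = np.full(img.shape[:2], 255, dtype=np.uint8)
result[mask] = 0  # text stays black, coloured lines become white
cv2.imwrite('captcha_text_only.png', result)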
You can learn about all these filters and algorithms from the official documentation and tutorials on the OpenCV website.
I am trying to extract data from a scanned form. The form has a standard format similar to the one shown in the image below:
I have tried using pytesseract (tesseract OCR) to detect the image's text and it has done a decent job at finding the text and converting the image to text.
However, it essentially just gives me all the detected text without keeping the format of the data.
I would like to be able to do something like the below:
Find a particular piece of text and then find the associated data below or beside it, similar to this question using OpenCV: Detect text region in image using Opencv
Is there a way that I can essentially do the following:
Either find all text boxes on the form, perform OCR on each box and see which one is the closest match to the "witnesses:" text, then find the sections immediately below it and perform separate OCR on those.
Or, if the form is standard and I know the approximate location of the "witness" text section, can I specify its general location in OpenCV and then just extract the text below it and perform OCR on it?
EDIT: I have tried the code below to detect specific regions of the text. However, it is not specifically identifying the text, just all regions.
import cv2

img = cv2.imread('t2.jpg')
mser = cv2.MSER_create()

# Scale the image up so MSER picks up the small text regions better.
img = cv2.resize(img, (img.shape[1]*2, img.shape[0]*2))
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
vis = img.copy()

regions = mser.detectRegions(gray)
hulls = [cv2.convexHull(p.reshape(-1, 1, 2)) for p in regions[0]]
cv2.polylines(vis, hulls, 1, (0, 255, 0))

cv2.imshow('img', vis)
cv2.waitKey(0)
Here is the result:
I think you have the answer already in your own post.
I recently did something similar and this is how I did it:
import cv2
import pytesseract
from PIL import Image

# id_image was loaded with cv2.imread
temp_image = id_image[start_y:end_y, start_x:end_x]
img = Image.fromarray(temp_image)
text = pytesseract.image_to_string(img, config="-psm 7")  # newer Tesseract versions use "--psm 7"
So basically, if your format is predefined, you just need to know the location of the fields that you want the text of (which you already know), crop them, and then apply the OCR (Tesseract) extraction.
In this case you need to import pytesseract, PIL, cv2 and numpy.
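If you instead want the first option from the question (finding the label text and then reading the region near it), a rough sketch with pytesseract.image_to_data might look like this (the label string, offsets and multipliers are assumptions to tune):

import cv2
import pytesseract
from pytesseract import Output

img = cv2.imread('t2.jpg')
data = pytesseract.image_to_data(img, output_type=Output.DICT)

# Look for the word "Witnesses:" and OCR a region just below it.
for i, word in enumerate(data['text']):
    if word.strip().lower().startswith('witness'):
        x, y, w, h = data['left'][i], data['top'][i], data['width'][i], data['height'][i]
        below = img[y + h : y + 4 * h, x : x + 10 * w]  # rough region below the label
        print(pytesseract.image_to_string(below))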