I want to eliminate the gray lines in the 16-bit image you can see.
My final goal is to remove the lines in the object image (second image) using the background image (first image).
I thought this might need an FFT, but I don't know how an FFT would be applied here. There may well be other ways, too.
Please help me.
One simple way using Python/OpenCV is to apply a morphological close several times with a small vertical rectangular kernel.
Input:
import cv2
import numpy as np

# read the image; IMREAD_UNCHANGED preserves the 16-bit depth
# (the default flag would convert the image to 8-bit)
img = cv2.imread('lines.png', cv2.IMREAD_UNCHANGED)

# apply a morphological close several times with a thin vertical kernel
# (1 wide x 5 tall) to fill in the dark horizontal lines
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1, 5))
result = cv2.morphologyEx(img, cv2.MORPH_CLOSE, kernel, iterations=9)

# write result to disk
cv2.imwrite("lines_removed.png", result)

# display it
cv2.imshow("result", result)
cv2.waitKey(0)
However, note that it will slightly modify the image everywhere, not only along the lines.
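Since you also have a background image containing the same line pattern, another option is a flat-field style correction: divide the object image by the background so the pattern cancels out. This is only a rough sketch, assuming both images are aligned, the same size, and the lines look identical in both; the file names are placeholders.

import cv2
import numpy as np

# assumed inputs: 'background.png' (lines only) and 'object.png'
# (object plus lines), both 16-bit and aligned
bg = cv2.imread('background.png', cv2.IMREAD_UNCHANGED).astype(np.float32)
obj = cv2.imread('object.png', cv2.IMREAD_UNCHANGED).astype(np.float32)

# dividing by the background cancels the line pattern where it is
# identical in both images; rescaling by the background mean keeps
# the overall brightness comparable
corrected = obj / np.maximum(bg, 1.0) * bg.mean()
corrected = np.clip(corrected, 0, 65535).astype(np.uint16)

cv2.imwrite('lines_removed_flatfield.png', corrected)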
I have images containing handwritten circular annotations on printed text. I want to remove these annotations from the input image. I have tried some of the thresholding methods discussed in many threads on Stack Overflow, but my results are not as I expected.
The method I am using works really well when the annotation is made with a blue pen, but when it is made with a black pen, thresholding and erosion do not produce the expected output.
Here is a sample image of my achieved results on blue annotations with the thresholding and erosion method
Image (input on the left and output on the right)
Code
import cv2
import numpy as np
from google.colab.patches import cv2_imshow

img = cv2.imread("/content/Scan_0101.jpg")
cv2_imshow(img)

# use the blue channel, where blue pen strokes stay bright and
# therefore threshold to white together with the background
wimg = img[:, :, 0]
ret, thresh = cv2.threshold(wimg, 120, 255, cv2.THRESH_BINARY)
cv2_imshow(thresh)

# erode to thin the remaining dark strokes
kernel = np.ones((3, 3), np.uint8)
erosion = cv2.erode(thresh, kernel, iterations=1)
mask = cv2.bitwise_or(erosion, thresh)
#cv2_imshow(erosion)

# broadcast the single-channel mask to 3 channels and whiten
# the masked (non-text) area of the original image
white = np.ones(img.shape, np.uint8) * 255
white[:, :, 0] = mask
white[:, :, 1] = mask
white[:, :, 2] = mask
result = cv2.bitwise_or(img, white)
erosion = cv2.erode(result, kernel, iterations=1)
Here is a sample image of my achieved results on black annotations with the thresholding and erosion method
Image (input on the left and output on the right)
Is there any suggested approach for this problem, or can this code be modified to produce the required results?
You must understand that since the gray values of the printed text and of the handwriting are in the same range, no thresholding method in the world can work.
In fact, no algorithm at all can succeed without "hints" about what characters look like or don't look like. Even the stroke thickness is not distinctive enough.
The only possible indication is that the circles are made of a smooth and long stroke. And removing them where they cross the characters is just impossible.
Some parts of the handwritten circles (in the line-spacing regions) may be extractable, under the assumption that many letters align on the same line (see the sketch below). In your image, I think the upper and lower parts of the circle would be extracted.
Then, if you track the black line starting from the extracted parts (assuming smooth curvature), you may be able to detect the connected handwritten circle.
However, in practice I think such a process will run into many difficulties, especially the fact that characters will be cut off when the curve is removed.
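A rough sketch of the extraction idea above, assuming the page is available as a grayscale scan; the file name and both thresholds are placeholders to be tuned.

import cv2
import numpy as np

# assumption: 'scan.png' is the grayscale page; ink pixels are dark
img = cv2.imread('scan.png', cv2.IMREAD_GRAYSCALE)
ink = (img < 128).astype(np.uint8)

# horizontal projection: amount of ink per row; rows with little ink
# are treated as line-spacing regions between text lines
row_ink = ink.sum(axis=1)
spacing_rows = row_ink < 0.05 * img.shape[1]  # guessed threshold

# keep only the ink that falls into spacing rows -- with luck, mostly
# the upper and lower arcs of the handwritten circle
circle_parts = np.zeros_like(ink)
circle_parts[spacing_rows] = ink[spacing_rows]
cv2.imwrite('circle_parts.png', circle_parts * 255)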
I have a contour represented in a NumPy array in such a way that the boundary points are 1 and the rest are 0. An example image is shown below. How could I smooth this contour?
I am trying to get a contour that is smoother than the one in the image right now.
You can use active_contour (from scikit-image) to fit a spline with a set number of points to your contour. If you lower the number of points, you get a smoother contour.
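A minimal sketch of that idea, assuming the contour image is saved as 'contour.png' and the shape sits roughly in the middle of the frame; the snake parameters are guesses to be tuned.

import numpy as np
from skimage import io
from skimage.filters import gaussian
from skimage.segmentation import active_contour

img = io.imread('contour.png', as_gray=True)

# initial snake: a circle of 50 points around the shape;
# fewer points give a smoother final contour
s = np.linspace(0, 2 * np.pi, 50)
r = img.shape[0] / 2 + (img.shape[0] / 2 - 5) * np.sin(s)
c = img.shape[1] / 2 + (img.shape[1] / 2 - 5) * np.cos(s)
init = np.stack([r, c], axis=1)  # (row, col) in recent scikit-image

# blur the image first so the snake has smooth gradients to latch onto
snake = active_contour(gaussian(img, sigma=3), init,
                       alpha=0.015, beta=10, gamma=0.001)
# 'snake' is an (N, 2) array of points on the smoothed contour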
You can use morphological transformations such as erosion and dilation. OpenCV has a nice tutorial here: https://docs.opencv.org/master/d9/d61/tutorial_py_morphological_ops.html
As @Mad Physicist says, you can use morphological techniques. I recommend an opening (erosion followed by dilation), followed by an extra dilation, as follows:
import cv2
import numpy as np

img = cv2.imread('img.png', 0)

# open with a large elliptical kernel to remove jagged protrusions,
# then dilate once to restore some of the lost area
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (11, 11))
opening = cv2.morphologyEx(img, cv2.MORPH_OPEN, kernel, iterations=5)
opening = cv2.dilate(opening, kernel, iterations=1)

# show input and result side by side
img_window = np.hstack((img, opening))
cv2.imshow("image", img_window)
cv2.waitKey(0)
cv2.destroyAllWindows()
Results:
Problem:
I'm working with a dataset that contains many images that look something like this:
Now I need all these images to be oriented horizontally or vertically, such that the color palette is either at the bottom or the right side of the image. This can be done by simply rotating the image, but the tricky part is figuring out which images should be rotated and which shouldn't.
What I have tried:
I thought that the best way to do this is by detecting the white line that separates the color palette from the image. I decided to rotate all images that have the palette at the bottom so that they have it on the right side.
# yes I am mixing PIL and OpenCV (I like the PIL resizing more)
import PIL.Image
import cv2
import numpy as np

# img is a PIL image loaded earlier; resize it to 128 x 128 pixels
img = img.resize((128, 128), PIL.Image.BILINEAR)
img = np.array(img)

# perform edge detection; not sure if these are the best parameters for Canny
edges = cv2.Canny(img, 30, 50, apertureSize=3)

has_line = False
# take a numpy slice of the area where the white line usually is
# (not always exactly the same spot, which probably has to do with the resizing)
for line in edges[75:80]:
    # check if one of the rows consists mostly of white pixels
    counts = np.bincount(line)
    if np.argmax(counts) == 255:
        has_line = True

# rotate if we found such a line
if has_line:
    img = np.rot90(img)
An example of it working correctly:
An example of it working incorrectly:
This works on maybe 98% of the images, but there are some cases where it rotates images that shouldn't be rotated, or fails to rotate images that should be. Maybe there is an easier way to do this, or a more elaborate way that is more consistent? I could do it manually, but I'm dealing with a lot of images. Thanks for any help and/or comments.
Here are some images where my code fails for testing purposes:
You can start by thresholding your image with a very high threshold, like 250, to take advantage of the fact that your lines are white. This makes the whole background black. Now create a special horizontal kernel with a shape like (1, 15) and erode your image with it. This removes the vertical lines from the image, leaving only the horizontal lines.
import cv2
import numpy as np
img = cv2.imread('horizontal2.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, thresh = cv2.threshold(gray, 250, 255, cv2.THRESH_BINARY)
kernel_hor = np.ones((1, 15), dtype=np.uint8)
erode = cv2.erode(thresh, kernel_hor)
As stated in the question, the color palettes can only be on the right or the bottom, so we can test how many contours the right region has. For this, just divide the image in half and take the right part. Before finding contours, dilate the result with a normal (3, 3) kernel to fill in any gaps. Then find the contours using cv2.RETR_EXTERNAL and count them; if the count is greater than a certain number, the image is the correct side up and there is no need to rotate.
right = erode[:, erode.shape[1]//2:]
kernel = np.ones((3, 3), dtype=np.uint8)
right = cv2.dilate(right, kernel)
cnts, _ = cv2.findContours(right, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
if len(cnts) > 3:
    print('No need to rotate')
else:
    print('rotate')
    #ADD YOUR ROTATE CODE HERE
P.S. I tested for all four images you have provided and it worked well. If in case it does not work for any image let me know.
I'm working on my bachelor's degree final project and I want to create an OCR system for bottle inspection with Python. I need some help with text recognition from the image. Do I need to apply the cv2 operations in a better way, train Tesseract, or should I try another method?
I tried image processing operations on the image and I used pytesseract to recognize the characters.
Using the code below, I got from this photo:
to this one:
and then to this one:
Sharpen function (using the imgaug library):
import imgaug.augmenters as iaa

def sharpen(img):
    # sharpen the image at full strength
    sharpen = iaa.Sharpen(alpha=1.0, lightness=1.0)
    sharpen_img = sharpen.augment_image(img)
    return sharpen_img
Image processing code:
import cv2
import pytesseract

# text zone cropped from the original image (x, y, w, h come from an
# earlier detection step), sharpened and upsampled
textZone = cv2.pyrUp(sharpen(originalImage[y:y + h - 1, x:x + w - 1]))
sharp = cv2.cvtColor(textZone, cv2.COLOR_BGR2GRAY)
ret, thresh = cv2.threshold(sharp, 127, 255, cv2.THRESH_BINARY)

# OpenCV morphology treats white pixels as foreground; since the text here
# is dark on a light background, the operations behave inverted -- that's
# why opening is done with MORPH_CLOSE, dilation with erode, and so on
kernel_open = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
opened = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel_open)
kernel_dilate = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 7))
dilated = cv2.erode(opened, kernel_dilate)
kernel_close = cv2.getStructuringElement(cv2.MORPH_RECT, (1, 5))
closed = cv2.morphologyEx(dilated, cv2.MORPH_OPEN, kernel_close)

print(pytesseract.image_to_string(closed))
This is the result of pytesseract.image_to_string:
22203;?!)
92:53 a
The expected result is :
22/03/20
02:53 A
"Do I need to apply the cv2 operations in a better way, train tesseract or should I try another method?"
First, kudos for taking this project on and getting this far with it. What you have from the OpenCV/cv2 standpoint looks pretty good.
Now, if you're thinking of Tesseract to carry you the rest of the way, at the very least you'll have to train it. Here you have a tough choice: Invest in training Tesseract, or work up a CNN to recognize a limited alphabet. If you have a way to segment the image, I'd be tempted to go with the latter.
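Before committing to full training, one cheap experiment is to restrict Tesseract to the characters that can actually occur on the stamp via the tessedit_char_whitelist config variable. This is only a sketch, assuming your final binary image is saved to disk (the file name is a placeholder); depending on your Tesseract version, the whitelist may only apply to the legacy engine.

import pytesseract
from PIL import Image

# restrict recognition to the characters the date/time stamp can contain
config = r'--psm 6 -c tessedit_char_whitelist=0123456789/:APM'
text = pytesseract.image_to_string(Image.open('close.png'), config=config)
print(text)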
From the result you got and the expected result, you can see that some of the characters are recognized correctly. Assuming you are using a different image from the one shown in the tutorial, I recommend changing the values passed to threshold and getStructuringElement.
These values work better or worse depending on the image; the tutorial author must have optimized them for his/her use case (by trial and error or some other way).
Here is a video if you want to play around with those values using sliders in OpenCV; a sketch of the idea follows below. You can also print your result in the same loop to see if you are getting the desired output.
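A minimal sketch of that slider idea, assuming the grayscale text image is saved as 'text.png' (a placeholder name):

import cv2

img = cv2.imread('text.png', cv2.IMREAD_GRAYSCALE)

def on_change(val):
    # re-threshold with the slider value and refresh the window
    _, thresh = cv2.threshold(img, val, 255, cv2.THRESH_BINARY)
    cv2.imshow('thresh', thresh)

cv2.namedWindow('thresh')
cv2.createTrackbar('value', 'thresh', 127, 255, on_change)
on_change(127)
cv2.waitKey(0)
cv2.destroyAllWindows()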
One potential thing you could do to improve recognition is to dilate the characters so pytesseract gives a better result. Dilating will connect the individual blobs together and can fix the / or the A characters. So, starting with your latest binary image:
Original
Dilate with a 3x3 kernel with iterations=1 (left) or iterations=2 (right). You can experiment with other values, but don't overdo it or the characters will all connect. Maybe this will give a better result with your OCR.
import cv2

image = cv2.imread("1.PNG")

# invert so the characters are white (foreground) before dilating
thresh = cv2.threshold(image, 115, 255, cv2.THRESH_BINARY_INV)[1]
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
dilate = cv2.dilate(thresh, kernel, iterations=1)

# invert back to black text on white for the OCR
final = cv2.threshold(dilate, 115, 255, cv2.THRESH_BINARY_INV)[1]

cv2.imshow('image', image)
cv2.imshow('dilate', dilate)
cv2.imshow('final', final)
cv2.waitKey(0)
I have a binarized image, where yellow = 1 (mask) and purple = 0 (background):
I can filter out the whole mask if it is smaller than some threshold, this way:
import numpy as np

def filter_image(img):
    # discard the whole mask if it has fewer than 10 foreground pixels
    if img.sum() < 10:
        return np.zeros(img.shape)
    else:
        return img
However, how do I get rid of these small yellow points?
That can easily be done with a morphological transformation.
You can check the docs here
Something like this should work:
import cv2
import numpy as np

img = cv2.imread('some_image.png', 0)
kernel = np.ones((5, 5), np.uint8)

# close small gaps and holes in the mask
closing = cv2.morphologyEx(img, cv2.MORPH_CLOSE, kernel)
You can play with the kernel size to close holes up to a given size.
I hope it helped.
Maybe you can use sequences of morphological dilations and erosions to remove this noise. These combinations are also known as opening (erosion followed by dilation) and closing (dilation followed by erosion); a sketch follows below.
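A minimal sketch of an opening that removes small foreground specks, assuming the binary mask is stored as 'mask.png' with foreground = 255 (both names are placeholders):

import cv2
import numpy as np

mask = cv2.imread('mask.png', cv2.IMREAD_GRAYSCALE)

# opening = erosion then dilation: specks smaller than the kernel vanish,
# while larger regions keep roughly their original size
kernel = np.ones((5, 5), np.uint8)
opened = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)

cv2.imwrite('mask_cleaned.png', opened)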