I have an image that contains shapes made out of lines. I would like to use a Python library (e.g. opencv or skimage) to identify the shapes.
So if the input is this:
I'm trying to get to an output like this:
I'm fairly new to CV techniques and have been trying several processing techniques in opencv documentation/tutorials, but haven't found any ideas that can emphasis the angle or clustering of the lines as I imagine I need to solve this problem.
I'm also open to machine learning based approaches to solving this problem but preferably ones that won't require me to generate an especially large dataset.
This isn't a perfect answer, but I've found some success using this Gaussian adaptive thresholding and Harris corner detection.
import cv2
img = cv2.imread("input.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
threshold = cv2.adaptiveThreshold(gray,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C,cv2.THRESH_BINARY,3,0)
img = cv2.cornerHarris(threshold, 35, 1, 0.2)
plt.imshow(img, cmap='binary')
It doesn't solve the problem completely but it manages to separate the shapes in a meaningful way which is a strong start for me.
Related
I am working with an IR camera and am trying to find out if we have any lens distortion. I am using the example from OpenCV here to guide my work. I used a chessboard template from here and attached it to the back of a book. Before taking any images I heated the book/paper and observed that the checkerboard pattern was coming in very clear.
I took ~50 still frames with the chessboard pattern tilted/moved so that every part of the frame contained some part of the pattern. An example of one of my images is below:
I used the following code which resulted in False for every image. I tried every combination of grid pattern sizes from (5-9, 5-9).
import numpy as np
import cv2
import glob2 as glob
import matplotlib.pyplot as plt
base = 'pathtoimages/'
files = glob.glob(base + '*.png')
for file in files:
img = cv2.imread(file)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ret, corners = cv2.findChessboardCorners(gray, (6,8), None)
print (ret)
I can't figure out why the algorithm is not finding the corners. Any ideas?
Edit May 30, 2019:
Today I took some more images with the camera. I took the photos in a more controlled environment without any external light sources present. These new images still fail the findchessboard corner detection. I tried increasing the contrast and brightness using cv2.convertScaleAbs to produce the following image as an example.
This also fails. If I use cv2.goodFeaturesToTrack to find corners I get the following result:
It seems like the Opencv corner detection algorithm is actively avoiding my chess board corners. It will find any corner it can before finding one on the chessboard. I am truly stumped here.
As a sanity check I made sure that openCV can find the corners on the original chessboard I am using and it worked perfectly.
Any ideas?
Edit June 4, 2019:
I ended up writing a script that allows me to manually assign each of the corners. I was able to get the camera distortion model successfully. I still have no solution for why the corners couldn't automatically be found by openCV. I think if I were to do this again in IR, I would make a custom grid that increases the contrast between grid cells simply due to differing thermal properties between "white" and "black" grid cells (use different materials).
I got findChessboardCorners to work with 2 adjustments.
As said in one of the comments, openCV expects a white boarder
around the chessboard. To achieve that you can 'invert' the image,
essentially creating the equivalent to a photographic negative.
Before I called findChessboardCorners, I did: image_inverted =
numpy.array(256 – image_original, dtype=uint8)
Only use the cv2.CALIB_CB_ADAPTIVE_THREASH flag when calling
findChessboardCorners. CALIB_CB_FAST_CHECK and
CALIB_CB_NORMALIZE_IMAGE flags seem to make findChessboardCorners
not find the corners.
Good Luck
I'm trying to do OCR of a scanned document which has handwritten signatures in it. See the image below.
My question is simple, is there a way to still extract the names of the people using OCR while ignoring the signatures? When I run Tesseract OCR it fails to retrieve the names. I tried grayscaling/blurring/thresholding, using the code below, but without luck. Any suggestions?
image = cv2.imread(file_path)
image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
image = cv2.GaussianBlur(image, (5, 5), 0)
image = cv2.threshold(image, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]
You can use scikit-image's Gaussian filter to blur thin lines first (with an appropriate sigma), followed by binarization of image (e.g., with some thresholding function), then by morphological operations (such as remove_small_objects or opening with some appropriate structure), to remove the signatures mostly and then try classification of the digits with sliding window (assuming that one is already trained with some blurred characters as in the test image). The following shows an example.
from skimage.morphology import binary_opening, square
from skimage.filters import threshold_minimum
from skimage.io import imread
from skimage.color import rgb2gray
from skimage.filters import gaussian
im = gaussian(rgb2gray(imread('lettersig.jpg')), sigma=2)
thresh = threshold_minimum(im)
im = im > thresh
im = im.astype(np.bool)
plt.figure(figsize=(20,20))
im1 = binary_opening(im, square(3))
plt.imshow(im1)
plt.axis('off')
plt.show()
[EDIT]: Use Deep Learning Models
Another option is to pose the problem as an object detection problem where the alphabets are objects. We can use deep learning: CNN/RNN/Fast RNN models (with tensorflow/keras) for object detection or Yolo model (refer to the this article for car detection with yolo model).
I suppose the input pictures are grayscale, otherwise maybe the different color of the ink could have a distinctive power.
The problem here is that, your training set - I guess - contains almost only 'normal' letters, without the disturbance of the signature - so naturally the classifier won't work on letters with the ink of signature on them. One way to go could be to extend the training set with letters of this type. Of course it is quite a job to extract and label these letters one-by-one.
You can use real letters with different signatures on them, but it might be also possible to artificially generate similar letters. You just need different letters with different snippets of signatures moved above them. This process might be automated.
You may try to preprocess the image with morphologic operations.
You can try opening to remove the thin lines of the signature. The problem is that it may remove the punctuation as well.
image = cv2.imread(file_path)
image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
kernel = cv2.getStructuringElement(cv2.MORPH_CROSS,(5,5))
image = cv2.morphologyEx(image, cv2.MORPH_OPEN, kernel)
You may have to alter the kernel size or shape. Just try different sets.
You can try other OCR providers for the same task. For example, https://cloud.google.com/vision/ try this. You can upload an image and check for free.
You will get a response from API from where you can extract the text which you need. Documentation for extracting that text is also given on the same webpage.
Check out this. this will help you in fetching that text. this is my own answer when I faced the same problem. Convert Google Vision API response to JSON
I have two different types of images (which I cannot post due to reputation, so I've linked them.):
Image 1 Image 2
I was trying to extract hand features from the images using OpenCV and Python. Which kinda looks like this:
import cv2
image = cv2.imread('image.jpg')
blur = cv2.GaussianBlur(image, (5,5), 0)
gray = cv2.cvtColor(blur, cv2.COLOR_BGR2GRAY)
retval, thresh1 = cv2.threshold(gray, 70, 255, / cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)
cv2.imshow('image', thresh1)
cv2.waitKey(0)
The result of which looks like this:
Image 1 Image 2
The change in background in the second image is messing with the cv2.threshold() function and its not getting the skin parts right. Is there a way to do this right?
As a follow up question, what is the best way to extract hand features? I tried a HaaR Cascade and I didn't really get results? Should I train my own cascade? What other options do I have?
It's hard to say based on a sample size of two images, but I would try OpenCV's Integral Channel Features (ChnFtrs), which are like supercharged Haar features that can take cues from colour as well as any other image channels you care to create and provide.
In any case, you are going to have to train your own cascades. Separate cascades for front and profile shots of course.
Take out your thresholding by skin colour, because as you've already noticed, it may throw away some or all of the hands depending on the actual subject's skin colour and lighting. ChnFtrs will do the skin detection for you more robustly than a fixed threshold can. (Though for future reference, all humans are actually orange :))
You could eliminate some false positives by only detecting within a bounding box of where you expect the hands to be.
Try both RGB and YUV channels to see what works best. You could also throw in the results of edge detection (say, Canny, maximised across your 3 colour channels) for good measure. At the end, you could cull channels which are underused to save processing if necessary.
If you have much variation in hand pose, you may need to group similar poses and train a separate ChnFtrs cascade for each group. Individual cascades do not have a branching structure, so they do not cope well when the positive samples are disjoint in parameter space. This is, AFAIK, a bit of an unexplored area.
A correctly trained ChnFtrs cascade (or several) may give you a bounding box for the hands, which will help in extracting hand contours, but it can't exclude invalid contours within the same bounding box. Most other object detection routines will also have this problem.
Another option, which may be better/simpler than ChnFtrs, is LINEMOD (a current favourite of mine). It has the advantage that there's no complex training process, nor any training time needed.
I am doing a license-plate recognition. I have crop out the plate but it is very blurred. Therefore I cannot split out the digits/characters and recognize it.
Here is my image:
I have tried to denoise it through using scikit image function.
First, import the libraries:
import cv2
from skimage import restoration
from skimage.filters import threshold_otsu, rank
from skimage.morphology import closing, square, disk
then, I read the image and convert it to gray scale
image = cv2.imread("plate.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
I try to remove the noise:
denoise = restoration.denoise_tv_chambolle(image , weight=0.1)
thresh = threshold_otsu(denoise)
bw = closing(denoise > thresh, square(2))
What I got is :
As you can see, all the digits are mixed together. Thus, I cannot separate them and recognize the characters one by one.
What I expect is something like this (I draw it):
I am looking for help how can I better filter the image? Thank you.
=====================================================================
UPDATE:
After using skimage.morphology.erosion, I got:
First, this image seems to be more defaced by blur, than by noize, so there are no good reasons to denoise it, try debluring instead.
The simplest would be inverse filtering or even Wiener filtering. Then you'll need to separate image's background from letters by the level of luminosity for example with watershed algorithm. Then you'll get separate letters which you need to pass through one of classifiers for example, based on neural networks (even simplistic feed-forward net would be ok).
And then you'll finally get the textual representation. That's how such recognitions are usually made.
There's good book by Gonzalez&Woods, try looking for detailed explaination there.
I concur with the opinion that you should probably try to optimize the input image quality.
Number plate blur is a typical example of motion blur.
How well you can deblur depends upon how big or small is the blur radius.
Generally greater the speed of the vehicle, larger the blur radius and therefore more difficult to restore.
A simple solution that somewhat works is de-interlacing of images.
Note that it is only slightly more readable than your input image.
Here I have dropped every alternate line and resized the image to half its size using PIL/Pillow and this is what I get:
from PIL import Image
img=Image.open("license.jpeg")
size=list(img.size)
size[0] /= 2
size[1] /= 2
smaller_image=img.resize(size, Image.NEAREST)
smaller_image.save("smaller_image.png")
The next and more formal approach is deconvolution.
Since blurring is achieved using convolution of images, deblurring requires doing the inverse of convolution or deconvolution of the image. There are various kinds of deconvolution algorithms like the Wiener deconvolution,
Richardson-Lucy method, Radon transform and a few types of Bayesian filtering.
You can apply Wiener deconvolution algorithm using this code. Play with the angle, diameter and signal to noise ratio and see if it provides some improvements.
The skimage.restoration module also provides implementation of both unsupervised_wiener and richardson_lucy deconvolution.
In the code below I have shown both the implementations but you will have to modify the psf to see which one suits better.
import numpy as np
import matplotlib.pyplot as plt
import cv2
from skimage import color, data, restoration
from scipy.signal import convolve2d as conv2
img = cv2.imread('license.jpg')
licence_grey_scale = color.rgb2gray(img)
psf = np.ones((5, 5)) / 25
# comment/uncomment next two lines one by one to see unsupervised_wiener and richardson_lucy deconvolution
deconvolved, _ = restoration.unsupervised_wiener(licence_grey_scale, psf)
deconvolved = restoration.richardson_lucy(licence_grey_scale, psf)
fig, ax = plt.subplots()
plt.gray()
ax.imshow(deconvolved)
ax.axis('off')
plt.show()
Unfortunately most of these deconvolution alogirthms require you to know in advance the
blur kernel (aka the Point Spread Function aka PSF).
Here since you do not know the PSF, so you will have to use blind deconvolution.
Blind deconvolution attempts to estimate the original image without any knowledge of the blur kernel.
I have not tried this with your image but here is a Python implementation of blind deconvolution algorithm:
https://github.com/alexis-mignon/pydeconv
Note that an effective general purpose blind deconvolution algorithms has not yet been found and is an active field of research.
ChanVeseBinarize with an image enhanced binarized kernel gave me this result. This is helpful to highlight 4,8,1 and 2. I guess you need to do separate convolution with each character and if the peak of the convolution is higher than a threshold we can assume that letter to be present at the location of the peak. To take care of distortion, you need to do the convolution with few different types of font of a given character.
Another potential improvement using derivative filter and little bit of Gaussian smoothing. The K & X are not as distorted as the previous solution.
I am working on a project that requires detecting lines on a plate of sand. The lines are hand-drew by user so that are not exactly "straight" (see photo). And because of the sand, the lines are quite hard to distinguish.
I tried cv2.HoughLines from OpenCV but didn't achieve good results. So any suggestion on the detecting method? And welcome for suggestion to improve the clarity of the lines. I am thinking of putting a few led light surrounding the plate.
Thanks
The detecting method depends a lot on how much generality you require: is the exposure and contrast going to change from one image to another? Is the typical width of lines going to change? In the following, I assume that such parameters do not vary much for your applications, please correct me if I'm wrong.
I'll be using scikit-image, a common image processing package for Python. If you're not familiar with this package, documentation can be found on http://scikit-image.org/, and the package is bundled with all installations of Scientific Python. However, the algorithms that I use are also available in other tools, like opencv.
My solution is written below. Basically, the principle is
first, denoise the image. Life is usually simpler after a denoising step. Here I use a total variation filter, since it results in a piecewise-constant image that will be easier to threshold. I enhance dark regions using a morphological erosion (on the gray-level image).
then apply an adaptive threshold that varies locally in space, since the contrast varies through the image. This operation results in a binary image.
erode the binary image to break spurious links between regions, and keep only large regions.
compute a measure of the elongation of the regions to keep only the most elongated ones. Here I use the ratio of the eigenvalues of the inertia tensor.
Parameters that are the most difficult to tune is the block size for the adaptive thresholding, and the minimum size of regions to keep. I also tried a Canny filter on the denoised image (skimage.filters.canny), and results were quite good, but edges were not always closed, you might also want to try an edge-detection method such as a Canny filter.
The result is shown below:
# Import modules
import numpy as np
from skimage import io, measure, morphology, restoration, filters
from skimage import img_as_float
import matplotlib.pyplot as plt
# Open the image
im = io.imread('sand_lines.png')
im = img_as_float(im)
# Denoising
tv = restoration.denoise_tv_chambolle(im, weight=0.4)
ero = morphology.erosion(tv, morphology.disk(5))
# Threshold the image
binary = filters.threshold_adaptive(ero, 181)
# Clean the binary image
binary = morphology.binary_dilation(binary, morphology.disk(8))
clean = morphology.remove_small_objects(np.logical_not(binary), 4000)
labels = measure.label(clean, background=0) + 1
# Keep only elongated regions
props = measure.regionprops(labels)
eigvals = np.array([prop.inertia_tensor_eigvals for prop in props])
eigvals_ratio = eigvals[:, 1] / eigvals[:, 0]
eigvals_ratio = np.concatenate(([0], eigvals_ratio))
color_regions = eigvals_ratio[labels]
# Plot the result
plt.figure()
plt.imshow(color_regions, cmap='spectral')