I am trying to detect lines in a certain image. I run it through a skeletonization process before applying cv2.HoughLinesP. I used the skeletonization code here.
No matter what I try, I keep getting results similar to what is described here, i.e. 'only fragments of a line...'
As suggested by Jiby, I use named notation for the parameters, and also a high rho and theta, but to no avail.
Here is my code:
lines = cv2.HoughLinesP(skel, rho=5, theta=np.deg2rad(10), threshold=0, minLineLength=0, maxLineGap=0)
Prior to this I threshold an RGB image to extract most of my 'blue' hollow rectangle. Then I convert it to grayscale, which I then feed to the skeletonizer.
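For reference, here is a minimal sketch of the pipeline as I run it (the file name and the blue bounds are placeholders, and skimage.morphology.skeletonize stands in for the linked skeletonization code):
import cv2
import numpy as np
from skimage.morphology import skeletonize

img = cv2.imread('rectangles.png')                  # hypothetical file name
# keep the mostly-blue pixels (BGR bounds are placeholders to tune)
mask = cv2.inRange(img, (100, 0, 0), (255, 80, 80))
# stand-in for the linked skeletonization code
skel = skeletonize(mask > 0).astype(np.uint8) * 255
lines = cv2.HoughLinesP(skel, rho=5, theta=np.deg2rad(10), threshold=0,
                        minLineLength=0, maxLineGap=0)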
Please advise.
I'm writing a script to process an image and extract a PDF417 2D barcode, which will then be decoded. I'm extracting the ROI without problems, but when I try to correct the perspective using cv2.warpPerspective, the result is not as expected.
The following is the extracted barcode, the red dots are the detected corners:
This is the resulting image:
This is the code I'm using for the transformation (the values are found by the script; for the images above they are as follows):
box_or = np.float32([[23, 30],[395, 23],[26, 2141],[389, 2142]])
box_fix = np.float32([[0,0],[415,0],[0,2159],[415,2159]])
M = cv2.getPerspectiveTransform(box_or,box_fix)
warped = cv2.warpPerspective(img,M,(cols,rows))
I've checked and I can't find anything wrong with the code, yet the transformation is definitely wrong. The amount of perspective distortion in the extracted ROI is minimal, but it may affect the decoding process.
So, is there a way to get rid of the perspective distortion? Am I doing something wrong? Is this a known bug or something? Any help is very much welcome.
BTW, I'm using OpenCV 3.3.0
It looks like you're giving the image coordinates as (y, x). I know the interpretation of coordinates varies within OpenCV.
In the homography example code they provide the coordinates as (x,y) - at least based on their use of 'h' and 'w' in this snippet:
h,w = img1.shape
pts = np.float32([ [0,0],[0,h-1],[w-1,h-1],[w-1,0] ]).reshape(-1,1,2)
dst = cv2.perspectiveTransform(pts,M)
So try providing the coordinates as (x,y) to both getPerspectiveTransform and warpPerspective.
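A minimal sketch of that change, reusing the arrays from the question (this assumes they are currently in (y, x) order; if so, the output size passed to warpPerspective may need swapping as well):
box_or = np.float32([[23, 30], [395, 23], [26, 2141], [389, 2142]])
box_fix = np.float32([[0, 0], [415, 0], [0, 2159], [415, 2159]])

# swap each point from (y, x) to (x, y) before building the transform
box_or_xy = np.ascontiguousarray(box_or[:, ::-1])
box_fix_xy = np.ascontiguousarray(box_fix[:, ::-1])

M = cv2.getPerspectiveTransform(box_or_xy, box_fix_xy)
# dsize for warpPerspective is (width, height)
warped = cv2.warpPerspective(img, M, (cols, rows))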
I have a .png image that contains three grayscale values. It contains black (0), white (255) and gray (128) blobs. I want to resize this image to a smaller size while preserving only these three grayscale values.
Currently, I am using scipy.misc.imresize to do it, but I noticed that when I reduce the size, the edges get blurred and the result contains more than three grayscale values.
Does anyone know how to do this in python?
From the docs for imresize, note the interp keyword argument:
interp : str, optional
Interpolation to use for re-sizing
(‘nearest’, ‘lanczos’, ‘bilinear’, ‘bicubic’ or ‘cubic’).
The default is bilinear filtering; switch to nearest and it will instead use the exact color of the nearest existing pixel, which will preserve your precise grayscale values rather than trying to linearly interpolate between them.
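A minimal sketch of that (the file name is a placeholder; note that imread/imresize/imsave were removed from newer SciPy releases, which point to Pillow or imageio instead):
import scipy.misc

img = scipy.misc.imread('labels.png')     # hypothetical file name
# 'nearest' copies existing pixel values, so only 0, 128 and 255 survive
small = scipy.misc.imresize(img, 0.5, interp='nearest')
scipy.misc.imsave('labels_small.png', small)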
I believe that PIL.Image.resize does exactly what you want. Take a look at the docs.
Basically what you need is:
from PIL import Image
im = Image.open('old.png')
# The Image.NEAREST is the default, I'm just being explicit
im = im.resize((im.size[0]//2, im.size[1]//2), Image.NEAREST)
im.save('new.png')
Actually you can pretty much do that with scipy.misc.imresize.
Take a look at its docs.
The interp parameter is what you need. If you set it to 'nearest', the image values won't be altered.
I am testing the cv2.threshold() function with different values, but each time I get unexpected results, which simply means I do not understand the effect of the parameter:
maxval
Could someone clear this up for me?
For example, I want to draw the contours of this star following the white color:
Here is what I got:
From this code:
import cv2
im=cv2.imread('image.jpg') # read picture
imgray=cv2.cvtColor(im,cv2.COLOR_BGR2GRAY) # BGR to grayscale
ret,thresh=cv2.threshold(imgray,200,255,cv2.THRESH_BINARY_INV)
contours,hierarchy=cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
cv2.drawContours(im,contours,-1,(0,255,0),3)
cv2.imshow("Contour",im)
cv2.waitKey(0)
cv2.destroyAllWindows()
Each time I change the value of maxval I get a strange result that I cannot understand. How can I draw the contour of this star correctly using this function?
Thanks in advance.
You may want to experiment with a very simple image that clearly lets you understand the various parameters. A useful property of the image attached below is that the grayscale value of each number shown in the image is equal to that number, e.g. 200 is written with grayscale value 200. In short, with THRESH_BINARY every pixel above thresh is set to maxValue and every other pixel to 0, so maxval only controls the 'on' value of the output. Here is example Python code you can use.
import cv2
# Read image as grayscale
src = cv2.imread("threshold.png", cv2.IMREAD_GRAYSCALE)
# Set threshold and maxValue
thresh = 127
maxValue = 255
# Basic threshold example
th, dst = cv2.threshold(src, thresh, maxValue, cv2.THRESH_BINARY)
# Find contours
contours, hierarchy = cv2.findContours(dst, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
# Draw contours
cv2.drawContours(dst, contours, -1, (255,255,255), 3)
cv2.imshow("Contour", dst)
cv2.waitKey(0)
I have copied the following image from an OpenCV threshold tutorial I wrote recently. It explains the various parameters with an example image and Python and C++ code. Hope this helps.
Input Image
Result Image
Well, here you can use COLOR_BGR2HSV, then choose a colour range, and drawing the contour becomes quite easy. Try it and let me know.
In a plain black-and-white conversion the yellow and white regions end up with the same value, which is why this is not working.
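A rough sketch of that idea (the HSV bounds below are guesses for 'white' and would need tuning for the actual star image):
import cv2

im = cv2.imread('image.jpg')
hsv = cv2.cvtColor(im, cv2.COLOR_BGR2HSV)
# keep only low-saturation, bright pixels, i.e. the white star
mask = cv2.inRange(hsv, (0, 0, 200), (180, 60, 255))
# [-2:] keeps this working whether findContours returns 2 or 3 values
contours, hierarchy = cv2.findContours(mask, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)[-2:]
cv2.drawContours(im, contours, -1, (0, 255, 0), 3)
cv2.imshow('Contour', im)
cv2.waitKey(0)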
For better accuracy when finding contours, apply a threshold first, since contour detection tends to be more accurate on a binary image, and then use the contours method. Hope this helps!
I attach a zip archive with all the files needed to illustrate and reproduce the problem.
(I don't have permissions to upload images yet...)
I have an image (test2.png in the zip archive ) with curved lines.
I try to warp it so the lines are straight.
I thought of using scikit-image transform, and in particular transform.PolynomialTransform because the transformation involves high order distortions.
So first I measure the precise position of each line at regular intervals in x to define the input interest points (in the file source_test2.csv).
Then I compute the corresponding desired positions, located along a straight line (in the file destination_test2.csv).
The figure correspondence.png shows what this looks like.
Next, I simply call transform.PolynomialTransform() using a polynomial of order 3.
It finds a solution, but when I apply it using transform.warp(), the result is crazy, as illustrated in the file Crazy_Warped.png
Can anybody tell me what I am doing wrong?
I tried polynomial of order 2 without luck...
I managed to get a good transformation for a sub-image (the first 400 columns only).
Is transform.PolynomialTransform() completely unstable in a case like mine?
Here is the entire code:
import numpy as np
import matplotlib.pyplot as plt
import asciitable
import matplotlib.pylab as pylab
from skimage import io, transform
# read image
orig=io.imread("test2.png",as_grey=True)
# read tables with reference points and their desired transformed positions
source=asciitable.read("source_test2.csv")
destination=asciitable.read("destination_test2.csv")
# format as numpy.arrays as required by scikit-image
# (need to add 1 because I started to count positions from 0...)
source=np.column_stack((source["x"]+1,source["y"]+1))
destination=np.column_stack((destination["x"]+1,destination["y"]+1))
# Plot
plt.imshow(orig, cmap='gray', interpolation='nearest')
plt.plot(source[:,0],source[:,1],'+r')
plt.plot(destination[:,0],destination[:,1],'+b')
plt.xlim(0,orig.shape[1])
plt.ylim(0,orig.shape[0])
# Compute the transformation
t = transform.PolynomialTransform()
t.estimate(destination,source,3)
# Warping the image
img_warped = transform.warp(orig, t, order=2, mode='constant',cval=float('nan'))
# Show the result
plt.imshow(img_warped, cmap='gray', interpolation='nearest')
plt.plot(source[:,0],source[:,1],'+r')
plt.plot(destination[:,0],destination[:,1],'+b')
plt.xlim(0,img_warped.shape[1])
plt.ylim(0,img_warped.shape[0])
# Save as a file
io.imsave("warped.png",img_warped)
Thanks in advance!
There are a couple of things wrong here; mainly they have to do with coordinate conventions. For example, if we examine the code where you plot the original image and then put the clicked points on top of it:
plt.imshow(orig, cmap='gray', interpolation='nearest')
plt.plot(source[:,0],source[:,1],'+r')
plt.xlim(0,orig.shape[1])
plt.ylim(0,orig.shape[0])
(I've taken out the destination points to make it cleaner) then we get the following image:
As you can see, the y-axis is flipped. If we invert the y-axis with:
source[:,1] = orig.shape[0] - source[:,1]
before plotting, then we get the following:
So that is the first problem (don't forget to invert the destination points as well). The second has to do with the transform itself:
t.estimate(destination,source,3)
From the documentation we see that the call takes the source points first, then the destination points. So the order of those arguments should be flipped.
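So the corrected call, if my reading of the docs is right, would look like:
t = transform.PolynomialTransform()
t.estimate(source, destination, order=3)  # source first, then destination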
Lastly, the clicked points are of the form (x,y), but the image is stored as (y,x), so we have to transpose the image before applying the transform and then transpose back again:
img_warped = transform.warp(orig.transpose(), t, order=2, mode='constant',cval=float('nan'))
img_warped = img_warped.transpose()
When you make these changes, you get the following warped image:
The lines aren't perfectly straight, but the result makes much more sense.
Thank you very much for the detailed answer! I cannot believe I did not see the axis inversion problem... Thanks for catching it!
But I am afraid your final solution does not solve my problem... The image you get is still crazy. It should be continuous, not have such big holes and weird distortions... (see the final solution below)
I found I could get a reasonable solution using RANSAC:
from skimage.measure import ransac
t, inliers = ransac((destination, source), transform.PolynomialTransform, min_samples=20, residual_threshold=1.0, max_trials=1000)
outliers = ~inliers
I then get the following result
Note that I think I was right using (destination, source) in that order! I think it has to do with the fact that transform.warp requires the inverse_map as input for the transformation object, not the forward map. But maybe I am wrong? The good result I am getting suggests it's correct.
I guess that polynomial transforms are too unstable, and using RANSAC makes it possible to get a reasonable solution.
My problem was then to find a way to change the polynomial order in the RANSAC call...
transform.PolynomialTransform() does not take any parameters and uses a 2nd-order polynomial by default, but from the result I can see I would need a 3rd- or 4th-order polynomial.
So I opened a new question, and got a solution from Stefan van der Walt. Follow the link to see how to do it.
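For anyone landing here: my guess at what the linked solution boils down to (do check the link, this is only a sketch) is a small subclass that hard-codes the order, which can then be handed to ransac:
class PolynomialTransform3(transform.PolynomialTransform):
    # same transform, but estimate() is pinned to a 3rd-order polynomial
    def estimate(self, src, dst):
        return transform.PolynomialTransform.estimate(self, src, dst, order=3)

t, inliers = ransac((destination, source), PolynomialTransform3,
                    min_samples=20, residual_threshold=1.0, max_trials=1000)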
Thanks again for your help!
I have a number of images from Chinese genealogies, and I would like to be able to programmatically categorize them. Generally speaking, one type of image has primarily line-by-line text, while the other type may be in a grid or chart format.
Example photos
'Desired' type: http://www.flickr.com/photos/63588871@N05/8138563082/
'Other' type: http://www.flickr.com/photos/63588871@N05/8138561342/in/photostream/
Question: Is there a (relatively) simple way to do this? I have experience with Python, but little knowledge of image processing. Direction to other resources is appreciated as well.
Thanks!
Assuming that at least some of the grid lines are exactly or almost exactly vertical, a fairly simple approach might work.
I used PIL to find all the columns in the image where more than half of the pixels were darker than some threshold value.
Code
from PIL import Image, ImageDraw  # PIL / Pillow modules

withlines = Image.open('withgrid.jpg')
nolines = Image.open('nogrid.jpg')

def findlines(image):
    w, h = image.size
    s = w * h
    im = image.point(lambda i: 255 * (i < 60))  # threshold
    d = im.getdata()  # faster than per-pixel operations
    linecolumns = []
    for col in range(w):
        black = sum(d[x] for x in range(col, s, w)) // 255
        if black > 450:
            linecolumns += [col]
    # return an image showing the detected lines
    im2 = image.convert('RGB')
    draw = ImageDraw.Draw(im2)
    for col in linecolumns:
        draw.line((col, 0, col, h - 1), fill='#f00', width=1)
    return im2
findlines(withlines).show()
findlines(nolines).show()
Results
showing detected vertical lines in red for illustration
As you can see, four of the grid lines are detected, and, with some processing to ignore the left and right sides and the center of the book, there should be no false positives on the desired type.
This means that you could use the above code to detect black columns, discard those that are near the edges or the center, and, if any black columns remain, classify the page as the 'other' undesired class of pictures; a rough sketch follows below.
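A rough classification sketch along those lines, reusing the thresholding idea from findlines (the margin value is a guess and would need tuning):
def is_grid_page(image, margin=100):
    # reuses the threshold (60) and column count (450) from findlines above
    w, h = image.size
    im = image.point(lambda i: 255 * (i < 60))
    d = im.getdata()
    center = w // 2
    for col in range(w):
        black = sum(d[x] for x in range(col, w * h, w)) // 255
        if black > 450 and margin < col < w - margin and abs(col - center) > margin:
            return True   # a dark vertical line away from the edges and centerfold
    return False

print(is_grid_page(withlines.convert('L')), is_grid_page(nolines.convert('L')))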
AFAIK, there is no easy way to solve this. You will need a decent amount of image processing and some basic machine learning to classify these kinds of images (and even then it probably won't be 100% successful).
Another note:
While this could be solved using machine learning techniques alone, I would advise you to look into image processing techniques first and try to convert your images into a form that shows a clear difference between the two types. A good starting point for this is the FFT (a small sketch follows below). After that, have a look at some digital image processing techniques. When you feel you have a decent understanding of these, you can read up on pattern recognition.
This is only one suggested approach though, there are more ways to achieve this.
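As a concrete starting point for the FFT suggestion (only a sketch; the file name is a placeholder):
import numpy as np
from PIL import Image

img = np.asarray(Image.open('page.jpg').convert('L'), dtype=float)
# 2D FFT; strong periodic structure (grid lines, regularly spaced columns)
# shows up as pronounced peaks in the magnitude spectrum
spectrum = np.fft.fftshift(np.fft.fft2(img))
magnitude = np.log1p(np.abs(spectrum))
print(magnitude.max(), magnitude.mean())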