I'm currently learning about computer-vision OCR. I have an image that needs to be scanned, but I'm running into a problem during image cleaning.
I'm using OpenCV (cv2) in Python. This is the original image:
import cv2

image = cv2.imread(image_path)
cv2.imshow("imageWindow", image)
cv2.waitKey(0)  # needed so the window actually displays
I want to clean the image above; the number in the middle (64) is the area I want to scan. However, the number gets cleaned away as well:
import numpy as np

# recolor every pixel whose channels all exceed (0, 0, 105) to white
image[np.where((image > [0, 0, 105]).all(axis=2))] = [255, 255, 255]
cv2.imshow("imageWindow", image)
cv2.waitKey(0)
What should I do to correct the cleaning here? I want the screen region where the number 64 is located to be cleaned, because I will perform an OCR scan afterwards.
Please help, thank you in advance.
What you're trying to do is called "thresholding". Your technique recolors every pixel on one side of a fixed cutoff, but the darkness of the LCD digits varies enough in that image to throw it off.
I'd spend some time reading about thresholding; here's a good starting place:
Thresholding in OpenCV with Python. You're probably going to need an adaptive technique (such as adaptive Gaussian thresholding), but you may find other approaches that work for your images.
The LCD display cannot be cleaned, the lighting conditions cannot be changed, and the background can be tricky, so I cannot use any kind of color-based segmentation, search for rectangles, or Otsu thresholding. MSER doesn't give a good result. I even tried to locate the display relative to the "DEXP" logo, but the logo turned out to be too small to do this with sufficient accuracy. Bilateral filtering or Gaussian blur helps, but not much. Even supposing I found the ROI, local thresholding gives results that are too noisy, and morphological transformations don't help. Is there a way to extract the digits for further OCR?
Is it possible to mark the blurred part of an image with a dashed outline?
Right now I am using Python with OpenCV. I only know how to load and display images, and how to check whether an image as a whole is blurred.
My input is a blurred image:
I would like to get:
I do not have:
original/unblurred image.
The output can still contain blurred parts, as long as they are marked with dashes.
Thanks a lot for help!
You could try computing the "variance of the Laplacian" on parts of the image to detect which regions have low variation in grayscale (assumed blurry) and which have high variation in grayscale (assumed sharp).
There is a nice tutorial on how to check whether an image is blurry; it can be found here.
There is also a post here that explains the theory behind it.
It ain't a complete solution, but it might be a way to start.
I'm trying to develop a system which can convert a seven-segment display on an old analog pressure output system to text, such that the data can be processed by LabVIEW. I've been working on image processing to get Tesseract (using v3.02) to recognize the numbers correctly, but have been hitting some roadblocks and don't quite know how to proceed. This is what I've got so far:
Image needs to be a height of between 50-100 pixels for Tesseract to read it correctly. I've found the best results with a height of 50.
Image needs to be cropped such that there is only one line of text.
Image should be in black and white.
Image should be relatively level from left to right.
I've been using the seven-segment training data 'letsgodigital'. This is the code for the image manipulation I've been doing so far:
import cv2
import imutils

ret, i = video.read()                     # grab a frame from the capture
h, width, channels = i.shape              # get dimensions
g = cv2.cvtColor(i, cv2.COLOR_BGR2GRAY)
histeq = cv2.equalizeHist(g)              # spread pixel values across the full range
_, t = cv2.threshold(histeq, 150, 255, cv2.THRESH_BINARY)  # binarize (max value should be 255, not 225)
cropped = t[int(0.4*h):int(0.6*h), int(0.1*width):int(0.9*width)]  # keep one line of text
rotated = imutils.rotate_bound(cropped, angle)             # level the display left to right
resized = imutils.resize(rotated, height=resizing_height)  # ~50 px tall works best for Tesseract
Some numbers work better than others - for example, '1' seems to have a lot of trouble. The numbers occurring after the '+' or '-' often don't show up, and the '+' often shows up as a '-'. I've played around with the threshold values a bit, too.
The last three steps are because the video sample I've been drawing from was slightly askew. I could try capturing some better data to work with, and I could also try making my own training data instead of the standard 'letsgodigital' lang. I feel like I'm not doing the image processing in the best way, though, and would appreciate some guidance.
I plan to use some degree of edge detection to autocrop to the display, but for now I've just been trying to keep it simple and manually get the results I want. I've uploaded sample images with various degrees of image processing applied at http://imgur.com/a/vnqgP. It's difficult because sometimes I get the exact right answer from tesseract, and other times get nothing. The camera or light levels haven't really changed though, which makes me think it's a problem with my training data. Any suggestions or direction on where I should go would be much appreciated!! Thank you
For reading seven-segment digits, general-purpose OCR programs like Tesseract don't usually work well because of the gaps between the individual segments. You should try ssocr, which was made specifically for reading seven-segment displays. However, your preprocessing will need to be better, as ssocr expects the input to be a single row of seven-segment digits.
References - https://www.unix-ag.uni-kl.de/~auerswal/ssocr/
Usage example - http://www.instructables.com/id/Raspberry-Pi-Reading-7-Segment-Displays/
I am new to openCV and python both. I am trying to count people in an image. The image is supposed to be captured with an overhead camera or the way a CCTV camera is placed.
I have converted the color image into a binary image and then inverted it. Then I used a bitwise OR of the original and the inverted binary image, so that the background is white and the people are colored.
How do I count these people? Is it necessary to use a classifier, or can I just count the contours? If so, how do I count them?
There are also some issues with the technique I'm using.
People's faces are light in color, so sometimes only the hair gets extracted.
Dark objects other than people also get extracted.
If the floor is dark, the method won't produce the binary image that is needed.
So is there any other method to achieve what I'm trying to do here?
Not sure, but it may be worth checking this out.
It explains how to perform face recognition with OpenCV and Python on still images, and extends it to a webcam here. It's not quite what you're looking for, but it may give you some clues.
I'm trying to isolate the sky region from a series of grayscale images in OpenCV. All of the images are fairly similar: the top of the image is always a sky region, and is always a bright, gray-white colour. I've attempted contour-based approaches, and written my own algorithm to extract the line of the horizon and divide the image into two masks accordingly. However, I've noticed that the reliability of the magic wand tool in Photoshop on this image set is MUCH more accurate.
Here's the image that I'm processing:
and the result that I hope to achieve:
How can this be imitated in OpenCV?
I think what you're looking for is the GrabCut algorithm.