In the image there are 2 insulators; the one on the left has a gap, i.e., a disk missing in between. I have to detect the missing disk with a rectangular box. I know the algorithms SIFT and SURF, and absdiff() in OpenCV for calculating the difference between two images.
How can I detect the missing disk if I only have this image?
Image
You should find contours, bounding boxes, and circles. After that you can identify the missing object (or noise objects). Another way is to use AI to fit objects and search for them, but that is a very hard job.
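As a rough starting point, here is a minimal OpenCV sketch of the contour/bounding-box idea. It assumes a single, roughly vertical insulator on a plain background; the file names and the gap heuristic are placeholders:

```python
import cv2

# Placeholder file name; assumes one roughly vertical insulator, plain background.
img = cv2.imread("insulator.png", cv2.IMREAD_GRAYSCALE)
_, binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

# One contour (and bounding box) per disk, sorted top to bottom.
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
boxes = sorted((cv2.boundingRect(c) for c in contours), key=lambda b: b[1])

# A missing disk shows up as an unusually large vertical step between boxes.
out = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
for (x1, y1, w1, h1), (x2, y2, w2, h2) in zip(boxes, boxes[1:]):
    if y2 - (y1 + h1) > h1:  # assumed heuristic: gap taller than one disk
        cv2.rectangle(out, (x1, y1 + h1), (x1 + w1, y2), (0, 0, 255), 2)
cv2.imwrite("gap_marked.png", out)
```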
General algorithm (it's obvious):
find insulators
find gaps
find insulators with gaps.
I think insulators are fairly standardized in size and look. So you can probably detect them by color/texture and/or some specific details. They can't be very curved, so you can approximate them with lines and separate overlapping elements. If all insulators are the same size, then you can normalize them, stretch them along one axis, and then detect the gaps.
There is no way to get 100% correct recognition in all cases, but you can use some knowledge about insulators and get good results.
Related
I'm trying to remove the differences between two frames and keep the non-changing graphics. I would probably repeat the same process with more frames to get more accurate results. My idea is to simplify the frames by removing things that won't be needed, to simplify the rest of the process that runs afterwards.
The frames are coming from the same video, so there is no need to deal with different sizes, orientation, etc. If the same graphic is in another frame but with a different orientation or scale, I would like to remove it as well. For example:
Image 1
Image 2
Result (more or less; I suppose it will be uglier, but containing similar information)
One of the problems with this idea is that the source video, even though it is computer-generated graphics, is compressed, so it's not that easy to tell whether a change in the tonality of a pixel is actually a change or not.
Ideally I'm not looking at the pixel level, and given the differences in saturation introduced by the compression, that probably isn't possible anyway. I'm looking for unchanged "objects" in the image. I want to extract the information layer shown on top of what's happening behind it.
During the last couple of days I have tried to achieve this in a Python script using OpenCV with all kinds of combinations of absdiff, subtract, threshold, equalizeHist, and Canny, but so far I haven't found the right implementation and would appreciate any guidance. How would you achieve it?
This will be extremely hard. You would need to employ proper CV, and if you're not an expert in that field, you'll have a really hard time.
How about this: forgetting about tooling and libraries, you have two images, i.e., two equally sized sequences of RGB pixels, image A and image B, plus an output image R. Allocate R with the same size as A or B.
Run a single loop over every pixel: read pixel a from A and pixel b from B. Each is a 3-element (RGB) vector. Find the distance between the two vectors, e.g., the magnitude of the vector (b - a); if it is less than some tolerance, write either a or b at the same offset into the result image R. If not, write some default (background) color to R.
You can most likely do this in some hardware-accelerated way using OpenCV or another library, but it's up to you to find a tool that does what you want.
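A vectorized NumPy/OpenCV sketch of that per-pixel loop; the tolerance, background color, and file names are assumptions (OpenCV loads pixels as BGR, but the distance is the same):

```python
import cv2
import numpy as np

# Placeholder file names for two equally sized frames.
a = cv2.imread("frame_a.png").astype(np.int16)
b = cv2.imread("frame_b.png").astype(np.int16)

TOLERANCE = 20          # assumed max color distance for an "unchanged" pixel
BACKGROUND = (0, 0, 0)  # assumed default color for changed pixels

# Magnitude of (b - a) at every pixel, exactly as described above.
dist = np.linalg.norm(b - a, axis=2)

# Keep pixel a where the distance is within tolerance, else background.
r = np.where(dist[..., None] < TOLERANCE, a, np.array(BACKGROUND)).astype(np.uint8)
cv2.imwrite("result.png", r)
```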
Given two images, one a cropped (but not scaled) portion of the other, how can I find the crop parameters (i.e., the x and y offsets and the width/height)? The idea is to crop one image (a screenshot) by hand, and then crop a lot more at the same points.
Ideally via ImageMagick, but I am happy with any pseudo-code solution, or with Perl, Python, or JavaScript (in order of preference).
I have thought of a brute-force approach (find the first pixel which is the same color, check the next, keep going until they differ, or move to the next). Before I go down this barbarous (and probably slow) route, I'd like to check for better ones.
Template matching can be used to locate a smaller image within a larger image.
The following resource might be helpful. Please check it out
https://docs.opencv.org/4.5.2/d4/dc6/tutorial_py_template_matching.html
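For reference, a minimal sketch along the lines of that tutorial, assuming the crop is unscaled and unrotated (file names are placeholders):

```python
import cv2

# Placeholder file names: the full screenshot and the hand-cropped portion.
full = cv2.imread("full.png")
crop = cv2.imread("crop.png")

# Since the crop is unscaled, a single template-matching pass is enough.
scores = cv2.matchTemplate(full, crop, cv2.TM_CCOEFF_NORMED)
_, max_val, _, (x, y) = cv2.minMaxLoc(scores)

h, w = crop.shape[:2]
print(f"offset=({x}, {y}) size={w}x{h} confidence={max_val:.3f}")
```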
I am trying to identify the state of a valve (on or off). My approach is to take two images, one of each state, and compare the current image against both to see which one it matches.
I have tried comparing using new_image - on_image and new_image - off_image, then comparing the number of differing pixels. It works, but I feel like in some cases it might not, and there must be a better way to do a simple classification like this.
Any reference or ideas?
Subtracting pixels might not be very robust if your camera position changes slightly. If you don't shy away from using OpenCV, there is an interesting recipe for finding a predefined object in a picture:
Feature Matching + Homography to find Objects
You could cut the lever out of your image and search for it in every new image. Based on the coordinates, and especially the rotation, you can set the status of the valve. This might even work in crazy cases where someone has half opened (or, for pessimists, half closed) the valve, or where the lever becomes partially covered.
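A rough sketch of that recipe using ORB features; the file names and the way the on/off decision is read off the homography are assumptions, not a tested implementation:

```python
import cv2
import numpy as np

# Placeholder file names: a cutout of the lever and the current camera frame.
lever = cv2.imread("lever_template.png", cv2.IMREAD_GRAYSCALE)
frame = cv2.imread("current_frame.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(1000)
kp1, des1 = orb.detectAndCompute(lever, None)
kp2, des2 = orb.detectAndCompute(frame, None)

# Match descriptors and keep the 50 strongest matches.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)[:50]

src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

# Approximate lever rotation from the homography's upper-left 2x2 block.
angle = np.degrees(np.arctan2(H[1, 0], H[0, 0]))
print(f"lever rotated by {angle:.1f} degrees; threshold this for on/off")
```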
I'm trying to extract text from a scanned technical drawing. For confidentiality reasons I cannot post the actual drawing, but it looks similar to this, only a lot busier, with more text within shapes. The problem is quite complex, with letters touching both each other and their surrounding borders/symbols.
I found an interesting paper that does exactly this, called "Detection of Text Regions From Digital Engineering Drawings" by Zhaoyang Lu. It's behind a paywall so you might not be able to access it, but essentially it tries to erase everything that's not text from the image, mainly in two steps:
1) Erases linear components, including long and short isolated lines
2) Erases non-text strokes in terms of analysis of connected components of strokes
What kind of OpenCV functions would help in performing these operations? I would rather not write something from the ground up to do these, but I suspect I might have to.
I've tried using a template-based approach to isolate the text, but since the text location isn't completely normalized between drawings (even within the same project), it fails to detect text past the first scanned figure.
I am working on a similar problem. Technical drawings are an issue because OCR software mostly tries to find text baselines, and the drawing artifacts (lines etc.) get in the way of that approach. In the drawing you specified there are not many characters touching each other, so I suggest breaking the image into contiguous (black) pixel regions and then scanning those individually. The height of a contiguous region should also give you an indication of whether it is text or a piece of the drawing. To break the image into contiguous regions, use a flood-fill algorithm; for the scanning, Tesseract does a good job.
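A minimal sketch of that approach with OpenCV's connected-component analysis and pytesseract; the size heuristics and the file name are assumptions that would need tuning:

```python
import cv2
import pytesseract

# Placeholder file name; assumes dark ink on a light background.
img = cv2.imread("drawing.png", cv2.IMREAD_GRAYSCALE)
_, binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

# Break the image into contiguous blobs of (black) pixels.
n, labels, stats, _ = cv2.connectedComponentsWithStats(binary)

for i in range(1, n):  # label 0 is the background
    x, y, w, h, area = stats[i]
    # Assumed heuristic: character-sized blobs; drawing lines are much larger.
    if 5 < h < 50 and w < 100:
        roi = img[y:y + h, x:x + w]
        text = pytesseract.image_to_string(roi, config="--psm 10").strip()
        if text:
            print(x, y, text)
```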
Obviously I've never attempted this specific task; however, if the image really looks like the one you showed me, I would start by removing all vertical and horizontal lines. This could be done pretty easily: set a width threshold, and for every pixel with intensity larger than some value N, look at that threshold's worth of pixels perpendicular to the hypothetical line orientation. If it looks like a line, erase it.
More elegant, and perhaps better, would be to do a Hough transform for lines and circles and remove those elements that way.
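For example, a short sketch of line removal with the probabilistic Hough transform; every threshold here is a guess that would need tuning:

```python
import cv2
import numpy as np

img = cv2.imread("drawing.png", cv2.IMREAD_GRAYSCALE)  # placeholder file name
edges = cv2.Canny(img, 50, 150)

# minLineLength is chosen so short text strokes are not mistaken for lines.
lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=80,
                        minLineLength=40, maxLineGap=5)

cleaned = img.copy()
if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        # Paint each detected line white (background) to erase it.
        cv2.line(cleaned, (x1, y1), (x2, y2), 255, 3)
cv2.imwrite("no_lines.png", cleaned)
```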
You could also maybe try some FFT-based filtering, but I'm not so sure about that.
I've never used OpenCV, but I would guess it can do the things I mentioned.
I have written a program in Python which automatically reads score sheets like this one
At the moment I am using the following basic strategy:
Deskew the image using ImageMagick
Read into Python using PIL, converting the image to B&W
Calculate the sums of pixels in the rows and the columns
Find peaks in these sums
Check the intersections implied by these peaks for fill.
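For concreteness, steps 2–5 might be sketched like this (the peak prominence, circle radius, and fill threshold are illustrative placeholders; find_peaks needs a reasonably recent SciPy):

```python
import numpy as np
from PIL import Image
from scipy.signal import find_peaks

# Step 2: read the (already deskewed) sheet and convert to B&W (True = ink).
img = np.asarray(Image.open("sheet.png").convert("L")) < 128

# Steps 3-4: project ink counts onto each axis and find the peaks.
row_sums = img.sum(axis=1)
col_sums = img.sum(axis=0)
rows, _ = find_peaks(row_sums, prominence=row_sums.max() * 0.3)
cols, _ = find_peaks(col_sums, prominence=col_sums.max() * 0.3)

# Step 5: check each implied intersection for fill.
R = 6  # assumed circle radius in pixels
filled = [(x, y) for y in rows for x in cols
          if img[max(y - R, 0):y + R, max(x - R, 0):x + R].mean() > 0.5]
```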
The result of running the program is shown in this image:
You can see the peak plots below and to the right of the image shown in the top left. The lines in the top left image are the positions of the columns and the red dots show the identified scores. The histogram bottom right shows the fill levels of each circle, and the classification line.
The problem with this method is that it requires careful tuning and is sensitive to differences in scanning settings. Is there a more robust way of recognising the grid, one that requires less a priori information (at the moment I am using knowledge about how many dots there are) and is more robust to people drawing other shapes on the sheets? I believe it may be possible using a 2D Fourier transform, but I'm not sure how.
I am using the EPD, so I have quite a few libraries at my disposal.
First of all, I find your initial method quite sound, and I would probably have tried it the same way (I especially appreciate the row/column projection followed by histogramming, an underrated method that is usually quite efficient in real applications).
However, since you want to go for a more robust processing pipeline, here is a proposal that can probably be fully automated (also removing at the same time the deskewing via ImageMagick):
Feature extraction: extract the circles via a generalized Hough transform. As suggested in other answers, you can use OpenCV's Python wrapper for that. The detector may miss some circles but this is not important.
Apply a robust alignment detector using the circle centers. You can use the Desolneux parameter-less detector described here. Don't be afraid of the math; the procedure is quite simple to implement (and you can find example implementations online).
Get rid of diagonal lines by a selection on the orientation.
Find the intersections of the lines to get the dots. You can use these coordinates for deskewing by assuming ideal fixed positions for these intersections.
This pipeline may be a bit CPU-intensive (especially step 2, which amounts to a kind of greedy search), but it should be quite robust and automatic.
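A minimal sketch of step 1 (circle extraction); the Hough parameters here are rough guesses:

```python
import cv2

img = cv2.imread("sheet.png", cv2.IMREAD_GRAYSCALE)  # placeholder file name
blur = cv2.medianBlur(img, 5)

# Step 1: extract circles; it doesn't matter if some are missed.
circles = cv2.HoughCircles(blur, cv2.HOUGH_GRADIENT, dp=1, minDist=15,
                           param1=100, param2=20, minRadius=4, maxRadius=12)

centers = []
if circles is not None:
    centers = [(int(x), int(y)) for x, y, r in circles[0]]
# These centers feed the alignment detector in step 2.
print(f"{len(centers)} circle centers found")
```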
The correct way to do this is to use connected-component analysis on the image to segment it into "objects". Then you can use higher-level algorithms (e.g., a Hough transform on the component centroids) to detect the grid, and determine for each cell whether it's on/off by looking at the number of active pixels it contains.
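A short sketch of that, assuming the sheet binarizes cleanly (the area threshold separating filled from empty circles is a placeholder):

```python
import cv2

img = cv2.imread("sheet.png", cv2.IMREAD_GRAYSCALE)  # placeholder file name
_, binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

# Segment the sheet into connected components ("objects").
n, labels, stats, centroids = cv2.connectedComponentsWithStats(binary)

for i in range(1, n):  # skip the background label 0
    cx, cy = centroids[i]
    active = stats[i, cv2.CC_STAT_AREA]  # number of active pixels in the cell
    state = "on" if active > 40 else "off"  # assumed fill threshold
    print(f"component at ({cx:.0f}, {cy:.0f}): {state}")
```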