I have several kinds of paper forms, and I am writing a Python script with OpenCV that recognizes which kind a given sheet is so that I can classify it. I am stuck on how to decide whether two sheets are the same kind of form. For example, I attached two pictures: picture 1 is the template, and picture 2 is a sheet that I need to check against the template. I don't need to match the text, only the layout of the form. I need to pick out the sheets of the same kind from a large pile of sheets.
I have already corrected the skew of the page and detected some lines, but I don't know how to match the lines and decide whether this sheet is the same kind as the template.
Can anyone advise me on a matching algorithm?
I'm not sure whether such a paper form is rich enough in visual information for this approach, but I think you should start with feature detection and homography estimation (OpenCV tutorial: Features2D + Homography). From there you can try to adapt the 2D features to your problem.
Check out the findContours and matchShapes functions. Either way, you are much better off matching a specific visual ID within the form that is representative of the form, such as a very simple kind of barcode.
Related
I am writing a python tool to find specific symbols (e.g. a circle/square with a number inside) on a drawing pdf/screenshot.png
I know from another data source the specific number(s) that should be inside the circle/square.
Using OpenCV's matchTemplate I can find symbols and their coordinates.
One way would be to create all possible symbols (circles/squares with the numbers 1 to 1000) and save them, then use OpenCV to find the right one on the drawing, since I know the number to be found and thus the filled symbol.
I am sure that there is a smarter way to do this. Can somebody point me in the right direction?
Note: pdfminer will not work, since I would not be able to distinguish between measurement numbers and the text coming from the symbol, but I could be wrong here.
I am also trying to solve a similar problem in a coding assignment. The input is a low-poly art illustration.
Once you find the locations of the UFOs, you need to crop each one and pass it through a classifier to find the number that the UFO contains. The classifier is trained on 5000 images.
I am now going to try the matchTemplate method you suggested to find the coordinates of the UFOs.
I am trying to detect and extract the "labels" and "dimensions" of a 2D technical drawing that is saved as a PDF, using Python. I came across a Python library called "pytesseract", which has optical character recognition capability. I tried the demo on my image, but it fails to detect most of the labels/dimensions. Please suggest whether there is another way to do it. Thank you.
Attached is a sample of the 2D technical drawing I am trying to process.
What I am trying to achieve is to obtain the coordinates of every dimension (the 160, 120, 10, 4x45, etc.) on the image, and to extract them as well.
About 16 months ago we asked ourselves the same question.
If you want to implement it yourself, I'd suggest the following process:
Extract the Canvas from the sheet
Separate the Cuts
Detect the Measure Regions on each Cut
Detect the individual attributes of the Measure Regions to understand where the Measures start and end. In your particular example that's relatively easy.
Run the detected Measure Labels through OCR
Associate the Labels to the Measures
Verify your results
Alternatively you can also run it through our API and get the results as JSON.
Here's a quick visualization of the result:
Drawing Read (GT stands for General Tolerances)
I have to do the following tasks in Python and I have no idea where to begin:
OCR of handwritten dates
Page/document orientation detection for pretreatment
Stamp and Logo Detection and classification
a. Orientation variation included
b. Quality degradation to be considered
c. Overlying Primary Content
Could anybody help me?
Thanks in advance!
You can use ocrmypdf to extract text from a PDF. It runs OCR on each page and returns a PDF that looks like the original but with a text layer added. For logo detection, you need to implement a computer-vision-based model. If you need more details, please specify your requirements in detail.
I've implemented a CBIR (content-based image retrieval) app using a standard ConvNet approach:
Use transfer learning to extract features from the data set of images
Index the extracted features for k-nearest-neighbour (kNN) search
Given a search image, extract its features
Return the top 10 images that are closest to the search image in the kNN index
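For reference, the retrieval step of such a pipeline can be sketched as a brute-force nearest-neighbour search. This assumes the ConvNet features are already available as rows of a NumPy array and uses cosine similarity, which is an assumption rather than a detail given above; `top_k_similar` is a hypothetical name.

```python
import numpy as np


def top_k_similar(query, features, k=10):
    """Return (indices, scores) of the k rows of `features` most similar
    to `query` by cosine similarity, best first."""
    # Normalise rows and query so the dot product equals cosine similarity.
    feats = features / np.linalg.norm(features, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    scores = feats @ q
    order = np.argsort(-scores)[:k]  # highest similarity first
    return order, scores[order]
```

For a large image collection an approximate index would replace the full matrix product, but the interface stays the same: a query vector in, ranked image indices out.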
I am getting good results, but I want to improve them further by adding text search as well. For instance, when my search image is the steering wheel of a car, the closest results will be any circular objects that resemble a steering wheel, for instance a bike wheel. What would be the best way to add a text input, say "car part", so that only steering wheels similar to the search image are returned?
I am unable to find a good way to combine a ConvNet with a text search model to construct an improved kNN index.
My other idea is to use ElasticSearch for the text search, which is something ElasticSearch is good at. For instance, I would run the CBIR search described above, look up the descriptions of the returned results, and then run ElasticSearch on that subset of hits to produce the final results. Maybe I could also tag images with classes and allow the user to select or deselect groups of images of interest.
I don't want to do the text search before the image search, because some of the images are poorly described and a text search would miss them.
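The image-first, text-second idea can be sketched without any search engine at all. In this minimal version a plain substring check stands in for the ElasticSearch query (an assumption made only to keep the example self-contained), and unmatched hits are demoted rather than dropped, so poorly described images are not lost. `rerank_by_text` is a hypothetical helper.

```python
def rerank_by_text(hits, query_terms):
    """Move image hits whose description mentions any query term to the front.

    `hits` is a list of (image_id, description) pairs already sorted by
    image similarity. Hits without a matching description are kept after
    the matching ones instead of being discarded.
    """
    terms = [t.lower() for t in query_terms]

    def matches(description):
        text = (description or "").lower()  # tolerate missing descriptions
        return any(term in text for term in terms)

    # sorted() is stable, so image-similarity order survives within each group.
    return sorted(hits, key=lambda hit: not matches(hit[1]))
```

For example, with the term "car", a hit described as "car steering wheel" moves ahead of "bike wheel", while an image with no description at all stays in the list behind both groups' matched entries.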
Any thoughts or ideas will be appreciated!
I have not found the original paper, but you might find this interesting: https://www.slideshare.net/xavigiro/multimodal-deep-learning-d4l4-deep-learning-for-speech-and-language-upc-2017
It is about finding a vector space in which both images and text live (a multimodal embedding). This way you can find text similar to an image, images matching a text, or use a text/image pair to find similar images.
I think maybe this idea is an interesting point to start from.
I am new to image processing. I'm using the OpenCV library with Python. I need to extract symbols and the text related to those symbols for further work. I have seen that some developers have done handwritten text recognition with neural networks, kNN and other techniques.
My question is: what is the best way to extract these symbols and the handwritten text related to them?
Example diagram:
Details I need to extract:
The number of circles in the diagram.
The text inside each circle.
The words within square brackets.
Whether the circles are connected by arrows.
Of course, there is a method called SWT, the Stroke Width Transform.
Please see the paper below; if you search for it by name, you can find code that students have written for school projects.
Once the text has been located with this method, text recognition can be applied. But it is not a one-day job.
Paper: Detecting Text in Natural Scenes with Stroke Width Transform
Hope that it helps.
For handwritten text recognition, try using TensorFlow. Their website has a simple example for digit recognition (with training data). You can use it to implement your own application for recognizing handwritten alphabets as well. (You'll need to get training data for this though; I used a training data set provided by NIST.)
If you are using OpenCV with Python, the Hough transform can detect circles in images. You might miss some hand-drawn circles, but there are ways to detect ovals and other closed shapes.
For handwritten character recognition, there are lots of libraries available.
Since you are new to this area, I strongly recommend LearnOpenCV and PyImageSearch to help you get familiar with the algorithms that are available for this kind of task.