I am trying to extract images of words from a picture that mostly contains sentences set in different typefaces. For example, consider this scenario:
Now I would like to extract individual images of the words Clinton, Street, and so on, like this:
I tried applying binary dilation, but the gap between the white and black areas was almost too small to crop out the words. I had a little more success when I first cropped the blank area out of the original image and then redid the binary dilation on the cropped image with a lower F1 value.
What would be the best, most accurate approach to separating out images of the words from this picture?
P.S.: I am following this blog post to help me get the task done.
Thank you
Fennec
With dilation, I get this:
Is this unsatisfactory because lines that sit too close together can get merged by the dilation (as more or less happens with the last two lines)?
Other things to try, off the top of my head:
- clustering.
- a low-level method: count the number of text pixels in each row to find out where the lines are, then count the pixels in each column of each line to figure out where the words are.
I downloaded this dataset of numbers and other mathematical symbols, which contains circa 380,000 images split into 80 different folders, each named after the symbol it represents. For this machine-learning project, I need train and test sets that represent each symbol equally: for example, 1/3 of each symbol folder goes to the test directory and 2/3 goes into the train directory. I tried many times, but I always ended up with inefficient code that iterated over every item, ran for ages, and never finished.
The dataset:
https://www.kaggle.com/xainano/handwrittenmathsymbols/
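A per-folder split along these lines could be sketched as follows (the paths, the 1/3 fraction, and the seed are placeholders; it assumes one subfolder per symbol, and that source and destination are on the same filesystem, where `shutil.move` is a cheap rename rather than a byte-by-byte copy):

```python
import os
import random
import shutil

def split_dataset(src_root, train_root, test_root, test_frac=1 / 3, seed=0):
    """Move a fixed fraction of each symbol folder into test/, the rest into train/."""
    rng = random.Random(seed)
    for symbol in os.listdir(src_root):
        src_dir = os.path.join(src_root, symbol)
        if not os.path.isdir(src_dir):
            continue
        files = os.listdir(src_dir)
        rng.shuffle(files)  # shuffle so the split is random per symbol
        n_test = int(len(files) * test_frac)
        for dst_root, names in ((test_root, files[:n_test]),
                                (train_root, files[n_test:])):
            dst_dir = os.path.join(dst_root, symbol)
            os.makedirs(dst_dir, exist_ok=True)
            for name in names:
                # shutil.move is a rename on the same filesystem: no copying.
                shutil.move(os.path.join(src_dir, name),
                            os.path.join(dst_dir, name))
```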
The dataset you are using ships with an extract.py script that automatically does this for you.
Script info
extract.py:
- Extracts trace groups from InkML files.
- Converts extracted trace groups into images. Images are square bitmaps with only black (value 0) and white (value 1) pixels; black denotes patterns (ROI).
- Labels those images (according to the InkML files).
- Flattens images to one-dimensional vectors.
- Converts labels to one-hot format.
- Dumps training and testing sets separately into the outputs folder.
Visit its GitHub repository here: https://github.com/ThomasLech/CROHME_extractor
I'm trying to find a way to screen-scrape the letters and numbers (mainly numbers) from the attached picture.
example picture
In previous attempts, I've used pyocr and many other variations.
My question is: has anybody found a way to scrape the numbers off, or a way to train the pyocr algorithm on custom data?
Thanks in advance!
The folks at PyImageSearch have a TON of info about processing images in Python with OpenCV.
They even have a free blog post about using Tesseract OCR. Tesseract can be a bit fussy about fonts, but the good news is that the text in your image should always be in the same font and perfectly aligned both horizontally and vertically.
(disclaimer: I'm a student of theirs; but I don't work for them)
https://www.pyimagesearch.com/2018/09/17/opencv-ocr-and-text-recognition-with-tesseract/
Characters engraved on a metal plate
How can I extract the characters engraved on a metallic plate?
OCR (pytesseract) is unable to give good results. I tried ridge detection, but in vain. No form of thresholding seems to work, because the background and the foreground are the same color. Is there a series of steps I can follow for such a use case?
I don't think binarization will work on your image. Even if some preprocessing improves the quality of this particular image, that doesn't mean the same method will work on all the images you have.
So my suggestion is to build your own custom OCR using machine learning or a CNN.
You can convert your digits into 28x28 image matrices, reshape each into a 1x784 vector, and train the way the MNIST dataset is trained.
I am trying to find, in Python, the unique area covered by each of multiple bounding boxes generated from a screen capture of products.
You can try thresholding the image by color first (using either an HSV or an RGB threshold).
Then, with several binary images in hand, you can use the Contour Approximation feature (number 4 on the page), which uses the Douglas-Peucker algorithm, and fill the resulting bounding boxes.
Afterwards, you can subtract the resulting binary images from one another to find the exact areas of intersection.
Hope it helps!
I am trying to implement a text-detection algorithm, and I want to separate the image into regions where each region contains a different font size.
As in this image, for example:
Is there an easy way to implement this using Python and/or OpenCV? If so, how?
I tried googling it but could not find anything useful.
Thanks.
This is an interesting question. There are a few steps you need to take to achieve your goal. I hope you are sufficiently familiar with basic computer-vision algorithms (knowledge of OpenCV functions helps) to follow the steps I am suggesting.
Group the words together using a morphological dilation.
Use OpenCV's findContours function to label all the blobs. This also gives you the width and height of each blob.
Here is the tricky part: now that you have data on each blob, run a clustering algorithm on the data with location (x, y) and geometry (width, height) as your features.
Once you cluster them correctly, it's a matter of finding the leftmost, rightmost, topmost, and bottommost data points to draw the bounding rectangle.
I hope this is enough information to start your work. It is not detailed, but I think it is enough to guide you.