I am trying to write a code which, given an image, will run tessearct on the entire image and return a map of all locations where text was detected (as a binary image).
It doesn't have to be pixel-by-pixel, a union of bounding boxes is more than enough.
Is there a way to do this?
Thanks in advance
Yes... (of course). Look at the Python imaging library for loading the image, and cropping it. Then you can apply Tesseract on each piece and check the output.
Have a look at the program I answered a while back. It might help you with the elements you need. It lets you manually select and area, and OCR it. But that can be easily changed.
Related
I am trying to detect the pdf417 barcode (2D barcode) from an image using python.
I will be receiving images of IDs where there is a barcode in them but it might not always be straight. So I am looking for an effective way to DETECT the pdf417 barcode using Python.
I tried all of the available methods that I could find (that uses python)
e.g.,
pdf417decoder: requires the image to be cut exactly around the barcode just like in the image below:
pyzbar: only detects 1D barcodes
python-zxing and zxing: didn't detect any of the pdf417 barcodes that I tried (around 10 different IDs - different country)
Barcode-detection: this is a DL approach that uses YOLO-V3 to detect barcodes, but again (after trying it), it only detects 1D barcodes...
Is there a method that I missed?
Am I using a wrong approach towards this problem?
Possible solution that I am thinking of: using computer vision (some filters and transformations) to detect a box that has black and white dots... Something similar to this.
Thanks!
After various trials, I ended up using an approach of template matching by OpenCV.
You need to precisely choose your template image that will be the search reference of your algorithm. You need to feed it some grayscaled images.
Then, you need to choose the boxes that have a result higher than a certain threshold (for me 0.55). Then apply NMS (non max suppression) to filter out the noisy boxes.
But keep in mind that there are many edge cases to encounter. If someone is interested to see the complete solution, please let me know.
Forgive me but I'm new in OpenCV.
I would like to delete the common background in 3 images, where there is a landscape and a man.
I tried some subtraction codes but I can't solve the problem.
I would like output each image only with the man and without landscape
Are there in OpenCV Algorithms what do this do? (then without any manual operation so no markers or other)
I tried this python code CV - Extract differences between two images
but not works because in my case i don't have an image with only background (without man).
I thinks that good solution should to Compare all the images and save those "points" that are the same at least in an image.
In this way I can extrapolate a background (which we call "Result.jpg") and finally analyze each image and cut those portions that are also present in "Result.jpg".
You say it's a good idea? Do you have other simplest ideas?
Without semantic segmentation, you can't do that.
Because all you can compute is where two images differ, and this does not give you the silhouette of the person, but an overlapping of two silhouettes. You'll never know the exact outline.
I have an image like the following:
and I would want to extract the text from it, that should be ws35, I've tried with pytesseract library using the method :
pytesseract.image_to_string(Image.open(path))
but it returns nothing... Am I doing something wrong? How can I get back the text using the OCR ? Do I need to apply some filter on it ?
You can try the following approach:
Binarize the image with a method of your choice (Thresholding with 127 seems to be sufficient in this case)
Use a minimum filter to connect the lose dots to form characters. Thereby, a filter with r=4 seems to work quite good:
If necessary the result can be further improved via application of a median blur (r=4):
Because i personally do not use tesseract i am not able to try this picture, but online ocr tools seem to be able to identify the sequence correctly (especially if you use the blurred version).
Similar to #SilverMonkey's suggestion: Gaussian blur followed by Otsu thresholding.
The problem is that this picture is low quality and very noisy!
even proffesional and enterprisal programs are struggling with this
you have most likely seen a capatcha before and the reason for those is because its sent back to a database with your answer and the image and then used to train computers to read images like these.
short answer is: pytesseract cant read the text inside this image and most likely no module or proffesional programs can read it either.
You may need apply some image processing/enhancement on it. Look at this post read suggestions and try to apply.
fig:Shoe in the red circle is to be detected
I am trying to create a python script using cv2 that can recognize the shoe of the baller and determine whether the shoe is beyond, on or before the white line(refer to the image).
I have no idea about any kind of approach to use, what kind of algorithms might be helpful. Need some guideline, please help!
(Image is attached)
I realize this would work better as a comment because it isn't a full answer, but I don't have enough rep yet to leave comments, haha.
You may be interested in OpenCV's Canny Edge detection algorithm:
http://docs.opencv.org/trunk/da/d22/tutorial_py_canny.html
This will allow you to find shapes within your image.
Also, you can find similarly colored blobs using SimpleBlobDetector:
https://www.learnopencv.com/blob-detection-using-opencv-python-c/
This should make it fairly easy to detect the white line.
In order to detect a more complex object like the shoe, you'll probably have to make something like a object detection cascade file and use a CascadeClassifier to find it:
http://docs.opencv.org/2.4/doc/tutorials/objdetect/cascade_classifier/cascade_classifier.html#cascade-classifier
http://johnallen.github.io/opencv-object-detection-tutorial/
Basically, you take a bunch of pictures to "teach" what the object looks like, and output that info to a file that a CascadeClassifier can use to detect objects in input images. It may be hard to distinguish between different brands of shoe though, if you need it to be that specific. Also, you may need to adjust the input images (saturation, brightness, etc) before trying to detect objects in order to get good results.
there is this project am currently working on, which requires me to watermark every uploaded image. i have tried series of examples online, but they are not giving me what i really want as result
for example
i have an image A with image B watermarked on it, the two images are of the same dimensions. i applied opacity of 0.5 on image B before placing it on image A
now, i would really appreciate if anyone could help with a boolean function to check if image A has already been watermarked with image B before watermarking it.
thanks.
This depends on several factors that you'll need to provide more information for.
For instance, how complex are these images? Is there a lot of noise? Are the images that are uploaded similar in any way, or are they heterogenous? Are the watermarks always the same, or are they different?
As a general principle for extracting objects from images, you should look into processes such as color deconvolution, thresholding, and blob extraction.
In short--some sample images would go a long way...
yeah, finally found a dubious way to solve the problem, by hiding a specific text on the alpha layer of the image after watermarking it using steganography.
so on every upload, i get the image, iterate through the lowest pixels of the image's alpha layer, then compare the result to the text. if the result matches the text, definitely, the image has been watermarked.