pytesseract - Recognize text on different backgrounds - python

I'm trying to recognize the page number from a colored image, but pytesseract does not recognize anything with the command:
print(pytesseract.image_to_string(Image.open("C:\\Users\\user\\Desktop\\test\\colorimage.jpg")))
Then I tried to binarize the same image: same result.
Finally I decided to adjust the binarized image by creating a "unique" background.
See this results map: https://i.stack.imgur.com/sMD3M.jpg
How can I recognize the text without editing the original image in Python?
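For reference, a minimal sketch of the binarization step described above (the path, threshold method and page-segmentation mode are assumptions, not from the original post):
import cv2
import pytesseract

# Load the page, convert to grayscale and apply Otsu binarization.
img = cv2.imread(r"C:\Users\user\Desktop\test\colorimage.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# --psm 7 treats the input as a single text line, which suits an isolated page number.
print(pytesseract.image_to_string(binary, config='--psm 7'))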

Related

OCR text extraction from user interface images

I am currently using Pytesseract to extract text from images of e-commerce sites like Amazon, eBay, etc. to observe certain patterns. I do not want to use a web crawler, since this is about recognising certain patterns from the text on such sites. The image example looks like this:
However, every website looks different, so template matching wouldn't help either. Also, the image background is not all the same colour.
The code gives me about 40% accuracy. But if I crop the image into smaller pieces, it gives me all the text correctly.
Is there a way to take in one image, crop it into multiple parts and then extract text? Preprocessing the images does not help. What I have tried: rescaling, removing noise, deskewing, skewing, adaptiveThreshold, grayscale, Otsu, etc., but I am unable to figure out what to do.
try:
    from PIL import Image
except ImportError:
    import Image
import pytesseract
# import pickle


def ocr_processing(filename):
    """
    This function uses Pillow to open the file and Pytesseract to find strings in the image.
    """
    text = pytesseract.image_to_data(
        Image.open(filename), lang='eng', config='--psm 6')
    # text = pytesseract.image_to_string(
    #     Image.open(filename), lang='eng', config='--psm 11')
    return text
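One way to try the tiling idea asked about above - a hedged sketch, not part of the original question or answer; the grid size and file name are assumptions:
import pytesseract
from PIL import Image

# Split the screenshot into a grid of tiles and OCR each tile separately.
img = Image.open("screenshot.png")
rows, cols = 4, 2  # guessed grid; tune for the actual layout
tile_w, tile_h = img.width // cols, img.height // rows

texts = []
for r in range(rows):
    for c in range(cols):
        tile = img.crop((c * tile_w, r * tile_h, (c + 1) * tile_w, (r + 1) * tile_h))
        texts.append(pytesseract.image_to_string(tile, config='--psm 6'))
print("\n".join(texts))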
Just as a recommendation: if you have a lot of text that you want to detect through OCR (an example image is above), keras-ocr is a very good option, much better than pytesseract or using EAST alone. It was a suggestion provided in the comments section. It was able to trace 98.99% of the text correctly.
Here is the link to the Keras-ocr documentation: https://keras-ocr.readthedocs.io/en/latest/
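A minimal keras-ocr sketch following the documented pipeline (the image path is an assumption):
import keras_ocr

# Downloads pretrained detector and recognizer weights on first use.
pipeline = keras_ocr.pipeline.Pipeline()

images = [keras_ocr.tools.read("screenshot.png")]
predictions = pipeline.recognize(images)  # one list of (word, box) pairs per image

for word, box in predictions[0]:
    print(word)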

Saving an image as jpg gives me plain black

I am implementing Gabor kernels. When I display the kernels while running the code (before saving them), they give me a picture like this:
But after saving the kernels as jpg images using cv2.imwrite, I get something like this:
Any explanation? And how can I save the kernels so they look like the first image?
There could be different causes, so I have two suggestions:
If you display the first picture with plt.imshow(), export it with plt.savefig(). This should work without further changes (see the matplotlib sketch after the code below).
If you still want to export the image with cv2.imwrite(), make sure that the picture is correctly rescaled first (mind that if you have only one channel, you will get a grayscale picture).
If we call the original picture org_img:
import cv2
import numpy as np

img = org_img
# Stretch the kernel values to the full 0-255 range before saving.
min_val, max_val = img.min(), img.max()
img = 255.0 * (img - min_val) / (max_val - min_val)
img = img.astype(np.uint8)
cv2.imwrite("picture.png", img)  # note: cv2.imwrite takes the filename first
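For the first suggestion, a minimal matplotlib sketch (the Gabor kernel here is just an example generated with cv2.getGaborKernel):
import cv2
import matplotlib.pyplot as plt

# Example kernel; in the question this would be one of the computed Gabor kernels.
kernel = cv2.getGaborKernel((31, 31), 4.0, 0.0, 10.0, 0.5)
plt.imshow(kernel, cmap='gray')
plt.axis('off')
plt.savefig('kernel.png', bbox_inches='tight')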

Remove background of image containing text

I am building a custom OCR pipeline for some documents. After getting the ROIs I pass them to Tesseract. To improve accuracy I want to remove the background of the image. I am observing that when there are images like this:
Tesseract is not able to read anything (because of the lines in the image).
But for images like this, it gives correct results.
Can anyone suggest how to remove everything from the image except the text?
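One common approach (not from this thread) is to detect the long horizontal and vertical lines with morphological opening and subtract them before OCR - a hedged sketch, with the file name assumed:
import cv2

# Binarize with Otsu so the text and lines become white on black.
img = cv2.imread("document.png", cv2.IMREAD_GRAYSCALE)
_, thresh = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

# Long, thin structuring elements pick out the ruled lines but not the letters.
horiz = cv2.morphologyEx(thresh, cv2.MORPH_OPEN,
                         cv2.getStructuringElement(cv2.MORPH_RECT, (40, 1)))
vert = cv2.morphologyEx(thresh, cv2.MORPH_OPEN,
                        cv2.getStructuringElement(cv2.MORPH_RECT, (1, 40)))

# Remove the detected lines and flip back to black text on a white background.
cleaned = cv2.bitwise_and(thresh, cv2.bitwise_not(cv2.bitwise_or(horiz, vert)))
cv2.imwrite("cleaned.png", cv2.bitwise_not(cleaned))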

python: when reading and saving an image, the colors change

I tried loading and saving images with Python using cv2, PIL, and scipy, but the saved image has slightly different colors compared to the original.
I am loading and saving TIFF format, so I expect no color change.
Link to the image I am using:
https://data.csail.mit.edu/graphics/fivek/img/tiff16_c/a0486-jmac_MG_0791.tif
The difference between the loaded image and the saved image is:
Can you help me understand what I am doing wrong? Why do the colors change?
Update:
The problem is that the image is in ProPhoto RGB color.
Does anyone know how I can convert a batch of images from ProPhoto RGB to sRGB?
thanks,
yoav
option 1 (SciPy):
from scipy.misc import imread, imsave  # assuming scipy.misc, deprecated in recent SciPy versions
img = imread(file_name)
imsave('imread.tif', img)
option 2 (OpenCV):
import cv2
img = cv2.imread(file_name)
cv2.imwrite('cv2.tif', img)
option 3 (Pillow):
from PIL import Image
img = Image.open(file_name)
img.save('pil.tif')
I think OpenCV is more interested in Computer Vision - i.e. detecting and measuring objects - than in printing or high-quality image reproduction and editing, so it pretty much ignores ICC profiles. If anyone knows better, I am happy to be corrected.
You can use ImageMagick to convert images from one format to another, and to do many, many other things, one of which is changing colour profiles. So, I think, if you go to this website and download an sRGB profile (I chose the first one with "preference" in its name) and save it as sRGB.icc, you can change one of your ProPhoto images to a normal sRGB image with the following command in Terminal:
convert input.tif -profile sRGB.icc output.tiff
Try that and see if it works. If so, make a copy of your images and, on the copy, run mogrify to do the whole lot in one go - beware and make a copy like I suggest, because it will very quickly alter all your images:
magick mogrify -profile sRGB.icc *tif
You can see the embedded profile and loads of other information about an image using ImageMagick's identify command:
magick identify -verbose OneOfYourImages.tiff

Is it possible to change the background color of a specific region using PIL

I'm developing a Python task to generate images using PIL.
Is there any way to change the background color behind the numbers to yellow instead of white, while keeping the rest of the image white as it is?
Please see images below:
Original Image
Expected result is something like this:
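One possible approach (not from the original thread): recolor the white background inside a known bounding box while leaving the darker digits alone - the coordinates and file names are assumptions:
from PIL import Image

img = Image.open("original.png").convert("RGB")
box = (100, 50, 180, 90)  # (left, upper, right, lower) around the numbers

region = img.crop(box)
pixels = region.load()
for y in range(region.height):
    for x in range(region.width):
        if pixels[x, y] == (255, 255, 255):  # white background pixel
            pixels[x, y] = (255, 255, 0)     # turn it yellow
img.paste(region, box)
img.save("result.png")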
