I am trying to read text from images. Each image consists of a single character, but it is not being read correctly. This is the type of image I have:
It reads this image as 't'.
Most of the images are read incorrectly.
These are some of my images:
This is my code:
import pytesseract
from PIL import Image
import requests
import io
text = pytesseract.image_to_string(Image.open('15.png'), lang='eng', config='--psm 10')
print(text)
I tried the following as a workaround:
from pytesseract import pytesseract
from PIL import Image
img = Image.open('img.jpg')
text = pytesseract.image_to_string(img, config='')
# Displaying the extracted text
print(text[:-1])
But this code does not extract all the text.
Here is the output:
I am using the tinytag module in Python to get the cover art of an MP3 file, and I want to display or store it. The returned value is of type bytes. I have tried using PIL's frombytes, but to no avail. Is there a way to convert the bytes to an image?
from tinytag import TinyTag
tag = TinyTag.get("03. Me, Myself & I.mp3", image=True)
img = tag.get_image()
I actually got a PNG image when I called tag.get_image() but I guess you might get a JPEG. Either way, you can wrap it in a BytesIO and open it with PIL/Pillow or display it. Carrying on from your code:
from PIL import Image
import io
...
im = tag.get_image()
# Make a PIL Image
pi = Image.open(io.BytesIO(im))
# Save as PNG, or JPEG
pi.save('cover.png')
# Display
pi.show()
Note that you don't have to use PIL/Pillow. You could look at the first few bytes: if they are the PNG signature (\x89PNG), write the data to a binary file with a .png extension; if they are the JPEG signature (\xff\xd8), write it with a .jpg extension.
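As a minimal sketch of that signature check, assuming only PNG and JPEG need handling (the helper names guess_image_extension and save_cover are made up for illustration and are not part of tinytag or Pillow):

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"   # full 8-byte PNG signature
JPEG_SIGNATURE = b"\xff\xd8"           # JPEG start-of-image marker


def guess_image_extension(data: bytes) -> str:
    """Return a file extension based on the leading bytes of the image data."""
    if data.startswith(PNG_SIGNATURE):
        return ".png"
    if data.startswith(JPEG_SIGNATURE):
        return ".jpg"
    raise ValueError("unrecognised image signature")


def save_cover(data: bytes, stem: str = "cover") -> str:
    """Write the raw bytes unchanged to <stem>.<ext> and return the path."""
    path = stem + guess_image_extension(data)
    with open(path, "wb") as f:
        f.write(data)
    return path
```

With the code from the question, save_cover(tag.get_image()) would then write cover.png or cover.jpg without decoding the image at all.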
I need to extract digits from images (see the sample images). I tried pytesseract, but it is not working: it produces empty results. Below is the code I am using.
Code
import pytesseract
import cv2
img = cv2.imread('image_path')
digits = pytesseract.image_to_string(img)
print(digits)
Sample Images
I have a large pool of images, as shown above. Tesseract is not working on any of them.
Try adding the config option --psm 7, which tells Tesseract to treat the image as a single text line:
import pytesseract
import cv2
img = cv2.imread('image_path')
digits = pytesseract.image_to_string(img, config='--psm 7')
print(digits)
#'971101004900 1545'
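Since only digits are expected, the page segmentation mode could also be combined with a character whitelist. This is a rough sketch, not something tested on the sample images: build_tesseract_config is a hypothetical helper, and the tessedit_char_whitelist variable may be ignored by some Tesseract versions or engines.

```python
def build_tesseract_config(psm: int = 7, whitelist: str = "0123456789") -> str:
    """Build a config string suitable for pytesseract.image_to_string."""
    parts = [f"--psm {psm}"]
    if whitelist:
        # Ask Tesseract to restrict its output to the listed characters
        parts.append(f"-c tessedit_char_whitelist={whitelist}")
    return " ".join(parts)


config = build_tesseract_config()
print(config)  # --psm 7 -c tessedit_char_whitelist=0123456789
```

The resulting string is passed the same way as before, for example pytesseract.image_to_string(img, config=config).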
This is the first time I am working with OCR. I have an image and want to extract data from it. My image looks like this:
I have 500 such images and have to record the parameters and their respective values. I would rather do it through code than manually.
I have tried the Python libraries pytesseract and PIL. They perform well if the image contains only simple text. This is what I tried:
from PIL import Image, ImageEnhance, ImageFilter
from pytesseract import image_to_string
from pytesseract import image_to_boxes
im = Image.open("AHU.png")
im = im.filter(ImageFilter.MedianFilter())
enhancer = ImageEnhance.Contrast(im)
im = enhancer.enhance(2)
im = im.convert('1')
im.save('temp2.jpg')
text = image_to_string(Image.open('temp2.jpg'))
print(text)
What should I do in this case, where there are several parameters? All my images are similar with respect to the position of the values.
I need some help with image processing. I'm working on a script that detects letters in an image and traces them. For example, if there is a letter A in the image, the code has to detect it and trace it 3-4 times (side by side, not over the same line) in different colors, at a given distance based on the width of the text. So far I am able to detect the words, font, and size using the tesserocr module, but I'm unable to do the tracing.
import io
import tesserocr
from PIL import Image
with tesserocr.PyTessBaseAPI() as api:
    image = Image.open("1d.png")
    api.SetImage(image)
    api.Recognize()
    # Get a result iterator and print the font attributes of the first word
    iterator = api.GetIterator()
    print(iterator.WordFontAttributes())
Thanks in advance #peace