Pytesseract, trying to detect text from on screen

I'm using MSS together with pytesseract to read a string of characters from a monitored region of the screen. My code is as follows:
import Image
import pytesseract
import cv2
import os
import mss
import numpy as np
with mss.mss() as sct:
    mon = {'top': 0, 'left': 0, 'width': 150, 'height': 150}
    im = sct.grab(mon)
    im = np.asarray(im)
    im_gray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
    #im_gray = plt.imshow(im_gray, interpolation='nearest')
    cv2.imwrite("test.png", im_gray)
    #cur_dir = os.getcwd()
    text = pytesseract.image_to_string(Image.open(im_gray))
    print(text)
    cv2.imshow("Image", im)
    cv2.imshow("Output", im_gray)
    cv2.waitKey(0)
This fails with the following error: AttributeError: 'numpy.ndarray' object has no attribute 'read'
I have also tried converting it back to an image using pyplot, as shown by the commented-out line in the code sample. However, that produces the error: TypeError: img is not a numpy array, neither a scalar
I'm somewhat new to Python (I just started dabbling with it on Sunday), but I've been fairly successful with my other attempts at detecting images. To reach my end goal, however, I'll need to be able to read characters on screen. They will consistently be the same font and size, so I don't have to worry about scaling issues. For the time being, I'm trying to understand how it works by holding an image of the Recycle Bin icon on the desktop in memory (without saving it to a file) and trying to extract the string "Recycle Bin" from it.
UPDATE
I think I may have made a breakthrough, but there are some issues if I try to display the stream at the same time. However, I may be able to process the stream fast enough by using temporary files.
My updated code is as follows:
from PIL import Image
from PIL import ImageGrab
import pytesseract
import cv2
import os
import mss
import numpy as np
from matplotlib import pyplot as plt
import tempfile
png = tempfile.NamedTemporaryFile(mode="wb")
with mss.mss() as sct:
    #while True:
    mon = {'top': 0, 'left': 0, 'width': 150, 'height': 150}
    im = sct.grab(mon)
    im_array = np.asarray(im)
    #im_gray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
    #with tempfile.NamedTemporaryFile(mode="wb") as png:
    png.write(im_array)
    im_name = png.name
    print(png.name)
    #cv2.imwrite("test.png", im_gray)
    #cur_dir = os.getcwd()
    #text = pytesseract.image_to_string(Image.open(im_name))
    #print(text)
    cv2.imshow("Image", im_array)
    #cv2.imshow("Output", im_gray)
    cv2.waitKey(0)
This currently fails with a permission-denied error:
  File "C:\Python\Python36-32\Lib\idlelib\ocr.py", line 27, in <module>
    text = pytesseract.image_to_string(Image.open(im_name))
  File "C:\Python\Python36-32\lib\site-packages\PIL\Image.py", line 2543, in open
    fp = builtins.open(filename, "rb")
PermissionError: [Errno 13] Permission denied: 'C:\\Users\\JMCOLL~1\\AppData\\Local\\Temp\\tmp7_mwy2k9'
I am skeptical that this is normal, and I will try this update on my laptop at home. It could be due to restrictions on my work laptop, which I don't have time to work around.
I am also confused about why displaying the image without the while True: loop works fine, as a single screenshot, while putting it inside a while True: loop causes the window to freeze.

I can get this code working:
import time
import cv2
import mss
import numpy
import pytesseract
mon = {'top': 0, 'left': 0, 'width': 150, 'height': 150}
with mss.mss() as sct:
    while True:
        im = numpy.asarray(sct.grab(mon))
        # im = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
        text = pytesseract.image_to_string(im)
        print(text)
        cv2.imshow('Image', im)
        # Press "q" to quit
        if cv2.waitKey(25) & 0xFF == ord('q'):
            cv2.destroyAllWindows()
            break
        # One screenshot per second
        time.sleep(1)
The one-second sleep is probably a good idea, since it keeps the loop from saturating your CPU.
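Regarding the PermissionError in the question's update: it is consistent with a documented limitation of tempfile.NamedTemporaryFile, namely that on Windows the file cannot be opened a second time by name while the original handle is still open. Note also that writing a NumPy array straight to the file stores raw pixel bytes, not an encoded PNG. The following is a minimal sketch of one workaround, assuming it is acceptable to clean up the file manually; the payload bytes are a placeholder, not real image data.

```python
import os
import tempfile

# Placeholder payload; in the question this would be encoded image bytes.
payload = b"example bytes standing in for PNG data"

# delete=False keeps the file on disk after close(), so it can be reopened by name.
tmp = tempfile.NamedTemporaryFile(mode="wb", suffix=".png", delete=False)
try:
    tmp.write(payload)
    tmp.close()  # release the handle before anything else opens tmp.name

    # A second open by name now works on Windows as well as on Unix.
    with open(tmp.name, "rb") as reopened:
        round_trip = reopened.read()
finally:
    tmp.close()          # harmless if already closed
    os.remove(tmp.name)  # manual cleanup, since delete=False disables auto-removal

print(round_trip == payload)
```

The loop in the answer above avoids the problem entirely by never touching the disk, which is likely the simpler choice when the image is already in memory.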

Related

Tesseract doesn't recognize characters(numbers)

I'm trying to read a water level from a photo file (JPG or PNG), but Tesseract does not read any characters at all, even after I crop out all the unnecessary areas of the photo.
I put a print call at the end of the code to see the recognized number, but nothing appears.
Here is the photo I have, and underneath is the Python code.
import cv2
import os
from PIL import Image
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract'
image = cv2.imread ("water002.jpg")
#gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.Canny(image, 300, 350)
filename = "{}.png".format(os.getpid())
cv2.imwrite(filename, gray)
text = pytesseract.image_to_string(Image.open(filename), lang=None)
#os.remove(filename)
print(text)
cv2.imshow("Image", image)
cv2.waitKey(0)
Are there any tips for getting Tesseract to read the level, or is there another method?

Opencv Error:(-215)

I am writing this simple code, but it raises an error saying size.width>0 && size.height>0 in function imshow():
import numpy as np
import cv2
img = cv2.imread('C:/Users/Desktop/x.jpg',0)
cv2.namedWindow('image',cv2.WINDOW_NORMAL)
cv2.imshow('image',img)
cv2.namedWindow('image',cv2.WINDOW_NORMAL)
cv2.waitKey(0)
cv2.destroyAllWindows()

Assuming you are using Windows, write path name as:
img = cv2.imread('C:\\Users\\Desktop\\x.jpg', 0)
to load the image correctly.
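As a hedged aside, the path separators may not be the whole story: forward slashes as written in the question are also valid on Windows, and cv2.imread returns None instead of raising when it cannot read a file, which is what later trips the size check inside imshow. The sketch below uses only the standard library to show that the escaped and raw spellings are the same string, and to check that a file exists before it is handed to an image loader. The helper describe_path is a hypothetical name introduced for illustration.

```python
import os

escaped = 'C:\\Users\\Desktop\\x.jpg'  # backslashes doubled, as in the answer
raw = r'C:\Users\Desktop\x.jpg'        # raw string, no escaping needed
forward = 'C:/Users/Desktop/x.jpg'     # forward slashes, as in the question

# Both spellings produce exactly the same characters.
print(escaped == raw)

def describe_path(path):
    """Report whether a path points at an existing file (hypothetical helper)."""
    if os.path.isfile(path):
        return "found: " + path
    return "missing: " + path

# Checking first gives a clearer message than the assertion raised by imshow.
for candidate in (escaped, forward):
    print(describe_path(candidate))
```

If the check reports the file as missing, the path itself is wrong, regardless of which separator style is used.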

opencv-3.0.0 on Raspberry error

Hi, I have this error with OpenCV 3.0.0 on a Raspberry Pi:
# OpenCV_test1.py
# this program opens the file in the same directory named "image.jpg" and displays the original image and the Canny edges of the original image
import cv2
import numpy as np
import os
#
def main():
    imgOriginal = cv2.imread("image.jpg") # open image
    if imgOriginal is None: # if image was not read successfully
        print("error: image not read from file \n\n") # print error message to std out
        os.system("pause") # pause so user can see error message
        return # and exit function (which exits program)
    # end if
    imgGrayscale = cv2.cvtColor(imgOriginal, cv2.COLOR_BGR2GRAY) # convert to grayscale
    imgBlurred = cv2.GaussianBlur(imgGrayscale, (5, 5), 0) # blur
    imgCanny = cv2.Canny(imgBlurred, 100, 200) # get Canny edges
    cv2.namedWindow("imgOriginal", cv2.WINDOW_AUTOSIZE) # create windows, use WINDOW_AUTOSIZE for a fixed window size
    cv2.namedWindow("imgCanny", cv2.WINDOW_AUTOSIZE) # or use WINDOW_NORMAL to allow window resizing
    cv2.imshow("imgOriginal", imgOriginal) # show windows
    cv2.imshow("imgCanny", imgCanny)
    cv2.waitKey() # hold windows open until user presses a key
    cv2.destroyAllWindows() # remove windows from memory
    return
#
if __name__ == "__main__":
    main()
And this is the error:
python OpenCV_test1.py
File "OpenCV_test1.py", line 4
import cv2 import numpy as np import os
^
SyntaxError: invalid syntax

Imports must go on separate lines (or be separated by semicolons, but please don't).
Change
import cv2 import numpy as np import os
to
import cv2
import numpy as np
import os

opencv python face detection from url images

I have no problem getting the opencv face detection using haar feature based cascades working on saved images:
import cv2
from PIL import Image
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
eye_cascade = cv2.CascadeClassifier('haarcascade_eye.xml')
img = cv2.imread('pic.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
faces = face_cascade.detectMultiScale(gray, 1.3, 5)
However, I can't figure out how to open an image from a URL and pass it into face_cascade. I've been playing around with cStringIO, but I don't know what to do with it:
import cv2.cv as cv
import urllib, cStringIO
img = 'http://scontent-b.cdninstagram.com/hphotos-prn/t51.2885-15/10424498_582114441904402_1105042543_n.png'
file = cStringIO.StringIO(urllib.urlopen(img).read())
source = Image.open(file).convert("RGB")
bitmap = cv.CreateImageHeader(source.size, cv.IPL_DEPTH_8U, 3)
cv.SetData(bitmap, source.tostring())
cv.CvtColor(bitmap, bitmap, cv.CV_RGB2BGR)
Is it possible to work with a numpy array instead?
source2 = Image.open(file)
imarr=numpy.array(source2,dtype=numpy.uint8)
I'm a beginner, so I apologize for the poor explanation.
Thanks a lot in advance!

In your first example you are using cv2.imread to read your image; in the second you are presumably using PIL.Image and then trying to convert.
Why not simply save the file to a temporary directory and then use cv2.imread again?

Alternatively, you can use the VideoCapture class to open an image from a URL.
See the C++ code below:
VideoCapture cap;
if(!cap.open("http://docs.opencv.org/trunk/_downloads/opencv-logo.png")){
    cout<<"Cannot open image"<<endl;
    return -1;
}
Mat src;
cap>>src;
imshow("src",src);
waitKey();

python open cv loading image

I'm new to Python and OpenCV. I'm trying to find out how to load an image in OpenCV with Python. Can anyone provide an example (with code) explaining how to load the image and display it?
import sys
import cv
from opencv.cv import *
from opencv.highgui import *
ll="/home/pavan/Desktop/iff pics/out0291.tif"
img= cvLoadImage( ll );
cvNamedWindow( “Example1”, CV_WINDOW_AUTOSIZE );
cvShowImage( “Example1”, img );
cvWaitKey(10);
cvDestroyWindow( “Example");

There have been quite a few changes in the OpenCV 2 API:
import cv
ll = "/home/pavan/Desktop/iff pics/out0291.tif"
img = cv.LoadImage(ll)
cv.NamedWindow("Example", cv.CV_WINDOW_AUTOSIZE )
cv.ShowImage("Example", img )
cv.WaitKey(10000)
cv.DestroyWindow("Example")
The syntax is simpler and cleaner!
Also, you don't need MATLAB-style trailing semicolons. Finally, be careful about the quotes you use.
For the newer OpenCV 3 API, see the other answer to this question.

import cv2
image_path = "/home/jay/Desktop/earth.jpg"
img = cv2.imread(image_path) # read the image
cv2.imshow('image', img) # show the image in a window; the first parameter is its title
cv2.waitKey(0) # wait for a key to be pressed in the window
cv2.destroyAllWindows() # destroy the window once a key is pressed

There are 2 possible approaches to this:
Using argparse (recommended):
import cv2
import argparse
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True, help="Path to the image")
args = vars(ap.parse_args())
image = cv2.imread(args["image"])
This takes the image path as a command-line argument, parses it into a dictionary, and then loads the image using the imread() function.
To run it:
Go to your required folder
source activate your environment
python filename.py -i img.jpg
Hardcoding the image location:
import cv2
img = cv2.imread(r"\File\Loca\img.jpg")
cv2.imshow("ImageName",img)
cv2.waitKey(0)
cv2.destroyAllWindows()
Run this similarly, omitting the arguments.
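To see what the argparse approach produces without needing an image on disk, the sketch below passes an explicit argument list to parse_args(), which simulates running python filename.py -i img.jpg. The function name build_parser and the description string are assumptions made for illustration; the -i/--image option mirrors the code above.

```python
import argparse

def build_parser():
    """Build the same parser as above (hypothetical helper for illustration)."""
    ap = argparse.ArgumentParser(description="Load an image given on the command line")
    ap.add_argument("-i", "--image", required=True, help="Path to the image")
    return ap

# Passing a list to parse_args() simulates: python filename.py -i img.jpg
args = vars(build_parser().parse_args(["-i", "img.jpg"]))
print(args)  # {'image': 'img.jpg'}
```

vars() turns the parsed namespace into a plain dictionary, which is why the original code looks the path up with args["image"].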
