So I'm using OpenCV in Python to look at a specific part of the screen using this code:
import numpy as np
from PIL import ImageGrab
import cv2
while True:
    printscreen_pil = ImageGrab.grab(bbox=(852, 530, 911, 575))
    printscreen = np.array(printscreen_pil.getdata(), dtype='uint8') \
        .reshape((printscreen_pil.size[1], printscreen_pil.size[0], 3))
    cv2.imshow('window', cv2.cvtColor(printscreen, cv2.COLOR_BGR2GRAY))
    if cv2.waitKey(25) & 0xFF == ord('q'):
        cv2.destroyAllWindows()
        break
I want to give an output whenever this part of the screen matches a reference image. I've been reading up on 100 different tutorials, but I'm still stuck at the moment.
I was using ImageGrab from PIL, but it is too slow for this project.
Are there any alternatives that don't use PIL?
Capturing an image is equivalent to capturing a single frame of a video. You can do it using OpenCV's VideoCapture class.
import cv2

cap = cv2.VideoCapture(0)  # video capture source: camera (here, the laptop's webcam)
ret, frame = cap.read()    # read a single frame into `frame`

while True:
    cv2.imshow('img', frame)  # display the captured image
    if cv2.waitKey(1) & 0xFF == ord('y'):  # save on pressing 'y'
        cv2.imwrite('images/test.png', frame)
        cv2.destroyAllWindows()
        break

cap.release()
Check the OpenCV tutorial for more information.
I have 5 images, namely im1, im2, im3, im4 and im5, all in JPG format.
I want to create an image carousel using these images.
I've started with the following code:
from time import sleep
import cv2

imagelist = ["im1.jpg", "im2.jpg", "im3.jpg", "im4.jpg", "im5.jpg"]
for image in imagelist:
    img = cv2.imread(image, 1)
    cv2.namedWindow("SCREEN")
    cv2.imshow("SCREEN", img)
    sleep(0.2)
    cv2.destroyAllWindows()
Problem: it creates a new cv2 window every 0.2 seconds and displays the image there. But I want it to display the images in the same opened window, without closing and creating multiple windows.
Kindly help me with this task.
Thank you
You don't need sleep; you need cv2.waitKey(). I tested this and it works fine.
The waitKey function takes an int delay in milliseconds, but it also returns the code of any key pressed during that delay, which you can use to set up keypress commands, e.g. quit when pressing q. If you call it with no delay (or 0), it waits and advances one step on any keypress.
I just used glob to grab all the .jpg files in the folder, but listing the images manually like you did will work fine too.
import cv2
import glob

imagelist = glob.glob("*.jpg")
for image in imagelist:
    img = cv2.imread(image)
    cv2.imshow("SCREEN", img)
    cv2.waitKey(20)
This is an image where Pytesseract is guessing what's in the small window with '59' in white text below.
The window is a live screen grab, not a static image.
[EDIT] I was advised to post the small image so people can experiment with it, so here it is:
Here is the code:
import numpy as np
import cv2
from PIL import ImageGrab
import pytesseract as loki

loki.pytesseract.tesseract_cmd = r"C:\Users\Rahul And Anisha\AppData\Local\Tesseract-OCR\tesseract.exe"

while True:
    Odo = ImageGrab.grab(bbox=(1055, 505, 1170, 570))
    Speed = loki.image_to_string(Odo)
    # PIL gives RGB; convert to BGR so imshow displays the colours correctly
    Odo = cv2.cvtColor(np.array(Odo), cv2.COLOR_RGB2BGR)
    cv2.imshow('Speed', Odo)
    print(Speed)
    if cv2.waitKey(25) & 0xFF == ord('q'):
        cv2.destroyAllWindows()
        break
The problem is that no matter what config I set (I tried --psm 1 through --psm 13), Tesseract is unable to guess the number correctly.
What's the problem here?
Try adding a little empty area around the text (padding). The code below is for the smaller image:
import cv2
import numpy as np
import pytesseract

img = cv2.imread('small.png')   # the posted crop (adjust the filename to yours)
rows, cols = img.shape[:2]

# shift the text 25 px right and down inside a double-size canvas,
# filling the border with grey, so the digits no longer touch the edge
M = np.float32([[1, 0, 25], [0, 1, 25]])
img = cv2.warpAffine(img, M, (cols * 2, rows * 2), borderValue=(127, 127, 127))

custom_oem_psm_config = r'--oem 3 --psm 3 -c tessedit_char_whitelist=1234567890'
print(pytesseract.image_to_string(img, config=custom_oem_psm_config))
This should work, but try passing a binarized image instead: Tesseract works best on binarized input, and preprocessing is essentially mandatory before passing the image to it. PSM modes only change page segmentation; they do not preprocess the image.
Please correct me if I am wrong.
I have a video file (i.e. https://www.example.com/myvideo.mp4) and need to load it with OpenCV.
Doing the equivalent with an image is fairly trivial:
imgReq = requests.get("https://www.example.com/myimage.jpg")
imageBytes = np.asarray(bytearray(imgReq.content), dtype=np.uint8)
loadedImage = cv2.imdecode(imageBytes, cv2.IMREAD_COLOR)
I would like to do something similar to the following (where loadedVideo will be similar to what OpenCV returns from cv2.VideoCapture):
videoReq = requests.get("https://www.example.com/myvideo.mp4")
videoBytes = np.asarray(bytearray(videoReq.content), dtype=np.uint8)
loadedVideo = cv2.videodecode(videoBytes, cv2.IMREAD_COLOR)
But cv2.videodecode does not exist. Any ideas?
Edit: Seeing as this may be a dead end with OpenCV alone, I'm open to solutions that combine other imaging libraries before loading into OpenCV...if such a solution exists.
It seems that cv2.videodecode is not a valid OpenCV API in either OpenCV 2.x or OpenCV 3.x.
Below is sample code that works in OpenCV 3, using the cv2.VideoCapture class.
import numpy as np
import cv2

# Open a sample video available on sample-videos.com
vcap = cv2.VideoCapture('https://www.sample-videos.com/video/mp4/720/big_buck_bunny_720p_2mb.mp4')

# if not vcap.isOpened():
#     print("File Cannot be Opened")

while True:
    # Capture frame-by-frame
    ret, frame = vcap.read()
    # print(vcap.isOpened(), ret)
    if frame is not None:
        # Display the resulting frame
        cv2.imshow('frame', frame)
        # Press q to close the video window before the video ends, if you want
        if cv2.waitKey(22) & 0xFF == ord('q'):
            break
    else:
        print("Frame is None")
        break

# When everything is done, release the capture
vcap.release()
cv2.destroyAllWindows()
print("Video stop")
You may check the Getting Started with Videos tutorial for more information.
Hope this helps.
You will have to read the video using VideoCapture; there is no other way around that for now, unless you implement the decoding yourself.
Remember that a video is just a sequence of images changing at a defined frame rate, so you can read each frame in a while loop. Note that cap.read() already returns decoded frames, so no imdecode call is needed.
import cv2

cap = cv2.VideoCapture('https://www.example.com/myvideo.mp4')

while cap.isOpened():
    # read() hands back a decoded BGR frame directly
    ret, frame = cap.read()
    if not ret:
        break
    cv2.imshow('frame', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
I have two pieces of code. Here is the first one. It was mainly copied from the "Saving a Video" section of the OpenCV-Python tutorial website, but I modified it a little bit.
import numpy as np
import cv2

cap = cv2.VideoCapture(0)
cap.set(7, 200)  # 7 = CV_CAP_PROP_FRAME_COUNT
out = cv2.VideoWriter('output.avi', cv2.cv.CV_FOURCC('X','V','I','D'), 20.0, (640, 480))

while cap.isOpened():
    ret, frame = cap.read()
    if ret:
        out.write(frame)
        cv2.imshow('frame', frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    else:
        break

cap.release()
cv2.destroyAllWindows()
Second one is here:
import cv

cv.NamedWindow('camera', 1)
cap = cv.CaptureFromCAM(0)
fps = 20
fourcc = cv.CV_FOURCC('X','V','I','D')
cv.SetCaptureProperty(cap, cv.CV_CAP_PROP_FRAME_COUNT, 200)
out = cv.CreateVideoWriter('output.avi', fourcc, fps, (640, 480))

while True:
    img = cv.QueryFrame(cap)
    cv.WriteFrame(out, img)
    cv.ShowImage('camera', img)
    if cv.WaitKey(1) & 0xFF == ord('q'):
        break

cv.DestroyAllWindows()
Neither of them saves a video file or destroys the window at the end. No errors occur in the shell after running the code. I'm using Python 2.7.6 and OpenCV 2.3.1. Can somebody help me? Thanks a lot.
PS: I am not sure whether my method of setting the frame count is correct.
It may have several causes. Check the following:
Check that you can encode with XVID; maybe try MJPEG first.
Set the width and height of your input video with cap.set(3, 640) and cap.set(4, 480), so the captured frames match the (640, 480) size you gave the writer.
Also call out.release() when you are done; without it the writer may never finalize the file.