I have 5 images, namely im1, im2, im3, im4 and im5 which are all in JPG format.
I want to create an image carousel using these images.
I've started with the following code:
from time import sleep
import cv2
imagelist = ["im1.jpg", "im2.jpg", "im3.jpg", "im4.jpg", "im5.jpg"]
for image in imagelist:
    img = cv2.imread(image, 1)
    cv2.namedWindow("SCREEN")
    cv2.imshow("SCREEN", img)
    sleep(0.2)
    cv2.destroyAllWindows()
Problem: It actually creates a cv2 window every 0.2 seconds and
displays the image. But I want it to display the image in the same
opened window without closing and creating multiple windows.
Kindly help me doing this task.
Thank you
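You don't need sleep; use cv2.waitKey() instead. imshow only refreshes the window while waitKey is running, so sleep just blocks without drawing anything. I tested it and this should work fine.
The waitKey function takes an int delay in ms, and it also returns the code of any key pressed during that time, which you can use to set up keypress commands, e.g. quit when pressing q. If you leave the argument empty (or pass 0), it waits indefinitely and advances one step on any keypress.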
I just used glob to grab all the .jpg in the folder but replacing it with the images manually in a list like you did will work fine.
import cv2
import glob
imagelist = glob.glob("*.jpg")
for image in imagelist:
    img = cv2.imread(image)
    cv2.imshow("SCREEN", img)
    cv2.waitKey(20)  # delay in ms; use 200 for the 0.2 s from the question
I have read a folder containing pictures using glob and imread. Now I want to resize all of those pictures using a for loop with cv2.resize.
The following is my code, but the output is not correct:
import cv2
import glob
path = glob.glob("C:/Users/RX-91-9/Desktop/prescriptions/*.jpg")
for file in (path):
    img=cv2.imread(file)
    cv2.imshow("Image", img)
    cv2.cv2.waitKey(3)
    cv2.destroyAllWindows()
for i in img:
    resized_image = cv2.resize(i, (1600,1600))
    cv2.imshow('resized_image', resized_image)
    cv2.waitKey(3)
    cv2.destroyAllWindows()
I don't know why the last for loop is not giving the expected output. I want all the images in 'img' to be resized. Please help if you can see what is wrong in my last for loop.
I assume that you have a list of images in some folder and you want to resize all of them. You can run
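import os
import cv2
import glob
for filename in glob.glob('images/*.jpg'): # path to your images folder
    print(filename)
    img=cv2.imread(filename)
    rl=cv2.resize(img, (500,500))
    stem, ext = os.path.splitext(filename) # split off ".jpg" so it is not repeated in the new name
    cv2.imwrite(f'{stem}_resized{ext}', rl) # e.g. images/a.jpg -> images/a_resized.jpg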
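As for why the last loop in the question misbehaves, a likely cause is that after the first loop img holds only the last image that was read, and iterating over a NumPy array walks through its rows rather than through images. A small sketch with a made-up 4x6 array standing in for a loaded image (the sizes and the names row_shapes and images are placeholders, not from the question):

```python
import numpy as np

# Stand-in for one image returned by cv2.imread: 4 rows, 6 columns, 3 channels.
img = np.zeros((4, 6, 3), dtype=np.uint8)

# "for i in img" yields one row of pixels at a time, not one image at a time.
row_shapes = [row.shape for row in img]
print(len(row_shapes), row_shapes[0])  # 4 (6, 3)

# To loop over several images later, collect them in a list while reading.
images = []
for _ in range(3):  # in real code: for file in path: images.append(cv2.imread(file))
    images.append(np.zeros((4, 6, 3), dtype=np.uint8))
image_shapes = [im.shape for im in images]
print(image_shapes)  # [(4, 6, 3), (4, 6, 3), (4, 6, 3)]
```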
This is the code that I am using for OpenCV to display image. It only shows me a blank screen instead of showing a picture.
import cv2
# location and name of file is completely correct
img = cv2.imread("./Resources/img-2.jpg")
# Doesn't give a null so its okay
print(img.shape)
# suspecting that problem is here
cv2.imshow("preview", img)
cv2.waitKey(0)
cv2.destroyAllWindows()
The image is stored in the right location and when I'm using a similar approach for a video and a webcam, it works perfectly.
The following is the output:
Try using matplotlib instead:
import matplotlib.pyplot as plt
import cv2
img = cv2.imread("./Resources/img-2.jpg")
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) # imread returns BGR; convert to RGB so that matplotlib displays the colours properly
plt.imshow(img)
plt.show()
If it still gives you a blank image, then the problem might come from your file or filename.
This is an image with Pytesseract guessing what's on small window with '59' below in the white text.
The window is a live screen grab and not a static image.
[EDIT] Was advised to post the small image so people can experiment with it, so here:-
Here is the code:
import numpy as np
import cv2
from PIL import ImageGrab
import pytesseract as loki
loki.pytesseract.tesseract_cmd = r"C:\Users\Rahul And Anisha\AppData\Local\Tesseract-OCR\tesseract.exe"
while True:
    Odo = ImageGrab.grab(bbox = (1055,505, 1170, 570))
    Speed = loki.image_to_string(Odo)
    Odo = cv2.cvtColor(np.array(Odo), cv2.COLOR_BGR2RGB)
    cv2.imshow('Speed' , Odo)
    print(Speed)
    if cv2.waitKey(25) & 0xFF == ord('q'):
        cv2.destroyAllWindows()
        break
The problem is that no matter what config I set (tried --psm 1 through --psm 13), tesseract is unable to guess the number correctly.
What's the problem here?
Try adding a little bit of empty area (padding) around the text. The code below is for the smaller image.
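rows, cols = img.shape[:2] # img is the small image loaded as a NumPy array
M = np.float32([[1,0,25],[0,1,25]]) # translation: shift the content 25 px right and 25 px down
img = cv2.warpAffine(img,M,(cols*2,rows*2),borderValue=(127,127,127)) # larger canvas filled with grey
custom_oem_psm_config = r'--oem 3 --psm 3 -c tessedit_char_whitelist=1234567890'
print(pytesseract.image_to_string(img,config=custom_oem_psm_config))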
This should work, but try passing a binarized image instead; tesseract works best with binarized images. You need to preprocess the image yourself before passing it to tesseract: the --psm modes only control page segmentation, they do not process the image.
Please correct me if I am wrong.
I am using the following code to draw a rectangle on image text that matches a date pattern, and it's working fine.
import re
import cv2
import pytesseract
from PIL import Image
from pytesseract import Output
img = cv2.imread('invoice-sample.jpg')
d = pytesseract.image_to_data(img, output_type=Output.DICT)
keys = list(d.keys())
date_pattern = r'^(0[1-9]|[12][0-9]|3[01])/(0[1-9]|1[012])/(19|20)\d\d$'
n_boxes = len(d['text'])
for i in range(n_boxes):
    if float(d['conf'][i]) > 60: # float() also handles non-integer confidence values
        if re.match(date_pattern, d['text'][i]):
            (x, y, w, h) = (d['left'][i], d['top'][i], d['width'][i], d['height'][i])
            img = cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imshow('img', img)
cv2.waitKey(0)
# img is a NumPy array (BGR), so convert it to a PIL image before saving as PDF
Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB)).save("sample.pdf")
Now, at the end I am getting a PDF with a rectangle on each matched date pattern.
I want to give this program a scanned PDF as input instead of the image above.
It should first convert PDF into image format readable by opencv for same processing as above.
Please help.
(Any workaround is fine. I need a solution in which I can convert PDF to image and use it directly instead of saving on disk and read them again from there. As I have lot of PDFs to process.)
There is a library named pdf2image. You can install it with pip install pdf2image. Then, you can use the following to convert pages of the pdf to images of the required format:
from pdf2image import convert_from_path
pages = convert_from_path("pdf_file_to_convert")
for i, page in enumerate(pages):
    page.save(f"page_image_{i}.jpg", "JPEG") # one file per page; Pillow's format name is "JPEG", not "jpg"
Now you can use these images to apply opencv functions.
You can use BytesIO to do your work without saving the file:
from io import BytesIO
from PIL import Image
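with BytesIO() as f:
    page.save(f, format="JPEG") # Pillow's format name is "JPEG", not "jpg"
    f.seek(0)
    img_page = Image.open(f)
    img_page.load() # Image.open is lazy; read the pixels before the buffer is closed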
From PDF to opencv ready array in two lines of code. I have also added the code to resize and view the opencv image. No saving to disk.
# imports
from pdf2image import convert_from_path
import cv2
import numpy as np
# convert PDF to image then to array ready for opencv
pages = convert_from_path('sample.pdf')
img = cv2.cvtColor(np.array(pages[0]), cv2.COLOR_RGB2BGR) # PIL pages are RGB, OpenCV expects BGR
# opencv code to view image
img = cv2.resize(img, None, fx=0.5, fy=0.5)
cv2.imshow("img", img)
cv2.waitKey(0)
cv2.destroyAllWindows()
Remember, if you do not have poppler in your Windows PATH variable, you can provide its path to convert_from_path:
poppler_path = r'C:\path_to_poppler'
pages = convert_from_path('sample.pdf', poppler_path=poppler_path)
You can use the library pdf2image. Install with this command: pip install pdf2image. You can then convert the file into one or multiple images readable by cv2. The next sample of code will convert the PIL Image into something readable by cv2:
Note: The following code requires numpy (pip install numpy).
from pdf2image import convert_from_path
import numpy as np
images_of_pdf = convert_from_path('source2.pdf') # Convert PDF to List of PIL Images
readable_images_of_pdf = [] # Create a list for the for loop to put the images into
for PIL_Image in images_of_pdf:
    readable_images_of_pdf.append(np.array(PIL_Image)) # Add items to list
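One thing to watch for: the arrays you get this way are in RGB channel order, while cv2 functions such as imshow and imwrite treat arrays as BGR, so colours can come out swapped. A minimal sketch of a converter, assuming 3-channel RGB pages (the name to_bgr is made up for this example):

```python
import numpy as np


def to_bgr(page):
    """Convert an RGB PIL image (or RGB array) into a BGR array for OpenCV."""
    rgb = np.asarray(page)
    # Reverse the channel axis, then copy into a contiguous array for cv2.
    return np.ascontiguousarray(rgb[:, :, ::-1])
```

In the loop above you could append to_bgr(PIL_Image) instead of np.array(PIL_Image). For black-and-white scans the difference is not visible, but coloured pages will look wrong without it.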
The next bit of code can convert the pdf into one big image readable by cv2:
import cv2
import numpy as np
from pdf2image import convert_from_path
image_of_pdf = np.concatenate(tuple(convert_from_path('/path/to/pdf/source.pdf')), axis=0)
The pdf2image library's convert_from_path() function returns a list containing each pdf page in the PIL image format. We convert the list into a tuple for the numpy concatenate function to stack the images on top of each other (axis=0 joins along the rows, i.e. vertically). If you want them side by side you could change the axis integer to 1, which concatenates the images along the horizontal axis (the width). This next bit of code will show the image on the screen:
cv2.imshow("Image of PDF", image_of_pdf)
cv2.waitKey(0)
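To make the axis argument of np.concatenate concrete, here is a small sketch with two tiny stand-in arrays instead of real pages (the names page_a and page_b and the 4x3 size are made up for the example):

```python
import numpy as np

# Two stand-in "pages": height 4, width 3, 3 colour channels.
page_a = np.zeros((4, 3, 3), dtype=np.uint8)
page_b = np.full((4, 3, 3), 255, dtype=np.uint8)

stacked = np.concatenate((page_a, page_b), axis=0)       # one on top of the other
side_by_side = np.concatenate((page_a, page_b), axis=1)  # next to each other

print(stacked.shape)       # (8, 3, 3)
print(side_by_side.shape)  # (4, 6, 3)
```

Note that all pages must have the same width for axis=0 (or the same height for axis=1), otherwise np.concatenate raises a ValueError.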
This will probably create a window on the screen that is too big. To resize the image for the screen you'll use the following code that uses cv2's built-in resize function:
import cv2
from pdf2image import convert_from_path
import numpy as np
image_of_pdf = np.concatenate(tuple(convert_from_path('source2.pdf')), axis=0)
size = 0.15 # 0.15 is equal to 15% of the original size.
resized = cv2.resize(image_of_pdf, (int(image_of_pdf.shape[:2][1] * size), int(image_of_pdf.shape[:2][0] * size)))
cv2.imshow("Image of PDF", resized)
cv2.waitKey(0)
On a 1920x1080 monitor, a size of 0.15 can comfortably display a 3-page document. The downside is that the quality is reduced dramatically. If you want to have the pages separated you can just use the original convert_from_path() function. The following code shows each page individually, to go to the next page press any key:
import cv2
from pdf2image import convert_from_path
import numpy
images_of_pdf = convert_from_path('source2.pdf') # Convert PDF to List of PIL Images
count = 0 # Start counting which page we're on
while True:
    cv2.imshow(f"Image of PDF Page {count + 1}", numpy.array(images_of_pdf[count])) # Display the page with its number
    cv2.waitKey(0) # Wait until key is pressed
    cv2.destroyWindow(f"Image of PDF Page {count + 1}") # Destroy the current window
    count += 1 # Add to the counter by 1
    if count == len(images_of_pdf):
        break # Break out of the while loop before you get an "IndexError: list index out of range"
I am trying to do something very simple: subtract a bg image from a video for object tracking. I understood that images can simply be subtracted from one another as follows: img3 = img2 - img1. However, even when I start simple with one image, add a black line to it and store it as img2, img3 does not show just the line. When I run the following code
import cv2
img1 = cv2.imread("img1.png")
img2 = cv2.imread("img2.png")
img3 = img2 - img1
cv2.imwrite("img3.png",img3)
with img1 and img2 below:
I get the image on the left below, instead of the image on the right:
I want to use this method for background extraction in a video, e.g. where I have a bg image file that shows an empty scene and a video that shows the same scene with objects sometimes moving in and out of the screen. I use the following code but similarly get a B/W image instead of just the object visible without the scene.
import cv2
import numpy as np
from PIL import Image
capture = cv2.VideoCapture("video.mov")
while True:
    f, frame = capture.read()
    frame = cv2.GaussianBlur(frame,(15,15),0)
    frame = frame - bg
    cv2.imshow("window", frame)
ps: I know about automatic background subtraction but I have very good background files and very clear empty scenes with very obvious objects so thought this should easily work!
Update: I have just found out about the PIL ImageChops difference function that works for getting what I want with two images but seems not possible to use with a video opened with opencv. Also would it be possible to do ImageChops.difference(img1,img2) manually with numpy arrays?
The closest you can get to the expected result is with this code:
img3 = 255 - cv2.absdiff(img1,img2)
This code will give you this:
Note that using only cv2.absdiff(img1,img2) will give the opposite of this result, because this operation tells you the difference between the 2 images: if there is no difference at some position, the result (in this position) is 0.
To achieve a "perfect result" (exactly what you expect) you need to apply some thresholding (or some other kind of filter which will erase the left part of the image).
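On the question of doing the difference manually with numpy arrays: plain img2 - img1 on uint8 arrays wraps around instead of going negative, which is probably where the odd result comes from. A minimal sketch of an absolute difference plus a simple threshold follows. The function names and the threshold of 30 are my own placeholders, to be tuned for your footage:

```python
import numpy as np


def abs_difference(img1, img2):
    """Per-pixel absolute difference of two uint8 arrays of the same shape."""
    # Widen to int16 first: uint8 subtraction wraps around (10 - 20 gives 246).
    diff = np.abs(img1.astype(np.int16) - img2.astype(np.int16))
    return diff.astype(np.uint8)


def foreground_mask(frame, background, threshold=30):
    """Return a mask that is 255 where the frame differs from the background."""
    diff = abs_difference(frame, background)
    if diff.ndim == 3:
        # Use the largest difference over the colour channels.
        diff = diff.max(axis=2)
    return np.where(diff > threshold, 255, 0).astype(np.uint8)
```

Since frames read with cv2.VideoCapture are already numpy arrays, the same functions can be applied to each frame against the bg image, as long as both have the same size and channel count.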