I'm working on a project that uses AprilTags and a computer vision system to detect them from a webcam. My current system prints the detection data to the terminal, but I would like to display this numerical/text data on top of the video window or in a separate window. I've already tried cv2.putText(), but the text it draws stays on the image and is not replaced on later frames the way I want. The code below tries to update a window in real time with the number of tags detected in the webcam video, but once it has written a 1, for example, I can't figure out how to erase that text and update it.
Is this even possible in OpenCV? Or is there another way?
while True:
    success, frame = cap.read()
    if not success:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_RGB2GRAY)
    detections, dimg = detector.detect(gray, return_image=True)
    print(detections)
    num_detections = len(detections)
    # print('Detected {} tags.\n'.format(num_detections))
    num_detections_string = str(num_detections)
    overlay = frame // 2 + dimg[:, :, None] // 2
    clear_text = ''
    text = checkNumDetections(num_detections, num_detections_string)
    cv2.putText(whiteBackground, clear_text, (100, 100), cv2.FONT_HERSHEY_PLAIN, 10, (0, 255, 0), 2)
    cv2.putText(whiteBackground, text, (100, 100), cv2.FONT_HERSHEY_PLAIN, 10, (0, 255, 0), 2)
    cv2.imshow(window, overlay)
    k = cv2.waitKey(1)
    cv2.imshow(dataWindow, whiteBackground)
    if k == 27:
        break
You need to re-initialise the 'whiteBackground' image on every loop iteration, before you draw anything on it.
I know this will work, but it will give you a black background:
whiteBackground = np.zeros((rows, columns, channels), dtype = "uint8")
This should give you a white background, but experiment and see:
whiteBackground = np.full((rows, columns, channels), 255, dtype = "uint8")
I usually work with OpenCV in C++, so I'm not 100% sure of the exact syntax, but that should work.
I am trying to write Python code that scans a barcode and retrieves its contents. A library called pyzbar already exists for this purpose. Using it together with OpenCV, I wrote the code attached below, which scans barcodes/QR codes and draws bounding boxes around them. The problem I'm facing is that when I use a pre-recorded video larger than 100 MB as input, the output video that is displayed/saved is very slow, whereas with a live stream there is no such issue. I tried several ways to reduce the fps, such as CAP_PROP_FPS, but nothing worked. I even tried multithreading, and it seemed to have no effect. Please help me out.
import cv2
import numpy as np
from pyzbar.pyzbar import decode
cv2.namedWindow("Result", cv2.WINDOW_NORMAL)
cap = cv2.VideoCapture('2016_0806_040333_0081.mp4')
cap.set(3, 1280)
cap.set(4, 720)
#frame_width = int(cap.get(3))
#frame_height = int(cap.get(4))
#cap.set(cv2.CAP_PROP_FPS, 0.1)
#size = (frame_width, frame_height)
#result = cv2.VideoWriter('processed_video.avi', cv2.VideoWriter_fourcc(*'MJPG'),0.1, size)
while(True):
    ret, frame = cap.read()
    for barcode in decode(frame):
        myData = barcode.data.decode('utf-8')
        pts = np.array([barcode.polygon],np.int32)
        pts = pts.reshape((-1,1,2))
        cv2.polylines(frame, [pts], True, (255,0,255),5)
        pts2 = barcode.rect
        akash = []
        akash.append(myData)
        cv2.putText(frame, myData, (pts2[0], pts2[1]), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (255, 0, 255), 2)
        """
        f=open('output.csv','a+')
        for ele in akash:
            f.write(ele+'\n')
        """
    result.write(frame)
    cv2.imshow('Result',frame)
    cv2.waitKey(1)
#video.release()
#result.release()
#cv2.destroyAllWindows()
print("The video was successfully saved")
You could change the frame rate of your project, but this won't add missing frames. Even if your editor can attempt to extrapolate the needed frames, it can result in glitches.
The best approach, if you have the option, is to shoot the video at a higher frame rate to begin with.
Hello there people of the internet,
The code in question uses Python 3.8.5 and OpenCV 4 (I do not know how to check the exact version, but I know it's OpenCV 4). My team and I are attempting to take a live video feed from a USB webcam and determine the distance between the camera and the object in the video feed. We had some success reading the distance from image stills taken with the same camera and processed with the imutils library. Now we want to calculate that data live.
Our code is below.
from imutils import paths
import numpy as np
import imutils
import cv2
import time
import os
def find_marker(image):
    #convert the image to grayscale, blur it, then detect edges
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (5, 5), 0)
    edged = cv2.Canny(gray, 35, 125)
    #find the contours in the edged image and keep the largest one;
    #we'll assume that this is our piece of paper in the image
    cnts = cv2.findContours(edged.copy(), cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
    cnts = imutils.grab_contours(cnts)
    c = max(cnts, key = cv2.contourArea)
    #compute the bounding box of the paper region and return it
    return cv2.minAreaRect(c)
def distance_to_camera(knownWidth, focalLength, perWidth):
    #compute and return the distance from the marker to the camera
    return (knownWidth * focalLength) / perWidth
#initialize the known distance from the camera to the object
KNOWN_DISTANCE = 22
#initialize the known object width, which in this case the piece of paper is 12 inches
KNOWN_WIDTH = 11
#load the first image that contains an object that is known to be 2 feet
#from our camera, then find the paper marker in the image and
#initialize the focal length
rootimage = cv2.imread("/Volumes/404/final_rov_code/Python/images/2ft.jpg")
marker1 = find_marker(rootimage)
marker2 = marker1[0][1] - marker1[1][1]
focalLength = (marker2 * KNOWN_DISTANCE) / KNOWN_WIDTH
print(marker1)
print(marker2)
image = cv2.VideoCapture(0)
#Loop over the image
while True:
    #load the image, find the marker in the image then compute the
    #distance to the marker from the camera
    frame, ret = image.read()
    marker = find_marker(ret)
    inches = distance_to_camera(KNOWN_WIDTH, focalLength, marker[1][0])
    print(inches)
    #draw a bounding box around the image and display it
    box = cv2.cv.BoxPoints(marker) if imutils.is_cv2() else cv2.boxPoints(marker)
    box = np.int0(box)
    cv2.drawContours(frame, [box], -1, (0, 255, 0), 2)
    cv2.putText(ret, "%.2fin" % inches,
        (ret.shape[1] - 200, ret.shape[0] - 20), cv2.FONT_HERSHEY_SIMPLEX,
        2.0, (0, 255, 0), 3)
    cv2.imshow("image", ret)
    # if cv2.waitKey(33) == ord('q'):
    #     os.system('pause')
I understand that the example should be as minimal as possible, but we have no idea what could be causing the program to hang upon reading the first frame of the video feed, so we have posted all of it. Could it be that the processing is taking too many resources from the single thread? (We're all newbies to the advanced sides of OpenCV and Python 3.)
There are no other errors that we are aware of at the moment, so there are no leads in the terminal as to where it could be coming from.
Thank you in advance.
Your problem is most likely the missing cv2.waitKey() call at the end of your while loop (it is commented out in your code). OpenCV only processes window events and repaints the image during waitKey(), so without it the display never updates. Check out this other StackOverflow question for more details.
In addition, you have your ret and frame variables mixed up. ret should be the first one and frame should be the second. Right now, the drawContours() call isn't going to do anything useful because you're passing it a boolean instead of an image.
Making those changes fixed this for me using Python 3.9 and OpenCV 4.5.
I'm trying to use pytesseract OCR to recognize characters on a part of my screen. It reports that it cannot detect any characters, even though there are some in the region it scans. I think the problem is that the area I am trying to capture has a gray background with black text. I think the gray background throws it off, so I would like to change that gray to white in the hope that the software then works. Here is my code:
imghi = ImageGrab.grab(bbox = (x, y, x + offx, y + offy))
imgglo = np.array(imghi)
gray = cv2.cvtColor(imgglo,cv2.COLOR_BGR2GRAY)
gray2 = cv2.fastNlMeansDenoising(gray, None, 9, 13)
th, threshed = cv2.threshold(gray2, 240, 255, cv2.THRESH_BINARY_INV)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (11,11))
morphed = cv2.morphologyEx(threshed, cv2.MORPH_CLOSE, kernel)
Image_Inversed = cv2.bitwise_not(gray)
thresh = 0
maxValue = 255
cv2.threshold(Image_Inversed,10,255,cv2.THRESH_BINARY_INV)
img_final = np.array(Image_Inversed)
cv2.imshow('Mats Viewer', img_final)
tess.pytesseract.tesseract_cmd = r'C:\Users\tyler\AppData\Local\Tesseract-OCR\tesseract.exe'
text = tess.image_to_string(img_final)
If you are having trouble understanding, please comment. Thank You!
In fact, your code already thresholds the image - it's in the line th, threshed = cv2.threshold(gray2, 240, 255, cv2.THRESH_BINARY_INV). With THRESH_BINARY_INV, every pixel with a value above 240 is set to 0 (black) and every other pixel is set to 255 (white), so the threshold value has to sit between the gray level of the text and that of the background for the two to end up different.
The problem is that in further processing you don't use the image you've processed (threshed), but stick with the gray image instead. The same applies to the other transformations you do - they are performed, but you assign the results to new variables which you don't use later on. It is important to understand that many operations (I'd risk saying it's generally the default for most Python libraries and cases) don't change the input in place - instead, they return a new object with some changes applied.
So, in short - on each line where you process the image in some way, you should use the output of the previous operation, not the initial variable.
I'm new to OpenCV, so apologies if this is a trivial question...
I'm writing an application that tracks the path of an object in real time. So far, I have successfully isolated the object and created a "trail" of its path using cv2.accumulateWeighted(). Everything looks great in the preview window, but when I save the merged frame to a file, things aren't so good.
The result varies, but typically the saved frame has much less detail than the displayed frame. I've converted the input to grayscale, and often the written file has very "dim" features.
I believe only the final frame is written (multiplied by the alpha blend), rather than the accumulated image. Any ideas would be greatly appreciated.
Sample program to demonstrate the issue:
import cv2
#---- read the next frame from the capture device
def read_frame(cap):
    ret, frame = cap.read()
    if ret is False or frame is None:
        return None
    gray_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    return gray_frame
#---- setup components
cap = cv2.VideoCapture(index=0)
background_subtractor = cv2.createBackgroundSubtractorMOG2(
    history=30, varThreshold=50, detectShadows=False
)
#---- prime the accumulator
frame = read_frame(cap)
merged_frame = frame.astype(float)
#---- capture some frames
while True:
    frame = read_frame(cap)
    mask = background_subtractor.apply(frame, learningRate=0.01)
    foreground = cv2.bitwise_and(frame, frame, mask=mask)
    cv2.accumulateWeighted(foreground, merged_frame, 0.1)
    cv2.imshow('Acccumulator', merged_frame)
    key = cv2.waitKey(1)
    # press 'q' to quit and save the current frame
    if key == ord('q') or key == ord('Q'):
        cv2.imwrite('merged.png', merged_frame)
        break
The following are images when moving my hand through the scene... You can see the path of my hand in the displayed image, along with some other background elements. In the saved image, only a very dim version of my hand in the final position is saved.
This is the displayed image (using screen capture):
This is the image written to disk (using imwrite()):
I guess you want to save merged_frame as it is shown by cv2.imshow.
You can limit the upper value of merged_frame to 1, scale by 255, and convert to uint8 before saving:
merged_frame = np.round(np.minimum(merged_frame, 1)*255).astype(np.uint8)
The type of merged_frame is float64.
When cv2.imshow is given a float image, all values above 1.0 are shown as white (and values below 0 as black).
A gray level in the range [0, 1] is equivalent to the range [0, 255] of the uint8 type (0.5 is like 128).
When using cv2.imwrite, the image is converted to uint8 without clamping to 1 and scaling by 255 (the values are simply cast to uint8). The result is usually very dark.
If you want to save the image as it is shown, you need to clamp the values to 1 and then scale by 255.
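The clamp-and-scale step can be tried on a few numbers without a camera. This is a small NumPy-only sketch; the helper name float_to_display_uint8 is made up for illustration.

```python
import numpy as np

def float_to_display_uint8(image):
    """Map a float image to uint8 the way it is displayed: [0, 1] -> [0, 255]."""
    clamped = np.clip(image, 0.0, 1.0)   # values above 1.0 are shown as white
    return np.round(clamped * 255).astype(np.uint8)

sample = np.array([[0.0, 0.5, 1.0, 37.2]])
print(float_to_display_uint8(sample))    # [[  0 128 255 255]]
```

The last value shows why the preview looks saturated: anything accumulated above 1.0 ends up as full white after the conversion.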
You didn't post input samples, so I created synthetic input:
import numpy as np
import cv2
background_subtractor = cv2.createBackgroundSubtractorMOG2(
    history=30, varThreshold=50, detectShadows=False
)
width, height = 640, 480
frame = np.full((height, width), 60, np.uint8)
merged_frame = frame.astype(float)
for n in range(100):
    img = np.full((height, width, 3), 60, np.uint8)
    cv2.putText(img, str(n), (width//2-100*len(str(n)), height//2+100), cv2.FONT_HERSHEY_DUPLEX, 10, (30, 255, 30), 20) # Green number
    #frame = read_frame(cap)
    frame = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    mask = background_subtractor.apply(frame, learningRate=0.01)
    foreground = cv2.bitwise_and(frame, frame, mask=mask)
    cv2.accumulateWeighted(foreground, merged_frame, 0.1)
    cv2.imshow('Acccumulator', merged_frame)
    cv2.waitKey(10)
#merged_frame = cv2.normalize(merged_frame, merged_frame, 0, 255.0, cv2.NORM_MINMAX).astype(np.uint8) # Alternative approach - normalize between 0 and 255
merged_frame = np.round(np.minimum(merged_frame, 1)*255).astype(np.uint8)
cv2.imshow('merged_frame as uint8', merged_frame)
cv2.imwrite('merged.png', merged_frame)
cv2.waitKey(0)
cv2.destroyAllWindows()
PNG image using imwrite, without clamping and scaling:
PNG image using imwrite, with clamping and scaling:
A better way to show the image is to normalize the values to the range [0, 1] before displaying it.
Example:
In the loop, after cv2.accumulateWeighted(foreground, merged_frame, 0.1):
norm_acccumulator = merged_frame.copy()
cv2.normalize(norm_acccumulator, norm_acccumulator, 0, 1.0, cv2.NORM_MINMAX)
cv2.imshow('Acccumulator', norm_acccumulator)
So I would like to make a program which can detect an object by color, position and sharpness.
Right now I can detect the object by color and draw its contour and bounding box.
My problem is that I really don't know how to cut the object out of the picture and save it as a picture file when the program recognises its contour or bounding box.
Here's a picture of what my camera is seeing:
input
output
I would like to cut out what is inside the green bounding box once per frame, for as long as the object is visible in the video. So if the video is 30 fps and the object is visible for 10 seconds, it needs to take 300 pictures.
Here is the code:
I know it looks bad, I'm just trying to figure out what to use to make it work.
import cv2 as cv
import numpy as np
import os
import uuid
cap = cv.VideoCapture(1)
font = cv.FONT_HERSHEY_COMPLEX
path = os.getcwd()
print(path)
def createFolder(directory):
    try:
        if not os.path.exists(directory):
            os.makedirs(directory)
    except OSError:
        print('Error: Creating directory. ' + directory)
createFolder("./data")
# folderName = '%s' % (str(uuid.uuid4()))
while cap.isOpened():
    _, frame = cap.read()
    hsv = cv.cvtColor(frame, cv.COLOR_BGR2HSV)
    # blue is the chosen one for now
    lower_color = np.array([82, 33, 39])
    upper_color = np.array([135, 206, 194])
    mask = cv.inRange(hsv, lower_color, upper_color)
    kernel = np.ones((5, 5), np.uint8)
    mask = cv.erode(mask, kernel)
    contours, hierarchy = cv.findContours(mask, cv.RETR_EXTERNAL, cv.CHAIN_APPROX_SIMPLE)
    # find contour
    for contour in contours:
        area = cv.contourArea(contour)
        x, y, h, w = cv.boundingRect(contour)
        if area > 100:
            # bounding box
            # cv.rectangle(frame, (x - 40, y - 30), (x + h * 3, y + w * 3), (0, 255, 0), 1)
            # cutting and saving
            ext_left = tuple(contour[contour[:, :, 0].argmin()][0] - 20)
            ext_right = tuple(contour[contour[:, :, 0].argmax()][0] + 20)
            ext_top = tuple(contour[contour[:, :, 1].argmin()][0] - 20)
            ext_bot = tuple(contour[contour[:, :, 1].argmax()][0] + 20)
            outfile = '%s.jpg' % (str(uuid.uuid4()))
            cropped_image = frame[ext_top[1]:ext_bot[1], ext_left[0]:ext_right[0]]
            # write images to a specified folder
            cv.imwrite(os.path.join(path, "/data/", outfile), cropped_image)
    # outputs
    cv.imshow("Frame", frame)
    cv.imshow("Mask", mask)
    key = cv.waitKey(1)
    if key == 27:
        break
cap.release()
cv.destroyAllWindows()
Focusing on the question and ignoring the code style, I can say you are close to achieving your goal :)
For cropping the object, you can use the Mat copyTo method. Here is the official OpenCV documentation and here is an example from the OpenCV forums.
Now, for creating the mask from the contours, you can use the drawContours method, but provide a negative value for the thickness parameter (for example, thickness=cv.FILLED). You can see a code snippet in this Stack Overflow post and check the details in the official documentation.
For saving the image to disk you can use imwrite.
So, in a nutshell, draw filled contours to a mask and use that mask to copy only the object pixels from the video frame to another Mat that you can save to disk.
Instead of posting code, I will share this very similar question with an accepted answer that may have the code snippet you are looking for.