I am doing a project in python.I want to detect rectangle shape object by opening the webcam using python.I had tried it but I didn't get accurate Answer.I show the object in front of webcam If any finger touched our object it doesn't recognize our object.please anyone can help me.Thanks in Advance:)
Here is my code:
py:
import math
import numpy as np
import cv2
#dictionary of all contours
contours = {}
#array of edges of polygon
approx = []
#scale of the text
scale = 2
#camera
cap = cv2.VideoCapture(0)
print("press q to exit")
# Define the codec and create VideoWriter object
fourcc = cv2.VideoWriter_fourcc(*'XVID')
out = cv2.VideoWriter('output.avi',fourcc, 20.0, (640,480))
#calculate angle
def angle(pt1,pt2,pt0):
dx1 = pt1[0][0] - pt0[0][0]
dy1 = pt1[0][1] - pt0[0][1]
dx2 = pt2[0][0] - pt0[0][0]
dy2 = pt2[0][1] - pt0[0][1]
return float((dx1*dx2 + dy1*dy2))/math.sqrt(float((dx1*dx1 + dy1*dy1))*(dx2*dx2 + dy2*dy2) + 1e-10)
while(cap.isOpened()):
#Capture frame-by-frame
ret, frame = cap.read()
if ret==True:
#grayscale
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
#Canny
canny = cv2.Canny(frame,80,240,3)
#contours
canny2, contours, hierarchy = cv2.findContours(canny,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
for i in range(0,len(contours)):
#approximate the contour with accuracy proportional to
#the contour perimeter
approx = cv2.approxPolyDP(contours[i],cv2.arcLength(contours[i],True)*0.02,True)
#Skip small or non-convex objects
if(abs(cv2.contourArea(contours[i]))<100 or not(cv2.isContourConvex(approx))):
continue
x,y,w,h = cv2.boundingRect(contours[i])
vtc = len(approx)
if(vtc==4):
cv2.putText(frame,'RECTANGLE',(x,y),cv2.FONT_HERSHEY_SIMPLEX,scale,(255,255,255),2,cv2.LINE_AA)
#Display the resulting frame
out.write(frame)
cv2.imshow('frame',frame)
cv2.imshow('canny',canny)
if cv2.waitKey(1) == 1048689: #if q is pressed
break
#When everything done, release the capture
cap.release()
cv2.destroyAllWindows()
Here is my output:
Non Working output
Working output
You currently have this part in your code:
#Skip small or non-convex objects
if(abs(cv2.contourArea(contours[i]))<100 or not(cv2.isContourConvex(approx))):
continue
x,y,w,h = cv2.boundingRect(contours[i])
vtc = len(approx)
if(vtc==4):
cv2.putText(frame,'RECTANGLE',(x,y),cv2.FONT_HERSHEY_SIMPLEX,scale,(255,255,255),2,cv2.LINE_AA)
here it is possible to create a rectangle around the contour and compare the areas. For that it is possible to use boundingRect, however, your phone can be slightly angled, so minAreaRect is better for this. It will return ((x,y), (w,h), angle) you care is the (w,h) part since the area is w*h. You already know hot to get the actual area of the contour, since it is in your code.
At the end the code should look like this:
#Skip small or non-convex objects
if(abs(cv2.contourArea(contours[i]))<100 or not(cv2.isContourConvex(approx))):
continue
x,y,w,h = cv2.boundingRect(contours[i])
vtc = len(approx)
rect = cv2.minAreaRect(contours[i])
rectArea = rect[1][0] * rect[1][1]
contourArea = cv2.contourArea(contours[i])
# now it will check if the difference is less than 10% of the rect area
if vtc==4 or abs(rectArea - contourArea) < rectArea * .10:
cv2.putText(frame,'RECTANGLE',(x,y),cv2.FONT_HERSHEY_SIMPLEX,scale,(255,255,255),2,cv2.LINE_AA)
Probably this will work, but you may need to adjust the threshold (I used 10% of rectArea). Even the 4 point check can be omitted, If it is a rectangle it will have a perfect fit (rectarea-contourarea) = 0.
I hope this helps, however this is a simple way. More possible answers are also valid for this. You can even think it with machine learning algorithms, or rectangle fitting algorithms.
Related
I'm currently doing some research to detect and locate a text-cursor (you know, the blinking rectangle shape that indicates the character position when you type on your computer) from a screen-record video. To do that, I've trained YOLOv4 model with custom object dataset (I took a reference from here) and planning to also implement DeepSORT to track the moving cursor.
Here's the example of training data I used to train YOLOv4:
Here's what I want to achieve:
Do you think using YOLOv4 + DeepSORT is considered overkill for this task? I'm asking because as of now, only 70%-80% of the video frame that contains the text-cursor can be successfully detected by the model. If it is overkill after all, do you know any other method that can be implemented for this task?
Anyway, I'm planning to detect the text-cursor not only from Visual Studio Code window, but also from Browser (e.g., Google Chrome) and Text Processor (e.g., Microsoft Word) as well. Something like this:
I'm considering the Sliding Window method as an alternative, but from what I've read, the method might consume much resources and perform slower. I'm also considering Template Matching from OpenCV (like this), but I don't think it will perform better and faster than the YOLOv4.
The constraint is about the performance speed (i.e, how many frames can be processed given amount of time) and the detection accuracy (i.e, I want to avoid letter 'l' or '1' detected as the text-cursor, since those characters are similar in some font). But higher accuracy with slower FPS is acceptable I think.
I'm currently using Python, Tensorflow, and OpenCV for this.
Thank you very much!
This would work if the cursor is the only moving object on the screen. Here is the before and after:
Before:
After:
The code:
import cv2
import numpy as np
BOX_WIDTH = 10
BOX_HEIGHT = 20
def process_img(img):
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
kernel = np.ones((5, 5))
img_canny = cv2.Canny(img_gray, 50, 50)
return img_canny
def get_contour(img):
contours, hierarchies = cv2.findContours(img, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
if contours:
return max(contours, key=cv2.contourArea)
def get_line_tip(cnt1, cnt2):
x1, y1, w1, h1 = cv2.boundingRect(cnt1)
if h1 > BOX_HEIGHT / 2:
if np.any(cnt2):
x2, y2, w2, h2 = cv2.boundingRect(cnt2)
if x1 < x2:
return x1, y1
return x1 + w1, y1
def get_rect(x, y):
half_width = BOX_WIDTH // 2
lift_height = BOX_HEIGHT // 6
return (x - half_width, y - lift_height), (x + half_width, y + BOX_HEIGHT - lift_height)
cap = cv2.VideoCapture("screen_record.mkv")
success, img_past = cap.read()
cnt_past = np.array([])
line_tip_past = 0, 0
while True:
success, img_live = cap.read()
if not success:
break
img_live_processed = process_img(img_live)
img_past_processed = process_img(img_past)
img_diff = cv2.bitwise_xor(img_live_processed, img_past_processed)
cnt = get_contour(img_diff)
line_tip = get_line_tip(cnt, cnt_past)
if line_tip:
cnt_past = cnt
line_tip_past = line_tip
else:
line_tip = line_tip_past
rect = get_rect(*line_tip)
img_past = img_live.copy()
cv2.rectangle(img_live, *rect, (0, 0, 255), 2)
cv2.imshow("Cursor", img_live)
if cv2.waitKey(1) & 0xFF == ord("q"):
break
cv2.destroyAllWindows()
Breaking it down:
Import the necessary libraries:
import cv2
import numpy as np
Define the size of the tracking box depending on the size of the cursor:
BOX_WIDTH = 10
BOX_HEIGHT = 20
Define a function to process the frames into edges:
def process_img(img):
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
kernel = np.ones((5, 5))
img_canny = cv2.Canny(img_gray, 50, 50)
return img_canny
Define a function that would retrieve the contour with the greatest area in an image (the cursor doesn't need to be large for this to work, it can be tiny if needed):
def get_contour(img):
contours, hierarchies = cv2.findContours(img, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
if contours:
return max(contours, key=cv2.contourArea)
Define a function that will take in 2 contours, one being the contour of the cursor + some text for the current frame, the other being the contour + some stray text for the contour of the cursor + some text from the frame before. With the two contours, we can identify if the cursor is moving left or right:
def get_line_tip(cnt1, cnt2):
x1, y1, w1, h1 = cv2.boundingRect(cnt1)
if h1 > BOX_HEIGHT / 2:
if np.any(cnt2):
x2, y2, w2, h2 = cv2.boundingRect(cnt2)
if x1 < x2:
return x1, y1
return x1 + w1, y1
Define a function that will take in the tip points of the cursor, and return a box based on the BOX_WIDTH and BOX_HEIGHT constants defined earlier:
def get_rect(x, y):
half_width = BOX_WIDTH // 2
lift_height = BOX_HEIGHT // 6
return (x - half_width, y - lift_height), (x + half_width, y + BOX_HEIGHT - lift_height)
Define a capture devices for the video, and remove one frame from the start of the video and store it in a variable that will be used as the frame before every frame. Also define temporary values for the past contour and past line tip:
cap = cv2.VideoCapture("screen_record.mkv")
success, img_past = cap.read()
cnt_past = np.array([])
line_tip_past = 0, 0
Use a while loop, and read from the video. Process the frame and the frame before that frame in the video:
while True:
success, img_live = cap.read()
if not success:
break
img_live_processed = process_img(img_live)
img_past_processed = process_img(img_past)
With the processed frames, we can find the difference between the frame using the cv2.bitwise_xor method to get where the movement is on the screen. Then, we can find the contour of the movement between the 2 frames using the get_contour function defined:
img_diff = cv2.bitwise_xor(img_live_processed, img_past_processed)
cnt = get_contour(img_diff)
With the contour, we can utilize the get_line_tip function defined to find the tip of the cursor. If a tip was found, save it into the line_tip_past variable to use for the next iteration, and if a tip was not found, we can us the past tip we saved as the current tip:
line_tip = get_line_tip(cnt, cnt_past)
if line_tip:
cnt_past = cnt
line_tip_past = line_tip
else:
line_tip = line_tip_past
Now we define a rectangle using the cursor tip and the get_rect function we defined earlier, and draw it onto the current frame. But before drawing it on, we save the frame to be the frame before the current frame of the next iteration:
rect = get_rect(*line_tip)
img_past = img_live.copy()
cv2.rectangle(img_live, *rect, (0, 0, 255), 2)
Finally, we display the frame:
cv2.imshow("Cursor", img_live)
if cv2.waitKey(1) & 0xFF == ord("q"):
break
cv2.destroyAllWindows()
Hello there people of the internet,
The code in question is using python 3.8.5 and opencv 4 (I do not know how to check the exact version but I know its opencv 4). My team and I are attempting to take a live video feed from a usb webcam and determine the distance between the camera and the object in the video feed. We had some success in reading the distance with image stills taken from the same camera and read via the imutils library. But now we want to attempt to calculate that data live.
Our code is below.
from imutils import paths
import numpy as np
import imutils
import cv2
import time
import os
def find_marker(image):
#conver the image into grayscales, blurs it then detects edges
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (5, 5), 0)
edged = cv2.Canny(gray, 35, 125)
#find the contours in the edged image and keep the largest one;
#w'll assume that this our piece of paper in the image
cnts = cv2.findContours(edged.copy(), cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
cnts = imutils.grab_contours(cnts)
c = max(cnts, key = cv2.contourArea)
#compute the bounding box of the paper region and return it
return cv2.minAreaRect(c)
def distance_to_camera(knownWidth, focalLength, perWidth):
#compute and return the distance from the marker to the camera
return (knownWidth * focalLength) / perWidth
#intialize the known distances from the camera to the object
KNOWN_DISTANCE = 22
#initialize the known object width, which in this case the piece of paper is 12 inches
KNOWN_WIDTH = 11
#load the first image that contains an object that is known to be 2 feet
#from our camera, the find the paper marker in the image and
#initialize the focal length
rootimage = cv2.imread("/Volumes/404/final_rov_code/Python/images/2ft.jpg")
marker1 = find_marker(rootimage)
marker2 = marker1[0][1] - marker1[1][1]
focalLength = (marker2 * KNOWN_DISTANCE) / KNOWN_WIDTH
print(marker1)
print(marker2)
image = cv2.VideoCapture(0)
#Loop over the image
while True:
#load the image, find the marker in the image then compute the
#distance to the marker from the camera
frame, ret = image.read()
marker = find_marker(ret)
inches = distance_to_camera(KNOWN_WIDTH, focalLength, marker[1][0])
print(inches)
#draw a bounding box around the image and display it
box = cv2.cv.BoxPoints(marker) if imutils.is_cv2() else cv2.boxPoints(marker)
box = np.int0(box)
cv2.drawContours(frame, [box], -1, (0, 255, 0), 2)
cv2.putText(ret, "%.2fin" % inches,
(ret.shape[1] - 200, ret.shape[0] - 20), cv2.FONT_HERSHEY_SIMPLEX,
2.0, (0, 255, 0), 3)
cv2.imshow("image", ret)
# if cv2.waitKey(33) == ord('q'):
# os.system('pause')
I understand that it should be as minimalistic as possible but since we have no idea what could be causing the program to hang upon reading the first frame of the video feed. Could it be the fact that the processing is taking too many resources from the single thread? (We're all newbies to the advanced sides of opencv and python 3)
There is no other errors that we are aware of at the moment so no leads in the terminal of where it could be coming from.
Thank you in advance.
Your problem is likely a result of not including the waitkey() statement at the end of your while loop. It takes time for openCV to load the image, so if the program doesn't pause for long enough for the image to be drawn, the display just doesn't update. Check out this other StackOverflow question for more details.
In addition, you have your ret and frame variables mixed up. ret should be the first one and frame should be the second. Right now, the drawContours() method isn't going to do anything because you're passing it a boolean instead of an image.
Making those changes fixed this for me using Python 3.9 and OpenCV 4.5.
I am working on a project for camera tampering. I have the code for tamper detection for a static camera. Tampering means blocking or defocussing the camera. I want to modify it for a moving camera or rotating camera so that it takes all the frames from the background and then compare it with new frames using the same background subtractor method.
I have tried but unable to figure out how to use the list of frames to compare the captured frame from other ones.
import numpy as np
import cv2
from playsound import playsound
import time
#cap = cv2.VideoCapture('http://192.168.43.1:8080/video') #Opening of IP camera just enter the ip address
cap = cv2.VideoCapture(0)
fgbg = cv2.createBackgroundSubtractorMOG2() #generating a foreground mask
frame_list = [] #creating the list of frame
##################################################################
while(True):
ret, frame = cap.read()
if ret == True:
if frame not in frame_list:
frame_list.append(frame) #This list contain all the frames
###################################################################
#ret, frame = cap.read() #to get the initial frame
#fgmask = fgbg.apply(frame) #to save the initial frame
kernel = np.ones((5,5), np.uint8) #creating a matrix of (5, 5) consisting of 1
while(True):
ret, frame = cap.read() #reading all the frames
if ret == True:
a = 0
bounding_rect = [] # An empty list where will furthur input the contours
fgmask = fgbg.apply(frame) #Applying the changes of the backgroud to the foreground mask
fgmask= cv2.erode(fgmask, kernel, iterations=5)
fgmask = cv2.dilate(fgmask, kernel, iterations = 5) #Erosion and Dilation is done to detect even the blur objects better
cv2.imshow('frame',frame) #Showing the frame.
contours,_ = cv2.findContours(fgmask,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE) #The mode RETR_TREE together with the CHAIN APPROX_SIMPLE returns only the endpoints required to draw contour
for i in range(0,len(contours)):
bounding_rect.append(cv2.boundingRect(contours[i])) #cv2.bounding rectangle gives the coordinates of bounding rectangle and then we will input these coordinates to the list we made
for i in range(0,len(contours)):
if bounding_rect[i][2] >=40 or bounding_rect[i][3] >= 40: #setting the threshold for the width and height if the contour
a = a+(bounding_rect[i][2])*bounding_rect[i][3] #updating the area of contour
if(a >=int(frame.shape[0])*int(frame.shape[1])/3): #It is basically the comparison of the change in area of the background, so if there is a certain change in area it will detect the tampering
cv2.putText(frame,"TAMPERING DETECTED",(5,30),cv2.FONT_HERSHEY_SIMPLEX,1,(0,255,255),2)
#playsound('warning.mp3') #put the address of the warning tune in the playsound object and uncomment it
cv2.imshow('frame',frame) #showing the final frame
if cv2.waitKey(30) & 0xff== ord('q'): #To close the camera press q
break
else : break
cap.release()
cv2.destroyAllWindows()
In my opinion, I don't think background subtraction is a good solution to your problem, your methods is mainly used to detect moving object by subtracting background. This method most often used in static camera but for moving camera, intensity variation or texture change may also need to take account.
So i would like to make a program which can detect an object by color, position and sharpness.
Now I am there that I could detect the object by color and draw its contour and bounding box.
My problem is that i really dont know how to cut out the object from the picture and save it as picture file when the program recognise its contour or bounding box.
here's a picture of what my camera is seeing
input
output
I would like to cut out what is inside of the green colored boundig box as many times as fps in the video and as long as you can see it in the video. So if the video is 30 fps and the object is visible for 10 seconds it needs to take 300 pictures.
Here is the code:
i know it looks bad, im just trying to figure out what to use to make it work
import cv2 as cv
import numpy as np
import os
import uuid
cap = cv.VideoCapture(1)
font = cv.FONT_HERSHEY_COMPLEX
path = os.getcwd()
print(path)
def createFolder(directory):
try:
if not os.path.exists(directory):
os.makedirs(directory)
except OSError:
print('Error: Creating directory. ' + directory)
createFolder("./data")
# folderName = '%s' % (str(uuid.uuid4()))
while cap.isOpened():
_, frame = cap.read()
hsv = cv.cvtColor(frame, cv.COLOR_BGR2HSV)
# blue is the chosen one for now
lower_color = np.array([82, 33, 39])
upper_color = np.array([135, 206, 194])
mask = cv.inRange(hsv, lower_color, upper_color)
kernel = np.ones((5, 5), np.uint8)
mask = cv.erode(mask, kernel)
contours, hierarchy = cv.findContours(mask, cv.RETR_EXTERNAL, cv.CHAIN_APPROX_SIMPLE)
# find contour
for contour in contours:
area = cv.contourArea(contour)
x, y, h, w = cv.boundingRect(contour)
if area > 100:
# bounding box
# cv.rectangle(frame, (x - 40, y - 30), (x + h * 3, y + w * 3), (0, 255, 0), 1)
# cutting and saving
ext_left = tuple(contour[contour[:, :, 0].argmin()][0] - 20)
ext_right = tuple(contour[contour[:, :, 0].argmax()][0] + 20)
ext_top = tuple(contour[contour[:, :, 1].argmin()][0] - 20)
ext_bot = tuple(contour[contour[:, :, 1].argmax()][0] + 20)
outfile = '%s.jpg' % (str(uuid.uuid4()))
cropped_image = frame[ext_top[1]:ext_bot[1], ext_left[0]:ext_right[0]]
# write images to a specified folder
cv.imwrite(os.path.join(path, "/data/", outfile), cropped_image)
# outputs
cv.imshow("Frame", frame)
cv.imshow("Mask", mask)
key = cv.waitKey(1)
if key == 27:
break
cap.release()
cv.destroyAllWindows()
Focusing on the question and ignoring the code style, I can say you are close to achieving your goal :)
For cropping the object, you can use the Mat copyTo method. Here is the official OpenCV documentation and here is an example from the OpenCV forums.
Now, for creating the mask from the contours, you can use the same drawCountours method you already use, but provide a negative value for the thickness parameters (for example, thickness=CV_FILLED). You can see a code snippet in this stackoverflow post and check details in the official documentation.
For saving the image to disk you can use imwrite.
So, in a nutshell, draw filled contours to a mask and use that mask to copy only the object pixels from the video frame to another mat that you can save the disk.
Instead of posting code, I will share this very similar question with an accepted answer that may have the code snippet you are looking for.
I am using a microscope to observe the motion of small 4 micron beads. I have a video of the beads moving and I would like to process the video to extract the bead locations as a function of time to get a mathematical model of their motion.
I am currently using opencv and programming in python
My code was importing a video from file, thresholding the image then applying a HoughCircles transform to find the spherical beads.
import numpy as np
import cv2
def nothing(x):
pass
cap = cv2.VideoCapture('testvideo.mp4')
cv2.namedWindow('trackbar')
cv2.createTrackbar('Param1','trackbar',40,255,nothing)
cv2.createTrackbar('Param2','trackbar',10,255,nothing)
cv2.createTrackbar('MaxRadius','trackbar',18,255,nothing)
while(cap.isOpened()):
e1 = cv2.getTickCount()
ret, frame = cap.read()
#get grayscale image for HoughCircles
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
p1 = cv2.getTrackbarPos('Param1','trackbar')
p2 = cv2.getTrackbarPos('Param2','trackbar')
rMax = cv2.getTrackbarPos('MaxRadius','trackbar')
#Threshold grayscale image
ret,th1 = cv2.threshold(gray,127,255,cv2.THRESH_BINARY)
#Find circles in image and store locations to circles list
circles = cv2.HoughCircles(th1, cv2.cv.CV_HOUGH_GRADIENT, 1, 10,
param1=p1, param2=p2,minRadius=0,
maxRadius=rMax)
#Hack for fixing the list if it is empty so program wont crash
if circles == None:
circles = [[[0,0,0.000],[0,0,0.000]]]
#convert circles list to integer list
circles = np.uint16(np.around(circles))
#store points to a file
datafile = file('datafile.txt', 'a')
np.savetxt(datafile, circles[0], fmt ='%i',delimiter=',', newline = ',')
datafile.write('\n')
datafile.close()
for i in circles[0,:]:
# draw the outer circle
cv2.circle(frame,(i[0],i[1]),i[2],(0,255,0),2)
# draw the center of the circle
cv2.circle(frame,(i[0],i[1]),2,(0,0,255),3)
cv2.imshow('detected circles',frame)
cv2.imshow('threshold video',th1)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
e2 = cv2.getTickCount()
time = (e2 - e1)/ cv2.getTickFrequency()
print time
cap.release()
cv2.destroyAllWindows()
Here is a sample frame from the video I am using for detecting the beads.
http://imgur.com/bVZbH3c
Here is an example of the single frame tracking that I did with the previous algorithm
http://imgur.com/4VWJI2F
I don't need to detect every single bead. Just an aggregate would be fine.
The beads are spherical and should look the same, so is there a library that I can use to integrate the bead image over the entire image and see where the points are most correlated in the image? Sometimes the beads are out of focus and that's why the current program I have keeps bouncing around and giving me false positives for the beads.
I eventually need this process to happen realtime so if possible it would be nice to have the algorithm be as efficient as possible.
If anyone knows a good approach to this type of problem it would be appreciated.