I am trying to track objects with OpenCV in Python from recorded video. I want to give a unique object number to each visible object when it appears. For example, one object enters the screen and gets number 1, then a second joins it and gets number 2; the first object then leaves the screen, but the second is still visible and should keep number 2 rather than become number 1 just because it is now the only object. I can't find any information on how to do this online. Any help (including code) is appreciated.
The code I have written so far for getting the contours and drawing object numbers:
import cv2

cap = cv2.VideoCapture("video.mov")
while True:
    flag, frame = cap.read()
    if not flag:
        break
    # Contours are assumed to come from a simple threshold of the grayscale frame
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    _, thresh = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
    contours, hierarchy = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)  # two return values in OpenCV 4.x
    cv2.drawContours(frame, contours, -1, (255, 0, 0), 1)
    for i, cnt in enumerate(contours):
        cnt_nr = i + 1
        x, y, w, h = cv2.boundingRect(cnt)
        # putText needs integer coordinates; label the centre of the bounding box
        cv2.putText(frame, str(cnt_nr), (x + w // 2, y + h // 2),
                    cv2.FONT_HERSHEY_PLAIN, 1.8, (0, 0, 0))
    cv2.imshow("Tracked frame", frame)
    k = cv2.waitKey(0)
    if k == 27:  # Esc
        break
cap.release()
cv2.destroyAllWindows()
What kind of objects are you trying to track? If they are easy to tell apart, you can collect some features of each object and check whether an object with similar features appeared earlier. It's hard to say which features will work best in your situation, but you might try the following (a rough sketch of comparing such features comes after the list):
contour size, area and length (or a ratio such as area/length)
convex hull of the object and its length (same as above - you may try a ratio as well)
object colour (average colour) - if lighting can change, consider using only the H channel of the HSV colour space
something more complicated - the "sum" of edges inside the object (run an edge detector on the object and sum the resulting image)
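A rough sketch of that matching idea, assuming the contours and an HSV copy of the frame are already available; the feature choice and the distance threshold max_dist are assumptions you would have to tune (the raw features also have very different scales, so some normalisation is advisable):

import cv2
import numpy as np

def describe(cnt, frame_hsv):
    """Small feature vector for one contour: area, perimeter and mean hue."""
    area = cv2.contourArea(cnt)
    length = cv2.arcLength(cnt, True)
    mask = np.zeros(frame_hsv.shape[:2], dtype=np.uint8)
    cv2.drawContours(mask, [cnt], -1, 255, -1)        # filled contour as a mask
    mean_hue = cv2.mean(frame_hsv, mask=mask)[0]      # H channel only
    return np.array([area, length, mean_hue])

def match_or_register(cnt, frame_hsv, known, next_id, max_dist=50.0):
    """Reuse the id of the most similar previously seen object, else assign a new id."""
    feat = describe(cnt, frame_hsv)
    best_id, best_dist = None, max_dist
    for obj_id, obj_feat in known.items():
        dist = np.linalg.norm(feat - obj_feat)
        if dist < best_dist:
            best_id, best_dist = obj_id, dist
    if best_id is None:                               # nothing similar was seen before
        best_id, next_id = next_id, next_id + 1
    known[best_id] = feat                             # remember the latest features
    return best_id, next_id

# Usage per frame: start with known = {} and next_id = 1, then for every contour
#   frame_hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
#   cnt_nr, next_id = match_or_register(cnt, frame_hsv, known, next_id)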
Another solution is to use a more powerful tool designed for this task - an object tracker. In one of my projects I'm using the TLD tracker and it works fine; another option is the CMT tracker, which might suit you better because it's written in Python. Note that for tracking multiple objects you will need multiple tracker instances (or a tracker capable of following several different objects at once).
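If you go the tracker route with OpenCV itself, the usual pattern is one tracker instance per object. A minimal sketch using KCF as a stand-in (it needs the opencv-contrib-python package; TLD and CMT are separate projects and not shown here), with hypothetical initial bounding boxes:

import cv2

cap = cv2.VideoCapture("video.mov")
ok, frame = cap.read()

# Hypothetical initial boxes (x, y, w, h) for the objects to follow
initial_boxes = [(50, 60, 40, 40), (200, 80, 50, 50)]

trackers = []
for box in initial_boxes:
    tracker = cv2.TrackerKCF_create()      # needs opencv-contrib-python
    tracker.init(frame, box)
    trackers.append(tracker)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    for obj_nr, tracker in enumerate(trackers, start=1):
        found, box = tracker.update(frame)
        if found:
            x, y, w, h = [int(v) for v in box]
            cv2.rectangle(frame, (x, y), (x + w, y + h), (255, 0, 0), 2)
            cv2.putText(frame, str(obj_nr), (x, y - 5),
                        cv2.FONT_HERSHEY_PLAIN, 1.8, (0, 0, 0))
    cv2.imshow("Tracked frame", frame)
    if cv2.waitKey(30) == 27:
        break
cap.release()
cv2.destroyAllWindows()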
I have several thousand images of fluid pathlines -- below is a simple example --
and I would like to detect them automatically: length and position.
For the position a defined point would be sufficient (e.g. left end).
I don't need the full shape information.
This is a pretty common task but I did not find a reliable method.
How could I do this?
My choice would be Python, but it's not a necessity as long as I can export the results.
The question "Counting curves, angles and straights in a binary image in OpenCV and Python" pretty much answers yours.
I tried it on your image and it works.
I used:
ret, thresh = cv2.threshold(gray, 90, 255, cv2.THRESH_BINARY_INV)
and commented out these two lines:
pts = remove_straight(pts) # remove almost straight angles
pts = max_corner(pts) # remove nearby points with greater angle
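For the length and left-end position alone you may not even need the angle analysis from that answer; a minimal sketch, assuming the pathlines are dark on a light background (the file name and threshold value are placeholders):

import cv2

img = cv2.imread("pathlines.png")                    # placeholder file name
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ret, thresh = cv2.threshold(gray, 90, 255, cv2.THRESH_BINARY_INV)

contours, hierarchy = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)

for cnt in contours:
    # arcLength traces the closed outline of a thin curve, so roughly twice the pathline length
    length = cv2.arcLength(cnt, True) / 2.0
    leftmost = tuple(cnt[cnt[:, :, 0].argmin()][0])  # (x, y) of the left end
    print(length, leftmost)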
I've been given an image containing stars and ovals, and have been tasked with detecting which is which and counting how many of each are contained within the image. One such image with ovals only looks like this:
I've first tried to solve the problem using OpenCV using tutorials such as this one and this one.
However, I seem to run into issues with both when bounding the ovals: one results in a count of 1 oval, whereas the other results in a count of 330.
I then tried using YOLOv4, thinking that it would be more useful when dealing with two different classes (stars and ovals). I used the following code to try drawing bounding boxes on my sample image.
import cv2
import cvlib as cv
from cvlib.object_detection import draw_bbox
import matplotlib.pyplot as plt

# img is the sample image, already loaded with cv2.imread
box, label, count = cv.detect_common_objects(img)
output = draw_bbox(img, box, label, count)
output = cv2.cvtColor(output, cv2.COLOR_BGR2RGB)   # matplotlib expects RGB
plt.figure(figsize=(10, 10))
plt.axis("off")
plt.imshow(output)
plt.show()
However I received IndexError: Invalid index to scalar variable.
Can anyone point me in the right direction on how to proceed?
I first need to be able to do it for one class, and then multiple classes, before expanding into doing it for several images automatically.
Thanks
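For what it's worth, a rough contour-based sketch of the kind those tutorials describe, which may be enough for the single-class counting step; it assumes dark shapes on a light background, and the file name, area cutoff and solidity threshold are made-up values:

import cv2

img = cv2.imread("ovals.png")                 # placeholder file name
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ret, thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

contours, hierarchy = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# Ignore tiny specks so noise is not counted as a shape
shapes = [c for c in contours if cv2.contourArea(c) > 50]
print("count:", len(shapes))

# A crude star/oval split: ovals nearly fill their convex hull, star outlines do not
for c in shapes:
    solidity = cv2.contourArea(c) / cv2.contourArea(cv2.convexHull(c))
    label = "oval" if solidity > 0.9 else "star"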
I am trying to make an object detection tool (given a sample object) using contours.
I have made some progress; however, when the object is in front of another object with a complicated structure (a hand or a face, for example), or when the object and its background blend in colour, the edges are no longer detected and I don't get a good contour.
After reading through the algorithm's documentation, I discovered that it works on the basis that edges are detected by differences in colour intensity - for example, if the object is black and the background is black, it will not be detected.
So now I am trying to apply some effects and blurs to try and make it work.
I am currently trying to get a combined Sobel filter (in both axes) in the hope that, given enough light, it will define the edges - the product will be used on phones, which have a flash.
So when I tried to do it:
frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
frame = cv2.GaussianBlur(frame, (5, 5), 10)
frameX = cv2.Sobel(frame, cv2.CV_64F, 1, 0)
frameY = cv2.Sobel(frame, cv2.CV_64F, 0, 1)
frame = cv2.bitwise_or(frameX, frameY)
I get an error saying the cv2.findContours supports only CV_8UC1 images when the mode is not CV_RETR_FLOODFILL.
Here is the line that triggers the error:
counturs, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
I started messing around with this only a week ago and I'm surprised how easy it is to get results, but some of the error messages are ridiculous.
Edit: I did try switching the mode to CV_RETR_FLOODFILL, but that did not fix the problem - then it didn't work at all.
The reason is that the findContours function expects an 8-bit single-channel image (uint8), effectively a binary image in which any non-zero pixel counts as foreground. The developers might have done this to reduce memory usage, since there is no point in storing binary values in 64 bits instead of 8. Convert frame into the uint8 type by just using
frame = np.uint8(frame)
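One extra detail: the CV_64F Sobel output contains negative values, so taking the gradient magnitude (or absolute value) before the cast avoids wrap-around. A rough sketch of the whole chain, with the bitwise_or swapped for the more usual gradient magnitude and a made-up threshold value:

import cv2
import numpy as np

frame = cv2.imread("frame.png")                      # placeholder input image
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (5, 5), 10)

sobel_x = cv2.Sobel(gray, cv2.CV_64F, 1, 0)
sobel_y = cv2.Sobel(gray, cv2.CV_64F, 0, 1)
edges = cv2.magnitude(sobel_x, sobel_y)              # combine both axes

edges = np.uint8(np.clip(edges, 0, 255))             # CV_8UC1, as findContours expects
ret, thresh = cv2.threshold(edges, 50, 255, cv2.THRESH_BINARY)
contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)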
I want to crop to an object in my image, so that only the colored object remains. How can I do it in python in the most efficient way?
Basically the image has black (0,0,0) background but different colors for an object. I want to crop to the object in order to drop the useless background.
I know cv2 has the resize() function, but it cannot tell what is background and what is not. I could also loop over the whole image to find the object's position, but that is too slow.
Finally I found a way to do it:
use cv2.findContours() to get the position of the object from the mask image (where the object has a specific colour) and cut it out directly with NumPy.
import cv2
import numpy as np

def cut_object(rgb_path, mask_path, object_color):
    """Cut a specific object out of an RGB image, located via its colour in the mask image."""
    rgb_image = cv2.imread(rgb_path)
    mask_image = cv2.imread(mask_path)
    mask_image = cv2.cvtColor(mask_image, cv2.COLOR_BGR2RGB)
    # Create a mask containing only the wanted object (exact colour match)
    object_mask_binary = cv2.inRange(mask_image, object_color, object_color)
    object_mask = cv2.bitwise_and(mask_image, mask_image, mask=object_mask_binary)
    # Detect the position of the object
    object_gray = cv2.cvtColor(object_mask, cv2.COLOR_RGB2GRAY)
    contours, hierarchy = cv2.findContours(object_gray, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    points = np.vstack(contours).squeeze()     # all contour points as (N, 2) pairs of (x, y)
    xmin, xmax = points[:, 0].min(), points[:, 0].max()
    ymin, ymax = points[:, 1].min(), points[:, 1].max()
    # Cut the object from the RGB image (rows are y, columns are x)
    crop_rgb = rgb_image[ymin:ymax, xmin:xmax]
    return crop_rgb
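A hypothetical call, assuming the mask marks the object in pure red and these file names exist:

crop = cut_object("scene_rgb.png", "scene_mask.png", (255, 0, 0))
cv2.imwrite("object_crop.png", crop)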
I have a number of images from Chinese genealogies, and I would like to be able to programmatically categorize them. Generally speaking, one type of image contains primarily line-by-line text, while the other type may be in a grid or chart format.
Example photos
'Desired' type: http://www.flickr.com/photos/63588871#N05/8138563082/
'Other' type: http://www.flickr.com/photos/63588871#N05/8138561342/in/photostream/
Question: Is there a (relatively) simple way to do this? I have experience with Python, but little knowledge of image processing. Direction to other resources is appreciated as well.
Thanks!
Assuming that at least some of the grid lines are exactly or almost exactly vertical, a fairly simple approach might work.
I used PIL to find all the columns in the image where more than half of the pixels were darker than some threshold value.
Code
from PIL import Image, ImageDraw

withlines = Image.open('withgrid.jpg')
nolines = Image.open('nogrid.jpg')

def findlines(image):
    w, h = image.size
    s = w * h
    im = image.convert('L').point(lambda i: 255 * (i < 60))   # threshold: dark pixels become white
    d = im.getdata()                                          # faster than per-pixel access
    linecolumns = []
    for col in range(w):
        # count pixels in this column that were darker than the threshold
        black = sum(d[x] for x in range(col, s, w)) // 255
        if black > 450:      # tuned to these scans: roughly more than half the column height
            linecolumns.append(col)
    # return an image showing the detected lines
    im2 = image.convert('RGB')
    draw = ImageDraw.Draw(im2)
    for col in linecolumns:
        draw.line((col, 0, col, h - 1), fill='#f00', width=1)
    return im2

findlines(withlines).show()
findlines(nolines).show()
Results
showing detected vertical lines in red for illustration
As you can see, four of the grid lines are detected, and, with some processing to ignore the left and right sides and the center of the book, there should be no false positives on the desired type.
This means that you could use the above code to detect black columns and discard those that are near the edges or the centre. If any black columns remain, classify the page as the "other", undesired class of pictures (see the sketch below).
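In code, that classification step could look roughly like this, assuming findlines is adapted to also return its linecolumns list; the 15% margin is a guess you would tune to your scans:

def is_grid_page(linecolumns, width, margin=0.15):
    """Return True if any detected column lies away from the page edges and the
    centre fold - i.e. the page is probably the grid/chart ('other') type."""
    edge = width * margin
    centre_lo = width * 0.5 - edge / 2
    centre_hi = width * 0.5 + edge / 2
    interior = [c for c in linecolumns
                if edge < c < width - edge and not (centre_lo < c < centre_hi)]
    return len(interior) > 0

# e.g. is_grid_page(cols, withlines.size[0])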
AFAIK, there is no easy way to solve this. You will need a decent amount of image processing and some basic machine learning to classify these kinds of images (and even then it probably won't be 100% successful).
Another note:
While this could be solved using only machine learning techniques, I would advise you to start by looking into image processing techniques first and try to convert your images into a form that shows a clear difference between the two types. A good starting point is reading about the FFT (a small starting sketch follows below). After that, have a look at some digital image processing techniques. When you feel comfortable that you have a decent understanding of these, you can read up on pattern recognition.
This is only one suggested approach though, there are more ways to achieve this.
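As a concrete starting point for the FFT suggestion, a minimal NumPy sketch that computes the magnitude spectrum of a page (the file name is a placeholder); a page dominated by vertical grid lines concentrates energy along the horizontal axis of the spectrum, which you could compare against the overall level:

import numpy as np
from PIL import Image

page = np.asarray(Image.open("page.jpg").convert("L"), dtype=float)   # placeholder file
spectrum = np.abs(np.fft.fftshift(np.fft.fft2(page)))
log_spec = np.log1p(spectrum)                 # compress the huge dynamic range

# Energy along the horizontal frequency axis (produced by vertical structures)
centre_row = log_spec[log_spec.shape[0] // 2, :]
print(centre_row.mean(), log_spec.mean())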