I'm trying to use OpenCV to take a photo with a 1080p camera, but I only want the photo to be 224x224 pixels. How can I do this with OpenCV?
I currently have the following code:
import cv2
cap = cv2.VideoCapture(0)
cap.set(3, 224)  # 3 is cv2.CAP_PROP_FRAME_WIDTH
cap.set(4, 224)  # 4 is cv2.CAP_PROP_FRAME_HEIGHT
ret, frame = cap.read()
However, when I look at the shape of frame, it is not (224, 224, 3). Could someone please help me figure out how to get a frame with the pixel dimensions I want?
When you say you want a 224x224 image, it depends on what you mean. If we start with this image, which is 1920x1080, you might want:
(A) - the top-left corner, highlighted in magenta
(B) - the central 224x224 pixels, highlighted in blue
(C) - the largest square, resized down to 224x224, highlighted in red
(D) - the entire image distorted to fit in 224x224
So, assume in the following that you have read your frame from the camera into a variable called im using something like:
ret, im = cap.read()
If you want (A), use:
# If you want the top-left corner
good = im[:224, :224]
If you want (B), use:
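# If you want the centre
h, w = im.shape[:2]   # frame height and width
x = h//2 - 112        # first row of the window
y = w//2 - 112        # first column of the window
good = im[x:x+224, y:y+224]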
If you want (C), use:
# If you want the largest square, scaled down to 224x224
h, w = im.shape[:2]   # assumes a landscape frame (w >= h)
y = (w-h)//2
good = im[:, y:y+h]
good = cv2.resize(good,(224,224))
If you want (D), use:
# If you want the entire frame distorted to fit 224x224
good = cv2.resize(im,(224,224))
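The two crops above can also be wrapped into small helpers so the frame size does not have to be hard-coded. This is only a sketch: the function names centre_crop and largest_centre_square are made up for illustration, and a zero-filled NumPy array stands in for a real camera frame.

```python
import numpy as np

def centre_crop(im, size=224):
    """Return the central size x size window of im (option B)."""
    h, w = im.shape[:2]
    top = h // 2 - size // 2
    left = w // 2 - size // 2
    return im[top:top + size, left:left + size]

def largest_centre_square(im):
    """Return the largest centred square of im (the crop step of option C)."""
    h, w = im.shape[:2]
    side = min(h, w)
    top = (h - side) // 2
    left = (w - side) // 2
    return im[top:top + side, left:left + side]

# Stand-in for a 1080p camera frame: (rows, columns, channels)
im = np.zeros((1080, 1920, 3), np.uint8)
print(centre_crop(im).shape)            # (224, 224, 3)
print(largest_centre_square(im).shape)  # (1080, 1080, 3)
```

For option (C) you would still pass the square to cv2.resize(square, (224, 224)) afterwards. Using min(h, w) means the helper also works for portrait frames.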
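import cv2

video_capture = cv2.VideoCapture(0)
# Request 224x224 once, before reading; the driver may ignore the request
video_capture.set(cv2.CAP_PROP_FRAME_HEIGHT, 224)
video_capture.set(cv2.CAP_PROP_FRAME_WIDTH, 224)
while video_capture.isOpened():
    ret, frame = video_capture.read()
    if not ret:
        break
    cv2.imshow('frame', frame)
    if cv2.waitKey(1) == ord("q"):
        break
video_capture.release()
cv2.destroyAllWindows()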
To get the current size, read the same two properties back with video_capture.get().
If that didn't work, then OpenCV doesn't have access to the width and height settings of the camera you are using.
This is quite simple: just use the cv2.resize(frame, size) function. Example:
import cv2

cam = cv2.VideoCapture(0)  # My camera is 640x480
size = (200, 200)          # The "size" parameter must be a (width, height) tuple.
while True:
    frame = cam.read()[1]
    new_frame = cv2.resize(frame, size)  # Resizing the frame ...
    cv2.imshow('frame', new_frame)       # waitKey needs an open window to receive keys
    if cv2.waitKey(1) == ord("q"):
        break
I am working on a project on camera tampering detection. I have code that detects tampering for a static camera, where tampering means blocking or defocusing the camera. I want to modify it for a moving or rotating camera, so that it collects all the frames of the background and then compares new frames against them using the same background subtractor method.
I have tried, but I cannot figure out how to use the list of frames to compare a captured frame against the others.
import numpy as np
import cv2
from playsound import playsound
import time

#cap = cv2.VideoCapture('') #Opening of IP camera just enter the ip address
cap = cv2.VideoCapture(0)
fgbg = cv2.createBackgroundSubtractorMOG2() #generating a foreground mask
frame_list = [] #creating the list of frames
ret, frame = cap.read()
if ret == True:
    if frame not in frame_list:
        frame_list.append(frame) #This list contains all the frames
#ret, frame = cap.read() #to get the initial frame
#fgmask = fgbg.apply(frame) #to save the initial frame
kernel = np.ones((5,5), np.uint8) #creating a 5x5 matrix of ones
while True:
    ret, frame = cap.read() #reading all the frames
    if ret == True:
        a = 0
        bounding_rect = [] #an empty list that will hold the bounding boxes of the contours
        fgmask = fgbg.apply(frame) #applying the changes of the background to the foreground mask
        fgmask = cv2.erode(fgmask, kernel, iterations=5)
        fgmask = cv2.dilate(fgmask, kernel, iterations=5) #erosion and dilation help detect even blurred objects
        cv2.imshow('frame',frame) #Showing the frame.
        contours,_ = cv2.findContours(fgmask,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE) #CHAIN_APPROX_SIMPLE keeps only the end points needed to draw each contour
        for i in range(0,len(contours)):
            bounding_rect.append(cv2.boundingRect(contours[i])) #cv2.boundingRect gives (x, y, w, h), which we store in the list
        for i in range(0,len(contours)):
            if bounding_rect[i][2] >=40 or bounding_rect[i][3] >= 40: #threshold on the width and height of the contour
                a = a+(bounding_rect[i][2])*bounding_rect[i][3] #accumulating the area of the large contours
        if(a >=int(frame.shape[0])*int(frame.shape[1])/3): #if the changed area exceeds a third of the frame, report tampering
            cv2.putText(frame,"TAMPERING DETECTED",(5,30),cv2.FONT_HERSHEY_SIMPLEX,1,(0,255,255),2)
            #playsound('warning.mp3') #put the address of the warning tune in the playsound object and uncomment it
        cv2.imshow('frame',frame) #showing the final frame
        if cv2.waitKey(30) & 0xff == ord('q'): #To close the camera press q
            break
    else : break
cap.release()
cv2.destroyAllWindows()
In my opinion, background subtraction is not a good solution to your problem. That method is mainly used to detect moving objects by subtracting the background, and it is most often used with a static camera. For a moving camera, you may also need to take intensity variation or texture change into account.
I'm new to OpenCV, so apologies if this is a trivial question...
I'm writing an application that tracks the path of an object in real time. So far, I have successfully isolated the object and created a "trail" of its path using cv2.accumulateWeighted(). Everything looks great in the preview window, but when I save the merged frame to a file, things aren't so good.
The result varies, but typically the saved frame has much less detail than the displayed frame. I've converted the input to grayscale, and often the written file has very "dim" features.
I believe only the final frame is written (multiplied by the alpha blend), rather than the accumulated image. Any ideas would be greatly appreciated.
Sample program to demonstrate the issue:
import cv2

#---- read the next frame from the capture device
def read_frame(cap):
    ret, frame = cap.read()
    if ret is False or frame is None:
        return None
    gray_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    return gray_frame

#---- setup components
cap = cv2.VideoCapture(index=0)
background_subtractor = cv2.createBackgroundSubtractorMOG2(
    history=30, varThreshold=50, detectShadows=False
)

#---- prime the accumulator
frame = read_frame(cap)
merged_frame = frame.astype(float)

#---- capture some frames
while True:
    frame = read_frame(cap)
    mask = background_subtractor.apply(frame, learningRate=0.01)
    foreground = cv2.bitwise_and(frame, frame, mask=mask)
    cv2.accumulateWeighted(foreground, merged_frame, 0.1)
    cv2.imshow('Acccumulator', merged_frame)
    key = cv2.waitKey(1)
    # press 'q' to quit and save the current frame
    if key == ord('q') or key == ord('Q'):
        cv2.imwrite('merged.png', merged_frame)
        break
The following are images when moving my hand through the scene... You can see the path of my hand in the displayed image, along with some other background elements. In the saved image, only a very dim version of my hand in the final position is saved.
This is the displayed image (using screen capture):
This is the image written to disk (using imwrite()):
I guess you want to save merged_frame as it is shown by cv2.imshow.
You may limit the upper value of merged_frame to 1, scale by 255, and convert to uint8 before saving:
merged_frame = np.round(np.minimum(merged_frame, 1)*255).astype(np.uint8)
The type of merged_frame is float64.
When cv2.imshow is given an image of type float, all values above 1.0 are shown as white (and all values below 0 as black).
A gray level in the range [0, 1] is equivalent to the range [0, 255] of the uint8 type (0.5 corresponds to 128).
When using cv2.imwrite, the image is converted to uint8, but without clamping and scaling (a plain cast to uint8). The result is usually very dark.
If you want to save the image as it is shown, you need to clamp the values to 1, then scale by 255.
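To make the difference concrete, here is a small NumPy-only sketch with a few made-up accumulator values. The helper name as_displayed is hypothetical, and the plain astype cast is only a stand-in for the conversion described above, not a call into cv2.imwrite itself.

```python
import numpy as np

def as_displayed(img_float):
    """Clamp a float image to [0, 1], scale by 255 and convert to uint8."""
    clamped = np.clip(img_float, 0.0, 1.0)
    return np.round(clamped * 255).astype(np.uint8)

acc = np.array([[0.0, 0.5, 1.0, 37.2]])   # float64 accumulator values
print(as_displayed(acc).tolist())          # [[0, 128, 255, 255]]
print(acc.astype(np.uint8).tolist())       # [[0, 0, 1, 37]]
```

The second line of output shows why the saved file looks dim: values that appear as mid-gray or white on screen end up as 0, 1 or 37 out of 255 after a plain cast.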
You didn't post input samples, so I created synthetic input:
import numpy as np
import cv2

background_subtractor = cv2.createBackgroundSubtractorMOG2(
    history=30, varThreshold=50, detectShadows=False
)
width, height = 640, 480
frame = np.full((height, width), 60, np.uint8)
merged_frame = frame.astype(float)

for n in range(100):
    img = np.full((height, width, 3), 60, np.uint8)
    cv2.putText(img, str(n), (width//2-100*len(str(n)), height//2+100), cv2.FONT_HERSHEY_DUPLEX, 10, (30, 255, 30), 20) # Green number
    #frame = read_frame(cap)
    frame = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    mask = background_subtractor.apply(frame, learningRate=0.01)
    foreground = cv2.bitwise_and(frame, frame, mask=mask)
    cv2.accumulateWeighted(foreground, merged_frame, 0.1)
    cv2.imshow('Acccumulator', merged_frame)
    cv2.waitKey(20)  # give the window time to refresh

#merged_frame = cv2.normalize(merged_frame, merged_frame, 0, 255.0, cv2.NORM_MINMAX).astype(np.uint8) # Alternative approach - normalize between 0 and 255
merged_frame = np.round(np.minimum(merged_frame, 1)*255).astype(np.uint8)
cv2.imshow('merged_frame as uint8', merged_frame)
cv2.imwrite('merged.png', merged_frame)
cv2.waitKey()
cv2.destroyAllWindows()
PNG image using imwrite, without clamping and scaling:
PNG image using imwrite, with clamping and scaling:
A better way to show the image is to normalize the values to the range [0, 1] before showing it.
In the loop, after cv2.accumulateWeighted(foreground, merged_frame, 0.1):
norm_acccumulator = merged_frame.copy()
cv2.normalize(norm_acccumulator, norm_acccumulator, 0, 1.0, cv2.NORM_MINMAX)
cv2.imshow('Acccumulator', norm_acccumulator)
I'm new here, and I want to solve a problem that has been bothering me for the past few weeks.
This is the problem: I have two pieces of code. The first one uses OpenCV to grab a video from a directory (it could be a web camera too), reads it frame by frame, draws text on each frame to plot one variable (just for testing), shows me the video and saves it in the Python file's directory. This is the code ("MyPath" is the path of the file, which is irrelevant to the question):
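import cv2
import random

capture = cv2.VideoCapture(r"MyPath\Car.mp4")
out = cv2.VideoWriter('Test.avi',cv2.VideoWriter_fourcc('M','P','4','2'), 25, (1280, 720))
font = cv2.FONT_HERSHEY_SIMPLEX  # assumed; the font definition is not shown in the post

def velocity():
    now = (random.randint(0,100))
    return now

while True:
    ret, frame = capture.read()
    if ret:
        cv2.putText(frame, str(velocity()), (130,580),font,0.7,(255,255,255),1)
        cv2.imshow("Testing", frame)
        out.write(frame)  # save the annotated frame
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    else:
        break

capture.release()
out.release()
cv2.destroyAllWindows()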
This works fine, no problems here. I have another piece of code that uses Pillow to open a background image (JPG) and four PNG images, resizes and repositions them, and then pastes them on top of the background. (I made a background just for testing; these four PNG images will eventually be drawn onto the video frames.) It then shows and saves the background with the PNGs on top, and again, it works perfectly!
from PIL import Image
back = Image.open(r"MyPath\Eagle.jpg")
vel = Image.open(r"MyPath\Velocímeter.png")
dis = Image.open(r"MyPath\Distance.png")
inp = Image.open(r"MyPath\Inclination.png")
orz = Image.open(r"MyPath\Orientation.png")
vel = vel.resize((60, 60), Image.LANCZOS)
dis = dis.resize((55, 55), Image.LANCZOS)
inp = inp.resize((60, 60), Image.LANCZOS)
orz = orz.resize((60, 60), Image.LANCZOS)
back.paste(vel, (10, 420), vel)
back.paste(dis, (10, 510), dis)
back.paste(inp, (10, 583), inp)
back.paste(orz, (10, 655), orz)
The problem is that Pillow is an image library, so I can't open a video and paste the images, and OpenCV does not handle PNG images as easily as Pillow. Is there any way to combine these two pieces of code into one and do what I want, that is, draw the PNG images on the video frames to get a rendered video with both the text and the images? The project is meant to read information from sensors and plot it, which is why I wrote a test function just to try things out (the images are a detail I want to add to the project). If you like my code and find it useful, feel free to use it! Thank you very much for reading! I hope you can help me. (I don't speak English fluently, so sorry for any mistakes, but I can read your answers perfectly.)
You can mix PIL/Pillow and OpenCV. There are three things you need to know...
To convert an OpenCV image into a PIL image:
pilimage = Image.fromarray(opencvimage)
To convert from a PIL image into an OpenCV image:
opencvimage = np.array(pilimage)
And lastly, OpenCV stores images in BGR order while PIL uses RGB, so reds and blues will get swapped if you don't watch out. cv2.cvtColor() with cv2.COLOR_BGR2RGB is your friend here.
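Putting those three points together, a round trip might look like the sketch below. The helper names cv_to_pil and pil_to_cv are made up for illustration, and the channel swap is done by reversing the last axis with NumPy, which has the same effect as a BGR/RGB cvtColor call for 3-channel images.

```python
import numpy as np
from PIL import Image

def cv_to_pil(frame_bgr):
    """OpenCV-style BGR array -> PIL RGB image."""
    return Image.fromarray(np.ascontiguousarray(frame_bgr[:, :, ::-1]))

def pil_to_cv(pil_image):
    """PIL image -> OpenCV-style BGR array."""
    rgb = np.array(pil_image.convert('RGB'))
    return np.ascontiguousarray(rgb[:, :, ::-1])

frame = np.zeros((4, 4, 3), np.uint8)
frame[:, :] = (255, 0, 0)                        # pure blue in BGR order
pil = cv_to_pil(frame)
print(pil.getpixel((0, 0)))                      # (0, 0, 255), i.e. blue in RGB
print(np.array_equal(pil_to_cv(pil), frame))     # True
```

np.ascontiguousarray is there because the reversed view has negative strides, and some OpenCV drawing functions refuse to write into such arrays.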
So we can now mix up those two pieces of code:
#!/usr/bin/env python3
import cv2
import random
import numpy as np
from PIL import Image

# Load a speedo image which has transparency
speedo = Image.open('speedo.png').convert('RGBA')
capture = cv2.VideoCapture("movie.mov")
out = cv2.VideoWriter('test.mov',cv2.VideoWriter_fourcc('a','v','c','1'), 25, (1280, 720))
font = cv2.FONT_HERSHEY_SIMPLEX  # assumed; the font definition is not shown in the post

def velocity():
    now = (random.randint(0,100))
    return now

while True:
    ret, frame = capture.read()
    if ret:
        cv2.putText(frame, str(velocity()), (130,580),font,0.7,(255,255,255),1)
        # Make PIL image from frame, paste in speedo, revert to OpenCV frame
        pilim = Image.fromarray(frame)
        pilim.paste(speedo, box=(10, 420), mask=speedo)  # position is illustrative
        frame = np.array(pilim)
        cv2.imshow("Testing", frame)
        out.write(frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    else:
        break

capture.release()
out.release()
cv2.destroyAllWindows()
Here's the speedo I used:
If you want the colours to go across to OpenCV correctly from PIL, you need to re-order the channels to what OpenCV expects, namely BGRA. So, you could change the image loading to this:
# Open speedo image, split into separate RGBA channels, then re-combine in BGRA order for OpenCV
speedo = Image.open('speedo.png').convert('RGBA')
R,G,B,A = speedo.split()
speedo = Image.merge('RGBA',(B,G,R,A))
I want to calibrate a car video recorder and use it for 3D reconstruction with Structure from Motion (SfM). The original size of the pictures I took with this camera is 1920x1080. Basically, I have been using the source code from the OpenCV tutorial for the calibration.
But there are some problems and I would really appreciate any help.
So, as usual (at least in the above source code), here is the pipeline:
Find the chessboard corners with findChessboardCorners
Refine them to subpixel accuracy with cornerSubPix
Draw them for visualisation with drawChessboardCorners
Then calibrate the camera with a call to calibrateCamera
Call getOptimalNewCameraMatrix and the undistort function to undistort the image
In my case, since the image is too big (1920x1080), I have resized it to 640x360 (I will also use this image size during SfM, so I don't think it should be a problem). I have also used a chessboard with 9x6 corners for the calibration.
Here the problem arose. After a call to getOptimalNewCameraMatrix, the undistortion comes out totally wrong. Even the returned ROI is [0,0,0,0]. Below is the original image and its undistorted version:
You can see that the image content in the undistorted version ends up at the bottom left.
But if I don't call getOptimalNewCameraMatrix and just undistort directly, I get quite a good image.
So, I have three questions.
Why is this? I have tried with another dataset taken with the same camera, and also with my iPhone 6 Plus, but the results are the same as above.
What does getOptimalNewCameraMatrix do? I have read the documentation several times but still cannot understand it. From what I have observed, if I don't call getOptimalNewCameraMatrix, my image retains its size but is zoomed and blurred. Can anybody explain this function in more detail?
For SfM, I guess the call to getOptimalNewCameraMatrix is important? Otherwise the undistorted image would be zoomed and blurred, making keypoint detection harder (in my case, I will be using optical flow)?
I have tested the code with the OpenCV sample pictures and the results are just fine.
Below is my source code:
from sys import argv
import numpy as np
import imutils  # To use the imutils.resize function.
                # Resizing while preserving the image's ratio.
                # In this case, resizing 1920x1080 into 640x360.
import cv2
import glob

# termination criteria
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)
# prepare object points, like (0,0,0), (1,0,0), (2,0,0) ....,(8,5,0)
objp = np.zeros((9*6,3), np.float32)
objp[:,:2] = np.mgrid[0:9,0:6].T.reshape(-1,2)
# Arrays to store object points and image points from all the images.
objpoints = [] # 3d point in real world space
imgpoints = [] # 2d points in image plane.
images = glob.glob(argv[1] + '*.jpg')
width = 640

for fname in images:
    img = cv2.imread(fname)
    if width:
        img = imutils.resize(img, width=width)
    gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
    # Find the chess board corners
    ret, corners = cv2.findChessboardCorners(gray, (9,6),None)
    # If found, add object points, image points (after refining them)
    if ret == True:
        objpoints.append(objp)
        corners2 = cv2.cornerSubPix(gray,corners,(11,11),(-1,-1),criteria)
        imgpoints.append(corners2)
        # Draw and display the corners
        img = cv2.drawChessboardCorners(img, (9,6), corners2,ret)
        cv2.imshow("corners", img)
        cv2.waitKey(500)

ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(objpoints, imgpoints, gray.shape[::-1],None,None)

for fname in images:
    img = cv2.imread(fname)
    if width:
        img = imutils.resize(img, width=width)
    h, w = img.shape[:2]
    newcameramtx, roi=cv2.getOptimalNewCameraMatrix(mtx,dist,(w,h),1,(w,h))
    # undistort
    dst = cv2.undistort(img, mtx, dist, None, newcameramtx)
    # crop the image
    x,y,w,h = roi
    dst = dst[y:y+h, x:x+w]
    cv2.imshow("undistorted", dst)
    cv2.waitKey(0)

mean_error = 0
for i in range(len(objpoints)):
    imgpoints2, _ = cv2.projectPoints(objpoints[i], rvecs[i], tvecs[i], mtx, dist)
    error = cv2.norm(imgpoints[i],imgpoints2, cv2.NORM_L2)/len(imgpoints2)
    mean_error += error
print("total error: ", mean_error/len(objpoints))
I already asked on answers.opencv.org, and someone there tried my code and my dataset with success. I wonder what is actually wrong.
Question #2:
With cv::getOptimalNewCameraMatrix(...) you can compute a new camera matrix according to the free scaling parameter alpha.
If alpha is set to 1, then all the source image pixels are retained in the undistorted image, which means you will see a black, curved border around the undistorted image (like a pincushion). This is unfortunate for several computer vision algorithms because, for example, new edges appear in the undistorted image.
By default, cv::undistort(...) chooses the subset of the source image that will be visible in the corrected image, which is why only the valid pixels are shown: there is no pincushion border around the corrected image, but some data is lost.
In any case, you can control the subset of the source image that will be visible in the corrected image:
cv::Mat image, cameraMatrix, distCoeffs;
// ...
cv::Mat newCameraMatrix = cv::getOptimalNewCameraMatrix(cameraMatrix, distCoeffs, image.size(), 1.0);
cv::Mat correctedImage;
cv::undistort(image, correctedImage, cameraMatrix, distCoeffs, newCameraMatrix);
Question #1:
This is just my feeling, but you should also take care: if you resize your image after the calibration, then the camera matrix must be scaled as well, for example:
cv::Mat cameraMatrix;
cv::Size calibSize; // Image size during the calibration, e.g. 1920x1080
cv::Size imageSize; // Your current image size, e.g. 640x360
// ...
cv::Matx31d t(0.0, 0.0, 1.0);
t(0) = (double)imageSize.width / (double)calibSize.width;
t(1) = (double)imageSize.height / (double)calibSize.height;
cv::Mat cameraMatrixScaled = cv::Mat::diag(cv::Mat(t)) * cameraMatrix;
This must be done only for the camera matrix, because the distortion coefficients do not depend on the resolution.
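The same scaling in Python with NumPy might look like this. The function name scale_camera_matrix and the example intrinsics are assumptions for illustration; the operation is just a left-multiplication by diag(sx, sy, 1).

```python
import numpy as np

def scale_camera_matrix(K, calib_size, image_size):
    """Scale a 3x3 camera matrix from calib_size to image_size, both (width, height)."""
    sx = image_size[0] / calib_size[0]
    sy = image_size[1] / calib_size[1]
    return np.diag([sx, sy, 1.0]) @ K

# Hypothetical intrinsics obtained at 1920x1080
K = np.array([[1500.0, 0.0, 960.0],
              [0.0, 1500.0, 540.0],
              [0.0, 0.0, 1.0]])
K_small = scale_camera_matrix(K, (1920, 1080), (640, 360))
print(K_small)  # fx, fy, cx, cy come out as roughly 500, 500, 320, 180
```

The first row (fx and cx) is scaled by the width ratio and the second row (fy and cy) by the height ratio, while the last row stays (0, 0, 1).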
Question #3:
In any case, I think cv::getOptimalNewCameraMatrix(...) is not important in your case. The undistorted image can be zoomed and blurred because you remove the effect of a non-linear transformation. If I were you, I would try the optical flow without calling cv::undistort(...). I think that even a distorted image can contain a lot of good features for tracking.
I am using a microscope to observe the motion of small 4 micron beads. I have a video of the beads moving and I would like to process the video to extract the bead locations as a function of time to get a mathematical model of their motion.
I am currently using OpenCV and programming in Python.
My code imports a video from a file, thresholds the image, then applies a HoughCircles transform to find the spherical beads.
import numpy as np
import cv2

def nothing(x):
    pass

# Trackbar window; the ranges and start values here are assumed
cv2.namedWindow('trackbar')
cv2.createTrackbar('Param1', 'trackbar', 50, 255, nothing)
cv2.createTrackbar('Param2', 'trackbar', 30, 255, nothing)
cv2.createTrackbar('MaxRadius', 'trackbar', 20, 100, nothing)

cap = cv2.VideoCapture('testvideo.mp4')
e1 = cv2.getTickCount()
while True:
    ret, frame = cap.read()
    if not ret:
        break
    #get grayscale image for HoughCircles
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    p1 = cv2.getTrackbarPos('Param1','trackbar')
    p2 = cv2.getTrackbarPos('Param2','trackbar')
    rMax = cv2.getTrackbarPos('MaxRadius','trackbar')
    #Threshold grayscale image
    ret,th1 = cv2.threshold(gray,127,255,cv2.THRESH_BINARY)
    #Find circles in image and store locations to circles list
    circles = cv2.HoughCircles(th1, cv2.HOUGH_GRADIENT, 1, 10,
                               param1=p1, param2=p2, minRadius=0,
                               maxRadius=rMax)
    #Hack for fixing the list if it is empty so program wont crash
    if circles is None:
        circles = [[[0,0,0.000],[0,0,0.000]]]
    #convert circles list to integer list
    circles = np.uint16(np.around(circles))
    #store points to a file
    with open('datafile.txt', 'a') as datafile:
        np.savetxt(datafile, circles[0], fmt ='%i',delimiter=',', newline = ',')
    for i in circles[0,:]:
        # draw the outer circle
        cv2.circle(frame, (int(i[0]), int(i[1])), int(i[2]), (0,255,0), 2)
        # draw the center of the circle
        cv2.circle(frame, (int(i[0]), int(i[1])), 2, (0,0,255), 3)
    cv2.imshow('detected circles',frame)
    cv2.imshow('threshold video',th1)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
e2 = cv2.getTickCount()
elapsed = (e2 - e1)/ cv2.getTickFrequency()
print(elapsed)
cap.release()
cv2.destroyAllWindows()
Here is a sample frame from the video I am using for detecting the beads.
Here is an example of the single frame tracking that I did with the previous algorithm
I don't need to detect every single bead. Just an aggregate would be fine.
The beads are spherical and should look the same, so is there a library that I can use to integrate the bead image over the entire image and see where the points are most correlated in the image? Sometimes the beads are out of focus and that's why the current program I have keeps bouncing around and giving me false positives for the beads.
I eventually need this process to happen in real time, so it would be nice for the algorithm to be as efficient as possible.
If anyone knows a good approach to this type of problem it would be appreciated.