I just bought a FLIR BlackFly S USB3.0 camera. I can grab frames from the camera, but I am not able to use those frames with OpenCV without saving them first. Does anyone know how to convert them for use in OpenCV?
I searched the internet for everything that includes the word "PySpin" and found this book.
I have tried to use PySpinCapture, which is mentioned in this book, but I couldn't figure it out.
capture = PySpinCapture.PySpinCapture(0, roi=(0, 0, 960, 600),binningRadius=2,isMonochrome=True)
ret, frame = capture.read()
cv2.imshow("image",frame)
cv2.waitKey(0)
I expected to see the image, but it throws an error:
_PySpin.SpinnakerException: Spinnaker: GenICam::AccessException= Node is not writable. : AccessException thrown in node 'PixelFormat' while calling 'PixelFormat.SetIntValue()' (file 'EnumerationT.h', line 83) [-2006]
terminate called after throwing an instance of 'Spinnaker::Exception'
One year later, and not sure if my response will help, but I figured out that you can just get the RGB numpy array from a PySpin Image by using the GetData() function.
So you could do without the PySpinCapture module and just do something like the following.
import PySpin
import cv2

serial = '18475994'  # Probably different for you, although I also use a BlackFly USB3.0

system = PySpin.System.GetInstance()
blackFly_list = system.GetCameras()
blackFly = blackFly_list.GetBySerial(serial)

height = blackFly.Height()
width = blackFly.Width()
channels = 1

fourcc = cv2.VideoWriter_fourcc(*'XVID')
# The last argument should be True if you are recording in color.
out = cv2.VideoWriter('test_vid.avi', fourcc, blackFly.AcquisitionFrameRate(), (width, height), False)

blackFly.Init()
blackFly.AcquisitionMode.SetValue(PySpin.AcquisitionMode_Continuous)
blackFly.BeginAcquisition()

nFrames = 1000
for _ in range(nFrames):
    im = blackFly.GetNextImage()
    # GetData() returns a flat array; reshape it into the image dimensions.
    im_cv2_format = im.GetData().reshape(height, width, channels)
    # Here I am writing the image to a video, but you could save the image
    # as anything and do whatever you want with it.
    out.write(im_cv2_format)
    im.Release()  # PySpin images must be released to free the buffer

blackFly.EndAcquisition()
out.release()
In this code example I create an AVI video file from 1000 grabbed frames. im.GetData() returns a 1-D numpy array, which can then be reshaped into the correct dimensions. I have seen some talk about using the UMat class, but it does not seem to be necessary to make it work in this case. Perhaps it helps performance, but I am not sure :)
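If your version of PySpin provides it, the GetNDArray() method can replace the manual GetData().reshape() step; a minimal sketch, assuming a recent Spinnaker release where this method exists:
im = blackFly.GetNextImage()
frame = im.GetNDArray()  # numpy array already shaped (height, width[, channels])
out.write(frame)
im.Release()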
A bit of context: I'm using the TensorFlow object detection API for a project, and I've modified the visualization_utils file to print any present class labels to the terminal and then write them to a .txt file.
From a bit of research I've come across FFmpeg, and I'm wondering if there is a function in FFmpeg I can use so that it only prints and writes the class labels from keyframes within the video, i.e. when there is a change in the video. At the moment it prints all the class labels for every frame even when nothing changes, so I get duplicate labels even if there is no new object in the video. Following on from this, would I have to apply this keyframe filtering to the input video beforehand?
Thanks in advance!
I'm using OpenCV (cv2) to capture my video input.
Please see below for code:
visualization_utils.py - inside the draw_bounding_box_on_image_array function:
# Write detected class labels to a file for evaluation.
with open("ObjDecOutput.txt", "a") as f:
    print(display_str_list[0])
    f.write(display_str_list[0] + "\n")
Thought I'd just follow up on this: I ended up using ffmpeg's mpdecimate and setpts filters to remove duplicate and similar frames.
ffmpeg -i example.mp4 -vf mpdecimate=frac=1,setpts=N/FRAME_RATE/TB example_decimated.mp4
This however didn't solve the problem of duplicates within the file I was writing the labels to. To solve that, I appended each row of the file to a list, then looped through it, removing each group of consecutive duplicated elements, keeping only the first occurrence and appending that to a new list.
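A minimal sketch of that deduplication step, assuming the labels sit one per line in ObjDecOutput.txt (the file from the snippet above); itertools.groupby collapses consecutive repeats, keeping the first of each group:
from itertools import groupby

# Read the logged labels, one per line
with open("ObjDecOutput.txt") as f:
    labels = [line.strip() for line in f]

# Keep only the first occurrence of each consecutive group of duplicates
deduped = [label for label, _ in groupby(labels)]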
Finally, I found the solution here after a year. However, there is a small bug in the code converted from this script. The fix is adding and frame["key_frame"] to the keyframe check.
import json
import subprocess

def get_frames_metadata(file):
    command = '"{ffexec}" -show_frames -print_format json "{filename}"'.format(ffexec='ffprobe', filename=file)
    response_json = subprocess.check_output(command, shell=True, stderr=None)
    frames = json.loads(response_json)["frames"]
    frames_metadata, frames_type, frames_type_bool = [], [], []
    for frame in frames:
        if frame["media_type"] == "video":
            video_frame = json.dumps(dict(frame), indent=4)
            frames_metadata.append(video_frame)
            frames_type.append(frame["pict_type"])
            # A frame is a true keyframe only if it is an I-frame AND key_frame is set
            if frame["pict_type"] == "I" and frame["key_frame"]:
                frames_type_bool.append(True)
            else:
                frames_type_bool.append(False)
    # print(frames_type)
    return frames_metadata, frames_type, frames_type_bool
The frame types are stored in frames_type, but don't trust it; the true keyframes are flagged in frames_type_bool.
I tested a clip that had two consecutive I-frames at the beginning, but avidemux was showing only one. So I checked the original code and found that some frames may have pict_type = I but key_frame = False. I thus fixed the code.
Once you have frames_type_bool, you can extract the True indices and use opencv or imageio to extract only the keyframes.
This is how to use this function and imageio to show the keyframes:
import matplotlib.pyplot as plt
import imageio

filename = 'Clip.mp4'

# extract frame types
_, _, isKeyFrame = get_frames_metadata(filename)

# keep keyframe indices
keyframes_index = [i for i, b in enumerate(isKeyFrame) if b]

# open file
vid = imageio.get_reader(filename, 'ffmpeg')
for i in keyframes_index:
    image = vid.get_data(i)
    fig = plt.figure()
    fig.suptitle('image #{}'.format(i), fontsize=20)
    plt.imshow(image)
    plt.show()
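If you would rather stay in OpenCV, a minimal sketch of the same idea (reading sequentially rather than seeking, which is usually cheaper):
import cv2

keyframes = []
cap = cv2.VideoCapture(filename)
frame_idx = 0
while True:
    ret, frame = cap.read()
    if not ret:
        break
    if frame_idx < len(isKeyFrame) and isKeyFrame[frame_idx]:
        keyframes.append(frame)
    frame_idx += 1
cap.release()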
I am currently working on building a dataset for a computer vision problem and wanted to add some data to what I already had, so I wanted to get around 3000 frames from 2 different videos.
I used OpenCV because I knew its capture feature, but I'm not sure about this approach because my memory usage is exploding. I was using a pickle file for the previous, already-processed dataset and had no memory problem holding that much information. Maybe my code is horrible without me noticing it...
Here is my code to get around 3000 frames from the videos:
import cv2

video_name1 = "videosDataset/AMAZExNHORMS2019_Lo-res.mp4"
video_name2 = "videosDataset/CAMILLATHULINS2019_Lo-res.mp4"

def getAllFrames(videoName):
    cap = cv2.VideoCapture(videoName)  # videoName is the video being called
    frames = []
    length = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    print(length)
    minL = int(length / 2)
    maxL = int(2 * length / 3)
    print(minL, maxL)
    for i in range(minL, maxL):
        cap.set(cv2.CAP_PROP_POS_FRAMES, i)  # Seek to the frame you want
        ret, frame = cap.read()  # Read the frame
        frames.append(frame)
        print(str(round((i - minL) / (maxL - minL) * 100, 2)) + '%')
    return frames
frames1 = getAllFrames(video_name1)
I would like to know if there is a better way to do this. Thank you
The problem here is the compression: when read, each frame is stored as a numpy array, which is rather expensive. For example, one RGB frame of 1280 x 720 pixels is about 200 kB in jpg format, 1.2 MB in png format, 2.7 MB when stored as a numpy uint8 array, and 22 MB when stored as a numpy float64 array.
The easiest solution is to store each frame to disk as a jpg image (e.g. with cv2.imwrite) instead of keeping all frames in an in-memory array.
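A minimal sketch of that change to the loop above, assuming a frames/ directory already exists; each frame goes straight to disk instead of into the frames list:
for i in range(minL, maxL):
    cap.set(cv2.CAP_PROP_POS_FRAMES, i)
    ret, frame = cap.read()
    if not ret:
        break
    # Write the frame to disk as jpg instead of accumulating it in memory
    cv2.imwrite('frames/frame_{:05d}.jpg'.format(i), frame)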
Assuming that by making a dataset you mean that you want to save all the frames individually for use in the dataset, the easiest option would probably be to use a tool like ffmpeg to do so. See here for an example. FFmpeg supports a number of image file formats, probably including the format you want to save the images in.
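For instance, a typical invocation (the paths and the numbering pattern are just placeholders) dumps every frame to numbered jpg files:
ffmpeg -i input_video.mp4 frames/frame_%05d.jpg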
I am trying to do something very simple: subtract a background image from a video for object tracking. I understood that images can simply be subtracted from one another, as in img3 = img2 - img1. However, even when I start simple with one image, add a black line to it, and store it as img2, img3 will not just show the line. When I run the following code
import cv2
img1 = cv2.imread("img1.png")
img2 = cv2.imread("img2.png")
img3 = img2 - img1
cv2.imwrite("img3.png",img3)
with the img1 and img2 below:
I get the image on the left below, instead of the image on the right:
I want to use this method for background extraction in a video, e.g. where I have a bg image file that shows an empty scene and a video that shows the same scene, with objects sometimes moving in and out of the screen. I use the following code, but similarly get a B/W image instead of just the object visible without the scene...
import cv2
import numpy as np
from PIL import Image

bg = cv2.imread("bg.png")  # the empty-scene background image (filename assumed)
capture = cv2.VideoCapture("video.mov")
while True:
    f, frame = capture.read()
    if not f:
        break
    frame = cv2.GaussianBlur(frame, (15, 15), 0)
    frame = frame - bg
    cv2.imshow("window", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
PS: I know about automatic background subtraction, but I have very good background files and very clear empty scenes with very obvious objects, so I thought this should easily work!
Update: I have just found out about the PIL ImageChops difference function, which gets what I want with two images, but it seems impossible to use with a video opened with opencv. Also, would it be possible to do ImageChops.difference(img1,img2) manually with numpy arrays?
The closest to the expected result you can get is with this code:
img3 = 255 - cv2.absdiff(img1,img2)
This code will give you this:
Note that using only cv2.absdiff(img1,img2) will give the opposite of this result, because this operation basically tells you the difference between the 2 images: if at some position there is no difference, the result (at this position) is 0.
To achieve a "perfect result" (exactly what you expect) you need to apply some thresholding (or some other kind of filter which will erase the left part of the image).
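A minimal sketch of that thresholding step (the cutoff of 30 is just an assumption to tune); it also answers the numpy question above, since absdiff is essentially ImageChops.difference for numpy arrays:
import cv2

img1 = cv2.imread("img1.png")
img2 = cv2.imread("img2.png")

# Absolute difference; equivalent to PIL's ImageChops.difference on arrays
diff = cv2.absdiff(img1, img2)

# Zero out small differences, keeping only clearly changed pixels
gray = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)
_, mask = cv2.threshold(gray, 30, 255, cv2.THRESH_BINARY)

cv2.imwrite("img3.png", mask)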
I'm looking to create a function for converting a QImage into OpenCV's (cv2) Mat format from within PyQt.
How do I do this? My input images I've been working with so far are PNGs (either RGB or RGBA) that were loaded in as a QImage.
Ultimately, I want to take two QImages and use the matchTemplate function to find one image in the other, so if there is a better way to do that than I'm finding now, I'm open to that as well. But being able to convert back and forth between the two easily would be ideal.
Thanks for your help,
After much searching on here, I found a gem that got me a working solution. I derived much of my code from this answer to another question: https://stackoverflow.com/a/11399959/1988561
The key challenge I had was in how to correctly use the pointer. The big thing I think I was missing was the setsize function.
Here are my imports:
import cv2
import numpy as np
Here's my function:
def convertQImageToMat(incomingImage):
    ''' Converts a QImage into an opencv MAT format '''
    # Format 4 is QImage.Format_RGB32, i.e. 4 bytes per pixel
    incomingImage = incomingImage.convertToFormat(4)
    width = incomingImage.width()
    height = incomingImage.height()
    ptr = incomingImage.bits()
    ptr.setsize(incomingImage.byteCount())  # Tell the pointer how many bytes it spans
    arr = np.array(ptr).reshape(height, width, 4)  # Copies the data
    return arr
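A quick usage sketch (the PyQt4-style import and the file name are just assumptions):
from PyQt4 import QtGui

qimg = QtGui.QImage('screenshot.png')
mat = convertQImageToMat(qimg)
print(mat.shape)  # (height, width, 4)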
I tried the answer given above, but couldn't get the expected result. So I tried this crude method, where I saved the image using the save() method of the QImage class and then used the image file to read it into cv2.
Here is a sample code
def qimg2cv(q_img):
    q_img.save('temp.png', 'png')
    mat = cv2.imread('temp.png')
    return mat
You could delete the temporary image file once you are done with it.
This may not be the right method to do the work, but it still does the required job.
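A variant of the same idea that cleans up after itself, sketched with the standard tempfile module:
import os
import tempfile

import cv2

def qimg2cv(q_img):
    # Write to a uniquely named temporary png, read it back, then remove it
    fd, path = tempfile.mkstemp(suffix='.png')
    os.close(fd)
    q_img.save(path, 'png')
    mat = cv2.imread(path)
    os.remove(path)
    return mat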
Until a couple of days ago I had never used OpenCV or done any video processing. I've been asked to computationally overlay a video based upon some user inputs and build a new video with the overlays incorporated, for download in AVI format. Essentially, the goal is to have a form that takes as input 3 images (icon, screenshot #1, screenshot #2) and 3 text inputs and overlays the original video with them. Here is a link to the video. When the video is running you'll notice the icon in the center of the iPhone at the beginning is stretched and pulled. I've been iteratively testing OpenCV methods by breaking the video down frame by frame, doing something to each one, then rebuilding (this is probably the only way to rebuild a video with edits in OpenCV anyway). This video is one onto which I overlaid a colored circle that moves back and forth.
# the method I've been using
import cv2 as cv
import numpy as np

cap = cv.VideoCapture('the_vid.avi')
flag, frame = cap.read()
width = np.size(frame, 1)
height = np.size(frame, 0)
writer = cv.VideoWriter('output.avi', cv.VideoWriter_fourcc('I', '4', '2', '0'), 35, (width, height), 1)
while True:
    flag, frame = cap.read()
    if flag == 0:
        break
    x = width // 2   # integer center coordinates; cv.circle needs ints
    y = height // 2
    # add a line or circle or something
    cv.circle(frame, (x, y), 20, (0, 0, 255), -1)  # center, radius, color, filled
    # write our new frame
    writer.write(frame)
writer.release()
Now we've got an output: this very large uncompressed AVI file, which can be compressed using ffmpeg:
ffmpeg -i output.avi -vcodec msmpeg4v2 compressed_output.avi
OK, so that's the method I've been using to rebuild this video, and with that method I don't see how it's possible to take a static image and stretch it around as shown in the first 90 frames or so. The only other possibility I saw was maybe doing something like the below. If you can tell me whether there is even a way to implement this pseudo-code, that would be awesome; I'm thinking it will be extremely difficult:
# example for the first image and first few seconds of video only
first_image = cv.imread('user_uploaded_icon.png')
flag, first_frame = cap.read()

# section of the frame that contains the original icon
the_section = algorithm_to_get_array_of_original_icon_in_first_frame(first_frame)
rows, cols = the_section.shape

# somehow find the array within the first image that is the same size as the_section
# containing JUST the icon
icon = array_of_icon(first_image)

# build a blank image with the size of the original icon in the current frame
blank_image = np.zeros((rows, cols, 3), np.uint8)
for i in range(rows):
    for j in range(cols):
        blank_image[i, j] = icon[i, j]
What seems like it might not work about this is that the_section in first_frame will be stretched to different dimensions than the static image... so I'm not sure if there is ANY viable way to handle this. I appreciate all the time-saving help in advance.
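For what it's worth, one possible way to handle the stretching, sketched under the assumption that you can locate the four corners of the icon region in each frame: cv2.getPerspectiveTransform can map the flat icon onto an arbitrary quadrilateral, so the changing dimensions per frame stop being a problem.
import cv2 as cv
import numpy as np

def overlay_warped_icon(frame, icon, corners):
    # corners: four (x, y) points of the icon region in this frame,
    # ordered top-left, top-right, bottom-right, bottom-left (assumed known)
    h, w = icon.shape[:2]
    src = np.float32([[0, 0], [w - 1, 0], [w - 1, h - 1], [0, h - 1]])
    dst = np.float32(corners)
    M = cv.getPerspectiveTransform(src, dst)
    warped = cv.warpPerspective(icon, M, (frame.shape[1], frame.shape[0]))
    # Paste the warped icon over the frame wherever the warp produced content
    mask = cv.warpPerspective(np.full((h, w), 255, np.uint8), M,
                              (frame.shape[1], frame.shape[0]))
    frame[mask > 0] = warped[mask > 0]
    return frame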