Extracting JPG frames from webcam MJPG stream using OpenCV

Extracting JPG frames from webcam MJPG stream using OpenCV - python

I am working on Ubuntu 18.04 with Python 2.7 and OpenCV 3.2. My application is the front-end of a video pipeline and entails extracting video frames from a webcam, possibly cropping and/or rotating them (90, 180, 270 deg), and then distributing them to one or more other pieces of code for further processing. The overall system tries to maximize efficiency at every step to e.g., improve options for adding functionality later on (compute power and bandwidth wise).
Functionally, I have the front-end working, but I want to improve its efficiency by processing JPEG frames extracted from the camera's MJPEG stream. This would allow efficient, lossless cropping and rotation in the JPEG domain, e.g. using jpegtran-cffi, and distribution of compressed frames that are smaller than the corresponding decoded ones. JPEG decoding will take place if/when/where necessary, with an overall expected gain. As an extra benefit, this approach allows efficient saving of the webcam video without loss of image quality due to decoding + re-coding.
The problem I run into is that OpenCV's VideoCapture class does not seem to allow access to the MJPEG stream:
import cv2
cam = cv2.VideoCapture()
cam.open(0)
if not cam.isOpened():
print("Cannot open camera")
else:
enabled = True
while enabled:
enabled, frame = cam.read()
# do stuff
Here, frame is always in component (i.e., decoded) format. I looked at using cam.grab() + cam.retrieve() instead of cam.read() with the same result (in line with the OpenCV documentation). I also tried cam.set(cv2.CAP_PROP_CONVERT_RGB, False) but that only converts the decoded video to RGB (if it is in another component format) and does not prevent decoding. BTW I verified that the camera uses the MJPEG codec (via cam.get(cv2.CAP_PROP_FOURCC)).
So my questions are: am I missing something or will this approach not work? If the latter, is there an alternative?
A final point: the application has to be able to control the webcam within its capabilities; e.g., frame size, frame rate, exposure, gain, ... This is nicely supported by cv2.VideoCapture.
Thanks!
===
Follow-up: in absence of the solution I was looking for, I added explicit JPEG encoding:
jpeg_frame = cv2.imencode('.jpg', frame, [int(cv2.IMWRITE_JPEG_QUALITY), _JPEG_QUALITY])[1]
with _JPEG_QUALITY set to 90 (out of 100). While this adds computation and reduces image quality, both in principle redundant, it allows me to experiment with trade-offs. --KvZ

Related

Python CV2 reads out-of-date frames from video stream

I am using Python 3.9 and Open-CV (cv2) to read frames from a video stream and save them as JPGs.
My program seems to run OK. It captures the video stream fine, obtains frames, and saves them as JPGs.
However, the frames it is obtaining from the stream are out-of-date - sometimes by several minutes. The clock in the video stream is running accurately, but the clock displays in the JPGs are all identical (to the second - but one or more minutes prior to the datetime in the program's "print()" output (and the saved JPG file time), and moving objects that were in view at the time they were saved are missing completely.
Strangely:
The JPG images are not identical in size. They grow by 10K - 20K as the sequence progresses. Even though they look identical to the eye, they show significant difference when compared using CV2 - but no difference if compared using PIL (which is about 10 - 15 times slower for image comparisons).
The camera can be configured to send a snapshot by email when it detects motion. These snapshots are up-to-date, and show moving objects that were in frame at the time (but no clock display). Enabling or disabling this facility has no effect on the out-of-date issue with JPGs extracted from the video stream. And, sadly, the snapshots are only about 60K, and too low resolution for our purposes (which is an AI application that needs images to be 600K or more).
The camera itself is ONVIF - and things like PTZ work nicely from Python code. Synology Surveillance Station works really well with it in every aspect. This model has reasonably good specs - zoom and good LPR anti-glare functionality. It is made in China - but I don't want to be 'a poor workman who blames his tools'.
Can anyone spot something in the program code that may be causing this?
Has anyone encountered this issue, and can suggest a work-around or different library / methodology?
(And if it is indeed an issue with this brand / model of camera, you are welcome to put in a plug for a mid-range LPR camera that works well for you in an application like this.)
Here is the current program code:
import datetime
from time import sleep
import cv2
goCapturedStream = None
# gcCameraLogin, gcCameraURL, & gcPhotoFolder are defined in the program, but omitted for simplicity / obfuscation.
def CaptureVideoStream():
global goCapturedStream
print(f"CaptureVideoStream({datetime.datetime.now()}): Capturing video stream...")
goCapturedStream = cv2.VideoCapture(f"rtsp://{gcCameraLogin}#{gcCameraURL}:554/stream0")
if not goCapturedStream.isOpened(): print(f"Error: Video Capture Stream was not opened.")
return
def TakePhotoFromVideoStream(pcPhotoName):
llResult = False ; laFrame = None
llResult, laFrame = goCapturedStream.read()
print(f"TakePhotoFromVideoStream({datetime.datetime.now()}): Result is {llResult}, Frame data type is {type(laFrame)}, Frame length is {len(laFrame)}")
if not ".jpg" in pcPhotoName.lower(): pcPhotoName += ".jpg"
lcFullPathName = f"{gcPhotoFolder}/{pcPhotoName}"
cv2.imwrite(lcFullPathName, laFrame)
def ReleaseVideoStream():
global goCapturedStream
goCapturedStream.release()
goCapturedStream = None
# Main Program: Obtain sequence of JPG images from captured video stream
CaptureVideoStream()
for N in range(1,7):
TakePhotoFromVideoStream(f"Test{N}.jpg")
sleep(2) # 2 seconds
ReleaseVideoStream()

Dan Masek's suggestions were very valuable.
The program (now enhanced significantly) saves up-to-date images correctly, when triggered by the camera's inbuilt motion detection (running in a separate thread and communicating through global variables).
The key tricks were:
A much faster loop reading the frames (and discarding most of them). I reduced the sleep to 0.1 (and even further to 0.01), and saved relatively few frames to JPG files only when required
Slowing down the frame rate on the camera (from 25 to 10 fps - even tried 5 at one point). This meant that the camera didn't get ahead of the software and send unpredictable frames.

OpenCV changing VideoCapture resolution causes colour issues and glitches

I want to capture 1920x1080 video from my camera but I've run into two issues
When I initialize a VideoCapture, it changes the width/height to 640/480
When I try to change the width/height in cv2, the image becomes messed up
Images
When setting 1920x1080 in cv2, the image becomes blue and has a glitchy bar at the bottom
cap = cv2.VideoCapture('/dev/video0')
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1920)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 1080)
Here's what's happening according to v4l2-ctl. The blue image doesn't seem to be a result of a pixelformat change (eg. RGB to BGR)
And finally, here's an example of an image being captured at 640x480 that has the correct colouring. The only difference in the code is that width/height is not set in cv2

Problem:
Actually the camera you are using has 2 mode:
640x480
1920x1080
One is for main stream, one is for sub stream. I also met this problem couple of times and here is the possible reasons why it doesn't work.
Note: I assume you tried different ways to run on full resolution(1920x1080) such as cv2.VideoCapture(0) , cv2.VideoCapture(-1) , cv2.VideoCapture(1) ...
Possible reasons
First reason could be that the camera doesn't support the resolution you desire but in your case we see that it supports 1920x1080 resolution. So this can not be the reason for your isssue.
Second reason which is general reason is that opencv backend doesn't support your camera driver. Since you are using VideoCaptureProperties of opencv, Documentation says:
Reading / writing properties involves many layers. Some unexpected result might happens along this chain. Effective behaviour depends from device hardware, driver and API Backend.
What you can do:
In this case, if you really need to reach that resolution and make compatible with opencv, you should use the SDK of your camera(if it has).

Why the frames of a VideoClip change when it is written to a video file?

I wrote the following code:
from moviepy.editor import *
from PIL import Image
clip= VideoFileClip("video.mp4")
video= CompositeVideoClip([clip])
video.write_videofile("video_new.mp4",fps=clip.fps)
then to check whether the frames have changed or not and if changed, which function changed them, i retrieved the first frame of 'clip', 'video' and 'video_new.mp4' and compared them:
clip1= VideoFileClip("video_new.mp4")
img1= clip.get_frame(0)
img2= video.get_frame(0)
img3= clip1.get_frame(0)
a=img1[0,0,0]
b=img2[0,0,0]
c=img3[0,0,0]
I found that a=24, b=24, but c=26....infact on running a array compare loop i found that 'img1' and 'img2' were identical but 'img3' was different.
I suspect that the function video.write_videofile is responsible for the change in array. But i dont know why...Can anybody explain this to me and also suggest a way to write clips without changing their frames?
PS: i read the docs of 'VideoFileClip', 'FFMPEG_VideoWriter', 'FFMPEG_VideoReader' but could not find anything useful...I need to read the exact frame as it was before writing in a code I'm working on. Please, suggest me a way.

Like JPEG, MPEG-4 uses lossy compression, so it's not surprising that the frames read from "video_new.mp4" are not perfectly identical to those in "video.mp4". And as well as the variations caused purely by the lossy compression there are also variations that arise due to the wide variety of encoding options that can be used by programs that write MPEG data.
If you really need to be able to read back the exact same frame data that you write then you will have to use a different file format, but be warned: your files will be huge!
The choice of video format partly depends on what the image data is like and on what you want to do with it. If the data uses 256 colours or less, and you don't intend to perform transformations on it that will modify the colours, a simple GIF anim is a good choice. But bear in mind that even something like non-integer scaling modifies colours.
If you want to analyze the image data and transform it in various ways, it makes sense to use a format with better colour support than GIF, eg a stream of PNG images, which I assume is what Zulko mentions in his answer. FWIW, there's an anim format related to PNG called MNG, but it is not well supported or widely known.
Another option is to use a stream of PPM images, or maybe even a stream of YUV data, which is useful for certain kinds of analysis and convenient if you do intend to encode as MPEG for final consumption. The PPM format is very simple and easy to work with; YUV is slightly messy since it's a raw format with no header data, so you have to keep track of the image size and resolution data yourself.
The file size of PPM or YUV streams is large, since they incorporate no compression at all, but of course they can be compressed using standard compression techniques, if you want to save a little space when saving them to disk. OTOH, typical video processing workflows that use such streams often don't bother writing them to disk: they are sent in pipelines (perhaps using named pipes), so the file size is (mostly) irrelevant.
Although such formats take up a lot of space compared to MPEG-based files, they are far superior for use as intermediate formats while performing image data analysis and transformation, since every time you write & read back MPEG you are losing a little bit of quality.
I assume that you intend to do your image data analysis and transformations using PIL/Pillow. But you can also work with PPM & YUV streams using the ffmpeg / avconv command line programs; and the ffmpeg family happily work with sets of individual image files and GIF anims, too.

You can have lossless compression with the 'png' codec:
clip.write_videoclip('clip_new.avi', codec='png')
EDIT #PM 2Ring: when you write the line above, it makes a video that is compressed using the png algortihm (I'm not sure whether each frame is a png or if it's more subtle).

How to set framerate with OpenCV camera capture

How can I set the capture framerate, using OpenCV in Python? Here's my code, but the resulting framerate is less than the requested 30fps. Also, quality of video very bad.
import cv
cv.NamedWindow ('CamShiftDemo', 1)
device = -1
cap = cv.CaptureFromCAM(device)
size = (640,480)
cv.SetCaptureProperty(cap, cv.CV_CAP_PROP_FPS,30)
cv.SetCaptureProperty(cap, cv.CV_CAP_PROP_FRAME_WIDTH, size[0])
cv.SetCaptureProperty(cap, cv.CV_CAP_PROP_FRAME_HEIGHT, size[1])
while True:
frame = cv.QueryFrame(cap)
cv.ShowImage('CamShiftDemo', frame)
cv.WaitKey(10)

You are limited by the hardware, namely:
your camera's capture capabilities, and
your computer's system resources.
If either of these cannot handle the requested capture parameters (in your case 640x480 resolution at 30fps), you're out of luck. Parameters you give to OpenCV are merely suggestions -- it tries to match them as best it can.
What model camera are you using? I would first look at the model specs to see if they advertise the parameters you desire.

Video editing with python or command line

I need to perform the following operations in my python+django project:
joining videos with same size and bitrate
joining videos and images (for the image manipulation I'll use PIL: writing text to an existing image)
fading in the transitions between videos
I already know of some video editing libraries for python: MLT framework (too complex for my needs), pygame and pymedia (don't include all the features I want), gstreamer bindings (terrible documentation).
I could also do all the work from command line, using ffmpeg, mencoder or transcode.
What's the best approach to do such a thing on a Linux machine.
EDIT: eventually I've chosen to work with melt (mlt's command line)

http://avisynth.org/mediawiki/Main_Page is a scripting language for video.

Because ffmpeg is available on GNU/Linux, i thing using it with modules such as pexpect or subprocess is the best solution....

You can use OpenCV for joining videos and images. See the documentation, in particular the image/video I/O functions.
However, I'm not sure if the library has functions that will do the fading for you.

What codec are you using?
There are two ways to compress video: lossy and lossless. It's easy to tell them apart. Depending on their length, lossy video files are in the megabyte range, lossless (including uncompressed) are in the gigabyte range.
Here's an oversimplification. Editing video files is a lot different from editing film, where you just glue the pieces of film together. It's not just about bitrate, frame rate and resolution. Most lossy video codecs (MPEG 1-4, Ogg Theora, H.26x, VC-1, etc.) start out with a full frame then record only the changes in movement. When you watch the video what you're actually seeing is a static scene with layer after layer of changes pasted on top of it. It looks like you're seeing full frame after full frame, but if you looked at the data in the file all you'd see would be a black background and scrambled blocks of video.
If it's uncompressed or uses a lossless codec (HuffYUV, Lagarith, FFV1, etc.) then you can edit your video file just like film. You still have to re-encode the video but it won't effect video quality and you can cut, copy and paste however you like as long as the resolution and frame rate are the same. If you're video is lossy you have to re-encode it with some loss of video quality, just like saving the same image in JPEG, over and over.
Another option might be to put several pieces of video into a container like MKV and use chapters to have it jump from piece to piece. I seem to remember being told this is possible but I've never tried it so maybe it isn't.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.