Are there any libraries in python that allow the input of a video file and then output 4 equal quadrants of that video files (eg: top left, top right, bottom left, bottom right)?
At the moment I have only seen examples that split the video in terms of length (eg: a 20 minute video into 5 minute sections)
I know its probably possible by using something like opencv to split the video into frames and then split each frame into 4 and the make the individual frames back into the video but I think this is very resource hungry and not the most efficient solution.
Any suggestions or examples will be appreciated.
You are right about OpenCV. You don't need to split into frames. You can used OpenCV or scikit-video to read videos into 4 dimensional array (height, width, frames, channel). Then once you have size of weidth and height, you can just extract 4 videos by indexing. e.g. if (400,600, 122, 3) is your video dimension, you can get 4 videos by:
v1=vid[:200,:300,:,:]
v2=vid[200:,:300,:,:]
v3=vid[200:,300:,:,:]
v4=vid[:200,300:,:,:]
You can find solution using ffmpeg on: related question
This may be memory-wise cheaper (specially required RAM) compared to OpenCV or scikit-video solution. But the solution I mentioned above using scikit-video is not computationally expensive.
read/write videos with scikit-video
Related
I have 2 cameras that taking pictures in 40 fps,
I want to concat them together and have them continue each other like the picture comes from one camera.
any ideas on how I should approach this?
EDIT:
i used a software to show how roughly i merge the images and the color difference that rises from the merges:
(its to big so i upload it to a site)
https://ibb.co/8xx0BWC
EDIT2:
I already tried this post that seems to answer my problem, it didn't work as expected:
color match in images
Im trying to remove the differences between two frames and keep the non-chaning graphics. Would probably repeat the same process with more frames to get more accurate results. My idea is to simplify the frames removing things that won't need to simplify the rest of the process that will do after.
The different frames are coming from the same video so no need to deal with different sizes, orientation, etc. If the same graphic its in another frame but with a different orientation or scale, I would like to also remove it. For example:
Image 1
Image 2
Result (more or less, I suppose that will be uglier but containing a similar information)
One of the problems of this idea is that the source video, even if they are computer generated graphics, is compressed so its not that easy to identify if a change on the tonality of a pixel its actually a change or not.
Im ideally not looking at a pixel level and given the differences in saturation applied by the compression probably is not possible. Im looking for unchaged "objects" in the image. I want to extract the information layer shown on top of whats happening behind it.
During the last couple of days I have tried to achieve it in a Python script by using OpenCV with all kinds of combinations of absdiffs, subtracts, thresholds, equalizeHists, canny but so far haven't found the right implementation and would appreciate any guidance. How would you achieve it?
Im ideally not looking at a pixel level and given the differences in saturation applied by the compression probably is not possible. Im looking for unchaged "objects" in the image. I want to extract the information layer shown on top of whats happening behind it.
This will be extremely hard. You would need to employ proper CV and if you're not an expert in that field, you'll have really hard time.
How about this, forgetting about tooling and libs, you have two images, ie. two equally sized sequences of RGB pixels. Image A and Image B, and the output image R. Allocate output image R of the same size as A or B.
Run a single loop for every pixel, read pixel a and from A and pixel b from B. You get a 3-element (RGB) vector. Find distance between the two vectors, eg. magnitude of a vector (b-a), if this is less than some tolerance, write either a or b to the same offset into result image R. If not, write some default (background) color to R.
You can most likely do this with some HW accelerated way using OpenCV or some other library, but that's up to you to find a tool that does what you want.
I have continuous videos taken from two cameras placed on up right and up left corners of my car's windshield (please note that they are not fixed to each other, and I aligned them approximately straight). Now I am trying to make a 3D point cloud out of that and have no idea how to do that. I surfed the internet a lot and still couldn't find any useful info. Can you send me some links or hints on how can I make that work in Python.
You can try the stereo matching and point cloud generation implementation in the OpenCV library. Start with this short Python sample.
I suppose that you have two independent video streams that are not exactly synchronized. You will have to synchronize them first, because the linked sample expects two images, not videos. Extract images from videos using OpenCV or ffmpeg and find an image pair that shares exactly the same timepoint (e.g. green appearing on a traffic light). Alternatively you can use the audio tracks for synchronization, see https://github.com/benkno/audio-offset-finder. Beware: synchronization based on a single frame pair or a short audio excerpt will probably work only for few minutes before and after the synchronized timepoint.
I have a problem, not so easy to solve i guess. In general, I have a database of frames from different videos and I want to find for a given picture (which is not necessarily one of the frames but from some same source video) the matching source video.
So lets say I have some videos and extracted frames each x seconds. The frames are stored in the db.
My guess would now be to loop over all video frames in the db and try to find matching features. So I would somehow have to find features in the source image and then try to find these in the frames stored in the db.
My question is how can I achieve this? The problem is that camera angle and vieweing distance can be quite different when the picture in question was not taken quite close to the time the frame was extracted previously.
Is this even feasible?
I'm working with Python and OpenCV.
Thanks and best regards
I am getting video input from 2 separate cameras with some area of overlap between the output videos. I have tried out a code which combines the video output horizontally. Here is the link for that code:
https://github.com/rajatsaxena/NeuroscienceLab/blob/master/positiontracking/combinevid.py
To explain the problem visually:
The red part shows the overlap region between two image frame. I need the output to look like the second image, with first frame in blue and second frame in green (as shown in third illustration)
A solutions I can think of but unable to implement is, Using SIFT/SURF find out the maximum distance keypoints from both frames and then take the first video frame completely and just pick the non overlapping region from second video frame and horizontally combine them to get the stitched output.
Let me know of any other solutions possible as well. Thanks!
I read this post one hour ago. I tried some really easy approach. Not perfect but in some cases should work well. For example, if you have both cameras on one frame placed side by side.
I took 2 images from the phone like on a picture (color images). Program select Rectangles region from both source images and resize end extract this roi rectangles. The idea is to find the "best" overlapping Rect regions by normalized correlation.
M1 and M2 is mat roi to compare,
matchTemplate(M1, M2, res, TM_CCOEFF_NORMED);
After, I find this overlapping Rect use this to crop source images and combine by hconcat() function together.
My code is in C++ but is really simple to replicate this in python. It is not the best solution but one of the most simple solution. If your cameras are fixed in stable position between themselves. This is a good solution I think.
I hold my phone in hand :)
You can also use this simple approach on video. The speed depends only on the number of rectangle candidate you compare.
You can improve this by smart region to compare selection.
Also, I am thinking about another idea to use optical flow by putting your images from a camera at the same time to sequence behind each other. From the possible overlapping regions in one image extract good features to track and find them in the region of second images.
Surf and sift are great for this but this is the most simple idea on my mind.
Code is Here Code