I have 2 audio files for primary and background music that I want to merge (not concatenate). The final audio file should be as long as the primary file, and if the background music is shorter then it should repeat.
Is there a Linux command or a Python library that can be used to do this? Sox supports merging, but does not appear to allow repeating the background audio.
As a possible solution, why not check whether the background file is shorter than the foreground file and, if so, construct a looped background file of sufficient length? Then you can pass that into sox.
You should be able to get the length from sndhdr (look at the frame count).
As for a Python way of merging the streams, audioop.add may do what you need, although if you're merging two full-volume sources you might want to reduce the volume of one of them (try -12 dB as a start), depending on what you're mixing.
More audio libraries can be found here.
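To illustrate that approach, here is a minimal sketch using the standard-library wave and audioop modules (note that audioop is deprecated in recent Python 3 releases); it assumes both files are WAV with the same sample rate, sample width and channel count, and the file names are placeholders:
import wave, audioop

fg = wave.open("foreground.wav", "rb")
bg = wave.open("background.wav", "rb")
params = fg.getparams()        # assume the background has the same rate/width/channels
width = fg.getsampwidth()
fg_data = fg.readframes(fg.getnframes())
bg_data = bg.readframes(bg.getnframes())
fg.close()
bg.close()

# repeat the background until it is at least as long as the foreground, then trim
loops = len(fg_data) // len(bg_data) + 1
bg_data = (bg_data * loops)[:len(fg_data)]

# attenuate the background (roughly -12 dB is a factor of about 0.25) and mix
mixed = audioop.add(fg_data, audioop.mul(bg_data, width, 0.25), width)

out = wave.open("mixed.wav", "wb")
out.setparams(params)
out.writeframes(mixed)
out.close()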
I have continuous videos taken from two cameras mounted at the upper-right and upper-left corners of my car's windshield (note that they are not fixed to each other, and I aligned them only approximately straight). Now I am trying to make a 3D point cloud out of that and have no idea how to do it. I have searched the internet a lot and still couldn't find any useful info. Can you send me some links or hints on how I can make this work in Python?
You can try the stereo matching and point cloud generation implementation in the OpenCV library. Start with this short Python sample.
I suppose that you have two independent video streams that are not exactly synchronized. You will have to synchronize them first, because the linked sample expects two images, not videos. Extract images from the videos using OpenCV or ffmpeg and find an image pair that shares exactly the same timepoint (e.g. a traffic light turning green). Alternatively you can use the audio tracks for synchronization, see https://github.com/benkno/audio-offset-finder. Beware: synchronization based on a single frame pair or a short audio excerpt will probably hold only for a few minutes before and after the synchronized timepoint.
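Once you have a synchronized (and ideally rectified) left/right image pair, a minimal sketch of the OpenCV route might look like this; the file names are placeholders, and the reprojection matrix Q would normally come from your calibration (cv2.stereoRectify), not the identity used here:
import cv2
import numpy as np

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# semi-global block matching; tune numDisparities/blockSize for your baseline and resolution
stereo = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64, blockSize=5)
disparity = stereo.compute(left, right).astype(np.float32) / 16.0   # SGBM returns fixed-point * 16

# placeholder Q matrix; use the one returned by cv2.stereoRectify() for metric results
Q = np.eye(4, dtype=np.float32)
points_3d = cv2.reprojectImageTo3D(disparity, Q)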
I wrote the following code:
from moviepy.editor import *
from PIL import Image

clip = VideoFileClip("video.mp4")
video = CompositeVideoClip([clip])
video.write_videofile("video_new.mp4", fps=clip.fps)
Then, to check whether the frames had changed and, if so, which function changed them, I retrieved the first frame of 'clip', 'video' and 'video_new.mp4' and compared them:
clip1 = VideoFileClip("video_new.mp4")
img1 = clip.get_frame(0)
img2 = video.get_frame(0)
img3 = clip1.get_frame(0)
a = img1[0, 0, 0]
b = img2[0, 0, 0]
c = img3[0, 0, 0]
I found that a = 24 and b = 24, but c = 26. In fact, running an array-comparison loop showed that 'img1' and 'img2' were identical but 'img3' was different.
I suspect that video.write_videofile is responsible for the change, but I don't know why. Can anybody explain this to me and also suggest a way to write clips without changing their frames?
PS: I read the docs of 'VideoFileClip', 'FFMPEG_VideoWriter' and 'FFMPEG_VideoReader' but could not find anything useful. I need to read back exactly the frames that were written, for code I'm working on. Please suggest a way.
Like JPEG, MPEG-4 uses lossy compression, so it's not surprising that the frames read from "video_new.mp4" are not perfectly identical to those in "video.mp4". In addition to the variations caused purely by the lossy compression, there are also variations that arise from the wide variety of encoding options used by programs that write MPEG data.
If you really need to be able to read back the exact same frame data that you write then you will have to use a different file format, but be warned: your files will be huge!
The choice of video format partly depends on what the image data is like and on what you want to do with it. If the data uses 256 colours or less, and you don't intend to perform transformations on it that will modify the colours, a simple GIF anim is a good choice. But bear in mind that even something like non-integer scaling modifies colours.
If you want to analyze the image data and transform it in various ways, it makes sense to use a format with better colour support than GIF, eg a stream of PNG images, which I assume is what Zulko mentions in his answer. FWIW, there's an anim format related to PNG called MNG, but it is not well supported or widely known.
Another option is to use a stream of PPM images, or maybe even a stream of YUV data, which is useful for certain kinds of analysis and convenient if you do intend to encode as MPEG for final consumption. The PPM format is very simple and easy to work with; YUV is slightly messy since it's a raw format with no header data, so you have to keep track of the image size and resolution data yourself.
The file size of PPM or YUV streams is large, since they incorporate no compression at all, but of course they can be compressed using standard compression techniques, if you want to save a little space when saving them to disk. OTOH, typical video processing workflows that use such streams often don't bother writing them to disk: they are sent in pipelines (perhaps using named pipes), so the file size is (mostly) irrelevant.
Although such formats take up a lot of space compared to MPEG-based files, they are far superior for use as intermediate formats while performing image data analysis and transformation, since every time you write & read back MPEG you are losing a little bit of quality.
I assume that you intend to do your image data analysis and transformations using PIL/Pillow. But you can also work with PPM & YUV streams using the ffmpeg / avconv command line programs; and the ffmpeg family happily work with sets of individual image files and GIF anims, too.
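As a rough sketch of that pipeline idea, you can have ffmpeg decode to raw RGB and read the frames straight into numpy through a pipe; the file name, frame size and pixel format below are assumptions you would adapt:
import subprocess
import numpy as np

width, height = 1280, 720            # must match the actual video dimensions

# ask ffmpeg to decode the video to raw RGB24 and write it to stdout
proc = subprocess.Popen(
    ["ffmpeg", "-i", "video.mp4", "-f", "rawvideo", "-pix_fmt", "rgb24", "pipe:1"],
    stdout=subprocess.PIPE)

frame_size = width * height * 3
while True:
    raw = proc.stdout.read(frame_size)        # one RGB24 frame
    if len(raw) < frame_size:
        break
    frame = np.frombuffer(raw, dtype=np.uint8).reshape(height, width, 3)
    # ... analyze or transform the frame here ...

proc.stdout.close()
proc.wait()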
You can have lossless compression with the 'png' codec:
clip.write_videofile('clip_new.avi', codec='png')
EDIT (@PM 2Ring): when you write the line above, it produces a video that is compressed with the PNG algorithm (I'm not sure whether each frame is a PNG or whether it's more subtle).
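A quick way to check that this round trip really is lossless is to write with the 'png' codec and compare frames; a minimal sketch, assuming moviepy and numpy are installed:
import numpy as np
from moviepy.editor import VideoFileClip

clip = VideoFileClip("video.mp4")
clip.write_videofile("video_lossless.avi", codec="png", fps=clip.fps)

clip2 = VideoFileClip("video_lossless.avi")
print(np.array_equal(clip.get_frame(0), clip2.get_frame(0)))   # expect True for a lossless codec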
I am playing with stacking and processing astronomical photographs. I'm as interested in understanding algorithms as I am in the finished images, so I have not (yet) tried any of the numerous polished products floating around.
I have moderately-sized collections of still photographs (dozens at a time) which I can successfully import using
img = imread("filename.jpg")
This produces a numpy ndarray, which I can manipulate using the tools available from numpy and scipy.ndimage, and display using imshow(). On the back end this is supported by the Python Imaging Library (PIL), which as far as I can tell handles only still images.
For longer exposures, it'd be nice to set my camera to take video, then extract frames from the video and run them through the same analysis pipeline as the still images. My camera produces QuickTime movies with .MOV file extensions.
Is there a Python library that will let me access the data from frames of a video?
Alternatively, I'd appreciate guidance on using an external tool (there seems to exist a command-line ffmpeg, but I haven't tried it) to generate temporary files that I can feed into my still-image pipeline. Since I might want to examine all 18k frames in a ten-minute, 30fps movie, just extracting all the frames into one big folder is probably not an option.
I am running Python 2.7 on OSX Mavericks; I have easy access to MacPorts to install things.
The following ffmpeg command extracts the frames of 10 seconds of video as JPEG images, starting at a prespecified time (here 20 seconds after the start of the movie):
ffmpeg -i myvideo.MOV -ss 00:00:20.00 -t 10 img%03d.jpg
It is easy to figure out how you can use that in a Bash loop or run the command in a loop via Python.
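For instance, here is a rough sketch of that loop in Python, processing the movie in 10-second batches via subprocess and deleting the temporary frames after each batch (the file name, duration and analysis step are placeholders):
import glob
import os
import subprocess
from matplotlib.pyplot import imread    # or whichever imread your pipeline already uses

for start in range(0, 600, 10):         # a ten-minute movie, 10 seconds at a time
    subprocess.call(["ffmpeg", "-i", "myvideo.MOV",
                     "-ss", str(start), "-t", "10", "tmp%03d.jpg"])
    for name in sorted(glob.glob("tmp*.jpg")):
        img = imread(name)
        # ... run img through your still-image pipeline here ...
        os.remove(name)                 # keep the folder from filling up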
I need to perform the following operations in my python+django project:
joining videos with same size and bitrate
joining videos and images (for the image manipulation I'll use PIL: writing text to an existing image)
fading in the transitions between videos
I already know of some video editing libraries for python: MLT framework (too complex for my needs), pygame and pymedia (don't include all the features I want), gstreamer bindings (terrible documentation).
I could also do all the work from command line, using ffmpeg, mencoder or transcode.
What's the best approach to do such a thing on a Linux machine?
EDIT: In the end I chose to work with melt (MLT's command-line tool).
AviSynth (http://avisynth.org/mediawiki/Main_Page) is a scripting language for video.
Because ffmpeg is available on GNU/Linux, I think driving it from modules such as pexpect or subprocess is the best solution.
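For example, joining videos that share the same codec, size and bitrate with ffmpeg's concat demuxer could look like this from Python (file names are placeholders; -c copy avoids re-encoding but only works when the inputs match):
import subprocess

# list the clips to join, one per line, in the format the concat demuxer expects
with open("list.txt", "w") as f:
    for name in ["clip1.mp4", "clip2.mp4"]:
        f.write("file '%s'\n" % name)

# stream-copy the pieces into a single output file without re-encoding
subprocess.call(["ffmpeg", "-f", "concat", "-safe", "0",
                 "-i", "list.txt", "-c", "copy", "output.mp4"])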
You can use OpenCV for joining videos and images. See the documentation, in particular the image/video I/O functions.
However, I'm not sure if the library has functions that will do the fading for you.
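A cross-fade is straightforward to do by hand with cv2.addWeighted, though. Here is a rough sketch, assuming two clips with the same resolution, an OpenCV build with an XVID encoder, and made-up file names (it also loads both clips fully into memory, which is only reasonable for short pieces):
import cv2

def read_frames(path):
    cap = cv2.VideoCapture(path)
    frames = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(frame)
    cap.release()
    return frames

a = read_frames("first.mp4")
b = read_frames("second.mp4")

h, w = a[0].shape[:2]
fps = 25.0                          # assumed; query cv2.CAP_PROP_FPS on the source if needed
out = cv2.VideoWriter("joined.avi", cv2.VideoWriter_fourcc(*"XVID"), fps, (w, h))

fade = 25                           # number of overlapping frames in the transition
for frame in a[:-fade]:
    out.write(frame)
for i in range(fade):
    alpha = i / float(fade)         # blend the tail of a into the head of b
    out.write(cv2.addWeighted(a[len(a) - fade + i], 1 - alpha, b[i], alpha, 0))
for frame in b[fade:]:
    out.write(frame)
out.release()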
What codec are you using?
There are two ways to compress video: lossy and lossless. It's usually easy to tell them apart: for a given length, lossy video files are in the megabyte range, while lossless (including uncompressed) files are in the gigabyte range.
Here's an oversimplification. Editing video files is a lot different from editing film, where you just glue the pieces of film together. It's not just about bitrate, frame rate and resolution. Most lossy video codecs (MPEG 1-4, Ogg Theora, H.26x, VC-1, etc.) start out with a full frame then record only the changes in movement. When you watch the video what you're actually seeing is a static scene with layer after layer of changes pasted on top of it. It looks like you're seeing full frame after full frame, but if you looked at the data in the file all you'd see would be a black background and scrambled blocks of video.
If it's uncompressed or uses a lossless codec (HuffYUV, Lagarith, FFV1, etc.) then you can edit your video file just like film. You still have to re-encode the video, but it won't affect video quality, and you can cut, copy and paste however you like as long as the resolution and frame rate are the same. If your video is lossy, you have to re-encode it with some loss of video quality, just like saving the same image as a JPEG over and over.
Another option might be to put several pieces of video into a container like MKV and use chapters to have it jump from piece to piece. I seem to remember being told this is possible but I've never tried it so maybe it isn't.
I am wondering if anyone has any experience with Python and video processing. Essentially, I would like to know if there are any libraries that would allow me to do scene detection in a video? If not, are there any that can allow me to split the video up into a series of frames and let me mess about with the pixels?
Thanks!
OpenCV has Python bindings; I don't think it has any scene-boundary detection algorithms/functions built in, but you can definitely use it to write your own.
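One simple do-it-yourself approach is to compare colour histograms of consecutive frames and flag large jumps as scene boundaries; a rough sketch (the file name and threshold are placeholders you would tune):
import cv2

cap = cv2.VideoCapture("input.mp4")
prev_hist = None
frame_no = 0

while True:
    ok, frame = cap.read()
    if not ok:
        break
    # coarse 8x8x8 colour histogram of the frame
    hist = cv2.calcHist([frame], [0, 1, 2], None, [8, 8, 8], [0, 256] * 3)
    hist = cv2.normalize(hist, hist).flatten()
    if prev_hist is not None:
        diff = cv2.compareHist(prev_hist, hist, cv2.HISTCMP_BHATTACHARYYA)
        if diff > 0.3:                        # tune this threshold for your footage
            print("possible scene cut at frame", frame_no)
    prev_hist = hist
    frame_no += 1

cap.release()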
You can use FFmpeg to do the scene detection and obtain the change frames and their timestamps. The command can be combined with a Python script, and you can modify it according to your use case.
You can simply use the command:
ffmpeg -i inputvideo.mp4 -filter_complex "select='gt(scene,0.3)',metadata=print:file=time.txt" -vsync vfr img%03d.png
This will save just the relevant information in the time.txt file, as shown below, and also save the shot-change images in order:
frame:0 pts:108859 pts_time:1.20954
lavfi.scene_score=0.436456
frame:1 pts:285285 pts_time:3.16983
lavfi.scene_score=0.444537
frame:2 pts:487987 pts_time:5.42208
lavfi.scene_score=0.494256
frame:3 pts:904654 pts_time:10.0517
lavfi.scene_score=0.462327
frame:4 pts:2533781 pts_time:28.1531
lavfi.scene_score=0.460413
frame:5 pts:2668916 pts_time:29.6546
lavfi.scene_score=0.432326
Here, frame is the serial number of the detected shot change, counted from the start of the video. Also, choose your threshold value (here 0.3) appropriately for your use case to get correct output.
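If you want to drive this from Python, a small sketch that runs the command and parses time.txt into (timestamp, score) pairs might look like this (the input file name is a placeholder):
import re
import subprocess

subprocess.call([
    "ffmpeg", "-i", "inputvideo.mp4",
    "-filter_complex", "select='gt(scene,0.3)',metadata=print:file=time.txt",
    "-vsync", "vfr", "img%03d.png"])

cuts = []
with open("time.txt") as f:
    lines = f.read().splitlines()
for i in range(0, len(lines) - 1, 2):          # lines come in frame/score pairs
    pts_time = float(re.search(r"pts_time:([\d.]+)", lines[i]).group(1))
    score = float(lines[i + 1].split("=")[1])
    cuts.append((pts_time, score))

print(cuts)    # e.g. [(1.20954, 0.436456), (3.16983, 0.444537), ...]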