I am working on a project where I am trying to save the video from a drone to my computer and also show it live. My idea was to convert the video into images, about 30 per second, and to update my frontend with these pictures so that it looks like a video.
Since this is the first time I am working with video and image data, I need some help. As far as I can figure out with my knowledge, I am receiving a byte string.
I cannot use the libh264 decoder because I am unable to integrate it into Python 3.7; it only works with Python 2.
Here are some of the strings:
b'\x00\x00\x00\x01A\xe2\x82_\xc1^\xa9y\xae\xa3\xf2G\x1a \x89z\' \x8c\xa6\xe7I}\xf3F\x07t\xf4*b\xd8\xc7\xff\x82\xb32\xb7\x07\x9b\xf5r0\xa0\x1e\x10\x8e\x80\x07\xc1\xdd\xb8g\xba)\x02\xee\x9f\x16][\xe4\xc6\x16\xc8\x17y\x02\xdb\x974\x13Mfn\xcc6TB\xadC\xb3Y\xadB~"\xd5\xdb\xdbg\xbaQ:{\xbftV\xc4s\xb8\xa3\x1cC\xe9?\xca\xf6\xef\x84]?\xbd`\x94\xf6+\xa8\xb4]\xc3\xe9\xa8-I\xd1\x180\xed\xc9\xee\xf4\x93\xd2\r\x00p1\xb3\x1d\xa2~\xfa\xe8\xf4\x97\x08\xc1\x18\x8ad,\xb6\x80\x86\xc6\x05V\x0ba\xcb\x7f\x82\xf2\x03\x9a)\xd6\xd9\x11\x92\x7f\xb5\x8a)R\xaa\xa0 \x85$\xd82(\xee\xd2\x8b\x94N\xacg\x07\x98n\x95OJ\xa4\xcc_\\-\x17\x13\xf3V\x96_\xb5\x97\xe2\xa2;\x03q\xce\x9b\x9e,\xe37{Z\x00\xce|\\\xf9\xdb\xa7\xba\xf3\'c\xee\xc9\xe7I\xfadZ\xb2\xfb\t\xb6\x03\x03\xfe\x9dM!!k\xec\xe0t{\xfeig\xcbL\xf6\x0bOP\r\x97\t\x95Hb\xd81\xb5\xbfVLZ#\x16s\xb6\x1adf\xb5\xe2\xb5\xb7\xccI\x82l\x05\xe9\x85\xd3\'x\x14C\xeb\xc4\xcb\xa5\xc7\xb6=\x7f\\m4\xa4\x00~\xdb\x97\xe4\xbb\xf3A\x86 Mm\xc7\x9a\x90\xda&\xc5\xf2wY\nr.1\xb9\x0c\xb4\xb1\xb2!\x03)\xb3\x19\x1d\xba\xfb)\xb0\xd2LS\x93\xe3\xb4t\x91\xed\xa7\xfe\xceV\x10\xa7Vcd\xcbIt\xdf\xff0\xcb9Q\xef(\x11&W0|p\x13\xfe\xd6\x93A\xa7\xc2(f\xde\xcc[\x8f#P\x07\x1f\xb0\\.\xd0\xa07\xab\xd5\xce\xb1N\xfb\xd3\xcc\x0f\x89+gm1p4\x87_\xf6\xfe\x13\xe8\xec\xa3vd,\xb3jW\x96\xe2\x937\xcb\xc5\xc4\xdb\xd9(wj\xa85y\xccE \xf8\xe4\x83\xd5\xcf\xe5A\xf9\x18T;v\x00\xbc\xac\xd1a\xed\tK\xd6\xd4\xd4\xc4W\xe4F7L\xfc\xb4\xeb3\x937\x94\x02i\xf3\x85\xbe\x05B\xf5\xb8\xccO\x84\xfb]M\x0c\xd8k\x00va\x0f\x91M\xd9\x9f9\xfc\x0f6\xa4f\xc5\xbe\xd9GItD\xdf7*\x93Kv)~[\xf1%\xeb(o\xef;\xc0\xb4,\xa1\xc2V\x8a\xff\xe1\x86\x17\xe7\xf17\xe81l&\x14<j\xb0AS\xf92\xb1C;\x81\x8a\x06D\xab\x11j\xcd\xb1q\x9e\xefm\x0ei7\x15\x8d\x03\xdd6B\xd9qg*X\x0f\xe6F\xdc\xb6\x93N\xbe\x12\xc9#I\xe3\xd4\x80j\xe8z\xd5t\x05,Y\xd7\xec\xd1\x9a\x97\xae\x16\xb0\xdfi\xb2\xb8\xb5J-\xde9&\x1ai\x19\xb7\x81\xa3\'\xccf]\xeeK#\x8bk3\x11\x97\\T\x88\xfb\xee\xd3El:\x16\x13\xafi\xc0\xf9\xef\xefe7\xe4w\x14\xdf76g^\xd02J\x96Z\xedl\x19\x8eG\xb7\xc6\xebHj\x86\x84/:R{+co\xa0\xaa\xeb.\xbb\x0e\xc9\xf3\xa8\x1e\xd4\x1a\x010\x87;\xef\xbe\xaf.\x87\x9a5\xfdG\x82\xd5\xb2\x01\x1e\xf2\xd3l\xef\tb\xe7=1\x03\x8f\xae\x83\x84:0\x9bE;x\x03UB\x87\xbco\xb2\x80xZ\x96\x1a\x0e?i\xe51^\x9b\x1d\xb4\\|\xccH\xdf3G\x83\xbd/\rhS0;\x9a\xdb\xf6NG\x16 ?\xf3\x13<\xcf!p\xd5\n\xb1\xf2\x0e\xcc\xdc\x0b\xe6\xe8\xcb#\x85\x17s#\x87\xb4\xf8f\xc7\x9fi\xcc\xe4b\xca\xc0\x1eh\xc1u\xad\x98\x92\x12\x00\xb5`\xfa!~{\xac\xc0\x14:\xce\xfc\xa4\x90\x12\xc4K\xa5\xb9\x83\xd1\x03\x1a\xd8z\xf6A\xe9\xfbb\x07\x99\xf80\x9b,\x17\x8d /ZXb]\xb2P\\\'\xcb\n\xae\x82\x99X\xf5\t\xd1\xc9p\x11\x8d\xcaD\xf2\x8b\x8bc%\x17] \x89b\xa9kF\x93\xc0\xe1{INUg\xec\xb4\x1b`{\xd1:\xb3\xa4\x7f\t\x9b\xde\xb0V\x1f\xd7\x85>\xbeT\xbb\xe5\xf0u\x96\x98\xad\x9a\xc3N\xf8A\x91\xd95h\x1ef\xbc\xf2\x08B\xe0\x9f\xe0\x1d+\xb6$\xafA\xca\xf6\xc5MX\x88\x9e\xf1\xbawZ\x87\xe7\xf7\xf4\xcd\xe4\x92|L\x1ep69\x81\x8f\xc6\'\xc1q\xe3\x98\x1ev\x94\xa3\xd5\xb8g\xee\x82\xd3Y\xccs\x81\x06\x97\x02\xf0\xd8S\xf1\x1b!\x8emp\x02w\x97\x11t]5?\x16\xfa\xf2\xfb\xf7\xef\xdf\xe4\x82V\x07?F`\xcf\xee\xef\xe7\xae\x18\xef\x83a\x87\xb1zh\xe7\xaez]\x1e\xc5\xd9\xe7&\x9a\xf0\xd0\xa4!\x05\x07\xff\xca\x10\xfa\xb7\x01\x9aU\x8b(\xb5#\x11\x95\x98\x8b\xe3\x84\x9b\x13\xecw\x0e\xc9\xad<X\xde\x11\tuo\xd2\xfd\xb6\xc2\x1c\xfb\x82 
\xb2\xa6\x02\x8c0\x19\xadP\x1b\xc3C\x08\xc9-\xaa\xd0\x15\xb3\xd2g\x07\x980:u\r\xfc\xf4&\xf9\x06$#\x85\xe1l\x16\x8a\x9f\xedX\xa0b\x1a^\x90#256\xc0z\xc7\xfax\xde\xa2\x0fKHY\xed8\xc6`\xa7^#\x0b.\xc4\x1a\r\x938\x17\xe2|\xb0\x95-\xce\xaa}\xc3\xb5\x0bS\xbb\xc6\x0cA\x00`\xe5:\x00\xc6\x0b\x93(1]\xb1\xb6\xc0\xc0de;]~\xa1\xc6d\xf7\x12\xc9\x0f\xfc\xd4\xd0\xfcJ\xb9\xd5\nE\x9a\x7f\x12\xbd\x83\x87\xff\xb8\x15\x0fm\x14p\xba\xc0\xef\x87v\x9e\\\xfd\x8f;\xe3\xb5\x03\x94\xd6t\xa5\xc2\xe9\x92\xd1\xcd9cS\x15\x9c}\xdd\x9f\xf4\xe1\xd2\xb6cR\xb1\x18\x83\xe7\n\xde\xfeUM\x90\xf9\xbf\xf6\xd8J\xc7\x1a:z\x0bGL\x00l\xf6\xa5\x1f$\x86O6\xfa\x13\x04G\x0e\xfe\xca\xbe\xaf\xe1\xb6\xfa\x91\x9b\xb5\x9f]\x12N\x9c\xcf4b}E\x07\xa6B\xd2\x10\xe0Xjxi\x93\x92w\x1d \xd5\xd1\x87,5\xa0\xd3\x18\x8e\xe0\xad9o\x92\x8d\xb1\x95o\x0c"\xb4\xadW\xf9\xc9\xa0\xe5i\xdb\x17\xea\xd6o$Y\xfb\xb5\x9c\x93\x16\xf7\xc0\x1cz\x00\xfc$\x08\x9ay38Y\xe1_8\xb2\xe2\xd1\t\xcdfmcpSEt\x86\xa6'
I would appreciate it if you could help me understand where every picture starts and where it ends. I assume that there have to be some kind of parity bits.
How is it possible to make a picture out of it?
Here is my code and what I've tried so far:
def videoLogging(self):
    logging.info("-----------[Tello] Video Thread: started------------------")
    INTERVAL = 0.2
    index = 0
    while True:
        try:
            packet_data = None
            index += 1
            res_string, ip = self.video_socket.recvfrom(2048)
            packet_data = res_string
            print(packet_data)
            self.createImg(packet_data)
            time.sleep(5)
            # videoResponse = self.video_socket.recv(2048)
            # mv = memoryview(videoResponse).cast('H')
            # if mv is not None:
            #     self.createImg(mv)
            #     print("image created")
            # print('VIDEO %s' % videoResponse)
            # time.sleep(3)
        except Exception as ex:
            logging.error("Error in listening to tello\t\t %s" % ex)

def createImg(self, data):
    with open('image.jpg', 'wb') as f:
        f.write(data)
Unfortunately, the image can't be opened.
Thank you in advance.
This looks like an Annex B stream. There are no parity bits. In Annex B, NAL units (the stream's basic packets) are separated by start codes of three or four bytes, 0x000001 or 0x00000001; you can see one at the beginning of your first string. You can read about the bitstream format here: Possible Locations for Sequence/Picture Parameter Set(s) for H.264 Stream
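To actually get pictures out of the stream in Python 3, one option is to feed the raw bytes to an H.264 decoder and save each decoded frame as a JPEG. Here is a minimal sketch, assuming the PyAV and Pillow packages (pip install av pillow) are an acceptable substitute for libh264; handle_packet_data is a hypothetical replacement for your createImg:
import av

codec = av.CodecContext.create('h264', 'r')
frame_count = 0

def handle_packet_data(packet_data):
    # the parser buffers the raw bytes and splits them on Annex B start codes
    global frame_count
    for packet in codec.parse(packet_data):
        for frame in codec.decode(packet):
            # to_image() returns a PIL image; save one JPEG per decoded frame
            frame.to_image().save('image%d.jpg' % frame_count)
            frame_count += 1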
I am trying to concatenate a group of images with associated audio, with a video clip at the start and end of the video. Whenever I concatenate an image with its associated audio, it doesn't play back correctly in VLC media player: it only displays the image for a frame before cutting to black while the audio keeps playing. I came across this GitHub issue: https://github.com/kkroening/ffmpeg-python/issues/274 where the accepted solution was the one I implemented, but one of the comments mentioned this issue of incorrect playback and an error on YouTube.
'''
Generates a clip from an image and a wav file, helper function for export_video
'''
def generate_clip(img):
    transition_cond = os.path.exists("static/transitions/" + img + ".mp4")
    chart_path = os.path.exists("charts/" + img + ".png")
    if transition_cond:
        clip = ffmpeg.input("static/transitions/" + img + ".mp4")
    elif chart_path:
        clip = ffmpeg.input("charts/" + img + ".png")
    else:
        clip = ffmpeg.input("static/transitions/Transition.jpg")
    audio_clip = ffmpeg.input("audio/" + img + ".wav")
    clip = ffmpeg.concat(clip, audio_clip, v=1, a=1)
    clip = ffmpeg.filter(clip, "setdar", "16/9")
    return clip

'''
Combines the charts from charts/ and the audio from audio/ to generate one final video that will be uploaded to Youtube
'''
def export_video(CHARTS):
    clips = []
    intro = generate_clip("Intro")
    clips.append(intro)
    for key in CHARTS.keys():
        value = CHARTS.get(key)
        value.insert(0, key)
        subclip = []
        for img in value:
            subclip.append(generate_clip(img))
        concat_clip = ffmpeg.concat(*subclip)
        clips.append(concat_clip)
    outro = generate_clip("Outro")
    clips.append(outro)
    concat_clip = ffmpeg.concat(*clips)
    concat_clip.output("export/export.mp4").run(overwrite_output=True)
It is unfortunate that the concat filter does not offer the shortest option like overlay does. Anyway, the issue here is that the image2 demuxer uses 25 fps by default, so a video stream made from a single image only lasts 1/25 of a second. There are several ways to address this, but you first need to get the duration of the paired audio files. To incorporate the duration information into the ffmpeg command, you can:
1. Use the tpad filter for each video (in series with setdar) to make the video duration match the audio. The padded amount should be 1/25 seconds less than the audio duration.
2. Specify the -loop 1 input option so the image will loop (indefinitely), and then specify an additional -t {duration} input option to limit the number of loops (see the command-line sketch after this list). Caution: the video duration may not be exact.
3. Specify -r {1/duration} so the image will last as long as the audio, and use the fps filter on each input to set the output frame rate.
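For illustration only, option 2 as a plain ffmpeg command line might look like the following (the chart/audio file names and the 12.34-second duration are invented for the example; note that -loop and -t are input options, so they go before the -i they apply to):
ffmpeg -loop 1 -t 12.34 -i charts/Intro.png -i audio/Intro.wav -vf setdar=16/9 -shortest export/intro.mp4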
I'm not familiar with ffmpeg-python, so I cannot provide a solution with it, but if you're interested, I'd be happy to post equivalent code using my ffmpegio package.
[edit]
ffmpegio Solution
Here is how I'd code the 3rd solution with ffmpegio:
import ffmpegio
from os import path  # needed for path.exists() below

def generate_clip(img):
    """
    Generates a clip from an image and a wav file,
    helper function for export_video
    """
    transition_cond = path.exists("static/transitions/" + img + ".mp4")
    chart_path = path.exists("charts/" + img + ".png")
    if transition_cond:
        video_file = "static/transitions/" + img + ".mp4"
    elif chart_path:
        video_file = "charts/" + img + ".png"
    else:
        video_file = "static/transitions/Transition.jpg"
    audio_file = "audio/" + img + ".wav"
    video_opts = {}
    if not transition_cond:
        # audio_streams_basic() returns audio duration in seconds as Fraction
        # set the "framerate" of the video to be the reciprocal
        info = ffmpegio.probe.audio_streams_basic(audio_file)
        video_opts["r"] = 1 / info[0]["duration"]
    return [(video_file, video_opts), (audio_file, None)]
def export_video(CHARTS):
    """
    Combines the charts from charts/ and the audio from audio/
    to generate one final video that will be uploaded to Youtube
    """
    # get all input files (video/audio pairs)
    clips = [
        generate_clip("Intro"),
        *(generate_clip(img) for key, value in CHARTS.items() for img in value),
        generate_clip("Outro"),
    ]
    # number of clips
    nclips = len(clips)
    # filter chains to set DAR and fps of all video streams
    vfilters = (f"[{2*n}:v]setdar=16/9,fps=30[v{n}]" for n in range(nclips))
    # concatenation filter input: [v0][1:a][v1][3:a][v2][5:a]...
    concatfilter = "".join((f"[v{n}][{2*n+1}:a]" for n in range(nclips))) + f"concat=n={nclips}:v=1:a=1[vout][aout]"
    # form the full filtergraph
    fg = ";".join((*vfilters, concatfilter))
    # set output file and options
    output = ("export/export.mp4", {"map": ["[vout]", "[aout]"]})
    # run ffmpeg
    ffmpegio.ffmpegprocess.run(
        {
            "inputs": [input for pair in clips for input in pair],
            "outputs": [output],
            "global_options": {"filter_complex": fg},
        },
        overwrite=True,
    )
Since this code does not use the read/write features, the ffmpegio-core package suffices:
pip install ffmpegio-core
Make sure that the FFmpeg binary can be found by ffmpegio. See the installation doc.
Here are the direct links to the documentation of the functions used:
ffmpegprocess.run
ffmpeg_args dict argument
probe.audio_streams_basic (ignore the documentation error: duration and start_time are both of Fraction type)
The code has not been fully validated. If you encounter a problem, it might be easiest to post it on the GitHub Discussions to proceed.
I managed to write code to decimate my video and take only 1 frame out of 10, in order to make my neural network more efficient in the future for character recognition.
The new video exit_video is well decimated, because it plays way faster than the previous one. But:
1: When I print the fps of the new video, I get 30 again despite the decimation.
2: Why is my new video heavier? It is 50,000 KB and the first one was 42,000 KB.
Thanks for your help
import cv2
#import os
import sys

video = cv2.VideoCapture("./video/inputvideo.mp4")
frameWidth = int(video.get(cv2.CAP_PROP_FRAME_WIDTH))
frameHeight = int(video.get(cv2.CAP_PROP_FRAME_HEIGHT))
frameFourcc = int(video.get(cv2.CAP_PROP_FOURCC))
success, image = video.read()
if not success:
    print('could not read a frame')
    sys.exit()
fps = video.get(cv2.CAP_PROP_FPS)
print("input fps " + str(fps))
print(frameFourcc)
count = 0
exit_file = 'decimated_v1.mp4'
exit_video = cv2.VideoWriter(exit_file, frameFourcc, fps, (frameWidth, frameHeight))
while True:
    if (count % 10) == 0:
        exit_video.write(image)
    success, image = video.read()
    if not success:
        break
    count += 1
exit_video.release()

exit_video_info = cv2.VideoCapture("decimated_v1.mp4")
fps_sortie = exit_video_info.get(cv2.CAP_PROP_FPS)
print("output fps " + str(fps_sortie))
Decimating a video file that's not all intra frames will require re-encoding, and unless your input file is e.g. ProRes or MJPEG, that's likely going to be the case.
Since you're not setting encoding parameters, OpenCV likely ends up using defaults that result in a higher bitrate than your input file's.
You'll probably have a better time using the FFmpeg tool than OpenCV, together with its select filter.
ffmpeg -i ./video/inputvideo.mp4 -vf select='not(mod(n\,10))' ./decimated_v1.mp4
would be the basic syntax to keep every tenth frame from the input; you can then add your desired encoding parameters, such as -crf to adjust the H.264 rate factor, or, of course, change to a different codec altogether.
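One hedged aside: with the select filter alone, ffmpeg may duplicate the surviving frames on output to maintain a constant frame rate, which would undo the decimation. If you see that, asking for variable-frame-rate output keeps only the selected frames:
ffmpeg -i ./video/inputvideo.mp4 -vf select='not(mod(n\,10))' -vsync vfr ./decimated_v1.mp4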
I am reading a raw disk image using Python 3. My task is to retrieve (carve) JPGs as individual files from the disk image. Since I know the JPG header pattern (\xff\xd8\xff\xe0 or \xff\xd8\xff\xe1), I want to find where it occurs while reading the file.
def findheader(data):  # to find header in each 32 bytes of data of raw disk image
    for i in range(0, len(data) - 3):
        if data[i] == b'\xff':
            if data[i+1:i+4] == b'\xd8\xff\xe0' or data[i+1:i+4] == b'\xd8\xff\xe1':
                return i
    return -1

fobj = open('carve.dd', 'rb')
data = fobj.read(32)
while data != '':
    head_loc = findheader(data)
    print(head_loc)
    data = fobj.read(32)
The same code works fine in Python 2; there I am able to get the headers from the image in just a few seconds. Can someone help me out: what is the problem in Python 3?
This code snippet is actually from https://github.com/darth-cheney/JPEG-Recover/blob/master/jpegrecover2.py
It runs fine in Python 2 but not in Python 3. Please ignore the inconsistent-tab error when you run the code in the link; I retyped it in VS Code.
Like the old saying goes, I've got some bad news and some good news. The bad news is I can't figure out why your code doesn't work the same in both version 2 and version 3 of Python.
The good news is that I was able to reproduce the problem using the sample data you provided, but, more importantly, was able to devise something that not only works consistently in both versions, it's likely much faster because it doesn't use a for loop to search through each chunk of data looking for the .jpg header patterns.
from __future__ import print_function

LIMIT = 100000  # Number of chunks (for testing).
CHUNKSIZE = 32  # Bytes.
HDRS = b'\xff\xd8\xff\xe0', b'\xff\xd8\xff\xe1'
IMG_PATH = r'C:\vols\Files\Temp\carve.dd.002'

print('Searching...')
with open(IMG_PATH, 'rb') as file:
    chunk_index = 0
    found = 0
    while True:
        data = file.read(CHUNKSIZE)
        if not data:
            break
        # Search for each of the headers in each chunk.
        for hdr in HDRS:
            offset = 0
            while offset < (CHUNKSIZE - len(hdr)):
                try:
                    head_loc = data[offset:].index(hdr)
                except ValueError:  # Not found.
                    break
                found += 1
                # head_loc is relative to the data[offset:] slice,
                # so offset must be added back in for the file position
                file_offset = chunk_index*CHUNKSIZE + offset + head_loc
                print('found: #{} at {:,}'.format(found, file_offset))
                offset += (head_loc + len(hdr))
        chunk_index += 1
        if LIMIT and (chunk_index == LIMIT):
            break  # Stop after this many chunks.
print('total found {}'.format(found))
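[edit] A likely explanation for the version difference, in hindsight (my best guess, since I have not run your full program): in Python 3, indexing a bytes object yields an int, so the comparison data[i] == b'\xff' in findheader() is always False and no header is ever matched. In Python 2, data[i] is a one-character string and the comparison works. A quick demonstration:
data = b'\xff\xd8\xff\xe0'
print(data[0])             # Python 3: 255 (an int); Python 2: '\xff' (a str)
print(data[0] == b'\xff')  # Python 3: False; Python 2: True
print(data[0] == 0xff)     # Python 3: True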
I was just playing around with sound input and output on a Raspberry Pi using Python.
My plan was to read the input of a microphone, manipulate it, and play back the manipulated audio. For the moment I have tried to just read and play back the audio.
The reading seems to work, since I wrote the read data into a wave file in the last step, and the wave file seemed fine.
But the playback is noise sounds only.
Playing the wave file worked as well, so the headset is fine.
I think maybe I have some problem in my settings or the output format.
The code:
import alsaaudio as audio
import time
import audioop

#Input & Output Settings
periodsize = 1024
audioformat = audio.PCM_FORMAT_FLOAT_LE
channels = 16
framerate = 8000

#Input Device
inp = audio.PCM(audio.PCM_CAPTURE, audio.PCM_NONBLOCK, device='hw:1,0')
inp.setchannels(channels)
inp.setrate(framerate)
inp.setformat(audioformat)
inp.setperiodsize(periodsize)

#Output Device
out = audio.PCM(audio.PCM_PLAYBACK, device='hw:0,0')
out.setchannels(channels)
out.setrate(framerate)
out.setformat(audioformat)
out.setperiodsize(periodsize)

#Reading the Input
allData = bytearray()
count = 0
while True:
    #reading the input into one long bytearray
    l, data = inp.read()
    for b in data:
        allData.append(b)
    #Just an ending condition
    count += 1
    if count == 4000:
        break
    time.sleep(.001)

#splitting the bytearray into period sized chunks
list1 = [allData[i:i+periodsize] for i in range(0, len(allData), periodsize)]

#Writing the output
for arr in list1:
    # I tested writing the arr's to a wave file at this point
    # and the wave file was fine
    out.write(arr)
Edit: Maybe I should mention that I am using Python 3.
I just found the answer. audioformat = audio.PCM_FORMAT_FLOAT_LE is not the format used by my headset (I just copied and pasted it without a second thought).
I found out about my microphone's format (and additional information) by running speaker-test in the console.
Since my speaker's format is S16_LE, the code works fine with audioformat = audio.PCM_FORMAT_S16_LE
Consider using plughw (the ALSA subsystem supporting resampling/conversion) for the sink part of the chain at least:
#Output Device
out = audio.PCM(audio.PCM_PLAYBACK,device='plughw:0,0')
This should help to negotiate the sampling rate as well as the data format.
periodsize is better estimated as a fraction of the sample rate, for example:
periodsize = framerate / 8 (8 periods per second at an 8000 Hz sampling rate)
and sleeptime is better estimated as half of the time necessary to play one periodsize:
sleeptime = 1.0 / 16 (1.0 is one second, 16 = 2 * 8 periods per second at 8000 Hz)
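Spelled out as a small sketch (the factor of 8 is this answer's rule of thumb, not an ALSA requirement):
framerate = 8000                               # samples per second
periods_per_second = 8
periodsize = framerate // periods_per_second   # 1000 frames = 1/8 s per period
sleeptime = 1.0 / (2 * periods_per_second)     # 0.0625 s, half a period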
I have a device that's connected through USB and I'm using pyUSB to interface with the data.
This is what my code currently looks like:
import usb.core
import usb.util
def main():
    device = usb.core.find(idVendor=0x072F, idProduct=0x2200)
    # use the first/default configuration
    device.set_configuration()
    # first endpoint
    endpoint = device[0][(0,0)][0]
    # read a data packet
    data = None
    while True:
        try:
            data = device.read(endpoint.bEndpointAddress,
                               endpoint.wMaxPacketSize)
            print data
        except usb.core.USBError as e:
            data = None
            if e.args == ('Operation timed out',):
                continue

if __name__ == '__main__':
    main()
It is based off the mouse reader example, but the data that I'm getting isn't making sense to me:
array('B', [80, 3])
array('B', [80, 2])
array('B', [80, 3])
array('B', [80, 2])
My guess is that it's reading only a portion of what's actually being provided? I've tried setting the max packet size to be bigger, but nothing changed.
pyUSB sends and receives data as arrays of byte values; the values you are receiving are ASCII codes. You need to add the following lines to read the data properly:
data = device.read(endpoint.bEndpointAddress,
                   endpoint.wMaxPacketSize)
RxData = ''.join([chr(x) for x in data])
print RxData
The function chr(x) converts an ASCII code to its character, so the join yields a string. This should resolve your problem.
I'm only an occasional Python user, so beware. If your Python script cannot keep up with the amount of data being sampled, then this is what works for me. I'm sending blocks of 64 bytes from a uC to the PC. I use a buffer to hold my samples and later save them to a file or plot them. I adjust the number multiplying 64 (10 in the example below) until I receive all the samples I was expecting.
import array  # needed for array.array below

# Initialization
rxBytes = array.array('B', [0]) * (64 * 10)
rxBuffer = array.array('B')
Within a loop, I get the new samples and store them in the buffer:
# Get new samples
hid_dev.read(endpoint.bEndpointAddress, rxBytes)
rxBuffer.extend(rxBytes)
Hope this helps.