I am trying to write a Python script that scans a barcode and retrieves its contents. The pyzbar library already exists for this purpose. Using it together with OpenCV, I wrote the code below, which scans barcodes/QR codes and draws bounding boxes around them. The problem is that when I use a pre-recorded video larger than 100 MB as input, the output video that is displayed/saved is very slow, but with a live stream there is no such issue. I tried several ways to reduce the fps, such as CAP_PROP_FPS, but nothing worked. I also tried multithreading, and it seems to have no effect. Please help me with this.
import cv2
import numpy as np
from pyzbar.pyzbar import decode
cv2.namedWindow("Result", cv2.WINDOW_NORMAL)
cap = cv2.VideoCapture('2016_0806_040333_0081.mp4')
cap.set(3, 1280)
cap.set(4, 720)
#frame_width = int(cap.get(3))
#frame_height = int(cap.get(4))
#cap.set(cv2.CAP_PROP_FPS, 0.1)
#size = (frame_width, frame_height)
#result = cv2.VideoWriter('processed_video.avi', cv2.VideoWriter_fourcc(*'MJPG'),0.1, size)
while(True):
    ret, frame = cap.read()
    for barcode in decode(frame):
        myData = barcode.data.decode('utf-8')
        pts = np.array([barcode.polygon],np.int32)
        pts = pts.reshape((-1,1,2))
        cv2.polylines(frame, [pts], True, (255,0,255),5)
        pts2 = barcode.rect
        akash = []
        akash.append(myData)
        cv2.putText(frame, myData, (pts2[0], pts2[1]), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (255, 0, 255), 2)
        """
        f=open('output.csv','a+')
        for ele in akash:
            f.write(ele+'\n')
        """
    result.write(frame)
    cv2.imshow('Result',frame)
    cv2.waitKey(1)
#video.release()
#result.release()
#cv2.destroyAllWindows()
print("The video was successfully saved")
You could change the frame rate of your project, but this won't add missing frames. Even if your editor can attempt to extrapolate the needed frames, it can result in glitches.
Best approach if you have the option is to shoot the video at a higher frame rate to begin with.
I'm working on a project with AprilTags and a computer vision system that detects them from a webcam. My current system prints the data to the terminal, but I would like to display this numerical/text data on top of the video window or in another window. I've already tried cv2.putText(), but it only puts static text on the image and the text can't be updated in real time the way I want. The code below tries to update a window in real time with the number of tags detected in the webcam video. But it ends up just writing a 1, for example, and I can't figure out how to erase that text and update it.
Is this even possible in OpenCV? Or is there another way?
while True:
    success, frame = cap.read()
    if not success:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_RGB2GRAY)
    detections, dimg = detector.detect(gray, return_image=True)
    print(detections)
    num_detections = len(detections)
    # print('Detected {} tags.\n'.format(num_detections))
    num_detections_string = str(num_detections)
    overlay = frame // 2 + dimg[:, :, None] // 2
    clear_text = ''
    text = checkNumDetections(num_detections, num_detections_string)
    cv2.putText(whiteBackground, clear_text, (100, 100), cv2.FONT_HERSHEY_PLAIN, 10, (0, 255, 0), 2)
    cv2.putText(whiteBackground, text, (100, 100), cv2.FONT_HERSHEY_PLAIN, 10, (0, 255, 0), 2)
    cv2.imshow(window, overlay)
    k = cv2.waitKey(1)
    cv2.imshow(dataWindow, whiteBackground)
    if k == 27:
        break
You need to re-initialise the 'whiteBackground' image on each loop iteration, before you draw anything on it. Otherwise each new putText call draws over the text left from the previous frames.
I know this will work, but it will give you a black background:
whiteBackground = np.zeros((rows, columns, channels), dtype = "uint8")
This should work to give you a white background, but experiment and see:
whiteBackground = np.full((rows, columns, channels), 255, dtype = "uint8")
I usually work with OpenCV in C++, so I'm not 100% sure of the exact syntax, but that should work.
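The re-initialisation idea can be sketched with NumPy alone. This is only a minimal sketch: the helper name `fresh_canvas` and the sizes are made up for illustration, and the coloured block is a stand-in for the real `cv2.putText` call. Note that NumPy image shapes are ordered (rows, columns, channels), i.e. height first.

```python
import numpy as np

def fresh_canvas(height, width, channels=3, value=255):
    """Return a new uint8 image filled with `value` (255 gives white)."""
    return np.full((height, width, channels), value, dtype=np.uint8)

# Simulate three loop iterations. Each one starts from a clean canvas,
# so whatever was drawn in the previous iteration is gone.
for count in (1, 2, 3):
    canvas = fresh_canvas(200, 400)
    # Stand-in for cv2.putText: paint a block whose width depends on count.
    canvas[90:110, 10:10 + 20 * count] = (0, 255, 0)
```

In the real loop, the stand-in line would be replaced by the `cv2.putText(canvas, text, ...)` call, followed by `cv2.imshow(dataWindow, canvas)`.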
I have written the following code
import cv2
import datetime
import time
import pandas as pd
cascPath = 'haarcascade_frontalface_dataset.xml' # dataset
faceCascade = cv2.CascadeClassifier(cascPath)
video_capture = cv2.VideoCapture('video1.mp4')
frames = video_capture.get(cv2.CAP_PROP_FRAME_COUNT)
fps = int(video_capture.get(cv2.CAP_PROP_FPS))
print(frames) #1403 frames
print(fps) #30 fps
# calculate duration of the video
seconds = int(frames / fps)
print("duration in seconds:", seconds) #46 seconds
df = pd.DataFrame(columns=['Time(Seconds)', 'Status'])
start = time.time()
print(start)
n=5
while True:
    ret, frame = video_capture.read()
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY) #converts frame to grayscale image
    faces = faceCascade.detectMultiScale(
        gray, scaleFactor=1.1,
        minNeighbors=5,
        minSize=(30, 30),
        flags=cv2.FONT_HERSHEY_SIMPLEX
    )
    if len(faces) == 0:
        print(time.time()-start, 'No Face Detected')
        df = df.append({'Time(Seconds)': (time.time()-start) , 'Status':'No Face detected' }, ignore_index=True)
    else:
        print(time.time()-start, 'Face Detected')
        df = df.append({'Time(Seconds)':(time.time()-start), 'Status':'Face Detected' }, ignore_index=True)
    # Draw a rectangle around the faces
    for (x, y, w, h) in faces:
        cv2.rectangle(frame, (x, y), (x+w, y+h), (0, 255, 0), 2)
    # Display the resulting frame
    cv2.imshow('Video', frame)
    df.to_csv('output.csv', index = False)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        # print(df.head(2))
        break
# When everything is done, release the capture
video_capture.release()
cv2.destroyAllWindows()
If you want to download the video I'm working on, you can download it from here
Download the Haar cascade XML file from here
I have a few questions about this.
Currently it runs on all 1403 frames of the video. I want to optimize it so that it runs inference only on every n-th frame, with n customizable. In the code I have set n = 5, so the number of frames processed should be 1403/5 = 280.
The timestamps in my CSV are not accurate. I want them to be relative to the video: the first column (Time(Seconds)) should give the time in the video, and the Status column should give the status (detected/not detected) of the frame at that moment. The Time(Seconds) column should end at around 46 seconds, which is the length of the video.
My cv2.imshow is showing the video at around 2x speed. I believe I can control the speed using cv2.waitKey(). What parameter should I pass to cv2.waitKey so that the output plays at the same speed as the original video?
Thanks for going through the whole question
If you want to read every 'n' frames, you can wrap your VideoCapture.read() call in a loop like this:
for a in range(n):
    ret, frame = video_capture.read()
For the timestamps in the CSV file, if they came with the dataset I'd trust them. It's possible the camera isn't capturing at a consistent frame rate. If you think the frame rate is consistent and want to generate the timestamps yourself, keep track of how many frames you've read and scale the video length by the fraction of frames read so far (i.e. at frame 150 the timestamp would be (150 / 1403) * 46 seconds).
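The timestamp calculation can be sketched as below. This is a minimal sketch that assumes a constant frame rate, as reported by CAP_PROP_FPS; the function name `frame_timestamp` is made up for illustration. Dividing the frame index by the fps is the same idea as the fraction above, without the rounding that comes from using the truncated 46-second duration.

```python
def frame_timestamp(frame_index, fps):
    """Time in seconds at which frame `frame_index` (0-based) appears in the video."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frame_index / fps

# With the figures from the question (1403 frames at 30 fps):
print(round(frame_timestamp(150, 30), 2))   # 5.0
print(round(frame_timestamp(1402, 30), 2))  # 46.73
```

In the loop, the value written to the CSV would then come from a counter of frames read so far rather than from `time.time() - start`.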
cv2.imshow() just shows frames as fast as the loop runs. This is mostly controlled through cv2.waitKey(milliseconds). If you think the processing that you're doing in the loop takes a negligible amount of time you can just set the time in the waitKey to be ((n / 1403) * 46 * 1000). Otherwise you should use the python time module to track how long the processing takes and subtract that time from the wait.
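A possible way to compute that wait is sketched here. The helper name `wait_delay_ms` and the example values (30 fps, n = 5) are assumptions for illustration, and the detection step is only indicated by a comment.

```python
import time

def wait_delay_ms(fps, frames_per_step, processing_seconds):
    """Milliseconds to pass to cv2.waitKey so that one loop step lasts as long
    as `frames_per_step` frames of the source video."""
    target_ms = 1000.0 * frames_per_step / fps
    remaining = target_ms - 1000.0 * processing_seconds
    # cv2.waitKey(0) blocks until a key is pressed, so never return less than 1.
    return max(1, int(round(remaining)))

step_start = time.perf_counter()
# ... read n frames and run detection here ...
elapsed = time.perf_counter() - step_start
delay = wait_delay_ms(fps=30, frames_per_step=5, processing_seconds=elapsed)
```

If processing takes longer than the time budget, the delay bottoms out at 1 ms and the playback will simply run slower than real time.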
Edit:
Sorry, I should have been more clear with the first part. That for loop only has the VideoCapture.read() line in it, nothing else. This way you'll read 'n' frames, but only process one out of every 'n' frames. This isn't replacing the overall while loop that you already have. You're just using the for loop to dump the frames you want to skip.
Oh, and you should also have a check for the return value of the read().
if not ret:
    break
The program will probably crash at the end of the video if it doesn't have that check.
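Frame skipping and the return-value check can be combined in one small generator. This is a sketch, not the only way to do it: `every_nth_frame` is a made-up name, and it works with any object whose read() returns (ret, frame), which is how cv2.VideoCapture behaves.

```python
def every_nth_frame(capture, n):
    """Yield (frame_index, frame) for every n-th frame of `capture`.

    `capture` is anything with a read() method returning (ret, frame),
    such as cv2.VideoCapture. Iteration stops when read() reports failure.
    """
    index = 0
    while True:
        ret, frame = capture.read()
        if not ret:
            return  # end of video (or read error): stop instead of crashing
        if index % n == 0:
            yield index, frame
        index += 1
```

Usage would look like `for index, frame in every_nth_frame(video_capture, 5): ...`. Since the frame index is yielded too, it can be reused for video-relative timestamps. For 1403 frames and n = 5 this yields frames 0, 5, ..., 1400.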
I'm new here and I want to solve a problem that has been bothering me for the past few weeks...
The problem is this: I have two scripts. The first one grabs a video from a directory (it could also be a webcam) with OpenCV, reads it frame by frame, draws text on each frame, plots one variable (just for testing), shows me the video and saves it in the Python file's directory. This is the code ("MyPath" is the path of the file, which is irrelevant to the question):
import cv2
import random
capture = cv2.VideoCapture(r"MyPath\Car.mp4")
out = cv2.VideoWriter('Test.avi',cv2.VideoWriter_fourcc('M','P','4','2'), 25, (1280, 720))
def velocity():
    now = (random.randint(0,100))
    return now
while True:
    ret, frame = capture.read()
    if ret:
        font=cv2.FONT_HERSHEY_SIMPLEX
        cv2.putText(frame,'Velocity:',(15,580),font,0.7,(255,255,255),1)
        cv2.putText(frame,'Distance:',(15,620),font,0.7,(255,255,255),1)
        cv2.putText(frame,'Inclination:',(15,660),font,0.7,(255,255,255),1)
        cv2.putText(frame,'Orientation:',(15,700),font,0.7,(255,255,255),1)
        cv2.putText(frame, str(velocity()), (130,580),font,0.7,(255,255,255),1)
        cv2.putText(frame,'KM/H',(165,580),font,0.7,(255,255,255),1)
        out.write(frame)
        cv2.imshow("Testing", frame)
    else:
        break
capture.release()
out.release()
cv2.destroyAllWindows()
This works fine, no problems here. I have another script, using Pillow, that opens a background image (JPG) and four PNG images, resizes and repositions them, and pastes them onto the background. (I made the background just for testing; these four PNG images are eventually meant to be drawn onto the video frames.) It then shows me the result and saves the background with the PNGs on top. Again, it works perfectly!
from PIL import Image
back = Image.open(r"MyPath\Eagle.jpg")
vel = Image.open(r"MyPath\Velocímeter.png")
dis = Image.open(r"MyPath\Distance.png")
inp = Image.open(r"MyPath\Inclination.png")
orz = Image.open(r"MyPath\Orientation.png")
vel = vel.resize((60, 60), Image.LANCZOS)
dis = dis.resize((55, 55), Image.LANCZOS)
inp = inp.resize((60, 60), Image.LANCZOS)
orz = orz.resize((60, 60), Image.LANCZOS)
back.paste(vel, (10, 420), vel)
back.paste(dis, (10, 510), dis)
back.paste(inp, (10, 583), inp)
back.paste(orz, (10, 655), orz)
back.show()
back.save("Test.jpg")
The problem is that Pillow is an image library, so I can't open a video and paste the images, and OpenCV doesn't handle PNG images as easily as Pillow does. Is there any way to combine these two scripts into one and do what I want (drawing the PNG images onto the video frames to get a rendered video with both text and images)? This project is for reading information from sensors and plotting it, which is why I made a test function just to see how it looks (the images are a detail I want to add to the project). If you like my code and it's useful to you, feel free to use it! Thank you very much for reading! I hope you can help me. (I don't speak English fluently, so sorry for possible mistakes, but I can read your answers perfectly.)
You can mix PIL/Pillow and OpenCV. There are three things you need to know...
To convert an OpenCV image into a PIL image:
pilimage = Image.fromarray(opencvimage)
To convert from a PIL image into an OpenCV image:
opencvimage = np.array(pilimage)
And lastly, OpenCV stores images in BGR order while PIL uses RGB, so reds and blues will get swapped if you don't watch out; cv2.cvtColor(image, cv2.COLOR_BGR2RGB) is your friend here.
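These three points can be sketched as a pair of helpers. This is a minimal sketch with made-up names (`cv_to_pil`, `pil_to_cv`); it reverses the channel axis with NumPy slicing, which for 3-channel images has the same effect as cv2.cvtColor with COLOR_BGR2RGB, so the example runs without a video file.

```python
import numpy as np
from PIL import Image

def cv_to_pil(bgr):
    """Convert a BGR uint8 array (OpenCV layout) to an RGB PIL image."""
    rgb = np.ascontiguousarray(bgr[:, :, ::-1])  # reverse the channel order
    return Image.fromarray(rgb)

def pil_to_cv(image):
    """Convert a PIL image to a BGR uint8 array (OpenCV layout)."""
    rgb = np.array(image.convert("RGB"))
    return np.ascontiguousarray(rgb[:, :, ::-1])

# A 2x2 frame that is pure blue in OpenCV's BGR order.
frame = np.zeros((2, 2, 3), dtype=np.uint8)
frame[:, :, 0] = 255
pil_frame = cv_to_pil(frame)
print(pil_frame.getpixel((0, 0)))  # (0, 0, 255)
```

With the channels swapped on the way in and out, a PNG pasted in between keeps its original colours, and the array returned by `pil_to_cv` can go straight to `out.write()` or `cv2.imshow()`.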
So we can now mix up those two pieces of code:
#!/usr/bin/env python3
import cv2
import random
import numpy as np
from PIL import Image
# Load a speedo image which has transparency
speedo = Image.open('speedo.png').convert('RGBA')
capture = cv2.VideoCapture("movie.mov")
out = cv2.VideoWriter('test.mov',cv2.VideoWriter_fourcc('a','v','c','1'), 25, (1280, 720))
def velocity():
    now = (random.randint(0,100))
    return now
while True:
    ret, frame = capture.read()
    if ret:
        font=cv2.FONT_HERSHEY_SIMPLEX
        cv2.putText(frame,'Velocity:',(15,580),font,0.7,(255,255,255),1)
        cv2.putText(frame,'Distance:',(15,620),font,0.7,(255,255,255),1)
        cv2.putText(frame,'Inclination:',(15,660),font,0.7,(255,255,255),1)
        cv2.putText(frame,'Orientation:',(15,700),font,0.7,(255,255,255),1)
        cv2.putText(frame, str(velocity()), (130,580),font,0.7,(255,255,255),1)
        cv2.putText(frame,'KM/H',(165,580),font,0.7,(255,255,255),1)
        # Make PIL image from frame, paste in speedo, revert to OpenCV frame
        pilim = Image.fromarray(frame)
        pilim.paste(speedo,box=(80,20),mask=speedo)
        frame = np.array(pilim)
        out.write(frame)
        cv2.imshow("Testing", frame)
        cv2.waitKey(1)
    else:
        break
capture.release()
out.release()
cv2.destroyAllWindows()
Here's the speedo I used:
If you want the colours to go across to OpenCV correctly from PIL, you need to re-order the channels to what OpenCV expects, namely BGRA. So, you could change the image loading to this:
# Open speedo image, split into separate RGBA channels, then re-combine in BGRA order for OpenCV
speedo = Image.open('speedo.png').convert('RGBA')
R,G,B,A = speedo.split()
speedo = Image.merge('RGBA',(B,G,R,A))
First time asking a question on SO.
I am trying to find a fast way to read the screen live (60fps+). Screenshot to numpy is a fast method, but does not match that speed. There is a brilliant answer in this question for pixels: Most efficient/quickest way to parse pixel data with Python?
I tried changing GetPixel to this long form for BMP, but that reduces it to 5fps:
import time
import win32api
import win32con
import win32gui
import win32ui
t1 = time.time()
count = 0
width = win32api.GetSystemMetrics(win32con.SM_CXVIRTUALSCREEN)
height = win32api.GetSystemMetrics(win32con.SM_CYVIRTUALSCREEN)
left = win32api.GetSystemMetrics(win32con.SM_XVIRTUALSCREEN)
top = win32api.GetSystemMetrics(win32con.SM_YVIRTUALSCREEN)
while count < 1000:
    hwin = win32gui.GetDesktopWindow()
    hwindc = win32gui.GetWindowDC(hwin)
    srcdc = win32ui.CreateDCFromHandle(hwindc)
    memdc = srcdc.CreateCompatibleDC()
    bmp = win32ui.CreateBitmap()
    bmp.CreateCompatibleBitmap(srcdc, width, height)
    memdc.SelectObject(bmp)
    memdc.BitBlt((0, 0), (width, height), srcdc, (left, top), win32con.SRCCOPY)
    bmpinfo = bmp.GetInfo()
    bmpInt = bmp.GetBitmapBits(False)
    count +=1
t2 = time.time()
tf = t2-t1
it_per_sec = int(count/tf)
print (str(it_per_sec) + " iterations per second")
I watched a youtube video of a guy working on C# where he said GetPixel opens and closes memory and that's why doing a GetPixel on each individual pixel has a lot of overhead. He suggested to lock the entire data field and only then do getpixel. I don't know how to do that, so any help will be appreciated. (EDIT: this link might refer to that Unsafe Image Processing in Python like LockBits in C# )
There is also another method which gets a memory address of the bitmap, but I don't know what to do with it. The logic there is that I should be able to read memory from that point into any numpy array, but I have not been able to do that.
Any other option to read the screen fast will also be appreciated.
There must be a way: the GPU knows what pixels to draw at each location, which means there must be a memory bank somewhere or a data stream we can tap into.
P.S. why a highspeed requirement? I am working on work automation tools that have a lot of overhead already and I am hoping to optimize screen data stream to help that part of the project.
The code below uses MSS, which if modified to show no output can reach 44fps for 1080p. https://python-mss.readthedocs.io/examples.html#opencv-numpy
import time
import cv2
import mss
import numpy
with mss.mss() as sct:
    # Part of the screen to capture
    monitor = {'top': 40, 'left': 0, 'width': 800, 'height': 640}
    while 'Screen capturing':
        last_time = time.time()
        # Get raw pixels from the screen, save it to a Numpy array
        img = numpy.array(sct.grab(monitor))
        # Display the picture
        #cv2.imshow('OpenCV/Numpy normal', img)
        # Display the picture in grayscale
        # cv2.imshow('OpenCV/Numpy grayscale',
        #            cv2.cvtColor(img, cv2.COLOR_BGRA2GRAY))
        print('fps: {0}'.format(1 / (time.time()-last_time)))
        # Press "q" to quit
        if cv2.waitKey(25) & 0xFF == ord('q'):
            cv2.destroyAllWindows()
            break
Still not perfect though as it is not 60fps+ and using a raw repackaged buffer from the GPU would be a better solution if possible.
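When comparing capture methods, averaging over many grabs tends to give a steadier figure than printing 1 / elapsed for every single frame. A minimal sketch, assuming a hypothetical helper `measure_fps` that times any zero-argument callable:

```python
import time

def measure_fps(grab, frames=100):
    """Call `grab()` `frames` times and return the average calls per second."""
    start = time.perf_counter()
    for _ in range(frames):
        grab()
    elapsed = time.perf_counter() - start
    return frames / elapsed if elapsed > 0 else float("inf")

# Stand-in workload so the sketch runs without a display.
rate = measure_fps(lambda: sum(range(1000)), frames=50)
print('average fps: {0:.1f}'.format(rate))
```

With MSS, the callable would be something like `lambda: numpy.array(sct.grab(monitor))`, called inside the `with mss.mss() as sct:` block, so that only the grab and the conversion are timed.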
Good morning,
I'm currently trying to study real-time liquid surface deformations by sending a laser sheet on the surface and gathering its reflection. What I obtain is typically a bright curve at each timestep, and I wish to analyze its coordinates.
I therefore wrote a Python script, shown below (the analysis part is adapted from "laser curved line detection using opencv and python", as it does almost exactly what I'm trying to do, except that I'm working with a video stream):
import cv2
from PIL import Image
import cv2.cv as cv
import numpy as np
import time
myfile = open("hauteur.txt","w")
#Import camera flow
class Target:
    def __init__(self):
        self.capture = cv.CaptureFromCAM(0)
        cv.namedWindow("Target", 1)
        cv.SetCaptureProperty(self.capture,cv.CV_CAP_PROP_FRAME_WIDTH, 150)
        cv.SetCaptureProperty(self.capture,cv.CV_CAP_PROP_FRAME_HEIGHT, 980)
        cv.SetCaptureProperty(self.capture,cv.CV_CAP_PROP_FPS, 60 )
    def run(self):
        frame = cv.QueryFrame(self.capture)
        frame_size = cv.GetSize(frame)
        color_image_cv = cv.CreateImage(cv.GetSize(frame), 8, 3)
        color_image = np.array(color_image_cv)
        grey_image = cv.CreateImage(cv.GetSize(frame), cv.IPL_DEPTH_8U, 1)
        first = True
        t = time.clock()
        # Frame analysis
        while True:
            ret, bw = cv2.threshold(color_image, 0, 255, cv2.THRESH_BINARY)
            contours, hierarchy = cv2.findContours(bw, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
            curves = np.zeros((img.shape[0], img.shape[1], 3), np.uint8)
            for i in range(len(contours)):
                for col in range(draw.shape[1]):
                    M = cv2.moments(draw[:, col])
                    if M['m00'] != 0:
                        x = col
                        y = int (M['m01']/M['m00'])
                        curves[y, x, :] = (0, 0, 255)
                        res = {'X' : x, 'Y' : y, 't' : t}
                        print res
                        myfile.write('{X}\t{Y}\t{t}'.format(**res))
                        myfile.write("\n")
            cv2.ShowImage("Target", color_image)
            # Listen for ESC key
            c = cv2.WaitKey(7) % 0x100
            if c == 27:
                break
if __name__=="__main__":
    t = Target()
    t.run()
However, the use of cv and cv2 functions within the same code seems to bring a nice mess and I get the error
src data type = 17 is not supported
from line
ret, bw = cv2.threshold(color_image, 0, 255, cv2.THRESH_BINARY)
I understand this arises from the way cv and cv2 functions create and store images, but none of the conversions I tried seem to work, and I couldn't find equivalent cv2 functions for the video capture part (but, as you may understand, I'm clearly not a programming pro and I may have missed what I need in the documentation). Is there a way to reconcile these cv and cv2 functions, or to get an equivalent camera stream with cv2 functions?
Bonus question: how fast can a script like this run (considering that I'd eventually need it to run at 300-400 fps, I'm not even sure this is actually feasible)?
Thanks for your attention
ok, cv2 video code:
def __init__(self):
    self.capture = cv2.VideoCapture(0)
    cv2.namedWindow("Target", 1)
    self.capture.set(cv2.CAP_PROP_FRAME_WIDTH, 150)
    self.capture.set(cv2.CAP_PROP_FRAME_HEIGHT, 980)
    self.capture.set(cv2.CAP_PROP_FPS, 60 )
def run(self):
    ok, frame = self.capture.read()
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    ...
Bonus question: of course, it can only run as fast as the capture delivers frames. 300 fps seems absurd; 30 fps is more likely.
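Since a camera driver may silently ignore a requested size or frame rate, it can help to read the properties back after setting them. A minimal sketch, assuming a hypothetical helper `apply_capture_settings`; it only relies on the set() and get() methods that cv2.VideoCapture provides, so any object with those two methods works.

```python
def apply_capture_settings(capture, requested):
    """Set each property on `capture`, then read it back.

    `capture` needs set(prop_id, value) and get(prop_id), as cv2.VideoCapture has.
    Returns {prop_id: (requested_value, reported_value)}.
    """
    report = {}
    for prop_id, value in requested.items():
        capture.set(prop_id, value)
        report[prop_id] = (value, capture.get(prop_id))
    return report
```

For example, `apply_capture_settings(self.capture, {cv2.CAP_PROP_FPS: 60})` returns the pair of requested and reported frame rates. If the two differ, the camera did not accept the request, and the loop cannot run faster than the reported rate.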