I'm working on a facial recognition project that builds a database of face encodings and then, when a target photo is selected, encodes that photo as well and matches it against the known encodings.
The program works correctly except that it only returns the single best match, if one is found within the set tolerance. The problem is that this best match is not always correct: even when I know the target face is in my database, the target picture can be different enough to cause a false positive.
Because of this, I would like to list the top 3 or 5 results, in the hope that the correct result is among them.
Here's the code.
def recognize():
    # define path for target photo
    path = tkinter.filedialog.askopenfilename(filetypes=[("Image File", '.jpg .png')])

    with open('dataset_faces.dat', 'rb') as f:
        encoding = pickle.load(f)

    def classify_face(im):
        faces = encoding
        faces_encoded = list(faces.values())
        known_face_names = list(faces.keys())

        img = cv2.imread(im, 1)
        img = cv2.resize(img, (600, 600), fx=0.5, fy=0.5)
        #img = img[:,:,::-1]

        face_locations = face_recognition.face_locations(img, number_of_times_to_upsample=2, model="cnn")
        unknown_face_encodings = face_recognition.face_encodings(img, face_locations, num_jitters=100)

        face_names = []
        for face_encoding in unknown_face_encodings:
            # See if the face is a match for the known face(s)
            name = "Unknown"

            # use the known face with the smallest distance to the new face
            face_distances = face_recognition.face_distance(faces_encoded, face_encoding)
            best_match_index = np.argmin(face_distances)

            # tolerance on the distance between a known face and the target face:
            # the lower the distance, the better the match
            if face_distances[best_match_index] < 0.60:
                name = known_face_names[best_match_index]
            face_names.append(name)
            print(name)
I have tried adding code like
top_3_matches = np.argsort(face_distances)[:3]
top3 = face_names.append(top_3_matches)
print(top3)
However, this gives me no hits.
Any ideas?
list.append does not return anything, so you should not try to assign the result of that expression to a variable.
names = known_face_names[top_3_matches]
face_names.append(names)
print(names)
should do the same thing as
name = known_face_names[best_match_index]
face_names.append(name)
print(name)
for three elements instead of one.
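Putting it together, a minimal sketch of the loop with the top-3 lookup (note: this assumes known_face_names has been converted to a NumPy array, e.g. np.array(list(faces.keys())); with a plain Python list the fancy indexing would raise a TypeError):

for face_encoding in unknown_face_encodings:
    face_distances = face_recognition.face_distance(faces_encoded, face_encoding)
    top_3_matches = np.argsort(face_distances)[:3]
    names = known_face_names[top_3_matches]  # the 3 closest names, best first
    face_names.append(names)
    print(names)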
The following code solves the issue. The problem was that I was using NumPy functions on a list that hadn't been converted into a NumPy array, as per Aubergine's answer.
def classify_face(im):
    faces = encoding
    faces_encoded = list(faces.values())
    known_face_names = list(faces.keys())

    # make the lists into numpy arrays
    n_faces_encoded = np.array(faces_encoded)
    n_known_face_names = np.array(known_face_names)
and, to get the indices of the 3 smallest distances:
n_face_distances = face_recognition.face_distance(n_faces_encoded, face_encoding)
top_3_matches = np.argsort(n_face_distances)[:3]
printing the best 3 matches:
other_matches = n_known_face_names[top_3_matches]
print(other_matches)
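If you also want to see how close each candidate is to the 0.60 tolerance, a small additional sketch (same variables as above) that prints each of the top 3 names together with its distance:

for name, dist in zip(n_known_face_names[top_3_matches], n_face_distances[top_3_matches]):
    print('{}: {:.3f}'.format(name, dist))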
I'm new to Python and new to OpenCV, which I'm going to use for my master's thesis, and I've already run into some problems with OpenCV's VideoCapture object.
Situation:
I have two folders containing corresponding images (taken with RGB and infrared cameras). I want to display them side by side in a window using a while loop. The problem arises when some images are missing from one of the image sequences (due to problems while recording or whatever; I don't really know, but it shouldn't matter here). My idea was to use the bool return value of the .read() function to check whether there is a frame to be read and, if not, replace the image with a black one. This is what I did:
Code:
import cv2
import numpy as np

# the paths to the folders containing the images
pathRGB = "Bilder/RGB"
pathIR = "Bilder/IR"

# setting up the VideoCapture elements with the matching filename format
capRGB = cv2.VideoCapture(pathRGB + "/frame_%06d.jpg")
capIR = cv2.VideoCapture(pathIR + "/frame_%06d.jpg")

# get the shape of the first image in each folder to later create the black dummy image
shapeRGB = capRGB.read()[1].shape
shapeIR = capIR.read()[1].shape

# get the dtype of the first image in each folder to later create the black dummy image
dtypeRGB = capRGB.read()[1].dtype
dtypeIR = capIR.read()[1].dtype

if capRGB.isOpened() is False:
    print("Error opening RGB images")
if capIR.isOpened() is False:
    print("Error opening IR images")

cv2.namedWindow("frames", cv2.WINDOW_NORMAL)
while capRGB.isOpened() and capIR.isOpened() is True:
    # read both images
    retRGB, imgRGB = capRGB.read()
    retIR, imgIR = capIR.read()

    # if there is no IR image, create a dummy one
    if retRGB is True and retIR is False:
        imgIR = np.zeros(shapeIR, dtype=dtypeIR)

    # if there is no RGB image, create a dummy one
    if retIR is True and retRGB is False:
        imgRGB = np.zeros(shapeRGB, dtype=dtypeRGB)

    if retRGB is False and retIR is False:
        break

    # put both images together
    imgCombined = np.hstack((imgRGB, imgIR))
    cv2.imshow("frames", imgCombined)
    k = cv2.waitKey(1)
    if k == ord("q"):
        break

capRGB.release()
capIR.release()
cv2.destroyAllWindows()
Problem:
From my understanding, the problem arises when capIR.read() attempts to read a missing image (in my case the 527th): instead of just returning False/None, it tries to read the same image over and over again. Up to the missing frame everything works fine, and the right "IR" image even turns black, but then the playback begins to slow down. While I can still close the window by pressing 'q', the Spyder IDE freezes, and if I wait too long I even have to shut it down. The console prints "[image2 @ 000002a7af8f0480] Could not open file : Bilder/IR/frame_000527.jpg" over and over again, so much that I can't scroll to the top.
I guess what I'm asking is: Is there any way to make the .read() function just attempt 1 read and after it fails continue with the next frame?
Best regards and thank you very much in advance!
Simulated for testing with different files and directory names. The code below retrieves the largest frame number from both directories and then iterates over all frame numbers, reading the files from both directories.
import os
import cv2
import re
import glob

image_dir1 = 'test1'
image_dir2 = 'test2'

# retrieve all frames in both directories
frames_cap1 = glob.glob(os.path.join(image_dir1, "frame_*.jpg"))
frames_cap2 = glob.glob(os.path.join(image_dir2, "frame_*.jpg"))

# sort ascending
frames_cap1.sort()
frames_cap2.sort()

# retrieve last frame No for both directories
last_frame_cap1 = frames_cap1[-1]
last_frame_cap2 = frames_cap2[-1]

# extract integer counter as a group
# modify regex to match file name if required
match_cap1 = re.search(r'frame_(\d+)\.jpg', last_frame_cap1)
match_cap2 = re.search(r'frame_(\d+)\.jpg', last_frame_cap2)
last_frame_no_cap1 = int(match_cap1.group(1))
last_frame_no_cap2 = int(match_cap2.group(1))

# retrieve max frame No
max_frame_no = max(last_frame_no_cap1, last_frame_no_cap2)

for i in range(max_frame_no + 1):
    # adapt formatting of frame number to digit count in file name
    # here: 6 digits with leading zeros
    image_path_cap1 = os.path.join(image_dir1, f"frame_{i:06d}.jpg")
    image_path_cap2 = os.path.join(image_dir2, f"frame_{i:06d}.jpg")

    if not os.path.isfile(image_path_cap1):
        print(f"handle missing file: '{image_path_cap1}'")
        # ...
    else:
        img1 = cv2.imread(image_path_cap1)
        # …

    if not os.path.isfile(image_path_cap2):
        print(f"handle missing file: '{image_path_cap2}'")
        # ...
    else:
        img2 = cv2.imread(image_path_cap2)
        # …
    # …
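To tie this back to the side-by-side display from the question, here is a rough sketch of how the loop could substitute a black dummy image for missing files (it reuses image_dir1, image_dir2 and max_frame_no from above, and assumes frames from both folders have the same size, as the question's code does):

import os
import cv2
import numpy as np

dummy = None  # created lazily from the first frame that is actually read
cv2.namedWindow("frames", cv2.WINDOW_NORMAL)
for i in range(max_frame_no + 1):
    imgs = []
    for image_dir in (image_dir1, image_dir2):
        path = os.path.join(image_dir, f"frame_{i:06d}.jpg")
        imgs.append(cv2.imread(path) if os.path.isfile(path) else None)
    if all(img is None for img in imgs):
        continue  # frame number missing in both folders
    if dummy is None:
        ref = next(img for img in imgs if img is not None)
        dummy = np.zeros(ref.shape, dtype=ref.dtype)
    imgs = [img if img is not None else dummy for img in imgs]
    cv2.imshow("frames", np.hstack(imgs))
    if cv2.waitKey(1) == ord("q"):
        break
cv2.destroyAllWindows()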
Assuming that the images in directory1 have the same names as the images in directory2, but knowing that some images may not be present in both directories...
import glob, os, cv2
import numpy as np

path1 = "folder1/"
path2 = "folder2/"

# change directory to path1
os.chdir(path1)
l1 = glob.glob("*.jpg")  # get a list of image names
os.chdir("../")          # go one directory up
blackimg = cv2.imread("blackimg.jpg")

for fname in l1:
    # check if image1 exists, then read it; otherwise im1 = blackimg
    if os.path.isfile(path1 + fname):
        im1 = cv2.imread(path1 + fname)
    else:
        im1 = blackimg

    # check if image2 exists, then read it; otherwise im2 = blackimg
    if os.path.isfile(path2 + fname):
        im2 = cv2.imread(path2 + fname)
    else:
        im2 = blackimg

    imgCombined = np.hstack((im1, im2))
    cv2.imshow("Combined", imgCombined)
    print("press any key to continue, q to exit")
    k = cv2.waitKey(0)
    if k == ord("q"):
        break

cv2.destroyAllWindows()
I have never done any video-based programming before, and although this SuperUser post provides a way to do it on the command line, I prefer a programmatic approach, preferably with Python.
I have a bunch of sub-videos. Suppose one of them is called 1234_trimmed.mp4 which is a short segment cut from the original, much-longer video 1234.mp4. How can I figure out the start and end timestamps of 1234_trimmed.mp4 inside 1234.mp4?
FYI, the videos are all originally on YouTube anyway ("1234" corresponds to the YouTube video ID) if there's any shortcut that way.
I figured it out myself with cv2. My strategy was to get the first and last frames of the subvideo and iterate over each frame of the original video, comparing the current frame's dhash (using minimum Hamming distance instead of checking for equality, to allow for resizing and other transformations) against the hashes of those first and last frames. I'm sure there are some optimization opportunities, but I need this yesterday.
import cv2

original_video_fpath = '5 POPULAR iNSTAGRAM BEAUTY TRENDS (DiY Feather Eyebrows, Colored Mascara, Drippy Lips, Etc)-vsNVU7y6dUE.mp4'
subvideo_fpath = 'vsNVU7y6dUE_trimmed-out.mp4'


def dhash(image, hashSize=8):
    # resize the input image, adding a single column (width) so we
    # can compute the horizontal gradient
    resized = cv2.resize(image, (hashSize + 1, hashSize))
    # compute the (relative) horizontal gradient between adjacent
    # column pixels
    diff = resized[:, 1:] > resized[:, :-1]
    # convert the difference image to a hash
    return sum([2 ** i for (i, v) in enumerate(diff.flatten()) if v])


def hamming(a, b):
    return bin(a ^ b).count('1')


def get_video_frame_by_index(video_cap, frame_index):
    # get total number of frames
    totalFrames = video_cap.get(cv2.CAP_PROP_FRAME_COUNT)
    if frame_index < 0:
        frame_index = int(totalFrames) + frame_index
    # check for valid frame number
    if frame_index >= 0 and frame_index <= totalFrames:
        # set frame position
        video_cap.set(cv2.CAP_PROP_POS_FRAMES, frame_index)
    _, frame = video_cap.read()
    return frame


def main():
    cap_original_video = cv2.VideoCapture(original_video_fpath)
    cap_subvideo = cv2.VideoCapture(subvideo_fpath)

    first_frame_subvideo = get_video_frame_by_index(cap_subvideo, 0)
    last_frame_subvideo = get_video_frame_by_index(cap_subvideo, -1)

    # hash the grayscale frames so the subvideo and the original are compared consistently
    first_frame_subvideo_gray = cv2.cvtColor(first_frame_subvideo, cv2.COLOR_BGR2GRAY)
    last_frame_subvideo_gray = cv2.cvtColor(last_frame_subvideo, cv2.COLOR_BGR2GRAY)
    hash_first_frame_subvideo = dhash(first_frame_subvideo_gray)
    hash_last_frame_subvideo = dhash(last_frame_subvideo_gray)

    min_hamming_dist_with_first_frame = float('inf')
    closest_frame_index_first = None
    closest_frame_timestamp_first = None

    min_hamming_dist_with_last_frame = float('inf')
    closest_frame_index_last = None
    closest_frame_timestamp_last = None

    frame_index = 0
    while cap_original_video.isOpened():
        frame_exists, curr_frame = cap_original_video.read()
        if frame_exists:
            timestamp = cap_original_video.get(cv2.CAP_PROP_POS_MSEC) // 1000
            curr_frame_gray = cv2.cvtColor(curr_frame, cv2.COLOR_BGR2GRAY)
            hash_curr_frame = dhash(curr_frame_gray)
            hamming_dist_with_first_frame = hamming(hash_curr_frame, hash_first_frame_subvideo)
            hamming_dist_with_last_frame = hamming(hash_curr_frame, hash_last_frame_subvideo)
            if hamming_dist_with_first_frame < min_hamming_dist_with_first_frame:
                min_hamming_dist_with_first_frame = hamming_dist_with_first_frame
                closest_frame_index_first = frame_index
                closest_frame_timestamp_first = timestamp
            if hamming_dist_with_last_frame < min_hamming_dist_with_last_frame:
                min_hamming_dist_with_last_frame = hamming_dist_with_last_frame
                closest_frame_index_last = frame_index
                closest_frame_timestamp_last = timestamp
            frame_index += 1
        else:
            print('processed {} frames'.format(frame_index))
            break

    cap_original_video.release()
    cap_subvideo.release()
    print('timestamp_start={}, timestamp_end={}'.format(closest_frame_timestamp_first,
                                                         closest_frame_timestamp_last))


if __name__ == '__main__':
    main()
MP4 uses relative timestamps. When the file was trimmed, the old timestamps were lost, and the new file now begins at timestamp zero.
So the only way to identify where this file may overlap with another file is to use computer vision or perceptual hashing. Both options are too complex to describe in a single Stack Overflow answer.
If they were simply -codec copy'd, the timestamps should be as they were in the original file. If they weren't, ffmpeg is not the tool for the job. In that case, you should look into other utilities that can find an exactly matching video and audio frame in both files and get the timestamps from there.
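If you want to check whether a trim preserved the original timeline, a small sketch (assuming ffprobe is installed and on the PATH) that reads the container-level start_time; a file whose timestamps were reset will report 0 (or close to it):

import subprocess

def container_start_time(path):
    # ask ffprobe for the format-level start_time, in seconds
    result = subprocess.run(
        ["ffprobe", "-v", "error",
         "-show_entries", "format=start_time",
         "-of", "default=noprint_wrappers=1:nokey=1",
         path],
        capture_output=True, text=True, check=True)
    return float(result.stdout.strip())

print(container_start_time("1234_trimmed.mp4"))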
I am starting to play with TensorFlow and am facing the following problem. I am trying to run an example that does image recognition based on the Stanford Dogs Dataset, and I am stuck at the step of converting the images and labels into TFRecords files.
In the image dataset folder there are 120 sub-folders, one for each breed (label).
If I run the code below with just one sub-folder it runs fine (actually, I didn't try to read the TFRecords file), but if I include a second sub-folder the process kills the Python kernel.
Here is the code I am running
import glob
import tensorflow as tf
from itertools import groupby
from collections import defaultdict

image_filenames = glob.glob(r'C:\Users\Administrator\Documents\Tensorflow\images\n02*\*.jpg')

training_dataset = defaultdict(list)
testing_dataset = defaultdict(list)

# Split up the filename into its breed and corresponding filename. The breed is found by taking the directory name.
image_filename_with_breed = map(lambda filename: (filename.split("\\")[6], filename), image_filenames)

# Group each image by the breed, which is the 0th element in the tuple returned above
for dog_breed, breed_images in groupby(image_filename_with_breed, lambda x: x[0]):
    # Enumerate each breed's image and send ~20% of the images to a testing set
    for i, breed_image in enumerate(breed_images):
        if i % 5 == 0:
            testing_dataset[dog_breed].append(breed_image[1])
        else:
            training_dataset[dog_breed].append(breed_image[1])

    # Check that each breed includes at least 18% of the images for testing
    breed_training_count = len(training_dataset[dog_breed])
    breed_testing_count = len(testing_dataset[dog_breed])
    assert round(breed_testing_count / (breed_training_count + breed_testing_count), 2) > 0.18, 'Not enough testing data'

sess = tf.Session()


def write_records_file(dataset, record_location):
    """
    Fill a TFRecords file with the images found in `dataset` and include their category.

    Parameters
    ----------
    dataset : dict(list)
        Dictionary with each key being a label for the list of image filenames of its value.
    record_location : str
        Location to store the TFRecord output.
    """
    writer = None

    # Enumerating the dataset because the current index is used to break up the files if they get over 100
    # images to avoid a slowdown in writing.
    current_index = 0
    for breed, images_filenames in dataset.items():
        for image_filename in images_filenames:
            print(image_filename)
            if current_index % 100 == 0:
                if writer:
                    writer.close()
                record_filename = "{record_location}-{current_index}.tfrecords".format(
                    record_location=record_location,
                    current_index=current_index)
                print(record_filename)
                writer = tf.python_io.TFRecordWriter(record_filename)
            current_index += 1

            image_file = tf.read_file(image_filename)

            # In ImageNet dogs, there are a few images which TensorFlow doesn't recognize as JPEGs. This
            # try/except will ignore those images.
            try:
                image = tf.image.decode_jpeg(image_file)
            except:
                print(image_filename)
                continue

            # Converting to grayscale saves processing and memory but isn't required.
            grayscale_image = tf.image.rgb_to_grayscale(image)
            resized_image = tf.image.resize_images(grayscale_image, (250, 151))

            # tf.cast is used here because the resized images are floats but haven't been converted into
            # image floats where an RGB value is between [0, 1).
            image_bytes = sess.run(tf.cast(resized_image, tf.uint8)).tobytes()

            # Instead of using the label as a string, it'd be more efficient to turn it into either an
            # integer index or a one-hot encoded rank one tensor.
            # https://en.wikipedia.org/wiki/One-hot
            image_label = breed.encode("utf-8")

            example = tf.train.Example(features=tf.train.Features(feature={
                'label': tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_label])),
                'image': tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_bytes]))
            }))
            writer.write(example.SerializeToString())
            writer.close()


write_records_file(testing_dataset, r'C:\Users\Administrator\Documents\Tensorflow\TRF\testing_images')
write_records_file(training_dataset, r'C:\Users\Administrator\Documents\Tensorflow\TRF\training_images')
I monitored the memory usage, and running the script does not seem to consume too much memory. I tried this in two virtual machines, one with Ubuntu and the other with Windows 2000.
Does anyone have an idea?
Thanks!
I found the problem. It was the writer.close() statement that was incorrectly indented. It should be indented at the level of the first for loop, but I had indented it inside the second loop.
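For clarity, a minimal sketch of the structure with the close moved out of the per-image loop (details elided; here it is placed after both loops, which closes only the final writer):

def write_records_file(dataset, record_location):
    writer = None
    current_index = 0
    for breed, images_filenames in dataset.items():
        for image_filename in images_filenames:
            # ... open a new writer every 100 images, build and write the example ...
            pass
    # close the last writer once all images are written, not once per image
    if writer:
        writer.close()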
I am trying to write a script to remove cells from a part in ABAQUS if the cell volume is smaller than a given value.
Is there a simple command to delete a cell?
This is what I have tried:
# Keep cells bigger than a certain minimum value 'paramVol': paramVol = volCell / part_volume_r
cellsVolume = []
pfacesInter_clean = []
allCells = pInterName.cells
mask_r = pInter.cells.getMask()
cellobj_sequence_r = pInter.cells.getSequenceFromMask(mask=mask_r)
part_volume_r = pInterName.getVolume(cells=cellobj_sequence_r)
volume_sliver = 0
# get faces
for i in range(0, len(allCells)):
    volCell = allCells[i].getSize()
    cellsVolume.append(volCell)
    paramVol = volCell / part_volume_r
    print 'paramVol= ' + str(paramVol)
    if paramVol < 0.01:
        print 'sliver Volume'
        #session.viewports['Viewport: 1'].setColor(initialColor='#FF0000')  # --> RED
        faces = allCells[i].getFaces()
        highlight(allCells[i].getFaces())
        #pfacesInter_clean = [x for i, x in enumerate(pfacesInter) if i not in faces]
        volume_sliver += volCell
    else:
        print 'Not a sliver Volume'
Thanks!
How about this, assuming pInter is a Part object:
pInter.RemoveFaces(faceList=[pInter.faces[j] for j in pInter.cells[i].getFaces()])
Update: once the common face of two cells is deleted, both cells cease to exist. Therefore, we need to do a little workaround:
faces_preserved = # List of faces that belong to cells with 'big' volume.
for cell in pInter.cells:
    pInter.RemoveFaces(faceList=[face for face in pInter.faces if
                                 face not in faces_preserved])
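A possible way to build faces_preserved from the volume check in the question might be (a sketch reusing allCells and part_volume_r from the question; the exact Abaqus calls may need adjusting for your model):

faces_preserved = []
for i in range(len(allCells)):
    volCell = allCells[i].getSize()
    if volCell / part_volume_r >= 0.01:
        # getFaces() returns indices into the part's face repository
        faces_preserved.extend([pInter.faces[j] for j in allCells[i].getFaces()])

pInter.RemoveFaces(faceList=[face for face in pInter.faces if face not in faces_preserved])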
I'm trying to scramble all the pixels in an image, and my implementation of Knuth's shuffle (as well as someone else's) seems to fail. It appears to be shuffling within each row only. I cannot work out why; I just can't see it.
Here is what happens:
Which ain't very scrambly! Well, it could be more scrambly, and more scrambly it needs to be.
Here's my code:
import Image
from numpy import *

file1 = "lhooq"
file2 = "kandinsky"


def shuffle(ary):
    a = len(ary)
    b = a - 1
    for d in range(b, 0, -1):
        e = random.randint(0, d)
        ary[d], ary[e] = ary[e], ary[d]
    return ary


for filename in [file1, file2]:
    fid = open(filename + ".jpg", 'r')
    im = Image.open(fid)

    # turn into array
    data = array(im)
    shape = data.shape
    data = data.reshape((shape[0] * shape[1], shape[2]))

    # Knuth Shuffle
    data = shuffle(data)
    data = data.reshape(shape)

    imout = Image.fromarray(data)
    imout.show()
    fid.close()
When ary is a 2D array, ary[d] is a view into that array rather than a copy of its contents.
Therefore, ary[d], ary[e] = ary[e], ary[d] ends up acting like ary[d] = ary[e]; ary[e] = ary[e]: the right-hand side holds views rather than copies of the pixel values, so after the first assignment overwrites row d with row e's data, the second assignment copies that same data back into row e, and row e never changes.
To solve this, you can use advanced indexing:
ary[[d,e]] = ary[[e,d]]
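A sketch of the corrected shuffle, keeping the question's star import so random refers to numpy.random (note that NumPy's randint excludes the upper bound, so d + 1 is used to allow a self-swap):

def shuffle(ary):
    # Fisher-Yates shuffle of the rows of a 2D array, in place
    for d in range(len(ary) - 1, 0, -1):
        e = random.randint(0, d + 1)
        # advanced indexing makes copies, so the two rows really are swapped
        ary[[d, e]] = ary[[e, d]]
    return ary

numpy.random.shuffle(data) would also shuffle the rows in place, if you don't need your own implementation.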