Problem with imgIdx in DMatch class using FlannBasedMatcher in Python

I have the same issue as here:
how to access best image corresponding to best keypoint match using opencv flannbasedmatcher and dmatch
Unfortunately, this post doesn't have an answer.
I have several images (and corresponding descriptors), that I add to the FlannBasedMatcher, using the 'add' method (once for each set of descriptors, corresponding to a single image).
However, when I match an image, the returned imgIdx is way larger than the number of images in the training set. It seems like each descriptor is treated as a separate image, but this is not what I want.
I want to know which image (or set of descriptors) each feature has been matched to.
Here is a part of my code (I simplified it a bit, and I know 'test' is not great for a variable name, but it's temporary).
Also here I read .key files, which are basically files containing keypoints and descriptors of an image (extracted with SIFT).
I should also mention that in the following code, featMatch is just a class I created that wraps a FlannBasedMatcher (with its initialization parameters).
with open(os.path.join(ROOT_DIR,"images\\descriptor_list.txt"),'r') as f:
    for line in f:
        folder_path = os.path.join(ROOT_DIR,"images\\",line[:-1]+"\\","*.key")
        list_key = glob.glob(folder_path)
        test2 = []
        for key in list_key:
            if os.path.isfile(key):
                feat = Features()
                feat.readFromFile(key)
                test = feat.descriptors
                test2 = test2+test
        featMatch.add(test2)

# Read submitted picture features
feat = Features()
feat.readFromFile(os.path.join(ROOT_DIR,"submitted_picture\\sub.key"))

matches = []
matches.append(featMatch.knnMatch(np.array(feat.descriptors), k=3))
print(matches)
I was expecting, when looking at the matches, and more specifically at the imgIdx of each match, to be told which image index the matched training feature (trainIdx) corresponds to, based on the number of descriptor sets I added with the 'add' method.
Following this assumption, I should never get an imgIdx larger than the number of images (or descriptor sets) in my training set.
However, here, I get numbers such as 2960, while I only have about 5 images in my training set.
My guess is that it returns the feature index instead of the image index, but I don't know why.
I noticed that the 'add' method in C++ takes an array of arrays, i.e. a list of descriptor sets (one for each image, I guess). But here I have a different number of features for each image, so I can't really create a single numpy array with a different number of rows per image.
Thanks.

I finally figured it out after looking at the C++ source code of matcher.cpp:
https://github.com/opencv/opencv/blob/master/modules/features2d/src/matchers.cpp
I'm gonna post the answer, in case somebody needs it someday.
I thought that the 'add' method would increment the image count when called, but it does not. So, I realized that I have to create a list of Mat (or numpy array in python) and give it once to 'add', instead of calling it for each image.
So here is the updated (and working) source code:
with open(os.path.join(ROOT_DIR,"images\\descriptor_list.txt"),'r') as f:
    list_image_descriptors = []
    for line in f:
        folder_path = os.path.join(ROOT_DIR,"images\\",line[:-1]+"\\","*.key")
        list_key = glob.glob(folder_path)
        for key in list_key:
            if os.path.isfile(key):
                feat = Features()
                feat.readFromFile(key)
                # One 2D descriptor array per image
                img_descriptors = np.array(feat.descriptors)
                list_image_descriptors.append(img_descriptors)
    # Call 'add' once, with the whole list of per-image descriptor arrays
    featMatch.add(list_image_descriptors)

# Read submitted picture features
feat = Features()
feat.readFromFile(os.path.join(ROOT_DIR,"submitted_picture\\sub.key"))

matches = []
matches.append(featMatch.knnMatch(np.array(feat.descriptors), k=3))
print(matches)
Hope this helps.
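As a follow-up, here is a rough sketch (my own, not part of the original post) of how you could use imgIdx once the matcher is set up this way: count votes per training image to find which image the submitted picture matches best. It assumes featMatch, matches, and feat are defined as above.

from collections import Counter

# Tally how many of the submitted picture's features matched each training image
votes = Counter()
for knn_group in matches[0]:      # knnMatch returns one list of DMatch per query descriptor
    for m in knn_group:
        votes[m.imgIdx] += 1      # imgIdx now indexes into list_image_descriptors

best_image_idx, n_votes = votes.most_common(1)[0]
print("Most likely training image:", best_image_idx, "with", n_votes, "votes")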

Related

Load a part of Glove vectors with gensim

I have a word list like ['like', 'Python'] and I want to load the pre-trained GloVe word vectors for these words, but the GloVe file is too large. Is there any fast way to do it?
What I tried
I iterated through each line of the file to see if the word is in the list and added it to a dict if so. But this method is a little slow.
import pandas as pd

def readWordEmbeddingVector(Wrd):
    f = open('glove.twitter.27B/glove.twitter.27B.200d.txt','r')
    words = []
    a = f.readline()
    while a != '':
        vector = a.split()
        if vector[0] in Wrd:
            words.append(vector)
            Wrd.remove(vector[0])
        a = f.readline()
    f.close()
    words_vector = pd.DataFrame(words).set_index(0).astype('float')
    return words_vector
I also tried the following, but it loaded the whole file instead of just the vectors I need:
gensim.models.keyedvectors.KeyedVectors.load_word2vec_format('word2vec.twitter.27B.200d.txt')
What I want
A method like gensim.models.keyedvectors.KeyedVectors.load_word2vec_format, but where I can pass a word list and load only those vectors.
There's no existing gensim support for filtering the words loaded via load_word2vec_format(). The closest is an optional limit parameter, which can be used to limit how many word-vectors are read (ignoring all subsequent vectors).
You could conceivably create your own routine to perform such filtering, using the source code for load_word2vec_format() as a model. As a practical matter, you might have to read the file twice: 1st, to find out exactly how many words in the file intersect with your set-of-words-of-interest (so you can allocate the right-sized array without trusting the declared size at the front of the file), then a second time to actually read the words-of-interest.
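For what it's worth, here is a rough sketch of such a filtering routine (my own, not gensim's API): it reads the GloVe text format once and returns a plain dict of numpy vectors for the words of interest, which avoids having to pre-size an array at all.

import numpy as np

def load_filtered_glove(path, wanted):
    wanted = set(wanted)
    vectors = {}
    with open(path, 'r', encoding='utf-8') as f:
        for line in f:
            parts = line.rstrip().split(' ')
            word = parts[0]
            if word in wanted:
                vectors[word] = np.asarray(parts[1:], dtype='float32')
                if len(vectors) == len(wanted):
                    break  # stop early once every wanted word has been found
    return vectors

# Usage:
# vecs = load_filtered_glove('glove.twitter.27B/glove.twitter.27B.200d.txt', ['like', 'Python'])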

feature descriptor for a given 2d point in an image

Trying to get a descriptor for a predefined point using python opencv3. The goal is to provide a set of points for a given image and get their corresponding feature descriptors. I'm open to using SIFT, SURF, Brief, ORB, and basically any descriptor. However, I do not want to use any of the detection methods provided. I have created the following:
feat_object = cv2.xfeatures2d.BriefDescriptorExtractor_create()
# define keypoint for a single 2d point
pt = cv2.KeyPoint(point[0,0],point[1,0], 10)
# create feature descriptor
out = feat_object.compute(frame, pt)
However, I get the following error.
----> out = feat_object.compute(frame, pt)
SystemError: error return without exception set
Any suggestions?
Ok, resolving the matter ended up being simple. The correct code snippet looks like the following:
feat_object = cv2.xfeatures2d.BriefDescriptorExtractor_create()
# define keypoint for a single 2d point
pt = [cv2.KeyPoint(point[0,0], point[1,0], 10)]
# create feature descriptor
out = feat_object.compute(frame, pt)
with frame defined as a grayscale image and pt being a list of keypoints. So even if you only want to process a single keypoint, you are still required to pass it in as a list.
I've only tested this out in opencv2.
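To illustrate the same idea with a self-contained snippet, here is my own sketch using ORB from the core module (so xfeatures2d isn't required); the image path and point coordinates are placeholders:

import cv2

frame = cv2.imread('some_image.png', cv2.IMREAD_GRAYSCALE)
points = [(50.0, 60.0), (120.0, 200.0)]              # predefined 2D points (x, y)
kps = [cv2.KeyPoint(x, y, 10) for (x, y) in points]  # keypoint size 10 px, no detector involved

orb = cv2.ORB_create()
kps, descriptors = orb.compute(frame, kps)  # in Python, compute returns (keypoints, descriptors)
print(descriptors.shape)                    # one descriptor row per keypoint that survived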

Theano shared updating last element in python

I have a shared variable persistent_vis_chain which is being updated by a theano function, where the update expression comes from a theano.scan. But that's not the problem, just back story.
My shared variable looks like D = [image1, ..., imageN] where each image is [x1, x2, ..., x784].
What I want to do is take the average of all the images and put it into the last image, imageN. That is, I want to sum all the images except the last one, which will result in [s1, s2, ..., s784], and then set imageN = [s1/len(D), s2/len(D), ..., s784/len(D)].
So my problem is that I do not know how to do this with theano.shared, and maybe it's a gap in my understanding of theano functions and of doing this computation with symbolic variables. Any help would be greatly appreciated.
If you have N images, each of shape 28x28=784 then, presumably, your shared variable has shape (N,28,28) or (N,784)? This method should work with either shape.
Given D is your shared variable containing your image data, D.mean(axis=0, keepdims=True) will give you the average image symbolically.
It's unclear whether you want to change the final image to equal the mean image (which sounds like a strange thing to do), or whether you want to append an additional (N+1)th image to the shared variable. For the former you could do something like this:
D = theano.shared(load_D_data())
D_update_expression = do_something_with_scan_to_get_D_update_expression(D)
updates = [(D, T.concatenate([D_update_expression[:-1],
                              D_update_expression.mean(axis=0, keepdims=True)],
                             axis=0))]
f = theano.function(..., updates=updates)
If you want to do the latter (add an additional image), change the updates line as follows:
updates = [(D, T.concatenate([D_update_expression,
                              D_update_expression.mean(axis=0, keepdims=True)],
                             axis=0))]
Note that this code is intended as a guide. It may not work as it stands (e.g. you may need to mess with the axis= parameter in the T.concatenate command).
The point is that you need to construct a symbolic expression explaining what the new value for D looks like. You want it to be a combination of the updates from scan plus this additional average thing. T.concatenate allows you to combine those two parts together.
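As a sanity check, here is a toy, self-contained version of the former update (my own sketch, with no scan involved) that simply replaces the last row of D with the mean of all rows:

import numpy as np
import theano
import theano.tensor as T

# Five fake "images" of 784 pixels each
D = theano.shared(np.random.rand(5, 784).astype(theano.config.floatX))

new_D = T.concatenate([D[:-1], D.mean(axis=0, keepdims=True)], axis=0)
f = theano.function([], updates=[(D, new_D)])

old_mean = D.get_value().mean(axis=0)
f()
print(np.allclose(D.get_value()[-1], old_mean))  # True: last row now holds the previous mean image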

Quickly determining using Python whether an image is (fuzzily) in a collection

Imagine that some new image X arrives, and I want to know if X is new or has already been encountered before. I have code, below, that shrinks the image and then converts it to a hash code. I can then see via a single hash look-up if I've already encountered an image with the same hash code, so it's very fast.
My question is, is there an efficient way for me to see if a similar image, but one with a different hash code, has already been seen? I was going to title this question something like "Data structure for determining efficiently whether a similar, non-identical item is already contained", but decided that would be an instance of the XY problem.
When I say that this new image is "similar," I'm thinking of one that's perhaps gone through lossy compression and so looks like the original to the human eye but is not identical. Normally shrinking the image eliminates the difference, but not always, and if I shrink the image too much I start getting false positives.
Here's my current code:
import PIL.Image

seen_images = {}  # This would really be a shelf or something

# From http://www.guguncube.com/1656/python-image-similarity-comparison-using-several-techniques
def image_pixel_hash_code(image):
    pixels = list(image.getdata())
    avg = sum(pixels) / len(pixels)
    bits = "".join(map(lambda pixel: '1' if pixel < avg else '0', pixels))  # '00010100...'
    hexadecimal = int(bits, 2).__format__('016x').upper()
    return hexadecimal

def process_image(filepath):
    thumb = PIL.Image.open(filepath).resize((128,128)).convert("L")
    code = image_pixel_hash_code(thumb)
    previous_image = seen_images.get(code, None)
    if code in seen_images:
        print "'{}' already seen as '{}'".format(filepath, previous_image)
    else:
        seen_images[code] = filepath
You can put a path to a bunch of image files into a variable called IMAGE_ROOT and then try my code out with:
import os

for root, dirs, files in os.walk(IMAGE_ROOT):
    for filename in files:
        filepath = os.path.join(root, filename)
        try:
            process_image(filepath)
        except IOError:
            pass
There are a lot of methods for comparing images, but for your given example I suspect that simplicity and speed are the key factors (hence why you're trying to use a hash as a first pass). Here are some suggestions; in all cases I'd suggest shrinking and cropping the images to a regular size and shape.
1. Smooth the image (Gaussian blur) before shrinking to minimise the influence of artefacts, then apply the hash or other comparison.
2. Subtract the images from one another (RGB) and check the remainder. Identical images will return zero, while compression artefacts will result in minor variations. You can threshold, sum, or average the values and compare to a cut-off.
3. Use standard distance algorithms (see scipy.spatial.distance) to calculate the 'distance' between two images. For example, Euclidean distance will give effectively the same result as summing the subtraction, while cosine distance will ignore intensity but match the profile of changes over the image, i.e. a darker version of the same image will be considered equivalent. For these you will need to flatten each image to a 1D array.
The last two entail comparing every image to every other image when uploading, and that is going to get very computationally expensive for large numbers of images (a rough sketch of options 1 and 3 combined follows below).
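Here is a minimal sketch of suggestions 1 and 3 (my own, assuming Pillow, numpy, and scipy are available; the cutoff value is arbitrary and needs tuning):

import numpy as np
from PIL import Image, ImageFilter
from scipy.spatial import distance

def fingerprint(filepath, size=(32, 32)):
    # Blur first to wash out compression artefacts, then shrink and flatten
    img = Image.open(filepath).convert("L")
    img = img.filter(ImageFilter.GaussianBlur(radius=2))
    img = img.resize(size)
    return np.asarray(img, dtype=float).ravel()

def is_similar(fp_a, fp_b, cutoff=0.05):
    # Cosine distance ignores overall brightness; 0 means identical profiles
    return distance.cosine(fp_a, fp_b) < cutoff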

Pixel to pixel edit using PIL and Image.point

I can't seem to understand what Image.point does. I want to do some pixel editing, which might include checking which colour value (r, g, or b) is the max in every pixel and acting accordingly. Let's say that I can't use numpy. I managed to use Image.point to add the same value to every pixel in an image.
point code
import Image, math

def brightness(i, value):
    value = math.floor(255*(float(value)/100))
    return i+value

if __name__ == '__main__':
    image = '/home/avlahop/verybig.jpg'
    print image
    img = Image.open(image)
    print img
    out = img.point(lambda i: brightness(i, 50))
    out.show()
numpy code
import Image
import numpy as np

def brightness(arr, adjust):
    import math
    adjust = math.floor(255*(float(adjust)/100))
    arr[...,0] += adjust
    arr[...,1] += adjust
    arr[...,2] += adjust
    return arr

if __name__ == '__main__':
    image = '/home/avlahop/verybig.jpg'
    img = Image.open(image).convert('RGBA')
    arr = np.array(np.asarray(img).astype('float'))
    new_image = Image.fromarray(brightness(arr, 50).clip(0,255).astype('uint8'), 'RGBA')
    new_image.show()
I have to say that the point code is faster than the numpy code. But what if I want to do a more complex operation with point, for example checking max(r, g, b) for every pixel and doing something depending on whether r, g, or b is the max? As you saw, I used point with a function as its argument. It takes one argument, i. What is this i? Is it the pixel (i.e. i = (r, g, b))? I can't seem to understand this from the PIL documentation.
The docs may not have been clear in earlier versions of PIL, but in Pillow it's spelled out pretty well. From Image.point:
lut – A lookup table, containing 256 values per band in the image. A function can be used instead, it should take a single argument. The function is called once for each possible pixel value, and the resulting table is applied to all bands of the image.
In other words, it's not a general-purpose way to map each pixel through a function; it's just a way to dynamically build the lookup table, instead of passing in a pre-built one.
That is, it's called with the numbers from 0 through 255. (Which you can find out for yourself pretty easily by just writing a function that appends its argument to a global list and then dumping out the list at the end…)
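A quick sketch (mine, not from the docs) that demonstrates this for a single-band image:

from PIL import Image

seen = []
img = Image.new("L", (4, 4))
img.point(lambda i: seen.append(i) or i)
print(len(seen), seen[:5])  # 256 [0, 1, 2, 3, 4] — one call per possible value, not per pixel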
If you split your image into separate bands or planes, point each one of them with a different function, and then recombine them, that might be able to accomplish what you're trying to do. But even then, I think eval is what you wanted, not point.
But I think what you really want is a pixel-by-pixel, all-bands-at-once iterator. And you don't need anything special for that: just use map or a comprehension over getdata. Isn't that slow? Of course it's slow, because it's calling your function X*Y times; the cost of building the getdata sequence and iterating over it is tiny compared to that cost, so looking for a way for PIL to optimize the already-fast-enough part won't get you very far.
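For instance, here is a rough sketch of that getdata approach (my own; pure PIL, no numpy), applying an arbitrary per-pixel rule that boosts whichever channel is largest:

from PIL import Image

def boost_dominant_channel(img, amount=30):
    img = img.convert("RGB")
    out_pixels = []
    for r, g, b in img.getdata():  # one (r, g, b) tuple per pixel
        m = max(r, g, b)
        out_pixels.append((
            min(r + amount, 255) if r == m else r,
            min(g + amount, 255) if g == m else g,
            min(b + amount, 255) if b == m else b,
        ))
    out = Image.new("RGB", img.size)
    out.putdata(out_pixels)
    return out

# boost_dominant_channel(Image.open('/home/avlahop/verybig.jpg')).show()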
