feature descriptor for a given 2d point in an image - python

Trying to get a descriptor for a predefined point using Python OpenCV 3. The goal is to provide a set of points for a given image and get their corresponding feature descriptors. I'm open to using SIFT, SURF, BRIEF, ORB, and basically any descriptor. However, I do not want to use any of the detection methods provided. I have created the following:
feat_object = cv2.xfeatures2d.BriefDescriptorExtractor_create()
# define keypoint for a single 2d point
pt = cv2.KeyPoint(point[0,0],point[1,0], 10)
# create feature descriptor
out = feat_object.compute(frame, pt)
However, I get the following error.
----> out = feat_object.compute(frame, pt)
SystemError: error return without exception set
Any suggestions?

Ok, resolving the matter ended up being simple. The correct code snippet looks like the following:
feat_object = cv2.xfeatures2d.BriefDescriptorExtractor_create()
# define keypoint for a single 2d point
pt = [cv2.KeyPoint(point[0,0],point[1,0], 10)]
# create feature descriptor
out = feat_object.compute(frame, pt)
with frame defined as a grayscale image and pt being a list of keypoints. So even if you only want to process a single keypoint, you are still required to pass it in as a list.
I've only tested this out in opencv2.
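For reference, here is a minimal, self-contained sketch of the same idea using ORB from the main cv2 module (so it works without the xfeatures2d contrib package); the file name and the coordinates below are placeholders:
import cv2
# load the image as grayscale, as compute() expects
frame = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)
# your own 2d points as (x, y) pixel coordinates
pts = [(120.0, 45.0), (300.5, 210.0)]
# wrap every point in a cv2.KeyPoint and pass them as a list, even for one point
keypoints = [cv2.KeyPoint(x, y, 10) for (x, y) in pts]
orb = cv2.ORB_create()
keypoints, descriptors = orb.compute(frame, keypoints)
print(descriptors.shape)  # one 32-byte descriptor row per keypoint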

Related

how can I extract the points of a vector shape in Krita using the python scripter?

I want to get the position of the points that make up my vector shape.
(Screenshots: https://i.stack.imgur.com/mfmGV.png and https://i.stack.imgur.com/J9Noq.png)
So far I have found out how to get the shape, but I don't know how to continue from there. I thought the QPoints are children of the QObjects (the vector shape), so I tried to get them with .children(), but the array I got was empty. Besides that, I couldn't find a function in the documentation that sounds like what I am looking for.
from krita import *
shapes = Krita.instance().activeDocument().activeNode().shapes()
Points = shapes.children()

Is there a way to extract pixel color information from an image with OpenImageIO's python bindings

I want to ask the more experienced people how to get the RGB values of a pixel in an image using oiio's python bindings.
I have only just begun using oiio and am unfamiliar with the library as well as image manipulation on code level.
I poked around the documentation and I don't quite understand how the parameters work. It doesn't seem to work the same way as Python, and I'm having a hard time trying to figure it out, as I don't know C.
A) What command to even use to get pixel information (seems like get_pixel could work)
and
B) How to get it to work. I'm not understanding the parameter requirements exactly.
Edit:
I'm trying to convert the C example of visiting all pixels to get an average color in the documentation into something pythonic, but am getting nowhere.
Would appreciate any help, thank you.
Edit: adding the code
buffer = oiio.ImageBuf('image.tif')
array = buffer.read(0, 0, True)
print buffer.get_pixels(array)
the error message I get is:
# Error: Python argument types in
# ImageBuf.get_pixels(ImageBuf, bool)
# did not match C++ signature:
# get_pixels(class OpenImageIO::v1_5::ImageBuf, enum OpenImageIO::v1_5::TypeDesc::BASETYPE)
# get_pixels(class OpenImageIO::v1_5::ImageBuf, enum OpenImageIO::v1_5::TypeDesc::BASETYPE, struct OpenImageIO::v1_5::ROI)
# get_pixels(class OpenImageIO::v1_5::ImageBuf, struct OpenImageIO::v1_5::TypeDesc)
# get_pixels(class OpenImageIO::v1_5::ImageBuf, struct OpenImageIO::v1_5::TypeDesc, struct OpenImageIO::v1_5::ROI)
OpenImageIO has several classes for dealing with images, with different levels of abstraction. Assuming that you are interested in the ImageBuf class, I think the simplest way to access individual pixels from Python (with OpenImageIO 2.x) would look like this:
import OpenImageIO as oiio
buf = oiio.ImageBuf("foo.jpg")
p = buf.getpixel(50, 50)  # x, y
print(p)
p will be a tuple of the pixel's channel values, so this will produce output like
(0.01148223876953125, 0.0030574798583984375, 0.0180511474609375)
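If you need more than one pixel (for example, the "average color of all pixels" case from the docs), a sketch along these lines should also work with OpenImageIO 2.x, where get_pixels hands back the whole image as a numpy array:
import OpenImageIO as oiio

buf = oiio.ImageBuf("foo.jpg")
# shape is (height, width, nchannels); note that numpy indexing is (y, x)
pixels = buf.get_pixels(oiio.FLOAT)
print(pixels[50, 50])            # pixel at y=50, x=50
print(pixels.mean(axis=(0, 1)))  # average color over the whole image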

Problem with imgIdx in DMatch class using FlannBasedMatcher in Python

I have the same issue as here:
how to access best image corresponding to best keypoint match using opencv flannbasedmatcher and dmatch
Unfortunately, this post doesn't have an answer.
I have several images (and corresponding descriptors) that I add to the FlannBasedMatcher using the 'add' method (once for each set of descriptors, each set corresponding to a single image).
However, when I match an image, the returned imgIdx is way larger than the number of images in the training set. I feel like each descriptor is treated as an image, but this is not what I want.
I want to know which image (or set of descriptors) each feature has been matched to.
Here is a part of my code (I simplified it a bit, and I know 'test' is not great for a variable name, but it's temporary).
Also here I read .key files, which are basically files containing keypoints and descriptors of an image (extracted with SIFT).
To be clear, in the following code featMatch is just an instance of a wrapper class I created around a FlannBasedMatcher (with its initialization parameters).
with open(os.path.join(ROOT_DIR,"images\\descriptor_list.txt"),'r') as f:
    for line in f:
        folder_path = os.path.join(ROOT_DIR,"images\\",line[:-1]+"\\","*.key")
        list_key = glob.glob(folder_path)
        test2 = []
        for key in list_key:
            if os.path.isfile(key):
                feat = Features()
                feat.readFromFile(key)
                test = feat.descriptors
                test2 = test2+test
        featMatch.add(test2)

# Read submitted picture features
feat = Features()
feat.readFromFile(os.path.join(ROOT_DIR,"submitted_picture\\sub.key"))

matches = []
matches.append(featMatch.knnMatch(np.array(feat.descriptors), k=3))
print(matches)
I was expecting, when looking at the matches, and more specifically at the imgIdx of each match, to be told which image index the matching feature (trainIdx) corresponds to, based on the number of descriptor sets I added with the 'add' method.
But under that assumption, I should not be able to get an imgIdx larger than the number of images (or training sets) in my training set.
However, here, I get numbers such as 2960, while I only have about 5 images in my training set.
My guess is that it returns the feature index instead of the image index, but I don't know why.
I noticed that the 'add' method in C++ takes an array of arrays, where we have a list of descriptor sets (one per image, I guess). But here I have a different number of features for each image, so I can't really create a single rectangular numpy array out of them.
Thanks.
I finally figured it out after looking at the C++ source code of matcher.cpp:
https://github.com/opencv/opencv/blob/master/modules/features2d/src/matchers.cpp
I'm gonna post the answer, in case somebody needs it someday.
I thought that the 'add' method would increment the image count when called, but it does not. So I realized that I have to create a list of Mats (or numpy arrays in Python) and pass it to 'add' once, instead of calling 'add' for each image.
So here is the updated (and working) source code:
with open(os.path.join(ROOT_DIR,"images\\descriptor_list.txt"),'r') as f:
    list_image_descriptors = []
    for line in f:
        folder_path = os.path.join(ROOT_DIR,"images\\",line[:-1]+"\\","*.key")
        list_key = glob.glob(folder_path)
        for key in list_key:
            if os.path.isfile(key):
                feat = Features()
                feat.readFromFile(key)
                img_descriptors = np.array(feat.descriptors)
                list_image_descriptors.append(img_descriptors)
    featMatch.add(list_image_descriptors)

# Read submitted picture features
feat = Features()
feat.readFromFile(os.path.join(ROOT_DIR,"submitted_picture\\sub.key"))

matches = []
matches.append(featMatch.knnMatch(np.array(feat.descriptors), k=3))
print(matches)
Hope this helps.
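As a quick sanity check (a sketch, assuming featMatch wraps a cv2.FlannBasedMatcher), each DMatch returned by knnMatch now carries an imgIdx that indexes into list_image_descriptors:
for knn_group in matches[0]:   # one list of k candidate matches per query descriptor
    for m in knn_group:
        # m.imgIdx: which training image matched, m.trainIdx: which descriptor within that image
        print(m.imgIdx, m.trainIdx, m.distance)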

Image warping with scikit-image and transform.PolynomialTransform

I attach a zip archive with all the files needed to illustrate and reproduce the problem.
(I don't have permissions to upload images yet...)
I have an image (test2.png in the zip archive) with curved lines.
I try to warp it so the lines are straight.
I thought of using scikit-image transform, and in particular transform.PolynomialTransform because the transformation involves high order distortions.
So first I measure the precise position of each line at regular intervals in x to define the input interest points (in the file source_test2.csv).
Then I compute the corresponding desired positions, located along a straight line (in the file destination_test2.csv).
The figure correspondence.png shows what this looks like.
Next, I simply call transform.PolynomialTransform() using a polynomial of order 3.
It finds a solution, but when I apply it using transform.warp(), the result is crazy, as illustrated in the file Crazy_Warped.png
Can anybody tell me what I am doing wrong?
I tried polynomial of order 2 without luck...
I managed to get a good transformation for a sub-image (the first 400 columns only).
Is transform.PolynomialTransform() completely unstable in a case like mine?
Here is the entire code:
import numpy as np
import matplotlib.pyplot as plt
import asciitable
import matplotlib.pylab as pylab
from skimage import io, transform
# read image
orig=io.imread("test2.png",as_grey=True)
# read tables with reference points and their desired transformed positions
source=asciitable.read("source_test2.csv")
destination=asciitable.read("destination_test2.csv")
# format as numpy.arrays as required by scikit-image
# (need to add 1 because I started to count positions from 0...)
source=np.column_stack((source["x"]+1,source["y"]+1))
destination=np.column_stack((destination["x"]+1,destination["y"]+1))
# Plot
plt.imshow(orig, cmap='gray', interpolation='nearest')
plt.plot(source[:,0],source[:,1],'+r')
plt.plot(destination[:,0],destination[:,1],'+b')
plt.xlim(0,orig.shape[1])
plt.ylim(0,orig.shape[0])
# Compute the transformation
t = transform.PolynomialTransform()
t.estimate(destination,source,3)
# Warping the image
img_warped = transform.warp(orig, t, order=2, mode='constant',cval=float('nan'))
# Show the result
plt.imshow(img_warped, cmap='gray', interpolation='nearest')
plt.plot(source[:,0],source[:,1],'+r')
plt.plot(destination[:,0],destination[:,1],'+b')
plt.xlim(0,img_warped.shape[1])
plt.ylim(0,img_warped.shape[0])
# Save as a file
io.imsave("warped.png",img_warped)
Thanks in advance!
There are a couple of things wrong here, mainly to do with coordinate conventions. For example, if we examine the code where you plot the original image, and then put the clicked points on top of it:
plt.imshow(orig, cmap='gray', interpolation='nearest')
plt.plot(source[:,0],source[:,1],'+r')
plt.xlim(0,orig.shape[1])
plt.ylim(0,orig.shape[0])
(I've taken out the destination points to make it cleaner) then we get the following image:
As you can see, the y-axis is flipped. If we invert the y-axis with:
source[:,1] = orig.shape[0] - source[:,1]
before plotting, then we get the following:
So that is the first problem (don't forget to invert the destination points as well), the second has to do with the transform itself:
t.estimate(destination,source,3)
From the documentation we see that the call takes the source points first, then the destination points. So the order of those arguments should be flipped.
Lastly, the clicked points are of the form (x,y), but the image is stored as (y,x), so we have to transpose the image before applying the transform and then transpose back again:
img_warped = transform.warp(orig.transpose(), t, order=2, mode='constant',cval=float('nan'))
img_warped = img_warped.transpose()
When you make these changes, you get the following warped image:
These lines aren't perfectly flat but it makes much more sense.
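Putting the three fixes together, a rough sketch of the corrected part of the question's code (same variable names, with source/destination already loaded; note that the asker's follow-up below disagrees about the estimate argument order):
# invert the y-axis of both point sets to match the image's row/column origin
source[:,1] = orig.shape[0] - source[:,1]
destination[:,1] = orig.shape[0] - destination[:,1]
# estimate takes the source points first, then the destination points
t = transform.PolynomialTransform()
t.estimate(source, destination, 3)
# warp the transposed image, then transpose back to (row, column) order
img_warped = transform.warp(orig.transpose(), t, order=2, mode='constant', cval=float('nan'))
img_warped = img_warped.transpose()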
Thank you very much for the detailed answer! I cannot believe I did not see the axis inversion problem... Thanks for catching it!
But I am afraid your final solution does not solve my problem... The image you get is still crazy. It should be continuous, not have such big holes and weird distortions... (see final solution below)
I found I could get a reasonable solution using RANSAC:
from skimage.measure import ransac
t, inliers = ransac((destination, source), transform.PolynomialTransform,
                    min_samples=20, residual_threshold=1.0, max_trials=1000)
outliers = inliers == False
I then get the following result
Note that I think I was right to use (destination, source) in that order! I think it has to do with the fact that transform.warp requires the inverse map as input for the transformation object, not the forward map. But maybe I am wrong? The good result I am getting suggests it's correct.
I guess that polynomial transforms are too unstable on their own, and using RANSAC makes it possible to get a reasonable solution.
My problem was then to find a way to change the polynomial order in the RANSAC call...
transform.PolynomialTransform() does not take any parameters, and uses by default a 2nd order polynomial, but from the result I can see I would need a 3rd or 4th order polynomial.
So I opened a new question, and got a solution from Stefan van der Walt. Follow the link to see how to do it.
Thanks again for your help!
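For completeness, one way to control the polynomial order inside the ransac call (a sketch, not necessarily what the linked answer does) is to subclass PolynomialTransform so that its estimate() fixes the order:
class PolynomialTransformOrder3(transform.PolynomialTransform):
    # same transform, but estimate() always fits a 3rd-order polynomial
    def estimate(self, src, dst):
        return transform.PolynomialTransform.estimate(self, src, dst, order=3)

t, inliers = ransac((destination, source), PolynomialTransformOrder3,
                    min_samples=20, residual_threshold=1.0, max_trials=1000)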

Pixel to pixel edit using PIL and Image.point

I can't seem to understand what Image.point does. I want to do some pixel editing, which might include checking which color value (r, g or b) is the max in every pixel and acting accordingly. Let's say that I can't use numpy. I managed to use Image.point to add the same value to every pixel in an image.
point code
import Image, math

def brightness(i, value):
    value = math.floor(255*(float(value)/100))
    return i+value

if __name__ == '__main__':
    image = '/home/avlahop/verybig.jpg'
    print image
    img = Image.open(image)
    print img
    out = img.point(lambda i: brightness(i, 50))
    out.show()
numpy code
import Image
import numpy as np

def brightness(arr, adjust):
    import math
    adjust = math.floor(255*(float(adjust)/100))
    arr[...,0] += adjust
    arr[...,1] += adjust
    arr[...,2] += adjust
    return arr

if __name__ == '__main__':
    image = '/home/avlahop/verybig.jpg'
    img = Image.open(image).convert('RGBA')
    arr = np.array(np.asarray(img).astype('float'))
    new_image = Image.fromarray(brightness(arr, 50).clip(0,255).astype('uint8'), 'RGBA')
    new_image.show()
I have to say that the point code is faster than the numpy one. But what if I want to do a more complex operation with point? For example, for every pixel, check max(r, g, b) and do something depending on whether r, g, or b is the max. As you saw, I used point with a function as its argument. It takes one argument, i. What is this i? Is it the pixel (i.e. i = (r, g, b))? I can't seem to understand from the PIL documentation.
The docs may not have been clear in earlier versions of PIL, but in Pillow it's spelled out pretty well. From Image.point:
lut – A lookup table, containing 256 values per band in the image. A function can be used instead, it should take a single argument. The function is called once for each possible pixel value, and the resulting table is applied to all bands of the image.
In other words, it's not a general-purpose way to map each pixel through a function; it's just a way to dynamically build the lookup table instead of passing in a pre-built one.
Concretely, it's called with the numbers from 0 through 255. (Which you can find out for yourself pretty easily by just writing a function that appends its argument to a global list and then dumping out the list at the end…)
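As a tiny illustration of that (assuming a single-band 'L' image named img), these two calls end up applying the same lookup table:
lut = [min(255, v + 50) for v in range(256)]   # pre-built table
out1 = img.point(lut)
out2 = img.point(lambda v: min(255, v + 50))   # same table, built by calling the lambda for 0..255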
If you split your image into separate bands or planes, point each one of them with a different function, and then recombine them, that might be able to accomplish what you're trying to do. But even then, I think eval is what you wanted, not point.
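A rough sketch of that split/point/merge route, assuming an RGB image named img:
r, g, b = img.split()
r = r.point(lambda v: min(255, v + 50))   # e.g. brighten only the red band
recombined = Image.merge('RGB', (r, g, b))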
But I think what you really want is a pixel-by-pixel, all-bands-at-once iterator. And you don't need anything special for that: just use map or a comprehension over getdata. Isn't that slow? Of course it's slow, because it's calling your function X*Y times; the cost of building the getdata sequence and iterating over it is tiny compared to that cost, so looking for a way for PIL to optimize the already-fast-enough part won't get you very far.
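Finally, a sketch of that getdata/putdata approach, with a placeholder rule that adds 50 to whichever channel is largest in each pixel (the file path is borrowed from the question):
from PIL import Image

img = Image.open('/home/avlahop/verybig.jpg').convert('RGB')

def boost_max_channel(px):
    # px is one (r, g, b) tuple
    m = max(px)
    return tuple(min(255, c + 50) if c == m else c for c in px)

out = Image.new('RGB', img.size)
out.putdata([boost_max_channel(px) for px in img.getdata()])
out.show()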
