Is there a best practice for storing feature maps of images? - python

So I'm building an image search engine where a user can upload an image (the query image) and find other images stored in a directory (the target images) that contain the query image.
That's being done with SIFT, by extracting the feature maps from both the query and target images and matching them.
Now, to make this a feasible application, we'll need to store the feature maps somewhere so that when an image is queried we don't re-extract the feature maps of all images.
The problem is that each feature map is around 2k rows with 128 columns per image, totalling about 6 MB of storage.
For 1,000,000 images, that would be around 6 TB of storage. I don't think that would scale or fit properly in a database engine.
I'm rather lost from here: how should I store and retrieve such feature maps?
Your help is much appreciated!
Edit: the target images are always evolving, not just a static dataset.
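One possible storage approach, offered only as a sketch (the file layout and function names below are my own assumptions): extract the SIFT descriptors once per image and persist them in a single HDF5 file, one compressed dataset per image ID, so a query only has to re-extract features for the uploaded image.

import cv2
import h5py

def extract_descriptors(image_path):
    # SIFT descriptors: an N x 128 float32 matrix per image.
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    sift = cv2.SIFT_create()
    _, descriptors = sift.detectAndCompute(img, None)
    return descriptors

def store_descriptors(h5_path, image_id, descriptors):
    # One gzip-compressed dataset per image; overwrite if it already exists.
    with h5py.File(h5_path, "a") as f:
        if image_id in f:
            del f[image_id]
        f.create_dataset(image_id, data=descriptors, compression="gzip")

def load_descriptors(h5_path, image_id):
    with h5py.File(h5_path, "r") as f:
        return f[image_id][:]

Storing the descriptors as binary float32 rather than text also shrinks them: 2,000 x 128 x 4 bytes is roughly 1 MB per image before compression, and an evolving target set only needs new datasets appended rather than a full re-index.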

Related

What is the most efficient way to store image feature vectors?

I'm working on a project that deals with images. I extract an image's feature vector as soon as it is uploaded, then store the feature vectors in a MySQL database as text, one row per image.
I'm also using the Django framework.
def search_images_by_features(query_image: ImageFieldFile, images):
    # Compare every stored image's features against the query image's features.
    featured_images = [
        (calculate_similarity(load_features_from_str(image.features),
                              get_image_feature(query_image)), image.item)
        for image in images
        if image.features is not None
    ]
    ...
But looping over each image is becoming a problem, as it takes more time as the number of images increases.
Also, my feature vector is stored in the database like this:
0.0010601664,0.0003533888,0.8969008,0.0014135552,...
Also, is there a way to make the MySQL database engine calculate similarity from the feature vectors?
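As far as I know, a plain MySQL text column gives you no native way to compute vector similarity, so the comparison is usually done in Python after parsing the column. A rough sketch of how the two helpers referenced above could be implemented with NumPy (the function names are the question's own; the bodies are my assumptions, using cosine similarity):

import numpy as np

def load_features_from_str(features_str):
    # Parse the comma-separated text column back into a float32 vector.
    return np.array(features_str.split(","), dtype=np.float32)

def calculate_similarity(a, b):
    # Cosine similarity between two feature vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

For larger collections, building an approximate nearest-neighbour index (for example Faiss or Annoy) from the stored vectors avoids looping over every row in Python.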

Adding text search to content based image retrieval (convnet)

I've implemented a CBIR app using the standard ConvNet approach:
Use transfer learning to extract features from the data set of images
Cluster the extracted features via knn
Given a search image, extract its features
Return the top 10 images that are closest to the image in hand in the knn network
I am getting good results, but I want to further improve them by adding text search as well. For instance, when my image is the steering wheel of a car, the close results will be any circular objects that resemble a steering wheel, for instance a bike wheel. What would be the best possible way to input text, say "car part", to produce only steering wheels similar to the search image?
I am unable to find a good way to combine the ConvNet with a text search model to construct an improved knn network.
My other idea is to use ElasticSearch to do the text search, something that ElasticSearch is good at. For instance, I would do the CBIR search described previously and, out of the returned results, look up their descriptions and then run ElasticSearch on that subset of hits to produce the final results. Maybe tag images with classes and allow the user to de/select groups of images of interest.
I don't want to do text search before image search as some of the images are poorly described so text search would miss them.
Any thoughts or ideas will be appreciated!
I have not found the original paper, but you might find this interesting: https://www.slideshare.net/xavigiro/multimodal-deep-learning-d4l4-deep-learning-for-speech-and-language-upc-2017
It is about finding a vector space where both images and text live (a multimodal embedding). In this way you can find text similar to an image, find images referring to a text, or use a text/image pair to find similar images.
I think this idea is an interesting point to start from.
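As a purely illustrative sketch of the multimodal-embedding idea (the slides above do not prescribe this model; using a pretrained CLIP checkpoint from the transformers library is my own assumption), an image and a text query can be embedded into the same space and compared directly:

import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("steering_wheel.jpg")  # hypothetical query image
inputs = processor(text=["car part"], images=image,
                   return_tensors="pt", padding=True)

with torch.no_grad():
    out = model(**inputs)

# Both embeddings live in the same vector space, so cosine similarity
# between them is meaningful.
img_emb = out.image_embeds / out.image_embeds.norm(dim=-1, keepdim=True)
txt_emb = out.text_embeds / out.text_embeds.norm(dim=-1, keepdim=True)
print(float((img_emb @ txt_emb.T).item()))

Scoring candidate images against both the query image embedding and the query text embedding, then re-ranking, is one way to keep bike wheels out of the "car part" results.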

How to extract only the ID photo from a CV with pdfimages

Hi, I tried to use pdfimages to extract ID photos from my PDF resume files. However, for some files it also returns icons, table lines, and border images, which are totally irrelevant.
Is there any way I can limit it to extracting only the person's photo? I am wondering if we can define certain size constraints on the output.
You need a way of differentiating images found in the PDF in order to extract the ones of interest.
I believe you have the options of considering:
1) Image characteristics such as Width, Height, Bits Per Component, ColorSpace
2) Metadata information about the image (e.g. an XMP tag of interest)
3) Facial recognition of the person in the photo, or form recognition of the structure of the ID itself
4) Extracting all of the images and then using some image-processing code to analyse them and identify the ones of interest
I think 2) may be the most reliable method if the author of the PDF included such information with the photo IDs. 3) may be difficult to implement and get a reliable result from consistently. 1) will only work if that is a reliable means of identifying such photo IDs for your PDF documents.
Then you could key off of that information using your extraction tool (if it lets you do that). Otherwise you would need to write your own extraction tool using a PDF library.
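A rough sketch of option 4 combined with the face-recognition idea from option 3 (file names and detector parameters here are assumptions): dump every image with pdfimages, then keep only those in which OpenCV's Haar-cascade face detector finds a face.

import glob
import os
import subprocess
import cv2

os.makedirs("out", exist_ok=True)
subprocess.run(["pdfimages", "-png", "resume.pdf", "out/img"], check=True)

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

for path in sorted(glob.glob("out/img-*.png")):
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    faces = face_cascade.detectMultiScale(img, scaleFactor=1.1, minNeighbors=5)
    if len(faces) > 0:
        print("likely ID photo:", path)

For option 1, pdfimages -list prints each embedded image's width, height, and bits per component, which lets you pre-filter by size before extracting anything.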

"Normalizing" (de-skewing, re-scaling) images as preprocessing for OCR in Python

I have a bunch of scanned images of documents with the same layout (strict forms filled out with variable data) that I need to process with OCR. I can more or less cope with the OCR process itself (converting text images to text) but still have to cope with the annoying fact that the scanned images are distorted, either by different degrees of rotation, different scaling, or both.
Because my method focuses on reading pieces of information from specific cells that are defined as bounding boxes in pixels, I must convert all pictures to a "standard" version where every corresponding cell is in the same pixel position, otherwise my reader "misreads". My question is: how could I "normalize" the distorted images?
I use Python.
Today, in high-volume form-scanning jobs, we use commercial software with adaptive template matching, which does deskewing and selective binarization to prepare the images, but then it adapts the field boxes per image rather than placing boxes at fixed XY locations.
The deskewing process overall increases the image size. It is visible in this random image from an online search:
https://github.com/tesseract-ocr/tesseract/wiki/skew-linedetection.png
Notice how the title of the document was near the top border, and in the deskewed image it is shifted down. In this oversimplified example an XY-based box would not catch it.
I use commercial software for deskewing and image pre-processing. It is quite inexpensive but good. Unfortunately, I believe it will take you only part-way if the data capture method relies on xy-coordinate field matching. I sense your frustration with dealing with this; that is why appropriate tools were already created for handling it.
I run a service bureau for such form processing. If you are interested, I can privately share further details of how we process them.
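For a purely open-source starting point (not the commercial software mentioned above), a common Python deskewing sketch estimates the page's skew angle from the minimum-area rectangle around the foreground pixels and rotates the scan back. The thresholding and angle handling below are assumptions that work best on clean, high-contrast scans; note that the angle convention of minAreaRect differs between OpenCV versions.

import cv2
import numpy as np

def deskew(image_path, output_path):
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    # Binarize with Otsu so that text becomes the foreground.
    _, binary = cv2.threshold(img, 0, 255,
                              cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    coords = np.column_stack(np.where(binary > 0)).astype(np.float32)
    angle = cv2.minAreaRect(coords)[-1]
    # Map the rectangle angle to the smaller correction angle.
    if angle < -45:
        angle = -(90 + angle)
    else:
        angle = -angle
    h, w = img.shape
    matrix = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    rotated = cv2.warpAffine(img, matrix, (w, h), flags=cv2.INTER_CUBIC,
                             borderMode=cv2.BORDER_REPLICATE)
    cv2.imwrite(output_path, rotated)
    return rotated

Rescaling the deskewed page to the template's resolution with cv2.resize handles the scaling part, but, as noted above, content can still shift relative to fixed XY boxes, so anchoring the boxes to detected form lines tends to be more robust.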

Storing and processing too many images with opencv

I am working on an image detection problem in OpenCV for which I need to save many images.
These images need to be accessed frequently, so I wanted to store them in IPL image format. All of these images are grayscale.
What I wanted to ask is: what is the best method to handle all these images? Should I store them in a database or a file system?
Any help would be highly appreciated.
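One possible answer, offered only as an assumption (the thread does not settle it): if the grayscale images share a fixed size, packing them into a single memory-mapped NumPy array on disk avoids opening thousands of individual files for frequent access. Dimensions and file names below are placeholders.

import cv2
import numpy as np

n_images, height, width = 10000, 480, 640  # assumed dimensions
store = np.lib.format.open_memmap("images.npy", mode="w+", dtype=np.uint8,
                                  shape=(n_images, height, width))

img = cv2.imread("example.png", cv2.IMREAD_GRAYSCALE)  # hypothetical file
store[0] = cv2.resize(img, (width, height))
store.flush()

# Later, reading image i only touches the pages it needs:
images = np.load("images.npy", mmap_mode="r")
first_image = images[0]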
