How to calculate user-similarity matrix in a more efficient manner? - python

I have a set of 10 users, each with their own folder/directories, containing 25-30 images shared by them (in some social media, say). I want to calculate the similarities between the users based on the images shared by them.
For that, I resize each image to a 224x224x3 array and pass it through a feature extractor to get a feature vector, then loop through each pair of users and each pair of images in their folders to find the cosine similarity between each pair of images, and take the average of all those pairwise image similarities for each pair of users to get the user similarity. (Please let me know if there's some mistake in this logic, by the way.)
My code to do all this is as follows:
from tensorflow.keras.applications.imagenet_utils import preprocess_input
from tensorflow.keras.applications import vgg16
from tensorflow.keras.preprocessing.image import load_img,img_to_array
from tensorflow.keras.models import Model
import os
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity
import pandas as pd
# load the model
vgg_model = vgg16.VGG16(weights='imagenet')
# remove the last layers in order to get features instead of predictions
feat_extractor = Model(inputs=vgg_model.input, outputs=vgg_model.get_layer("fc2").output)
def processed_image(image):
    original = load_img(image, target_size=(224, 224))
    numpy_image = img_to_array(original)
    image_batch = np.expand_dims(numpy_image, axis=0)
    processed_image = preprocess_input(image_batch.copy())
    img_features = feat_extractor.predict(processed_image)
    return img_features
def image_similarity(image1, image2):
    image1 = processed_image(image1)
    image2 = processed_image(image2)
    sim = cosine_similarity(image1, image2)
    return sim[0][0]
user_list = ['User '+str(i) for i in range(1,11)]
user_sim_df = pd.DataFrame(columns=user_list, index=user_list)
for user1 in user_list:
    for user2 in user_list:
        sum_img_sim = 0
        user1_files = [imgs_path + x for x in os.listdir('All_Users/'+user1) if "jpg" in x]
        user2_files = [imgs_path + x for x in os.listdir('All_Users/'+user2) if "jpg" in x]
        for image1 in user1_files:
            for image2 in user2_files:
                sum_img_sim += image_similarity(image1, image2)
        user_sim_df[user1][user2] = 2*sum_img_sim/(len(user1_files)+len(user2_files))
Now, because there are 4 nested for loops involved in calculating the user similarity matrix, the code takes a very long time to run (it has been running for more than 30 minutes as of typing this question, for 10 users with 25-30 images each).
So, how do I rewrite the last portion of this to make the code run faster?

Nested for loops are particularly slow in Python, but there are several things that can be improved here.
First of all, you are doing every comparison twice: user_sim_df[user_i][user_j] has the same value as user_sim_df[user_j][user_i] for every pair i, j, so you can reuse the already calculated value instead of recomputing it in a later iteration. Besides this, is computing the values on the diagonal (user_sim_df[user_i][user_i]) actually necessary for your application?
These simple changes will roughly halve the execution time. Is that enough? Maybe not. Further lines of improvement:
the whole processed_image() pipeline (loading, img_to_array(), preprocessing and the feat_extractor.predict() call) is applied to every image again each time you compare it with another one. Is it a bottleneck? Almost certainly: performance will improve a lot if you first run a single loop over all images, compute each feature vector once and cache it, for example by saving it to disk with numpy.save() and reading it back with numpy.load() - or simply keep the feature arrays output by the TensorFlow model in memory (see the sketch at the end of this answer).
if you're using the standard Python interpreter, switching to PyPy can help (in general). You could also try adapting the code to consist only of operations on numpy structures (e.g. replace the pandas parts) and use Numba in a way similar to this SO link. With Numba you can also benefit from parallelism. See some practical guidelines here.
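Here is a minimal sketch of the precompute-and-cache idea combined with the symmetry shortcut. It assumes the feat_extractor and processed_image() function from the question and the same All_Users/<user> folder layout; note that taking the mean over all image pairs is a slightly different normalisation from the 2*sum/(len1+len2) used in the original code:
import os
import numpy as np
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

user_list = ['User ' + str(i) for i in range(1, 11)]

# 1) Extract every image's feature vector exactly once and cache it per user.
user_features = {}
for user in user_list:
    folder = 'All_Users/' + user
    files = [os.path.join(folder, f) for f in os.listdir(folder) if f.endswith('.jpg')]
    # processed_image() returns a (1, 4096) array, so vstack gives (n_images, 4096)
    user_features[user] = np.vstack([processed_image(f) for f in files])

# 2) Fill only the upper triangle and mirror it, so each pair is computed once.
user_sim_df = pd.DataFrame(index=user_list, columns=user_list, dtype=float)
for i, user1 in enumerate(user_list):
    for user2 in user_list[i:]:
        # All pairwise image similarities in one vectorised call
        sims = cosine_similarity(user_features[user1], user_features[user2])
        score = sims.mean()  # average over all image pairs
        user_sim_df.loc[user1, user2] = score
        user_sim_df.loc[user2, user1] = score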

Related

Background isolation from complementing images

I'm trying to isolate the background from multiple images that each have something different in them overlapping that background.
the images I have are individually listed here: https://imgur.com/a/Htno7lm
but there is a preview of all 6 of them combined here:
I want to do it on a sequence of images, as if I were reading a video feed and processing the last frames to isolate the background, like this:
import os
import cv2
first = True
bwand = None
for filename in os.listdir('images'):
    curImage = cv2.imread('images/%s' % filename)
    if first:
        first = False
        bwand = curImage
        continue
    bwand = cv2.bitwise_and(bwand, curImage)
cv2.imwrite("and.png", bwand)
With this code I keep accumulating into my buffer with bitwise operations, but the result I get is not what I'm looking for:
Bitwise and:
Incrementally adding into a buffer is the best approach for me in terms of video filtering and performance, but if I instead treat the frames as a list, I can take the median value like so:
import os
import cv2
import numpy as np
sequence = []
for filename in os.listdir('images'):
    curImage = cv2.imread('images/%s' % filename)
    sequence.append(curImage)
imgs = np.asarray(sequence)
median = np.median(imgs, axis=0)
cv2.imwrite("res.png",median)
This gives me:
This is still not perfect, because I'm taking the median value; if I looked for the mode instead, performance would decrease significantly.
Is there an approach that works with a running buffer like the first alternative, but gives the best result with good performance?
--Edit
As suggested by @Christoph Rackwitz, I used the OpenCV background subtractor; it satisfies one of the requirements (it works as a buffer), but the result is not the most pleasant:
code:
import os
import cv2
mog = cv2.createBackgroundSubtractorMOG2()
for filename in os.listdir('images'):
    curImage = cv2.imread('images/%s' % filename)
    mog.apply(curImage)
x = mog.getBackgroundImage()
cv2.imwrite("res.png",x)
Since scipy.stats.mode takes ages to do its thing, I did the same manually:
calculate histogram (for every channel of every pixel of every row of every image)
argmax gets mode
reshape and cast
Still not video speed but oh well. numba can probably speed this up.
import numpy as np
import cv2 as cv

filenames = ...
assert len(filenames) < 256, "need larger dtype for histogram"
stack = np.array([cv.imread(fname) for fname in filenames])
sheet = stack[0]
hist = np.zeros((sheet.size, 256), dtype=np.uint8)  # per-value counts for every pixel/channel
index = np.arange(sheet.size)
for sheet in stack:
    hist[index, sheet.flat] += 1
result = np.argmax(hist, axis=1).astype(np.uint8).reshape(sheet.shape)  # argmax of the counts = per-pixel mode
del hist  # because it's huge
cv.imshow("result", result); cv.waitKey()
And if I didn't use histograms and extensive amounts of memory, but a fixed number of sheets and data access that's cache-friendly, it could likely be even faster.
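The question also asked for something that behaves like a running buffer. Below is a sketch of my own (it still keeps the big per-pixel histogram in memory, so it is not the cache-friendly variant hinted at above) that updates the counts incrementally as 8-bit frames arrive, so the per-pixel mode is always available without rebuilding the histogram from scratch:
import numpy as np

class RunningMode:
    """Per-pixel mode over the last `depth` frames, updated incrementally."""
    def __init__(self, shape, depth=16):
        self.shape = shape
        self.depth = depth
        self.frames = np.zeros((depth,) + shape, dtype=np.uint8)        # ring buffer of frames
        self.hist = np.zeros((int(np.prod(shape)), 256), dtype=np.uint16)
        self.index = np.arange(int(np.prod(shape)))
        self.pos = 0
        self.filled = 0

    def update(self, frame):
        if self.filled == self.depth:
            old = self.frames[self.pos]
            self.hist[self.index, old.ravel()] -= 1   # drop the oldest frame's counts
        else:
            self.filled += 1
        self.frames[self.pos] = frame
        self.hist[self.index, frame.ravel()] += 1     # add the newest frame's counts
        self.pos = (self.pos + 1) % self.depth

    def background(self):
        return self.hist.argmax(axis=1).astype(np.uint8).reshape(self.shape)

# usage sketch:
# mode_filter = RunningMode(shape=frame.shape, depth=16)
# for each new frame: mode_filter.update(frame); bg = mode_filter.background()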

PyTorch : How to apply the same random transformation to multiple image?

I am writing a simple transformation for a dataset which contains many pairs of images. As a data augmentation, I want to apply some random transformation for each pair but the images in that pair should be transformed in the same way.
For example, given a pair of two images A and B, if A is flipped horizontally, B must be flipped horizontally as A. Then the next pair C and D should be differently transformed from A and B but C and D are transformed in the same way. I am trying that in the way below
import random
import numpy as np
import torchvision.transforms as transforms
from PIL import Image
img_a = Image.open("sample_ajpg") # note that two images have the same size
img_b = Image.open("sample_b.png")
img_c, img_d = Image.open("sample_c.jpg"), Image.open("sample_d.png")
transform = transforms.RandomChoice(
    [transforms.RandomHorizontalFlip(),
     transforms.RandomVerticalFlip()]
)
random.seed(0)
display(transform(img_a))
display(transform(img_b))
random.seed(1)
display(transform(img_c))
display(transform(img_d))
Yet the above code does not choose the same transformation; as I tested, it depends on the number of times transform is called.
Is there any way to force transforms.RandomChoice to use the same transform when specified?
Usually a workaround is to apply the transform on the first image, retrieve the parameters of that transform, then apply with a deterministic transform with those parameters on the remaining images. However, here RandomChoice does not provide an API to get the parameters of the applied transform since it involves a variable number of transforms.
In those cases, I usually implement an override of the original function.
Looking at the torchvision implementation, it's as simple as:
class RandomChoice(RandomTransforms):
    def __call__(self, img):
        t = random.choice(self.transforms)
        return t(img)
Here are two possible solutions.
You can either sample from the transform list on __init__ instead of on __call__:
import random
import torch
import torchvision.transforms as T

class RandomChoice(torch.nn.Module):
    def __init__(self, transforms):
        super().__init__()
        self.transforms = transforms
        self.t = random.choice(self.transforms)  # pick once, reuse for every image

    def __call__(self, img):
        return self.t(img)
So you can do:
transform = RandomChoice([
    T.RandomHorizontalFlip(),
    T.RandomVerticalFlip()
])
display(transform(img_a))  # both img_a and img_b will
display(transform(img_b))  # have the same transform

transform = RandomChoice([
    T.RandomHorizontalFlip(),
    T.RandomVerticalFlip()
])
display(transform(img_c))  # both img_c and img_d will
display(transform(img_d))  # have the same transform
Or better yet, transform the images in batch:
import random
import torch
import torchvision.transforms as T

class RandomChoice(torch.nn.Module):
    def __init__(self, transforms):
        super().__init__()
        self.transforms = transforms

    def __call__(self, imgs):
        t = random.choice(self.transforms)  # one transform per call, applied to the whole batch
        return [t(img) for img in imgs]
Which allows you to do:
transform = RandomChoice([
    T.RandomHorizontalFlip(),
    T.RandomVerticalFlip()
])
img_at, img_bt = transform([img_a, img_b])
display(img_at)  # both img_a and img_b will
display(img_bt)  # have the same transform
img_ct, img_dt = transform([img_c, img_d])
display(img_ct)  # both img_c and img_d will
display(img_dt)  # have the same transform
Simply, take the randomization part out of PyTorch into an if statement.
The code below uses vflip; the same idea works for horizontal flips or other transforms.
import random
import torchvision.transforms.functional as TF
if random.random() > 0.5:
    image = TF.vflip(image)
    mask = TF.vflip(mask)
This issue has been discussed in PyTorch forum. Several solutions' pros and cons were discussed on the official GitHub repository page.
PyTorch maintainers have suggested this simple approach.
Do not use torchvision.transforms.RandomVerticalFlip(p=1). Use torchvision.transforms.functional.vflip
Functional transforms give you fine-grained control of the transformation pipeline. As opposed to the transformations above, functional transforms don’t contain a random number generator for their parameters. That means you have to specify/generate all parameters, but you can reuse the functional transform.
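For illustration, here is a minimal sketch of that idea (the helper name and the choice of flip-plus-rotation are my own, not from torchvision): sample the random parameters once yourself, then apply the same deterministic functional calls to both images of a pair:
import random
import torchvision.transforms.functional as TF

def paired_random_flip_and_rotate(img_a, img_b, max_angle=30):
    """Apply the same randomly chosen flip and rotation to both images."""
    if random.random() > 0.5:
        img_a, img_b = TF.hflip(img_a), TF.hflip(img_b)
    angle = random.uniform(-max_angle, max_angle)  # sample the parameter once
    img_a = TF.rotate(img_a, angle)                # reuse it for both images
    img_b = TF.rotate(img_b, angle)
    return img_a, img_b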
I realize the OP requested a solution using torchvision, and I think @Ivan's answer does a good job of addressing this.
However, for those not tied to a specific augmentation library, I wanted to point out that Albumentations handles this kind of situation natively by allowing the user to pass multiple source images, boxes, etc. into the same transform. The return value is structured as a dict:
import albumentations as A
transform = A.Compose(
    transforms=[
        A.VerticalFlip(p=0.5),
        A.HorizontalFlip(p=0.5),
    ],
    additional_targets={'image0': 'image', 'image1': 'image'}
)
transformed = transform(image=image, image0=image0, image1=image1)
Now you can access transformed['image0'], transformed['image1'], etc., and all of them will have the same random parameters applied.
I don't know of a function to fix the random output.
Maybe try a different logic, like implementing the randomization yourself so you can reuse the same transformation.
logic:
generate a random number
based on the number apply a transformation on both images
generate another random number
do the same for the other two images
try this:
import random
import torchvision.transforms.functional as TF
from PIL import Image

img_a = Image.open("sample_a.jpg")  # note that two images have the same size
img_b = Image.open("sample_b.png")
img_c, img_d = Image.open("sample_c.jpg"), Image.open("sample_d.png")

if random.random() > 0.5:
    image_a_flipped = TF.vflip(img_a)
    image_b_flipped = TF.vflip(img_b)
else:
    image_a_flipped = TF.hflip(img_a)
    image_b_flipped = TF.hflip(img_b)

if random.random() > 0.5:
    image_c_flipped = TF.vflip(img_c)
    image_d_flipped = TF.vflip(img_d)
else:
    image_c_flipped = TF.hflip(img_c)
    image_d_flipped = TF.hflip(img_d)
display(image_a_flipped)
display(image_b_flipped)
display(image_c_flipped)
display(image_d_flipped)
Referencing Random transforms for both input and target? I think this is probably the cleanest way to do it. Save the random state before applying any transformation and then just restore it for each subsequent call:
import torch
import torchvision.transforms as transforms

t = transforms.RandomRotation(degrees=360)
state = torch.get_rng_state()
x = t(x)
torch.set_rng_state(state)  # restore the RNG state so t draws the same parameters again
y = t(y)
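If you use this pattern often, a small helper of my own (not part of torchvision) can wrap the state save/restore for an arbitrary list of images; note that it only works for transforms that draw their randomness from torch's RNG:
import torch

def apply_same_transform(t, imgs):
    """Apply transform t to every image, replaying the same RNG state each time."""
    state = torch.get_rng_state()
    out = []
    for img in imgs:
        torch.set_rng_state(state)  # make t draw identical random parameters
        out.append(t(img))
    return out

# usage sketch:
# x_t, y_t = apply_same_transform(transforms.RandomRotation(degrees=360), [x, y])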

Gaussian Mixture Model fit in Python with sklearn is too slow - Any alternative?

I need to use Gaussian Mixture Models on an RGB image, so the dataset is quite big. This needs to run in real time (from a webcam feed). I first coded this in Matlab and was able to achieve a running time of 0.5 seconds for an image of 1729 × 866. The images for the final application will be smaller, so the timing will be faster.
However, I need to implement this with Python and OpenCV for the final application (it has to run on an embedded board). I translated all my code and used sklearn.mixture.GMM to replace fitgmdist in Matlab. The line of code creating the GMM model itself runs in only 7.7e-05 seconds, but the one fitting the model takes 19 seconds. I have tried other covariance types, such as 'diag' or 'spherical', and the time does reduce a little, but the results are worse and the time is still not good enough, not even close.
I was wondering if there is any other library I can use, or if it would be worth it to translate the functions from Matlab to Python.
Here is my example:
import cv2
import numpy as np
import math
from sklearn.mixture import GMM

im = cv2.imread('Boat.jpg')
h, w, _ = im.shape  # height and width of the image

# Extract Blue, Green and Red channels
imB = im[:, :, 0]
imG = im[:, :, 1]
imR = im[:, :, 2]

# Reshape Blue, Green and Red channels into single-row vectors
imB_V = np.reshape(imB, [1, h * w])
imG_V = np.reshape(imG, [1, h * w])
imR_V = np.reshape(imR, [1, h * w])

# Combine the 3 single-row vectors into a 3-row matrix
im_V = np.vstack((imR_V, imG_V, imB_V))

# Fit the bimodal GMM
nmodes = 2
GMModel = GMM(n_components=nmodes, covariance_type='full', verbose=0, tol=1e-3)
GMModel = GMModel.fit(np.transpose(im_V))
Thank you very much for your help
You can try fitting with a diagonal or spherical covariance matrix instead of a full one:
covariance_type='diag'
or
covariance_type='spherical'
I believe it will be much faster.
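For reference, here is a minimal sketch of the same fit with the current sklearn API (sklearn.mixture.GaussianMixture, which replaced the deprecated GMM class), using a diagonal covariance; the file name and component count are taken from the question:
import cv2
import numpy as np
from sklearn.mixture import GaussianMixture

im = cv2.imread('Boat.jpg')
# Reshape to (n_pixels, 3) so each pixel is one BGR sample
pixels = im.reshape(-1, 3).astype(np.float64)

gmm = GaussianMixture(n_components=2, covariance_type='diag', tol=1e-3)
gmm.fit(pixels)
labels = gmm.predict(pixels)  # per-pixel component assignment
Fitting on a random subsample of the pixels and then calling predict on the full image is another common way to cut the fitting time, at the cost of a slightly noisier model.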

Python matrix convolution without using numpy.convolve or scipy equivalent functions

I need to write a matrix convolution without using any built in functions to help. I am taking an image and turning it to greyscale, and then I'm supposed to pass a filter matrix over it. One of the filter matrices I have to use is:
[[-1,0,1],
[-1,0,1],
[-1,0,1]]
I understand how convolutions work, I just don't understand how to apply the convolution with code. Here is the code I am using to get my greyscale array:
import numpy
from scipy import misc

mylist = []
for i in myfile:
    mylist.append(i)
for i in mylist:
    q = i
    print(q)
    image = misc.imread(q[0:-1])
    threshold()
image = misc.imread('image1.png')

def averageArr(pixel):  # make the pixel color values more realistic
    return 0.299*pixel[:,:,0] + 0.587*pixel[:,:,1] + 0.114*pixel[:,:,2]

def threshold():
    picture = averageArr(image)
    for i in range(0, picture.shape[0]):  # begin thresholding
        for j in range(0, picture.shape[1]):
            mylist.append(picture[i, j])
    misc.imsave('image1.png', picture)  # save the image file
I take the values from the function and add them to a list, and then I am supposed to iterate over that list, but I'm not sure how to go about doing it. I can use scipy and numpy to read and arrange the matrix, but the actual convolution function has to be written by hand.
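Since the missing piece is the convolution loop itself, here is a minimal hand-rolled sketch (my own illustration, not from the question) that slides a 3x3 kernel over a greyscale array with plain nested loops and zero padding at the borders; grey_image stands for the array returned by averageArr():
import numpy as np

def convolve2d(image, kernel):
    """Naive 2D convolution with zero padding, so the output has the same shape as the input."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    flipped = kernel[::-1, ::-1]                              # convolution flips the kernel
    padded = np.pad(image, ((ph, ph), (pw, pw)), mode='constant')
    out = np.zeros_like(image, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            window = padded[i:i + kh, j:j + kw]
            out[i, j] = np.sum(window * flipped)              # weighted sum over the window
    return out

kernel = np.array([[-1, 0, 1],
                   [-1, 0, 1],
                   [-1, 0, 1]], dtype=float)
# edges = convolve2d(grey_image, kernel)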

Improving moving-window computation in memory consumption and speed

Is it possible to obtain better performance (both in memory consumption and speed) in this moving-window computation? I have a 1000x1000 numpy array and I take 16x16 windows through the whole array and finally apply some function to each window (in this case, a discrete cosine transform.)
import numpy as np
from scipy.fftpack import dct
from skimage.util import view_as_windows
X = np.arange(1000*1000, dtype=np.float32).reshape(1000,1000)
window_size = 16
windows = view_as_windows(X, (window_size,window_size))
dcts = np.zeros(windows.reshape(-1,window_size, window_size).shape, dtype=np.float32)
for idx, window in enumerate(windows.reshape(-1, window_size, window_size)):
    dcts[idx, :, :] = dct(window)
dcts = dcts.reshape(windows.shape)
This code takes too much memory (in the example above, the memory consumption is not so bad - windows uses 1 GB and dcts also needs 1 GB) and takes 25 seconds to complete. I'm a bit unsure what I'm doing wrong, because this should be a straightforward calculation (e.g. filtering an image). Is there a better way to accomplish this?
UPDATE:
I was initially worried that the arrays produced by Kington's solution and my initial approach were very different, but the difference is restricted to the boundaries, so it is unlikely to cause serious issues for most applications. The only remaining problem is that both solutions are very slow. Currently, the first solution takes 1min 10s and the second solution 59 seconds.
UPDATE 2:
I noticed the biggest culprits by far are dct and np.mean. Even generic_filter performs decently (8.6 seconds) using a "cythonized" version of mean with bottleneck:
import bottleneck as bp
import scipy.ndimage

def func(window, shape):
    window = window.reshape(shape)
    #return np.abs(dct(dct(window, axis=1), axis=0)).mean()
    return bp.nanmean(dct(window))

result = scipy.ndimage.generic_filter(X, func, (16, 16),
                                      extra_arguments=([16, 16],))
I'm currently reading how to wrap C code using numpy in order to replace scipy.fftpack.dct. If anyone knows how to do it, I would appreciate the help.
Since scipy.fftpack.dct calculates separate transforms along the last axis of the input array, you can replace your loop with:
windows = view_as_windows(X, (window_size,window_size))
dcts = dct(windows)
result1 = dcts.mean(axis=(2,3))
Now only the dcts array requires a lot of memory, and windows remains merely a view into X. And because the DCTs are calculated with a single function call, it's also much faster. However, because the windows overlap, there are lots of repeated calculations. This can be overcome by only calculating the DCT for each sub-row once, followed by a windowed mean:
ws = window_size
row_dcts = dct(view_as_windows(X, (1, ws)))
cs = row_dcts.squeeze().sum(axis=-1).cumsum(axis=0)
result2 = np.vstack((cs[ws-1], cs[ws:]-cs[:-ws])) / ws**2
Though it seems what is gained in efficiency is lost in code clarity... But basically the approach here is to first calculate the DCTs and then take the window average by summing over the 2D window and dividing by the number of elements in the window. The DCTs are already calculated over row-wise moving windows, so we take a regular sum over those. However, we still need a moving-window sum over the columns to arrive at the proper 2D window sums. To do this efficiently we use a cumsum trick, where:
sum(A[p:q]) # q-p == window_size
Is equivalent to:
cs = cumsum(A)
cs[q-1] - cs[p-1]
This avoids having to sum the exact same numbers over and over. Unfortunately it doesn't work for the first window (when p == 0), so for that we have to take only cs[q-1] and stack it together with the other window sums. Finally we divide by the number of elements to arrive at the 2D window average.
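As a quick standalone check of that cumsum identity (a numpy sketch of my own, not part of the answer's code), the window sums computed both ways match:
import numpy as np

A = np.random.rand(20)
ws = 5  # window size

# direct sliding-window sums
direct = np.array([A[p:p + ws].sum() for p in range(len(A) - ws + 1)])

# cumsum trick: sum(A[p:p+ws]) == cs[p+ws-1] - cs[p-1], first window handled separately
cs = np.cumsum(A)
trick = np.concatenate(([cs[ws - 1]], cs[ws:] - cs[:-ws]))

assert np.allclose(direct, trick)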
If you want to do a 2D DCT, then this second approach becomes less interesting, because you'll eventually need the full 985 x 985 x 16 x 16 array before you can take the mean.
Both approaches above should be equivalent, but it may be a good idea to perform the arithmetic with 64-bit floats:
np.allclose(result1, result2, atol=1e-6)
# False
np.allclose(result1, result2, atol=1e-5)
# True
skimage.util.view_as_windows is using striding tricks to make an array of overlapping "windows" that doesn't use any additional memory.
However, when you make a new array of that shape, it will require roughly 16 x 16 = 256 times the memory that your original X array or the windows view used.
Based on your comment, your end result is doing dcts.reshape(windows.shape).mean(axis=2).mean(axis=2) - taking the mean of the dct of each window.
Therefore, it would be more memory-efficient (though similar performance-wise) to take the mean inside the loop and not store the huge intermediate array of per-window DCTs:
import numpy as np
from scipy.fftpack import dct
from skimage.util import view_as_windows
X = np.arange(1000*1000, dtype=np.float32).reshape(1000,1000)
window_size = 16
windows = view_as_windows(X, (window_size, window_size))
dcts = np.zeros(windows.shape[:2], dtype=np.float32).ravel()
for idx, window in enumerate(windows.reshape(-1, window_size, window_size)):
    dcts[idx] = dct(window).mean()
dcts = dcts.reshape(windows.shape[:2])
Another option is scipy.ndimage.generic_filter. It won't increase performance much (the bottleneck is the python function call in the inner loop), but you'll have a lot more boundary condition options, and it will be fairly memory efficient:
import numpy as np
from scipy.fftpack import dct
import scipy.ndimage
X = np.arange(1000*1000, dtype=np.float32).reshape(1000,1000)
def func(window, shape):
    window = window.reshape(shape)
    return dct(window).mean()

result = scipy.ndimage.generic_filter(X, func, (16, 16),
                                      extra_arguments=([16, 16],))
