I'm trying to isolate the background from multiple images that each contain something different overlapping it.
The images I have are individually listed here: https://imgur.com/a/Htno7lm
but there is a preview of all 6 of them combined here:
I wanted to do it over a sequence of images, as if I were reading a video feed and processing the last frames to isolate the background, like this:
import os
import cv2

first = True
bwand = None

for filename in os.listdir('images'):
    curImage = cv2.imread('images/%s' % filename)
    if first:
        first = False
        bwand = curImage
        continue
    bwand = cv2.bitwise_and(bwand, curImage)

cv2.imwrite("and.png", bwand)
With this code I keep accumulating into the buffer with bitwise operations, but the result is not what I'm looking for:
Bitwise and:
Concurrently adding to a buffer is the best approach for me in terms of video filtering and performance, but if I treat the frames as a list instead, I can take the median value like so:
import os
import cv2
import numpy as np

sequence = []

for filename in os.listdir('images'):
    curImage = cv2.imread('images/%s' % filename)
    sequence.append(curImage)

imgs = np.asarray(sequence)
median = np.median(imgs, axis=0)
cv2.imwrite("res.png", median)
which gives me:
This is still not perfect, because it's the median value; if I looked for the mode value instead, performance would decrease significantly.
Is there an approach that works as a buffer, like the first alternative, but gives the best result with good performance?
Edit:
As suggested by @Christoph Rackwitz, I used the OpenCV background subtractor. It works as a buffer, which is one of the requested features, but the result is not the most pleasant:
code:
import os
import cv2

mog = cv2.createBackgroundSubtractorMOG2()

for filename in os.listdir('images'):
    curImage = cv2.imread('images/%s' % filename)
    mog.apply(curImage)

x = mog.getBackgroundImage()
cv2.imwrite("res.png", x)
Since scipy.stats.mode takes ages to do its thing, I did the same manually:
- calculate a histogram (for every channel of every pixel of every row of every image)
- argmax gets the mode
- reshape and cast
Still not video speed, but oh well. numba can probably speed this up (see the sketch after the code below).
import numpy as np
import cv2 as cv

filenames = ...
assert len(filenames) < 256, "need larger dtype for histogram"

stack = np.array([cv.imread(fname) for fname in filenames])
sheet = stack[0]

# one 256-bin histogram per channel of every pixel
hist = np.zeros((sheet.size, 256), dtype=np.uint8)
index = np.arange(sheet.size)
for sheet in stack:
    hist[index, sheet.flat] += 1

# the argmax over the bins is the mode
result = np.argmax(hist, axis=1).astype(np.uint8).reshape(sheet.shape)
del hist  # because it's huge

cv.imshow("result", result); cv.waitKey()
And if I didn't use histograms and extensive amounts of memory, but a fixed number of sheets and data access that's cache-friendly, it could likely be even faster.
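A rough numba sketch of that idea (untested; assumes the same (N, H, W, C) uint8 stack as above): walk the stack once per pixel with a small local histogram instead of one huge global one.

import numba
import numpy as np

@numba.njit(parallel=True, cache=True)
def per_pixel_mode(stack):
    n, h, w, c = stack.shape
    out = np.empty((h, w, c), dtype=np.uint8)
    for y in numba.prange(h):
        counts = np.zeros(256, dtype=np.int32)  # per-thread scratch space
        for x in range(w):
            for ch in range(c):
                counts[:] = 0
                for i in range(n):
                    counts[stack[i, y, x, ch]] += 1
                out[y, x, ch] = np.argmax(counts)  # mode of this channel
    return out

result = per_pixel_mode(stack)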
Related
So I was wondering if there is something like
pyautogui.locateOnScreen('picture.jpg', confidence=x)
I'm currently trying to compare pictures from a folder, but pyautogui only works with "on screen" images. I don't want to check whether the pictures are 1:1 the same, but whether they are alike. With pyautogui you can simply add the confidence parameter, and I've built my script based on that. I just wanted to know if someone knows a way to do that for image files.
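If it helps, pyautogui's confidence matching is backed by OpenCV template matching, which works just as well on two files. A minimal sketch (match_confidence and the filenames are made up; the second image must not be larger than the first):

import cv2

def match_confidence(haystack_path, needle_path):
    haystack = cv2.imread(haystack_path)
    needle = cv2.imread(needle_path)
    # TM_CCOEFF_NORMED yields scores in [-1, 1]; 1.0 is a perfect match
    result = cv2.matchTemplate(haystack, needle, cv2.TM_CCOEFF_NORMED)
    return result.max()

if match_confidence('picture_a.jpg', 'picture_b.jpg') > 0.9:
    print("images are alike")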
This code checks if there are any duplicates in a folder; it's a bit slow, though.
from image_similarity_measures.quality_metrics import rmse
import cv2
import os
import time

def check(path_original, path_new):  # pass raw strings for Windows paths
    original = cv2.imread(path_original)
    new = cv2.imread(path_new)
    return rmse(original, new)

def folder_check(folder_path):
    i = 0
    file_list = os.listdir(folder_path)
    print(file_list)
    duplicate_dict = {}
    for file in file_list:
        # print(file)
        file_path = os.path.join(folder_path, file)
        for file_compare in file_list:
            print(i)
            i += 1
            file_compare_path = os.path.join(folder_path, file_compare)
            if file_compare != file:
                similarity_score = check(file_path, file_compare_path)
                # print(str(similarity_score))
                if similarity_score == 0.0:  # an RMSE of 0 means the images are identical
                    print(file, file_compare)
                    duplicate_dict[file] = file_compare
                    file_list.remove(file)  # note: mutates the list being iterated
    return duplicate_dict

start_time = time.time()
print(folder_check(r"C:\Users\Admin\Linear-Regression-1\image-similarity-measures\input1"))
end_time = time.time()
stamp = end_time - start_time
print(stamp)
You can use numpy to compare the pixel arrays of two images.

from PIL import Image
import numpy as np

# import the images as pixel arrays
# (np.array() on a PIL Image gives the pixel data directly; Image.load()
# returns a pixel-access object that numpy cannot convert to an array)
img_a = Image.open('a.jpg')
img_b = Image.open('b.jpg')
img_a_array = np.array(img_a)
img_b_array = np.array(img_b)

# count how many array entries are equal
difference = (img_a_array == img_b_array).sum()

Then you can check that count against a threshold: if enough entries match, you can consider the images similar.
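For example, continuing the snippet above (the 95% cutoff is an arbitrary choice, and the images must have the same dimensions):

# fraction of matching entries; treat the images as similar above a cutoff
fraction_equal = difference / img_a_array.size
if fraction_equal > 0.95:
    print("images are similar")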
I have a set of 10 users, each with their own folder/directory containing 25-30 images they shared (on some social media, say). I want to calculate the similarity between users based on the images they shared.
For that, I use a feature extractor to convert each image into a 224x224x3 array, then loop through each user and each of the images in their folders to find the cosine similarity between each pair images, then take the average of all those pairwise image similarities for each pair of users to find the user similarity. (Please let me know if there's some mistake in this logic by the way).
My code to do all this is as follows:
from tensorflow.keras.applications.imagenet_utils import preprocess_input
from tensorflow.keras.applications import vgg16
from tensorflow.keras.preprocessing.image import load_img, img_to_array
from tensorflow.keras.models import Model
import os
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity
import pandas as pd

# load the model
vgg_model = vgg16.VGG16(weights='imagenet')

# remove the last layers in order to get features instead of predictions
feat_extractor = Model(inputs=vgg_model.input, outputs=vgg_model.get_layer("fc2").output)

def processed_image(image):
    original = load_img(image, target_size=(224, 224))
    numpy_image = img_to_array(original)
    image_batch = np.expand_dims(numpy_image, axis=0)
    processed_image = preprocess_input(image_batch.copy())
    img_features = feat_extractor.predict(processed_image)
    return img_features

def image_similarity(image1, image2):
    image1 = processed_image(image1)
    image2 = processed_image(image2)
    sim = cosine_similarity(image1, image2)
    return sim[0][0]

user_list = ['User ' + str(i) for i in range(1, 11)]
user_sim_df = pd.DataFrame(columns=user_list, index=user_list)

for user1 in user_list:
    for user2 in user_list:
        sum_img_sim = 0
        user1_files = ['All_Users/' + user1 + '/' + x for x in os.listdir('All_Users/' + user1) if "jpg" in x]
        user2_files = ['All_Users/' + user2 + '/' + x for x in os.listdir('All_Users/' + user2) if "jpg" in x]
        for image1 in user1_files:
            for image2 in user2_files:
                sum_img_sim += image_similarity(image1, image2)
        user_sim_df[user1][user2] = 2 * sum_img_sim / (len(user1_files) + len(user2_files))
Now, because there are 4 for loops involved in calculating the user similarity matrix, the code takes a long time to run (it has been running for more than 30 minutes, as of typing this question, for 10 users with 25-30 images each).
So, how do I rewrite the last portion of this to make the code run faster?
Nested for loops are particularly slow in Python, but some work can be done here to improve things.
First of all, you are doing the work twice in the comparisons: user_sim_df[user_i][user_j] has the same value as user_sim_df[user_j][user_i] for all pairs i, j, so you could reuse already calculated values instead of computing them again in later iterations. Besides this, is computing the values on the diagonal (user_sim_df[user_i][user_i]) necessary for your application?
These simple changes will cut execution time in half. Is that enough? Maybe not. Further lines of improvement:
- the feature extraction (load_img(), img_to_array(), predict()) is applied to every image many times - every time you calculate its similarity with another one. Is it a bottleneck? In that case, performance could also improve if you first run a loop over all images and extract each image's features exactly once, saving them for later reuse (for example with numpy.save()/numpy.load(), or simply keeping them in memory) - see the sketch after this list.
- if you're using the standard Python interpreter, changing to PyPy can help (in general). You could also try adapting the code to consist only of operations on numpy structures (e.g. adapt the pandas parts) and use Numba in a way similar to this SO link. Using Numba you can also benefit from parallelism. See some practical guidelines here.
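A sketch of the first two ideas combined (assuming the same processed_image() helper, folder layout, user_list, and user_sim_df as in the question):

import os
import itertools
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Extract every image's features exactly once, then fill only one
# triangle of the user matrix and mirror it.
features = {}    # image path -> 1x4096 feature row from processed_image()
user_files = {}
for user in user_list:
    folder = 'All_Users/' + user + '/'
    user_files[user] = [folder + x for x in os.listdir(folder) if "jpg" in x]
    for path in user_files[user]:
        features[path] = processed_image(path)

for user1, user2 in itertools.combinations_with_replacement(user_list, 2):
    feats1 = np.vstack([features[p] for p in user_files[user1]])
    feats2 = np.vstack([features[p] for p in user_files[user2]])
    # one vectorised cosine_similarity call replaces the per-pair Python loop
    sum_img_sim = cosine_similarity(feats1, feats2).sum()
    sim = 2 * sum_img_sim / (len(user_files[user1]) + len(user_files[user2]))
    user_sim_df[user1][user2] = user_sim_df[user2][user1] = sim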
I'm working with TIFF images of 30,000 x 30,000 pixels, and want to average 11 of these images at once.
I'd prefer to do it in Python if possible, and was wondering what's the best way to approach this?
Should I use OpenCV, or can it be done just using numpy?
Would averaging each of the RGBA channels independently improve performance?
Or should I divide the images into smaller images, process them independently, and then stitch the resulting pieces back together?
Doing it straight up with OpenCV like this leads to memory errors:
import cv2

im0 = cv2.imread('5014.tif')
im1 = cv2.imread('5114.tif')
im2 = cv2.imread('5214.tif')
im3 = cv2.imread('5314.tif')
im4 = cv2.imread('5414.tif')
cv2.imwrite('avg.tif', .01*im0 - .002*im1 - .002*im2 - .002*im3 - .002*im4)
libvips is an image processing system for large images. It streams images rather than doing separate load / process / save steps, so you can work with images much larger than the amount of memory in your computer. It has a handy, high-level Python binding. It should be quicker than OpenCV for this sort of task.
You can solve your problem in Python like this:
#!/usr/bin/python
import sys
import pyvips

if len(sys.argv) < 3:
    print("usage: %s output-file in1 in2 ..." % sys.argv[0])
    sys.exit(1)

outfile = sys.argv[1]
input_names = sys.argv[2:]

total = sum([pyvips.Image.new_from_file(filename, access="sequential")
             for filename in input_names])
avg = total / len(input_names)

# avg will be a float image; cast back to 8-bit for write, or we'll
# get a float tiff
avg.cast("uchar").write_to_file(outfile)
Run like this:
$ time ./avg.py x.tif ~/pics/wtc*.tif
memory: high-water mark 38.50 MB
real 0m5.759s
user 0m2.584s
sys 0m0.457s
That's averaging four 10,000 x 10,000 RGB images on a machine with a mechanical HDD, so I'd guess about two minutes for your dataset. Memory use should be around 100 MB.
What performance gets improved?
SMALLER static Memory Footprint -- as the processing does not perform any convoluted calculi, just change the processing scheme to get from an ~30 GB static RAM footprint down to some 5 GB. ( the code sample below runs this way, iterating over the sequence of 11 files )
import cv2
import numpy

aListOfFNAMEs = [ '5014.tif', ... ]                          # SETUP: FNAMEs
aListOfCOEFFs = [ .01, -.002, -.002, -.002, -.002, ... ]     # COEFFs till the 11-th

anInputIMG    = cv2.imread( aListOfFNAMEs[0] )               # LOAD the 1st
anAveragedIMG = numpy.zeros( anInputIMG.shape )              # ensure a copy, not a view
anAveragedIMG += aListOfCOEFFs[0] * anInputIMG               # process the 1st LOAD-ed

for aPtr in range( 1, len( aListOfCOEFFs ) ):                # iterate, process the rest
    anInputIMG     = cv2.imread( aListOfFNAMEs[aPtr] )       # re-use MEM on LOAD(s)
    anAveragedIMG += aListOfCOEFFs[aPtr] * anInputIMG        # process <next>

cv2.imwrite( "avg.TIF", anAveragedIMG )                      # SAVE
del anInputIMG                                               # release for GC
del anAveragedIMG                                            # release for GC, DONE.
FASTER vectorised Matrix Operations -- as the processing allows numpy/OpenCV vectorised matrix operations, separating the RGB colour planes into independent processing does not improve the speed - just the contrary. ( the code above runs that way )
FASTEST GPU-based Block Operations -- while possible, this would make your project spend remarkable time on arranging GPU-device / HOST-device data transfers, so as to move less data per block than the size of the GPU-DRAM allows. The requested calculation has such a low mathematical/computational density that it does not justify these overheads of moving into a GPU-based mode.
Your "stack" has the "shape" (30000, 30000, 4, 11). I wouldn't worry about manually looping over these last two dimensions in any way - I would worry about running out of memory as you are experiencing.
I don't know the OpenCV syntax but if you can read in one image without memory issues do something like:
import cv2
import numpy as np

image_filenames = ['5014.tif', '5114.tif', ...]
N = float(len(image_filenames))

# read the first image just to get the dimensions, then start from zeros
output = np.zeros(cv2.imread(image_filenames[0]).shape, dtype=np.float64)

for image_filename in image_filenames:
    # read in this image and add its contribution to the running average
    output += cv2.imread(image_filename) / N

# save the output, cast back to 8-bit
cv2.imwrite('avg.tif', output.astype(np.uint8))
I have a script which uses Google Maps API to download a sequence of equal-sized square satellite images and generates a PDF. The images need to be rotated beforehand, and I already do so using PIL.
I noticed that, due to different light and terrain conditions, some images are too bright and others too dark, so the resulting PDF ends up a bit ugly, with less-than-ideal reading conditions "in the field" (which is backcountry mountain biking, where I want a printed thumbnail of specific crossroads).
(EDIT) The goal, then, is to make all images end up with a similar apparent brightness and contrast: images that are too bright would have to be darkened, and dark ones lightened. (By the way, I once used ImageMagick's autocontrast, or auto-gamma, or equalize, or autolevel, or something like that, with interesting results on medical images, but I don't know how to do any of these in PIL.)
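For reference, PIL's ImageOps module does have counterparts to those ImageMagick operations; a minimal sketch (the filename is made up, and whether these help on map tiles is untested):

from PIL import Image, ImageOps

im = Image.open('tile.png').convert('RGB')
auto = ImageOps.autocontrast(im, cutoff=2)  # clip the 2% darkest/brightest pixels, then stretch
eq = ImageOps.equalize(im)                  # flatten the histogram
auto.save('tile_autocontrast.png')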
I already used some image corrections after converting to grayscale (I had a grayscale printer a while ago), but the results weren't good either. Here is my grayscale code:
#!/usr/bin/python
from PIL import ImageEnhance

def myEqualize(im):
    im = im.convert('L')
    contr = ImageEnhance.Contrast(im)
    im = contr.enhance(0.3)
    bright = ImageEnhance.Brightness(im)
    im = bright.enhance(2)
    #im.show()
    return im
This code works independently on each image. I wonder if it would be better to analyze all the images first and then "normalize" their visual properties (contrast, brightness, gamma, etc.).
Also, I think it would be necessary to perform some analysis of each image (histogram?), so as to apply a custom correction depending on the image, rather than an equal correction for all of them (although any "enhance" function implicitly considers the initial conditions).
Does anybody have such a problem and/or know a good alternative to do this with colored images (not grayscale)?
Any help will be appreciated, thanks for reading!
What you are probably looking for is a utility that performs "histogram stretching". Here is one implementation. I am sure there are others. I think you want to preserve the original hue and apply this function uniformly across all color bands.
Of course there is a good chance that some of the tiles will have a noticeable discontinuity in level where they join. Avoiding this, however, would involve spatial interpolation of the "stretch" parameters and is a much more involved solution. (...but would be a good exercise if there is that need.)
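A minimal sketch of that kind of stretch, deriving the cutoffs from the luminance band and applying the same linear LUT uniformly across all bands to preserve hue (the 1%/99% percentiles are an arbitrary choice):

from PIL import Image
import numpy as np

def stretch(im, lo_pct=1, hi_pct=99):
    lum = np.asarray(im.convert('L'))
    lo, hi = np.percentile(lum, [lo_pct, hi_pct])
    scale = 255.0 / max(hi - lo, 1)
    # one linear lookup table, repeated for every band
    lut = [int(min(255, max(0, (i - lo) * scale))) for i in range(256)]
    return im.point(lut * len(im.getbands()))

stretched = stretch(Image.open('tile.png'))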
Edit:
Here is a tweak that preserves image hue:
import operator
from functools import reduce  # reduce is no longer a builtin in Python 3

def equalize(im):
    h = im.convert("L").histogram()
    lut = []
    for b in range(0, len(h), 256):
        # step size
        step = reduce(operator.add, h[b:b+256]) // 255
        # create equalization lookup table
        n = 0
        for i in range(256):
            lut.append(n // step)
            n = n + h[i+b]
    # map the same lookup table over every band (preserves hue)
    return im.point(lut * len(im.getbands()))
The following code works on images from a microscope (which are similar), to prepare them prior to stitching. I used it on a test set of 20 images, with reasonable results.
The brightness averaging function is from another Stack Overflow question.
from PIL import Image
from PIL import ImageStat
import math

# function to return the average perceived brightness of an image
# Source: https://stackoverflow.com/questions/3490727/what-are-some-methods-to-analyze-image-brightness-using-python
def brightness(im_file):
    im = Image.open(im_file)
    stat = ImageStat.Stat(im)
    r, g, b = stat.mean
    # weighted average of the r, g, b means, approximating "human-visible" brightness
    return math.sqrt(0.241*(r**2) + 0.691*(g**2) + 0.068*(b**2))

myList = [0.0]
deltaList = [0.0]
num_images = 20  # number of images

# loop to auto-generate image names and collect their brightness values
for i in range(1, num_images + 1):  # runs from image number 1 through 20
    a = str(i)
    if len(a) == 1:
        a = '0' + str(i)  # to follow the naming convention of the files - 01.jpg, 02.jpg ... 11.jpg etc.
    image_name = 'twenty/' + a + '.jpg'
    myList.append(brightness(image_name))

avg_brightness = sum(myList[1:]) / num_images
print(myList)
print(avg_brightness)

for i in range(1, num_images + 1):
    deltaList.append(avg_brightness - myList[i])

print(deltaList)
At this point, the "correction" values (i.e. the difference between each image's brightness and the mean) are stored in deltaList. The following section applies this correction to all the images, one by one.
for k in range(1, num_images + 1):  # runs from image number 1 through 20
    a = str(k)
    if len(a) == 1:
        a = '0' + str(k)  # to follow the naming convention of the files - 01.jpg, 02.jpg ... 11.jpg etc.
    image_name = 'twenty/' + a + '.jpg'
    img_file = Image.open(image_name)
    img_file = img_file.convert('RGB')  # converts image to RGB format
    pixels = img_file.load()  # creates the pixel map
    delta = int(deltaList[k])
    for i in range(img_file.size[0]):
        for j in range(img_file.size[1]):
            r, g, b = img_file.getpixel((i, j))  # extracts the r, g, b values of the (i, j)-th pixel
            # shift each channel by the correction, clamped to the valid 0-255 range
            pixels[i, j] = (max(0, min(255, r + delta)),
                            max(0, min(255, g + delta)),
                            max(0, min(255, b + delta)))
    new_image_name = 'twenty/' + 'image' + str(k) + '.jpg'  # creates a new filename
    img_file.save(new_image_name)  # saves output to the new file name
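As a side note, the nested getpixel()/putpixel() loop is the slow path in PIL; the same per-channel shift can be done in one call with Image.point(). A sketch (shift_brightness is a made-up helper):

from PIL import Image

def shift_brightness(img, delta):
    # build a lookup table once and apply it to all bands, clamped to 0-255
    return img.point(lambda p: max(0, min(255, p + int(delta))))

# e.g. inside the loop above:
# img_file = shift_brightness(img_file, deltaList[k])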
I have thousands of images and I need to weed out the ones which are not photographs, or otherwise 'interesting'.
An 'uninteresting' image, for example, may be all one color, or mostly one color, or a simple icon/logo.
The solution doesn't have to be perfect, just good enough to remove the least interesting images.
My best idea so far is to take a random sampling of pixels, and then... do something with them.
Danphe beat me to it. Here's my method for calculating image entropy:
from PIL import Image
from math import log

def get_histogram_dispersion(histogram):
    log2 = lambda x: log(x) / log(2)
    total = len(histogram)
    counts = {}
    for item in histogram:
        counts.setdefault(item, 0)
        counts[item] += 1
    ent = 0
    for i in counts:
        p = float(counts[i]) / total
        ent -= p * log2(p)
    return -ent * log2(1 / ent)

im = Image.open('test.png')
h = im.histogram()
print(get_histogram_dispersion(h))
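To use this as a filter, something along these lines should work (the folder name and threshold are made up; the threshold is a value you would tune on your own images):

import os

THRESHOLD = 1.0
for filename in os.listdir('photos'):
    im = Image.open(os.path.join('photos', filename))
    score = get_histogram_dispersion(im.histogram())
    if score < THRESHOLD:
        print('uninteresting:', filename)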