I have created a program in Python using pyscreenshot which periodically takes a screenshot of a specific area of the screen; that area will contain one of several pre-defined images. I want to load each of these images from file into a list and compare them with the screenshot to see which one is currently displayed. Initially the files were created by screenshotting the images as they appeared on screen:
i = 0
while True:
    filenm = str(i) + ".png"
    im = ImageGrab.grab(bbox=(680,640,735,690)) # across, down
    im.save(filenm)
    i = i + 1
Then when I attempt to compare them it always reports false:
image2 = Image.open("04.png")
im = ImageGrab.grab(bbox=(680,640,735,690)) # across, down
if im == image2:
    print("TRUE")
else:
    print("FALSE")
However comparing two of the images saved to files works:
image = Image.open("03.png")
image2 = Image.open("04.png")
if image == image2:
    print("TRUE")
else:
    print("FALSE")
So my question is how do the images differ once loaded from file and how can I compare the 'live' screenshot with an image loaded from file?
It looks like ImageGrab.grab() creates a PIL.Image.Image object, whereas Image.open() creates a PIL.PngImagePlugin.PngImageFile object. You don't want to call == on the objects themselves: there are no special comparison semantics implemented for PIL images across these two object types, so it just checks whether they are the same object in memory. Code I would use to compare the two images properly (using numpy) would look something like
import numpy as np
from PIL import Image
def image_compare(image_1, image_2):
    # cast to a signed integer type so the subtraction cannot wrap around
    arr1 = np.array(image_1).astype(int)
    arr2 = np.array(image_2).astype(int)
    if arr1.shape != arr2.shape:
        return False
    maxdiff = np.max(np.abs(arr1 - arr2))
    return maxdiff == 0
def image_compare_file(filename_1, filename_2):
    # read both images from disk, then compare their pixel data
    im1 = Image.open(filename_1)
    im2 = Image.open(filename_2)
    return image_compare(im1, im2)
Here I take advantage of PIL images auto-casting to numpy ndarrays with np.array(). I then check that the dimensions match, and compute the max of the absolute error if they do. If this max is zero, the images are identical. Now you could just call
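One detail is worth checking when subtracting image arrays: 8-bit images usually load as uint8, and unsigned subtraction wraps around instead of going negative. For a pure equality test this is harmless, since a wrapped difference is still nonzero, but the reported maximum difference is wrong unless the arrays are widened first. A minimal sketch with made-up pixel values (the helper name `max_abs_diff` is hypothetical, not part of PIL or numpy):

```python
import numpy as np

def max_abs_diff(arr1, arr2):
    """Largest per-element absolute difference, computed without uint8 wraparound."""
    wide1 = np.asarray(arr1).astype(np.int32)
    wide2 = np.asarray(arr2).astype(np.int32)
    return int(np.max(np.abs(wide1 - wide2)))

# Two tiny made-up "images" that differ by 10 in one element.
a = np.array([[10, 200]], dtype=np.uint8)
b = np.array([[20, 200]], dtype=np.uint8)

wrapped = a - b                 # uint8 arithmetic: 10 - 20 wraps to 246
true_diff = max_abs_diff(a, b)  # widened arithmetic gives 10
```

If only a yes/no answer is needed, np.array_equal(arr1, arr2) avoids the subtraction altogether.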
if image_compare_file('file1.png','file2.png'):
    pass # images in file are identical
else:
    pass # images differ
# or, for images that are already loaded:
if image_compare(image1,image2):
    pass # images are identical
else:
    pass # images differ
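To get back to the original goal of finding which of several saved images is currently on screen, the same array comparison can be run against each reference in turn. The following is only a sketch under assumptions: `find_matching_reference` is a hypothetical helper, and small synthetic arrays stand in for the screenshot and the saved files. With real PIL images it is probably wise to convert both sides to the same mode (for example with .convert("RGB")) before turning them into arrays.

```python
import numpy as np

def find_matching_reference(screenshot, references):
    """Return the name of the first reference whose pixels equal the screenshot, else None.

    `references` maps a name (for example a filename) to an array-like image.
    """
    shot = np.asarray(screenshot)
    for name, ref in references.items():
        ref_arr = np.asarray(ref)
        # Shapes must agree before an element-wise comparison makes sense.
        if ref_arr.shape == shot.shape and np.array_equal(ref_arr, shot):
            return name
    return None

# Synthetic 2x2 RGB "images" standing in for the saved screenshots.
red = np.zeros((2, 2, 3), dtype=np.uint8)
red[..., 0] = 255
blue = np.zeros((2, 2, 3), dtype=np.uint8)
blue[..., 2] = 255
references = {"03.png": red, "04.png": blue}

live = blue.copy()  # stands in for the live screenshot
match = find_matching_reference(live, references)
```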
You might be interested in using a perceptual diff tool which will let you quickly identify differences in the screenshots. imgdiff is a library that wraps a tool for this in Python. A simple version can probably be implemented with PIL's ImageChops, as in this answer:
from PIL import Image
from PIL import ImageChops
im1 = Image.open("splash.png")
im2 = Image.open("splash2.png")
# per-pixel absolute difference between the two images
diff = ImageChops.difference(im2, im1)
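The difference image can be turned into a yes/no answer by asking for its bounding box: getbbox() returns None when every pixel of the difference is zero. A small sketch, assuming Pillow is installed; `images_identical` is a hypothetical helper name, and both inputs are converted to RGB so that differing modes do not get in the way:

```python
from PIL import Image, ImageChops

def images_identical(im1, im2):
    """True if both images have the same size and identical RGB pixels."""
    if im1.size != im2.size:
        return False
    diff = ImageChops.difference(im1.convert("RGB"), im2.convert("RGB"))
    # getbbox() returns None when the difference image is entirely zero.
    return diff.getbbox() is None

# Synthetic images instead of files, so the sketch runs on its own.
same_a = Image.new("RGB", (4, 4), (10, 20, 30))
same_b = Image.new("RGB", (4, 4), (10, 20, 30))
changed = same_b.copy()
changed.putpixel((1, 2), (10, 20, 31))  # alter a single pixel
```

When the images differ, the bounding box also indicates roughly where the change is located.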
For more on perceptual diffing, check out Brett Slatkin's talk about using it for safe continuous deployment.
I have an image that is the output of a semantic segmentation algorithm, for example this one
I looked online and tried many pieces of code but none worked for me so far.
It is clear to the human eye that there are 5 different colors in this image: blue, black, red, and white.
I am trying to write a script in python to analyze the image and return the number of colors present in the image but so far it is not working. There are many pixels in the image which contain values that are a mixture of the colors above.
The code I am using is the following but I would like to understand if there is an easier way in your opinion to achieve this goal.
I think that I need to implement some sort of thresholding that has the following logic:
Is there a similar color to this one? if yes, do not increase the count of colors
Is this color present for more than N pixels? If not, do not increase the count of colors.
from PIL import Image
imgPath = "image.jpg"
img = Image.open(imgPath)
uniqueColors = set()
w, h = img.size
for x in range(w):
    for y in range(h):
        pixel = img.getpixel((x, y))
        # a set stores each distinct color only once
        uniqueColors.add(pixel)
totalUniqueColors = len(uniqueColors)
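The two thresholding rules described above (ignore colors close to one already counted, ignore colors covering fewer than N pixels) could be sketched roughly as follows. This is an illustration under assumptions, not a tuned method: `count_dominant_colors`, `min_pixels` and `max_distance` are hypothetical names, the distance is plain Euclidean distance in RGB, and the input is a list of RGB tuples such as the ones collected with getpixel.

```python
from collections import Counter

def count_dominant_colors(pixels, min_pixels=50, max_distance=30):
    """Count colors that are frequent enough and not close to an already counted color."""
    counts = Counter(pixels)
    kept = []  # representative colors accepted so far
    # Visit colors from most to least frequent so dominant colors become representatives.
    for color, n in counts.most_common():
        if n < min_pixels:
            continue  # too rare: likely an edge or compression artifact
        is_similar = any(
            sum((a - b) ** 2 for a, b in zip(color, rep)) <= max_distance ** 2
            for rep in kept
        )
        if not is_similar:
            kept.append(color)
    return len(kept)

# Made-up pixel list: two real classes, one near-duplicate of red, one rare stray color.
pixels = (
    [(0, 0, 0)] * 100
    + [(255, 0, 0)] * 80
    + [(250, 5, 3)] * 60
    + [(128, 64, 32)] * 3
)
dominant = count_dominant_colors(pixels)
```

Both thresholds depend on the data, so they would need adjusting for a particular image set.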
Thanks in advance!
I solved my issue and I am now able to count colors in images coming from a semantic segmentation dataset (the images must be in .png since it is a lossless format).
Below I try to explain what I have found in the process for a solution and the code I used which should be ready to use (you need to just change the path to the images you want to analyze).
I had two main problems.
The first problem with the color counting was the format of the image. For some of the tests I was using .jpeg images, which use lossy compression.
Therefore, from something like this
If I zoomed in on the top left corner of the glass (marked in green), I saw something like this
This is obviously not good, since the compression introduces many more colors than the ones "visible to the human eye".
Instead, for my annotated images I had something like the following
If I zoomed in on the saddle of the bike (marked in green), I had something like this
The second problem was that I did not convert my image into an RGB image.
This is taken care of in the code by the line:
img = Image.open(filename).convert('RGB')
The code is below. For sure it is not the most efficient but for me it does the job. Any suggestion to improve its performance is appreciated
import numpy as np
from PIL import Image
import argparse
import os
debug = False
def main(data_dir):
    print("This small script allows you to count the number of different colors in an image")
    print("This code has been written to count the number of classes in images from a semantic segmentation dataset")
    print("Therefore, it is highly recommended to run this code on lossless images (such as .png ones)")
    print("Images are being loaded from: {}".format(data_dir))
    directory = os.fsencode(data_dir)
    interesting_image_format = ".png"
    # I will put in the variable filenames all the paths to the images to be analyzed
    filenames = []
    for file in os.listdir(directory):
        filename = os.fsdecode(file)
        if filename.endswith(interesting_image_format):
            if debug:
                print(os.path.join(data_dir, filename))
                print("Analyzing image: {}".format(filename))
            filenames.append(os.path.join(data_dir, filename))
        else:
            if debug:
                print("I am not doing much here...")
    if not filenames:
        print("No {} images found in {}".format(interesting_image_format, data_dir))
        return
    # Sort the filenames in an alphabetical order
    filenames.sort()
    # Analyze the images (i.e., count the different number of colors in the images)
    number_of_colors_in_images = []
    for filename in filenames:
        img = Image.open(filename).convert('RGB')
        data_img = np.asarray(img)
        # one row per pixel, then keep only the distinct RGB rows
        uniques = np.unique(data_img.reshape(-1, data_img.shape[-1]), axis=0)
        # comment out the following line if you do not want information for each analyzed image
        print("The number of different colors in image ({}) {} is: {}".format(interesting_image_format, filename, len(uniques)))
        # print("uniques.shape[0] for image {} is: {}".format(filename, uniques.shape[0]))
        # Put the number of colors of each image into an array
        number_of_colors_in_images.append(len(uniques))
    # Print the maximum number of colors (classes) of all the analyzed images
    print("The maximum number of colors (classes) is: {}".format(np.max(number_of_colors_in_images)))
    # Print the average number of colors (classes) of all the analyzed images
    print("The average number of colors (classes) is: {}".format(np.mean(number_of_colors_in_images)))
def args_preprocess():
    # Command line arguments
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--data_dir", default="default_path_to_images", type=str, help='Specify the directory path from where to take the images of which we want to count the classes')
    args = parser.parse_args()
    main(args.data_dir)
if __name__ == '__main__':
    args_preprocess()
The thing mentioned above about the lossy compression in .jpeg images and lossless compression in .png seems to be a nice thing to point out. But you can use the following piece of code to get the number of classes from a mask.
This is only applicable on .png images. Not tested on .jpeg images.
import cv2 as cv
import numpy as np
img_path = r'C:\Users\Bhavya\Downloads\img.png'
img = cv.imread(img_path)
img = np.array(img, dtype='int32')
pixels = []
for i in range(img.shape[0]):
    for j in range(img.shape[1]):
        # cv.imread returns the channels in BGR order
        b, g, r = list(img[i, j, :])
        pixels.append((r, g, b))
# converting to a set keeps each distinct colour once
pixels = list(set(pixels))
In this solution I append the three channel values of every pixel in the input image to a list as a tuple, convert the list to a set, and then convert it back to a list. Converting to a set removes all duplicate elements (here, pixel values), leaving only the unique ones; its length is the number of classes. Converting back to a list is optional and only there to allow list operations on the pixels later.
Something has gone wrong - your image has 1277 unique colours, rather than the 5 you suggest.
Have you maybe saved/shared a lossy JPEG rather than the lossless PNG you should prefer for classified images?
A fast method of counting the unique colours with Numpy is as follows:
import numpy as np
def withNumpy(img):
    # Ignore A channel
    px = np.asarray(img)[...,:3]
    # Merge RGB888 into single 24-bit integer
    px24 = np.dot(np.array(px, np.uint32),[1,256,65536])
    # Return number of unique colours
    return len(np.unique(px24))
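If the colours themselves and their pixel counts are wanted, not only the total, the same packing idea can be extended with np.unique(..., return_counts=True). A sketch on a tiny synthetic image; `unique_colours_with_counts` is a hypothetical helper, and the packing uses the same weights as above (R + 256*G + 65536*B), written with shifts:

```python
import numpy as np

def unique_colours_with_counts(rgb):
    """Map each distinct (R, G, B) colour to the number of pixels that have it."""
    px = np.asarray(rgb)[..., :3].astype(np.uint32)
    # Pack the three 8-bit channels into one 24-bit integer per pixel.
    packed = px[..., 0] | (px[..., 1] << 8) | (px[..., 2] << 16)
    values, counts = np.unique(packed, return_counts=True)
    # Unpack each 24-bit value back into its (R, G, B) components.
    colours = [
        (int(v) & 0xFF, (int(v) >> 8) & 0xFF, (int(v) >> 16) & 0xFF)
        for v in values
    ]
    return dict(zip(colours, (int(c) for c in counts)))

# 2x2 synthetic image: two red pixels, one blue, one black.
tiny = np.array(
    [[[255, 0, 0], [255, 0, 0]],
     [[0, 0, 255], [0, 0, 0]]],
    dtype=np.uint8,
)
histogram = unique_colours_with_counts(tiny)
```

Colours with very small counts in such a histogram are a quick hint that the file went through lossy compression.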
EDIT: Code is working now, thanks to Mark and zephyr. zephyr also has two alternate working solutions below.
I want to divide blend two images with PIL. I found ImageChops.multiply(image1, image2) but I couldn't find a similar divide(image, image2) function.
Divide Blend Mode Explained (I used the first two images here as my test sources.)
Is there a built-in divide blend function that I missed (PIL or otherwise)?
My test code below runs and is getting close to what I'm looking for. The resulting image output is similar to the divide blend example image here: Divide Blend Mode Explained.
Is there a more efficient way to do this divide blend operation (less steps and faster)? At first, I tried using lambda functions in Image.eval and ImageMath.eval to check for black pixels and flip them to white during the division process, but I couldn't get either to produce the correct result.
EDIT: Fixed code and shortened thanks to Mark and zephyr. The resulting image output matches the output from zephyr's numpy and scipy solutions below.
# PIL Divide Blend test
import os
from PIL import Image, ImageMath
imgA = Image.open('01background.jpg')
imgB = Image.open('02testgray.jpg')
# split RGB images into 3 channels
rA, gA, bA = imgA.split()
rB, gB, bB = imgB.split()
# divide each channel (image1/image2)
rTmp = ImageMath.eval("int(a/((float(b)+1)/256))", a=rA, b=rB).convert('L')
gTmp = ImageMath.eval("int(a/((float(b)+1)/256))", a=gA, b=gB).convert('L')
bTmp = ImageMath.eval("int(a/((float(b)+1)/256))", a=bA, b=bB).convert('L')
# merge channels into RGB image
imgOut = Image.merge("RGB", (rTmp, gTmp, bTmp))
imgOut.save('PILdiv0.png', 'PNG')
os.system('start PILdiv0.png')
You are asking:
Is there a more efficient way to do this divide blend operation (less steps and faster)?
You could also use the Python package blend_modes. It is written with vectorized NumPy math and is generally fast. Install it via pip install blend_modes. I have written the commands in a more verbose way to improve readability; it would be shorter to chain them. Use blend_modes like this to divide your images:
from PIL import Image
import numpy
import os
from blend_modes import blend_modes
# Load images
imgA = Image.open('01background.jpg')
imgA = numpy.array(imgA)
# append alpha channel
imgA = numpy.dstack((imgA, numpy.ones((imgA.shape[0], imgA.shape[1], 1))*255))
imgA = imgA.astype(float)
imgB = Image.open('02testgray.jpg')
imgB = numpy.array(imgB)
# append alpha channel
imgB = numpy.dstack((imgB, numpy.ones((imgB.shape[0], imgB.shape[1], 1))*255))
imgB = imgB.astype(float)
# Divide images
imgOut = blend_modes.divide(imgA, imgB, 1.0)
# Save images
imgOut = numpy.uint8(imgOut)
imgOut = Image.fromarray(imgOut)
imgOut.save('PILdiv0.png', 'PNG')
os.system('start PILdiv0.png')
Be aware that for this to work, both images need to have the same dimensions, e.g. imgA.shape == (240,320,3) and imgB.shape == (240,320,3).
There is a mathematical definition for the divide function here:
Here's an implementation with scipy/matplotlib:
import numpy as np
import scipy.misc as mpl
a = mpl.imread('01background.jpg')
b = mpl.imread('02testgray.jpg')
c = a/((b.astype('float')+1)/256)
d = c*(c < 255)+255*np.ones(np.shape(c))*(c >= 255)  # >= so that values of exactly 255 stay white
e = d.astype('uint8')
mpl.imsave('output.png', e)
If you don't want to use matplotlib, you can do it like this (I assume you have numpy):
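from PIL import Image
from numpy import asarray, ones, shape
imgA = Image.open('01background.jpg')
imgB = Image.open('02testgray.jpg')
a = asarray(imgA)
b = asarray(imgB)
# divide, scaling so that b == 255 leaves a unchanged
c = a/((b.astype('float')+1)/256)
# clamp to 255 (>= so that values of exactly 255 stay white)
d = c*(c < 255)+255*ones(shape(c))*(c >= 255)
e = d.astype('uint8')
imgOut = Image.fromarray(e)
imgOut.save('PILdiv0.png', 'PNG')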
The problem you're having is when you have a zero in image B - it causes a divide by zero. If you convert all of those values to one instead I think you'll get the desired result. That will eliminate the need to check for zeros and fix them in the result.
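The suggestion of turning zeros into ones could be sketched like this with numpy. Note the assumptions: the scaling `a * 255 / b` is one common way to define the divide blend and differs slightly from the `a / ((b + 1) / 256)` form used in the code above, and `divide_blend` is a hypothetical helper name.

```python
import numpy as np

def divide_blend(a, b):
    """Divide image a by image b (values 0..255), treating zeros in b as ones."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    safe_b = np.maximum(b, 1.0)  # zeros in B become ones, so the division is defined
    out = a * 255.0 / safe_b
    # Anything brighter than white is clamped to 255.
    return np.clip(out, 0, 255).astype(np.uint8)

# One row of made-up channel values, including zeros in the divisor.
top = [[100, 200, 0, 128]]
bottom = [[0, 255, 0, 255]]
blended = divide_blend(top, bottom)
```

Because the divisor is never zero, no separate pass over the result is needed to repair infinities or NaNs.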
I assume this is just a beginner's mistake as I have been working with python 3.6.x for about three months and scikit-image for less than that amount of time. Any help would be greatly appreciated.
The problem is that the mean_intensity attribute of regionprops returns different values depending on the block_size specified in the threshold_local method, here set to 33.
My understanding of regionprops is that mean_intensity would be calculated on the basis of the original input image. If that's the case, then why would mean_intensity values vary as the threshold calculated by threshold_local changes?
Input is the following uint16 grayscale .tif image: slice 1.tif. Here's the code I'm working with:
from skimage import io
from skimage import morphology
from skimage.filters import threshold_local
from skimage.measure import label, regionprops
from numpy import flip
from pandas import DataFrame
def start():
    while True:
        us_input = input(
            "Please enter the name of the file you'd like to analyze.\n> "
        )
        try:
            im = io.imread(us_input)
            break
        except FileNotFoundError:
            print("That file doesn't seem to exist or has been entered incorrectly.")
    detect(im)
def detect(image):
    local_thresh = threshold_local(image, 33, offset=30)
    binary_im = image > local_thresh
    label_im = label(binary_im)
    interest_im = morphology.remove_small_objects(label_im, min_size=14.7) # ignore objects below 5um
    label_interest_im = label(interest_im)
    props(label_interest_im, image)
def props(label_interest_im, image):
    results = []
    im_props = regionprops(label_interest_im, intensity_image=image, cache=False)
    for blob in im_props:
        properties = []
        yx_coords = blob.centroid
        xy_coords = flip(yx_coords, 0)
        real_xy_coords = (xy_coords / .769230769230769) # pixel to um conversion
        round_xy_coords = real_xy_coords.round(1)
        # one row per blob: x, y, mean intensity
        properties.extend(round_xy_coords)
        properties.append(blob.mean_intensity)
        results.append(properties)
    results = DataFrame(results, columns = ['x_coord', 'y_coord', 'mean_intensity'])
    results.index = results.index + 1
Mean intensity works by finding the mean intensity in the original image based on the regions in the label image. These regions will change depending on your thresholding. To get a clear understanding, I suggest, for each threshold_local window size, have a look at the resulting labeled objects. (You can use skimage.color.label2rgb.) You will see that these are different, so it is not surprising that they give different mean intensity values.
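The effect can be reproduced without scikit-image. In the numpy-only sketch below, two global thresholds stand in for two different threshold_local settings (an assumption made purely to keep the example small); the intensity image is the same in both cases, but the mean is taken over a different set of pixels.

```python
import numpy as np

# Small made-up intensity image with a bright blob in the middle.
image = np.array(
    [[10, 10, 10, 10],
     [10, 40, 60, 10],
     [10, 60, 80, 10],
     [10, 10, 10, 10]],
    dtype=np.uint16,
)

def mean_intensity(intensity_image, mask):
    """Mean of the original intensities over the pixels selected by mask."""
    return float(intensity_image[mask].mean())

strict_region = image > 50  # selects 60, 60, 80
loose_region = image > 30   # selects 40, 60, 60, 80

strict_mean = mean_intensity(image, strict_region)  # (60 + 60 + 80) / 3
loose_mean = mean_intensity(image, loose_region)    # (40 + 60 + 60 + 80) / 4
```

The looser threshold pulls a dimmer border pixel into the region, which lowers the mean even though the underlying image never changed.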
I want to know how I can find an image in a massive data set (a folder containing a lot of images). Given an input image from another folder, not the data folder, I want to compare it with every image in the data folder and, if one is exactly the same, show that image's name as output (the name of the matching image in the folder, not the input's name; for example: dafs.jpg).
I am using Python.
I am thinking about comparing the exact RGB pixel values by subtracting the pixels of the input image from those of each image in the folder,
but I don't know how to do that in Python.
Comparing RGB Pixel Values
You could use the pillow module to get access to the pixel data of a particular image. Keep in mind that pillow supports these image formats.
If we make a few assumptions about what it means for 2 images to be identical, based on your description, both images must:
Have the same dimensions (height and width)
Have the same RGB pixel values (the RGB values of pixel [x, y] in the input image must be the same as the RGB values of pixel [x, y] in the output image)
Be of the same orientation (related to the previous assumption, an image is considered to be not identical compared to the same image rotated by 90 degrees)
then if we have 2 images using the pillow module
from PIL import Image
original = Image.open("input.jpg")
possible_duplicate = Image.open("output.jpg")
the following code would be able to compare the 2 images to see if they were identical
def compare_images(input_image, output_image):
    # compare image dimensions (assumption 1)
    if input_image.size != output_image.size:
        return False
    # Image.size is (width, height)
    width, height = input_image.size
    # compare image pixels (assumption 2 and 3)
    for x in range(width):
        for y in range(height):
            input_pixel = input_image.getpixel((x, y))
            output_pixel = output_image.getpixel((x, y))
            if input_pixel != output_pixel:
                return False
    return True
by calling
compare_images(original, possible_duplicate)
Using this function, we could go through a set of images
from PIL import Image
def find_duplicate_image(input_image, output_images):
    # only open the input image once
    input_image = Image.open(input_image)
    for image in output_images:
        if compare_images(input_image, Image.open(image)):
            return image
Putting it all together, we could simply call
original = "input.jpg"
possible_duplicates = ["output.jpg", "output2.jpg", ...]
duplicate = find_duplicate_image(original, possible_duplicates)
Note that the above implementation will only find the first duplicate, and return that. If no duplicate is found, None will be returned.
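If every duplicate is wanted instead of only the first, the loop can collect matches instead of returning early. The sketch below takes the comparison as a parameter so that it runs without image files: `find_all_duplicates` and `is_same` are hypothetical names, and nested lists stand in for pixel data. In the setting above, `is_same` would open each candidate path and call compare_images.

```python
def find_all_duplicates(input_item, candidates, is_same):
    """Return every candidate for which is_same(input_item, candidate) is true."""
    return [candidate for candidate in candidates if is_same(input_item, candidate)]

# Nested lists stand in for the pixel data of three stored images.
images = {
    "a.jpg": [[1, 2], [3, 4]],
    "b.jpg": [[1, 2], [3, 5]],
    "c.jpg": [[1, 2], [3, 4]],
}
target = [[1, 2], [3, 4]]

matches = find_all_duplicates(
    target,
    images,  # iterating a dict yields its keys, here the image names
    lambda wanted, name: images[name] == wanted,
)
```

An empty list then plays the role that None plays in the single-result version.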
One thing to keep in mind is that performing a comparison on every pixel like this can be costly. I used this image and ran compare_images using this as the input and the output 100 times using the timeit module, and took the average of all those runs
import timeit
num_trials = 100
# run the comparison once per trial and record the time of each trial
trials = timeit.repeat(
    stmt="compare_images(Image.open('input.jpg'), Image.open('input.jpg'))",
    setup="from __main__ import compare_images; from PIL import Image",
    repeat=num_trials,
    number=1
)
avg = sum(trials) / num_trials
print("Average time taken per comparison was:", avg, "seconds")
# Average time taken per comparison was 1.3337286046380177 seconds
Note that this was done on an image that was only 600 by 600 pixels. If you did this with a "massive" set of possible duplicate images, where I will take "massive" to mean at least 1M images of similar dimensions, this could take roughly 15 days (1,000,000 * 1.33 s / 60 / 60 / 24) to go through and compare each output image to the input, which is not ideal.
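The back-of-the-envelope estimate can be reproduced with a few lines of arithmetic. This sketch plugs in the average printed by the timing run above; `estimated_days` is a hypothetical helper name, and the result only scales linearly if every comparison really costs about the same.

```python
def estimated_days(num_images, seconds_per_comparison):
    """Total time in days to compare one input against num_images candidates."""
    seconds_per_day = 60 * 60 * 24
    return num_images * seconds_per_comparison / seconds_per_day

# One million candidates at the measured average time per comparison.
days = estimated_days(1_000_000, 1.3337286046380177)
print("Estimated scan time: {:.1f} days".format(days))
```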
Also keep in mind that these metrics will vary based on the machine and operating system you are using. The numbers I provided are more for illustrative purposes.
Alternative Implementation
While I haven't fully explored this implementation myself, one method you could try would be to precompute a hash value of the pixel data of each of the images in your collection using a hash function. If you stored these in a database, with each hash containing a link to the original image or image name, then all you would have to do is calculate the hash of the input image using the same hash function and compare the hashes instead. This would save lots of computation time, and would make for a much more efficient algorithm.
This blog post describes one implementation for doing this.
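A minimal sketch of the hashing idea using only the standard library follows. Everything here is an assumption made for illustration: an in-memory dict plays the role of the database, `pixel_hash`, `build_index` and `find_duplicates` are hypothetical helpers, and raw byte strings stand in for pixel data (with Pillow, the bytes could come from an image's tobytes() method). The size and mode are hashed too, so that images with different shapes but identical raw bytes do not collide.

```python
import hashlib

def pixel_hash(size, mode, pixel_bytes):
    """SHA-256 digest over the image's size, mode and raw pixel bytes."""
    h = hashlib.sha256()
    h.update(repr((size, mode)).encode("ascii"))
    h.update(pixel_bytes)
    return h.hexdigest()

def build_index(images):
    """Map each digest to the names of all images that produce it.

    `images` is an iterable of (name, size, mode, pixel_bytes) tuples.
    """
    index = {}
    for name, size, mode, data in images:
        index.setdefault(pixel_hash(size, mode, data), []).append(name)
    return index

def find_duplicates(index, size, mode, pixel_bytes):
    """Names of indexed images whose pixel data matches the input exactly."""
    return index.get(pixel_hash(size, mode, pixel_bytes), [])

# Three made-up images: c.jpg has the same bytes as a.jpg but a different shape.
collection = [
    ("a.jpg", (2, 1), "RGB", b"\x00" * 6),
    ("b.jpg", (2, 1), "RGB", b"\xff" * 6),
    ("c.jpg", (1, 2), "RGB", b"\x00" * 6),
]
index = build_index(collection)
found = find_duplicates(index, (2, 1), "RGB", b"\x00" * 6)
```

Building the index costs one pass over the collection, after which each lookup is a single hash plus a dictionary access.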
Update - 2018-08-06
As per the request of the OP, if you were given the directory of the possible duplicate images and not the explicit image paths themselves, then you could use the os and ntpath modules like so
import ntpath
import os
def get_all_images(directory):
    image_paths = []
    for filename in os.listdir(directory):
        # to be as careful as possible, you might check to make sure that
        # the file is in fact an image, for instance using
        # filename.endswith(".jpg") to check for .jpg files for instance
        image_paths.append(os.path.join(directory, filename))
    return image_paths
def get_filename(path):
    # strip the directory part and keep only the file name
    return ntpath.basename(path)
Using these functions, the updated program might look like
possible_duplicates = get_all_images("/path/to/images")
duplicate_path = find_duplicate_image("/path/to/input.jpg", possible_duplicates)
if duplicate_path:
    print(get_filename(duplicate_path))
The above will only print the name of the duplicate image if there was a duplicate, otherwise, it will print nothing.
I'm trying to do as described here: Finding a subimage inside a Numpy image to be able to search an image inside screenshot.
The code looks like that:
import cv2
import numpy as np
import gtk.gdk
from PIL import Image
def make_screenshot():
    w = gtk.gdk.get_default_root_window()
    sz = w.get_size()
    pb = gtk.gdk.Pixbuf(gtk.gdk.COLORSPACE_RGB, False, 8, sz[0], sz[1])
    pb = pb.get_from_drawable(w, w.get_colormap(), 0, 0, 0, 0, sz[0], sz[1])
    width, height = pb.get_width(), pb.get_height()
    return Image.fromstring("RGB", (width, height), pb.get_pixels())
if __name__ == "__main__":
    img = make_screenshot()
    cv_im = cv2.cvtColor(np.array(img), cv2.COLOR_RGB2BGR)
    template = cv_im[30:40, 30:40, :]
    result = cv2.matchTemplate(cv_im, template, cv2.TM_CCORR_NORMED)
    print np.unravel_index(result.argmax(), result.shape)
Depending on the method selected (instead of cv2.TM_CCORR_NORMED) I get completely different coordinates, but none of them is (30, 30) as in the example.
Please tell me what is wrong with this approach.
Short answer: you need to use the following line to locate the corner of the best match:
minVal, maxVal, minLoc, maxLoc = cv2.minMaxLoc(result)
The variable maxLoc will hold a tuple containing the x, y indices of the upper lefthand corner of the best match.
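One thing to keep in mind when comparing the two approaches is coordinate order: numpy indexes arrays as (row, column), i.e. (y, x), whereas minMaxLoc reports locations as (x, y). The numpy-only sketch below uses a synthetic score map in place of a real matchTemplate result (an assumption, so that it runs without OpenCV) to show the swap.

```python
import numpy as np

# Synthetic score map standing in for the output of matchTemplate.
result = np.zeros((50, 80), dtype=np.float32)
result[30, 45] = 1.0  # best score at row 30 (y), column 45 (x)

# unravel_index returns indices in array order: (row, column) == (y, x).
row, col = np.unravel_index(result.argmax(), result.shape)

# Swap the pair to get the (x, y) ordering that minMaxLoc uses for maxLoc.
top_left_xy = (int(col), int(row))
```

For a square region taken at the same row and column offset, as in the question, the swap makes no visible difference, but it matters for any other location.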
Long answer:
cv2.matchTemplate() returns a single channel image where the number at each index corresponds to how well the input image matched the template at that index. Try visualizing result by inserting the following lines of code after your call to matchTemplate, and you will see why numpy would have a difficult time making sense of it.
cv2.imshow("Debugging Window", result)
cv2.waitKey(0)
minMaxLoc() turns the result returned by matchTemplate into the information you want. If you cared to know where the template had the worst match, or what value was held by result at the best and worst matches, you could use those values too.
This code worked for me on an example image that I read from file. If your code continues to misbehave, you probably aren't reading in your images the way you want to. The above snippet of code is useful for debugging with OpenCV. Replace the argument result in imshow with the name of any image object (numpy array) to visually confirm that you are getting the image you want.