Count number of classes in a semantic segmented image - python

I have an image that is the output of a semantic segmentation algorithm, for example this one
I looked online and tried many pieces of code but none worked for me so far.
It is clear to the human eye that there are 5 different colors in this image: blue, black, red, and white.
I am trying to write a script in python to analyze the image and return the number of colors present in the image but so far it is not working. There are many pixels in the image which contain values that are a mixture of the colors above.
The code I am using is the following but I would like to understand if there is an easier way in your opinion to achieve this goal.
I think that I need to implement some sort of thresholding that has the following logic:
Is there a similar color to this one? if yes, do not increase the count of colors
Is this color present for more than N pixels? If not, do not increase the count of colors.
from PIL import Image
imgPath = "image.jpg"
img = Image.open(imgPath)
uniqueColors = set()
w, h = img.size
for x in range(w):
for y in range(h):
pixel = img.getpixel((x, y))
uniqueColors.add(pixel)
totalUniqueColors = len(uniqueColors)
print(totalUniqueColors)
print(uniqueColors)
Thanks in advance!

I solved my issue and I am now able to count colors in images coming from a semantic segmentation dataset (the images must be in .png since it is a lossless format).
Below I try to explain what I have found in the process for a solution and the code I used which should be ready to use (you need to just change the path to the images you want to analyze).
I had two main problems.
The first problem of the color counting was the format of the image. I was using (for some of the tests) .jpeg images that compress the image.
Therefore from something like this
If I would zoom in the top left corner of the glass (marked in green) I was seeing something like this
Which obviously is not good since it will introduce many more colors than the ones "visible to the human eye"
Instead, for my annotated images I had something like the following
If I zoom in the saddle of the bike (marked in green) I had something like this
The second problem was that I did not convert my image into an RGB image.
This is taken care in the code from the line:
img = Image.open(filename).convert('RGB')
The code is below. For sure it is not the most efficient but for me it does the job. Any suggestion to improve its performance is appreciated
import numpy as np
from PIL import Image
import argparse
import os
debug = False
def main(data_dir):
print("This small script allows you to count the number of different colors in an image")
print("This code has been written to count the number of classes in images from a semantic segmentation dataset")
print("Therefore, it is highly recommended to run this code on lossless images (such as .png ones)")
print("Images are being loaded from: {}".format(data_dir))
directory = os.fsencode(data_dir)
interesting_image_format = ".png"
# I will put in the variable filenames all the paths to the images to be analyzed
filenames = []
for file in os.listdir(directory):
filename = os.fsdecode(file)
if filename.endswith(interesting_image_format):
if debug:
print(os.path.join(directory, filename))
print("Analyzing image: {}".format(filename))
filenames.append(os.path.join(data_dir, filename))
else:
if debug:
print("I am not doing much here...")
continue
# Sort the filenames in an alphabetical order
filenames.sort()
# Analyze the images (i.e., count the different number of colors in the images)
number_of_colors_in_images = []
for filename in filenames:
img = Image.open(filename).convert('RGB')
if debug:
print(img.format)
print(img.size)
print(img.mode)
data_img = np.asarray(img)
if debug:
print(data_img.shape)
uniques = np.unique(data_img.reshape(-1, data_img.shape[-1]), axis=0)
# uncomment the following line if you want information for each analyzed image
print("The number of different colors in image ({}) {} is: {}".format(interesting_image_format, filename, len(uniques)))
# print("uniques.shape[0] for image {} is: {}".format(filename, uniques.shape[0]))
# Put the number of colors of each image into an array
number_of_colors_in_images.append(len(uniques))
print(number_of_colors_in_images)
# Print the maximum number of colors (classes) of all the analyzed images
print(np.max(number_of_colors_in_images))
# Print the average number of colors (classes) of all the analyzed images
print(np.average(number_of_colors_in_images))
def args_preprocess():
# Command line arguments
parser = argparse.ArgumentParser()
parser.add_argument(
"--data_dir", default="default_path_to_images", type=str, help='Specify the directory path from where to take the images of which we want to count the classes')
args = parser.parse_args()
main(args.data_dir)
if __name__ == '__main__':
args_preprocess()

The thing mentioned above about the lossy compression in .jpeg images and lossless compression in .png seems to be a nice thing to point out. But you can use the following piece of code to get the number of classes from a mask.
This is only applicable on .png images. Not tested on .jpeg images.
import cv2 as cv
import numpy as np
img_path = r'C:\Users\Bhavya\Downloads\img.png'
img = cv.imread(img_path)
img = np.array(img, dtype='int32')
pixels = []
for i in range(img.shape[0]):
for j in range(img.shape[1]):
r, g, b = list(img[i, j, :])
pixels.append((r, g, b))
pixels = list(set(pixels))
print(len(pixels))
In this solution what I have done is appended pair of pixel values(RGB) in the input image to a list and converted the list to set and then back to list. The first conversion of list to set removes all the duplicate elements(here pixel values) and gives unique pixel values and the next conversion from set to list is optional and just to apply some future list operations on the pixels.

Something has gone wrong - your image has 1277 unique colours, rather than the 5 you suggest.
Have you maybe saved/shared a lossy JPEG rather than the lossless PNG you should prefer for classified images?
A fast method of counting the unique colours with Numpy is as follows:
def withNumpy(img):
# Ignore A channel
px = np.asarray(img)[...,:3]
# Merge RGB888 into single 24-bit integer
px24 = np.dot(np.array(px, np.uint32),[1,256,65536])
# Return number of unique colours
return len(np.unique(px24))

Related

The python code to stack images runs extremely slow, looking for suggestions to speed it up

I have written some code to read the RGB values for each pixel of ~150 images (1000px by 720px, cropped and sized).
import os
from PIL import Image
print("STACKING IMAGES...")
os.chdir('cropped')
images=os.listdir() #list all images present in directory
print("GETTING IMAGES...")
channelR=[]
channelG=[]
channelB=[]
print("GETTING PIXEL INFORMATION...") #runs reasonably fast
for image in images: #loop through each image to extract RGB channels as separate lists
with Image.open(image) as img:
if image==images[0]:
imgSize=img.size
channelR.append(list(img.getdata(0)))
channelG.append(list(img.getdata(1)))
channelB.append(list(img.getdata(2)))
print("PIXEL INFORMATIION COLLECTED.")
print("AVERAGING IN CHANNEL RED.") #average for each pixel in each channel
avgR=[round(sum(x)/len(channelR)) for x in zip(*channelR)] #unzip the each pixel from all ~250 images, average it, store in tuple, starts to slow
print("AVERAGING IN CHANNEL GREEN.")
avgG=[round(sum(x)/len(channelG)) for x in zip(*channelG)] #slower
print("AVERAGING IN CHANNEL BLUE.")
avgB=[round(sum(x)/len(channelB)) for x in zip(*channelB)] #progressively slower
print("MERGING DATA ACROSS THREE CHANNELS.")
mergedData=[(x) for x in zip(avgR, avgG, avgB)] #merge averaged colour channels pixel by pixel, doesn't seem to end, takes eternity
print("GENERATING IMAGE.")
stacked=Image.new('RGB', (imgSize)) #create image
stacked.putdata(mergedData) #generate image
stacked.show()
os.chdir('..')
stacked.save('stacked.tif', 'TIFF') #save file
print("FINISHED STACKING !")
Running it on my modestly equipped computer (Core2Duo, 4GB RAM, Linux Mint OS) took close to an hour for the averaging across the three channels to complete and a further one hour to merge the individual averaged pixels (did not complete, and I aborted the process). I have read that list comprehensions are slow and zip() function takes up too much memory, but tinkering with those resulted in further bugs. I have even read that partitioning the program into functions might speed it up.
For comparable performances, I would kindly request the person answering the question to run the code on the images from https://github.com/rlvaugh/Impractical_Python_Projects/tree/master/Chapter_15/video_frames.
Any help on speeding-up the program would be gratefully accepted. Does it hold any chance of improving its speed drastically on shifting to more powerful systems?
Thank you in advance for any help.
Appending to lists is slow. As is having multiple list comprehensions for something you could do in a single loop. You could also use numpy arrays to speed it up using SIMD operations instead of iterating over list.
Here's some sample code for a few images. You can extend it as per your requirements.
import os
import numpy as np
import PIL
os.chdir('cropped')
imgfiles = ['MVI_6450 001.jpg', 'MVI_6450 002.jpg', 'MVI_6450 003.jpg', 'MVI_6450 004.jpg']
allimgs = None
for imgnum, imgfile in enumerate(imgfiles):
img = PIL.Image.open(imgfile)
imgdata = np.array(img.getdata()) # Nx3 array. columns: R, G, B channels
if allimgs is None:
allshape = list(imgdata.shape) # Size of one image
allshape.append(len(imgfiles)) # Append number of images
# allshape is now [num_pixels, num_channels, num_images]
# so making an array of this shape will allow us to store all images
# Axis 0: pixels. Axis 1: channels. Axis 2: images
allimgs = np.zeros(allshape)
allimgs[:, :, imgnum] = imgdata # Set the imgnum'th image data
# Get the mean along the last axis
# average same pixel across all images for each channel
imgavg = np.mean(allimgs, axis=-1)
# normalize so that max value is 255
# Also convert to uint8
imgavg = np.uint8(imgavg / np.max(imgavg) * 255)
imgavg_tuple = tuple(map(tuple, imgavg))
stacked = PIL.Image.new("RGB", img.size)
stacked.putdata(imgavg_tuple)
stacked.show()
os.chdir('..')
Note: We create a numpy array to hold all images at the start instead of appending as we load more images because it's a bad, bad idea to append to numpy arrays as Jacob mentions in a comment below. This is because numpy array append actually creates a new array and then copies the contents of both arrays, so it's an O(n^2) operation.

check if there is Exactly the same image as input image

I want to know How can i Find an image in Massive data (there are a lot of images in a Folder) and i want to Find image which is Exactly the same as input image (Given an input image from another folder not in the data folder ) and Compare the input image with all of the massive data , if it found Exactly The Same Image ,then show its name as output(the name of the Same Image in Folder,Not input name) (for example: dafs.jpg)
using python
I am thinking about comparing the exact value of RGB pixels and Subtract the pixel of input image from each of the images in the folder
but i don't know how to do that in python
Comparing RGB Pixel Values
You could use the pillow module to get access to the pixel data of a particular image. Keep in mind that pillow supports these image formats.
If we make a few assumptions about what it means for 2 images to be identical, based on your description, both images must:
Have the same dimensions (height and width)
Have the same RGB pixel values (the RGB values of pixel [x, y] in the input image must be the same as the RGB values of pixel [x, y] in the output image)
Be of the same orientation (related to the previous assumption, an image is considered to be not identical compared to the same image rotated by 90 degrees)
then if we have 2 images using the pillow module
from PIL import Image
original = Image.open("input.jpg")
possible_duplicate = Image.open("output.jpg")
the following code would be able to compare the 2 images to see if they were identical
def compare_images(input_image, output_image):
# compare image dimensions (assumption 1)
if input_image.size != output_image.size:
return False
rows, cols = input_image.size
# compare image pixels (assumption 2 and 3)
for row in range(rows):
for col in range(cols):
input_pixel = input_image.getpixel((row, col))
output_pixel = output_image.getpixel((row, col))
if input_pixel != output_pixel:
return False
return True
by calling
compare_images(original, possible_duplicate)
Using this function, we could go through a set of images
from PIL import Image
def find_duplicate_image(input_image, output_images):
# only open the input image once
input_image = Image.open(input_image)
for image in output_images:
if compare_images(input_image, Image.open(image)):
return image
Putting it all together, we could simply call
original = "input.jpg"
possible_duplicates = ["output.jpg", "output2.jpg", ...]
duplicate = find_duplicate_image(original, possible_duplicates)
Note that the above implementation will only find the first duplicate, and return that. If no duplicate is found, None will be returned.
One thing to keep in mind is that performing a comparison on every pixel like this can be costly. I used this image and ran compare_images using this as the input and the output 100 times using the timeit module, and took the average of all those runs
num_trials = 100
trials = timeit.repeat(
repeat=num_trials,
number=1,
stmt="compare_images(Image.open('input.jpg'), Image.open('input.jpg'))",
setup="from __main__ import compare_images; from PIL import Image"
)
avg = sum(trials) / num_trials
print("Average time taken per comparison was:", avg, "seconds")
# Average time taken per comparison was 1.3337286046380177 seconds
Note that this was done on an image that was only 600 by 600 pixels. If you did this with a "massive" set of possible duplicate images, where I will take "massive" to mean at least 1M images of similar dimensions, this could possibly take ~15 days (1,000,000 * 1.28s / 60 seconds / 60 minutes / 24 hours) to go through and compare each output image to the input, which is not ideal.
Also keep in mind that these metrics will vary based on the machine and operating system you are using. The numbers I provided are more for illustrative purposes.
Alternative Implementation
While I haven't fully explored this implementation myself, one method you could try would be to precompute a hash value of the pixel data of each of your images in your collection using a hash function. If you stored these in a database, with each hash containing a link to the original image or image name, then all you would have to do is calculate the hash of the input image using the same hashing function and compare the hashes instead. This would same lots of computation time, and would make a much more efficient algorithm.
This blog post describes one implementation for doing this.
Update - 2018-08-06
As per the request of the OP, if you were given the directory of the possible duplicate images and not the explicit image paths themselves, then you could use the os and ntpath modules like so
import ntpath
import os
def get_all_images(directory):
image_paths = []
for filename in os.listdir(directory):
# to be as careful as possible, you might check to make sure that
# the file is in fact an image, for instance using
# filename.endswith(".jpg") to check for .jpg files for instance
image_paths.append("{}/{}".format(directory, filename))
return image_paths
def get_filename(path):
return ntpath.basename(path)
Using these functions, the updated program might look like
possible_duplicates = get_all_images("/path/to/images")
duplicate_path = find_duplicate_image("/path/to/input.jpg", possible_duplicates)
if duplicate_path:
print(get_filename(duplicate_path))
The above will only print the name of the duplicate image if there was a duplicate, otherwise, it will print nothing.

Sticking two pictures together in python

Short question, I have 2 images. One is imported through:
Image = mpimg.imread('image.jpg')
While the other one is a processed image of the one imported above, this image is first converted from rgb to hls and then back. The outcome of this convertion gives a "list" which is different than the uint8 of the imported image.
When I'm trying to stick these images together with the function:
new_img2[:height,width:width*2]=image2
I don't see the second image in the combined image while by plotting the image through:
imgplot = plt.imshow(image2)
plt.show()
It works fine. What is the best way to convert the orignal to a "list" and then combine them or the "list" to uint8?
For some more information, the outcome has to be something like this:
enter image description here
Where the right side is black because the image I try to import in it has another type of array. The left image was an uint8 while the other is a "list". The second image is this one, which is saved from python:
enter image description here
Not sure how to do it the way you have show above but I have always been able to merge and save images as shown below!
def mergeImages(image1, image2, dir):
'''
Merge Image 1 and Image 2 side by side and delete the origional
'''
#adding a try/except would cut down on directory errors. not needed if you know you will always open correct images
if image1 == None:
image1.save(dir)
os.remove(image2)
return
im1 = Image.open(image1) #open image
im1.thumbnail((640,640)) #scales the image to 640, 480. Can be changed to whatever you need
im2 = Image.open(image2) #open Image
im1.thumbnail((640,480)) #Again scale
new_im = Image.new('RGB', (2000,720)) #Create a blank canvas image, size can be changed for your needs
new_im.paste(im1, (0,0)) #pasting image one at pos (0,0), can be changed for you
new_im.paste(im2, (640,0)) #again pasting
new_im.save(dir) #save image in defined directory
os.remove(image1) #Optionally deleting the origonal images, I do this to save on space
os.remove(image2)
After a day of searching I found out that both variables can be changed to the type of a float64. The "list" variable:
Image = np.asarray(Image)
This creates an float 64 from a List variable. While the uint8 can be changed to a float64 by:
Image2=np.asarray(Image2/255)
Than the 2 can be combined with:
totalImgage = np.hstack((Image,Image2))
Which than creates the wanted image.

How can i find cycles in a skeleton image with python libraries?

I have many skeletonized images like this:
How can i detect a cycle, a loop in the skeleton?
Are there "special" functions that do this or should I implement it as a graph?
In case there is only the graph option, can the python graph library NetworkX can help me?
You can exploit the topology of the skeleton. A cycle will have no holes, so we can use scipy.ndimage to find any holes and compare. This isn't the fastest method, but it's extremely easy to code.
import scipy.misc, scipy.ndimage
# Read the image
img = scipy.misc.imread("Skel.png")
# Retain only the skeleton
img[img!=255] = 0
img = img.astype(bool)
# Fill the holes
img2 = scipy.ndimage.binary_fill_holes(img)
# Compare the two, an image without cycles will have no holes
print "Cycles in image: ", ~(img == img2).all()
# As a test break the cycles
img3 = img.copy()
img3[0:200, 0:200] = 0
img4 = scipy.ndimage.binary_fill_holes(img3)
# Compare the two, an image without cycles will have no holes
print "Cycles in image: ", ~(img3 == img4).all()
I've used your "B" picture as an example. The first two images are the original and the filled version which detects a cycle. In the second version, I've broken the cycle and nothing gets filled, thus the two images are the same.
First, let's build an image of the letter B with PIL:
import Image, ImageDraw, ImageFont
image = Image.new("RGBA", (600,150), (255,255,255))
draw = ImageDraw.Draw(image)
fontsize = 150
font = ImageFont.truetype("/usr/share/fonts/truetype/liberation/LiberationMono-Regular.ttf", fontsize)
txt = 'B'
draw.text((30, 5), txt, (0,0,0), font=font)
img = image.resize((188,45), Image.ANTIALIAS)
print type(img)
plt.imshow(img)
you may find a better way to do that, particularly with path to the fonts. Ii would be better to load an image instead of generating it. Anyway, we have now something to work on:
Now, the real part:
import mahotas as mh
img = np.array(img)
im = img[:,0:50,0]
im = im < 128
skel = mh.thin(im)
noholes = mh.morph.close_holes(skel)
plt.subplot(311)
plt.imshow(im)
plt.subplot(312)
plt.imshow(skel)
plt.subplot(313)
cskel = np.logical_not(skel)
choles = np.logical_not(noholes)
holes = np.logical_and(cskel,noholes)
lab, n = mh.label(holes)
print 'B has %s holes'% str(n)
plt.imshow(lab)
And we have in the console (ipython):
B has 2 holes
Converting your skeleton image to a graph representation is not trivial, and I don't know of any tools to do that for you.
One way to do it in the bitmap would be to use a flood fill, like the paint bucket in photoshop. If you start a flood fill of the image, the entire background will get filled if there are no cycles. If the fill doesn't get the entire image then you've found a cycle. Robustly finding all the cycles could require filling multiple times.
This is likely to be very slow to execute, but probably much faster to code than a technique where you trace the skeleton into graph data structure.

Auto-cropping images with PIL

I have folder full of images with each image containing at least 4 smaller images. I would to know how I can cut the smaller images out using Python PIL so that they will all exist as independent image files. fortunately there is one constant, the background is either white or black so what I'm guessing I need is a way to the cut these images out by searching for rows or preferably columns which are entirely black or entirely white, Here is an example image:
From the image above, there would be 10 separate images, each containing a number. Thanks in advance.
EDIT: I have another sample image that is more realistic in the sense that the backgrounds of some of the smaller images are the same colour as the background of the image they are contained in. e.g.
The output of which being 13 separate images, each containng 1 letter
Using scipy.ndimage for labeling:
import numpy as np
import scipy.ndimage as ndi
import Image
THRESHOLD = 100
MIN_SHAPE = np.asarray((5, 5))
filename = "eQ9ts.jpg"
im = np.asarray(Image.open(filename))
gray = im.sum(axis=-1)
bw = gray > THRESHOLD
label, n = ndi.label(bw)
indices = [np.where(label == ind) for ind in xrange(1, n)]
slices = [[slice(ind[i].min(), ind[i].max()) for i in (0, 1)] + [slice(None)]
for ind in indices]
images = [im[s] for s in slices]
# filter out small images
images = [im for im in images if not np.any(np.asarray(im.shape[:-1]) < MIN_SHAPE)]

Categories

Resources