For example, I have 100 pictures with the same resolution, and I want to merge them into one picture. In the final picture, the RGB value of each pixel is the average of the RGB values of the 100 pictures at that position. I know the getdata function can work in this situation, but is there a simpler and faster way to do this in PIL (Python Imaging Library)?
Let's assume that your images are all .png files and they are all stored in the current working directory. The Python code below will do what you want. As Ignacio suggests, using numpy along with PIL is the key here. You just need to be a little bit careful about switching between integer and float arrays when building your average pixel intensities.
import os
import numpy
from PIL import Image

# Access all PNG files in directory
allfiles = os.listdir(os.getcwd())
imlist = [filename for filename in allfiles if filename[-4:] in [".png", ".PNG"]]

# Assuming all images are the same size, get dimensions of first image
w, h = Image.open(imlist[0]).size
N = len(imlist)

# Create a numpy array of floats to store the average (assume RGB images)
arr = numpy.zeros((h, w, 3), numpy.float64)

# Build up average pixel intensities, casting each image as an array of floats
for im in imlist:
    imarr = numpy.array(Image.open(im), dtype=numpy.float64)
    arr = arr + imarr / N

# Round values in array and cast as 8-bit integer
arr = numpy.array(numpy.round(arr), dtype=numpy.uint8)

# Generate, save and preview final image
out = Image.fromarray(arr, mode="RGB")
out.save("Average.png")
out.show()
The image below was generated from a sequence of HD video frames using the code above.
I find it difficult to imagine a situation where memory is an issue here, but in the (unlikely) event that you absolutely cannot afford to create the array of floats required for my original answer, you could use PIL's blend function, as suggested by @mHurley, as follows:
# Alternative method using PIL blend function
avg = Image.open(imlist[0])
for i in range(1, N):
    img = Image.open(imlist[i])
    avg = Image.blend(avg, img, 1.0 / float(i + 1))
avg.save("Blend.png")
avg.show()
You could derive the correct sequence of alpha values, starting with the definition from PIL's blend function:
out = image1 * (1.0 - alpha) + image2 * alpha
Think about applying that function recursively to a vector of numbers (rather than images) to get the mean of the vector. For a vector of length N, you would need N-1 blending operations, with N-1 different values of alpha.
However, it's probably easier to think intuitively about the operations. At each step you want the avg image to contain equal proportions of the source images from earlier steps. When blending the first and second source images, alpha should be 1/2 to ensure equal proportions. When blending the third with the average of the first two, you would like the new image to be made up of 1/3 of the third image, with the remainder made up of the average of the previous images (current value of avg), and so on.
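As a quick sanity check (my own sketch, not part of the original answer), applying the same recursion to a plain list of numbers with alpha = 1/(i+1) reproduces the arithmetic mean:

# Minimal sketch: blending with alpha = 1/(i+1) gives the running mean
values = [3.0, 7.0, 5.0, 9.0]

avg = values[0]
for i in range(1, len(values)):
    alpha = 1.0 / (i + 1)
    # out = image1 * (1.0 - alpha) + image2 * alpha
    avg = avg * (1.0 - alpha) + values[i] * alpha

print(avg)                        # 6.0
print(sum(values) / len(values))  # 6.0, the same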
In principle this new answer, based on blending, should be fine. However, I don't know exactly how the blend function works, which makes me worry about how the pixel values are rounded after each iteration.
The image below was generated from 288 source images using the code from my original answer:
On the other hand, this image was generated by repeatedly applying PIL's blend function to the same 288 images:
I hope you can see that the outputs from the two algorithms are noticeably different. I expect this is because of the accumulation of small rounding errors during repeated application of Image.blend.
I strongly recommend my original answer over this alternative.
One can also use the numpy mean function for averaging. The code looks better and works faster.
Here is a comparison of timings and results for 700 noisy grayscale images of faces:
import numpy as np
from PIL import Image

def average_img_1(imlist):
    # Assuming all images are the same size, get dimensions of first image
    w, h = Image.open(imlist[0]).size
    N = len(imlist)
    # Create a numpy array of floats to store the average (assume grayscale images)
    arr = np.zeros((h, w), np.float64)
    # Build up average pixel intensities, casting each image as an array of floats
    for im in imlist:
        imarr = np.array(Image.open(im), dtype=np.float64)
        arr = arr + imarr / N
    out = Image.fromarray(arr)
    return out

def average_img_2(imlist):
    # Alternative method using PIL blend function
    N = len(imlist)
    avg = Image.open(imlist[0])
    for i in range(1, N):
        img = Image.open(imlist[i])
        avg = Image.blend(avg, img, 1.0 / float(i + 1))
    return avg

def average_img_3(imlist):
    # Alternative method using numpy mean function
    images = np.array([np.array(Image.open(fname)) for fname in imlist])
    arr = np.array(np.mean(images, axis=0), dtype=np.uint8)
    out = Image.fromarray(arr)
    return out
average_img_1()
100 loops, best of 3: 362 ms per loop
average_img_2()
100 loops, best of 3: 340 ms per loop
average_img_3()
100 loops, best of 3: 311 ms per loop
BTW, the results of averaging are quite different. I think the first method loses information during averaging, and the second one has some artifacts.
(Result images for average_img_1, average_img_2 and average_img_3 omitted.)
In case anybody is interested in a blueprint numpy solution (I was actually looking for one), here's the code:
mean_frame = np.mean(frames, axis=0)  # frames: iterable of same-shape arrays
I would consider creating an array of x by y integers, all starting at (0, 0, 0), then for each pixel in each file adding the RGB value in, dividing all the values by the number of images, and then creating the image from that - you will probably find that numpy can help.
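A minimal sketch of that idea (my own illustration; it assumes same-size RGB .png files in the current working directory):

import os
import numpy as np
from PIL import Image

files = [f for f in os.listdir('.') if f.lower().endswith('.png')]
w, h = Image.open(files[0]).size

# Accumulate into a wide integer type so the running sums cannot overflow
total = np.zeros((h, w, 3), dtype=np.uint64)
for name in files:
    total += np.asarray(Image.open(name).convert('RGB'), dtype=np.uint64)

# Divide by the number of images and cast back to 8-bit
avg = (total // len(files)).astype(np.uint8)
Image.fromarray(avg, mode='RGB').save('average.png')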
I ran into MemoryErrors when trying the method in the accepted answer. I found a way to optimize it that seems to produce the same result. Basically, you blend one image at a time, instead of adding them all up and dividing.
N = len(images_to_blend)
avg = Image.open(images_to_blend[0])
for im in images_to_blend:  # assuming your list is filenames, not images
    img = Image.open(im)
    avg = Image.blend(avg, img, 1.0 / N)
avg.save("blended.png")
This does two things: you don't have to keep two very dense copies of the image while you're turning the image into an array, and you don't have to use 64-bit floats at all. You get similarly high precision, with smaller numbers. The results APPEAR to be the same, though I'd appreciate it if someone checked my math.
My problem is as follows. I have an image img0 (array shape (A,B,3)) and then a face img1 cut out from the middle of that image (by an algorithm I don't have access to: my input is only the whole image, and the face cut out from it), now an array shaped (C,D,3) where C<A and D<B. Now, I want to perform operations on the face (e.g., colour it differently) and then stick it back inside the original background (which is not coloured differently) -- these operations will not affect the shape of the img1 array containing the face alone, it will remain (C,D,3). Something like img0-img1 doesn't work because of the shape mismatch.
I guess an approach like finding the starting coordinate of the face in img0 would work in the case that the face cut out is rectangular (which is possible for me to use, though not ideal), since it is guaranteed that the face is exactly identical in img1 and img0. That means, to get the background, we only need to find the starting coordinate of the img1 array in img0, cut out the subsequent elements (that correspond to img1) from img0, and we're left with the background. After I've done whatever I want to the face, I can use the new (C,D,3) array in place of the previous img1 part of the whole image (img0).
Is there a way to do this in Python? i.e., compute the difference between two images of different sizes, where one image is a 'subimage' of the other? Or, failing that, can we find the starting coordinate of the rectangular portion of an image (img0) which corresponds to a rectangular cutout available to us (img1)?
One easy way to do that would be to cross-correlate your zero-mean cut-out with the zero-mean original image. As you have no noise added to the image, any maximum of the cross-correlation is a possible candidate.
However:
(i) If you don't use faces but e.g. blocks, there will be multiple maxima and you won't have a unique solution.
(ii) It is not exactly an elegant solution to your problem.
I modified the code example from [1] to make it clearer:
from scipy import signal, misc
import numpy as np
face = misc.face(gray=True)
face = face - np.mean(face)
face_cutout = np.copy(face[300:365, 670:750])
face_cutout = face_cutout - np.mean(face_cutout)
corr = signal.correlate2d(face, face_cutout, mode='valid')
y, x = np.unravel_index(np.argmax(corr), corr.shape) # find the match
print(f'x: {x} y: {y}')
[1] https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.correlate2d.html
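As a follow-up sketch (my own addition, not part of the answer above): once you have the top-left match coordinate (x, y), pasting the edited cut-out back into the original is plain numpy slicing. Here modified_face stands in for the edited (C,D,3) face from the question, and img0 is the (A,B,3) original:

# Write the edited cut-out back over the matched region of img0
C, D = modified_face.shape[:2]
img0[y:y + C, x:x + D] = modified_face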
I have written some code to read the RGB values for each pixel of ~150 images (1000px by 720px, cropped and sized).
import os
from PIL import Image

print("STACKING IMAGES...")
os.chdir('cropped')
images = os.listdir()  # list all images present in directory
print("GETTING IMAGES...")
channelR = []
channelG = []
channelB = []

print("GETTING PIXEL INFORMATION...")  # runs reasonably fast
for image in images:  # loop through each image to extract RGB channels as separate lists
    with Image.open(image) as img:
        if image == images[0]:
            imgSize = img.size
        channelR.append(list(img.getdata(0)))
        channelG.append(list(img.getdata(1)))
        channelB.append(list(img.getdata(2)))
print("PIXEL INFORMATION COLLECTED.")

print("AVERAGING IN CHANNEL RED.")  # average for each pixel in each channel
avgR = [round(sum(x) / len(channelR)) for x in zip(*channelR)]  # unzip each pixel from all ~150 images and average; starts to slow
print("AVERAGING IN CHANNEL GREEN.")
avgG = [round(sum(x) / len(channelG)) for x in zip(*channelG)]  # slower
print("AVERAGING IN CHANNEL BLUE.")
avgB = [round(sum(x) / len(channelB)) for x in zip(*channelB)]  # progressively slower

print("MERGING DATA ACROSS THREE CHANNELS.")
mergedData = list(zip(avgR, avgG, avgB))  # merge averaged colour channels pixel by pixel; doesn't seem to end, takes an eternity

print("GENERATING IMAGE.")
stacked = Image.new('RGB', imgSize)  # create image
stacked.putdata(mergedData)  # generate image
stacked.show()
os.chdir('..')
stacked.save('stacked.tif', 'TIFF')  # save file
print("FINISHED STACKING !")
Running it on my modestly equipped computer (Core2Duo, 4GB RAM, Linux Mint OS) took close to an hour for the averaging across the three channels to complete, and a further hour to merge the individual averaged pixels (which did not complete; I aborted the process). I have read that list comprehensions are slow and that the zip() function takes up too much memory, but tinkering with those resulted in further bugs. I have even read that partitioning the program into functions might speed it up.
For comparable performance, I would kindly request that anyone answering the question run the code on the images from https://github.com/rlvaugh/Impractical_Python_Projects/tree/master/Chapter_15/video_frames.
Any help on speeding up the program would be gratefully accepted. Does it stand any chance of improving its speed drastically on a more powerful system?
Thank you in advance for any help.
Appending to lists is slow. So is having multiple list comprehensions for something you could do in a single loop. You could also use numpy arrays to speed things up with vectorized (SIMD-style) operations instead of iterating over lists.
Here's some sample code for a few images. You can extend it as per your requirements.
import os
import numpy as np
import PIL.Image

os.chdir('cropped')
imgfiles = ['MVI_6450 001.jpg', 'MVI_6450 002.jpg', 'MVI_6450 003.jpg', 'MVI_6450 004.jpg']

allimgs = None
for imgnum, imgfile in enumerate(imgfiles):
    img = PIL.Image.open(imgfile)
    imgdata = np.array(img.getdata())  # Nx3 array. columns: R, G, B channels
    if allimgs is None:
        allshape = list(imgdata.shape)  # Size of one image
        allshape.append(len(imgfiles))  # Append number of images
        # allshape is now [num_pixels, num_channels, num_images]
        # so making an array of this shape will allow us to store all images
        # Axis 0: pixels. Axis 1: channels. Axis 2: images
        allimgs = np.zeros(allshape)
    allimgs[:, :, imgnum] = imgdata  # Set the imgnum'th image data

# Get the mean along the last axis
# average same pixel across all images for each channel
imgavg = np.mean(allimgs, axis=-1)

# normalize so that max value is 255
# Also convert to uint8
imgavg = np.uint8(imgavg / np.max(imgavg) * 255)
imgavg_tuple = tuple(map(tuple, imgavg))

stacked = PIL.Image.new("RGB", img.size)
stacked.putdata(imgavg_tuple)
stacked.show()
os.chdir('..')
Note: We create a numpy array to hold all images at the start, instead of appending as we load more images, because it's a bad, bad idea to append to numpy arrays, as Jacob mentions in a comment below. Each numpy append actually creates a new array and copies the contents of both arrays, so a loop of appends is an O(n^2) operation overall.
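A small illustration of the difference (hypothetical data, my own addition):

import numpy as np

rows = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Slow pattern: each np.append allocates a fresh array and copies
# everything accumulated so far, so a loop of appends is O(n^2) overall
acc = np.empty((0, 3))
for row in rows:
    acc = np.append(acc, [row], axis=0)

# Fast pattern: allocate once, then fill in place
acc = np.zeros((len(rows), 3))
for i, row in enumerate(rows):
    acc[i, :] = row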
Assuming there are only 2 colors in an image, what's the simplest way in Python to tell that one image has more of one of these 2 colors (by colored area) than another image in a group of similar images?
Definition of "more": the total area of colored blocks in one picture is bigger than in the other. (Please note the shapes of the colored regions might be irregular.)
Thank you.
Okay, after some experimentation, I have a possible solution. You can use Pillow, a common image-loading/handling library, to convert the images to an ndarray, and then use numpy's unique() with return_counts to get your desired results. As a fun side-effect, this works with an arbitrary number of colors. Here's full working code that I just tried:
from PIL import Image # because for some reason, that's how you import something from Pillow
import numpy as np
im = Image.open("/path/to/image.png")
arr = np.array(im.getdata())
unique_colors, counts = np.unique(arr.reshape(-1, arr.shape[1]), axis=0, return_counts=True)
Now the unique_colors variable holds the unique colors that appear in your image, and counts holds the corresponding counts for each color in the image; that is to say, counts[i] is the number of times unique_colors[i] appears in the image for any i.
How does the unique + reshaping line work? This is borrowed from this particular answer. Basically, you flatten out your image array such that it has shape (num_pixels, num_channels), which could be 1, 3, or 4 depending on your image format (single-channel, RGB, RGBA, etc.). Now that I have a giant 2D "table" of pixels, I simply find which row values (hence axis=0) are unique, and then use the return_counts keyword to return, well, the counts.
At this point, you have extracted the unique colors and counts of those colors for a single image. To compare multiple images, you would repeat this process on multiple images, find the colors they have in common, and then you can simply compare integers to find out which image has more of a particular color.
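A rough sketch of that comparison step (my own illustration; the helper wraps the unique/counts logic above, the color value and file names are made up, and 3-channel RGB images are assumed):

from PIL import Image
import numpy as np

def get_color_counts(path):
    # Map each unique color in the image to its pixel count
    arr = np.array(Image.open(path).getdata())
    colors, counts = np.unique(arr.reshape(-1, arr.shape[1]),
                               axis=0, return_counts=True)
    return {tuple(color): count for color, count in zip(colors, counts)}

# Hypothetical usage: which image contains more pure-red pixels?
red = (255, 0, 0)
counts_a = get_color_counts("image_a.png")
counts_b = get_color_counts("image_b.png")
print("image_a" if counts_a.get(red, 0) > counts_b.get(red, 0) else "image_b")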
For my particular image, the format of the channels happened to be RGBA. In any case, I would recommend printing out arr.shape prior to the reshape step to verify that you have the correct index, since you may have to change the index of arr.shape depending on your image. If you or anyone else knows of a more general method to find the channel index of an image obtained in this fashion, I'm all ears. For the record, I tried this on a .png image, like you specified. Hope this helps!
I want to know how I can find an image in massive data (a folder containing a lot of images). Given an input image from another folder (not in the data folder), I want to compare the input image with all of the massive data, and if it finds exactly the same image, show that image's name as output (the name of the matching image in the folder, not the input name), for example: dafs.jpg.
Using Python.
I am thinking about comparing the exact RGB pixel values and subtracting the pixels of the input image from each of the images in the folder, but I don't know how to do that in Python.
Comparing RGB Pixel Values
You could use the pillow module to get access to the pixel data of a particular image. Keep in mind that pillow supports these image formats.
If we make a few assumptions about what it means for 2 images to be identical, based on your description, both images must:
Have the same dimensions (height and width)
Have the same RGB pixel values (the RGB values of pixel [x, y] in the input image must be the same as the RGB values of pixel [x, y] in the output image)
Be of the same orientation (related to the previous assumption, an image is considered not identical to the same image rotated by 90 degrees)
Then, if we have 2 images loaded using the pillow module
from PIL import Image
original = Image.open("input.jpg")
possible_duplicate = Image.open("output.jpg")
the following code would be able to compare the 2 images to see if they were identical
def compare_images(input_image, output_image):
    # compare image dimensions (assumption 1)
    if input_image.size != output_image.size:
        return False

    width, height = input_image.size

    # compare image pixels (assumption 2 and 3)
    for x in range(width):
        for y in range(height):
            input_pixel = input_image.getpixel((x, y))
            output_pixel = output_image.getpixel((x, y))
            if input_pixel != output_pixel:
                return False

    return True
by calling
compare_images(original, possible_duplicate)
Using this function, we could go through a set of images
from PIL import Image

def find_duplicate_image(input_image, output_images):
    # only open the input image once
    input_image = Image.open(input_image)
    for image in output_images:
        if compare_images(input_image, Image.open(image)):
            return image
Putting it all together, we could simply call
original = "input.jpg"
possible_duplicates = ["output.jpg", "output2.jpg", ...]
duplicate = find_duplicate_image(original, possible_duplicates)
Note that the above implementation will only find the first duplicate, and return that. If no duplicate is found, None will be returned.
One thing to keep in mind is that performing a comparison on every pixel like this can be costly. I used this image and ran compare_images using this as the input and the output 100 times using the timeit module, and took the average of all those runs
import timeit

num_trials = 100

trials = timeit.repeat(
    repeat=num_trials,
    number=1,
    stmt="compare_images(Image.open('input.jpg'), Image.open('input.jpg'))",
    setup="from __main__ import compare_images; from PIL import Image"
)

avg = sum(trials) / num_trials

print("Average time taken per comparison was:", avg, "seconds")
# Average time taken per comparison was 1.3337286046380177 seconds
Note that this was done on an image that was only 600 by 600 pixels. If you did this with a "massive" set of possible duplicate images, where I will take "massive" to mean at least 1M images of similar dimensions, this could possibly take ~15 days (1,000,000 * 1.33 s / 60 seconds / 60 minutes / 24 hours) to go through and compare each output image to the input, which is not ideal.
Also keep in mind that these metrics will vary based on the machine and operating system you are using. The numbers I provided are more for illustrative purposes.
Alternative Implementation
While I haven't fully explored this implementation myself, one method you could try would be to precompute a hash value of the pixel data of each of the images in your collection using a hash function. If you stored these in a database, with each hash containing a link to the original image or image name, then all you would have to do is calculate the hash of the input image using the same hashing function and compare the hashes instead. This would save lots of computation time, and would make for a much more efficient algorithm.
This blog post describes one implementation for doing this.
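A minimal sketch of that idea using hashlib from the standard library (the helper name and the plain dict standing in for a database are my own illustrative choices):

import hashlib
from PIL import Image

def image_hash(path):
    # Hash the raw pixel data rather than the file bytes, so two files
    # with identical pixels but different metadata still match
    with Image.open(path) as img:
        return hashlib.sha256(img.tobytes()).hexdigest()

# Precompute once for the whole collection
hashes = {image_hash(path): path for path in possible_duplicates}

# Finding a duplicate is then a single dictionary lookup
duplicate = hashes.get(image_hash("input.jpg"))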
Update - 2018-08-06
As per the request of the OP, if you were given the directory of the possible duplicate images and not the explicit image paths themselves, then you could use the os and ntpath modules like so
import ntpath
import os

def get_all_images(directory):
    image_paths = []
    for filename in os.listdir(directory):
        # to be as careful as possible, you might check to make sure that
        # the file is in fact an image, for instance using
        # filename.endswith(".jpg") to check for .jpg files
        image_paths.append("{}/{}".format(directory, filename))
    return image_paths

def get_filename(path):
    return ntpath.basename(path)
Using these functions, the updated program might look like
possible_duplicates = get_all_images("/path/to/images")
duplicate_path = find_duplicate_image("/path/to/input.jpg", possible_duplicates)

if duplicate_path:
    print(get_filename(duplicate_path))
The above will only print the name of the duplicate image if there was a duplicate, otherwise, it will print nothing.
I need to work with some greyscale tif files and I have been using PIL to import them as images and convert them into numpy arrays:
np.array(Image.open(src))
I want a transparent understanding of exactly what the values of these arrays correspond to, and in particular it was not clear what value was appropriate as a white point or black point for my images. For instance, suppose I wanted to convert this array into an array of floats with pixel values of 1 for white and 0 for black, with other values scaled linearly in between.
I have tried some naive methods including scaling by the maximum value in the array but opening the resulting files, there is always some amount of shift in the color levels.
Is there any documentation for the proper way to understand the values stored in these tif arrays?
A TIFF is basically a computer file format for storing raster graphics images. It has a lot of specs, and a quick search on the web will get you the resources you need.
The thing is, you are using PIL as your input library. The array you have likely has a uint8 data type, which means your data can be anywhere within 0 to 255. To obtain the 0 to 1 color range, do the following:
im = np.array(Image.open(src)).astype('float32')/255
Notice your array will likely have 4 layers along the third dimension, i.e. im[:, :, k] where im.shape == (i, j, k). So each trace im[i, j, :] (which represents a pixel) is going to be a quadruplet of RGBA values.
The R stands for Red (or the quantity of Red), G for Green, and B for Blue. A is the alpha channel, and it is what enables transparency (lower values mean less opacity and more transparency).
The array can also have three layers for RGB only, or a single layer if the image is intended to be plotted in grayscale.
In the case you have RGB (or RGBA, but not considering alpha) and need a single value, you should understand that there are quite a few different ways of doing this. In this post @denis recommends the use of the following formulation:
Y = .2126 * R^gamma + .7152 * G^gamma + .0722 * B^gamma

where gamma is 2.2 for many PCs. The usual R G B are sometimes written as R' G' B' (R' = Rlin^(1/gamma)) (purists tongue-click) but here I'll drop the '.

And finally, L* = 116 * Y^(1/3) - 16 to obtain the luminance.
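As a rough numpy sketch of that formula (my own illustration, not from the post; the file name is a placeholder, an RGB or RGBA array is assumed, and values are scaled to 0..1 as above):

import numpy as np
from PIL import Image

im = np.array(Image.open("image.tif")).astype('float32') / 255
gamma = 2.2

# Linearize the stored R'G'B' values, then take the weighted sum
lin = im[:, :, :3] ** gamma
Y = 0.2126 * lin[:, :, 0] + 0.7152 * lin[:, :, 1] + 0.0722 * lin[:, :, 2]

# Luminance L* as quoted above
L_star = 116 * Y ** (1 / 3) - 16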
I recommend reading his post. Also consider looking into the following concepts:
RGB Colors model
Gamma correction
Tagged Image File Format
Pillow documentation of TIFF
Working with TIFFs (import, export) in Python using numpy