Here's the scenario: I want to create a set of random, small JPEGs, anywhere between 50 bytes and 8 KB in size. The actual visual content of the JPEG is irrelevant as long as the files are valid. I need to generate a thousand or so, and they all have to be unique, even if they differ by only a single pixel. Can I just write a JPEG header/footer with some random bytes in between? I'm not able to use existing photos or sets of photos from the web.
The second issue is that the set of images has to be different for each run of the program.
I'd prefer to do this in python, as the wrapping scripts are in Python.
I've looked for python code to generate jpg's from scratch, and didn't find anything, so pointers to libraries are just as good.
If the images can be pure random noise, you could generate an array using numpy.random and save it using PIL's Image.save().
This example might be expanded, including ways to avoid a (very unlikely) repetition of patterns:
import numpy
from PIL import Image
for n in range(10):
    # 30x30 pixels, 3 channels, random floats scaled to the 0-255 range
    a = numpy.random.rand(30,30,3) * 255
    im_out = Image.fromarray(a.astype('uint8')).convert('RGB')
    im_out.save('out%000d.jpg' % n)
These conditions must be met in order to get jpeg images:
The array needs to be shaped (m, n, 3) - three colors, R G and B;
Each element (each color of each pixel) has to be a byte integer (uint, or unsigned integer with 8 bits), ranging from 0 to 255.
Additionally, some method other than pure randomness might be used to generate the images in case you don't want pure noise.
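One way the example above might be expanded to rule out repeats is to hash each encoded file and retry on a collision. This is a minimal sketch, not part of the original answer: the function name unique_random_jpegs, the 30x30 default size, and the choice of SHA-256 are assumptions made for illustration. Leaving seed as None lets NumPy draw fresh entropy, so each run should produce a different set.

```python
import hashlib
import io

import numpy as np
from PIL import Image


def unique_random_jpegs(count, size=(30, 30), seed=None):
    """Return `count` distinct JPEG byte strings of random noise.

    `size` is (width, height). Hypothetical helper for illustration.
    """
    rng = np.random.default_rng(seed)  # seed=None: fresh entropy on every run
    seen = set()
    jpegs = []
    while len(jpegs) < count:
        # NumPy image arrays are (height, width, channels)
        pixels = rng.integers(0, 256, size=(size[1], size[0], 3), dtype=np.uint8)
        buf = io.BytesIO()
        Image.fromarray(pixels).save(buf, format="JPEG")
        data = buf.getvalue()
        digest = hashlib.sha256(data).digest()
        if digest not in seen:  # skip the (very unlikely) repeated file
            seen.add(digest)
            jpegs.append(data)
    return jpegs
```

Each returned byte string can be written to disk with open(path, "wb"). The resulting file sizes depend on the image dimensions and JPEG quality, so the 50 byte to 8 KB target would need to be checked separately.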
If you do not care about the content of a file, you can create a valid JPEG using Pillow this way:
from PIL import Image
width = height = 128
valid_solid_color_jpeg = Image.new('RGB', size=(width, height), color='red')
valid_solid_color_jpeg.save('red_image.jpg')
// EDIT: I thought OP wants to generate valid images and does not care about their content (that's why I suggested solid-color images). Here's a function that generates valid images with random pixels and as a bonus writes random string to the generated image. The only dependency is Pillow, everything else is pure Python.
import random
import uuid
from PIL import Image, ImageDraw
def generate_random_image(width=128, height=128):
    # One random byte per channel (R, G, B) per pixel
    rand_pixels = [random.randint(0, 255) for _ in range(width * height * 3)]
    rand_pixels_as_bytes = bytes(rand_pixels)
    # A UUID doubles as the overlay text and the file name
    text_and_filename = str(uuid.uuid4())
    random_image = Image.frombytes('RGB', (width, height), rand_pixels_as_bytes)
    draw_image = ImageDraw.Draw(random_image)
    draw_image.text(xy=(0, 0), text=text_and_filename, fill=(255, 255, 255))
    random_image.save("{file_name}.jpg".format(file_name=text_and_filename))

# Generate 42 random images:
for _ in range(42):
    generate_random_image()
If you are looking for a way to do this without numpy this worked for me
(python 3.6 for bytes, you still need Pillow)
import random as r
from PIL import Image
dat = bytes([r.randint(1,3) for x in range(4500000)])
i = Image.frombytes('1', (200,200), dat)
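A variation on the same idea, sketched here under the assumption that RGB noise is wanted rather than a 1-bit image: os.urandom returns fresh random bytes on every call, which also covers the requirement that each run produce a different set. The helper name random_rgb_image is hypothetical.

```python
import os

from PIL import Image


def random_rgb_image(width=64, height=64):
    """Build an RGB noise image from operating-system randomness."""
    # Mode 'RGB' needs exactly 3 bytes per pixel
    data = os.urandom(width * height * 3)
    return Image.frombytes("RGB", (width, height), data)
```

Calling random_rgb_image(32, 32).save("noise.jpg") would then write a valid JPEG.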
Hi, I am using this Python code to generate a shuffled-pixel image. Is there any way to reverse the process? For example, I would give the output photo of this code to the program and it would reproduce the original photo.
I am trying to generate a static-style image and reverse it back into the original image, and I am open to any other ideas for replacing this code.
from PIL import Image
import numpy as np
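orig = Image.open('lena.jpg')
orig_px = orig.getdata()
# flatten to one row per pixel, shuffle the rows, then restore the image shape
orig_px = np.reshape(orig_px, (orig.height * orig.width, 3))
np.random.shuffle(orig_px)
orig_px = np.reshape(orig_px, (orig.height, orig.width, 3))
res = Image.fromarray(orig_px.astype('uint8'))
res.save('out.jpg')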
Firstly, bear in mind that JPEG is lossy - so you will never get back what you write with JPEG - it changes your data! So, use PNG if you want to read back losslessly exactly what you started with.
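The difference is easy to check in memory. The following is a small sketch (the roundtrip helper and the 16x16 random test array are assumptions for illustration) that encodes the same pixels as PNG and as JPEG and compares what comes back:

```python
import io

import numpy as np
from PIL import Image


def roundtrip(pixels, fmt):
    """Encode an (h, w, 3) uint8 array in the given format and decode it again."""
    buf = io.BytesIO()
    Image.fromarray(pixels).save(buf, format=fmt)
    buf.seek(0)
    return np.array(Image.open(buf).convert("RGB"))


rng = np.random.default_rng(0)
pixels = rng.integers(0, 256, size=(16, 16, 3), dtype=np.uint8)

png_exact = np.array_equal(roundtrip(pixels, "PNG"), pixels)    # lossless
jpeg_exact = np.array_equal(roundtrip(pixels, "JPEG"), pixels)  # lossy
print("PNG exact:", png_exact, "JPEG exact:", jpeg_exact)
```

Any pixel that JPEG alters ends up in the wrong place or with the wrong value after unshuffling, which is why the shuffled intermediate should be stored losslessly.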
You can do what you ask like this:
#!/usr/bin/env python3
import numpy as np
from PIL import Image
def shuffleImage(im, seed=42):
    # Get pixels and put in Numpy array for easy shuffling
    pix = np.array(im.getdata())
    # Generate an array of shuffled indices
    # Seed random number generation to ensure same result
    np.random.seed(seed)
    indices = np.random.permutation(len(pix))
    # Shuffle the pixels and recreate image
    shuffled = pix[indices].astype(np.uint8)
    # Numpy image arrays are (rows, columns, channels), i.e. (height, width, 3)
    return Image.fromarray(shuffled.reshape(im.height, im.width, 3))

def unshuffleImage(im, seed=42):
    # Get shuffled pixels in Numpy array
    shuffled = np.array(im.getdata())
    nPix = len(shuffled)
    # Generate unshuffler: same seed gives the same permutation, which is then inverted
    np.random.seed(seed)
    indices = np.random.permutation(nPix)
    unshuffler = np.zeros(nPix, np.uint32)
    unshuffler[indices] = np.arange(nPix)
    unshuffledPix = shuffled[unshuffler].astype(np.uint8)
    return Image.fromarray(unshuffledPix.reshape(im.height, im.width, 3))

# Load image and ensure RGB, i.e. not palette image
orig = Image.open('lena.png').convert('RGB')

result = shuffleImage(orig)
result.save('shuffled.png')

unshuffled = unshuffleImage(result)
unshuffled.save('unshuffled.png')
Which turns Lena into this:
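The same round trip can be sketched on a bare NumPy array, which makes it easy to verify without any image files. This is an alternative formulation rather than the answer's code: it assumes NumPy's default_rng generator and uses np.argsort to invert the permutation, and the function names are hypothetical.

```python
import numpy as np


def shuffle_pixels(pixels, seed=42):
    """Permute the pixels of an (h, w, c) array with a seeded generator."""
    flat = pixels.reshape(-1, pixels.shape[-1])
    perm = np.random.default_rng(seed).permutation(len(flat))
    return flat[perm].reshape(pixels.shape)


def unshuffle_pixels(shuffled, seed=42):
    """Undo shuffle_pixels by rebuilding the permutation and inverting it."""
    flat = shuffled.reshape(-1, shuffled.shape[-1])
    perm = np.random.default_rng(seed).permutation(len(flat))
    inverse = np.argsort(perm)  # inverse[perm[i]] == i
    return flat[inverse].reshape(shuffled.shape)


pixels = np.arange(4 * 5 * 3, dtype=np.uint8).reshape(4, 5, 3)
restored = unshuffle_pixels(shuffle_pixels(pixels))
print(np.array_equal(restored, pixels))
```

The seed acts as the key: without it (or with a different one) the inverse permutation cannot be rebuilt.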
It's impossible to do that reliably as far as I know. Theoretically you could brute force it by shuffling the pixels over and over and feeding the result into Amazon Rekognition, but you would end up with a huge AWS bill and probably only something that is approximately the original picture.
EDIT: Code is working now, thanks to Mark and zephyr. zephyr also has two alternate working solutions below.
I want to divide blend two images with PIL. I found ImageChops.multiply(image1, image2) but I couldn't find a similar divide(image, image2) function.
Divide Blend Mode Explained (I used the first two images here as my test sources.)
Is there a built-in divide blend function that I missed (PIL or otherwise)?
My test code below runs and is getting close to what I'm looking for. The resulting image output is similar to the divide blend example image here: Divide Blend Mode Explained.
Is there a more efficient way to do this divide blend operation (less steps and faster)? At first, I tried using lambda functions in Image.eval and ImageMath.eval to check for black pixels and flip them to white during the division process, but I couldn't get either to produce the correct result.
EDIT: Fixed code and shortened thanks to Mark and zephyr. The resulting image output matches the output from zephyr's numpy and scipy solutions below.
# PIL Divide Blend test
from PIL import Image, ImageMath
import os
imgA = Image.open('01background.jpg')
imgB = Image.open('02testgray.jpg')
# split RGB images into 3 channels
rA, gA, bA = imgA.split()
rB, gB, bB = imgB.split()
# divide each channel (image1/image2)
rTmp = ImageMath.eval("int(a/((float(b)+1)/256))", a=rA, b=rB).convert('L')
gTmp = ImageMath.eval("int(a/((float(b)+1)/256))", a=gA, b=gB).convert('L')
bTmp = ImageMath.eval("int(a/((float(b)+1)/256))", a=bA, b=bB).convert('L')
# merge channels into RGB image
imgOut = Image.merge("RGB", (rTmp, gTmp, bTmp))
imgOut.save('PILdiv0.png', 'PNG')
os.system('start PILdiv0.png')
You are asking:
Is there a more efficient way to do this divide blend operation (less steps and faster)?
You could also use the python package blend modes. It is written with vectorized Numpy math and generally fast. Install it via pip install blend_modes. I have written the commands in a more verbose way to improve readability, it would be shorter to chain them. Use blend_modes like this to divide your images:
from PIL import Image
import numpy
import os
from blend_modes import blend_modes
# Load images
imgA = Image.open('01background.jpg')
imgA = numpy.array(imgA)
# append alpha channel
imgA = numpy.dstack((imgA, numpy.ones((imgA.shape[0], imgA.shape[1], 1))*255))
imgA = imgA.astype(float)
imgB = Image.open('02testgray.jpg')
imgB = numpy.array(imgB)
# append alpha channel
imgB = numpy.dstack((imgB, numpy.ones((imgB.shape[0], imgB.shape[1], 1))*255))
imgB = imgB.astype(float)
# Divide images
imgOut = blend_modes.divide(imgA, imgB, 1.0)
# Save images
imgOut = numpy.uint8(imgOut)
imgOut = Image.fromarray(imgOut)
imgOut.save('PILdiv0.png', 'PNG')
os.system('start PILdiv0.png')
Be aware that for this to work, both images need to have the same dimensions, e.g. imgA.shape == (240,320,3) and imgB.shape == (240,320,3).
There is a mathematical definition for the divide function here:
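Here's an implementation with scipy/matplotlib (note that scipy.misc.imread and scipy.misc.imsave were removed in SciPy 1.2, so this needs an older SciPy):
import numpy as np
import scipy.misc as mpl
a = mpl.imread('01background.jpg')
b = mpl.imread('02testgray.jpg')
# +1 avoids division by zero where b is black
c = a/((b.astype('float')+1)/256)
# clamp to 255; >= so that a quotient of exactly 255 is not zeroed out
d = c*(c < 255)+255*np.ones(np.shape(c))*(c >= 255)
e = d.astype('uint8')
mpl.imsave('output.png', e)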
If you don't want to use matplotlib, you can do it like this (I assume you have numpy):
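from PIL import Image
from numpy import asarray, ones, shape
imgA = Image.open('01background.jpg')
imgB = Image.open('02testgray.jpg')
a = asarray(imgA)
b = asarray(imgB)
c = a/((b.astype('float')+1)/256)
# clamp to 255; >= so that a quotient of exactly 255 is not zeroed out
d = c*(c < 255)+255*ones(shape(c))*(c >= 255)
e = d.astype('uint8')
imgOut = Image.fromarray(e)
imgOut.save('PILdiv0.png', 'PNG')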
The problem you're having is when you have a zero in image B - it causes a divide by zero. If you convert all of those values to one instead I think you'll get the desired result. That will eliminate the need to check for zeros and fix them in the result.
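The zero handling and the clamp can be folded into one small function. This is a sketch using the same a / ((b + 1) / 256) formula as the code in this thread, with np.clip doing the saturation; the name divide_blend is hypothetical.

```python
import numpy as np


def divide_blend(a, b):
    """Divide-blend two uint8 arrays of equal shape, saturating at 255."""
    a = a.astype(np.float64)
    b = b.astype(np.float64)
    # b + 1 keeps the divisor non-zero where b is black
    out = a / ((b + 1.0) / 256.0)
    return np.clip(out, 0, 255).astype(np.uint8)


a = np.array([[0, 128, 255, 10]], dtype=np.uint8)
b = np.array([[0, 255, 127, 0]], dtype=np.uint8)
print(divide_blend(a, b))
```

With arrays from numpy.asarray(Image.open(...)), the result can be handed straight to Image.fromarray.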
I want to know how to find an image in a massive data set (a folder containing a lot of images). Given an input image from another folder (not the data folder), I want to compare it with every image in the data folder and, if an exactly identical image is found, show that image's file name as output (the name of the matching image in the folder, not the input's name), for example: dafs.jpg
using Python.
I am thinking about comparing the exact RGB pixel values by subtracting the pixels of the input image from each of the images in the folder,
but I don't know how to do that in Python.
Comparing RGB Pixel Values
You could use the pillow module to get access to the pixel data of a particular image. Keep in mind that pillow supports these image formats.
If we make a few assumptions about what it means for 2 images to be identical, based on your description, both images must:
Have the same dimensions (height and width)
Have the same RGB pixel values (the RGB values of pixel [x, y] in the input image must be the same as the RGB values of pixel [x, y] in the output image)
Be of the same orientation (related to the previous assumption, an image is considered to be not identical compared to the same image rotated by 90 degrees)
then if we have 2 images using the pillow module
from PIL import Image
original = Image.open("input.jpg")
possible_duplicate = Image.open("output.jpg")
the following code would be able to compare the 2 images to see if they were identical
def compare_images(input_image, output_image):
    # compare image dimensions (assumption 1)
    if input_image.size != output_image.size:
        return False
    # Image.size is (width, height) and getpixel takes (x, y)
    width, height = input_image.size
    # compare image pixels (assumption 2 and 3)
    for x in range(width):
        for y in range(height):
            input_pixel = input_image.getpixel((x, y))
            output_pixel = output_image.getpixel((x, y))
            if input_pixel != output_pixel:
                return False
    return True
by calling
compare_images(original, possible_duplicate)
Using this function, we could go through a set of images
from PIL import Image
def find_duplicate_image(input_image, output_images):
    # only open the input image once
    input_image = Image.open(input_image)
    for image in output_images:
        if compare_images(input_image, Image.open(image)):
            return image
Putting it all together, we could simply call
original = "input.jpg"
possible_duplicates = ["output.jpg", "output2.jpg", ...]
duplicate = find_duplicate_image(original, possible_duplicates)
Note that the above implementation will only find the first duplicate, and return that. If no duplicate is found, None will be returned.
One thing to keep in mind is that performing a comparison on every pixel like this can be costly. I used this image and ran compare_images using this as the input and the output 100 times using the timeit module, and took the average of all those runs
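import timeit
num_trials = 100
trials = timeit.repeat(
    # 'input.jpg' is a placeholder name standing in for the 600 x 600 test image
    stmt="compare_images(Image.open('input.jpg'), Image.open('input.jpg'))",
    setup="from __main__ import compare_images; from PIL import Image",
    repeat=num_trials,
    number=1
)
avg = sum(trials) / num_trials
print("Average time taken per comparison was:", avg, "seconds")
# Average time taken per comparison was 1.3337286046380177 seconds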
Note that this was done on an image that was only 600 by 600 pixels. If you did this with a "massive" set of possible duplicate images, where I will take "massive" to mean at least 1M images of similar dimensions, this could possibly take ~15 days (1,000,000 * 1.33 s / 60 seconds / 60 minutes / 24 hours) to go through and compare each output image to the input, which is not ideal.
Also keep in mind that these metrics will vary based on the machine and operating system you are using. The numbers I provided are more for illustrative purposes.
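Much of that cost comes from calling getpixel in a Python loop. As a hedged alternative (not what was timed above), Pillow's ImageChops.difference can compute the per-pixel difference in one call, and getbbox() returns None when that difference is entirely zero. The function name images_identical is an assumption for this sketch.

```python
from PIL import Image, ImageChops


def images_identical(a, b):
    """Return True if two images have the same size and the same RGB pixels."""
    if a.size != b.size:
        return False
    # Compare in a common mode so palette or grayscale files are handled too
    diff = ImageChops.difference(a.convert("RGB"), b.convert("RGB"))
    # getbbox() is None when every pixel of the difference image is zero
    return diff.getbbox() is None
```

It takes the same arguments as compare_images, so it could be swapped in without changing the surrounding loop.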
Alternative Implementation
While I haven't fully explored this implementation myself, one method you could try would be to precompute a hash value of the pixel data of each image in your collection using a hash function. If you stored these in a database, with each hash linked to the original image or image name, then all you would have to do is calculate the hash of the input image using the same hash function and compare the hashes instead. This would save lots of computation time and would make for a much more efficient algorithm.
This blog post describes one implementation for doing this.
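A minimal sketch of the hashing idea, assuming an in-memory dict in place of a database and SHA-256 over the decoded RGB pixels (both are illustrative choices, and the helper names are hypothetical). Because this is an exact hash, it only matches images whose pixels are identical, which is what the question asks for.

```python
import hashlib

from PIL import Image


def pixel_hash(image):
    """Hash the decoded pixel data, so file metadata does not affect the result."""
    rgb = image.convert("RGB")
    h = hashlib.sha256()
    h.update(repr(rgb.size).encode())  # include dimensions: same bytes, different shape
    h.update(rgb.tobytes())
    return h.hexdigest()


def build_index(named_images):
    """Map pixel hash -> name for an iterable of (name, image) pairs."""
    index = {}
    for name, image in named_images:
        index.setdefault(pixel_hash(image), name)  # keep the first name seen
    return index


def find_duplicate(index, image):
    """Return the stored name of an identical image, or None."""
    return index.get(pixel_hash(image))
```

The index is built once over the collection; each later lookup then costs one hash of the input image plus a dictionary access.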
Update - 2018-08-06
As per the request of the OP, if you were given the directory of the possible duplicate images and not the explicit image paths themselves, then you could use the os and ntpath modules like so
import ntpath
import os
def get_all_images(directory):
    image_paths = []
    for filename in os.listdir(directory):
        # to be as careful as possible, you might check to make sure that
        # the file is in fact an image, for instance using
        # filename.endswith(".jpg") to check for .jpg files for instance
        image_paths.append(os.path.join(directory, filename))
    return image_paths

def get_filename(path):
    return ntpath.basename(path)
Using these functions, the updated program might look like
possible_duplicates = get_all_images("/path/to/images")
duplicate_path = find_duplicate_image("/path/to/input.jpg", possible_duplicates)
if duplicate_path:
    print(get_filename(duplicate_path))
The above will only print the name of the duplicate image if there was a duplicate, otherwise, it will print nothing.
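The extension check mentioned in the comments could look like the following sketch, which uses pathlib instead of os and ntpath. The set of suffixes and the name list_image_paths are assumptions; adjust them to the formats actually present in the folder.

```python
from pathlib import Path

# Hypothetical list of extensions to accept; extend as needed
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def list_image_paths(directory):
    """Return the image files directly inside `directory`, sorted by path."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
```

Each returned Path has a .name attribute, which gives the bare file name without a separate helper.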
I have this in python:
from PIL import Image
import numpy as np
import random

img = Image.open('img.jpg')

#turn img to list of rgb tuples and scramble
pixels = list(img.getdata())
random.shuffle(pixels)

#make new image using scrambled pixels
img2 = Image.new(img.mode, img.size)
img2.putdata(pixels)
I figured I should be working in c++ to keep stuff I learned last semester fresh in my head and to prepare for the class I have next semester which also revolves around c++. So, I found CImg and got a bit overwhelmed by the documentation. So, what would be CImg's equivalent of line 8?
My end goal is to be able to scramble an image using a known pattern, then use that pattern to unscramble it later. I don't know if this is possible, though. To me it's a bit like asking the following:
int rand_num = rand() % 10;
rand_num = 7
find x.
As far as I know, CImg provides iterators to loop through every pixel. As such, and provided that your compiler supports C++11, you could use std::shuffle to shuffle the pixels of your image (see example below).
CImg<float> img("lena.jpg"); // Load image from file.
unsigned seed = std::chrono::system_clock::now().time_since_epoch().count();
std::shuffle(img.begin(), img.end(), std::default_random_engine(seed));
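On the end goal in the question: a seeded generator does make the scramble reversible, because re-seeding reproduces the same order. The following is a minimal sketch in Python (the language of the question's snippet) rather than C++; scramble and unscramble are hypothetical helper names, and the list of letters stands in for a list of pixels. The same index bookkeeping should carry over to std::shuffle, provided the seed is stored instead of taken from the clock.

```python
import random


def scramble(items, seed):
    """Return a scrambled copy of `items`; the seed is the 'known pattern'."""
    order = list(range(len(items)))
    random.Random(seed).shuffle(order)
    return [items[i] for i in order]


def unscramble(scrambled, seed):
    """Rebuild the same order from the seed and put every item back."""
    order = list(range(len(scrambled)))
    random.Random(seed).shuffle(order)
    original = [None] * len(scrambled)
    for position, source in enumerate(order):
        # scrambled[position] came from items[source]
        original[source] = scrambled[position]
    return original


items = list("abcdefgh")
print(unscramble(scramble(items, seed=7), seed=7) == items)
```

Unlike the rand() % 10 analogy, nothing has to be guessed: the seed determines the whole permutation, so it is the only thing that needs to be remembered.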
I wish to draw an image based on computed pixel values, as a means to visualize some data. Essentially, I wish to take a 2-dimensional matrix of color triplets and render it.
Do note that this is not image processing, since I'm not transforming an existing image nor doing any sort of whole-image transformations, and it's also not vector graphics as there is no pre-determined structure to the image I'm rendering- I'm probably going to be producing amorphous blobs of color one pixel at a time.
I need to render images about 1kx1k pixels for now, but something scalable would be useful. Final target format is PNG or any other lossless format.
I've been using PIL at the moment via ImageDraw's draw.point , and I was wondering, given the very specific and relatively basic features I require, is there any faster library available?
If you have numpy and scipy available (and if you are manipulating large arrays in Python, I would recommend them), then the scipy.misc.pilutil.toimage function is very handy.
A simple example:
import numpy as np
import scipy.misc as smp
# Create a 1024x1024x3 array of 8 bit unsigned integers
data = np.zeros( (1024,1024,3), dtype=np.uint8 )
data[512,512] = [254,0,0] # Makes the middle pixel red
data[512,513] = [0,0,255] # Makes the next pixel blue
img = smp.toimage( data )       # Create a PIL image
img.show()                      # View in default viewer
The nice thing is toimage copes with different data types very well, so a 2D array of floating-point numbers gets sensibly converted to grayscale etc.
You can download numpy and scipy from here. Or using pip:
pip install numpy scipy
from PIL import Image
im = Image.new('RGB', (1024, 1024))
im.putdata([(255,0,0), (0,255,0), (0,0,255)])
im.save('test.png')
Puts a red, green and blue pixel in the top-left of the image.
im.fromstring() is faster still if you prefer to deal with byte values (in current Pillow releases the method is named frombytes()).
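A sketch of the byte-oriented route, assuming Pillow's Image.frombytes and a tiny 4x2 image so the offsets are easy to follow. In a raw RGB buffer the pixel at (x, y) starts at byte (y * width + x) * 3.

```python
from PIL import Image

width, height = 4, 2
raw = bytearray(width * height * 3)  # all zeros: a black image

# Top-left pixel (x=0, y=0) set to red
raw[0:3] = bytes((255, 0, 0))

# Pixel at x=2, y=1 set to blue
offset = (1 * width + 2) * 3
raw[offset:offset + 3] = bytes((0, 0, 255))

im = Image.frombytes("RGB", (width, height), bytes(raw))
print(im.getpixel((0, 0)), im.getpixel((2, 1)))
```

Filling one bytearray and converting once avoids building a Python tuple per pixel, which is where putdata spends much of its time on large images.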
For this example, install Numpy and Pillow.
The goal is to first represent the image you want to create as an array of arrays of 3 (RGB) numbers. Use a Numpy array for performance and simplicity:
import numpy
data = numpy.zeros((1024, 1024, 3), dtype=numpy.uint8)
Now, set the middle 3 pixels' RGB values to red, green, and blue:
data[512, 511] = [255, 0, 0]
data[512, 512] = [0, 255, 0]
data[512, 513] = [0, 0, 255]
Then, use Pillow's Image.fromarray() to generate an Image from the array:
from PIL import Image
image = Image.fromarray(data)
Now, "show" the image (on OS X, this will open it as a temp-file in Preview):
image.show()
This answer was inspired by BADCODE's answer, which was too out of date to use and too different to simply update without completely rewriting.
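For data that is computed rather than set pixel by pixel, the array can be filled with vectorized operations before the single Image.fromarray call. A small sketch under assumed details: the function name render_gradient and the gradient itself are placeholders for whatever the real per-pixel computation is.

```python
import numpy as np
from PIL import Image


def render_gradient(width=1024, height=1024):
    """Render a computed (height, width, 3) color matrix as an RGB image."""
    data = np.zeros((height, width, 3), dtype=np.uint8)
    x = np.linspace(0, 255, width).astype(np.uint8)
    y = np.linspace(0, 255, height).astype(np.uint8)
    data[..., 0] = x[np.newaxis, :]  # red grows from left to right
    data[..., 2] = y[:, np.newaxis]  # blue grows from top to bottom
    return Image.fromarray(data)


image = render_gradient(256, 128)
print(image.size, image.getpixel((255, 127)))
```

Calling image.save("gradient.png") then writes the result losslessly, as the question requires.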
A different approach is to use Pyxel, an open source implementation of the TIC-80 API in Python3 (TIC-80 is the open source PICO-8).
Here's a complete app that just draws one yellow pixel on a black background:
import pyxel
def update():
    """This function just maps the Q key to `pyxel.quit`,
    which works just like `sys.exit`."""
    if pyxel.btnp(pyxel.KEY_Q): pyxel.quit()

def draw():
    """This function clears the screen and draws a single
    pixel, whenever the buffer needs updating. Note that
    colors are specified as palette indexes (0-15)."""
    pyxel.cls(0)            # clear screen (color)
    pyxel.pix(10, 10, 10)   # blit a pixel (x, y, color)

pyxel.init(160, 120)        # initialize gui (width, height)
pyxel.run(update, draw)     # run the game (*callbacks)
Note: The library only allows for up to sixteen colors, but you can change which colors, and you could probably get it to support more without too much work.
I think you use PIL to generate an image file on the disk, and you later load it with an image reader software.
You should get a small speed improvement by rendering directly the picture in memory (you will save the cost of writing the image on the disk and then re-loading it). Have a look at this thread for how to render that image with various python modules.
I would personally try wxpython and the dc.DrawBitmap function. If you use such a module rather than an external image reader you will have many benefits:
you will be able to create an interactive user interface with buttons for parameters.
you will be able to easily program a Zoomin and Zoomout function
you will be able to plot the image as you compute it, which can be quite useful if the computation takes a lot of time
You can use the turtle module if you don't want to install external modules. I created some useful functions:
setwindowsize( x,y ) - sets the window size to x*y
drawpixel( x, y, (r,g,b), pixelsize) - draws a pixel to x:y coordinates with an RGB color (tuple), with pixelsize thickness
showimage() - displays image
import turtle
def setwindowsize(x=640, y=640):
turtle.setup(x, y)
def drawpixel(x, y, color, pixelsize = 1 ):
turtle.tracer(0, 0)
for i in range(4):
def showimage():
200x200 window, 1 red pixel in the center
setwindowsize(200, 200)
drawpixel(100, 100, (255,0,0) )
30x30 random colors. Pixel size: 10
from random import *
for x in range(30):
for y in range(30):
color = (randint(0,255),randint(0,255),randint(0,255))