I can't seem to understand what Image point does. I want to do some pixel edit which might include checking which color value(r, g or b) is max in every pixel and act accordingly. Lets say that I can't use numpy. I managed to use Image point to add the same value to every pixel in an image.
point code
import Image, math
def brightness(i, value):
value = math.floor(255*(float(value)/100))
return i+value
if __name__ == '__main__':
image = '/home/avlahop/verybig.jpg'
print image
img = Image.open(image)
print img
out = img.point(lambda i: brightness(i, 50))
out.show()
numpy code
def brightness(arr, adjust):
import math
adjust = math.floor(255*(float(adjust)/100))
arr[...,0] += adjust
arr[...,1] += adjust
arr[...,2] += adjust
return arr
if __name__ == '__main__':
image = '/home/avlahop/verybig.jpg'
img = Image.open(image).convert('RGBA')
arr = np.array(np.asarray(img).astype('float'))
new_image = Image.fromarray(brightness(arr, adjust).clip(0,255).astype('uint8'), 'RGBA').show()
I have to say that point code is faster than numpy's. But what if i want to do a more complex operation with point. for example for every pixel check the max(r,g,b) and do something depending on if r=max or g=max or b=max. As you saw i used the point with function as argument. It takes one argument i. what is this i? is it the pixel?(i.e i=(r,g,b)?).I can't seem to understand from the pil documentation
The docs may not have been clear in earlier versions of PIL, but in Pillow it's spelled out pretty well. From Image.point:
lut – A lookup table, containing 256 values per band in the image. A function can be used instead, it should take a single argument. The function is called once for each possible pixel value, and the resulting table is applied to all bands of the image.
In other words, it's not a general-purpose way to map each pixel through a function, it's just a way to dynamically built the lookup table, instead of passing in a pre-built one.
In other words, it's called with the numbers from 0 through 255. (Which you can find out for yourself pretty easily by just writing a function that appends its argument to a global list and then dump out the list at the end…)
If you split your image into separate bands or planes, point each one of them with a different function, and then recombine them, that might be able to accomplish what you're trying to do. But even then, I think eval is what you wanted, not point.
But I think what you really want, which is a pixel-by-pixel all-bands-at-once iterator. And you don't need anything special for that. Just use map or a comprehension over getdata. Isn't that slow? Of course it's slow, because it's calling your function X*Y times; the cost of building the getdata sequence and iterating over it is tiny compared to that cost, so looking for a way for PIL to optimize the already-fast-enough part won't get you very far.
Related
I am playing around with images and the random number generator in Python, and I would like to understand the strange output picture my script is generating.
The larger script iterates through each pixel (left to right, then next row down) and changes the color. I used the following function to offset the given input red, green, and blue values by a randomly determined integer between 0 and 150 (so this formula is invoked 3 times for R, G, and B in each iteration):
def colCh(cVal):
random.seed()
rnd = random.randint(0,150)
newVal = max(min(cVal - 75 + rnd,255),0)
return newVal
My understanding is that random.seed() without arguments uses the system clock as the seed value. Given that it is invoked prior to the calculation of each offset value, I would have expected a fairly random output.
When reviewing the numerical output, it does appear to be quite random:
Scatter plot of every 100th R value as x and R' as y:
However, the picture this script generates has a very peculiar grid effect:
Output picture with grid effect hopefully noticeable:
Furthermore, fwiw, this grid effect seems to appear or disappear at different zoom levels.
I will be experimenting with new methods of creating seed values, but I can't just let this go without trying to get an answer.
Can anybody explain why this is happening? THANKS!!!
Update: Per Dan's comment about possible issues from JPEG compression, the input file format is .jpg and the output file format is .png. I would assume only the output file format would potentially create the issue he describes, but I admittedly do not understand how JPEG compression works at all. In order to try and isolate JPEG compression as the culprit, I changed the script so that the colCh function that creates the randomness is excluded. Instead, it merely reads the original R,G,B values and writes those exact values as the new R,G,B values. As one would expect, this outputs the exact same picture as the input picture. Even when, say, multiplying each R,G,B value by 1.2, I am not seeing the grid effect. Therefore, I am fairly confident this is being caused by the colCh function (i.e. the randomness).
Update 2: I have updated my code to incorporate Sascha's suggestions. First, I moved random.seed() outside of the function so that it is not reseeding based on the system time in every iteration. Second, while I am not quite sure I understand how there is bias in my original function, I am now sampling from a positive/negative distribution. Unfortunately, I am still seeing the grid effect. Here is my new code:
random.seed()
def colCh(cVal):
rnd = random.uniform(-75,75)
newVal = int(max(min(cVal + rnd,255),0))
return newVal
Any more ideas?
As imgur is down for me right now, some guessing:
Your usage of PRNGs is a bit scary. Don't use time-based seeds in very frequently called loops. It's very much possible, that the same seeds are generated and of course this will generate patterns. (granularity of time + number of random-bits used matter here)
So: seed your PRNG once! Don't do this every time, don't do this for every channel. Seed one global PRNG and use it for all operations.
There should be no pattern then.
(If there is: also check the effect of interpolation = image-size change)
Edit: As imgur is on now, i recognized the macro-block like patterns, like Dan mentioned in the comments. Please change your PRNG-usage first before further analysis. And maybe show more complete code.
It may be possible, that you recompressed the output and JPEG-compression emphasized the effects observed before.
Another thing is:
newVal = max(min(cVal - 75 + rnd,255),0)
There is a bit of a bias here (better approach: sample from symmetric negative/positive distribution and clip between 0,255), which can also emphasize some effect (what looked those macroblocks before?).
I would like to take a screenshot with a certain range of the screen, and then I would like to check the pixel values of certain lines (eg x_axis from 400 to 800).
I tried multiple ways like the imagegrab, gdi32.GetPixel and some more. It seems reading pixels values take a lot of time, so I even tried converting it into a list, something like this
im = ImageGrab.grab(box)
pixels = list(im .getdata())
Even this does not seem fast. Is there something I'm doing wrong?
ImageGrab returns pixels in PIL format (the Python Imaging Library: http://effbot.org/imagingbook/image.htm), and .getdata() already returns the pixels as a sequence. By wrapping it in list() again you are doing the same (expensive) operation twice. You can just do:
im = ImageGrab.grab(box)
pixels = im.getdata()
And iterate through your pixels in your favorite way.
I have a shared variable persistent_vis_chain which is being updated by a theano function where it gets its function from a theano.scan, But thats not the problem just back story.
My shared variable looks like D = [image1, ... , imageN] where each images is [x1,x2,...,x784].
What I want to do is take the average of all the images and put them into the last imageN. That is I want to sum all the values in each image except the last 1, which will result in [s1,s2,...,s784] then I want to set imageN = [s1/len(D),s2/len(D),...s784/len(D)]
So my problem is I do not know how to do this with theano.shared and may be with my understanding of theano functions and doing this computation with symbolic variables. Any help would be greatly appreciated.
If you have N images, each of shape 28x28=784 then, presumably, your shared variable has shape (N,28,28) or (N,784)? This method should work with either shape.
Given D is your shared variable containing your image data. If you want to get the average image then D.mean(keepdims=True) will give it to you symbolically.
It's unclear if you want to change the final image to equal the mean image (sounds like a strange thing to do), or if you want to add a further N+1'th image to the shared variable. For the former you could do something like this:
D = theano.shared(load_D_data())
D_update_expression = do_something_with_scan_to_get_D_update_expression(D)
updates = [(D, T.concatenate(D_update_expression[:-1],
D_update_expression.mean(keepdims=True)))]
f = theano.function(..., updates=updates)
If you want to do the latter (add an additional image), change the updates line as follows:
updates = [(D, T.concatenate(D_update_expression,
D_update_expression.mean(keepdims=True)))]
Note that this code is intended as a guide. It may not work as it stands (e.g. you may need to mess with the axis= parameter in the T.concatenate command).
The point is that you need to construct a symbolic expression explaining what the new value for D looks like. You want it to be a combination of the updates from scan plus this additional average thing. T.concatenate allows you to combine those two parts together.
Image that some new image X arrives, and I want to know if X is new or has already been encountered before. I have code, below, that shrinks the image and then converts it to a hash code. I can then see via a single hash look-up if I've already encountered an image with the same hash code, so it's very fast.
My question is, is there an efficient way for me to see if a similar image, but one with a different hash code, has already been seen? If was going to title this question something like "Data structure for determining efficiently whether a similar, non-identical item is already contained" but decided that would be an instance of the XY problem.
When I say that this new image is "similar," I'm thinking of one that's perhaps gone through lossy compression and so looks like the original to the human eye but is not identical. Normally shrinking the image eliminates the difference, but not always, and if I shrink the image too much I start getting false positives.
Here's my current code:
import PIL
seen_images = {} # This would really be a shelf or something
# From http://www.guguncube.com/1656/python-image-similarity-comparison-using-several-techniques
def image_pixel_hash_code(image):
pixels = list(image.getdata())
avg = sum(pixels) / len(pixels)
bits = "".join(map(lambda pixel: '1' if pixel < avg else '0', pixels)) # '00010100...'
hexadecimal = int(bits, 2).__format__('016x').upper()
return hexadecimal
def process_image(filepath):
thumb = PIL.Image.open(filepath).resize((128,128)).convert("L")
code = image_pixel_hash_code(thumb)
previous_image = seen_images.get(code, None)
if code in seen_images:
print "'{}' already seen as '{}'".format(filepath, previous_image)
else:
seen_images[code] = filepath
You can put a path to a bunch of image files into a variable called IMAGE_ROOT and then try my code out with:
import os
for root, dirs, files in os.walk(IMAGE_ROOT):
for filename in files:
filepath = os.path.join(root, filename)
try:
process_image(filepath)
except IOError:
pass
There are a lot of methods for comparing images, but for your given example I suspect that simplicity and speed are the key factors (hence why you're trying to use a hash as a first-pass). Here are some suggestions - in all cases I'd suggest shrinking and cropping the image to a regular size and shape.
Smooth the image (gaussian blur) before shrinking to minimise the influence of artefacts. Then apply the hash or other comparison.
Subtract the images from one another (RGB) and check the remainder. Identical images will return zero, compression artefacts will result in small minor variations. You can either threshold, sum, or average the value and compare to a cut-off.
Use standard distance algorithsm (see scipy.spatial.distance) to calculate 'distance' between the two images. For example euclidean distance will give effectively the same as the sum of subtracting, while cosine will ignore itensity but match the profile of changes over the image i.e. a darker version of the same image will be considered equivalent. For these you will need to flatten your image to a 1D array.
The last two entail comparing every image to every other image when uploading, and that is going to get very computationally expensive for large numbers of images.
I've got the following routine I've written that takes two arbitrary curves and warps the space between them so it fits between two straight lines. For the loop, it process it per column as np.linspace doesn't operate on vectors AFAIK. Is there way to get rid of this loop and hit the whole thing at once?
def warp(img, iris_curve, pupil_curve):
height, width = img.shape[:2]
iris_height = np.uint8(np.max(np.int8(iris_curve) - pupil_curve))
out = np.zeros((iris_height, width))
for r in range(0,width):
map_theta = np.linspace(pupil_curve[r], iris_curve[r], iris_height)
map_theta = np.uint8(np.rint(map_theta))
out[:, r] = img[map_theta, r]
return np.uint8(out)
Thanks!
If you peek into the source code of np.linspace, you can use that as a guide to vectorize your code. Your loop would then be replaced by something like:
steps = (iris_curve - pupil_curve) / iris_height
map_theta = np.arange(iris_height)[:, None] * steps + pupil_curve
map_theta = np.rint(map_theta).astype(np.uint8)
out = img[map_theta, np.arange(width)]
You may have to transpose the output, it's hard to debug this kind of code without an example.
My first thoughts looking at this code is that the for loop is unlikely to be the most significant thing hampering performace. There is an awful lot of copying of data around in this code (calling linspace creates a new array then this is copied back into the larger output array. casting to different types also involves copying data around). For example, can you not initiate your output as
np.empty((h,w),dtype=np.uint8)
Moreover do you really need to explicitly calculate all these values? Unless you reference all of them you might be better off just using the 2D linear interpolator from scipy.
If you really want to produce the output as-is I think you'll have to write something custom in Cython or similar.