Read an array of pixel values in Python

I would like to take a screenshot of a certain region of the screen, and then check the pixel values of certain lines (e.g. x from 400 to 800).
I tried several approaches, such as ImageGrab and gdi32.GetPixel. Reading pixel values seems to take a lot of time, so I even tried converting the image data into a list, something like this:
im = ImageGrab.grab(box)
pixels = list(im.getdata())
Even this does not seem fast. Is there something I'm doing wrong?

ImageGrab returns a PIL image (the Python Imaging Library: http://effbot.org/imagingbook/image.htm), and .getdata() already returns the pixels as a sequence-like object. Wrapping it in list() copies every pixel into a new Python list, which repeats the expensive part of the work. You can just do:
im = ImageGrab.grab(box)
pixels = im.getdata()
And iterate through your pixels in your favorite way.
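If only a few rows matter, one option is to slice them out of the raw byte buffer instead of walking every pixel. The sketch below is a minimal illustration and makes two assumptions: the image is in RGB mode, and its bytes (as returned by im.tobytes()) are tightly packed row by row. The helper name row_span is hypothetical, and a small synthetic buffer stands in for a real screenshot.

```python
def row_span(raw, width, y, x_start, x_stop, bands=3):
    """Return the pixels of row y with x_start <= x < x_stop as tuples.

    Assumes `raw` holds tightly packed, row-major pixel data with
    `bands` bytes per pixel (3 for an RGB image).
    """
    row_offset = y * width * bands
    start = row_offset + x_start * bands
    stop = row_offset + x_stop * bands
    chunk = raw[start:stop]
    return [tuple(chunk[i:i + bands]) for i in range(0, len(chunk), bands)]


# Stand-in for im.tobytes(): a 4x2 RGB image whose pixel (x, y) is (x, y, 0).
width, height = 4, 2
raw = bytes(v for y in range(height) for x in range(width) for v in (x, y, 0))

print(row_span(raw, width, y=1, x_start=1, x_stop=3))
# [(1, 1, 0), (2, 1, 0)]
```

With a real screenshot you would pass im.tobytes() and im.width, and x_start=400, x_stop=800 for the range in the question.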

Related

Using PyAutoGUI to locate an image on screen regardless of the color tone / brightness

I'm looking for a simple way in Python (PyAutoGUI) to locate all the images of a certain type on the screen. The catch is that each image has a different gradient / color tone, and I don't want to take a screenshot of each and every image to locate them on screen.
Here's the region of the screen containing the images I am trying to get the coordinates of:
As you can see every square has a unique color (the contrast).
So I want to get the coordinates of every square while making PyAutoGUI scan for just one image. Is there any way to make it ignore the difference in contrast between the images, such as a black-and-white mode?
How the code works:
import pyautogui
coordinates = pyautogui.locateAllOnScreen("image.png") # Returns a generator of boxes, one for each match of image.png
I know this is two years old, but for anyone who finds this later, as I did: try the confidence argument (it requires OpenCV to be installed).
import pyautogui
button7location = pyautogui.locateOnScreen('calc7key.png', confidence=0.9)
button7location
Box(left=1416, top=562, width=50, height=41)
Source: https://pyautogui.readthedocs.io/en/latest/screenshot.html#the-locate-functions
I would try the following:
pyautogui.locateOnScreen("image.png", grayscale=True)
This ignores color values and compares only the brightness of the pixels. It has the added benefit of being about 30% quicker, but it can lead to false positives.

Quickly determining using Python whether an image is (fuzzily) in a collection

Imagine that some new image X arrives, and I want to know if X is new or has already been encountered before. I have code, below, that shrinks the image and then converts it to a hash code. I can then see via a single hash look-up if I've already encountered an image with the same hash code, so it's very fast.
My question is: is there an efficient way for me to see if a similar image, but one with a different hash code, has already been seen? I was going to title this question something like "Data structure for determining efficiently whether a similar, non-identical item is already contained" but decided that would be an instance of the XY problem.
When I say that this new image is "similar," I'm thinking of one that's perhaps gone through lossy compression and so looks like the original to the human eye but is not identical. Normally shrinking the image eliminates the difference, but not always, and if I shrink the image too much I start getting false positives.
Here's my current code:
from PIL import Image
seen_images = {} # This would really be a shelf or something
# From http://www.guguncube.com/1656/python-image-similarity-comparison-using-several-techniques
def image_pixel_hash_code(image):
    pixels = list(image.getdata())
    avg = sum(pixels) / len(pixels)
    bits = "".join(map(lambda pixel: '1' if pixel < avg else '0', pixels)) # '00010100...'
    hexadecimal = int(bits, 2).__format__('016x').upper()
    return hexadecimal
def process_image(filepath):
    # Shrink and convert to greyscale so that each pixel is a single number
    thumb = Image.open(filepath).resize((128,128)).convert("L")
    code = image_pixel_hash_code(thumb)
    previous_image = seen_images.get(code, None)
    if code in seen_images:
        print("'{}' already seen as '{}'".format(filepath, previous_image))
    else:
        seen_images[code] = filepath
You can put a path to a bunch of image files into a variable called IMAGE_ROOT and then try my code out with:
import os
for root, dirs, files in os.walk(IMAGE_ROOT):
    for filename in files:
        filepath = os.path.join(root, filename)
        try:
            process_image(filepath)
        except IOError:
            pass
There are a lot of methods for comparing images, but for your given example I suspect that simplicity and speed are the key factors (hence why you're trying to use a hash as a first-pass). Here are some suggestions - in all cases I'd suggest shrinking and cropping the image to a regular size and shape.
Smooth the image (gaussian blur) before shrinking to minimise the influence of artefacts. Then apply the hash or other comparison.
Subtract the images from one another (RGB) and check the remainder. Identical images will return zero; compression artefacts will result in small variations. You can threshold, sum, or average the values and compare the result to a cut-off.
Use standard distance algorithms (see scipy.spatial.distance) to calculate the 'distance' between the two images. For example, euclidean distance will give effectively the same result as summing the subtraction, while cosine will ignore intensity but match the profile of changes over the image, i.e. a darker version of the same image will be considered equivalent. For these you will need to flatten your image to a 1D array.
The last two entail comparing each new image to every other image when uploading, and that is going to get very computationally expensive for large numbers of images.
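To illustrate the difference between the two distance measures mentioned above, here is a minimal sketch using plain-Python stand-ins for the euclidean and cosine functions in scipy.spatial.distance. The pixel lists are made-up flattened greyscale values, not real image data, and the return value for an all-zero image is a convention chosen only for this sketch.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two flattened images."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """1 - cosine similarity; insensitive to overall brightness scaling."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # convention chosen here for all-black images
    return 1.0 - dot / (norm_a * norm_b)


original = [10, 200, 30, 120, 250, 60]   # made-up flattened pixels
darker = [v // 2 for v in original]      # same image at half brightness
different = [250, 10, 120, 30, 60, 200]  # unrelated image

print(euclidean_distance(original, darker))  # clearly non-zero
print(cosine_distance(original, darker))     # approximately zero
print(cosine_distance(original, different))  # clearly non-zero
```

The darker copy is far away in euclidean terms but almost identical in cosine terms, which is the behaviour described in the answer.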

Pixel to pixel edit using PIL and Image.point

I can't seem to understand what Image.point does. I want to do some pixel editing, which might include checking which color value (r, g or b) is the maximum in every pixel and acting accordingly. Let's say that I can't use numpy. I managed to use Image.point to add the same value to every pixel in an image.
point code
import math
from PIL import Image
def brightness(i, value):
    # Convert a percentage into an offset on the 0-255 scale
    value = math.floor(255*(float(value)/100))
    return i+value
if __name__ == '__main__':
    image = '/home/avlahop/verybig.jpg'
    print(image)
    img = Image.open(image)
    print(img)
    out = img.point(lambda i: brightness(i, 50))
    out.show()
numpy code
import math
import numpy as np
from PIL import Image
def brightness(arr, adjust):
    # Convert a percentage into an offset on the 0-255 scale
    adjust = math.floor(255*(float(adjust)/100))
    arr[...,0] += adjust
    arr[...,1] += adjust
    arr[...,2] += adjust
    return arr
if __name__ == '__main__':
    image = '/home/avlahop/verybig.jpg'
    img = Image.open(image).convert('RGBA')
    arr = np.asarray(img).astype('float')
    new_image = Image.fromarray(brightness(arr, 50).clip(0,255).astype('uint8'), 'RGBA')
    new_image.show()
I have to say that the point code is faster than the numpy version. But what if I want to do a more complex operation with point? For example, for every pixel, check max(r, g, b) and do something depending on whether r, g or b is the maximum. As you saw, I passed point a function that takes one argument, i. What is this i? Is it the pixel (i.e. i = (r, g, b))? I can't work it out from the PIL documentation.
The docs may not have been clear in earlier versions of PIL, but in Pillow it's spelled out pretty well. From Image.point:
lut – A lookup table, containing 256 values per band in the image. A function can be used instead, it should take a single argument. The function is called once for each possible pixel value, and the resulting table is applied to all bands of the image.
In other words, it's not a general-purpose way to map each pixel through a function; it's just a way to dynamically build the lookup table instead of passing in a pre-built one.
That means the function is called with the numbers from 0 through 255. (You can find that out for yourself pretty easily by writing a function that appends its argument to a global list and then dumping out the list at the end.)
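The experiment suggested in the parenthesis can be sketched as follows. This assumes Pillow is installed and uses a small solid-color RGB image as a stand-in; the names seen and record are made up for the illustration. For 8-bit modes such as RGB, the recorded arguments should be the integers 0 to 255 and never an (r, g, b) tuple.

```python
from PIL import Image

seen = []  # every argument that point() passes to the callback


def record(value):
    seen.append(value)
    return value  # identity mapping, so the image content is unchanged


img = Image.new("RGB", (8, 8), (10, 20, 30))  # small stand-in image
out = img.point(record)

print(len(seen), seen[:3], seen[-1])
print(out.getpixel((0, 0)))
```

The number of calls does not depend on the image size, which is why point is fast and also why it cannot see a whole pixel at once.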
If you split your image into separate bands or planes, point each one of them with a different function, and then recombine them, that might be able to accomplish what you're trying to do. But even then, I think eval is what you wanted, not point.
But I think what you really want is a pixel-by-pixel, all-bands-at-once iterator, and you don't need anything special for that. Just use map or a comprehension over getdata. Isn't that slow? Of course it's slow, because it calls your function X*Y times; the cost of building the getdata sequence and iterating over it is tiny compared to that, so looking for a way for PIL to optimize the already-fast-enough part won't get you very far.
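As a rough sketch of that per-pixel approach, the function below receives a whole (r, g, b) tuple and branches on which channel is largest. What it does with the result (zeroing the other channels) is an arbitrary choice for the illustration, the name keep_dominant_channel is hypothetical, and a hand-written list stands in for list(img.getdata()) on an RGB image.

```python
def keep_dominant_channel(pixel):
    """Zero every channel except the largest one (ties keep the first)."""
    r, g, b = pixel
    strongest = max(r, g, b)
    if r == strongest:
        return (r, 0, 0)
    if g == strongest:
        return (0, g, 0)
    return (0, 0, b)


# Stand-in for list(img.getdata()) on an RGB image.
pixels = [(200, 10, 10), (5, 90, 40), (0, 0, 255), (7, 7, 7)]
result = [keep_dominant_channel(p) for p in pixels]
print(result)
# [(200, 0, 0), (0, 90, 0), (0, 0, 255), (7, 0, 0)]
```

With a real image, the resulting list can be written into a new image of the same mode and size with putdata.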

Not able to display/Convert Image

I am new to Python and OpenCV.
I am using the following code.
from PIL import Image, ImageChops
im1 = Image.open("img1.png")
im2 = Image.open("img2.png")
diff = ImageChops.difference(im2, im1)
When I do cv.ShowImage, it asks me to convert it. I am trying all kinds of convert but there is always an error.
The only way I can see the image is by doing the following.
diff.save("final","JPEG")
Is there another way I can convert it to an IplImage or CvMat?
cv.SaveImage(diff, cv.LoadImage(diff)) might work, using the opencv function.
EDIT: In light of the comment below, I think trying
cv.SaveImage(diff, cv.LoadImage(diff))
cv.ShowImage('box name', diff)
might work.
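A signed difference image would contain negative pixel values, which cv.ShowImage cannot display as is. Note, however, that ImageChops.difference returns the absolute value of the difference, so its result already lies between 0 and 255.
For a plain subtraction, the range of possible pixel values is -255 to 255. You might want to normalize the pixel values first, using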
new_value = (old_value + 255)/2
I don't use OpenCV on Python, so I cannot post code for the above.
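The normalization formula above can be sketched in plain Python without OpenCV. This only matters for a signed subtraction; the two pixel rows are made-up values, the helper name is hypothetical, and integer division is an assumption made here so that the result stays a valid 8-bit value.

```python
def normalize_signed_difference(value):
    """Map a signed difference in [-255, 255] onto the range [0, 255]."""
    return (value + 255) // 2


row_a = [0, 128, 255, 40]    # made-up pixel values from image 1
row_b = [255, 128, 0, 50]    # made-up pixel values from image 2

signed = [a - b for a, b in zip(row_a, row_b)]
display = [normalize_signed_difference(v) for v in signed]

print(signed)   # [-255, 0, 255, -10]
print(display)  # [0, 127, 255, 122]
```

After this mapping, "no difference" shows up as mid-grey (127) rather than black.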

How to convert a pygame Surface to a PIL Image?

I'm using PIL to apply a perspective transform to a portion of the screen.
The original image data is a pygame Surface, which needs to be converted to a PIL Image.
For that I found pygame's tostring function, which exists for that purpose.
However, the result looks pretty odd (see attached screenshot). What is going wrong with this code?
rImage = pygame.Surface((1024,768))
#draw something to the Surface
sprite = pygame.sprite.RenderPlain((playboard,))
sprite.draw(rImage)
pil_string_image = pygame.image.tostring(rImage, "RGBA",False)
pil_image = Image.fromstring("RGBA",(660,660),pil_string_image)
What am I doing wrong?
As I noted in a comment, the pygame documentation for pygame.image.fromstring(string, size, format, flipped=False) says: “The size and format image must compute the exact same size as the passed string buffer. Otherwise an exception will be raised”. Thus, using (1024,768) in place of (660,660), or vice versa (in general, the same dimensions for the two calls), is more likely to work. (I say “more likely to work” instead of “will work” because I didn't test any cases.)
The reason for suspecting a problem like this: the strange look of part of the image resembles a display that is set to a raster rate it can't synchronize to, i.e. lines of the image start displaying at points other than the left margin, in this case because the image lines are longer than the display lines. I'm assuming the snowflakes are sprites, generated separately from the distorted image.
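To make the size mismatch concrete, here is a minimal sketch that only does the arithmetic and needs neither pygame nor PIL. It assumes 4 bytes per pixel for the "RGBA" format and 3 for "RGB"; the helper name expected_buffer_size is made up for the illustration.

```python
BYTES_PER_PIXEL = {"RGB": 3, "RGBA": 4}  # only the formats discussed here


def expected_buffer_size(size, fmt):
    """Number of bytes a tightly packed image of this size and format needs."""
    width, height = size
    return width * height * BYTES_PER_PIXEL[fmt]


surface_size = (1024, 768)  # size of the pygame Surface in the question
pil_size = (660, 660)       # size passed to the PIL call in the question

produced = expected_buffer_size(surface_size, "RGBA")
consumed = expected_buffer_size(pil_size, "RGBA")

print(produced)  # 3145728
print(consumed)  # 1742400
print(produced == consumed)  # False
```

Because each row of the buffer holds 1024 pixels while the reader expects 660, every decoded row starts at a shifted position, which would fit the skewed look described above. Passing rImage.get_size() to both calls keeps the two sizes in step. Note also that current Pillow releases use Image.frombytes in place of the older Image.fromstring.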
