I am new to Python and OpenCV.
I am using the following code.
import Image
import ImageChops
im1 = Image.open("img1.png")
im2 = Image.open("img2.png")
diff = ImageChops.difference(im2, im1)
When I call cv.ShowImage on the result, it asks me to convert it. I have tried all kinds of conversions, but there is always an error.
The only way I can see the image is by doing the following.
diff.save("final", "JPEG")
Is there another way I can convert it to an IplImage or CvMat?
Saving the difference image to a file with PIL and then loading it back with the OpenCV functions might work.
EDIT: In light of the comment below, I think something along these lines might work:
diff.save("final.jpg", "JPEG")
cv_img = cv.LoadImage("final.jpg")
cv.ShowImage('box name', cv_img)
cv.WaitKey(0)
The difference image may contain negative pixel values, so I don't think cv.ShowImage can display it 'as is'.
The range of possible pixel values after subtraction is -255 to 255, so you may want to normalize the pixel values first:
new_value = (old_value + 255)/2
I don't use OpenCV on Python, so I cannot post code for the above.
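In numpy terms, that normalization might look like this sketch (assuming both frames are loaded as 8-bit grayscale):
import numpy as np
import Image

# load both frames as signed integers so the subtraction can go negative
a = np.asarray(Image.open("img1.png").convert("L"), dtype=np.int16)
b = np.asarray(Image.open("img2.png").convert("L"), dtype=np.int16)
diff = b - a                                  # true signed difference, -255..255
shown = ((diff + 255) // 2).astype(np.uint8)  # rescaled into 0..255 for display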
I am trying to select an area of an image in order to do some analysis on that specific area.
However, when I searched online, I was only able to find guides on how to select a rectangular area; I need to select an area that is drawn with my mouse. An example of such an area is included below.
Would anyone be able to recommend some keywords or libraries to search for, or links to guides that do something similar?
Also, I am not sure if it is necessary information, but the analysis I am trying to do on the region of interest is to find the ratio of white to black pixels in that specific area.
I produced a simple working example based on this answer. I also tried using scipy.ndimage.morphology.binary_fill_holes, but could not get it to work. Note that the provided function takes a little longer, since it assumes the input image is grayscale rather than already binarized.
I specifically avoided using OpenCV, since I find its setup a bit tedious, but I think it provides an equivalent as well (see here).
Additionally, my "binarization" is kind of hacky, but you can probably figure out how to parse your image into a valid format yourself (and it might be easier if you produce the input within a program). In any case, I would suggest making sure that you have a proper image format, since JPEG compression might break the connectivity of your lines and cause issues in certain cases.
import scipy as sp
import numpy as np
import scipy.ndimage
import matplotlib.pyplot as plt
def flood_fill(test_array, h_max=255):
    # grayscale flood fill by morphological reconstruction:
    # seed the interior with h_max, then erode it back down
    # until it converges onto the original image
    input_array = np.copy(test_array)
    el = sp.ndimage.generate_binary_structure(2, 2).astype(int)
    inside_mask = sp.ndimage.binary_erosion(~np.isnan(input_array), structure=el)
    output_array = np.copy(input_array)
    output_array[inside_mask] = h_max
    output_old_array = np.copy(input_array)
    output_old_array.fill(0)
    el = sp.ndimage.generate_binary_structure(2, 1).astype(int)
    while not np.array_equal(output_old_array, output_array):
        output_old_array = np.copy(output_array)
        output_array = np.maximum(input_array, sp.ndimage.grey_erosion(output_array, size=(3, 3), footprint=el))
    return output_array
x = plt.imread("test.jpg")
# "convert" to grayscale, invert, and cast to float so np.isnan works inside flood_fill
binary = (255 - x[:, :, 0]).astype(float)
filled = flood_fill(binary)
plt.imshow(filled)
plt.show()
This produces the following result:
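For the white-to-black ratio the question actually asks about, a rough follow-up sketch using the arrays above (the 128 threshold is an assumption):
import numpy as np

region = filled == 255                         # pixels inside the filled area
gray = 255 - binary                            # undo the inversion above
white = np.count_nonzero(gray[region] > 128)   # "white" cutoff is an assumption
black = np.count_nonzero(gray[region] <= 128)
print(white / black if black else float("inf"))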
I would like to take a screenshot of a certain region of the screen, and then check the pixel values along certain lines (e.g. x from 400 to 800).
I tried multiple ways, like ImageGrab, gdi32.GetPixel and some more. Reading pixel values seems to take a lot of time, so I even tried converting the image into a list, something like this:
im = ImageGrab.grab(box)
pixels = list(im.getdata())
Even this does not seem fast. Is there something I'm doing wrong?
ImageGrab returns pixels in PIL format (the Python Imaging Library: http://effbot.org/imagingbook/image.htm), and .getdata() already returns the pixels as a sequence. By wrapping that in list() you are doing the same (expensive) operation twice. You can just do:
im = ImageGrab.grab(box)
pixels = im.getdata()
And iterate through your pixels in your favorite way.
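If per-pixel iteration in Python is still too slow for whole lines of pixels, a common alternative (an assumption on my part, not from the answer above) is to convert the grab to a numpy array and slice it:
import numpy as np
import ImageGrab   # from PIL import ImageGrab on current Pillow

box = (0, 0, 800, 600)      # placeholder capture region
im = ImageGrab.grab(box)
arr = np.asarray(im)        # height x width x channels, one bulk conversion
line = arr[100, 400:800]    # e.g. all pixels of row 100 with x from 400 to 800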
I have a .png image that contains three grayscale values. It contains black (0), white (255) and gray (128) blobs. I want to resize this image to a smaller size while preserving only these three grayscale values.
Currently, I am using scipy.misc.imresize to do it, but I noticed that when I reduce the size, the edges get blurred and the result contains more than 3 grayscale values.
Does anyone know how to do this in python?
From the docs for imresize, note the interp keyword argument:
interp : str, optional
Interpolation to use for re-sizing
(‘nearest’, ‘lanczos’, ‘bilinear’, ‘bicubic’ or ‘cubic’).
The default is bilinear filtering; switch to 'nearest' and it will instead use the exact value of the nearest existing pixel, which preserves your precise grayscale values rather than linearly interpolating between them.
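A minimal sketch of that call (assuming the image is already loaded into a numpy array img):
from scipy.misc import imresize

# 'nearest' re-uses existing pixel values only, so no new gray levels appear;
# an integer size argument means "percent of the original size"
small = imresize(img, 50, interp='nearest')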
I believe that PIL.Image.resize does exactly what you want. Take a look at the docs.
Basically what you need is:
from PIL import Image

im = Image.open('old.png')
# Image.NEAREST is the default, I'm just being explicit
im = im.resize((im.size[0] // 2, im.size[1] // 2), Image.NEAREST)
im.save('new.png')
Actually you can do pretty much the same with scipy.misc.imresize.
Take a look at its docs.
The interp parameter is what you need: if you set it to 'nearest', the image colors won't be affected.
I attach a zip archive with all the files needed to illustrate and reproduce the problem.
(I don't have permissions to upload images yet...)
I have an image (test2.png in the zip archive) with curved lines.
I try to warp it so the lines are straight.
I thought of using scikit-image transform, and in particular transform.PolynomialTransform because the transformation involves high order distortions.
So first I measure the precise position of each line at regular intervals in x to define the input interest points (in the file source_test2.csv).
Then I compute the corresponding desired positions, located along a straight line (in the file destination_test2.csv).
The figure correspondence.png shows what this looks like.
Next, I simply call transform.PolynomialTransform() using a polynomial of order 3.
It finds a solution, but when I apply it using transform.warp(), the result is crazy, as illustrated in the file Crazy_Warped.png.
Can anybody tell me what I am doing wrong?
I tried a polynomial of order 2, without luck...
I managed to get a good transformation for a sub-image (the first 400 columns only).
Is transform.PolynomialTransform() completely unstable in a case like mine?
Here is the entire code:
import numpy as np
import matplotlib.pyplot as plt
import asciitable
from skimage import io, transform

# read image
orig = io.imread("test2.png", as_grey=True)

# read tables with reference points and their desired transformed positions
source = asciitable.read("source_test2.csv")
destination = asciitable.read("destination_test2.csv")

# format as numpy arrays as required by scikit-image
# (need to add 1 because I started counting positions from 0...)
source = np.column_stack((source["x"] + 1, source["y"] + 1))
destination = np.column_stack((destination["x"] + 1, destination["y"] + 1))

# plot
plt.imshow(orig, cmap='gray', interpolation='nearest')
plt.plot(source[:, 0], source[:, 1], '+r')
plt.plot(destination[:, 0], destination[:, 1], '+b')
plt.xlim(0, orig.shape[1])
plt.ylim(0, orig.shape[0])

# compute the transformation
t = transform.PolynomialTransform()
t.estimate(destination, source, 3)

# warp the image
img_warped = transform.warp(orig, t, order=2, mode='constant', cval=float('nan'))

# show the result
plt.imshow(img_warped, cmap='gray', interpolation='nearest')
plt.plot(source[:, 0], source[:, 1], '+r')
plt.plot(destination[:, 0], destination[:, 1], '+b')
plt.xlim(0, img_warped.shape[1])
plt.ylim(0, img_warped.shape[0])

# save to a file
io.imsave("warped.png", img_warped)
Thanks in advance!
There are a couple of things wrong here, mainly to do with coordinate conventions. For example, if we examine the code where you plot the original image and then put the clicked points on top of it:
plt.imshow(orig, cmap='gray', interpolation='nearest')
plt.plot(source[:, 0], source[:, 1], '+r')
plt.xlim(0, orig.shape[1])
plt.ylim(0, orig.shape[0])
(I've taken out the destination points to make it cleaner.) We then get the following image:
As you can see, the y-axis is flipped. If we invert the y-axis with:
source[:, 1] = orig.shape[0] - source[:, 1]
before plotting, then we get the following:
So that is the first problem (don't forget to invert the destination points as well). The second has to do with the transform itself:
t.estimate(destination, source, 3)
From the documentation we see that the call takes the source points first, then the destination points. So the order of those arguments should be flipped.
Lastly, the clicked points are of the form (x, y), but the image is stored as (y, x), so we have to transpose the image before applying the transform and then transpose back again:
img_warped = transform.warp(orig.transpose(), t, order=2, mode='constant', cval=float('nan'))
img_warped = img_warped.transpose()
When you make these changes, you get the following warped image:
These lines aren't perfectly straight, but the result makes much more sense.
Thank you very much for the detailed answer! I cannot believe I did not see the axis inversion problem... Thanks for catching it!
But I am afraid your final solution does not solve my problem... The image you get is still crazy: it should be continuous and not have such big holes and weird distortions... (see my final solution below)
I found I could get a reasonable solution using RANSAC:
from skimage.measure import ransac

t, inliers = ransac((destination, source), transform.PolynomialTransform,
                    min_samples=20, residual_threshold=1.0, max_trials=1000)
outliers = ~inliers
I then get the following result
Note that I think I was right to use (destination, source) in that order! I think it has to do with the fact that transform.warp requires the inverse_map as input for the transformation, not the forward map. But maybe I am wrong? The good result I am getting suggests it's correct.
I guess that polynomial transforms are just too unstable here, and using RANSAC makes it possible to get a reasonable solution.
My problem was then to find a way to change the polynomial order in the RANSAC call...
transform.PolynomialTransform() does not take any parameters, and uses by default a 2nd order polynomial, but from the result I can see I would need a 3rd or 4th order polynomial.
So I opened a new question and got a solution from Stefan van der Walt. Follow the link to see how to do it.
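That linked solution boils down to fixing the order in a small subclass, roughly like this sketch (the class name is mine):
from skimage import transform
from skimage.measure import ransac

class PolynomialTransformOrder3(transform.PolynomialTransform):
    def estimate(self, src, dst):
        # pin the polynomial order to 3; the default used inside ransac is 2
        return transform.PolynomialTransform.estimate(self, src, dst, 3)

t, inliers = ransac((destination, source), PolynomialTransformOrder3,
                    min_samples=20, residual_threshold=1.0, max_trials=1000)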
Thanks again for your help!
I can't seem to understand what Image.point does. I want to do some pixel editing, which might include checking which color value (r, g or b) is the max in every pixel and acting accordingly. Let's say that I can't use numpy. I managed to use Image.point to add the same value to every pixel in an image.
The point code:
import Image, math

def brightness(i, value):
    value = math.floor(255 * (float(value) / 100))
    return i + value

if __name__ == '__main__':
    image = '/home/avlahop/verybig.jpg'
    print image
    img = Image.open(image)
    print img
    out = img.point(lambda i: brightness(i, 50))
    out.show()
The numpy code:
import math
import numpy as np
import Image

def brightness(arr, adjust):
    adjust = math.floor(255 * (float(adjust) / 100))
    arr[..., 0] += adjust
    arr[..., 1] += adjust
    arr[..., 2] += adjust
    return arr

if __name__ == '__main__':
    image = '/home/avlahop/verybig.jpg'
    img = Image.open(image).convert('RGBA')
    arr = np.asarray(img).astype('float')
    new_image = Image.fromarray(brightness(arr, 50).clip(0, 255).astype('uint8'), 'RGBA')
    new_image.show()
I have to say that the point code is faster than the numpy one. But what if I want to do a more complex operation with point, for example checking max(r, g, b) for every pixel and doing something depending on whether r, g or b is the max? As you saw, I used point with a function as the argument. It takes one argument, i. What is this i? Is it the pixel (i.e. i = (r, g, b))? I can't seem to understand this from the PIL documentation.
The docs may not have been clear in earlier versions of PIL, but in Pillow it's spelled out pretty well. From Image.point:
lut – A lookup table, containing 256 values per band in the image. A function can be used instead, it should take a single argument. The function is called once for each possible pixel value, and the resulting table is applied to all bands of the image.
In other words, it's not a general-purpose way to map each pixel through a function; it's just a way to build the lookup table dynamically, instead of passing in a pre-built one.
In other words, it's called with the numbers from 0 through 255. (You can find this out for yourself pretty easily by writing a function that appends its argument to a global list and then dumping out the list at the end…)
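A quick sketch of that experiment:
import Image

seen = []
img = Image.open('/home/avlahop/verybig.jpg')
img.point(lambda i: seen.append(i) or i)  # append returns None, so i is passed through
print sorted(set(seen))                   # just the integers 0 through 255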
If you split your image into separate bands, point each of them with a different function, and then recombine them, that might accomplish what you're trying to do. But even then, I think eval is what you want, not point.
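For reference, the split/point/merge route would look something like this (the +20 tweak is just a placeholder operation):
import Image

img = Image.open('/home/avlahop/verybig.jpg').convert('RGB')
r, g, b = img.split()
r = r.point(lambda i: min(i + 20, 255))   # a different table for each band if you like
out = Image.merge('RGB', (r, g, b))
out.show()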
But I think what you really want is a pixel-by-pixel, all-bands-at-once iterator. And you don't need anything special for that: just use map or a comprehension over getdata. Isn't that slow? Of course it's slow, because it's calling your function X*Y times; the cost of building the getdata sequence and iterating over it is tiny compared to that cost, so looking for a way for PIL to optimize the already-fast-enough part won't get you very far.
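For the max(r, g, b) case, that looks something like this sketch (the per-band tweak is a made-up example):
import Image

img = Image.open('/home/avlahop/verybig.jpg').convert('RGB')

def tweak(pixel):
    # here each pixel really is an (r, g, b) tuple
    r, g, b = pixel
    m = max(r, g, b)
    if m == r:
        return (min(r + 20, 255), g, b)
    if m == g:
        return (r, min(g + 20, 255), b)
    return (r, g, min(b + 20, 255))

out = Image.new('RGB', img.size)
out.putdata([tweak(p) for p in img.getdata()])
out.show()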