I'm trying to do as described here: Finding a subimage inside a Numpy image, so that I can search for an image inside a screenshot.
The code looks like this:
import cv2
import numpy as np
import gtk.gdk
from PIL import Image
def make_screenshot():
w = gtk.gdk.get_default_root_window()
sz = w.get_size()
pb = gtk.gdk.Pixbuf(gtk.gdk.COLORSPACE_RGB, False, 8, sz[0], sz[1])
pb = pb.get_from_drawable(w, w.get_colormap(), 0, 0, 0, 0, sz[0], sz[1])
width, height = pb.get_width(), pb.get_height()
return Image.fromstring("RGB", (width, height), pb.get_pixels())
if __name__ == "__main__":
img = make_screenshot()
cv_im = cv2.cvtColor(np.array(img), cv2.COLOR_RGB2BGR)
template = cv_im[30:40, 30:40, :]
result = cv2.matchTemplate(cv_im, template, cv2.TM_CCORR_NORMED)
print np.unravel_index(result.argmax(), result.shape)
Depending on the method selected (instead of cv2.TM_CCORR_NORMED) I'm getting completely different coordinates, but none of them is (30, 30) as in the example.
Please teach me: what's wrong with this approach?
Short answer: you need to use the following line to locate the corner of the best match:
minVal, maxVal, minLoc, maxLoc = cv2.minMaxLoc(result)
The variable maxLoc will hold a tuple containing the x, y indices of the upper left-hand corner of the best match.
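For completeness, here is a minimal sketch of how that fits into the question's code (reusing cv_im and template from the question; note that for the squared-difference methods the best match is the minimum, not the maximum):
import cv2
method = cv2.TM_CCORR_NORMED
result = cv2.matchTemplate(cv_im, template, method)
minVal, maxVal, minLoc, maxLoc = cv2.minMaxLoc(result)
# For TM_SQDIFF / TM_SQDIFF_NORMED lower values mean better matches,
# so the corner of the best match is minLoc instead of maxLoc.
if method in (cv2.TM_SQDIFF, cv2.TM_SQDIFF_NORMED):
    top_left = minLoc
else:
    top_left = maxLoc
print(top_left)  # expect (30, 30) for the template cut at [30:40, 30:40]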
Long answer:
cv2.matchTemplate() returns a single-channel image where the number at each index corresponds to how well the input image matched the template at that index. Try visualizing result by inserting the following lines of code after your call to matchTemplate, and you will see why the raw result is difficult to make sense of by hand.
cv2.imshow("Debugging Window", result)
cv2.waitKey(0)
cv2.destroyAllWindows()
minMaxLoc() turns the result returned by matchTemplate into the information you want. If you cared to know where the template had the worst match, or what value was held by result at the best and worst matches, you could use those values too.
This code worked for me on an example image that I read from file. If your code continues to misbehave, you probably aren't reading in your images the way you want to. The above snippet of code is useful for debugging with OpenCV. Replace the argument result in imshow with the name of any image object (numpy array) to visually confirm that you are getting the image you want.
Related
I am trying to implement skeletonization of small images, but I am not getting the expected results. I also tried thin() and medial_axis(), but nothing seems to work as expected. I suspect this problem occurs because of the images' small resolution. Here is the code:
import cv2
from numpy import asarray
import numpy as np
# open image
file = "66.png"
img_grey = cv2.imread(file, cv2.IMREAD_GRAYSCALE)
afterMedian = cv2.medianBlur(img_grey, 3)
thresh = 140
# threshold the image
img_binary = cv2.threshold(afterMedian, thresh, 255, cv2.THRESH_BINARY)[1]
# make binary image
arr = asarray(img_binary)
binaryArr = np.zeros(asarray(img_binary).shape)
for i in range(0, arr.shape[0]):
for j in range(0, arr.shape[1]):
if arr[i][j] == 255:
binaryArr[i][j] = 1
else:
binaryArr[i][j] = 0
# perform skeletonization
from skimage.morphology import skeletonize
cv2.imshow("binary arr", binaryArr)
backgroundSkeleton = skeletonize(binaryArr)
# convert to non-binary image
bSkeleton = np.zeros(arr.shape)
for i in range(0, arr.shape[0]):
for j in range(0, arr.shape[1]):
if backgroundSkeleton[i][j] == 0:
bSkeleton[i][j] = 0
else:
bSkeleton[i][j] = 255
cv2.imshow("background skeleton", bSkeleton)
cv2.waitKey(0)
The results are:
I would expect something more like this:
This applies to similar shapes also:
Expectation:
Am I doing something wrong? Or will it truly not be possible with such small pictures? I tried skeletonization on bigger images and it worked just fine. Original images:
You could try the skeleton in DIPlib (dip.EuclideanSkeleton):
import numpy as np
import diplib as dip
import cv2
file = "66.png"
img_grey = cv2.imread(file, cv2.IMREAD_GRAYSCALE)
afterMedian = cv2.medianBlur(img_grey, 3)
thresh = 140
bin = afterMedian > thresh
sk = dip.EuclideanSkeleton(bin, endPixelCondition='three neighbors')
dip.viewer.Show(bin)
dip.viewer.Show(sk)
dip.viewer.Spin()
The endPixelCondition input argument can be used to adjust how many branches are preserved or removed. 'three neighbors' is the option that produces the most branches.
The code above also produces branches towards the corners of the image. Using 'two neighbors' prevents that, but produces fewer branches towards the object as well. The other way to prevent it is to set edgeCondition='object', but in that case the ring around the object becomes a square on the image boundary.
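For example, a sketch of the variants mentioned, reusing bin from the code above:
# Fewer spurious branches towards the image corners:
sk2 = dip.EuclideanSkeleton(bin, endPixelCondition='two neighbors')
# Keep 'three neighbors' but treat the image edge as object:
sk3 = dip.EuclideanSkeleton(bin, endPixelCondition='three neighbors', edgeCondition='object')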
To convert the DIPlib image sk back to a NumPy array, do
sk = np.array(sk)
sk is now a Boolean NumPy array (values True and False). To create an array compatible with OpenCV simply cast to np.uint8 and multiply by 255:
sk = np.array(sk, dtype=np.uint8)
sk *= 255
Note that, when dealing with NumPy arrays, you generally don't need to loop over all pixels. In fact, it's worth trying to avoid doing so, as loops in Python are extremely slow.
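For example, both pixel loops in the question can be replaced by one vectorized expression each (a sketch, assuming img_binary holds 0/255 values as in the question):
import numpy as np
# 0/255 image -> boolean array, replacing the first loop
binaryArr = (img_binary == 255)
# boolean skeleton -> 0/255 image, replacing the second loop
bSkeleton = backgroundSkeleton.astype(np.uint8) * 255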
It seems scikit-image is a much better choice than cv2 here.
Since the package defines functions for binary images, if you are working with black-and-white images, try its ready-to-use code:
skeletonize
Note: if the process misses image details, don't upsample the input right away; first try the other skimage morphology functions to enhance details, in which case your code will work on larger image areas too. You could look here.
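A minimal sketch of that suggestion, mirroring the question's preprocessing (it assumes the same 66.png input and threshold of 140):
import cv2
import numpy as np
from skimage.morphology import skeletonize
img_grey = cv2.imread("66.png", cv2.IMREAD_GRAYSCALE)
binary = cv2.medianBlur(img_grey, 3) > 140   # boolean image
skeleton = skeletonize(binary)               # boolean skeleton
cv2.imshow("skeleton", skeleton.astype(np.uint8) * 255)
cv2.waitKey(0)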
Okay, here's the situation:
I want to use the Python Image Library to "theme" an image like this:
Theme color: "#33B5E5"
IN:
OUT:
I got the result using this commands with ImageMagick:
convert image.png -colorspace gray image.png
mogrify -fill "#33b5e5" -tint 100 image.png
Explanation:
The image is first converted to black-and-white, and then it is themed.
I want to get the same result with the Python Image Library.
But it seems I'm having some problems using it, since:
it cannot handle transparency
the background (the transparent area in the main image) gets themed too
I'm trying to use this script:
import Image
import ImageEnhance
def image_overlay(src, color="#FFFFFF", alpha=0.5):
overlay = Image.new(src.mode, src.size, color)
bw_src = ImageEnhance.Color(src).enhance(0.0)
return Image.blend(bw_src, overlay, alpha)
img = Image.open("image.png")
image_overlay(img, "#33b5e5", 0.5)
You can see I did not convert it to grayscale first, because that didn't work with transparency either.
I'm sorry to post so many issues in one question, but I couldn't do anything else :$
Hope you all understand.
Note: There's a Python 3/pillow fork of PIL version of this answer here.
Update 4: Guess the previous update to my answer wasn't the last one after all. Although converting it to use PIL exclusively was a major improvement, there were a couple of things that seemed like there ought to be better, less awkward ways to do them, if only PIL had the ability.
Well, after reading the documentation closely as well as some of the source code, I realized what I wanted to do was in fact possible. The trade-off is that the code now has to build the look-up table manually, so the overall code is slightly longer. However, the result is that it only needs to make one call to the relatively slow Image.point() method, instead of three.
from PIL import Image
from PIL.ImageColor import getcolor, getrgb
from PIL.ImageOps import grayscale
def image_tint(src, tint='#ffffff'):
if Image.isStringType(src): # file path?
src = Image.open(src)
if src.mode not in ['RGB', 'RGBA']:
raise TypeError('Unsupported source image mode: {}'.format(src.mode))
src.load()
tr, tg, tb = getrgb(tint)
tl = getcolor(tint, "L") # tint color's overall luminosity
if not tl: tl = 1 # avoid division by zero
tl = float(tl) # compute luminosity preserving tint factors
sr, sg, sb = map(lambda tv: tv/tl, (tr, tg, tb)) # per component adjustments
# create look-up tables to map luminosity to adjusted tint
# (using floating-point math only to compute table)
luts = (map(lambda lr: int(lr*sr + 0.5), range(256)) +
map(lambda lg: int(lg*sg + 0.5), range(256)) +
map(lambda lb: int(lb*sb + 0.5), range(256)))
l = grayscale(src) # 8-bit luminosity version of whole image
if Image.getmodebands(src.mode) < 4:
merge_args = (src.mode, (l, l, l)) # for RGB version of grayscale
else: # include copy of src image's alpha layer
a = Image.new("L", src.size)
a.putdata(src.getdata(3))
merge_args = (src.mode, (l, l, l, a)) # for RGBA version of grayscale
luts += range(256) # for 1:1 mapping of copied alpha values
return Image.merge(*merge_args).point(luts)
if __name__ == '__main__':
import os
input_image_path = 'image1.png'
print 'tinting "{}"'.format(input_image_path)
root, ext = os.path.splitext(input_image_path)
result_image_path = root+'_result'+ext
print 'creating "{}"'.format(result_image_path)
result = image_tint(input_image_path, '#33b5e5')
if os.path.exists(result_image_path): # delete any previous result file
os.remove(result_image_path)
result.save(result_image_path) # file name's extension determines format
print 'done'
Here's a screenshot showing input images on the left with corresponding outputs on the right. The upper row is for one with an alpha layer and the lower is a similar one that doesn't have one.
You need to convert to grayscale first. What I did:
get original alpha layer using Image.split()
convert to grayscale
colorize using ImageOps.colorize
put back original alpha layer
Resulting code:
import Image
import ImageOps
def tint_image(src, color="#FFFFFF"):
src.load()
r, g, b, alpha = src.split()
gray = ImageOps.grayscale(src)
result = ImageOps.colorize(gray, (0, 0, 0, 0), color)
result.putalpha(alpha)
return result
img = Image.open("image.png")
tinted = tint_image(img, "#33b5e5")
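To keep the transparency when writing the result to disk, save it as PNG (a usage note; tinted.png is a hypothetical file name, and JPEG would discard the alpha channel):
tinted.save("tinted.png")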
I've used OpenCV to load a grayscale image, which I then converted to a numpy.array. Now I want to pad that array with a 'frame' around the image. However, I'm having some trouble dissecting what the numpy manual wants me to do exactly. I tried googling and searching for padding examples, but none came up that were relevant for my case.
My current code looks like this:
import cv2
import numpy as np
img = cv2.imread('Lena.png')
imgArray = np.array(img)
imgArray = np.pad(imgArray, pad_width=1, mode='constant', constant_values=0)
cv2.imshow('Padded', imgArray)
cv2.waitKey(0)
Check out the OpenCV documentation here: https://docs.opencv.org/3.0-beta/doc/py_tutorials/py_core/py_basic_ops/py_basic_ops.html
My best guess is to use cv2.copyMakeBorder: constant = cv2.copyMakeBorder(img, 10, 10, 10, 10, cv2.BORDER_CONSTANT, value=BLUE)
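A runnable sketch of that guess (BLUE here is a hypothetical colour tuple; note that OpenCV uses BGR channel order):
import cv2
img = cv2.imread('Lena.png')
BLUE = (255, 0, 0)  # BGR order, so this is blue
constant = cv2.copyMakeBorder(img, 10, 10, 10, 10, cv2.BORDER_CONSTANT, value=BLUE)
cv2.imshow('Padded', constant)
cv2.waitKey(0)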
You can do as follows:
import numpy as np
import cv2
img = cv2.imread('Lena.png', 0)
img = np.pad(img, pad_width=4, mode='constant', constant_values=0)
cv2.imshow('Padded', img)
cv2.waitKey(0)
From the documentation of cv2.imread:
cv2.imread(filename[, flags]) → retval
Parameters:
filename – Name of the file to be loaded.
flags – Flags specifying the color type of the loaded image:
CV_LOAD_IMAGE_ANYDEPTH – If set, return a 16-bit/32-bit image when the input has the corresponding depth; otherwise convert it to 8-bit.
CV_LOAD_IMAGE_COLOR – If set, always convert the image to a color one.
CV_LOAD_IMAGE_GRAYSCALE – If set, always convert the image to a grayscale one.
>0 – Return a 3-channel color image. Note: in the current implementation the alpha channel, if any, is stripped from the output image. Use a negative value if you need the alpha channel.
=0 – Return a grayscale image.
<0 – Return the loaded image as is (with alpha channel).
With the above code we got the following result:
And another option using np.pad:
As you can see here, you need to tell np.pad which axes to pad. Simply using:
imgArray = np.pad(imgArray, pad_width=1, mode='constant', constant_values=0)
pads every axis, including the third one (the RGB channels), so the result is no longer a valid image and you cannot plot it any more.
As described in the referenced question, you would need to use the following arguments in your code:
imgArray = np.pad(imgArray, pad_width=((1,1), (1,1), (0,0)), mode='constant', constant_values=0)
Also see the np.pad documentation:
Number of values padded to the edges of each axis. ((before_1, after_1), … (before_N, after_N)) unique pad widths for each axis. ((before, after),) yields same before and after pad for each axis. (pad,) or int is a shortcut for before = after = pad width for all axes.
This means the first tuple pads the first axis (for an image, the top and bottom borders) and the second tuple pads the second axis (the left and right borders), each with one "0".
You do not want to pad the last dimension, as this is the dimension storing the RGB information.
And since you stated in your question that you want a white border: constant_values should be set to 255 or 1, depending on the range of your image. Using 0 results in a black border.
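Putting that together for a white border on an 8-bit, 3-channel image (a sketch; for a float image in [0, 1] use constant_values=1 instead):
import cv2
import numpy as np
img = cv2.imread('Lena.png')  # shape (h, w, 3)
imgArray = np.pad(img, pad_width=((1, 1), (1, 1), (0, 0)), mode='constant', constant_values=255)
cv2.imshow('Padded', imgArray)
cv2.waitKey(0)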
Whilst I see you already have an answer, I wanted to show the general case where you want to pad with something other than black or white, i.e. you want to add a coloured border. I couldn't get any of the methods suggested in the other answers to do that, so...
Say you have lena.png as follows:
Then you can do:
from PIL import Image, ImageOps
import numpy as np
# Load the image - you could just as well use OpenCV `imread()`
img = Image.open('lena.png')
# Pad 20px to all sides with magenta
padded = ImageOps.expand(img, border=20, fill=(255,0,255))
# Save to disk
padded.save('result.png')
Before anyone decides to downvote because the OP asked how to add white borders, please note you can just as easily add white with this method if you use:
padded = ImageOps.expand(img, border=20, fill=(255,255,255))
If you are using numpy arrays to manipulate your images, you can convert from numpy array to PIL Image with:
pil_image = Image.fromarray(numpy_array)
and the other way with:
numpy_array = np.array(pil_image)
I have many skeletonized images like this:
How can I detect a cycle, i.e. a loop, in the skeleton?
Are there "special" functions that do this, or should I implement it as a graph?
In case the graph is the only option, can the Python graph library NetworkX help me?
You can exploit the topology of the skeleton: a cycle encloses a hole, and an image without cycles has no holes. So we can use scipy.ndimage to fill any holes and compare the result to the original. This isn't the fastest method, but it's extremely easy to code.
import scipy.misc, scipy.ndimage
# Read the image
img = scipy.misc.imread("Skel.png")
# Retain only the skeleton
img[img!=255] = 0
img = img.astype(bool)
# Fill the holes
img2 = scipy.ndimage.binary_fill_holes(img)
# Compare the two, an image without cycles will have no holes
print "Cycles in image: ", ~(img == img2).all()
# As a test break the cycles
img3 = img.copy()
img3[0:200, 0:200] = 0
img4 = scipy.ndimage.binary_fill_holes(img3)
# Compare the two, an image without cycles will have no holes
print "Cycles in image: ", ~(img3 == img4).all()
I've used your "B" picture as an example. The first two images are the original and the filled version which detects a cycle. In the second version, I've broken the cycle and nothing gets filled, thus the two images are the same.
First, let's build an image of the letter B with PIL:
import Image, ImageDraw, ImageFont
import matplotlib.pyplot as plt
image = Image.new("RGBA", (600,150), (255,255,255))
draw = ImageDraw.Draw(image)
fontsize = 150
font = ImageFont.truetype("/usr/share/fonts/truetype/liberation/LiberationMono-Regular.ttf", fontsize)
txt = 'B'
draw.text((30, 5), txt, (0,0,0), font=font)
img = image.resize((188,45), Image.ANTIALIAS)
print type(img)
plt.imshow(img)
You may find a better way to do that, particularly with the path to the fonts. It would be better to load an image instead of generating it. Anyway, we now have something to work on:
Now, the real part:
import mahotas as mh
import numpy as np
img = np.array(img)
im = img[:,0:50,0]
im = im < 128
skel = mh.thin(im)
noholes = mh.morph.close_holes(skel)
plt.subplot(311)
plt.imshow(im)
plt.subplot(312)
plt.imshow(skel)
plt.subplot(313)
cskel = np.logical_not(skel)
choles = np.logical_not(noholes)
holes = np.logical_and(cskel,noholes)
lab, n = mh.label(holes)
print 'B has %s holes'% str(n)
plt.imshow(lab)
And we have in the console (ipython):
B has 2 holes
Converting your skeleton image to a graph representation is not trivial, and I don't know of any tools to do that for you.
One way to do it on the bitmap would be to use a flood fill, like the paint bucket in Photoshop. If you start a flood fill of the image, the entire background will get filled if there are no cycles. If the fill doesn't reach the entire image, then you've found a cycle. Robustly finding all the cycles could require filling multiple times.
This is likely to be very slow to execute, but probably much faster to code than a technique where you trace the skeleton into a graph data structure.
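A sketch of that idea using connected components in place of an explicit paint-bucket fill (it assumes skel is a boolean skeleton array; scipy.ndimage.label plays the role of the flood fill):
import numpy as np
from scipy import ndimage
def has_cycle(skel):
    # Label the connected components of the background. A flood fill from
    # the border reaches everything if there is a single component; any
    # extra component is a region enclosed by the skeleton, i.e. a cycle.
    labeled, num_components = ndimage.label(~skel)
    return num_components > 1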
I am using putpixel on an image (srcImage) which is w = 134 and h = 454.
The code here gets the r, g, b values of a part of the font, which are 0, 255, 0 (which I found through debugging, using print).
image = letters['H']
r,g,b = image.getpixel((1,1)) #Note r g b values are 0, 255,0
srcImage.putpixel((10,15),(r,g,b))
srcImage.save('lolmini2.jpg')
This code does not throw any error. However, when I check the saved image I cannot spot the pure green pixel.
Instead of using putpixel() and getpixel(), you should use indexing. For getpixel() you can use pixels[1, 1], and for putpixel() you can use pixels[1, 1] = (r, g, b). It should work the same, but it's much faster; pixels here is image.load().
However, I don't see why it wouldn't work; it should work without a problem. Perhaps the JPEG compression is killing you here. Have you tried saving it as a PNG/GIF file instead? Or setting more than one pixel?
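A quick sketch combining both suggestions: pixel access via load() as described above, a small block instead of a single pixel so it is easier to spot, and a lossless PNG instead of JPEG:
pixels = srcImage.load()  # fast pixel-access object
for x in range(10, 14):
    for y in range(15, 19):
        pixels[x, y] = (0, 255, 0)
srcImage.save('lolmini2.png')  # PNG is lossless, unlike JPEG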
I know it is a very old post, but for beginners who want to stick with putpixel() for a while, here's the solution:
initialize the image variable as:
from PIL import Image
img = Image.new('RGB', [200,200], 0x000000)
Make sure to initialize it as 'RGB' if you want to manipulate RGB values.
Sometimes people initialize images as:
img = Image.new('I', [200, 200], 0x000000)
and then try to work with RGB values, which doesn't work.
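A minimal sketch of the difference (mode 'I' stores one 32-bit integer per pixel, so it cannot take an (r, g, b) tuple):
from PIL import Image
rgb_img = Image.new('RGB', (200, 200), 0x000000)
rgb_img.putpixel((10, 15), (0, 255, 0))  # works: three 8-bit channels
int_img = Image.new('I', (200, 200), 0x000000)
int_img.putpixel((10, 15), 65280)        # mode 'I' accepts only a single integer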