Using the Python Imaging Library PIL how can someone detect if an image has all it's pixels black or white?
~Update~
Condition: Not iterate through each pixel!
if not img.getbbox():
... will test to see whether an image is completely black. (Image.getbbox() returns the falsy None if there are no non-black pixels in the image, otherwise it returns a tuple of points, which is truthy.) To test whether an image is completely white, invert it first:
if not ImageChops.invert(img).getbbox():
You can also use img.getextrema(). This will tell you the highest and lowest values within the image. To work with this most easily you should probably convert the image to grayscale mode first (otherwise the extrema might be an RGB or RGBA tuple, or a single grayscale value, or an index, and you have to deal with all those).
extrema = img.convert("L").getextrema()
if extrema == (0, 0):
# all black
elif extrema == (1, 1):
# all white
The latter method will likely be faster, but not so you'd notice in most applications (both will be quite fast).
A one-line version of the above technique that tests for either black or white:
if sum(img.convert("L").getextrema()) in (0, 2):
# either all black or all white
Expanding on Kindall:
if you look at an image called img with:
extrema = img.convert("L").getextrema()
It gives you a range of the values in the images. So an all black image would be (0,0) and an all white image is (255,255). So you can look at:
if extrema[0] == extrema[1]:
return("This image is one solid color, so I won't use it")
else:
# do something with the image img
pass
Useful to me when I was creating a thumbnail from some data and wanted to make sure it was reading correctly.
from PIL import Image
img = Image.open("test.png")
clrs = img.getcolors()
clrs contains [("num of occurences","color"),...]
By checking for len(clrs) == 1 you can verify if the image contains only one color and by looking at the second element of the first tuple in clrs you can infer the color.
In case the image contains multiple colors, then by taking the number of occurences into account you can also handle almost-completly-single-colored images if 99% of the pixles share the same color.
I tried the Kindall solution ImageChops.invert(img).getbbox() without success, my test images failed.
I had noticed a problem, white should be 255 BUT I have found white images where numeric extrema are (0,0).. why? See the update below.
I have changed Kindall second solution (getextrema), that works right, in a way that doesn't need image conversion, I wrote a function and verified that works with Grayscale and RGB images both:
def is_monochromatic_image(img):
extr = img.getextrema()
a = 0
for i in extr:
if isinstance(i, tuple):
a += abs(i[0] - i[1])
else:
a = abs(extr[0] - extr[1])
break
return a == 0
The img argument is a PIL Image object.
You can also check, with small modifications, if images are black or white.. but you have to decide if "white" is 0 or 255, perhaps you have the definitive answer, I have not. :-)
Hope useful
UPDATE: I suppose that white images with zeros inside.. may be PNG or other image format with transparency.
Related
I am trying to remove a transparent watermark from an image.
Here is my sample image:
I would like to remove the text "Watermark" from the image. As you can see, the text is transparent. So I would like to replace that text to the original background.
Something like this would be my desired output:
I tried some examples (I am currently using cv2, if other libraries can solve the problem please also recommend), but none of them where near from succeeding. I know the way to go would be to have a mask (like in this post), but they all already have masked images, but I don't.
Here is what I tried to do to have a mask, I turned down the saturation to black and white, and created an image "imagemask.jpg", then tried going through the pixels with a for loop:
mask = cv2.imread('imagemask.jpg')
new = []
rows, cols, _ = mask.shape
for i in range(rows):
new.append([])
#print(i)
for j in range(cols):
k = img[i, j]
#print(k)
if all(x in range(110, 130) for x in k):
new[-1].append((255, 255, 255))
else:
new[-1].append((0, 0, 0))
cv2.imwrite('finalmask.jpg', np.array(new))
Then after that wanted to use the code for the mask, but I realized the "finalmask.jpg" is a complete mess... so I didn't try using the code for the mask.
Is this actually possible? I have been trying for around 3 hours but receiving no luck...
This is not trivial, my friend. To add insult to injury, your image is very low-res, compressed and has a nasty glare - that won't help processing at all. Please, look at your input and set your expectations accordingly. With that said, let's try to get the best result with what we have. These are the steps I propose:
Try to segment the watermark text from the image
Filter the segmentation mask and try to get a binary mask as clean as possible
Use the text mask to in-paint the offending area using the input image as reference
Now, the tricky part, as you already saw, is segmenting the text. After trying out some techniques and color spaces, I found that the CMYK color space - particularly the K channel - offers promising results. The text is reasonably clear and we can try an Adaptive Thresholding on this, let's take a look:
# Imports
import cv2
import numpy as np
# Read image
imagePath = "D://opencvImages//"
img = cv2.imread(imagePath+"0f5zZm.jpg")
# Store a deep copy for the inpaint operation:
originalImg = img.copy()
# Convert to float and divide by 255:
imgFloat = img.astype(np.float) / 255.
# Calculate channel K:
kChannel = 1 - np.max(imgFloat, axis=2)
OpenCV does not offer BGR to CMYK conversion directly, so I manually had to get the K channel using the conversion formula. It is very straightforward. The K (or Key) channel represents pixels of the lowest intensity (black) with color white. Meaning that the text, which is almost white, will be rendered in black... This is the K Channel of the input:
You see how the darker pixels on the input are almost white here? That's nice, it seems to get a clear separation between the text and everything else. It's a shame that we have some big nasty glare on the right side. Anyway, the conversion involves float operations, so gotta be careful with data types. Maybe we can improve this image with a little brightness/contrast adjustment. Just a little bit, I'm just trying to separate more the text from that nasty glare:
# Apply a contrast/brightness adjustment on Channel K:
alpha = 0
beta = 1.2
adjustedK = cv2.normalize(kChannel, None, alpha, beta, cv2.NORM_MINMAX, cv2.CV_32F)
# Convert back to uint 8:
adjustedK = (255*adjustedK).astype(np.uint8)
This is the adjusted image:
There's a little bit more separation between the text and the glare, it seems. Alright, let's apply an Adaptive Thresholding on this bad boy to get an initial segmentation mask:
# Adaptive Thresholding on adjusted Channel K:
windowSize = 21
windowConstant = 11
binaryImg = cv2.adaptiveThreshold(adjustedK, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY_INV, windowSize, windowConstant)
You see I'm using a not-so-big windowSize here for the thresholding? Feel free to tune out these parameters if you like. This is the binary image I get:
Yeah, there's a lot of noise. Here's what I propose to get a cleaner mask: There's some obvious blobs that are bigger than the text. Likewise, there are other blobs that are smaller than the text. Let's locate the big blobs and the small blobs and subtract them. The resulting image should contain the text, if we set our parameters correctly. Let's see:
# Get the biggest blobs on the image:
minArea = 180
bigBlobs = areaFilter(minArea, binaryImg)
# Filter the smallest blobs on the image:
minArea = 20
smallBlobs = areaFilter(minArea, binaryImg)
# Let's try to isolate the text:
textMask = smallBlobs - bigBlobs
cv2.imshow("Text Mask", textMask)
cv2.waitKey(0)
Here I'm using a helper function called areaFilter. This function returns all the blobs of an image that are above a minimum area threshold. I'll post the function at the end of the answer. In the meantime, check out these cool images:
Big blobs:
Filtered small blobs:
The difference between them:
Sadly, it seems that some portions of the characters didn't survive the filtering operations. That's because the intersection of the glare and the text is too much for the algorithm to get a clear separation. Something that could benefit the result of the in-painting is a subtle blur on this mask, to get rid of that compression alias. Let's apply some Gaussian Blur to smooth the mask a little bit:
# Blur the mask a little bit to get a
# smoother inpanting result:
kernelSize = (3, 3)
textMask = cv2.GaussianBlur(textMask, kernelSize, cv2.BORDER_DEFAULT)
The kernel is not that big, I just want a subtle effect. This is the result:
Finally, let's apply the in-painting:
# Apply the inpaint method:
inpaintRadius = 10
inpaintMethod = cv2.INPAINT_TELEA
result = cv2.inpaint(originalImg, textMask, inpaintRadius, inpaintMethod)
cv2.imshow("Inpaint Result", result)
cv2.waitKey(0)
This is the final result:
Well, is not that bad, considering the input image. You can try to further improve the result adjusting some values, but the reality of this life, my dude, is that the input image is not that great to begin with. Here's the areaFilter function:
def areaFilter(minArea, inputImage):
# Perform an area filter on the binary blobs:
componentsNumber, labeledImage, componentStats, componentCentroids = \
cv2.connectedComponentsWithStats(inputImage, connectivity=4)
# Get the indices/labels of the remaining components based on the area stat
# (skip the background component at index 0)
remainingComponentLabels = [i for i in range(1, componentsNumber) if componentStats[i][4] >= minArea]
# Filter the labeled pixels based on the remaining labels,
# assign pixel intensity to 255 (uint8) for the remaining pixels
filteredImage = np.where(np.isin(labeledImage, remainingComponentLabels) == True, 255, 0).astype('uint8')
return filteredImage
I need to find the largest empty area in the document and display its coordinates, center point and area, using python to put a QR Code there.
I think OpenCV and Numpy should be enough for this task.
What kinda THRESH to use? Because there are a lot of types of scans:
gray, BW, with color, and how to find the contour properly?
How this can be implemented in the fastest way? An example using the
first scan from google is attached, where you can see that the code
should find the largest empty square area.
#Mark Setchell Thanks! This code works perfectly for all docs with a white background, but when I use smth with a color in the background it finds a completely different area. Also, to keep thin lines in the docs I used Erode after thresholding. Tried to change thresholding and erode parameters, still not working properly.
Edited post, added color pictures.
Here's a possible approach:
#!/usr/bin/env python3
import cv2
import numpy as np
def largestSquare(im):
# Make image square of 100x100 to simplify and speed up
s = 100
work = cv2.resize(im, (s,s), interpolation=cv2.INTER_NEAREST)
# Make output accumulator - uint16 is ok because...
# ... max value is 100x100, i.e. 10,000 which is less than 65,535
# ... and you can make a PNG of it too
p = np.zeros((s,s), np.uint16)
# Find largest square
for i in range(1, s):
for j in range(1, s):
if (work[i][j] > 0 ):
p[i][j] = min(p[i][j-1], p[i-1][j], p[i-1][j-1]) + 1
else:
p[i][j] = 0
# Save result - just for illustration purposes
cv2.imwrite("result.png",p)
# Work out what the actual answer is
ind = np.unravel_index(np.argmax(p, axis=None), p.shape)
print(f'Location: {ind}')
print(f'Length of side: {p[ind]}')
# Load image and threshold
im = cv2.imread('page.png', cv2.IMREAD_GRAYSCALE)
_, thr = cv2.threshold(im,127,255,cv2.THRESH_BINARY | cv2.THRESH_OTSU)
# Get largest white square
largestSquare(thr)
Output
Location: (21, 77)
Length of side: 18
Notes:
I edited out your red annotation so it didn't interfere with my algorithm.
I did Otsu thresholding to get pure black and white - that may or may not be appropriate to your use case. It will depend on your scans and paper background etc.
I scaled the image down to 100x100 so it doesn't take all day to run. You will need to scale the results back up to the size of your original image but I assume you can do that easily enough.
Keywords: Image processing, image, Python, OpenCV, largest white square, largest empty space.
I can successfully convert a rectangular image into a png with transparent rounded corners like this:
However, when I take this transparent cornered image and I want to use it in another image generated with Pillow, I end up with this:
The transparent corners become black. I've been playing around with this for a while but I can't find any way in which the transparent parts of an image don't turn black once I place them on another image with Pillow.
Here is the code I use:
mask = Image.open('Test mask.png').convert('L')
im = Image.open('boat.jpg')
im.resize(mask.size)
output = ImageOps.fit(im, mask.size, centering=(0.5, 0.5))
output.putalpha(mask)
output.save('output.png')
im = Image.open('output.png')
image_bg = Image.new('RGBA', (1292,440), (255,255,255,100))
image_fg = im.resize((710, 400), Image.ANTIALIAS)
image_bg.paste(image_fg, (20, 20))
image_bg.save('output2.jpg')
Is there a solution for this? Thanks.
Per some suggestions I exported the 2nd image as a PNG, but then I ended up with an image with holes in it:
Obviously I want the second image to have a consistent white background without holes.
Here is what I actually want to end up with. The orange is only placed there to highlight the image itself. It's a rectangular image with white background, with a picture placed into it with rounded corners.
If you paste an image with transparent pixels onto another image, the transparent pixels are just copied as well. It looks like you only want to paste the non-transparent pixels. In that case, you need a mask for the paste function.
image_bg.paste(image_fg, (20, 20), mask=image_fg)
Note the third argument here. From the documentation:
If a mask is given, this method updates only the regions indicated by
the mask. You can use either "1", "L" or "RGBA" images (in the latter
case, the alpha band is used as mask). Where the mask is 255, the
given image is copied as is. Where the mask is 0, the current value
is preserved. Intermediate values will mix the two images together,
including their alpha channels if they have them.
What we did here is provide an RGBA image as mask, and use the alpha channel as mask.
When rotating an image using
import skimage
result = skimage.transform.rotate(img, angle=some_angle, resize=True)
# the result is the rotated image with black 'margins' that fill the blanks
The algorithm rotates the image but leaves the newly formed background black and there is no way - using the rotate function - to choose the color of the newly formed background.
Do you have any idea how to do choose the color of the background before rotating an image?
Thank you.
Just use cval parameter
img3 = transform.rotate(img20, 45, resize=True, cval=1, mode ='constant')
img4 = img3 * 255
It's a little annoying that the cval parameter to skimage.transform.warp() can only be a scalar float.
You can set it to any value outside of the range of your data (like -1), and use np.where() to fix it afterwards. Here's an example that sets the background of a 3-channel image to red:
output = tf.warp(image, tform, mode='constant', cval=-1,
preserve_range=True, output_shape=(256, 256), order=3)
w = np.where(output == -1)
output[w[0], w[1], :] = [255, 0, 0]
tbh I don't know why everyone set the cval value to a strange value.
I think it's intuitive when you look at the documentation of the function
cval : float, optional
Used in conjunction with mode ‘constant’, the value outside the image boundaries.
Therefore depending on what your input image is, in my case, my picture already used cv2.cvtColor(img, cv2.COLOR_BGR2GRAY), therefore in order to set it to white, I can simply set cval=255.
If it's ndarray(img more than 1 channel), you can either rotate each channel separately, or set it to strange value then use np.where to replace afterward. (idk why but I just don't like this method lol)
if your pic is a gray picture,use cval ;
if pic is rgb ,no good way,blow code works:
path3=r'C:\\Users\forfa\Downloads\1111.jpg'
img20=data.imread(path3)
import numpy as np
img3=transform.rotate(img20,45,resize=True,cval=0.1020544) #0.1020544 can be any unique float,the pixel outside your pic is [0.1020544]*3
c,r,k=img3.shape
for i in range(c):
for j in range(r):
if np.allclose(img3[i][j],[0.1020544]*3):
img3[i][j]=[1,0,0] #[1,0,0] is red pixel,replace pixel [0.1020544]*3 with red pixel [1,0,0]
I don't know how do it, but maybe this help you
http://scikit-image.org/docs/dev/api/skimage.color.html
Good luck! ;)
Basically, I have two images. One is comprised of white and black pixels, the black pixels making up a word, and the other image that I'm trying to paste the black pixels on top of. I've pasted the code below, however I'm aware that there's an issue with the "if pixels [x,y] == (0, 0, 0):' being a tuple and not an indice, however I'm uncertain of how to get it to look for black pixels with other means.
So essentially I need to find, and remember the positions of, the black pixels so that I can paste them onto the first image. Any help is very much appreciated!
image_one = Image.open (image_one)
image_two = Image.open (image_two)
pixels = list(image_two.getdata())
for y in xrange(image_two.size[1]):
for x in xrange(image_two.size[0]):
if pixels[x,y] == (0, 0, 0):
pixels = black_pixels
black_pixels.append()
image = Image.open (image_one);
image_one.putdata(pixels)
image.save(image_one+ "_X.bmp")
del image_one, image_two;
You're almost there. I am not too familiar with the PIL class, but instead of calling the getdata method, let's use getpixel directly on the image object, and directly set the results into the output image. That eliminates the need to store the set of pixels to overwrite. However, there may be cases beyond what you've listed here where such an approach would be necessary. I created a random image and then set various pixels to black. For this test I used a different condition - if the R channel of the image is greater than 50. You can comment that out and use the other test, which tests for tuple (R,G,B) == (0,0,0) which will work fine.
imagea = PIL.Image.open('temp.png')
imageb = PIL.Image.open('temp.png')
for y in xrange(imagea.size[1]):
for x in xrange(imagea.size[0]):
currentPixel = imagea.getpixel((x,y))
if currentPixel[0] > 50:
#if currentPixel ==(0,0,0):
#this is a black pixel, you can directly modify image 2 now
imageb.putpixel((x,y),(0,0,0))
imageb.save('outputfile.png')
An alternative way to do this is just to multiply the two images together. Any pixel that's black in the binary image will be black in the result (multiply by zero) and any pixel that's white in the binary image will be unchanged from the other image in the result (multiply by one).
PIL can do this,
from PIL import Image, ImageChops
image_one = Image.open("image_one.bmp")
image_two = Image.open("image_two.bmp")
out = ImageChops.multiply(image_one, image_two)
out.save("output.bmp")