I need to find parts of an image that are of certain brightness (say between 120 and 135) and then set their brightness to 255 and the rest of the image to 0.
I have an image that I converted to grayscale and then blurred so as to reduce noise, and so far I only found this function:
threshold = cv2.threshold(imageGrayscaleBlurred, 120, 255, cv2.THRESH_BINARY)[1]
As far as I know, this function can only take a certain threshold and in my case it will set all parts of the image that are brighter than 120 to 255, and make the rest of the image black. This doesn't fit my goal because the output image looks pretty messy.
Is there a "fancier" way to achieve my goal without using the cv2.threshold() function twice with a different threshold?
Create a 1-channel mask of the same shape as imageGrayscaleBlurred, filled with zeros (black):
mask = np.zeros(imageGrayscaleBlurred.shape[:2], np.uint8)
Then apply the condition that matches your requirement:
mask[(imageGrayscaleBlurred >= 120) & (imageGrayscaleBlurred <= 135)] = 255
I am trying to do text extraction from some images; however, these come with a bit of background. I have tried to "play" with contrast and brightness, as well as applying thresholding techniques like Otsu.
Do you have any suggestions on how to improve the extraction? I include below some parts of the processing, as well as the input and output; any recommendation is welcome.
Input:
Output:
Processing:
enhancer = ImageEnhance.Brightness(img)
img = enhancer.enhance(1.62) # 1.8
enhancer2 = ImageEnhance.Contrast(img)
img = enhancer2.enhance(1.8) # 2
img = np.array(img)
thresh = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
You should perform adaptive thresholding. The algorithm divides the image into blocks of a pre-defined size, and every block is given a different threshold value based on the pixel intensities within that block. In the following example, each block's threshold is obtained from a Gaussian-weighted sum of the pixel values within the block (pixels nearer the block center are weighted more heavily, following the Gaussian curve). Binarization is then carried out against this per-block value. Check the OpenCV thresholding documentation for more details.
For the given image, I tried the following:
im = cv2.imread('text_block.jpg')
green_channel = im[:,:,1]
th = cv2.adaptiveThreshold(green_channel, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 27, 6)
You will have to tweak the parameters (block size and the constant C) to get a better result. Also try cv2.ADAPTIVE_THRESH_MEAN_C.
I'd like to believe I'm close to being able to count cells, but I also know I'm missing something.
I have this image
image = cv2.imread('/cells.png', cv2.IMREAD_GRAYSCALE)
image = cv2.resize(image, (1000, 600))

th, threshedImg = cv2.threshold(image, 30, 255, cv2.THRESH_TOZERO_INV)
img_blur = cv2.GaussianBlur(threshedImg, (3, 3), 0)

sobelxy = cv2.Sobel(src=img_blur, ddepth=cv2.CV_64F, dx=1, dy=1, ksize=5)
edges = cv2.Canny(image=img_blur, threshold1=20, threshold2=65)

cv2.imshow('thresh', threshedImg)
cv2.imshow('Canny Edge Detection', edges)
cv2.waitKey(0)
Edge output:
Threshold output:
The threshold image captures the cells pretty well from what I can tell, and the Canny does a decent job of getting the edges. I've tried making use of contours, but I was unable to produce any good results.
Any help or ideas on how to improve would be appreciated. Thanks!
I count 286 blobs. That depends on some tweakables because some blobs that could be counted separately are really close together.
Approach:
Your input has some grayish background in the bottom right. To compensate, I estimate the background using a large median blur, kernel size ~100, and subtract that (saturating math).
Next, I blur the entire thing to suppress noise sufficiently that each blob is smooth (just one local maximum per blob, no camel humps or worse).
Then I use image == cv2.dilate(image, None, iterations=15) to calculate a mask of local maxima.
Then I combine that with a mask of peaks which is simply image > threshold.
I & (and) both masks together.
Then a morphological close/dilate operation merges peaks that split due to numerics (anything the smoothing hasn't smoothed enough that the dilate-and-equality step sees just one peak).
Then I use connectedComponentsWithStats to find all those blobs and their centroids.
My opinion on other approaches:
Canny makes absolutely no sense here at all. It will leave you in a worse place.
Color Space transformation... pointless because your input is basically monochrome and I treat it as such.
I am trying to remove a transparent watermark from an image.
Here is my sample image:
I would like to remove the text "Watermark" from the image. As you can see, the text is transparent. So I would like to replace that text to the original background.
Something like this would be my desired output:
I tried some examples (I am currently using cv2; if other libraries can solve the problem, please also recommend them), but none of them came close to succeeding. I know the way to go would be to have a mask (like in this post), but those examples already have mask images, which I don't.
Here is what I tried in order to get a mask: I desaturated the image to black and white, saved it as "imagemask.jpg", then tried going through the pixels with a for loop:
mask = cv2.imread('imagemask.jpg')
new = []
rows, cols, _ = mask.shape
for i in range(rows):
    new.append([])
    for j in range(cols):
        k = img[i, j]
        if all(x in range(110, 130) for x in k):
            new[-1].append((255, 255, 255))
        else:
            new[-1].append((0, 0, 0))
cv2.imwrite('finalmask.jpg', np.array(new))
After that I wanted to use the code for the mask, but I realized "finalmask.jpg" is a complete mess... so I didn't try using it.
Is this actually possible? I have been trying for around 3 hours with no luck...
This is not trivial, my friend. To add insult to injury, your image is very low-res, compressed and has a nasty glare - that won't help processing at all. Please, look at your input and set your expectations accordingly. With that said, let's try to get the best result with what we have. These are the steps I propose:
Try to segment the watermark text from the image
Filter the segmentation mask and try to get a binary mask as clean as possible
Use the text mask to in-paint the offending area using the input image as reference
Now, the tricky part, as you already saw, is segmenting the text. After trying out some techniques and color spaces, I found that the CMYK color space - particularly the K channel - offers promising results. The text is reasonably clear and we can try an Adaptive Thresholding on this, let's take a look:
# Imports
import cv2
import numpy as np
# Read image
imagePath = "D://opencvImages//"
img = cv2.imread(imagePath+"0f5zZm.jpg")
# Store a deep copy for the inpaint operation:
originalImg = img.copy()
# Convert to float and divide by 255:
imgFloat = img.astype(np.float64) / 255.
# Calculate channel K:
kChannel = 1 - np.max(imgFloat, axis=2)
OpenCV does not offer BGR to CMYK conversion directly, so I had to compute the K channel manually using the conversion formula. It is very straightforward. The K (or Key) channel represents the pixels of lowest intensity (black) as white. That means the text, which is almost white, will be rendered in black... This is the K channel of the input:
You see how the darker pixels on the input are almost white here? That's nice, it seems to get a clear separation between the text and everything else. It's a shame that we have some big nasty glare on the right side. Anyway, the conversion involves float operations, so gotta be careful with data types. Maybe we can improve this image with a little brightness/contrast adjustment. Just a little bit, I'm just trying to separate more the text from that nasty glare:
# Apply a contrast/brightness adjustment on Channel K:
alpha = 0
beta = 1.2
adjustedK = cv2.normalize(kChannel, None, alpha, beta, cv2.NORM_MINMAX, cv2.CV_32F)
# Convert back to uint8 (clip first, since beta > 1 can push values past 255):
adjustedK = np.clip(255 * adjustedK, 0, 255).astype(np.uint8)
This is the adjusted image:
There's a little bit more separation between the text and the glare, it seems. Alright, let's apply an Adaptive Thresholding on this bad boy to get an initial segmentation mask:
# Adaptive Thresholding on adjusted Channel K:
windowSize = 21
windowConstant = 11
binaryImg = cv2.adaptiveThreshold(adjustedK, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY_INV, windowSize, windowConstant)
You see I'm using a not-so-big windowSize here for the thresholding? Feel free to tune out these parameters if you like. This is the binary image I get:
Yeah, there's a lot of noise. Here's what I propose to get a cleaner mask: There's some obvious blobs that are bigger than the text. Likewise, there are other blobs that are smaller than the text. Let's locate the big blobs and the small blobs and subtract them. The resulting image should contain the text, if we set our parameters correctly. Let's see:
# Get the biggest blobs on the image:
minArea = 180
bigBlobs = areaFilter(minArea, binaryImg)
# Filter the smallest blobs on the image:
minArea = 20
smallBlobs = areaFilter(minArea, binaryImg)
# Let's try to isolate the text:
textMask = smallBlobs - bigBlobs
cv2.imshow("Text Mask", textMask)
cv2.waitKey(0)
Here I'm using a helper function called areaFilter. This function returns all the blobs of an image that are above a minimum area threshold. I'll post the function at the end of the answer. In the meantime, check out these cool images:
Big blobs:
Filtered small blobs:
The difference between them:
Sadly, it seems that some portions of the characters didn't survive the filtering operations. That's because the intersection of the glare and the text is too much for the algorithm to get a clear separation. Something that could benefit the result of the in-painting is a subtle blur on this mask, to get rid of that compression alias. Let's apply some Gaussian Blur to smooth the mask a little bit:
# Blur the mask a little bit to get a
# smoother inpainting result:
kernelSize = (3, 3)
textMask = cv2.GaussianBlur(textMask, kernelSize, cv2.BORDER_DEFAULT)
The kernel is not that big, I just want a subtle effect. This is the result:
Finally, let's apply the in-painting:
# Apply the inpaint method:
inpaintRadius = 10
inpaintMethod = cv2.INPAINT_TELEA
result = cv2.inpaint(originalImg, textMask, inpaintRadius, inpaintMethod)
cv2.imshow("Inpaint Result", result)
cv2.waitKey(0)
This is the final result:
Well, it's not that bad, considering the input image. You can try to further improve the result by adjusting some values, but the reality of this life, my dude, is that the input image is not that great to begin with. Here's the areaFilter function:
def areaFilter(minArea, inputImage):
    # Perform an area filter on the binary blobs:
    componentsNumber, labeledImage, componentStats, componentCentroids = \
        cv2.connectedComponentsWithStats(inputImage, connectivity=4)

    # Get the indices/labels of the remaining components based on the area stat
    # (skip the background component at index 0):
    remainingComponentLabels = [i for i in range(1, componentsNumber) if componentStats[i][4] >= minArea]

    # Filter the labeled pixels based on the remaining labels,
    # assign pixel intensity 255 (uint8) to the remaining pixels:
    filteredImage = np.where(np.isin(labeledImage, remainingComponentLabels), 255, 0).astype('uint8')

    return filteredImage
I want to use pytesseract to read digits from images. The images look as follows:
The digits are dotted and in order to be able to use pytesseract, I need black connected digits on a white background. To do so, I thought about using erode and dilate as preprocessing techniques. As you can see, the images are similar, yet quite different in certain aspects. For example, the dots in the first image are darker than the background, while the dots in the second are whiter. That means, in the first image I can use erode to get black connected lines and in the second image I can use dilate to get white connected lines and then inverse the colors. This leads to the following results:
Using an appropriate threshold, the first image can easily be read with pytesseract. The second image, however, is more tricky. The problem is that, for example, parts of the "4" are darker than the background around the "3". So a simple threshold is not going to work. I need something like a local threshold or local contrast enhancement. Does anybody have an idea here?
Edit:
OTSU, mean threshold and gaussian threshold lead to the following results:
Your images are pretty low res, but you can try a method called gain division. The idea is that you build a model of the background and then divide each input pixel by that model; the output should be relatively constant across most of the image.
After gain division is performed, you can try to improve the image by applying an area filter and morphology. I only tried your first image, because it is the "least worst".
These are the steps to get the gain-divided image:
Apply a soft median blur filter to get rid of high frequency noise.
Get the model of the background via local maximum. Apply a very strong close operation, with a big structuring element (I’m using a rectangular kernel of size 15).
Perform gain division: divide each input pixel by its corresponding background (local maximum) pixel, then scale the result back to the [0, 255] range.
You should get a nice image where the background illumination is pretty much normalized, threshold this image to get a binary mask of the characters.
Now, you can improve the quality of the image with the following, additional steps:
Threshold via Otsu, but add a little bit of bias. (This, unfortunately, is a manual step depending on the input).
Apply an area filter to filter out the smaller blobs of noise.
Let's see the code:
import numpy as np
import cv2
# image path
path = "C:/opencvImages/"
fileName = "iA904.png"
# Reading an image in default mode:
inputImage = cv2.imread(path+fileName)
# Remove small noise via median:
filterSize = 5
imageMedian = cv2.medianBlur(inputImage, filterSize)
# Get local maximum:
kernelSize = 15
maxKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernelSize, kernelSize))
localMax = cv2.morphologyEx(imageMedian, cv2.MORPH_CLOSE, maxKernel, None, None, 1, cv2.BORDER_REFLECT101)
# Perform gain division
gainDivision = np.where(localMax == 0, 0, (inputImage/localMax))
# Clip the values to [0,255]
gainDivision = np.clip((255 * gainDivision), 0, 255)
# Convert the mat type from float to uint8:
gainDivision = gainDivision.astype("uint8")
# Convert RGB to grayscale:
grayscaleImage = cv2.cvtColor(gainDivision, cv2.COLOR_BGR2GRAY)
This is what gain division gets you:
Note that the lighting is more balanced. Now, let's apply a little bit of contrast enhancement:
# Contrast Enhancement:
grayscaleImage = np.uint8(cv2.normalize(grayscaleImage, grayscaleImage, 0, 255, cv2.NORM_MINMAX))
You get this, which creates a little bit more contrast between the foreground and the background:
Now, let's try to threshold this image to get a nice, binary mask. As I suggested, try Otsu's thresholding but add (or subtract) a little bit of bias to the result. This step, as mentioned, is dependent on the quality of your input:
# Threshold via Otsu + bias adjustment:
threshValue, binaryImage = cv2.threshold(grayscaleImage, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)
threshValue = 0.9 * threshValue
_, binaryImage = cv2.threshold(grayscaleImage, threshValue, 255, cv2.THRESH_BINARY)
You end up with this binary mask:
Invert this and filter out the small blobs. I set an area threshold value of 10 pixels:
# Invert image:
binaryImage = 255 - binaryImage
# Perform an area filter on the binary blobs:
componentsNumber, labeledImage, componentStats, componentCentroids = \
cv2.connectedComponentsWithStats(binaryImage, connectivity=4)
# Set the minimum pixels for the area filter:
minArea = 10
# Get the indices/labels of the remaining components based on the area stat
# (skip the background component at index 0)
remainingComponentLabels = [i for i in range(1, componentsNumber) if componentStats[i][4] >= minArea]
# Filter the labeled pixels based on the remaining labels,
# assign pixel intensity to 255 (uint8) for the remaining pixels
filteredImage = np.where(np.isin(labeledImage, remainingComponentLabels) == True, 255, 0).astype("uint8")
And this is the final binary mask:
If you plan on sending this image to an OCR, you might want to apply some morphology first. Maybe a closing to try and join the dots that make up the characters. Also be sure to train your OCR classifier with a font that is close to what you are actually trying to recognize. This is the (inverted) mask after a size 3 rectangular closing operation with 3 iterations:
Edit:
To get the last image, process the filtered output as follows:
# Set kernel (structuring element) size:
kernelSize = 3
# Set operation iterations:
opIterations = 3
# Get the structuring element:
maxKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernelSize, kernelSize))
# Perform closing:
closingImage = cv2.morphologyEx(filteredImage, cv2.MORPH_CLOSE, maxKernel, None, None, opIterations, cv2.BORDER_REFLECT101)
# Invert image to obtain black numbers on white background:
closingImage = 255 - closingImage
I am using drawContours to make a mask for extracting ROI.
I have already defined four points and a zero mask for drawing the contour.
The output mask of the drawContours function is sort of a trapezoid shape which is what I want.
However, when I use this mask to do bitwise_and with the image, the result isn't really the same shape as the mask: the edge of the shape is obviously jagged.
Here is my python code snippet:
hull2 = cv2.convexHull(crop1)
mask10 = np.zeros(image.shape[:2], dtype="uint8")
print(len(hull2))

cv2.drawContours(mask10, [hull2], -1, 255, -1, cv2.LINE_AA)
cv2.imshow("mask10", mask10)
cv2.waitKey(0)

crop = cv2.bitwise_and(image, image, mask=mask10)
cv2.imshow("crop", crop)
cv2.waitKey(0)

cv2.drawContours(image, [hull2], -1, (0, 255, 0), -1, cv2.LINE_AA)
cv2.imshow("mask+img", image)
cv2.waitKey(0)
And here is a picture showing the result: "crop" is the ROI result image
Thanks for anyone trying to help.
The reason you are getting jagged edges while your mask looks like it has smooth edges is that you are using the anti-aliasing flag in drawContours (cv2.LINE_AA), which fills in the surroundings of the jagged edges with darker pixels, creating a gradient that fools your eye into thinking it's a smooth edge.
Why does this matter? When you use bitwise_and with a mask, any value in the mask greater than 0 is evaluated as "True" and the corresponding pixel in the image will be selected.
So those extra AA pixels despite being a smaller gray value than 255, are expanding the edge of the mask, creating your jagged edge in crop. To emulate this, do mask10[mask10 > 0] = 255; cv2.imshow('mask10', mask10) and it should have the same shape as crop.
Now as a possible solution to your problem, you could use alpha blending to use the gradient (darkened intensity) of those extra AA pixels to darken the crop image edge pixels.
mask_float = cv2.cvtColor(mask10, cv2.COLOR_GRAY2BGR).astype('float32') / 255
image_float = image.astype('float32')
crop = cv2.multiply(image_float, mask_float).astype('uint8')
First we convert mask10 to a 3 channel array so that we can apply the alpha blending to all 3 BGR channels of the image.
Then we normalize the mask to the [0, 1] range, since we will multiply values in the next step and dtype uint8 doesn't allow values greater than 255. So we first convert to float32 and then divide by 255. (We could potentially use cv2.normalize(), but numpy should be a lot faster.)
we then convert the image to float32 to allow for multiplication with the mask.
Then we multiply the image by the mask to get an alpha-blended image of the foreground on a black background, and convert it back to uint8 for OpenCV.
Now, since the BGR values are converted from float32 back to uint8, the decimal parts are discarded, which causes a negligible change in color. Also, I'm not 100% sure, but there might be a small hue shift from multiplying each channel individually by the same factor (e.g. 20%), or it could be fine and I'm just overthinking it. Either way, that only applies to those darkened AA pixels, so the effect should be negligible, and we are already modifying them from the original anyway.
As an alternative, you could convert the image to HLS and multiply the mask into the L-channel only. I believe that should stay truer to the image's colors on those edges, if that matters to you and the slower speed is acceptable.