How can I blue-box an image? - python

I have a scanned image which is basically black print on some weird (non-gray) background, say, green or yellow (think old paper).
How can I get rid of the green/yellow and obtain a gray picture with as much of the gray structure of the original image intact as possible? That is, I want to keep the gray around the letters (the anti-aliasing effect) and genuinely gray areas, but turn anything that is even remotely green/yellow into pure white.
Note that the background is by no means homogeneous, so the algorithm should be able to accept a color and an error margin, or a color range.
For bonus points: How can I automatically determine the background color?
I'd like to use Python with the Imaging Library or maybe ImageMagick.
Note: I'm aware of packages like unpaper. My problem with unpaper is that it produces B&W images which probably look fine to OCR software but not to the human eye.

I am more of a C++ than a Python programmer, so I can't give you a code sample. But the general algorithm is something like this:
Finding the background color:
You make a histogram of the image. The histogram should have two peaks representing the background and foreground colors. Because you know that the background has higher intensity you choose the peak with higher intensity and that is the background color.
Now you have the RGB background (R_bg, G_bg, B_bg)
Setting the background to white:
Loop over all pixels and calculate each pixel's distance to the background color:
distance = sqrt((R_bg - R_pixel) ^ 2 + (G_bg - G_pixel) ^ 2 + (B_bg - B_pixel) ^ 2)
If the distance is less than a threshold you set the pixel to white. You can experiment with different thresholds until you get a good result.
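For reference, a minimal Python/PIL sketch of that idea might look like the following. It approximates the background peak by taking the most frequent value of each channel histogram (reasonable for a scan where most pixels are background); the filenames and the threshold of 60 are placeholders to experiment with.
from PIL import Image
import math

im = Image.open('scan.png').convert('RGB')   # 'scan.png' is a placeholder filename
hist = im.histogram()                        # 768 values: 256 bins each for R, G, B

# Most frequent value of each channel = background colour
channels = [hist[c * 256:(c + 1) * 256] for c in range(3)]
r_bg, g_bg, b_bg = [ch.index(max(ch)) for ch in channels]

threshold = 60   # experiment with this value
px = im.load()
w, h = im.size
for y in range(h):
    for x in range(w):
        r, g, b = px[x, y]
        d = math.sqrt((r - r_bg) ** 2 + (g - g_bg) ** 2 + (b - b_bg) ** 2)
        if d < threshold:
            px[x, y] = (255, 255, 255)       # close to background -> pure white
im.save('cleaned.png')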

I was looking to make an arbitrary background color transparent a while ago and developed this script. It takes the most popular (background) color in an image and creates an alpha mask where the transparency is proportional to the distance from the background color. Taking RGB colorspace distances is an expensive process for large images, so I've tried some optimization using numpy and a fast integer square-root approximation. Converting to HSV first might be the right approach. If you haven't solved your problem yet, I hope this helps:
from PIL import Image
import numpy

fldr = r'C:\python_apps'
fp = fldr + '\\IMG_0377.jpg'
rz = 0  # 2 will halve the size of the image, etc.
# ----------------
im = Image.open(fp)
if rz:
    w, h = im.size
    im = im.resize((w // rz, h // rz))
w, h = im.size
hist = im.histogram()
# Most frequent value of each channel histogram = background colour
rgb = r0, g0, b0 = [b.index(max(b)) for b in [hist[i * 256:(i + 1) * 256] for i in range(3)]]

def isqrt(n):
    # Fast integer square-root approximation (Newton's method)
    xn = 1
    xn1 = (xn + n // xn) // 2
    while abs(xn1 - xn) > 1:
        xn = xn1
        xn1 = (xn + n // xn) // 2
    while xn1 * xn1 > n:
        xn1 -= 1
    return xn1

vsqrt = numpy.vectorize(isqrt)

def dist(image):
    # Per-pixel distance to the background colour, returned as an 8-bit 'L' image
    imarr = numpy.asarray(image, dtype=numpy.int32)
    d = (imarr[:, :, 0] - r0) ** 2 + (imarr[:, :, 1] - g0) ** 2 + (imarr[:, :, 2] - b0) ** 2
    d = numpy.asarray(vsqrt(d).clip(0, 255), dtype=numpy.uint8)
    return Image.fromarray(d, 'L')

im.putalpha(dist(im))
im.save(fldr + '\\test.png')

I know the question is old, but I was playing around with ImageMagick trying to do something similar, and came up with this:
convert text.jpg -fill white -fuzz 50% +opaque black out.jpg
which converts this:
into this:
As regards the "average" colour, I used this:
convert text.jpg -colors 2 -colorspace RGB -format %c histogram:info:-
5894: ( 50, 49, 19) #323113 rgb(50,49,19)
19162: (186,187, 87) #BABB57 rgb(186,187,87) <- THIS ONE !
which is this colour:
After some more experimentation, I can get this:
using this:
convert text.jpg -fill black -fuzz 50% -opaque rgb\(50,50,10\) -fill white +opaque black out.jpg

Related

How to find the largest blank (white) square area in the doc and return its coordinates and area?

I need to find the largest empty area in the document and display its coordinates, center point and area, using Python to put a QR code there.
I think OpenCV and Numpy should be enough for this task.
What kind of threshold should I use? There are many types of scans: gray, B&W, and colored. And how do I find the contour properly? How can this be implemented in the fastest way? An example using the first scan from Google is attached, where you can see that the code should find the largest empty square area.
@Mark Setchell Thanks! This code works perfectly for all docs with a white background, but when I use something with color in the background it finds a completely different area. Also, to keep thin lines in the docs I used erosion after thresholding. I tried changing the thresholding and erosion parameters, but it still does not work properly.
Edited the post, added color pictures.
Here's a possible approach:
#!/usr/bin/env python3
import cv2
import numpy as np

def largestSquare(im):
    # Make image square of 100x100 to simplify and speed up
    s = 100
    work = cv2.resize(im, (s, s), interpolation=cv2.INTER_NEAREST)
    # Make output accumulator - uint16 is ok because...
    # ... max value is 100x100, i.e. 10,000 which is less than 65,535
    # ... and you can make a PNG of it too
    p = np.zeros((s, s), np.uint16)
    # Find largest square
    for i in range(1, s):
        for j in range(1, s):
            if work[i][j] > 0:
                p[i][j] = min(p[i][j-1], p[i-1][j], p[i-1][j-1]) + 1
            else:
                p[i][j] = 0
    # Save result - just for illustration purposes
    cv2.imwrite("result.png", p)
    # Work out what the actual answer is
    ind = np.unravel_index(np.argmax(p, axis=None), p.shape)
    print(f'Location: {ind}')
    print(f'Length of side: {p[ind]}')

# Load image and threshold
im = cv2.imread('page.png', cv2.IMREAD_GRAYSCALE)
_, thr = cv2.threshold(im, 127, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)

# Get largest white square
largestSquare(thr)
Output
Location: (21, 77)
Length of side: 18
Notes:
I edited out your red annotation so it didn't interfere with my algorithm.
I did Otsu thresholding to get pure black and white - that may or may not be appropriate to your use case. It will depend on your scans and paper background etc.
I scaled the image down to 100x100 so it doesn't take all day to run. You will need to scale the results back up to the size of your original image but I assume you can do that easily enough.
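For example, using the numbers in the output above, the mapping back to the original resolution could be sketched like this (assuming largestSquare() is changed to return the location and side length rather than just printing them):
# Sketch: map the 100x100 result back to the original image size.
H, W = im.shape[:2]                      # the thresholded image from above
(row, col), side = (21, 77), 18          # example values from the output above
sy, sx = H / 100, W / 100
top, left = int((row - side + 1) * sy), int((col - side + 1) * sx)
bottom, right = int(row * sy), int(col * sx)
print(f'Largest empty square, roughly: ({left},{top}) to ({right},{bottom})')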
Keywords: Image processing, image, Python, OpenCV, largest white square, largest empty space.

How to detect intersection of 3 different colors

I have an image created by many polygons of different solid colors. The coordinates themselves are not given, but can be detected if necessary.
I'm looking for a way to detect all points which are the intersection of 3 or more different colors. The colors are not known in advance and might be similar to each other (e.g. one might be (255, 255, 250) and another (255, 255, 245)); the specific shade doesn't matter, just the fact that it is different.
For example, in the following image a tiny star marks all the points that I'm looking for.
As your annotations have obscured the intersections you are trying to identify, I made a new, similar image.
Rather than trying to bend my brain around 3 dimensions of 8-bit RGB colour, I converted each pixel to a single 24-bit integer, then ran a generic filter from SciPy that counts the number of unique colours in each 3x3 window and made a new image from that. So each pixel in the result has a brightness value equal to the number of colours in its neighbourhood. I counted the colours by converting the Numpy array of neighbours into a Python set, exploiting the fact that a set can only contain unique values.
#!/usr/bin/env python3
import numpy as np
from PIL import Image
from scipy.ndimage import generic_filter

def CountUnique(P):
    """Receive P[0]..P[8], the pixels in the 3x3 surrounding window; return count of unique values."""
    return len(set(P))

# Open image and make into Numpy array
PILim = Image.open('patches.png').convert('RGB')
RGBim = np.array(PILim)

# Make a single channel 24-bit image rather than 3 channels of 8-bit each
RGB24 = (RGBim[...,0].astype(np.uint32)<<16) | (RGBim[...,1].astype(np.uint32)<<8) | RGBim[...,2].astype(np.uint32)

# Run generic filter counting unique colours in neighbourhood
result = generic_filter(RGB24, CountUnique, (3, 3))

# Save result
Image.fromarray(result.astype(np.uint8)).save('result.png')
The resultant image is shown here, with the contrast stretched so that you can see the brightest pixels at the intersections you seek.
A histogram of the values in the result image shows there are 21 pixels which have 3 unique colours in their 3x3 neighbourhood and 4,348 pixels which have 2 unique colours in their neighbourhood. You can find these by running np.where(result==3), for example.
Histogram:
155631: ( 1, 1, 1) #010101 gray(1)
4348: ( 2, 2, 2) #020202 gray(2)
21: ( 3, 3, 3) #030303 gray(3)
For extra fun, I had a go at programming the method suggested by @Micka and that gives the same results; the code looks like this:
#!/usr/bin/env python3
import numpy as np
from PIL import Image
from skimage.morphology import dilation, disk

# Open image and make into Numpy array
PILim = Image.open('patches.png').convert('RGB')
RGBim = np.array(PILim)
h, w = RGBim.shape[0], RGBim.shape[1]

# Make a single channel 24-bit image rather than 3 channels of 8-bit each
RGB24 = (RGBim[...,0].astype(np.uint32)<<16) | (RGBim[...,1].astype(np.uint32)<<8) | RGBim[...,2].astype(np.uint32)

# Make list of unique colours
UniqueColours = np.unique(RGB24)

# Create result image
result = np.zeros((h,w), dtype=np.uint8)

# Make mask for any particular colour - same size as original image
mask = np.zeros((h,w), dtype=np.uint8)

# Make disk-shaped structuring element for morphology
selem = disk(1)

# Iterate over unique colours
for i, u in enumerate(UniqueColours):
    # Turn on all pixels matching this unique colour, turn off all others
    mask = np.where(RGB24==u, 1, 0)
    # Dilate (fatten) the mask by 1 pixel
    mask = dilation(mask, selem)
    # Add all activated pixels to result image
    result = result + mask

# Save result
Image.fromarray(result.astype(np.uint8)).save('result.png')
For reference, I created the image with anti-aliasing disabled in ImageMagick at the command line like this:
convert -size 400x400 xc:red -background red +antialias \
-fill blue -draw "polygon 42,168 350,72 416,133 416,247 281,336" \
-fill yellow -draw "polygon 271,11 396,127 346,154 77,86" \
-fill lime -draw "polygon 366,260 366,400 120,400" patches.png
Keywords: Python, image, image processing, intersect, intersection, PIL/Pillow, adjacency, neighbourhood, neighborhood, neighbour, neighbor, generic, SciPy, 3x3, filter.

Python: How to implement Binary Filter on RGB image? (algorithm)

I'm trying to implement a binary image filter (to get a monochrome binary image) using Python & PyQt5, and to retrieve the new pixel colors I use the following method:
def _new_pixel_colors(self, x, y):
    color = QColor(self.pixmap.pixel(x, y))
    result = qRgb(0, 0, 0) if all(c < 127 for c in color.getRgb()[:3]) else qRgb(255, 255, 255)
    return result
Could this be a correct implementation of a binary filter for an RGB image? I mean, is that a sufficient condition for checking whether a pixel is brighter or darker than the gray color (127,127,127)?
And please, do not provide any solutions with opencv, pillow, etc. I'm only asking about the algorithm itself.
I would at least compare against intensity i=R+G+B ...
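For instance, comparing the summed intensity against half of its maximum (765 / 2) rather than checking each channel separately could look like this; just a sketch of the idea, independent of PyQt:
def binarize_pixel(r, g, b):
    # white if the total intensity is at least half of 765 (= 255 + 255 + 255)
    return (255, 255, 255) if r + g + b >= 384 else (0, 0, 0)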
For ROI-like masks you can use any thresholding technique (adaptive thresholding is the best), but if your resulting image is not an ROI mask and should resemble the visual features of the original image, then the best conversion I know of is dithering.
The idea behind BW dithering is to convert gray scales into BW patterns while preserving the shading. The result is often noisy but preserves much, much more visual detail. Here is simple naive C++ dithering (sorry, not a Python coder):
picture pic0, pic1;
// pic0 - source img
// pic1 - output img
int x, y, i;
color c;
// resize output to source image size, clear with black
pic1 = pic0; pic1.clear(0);
// dithering
i = 0;
for (y = 0; y < pic0.ys; y++)
    for (x = 0; x < pic0.xs; x++)
    {
        // get source pixel color (AARRGGBB)
        c = pic0.p[y][x];
        // add to leftovers
        i += WORD(c.db[picture::_r]); // _r,_g,_b are just constants 0,1,2
        i += WORD(c.db[picture::_g]);
        i += WORD(c.db[picture::_b]);
        // threshold: white intensity is 255+255+255 = 765
        if (i >= 384) { i -= 765; c.dd = 0x00FFFFFF; } else c.dd = 0;
        // copy to destination image
        pic1.p[y][x] = c;
    }
So it's the same as in the link above, but using just black and white. i is the accumulated intensity to be placed on the image, xs,ys is the resolution, and c.db[] is colour channel access.
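Not the author's code, but a rough Python/NumPy translation of the same accumulation idea might look like this (a sketch; the input is assumed to be an RGB NumPy array):
import numpy as np

def bw_dither(rgb):
    """Naive error-accumulation dithering to black/white, as in the C++ above (sketch)."""
    h, w = rgb.shape[:2]
    out = np.zeros((h, w), dtype=np.uint8)
    acc = 0                                  # the accumulated intensity 'i'
    for y in range(h):
        for x in range(w):
            acc += int(rgb[y, x, 0]) + int(rgb[y, x, 1]) + int(rgb[y, x, 2])
            if acc >= 384:                   # white intensity is 255+255+255 = 765
                acc -= 765
                out[y, x] = 255
    return out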
If I apply this on colored image like this:
The result looks like this:
As you can see, all the details were preserved but noisy patterns emerge ... For printing purposes the resolution of the image was sometimes multiplied to enhance the quality. If you change the naive two nested for loops to a better pattern (like 16x16 squares, etc.) then the noise will be kept near its source, limiting artifacts. There are also approaches that use pseudo-random patterns (put the leftover i near its source pixel at a random location), which are even better ...
But for BW dithering even the naive approach is enough, as the artifacts are just one pixel in size. For colored dithering the artifacts could create unwanted horizontal line patterns several pixels in size (it depends on the palette mismatch; the worse the palette, the bigger the artifacts ...).
PS: just for comparison with the threshold outputs in the other answer, this is the same image dithered:
Image thresholding is the class of algorithms you're looking for - a binary threshold would set pixels to 0 or 1, yes.
Depending on the desired output, consider converting your image first to another color space, in particular HSL, and thresholding on the luminance channel. Using (127, 127, 127) as a threshold does not uniformly take brightness into account, because each RGB channel only measures the amount of R, G, or B; consider this image:
from PIL import Image
import colorsys

def threshold_pixel(r, g, b):
    h, l, s = colorsys.rgb_to_hls(r / 255., g / 255., b / 255.)
    return 1 if l > .36 else 0
    # return 1 if r > 127 and g > 127 and b > 127 else 0

def hlsify(img):
    pixels = img.load()
    width, height = img.size
    # Create a new blank monochrome image.
    output_img = Image.new('1', (width, height), 0)
    output_pixels = output_img.load()
    for i in range(width):
        for j in range(height):
            output_pixels[i, j] = threshold_pixel(*pixels[i, j])
    return output_img

binarified_img = hlsify(Image.open('./sample_img.jpg'))
binarified_img.show()
binarified_img.save('./out.jpg')
There is lots of discussion on other StackExchange sites on this topic, e.g.
Binarize image data
How do you binarize a colored image?
how can I get good binary image using Otsu method for this image?

How to detect colored text from 6 meters away?

I am using Python, PIL, OpenCV and numpy to detect single-color texts (i.e. one is red, one is green). I want to detect these colorful texts up to 6 meters away during a live stream. I have used color detection methods, but they did not work beyond 30-50 cm; the camera had to be close to the colors. As a second method to detect these texts I used the CTPN method. Although it detects texts, it does not provide their coordinates, and I need the coordinate points of the texts as well. I also tried the OCR method in Matlab to automatically detect text in a natural image, but it failed since it finds other small objects as text. I am stuck about what to do.
Let's say, for example, that there are two different texts in an image captured 6 meters away. One text is green, the other one is red. The width of these texts is approximately 40-50 cm. In addition, they are only two different words, not long texts. How can I detect them and specify their locations as (x1,y1) and (x2,y2)? Is that possible? I'd appreciate any successful hint.
import numpy as np
from PIL import Image
# Open image and make RGB and HSV versions
RGBim = Image.open("AdjustedNewMaze3.jpg").convert('RGB')
HSVim = RGBim.convert('HSV')
# Make numpy versions
RGBna = np.array(RGBim)
HSVna = np.array(HSVim)
# Extract Hue
H = HSVna[:,:,0]
# Find all green pixels, i.e. where 100 < Hue < 140
lo,hi = 100,140
# Rescale to 0-255, rather than 0-360 because we are using uint8
lo = int((lo * 255) / 360)
hi = int((hi * 255) / 360)
green = np.where((H>lo) & (H<hi))
# Make all green pixels black in original image
RGBna[green] = [0,0,0]
def find_nearest(array, value):
    array = np.asarray(array)
    idx = (np.abs(array - value)).argmin()
    return array[idx]
value = 120 & 125
green = find_nearest(RGBna, value)
print(green)
count = green[0].size
print("Pixels matched: {}".format(count))
Image.fromarray(green).save('resultgreen.png')
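To get coordinates out of that, one rough option is the bounding box of the matched green pixels (a sketch built on the np.where mask above; it assumes the green text is the only green region in the frame):
# Bounding box of the green pixels found above
rows, cols = np.where((H > lo) & (H < hi))
if rows.size:
    x1, y1 = cols.min(), rows.min()
    x2, y2 = cols.max(), rows.max()
    print(f'Green text roughly spans ({x1},{y1}) to ({x2},{y2})')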

remove pixel annotations in dicom image

I am analyzing medical images. All images have a marker with the position. It looks like this
It is the "TRH RMLO" annotation in this image, but it can be different in other images. Also the size varies. The image is cropped but you see that the tissue is starting on the right side.
I found that the presence of these markers distort my analysis.
How can I remove them?
I load the image in Python like this:
import dicom
import numpy as np
img = dicom.read_file("my_image.dcm")
img_array = img.pixel_array
The image is then a numpy array. The white text is always surrounded by a large black area (black has value zero). The marker is in a different position in each image.
How can I remove the white text without hurting the tissue data?
UPDATE
added a second image
UPDATE2:
Here are two of the original DICOM files. All personal information has been removed. (edit: removed)
Looking at the actual pixel values of the image you supplied, you can see that the marker is almost (99.99%) pure white and this doesn't occur elsewhere in the image so you can isolate it with a simple 99.99% threshold.
I prefer ImageMagick at the command-line, so I would do this:
convert sample.dcm -threshold 99.99% -negate mask.png
convert sample.dcm mask.png -compose darken -composite result.jpg
Of course, if the sample image is not representative, you may have to work harder. Let's look at that...
If the simple threshold doesn't work for your images, I would look at "Hit and Miss Morphology". Basically, you threshold your image to pure black and white - at around 90% say, and then you look for specific shapes, such as the corner markers on the label. So, if we want to look for the top-left corner of a white rectangle on a black background, and we use 0 to mean "this pixel must be black", 1 to mean "this pixel must be white" and - to mean "we don't care", we would use this pattern:
0 0 0 0 0
0 1 1 1 1
0 1 - - -
0 1 - - -
0 1 - - -
Hopefully you can see the top left corner of a white rectangle there. That would be like this in the Terminal:
convert sample.dcm -threshold 90% \
-morphology HMT '5x5:0,0,0,0,0 0,1,1,1,1 0,1,-,-,- 0,1,-,-,- 0,1,-,-,-' result.png
Now we also want to look for top-right, bottom-left and bottom-right corners, so we need to rotate the pattern, which ImageMagick handily does when you add the > flag:
convert sample.dcm -threshold 90% \
-morphology HMT '5x5>:0,0,0,0,0 0,1,1,1,1 0,1,-,-,- 0,1,-,-,- 0,1,-,-,-' result.png
Hopefully you can see dots demarcating the corners of the logo now, so we could ask ImageMagick to trim the image of all extraneous black and just leave the white dots and then tell us the bounding box:
convert sample.dcm -threshold 90% \
-morphology HMT '5x5>:0,0,0,0,0 0,1,1,1,1 0,1,-,-,- 0,1,-,-,- 0,1,-,-,-' -format %# info:
308x198+1822+427
So, if I now draw a red box around those coordinates, you can see where the label has been detected - of course in practice I would draw a black box to cover it but I am explaining the idea:
convert sample.dcm -fill "rgba(255,0,0,0.5)" -draw "rectangle 1822,427 2130,625" result.png
If you want a script to do that automagically, I would use something like this, saving it as HideMarker:
#!/bin/bash
input="$1"
output="$2"
# Find corners of overlaid marker using Hit and Miss Morphology, then get crop box
IFS="x+" read w h x1 y1 < <(convert "$input" -threshold 90% -morphology HMT '5x5>:0,0,0,0,0 0,1,1,1,1 0,1,-,-,- 0,1,-,-,- 0,1,-,-,-' -format %# info:)
# Calculate bottom-right corner from top-left and dimensions
((x1=x1-1))
((y1=y1-1))
((x2=x1+w+1))
((y2=y1+h+1))
convert "$input" -fill black -draw "rectangle $x1,$y1 $x2,$y2" "$output"
Then you would do this to make it executable:
chmod +x HideMarker
And run it like this:
./HideMarker someImage.dcm result.png
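If you would rather stay in Python, OpenCV's hit-or-miss transform can express the same kind of corner kernel; the following is only a sketch of that idea (the filename and the 90%-ish threshold are placeholders), not a drop-in replacement for the commands above:
import cv2
import numpy as np

img = cv2.imread('sample.png', cv2.IMREAD_GRAYSCALE)      # placeholder: 8-bit export of the DICOM
_, thr = cv2.threshold(img, 230, 255, cv2.THRESH_BINARY)  # roughly the 90% threshold

# Top-left corner of a white rectangle on black:
# 1 = must be white, -1 = must be black, 0 = don't care
kernel = np.array([[-1, -1, -1, -1, -1],
                   [-1,  1,  1,  1,  1],
                   [-1,  1,  0,  0,  0],
                   [-1,  1,  0,  0,  0],
                   [-1,  1,  0,  0,  0]], dtype=np.int32)

corners = cv2.morphologyEx(thr, cv2.MORPH_HITMISS, kernel)
ys, xs = np.nonzero(corners)
print(list(zip(xs, ys)))   # candidate top-left corners; repeat with rotated kernels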
I have another idea. This solution is in OpenCV using Python. It is a rather simple approach.
First, obtain the binary threshold of the image.
ret,th = cv2.threshold(img,2,255, 0)
Perform morphological dilation:
dilate = cv2.morphologyEx(th, cv2.MORPH_DILATE, kernel, 3)
To join the gaps, I then used median filtering:
median = cv2.medianBlur(dilate, 9)
Now you can use the contour properties to eliminate the smaller contours and retain the one containing the tissue.
It also works for the second image:
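Pulled together, that pipeline might look roughly like this in OpenCV 4.x (a sketch; the filename, kernel size and other parameters are guesses to be tuned):
import cv2
import numpy as np

img = cv2.imread('scan.png', cv2.IMREAD_GRAYSCALE)   # placeholder: 8-bit export of the DICOM

# 1. Binary threshold: anything that is not near-black becomes white
ret, th = cv2.threshold(img, 2, 255, 0)

# 2. Morphological dilation (kernel size is a guess)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
dilate = cv2.morphologyEx(th, cv2.MORPH_DILATE, kernel, iterations=3)

# 3. Median filtering to join the gaps
median = cv2.medianBlur(dilate, 9)

# 4. Keep only the largest contour (the tissue) and mask out everything else
contours, _ = cv2.findContours(median, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
largest = max(contours, key=cv2.contourArea)
mask = np.zeros_like(median)
cv2.drawContours(mask, [largest], -1, 255, -1)
cleaned = cv2.bitwise_and(img, img, mask=mask)
cv2.imwrite('cleaned.png', cleaned)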
If these annotations are in the DICOM file there are a couple ways they could be stored (see https://stackoverflow.com/a/4857782/1901261). The currently supported method can be cleaned off by simply removing the 60xx group attributes from the files.
For the deprecated method (which is still commonly used) you can clear out the unused high bit annotations manually without messing up the other image data as well. Something like:
int position = object.getInt( Tag.OverlayBitPosition, 0 );
if( position == 0 ) return;

// Remove the overlay bit from each pixel
int bit = 1 << position;
int[] pixels = object.getInts( Tag.PixelData );

int count = 0;
for( int pix : pixels )
{
    int overlay = pix & bit;
    pixels[ count++ ] = pix - overlay;
}
object.putInts( Tag.PixelData, VR.OW, pixels );
If these are truly burned into the image data, you're probably stuck using one of the other recommendations here.
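For the deprecated high-bit case, a rough pydicom sketch of the same bit-clearing idea might look like this (an assumption-laden sketch: it expects uncompressed pixel data and a single overlay group at 6000,xxxx):
import pydicom

ds = pydicom.dcmread('my_image.dcm')            # placeholder filename
arr = ds.pixel_array

# OverlayBitPosition of the first overlay group (tag 6000,0102), if present
elem = ds.get((0x6000, 0x0102))
if elem is not None and elem.value:
    bit = 1 << int(elem.value)
    overlay = arr & bit                         # isolate the overlay plane bits
    arr = arr - overlay                         # clear them, as in the Java snippet
    ds.PixelData = arr.tobytes()
    ds.save_as('cleaned.dcm')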
The good thing is that these watermarks are probably in an isolated, totally black area, which makes it easier (although it's questionable whether removing them complies with the indicated usage; licensing concerns).
Without being an expert, here is one idea. It is a sketch of a potentially very powerful approach tailored to this problem, but you have to decide whether the implementation complexity and the algorithmic complexity (very dependent on image statistics) are worth it:
Basic idea
Detect the semi-cross like borders (4)
Calculate the defined rectangle from these
Black-out this rectangle
Steps
0. Binarize
1. Use some gradient-based edge detector to get all the horizontal edges
- There may be multiple; you can try to impose a minimum length (maybe some morphology is needed to connect pixels which are not connected because of noise in the source or the algorithm)
2. Use some gradient-based edge detector to get all the vertical edges
- Like the above, but with a different orientation
3. Do some connected-component calculation to get objects which are vertical and horizontal lines
Now you can try different choices of candidate components (8 real ones) with the following knowledge:
- two of these components can be described by the same line (slope-intercept form; a linear regression problem) -> a line which borders the rectangle
- the best 4 pair choices (according to the linear regression loss) are probably the valid borders of this rectangle
- you might add the assumption that vertical borders and horizontal borders are orthogonal to each other
4. Calculate the rectangle from these borders
- Widen it by a few pixels (hyper-parameter)
- Black-out that rectangle
That's the basic approach.
Alternative
This one is much less work, uses more specialized tools, and assumes the facts stated in the opening:
the stuff to remove is on some completely black part of the image
it's kind of isolated; distance to medical-data is high
Steps
Run some general OCR to detect characters
Get the occupied pixels / borders somehow (I'm not sure what OCR tools return)
Calculate some outer rectangle and black-out (using some predefined widening-gap; this one needs to be much bigger than the one above)
Alternative 2
Sketch only: the idea is to use something like binary closing on the image to build fully connected components out of the source pixels (while filling small gaps/holes), so that we get one big component describing the medical data and one for the watermark. Then just remove the smaller one.
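A quick scikit-image sketch of that last idea (the closing radius is a guess to be tuned):
import numpy as np
from skimage.measure import label
from skimage.morphology import binary_closing, disk

# img_array is the DICOM pixel array from the question
binary = img_array > 0                        # everything that is not pure black
closed = binary_closing(binary, disk(15))     # connect nearby pixels, fill small holes
labels = label(closed)

# Keep only the largest connected component (the tissue), black out the rest
sizes = np.bincount(labels.ravel())
sizes[0] = 0                                  # ignore the background label
largest = sizes.argmax()
img_array[labels != largest] = 0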
I am sure this can be optimized, but ... You could create 4 patches of size 3x3 or 4x4 and initialize them with the exact pixel values of each of the individual corners of the frame surrounding the annotation text. You could then iterate over the whole image (or use some smart initialization looking only in the black area) and find the exact match for those patches. It is not very likely that you will have the same regular structure (a 90° corner surrounded by near-zero values) in the tissue, so this might give you the bounding box.
A simpler approach is still possible!
Just add the following after img_array = img.pixel_array:
img_array[img_array > X] = Y
where X is the intensity threshold above which you want to eliminate pixels, and Y is the intensity value to use instead.
For example:
img_array[img_array > 4000] = 0
replaces everything brighter than 4000 with black (intensity 0).
