My goal is to remove dark horizontal and vertical lines from an image after it has been converted into a numpy array. I didn't want to use a predefined image module for this since I wanted fine control over parameters such as threshold values.
My logic was as follows:
Convert a color image to a 3D numpy array image (BGR) using cv2.imread.
Iterate over row indices and extract each row using row = image[row_index,:,:].
In each row, calculate how many pixels are "black pixels" based on whether all 3 channel values are below the defined threshold.
If a large enough number (or ratio) of pixels in a row meets the above criterion, store this row index in the list remove_rows.
After all iterations, determine the rows to be preserved, stored into preserve_rows, based on the list remove_rows.
The new image after row deletion is then obtained with image = image[preserve_rows,:,:].
Repeat the process for columns as well.
The program works, but it takes a very long time. I think the time complexity is O(rows * columns * 3), since every value has to be visited and compared against the threshold. The program takes around 9 seconds per image, which is unacceptable since I eventually plan to use this function as a preprocessing step in Keras' ImageDataGenerator, and I'm not sure whether that runs on the GPU during neural network training. The full code is below:
import time

def edge_removal(image, threshold=50, max_black_ratio=0.7):
    num_rows, _, _ = image.shape
    remove_rows = []
    threshold_times = []
    start_time = time.time()
    for row_index in range(num_rows):
        row = image[row_index,:,:]
        pixel_count = 0
        black_pixel_count = 0
        for pixel in row:
            pixel_count += 1
            b,g,r = pixel
            pre_threshold_time = time.time()
            if all([x<=threshold for x in [b,g,r]]):
                black_pixel_count += 1
            threshold_times.append(time.time()-pre_threshold_time)
        if pixel_count > 0 and (black_pixel_count/pixel_count)>max_black_ratio:
            remove_rows.append(row_index)
    time_taken = time.time() - start_time
    print(f"Time taken for thresholding = {sum(threshold_times)}")
    print(f"Time taken till row for loop = {time_taken}")
    preserve_rows = [x for x in range(num_rows) if x not in remove_rows]
    image = image[preserve_rows,:,:]
    _, num_cols, _ = image.shape
    remove_cols = []
    for col_index in range(num_cols):
        col = image[:,col_index,:]
        pixel_count = 0
        black_pixel_count = 0
        for pixel in col:
            pixel_count += 1
            b,g,r = pixel
            if all([x<=threshold for x in [b,g,r]]):
                black_pixel_count += 1
        if pixel_count > 0 and (black_pixel_count/pixel_count)>max_black_ratio:
            remove_cols.append(col_index)
    preserve_cols = [x for x in range(num_cols) if x not in remove_cols]
    image = image[:,preserve_cols,:]
    time_taken = time.time() - start_time
    print(f"Total time taken = {time_taken}")
    return image
And the output of the code is:
Time taken for thresholding = 3.586946487426758
Time taken till row for loop = 4.530229091644287
Total time taken = 8.74315094947815
I've tried the following:
Using multithreading to replace the outer for loop, where the argument to the threaded function is the thread number (number of threads = number of rows in the image). However, this did not speed up the program, probably because the loop is a CPU-bound process that cannot be sped up with threads due to the Global Interpreter Lock, as described by this SO answer (see the process-based sketch after this list).
Looking for other suggestions on how to reduce the time complexity of the program. This answer did not help me much, since it's not the deletion that's the bottleneck, as can be seen in the output; the number of comparisons performed during thresholding is what's slowing the program down.
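To illustrate the GIL point: processes, unlike threads, can run CPU-bound work in parallel. A rough sketch of what a process-based version could look like (not something I have benchmarked; row_is_removable is a made-up helper mirroring the row logic above):

from multiprocessing import Pool

def row_is_removable(row, threshold=50, max_black_ratio=0.7):
    # A row is removable when enough of its pixels have all channels <= threshold.
    black = sum(1 for b, g, r in row if b <= threshold and g <= threshold and r <= threshold)
    return len(row) > 0 and black / len(row) > max_black_ratio

def find_removable_rows(image):
    # Each worker process gets its own interpreter, sidestepping the GIL.
    # (On Windows, call this from under `if __name__ == "__main__":`.)
    with Pool() as pool:
        flags = pool.map(row_is_removable, list(image))
    return [i for i, removable in enumerate(flags) if removable]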
Any suggestions or heuristics to reduce the amount of computation and thereby the processing time of the program?
Since your code comprises two parts that do the same job, just on different dimensions of the image, I moved all that logic into a single function that tells you whether the "series of pixels" (row or column, it does not matter) provided is above or below the threshold.
I replaced all the manual counts with len calls.
The various generators (r, g, b = pixel; x <= threshold for x in (r, g, b)) are replaced with a direct numpy array comparison like pixel <= threshold, and Python's all is replaced by numpy's .all().
The old and new codes process my test image in 5.9 s and 37 ms respectively, with the added benefit of readability.
def edge_removal(image, threshold=50, max_black_ratio=0.7):
    def pixels_should_be_conserved(pixels) -> bool:
        black_pixel_count = (pixels <= threshold).all(axis=1).sum()
        pixel_count = len(pixels)
        return pixel_count > 0 and black_pixel_count/pixel_count <= max_black_ratio

    num_rows, num_columns, _ = image.shape
    preserved_rows = [r for r in range(num_rows) if pixels_should_be_conserved(image[r, :, :])]
    preserved_columns = [c for c in range(num_columns) if pixels_should_be_conserved(image[:, c, :])]
    image = image[preserved_rows,:,:]
    image = image[:,preserved_columns,:]
    return image
To explain further the change that saved us the most time (counting black pixels), let's take a look at a simplified example.
red = np.array([255, 0, 0])
black = np.array([0, 0, 0])
pixels = np.array([red, red, red, black, red]) # Simple line of 5 pixels.
threshold = 50
pixels <= threshold
# >>> array([[False, True, True],
# [False, True, True],
# [False, True, True],
# [True, True, True],
# [False, True, True]])
(pixels <= threshold).all(axis=1)
# >>> array([False,
# False,
# False,
# True,
# False])
# We successfully detect that the fourth pixel has all its rgb values below the threshold.
(pixels <= threshold).all(axis=1).sum()
# >>> 1
# Summing a boolean array is a handy way of counting how many elements in the
# array are true, i.e. dark enough in our case.
Alternative 1: HSV.
Another thing we can consider is using the HSV color system, since you only worry about brightness in your problem. That will allow us to check whether v <= threshold for each pixel, so one comparison instead of three.
The image is processed in 18 ms instead of 37 ms despite the two conversions to and from HSV.
def edge_removal(image, threshold=50, max_black_ratio=0.7):
    def pixels_should_be_conserved(pixels) -> bool:
        black_pixel_count = (pixels[:,2] <= threshold).sum()  # Notice the change here.
        pixel_count = len(pixels)
        return pixel_count > 0 and black_pixel_count/pixel_count <= max_black_ratio

    image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
    num_rows, num_columns, _ = image.shape
    preserved_rows = [r for r in range(num_rows) if pixels_should_be_conserved(image[r, :, :])]
    preserved_columns = [c for c in range(num_columns) if pixels_should_be_conserved(image[:, c, :])]
    image = image[preserved_rows,:,:]
    image = image[:,preserved_columns,:]
    image = cv2.cvtColor(image, cv2.COLOR_HSV2BGR)
    return image
Alternative 2: grayscale.
We can also work in grayscale mode and compare the "pixel" (a single value now) directly against the threshold. We save one conversion compared to the HSV alternative but use a little more memory.
It runs in 14 ms.
def edge_removal(image, threshold=50, max_black_ratio=0.7):
    def pixels_should_be_conserved(pixels) -> bool:
        black_pixel_count = (pixels <= threshold).sum()
        pixel_count = len(pixels)
        return pixel_count > 0 and black_pixel_count/pixel_count <= max_black_ratio

    image_grayscale = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    num_rows, num_columns, _ = image.shape
    preserved_rows = [r for r in range(num_rows) if pixels_should_be_conserved(image_grayscale[r, :])]
    preserved_columns = [c for c in range(num_columns) if pixels_should_be_conserved(image_grayscale[:, c])]
    image = image[preserved_rows,:,:]
    image = image[:,preserved_columns,:]
    return image
Benchmarking:

| RGB (OP) | RGB   | HSV   | Gray  |
|----------|-------|-------|-------|
| 5900 ms  | 37 ms | 18 ms | 14 ms |
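If you want to reproduce these timings on your own machine, a minimal harness along these lines should do (the image path is a placeholder, and the copy() just keeps runs independent):

import timeit
import cv2

image = cv2.imread("test.png")  # placeholder path, substitute your own image
n = 10
elapsed = timeit.timeit(lambda: edge_removal(image.copy()), number=n)
print(f"edge_removal: {elapsed / n * 1000:.1f} ms per call")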
I have created an algorithm that detects the edges of an extruded collagen casing and draws a centerline between these edges on an image: Casing with a centerline.
Here is my code:
import cv2
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')
img = cv2.imread("C:/Users/5.jpg", cv2.IMREAD_GRAYSCALE)
img = cv2.resize(img, (1500, 1200))
#ROI
fromCenter = False
r = cv2.selectROI(img, fromCenter)
imCrop = img[int(r[1]):int(r[1]+r[3]), int(r[0]):int(r[0]+r[2])]
#Operations on an image
_,thresh = cv2.threshold(imCrop,100,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
kernel = np.ones((5,5),np.uint8)
opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel)
blur = cv2.GaussianBlur(opening,(7,7),0)
edges = cv2.Canny(blur, 0,20)
#Edges localization, packing coords into a list
indices = np.where(edges != [0])
coordinates = list(zip(indices[1], indices[0]))
num = len(coordinates)
#Separating into top and bot edge
bot_cor = coordinates[:int(num/2)]
top_cor = coordinates[-int(num/2):]
#Converting to arrays, sorting
a, b = np.array(top_cor), np.array(bot_cor)
a, b = a[a[:,0].argsort()], b[b[:,0].argsort()]
#Edges approximation by a 5th degree polynomial
min_a_x, max_a_x = np.min(a[:,0]), np.max(a[:,0])
new_a_x = np.linspace(min_a_x, max_a_x, imCrop.shape[1])
a_coefs = np.polyfit(a[:,0],a[:,1], 5)
new_a_y = np.polyval(a_coefs, new_a_x)
min_b_x, max_b_x = np.min(b[:,0]), np.max(b[:,0])
new_b_x = np.linspace(min_b_x, max_b_x, imCrop.shape[1])
b_coefs = np.polyfit(b[:,0],b[:,1], 5)
new_b_y = np.polyval(b_coefs, new_b_x)
#Defining a centerline
midx = [np.average([new_a_x[i], new_b_x[i]], axis = 0) for i in range(imCrop.shape[1])]
midy = [np.average([new_a_y[i], new_b_y[i]], axis = 0) for i in range(imCrop.shape[1])]
plt.figure(figsize=(16,8))
plt.title('Cross section')
plt.xlabel('Length of the casing', fontsize=18)
plt.ylabel('Width of the casing', fontsize=18)
plt.plot(new_a_x, new_a_y,c='black')
plt.plot(new_b_x, new_b_y,c='black')
plt.plot(midx, midy, '-', c='blue')
plt.show()
#Converting coords type to a list (plotting purposes)
coords = list(zip(midx, midy))
points = list(np.int_(coords))
mask = np.zeros((imCrop.shape[:2]), np.uint8)
mask = edges
#Plotting
for point in points:
    cv2.circle(mask, tuple(point), 1, (255,255,255), -1)
for point in points:
    cv2.circle(imCrop, tuple(point), 1, (255,255,255), -1)
cv2.imshow('imCrop', imCrop)
cv2.imshow('mask', mask)
cv2.waitKey(0)
cv2.destroyAllWindows()
Now I would like to sum up the intensities of each pixel in the region between the top edge and the centerline (and likewise for the region between the centerline and the bottom edge).
Is there any way to limit the ROI to the region between the detected edges and split it into two regions based on the calculated centerline?
Or is there any way to access the pixels contained between the edge and the centerline based on their coordinates?
(It's my very first post here, sorry in advance for all the mistakes)
I wrote somewhat naïve code to get masks for the upper and lower parts. My code assumes that the source image will always be like yours: with horizontal stripes.
After applying Canny I get this:
Then I run some loops through the image array to fill the unwanted areas of your image. This is done separately for the upper and lower parts, creating masks. The results are:
Then you can use these masks to sum only the elements you're interested in, using cv.sumElems.
import cv2 as cv
import numpy as np
#open as grayscale image
src = cv.imread("colagen.png",cv.IMREAD_GRAYSCALE)
# apply canny and find contours
threshold = 100
canny_output = cv.Canny(src, threshold, threshold * 2)
# find mask for upper part
mask1 = canny_output.copy()
x, y = canny_output.shape
area = 0
for j in range(y):
    area = 0
    for i in range(x):
        if area == 0:
            if mask1[i][j] > 0:
                area = 1
                continue
            else:
                mask1[i][j] = 255
        elif area == 1:
            if mask1[i][j] > 0:
                area = 2
            else:
                continue
        else:
            mask1[i][j] = 255
mask1 = cv.bitwise_not(mask1)
# find mask for lower part
mask2 = canny_output.copy()
x, y = canny_output.shape
area = 0
for j in range(y):
    area = 0
    for i in range(x):
        if area == 0:
            if mask2[-i][j] > 0:
                area = 1
                continue
            else:
                mask2[-i][j] = 255
        elif area == 1:
            if mask2[-i][j] > 0:
                area = 2
            else:
                continue
        else:
            mask2[-i][j] = 255
mask2 = cv.bitwise_not(mask2)
# apply masks and calculate sum of elements in upper and lower part
sums = [0,0]
(sums[0],_,_,_) = cv.sumElems(cv.bitwise_and(src,mask1))
(sums[1],_,_,_) = cv.sumElems(cv.bitwise_and(src,mask2))
cv.imshow('src',src)
cv.imshow('canny',canny_output)
cv.imshow('mask1',mask1)
cv.imshow('mask2',mask2)
cv.imshow('masked1',cv.bitwise_and(src,mask1))
cv.imshow('masked2',cv.bitwise_and(src,mask2))
cv.waitKey()
Alternatives...
Probably there exists some function that fills the areas of the Canny result. I tried cv.fillPoly and cv.floodFill, but didn't manage to make them work easily... but maybe someone else can help you with that...
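For reference, a sketch of how cv.floodFill could be applied to the Canny output (untested on this exact image; it assumes the top-left corner lies in the region you want painted, similar to what the upper-mask loop does):

import cv2 as cv
import numpy as np

def mask_via_floodfill(canny_output):
    flood = canny_output.copy()
    # floodFill wants a mask 2 pixels larger than the image in each dimension.
    h, w = canny_output.shape
    ff_mask = np.zeros((h + 2, w + 2), dtype=np.uint8)
    # Fill the black background starting from the top-left corner; the Canny
    # lines (value 255) act as the boundary that stops the fill.
    cv.floodFill(flood, ff_mask, (0, 0), 255)
    return flood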
Edit
Found another way to get the masks with cleaner code, using numpy's np.add.accumulate, then np.clip, and then a modulo operation:
# first divide canny_output by 255 to get 0's and 1's, then perform
# an accumulate addition for each column. Thus you'll get +1 for every
# line, "painting" areas with 1, 2, 3...
a = np.add.accumulate(canny_output/255,0)
# clip values: anything greater than 2 becomes 2
a = np.clip(a, 0, 2)
# perform a modulo to get areas alternating between 0 and 1; then multiply by 255
a = a%2 * 255
# convert to uint8
mask1 = cv.convertScaleAbs(a)
# to get mask2 (the lower mask) flip the array then do the same as above
a = np.add.accumulate(np.flip(canny_output,0)/255,0)
a = np.clip(a, 0, 2)
a = a%2 * 255
mask2 = cv.convertScaleAbs(np.flip(a,0))
This returns almost the same result. The border of the mask is a little bit different...
I have an image such as this one, which is only black and white:
I would like to obtain only the flooded area of the image with the border using cv2.floodFill, like so (pardon my Paint skills):
Here's my current code:
# Copy the image.
im_floodfill = cv2.resize(actual_map_image, (500, 500)).copy()
# Floodfill from point (X, Y)
cv2.floodFill(im_floodfill, None, (X, Y), (255, 255, 255))
# Display images.
cv2.imshow("Floodfilled Image", im_floodfill)
cv2.waitKey(0)
The output I get is equal to the original image. How can I get only the flooded area with borders?
EDIT: I want to floodfill from any white point inside the "arena", like the red dot (X,Y) in the image. I wish to have only the outer border of the small circles inside the arena and the inner border of the outside walls.
EDIT2: I'm halfway there with this:
# Resize for test purposes
actual_map_image = cv2.resize(actual_map_image, (1000, 1000))
actual_map_image = cv2.cvtColor(actual_map_image, cv2.COLOR_BGR2GRAY)
h, w = actual_map_image.shape[:2]
flood_mask = np.zeros((h+2, w+2), dtype=np.uint8)
connectivity = 8
flood_fill_flags = (connectivity | cv2.FLOODFILL_FIXED_RANGE | cv2.FLOODFILL_MASK_ONLY | 255 << 8)
# Copy the image.
im_floodfill = actual_map_image.copy()
# Floodfill from point inside arena, not inside a black dot
cv2.floodFill(im_floodfill, flood_mask, (h//2 + 20, w//2 + 20), 255, None, None, flood_fill_flags)
borders = []
for i in range(len(actual_map_image)):
    borders.append([B-A for A,B in zip(actual_map_image[i], flood_mask[i])])
borders = np.asarray(borders)
borders = cv2.bitwise_not(borders)
# Display images.
cv2.imshow("Original Image", cv2.resize(actual_map_image, (500, 500)))
cv2.imshow("Floodfilled Image", cv2.resize(flood_mask, (500, 500)))
cv2.imshow("Borders", cv2.resize(borders, (500, 500)))
cv2.waitKey(0)
I get this:
However, I feel like this is the wrong way of getting the borders, and they are incomplete.
I think the easiest, and fastest, way to do this is to flood-fill the arena with mid-grey. Then extract just the grey pixels and find their edges. That looks like this, but bear in mind more than half the lines are comments and debug statements :-)
#!/usr/bin/env python3
import cv2
import numpy as np
# Load image as greyscale to use 1/3 of the memory and processing time
im = cv2.imread('arena.png', cv2.IMREAD_GRAYSCALE)
# Floodfill arena area with value 128, i.e. mid-grey
floodval = 128
cv2.floodFill(im, None, (150,370), floodval)
# DEBUG cv2.imwrite('result-1.png', im)
# Extract filled area alone
arena = ((im==floodval) * 255).astype(np.uint8)
# DEBUG cv2.imwrite('result-2.png', arena)
# Find edges and save
edges = cv2.Canny(arena,100,200)
# DEBUG cv2.imwrite('result-3.png',edges)
Here are the 3 steps of debug output showing you the sequence of processing:
result-1.png looks like this:
result-2.png looks like this:
result-3.png looks like this:
By the way, you don't have to write any Python code to do this, as you can just do it in the Terminal with ImageMagick, which is included in most Linux distros and is available for macOS and Windows. The method used here corresponds exactly to the method I used in Python above:
magick arena.png -colorspace gray \
-fill gray -draw "color 370,150 floodfill" \
-fill white +opaque gray -canny 0x1+10%+30% result.png
How about dilating and XOR?
kernel = np.ones((3,3), np.uint8)
dilated = cv2.dilate(actual_map_image, kernel, iterations = 1)
borders = cv2.bitwise_xor(dilated, actual_map_image)
That will give you only the borders. I'm not clear on whether you want the circle borders only or also the interior borders, but you should be able to remove the borders you don't want based on size.
You can remove the exterior border with a size threshold. Define a function like this:
def size_threshold(bw, minimum, maximum):
    retval, labels, stats, centroids = cv2.connectedComponentsWithStats(bw)
    for val in np.where((stats[:, 4] < minimum) + (stats[:, 4] > maximum))[0]:
        labels[labels == val] = 0
    return (labels > 0).astype(np.uint8) * 255
result = size_threshold(borders, 0, 500)
Replace 500 with a number larger than the borders you want to keep and smaller than the border you want to lose.
I had to create my own Flood Fill implementation to get what I wanted. I based myself on this one.
def fill(data, start_coords, fill_value, border_value, connectivity=8):
    """
    Flood fill algorithm

    Parameters
    ----------
    data : (M, N) ndarray of uint8 type
        Image with flood to be filled. Modified inplace.
    start_coords : tuple
        Length-2 tuple of ints defining (row, col) start coordinates.
    fill_value : int
        Value the flooded area will take after the fill.
    border_value : int
        Value of the color to paint the borders of the filled area with.
    connectivity : 4 or 8
        Connectivity which we use for the flood fill algorithm (4-way or 8-way).

    Returns
    -------
    filled_data : ndarray
        The data with the filled area.
    borders : ndarray
        The borders of the filled area painted with border_value color.
    """
    assert connectivity in [4, 8]

    filled_data = data.copy()
    xsize, ysize = filled_data.shape
    orig_value = filled_data[start_coords[0], start_coords[1]]

    stack = set(((start_coords[0], start_coords[1]),))
    if fill_value == orig_value:
        raise ValueError("Filling region with same value already present is unsupported. Did you already fill this region?")

    border_points = []

    while stack:
        x, y = stack.pop()
        if filled_data[x, y] == orig_value:
            filled_data[x, y] = fill_value
            if x > 0:
                stack.add((x - 1, y))
            if x < (xsize - 1):
                stack.add((x + 1, y))
            if y > 0:
                stack.add((x, y - 1))
            if y < (ysize - 1):
                stack.add((x, y + 1))
            if connectivity == 8:
                if x > 0 and y > 0:
                    stack.add((x - 1, y - 1))
                if x > 0 and y < (ysize - 1):
                    stack.add((x - 1, y + 1))
                if x < (xsize - 1) and y > 0:
                    stack.add((x + 1, y - 1))
                if x < (xsize - 1) and y < (ysize - 1):
                    stack.add((x + 1, y + 1))
        else:
            if filled_data[x, y] != fill_value:
                border_points.append([x, y])

    # Fill all image with white
    borders = filled_data.copy()
    borders.fill(255)

    # Paint borders
    for x, y in border_points:
        borders[x, y] = border_value

    return filled_data, borders
The only thing I did was add the else condition. If the point does not have a value equal to orig_value or fill_value, then it is a border, so I append it to a list that contains the points of all the borders. Then I only paint the borders.
I was able to get the following images with this code:
# Resize for test purposes
actual_map_image = cv2.resize(actual_map_image, (500, 500))
actual_map_image = cv2.cvtColor(actual_map_image, cv2.COLOR_BGR2GRAY)
h, w = actual_map_image.shape[:2]
filled_data, borders = fill(actual_map_image, [h//2 + 20, w//2 + 20], 127, 0, connectivity=8)
cv2.imshow("Original Image", actual_map_image)
cv2.imshow("Filled Image", filled_data)
cv2.imshow("Borders", borders)
The one on the right was what I was aiming for. Thank you all!
Say you want to scale a transparent image but do not yet know the color(s) of the background you will composite it onto later. Unfortunately, PIL seems to incorporate the color values of fully transparent pixels, leading to bad results. Is there a way to tell PIL's resize to ignore fully transparent pixels?
import PIL.Image
filename = "trans.png" # http://qrc-designer.com/stuff/trans.png
size = (25,25)
im = PIL.Image.open(filename)
print im.mode # RGBA
im = im.resize(size, PIL.Image.LINEAR) # the same with CUBIC, ANTIALIAS, transform
# im.show() # does not use alpha
im.save("resizelinear_"+filename)
# PIL scaled image has dark border
original image with (0,0,0,0) (black but fully transparent) background (left)
output image with black halo (middle)
proper output scaled with gimp (right)
edit: It looks like, to achieve what I am looking for, I would have to modify the sampling of the resize function itself so that it ignores fully transparent pixels.
edit2: I have found a very ugly solution. It sets the color values of fully transparent pixels to the average of the surrounding non fully transparent pixels to minimize impact of fully transparent pixel colors while resizing. It is slow in the simple form but I will post it if there is no other solution. Might be possible to make it faster by using a dilate operation to only process the necessary pixels.
edit3: premultiplied alpha is the way to go - see Mark's answer
It appears that PIL doesn't do alpha pre-multiplication before resizing, which is necessary to get the proper results. Fortunately it's easy to do by brute force. You must then do the reverse to the resized result.
def premultiply(im):
    pixels = im.load()
    for y in range(im.size[1]):
        for x in range(im.size[0]):
            r, g, b, a = pixels[x, y]
            if a != 255:
                r = r * a // 255
                g = g * a // 255
                b = b * a // 255
                pixels[x, y] = (r, g, b, a)

def unmultiply(im):
    pixels = im.load()
    for y in range(im.size[1]):
        for x in range(im.size[0]):
            r, g, b, a = pixels[x, y]
            if a != 255 and a != 0:
                r = 255 if r >= a else 255 * r // a
                g = 255 if g >= a else 255 * g // a
                b = 255 if b >= a else 255 * b // a
                pixels[x, y] = (r, g, b, a)
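A sketch of how these helpers would wrap the resize call, following the question's code (the file names and the LINEAR resample choice are taken from there):

import PIL.Image

im = PIL.Image.open("trans.png").convert("RGBA")
premultiply(im)                             # fold alpha into the color channels
im = im.resize((25, 25), PIL.Image.LINEAR)  # resize while colors are premultiplied
unmultiply(im)                              # convert back to straight alpha
im.save("resizelinear_premultiplied_trans.png")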
Result:
You can resample each band individually:
im.load()
bands = im.split()
bands = [b.resize(size, Image.LINEAR) for b in bands]
im = Image.merge('RGBA', bands)
EDIT
Maybe by avoiding high transparency values like so (needs numpy):
import numpy as np
# ...
im.load()
bands = list(im.split())
a = np.asarray(bands[-1])
a.flags.writeable = True
a[a != 0] = 1
bands[-1] = Image.fromarray(a)
bands = [b.resize(size, Image.LINEAR) for b in bands]
a = np.asarray(bands[-1])
a.flags.writeable = True
a[a != 0] = 255
bands[-1] = Image.fromarray(a)
im = Image.merge('RGBA', bands)
Maybe you can fill the whole image with the color you want, and only create the shape in the alpha channel?
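A sketch of that idea, assuming a single known foreground color (red here), so the shape lives only in the alpha band and resizing never mixes in a stray background color:

import PIL.Image

im = PIL.Image.open("trans.png").convert("RGBA")
# Flood the color channels with the wanted color, keeping only the original alpha.
solid = PIL.Image.new("RGBA", im.size, (255, 0, 0, 255))
solid.putalpha(im.split()[-1])
resized = solid.resize((25, 25), PIL.Image.LINEAR)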
Sorry for answering myself, but this is the only working solution that I know of. It sets the color values of fully transparent pixels to the average of the surrounding not-fully-transparent pixels, to minimize the impact of fully transparent pixel colors while resizing. There are special cases where the proper result will not be achieved.
It is very ugly and slow. I'd be happy to accept your answer if you can come up with something better.
# might be possible to speed this up by only processing necessary pixels
# using scipy dilate, numpy where
import PIL.Image
import numpy as np

filename = "trans.png" # http://qrc-designer.com/stuff/trans.png
size = (25,25)

im = PIL.Image.open(filename)
npImRgba = np.asarray(im, dtype=np.uint8)
npImRgba2 = np.asarray(im, dtype=np.uint8)
npImRgba2.flags.writeable = True
lenY = npImRgba.shape[0]
lenX = npImRgba.shape[1]
for y in range(npImRgba.shape[0]):
    for x in range(npImRgba.shape[1]):
        if npImRgba[y, x, 3] != 0: # only change completely transparent pixels
            continue
        colSum = np.zeros((3), dtype=np.uint16)
        i = 0
        for oy in [-1, 0, 1]:
            for ox in [-1, 0, 1]:
                if not oy and not ox:
                    continue
                iy = y + oy
                if iy < 0:
                    continue
                if iy >= lenY:
                    continue
                ix = x + ox
                if ix < 0:
                    continue
                if ix >= lenX:
                    continue
                col = npImRgba[iy, ix]
                if not col[3]:
                    continue
                colSum += col[:3]
                i += 1
        if i: # guard against pixels with no non-transparent neighbors
            npImRgba2[y, x, :3] = colSum / i
im = PIL.Image.fromarray(npImRgba2)
im = im.transform(size, PIL.Image.EXTENT, (0,0) + im.size, PIL.Image.LINEAR)
im.save("slime_" + filename)
result:
I am trying to remove a certain color from my image, however it's not working as well as I'd hoped. I tried to do the same thing as seen here: Using PIL to make all white pixels transparent? However, the image quality is a bit lossy, so it leaves a little ghost of odd-colored pixels around what was removed. I tried doing something like "change the pixel if all three values are below 100", but because the image was poor quality the surrounding pixels weren't even black.
Does anyone know of a better way with PIL in Python to replace a color and anything surrounding it? That is probably the only surefire way I can think of to remove the objects completely; however, I can't think of a way to do it.
The picture has a white background and text that is black. Let's just say I want to remove the text entirely from the image without leaving any artifacts behind.
Would really appreciate someone's help! Thanks
The best way to do it is to use the "color to alpha" algorithm used in Gimp to replace a color. It will work perfectly in your case. I reimplemented this algorithm using PIL for phatch, an open source Python photo processor. You can find the full implementation here. It is a pure PIL implementation without other dependencies. You can copy the function code and use it. Here is a sample using Gimp (before and after):
You can apply the color_to_alpha function on the image using black as the color. Then paste the image on a different background color to do the replacement.
By the way, this implementation uses the ImageMath module in PIL. It is much more efficient than accessing pixels using getdata.
EDIT: Here is the full code:
from PIL import Image, ImageMath

def difference1(source, color):
    """When source is bigger than color"""
    return (source - color) / (255.0 - color)

def difference2(source, color):
    """When color is bigger than source"""
    return (color - source) / color

def color_to_alpha(image, color=None):
    image = image.convert('RGBA')
    width, height = image.size

    color = map(float, color)
    img_bands = [band.convert("F") for band in image.split()]

    # Find the maximum difference rate between source and color. I had to use two
    # difference functions because ImageMath.eval only evaluates the expression
    # once.
    alpha = ImageMath.eval(
        """float(
            max(
                max(
                    max(
                        difference1(red_band, cred_band),
                        difference1(green_band, cgreen_band)
                    ),
                    difference1(blue_band, cblue_band)
                ),
                max(
                    max(
                        difference2(red_band, cred_band),
                        difference2(green_band, cgreen_band)
                    ),
                    difference2(blue_band, cblue_band)
                )
            )
        )""",
        difference1=difference1,
        difference2=difference2,
        red_band=img_bands[0],
        green_band=img_bands[1],
        blue_band=img_bands[2],
        cred_band=color[0],
        cgreen_band=color[1],
        cblue_band=color[2]
    )

    # Calculate the new image colors after the removal of the selected color
    new_bands = [
        ImageMath.eval(
            "convert((image - color) / alpha + color, 'L')",
            image=img_bands[i],
            color=color[i],
            alpha=alpha
        )
        for i in xrange(3)
    ]

    # Add the new alpha band
    new_bands.append(ImageMath.eval(
        "convert(alpha_band * alpha, 'L')",
        alpha=alpha,
        alpha_band=img_bands[3]
    ))

    return Image.merge('RGBA', new_bands)

image = color_to_alpha(image, (0, 0, 0, 255))
background = Image.new('RGB', image.size, (255, 255, 255))
background.paste(image.convert('RGB'), mask=image)
Using numpy and PIL:
This loads the image into a numpy array of shape (W,H,3), where W is the
width and H is the height. The third axis of the array represents the 3 color
channels, R,G,B.
from PIL import Image
import numpy as np
orig_color = (255,255,255)
replacement_color = (0,0,0)
img = Image.open(filename).convert('RGB')
data = np.array(img)
data[(data == orig_color).all(axis = -1)] = replacement_color
img2 = Image.fromarray(data, mode='RGB')
img2.show()
Since orig_color is a tuple of length 3, and data has shape (W,H,3), NumPy broadcasts orig_color to an array of shape (W,H,3) to perform the comparison data == orig_color. The result is a boolean array of shape (W,H,3).
(data == orig_color).all(axis = -1) is a boolean array of shape (W,H) which is True wherever the RGB color in data is orig_color.
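A tiny standalone demo of that broadcasting step, using a made-up 2x2 "image":

import numpy as np

data = np.array([[[255, 255, 255], [10, 20, 30]],
                 [[255, 255, 255], [40, 50, 60]]])
orig_color = (255, 255, 255)
mask = (data == orig_color).all(axis=-1)
print(mask)        # [[ True False]
                   #  [ True False]]
print(mask.shape)  # (2, 2) -- one boolean per pixel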
#!/usr/bin/python
from PIL import Image
import sys
img = Image.open(sys.argv[1])
img = img.convert("RGBA")
pixdata = img.load()
# Clean the background noise: if the color is white, set it to black.
# Change these colors to suit your case.
for y in xrange(img.size[1]):
    for x in xrange(img.size[0]):
        if pixdata[x, y] == (255, 255, 255, 255):
            pixdata[x, y] = (0, 0, 0, 255)
You'll need to represent the image as a 2-dimensional array. This means either making a list of lists of pixels, or viewing the 1-dimensional array as a 2D one with some clever math. Then, for each pixel that is targeted, you'll need to find all surrounding pixels. You could do this with a Python generator, thus:
def targets(x, y):
    # Assuming image coordinates: x grows to the right, y grows downward.
    yield (x, y)          # Center
    yield (x + 1, y)      # Right
    yield (x - 1, y)      # Left
    yield (x, y + 1)      # Below
    yield (x, y - 1)      # Above
    yield (x + 1, y + 1)  # Below and to the right
    yield (x + 1, y - 1)  # Above and to the right
    yield (x - 1, y + 1)  # Below and to the left
    yield (x - 1, y - 1)  # Above and to the left
So, you would use it like this:
for x in range(width):
    for y in range(height):
        px = pixels[x][y]
        if px[0] == 255 and px[1] == 255 and px[2] == 255:
            for i, j in targets(x, y):
                # Note: i or j can fall outside the image at the edges; clamp or skip those.
                newpixels[i][j] = replacementColor
If the pixels are not easily identifiable, e.g. you say (r < 100 and g < 100 and b < 100) also doesn't correctly match the black region, it means you have lots of noise.
The best way would be to identify a region and fill it with the color you want. You can identify the region manually, or maybe by edge detection, e.g. http://bitecode.co.uk/2008/07/edge-detection-in-python/
A more sophisticated approach would be to use a library like OpenCV (http://opencv.willowgarage.com/wiki/) to identify objects, as sketched below.
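For instance, a rough sketch of the OpenCV route (written against the modern cv2 API rather than the wiki linked above, with guessed thresholds and a placeholder file name): threshold out the dark text, dilate the mask so the noisy halo is included, then paint the filled contours with the background color.

import cv2
import numpy as np

img = cv2.imread("input.png")  # placeholder file name
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Pixels darker than 100 become the foreground "objects" to remove.
_, mask = cv2.threshold(gray, 100, 255, cv2.THRESH_BINARY_INV)
# Grow the mask slightly so the noisy halo around the text is included.
mask = cv2.dilate(mask, np.ones((3, 3), np.uint8), iterations=2)
# OpenCV 4 returns (contours, hierarchy).
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
# Fill each detected region with the background color (white here).
cv2.drawContours(img, contours, -1, (255, 255, 255), thickness=cv2.FILLED)
cv2.imwrite("output.png", img)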
This is part of my code; the result looks like:
source
target
import os
import struct
from PIL import Image

def changePNGColor(sourceFile, fromRgb, toRgb, deltaRank=10):
    fromRgb = fromRgb.replace('#', '')
    toRgb = toRgb.replace('#', '')

    fromColor = struct.unpack('BBB', bytes.fromhex(fromRgb))
    toColor = struct.unpack('BBB', bytes.fromhex(toRgb))

    img = Image.open(sourceFile)
    img = img.convert("RGBA")
    pixdata = img.load()

    for x in range(0, img.size[0]):
        for y in range(0, img.size[1]):
            # Per-channel distance from the source color.
            rdelta = pixdata[x, y][0] - fromColor[0]
            gdelta = pixdata[x, y][1] - fromColor[1]
            bdelta = pixdata[x, y][2] - fromColor[2]
            if abs(rdelta) <= deltaRank and abs(gdelta) <= deltaRank and abs(bdelta) <= deltaRank:
                pixdata[x, y] = (toColor[0] + rdelta, toColor[1] + gdelta, toColor[2] + bdelta, pixdata[x, y][3])

    img.save(os.path.dirname(sourceFile) + os.sep + "changeColor" + os.path.splitext(sourceFile)[1])

if __name__ == '__main__':
    changePNGColor("./ok_1.png", "#000000", "#ff0000")