How to update numpy array based on coordinates - python

I have an image that is given as below, which I am trying to convert into a binary image.
To convert this image into a binary image, I did:
image[image > 0] = 255
This creates a binary image in which the colored region becomes white. But I also want to set the pixels above the colored region to 255, not just the colored pixels themselves. How could I do this? The area denoted by the arrows (i.e. the area after the colored region) should remain black.
UPDATE
Also, how could I approach this if the edges are as shown below:

If I understood your problem correctly, a more elaborate approach is needed to get the desired result.
First of all, a simple threshold produces a noisy result.
I used a modified image of your sample:
If you apply your thresholding then a result like this might come up:
A finer thresholding can be useful here:
image3 = cv2.inRange(image, np.array([10, 10, 10]), np.array([255, 255, 255]))
which creates a binary image as a result (resembles your desired output except for the upper strip):
To get rid of the strip I would (it's just one approach, not a perfect one) use something to find the corner created by the white region and then use it to paint the whole region above it white:
ind = np.where(image3 == 255)      # coordinates of all white pixels
max_x = np.max(ind[1])             # right-most white column
max_y = ind[0][np.argmax(ind[1])]  # row of that right-most white pixel
image3[:max_y, :max_x] = 255       # paint everything above-left of it white
And the result would be like this:

By all means this is not the perfect answer, but it might be helpful.
I recreated the image as follows:
Then I read it and followed your path, making it binary first (with a small modification to reduce noise):
import numpy as np
from matplotlib import pyplot as plt
img = plt.imread("sample.jpg")
img2 = img.copy()
img2[img2.sum(-1) > 30] = 255
img2[img2.sum(-1) <= 30] = 0
Here is the result after this modification:
OPTION 1
This might not be exactly what you asked for, but it is similar to one of the solutions discussed in the comments, and I think it is partly correct:
i, j = np.where(img2.sum(-1) > 0)      # find all white coordinates
i, j = (i[j.argmax()], j[j.argmax()])  # the corner white point into the black
img2[:i, :j] = 255                     # paint white all of the left-above rectangle from this point
Here is the final result:
This is an imperfect but pretty simple pure numpy solution.
OPTION 2
In this solution we need a bit of analytic geometry: take two points in 2D space and draw a line between them; what is the function of that borderline? Given points (x1, y1) and (x2, y2), the line is y = a*x + c with slope a = (y2 - y1)/(x2 - x1) and intercept c = y1 - a*x1, which is exactly what the snippet below computes.
point2 = (i, j)  # same i and j from OPTION 1 (coordinates of the top-right corner)
point1 = (img2.shape[0], img2[-1].sum(-1).argmin())  # the bottom-right white corner
a = (point2[1] - point1[1]) / (point2[0] - point1[0])  # slope of the borderline
c = point1[1] - a * point1[0]                          # intercept
f = lambda x: int(a * x + c)                           # border column at row x
Now, paint all areas to the left of the line:
for i in range(img2.shape[0]):
    img2[:i, :f(i)+1] = 255
Here is the result:

Related

How to delete or clear contours from image?

I'm working with license plates; what I do is apply a series of filters to them, such as:
Grayscale
Blur
Threshold
Binary
The problem is that when I do this, contours like the ones in this image appear at the borders. How can I clear them, or make them just black (masked)? I used this code but sometimes it fails.
# invert image and detect contours
inverted = cv2.bitwise_not(image_binary_and_dilated)
contours, hierarchy = cv2.findContours(inverted,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
# get the biggest contour
biggest_index = -1
biggest_area = -1
i = 0
for c in contours:
    area = cv2.contourArea(c)
    if area > biggest_area:
        biggest_area = area
        biggest_index = i
    i = i+1
print("biggest area: " + str(biggest_area) + " index: " + str(biggest_index))
cv2.drawContours(image_binary_and_dilated, contours, biggest_index, [0,0,255])
center, size, angle = cv2.minAreaRect(contours[biggest_index])
rot_mat = cv2.getRotationMatrix2D(center, angle, 1.)
#cv2.warpPerspective()
print(size)
dst = cv2.warpAffine(inverted, rot_mat, (int(size[0]), int(size[1])))
mask = dst * 0
x1 = max([int(center[0] - size[0] / 2)+1, 0])
y1 = max([int(center[1] - size[1] / 2)+1, 0])
x2 = int(center[0] + size[0] / 2)-1
y2 = int(center[1] + size[1] / 2)-1
point1 = (x1, y1)
point2 = (x2, y2)
print(point1)
print(point2)
cv2.rectangle(dst, point1, point2, [0,0,0])
cv2.rectangle(mask, point1, point2, [255,255,255], cv2.FILLED)
masked = cv2.bitwise_and(dst, mask)
#cv2_imshow(imgg)
cv2_imshow(dst)
cv2_imshow(masked)
#cv2_imshow(mask)
Some results:
The original plates were:
Good result 1
Good result 2
Good result 3
Good result 4
Bad result 1
Bad result 2
Binary plates are:
Image 1
Image 2
Image 3
Image 4
Image 5 - Bad result 1
Image 6 - Bad result 2
How can I fix this code? I just want to avoid those bad results, or at least improve them.
INTRODUCTION
What you are asking is starting to become complicated, and I believe there is no longer a right or wrong answer, just different ways to do this. Almost all of them will yield positive and negative results, most likely in different ratios. Getting a 100% positive result is quite a challenging task, and I do believe my answer does not reach it. Yet it can be the basis for more sophisticated work towards that goal.
MY PROPOSAL
So, I want to make a different proposal here.
I am not 100% sure why you are doing all the steps, and I believe some of them could be unnecessary.
Let's start from the problem: you want to remove the white parts on the borders (which are not numbers).
So, we need an idea about how to distinguish them from the letters, in order to correctly tackle them.
If we just try to contour and warp, it is likely to work on some images and not on others, because not all of them look the same. This is the hardest part of finding a general solution that works for many images.
What is the difference between the characteristics of the numbers and the characteristics of the borders (and other small points)?
After thinking about that, I would say: the shapes! Meaning, if you imagine a bounding box around a letter/number, it looks like a rectangle whose size is related to the image size, while the borders are usually very large and narrow, or too small to be considered a letter/number (random points).
Therefore, my guess would be segmentation, dividing the features by their shape. So we take the binary image, remove some parts using the projections on the axes (as you asked in the previous question, which I believe we should use here), and we get an image where each letter is separated from the white borders.
Then we can segment and check the shape of each segmented object, and if we think these are letters, we keep them, otherwise we discard them.
THE CODE
I wrote the code below as an example on your data. Some of the parameters are tuned for this set of images, so they may have to be relaxed for a larger dataset.
import cv2
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
import scipy.ndimage as ndimage
# do this for all the images
num_images = 6
plt.figure(figsize=(16,16))
for k in range(num_images):
    # read the image
    binary_image = cv2.imread("binary_image/img{}.png".format(k), cv2.IMREAD_GRAYSCALE)
    # just for visualization purposes, I create another image with the same shape, to show what I am doing
    new_intermediate_image = np.zeros((binary_image.shape), np.uint8)
    new_intermediate_image += binary_image
    # here we will copy only the cleaned parts
    new_cleaned_image = np.zeros((binary_image.shape), np.uint8)

    ### THIS CODE COMES FROM THE PREVIOUS ANSWER:
    # https://stackoverflow.com/questions/62127537/how-to-clean-binary-image-using-horizontal-projection?noredirect=1&lq=1
    (rows, cols) = binary_image.shape
    h_projection = np.array([x/rows for x in binary_image.sum(axis=0)])
    threshold_h = (np.max(h_projection) - np.min(h_projection)) / 10
    print("we will use threshold {} for horizontal".format(threshold_h))
    # select the black areas
    black_areas_horizontal = np.where(h_projection < threshold_h)
    for j in black_areas_horizontal:
        new_intermediate_image[:, j] = 0

    v_projection = np.array([x/cols for x in binary_image.sum(axis=1)])
    threshold_v = (np.max(v_projection) - np.min(v_projection)) / 10
    print("we will use threshold {} for vertical".format(threshold_v))
    black_areas_vertical = np.where(v_projection < threshold_v)
    for j in black_areas_vertical:
        new_intermediate_image[j, :] = 0
    ### UNTIL HERE

    # define the features we are looking for
    # these parameters can also be tuned
    min_width = binary_image.shape[1] / 14
    max_width = binary_image.shape[1] / 2
    min_height = binary_image.shape[0] / 5
    max_height = binary_image.shape[0]
    print("we look for features with width in [{},{}] and height in [{},{}]".format(min_width, max_width, min_height, max_height))

    # segment the image
    labeled_array, num_features = ndimage.label(new_intermediate_image)
    # loop over all features found (labels start at 1; 0 is the background)
    for i in range(1, num_features + 1):
        # get a bounding box around them
        slice_x, slice_y = ndimage.find_objects(labeled_array == i)[0]
        roi = labeled_array[slice_x, slice_y]
        # check the shape; if the bounding box is what we expect, copy it to the new image
        if roi.shape[0] > min_height and \
           roi.shape[0] < max_height and \
           roi.shape[1] > min_width and \
           roi.shape[1] < max_width:
            new_cleaned_image += (labeled_array == i)

    # print all images on a grid
    plt.subplot(num_images, 3, 1+(k*3))
    plt.imshow(binary_image)
    plt.subplot(num_images, 3, 2+(k*3))
    plt.imshow(new_intermediate_image)
    plt.subplot(num_images, 3, 3+(k*3))
    plt.imshow(new_cleaned_image)
This produces the output below (in the grid, the left images are the inputs, the central ones are the images after the mask based on the histogram projections, and the right ones are the cleaned images):
CONCLUSIONS:
As said above, this method does not yield 100% positive results. The last picture has lower quality and some parts are unconnected, so they are lost in the process. I personally believe this is a price worth paying to get cleaner images, and if you have a lot of images, it won't be a problem: you can remove those kinds of images. Overall, I think this method returns quite clean images, where all the parts that are not letters or numbers are correctly removed.
ADVANTAGES
the image is clean, nothing more than letters or numbers are kept
the parameters can be tuned, and should be consistent across images
in case of problems, adding some prints or some debugging in the loop that chooses the features to keep should make it easier to understand where the problems are and correct them
LIMITATIONS
it may fail in cases where letters and numbers touch the white borders, which seems quite possible. This is partly handled by the black areas created using the projections, but I am not so confident this will work 100% of the time.
some small parts of the numbers can be lost during the process, as in the last picture.

Detect if an OCR text image is upside down

I have some hundreds of images (scanned documents), most of them are skewed. I wanted to de-skew them using Python.
Here is the code I used:
import numpy as np
import cv2
from skimage.transform import radon
filename = 'path_to_filename'
# Load file, converting to grayscale
img = cv2.imread(filename)
I = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
h, w = I.shape
# If the resolution is high, resize the image to reduce processing time.
if (w > 640):
    I = cv2.resize(I, (640, int((h / w) * 640)))
I = I - np.mean(I) # Demean; make the brightness extend above and below zero
# Do the radon transform
sinogram = radon(I)
# Find the RMS value of each row and find "busiest" rotation,
# where the transform is lined up perfectly with the alternating dark
# text and white lines
r = np.array([np.sqrt(np.mean(np.abs(line) ** 2)) for line in sinogram.transpose()])
rotation = np.argmax(r)
print('Rotation: {:.2f} degrees'.format(90 - rotation))
# Rotate and save with the original resolution
M = cv2.getRotationMatrix2D((w/2,h/2),90 - rotation,1)
dst = cv2.warpAffine(img,M,(w,h))
cv2.imwrite('rotated.jpg', dst)
This code works well with most of the documents, except with some angles: (180 and 0) and (90 and 270) are often detected as the same angle (i.e. it does not differentiate between 0 and 180, or between 90 and 270). So I get a lot of upside-down documents.
Here is an example:
The resulting image that I get is the same as the input image.
Is there any suggestion to detect if an image is upside down using OpenCV and Python?
PS: I tried to check the orientation using EXIF data, but it didn't lead to any solution.
EDIT:
It is possible to detect the orientation using Tesseract (pytesseract for Python), but it is only possible when the image contains a lot of characters.
For anyone who may need this:
import cv2
import pytesseract
print(pytesseract.image_to_osd(cv2.imread(file_name)))
If the document contains enough characters, Tesseract can detect the orientation. However, when the image has few lines, the orientation angle suggested by Tesseract is usually wrong. So this cannot be a 100% solution.
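If you only need the suggested rotation, the OSD text can be parsed; here is a small sketch (Tesseract's OSD output contains a "Rotate:" line, but the regex below is my assumption about its exact formatting):
import re
import cv2
import pytesseract

osd = pytesseract.image_to_osd(cv2.imread(file_name))
# Tesseract reports e.g. "Rotate: 180"; extract the suggested clockwise rotation
rotation = int(re.search(r'Rotate:\s*(\d+)', osd).group(1))
print('suggested rotation: {} degrees'.format(rotation))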
Python3/OpenCV4 script to align scanned documents.
Rotate the document and sum the rows. When the document has 0 and 180 degrees of rotation, there will be a lot of black pixels in the image:
Use a score-keeping method. Score each image for its likeness to a zebra pattern. The image with the best score has the correct rotation. The image you linked to was off by 0.5 degrees. I omitted some functions for readability; the full code can be found here.
# Rotate the image around in a circle
angle = 0
scores = []  # one score per tested angle
while angle <= 360:
    # Rotate the source image
    img = rotate(src, angle)
    # Crop the center 1/3rd of the image (roi is filled with text)
    h, w = img.shape
    buffer = min(h, w) - int(min(h, w)/1.15)
    roi = img[int(h/2-buffer):int(h/2+buffer), int(w/2-buffer):int(w/2+buffer)]
    # Create background to draw transform on
    bg = np.zeros((buffer*2, buffer*2), np.uint8)
    # Compute the sums of the rows
    row_sums = sum_rows(roi)
    # High score --> Zebra stripes
    score = np.count_nonzero(row_sums)
    scores.append(score)
    # Image has best rotation
    if score <= min(scores):
        # Save the rotated image
        print('found optimal rotation')
        best_rotation = img.copy()
    k = display_data(roi, row_sums, buffer)
    if k == 27: break
    # Increment angle and try again
    angle += .75
cv2.destroyAllWindows()
How to tell if the document is upside down? Fill in the area from the top of the document to the first non-black pixel in the image. Measure the area in yellow. The image that has the smallest area will be the one that is right-side-up:
# Find the area from the top of page to top of image
_, bg = area_to_top_of_text(best_rotation.copy())
right_side_up = sum(sum(bg))
# Flip image and try again
best_rotation_flipped = rotate(best_rotation, 180)
_, bg = area_to_top_of_text(best_rotation_flipped.copy())
upside_down = sum(sum(bg))
# Check which area is larger
if right_side_up < upside_down: aligned_image = best_rotation
else: aligned_image = best_rotation_flipped
# Save aligned image
cv2.imwrite('/home/stephen/Desktop/best_rotation.png', 255-aligned_image)
cv2.destroyAllWindows()
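The helper area_to_top_of_text is among the omitted functions; as a rough sketch only (an assumption based on the description, not the author's actual code), it might look like this:
import numpy as np

def area_to_top_of_text(img):
    # For every column, mark the pixels from the top of the page down to
    # the first non-black pixel; that filled region is the "area to the top".
    bg = np.zeros_like(img)
    for col in range(img.shape[1]):
        nonzero = np.nonzero(img[:, col])[0]
        if nonzero.size:
            bg[:nonzero[0], col] = 1
    return img, bg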
Assuming you already ran the angle correction on the image, you can try the following to find out if it is flipped:
Project the corrected image to the y-axis, so that you get a 'peak' for each line. Important: There are actually almost always two sub-peaks!
Smooth this projection by convolving with a gaussian in order to get rid of fine structure, noise, etc.
For each peak, check if the stronger sub-peak is on top or at the bottom.
Calculate the fraction of peaks that have sub-peaks on the bottom side. This is your scalar value that gives you the confidence that the image is oriented correctly.
The peak finding in step 3 is done by finding sections with above average values. The sub-peaks are then found via argmax.
Here's a figure to illustrate the approach on a few lines of your example image:
Blue: Original projection
Orange: smoothed projection
Horizontal line: average of the smoothed projection for the whole image.
Here's some code that does this:
import cv2
import numpy as np
# load image, convert to grayscale, threshold it at 127 and invert.
page = cv2.imread('Page.jpg')
page = cv2.cvtColor(page, cv2.COLOR_BGR2GRAY)
page = cv2.threshold(page, 127, 255, cv2.THRESH_BINARY_INV)[1]
# project the page to the side and smooth it with a gaussian
projection = np.sum(page, 1)
gaussian_filter = np.exp(-(np.arange(-3, 3, 0.1)**2))
gaussian_filter /= np.sum(gaussian_filter)
smooth = np.convolve(projection, gaussian_filter)
# find the pixel values where we expect lines to start and end
mask = smooth > np.average(smooth)
edges = np.convolve(mask, [1, -1])
line_starts = np.where(edges == 1)[0]
line_endings = np.where(edges == -1)[0]
# count lines with peaks on the lower side
lower_peaks = 0
for start, end in zip(line_starts, line_endings):
    line = smooth[start:end]
    if np.argmax(line) < len(line)/2:
        lower_peaks += 1
print(lower_peaks / len(line_starts))
This prints 0.125 for the given image, so it is not oriented correctly and must be flipped.
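A possible way to act on that score (the 0.5 cut-off is my assumption, not part of the original approach):
# flip the page if fewer than half the lines have their stronger
# sub-peak at the bottom
if lower_peaks / len(line_starts) < 0.5:
    page = cv2.rotate(page, cv2.ROTATE_180)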
Note that this approach might break badly if there are images or anything not organized in lines in the image (maybe math or pictures). Another problem would be too few lines, resulting in bad statistics.
Also, different fonts might result in different distributions. You can try this on a few images and see if the approach works; I don't have enough data.
You can use the Alyn module. To install it:
pip install alyn
Then, to use it to deskew images (taken from the homepage):
from alyn import Deskew
d = Deskew(
input_file='path_to_file',
display_image='preview the image on screen',
output_file='path_for_deskewed image',
r_angle='offest_angle_in_degrees_to_control_orientation')`
d.run()
Note that Alyn is only for deskewing text.

Python 3: I am trying to find all green pixels in an image by traversing all pixels using an np.array, but can't get around an index error

My code currently consists of loading the image, which is successful and I don't believe has any connection to the problem.
Then I go on to transform the color image into a np.array named rgb
# convert image into array
rgb = np.array(img)
red = rgb[:,:,0]
green = rgb[:,:,1]
blue = rgb[:,:,2]
To double check my understanding of this array, in case that may be the root of the issue, it is an array such that rgb[x-coordinate, y-coordinate, color band] which holds the value between 0-255 of either red, green or blue.
Then, my idea was to make a nested for loop to traverse all pixels of my image (620px,400px) and sort them based on the ratio of green to blue and red in an attempt to single out the greener pixels and set all others to black or 0.
for i in range(xsize):
    for j in range(ysize):
        color = rgb[i,j]  # <-- Index error occurs here
        if(color[0] > 128):
            if(color[1] < 128):
                if(color[2] > 128):
                    rgb[i,j] = [0,0,0]
The error I am receiving when trying to run this is as follows:
IndexError: index 400 is out of bounds for axis 0 with size 400
I thought it may have something to do with the bounds I was giving i and j so I tried only sorting through a small inner portion of the image but still got the same error. At this point I am lost as to what is even the root of the error let alone even the solution.
In direct answer to your question, the y axis is given first in numpy arrays, followed by the x axis, so interchange your indices.
Less directly, you will find that for loops are very slow in Python and you are generally better off using numpy vectorised operations instead. Also, you will often find it easier to find shades of green in HSV colourspace.
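For a quick fix in RGB, here is a hedged sketch of the vectorised version of your nested loop (keeping your exact conditions, whatever their intent):
import numpy as np

# rgb is the (height, width, 3) array from np.array(img)
mask = (rgb[:, :, 0] > 128) & (rgb[:, :, 1] < 128) & (rgb[:, :, 2] > 128)
rgb[mask] = [0, 0, 0]  # black out every matching pixel, no Python loops needed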
Let's start with an HSL colour wheel:
and assume you want to make all the greens into black. So, from that Wikipedia page, the Hue corresponding to Green is 120 degrees, which means you could do this:
#!/usr/local/bin/python3
import numpy as np
from PIL import Image
# Open image and make RGB and HSV versions
RGBim = Image.open("image.png").convert('RGB')
HSVim = RGBim.convert('HSV')
# Make numpy versions
RGBna = np.array(RGBim)
HSVna = np.array(HSVim)
# Extract Hue
H = HSVna[:,:,0]
# Find all green pixels, i.e. where 100 < Hue < 140
lo,hi = 100,140
# Rescale to 0-255, rather than 0-360 because we are using uint8
lo = int((lo * 255) / 360)
hi = int((hi * 255) / 360)
green = np.where((H>lo) & (H<hi))
# Make all green pixels black in original image
RGBna[green] = [0,0,0]
count = green[0].size
print("Pixels matched: {}".format(count))
Image.fromarray(RGBna).save('result.png')
Which gives:
Here is a slightly improved version that retains the alpha/transparency, and matches red pixels for extra fun:
#!/usr/local/bin/python3
import numpy as np
from PIL import Image
# Open image and make RGB and HSV versions
im = Image.open("image.png")
# Save Alpha if present, then remove
savedAlpha = None
if 'A' in im.getbands():
    savedAlpha = im.getchannel('A')
    im = im.convert('RGB')
# Make HSV version
HSVim = im.convert('HSV')
# Make numpy versions
RGBna = np.array(im)
HSVna = np.array(HSVim)
# Extract Hue
H = HSVna[:,:,0]
# Find all red pixels, i.e. where 340 < Hue < 20
lo,hi = 340,20
# Rescale to 0-255, rather than 0-360 because we are using uint8
lo = int((lo * 255) / 360)
hi = int((hi * 255) / 360)
red = np.where((H>lo) | (H<hi))
# Make all red pixels black in original image
RGBna[red] = [0,0,0]
count = red[0].size
print("Pixels matched: {}".format(count))
result=Image.fromarray(RGBna)
# Replace Alpha if originally present
if savedAlpha is not None:
    result.putalpha(savedAlpha)
result.save('result.png')
Keywords: Image processing, PIL, Pillow, Hue Saturation Value, HSV, HSL, color ranges, colour ranges, range, prime.

Edge detection for image stored in matrix

I represent images in the form of 2-D arrays. I have this picture:
How can I get the pixels that are directly on the boundaries of the gray region and colorize them?
I want to get the coordinates of the matrix elements in green and red separately. I have only white, black and gray regions on the matrix.
The following should hopefully be okay for your needs (or at least help). The idea is to split the image into the various regions using logical checks based on threshold values. The edges between these regions can then be detected by using numpy roll to shift pixels in x and y and comparing to see if we are at an edge:
import matplotlib.pyplot as plt
import numpy as np
import scipy as sp
from skimage.morphology import closing
thresh1 = 127
thresh2 = 254
#Load image
im = sp.misc.imread('jBD9j.png')
#Get threshold mask for different regions
gryim = np.mean(im[:,:,0:2],2)
region1 = (thresh1<gryim)
region2 = (thresh2<gryim)
nregion1 = ~ region1
nregion2 = ~ region2
#Plot figure and two regions
fig, axs = plt.subplots(2,2)
axs[0,0].imshow(im)
axs[0,1].imshow(region1)
axs[1,0].imshow(region2)
#Clean up any holes, etc (not needed for simple figures here)
#region1 = sp.ndimage.morphology.binary_closing(region1)
#region1 = sp.ndimage.morphology.binary_fill_holes(region1)
#region1.astype('bool')
#region2 = sp.ndimage.morphology.binary_closing(region2)
#region2 = sp.ndimage.morphology.binary_fill_holes(region2)
#region2.astype('bool')
#Get location of edge by comparing array to its
#inverse shifted by a few pixels
shift = -2
edgex1 = (region1 ^ np.roll(nregion1,shift=shift,axis=0))
edgey1 = (region1 ^ np.roll(nregion1,shift=shift,axis=1))
edgex2 = (region2 ^ np.roll(nregion2,shift=shift,axis=0))
edgey2 = (region2 ^ np.roll(nregion2,shift=shift,axis=1))
#Plot location of edge over image
axs[1,1].imshow(im)
axs[1,1].contour(edgex1,2,colors='r',lw=2.)
axs[1,1].contour(edgey1,2,colors='r',lw=2.)
axs[1,1].contour(edgex2,2,colors='g',lw=2.)
axs[1,1].contour(edgey2,2,colors='g',lw=2.)
plt.show()
Which gives the result shown. For simplicity I've used roll with the inverse of each region. You could roll each successive region onto the next to detect edges.
Thank you to @Kabyle for offering a reward; this is a problem that I spent a while looking for a solution to. I tried scipy skeletonize, feature.canny, the topology module and OpenCV with limited success... This way was the most robust for my case (droplet interface tracking). Hope it helps!
There is a very simple solution to this: by definition, any pixel which has both white and gray neighbors is on your "red" edge, and any pixel with both gray and black neighbors is on the "green" edge. The lightest/darkest neighbors are returned by the maximum/minimum filters in skimage.filters.rank, and combining masks of pixels whose lightest/darkest neighbors are white/gray or gray/black respectively produces the edges.
Result:
A worked solution:
import numpy
import skimage.filters.rank
import skimage.morphology
import skimage.io
# convert image to a uint8 image which only has 0, 128 and 255 values
# the source png image provided has other levels in it so it needs to be thresholded - adjust the thresholding method for your data
img_raw = skimage.io.imread('jBD9j.png', as_grey=True)
img = numpy.zeros_like(img_raw, dtype=numpy.uint8)
img[:,:] = 128
img[ img_raw < 0.25 ] = 0
img[ img_raw > 0.75 ] = 255
# define "next to" - this may be a square, diamond, etc
selem = skimage.morphology.disk(1)
# create masks for the two kinds of edges
black_gray_edges = (skimage.filters.rank.minimum(img, selem) == 0) & (skimage.filters.rank.maximum(img, selem) == 128)
gray_white_edges = (skimage.filters.rank.minimum(img, selem) == 128) & (skimage.filters.rank.maximum(img, selem) == 255)
# create a color image
img_result = numpy.dstack( [img,img,img] )
# assign colors to edge masks
img_result[ black_gray_edges, : ] = numpy.asarray( [ 0, 255, 0 ] )
img_result[ gray_white_edges, : ] = numpy.asarray( [ 255, 0, 0 ] )
skimage.io.imshow(img_result)
P.S. Pixels which have black and white neighbors, or neighbors of all three colors, are in an undefined category. The code above doesn't color those. You need to figure out how you want the output to be colored in those cases; but it is easy to extend the approach above to produce another mask or two for that (see the sketch after these notes).
P.S. The edges are two pixels wide. There is no getting around that without more information: the edges are between two areas, and you haven't defined which one of the two areas you want them to overlap in each case, so the only symmetrical solution is to overlap both areas by one pixel.
P.S. This counts the pixel itself as its own neighbor. An isolated white or black pixel on gray, or vice versa, will be considered as an edge (as well as all the pixels around it).
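As a hedged sketch of that extension, one more mask in the same style could flag the black/white-neighbour case (the colour is an arbitrary choice):
# pixels whose neighbourhood contains both black and white: the undefined case
black_white_edges = (skimage.filters.rank.minimum(img, selem) == 0) & \
                    (skimage.filters.rank.maximum(img, selem) == 255)
img_result[black_white_edges, :] = numpy.asarray([0, 0, 255])  # e.g. blue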
While plonser's answer may be rather straight forward to implement, I see it failing when it comes to sharp and thin edges. Nevertheless, I suggest you use part of his approach as preconditioning.
In a second step you want to use the Marching Squares Algorithm. According to the documentation of scikit-image, it is
a special case of the marching cubes algorithm (Lorensen, William and
Harvey E. Cline. Marching Cubes: A High Resolution 3D Surface
Construction Algorithm. Computer Graphics (SIGGRAPH 87 Proceedings)
21(4), July 1987, p. 163-170)
There even exists a Python implementation as part of the scikit-image package. I have been using this algorithm (my own Fortran implementation, though) successfully for edge detection of eye diagrams in communications engineering.
Ad 1: Preconditioning
Create a copy of your image and make it two-color only, e.g. black/white. The coordinates remain the same, but you make sure that the algorithm can properly make a yes/no decision, independent of the values that you use in your matrix representation of the image.
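A minimal sketch of that preconditioning, assuming a grayscale matrix a with values in [0, 1] (the threshold is a guess):
import numpy as np

# everything brighter than the chosen level becomes white, the rest black
two_color = np.where(a > 0.75, 1.0, 0.0)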
Ad 2: Edge Detection
Wikipedia as well as various blogs provide a pretty elaborate description of the algorithm in various languages, so I will not go into its details. However, let me give you some practical advice:
Your image has open boundaries at the bottom. Instead of modifying the algorithm, you can artificially add another row of pixels (black or grey) to bound the white/grey areas.
The choice of the starting point is critical. If there are not too many images to be processed, I suggest you select it manually. Otherwise you will need to define rules. Since the Marching Squares Algorithm can start anywhere inside a bounded area, you could choose any pixel of a given color/value to detect the corresponding edge (it will initially start walking in one direction to find an edge).
The algorithm returns the exact 2D positions, e.g. (x/y)-tuples. You can either
iterate through the list and colorize the corresponding pixels by assigning a different value or
create a mask to select parts of your matrix and assign the value that corresponds to a different color, e.g. green or red.
Finally: Some Post-Processing
I suggested to add an artificial boundary to the image. This has two advantages:
1. The Marching Squares Algorithm works out of the box.
2. There is no need to distinguish between image boundary and the interface between two areas within the image. Just remove the artificial boundary once you are done setting the colorful edges -- this will remove the colored lines at the boundary of the image.
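A hedged sketch of the artificial boundary mentioned above, again assuming a grayscale matrix a (the padding value and width are assumptions):
import numpy as np

# surround the image with a black border so the open areas at the bottom
# become closed regions for the marching squares algorithm
padded = np.pad(a, pad_width=1, mode='constant', constant_values=0.0)
# ... run the contour extraction on `padded` ...
# then crop the artificial border away again, removing any coloured lines on it
a = padded[1:-1, 1:-1]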
Basically, by following pyStarter's suggestion of using the marching squares algorithm from scikit-image, the desired contours can be extracted with the following code:
import matplotlib.pyplot as plt
import numpy as np
import scipy as sp
from skimage import measure
import scipy.ndimage as ndimage
from skimage.color import rgb2gray
from pprint import pprint
#Load image
im = rgb2gray(sp.misc.imread('jBD9j.png'))
n, bins_edges = np.histogram(im.flatten(),bins = 100)
# Skip the black area, and assume two distinct regions, white and grey
max_counts = np.sort(n[bins_edges[0:-1] > 0])[-2:]
thresholds = np.select(
    [max_counts[i] == n for i in range(max_counts.shape[0])],
    [bins_edges[0:-1]] * max_counts.shape[0]
)
# filter out the zero values
thresholds = thresholds[thresholds > 0]
fig, axs = plt.subplots()
# Display image
axs.imshow(im, interpolation='nearest', cmap=plt.cm.gray)
colors = ['r','g']
for i, threshold in enumerate(thresholds):
    contours = measure.find_contours(im, threshold)
    # Display all contours found for this threshold
    for n, contour in enumerate(contours):
        axs.plot(contour[:,1], contour[:,0], colors[i], lw=4)
axs.axis('image')
axs.set_xticks([])
axs.set_yticks([])
plt.show()
However, from your image there is no clearly defined gray region, so I took the two largest counts of intensities in the image and thresholded on these. A bit disturbing is the red region in the middle of the white region, but I think this could be tweaked with the number of bins in the histogram procedure. You could also set the thresholds manually, as Ed Smith did.
Maybe there is a more elegant way to do it, but in case your array is a numpy array with dimensions (N,N) (gray scale) you can do:
import numpy as np
# assuming black -> 0 and white -> 1 and grey -> 0.5
black_reg = np.where(a < 0.1, a, 10)
white_reg = np.where(a > 0.9, a, 10)
xx_black,yy_black = np.gradient(black_reg)
xx_white,yy_white = np.gradient(white_reg)
# getting the coordinates
coord_green = np.argwhere(xx_black**2 + yy_black**2>0.2)
coord_red = np.argwhere(xx_white**2 + yy_white**2>0.2)
The number 0.2 is just a threshold and needs to be adjusted.
I think you are probably looking for an edge detection method for gray scale images. There are many ways to do that; maybe this can help: http://en.m.wikipedia.org/wiki/Edge_detection. For differentiating edges between white and gray from edges between black and gray, try using the local average intensity.
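As a hedged sketch of that hint, assuming a grayscale array a in [0, 1] with black = 0, gray = 0.5 and white = 1 (all thresholds are guesses):
import numpy as np
from scipy.ndimage import uniform_filter, sobel

local_mean = uniform_filter(a, size=5)                      # local average intensity
edges = np.hypot(sobel(a, axis=0), sobel(a, axis=1)) > 0.1  # any strong gradient
white_gray = edges & (local_mean > 0.5)   # bright surroundings: white/gray boundary
black_gray = edges & (local_mean <= 0.5)  # dark surroundings: black/gray boundary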

Changing random pixels in a picture, python

I want to write a function that will create a random number (between m and n, inclusive) of stars in the sky of this picture (http://tinypic.com/r/34il9hu/6). I want the stars to be randomly composed of either a single white pixel or a square of 4 adjacent white pixels. I also do not want to place a 'star' (of 1 pixel) over the tree branches, the moon or the bird.
How would I do this in python? Can anyone help? Thanks!
I have started and have come up with this so far; I don't know if any of it is right or even if I am on the right track:
def randomStars(small, large):
    import random
    file = pickAFile()
    pic = makePicture(file)
    #x = random.randrange(getWidth(pic))
    #y = random.randrange(getHeight(pic))
    for pixel in pic.getAllPixels():
        if random.random() < 0.25:
            pixel.red = random.randint(256)
            pixel.green = random.randint(256)
            pixel.blue = random.randint(256)
    show(pic)
I have no clue what I am doing :(
This looks like a nice example to try the superpixels, as implemented by skimage. You can probably do this in an easier way for your problem.
import urllib
import random
import io
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
import skimage.segmentation
import pandas
# Read the image
f = io.BytesIO(urllib.urlopen('http://oi46.tinypic.com/34il9hu.jpg').read())
img = plt.imread(f, format='jpg')
# Prefer to keep pixels together based on location
# But not too much, so we still get some branches.
superpixel = skimage.segmentation.slic(img, n_segments=200, ratio=20)
plt.imshow(superpixel%7, cmap='Set2')
Now that we have superpixels we can do classification a bit easier, by doing it per superpixel. You could use some fancy classification here, but this example is quite simple, with a blueish sky, let's do it by hand.
# Create a data frame with the relative blueish of every super pixel
# Convert image to hsv
hsv = matplotlib.colors.rgb_to_hsv(img.astype('float32')/255)
# Define blueish as the percentage of pixels in the blueish range of the hue space
df = pandas.DataFrame({'superpixel': superpixel.ravel(),
                       'blue': ((hsv[:,:,0] > 0.4) & (hsv[:,:,0] < 0.8)).astype('float32').ravel(),
                       'value': hsv[:,:,2].ravel()})
grouped = df.groupby('superpixel').mean()
# Lookup the superpixels with the least blue
blue = grouped.sort('blue', ascending=True).head(100)
# Lookup the darkest pixels
light = grouped.sort('value', ascending=True).head(50)
# If superpixels are too dark or too blue, get rid of them
mask = (np.in1d(superpixel, light.index).reshape(superpixel.shape) |
        np.in1d(superpixel, blue.index).reshape(superpixel.shape))
# Now we can put the stars on the blueish, not too darkish areas
def randomstar(img, mask):
    """Place a randomly located, plus-shaped star of 5 white pixels."""
    # stay one pixel away from the borders so the star always fits
    x, y = random.randint(1, img.shape[0]-2), random.randint(1, img.shape[1]-2)
    if not mask[x-1:x+2, y-1:y+2].any():
        # color not so random
        img[x,y,:] = 255
        img[x-1,y,:] = 255
        img[x+1,y,:] = 255
        img[x,y-1,:] = 255
        img[x,y+1,:] = 255

for i in range(100):
    randomstar(img, mask)
plt.imshow(img)
plt.imshow(img)
Python's standard library doesn't come with any powerful-enough image-manipulation code, but there are a few alternatives that are easy to install and use. I'll show how to do this with PIL.
from PIL import Image

def randomStars(small, large):
    import random
    filename = pickAFile()
    im = Image.open(filename)
    max_x, max_y = im.size
    pixels = im.load()
    for i in range(max_x):
        for j in range(max_y):
            if random.random() < 0.25:
                red = random.randint(0, 255)
                green = random.randint(0, 255)
                blue = random.randint(0, 255)
                pixels[i, j] = (red, green, blue, 1)
    im.show()
The show function doesn't display the image in your app (for that, you'd need some kind of GUI with an event loop, like tkinter or PySide); it saves a file to a temporary directory and runs a platform-specific program like Preview or xv to display it.
I assume you're also going to want to save the file. That's easy too:
import os
name, ext = os.path.splitext(filename)
outfilename = '{}-with-stars{}'.format(name, ext)  # ext already includes the dot
im.save(outfilename)
This will save it back to a .jpg with default JPEG settings, relying on PIL guessing what you want from the filename. (Which means, yes, you can save it as a PNG just by using '{}-with-stars.png'.format(name).) If you want more control, PIL can do that too, letting you specify an explicit format and format-specific options.
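For instance (the quality value is just an example of a format-specific option):
im.save(outfilename, format='JPEG', quality=95)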
So far, this is just how to turn your existing code into something that works, that you can play with and start debugging; it doesn't actually answer the original problem.
I want to write a function that will create a random number (between m and n, inclusive) of stars in the sky of this picture
So first, you need this as your loop, instead of a loop over all pixels:
for _ in range(random.randint(m, n)):
Now:
I want the stars to be randomly composed of either a single white pixel, or a square of 4 adjacent white pixels.
x, y = random.randrange(max_x), random.randrange(max_y)
if random.random() < .5:
    # draw white pixel at [x, y]
    pixels[x, y] = (255, 255, 255, 255)
else:
    # draw square at [x, y], making sure to handle edges
    pass
I also do not want to place a 'star' (of 1 pixel) over the tree branches, the moon or the bird though
You need to define how you know what's part of a tree branch, the moon, or the bird. Can you define that in terms of pixel colors?
From a quick glance, it looks like you might be able to. The moon's pixels are all brighter, more saturated, more red-biased, etc. than anything else (except the AP logo in the corner, which is even brighter). The bird and the branches are darker than anything else. In fact, they're so distinct that you probably don't even have to worry about doing correct colorspace math; it may be as simple as something like this:
r, g, b, a = pixels[x, y]
fake_brightness = r + g + b + a
if fake_brightness < 0.2:
    pass  # Tree or bird, pick a new random position
elif 1.2 < fake_brightness < 2.8:
    pass  # Moon, pick a new random position
else:
    pass  # Sky or AP logo, scribble away
(Those numbers are obviously just pulled out of thin air, but a bit of trial and error should give you usable values.)
Of course if you're doing this as a learning exercise, you probably want to learn the correct colorspace math, and maybe even write an edge-detection algorithm, rather than relying on this image being so simply-parseable.
Try an approach like:
for n in xrange(number_of_stars):
    # Find a good position
    while True:
        x, y = random_coords_in_image()
        if is_sky(image, x, y):
            break
    # paint a star there
    c = star_colour()
    if large_star():
        image.put(x, y, c)
        image.put(x, y+1, c)
        image.put(x+1, y+1, c)
        image.put(x+1, y, c)
    else:
        image.put(x, y, c)
The functions I used are pretty self-explanatory; you can implement is_sky by checking some conditions on the image colour at a given place.
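A hedged sketch of what is_sky might look like for this picture; note that image.get is a made-up accessor mirroring the pseudocode's image.put, and the thresholds are guesses:
def is_sky(image, x, y):
    # hypothetical pixel accessor; adapt to your image library
    r, g, b = image.get(x, y)[:3]
    # sky here is a fairly bright blue; branches and the bird are dark,
    # the moon is bright but red/yellow-biased
    return b > 80 and b >= r and b >= g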
