Divide an image into single objects (coins) for machine learning - Python

I want to automatically divide an image of multiple coins into single coins, so that afterwards each single coin can be put into a model that classifies it (with TensorFlow/Keras).
The input images look something like this:
And they should look like this (sorry that I cannot embed the images directly, as I'm new to Stack Overflow).
I want to divide the input image and put the single coins into a classification model, so that I know the value of each single coin and can thereby determine the total value of the original input image.
I already tried an object detection model, but it didn't detect the coins (https://towardsdatascience.com/object-detection-with-10-lines-of-code-d6cb4d86f606). Since I already know that all objects in the image are coins, I thought there might be an easier way to divide the image?
Thank you in advance.

I would try color segmentation similar to this tutorial as a first step to separate the coins from the background. Here is my quick try in Python using OpenCV:
import cv2
import numpy as np
import matplotlib.pyplot as plt

# read the image and convert it to HSV for color-based segmentation
img = cv2.imread("coins.jpg")
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# keep everything that is reasonably saturated (i.e. not the pale background)
lower = np.array([0, 40, 5])
upper = np.array([255, 255, 255])
mask = cv2.inRange(hsv, lower, upper)

plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
plt.show()
plt.imshow(mask, cmap="gray")
plt.show()
This brings us from the input image
to this mask:
From here, using blob analysis and a size filter, you should be able to find and separate the unconnected coins. Disconnecting the overlapping areas could be achieved using active contours, or, since your goal is to create a training data set, by shifting the coins before taking the picture.
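A minimal sketch of that blob-analysis step, assuming the img and mask from the snippet above and a hypothetical minimum blob area:

# label connected blobs in the mask and keep only those above a size threshold
num_labels, labels, stats, centroids = cv2.connectedComponentsWithStats(mask)
min_area = 1000  # assumption: tune this to your image resolution
for label in range(1, num_labels):  # label 0 is the background
    x, y, w, h, area = stats[label]
    if area < min_area:
        continue
    coin = img[y:y + h, x:x + w]
    cv2.imwrite("coin_{}.png".format(label), coin)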

import cv2
import imutils
import numpy as np
import matplotlib.pyplot as plt

def display(img, count, cmap="gray"):
    # show the original image next to the processed one, with the coin count
    f_image = cv2.imread("coins.jpg")
    f, axs = plt.subplots(1, 2, figsize=(12, 5))
    axs[0].imshow(f_image, cmap="gray")
    axs[1].imshow(img, cmap="gray")
    axs[1].set_title("Total Money Count = {}".format(count))

image = cv2.imread("coins.jpg")
# heavy median blur removes the coin engravings so only the round shapes remain
image_blur = cv2.medianBlur(image, 25)
image_blur_gray = cv2.cvtColor(image_blur, cv2.COLOR_BGR2GRAY)
# inverted threshold: coins become white, background black
image_res, image_thresh = cv2.threshold(image_blur_gray, 240, 255, cv2.THRESH_BINARY_INV)
# opening removes small noise blobs
kernel = np.ones((3, 3), np.uint8)
opening = cv2.morphologyEx(image_thresh, cv2.MORPH_OPEN, kernel)
# distance transform + threshold shrinks touching coins into separate blobs
dist_transform = cv2.distanceTransform(opening, cv2.DIST_L2, 5)
ret, last_image = cv2.threshold(dist_transform, 0.3 * dist_transform.max(), 255, 0)
last_image = np.uint8(last_image)

cnts = cv2.findContours(last_image.copy(), cv2.RETR_EXTERNAL,
                        cv2.CHAIN_APPROX_SIMPLE)
cnts = imutils.grab_contours(cnts)

for (i, c) in enumerate(cnts):
    # label each detected coin with its index
    ((x, y), _) = cv2.minEnclosingCircle(c)
    cv2.putText(image, "#{}".format(i + 1), (int(x) - 45, int(y) + 20),
                cv2.FONT_HERSHEY_SIMPLEX, 2, (255, 0, 0), 5)
    cv2.drawContours(image, [c], -1, (0, 255, 0), 2)

display(image, len(cnts))
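Since the goal is to classify the coins individually, a minimal follow-up sketch (assuming the cnts list computed above) that crops each detected coin from a clean copy of the picture and writes it to its own file:

# crop each detected coin from an unannotated copy so the drawn labels are not included
clean = cv2.imread("coins.jpg")
for (i, c) in enumerate(cnts):
    x, y, w, h = cv2.boundingRect(c)
    coin = clean[y:y + h, x:x + w]
    cv2.imwrite("coin_{}.png".format(i + 1), coin)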

Related

Bulk removing unwanted parts of images

I have downloaded a number of images (1000) from a website but they each have a black and white ruler running along 1 or 2 edges and some have these catalogue number tickets. I need these elements removed, the ruler at the very least.
Example images of coins:
The images all have the ruler in slightly different places, so I can't just perform the same crop on them.
So I tried to remove the black and replace it with white using this code:
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
im = Image.open('image-0.jpg')
im = im.convert('RGBA')
data = np.array(im) # "data" is a height x width x 4 numpy array
red, green, blue, alpha = data.T # Temporarily unpack the bands for readability
# Replace black with white
black_areas = (red < 150) & (blue < 150) & (green < 150)
data[..., :-1][black_areas.T] = (255, 255, 255) # Transpose back needed
im2 = Image.fromarray(data)
im2.show()
but it pretty much just removed half the coin as well:
I was having a read of some posts on OpenCV, but thought I'd first see if there was a simpler way I'd missed.
So I have taken a look at your problem and I have found a solution for the two images you provided. I hope it works for your other images as well, but that is always hard to tell, as they can differ on an individual basis. This solution uses OpenCV for preprocessing and contour detection to get the 2nd and 3rd largest elements in your picture (the largest is the bounding box around the edges), which should be your coins. Then I create a box around those two items and add some padding before cropping to size.
So we start off with preprocessing:
import numpy as np
import cv2
img = cv2.imread(r'<PATH TO YOUR IMAGE>')
img = cv2.resize(img, None, fx=3, fy=3)
imgray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(imgray, (5, 5), 0)
ret, thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)
Still rather basic: we make the image bigger so it is easier to detect contours, then we turn it into grayscale, blur it, and apply Otsu thresholding so every grey value is turned either white or black. This then gives us the following image:
We now do contour detection, compute the areas of our contours and sort them by area, largest first. Then we drop the largest one, as it is the box around the whole image, and take the 2nd and 3rd largest. Finally, we get the x, y, w, h values we are interested in.
contours, hierarchy = cv2.findContours(
    thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
areas = []
for cnt in contours:
    area = cv2.contourArea(cnt)
    areas.append((area, cnt))
areas.sort(key=lambda x: x[0], reverse=True)
areas.pop(0)
x, y, w, h = cv2.boundingRect(areas[0][1])
x2, y2, w2, h2 = cv2.boundingRect(areas[1][1])
If we draw a rectangle around those contours:
Now we take those coordinates and create a box around both of them. This might need some minor adjusting, as I just quickly took the bigger width of the two rather than the corresponding one for the right coin, but since I added extra padding it should be fine in most cases. And finally we crop to size:
pad = 15
img = img[(min(y, y2) - pad):(max(y, y2) + max(h, h2) + pad),
          (min(x, x2) - pad):(max(x, x2) + max(w, w2) + pad)]
I hope this helps you understand how you could achieve what you want. I tried it on both of your images and it worked well for them. It might need some adjustments, and depending on how your other images look, the simple approach of taking the two biggest objects (apart from the image bounding box) might have to be turned into something more sophisticated that detects the circular shapes, or something along those lines. Alternatively, you could try to detect the rulers and crop from their position inwards. You will have to decide after you have tried this on more example images from your dataset.
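For the more sophisticated route, here is a minimal sketch using cv2.HoughCircles to find the circular coins directly; the dp/minDist/radius parameters are assumptions and will need tuning per image:

import cv2
import numpy as np

img = cv2.imread(r'<PATH TO YOUR IMAGE>')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = cv2.medianBlur(gray, 5)

# detect circles; all numeric parameters are assumptions to tune to your images
circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, dp=1.2, minDist=100,
                           param1=100, param2=60, minRadius=30, maxRadius=300)
if circles is not None:
    pad = 15
    for i, (cx, cy, r) in enumerate(np.round(circles[0]).astype(int)):
        x0, y0 = max(cx - r - pad, 0), max(cy - r - pad, 0)
        crop = img[y0:cy + r + pad, x0:cx + r + pad]
        cv2.imwrite('coin_{}.png'.format(i), crop)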
If you're looking for a robust solution, you should try something like Max Kaha's response, since it'll provide you with greater fine tuning.
Since the rulers tend to be left with just a little bit of text after your "black to white" filter, a quick solution is to use erosion followed by a dilation to create a mask for your images, and then apply the mask to the original image.
Pillow offers that with the ImageFilter class. Here's your code with a few modifications that'll achieve that:
from PIL import Image, ImageFilter
import numpy as np
import matplotlib.pyplot as plt
WHITE = 255, 255, 255
input_image = Image.open('image.png')
input_image = input_image.convert('RGBA')
input_data = np.array(input_image) # "data" is a height x width x 4 numpy array
red, green, blue, alpha = input_data.T # Temporarily unpack the bands for readability
# Replace black with white
thresh = 30
black_areas = (red < thresh) & (blue < thresh) & (green < thresh)
input_data[..., :-1][black_areas.T] = WHITE # Transpose back needed
erosion_factor = 5
# dilation is bigger to avoid cropping the objects of interest
dilation_factor = 11
erosion_filter = ImageFilter.MaxFilter(erosion_factor)
dilation_filter = ImageFilter.MinFilter(dilation_factor)
eroded = Image.fromarray(input_data).filter(erosion_filter)
dilated = eroded.filter(dilation_filter)
mask_threshold = 220
# the mask is black on regions to be hidden
mask = dilated.convert('L').point(lambda x: 255 if x < mask_threshold else 0)
# create base image
output_image = Image.new('RGBA', input_image.size, WHITE)
# paste only the desired regions
output_image.paste(input_image, mask=mask)
output_image.show()
You should also play around with the black to white threshold and the erosion/dilation factors to try and find the best fit for most of your images.
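A minimal sketch of such a parameter sweep, assuming the pipeline above is wrapped in a hypothetical clean_image() helper, writing one output per combination so they can be compared side by side:

from PIL import Image, ImageFilter
import numpy as np

WHITE = 255, 255, 255

# hypothetical helper wrapping the pipeline above, parameterized on the knobs to tune
def clean_image(path, black_thresh, erosion_factor, dilation_factor, mask_threshold=220):
    input_image = Image.open(path).convert('RGBA')
    data = np.array(input_image)
    red, green, blue, alpha = data.T
    black_areas = (red < black_thresh) & (blue < black_thresh) & (green < black_thresh)
    data[..., :-1][black_areas.T] = WHITE
    eroded = Image.fromarray(data).filter(ImageFilter.MaxFilter(erosion_factor))
    dilated = eroded.filter(ImageFilter.MinFilter(dilation_factor))
    mask = dilated.convert('L').point(lambda x: 255 if x < mask_threshold else 0)
    output_image = Image.new('RGBA', input_image.size, (255, 255, 255, 255))
    output_image.paste(input_image, mask=mask)
    return output_image

# try a few (assumed) combinations; filter sizes must be odd
for black_thresh in (20, 30, 40):
    for erosion_factor, dilation_factor in ((3, 9), (5, 11), (7, 13)):
        out = clean_image('image.png', black_thresh, erosion_factor, dilation_factor)
        out.save('out_t{}_e{}_d{}.png'.format(black_thresh, erosion_factor, dilation_factor))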

How to fill up the entire internal contour using opencv

In the following image, I'm trying to blacken out the internal design band with the name "jupiter"
MY DESIRED RESULT IS THE FOLLOWING
I've tried using RETR_EXTERNAL and then fillPoly, but it only blackens out the "white" portion (the band) of the binary image and not the whole region as I want it to.
How would I need to improvise to get it to blacken it out completely?
You can use the masking technique to get your work done. Here is my code:
import cv2
import os
import numpy as np
import matplotlib.pyplot as plt
import copy
%matplotlib inline
#-----------------------------------------------------------------------------------
I = cv2.imread('E:\\Mukul\\others\\stof.png') #input image
#I.shape
I_cnt = np.where(I[:,:,2] == 255) #location of your bounding box region
I_mask = np.zeros_like(I[:,:,2]) # mask for the input image
I_mask[list(I_cnt[0]), list(I_cnt[1])] = 255
plt.imshow(I_mask, cmap = 'gray')
I_cnt1, _ = cv2.findContours(I_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
np.array(I_cnt1).shape # (1, 1420, 1, 2)
cv2.fillConvexPoly(I_mask, I_cnt1[0], 255)
plt.imshow(I_mask,cmap = 'gray')
Since we want our bounding box region to be black, we will invert our image using cv2.bitwise_not() and then use cv2.bitwise_and() to get the required output image.
I_mask1 = cv2.bitwise_not(I_mask)
out = cv2.bitwise_and(I_mask1, I[:,:,2])
plt.imshow(out,cmap = 'gray')
Instead of using the lines above to find contours of our binary mask and then filling the region with cv2.fillConvexPoly(), we can directly convert I_cnt[0] (the array containing the x-coordinates) and I_cnt[1] (the array containing the y-coordinates) to an array of (x, y) coordinates using the following code:
temp_list = []
for a, b in zip(I_cnt[0], I_cnt[1]):
    temp_list.append([a, b])
ctr = np.array(temp_list).reshape((-1, 1, 2)).astype(np.int32)
I_mask2 = np.zeros_like(I[:, :, 2])
I_mask2[list(I_cnt[0]), list(I_cnt[1])] = 255
plt.imshow(I_mask2, cmap='gray')
cv2.fillConvexPoly(I_mask2, ctr, 255)
plt.imshow(I_mask2, cmap='gray')

How can I find the center of the pattern and the distribution of a color around it on python with opencv/skimage?

I have an image that I want to process. I'm using OpenCV and skimage. My goal is to find the distribution of the red dots around the barycenter of all the dots. I proceed as follows: first I select the color, and then I binarize the image that I obtain. Eventually, I would just count the red pixels that lie in rings of a certain width around that barycenter, in order to get an average distribution with respect to the radius, assuming cylindrical symmetry.
My issue is that I have no idea how to find the position of the barycenter.
I would also like to know if there is a short way to count the red pixels in the rings.
Here is my code :
import cv2
import matplotlib.pyplot as plt
from skimage import io, filters, measure, color, external
I'm loading the image:
sph = cv2.imread('image_sper.jpg')
sph = cv2.cvtColor(sph, cv2.COLOR_BGR2RGB)
plt.imshow(sph)
plt.show()
I want to select the red color. Following https://realpython.com/python-opencv-color-spaces/, I'm converting it to HSV and using a mask.
hsv_sph = cv2.cvtColor(sph, cv2.COLOR_RGB2HSV)
light_red = (1, 100, 100)
dark_red = (18, 255, 255)
mask = cv2.inRange(hsv_sph, light_red, dark_red)
result = cv2.bitwise_and(sph, sph, mask=mask)
And here is the result :
plt.imshow(result)
plt.show()
Now I'm binarizing the image, since it'll be easier to process it afterwards.
red_image = result[:,:,1]
red_th = filters.threshold_otsu(red_image)
red_mask = red_image > red_th;
red_mask.dtype ;
io.imshow(red_mask);
And here we are :
What I would like some help with now is finding the barycenter of the white pixels.
Thx
Edit: The binarization gives the image boolean values (False/True) for the pixels. I don't know how to transform them to 0/1 pixels. If False were 0 and True were 1, code to find the barycenter would be:
np.shape(red_mask)
# (321L, 316L)
bari = 0
barj = 0
N = 0
for i in range(321):
    for j in range(316):
        bari = bari + red_mask[i, j] * i
        barj = barj + red_mask[i, j] * j
        N = N + red_mask[i, j]
bari = bari / N
barj = barj / N
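A note on that loop: NumPy already treats boolean arrays as 0/1 in arithmetic, so no explicit conversion is needed; a vectorized sketch of the same barycenter, assuming the red_mask above, would be:

import numpy as np

N = red_mask.sum()  # booleans count as 0/1 in sums
bari = (np.arange(red_mask.shape[0])[:, None] * red_mask).sum() / N
barj = (np.arange(red_mask.shape[1])[None, :] * red_mask).sum() / N
# equivalently: bari, barj = np.argwhere(red_mask).mean(axis=0)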
Another question that should have been asked here: http://answers.opencv.org/questions/
But, let's go!
The process that I have implemented uses mostly structural analysis (https://docs.opencv.org/3.3.1/d3/dc0/group__imgproc__shape.html#ga17ed9f5d79ae97bd4c7cf18403e1689a)
First I got your image:
import cv2
import matplotlib.pyplot as plt
import numpy as np
from skimage import io, filters, measure, color, external
sph = cv2.imread('points.png')
ret,thresh = cv2.threshold(sph,200,255,cv2.THRESH_BINARY)
Then eroded and converted it for noise reduction
kernel = np.ones((2,2),np.uint8)
opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel)
opening = cv2.cvtColor(opening, cv2.COLOR_BGR2GRAY);
opening = cv2.convertScaleAbs(opening)
Then used "cv::findContours (InputOutputArray image, OutputArrayOfArrays contours, OutputArray hierarchy, int mode, int method, Point offset=Point())" to find all blobs.
After that, just calculate the center of each region and do a weighted average based on the contour area. This way, I got the points centroid (X:143.4202820443726 , Y:154.56471750651224).
im2, contours, hierarchy = cv2.findContours(opening, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
areas = []
centersX = []
centersY = []
for cnt in contours:
    areas.append(cv2.contourArea(cnt))
    M = cv2.moments(cnt)
    centersX.append(int(M["m10"] / M["m00"]))
    centersY.append(int(M["m01"] / M["m00"]))

full_areas = np.sum(areas)
acc_X = 0
acc_Y = 0
for i in range(len(areas)):
    acc_X += centersX[i] * (areas[i] / full_areas)
    acc_Y += centersY[i] * (areas[i] / full_areas)
print(acc_X, acc_Y)

cv2.circle(sph, (int(acc_X), int(acc_Y)), 5, (255, 0, 0), -1)
plt.imshow(sph)
plt.show()
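The question also asks for a short way to count the red pixels in rings around the barycenter. A minimal sketch of that radial distribution, assuming the red_mask binary image from the question and the acc_X/acc_Y centroid computed above (the ring width is an assumption to tune):

import numpy as np

ys, xs = np.nonzero(red_mask)            # coordinates of the red pixels
r = np.hypot(xs - acc_X, ys - acc_Y)     # distance of each red pixel to the barycenter
ring_width = 5                           # assumption: ring width in pixels
counts, edges = np.histogram(r, bins=np.arange(0, r.max() + ring_width, ring_width))
for count, r_in, r_out in zip(counts, edges[:-1], edges[1:]):
    print("ring {:.0f}-{:.0f} px: {} red pixels".format(r_in, r_out, count))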

Merge MSER detected objects (OpenCV, Python)

I am working on this image as source:
Applying the following code...
import cv2
import numpy as np
mser = cv2.MSER_create()
img = cv2.imread('C:\\Users\\Link\\Desktop\\test2.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
vis = img.copy()
regions, _ = mser.detectRegions(gray)
hulls = [cv2.convexHull(p.reshape(-1, 1, 2)) for p in regions]
cv2.polylines(vis, hulls, 1, (0, 255, 0))
mask = np.zeros((img.shape[0], img.shape[1], 1), dtype=np.uint8)
for contour in hulls:
    cv2.drawContours(mask, [contour], -1, (255, 255, 255), -1)
text_only = cv2.bitwise_and(img, img, mask=mask)
cv2.imshow('img', vis)
cv2.waitKey(0)
cv2.imshow('img', mask)
cv2.waitKey(0)
cv2.imshow('img', text_only)
cv2.waitKey(0)
cv2.imwrite('C:\\Users\\Link\\Desktop\\test_o\\1.png', text_only)
...I am obtaining this as result (mask):
The question is this:
how can I merge the number 5 in the number series (157661546) into a single object, given that it is split into pieces in the mask image?
Thanks
Have a look here, it seems like the exact answer.
Here, instead, is my version of the above code, fine-tuned for text extraction (with masking too).
Below is the original code from the previous article, "ported" to Python 3 and OpenCV 3, with MSER and bounding boxes added. The main difference with my version is how the grouping distance is defined: mine is text-oriented, while the one below uses a free geometrical distance.
import sys
import cv2
import numpy as np

def find_if_close(cnt1, cnt2):
    # returns True if any two points of the contours are closer than the threshold
    row1, row2 = cnt1.shape[0], cnt2.shape[0]
    for i in range(row1):
        for j in range(row2):
            dist = np.linalg.norm(cnt1[i] - cnt2[j])
            if abs(dist) < 25:  # <-- threshold
                return True
            elif i == row1 - 1 and j == row2 - 1:
                return False

img = cv2.imread(sys.argv[1])
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
cv2.imshow('input', img)

ret, thresh = cv2.threshold(gray, 127, 255, 0)
mser = False
if mser:
    mser = cv2.MSER_create()
    regions = mser.detectRegions(thresh)
    hulls = [cv2.convexHull(p.reshape(-1, 1, 2)) for p in regions[0]]
    contours = hulls
else:
    thresh = cv2.bitwise_not(thresh)  # wants black bg
    im2, contours, hier = cv2.findContours(thresh, cv2.RETR_EXTERNAL, 2)

cv2.drawContours(img, contours, -1, (0, 0, 255), 1)
cv2.imshow('base contours', img)

LENGTH = len(contours)
status = np.zeros((LENGTH, 1))
print("Elements:", len(contours))
# mark contours that are close to each other with the same status value
for i, cnt1 in enumerate(contours):
    x = i
    if i != LENGTH - 1:
        for j, cnt2 in enumerate(contours[i + 1:]):
            x = x + 1
            dist = find_if_close(cnt1, cnt2)
            if dist == True:
                val = min(status[i], status[x])
                status[x] = status[i] = val
            else:
                if status[x] == status[i]:
                    status[x] = i + 1

# merge each group of close contours into a single convex hull
unified = []
maximum = int(status.max()) + 1
for i in range(maximum):
    pos = np.where(status == i)[0]
    if pos.size != 0:
        cont = np.vstack([contours[i] for i in pos])
        hull = cv2.convexHull(cont)
        unified.append(hull)

cv2.drawContours(img, contours, -1, (0, 0, 255), 1)
cv2.drawContours(img, unified, -1, (0, 255, 0), 2)
#cv2.drawContours(thresh, unified, -1, 255, -1)

for c in unified:
    (x, y, w, h) = cv2.boundingRect(c)
    cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 0), 2)

cv2.imshow('result', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
Sample output (the yellow blob is below the binary threshold conversion so it's ignored). Red: original contours, green: unified ones, blue: bounding boxes.
Probably there is no need to use MSER as a simple findContours may work fine.
------------------------
Starting from here is my old answer, from before I found the above code. I'm leaving it anyway as it describes a couple of different approaches that may be easier/more appropriate for some scenarios.
A quick and dirty trick is to add a small Gaussian blur and a high threshold before the MSER (or some dilate/erode if you prefer fancy things). In practice you just make the text bolder so that it fills small gaps. Obviously you can later discard this version and crop from the original one.
Otherwise, if your text is in lines, you may try to detect the average line center (make a histogram of Y coordinates and find the peaks, for example). Then, for each line, look for fragments with a close average X. Quite fragile if the text is noisy/complex.
If you do not need to split each letter, getting the bounding box for the whole word, may be easier: just split in groups based on a maximum horizontal distance between fragments (using the leftmost/rightmost points of the contour). Then use the leftmost and rightmost boxes within each group to find the whole bounding box. For multiline text first group by centroids Y coordinate.
Implementation notes:
Opencv allows you to create histograms but you probably can get away with something like this (worked for me on a similar task):
def histogram(vals, th=4, bins=400):
    hist = np.zeros(bins)
    for y_center in vals:
        bucket = int(round(y_center / 2.))  # <-- change this "2."
        hist[bucket - 1] += 1
    print("hist: ", hist)
    hist = np.where(hist > th, hist, 0)
    return hist
Here my histogram is just an array with 400 buckets (my image was 800px high so each bucket catches two pixels, that is where the "2." comes from). Vals are the Y coordinates of the centroids of each fragment (you may want to ignore very small elements when you build this list). The th threshold is there just to remove some noise. You should get something like this:
0,0,0,5,22,0,0,0,0,43,7,0,0,0
This list describes, moving top to bottom, how many fragments are at each location.
Now I ran another pass to merge the peaks into a single value (just scan the array and sum while it is non-zero and reset the count on first zero) getting something like this {y:count}:
{9:27, 20:50}
Now I know I have two text rows at y=9 and y=20. Now, or before, you assign each fragment to one line (with again an 8px threshold in my case). Now you can process each line on its own, finding "words". BTW, I have your identical problem with broken letters, that's why I came here looking for MSER :). Notice that if you find the whole bounding box for the word, this problem only happens on the first/last letters: the other broken letters just fall inside the word box anyway.
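A minimal sketch of that peak-merging pass (assuming the hist array returned by histogram() above), accumulating consecutive non-zero buckets into a single {line position: fragment count} entry:

def merge_peaks(hist):
    peaks = {}                     # maps line position (bucket index) to fragment count
    start, total = None, 0
    for i, v in enumerate(hist):
        if v > 0:
            if start is None:
                start = i
            total += int(v)
        elif start is not None:
            peaks[start] = total   # close the current run on the first zero
            start, total = None, 0
    if start is not None:          # handle a run that reaches the end of the array
        peaks[start] = total
    return peaks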
Here is a reference for the erode/dilate thing, but gaussian blur/th worked for me.
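For completeness, a minimal sketch of that blur/threshold preprocessing (the file name, kernel size and threshold value are assumptions to tune): blurring dark text on a white background smears the strokes into the gaps, and a high binary threshold then keeps the thickened strokes as single blobs.

import cv2

img = cv2.imread('text.png', cv2.IMREAD_GRAYSCALE)
# small blur makes the strokes bleed into the gaps between broken fragments
blurred = cv2.GaussianBlur(img, (5, 5), 0)
# high threshold keeps the thickened strokes as one blob per letter/word
ret, bold = cv2.threshold(blurred, 200, 255, cv2.THRESH_BINARY)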
UPDATE: I've noticed that there is something wrong in this line:
regions = mser.detectRegions(thresh)
I pass in the already thresholded image(!?). This is not relevant for the aggregation part but keep in mind that the mser part is not being used as expected.

python opencv fill contours which are not completely closed

I use openCV to find the external contour of a given image and fill it.
The images I use as input are images of pants like the one attached. The problem is that sometimes (like in the attached image) the contour is not completely closed and then I can't fill it. What can I do in this case?
Please see code below.
Thanks, Li
from PIL import Image
import os
import numpy
import bs4
import scipy
import cv2
image_obj_original = cv2.imread(image_file)
image_name = os.path.split(image_file)[-1]
name, extension = os.path.splitext(image_name)
# normalize to a standard size
image_obj = cv2.resize(image_obj_original, STANDARD_SIZE)
imWithBorder = cv2.copyMakeBorder(image_obj, 10, 10, 10, 10, cv2.BORDER_CONSTANT, value=[255, 255, 255])
# convert to grey-scale
greyscale_image = cv2.cvtColor(imWithBorder,cv2.COLOR_BGR2GRAY)
# get canny edges
canny_edges = cv2.Canny(greyscale_image, 1, 255)
h, w = canny_edges.shape[:2]
contours0, hierarchy = cv2.findContours( canny_edges.copy(), cv2.RETR_EXTERNAL , cv2.CHAIN_APPROX_SIMPLE)
contours = [cv2.approxPolyDP(cnt, 3, True) for cnt in contours0]
vis = numpy.zeros((h, w, 3), numpy.uint8)
cv2.drawContours(vis,contours,0,255,-1)
vis = cv2.bitwise_not(vis)
cv2.imshow('image', vis)
I agree that it is somewhat annoying that Canny in OpenCV does not always close the last pixel in the contour of an object, which prevents you from finding one closed contour. The workaround that I used to solve this problem is the "close" morphological operation:
dilate(canny_edges, canny_edges, Mat());
erode(canny_edges, canny_edges, Mat());
As with any workaround it is not perfect, but it does solve the problem.
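Those two lines are written C++-style; in Python/OpenCV the same closing operation could look like this minimal sketch, assuming canny_edges from the code above (the 3x3 kernel size is an assumption to enlarge if the gaps are wider):

import cv2
import numpy as np

kernel = np.ones((3, 3), np.uint8)  # assumption: small kernel, enlarge for bigger gaps
# "close" = dilate followed by erode with the same kernel, which seals small contour gaps
canny_edges = cv2.morphologyEx(canny_edges, cv2.MORPH_CLOSE, kernel)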
