How to randomly generate scratch-like lines with OpenCV automatically - Python

I am trying to generate synthetic images for my deep learning model. I need to draw scratches on a black surface. I already have a little script that can generate random white scratch-like lines, but only horizontally. I need the scratches to also be vertical and curved. On top of that, it would also be very helpful if the thickness of the scratches were random, so I have thick and thin scratches.
This is my code so far:
import cv2
import numpy as np
import random
height = 384
width = 384
blank_image = np.zeros((height, width, 3), np.uint8)
num_scratches = random.randint(0, 5)
for _ in range(num_scratches):
    row_random = random.randint(20, 370)
    blank_image[row_random:(row_random + 1), row_random:(row_random + random.randint(25, 75))] = (255, 255, 255)
cv2.imshow("synthetic", blank_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
This is one example result outcome:
How do I have to edit my script so I can get more diverse looking scratches?
The scratches should somehow look like this for example (Done with paint):

need the scratches to also be vertical
Your method might be adapted as follows:
import numpy as np # cv2 reads images into np.array
img = np.zeros((5,5),dtype='uint8') # same as loading a 5 x 5 px black image
img[1:4,2:3] = 255
print(img)
Output:
[[  0   0   0   0   0]
 [  0   0 255   0   0]
 [  0   0 255   0   0]
 [  0   0 255   0   0]
 [  0   0   0   0   0]]
Explanation: I set to 255 all elements (pixels) whose y-coordinate is between 1 (inclusive) and 4 (exclusive) and whose x-coordinate is between 2 (inclusive) and 3 (exclusive).
Nonetheless, cv2 provides a function for drawing lines, namely cv2.line, which is handier to use; it accepts the img to draw on, a start point, an end point, a color and a thickness. The docs give the following example:
# Draw a diagonal blue line with thickness of 5 px
img = cv2.line(img,(0,0),(511,511),(255,0,0),5)
If you are working in grayscale, use a single value rather than a 3-tuple as the color.
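Building on that, here is a minimal sketch of how the original script could be extended: straight scratches at random angles (which also gives vertical ones) drawn with cv2.line, rough curved scratches drawn with cv2.polylines through a few jittered points, and a random thickness for each. The length, jitter and thickness ranges below are arbitrary choices, not values taken from the question.
import cv2
import numpy as np
import random

height, width = 384, 384
img = np.zeros((height, width, 3), np.uint8)

for _ in range(random.randint(1, 5)):
    thickness = random.randint(1, 3)  # random scratch thickness
    x, y = random.randint(0, width - 1), random.randint(0, height - 1)
    if random.random() < 0.5:
        # Straight scratch at an arbitrary angle (covers horizontal and vertical cases)
        x2 = x + random.randint(-75, 75)
        y2 = y + random.randint(-75, 75)
        cv2.line(img, (x, y), (x2, y2), (255, 255, 255), thickness)
    else:
        # Curved scratch: a short polyline through a few randomly jittered points
        pts = [(x, y)]
        for _ in range(3):
            x += random.randint(-30, 30)
            y += random.randint(-30, 30)
            pts.append((x, y))
        cv2.polylines(img, [np.array(pts, np.int32)], False, (255, 255, 255), thickness)

cv2.imshow("synthetic", img)
cv2.waitKey(0)
cv2.destroyAllWindows()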

Related

How to find the empty squares in a chess board image?

I am currently writing an image recognition script that makes a 2D array out of an image of a chess board for my chess project. However, I found it quite difficult to find which squares are empty:
So far, I have used Canny edge detection on my image after applying a Gaussian blur, which yielded the following results:
The code I used was:
sigma = 0.33
v = np.median(img)
img = cv2.GaussianBlur(img, (7, 7), 2)  # apply a Gaussian blur to smooth the image
lower = int(max(0, (1.0 - sigma) * v))  # compute the lower threshold
upper = int(min(255, (1.0 + sigma) * v))  # compute the upper threshold
img_edge = cv2.Canny(img, 50, 50)  # apply Canny edge detection (with fixed thresholds here)
cv2.imshow('question', img_edge)  # show the result
cv2.waitKey(0)
(You may notice I did not use the thresholds I computed; that's because I found them inaccurate. If anyone has any tips, I'd love to hear them!)
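For reference, actually using the computed thresholds would just mean passing lower and upper to Canny instead of the fixed pair, as in the usual median-based "auto Canny" recipe (whether it gives better edges on this particular image is not guaranteed):
img_edge = cv2.Canny(img, lower, upper)  # median-based thresholds instead of (50, 50)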
Now, after doing these steps, I have tried many other things such as finding contours, Hough transform, etc. Yet I can't seem to figure out how to move on from that and actually find out whether a square is empty.
Any help is appreciated!
Assuming you have some kind of square-shaped input image covering the whole chess board (as the example suggests), you can resize the image by rounding width and height down to the next smaller multiple of 8. That way, you can derive 64 equally sized tiles from your image. For each tile, count the number of unique colors. Set up some threshold to distinguish two classes (empty vs. non-empty square), maybe by using Otsu's method.
That'd be my code (half of that is simply visualization stuff):
import cv2
import matplotlib.pyplot as plt
import numpy as np
from skimage.filters import threshold_otsu
# Round to next smaller multiple of 8
# https://www.geeksforgeeks.org/round-to-next-smaller-multiple-of-8/
def round_down_to_next_multiple_of_8(a):
    return a & (-8)
# Read image, and shrink to quadratic shape with width and height of
# next smaller multiple of 8
img = cv2.imread('m0fAx.png')
wh = np.min(round_down_to_next_multiple_of_8(np.array(img.shape[:2])))
img = cv2.resize(img, (wh, wh))
# Prepare some visualization output
out = img.copy()
plt.figure(1, figsize=(18, 6))
plt.subplot(1, 3, 1), plt.imshow(img)
# Blur image
img = cv2.blur(img, (5, 5))
# Iterate tiles, and count unique colors inside
# https://stackoverflow.com/a/56606457/11089932
wh_t = wh // 8
count_unique_colors = np.zeros((8, 8))
for x in np.arange(8):
    for y in np.arange(8):
        tile = img[y*wh_t:(y+1)*wh_t, x*wh_t:(x+1)*wh_t]
        tile = tile[3:-3, 3:-3]
        count_unique_colors[y, x] = np.unique(tile.reshape(-1, tile.shape[-1]), axis=0).shape[0]
# Mask empty squares using cutoff from Otsu's method
val = threshold_otsu(count_unique_colors)
mask = count_unique_colors < val
# Some more visualization output
for x in np.arange(8):
    for y in np.arange(8):
        if mask[y, x]:
            cv2.rectangle(out, (x*wh_t+3, y*wh_t+3),
                          ((x+1)*wh_t-3, (y+1)*wh_t-3), (0, 255, 0), 2)
plt.subplot(1, 3, 2), plt.imshow(count_unique_colors, cmap='gray')
plt.subplot(1, 3, 3), plt.imshow(out)
plt.tight_layout(), plt.show()
And, that'd be the output:
As you can see, it's not perfect. One issue is the camera position, specifically the angle, but you already mentioned in the comments that you can correct that. The other issue, as also discussed in the comments, is that some pieces are placed between two squares. It's up to you how to handle that. (I'd simply place the pieces correctly.)
----------------------------------------
System information
----------------------------------------
Platform: Windows-10-10.0.19041-SP0
Python: 3.9.1
PyCharm: 2021.1.1
Matplotlib: 3.4.2
NumPy: 1.20.3
OpenCV: 4.5.2
scikit-image: 0.18.1
----------------------------------------
Not for the given original image, but if you have a chessboard whose pieces are coloured differently from the board itself (as discussed in the comments), then you can do something like this:
import cv2
import numpy
img = cv2.imread("Chesss.PNG") # read image using cv2
for x in range(0, img.shape[0] - 8, img.shape[0]//8):
    for y in range(0, img.shape[1] - 8, img.shape[1]//8):
        square = img[x:x+img.shape[0]//8, y:y+img.shape[1]//8, :] # creating 8*8 squares of the image
        avg_colour_per_row = numpy.average(square, axis=0)
        avg_colour = numpy.array(list(map(int, numpy.average(avg_colour_per_row, axis=0))))//8 # finding the average colour of the square
        if list(avg_colour) == list(numpy.array([0, 0, 0])) or list(avg_colour) == list(numpy.array([31, 31, 31])): # if the average colour of the square is black or white, print the coordinates of the square
            print(x//(img.shape[0]//8), y//(img.shape[1]//8))
My example image (I do not have a chessboard right now, so I used a rendered image):
Output:
0 0
0 1
0 2
0 3
0 4
0 5
0 6
0 7
1 1
1 3
1 4
1 6
2 0
2 1
2 2
2 3
2 4
2 5
2 6
2 7
3 0
3 1
3 2
3 3
3 5
3 6
3 7
4 0
4 1
4 2
4 3
4 4
4 6
4 7
5 0
5 1
5 2
5 3
5 4
5 5
5 6
5 7
6 0
6 3
6 4
6 5
6 6
7 0
7 1
7 2
7 3
7 4
7 5
7 6
7 7
Note that I have divided the average colour values by 8. This is because we perceive values like [0, 0, 0] and [1, 1, 1] both as black, so integer division by 8 groups such nearly identical colours together.
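To make the effect of that division concrete, here is a tiny illustrative snippet (the sample values are arbitrary):
# Integer division by 8 puts every 8 neighbouring intensities into one bucket,
# so near-black values collapse to 0 and near-white values collapse to 31.
for value in (0, 1, 7, 8, 247, 248, 255):
    print(value, '->', value // 8)
# 0, 1, 7 -> 0 (treated as black); 248, 255 -> 31 (treated as white); 8 -> 1; 247 -> 30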
You can find the chessboard and even find its pose, as shown here. Then you'll be able to estimate the elliptical shape of a piece's base.
Find ellipses using, for instance, this project.
Filter out spurious ellipses using the pose knowledge, and you'll get the piece positions. Then you can find the free cells.
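For illustration, here is a rough sketch of the pose part of that idea using cv2.findChessboardCorners and cv2.solvePnP; the file name and camera intrinsics below are placeholders, and corner detection may fail if pieces occlude too many inner corners:
import cv2
import numpy as np

img = cv2.imread('board.png')  # placeholder file name
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Placeholder intrinsics; use real calibration data if available
camera_matrix = np.array([[1000., 0., img.shape[1] / 2],
                          [0., 1000., img.shape[0] / 2],
                          [0., 0., 1.]])
dist_coeffs = np.zeros(5)

# An 8 x 8 board has a 7 x 7 grid of inner corners
found, corners = cv2.findChessboardCorners(gray, (7, 7))
if found:
    # 3D corner coordinates in board units (one square = 1 unit), all on the z = 0 plane
    objp = np.zeros((7 * 7, 3), np.float32)
    objp[:, :2] = np.mgrid[0:7, 0:7].T.reshape(-1, 2)
    ok, rvec, tvec = cv2.solvePnP(objp, corners, camera_matrix, dist_coeffs)
    print('rotation:', rvec.ravel(), 'translation:', tvec.ravel())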

NumPy: Understanding values in colour matrix

I have an image which I have read and converted into a numpy array. I have then extracted each colour channel (R,G,B) of the image into three separate arrays:
import cv2
import numpy as np
from sklearn.cluster import MeanShift, estimate_bandwidth
from sklearn.datasets.samples_generator import make_blobs
import matplotlib.pyplot as plt
from itertools import cycle
from PIL import Image
image = Image.open('sample_images/fruit_half.png').convert('RGB')
image = np.array(image)
red = image[:,:,2]
green = image[:,:,1]
blue = image[:,:,0]
When I print the value of the "red" array, I get the following output:
print(red)
[[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
...
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]]
I would like to know what do the numbers in the red, green and blue arrays represent. Do they represent the intensity of red/green/blue for a specific pixel? Any insights are appreciated.
They stand for the pixel intensity of each color channel in the image. If you print out image.shape, you can see the image's dimensions:
print(image.shape)
You will get something like this
(93, 296, 3)
This tells us the (rows, columns, channels) in the image. In this case, the image has three channels. If you print out each individual channel, they represent pixel intensity ranging from 0-255. Every pixel is made up of the combination of these three channels. If instead you printed out the shape of the image and you got this
(93, 296, 1)
it means that the image only has one channel (a grayscale image).
One thing to note is that OpenCV follows BGR convention while PIL follows RGB. Currently, you are splitting backwards. To split channels using PIL you can do this
image = Image.open('1.jpg').convert('RGB')
image = np.array(image)
red = image[:,:,0]
green = image[:,:,1]
blue = image[:,:,2]
Remember PIL uses RGB format where red is in channel 0, green in channel 1, and blue in channel 2.
To split channels using OpenCV you can do this
image = cv2.imread('1.jpg')
b,g,r = cv2.split(image)
or
b = image[:,:,0]
g = image[:,:,1]
r = image[:,:,2]
Taking this image as an example, you can use a histogram to visualize the channels.
import cv2
from matplotlib import pyplot as plt
image = cv2.imread('1.jpg')
b,g,r = cv2.split(image)
blue = cv2.calcHist([b], [0], None, [256], [0,256])
green = cv2.calcHist([g], [0], None, [256], [0,256])
red = cv2.calcHist([r], [0], None, [256], [0,256])
plt.plot(blue, color='b')
plt.plot(green, color ='g')
plt.plot(red, color ='r')
plt.show()
Yes, they represent intensity; each value is an 8-bit value from 0 to 255. If a value is 0, the red component of that pixel is completely off, and at 255 it is completely on. Usually people just use the image as an array (note that OpenCV lists the channels in the order blue, green, red). The image array holds an RGB value at every pixel (try printing image). This is a standard for images and is explained here.
An RGB picture is a digital matrix with 3 channels; each channel contains a value from 0 to 255 (if your dtype is uint8) representing how much of that colour is present in that pixel. Look at the picture:
You can see that if we combine red and green at 100% (meaning 255), we get yellow; if we combine all three at 100%, we get white, and so on. By this scheme, each pixel has x amount of red, y amount of green and z amount of blue.
Example:
Therefore, the value you see in the red channel is the amount of red at each pixel of your picture.
Hope this is useful!
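As a tiny sketch of that mixing (values chosen purely for illustration, using the same RGB channel order as PIL):
import numpy as np

# A 1 x 4 RGB image: pure red, pure green, red + green (= yellow), all three (= white)
demo = np.array([[[255, 0, 0],
                  [0, 255, 0],
                  [255, 255, 0],
                  [255, 255, 255]]], dtype=np.uint8)
print(demo[:, :, 0])  # the red channel: [[255   0 255 255]]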

Detecting start and end point of line in image (numpy array)

I have an image like the following:
What I would like is to get the coordinates of the start and end point of each segment. Actually, what I thought was to consider the fact that each extreme point should have just one point belonging to the segment in its neighborhood, while all other points should have at least 2. Unfortunately, the line does not have a thickness of one pixel, so this reasoning does not hold.
Here's a fairly simple way to do it:
load the image and discard the superfluous alpha channel
skeletonise
filter looking for 3x3 neighbourhoods that have the central pixel set and just one other
#!/usr/bin/env python3
import numpy as np
from PIL import Image
from scipy.ndimage import generic_filter
from skimage.morphology import medial_axis
# Line ends filter
def lineEnds(P):
    """Central pixel and just one other must be set to be a line end"""
    return 255 * ((P[4]==255) and np.sum(P)==510)
# Open image and make into Numpy array
im = Image.open('lines.png').convert('L')
im = np.array(im)
# Skeletonize
skel = (medial_axis(im)*255).astype(np.uint8)
# Find line ends
result = generic_filter(skel, lineEnds, (3, 3))
# Save result
Image.fromarray(result).save('result.png')
Note that you can obtain exactly the same result, for far less effort, with ImageMagick from the command-line like this:
convert lines.png -alpha off -morphology HMT LineEnds result.png
Or, if you want them as numbers rather than an image:
convert result.png txt: | grep "gray(255)"
Sample Output
134,78: (65535) #FFFFFF gray(255) <--- line end at coordinates 134,78
106,106: (65535) #FFFFFF gray(255) <--- line end at coordinates 106,106
116,139: (65535) #FFFFFF gray(255) <--- line end at coordinates 116,139
196,140: (65535) #FFFFFF gray(255) <--- line end at coordinates 196,140
Another way of doing it is to use scipy.ndimage.morphology.binary_hit_or_miss and set up your "Hits" as the white pixels in the below image and your "Misses" as the black pixels:
The diagram is from Anthony Thyssen's excellent material here.
In a similar vein to the above, you could equally use the "Hits" and "Misses" kernels above with OpenCV as described here:
morphologyEx(input_image, output_image, MORPH_HITMISS, kernel);
I suspect this would be the fastest method.
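In Python, a rough sketch of that hit-or-miss idea could look like this; skel is assumed to be the skeletonised 0/255 image from the first snippet, and the two base kernels plus their rotations cover the eight possible single-neighbour patterns of a line end:
import cv2
import numpy as np

# In OpenCV hit-or-miss kernels: 1 = must be foreground, -1 = must be background, 0 = don't care
straight = np.array([[-1, -1, -1],
                     [-1,  1, -1],
                     [-1,  1, -1]], dtype=int)  # end pixel whose only neighbour is directly below
diagonal = np.array([[-1, -1, -1],
                     [-1,  1, -1],
                     [-1, -1,  1]], dtype=int)  # end pixel whose only neighbour is diagonal

ends = np.zeros_like(skel)
for k in range(4):
    for base in (straight, diagonal):
        kernel = np.ascontiguousarray(np.rot90(base, k))
        ends |= cv2.morphologyEx(skel, cv2.MORPH_HITMISS, kernel)

ys, xs = np.nonzero(ends)  # row/column coordinates of the detected line ends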
Keywords: Python, image, image processing, line ends, line-ends, morphology, Hit or Miss, HMT, ImageMagick, filter.
The method you mentioned should work well; you just need to do a morphological operation beforehand to reduce the width of the lines to one pixel. You can use scikit-image for that:
from skimage.morphology import medial_axis
import cv2
# read the lines image
img = cv2.imread('/tmp/tPVCc.png', 0)
# get the skeleton
skel = medial_axis(img)
# skel is a boolean matrix, multiply by 255 to get a black and white image
cv2.imwrite('/tmp/res.png', skel*255)
See this page on the skeletonization methods in skimage.
I would tackle this with a watershed-style algorithm. I describe the method below; however, it is designed to deal only with a single (multi-segment) line, so you would need to split your image into images of separate lines (a sketch of that splitting is given at the end of this answer).
Toy example:
0000000
0111110
0111110
0110000
0000000
Where 0 denotes black and 1 denotes white.
Now my implementation of the solution:
import numpy as np
img = np.array([[0,0,0,0,0,0,0],
                [0,255,255,255,255,255,0],
                [0,255,255,255,255,255,0],
                [0,255,255,0,0,0,0],
                [0,0,0,0,0,0,0]],dtype='uint8')
def flood(arr,value):
    flooded = arr.copy()
    for y in range(1,arr.shape[0]-1):
        for x in range(1,arr.shape[1]-1):
            if arr[y][x]==255:
                if arr[y-1][x]==value:
                    flooded[y][x] = value
                elif arr[y+1][x]==value:
                    flooded[y][x] = value
                elif arr[y][x-1]==value:
                    flooded[y][x] = value
                elif arr[y][x+1]==value:
                    flooded[y][x] = value
    return flooded
ends = np.zeros(img.shape,dtype='uint64')
for y in range(1,img.shape[0]-1):
    for x in range(1,img.shape[1]-1):
        if img[y][x]==255:
            temp = img.copy()
            temp[y][x] = 127
            count = 0
            while 255 in temp:
                temp = flood(temp,127)
                count += 1
            ends[y][x] = count
print(ends)
Output:
[[0 0 0 0 0 0 0]
[0 5 4 4 5 6 0]
[0 5 4 3 4 5 0]
[0 6 5 0 0 0 0]
[0 0 0 0 0 0 0]]
Now the ends are denoted by the positions of the maximal values in the above array (6 in this case).
Explanation: I examine all white pixels as possible ends. For each such pixel I "flood" the image - I place a special value (127, different from both 0 and 255) and then propagate it - in every step, all 255 pixels which are neighbors (in the von Neumann sense) of a special value become special values themselves. I count the steps required to remove all 255s. Flooding at constant velocity from an end takes longer than flooding from any other location, so the maximal flooding times mark the ends of your line.
I must admit that I have not tested this deeply, so I can't guarantee it works correctly in special cases, for example a self-intersecting line. I am also aware of the roughness of my solution, especially in the detection of neighbors and the propagation of special values, so feel free to improve it. I assumed that all border pixels are black (no line touches the "frame" of your image).
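As an aside, the splitting into separate lines mentioned at the start could be done with connected-component labelling, for example (a sketch, assuming img is the full binary image containing all lines):
import numpy as np
from scipy.ndimage import label, find_objects

# Label 8-connected components so each line gets its own id, then crop each one out
lbl, n = label(img > 0, structure=np.ones((3, 3)))
for i, sl in enumerate(find_objects(lbl), start=1):
    single_line = np.where(lbl[sl] == i, 255, 0).astype('uint8')
    # ... run the end-finding procedure above on single_line ...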

Python 3: I am trying to find all green pixels in an image by traversing all pixels using an np.array, but can't get around an index error

My code currently consists of loading the image, which is successful and I don't believe has any connection to the problem.
Then I go on to transform the color image into a np.array named rgb
# convert image into array
rgb = np.array(img)
red = rgb[:,:,0]
green = rgb[:,:,1]
blue = rgb[:,:,2]
To double check my understanding of this array, in case that may be the root of the issue, it is an array such that rgb[x-coordinate, y-coordinate, color band] which holds the value between 0-255 of either red, green or blue.
Then, my idea was to make a nested for loop to traverse all pixels of my image (620px,400px) and sort them based on the ratio of green to blue and red in an attempt to single out the greener pixels and set all others to black or 0.
for i in range(xsize):
    for j in range(ysize):
        color = rgb[i,j]  # <-- Index error occurs here
        if(color[0] > 128):
            if(color[1] < 128):
                if(color[2] > 128):
                    rgb[i,j] = [0,0,0]
The error I am receiving when trying to run this is as follows:
IndexError: index 400 is out of bounds for axis 0 with size 400
I thought it may have something to do with the bounds I was giving i and j, so I tried only sorting through a small inner portion of the image, but I still got the same error. At this point I am lost as to what the root of the error even is, let alone the solution.
In direct answer to your question, the y axis is given first in numpy arrays, followed by the x axis, so interchange your indices.
Less directly, you will find that for loops are very slow in Python and you are generally better off using numpy vectorised operations instead. Also, you will often find it easier to find shades of green in HSV colourspace.
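As a sketch of the vectorised idea, here is the original condition expressed as a single boolean mask over the whole image (the file name is a placeholder, and the thresholds are the ones from the question, not a recommendation):
import numpy as np
from PIL import Image

rgb = np.array(Image.open('image.png').convert('RGB'))

# One boolean mask for the whole image instead of a Python loop over pixels
mask = (rgb[:, :, 0] > 128) & (rgb[:, :, 1] < 128) & (rgb[:, :, 2] > 128)
rgb[mask] = [0, 0, 0]  # no manual indices, so no chance of an IndexError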
Let's start with an HSL colour wheel:
and assume you want to make all the greens into black. So, from that Wikipedia page, the Hue corresponding to Green is 120 degrees, which means you could do this:
#!/usr/local/bin/python3
import numpy as np
from PIL import Image
# Open image and make RGB and HSV versions
RGBim = Image.open("image.png").convert('RGB')
HSVim = RGBim.convert('HSV')
# Make numpy versions
RGBna = np.array(RGBim)
HSVna = np.array(HSVim)
# Extract Hue
H = HSVna[:,:,0]
# Find all green pixels, i.e. where 100 < Hue < 140
lo,hi = 100,140
# Rescale to 0-255, rather than 0-360 because we are using uint8
lo = int((lo * 255) / 360)
hi = int((hi * 255) / 360)
green = np.where((H>lo) & (H<hi))
# Make all green pixels black in original image
RGBna[green] = [0,0,0]
count = green[0].size
print("Pixels matched: {}".format(count))
Image.fromarray(RGBna).save('result.png')
Which gives:
Here is a slightly improved version that retains the alpha/transparency, and matches red pixels for extra fun:
#!/usr/local/bin/python3
import numpy as np
from PIL import Image
# Open image and make RGB and HSV versions
im = Image.open("image.png")
# Save Alpha if present, then remove
savedAlpha = None
if 'A' in im.getbands():
    savedAlpha = im.getchannel('A')
im = im.convert('RGB')
# Make HSV version
HSVim = im.convert('HSV')
# Make numpy versions
RGBna = np.array(im)
HSVna = np.array(HSVim)
# Extract Hue
H = HSVna[:,:,0]
# Find all red pixels, i.e. where 340 < Hue < 20
lo,hi = 340,20
# Rescale to 0-255, rather than 0-360 because we are using uint8
lo = int((lo * 255) / 360)
hi = int((hi * 255) / 360)
red = np.where((H>lo) | (H<hi))
# Make all red pixels black in original image
RGBna[red] = [0,0,0]
count = red[0].size
print("Pixels matched: {}".format(count))
result=Image.fromarray(RGBna)
# Replace Alpha if originally present
if savedAlpha is not None:
    result.putalpha(savedAlpha)
result.save('result.png')
Keywords: Image processing, PIL, Pillow, Hue Saturation Value, HSV, HSL, color ranges, colour ranges, range, prime.

How to label different objects in a non solid black background?

I know that scipy.ndimage.label can't label if the background color is not a solid black.
So I have an image with black background and it's not a solid black so we can't assume that all the RGB values are(0,0,0) in all pixels.
How can I prepare the image so I can use ndimage.label?
this is a similar image to test on:
test image http://imageshack.us/a/img4/8661/backgrf.png
Note:
(1) The image was converted from RGB to PNG grayscale.
(2) The background color varies.
(3) The ndimage.label labels the whole image as one object.
Thanks
This is a simple method for increasing the contrast as far as it can go so that anything "light" becomes white and anything "dark" becomes black. Assuming 8-bit grayscale and adapting the code in @Warren Weckesser's answer:
img2 = img.copy() # Copy the image.
img2[img2 < 128] = 0 # Set all values less than 128 to 0 (black).
img2[img2 >= 128] = 255 # Set all values equal or greater than 128 to 255 (white).
lbl, n = label(img2)
Let me know if this works for you.
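If a fixed cutoff like 128 turns out to be too crude for a background that varies, one possible variation (a sketch, assuming img is the 8-bit grayscale image from the question) is to let Otsu's method pick the cutoff via cv2.threshold:
import cv2
from scipy.ndimage import label

# Let Otsu's method choose the threshold instead of hard-coding 128, then label as before
_, img2 = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
lbl, n = label(img2)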
You could set all values less than some threshold to 0, and then call label:
In [16]: img2 = img.copy() # Copy the image.
In [17]: img2[img2 < 20] = 0 # Set all values less than 20 to 0.
In [18]: lbl, n = label(img2)
In [19]: n
Out[19]: 2
