PIL: Is it possible to fill a mask?

Suppose I have a mask (a blank triangle: 1/255/etc. inside the triangle, 0 elsewhere) and I have the texture I'd like to place inside that triangle, but the texture is a flat sequence of values rather than a 2D image. For example, if the box containing the triangle/mask is 100 x 100 but the triangle itself covers only 2500 pixels, I only have values for those 2500 pixels instead of the whole box.
So I could fill the triangle manually, one row or column at a time, but I was wondering whether there's a built-in method to do this instead.
Here's the code I used anyway:
import numpy as np

def fill_mask(mask, data): #mask and data are ndarrays
    mask = np.copy(mask).T
    k = 0
    for i in xrange(len(mask)):
        for j in xrange(len(mask[i])):
            if mask[i][j] == 1:
                mask[i][j] = data[k]
                k += 1
    return mask.T
That one fills horizontally (line by line). To fill vertically, remove the .T in the first and last lines. It could probably be made shorter, but I'm awful at that, so I'll leave it as is. Any improvements to it are appreciated as well.

mask[mask==1] = data

NumPy's boolean indexing assigns the values of data to the masked positions in row-major (C) order, which matches the horizontal, line-by-line fill above.
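A small self-contained check (toy arrays of my own, not from the question) that the one-liner reproduces the loop's horizontal fill:

import numpy as np

mask = np.array([[0, 1, 0],
                 [1, 1, 0],
                 [0, 1, 1]])
data = np.array([10, 20, 30, 40, 50])
filled = mask.copy()
filled[mask == 1] = data  # assigns in row-major order, like the loop above
print(filled)
# [[ 0 10  0]
#  [20 30  0]
#  [ 0 40 50]]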

It sounds like you are asking whether you can fill/modify each pixel of the triangle from an array of values without also visiting the pixels you don't want to fill, using only the image data.
If that's so, then the short answer is no.
Certainly it would be possible to create some optimization array that only referenced those pixels which were part of the triangle, but in order to create that array, you'd have to visit each pixel, so it would only be a savings if you have to visit the same set many times.
PIL probably provides some helpers to do blending that might be optimized, which would be better than trying to roll your own.
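For reference, one such helper is PIL's Image.composite, which blends two images through a mask. A minimal sketch with placeholder filenames of my own (note it assumes the texture is already an image, which is exactly what the question lacks, so it complements rather than replaces the boolean-indexing approach above):

from PIL import Image

texture = Image.open("texture.png").convert("RGB")
base = Image.open("base.png").convert("RGB")
mask = Image.open("mask.png").convert("L")     # 255 inside the triangle, 0 outside
result = Image.composite(texture, base, mask)  # texture where the mask is white
result.save("filled.png")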
If on the other hand, you know the dimensions and position of the triangle in your mask, you could calculate the position of the pixels inside the triangle. For that you'll need to study your trigonometry.
If you don't know how to do this already, I'd say stick with visiting each pixel, it will be a good learning experience. If you need to improve performance later, the path will be clearer once you understand the basic concepts.

Related

Count objects in a binarized image without scipy

I'm trying to count the number of objects in an image, which I have already binarized. However, I'm not allowed to use the scipy or numpy packages, so I can't use scipy.ndimage.label. Any ideas? My attempt counts over 80 objects, but there are only 13 (counted with scipy):
def label(img):
    n=1
    for i in range(h):
        for j in range(c):
            if img[i][j]==255:
                if img[i-1][j]!=0 and img[i-1][j]!=255:
                    img[i][j]=img[i-1][j]
                elif img[i+1][j]!=0 and img[i+1][j]!=255:
                    img[i][j]=img[i-1][j]
                elif img[i][j+1]!=0 and img[i][j+1]!=255:
                    img[i][j]=img[i][j+1]
                elif img[i][j-1]!=0 and img[i][j-1]!=255:
                    img[i][j]=img[i][j-1]
                else:
                    img[i][j]=n
                    if img[i-1][j]!=0:
                        img[i-1][j]=img[i][j]
                    if img[i+1][j]!=0:
                        img[i+1][j]=img[i][j]
                    if img[i][j+1]!=0:
                        img[i][j+1]=img[i][j]
                    if img[i][j-1]!=0:
                        img[i][j-1]=img[i][j]
                    n+=1
            elif img[i][j]!=0:
                if img[i-1][j]!=0:
                    img[i-1][j]=img[i][j]
                if img[i+1][j]!=0:
                    img[i+1][j]=img[i][j]
                if img[i][j+1]!=0:
                    img[i][j+1]=img[i][j]
                if img[i][j-1]!=0:
                    img[i][j-1]=img[i][j]
    return img,n
You will want something like https://codereview.stackexchange.com/questions/148897/floodfill-algorithm, which implements https://en.wikipedia.org/wiki/Flood_fill.
It's a good fit for numba or cython if that's feasible for you.
Perhaps you can use OpenCV, which already offers floodfill: https://docs.opencv.org/3.4/d7/d1b/group__imgproc__misc.html#gaf1f55a048f8a45bc3383586e80b1f0d0.
Suppose you have binarized so background is color one and objects are color zero. Set c = 2, scan for a zero pixel, and floodfill it with color c.
Now increment c, scan for zero, fill it, lather, rinse, repeat.
You will wind up with each object bearing a distinct color so you can use it as an isolation mask.
Distinct colors are very helpful during debugging, but of course three colors suffice (or even two) if you just want a count.
The final bitmap will be uniformly the background color in the two-color case.
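A minimal pure-Python sketch of that scan-and-fill counter (no numpy/scipy, per the question's constraints), assuming the binarized image is a list of lists with objects 0 and background 1 as described above:

def count_objects(img):
    h, w = len(img), len(img[0])
    c = 2  # first fill colour; each object gets a distinct one
    for sy in range(h):
        for sx in range(w):
            if img[sy][sx] != 0:
                continue
            stack = [(sy, sx)]  # iterative flood fill from this object pixel
            while stack:
                y, x = stack.pop()
                if 0 <= y < h and 0 <= x < w and img[y][x] == 0:
                    img[y][x] = c  # paint with this object's colour
                    # 4-element Von Neumann neighbourhood
                    stack.extend([(y-1, x), (y+1, x), (y, x-1), (y, x+1)])
            c += 1
    return c - 2  # number of objects filled

Using an explicit stack rather than recursion avoids hitting Python's recursion limit on large objects.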
Using a 4-element Von Neumann neighborhood versus an 8-element neighborhood will make a big difference in the final result.
It's easier for paint to "leak" through diagonal connectivity in the 8-element setting.
Doing edge detection and thickening can help to reduce unwanted color leakage.

An algorithm that efficiently computes the distance of one labeled pixel to its nearest differently labeled pixel

I apologize for my lengthy title name. I have two questions, where the second question is based on the first one.
(1). Suppose I have a matrix whose entries are either 0 or 1. Now I pick an arbitrary 0 entry. Is there an efficient algorithm that finds the nearest entry labeled 1, or calculates the distance between the chosen 0 entry and its nearest 1 entry?
(2). Suppose now the distribution of entries 1 has a geometric property. To make this statement clearer, imagine this matrix as an image. In this image, there are multiple continuous lines (not necessarily straight). These lines form several boundaries that partition the image into smaller pieces. Assume the boundaries are labeled 1, whereas all the pixels in the partitioned area are labeled 0. Now, similar to (1), I pick a random pixel labeled as 0, and I hope to find out the coordinate of the nearest pixel labeled as 1 or the distance between them.
A hint/direction for part (1) is enough for me. If typing up an answer takes too much time, it is okay just to tell me the name of the algorithm, and I will look it up.
p.s.: If I post this question in an incorrect section, please let me know. I will re-post it to an appropriate section. Thank you!
If you have a matrix, you can run a BFS where the matrix A is your graph G and the vertex v is the arbitrary pixel you chose.
There is an edge between any two adjacent cells in the matrix.
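A minimal sketch of that idea for part (1): treat the matrix as a 4-connected grid graph and expand outward from the chosen 0 entry until a 1 is reached (function and variable names are my own):

from collections import deque

def nearest_one(matrix, start):
    h, w = len(matrix), len(matrix[0])
    queue = deque([(start, 0)])  # (coordinate, distance from start)
    seen = {start}
    while queue:
        (y, x), d = queue.popleft()
        if matrix[y][x] == 1:
            return (y, x), d  # the first 1 reached is the nearest one
        for ny, nx in ((y-1, x), (y+1, x), (y, x-1), (y, x+1)):
            if 0 <= ny < h and 0 <= nx < w and (ny, nx) not in seen:
                seen.add((ny, nx))
                queue.append(((ny, nx), d + 1))
    return None, -1  # the matrix contains no 1

If you need this for many starting pixels, you can instead seed one BFS with all 1 entries at once; that multi-source pass labels every 0 pixel with the distance to its nearest 1, which is essentially a distance transform.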

Improving efficiency of using for loops to correct dead pixels across an entire dataset of images

This probably has a very simple solution, but I have searched Stack Exchange and Google and haven't found one yet. Below is a function from my code. It is passed a multidimensional datacube of images, where "size" gives the dimensions of the cube. My detector has a number of dead pixels; I use another function to find their indices, which I pass to this function. It then removes the dead pixels from each image by averaging the surrounding ones.
My question is: is there a more efficient way to iterate through the images? I know the locations of the dead pixels, but it is taking in excess of 10 minutes to correct them for the entire dataset.
def _dead_pixel(dataset, indices, size):
    for i in range(0,size[0]):
        for j in range(0,size[1]):
            image = dataset[i,j]
            for k in indices:
                dead_pixel = (image[k[0]-1,k[1]].data + image[k[0]+1,k[1]].data + image[k[0],k[1]-1].data + image[k[0],k[1]+1].data)/4
                image[k[0],k[1]].data = dead_pixel
    return dataset
image.data just takes the image and accesses the data as a numpy array. "indices" is a list of the dead pixels' locations.
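For what it's worth, a hedged sketch of one way to drop the inner Python loop: gather the dead-pixel coordinates into index arrays once, then correct all of them per image with numpy fancy indexing. This assumes image.data is a writable 2D numpy view, as described above:

import numpy as np

def _dead_pixel_vectorized(dataset, indices, size):
    rows = np.array([k[0] for k in indices])  # dead-pixel row coordinates
    cols = np.array([k[1] for k in indices])  # dead-pixel column coordinates
    for i in range(size[0]):
        for j in range(size[1]):
            img = dataset[i, j].data  # assumed writable numpy view
            # average the four neighbours of every dead pixel in one shot
            img[rows, cols] = (img[rows - 1, cols] + img[rows + 1, cols] +
                               img[rows, cols - 1] + img[rows, cols + 1]) / 4
    return dataset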

How can I extract this obvious event from this image?

EDIT: I have found a solution :D thanks for the help.
I've created an image processing algorithm which extracts this image from the data. It's complex, so I won't go into detail, but this image is essentially a giant numpy array (it's visualizing angular dependence of pixel intensity of an object).
I want to write a program which automatically determines when the curves switch direction. I have the data and I also have this image, but it turns out doing something meaningful with either has been tricky. Thresholding fails because there are bands of different background color. Sobel operators and Hough Transforms also do not work well for this same reason.
This is really easy for humans to see when this switch happens, but not so easy to tell a computer. Any tips? Thanks!
Edit: Thanks all. I'm now fitting lines to this image after convolving with a general Gaussian and skeletonizing the result. Any pointers on doing this would be appreciated :)
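For reference, a minimal sketch of that pre-processing chain, assuming scipy and scikit-image are available (the filename and threshold choice are placeholders of my own):

import numpy as np
from scipy.ndimage import gaussian_filter
from skimage.morphology import skeletonize

img = np.loadtxt("img.txt")              # placeholder filename
smooth = gaussian_filter(img, sigma=2)   # Gaussian convolution
binary = smooth > smooth.mean()          # crude global threshold, for illustration only
skeleton = skeletonize(binary)           # 1-pixel-wide curves to fit lines against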
You can take a weighted dot product of successive columns to get a one-dimensional signal that is much easier to work with. You might be able to extract the patterns using this signal:
import numpy as np
A = np.loadtxt("img.txt")
N = A.shape[0]
L = np.logspace(1,2,N)
X = []
for c0,c1 in zip(A.T, A.T[1:]):
    x = c0.dot(c1*L) / (np.linalg.norm(c0)*np.linalg.norm(c1))
    X.append(x)
X = np.array(X)
import pylab as plt
plt.matshow(A,alpha=.5)
plt.plot(X*3-X.mean(),'k',lw=2)
plt.axis('tight')
plt.show()
This is absolutely not a complete answer to the question, but a useful observation that is too long for a comment. I'll delete if a better answer comes along.
With the help of Mark McCurry, I was able to get a good result.
Step 1: Load the original image. Remove the background by subtracting the per-row median (np.median(orig, 1)) from each vertical column.
no_background = []
for i in range(num_frames):
    no_background.append(orig[:,i] - np.median(orig,1))
no_background = np.array(no_background).T
Step 2: Change negative values to 0.
clipped_background = no_background.clip(min=0)
Step 3: Extract a 1D signal. Take weighted sum of the vertical columns, which relates the max intensity in a column to its position.
def exp_func(x):
    return np.dot(np.arange(len(x)), np.power(x, 10))/(np.sum(np.power(x, 10)))

weighted_sum = np.apply_along_axis(exp_func, 0, clipped_background)
Step 4: Take the derivative of 1D signal.
conv = np.convolve([-1.,1],weighted_sum, mode='same')
pl.plot(conv)
Step 5: Determine when the derivative changes sign.
signs=np.sign(conv)
pl.plot(signs)
pl.ylim(-1.2,1.2)
Step 6: Apply median filter to above signal.
filtered_signs = median_filter(signs, 5)  # from scipy.ndimage; pick the window size based on the result (the second arg must be an odd number)
pl.plot(filtered_signs)
pl.ylim(-1.2,1.2)
Step 7: Find the indices (frame locations) of when the sign switches. Plot result.
def sign_switch(oneDarray):
    inds = []
    for ind in range(len(oneDarray)-1):
        if (oneDarray[ind]<0 and oneDarray[ind+1]>0) or (oneDarray[ind]>0 and oneDarray[ind+1]<0):
            inds.append(ind)
    return np.array(inds)
switched_frames = sign_switch(filtered_signs)
For detecting tip positions or turning points, you might try using a corner detector on the original image (not the skeletonized one). As a corner detector the structure tensor could be applicable. The structure tensor is also useful for calculating the local orientation in an image.
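A minimal numpy/scipy sketch of the structure tensor and its local-orientation estimate (my own implementation of the standard formulation, not from any particular library):

import numpy as np
from scipy.ndimage import gaussian_filter

def local_orientation(img, sigma=2.0):
    gy, gx = np.gradient(img.astype(float))
    # smoothed outer products of the gradient form the structure tensor
    Jxx = gaussian_filter(gx * gx, sigma)
    Jxy = gaussian_filter(gx * gy, sigma)
    Jyy = gaussian_filter(gy * gy, sigma)
    # dominant local orientation at each pixel, in radians
    return 0.5 * np.arctan2(2 * Jxy, Jxx - Jyy)

Corners show up where the tensor's two eigenvalues are both large, which is the basis of Harris-style corner detectors.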

Rudimentary Computer Vision Techniques for Python Bot.

After completing several chapters of computer vision books, I decided to apply those methods to create a primitive bot for a game. I chose Fling, which has almost no dynamics: all I needed to do was find the balls. Balls come in 5 different colors and can face any of 4 directions (depending on where the eyes are). I cropped each block of the field so that I can simply check whether each block contains a ball. My problem is that I'm not able to find the balls correctly.
My first attempt was the following: I sum the RGB values over each ball to get an [R, G, B] array, and do the same for each block in the field. If a block's array is similar to a ball's array, I conclude that the block contains that ball.
The problem is that it's hard to find a good threshold for 'similarity'; even different empty blocks vary significantly in these sums.
Second, I tried the OpenCV matchTemplate function, which matches a template image against a source image and, together with the minMaxLoc function, returns the best match value and location. If the match value is close to 1, the template is probably present in the source image. I made all possible variations of balls (20 overall) and matched them against the entire field. This worked well, but unfortunately it sometimes misses balls or assigns two different ball types (say, green and yellow) to a single ball. I tried to improve the process by matching balls against each block rather than the entire field (this has the advantage that every block is checked, so the correct number of balls should be detected; matching against the entire field yields only one location per ball color, so if there are two balls of the same color, matchTemplate loses the second one). Surprisingly, it still produces false negatives/positives.
There is probably a much easier way to solve this problem (maybe a library I don't know yet), but for now I can't find one. Any suggestions are welcome.
The balls seem pretty distinct in terms of colour. The problems you initially described seem to be related to some of the finer, random detail present in the image - especially in the background and in the different shading/poses of the ball.
On this basis, I would say you could simplify the task significantly by applying a set of pre-processing steps to "collapse" the range of colours in the image.
There are any number of more principled ways of achieving accurate colour segmentation (which is what, more formally, you want to achieve), but taking a more pragmatic view, here are a few quick'n'dirty hacks.
So, for example, we can initially smooth the image to reduce higher frequency components...
Then, convert to a normalised RGB representation...
Before finally posterizing it with a mean-shift filtering step...
Here is the code in Python, using the OpenCV bindings, that does all this in order:
import cv
# get original image
orig = cv.LoadImage('fling.png')
# show original
cv.ShowImage("orig", orig)
# blur a bit to remove higher frequency variation
cv.Smooth(orig,orig,cv.CV_GAUSSIAN,5,5)
# normalise RGB
norm = cv.CreateImage(cv.GetSize(orig), 8, 3)
red = cv.CreateImage(cv.GetSize(orig), 8, 1)
grn = cv.CreateImage(cv.GetSize(orig), 8, 1)
blu = cv.CreateImage(cv.GetSize(orig), 8, 1)
total = cv.CreateImage(cv.GetSize(orig), 8, 1)
cv.Split(orig,red,grn,blu,None)
cv.Add(red,grn,total)
cv.Add(blu,total,total)
cv.Div(red,total,red,255.0)
cv.Div(grn,total,grn,255.0)
cv.Div(blu,total,blu,255.0)
cv.Merge(red,grn,blu,None,norm)
cv.ShowImage("norm", norm)
# posterize simply with mean shift filtering
post = cv.CreateImage(cv.GetSize(orig), 8, 3)
cv.PyrMeanShiftFiltering(norm,post,20,30)
cv.ShowImage("post", post)
Your task is simpler in several respects than the ones the general computer vision algorithms you'll find were designed for: you know exactly what to look for and you know exactly where to look for it. As such I think involving an external library is an unnecessary complication, unless you're already familiar with it and can use it effectively as a tool to solve your own problem. In this post I will only use PIL.
First, split the task into two simpler tasks:
Given a tile, determine whether there's a ball there.
Given a tile where we're pretty sure that there's a ball, identify the colour of the ball.
The second task should be simple and I won't spend time on it here. Basically, sample some pixels where the ball's main colour will be visible and compare the colours you find to the known ball colours.
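For concreteness, a tiny sketch of that comparison; the reference colours below are hypothetical placeholders, and samples is assumed to be a list of (r, g, b) tuples taken from the ball's centre:

BALL_COLOURS = {
    "red": (200, 40, 40),     # placeholder reference values
    "green": (40, 180, 60),
    "yellow": (220, 210, 50),
}

def classify_ball(samples):
    n = float(len(samples))
    avg = tuple(sum(px[i] for px in samples) / n for i in range(3))
    # nearest reference colour by squared Euclidean distance
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(avg, c))
    return min(BALL_COLOURS, key=lambda name: dist(BALL_COLOURS[name]))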
So let's look at the first task.
First off, note that the balls don't extend to the edge of the tiles. Thus you can find a fairly representative sample of the background of a tile, whether or not there's a ball there, by sampling the pixels along the edge of the tile.
A simple way to proceed is to compare every pixel in a tile with this sample of the tile background, and to obtain some sort of measure of whether it's generally similar (no ball) or dissimilar (ball).
The following is one way to do this. The basic approach used here is to calculate the mean and the standard deviation of the background pixels -- separately for the red, green, and blue channels. For every pixel, we then calculate the number of standard deviations we are from the mean in every channel. We take this value for the most dissimilar channel as our measure of dissimilarity.
import Image
import math

def fetch_pixels(col, row):
    img = Image.open( "image.png" )
    img = img.crop( (col*32,row*32,(col+1)*32,(row+1)*32) )
    return img.load()

def border_pixels( a ):
    rv = [ a[x,y] for x in range(32) for y in (0,31) ]
    rv.extend( [ a[x,y] for x in (0,31) for y in range(1,31) ] )
    return rv

def mean_and_stddev( xs ):
    mean = float(sum( xs )) / len(xs)
    dev = math.sqrt( float(sum( [ (x-mean)**2 for x in xs ] )) / len(xs) )
    return mean, dev

def calculate_deviations(cols = 7, rows = 8):
    outimg = Image.new( "L", (cols*32,rows*32) )
    pixels = outimg.load()
    for col in range(cols):
        for row in range(rows):
            rv = calculate_deviations_for( col, row, pixels )
            print rv
    outimg.save( "image_output.png" )

def calculate_deviations_for( col, row, opixels ):
    a = fetch_pixels( col, row )
    border = border_pixels( a )
    bru, brd = mean_and_stddev( map( lambda x : x[0], border ) )
    bgu, bgd = mean_and_stddev( map( lambda x : x[1], border ) )
    bbu, bbd = mean_and_stddev( map( lambda x : x[2], border ) )
    rv = []
    for y in range(32):
        for x in range(32):
            r, g, b = a[x,y]
            dr = (bru-r) / brd
            dg = (bgu-g) / bgd
            db = (bbu-b) / bbd
            t = max(abs(dr), abs(dg), abs(db))
            opixel = 0
            limit, span = 2.5, 8.0
            if t > limit:
                v = min(1.0, (t - limit) / span)
                print t,v
                opixel = 127 + int( 128 * v )
            opixels[col*32+x,row*32+y] = opixel
            rv.append( t )
    return (sum(rv) / float(len(rv)))
A visualization of the result is written to image_output.png by the code above.
Note that most of the non-ball pixels are pure black. It should now be possible to determine whether a ball is present or not by simply counting the black pixels. (Or more reliably: count the size of the largest single blob of non-black pixels.)
Now, this is a very ad-hoc method and I certainly don't make any claim that it's the best method. The "limit" value was determined by experimentation -- essentially, by trial and error. It's included here to illustrate the sort of method I think you should be exploring, and to give you a starting point to tweak from. (If you want a place to start experimenting, you could try to make it give a better result for the top purple ball. Can you think of weaknesses in the approach above that might make it give a result like that? Always keep in mind, however, that you don't need a perfect-looking result, just one that's good enough. The final answer you want is "ball" or "no ball", and you just want to be able to answer that reliably.)
Note that:
You need to make sure you take the screengrab when the balls have finished rolling and are lying still in the center of their tiles. This simplifies the problem immensely.
The game's background affects the problem -- if there are ocean-themed or desert-themed levels coming up, you will need to test and possibly tweak the recognizer to make sure it still reliably works.
Special effects and/or GUI elements that cover the playing field will complicate the problem. (E.g. consider if the game has a 'cloud' or 'smoke' effect that sometimes floats over the playing field.) You may want to tweak the recognizer to be able to return "no result" if it's not sure -- then you can try another screengrab later. You may want to take several screengrabs and average the results.
I have assumed that there are only balls and non-balls. If later levels have other kinds of objects, you will have to experiment more to find out how to best recognize those.
I haven't used the 'reference picture' approach. However, if you have an image containing all the objects in the game and you can exactly align the pixels with your tiles, that's likely going to be the most reliable approach. Instead of comparing the foreground to the sampled background, compare the foreground to a set of known foreground images.
