I want to write a function that will create a random number (between m and n, inclusive) of stars in the sky of this picture (http://tinypic.com/r/34il9hu/6). Each star should be randomly composed of either a single white pixel or a square of 4 adjacent white pixels. I also do not want to place a star over the tree branches, the moon, or the bird.
How would I do this in Python? Can anyone help? Thanks!
I have started and have come out with this so far; I don't know if any of it is right, or even if I am on the right track:
def randomStars(small, large):
    import random
    file = pickAFile()
    pic = makePicture(file)
    #x = random.randrange(getWidth(pic))
    #y = random.randrange(getHeight(pic))
    for pixel in pic.getAllPixels():
        if random.random() < 0.25:
            pixel.red = random.randint(0, 255)
            pixel.green = random.randint(0, 255)
            pixel.blue = random.randint(0, 255)
    show(pic)
I have no clue what I am doing :(
This looks like a nice example to try superpixels, as implemented in skimage. You can probably solve your particular problem in a simpler way, but here is one approach.
import urllib.request
import random
import io
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import skimage.segmentation
import pandas

# Read the image
f = io.BytesIO(urllib.request.urlopen('http://oi46.tinypic.com/34il9hu.jpg').read())
img = plt.imread(f, format='jpg')

# Prefer to keep pixels together based on location,
# but not too much, so we still get some branches.
# (Older skimage called this argument 'ratio'; newer versions use 'compactness'.)
superpixel = skimage.segmentation.slic(img, n_segments=200, compactness=20)
plt.imshow(superpixel % 7, cmap='Set2')
Now that we have superpixels, classification becomes a bit easier, because we can do it per superpixel. You could use some fancy classifier here, but this example is quite simple, with a blueish sky, so let's do it by hand.
# Create a data frame with the relative blueishness of every superpixel
# Convert image to hsv
hsv = matplotlib.colors.rgb_to_hsv(img.astype('float32') / 255)
# Define blueish as the fraction of pixels in the blueish range of the hue space
df = pandas.DataFrame({'superpixel': superpixel.ravel(),
                       'blue': ((hsv[:, :, 0] > 0.4) & (hsv[:, :, 0] < 0.8)).astype('float32').ravel(),
                       'value': hsv[:, :, 2].ravel()})
grouped = df.groupby('superpixel').mean()
# Look up the superpixels that are least blue
# (.sort() was the old pandas API; modern pandas uses .sort_values())
blue = grouped.sort_values('blue', ascending=True).head(100)
# Look up the darkest superpixels
dark = grouped.sort_values('value', ascending=True).head(50)
# Mask out superpixels that are too dark or not blue enough
mask = (np.in1d(superpixel, dark.index).reshape(superpixel.shape) |
        np.in1d(superpixel, blue.index).reshape(superpixel.shape))
# Now we can put the stars on the blueish, not too darkish areas
def randomstar(img, mask):
    """Draw a star at a random location, avoiding masked areas."""
    x = random.randint(1, img.shape[0] - 2)
    y = random.randint(1, img.shape[1] - 2)
    # only draw if the 3x3 neighbourhood is free of masked pixels
    if not mask[x-1:x+2, y-1:y+2].any():
        # colour not so random: a white plus shape
        img[x, y, :] = 255
        img[x-1, y, :] = 255
        img[x+1, y, :] = 255
        img[x, y-1, :] = 255
        img[x, y+1, :] = 255

for i in range(100):
    randomstar(img, mask)
plt.imshow(img)
Python's standard library doesn't come with any powerful-enough image-manipulation code, but there are a few alternatives that are easy to install and use. I'll show how to do this with PIL.
import random
from PIL import Image

def randomStars(small, large):
    filename = pickAFile()   # pickAFile() comes from your JES environment; any filename string works
    im = Image.open(filename)
    max_x, max_y = im.size
    pixels = im.load()
    for i in range(max_x):
        for j in range(max_y):
            if random.random() < 0.25:
                red = random.randrange(256)
                green = random.randrange(256)
                blue = random.randrange(256)
                pixels[i, j] = (red, green, blue)
    im.show()
The show function doesn't display the image in your app (for that, you'd need some kind of GUI with an event loop, like tkinter or PySide); it saves a file to a temporary directory and runs a platform-specific program like Preview or xv to display it.
I assume you're also going to want to save the file. That's easy too:
import os

name, ext = os.path.splitext(filename)   # ext keeps its leading dot
outfilename = '{}-with-stars{}'.format(name, ext)
im.save(outfilename)
This will save it back to a .jpg with default JPEG settings, relying on PIL guessing what you want from the filename. (Which means, yes, you can save it as a PNG just by using '{}-with-stars.png'.format(name).) If you want more control, PIL can do that too, letting you specify an explicit format and format-specific options.
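For instance (a small sketch reusing im, name and outfilename from above), save accepts an explicit format name plus keyword options specific to that format:

# Explicit format and a format-specific option (JPEG quality)
im.save(outfilename, format='JPEG', quality=95)
# Or force PNG regardless of the extension
im.save(name + '-with-stars.png', format='PNG')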
So far, this is just how to turn your existing code into something that works, that you can play with and start debugging; it doesn't actually answer the original problem.
I want to write a function that will create a random number (between m and n, inclusive) of stars in the sky of this picture
So first, you need this as your loop, instead of a loop over all pixels:
for _ in range(random.randint(m, n)):
Now:
I want the stars should be randomly composed of either a single white pixel, or a square of 4 adjacent white pixels.
x, y = random.randrange(max_x), random.randrange(max_y)
if random.random() < .5:
    # draw a white pixel at [x, y]
    pixels[x, y] = (255, 255, 255)
else:
    # draw a 2x2 square of white pixels at [x, y], making sure to handle edges
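One way to fill in that else branch (a sketch reusing pixels, max_x and max_y from above) is to clamp the square so it stays inside the image:

# Draw a 2x2 white square anchored at (x, y), clamped to the image edges
x2 = min(x + 1, max_x - 1)
y2 = min(y + 1, max_y - 1)
for sx in (x, x2):
    for sy in (y, y2):
        pixels[sx, sy] = (255, 255, 255)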
I also do not want to place a 'star' (of 1 pixel) over the tree branches, the moon or the bird though
You need to define how you know what's part of a tree branch, the moon, or the bird. Can you define that in terms of pixel colors?
From a quick glance, it looks like you might be able to. The moon's pixels are all brighter, more saturated, more red-biased, etc. than anything else (except the AP logo in the corner, which is even brighter). The bird and the branches are darker than anything else. In fact, they're so distinct that you probably don't even have to worry about doing correct colorspace math; it may be as simple as something like this:
r, g, b = pixels[x, y][:3]
fake_brightness = (r + g + b) / 255.0   # roughly 0..3 for 8-bit RGB
if fake_brightness < 0.2:
    pass   # Tree or bird: pick a new random position
elif 1.2 < fake_brightness < 2.8:
    pass   # Moon: pick a new random position
else:
    pass   # Sky or AP logo: scribble away
(Those numbers are obviously just pulled out of thin air, but a bit of trial and error should give you usable values.)
Of course, if you're doing this as a learning exercise, you probably want to learn the correct colorspace math, and maybe even write an edge-detection algorithm, rather than relying on this image being so simple to parse.
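For reference, the usual first step of that colorspace math is a Rec. 601 luma estimate, which weights the channels by perceived brightness instead of summing them equally:

# Rec. 601 luma: perceptual brightness from 8-bit RGB (result in 0..255)
def luma(r, g, b):
    return 0.299 * r + 0.587 * g + 0.114 * b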
Try an approach like:
for n in range(number_of_stars):
    # Find a good position
    while True:
        x, y = random_coords_in_image()
        if is_sky(image, x, y):
            break
    # paint a star there
    c = star_colour()
    if large_star():
        image.put(x, y, c)
        image.put(x, y+1, c)
        image.put(x+1, y+1, c)
        image.put(x+1, y, c)
    else:
        image.put(x, y, c)
The functions I used are fairly self-explanatory; you can implement is_sky by testing some conditions on the image colour at a given position.
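For example, a minimal is_sky sketch, assuming a PIL-style image and reusing the brightness heuristic from the previous answer (the thresholds are guesses you would tune by eye):

def is_sky(image, x, y):
    # darker pixels are branches/bird, brighter ones are moon/logo
    r, g, b = image.getpixel((x, y))[:3]
    brightness = r + g + b          # 0..765 for 8-bit RGB
    return 100 < brightness < 500   # assumed sky range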
I'm trying to place the cactus image over the desert image, with the green background of the cactus removed.
Here is my code:
from PIL import Image

image_desert = Image.open("desert.jpg")
image_cactus = Image.open("cactus.jpg")
image_cactus = image_desert.load()
for y in range(200, 300):
    for x in range(100, 200):
        (r, g, b) = image_cactus[x, y]
        newgreen = g + 30
        #choosing rgb colors
        image_cactus[x, y] = (r, newgreen, b)
image_desert.show()
Any help would be appreciated.
I thought this was interesting. I've never worked with picture manipulation before, so I wanted to give it a shot and see if I could figure it out. Let me know if this works for your purposes, or if it needs to be improved.
from PIL import Image

image_desert = Image.open("desert.jpg")
image_cactus = Image.open("cactus.jpg")
image_temp = image_desert.load()
image_cactus = image_cactus.load()

# This gets the rgb value of the green background from a corner pixel
(rCactus, gCactus, bCactus) = image_cactus[0, 0]

# This assumes both images are at least 600x600 pixels
for y in range(0, 600):
    for x in range(0, 600):
        (rResult, gResult, bResult) = image_cactus[x, y]
        # Compare the cactus pixel with the known background pixel value.
        # There is a tolerance of plus or minus 5, since the pixels change
        # ever so slightly near the edge of the cactus, so the tolerance
        # makes the edges look a little better.
        if gResult - 5 <= gCactus <= gResult + 5:
            # The pixel is within the tolerance of the known green background
            # value, so replace it with the pixel from the desert photo.
            (rResult, gResult, bResult) = image_temp[x, y]
        image_temp[x, y] = (rResult, gResult, bResult)
image_desert.show()
It's definitely not a perfect picture; Photoshop would do a much better job. Getting rid of all of the green around the cactus is where I'm running into the biggest issue: the wider you make the tolerance, the more your cactus may start disappearing and getting spotty.
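As a hedged alternative to the per-pixel loop, numpy can build the background mask and composite in one step. This sketch assumes both images are the same size and RGB, and that the top-left cactus pixel is background; it compares all three channels rather than just green:

import numpy as np
from PIL import Image

desert = np.array(Image.open("desert.jpg").convert("RGB"))
cactus = np.array(Image.open("cactus.jpg").convert("RGB"))

# Background colour sampled from the cactus image's top-left corner
bg = cactus[0, 0].astype(int)
# A pixel is 'background' if every channel is within 5 of that colour
is_bg = (np.abs(cactus.astype(int) - bg) <= 5).all(axis=-1)

# Take the desert pixel where the cactus shows background, else the cactus pixel
result = np.where(is_bg[..., None], desert, cactus)
Image.fromarray(result.astype(np.uint8)).show()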
I'm working with license plates, to which I apply a series of filters, such as:
Grayscale
Blur
Threshold
Binary
The problem is that when I do this, some contours like the ones in this image remain at the borders. How can I clear them, or just make them black (masked)? I used this code, but sometimes it fails.
# invert image and detect contours
inverted = cv2.bitwise_not(image_binary_and_dilated)
contours, hierarchy = cv2.findContours(inverted, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# get the biggest contour
biggest_index = -1
biggest_area = -1
i = 0
for c in contours:
    area = cv2.contourArea(c)
    if area > biggest_area:
        biggest_area = area
        biggest_index = i
    i = i + 1
print("biggest area: " + str(biggest_area) + " index: " + str(biggest_index))

cv2.drawContours(image_binary_and_dilated, contours, biggest_index, [0,0,255])

center, size, angle = cv2.minAreaRect(contours[biggest_index])
rot_mat = cv2.getRotationMatrix2D(center, angle, 1.)
#cv2.warpPerspective()
print(size)
dst = cv2.warpAffine(inverted, rot_mat, (int(size[0]), int(size[1])))
mask = dst * 0

x1 = max([int(center[0] - size[0] / 2) + 1, 0])
y1 = max([int(center[1] - size[1] / 2) + 1, 0])
x2 = int(center[0] + size[0] / 2) - 1
y2 = int(center[1] + size[1] / 2) - 1
point1 = (x1, y1)
point2 = (x2, y2)
print(point1)
print(point2)

cv2.rectangle(dst, point1, point2, [0,0,0])
cv2.rectangle(mask, point1, point2, [255,255,255], cv2.FILLED)
masked = cv2.bitwise_and(dst, mask)

#cv2_imshow(imgg)
cv2_imshow(dst)
cv2_imshow(masked)
#cv2_imshow(mask)
Some results:
The original plates were:
Good result 1
Good result 2
Good result 3
Good result 4
Bad result 1
Bad result 2
Binary plates are:
Image 1
Image 2
Image 3
Image 4
Image 5 - Bad result 1
Image 6 - Bad result 2
How can I fix this code? I just want to avoid the bad results, or at least improve them.
INTRODUCTION
What you are asking is starting to become complicated, and I believe there is no longer a single right or wrong answer, just different ways to do this. Almost all of them will yield a mix of positive and negative results, most likely in different ratios. Getting a 100% positive result is quite a challenging task, and I do believe my answer does not reach it. Yet it can be the basis for more sophisticated work towards that goal.
MY PROPOSAL
So, I want to make a different proposal here.
I am not 100% sure why you are doing all the steps, and I believe some of them could be unnecessary.
Let's start from the problem: you want to remove the white parts on the borders (which are not numbers).
So, we need an idea about how to distinguish them from the letters, in order to correctly tackle them.
If we just try to contour and warp, it is likely to work on some images and not on others, because not all of them look the same. This is the hardest part of the problem: finding a general solution that works for many images.
What is the difference between the characteristics of the numbers and the characteristics of the borders (and other small artifacts)?
After thinking about it, I would say: the shapes! That is, if you imagine a bounding box around a letter/number, it looks like a rectangle whose size is related to the image size, while the borders are usually very large and narrow, or too small to be considered a letter/number (random points).
Therefore, my guess would be segmentation: dividing the features by their shape. We take the binary image, remove some parts using the projections on the axes (as you correctly asked in the previous question, and which I believe we should use), and we get an image where each letter is separated from the white borders.
Then we can segment and check the shape of each segmented object; if we think it is a letter, we keep it, otherwise we discard it.
THE CODE
I wrote the code below as an example on your data. Some of the parameters are tuned for this set of images, so they may have to be relaxed for a larger dataset.
import cv2
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
import scipy.ndimage as ndimage

# do this for all the images
num_images = 6
plt.figure(figsize=(16, 16))
for k in range(num_images):
    # read the image
    binary_image = cv2.imread("binary_image/img{}.png".format(k), cv2.IMREAD_GRAYSCALE)
    # just for visualization purposes, I create another image with the same shape,
    # to show what I am doing
    new_intermediate_image = np.zeros((binary_image.shape), np.uint8)
    new_intermediate_image += binary_image
    # here we will copy only the cleaned parts
    new_cleaned_image = np.zeros((binary_image.shape), np.uint8)

    ### THIS CODE COMES FROM THE PREVIOUS ANSWER:
    # https://stackoverflow.com/questions/62127537/how-to-clean-binary-image-using-horizontal-projection?noredirect=1&lq=1
    (rows, cols) = binary_image.shape
    h_projection = np.array([x / rows for x in binary_image.sum(axis=0)])
    threshold_h = (np.max(h_projection) - np.min(h_projection)) / 10
    print("we will use threshold {} for horizontal".format(threshold_h))
    # select the black areas
    black_areas_horizontal = np.where(h_projection < threshold_h)
    for j in black_areas_horizontal:
        new_intermediate_image[:, j] = 0

    v_projection = np.array([x / cols for x in binary_image.sum(axis=1)])
    threshold_v = (np.max(v_projection) - np.min(v_projection)) / 10
    print("we will use threshold {} for vertical".format(threshold_v))
    black_areas_vertical = np.where(v_projection < threshold_v)
    for j in black_areas_vertical:
        new_intermediate_image[j, :] = 0
    ### UNTIL HERE

    # define the features we are looking for
    # these parameters can also be tuned
    min_width = binary_image.shape[1] / 14
    max_width = binary_image.shape[1] / 2
    min_height = binary_image.shape[0] / 5
    max_height = binary_image.shape[0]
    print("we look for features with width in [{},{}] and height in [{},{}]".format(
        min_width, max_width, min_height, max_height))

    # segment the image
    labeled_array, num_features = ndimage.label(new_intermediate_image)

    # loop over all features found (labels start at 1; 0 is the background)
    for i in range(1, num_features + 1):
        # get a bounding box around them
        slice_x, slice_y = ndimage.find_objects(labeled_array == i)[0]
        roi = labeled_array[slice_x, slice_y]
        # check the shape; if the bounding box is what we expect, copy it to the new image
        if roi.shape[0] > min_height and \
           roi.shape[0] < max_height and \
           roi.shape[1] > min_width and \
           roi.shape[1] < max_width:
            new_cleaned_image += (labeled_array == i)

    # print all images on a grid
    plt.subplot(num_images, 3, 1 + (k * 3))
    plt.imshow(binary_image)
    plt.subplot(num_images, 3, 2 + (k * 3))
    plt.imshow(new_intermediate_image)
    plt.subplot(num_images, 3, 3 + (k * 3))
    plt.imshow(new_cleaned_image)
This produces the output below (in the grid, the left images are the inputs, the central ones are the images after masking based on the histogram projections, and the right ones are the cleaned images):
CONCLUSIONS:
As said above, this method does not yield 100% positive results. The last picture has lower quality and some parts are unconnected, so they are lost in the process. I personally believe this is a price worth paying to get cleaner images, and if you have a lot of images, it won't be a problem: you can remove those kinds of images. Overall, I think this method returns quite clean images, where all other parts that are not letters or numbers are correctly removed.
ADVANTAGES
the image is clean, nothing more than letters or numbers are kept
the parameters can be tuned, and should be consistent across images
in case of problems, adding some prints or debugging in the loop that chooses which features to keep should make it easier to understand where the problems are and correct them
LIMITATIONS
it may fail in some cases where letters and numbers touch the white borders, which seems quite possible. This is partly handled by the black_areas created using the projections, but I am not confident it will work 100% of the time.
some small parts of the numbers can be lost during the process, as in the last picture.
I have an image, given below, which I am trying to convert into a binary image.
To convert this image into a binary image, I did:
image[image > 0] = 255
This creates a binary image where the colored region has only white pixels. But I also want to set the pixels above the colored region to 255. How could I do this? I want to convert not only the colored pixels to white but also the area above them, while the area denoted by arrows (i.e. the area after the colored region) remains black.
UPDATE
Also, how should I approach it if the edges are as shown below:
If I understood your problem correctly, a more elaborate approach is needed to get the desired result.
First of all, a simple threshold produces a noisy result.
I used a modified image of your sample:
If you apply your thresholding then a result like this might come up:
A finer thresholding can be useful here:
image3 = cv2.inRange(image, np.array([10, 10, 10]), np.array([255, 255, 255]))
which creates a binary image as a result (resembles your desired output except for the upper strip):
To get rid of the strip, I would (it's just one approach, not a perfect one) find the corner created by the white region and then use it to fill the whole region above it with white:
ind = np.where(image3 == 255)
max_x = np.max(ind[1])
max_y = ind[0][np.argmax(ind[1])]
image3[:max_y, :max_x] = 255
And the result would be like this:
By all means this is not a perfect answer, but it might be helpful.
I recreated the image as follows:
Then I read it and followed your path, making it binary first (with a small modification to reduce noise):
import numpy as np
from matplotlib import pyplot as plt
img = plt.imread("sample.jpg")
img2 = img.copy()
img2[img2.sum(-1) > 30] = 255
img2[img2.sum(-1) <= 30] = 0
Here is the result after this modification:
OPTION 1
This might not be what you asked but it is similar to one of the solutions discussed in the comments, and I think it is partly correct:
i, j = np.where(img2.sum(-1) > 0)      # find all white coordinates
i, j = (i[j.argmax()], j[j.argmax()])  # the corner where the white region meets the black
img2[:i, :j] = 255                     # paint white the whole rectangle above-left of this point
Here is the final result:
This is an imperfect but pretty simple pure numpy solution.
OPTION 2
In this solution, we need a little calculus and linear algebra: take two points in 2D space and draw a line between them. So, what is the function of the border line?
point2 = (i, j) # same i and j from OPTION1 (coordinates of the top-right corner)
point1 = (img2.shape[0], img2[-1].sum(-1).argmin()) # the bottom-right white corner.
a = (point2[1] - point1[1]) / (point2[0] - point1[0])
c = point1[1] - a * point1[0]
f = lambda x: int(a * x + c)
Now, paint all areas to the left of the line:
# for each row, paint white everything above it and left of the line
for i in range(img2.shape[0]):
    img2[:i, :f(i)+1] = 255
Here is the result:
I'm practically new to Python and don't have much knowledge of it. I need help converting this pseudocode, which obtains the background by removing moving objects from a set of images, into Python. Regarding the pseudocode, I don't understand lines 3, 4 and 5, so maybe once it's converted into Python I can understand it better. In lines 3 and 4, I don't understand what the & does, and in the last line, I don't understand how it is even computing an image.
Any help will be appreciated.
The code is provided below:
Mat sequence[3];// the sequence of images to loop through
Mat output, x = 0, y = 0; // looping through the sequence
matchTemplate(sequence[i], sequence[i+1], output, CV_TM_CCOEFF_NORMED)
mask = 1 & (output>0.9) // get correlated part amongst the images
x += sequence[i] & mask + sequence[i+1] & mask; // accumulate background infer
y += 2*mask; // keep count
end of loop;
Mat bg = x.mul(1.0/y); // average background
Sample images to try are also provided below:
image1
image2
image3
I'm not very familiar with OpenCV, so I hope you'll excuse me if I don't provide a code snippet you can just copy and paste. But if I understand the pseudocode correctly, it is doing this:
sequence = list of images
x will hold the sum of backgrounds
y will hold the number of frames used to build x
for each index i in sequence:
    c = matrix of correlation coefficients between (sequence[i], sequence[i+1]) from matchTemplate
    mask = pixels that are highly correlated (90%+)
    x += actual pixels from sequence[i] & mask and sequence[i+1] & mask that are considered background
    y += 2 for every pixel in mask
bg = average of background images x / number of frames y
So what's happening is, for every pair of images, it marks the pixels that are the same in both images. The assumption is that background doesn't change between adjacent frames and foreground does. Whether pixels are "the same" is judged on the basis of correlation being >90%. Then it takes all the marked pixels, and averages them.
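To make lines 3-5 concrete, here is a rough Python sketch of the pseudocode. Assumptions: sequence is a list of same-sized frames as numpy arrays, and a per-pixel absolute-difference test stands in for matchTemplate (which, applied to two equal-sized images, would collapse to a single score rather than a per-pixel map). The & in the pseudocode amounts to masking, implemented here as a multiply:

import numpy as np

def estimate_background(sequence, tol=10):
    acc = np.zeros(sequence[0].shape, dtype=np.float64)  # the 'x' accumulator
    cnt = np.zeros(sequence[0].shape, dtype=np.float64)  # the 'y' counter
    for a, b in zip(sequence[:-1], sequence[1:]):
        a = a.astype(np.float64)
        b = b.astype(np.float64)
        # mask = 1 where the two frames agree (stand-in for correlation > 0.9)
        mask = (np.abs(a - b) < tol).astype(np.float64)
        acc += a * mask + b * mask   # x += sequence[i] & mask + sequence[i+1] & mask
        cnt += 2 * mask              # y += 2*mask
    # bg = x * (1/y); np.maximum avoids division by zero where no frames agreed
    return (acc / np.maximum(cnt, 1)).astype(np.uint8)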
As one of the commenters mentioned, the mean of the images does remove the foreground, but the entire image becomes a little faded. Here is code that does that:
import skimage.io as io
import numpy as np
import matplotlib.pyplot as plt

cim1 = io.imread('https://i.stack.imgur.com/P44wT.jpg')
cim2 = io.imread('https://i.stack.imgur.com/wU4Yt.jpg')
cim3 = io.imread('https://i.stack.imgur.com/yUbB6.jpg')

x, y, z = cim1.shape
newimage = np.copy(cim1)
for row in range(x):
    for col in range(y):
        r = np.mean([cim1[row][col][0], cim2[row][col][0], cim3[row][col][0]]).astype(int)
        g = np.mean([cim1[row][col][1], cim2[row][col][1], cim3[row][col][1]]).astype(int)
        b = np.mean([cim1[row][col][2], cim2[row][col][2], cim3[row][col][2]]).astype(int)
        newimage[row][col] = [r, g, b]

fig, ax = plt.subplots(figsize=(10, 10))
ax.axis('off')
ax.imshow(newimage)
The output image I get from this:
A better approach to this problem is to take the median of the three images. The more images you feed the algorithm, the better the background gets. Here is a snippet I tried (just replacing mean with median); with more images you can get a much more accurate background.
x, y, z = cim1.shape
newimage = np.copy(cim1)
for row in range(x):
    for col in range(y):
        r = np.median([cim1[row][col][0], cim2[row][col][0], cim3[row][col][0]]).astype(int)
        g = np.median([cim1[row][col][1], cim2[row][col][1], cim3[row][col][1]]).astype(int)
        b = np.median([cim1[row][col][2], cim2[row][col][2], cim3[row][col][2]]).astype(int)
        newimage[row][col] = [r, g, b]

fig, ax = plt.subplots(figsize=(10, 10))
ax.axis('off')
ax.imshow(newimage)
The final output:
With more images, you could completely remove the foreground. I hope this gives you an idea to build on.
My code assumes all your images have the same dimensions. The solution will be a bit more complicated if you captured the images from different views; in that case you may have to use a template-matching algorithm (your pseudocode seems to be doing something similar) to extract the common canvas from your images.
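As a side note, since the frames are same-shaped numpy arrays, the per-pixel loops above can be replaced with a single vectorized call (a sketch using the same cim1, cim2, cim3):

import numpy as np

stack = np.stack([cim1, cim2, cim3])                   # shape (3, H, W, 3)
mean_bg = stack.mean(axis=0).astype(np.uint8)          # the faded average
median_bg = np.median(stack, axis=0).astype(np.uint8)  # the cleaner median background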
I am using Python, PIL, OpenCV and numpy to detect single-color texts (i.e. one is red, one is green). I want to detect these colored texts up to 6 meters away during a live stream. I have used color detection methods, but they did not work beyond 30-50 cm; the camera had to be close to the colors. As a second method to detect these texts, I used the CTPN method. Although it detects texts, it does not provide their coordinates, and I need the coordinate points of the texts as well. I also tried the OCR method in MATLAB to automatically detect text in a natural image, but it failed, since it finds other small objects as text. I am stuck about what to do.
Say, for example, there are two different texts in an image captured 6 meters away: one text is green, the other is red. The width of these texts is approximately 40-50 cm, and they are only two different words, not long texts. How can I detect them and specify their locations as (x1,y1) and (x2,y2)? Is that possible? I'd be grateful for any helpful hint.
import numpy as np
from PIL import Image
# Open image and make RGB and HSV versions
RGBim = Image.open("AdjustedNewMaze3.jpg").convert('RGB')
HSVim = RGBim.convert('HSV')
# Make numpy versions
RGBna = np.array(RGBim)
HSVna = np.array(HSVim)
# Extract Hue
H = HSVna[:,:,0]
# Find all green pixels, i.e. where 100 < Hue < 140
lo,hi = 100,140
# Rescale to 0-255, rather than 0-360 because we are using uint8
lo = int((lo * 255) / 360)
hi = int((hi * 255) / 360)
green = np.where((H>lo) & (H<hi))
# Make all green pixels black in original image
RGBna[green] = [0,0,0]
def find_nearest(array, value):
    array = np.asarray(array)
    idx = (np.abs(array - value)).argmin()
    return array[idx]

# Count the green pixels matched by np.where above and save the result
count = green[0].size
print("Pixels matched: {}".format(count))
Image.fromarray(RGBna).save('resultgreen.png')
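If the goal is the (x1,y1)/(x2,y2) locations the question asks for, one hedged follow-up to the hue mask above is a bounding box computed from the matched coordinates (using the green tuple returned by np.where, before the pixels are blacked out):

# Bounding box of the pixels matched by np.where((H>lo) & (H<hi))
if green[0].size:
    y1, y2 = green[0].min(), green[0].max()   # rows
    x1, x2 = green[1].min(), green[1].max()   # columns
    print("Green text roughly spans ({}, {}) to ({}, {})".format(x1, y1, x2, y2))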