I have an RGB video and a single keyframe from that video. In that keyframe, the user will apply a binary mask.
I want to create a mask of the video where pixels have values that exist in the keyframe's masked region.
In other words, I want to build a list of the RGB pixel values that occur inside the keyframe's mask, and then mask every other frame wherever a pixel's value appears in that list. Pixel values can range from (0,0,0) to (255,255,255).
My current implementation, although technically correct, is extremely inefficient, and I imagine there must be something better.
import cv2
import numpy as np

count = 0
value_ids = {}      # maps an (R, G, B) tuple to a unique integer id
newsequence = []
for filename in sequence:
    img = cv2.imread(filename)
    # note: uint16 ids overflow once there are more than 65535 distinct colours
    curr = np.zeros(img.shape[:2], dtype=np.uint16)
    for x in range(img.shape[0]):
        for y in range(img.shape[1]):
            rgb = (img[x][y][0], img[x][y][1], img[x][y][2])
            if rgb not in value_ids:
                value_ids[rgb] = count
                curr[x][y] = count
                count += 1
            else:
                curr[x][y] = value_ids[rgb]
    newsequence.append(curr)
# in another function, generate mask2, the mask of the keyframe
immask = cv2.bitwise_and(newsequence[keyframe], newsequence[keyframe],
                         mask=mask2[index].astype('uint8'))
immask = [x for x in immask.flatten() if x != 0]
# for thresholding purposes (keep a value only if at least 80% of the pixels
# with that value are selected in the keyframe)
valcount = np.bincount(immask)
truecount = np.bincount(newsequence[keyframe].flatten())
framemask = [x for x in set(immask) if float(valcount[x]) / float(truecount[x]) > 0.8]
for frame in range(numframes):
    # build up the mask with |=; a plain assignment inside the loop would
    # keep only the last value in framemask
    mask[frame] = np.zeros(newsequence[frame].shape, dtype='uint8')
    for val in framemask:
        mask[frame] |= np.where(newsequence[frame] == val, 255, 0).astype('uint8')
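For reference, here is a vectorized sketch of the same idea that avoids the per-pixel Python loops entirely (the names frames, key_index and key_mask are placeholders, not from the code above): pack each RGB triple into one 24-bit integer, apply the 80% threshold with two bincounts over the keyframe, and turn every frame into a mask with a single table lookup.

import numpy as np

def pack_rgb(img):
    # collapse each (R, G, B) triple into a single 24-bit integer
    img = img.astype(np.uint32)
    return img[..., 0] | (img[..., 1] << 8) | (img[..., 2] << 16)

def propagate_mask(frames, key_index, key_mask, thresh=0.8):
    # frames: list of HxWx3 uint8 arrays, key_mask: HxW boolean keyframe mask
    key = pack_rgb(frames[key_index])
    selected = np.bincount(key[key_mask], minlength=1 << 24)
    total = np.bincount(key.ravel(), minlength=1 << 24)
    keep = np.zeros(1 << 24, dtype=bool)          # lookup table, one entry per colour
    nonzero = total > 0
    keep[nonzero] = selected[nonzero] / total[nonzero] > thresh
    # every frame becomes a mask with a single table lookup
    return [np.where(keep[pack_rgb(f)], 255, 0).astype(np.uint8) for f in frames]

The two bincount tables take a few hundred MB; counting only the values that actually occur (for example with np.unique) is an alternative if that is too much memory.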
I am trying to write code in Python to display an altered image according to some conditionals. It is part of an LED strip display, but I am not experienced with image processing. Maybe someone could help me.
The image with fixed dimensions [height x width] is loaded and the code has to rearrange the rows in two ways:
For the first half of the rows, for example for max_height = 10: [1,2,3,4,5] --> [1,3,5,7,9]
i = 0
for height_pixel <= max_height/2:
    height_pixel = height_pixel + 1
    i += 1
For the second half of rows, for example for max_height = 10: [6,7,8,9,10] --> [10,8,6,4,2]
i = 4
for height_pixel > max_height/2:
    height_pixel = (max_height/2) + 1 + i
    i -= 2
The columns are not changed.
After that, print/show/... the new image.
The code I have so far, based on the adafruit library:
from PIL import Image
import adafruit_dotstar as dotstar
NUMPIXELS = 10 #Length of strip
FILENAME = "image.png" # Image file to load
# Load image and get dimensions:
IMG = Image.open(FILENAME).convert("RGB")
PIXELS = IMG.load()
WIDTH = IMG.size[0]
HEIGHT = IMG.size[1]
HEIGHT = min(HEIGHT, NUMPIXELS)
######################################################
# CONDITIONAL PART
pixelMap = IMG.load()
img = Image.new( IMG.mode, IMG.size)
pixelsNew = img.load()
for i in range(img.size[0]):
    if ...:
        ...
    else:
        ...
#######################################################
img.show() #altered image
# here I allocate a list of lists, one for each column of the image
COLUMN = [0 for x in range(WIDTH)]
for x in range(WIDTH):
    COLUMN[x] = [[0, 0, 0, 0] for _ in range(HEIGHT)]
# convert it into a 2D list [column x row]
for x in range(WIDTH):               # For each column of image
    for y in range(HEIGHT):          # For each pixel in column
        value = PIXELS[x, y]         # Read RGB pixel in image
        COLUMN[x][y][0] = value[0]   # R
        COLUMN[x][y][1] = value[1]   # G
        COLUMN[x][y][2] = value[2]   # B
        COLUMN[x][y][3] = 1.0        # Brightness
# and display in a loop
# here the columns will be changed after pressing the button; I will add it later, for now it is in a loop
while True:
    for x in range(WIDTH):               # For each column of image...
        DOTS[0 : DOTS.n] = COLUMN[x]     # DOTS is the dotstar strip object (not shown above)
        DOTS.show()
Maybe someone could give hints or help me with the code.
Best regards
I would suggest generating a mapping array, which maps the original row indices to the newly ordered ones (and by the way, indices should start at 0):
first_half = [2*idx + 1 for idx in range(5)] # 1,3,5,7,9
second_half = [8-2*idx for idx in range(5)] # 8,6,4,2,0
mapping = first_half + second_half
for idx in range(HEIGHT):
    new_img[idx] = IMG[mapping[idx]]  # introduced new_img to avoid confusion between img and IMG
And then you should be almost done. As far as I can see you do not need to generate this COLUMN list of lists.
for column in new_img:
    DOTS[0 : DOTS.n] = column  # if that's the DOTS notation; I am not familiar with that part of your code
    DOTS.show()
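To make that concrete, here is a minimal runnable sketch of the same mapping applied with numpy fancy indexing (an illustration, assuming the image really has an even number of rows equal to max_height, i.e. 10 in the example above):

import numpy as np
from PIL import Image

IMG = Image.open("image.png").convert("RGB")
arr = np.array(IMG)                       # shape (HEIGHT, WIDTH, 3)
HEIGHT = arr.shape[0]                     # 10 in the example

# output row idx takes its pixels from input row mapping[idx]
first_half = [2 * idx + 1 for idx in range(HEIGHT // 2)]             # 1, 3, 5, 7, 9
second_half = [HEIGHT - 2 - 2 * idx for idx in range(HEIGHT // 2)]   # 8, 6, 4, 2, 0
mapping = first_half + second_half

new_arr = arr[mapping]                    # fancy indexing reorders all rows at once
Image.fromarray(new_arr).show()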
I am trying to increase the region of interest of an image using the below algorithm.
First, the set of pixels of the exterior border of the ROI is determined, i.e., pixels that are outside the ROI and are neighbors (using four-neighborhood) to pixels inside it. Then, each pixel value of this set is replaced with the mean value of its neighbors (this time using eight-neighborhood) inside the ROI. Finally, the ROI is expanded by inclusion of this altered set of pixels. This process is repeated and can be seen as artificially increasing the ROI.
The pseudocode is below -
while there are border pixels:
    border_pixels = []

    # find the border pixels
    for each pixel p=(i, j) in image:
        if p is not in ROI and ((i+1, j) in ROI or (i-1, j) in ROI or (i, j+1) in ROI or (i, j-1) in ROI or (i-1, j-1) in ROI or (i+1, j+1) in ROI):
            add p to border_pixels

    # calculate the averages
    for each pixel p in border_pixels:
        color_sum = 0
        count = 0
        for each pixel n in 8-neighborhood of p:
            if n in ROI:
                color_sum += color(n)
                count += 1
        color(p) = color_sum / count

    # update the ROI
    for each pixel p=(i, j) in border_pixels:
        set p to be in ROI
Below is my code
import numpy as np
from skimage import io

img = io.imread(path_dir)
newimg = np.zeros((584, 565, 3))
mask = img == 0
while True:
    border_pixels = []
    # find the border pixels
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            for k in range(0, 3):
                if i+1 <= 583 and j+1 <= 564 and i-1 >= 0 and j-1 >= 0:
                    if mask[i][j][k] and ((mask[i+1][j][k] == False) or (mask[i-1][j][k] == False) or (mask[i][j+1][k] == False) or (mask[i][j-1][k] == False) or (mask[i-1][j-1][k] == False) or (mask[i+1][j+1][k] == False)):
                        border_pixels.append([i, j, k])
    if len(border_pixels) == 0:
        break
    # replace each border pixel with the mean of its in-ROI neighbours
    for (each_i, each_j, each_k) in border_pixels:
        color_sum = 0
        count = 0
        eight_neighbourhood = [[each_i-1, each_j], [each_i+1, each_j], [each_i, each_j-1], [each_i, each_j+1],
                               [each_i-1, each_j-1], [each_i-1, each_j+1], [each_i+1, each_j-1], [each_i+1, each_j+1]]
        for pix_i, pix_j in eight_neighbourhood:
            if mask[pix_i][pix_j][each_k] == False:
                color_sum += img[pix_i, pix_j, each_k]
                count += 1
        img[each_i][each_j][each_k] = color_sum // count
    # update the mask; do not remove items from border_pixels while iterating,
    # the list is rebuilt at the top of the while loop anyway
    for (i, j, k) in border_pixels:
        mask[i, j, k] = False
io.imsave("tryout6.png", img)
But it is not making any change to the image; I am getting the same image as before.
So I tried plotting the border pixels on a black image of the same dimensions for the first iteration, and I am getting the result below.
I really don't have any idea what I am doing wrong here.
Here's a solution that I think works as you have requested (although I agree with @Peter Boone that it will take a while). My implementation has a triple loop, but maybe someone else can make it faster!
First, read in the image. With my method, the pixel values are floats between 0 and 1 (rather than integers between 0 and 255).
import urllib.request
import matplotlib.pyplot as plt
import numpy as np
from skimage.morphology import binary_dilation, binary_erosion, disk
from skimage.color import rgb2gray
from skimage.filters import threshold_otsu
# create a file-like object from the url
f = urllib.request.urlopen("https://i.stack.imgur.com/JXxJM.png")
# read the image file in a numpy array
# note that all pixel values are between 0 and 1 in this image
a = plt.imread(f)
Second, add some padding around the edges, and threshold the image. I used Otsu's method, but @Peter Boone's answer works well, too.
# add black padding around image 100 px wide
a = np.pad(a, ((100,100), (100,100), (0,0)), mode = "constant")
# convert to greyscale and perform Otsu's thresholding
grayscale = rgb2gray(a)
global_thresh = threshold_otsu(grayscale)
binary_global1 = grayscale > global_thresh
# define number of pixels to expand the image
num_px_to_expand = 50
The image, binary_global1 is a mask that looks like this:
Since the image is three channels (RGB), I process the channels separately. I noticed that I needed to erode the image by ~5 px because the outside of the image has some unusual colors and patterns.
# process each channel (RGB) separately
for channel in range(a.shape[2]):

    # select a single channel
    one_channel = a[:, :, channel]

    # reset binary_global for each channel
    binary_global = binary_global1.copy()

    # erode by 5 px to get rid of unusual edges from original image
    binary_global = binary_erosion(binary_global, disk(5))

    # turn everything less than the threshold to 0
    one_channel = one_channel * binary_global

    # update pixels one at a time
    for jj in range(num_px_to_expand):

        # get the 1 px ring of pixels to update
        px_to_update = np.logical_xor(binary_dilation(binary_global, disk(1)),
                                      binary_global)

        # update those pixels with the average of their neighborhood
        x, y = np.where(px_to_update == 1)

        for x, y in zip(x, y):
            # make 3 x 3 px slices
            slices = np.s_[(x-1):(x+2), (y-1):(y+2)]

            # update a single pixel
            one_channel[x, y] = (np.sum(one_channel[slices] *
                                        binary_global[slices]) /
                                 np.sum(binary_global[slices]))

        # update original image
        a[:, :, channel] = one_channel

        # increase binary_global by 1 px dilation
        binary_global = binary_dilation(binary_global, disk(1))
When I plot the output, I get something like this:
# plot image
plt.figure(figsize=[10,10])
plt.imshow(a)
This is an interesting idea. You're going to want to use masks and some form of rank-based mean filter to accomplish this. Going pixel by pixel will take you a while; instead, you want to use different convolution filters.
If you do something like this:
from skimage import io
from skimage.morphology import binary_dilation, binary_erosion, square
from skimage.filters.rank import mean_bilateral

image = io.imread("roi.jpg")
mask = image[:,:,0] < 30
just_inside = binary_dilation(mask) ^ mask
image[~just_inside] = [0,0,0]
you will have a mask representing just the pixels inside of the ROI. I also set the pixels not in that area to 0,0,0.
Then you can get the pixels just outside of the roi:
just_outside = binary_erosion(mask) ^ mask
Then get the mean bilateral of each channel:
mean_blue = mean_bilateral(image[:,:,0], selem=square(3), s0=1, s1=255)
#etc...
This isn't exactly correct, but I think it should put you in the right direction. I would check out image.sc if you have more general questions about image processing. Let me know if you need more help, as this was more of a general direction than working code.
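For what it's worth, a minimal runnable sketch along those lines (the name grow_roi and the iteration count are illustrative, not from the answer above), assuming img is an HxWx3 array and roi is a boolean array that is True inside the region:

import numpy as np
from skimage.morphology import binary_dilation, disk

def grow_roi(img, roi, iterations=50):
    # img: HxWx3 array, roi: HxW boolean array, True inside the ROI
    img = img.astype(float)
    roi = roi.copy()
    for _ in range(iterations):
        ring = binary_dilation(roi, disk(1)) & ~roi   # exterior border of the ROI
        if not ring.any():
            break
        ys, xs = np.nonzero(ring)
        for y, x in zip(ys, xs):
            win = np.s_[max(y - 1, 0):y + 2, max(x - 1, 0):x + 2]
            inside = roi[win]
            if inside.any():
                # mean of the in-ROI neighbours, one value per channel
                img[y, x] = img[win][inside].mean(axis=0)
        roi |= ring                                   # absorb the ring into the ROI
    return img, roi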
Problem: I have some images (in numpy) that have a black background. In the middle, I have an object that I am interested in. I would like to crop the object of interest, in a numpy way.
Let's say we have an image that looks like this:
I need a function crop_region_of_interest that detects and removes every row and column that is entirely black ([0,0,0]), in order to get this:
Some code used in this demo:
import numpy as np
import matplotlib.pyplot as plt

# just a function to add colors to generate a test image
def add_color(img, pixel_x, pixel_y, rgb):
    img[pixel_x][pixel_y][0] = rgb[0]
    img[pixel_x][pixel_y][1] = rgb[1]
    img[pixel_x][pixel_y][2] = rgb[2]

def generate_fake_image_for_stackoverflow():
    # a black background image
    base_img = np.zeros((16, 16, 3), dtype=int)
    # let's add some colors, these are the region we want
    for x in range(4, 10):
        for y in range(6, 12):
            if x == y:
                continue
            if x + y < 12:
                continue
            if x + y > 16:
                continue
            add_color(base_img, x, y, [255, 60, 90])
    return base_img

# a hardcoded crop to generate the expected result
def crop_region_of_interest(img):
    # crop first axis
    cropped = img[4:10]
    # transpose to second axis, so we can crop
    cropped = cropped.transpose((1, 0, 2))
    cropped = cropped[6:12]
    # transpose back
    cropped = cropped.transpose((1, 0, 2))
    cropped = cropped.transpose((1, 0, 2))
    cropped = cropped.transpose((1, 0, 2))
    return cropped

img = generate_fake_image_for_stackoverflow()  # generate a test image
plt.imshow(img)
plt.show()

cropped = crop_region_of_interest(img)  # hardcoded crop producing the expected result, to be replaced
plt.imshow(cropped)
plt.show()
I thought about it some more, following @Divakar's comment suggestion. The process is fairly tricky, so I refactored it into very short functions so as to give everything a useful name.
def bounds(values):
    # create a slice object representing the range from lowest to highest
    # index value. Add 1 on the high side because of how ranges work
    return slice(min(values), max(values) + 1)

def crop_bounds(mask, axis):
    # find the indexes along this axis where any of the pixels are
    # non-black, then convert those indexes into a bounds slice.
    return bounds(np.where(mask.any(axis))[0])

def trim(img):
    # True where pixels are non-black.
    mask = np.any(img != 0, axis=2)
    # Get the crop bounds for each axis, and slice with them:
    # mask.any(1) collapses columns and gives row bounds, mask.any(0) gives column bounds.
    return img[crop_bounds(mask, 1)][:, crop_bounds(mask, 0)]
(Slice objects are a Python built-in, supported by Numpy. They basically represent the values used in a slicing operation; x[slice(a,b)] is equivalent to x[a:b].)
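For example, on the 16x16 test image generated in the question (a usage sketch; it assumes generate_fake_image_for_stackoverflow, plt and trim are all in scope):

img = generate_fake_image_for_stackoverflow()
plt.imshow(trim(img))   # only the non-black bounding box remains
plt.show()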
Just a slight edit on the solution provided at https://codereview.stackexchange.com/questions/132914/crop-black-border-of-image-using-numpy:
def crop_image(img, tol=0):
    # img is 2D or 3D image data
    # tol is tolerance
    mask = img > tol
    if img.ndim == 3:
        mask = mask.all(2)
    m, n = mask.shape
    mask0, mask1 = mask.any(0), mask.any(1)
    col_start, col_end = mask0.argmax(), n - mask0[::-1].argmax()
    row_start, row_end = mask1.argmax(), m - mask1[::-1].argmax()
    return img[row_start:row_end, col_start:col_end]
Thanks @jamal and @Divakar
Use the following function:
def crop_image(img):
    x_dim = len(img)
    y_dim = len(img[0])
    vertical_bounds = []
    horizontal_bounds = []
    for index_x, line in enumerate(img):
        for index_y, value in enumerate(line):
            if value[0] + value[1] + value[2] != 0:
                vertical_bounds.append(index_x)
                horizontal_bounds.append(index_y)
    cropped = img[min(vertical_bounds):max(vertical_bounds)+1]
    cropped = cropped.transpose((1, 0, 2))
    cropped = cropped[min(horizontal_bounds):max(horizontal_bounds)+1]
    cropped = cropped.transpose((1, 0, 2))
    cropped = cropped.transpose((1, 0, 2))
    cropped = cropped.transpose((1, 0, 2))
    return cropped
How to count pixel occurrences in an image?
I have a set of images where each pixel consists of 3 integers in the range 0-255.
I am interested in finding one pixel that is "representative" (as much as possible) for the entire pixel-population as a whole, and that pixel must occur in the pixel-population.
I think that determining which pixel is the most common (the mode) in my set of images makes the most sense.
I am using Python, but I am not sure how to go about it.
The images are stored as a numpy array with dimensions [n, h, w, c], where n is the number of images, h is the height, w is the width and c is the channels (RGB).
I'm going to assume you need to find the most common element, which as Cris Luengo mentioned is called the mode. I'm also going to assume that the bit depth of the channels is 8-bit (value between 0 and 255, i.e. modulo 256).
Here is an implementation independent approach:
The aim is to maintain a count of all the different kinds of pixels encountered. It makes sense to use a dictionary for this, which would be of the form {pixel_value : count}.
Once this dictionary is populated, we can find the pixel with the highest count.
Now, 'pixels' are not hashable and hence cannot be stored in a dictionary directly. We need a way to assign an integer(which I'll be referring to as the pixel_value) to each unique pixel, i.e., you should be able to convert pixel_value <--> RGB value of a pixel
This function converts RGB values to an integer in the range of 0 to 16,777,215:
def get_pixel_value(pixel):
    return pixel.red + 256*pixel.green + 256*256*pixel.blue
and to convert pixel_value back into RGB values:
def get_rgb_values(pixel_value):
    red = pixel_value % 256
    pixel_value //= 256
    green = pixel_value % 256
    pixel_value //= 256
    blue = pixel_value
    return [red, green, blue]
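For instance, packing the pixel (R, G, B) = (10, 20, 30) gives 10 + 20*256 + 30*65536 = 1,971,210, and unpacking 1,971,210 with the steps above returns [10, 20, 30] again, so the two functions are inverses of each other.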
This function can find the most frequent pixel in an image:
def find_most_common_pixel(image):
    histogram = {}  # dictionary keeps count of different kinds of pixels in image
    for pixel in image:
        pixel_val = get_pixel_value(pixel)
        if pixel_val in histogram:
            histogram[pixel_val] += 1   # increment count
        else:
            histogram[pixel_val] = 1    # pixel_val encountered for the first time
    mode_pixel_val = max(histogram, key=histogram.get)  # find the pixel_val whose count is maximum
    return get_rgb_values(mode_pixel_val)  # returns a list containing the RGB value of the most frequent pixel
If you wish to find the most frequent pixel in a set of images, simply add another loop for image in image_set and populate the dictionary for all pixel_values in all images.
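If the images are already stacked in a single numpy array of shape [n, h, w, c], an alternative sketch (not part of the answer above) lets np.unique do the counting in one pass:

import numpy as np

def most_common_pixel(images):
    # images: array of shape (n, h, w, c); flatten to one long list of pixels,
    # then count each distinct RGB triple in a single pass
    pixels = images.reshape(-1, images.shape[-1])
    values, counts = np.unique(pixels, axis=0, return_counts=True)
    return values[counts.argmax()]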
You can iterate over the x/y of the image.
A pixel will be img_array[x, y, :] (the : selects the RGB channels).
You can add each one to a Counter (from collections).
Here is an example of the concept over an image:
from PIL import Image
import numpy as np
from collections import Counter
# img_path is the path to your image
cnt = Counter()
img = Image.open(img_path)
img_arr = np.array(img)
for x in range(img_arr.shape[0]):
    for y in range(img_arr.shape[1]):
        cnt[str(img_arr[x, y, :])] += 1
print(cnt)
# Counter({'[255 255 255]': 89916, '[143 143 143]': 1491, '[0 0 0]': 891, '[211 208 209]': 185, ...
A more efficient way to do it is by using the power of numpy and some math manipulation (because we know the values are bounded to [0, 255]):
img = Image.open(img_path)
img_arr = np.array(img).astype(np.uint32)  # widen the dtype so the packing arithmetic does not overflow uint8
pixels_arr = (img_arr[:, :, 0] + img_arr[:, :, 1]*256 + img_arr[:, :, 2]*(256**2)).flatten()
cnt = Counter(pixels_arr)
# print(cnt)
# Counter({16777215: 89916, 9408399: 1491, 0: 891, 13750483: 185, 14803425: 177, 5263440: 122 ...
# print(cnt.most_common(1))
# [(16777215, 89916)]
pixel_value = cnt.most_common(1)[0][0]
Now a conversion back to the original 3 values is exactly like Aayush Mahajan has written in his answer, but I've shortened it for the sake of simplicity:
r, g, b = pixel_value % 256, (pixel_value//256) % 256, pixel_value//(256**2)
So you are using the power of numpy's fast computation (a significant improvement in run time).
You use Counter, which is an extension of the Python dictionary dedicated to counting.
I am using Image.point and Image.fromarray to do exactly the same operation on an image: increase the value of all pixels together by the same amount. The thing is that I get two absolutely different images.
using point
def getValue(val):
    return math.floor(255*float(val)/100)

def func(i):
    return int(i + getValue(50))

out = img.point(func)
using array and numpy
arr = np.array(np.asarray(img).astype('float'))
value = math.floor(255*float(50)/100)
arr[...,0] += value
arr[...,1] += value
arr[...,2] += value
out = Image.fromarray(arr.astype('uint8'), 'RGB')
I am using the same image (a jpg).
the initial image
the image with point
the image with arrays
How can they be so much different?
You have values greater than 255 in your array, which you then convert to uint8 ... what do you want those values to become in the image? If you want them to be 255, clip them first:
out_arr_clip = Image.fromarray(arr.clip(0,255).astype('uint8'), 'RGB')
By the way, there's no need to add to each color band separately:
arr = np.asarray(img, dtype=float) # also simplified
value = math.floor(255*float(50)/100)
arr += value # the same as doing this in three separate lines
If your value is different for each band, you can still do this because of broadcasting:
percentages = np.array([25., 50., 75.])
values = np.floor(255*percentages/100)
arr += values # the first will be added to the first channel, etc.
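Putting the clip and the broadcast together, a minimal sketch (assuming img is the PIL image from the question; the three percentages are just example values):

import numpy as np
from PIL import Image

arr = np.asarray(img, dtype=float)
percentages = np.array([25., 50., 75.])          # example per-channel offsets
values = np.floor(255 * percentages / 100)
arr += values                                    # broadcast: one offset per channel
out = Image.fromarray(arr.clip(0, 255).astype('uint8'), 'RGB')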
Fixed it :)
Didn't take into consideration going out of bounds. So I did:
for i in range(3):
    conditions = [arr[..., i] > 255, arr[..., i] < 0]
    choices = [255, 0]
    arr[..., i] = np.select(conditions, choices, default=arr[..., i])
Worked like a charm... :)