Combining broadcast and boolean array indexing in NumPy for image masking

I am working on an image processing/building problem. I have a smaller image that I want to place into a larger one. As usual, the image is represented as a 3D array. This works fine with the following code (both element_pixels and image_pixels are 3D ndarrays with depth 3 representing RGB; element_pixels is equal to or smaller than image_pixels in the other dimensions):
element_pixels = element.get_pixels()
image_pixels[element.position[0]:element.position[0]+element.height, element.position[1]:element.position[1]+element.width, :] = element_pixels
However, I want to treat black pixels in the element as transparent. The simplest way to do this seems to be to mask the element so I don't modify image_pixels where element_pixels is black. I tried the following, but I am tying myself in knots:
element_pixels = element.get_pixels()
b = np.all(element_pixels == [0, 0, 0], axis=-1)
black_pixels_mask = np.dstack([b,b,b])
image_pixels[element.position[0]:element.position[0]+element.height, element.position[1]:element.position[1]+element.width, :][black_pixels_mask] = element_pixels
This appears to be correctly generating a mask, but I can't figure out how to use it. I get the following error:
image_pixels[element.position[0]:element.position[0]+element.height, element.position[1]:element.position[1]+element.width, :][black_pixels_mask] = element_pixels
TypeError: NumPy boolean array indexing assignment requires a 0 or 1-dimensional input, input has 3 dimensions
The masking kind-of works (i.e. runs without exceptions) if I replace the final = element_pixels with a constant, but I'm struggling to extrapolate this to a solution.
Extra detail of sizes:
element_pixels.shape = (40, 40, 3)
image_pixels.shape = (100, 100, 3)
image_pixels[element.position[0]:element.position[0]+element.height, element.position[1]:element.position[1]+element.width, :].shape = (40, 40, 3)
An MRE in 2D
This captures what I'm trying to do without the complexity of the extra dimension.
import numpy as np
bg = np.ones((10,10))*0.5
img = np.concatenate([np.zeros((5,1)),np.ones((5,1))], axis=1)
mask = img == 0
# copy the *non-zero* pixel values of img to a particular location in bg
bg[5:10,5:7][mask] = img # this throws exception
print(bg)

I discovered after some experimentation that the (perhaps obvious in hindsight) answer is that you have to apply the mask to both sides.
So taking my MRE:
import numpy as np
bg = np.ones((10,10))*0.5
img = np.concatenate([np.zeros((5,1)),np.ones((5,1))], axis=1)
mask = img > 0
bg[5:10,5:7][mask] = img[mask]
print(bg)
Or going back to my original code, the only line that changes is:
image_pixels[element.position[0]:element.position[0]+element.height, element.position[1]:element.position[1]+element.width, :][~black_pixels_mask] = element_pixels[~black_pixels_mask]
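For reference, here is a minimal self-contained sketch of the whole flow; the shapes, colours and paste position are made up for illustration:
import numpy as np
# Hypothetical data: a grey 100x100 background and a 40x40 element
# whose pure-black pixels should be treated as transparent
image_pixels = np.full((100, 100, 3), 128, dtype=np.uint8)
element_pixels = np.zeros((40, 40, 3), dtype=np.uint8)
element_pixels[10:30, 10:30] = [255, 0, 0]  # a red square on black
b = np.all(element_pixels == [0, 0, 0], axis=-1)  # (40, 40) bool
black_pixels_mask = np.dstack([b, b, b])          # (40, 40, 3) bool
# Basic slicing returns a view, so assigning through it updates image_pixels
region = image_pixels[20:60, 20:60, :]
region[~black_pixels_mask] = element_pixels[~black_pixels_mask]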

Well, you can use a 2D mask on a 3D array. Something like this will replace all black pixels of img with those of background.
img = np.random.randint(0, 2, (10, 10, 3))
background = np.random.randint(0, 2, (10, 10, 3))
mask = np.all(img == [0,0,0], axis=2)
img[mask] = background[mask]
I'm not sure I understand what is in image_pixels but I think you can do something similar.
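Applied back to the question's variables (reusing the element.position/height/width attributes from the question), a sketch with the 2D mask, which avoids the np.dstack step entirely, might look like:
region = image_pixels[element.position[0]:element.position[0]+element.height,
                      element.position[1]:element.position[1]+element.width, :]
mask2d = np.all(element_pixels == [0, 0, 0], axis=-1)  # (height, width) bool
region[~mask2d] = element_pixels[~mask2d]              # copy only the non-black pixels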

Related

numpy.where on 2d or 3d matrix

I want to get the index where the target is located in the background image using numpy. The background color of each image is actually variable, so except for the color of the square, the other colors (in this case, painted black), including the inside of the square, will vary.
I have no idea how to approach this at all, as I'm not familiar with numpy.
import cv2
import numpy as np
# changing to grayscale to have a 2D array
output = cv2.imread('backgroundimage.png', cv2.IMREAD_GRAYSCALE)
output1 = cv2.imread('target.png', cv2.IMREAD_GRAYSCALE)
I tried to change the images to 2D arrays because I thought it might be easier to approach.
a = np.where(output==output1)
Apparently this doesn't work for 2D or 3D.
My desired output would be something like this:
desired output = (108, 23) (the x and y coordinates of where it's found)
So how would I be able to do what I want?
You have to use a sliding window approach, and make sure that all pixels are equal for each window you compare with the target. You can do that using sliding_window_view and all. You can mask out any values inside the target that you do not want to match by setting them to True:
import cv2
import numpy as np
output = cv2.imread('backgroundimage.png', cv2.IMREAD_GRAYSCALE)
output1 = cv2.imread('target.png', cv2.IMREAD_GRAYSCALE)
# Apply sliding window view
windows = np.lib.stride_tricks.sliding_window_view(output, output1.shape)
# Only check the 1 pixel border by making a mask
mask = np.full(output1.shape, False)  # boolean mask (np.full_like would inherit the image's uint8 dtype)
mask[1:-1, 1:-1] = True
# Apply mask and check match everywhere
masked = (windows == output1) | mask
matches = masked.all(axis=(2, 3))
locations = np.where(matches)
locations:
(array([24], dtype=int64), array([108], dtype=int64))
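Note that np.where returns (row indices, column indices), so to report the (x, y) pair the question asks for you have to swap them; a small sketch:
ys, xs = locations
coords = list(zip(xs.tolist(), ys.tolist()))  # [(108, 24)] as (x, y) pairs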

Remove [255,255,255] entries from list of image RGB values

I reshaped an image (included below) as a list of pixels, and now I want to remove the white ones (with value [255,255,255]). What is an efficient way to do it?
I tried using IM[IM != [255,255,255]] and I got a list of values, instead of a list of value triplets. Here is the code I'm using:
import cv2
import numpy as np
IM = cv2.imread('Test_image.png')
image = cv2.cvtColor(IM, cv2.COLOR_BGR2RGB)
# reshape the image to be a list of pixels
image_vec = np.array(image.reshape((image.shape[0] * image.shape[1], 3)))
image_clean = image_vec[image_vec != [255,255,255]]
print(image_clean)
The issue is that numpy automatically does array broadcasting, so IM != [255,255,255] will compare each element to the corresponding element of [255,255,255] and return a boolean array with the same shape as the image data. Using this as a mask will return the values as a 1D array.
An easy way to fix this is to use np.all:
image_vec[~np.all(image_vec == 255, axis=-1)]
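A tiny sketch with made-up pixel values to show the shape behaviour:
import numpy as np
image_vec = np.array([[255, 255, 255],
                      [10, 20, 30],
                      [255, 255, 255]], dtype=np.uint8)
keep = ~np.all(image_vec == 255, axis=-1)  # one bool per pixel, not per channel
print(image_vec[keep])                     # [[10 20 30]]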

How do I use only numpy to apply filters onto images?

I would like to apply a filter/kernel to an image to alter it (for instance, perform vertical edge detection, diagonal blur, etc). I found this wikipedia page with some interesting examples of kernels.
When I look online, filters are implemented using OpenCV or default matplotlib/Pillow functions. I want to be able to modify an image using only numpy arrays and functions like matrix multiplication and such (there doesn't appear to be a default numpy function to perform a 2D convolution). I've tried very hard to figure it out, but I keep making errors, and I'm also relatively new to numpy.
I worked out this code to convert an image to greyscale:
import numpy as np
from PIL import Image
img = Image.open("my_path/my_image.jpeg")
img = np.array(img.resize((180, 320)))
grey = np.zeros((320, 180))
grey_avg_array = (np.sum(img,axis=-1,keepdims=False)/3)
grey_avg_array = grey_avg_array.astype(np.uint8)
grey_image = Image.fromarray(grey_avg_array)
I have tried to multiply my image by a numpy array [[1, 0, -1], [1, 0, -1], [1, 0, -1]] to implement edge detection, but that gave me a broadcasting error. What would some sample code/useful functions to do this without errors look like?
Also: a minor problem I've faced all day is that PIL can't display (x, x, 1) shaped arrays as images. Why is this? How do I get it to fix this? (np.squeeze didn't work)
Note: I would highly recommend checking out OpenCV, which has a large variety of built-in image filters.
Also: a minor problem I've faced all day is that PIL can't display (x, x, 1) shaped arrays as images. Why is this? How do I get it to fix this? (np.squeeze didn't work)
I assume the issue here is with processing grayscale float arrays. To fix this issue, you have to convert the float arrays to np.uint8 and use the 'L' mode in PIL.
img_arr = np.random.rand(100, 100) # Our float array in the range (0, 1)
uint8_img_arr = np.uint8(img_arr * 255) # Converted to the np.uint8 type
img = Image.fromarray(uint8_img_arr, 'L') # Create PIL Image from img_arr
As for doing convolutions, SciPy provides functions for doing convolutions with kernels that you may find useful.
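For instance, a hedged sketch using scipy.ndimage.convolve, assuming a float grayscale array (such as grey_avg_array from the question) and a simple box kernel:
from scipy.ndimage import convolve  # only if the SciPy dependency is acceptable
import numpy as np
blurred = convolve(grey_avg_array.astype(float), np.ones((3, 3)) / 9.0, mode='constant')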
But since we're solely using NumPy, let's implement it!
Note: To make this as general as possible, I am adding a few extra parameters that may or may not be important to you.
# Assuming the image has channels as the last dimension.
# filter.shape -> (kernel_size, kernel_size, channels)
# image.shape -> (width, height, channels)
def convolve(image, filter, padding=(1, 1)):
    # For this to work neatly, filter and image should have the same number of channels
    # Alternatively, filter could have just 1 channel or 2 dimensions
    if image.ndim == 2:
        image = np.expand_dims(image, axis=-1)  # Convert 2D grayscale images to 3D
    if filter.ndim == 2:
        filter = np.repeat(np.expand_dims(filter, axis=-1), image.shape[-1], axis=-1)  # Same with filters
    if filter.shape[-1] == 1:
        filter = np.repeat(filter, image.shape[-1], axis=-1)  # Give filter the same channel count as the image
    assert image.shape[-1] == filter.shape[-1]
    size_x, size_y = filter.shape[:2]
    width, height = image.shape[:2]
    # Convolution output size: [(W - K + 2P) / S] + 1, with stride S = 1
    output_array = np.zeros(((width - size_x + 2 * padding[0]) + 1,
                             (height - size_y + 2 * padding[1]) + 1,
                             image.shape[-1]))
    padded_image = np.pad(image, [
        (padding[0], padding[0]),
        (padding[1], padding[1]),
        (0, 0)
    ])
    # -size_x + 1 keeps the window within the bounds of the padded image
    for x in range(padded_image.shape[0] - size_x + 1):
        for y in range(padded_image.shape[1] - size_y + 1):
            # Create the window with the same spatial size as the filter
            window = padded_image[x:x + size_x, y:y + size_y]
            # Sum over the product of the filter and the window
            output_values = np.sum(filter * window, axis=(0, 1))
            # Place the calculated value into the output_array
            output_array[x, y] = output_values
    return output_array
Here is an example of its usage:
Original Image (saved as original.png):
filter = np.array([
    [1, 1, 1],
    [1, 1, 1],
    [1, 1, 1]
], dtype=np.float32) / 9.0  # Box filter
image = Image.open('original.png')
image_arr = np.array(image)/255.0
convolved_arr = convolve(image_arr, filter, padding=(1, 1))
convolved = Image.fromarray(np.uint8(255 * convolved_arr), 'RGB') # Convolved Image
Convolved Image:
A few things:
OpenCV, SciPy and scikit-image all use Numpy arrays as the standard way to store and manipulate images and are all largely interoperable with Numpy and each other
as regards plotting im with shape (x,y,1), you can just take the zeroth plane and plot that, i.e. newim = im[...,0]
When converting an RGB image to greyscale, rather than add all the RGB components up and divide by 3, you could just calculate the mean:
grey = np.mean(im, axis=2)
Actually the recommended weightings in ITU-R 601-2 are
L = 0.299 * Red + 0.587 * Green + 0.114 * Blue
So, you can use np.dot() to do that:
grey = np.dot(RGBimg[...,:3], [0.299, 0.587,0.114]).astype(np.uint8)
As regards finding vertical edges, you can do this with Numpy by subtracting each pixel from the one to its immediate right, i.e. differencing. Here is a little example, I also drew the shapes with Numpy so you can see a way to do that without using OpenCV since it seems to upset you so much ;-)
#!/usr/bin/env python3
import numpy as np
# Create a test image with a white square on black
rect = np.zeros((200,200), dtype=np.uint8)
rect[40:-40,40:-40] = 255
# Create a test image with a white circle on black
xx, yy = np.mgrid[:200, :200]
circle = (xx - 100) ** 2 + (yy - 100) ** 2
circle = (circle<4096).astype(np.uint8)*255
# Concatenate side-by-side to make our test image
im = np.hstack((rect,circle))
That now looks like this:
# Calculate horizontal differences, only finding increasing brightnesses
# (uint8 arithmetic wraps around, so decreasing edges come out as tiny values)
d = im[:,1:] - im[:,0:-1]
# Calculate horizontal differences finding increasing or decreasing brightnesses
d = np.abs(im[:,1:].astype(np.int16) - im[:,0:-1].astype(np.int16))
Not very efficient, but you could extend your code by the following to detect edges:
edge = np.zeros([322, 182])
for i in range(grey_avg_array.shape[0] - 2):
    for j in range(grey_avg_array.shape[1] - 2):
        edge[i+1, j+1] = np.sum(grey_avg_array[i:i+3, j:j+3] * [[1, 0, -1], [1, 0, -1], [1, 0, -1]])
edge = edge.astype(np.uint8)
edge_img = Image.fromarray(edge)
edge_img
To show the image in (say) a Jupyter notebook, you can just type the variable name (after you have done Image.fromarray()), as I have written above in the last line.

Convert a one dimensional dataframe into a 3 dimensional for RGB Image

I have a data frame of 2304 columns, as each row holds the pixels of a 48*48 image. When I convert it into one channel using this code
x = (df.iloc[:,1:].values).astype('float32')
x = x.reshape(-1,48,48,1)
it perfectly outputs images of shape
(48, 48, 1)
and generates the exact image with this code:
plt.imshow(x[0][:,:,0])
I want to make it 3-dimensional, i.e. with three channels. I tried merging the df 3 times to get (48, 48, 3); it successfully changes the df shape, but I cannot generate the image again.
If you essentially want to convert a single-channel image (which should essentially be a greyscale image) into a 3-channel greyscale image, it's the same as concatenating the same image array thrice along the last axis. You can use np.concatenate to achieve the desired result.
import numpy as np
a = np.zeros((2304), dtype = np.uint8) #Just a dummy array representing a single pic
single_channel = a.reshape(48, 48, 1)
result = np.concatenate([single_channel,single_channel,single_channel], axis = -1)
print(result.shape) #(48, 48, 3)
At this point you should have an array that can be accepted by any image library. Just throwing a sample code to show how you may proceed to create the image from the array.
import cv2
cv2.imwrite("hi.jpg", result)
As stated earlier, use numpy instead of pandas for image manipulation.
EDIT: If you were unfortunately starting with a dataframe in the first place, you can always convert it to a numpy array with an extra dimension representing each image.
import pandas as pd
import cv2
import numpy as np
a = np.zeros((2304), dtype = np.uint8) #dummy row
dummy_df = pd.DataFrame(np.concatenate([a.reshape(1,-1)]*10)) #dummy df with 10 rows.
print(dummy_df.shape) #(10, 2304)
arr_images = np.array(dummy_df, dtype = np.uint8)
print(arr_images.shape) #(10, 2304)
multiple_single_channel = arr_images.reshape(-1, 48, 48, 1)
print(multiple_single_channel.shape) #(10, 48, 48, 1)
result = np.concatenate([multiple_single_channel] * 3, axis = -1)
print(result.shape) #(10, 48, 48, 3)
for i, img in enumerate(result):
    print(i)
    cv2.imwrite("{}.jpg".format(i), img)
    # do something with the image. You PROBABLY don't want to run this for 35k images though.
The bottom line really is that you should not need to use a dataframe, even for multiple images.
1) Don't use pandas.
2) You can't transform a 1-channel image into 3 channels.
3) Don't use float32; images are usually 8-bit (np.uint8).
4) Use numpy in combination with OpenCV or with Pillow.
5) Don't use matplotlib to generate images; use the libraries mentioned in 4.
6) If you have an array with shape (x, y, 3), there is nothing simpler than generating the image with OpenCV's cv2.imshow('image', array), as sketched below.
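A minimal display sketch along those lines (the array contents are made up; OpenCV expects BGR channel order and needs a waitKey call to actually render the window):
import cv2
import numpy as np
array = np.zeros((48, 48, 3), dtype=np.uint8)  # hypothetical BGR image
cv2.imshow('image', array)
cv2.waitKey(0)           # block until a key press so the window is drawn
cv2.destroyAllWindows()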

Mipmap of image in numpy?

I'm checking with you if there is a neat numpy solution to resizing down a 2D numpy array (which is an image) using bilinear filtering?
More specifically, my array has the shape (width, height, 4) (as in a rgba image). The downscaling is also only done on "even" steps: i.e. from (w, h, 4) to (w/2, h/2, 4) to (w/4, h/4, 4) etc.
I've browsed around for quite some time now but everyone seems to refer to the scipy/PIL versions of imresize.
I want to minimize the number of dependencies on python packages, hence the numpy only requirement.
I just wanted to check with SO before I go implement it in C++ instead.
I don't think there is any specific solution in numpy, but you should be able to implement it efficiently without leaving the comfort of python. Correct me if I'm wrong, but when the size of the image is divisible by 2, a bilinear filter is basically the same as averaging 4 pixels of the original image to get 1 pixel of the new one, right? Well, if your image size is a power of two, then the following code:
from __future__ import division
import numpy as np
from PIL import Image

def halve_image(image):
    rows, cols, planes = image.shape
    image = image.astype('uint16')
    image = image.reshape(rows // 2, 2, cols // 2, 2, planes)
    image = image.sum(axis=3).sum(axis=1)
    return ((image + 2) >> 2).astype('uint8')

def mipmap(image):
    img = image.copy()
    rows, cols, planes = image.shape
    mipmap = np.zeros((rows, cols * 3 // 2, planes), dtype='uint8')
    mipmap[:, :cols, :] = img
    row = 0
    while rows > 1:
        img = halve_image(img)
        rows = img.shape[0]
        mipmap[row:row + rows, cols:cols + img.shape[1], :] = img
        row += rows
    return mipmap

img = np.asarray(Image.open('lena.png'))
Image.fromarray(mipmap(img)).save('lena_mipmap.png')
Produces this output:
With an original image of 512x512, it runs on my system in:
In [3]: img.shape
Out[3]: (512, 512, 4)
In [4]: %timeit mipmap(img)
10 loops, best of 3: 154 ms per loop
This will not work if an odd side length ever comes up, but depending on exactly how you want to handle the downsampling for those cases, you could get rid of a full row (or column) of pixels, then reshape your image to (rows // 2, 2, cols // 2, 2, planes) so that img[r, :, c, :, p] is a 2x2 matrix of values to interpolate into a new pixel value.
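A hedged sketch of that trim-then-halve idea (halve_any_image is a made-up name; it assumes dropping the last row/column is acceptable and reuses halve_image from above):
def halve_any_image(image):
    rows, cols, planes = image.shape
    # Trim a trailing row/column so both spatial dimensions are even
    return halve_image(image[:rows - rows % 2, :cols - cols % 2, :])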
