I want to make an affine transformation and afterwards use nearest neighbor interpolation while keeping the same dimensions for input and output images. I use, for example, the scaling transformation T = [[2,0,0],[0,2,0],[0,0,1]]. Any idea how I can fill the black pixels with nearest neighbor? I tried giving them the minimum value of their neighbors' intensities. For example, if a pixel has neighbors [55,22,44,11,22,55,23,231], I give it the value of the minimum intensity: 11. But the result is not clear at all.
import numpy as np
from matplotlib import pyplot as plt
#Importing the original image and init the output image
img = plt.imread('/home/left/Desktop/computerVision/SET1/brain0030slice150_101x101.png',0)
outImg = np.zeros_like(img)
# Dimensions of the input image and output image (the same dimensions)
(width , height) = (img.shape[0], img.shape[1])
# Initialize the transformation matrix
T = np.array([[2,0,0], [0,2,0], [0,0,1]])
# Build an array of input image (x, y) coordinates and append a row of ones (homogeneous coordinates)
coords = np.indices((width, height)).reshape(2, -1)
coords = np.vstack((coords, np.ones(coords.shape[1], dtype=coords.dtype)))
output = T @ coords
# Arrays of x and y coordinates of the output image within the image dimensions
x_array, y_array = output[0], output[1]
indices = np.where((x_array >= 0) & (x_array < width) & (y_array >= 0) & (y_array < height))
# Final coordinates of the output image
fx, fy = x_array[indices], y_array[indices]
# Final output image after the affine transformation
outImg[fx, fy] = img[fx, fy]
The input image is:
The output image after scaling is:
Well, you could simply use the OpenCV resize function:
import cv2
new_image = cv2.resize(image, new_dim, interpolation=cv2.INTER_AREA)
It'll do the resize and fill in the empty pixels in one go.
More on cv2.resize.
If you need to do it manually, then you could simply detect dark pixels in the resized image and change their value to the mean of the 4 neighbouring pixels (for example; it depends on your required algorithm).
See: nearest neighbour, bilinear, bicubic, etc.
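If you go the manual route, here is a rough sketch of that idea; it assumes a single-channel uint8 image where the empty pixels are exactly zero, and does a single pass (the threshold and neighbourhood choice are up to you):
import numpy as np

def fill_black_with_neighbour_mean(img):
    # Replace zero-valued pixels with the mean of their non-zero 4-neighbours (one pass).
    out = img.astype(np.float32).copy()
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            if img[y, x] == 0:
                neighbours = []
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w and img[ny, nx] != 0:
                        neighbours.append(float(img[ny, nx]))
                if neighbours:
                    out[y, x] = sum(neighbours) / len(neighbours)
    return out.astype(img.dtype)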
Have a look at the image; it will give you a better idea of what I want to achieve. I want to rotate the image and fill the black part of the image just like in the required image.
import cv2
import numpy as np

# Read the image
img = cv2.imread("input.png")
# Get the image size
h, w = img.shape[:2]
# Define the rotation matrix
M = cv2.getRotationMatrix2D((w/2, h/2), 30, 1)
# Rotate the image
rotated = cv2.warpAffine(img, M, (w, h))
mask = np.zeros(rotated.shape[:2], dtype=np.uint8)
mask[np.where((rotated == [0, 0, 0]).all(axis=2))] = 255
img_show(mask)
From the code I am able to get the mask of the black regions. Now I want to replace these black regions with the image portion as shown in image 1. Is there any better solution for how I can achieve this?
Use the borderMode parameter of warpAffine.
You want to pass the BORDER_WRAP value.
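A minimal sketch of that call, reusing the rotation matrix M and output size (w, h) from the question; only the borderMode argument is new:
# Same warp as before, but empty corners are filled by wrapping the image around
rotated = cv2.warpAffine(img, M, (w, h), borderMode=cv2.BORDER_WRAP)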
Here's the result. This does exactly what you described with your first picture.
I have an approach. You can first create a larger image consisting of 3 × 3 copies of your original image. When you rotate this image and cut out only the center of the large image, you have your desired result.
import cv2
import numpy as np
# Read the image
img = cv2.imread("input.png")
# Get the image size of the original image
h, w = img.shape[:2]
# make a large image containing 3 copies of the original image in each direction
large_img = np.tile(img, [3,3,1])
cv2.imshow("large_img", large_img)
# Define the rotation matrix. Rotate around the center of the large image
M = cv2.getRotationMatrix2D((w*3/2, h*3/2), 30, 1)
# Rotate the image
rotated = cv2.warpAffine(large_img, M, (w*3, h*3))
# crop only the center of the image
cropped_image = rotated[h:h*2, w:w*2, :]  # rows span the height, columns span the width
cv2.imshow("cropped_image", cropped_image)
cv2.waitKey(0)
I'm trying to create a neon effect with a source image. I have included three images: the source, my current attempt, and a target. The program takes the image, finds the white edges, and calculates the distance from each pixel to the nearest white edge (these parts both work fine); from there, I am struggling to find the right saturation and value parameters to create the neon glow.
From the target image, what I need is basically for the saturation to be 0 on a white edge, then to increase dramatically the further away it gets from an edge; for value, I need it to be 1 on a white edge, then to decrease dramatically. I can't figure out the best way to manipulate distance_image (which holds each pixel's distance from the nearest white edge) to achieve these two results with saturation and value.
from PIL import Image
import cv2
import numpy as np
from scipy.ndimage import binary_erosion
from scipy.spatial import KDTree
def find_closest_distance(img):
    white_pixel_points = np.array(np.where(img))
    tree = KDTree(white_pixel_points.T)
    img_meshgrid = np.array(np.meshgrid(np.arange(img.shape[0]),
                                        np.arange(img.shape[1]))).T
    distances, _ = tree.query(img_meshgrid)
    return distances

def find_edges(img):
    img_np = np.array(img)
    kernel = np.ones((3,3))
    return img_np - binary_erosion(img_np, kernel)*255
img = Image.open('a.png').convert('L')
edge_image = find_edges(img)
distance_image = find_closest_distance(edge_image)
max_dist = np.max(distance_image)
distance_image = distance_image / max_dist
hue = np.full(distance_image.shape, 0.44*180)
saturation = distance_image * 255
value = np.power(distance_image, 0.2)
value = 255 * (1 - value**2)
new_tups = np.dstack((hue, saturation, value)).astype('uint8')
new_tups = cv2.cvtColor(new_tups, cv2.COLOR_HSV2BGR)
new_img = Image.fromarray(new_tups, 'RGB').save('out.png')
The following images show the source data (left), the current result (middle), and the desired result (right).
I think I would do this with convolution instead. Convolving an image with a Gaussian kernel is a common way to blur an image. You can do it in various ways, but maybe the easiest to use is scipy.ndimage.gaussian_filter. Here's one way to implement all this; see if you like the result.
from PIL import Image
from io import BytesIO
import requests
import numpy as np
r = requests.get('https://i.stack.imgur.com/MhUQZ.png')
img = Image.open(BytesIO(r.content))
imarray = np.asarray(img)[..., 0] / 255
This is your first image, the white rectangles.
Now I'll make those outlines, do the blur, create the colour images, and combine them:
from scipy.ndimage import binary_erosion
from scipy.ndimage import gaussian_filter
eroded = binary_erosion(imarray, iterations=3)
# Make the outlined rectangles.
outlines = imarray - eroded
# Convolve with a Gaussian to effect a blur.
blur = gaussian_filter(outlines, sigma=11)
# Make binary images into neon green.
neon_green_rgb = [0.224, 1.0, 0.0784]
outlines = outlines[:, :, None] * neon_green_rgb
blur = blur[:, :, None] * neon_green_rgb
# Combine the images and constrain to [0, 1].
blur_strength = 3
glow = np.clip(outlines + blur_strength*blur, 0, 1)
And look at it:
import matplotlib.pyplot as plt
plt.imshow(glow)
You'll want to adjust the sigma of the Gaussian (its width), the colours, blur strength, and so on. Hope it helps.
Here is one way to do that in Python/OpenCV.
Read the input
Convert to grayscale
Threshold to binary
Get edges of desired thickness using morphology gradient
Invert the edges so black on white background
Do distance transform
Stretch to full dynamic range
Invert
Normalize to range 0 to 1 by dividing by the maximum value
Attenuate using a power law to control distance roll-off (ramping)
Create a color image of the size of the input and the desired color
Multiply the attenuated image by the color image
Save results
Input:
import cv2
import numpy as np
import skimage.exposure
# read input
img = cv2.imread('rectangles.png')
# convert to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# threshold
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)[1]
# do morphology gradient to get edges and invert so black edges on white background
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
edges = cv2.morphologyEx(thresh, cv2.MORPH_GRADIENT, kernel)
edges = 255 - edges
# get distance transform
dist = edges.copy()
distance = cv2.distanceTransform(dist, distanceType=cv2.DIST_L2, maskSize=3)
print(np.amin(distance), np.amax(distance))
# stretch to full dynamic range and convert to uint8 as 3 channels
stretch = skimage.exposure.rescale_intensity(distance, in_range='image', out_range=(0,255))
# invert
stretch = 255 - stretch
max_stretch = np.amax(stretch)
# normalize to range 0 to 1 by dividing by max_stretch
stretch = (stretch/max_stretch)
# attenuate with power law
pow = 4
attenuate = np.power(stretch, pow)
attenuate = cv2.merge([attenuate,attenuate,attenuate])
# create a green image the size of the input
color_img = np.full_like(img, (0,255,0), dtype=np.float32)
# multiply the color image with the attenuated distance image
glow = (color_img * attenuate).clip(0,255).astype(np.uint8)
# save results
cv2.imwrite('rectangles_edges.png', edges)
cv2.imwrite('rectangles_stretch.png', (255*stretch).clip(0,255).astype(np.uint8))
cv2.imwrite('rectangles_attenuate.png', (255*attenuate).clip(0,255).astype(np.uint8))
cv2.imwrite('rectangles_glow.png', glow)
# view results
cv2.imshow("EDGES", edges)
cv2.imshow("STRETCH", stretch)
cv2.imshow("ATTENUATE", attenuate)
cv2.imshow("RESULT", glow)
cv2.waitKey(0)
Edges (inverted):
Stretched Distance Transform:
Attenuated Distance Transform:
Glow Result:
I need to resize an image, but with a "varying scaling" in the y axis, after warping:
Plotted Image
Original input image
Warped output image
The image (left one) was taken at an angle, so I've used the getPerspectiveTransform and warpPerspective OpenCV functions to get the top/plan view of the image (right one).
But now the top half of the warped image is stretched and the bottom half is squashed, and this amount of stretch/squash varies continuously as you go down the image. So, I need to do the opposite.
For example: the zebra crossing lines in the warped image are thicker at the top of the image and thinner at the bottom. Essentially, I want them all to be the same thickness and the same vertical distance from each other.
Badly drawn but something like this: (if we ignore the 2 people, I think this is what the final output image should be like.)
predicted output image
My end goal is to measure distance between people's feet in an image (shown by green dots), but I've got that section sorted already.
By vertically scaling the warped image to make it linear, it will allow me to accurately measure the real distance in the x & y direction from a top/plan view, (i.e each pixel in the x or y direction is say 1cm in real distance)
I was thinking of multiplying each row of the image by a factor (e.g. top rows multiply by smaller number like 0.8 or 0.9, and bottom rows multiply by bigger number like 1.1 or 1.2), but I really don't know how to do that.
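Something along these lines is what I have in mind, though it is only a rough sketch and the mapping function is made up (using the cv and np aliases from the code below; cv.remap just needs a per-pixel source coordinate):
# Illustrative only: remap each output row y to a source row via a non-linear mapping,
# so row spacing changes gradually from top to bottom. gamma is a made-up parameter.
def rescale_rows(image, gamma=1.5):
    h, w = image.shape[:2]
    y_dst = np.arange(h, dtype=np.float32)
    y_src = h * (y_dst / h) ** gamma                   # hypothetical row mapping
    map_y = np.repeat(y_src[:, None], w, axis=1)       # same source row for every column
    map_x = np.tile(np.arange(w, dtype=np.float32), (h, 1))
    return cv.remap(image, map_x, map_y, interpolation=cv.INTER_LINEAR)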
Code:
import cv2 as cv
from matplotlib import pyplot as plt
import numpy as np
# READ IMAGE
imgOrig = cv.imread('.jpg')
# RESIZE IMAGE
width = int(1000)
ratio = imgOrig.shape[1]/width
height = int(imgOrig.shape[0]/ratio)
dsize = (width, height)
img = cv.resize(imgOrig, dsize)
feetLocation = [[280, 500], [740, 496]]
cv.circle(img,(280, 500),5,(0,255,0),thickness= 10)
cv.circle(img,(740, 496),5,(0,255,0),thickness= 10)
# WARPING
pts1 = np.float32([[0, -0], [width, 0], [-1800, height], [width + 1800, height]])
pts2 = np.float32([[0, 0], [width, 0], [0, height], [width, height]])
M = cv.getPerspectiveTransform(pts1, pts2)
dst = cv.warpPerspective(img, M, (width, height))
#DISPLAY IMAGES
plt.subplot(121),plt.imshow(img),plt.title('Original Image')
plt.subplot(122),plt.imshow(dst),plt.title('Warped Image')
plt.show()
I was working on a solution before the several edits were applied. I focussed on the actual boxes only. If, instead, you actually need the surroundings too, the following approach won't help you much, I'm afraid. Also, I assumed the bottom box to be fully included. So, if that one is somehow cut as presented in your new desired final output, additional work would be needed to handle that case.
From the given image, you could mask the gray-ish part around and between the single boxes using the saturation and value channels from the HSV color space:
Following, row-wise sum all pixels, apply some moving average to clean the signal, and detect the peaks in that signal:
The bottom image border must be manually added, since there is no gray-ish border (most likely because the box is somehow cut).
Now, for each of these "peak rows", determine the first and last masked pixels, and build boxes from each two neighbouring "peak rows". Finally, for each of these boxes, apply a distinct perspective transform to a given size. If needed, stack those boxes vertically for example:
That'd be the whole code:
import cv2
import matplotlib.pyplot as plt
import numpy as np
from scipy.signal import find_peaks
# Read original image
imgOrig = cv2.cvtColor(cv2.imread('DInAq.jpg'), cv2.COLOR_BGR2RGB)
# Resize image
width = int(1000)
ratio = imgOrig.shape[1] / width
height = int(imgOrig.shape[0] / ratio)
dsize = (width, height)
img = cv2.resize(imgOrig, dsize)
# Mask low saturation and medium to high value (i.e. gray-ish/white-ish colors)
img_gauss = cv2.GaussianBlur(img, (5, 5), -1)
h, s, v = cv2.split(cv2.cvtColor(img_gauss, cv2.COLOR_BGR2HSV))
mask = (s < 24) & (v > 64)
# Row-wise sum mask pixels, apply moving average filter, and find peaks
row_sum = np.sum(mask, axis=1)
row_sum = np.convolve(row_sum, np.ones(5)/5, 'same')
peaks = find_peaks(row_sum, prominence=50)[0]
peaks = np.insert(peaks, 4, img.shape[0]-1)
# Find first and last pixels per "peak row"
x1 = [np.argwhere(mask[p, :]).min() for p in peaks]
x2 = [np.argwhere(mask[p, :]).max() for p in peaks]
# Collect single boxes
boxes = []
for i in np.arange(len(peaks)-1, 0, -1):
    boxes.append([[x1[i], peaks[i]],
                  [x1[i-1], peaks[i-1]],
                  [x2[i-1], peaks[i-1]],
                  [x2[i], peaks[i]]])
# Warp each box individually to a given size
warped = []
bw, bh = [400, 400]
for box in reversed(boxes):
    pts1 = np.float32(box)
    pts2 = np.float32([[0, bh-1], [0, 0], [bw-1, 0], [bw-1, bh-1]])
    M = cv2.getPerspectiveTransform(pts1, pts2)
    warped.append(cv2.warpPerspective(img, M, (bw, bh)))
# Output
plt.figure(1)
plt.subplot(121), plt.imshow(img), plt.title('Original image')
for box in boxes:
    pts = np.array(box)
    plt.plot(pts[:, 0], pts[:, 1], 'rx')
plt.subplot(122), plt.imshow(np.vstack(warped)), plt.title('Warped image')
plt.tight_layout(), plt.show()
That's kind of an automated way to detect and extract the single boxes. For better results, you could set up a simple GUI (solely using OpenCV, for example), and let the user click on the exact corners, and build the boxes to be transformed from there.
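A minimal sketch of that GUI idea (purely illustrative; the window name, click order, and output size are assumptions):
import cv2
import numpy as np

clicks = []

def on_mouse(event, x, y, flags, param):
    # collect up to four corner clicks
    if event == cv2.EVENT_LBUTTONDOWN and len(clicks) < 4:
        clicks.append([x, y])

img = cv2.imread('DInAq.jpg')  # same file name as above
cv2.namedWindow('pick corners')
cv2.setMouseCallback('pick corners', on_mouse)
while len(clicks) < 4:
    cv2.imshow('pick corners', img)
    if cv2.waitKey(20) & 0xFF == 27:  # Esc aborts
        break
cv2.destroyAllWindows()

if len(clicks) == 4:
    bw, bh = 400, 400
    pts1 = np.float32(clicks)  # assumed click order: bottom-left, top-left, top-right, bottom-right
    pts2 = np.float32([[0, bh-1], [0, 0], [bw-1, 0], [bw-1, bh-1]])
    M = cv2.getPerspectiveTransform(pts1, pts2)
    box = cv2.warpPerspective(img, M, (bw, bh))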
----------------------------------------
System information
----------------------------------------
Platform: Windows-10-10.0.16299-SP0
Python: 3.9.1
PyCharm: 2021.1
Matplotlib: 3.4.1
NumPy: 1.20.2
OpenCV: 4.5.1
SciPy: 1.6.2
----------------------------------------
I am trying to increase the region of interest of an image using the algorithm below.
First, the set of pixels of the exterior border of the ROI is determined, i.e., pixels that are outside the ROI and are neighbors (using four-neighborhood) to pixels inside it. Then, each pixel value of this set is replaced with the mean value of its neighbors (this time using eight-neighborhood) inside the ROI. Finally, the ROI is expanded by inclusion of this altered set of pixels. This process is repeated and can be seen as artificially increasing the ROI.
The pseudocode is below -
while there are border pixels:
    border_pixels = []

    # find the border pixels
    for each pixel p=(i, j) in image:
        if p is not in ROI and ((i+1, j) in ROI or (i-1, j) in ROI or (i, j+1) in ROI or (i, j-1) in ROI or (i-1, j-1) in ROI or (i+1, j+1) in ROI):
            add p to border_pixels

    # calculate the averages
    for each pixel p in border_pixels:
        color_sum = 0
        count = 0
        for each pixel n in 8-neighborhood of p:
            if n in ROI:
                color_sum += color(n)
                count += 1
        color(p) = color_sum / count

    # update the ROI
    for each pixel p=(i, j) in border_pixels:
        set p to be in ROI
Below is my code
from skimage import io
import numpy as np

img = io.imread(path_dir)
newimg = np.zeros((584, 565, 3))
mask = img == 0
while(1):
    border_pixels = []
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            for k in range(0, 3):
                if(i+1 <= 583 and j+1 <= 564 and i-1 >= 0 and j-1 >= 0):
                    if ((mask[i][j][k]) and ((mask[i+1][j][k] == False) or (mask[i-1][j][k] == False) or (mask[i][j+1][k] == False) or (mask[i][j-1][k] == False) or (mask[i-1][j-1][k] == False) or (mask[i+1][j+1][k] == False))):
                        border_pixels.append([i, j, k])

    if len(border_pixels) == 0:
        break

    for (each_i, each_j, each_k) in border_pixels:
        color_sum = 0
        count = 0
        eight_neighbourhood = [[each_i-1, each_j], [each_i+1, each_j], [each_i, each_j-1], [each_i, each_j+1], [each_i-1, each_j-1], [each_i-1, each_j+1], [each_i+1, each_j-1], [each_i+1, each_j+1]]
        for pix_i, pix_j in eight_neighbourhood:
            if (mask[pix_i][pix_j][each_k] == False):
                color_sum += img[pix_i, pix_j, each_k]
                count += 1
        print(color_sum//count)
        img[each_i][each_j][each_k] = (color_sum//count)

    for (i, j, k) in border_pixels:
        mask[i, j, k] = False
        border_pixels.remove([i, j, k])

io.imsave("tryout6.png", img)
But it is not making any change in the image; I am getting the same image as before.
So I tried plotting the border pixels on a black image of the same dimensions for the first iteration, and I am getting the result below.
I really don't have any idea where I am going wrong here.
Here's a solution that I think works as you have requested (although I agree with @Peter Boone that it will take a while). My implementation has a triple loop, but maybe someone else can make it faster!
First, read in the image. With my method, the pixel values are floats between 0 and 1 (rather than integers between 0 and 255).
import urllib
import matplotlib.pyplot as plt
import numpy as np
from skimage.morphology import binary_dilation, binary_erosion, disk
from skimage.color import rgb2gray
from skimage.filters import threshold_otsu
# create a file-like object from the url
f = urllib.request.urlopen("https://i.stack.imgur.com/JXxJM.png")
# read the image file in a numpy array
# note that all pixel values are between 0 and 1 in this image
a = plt.imread(f)
Second, add some padding around the edges, and threshold the image. I used Otsu's method, but @Peter Boone's answer works well, too.
# add black padding around image 100 px wide
a = np.pad(a, ((100,100), (100,100), (0,0)), mode = "constant")
# convert to greyscale and perform Otsu's thresholding
grayscale = rgb2gray(a)
global_thresh = threshold_otsu(grayscale)
binary_global1 = grayscale > global_thresh
# define number of pixels to expand the image
num_px_to_expand = 50
The image, binary_global1 is a mask that looks like this:
Since the image is three channels (RGB), I process the channels separately. I noticed that I needed to erode the image by ~5 px because the outside of the image has some unusual colors and patterns.
# process each channel (RGB) separately
for channel in range(a.shape[2]):

    # select a single channel
    one_channel = a[:, :, channel]

    # reset binary_global for each channel
    binary_global = binary_global1.copy()

    # erode by 5 px to get rid of unusual edges from original image
    binary_global = binary_erosion(binary_global, disk(5))

    # turn everything less than the threshold to 0
    one_channel = one_channel * binary_global

    # update pixels one at a time
    for jj in range(num_px_to_expand):

        # get 1 px ring of pixels to update
        px_to_update = np.logical_xor(binary_dilation(binary_global, disk(1)),
                                      binary_global)

        # update those pixels with the average of their neighborhood
        x, y = np.where(px_to_update == 1)
        for x, y in zip(x, y):

            # make 3 x 3 px slices
            slices = np.s_[(x-1):(x+2), (y-1):(y+2)]

            # update a single pixel
            one_channel[x, y] = (np.sum(one_channel[slices] *
                                        binary_global[slices]) /
                                 np.sum(binary_global[slices]))

        # update original image
        a[:, :, channel] = one_channel

        # increase binary_global by 1 px dilation
        binary_global = binary_dilation(binary_global, disk(1))
When I plot the output, I get something like this:
# plot image
plt.figure(figsize=[10,10])
plt.imshow(a)
This is an interesting idea. You're going to want to use masks and some form of mean ranks to accomplish this. Going pixel by pixel will take you a while; instead, you want to use different convolution filters.
If you do something like this:
image = io.imread("roi.jpg")
mask = image[:,:,0] < 30
just_inside = binary_dilation(mask) ^ mask
image[~just_inside] = [0,0,0]
you will have a mask representing just the pixels inside of the ROI. I also set the pixels not in that area to 0,0,0.
Then you can get the pixels just outside of the roi:
just_outside = binary_erosion(mask) ^ mask
Then get the mean bilateral of each channel:
mean_blue = mean_bilateral(image[:,:,0], selem=square(3), s0=1, s1=255)
#etc...
This isn't exactly correct, but I think it should put you in the right direction. I would check out image.sc if you have more general questions about image processing. Let me know if you need more help as this was more general direction than working code.
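For completeness, here is one hedged way to wire those pieces together for a single expansion step. It swaps mean_bilateral for a plain masked 3x3 mean (scipy's uniform_filter); the file name and the dark-pixel threshold are just the ones from the snippets above:
import numpy as np
from scipy.ndimage import uniform_filter
from skimage import io
from skimage.morphology import binary_dilation

image = io.imread("roi.jpg").astype(float)
mask = image[:, :, 0] < 30            # assumed ROI definition, as above
ring = binary_dilation(mask) ^ mask   # pixels just outside the ROI

for c in range(image.shape[2]):
    channel = image[:, :, c]
    # mean of ROI pixels in each 3x3 neighbourhood: masked sum divided by masked count
    num = uniform_filter(channel * mask, size=3)
    den = uniform_filter(mask.astype(float), size=3)
    mean_in_roi = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
    channel[ring] = mean_in_roi[ring]  # fill the outer ring from its ROI neighbours

mask = mask | ring                     # grow the ROI by the filled ring; repeat as needed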
Like the image above suggests, how can I convert the image on the left into an array that represents the darkness of the image, between 0 for white and decimals closer to 1 for darker colours, as shown in the image, using Python 3?
Update:
I have tried to work a bit more on this. There are good answers below, too.
import numpy
import tensorflow as tf
from IPython.display import Image  # assuming Image() here is the IPython display helper

# Load image
filename = tf.constant("one.png")
image_file = tf.read_file(filename)

# Show Image
Image("one.png")

# convert method
def convertRgbToWeight(rgbArray):
    arrayWithPixelWeight = []
    for i in range(int(rgbArray.size / rgbArray[0].size)):
        for j in range(int(rgbArray[0].size / 3)):
            lum = 255 - ((rgbArray[i][j][0] + rgbArray[i][j][1] + rgbArray[i][j][2]) / 3)  # Reversed luminosity
            arrayWithPixelWeight.append(lum / 255)  # Map values from range 0-255 to 0-1
    return arrayWithPixelWeight

# Convert image to numbers and print them
image_decoded_png = tf.image.decode_png(image_file, channels=3)
image_as_float32 = tf.cast(image_decoded_png, tf.float32)

numpy.set_printoptions(threshold=numpy.nan)
sess = tf.Session()
squeezedArray = sess.run(image_as_float32)
convertedList = convertRgbToWeight(squeezedArray)
print(convertedList)  # This will give me an array of numbers.
I would recommend reading in images with OpenCV. The biggest advantage of OpenCV is that it supports multiple image formats and automatically transforms the image into a NumPy array. For example:
import cv2
import numpy as np
img_path = '/YOUR/PATH/IMAGE.png'
img = cv2.imread(img_path, 0) # read image as grayscale. Set second parameter to 1 if rgb is required
Now img is a numpy array with values between 0 - 255. By default 0 equals black and 255 equals white. To change this you can use the opencv built in function bitwise_not:
img_reverted= cv2.bitwise_not(img)
We can now scale the array with:
new_img = img_reverted / 255.0  # now all values range from 0 to 1, where white equals 0.0 and black equals 1.0
Load the image and then just invert and divide by 255.
Here is the image ('Untitled.png') that I used for this example: https://ufile.io/h8ncw
import numpy as np
import cv2
import matplotlib.pyplot as plt
my_img = cv2.imread('Untitled.png')
inverted_img = (255.0 - my_img)
final = inverted_img / 255.0
# Visualize the result
plt.imshow(final)
plt.show()
print(final.shape)
(661, 667, 3)
Results (final object represented as image):
You can use the PIL package to manage images. Here's an example of how it can be done.
from PIL import Image
image = Image.open('sample.png')
width, height = image.size
pixels = image.load()
# Check if the image has alpha, to avoid a "too many values to unpack" error
has_alpha = len(pixels[0, 0]) == 4

# Create empty 2D list
fill = 1
array = [[fill for x in range(width)] for y in range(height)]

for y in range(height):
    for x in range(width):
        if has_alpha:
            r, g, b, a = pixels[x, y]
        else:
            r, g, b = pixels[x, y]
        lum = 255 - ((r + g + b) / 3)  # Reversed luminosity
        array[y][x] = lum / 255  # Map values from range 0-255 to 0-1
I think it works, but please note that the only test I did was whether the values are in the desired range:
# Test max and min values
h, l = 0,1
for row in array:
    h = max([max(row), h])
    l = min([min(row), l])
print(h, l)
You have to load the image from the path and then transform it to a numpy array.
The values of the image will be between 0 and 255. The next step is to standardize the numpy array.
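A minimal sketch of those steps (the file name is just a placeholder), producing a darkness value per pixel with 0.0 for white and values near 1.0 for dark:
import numpy as np
from PIL import Image

img = Image.open("image.png").convert("L")  # placeholder file name; "L" = 8-bit grayscale
arr = np.asarray(img, dtype=np.float32)     # values in 0-255
darkness = 1.0 - arr / 255.0                # 0.0 for white, approaching 1.0 for dark pixels
print(darkness.shape, darkness.min(), darkness.max())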
Hope it helps.