I know this question has been asked several times, e.g., this one, but my problem is that I need to overlay a smaller PNG image on a larger PNG image, where both images need to maintain their transparency in the output.
For example, the smaller image looks like this:
... and the larger image looks like this:
(The image above has only one pixel filled with white at 90% transparency)
Using the code from the previous question (overlay a smaller image on a larger image python OpenCv), the larger background image comes out black.
import numpy as np
from PIL import Image

def overlay_image_alpha(img, img_overlay, x, y, alpha_mask):
    """Overlay `img_overlay` onto `img` at (x, y) and blend using `alpha_mask`.

    `alpha_mask` must have the same HxW as `img_overlay` and values in range [0, 1].
    """
    # Image ranges
    y1, y2 = max(0, y), min(img.shape[0], y + img_overlay.shape[0])
    x1, x2 = max(0, x), min(img.shape[1], x + img_overlay.shape[1])

    # Overlay ranges
    y1o, y2o = max(0, -y), min(img_overlay.shape[0], img.shape[0] - y)
    x1o, x2o = max(0, -x), min(img_overlay.shape[1], img.shape[1] - x)

    # Exit if nothing to do
    if y1 >= y2 or x1 >= x2 or y1o >= y2o or x1o >= x2o:
        return

    # Blend overlay within the determined ranges
    img_crop = img[y1:y2, x1:x2]
    img_overlay_crop = img_overlay[y1o:y2o, x1o:x2o]
    alpha = alpha_mask[y1o:y2o, x1o:x2o, np.newaxis]
    alpha_inv = 1.0 - alpha
    img_crop[:] = alpha * img_overlay_crop + alpha_inv * img_crop
# Prepare inputs
x, y = 50, 0
img = np.array(Image.open("template.png"))
img_overlay_rgba = np.array(Image.open("../foo.png"))
# Perform blending
alpha_mask = img_overlay_rgba[:, :, 3] / 255.0
img_result = img[:, :, :3].copy()
img_overlay = img_overlay_rgba[:, :, :3]
overlay_image_alpha(img_result, img_overlay, x, y, alpha_mask)
# Save result
Image.fromarray(img_result).save("img_result.png")
Result:
My desired result is to maintain the larger image's transparency.
May I know how to do so?
Answer 1:
This is actually the correct output. Fully transparent images (like your template image) have a pixel value of 0 in the empty areas, which represents black. If you were to use a semi-transparent template (whose pixels have values greater than 0 even where they are only slightly transparent), you would see this:
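You can verify this yourself (a quick sketch, assuming the template.png from the question): inspect the RGB values sitting under the fully transparent pixels.
import numpy as np
from PIL import Image

rgba = np.array(Image.open("template.png"))
transparent = rgba[..., 3] == 0  # fully transparent pixels
# The RGB values under those pixels are typically all 0, i.e. black
print(np.unique(rgba[..., :3][transparent]))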
Answer 2:
Your output images are 24-bit, which means you got rid of the alpha channel before saving. I looked into your code and saw these two lines:
img_result = img[:, :, :3].copy()
img_overlay = img_overlay_rgba[:, :, :3]
You're sending RGB images to the overlay_image_alpha function. Change them to:
img_result = img.copy()
img_overlay = img_overlay_rgba
so that the alpha channel information is preserved.
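Putting the two changes together, a minimal end-to-end sketch (file names and offsets taken from the question) that keeps the result as 32-bit RGBA:
import numpy as np
from PIL import Image

# Load both images with their alpha channels intact (RGBA)
img = np.array(Image.open("template.png"))
img_overlay_rgba = np.array(Image.open("../foo.png"))

# Blend all four channels, so the output alpha is blended as well
alpha_mask = img_overlay_rgba[:, :, 3] / 255.0
img_result = img.copy()
overlay_image_alpha(img_result, img_overlay_rgba, 50, 0, alpha_mask)

# Saving the 4-channel array keeps the transparency in the PNG
Image.fromarray(img_result).save("img_result.png")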
New Output:
On Photoshop:
I was wondering, given the type of interpolation used when resizing an image with cv2.resize, how can I find out exactly where a particular pixel maps to? For example, if I enlarge an image using linear interpolation and take the pixel at coordinates (785, 251), how can I find the coordinates that this source pixel maps to in the resized version, regardless of whether the aspect ratio changes between the source and the resized image? I've looked over the internet for a solution, but all the answers seem to be indirect methods that don't actually work for different aspect ratios:
https://answers.opencv.org/question/209827/resize-and-remap/
After resizing an image with cv2, how to get the new bounding box coordinate
Is there perhaps a way through cv2 to access how pixels are mapped, and, by reversing that mapping, find out the new coordinates?
The reason I would like this is that I want bounding boxes that give me back the same information regardless of a change in the image's aspect ratio. None of the methods I've tried so far does this. I figure that if I can work out where the top-left and bottom-right pixel coordinates map to, I can recreate an accurate bounding box regardless of aspect ratio changes.
Scaling the coordinates works when the center coordinate is (0, 0).
You may compute x_scaled and y_scaled as follows:
Subtract x_original_center and y_original_center from x_original and y_original.
After subtraction, (0, 0) is the "new center".
Scale the "zero centered" coordinates by scale_x and scale_y.
Convert the "scaled zero centered" coordinates to "top left (0, 0)" by adding x_scaled_center and y_scaled_center.
Computing the center accurately:
The Python convention is:
(0, 0) is the top-left, and (cols-1, rows-1) is the bottom-right coordinate.
The accurate center coordinates are:
x_original_center = (original_cols-1)/2
y_original_center = (original_rows-1)/2
Python code (assume img is the original image):
rows, cols = img.shape[0:2]
resized_img = cv2.resize(img, (int(cols*scale_x), int(rows*scale_y)))
resized_rows, resized_cols = resized_img.shape[0:2]

x_original_center = (cols-1) / 2
y_original_center = (rows-1) / 2
x_scaled_center = (resized_cols-1) / 2
y_scaled_center = (resized_rows-1) / 2

# Subtract the center, scale, and add the "scaled center".
x_scaled = (x_original - x_original_center)*scale_x + x_scaled_center
y_scaled = (y_original - y_original_center)*scale_y + y_scaled_center
Testing
The following code sample draws crosses at a few original and scaled coordinates:
import cv2

def draw_cross(im, x, y, use_color=False):
    """ Draw a cross with center (x,y) - cross is two rows and two columns """
    x = int(round(x - 0.5))
    y = int(round(y - 0.5))
    if use_color:
        im[y-4:y+6, x] = [0, 0, 255]
        im[y-4:y+6, x+1] = [255, 0, 0]
        im[y, x-4:x+6] = [0, 0, 255]
        im[y+1, x-4:x+6] = [255, 0, 0]
    else:
        im[y-4:y+6, x] = 0
        im[y-4:y+6, x+1] = 255
        im[y, x-4:x+6] = 0
        im[y+1, x-4:x+6] = 255

img = cv2.imread('graf.png')  # http://man.hubwiz.com/docset/OpenCV.docset/Contents/Resources/Documents/db/d70/tutorial_akaze_matching.html
rows, cols = img.shape[0:2]  # cols = 320, rows = 256

# 3 points for testing:
x0_original, y0_original = cols//2-0.5, rows//2-0.5                # 159.5, 127.5
x1_original, y1_original = cols//5-0.5, rows//4-0.5                # 63.5, 63.5
x2_original, y2_original = (cols//5)*3+20-0.5, (rows//4)*3+30-0.5  # 211.5, 221.5

draw_cross(img, x0_original, y0_original)  # Center of cross (159.5, 127.5)
draw_cross(img, x1_original, y1_original)
draw_cross(img, x2_original, y2_original)

scale_x = 2.5
scale_y = 2

resized_img = cv2.resize(img, (int(cols*scale_x), int(rows*scale_y)), interpolation=cv2.INTER_NEAREST)
resized_rows, resized_cols = resized_img.shape[0:2]  # cols = 800, rows = 512

# Compute center column and center row
x_original_center = (cols-1) / 2  # 159.5
y_original_center = (rows-1) / 2  # 127.5

# Compute center of resized image
x_scaled_center = (resized_cols-1) / 2  # 399.5
y_scaled_center = (resized_rows-1) / 2  # 255.5

# Compute the destination coordinates after resize
x0_scaled = (x0_original - x_original_center)*scale_x + x_scaled_center  # 399.5
y0_scaled = (y0_original - y_original_center)*scale_y + y_scaled_center  # 255.5
x1_scaled = (x1_original - x_original_center)*scale_x + x_scaled_center  # 159.5
y1_scaled = (y1_original - y_original_center)*scale_y + y_scaled_center  # 127.5
x2_scaled = (x2_original - x_original_center)*scale_x + x_scaled_center  # 529.5
y2_scaled = (y2_original - y_original_center)*scale_y + y_scaled_center  # 443.5

# Draw crosses on resized image
draw_cross(resized_img, x0_scaled, y0_scaled, True)
draw_cross(resized_img, x1_scaled, y1_scaled, True)
draw_cross(resized_img, x2_scaled, y2_scaled, True)

cv2.imshow('img', img)
cv2.imshow('resized_img', resized_img)
cv2.waitKey()
cv2.destroyAllWindows()
Original image:
Resized image:
Making sure the crosses are aligned:
Note:
In my answer I was using the naming conventions of Miki's comment.
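If you need the mapping in more than one place, the formulas above fold into a small helper (the function name and signature here are my own, not part of the answer):
def map_resized(x_original, y_original, rows, cols, scale_x, scale_y):
    """Map a source-pixel coordinate to its position after cv2.resize by (scale_x, scale_y)."""
    resized_cols = int(cols * scale_x)
    resized_rows = int(rows * scale_y)
    x_scaled = (x_original - (cols - 1) / 2) * scale_x + (resized_cols - 1) / 2
    y_scaled = (y_original - (rows - 1) / 2) * scale_y + (resized_rows - 1) / 2
    return x_scaled, y_scaled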
I have a set of binary masks that contain some noise (image below). I want to write a piece of code that turns the pixels in the noisy areas black. I have tried the code below, but it does not change any pixel of the original array to pitch black. Does anyone know how I can do this?
Mask with noise
I have tried the following code:
ht, wt = array.shape
area = ht * wt
for region in sme.regionprops(array):
    if 0.1 > (region.area/area) > 0.001:
        x1 = math.ceil(region.bbox[0])
        x2 = math.ceil(region.bbox[1])
        y1 = math.ceil(region.bbox[2])
        y2 = math.ceil(region.bbox[3])
        for item in array[y1:y2, x1:x2]:
            array = np.where(array, item>0.1, 0)
plt.imshow(array, cmap='gray')
plt.show()
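For what it's worth, a possible fix (a sketch, assuming the goal is to black out connected components whose relative area falls in that range): skimage.measure.regionprops expects a labeled image rather than the raw binary mask, and np.where(condition, a, b) takes the condition first, so zeroing each small region's own pixels directly avoids both problems:
import numpy as np
import skimage.measure as sme  # assuming this is the alias used above

labeled = sme.label(array > 0)  # label connected components
ht, wt = array.shape
area = ht * wt

for region in sme.regionprops(labeled):
    if 0.1 > (region.area / area) > 0.001:
        # region.coords holds the (row, col) pairs of this component
        rows, cols = region.coords[:, 0], region.coords[:, 1]
        array[rows, cols] = 0  # turn just these pixels black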
I'm trying to apply an overlay on a transparent logo using this function:
import cv2
import numpy as np
from PIL import Image

def overlay(path):
    logo_img = cv2.imread(path, cv2.IMREAD_UNCHANGED)

    '''Saving the alpha channel of the logo to the "alpha" variable.'''
    alpha = logo_img[:, :, 3]

    '''Creating the overlay of the logo by first creating an array of zeros in the shape of the logo.
    The color on this will change later to become the overlay that will mask the logo.'''
    mask = np.zeros(logo_img.shape, dtype=np.uint8)

    '''Adding the alpha (transparency) of the original logo so that the overlay has the same transparency
    that the original logo has.'''
    # mask[:, :, 2] = alpha

    '''This code chooses random values for the Red, Green and Blue channels of the image so that the final
    overlay always has a different background.'''
    # r, g, b = (random.randint(0, 255),
    #            random.randint(0, 255),
    #            random.randint(0, 255))
    r, g, b = (0, 255, 0)

    '''There is a risk of losing the transparency when randomizing the overlay color, so here I'm saving the
    alpha value.'''
    a = 255

    '''Creating the overlay.'''
    mask[:, :] = r, g, b, a
    mask[:, :, 3] = alpha

    '''Alp, short for alpha, is separate from above. This determines the opacity level of the logo. The
    beta parameter determines the opacity level of the overlay.'''
    alp = 1
    beta = 1 - alp

    '''addWeighted() is what masks the overlay on top of the logo.'''
    dst = cv2.addWeighted(logo_img, alp, mask, beta, 0, dtype=cv2.CV_32F).astype(np.uint8)

    '''Converting the output dst to a PIL image with the RGBA channels.'''
    pil_image = Image.fromarray(dst).convert('RGBA')
    return pil_image
As you can see, I have these two tuples for setting the RGB values. Whether I randomize them or select a specific color, it makes no difference to the color of the overlay.
# r, g, b = (random.randint(0, 255),
#            random.randint(0, 255),
#            random.randint(0, 255))
r, g, b = (0, 255, 0)
Edit
You wish to "change the colour of the logo" using an alpha matte. You cannot do this unless you manipulate the actual image pixels. Alpha matting is not the right approach here: it is primarily used to blend objects together, not to change the colour distribution of an object. I would suggest you mask out the regions of the logo you want to change, then replace the colours with what is desired. I have left the answer below for posterity, as the alpha matting method provided in the original question was incorrect at its core.
The core misunderstanding of this approach comes from how cv2.addWeighted is performed. Citing the documentation (emphasis mine):
In case of multi-channel arrays, each channel is processed independently. The function can be replaced with a matrix expression:
dst = src1*alpha + src2*beta + gamma;
cv2.addWeighted does not process the alpha channel in the way you are expecting. Specifically, it considers the alpha channel to be just another channel of information and takes a weighted sum of it like any other channel for the final output. It does not actually achieve alpha matting. Therefore, if you want alpha matting, you will need to compute the operation yourself.
Something like:
import numpy as np
import cv2
from PIL import Image

def overlay(path):
    ### Some code from your function - comments removed for brevity
    logo_img = cv2.imread(path, cv2.IMREAD_UNCHANGED)
    alpha = logo_img[:, :, 3]
    mask = np.zeros(logo_img.shape, dtype=np.uint8)
    # r, g, b = (random.randint(0, 255),
    #            random.randint(0, 255),
    #            random.randint(0, 255))
    r, g, b = (0, 255, 0)
    a = 255
    mask[:, :] = r, g, b, a
    mask[:, :, 3] = alpha

    ### Alpha matte code here
    alp = alpha.astype(np.float32) / 255.0  # To make between [0, 1]
    alp = alp[..., None]  # For broadcasting
    dst_tmp = logo_img[..., :3].astype(np.float32) * alp + mask[..., :3].astype(np.float32) * (1.0 - alp)
    dst = np.zeros_like(logo_img)
    dst[..., :3] = dst_tmp.astype(np.uint8)
    dst[..., 3] = 255

    pil_image = Image.fromarray(dst).convert('RGBA')
    return pil_image
The first part of this new function is what you originally had. The alpha matting section appears after the marked comment.
The first line of that section converts the alpha map into the [0, 1] range. This is required so that the alpha matting operation, which is a weighted sum of two images, produces no output pixels outside the range of the native data type. I also introduce a singleton third dimension so that the alpha channel broadcasts over each RGB channel separately, weighting them correctly.
After that, we compute the alpha matte as a weighted sum of the logo image and the mask. Take note that I subset and pull out just the RGB channels of each; the alpha channel isn't needed there, since it's used directly as the weights instead. Finally, create a new output image whose first three channels are the resulting alpha matte, with the alpha channel set entirely to 255: the blending has already been done at this point, so you want all of the pixel values to show with no transparency.
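A minimal usage sketch (the file names here are hypothetical):
result = overlay('logo.png')  # hypothetical path to an RGBA logo
result.save('logo_blended.png')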
I'm trying to work on video stabilization using Python and template matching via skimage. The code is supposed to track a single point through the whole video, but the tracking is awfully imprecise and I suspect it isn't even working correctly.
This is the track_point function, which is supposed to take a video and the coordinates of a point as input, and then return an array of tracked positions, one per frame:
import numpy as np
from skimage.feature import match_template
from skimage.color import rgb2gray

def track_point(video, x, y, patch_size = 4, search_size = 40):
    length, height, width, _ = video.shape
    frame = rgb2gray(np.squeeze(video[1, :, :, :]))  # convert image to grayscale

    x1 = int(max(1, x - patch_size / 2))
    y1 = int(max(1, y - patch_size / 2))
    x2 = int(min(width, x + patch_size / 2 - 1))
    y2 = int(min(height, y + patch_size / 2 - 1))
    template = frame[y1:y2, x1:x2]  # cut the reference patch (template) from the first frame

    track_x = [x]
    track_y = [y]
    #plt.imshow(template)
    half = int(search_size/2)

    for i in range(1, length):
        prev_x = int(track_x[i-1])
        prev_y = int(track_y[i-1])
        frame = rgb2gray(np.squeeze(video[i, :, :, :]))  # Extract current frame and convert it to grayscale
        image = frame[prev_x-half:prev_x+half, prev_y-half:prev_y+half]  # Cut out a search_size x search_size region from 'frame' centered on the point's previous position (i-1)
        result = match_template(image, template, pad_input=False, mode='constant', constant_values=0)  # Compare the region to the template using match_template
        ij = np.unravel_index(np.argmax(result), result.shape)  # Select the best match (maximum) and determine its position
        x, y = ij[::-1]
        x += x1
        y += y1
        track_x.append(x)
        track_y.append(y)

    return track_x, track_y
And this is how the function is called:
points = track_point(video, point[0], point[1])
# Draw trajectory on top of the first frame from video
image = np.squeeze(video[1, :, :, :])
figure = plt.figure()
plt.gca().imshow(image)
plt.gca().plot(points[0], points[1])
I expect the plotted trajectory to be fairly smooth, since the video isn't that shaky, but it's not. For some reason the plot ends up covering almost all of the coordinates of the search region.
EDIT: Here's the link for the video: https://upload-video.net/a11073n9Y11-noau
What am I doing wrong?
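One likely culprit (an assumption on my part, not a confirmed fix): NumPy indexes arrays as [row, column], i.e. [y, x], so the search window above is cut with x and y swapped, and the match position is offset by the template's first-frame corner (x1, y1) instead of the search window's own origin. A sketch of the corrected inner loop under those assumptions:
for i in range(1, length):
    prev_x = int(track_x[i-1])
    prev_y = int(track_y[i-1])
    frame = rgb2gray(np.squeeze(video[i, :, :, :]))

    # Rows (y) come first in NumPy indexing
    image = frame[prev_y-half:prev_y+half, prev_x-half:prev_x+half]

    result = match_template(image, template, pad_input=False,
                            mode='constant', constant_values=0)
    ij = np.unravel_index(np.argmax(result), result.shape)
    dx, dy = ij[::-1]  # top-left of the best match inside the search window

    # Convert back to full-frame coordinates using the window's origin;
    # the tracked point sits roughly at the template centre, about
    # patch_size/2 from the matched top-left corner
    x = (prev_x - half) + dx + patch_size // 2
    y = (prev_y - half) + dy + patch_size // 2
    track_x.append(x)
    track_y.append(y)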
I am trying to calculate the percentage of black in a pixel. For example, let's say I have a pixel that is 75% black, i.e. a gray. I have the RGBA values, so how do I get the level of black?
I have already completed getting each pixel and replacing it with a new RGBA value, and tried to use some RGBA logic to no avail.
# Gradient testing here
from PIL import Image

picture = Image.open("img1.png")
img = Image.open('img1.png').convert('LA')
img.save('greyscale.png')

# Get the size of the image
width, height = picture.size

# Process every pixel
for x in range(width):
    for y in range(height):
        # Code I need here
        r1, g1, b1, alpha = picture.getpixel( (x,y) )
        r, g, b = 120, 140, 99
        greylvl = 1 - (alpha(r1 + g1 + b1) / 765)  # Code I tried
I would like to get a new variable that gives me a value, such as 0.75, which would represent a 75% black pixel.
I'm not quite sure what the "LA" format you're trying to convert to is for; I would try "L" instead.
Try this code: (Make sure you're using Python 3.)
from PIL import Image

picture = Image.open('img1.png').convert('L')
width, height = picture.size

for x in range(width):
    for y in range(height):
        value = picture.getpixel( (x, y) )
        black_level = 1 - value / 255
        print('Level of black at ({}, {}): {} %'.format(x, y, black_level * 100))
Is this what you're looking for?
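Since the question mentions RGBA, one hedged variation (my assumption about the desired behaviour, not something you specified): if transparency should make a pixel read as less black, composite the image onto a white background before measuring:
from PIL import Image

picture = Image.open('img1.png').convert('RGBA')

# Composite onto white so a semi-transparent pixel reads as less black
background = Image.new('RGBA', picture.size, (255, 255, 255, 255))
flattened = Image.alpha_composite(background, picture).convert('L')

width, height = flattened.size
for x in range(width):
    for y in range(height):
        value = flattened.getpixel((x, y))
        black_level = 1 - value / 255  # 1.0 = fully black, 0.0 = white or fully transparent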