I would like to achieve something similar to this:
I currently have the image on the red background, but I am unsure how to draw a translucent rectangle like the one in the image above to put the text on, in order to make it pop out more. I'm pretty sure it can be achieved using OpenCV, but I am fairly new to Python and it seems very confusing (I can't seem to do it properly and it's starting to annoy me). Here is my current image (ignore the white outline):
Here is one way to achieve the same results in Python/OpenCV.
Read the input
Crop the desired region to darken
Create the same sized black image
Blend the two images (crop at 75%, black at 25%)
Draw text on the blended image
Copy the text image back to the same location in the input
Save results
Input:
import cv2
import numpy as np
# load image
img = cv2.imread("chimichanga.jpg")
# define undercolor region in the input image
x,y,w,h = 66,688,998,382
# define text coordinates in the input image
xx,yy = 250,800
# compute text coordinates in undercolor region
xu = xx - x
yu = yy - y
# crop undercolor region of input
sub = img[y:y+h, x:x+w]
# create black image same size
black = np.zeros_like(sub)
# blend the two
blend = cv2.addWeighted(sub, 0.75, black, 0.25, 0)
# draw text on blended image
text = cv2.putText(blend, "CHIMICHANGA", (xu,yu), cv2.FONT_HERSHEY_SIMPLEX, 2, (255,255,255), thickness=8, lineType=cv2.LINE_8)
# copy text filled region onto input
result = img.copy()
result[y:y+h, x:x+w] = text
# write result to disk
cv2.imwrite("chimichanga_result.jpg", result)
# display results
cv2.imshow("BLEND", blend)
cv2.imshow("TEXT", text)
cv2.imshow("RESULT", result)
cv2.waitKey(0)
cv2.destroyAllWindows()
Result:
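As an aside, the same translucent box can be sketched without cropping at all: draw a filled rectangle on a copy of the whole image and blend the copy back over the original, so only the boxed region is darkened. This is an alternative sketch reusing the coordinates above, not the method of the answer itself:

import cv2

img = cv2.imread("chimichanga.jpg")
x, y, w, h = 66, 688, 998, 382

# Fill the box with black on a copy; outside the box the copy equals the
# original, so the blend leaves those pixels untouched
overlay = img.copy()
cv2.rectangle(overlay, (x, y), (x + w, y + h), (0, 0, 0), -1)
result = cv2.addWeighted(img, 0.75, overlay, 0.25, 0)

# Same text placement as above
cv2.putText(result, "CHIMICHANGA", (250, 800), cv2.FONT_HERSHEY_SIMPLEX, 2, (255, 255, 255), thickness=8, lineType=cv2.LINE_8)
cv2.imwrite("chimichanga_result_alt.jpg", result)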
Related
When I was trying to overlay one image over the other (one image had a transparent rounded-rectangle filling, the other was just a normal image), it looked either like this (just putting the yellow over the pink without taking the rounded corners into account at all) or like this (it looks just like the rounded rectangle without adding anything, and it even kept the transparency).
This is how it should look:
Here are the two example images: (pink.png) and (yellow.png)
Here is the code used for this:
import cv2
import numpy as np
layer0 = cv2.imread(r'yellow.png', cv2.IMREAD_UNCHANGED)
h0, w0 = layer0.shape[:2]
layer4 = cv2.imread(r"pink.png", cv2.IMREAD_UNCHANGED)
# Just a workaround to make the image look more transparent in OpenCV's imshow,
# because imshow always ignores the alpha channel and pretends the image has none
for y in range(layer4.shape[0]):
    for x in range(layer4.shape[1]):
        if layer4[y, x][3] < 255:
            layer4[y, x][:] = 0, 0, 0, 0
# Create a new np array
shapes = np.zeros_like(layer4, np.uint8)
shapes = cv2.cvtColor(shapes, cv2.COLOR_BGR2BGRA)
#the start position of the yellow image on the pink
gridpos = (497,419)
shapes[gridpos[1]:gridpos[1]+h0, gridpos[0]:gridpos[0]+w0] = layer0
# Change this into bool to use it as mask
mask = shapes.astype(bool)
# We'll create a loop to change the alpha
# value, i.e. the transparency of the overlay
for alpha in np.arange(0, 1.1, 0.1)[::-1]:
    # Create a copy of the image to work with
    bg_img = layer4.copy()
    # Create the overlay
    bg_img[mask] = cv2.addWeighted(bg_img, 1 - alpha, shapes, alpha, 0)[mask]
    # print the alpha value on the image
    cv2.putText(bg_img, f'Alpha: {round(alpha, 1)}', (50, 200),
                cv2.FONT_HERSHEY_PLAIN, 8, (200, 200, 200), 7)
    # resize the image before displaying
    bg_img = cv2.resize(bg_img, (700, 600))
    cv2.imwrite("out.png", bg_img)
    cv2.imshow('Final Overlay', bg_img)
    cv2.waitKey(0)
You can test different alpha combinations by pressing a key on the keyboard.
OpenCV Version
Took me some time, but basically you have to mask both images and then combine them. The code below is commented and should be self-explanatory. I think the hardest part to grasp is that your pink image actually represents the foreground and the yellow image is your background. The trickiest part is to not let anything through from your background, which is why you have to mask both images.
import cv2
import numpy as np
pink = cv2.imread("pink.png", cv2.IMREAD_UNCHANGED)
# We now have to use an image that has the same size as the pink "foreground"
# and create a black image with numpy's zeros_like (gives same size as input)
background = np.zeros_like(pink)
# We then split the pink image into 4 channels:
# b, g, r and alpha, we only need the alpha as mask
_, _, _, mask = cv2.split(pink)
yellow = cv2.imread("yellow.png", cv2.IMREAD_UNCHANGED)
# we need the x and y dimensions for pasting the image later
h_yellow, w_yellow = yellow.shape[:2]
# Assuming format is (x, y)
gridpos = (497, 419)
# We paste the yellow image onto our black background
# IMPORTANT: if any of the dimensions of yellow plus the gridpos is
# larger than the background width or height, this will give you an
# error! Also, this only works with the same number of input channels.
# If you are loading a jpg image without alpha channel, you can adjust
# the number of channels, the last input param, e.g. with :3 to only use
# the first 3 channels
background[gridpos[1]:gridpos[1] + h_yellow, gridpos[0]:gridpos[0] + w_yellow, :] = yellow
# This step was not intuitive for me in the first run, since the
# pink img should already be masked, but for some reason, it is not
pink_masked = cv2.bitwise_and(pink, pink, mask=mask)
# In this step, we mask the positioned yellow image with the inverse
# mask from the pink image, achieved by bitwise_not
background = cv2.bitwise_and(background, background, mask=cv2.bitwise_not(mask))
# We combine the pink masked image with the background
img = cv2.convertScaleAbs(pink_masked + background)
cv2.imshow("img", img), cv2.waitKey(0), cv2.destroyAllWindows()
Cheers!
Old Answer:
It looks like you are setting the whole image as a mask, which is why the rounded corners of your pink background have no effect at all. I struggled a lot with this task myself and ended up using Pillow instead of OpenCV. I don't know if it is more performant, but I got it running.
Here is the code that works for your example:
from PIL import Image
# load images
background = Image.open(r"pink.png")
# load image and scale it to the same size as the background
foreground = Image.open(r"yellow.png").resize(background.size)
# split gives you the r, g, b and alpha channel of the image.
# For the mask we only need alpha channel, indexed at 3
mask = background.split()[3]
# we combine the two images and provide the mask that is applied to the foreground.
im = Image.composite(background, foreground, mask)
im.show()
If your background is not monochrome as in your example, and you want to use the version where you paste your original image, you have to create an empty image with the same size as the background, then paste your foreground at its position (your gridpos), e.g. like this:
canvas = Image.new('RGBA', background.size)
canvas.paste(foreground, gridpos)
foreground = canvas
Hope this helps!
I'm making a script that copies an "anomaly" image and pastes it at random places on an original image. Like this:
Original Image:
Anomaly Image:
Output Image Example:
But at the same time that the image with the anomaly is generated, I need to generate a mask where the area of the anomaly that I pasted is white and the rest of the image is black. Like this (I did it manually in Gimp):
Output image mask example:
How can I do this automatically at the same time as the anomaly image is generated? Below the code I'm using:
from PIL import Image
import random
anomaly = Image.open("anomaly_transp.png") # anomaly image with transparent background
image_number = 1 # number of images
w, h = anomaly.size
offset_x, offset_y = 480-w, 512-h # offsets to avoid incorrect paste area from original image
for i in range(image_number):
    original = Image.open("Crop_120.png") # original good image
    x1 = random.randint(0, offset_x)
    y1 = random.randint(0, offset_y)
    area = (x1, y1, x1 + w, y1 + h)
    original.paste(anomaly, area, anomaly)
    original.save("output_" + str(i) + ".png") # save output image
    original.close()
You can use
alpha = anomaly.split()[-1]
to fetch the alpha plane of your transparent image. You can then paste that into an all black image of the right size to get your mask.
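A minimal sketch of how that could be folded into the loop from the question (same file names as in the question; the mask is built by pasting the anomaly's alpha plane onto a black canvas at the same random position):

from PIL import Image
import random

anomaly = Image.open("anomaly_transp.png")
alpha = anomaly.split()[-1]  # alpha plane of the anomaly
image_number = 1
w, h = anomaly.size
offset_x, offset_y = 480 - w, 512 - h

for i in range(image_number):
    original = Image.open("Crop_120.png")
    x1 = random.randint(0, offset_x)
    y1 = random.randint(0, offset_y)
    area = (x1, y1, x1 + w, y1 + h)
    original.paste(anomaly, area, anomaly)
    original.save("output_" + str(i) + ".png")
    # build the matching mask: black canvas of the original's size,
    # with the alpha plane pasted at the same spot - the anomaly area
    # comes out white (opaque) and everything else stays black
    mask = Image.new("L", original.size, 0)
    mask.paste(alpha, area)
    mask.save("mask_" + str(i) + ".png")
    original.close()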
I have a screenshot received from an iPhone, both dark and light mode.
I need to use OCR to extract the URL but am unable to do so with the underlining that appears.
What would be the best way to remove the horizontal lines from the message? Apart from the phone number, it doesn't matter if other parts of the screenshot get distorted.
I've tried approaches as described in
Removing Horizontal Lines in image (OpenCV, Python, Matplotlib)
https://docs.opencv.org/3.2.0/d1/dee/tutorial_moprh_lines_detection.html
https://legacy.imagemagick.org/discourse-server/viewtopic.php?t=22338
And none seem to work well, at all.
Here's a possible solution for your problem. I'm using mock screenshots since, as I suggested, it is better to use lossless images to get a better result. The main idea here is to extract the color of the text box and fill the rest of the image with that color, then threshold the image. By doing this, we reduce the intensity variation and obtain a better thresholded image, since the image histogram will contain fewer intensity values. These are the steps:
Crop the image to a ROI (Region Of Interest)
Get the colors in that ROI via K-Means
Get the color of the text box
Flood-fill the ROI with the color of the text box
Apply Otsu's thresholding to get a binary image
Get OCR of the image
Suppose these are our test images; one uses a "light" theme while the other uses a "dark" theme:
I'll be using pyocr as the OCR engine. Let's use the first image; the code would be this:
# imports:
from PIL import Image
import numpy as np
import cv2
import pyocr
import pyocr.builders
tools = pyocr.get_available_tools()
# The tools are returned in the recommended order of usage
tool = tools[0]
langs = tool.get_available_languages()
lang = langs[0]
# image path
path = "D://opencvImages//"
fileName = "mockText.png"
# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)
# Convert BGR to grayscale:
grayscaleImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
# Set the ROI location:
roiX = 0
roiY = 235
roiWidth = 750
roiHeight = 1080
# Crop the ROI (note that roiWidth and roiHeight are used as end coordinates here):
smsROI = grayscaleImage[roiY:roiHeight, roiX:roiWidth]
The first bit crops the ROI - everything that is of interest, leaving out the "header" and the "footer" of the image, where there's info that we really don't need. This is the current ROI:
Wouldn't it be nice to (approximately) get all the colors used in the image? Fortunately, that's what Color Quantization gives us - a reduced palette of the average colors present in an image, given the number of colors we are looking for. Let's apply K-Means with 3 clusters to group these colors.
In our test images, most of the pixels are background, so the largest cluster of pixels will belong to the background. The text represents the smallest cluster of pixels. That leaves the remaining cluster as our target - the color of the text box. Let's apply K-Means, then. We need to format the data first, though, because K-Means needs reshaped float32 arrays:
# Reshape the data into a single column of pixel samples:
kmeansData = smsROI.reshape((-1,1))
# convert the data to np.float32
kmeansData = np.float32(kmeansData)
# define criteria, number of clusters(K) and apply kmeans():
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 5, 1.0)
# Define number of clusters (3 colors):
K = 3
# Run K-means:
_, _, center = cv2.kmeans(kmeansData, K, None, criteria, 10, cv2.KMEANS_RANDOM_CENTERS)
# Convert the centers to uint8:
center = np.uint8(center)
# Sort centers from smallest to largest:
center = sorted(center, reverse=False)
# Get text color and min color:
textBoxColor = int(center[1][0])
minColor = min(center)[0]
print("Minimum Color is: "+str(minColor))
print("Text Box Color is: "+str(textBoxColor))
The info of interest is in center - that's where our colors are. After sorting this list and getting the minimum color value (which I'll use later to distinguish between a light and a dark theme), we can print the values. For the first test image, these values are:
Minimum Color is: 23
Text Box Color is: 225
Alright, so far so good. We have the color of the text box. Let's use that and flood-fill the entire ROI at position (x=0, y=0):
# Apply flood-fill at seed point (0,0):
cv2.floodFill(smsROI, mask=None, seedPoint=(0, 0), newVal=textBoxColor)
The result is this:
Very nice. Let's apply Otsu's thresholding on this bad boy:
# Threshold via Otsu:
_, binaryImage = cv2.threshold(smsROI, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
Now, here comes the minColor part. If you process a dark theme screenshot and threshold it, you will get white text on a black background. If you process a light theme screenshot, you will get black text on a white background. We want to always produce the same output no matter the input: white text on a black background. Let's check the min color: if it equals 0 (black), you just received a dark theme screenshot and don't need to invert the image. Otherwise, invert the image:
# Process "Dark Theme / Light Theme":
if minColor != 0:
    # Invert image if it is not already inverted:
    binaryImage = 255 - binaryImage
cv2.imshow("binaryImage", binaryImage)
cv2.waitKey(0)
For our first test image, the result is:
Notice the little bits of small noise. Let's apply an area filter (function defined at the end of the post) to get rid of pixels below a certain area threshold:
# Run a minimum area filter:
minArea = 10
binaryImage = areaFilter(minArea, binaryImage)
This is the filtered image:
Very nice. Lastly, I write this image and use pyocr to get the text as a string:
cv2.imwrite(path + "ocrText.png", binaryImage)
txt = tool.image_to_string(
    Image.open(path + "ocrText.png"),
    lang=lang,
    builder=pyocr.builders.TextBuilder()
)
print("Image text is: "+txt)
Which results in:
Image text is: 301248 is your Amazon
verification code
If you test the second image you get the same exact result. This is the definition and implementation of the areaFilter function:
def areaFilter(minArea, inputImage):
    # Perform an area filter on the binary blobs:
    componentsNumber, labeledImage, componentStats, componentCentroids = \
        cv2.connectedComponentsWithStats(inputImage, connectivity=4)
    # Get the indices/labels of the remaining components based on the area stat
    # (skip the background component at index 0)
    remainingComponentLabels = [i for i in range(1, componentsNumber) if componentStats[i][4] >= minArea]
    # Filter the labeled pixels based on the remaining labels,
    # assign pixel intensity to 255 (uint8) for the remaining pixels
    filteredImage = np.where(np.isin(labeledImage, remainingComponentLabels), 255, 0).astype('uint8')
    return filteredImage
I have two images: an image with text and an image serving as the dirty background.
Clean Image
Dirty Background Image
How can I overlay the clean image onto the dirty background image using Python? Please assume that the clean image is smaller than the dirty background image.
There's a library called Pillow (a fork of PIL) that can do this for you. You can play around with the placement a little, but I think it looks good.
from PIL import Image

# Open your two images
cleantxt = Image.open('cleantext.jpg')
dirtybackground = Image.open('dirtybackground.jpg')
# Convert the image to RGBA
cleantxt = cleantxt.convert('RGBA')
# Return a sequence object of every pixel in the text
data = cleantxt.getdata()
new_data = []
# Turn every pixel that looks lighter than gray into a transparent pixel
# This turns everything except the text transparent
for item in data:
    if item[0] >= 123 and item[1] >= 123 and item[2] >= 123:
        new_data.append((255, 255, 255, 0))
    else:
        new_data.append(item)
# Replace the old pixel data of the clean text with the transparent pixel data
cleantxt.putdata(new_data)
# Resize the clean text to fit on the dirty background (which is 850 x 555 pixels)
cleantxt.thumbnail((555, 555), Image.LANCZOS)  # ANTIALIAS was renamed to LANCZOS in newer Pillow versions
# Save the clean text if we want to use it for later
cleantxt.save("cleartext.png", "PNG")
# Overlay the clean text on top of the dirty background
## (0, 0) is the pixel where you place the top left pixel of the clean text
## The second cleantxt is used as a mask
## If you pass in a transparency, the alpha channel is used as a mask
dirtybackground.paste(cleantxt, (0,0), cleantxt)
# Show it!
dirtybackground.show()
# Save it!
dirtybackground.save("dirtytext.png", "PNG")
Here's the output image:
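As a side note, the per-pixel loop over getdata() can be replaced with a single vectorized NumPy pass. A sketch under the same "everything lighter than gray becomes transparent" assumption:

import numpy as np
from PIL import Image

cleantxt = Image.open('cleantext.jpg').convert('RGBA')
arr = np.array(cleantxt)

# Pixels whose R, G and B values are all >= 123 become fully transparent white
light = (arr[:, :, :3] >= 123).all(axis=2)
arr[light] = (255, 255, 255, 0)

cleantxt = Image.fromarray(arr)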
I am trying to combine some parts of two images while keeping other parts unchanged.
This is the first image.
This is the code that produces the first image. The img parameter is the original image (already colorized with green), while jawline, eyebrows, etc. are the (x, y) coordinates used to cut those parts from the image:
def getmask(img, jawline, eyebrows, eyes, mouth):
    img = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
    imArray = np.asarray(img)
    # create mask
    polygon = jawline.flatten().tolist()
    maskIm = Image.new('L', (imArray.shape[1], imArray.shape[0]), 0)
    ImageDraw.Draw(maskIm).polygon(polygon, outline=1, fill='white')
    #ImageDraw.Draw(maskIm).polygon(polygon, outline=(1))
    # draw eyes
    righteyes = eyes[0:6].flatten().tolist()
    ImageDraw.Draw(maskIm).polygon(righteyes, outline=1, fill='black')
    lefteyes = eyes[6:].flatten().tolist()
    ImageDraw.Draw(maskIm).polygon(lefteyes, outline=1, fill='black')
    # draw eyebrows
    rightbrows = eyebrows[0:6].flatten().tolist()
    ImageDraw.Draw(maskIm).polygon(rightbrows, outline=2, fill='black')
    leftbrows = eyebrows[6:].flatten().tolist()
    ImageDraw.Draw(maskIm).polygon(leftbrows, outline=2, fill='black')
    # draw mouth
    mouth = mouth.flatten().tolist()
    ImageDraw.Draw(maskIm).polygon(mouth, outline=1, fill='black')
    mask = np.array(maskIm)
    mask = np.multiply(img, mask) + np.multiply((1 - mask), np.ones((L, P, 3)))
    return mask
This is the second image, which will fill the white blanks inside the first image.
I used this code to cut out the parts; it is very similar to the code for the first image:
def getface(img, eyebrows, eyes, mouth):
    im = img.copy()
    img = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
    imArray = np.asarray(img)
    # create mask
    maskIm = Image.new('L', (imArray.shape[1], imArray.shape[0]), 0)
    righteyes = eyes[0:6].flatten().tolist()
    ImageDraw.Draw(maskIm).polygon(righteyes, outline=1, fill='white')
    lefteyes = eyes[6:].flatten().tolist()
    ImageDraw.Draw(maskIm).polygon(lefteyes, outline=1, fill='white')
    # draw eyebrows
    rightbrows = eyebrows[0:6].flatten().tolist()
    ImageDraw.Draw(maskIm).polygon(rightbrows, outline=2, fill='white')
    leftbrows = eyebrows[6:].flatten().tolist()
    ImageDraw.Draw(maskIm).polygon(leftbrows, outline=2, fill='white')
    # draw mouth
    mouth = mouth.flatten().tolist()
    ImageDraw.Draw(maskIm).polygon(mouth, outline=1, fill='white')
    mask = np.array(maskIm)
    cutted_part = cv2.bitwise_or(im, im, mask=mask)
    return cutted_part
So far I have tried to combine the two images by first inverting the second image, so that the black background becomes white, and then multiplying the first and second images. But the result isn't satisfactory.
As you can see, there is some white space between the combined areas, and I notice that some parts from the second image become smaller or go missing, which I suspect creates that white space when the images are combined (please don't mind the slightly different color in the result). Maybe someone can share how to resolve this problem, or a better way to combine two images?
If you provide your results as actual pictures instead of cropped screenshots, we can reproduce your problem. For now I would recommend:
Inverting the background of your cutout (black to white) and then simply combining both pictures, either by adding them (they need to have the same dimensions, which I presume is the case) or by overlaying them with OpenCV's addWeighted function to adjust the opacity.
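One way that combine step could look in code, as a rough sketch with hypothetical file names (note that for the plain-addition variant, black acts as "transparent": each image should stay black wherever the other one carries the content, since black pixels contribute nothing to the sum):

import cv2

# Hypothetical file names for the two intermediate images described above
base = cv2.imread("face_with_black_holes.png")   # parts to keep, cut regions black
parts = cv2.imread("cut_parts_on_black.png")     # cut-out parts, rest black

# Straight addition - both images must have identical dimensions
combined = cv2.add(base, parts)

# Or blend with adjustable opacity via addWeighted instead
blended = cv2.addWeighted(base, 0.7, parts, 0.3, 0)

cv2.imwrite("combined.png", combined)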