I draw some rectangles in OpenCV and put text in them. My general approach looks like this:
# Draw rectangle p1(x,y) p2(x,y) Student name box
cv2.rectangle(frame, (500, 650), (800, 700), (42, 219, 151), cv2.FILLED)
font = cv2.FONT_HERSHEY_DUPLEX
cv2.putText(frame, name, (510, 685), font, 1.0, (255, 255, 255), 1)
Everything works so far. The only issue is that the opacity of all boxes is at 100%. My question is: how can I change the opacity?
The final result should look like this:
I would like to add a small optimization to @HansHirse's answer. Instead of creating a canvas for the whole image, we can crop the rectangle from the source image first and then swap it back in after applying cv2.addWeighted:
import cv2
import numpy as np
img = cv2.imread("lena.png")
# First we crop the sub-rect from the image
x, y, w, h = 100, 100, 200, 100
sub_img = img[y:y+h, x:x+w]
white_rect = np.ones(sub_img.shape, dtype=np.uint8) * 255
res = cv2.addWeighted(sub_img, 0.5, white_rect, 0.5, 1.0)
# Putting the image back to its position
img[y:y+h, x:x+w] = res
EDIT: Since this answer seems to have some importance, I decided to edit it again, incorporating the proper blending from ZdaR's answer, which initially was an improvement to my original answer (check the timeline if interested). Also, I incorporated Jon's comments to include an example of a non-rectangular shape.
As far as I can tell, built-in drawing functions like cv2.rectangle don't support opacity, even on BGRA images (see here). So, as described in the linked answer, the only way to achieve what you want is via the cv2.addWeighted function. You can simply set up a blank mask image and draw all desired shapes on it. That image can then also serve as an actual mask to limit the blending to those parts only.
An example could be:
import cv2
import numpy as np
# Load image
img = cv2.imread('images/paddington.png')
# Initialize blank mask image of same dimensions for drawing the shapes
shapes = np.zeros_like(img, np.uint8)
# Draw shapes
cv2.rectangle(shapes, (5, 5), (100, 75), (255, 255, 255), cv2.FILLED)
cv2.circle(shapes, (300, 300), 75, (255, 255, 255), cv2.FILLED)
# Generate output by blending image with shapes image, using the shapes
# images also as mask to limit the blending to those parts
out = img.copy()
alpha = 0.5
mask = shapes.astype(bool)
out[mask] = cv2.addWeighted(img, alpha, shapes, 1 - alpha, 0)[mask]
# Visualization
cv2.imshow('Image', img)
cv2.imshow('Shapes', shapes)
cv2.imshow('Output', out)
cv2.waitKey(0)
cv2.destroyAllWindows()
The original Paddington img:
The intermediate image to draw the shapes on shapes:
And, the final result out:
After drawing the shapes and blending the images, you can add your texts as before.
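For instance, a minimal sketch of that last step, placing a label inside the rectangle drawn above (the coordinates and text are just illustrative):
font = cv2.FONT_HERSHEY_DUPLEX
cv2.putText(out, 'Label', (10, 50), font, 0.5, (0, 0, 0), 1)  # draw text on top of the blended rectangle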
Hope that helps!
Simply install pyshine and use putBText; it has the following inputs and output.
pip install pyshine
"""
Inputs:
img: cv2 image img
text_offset_x, text_offset_y: X, Y location of text start
vspace, hspace: Vertical and Horizontal space between text and box boundaries
font_scale: Font size
background_RGB: Background R,G,B color
text_RGB: Text R,G,B color
font: Font Style e.g. cv2.FONT_HERSHEY_DUPLEX,cv2.FONT_HERSHEY_SIMPLEX,cv2.FONT_HERSHEY_PLAIN,cv2.FONT_HERSHEY_COMPLEX
cv2.FONT_HERSHEY_TRIPLEX, etc
thickness: Thickness of the text font
alpha: Opacity 0~1 of the box around text
gamma: 0 by default
Output:
img: CV2 image with text and background
"""
The examples were tested on Python 3; a complete demonstration follows:
lena.jpg
simple.py
import pyshine as ps
import cv2
image = cv2.imread('lena.jpg')
text = 'HELLO WORLD!'
image = ps.putBText(image,text,text_offset_x=20,text_offset_y=20,vspace=10,hspace=10, font_scale=2.0,background_RGB=(0,250,250),text_RGB=(255,250,250))
cv2.imshow('Output', image)
cv2.imwrite('out.jpg',image)
cv2.waitKey(0)
out.jpg
another.py
import pyshine as ps
import cv2
import time
image = cv2.imread('lena.jpg')
text = 'ID: '+str(123)
image = ps.putBText(image,text,text_offset_x=20,text_offset_y=20,vspace=10,hspace=10, font_scale=1.0,background_RGB=(228,225,222),text_RGB=(1,1,1))
text = str(time.strftime("%H:%M %p"))
image = ps.putBText(image,text,text_offset_x=image.shape[1]-170,text_offset_y=20,vspace=10,hspace=10, font_scale=1.0,background_RGB=(228,225,222),text_RGB=(1,1,1))
text = '6842'
image = ps.putBText(image,text,text_offset_x=80,text_offset_y=372,vspace=10,hspace=10, font_scale=1.0,background_RGB=(228,225,222),text_RGB=(255,255,255))
text = "Lena Fors'en"
image = ps.putBText(image,text,text_offset_x=80,text_offset_y=425,vspace=20,hspace=10, font_scale=1.0,background_RGB=(20,210,4),text_RGB=(255,255,255))
text = 'Status: '
image = ps.putBText(image,text,text_offset_x=image.shape[1]-130,text_offset_y=200,vspace=10,hspace=10, font_scale=1.0,background_RGB=(228,225,222),text_RGB=(255,255,255))
text = 'On time'
image = ps.putBText(image,text,text_offset_x=image.shape[1]-130,text_offset_y=242,vspace=10,hspace=10, font_scale=1.0,background_RGB=(228,225,222),text_RGB=(255,255,255))
text = 'Attendence: '
image = ps.putBText(image,text,text_offset_x=image.shape[1]-200,text_offset_y=394,vspace=10,hspace=10, font_scale=1.0,background_RGB=(228,225,222),text_RGB=(255,255,255))
text = '96.2% '
image = ps.putBText(image,text,text_offset_x=image.shape[1]-200,text_offset_y=436,vspace=10,hspace=10, font_scale=1.0,background_RGB=(228,225,222),text_RGB=(255,255,255))
cv2.imshow('Output', image)
cv2.imwrite('out.jpg',image)
cv2.waitKey(0)
out.jpg
A simpler solution (although a bit less efficient in terms of memory) is:
create a copy of the original image
draw the desired shapes/text on the original image
get the overlay with: alpha*img + (1-alpha)*img_cpy
In this way each original pixel will not change its value (since alpha*px + (1-alpha)*px = px), whereas pixels that were drawn on will be affected by the overlay.
This eliminates the need to perform crops and pesky calculations seen in the other answers.
...and applying it to the OP's code:
frame_cpy = frame.copy()
cv2.rectangle(frame, (500, 650), (800, 700), (42, 219, 151), cv2.FILLED)
font = cv2.FONT_HERSHEY_DUPLEX
cv2.putText(frame, name, (510, 685), font, 1.0, (255, 255, 255), 1)
alpha = 0.4
frame_overlay = cv2.addWeighted(frame, alpha, frame_cpy, 1 - alpha, gamma=0)
cv2.imshow("overlay result", frame_overlay)
cv2.waitKey(0)
Disclaimer: this answer was inspired by a post on www.pyimagesearch.com
I want to use OCR (pytesseract) to recognize the text located in images like these:
I have thousands of these arrows. Until now the procedure has been as follows: I first resize the image (for another process). Then I crop the image to get rid of most of the arrow. Next I draw a white rectangle as a frame to remove further noise, while still keeping some distance between the text and the image borders for better text recognition. I resize the image again to ensure a capital-letter height of ~30 px (https://groups.google.com/forum/#!msg/tesseract-ocr/Wdh_JJwnw94/24JHDYQbBQAJ). Finally I binarize the image with a threshold of 150.
Full code:
import cv2
image_file = '001.jpg'
# load the input image and grab the image dimensions
image = cv2.imread(image_file, cv2.IMREAD_GRAYSCALE)
(h_1, w_1) = image.shape[:2]
# resize the image and grab the new image dimensions
image = cv2.resize(image, (int(w_1*320/h_1), 320))
(h_1, w_1) = image.shape
# crop image
image_2 = image[70:h_1-70, 20:w_1-20]
# get image_2 height, width
(h_2, w_2) = image_2.shape
# draw white rectangle as a frame around the number -> remove noise
cv2.rectangle(image_2, (0, 0), (w_2, h_2), (255, 255, 255), 40)
# resize image, that capital letters are ~ 30 px in height
image_2 = cv2.resize(image_2, (int(w_2*50/h_2), 50))
# image binarization
ret, image_2 = cv2.threshold(image_2, 150, 255, cv2.THRESH_BINARY)
# save image to file
cv2.imwrite('processed_' + image_file, image_2)
# tesseract part can be commented out
import pytesseract
config_7 = ("-c tessedit_char_whitelist=0123456789AB --oem 1 --psm 7")
text = pytesseract.image_to_string(image_2, config=config_7)
print("OCR TEXT: " + "{}\n".format(text))
The problem is that the text located in the arrow is never centered. Sometimes I remove part of the text with the method described above (e.g. in image 50A).
Is there a method in image processing to get rid of the arrow in a more elegant way? For instance using contour detection and deletion? I am more interested in the OpenCV part than the tesseract part to recognize the text.
Any help is appreciated.
If you look at the pictures, you will see that there is a white arrow in the image, which is also the biggest contour (especially if you draw a black border on the image). If you make a blank mask, draw the arrow (the biggest contour) on it, and then erode it a little bit, you can perform a per-element bitwise conjunction of the actual image and the eroded mask. If that is not clear, look at the code and comments below and you will see that it is actually pretty simple.
# imports
import cv2
import numpy as np
img = cv2.imread("number.png") # read image
# you can resize the image here if you like - it should still work for both sizes
h, w = img.shape[:2] # get the actual images height and width
img = cv2.resize(img, (int(w*320/h), 320))
h, w = img.shape[:2]
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) # transform to grayscale
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)[1] # perform Otsu threshold
cv2.rectangle(thresh, (0, 0), (w, h), (0, 0, 0), 2)
contours = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[0] # search for contours
max_cnt = max(contours, key=cv2.contourArea) # select biggest one
mask = np.zeros((h, w), dtype=np.uint8) # create a black mask
cv2.drawContours(mask, [max_cnt], -1, (255, 255, 255), -1) # draw biggest contour on the mask
kernel = np.ones((15, 15), dtype=np.uint8) # make a kernel with appropriate values - in both cases (resized and original) 15 is ok
erosion = cv2.erode(mask, kernel, iterations=1) # erode the mask with given kernel
reverse = cv2.bitwise_not(img.copy()) # invert the actual image: 0 becomes 255 and 255 becomes 0
img = cv2.bitwise_and(reverse, reverse, mask=erosion) # per-element bit-wise conjunction of the actual image and eroded mask (erosion)
img = cv2.bitwise_not(img) # invert the image back again
# save image to file and display
cv2.imwrite("res.png", img)
cv2.imshow("img", img)
cv2.waitKey(0)
cv2.destroyAllWindows()
Result:
You can try this simple Python script:
import cv2
import numpy as np
img = cv2.imread('mmubS.png', cv2.IMREAD_GRAYSCALE)
thresh = cv2.threshold(img, 200, 255, cv2.THRESH_BINARY_INV )[1]
im_flood_fill = thresh.copy()
h, w = thresh.shape[:2]
im_flood_fill=cv2.rectangle(im_flood_fill, (0,0), (w-1,h-1), 255, 2)
mask = np.zeros((h + 2, w + 2), np.uint8)
cv2.floodFill(im_flood_fill, mask, (0, 0), 0)
im_flood_fill = cv2.bitwise_not(im_flood_fill)
cv2.imshow('clear text', im_flood_fill)
cv2.imwrite('text.png', im_flood_fill)
Result:
I have an OpenCV project in which I display some text on the frame using cv2.putText(). Currently it looks like below:
As you can see in the top left corner, the text is present but it's not clearly visible. Is it possible to make the background black so that the text appears clearly? Something like the image below:
Even if the black background extends to the right side of the frame, that is also fine. Below is the code I am using for putting text on the frame:
cv2.putText(frame, "Data: N/A", (5, 30), cv2.FONT_HERSHEY_COMPLEX_SMALL, 1, (0, 0, 255), 1)
cv2.putText(frame, "Room: C1", (5, 60), cv2.FONT_HERSHEY_COMPLEX_SMALL, 1, (0, 0, 255), 1)
Is there any prebuilt method/library available in OpenCV which can do this? Can anyone please suggest a good way?
Use this function:
import cv2
def draw_text(img, text,
              font=cv2.FONT_HERSHEY_PLAIN,
              pos=(0, 0),
              font_scale=3,
              font_thickness=2,
              text_color=(0, 255, 0),
              text_color_bg=(0, 0, 0)
              ):

    x, y = pos
    text_size, _ = cv2.getTextSize(text, font, font_scale, font_thickness)
    text_w, text_h = text_size
    cv2.rectangle(img, pos, (x + text_w, y + text_h), text_color_bg, -1)
    cv2.putText(img, text, (x, y + text_h + font_scale - 1), font, font_scale, text_color, font_thickness)

    return text_size
Then you can invoke the function like this:
import numpy as np

image = 127 * np.ones((100, 200, 3), dtype="uint8")
pos = (10, 10)
w, h = draw_text(image, "hello", pos=(10, 10))
draw_text(image, "world", font_scale=4, pos=(10, 20 + h), text_color_bg=(255, 0, 0))
cv2.imshow("image", image)
cv2.waitKey()
Note that by default it paints a black background, but you can use a different color if you want.
There's no prebuilt method, but a simple approach is to use cv2.rectangle + cv2.putText. All you need to do is draw the black rectangle on the image and then place the text. You can adjust the x, y, w, h parameters depending on how large/small you want the rectangle. Here's an example:
Input image:
Result:
import cv2
import numpy as np
# Load image, define rectangle bounds
image = cv2.imread('1.jpg')
x,y,w,h = 0,0,175,75
# Draw black background rectangle
cv2.rectangle(image, (x, y), (x + w, y + h), (0,0,0), -1)
# Add text
cv2.putText(image, "THICC flower", (x + int(w/10),y + int(h/2)), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (255,255,255), 2)
# Display
cv2.imshow('image', image)
cv2.waitKey()
Here is one way to do that in Python OpenCV.
Read the input
Create an image of your desired background color that is the same size as the input
Draw your text on the background image
Get the bounding rectangle for the text region
Copy the text region from the background color image to a copy of the input image
Save the results
Input:
import cv2
import numpy as np
# load image
img = cv2.imread("zelda1.jpg")
# create same size image of background color
bg_color = (0,0,0)
bg = np.full((img.shape), bg_color, dtype=np.uint8)
# draw text on bg
text_color = (0,0,255)
cv2.putText(bg, "Data: N/A", (5,30), cv2.FONT_HERSHEY_COMPLEX_SMALL, 0.75, text_color, 1)
# get bounding box
# use channel corresponding to color so that text is white on black background
x,y,w,h = cv2.boundingRect(bg[:,:,2])
print(x,y,w,h)
# copy bounding box region from bg to img
result = img.copy()
result[y:y+h, x:x+w] = bg[y:y+h, x:x+w]
# write result to disk
cv2.imwrite("zelda1_background_text.jpg", bg)
cv2.imwrite("zelda1_text.jpg", result)
# display results
cv2.imshow("TEXT", bg)
cv2.imshow("RESULT", result)
cv2.waitKey(0)
cv2.destroyAllWindows()
Text on background color image:
Text on input image:
P.S. You can adjust the bounding rectangle (x,y,w,h) values to add some padding if you want when you do the crop.
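For instance, a padded crop might look like this (a sketch with a hypothetical pad of 5 px, clamped to the image bounds):
pad = 5
x0, y0 = max(x - pad, 0), max(y - pad, 0)
x1, y1 = min(x + w + pad, img.shape[1]), min(y + h + pad, img.shape[0])
result[y0:y1, x0:x1] = bg[y0:y1, x0:x1]  # copy the padded text region instead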
import cv2
import numpy as np

# Load image, define rectangle bounds
image = cv2.imread(r'C:\Users\Bharath\Downloads\test.jpg')

# overlay space
x, y, w, h = 40, 30, 300, 60

# alpha, the blending factor of the overlay
alpha = 0.3

overlay = image.copy()
output = image.copy()

# draw the rectangle
cv2.rectangle(overlay, (x, y), (x + w, y + h), (0, 0, 0), -1)

# putText
cv2.putText(overlay, "HELLO WORLD..!", (x + int(w/10), y + int(h/1.5)), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (255, 255, 255), 2)

# apply the overlay
cv2.addWeighted(overlay, alpha, output, 1 - alpha, 0, output)

# Display
cv2.imshow("Output", output)
cv2.waitKey(0)
input
output
I have many 708x708 pictures which I need to resize to 500x250 px, keeping the aspect ratio the same. I imagined this can be done by resizing the actual image to 250x250 via Image.thumbnail, and then adding two white borders to fill the remainder of the space. However, I don't know how to do the latter. The following code gives me the thumbnail image of 250x250 px.
from PIL import Image

image = Image.open('image.jpg')
image.thumbnail((500, 250))
image.save('image_thumbnail.jpg')
print(image.size)
Question is similar to this one.
Any suggestions would be much appreciated!
Check the resize method in the skimage package (skimage.transform.resize). There's a parameter called mode, where you can control the desired behaviour.
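As a rough sketch of how that could be combined with padding to reach the 500x250 target (this assumes scikit-image is installed, a 3-channel 708x708 input, and a hypothetical file name):
import numpy as np
from skimage import io, transform, img_as_ubyte

img = io.imread('input.jpg')  # hypothetical 708x708 RGB image
small = img_as_ubyte(transform.resize(img, (250, 250), anti_aliasing=True))  # shrink, keeping the square ratio
# pad 125 px of white on the left and right to reach 500x250
padded = np.pad(small, ((0, 0), (125, 125), (0, 0)), mode='constant', constant_values=255)
io.imsave('padded.png', padded)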
Try the following:
import cv2
import numpy as np
img = cv2.imread('myimage.jpg', cv2.IMREAD_UNCHANGED)
print('Original Dimensions : ',img.shape)
width = int(img.shape[1] * 35.31 / 100) # 250/708 is 35%
height = int(img.shape[0] * 35.31 / 100)
dim = (width, height)
resized_image = cv2.resize(img, dim, interpolation = cv2.INTER_AREA)
print('Resized_image Dimensions : ',resized_image.shape)
row, col = resized_image.shape[:2]
bottom = resized_image[row-2:row, 0:col]
bordersize = 125
border = cv2.copyMakeBorder(
    resized_image,
    top=bordersize,
    bottom=bordersize,
    left=0,
    right=0,
    borderType=cv2.BORDER_CONSTANT,
    value=[255, 255, 255]
)
cv2.imshow('image', resized_image)
cv2.imshow('left', bottom)
cv2.imshow('right', border)
cv2.waitKey(0)
cv2.destroyAllWindows()
I tried the following: first, I make a thumbnail of size (250, 250) and then alter the image with ImageOps.expand to add two white borders, making the dimensions (500, 250).
from PIL import Image, ImageOps
img = Image.open('801595.jpg')
img.thumbnail((500, 250))
print(img.size)
img_with_border = ImageOps.expand(img, border = (125, 0) ,fill='white')
img_with_border.save('imaged-with-border2.jpg')
I want to use paste from the Python PIL library to paste an image onto a black background.
I know I can use the image itself as an alpha mask, but I only want to keep the parts of the image where the alpha value is 255.
How is this possible?
Here is my code so far:
import PIL
from PIL import Image
img = Image.open('in.png')
background = Image.new('RGBA', (825, 1125), (0, 0, 0, 255))
offset = (50, 50)
background.paste(img, offset, img) #image as alpha mask as third param
background.save('out.png')
I can't find anything about this in the official documentation, which unfortunately isn't great.
If I understand your question correctly, this is a possible solution. It generates a dedicated mask, which is used for the paste:
from PIL import Image
img = Image.open('in.png')
# Extract alpha band from img
mask = img.split()[-1]
width, height = mask.size
# Iterate through alpha pixels,
# perform desired conversion
pixels = mask.load()
for x in range(0, width):
    for y in range(0, height):
        if pixels[x, y] < 255:
            pixels[x, y] = 0
# Paste image with converted alpha mask
background = Image.new('RGBA', (825, 1125), (0, 0, 0, 255))
background.paste(img, (50, 50), mask)
background.save('out.png')
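As a side note, the per-pixel loop above could presumably be replaced by a loop-free variant using Image.point (a quick sketch of the same idea):
mask = img.split()[-1].point(lambda p: 255 if p == 255 else 0)  # keep only fully opaque pixels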
As a note, the alpha channel of the background image is fairly useless.
If you don't need it later on, you could also load the background with:
background = Image.new('RGB', (825, 1125), (0, 0, 0))
I'm trying to rotate an image in Python using PIL with the expand argument set to true. It seems that when the background of my image is black, the resulting image saved as a BMP will be a lot smaller than if I have a white background for my image and then replace the black introduced by expand with white. In either case, my original image always has only two colors, and right now I need the file size to be small, since I'm putting these images on an embedded device.
Any ideas if I can force rotate to fill in another color when expanding, or if there is another way to rotate my picture in order to keep it small?
If your original image has no alpha layer, you can use an alpha layer as a mask to convert the background to white. When rotate creates the "background", it makes it fully transparent.
from PIL import Image

# original image
img = Image.open('test.png')
# converted to have an alpha layer
im2 = img.convert('RGBA')
# rotated image
rot = im2.rotate(22.2, expand=1)
# a white image same size as rotated image
fff = Image.new('RGBA', rot.size, (255,)*4)
# create a composite image using the alpha layer of rot as a mask
out = Image.composite(rot, fff, rot)
# save your work (converting back to mode='1' or whatever..)
out.convert(img.mode).save('test2.bmp')
There is a fillcolor parameter in the rotate method to specify the color that will be used for the expanded area:
white = (255,255,255)
pil_image.rotate(angle, PIL.Image.NEAREST, expand = 1, fillcolor = white)
https://pillow.readthedocs.io/en/stable/reference/Image.html#PIL.Image.Image.rotate
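A self-contained sketch of that call might look like this (reusing the hypothetical 'test.png' from the previous answer and an arbitrary angle):
from PIL import Image

img = Image.open('test.png').convert('RGB')
rotated = img.rotate(22.2, Image.NEAREST, expand=1, fillcolor=(255, 255, 255))  # white fill in the expanded corners
rotated.save('test_rotated.bmp')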
Here is a working version, inspired by the answer above, but it works without opening or saving images and shows how to rotate text.
The two images have a colored background and a non-zero alpha channel to show what's going on. Changing the two alpha values from 92 to 0 will make them completely transparent.
from PIL import Image, ImageFont, ImageDraw
text = 'TEST'
font = ImageFont.truetype(r'C:\Windows\Fonts\Arial.ttf', 50)
width, height = font.getsize(text)
image1 = Image.new('RGBA', (200, 150), (0, 128, 0, 92))
draw1 = ImageDraw.Draw(image1)
draw1.text((0, 0), text=text, font=font, fill=(255, 128, 0))
image2 = Image.new('RGBA', (width, height), (0, 0, 128, 92))
draw2 = ImageDraw.Draw(image2)
draw2.text((0, 0), text=text, font=font, fill=(0, 255, 128))
image2 = image2.rotate(30, expand=1)
px, py = 10, 10
sx, sy = image2.size
image1.paste(image2, (px, py, px + sx, py + sy), image2)
image1.show()