I would like to draw an arbitrary ellipse on an OpenCV image in Python and then return two arrays: (1) the pixel coordinates of all pixels bounded by the ellipse, both on the ellipse line and inside the ellipse, and (2) the pixel values of each of the pixels from array (1).
I looked at this answer, but it only considers the points on the ellipse contour and not the region inside.
With this code you can get every pixel inside an ellipse:
from math import sin, cos, radians
def get_y_ellipse(ellipse_size, x, alpha):
    # x is measured from the ellipse center; alpha is the rotation in radians
    a, b = ellipse_size[0] / 2, ellipse_size[1] / 2
    # Square root of the discriminant of the ellipse equation solved for y (complex when x lies outside the ellipse)
    delta_sqrt = ((b**4-2*a**2*b**2+a**4)*sin(2*alpha)**2*x**2-4*a**4*x**2*cos(alpha)**2*sin(alpha)**2-4*a**2*b**2*x**2*cos(alpha)**4+4*a**4*b**2*cos(alpha)**2-4*a**2*b**2*x**2*sin(alpha)**4-4*b**4*x**2*sin(alpha)**2*cos(alpha)**2+4*a**2*b**4*sin(alpha)**2)**(1/2)
    y1 = ((-b**2 + a**2)*sin(2*alpha)*x + delta_sqrt) / (2*a**2*cos(alpha)**2+2*b**2*sin(alpha)**2)
    y2 = ((-b**2 + a**2)*sin(2*alpha)*x - delta_sqrt) / (2*a**2*cos(alpha)**2+2*b**2*sin(alpha)**2)
    return y1, y2
ellipse_size = (100, 50)
ellipse_rotation = radians(45) # 45 deg, converted because sin and cos expect radians
ellipse_center_position = (0,0)
pixels = []
for x in range(ellipse_center_position[0] - ellipse_size[0], ellipse_center_position[0] + ellipse_size[0]):
    y1, y2 = get_y_ellipse(ellipse_size, x - ellipse_center_position[0], ellipse_rotation)
    if complex not in map(type, (y1, y2)):
        for y in range(int(y1), int(y2), -1):
            pixels.append([x, y + ellipse_center_position[1]])
# 'pixels' is a list that contains every pixel inside the ellipse in [x, y] format
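The question also asks for the pixel values, which the loop above does not collect. Here is a minimal sketch of one way to get both arrays at once, assuming the image can be indexed as image[y][x] (a nested list, or a NumPy array such as the one cv2.imread returns) and that the center is given in integer pixel coordinates. The name ellipse_pixels and its parameters are made up for illustration; it simply tests every pixel near the center against the implicit ellipse inequality.

```python
from math import sin, cos, radians, ceil

def ellipse_pixels(image, center, axes, angle_deg):
    """Return (coords, values) for every pixel on or inside a rotated ellipse.

    image     -- anything indexable as image[y][x]
    center    -- (cx, cy) of the ellipse, integer pixel coordinates
    axes      -- (a, b) semi-axis lengths in pixels
    angle_deg -- rotation of the ellipse in degrees
    """
    cx, cy = center
    a, b = axes
    c, s = cos(radians(angle_deg)), sin(radians(angle_deg))
    height, width = len(image), len(image[0])
    # The ellipse never reaches further than its longest semi-axis from the center
    reach = ceil(max(a, b))
    coords, values = [], []
    for y in range(max(cy - reach, 0), min(cy + reach + 1, height)):
        for x in range(max(cx - reach, 0), min(cx + reach + 1, width)):
            dx, dy = x - cx, y - cy
            # Coordinates of this pixel in the ellipse's own (unrotated) frame
            u = dx * c + dy * s
            v = -dx * s + dy * c
            if (u / a) ** 2 + (v / b) ** 2 <= 1:
                coords.append((x, y))
                values.append(image[y][x])
    return coords, values
```

With a color image loaded through OpenCV, each entry of values would be one pixel's channel triple rather than a single number.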
Hope this helps.
I'm working on data augmentation and trying to generate a synthetic version of every image in my dataset, so I need to rotate the images together with the bounding boxes in them.
I'm only going to rotate images by 90, 180, or 270 degrees.
I'm using the Pascal VOC annotation format as shown here. As a result I have the following info:
x_min, y_min, x_max, y_max, and the origin of the image (which I can get from the image size).
I've searched a lot, but I couldn't find any solution for rotating bounding boxes (or rectangles).
I've tried something like this.
I got this solution from here and tried to adapt it, but it didn't work:
import math
def rotateRect(bndbox, img_size, angle):
    angle = angle * math.pi/180 # conversion from degree to radian
    y_min, y_max, x_min, x_max = bndbox
    ox, oy = img_size[0]/2, img_size[1]/2 # coordinate of origin of image
    rect = [[x_min, y_min], [x_min, y_max],[x_max, y_min],[x_max, y_max]] # coordinates of points of corners of bounding box rectangle.
    nrp = [[0, 0], [0,0 ],[0,0],[0, 0]] #new rectangle position
    for i, pt in enumerate(rect):
        newPx = int(ox + math.cos(angle) * (pt[0] - ox) - math.sin(angle) * (pt[1] - oy)) # new coordinate of point x
        newPy = int(oy + math.sin(angle) * (pt[0] - ox) + math.cos(angle) * (pt[1] - oy)) # new coordinate of point y
        nrp[i] = newPx,newPy
    nx_min, ny_min, nx_max, ny_max = nrp[0][0], nrp[0][1], nrp[2][0], nrp[2][1] # new bounding boxes values.
    return [ny_min, ny_max, nx_min, nx_max]
thanks.
EDIT:
I need to apply this rotation to both the image and the bounding box.
The first picture is the original, the second is rotated by 90 degrees (counter-clockwise), and the third is rotated by -90 degrees (clockwise).
I rotated them manually in Paint to be precise, and got these results.
Original image size: 640x480
rotation   original   90     -90
--------------------------------
x_min      98         345    17
y_min      345        218    98
x_max      420        462    420
y_max      462        540    134
I've found a simpler way.
Based on this approach, we can do the calculation without any trigonometry, like this:
def rotate90Deg(bndbox, img_width): # just passing width of image is enough for 90 degree rotation.
    x_min,y_min,x_max,y_max = bndbox
    new_xmin = y_min
    new_ymin = img_width-x_max
    new_xmax = y_max
    new_ymax = img_width-x_min
    return [new_xmin, new_ymin,new_xmax,new_ymax]
rotate90Deg([98,345,420,462],640)
This can be applied over and over again, and it returns the new bounding box values in Pascal VOC format.
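The same swap-and-subtract idea extends to the other two angles from the question. The following is only a sketch under stated assumptions: the name rotate_bbox is made up, angles are counter-clockwise, the width and height are those of the image before rotation, and the image itself is assumed to be rotated onto a canvas of the rotated size (so a 640x480 image becomes 480x640 for 90 and 270 degrees).

```python
def rotate_bbox(bndbox, img_width, img_height, angle):
    """Rotate a Pascal VOC box counter-clockwise by 90, 180 or 270 degrees.

    img_width and img_height describe the image BEFORE rotation.
    """
    x_min, y_min, x_max, y_max = bndbox
    if angle == 90:
        return [y_min, img_width - x_max, y_max, img_width - x_min]
    if angle == 180:
        return [img_width - x_max, img_height - y_max,
                img_width - x_min, img_height - y_min]
    if angle == 270:  # the same as 90 degrees clockwise
        return [img_height - y_max, x_min, img_height - y_min, x_max]
    raise ValueError("angle must be 90, 180 or 270")
```

For example, rotate_bbox([98, 345, 420, 462], 640, 480, 90) gives [345, 220, 462, 542], which is close to the values measured by hand in the question.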
OK, maybe this can help. Assuming your rectangle is stored as a set of 4 points marking the corners, this will do an arbitrary rotation around another point. If you store the points in circular order, then the plots will even look like rectangles. I'm not forcing the aspect ratio on the plot, so the rotated rectangle looks like it is skewed, but it's not.
import math
import matplotlib.pyplot as plt
def rotatebox( rect, center, degrees ):
    rads = math.radians(degrees)
    newpts = []
    for pts in rect:
        # Vector from the center of rotation to this corner
        diag_x = pts[0] - center[0]
        diag_y = pts[1] - center[1]
        # Rotate that vector by the requested angle
        newdx = diag_x * math.cos(rads) - diag_y * math.sin(rads)
        newdy = diag_x * math.sin(rads) + diag_y * math.cos(rads)
        newpts.append( (center[0] + newdx, center[1] + newdy) )
    return newpts
# Return a set of X and Y for plotting.
def corners(rect):
    return [k[0] for k in rect]+[rect[0][0]],[k[1] for k in rect]+[rect[0][1]]
rect = [[50,50],[50,120],[150,120],[150,50]]
plt.plot( *corners(rect) )
rect = rotatebox( rect, (100,100), 135 )
plt.plot( *corners(rect) )
plt.show()
The code can be made simpler for the 90/180/270 cases, because no trigonometry is needed. It's just addition, subtraction, and swapping points. Here, the rectangle is just stored as [minx,miny,maxx,maxy].
import matplotlib.pyplot as plt
def rotaterectcw( rect, center ):
    # Corner offsets relative to the center of rotation
    x0 = rect[0] - center[0]
    y0 = rect[1] - center[1]
    x1 = rect[2] - center[0]
    y1 = rect[3] - center[1]
    # (x, y) -> (y, -x); the y values swap places so the result stays [minx,miny,maxx,maxy]
    return center[0]+y0, center[1]-x1, center[0]+y1, center[1]-x0
def corners(rect):
    x0, y0, x1, y1 = rect
    return [x0,x0,x1,x1,x0],[y0,y1,y1,y0,y0]
rect = (50,50,150,120)
plt.plot( *corners(rect) )
rect = rotaterectcw( rect, (60,100) )
plt.plot( *corners(rect) )
rect = rotaterectcw( rect, (60,100) )
plt.plot( *corners(rect) )
rect = rotaterectcw( rect, (60,100) )
plt.plot( *corners(rect) )
plt.show()
I tried the implementations mentioned in the other answers, but none of them worked for me. I had to rotate the image and the bounding box clockwise by 90 degrees, so I wrote this method:
def rotate90Deg( bndbox , image_width ):
    """
    image_width: Width of the image after clockwise rotation of 90 degrees
    """
    x_min,y_min,x_max,y_max = bndbox
    new_xmin = image_width - y_max # Reflection about center X-line
    new_ymin = x_min
    new_xmax = image_width - y_min # Reflection about center X-line
    new_ymax = x_max
    return [new_xmin, new_ymin,new_xmax,new_ymax]
Usage:
from PIL import Image
image = Image.open( "..." )
image = image.rotate( -90, expand=True ) # expand=True so the canvas takes the rotated size
new_bbox = rotate90Deg( bbox , image.width )
I'm adding text to an image at a position (x,y) and then drawing a rectangle around it (x,y,x+text_width,y+text_height). Now I'm rotating the image by an angle of 30 degrees. How can I get the new coordinates?
from PIL import Image, ImageDraw, ImageFont
im = Image.open('img.jpg')
font = ImageFont.truetype('myfont.ttf') # the drawing calls need a font object, not a file name
textlayer = Image.new("RGBA", im.size, (0,0,0,0))
textdraw = ImageDraw.Draw(textlayer)
textsize = textdraw.textsize('Hello World', font=font)
textdraw.text((75,267), 'Hello World', font=font, fill=(255,255,255))
textlayer = textlayer.rotate(30)
I tried this, but I'm not getting the point correctly. Can anyone point out what I'm doing wrong?
textpos = (75,267)
theta = 30
x0,y0 = 0,0
h = textsize[0] - textsize[1]
x,y = textpos[0], textpos[1]
xNew = (x-x0)*cos(theta) - (h-y-y0)*sin(theta) + x0
yNew = -(x-x0)*sin(theta) - (h-y-y0)*cos(theta) + (h-y0)
In PIL, the rotation happens about the center of the image. So considering your center of the image is given by:
cx = int(image_width / 2)
cy = int(image_height / 2)
a specified rotation angle:
theta = 30
and given coordinates (px, py), the new coordinates can be obtained using the following equations:
from math import radians, cos, sin
rad = radians(theta)
new_px = cx + int(float(px-cx) * cos(rad) + float(py-cy) * sin(rad))
new_py = cy + int(-(float(px-cx) * sin(rad)) + float(py-cy) * cos(rad))
Please note that cos and sin expect the angle in radians, not degrees, hence the conversion.
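To get the rotated rectangle rather than a single point, the same equations can be applied to all four corners. This is a rough sketch and not part of the original answer: rotate_point and rotated_box_bounds are hypothetical names, the center is taken as exactly half the image size, and the image is assumed to be rotated counter-clockwise about that center without changing its canvas size.

```python
from math import radians, cos, sin

def rotate_point(point, image_size, theta_deg):
    """Map a pixel through a counter-clockwise rotation about the image center."""
    px, py = point
    cx, cy = image_size[0] / 2, image_size[1] / 2
    rad = radians(theta_deg)
    # Image y grows downwards, which is why the signs differ from the textbook form
    new_px = cx + (px - cx) * cos(rad) + (py - cy) * sin(rad)
    new_py = cy - (px - cx) * sin(rad) + (py - cy) * cos(rad)
    return new_px, new_py

def rotated_box_bounds(box, image_size, theta_deg):
    """Axis-aligned bounds of an (x0, y0, x1, y1) rectangle after rotation."""
    x0, y0, x1, y1 = box
    corners = [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]
    xs, ys = zip(*(rotate_point(c, image_size, theta_deg) for c in corners))
    return min(xs), min(ys), max(xs), max(ys)
```

The results are floats; round them only at the point where pixel indices are actually needed, so that repeated rotations do not accumulate truncation error.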
This answer is inspired by the following blog post.
Context: I am performing Object Localisation and wanting to implement an Inhibition of Return mechanism (i.e. drawing a black cross on the image where the red bounding box is after a trigger action.)
Problem: I do not know how to accurately scale the bounding box (red) in relation to the original input (init_input). If this scaling is understood, then the black cross should be accurately placed in the middle of the red bounding box.
My current code for this function is as follows:
def IoR(b, init_input, prev_coord):
    """
    Inhibition-of-Return mechanism.
    Marks the region of the image covered by
    the bounding box with a black cross.
    :param b:
        The current bounding box represented as [x1, y1, x2, y2].
    :param init_input:
        The initial input volume of the current episode.
    :param prev_coord:
        The previous state's bounding box coordinates (x1, y1, x2, y2)
    """
    x1, y1, x2, y2 = prev_coord
    width = 12
    x_mid = (b[2] + b[0]) // 2
    y_mid = (b[3] + b[1]) // 2
    # Define vertical rectangle coordinates
    ver_x1 = int(((x_mid) * IMG_SIZE / (x2 - x1)) - width)
    ver_x2 = int(((x_mid) * IMG_SIZE / (x2 - x1)) + width)
    ver_y1 = int((b[1]) * IMG_SIZE / (y2 - y1))
    ver_y2 = int((b[3]) * IMG_SIZE / (y2 - y1))
    # Define horizontal rectangle coordinates
    hor_x1 = int((b[0]) * IMG_SIZE / (x2 - x1))
    hor_x2 = int((b[2]) * IMG_SIZE / (x2 - x1))
    hor_y1 = int(((y_mid) * IMG_SIZE / (y2 - y1)) - width)
    hor_y2 = int(((y_mid) * IMG_SIZE / (y2 - y1)) + width)
    # Draw vertical rectangle
    cv2.rectangle(init_input, (ver_x1, ver_y1), (ver_x2, ver_y2), (0, 0, 0), -1)
    # Draw horizontal rectangle
    cv2.rectangle(init_input, (hor_x1, hor_y1), (hor_x2, hor_y2), (0, 0, 0), -1)
The desired effect can be seen below:
Note: I believe the complexity in this problem arises from the image being resized (to 224, 224, 3) each time I take an action (and consequently move on to the next state). Therefore, the "anchor" to determine the scaling must be extracted from the previous state's scaling, which is shown in the following code:
def next_state(init_input, b_prime, g):
    """
    Returns the observable region of the next state.
    Formats the next state's observable region, defined
    by b_prime, to be of dimension (224, 224, 3). Adding 16
    additional pixels of context around the original bounding box.
    The ground truth box must be reformatted according to the
    new observable region.
    IMG_SIZE = 224
    :param init_input:
        The initial input volume of the current episode.
    :param b_prime:
        The subsequent state's bounding box.
    :param g: (init_g)
        The initial ground truth box of the target object.
    """
    # Determine the pixel coordinates of the observable region for the following state
    context_pixels = 16
    x1 = max(b_prime[0] - context_pixels, 0)
    y1 = max(b_prime[1] - context_pixels, 0)
    x2 = min(b_prime[2] + context_pixels, IMG_SIZE)
    y2 = min(b_prime[3] + context_pixels, IMG_SIZE)
    # Determine observable region
    observable_region = cv2.resize(init_input[y1:y2, x1:x2], (224, 224), interpolation=cv2.INTER_AREA)
    # Resize ground truth box
    g[0] = int((g[0] - x1) * IMG_SIZE / (x2 - x1)) # x1
    g[1] = int((g[1] - y1) * IMG_SIZE / (y2 - y1)) # y1
    g[2] = int((g[2] - x1) * IMG_SIZE / (x2 - x1)) # x2
    g[3] = int((g[3] - y1) * IMG_SIZE / (y2 - y1)) # y2
    return observable_region, g, (b_prime[0], b_prime[1], b_prime[2], b_prime[3])
Explanation:
There is a state t in which the agent is predicting the location of the target object. The target object has a ground truth box (yellow in image, dotted in sketch), and the agent's current "localising box" is the red bounding box. Say, at state t the agent decides it is best to move right. Consequently, the bounding box is moved to the right, and then the next state, t' is determined by adding an additional 16 pixels of context around the red bounding box, cropping the original image with respect to this boundary, and then upscaling the cropped image back to 224, 224 in dimensions.
Say the agent is now confident that its prediction is accurate, so it chooses the trigger action. This basically means, end the current target object's localisation episode and place a black cross on where the agent predicted the object was (i.e. in the middle of the red bounding box). Now, since the current state is zoomed in after being cropped following the previous action, the bounding box must be re-scaled with respect to the normal/original/initial image and then the black cross can be drawn accurately onto the image.
In the context of my problem, the first rescaling between states is working perfectly well (the second code in this post). However, scaling back to normal and drawing the black cross is what I cannot seem to get my head around.
Here is an image which hopefully helps the explanation:
Here is the output of my current solution:
I think it's better to keep the coordinates in a global frame instead of chaining a bunch of upscales and downscales. They give me a headache, and precision may be lost to rounding.
That is, every time you detect something, you convert it to global (original image) coordinates first. I have written a small demo here, imitating your detection and trigger behavior.
Initial detection:
Zoomed in, another detection:
Zoomed in, another detection:
Zoomed in, another detection:
Zoomed back to original scale, with the detection box in the correct location
Code:
import cv2
import matplotlib.pyplot as plt
IMG_SIZE = 224
im = cv2.cvtColor(cv2.imread('lena.jpg'), cv2.COLOR_BGR2GRAY)
im = cv2.resize(im, (IMG_SIZE, IMG_SIZE))
# Your detector results
detected_region = [
    [(10, 20) , (80, 100)],
    [(50, 0) , (220, 190)],
    [(100, 143) , (180, 200)],
    [(110, 45) , (180, 150)]
]
# Global states
x_scale = 1.0
y_scale = 1.0
x_shift = 0
y_shift = 0
x1, y1 = 0, 0
x2, y2 = IMG_SIZE-1, IMG_SIZE-1
for region in detected_region:
    # Detection
    x_scale = IMG_SIZE / (x2-x1)
    y_scale = IMG_SIZE / (y2-y1)
    x_shift = x1
    y_shift = y1
    cur_im = cv2.resize(im[y1:y2, x1:x2], (IMG_SIZE, IMG_SIZE))
    # Assuming the detector return these results
    cv2.rectangle(cur_im, region[0], region[1], (255))
    plt.imshow(cur_im)
    plt.show()
    # Zooming in, using part of your code
    context_pixels = 16
    x1 = max(region[0][0] - context_pixels, 0) / x_scale + x_shift
    y1 = max(region[0][1] - context_pixels, 0) / y_scale + y_shift
    x2 = min(region[1][0] + context_pixels, IMG_SIZE) / x_scale + x_shift
    y2 = min(region[1][1] + context_pixels, IMG_SIZE) / y_scale + y_shift
    x1, y1, x2, y2 = int(x1), int(y1), int(x2), int(y2)
# Assuming the detector confirm its choice here
print('Confirmed detection: ', x1, y1, x2, y2)
# This time no padding
x1 = detected_region[-1][0][0] / x_scale + x_shift
y1 = detected_region[-1][0][1] / y_scale + y_shift
x2 = detected_region[-1][1][0] / x_scale + x_shift
y2 = detected_region[-1][1][1] / y_scale + y_shift
x1, y1, x2, y2 = int(x1), int(y1), int(x2), int(y2)
cv2.rectangle(im, (x1, y1), (x2, y2), (255, 0, 0))
plt.imshow(im)
plt.show()
This also prevents resizing on a resized image which might create more artifacts and worsen the detector's performance.
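The conversion back to global coordinates can also be pulled out into a small helper, which may be easier to drop into the question's IoR function. This is only a sketch: to_global is a hypothetical name, and it assumes you keep track of which region of the original image (crop) was resized to produce the current 224x224 view.

```python
def to_global(box, crop, img_size=224):
    """Map a box from a zoomed-in view back to original-image coordinates.

    box  -- (x1, y1, x2, y2) measured in the img_size x img_size zoomed view
    crop -- (x1, y1, x2, y2) of the region of the original image that was
            resized to produce that view
    """
    cx1, cy1, cx2, cy2 = crop
    x_scale = img_size / (cx2 - cx1)  # zoom factor along x
    y_scale = img_size / (cy2 - cy1)  # zoom factor along y
    x1, y1, x2, y2 = box
    # Undo the zoom first, then undo the crop offset
    return (int(x1 / x_scale + cx1), int(y1 / y_scale + cy1),
            int(x2 / x_scale + cx1), int(y2 / y_scale + cy1))
```

The center of the black cross is then just the midpoint of the returned box, already expressed in the original image's pixels.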
Imagine a point (x, y) in a 500x500 image. Let it be (100, 200).
After scaling it to a different size, say 250x250 - the correct way to scale it would be to just look at the current co-ordinate and do new_coord = old_coord * NEW_SIZE/OLD_SIZE.
Thus, (100,200) will be transformed to (50,100)
If you replace your scaling using x2-x1 and use a simpler rescaling formula, it should fix your problem.
Update: NEW_SIZE and OLD_SIZE may be different for the two co-ordinates based on the shape of the original image and final image, if they are rectangular and not square.
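As a small illustration of that formula, including the per-axis case from the update, here is a sketch in which scale_point is a made-up name and sizes are given as (width, height):

```python
def scale_point(point, old_size, new_size):
    """Rescale (x, y) from an image of old_size to one of new_size.

    Sizes are (width, height) pairs, so each axis gets its own factor.
    """
    x, y = point
    return (x * new_size[0] / old_size[0], y * new_size[1] / old_size[1])
```

Calling scale_point((100, 200), (500, 500), (250, 250)) reproduces the (50, 100) example above.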
I have a bunch of images (say 10) that I have generated, both as arrays and as PIL objects.
I need to arrange them in a circular fashion to display them, and the layout should adjust itself to the resolution of the screen. Is there anything in Python that can do this?
I have tried using paste, but figuring out the canvas resolution and the positions to paste at is painful. Is there an easier solution?
We can say that points are arranged evenly in a circle when there is a constant angle theta between neighboring points. theta can be calculated as 2*pi radians divided by the number of points. The first point is at angle 0 with respect to the x axis, the second point at angle theta*1, the third point at angle theta*2, etc.
Using simple trigonometry, you can find the X and Y coordinates of any point that lies on the edge of a circle. For a point at angle ohm lying on a circle with radius r:
xFromCenter = r*cos(ohm)
yFromCenter = r*sin(ohm)
Using this math, it is possible to arrange your images evenly on a circle:
import math
from PIL import Image
def arrangeImagesInCircle(masterImage, imagesToArrange):
    imgWidth, imgHeight = masterImage.size
    #we want the circle to be as large as possible.
    #but the circle shouldn't extend all the way to the edge of the image.
    #If we do that, then when we paste images onto the circle, those images will partially fall over the edge.
    #so we reduce the diameter of the circle by the width/height of the widest/tallest image.
    diameter = min(
        imgWidth - max(img.size[0] for img in imagesToArrange),
        imgHeight - max(img.size[1] for img in imagesToArrange)
    )
    radius = diameter / 2
    circleCenterX = imgWidth / 2
    circleCenterY = imgHeight / 2
    theta = 2*math.pi / len(imagesToArrange)
    for i, curImg in enumerate(imagesToArrange):
        angle = i * theta
        dx = int(radius * math.cos(angle))
        dy = int(radius * math.sin(angle))
        #dx and dy give the coordinates of where the center of our images would go.
        #so we must subtract half the height/width of the image to find where their top-left corners should be.
        #paste needs integer pixel positions, so the result is truncated with int().
        pos = (
            int(circleCenterX + dx - curImg.size[0]/2),
            int(circleCenterY + dy - curImg.size[1]/2)
        )
        masterImage.paste(curImg, pos)
img = Image.new("RGB", (500,500), (255,255,255))
#red.png, blue.png, green.png are simple 50x50 pngs of solid color
imageFilenames = ["red.png", "blue.png", "green.png"] * 5
images = [Image.open(filename) for filename in imageFilenames]
arrangeImagesInCircle(img, images)
img.save("output.png")
Result: