I'm trying to use the EAST model in OpenCV to detect text in images. I'm successfully getting output after running an image through the network, but I'm having a hard time understanding how the decode function I use works. I know that I get 5 numbers as output from the model, and I think they are the distances from a point to the top, bottom, left and right sides of the rectangle, respectively, with the angle of rotation at the end. I'm not sure what the decode function does to get the bounding box for the text region.
I know why the offset is multiplied by 4 (it's shrunk by 4 when run through the model). I know why h and w are what they are. I'm not sure about anything after that.
scores are the confidence scores for each region;
geometry are the geometry values for each region (the 5 numbers I mentioned)
scoreThresh is the confidence threshold; regions scoring below it are skipped before non-maximum suppression
import math

def decode(scores, geometry, scoreThresh):
    detections = []
    confidences = []
    ############ CHECK DIMENSIONS AND SHAPES OF geometry AND scores ############
    assert len(scores.shape) == 4, "Incorrect dimensions of scores"
    assert len(geometry.shape) == 4, "Incorrect dimensions of geometry"
    assert scores.shape[0] == 1, "Invalid dimensions of scores"
    assert geometry.shape[0] == 1, "Invalid dimensions of geometry"
    assert scores.shape[1] == 1, "Invalid dimensions of scores"
    assert geometry.shape[1] == 5, "Invalid dimensions of geometry"
    assert scores.shape[2] == geometry.shape[2], "Invalid dimensions of scores and geometry"
    assert scores.shape[3] == geometry.shape[3], "Invalid dimensions of scores and geometry"
    height = scores.shape[2]
    width = scores.shape[3]
    for y in range(0, height):
        # Extract data from scores
        scoresData = scores[0][0][y]
        x0_data = geometry[0][0][y]
        x1_data = geometry[0][1][y]
        x2_data = geometry[0][2][y]
        x3_data = geometry[0][3][y]
        anglesData = geometry[0][4][y]
        for x in range(0, width):
            score = scoresData[x]
            # If score is lower than threshold score, move to next x
            if(score < scoreThresh):
                continue
            # Calculate offset
            offsetX = x * 4.0
            offsetY = y * 4.0
            angle = anglesData[x]
            # Calculate cos and sin of angle
            cosA = math.cos(angle)
            sinA = math.sin(angle)
            h = x0_data[x] + x2_data[x]
            w = x1_data[x] + x3_data[x]
            # Calculate offset
            offset = ([offsetX + cosA * x1_data[x] + sinA * x2_data[x], offsetY - sinA * x1_data[x] + cosA * x2_data[x]])
            # Find points for rectangle
            p1 = (-sinA * h + offset[0], -cosA * h + offset[1])
            p3 = (-cosA * w + offset[0], sinA * w + offset[1])
            center = (0.5*(p1[0]+p3[0]), 0.5*(p1[1]+p3[1]))
            detections.append((center, (w,h), -1*angle * 180.0 / math.pi))
            confidences.append(float(score))
    # Return detections and confidences
    return [detections, confidences]
The paper contains a diagram of the output format. Instead of specifying the box in the usual way, it is specified as a set of distances (up, right, down, and left) from an offset (x, y), plus an angle A, the amount the box has been rotated counterclockwise.
Note that scores and geometry are indexed by y, x, the opposite of the order used in the logic below the offset calculation. Therefore, to get the geometry components at the highest-scoring y, x:
high_scores_yx = np.where(scores[0][0] >= np.max(scores[0][0]))
y, x = high_scores_yx[0][0], high_scores_yx[1][0]
h_upper, w_right, h_lower, w_left, A = geometry[0,:,y,x]
The code uses offset to store the offset of the lower-right corner of the rectangle. Since it's the lower-right, it only needs w_right and h_lower, which in the code, are x1_data and x2_data, respectively.
The location of the lower-right corner, with respect to the original offset offsetX, offsetY, depends on the angle of rotation. Below, the dotted lines show the axes orientation. The components to get from the original to the lower-right offset are labelled in violet (horizontal) and purple (vertical). Note that the sin(A) * w_right component is subtracted because y gets bigger as you go lower in this coordinate system.
So that explains
offset = ([offsetX + cosA * x1_data[x] + sinA * x2_data[x], offsetY - sinA * x1_data[x] + cosA * x2_data[x]])
Next: p1 and p3 are the upper-right and lower-left corners of the rectangle, respectively, with rotation taken into account (at angle 0, p1 sits h above the lower-right corner and p3 sits w to its left). center is just the average of these two points.
Finally, -1*angle * 180.0 / math.pi converts the original counterclockwise, radians angle into a clockwise-based, degrees angle (so final output angle should be negative for objects rotated counterclockwise). This is for compatibility with the CV2 boxPoints method, used in:
https://github.com/opencv/opencv/blob/7fb70e170154d064ef12d8fec61c0ae70812ce3d/samples/dnn/text_detection.py
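To make the geometry above concrete, here is a minimal sketch that decodes a single output cell using the same arithmetic as the question's decode function. The function name decode_cell and its parameter names are my own and are not part of OpenCV; the stride of 4 is taken from the question.

```python
import math

def decode_cell(x, y, d_top, d_right, d_bottom, d_left, angle, stride=4.0):
    """Decode one EAST output cell into (center, (w, h), angle_in_degrees).

    Hypothetical helper mirroring the question's decode(); the names are
    illustrative only.
    """
    # Position of the cell in input-image pixels (the score map is 4x smaller).
    offset_x, offset_y = x * stride, y * stride
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    h = d_top + d_bottom
    w = d_right + d_left
    # Lower-right corner of the rotated box (image y grows downward).
    corner_x = offset_x + cos_a * d_right + sin_a * d_bottom
    corner_y = offset_y - sin_a * d_right + cos_a * d_bottom
    # Upper-right and lower-left corners, reached from the lower-right one.
    p1 = (-sin_a * h + corner_x, -cos_a * h + corner_y)
    p3 = (-cos_a * w + corner_x, sin_a * w + corner_y)
    # The center is the midpoint of that diagonal.
    center = (0.5 * (p1[0] + p3[0]), 0.5 * (p1[1] + p3[1]))
    # Counterclockwise radians -> clockwise degrees.
    return center, (w, h), -math.degrees(angle)

# Unrotated example: cell (10, 5), distances top=2, right=3, bottom=4, left=1.
center, size, angle_deg = decode_cell(10, 5, 2.0, 3.0, 4.0, 1.0, 0.0)
print(center, size, angle_deg)
```

With no rotation the box spans x from 39 to 43 and y from 18 to 24, so the center lands at (41, 21), which is an easy case to check by hand.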
I need to draw slanted lines like this programmatically using opencv-python, and it has to be similar in terms of the slant angle and the distance between the lines:
With OpenCV's cv2.line() I need to supply the line's start and end points.
Following this StackOverflow accepted answer, I think I will be able to find those two points, but first I need to calculate the line equation itself.
So first I measured the slant angle of the lines with the measure tool in Adobe Illustrator (the graphic designer supplied the image as an .ai file) and got 67 degrees, from which I worked out the gradient of the line. The problem is that I don't know how to get the horizontal spacing between the lines, which I need in order to supply start.X. I tried measuring the distance between the lines in Illustrator, but how do I map that to OpenCV coordinates?
Overall is my idea feasible? Or is there a better way to achieve this?
Update 1:
I managed to draw this experimental image:
And this is code:
import math
import cv2
import numpy as np

def show_image_scaled(window_name,image,height,width):
    cv2.namedWindow(window_name,cv2.WINDOW_NORMAL)
    cv2.resizeWindow(window_name,width,height)
    cv2.imshow(window_name,image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

def slanted_lines_background():
    canvas = np.ones((200,300)) * 255
    start_y = 0
    m = 2.35
    end_x = 0
    for x in range(0,canvas.shape[1],10):
        start_x = x
        end_y = start_y + compute_length(m,start_x,start_y,end_x)
        cv2.line(canvas,(start_x,start_y),(end_x,end_y),(0,0,0),2)
    show_image_scaled("Slant",canvas,200,300)

def compute_length(m,start_x,start_y,end_x=0):
    c = start_y - (m * start_x)
    length_square = (end_x - start_x)**2 + ((m *end_x) + c - start_y) ** 2
    length = math.sqrt(length_square)
    return int(length)

I'm still working on filling the left part of the rectangle.
This code "shades" every pixel in a given image to produce your hatched pattern. Don't worry about the math. It's mostly correct. I've checked the edge cases for small and wide lines. The sampling isn't exactly correct but nobody's gonna notice anyway because the imperfection amounts to small fractions of a pixel. And I've used numba to make it fast.
import numpy as np
from numba import njit, prange

@njit(parallel=True)
def hatch(im, angle=45, stride=10, dc=None):
    stride = float(stride)
    if dc is None:
        dc = stride * 0.5
    assert 0 <= dc <= stride
    stride2 = stride / 2
    dc2 = dc / 2
    angle = angle / 180 * np.pi
    c = np.cos(angle)
    s = np.sin(angle)
    (height, width) = im.shape[:2]
    for y in prange(height):
        for x in range(width):
            # distance to origin along normal
            dist_origin = c*x - s*y
            # distance to center of nearest line
            dist_center = stride2 - abs((dist_origin % stride) - stride2)
            # distance to edge of nearest line
            dist_edge = dist_center - dc2
            # shade pixel, with antialiasing
            # use edge-0.5 to edge+0.5 as "gradient" <=> 1-sized pixel straddles edge
            # for thick/thin lines, needs hairline handling
            # thin line -> gradient hits far edge of line / pixel may span both edges of line
            # thick line -> gradient hits edge of adjacent line / pixel may span adjacent line
            if dist_edge > 0.5: # background
                val = 0
            else: # pixel starts covering line
                val = 0.5 - dist_edge
                if dc < 1: # thin line, clipped to line width
                    val = min(val, dc)
                elif stride - dc < 1: # thick line, little background
                    val = max(val, 1 - (stride - dc))
            im[y,x] = val

canvas = np.zeros((128, 512), 'f4')
hatch(canvas, angle=-23, stride=5, dc=2.5)
# mind the gamma mapping before imshow
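For anyone who wants to poke at the distance math without numba, here is a small scalar sketch of the three distances that hatch computes per pixel. The function name distance_to_line_edge is made up for illustration; the formulas are the same ones used in the loop above.

```python
import math

def distance_to_line_edge(x, y, angle_deg=45.0, stride=10.0, dc=5.0):
    """Signed distance from pixel (x, y) to the edge of the nearest hatch line.

    Negative or zero means the pixel center is on a line, positive means
    background. Illustrative helper only, not part of the answer's code.
    """
    angle = math.radians(angle_deg)
    # Signed distance along the chosen normal direction (a line through the origin is at 0).
    dist_origin = math.cos(angle) * x - math.sin(angle) * y
    # Lines repeat every `stride`; fold the distance into [0, stride/2].
    dist_center = stride / 2 - abs((dist_origin % stride) - stride / 2)
    # Each line is `dc` wide, so its edge is dc/2 away from its center.
    return dist_center - dc / 2

# Vertical lines (angle 0), one every 10 px, each 4 px wide.
for x in range(0, 11):
    print(x, distance_to_line_edge(x, 0, angle_deg=0.0, stride=10.0, dc=4.0))
```

At angle 0 the lines are centered on x = 0, 10, 20, ..., so x = 0 is 2 px inside a line and x = 5 is 3 px away from the nearest edge.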
The problem - given a list of planar points [p_1, ..., p_n] and the dimensions of some rectangle w, h, find the minimal set of rectangles w, h that cover all points (edit - the rectangles are not rotated).
My initial solution was:
find the bounding-box of all points
divide the width and height of the bounding-box by the w, h of the given rectangle and round the number up to get the number of instances of the rectangle in x and y
to further optimize, go through all rectangles and delete the ones that have zero points inside them.
An example in Python:
from math import ceil

def tile_rect(points, rect):
    w, h = rect
    xs = [p.x for p in points]
    ys = [p.y for p in points]
    bbox_w = abs(max(xs) - min(xs))
    bbox_h = abs(max(ys) - min(ys))
    n_x, n_y = ceil(bbox_w / w), ceil(bbox_h / h)
    rect_xs = [min(xs) + n * w for n in range(n_x)]
    rect_ys = [min(ys) + n * h for n in range(n_y)]
    # remove_empty (not shown) drops the grid rectangles that contain no points
    rects = remove_empty(rect_xs, rect_ys)
    return rects
How can I do better? What algorithm can I use to decrease the number of rectangles?
To discretize the problem for integer programming, observe that given a rectangle we can slide it in the +x and +y directions without decreasing the coverage until the min x and the min y lines both have a point on them. Thus the integer program is just the standard min cover:
minimize sum_R x_R
subject to
for every point p, sum_{R contains p} x_R >= 1
x_R in {0, 1}
where R ranges over all rectangles whose min x is the x of some point and whose min y is the y of some point (not necessarily the same point).
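For tiny inputs the same formulation can be checked without a solver. The sketch below is a brute-force search, not the integer program itself: it enumerates the same candidate rectangles (min x from one point, min y from another) and tries subsets of increasing size. The function name is mine, and it is only practical for a handful of points.

```python
from itertools import combinations

def min_rect_cover_bruteforce(points, w, h):
    """Return lower-left corners of a smallest set of w-by-h rectangles covering points.

    Exponential-time illustration of the min-cover formulation; hypothetical helper.
    """
    # Candidates: min x taken from some point, min y taken from some point.
    candidates = sorted({(x, y) for (x, _) in points for (_, y) in points})
    # For each candidate, the indices of the points it contains.
    covers = [
        frozenset(
            i for i, (px, py) in enumerate(points)
            if x <= px <= x + w and y <= py <= y + h
        )
        for (x, y) in candidates
    ]
    everything = frozenset(range(len(points)))
    # Try subset sizes 1, 2, ... and stop at the first full cover.
    for k in range(1, len(points) + 1):
        for combo in combinations(range(len(candidates)), k):
            if frozenset().union(*(covers[i] for i in combo)) == everything:
                return [candidates[i] for i in combo]
    return []

pts = [(0.0, 0.0), (0.05, 0.05), (1.0, 1.0)]
print(min_rect_cover_bruteforce(pts, 0.1, 0.1))
```

Every point is itself a candidate corner that covers it, so a cover of size at most len(points) always exists and the search terminates.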
Demo Python:
import random
from ortools.linear_solver import pywraplp

w = 0.1
h = 0.1
points = [(random.random(), random.random()) for _ in range(100)]
rectangles = [(x, y) for (x, _) in points for (_, y) in points]

solver = pywraplp.Solver.CreateSolver("min cover", "SCIP")
objective = solver.Objective()
constraints = [solver.RowConstraint(1, pywraplp.inf, str(p)) for p in points]
variables = [solver.BoolVar(str(r)) for r in rectangles]
for (x, y), var in zip(rectangles, variables):
    objective.SetCoefficient(var, 1)
    for (px, py), con in zip(points, constraints):
        if x <= px <= x + w and y <= py <= y + h:
            con.SetCoefficient(var, 1)
solver.Objective().SetMinimization()
solver.Solve()

scale = 6 * 72
margin = 72
print(
    '<svg width="{}" height="{}">'.format(
        margin + scale + margin, margin + scale + margin
    )
)
print(
    '<text x="{}" y="{}">{} rectangles</text>'.format(
        margin // 2, margin // 2, round(objective.Value())
    )
)
for x, y in points:
    print(
        '<circle cx="{}" cy="{}" r="3" fill="none" stroke="black"/>'.format(
            margin + x * scale, margin + y * scale
        )
    )
for (x, y), var in zip(rectangles, variables):
    if var.solution_value():
        print(
            '<rect x="{}" y="{}" width="{}" height="{}" fill="none" stroke="rgb({},{},{})"/>'.format(
                margin + x * scale,
                margin + y * scale,
                w * scale,
                h * scale,
                random.randrange(192),
                random.randrange(192),
                random.randrange(192),
            )
        )
print("</svg>")
Example output:
Assuming an approximate, rather than optimal solution is acceptable, how about a routine generally like:
Until no points are left:
(1) Find the convex hull of the remaining points.
(2) Cover each point/s on the hull so the
covering rectangles extend "inward."
(Perhaps examine neighbouring hull points
to see if a rectangle can cover more than one.)
(3) Remove the newly covered points.
Clearly, the orientation of the covering rectangles has an effect on the procedure and result. I think there is a way to combine (1) and (3), or possibly rely on a nested convex hull, but I don't have too much experience with those.
This can be transformed into a mostly standard set cover problem. The general steps are as follows, given n points in the plane.
First, generate all possible maximally inclusive rectangles, of which there are at most n^2, named R. The key insight is that given a point p1 with coordinates (x1, y1), use x1 as the leftmost bound for a set of rectangles. For all other points p2 with (x2,y2) where x1 <= x2 <= x1+w and where y1-h <= y2 <= y1+h, generate a rectangle ((x1, y2), (x1+w, y2+h)).
For each rectangle r generated, count the points included in that rectangle cover(r).
Choose a subset of the rectangles R, s, such that all points are in Union(r in s) cover(r)
Now, the tricky part is that last step. Fortunately, it is a standard problem and there are many algorithms suggested in the literature. For example, combinatorial optimization solvers (such as SAT solvers, MIP solvers, and Constraint programming solvers) can be used.
Note that the above re-formulation only works if it is ok for rectangles to cover each other. It might be the case that the generated set of rectangles is not enough to find the least set of rectangles that do not overlap.
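If an approximate answer is acceptable, the last step can be sketched with the classic greedy set cover heuristic instead of a solver. This is a swapped-in technique and is not guaranteed to be optimal. For simplicity the candidates here are anchored at the x of one point and the y of another, which is an assumption of this sketch rather than the exact construction described above; the function name is also made up.

```python
def greedy_rect_cover(points, w, h):
    """Greedy set cover: repeatedly pick the rectangle covering the most uncovered points.

    Returns lower-left corners of axis-aligned w-by-h rectangles. Approximate, not optimal.
    """
    # Candidate lower-left corners: x from one point, y from another (assumption).
    candidates = {(x, y) for (x, _) in points for (_, y) in points}
    # Precompute which point indices each candidate rectangle contains.
    cover_sets = {
        (rx, ry): {
            i for i, (px, py) in enumerate(points)
            if rx <= px <= rx + w and ry <= py <= ry + h
        }
        for (rx, ry) in candidates
    }
    uncovered = set(range(len(points)))
    chosen = []
    while uncovered:
        # Pick the rectangle that removes the most still-uncovered points.
        best = max(cover_sets, key=lambda r: len(cover_sets[r] & uncovered))
        chosen.append(best)
        uncovered -= cover_sets[best]
    return chosen

pts = [(0.0, 0.0), (0.05, 0.05), (1.0, 1.0)]
print(greedy_rect_cover(pts, 0.1, 0.1))
```

The loop always terminates because each point is the lower-left corner of a candidate that contains it, so every round covers at least one new point.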
I'm trying to do 3D scene reconstruction and camera pose estimation on video input, however the camera positions are not matching what I am seeing in the video.
Here is the code I wrote to recover the pose and landmark positions
def SfM(self, points1, points2):
    x = 800 / 2
    y = 600 / 2
    fov = 80 * (math.pi / 180)
    f_x = x / math.tan(fov / 2)
    f_y = y / math.tan(fov / 2)
    # intrinsic camera matrix
    K = np.array([[f_x, 0, x],
                  [0, f_y, y],
                  [0, 0, 1]])
    #find fundamental matrix
    E, mask = cv2.findFundamentalMat(np.float32(points2), np.float32(points1), cv2.FM_8POINT)
    #get rotation matrix and translation vector
    points, R, t, mask = cv2.recoverPose(E, np.float32(points2), np.float32(points1), K, 500)
    #calculate the new camera position based on the translation, camPose is the previous camera position
    self.cam_xyz.append([self.camPose[0] + t[0], self.camPose[1] + t[1], self.camPose[2] + t[2]])
    #calculate the extrinsic matrix
    C = np.hstack((R, t))
    #calculate the landmark positions
    for i in range(len(points2)):
        #convert coordinates into a 3x1 array
        pts2d = np.asmatrix([points2[i][0], points2[i][1], 1]).T
        #calculate camera matrix
        P = np.asmatrix(K) * np.asmatrix(C)
        #find 3d coordinate
        pts3d = np.asmatrix(P).I * pts2d
        #add to list of landmarks
        self.lm_xyz.append([pts3d[0][0] * self.scale + self.camPose[0],
                            pts3d[1][0] * self.scale + self.camPose[1],
                            pts3d[2][0] * self.scale + self.camPose[2]])
    #update the previous camera position
    self.camPose = [self.camPose[0] + t[0], self.camPose[1] + t[1], self.camPose[2] + t[2]]
When I passed in this video I got this as my output
I can't figure out why it is veering to the right when the camera only heads straight in the video. I suspect that I am using the cv2.recoverPose method incorrectly, but I don't know what else I can do to make it better. I put the full code in a PasteBin in case anyone wants to replicate the program. Any help would be greatly appreciated. Thank you so much!
Shouldn't you calculate the essential matrix E with cv2.findEssentialMat instead? As written, you calculate the fundamental matrix F, but to recover the pose you must pass the essential matrix, E = K^T * F * K, where K is the camera matrix.
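As a rough illustration of the relation E = K^T * F * K, here is a dependency-free sketch using plain nested lists. The helper names are mine, and it assumes both views share the same camera matrix K. With NumPy the same thing is simply K.T @ F @ K, and in practice calling cv2.findEssentialMat on the matched points is the more direct route.

```python
def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def essential_from_fundamental(F, K):
    """E = K^T * F * K, assuming the same intrinsics K for both images."""
    return matmul(matmul(transpose(K), F), K)

# Toy numbers chosen only to make the arithmetic easy to follow;
# this F is not a real fundamental matrix.
K = [[2.0, 0.0, 1.0],
     [0.0, 2.0, 1.0],
     [0.0, 0.0, 1.0]]
F = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
for row in essential_from_fundamental(F, K):
    print(row)
```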
I was reading the paper "Self-Invertible 2D Log-Gabor Wavelets", which defines the 2D log-Gabor filter as follows:
The paper also states that the filter only covers one side of the frequency space and shows that in this image
In my attempt to implement the filter I get results that do not match what the paper says. Let me start with my implementation, then I will state the problems.
Implementation:
I created a 2d array that contains the filter and transformed each index so that the origin of the frequency domain is at the center of the array with positive x-axis going right and positive y-axis going up.
number_scales = 5 # scale resolution
number_orientations = 9 # orientation resolution
N = constantDim # image dimensions

def getLogGaborKernal(scale, angle, logfun=math.log2, norm = True):
    # set up filter configuration
    center_scale = logfun(N) - scale
    center_angle = ((np.pi/number_orientations) * angle) if (scale % 2) \
        else ((np.pi/number_orientations) * (angle+0.5))
    scale_bandwidth = 0.996 * math.sqrt(2/3)
    angle_bandwidth = 0.996 * (1/math.sqrt(2)) * (np.pi/number_orientations)
    # 2d array that will hold the filter
    kernel = np.zeros((N, N))
    # get the center of the 2d array so we can shift origin
    middle = math.ceil((N/2)+0.1)-1
    # calculate the filter
    for x in range(0,constantDim):
        for y in range(0,constantDim):
            # get the transformed x and y where origin is at center
            # and positive x-axis goes right while positive y-axis goes up
            x_t, y_t = (x-middle),-(y-middle)
            # calculate the filter value at given index
            kernel[y,x] = logGaborValue(x_t,y_t,center_scale,center_angle,
                                        scale_bandwidth, angle_bandwidth,logfun)
    # normalize the filter energy
    if norm:
        kernel = kernel / np.sum(kernel**2)
    return kernel
To calculate the filter value at each index another transform is made where we go to the log-polar space
def logGaborValue(x,y,center_scale,center_angle,scale_bandwidth,
                  angle_bandwidth, logfun):
    # transform to polar coordinates
    raw, theta = getPolar(x,y)
    # if we are at the center, return 0 as in the log space
    # zero is not defined
    if raw == 0:
        return 0
    # go to log polar coordinates
    raw = logfun(raw)
    # calculate (theta-center_theta), we calculate cos(theta-center_theta)
    # and sin(theta-center_theta) then use atan to get the required value,
    # this way we can eliminate the angular distance wrap around problem
    costheta, sintheta = math.cos(theta), math.sin(theta)
    ds = sintheta * math.cos(center_angle) - costheta * math.sin(center_angle)
    dc = costheta * math.cos(center_angle) + sintheta * math.sin(center_angle)
    dtheta = math.atan2(ds,dc)
    # final value, multiply the radial component by the angular one
    return math.exp(-0.5 * ((raw-center_scale) / scale_bandwidth)**2) * \
        math.exp(-0.5 * (dtheta/angle_bandwidth)**2)
Problems:
The angle: the paper stated that indexing the angles from 1->8 would produce good coverage of the orientation, but in my implementation angles from 1->n don't cover except for half orientations. Even the vertical orientation is not correctly covered. This can be shown in this figure which contains sets of filters of scale 3 and orientations ranging from 1->8:
The coverage: from filters above it is clear the filter covers both sides of the space which is not what the paper says. This can be made more explicit by using 9 orientations ranging from -4 -> 4. The following image contains all the filters in one image to show how it covers both sides of the spectrum (this image is created by taking the maximum at each location from all filters):
Middle Column (orientation $\pi / 2$): in the first figure in orientation from 3 -> 8 it can be seen that the filter vanishes at orientation $ \pi / 2$. Is this normal? This can be seen too when I combine all the filters(of all 5 scales and 9 orientations) in one image:
Update:
Adding the impulse response of the filter in spatial domain, as you can see there is an obvious distortion in -4 & 4 orientations:
After a lot of code analysis, I found that my implementation was correct but the getPolar function was messed up, so the code above should work just fine. Here is new code without the getPolar function, in case anyone is looking for it:
number_scales = 5 # scale resolution
number_orientations = 8 # orientation resolution
N = 128 # image dimensions

def getFilter(f_0, theta_0):
    # filter configuration
    scale_bandwidth = 0.996 * math.sqrt(2/3)
    angle_bandwidth = 0.996 * (1/math.sqrt(2)) * (np.pi/number_orientations)
    # x,y grid
    extent = np.arange(-N/2, N/2 + N%2)
    x, y = np.meshgrid(extent,extent)
    mid = int(N/2)
    ## orientation component ##
    theta = np.arctan2(y,x)
    center_angle = ((np.pi/number_orientations) * theta_0) if (f_0 % 2) \
        else ((np.pi/number_orientations) * (theta_0+0.5))
    # calculate (theta-center_theta), we calculate cos(theta-center_theta)
    # and sin(theta-center_theta) then use atan to get the required value,
    # this way we can eliminate the angular distance wrap around problem
    costheta = np.cos(theta)
    sintheta = np.sin(theta)
    ds = sintheta * math.cos(center_angle) - costheta * math.sin(center_angle)
    dc = costheta * math.cos(center_angle) + sintheta * math.sin(center_angle)
    dtheta = np.arctan2(ds,dc)
    orientation_component = np.exp(-0.5 * (dtheta/angle_bandwidth)**2)
    ## frequency component ##
    # go to polar space
    raw = np.sqrt(x**2+y**2)
    # set origin to 1 as in the log space zero is not defined
    raw[mid,mid] = 1
    # go to log space
    raw = np.log2(raw)
    center_scale = math.log2(N) - f_0
    draw = raw-center_scale
    frequency_component = np.exp(-0.5 * (draw/ scale_bandwidth)**2)
    # reset origin to zero (not needed as it is already 0?)
    frequency_component[mid,mid] = 0
    return frequency_component * orientation_component
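Since the original getPolar was never shown, here is a hedged guess at what a correct scalar version could look like, together with the wrap-around-safe angle difference that both listings rely on. The names get_polar and angular_difference are hypothetical and are not the original code.

```python
import math

def get_polar(x, y):
    """Return (radius, theta) with theta in (-pi, pi], using atan2 to keep the quadrant."""
    return math.hypot(x, y), math.atan2(y, x)

def angular_difference(theta, center_angle):
    """Smallest signed difference theta - center_angle, wrapped into (-pi, pi]."""
    d = theta - center_angle
    # atan2(sin d, cos d) is the same trick as the ds/dc lines in the listings above.
    return math.atan2(math.sin(d), math.cos(d))

print(get_polar(0.0, 1.0))                                # straight up the positive y-axis
print(angular_difference(math.pi - 0.1, -math.pi + 0.1))  # two angles on either side of the cut
```

A plain atan(y / x) loses the quadrant and fails at x = 0, which is one plausible way for a hand-written polar conversion to produce mirrored or vanishing orientations; whether that was the actual bug in the question is not known.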
I'm writing a Python program to generate the Luna Free State flag from the famous Heinlein novel The Moon is a Harsh Mistress, as a personal project. I've been cribbing heraldry rules and matching mathematical formulas off the web, but something is clearly wrong in my bendsinister routine, since the assertion fails when uncommented. The area of the bend sinister should be 1/3 the total area of the flag, and it isn't. The only really dodgy thing I've done is to guess at the formula for the height of the trapezoid, but I guess the errors could be anywhere. I've trimmed out most of the code, leaving only what's necessary to show the problem. Hopefully someone less mathematically-challenged can spot the error!
#!/usr/bin/python
'generate bend sinister according to rules of heraldry'
import sys, os, random, math, Image, ImageDraw
FLAG = Image.new('RGB', (900, 600), 'black')
CANVAS = ImageDraw.Draw(FLAG)
DEBUGGING = True

def bendsinister(image = FLAG, draw = CANVAS):
    '''a bend sinister covers 1/3 of the field, sinister chief to dexter base
    (some sources on the web say 1/5 of the field, but we'll use 1/3)
    the "field" in this case being the area of the flag, so we need to
    find a trapezoid which is 1/6 the total area (width * height).
    we need to return only the width of the diagonal, which is double
    the height of the calculated trapezoid
    '''
    x, y = image.size
    b = math.sqrt((x ** 2) + (y ** 2))
    A = float(x * y)
    debug('%d * %d = %d' % (x, y, A))
    H = triangle_height(A / 2, b) # height of triangular half of flag
    width = trapezoid_height(b, H, A / 6) * 2
    if command == 'bendsinister':
        show_bendsinister(x, y, width, image, draw)
    return width

def show_bendsinister(x, y, width, image = FLAG, draw = CANVAS):
    'for debugging formula'
    dexter_base, sinister_chief = (0, y), (x, 0)
    draw.line((dexter_base, sinister_chief), 'blue', int(width))
    image.show()
    debug(image.getcolors(2)) # should be twice as many black pixels as blue

def triangle_height(a, b):
    'a=bh/2'
    h = float(a) / (float(b) / 2)
    debug('triangle height: %.2f' % h)
    return h

def trapezoid_height(b, H, a):
    '''calculate trapezoid height (h) given the area (a) of the trapezoid and
    base b, the longer base, when it is known that the trapezoid is a section
    of a triangle of height H, such that the top, t, equals b when h=0 and
    t=0 when h=H. h is therefore inversely proportional to t with the formula
    t=(1-(h/H))*b, found simply by looking for what fit the two extremes.
    the area of a trapezoid is simply the height times the average length of
    the two bases, b and t, i.e.: a=h*((b+t)/2). the formula reduces
    then to (2*a)/b=(2*h)+(h**2)/H, which is the quadratic equation
    (1/H)*(h**2)+(2*h)-((2*a)/b)=0; solve for h using the quadratic formula
    '''
    try:
        h = (-2 + math.sqrt(4 - 4 * (1.0 / H) * -((2 * a) / b))) / (2 * (1.0 / H))
        debug('trapezoid height with plus: %.2f' % h)
    except: # must be imaginary, so try minus instead
        h = (-2 - math.sqrt(4 - 4 * (1.0 / H) * -((2 * a) / b))) / (2 * (1.0 / H))
        debug('trapezoid height with minus: %.2f' % h)
    t = (1 - (float(h) / H)) * b
    debug('t=%d, a=%d, check=%d' % (t, round(a), round(h * ((b + t) / 2))))
    #assert round(a) == round(h * ((b + t) / 2))
    return h

def debug(message):
    if DEBUGGING:
        print >>sys.stderr, message

if __name__ == '__main__':
    command = os.path.splitext(os.path.basename(sys.argv[0]))[0]
    print eval(command)(*sys.argv[1:]) or ''
Here is the debugging output, showing I'm far off from the 1/3 area:
jcomeau@intrepid:~/rentacoder/jcomeau/tanstaafl$ ./bendsinister.py
900 * 600 = 540000
triangle height: 499.23
trapezoid height with plus: 77.23
t=914, a=90000, check=77077
[(154427, (0, 0, 255)), (385573, (0, 0, 0))]
154.462354191
Here is an image of the output, with some added lines:
The red line divides the two triangles, either can be used for the calculation of the trapezoid. I'm using the one starting at the top left. The green line is the height of that triangle, the variable H in the program.
For the finished script and flag (using the correction supplied by Michael Anderson), see http://unternet.net/tanstaafl/. Thanks all for the help!
Break the rectangle into two triangles. They will be identical.
The Black triangle + Blue Trapezoid is Triangle A.
The Black Triangle on its own is Triangle B
Triangle A and Triangle B are similar triangles so their area is related by the square of the scale factor relating them.
We want the Blue Trapezoid to be one third of the area of Triangle A. (This way the bend will take one third of the overall rectangle). This means that Triangle B must be 2/3 area of Triangle A. Thus the scalefactor must be sqrt(2/3).
You should then be able to convert this to give you the coordinates of the bend geometry pretty easily.
I executed the following code in an IDLE session
from PIL import Image, ImageDraw
from math import sqrt
'generate bend sinister according to rules of heraldry'
import sys, os, random, math
FLAG = Image.new('RGB', (900, 600), 'black')
CANVAS = ImageDraw.Draw(FLAG)
DEBUGGING = True

def debug(message):
    if DEBUGGING:
        print >>sys.stderr, message

def show_bendsinister(x, y, width, image = FLAG, draw = CANVAS):
    'for debugging formula'
    dexter_base, sinister_chief = (0, y), (x, 0)
    print 'dexter_base==',dexter_base,'sinister_chief==',sinister_chief
    draw.line((dexter_base, sinister_chief), 'blue', int(width))
    image.show()
    debug(image.getcolors(2)) # should be twice as many black pixels as blue

def trapezoid_height(x, y, P):
    '''Given a rectangle whose width and length are (x) and (y)
    The half of this rectangle is a large triangle A
    whose base (b) is the diagonal of the rectangle
    and its height (H) goes from its base (b) to
    the right angle of the large triangle.
    (x) and (y) are the side-lengths of the triangle.
    The area of this large triangle is (x*y)/2 = (H*b)/2
    Given a trapezoid whose base is the diagonal (b) of the rectangle
    and base (b) of the large triangle, its height is (h)
    and its top is (t).
    Given (S) as the area of the trapezoid.
    In general, the trapezoid is asymmetric because the triangle has x != y.
    So the area is S = h*(b + t)/2
    This function trapezoid_height() calculates the height (h) of the trapezoid
    so that the trapezoid has an area (S) which must be
    the percentage (P) of the area of the large triangle A. So:
    h*(b + t)/2 = S = P*[H*b /2] ==> h*(b + t) = P*H*b
    ==> h*t = P*H*b - h*b ==> h*t*(H-h) = [P*H - h]*b*(H-h)
    The large triangle is the sum of the trapezoid and of a little triangle B
    having a height equal to (H-h) and a base which is the top (t)
    of the trapezoid.
    The area of this little triangle B is t*(H-h)/2 and must be equal to (1-P)*[H*b / 2]
    ==> t*(H-h) = (1-P)*H*b ==> h*t*(H-h) = h*(1-P)*H*b
    From h*t*(H-h) = [P*H - h]*b*(H-h) and h*t*(H-h) = h*(1-P)*H*b
    we obtain [P*H - h]*b*(H-h) = h*(1-P)*H*b
    ==> b*h**2 - (b*H + x*y)*h + P*x*y*H = 0
    ==> h**2 - 2*H*h + P*(H**2) = 0
    That leads to the solution H*(1 - sqrt(1-P)), the other H*(1 + sqrt(1-P))
    being bigger than H
    '''
    H = math.sqrt( (x*x*y*y) / (x*x + y*y) )
    return H*(1 - sqrt(1-P))

def bendsinister(image = FLAG, draw = CANVAS):
    '''a bend sinister covers 1/3 of the field, sinister chief to dexter base
    (some sources on the web say 1/5 of the field, but we'll use 1/3)
    the "field" in this case being the area of the flag, so we need to
    find a trapezoid which is 1/6 the total area (width * height).
    we need to return only the width of the diagonal, which is double
    the height of the calculated trapezoid
    '''
    x, y = image.size
    print 'x ==',x,'y ==',y
    percentage = float(1)/3
    width = 2 * trapezoid_height(x, y , percentage)
    print 'height ==',width/2
    print 'width==',width
    if command == 'bendsinister':
        show_bendsinister(x, y, width, image, draw)
    return width

command = 'bendsinister'
print bendsinister()
result
x == 900 y == 600
height == 91.6103029364
width== 183.220605873
dexter_base== (0, 600) sinister_chief== (900, 0)
[(180340, (0, 0, 255)), (359660, (0, 0, 0))]
183.220605873
The blue stripe displayed doesn't look like 1/3 of the field's area, but the numbers speak for themselves:
359660 / 180340 = 1.994344