Defining an ellipse around data points

I have a question/idea that I am not sure how to implement.
I have a scatter plot of X vs. Y.
I can draw a rectangle and then pick all the points within it.
Ideally I want to define an ellipse, as it better captures the shape, and exclude all the points that are outside it.
How does one do this? Is it even possible? I drew the plot using matplotlib.
I used linear regression (LR) to fit the points, but that's not really what I am looking for.
I want to define APPROXIMATELY an ellipse that covers as many points as possible and then exclude the points outside it. How can I define an equation/code to pick the ones inside?

If you have the data structure that is represented in the graph, you can do this with a function and a list comprehension.
If you have the data in a list like this:
from math import sqrt

# Made up data
lst = [
    # First element is X, second is Y.
    (0, 0),
    (92, 20),
    (10, 0),
    (13, 40),
    (27, 31),
    (.5, .5),
]

def shape_bounds(x):
    """
    Function that returns lower and upper bounds for y based on x
    Using a circle as an example here.
    """
    r = 4
    # A circle is x**2 + y**2 = r**2, r = radius
    if -r <= x <= r:
        y = sqrt(r**2 - x**2)
        return -y, y
    else:
        # Remember, returns lower, upper.
        # This will fail any lower < y < upper test.
        return 1, -1

def in_shape(elt):
    """
    Unpacks a pair and tests if y is inside the shape bounds given by x
    """
    x, y = elt
    lower_bound, upper_bound = shape_bounds(x)
    return lower_bound < y < upper_bound

# Demo walkthrough
for elt in lst:
    x, y = elt
    print(x, y)
    lower_bound, upper_bound = shape_bounds(x)
    if lower_bound < y < upper_bound:
        print("X: {0}, Y: {1} is in the circle".format(x, y))

# New list of only points inside the shape
new_lst = [x for x in lst if in_shape(x)]
As for an ellipse, try changing the shape equation in shape_bounds to that of an ellipse.
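A minimal sketch of the ellipse test, assuming the ellipse is described by a center, two semi-axes and an optional rotation angle. The function name `in_ellipse` and the parameter values are made up for illustration; they are not fitted to any data.

```python
import math

def in_ellipse(point, center, a, b, angle=0.0):
    """Return True if point lies inside or on the ellipse.

    a and b are the semi-axes; angle is the rotation of the a-axis
    from the x-axis, in radians. All values are illustrative.
    """
    x, y = point
    cx, cy = center
    dx, dy = x - cx, y - cy
    # Rotate the offset into the ellipse's own axes
    cos_t, sin_t = math.cos(angle), math.sin(angle)
    u = dx * cos_t + dy * sin_t
    v = -dx * sin_t + dy * cos_t
    # Standard ellipse inequality: (u/a)**2 + (v/b)**2 <= 1
    return (u / a) ** 2 + (v / b) ** 2 <= 1.0

lst = [(0, 0), (92, 20), (10, 0), (13, 40), (27, 31), (0.5, 0.5)]
inside = [p for p in lst if in_ellipse(p, center=(10, 10), a=30, b=15)]
print(inside)
```

Testing the inequality directly avoids solving for the lower and upper y bounds, and it handles a rotated ellipse with the same code.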

Related

Label points in section of np.meshgrid

I am trying to label x and y points based on their being in a specific section of a meshgrid in python. The points are stored in a pandas dataframe.
Here I have a scatter plot of the coordinates and above them I am plotting the grid.
The entire grid is way bigger, from the bottom left point (500,1250) to upper right point (2750, 3250), which means the whole grid is 225x200 sections.
I want to iterate through the sections of the grid and check if a point is inside. If a point is inside the section I want to add a label to the point. The label should be the same as the section name.
I want to add a column to the dataframe called 'section' that stores the section a point belongs to.
In the example (picture above) I would like to label all the points with
770 <= x <= 780 and 1795 <= y <= 1805 with the section name 'A3'.
my code currently looks like this:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection
df = pd.read_csv('./file.csv', sep=';')
x_min = df['X[mm]'].min()
x_max = df['X[mm]'].max()
y_min = df['Y[mm]'].min()
y_max = df['Y[mm]'].max()
#side of the square in mm:
square_side = 10
xs = np.arange(x_min, x_max+square_side, square_side)
ys = np.arange(y_min, y_max+square_side, square_side)
x_2, y_2 = np.meshgrid(xs, ys, indexing = 'ij')
fig, ax = plt.subplots(figsize=(9,9))
ax.plot(df['X[mm]'], df['Y[mm]'], linewidth=0.2, c='black')
#plot meshgrid as grid instead of points:
segs1 = np.stack((x_2[:,[0,-1]],y_2[:,[0,-1]]), axis=2)
segs2 = np.stack((x_2[[0,-1],:].T,y_2[[0,-1],:].T), axis=2)
plt.gca().add_collection(LineCollection(np.concatenate((segs1, segs2))))
ax.set_aspect('equal', 'box')
plt.show()
I have also a function that determines if the points are inside of a rectangle (this does not use meshgrid):
def is_inside_rect(M, A, B, D):
    '''Check if a point M is inside a rectangle with corners A, B, C, D'''
    # 0 <= dot(BC,BM) <= dot(BC,BC)
    #print(np.dot(B - A, D - A))
    return 0 <= np.dot(B - A, M - A) <= np.dot(B - A, B - A) and 0 <= np.dot(D - B, M - B) <= np.dot(D - B, D - B)
I thought of using it in a while loop like this:
x = x_min
y = y_min
while (x <= x_max + square_side) and (y <= y_max + square_side):
    A = np.array([x, y])
    B = np.array([x + square_side, y])
    D = np.array([x + square_side, y + square_side])
    print(A, B, D)
    df['c'] = df[['X[mm]', 'Y[mm]']].apply(lambda coord: 'red' if is_inside_rect(np.array(coord), A, B, D) else 'black', axis=1)
    x += square_side
    y += square_side
but this is very slow and it changes the colors of all the points in every iteration.

Since all your squares are equally sized, there is no need to define all of your squares beforehand and then determine which squares have which points. I would use the coordinates of each point to directly determine which square it will land in.
Let's take the 1-dimensional case, for the sake of simplicity. You want to group points on the number line into "squares" (really 1-d line segments). If your first square starts at x=0, your second at x=10, your third at x=20, and so on, how do you find the square for an arbitrary point x? You know that your squares are spaced by 10 (and you know they start at 0, which makes things easier), so you can simply divide by 10 and round down to get the square index.
You can just as easily do the same thing in 2 dimensions (or n dimensions).
square_side = 10
x_min = df['X[mm]'].min()
y_min = df['Y[mm]'].min()

def label_point(x, y):
    # Double forward slash is integer (round down) division;
    # int() is needed because float // int still returns a float
    # Add 1 here if you really want 1-based indexing
    x_label = int((x - x_min) // square_side)
    # chr() only yields the letters A-Z for the first 26 rows
    y_label = chr(ord('A') + int((y - y_min) // square_side))
    return f'{y_label}{x_label}'

df['label'] = df[['X[mm]', 'Y[mm]']].apply(lambda coord: label_point(*coord), axis=1)
As for the efficiency, this solution looks at each point only once, and does a constant amount of work with each point, so it is O(n) in the number of points. Your solution looks at each square once, and for each square looks at each point, so it is O(n × m), where n is the number of points and m is the number of squares.
Your solution is more general, in that your is_inside_rect function works when your grid of rectangles has an arbitrary rotation. In this case, I would recommend rotating all your points about the origin, and then running my solution.
Also, your loop is adding 10 to x and y every loop, so you are traversing your space diagonally. I don't think you meant to do that.
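The same divide-and-round-down idea can be applied to whole columns at once instead of row by row. This is only a sketch: the small DataFrame is made-up data, and the `R<row>C<col>` label scheme is an assumption chosen because a grid with 200 rows has more rows than there are letters.

```python
import numpy as np
import pandas as pd

# Made-up sample data using the column names from the question
df = pd.DataFrame({"X[mm]": [770.0, 775.5, 781.0, 799.9],
                   "Y[mm]": [1795.0, 1804.9, 1805.0, 1826.0]})
square_side = 10
x_min, y_min = df["X[mm]"].min(), df["Y[mm]"].min()

# Column and row index of the square each point falls into (0-based)
col = np.floor((df["X[mm]"] - x_min) / square_side).astype(int)
row = np.floor((df["Y[mm]"] - y_min) / square_side).astype(int)

# Hypothetical label scheme: R<row>C<col>
df["section"] = "R" + row.astype(str) + "C" + col.astype(str)
print(df)
```

Working on the columns directly avoids calling a Python function once per row, which usually matters for large DataFrames.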

Some points are not displayed on the graph plotted using NumPy and matplotlib

For the following code whose job is to perform Monte Carlo integration for a function f, I was wondering what would happen if I define f as y = sqrt(1-x^2), which is the equation for a unit quarter circle, and specify an endpoint that is greater than 1, since we know that f is only defined for 0<x<1.
import numpy as np
import matplotlib.pyplot as plt
def definite_integral_show(f, x0, x1, N):
    """Approximate the definite integral of f(x)dx between x0 and x1 using
    N random points

    Arguments:
    f -- a function of one real variable, must be nonnegative on [x0, x1]
    N -- the number of random points to use
    """
    #First, let's compute fmax. We do that by evaluating f(x) on a grid
    #of points between x0 and x1
    #This assumes that f is generally smooth. If it's not, we're in trouble!
    x = np.arange(x0, x1, 0.01)
    y = f(x)
    print(y)
    f_max = max(y)
    #Now, let's generate the random points. The x's should be between
    #x0 and x1, so we first create points between 0 and (x1-x0), and
    #then add x0
    #The y's should be between 0 and fmax
    #
    # 0...(x1-x0)
    x_rand = x0 + np.random.random(N)*(x1-x0)
    print(x_rand)
    y_rand = 0 + np.random.random(N)*f_max
    #Now, let's find the indices of the points above and below
    #the curve. That is, for points below the curve, let's find
    # i s.t. y_rand[i] < f(x_rand)[i]
    #And for points above the curve, find
    # i s.t. y_rand[i] >= f(x_rand)[i]
    ind_below = np.where(y_rand < f(x_rand))
    ind_above = np.where(y_rand >= f(x_rand))
    #Finally, let's display the results
    plt.plot(x, y, color = "red")
    pts_below = plt.scatter(x_rand[ind_below[0]], y_rand[ind_below[0]], color = "green")
    pts_above = plt.scatter(x_rand[ind_above[0]], y_rand[ind_above[0]], color = "blue")
    plt.legend((pts_below, pts_above),
               ('Pts below the curve', 'Pts above the curve'),
               loc='lower left',
               ncol=3,
               fontsize=8)

def f1(x):
    return np.sqrt(1-x**2)

definite_integral_show(f1, 0, 6, 200)
To my surprise, the program still works and gives me the following picture.
I suspect that it works because in NumPy, nan's in an array are just ignored when performing operations on the array. However, I don't understand why the picture only contains points whose x and y coordinates are both between 0 and 1. Where are the points that aren't within this range, but whose values are computed by
x_rand = x0 + np.random.random(N)*(x1-x0)
y_rand = 0 + np.random.random(N)*f_max

You can just print out the arrays (for example by generating only one random point) and see that they go into neither ind_below nor ind_above...
That's because every comparison involving nan (other than !=) returns False. (See also: What is the rationale for all comparisons returning false for IEEE754 NaN values?) So y_rand < nan and y_rand >= nan both evaluate to False.
The easiest way to change the code is
ind_below = np.where(y_rand < f(x_rand))
ind_above = np.where(~(y_rand < f(x_rand)))
(optionally only compute the array once)
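A small sketch of the behaviour described above, using three hand-picked values rather than the question's random data. The array `f_vals` stands in for f(x_rand), with one nan where the square root would be undefined.

```python
import numpy as np

y_rand = np.array([0.2, 0.5, 0.9])
# The middle value stands in for sqrt of a negative number
f_vals = np.array([0.8, np.nan, 0.3])

below = y_rand < f_vals      # False wherever f_vals is nan
above = y_rand >= f_vals     # also False wherever f_vals is nan
undefined = np.isnan(f_vals) # the points that land in neither group

print(below, above, undefined)
```

Using `np.isnan` makes the third group explicit, so those points can be plotted in their own color or dropped on purpose.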

Strange result sorting points by distance in python

I got a list of points by extracting the edge of an image. But the list is not well ordered, so if I connect the points as a line in list order, the line jumps back and forth across the shape.
Thus I want to sort this list of points: start with point_0, find the point with the shortest distance to it, say point_3, then find the one closest to point_3, and so on.
To sort the points, I wrote this:
import matplotlib.pyplot as plt
import numpy as np
import math
def dist(now, seek):
    return math.sqrt((now[0] - seek[0])**2 + (now[1] - seek[1])**2)

def sortNearest(x, y):
    if len(x) != len(y):
        raise Exception('Error! Array length do not match!')
        return False
    xNew = []; yNew = []
    nearest = 0 #record which point is nearest
    now = [x[0], y[0]] #start point index
    seekValue = 0
    while len(x) > 0:
        distance = (max(x) - min(x)) + (max(y) - min(y))
        for seek in range(len(x)): # other
            temp = dist(now, [x[seek], y[seek]])
            if temp < distance and temp != 0.0:
                distance = temp
                seekValue = x[seek]
        xNew.append(now[0]);
        yNew.append(now[1]);
        if len(x) > 0:
            x.remove(now[0])
            y.remove(now[1])
        if len(x) > 0:
            nearest = x.index(seekValue)
            now = [x[nearest], y[nearest]]
    x = list(xNew); y = list(yNew)
    return xNew, yNew

x, y = getBorder('large.png', maxRes = 125)
x, y = sortNearest(x, y)
But that doesn't work well. The result is obviously incorrect, as zooming in shows.
If my code did what I want, point_644 should connect to 620 or 675, anything but 645... What's wrong with it?

Well, point 644 cannot connect to point 620, because 620 is already part of your path.
As for why it connects to 645 instead of the closer 675: in your loop, you aren't actually remembering the index of the closest point, you're only remembering its x coordinate. After the loop, you then locate an arbitrary point with the same x coordinate - it could be anywhere on a vertical line going through the desired point.
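One way to avoid that problem is to keep the points as (x, y) pairs and remember the index of the closest one. The following is a sketch, not a drop-in replacement; the name `sort_nearest` is made up, and it still uses the greedy nearest-neighbour strategy from the question.

```python
import math

def sort_nearest(x, y):
    """Greedy nearest-neighbour ordering that tracks indices, not x values."""
    remaining = list(zip(x, y))
    if not remaining:
        return [], []
    path = [remaining.pop(0)]  # start from the first point
    while remaining:
        cx, cy = path[-1]
        # Index of the remaining point closest to the current one
        best = min(range(len(remaining)),
                   key=lambda i: math.hypot(remaining[i][0] - cx,
                                            remaining[i][1] - cy))
        path.append(remaining.pop(best))
    xs, ys = zip(*path)
    return list(xs), list(ys)

# Two points share x == 1; the closer one must be picked first
print(sort_nearest([0, 1, 1], [0, 10, 0]))
```

Removing points by index also avoids `list.remove`, which deletes the first matching value and can separate an x from its y when coordinates repeat.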

I don't know how I would do this in python 3.x, so please forgive changes that I have not made from python 2.7. You'll also want to figure out what point you'd like to start with:
import math

def find_distance(point1, point2):
    return math.hypot(point1[0] - point2[0], point1[1] - point2[1])

x, y = getBorder('large.png', maxRes=125)
points_in_border = list(zip(x, y))
current_point = points_in_border.pop(0)  # start with the first point
points_in_order = [current_point]
while len(points_in_border) > 0:
    min_distance = float('inf')
    for point in points_in_border:
        d = find_distance(current_point, point)
        if d < min_distance:
            closest_point = point
            min_distance = d
    points_in_border.remove(closest_point)
    current_point = closest_point
    points_in_order.append(closest_point)

I think what you want to do can be optimized with numpy and scipy:
import numpy as np
import scipy.spatial.distance as distance
import matplotlib.pyplot as plt

points = np.random.random((6,2))
dists = distance.pdist(points)
m = np.argsort(distance.squareform(dists))[:,1:]
order = [0, m[0,0]]
next_point = order[-1]
while len(order) < len(points):
    row = m[next_point]
    i = 0
    while row[i] in order:
        i += 1
    order.append(row[i])
    next_point = order[-1]
order.append(0)
ordered = points[order]
plt.plot(ordered[:,0], ordered[:,1], 'o-')
The idea underlying this code is the following. First you calculate all the distances. Then you use argsort to get the indices that would order each row. You can remove the first column, as each point is closest to itself. We know that. Then you look which is the next closest point and you add it to the list order if the point is not there yet. You then go to the row corresponding to this point, and look for the next point. And so on.
If you are only interested in ordering the enclosing set of points, you can use ConvexHull to find them:
from scipy.spatial import ConvexHull
ch = ConvexHull(points)
plt.plot(points[ch.vertices,0], points[ch.vertices,1], 'o-')

How to index a list of points for faster searches of nearby points?

For a list of (x, y) points, I am trying to find the nearby points for each point.
from collections import defaultdict
from math import sqrt
from random import randint

# Generate a list of random (x, y) points
points = [(randint(0, 100), randint(0, 100)) for _ in range(1000)]

def is_nearby(point_a, point_b, max_distance=5):
    """Two points are nearby if their Euclidean distance is less than max_distance"""
    distance = sqrt((point_b[0] - point_a[0])**2 + (point_b[1] - point_a[1])**2)
    return distance < max_distance

# For each point, find nearby points that are within a radius of 5
nearby_points = defaultdict(list)
for point in points:
    for neighbour in points:
        if point != neighbour:
            if is_nearby(point, neighbour):
                nearby_points[point].append(neighbour)
Is there any way I can index points to make the above search faster? I feel there must be some faster way than O(len(points)**2).
Edit: the points in general could be floats, not just ints

This is a version with a fixed grid where each grid point holds the number of samples located there.
The search can then be reduced to just the space around the point in question.
from random import randint
import math

N = 100
N_SAMPLES = 1000

# create the grid
grd = [[0 for _ in range(N)] for __ in range(N)]

# set the number of points at a given gridpoint
for _ in range(N_SAMPLES):
    grd[randint(0, N - 1)][randint(0, N - 1)] += 1

def find_neighbours(grid, point, distance):
    # this will be: (x, y): number of points there
    points = {}
    # +1 so the upper end of the range is included, like the lower end
    for x in range(point[0]-distance, point[0]+distance+1):
        if x < 0 or x > N-1:
            continue
        for y in range(point[1]-distance, point[1]+distance+1):
            if y < 0 or y > N-1:
                continue
            dst = math.hypot(point[0]-x, point[1]-y)
            if dst > distance:
                continue
            if grid[x][y] > 0:
                points[(x, y)] = grid[x][y]
    return points

print(find_neighbours(grid=grd, point=(45, 36), distance=5))
# -> {(44, 37): 1, (45, 33): 1, ...}
# meaning: there is one neighbour at (44, 37) etc...
For further optimization: the tests for x and y could be precalculated for a given grid size; math.hypot(point[0]-x, point[1]-y) would then not have to be evaluated for every point.
It may also be a good idea to replace the grid with a numpy array.
UPDATE
If your points are floats you can still create an int grid to reduce the search space:
from random import uniform
from collections import defaultdict
import math

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    @property
    def x_int(self):
        return int(self.x)

    @property
    def y_int(self):
        return int(self.y)

    def __str__(self):
        fmt = '''{0.__class__.__name__}(x={0.x:5.2f}, y={0.y:5.2f})'''
        return fmt.format(self)

N = 100
MIN = 0
MAX = N-1
N_SAMPLES = 1000

# create the grid
grd = [[[] for _ in range(N)] for __ in range(N)]

# store each point in the list of its grid cell
for _ in range(N_SAMPLES):
    p = Point(x=uniform(MIN, MAX), y=uniform(MIN, MAX))
    grd[p.x_int][p.y_int].append(p)

def find_neighbours(grid, point, distance):
    # this will be: (x_int, y_int): list of points
    points = defaultdict(list)
    # need to cast a slightly bigger net on the upper end of the range;
    # int() rounds down
    for x in range(point[0]-distance, point[0]+distance+1):
        if x < 0 or x > N-1:
            continue
        for y in range(point[1]-distance, point[1]+distance+1):
            if y < 0 or y > N-1:
                continue
            dst = math.hypot(point[0]-x, point[1]-y)
            # a point can lie up to sqrt(2) away from its cell corner (x, y)
            if dst > distance + math.sqrt(2):
                continue
            for pt in grid[x][y]:
                # compare against the query point, not the cell corner
                if math.hypot(pt.x-point[0], pt.y-point[1]) <= distance:
                    points[(x, y)].append(pt)
    return points

res = find_neighbours(grid=grd, point=(45, 36), distance=5)
for int_point, points in res.items():
    print(int_point)
    for point in points:
        print('  ', point)
The output looks something like this:
(44, 36)
   Point(x=44.03, y=36.93)
(41, 36)
   Point(x=41.91, y=36.55)
   Point(x=41.73, y=36.53)
   Point(x=41.56, y=36.88)
...
For convenience Point is now a class; that may not be necessary, though.
Depending on how dense or sparse your points are, you could also represent the grid as a dictionary mapping each cell to a list of Points.
Also, the find_neighbours function accepts only a starting point consisting of ints in this version. This could be refined.
And there is much room for improvement: the range of the y axis can be restricted using trigonometry, and for the cells well inside the circle there is no need for an individual check; detailed checking only needs to be done close to the outer rim of the circle.
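The dictionary-based grid mentioned above could look roughly like this. It is a sketch under assumptions: points are plain (x, y) tuples, `build_index` and `neighbours` are made-up names, and the cell size is chosen equal to the search radius so that only a 3x3 block of cells has to be scanned.

```python
from collections import defaultdict
from math import ceil, floor, hypot
from random import uniform

def build_index(points, cell_size):
    """Map each grid cell (i, j) to the list of points that fall into it."""
    index = defaultdict(list)
    for p in points:
        index[(floor(p[0] / cell_size), floor(p[1] / cell_size))].append(p)
    return index

def neighbours(index, point, radius, cell_size):
    """Return all indexed points closer than radius to point (excluding itself)."""
    cx, cy = floor(point[0] / cell_size), floor(point[1] / cell_size)
    reach = ceil(radius / cell_size)  # how many cells to look in each direction
    found = []
    for i in range(cx - reach, cx + reach + 1):
        for j in range(cy - reach, cy + reach + 1):
            # .get() avoids creating empty cells in the defaultdict
            for q in index.get((i, j), ()):
                if q != point and hypot(q[0] - point[0], q[1] - point[1]) < radius:
                    found.append(q)
    return found

points = [(uniform(0, 100), uniform(0, 100)) for _ in range(1000)]
index = build_index(points, cell_size=5)
nearby_points = {p: neighbours(index, p, radius=5, cell_size=5) for p in points}
```

Because only occupied cells are stored, this works for float coordinates and for sparse or unbounded data without allocating an N x N grid.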

Generate random number outside of range in python

I'm currently working on a pygame game and I need to place objects randomly on the screen, except they cannot be within a designated rectangle. Is there an easy way to do this rather than continuously generating a random pair of coordinates until it's outside of the rectangle?
Here's a rough example of what the screen and the rectangle look like.
 ______________
|      __      |
|     |__|     |
|              |
|              |
|______________|
Where the screen size is 1000x800 and the rectangle is [x: 500, y: 250, width: 100, height: 75]
A more code oriented way of looking at it would be
x = random_int
0 <= x <= 1000
and
500 > x or 600 < x
y = random_int
0 <= y <= 800
and
250 > y or 325 < y

Partition the box into a set of sub-boxes.
Among the valid sub-boxes, choose which one to place your point in with probability proportional to their areas
Pick a random point uniformly at random from within the chosen sub-box.
This will generate samples from the uniform probability distribution on the valid region, based on the chain rule of conditional probability.
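The three steps above can be sketched as follows, assuming integer pixel coordinates and half-open boxes (x0 <= x < x1). The function name `sample_outside` and the choice of four sub-boxes (left, right, above, below) are illustrative; any partition of the valid region into rectangles works the same way.

```python
import random

def sample_outside(screen_w, screen_h, rx, ry, rw, rh):
    """Uniformly sample an integer point on the screen but outside the rectangle."""
    # Step 1: partition the valid region into boxes (x0, y0, x1, y1)
    boxes = [
        (0, 0, rx, screen_h),               # left of the rectangle, full height
        (rx + rw, 0, screen_w, screen_h),   # right of the rectangle, full height
        (rx, 0, rx + rw, ry),               # above, same columns as the rectangle
        (rx, ry + rh, rx + rw, screen_h),   # below, same columns as the rectangle
    ]
    # Drop empty boxes (e.g. when the rectangle touches a screen edge)
    boxes = [b for b in boxes if b[2] > b[0] and b[3] > b[1]]
    # Step 2: choose a box with probability proportional to its area
    areas = [(x1 - x0) * (y1 - y0) for x0, y0, x1, y1 in boxes]
    x0, y0, x1, y1 = random.choices(boxes, weights=areas, k=1)[0]
    # Step 3: choose a point uniformly inside that box
    return random.randrange(x0, x1), random.randrange(y0, y1)

print(sample_outside(1000, 800, 500, 250, 100, 75))
```

Each call needs a fixed number of random draws, and the boxes and areas could be computed once up front if many points are needed.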

This offers an O(1) approach in terms of both time and memory.
Rationale
The accepted answer along with some other answers seem to hinge on the necessity to generate lists of all possible coordinates, or recalculate until there is an acceptable solution. Both approaches take more time and memory than necessary.
Note that depending on the requirements for uniformity of coordinate generation, there are different solutions as is shown below.
First attempt
My approach is to randomly choose only valid coordinates around the designated box (think left/right, top/bottom), then select at random which side to choose:
import random

# set bounding boxes
maxx=1000
maxy=800
blocked_box = [(500, 250), (100, 75)]

# generate left/right, top/bottom and choose as you like
def gen_rand_limit(p1, dim):
    x1, y1 = p1
    w, h = dim
    x2, y2 = x1 + w, y1 + h
    left = random.randrange(0, x1)
    right = random.randrange(x2+1, maxx-1)
    top = random.randrange(0, y1)
    bottom = random.randrange(y2, maxy-1)
    return random.choice([left, right]), random.choice([top, bottom])

# check boundary conditions are met
def check(x, y, p1, dim):
    x1, y1 = p1
    w, h = dim
    x2, y2 = x1 + w, y1 + h
    assert 0 <= x <= maxx, "0 <= x(%s) <= maxx(%s)" % (x, maxx)
    assert x1 > x or x2 < x, "x1(%s) > x(%s) or x2(%s) < x(%s)" % (x1, x, x2, x)
    assert 0 <= y <= maxy, "0 <= y(%s) <= maxy(%s)" %(y, maxy)
    assert y1 > y or y2 < y, "y1(%s) > y(%s) or y2(%s) < y(%s)" % (y1, y, y2, y)

# sample
points = []
for i in range(1000):
    x,y = gen_rand_limit(*blocked_box)
    check(x, y, *blocked_box)
    points.append((x,y))
Results
Given the constraints as outlined in the OP, this actually produces random coordinates (blue) around the designated rectangle (red) as desired, however leaves out any of the valid points that are outside the rectangle but fall within the respective x or y dimensions of the rectangle:
# visual proof via matplotlib
import matplotlib
from matplotlib import pyplot as plt
from matplotlib.patches import Rectangle
X,Y = zip(*points)
fig = plt.figure()
ax = plt.scatter(X, Y)
p1 = blocked_box[0]
w,h = blocked_box[1]
rectangle = Rectangle(p1, w, h, fc='red', zorder=2)
ax = plt.gca()
plt.axis((0, maxx, 0, maxy))
ax.add_patch(rectangle)
Improved
This is easily fixed by limiting only either x or y coordinates (note that check is no longer valid, comment to run this part):
def gen_rand_limit(p1, dim):
    x1, y1 = p1
    w, h = dim
    x2, y2 = x1 + w, y1 + h
    # should we limit x or y?
    limitx = random.choice([0,1])
    limity = not limitx
    # generate x, y O(1)
    if limitx:
        left = random.randrange(0, x1)
        right = random.randrange(x2+1, maxx-1)
        x = random.choice([left, right])
        y = random.randrange(0, maxy)
    else:
        x = random.randrange(0, maxx)
        top = random.randrange(0, y1)
        bottom = random.randrange(y2, maxy-1)
        y = random.choice([top, bottom])
    return x, y
Adjusting the random bias
As pointed out in the comments this solution suffers from a bias given to points outside the rows/columns of the rectangle. The following fixes that in principle by giving each coordinate the same probability:
def gen_rand_limit(p1, dim):
    x1, y1 = p1
    w, h = dim
    x2, y2 = x1 + w, y1 + h
    # generate x, y O(1)
    # --x
    left = random.randrange(0, x1)
    right = random.randrange(x2+1, maxx)
    withinx = random.randrange(x1, x2+1)
    # adjust probability of a point outside the box columns
    # a point outside has probability (1/(maxx-w)) vs. a point inside has 1/w
    # the same is true for rows. adjpx/adjpy adjust for this probability
    # (integer division, so the result can be used to repeat a list)
    adjpx = (maxx - w) // w // 2
    x = random.choice([left, right] * adjpx + [withinx])
    # --y
    top = random.randrange(0, y1)
    bottom = random.randrange(y2+1, maxy)
    withiny = random.randrange(y1, y2+1)
    if x == left or x == right:
        adjpy = (maxy - h) // h // 2
        y = random.choice([top, bottom] * adjpy + [withiny])
    else:
        y = random.choice([top, bottom])
    return x, y
The following plot has 10'000 points to illustrate the uniform placement of points (the points overlaying the box's border are due to point size).
Disclaimer: Note that this plot places the red box in the very middle such that top/bottom and left/right have the same probability among each other. The adjustment thus is relative to the blocking box, but not for all areas of the graph. A final solution requires adjusting the probabilities for each of these separately.
Simpler solution, yet slightly modified problem
It turns out that adjusting the probabilities for different areas of the coordinate system is quite tricky. After some thinking I came up with a slightly modified approach:
Realizing that on any 2D coordinate system blocking out a rectangle divides the area into N sub-areas (N=8 in the case of the question) where a valid coordinate can be chosen. Looking at it this way, we can define the valid sub-areas as boxes of coordinates. Then we can choose a box at random and a coordinate at random from within that box:
def gen_rand_limit(p1, dim):
    x1, y1 = p1
    w, h = dim
    x2, y2 = x1 + w, y1 + h
    # generate x, y O(1)
    boxes = (
        ((0,0),(x1,y1)), ((x1,0),(x2,y1)), ((x2,0),(maxx,y1)),
        ((0,y1),(x1,y2)), ((x2,y1),(maxx,y2)),
        ((0,y2),(x1,maxy)), ((x1,y2),(x2,maxy)), ((x2,y2),(maxx,maxy)),
    )
    box = boxes[random.randrange(len(boxes))]
    x = random.randrange(box[0][0], box[1][0])
    y = random.randrange(box[0][1], box[1][1])
    return x, y
Note this is not generalized, as the blocked box may not be in the middle, in which case the boxes would look different. As this results in each box being chosen with the same probability, we get the same number of points in each box. Obviously the density is higher in smaller boxes.
If the requirement is to generate a uniform distribution among all possible coordinates, the solution is to calculate boxes such that each box is about the same size as the blocking box. YMMV

I've already posted a different answer that I still like, as it is simple and
clear, and not necessarily slow... at any rate it's not exactly what the OP asked for.
I thought about it and I devised an algorithm for solving the OP's problem within their constraints:
partition the screen in 9 rectangles around and comprising the "hole".
consider the 8 rectangles ("tiles") around the central hole
for each tile, compute the origin (x, y), the height and the area in pixels
compute the cumulative sum of the areas of the tiles, as well as the total area of the tiles
for each extraction, choose a random number between 0 and the total area of the tiles (inclusive and exclusive)
using the cumulative sums determine in which tile the random pixel lies
using divmod determine the column and the row (dx, dy) in the tile
using the origins of the tile in the screen coordinates, compute the random pixel in screen coordinates.
To implement the ideas above, in which there is an initialization phase in which we compute static data and a phase in which we repeatedly use those data, the natural data structure is a class, and here is my implementation:
from random import randrange
class make_a_hole_in_the_screen():

    def __init__(self, screen, hole_orig, hole_sizes):
        xs, ys = screen
        x, y = hole_orig
        wx, wy = hole_sizes
        tiles = [(_y,_x*_y) for _x in [x,wx,xs-x-wx] for _y in [y,wy,ys-y-wy]]
        self.tiles = tiles[:4] + tiles[5:]
        self.pixels = [tile[1] for tile in self.tiles]
        self.total = sum(self.pixels)
        self.boundaries = [sum(self.pixels[:i+1]) for i in range(8)]
        self.x = [0, 0, 0,
                  x, x,
                  x+wx, x+wx, x+wx]
        self.y = [0, y, y+wy,
                  0, y+wy,
                  0, y, y+wy]

    def choose(self):
        n = randrange(self.total)
        for i, tile in enumerate(self.tiles):
            if n < self.boundaries[i]: break
        n1 = n - ([0]+self.boundaries)[i]
        dx, dy = divmod(n1,self.tiles[i][0])
        return self.x[i]+dx, self.y[i]+dy
To test the correctness of the implementation, here is a rough check that I ran on Python 2.7 (the print calls below also work on Python 3):
drilled_screen = make_a_hole_in_the_screen((200,100),(30,50),(20,30))
for i in range(1000000):
    x, y = drilled_screen.choose()
    if 30<=x<50 and 50<=y<80: print("***", x, y)
    if x<0 or x>=200 or y<0 or y>=100: print("+++", x, y)
A possible optimization consists in using a bisection algorithm to find the relevant tile in place of the simpler linear search that I've implemented.

It requires a bit of thought to generate a uniformly random point with these constraints. The simplest brute force way I can think of is to generate a list of all valid points and use random.choice() to select from this list. This uses a few MB of memory for the list, but generating a point is very fast:
import random
screen_width = 1000
screen_height = 800
rect_x = 500
rect_y = 250
rect_width = 100
rect_height = 75
valid_points = []
for x in range(screen_width):
    if rect_x <= x < (rect_x + rect_width):
        for y in range(rect_y):
            valid_points.append( (x, y) )
        for y in range(rect_y + rect_height, screen_height):
            valid_points.append( (x, y) )
    else:
        for y in range(screen_height):
            valid_points.append( (x, y) )

for i in range(10):
    rand_point = random.choice(valid_points)
    print(rand_point)
It is possible to generate a random number and map it to a valid point on the screen, which uses less memory, but it is a bit messy and takes more time to generate the point. There might be a cleaner way to do this, but one approach using the same screen size variables as above is here:
rand_max = (screen_width * screen_height) - (rect_width * rect_height)
def rand_point():
    rand_raw = random.randint(0, rand_max-1)
    x = rand_raw % screen_width
    y = rand_raw // screen_width
    if rect_y <= y < rect_y+rect_height and rect_x <= x < rect_x+rect_width:
        rand_raw = rand_max + (y-rect_y) * rect_width + (x-rect_x)
        x = rand_raw % screen_width
        y = rand_raw // screen_width
    return (x, y)
The logic here is similar to the inverse of the way that screen addresses are calculated from x and y coordinates on old 8 and 16 bit microprocessors. The variable rand_max is equal to the number of valid screen coordinates. The x and y co-ordinates of the pixel are calculated, and if it is within the rectangle the pixel is pushed above rand_max, into the region that couldn't be generated with the first call.
If you don't care too much about the point being uniformly random, this solution is easy to implement and very quick. The x values are random, but the y value is constrained if the chosen x is in the column with the rectangle, so the pixels above and below the rectangle will have a higher probability of being chosen than pixels to the left and right of the rectangle:
def pseudo_rand_point():
    x = random.randint(0, screen_width-1)
    if rect_x <= x < rect_x + rect_width:
        y = random.randint(0, screen_height-rect_height-1)
        if y >= rect_y:
            y += rect_height
    else:
        y = random.randint(0, screen_height-1)
    return (x, y)
Another answer was calculating the probability that the pixel is in certain regions of the screen, but their answer isn't quite correct yet. Here's a version using a similar idea, calculate the probability that the pixel is in a given region and then calculate where it is within that region:
valid_screen_pixels = screen_width*screen_height - rect_width * rect_height
prob_left = float(rect_x * screen_height) / valid_screen_pixels
prob_right = float((screen_width - rect_x - rect_width) * screen_height) / valid_screen_pixels
prob_above_rect = float(rect_y) / (screen_height-rect_height)
def generate_rand():
    ymin, ymax = 0, screen_height-1
    xrand = random.random()
    if xrand < prob_left:
        xmin, xmax = 0, rect_x-1
    elif xrand > (1-prob_right):
        xmin, xmax = rect_x+rect_width, screen_width-1
    else:
        xmin, xmax = rect_x, rect_x+rect_width-1
        yrand = random.random()
        if yrand < prob_above_rect:
            ymax = rect_y-1
        else:
            ymin = rect_y+rect_height
    # the bounds are inclusive, so use randint rather than randrange
    x = random.randint(xmin, xmax)
    y = random.randint(ymin, ymax)
    return (x, y)

If it's the generation of new random numbers you want to avoid, rather than the loop, you can do the following:
1. Generate a pair of random floating point coordinates in [0,1].
2. Scale the coordinates to give a point in the outer rectangle.
3. If your point is outside the inner rectangle, return it.
4. Rescale to map the inner rectangle to the outer rectangle.
5. Go to step 3.
This will work best if the inner rectangle is small compared to the outer rectangle. It should probably also be limited to going through the loop only some maximum number of times before generating new random numbers and trying again.
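A possible sketch of that procedure, returning float coordinates. The function name `point_outside` and the `max_rescales` limit of 20 are assumptions made for illustration; the limit only guards against running out of floating point precision after many rescalings.

```python
import random

def point_outside(outer_w, outer_h, rx, ry, rw, rh, max_rescales=20):
    """Sample a float point in the outer rectangle but outside the inner one."""
    while True:
        # Step 1: one pair of random coordinates in [0, 1)
        u, v = random.random(), random.random()
        for _ in range(max_rescales):
            # Step 2: scale to the outer rectangle
            x, y = u * outer_w, v * outer_h
            # Step 3: accept the point if it is outside the inner rectangle
            if not (rx <= x < rx + rw and ry <= y < ry + rh):
                return x, y
            # Step 4: the point is uniform within the inner rectangle, so its
            # relative position there is again uniform in [0, 1); reuse it
            u, v = (x - rx) / rw, (y - ry) / rh
        # Too many rescalings: fall through and draw fresh random numbers

print(point_outside(1000, 800, 500, 250, 100, 75))
```

A point that lands inside the inner rectangle is uniformly distributed within it, which is why its relative position can be reused as a fresh sample instead of being thrown away.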
