How to check points fall in rectangles and vice versa? - python

Sorry if the title doesn't make it clear.
Here is the more detailed situation.
Given n dots and n rectangles.
Rectangles can overlap.
Dots are represented as (x,y)
Rectangles are represented as (x,y,w,h)
x,y refer to location in x and y axes, respectively
w,h refer to width and height, respectively
How do I check whether the following two conditions are met simultaneously:
each dot falls in a certain rectangle (doesn't matter which)
AND
each rectangle contains at least one dot.
Is there a better way instead of iterating through each dot and each rectangle?
It would be best if you can show me how to do this in python.
Thanks!

I think you can use what are called oriented surfaces (signed areas), which I believe are due to the mathematician Gauss. They allow you to calculate the area of any polygon. Using the point to test as a fifth point and one other rectangle point as a sixth point (a duplicate), you can calculate a new area for this new six-sided polygon. You will obtain the same area or a bigger area depending on the position of the point relative to the rectangle.
Addendum
Oriented surfaces allow you to calculate the area of any polygon when you know the coordinates of its vertices. The polygon must be defined as a set of points P(Xp,Yp) in the specific order describing the contour. Two consecutive points will be connected by a line.
In the picture below the polygon can be defined as the set [A,B,C,D], but also as [C,D,A,B] or [B,A,D,C].
It cannot be defined as [A,C,B,D] since this would define a polygon shaped like a butterfly wings as shown below.
Oriented Surfaces
For each pair of successive points in order - meaning [A,B], [B,C], [C,D], [D,A] for the defined set [A,B,C,D], for example - the formula allows us to calculate the area of the triangle formed by that pair and the axis origin. This surface is oriented - meaning it has a positive or a negative value - according to the direction of rotation (clockwise or counter-clockwise). In the figure below the triangles (OAB), (OBC) and (ODA) will have a negative area, while the triangle (OCD) will have a positive area. By adding all those areas, one can see that the result is the area of the polygon (A,B,C,D), which is negative because it is drawn clockwise.
Calculations
You can find a clear example of the calculations and try a few things here: https://www.mathopenref.com/coordpolygonarea.html. To complete my example I have drawn a polygon similar (but not identical) to the ones above on this website and the result is as follows: -22
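For reference, here is a minimal sketch of the signed-area calculation described above (often called the shoelace formula). The function name `signed_area` is only illustrative. With the usual y-up axes, a clockwise contour gives a negative result, which matches the sign convention used in this answer.

```python
def signed_area(polygon):
    """Signed area of a polygon given as an ordered list of (x, y) vertices.

    Sums the oriented areas of the triangles (O, P_i, P_i+1).
    Negative for a clockwise contour, positive for counter-clockwise.
    """
    total = 0.0
    n = len(polygon)
    for i in range(n):
        x0, y0 = polygon[i]
        x1, y1 = polygon[(i + 1) % n]  # wrap around to close the contour
        total += x0 * y1 - x1 * y0
    return total / 2.0


# A 2x2 square listed clockwise, then counter-clockwise.
square_cw = [(0, 0), (0, 2), (2, 2), (2, 0)]
print(signed_area(square_cw))        # negative
print(signed_area(square_cw[::-1]))  # positive
```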
Adding a point
When you add a point, which is the point you want to test, you will obtain a 5-point polygon. The first thing you have to do is to place it in the correct order so that you don't have segments crossing. To do that you can create a loop where the new point P is placed successively at the different positions in the set - meaning (PABCD), then (APBCD), etc. until (ABCDP) - and calculate the area for each. The set giving you the maximum area in absolute value is the one you keep.
Here is an example from the website https://rechneronline.de/pi/simple-polygon.php. The first polygon is the initial, the second is badly defined and the last one is correctly defined.
One can see that if the added point is outside the original polygon then the area is increased. Conversely, if the added point is inside the original polygon, the area is decreased.
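The insertion loop described above could be sketched as follows. This is only an illustration under my own naming (`signed_area`, `area_with_point`), and it assumes the original polygon is already correctly ordered.

```python
def signed_area(polygon):
    """Shoelace formula: signed area of an ordered list of (x, y) vertices."""
    n = len(polygon)
    total = 0.0
    for i in range(n):
        x0, y0 = polygon[i]
        x1, y1 = polygon[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return total / 2.0


def area_with_point(polygon, p):
    """Try P at every position in the vertex list and keep the largest |area|."""
    best = 0.0
    for i in range(len(polygon)):
        candidate = polygon[:i] + [p] + polygon[i:]
        best = max(best, abs(signed_area(candidate)))
    return best


square = [(0, 0), (0, 2), (2, 2), (2, 0)]    # area 4
inside = area_with_point(square, (1, 1))     # smaller than 4
outside = area_with_point(square, (3, 1))    # larger than 4
print(inside, outside)
```

Comparing the result with the absolute area of the original polygon then tells you on which side the point lies.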
Note
If the original point set is not ordered correctly, you will have to reorder it as described just above
In Python you will have to use ordered object such as a list
To check that each rectangle has at least a point inside, you will have to check each point against all rectangles and maintain a dictionary describing which point is inside which rectangle
Adding: I also realized that since a rectangle is convex, it is possible to know whether a point P is inside by just checking the four oriented triangle areas in order, namely (ABP), (BCP), (CDP) and (DAP). If those four areas have the same sign then P is inside the rectangle (ABCD), otherwise it is outside.
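A rough sketch of that sign test, combined with the bookkeeping dictionary mentioned in the notes, might look like this. The function names are hypothetical, and two things are assumptions on my part: that (x, y) in the question is one corner of the rectangle with w and h extending in the positive directions, and that a dot lying exactly on an edge counts as inside. It still checks every dot against every rectangle.

```python
def cross(a, b, p):
    """Twice the oriented area of triangle (A, B, P)."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def point_in_convex_quad(p, corners):
    """True if P is inside (or on the edge of) the convex quad A, B, C, D."""
    signs = [cross(corners[i], corners[(i + 1) % 4], p) for i in range(4)]
    return all(s >= 0 for s in signs) or all(s <= 0 for s in signs)


def rect_corners(x, y, w, h):
    # Assumption: (x, y) is one corner; the question does not say which.
    return [(x, y), (x + w, y), (x + w, y + h), (x, y + h)]


def dots_and_rects_match(dots, rects):
    """Both conditions: every dot is in some rect, every rect holds a dot."""
    dots_in_rect = {i: [] for i in range(len(rects))}
    covered_dots = set()
    for d_i, dot in enumerate(dots):
        for r_i, rect in enumerate(rects):
            if point_in_convex_quad(dot, rect_corners(*rect)):
                dots_in_rect[r_i].append(d_i)
                covered_dots.add(d_i)
    return len(covered_dots) == len(dots) and all(dots_in_rect.values())


print(dots_and_rects_match([(1, 1), (5, 5)], [(0, 0, 2, 2), (4, 4, 2, 2)]))
```

For axis-aligned rectangles a plain coordinate comparison would do the same job; the sign test has the advantage that it also works for rotated rectangles.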

Related

How do I find the density of a list of points given latitude and longitude in a 5 mile radius in Python Pandas?

I am trying to come up with a calculation that creates a column that comes up with a number that shows density for that specific location in a 5 mile radius, i.e if there are many other locations near it or not. I would like to compare these locations with themselves to achieve this.
I'm not familiar with the math needed to achieve this and have tried to find a solution for some time now.
OK, I'm not entirely clear on what your problem is, but I will try to give you my approach.
Let's first assume that the area you are querying for points is small enough to be considered flat, so the geo coordinates of your area can basically be treated as cartesian coordinates.
You choose your circle's center as (x,y) and then you have to find which of your points are within the radius of your circle: in cartesian coordinates, being inside a circle means that the distance of the point from your center is smaller than a given radius. You save those points in the data structure of your choice, and the density will probably be the number of those points divided by the area of the circle.
I hope I understood the problem correctly!
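As a rough illustration of this approach, the sketch below assumes the locations have already been converted to flat (x, y) coordinates in miles; that conversion is not shown. The name `local_density` is made up, and I chose not to count a point as its own neighbour, which is a judgment call.

```python
import math


def local_density(points, radius=5.0):
    """For each point, count the OTHER points within `radius`, per unit area.

    `points` is a list of (x, y) pairs in flat coordinates (assumed miles).
    """
    circle_area = math.pi * radius ** 2
    densities = []
    for i, (xi, yi) in enumerate(points):
        neighbours = sum(
            1
            for j, (xj, yj) in enumerate(points)
            if j != i and math.hypot(xj - xi, yj - yi) <= radius
        )
        densities.append(neighbours / circle_area)
    return densities


# Two locations one mile apart and one isolated location.
print(local_density([(0, 0), (1, 0), (20, 0)]))
```

This compares every pair of points, so it is quadratic in the number of locations; for a large table a spatial index would be worth looking at.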

Tower of colored cubes

Consider a set of n cubes with colored facets (each one with a specific color out of 4 possible ones - red, blue, green and yellow). Form the highest possible tower of k cubes ( k ≤ n ) properly rotated (12 positions of a cube), so the lateral faces of the tower will have the same color, using an evolutionary algorithm.
What I did so far:
I thought that the following representation would be suitable: an Individual could be an array of n integers, each number having a value between 1 and 12, indicating the current position of the cube (an input file contains n lines, each line shows information about the color of each face of the cube).
Then, the Population consists of multiple Individuals.
The Crossover method should create a new child(Individual), containing information from its parents (approximately half from each parent).
Now, my biggest issue is related to the Mutate and Fitness methods.
In the Mutate method, with some probability of mutation (say 0.01), I should change the position of a random cube to another random position (for example, the third cube can have its position (rotation) changed from 5 to 12).
In Fitness method, I thought that I could compare, two by two, the cubes from an Individual, to see if they have common faces. If they have a common face, a "count" variable will be incremented with the number of common faces and if all the 4 lateral faces will be the same for these 2 cubes, the count will increase with another number of points. After comparing all the adjacent cubes, the count variable is returned. Our goal is to obtain as many adjacent cubes having the same lateral faces as we can, i.e. to maximize the Fitness method.
My question is the following:
How can a rotation be implemented? I mean, if a cube changes its position (rotation) from 3 to 10, how do we know the new arrangement of the faces? Or, if I perform a mutation on a cube, what is the process of rotating this cube if a random rotation number is selected?
I think that I should create a vector of 6 elements (the colors of each face) for each cube, but when the rotation value of a cube is modified, I don't know in what manner the elements of its vector of faces should be rearranged.
Shuffling them is not correct, because by doing this, two opposite faces could become adjacent, meaning that the vector doesn't represent that particular cube anymore (obviously, two opposite faces cannot be adjacent).
First, I'm not sure how you get 12 rotations; I get 24: 4 orientations with each of the 6 faces on the bottom. Use a standard D6 (6-sided die) and see how many different layouts you get.
Apparently, the first thing you need to build is something (a class?) that accurately represents a cube in any of the available orientations. I suggest that you use a simple structure that can return the four faces in order -- say, front-right-back-left -- given a cube and the rotation number.
I think you can effectively represent a cube as three pairs of opposing sides. Once you've represented that opposition, the remaining organization is arbitrary numbering: any valid choice is isomorphic to any other. Each rotation will produce an interleaved sequence of two opposing pairs. For instance, a standard D6 has opposing pairs [(1, 6), (2, 5), (3, 4)]. The first 8 rotations would put 1 and 6 on the hidden faces (top and bottom), giving you the sequence 2354 in each of its 4 rotations and their reverses.
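One possible way to build such a structure is sketched below. The face order (top, bottom, front, back, left, right), the function names, and the numbering of the orientations are all arbitrary choices of mine. The 24 orientations are generated from two quarter turns, so opposite faces always stay opposite, unlike shuffling.

```python
def turn(c):
    """Quarter turn about the vertical axis: top and bottom stay put."""
    u, d, f, b, l, r = c
    return (u, d, r, l, f, b)


def roll(c):
    """Quarter turn about the left-right axis: left and right stay put."""
    u, d, f, b, l, r = c
    return (f, b, d, u, l, r)


def build_orientations():
    """All face-index permutations reachable by combining turn and roll."""
    seen = [tuple(range(6))]
    for perm in seen:  # the list grows while we iterate over it
        for move in (turn, roll):
            nxt = move(perm)
            if nxt not in seen:
                seen.append(nxt)
    return seen


ORIENTATIONS = build_orientations()  # 24 entries; index 0 is "unrotated"


def lateral_faces(cube, rotation):
    """Front, right, back, left faces of `cube` in orientation `rotation`."""
    u, d, f, b, l, r = (cube[i] for i in ORIENTATIONS[rotation])
    return (f, r, b, l)


# A standard D6 with opposing pairs (1, 6), (2, 5), (3, 4).
die = (1, 6, 2, 5, 3, 4)
print(len(ORIENTATIONS), lateral_faces(die, 0))
```

A mutation then only has to pick a new rotation number; the face arrangement follows from the lookup table.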
That class is one large subsystem of your problem; the other, the genetic algorithm, you seem to have well in hand. Stack all of your cubes randomly; "fitness" is a count of the most prevalent 4-show (sequence of 4 sides) in the stack. At the start, this will generally be 1, as nothing will match.
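The fitness described here could be sketched like this, assuming each cube in the stack has already been reduced to its 4-show, i.e. a tuple of its front, right, back and left colors. The function name and the sample colors are only illustrative.

```python
from collections import Counter


def fitness(shows):
    """Count of the most prevalent 4-show in the stack (0 for an empty stack)."""
    if not shows:
        return 0
    return max(Counter(shows).values())


stack = [
    ("red", "blue", "green", "yellow"),
    ("red", "blue", "green", "yellow"),
    ("blue", "green", "yellow", "red"),
]
print(fitness(stack))
```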
From there, you seem to have an appropriate handle on mutation. You might give a higher chance of mutating a non-matching cube, or perhaps see if some cube is a half-match: two opposite faces match the "best fit" 4-show, so you merely rotate it along that axis, preserving those two faces, and swapping the other pair for the top-bottom pair (note: two directions to do that).
Does that get you moving?

How to efficiently find the top border line of a graph in python

I have a set of graphs from which I want to find an outline graph (Black line in this figure.)
Finding the maximum of each graph at all points on the x-axis is not possible because the x-values are not the same for all the graphs. The points are accurate to a couple of decimal places. This figure might help to understand it better.
I tried converting each graph to a polygon and using shapely cascaded_union and then cropping off the bottom.
It works for a small number of graphs, but when the number of graphs becomes large, it takes a lot of time.
Is there some other efficient way to do this?
Sort all your points by their x coordinate.
Your final output will have a finite number of pixels. You can compute the range of x values that fall within each pixel (a small range, but not 0). So split your points into buckets. Since they are already sorted, you just need to advance through the list until the values belong to the next range.
For each pixel column compute the maximum y value you find. Add a point at that (x, y) for the black line.
The complexity of this will be O(N log N), dominated by the sort.
If you have gaps on the x axis you can choose to either skip it and have a gap in the black line or simply interpolate between the neighbouring values. If you plot the blackline as a collection of line segments you can just skip generating a point for that column and let the renderer do the interpolation for you.
If your original points are too sparse (they skip pixels) your line may look jagged (jumping up and down). You can avoid this by adding interpolated values for the functions that don't have a point in that range. Linear interpolation should work just fine. Make sure you try to generate a point at the beginning and the end of the interval and take the larger y-value.
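Here is a small sketch of the sort-and-bucket idea, with the points of all graphs pooled into one list. The function name and parameters (`x_min`, `x_max`, `n_columns`) are assumptions for illustration; empty columns are simply skipped, and no interpolation is done.

```python
from itertools import groupby


def upper_envelope(points, x_min, x_max, n_columns):
    """Highest y per pixel column, for (x, y) points pooled from all graphs."""
    width = (x_max - x_min) / n_columns

    def column(point):
        # Clamp so that x == x_max falls into the last column.
        return min(int((point[0] - x_min) / width), n_columns - 1)

    envelope = []
    # Sorting by x keeps each column's points contiguous for groupby.
    for col, group in groupby(sorted(points), key=column):
        y_top = max(y for _, y in group)
        envelope.append((x_min + (col + 0.5) * width, y_top))
    return envelope


pts = [(0.1, 1), (0.2, 3), (1.5, 2), (3.9, 5), (3.1, 4)]
print(upper_envelope(pts, 0, 4, 4))
```

Each output point is placed at the centre of its column, which is one of several reasonable choices.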

Measuring rectangles at odd angles with a low resolution input matrix (Linear regression classification?)

I'm trying to solve the following problem:
Given an input of, say,
0000000000000000
0011111111110000
0011111111110000
0011111111110000
0000000000000000
0000000111111110
0000000111111110
0000000000000000
I need to find the width and height of all rectangles in the field. The input is actually a single column at a time (think like a scanner moves from left to right) and is continuous for the duration of the program (that is, the scanning column doesn't move, but the rectangles move over it).
In this example, I can 'wait for a rectangle to begin' (that is, watch for zeros changing to 1s) and then watch it end (ones back to zeros) and measure the piece in 'grid units'. This will work fine for the simple case outlined above, but will fail if the rectangle is tilted at an angle, for example:
0000000000000000
0000011000000000
0000111100000000
0001111111000000
0000111111100000
0000011111110000
0000000111100000
0000000011000000
I had originally thought that the following question would apply:
Dynamic programming - Largest square block
but now I'm not so sure.
I have little to no experience with regression or regression testing, but I think that I could represent this as an input of 8 variables.....
Well, to be honest, I'm not sure how I would do this at all. The sizes that this part of the code extracts need to be fitted against rectangles of known sizes (i.e., from a database).
I initially thought I could feed the known data as training exercises and store the positive test results, but I'm really not sure where to go from here.
Thanks for any advice you might have.
Collect the transition points (from a 1 to a 0 or vice-versa) as you're scanning, then figure the length and width either directly from there, or from the convex hull of each object.
If rectangles can overlap, then you'll have bigger issues.
I'd take following steps:
get all columns together in a matrix (this is needed for proper filtering)
now apply a filter (need to google for it a bit) to sharpen edges and corners
create some structure to hold data for next steps (this can have many different solutions, choose your favorite and/or optimal)
scan vertically (column by column) and for each segment of consecutive 'ones' found in a column (segment means you have found its start and end y coordinates) do:
check whether this segment overlaps some segment in the previous column
if it does not, consider this a new rect. Create a rect object and assign its handle to the segment. For the new rect, update its metrics (this operation takes just the segment's coordinates - x, ymin, ymax - and will be discussed later)
if it does, assume this is the same rect: take the rect's handle, assign this handle to the current segment, then get the rect by its handle and update its metrics
That's pretty much it. After this you will have a pool of rect objects, each having the four coordinates of its corners. Do some primitive math to approximate each rect's width and height.
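The column scan in the steps above could be sketched roughly as follows. This simplified version skips the filtering step and, instead of the full set of metrics discussed next, only tracks each rect's bounding extents, so it measures axis-aligned rects only. The names are hypothetical, the list index of a rect plays the role of its handle, and two shapes that merge in a later column are not handled.

```python
def runs_of_ones(column):
    """(ymin, ymax) for every segment of consecutive '1' cells in a column."""
    runs, start = [], None
    for y, cell in enumerate(column):
        if cell == "1" and start is None:
            start = y
        elif cell != "1" and start is not None:
            runs.append((start, y - 1))
            start = None
    if start is not None:
        runs.append((start, len(column) - 1))
    return runs


def track_rects(columns):
    """Link segments across columns; return bounding extents per rect."""
    rects = []      # a rect's index in this list is its handle
    previous = []   # (ymin, ymax, handle) for segments of the last column
    for x, column in enumerate(columns):
        current = []
        for ymin, ymax in runs_of_ones(column):
            handle = next(
                (h for pmin, pmax, h in previous if ymin <= pmax and pmin <= ymax),
                None,
            )
            if handle is None:  # no overlap with previous column: new rect
                handle = len(rects)
                rects.append({"xmin": x, "xmax": x, "ymin": ymin, "ymax": ymax})
            rect = rects[handle]
            rect["xmax"] = x
            rect["ymin"] = min(rect["ymin"], ymin)
            rect["ymax"] = max(rect["ymax"], ymax)
            current.append((ymin, ymax, handle))
        previous = current
    return rects


rows = [
    "0000000000000000",
    "0011111111110000",
    "0011111111110000",
    "0011111111110000",
    "0000000000000000",
    "0000000111111110",
    "0000000111111110",
    "0000000000000000",
]
columns = ["".join(col) for col in zip(*rows)]  # scanner sees one column at a time
for r in track_rects(columns):
    print(r["xmax"] - r["xmin"] + 1, "x", r["ymax"] - r["ymin"] + 1)
```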
So where is the magic? Well, it all happens in the update rect metrics routine.
For each rect we have 13 metrics:
min X => ymin1, ymax1
max X => ymin2, ymax2
min Y => xmin1, xmax1
max Y => xmin2, xmax2
average vertical segment length
First of all we have to determine whether this rect is properly aligned within our scan grid. To do this we compare the values of average vertical segment length and max Y - min Y. If they are the same (I'd choose a threshold around 97%, and then tune it for the best results), then we assume the following coordinates for our rect:
(min X, max Y)
(min X, min Y)
(max X, max Y)
(max X, min Y).
Otherwise our rect is rotated, and in this case we take its coordinates as follows:
(min X, (ymin1+ymax1)/2)
((xmin1+xmax1)/2, min Y)
(max X, (ymin2+ymax2)/2)
((xmin2+xmax2)/2, max Y)
I posed this question to a friend, and he suggested:
When seeing a 1 for the first time, store it as a new shape. Flood fill it to the right, and add those points to the same shape.
Any input pixel that isn't in a shape now is a new shape. Do the same flood fill.
On the next input column, flood again from the original shape points. Add new pixels to the corresponding shape.
If any flood fill does not add any new pixels for two consecutive columns, you have a completed shape. Move on, and try to determine its dimensions.
This then leaves us with getting the dimensions for a shape we isolated (like in example 2).
For this, we thought up:
If the number of leftmost pixels in the shape is below the average number of pixels per column, then the piece is probably rotated. Thus, find the corners by getting the outermost pixels. Use the distance formula between all of them. Largest = hypotenuse (the diagonal), others = width or height.
Otherwise, this piece is probably perfectly aligned, so the corners are probably just the top-left-most pixel, bottom-right-most pixel, etc.
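The distance step for a rotated piece could look like the sketch below, assuming the four corner pixels have already been found (finding them is not shown). The function name `rect_dims` is illustrative. Measured from any one corner, the largest of the three distances is the diagonal and the other two are the sides.

```python
import math


def rect_dims(corners):
    """Two side lengths of a rectangle given its four corners in any order."""
    first, *others = corners
    distances = sorted(
        math.hypot(cx - first[0], cy - first[1]) for cx, cy in others
    )
    # distances[2] is the diagonal (the "hypotenuse"); the rest are sides.
    return distances[0], distances[1]


# A 5 x 10 rectangle rotated off the grid axes.
print(rect_dims([(0, 0), (4, 3), (-2, 11), (-6, 8)]))
```

On a low resolution grid the corner pixels are only approximate, so the result is best matched against the known sizes with some tolerance.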
What do you all think?

Finding all points common to two circles

In Python, how would one find all integer points common to two circles?
For example, imagine a Venn diagram-like intersection of two (equally sized) circles, with center-points (x1,y1) and (x2,y2) and radii r1=r2. Additionally, we already know the two points of intersection of the circles are (xi1,yi1) and (xi2,yi2).
How would one generate a list of all points (x,y) contained in both circles in an efficient manner? That is, it would be simple to draw a box containing the intersections and iterate through it, checking if a given point is within both circles, but is there a better way?
Keep in mind that there are four cases here.
Neither circle intersects, meaning the "common area" is empty.
One circle resides entirely within the other, meaning the "common area" is the smaller/interior circle. Also note that a degenerate case of this is when the two circles are identical, which would have to be the case here given that you specified equal-diameter circles.
The two circles touch at one intersection point.
The "general" case where there are going to be two intersection points. From there, you have two arcs that define the enclosed area. In that case, the box-drawing method could work for now, I'm not sure there's a more efficient method for determining what is contained by the intersection. Do note, however, if you're just interested in the area, there is a formula for that.
You may also want to look into the various clipping algorithms used in graphics development. I have used clipping algorithms to solve a lot of problems similar to what you are asking here.
If the locations and radii of your circles can vary with a granularity less than your grid, then you'll be checking a bunch of points anyway.
You can minimize the number of points you check by defining the search area appropriately. It has a width equal to the distance between the points of intersection, and a height equal to
r1 + r2 - D
with D being the separation of the two centers. Note that this rectangle in general is not aligned with the X and Y axes. (This also gives you a test as to whether the two circles intersect!)
Actually, you'd only need to check half of these points. If the radii are the same, you'd only need to check a quarter of them. The symmetry of the problem helps you there.
You're almost there.
Iterating over the points in the box should be fairly good, but you can do better if for the second coordinate you iterate directly between the limits.
Say you iterate along the x axis first. Then for the y axis, instead of iterating between the bounding box coordinates, figure out where each circle intersects the vertical line at that x. More specifically, you are interested in the y coordinates of the intersection points, and you iterate between those (pay attention to rounding).
When you do this, because you already know you are inside the circles you can skip the checks entirely.
If you have a lot of points then you skip a lot of checks and you might get some performance improvements.
As an additional improvement you can pick the x axis or the y axis to minimize the number of times you need to compute intersection points.
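A sketch of this idea, under the assumption that each circle is given as an (x, y, r) tuple and that points exactly on a circle count as inside. The function name is made up. For every integer x the y limits come straight from the two circle equations, so no per-point check is needed.

```python
import math


def lattice_points_in_both(c1, c2):
    """Integer (x, y) points lying in both closed disks c = (cx, cy, r)."""
    (x1, y1, r1), (x2, y2, r2) = c1, c2
    points = []
    x_lo = math.ceil(max(x1 - r1, x2 - r2))
    x_hi = math.floor(min(x1 + r1, x2 + r2))
    for x in range(x_lo, x_hi + 1):
        # Half-height of each circle at this x; max() guards against tiny
        # negative values caused by floating-point rounding.
        h1 = math.sqrt(max(0.0, r1 * r1 - (x - x1) ** 2))
        h2 = math.sqrt(max(0.0, r2 * r2 - (x - x2) ** 2))
        y_lo = math.ceil(max(y1 - h1, y2 - h2))
        y_hi = math.floor(min(y1 + h1, y2 + h2))
        points.extend((x, y) for y in range(y_lo, y_hi + 1))
    return points


print(len(lattice_points_in_both((0, 0, 5), (4, 0, 5))))
```

If the circles do not overlap, the x range (or every y range) is empty and the result is an empty list. With non-integer centers or radii, points very close to a boundary may be affected by rounding.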
So you want to find the lattice points that are inside both circles?
The method you suggested of drawing a box and iterating through all the points in the box seems the simplest to me. It will probably be efficient, as long as the number of points in the box is comparable to the number of points in the intersection.
And even if it isn't as efficient as possible, you shouldn't try to optimize it until you have a good reason to believe it's a real bottleneck.
I assume by "all points" you mean "all pixels". Suppose your display is NX by NY pixels. Have two arrays
int x0[NY], x1[NY]; initially full of -1.
The intersection is lozenge-shaped, between two curves.
Iterate x,y values along each curve. At each y value (that is, where the curve crosses y + 0.5), store the x value in the array. If x0[y] is -1, store it in x0, else store it in x1.
Also keep track of the lowest and highest values of y.
When you are done, just iterate over the y values, and at each y, iterate over the x values between x0 and x1, that is, for (ix = x0[iy]; ix < x1[iy]; ix++) (or the reverse).
It's important to understand that pixels are not the points where x and y are integers. Rather pixels are the little squares between the grid lines. This will prevent you from having edge-case problems.
