I've got an array of data filled with "#" and "." for example it display something like this :
...........#############..............................
.........################..................#######....
........##################................#########...
.......####################..............###########..
........##################................#########...
.........################..................#######....
...........#############..............................
I want to create an algorithm that find the bigger ball and erase the smaller one.
I was thinking of using the longest sequence of "#" to know what is the diameter.
So i've got something like this :
x = 0
longest_line = 0
for i in range(0, nbLine) :
for j in range(0, nbRaw) :
if data[i, j] = red :
x = x+1
if data[i, j+1] != red:
And i don't know what to do next..
I would use some sort of a segmentation algorithm, and then simply count the number of pixels in each object. Then simply erase the smaller one, which should be easy as you have a tag of the object.
The segmentation algorithms typically work like this.
Perform a raster scan, starting upper left, working towards bottom right.
As you see a #, you know you have an object. Check it's neighbors.
If the neighbors have a previously assigned value, assign that value to it
If there there are multiple values, put that into some sort of a table, which after you are done processing, you will simplify.
So, for a very simple example:
...##...
.######.
...##...
Your processing will look like:
00011000
02111110
00011000
With a conversion such that:
2=>1
Apply the look up table, and all objects will be tagged with a 1 value. Then simply count the number of pixels, and you are done.
I'll leave the implementation to you;-)
Get your data into a nicer array structure
Perform connected component labelling
Count the number of elements with each label (ignoring the background label)
Choose the label with the largest number of elements
Do you olways have only 2 shapes like this? Because in this case, you could also use the python regular expression library. This code seems to do the trick for your example (I copied your little drawing in a file and named it "balls.txt"):
import re
f = open("balls.txt", "r")
for line in f :
balls = re.search("(\.+)(#+)(\.+)(#*)(\.+)", line)
dots1 = balls.groups()[0]
ball1 = balls.groups()[1]
dots2 = balls.groups()[2]
ball2 = balls.groups()[3]
dots3 = balls.groups()[4]
if len(ball1) < len(ball2):
ball1 = ball1.replace('#', '.')
else:
ball2 = ball2.replace('#', '.')
print "%s%s%s%s%s" % (dots1, ball1, dots2, ball2, dots3)
And this is what I get:
...........#############..............................
.........################.............................
........##################............................
.......####################...........................
........##################............................
.........################.............................
...........#############..............................
I hope this can give you some ideas for the resolution of your problem
Assuming the balls don't touch:
Assuming exactly two balls
Create two ball objects, (called balls 0 and 1) each owns a set of points. Points are x,y pairs.
scan each row from top to bottom, left to right within each line.
When you see the first #, assign it to ball 0 (add it’s x,y cords to the set owned by the ball 0 object).
When you see any subsequent #, add it to ball 0 if it is adjacent to any point already in ball 0; otherwise add it to ball 1. (If the new # is at x, y we just test (x+1,y) is in set, (x-1, y) is in set, (x, y+1) is in set, (x, y-1) is in set, and the diagonal neighbors)
When the scan is complete, the list sizes reveal the larger ball. You then have a list of the points to be erased in the other ball’s points set.
Related
So like the title says I need help trying to map points from a 2d plane to a number line in such a way that each point is associated with a unique positive integer. Put another way, I need a function f:ZxZ->Z+ and I need f to be injective. Additionally I need to to run in a reasonable time.
So the way I've though about doing this is to basically just count points, starting at (1,1) and spiraling outwards.
Below I've written some python code to do this for some point (i,j)
def plot_to_int(i,j):
a=max(i,j) #we want to find which "square" we are in
b=(a-1)^2 #we can start the count from the last square
J=abs(j)
I=abs(i)
if i>0 and j>0: #the first quadrant
#we start counting anticlockwise
if I>J:
b+=J
#we start from the edge and count up along j
else:
b+=J+(J-i)
#when we turn the corner, we add to the count, increasing as i decreases
elif i<0 and j>0: #the second quadrant
b+=2a-1 #the total count from the first quadrant
if J>I:
b+=I
else:
b+=I+(I-J)
elif i<0 and j<0: #the third quadrant
b+=(2a-1)2 #the count from the first two quadrants
if I>J:
b+=J
else:
b+=J+(J-I)
else:
b+=(2a-1)3
if J>I:
b+=I
else:
b+=I+(I-J)
return b
I'm pretty sure this works, but as you can see it quite a bulky function. I'm trying to think of some way to simplify this "spiral counting" logic. Or possibly if there's another counting method that is simpler to code that would work too.
Here's a half-baked idea:
For every point, calculate f = x + (y-y_min)/(y_max-y_min)
Find the smallest delta d between any given f_n and f_{n+1}. Multiply all the f values by 1/d so that all f values are at least 1 apart.
Take the floor() of all the f values.
This is sort of like a projection onto the x-axis, but it tries to spread out the values so that it preserves uniqueness.
UPDATE:
If you don't know all the data and will need to feed in new data in the future, maybe there's a way to hardcode an arbitrarily large or small constant for y_max and y_min in step 1, and an arbitrary delta d for step 2 according the boundaries of the data values you expect. Or a way to calculate values for these according to the limits of the floating point arithmetic.
grid = int(input("Grid size: "))
x = []
for i in range(grid):
y = []
for j in range(grid):
e = '.'
y.append(e)
x.append(y)
x[0][0] = 'X'
for row in x:
print(*row)
d = input("Direction: ")
I'm trying to work out how to assign X to the index of the grid position which the user input gives. Below is the question.
You're familiar with the story of Hansel and Gretel, so when you decide to go exploring you always make sure to keep track of where you've come from. Since you know how unreliable leaving a trail of breadcrumbs is, you decide to write a program to help you map out your trail.
Your program should first read in the size of the grid that you will be exploring, and then read in directions: left, right, up or down. You always start exploring at the top left of your map. After every step, you should print out the trail, reading in directions until a blank line is entered. You will never be directed off the edges of the map.
Here is the problem:
Given the input n = 4 x = 5, we must imagine a chessboard that is 4 squares across (x-axis) and 5 squares tall (y-axis). (This input changes, all the up to n = 200 x = 200)
Then, we are asked to determine the minimum shortest path from the bottom left square on the board to the top right square on the board for the Knight (the Knight can move 2 spaces on one axis, then 1 space on the other axis).
My current ideas:
Use a 2d array to store all the possible moves, perform breadth-first
search(BFS) on the 2d array to find the shortest path.
Floyd-Warshall shortest path algorithm.
Create an adjacency list and perform BFS on that (but I think this would be inefficient).
To be honest though I don't really have a solid grasp on the logic.
Can anyone help me with psuedocode, python code, or even just a logical walk-through of the problem?
BFS is efficient enough for this problem as it's complexity is O(n*x) since you explore each cell only one time. For keeping the number of shortest paths, you just have to keep an auxiliary array to save them.
You can also use A* to solve this faster but it's not necessary in this case because it is a programming contest problem.
dist = {}
ways = {}
def bfs():
start = 1,1
goal = 6,6
queue = [start]
dist[start] = 0
ways[start] = 1
while len(queue):
cur = queue[0]
queue.pop(0)
if cur == goal:
print "reached goal in %d moves and %d ways"%(dist[cur],ways[cur])
return
for move in [ (1,2),(2,1),(-1,-2),(-2,-1),(1,-2),(-1,2),(-2,1),(2,-1) ]:
next_pos = cur[0]+move[0], cur[1]+move[1]
if next_pos[0] > goal[0] or next_pos[1] > goal[1] or next_pos[0] < 1 or next_pos[1] < 1:
continue
if next_pos in dist and dist[next_pos] == dist[cur]+1:
ways[next_pos] += ways[cur]
if next_pos not in dist:
dist[next_pos] = dist[cur]+1
ways[next_pos] = ways[cur]
queue.append(next_pos)
bfs()
Output
reached goal in 4 moves and 4 ways
Note that the number of ways to reach the goal can get exponentially big
I suggest:
Use BFS backwards from the target location to calculate (in just O(nx) total time) the minimum distance to the target (x, n) in knight's moves from each other square. For each starting square (i, j), store this distance in d[i][j].
Calculate c[i][j], the number of minimum-length paths starting at (i, j) and ending at the target (x, n), recursively as follows:
c[x][n] = 1
c[i][j] = the sum of c[p][q] over all (p, q) such that both
(p, q) is a knight's-move-neighbour of (i, j), and
d[p][q] = d[i][j]-1.
Use memoisation in step 2 to keep the recursion from taking exponential time. Alternatively, you can compute c[][] bottom-up with a slightly modified second BFS (also backwards) as follows:
c = x by n array with each entry initially 0;
seen = x by n array with each entry initially 0;
s = createQueue();
push(s, (x, n));
while (notEmpty(s)) {
(i, j) = pop(s);
for (each location (p, q) that is a knight's-move-neighbour of (i, j) {
if (d[p][q] == d[i][j] + 1) {
c[p][q] = c[p][q] + c[i][j];
if (seen[p][q] == 0) {
push(s, (p, q));
seen[p][q] = 1;
}
}
}
}
The idea here is to always compute c[][] values for all positions having some given distance from the target before computing any c[][] value for a position having a larger distance, as the latter depend on the former.
The length of a shortest path will be d[1][1], and the number of such shortest paths will be c[1][1]. Total computation time is O(nx), which is clearly best-possible in an asymptotic sense.
My approach to this question would be backtracking as the number of squares in the x-axis and y-axis are different.
Note: Backtracking algorithms can be slow for certain cases and fast for the other
Create a 2-d Array for the chess-board. You know the staring index and the final index. To reach to the final index u need to keep close to the diagonal that's joining the two indexes.
From the starting index see all the indexes that the knight can travel to, choose the index which is closest to the diagonal indexes and keep on traversing, if there is no way to travel any further backtrack one step and move to the next location available from there.
PS : This is a bit similar to a well known problem Knight's Tour, in which choosing any starting point you have to find that path in which the knight whould cover all squares. I have codes this as a java gui application, I can send you the link if you want any help
Hope this helps!!
Try something. Draw boards of the following sizes: 1x1, 2x2, 3x3, 4x4, and a few odd ones like 2x4 and 3x4. Starting with the smallest board and working to the largest, start at the bottom left corner and write a 0, then find all moves from zero and write a 1, find all moves from 1 and write a 2, etc. Do this until there are no more possible moves.
After doing this for all 6 boards, you should have noticed a pattern: Some squares couldn't be moved to until you got a larger board, but once a square was "discovered" (ie could be reached), the number of minimum moves to that square was constant for all boards not smaller than the board on which it was first discovered. (Smaller means less than n OR less than x, not less than (n * x) )
This tells something powerful, anecdotally. All squares have a number associated with them that must be discovered. This number is a property of the square, NOT the board, and is NOT dependent on size/shape of the board. It is always true. However, if the square cannot be reached, then obviously the number is not applicable.
So you need to find the number of every square on a 200x200 board, and you need a way to see if a board is a subset of another board to determine if a square is reachable.
Remember, in these programming challenges, some questions that are really hard can be solved in O(1) time by using lookup tables. I'm not saying this one can, but keep that trick in mind. For this one, pre-calculating the 200x200 board numbers and saving them in an array could save a lot of time, whether it is done only once on first run or run before submission and then the results are hard coded in.
If the problem needs move sequences rather than number of moves, the idea is the same: save move sequences with the numbers.
I am drawing a metaball with marching cubes (squares as it is a 2d) algorithm.
Everything is fine, but I want to get it as a vector object.
So far I've got a vector line or two from each active square, keeping them in list lines. In other words, I have an array of small vector lines, spatially displaying several isolines (curves) - my aim is to rebuild those curves back from lines.
Now I am stuck with fast joining them all together: basically I need to connect all lines one by one all together into several sequences (curves). I don't know how many curves (sequences) will be there, and lines could be in different directions and I need to process lines into sequences of unique points.
So far I wrote something obviously ugly and half-working (here line is a class with list of points as attribute points, and chP is a function checking if points are close enough, t definies this 'enough') :
def countur2(lines):
'''transform random list of lines into
list of grouped sequences'''
t = 2 # tolerance
sqnss = [[lines[0]]] # sequences
kucha = [lines[0]] #list of already used lines
for l in lines:
for i,el in enumerate(lines):
print 'working on el', i
ss = sqnss[-1][0]
ee = sqnss[-1][-1]
if el not in kucha:
if chP(el.points[0],ee.points[1],t):
sqnss[-1].append(el)
kucha.append(el)
break
elif chP(el.points[1],ee.points[1],t):
sqnss[-1].append(el.rvrse())
kucha.append(el)
break
elif chP(el.points[1],ss.points[0],t):
sqnss[-1] = [el] + sqnss[-1]
kucha.append(el)
break
elif chP(el.points[0],ss.points[0],t):
sqnss[-1] = [el.rvrse()] + sqnss[-1]
kucha.append(el)
break
print 'new shape added, with el as start'
sqnss.append([el])
kucha.append(el)
#return sqnse of points
ps = []
for x in sqnss: ps.append([el.points[0] for el in x])
return ps
I know this is such a big question, but please give me any clue on right direction to handle this task
A first option is to number all cell sides uniquely, and associate to every vector the pair of edges it joins.
Enter all pairs in a dictionary, both ways: (a,b) and (b,a). Then, starting from an arbitrary pair, say (a,b), you will find the next pair through b, say (b,c). You will remove both (b,c) and (c,b) from the dictionary, and continue from c, until the chain breaks on a side of the domain, or loops.
A second option is to scan the whole domain and when you find a cell crossed by an isocurve, compute the vector, and move to the neighboring cell that shares an edge crossed by the isocurve, and so on. To avoid an infinite scanning, you will flag the cell as already visited.
By contrast with the first approach, no dictionary is required, as the following of the chain is purely based on the local geometry.
Beware that there are two traps:
cells having one or more corner values equal to the iso-level are creating trouble. A possible cure is by slightly modifying the values corner; this will create a few tiny vectors.
cells can be crossed by two vectors instead of one, and require to be visited twice.
I have a list of lists in the form of
[ [ x1,.....,x8],[x1,.......,x8],...............,[x1,.....x[8]] ] . The number of lists in that list can go upto a million. Each list has 4 gps co-ordinates which show the four points of a rectangle ( assumed that each segment is in the form of a rectangle].
Problem : Given a new point, I need to determine which segment the point falls on and create a new one if it falls in none of them. I am not uploading the data into MySQL as of now, it comes in as a simple text file. I find out the co-ordinates from the text file for any given car.
What I tried : I am thinking of using R-trees to find all points which are near to the given point . ( Near== 200 meters maximum) . But even in R-trees, there seem to be too many options . R,R*,Hilbert.
Q1. Which one should be opted for ?
Q2. Is there a better option than R-trees?Can something be done by searching faster within the list ?
Thanks a lot.
[ {a1:[........]},{a2:[.......]},{a3:[.........]},.... ,{a20:[.....]}] .
Isn't the problem "find whether a given point falls within a certain rectangle in 2D space"?
That could be separated dimensionally, couldn't it? Give each rectangle an ID, then separate into lists of one-dimensional ranges ((id, x0, x1), (id, y0, y1)) and find all the ranges in both dimensions the point falls in. (I'm fairly sure there are very efficient algorithms for this. Heck, you could even leverage, say, sqlite already.) Then just intersect the ID sets you get and you should find all rectangles the point falls in, if any. (Of course you can exit early if either of the single dimensional queries returns no result.)
Not sure if this'd be faster or smarter than R-trees or other spatial indexes though. Hope this helps anyway.
import random as ra
# my _data will hold tuples of gps readings
# under the key of (row,col), knowing that the size of
# the row and col is 10, it will give an overall grid coverage.
# Another dict could translate row/col coordinates into some
# more usefull region names
my_data = {}
def get_region(x,y,region_size=10):
"""Build a tuple of row/col based on
the values provided and region square dimension.
It's for demonstration only and it uses rather naive calculation as
coordinate / grid cell size"""
row = int(x / region_size)
col = int(y / region_size)
return (row,col)
#make some examples and build my_data
for loop in range(10000):
#simule some readings
x = ra.choice(range(100))
y = ra.choice(range(100))
my_coord = get_region(x,y)
if my_data.get(my_coord):
my_data[my_coord].append((x,y))
else:
my_data[my_coord]= [(x,y),]
print my_data