Finding minimum number of points to cover all segments

Hi I have a problem as below:
Given a set of n segments {[a0, b0], [a1, b1], . . . , [an-1, bn-1]} with integer coordinates on a line, find the minimum number m of points such that each segment contains at least one point. That is, find a set of integers X of the minimum size such that for any segment [ai,bi] there is a point x ∈ X such that ai ≤ x ≤ bi.
Input Format: The first line of the input contains the number n of segments. Each of the following n lines contains two integers ai and bi (separated by a space) defining the coordinates of endpoints of the i-th segment.
Output Format: Output the minimum number m of points on the first line and the integer coordinates of m points (separated by spaces) on the second line. You can output the points in any order. If there are many such sets of points, you can output any set. (It is not difficult to see that there always exist a set of points of the minimum size such that all the coordinates of the points are integers.)
Sample 1:
Input: 3
1 3
2 5
3 6
Output: 1
3
Explanation:
In this sample, we have three segments: [1,3],[2,5],[3,6] (of length 2,3,3 respectively). All of them contain the point with coordinate 3: 1 ≤3 ≤3, 2 ≤3 ≤5, 3 ≤ 3 ≤ 6.
Sample 2:
Input: 4
4 7
1 3
2 5
5 6
Output: 2
3 6
Explanation:
The second and the third segments contain the point with coordinate 3 while the first and the fourth segments contain the point with coordinate 6. All the four segments cannot be covered by a single point, since the segments [1, 3] and [5, 6] are disjoint.
Solution:
The greedy choice is to select the minimum right endpoint, then remove all segments that contain that endpoint. Keep choosing the minimum right endpoint and removing segments until none are left.
I followed the solution. In my code I found the minimum right endpoint and removed all segments that contain it. Then I call the function again with the new list of segments (keep choosing the minimum right endpoint and removing segments, recursively), but I'm stuck on the order of the statements in my code and can't make it work.
list_time = [[4,7],[1,3],[2,5],[5,6]]
def check_inside_range(n, lst): #Function to check if a number is inside the range of start and end of a list
    #for example 3 is in [3,5], 4 is not in [5,6], return False if in
    if lst[1]-n>=0 and n-lst[0]>=0:
        return False
    else:
        return True
def lay_chu_ki(list_time):
    list_time.sort(key = lambda x: x[1]) #Sort according to the end of each segments [1,3],[2,5],[5,6],[4,7]
    first_end = list_time[0][1] #Selecting the minimum right endpoint
    list_after_remove = list(filter(lambda x: check_inside_range(first_end, x),list_time))
    #Remove all segments that contains that endpoint
    lay_chu_ki(list_after_remove) #Keep doing the function again with new segments list
    #(Keep choosing minimum right endpoint and removing segments)
    return first_end #I don't know where to put this line.
print(lay_chu_ki(list_time))
As you can see, I've already done the 3 steps: selecting the minimum right endpoint; removing all segments that contain that endpoint; and repeating both on the remaining segments. Somehow it still doesn't work. I tried to print the two numbers 3 and 6 first (the return value of each recursive call). I also tried a count variable to count the recursive calls (count += 1), but that didn't work either, since count is reset to 0 on each call.

I think recursion overcomplicates the implementation. While it's still feasible, you have to pass in a bunch of extra parameters, which could be difficult to track. In my opinion, it's much simpler to implement this approach iteratively.
Also, your approach repeatedly uses filter() and list(), which takes linear time every time you do it (to clarify, "linear" means linear in the size of the input list). In the worst case, you would perform that operation for every element in the list, which means that the runtime of your original implementation is quadratic (assuming you fix the existing issues with your code). This approach avoids that by making a single pass through the list:
def lay_chu_ki(list_time):
    list_time.sort(key=lambda x: x[1])
    idx = 0
    selected_points = []
    while idx != len(list_time):
        selected_point = list_time[idx][1]
        while idx != len(list_time) and list_time[idx][0] <= selected_point:
            idx += 1
        selected_points.append(selected_point)
    return selected_points
result = lay_chu_ki(list_time)
print(len(result))
print(' '.join(map(str, result)))
With the given list, this outputs:
2
3 6
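If you still want the recursive shape from the question, the missing pieces are a base case and a way to combine each call's point with the points found by the deeper calls. The following is only a sketch of that idea; the name `min_cover_points` is made up for illustration, and it keeps the quadratic filtering discussed above.

```python
def min_cover_points(segments):
    """Recursive greedy: return a list of points covering every segment."""
    if not segments:
        # Base case: nothing left to cover, so no more points are needed.
        return []
    # Greedy choice: the smallest right endpoint among the remaining segments.
    point = min(end for _, end in segments)
    # Keep only the segments that do NOT contain the chosen point.
    remaining = [s for s in segments if not (s[0] <= point <= s[1])]
    # This call's point comes first, followed by the points of the deeper calls.
    return [point] + min_cover_points(remaining)


points = min_cover_points([[4, 7], [1, 3], [2, 5], [5, 6]])
print(len(points))
print(' '.join(map(str, points)))
```

Each call removes at least one segment, so the recursion stops, but it can go as deep as the number of segments. For large inputs the iterative version is the safer choice.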

Related

How do I fix code that calculates the amount of combinations in the partitions of a set?

I am working on some code in Python 2 that partitions a set of 13 elements using integer partitions, then evaluates the different combinations they can have (order does not matter). I have seen the ways people do this by using recursive functions to calculate every partition in a set retroactively, but for what I'm working on I'm taking a different approach.
I'm working with the logic that the different ways a set can be partitioned are determined by the integer partitions of its size. For a set of 4 elements, it can be partitioned in these ways:
[1,1,1,1]
[1,1,2]
[2,2]
[1,3]
[4]
Every number stands for the length of a subset in the partition. Using this info, I can then calculate all of the combinations that can be used with these different integer partitions. If I add the number of combinations from each partition together, I should receive the Bell number (the number of possible partitions in a set). For a list of 4 elements, the Bell number should be 15.
My code runs through the subset lengths in each partition, sets the length of the set to n and the subset length to r, then calculates the combinations in the specific subset. When it goes to the next subset, it subtracts the previous r from n to account for it lessening the amount of combinations available, as n gets smaller when a subset is already defined.
My code, however, is lackluster. When inputting 4 as the length of the set, it outputs 16 (instead of 15). When inputting 5, it outputs 48 (instead of 52). When inputting 13, it outputs 102,513 (instead of 27,644,437). I need it to be exact rather than an estimate.
This is in part because of if elem != 1: not properly accounting for a list of all ones or a list of one subset. It's also in part because it doesn't account for repeats of a combination when appearing in a subset. In [2,2] for a list of 4 elements, it considers the subset to contain 6 combinations when in reality it contains 3.
I'm stuck on how to solve this issue, as I only know enough Python to get by. The way the code currently outputs is how I prefer it to output, obviously without the errors.
The recursive function that calculates the integer partitions is from Nicolas Blanc, and the rest was coded by myself. Important links: Bell number, Partition of a set
import math
in_par = []
stack = []
bell = 0
def partitions(remainder, start_number = 1):
    if remainder == 0:
        in_par.append(list(stack))
        #print stack
    else:
        for nb_to_add in range(start_number, remainder+1):
            stack.append(nb_to_add)
            partitions(remainder - nb_to_add, nb_to_add)
            stack.pop()
x = partitions(13) # <------- input element count here
for part in in_par:
    part.reverse()
    combinations = 0
    n = 13 # <------- input element count here
    for i,elem in enumerate(part):
        r = elem
        combo = 0
        if elem != 1:
            if i != (len(part) - 1):
                combo = math.factorial(n) / (math.factorial(r) * math.factorial(n-r))
                n = n - elem
        combinations = combinations + combo
    bell = bell + combinations
    part.append([combinations])
    print part
#print str(bell)
print "Bell Number: " + str(bell)
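One possible way to make the count exact, sketched here in Python 3 rather than the Python 2 used in the question: for an integer partition with block sizes s_1, ..., s_k of n, the number of set partitions with that shape is n! divided by the product of every s_i! and by m! for each group of m equal-sized blocks. The second division is what turns the 6 for [2,2] into 3. The function names below are made up for illustration.

```python
from collections import Counter
from math import factorial


def integer_partitions(n, smallest=1):
    """Yield every integer partition of n as a non-decreasing list."""
    if n == 0:
        yield []
        return
    for first in range(smallest, n + 1):
        for rest in integer_partitions(n - first, first):
            yield [first] + rest


def set_partitions_of_shape(shape):
    """Count the set partitions whose block sizes are exactly `shape`."""
    denominator = 1
    for size in shape:
        denominator *= factorial(size)          # order inside a block is irrelevant
    for multiplicity in Counter(shape).values():
        denominator *= factorial(multiplicity)  # equal-sized blocks are interchangeable
    return factorial(sum(shape)) // denominator


def bell_number(n):
    """Sum the counts over all integer partitions of n."""
    return sum(set_partitions_of_shape(p) for p in integer_partitions(n))


for n in (4, 5, 13):
    print(n, bell_number(n))
```

For 4, 5 and 13 elements this gives 15, 52 and 27644437, the values the question expects. Integer division (`//`) keeps the arithmetic exact instead of falling back to floats.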

Select a random subset of indices, with minimum consecutive count

I would like to select a random subset of indices from a numpy array with the caveat that I need each randomly selected index to be part of a consecutive "cluster" of at least three indices in a row.
For example, if I have an array that contains 25 items
a = np.arange(0,25)
I want to make sure that no index is selected without including at least two neighboring indices. So, for example, if I was looking for a subset of length 12, the following two options both fulfill this.
# this has 3 consecutive, followed by 5 consecutive, followed by 4 consecutive
rand_subset_1 = [0,1,2,9,10,11,12,13,18,19,20,21]
# this has 6 consecutive, followed by 3 consecutive, followed by 3 consecutive
rand_subset_2 = [3,4,5,6,7,8,14,15,16,22,23,24]
Attempted Answer
I tried to figure this out initially by dividing a into lists of three.
a_mod = np.array([0,1,2],[3,4,5],[6,7,8],...[21,22,23])
and then using np.random.choice(a_mod, subset_length/3, replace=False)
However this doesn't solve my problem, for two reasons.
I want to be able to input arrays with lengths that don't have to be divisible by three.
I don't mind if the subset indices are in cluster sizes that also aren't divisible by three. I just need the cluster to have at least three consecutive indices.
Clarification Edit:
Is there a method that allows every number in the subset of indices is part of a "cluster" of consecutive numbers? Ideally this wouldn't limit the cluster to be divisible by a particular integer (which is where I got stuck on my attempted solution above), but would be flexible in allowing clusters to be random lengths with a specified minimum cluster size.
Thanks in advance for any help with this problem!

Use the following function.
It selects an index at random and adds its two neighboring indices, so that one run of three consecutive indices is present.
After that, it selects the remaining indices at random from those not selected already.
def select_consequtive_index(a, m, n = 3):
    # a: array
    # m: number of index to be selected
    # n: minimum of consequtive counts
    output = []
    x = np.random.choice(a)
    if x == 0:
        output += [x, x+1, x+2]
    elif x == a[-1]:
        output += [x-2, x-1, x]
    else:
        output += [x-1, x, x+1]
    output += np.random.choice(list(set(a) - set(output)), m - n, replace = False).tolist()
    output = np.array(output)
    output.sort()
    return output
Code sample:
a = np.arange(0, 25)
print(select_consequtive_index(a, m = 12, n = 3))
The result is as follows.
[ 3 4 7 8 9 10 11 12 17 21 22 24]
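Note that the sample result above contains 17 on its own and the pair 3 4, so the indices picked after the first triple are not guaranteed to sit in runs of three. If every selected index has to belong to such a run, one option is to build the selection from runs and gaps directly. The sketch below is an assumption-laden illustration, not a drop-in replacement: `random_clustered_indices` and its parameters are made-up names, it uses only the standard `random` module, and the selections it produces are random but not uniformly distributed over all valid subsets.

```python
import random


def random_clustered_indices(length, m, min_run=3, rng=random):
    """Pick m indices from range(length) so every run of consecutive
    selected indices has at least min_run members."""
    if min_run < 1 or m < min_run or m > length:
        raise ValueError("need 1 <= min_run <= m <= length")
    # Each run needs min_run indices, and runs must be separated by a gap.
    max_runs = min(m // min_run, length - m + 1)
    run_count = rng.randint(1, max_runs)
    # Start every run at the minimum size, then hand out the spare indices.
    runs = [min_run] * run_count
    for _ in range(m - min_run * run_count):
        runs[rng.randrange(run_count)] += 1
    # Gaps: before the first run, between runs (at least 1), after the last run.
    gaps = [0] + [1] * (run_count - 1) + [0]
    for _ in range(length - m - (run_count - 1)):
        gaps[rng.randrange(run_count + 1)] += 1
    # Lay out gap, run, gap, run, ... from left to right.
    indices, position = [], 0
    for gap, run in zip(gaps, runs):
        position += gap
        indices.extend(range(position, position + run))
        position += run
    return indices


print(random_clustered_indices(25, 12))
```

The returned list can be used to index a NumPy array, for example `a[random_clustered_indices(len(a), 12)]`. Passing a `random.Random(seed)` instance as `rng` makes the selection reproducible.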

Google Jam test cases passes but submission shows "Wrong Answer"

Note: The main parts of the statements of the problems "Reversort" and
"Reversort Engineering" are identical, except for the last paragraph.
The problems can otherwise be solved independently.
Reversort is an algorithm to sort a list of distinct integers in
increasing order. The algorithm is based on the "Reverse" operation.
Each application of this operation reverses the order of some
contiguous part of the list.
After i−1 iterations, the positions 1,2,…,i−1 of the list contain the
i−1 smallest elements of L, in increasing order. During the i-th
iteration, the process reverses the sublist going from the i-th
position to the current position of the i-th minimum element. That
makes the i-th minimum element end up in the i-th position.
For example, for a list with 4 elements, the algorithm would perform 3
iterations. Here is how it would process L=[4,2,1,3]:
i=1, j=3⟶L=[1,2,4,3]
i=2, j=2⟶L=[1,2,4,3]
i=3, j=4⟶L=[1,2,3,4]
The most expensive part of executing the algorithm on our architecture is
the Reverse operation. Therefore, our measure for the cost of each
iteration is simply the length of the sublist passed to Reverse, that
is, the value j−i+1. The cost of the whole algorithm is the sum of the
costs of each iteration.
In the example above, the iterations cost 3, 1, and 2, in that order,
for a total of 6.
Given the initial list, compute the cost of executing Reversort on it.
Input
The first line of the input gives the number of test cases, T. T
test cases follow. Each test case consists of 2 lines. The first line
contains a single integer N, representing the number of elements in
the input list. The second line contains N distinct integers L1, L2,
..., LN, representing the elements of the input list L, in order.
Output
For each test case, output one line containing Case #x: y,
where x is the test case number (starting from 1) and y is the total
cost of executing Reversort on the list given as input.
Limits
Time limit: 10 seconds.
Memory limit: 1 GB.
Test Set 1 (Visible Verdict)
1≤T≤100.
2≤N≤100.
1≤Li≤N, for all i.
Li≠Lj, for all i≠j.
Sample
Sample Input
3
4
4 2 1 3
2
1 2
7
7 6 5 4 3 2 1
Sample Output
Case #1: 6
Case #2: 1
Case #3: 12
Sample Case #1 is described in the statement above.
In Sample Case #2, there is a single iteration, in which Reverse is
applied to a sublist of size 1. Therefore, the total cost is 1.
In Sample Case #3, the first iteration reverses the full list, for a
cost of 7. After that, the list is already sorted, but there are 5
more iterations, each of which contributes a cost of 1.
def Reversort(L):
    sort = 0
    for i in range(len(L)-1):
        small = L[i]
        x = L[i]
        y = L[i]
        for j in range(i, len(L)):
            if L[j] < small :
                small = L[j]
        sort = sort + (L.index(small) - L.index(y) + 1)
        L[L.index(small)] = x
        L[L.index(y)] = small
        print(L) #For debugging purpose
    return sort
T = int(input())
for i in range(T):
    N = int(input())
    L = list(map(int, input().rstrip().split()))
    s = Reversort(L)
    print(f"Case #{i+1}: {s}")

Your code fails for the test case 7 6 5 4 3 2 1. The code gives the answer as 18 whereas the answer should be 12.
You have forgotten to reverse the list between i and j; your code swaps the two elements instead.
The algorithm says:
During the i-th iteration, the process reverses the sublist going from the i-th position to the current position of the i-th minimum element.
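A minimal sketch of the loop with the reversal in place, assuming 0-based indices (so the cost j−i+1 is unchanged). The function name `reversort_cost` is made up here, and the debugging print is left out because extra output would also be judged as a wrong answer.

```python
def reversort_cost(values):
    """Run Reversort on a copy of `values` and return the total cost."""
    lst = list(values)
    cost = 0
    for i in range(len(lst) - 1):
        # Position of the minimum element in lst[i:], as an index into lst.
        j = min(range(i, len(lst)), key=lst.__getitem__)
        # Reverse the sublist from position i to position j (inclusive).
        lst[i:j + 1] = reversed(lst[i:j + 1])
        cost += j - i + 1
    return cost


print(reversort_cost([7, 6, 5, 4, 3, 2, 1]))
```

On the three sample lists this yields 6, 1 and 12, matching the expected sample output.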

Best way to remove similar points in a list of points

I have a list of points that looks like this:
points = [(54592748,54593510),(54592745,54593512), ...]
Many of these points are similar in the sense that points[n][0] is almost equal to points[m][0] AND points[n][1] is almost equal to points[m][1], where 'almost equal' means within some integer threshold that I choose.
I would like to filter out all the similar points from the list, keeping just one point from each group.
Here is my code.
points = [(54592748,54593510),(54592745,54593512),(117628626,117630648),(1354358,1619520),(54592746,54593509)]
md = 10 # max distance allowed between two points
to_compare = points[:] # make a list of item to compare
to_remove = set() # keep track of items to be removed
for point in points:
    to_compare.remove(point) # do not compare with itself
    for other_point in to_compare:
        if abs(point[0]-other_point[0]) <= md and abs(point[1]-other_point[1]) <= md:
            to_remove.add(other_point)
for point in to_remove:
    points.remove(point)
It works...
>>>points
[(54592748, 54593510), (117628626, 117630648), (1354358, 1619520)]
but I am looking for a faster solution since my list is millions items long.
PyPy helped a lot, it sped up the whole process 6 times, but is there a more efficient way to do this in the first place?
Any help is very welcome.
=======
UPDATE
I have tested some of the answers with the points object you can pickle.load() from here https://mega.nz/#!TVci1KDS!tE5fTnjpPwbvpFTmW1TLsVXDvYHbRF8F7g10KGdOPCs
My code takes 1104 seconds and reduces the list to 96428 points (from 99920).
David's code does the job in 14 seconds! But it misses something: 96431 points are left.
Martin's code takes 0.06 seconds!! But also misses something, 96462 points left.
Any clue about why the results are not the same?

Depending on how accurate you need this to be, the following approach should work well:
points = [(54592748, 54593510), (54592745, 54593512), (117628626, 117630648), (1354358, 1619520), (54592746, 54593509)]
d = 20
hpoints = {((x - (x % d)), (y - (y % d))) : (x,y) for x, y in points}
for x in hpoints.itervalues():
    print x
This converts each point into a dictionary key by rounding its x and y coordinates down to the nearest multiple of d. The result is a dictionary holding the coordinates of the last point seen in each d-by-d cell. For the data you have given, this would display the following:
(117628626, 117630648)
(54592746, 54593509)
(1354358, 1619520)

Sorting the list first avoids the inner for loop and thus the n^2 time. I'm not sure if it's practically any quicker though, since I don't have your full data. Try this (as far as I can see it outputs the same points for your example, just in sorted order).
points = [(54592748,54593510),(54592745,54593512),(117628626,117630648),(1354358,1619520),(54592746,54593509)]
md = 10 # max distance allowed between two points
points.sort()
to_remove = set() # keep track of items to be removed
for i, point in enumerate(points):
    if i == len(points) - 1:
        break
    other_point = points[i+1]
    if abs(point[0]-other_point[0]) <= md and abs(point[1]-other_point[1]) <= md:
        to_remove.add(point)
for point in to_remove:
    points.remove(point)
print(points)

This function for getting unique items from a list (it isn't mine, I found it a while back) only loops over the list once (plus dictionary lookups).
def unique(seq, idfun=None):
    # order preserving
    if idfun is None:
        def idfun(x): return x
    seen = {}
    result = []
    for item in seq:
        marker = idfun(item)
        # in old Python versions:
        # if seen.has_key(marker)
        # but in new ones:
        if marker in seen: continue
        seen[marker] = 1
        result.append(item)
    return result
The id function will require some cleverness. point[0] is divided by error and floored to an integer. So all point[0]'s such that x*error <= point[0] < (x+1)*error are the same and similarly for point[1].
def id(point):
    error = 4
    x = point[0]//error
    y = point[1]//error
    idValue = str(x)+"//"+str(y)
    return idValue
So these functions will reduce points between consecutive multiples of error to the same point. The good news is that it only touches the original list once plus the dictionary lookups. The bad news is that this id function won't catch that, for example, 15 and 17 should be the same, because 15 reduces to 3 and 17 reduces to 4. It is possible that with some cleverness this issue could be resolved.
[NOTE: I originally used exponents of primes for the idValue, but the exponents would be way too large. If you could make the idValue an int, that would increase lookup speed.]
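One possible way around the boundary problem (15 and 17 landing in different cells) is to keep the grid but also look at the eight neighboring cells before accepting a point. The sketch below is a greedy variant under that assumption: a point is kept only if no already-kept point is within md in both coordinates. The name `drop_near_duplicates` is made up, md is assumed to be a positive integer, and the number of points left on real data may differ from the original nested-loop code, which also compares against points it has already marked for removal.

```python
from collections import defaultdict


def drop_near_duplicates(points, md):
    """Keep a point only if no previously kept point lies within md of it
    in both x and y. Uses a grid of md-by-md cells for the lookups."""
    grid = defaultdict(list)  # (cell_x, cell_y) -> kept points in that cell
    kept = []
    for x, y in points:
        cx, cy = x // md, y // md
        # Two points within md of each other sit in the same or adjacent cells,
        # so checking the 3x3 block around (cx, cy) is enough.
        close = any(
            abs(x - kx) <= md and abs(y - ky) <= md
            for dx in (-1, 0, 1)
            for dy in (-1, 0, 1)
            for kx, ky in grid.get((cx + dx, cy + dy), ())
        )
        if not close:
            grid[(cx, cy)].append((x, y))
            kept.append((x, y))
    return kept


points = [(54592748, 54593510), (54592745, 54593512), (117628626, 117630648),
          (1354358, 1619520), (54592746, 54593509)]
print(drop_near_duplicates(points, 10))
```

For the five example points with md = 10 this keeps the same three points as the code in the question. Each point touches at most nine cells, so the running time depends on how crowded the cells are rather than on the square of the list length.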

Sorting Technique Python

I'm trying to create a sorting technique that sorts a list of numbers. It compares two numbers: the first is the first number in the list, and the other is the number at a distance taken from the sequence 2^k - 1.
2^k - 1 = [1,3,7, 15, 31, 63...]
For example, if I had a list [1, 4, 3, 6, 2, 10, 8, 19]
The length of this list is 8. So the program should find the largest number in the 2^k - 1 list that is less than 8, in this case it will be 7.
So now it will compare the first number in the list (1) with the 7th number after it in the same list (19). If the first is greater than the second number, it will swap positions.
After this step, it will continue on to 4 and the 7th number after that, but that doesn't exist, so now it should compare with the 3rd number after 4 because 3 is the next number in 2^k - 1.
So it should compare 4 with 2 and swap if they are not in the right place. This should go on until I reach 1 in 2^k - 1, at which point the list will finally be sorted.
I need help getting started on this code.
So far, I've written a small piece of code that makes the 2^k - 1 list, but that's as far as I've gotten.
a = []
for i in range(10):
    a.append(2**(i+1) -1)
print(a)
EXAMPLE:
Consider sorting the sequence V = 17,4,8,2,11,5,14,9,18,12,7,1. The skipping sequence 1, 3, 7, 15, … yields r=7 as the biggest value which fits, so looking at V, the first sparse subsequence = 17,9, so as we pass along V we produce 9,4,8,2,11,5,14,17,18,12,7,1 after the first swap, and 9,4,8,2,1,5,14,17,18,12,7,11 after using r=7 completely. Using a=3 (the next smaller term in the skipping sequence), the first sparse subsequence = 9,2,14,12, which when applied to V gives 2,4,8,9,1,5,12,17,18,14,7,11, and the remaining a = 3 sorts give 2,1,8,9,4,5,12,7,18,14,17,11, and then 2,1,5,9,4,8,12,7,11,14,17,18. Finally, with a = 1, we get 1,2,4,5,7,8,9,11,12,14,17,18.
You might wonder, given that at the end we do a sort with no skips, why this might be any faster than simply doing that final step as the only step at the beginning. Think of it as a comb going through the sequence -- notice that in the earlier steps we're using coarse combs to get distant things in the right order, using progressively finer combs until at the end our fine-tuning is dealing with a nearly-sorted sequence needing little adjustment.
p = 0
x = len(V) #finding out the length of V to find indexer in a
for j in a: #for every element in a (1,3,7....)
    if x >= j: #if the length is greater than or equal to current checking value
        p = j #sets j as p
So that finds the distance at which the first number in the list should be compared, but now I need to write something that keeps doing that until the distance is out of range, so it switches from 3 to 1 and keeps checking at the smaller distances until the list is sorted.

The sorting algorithm you're describing actually is called Combsort. In fact, the simpler bubblesort is a special case of combsort where the gap is always 1 and doesn't change.
Since you're stuck on how to start this, here's what I recommend:
Implement the bubblesort algorithm first. The logic is simpler and makes it much easier to reason about as you write it.
Once you've done that, you have the important algorithmic structure in place, and from there it's just a matter of adding the gap length calculation into the mix. This means computing the gap length with your particular formula. You then modify the loop control index and the inner comparison index to use the calculated gap length.
After each iteration of the loop you decrease the gap length (in effect making the comb finer) by some scaling amount.
The last step would be to experiment with different gap lengths and formulas to see how it affects algorithm efficiency.
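As a starting point, here is a hedged sketch that follows the worked example in the question. In that example each sparse subsequence (such as 9,2,14,12 for a gap of 3) is sorted completely before moving on, which is a gapped insertion sort; that scheme is usually called Shellsort, and the gaps 2^k - 1 are often referred to as Hibbard's increments. The names `hibbard_gaps` and `gap_sort` are made up for illustration, and the gap choice assumes "largest value smaller than the list length", as in the question's 8-element example.

```python
def hibbard_gaps(n):
    """Return the gaps 2**k - 1 that are smaller than n, largest first."""
    gaps = []
    k = 1
    while 2 ** k - 1 < n:
        gaps.append(2 ** k - 1)
        k += 1
    return gaps[::-1]


def gap_sort(values):
    """Sort a copy of `values` with gapped insertion sort over Hibbard gaps."""
    items = list(values)
    for gap in hibbard_gaps(len(items)):
        # Insertion sort on each subsequence of elements that are `gap` apart.
        for i in range(gap, len(items)):
            current = items[i]
            j = i
            while j >= gap and items[j - gap] > current:
                items[j] = items[j - gap]  # shift the larger element right
                j -= gap
            items[j] = current
    return items


V = [17, 4, 8, 2, 11, 5, 14, 9, 18, 12, 7, 1]
print(hibbard_gaps(len(V)))
print(gap_sort(V))
```

For the 12-element V the gaps are 7, 3 and 1, as in the example. Swapping the inner `while` loop for a single compare-and-swap per position would give the bubble-style pass described in the steps above instead.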
