K-center algorithm - python

I am currently using python to solving k-center algorithm.
When I run my codes its runtime exceeds the limit time(provided by my teacher),I don't quite know the way to improve my code so it can pass the limited runtime.
My code is below:
import math
# 1.Import group
# 2.Find the most farthest point in this group.
# 3.reassign the rest points between two center points
# 4.Find the most farthest point from its center point, and make it the newest center point
# 5.reassign points among all center points
# 6.Repeat 4 and 5 step untill the answer fits the condition
class point():
def __init__(self,x,y,num,group=[]):
self.x = x
self.y = y
self.id = num
self.group = []
def range_cus(one,two):
return math.sqrt(math.pow((one.x-two.x),2)+math.pow((one.y-two.y),2))
def reassign(all_points,all_answer):
for i in range(len(all_answer)):
all_answer[i].group = []
for i in range(len(all_points)):
if all_points[i] not in all_answer:
min_length = 0
for j in range(len(all_answer)):
current_length = range_cus(all_answer[j],all_points[i])
if min_length == 0:
min_length = current_length
current_group = all_answer[j]
elif current_length < min_length:
min_length = current_length
current_group = all_answer[j]
current_group.group.append(all_points[i])
def search(all_answer,seek_points_number):
if seek_points_number == 0:
return 0
answer_range = 0
for j in range(len(all_answer)):
for i in range(len(all_answer[j].group)):
if range_cus(all_answer[j],all_answer[j].group[i])>answer_range:
answer_range = range_cus(all_answer[j].group[i],all_answer[j])
answer_obj = all_answer[j].group[i]
seek_points_number -= 1
final_answer.append(answer_obj)
reassign(group,final_answer)
search(final_answer,seek_points_number)
info = raw_input().split(',')
info = [int(i) for i in info]
group = []
final_answer = []
for i in range(info[0]):
x = raw_input().split(',')
group.append(point(float(x[0]),float(x[1]),i+1))
final_answer.append(group[info[2]-1])
group[info[2]-1].group = [point for point in group if point not in final_answer]
search(final_answer,info[1]-1)
print ",".join([str(answer.id) for answer in final_answer])
Please help me examine where should the function be revised to save some runtime.
Example input:
10,3,10 #The first number denotes the sets of data.The second denotes the number of answer I want to return.The third denotes the first center point's id.
21.00,38.00
26.00,28.00
45.00,62.00
31.00,51.00
39.00,44.00
42.00,39.00
21.00,27.00
28.00,29.00
31.00,60.00
27.00,54.00
Example output
10,7,6

You can save at least some time by simply rewriting the range_cus function. As you call this function inside a nested loop, it should to be a good point of attack. Try replacing it with
def range_cus(one,two):
return sqrt((one.x - two.x)**2 + (one.y - two.y)**2)
and remember to do from math import sqrt at the top of your program. In this version, you get rid of a lot of lookups on the math object (math.)

Related

Find minimum number of elements in a list that covers specific values

A recruiter wants to form a team with different skills and he wants to pick the minimum number of persons which can cover all the required skills.
N represents number of persons and K is the number of distinct skills that need to be included. list spec_skill = [[1,3],[0,1,2],[0,2,4]] provides information about skills of each person. e.g. person 0 has skills 1 and 3, person 1 has skills 0, 1 and 2 and so on.
The code should outputs the size of the smallest team that recruiter could find (the minimum number of persons) and values indicating the specific IDs of the people to recruit onto the team.
I implemented the code with brute force as below but since some data are more than thousands, it seems I need to be solved with heuristic approaches. In this case it is possible to have approximate answer.
Any suggestion how to solve it with heuristic methods will be appreciated.
N,K = 3,5
spec_skill = [[1,3],[0,1,2],[0,2,4]]
A = list(range(K))
set_a = set(A)
solved = False
for L in range(0, len(spec_skill)+1):
for subset in itertools.combinations(spec_skill, L):
s = set(item for sublist in subset for item in sublist)
if set_a.issubset(s):
print(str(len(subset)) + '\n' + ' '.join([str(spec_skill.index(item)) for item in subset]))
solved = True
break
if solved: break
Here is my way of doing this. There might be potential optimization possibilities in the code, but the base idea should be understandable.
import random
import time
def man_power(lst, K, iterations=None, period=0):
"""
Specify a fixed number of iterations
or a period in seconds to limit the total computation time.
"""
# mapping each sublist into a (sublist, original_index) tuple
lst2 = [(lst[i], i) for i in range(len(lst))]
mini_sample = [0]*(len(lst)+1)
if period<0 or (period == 0 and iterations is None):
raise AttributeError("You must specify iterations or a positive period")
def shuffle_and_pick(lst, iterations):
mini = [0]*len(lst)
for _ in range(iterations):
random.shuffle(lst2)
skillset = set()
chosen_ones = []
idx = 0
fullset = True
# Breaks from the loop when all skillsets are found
while len(skillset) < K:
# No need to go further, we didn't find a better combination
if len(chosen_ones) >= len(mini):
fullset = False
break
before = len(skillset)
skillset.update(lst2[idx][0])
after = len(skillset)
if after > before:
# We append with the orginal index of the sublist
chosen_ones.append(lst2[idx][1])
idx += 1
if fullset:
mini = chosen_ones.copy()
return mini
# Estimates how many iterations we can do in the specified period
if iterations is None:
t0 = time.perf_counter()
mini_sample = shuffle_and_pick(lst, 1)
iterations = int(period / (time.perf_counter() - t0)) - 1
mini_result = shuffle_and_pick(lst, iterations)
if len(mini_sample)<len(mini_result):
return mini_sample, len(mini_sample)
else:
return mini_result, len(mini_result)

Finding the optimal location for router placement

I am looking for an optimization algorithm that takes a text file encoded with 0s, 1s, and -1s:
1's denoting target cells that requires Wi-Fi coverage
0's denoting cells that are walls
1's denoting cells that are void (do not require Wi-Fi coverage)
Example of text file:
I have created a solution function along with other helper functions, but I can't seem to get the optimal positions of the routers to be placed to ensure proper coverage. There is another file that does the printing, I am struggling with finding the optimal location. I basically need to change the get_random_position function to get the optimal one, but I am unsure how to do that. The area covered by the various routers are:
This is the kind of output I am getting:
Each router covers a square area of at most (2S+1)^2
Type 1: S=5; Cost=180
Type 2: S=9; Cost=360
Type 3: S=15; Cost=480
My code is as follows:
import numpy as np
import time
from random import randint
def is_taken(taken, i, j):
for coords in taken:
if coords[0] == i and coords[1] == j:
return True
return False
def get_random_position(floor, taken , nrows, ncols):
i = randint(0, nrows-1)
j = randint(0, ncols-1)
while floor[i][j] == 0 or floor[i][j] == -1 or is_taken(taken, i, j):
i = randint(0, nrows-1)
j = randint(0, ncols-1)
return (i, j)
def solution(floor):
start_time = time.time()
router_types = [1,2,3]
nrows, ncols = floor.shape
ratio = 0.1
router_scale = int(nrows*ncols*0.0001)
if router_scale == 0:
router_scale = 1
row_ratio = int(nrows*ratio)
col_ratio = int(ncols*ratio)
print('Row : ',nrows, ', Col: ', ncols, ', Router scale :', router_scale)
global_best = [0, ([],[],[])]
taken = []
while True:
found_better = False
best = [global_best[0], (list(global_best[1][0]), list(global_best[1][1]), list(global_best[1][2]))]
for times in range(0, row_ratio+col_ratio):
if time.time() - start_time > 27.0:
print('Time ran out! Using what I got : ', time.time() - start_time)
return global_best[1]
fit = []
for rtype in router_types:
interim = (list(global_best[1][0]), list(global_best[1][1]), list(global_best[1][2]))
for i in range(0, router_scale):
pos = get_random_position(floor, taken, nrows, ncols)
interim[0].append(pos[0])
interim[1].append(pos[1])
interim[2].append(rtype)
fit.append((fitness(floor, interim), interim))
highest_fitness = fit[0]
for index in range(1, len(fit)):
if fit[index][0] > highest_fitness[0]:
highest_fitness = fit[index]
if highest_fitness[0] > best[0]:
best[0] = highest_fitness[0]
best[1] = (highest_fitness[1][0],highest_fitness[1][1], highest_fitness[1][2])
found_better = True
global_best = best
taken.append((best[1][0][-1],best[1][1][-1]))
break
if found_better == False:
break
print('Best:')
print(global_best)
end_time = time.time()
run_time = end_time - start_time
print("Run Time:", run_time)
return global_best[1]
def available_cells(floor):
available = 0
for i in range(0, len(floor)):
for j in range(0, len(floor[i])):
if floor[i][j] != 0:
available += 1
return available
def fitness(building, args):
render = np.array(building, dtype=int, copy=True)
cov_factor = 220
cost_factor = 22
router_types = { # type: [coverage, cost]
1: {'size' : 5, 'cost' : 180},
2: {'size' : 9, 'cost' : 360},
3: {'size' : 15, 'cost' : 480},
}
routers_used = args[-1]
for r, c, t in zip(*args):
size = router_types[t]['size']
nrows, ncols = render.shape
rows = range(max(0, r-size), min(nrows, r+size+1))
cols = range(max(0, c-size), min(ncols, c+size+1))
walls = []
for ri in rows:
for ci in cols:
if building[ri, ci] == 0:
walls.append((ri, ci))
def blocked(ri, ci):
for w in walls:
if min(r, ri) <= w[0] and max(r, ri) >= w[0]:
if min(c, ci) <= w[1] and max(c, ci) >= w[1]:
return True
return False
for ri in rows:
for ci in cols:
if blocked(ri, ci):
continue
if render[ri, ci] == 2:
render[ri, ci] = 4
if render[ri, ci] == 1:
render[ri, ci] = 2
render[r, c] = 5
return (
cov_factor * np.sum(render > 1) -
cost_factor * np.sum([router_types[x]['cost'] for x in routers_used])
)
Here's a suggestion on how to solve the problem; however I don't affirm this is the best approach, and it's certainly not the only one.
Main idea
Your problem can be modelised as a weighted minimum set cover problem.
Good news, this is a well known optimization problem:
It is easy to find algorithm descriptions for approximate solutions
A quick search on the web shows many implementations of approximation algorithms in Python.
Bad news, this is a NP-hard optimization problem:
If you need an exact solution: algorithms will work only for "small" sized problems in a reasonable amount of time(in your case: size of the problem <=> number of "1" cells).
Approximate (a.k.a greedy) algorithms are trade-off between computation requirements, and a risk do deliver far from optimal solutions in certain cases.
Note that the following part does not prove that your problem is NP-hard. The general minimum set cover problem is NP-hard. In your case the subsets have several properties that might help to design a better algorithm. I have no idea how though.
Translating into a cover set problem
Let's define some sets:
U: the set of "1" cells (requiring Wifi).
P(U): the power set of U (the set of subsets of U).
P: the set of cells on which you can place a router (not sure if P=U in your original post).
T: the set of router type (3 values in your case).
R+: positive Real number (used to describe prices).
Let's define a function (pseudo Python):
# Domain of definition : T,P --> R+,P(U)
# This function takes a router type and a position, and returns
# a tuple containing:
# - the price of a router of the given type.
# - the subset of U containing all the position covered by a router
# of the given type placed at the given position.
def weighted_subset(routerType, position):
pass # TODO: implementation
Now, we define a last set, as the image of the function we've just described: S=weighted_subset(T,P). Each element of this set is a subset of U, weighted by a price in R+.
With all this formalism, finding the router types & positions that:
gives coverage to all the desirable locations
minimize the cost
Is equivalent to finding a sub-collection of S:
whose union of their P(U) is equal to U
which minimise the sum of the associated weights
Which is the weighted minimal set cover problem.

backtracking not trying all possibilities

so I've got a list of questions as a dictionary, e.g
{"Question1": 3, "Question2": 5 ... }
That means the "Question1" has 3 points, the second one has 5, etc.
I'm trying to create all subset of question that have between a certain number of questions and points.
I've tried something like
questions = {"Q1":1, "Q2":2, "Q3": 1, "Q4" : 3, "Q5" : 1, "Q6" : 2}
u = 3 #
v = 5 # between u and v questions
x = 5 #
y = 10 #between x and y points
solution = []
n = 0
def main(n_):
global n
n = n_
global solution
solution = []
finalSolution = []
for x in questions.keys():
solution.append("_")
finalSolution.extend(Backtracking(0))
return finalSolution
def Backtracking(k):
finalSolution = []
for c in questions.keys():
solution[k] = c
print ("candidate: ", solution)
if not reject(k):
print ("not rejected: ", solution)
if accept(k):
finalSolution.append(list(solution))
else:
finalSolution.extend(Backtracking(k+1))
return finalSolution
def reject(k):
if solution[k] in solution: #if the question already exists
return True
if k > v: #too many questions
return True
points = 0
for x in solution:
if x in questions.keys():
points = points + questions[x]
if points > y: #too many points
return True
return False
def accept(k):
points = 0
for x in solution:
if x in questions.keys():
points = points + questions[x]
if points in range (x, y+1) and k in range (u, v+1):
return True
return False
print(main(len(questions.keys())))
but it's not trying all possibilities, only putting all the questions on the first index..
I have no idea what I'm doing wrong.
There are three problems with your code.
The first issue is that the first check in your reject function is always True. You can fix that in a variety of ways (you commented that you're now using solution.count(solution[k]) != 1).
The second issue is that your accept function uses the variable name x for what it intends to be two different things (a question from solution in the for loop and the global x that is the minimum number of points). That doesn't work, and you'll get a TypeError when trying to pass it to range. A simple fix is to rename the loop variable (I suggest q since it's a key into questions). Checking if a value is in a range is also a bit awkward. It's usually much nicer to use chained comparisons: if x <= points <= y and u <= k <= v
The third issue is that you're not backtracking at all. The backtracking step needs to reset the global solution list to the same state it had before Backtracking was called. You can do this at the end of the function, just before you return, using solution[k] = "_" (you commented that you've added this line, but I think you put it in the wrong place).
Anyway, here's a fixed version of your functions:
def Backtracking(k):
finalSolution = []
for c in questions.keys():
solution[k] = c
print ("candidate: ", solution)
if not reject(k):
print ("not rejected: ", solution)
if accept(k):
finalSolution.append(list(solution))
else:
finalSolution.extend(Backtracking(k+1))
solution[k] = "_" # backtracking step here!
return finalSolution
def reject(k):
if solution.count(solution[k]) != 1: # fix this condition
return True
if k > v:
return True
points = 0
for q in solution:
if q in questions:
points = points + questions[q]
if points > y: #too many points
return True
return False
def accept(k):
points = 0
for q in solution: # change this loop variable (also done above, for symmetry)
if q in questions:
points = points + questions[q]
if x <= points <= y and u <= k <= v: # chained comparisons are much nicer than range
return True
return False
There are still things that could probably be improved in there. I think having solution be a fixed-size global list with dummy values is especially unpythonic (a dynamically growing list that you pass as an argument would be much more natural). I'd also suggest using sum to add up the points rather than using an explicit loop of your own.

Forming a polygon with classes

So, my problem is: I am trying to create a program which would create a polygon that has atleast 3 points(that are composed of coordinates x and y) or angles. I would like that, if there are less than 3 points or angles submitted, the program returns an error saying there are insufficient number of points. I need to create this with classes.
I have created this so far: `
class Polygon:
number_points = 0
number_angles = 0
def __init__(self, coordinate_x, coordinate_y, angles):
s = []
self.coordinate_x = coordinate_x
self.coordinate_y = coordinate_y
self.angles = angles
self.s = s.append([coordinate_x, coordinate_y])
Polygon.number_points = Polygon.number_points + 1
Nkotnik.number_angles = Polygon.number_angles + 1
# Here i would like the program to check if there are enough points
# and angles to form a polygon and to check if all coordinates are
# numbers. If this requirement is not met, the program prints an
# error message.
def creation(self):
if not isinstance(coordinate_x, (int,float)):
#raise Exception("That is not a number")
if Polygon.number_points <= 3:
`
The idea that I had is that i store the coordinates in a list and then when the user enters enough points, a polygon can be formed.
I am not a native speaker, so if I need to clear things a bit further feel free to ask :) thank you for any possible answers :)
I see an error here:
Polygon.number_points = Polygon.number_points + 1
Nkotnik.number_angles = Polygon.number_angles + 1
Nkotnik should be Polygon. Also, to make it shorter, you could do Polygon.number_points += 1 and same for number_angles.
So now, the creation of the program:
def creation(self):
This is bad design. The function should take the number of points and the number of angles as parameters. So, do this:
def creation(self, points, angles):
But creation is basically initialization, so you should integrate it into your __init__.
Also, your __init__ is strange. number_points and number_angles should be defined in the __init__, not the object body, because those variables are different for different Polygon objects. So after modification, your code looks like this:
class Polygon:
def __init__(self, coord_list, angles):
if len(coord_list) // 2 < 3:
raise Exception("Side count must be 3 or more.")
s = []
self.number_points = 0
self.number_angles = 0
self.coordinates_x = coord_list[::2]
self.coordinates_y = coord_list[1::2]
self.angles = angles
self.s = s.append([coordinate_x, coordinate_y])
self.number_points += len(coord_list // 2)
self.number_angles += len(angles)
num_sides = int(input('Number of sides: ')) #raw_input if you're using Python 2
points = []
angles = []
for i in range(num_sides):
points.append(int(input('X value of point: ')))
points.append(int(input('Y value of point: ')))
for i in range(num_sides):
angles.append(int(input('Angle value: ')))
polygon_object = Polygon(points, angles)
And you're done!
You can do the check at creation time in the class, like this, also you need more that just a angle to define a point
import collections
PointCartesian = collections.namedtuple("PointCartesian","coordinate_x coordinate_y")
PointPolar = collections.namedtuple("PointPolar","magnitude angle")
#this is a easy way to make a class for points, that I recommend have
#a class too
class Polygon(object):
def __init__(self,*argv,**kargv):
points = list()
for elem in argv:
if isinstance(elem,(PointCartesian,PointPolar ) ):
points.append(elem)
else:
raise ValueError("Element "+str(elem)+" of wrong type")
if len(points) <3:
raise ValueError("Insufficient data")
self.points = points
and in other place you have the routine that ask the user for the data, you can check every input or leave it to the class.
to call it do something like this
Polygon(PointCartesian(1,2),PointCartesian(4,7),PointPolar(5,28.2))
Polygon(*list_of_points)

Compute Higher Moments of Data Matrix

this probably leads to scipy/numpy, but right now I'm happy with any functionality as I couldn't find anything in those packages. I have a matrix that contains data for a multi-variate distribution (let's say, 2, for the fun of it). Is there any function to compute (higher) moments of that? All I could find was numpy.mean() and numpy.cov() :o
Thanks :)
/edit:
So some more detail: I have multivariate data, that is, a matrix where rows display variables and columns observations. Now I would like to have a simple way of computing the joint moments of that data, as defined in http://en.wikipedia.org/wiki/Central_moment#Multivariate_moments .
I'm pretty new to python/scipy so I'm not sure I'd be the best person to code this one up, especially for the n-variables case (note that the wikipedia definition is for n=2), and I kind of expected there to be some out-of-the-box thing to use as I thought this would be a standard problem.
/edit2:
Just for the future, in case someone wants to do something similar, the following code (which is still under review) should give the sample equivalent of the raw moments E(X^2), E(Y^2), etc. It only works for two variables right now, but it should be extendable if one feels the need. If you see some mistakes or unclean/unpython-nish code, feel free to comment.
from numpy import *
# this function should return something as
# moments[0] = 1
# moments[1] = mean(X), mean(Y)
# moments[2] = 1/n*X'X, 1/n*X'Y, 1/n*Y'Y
# moments[3] = mean(X'X'X), mean(X'X'Y), mean(X'Y'Y),
# mean(Y'Y'Y)
# etc
def getRawMoments(data, moment, axis=0):
a = moment
if (axis==0):
n = float(data.shape[1])
X = matrix(data[0,:]).reshape((n,1))
Y = matrix(data[1,:]).reshape((n,1))
else:
n = float(data.shape[0])
X = matrix(data[:,0]).reshape((n,1))
Y = matrix(data[:,1]).reshape((n,11))
result = 1
Z = hstack((X,Y))
iota = ones((1,n))
moments = {}
moments[0] = 1
#first, generate huge-ass matrix containing all x-y combinations
# for every power-combination k,l such that k+l = i
# for all 0 <= i <= a
for i in arange(1,a):
if i==2:
moments[i] = moments[i-1]*Z
# if even, postmultiply with X.
elif i%2 == 1:
moments[i] = kron(moments[i-1], Z.T)
# Else, postmultiply with X.T
elif i%2==0:
temp = moments[i-1]
temp2 = temp[:,0:n]*Z
temp3 = temp[:,n:2*n]*Z
moments[i] = hstack((temp2, temp3))
# since now we have many multiple moments
# such as x**2*y and x*y*x, filter non-distinct elements
momentsDistinct = {}
momentsDistinct[0] = 1
for i in arange(1,a):
if i%2 == 0:
data = 1/n*moments[i]
elif i == 1:
temp = moments[i]
temp2 = temp[:,0:n]*iota.T
data = 1/n*hstack((temp2))
else:
temp = moments[i]
temp2 = temp[:,0:n]*iota.T
temp3 = temp[:,n:2*n]*iota.T
data = 1/n*hstack((temp2, temp3))
momentsDistinct[i] = unique(data.flat)
return momentsDistinct(result, axis=1)

Categories

Resources