I want to solve the TSP using a dynamic programming algorithm in Python. The problem is:
Input: cities represented as a list of points. For example, [(1,2), (0.3, 4.5), (9, 3)...]. The distance between cities is defined as the Euclidean distance.
Output: the minimum cost of a traveling salesman tour for this instance, rounded down to the nearest integer.
And the pseudo-code is:
Let A = 2-D array, indexed by subsets S of {1, 2, 3, ..., n} that contain 1 and by destinations j in {1, 2, 3, ..., n}
1. Base case:
2. if S = {0}, then A[S, 1] = 0;
3. else, A[S, 1] = Infinity.
4. for m = 2, 3, ..., n: // m = subproblem size
5. for each subset of {1, 2,...,n} of size m that contains 1:
6. for each j belongs to S and j != 1:
7. A[S, j] = the least value of A[S-{j}, k] + the distance between k and j, over every k in S with k != j
8. Return the least value of A[{1, 2, ..., n}, j] + the distance between j and 1, over every j = 2, 3, ..., n.
My confusions are:
How to index a list using subset, that is how to implement line 5 in the pseudo-code efficiently.
You can encode sets as integers: i'th bit of the integer will represent the state of i'th city (i.e. do we take it in the subset or not).
For example, 35 in decimal is 100011 in binary, which represents cities {1, 2, 6}. Here I count from the rightmost bit, which represents city 1.
In order to index a list using such a representation of a subset, you should create a 2D array of length 2**n (2 to the power n):
# Assuming n is given.
A = [[0 for i in xrange(n)] for j in xrange(2 ** n)]
This comes from the fact that with n-bit integer you can represent every subset of {1, 2, ..., n} (remember, each bit corresponds to exactly one city).
This representation gives you a number of nice possibilities:
# Check whether some city (1-indexed) is inside subset.
if (1 << (i - 1)) & x:
    print 'city %d is inside subset!' % i
# In particular, checking for city #1 is super-easy:
if x & 1:
    print 'city 1 is inside subset!'
# Iterate over subsets with increasing cardinality:
subsets = range(1, 2 ** n)
for subset in sorted(subsets, key=lambda x: bin(x).count('1')):
    print subset,
# For n=4 prints "1 2 4 8 3 5 6 9 10 12 7 11 13 14 15"
# Obtain a subset y, which is the same as x,
# except city #j (1-indexed) is removed:
y = x ^ (1 << (j - 1)) # Note that city #j must be inside x.
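The bit tricks above can be wrapped in small helpers. This is only a sketch: the function names `contains`, `remove` and `members` are made up for illustration and do not come from the answer, and `remove` clears the bit with `& ~` instead of `^` so it is also safe when the city is absent.

```python
def contains(subset, city):
    """True if the 1-indexed city is in the bitmask subset."""
    return bool(subset & (1 << (city - 1)))


def remove(subset, city):
    """Clear the bit of the 1-indexed city (a no-op if it is absent)."""
    return subset & ~(1 << (city - 1))


def members(subset):
    """List the 1-indexed cities encoded in the bitmask subset."""
    cities, city = [], 1
    while subset:
        if subset & 1:
            cities.append(city)
        subset >>= 1
        city += 1
    return cities


x = 0b100011               # the example from the text: cities {1, 2, 6}
print(members(x))
print(members(remove(x, 2)))
```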
This is how I would implement your pseudocode (warning: no testing was done):
# INFINITY and n are defined somewhere above.
# Each row has n + 1 entries so that cities can be indexed 1..n (column 0 is unused).
A = [[INFINITY for i in xrange(n + 1)] for j in xrange(2 ** n)]
# Base case (I guess it should read "if S = {1}, then A[S, 1] = 0",
# because otherwise S = {0} is not a valid index to A, according to line #1)
A[1][1] = 0
# Iterate over all subsets:
subsets = range(1, 2 ** n)
for subset in sorted(subsets, key=lambda x: bin(x).count('1')):
    if not subset & 1:
        # City #1 is not present.
        continue
    for j in xrange(2, n + 1):
        if not (1 << (j - 1)) & subset:
            # City #j is not present.
            continue
        for k in xrange(1, n + 1):
            if k == j or not (1 << (k - 1)) & subset:
                continue
            A[subset][j] = min(A[subset][j], A[subset ^ (1 << (j - 1))][k] + get_dist(j, k))
Besides providing everything needed to implement your pseudocode, this approach is going to be faster than one based on tuples or dicts.
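For reference, here is a self-contained Python 3 sketch of the same bitmask DP that returns the floored tour cost the question asks for. It is an illustration under a few assumptions rather than the answer's own code: cities are 0-indexed (so bit j stands for city j and city 0 is the start), the name `tsp_cost` is invented here, and masks are visited in plain numeric order, which works because removing a city always gives a smaller mask.

```python
import math


def tsp_cost(points):
    """Floor of the cheapest closed tour through all points (bitmask DP)."""
    n = len(points)
    if n <= 1:
        return 0
    dist = [[math.hypot(ax - bx, ay - by) for (bx, by) in points]
            for (ax, ay) in points]
    inf = float("inf")
    # best[mask][j]: cheapest path that starts at city 0, visits exactly
    # the cities in mask, and ends at city j.
    best = [[inf] * n for _ in range(1 << n)]
    best[1][0] = 0.0
    for mask in range(1, 1 << n):
        if not mask & 1:
            continue  # every subproblem must contain the start city
        for j in range(1, n):
            if not mask & (1 << j):
                continue
            prev = mask ^ (1 << j)  # same subset without city j
            best[mask][j] = min(best[prev][k] + dist[k][j]
                                for k in range(n) if prev & (1 << k))
    full = (1 << n) - 1
    return math.floor(min(best[full][j] + dist[j][0] for j in range(1, n)))


print(tsp_cost([(0, 0), (0, 1), (1, 1), (1, 0)]))  # unit square, prints 4
```

The table has 2**n rows of n entries, so memory rather than time is usually the first limit; around n = 20 it already holds about 20 million floats.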
Problem Description:
I'm working on making a function which gives me a definition for a particular combination of several descriptors based on a single index. My inputs are a set of raw features X = [feat0,feat1,feat2,feat3,feat4], a list of powers to be used pow = [1,2,3], and a list of group sizes sizes = [1,3,5]. A valid output might look like the following:
feat0^2 * feat4^3 * feat1^1
This output is valid because feat0, feat4, and feat1 exist within X, their powers exist within pow, and the number of features being combined is in sizes.
Invalid edge cases include:
values which don't exist in X, powers not in pow, and combination sizes not in sizes
combinations that are identical to another are invalid: feat0^2 * feat1^3 and feat1^3 * feat0^2 are the same
combinations that include multiples of the same feature are invalid: feat0^1 * feat0^3 * feat2^2 is invalid
under the hood I'm encoding these groupings as lists of tuples. So feat0^2 * feat4^3 * feat1^1 would be represented as [(0,2), (4,3), (1,1)], where the first element in the tuple is the feature index, and the second is the power.
Question:
my question is, how can I create a 1 to 1 mapping of a particular combination to an index i? I would like to get the number of possible combinations, and be able to plug in an integer i to a function, and have that function generate a particular combination. Something like this:
X = [0.123, 0.111, 11, -5]
pow = [1,2,3]
sizes = [1,3]
#getting total number of combinations
numCombos = get_num_combos(X,pow,sizes)
#getting a random index corresponding to a grouping
i = random.randint(0, numCombos - 1) # randint includes both endpoints
#getting grouping
grouping = generate_grouping(i, X, pow, sizes)
print(grouping)
Resulting in something like
[(0,1), (1,2), (3,1)]
So far, figuring out the generation when not accounting for the various edge cases wasn't too hard, but I'm at a loss for how to account for edge cases 2 and 3; making it guaranteed that no value of i is algebraically equivalent to any other value of i, and that the same feature does not appear multiple times in a grouping.
Current Progress
import bisect
import math
import numpy as np

#computes n choose k
def get_num_groupings(n, k):
    return math.factorial(n) // (math.factorial(k) * math.factorial(n - k))

i = 150
n = 5
m = 3
sizes = [1, 3, 5]
#computing the number of elements in each group length
numElements = [m**k * get_num_groupings(n, k) for k in sizes]
#index bins for each group size
bins = list(np.cumsum(numElements))[:-1]
#getting the current group size (bisect_right, so that i == bins[0] lands in the next bin)
binIdx = bisect.bisect_right(bins, i)
curSize = sizes[binIdx]
#adding idx 0 to bins
bins = [0] + bins
#getting the location of i in the bin
z = i - bins[binIdx]
#getting the product index and combination rank
numCombs = get_num_groupings(n, curSize)
pi = z // numCombs
ci = z % numCombs
#getting the indexes of the powers
pidx = [(pi // m**(curSize - (num+1))) % m for num in range(curSize)]
#getting the indexes of the features
#TODO cidx = unrank(ci, curSize)
This is based on Mad Physicist's answer, though I haven't figured out how to get cidx yet. Some of the variable names are rewritten for my own understanding. To my knowledge this implementation works by logically separating the choice of features from the choice of powers each one gets. So far, I can get the powers from an index i, and once unrank is ironed out I should be able to get the indexes of the features that are used.
Let's look at a slightly different problem that's closely related to what you want: generate all the possible valid combinations.
If you choose a size and a power, finding all possible combinations of features is fairly straightforward:
from itertools import combinations, product
n = len(X)
m = len(powers)
k = size = ... # e.g. 3
pow = ... # e.g. [1, 2, 3]
The iterator of unique combinations of features is given by
def elements(X, size, pow):
    for x in combinations(X, size):
        yield sum(e**p for p, e in zip(pow, x))
The equivalent one-liner would be
(sum(e**p for p, e in zip(pow, x)) for x in combinations(X, size))
This generator has exactly n choose k unique elements. These elements meet all your conditions by definition.
Now you can loop over all possible sizes and product of powers to get all the options:
def all_features(X, sizes, powers):
    for size in sizes:
        for pow in product(powers, repeat=size):
            for x in combinations(X, size):
                yield sum(e**p for p, e in zip(pow, x))
The total number of elements is the sum for each k of m**k * n choose k.
Now that you've counted the possibilities, you can compute the mapping of element to index and vice versa, using a combinatorial number system. Sample ranking and unranking functions for combinations are shown here. You can use them after you adjust the index for the size and power bins.
To show what I mean, assume you have three functions (given in the linked answer):
choose(n, k) computes n choose k
rank(combo) accepts the ordered indices of a specific combination and returns the rank.
unrank(ind, k) accepts a rank and size, and returns the k indices of the corresponding combination.
You can then compute the offsets of each size group and the step for each power within that group. Let's work through your concrete example with n = 5, m = 3, and sizes = [1, 3, 5].
The number of elements for each size is given by
elements = [m**k * choose(n, k) for k in sizes]
The total number of possible arrangements is sum(elements):
3**1 * choose(5, 1) + 3**3 * choose(5, 3) + 3**5 * choose(5, 5) = 3 * 5 + 27 * 10 + 243 * 1 = 15 + 270 + 243 = 528
The cumulative sum is useful to convert between index and element:
cumsum = [0, 15, 285]
When you get an index, you can check which bin it falls in using bisect.
Let's say you were given index = 55. Since 15 < 55 < 285, your offset is 15, size = 3. Within the size = 3 group, you have an offset of z = 55 - 15 = 40.
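Within the k = 3 group, there are m**k = 3**3 = 27 power products and choose(n, k) = choose(5, 3) = 10 feature combinations. Following the loop order of all_features, where the power product is the outer loop, the index of the product is pi = z // choose(n, k) = 4 and the combination rank is ci = z % choose(n, k) = 0.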
So the indices of the power are given by
pidx = [(pi // m**(k - 1)) % m, (pi // m**(k - 2)) % m, ...]
Similarly, the indices of the combination are given by
cidx = unrank(ci, k)
You can convert all these indices into a value using something like
sum(X[q]**powers[p] for p, q in zip(pidx, cidx))
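Putting the steps above together, the index-to-grouping mapping might look like the sketch below. It makes several assumptions: `unrank` here is a simple lexicographic unranking written for this example (not the one from the linked answer, and it takes `n` explicitly), `math.comb` (Python 3.8+) stands in for `choose`, the power product is treated as the slower-varying part, matching the loop order of `all_features`, and the function names follow the question's wish list rather than any existing library.

```python
from bisect import bisect_right
from itertools import accumulate
from math import comb


def unrank(rank, n, k):
    """Return the rank-th k-combination of range(n) in lexicographic order."""
    combo, start = [], 0
    for remaining in range(k, 0, -1):
        for c in range(start, n):
            count = comb(n - c - 1, remaining - 1)  # combos that continue with c
            if rank < count:
                combo.append(c)
                start = c + 1
                break
            rank -= count
    return combo


def get_num_combos(n, powers, sizes):
    """Total number of valid groupings."""
    m = len(powers)
    return sum(m**k * comb(n, k) for k in sizes)


def generate_grouping(i, n, powers, sizes):
    """Map index i to a list of (feature index, power) tuples."""
    m = len(powers)
    counts = [m**k * comb(n, k) for k in sizes]
    offsets = [0] + list(accumulate(counts))   # [0, 15, 285, 528] for the example
    if not 0 <= i < offsets[-1]:
        raise IndexError("index out of range")
    b = bisect_right(offsets, i) - 1           # which size bin i falls in
    k = sizes[b]
    z = i - offsets[b]                         # position inside that bin
    pi, ci = divmod(z, comb(n, k))             # power-product index, combination rank
    pidx = [(pi // m**(k - 1 - t)) % m for t in range(k)]
    cidx = unrank(ci, n, k)
    return [(f, powers[p]) for f, p in zip(cidx, pidx)]


print(get_num_combos(5, [1, 2, 3], [1, 3, 5]))         # 528
print(generate_grouping(55, 5, [1, 2, 3], [1, 3, 5]))  # [(0, 1), (1, 2), (2, 2)]
```

Because `unrank` always returns feature indices in increasing order, no two indices can produce groupings that differ only in the order of their factors, and no feature can appear twice.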
Given an array of length n where each element denotes the size of a set, determine the number of ways you can select a set of size k.
Condition: you can't pick more than one element from any one set. How can I
solve this (in any language)?
Examples:
Input :
n = 4
k = 3
{1,2,1,1} Each value represents number of elements in each set
Output : 7
Example :
{1},{2,3},{4},{5}
{1,2,4}
{1,2,5}
{1,3,4}
{1,3,5}
{1,4,5}
{2,4,5}
{3,4,5}
The code I tried returns 10 combinations, which doesn't satisfy the condition. What mistake did I make here?
The input gives only the subset sizes instead of the actual subsets, so I build a new array from the sum of all the subset sizes.
count = 0
def printCombination(arr, n, r):
    global count
    data = [0]*r
    combinationUtil(arr, data, 0,
                    n - 1, 0, r)
def combinationUtil(arr, data, start,
                    end, index, r):
    global count
    if (index == r):
        for j in range(r):
            print(data[j], end=" ")
        print()
        count += 1
        return
    i = start
    while(i <= end and end - i + 1 >= r - index):
        data[index] = arr[i]
        combinationUtil(arr, data, i + 1,
                        end, index + 1, r)
        i += 1
in_val = [1,2,1,1]
arr = list(range(1,sum(in_val)+1))
r = 3
n = len(arr)
printCombination(arr, n, r)
print(count)
Can we solve this with a formula in minimal time, instead of simulating and traversing each subset? Please shed some light on this or give me a suggestion on how to proceed.
Assuming all input sets are disjoint, a dynamic programming approach lets us count such combinations without enumerating them. With memoization there are at most O(n * k) distinct subproblems f(n, k), where n is the number of sets and k is the selection size, so the running time is polynomial instead of growing like n choose k.
import random # testing
from functools import lru_cache # memoization
xs = [{1},{2,3},{4},{5}]
k = 3
sizes = [len(x) for x in xs]
# [1, 2, 1, 1]
# dynamic programming approach
def count_combinations(sizes, k=3):
    @lru_cache(None)
    def f(n, k):
        if (n < k) or (k < 0):
            # no combination possible
            return 0
        elif n == k:
            # return product sizes[:n]
            res = 1
            for x in sizes[:n]:
                res *= x
            return res
        else:
            # recursive memoized call
            # f(n-1, k-1) ways to select k-1 elts from n-1 sets
            # times size of n-1'st set (counting from 0)
            # plus f(n-1, k) ways to select k elts from n-1 sets
            return sizes[n-1] * f(n-1, k-1) + f(n-1, k)
    return f(len(sizes), k)
# assert count_combs(sizes, k=k) == count_combinations(sizes, k)
# larger benchmark
n = 25
k = n // 2
xs = [{random.randint(0, n) for i in range(n)} for _ in range(n)]
sizes = [len(x) for x in xs]
%time count_combs(sizes, k) # O(n choose k), 8.6 s
%timeit count_combinations(sizes, k) # O(n * k) subproblems, 112 µs
This avoids considering all possible combinations of size k explicitly, dropping the complexity from O(n choose k) to O(n * k) memoized subproblems.
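The same recurrence can also be run bottom-up with a single list, which avoids recursion limits for long inputs. This is a sketch of an equivalent formulation, not the answer's code; `count_selections` is a name chosen here. The value it computes is the k-th elementary symmetric polynomial of the set sizes.

```python
def count_selections(sizes, k):
    """Number of ways to pick k elements with at most one from each set."""
    # ways[j] = number of ways to pick j elements from the sets seen so far
    ways = [1] + [0] * k
    for size in sizes:
        # go downwards so that each set contributes at most one element
        for j in range(k, 0, -1):
            ways[j] += size * ways[j - 1]
    return ways[k]


print(count_selections([1, 2, 1, 1], 3))  # 7, as in the question's example
```

Each set is processed once and updates k entries, so the work is proportional to n * k.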
Determine the number of ways you can select K size set.
from itertools import combinations
from functools import reduce
def count_combs(arr, k):
    if k > len(arr):
        return 0 # Not possible
    elif k == len(arr):
        return reduce(lambda a, b: a*b, arr) # multiply values in arr
    else:
        # sum of answer to each sub-set of arr of size k
        # subsets of arr of size k are combinations(arr, k)
        return sum(count_combs(x, k) for x in combinations(arr, k))
Test
arr = [1, 2, 1, 1]
print(count_combs(arr, 3))
# Outputs 7
arr = [1, 1, 1, 1]
print(count_combs(arr, 3))
# Outputs 4
arr = [1, 1, 1, 2]
print(count_combs(arr, 2))
# Output 9
Explanation
Three cases
k > len(arr): Not possible, so answer is 0
k == len(arr): It's the number of ways we can take one element from each array index, which is the product of the values of the array arr.
k < len(arr): We treat each size-k subset of arr (i.e. each element of combinations(arr, k)) as a sub-problem and sum their solution counts. The solution to each sub-problem is known from case 2 above.
I would very much like to generate n random integer numbers between two values (min, max) whose sum is equal to a given number m.
Note: I found similar questions in StackOverflow; however, they do not address exactly this problem (use of Dirichlet function and thus numbers between 0 and 1).
Example: I need 8 random numbers (integers) between 0 and 24 where the sum of the 8 generated numbers must be equal to 24.
Any help is appreciated. Thanks.
Well, you could use an integer distribution that naturally sums to a fixed number: the multinomial distribution.
Just shift the range to start at zero, sample, and shift back, and it should work automatically.
Code
import numpy as np

def multiSum(n, p, maxv):
    while True:
        v = np.random.multinomial(n, p, size=1)
        q = v[0]
        a, = np.where(q > maxv) # are there any values above max
        if len(a) == 0: # accept only samples below or equal to maxv
            return q

N = 8
S = 24
p = np.full((N), 1.0/np.float64(N))
start = 0
stop = 24
n = S - N*start # integer number of trials after shifting the range to start at 0
h = np.zeros((stop-start+1), dtype=np.int64) # one bin per value in [start...stop]
print(h)
for k in range(0, 10000):
    ns = multiSum(n, p, stop-start) + start # result in [0...24]
    #print(np.sum(ns))
    for v in ns:
        h[v-start] += 1
print(h)
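The same shift-and-reject idea can be sketched with the standard library alone. Dropping `total - count * lo` units uniformly at random into `count` bins is a multinomial draw with equal probabilities. The function name and parameters below are invented for this example, and the rejection loop can take a long time when the bounds are tight (for instance when `total` is close to `count * hi`).

```python
import random


def random_ints_with_sum(count, lo, hi, total):
    """Return `count` random integers in [lo, hi] whose sum is `total`."""
    if count <= 0 or not count * lo <= total <= count * hi:
        raise ValueError("no solution for these bounds")
    surplus = total - count * lo   # what remains after giving every slot lo
    cap = hi - lo                  # largest extra amount a slot may receive
    while True:
        extra = [0] * count
        for _ in range(surplus):   # one multinomial draw with equal probabilities
            extra[random.randrange(count)] += 1
        if max(extra) <= cap:      # reject samples that break the upper bound
            return [lo + e for e in extra]


print(random_ints_with_sum(8, 0, 24, 24))
```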
This is a case of integer partitions from number theory. Here is a solution.
def partition(n,k,l, m):
    # 'return' ends a generator; 'raise StopIteration' is a RuntimeError since Python 3.7
    if k < 1:
        return
    if k == 1:
        if n <= m and n>=l :
            yield (n,)
        return
    for i in range(l,m+1):
        for result in partition(n-i,k-1,i,m):
            yield result+(i,)
n = 24 # sum value
k = 8 # partition size
l = 0 # range min value
m = 24 # range high value
result = list(partition(n,k,l,m ))
This will give all the combinations that satisfy the conditions.
P.S. This is quite slow, as it enumerates every case for that partition size.
This is one possible solution, which is based on this answer. It seems the Dirichlet method only works for values between 0 and 1. Full credit should be given to the original answer. I will be happy to delete it once you comment that it served your purpose.
Don't forget to upvote the original answer.
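import numpy as np
target = 24
# randint excludes the upper bound, so use target + 1 to allow the value 24
x = np.random.randint(0, target + 1, size=(8,))
while sum(x) != target:
    x = np.random.randint(0, target + 1, size=(8,))
print(x)
# e.g. [3 7 0 6 7 0 0 1]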
I'd like to iterate over every other element of a m-by-n "chessboard", i.e.,
l = []
for i in range(m):
    for j in range(n):
        if (i+j) % 2 == 0:
            l.append(something(i, j))
I'm using an explicit loop here, but for speed would rather use a list comprehension.
Any hints?
For bonus points, the solution also works for i, j, k with (i+j+k) % 2 == 0.
Well, list comprehension is just like your nested for loop, except that this is done within the list brackets:
my_list = [something(i, j) for i in range(m) for j in range(n) if (i + j) % 2 == 0]
More generally, for several nested loops, you can use itertools.product. For the three-index case (with q denoting the size of the third dimension):
from itertools import product
my_list = [something(*idx) for idx in product(range(m), range(n), range(q)) if sum(idx) % 2 == 0]
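If the goal is to avoid testing every square, one option is to let the innermost range do the skipping: start it at the parity of the outer indices and step by 2. This is a sketch with a placeholder `something` that just returns its indices, and `p` is an assumed name for the size of the third dimension.

```python
def something(i, j, k=None):
    """Placeholder for the question's something(); returns its indices."""
    return (i, j) if k is None else (i, j, k)


m, n, p = 3, 4, 5

# 2-D: start each row at column i % 2 and step by 2.
board_2d = [something(i, j) for i in range(m) for j in range(i % 2, n, 2)]

# 3-D: the parity of i + j decides where the innermost index starts.
board_3d = [something(i, j, k)
            for i in range(m)
            for j in range(n)
            for k in range((i + j) % 2, p, 2)]

print(board_2d)
```

Both comprehensions visit the squares in the same order as the filtered versions, but generate only half as many index tuples.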
As I understand it, you would like an explicit expression for the x and y coordinates of the black squares on the 'chess board', so that you don't have to evaluate the boolean for every square. Here is an implementation of my solution (for a 2-dimensional board):
import numpy as np
# 'Chess board' dimension
m = 3 # Number of rows
n = 4 # Number of columns (the formula below assumes n is even)
# Counter variable. The length of this array is equal to the total number of black squares.
k = np.arange(0,m*n,2)
x_coords = (k + (k//n) % 2) % n # x-coordinates of black squares
y_coords = (k + (k//n) % 2) // n # y-coordinates of black squares
print("x-coordinates: "+str(x_coords))
print("y-coordinates: "+str(y_coords))
For the 3x4 dimensional board in the example above, this generates the following output:
x-coordinates: [0 2 1 3 0 2]
y-coordinates: [0 0 1 1 2 2]
which you can verify by drawing a little diagram. Note that the 'helper variable' (k//n) % 2 keeps track of whether the row number is even or odd; the odd rows have an 'offset' with respect to the even ones. Integer division (//) is used so that the coordinates stay integers in Python 3 as well.
I'm stumped on how to speed up my algorithm, which sums multiples in a given range. This is for a problem on codewars.com; here is a link to the problem:
codewars link
Here's the code; I'll explain what's going on at the bottom.
import itertools
def solution(number):
    return multiples(3, number) + multiples(5, number) - multiples(15, number)
def multiples(m, count):
    l = 0
    for i in itertools.count(m, m):
        if i < count:
            l += i
        else:
            break
    return l
print solution(50000000) #takes 41.8 seconds
#one of the testers takes 50000000000000000000000000000000000000000 as input
# def multiples(m, count):
#     l = 0
#     for i in xrange(m,count ,m):
#         l += i
#     return l
So basically the problem asks the user to return the sum of all the multiples of 3 and 5 below a given number. Here are the tests.
test.assert_equals(solution(10), 23)
test.assert_equals(solution(20), 78)
test.assert_equals(solution(100), 2318)
test.assert_equals(solution(200), 9168)
test.assert_equals(solution(1000), 233168)
test.assert_equals(solution(10000), 23331668)
My program has no problem getting the right answer. The problem arises when the input is large. When I pass in a number like 50000000, it takes over 40 seconds to return the answer. One of the inputs I'm asked to handle is 50000000000000000000000000000000000000000, which is a huge number. That's also the reason why I'm using itertools.count(): I tried using xrange in my first attempt, but xrange can't handle numbers larger than a C long. I know the slowest part of the program is the multiples function, yet it is still faster than my first attempt, which used a list comprehension and checked whether i % 3 == 0 or i % 5 == 0. Any ideas, guys?
This solution should be faster for large numbers.
def solution(number):
    number -= 1
    a, b, c = number // 3, number // 5, number // 15
    asum, bsum, csum = a*(a+1) // 2, b*(b+1) // 2, c*(c+1) // 2
    return 3*asum + 5*bsum - 15*csum
Explanation:
Take any sequence from 1 to n:
1, 2, 3, 4, ..., n
And its sum will always be given by the formula n(n+1)/2. This can be proven easily if you consider that the expression (1 + n) / 2 is just a shortcut for computing the average, or arithmetic mean, of this particular sequence of numbers. Because average(S) = sum(S) / length(S), if you take the average of any sequence of numbers and multiply it by the length of the sequence, you get the sum of the sequence.
If we're given a number n, and we want the sum of the multiples of some given k up to n, including n, we want to find the summation:
k + 2k + 3k + 4k + ... xk
where xk is the highest multiple of k that is less than or equal to n. Now notice that this summation can be factored into:
k(1 + 2 + 3 + 4 + ... + x)
We are given k already, so now all we need to find is x. If x is defined to be the highest number you can multiply k by to get a natural number less than or equal to n, then we can get the number x by using Python's integer division:
n // k == x
Once we find x, we can find the sum of the multiples of any given k up to a given n using previous formulas:
k(x(x+1)/2)
Our three given k's are 3, 5, and 15.
We find our x's in this line:
a, b, c = number // 3, number // 5, number // 15
Compute the summations of their multiples up to n in this line:
asum, bsum, csum = a*(a+1) // 2, b*(b+1) // 2, c*(c+1) // 2
And finally, multiply their summations by k in this line:
return 3*asum + 5*bsum - 15*csum
And we have our answer!
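The explanation above can be factored into one helper per k, which also makes it easy to check the closed form against a brute-force sum for small inputs. This is a sketch; `sum_of_multiples` is a name chosen here and is not part of the original answer.

```python
def sum_of_multiples(k, limit):
    """Sum of the positive multiples of k that are strictly below limit."""
    x = max(0, (limit - 1) // k)   # how many multiples of k lie below limit
    return k * x * (x + 1) // 2    # k * (1 + 2 + ... + x)


def solution(limit):
    # Multiples of 15 are counted by both 3 and 5, so subtract them once.
    return (sum_of_multiples(3, limit)
            + sum_of_multiples(5, limit)
            - sum_of_multiples(15, limit))


print(solution(10))          # 23
print(solution(5 * 10**40))  # returns immediately, even for the huge test input
```

Since Python integers have arbitrary precision, the huge input needs no special handling.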