Given two numbers m and n, in one move you can get two new pairs:
m+n, n
m, n+m
Let's initially set m = n = 1. Find the minimum number of moves so that at least one of the numbers equals k.
It's guaranteed there's a solution (i.e. there exists a sequence of moves that leads to k).
For example:
given k = 5
the minimum number of moves so that m or n is equal to k is 3
1, 1
1, 2
3, 2
3, 5
Total of 3 moves.
I have come up with a solution using recursion in Python, but it doesn't seem to work on big numbers (e.g. 10^6):
def calc(m, n, k):
    if n > k or m > k:
        return 10**6
    elif n == k or m == k:
        return 0
    else:
        return min(1+calc(m+n, n, k), 1+calc(m, m+n, k))

k = int(input())
print(calc(1, 1, k))
How can I improve the performance so it works for big numbers?
Non-Recursive Algorithm based on Priority Queue (using Heap)
State: (sum_, m, n, path)
sum_ is current sum (i.e. m + n)
m and n are the first and second numbers
path is the sequence of (m, n) pairs to get to the current sum
In each step there are two possible moves
Replace first number by the sum
Replace second number by the sum
Thus each state generates two new states. States are prioritized by:
moves: states with a lower number of moves have higher priority
sum: states with higher sums have higher priority
We use a Priority Queue (Heap in this case) to process states by priority.
Code
from heapq import heappush, heappop

def calc1(k):
    if k < 1:
        return None, None  # No solution
    m, n, moves = 1, 1, 0
    if m == k or n == k:
        return moves, [(m, n)]
    h = []  # Priority queue (heap)
    path = [(m, n)]
    sum_ = m + n
    # Python's heapq acts as a min queue.
    # We can order things by max by using -value rather than value.
    # Thus the tuple (moves+1, -sum_, ...) prioritizes by 1) min moves, and 2) max sum
    heappush(h, (moves+1, -sum_, sum_, n, path))
    heappush(h, (moves+1, -sum_, m, sum_, path))
    while h:
        # Get the state with the fewest moves (ties broken by highest sum)
        moves, sum_, m, n, path = heappop(h)
        sum_ = - sum_
        if sum_ == k:
            return moves, path  # Found solution
        if sum_ < k:
            sum_ = m + n  # new sum
            # Replace first number with sum
            heappush(h, (moves+1, -sum_, sum_, n, path + [(sum_, n)]))
            # Replace second number with sum
            heappush(h, (moves+1, -sum_, m, sum_, path + [(m, sum_)]))
        # else:
        #   sum_ > k, so this state is dropped and the loop continues
    # Exhausted all options, so no solution
    return None, None
Test
Test Code
for k in [5, 100, 1000]:
    moves, path = calc1(k)
    print(f'k: {k}, Moves: {moves}, Path: {path}')
Output
k: 5, Moves: 3, Path: [(1, 1), (2, 3), (2, 5)]
k: 100, Moves: 10, Path: [(1, 1), (2, 3), (5, 3), (8, 3), (8, 11),
(8, 19), (27, 19), (27, 46), (27, 73), (27, 100)]
k: 1000, Moves: 15, Path: [(1, 1), (2, 3), (5, 3), (8, 3), (8, 11),
(19, 11), (19, 30), (49, 30), (79, 30), (79, 109),
(188, 109), (297, 109), (297, 406), (297, 703), (297, 1000)]
Performance Improvement
The following two adjustments improve performance:
Not including the path, just the number of steps (providing a 3X speedup for k = 10,000)
Not using symmetric pairs (providing an additional 2X speedup for k = 10,000)
By symmetric pairs, I mean pairs of m, n which are the same forward and backward, such as (1, 2) and (2, 1).
We don't need to branch on both of these since they will provide the same solution step count.
Improved Code
from heapq import heappush, heappop

def calc(k):
    if k < 1:
        return None  # No solution
    m, n, moves = 1, 1, 0
    if m == k or n == k:
        return moves
    h = []  # Priority queue (heap)
    sum_ = m + n
    # (2, 1) and (1, 2) are symmetric, so only one starting state is pushed
    heappush(h, (moves+1, -sum_, sum_, n))
    while h:
        moves, sum_, m, n = heappop(h)
        sum_ = - sum_
        if sum_ == k:
            return moves
        if sum_ < k:
            sum_ = m + n
            steps = [(sum_, n), (m, sum_)]
            heappush(h, (moves+1, -sum_, *steps[0]))
            if steps[0] != steps[1][::-1]:  # not same tuple in reverse (i.e. not symmetric)
                heappush(h, (moves+1, -sum_, *steps[1]))
Performance
Tested up to k = 100,000, which took ~2 minutes.
Update
Converted solution by #גלעדברקן from JavaScript to Python to test
def g(m, n, memo):
    key = (m, n)
    if key in memo:
        return memo[key]
    if m == 1 or n == 1:
        memo[key] = max(m, n) - 1
    elif m == 0 or n == 0:
        memo[key] = float("inf")
    elif m > n:
        memo[key] = (m // n) + g(m % n, n, memo)
    else:
        memo[key] = (n // m) + g(m, n % m, memo)
    return memo[key]

def f(k, memo={}):
    if k == 1:
        return 0
    return min(g(k, n, memo) for n in range((k // 2) + 1))
Performance of #גלעדברקן Code
Completed 100K in ~1 second
This is 120X faster than my above heap based solution.
This is an interesting problem in number theory, involving linear Diophantine equations. Since there are solutions available online, I gather that you want help in deriving the algorithm yourself.
Restate the problem: you start with two numbers characterized as 1*m+0*n, 0*m+1*n. Use the shorthand (1, 0) and (0, 1). You are looking for the shortest path to any solution to the linear Diophantine equation
a*m + b*n = k
where (a, b) is reached from starting values (1, 1) a.k.a. ( (1, 0), (0, 1) ).
So ... starting from (1, 1), how can you characterize the paths you reach from various permutations of the binary enhancement? At each step, you have two choices: a += b or b += a. Your existing algorithm already recognizes this binary search tree.
These graph transitions -- edges along a lattice -- can be characterized, in terms of which (a, b) pairs you can reach on a given step. Is that enough of a hint to move you along? That characterization is the key to converting this problem into something close to a direct computation.
We can do much better than the queue even with brute force, trying each possible n when setting m to k. Here's JavaScript code, very close to Python syntax:
function g(m, n, memo){
  const key = m + ',' + n;
  if (memo[key])
    return memo[key];
  if (m == 1 || n == 1)
    return Math.max(m, n) - 1;
  if (m == 0 || n == 0)
    return Infinity;
  let answer;
  if (m > n)
    answer = Math.floor(m / n) + g(m % n, n, memo);
  else
    answer = Math.floor(n / m) + g(m, n % m, memo);
  memo[key] = answer;
  return answer;
}

function f(k, memo={}){
  if (k == 1)
    return 0;
  let best = Infinity;
  for (let n=1; n<=Math.floor(k/2); n++)
    best = Math.min(best, g(k, n, memo));
  return best;
}

var memo = {};
var ks = [1, 2, 5, 6, 10, 100, 1000, 100000];
for (let k of ks)
  console.log(`${ k }: ${ f(k, memo) }`);
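To gain some confidence in the idea of running the moves backwards, Euclid-style, here is a minimal Python sketch that cross-checks it against a plain breadth-first search for small k. The names min_moves_bfs, steps_back and min_moves_euclid are made up for this illustration; steps_back is meant as an iterative counterpart of g without the memo, not a replacement for it.

```python
from collections import deque

def min_moves_bfs(k):
    """Breadth-first search over (m, n) pairs; only practical for small k."""
    if k == 1:
        return 0
    seen = {(1, 1)}
    queue = deque([(1, 1, 0)])
    while queue:
        m, n, moves = queue.popleft()
        for pair in ((m + n, n), (m, m + n)):
            if k in pair:
                return moves + 1
            if max(pair) < k and pair not in seen:
                seen.add(pair)
                queue.append((pair[0], pair[1], moves + 1))
    return None

def steps_back(m, n):
    """Moves needed to reach (m, n) from (1, 1), or None if unreachable."""
    steps = 0
    while m > 1 and n > 1:
        if m > n:
            steps += m // n   # undo several "m += n" moves at once
            m %= n
        else:
            steps += n // m   # undo several "n += m" moves at once
            n %= m
    if m == 0 or n == 0:      # gcd was greater than 1
        return None
    return steps + max(m, n) - 1

def min_moves_euclid(k):
    if k == 1:
        return 0
    counts = (steps_back(k, n) for n in range(1, k // 2 + 1))
    return min(c for c in counts if c is not None)

print(min_moves_euclid(5))  # 3, matching the example in the question
```

The search only needs partners n up to k // 2 because (k, n) and (k, k - n) sit at the same depth.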
By default, Python's recursion limit is set to 1000 (10^3). You can change it using the sys module:
import sys
sys.setrecursionlimit(10**6)
Related
I am trying to find a pair (x,y) in A such that x-y = 0 (mod n) where inputs are a positive integer n, a set A of m nonnegative integers and m > n. To run the code below I took an m and n just for the sake of running an example.
Below is the script I have written.
I wonder if there is a more efficient way to write the script
import numpy as np
import sys

n = 10
m = 12

def functi(n, m):
    A = [0] * m
    for i in range(m):
        A[i] = np.random.randint(0,34)
    X = [-1] * n
    for i in range(len(A)-1,-1,-1) : #loop backwards
        a = A[i]
        A.pop(i)
        r = a % n
        if X[r] == -1:
            X[r] = a
        else:
            return(X[r], a)

pair = functi(n, m)
print(pair)
Note that your function doesn't have the parameters described by the problem -- it should take n and A as parameters, not take an m and generate its own A.
The problem is much easier if you look at it as simply "find a pair of numbers with the same value mod n". A simple approach to this is to bucket all of the numbers according to their value % n, and return a bucket once it has two numbers in it. That way you don't need to compare each pair of values individually to see if they match.
>>> import random
>>> def find_equal_pair_mod_n(n, A):
...     assert len(A) > n
...     mods = {}
...     for i in A:
...         xy = mods.setdefault(i % n, [])
...         xy.append(i)
...         if len(xy) > 1:
...             return tuple(xy)
...
>>> find_equal_pair_mod_n(10, [random.randint(0, 34) for _ in range(12)])
(26, 6)
>>> find_equal_pair_mod_n(10, [random.randint(0, 34) for _ in range(12)])
(30, 10)
>>> find_equal_pair_mod_n(10, [random.randint(0, 34) for _ in range(12)])
(32, 32)
>>> find_equal_pair_mod_n(10, [random.randint(0, 34) for _ in range(12)])
(1, 1)
>>> find_equal_pair_mod_n(10, [random.randint(0, 34) for _ in range(12)])
(28, 8)
I'm trying to implement merge sort in Python based on the following pseudocode. I know there are many implementations out there, but I have not been able to find one that follows this pattern, with a for loop at the end as opposed to while loop(s). Also, setting the last values in the subarrays to infinity is something I haven't seen in other implementations. NOTE: The following pseudocode uses 1-based indexing, i.e. the index starts at 1. So I think my biggest issue is getting the indexing right. Right now it's just not sorting properly, and it's really hard to follow with the debugger. My implementation is at the bottom.
Current Output:
Input: [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
Merge Sort: [0, 0, 0, 3, 0, 5, 5, 5, 8, 0]
def merge_sort(arr, p, r):
    if p < r:
        q = (p + (r - 1)) // 2
        merge_sort(arr, p, q)
        merge_sort(arr, q + 1, r)
        merge(arr, p, q, r)

def merge(A, p, q, r):
    n1 = q - p + 1
    n2 = r - q
    L = [0] * (n1 + 1)
    R = [0] * (n2 + 1)
    for i in range(0, n1):
        L[i] = A[p + i]
    for j in range(0, n2):
        R[j] = A[q + 1 + j]
    L[n1] = 10000000 #dont know how to do infinity for integers
    R[n2] = 10000000 #dont know how to do infinity for integers
    i = 0
    j = 0
    for k in range(p, r):
        if L[i] <= R[j]:
            A[k] = L[i]
            i += 1
        else:
            A[k] = R[j]
            j += 1
    return A
First of all, you need to decide whether the interval represented by p and r is open or closed at its endpoints. The pseudocode (whose for loops include the last index) establishes that the interval is closed at both endpoints: [p, r].
With that observation in mind, note that for k in range(p, r): doesn't process the last index, so the correct line is for k in range(p, r + 1):.
You can represent "infinity" in your problem by using the maximum element of A in the range [p, r] plus one. That will do the job.
You do not need to return the array A, because all changes are made in place through its reference.
Also, q = (p + (r - 1)) // 2 isn't wrong (because p < r), but the usual formula is q = (p + r) // 2, since you want the integer midpoint of the two endpoints.
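Putting those points together, a minimal sketch of the closed-interval version could look like the following. It keeps the question's function names and 0-based Python indexing with [p, r] inclusive; the sentinel variable is an assumption of this sketch, implementing the "maximum plus one" idea rather than a true infinity.

```python
def merge(A, p, q, r):
    """Merge the sorted runs A[p..q] and A[q+1..r] (both ends inclusive)."""
    # Sentinel strictly larger than every element being merged.
    sentinel = max(A[p:r + 1]) + 1
    L = A[p:q + 1] + [sentinel]
    R = A[q + 1:r + 1] + [sentinel]
    i = j = 0
    for k in range(p, r + 1):  # r + 1 so that index r is written too
        if L[i] <= R[j]:
            A[k] = L[i]
            i += 1
        else:
            A[k] = R[j]
            j += 1

def merge_sort(A, p, r):
    if p < r:
        q = (p + r) // 2
        merge_sort(A, p, q)
        merge_sort(A, q + 1, r)
        merge(A, p, q, r)

data = [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
merge_sort(data, 0, len(data) - 1)
print(data)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Note that the initial call passes len(data) - 1 as r, since the right endpoint is part of the interval.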
Here is a rewrite of the algorithm with “modern” conventions, which are the following:
Indices are 0-based
The end of a range is not part of that range; in other words, intervals are closed on the left and open on the right.
This is the resulting code:
INF = float('inf')
def merge_sort(A, p=0, r=None):
if r is None:
r = len(A)
if r - p > 1:
q = (p + r) // 2
merge_sort(A, p, q)
merge_sort(A, q, r)
merge(A, p, q, r)
def merge(A, p, q, r):
L = A[p:q]; L.append(INF)
R = A[q:r]; R.append(INF)
i = 0
j = 0
for k in range(p, r):
if L[i] <= R[j]:
A[k] = L[i]
i += 1
else:
A[k] = R[j]
j += 1
A = [433, 17, 585, 699, 942, 483, 235, 736, 629, 609]
merge_sort(A)
print(A)
# → [17, 235, 433, 483, 585, 609, 629, 699, 736, 942]
Notes:
Python has a handy syntax for copying a subrange.
There is no int infinity in Python, but we can use the float one, because ints and floats can always be compared.
There is one difference between this algorithm and the original one, but it is irrelevant. Since the “midpoint” q does not belong to the left range, L is shorter than R when the sum of their lengths is odd. In the original algorithm, q belongs to L, and so L is the longer of the two in this case. This does not change the correctness of the algorithm, since it simply swaps the roles of L and R. If for some reason you need not to have this difference, then you must calculate q like this:
q = (p + r + 1) // 2
In mathematics, we represent all real numbers which are greater than or equal to i and smaller than j by [i, j). Notice the use of [ and ) brackets here. I have used i and j in the same way in my code to represent the region that I am dealing with currently.
The region [i, j) of an array covers all indexes (integer values) of this array which are greater than or equal to i and smaller than j. i and j are 0-based indexes. Ignore first_array and second_array for the time being.
Please notice, that i and j define the region of the array that I am dealing with currently.
Examples to understand this better
If your region spans over the whole array, then i should be 0 and j should be the length of array [0, length).
The region [i, i + 1) has only index i in it.
The region [i, i + 2) has index i and i + 1 in it.
def mergeSort(first_array, second_array, i, j):
    if j > i + 1:
        mid = (i + j + 1) // 2
        mergeSort(second_array, first_array, i, mid)
        mergeSort(second_array, first_array, mid, j)
        merge(first_array, second_array, i, mid, j)
I have calculated the middle point as mid = (i + j + 1) // 2; mid = (i + j) // 2 works equally well. The calculated mid value divides the region of the array that I am currently dealing with into 2 smaller regions.
In line 4 of the code, mergeSort is called on the region [i, mid), and in line 5, mergeSort is called on the region [mid, j).
You can access the whole code here.
Given the number of squares in a board (e.g. a Scrabble or chess board), N, and dimensions AxB, this code tries to determine all possible dimensional combinations that can give N squares in the board.
Example: N = 8
There are four possible dimensional combinations to obtain exactly 8 squares in the board. So, the code outputs board dimensions 1x8, 2x3, 3x2 and 8x1. The 8x1 boards have eight 1x1 squares; the 3x2 boards have six 1x1 squares and two 2x2 squares.
Here is my solution:
def dims(num_sqrs):
    dim_list=[]
    for i in range(1,num_sqrs+1):
        temp = []
        for x in range(1,num_sqrs+1):
            res = 0
            a = i
            b = x
            while (a != 0) and (b !=0):
                res = res + (a*b)
                a = a -1
                b = b-1
            if res == num_sqrs:
                dim_list.append((i,x))
    print(dim_list)

dims(8)
However, this code takes too much time to run for large values of N.
Any suggestion to optimize the efficiency of the code will be much appreciated.
Here are two pretty obvious observations:
The square count for AxB is the same as the square count for BxA
If C>B then the square count for AxC is greater than the square count for AxB
Given those facts, it should be clear that:
We only need to consider AxB for A≤B, since we can just add BxA to the list if A≠B
For a given A and N, there is at most one value of B which has a square count of N.
The code below is based on the above. It tries each AxA in turn, for each one checking to see if there is some B≥A which produces the correct square count. It stops when the square count for AxA exceeds N.
Now, to find the correct value of B, a couple of slightly less obvious observations.
Suppose the square count for AxA is N. Then the square count for (A+1)x(A+1) is N + (A+1)².
Proof: Every square in AxA can be identified by its upper left co-ordinate [i, j] and its size s. I'll write that as [s: i, j]. (Here I'm assuming that coordinates are zero-based and go from top to bottom and left to right.)
For each such square 0 ≤ i + s < A and 0 ≤ j + s < A (assuming 0-based coordinates).
Now, suppose we change each square [s: i, j] into the square based at the same coordinate but with a size one larger, [s+1: i, j]. That new square is a square in (A+1)x(A+1), because 0 ≤ i + s + 1 < A + 1 (and similarly for j). So that transformation gives us every square in A + 1 whose size is at least 2. The only squares which we've missed are the squares of size 1, and there are exactly (A+1)×(A+1) of them.
Suppose the square count for AxB is N, and B≥A. Then the square count for Ax(B+1) is N + the sum of each integer from 1 to A. (These sums are the triangular numbers, A×(A+1)/2; I think that's well-known.)
Proof: The squares in Ax(B+1) are precisely the squares in AxB plus the squares whose right-hand side includes the last column of Ax(B+1). So we only need to count those. There is one such square of size A, two of size A-1, three of size A-2, and so on up to A squares of size 1.
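Both observations are easy to spot-check by brute force. The sketch below is only a sanity check under the usual counting rule (an a x b board holds (a - s) * (b - s) squares of side s + 1); count_squares and check_observations are names invented for this illustration.

```python
def count_squares(a, b):
    """Count squares of every size in an a x b board by direct summation."""
    return sum((a - s) * (b - s) for s in range(min(a, b)))

def check_observations(limit=30):
    for a in range(1, limit):
        # Growing a square board from AxA to (A+1)x(A+1) adds (A+1)^2 squares.
        assert count_squares(a + 1, a + 1) - count_squares(a, a) == (a + 1) ** 2
        for b in range(a, limit):
            # Adding one column to AxB (B >= A) adds 1 + 2 + ... + A squares.
            assert count_squares(a, b + 1) - count_squares(a, b) == a * (a + 1) // 2
    return True

print(count_squares(2, 3))   # 8, as in the question's example
print(check_observations())  # True
```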
So for a given A, we can compute the square count for AxA and the increment in the square count for each increase in B. If the increment evenly divides the difference between the target count and the count of AxA, then we've found an AxB.
The program below also relies on one more algebraic identity, which is pretty straight-forward: the sum of two consecutive triangular numbers is a square. That's obvious by just arranging the two triangles. The larger one contains the diagonal of the square. These facts are used to compute the next base value and increment for the next value of A.
def finds(n):
    a = 1
    base = 1   # Square count for AxA
    inc = 1    # Difference between count(AxB) and count(AxB+1)
    rects = []
    while base < n:
        if (n - base) % inc == 0:
            rects.append((a, a + (n - base) // inc))
        a += 1
        newinc = inc + a
        base += inc + newinc
        inc = newinc
    if base == n:
        return rects + [(a, a)] + list(map(lambda p:p[::-1], reversed(rects)))
    else:
        return rects + list(map(lambda p:p[::-1], reversed(rects)))
The slowest part of that function is appending the reversed AxB solutions at the end, which I only did to simplify counting the solutions correctly. My first try, which was almost twice as fast, used the loop while base <= n and then just returned rects. But it's still fast enough.
For example:
>>> finds(1000000)
[(1, 1000000), (4, 100001), (5, 66668), (15, 8338), (24, 3341),
(3341, 24), (8338, 15), (66668, 5), (100001, 4), (1000000, 1)]
>>> finds(760760)
[(1, 760760), (2, 253587), (3, 126794), (4, 76077), (7, 27172),
(10, 13835), (11, 11530), (12, 9757), (13, 8364), (19, 4010),
(20, 3629), (21, 3300), (38, 1039), (39, 988), (55, 512),
(56, 495), (65, 376), (76, 285), (285, 76), (376, 65),
(495, 56), (512, 55), (988, 39), (1039, 38), (3300, 21),
(3629, 20), (4010, 19), (8364, 13), (9757, 12), (11530, 11),
(13835, 10), (27172, 7), (76077, 4), (126794, 3), (253587, 2),
(760760, 1)]
The last one came out of this test, which took a few seconds: (It finds each successive maximum number of solutions, if you don't feel like untangling the functional elements)
>>> from functools import reduce
>>> print('\n'.join(
map(lambda l:' '.join(map(lambda ab:"%dx%d"%ab, l)),
reduce(lambda a,b: a if len(b) <= len(a[-1]) else a + [b],
(finds(n) for n in range(2,1000001)),[[(1,1)]]))))
1x1
1x2 2x1
1x5 2x2 5x1
1x8 2x3 3x2 8x1
1x14 2x5 3x3 5x2 14x1
1x20 2x7 3x4 4x3 7x2 20x1
1x50 2x17 3x9 4x6 6x4 9x3 17x2 50x1
1x140 2x47 3x24 4x15 7x7 15x4 24x3 47x2 140x1
1x280 4x29 5x20 6x15 7x12 12x7 15x6 20x5 29x4 280x1
1x770 2x257 3x129 4x78 10x17 11x15 15x11 17x10 78x4 129x3 257x2 770x1
1x1430 2x477 3x239 4x144 10x29 11x25 12x22 22x12 25x11 29x10 144x4 239x3 477x2 1430x1
1x3080 2x1027 3x514 4x309 7x112 10x59 11x50 20x21 21x20 50x11 59x10 112x7 309x4 514x3 1027x2 3080x1
1x7700 2x2567 3x1284 4x771 7x277 10x143 11x120 20x43 21x40 40x21 43x20 120x11 143x10 277x7 771x4 1284x3 2567x2 7700x1
1x10010 2x3337 3x1669 4x1002 10x185 11x155 12x132 13x114 20x54 21x50 50x21 54x20 114x13 132x12 155x11 185x10 1002x4 1669x3 3337x2 10010x1
1x34580 2x11527 3x5764 4x3459 7x1237 12x447 13x384 19x188 20x171 38x59 39x57 57x39 59x38 171x20 188x19 384x13 447x12 1237x7 3459x4 5764x3 11527x2 34580x1
1x40040 2x13347 3x6674 4x4005 7x1432 10x731 11x610 12x517 13x444 20x197 21x180 39x64 64x39 180x21 197x20 444x13 517x12 610x11 731x10 1432x7 4005x4 6674x3 13347x2 40040x1
1x100100 2x33367 3x16684 4x10011 7x3577 10x1823 11x1520 12x1287 13x1104 20x483 21x440 25x316 39x141 55x83 65x68 68x65 83x55 141x39 316x25 440x21 483x20 1104x13 1287x12 1520x11 1823x10 3577x7 10011x4 16684x3 33367x2 100100x1
1x340340 2x113447 3x56724 4x34035 7x12157 10x6191 11x5160 12x4367 13x3744 20x1627 21x1480 34x583 39x449 55x239 65x180 84x123 123x84 180x65 239x55 449x39 583x34 1480x21 1627x20 3744x13 4367x12 5160x11 6191x10 12157x7 34035x4 56724x3 113447x2 340340x1
1x760760 2x253587 3x126794 4x76077 7x27172 10x13835 11x11530 12x9757 13x8364 19x4010 20x3629 21x3300 38x1039 39x988 55x512 56x495 65x376 76x285 285x76 376x65 495x56 512x55 988x39 1039x38 3300x21 3629x20 4010x19 8364x13 9757x12 11530x11 13835x10 27172x7 76077x4 126794x3 253587x2 760760x1
I think the critical detail is that #Qudus is looking for boards where there are N squares of any size.
One simple optimization is to just break when res > n. Another optimization to make it about twice as fast is to only run it for boards where length >= width.
def dims(num_sqrs):
    dim_list=[]
    for i in range(1, num_sqrs + 1):
        temp = []
        for x in range(1, i + 1):
            res = 0
            a = i
            b = x
            while (a != 0) and (b != 0):
                res = res + (a * b)
                a = a - 1
                b = b - 1
                if res > num_sqrs:
                    break
            if res == num_sqrs:
                dim_list.append((i, x))
                if i != x:
                    dim_list.append((x, i))
    print(dim_list)
Here's a much faster solution that takes a different approach:
def dims(num_sqrs):
    dim_list = []
    sum_squares = [0]
    sums = [0]
    for i in range(1, num_sqrs + 1):
        sums.append(sums[-1] + i)
        sum_squares.append(sum_squares[-1] + i * i)
    for i in range(1, num_sqrs + 1):
        if sum_squares[i] > num_sqrs:
            break
        if sum_squares[i] == num_sqrs:
            dim_list.append((i, i))
            break
        for x in range(i + 1, num_sqrs + 1):
            total_squares = sum_squares[i] + sums[i] * (x - i)
            if total_squares == num_sqrs:
                dim_list.append((x, i))
                dim_list.append((i, x))
                break
            if total_squares > num_sqrs:
                break
    return dim_list
Start with basic algebraic analysis. I derived my own formula for the sums of various sizes. From the initial analysis, we get that for a board of size n x m, there are (n-k)*(m-k) squares of size k+1. Summing this for k in [0, min(m, n)), we have a simple calculation formula:
sum(((n-k) * (m-k) for k in range(0, min(n, m))))
I expanded the product to nm - k(n+m) + k^2, re-derived the individual series sums, and made a non-iterative formula, assuming n <= m:
n * n * m
- n * (n - 1) / 2 * (n + m)
+ ((n - 1) * n * (2 * n - 1))/6
This first link then spoiled my fun with an even shorter formula:
t = m - n
n * (n + 1) / 6 * (2 * n + 3 * t + 1)
which follows from mine with a bit of nifty rearrangement of terms.
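As a quick sanity check, the summation and the short closed form can be compared directly. This is only an illustrative sketch: the function names squares_by_sum and squares_closed_form are invented here, and the closed form is written with integer division, which is safe because n * (n + 1) * (2 * n + 3 * t + 1) is always a multiple of 6.

```python
def squares_by_sum(n, m):
    """Direct summation: (n - k) * (m - k) squares of side k + 1."""
    return sum((n - k) * (m - k) for k in range(min(n, m)))

def squares_closed_form(n, m):
    """Closed form n(n+1)(2n+3t+1)/6 with t = m - n, assuming n <= m."""
    if n > m:
        n, m = m, n  # the count is symmetric, so swap into n <= m
    t = m - n
    return n * (n + 1) * (2 * n + 3 * t + 1) // 6

print(squares_by_sum(2, 3), squares_closed_form(2, 3))  # 8 8
```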
Now to the point of this exercise: given a desired count of squares, Q, find all rectangle dimensions (n, m) that have exactly that many squares. Starting with the formula above:
q = n * (n + 1) / 6 * (2 * n + 3 * t + 1)
Since we're given Q, the desired value for q, we can iterate through all values of n, finding whether there is a positive, integral value for t that satisfies the formula. Start by solving this for t:
t = (6/(n*(n+1)) * q - 2*n - 1) / 3
combining the denominators:
t = (6*q) / (3*n*(n+1)) - (2*n + 1)/3
I'll use the first version. Since a solution of n x m implies a solution of m x n, we can limit our search to only those cases n <= m. Also, since the numerator shrinks (negative n^3 term), we can limit the search for values of n that allow t >= 1 -- in other words, have the combined numerator at least as large as the denominator:
numer = 6 * num_sqrs - n * (n+1) * (2*n+1)
denom = 3 * n * (n+1)
Solving this:
num_sqrs > (n * (n+1) * (n+2)) / 3
Since n * (n+1) * (n+2) > n^3, this implies n^3 < 3 * num_sqrs. Thus, the cube root of 3 * num_sqrs is a convenient upper bound for our loop limit.
This gives us a simple iteration loop in the code:
from math import ceil

def dims(num_sqrs):
    dim = [(1, num_sqrs)]
    limit = ceil((3*num_sqrs)**(1.0/3.0))
    for n in range(2, limit):
        numer = 6 * num_sqrs - n * (n+1) * (2*n+1)
        denom = 3 * n * (n+1)
        if numer % denom == 0:
            t = numer // denom
            if t >= 0:
                dim.append((n, n+t))
    return dim
Output for a couple of test cases:
>>> print(dims(8))
[(1, 8), (2, 3)]
>>> print(dims(2000))
[(1, 2000), (2, 667), (3, 334), (4, 201)]
>>> print(dims(1000000))
[(1, 1000000), (4, 100001), (5, 66668), (15, 8338), (24, 3341)]
>>> print(dims(21493600))
[(1, 21493600), (4, 2149361), (5, 1432908), (15, 179118), (24, 71653), (400, 401)]
These return immediately, so I expect that this solution is fast enough for OP's purposes.
It's quite possible that a parameterized equation would give us direct solutions, rather than iterating through possibilities. I'll leave that for the Project Euler folks. :-)
This uses the formula derived in the link provided by the OP. The only real optimization is trying not to look at dimensions that cannot produce the result. Pre-loading the results with the two end cases (figures = [(1,n_squares),(n_squares,1)]) saved a lot with big numbers. I think there are other chunks that can be discarded, but I haven't figured them out yet.
def h(n_squares):
    # easiest case for a n x m figure:
    # n = 1 and m = n_squares
    figures = [(1,n_squares),(n_squares,1)]
    for n in range(2, n_squares+1):
        for m in range(n, n_squares+1):
            t = m - n
            # integer arithmetic avoids float rounding; the product is always divisible by 6
            x = n * (n + 1) * ((2 * n) + (3 * t) + 1) // 6
            if x > n_squares:
                break
            if x == n_squares:
                figures.extend([(n,m),(m,n)])
                #print(f'{n:>6} x {m:<6} has {n_squares} squares')
        if x > n_squares and n == m:
            break
    return figures
It also doesn't make lots of lists which can blow up your computer with really big numbers like 21493600 (400x401).
Formula derivation from link in OP's comment (in case that resource disappears):
Courtesy: Doctor Anthony, The Math Forum
If we have an 8 x 9 board the numbers of squares are as follows:
Size of Square    Number of Squares
--------------    -----------------
1 x 1             8 x 9 = 72
2 x 2             7 x 8 = 56
3 x 3             6 x 7 = 42
4 x 4             5 x 6 = 30
5 x 5             4 x 5 = 20
6 x 6             3 x 4 = 12
7 x 7             2 x 3 = 6
8 x 8             1 x 2 = 2
-----------------------------------
                  Total = 240
For the general case of an n x m board, where m = n + t
We require
\sum_{r=1}^{n} r(r + t) = \sum_{r=1}^{n} (r^2 + rt)
= n(n + 1)(2n + 1)/6 + tn(n + 1)/2
= [n(n + 1)/6]*[2n + 1 + 3t]
No. of squares =
[n(n + 1)/6]*[2n + 3t + 1] .......(1)
In the example above t = 1 and so
No. of squares = 8 x 9/6[16 + 3 + 1]
= (72/6)[20]
= 240 (as required)
The general formula for an (n x n+t) board is that given in (1)
above.
No. of squares = [n(n + 1)/6]*[2n + 3t + 1]
def pack(L, n):
    '''Return the subset of L with the largest sum up to n
    >>> s = [4,1,3,5]
    >>> pack(s, 7)
    {3, 4}
    >>> pack(s, 6)
    {1, 5}
    >>> pack(s, 11)
    {1, 4, 5}
    '''
I'm asked to code this. It takes in a list and an integer and returns the combination whose sum is as large as possible while still being less than or equal to that integer.
I used a helper function that also tracks the sum, but it's not correct, since I don't know how I could replace a number during the recursion.
# doesn't work as intended
def pack_helper(L, n, sum=0):
    '''Return the subset of L with the largest sum up to n and the sum total
    >>> s = [4,1,3,5]
    >>> pack_helper(s, 7)
    ({3, 4}, 7)
    >>> pack(s, 6)
    ({1, 5}, 6)
    >>> pack(s, 11)
    ({1, 4, 5}, 10)
    '''
    package = set()
    if L == []:
        result = (package, sum)
    else:
        first = L[0]
        (package, sum) = pack_helper(L[1:], n, sum)
        if sum < n and (first + sum) <= n:
            package.add(first)
            sum = sum + first
    return (package, sum)
Any hints or help? Thx
Here's a simple recursive function that does the job:
def pack(L, n):
    '''Return the subset of L with the largest sum up to n
    >>> s = [4,1,3,5]
    >>> pack(s, 7)
    {3, 4}
    >>> pack(s, 6)
    {1, 5}
    >>> pack(s, 11)
    {1, 4, 5}
    '''
    if all(j > n for j in L):
        return set()
    return max(({j} | pack(L[i+1:], n-j) for i, j in enumerate(L) if j <= n), key=sum)
If you're using Python 3, you can pass the default parameter to max instead:
def pack(L, n):
    return max(({j} | pack(L[i+1:], n-j) for i, j in enumerate(L) if j <= n), key=sum, default=set())
The test data here is small enough that brute force is pretty fast. Recursion is not necessary:
from itertools import chain, combinations

# taken from the itertools documentation
def powerset(iterable):
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))

def pack(L, n):
    best_set, best_sum = (), 0
    for candidate in powerset(L):
        total = sum(candidate)
        if best_sum < total <= n:
            best_set, best_sum = candidate, total
    return best_set
However, assuming positive weights, the dynamic programming solution is pretty short.
def pack(L, n):
    assert all(w > 0 for w in L), 'weights must all be positive'
    a = [((), 0)] * (n + 1)
    for w in L:
        a = [ (a[x - w][0] + (w,), a[x - w][1] + w)
                if w <= x and a[x][1] < a[x - w][1] + w
                else a[x] for x in range(n + 1) ]
    return a[n][0]
How does this work?
a[x] stores the best set of weights processed so far that sum up to x or less (and the sum, just to save time). Before any weights have been processed, these are all empty ().
To process a new weight w at target x, one of the following two sets must be the best.
the best set of weights without this new weight that sum up to x (the old a[x]), or
the best set of weights without this new weight that sum up to x - w, plus this new weight w
Once all the weights are processed, the solution is right there at the end.
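The recurrence described above can be seen in isolation in a sketch that tracks only the best sum, not the set itself. The name best_sum is hypothetical; iterating x downward is a common trick that lets a single list be updated in place while still using each weight at most once.

```python
def best_sum(weights, n):
    """Largest subset sum of positive integer weights that does not exceed n."""
    best = [0] * (n + 1)  # best[x]: largest achievable sum <= x so far
    for w in weights:
        # Go downward so best[x - w] still refers to the state before w was added.
        for x in range(n, w - 1, -1):
            best[x] = max(best[x], best[x - w] + w)
    return best[n]

s = [4, 1, 3, 5]
print(best_sum(s, 7), best_sum(s, 6), best_sum(s, 11))  # 7 6 10
```

The three results match the sums of the sets in the docstring examples ({3, 4}, {1, 5} and {1, 4, 5}).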
By the way, this is the well-known 0/1 knapsack problem. (The Wikipedia article currently has a solution that uses O(len(L) * n) time and O(len(L) * n) space, but it's doable in O(n) space, as I demonstrated here.)
I want to write a bottom-up Fibonacci function using O(1) space. My problem is that Python's recursion stack is limiting me from testing large numbers. Could someone provide an alternative or an optimization to what I have? This is my code:
def fib_in_place(n):
    def fibo(f2, f1, i):
        if i < 1:
            return f2
        else:
            return fibo(f1, f2+f1, i -1)
    return fibo(0, 1, n)
Using recursion this way means you're using O(N) space, not O(1) - the O(N) is in the stack.
Why use recursion at all?
def fib(n):
    a, b = 0, 1
    for i in range(n):
        a, b = b, a + b
    return a
You can memoize the Fibonacci function for efficiency, but if you require a recursive function, it's still going to take at least O(n) space:
def mem_fib(n, _cache={}):
    '''efficiently memoized recursive function, returns a Fibonacci number'''
    if n in _cache:
        return _cache[n]
    elif n > 1:
        return _cache.setdefault(n, mem_fib(n-1) + mem_fib(n-2))
    return n
This is from my answer on the main Fibonacci in Python question: How to write the Fibonacci Sequence in Python
If you're allowed to use iteration instead of recursion, you should do this:
def fib():
    a, b = 0, 1
    while True:            # First iteration:
        yield a            # yield 0 to start with and then
        a, b = b, a + b    # a will now be 1, and b will also be 1, (0 + 1)
usage:
>>> list(zip(range(10), fib()))
[(0, 0), (1, 1), (2, 1), (3, 2), (4, 3), (5, 5), (6, 8), (7, 13), (8, 21), (9, 34)]
If you just want to get the nth number:
def get_fib(n):
    fib_gen = fib()
    for _ in range(n):
        next(fib_gen)
    return next(fib_gen)
and usage
>>> get_fib(10)
55
Why use iteration at all?
import math

def fib(n):
    phi_1 = (1 + math.sqrt(5)) / 2
    phi_2 = (1 - math.sqrt(5)) / 2  # the negative root; its sign matters for odd n
    f = (phi_1**n - phi_2**n) / math.sqrt(5)
    return round(f)
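The algebraic result is exact; the round operation only compensates for floating-point inaccuracy, so the computed value can be trusted only while the numbers involved fit within double precision (roughly n up to 70).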
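How far the floating-point formula can be trusted is easy to probe by comparing it with exact integer iteration. The sketch below is illustrative only: fib_exact, fib_closed and first_mismatch are names invented here, and fib_closed uses the standard Binet roots (1 + sqrt(5)) / 2 and (1 - sqrt(5)) / 2. The exact index where the two diverge depends on the platform's floating-point arithmetic, so no particular value is claimed.

```python
import math

def fib_exact(n):
    """Exact Fibonacci number via integer iteration."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def fib_closed(n):
    """Binet's formula evaluated in double-precision floating point."""
    sqrt5 = math.sqrt(5)
    phi = (1 + sqrt5) / 2
    psi = (1 - sqrt5) / 2
    return round((phi**n - psi**n) / sqrt5)

def first_mismatch(limit=200):
    """Return the first n < limit where the float formula is wrong, or None."""
    for n in range(limit):
        if fib_closed(n) != fib_exact(n):
            return n
    return None

print(fib_closed(10))     # 55
print(first_mismatch())   # first n at which rounding error shows up, if any
```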
Tail-recursive definitions are easily turned into iterative definitions. If necessary, flip the condition so that the tail-recursive call is in the 'if' branch.
def fibo(f2, f1, i):
    if i > 0:
        return fibo(f1, f2+f1, i -1)
    else:
        return f2
Then turn 'if' into 'while', replace return with unpacking assignment of the new arguments, and (optionally) drop 'else'.
def fibo(f2, f1, i):
    while i > 0:
        f2, f1, i = f1, f2+f1, i -1
    return f2
With iteration, you do not need the nested definition.
def fib_efficient(n):
    if n < 0:
        raise ValueError('fib argument n cannot be negative')
    new, old = 0, 1
    while n:
        new, old = old, old+new
        n -= 1
    return new
Local names 'new' and 'old' refer to Fibonacci's use of biological reproduction to motivate the sequence. However, the story works better with yeast cells instead of rabbits. Old, mature yeast cells reproduce by budding off new, immature cells. (The original source of the function in India appears to be Virahanka counting the number of ways to make a Sanskrit poetic line with n beats from an ordered sequence of 1- and 2-beat syllables.)