Pythagorean Triplet with given sum - python

The following code prints a Pythagorean triplet whose sum equals the input, but it takes a long time to answer for large inputs such as 90,000.
What can I do to optimize the following code?
1 ≤ n ≤ 90 000
def pythagoreanTriplet(n):
    # Considering triplets in
    # sorted order. The value
    # of first element in sorted
    # triplet can be at-most n/3.
    for i in range(1, int(n / 3) + 1):
        # The value of second element
        # must be less than equal to n/2
        for j in range(i + 1,
                       int(n / 2) + 1):
            k = n - i - j
            if (i * i + j * j == k * k):
                print(i, ", ", j, ", ",
                      k, sep="")
                return
    print("Impossible")

# Driver Code
vorodi = int(input())
pythagoreanTriplet(vorodi)

Your source code does a brute force search for a solution so it's slow.
Faster Code
def solve_pythagorean_triplets(n):
    " Solves for triplets whose sum equals n "
    solutions = []
    for a in range(1, n):
        denom = 2*(n-a)
        num = 2*a**2 + n**2 - 2*n*a
        if denom > 0 and num % denom == 0:
            c = num // denom
            b = n - a - c
            if b > a:
                solutions.append((a, b, c))
    return solutions
OP code
Modified OP code so it returns all solutions rather than printing the first found to compare performance
def pythagoreanTriplet(n):
    # Considering triplets in
    # sorted order. The value
    # of first element in sorted
    # triplet can be at-most n/3.
    results = []
    for i in range(1, int(n / 3) + 1):
        # The value of second element
        # must be less than equal to n/2
        for j in range(i + 1,
                       int(n / 2) + 1):
            k = n - i - j
            if (i * i + j * j == k * k):
                results.append((i, j, k))
    return results
Timing
n       pythagoreanTriplet (OP code)       solve_pythagorean_triplets (new)
900     0.084 seconds                      0.039 seconds
5000    3.130 seconds                      0.012 seconds
90000   Timed out after several minutes    0.430 seconds
Explanation
Function solve_pythagorean_triplets is an O(n) algorithm that works as follows.
Searching for:
a^2 + b^2 = c^2 (triplet)
a + b + c = n (sum equals input)
Solve by searching over a (i.e. a fixed for an iteration). With a fixed, we have two equations and two unknowns (b, c):
b + c = n - a
c^2 - b^2 = a^2
Solution is:
denom = 2*(n-a)
num = 2*a**2 + n**2 - 2*n*a
if denom > 0 and num % denom == 0:
    c = num // denom
    b = n - a - c
    if b > a:
        (a, b, c) # is a solution
Iterate a over range(1, n) to get the different solutions.
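To tie this back to the question, which prints only the first triplet or "Impossible", here is a minimal sketch built on the same formula. The function name first_triplet and the tighter bound a <= n // 3 (which follows from a < b < c) are my own assumptions, not part of the answer above.

```python
def first_triplet(n):
    """Return the first (a, b, c) with a < b < c and a + b + c == n, or None."""
    # a < b < c and a + b + c == n imply 3 * a < n, so a <= n // 3 suffices.
    for a in range(1, n // 3 + 1):
        denom = 2 * (n - a)
        num = 2 * a**2 + n**2 - 2 * n * a
        if num % denom == 0:
            c = num // denom
            b = n - a - c
            if b > a:
                return a, b, c
    return None


result = first_triplet(12)
print(", ".join(map(str, result)) if result else "Impossible")  # 3, 4, 5
```

Stopping at the first hit keeps the output format of the original question while still doing only O(n) work.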
Edit June 2022 by #AbhijitSarkar:
For those who like to see the missing steps:
c^2 - b^2 = a^2
b + c = n - a
=> b = n - a - c
c^2 - (n - a - c)^2 = a^2
=> c^2 - (n - a - c) * (n - a - c) = a^2
=> c^2 - n(n - a - c) + a(n - a - c) + c(n - a - c) = a^2
=> c^2 - n^2 + an + nc + an - a^2 - ac + cn - ac - c^2 = a^2
=> -n^2 + 2an + 2nc - a^2 - 2ac = a^2
=> -n^2 + 2an + 2nc - 2a^2 - 2ac = 0
=> 2c(n - a) = n^2 - 2an + 2a^2
=> c = (n^2 - 2an + 2a^2) / (2(n - a))

DarrylG's answer is correct, and I've added the missing steps to it as well, but there's another solution that's faster than iterating from [1, n). Let me explain it, but I'll leave the code up to the reader.
We use Euclid's formula of generating a tuple.
a = m^2 - n^2, b = 2mn, c = m^2 + n^2, where m > n > 0 ---(i)
a + b + c = P ---(ii)
Combining equations (i) and (ii), we have:
2m^2 + 2mn = P ---(iii)
Since m > n > 0, 1 <= n <= m - 1.
Putting n=1 in equation (iii), we have:
2m^2 + 2m - P = 0, ax^2 + bx + c = 0, a=2, b=2, c=-P
m = (-b +- sqrt(b^2 - 4ac)) / 2a
=> (-2 +- sqrt(4 + 8P)) / 4
=> (-1 +- sqrt(1 + 2P)) / 2
Since m > 0, sqrt(b^2 - 4ac) > -b, the only solution is
(-1 + sqrt(1 + 2P)) / 2 ---(iv)
Putting n=m-1 in equation (iii), we have:
2m^2 + 2m(m - 1) - P = 0
=> 4m^2 - 2m - P = 0, ax^2 + bx + c = 0, a=4, b=-2, c=-P
m = (-b +- sqrt(b^2 - 4ac)) / 2a
=> (2 +- sqrt(4 + 16P)) / 8
=> (1 +- sqrt(1 + 4P)) / 4
Since m > 0, the only solution is
(1 + sqrt(1 + 4P)) / 4 ---(v)
From equation (iii), m^2 + mn = P/2; since P/2 is constant,
when n is the smallest, m must be the largest, and vice versa.
Thus:
(1 + sqrt(1 + 4P)) / 4 <= m <= (-1 + sqrt(1 + 2P)) / 2 ---(vi)
Solving equation (iii) for n, we have:
n = (P - 2m^2) / 2m ---(vii)
We iterate for m within the bounds given by the inequality (vi)
and check when the corresponding n given by equation (vii) is
an integer.
Despite generating all primitive triples, Euclid's formula does not
produce all triples - for example, (9, 12, 15) cannot be generated using
integer m and n. This can be remedied by inserting an additional
parameter k to the formula. With m and n coprime and not both odd, the
following generates every Pythagorean triple exactly once.
a = k(m^2 - n^2), b = 2kmn, c = k(m^2 + n^2), for k >= 1.
Thus, we iterate over the integer values of P/k until P/k < 12, the
lowest possible perimeter, corresponding to the triple (3, 4, 5).
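For readers who would like a starting point, the idea above can be sketched roughly as follows. This is only one possible reading: the name triplets_with_perimeter is hypothetical, the loop uses the simple integer bound m*m < P/(2k) rather than the closed-form bounds in (vi), and duplicates are dropped with a set instead of enforcing coprimality of m and n.

```python
def triplets_with_perimeter(P):
    """Return all sorted triples (a, b, c) with a^2 + b^2 = c^2 and a + b + c = P."""
    found = set()
    # P / k must be at least 12, the perimeter of (3, 4, 5).
    for k in range(1, P // 12 + 1):
        if P % k:
            continue
        q = P // k              # perimeter of the unscaled triple
        if q % 2:
            continue            # 2m(m + n) is always even
        half = q // 2           # equation (iii): m * (m + n) == half
        m = 2
        while m * m < half:     # n >= 1 forces m * m < half
            if half % m == 0:
                n = half // m - m          # equation (vii)
                if 0 < n < m:
                    a, b = sorted((k * (m * m - n * n), 2 * k * m * n))
                    found.add((a, b, k * (m * m + n * n)))
            m += 1
    return sorted(found)


print(triplets_with_perimeter(120))
# [(20, 48, 52), (24, 45, 51), (30, 40, 50)]
```

For each divisor k the inner loop runs about sqrt(P / (2k)) times, which is where the speed-up over scanning a in [1, n) comes from.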

Yo
I don't know if you still need the answer or not but hopefully, this can help.
n = int(input())
ans = [(a, b, c) for a in range(1, n) for b in range(a, n) for c in range(b, n) if (a**2 + b**2 == c**2 and a + b + c == n)]
if ans:
    print(ans[0][0], ans[0][1], ans[0][2])
else:
    print("Impossible")

Related

Given N, return M that satisfy the equation: N + M = 2 * (N XOR M)

Problem
Given N, return M that satisfy the equation: N + M = 2 * (N ^ M)
Constraints
1 <= Test Cases <= 10^5;
1 <= N <= 10^18
I came across this problem in one of the hiring challenges.
By trial and error, I found a pattern: such an M lies between N/3 and 3N, and N + M is even. I coded this up, but my submission passed only half of the test cases. It is not much of an optimisation, because its time complexity is the same as that of the brute-force solution.
I know that my solution is not the Optimal solution.
Here's my solution:
def solve(n):
    m = n//3
    end = 3*n
    # If both m and n are odd/even, their sum will be even
    if (m&1 == 1 and n & 1 == 1) or (m&1 == 0 and n&1 == 0):
        inc = 2
    else:
        m += 1
        inc = 2
    while m <= end:
        if (n + m) == 2 * (n ^ m):
            return m
        m += inc
Could someone provide some hints, methods, or an algorithm for an optimal solution? Thanks!
The bottom bit of m is determined (since n+m must be even). Given that bottom bit, the next bit is determined, and so on.
That observation leads to this O(log n) solution:
def solve(n):
    b = 1
    m = 0
    while n + m != 2 * (n ^ m):
        mask = 2 * b - 1
        if ((n + m) & mask) != ((2 * (n ^ m)) & mask):
            m += b
        b *= 2
    return m
Another way to implement this is to find the smallest bit in which m+n and 2*(n^m) differ, and toggle that bit in m. That results in this very compact code (using the new walrus operator, and some bit-twiddling tricks):
def solve(n):
    m = 0
    while r := n + m ^ 2 * (n ^ m):
        m |= r & -r
    return m
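As a sanity check, the compact version can be compared against a plain upward scan for small inputs. This is a sketch rather than part of the original answer: brute_force is a hypothetical helper, the explicit parentheses only spell out the operator precedence, and the walrus operator needs Python 3.8 or later.

```python
def solve(n):
    """Bit-by-bit construction, same idea as the compact version above."""
    m = 0
    while r := (n + m) ^ (2 * (n ^ m)):
        m |= r & -r  # set the lowest bit where the two sides disagree
    return m


def brute_force(n):
    """Reference implementation: scan m upward until the equation holds."""
    m = 0
    while n + m != 2 * (n ^ m):
        m += 1
    return m


for n in range(1, 200):
    assert solve(n) == brute_force(n)
print(solve(1), solve(2), solve(3))  # 3 6 1
```

For example, n = 3 gives m = 1, since 3 + 1 = 4 and 2 * (3 ^ 1) = 2 * 2 = 4.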

Python Find the index of value in sequence given the condition

I have this problem but don't know where to start. The only thing I have in my mind is Fibonacci numbers and for loop (but don't know how to apply). It looks like Fibonacci but the first terms are different.
We have a sequence beginning with a then a+b. The 3rd number is
a+b+a, the 4th one is a+b+a+a+b. Which means that a number is
equal to the sum of 2 previous terms. (except the two first terms).
We need to find the index of the number given that the number is
exactly 100k. a and b are picked randomly. The program has to end by print(a,b,index)
So my problem is, I don't know how to choose a and b which can satisfy the 100k condition. For example:
If a == 35k and b == 20k, then the sequence would be like this:
35k, 55k, 90k, 145k and no 100k in the sequence.
So how to deal with this problem? Really appreciate!!
EDIT: this is a correction over my last answer
First write the difference equation according to the described conditions:
f[0] = a
f[1] = a + b
f[2] = f[1] + f[0]
= 2a + b = a + b + a
f[3] = f[2] + f[1] = f[1] + f[0] + f[1]
= 3a + 2b
= a + b + a + a + b
f[4] = f[3] + f[2]
= 3a + 2b + 2a + b = 5a + 3b
f[5] = f[4] + f[3]
= 5a + 3b + 3a + 2b = 8a + 5b
f[6] = f[5] + f[4]
= 8a + 5b + 5a + 3b = 13a + 8b
...
f[n] = f[n-1] + f[n-2]
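The recurrence can also be evaluated directly with integers, which is handy for checking the closed-form results without floating-point rounding. A minimal sketch (the name f_exact is an assumption, not part of the original answer):

```python
def f_exact(n, a, b):
    """Return f[n] for f[0] = a, f[1] = a + b, f[n] = f[n-1] + f[n-2]."""
    prev, cur = a, a + b
    for _ in range(n):
        prev, cur = cur, prev + cur
    return prev


# The example from the question: a = 35k, b = 20k never hits 100k.
print([f_exact(i, 35000, 20000) for i in range(4)])
# [35000, 55000, 90000, 145000]
```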
We can actually simplify this problem if we separate a and b:
f_a[n] = a*(f[n-1] + f[n-2]) with f[0] = 1 and f[1] = 1
f_b[n] = b*(f[n-1] + f[n-2]) with f[0] = 0 and f[1] = 1
Now, if we solve the difference equation, we should obtain the following, assuming that s = sqrt(5) and that n \in N (n is a natural number):
w1a = ((1+s)/2)^{n+1}
w2a = ((1-s)/2)^{n+1}
w1b = ((1+s)/2)^n
w2b = ((1-s)/2)^n
f_a[n] = (1/s) * [w1a - w2a] * a
f_b[n] = (1/s) * [w1b - w2b] * b
Simplifying:
l = (1+s)/2
g = (1-s)/2
f[n] = f_a[n] + f_b[n]
= (1/s) * [l^n(al+b) - g^n(ag+b)]
You can find more info on how to solve difference equations here: https://www.cl.cam.ac.uk/teaching/2003/Probability/prob07.pdf
You can implement these equations in a Python function to obtain any value of this function.
from math import sqrt
def f(n, a, b):
    s = sqrt(5)
    l = (1+s)/2
    g = (1-s)/2
    fn = (1/s) * ((a*l + b) * (l**n) - (a*g + b) * (g**n))
    return int(round(fn, 0))
Searching for the index iteratively
You may now find the n which solves this equation for a particular f(n) if you apply the logarithmic function (see the section below). However, if time complexity is not an issue for you, and given that f[n] grows exponentially for n (meaning that you will not need to search much until 100k is reached or surpassed), you may also simply find the n which gives f[n] for a given a and b by doing the following search:
def search_index(a, b, value):
    n = 0
    while(True):
        fn = f(n, a, b)
        if fn == value:
            return n
        elif fn > value:
            return -1
        else:
            n += 1

def brute_search(range_a, range_b, value):
    for a in range(range_a + 1):
        for b in range(range_b + 1):
            if (a == 0) and (b == 0):
                a = 1
            res = search_index(a, b, value)
            if res != -1:
                return a, b, res
    return -1

brute_search(1000, 1000, 100000)
>>> (80, 565, 12) # a = 80, b = 565 and n = 12
Through this (quite bad) method we find that for a=80 and b=565, n=12 will return f_n = 100k. If you would like to find all possible solutions for a range of values of a and b, you can modify brute_search in the following way:
def brute_search_many_solutions(range_a, range_b, value):
    solutions = []
    for a in range(range_a + 1):
        for b in range(range_b + 1):
            if (a == 0) and (b == 0):
                a = 1
            res = search_index(a, b, value)
            if res != -1:
                solutions.append((a, b, res))
    return solutions
Analytical solution
Transforming the previous difference equation f_n so that n is a function of a, b and f_n, and neglecting the g^n term (which vanishes as n grows, since |g| < 1), we obtain:
n ≈ log((f_n * s) / (a * l + b)) / log(l)
This result is an approximation, which may guide your search. You can use it in the following way:
from math import log, sqrt

def find_n(a, b, value):
    s = sqrt(5)
    l = (1+s)/2
    return int(round(log(value * s / (a * l + b)) / log(l), 0))

def search(a, b, value):
    n = find_n(a, b, value)
    sol = f(n, a, b)
    if sol == value:
        return(a, b, n)
    elif sol > value:
        for i in range(n-1, 0, -1):
            sol = f(i, a, b)
            if sol == value:
                return(a, b, i)
            elif sol < value:
                return(-1, 'no solution exists for a={} and b={}'.format(a, b))
    else: # this should probably never be reached as find_n should
          # provide an upper bound. But I still need to prove it
        i = n
        while(sol < value):
            i += 1
            sol = f(i, a, b)
            if sol == value:
                return(a, b, i)
            elif sol > value:
                return(-1, 'no solution exists for a={} and b={}'.format(a, b))

search(80, 565, 100000)
>>> (80, 565, 12) # a = 80, b = 565 and n = 12
NOTE: I would have loved to use mathematical notation here with LaTeX, but unfortunately I did not find an easy way to do it... Likely, this question would fit better Stack Exchange, than Stack overflow.

Python 3.7 does not support assignment expressions

I have the following code:
n = int(input())
a, b, c = map(int, input().split())
result = sum(s // c + 1 for i in range(n) for j in range(n - a * i) if (s := n - a * i - b * j - 1) >= 0)
print(result)
But I have an error that Python 3.7 does not support assignment expressions in this part (s := n - a * i - b * j - 1). How can I rewrite it? I want to rewrite it to python3.7
The simple, though repetitive, fix is to "inline" the value of s.
result = sum((n - a * i - b * j - 1) // c + 1
             for i in range(n)
             for j in range(n - a * i) if n - a * i - b * j - 1 >= 0)
Start with converting the generator expression to plain code and then it is a simple task:
result = 0
for i in range(n):
    for j in range(n - a * i):
        s = n - a * i - b * j - 1
        if s >= 0:
            result += s // c + 1
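If a single expression is preferred, another idiom that works before Python 3.8 is to bind s with an extra one-element for clause. The sketch below wraps both forms in functions (count_loop and count_expr are names chosen here for illustration) so that they can be compared:

```python
def count_loop(n, a, b, c):
    """Plain-loop version, as in the answer above."""
    result = 0
    for i in range(n):
        for j in range(n - a * i):
            s = n - a * i - b * j - 1
            if s >= 0:
                result += s // c + 1
    return result


def count_expr(n, a, b, c):
    """Generator version: 'for s in [value]' binds s once per (i, j)."""
    return sum(s // c + 1
               for i in range(n)
               for j in range(n - a * i)
               for s in [n - a * i - b * j - 1]
               if s >= 0)


print(count_loop(3, 1, 1, 1), count_expr(3, 1, 1, 1))  # 10 10
```

This avoids repeating the expression for s without needing the := operator.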

Optimizing Mathematical Calculation

I have a model for four possibilities of purchasing a pair of items (purchasing both, none or just one) and need to optimize the (pseudo-) log-likelihood function. Part of this, of course, is the calculation/definition of the pseudo-log-likelihood function.
The following is my code, where Beta is a 2-d vector for each customer (there are U customers and U different beta vectors), X is a 2-d vector for each item (different for each of the N items) and Gamma is a symmetric matrix with a scalar value gamma(i,j) for each pair of items. And df is a dataframe of the purchases - one row for each customer and N columns for the items.
It would seem to me that all of these loops are inefficient and take up too much time, but I am not sure how to speed up this calculation and would appreciate any help improving it.
Thank you in advance!
def pseudo_likelihood(Args):
    Beta = np.reshape(Args[0:2*U], (U, 2))
    Gamma = np.reshape(Args[2*U:], (N,N))
    L = 0
    for u in range(0,U,1):
        print datetime.datetime.today(), " for user {}".format(u)
        y = df.loc[u][1:]
        beta_u = Beta[u,:]
        for l in range(N):
            print datetime.datetime.today(), " for item {}".format(l)
            for i in range(N-1):
                if i == l:
                    continue
                for j in range(i+1,N):
                    if (y[i] == y[j]):
                        if (y[i] == 1):
                            L += np.dot(beta_u,(x_vals.iloc[i,1:]+x_vals.iloc[j,1:])) + Gamma[i,j] #Log of the exponent of this expression
                        else:
                            L += np.log(
                                1 - np.exp(np.dot(beta_u, (x_vals.iloc[i, 1:] + x_vals.iloc[j, 1:])) + Gamma[i, j])
                                - np.exp(np.dot(beta_u, x_vals.iloc[i, 1:])) * (
                                    1 - np.exp(np.dot(beta_u, x_vals.iloc[j, 1:])))
                                - np.exp(np.dot(beta_u, x_vals.iloc[j, 1:])) * (
                                    1 - np.exp(np.dot(beta_u, x_vals.iloc[i, 1:]))))
                    else:
                        if (y[i] == 1):
                            L += np.dot(beta_u,x_vals.iloc[i,1:]) + np.log(1 - np.exp(np.dot(beta_u,x_vals.iloc[j,1:])))
                        else:
                            L += (np.dot(beta_u, x_vals.iloc[j,1:])) + np.log(1 - np.exp(np.dot(beta_u, x_vals.iloc[i,1:])))
            L -= (N-2)*np.dot(beta_u,x_vals.iloc[l,1:])
            for k in range(N):
                if k != l:
                    L -= np.dot(beta_u, x_vals.iloc[k,1:])
    return -L
To add/clarify - I am using this calculation to optimize and find the beta and gamma parameters that generated the data for this pseudo-likelihood function.
I am using scipy optimize.minimize with the 'Powell' method.
Updating for whomever is interested-
I found numpy.einsum to speed up the calculations here by over 90%.
np.einsum performs matrix/vector operations using Einstein notation. Recall that for two matrices A, B their product can be represented as the sum of
a_ij*b_jk
i.e. the ik element of the matrix AB is the sum over j of a_ij*b_jk
Using the einsum function I could calculate in advance all of the values necessary for the iterative calculation, saving precious time and hundreds, if not thousands, of unnecessary calculations.
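As a small illustration of the notation (with made-up numbers, not the data from the question; written for Python 3), the 'ij,j' subscript sums over j and so computes one dot product per row, which is the same as a matrix-vector product:

```python
import numpy as np

# Three hypothetical items with two features each, and one beta vector.
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
beta = np.array([0.5, -1.0])

# 'ij,j' sums over j, leaving index i: one dot product per row of X.
beta_dot_x = np.einsum('ij,j', X, beta)

# Same values as the matrix-vector product: -1.5, -2.5, -3.5.
assert np.allclose(beta_dot_x, X @ beta)
print(beta_dot_x)
```

Computing this vector once per customer is what removes the repeated np.dot calls from the inner loops.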
I rewrote the code as follows:
import math
import numpy as np

def pseudo_likelihood(Args):
    Beta = np.reshape(Args[0:2*U], (U,2))
    Gamma = np.reshape(Args[2*U:], (N,N))
    exp_gamma = np.exp(Gamma)
    L = 0
    for u in xrange(U):
        y = df.loc[u][1:]
        beta_u = Beta[u,:]
        # Precompute every beta_u . x_i once per customer
        beta_dot_x = np.einsum('ij,j',x_vals[['V1','V2']],beta_u)
        exp_beta_dot_x = np.exp(beta_dot_x)
        log_one_minus_exp = np.log(1 - exp_beta_dot_x)
        for l in xrange(N):
            for i in xrange(N-1):
                if i == l:
                    continue
                for j in xrange(i+1,N):
                    if (y[i] == y[j]):
                        if (y[i] == 1):
                            L += beta_dot_x[i] + beta_dot_x[j] + Gamma[i,j] #Log of the exponent of this expression
                        else:
                            L += math.log(
                                1 - exp_beta_dot_x[i]*exp_beta_dot_x[j]*exp_gamma[i,j]
                                - exp_beta_dot_x[i] * (1 - exp_beta_dot_x[j])
                                - exp_beta_dot_x[j] * (1 - exp_beta_dot_x[i]))
                    else:
                        if (y[i] == 1):
                            L += beta_dot_x[i] + log_one_minus_exp[j]
                        else:
                            L += (beta_dot_x[j]) + log_one_minus_exp[i]
            L -= (N-2)*beta_dot_x[l]
            for k in xrange(N):
                if k != l:
                    # same term as np.dot(beta_u, x_vals.iloc[k,1:]) in the original loop
                    L -= beta_dot_x[k]
    return -L

Unable to implement a dynamic programming table algorithm in python

I am having problems creating a table in Python. Basically, I want to build a table that tells me, for every number, whether I can use it to break down another (it's the table algorithm from the accepted answer in "Can brute force algorithms scale?"). Here's the pseudocode:
for i = 1 to k
    for z = 0 to sum:
        for c = 1 to z / x_i:
            if T[z - c * x_i][i - 1] is true:
                set T[z][i] to true
Here's the python implementation I have:
from collections import defaultdict
data = [1, 2, 4]
target_sum = 10
# T[x, i] is True if 'x' can be solved
# by a linear combination of data[:i+1]
T = defaultdict(bool) # all values are False by default
T[0, 0] = True # base case
for i, x in enumerate(data): # i is index, x is data[i]
    for s in range(target_sum + 1): #set the range of one higher than sum to include sum itself
        for c in range(s / x + 1):
            if T[s - c * x, i]:
                T[s, i+1] = True
#query area
target_result = 1
for node in T:
    if node[0]==target_result:
        print node, ':', T[node]
So what I expect is that if target_result is set to 8, it shows how each item in the list data can be used to break that number down. For 8, all of 1, 2 and 4 work, so I expect them all to be true, but this program is making everything true. For example, 1 should only be able to be broken down by 1 (and not 2 or 4), but when I run it with 1, I get:
(1, 2) : True
(1, 0) : False
(1, 3) : True
(1, 1) : True
Can anyone help me understand what's wrong with the code? Or perhaps I am not understanding the algorithm that was posted in the answer I am referring to.
(Note: I could be completely wrong, but I learned that defaultdict creates an entry even if it's not there, and if the entry exists the algorithm turns it to true. Maybe that's the problem; I'm not sure, but it was the line of thought I tried, and it didn't work for me because it seems to break the overall implementation.)
Thanks!
The code works if you print the solution using RecursivelyListAllThatWork():
coeff = [0]*len(data)
def RecursivelyListAllThatWork(k, sum): # Using last k variables, make sum
    # /* Base case: If we've assigned all the variables correctly, list this
    #  * solution.
    #  */
    if k == 0:
        # print what we have so far
        print(' + '.join("%2s*%s" % t for t in zip(coeff, data)))
        return
    x_k = data[k-1]
    # /* Recursive step: Try all coefficients, but only if they work. */
    for c in range(sum // x_k + 1):
        if T[sum - c * x_k, k - 1]:
            # mark the coefficient of x_k to be c
            coeff[k-1] = c
            RecursivelyListAllThatWork(k - 1, sum - c * x_k)
            # unmark the coefficient of x_k
            coeff[k-1] = 0
RecursivelyListAllThatWork(len(data), target_sum)
Output
10*1 + 0*2 + 0*4
8*1 + 1*2 + 0*4
6*1 + 2*2 + 0*4
4*1 + 3*2 + 0*4
2*1 + 4*2 + 0*4
0*1 + 5*2 + 0*4
6*1 + 0*2 + 1*4
4*1 + 1*2 + 1*4
2*1 + 2*2 + 1*4
0*1 + 3*2 + 1*4
2*1 + 0*2 + 2*4
0*1 + 1*2 + 2*4
As a side note, you don't really need a defaultdict with what you're doing, you can use a normal dict + .get():
data = [1, 2, 4]
target_sum = 10
T = {}
T[0, 0] = True
for i,x in enumerate(data):
    for s in range(target_sum + 1): # xrange on python-2.x
        for c in range(s // x + 1):
            if T.get((s - c * x, i)):
                T[s, i+1] = True
If you're using J.S. solution, don't forget to change:
if T[sum - c * x_k, k - 1]:
with:
if T.get((sum - c * x_k, k - 1)):
Your code is right.
1 = 1 * 1 + 0 * 2, so T[1, 2] is True.
1 = 1 * 1 + 0 * 2 + 0 * 4, so T[1, 3] is True.
As requested in the comments, a short explanation of the algo:
It calculates all numbers from 0 to targetsum that can be represented as a sum of (non-negative) multiples of some of the numbers in data.
If T[s, i] is True, then s can be represented in this way using only the first i elements of data.
At the start, 0 can be represented as the empty sum, thus T[0, 0] is True. (This step may seem a little technical.)
Let x be the 'i+1'-th element of data. Then, the algorithm tries for each number s if it can be represented by the sum of some multiple of x and a number for which a representation exists that uses only the first i elements of data (the existence of such a number means T[s - c * x, i] is True for some c). If so, s can be represented using only the first i+1 elements of data.
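The explanation can be checked with a small sketch. The helpers build_table and representable are names introduced here for illustration, and the early break is an optional shortcut that is not in the original code:

```python
def build_table(data, target_sum):
    """T[s, i] is True if s is a sum of non-negative multiples of data[:i]."""
    T = {(0, 0): True}  # 0 is the empty sum
    for i, x in enumerate(data):
        for s in range(target_sum + 1):
            for c in range(s // x + 1):
                if T.get((s - c * x, i)):
                    T[s, i + 1] = True
                    break  # one witness c is enough
    return T


def representable(T, s, i):
    return T.get((s, i), False)


T = build_table([1, 2, 4], 10)
print([representable(T, 1, i) for i in range(4)])  # [False, True, True, True]

T_even = build_table([2, 4], 10)
print(representable(T_even, 3, 2), representable(T_even, 6, 2))  # False True
```

With data = [1, 2, 4], the value 1 is reachable as soon as the element 1 is allowed, because the later elements can take the coefficient 0. Without a 1 in data, odd targets stay False.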
