How can I use dynamic programming to optimize this code? (Python)

Daulat Ram is an affluent businessman. After demonetization, an IT raid was conducted at his residence in which all his money was seized. He is very eager to gain his money back, so he started investing in certain ventures and earned out of them. On the first day, his income was Rs. X, followed by Rs. Y on the second day. Daulat Ram observed his growth as a function and wanted to calculate his income on the Nth day.
The function he found was F(N) = F(N-1) + F(N-2) + F(N-1)*F(N-2).
Given his income on day 0 and day 1, calculate his income on the Nth day (yeah, it's that simple).
INPUT:
The first line of input consists of a single integer T denoting number of test cases.
Each of the next T lines consists of three integers F0, F1 and N respectively.
OUTPUT:
For each test case, print a single integer F(N); as the output can be large, calculate the answer modulo 10^9 + 7.
CONSTRAINTS:
1 ≤ T ≤ 10^5
0 ≤ F0, F1, N ≤ 10^9
def function(x1):
    if x1 == 2: return fnc__1 + fnc__0*fnc__1 + fnc__0
    elif x1 == 1: return fnc__1
    elif x1 == 0: return fnc__0
    return function(x1-1) + function(x1-2)*function(x1-1) + function(x1-2)

for i in range(int(input())):  # input() is the no of test cases
    rwINput = input().split()
    fnc__0 = int(rwINput[0])
    fnc__1 = int(rwINput[1])
    print(function(int(rwINput[2])))

A simple way to optimize is to cache the results of your function. Python provides a mechanism for just that with its lru_cache. All you need to do is decorate your function with it:
from functools import lru_cache

@lru_cache()
def function(n, F0=1, F1=2):
    if n == 0:
        return F0
    elif n == 1:
        return F1
    else:
        f1 = function(n-1, F0, F1)
        f2 = function(n-2, F0, F1)
        return f1 + f2 + f1*f2
You can tweak lru_cache a bit for your needs, e.g. by setting its maxsize argument to bound how many results are kept.
test cases:
for i in range(7):
    print('{}: {:7d}'.format(i, function(i)))
prints:
0: 1
1: 2
2: 5
3: 17
4: 107
5: 1943
6: 209951
To get your answer modulo an integer (the modulus in your question is not entirely clear) you can do this:
MOD = 10**9 + 7  # ???

@lru_cache()
def function(n, F0=1, F1=2):
    if n == 0:
        return F0
    elif n == 1:
        return F1
    else:
        f1 = function(n-1, F0, F1)
        f2 = function(n-2, F0, F1)
        return (f1 + f2 + f1*f2) % MOD

You could just execute the function iteratively: assign f1 to f0 and the new result to f1. Repeat this n times and the desired result is in f0:
MOD = 10**9 + 7

for _ in range(int(input())):
    f0, f1, n = (int(x) for x in input().split())
    for _ in range(n):
        f0, f1 = f1, (f0 + f1 + f0 * f1) % MOD
    print(f0)
Input:
8
1 2 0
1 2 1
1 2 2
1 2 3
1 2 4
1 2 5
1 2 6
10 13 100
Output:
1
2
5
17
107
1943
209951
276644752

Someone gave this answer to me and it worked, but I don't know how. Complexity: O(log n).
#include <stdio.h>
#include <stdlib.h>
#define mod 1000000007

long long int power(long long int, long long int);
void mult(long long int[2][2], long long int[2][2]);

int main()
{
    int test;
    scanf("%d", &test);
    while (test--)
    {
        int n;
        int pp, p;
        scanf("%d%d%d", &pp, &p, &n);
        long long int A[2][2] = {{1,1},{1,0}};
        n = n - 1;
        long long int B[2][2] = {{1,0},{0,1}};
        while (n > 0)
        {
            if (n % 2 == 1)
                mult(B, A);
            n = n / 2;
            mult(A, A);
        }
        long long int result = ((power(pp+1, B[0][1]) * power(p+1, B[0][0])) % mod - 1 + mod) % mod;
        printf("%lld\n", result);
    }
}

long long int power(long long int a, long long int b)
{
    long long int result = 1;
    while (b > 0)
    {
        if (b % 2 == 1)
            result = (result * a) % mod;
        a = (a * a) % mod;
        b = b / 2;
    }
    return result;
}

void mult(long long int A[2][2], long long int B[2][2])
{
    long long int C[2][2];
    C[0][0] = A[0][0]*B[0][0] + A[0][1]*B[1][0];
    C[0][1] = A[0][0]*B[0][1] + A[0][1]*B[1][1];
    C[1][0] = A[1][0]*B[0][0] + A[1][1]*B[1][0];
    C[1][1] = A[1][0]*B[0][1] + A[1][1]*B[1][1];
    A[0][0] = C[0][0] % (mod-1);
    A[0][1] = C[0][1] % (mod-1);
    A[1][0] = C[1][0] % (mod-1);
    A[1][1] = C[1][1] % (mod-1);
}

I know this post is old, but I want to point out that an important issue has been overlooked: the function quickly takes huge values, and only a modulo is required. The modulo of a sum or of a product can be computed from the sums or products of the modulos. So the only way to get a correct answer for a big N is to store the modulos instead of the Fn themselves!
Here is my view on how dynamic programming should be used here. Dynamic programming is just about caching results in order to avoid recomputing all sub-branches of the recursion tree. Storing the successive Fn (modulo 10^9 + 7) is everything that's needed. If the algorithm only needs to be run once, you don't even have to store the whole array: compute f0 and f1, keep only the last two computed values (with the modulo), and find the result with a simple loop. If the algorithm is run multiple times and the result has not been computed yet, you just need to retrieve the last two computed values (a variable storing the index of the last computed value is useful) and restart from there.
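For the "run multiple times" case described above, here is a minimal sketch (names are mine) that stores only the modulos and extends its cache on demand; it is practical only for moderate N, since the constraints allow N up to 10^9:
MOD = 10**9 + 7

class IncomeSeries:
    def __init__(self, f0, f1):
        self.vals = [f0 % MOD, f1 % MOD]    # cache of F(i) mod MOD, never the huge F(i)

    def __call__(self, n):
        v = self.vals
        while len(v) <= n:                  # restart from the last two cached values
            v.append((v[-1] + v[-2] + v[-1] * v[-2]) % MOD)
        return v[n]

series = IncomeSeries(1, 2)
print(series(6))   # 209951; a later call like series(10) only computes the new terms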

Related

How do I optimise this function that generates pythagorean group of n elements (like triples but with any number of elements) using itertools? [duplicate]

This is a program I wrote to calculate Pythagorean triplets. When I run the program it prints each set of triplets twice because of the if statement. Is there any way I can tell the program to only print a new set of triplets once? Thanks.
import math

def main():
    for x in range(1, 1000):
        for y in range(1, 1000):
            for z in range(1, 1000):
                if x*x == y*y + z*z:
                    print y, z, x
                    print '-'*50

if __name__ == '__main__':
    main()
Pythagorean Triples make a good example for claiming "for loops considered harmful", because for loops seduce us into thinking about counting, often the most irrelevant part of a task.
(I'm going to stick with pseudo-code to avoid language biases, and to keep the pseudo-code streamlined, I'll not optimize away multiple calculations of e.g. x * x and y * y.)
Version 1:
for x in 1..N {
    for y in 1..N {
        for z in 1..N {
            if x * x + y * y == z * z then {
                // use x, y, z
            }
        }
    }
}
is the worst solution. It generates duplicates, and traverses parts of the space that aren't useful (e.g. whenever z < y). Its time complexity is cubic on N.
Version 2, the first improvement, comes from requiring x < y < z to hold, as in:
for x in 1..N {
    for y in x+1..N {
        for z in y+1..N {
            if x * x + y * y == z * z then {
                // use x, y, z
            }
        }
    }
}
which reduces run time and eliminates duplicated solutions. However, it is still cubic on N; the improvement is just a reduction of the coefficient of N-cubed.
It is pointless to continue examining increasing values of z after z * z < x * x + y * y no longer holds. That fact motivates Version 3, the first step away from brute-force iteration over z:
for x in 1..N {
    for y in x+1..N {
        z = y + 1
        while z * z < x * x + y * y {
            z = z + 1
        }
        if z * z == x * x + y * y and z <= N then {
            // use x, y, z
        }
    }
}
For N of 1000, this is about 5 times faster than Version 2, but it is still cubic on N.
The next insight is that x and y are the only independent variables; z depends on their values, and the last z value considered for the previous value of y is a good starting search value for the next value of y. That leads to Version 4:
for x in 1..N {
    y = x+1
    z = y+1
    while z <= N {
        while z * z < x * x + y * y {
            z = z + 1
        }
        if z * z == x * x + y * y and z <= N then {
            // use x, y, z
        }
        y = y + 1
    }
}
which allows y and z to "sweep" the values above x only once. Not only is it over 100 times faster for N of 1000, it is quadratic on N, so the speedup increases as N grows.
I've encountered this kind of improvement often enough to be mistrustful of "counting loops" for any but the most trivial uses (e.g. traversing an array).
Update: Apparently I should have pointed out a few things about V4 that are easy to overlook.
Both of the while loops are controlled by the value of z (one directly, the other indirectly through the square of z). The inner while is actually speeding up the outer while, rather than being orthogonal to it. It's important to look at what the loops are doing, not merely to count how many loops there are.
All of the calculations in V4 are strictly integer arithmetic. Conversion to/from floating-point, as well as floating-point calculations, are costly by comparison.
V4 runs in constant memory, requiring only three integer variables. There are no arrays or hash tables to allocate and initialize (and, potentially, to cause an out-of-memory error).
The original question allowed all of x, y, and z to vary over the same range. V1..V4 followed that pattern.
Below is a not-very-scientific set of timings (using Java under Eclipse on my older laptop with other stuff running...), where the "use x, y, z" was implemented by instantiating a Triple object with the three values and putting it in an ArrayList. (For these runs, N was set to 10,000, which produced 12,471 triples in each case.)
Version 4: 46 sec.
using square root: 134 sec.
array and map: 400 sec.
The "array and map" algorithm is essentially:
squares = array of i*i for i in 1 .. N
roots = map of i*i -> i for i in 1 .. N
for x in 1 .. N
    for y in x+1 .. N
        z = roots[squares[x] + squares[y]]
        if z exists use x, y, z
The "using square root" algorithm is essentially:
for x in 1 .. N
    for y in x+1 .. N
        z = (int) sqrt(x * x + y * y)
        if z * z == x * x + y * y then use x, y, z
The actual code for V4 is:
public Collection<Triple> byBetterWhileLoop() {
    Collection<Triple> result = new ArrayList<Triple>(limit);
    for (int x = 1; x < limit; ++x) {
        int xx = x * x;
        int y = x + 1;
        int z = y + 1;
        while (z <= limit) {
            int zz = xx + y * y;
            while (z * z < zz) {++z;}
            if (z * z == zz && z <= limit) {
                result.add(new Triple(x, y, z));
            }
            ++y;
        }
    }
    return result;
}
Note that x * x is calculated in the outer loop (although I didn't bother to cache z * z); similar optimizations are done in the other variations.
I'll be glad to provide the Java source code on request for the other variations I timed, in case I've mis-implemented anything.
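Since the question is in Python, a direct transcription of V4 might look like this (my translation, not part of the original answer):
def triples_v4(limit):
    result = []
    for x in range(1, limit):
        xx = x * x
        y = x + 1
        z = y + 1
        while z <= limit:
            zz = xx + y * y
            while z * z < zz:      # advance z only as far as needed
                z += 1
            if z * z == zz and z <= limit:
                result.append((x, y, z))
            y += 1
    return result

print(len(triples_v4(1000)))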
Substantially faster than any of the solutions so far. Finds triplets via a ternary tree.
Wolfram says:
Hall (1970) and Roberts (1977) prove that (a, b, c) is a primitive Pythagorean triple if and only if
(a,b,c)=(3,4,5)M
where M is a finite product of the matrices U, A, D.
And there we have a formula to generate every primitive triple.
In the above formula, the hypotenuse is ever growing so it's pretty easy to check for a max length.
In Python:
import numpy as np

def gen_prim_pyth_trips(limit=None):
    u = np.mat(' 1 2 2; -2 -1 -2; 2 2 3')
    a = np.mat(' 1 2 2; 2 1 2; 2 2 3')
    d = np.mat('-1 -2 -2; 2 1 2; 2 2 3')
    uad = np.array([u, a, d])
    m = np.array([3, 4, 5])
    while m.size:
        m = m.reshape(-1, 3)
        if limit:
            m = m[m[:, 2] <= limit]
        yield from m
        m = np.dot(m, uad)
If you'd like all triples and not just the primitives:
def gen_all_pyth_trips(limit):
    for prim in gen_prim_pyth_trips(limit):
        i = prim
        for _ in range(limit//prim[2]):
            yield i
            i = i + prim
list(gen_prim_pyth_trips(10**4)) took 2.81 milliseconds to come back with 1593 elements while list(gen_all_pyth_trips(10**4)) took 19.8 milliseconds to come back with 12471 elements.
For reference, the accepted answer (in Python) took 38 seconds for 12471 elements.
Just for fun, setting the upper limit to one million list(gen_all_pyth_trips(10**6)) returns in 2.66 seconds with 1980642 elements (almost 2 million triples in 3 seconds). list(gen_all_pyth_trips(10**7)) brings my computer to its knees as the list gets so large it consumes every last bit of RAM. Doing something like sum(1 for _ in gen_all_pyth_trips(10**7)) gets around that limitation and returns in 30 seconds with 23471475 elements.
For more information on the algorithm used, check out the articles on Wolfram and Wikipedia.
You should define x < y < z.
for x in range(1, 1000):
    for y in range(x + 1, 1000):
        for z in range(y + 1, 1000):
Another good optimization would be to only use x and y and calculate zsqr = x * x + y * y. If zsqr is a square number (or z = sqrt(zsqr) is a whole number), it is a triplet, else not. That way, you need only two loops instead of three (for your example, that's about 1000 times faster).
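A short sketch of that two-loop idea (my code; math.isqrt needs Python 3.8+):
from math import isqrt

def triplets_two_loops(n):
    for x in range(1, n):
        for y in range(x + 1, n):
            zsqr = x * x + y * y
            z = isqrt(zsqr)            # integer square root, no floating point
            if z * z == zsqr and z < n:
                print(x, y, z)

triplets_two_loops(1000)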
The previously listed algorithms for generating Pythagorean triplets are all modifications of the naive approach derived from the basic relationship a^2 + b^2 = c^2 where (a, b, c) is a triplet of positive integers. It turns out that Pythagorean triplets satisfy some fairly remarkable relationships that can be used to generate all Pythagorean triplets.
Euclid discovered the first such relationship. He determined that for every Pythagorean triple (a, b, c), possibly after a reordering of a and b there are relatively prime positive integers m and n with m > n, at least one of which is even, and a positive integer k such that
a = k (2mn)
b = k (m^2 - n^2)
c = k (m^2 + n^2)
Then to generate Pythagorean triplets, generate relatively prime positive integers m and n of differing parity, and a positive integer k and apply the above formula.
struct PythagoreanTriple {
    public int a { get; private set; }
    public int b { get; private set; }
    public int c { get; private set; }

    public PythagoreanTriple(int a, int b, int c) : this() {
        this.a = a < b ? a : b;
        this.b = b < a ? a : b;
        this.c = c;
    }

    public override string ToString() {
        return String.Format("a = {0}, b = {1}, c = {2}", a, b, c);
    }

    public static IEnumerable<PythagoreanTriple> GenerateTriples(int max) {
        var triples = new List<PythagoreanTriple>();
        for (int m = 1; m <= max / 2; m++) {
            for (int n = 1 + (m % 2); n < m; n += 2) {
                if (m.IsRelativelyPrimeTo(n)) {
                    for (int k = 1; k <= max / (m * m + n * n); k++) {
                        triples.Add(EuclidTriple(m, n, k));
                    }
                }
            }
        }
        return triples;
    }

    private static PythagoreanTriple EuclidTriple(int m, int n, int k) {
        int msquared = m * m;
        int nsquared = n * n;
        return new PythagoreanTriple(k * 2 * m * n, k * (msquared - nsquared), k * (msquared + nsquared));
    }
}

public static class IntegerExtensions {
    private static int GreatestCommonDivisor(int m, int n) {
        return (n == 0 ? m : GreatestCommonDivisor(n, m % n));
    }

    public static bool IsRelativelyPrimeTo(this int m, int n) {
        return GreatestCommonDivisor(m, n) == 1;
    }
}

class Program {
    static void Main(string[] args) {
        PythagoreanTriple.GenerateTriples(1000).ToList().ForEach(t => Console.WriteLine(t));
    }
}
The Wikipedia article on Formulas for generating Pythagorean triples contains other such formulae.
Algorithms can be tuned for speed, memory usage, simplicity, and other things.
Here is a pythagore_triplets algorithm tuned for speed, at the cost of memory usage and simplicity. If all you want is speed, this could be the way to go.
Calculation of list(pythagore_triplets(10000)) takes 40 seconds on my computer, versus 63 seconds for ΤΖΩΤΖΙΟΥ's algorithm, and possibly days of calculation for Tafkas's algorithm (and all other algorithms which use 3 embedded loops instead of just 2).
def pythagore_triplets(n=1000):
    maxn = int(n*(2**0.5))+1  # max int whose square may be the sum of two squares
    squares = [x*x for x in xrange(maxn+1)]  # calculate all the squares once
    reverse_squares = dict([(squares[i], i) for i in xrange(maxn+1)])  # x*x => x
    for x in xrange(1, n):
        x2 = squares[x]
        for y in xrange(x, n+1):
            y2 = squares[y]
            z = reverse_squares.get(x2+y2)
            if z != None:
                yield x, y, z
>>> print list(pythagore_triplets(20))
[(3, 4, 5), (5, 12, 13), (6, 8, 10), (8, 15, 17), (9, 12, 15), (12, 16, 20)]
Note that if you are going to calculate the first billion triplets, then this algorithm will crash before it even starts, because of an out of memory error. So ΤΖΩΤΖΙΟΥ's algorithm is probably a safer choice for high values of n.
BTW, here is Tafkas's algorithm, translated into python for the purpose of my performance tests. Its flaw is to require 3 loops instead of 2.
def gcd(a, b):
    while b != 0:
        t = b
        b = a%b
        a = t
    return a

def find_triple(upper_boundary=1000):
    for c in xrange(5, upper_boundary+1):
        for b in xrange(4, c):
            for a in xrange(3, b):
                if (a*a + b*b == c*c and gcd(a, b) == 1):
                    yield a, b, c
def pyth_triplets(n=1000):
    "Version 1"
    for x in xrange(1, n):
        x2 = x*x  # time saver
        for y in xrange(x+1, n):  # y > x
            z2 = x2 + y*y
            zs = int(z2**.5)
            if zs*zs == z2:
                yield x, y, zs
>>> print list(pyth_triplets(20))
[(3, 4, 5), (5, 12, 13), (6, 8, 10), (8, 15, 17), (9, 12, 15), (12, 16, 20)]
V.1 algorithm has monotonically increasing x values.
EDIT
It seems this question is still alive :)
Since I came back and revisited the code, I tried a second approach which is almost 4 times as fast (about 26% of CPU time for N=10000) as my previous suggestion since it avoids lots of unnecessary calculations:
def pyth_triplets(n=1000):
    "Version 2"
    for z in xrange(5, n+1):
        z2 = z*z  # time saver
        x = x2 = 1
        y = z - 1; y2 = y*y
        while x < y:
            x2_y2 = x2 + y2
            if x2_y2 == z2:
                yield x, y, z
                x += 1; x2 = x*x
                y -= 1; y2 = y*y
            elif x2_y2 < z2:
                x += 1; x2 = x*x
            else:
                y -= 1; y2 = y*y
>>> print list(pyth_triplets(20))
[(3, 4, 5), (6, 8, 10), (5, 12, 13), (9, 12, 15), (8, 15, 17), (12, 16, 20)]
Note that this algorithm has increasing z values.
If the algorithm were converted to C (where, being closer to the metal, multiplications take more time than additions), one could minimise the necessary multiplications, given that the step between consecutive squares is:
(x+1)² - x² = (x+1)(x+1) - x² = x² + 2x + 1 - x² = 2x + 1
so all of the inner x2= x*x and y2= y*y would be converted to additions and subtractions like this:
def pyth_triplets(n=1000):
    "Version 3"
    for z in xrange(5, n+1):
        z2 = z*z  # time saver
        x = x2 = 1; xstep = 3
        y = z - 1; y2 = y*y; ystep = 2*y - 1
        while x < y:
            x2_y2 = x2 + y2
            if x2_y2 == z2:
                yield x, y, z
                x += 1; x2 += xstep; xstep += 2
                y -= 1; y2 -= ystep; ystep -= 2
            elif x2_y2 < z2:
                x += 1; x2 += xstep; xstep += 2
            else:
                y -= 1; y2 -= ystep; ystep -= 2
Of course, in Python the extra bytecode produced actually slows down the algorithm compared to version 2, but I would bet (without checking :) that V.3 is faster in C.
Cheers everyone :)
I just extended Kyle Gullion's answer so that triples are sorted by hypotenuse, then longest side.
It doesn't use numpy, but requires a SortedCollection (or SortedList) such as this one.
def primitive_triples():
    """ generates primitive Pythagorean triplets x<y<z
    sorted by hypotenuse z, then longest side y
    through Berggren's matrices and breadth first traversal of ternary tree
    :see: https://en.wikipedia.org/wiki/Tree_of_primitive_Pythagorean_triples
    """
    key = lambda x: (x[2], x[1])
    triples = SortedCollection(key=key)
    triples.insert([3, 4, 5])
    A = [[ 1, -2, 2], [ 2, -1, 2], [ 2, -2, 3]]
    B = [[ 1,  2, 2], [ 2,  1, 2], [ 2,  2, 3]]
    C = [[-1,  2, 2], [-2,  1, 2], [-2,  2, 3]]
    while triples:
        (a, b, c) = triples.pop(0)
        yield (a, b, c)
        # expand this triple to 3 new triples using Berggren's matrices
        for X in [A, B, C]:
            triple = [sum(x*y for (x, y) in zip([a, b, c], X[i])) for i in range(3)]
            if triple[0] > triple[1]:  # ensure x<y<z
                triple[0], triple[1] = triple[1], triple[0]
            triples.insert(triple)

def triples():
    """ generates all Pythagorean triplets x<y<z
    sorted by hypotenuse z, then longest side y
    """
    prim = []  # list of primitive triples up to now
    key = lambda x: (x[2], x[1])
    samez = SortedCollection(key=key)   # temp triplets with same z
    buffer = SortedCollection(key=key)  # temp for triplets with smaller z
    for pt in primitive_triples():
        z = pt[2]
        if samez and z != samez[0][2]:  # flush samez
            while samez:
                yield samez.pop(0)
        samez.insert(pt)
        # build buffer of smaller multiples of the primitives already found
        for i, pm in enumerate(prim):
            p, m = pm[0:2]
            while True:
                mz = m*p[2]
                if mz < z:
                    buffer.insert(tuple(m*x for x in p))
                elif mz == z:
                    # we need another buffer because next pt might have
                    # the same z as the previous one, but a smaller y than
                    # a multiple of a previous pt ...
                    samez.insert(tuple(m*x for x in p))
                else:
                    break
                m += 1
            prim[i][1] = m  # update multiplier for next loops
        while buffer:  # flush buffer
            yield buffer.pop(0)
        prim.append([pt, 2])  # add primitive to the list
The code is available in the math2 module of my Python library. It is tested against some series of the OEIS (code here at the bottom), which just enabled me to find a mistake in A121727 :-)
I wrote that program in Ruby and it is similar to the Python implementation. The important line is:
if x*x == y*y + z*z && gcd(y,z) == 1:
Then you have to implement a method that returns the greatest common divisor (gcd) of two given numbers. A very simple example in Ruby again:
def gcd(a, b)
  while b != 0
    t = b
    b = a%b
    a = t
  end
  return a
end
The full Ruby method to find the triplets would be:
def find_triple(upper_boundary)
  (5..upper_boundary).each {|c|
    (4..c-1).each {|b|
      (3..b-1).each {|a|
        if (a*a + b*b == c*c && gcd(a,b) == 1)
          puts "#{a} \t #{b} \t #{c}"
        end
      }
    }
  }
end
Old question, but I'll still put in my stuff.
There are two general ways to generate unique Pythagorean triples. One is by scaling, and the other is by using this archaic formula.
What scaling basically does is take a constant n and multiply a base triple, let's say 3, 4, 5, by n. So taking n to be 2, we get 6, 8, 10 as our next triple.
Scaling
def pythagoreanScaled(n):
    triplelist = []
    for x in range(n):
        one = 3*x
        two = 4*x
        three = 5*x
        triple = (one, two, three)
        triplelist.append(triple)
    return triplelist
The formula method uses the fact that if we take a number x, then 2x, x^2+1, and x^2-1 will always form a Pythagorean triplet.
Formula
def pythagoreantriple(n):
    triplelist = []
    for x in range(2, n):
        double = x*2
        minus = x**2 - 1
        plus = x**2 + 1
        triple = (double, minus, plus)
        triplelist.append(triple)
    return triplelist
Yes, there is.
Okay, now you'll want to know why. Why not just constrain it so that z > y? Try
for z in range(y + 1, 1000)
from math import sqrt
from itertools import combinations

# Pythagorean triplet - a^2 + b^2 = c^2 for (a,b) <= (1999,1999)
def gen_pyth(n):
    if n >= 2000:
        return
    ELEM = [[i, j, i*i + j*j] for i, j in list(combinations(range(1, n + 1), 2)) if sqrt(i*i + j*j).is_integer()]
    print(*ELEM, sep="\n")

gen_pyth(200)
for a in range(1, 20):
    for b in range(1, 20):
        for c in range(1, 20):
            if a > b and c and c > b:
                if a**2 == b**2 + c**2:
                    print("triplets are:", a, b, c)
In Python we can store the squares of all numbers in a list, then form pairs from the given numbers, square them, and finally check whether any pair's sum of squares appears in the squared list.
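A small sketch of that description (my code, not the answer's; it only finds triples whose hypotenuse is below the same bound):
n = 1000
squares = [i * i for i in range(n)]                 # squares of all the numbers
roots = {sq: i for i, sq in enumerate(squares)}     # square -> its root

from itertools import combinations
for a, b in combinations(range(1, n), 2):           # every pair of numbers
    s = squares[a] + squares[b]
    if s in roots:                                  # the pair's sum of squares is itself a square
        print(a, b, roots[s])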
Version 5, following Joel Neely's answer.
For the range 1..N, X can be at most N-2 and Y at most N-1. Since Z's max is N and Y's max is N-1, X can be at most Sqrt(N*N - (N-1)*(N-1)) = Sqrt(2*N - 1) and can start from 3.
MaxX = (2 * N - 1) ** 0.5

for x in 3..MaxX {
    y = x + 1
    z = y + 1
    m = x*x + y*y
    k = z * z
    while z <= N {
        while k < m {
            z = z + 1
            k = k + (2*z) - 1
        }
        if k == m and z <= N then {
            // use x, y, z
        }
        y = y + 1
        m = m + (2 * y) - 1
    }
}
Just checking, but I've been using the following code to make Pythagorean triples. It's very fast (and I've tried some of the examples here, though I kind of learned them, wrote my own, and came back and checked here 2 years ago). I think this code correctly finds all Pythagorean triples up to (name your limit), and fairly quickly too. I used C++ to make it.
ullong is unsigned long long, and I created a couple of functions to square and root.
My root function basically says: if the square root of the given number (after truncating it to a whole number), squared, does not equal the given number, then return -1, because it is not rootable.
_square and _root do as expected per the description above; I know of another way to optimize it, but I haven't done or tested that yet.
generate(vector<Triple>& triplist, ullong limit) {
    cout<<"Please wait as triples are being generated."<<endl;
    register ullong a, b, c;
    register Triple trip;
    time_t timer = time(0);

    for(a = 1; a <= limit; ++a) {
        for(b = a + 1; b <= limit; ++b) {
            c = _root(_square(a) + _square(b));
            if(c != -1 && c <= limit) {
                trip.a = a; trip.b = b; trip.c = c;
                triplist.push_back(trip);
            } else if(c > limit)
                break;
        }
    }

    timer = time(0) - timer;
    cout<<"Generated "<<triplist.size()<<" in "<<timer<<" seconds."<<endl;
    cin.get();
    cin.get();
}
Let me know what you all think. It generates all primitive and non-primitive triples according to the teacher I turned it in for. (she tested it up to 100 if I remember correctly).
The results from the v4 supplied by a previous coder here are
Below is a not-very-scientific set of timings (using Java under Eclipse on my older laptop with other stuff running...), where the "use x, y, z" was implemented by instantiating a Triple object with the three values and putting it in an ArrayList. (For these runs, N was set to 10,000, which produced 12,471 triples in each case.)
Version 4: 46 sec.
using square root: 134 sec.
array and map: 400 sec.
The results from mine are:
How many triples to generate: 10000
Please wait as triples are being generated.
Generated 12471 in 2 seconds.
That is before I even start optimizing via the compiler. (I remember previously getting 10,000 down to 0 seconds with tons of special options.) My code also generates all the triples with 100,000 as the limit of how high side1, side2, and hyp can go in 3.2 minutes (I think the 1,000,000 limit takes an hour).
I modified the code a bit and got the 10,000 limit down to 1 second (no optimizations). On top of that, with careful thinking, mine could be broken into chunks and threaded over given ranges (for example, 100,000 divided into 4 equal chunks for 3 CPUs, one extra to hopefully consume CPU time just in case, with ranges 1 to 25,000, 25,000 to 50,000, 50,000 to 75,000, and 75,000 to the end). I may do that and see if it speeds things up any (I would have the threads premade and not count them in the actual time to execute the triple function; I'd need a more precise timer and a way to concatenate the vectors). I think that if one 3.4 GHz CPU with 8 GB of RAM at its disposal can do 10,000 as the limit in 1 second, then 3 CPUs should do that in about a third of a second (and I round up to the next second as it is at the moment).
It should be noted that for a, b, and c you don't need to loop all the way to N.
For a, you only have to loop from 1 to int(sqrt(n**2/2))+1; for b, from a+1 to int(sqrt(n**2-a**2))+1; and for c, from int(sqrt(a**2+b**2)) to int(sqrt(a**2+b**2))+2.
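A sketch of those tighter bounds (my code; math.isqrt needs Python 3.8+):
from math import isqrt

def triplets_tight_bounds(n):
    for a in range(1, isqrt(n * n // 2) + 1):             # a < b implies 2*a*a < c*c <= n*n
        for b in range(a + 1, isqrt(n * n - a * a) + 1):  # b*b <= n*n - a*a
            c = isqrt(a * a + b * b)                      # candidate hypotenuse
            if c * c == a * a + b * b and c <= n:
                print(a, b, c)

triplets_tight_bounds(1000)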
# To find all pythagorean triplets in a range
import math

n = int(input('Enter the upper range of limit'))
for i in range(n+1):
    for j in range(1, i):
        k = math.sqrt(i*i + j*j)
        if k % 1 == 0 and k in range(n+1):
            print(i, j, int(k))
You have to use Euclid's proof of Pythagorean triplets. Follow along below.
You can choose arbitrary numbers greater than zero, say m and n.
According to Euclid, the triplet will be a = (m*m - n*n), b = (2*m*n), c = (m*m + n*n).
Now apply this formula to find the triplets. Say one value of our triplet is 6; what are the other two? OK, let's solve.
a = (m*m - n*n), b = (2*m*n), c = (m*m + n*n)
It is certain that b = (2*m*n) is obviously even. So now
(2*m*n) = 6 => (m*n) = 3 => m*n = 3*1 => m = 3, n = 1
You can take other values rather than 3 and 1, but those two values should have a product of 3 (m*n = 3).
Now, when m = 3 and n = 1, then
a = (m*m - n*n) = (3*3 - 1*1) = 8, c = (m*m + n*n) = (3*3 + 1*1) = 10
6, 8, 10 is our triplet for that value; that is our visualization of how triplets are generated.
If the given number is odd, like 7, then it is handled slightly differently, because b = (2*m*n) will never be odd. So here we have to take
a = (m*m - n*n) = 7, (m+n)*(m-n) = 7*1, so (m+n) = 7, (m-n) = 1
Now find m and n from here, then find the other two values.
If you don't understand it, read it again carefully.
Code according to this and it will generate distinct triplets efficiently.
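A rough sketch of the even-leg case described above (the helper name is mine, not from the answer):
def triple_from_even_leg(b):
    # find m > n with 2*m*n == b, then (m*m - n*n, 2*m*n, m*m + n*n) is a triple
    for n in range(1, b):
        if b % (2 * n) == 0:
            m = b // (2 * n)
            if m > n:
                return (m*m - n*n, b, m*m + n*n)
    return None

print(triple_from_even_leg(6))   # (8, 6, 10), the triple worked out above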
A non-numpy version of the Hall/Roberts approach is
def pythag3(limit=None, all=False):
    """generate Pythagorean triples which are primitive (default)
    or without restriction (when ``all`` is True). The elements
    returned in the tuples are sorted with the smallest first.

    Examples
    ========

    >>> list(pythag3(20))
    [(3, 4, 5), (8, 15, 17), (5, 12, 13)]
    >>> list(pythag3(20, True))
    [(3, 4, 5), (6, 8, 10), (9, 12, 15), (12, 16, 20), (8, 15, 17), (5, 12, 13)]
    """
    if limit and limit < 5:
        return
    m = [(3, 4, 5)]  # primitives stored here
    while m:
        x, y, z = m.pop()
        if x > y:
            x, y = y, x
        yield (x, y, z)
        if all:
            a, b, c = x, y, z
            while 1:
                c += z
                if c > limit:
                    break
                a += x
                b += y
                yield a, b, c
        # new primitives
        a = x - 2*y + 2*z, 2*x - y + 2*z, 2*x - 2*y + 3*z
        b = x + 2*y + 2*z, 2*x + y + 2*z, 2*x + 2*y + 3*z
        c = -x + 2*y + 2*z, -2*x + y + 2*z, -2*x + 2*y + 3*z
        for d in (a, b, c):
            if d[2] <= limit:
                m.append(d)
It's slower than the numpy-coded version but the primitives with largest element less than or equal to 10^6 are generated on my slow machine in about 1.4 seconds. (And the list m never grew beyond 18 elements.)
In C:
#include <stdio.h>

int main()
{
    int n;
    printf("How many triplets needed : \n");
    scanf("%d\n", &n);
    for(int i = 1; i <= 2000; i++)
    {
        for(int j = i; j <= 2000; j++)
        {
            for(int k = j; k <= 2000; k++)
            {
                if((j*j + i*i == k*k) && (n > 0))
                {
                    printf("%d %d %d\n", i, j, k);
                    n = n - 1;
                }
            }
        }
    }
}
You can try this:
triplets = []
for a in range(1, 100):
    for b in range(1, 100):
        for c in range(1, 100):
            if a**2 + b**2 == c**2:
                i = [a, b, c]
                triplets.append(i)

for i in triplets:
    i.sort()
    if triplets.count(i) > 1:
        triplets.remove(i)
print(triplets)

Recurrent sequence task

Given the sequence f0, f1, f2, ... defined by the recurrence relations f0 = 0, f1 = 1, f2 = 2 and fk = f(k-1) + f(k-3).
Write a program that calculates the n elements of this sequence with indices k1, k2, ..., kn.
Input format
The first line of the input contains an integer n (1 <= n <= 1000)
The second line contains n non-negative integers ki (0 <= ki <= 16000), separated by spaces.
Output format
Output space-separated values for fk1, fk2, ..., fkn.
Memory Limit: 10MB
Time limit: 1 second
The problem is that for large values the recursive function exceeds the limits.
def f(a):
    if a <= 2:
        return a
    return f(a - 1) + f(a - 3)

n = int(input())
nums = list(map(int, input().split()))
for i in range(len(nums)):
    if i < len(nums) - 1:
        print(f(nums[i]), end='')
    else:
        print(f(nums[i]))
I also tried to solve it with a loop, but the solution does not fit in the time limit (1 second):
fk1 = 0
fk2 = 0
fk3 = 0
n = int(input())
nums = list(map(int, input().split()))
a = []
for i in range(len(nums)):
    itog = 0
    for j in range(1, nums[i] + 1):
        if j <= 2:
            itog = j
        else:
            if j == 3:
                itog = 0 + 2
                fk1 = itog
                fk2 = 2
                fk3 = 1
            else:
                itog = fk1 + fk3
                fk1, fk2, fk3 = itog, fk1, fk2
    if i < len(nums) - 1:
        print(itog, end='')
    else:
        print(itog)
How else can you solve this problem so that it is optimal in time and memory?
Concerning memory, the best solution is probably the iterative one. I think you are not far from the answer. The idea is to first check for the simple cases f(k) = k (i.e., k <= 2); for all other cases k > 2, you can simply compute f(i) from (f(i-3), f(i-2), f(i-1)) until i = k. All you need to do during this process is keep track of the last three values (similar to what you did in the line fk1, fk2, fk3 = itog, fk1, fk2).
On the other hand, there is one thing you need to do here. If you perform the computations of fk1, fk2, ..., fkn independently, you are screwed (unless you use a super fast machine or a Cython implementation). However, there is no reason to perform n independent computations: you can just compute fx for x = max(k1, k2, ..., kn) and on the way store every answer for fk1, fk2, ..., fkn (this slows the computation of fx down a little, but instead of doing it n times you do it only once). This way it can be solved under 1 second even for n = 1000.
On my machine, independent calculations for f15000, f15001, ..., f16000 takes roughly 30s, the "all at once" solution takes roughly 0.035s.
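A minimal sketch of that "one pass up to max(k)" idea (variable names are mine, not from the answer):
def solve(ks):
    wanted = set(ks)
    answers = {}
    a, b, c = 0, 1, 2                      # f(i-3), f(i-2), f(i-1) once i reaches 3
    for i in range(max(ks) + 1):
        val = i if i <= 2 else c + a       # f(i) = f(i-1) + f(i-3)
        if i >= 3:
            a, b, c = b, c, val
        if i in wanted:
            answers[i] = val               # record the requested values on the way
    return [answers[k] for k in ks]

n = int(input())
ks = list(map(int, input().split()))
print(*solve(ks))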
Honestly, that's not such an easy exercise, it would be interesting to show your solution on a site like code review to get some feedback on your solution once you found one :).
First, you have to sort the numbers. Then calculate values of the sequence one by one:
while True:
    a3 = a2 + a0
    a0 = a3 + a1
    a1 = a0 + a2
    a2 = a1 + a3
Lastly, return the values in the original order. To do that you have to remember the position of every number. From [45, 22, 14, 33] make [[45,0], [22,1], [14,2], [33,3]], then sort, calculate the values and replace each number with its value, giving [[f45,0], [f22,1], [f14,2], [f33,3]], and finally sort by the second element.
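A rough sketch of that sort-and-restore-order idea (my code, not the answer's):
def solve(ks):
    order = sorted(range(len(ks)), key=lambda i: ks[i])   # remember each k's position
    answers = [0] * len(ks)
    a, b, c = 0, 1, 2                # f(i-3), f(i-2), f(i-1) once i reaches 3
    i, val = 0, 0
    for pos in order:                # visit the queries in increasing order of k
        while i <= ks[pos]:
            val = i if i <= 2 else c + a
            if i >= 3:
                a, b, c = b, c, val
            i += 1
        answers[pos] = val           # put the result back at the original position
    return answers

print(*solve([45, 22, 14, 33]))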

Non Divisible subset in python

I have been given a set S of n integers, and have to print the size of a maximal subset S' of S where the sum of any 2 numbers in S' is not evenly divisible by k.
Input Format
The first line contains 2 space-separated integers, n and k, respectively.
The second line contains n space-separated integers describing the unique values of the set.
My Code :
import sys

n, k = raw_input().strip().split(' ')
n, k = [int(n), int(k)]
a = map(int, raw_input().strip().split(' '))
count = 0
for i in range(len(a)):
    for j in range(len(a)):
        if (a[i] + a[j]) % k != 0:
            count = count + 1
print count
Input:
4 3
1 7 2 4
Expected Output:
3
My Output:
10
What am I doing wrong? Anyone?
You can solve it in O(n) time using the following approach:
L = [0]*k
for x in a:
    L[x % k] += 1               # count how many numbers fall in each remainder class

res = 0
for i in range(k//2 + 1):
    if i == 0 or k == i*2:
        res += bool(L[i])       # at most one number with remainder 0 (or exactly k/2)
    else:
        res += max(L[i], L[k-i])  # remainders i and k-i exclude each other; keep the larger group
print(res)
Yes, an O(n) solution for this problem is very much possible. As planetp rightly pointed out, it's pretty much the same solution I have coded in Java. I added comments for better understanding.
import java.io.*;
import java.util.*;

public class Solution {
    public static void main(String[] args) {
        Scanner in = new Scanner(System.in);
        int n = in.nextInt();
        int k = in.nextInt();
        int[] arr = new int[k];
        Arrays.fill(arr, 0);
        Map<Integer,Integer> mp = new HashMap<>();
Storing the values in a map, considering there are no duplicates. You can store them in an array list if there are duplicates. Only then would you have different results.
        for(int i = 0; i < n; i++)
            arr[in.nextInt() % k] += 1;   // count how many inputs fall in each remainder class
        int res = 0;
        for(int i = 0; i <= (k/2); i++)
        {
            if(i == 0 || k == i*2)
            {
                if(arr[i] != 0)
                    res += 1;
            }
If the number is divisible by k we can keep only one, and if the number is exactly half of k we can also keep only one. Rationale: if a and b are both divisible by k, then a+b is also divisible by k. Similarly, if c%k = k/2 and we have more than one such number, their combination is divisible by k. Hence we restrict them to 1 value each.
            else
            {
                int p = arr[i];
                int q = arr[k-i];
                if(p >= q)
                    res += p;
                else
                    res += q;
            }
This is simple: for each remainder from 0 to k/2, figure out which count is larger; if arr[x] > arr[k-x], take whichever is greater. E.g., if we have k = 4 and the numbers 1, 3, 5, 7, 13, 17, then arr[1] = 4 and arr[3] = 2, so pick arr[1], because 1, 5, 13, 17 can be kept together.
        }
        System.out.println(res);
    }
}
# given k, n and a as per your input.
# Will return 0 directly if n == 1
def maxsize(k, n, a):
    import itertools
    while n > 1:
        sets = itertools.combinations(a, n)
        for set_ in sets:
            if all((u+v) % k for (u, v) in itertools.combinations(set_, 2)):
                return n
        n -= 1
    return 0
Java solution:
public class Solution {
    static PrintStream out = System.out;

    public static void main(String[] args) {
        /* Enter your code here. Read input from STDIN. Print output to STDOUT. Your class should be named Solution. */
        Scanner in = new Scanner(System.in);
        int n = in.nextInt();
        int k = in.nextInt();
        int[] A = new int[n];
        for(int i = 0; i < n; i++){
            A[i] = in.nextInt();
        }
        int[] R = new int[k];
        for(int i = 0; i < n; i++)
            R[A[i] % k] += 1;
        int res = 0;
        for(int i = 0; i < k/2 + 1; i++){
            if(i == 0 || k == i*2)
                res += (R[i] != 0) ? 1 : 0;
            else
                res += Math.max(R[i], R[k-i]);
        }
        out.println(res);
    }
}

Python code optimization (20x slower than C)

I've written this very badly optimized C code that does a simple math calculation:
#include <stdio.h>
#include <math.h>
#include <stdlib.h>
#define MIN(a, b) (((a) < (b)) ? (a) : (b))
#define MAX(a, b) (((a) > (b)) ? (a) : (b))

unsigned long long int p(int);
float fullCheck(int);

int main(int argc, char **argv){
    int i, g, maxNumber;
    unsigned long long int diff = 1000;
    if(argc < 2){
        fprintf(stderr, "Usage: %s maxNumber\n", argv[0]);
        return 0;
    }
    maxNumber = atoi(argv[1]);
    for(i = 1; i < maxNumber; i++){
        for(g = 1; g < maxNumber; g++){
            if(i == g)
                continue;
            if(p(MAX(i,g)) - p(MIN(i,g)) < diff && fullCheck(p(MAX(i,g)) - p(MIN(i,g))) && fullCheck(p(i) + p(g))){
                diff = p(MAX(i,g)) - p(MIN(i,g));
                printf("We have a couple %llu %llu with diff %llu\n", p(i), p(g), diff);
            }
        }
    }
    return 0;
}

float fullCheck(int number){
    float check = (-1 + sqrt(1 + 24 * number))/-6;
    float check2 = (-1 - sqrt(1 + 24 * number))/-6;
    if(check/1.00 == (int)check)
        return check;
    if(check2/1.00 == (int)check2)
        return check2;
    return 0;
}

unsigned long long int p(int n){
    return n * (3 * n - 1 ) / 2;
}
And then I tried (just for fun) to port it to Python to see how it would react. My first version was almost a 1:1 conversion that ran terribly slowly (120+ s in Python vs. <1 s in C).
I've done a bit of optimization, and this is what I obtained:
#!/usr/bin/env python
from cmath import sqrt
import cProfile
from pstats import Stats

def quickCheck(n):
    partial_c = (sqrt(1 + 24 * (n)))/-6
    c = 1/6 + partial_c
    if int(c.real) == c.real:
        return True
    c = c - 2*partial_c
    if int(c.real) == c.real:
        return True
    return False

def main():
    maxNumber = 5000
    diff = 1000
    for i in range(1, maxNumber):
        p_i = i * (3 * i - 1 ) / 2
        for g in range(i, maxNumber):
            if i == g:
                continue
            p_g = g * (3 * g - 1 ) / 2
            if p_i > p_g:
                ma = p_i
                mi = p_g
            else:
                ma = p_g
                mi = p_i
            if ma - mi < diff and quickCheck(ma - mi):
                if quickCheck(ma + mi):
                    print ('New couple ', ma, mi)
                    diff = ma - mi

cProfile.run('main()', 'script_perf')
perf = Stats('script_perf').sort_stats('time', 'calls').print_stats(10)
This runs in about 16 s, which is better but still almost 20 times slower than C.
Now, I know C is better than Python for this kind of calculation, but what I would like to know is whether there is something I've missed (Python-wise, like a horribly slow function or such) that could have made this function faster.
Please note that I'm using Python 3.1.1, if this makes a difference.
Since quickCheck is being called close to 25,000,000 times, you might want to use memoization to cache the answers.
You can do memoization in C as well as Python. Things will be much faster in C, also.
You're computing 1/6 in each iteration of quickCheck. I'm not sure if this will be optimized out by Python, but if you can avoid recomputing constant values, you'll find things are faster. C compilers do this for you.
Doing things like if condition: return True; else: return False is silly -- and time consuming. Simply do return condition.
In Python 3.x, /2 must create floating-point values. You appear to need integers for this. You should be using //2 division. It will be closer to the C version in terms of what it does, but I don't think it's significantly faster.
Finally, Python is generally interpreted. The interpreter will always be significantly slower than C.
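As a concrete illustration of the memoization suggestion, functools.lru_cache can simply be dropped onto the existing quickCheck (my sketch, not the answerer's code; lru_cache needs Python 3.2+):
from functools import lru_cache
from cmath import sqrt

@lru_cache(maxsize=None)              # repeated arguments hit the cache instead of recomputing
def quickCheck(n):
    partial_c = sqrt(1 + 24 * n) / -6
    c = 1/6 + partial_c
    if int(c.real) == c.real:
        return True
    c = c - 2 * partial_c
    return int(c.real) == c.real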
I made it go from ~7 seconds to ~3 seconds on my machine:
Precomputed i * (3 * i - 1 ) / 2 for each value, in yours it was computed twice quite a lot
Cached calls to quickCheck
Removed if i == g by adding +1 to the range
Removed if p_i > p_g since p_i is always smaller than p_g
Also put the quickCheck-function inside main, to make all variables local (which have faster lookup than global).
I'm sure there are more micro-optimizations available.
def main():
    maxNumber = 5000
    diff = 1000
    p = {}
    quickCache = {}
    for i in range(maxNumber):
        p[i] = i * (3 * i - 1 ) / 2

    def quickCheck(n):
        if n in quickCache: return quickCache[n]
        partial_c = (sqrt(1 + 24 * (n)))/-6
        c = 1/6 + partial_c
        if int(c.real) == c.real:
            quickCache[n] = True
            return True
        c = c - 2*partial_c
        if int(c.real) == c.real:
            quickCache[n] = True
            return True
        quickCache[n] = False
        return False

    for i in range(1, maxNumber):
        mi = p[i]
        for g in range(i+1, maxNumber):
            ma = p[g]
            if ma - mi < diff and quickCheck(ma - mi) and quickCheck(ma + mi):
                print('New couple ', ma, mi)
                diff = ma - mi
Because the function p() is monotonically increasing, you can avoid comparing the values, as g > i implies p(g) > p(i). Also, the inner loop can be broken out of early because p(g) - p(i) >= diff implies p(g+1) - p(i) >= diff.
Also, for correctness, I changed the equality comparison in quickCheck to compare the difference against an epsilon, because exact comparison with floating point is pretty fragile.
On my machine this reduced the runtime to 7.8ms using Python 2.6. Using PyPy with JIT reduced this to 0.77ms.
This shows that before turning to micro-optimization it pays to look for algorithmic optimizations. Micro-optimizations make spotting algorithmic changes much harder for relatively tiny gains.
EPS = 0.00000001

def quickCheck(n):
    partial_c = sqrt(1 + 24*n) / -6
    c = 1/6 + partial_c
    if abs(int(c) - c) < EPS:
        return True
    c = 1/6 - partial_c
    if abs(int(c) - c) < EPS:
        return True
    return False

def p(i):
    return i * (3 * i - 1 ) / 2

def main(maxNumber):
    diff = 1000
    for i in range(1, maxNumber):
        for g in range(i+1, maxNumber):
            if p(g) - p(i) >= diff:
                break
            if quickCheck(p(g) - p(i)) and quickCheck(p(g) + p(i)):
                print('New couple ', p(g), p(i), p(g) - p(i))
                diff = p(g) - p(i)
There are some python compilers that might actually do a good bit for you. Have a look at Psyco.
Another way of dealing with math intensive programs is to rewrite the majority of the work into a math kernel, such as NumPy, so that heavily optimized code is doing the work, and your python code only guides the calculation. To get the most out of this strategy, avoid doing calculations in loops, and instead let the math kernel do all of that.
The other respondents have already mentioned several optimizations that will help. However, ultimately, you're not going to be able to match the performance of C in Python. Python is a nice tool, but since it's interpreted, it isn't really suited for heavy number crunching or other apps where performance is key.
Also, even in your C version, your inner loop could use quite a bit of help. Updated version:
for(i = 1; i < maxNumber; i++){
    for(g = 1; g < maxNumber; g++){
        if(i == g)
            continue;
        max = i;
        min = g;
        if (max < min) {
            // xor swap - could use swap(p_max,p_min) instead.
            max = max ^ min;
            min = max ^ min;
            max = max ^ min;
        }
        p_max = P(max);
        p_min = P(min);
        p_i = P(i);
        p_g = P(g);
        if(p_max - p_min < diff && fullCheck(p_max - p_min) && fullCheck(p_i + p_g)){
            diff = p_max - p_min;
            printf("We have a couple %llu %llu with diff %llu\n", p_i, p_g, diff);
        }
    }
}

///////////////////////////

float fullCheck(int number){
    float den = sqrt(1 + 24*number)/6.0;
    float check = 1/6.0 - den;
    float check2 = 1/6.0 + den;
    if(check == (int)check)
        return check;
    if(check2 == (int)check2)
        return check2;
    return 0.0;
}
Division, function calls, etc are costly. Also, calculating them once and storing in vars such as I've done can make things a lot more readable.
You might consider declaring P() as inline or rewrite as a preprocessor macro. Depending on how good your optimizer is, you might want to perform some of the arithmetic yourself and simplify its implementation.
Your implementation of fullCheck() would return what appear to be invalid results, since 1/6==0, where 1/6.0 would return 0.166... as you would expect.
This is a very brief take on what you can do to your C code to improve performance. This will, no doubt, widen the gap between C and Python performance.
20x difference between Python and C for a number crunching task seems quite good to me.
Check the usual performance differences for some CPU intensive tasks (keep in mind that the scale is logarithmic).
But look on the bright side, what's 1 minute of CPU time compared with the brain and typing time you saved writing Python instead of C? :-)

Generating unique, ordered Pythagorean triplets

This is a program I wrote to calculate Pythagorean triplets. When I run the program it prints each set of triplets twice because of the if statement. Is there any way I can tell the program to only print a new set of triplets once? Thanks.
import math
def main():
for x in range (1, 1000):
for y in range (1, 1000):
for z in range(1, 1000):
if x*x == y*y + z*z:
print y, z, x
print '-'*50
if __name__ == '__main__':
main()
Pythagorean Triples make a good example for claiming "for loops considered harmful", because for loops seduce us into thinking about counting, often the most irrelevant part of a task.
(I'm going to stick with pseudo-code to avoid language biases, and to keep the pseudo-code streamlined, I'll not optimize away multiple calculations of e.g. x * x and y * y.)
Version 1:
for x in 1..N {
for y in 1..N {
for z in 1..N {
if x * x + y * y == z * z then {
// use x, y, z
}
}
}
}
is the worst solution. It generates duplicates, and traverses parts of the space that aren't useful (e.g. whenever z < y). Its time complexity is cubic on N.
Version 2, the first improvement, comes from requiring x < y < z to hold, as in:
for x in 1..N {
for y in x+1..N {
for z in y+1..N {
if x * x + y * y == z * z then {
// use x, y, z
}
}
}
}
which reduces run time and eliminates duplicated solutions. However, it is still cubic on N; the improvement is just a reduction of the co-efficient of N-cubed.
It is pointless to continue examining increasing values of z after z * z < x * x + y * y no longer holds. That fact motivates Version 3, the first step away from brute-force iteration over z:
for x in 1..N {
for y in x+1..N {
z = y + 1
while z * z < x * x + y * y {
z = z + 1
}
if z * z == x * x + y * y and z <= N then {
// use x, y, z
}
}
}
For N of 1000, this is about 5 times faster than Version 2, but it is still cubic on N.
The next insight is that x and y are the only independent variables; z depends on their values, and the last z value considered for the previous value of y is a good starting search value for the next value of y. That leads to Version 4:
for x in 1..N {
y = x+1
z = y+1
while z <= N {
while z * z < x * x + y * y {
z = z + 1
}
if z * z == x * x + y * y and z <= N then {
// use x, y, z
}
y = y + 1
}
}
which allows y and z to "sweep" the values above x only once. Not only is it over 100 times faster for N of 1000, it is quadratic on N, so the speedup increases as N grows.
I've encountered this kind of improvement often enough to be mistrustful of "counting loops" for any but the most trivial uses (e.g. traversing an array).
Update: Apparently I should have pointed out a few things about V4 that are easy to overlook.
Both of the while loops are controlled by the value of z (one directly, the other indirectly through the square of z). The inner while is actually speeding up the outer while, rather than being orthogonal to it. It's important to look at what the loops are doing, not merely to count how many loops there are.
All of the calculations in V4 are strictly integer arithmetic. Conversion to/from floating-point, as well as floating-point calculations, are costly by comparison.
V4 runs in constant memory, requiring only three integer variables. There are no arrays or hash tables to allocate and initialize (and, potentially, to cause an out-of-memory error).
The original question allowed all of x, y, and x to vary over the same range. V1..V4 followed that pattern.
Below is a not-very-scientific set of timings (using Java under Eclipse on my older laptop with other stuff running...), where the "use x, y, z" was implemented by instantiating a Triple object with the three values and putting it in an ArrayList. (For these runs, N was set to 10,000, which produced 12,471 triples in each case.)
Version 4: 46 sec.
using square root: 134 sec.
array and map: 400 sec.
The "array and map" algorithm is essentially:
squares = array of i*i for i in 1 .. N
roots = map of i*i -> i for i in 1 .. N
for x in 1 .. N
for y in x+1 .. N
z = roots[squares[x] + squares[y]]
if z exists use x, y, z
The "using square root" algorithm is essentially:
for x in 1 .. N
for y in x+1 .. N
z = (int) sqrt(x * x + y * y)
if z * z == x * x + y * y then use x, y, z
The actual code for V4 is:
public Collection<Triple> byBetterWhileLoop() {
Collection<Triple> result = new ArrayList<Triple>(limit);
for (int x = 1; x < limit; ++x) {
int xx = x * x;
int y = x + 1;
int z = y + 1;
while (z <= limit) {
int zz = xx + y * y;
while (z * z < zz) {++z;}
if (z * z == zz && z <= limit) {
result.add(new Triple(x, y, z));
}
++y;
}
}
return result;
}
Note that x * x is calculated in the outer loop (although I didn't bother to cache z * z); similar optimizations are done in the other variations.
I'll be glad to provide the Java source code on request for the other variations I timed, in case I've mis-implemented anything.
Substantially faster than any of the solutions so far. Finds triplets via a ternary tree.
Wolfram says:
Hall (1970) and Roberts (1977) prove that (a, b, c) is a primitive Pythagorean triple if and only if
(a,b,c)=(3,4,5)M
where M is a finite product of the matrices U, A, D.
And there we have a formula to generate every primitive triple.
In the above formula, the hypotenuse is ever growing so it's pretty easy to check for a max length.
In Python:
import numpy as np
def gen_prim_pyth_trips(limit=None):
u = np.mat(' 1 2 2; -2 -1 -2; 2 2 3')
a = np.mat(' 1 2 2; 2 1 2; 2 2 3')
d = np.mat('-1 -2 -2; 2 1 2; 2 2 3')
uad = np.array([u, a, d])
m = np.array([3, 4, 5])
while m.size:
m = m.reshape(-1, 3)
if limit:
m = m[m[:, 2] <= limit]
yield from m
m = np.dot(m, uad)
If you'd like all triples and not just the primitives:
def gen_all_pyth_trips(limit):
for prim in gen_prim_pyth_trips(limit):
i = prim
for _ in range(limit//prim[2]):
yield i
i = i + prim
list(gen_prim_pyth_trips(10**4)) took 2.81 milliseconds to come back with 1593 elements while list(gen_all_pyth_trips(10**4)) took 19.8 milliseconds to come back with 12471 elements.
For reference, the accepted answer (in Python) took 38 seconds for 12471 elements.
Just for fun, setting the upper limit to one million list(gen_all_pyth_trips(10**6)) returns in 2.66 seconds with 1980642 elements (almost 2 million triples in 3 seconds). list(gen_all_pyth_trips(10**7)) brings my computer to its knees as the list gets so large it consumes every last bit of RAM. Doing something like sum(1 for _ in gen_all_pyth_trips(10**7)) gets around that limitation and returns in 30 seconds with 23471475 elements.
For more information on the algorithm used, check out the articles on Wolfram and Wikipedia.
You should define x < y < z.
for x in range (1, 1000):
for y in range (x + 1, 1000):
for z in range(y + 1, 1000):
Another good optimization would be to only use x and y and calculate zsqr = x * x + y * y. If zsqr is a square number (or z = sqrt(zsqr) is a whole number), it is a triplet, else not. That way, you need only two loops instead of three (for your example, that's about 1000 times faster).
The previously listed algorithms for generating Pythagorean triplets are all modifications of the naive approach derived from the basic relationship a^2 + b^2 = c^2 where (a, b, c) is a triplet of positive integers. It turns out that Pythagorean triplets satisfy some fairly remarkable relationships that can be used to generate all Pythagorean triplets.
Euclid discovered the first such relationship. He determined that for every Pythagorean triple (a, b, c), possibly after a reordering of a and b there are relatively prime positive integers m and n with m > n, at least one of which is even, and a positive integer k such that
a = k (2mn)
b = k (m^2 - n^2)
c = k (m^2 + n^2)
Then to generate Pythagorean triplets, generate relatively prime positive integers m and n of differing parity, and a positive integer k and apply the above formula.
struct PythagoreanTriple {
public int a { get; private set; }
public int b { get; private set; }
public int c { get; private set; }
public PythagoreanTriple(int a, int b, int c) : this() {
this.a = a < b ? a : b;
this.b = b < a ? a : b;
this.c = c;
}
public override string ToString() {
return String.Format("a = {0}, b = {1}, c = {2}", a, b, c);
}
public static IEnumerable<PythagoreanTriple> GenerateTriples(int max) {
var triples = new List<PythagoreanTriple>();
for (int m = 1; m <= max / 2; m++) {
for (int n = 1 + (m % 2); n < m; n += 2) {
if (m.IsRelativelyPrimeTo(n)) {
for (int k = 1; k <= max / (m * m + n * n); k++) {
triples.Add(EuclidTriple(m, n, k));
}
}
}
}
return triples;
}
private static PythagoreanTriple EuclidTriple(int m, int n, int k) {
int msquared = m * m;
int nsquared = n * n;
return new PythagoreanTriple(k * 2 * m * n, k * (msquared - nsquared), k * (msquared + nsquared));
}
}
public static class IntegerExtensions {
private static int GreatestCommonDivisor(int m, int n) {
return (n == 0 ? m : GreatestCommonDivisor(n, m % n));
}
public static bool IsRelativelyPrimeTo(this int m, int n) {
return GreatestCommonDivisor(m, n) == 1;
}
}
class Program {
static void Main(string[] args) {
PythagoreanTriple.GenerateTriples(1000).ToList().ForEach(t => Console.WriteLine(t));
}
}
The Wikipedia article on Formulas for generating Pythagorean triples contains other such formulae.
Algorithms can be tuned for speed, memory usage, simplicity, and other things.
Here is a pythagore_triplets algorithm tuned for speed, at the cost of memory usage and simplicity. If all you want is speed, this could be the way to go.
Calculation of list(pythagore_triplets(10000)) takes 40 seconds on my computer, versus 63 seconds for ΤΖΩΤΖΙΟΥ's algorithm, and possibly days of calculation for Tafkas's algorithm (and all other algorithms which use 3 embedded loops instead of just 2).
def pythagore_triplets(n=1000):
maxn=int(n*(2**0.5))+1 # max int whose square may be the sum of two squares
squares=[x*x for x in xrange(maxn+1)] # calculate all the squares once
reverse_squares=dict([(squares[i],i) for i in xrange(maxn+1)]) # x*x=>x
for x in xrange(1,n):
x2 = squares[x]
for y in xrange(x,n+1):
y2 = squares[y]
z = reverse_squares.get(x2+y2)
if z != None:
yield x,y,z
>>> print list(pythagore_triplets(20))
[(3, 4, 5), (5, 12, 13), (6, 8, 10), (8, 15, 17), (9, 12, 15), (12, 16, 20)]
Note that if you are going to calculate the first billion triplets, then this algorithm will crash before it even starts, because of an out of memory error. So ΤΖΩΤΖΙΟΥ's algorithm is probably a safer choice for high values of n.
BTW, here is Tafkas's algorithm, translated into python for the purpose of my performance tests. Its flaw is to require 3 loops instead of 2.
def gcd(a, b):
while b != 0:
t = b
b = a%b
a = t
return a
def find_triple(upper_boundary=1000):
for c in xrange(5,upper_boundary+1):
for b in xrange(4,c):
for a in xrange(3,b):
if (a*a + b*b == c*c and gcd(a,b) == 1):
yield a,b,c
def pyth_triplets(n=1000):
"Version 1"
for x in xrange(1, n):
x2= x*x # time saver
for y in xrange(x+1, n): # y > x
z2= x2 + y*y
zs= int(z2**.5)
if zs*zs == z2:
yield x, y, zs
>>> print list(pyth_triplets(20))
[(3, 4, 5), (5, 12, 13), (6, 8, 10), (8, 15, 17), (9, 12, 15), (12, 16, 20)]
V.1 algorithm has monotonically increasing x values.
EDIT
It seems this question is still alive :)
Since I came back and revisited the code, I tried a second approach which is almost 4 times as fast (about 26% of CPU time for N=10000) as my previous suggestion since it avoids lots of unnecessary calculations:
def pyth_triplets(n=1000):
"Version 2"
for z in xrange(5, n+1):
z2= z*z # time saver
x= x2= 1
y= z - 1; y2= y*y
while x < y:
x2_y2= x2 + y2
if x2_y2 == z2:
yield x, y, z
x+= 1; x2= x*x
y-= 1; y2= y*y
elif x2_y2 < z2:
x+= 1; x2= x*x
else:
y-= 1; y2= y*y
>>> print list(pyth_triplets(20))
[(3, 4, 5), (6, 8, 10), (5, 12, 13), (9, 12, 15), (8, 15, 17), (12, 16, 20)]
Note that this algorithm has increasing z values.
If the algorithm was converted to C —where, being closer to the metal, multiplications take more time than additions— one could minimalise the necessary multiplications, given the fact that the step between consecutive squares is:
(x+1)² - x² = (x+1)(x+1) - x² = x² + 2x + 1 - x² = 2x + 1
so all of the inner x2= x*x and y2= y*y would be converted to additions and subtractions like this:
def pyth_triplets(n=1000):
"Version 3"
for z in xrange(5, n+1):
z2= z*z # time saver
x= x2= 1; xstep= 3
y= z - 1; y2= y*y; ystep= 2*y - 1
while x < y:
x2_y2= x2 + y2
if x2_y2 == z2:
yield x, y, z
x+= 1; x2+= xstep; xstep+= 2
y-= 1; y2-= ystep; ystep-= 2
elif x2_y2 < z2:
x+= 1; x2+= xstep; xstep+= 2
else:
y-= 1; y2-= ystep; ystep-= 2
Of course, in Python the extra bytecode produced actually slows down the algorithm compared to version 2, but I would bet (without checking :) that V.3 is faster in C.
Cheers everyone :)
I juste extended Kyle Gullion 's answer so that triples are sorted by hypothenuse, then longest side.
It doesn't use numpy, but requires a SortedCollection (or SortedList) such as this one
def primitive_triples():
    """ generates primitive Pythagorean triplets x<y<z
    sorted by hypotenuse z, then longest side y
    through Berggren's matrices and breadth first traversal of ternary tree
    :see: https://en.wikipedia.org/wiki/Tree_of_primitive_Pythagorean_triples
    """
    key = lambda x: (x[2], x[1])
    triples = SortedCollection(key=key)
    triples.insert([3, 4, 5])
    A = [[ 1, -2, 2], [ 2, -1, 2], [ 2, -2, 3]]
    B = [[ 1,  2, 2], [ 2,  1, 2], [ 2,  2, 3]]
    C = [[-1,  2, 2], [-2,  1, 2], [-2,  2, 3]]
    while triples:
        (a, b, c) = triples.pop(0)
        yield (a, b, c)
        # expand this triple to 3 new triples using Berggren's matrices
        for X in [A, B, C]:
            triple = [sum(x*y for (x, y) in zip([a, b, c], X[i])) for i in range(3)]
            if triple[0] > triple[1]:  # ensure x<y<z
                triple[0], triple[1] = triple[1], triple[0]
            triples.insert(triple)
def triples():
    """ generates all Pythagorean triplets x<y<z
    sorted by hypotenuse z, then longest side y
    """
    prim = []  # list of primitive triples found so far
    key = lambda x: (x[2], x[1])
    samez = SortedCollection(key=key)   # temp triplets with same z
    buffer = SortedCollection(key=key)  # temp for triplets with smaller z
    for pt in primitive_triples():
        z = pt[2]
        if samez and z != samez[0][2]:  # flush samez
            while samez:
                yield samez.pop(0)
        samez.insert(pt)
        # build buffer of smaller multiples of the primitives already found
        for i, pm in enumerate(prim):
            p, m = pm[0:2]
            while True:
                mz = m*p[2]
                if mz < z:
                    buffer.insert(tuple(m*x for x in p))
                elif mz == z:
                    # we need another buffer because next pt might have
                    # the same z as the previous one, but a smaller y than
                    # a multiple of a previous pt ...
                    samez.insert(tuple(m*x for x in p))
                else:
                    break
                m += 1
            prim[i][1] = m  # update multiplier for next loops
        while buffer:  # flush buffer
            yield buffer.pop(0)
        prim.append([pt, 2])  # add primitive to the list
The code is available in the math2 module of my Python library. It is tested against some series of the OEIS (code here, at the bottom), which even enabled me to find a mistake in A121727 :-)
I wrote that program in Ruby and it is similar to the Python implementation. The important line is:
if x*x == y*y + z*z && gcd(y,z) == 1
Then you have to implement a method that returns the greatest common divisor (gcd) of two given numbers. A very simple example in Ruby again:
def gcd(a, b)
  while b != 0
    t = b
    b = a % b
    a = t
  end
  return a
end
The full Ruby method to find the triplets would be:
def find_triple(upper_boundary)
  (5..upper_boundary).each { |c|
    (4..c-1).each { |b|
      (3..b-1).each { |a|
        if (a*a + b*b == c*c && gcd(a,b) == 1)
          puts "#{a} \t #{b} \t #{c}"
        end
      }
    }
  }
end
Old question, but I'll still add my input.
There are two general ways to generate unique Pythagorean triples: one is by scaling, and the other is by using this archaic formula.
What scaling basically does is take a constant n and multiply a base triple, let's say 3, 4, 5, by n. So taking n to be 2, we get 6, 8, 10, our next triple.
Scaling
def pythagoreanScaled(n):
    triplelist = []
    for x in range(1, n + 1):  # start the scale factor at 1 to skip the degenerate (0, 0, 0)
        one = 3 * x
        two = 4 * x
        three = 5 * x
        triple = (one, two, three)
        triplelist.append(triple)
    return triplelist
The formula method uses the fact that if we take a number m, then 2m, m^2-1, and m^2+1 will always form a Pythagorean triplet.
Formula
def pythagoreantriple(n):
    triplelist = []
    for x in range(2, n):
        double = x * 2
        minus = x**2 - 1
        plus = x**2 + 1
        triple = (double, minus, plus)
        triplelist.append(triple)
    return triplelist
Yes, there is.
Okay, now you'll want to know why. Why not just constrain it so that z > y? Try
for z in range(y+1, 1000)
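A quick sketch of those constrained loops (my own illustration, not from the original answer): starting each inner range just past the previous variable enforces x < y < z, so no permutation of the same triple is visited twice.
def brute_force_triples(limit=1000):
    # x < y < z is guaranteed by the range start values
    for x in range(1, limit):
        for y in range(x + 1, limit):
            for z in range(y + 1, limit):
                if x*x + y*y == z*z:
                    yield x, y, z
print(list(brute_force_triples(20)))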
from math import sqrt
from itertools import combinations

# Pythagorean triplet - a^2 + b^2 = c^2 for (a,b) <= (1999,1999)
def gen_pyth(n):
    if n >= 2000:
        return
    # store the hypotenuse itself, int(sqrt(i*i + j*j)), rather than its square
    ELEM = [[i, j, int(sqrt(i*i + j*j))] for i, j in combinations(range(1, n + 1), 2)
            if sqrt(i*i + j*j).is_integer()]
    print(*ELEM, sep="\n")

gen_pyth(200)
for a in range(1, 20):
    for b in range(1, 20):
        for c in range(1, 20):
            if a > b and c > b:
                if a**2 == b**2 + c**2:
                    print("triplets are:", a, b, c)
In Python we can store the squares of all numbers in another list, then form pairs of all the given numbers, square them, and finally check whether any pair's sum of squares appears in the list of squares.
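Here is a minimal sketch of that idea (my own code, with hypothetical names): precompute a square-to-root lookup, then test each pair of numbers against it.
from itertools import combinations

def triples_via_square_lookup(n=20):
    roots = {x*x: x for x in range(1, 2*n)}  # square -> root; 2*n is a generous bound for the hypotenuse
    for a, b in combinations(range(1, n + 1), 2):
        c = roots.get(a*a + b*b)
        if c is not None:
            yield a, b, c

print(list(triples_via_square_lookup(20)))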
Version 5, following up on Joel Neely's Version 4.
Since X can be at most N-2 and Y at most N-1 for the range 1..N, and since Z's maximum is N and Y's maximum is N-1, X can be at most Sqrt(N*N - (N-1)*(N-1)) = Sqrt(2*N - 1) and can start from 3.
MaxX = ( 2 * N - 1 ) ** 0.5
for x in 3..MaxX {
    y = x + 1
    z = y + 1
    m = x*x + y*y
    k = z*z
    while z <= N {
        while k < m {
            z = z + 1
            k = k + (2*z) - 1
        }
        if k == m and z <= N then {
            // use x, y, z
        }
        y = y + 1
        m = m + (2*y) - 1
    }
}
Just checking, but I've been using the following code to make Pythagorean triples. It's very fast (and I've tried some of the examples here, though I kind of learned from them, wrote my own, and came back and checked here two years ago). I think this code correctly finds all Pythagorean triples up to (name your limit), and fairly quickly too. I used C++ to make it.
ullong is unsigned long long, and I created a couple of functions to square and root.
My root function basically says: if the square root of the given number (truncated to a whole number) squared does not equal the given number, return -1, because it is not rootable.
_square and _root do as expected from the description above; I know of another way to optimize this, but I haven't done or tested that yet.
void generate(vector<Triple>& triplist, ullong limit) {
    cout << "Please wait as triples are being generated." << endl;
    register ullong a, b, c;
    register Triple trip;
    time_t timer = time(0);
    for (a = 1; a <= limit; ++a) {
        for (b = a + 1; b <= limit; ++b) {
            // _square and _root are the helper functions described above;
            // _root returns -1 (which wraps around for ullong) when the sum is not a perfect square
            c = _root(_square(a) + _square(b));
            if (c != (ullong)-1 && c <= limit) {
                trip.a = a; trip.b = b; trip.c = c;
                triplist.push_back(trip);
            } else if (c > limit)
                break;
        }
    }
    timer = time(0) - timer;
    cout << "Generated " << triplist.size() << " in " << timer << " seconds." << endl;
    cin.get();
    cin.get();
}
Let me know what you all think. It generates all primitive and non-primitive triples, according to the teacher I turned it in to (she tested it up to 100, if I remember correctly).
The results from the v4 supplied by a previous coder here are:
Below is a not-very-scientific set of timings (using Java under Eclipse on my older laptop with other stuff running...), where the "use x, y, z" was implemented by instantiating a Triple object with the three values and putting it in an ArrayList. (For these runs, N was set to 10,000, which produced 12,471 triples in each case.)
Version 4: 46 sec.
using square root: 134 sec.
array and map: 400 sec.
The results from mine are:
How many triples to generate: 10000
Please wait as triples are being generated.
Generated 12471 in 2 seconds.
That is before I even started optimizing via the compiler. (I remember previously getting 10,000 down to 0 seconds with tons of special options and such.) My code also generates all the triples with 100,000 as the limit on how high side1, side2, and the hypotenuse can go, in 3.2 minutes (I think the 1,000,000 limit takes an hour).
I modified the code a bit and got the 10,000 limit down to 1 second (no optimizations). On top of that, with careful thinking, mine could be broken into chunks and threaded over given ranges: for example, divide 100,000 into 4 equal chunks for 3 CPUs (1 extra to hopefully consume CPU time, just in case) with ranges 1 to 25,000, 25,000 to 50,000, 50,000 to 75,000, and 75,000 to the end. I may do that and see if it speeds things up any (I would have the threads premade and not include them in the actual time to execute the triple function; I'd also need a more precise timer and a way to concatenate the vectors). I think that if one 3.4 GHz CPU with 8 GB of RAM at its disposal can do 10,000 as the limit in 1 second, then 3 CPUs should do it in about a third of a second (and I round up to whole seconds as it is at the moment).
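For what it's worth, here is a rough Python sketch of that chunking idea (my own illustration, not the C++ code above; it uses math.isqrt from Python 3.8+): split the range of the smallest side into independent chunks and hand them to a process pool.
from concurrent.futures import ProcessPoolExecutor
from math import isqrt

def triples_in_chunk(args):
    lo, hi, limit = args
    found = []
    for a in range(lo, hi):
        for b in range(a + 1, limit + 1):
            s = a*a + b*b
            c = isqrt(s)
            if c > limit:
                break  # larger b only makes the hypotenuse bigger
            if c*c == s:
                found.append((a, b, c))
    return found

def parallel_triples(limit=10000, workers=4):
    step = limit // workers + 1
    chunks = [(lo, min(lo + step, limit + 1), limit)
              for lo in range(1, limit + 1, step)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return [t for part in pool.map(triples_in_chunk, chunks) for t in part]
Call parallel_triples() under if __name__ == "__main__": so the worker processes can import the module cleanly.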
It should be noted that for a, b, and c you don't need to loop all the way to N.
For a, you only have to loop from 1 to int(sqrt(n**2/2))+1; for b, from a+1 to int(sqrt(n**2-a**2))+1; and for c, from int(sqrt(a**2+b**2)) to int(sqrt(a**2+b**2))+2.
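A minimal sketch of those tightened bounds (my own code, with hypothetical names; n is the largest hypotenuse considered):
from math import sqrt

def triples_with_tight_bounds(n):
    for a in range(1, int(sqrt(n*n / 2)) + 1):
        for b in range(a + 1, int(sqrt(n*n - a*a)) + 1):
            s = a*a + b*b
            c = int(sqrt(s))
            # the small +2 window guards against floating-point rounding of sqrt
            for cand in (c, c + 1):
                if cand*cand == s:
                    yield a, b, cand
                    break

print(list(triples_with_tight_bounds(20)))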
# To find all pythagorean triplets in a range
import math
n = int(input('Enter the upper range of limit'))
for i in range(n + 1):
    for j in range(1, i):
        k = math.sqrt(i*i + j*j)
        if k % 1 == 0 and k in range(n + 1):
            print(i, j, int(k))
You have to use Euclid's formula for Pythagorean triplets. It goes as follows...
Choose two arbitrary numbers greater than zero, say m and n.
According to Euclid, the triplet will be a = (m*m - n*n), b = (2*m*n), c = (m*m + n*n).
Now apply this formula to find the triplets. Say one value of our triplet is 6; what are the other two? OK, let's solve...
a = (m*m - n*n), b = (2*m*n), c = (m*m + n*n)
It is certain that b = (2*m*n) is obviously even. So now
(2*m*n) = 6 => (m*n) = 3 => m*n = 3*1 => m = 3, n = 1
You can take other values rather than 3 and 1, but those two values should give the product of the two numbers, which is 3 (m*n = 3).
Now, when m = 3 and n = 1, then
a = (m*m - n*n) = (3*3 - 1*1) = 8, c = (m*m + n*n) = (3*3 + 1*1) = 10
6, 8, 10 is our triplet for that value; this is how generating triplets can be visualized.
If the given number is odd, like 7, things are slightly different, because b = (2*m*n) will never be odd. So here we have to take
a = (m*m - n*n) = 7, so (m+n)*(m-n) = 7*1, which gives (m+n) = 7 and (m-n) = 1.
Now find m and n from here, then find the other two values.
If you don't understand it, read it again carefully.
Code according to this and it will generate distinct triplets efficiently.
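For reference, here is a short sketch of Euclid's formula as described above (my own code; it generates triples directly from every pair m > n rather than solving for a given side):
def euclid_triples(limit):
    # yields (a, b, c) with a < b for every pair m > n whose hypotenuse fits
    m = 2
    while m*m + 1 <= limit:
        for n in range(1, m):
            a, b, c = m*m - n*n, 2*m*n, m*m + n*n
            if c > limit:
                break
            yield (min(a, b), max(a, b), c)
        m += 1

print(list(euclid_triples(30)))
Choosing coprime m and n of opposite parity gives the primitive triples; multiples of primitives are not produced by this formula alone.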
A non-numpy version of the Hall/Roberts approach is
def pythag3(limit=None, all=False):
    """generate Pythagorean triples which are primitive (default)
    or without restriction (when ``all`` is True). The elements
    returned in the tuples are sorted with the smallest first.
    Examples
    ========
    >>> list(pythag3(20))
    [(3, 4, 5), (8, 15, 17), (5, 12, 13)]
    >>> list(pythag3(20, True))
    [(3, 4, 5), (6, 8, 10), (9, 12, 15), (12, 16, 20), (8, 15, 17), (5, 12, 13)]
    """
    if limit and limit < 5:
        return
    m = [(3, 4, 5)]  # primitives stored here
    while m:
        x, y, z = m.pop()
        if x > y:
            x, y = y, x
        yield (x, y, z)
        if all:
            a, b, c = x, y, z
            while 1:
                c += z
                if c > limit:
                    break
                a += x
                b += y
                yield a, b, c
        # new primitives
        a = x - 2*y + 2*z, 2*x - y + 2*z, 2*x - 2*y + 3*z
        b = x + 2*y + 2*z, 2*x + y + 2*z, 2*x + 2*y + 3*z
        c = -x + 2*y + 2*z, -2*x + y + 2*z, -2*x + 2*y + 3*z
        for d in (a, b, c):
            if d[2] <= limit:
                m.append(d)
It's slower than the numpy-coded version but the primitives with largest element less than or equal to 10^6 are generated on my slow machine in about 1.4 seconds. (And the list m never grew beyond 18 elements.)
In the C language:
#include <stdio.h>
int main()
{
    int n;
    printf("How many triplets needed : \n");
    scanf("%d", &n);
    for (int i = 1; i <= 2000; i++)
    {
        for (int j = i; j <= 2000; j++)
        {
            for (int k = j; k <= 2000; k++)
            {
                if ((j*j + i*i == k*k) && (n > 0))  // stop printing once n triplets have been shown
                {
                    printf("%d %d %d\n", i, j, k);
                    n = n - 1;
                }
            }
        }
    }
}
You can try this:
triplets = []
for a in range(1, 100):
    for b in range(1, 100):
        for c in range(1, 100):
            if a**2 + b**2 == c**2:
                i = [a, b, c]
                triplets.append(i)

for i in list(triplets):  # iterate over a copy so removals don't skip items
    i.sort()
    if triplets.count(i) > 1:
        triplets.remove(i)
print(triplets)
