So, for my Computer Graphics class I was tasked with writing a polygon filler; my software renderer is currently being coded in Python. Right now I want to test the point-in-polygon code I found at How can I determine whether a 2D Point is within a Polygon?, so I can later base my own method on it.
The code is:
int pnpoly(int nvert, float *vertx, float *verty, float testx, float testy)
{
    int i, j, c = 0;
    for (i = 0, j = nvert-1; i < nvert; j = i++) {
        if ( ((verty[i]>testy) != (verty[j]>testy)) &&
             (testx < (vertx[j]-vertx[i]) * (testy-verty[i]) / (verty[j]-verty[i]) + vertx[i]) )
            c = !c;
    }
    return c;
}
And my attempt to recreate it in Python is as follows:
def pointInPolygon(self, nvert, vertx, verty, testx, testy):
    c = 0
    j = nvert-1
    for i in range(nvert):
        if(((verty[i]>testy) != (verty[j]>testy)) and (testx < (vertx[j]-vertx[i]) * (testy-verty[i]) / (verty[j]-verty[i] + vertx[i]))):
            c = not c
        j += 1
    return c
But this will obviously raise an index-out-of-range error on the second iteration, because j will equal nvert, and the code crashes.
Thanks in advance.
You're reading the tricky C code incorrectly. The point of j = i++ is to both increment i by one and assign the old value to j. Similar Python code would assign j = i at the end of the loop:
j = nvert - 1
for i in range(nvert):
    ...
    j = i
The idea is that for nvert == 3, the values would go
j | i
---+---
2 | 0
0 | 1
1 | 2
Another way to achieve this is to compute j as (i - 1) % nvert:
for i in range(nvert):
    j = (i - 1) % nvert
    ...
i.e. j lags one behind i, and the indices form a ring (just as the vertices do).
More Pythonic code would use itertools and iterate over the coordinate pairs themselves. You'd have a list of (x, y) tuples called vertices and two iterators over it, one of which is one vertex ahead of the other, cycling back to the beginning thanks to itertools.cycle. Something like:
import itertools

# make one iterator that goes one ahead and wraps around at the end
next_ones = itertools.cycle(vertices)
next(next_ones)

c = False
for ((x1, y1), (x2, y2)) in zip(vertices, next_ones):
    if (((y1 > testy) != (y2 > testy))
            and (testx < (x2 - x1) * (testy - y1) / (y2 - y1) + x1)):
        c = not c
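For completeness, here is a minimal corrected sketch of your original method, keeping your signature. Note the parenthesization on the last term: the C code divides by (verty[j] - verty[i]) and only then adds vertx[i], whereas your transcription merged vertx[i] into the denominator, which would give wrong answers even once the indexing is fixed.

def pointInPolygon(self, nvert, vertx, verty, testx, testy):
    c = False
    j = nvert - 1
    for i in range(nvert):
        if ((verty[i] > testy) != (verty[j] > testy)) and \
           (testx < (vertx[j] - vertx[i]) * (testy - verty[i]) /
                    (verty[j] - verty[i]) + vertx[i]):
            c = not c
        j = i  # j lags one vertex behind i
    return c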
I have a very long math formula (just to put you in context: it is 293095 characters long) which in practice will be the body of a Python function. This function has 15 input parameters, as in:
def math_func(t,X,P,n1,n2,R,r):
    x,y,z = X
    a,b,c = P
    u1,v1,w1 = n1
    u2,v2,w2 = n2
    return <long math formula>
The formula uses simple math operations (+ - * ** /) and one function call to arctan. Here is an extract of it:
r*((-16*(r**6*t*u1**6 - 6*r**6*u1**5*u2 - 15*r**6*t*u1**4*u2**2 +
20*r**6*u1**3*u2**3 + 15*r**6*t*u1**2*u2**4 - 6*r**6*u1*u2**5 -
r**6*t*u2**6 + 3*r**6*t*u1**4*v1**2 - 12*r**6*u1**3*u2*v1**2 -
18*r**6*t*u1**2*u2**2*v1**2 + 12*r**6*u1*u2**3*v1**2 +
3*r**6*t*u2**4*v1**2 + 3*r**6*t*u1**2*v1**4 - 6*r**6*u1*u2*v1**4 -
3*r**6*t*u2**2*v1**4 + r**6*t*v1**6 - 6*r**6*u1**4*v1*v2 -
24*r**6*t*u1**3*u2*v1*v2 + 36*r**6*u1**2*u2**2*v1*v2 +
24*r**6*t*u1*u2**3*v1*v2 - 6*r**6*u2**4*v1*v2 -
12*r**6*u1**2*v1**3*v2 - 24*r**6*t*u1*u2*v1**3*v2 +
12*r**6*u2**2*v1**3*v2 - 6*r**6*v1**5*v2 - 3*r**6*t*u1**4*v2**2 + ...
Now the point is that in practice the bulk evaluation of this function will be done for fixed values of P, n1, n2, R and r, which reduces the set of free variables to only four, and "in theory" the formula with fewer parameters should be faster.
So the question is: How can I implement this optimization in Python?
I know I can put everything in a string and do some sort of replace, compile and eval, as in:
formula = formula.replace('r','1').replace('R','2')....
code = compile(formula,'formula-name','eval')
math_func = lambda t,x,y,z: eval(code)
It would be good if some operations (like powers) were substituted by their value; for example, 18*r**6*t*u1**2*u2**2*v1**2 should become 18*t for r=u1=u2=v1=1. I think compile should do this, but in any case I'm not sure. Does compile actually perform this optimization?
My solution speeds up the computation, but if I can squeeze more out of it, all the better. Note: preferably within standard Python (I could try Cython later).
In general I'm interested in a Pythonic way to accomplish my goal, maybe with some extra libraries: what is a reasonably good way of doing this? Is my solution a good approach?
EDIT: (To give more context)
The huge expression is the output of a symbolic line integral over an arc of a circle. The arc is given in space by the radius r, two orthonormal vectors (like the x and y axes in a 2D version) n1=(u1,v1,w1), n2=(u2,v2,w2), and the center P=(a,b,c). The rest is the point X=(x,y,z) over which I'm performing the integration, and a parameter R for the function I'm integrating.
Sympy and Maple just take ages to compute this; the actual output is from Mathematica.
If you are curious about the formula here it is (pseudo-pseudo-code):
G(u) = P + r*(1-u**2)/(1+u**2)*n1 + r*2*u/(1+u**2)*n2
integral of (1-|X-G(t)|^2/R^2)^3 over t
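For reference, a direct Python/Sympy transcription of that pseudo-code would be something like the sketch below (this is the symbolic integral that reportedly takes ages, so the integrate call is left commented out):

import sympy as sp

t = sp.symbols('t', real=True)
r, R = sp.symbols('r R', positive=True)
X = sp.Matrix(sp.symbols('x y z'))
P = sp.Matrix(sp.symbols('a b c'))
n1 = sp.Matrix(sp.symbols('u1 v1 w1'))
n2 = sp.Matrix(sp.symbols('u2 v2 w2'))

# G(u) = P + r*(1-u**2)/(1+u**2)*n1 + r*2*u/(1+u**2)*n2
G = P + r*(1 - t**2)/(1 + t**2)*n1 + r*2*t/(1 + t**2)*n2
d2 = (X - G).dot(X - G)            # |X - G(t)|^2
integrand = (1 - d2/R**2)**3
# sp.integrate(integrand, t)       # symbolic; reportedly very slow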
You could use Sympy:
>>> from sympy import symbols
>>> x,y,z,a,b,c,u1,v1,w1,u2,v2,w2,t,r = symbols("x,y,z,a,b,c,u1,v1,w1,u2,v2,w2,t,r")
>>> r=u1=u2=v1=1
>>> a = 18*r**6*t*u1**2*u2**2*v1**2
>>> a
18*t
Then you can create a Python function like this:
>>> from sympy import lambdify
>>> f = lambdify(t, a)
>>> f(1)
18
And that f function is indeed simply 18*t:
>>> import dis
>>> dis.dis(f)
1 0 LOAD_CONST 1 (18)
3 LOAD_FAST 0 (_Dummy_18)
6 BINARY_MULTIPLY
7 RETURN_VALUE
If you want to compile the resulting code into machine code, you can try a JIT compiler such as Numba, Theano, or Parakeet.
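For the full formula, the same idea would look roughly like this sketch (untested on the real 293095-character string; the fixed values below are hypothetical placeholders, and this assumes the Mathematica output has been rewritten with Python operators and atan):

from sympy import symbols, sympify, lambdify

t, x, y, z, a, b, c, u1, v1, w1, u2, v2, w2, r, R = symbols(
    "t x y z a b c u1 v1 w1 u2 v2 w2 r R")

expr = sympify(formula)   # `formula` is the big expression string
fixed = {a: 0.0, b: 0.0, c: 0.0, r: 1.0, R: 2.0,
         u1: 1.0, v1: 0.0, w1: 0.0, u2: 0.0, v2: 1.0, w2: 0.0}
# substitute the fixed parameters, then compile over the free ones
math_func = lambdify((t, x, y, z), expr.subs(fixed), modules="math")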
Here's how I would approach this problem:
compile() your function to an AST (Abstract Syntax Tree) instead of a normal bytecode function - see the standard ast module for details.
Traverse the AST, replacing all references to the fixed parameters with their fixed value. There are libraries such as macropy that may be useful for this, I don't have any specific recommendation.
Traverse the AST again, performing whatever optimizations this might enable, such as Mult(1, X) => X. You don't have to worry about operations between two constants, as Python (since 2.6) optimizes that already.
compile() the AST into a normal function. Call it, and hope that the speed was increased by a sufficient amount to justify all the pre-optimization.
Note that Python will never optimize things like 1*X on its own, as it cannot know what type X will be at runtime - it could be an instance of a class that implements the multiplication operation in an arbitrary way, so the result is not necessarily X. Only your knowledge that all the variables are ordinary numbers, obeying the usual rules of arithmetic, makes this optimization valid.
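A minimal sketch of those steps (Python 3.8+ for ast.Constant; the fixed values are hypothetical, and only constant folding and multiplication-by-one are handled):

import ast
import operator

FIXED = {"r": 1.0, "u1": 1.0, "u2": 1.0, "v1": 1.0}   # hypothetical values

OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv,
       ast.Pow: operator.pow}

class Substitute(ast.NodeTransformer):
    """Step 2: replace references to fixed parameters with their values."""
    def visit_Name(self, node):
        if node.id in FIXED:
            return ast.copy_location(ast.Constant(FIXED[node.id]), node)
        return node

class Fold(ast.NodeTransformer):
    """Step 3: fold constant subtrees and identities like Mult(1, X) => X."""
    def visit_BinOp(self, node):
        self.generic_visit(node)          # fold children first (bottom-up)
        left, right = node.left, node.right
        if isinstance(left, ast.Constant) and isinstance(right, ast.Constant):
            value = OPS[type(node.op)](left.value, right.value)
            return ast.copy_location(ast.Constant(value), node)
        if isinstance(node.op, ast.Mult):
            if isinstance(left, ast.Constant) and left.value == 1:
                return right
            if isinstance(right, ast.Constant) and right.value == 1:
                return left
        return node

tree = ast.parse("18*r**6*t*u1**2*u2**2*v1**2", mode="eval")
tree = Fold().visit(Substitute().visit(tree))
code = compile(ast.fix_missing_locations(tree), "<formula>", "eval")
math_func = lambda t: eval(code)   # same eval trick as in the question
print(math_func(2.0))              # 36.0, i.e. the folded 18*t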
The "right way" to solve a problem like this is one or more of:
Find a more efficient formulation
Symbolically simplify and reduce terms
Use vectorization (e.g. NumPy)
Punt to low-level libraries that are already optimized (e.g. in languages like C or Fortran that implicitly do strong expression optimization, rather than Python, which does nada).
Let's say for a moment, though, that approaches 1, 3, and 4 are not available, and you have to do this in Python. Then simplifying and "hoisting" common subexpressions is your primary tool.
The good news is, there are a lot of opportunities. The expression r**6, for example, is repeated 26 times. You could save 25 computations by simply assigning r_6 = r ** 6 once, then replacing r**6 every time it occurs.
When you start looking for common expressions here, you'll find them everywhere. It'd be nice to mechanize that process, right? In general, that requires a full expression parser (e.g. from the ast module) and is an exponential-time optimization problem. But your expression is a bit of a special case. While long and varied, it's not especially complicated. It has few internal parenthetical groupings, so we can get away with a quicker and dirtier approach.
Before the how, the resulting code is:
sa = r**6 # 26 occurrences
sb = u1**2 # 5 occurrences
sc = u2**2 # 5 occurrences
sd = v1**2 # 5 occurrences
se = u1**4 # 4 occurrences
sf = u2**3 # 3 occurrences
sg = u1**3 # 3 occurrences
sh = v1**4 # 3 occurrences
si = u2**4 # 3 occurrences
sj = v1**3 # 3 occurrences
sk = v2**2 # 1 occurrence
sl = v1**6 # 1 occurrence
sm = v1**5 # 1 occurrence
sn = u1**6 # 1 occurrence
so = u1**5 # 1 occurrence
sp = u2**6 # 1 occurrence
sq = u2**5 # 1 occurrence
sr = 6*sa # 6 occurrences
ss = 3*sa # 5 occurrences
st = ss*t # 5 occurrences
su = 12*sa # 4 occurrences
sv = sa*t # 3 occurrences
sw = v1*v2 # 5 occurrences
sx = sj*v2 # 3 occurrences
sy = 24*sv # 3 occurrences
sz = 15*sv # 2 occurrences
sA = sr*u1 # 2 occurrences
sB = sy*u1 # 2 occurrences
sC = sb*sc # 2 occurrences
sD = st*se # 2 occurrences
# revised formula
sv*sn - sr*so*u2 - sz*se*sc +
20*sa*sg*sf + sz*sb*si - sA*sq -
sv*sp + sD*sd - su*sg*u2*sd -
18*sv*sC*sd + su*u1*sf*sd +
st*si*sd + st*sb*sh - sA*u2*sh -
st*sc*sh + sv*sl - sr*se*sw -
sy*sg*u2*sw + 36*sa*sC*sw +
sB*sf*sw - sr*si*sw -
su*sb*sx - sB*u2*sx +
su*sc*sx - sr*sm*v2 - sD*sk
That avoids 81 computations. It's just a rough cut. Even the result could be further improved. The subexpressions sr*sw and su*sd for example, could be pre-computed as well. But we'll leave that next level for another day.
Note that this doesn't include the starting r*((-16*(. The majority of the simplification can be (and needs to be) done on the core of the expression, not on its outer terms. So I stripped those away for now; they can be added back once the common core is computed.
How do you do this?
f = """
r**6*t*u1**6 - 6*r**6*u1**5*u2 - 15*r**6*t*u1**4*u2**2 +
20*r**6*u1**3*u2**3 + 15*r**6*t*u1**2*u2**4 - 6*r**6*u1*u2**5 -
r**6*t*u2**6 + 3*r**6*t*u1**4*v1**2 - 12*r**6*u1**3*u2*v1**2 -
18*r**6*t*u1**2*u2**2*v1**2 + 12*r**6*u1*u2**3*v1**2 +
3*r**6*t*u2**4*v1**2 + 3*r**6*t*u1**2*v1**4 - 6*r**6*u1*u2*v1**4 -
3*r**6*t*u2**2*v1**4 + r**6*t*v1**6 - 6*r**6*u1**4*v1*v2 -
24*r**6*t*u1**3*u2*v1*v2 + 36*r**6*u1**2*u2**2*v1*v2 +
24*r**6*t*u1*u2**3*v1*v2 - 6*r**6*u2**4*v1*v2 -
12*r**6*u1**2*v1**3*v2 - 24*r**6*t*u1*u2*v1**3*v2 +
12*r**6*u2**2*v1**3*v2 - 6*r**6*v1**5*v2 - 3*r**6*t*u1**4*v2**2
""".strip()
from collections import Counter
import re
expre = re.compile(r'(?<!\w)\w+\*\*\d+')
multre = re.compile(r'(?<!\w)\w+\*\w+')

expr_saved = 0
stmts = []
secache = {}
seindex = 0

def subexpr(e):
    """Return (creating if needed) the short name for sub-expression e."""
    global seindex
    cached = secache.get(e)
    if cached:
        return cached
    base = ord('a') if seindex < 26 else ord('A') - 26
    name = 's' + chr(seindex + base)
    seindex += 1
    secache[e] = name
    return name

def hoist(e, flat, c):
    """
    Hoist the expression e into the name defined by flat.
    c is the count of how many times it was seen in the
    incoming formula.
    """
    global expr_saved
    assign = "{} = {}".format(flat, e)
    s = "{:30} # {} occurrence{}".format(assign, c, '' if c == 1 else 's')
    stmts.append(s)
    print("{} needless computations quashed with {}".format(c - 1, flat))
    expr_saved += c - 1

def common_exp(form):
    """
    Replace ALL exponentiation operations with a hoisted
    sub-expression.
    """
    # find and count the exponentiation operations
    expcount = Counter(re.findall(expre, form))
    # for each exponentiation, create a hoisted sub-expression
    for e, c in expcount.most_common():
        hoist(e, subexpr(e), c)
    # replace all exponentiation operations with their sub-expressions
    form = re.sub(expre, lambda x: subexpr(x.group(0)), form)
    return form

def common_mult(f):
    """
    Replace multiplication operations with a hoisted
    sub-expression if they occur > 1 time. Also, only
    replaces one sub-expression at a time (the most common)
    because it may affect further expressions.
    """
    mults = re.findall(multre, f)
    for e, c in Counter(mults).most_common():
        # unlike exponents, only replace if >1 occurrence
        if c == 1:
            return f
        # occurs >1 time, so hoist
        hoist(e, subexpr(e), c)
        # replace and return
        return re.sub(r'(?<!\w)' + re.escape(e), subexpr(e), f)
    return f

# fix all exponents
form = common_exp(f)

# fix selected multiplies
prev = form
while True:
    form = common_mult(form)
    if form == prev:
        # converged; no more replacements possible
        break
    prev = form

print("--")

# diagnostic: normalize the factor order within each additive term
mults = re.split(r'\s*[+-]\s*', form)
smults = ['*'.join(sorted(terms.split('*'))) for terms in mults]
print(smults)

# print the hoisted statements and the revised expression
print('\n'.join(stmts))
print()
print("# revised formula")
print(form)
Parsing with regular expressions is dicey business. That journey is prone to error, sorrow, and regret. I guarded against bad outcomes by hoisting some exponentiations that didn't strictly need to be, and by plugging random values into both the before and after formulas to make sure they both give the same results. I recommend the "punt to C" strategy if this is production code. But if you can't...
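That value-check can itself be mechanized. A minimal sketch, reusing f, stmts, and form from the script above (the parentheses around the expressions are needed because both strings contain raw newlines):

import random

for trial in range(100):
    env = {name: random.uniform(0.5, 2.0)
           for name in ('t', 'r', 'u1', 'u2', 'v1', 'v2')}
    exec('\n'.join(stmts), env)              # define sa, sb, ... in env
    before = eval('(%s)' % f, env)           # original core expression
    after = eval('(%s)' % form, env)         # hoisted version
    assert abs(before - after) <= 1e-9 * max(1.0, abs(before))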
I am having trouble translating an old IDL script to Python; my issue lies in understanding exactly how to interpret IDL's WHERE function.
Here is my code:
FUNCTION noise,day,y
N = N_ELEMENTS(y)
valid = WHERE(ABS(day[0:N-3]-day[2:N-1]) LT 20,cc)
IF cc LT 2 THEN RETURN,[-9.99,-9.99,-9.99,-9.99]
y_int = (y[0:N-3] * (day[2:N-1] - day[1:N-2]) + y[2:N-1] * (day[1:N-2] - day[0:N-3]))/ (day[2:N-1] - day[0:N-3])
dif = ABS(y_int - y[1:N-2])
difR = ABS(y_int/y[1:N-2] - 1.)
dif = dif [valid]
difR= difR[valid]
; Remove 5% of higher values
Nv = LONG(cc*0.95)
s = SORT(dif) & s = s[0:Nv-1]
noise5 = SQRT(TOTAL(dif[s]^2)/(Nv-1)) ; Absolu Noise minus 5% of higher values
noise = SQRT(TOTAL(dif^2)/(cc-1)) ; Absolu Noise
s = SORT(difR) & s = s[0:Nv-1]
noiseR5 = SQRT(TOTAL(difR[s]^2)/(Nv-1)) ; Relative Noise minus 5% of higher values
noiseR = SQRT(TOTAL(difR^2)/(cc-1)) ; Relative Noise
RETURN,[noise5,noiseR5*100.,noise,noiseR*100.]
END
Can anyone help me understand the Python equivalent? Thanks.
I would translate:
valid = WHERE(ABS(day[0:N-3]-day[2:N-1]) LT 20,cc)
as:
valid = (numpy.abs(day[0:-2] - day[2:]) < 20).nonzero()
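More broadly, IDL's vector operations map to NumPy almost one-for-one. Here is an unchecked sketch of the whole function, assuming day and y are equal-length 1-D arrays: WHERE's second output cc (the match count) becomes len(valid), IDL's inclusive slice day[0:N-3] becomes day[:-2], and since IDL's SORT returns indices, SORT-then-index collapses to np.sort.

import numpy as np

def noise(day, y):
    valid = np.nonzero(np.abs(day[:-2] - day[2:]) < 20)[0]
    cc = len(valid)
    if cc < 2:
        return [-9.99, -9.99, -9.99, -9.99]
    y_int = (y[:-2] * (day[2:] - day[1:-1]) +
             y[2:] * (day[1:-1] - day[:-2])) / (day[2:] - day[:-2])
    dif = np.abs(y_int - y[1:-1])[valid]
    difR = np.abs(y_int / y[1:-1] - 1.0)[valid]
    # remove the highest 5% of values
    Nv = int(cc * 0.95)
    dif_trim = np.sort(dif)[:Nv]
    difR_trim = np.sort(difR)[:Nv]
    noise5 = np.sqrt(np.sum(dif_trim ** 2) / (Nv - 1))    # absolute, trimmed
    noise = np.sqrt(np.sum(dif ** 2) / (cc - 1))          # absolute
    noiseR5 = np.sqrt(np.sum(difR_trim ** 2) / (Nv - 1))  # relative, trimmed
    noiseR = np.sqrt(np.sum(difR ** 2) / (cc - 1))        # relative
    return [noise5, noiseR5 * 100.0, noise, noiseR * 100.0]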
How do I implement Levenshtein distance on records in a database table using Python? I know how to connect Python to the database, the coding itself shouldn't be a problem, and I already have the records in a table. I also understand the theory and the dynamic programming behind Levenshtein distance. The problem is: after connecting to the table, how do I write the code so that it compares two records (each with up to three fields) and outputs their similarity score? Below is a snippet of my database table:
Record 1:
Author : Michael I James
Title : Advancement in networking
Journal: ACM
Record 2:
Author: Michael J Inse
Title: Advancement in networking
Journal: ACM
Any ideas are welcome. I'm a newbie in this area, so please explain with a little detail.
Thanks.
My understanding of your problem is that you need to identify very similar records that are potentially duplicates.
I would solve this in the database itself, with no need for programming. If you don't have a Levenshtein function available in your DB, you may want to create a user-defined function.
Here is an example for MySQL:
CREATE FUNCTION `levenshtein`(s1 VARCHAR(255), s2 VARCHAR(255)) RETURNS int(11) DETERMINISTIC
BEGIN
    DECLARE s1_len, s2_len, i, j, c, c_temp, cost INT;
    DECLARE s1_char CHAR;
    DECLARE cv0, cv1 VARBINARY(256);
    SET s1_len = CHAR_LENGTH(s1), s2_len = CHAR_LENGTH(s2), cv1 = 0x00, j = 1, i = 1, c = 0;
    IF s1 = s2 THEN
        RETURN 0;
    ELSEIF s1_len = 0 THEN
        RETURN s2_len;
    ELSEIF s2_len = 0 THEN
        RETURN s1_len;
    ELSE
        WHILE j <= s2_len DO
            SET cv1 = CONCAT(cv1, UNHEX(HEX(j))), j = j + 1;
        END WHILE;
        WHILE i <= s1_len DO
            SET s1_char = SUBSTRING(s1, i, 1), c = i, cv0 = UNHEX(HEX(i)), j = 1;
            WHILE j <= s2_len DO
                SET c = c + 1;
                IF s1_char = SUBSTRING(s2, j, 1) THEN
                    SET cost = 0;
                ELSE
                    SET cost = 1;
                END IF;
                SET c_temp = CONV(HEX(SUBSTRING(cv1, j, 1)), 16, 10) + cost;
                IF c > c_temp THEN
                    SET c = c_temp;
                END IF;
                SET c_temp = CONV(HEX(SUBSTRING(cv1, j+1, 1)), 16, 10) + 1;
                IF c > c_temp THEN
                    SET c = c_temp;
                END IF;
                SET cv0 = CONCAT(cv0, UNHEX(HEX(c))), j = j + 1;
            END WHILE;
            SET cv1 = cv0, i = i + 1;
        END WHILE;
    END IF;
    RETURN c;
END
Then you need to compare all of your records to each other. This requires a full self-join, which can of course be a bit heavy. If it is too heavy, you will need to go the Python way, which lets you avoid repetition (comparing the same pair of records more than once); see the sketch after the query below.
Here is what I would do. Note that I would rather use an ID column for easier identification:
SELECT a.ID AS IDa,
       b.ID AS IDb,
       a.Author AS AuthorA,
       b.Author AS AuthorB,
       ap.levenshtein(a.Author, b.Author) AS Lev_Aut,
       a.Title AS TitleA, b.Title AS TitleB, ap.levenshtein(a.Title, b.Title) AS Lev_Title,
       a.Journal AS JournalA, b.Journal AS JournalB, ap.levenshtein(a.Journal, b.Journal) AS Lev_Journal,
       ap.levenshtein(a.Author, b.Author) + ap.levenshtein(a.Title, b.Title) + ap.levenshtein(a.Journal, b.Journal) AS Composite
FROM test.zzz AS a, test.zzz AS b
WHERE a.ID != b.ID
ORDER BY Composite;
This returns a list of Levenshtein values ordered from the best match to the worst (by the Composite column). The WHERE condition prevents a record from being compared to itself.
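For the Python way, a rough sketch: the biblio.db file and records table below are hypothetical, and difflib's ratio stands in for a true Levenshtein score (a real Levenshtein implementation, e.g. the python-Levenshtein package, could be dropped in instead). itertools.combinations guarantees each pair is compared only once.

import sqlite3
from difflib import SequenceMatcher
from itertools import combinations

def similarity(a, b):
    # 1.0 for identical strings, 0.0 for completely different ones
    return SequenceMatcher(None, a, b).ratio()

conn = sqlite3.connect("biblio.db")
rows = conn.execute("SELECT ID, Author, Title, Journal FROM records").fetchall()

scored = []
for rec_a, rec_b in combinations(rows, 2):
    # composite score over the three fields, mirroring the SQL above
    score = sum(similarity(fa, fb) for fa, fb in zip(rec_a[1:], rec_b[1:]))
    scored.append((score, rec_a[0], rec_b[0]))

for score, id_a, id_b in sorted(scored, reverse=True):   # best matches first
    print(id_a, id_b, round(score, 3))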
I've written this very badly optimized C code that does a simple math calculation:
#include <stdio.h>
#include <math.h>
#include <stdlib.h>

#define MIN(a, b) (((a) < (b)) ? (a) : (b))
#define MAX(a, b) (((a) > (b)) ? (a) : (b))

unsigned long long int p(int);
float fullCheck(int);

int main(int argc, char **argv){
    int i, g, maxNumber;
    unsigned long long int diff = 1000;
    if(argc < 2){
        fprintf(stderr, "Usage: %s maxNumber\n", argv[0]);
        return 0;
    }
    maxNumber = atoi(argv[1]);
    for(i = 1; i < maxNumber; i++){
        for(g = 1; g < maxNumber; g++){
            if(i == g)
                continue;
            if(p(MAX(i,g)) - p(MIN(i,g)) < diff && fullCheck(p(MAX(i,g)) - p(MIN(i,g))) && fullCheck(p(i) + p(g))){
                diff = p(MAX(i,g)) - p(MIN(i,g));
                printf("We have a couple %llu %llu with diff %llu\n", p(i), p(g), diff);
            }
        }
    }
    return 0;
}

float fullCheck(int number){
    float check = (-1 + sqrt(1 + 24 * number))/-6;
    float check2 = (-1 - sqrt(1 + 24 * number))/-6;
    if(check/1.00 == (int)check)
        return check;
    if(check2/1.00 == (int)check2)
        return check2;
    return 0;
}

unsigned long long int p(int n){
    return n * (3 * n - 1 ) / 2;
}
And then I tried (just for fun) to port it to Python to see how it would fare. My first version was an almost 1:1 conversion that ran terribly slowly (120+ seconds in Python vs. <1 second in C).
I've done a bit of optimization, and this is what I obtained:
#!/usr/bin/env python
from cmath import sqrt
import cProfile
from pstats import Stats

def quickCheck(n):
    partial_c = (sqrt(1 + 24 * (n)))/-6
    c = 1/6 + partial_c
    if int(c.real) == c.real:
        return True
    c = c - 2*partial_c
    if int(c.real) == c.real:
        return True
    return False

def main():
    maxNumber = 5000
    diff = 1000
    for i in range(1, maxNumber):
        p_i = i * (3 * i - 1 ) / 2
        for g in range(i, maxNumber):
            if i == g:
                continue
            p_g = g * (3 * g - 1 ) / 2
            if p_i > p_g:
                ma = p_i
                mi = p_g
            else:
                ma = p_g
                mi = p_i
            if ma - mi < diff and quickCheck(ma - mi):
                if quickCheck(ma + mi):
                    print ('New couple ', ma, mi)
                    diff = ma - mi

cProfile.run('main()','script_perf')
perf = Stats('script_perf').sort_stats('time', 'calls').print_stats(10)
This runs in about 16 seconds, which is better, but still almost 20 times slower than C.
Now, I know C is better than Python for this kind of calculation, but what I would like to know is whether there is something I've missed (Python-wise, like a horribly slow function or some such) that could have made this version faster.
Please note that I'm using Python 3.1.1, if that makes a difference.
Since quickCheck is being called close to 25,000,000 times, you might want to use memoization to cache the answers.
You can do memoization in C as well as Python. Things will be much faster in C, also.
You're computing 1/6 in each iteration of quickCheck. I'm not sure if this will be optimized out by Python, but if you can avoid recomputing constant values, you'll find things are faster. C compilers do this for you.
Doing things like if condition: return True; else: return False is silly -- and time consuming. Simply do return condition.
In Python 3.x, /2 must create floating-point values. You appear to need integers for this. You should be using //2 division. It will be closer to the C version in terms of what it does, but I don't think it's significantly faster.
Finally, Python is generally interpreted. The interpreter will always be significantly slower than C.
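A minimal sketch of the memoization suggestion: functools.lru_cache arrived in Python 3.2, slightly after the asker's 3.1, but a plain dict cache works the same way. Note this version only tests the positive root, unlike the original quickCheck, which tests both.

from functools import lru_cache

@lru_cache(maxsize=None)
def quickCheck(n):
    # n is of the form k*(3*k - 1)/2 iff the positive root below is integral
    k = (1 + (1 + 24 * n) ** 0.5) / 6
    return k == int(k)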
I made it go from ~7 seconds to ~3 seconds on my machine:
Precomputed i * (3 * i - 1) / 2 for each value; in the original it was recomputed many times over
Cached calls to quickCheck
Removed if i == g by starting the inner range at i + 1
Removed if p_i > p_g, since p_i is always smaller than p_g
Also put the quickCheck function inside main, to make all variables local (locals have faster lookup than globals).
I'm sure there are more micro-optimizations available.
from cmath import sqrt

def main():
    maxNumber = 5000
    diff = 1000
    p = {}
    quickCache = {}
    for i in range(maxNumber):
        p[i] = i * (3 * i - 1) / 2

    def quickCheck(n):
        if n in quickCache:
            return quickCache[n]
        partial_c = sqrt(1 + 24 * n) / -6
        c = 1/6 + partial_c
        if int(c.real) == c.real:
            quickCache[n] = True
            return True
        c = c - 2 * partial_c
        if int(c.real) == c.real:
            quickCache[n] = True
            return True
        quickCache[n] = False
        return False

    for i in range(1, maxNumber):
        mi = p[i]
        for g in range(i + 1, maxNumber):
            ma = p[g]
            if ma - mi < diff and quickCheck(ma - mi) and quickCheck(ma + mi):
                print('New couple ', ma, mi)
                diff = ma - mi
Because the function p() is monotonically increasing, you can avoid comparing the values: g > i implies p(g) > p(i). Also, the inner loop can be exited early, because p(g) - p(i) >= diff implies p(g+1) - p(i) >= diff.
Also, for correctness, I changed the equality comparison in quickCheck to compare the difference against an epsilon, because exact comparison with floating point is pretty fragile.
On my machine this reduced the runtime to 7.8ms using Python 2.6. Using PyPy with JIT reduced this to 0.77ms.
This shows that before turning to micro-optimization it pays to look for algorithmic optimizations. Micro-optimizations make spotting algorithmic changes much harder for relatively tiny gains.
from math import sqrt

EPS = 0.00000001

def quickCheck(n):
    partial_c = sqrt(1 + 24*n) / -6
    c = 1/6 + partial_c
    if abs(int(c) - c) < EPS:
        return True
    c = 1/6 - partial_c
    if abs(int(c) - c) < EPS:
        return True
    return False

def p(i):
    return i * (3 * i - 1) / 2

def main(maxNumber):
    diff = 1000
    for i in range(1, maxNumber):
        for g in range(i+1, maxNumber):
            if p(g) - p(i) >= diff:
                break
            if quickCheck(p(g) - p(i)) and quickCheck(p(g) + p(i)):
                print('New couple ', p(g), p(i), p(g) - p(i))
                diff = p(g) - p(i)
There are some Python compilers that might actually do a good bit for you. Have a look at Psyco.
Another way of dealing with math-intensive programs is to rewrite the majority of the work into a math kernel, such as NumPy, so that heavily optimized code does the work and your Python code only guides the calculation. To get the most out of this strategy, avoid doing calculations in loops; instead let the math kernel do all of that, as in the sketch below.
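An unchecked sketch of that strategy applied to this problem; note that broadcasting builds O(maxNumber²) temporaries, so real code would process the pairs in chunks:

import numpy as np

maxNumber = 5000
n = np.arange(1, maxNumber, dtype=np.int64)
p = n * (3 * n - 1) // 2                        # all values at once

def is_p_number(x):
    # x == k*(3*k - 1)/2 iff k = (1 + sqrt(1 + 24*x)) / 6 is a positive
    # integer; round, then verify exactly in integer arithmetic
    k = np.round((1 + np.sqrt(1 + 24 * x.astype(np.float64))) / 6).astype(np.int64)
    return (k > 0) & (k * (3 * k - 1) // 2 == x)

sums = p[:, None] + p[None, :]                  # pairwise sums
diffs = np.abs(p[:, None] - p[None, :])         # pairwise differences
mask = np.triu(is_p_number(sums) & is_p_number(diffs), k=1)
i, j = np.nonzero(mask)                         # indices of qualifying pairs
best = np.argmin(diffs[i, j])
print(p[i[best]], p[j[best]], diffs[i[best], j[best]])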
The other respondents have already mentioned several optimizations that will help. However, ultimately, you're not going to be able to match the performance of C in Python. Python is a nice tool, but since it's interpreted, it isn't really suited for heavy number crunching or other apps where performance is key.
Also, even in your C version, your inner loop could use quite a bit of help. Updated version:
for(i = 1; i < maxNumber; i++){
    for(g = 1; g < maxNumber; g++){
        if(i == g)
            continue;
        max = i;
        min = g;
        if (max < min) {
            // xor swap - could use swap(max, min) instead.
            max = max ^ min;
            min = max ^ min;
            max = max ^ min;
        }
        p_max = P(max);
        p_min = P(min);
        p_i = P(i);
        p_g = P(g);
        if(p_max - p_min < diff && fullCheck(p_max - p_min) && fullCheck(p_i + p_g)){
            diff = p_max - p_min;
            printf("We have a couple %llu %llu with diff %llu\n", p_i, p_g, diff);
        }
    }
}

///////////////////////////

float fullCheck(int number){
    float den = sqrt(1 + 24 * number) / 6.0;
    float check = 1/6.0 - den;
    float check2 = 1/6.0 + den;
    if(check == (int)check)
        return check;
    if(check2 == (int)check2)
        return check2;
    return 0.0;
}
Division, function calls, etc. are costly. Also, calculating values once and storing them in variables, as I've done here, can make things a lot more readable.
You might consider declaring P() as inline or rewrite as a preprocessor macro. Depending on how good your optimizer is, you might want to perform some of the arithmetic yourself and simplify its implementation.
Your implementation of fullCheck() would return what appear to be invalid results, since 1/6 == 0 in C, whereas 1/6.0 returns the 0.166... you would expect.
This is a very brief take on what you can do to your C code to improve performance. This will, no doubt, widen the gap between C and Python performance.
A 20x difference between Python and C for a number-crunching task seems quite good to me.
Check the usual performance differences for some CPU intensive tasks (keep in mind that the scale is logarithmic).
But look on the bright side, what's 1 minute of CPU time compared with the brain and typing time you saved writing Python instead of C? :-)