Wrong calculation when using Cython [duplicate]

This question already has answers here:
C left shift on 64 bits fail
(2 answers)
Closed 2 years ago.
I implemented the Lucas-Lehmer primality test to check Mersenne primes in Python. Then I used Cython to speed up the calculation.
Original Python code:
def lucas_lehmer(p):
    if p == 2:
        return True
    s = 4
    M = (1 << p) - 1
    for i in range(p-2):
        s = ((s * s) - 2) % M
        print("Processed: {}%".format(100*i//(p-2)))
    if s == 0:
        return True
    else:
        return False
Cython code:
cpdef lucas_lehmer(int p):
    if p == 2:
        return True
    cdef unsigned long long int M
    M = (1 << p) - 1
    cdef unsigned long long int s
    s = 4
    cdef int i
    for i in range(p-2):
        s = ((s * s) - 2) % M
        print("Processed: {}%".format(100*i//(p-2)))
    if s == 0:
        return True
    else:
        return False
The original Python code works correctly. The Cython version, however, is only correct for p = 31 and lower. Testing with p = 61 and bigger (all tested values of p are ones for which 2^p-1 is prime), it returns False (not a prime number), except for p = 86243.
For some p such as 97, even though 2^97-1 is not a prime number, the program actually returns True (is a prime number), which is a contradiction.
Why does this happen? Without using cdef for the variables M and s, the calculation is correct, but the performance does not improve.

Running a few tests on your code, I found that M was always equal to 1, so I declared p with cdef and got the required result.
I'm not sure exactly what the issue is, but it has something to do with the bit shift on p. p seems to need the same type as M for the shift to make sense; when one is a cdef variable and the other is a Python int, it somehow doesn't work.
cpdef lucas_lehmer(int py):
    cdef p
    p = py
    if p == 2:
        return True
    cdef M
    M = (1 << p) - 1
    cdef s
    s = 4
    cdef int i
    for i in range(p-2):
        s = ((s * s) - 2) % M
        print("Processed: {}%".format(100*i//(p-2)))
    if s == 0:
        return True
    else:
        return False
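One way to see where the typed version goes wrong is to emulate fixed-width arithmetic in plain Python. The sketch below is only an illustration, not what Cython literally generates: it assumes unsigned long long is 64 bits wide and models overflow by masking every intermediate result to 64 bits (the names lucas_lehmer_exact, lucas_lehmer_u64 and MASK64 are made up for this example). As long as s * s fits in 64 bits (p up to about 32) both versions agree; beyond that the square wraps around and the result can no longer be trusted. Separately, the literal 1 in 1 << p is most likely treated as a plain C int in the typed code, and shifting a C int by its full width or more is undefined behaviour in C, which is what the linked duplicate is about.

```python
MASK64 = (1 << 64) - 1  # assumption: unsigned long long is 64 bits wide


def lucas_lehmer_exact(p):
    """Reference version using Python's arbitrary-precision integers."""
    if p == 2:
        return True
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0


def lucas_lehmer_u64(p):
    """Same loop, but every intermediate value is truncated to 64 bits."""
    if p == 2:
        return True
    m = ((1 << p) - 1) & MASK64
    s = 4
    for _ in range(p - 2):
        square = (s * s) & MASK64        # wraps once s * s >= 2**64
        s = ((square - 2) & MASK64) % m  # subtraction wraps like unsigned C
    return s == 0


if __name__ == "__main__":
    # Compare both versions for a few exponents whose Mersenne number is prime.
    for p in (13, 17, 19, 31, 61):
        print(p, lucas_lehmer_exact(p), lucas_lehmer_u64(p))
```

Keeping M and s as Python objects, as in the answer's code, sidesteps the overflow at the cost of the speed-up that C integers would have given.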

Related

Cython problem: cimport libcpp.vector not compiled

I'm trying to use Cython to speed up my code. Since I'm working with an array of strings, I want to use string and vector from C++. But compilation fails when I cimport the C++ libraries. As a test, I tried to build an example from here: https://cython.readthedocs.io/en/latest/src/tutorial/cython_tutorial.html.
So, my code is
from libcpp.vector cimport vector
def primes(unsigned int nb_primes):
    cdef int n, i
    cdef vector[int] p
    p.reserve(nb_primes)  # allocate memory for 'nb_primes' elements.
    n = 2
    while p.size() < nb_primes:  # size() for vectors is similar to len()
        for i in p:
            if n % i == 0:
                break
        else:
            p.push_back(n)  # push_back is similar to append()
        n += 1
    # Vectors are automatically converted to Python
    # lists when converted to Python objects.
    return p
I save this code as 'test_char.pyx'. To compile it I use:
from Cython.Build import cythonize
setup(name='test_char',
      ext_modules = cythonize('test_char.pyx')
)
After that I get test_char.c, but I don't get test_char.py.
If I use this code instead (without cimport):
def primes(int nb_primes):
    cdef int n, i, len_p
    cdef int p[1000]
    if nb_primes > 1000:
        nb_primes = 1000
    len_p = 0  # The current number of elements in p.
    n = 2
    while len_p < nb_primes:
        # Is n prime?
        for i in p[:len_p]:
            if n % i == 0:
                break
        # If no break occurred in the loop, we have a prime.
        else:
            p[len_p] = n
            len_p += 1
        n += 1
    # Let's return the result in a python list:
    result_as_list = [prime for prime in p[:len_p]]
    return result_as_list
everything works. So, please, any ideas?

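from distutils.core import setup
from distutils.extension import Extension
from Cython.Build import cythonize
# Note: distutils was removed in Python 3.12; there, import setup and Extension from setuptools instead.
extensions = [
    Extension("test_char", ["test_char.pyx"],
              language="c++"
    )
]
setup(
    name="test_char",
    ext_modules = cythonize(extensions),
)
This solves the problem: libcpp.vector wraps a C++ header, so the extension has to be compiled as C++ rather than C.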

RSA Python Issue

I am having an issue with getting my python program to decrypt a message with an RSA problem. For some reason my Python program is stalling, really just not outputting anything. Anyone got an idea as to why?
n = 23952937352643527451379227516428377705004894508566304313177880191662177061878993798938496818120987817049538365206671401938265663712351239785237507341311858383628932183083145614696585411921662992078376103990806989257289472590902167457302888198293135333083734504191910953238278860923153746261500759411620299864395158783509535039259714359526738924736952759753503357614939203434092075676169179112452620687731670534906069845965633455748606649062394293289967059348143206600765820021392608270528856238306849191113241355842396325210132358046616312901337987464473799040762271876389031455051640937681745409057246190498795697239
p = 153143042272527868798412612417204434156935146874282990942386694020462861918068684561281763577034706600608387699148071015194725533394126069826857182428660427818277378724977554365910231524827258160904493774748749088477328204812171935987088715261127321911849092207070653272176072509933245978935455542420691737433
c = 18031488536864379496089550017272599246134435121343229164236671388038630752847645738968455413067773166115234039247540029174331743781203512108626594601293283737392240326020888417252388602914051828980913478927759934805755030493894728974208520271926698905550119698686762813722190657005740866343113838228101687566611695952746931293926696289378849403873881699852860519784750763227733530168282209363348322874740823803639617797763626570478847423136936562441423318948695084910283653593619962163665200322516949205854709192890808315604698217238383629613355109164122397545332736734824591444665706810731112586202816816647839648399
e = 65537
q = 156408916769576372285319235535320446340733908943564048157238512311891352879208957302116527435165097143521156600690562005797819820759620198602417583539668686152735534648541252847927334505648478214810780526425005943955838623325525300844493280040860604499838598837599791480284496210333200247148213274376422459183
phi = (q-1)*(p-1)
d = pow(e,-1,phi)
m = pow(c,d)%n
print(m)
I apologize for the weird code formatting. Thanks in advance.

Assuming the math is correct (I didn't check), you definitely want to change this:
m = pow(c,d)%n
to this:
m = pow(c, d, n)
The first spelling computes c**d to full precision before dividing by n to find the remainder. That can be enormously expensive. The second way keeps reducing intermediate results, under the covers, mod n all along, and never needs to do arithmetic in integers larger than about n**2.
So, replacing the last line of your code and continuing:
>>> m = pow(c, d, n) # less than an eyeblink
>>> m
14311663942709674867122208214901970650496788151239520971623411712977120586163535880168563325
>>> pow(m, e, n) == c
True
So the original "message" (c) is recovered by doing modular exponentiation to powers d and e in turn.
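To see the difference on numbers small enough to check by hand, here is a minimal sketch using a small textbook-sized key, p = 61, q = 53, e = 17 and message 65 (illustrative values, not taken from the question). Both spellings give the same result; the three-argument form simply never builds the huge intermediate power. Note that pow(e, -1, phi) needs Python 3.8 or later.

```python
# Toy RSA parameters (assumed for illustration only; far too small to be secure).
p, q = 61, 53
n = p * q
phi = (p - 1) * (q - 1)
e = 17
d = pow(e, -1, phi)      # modular inverse of e modulo phi (Python 3.8+)

m = 65                   # the "message"
c = pow(m, e, n)         # encrypt

slow = pow(c, d) % n     # builds the full power c**d first, then reduces
fast = pow(c, d, n)      # reduces modulo n at every step

if __name__ == "__main__":
    print("d =", d, "c =", c, "slow =", slow, "fast =", fast)
```

With the toy key the slow form is still instant, but c**d already has thousands of digits here; with a 2048-bit modulus it would never finish.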

As already answered by @TimPeters, the main issue you have is pow(c,d)%n, which should be replaced with pow(c, d, n) for a huge performance improvement.
Since your question was already answered, I decided to dig a bit further. Inspired by your question, I implemented most of the RSA mathematics from scratch according to the Wikipedia article. Maybe it is a bit off-topic (not what you asked), but I'm sure the following code will be a useful demo for somebody who wants to try RSA in plain Python, and it may be helpful to you too.
The code below names all variables the same as Wikipedia does, and the formulas are also taken from there. Important: one thing is missing in my code. I didn't implement padding, for simplicity (just to show the classical RSA math). It is very important to have correct padding (e.g. OAEP) in your system; without it there exist attacks on RSA. Also, I used just 512 bits for the prime factors of the modulus; real systems should have thousands of bits to be secure. Finally, I don't do any splitting of the message; long messages should be split into sub-messages and padded to fit the modulus bit size.
import random
def fermat_prp(n):
    # https://en.wikipedia.org/wiki/Fermat_primality_test
    assert n >= 4, n
    for i in range(24):
        a = (3, 5, 7)[i] if n >= 9 and i < 3 else random.randint(2, n - 2)
        if pow(a, n - 1, n) != 1:
            return False
    return True
def gen_prime(bits):
    assert bits >= 3, bits
    while True:
        n = random.randrange(1 << (bits - 1), 1 << bits)
        if fermat_prp(n):
            return n
def gcd(a, b):
    while b != 0:
        a, b = b, a % b
    return a
def lcm(a, b):
    return a * b // gcd(a, b)
def egcd(a, b):
    # https://en.wikipedia.org/wiki/Extended_Euclidean_algorithm
    ro, r, so, s, to, t = a, b, 1, 0, 0, 1
    while r != 0:
        q = ro // r
        ro, r = r, ro - q * r
        so, s = s, so - q * s
        to, t = t, to - q * t
    return ro, so, to
def demo():
    # https://en.wikipedia.org/wiki/RSA_(cryptosystem)
    bits = 512
    p, q = gen_prime(bits), gen_prime(bits)
    n = p * q
    ln = lcm(p - 1, q - 1)
    e = 65537
    print('PublicKey: e =', e, 'n =', n)
    d = egcd(e, ln)[1] % ln
    mtext = 'Hello, World!'
    print('Plain:', mtext)
    m = int.from_bytes(mtext.encode('utf-8'), 'little')
    c = pow(m, e, n)
    print('Encrypted:', c)
    md = pow(c, d, n)
    mdtext = md.to_bytes((md.bit_length() + 7) // 8, 'little').decode('utf-8')
    print('Decrypted:', mdtext)
if __name__ == '__main__':
    demo()
Output:
PublicKey: e = 65537 n = 110799663895649286762656294752173883884148615506062673584673343016070245791505883867301519267702723384430131035038547340921850290913097297607190494504060280758901448419479350528305305851775098631904614278162314251019568026506239421634950337278112960925116975344093575400871044570868887447462560168862887909233
Plain: Hello, World!
Encrypted: 51626387443589883457155394323971044262931599278626885275220384098221412582734630381413609428210758734789774315702921245355044370166117558802434906927834933002999816979504781510321118769252529439999715937013823223670924340787833496790181098038607416880371509879507193070745708801500713956266209367343820073123
Decrypted: Hello, World!

Euclidean GCD function returns type None instead of int [duplicate]

This question already has answers here:
Why does my recursive function return None?
(4 answers)
Closed 3 years ago.
I'm getting back into mathematics, algorithms, and data structures. Today, I spent time studying up on the Euclidean algorithm and greatest common divisors.
Below, I implemented a function to demonstrate what I learned:
from math import floor
def euclidian(a: int, b: int):
    # a = b * q + r
    _q: int = int(floor(a / b))
    print(f"Quotient: {_q}")
    r: int = a % b
    print(f"Remainder: {r}")
    a = b
    print(f"A = B({a})")
    b = r
    print(f"B = R({b})")
    if a != 0 and b != 0:
        euclidian(a, b)
    # a = 0; gcd(0, b) = b
    elif a == 0:
        print(f"Returning value a({b}) | type: {type(b)}")
        return b
    # b = 0; gcd(a, 0) = a
    elif b == 0:
        print(f"Returning value a({a}) | type: {type(a)}")
        return a
a: int = 270
b: int = 192
gcd: int = euclidian(a, b)
print(f"GCD type: {type(gcd)}")
print(f"GCD({a}, {b}) = {gcd}")
This recursive function goes through a couple iterations, and ends up returning these results:
Quotient: 6
Remainder: 0
A = B(6)
B = R(0)
Returning value a(6) | type: <class 'int'>
GCD type: <class 'NoneType'>
GCD(270, 192) = None
It's getting later in the day, so perhaps I just need a cup of tea to wake myself up. But I can't seem to wrap my head around why the variable gcd is None and not the integer value of a. What am I missing?
Thanks.

You are making a recursive call but not returning its value, i.e.
if a != 0 and b != 0:
    euclidian(a, b)
should be
if a != 0 and b != 0:
    return euclidian(a, b)
You also have no base case as others have said.
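For comparison, a minimal recursive version in which every branch returns a value could look like the sketch below. The name euclid_gcd is made up for this example, and math.gcd from the standard library is used only as a cross-check.

```python
import math


def euclid_gcd(a: int, b: int) -> int:
    """Greatest common divisor via the Euclidean algorithm."""
    if b == 0:
        # Base case: gcd(a, 0) = a
        return a
    # Recursive case: the result of the inner call must be returned,
    # otherwise the outer call falls off the end and yields None.
    return euclid_gcd(b, a % b)


if __name__ == "__main__":
    print(euclid_gcd(270, 192), math.gcd(270, 192))
```

Dropping the return on the recursive line reproduces the None from the question, because only the innermost call ever reaches a return statement.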

When a function returns None, it usually means it ran off the end of the code without hitting a return statement. In your case you have a bunch of if statements, and I'm guessing none of them applied.

Python recursive program

I'm a relative newcomer to programming: I was trained as a mathematician and have no experience with Python. I would like to know how to solve the following problem in Python; it came up while I was studying a maths problem on my own:
The program asks for a positive integer m. If m is of the form 2^n-1, it returns T(m)=n*2^{n-1}. Otherwise it writes m in the form 2^n+x, where -1 < x < 2^n, and returns T(m)=T(2^n-1)+x+1+T(x). Finally it outputs the answer.

I thought this was a neat problem so I attempted a solution. As far as I can tell, this satisfies the parameters in the original question.
#!/usr/bin/python
import math
def calculate(m: int) -> int:
    """
    >>> calculate(10)
    20
    >>> calculate(100)
    329
    >>> calculate(1.2)
    >>> calculate(-1)
    """
    if (m <= 0 or math.modf(m)[0] != 0):
        return None
    n, x = decompose(m + 1)
    if (x == 0):
        return n * 2**(n - 1)
    else:
        return calculate(2**n - 1) + x + 1 + calculate(x)
def decompose(m: int) -> (int, int):
    """
    Returns two numbers (n, x), where
    m = 2**n + x and -1 < x < 2^n
    """
    n = int(math.log(m, 2))
    return (n, m - 2**n)
if __name__ == "__main__":
    import doctest
    doctest.testmod(verbose = True)
Assuming the numbers included in the calculate function's unit tests are the correct results for the problem, this solution should be accurate. Feedback is most welcome, of course.
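Read literally, the recurrence in the question can also be transcribed using integers only, with no floating-point logarithm. The sketch below is one such reading, under two assumptions not stated in the question: T(0) = 0 (the case n = 0), and x is taken from m itself rather than from m + 1. Under that reading T(10) comes out as 17 rather than the 20 in the doctest, so the two versions do not agree and it is worth checking which one matches the intended problem. The helper ones_up_to is made up for this example; it is a brute-force cross-check, since under this reading T(m) equals the total number of 1 bits in the binary representations of 0..m.

```python
def T(m: int) -> int:
    """Literal transcription of the recurrence, assuming T(0) = 0."""
    if m < 0:
        raise ValueError("m must be non-negative")
    n = (m + 1).bit_length() - 1      # largest n with 2**n <= m + 1
    if m + 1 == 1 << n:               # m has the form 2**n - 1
        return n * (1 << n) // 2      # n * 2**(n-1); gives 0 for n = 0
    n = m.bit_length() - 1            # largest n with 2**n <= m
    x = m - (1 << n)                  # so m = 2**n + x with 0 <= x < 2**n
    return T((1 << n) - 1) + x + 1 + T(x)


def ones_up_to(m: int) -> int:
    """Brute force: count the 1 bits of every integer from 0 to m."""
    return sum(bin(k).count("1") for k in range(m + 1))


if __name__ == "__main__":
    for m in (1, 2, 7, 10, 100):
        print(m, T(m), ones_up_to(m))
```

Using bit_length instead of math.log also avoids rounding surprises for large m, where a floating-point logarithm can land just below or above the exact integer.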

Python code optimization (20x slower than C)

I've written this very badly optimized C code that does a simple math calculation:
#include <stdio.h>
#include <math.h>
#include <stdlib.h>
#define MIN(a, b) (((a) < (b)) ? (a) : (b))
#define MAX(a, b) (((a) > (b)) ? (a) : (b))
unsigned long long int p(int);
float fullCheck(int);
int main(int argc, char **argv){
    int i, g, maxNumber;
    unsigned long long int diff = 1000;
    if(argc < 2){
        fprintf(stderr, "Usage: %s maxNumber\n", argv[0]);
        return 0;
    }
    maxNumber = atoi(argv[1]);
    for(i = 1; i < maxNumber; i++){
        for(g = 1; g < maxNumber; g++){
            if(i == g)
                continue;
            if(p(MAX(i,g)) - p(MIN(i,g)) < diff && fullCheck(p(MAX(i,g)) - p(MIN(i,g))) && fullCheck(p(i) + p(g))){
                diff = p(MAX(i,g)) - p(MIN(i,g));
                printf("We have a couple %llu %llu with diff %llu\n", p(i), p(g), diff);
            }
        }
    }
    return 0;
}
float fullCheck(int number){
    float check = (-1 + sqrt(1 + 24 * number))/-6;
    float check2 = (-1 - sqrt(1 + 24 * number))/-6;
    if(check/1.00 == (int)check)
        return check;
    if(check2/1.00 == (int)check2)
        return check2;
    return 0;
}
unsigned long long int p(int n){
    return n * (3 * n - 1 ) / 2;
}
Then I tried (just for fun) to port it to Python to see how it would behave. My first version was almost a 1:1 conversion that ran terribly slowly (120+ secs in Python vs <1 sec in C).
I've done a bit of optimization, and this is what I ended up with:
#!/usr/bin/env python
from cmath import sqrt
import cProfile
from pstats import Stats
def quickCheck(n):
    partial_c = (sqrt(1 + 24 * (n)))/-6
    c = 1/6 + partial_c
    if int(c.real) == c.real:
        return True
    c = c - 2*partial_c
    if int(c.real) == c.real:
        return True
    return False
def main():
    maxNumber = 5000
    diff = 1000
    for i in range(1, maxNumber):
        p_i = i * (3 * i - 1 ) / 2
        for g in range(i, maxNumber):
            if i == g:
                continue
            p_g = g * (3 * g - 1 ) / 2
            if p_i > p_g:
                ma = p_i
                mi = p_g
            else:
                ma = p_g
                mi = p_i
            if ma - mi < diff and quickCheck(ma - mi):
                if quickCheck(ma + mi):
                    print ('New couple ', ma, mi)
                    diff = ma - mi
cProfile.run('main()','script_perf')
perf = Stats('script_perf').sort_stats('time', 'calls').print_stats(10)
This runs in about 16 secs, which is better but still almost 20 times slower than C.
Now, I know C is better than Python for this kind of calculation, but what I would like to know is whether there is something I've missed (Python-wise, like a horribly slow function or such) that could make this faster.
Please note that I'm using Python 3.1.1, if this makes a difference.

Since quickCheck is being called close to 25,000,000 times, you might want to use memoization to cache the answers.
You can do memoization in C as well as Python. Things will be much faster in C, also.
You're computing 1/6 in each iteration of quickCheck. I'm not sure if this will be optimized out by Python, but if you can avoid recomputing constant values, you'll find things are faster. C compilers do this for you.
Doing things like if condition: return True; else: return False is silly -- and time consuming. Simply do return condition.
In Python 3.x, /2 must create floating-point values. You appear to need integers for this. You should be using //2 division. It will be closer to the C version in terms of what it does, but I don't think it's significantly faster.
Finally, Python is generally interpreted. The interpreter will always be significantly slower than C.
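A few of the points above (memoization, returning the condition directly, and // for integer division) can be sketched together as follows. This is only an illustration under some assumptions: functools.lru_cache exists from Python 3.2 onwards, so it would not be available on the asker's 3.1.1, and the test is written as (1 ± root) / 6, which is algebraically the same check as in the question. The names quick_check and pentagonal are made up for this example.

```python
from functools import lru_cache
from math import sqrt


def pentagonal(i):
    # // keeps the result an int; i * (3*i - 1) is always even
    return i * (3 * i - 1) // 2


@lru_cache(maxsize=None)
def quick_check(n):
    """True if (1 + root) / 6 or (1 - root) / 6 is a whole number."""
    root = sqrt(1 + 24 * n)
    c1 = (1 + root) / 6
    c2 = (1 - root) / 6
    # Return the condition itself instead of if ...: return True / return False
    return c1 == int(c1) or c2 == int(c2)


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(n, quick_check(n))
    quick_check(5)                    # answered from the cache this time
    print(quick_check.cache_info())
```

Whether the cache pays off depends on how often the same argument recurs; cache_info() reports hits and misses, which makes that easy to measure before committing to it.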

I made it go from ~7 seconds to ~3 seconds on my machine:
Precomputed i * (3 * i - 1 ) / 2 for each value; in your version it was recomputed many times
Cached calls to quickCheck
Removed if i == g by starting the inner range at i+1
Removed if p_i > p_g, since p_i is always smaller than p_g
Also put the quickCheck function inside main, to make all variables local (local lookups are faster than global ones).
I'm sure there are more micro-optimizations available.
from cmath import sqrt
def main():
    maxNumber = 5000
    diff = 1000
    p = {}
    quickCache = {}
    for i in range(maxNumber):
        p[i] = i * (3 * i - 1 ) / 2
    def quickCheck(n):
        if n in quickCache: return quickCache[n]
        partial_c = (sqrt(1 + 24 * (n)))/-6
        c = 1/6 + partial_c
        if int(c.real) == c.real:
            quickCache[n] = True
            return True
        c = c - 2*partial_c
        if int(c.real) == c.real:
            quickCache[n] = True
            return True
        quickCache[n] = False
        return False
    for i in range(1, maxNumber):
        mi = p[i]
        for g in range(i+1, maxNumber):
            ma = p[g]
            if ma - mi < diff and quickCheck(ma - mi) and quickCheck(ma + mi):
                print('New couple ', ma, mi)
                diff = ma - mi

Because the function p() is monotonically increasing, you can avoid comparing the values, as g > i implies p(g) > p(i). Also, the inner loop can be broken out of early, because p(g) - p(i) >= diff implies p(g+1) - p(i) >= diff.
Also, for correctness, I changed the equality comparison in quickCheck to compare the difference against an epsilon, because exact comparison of floating-point values is pretty fragile.
On my machine this reduced the runtime to 7.8ms using Python 2.6. Using PyPy with JIT reduced this to 0.77ms.
This shows that before turning to micro-optimization it pays to look for algorithmic optimizations. Micro-optimizations make spotting algorithmic changes much harder for relatively tiny gains.
from math import sqrt
EPS = 0.00000001
def quickCheck(n):
    partial_c = sqrt(1 + 24*n) / -6
    # 1.0/6 so the result is also a float under Python 2, where 1/6 == 0
    c = 1.0/6 + partial_c
    # round() rather than int(): int() truncates toward zero, so a value
    # such as -0.9999999999 would be compared against 0 instead of -1
    if abs(round(c) - c) < EPS:
        return True
    c = 1.0/6 - partial_c
    if abs(round(c) - c) < EPS:
        return True
    return False
def p(i):
    return i * (3 * i - 1 ) / 2
def main(maxNumber):
    diff = 1000
    for i in range(1, maxNumber):
        for g in range(i+1, maxNumber):
            if p(g) - p(i) >= diff:
                break
            if quickCheck(p(g) - p(i)) and quickCheck(p(g) + p(i)):
                print('New couple ', p(g), p(i), p(g) - p(i))
                diff = p(g) - p(i)
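An alternative to the epsilon comparison is to avoid floating point altogether. The sketch below is an assumption-laden variant, not the answer's code: it uses math.isqrt (Python 3.8+) to test whether 1 + 24*n is a perfect square and then checks divisibility by 6 for either sign of the root, which mirrors the two branches of quickCheck. The names pentagonal, quick_check and search are made up for this example; search keeps the early break from the loop above and collects results in a list instead of printing them.

```python
from math import isqrt  # Python 3.8+


def pentagonal(i):
    # Integer division keeps everything exact; i * (3*i - 1) is always even.
    return i * (3 * i - 1) // 2


def quick_check(n):
    """True if (1 + r) / 6 or (1 - r) / 6 is an integer, r = sqrt(1 + 24*n)."""
    disc = 1 + 24 * n
    root = isqrt(disc)
    if root * root != disc:          # not a perfect square: no integer root
        return False
    return (1 + root) % 6 == 0 or (1 - root) % 6 == 0


def search(max_number, diff=1000):
    """Collect (p_g, p_i, difference) triples, tightening diff as we go."""
    found = []
    for i in range(1, max_number):
        p_i = pentagonal(i)
        for g in range(i + 1, max_number):
            p_g = pentagonal(g)
            if p_g - p_i >= diff:    # p() is increasing, so larger g cannot help
                break
            if quick_check(p_g - p_i) and quick_check(p_g + p_i):
                found.append((p_g, p_i, p_g - p_i))
                diff = p_g - p_i
    return found


if __name__ == "__main__":
    print(search(5000))
```

Because every operation is on Python ints, the check stays exact however large the pentagonal numbers get, which the float version cannot promise once 1 + 24*n exceeds 2**53.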

There are some python compilers that might actually do a good bit for you. Have a look at Psyco.
Another way of dealing with math intensive programs is to rewrite the majority of the work into a math kernel, such as NumPy, so that heavily optimized code is doing the work, and your python code only guides the calculation. To get the most out of this strategy, avoid doing calculations in loops, and instead let the math kernel do all of that.

The other respondents have already mentioned several optimizations that will help. However, ultimately, you're not going to be able to match the performance of C in Python. Python is a nice tool, but since it's interpreted, it isn't really suited for heavy number crunching or other apps where performance is key.
Also, even in your C version, your inner loop could use quite a bit of help. Updated version:
for(i = 1; i < maxNumber; i++){
    for(g = 1; g < maxNumber; g++){
        if(i == g)
            continue;
        max=i;
        min=g;
        if (max<min) {
            // xor swap - could use swap(p_max,p_min) instead.
            max=max^min;
            min=max^min;
            max=max^min;
        }
        p_max=P(max);
        p_min=P(min);
        p_i=P(i);
        p_g=P(g);
        if(p_max - p_min < diff && fullCheck(p_max-p_min) && fullCheck(p_i + p_g)){
            diff = p_max - p_min;
            printf("We have a couple %llu %llu with diff %llu\n", p_i, p_g, diff);
        }
    }
}
///////////////////////////
float fullCheck(int number){
    float den=sqrt(1+24*number)/6.0;
    float check = 1/6.0 - den;
    float check2 = 1/6.0 + den;
    if(check == (int)check)
        return check;
    if(check2 == (int)check2)
        return check2;
    return 0.0;
}
Division, function calls, etc are costly. Also, calculating them once and storing in vars such as I've done can make things a lot more readable.
You might consider declaring P() as inline or rewrite as a preprocessor macro. Depending on how good your optimizer is, you might want to perform some of the arithmetic yourself and simplify its implementation.
Your implementation of fullCheck() would return what appear to be invalid results, since 1/6==0, where 1/6.0 would return 0.166... as you would expect.
This is a very brief take on what you can do to your C code to improve performance. This will, no doubt, widen the gap between C and Python performance.

20x difference between Python and C for a number crunching task seems quite good to me.
Check the usual performance differences for some CPU intensive tasks (keep in mind that the scale is logarithmic).
But look on the bright side, what's 1 minute of CPU time compared with the brain and typing time you saved writing Python instead of C? :-)
