Here is my code:
import math
# I define a function that will calculate the log
def shank(a,p,z):
    x=math.log(p,a)%z
    return (x)
print("First log")
print(shank(106, 12375, 24691))
print("Second log")
print(shank(6, 248388, 458009))
I get correct answers, but it doesn't give me integers. For example if I input
print(shank(3, 525, 809))
I get 5.701190790597276, while 309 also works and would be the answer I prefer.
Thanks for any help.
math.log(p, a) % z is not a discrete logarithm. This expression tells Python to compute a regular logarithm, divide it by z, and give you the remainder. If you could somehow limit Python to work only with integers, you still wouldn't get a discrete logarithm; you would get an exception when Python finds that math.log(p, a) does not have an integer solution.
Python doesn't come with a built-in discrete logarithm routine. Computing discrete logarithms efficiently is one of the famous unsolved problems of computer science, and many cryptographic systems rely on it being hard. That said, if you're okay with a painfully slow solution, you can just compute higher and higher modular powers until you find one that works:
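def discrete_log(base, target, modulus):
    # Try exponents 0, 1, 2, ... in turn. After `modulus` steps the powers
    # have started repeating, so give up instead of looping forever.
    power = 1
    for i in range(modulus):
        if power == target:
            return i
        power = (power * base) % modulus
    return None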
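To see the difference on the question's own numbers, here is a small check. It assumes Python 3, where the built-in pow accepts a modulus as its third argument; the variable names are only illustrative.

```python
import math

# Ordinary (real-valued) logarithm reduced modulo 809: a float, not a discrete log.
float_result = math.log(525, 3) % 809
print(float_result)

# Discrete log by exhaustive search: the first e with 3**e congruent to 525 mod 809.
found = next(e for e in range(809) if pow(3, e, 809) == 525)
print(found, pow(3, found, 809))
```

The first value is only the real logarithm of 525 to base 3; the second search returns an integer exponent that actually satisfies the congruence.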
I am currently working on a program to calculate 100,000 digits of the first sophomore's dream constant, I1. It's given by the sum I_1 = \sum_{n=1}^{\infty} (-1)^{n+1} / n^n.
After about 10,000 terms in this series it gets quite slow. I kept the program this short because I wanted to see how small I could make it.
from decimal import *
def sophodream(a):
    s,i,t=0,1,int(a*1.5)
    while i<t:
        print(i)
        n,d=Decimal(pow(-1,i+1)),Decimal(i**i)
        f=n/d
        s+=f
        i+=1
    return s
I would like to know if there are any ways to speed this up aside from multithreading/multiprocessing. I find that when I do series like these in threaded pieces the accuracy of them gets lower.
There are some minor changes / simplifications that can be made to your code but as has already been noted, you're working (at times) with some very big numbers.
from decimal import getcontext, Decimal
def sophodream(a):
    s, p = 0, 1
    getcontext().prec = a
    for i in range(1, int(a * 1.5)):
        s += p / Decimal(i**i)
        p = -p
    return s
print(sophodream(100))
Output:
0.7834305107121344070592643865269754694076819901469309582554178227016001845891404456248642049722689389
Obviously this is just a very short run to prove functionality.
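One further idea, offered only as a sketch: because the terms 1/i**i shrink very quickly, the loop can stop as soon as a term falls below the working precision instead of always running int(a * 1.5) iterations. The function name, the five guard digits, and the stopping rule below are assumptions, not part of the original answer.

```python
from decimal import Decimal, getcontext

def sophodream_early_stop(digits):
    """Sum (-1)**(i+1) / i**i until the terms drop below the working precision."""
    guard = 5                                  # assumed number of extra guard digits
    getcontext().prec = digits + guard         # note: changes the global decimal context
    threshold = Decimal(10) ** -(digits + guard)
    total = Decimal(0)
    sign = 1
    i = 1
    while True:
        term = Decimal(1) / Decimal(i ** i)
        if term < threshold:                   # alternating series: remaining error < term
            break
        total += sign * term
        sign = -sign
        i += 1
    return total

print(sophodream_early_stop(50))
```

For an alternating series with decreasing terms, the truncation error is bounded by the first omitted term, which is why the threshold test is enough here.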
I'm curious as to why it's so much faster to multiply than to take powers in python (though from what I've read this may well be true in many other languages too). For example it's much faster to do
x*x
than
x**2
I suppose the ** operator is more general and can also deal with fractional powers. But if that's why it's so much slower, why doesn't it perform a check for an int exponent and then just do the multiplication?
Edit: Here's some example code I tried...
def pow1(r, n):
    for i in range(r):
        p = i**n
def pow2(r, n):
    for i in range(r):
        p = 1
        for j in range(n):
            p *= i
Now, pow2 is just a quick example and is clearly not optimised!
But even so I find that using n = 2 and r = 1,000,000, then pow1 takes ~ 2500ms and pow2 takes ~ 1700ms.
I admit that for large values of n, then pow1 does get much quicker than pow2. But that's not too surprising.
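For anyone who wants to reproduce this kind of comparison, here is a minimal sketch using the standard timeit module. The helper name, the value of x, and the repeat count are arbitrary choices, and the numbers will vary by machine and Python version.

```python
import timeit

def time_expression(expression, setup="x = 12345", repeats=200_000):
    """Return the total time in seconds for `repeats` evaluations of `expression`."""
    return timeit.timeit(expression, setup=setup, number=repeats)

for expression in ("x * x", "x ** 2", "x * x * x * x * x", "x ** 5"):
    print(f"{expression:>20}: {time_expression(expression):.4f} s")
```

Timing the bare expressions this way avoids measuring the loop and function-call overhead that pow1 and pow2 include.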
Basically, naive multiplication is O(n) with a very low constant factor. Taking the power is O(log n) with a higher constant factor (there are special cases that need to be tested: fractional exponents, negative exponents, etc.). Edit: just to be clear, that's O(n) where n is the exponent.
Of course the naive approach will be faster for small n, you're only really implementing a small subset of exponential math so your constant factor is negligible.
Adding a check is an expense, too. Do you always want that check there? A compiled language could make the check for a constant exponent to see if it's a relatively small integer because there's no run-time cost, just a compile-time cost. An interpreted language might not make that check.
It's up to the particular implementation unless that kind of detail is specified by the language.
Python doesn't know what distribution of exponents you're going to feed it. If it's going to be 99% non-integer values, do you want the code to check for an integer every time, making runtime even slower?
Doing this in the exponent check will very slightly slow down the cases where the exponent isn't a simple power of two, so it isn't necessarily a win. However, in cases where the exponent is known in advance (e.g. a literal 2 is used), the bytecode generated could be optimised with a simple peephole optimisation. Presumably this simply hasn't been considered worth doing (it's a fairly specific case).
Here's a quick proof of concept that does such an optimisation (usable as a decorator). Note: you'll need the byteplay module to run it, and the code is written for Python 2.
import byteplay, timeit
def optimise(func):
    c = byteplay.Code.from_code(func.func_code)
    prev=None
    for i, (op, arg) in enumerate(c.code):
        if op == byteplay.BINARY_POWER:
            if c.code[i-1] == (byteplay.LOAD_CONST, 2):
                c.code[i-1] = (byteplay.DUP_TOP, None)
                c.code[i] = (byteplay.BINARY_MULTIPLY, None)
    func.func_code = c.to_code()
    return func
def square(x):
    return x**2
print "Unoptimised :", timeit.Timer('square(10)','from __main__ import square').timeit(10000000)
square = optimise(square)
print "Optimised :", timeit.Timer('square(10)','from __main__ import square').timeit(10000000)
Which gives the timings:
Unoptimised : 6.42024898529
Optimised : 4.52667593956
[Edit]
Actually, thinking about it a bit more, there's a very good reason why this optimisation isn't done. There's no guarantee that someone won't create a user-defined class that overrides the __mul__ and __pow__ methods and does something different for each. The only way to do it safely is if you can guarantee that the object on the top of the stack is one that gives the same result for "x**2" and "x*x", but working that out is much harder. E.g. in my example it's impossible, as any object could be passed to the square function.
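To make that concrete, here is a deliberately contrived class (entirely hypothetical, written for Python 3) for which rewriting x**2 as x*x would change the program's behaviour:

```python
class Tracker:
    """Hypothetical class whose * and ** operators deliberately disagree."""

    def __init__(self, value):
        self.value = value

    def __mul__(self, other):
        # Tag the result so we can see which operator was dispatched.
        return ("mul", self.value * other.value)

    def __pow__(self, exponent):
        return ("pow", self.value ** exponent)

t = Tracker(3)
print(t * t)    # ('mul', 9)
print(t ** 2)   # ('pow', 9)
```

Since the compiler cannot know in advance whether square will receive an int or something like Tracker, it cannot safely substitute one operation for the other.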
An implementation of b^p with binary exponentiation
def power(b, p):
    """
    Calculates b^p
    Complexity O(log p)
    b -> double
    p -> integer
    res -> double
    """
    res = 1
    while p:
        if p & 0x1: res *= b
        b *= b
        p >>= 1
    return res
I'd suspect that nobody was expecting this to be all that important. Typically, if you want to do serious calculations, you do them in Fortran or C or C++ or something like that (and perhaps call them from Python).
Treating everything as exp(n * log(x)) works well in cases where n isn't integral or is pretty large, but is relatively inefficient for small integers. Checking to see if n is a small enough integer does take time, and adds complication.
Whether the check is worth it depends on the expected exponents, how important it is to get best performance here, and the cost of the extra complexity. Apparently, Guido and the rest of the Python gang decided the check wasn't worth doing.
If you like, you could write your own repeated-multiplication function.
How about x*x*x*x*x?
Is it still faster than x**5?
As integer exponents get larger, taking powers may become faster than repeated multiplication.
But the point where the crossover actually occurs depends on various conditions, so in my opinion that's why the optimization was not done (or couldn't be done) at the language/library level. Users can still optimize for their own special cases :)
In Python, at what stage should round be used? Take this example: 10 * math.log(x) + 5. If I want this to be rounded, which should I use?
round(10 * math.log(x) + 5)
round(10 * math.log(x)) + 5
10 * round(math.log(x)) + 5
My guess would be that rounding early would run the fastest because more arithmetic happens with integers, which seem like they should be faster than floats. Rounding seems less likely to break if some later values change.
Would the answer be the same with int()?
Don't prematurely optimize. In many cases, it's not highly optimized mathematical functions which slow down programs, but the logic, structure or data types used in the calculation.
To that end, I recommend you use cProfile to identify bottlenecks. Note that cProfile itself has an overhead, so it is mostly useful for relative comparisons.
As per @glibdud's comment, you have to understand how rounding will affect your calculation. Try a few examples, or perform a test to see how your error may vary across a large number of inputs.
The earlier you are rounding, the more your result will be affected by this rounding. In my opinion, it all depends on the expectations of your program.
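A quick way to try a few examples is sketched below. The function names and the sample inputs are made up for illustration; the three expressions are the ones from the question.

```python
import math

def round_last(x):
    return round(10 * math.log(x) + 5)

def round_middle(x):
    return round(10 * math.log(x)) + 5

def round_first(x):
    return 10 * round(math.log(x)) + 5

# Print the three variants side by side for a handful of inputs.
for x in (2, 5, 10, 100, 1000):
    print(x, round_last(x), round_middle(x), round_first(x))
```

For x = 2, for instance, the first two variants give 12 while the early-rounding variant gives 15, because the rounding error in log(x) is multiplied by 10 afterwards.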
As for the difference between int() and round(), this thread answers it perfectly.
To be more specific about performance: round() is a Python built-in implemented in C, so its cost will be negligible and you shouldn't really worry about it.
Round function
That entirely depends upon how you want your answer to be formatted and interpreted. I would not be hung up on the speed of the round function though (unless the very minor performance gain is crucial to your program). I would think about what I'm trying to accomplish by rounding. If your goal is to produce an output that is rounded to the nearest integer (for simplicity) then I would encompass your entire arithmetic statement into the round function. If your goal is to only use rounded integers in your log calculations (maybe because you don't want to use floats) then you should only round the math.log(x) function. There is no technical reason why you would use either, but there is definitely a logical reason that you would want to choose either of your options.
Please note that Python's math.log() function uses base e by default. From your question it's unclear what base you expect, so I'll assume log base 10 like Google does. In order to make it equivalent to the mathematical function provided, the code would need to be:
import math
#assuming x equals 2
x = 2
function1 = round(10 * math.log(x,10) + 5)
function2 = round(10 * math.log(x,10)) + 5
function3 = 10 * round(math.log(x,10)) + 5
function4 = 10*math.log(x,10)+5
print(function1)
print(function2)
print(function3)
print(function4)
Now, assuming x = 2, the mathematical expression evaluates to about 8.01029995664.
Looking at the printed output from the above code:
8
8
5
8.010299956639813
It clearly shows that functions 1, 2 and 4 are roughly mathematically equivalent, while function 3 gives a different result. This is because round() rounds to the nearest integer (in Python 3, exact halves go to the nearest even integer). math.log(2,10) is about 0.301, so when it is rounded first it drops to zero.
As for the equivalence of int() and round(), the link referenced by IMCoins is pretty good. In summary, int() removes the fractional part of a number (truncating toward zero), while round() rounds to the nearest integer, so for positive numbers round() acts like int() whenever the fractional part is less than 0.5.
As for the speed question, if accuracy is non-negotiable it would be best to round upon completion of the answer, for the same reason that function 3 was wrong above. If you're fairly certain you can round safely at an earlier step, then I agree with the answer above: use cProfile and find the bottlenecks.
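The difference between the two is easy to check interactively. The following sketch assumes Python 3, where round() with a single argument returns an int and ties are rounded to the nearest even integer:

```python
# round() goes to the nearest integer; exact halves go to the even neighbour.
print(round(0.5), round(1.5), round(2.5))   # 0 2 2
print(round(2.3), round(2.7))               # 2 3

# int() simply drops the fractional part, i.e. truncates toward zero.
print(int(2.3), int(2.7))                   # 2 2
print(int(-2.7), round(-2.7))               # -2 -3
```

The last line shows that the two also differ for negative inputs, not only for fractional parts of 0.5 and above.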
Hope this helps.
I have no clue, but let's see :)
import time
import math
n = 1000000
x = 5
def timeit(f):
    t_0 = time.perf_counter()
    for _ in range(n):
        f()
    t_1 = time.perf_counter()
    print((t_1 - t_0)/ n)
def fun1():
    round(10 * math.log(x) + 5)
def fun2():
    round(10 * math.log(x)) + 5
def fun3():
    10 * round(math.log(x)) + 5
[timeit(_) for _ in [fun1, fun2, fun3]]
On my computer the last one is slightly faster than the others.
Given positive integers b, c, m where (b < m) is True, the task is to find a positive integer e such that
(b**e % m == c) is True
where ** is exponentiation (e.g. in Ruby, Python or ^ in some other languages) and % is modulo operation. What is the most effective algorithm (with the lowest big-O complexity) to solve it?
Example:
Given b=5; c=8; m=13 this algorithm must find e=7 because 5**7%13 = 8
From the % operator I'm assuming that you are working with integers.
You are trying to solve the Discrete Logarithm problem. A reasonable algorithm is Baby step, giant step, although there are many others, none of which are particularly fast.
The difficulty of finding a fast solution to the discrete logarithm problem is a fundamental part of some popular cryptographic algorithms, so if you find a better solution than any of those on Wikipedia please let me know!
This isn't a simple problem at all. It is called calculating the discrete logarithm and it is the inverse operation to modular exponentiation.
There is no efficient algorithm known. That is, if N denotes the number of bits in m, all known algorithms run in O(2^(N^C)) where C>0.
Python 3 Solution:
Thankfully, SymPy has implemented this for you!
SymPy is a Python library for symbolic mathematics. It aims to become a full-featured computer algebra system (CAS) while keeping the code as simple as possible in order to be comprehensible and easily extensible. SymPy is written entirely in Python.
This is the documentation on the discrete_log function. Use this to import it:
from sympy.ntheory import discrete_log
Their example computes \log_7(15) (mod 41):
>>> discrete_log(41, 15, 7)
3
Because of the (state-of-the-art, mind you) algorithms it employs to solve it, you'll get O(\sqrt{n}) on most inputs you try. It's considerably faster when your prime modulus has the property where p - 1 factors into a lot of small primes.
Consider a prime on the order of 100 bits (~ 2^{100}). With \sqrt{n} complexity, that's still 2^{50} iterations. That being said, don't reinvent the wheel. This does a pretty good job. I might also add that it was almost 4 times more memory efficient than Mathematica's MultiplicativeOrder function when I ran it with large-ish inputs (44 MiB vs. 173 MiB).
Since a duplicate of this question was asked under the Python tag, here is a Python implementation of baby step, giant step, which, as @MarkBeyers points out, is a reasonable approach (as long as the modulus isn't too large):
import math
def baby_steps_giant_steps(a,b,p,N = None):
    if not N: N = 1 + int(math.sqrt(p))
    #initialize baby_steps table
    baby_steps = {}
    baby_step = 1
    for r in range(N+1):
        baby_steps[baby_step] = r
        baby_step = baby_step * a % p
    #now take the giant steps; a**((p-2)*N) is a**(-N) mod p by Fermat's little theorem (p prime)
    giant_stride = pow(a,(p-2)*N,p)
    giant_step = b
    for q in range(N+1):
        if giant_step in baby_steps:
            return q*N + baby_steps[giant_step]
        else:
            giant_step = giant_step * giant_stride % p
    return "No Match"
In the above implementation, an explicit N can be passed to fish for a small exponent even if p is cryptographically large. It will find the exponent as long as the exponent is smaller than N**2. When N is omitted, the exponent will always be found, but not necessarily in your lifetime or with your machine's memory if p is too large.
For example, if
p = 70606432933607
a = 100001
b = 54696545758787
then 'pow(a,b,p)' evaluates to 67385023448517
and
>>> baby_steps_giant_steps(a,67385023448517,p)
54696545758787
This took about 5 seconds on my machine. For the exponent and the modulus of those sizes, I estimate (based on timing experiments) that brute force would have taken several months.
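The idea behind the method is to write the unknown exponent as x = q*N + r with 0 <= r < N, so that b * (a**-N)**q = a**r (mod p): one side is looked up in a table, the other is stepped through. Below is a compact variant for small moduli, offered as a sketch rather than a replacement. It assumes Python 3.8+ (for pow with a negative exponent and math.isqrt) and that a is invertible modulo p; the function name and the None return value are my own choices.

```python
from math import isqrt

def bsgs(a, b, p):
    """Return the smallest x >= 0 with a**x % p == b % p, or None if not found."""
    n = isqrt(p) + 1
    # Baby steps: remember the first exponent r that produces each value a**r.
    table = {}
    value = 1
    for r in range(n):
        table.setdefault(value, r)
        value = value * a % p
    # Giant steps: multiply by a**(-n) until we land on a stored baby step.
    stride = pow(a, -n, p)      # modular inverse power, Python 3.8+
    gamma = b % p
    for q in range(n):
        if gamma in table:
            return q * n + table[gamma]
        gamma = gamma * stride % p
    return None

print(bsgs(5, 8, 13))
print(bsgs(3, 525, 809))
```

Both the table and the stepping loop have about sqrt(p) entries, which is where the O(sqrt(p)) time and memory cost comes from.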
Discrete logarithm is a hard problem
Computing discrete logarithms is believed to be difficult. No efficient general method for computing discrete logarithms on conventional computers is known.
I will add here a simple bruteforce algorithm which tries every possible value from 1 to m and outputs a solution if it was found. Note that there may be more than one solution to the problem or zero solutions at all. This algorithm will return you the smallest possible value or -1 if it does not exist.
def bruteLog(b, c, m):
    s = 1
    for i in range(m):
        s = (s * b) % m
        if s == c:
            return i + 1
    return -1
print(bruteLog(5, 8, 13))
and here you can see that 3 is in fact the solution:
print(5**3 % 13)
There is a better algorithm, but because it is often asked to be implemented in programming competitions, I will just give you a link to explanation.
As said, the general problem is hard. However, a practical way to find e, if and only if you know e is going to be small (as in your example), is simply to try each e starting from 1.
By the way, e==3 is the first solution to your example, and you can obviously find it in 3 steps. Compare that with solving the non-discrete version and naively looking for integer solutions, i.e.
e = log(c + n*m)/log(b) where n is a non-negative integer
which finds e==3 in 9 steps.
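The second approach can be sketched as follows. To avoid floating-point logarithms, this version tests whether c + n*m is an exact power of b using integer arithmetic only; that substitution, the function name, and the max_n cut-off are my own assumptions. It also assumes b >= 2.

```python
def lift_search(b, c, m, max_n=10_000):
    """Find the first n >= 0 such that c + n*m == b**e for some e >= 1.

    Returns (e, n), or None if no such n <= max_n exists. Assumes b >= 2.
    """
    for n in range(max_n + 1):
        value = c + n * m
        # Find the smallest power of b that is >= value.
        e, power = 0, 1
        while power < value:
            power *= b
            e += 1
        if power == value and e >= 1:
            return e, n
    return None

print(lift_search(5, 8, 13))   # (3, 9): 8 + 9*13 == 125 == 5**3
```

Trying e = 1, 2, 3 directly needs only three modular multiplications, whereas this search has to walk n up to 9 before 8 + n*13 becomes a power of 5.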
I've been doing simple numerical experiments with python, like computing
factorials. For instance, compute the factorial of 32:
My routine:
2.6313083693369503e+35
From scipy.misc:
2.6313083693369355e+35
I want to point out that my routine calculates the logarithm of the factorial: it computes the summation of logarithms from 1 to 32 (in this case) and then I just take the exp function (I do it this way because of stuff learned from Fortran 90).
It is a surprise that the correct answer is
263130836933693530167218012160000000
according to pari/gp.
I would be very happy if someone could point me to references where I can look for correct numerical answers in Python. The documentation is OK, but only if one wants "short" numbers.
The log and exp functions operate on floating-point numbers, which have limited precision. Python's integers, on the other hand, can have arbitrary precision. So, you can compute the factorial of 32 in linear space just fine using integers.
f = 1
for i in range(32):
    f *= i + 1
print(f) # prints '263130836933693530167218012160000000'
You can do it this way:
import operator
from functools import reduce
n=32
print(reduce(operator.mul,range(1,n+1)))
# 263130836933693530167218012160000000
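To compare the two approaches directly, here is a short sketch using only the standard library (math.factorial returns an exact integer). The variable names are illustrative, and the exact float printed by the log-sum route may differ slightly between platforms.

```python
import math

n = 32
exact = math.factorial(n)                                         # exact integer
via_logs = math.exp(sum(math.log(k) for k in range(1, n + 1)))    # float, about 16 digits

print(exact)
print(via_logs)
print("relative error:", abs(via_logs - exact) / exact)
```

The float result agrees with the exact value only to roughly the 15 or 16 significant digits a double can hold, which matches the discrepancy seen in the question.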