I'd like to take the modular inverse of a matrix like [[1,2],[3,4]] mod 7 in Python. I've looked at numpy (which does matrix inversion but not modular matrix inversion) and I saw a few number theory packages online, but nothing that seems to do this relatively common procedure (at least, it seems relatively common to me).
By the way, the inverse of the above matrix is [[5,1],[5,3]] (mod 7). I'd like Python to do it for me though.
Okay...for those who care, I solved my own problem. It took me a while, but I think this works. It's probably not the most elegant, and should include some more error handling, but it works:
import numpy
from numpy import matrix
from numpy import linalg

def modMatInv(A, p):    # Finds the inverse of matrix A mod p
    n = len(A)
    A = matrix(A)
    adj = numpy.zeros(shape=(n, n))
    for i in range(0, n):
        for j in range(0, n):
            adj[i][j] = ((-1)**(i + j) * int(round(linalg.det(minor(A, j, i))))) % p
    return (modInv(int(round(linalg.det(A))), p) * adj) % p

def modInv(a, p):       # Finds the inverse of a mod p, if it exists
    for i in range(1, p):
        if (i * a) % p == 1:
            return i
    raise ValueError(str(a) + " has no inverse mod " + str(p))

def minor(A, i, j):     # Return matrix A with the ith row and jth column deleted
    A = numpy.array(A)
    minor = numpy.zeros(shape=(len(A) - 1, len(A) - 1))
    p = 0
    for s in range(0, len(minor)):
        if p == i:
            p = p + 1
        q = 0
        for t in range(0, len(minor)):
            if q == j:
                q = q + 1
            minor[s][t] = A[p][q]
            q = q + 1
        p = p + 1
    return minor
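For example, running it on the matrix from the question reproduces the inverse given there:

A = [[1, 2], [3, 4]]
print(modMatInv(A, 7))   # [[5. 1.] [5. 3.]]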
A hackish trick which works when rounding errors aren't an issue (a sketch follows this list):

- find the regular inverse (which may have non-integer entries) and the determinant (an integer), both implemented in numpy
- multiply the inverse by the determinant, and round to integers (hacky)
- now multiply everything by the determinant's multiplicative inverse (modulo your modulus, code below)
- do entrywise mod by your modulus
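For what it's worth, a minimal sketch of that recipe (mine, not the answerer's; it assumes gcd(det(A), q) == 1, a matrix small enough that the rounded determinant and adjugate are exact, and Python 3.8+ for pow(x, -1, q)):

import numpy as np

def mod_mat_inv_hack(A, q):
    A = np.asarray(A)
    det = int(round(np.linalg.det(A)))                  # integer determinant
    adj = np.rint(det * np.linalg.inv(A)).astype(int)   # det * A^-1 is the adjugate
    return (pow(det, -1, q) * adj) % q                  # scale by det^-1 mod q

print(mod_mat_inv_hack([[1, 2], [3, 4]], 7))   # [[5 1] [5 3]]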
A less hackish way is to actually implement Gaussian elimination. Here's my code using Gaussian elimination, which I wrote for my own purposes (rounding errors were an issue for me). q is the modulus, which is not necessarily prime.
import numpy as np

def generalizedEuclidianAlgorithm(a, b):
    if b > a:
        return generalizedEuclidianAlgorithm(b, a)
    elif b == 0:
        return (1, 0)
    else:
        (x, y) = generalizedEuclidianAlgorithm(b, a % b)
        return (y, x - (a // b) * y)

def inversemodp(a, p):
    a = a % p
    if a == 0:
        print("a is 0 mod p")
        return None
    if a > 1 and p % a == 0:
        return None
    (x, y) = generalizedEuclidianAlgorithm(p, a % p)
    inv = y % p
    assert (inv * a) % p == 1
    return inv

def identitymatrix(n):
    return [[int(x == y) for x in range(0, n)] for y in range(0, n)]

def inversematrix(matrix, q):
    n = len(matrix)
    A = np.matrix([[matrix[j, i] for i in range(0, n)] for j in range(0, n)], dtype=int)
    Ainv = np.matrix(identitymatrix(n), dtype=int)
    for i in range(0, n):
        factor = inversemodp(A[i, i], q)
        if factor is None:
            raise ValueError("TODO: deal with this case")
        A[i] = A[i] * factor % q
        Ainv[i] = Ainv[i] * factor % q
        for j in range(0, n):
            if i != j:
                factor = A[j, i]
                A[j] = (A[j] - factor * A[i]) % q
                Ainv[j] = (Ainv[j] - factor * Ainv[i]) % q
    return Ainv
EDIT: as commenters point out, there are some cases this algorithm fails on. It's slightly nontrivial to fix, and I don't have time nowadays. Back then it worked for random matrices in my case (the moduli were products of large primes). Basically, the first non-zero entry might not be relatively prime to the modulus. The prime case is easy, since you can search for a different row and swap (see the sketch below). In the non-prime case, it could be that every leading entry fails to be relatively prime to the modulus, so you would have to combine rows.
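For what it's worth, here is a sketch of the row-swap fix for the prime case (a hypothetical helper, assuming A and Ainv are the numpy matrices used in inversematrix above):

def find_and_swap_pivot(A, Ainv, i, n, q):
    # search at or below row i for an entry in column i that is nonzero mod q;
    # when q is prime any such entry is invertible, so it can serve as the pivot
    for j in range(i, n):
        if A[j, i] % q != 0:
            A[[i, j]] = A[[j, i]]        # swap rows i and j
            Ainv[[i, j]] = Ainv[[j, i]]
            return True
    return False                          # whole column is 0 mod q: not invertible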
It can be calculated using Sage (www.sagemath.org) as
Matrix(IntegerModRing(7), [[1, 2], [3,4]]).inverse()
Although Sage is huge to install, and you have to use the version of Python that comes with it, which is a pain.
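For the matrix from the question, that line prints the same inverse found earlier:

[5 1]
[5 3]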
The sympy package's Matrix class has a method sqMatrix.inv_mod(mod) that computes the modular matrix inverse for small and arbitrarily large moduli. By combining sympy with numpy, it becomes easy to compute the modular inverse of 2-D numpy arrays (see the code snippet below):
import numpy
from sympy import Matrix

def matInvMod(vmnp, mod):
    nr = vmnp.shape[0]
    nc = vmnp.shape[1]
    if nr != nc:
        print("Error: Non-square matrix! exiting")
        exit()
    vmsym = Matrix(vmnp)
    vmsymInv = vmsym.inv_mod(mod)
    vmnpInv = numpy.array(vmsymInv)
    print("vmnpInv: ", vmnpInv, "\n")
    k = nr
    vmtest = [[1 for i in range(k)] for j in range(k)]   # just a 2-D list
    vmtestInv = vmsym * vmsymInv
    for i in range(k):
        for j in range(k):
            vmtest[i][j] = vmtestInv[i, j] % mod         # entrywise reduction of the product
    print("test vmk*vkinv % mod \n:", vmtest)
    return vmnpInv

if __name__ == '__main__':
    # p = 271
    p = 115792089210356248762697446949407573530086143415290314195533631308867097853951
    vm = numpy.array([[1, 1, 1, 1], [1, 2, 4, 8], [1, 4, 16, 64], [1, 5, 25, 125]])
    vminv = matInvMod(vm, p)
    print(vminv)
    vmtestnp = vm.dot(vminv) % p    # test matrix inversion
    print(vmtestnp)
Unfortunately numpy does not have modular arithmetic implementations. You can always code up the proposed algorithm using row reduction or determinants as demonstrated here. A modular inverse seems to be quite useful for cryptography.
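One more aside (not from the answers above): since Python 3.8 the built-in pow computes scalar modular inverses directly, which covers the determinant-inversion step used by several of the approaches above:

assert pow(3, -1, 7) == 5   # 3 * 5 == 15 == 1 (mod 7)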
Out of curiosity, I was wondering if there's a way to compute a binomial coefficient by simulation in Python. I tried a little, but the numbers get so big so quickly that I wasn't able to solve it for anything but really small inputs.
I'm aware of this question, but wasn't able to identify a solution there that uses only brute force to compute the coefficient. I have to admit, though, that I don't understand all the implementations listed there.
Here's my naive approach:
import random
import numpy as np
from math import factorial as fac

# Calculating the reference with help of factorials
def comb(n, k):
    return fac(n) // fac(k) // fac(n - k)

# trying a simple simulation with help of random.sample
random.seed(42)
n, k = 30, 3
n_sim = 100000
samples = np.empty([n_sim, k], dtype=int)
for i in range(n_sim):
    x = random.sample(range(n), k)
    samples[i] = sorted(x)
u = np.unique(samples, axis=0)
print(len(u))
print(comb(n, k))
Would it be possible to do this efficiently and fast for big numbers?
I use this; it's pretty efficient for large numbers:
def nck(n, k):
    if k < 0 or k > n:
        return 0
    if k == 0 or k == n:
        return 1
    k = min(k, n - k)   # take advantage of symmetry
    c = 1
    for i in range(k):
        c = c * (n - i) // (i + 1)
    return c
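For instance, a quick sanity check against the factorial-based comb from the question (math.comb, available in Python 3.8+, gives the same result):

from math import comb

assert nck(30, 3) == comb(30, 3) == 4060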
I'm using the following code for finding primitive roots modulo n in Python:
Code:
def gcd(a, b):
    while b != 0:
        a, b = b, a % b
    return a

def primRoots(modulo):
    roots = []
    required_set = set(num for num in range(1, modulo) if gcd(num, modulo) == 1)
    for g in range(1, modulo):
        actual_set = set(pow(g, powers) % modulo for powers in range(1, modulo))
        if required_set == actual_set:
            roots.append(g)
    return roots

if __name__ == "__main__":
    p = 17
    primitive_roots = primRoots(p)
    print(primitive_roots)
Output:
[3, 5, 6, 7, 10, 11, 12, 14]
Code fragment extracted from: Diffie-Hellman (Github)
Can the primRoots method be simplified or optimized in terms of memory usage and performance/efficiency?
One quick change that you can make here (not yet the efficient optimum) is to use list and set comprehensions:

def primRoots(modulo):
    coprime_set = {num for num in range(1, modulo) if gcd(num, modulo) == 1}
    return [g for g in range(1, modulo)
            if coprime_set == {pow(g, powers, modulo) for powers in range(1, modulo)}]
Now, one powerful and interesting algorithmic change that you can make here is to optimize your gcd function using memoization. Or, even better, you can simply use the built-in gcd function from the math module in Python 3.5+ (or the fractions module in earlier versions):
from functools import wraps

def cache_gcd(f):
    cache = {}
    @wraps(f)
    def wrapped(a, b):
        key = (a, b)
        try:
            result = cache[key]
        except KeyError:
            result = cache[key] = f(a, b)
        return result
    return wrapped

@cache_gcd
def gcd(a, b):
    while b != 0:
        a, b = b, a % b
    return a

# or just do the following (recommended)
# from math import gcd
Then:
def primRoots(modulo):
    coprime_set = {num for num in range(1, modulo) if gcd(num, modulo) == 1}
    return [g for g in range(1, modulo)
            if coprime_set == {pow(g, powers, modulo) for powers in range(1, modulo)}]
As mentioned in the comments, a more Pythonic optimization is to use fractions.gcd (or, for Python 3.5+, math.gcd).
Based on the comment of Pete and answer of Kasramvd, I can suggest this:
from math import gcd as bltin_gcd

def primRoots(modulo):
    required_set = {num for num in range(1, modulo) if bltin_gcd(num, modulo) == 1}
    return [g for g in range(1, modulo)
            if required_set == {pow(g, powers, modulo) for powers in range(1, modulo)}]

print(primRoots(17))
Output:
[3, 5, 6, 7, 10, 11, 12, 14]
Changes:
- It now uses pow's third argument for the modulo.
- Switched to the built-in gcd function defined in math (Python 3.5+) for a speed boost.
Additional info about built-in gcd is here: Co-primes checking
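On the first change: the three-argument pow reduces modulo the modulus after every multiplication, so it never builds the astronomically large intermediate power:

assert pow(7, 100, 17) == (7 ** 100) % 17   # same value; the left side stays small throughout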
In the special case that p is prime, the following is a good bit faster:
import sys

# translated to Python from http://www.bluetulip.org/2014/programs/primitive.js
# (some rights may remain with the author of the above javascript code)

def isNotPrime(possible):
    # We only test this here to protect people who copy and paste
    # the code without reading the first sentence of the answer.
    # In an application where you know the numbers are prime you
    # will remove this function (and the call). If you need to
    # test for primality, look for a more efficient algorithm, see
    # for example Joseph F's answer on this page.
    i = 2
    while i * i <= possible:
        if possible % i == 0:
            return True
        i = i + 1
    return False

def primRoots(theNum):
    if isNotPrime(theNum):
        raise ValueError("Sorry, the number must be prime.")
    o = 1
    roots = []
    r = 2
    while r < theNum:
        k = pow(r, o, theNum)
        while k > 1:
            o = o + 1
            k = (k * r) % theNum
        if o == (theNum - 1):
            roots.append(r)
        o = 1
        r = r + 1
    return roots

print(primRoots(int(sys.argv[1])))
You can greatly improve your isNotPrime function by using a more efficient algorithm. You could double the speed by special-casing even numbers and then testing only odd divisors up to the square root, but this is still very inefficient compared to an algorithm such as the Miller-Rabin test (sketched below). The version on the Rosetta Code site will always give the correct answer for any number with fewer than 25 digits or so. For large primes, it will run in a tiny fraction of the time trial division takes.
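For reference, here is a compact Miller-Rabin sketch (my sketch, not code from the linked page; with this fixed witness set the test is known to be deterministic for all n below roughly 3.3 * 10**24):

def is_probable_prime(n):
    if n < 2:
        return False
    witnesses = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    for p in witnesses:
        if n % p == 0:
            return n == p
    # write n - 1 as d * 2**s with d odd
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in witnesses:
        x = pow(a, d, n)
        if x == 1 or x == n - 1:
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False   # a is a witness to n being composite
    return True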
Also, you should avoid floating-point operations such as the ** 0.5 exponentiation idiom when you are dealing with integers as in this case (even though the Rosetta code that I just linked to does the same thing!). Things might work fine in a particular case, but it can be a subtle source of error when Python has to convert from floating point to integers, or when an integer is too large to represent exactly in floating point. There are efficient integer square root algorithms that you can use instead. Here's a simple one:
def int_sqrt(n):
    if n == 0:
        return 0
    x = n
    y = (x + n // x) // 2
    while y < x:
        x = y
        y = (x + n // x) // 2
    return x
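On Python 3.8+ the standard library's math.isqrt does the same job:

from math import isqrt

assert int_sqrt(10**20 + 1) == isqrt(10**20 + 1) == 10**10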
Those codes are all inefficient in many ways. First of all, you do not need to iterate over all coprime remainders of n: you only need to check powers that are divisors of Euler's totient of n. If n is prime, the totient is n-1, so you need to factorize n-1 and check only those divisors, not all of them. There is simple mathematics behind this.
Second, you need a better way of raising a number to a power when the power is large: Python's pow(g, powers, modulo) reduces at each step, keeping only the remainder (_ % modulo).
If you are going to implement the Diffie-Hellman algorithm, it is better to use safe primes: a prime p such that 2p+1 is also prime, in which case 2p+1 is called a safe prime. If n = 2p+1, then the divisors of n-1 (n is prime, so Euler's totient of n is n-1) are 1, 2, p and 2p, and you only need to check whether g^2 or g^p is congruent to 1 mod n. If either is, g is not a primitive root; throw that g away and try the next one, g+1. If both g^2 and g^p differ from 1 mod n, then g is a primitive root: that check guarantees that every power below 2p gives a number different from 1 mod n.
The example code uses a Sophie Germain prime p and the corresponding safe prime 2p+1, and calculates the primitive roots of that safe prime.
You can easily rework the code for any prime number, or any other number, by adding a function that calculates Euler's totient and finds all divisors of that value (a sketch of that generalization follows the demo below). But this is only a demo, not complete code, and there might be better ways.
class SGPrime:
    '''
    This object expects a Sophie Germain prime p; it does not check that the
    input actually is one. Euler's totient of any prime n is n-1, and the order
    (see method get_order) of any coprime remainder of n can only be a divisor
    of that totient value.
    '''
    def __init__(self, pSophieGermain):
        self.n = 2 * pSophieGermain + 1
        # TODO: check that pSophieGermain is prime.
        # TODO: check that n is also prime.
        # They both have to be prime, otherwise the code does not work!
        # Euler's totient of a prime n is n-1. TODO: for general n, calculate the totient.
        self.elrfunc = self.n - 1
        # All divisors of the totient value. TODO: for general n, find all divisors.
        self.elrfunc_divisors = [1, 2, pSophieGermain, self.elrfunc]

    def get_order(self, r):
        '''
        Calculate the order of a number: the minimal power at which r is
        congruent to 1 modulo n.
        '''
        r = r % self.n
        for d in self.elrfunc_divisors:
            if pow(r, d, self.n) == 1:
                return d
        return 0  # no such order; not possible if n is prime, see Fermat's little theorem

    def is_primitive_root(self, r):
        '''
        Check if r is a primitive root modulo n. One always exists if n is prime.
        '''
        return self.get_order(r) == self.elrfunc

    def find_all_primitive_roots(self, max_num_of_roots=None):
        '''
        Find all primitive roots. Only for demo: if n is large the list is huge;
        for DH or any similar algorithm it is better to stop at the first root found.
        '''
        primitive_roots = []
        for g in range(1, self.n):
            if self.is_primitive_root(g):
                primitive_roots.append(g)
            if max_num_of_roots is not None and len(primitive_roots) >= max_num_of_roots:
                break
        return primitive_roots

# demo with a Sophie Germain prime
p = 20963
sggen = SGPrime(p)
print(f"Safe prime: {sggen.n}, and primitive roots of {sggen.n} are:")
print(sggen.find_all_primitive_roots())
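And here is a sketch of the generalization mentioned above for an arbitrary prime p (my sketch; it factors p-1 by trial division, so it is only sensible for moderate p): g is a primitive root mod p exactly when g^((p-1)/q) is not congruent to 1 for any prime factor q of p-1.

def prime_factors(n):
    # distinct prime factors by trial division
    factors, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors

def is_primitive_root(g, p):
    # p is assumed prime
    return all(pow(g, (p - 1) // q, p) != 1 for q in prime_factors(p - 1))

print([g for g in range(1, 17) if is_primitive_root(g, 17)])   # [3, 5, 6, 7, 10, 11, 12, 14]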
I'm generating prime numbers from Fibonacci as follows (using Python, with mpmath and sympy for arbitrary precision):
from mpmath import *

def GCD(a, b):
    while a:
        a, b = fmod(b, a), a
    return b

def generate(x):
    mp.dps = round(x, int(log10(x)) * -1)
    if x == GCD(x, fibonacci(x - 1)):
        return True
    if x == GCD(x, fibonacci(x + 1)):
        return True
    return False

for x in range(1000, 2000):
    if generate(x):
        print(x)
It's a rather small algorithm but seemingly generates all primes (except for 5 somehow, but that's another question). I say seemingly because a very small percentage (0.5% under 1000 and 0.16% under 10K, and shrinking) isn't prime. For instance, under 1000, the numbers 323, 377 and 442 are also generated, and they are not prime.
Is there something off in my script? I try to account for precision by relating the .dps setting to the number being calculated. Can it really be that Fibonacci and prime numbers seem so related, but stop being related once you look at the details? :)
For this type of problem, you may want to look at the gmpy2 library. gmpy2 provides access to the GMP multiple-precision library, which includes gcd() and fib() functions that calculate the greatest common divisor and the n-th Fibonacci number quickly, using only integer arithmetic.
Here is your program re-written to use gmpy2.
import gmpy2

def generate(x):
    if x == gmpy2.gcd(x, gmpy2.fib(x - 1)):
        return True
    if x == gmpy2.gcd(x, gmpy2.fib(x + 1)):
        return True
    return False

for x in range(7, 2000):
    if generate(x):
        print(x)
You shouldn't be using any floating-point operations. You can calculate the GCD just using the builtin % (modulo) operator.
Update
As others have commented, you are checking for Fibonacci pseudoprimes. The actual test is slightly different from your code. Let's call the number being tested n. If n is divisible by 5, then the test passes if n evenly divides fib(n). If n divided by 5 leaves a remainder of either 1 or 4, then the test passes if n evenly divides fib(n-1). If n divided by 5 leaves a remainder of either 2 or 3, then the test passes if n evenly divides fib(n+1). Your code doesn't properly distinguish between the three cases.
If n evenly divides another number, say x, it leaves a remainder of 0. This is equivalent to x % n being 0. Calculating all the digits of the n-th Fibonacci number is not required; the test just cares about the remainder. Instead of calculating the Fibonacci number to full precision, you can calculate the remainder at each step. The following code calculates just the remainder of the Fibonacci numbers. It is based on the code given by @pts in Python mpmath not arbitrary precision?
def gcd(a, b):
    while b:
        a, b = b, a % b
    return a

def fib_mod(n, m):
    # fast-doubling Fibonacci; every intermediate value is reduced mod m
    if n < 0:
        raise ValueError
    def fib_rec(n):
        if n == 0:
            return 0, 1
        else:
            a, b = fib_rec(n >> 1)
            c = a * ((b << 1) - a)
            d = b * b + a * a
            if n & 1:
                return d % m, (c + d) % m
            else:
                return c % m, d % m
    return fib_rec(n)[0]

def is_fib_prp(n):
    if n % 5 == 0:
        return not fib_mod(n, n)
    elif n % 5 == 1 or n % 5 == 4:
        return not fib_mod(n - 1, n)
    else:
        return not fib_mod(n + 1, n)
It's written in pure Python and is very quick.
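For example (323 = 17 * 19 is the smallest Fibonacci pseudoprime, one of the composites the question stumbled over):

print(is_fib_prp(7))     # True: primes always pass
print(is_fib_prp(323))   # True: composite, but 323 divides fib(324)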
The sequence of numbers commonly known as the Fibonacci numbers is just a special case of a general Lucas sequence L(n) = p*L(n-1) - q*L(n-2). The usual Fibonacci numbers are generated by (p,q) = (1,-1). gmpy2.is_fibonacci_prp() accepts arbitrary values for p,q. gmpy2.is_fibonacci_prp(1,-1,n) should match the results of the is_fib_prp(n) given above.
Disclaimer: I maintain gmpy2.
This isn't really a Python problem; it's a math/algorithm problem. You may want to ask it on the Math StackExchange instead.
Also, there is no need for any non-integer arithmetic whatsoever: you're computing floor(log10(x)) which can be done easily with purely integer math. Using arbitrary-precision math will greatly slow this algorithm down and may introduce some odd numerical errors too.
Here's a simple floor_log10(x) implementation:
def floor_log10(x):
    # count how many times x can be divided by 10; that count is floor(log10(x))
    if x < 1:
        raise ValueError
    res = 0
    while x >= 10:
        x //= 10
        res += 1
    return res
I would like to compute

M(n) = sum for k = 1..n of comb(n,k) * (1/n) * (1 - k/n)^(2n-k) * (k/n)^(k-1)

for values of n up to 1000000 as accurately as possible. Here is some sample code.
from scipy.misc import comb

def M(n):
    return sum(comb(n, k, exact=True) * (1 / n) * (1 - k / n)**(2 * n - k) * (k / n)**(k - 1)
               for k in range(1, n + 1))

for i in range(1, 1000000, 100):
    print(i, M(i))
The first problem is that I get OverflowError: long int too large to convert to float when n = 1101. This happens because comb(n,k,exact=True) returns an integer too large to be converted to a float. The end result, however, is always a number around 0.159.
I asked a related question at How to compute sum with large intermediate values; however, this question is different for three main reasons.

1. The formula I want to compute is different, which causes different problems.
2. The solution proposed before, to use exact=True, does not help here, as can be seen in the example I gave. Coding up my own implementation of comb is also not going to work, as I still need to perform the floating-point division.
3. I need to compute the answer for much bigger values than before, which causes new problems. I suspect it can't be done without coding up the sum in some clever way.
A solution that doesn't crash is to use

from fractions import Fraction
from scipy.misc import comb

def M2(n):
    return sum(comb(n, k, exact=True) * Fraction(1, n) * (1 - Fraction(k, n))**(2 * n - k)
               * Fraction(k, n)**(k - 1) for k in range(1, n + 1))

for i in range(1, 1000000, 100):
    print(i, M2(i) * 1.0)

Unfortunately it is now so slow that I don't get an answer for n=1101 in a reasonable amount of time.
So the second problem is how to make it fast enough to complete for large n.
You can compute each summand with a logarithm transformation that replaces multiplication, division, and exponentiation with addition, subtraction, and multiplication, respectively.

from math import exp, log

def summand(n, k):
    lk = log(k)
    ln = log(n)
    a = (lk - ln) * (k - 1)                        # log of (k/n)^(k-1)
    b = (log(n - k) - ln) * (2 * n - k)            # log of (1 - k/n)^(2n-k)
    c = -ln                                        # log of 1/n
    d = (sum(log(x) for x in range(n - k + 1, n + 1))
         - sum(log(x) for x in range(1, k + 1)))   # log of comb(n, k)
    return exp(a + b + c + d)

def M(n):
    return sum(summand(n, k) for k in range(1, n))
Note that when k=n the summand is zero, so I do not compute it, since the logarithm would be undefined.
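As an aside (not part of the original answer), the two log-factorial sums in d can be collapsed into a constant-time log-binomial using math.lgamma:

from math import lgamma

def log_comb(n, k):
    # log of the binomial coefficient comb(n, k), computed in O(1)
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)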
You can use gmpy2. It has arbitrary-precision floating-point arithmetic with large exponent bounds.

from gmpy2 import comb, mpfr, fsum

def M(n):
    return fsum(comb(n, k) * (mpfr(1) / n) * (mpfr(1) - mpfr(k) / n)**(mpfr(2) * n - k)
                * (mpfr(k) / n)**(k - 1) for k in range(1, n + 1))

for i in range(1, 1000000, 100):
    print(i, M(i))
Here is an excerpt of the output:
2001 0.15857490038127975
2101 0.15857582611615381
2201 0.15857666768820194
2301 0.15857743607577454
2401 0.15857814042739268
2501 0.15857878842787806
2601 0.15857938657957615
Disclaimer: I maintain gmpy2.
A rather brutal method is to compute all the factors and then multiply them in such a way that the result stays around 1.0 (Python 3.x):
def M(n):
    return sum(summand(n, k) for k in range(1, n + 1))

def f1(n, k):
    # factors > 1: k repeated k-1 times (from (k/n)^(k-1)),
    # then n, n-1, ..., n-k+1 (the numerator of comb(n, k))
    for i in range(k - 1):
        yield k
    for i in range(k):
        yield n - i

def f2(n, k):
    # factors < 1: 1/n repeated k-1 times, the (1 - k/n) powers,
    # the lone 1/n, and 1/k! (the denominator of comb(n, k))
    for i in range(k - 1):
        yield 1 / n
    for i in range(2 * n - k):
        yield 1 - k / n
    yield 1 / n
    for i in range(2, k + 1):
        yield 1 / i

def summand(n, k):
    result = 1.0
    factors1 = f1(n, k)
    factors2 = f2(n, k)
    while True:
        empty1 = False
        for factor in factors1:
            result *= factor
            if result > 1:
                break
        else:
            empty1 = True
        for factor in factors2:
            result *= factor
            if result < 1:
                break
        else:
            if empty1:
                break
    return result
For M(1101) I get 0.15855899364641846, but it takes a few seconds. M(2000) takes about 14 seconds and yields 0.15857489065619598.
(I'm sure it can be optimised.)
How can this Mathematica code be ported to Python? I do not know the Mathematica syntax and am having a hard time understanding how this is described in a more traditional language.
Source (pg 5): http://subjoin.net/misc/m496pres1.nb.pdf
This cannot be ported to Python directly, as the definition of a[j] uses the symbolic arithmetic features of Mathematica.
a[j] is basically the coefficient of x^j in the series expansion of the rational function inside Apart.
Assume you have a[j]; then f[n] is easy. A Block in Mathematica basically introduces a scope for variables. The first list initializes the variables, and the rest is the code to execute. So
def f(n):
    v = n // 5
    q = v // 20
    r = v % 20
    return sum(binomial(q + 5 - j, 5) * a[r + 20*j] for j in range(5))
(binomial is the Binomial coefficient.)
Using the proposed solutions from the previous answers, I found that sympy sadly doesn't compute the apart() of the rational immediately; it somehow gets confused. Moreover, the Python list of coefficients returned by Poly.all_coeffs() has different semantics than a Mathematica list. Hence the try-except clause in the definition of a().
The following code does work, and the output for some tested values concurs with the answers given by the Mathematica formula in Mathematica 7:
from sympy import expand, Poly, binomial, apart
from sympy.abc import x

A = Poly(apart(expand((1 - x**20)**5) / expand((1 - x)**2 * (1 - x**2) * (1 - x**5) * (1 - x**10)))).all_coeffs()

def a(n):
    try:
        return A[n]
    except IndexError:
        return 0

def f(n):
    v = n // 5
    q = v // 20
    r = v % 20
    return sum(a(r + 20*j) * binomial(q + 5 - j, 5) for j in range(5))

print(list(map(f, [100, 50, 1000, 150])))
The symbolics can be done with sympy. Combined with KennyTM's answer, something like this might be what you want:
from sympy import Symbol, apart, binomial

x = Symbol('x')
poly = (1 - x**20)**5 / ((1 - x)**2 * (1 - x**2) * (1 - x**5) * (1 - x**10))
poly2 = apart(poly, x)

def a(j):
    return poly2.coeff(x**j)

def f(n):
    v = n // 5
    q = v // 20
    r = v % 20
    return sum(binomial(q + 5 - j, 5) * a(r + 20*j) for j in range(5))
Although I have to admit that f(n) does not work (I'm not very good at Python).