efficient ways of finding the largest prime factor of a number - python

I'm doing this problem on a site that I found (project Euler), and there is a question that involves finding the largest prime factor of a number. My solution fails at really large numbers so I was wondering how this code could be streamlined?
""" Find the largest prime of a number """
def get_factors(number):
factors = []
for integer in range(1, number + 1):
if number%integer == 0:
factors.append(integer)
return factors
def test_prime(number):
prime = True
for i in range(1, number + 1):
if i!=1 and i!=2 and i!=number:
if number%i == 0:
prime = False
return prime
def test_for_primes(lst):
primes = []
for i in lst:
if test_prime(i):
primes.append(i)
return primes
################################################### program starts here
def find_largest_prime_factor(i):
factors = get_factors(i)
prime_factors = test_for_primes(factors)
print prime_factors
print find_largest_prime_factor(22)
#this jams my computer
print find_largest_prime_factor(600851475143)
it fails when using large numbers, which is the point of the question I guess. (computer jams, tells me I have run out of memory and asks me which programs I would like to stop).
************************************ thanks for the answer. there was actually a couple bugs in the code in any case. so the fixed version of this (inefficient code) is below.
""" Find the largest prime of a number """
def get_factors(number):
factors = []
for integer in xrange(1, number + 1):
if number%integer == 0:
factors.append(integer)
return factors
def test_prime(number):
prime = True
if number == 1 or number == 2:
return prime
else:
for i in xrange(2, number):
if number%i == 0:
prime = False
return prime
def test_for_primes(lst):
primes = []
for i in lst:
if test_prime(i):
primes.append(i)
return primes
################################################### program starts here
def find_largest_prime_factor(i):
factors = get_factors(i)
print factors
prime_factors = test_for_primes(factors)
return prime_factors
print find_largest_prime_factor(x)

From your approach you are first generating all divisors of a number n in O(n) then you test which of these divisors is prime in another O(n) number of calls of test_prime (which is exponential anyway).
A better approach is to observe that once you found out a divisor of a number you can repeatedly divide by it to get rid of all of it's factors. Thus, to get the prime factors of, say 830297 you test all small primes (cached) and for each one which divides your number you keep dividing:
830297 is divisible by 13 so now you'll test with 830297 / 13 = 63869
63869 is still divisible by 13, you are at 4913
4913 doesn't divide by 13, next prime is 17 which divides 4913 to get 289
289 is still a multiple of 17, you have 17 which is the divisor and stop.
For further speed increase, after testing the cached prime numbers below say 100, you'll have to test for prime divisors using your test_prime function (updated according to #Ben's answer) but go on reverse, starting from sqrt. Your number is divisible by 71, the next number will give an sqrt of 91992 which is somewhat close to 6857 which is the largest prime factor.

Here is my favorite simple factoring program for Python:
def factors(n):
wheel = [1,2,2,4,2,4,2,4,6,2,6]
w, f, fs = 0, 2, []
while f*f <= n:
while n % f == 0:
fs.append(f)
n /= f
f, w = f + wheel[w], w+1
if w == 11: w = 3
if n > 1: fs.append(n)
return fs
The basic algorithm is trial division, using a prime wheel to generate the trial factors. It's not quite as fast as trial division by primes, but there's no need to calculate or store the prime numbers, so it's very convenient.
If you're interested in programming with prime numbers, you might enjoy this essay at my blog.

My solution is in C#. I bet you can translate it into python. I've been test it with random long integer ranging from 1 to 1.000.000.000 and it's doing good. You can try to test the result with online prime calculator Happy coding :)
public static long biggestPrimeFactor(long num) {
for (int div = 2; div < num; div++) {
if (num % div == 0) {
num \= div
div--;
}
}
return num;
}

The naive primality test can be improved upon in several ways:
Test for divisibility by 2 separately, then start your loop at 3 and go by 2's
End your loop at ceil(sqrt(num)). You're guaranteed to not find a prime factor above this number
Generate primes using a sieve beforehand, and only move onto the naive way if you've exhausted the numbers in your sieve.
Beyond these easy fixes, you're going to have to look up more efficient factorization algorithms.

Use a Sieve of Eratosthenes to calculate your primes.
from math import sqrt
def sieveOfEratosthenes(n):
primes = range(3, n + 1, 2) # primes above 2 must be odd so start at three and increase by 2
for base in xrange(len(primes)):
if primes[base] is None:
continue
if primes[base] >= sqrt(n): # stop at sqrt of n
break
for i in xrange(base + (base + 1) * primes[base], len(primes), primes[base]):
primes[i] = None
primes.insert(0,2)
return filter(None, primes)

The point to prime factorization by trial division is, the most efficient solution for factorizing just one number doesn't need any prime testing.
You just enumerate your possible factors in ascending order, and keep dividing them out of the number in question - all thus found factors are guaranteed to be prime. Stop when the square of current factor exceeds the current number being factorized. See the code in user448810's answer.
Normally, prime factorization by trial division is faster on primes than on all numbers (or odds etc.), but when factorizing just one number, to find the primes first to test divide by them later, will might cost more than just going ahead with the increasing stream of possible factors. This enumeration is O(n), prime generation is O(n log log n), with the Sieve of Eratosthenes (SoE), where n = sqrt(N) for the top limit N. With trial division (TD) the complexity is O(n1.5/(log n)2).
Of course the asymptotics are to be taken just as a guide, actual code's constant factors might change the picture. Example, execution times for a Haskell code derived from here and here, factorizing 600851475149 (a prime):
2.. 0.57 sec
2,3,5,... 0.28 sec
2,3,5,7,11,13,17,19,... 0.21 sec
primes, segmented TD 0.65 sec first try
0.05 sec subsequent runs (primes are memoized)
primes, list-based SoE 0.44 sec first try
0.05 sec subsequent runs (primes are memoized)
primes, array-based SoE 0.15 sec first try
0.06 sec subsequent runs (primes are memoized)
so it depends. Of course factorizing the composite number in question, 600851475143, is near instantaneous, so it doesn't matter there.

Here is an example in JavaScript
function largestPrimeFactor(val, divisor = 2) {
let square = (val) => Math.pow(val, 2);
while ((val % divisor) != 0 && square(divisor) <= val) {
divisor++;
}
return square(divisor) <= val
? largestPrimeFactor(val / divisor, divisor)
: val;
}

I converted the solution from #under5hell to Python (2.7x). what an efficient way!
def largest_prime_factor(num, div=2):
while div < num:
if num % div == 0 and num/div > 1:
num = num /div
div = 2
else:
div = div + 1
return num
>> print largest_prime_factor(600851475143)
6857
>> print largest_prime_factor(13195)
29

Try this piece of code:
from math import *
def largestprime(n):
i=2
while (n>1):
if (n % i == 0):
n = n/i
else:
i=i+1
print i
strinput = raw_input('Enter the number to be factorized : ')
a = int(strinput)
largestprime(a)

Old one but might help
def isprime(num):
if num > 1:
# check for factors
for i in range(2,num):
if (num % i) == 0:
return False
return True
def largest_prime_factor(bignumber):
prime = 2
while bignumber != 1:
if bignumber % prime == 0:
bignumber = bignumber / prime
else:
prime = prime + 1
while isprime(prime) == False:
prime = prime+1
return prime
number = 600851475143
print largest_prime_factor(number)

I Hope this would help and easy to understand.
A = int(input("Enter the number to find the largest prime factor:"))
B = 2
while (B <(A/2)):
if A%B != 0:
B = B+1
else:
A = A/B
C = B
B = 2
print (A)

This code for getting the largest prime factor, with nums value of prime_factor(13195) when I run it, will return the result in less than a second.
but when nums value gets up to 6digits it will return the result in 8seconds.
Any one has an idea of what is the best algorithm for the solution...
def prime_factor(nums):
if nums < 2:
return 0
primes = [2]
x = 3
while x <= nums:
for i in primes:
if x%i==0:
x += 2
break
else:
primes.append(x)
x += 2
largest_prime = primes[::-1]
# ^^^ code above to gets all prime numbers
intermediate_tag = []
factor = []
# this code divide nums by the largest prime no. and return if the
# result is an integer then append to primefactor.
for i in largest_prime:
x = nums/i
if x.is_integer():
intermediate_tag.append(x)
# this code gets the prime factors [29.0, 13.0, 7.0, 5.0]
for i in intermediate_tag:
y = nums/i
factor.append(y)
print(intermediate_tag)
print(f"prime factor of {nums}:==>",factor)
prime_factor(13195)
[455.0, 1015.0, 1885.0, 2639.0]
prime factor of 13195:==> [29.0, 13.0, 7.0, 5.0]

Related

Prime checker including non primes

I am trying to solve Project Euler number 7.
By listing the first six prime numbers: 2, 3, 5, 7, 11, and 13, we can see that the 6th prime is 13.
What is the 10 001st prime number?
First thing that came into my mind was using length of list. This was very ineffective solution as it took over a minute. This is the used code.
def ch7():
primes = []
x = 2
while len(primes) != 10001:
for i in range(2, x):
if x % i == 0:
break
else:
primes.append(x)
x += 1
print(primes[-1])
ch7()
# Output is: 104743.
This works well but I wanted to reach faster solution. Therefore I did a bit of research and found out that in order to know if a number is a prime, we need to test whether it is divisible by any number up to its square root e.g. in order to know if 100 is a prime we dont need to divide it by every number up to 100, but only up to 10.
When I implemented this finding weird thing happened. The algorithm included some non primes. To be exact 66 of them. This is the adjusted code:
import math
primes = []
def ch7():
x = 2
while len(primes) != 10001:
for i in range(2, math.ceil(math.sqrt(x))):
if x % i == 0:
break
else:
primes.append(x)
x += 1
print(primes[-1])
ch7()
# Output is 104009
This solution takes under a second but it includes some non primes. I used math.ceil() in order to get int instead of float but I figured it should not be a problem since it still tests by every int up to square root of x.
Thank you for any suggestions.
Your solution generates a list of primes, but doens't use that list for anything but extracting the last element. We can toss that list, and cut the time of the code in half by treating 2 as a special case, and only testing odd numbers:
def ch7(limit=10001): # assume limit is >= 1
prime = 2
number = 3
count = 1
while count < limit:
for divisor in range(3, int(number ** 0.5) + 1, 2):
if number % divisor == 0:
break
else: # no break
prime = number
count += 1
number += 2
return prime
print(ch7())
But if you're going to collect a list of primes, you can use that list to get even more speed out of the program (about 10% for the test limits in use) by using those primes as divisors instead of odd numbers:
def ch7(limit=10001): # assume limit is >= 1
primes = [2]
number = 3
while len(primes) < limit:
for prime in primes:
if prime * prime > number: # look no further
primes.append(number)
break
if number % prime == 0: # composite
break
else: # may never be needed but prime gaps can be arbitrarily large
primes.append(number)
number += 2
return primes[-1]
print(ch7())
BTW, your second solution, even with the + 1 fix you mention in the comments, comes up with one prime beyond the correct answer. This is due to the way your code (mis)handles the prime 2.

Algorithm recommendation for calculating prime factors (Python Beginner)

SPOILER ALERT! THIS MAY INFLUENCE YOUR ANSWER TO PROJECT EULER #3
I managed to get a working piece of code but it takes forever to compute the solution because of the large number I am analyzing.
I guess brute force is not the right way...
Any help in making this code more efficient?
# What is the largest prime factor of the number 600851475143
# Set variables
number = 600851475143
primeList = []
primeFactorList = []
# Make list of prime numbers < 'number'
for x in range(2, number+1):
isPrime = True
# Don't calculate for more than the sqrt of number for efficiency
for y in range(2, int(x**0.5)+1):
if x % y == 0:
isPrime = False
break
if isPrime:
primeList.append(x)
# Iterate over primeList to check for prime factors of 'number'
for i in primeList:
if number % i == 0:
primeFactorList.append(i)
# Print largest prime factor of 'number'
print(max(primeFactorList))
I'll first just address some basic problems in the particular algorithm you attempted:
You don't need to pre-generate the primes. Generate them on the fly as you need them - and you'll also see that you were generating way more primes than you need (you only need to try the primes up to sqrt(600851475143))
# What is the largest prime factor of the number 600851475143
# Set variables
number = 600851475143
primeList = []
primeFactorList = []
def primeList():
# Make list of prime numbers < 'number'
for x in range(2, number+1):
isPrime = True
# Don't calculate for more than the sqrt of number for efficiency
for y in range(2, int(x**0.5)+1):
if x % y == 0:
isPrime = False
break
if isPrime:
yield x
# Iterate over primeList to check for prime factors of 'number'
for i in primeList():
if i > number**0.5:
break
if number % i == 0:
primeFactorList.append(i)
# Print largest prime factor of 'number'
print(max(primeFactorList))
By using a generator (see the yield?) PrimeList() could even just return prime numbers forever by changing it to:
def primeList():
x = 2
while True:
isPrime = True
# Don't calculate for more than the sqrt of number for efficiency
for y in range(2, int(x**0.5)+1):
if x % y == 0:
isPrime = False
break
if isPrime:
yield x
x += 1
Although I can't help but optimize it slightly to skip over even numbers greater than 2:
def primeList():
yield 2
x = 3
while True:
isPrime = True
# Don't calculate for more than the sqrt of number for efficiency
for y in range(2, int(x**0.5)+1):
if x % y == 0:
isPrime = False
break
if isPrime:
yield x
x += 2
If you abandon your initial idea of enumerating the primes and trying them one at a time against number, there is an alternative: Instead deal directly with number and factor it out -- i.e., doing what botengboteng suggests and breaking down the number directly.
This will be much faster because we're now checking far fewer numbers:
number = 600851475143
def factors(num):
factors = []
if num % 2 == 0:
factors.append(2)
while num % 2 == 0:
num = num // 2
for f in range(3, int(num**0.5)+1, 2):
if num % f == 0:
factors.append(f)
while num % f == 0:
num = num // f
# Don't keep going if we're dividing by potential factors
# bigger than what is left.
if f > num:
break
if num > 1:
factors.append(num)
return factors
# grab last factor for maximum.
print(factors(number)[-1])
You can use user defined function such as
def isprime(n):
if n < 2:
return False
for i in range(2,(n**0.5)+1):
if n % i == 0:
return False
return True
it will return boolean values. You can use this function for verifying prime factors.
OR
you can continuously divide the number by 2 first
n = 600851475143
while n % 2 == 0:
print(2),
n = n / 2
n has to be odd now skip 2 in for loop, then print every divisor
for i in range(3,n**0.5+1,2):
while n % i== 0:
print(i)
n = n / i
at this point, n will be equal to 1 UNLESS n is a prime. So
if n > 1:
print(n)
to print itself as the prime factor.
Have fun exploring
First calculate the sqrt of your number as you've done.
number = 600851475143
number_sqrt = (number**0.5)+1
Then in your outermost loop only search for the prime numbers with a value less than the sqrt root of your number. You will not need any prime number greater than that. (You can infer any greater based on your list).
for x in range(2, number_sqrt+1):
To infer the greatest factor just divide your number by the items in the factor list including any combinations between them and determine if the resultant is or is not a prime.
No need to recalculate your list of prime numbers. But do define a function for determining if a number is prime or not.
I hope I was clear. Good Luck. Very interesting question.
I made this code, everything is explained if you have questions feel free to comment it
def max_prime_divisor_of(n):
for p in range(2, n+1)[::-1]: #We try to find the max prime who divides it, so start descending
if n%p is not 0: #If it doesn't divide it does not matter if its prime, so skip it
continue
for i in range(3, int(p**0.5)+1, 2): #Iterate over odd numbers
if p%i is 0:
break #If is not prime, skip it
if p%2 is 0: #If its multiple of 2, skip it
break
else: #If it didn't break it, is prime and divide our number, we got it
return p #return it
continue #If it broke, means is not prime, instead is just a non-prime divisor, skip it
If you don't know: What it does in range(2, n+1)[::-1] is the same as reversed(range(2, n+1)) so it means that instead of starting with 2, it starts with n because we are searching the max prime. (Basically it reverses the list to start that way)
Edit 1: This code runs faster the more divisor it has, otherwise is incredibly slow, for general purposes use the code above
def max_prime_divisor_of(n): #Decompose by its divisor
while True:
try:
n = next(n//p for p in range(2, n) if n%p is 0) #Decompose with the first divisor that we find and repeat
except StopIteration: #If the number doesn't have a divisor different from itself and 1, means its prime
return n
If you don't know: What it does in next(n//p for p in range(2, n) if n%p is 0) is that gets the first number who is divisor of n

Summation of primes : Using Generator vs w/o Generator

Project Euler 10th problem:
Find the sum of all the primes below two million.
Well I came up with two solutions :
USING GENERATOR FUNCTION
def gen_primes():
n = 2
primes = []
while True:
for p in primes:
if n % p == 0:
break
else:
primes.append(n)
yield n
n += 1
flag = True
sum=0
p=gen_primes()
while flag:
prime=p.__next__()
if prime > 2000000:
flag = False
else:
sum+=prime
print(sum)
W/O GENERATOR
def prime(number):
if number ==1:
return -1
else:
for a in (range(1,int(number**0.5)+1))[::2]:
if number%a==0 and a!=1:
return False
else:
return True
count=2
for i in (range(1,2000000))[::2]:
if prime(i)==True and i!=1:
count+=i
else:
continue
print(count)
Surprisingly, the latter (w/o g) takes 7.4 seconds whereas the former (using g) takes around 10 minutes !!!
Why is it so? Thought the generator would perform better because of fewer steps.
Your generator is doing a lot of unnecessary work. It checks all primes up to n when it only needs to check primes up to the square root of n. Here is a modified version of your generator function with the unnecessary work removed:
from time import time
t = time()
def gen_primes():
primes = []
yield 2
n = 3
while True:
is_prime = True
upper_limit = n ** .5
for p in primes:
if n % p == 0:
is_prime = False
break
elif p > upper_limit:
break
if is_prime:
primes.append(n)
yield n
n += 2 # only need to check divisibility by odd numbers
sum = 0
for prime in gen_primes():
if prime > 2_000_000:
break
else:
sum += prime
print("time:", time() - t)
print(sum)
This takes 2.3 seconds on my computer. The version without the generator takes 4.8 seconds on my computer.
If you're interested in a much more efficient algorithm for finding prime numbers, check out the Sieve of Eratosthenes.
https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes
Here's a rework of your generator code that's almost twice as fast as the non-generator code. Similar to #DanielGiger's solution (+1) but attempts to be simpler and possibly a hair faster. It uses multiple squaring instead of a single square root (may or may not win out) and doesn't need a boolean flag:
def gen_primes():
yield 2
primes = [2]
number = 3
while True:
for prime in primes:
if prime * prime > number:
yield number
primes.append(number)
break
if number % prime == 0:
break
number += 2
total = 0
generator = gen_primes()
while True:
prime = next(generator)
if prime > 2_000_000:
break
total += prime
print(total)
Here's an equivalent rework of your non-generator code that also attempts to be simpler and just a hair faster. My rule of thumb is to deal with 2 and less as special cases and then write a clean odd divisor check:
def is_prime(number):
if number < 2:
return False
if number % 2 == 0:
return number == 2
for divisor in range(3, int(number**0.5) + 1, 2):
if number % divisor == 0:
return False
return True
total = 2
for i in range(3, 2_000_000, 2):
if is_prime(i):
total += i
print(total)
Also, your code generates an entire range of numbers and then throws away the even numbers -- we can use the third argument to range() to generate only the odd numbers in the first place.
In the end, the use of a generator has nothing to do with the general performance of your two solutions. You can rewrite your second solution as a generator:
def gen_primes():
yield 2
number = 3
while True:
for divisor in range(3, int(number**0.5) + 1, 2):
if number % divisor == 0:
break
else: # no break
yield number
number += 2
The difference between the two solutions is one uses only primes as trial divisors (at the cost of storage) and the other uses odd numbers (pseudoprimes) as trial divisors (with no storage cost). Time-wise, the one that does fewer trial divisions, wins.

Divisors of a number: How can I improve this code to find the number of divisors of big numbers?

My code is very slow when it comes to very large numbers.
def divisors(num):
divs = 1
if num == 1:
return 1
for i in range(1, int(num/2)):
if num % i == 0:
divs += 1
elif int(num/2) == i:
break
else:
pass
return divs
For 10^9 i get a run time of 381.63s.
Here is an approach that determines the multiplicities of the various distinct prime factors of n. Each such power, k, contributes a factor of k+1 to the total number of divisors.
import math
def smallest_divisor(p,n):
#returns the smallest divisor of n which is greater than p
for d in range(p+1,1+math.ceil(math.sqrt(n))):
if n % d == 0:
return d
return n
def divisors(n):
divs = 1
p = 1
while p < n:
p = smallest_divisor(p,n)
k = 0
while n % p == 0:
k += 1
n //= p
divs *= (k+1)
return divs - 1
It returns the number of proper divisors (so not counting the number itself). If you want to count the number itself, don't subtract 1 from the result.
It works rapidly with numbers of the size 10**9, though will slow down dramatically with even larger numbers.
Division is expensive, multiplication is cheap.
Factorize the number into primes. (Download the list of primes, keep dividing from the <= sqrt(num).)
Then count all the permutations.
If your number is a power of exactly one prime, p^n, you obvioulsy have n divisors for it, excluding 1; 8 = 2^3 has 3 divisors: 8, 4, 2.
In general case, your number factors into k primes: p0^n0 * p1^n1 * ... * pk^nk. It has (n0 + 1) * (n1 + 1) * .. * (nk + 1) divisors. The "+1" term counts for the case when all other powers are 0, that is, contribute a 1 to the multiplication.
Alternatively, you could just google and RTFM.
Here is an improved version of my code in the question. The running time is better, 0.008s for 10^9 now.
def divisors(num):
ceiling = int(sqrt(num))
divs = []
if num == 1:
return [1]
for i in range(1, ceiling + 1):
if num % i == 0:
divs.append(num / i)
if i != num // i:
divs.append(i)
return divs
It is important for me to also keep the divisors, so if this can still
be improved I'd be glad.
Consider this:
import math
def num_of_divisors(n):
ct = 1
rest = n
for i in range(2, int(math.ceil(math.sqrt(n)))+1):
while rest%i==0:
ct += 1
rest /= i
print i # the factors
if rest == 1:
break
if rest != 1:
print rest # the last factor
ct += 1
return ct
def main():
number = 2**31 * 3**13
print '{} divisors in {}'.format(num_of_divisors(number), number)
if __name__ == '__main__':
main()
We can stop searching for factors at the square root of n. Multiple factors are found in the while loop. And when a factor is found we divide it out from the number.
edit:
#Mark Ransom is right, the factor count was 1 too small for numbers where one factor was greater than the square root of the number, for instance 3*47*149*6991. The last check for rest != 1 accounts for that.
The number of factors is indeed correct then - you don't have to check beyond sqrt(n) for this.
Both statements where a number is printed can be used to append this number to the number of factors, if desired.

Euler Project #3 in Python

I'm trying to solve the Project Euler problem 3 in Python:
The prime factors of 13195 are 5, 7, 13 and 29.
What is the largest prime factor of the number 600851475143 ?
I know my program is inefficient and oversized, but I just wanted to know why doesn't it work?
Here's the code:
def check(z):
# checks if the given integer is prime
for i in range(2, z):
if z % i == 0:
return False
break
i = i+1
return True
def primegen(y):
# generates a list of prime integers in the given range
tab = []
while y >= 2:
if check(y) == True:
tab.append(y)
i = i-1
def mainfuntion(x):
# main function; prints the largest prime factor of the given integer
primegen(x)
for i in range(len(tab)):
if x % tab[i] == 0:
print tab[i]
break
mainfuntion(600851475143)
And here's the error:
for i in range(2, z):
OverflowError: range() result has too many items
The reason is that a list in Python is limited to 536,870,912 elements (see How Big can a Python Array Get?) and when you create the range in your example, the number of elements exceeds that number, causing the error.
The fun of Project Euler is figuring out the stuff on your own (which I know you're doing :) ), so I will give one very small hint that will bypass that error. Think about what a factor of a number is - you know that it is impossible for 600851475142 to be a factor of 600851475143. Therefore you wouldn't have to check all the way up to that number. Following that logic, is there a way you can significantly reduce the range that you check? If you do a bit of research into the properties of prime factors, you may find something interesting :)
This is the code that I used!!
n = 600851475143
i = 2
while i * i < n:
while n % i == 0:
n = n / i
i = i + 1
print n
-M1K3
#seamonkey8: Your code can be improved. You can increase it by 2 rather than one. It makes a speed difference for really high numbers:
n,i = 6008514751231312143,2
while i*i <= n:
if n%i == 0:
n //= i
else:
i+=[1,2][i>2]

Categories

Resources