What is the complexity of this Prime finding algorithm? - python

My dad and I are trying to determine the algorithmic complexity of this prime finding function that my dad came up with as a youngster.
The first loop is obviously n since it sets up the dictionary. The trickier part is the nested loops. The outer loop runs n/4 times: 0 to n/2, step=2. The inner loop only runs if the number is considered prime which happens a lot at the beginning but happens less and less as the numbers increase.
def primesV2(n):
    count = 0 # count is for counting the number of iterations done
    # set all even numbers (and 1) to False, else assume prime
    x = {}
    for i in range(n):
        if (i != 2 and i % 2 == 0) or i==1:
            x[i] = False
        else:
            x[i] = True
    # start at 3 because its the first odd prime
    i=3
    while i < n/2: # loop until halfway to n
        if x[i]: # if the number is considered prime
            for j in range(3*i,n,i*2): # if i=3, j will be 9,15,21 (odd multiples of 3)
                x[j] = False # these are not prime
                count = count + 1
        else:
            count = count + 1
        i = i+2
    return x, count

What you have here is a modified Sieve of Eratosthenes. Without any optimizations the complexity would be O(n log log n). The Wikipedia article on the Sieve of Eratosthenes explains why.
Your optimizations speed it up by a total factor of 4. You only go up to n/2 (you could stop at sqrt n) and you skip half the multiples. While this will make the code faster the complexity remains unchanged (constant factors are ignored). So it will still be O(n log log n).
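To get a feel for the growth rate, here is a minimal sketch (not the code from the question) of a plain sieve that stops at sqrt(n) and counts how many times the inner loop crosses a number off. The name sieve_with_count is made up for illustration. If the O(n log log n) bound holds, the ratio of crossings to n should creep up only very slowly as n grows.

```python
import math

def sieve_with_count(n):
    """Return (primes below n, number of inner-loop crossings)."""
    if n < 3:
        return [], 0
    is_prime = [True] * n
    is_prime[0] = is_prime[1] = False
    crossings = 0
    # Only candidates up to sqrt(n - 1) can have an uncrossed multiple below n
    for i in range(2, math.isqrt(n - 1) + 1):
        if is_prime[i]:
            for j in range(i * i, n, i):
                is_prime[j] = False
                crossings += 1
    return [i for i, flag in enumerate(is_prime) if flag], crossings

for n in (10**3, 10**4, 10**5):
    primes, crossings = sieve_with_count(n)
    print(n, len(primes), round(crossings / n, 2))
```

Skipping even numbers or stopping at n/2 instead of sqrt(n) changes the crossing count by a constant factor at most, which is why the big-O class stays the same.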

Related

Prime checker including non primes

I am trying to solve Project Euler number 7.
By listing the first six prime numbers: 2, 3, 5, 7, 11, and 13, we can see that the 6th prime is 13.
What is the 10 001st prime number?
The first thing that came to my mind was using the length of a list. This was a very inefficient solution as it took over a minute. This is the code I used.
def ch7():
    primes = []
    x = 2
    while len(primes) != 10001:
        for i in range(2, x):
            if x % i == 0:
                break
        else:
            primes.append(x)
        x += 1
    print(primes[-1])
ch7()
# Output is: 104743.
This works well but I wanted to reach a faster solution. Therefore I did a bit of research and found out that in order to know if a number is a prime, we need to test whether it is divisible by any number up to its square root, e.g. in order to know if 100 is a prime we don't need to divide it by every number up to 100, but only up to 10.
When I implemented this finding, a weird thing happened. The algorithm included some non-primes. To be exact, 66 of them. This is the adjusted code:
import math
primes = []
def ch7():
    x = 2
    while len(primes) != 10001:
        for i in range(2, math.ceil(math.sqrt(x))):
            if x % i == 0:
                break
        else:
            primes.append(x)
        x += 1
    print(primes[-1])
ch7()
# Output is 104009
This solution takes under a second but it includes some non-primes. I used math.ceil() in order to get an int instead of a float, but I figured it should not be a problem since it still tests every int up to the square root of x.
Thank you for any suggestions.
Your solution generates a list of primes, but doesn't use that list for anything but extracting the last element. We can toss that list, and cut the time of the code in half by treating 2 as a special case, and only testing odd numbers:
def ch7(limit=10001): # assume limit is >= 1
    prime = 2
    number = 3
    count = 1
    while count < limit:
        for divisor in range(3, int(number ** 0.5) + 1, 2):
            if number % divisor == 0:
                break
        else: # no break
            prime = number
            count += 1
        number += 2
    return prime
print(ch7())
But if you're going to collect a list of primes, you can use that list to get even more speed out of the program (about 10% for the test limits in use) by using those primes as divisors instead of odd numbers:
def ch7(limit=10001): # assume limit is >= 1
    primes = [2]
    number = 3
    while len(primes) < limit:
        for prime in primes:
            if prime * prime > number: # look no further
                primes.append(number)
                break
            if number % prime == 0: # composite
                break
        else: # may never be needed but prime gaps can be arbitrarily large
            primes.append(number)
        number += 2
    return primes[-1]
print(ch7())
BTW, your second solution, even with the + 1 fix you mention in the comments, comes up with one prime beyond the correct answer. This is due to the way your code (mis)handles the prime 2.
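To see where the non-primes come from, here is a small sketch comparing the question's divisor range with an inclusive one. The function names are made up for illustration, and math.isqrt assumes Python 3.8 or later. Because range() excludes its stop value, a perfect square is never tested against its own root, so squares of primes slip through.

```python
import math

def is_prime_buggy(x):
    # Same divisor range as the question: the stop value is excluded,
    # so a perfect square is never divided by its root
    for i in range(2, math.ceil(math.sqrt(x))):
        if x % i == 0:
            return False
    return True

def is_prime(x):
    if x < 2:
        return False
    # math.isqrt is the exact integer square root; + 1 makes the range inclusive
    for i in range(2, math.isqrt(x) + 1):
        if x % i == 0:
            return False
    return True

slipped = [x for x in range(2, 200) if is_prime_buggy(x) and not is_prime(x)]
print(slipped)  # [4, 9, 25, 49, 121, 169]
```

Note that simply adding 1 to the ceil() version is not enough: for x = 2 the range becomes range(2, 3), which divides 2 by itself and rejects it. Rounding the square root down before adding 1 avoids that.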

Incorrect output Project Euler #50

Project Euler problem 50 reads as follows:
The prime 41, can be written as the sum of six consecutive primes:
41 = 2 + 3 + 5 + 7 + 11 + 13
This is the longest sum of consecutive primes that adds to a prime below one-hundred.
The longest sum of consecutive primes below one-thousand that adds to a prime, contains 21 terms, and is equal to 953.
Which prime, below one-million, can be written as the sum of the most consecutive primes?
In my approach I pregenerate a list of primes using the Sieve of Eratosthenes, then in the function itself I keep adding succeeding elements of my prime number list, and each time I do that I check if the sum itself is prime. If it is, I keep track of it as the biggest one and return it. Well, that should work, I guess? Obviously the answer is incorrect, but the interesting thing is that when I change the sieve to generate primes below 100000 it doesn't give an index error but gives another result.
from algorithms import gen_primes
primes = [i for i in gen_primes(1000000)]
def main(n):
    idx, total, maximum = 0, 0, 0
    while total < n:
        total += primes[idx]
        idx += 1
        if total in primes:
            maximum = total
    return maximum
print(main(1000000))
Your program doesn't solve the general problem: you always start your list of consecutive primes at the lowest, 2. Thus, what you return is the longest consecutive list starting at 2, rather than any consecutive list of primes.
In short, you need another loop ...
start_idx = 0
while start_idx < len(primes) and best_len*primes[start_idx] < n:
    # find longest list starting at primes[start_idx]
    start_idx += 1
In case it's any help, the successful sequence begins between 1500 and 2000.
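One way the extra loop over starting positions could be filled in is sketched below. It is only an illustration under my own assumptions: the sieve is a stand-in for the gen_primes helper from the question, the names primes_below and longest_prime_sum are made up, and prefix sums are used so that the sum of any run of primes is a single subtraction.

```python
def primes_below(limit):
    """Plain Sieve of Eratosthenes; stand-in for the question's gen_primes."""
    if limit < 3:
        return []
    is_prime = [True] * limit
    is_prime[0] = is_prime[1] = False
    for i in range(2, int(limit ** 0.5) + 1):
        if is_prime[i]:
            for j in range(i * i, limit, i):
                is_prime[j] = False
    return [i for i, flag in enumerate(is_prime) if flag]

def longest_prime_sum(n):
    """Return (prime, length) for the longest run of consecutive primes
    whose sum is a prime below n."""
    primes = primes_below(n)
    prime_set = set(primes)  # set membership is O(1), unlike `in` on a list
    # prefix[k] holds the sum of the first k primes
    prefix = [0]
    for p in primes:
        prefix.append(prefix[-1] + p)
    best_len, best_sum = 0, 0
    for start in range(len(primes)):
        # A longer run from here needs best_len + 1 terms, each >= primes[start]
        if primes[start] * (best_len + 1) >= n:
            break
        for end in range(start + best_len + 1, len(prefix)):
            total = prefix[end] - prefix[start]
            if total >= n:
                break
            if total in prime_set:
                best_len, best_sum = end - start, total
    return best_sum, best_len

print(longest_prime_sum(100))   # (41, 6), as in the problem statement
print(longest_prime_sum(1000))  # (953, 21), as in the problem statement
```

Calling longest_prime_sum(1000000) should then handle the actual question.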

I can't find where I did wrong :(

I was working on Project Euler question 23 with Python. For this question, I have to find the sum of all numbers < 28124 that cannot be written as the sum of two abundant numbers. Abundant numbers are numbers that are smaller than the sum of their proper divisors.
My approach was: https://gist.github.com/anonymous/373f23098aeb5fea3b12fdc45142e8f7
from math import sqrt
def dSum(n): #find sum of proper divisors
    lst = set([])
    if n%2 == 0:
        step = 1
    else:
        step = 2
    for i in range(1, int(sqrt(n))+1, step):
        if n % i == 0:
            lst.add(i)
            lst.add(int(n/i))
    llst = list(lst)
    lst.remove(n)
    sum = 0
    for j in lst:
        sum += j
    return sum
#any numbers greater than 28123 can be written as the sum of two abundant numbers.
#thus, only have to find abundant numbers up to 28124 / 2 = 14062
abnum = [] #list of abundant numbers
sum = 0
can = set([])
for i in range(1,14062):
    if i < dSum(i):
        abnum.append(i)
for i in abnum:
    for j in abnum:
        can.add(i + j)
print (abnum)
print (can)
cannot = set(range(1,28124))
cannot = cannot - can
cannot = list(cannot)
cannot.sort ()
result = 0
print (cannot)
for i in cannot:
    result += i
print (result)
which gave me an answer of 31531501, which is wrong.
I googled the answer and it should be 4179871.
There's like a 1 million difference between the answers, so it should mean that I'm removing numbers that cannot be written as the sum of two abundant numbers. But when I re-read the code it looks fine logically...
Please save me from this despair.
Just for some experience, you really should look at comprehensions and leveraging the builtins (vs. hiding them):
Your loops outside of dSum() (which can also be simplified) could look like:
import itertools as it
abnum = [i for i in range(1,28124) if i < dSum(i)]
can = {i+j for i, j in it.product(abnum, repeat=2)}
cannot = set(range(1,28124)) - can
print(sum(cannot)) # 4179871
There are a few ways to improve your code.
Firstly, here's a more compact version of dSum that's fairly close to your code. Operators are generally faster than function calls, so I use ** .5 instead of calling math.sqrt. I use a conditional expression instead of an if...else block to compute the step size. I use the built-in sum function instead of a for loop to add up the divisors; also, I use integer subtraction to remove n from the total because that's more efficient than calling the set.remove method.
def dSum(n):
    lst = set()
    for i in range(1, int(n ** .5) + 1, 2 if n % 2 else 1):
        if n % i == 0:
            lst.add(i)
            lst.add(n // i)
    return sum(lst) - n
However, we don't really need to use a set here. We can just add the divisor pairs as we find them, if we're careful not to add any divisor twice.
def dSum(n):
    total = 0
    for i in range(1, int(n ** .5) + 1, 2 if n % 2 else 1):
        if n % i == 0:
            j = n // i
            if i < j:
                total += i + j
            else:
                if i == j:
                    total += i
                break
    return total - n
This is slightly faster, and uses less RAM, at the expense of added code complexity. However, there's a more efficient approach to this problem.
Instead of finding the divisors (and hence the divisor sum) of each number individually, it's better to use a sieving approach that finds the divisors of all the numbers in the required range. Here's a simple example.
num = 28124
# Build a table of divisor sums.
table = [1] * num
for i in range(2, num):
    for j in range(2 * i, num, i):
        table[j] += i
# Collect abundant numbers
abnum = [i for i in range(2, num) if i < table[i]]
print(len(abnum), abnum[0], abnum[-1])
output
6965 12 28122
If we need to find divisor sums for a very large num a good approach is to find the prime power factors of each number, since there's an efficient way to compute the sum of the divisors from the prime power factorization. However, for numbers this small the minor time saving doesn't warrant the extra code complexity. (But I can add some prime power sieve code if you're curious; for finding divisor sums for all numbers < 28124, the prime power sieve technique is about twice as fast as the above code).
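For the curious, here is a minimal per-number sketch of the prime power idea. It is not the prime power sieve mentioned above, just an illustration of the formula it relies on: if n = p1^k1 * p2^k2 * ..., the sum of all divisors is the product of (1 + p + p^2 + ... + p^k) over the prime powers. The function names are made up for illustration.

```python
def divisor_sum(n):
    """Sum of all divisors of n (including n), via its prime factorization."""
    total = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            # term accumulates 1 + p + p^2 + ... + p^k for this prime
            term, power = 1, 1
            while n % p == 0:
                n //= p
                power *= p
                term += power
            total *= term
        p += 1
    if n > 1:
        # whatever remains is a single prime factor with exponent 1
        total *= 1 + n
    return total

def proper_divisor_sum(n):
    return divisor_sum(n) - n

print(proper_divisor_sum(12))  # 16, so 12 is abundant
print(proper_divisor_sum(28))  # 28, so 28 is perfect
```

For example, 12 = 2^2 * 3 gives (1 + 2 + 4) * (1 + 3) = 28, and subtracting 12 leaves 16.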
AChampion's answer shows a very compact way to find the sum of the numbers that cannot be written as the sum of two abundant numbers. However, it's a bit slow, mostly because it loops over all pairs of abundant numbers in abnum. Here's a faster way.
def main():
    num = 28124
    # Build a table of divisor sums. table[0] should be 0, but we ignore it.
    table = [1] * num
    for i in range(2, num):
        for j in range(2 * i, num, i):
            table[j] += i
    # Collect abundant numbers
    abnum = [i for i in range(2, num) if i < table[i]]
    del table
    # Create a set for fast searching
    abset = set(abnum)
    print(len(abset), abnum[0], abnum[-1])
    total = 0
    for i in range(1, num):
        # Search for pairs of abundant numbers j <= d: j + d == i
        for j in abnum:
            d = i - j
            if d < j:
                # No pairs were found
                total += i
                break
            if d in abset:
                break
    print(total)
if __name__ == "__main__":
    main()
output
6965 12 28122
4179871
This code runs in around 2.7 seconds on my old 32-bit single core 2GHz machine running Python 3.6.0. On Python 2, it's about 10% faster; I think that's because list comprehensions have less overhead in Python 2 (they run in the current scope rather than creating a new scope).

efficient ways of finding the largest prime factor of a number

I'm doing this problem on a site that I found (project Euler), and there is a question that involves finding the largest prime factor of a number. My solution fails at really large numbers so I was wondering how this code could be streamlined?
""" Find the largest prime of a number """
def get_factors(number):
    factors = []
    for integer in range(1, number + 1):
        if number%integer == 0:
            factors.append(integer)
    return factors
def test_prime(number):
    prime = True
    for i in range(1, number + 1):
        if i!=1 and i!=2 and i!=number:
            if number%i == 0:
                prime = False
    return prime
def test_for_primes(lst):
    primes = []
    for i in lst:
        if test_prime(i):
            primes.append(i)
    return primes
################################################### program starts here
def find_largest_prime_factor(i):
    factors = get_factors(i)
    prime_factors = test_for_primes(factors)
    print prime_factors
print find_largest_prime_factor(22)
#this jams my computer
print find_largest_prime_factor(600851475143)
It fails when using large numbers, which is the point of the question, I guess. (The computer jams, tells me I have run out of memory and asks me which programs I would like to stop.)
Thanks for the answer. There were actually a couple of bugs in the code in any case, so the fixed version of this (inefficient) code is below.
""" Find the largest prime of a number """
def get_factors(number):
    factors = []
    for integer in xrange(1, number + 1):
        if number%integer == 0:
            factors.append(integer)
    return factors
def test_prime(number):
    prime = True
    if number == 1 or number == 2:
        return prime
    else:
        for i in xrange(2, number):
            if number%i == 0:
                prime = False
    return prime
def test_for_primes(lst):
    primes = []
    for i in lst:
        if test_prime(i):
            primes.append(i)
    return primes
################################################### program starts here
def find_largest_prime_factor(i):
    factors = get_factors(i)
    print factors
    prime_factors = test_for_primes(factors)
    return prime_factors
print find_largest_prime_factor(x)
From your approach you are first generating all divisors of a number n in O(n) then you test which of these divisors is prime in another O(n) number of calls of test_prime (which is exponential anyway).
A better approach is to observe that once you have found a divisor of a number you can repeatedly divide by it to get rid of all of its factors. Thus, to get the prime factors of, say, 830297 you test all small primes (cached) and for each one which divides your number you keep dividing:
830297 is divisible by 13 so now you'll test with 830297 / 13 = 63869
63869 is still divisible by 13, you are at 4913
4913 doesn't divide by 13, next prime is 17 which divides 4913 to get 289
289 is still a multiple of 17, you have 17 which is the divisor and stop.
For a further speed increase, after testing the cached prime numbers below, say, 100, you'll have to test for prime divisors using your test_prime function (updated according to @Ben's answer) but go in reverse, starting from the sqrt. Your number is divisible by 71; the next number will give a sqrt of 91992, which is somewhat close to 6857, the largest prime factor.
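The keep-dividing steps above can be sketched in a few lines. This is a minimal illustration that tries 2 and then every odd number rather than a cached list of primes (composite candidates never divide, because their prime factors have already been removed). The function name is made up for illustration.

```python
def largest_prime_factor(n):
    """Largest prime factor of n >= 2, by repeated trial division."""
    largest = None
    d = 2
    while d * d <= n:
        while n % d == 0:
            # d divides n: record it and strip it out completely
            largest = d
            n //= d
        d += 1 if d == 2 else 2  # 2, 3, 5, 7, 9, ...
    if n > 1:
        # what is left has no divisor up to its square root, so it is prime
        largest = n
    return largest

print(largest_prime_factor(830297))        # 17
print(largest_prime_factor(600851475143))  # 6857
```

Because the loop stops once d * d exceeds what is left of n, the work depends on the second largest prime factor rather than on n itself.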
Here is my favorite simple factoring program for Python:
def factors(n):
    wheel = [1,2,2,4,2,4,2,4,6,2,6]
    w, f, fs = 0, 2, []
    while f*f <= n:
        while n % f == 0:
            fs.append(f)
            n //= f  # integer division, so n stays an int on Python 3 too
        f, w = f + wheel[w], w+1
        if w == 11: w = 3
    if n > 1: fs.append(n)
    return fs
The basic algorithm is trial division, using a prime wheel to generate the trial factors. It's not quite as fast as trial division by primes, but there's no need to calculate or store the prime numbers, so it's very convenient.
If you're interested in programming with prime numbers, you might enjoy this essay at my blog.
My solution is in C#. I bet you can translate it into Python. I've tested it with random long integers ranging from 1 to 1,000,000,000 and it's doing well. You can check the results with an online prime calculator. Happy coding :)
public static long biggestPrimeFactor(long num) {
    for (int div = 2; div < num; div++) {
        if (num % div == 0) {
            num /= div;
            div--;
        }
    }
    return num;
}
The naive primality test can be improved upon in several ways:
Test for divisibility by 2 separately, then start your loop at 3 and go by 2's
End your loop at ceil(sqrt(num)). If no divisor has turned up by then, there is none above it either, so the number is prime
Generate primes using a sieve beforehand, and only move onto the naive way if you've exhausted the numbers in your sieve.
Beyond these easy fixes, you're going to have to look up more efficient factorization algorithms.
Use a Sieve of Eratosthenes to calculate your primes.
from math import sqrt
def sieveOfEratosthenes(n):
    primes = range(3, n + 1, 2) # primes above 2 must be odd so start at three and increase by 2
    for base in xrange(len(primes)):
        if primes[base] is None:
            continue
        if primes[base] >= sqrt(n): # stop at sqrt of n
            break
        for i in xrange(base + (base + 1) * primes[base], len(primes), primes[base]):
            primes[i] = None
    primes.insert(0,2)
    return filter(None, primes)
The point to prime factorization by trial division is, the most efficient solution for factorizing just one number doesn't need any prime testing.
You just enumerate your possible factors in ascending order, and keep dividing them out of the number in question - all thus found factors are guaranteed to be prime. Stop when the square of current factor exceeds the current number being factorized. See the code in user448810's answer.
Normally, prime factorization by trial division is faster on primes than on all numbers (or odds etc.), but when factorizing just one number, finding the primes first in order to test divide by them later might cost more than just going ahead with the increasing stream of possible factors. This enumeration is O(n), while prime generation is O(n log log n) with the Sieve of Eratosthenes (SoE), where n = sqrt(N) for the top limit N. With trial division (TD) the complexity is O(n^1.5/(log n)^2).
Of course the asymptotics are to be taken just as a guide, actual code's constant factors might change the picture. Example, execution times for a Haskell code derived from here and here, factorizing 600851475149 (a prime):
2..                          0.57 sec
2,3,5,...                    0.28 sec
2,3,5,7,11,13,17,19,...      0.21 sec
primes, segmented TD         0.65 sec   first try
                             0.05 sec   subsequent runs (primes are memoized)
primes, list-based SoE       0.44 sec   first try
                             0.05 sec   subsequent runs (primes are memoized)
primes, array-based SoE      0.15 sec   first try
                             0.06 sec   subsequent runs (primes are memoized)
so it depends. Of course factorizing the composite number in question, 600851475143, is near instantaneous, so it doesn't matter there.
Here is an example in JavaScript
function largestPrimeFactor(val, divisor = 2) {
    let square = (val) => Math.pow(val, 2);
    while ((val % divisor) != 0 && square(divisor) <= val) {
        divisor++;
    }
    return square(divisor) <= val
        ? largestPrimeFactor(val / divisor, divisor)
        : val;
}
I converted the solution from @under5hell to Python (2.7.x). What an efficient way!
def largest_prime_factor(num, div=2):
    while div < num:
        if num % div == 0 and num/div > 1:
            num = num /div
            div = 2
        else:
            div = div + 1
    return num
>> print largest_prime_factor(600851475143)
6857
>> print largest_prime_factor(13195)
29
Try this piece of code:
from math import *
def largestprime(n):
    i=2
    while (n>1):
        if (n % i == 0):
            n = n/i
        else:
            i=i+1
    print i
strinput = raw_input('Enter the number to be factorized : ')
a = int(strinput)
largestprime(a)
An old one, but it might help:
def isprime(num):
    if num > 1:
        # check for factors
        for i in range(2,num):
            if (num % i) == 0:
                return False
        return True
def largest_prime_factor(bignumber):
    prime = 2
    while bignumber != 1:
        if bignumber % prime == 0:
            bignumber = bignumber / prime
        else:
            prime = prime + 1
            while isprime(prime) == False:
                prime = prime+1
    return prime
number = 600851475143
print largest_prime_factor(number)
I hope this helps and is easy to understand.
A = int(input("Enter the number to find the largest prime factor:"))
B = 2
while (B <(A/2)):
    if A%B != 0:
        B = B+1
    else:
        A = A/B
        C = B
        B = 2
print (A)
This code gets the largest prime factor. With a nums value of 13195, as in prime_factor(13195), it returns the result in less than a second,
but when nums gets up to 6 digits it takes 8 seconds to return the result.
Does anyone have an idea of what the best algorithm for the solution is?
def prime_factor(nums):
    if nums < 2:
        return 0
    primes = [2]
    x = 3
    while x <= nums:
        for i in primes:
            if x%i==0:
                x += 2
                break
        else:
            primes.append(x)
            x += 2
    largest_prime = primes[::-1]
    # ^^^ code above to gets all prime numbers
    intermediate_tag = []
    factor = []
    # this code divide nums by the largest prime no. and return if the
    # result is an integer then append to primefactor.
    for i in largest_prime:
        x = nums/i
        if x.is_integer():
            intermediate_tag.append(x)
    # this code gets the prime factors [29.0, 13.0, 7.0, 5.0]
    for i in intermediate_tag:
        y = nums/i
        factor.append(y)
    print(intermediate_tag)
    print(f"prime factor of {nums}:==>",factor)
prime_factor(13195)
[455.0, 1015.0, 1885.0, 2639.0]
prime factor of 13195:==> [29.0, 13.0, 7.0, 5.0]

Python Eratosthenes Sieve Algorithm Optimization

I'm attempting to implement the Sieve of Eratosthenes. The output seems to be correct (minus "2" that needs to be added) but if the input to the function is larger than 100k or so it seems to take an inordinate amount of time. What are ways that I can optimize this function?
def sieveErato(n):
    numberList = range(3,n,2)
    for item in range(int(math.sqrt(len(numberList)))):
        divisor = numberList[item]
        for thing in numberList:
            if(thing % divisor == 0) and thing != divisor:
                numberList.remove(thing)
    return numberList
Your algorithm is not the Sieve of Eratosthenes. You perform trial division (the modulus operator) instead of crossing off multiples, as Eratosthenes did over two thousand years ago. Here is an explanation of the true sieving algorithm, and shown below is my simple, straightforward implementation, which returns a list of primes not exceeding n:
def sieve(n):
    m = (n-1) // 2
    b = [True]*m
    i,p,ps = 0,3,[2]
    while p*p < n:
        if b[i]:
            ps.append(p)
            j = 2*i*i + 6*i + 3
            while j < m:
                b[j] = False
                j = j + 2*i + 3
        i+=1; p+=2
    while i < m:
        if b[i]:
            ps.append(p)
        i+=1; p+=2
    return ps
We sieve only on the odd numbers, stopping at the square root of n. The odd-looking calculations on j map between the integers being sieved 3, 5, 7, 9, ... and indexes 0, 1, 2, 3, ... in the b array of bits.
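The index arithmetic can be checked with a few lines. This is only a sketch; the helper names index_of and value_at are made up for illustration. Index i stands for the odd number p = 2i + 3, so the square p*p = 4i^2 + 12i + 9 sits at index (p*p - 3) / 2 = 2i^2 + 6i + 3, and moving p positions along the array corresponds to adding 2p to the number, which skips the even multiples.

```python
def index_of(value):
    """Position of an odd number >= 3 in the b array (3 -> 0, 5 -> 1, ...)."""
    return (value - 3) // 2

def value_at(index):
    """Odd number stored at a given position of the b array."""
    return 2 * index + 3

for i in range(5):
    p = value_at(i)
    first = 2 * i * i + 6 * i + 3  # starting j used by sieve()
    step = 2 * i + 3               # increment of j used by sieve()
    assert first == index_of(p * p)                   # crossing starts at p squared
    assert step == p                                  # index step equals p itself
    assert value_at(first + step) == p * p + 2 * p    # next odd multiple of p
    print(p, first, value_at(first), value_at(first + step))
```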
You can see this function in action at http://ideone.com/YTaMB, where it computes the primes to a million in less than a second.
You can try it the same way Eratosthenes did. Take an array with all the numbers you need to check, in ascending order, go to number 2 and mark it. Now scratch every second number till the end of the array. Then go to 3 and mark it. After that scratch every third number. Then go to 4; it is already scratched, so skip it. Repeat this for every next number which is not already scratched.
In the end, the marked numbers are the prime ones. This algorithm is faster, but sometimes needs lots of memory. You can optimize it a little by dropping all even numbers (because they are not prime) and adding 2 manually to the list. This will twist the logic a little, but will take half the memory.
Here is an illustration of what I'm talking about: http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes
Warning: removing elements from a list while iterating over it can be dangerous...
You could make the
    if(thing % divisor == 0) and thing != divisor:
test lighter by splitting it into a loop that stops when you arrive at the index of 'divisor', followed by the test:
    for thing in numberList_fromDivisorOn:
        if(thing % divisor == 0):
            numberList.remove(thing)
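Here is a tiny sketch of why removing during iteration is risky, using a made-up four-element list rather than the question's data. After a remove(), the remaining items shift left, so the loop steps over the element that moved into the freed slot.

```python
mutated = [3, 9, 15, 5]
for thing in mutated:
    if thing % 3 == 0 and thing != 3:
        mutated.remove(thing)  # later items shift left; the loop skips one
print(mutated)                 # [3, 15, 5]: 15 was never examined

# Building a new list instead avoids the problem
original = [3, 9, 15, 5]
filtered = [thing for thing in original if thing % 3 != 0 or thing == 3]
print(filtered)                # [3, 5]
```

In the question's function the outer loop runs over several divisors, so a multiple skipped on one pass may get caught on a later one, but that is not something to rely on.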
This code takes 2 seconds to generate the primes less than 10M
(it is not mine, I found it somewhere on Google)
def erat_sieve(bound):
    if bound < 2:
        return []
    max_ndx = (bound - 1) // 2
    sieve = [True] * (max_ndx + 1)
    #loop up to square root
    for ndx in range(int(bound ** 0.5) // 2):
        # check for prime
        if sieve[ndx]:
            # unmark all odd multiples of the prime
            num = ndx * 2 + 3
            sieve[ndx+num:max_ndx:num] = [False] * ((max_ndx-ndx-num-1)//num + 1)
    # translate into numbers
    return [2] + [ndx * 2 + 3 for ndx in range(max_ndx) if sieve[ndx]]
I followed this link: Sieve of Eratosthenes - Finding Primes Python, as suggested by @MAK, and I've found that the accepted answer could be improved with an idea I've found in your code:
def primes_sieve2(limit):
    a = [True] * limit # Initialize the primality list
    a[0] = a[1] = False
    sqrt = int(math.sqrt(limit))+1
    for i in xrange(sqrt):
        isprime = a[i]
        if isprime:
            yield i
            for n in xrange(i*i, limit, i): # Mark factors non-prime
                a[n] = False
    for (i, isprime) in enumerate(a[sqrt:]):
        if isprime:
            yield i+sqrt
If given unlimited memory and time, the following code will print all the prime numbers, and it will do so without using trial division. It is based on the Haskell code in the paper The Genuine Sieve of Eratosthenes by Melissa E. O'Neill.
from heapq import heappush, heappop, heapreplace
def sieve():
    w = [2,4,2,4,6,2,6,4,2,4,6,6,2,6,4,2,6,4,6,8,4,2,4,2,4,8,6,4,6,2,4,6,2,6,6,4,2,4,6,2,6,4,2,4,2,10,2,10]
    for p in [2,3,5,7]: print p
    n,o = 11,0
    t = []
    l = len(w)
    p = n
    heappush(t, (p*p, n,o,p))
    print p
    while True:
        n,o = n+w[o],(o+1)%l
        p = n
        if not t[0][0] <= p:
            heappush(t, (p*p, n,o,p))
            print p
            continue
        while t[0][0] <= p:
            _, b,c,d = t[0]
            b,c = b+w[c],(c+1)%l
            heapreplace(t, (b*d, b,c,d))
sieve()
