Sieve of Eratosthenes - Finding Primes Python - python

Just to clarify, this is not a homework problem :)
I wanted to find primes for a math application I am building & came across Sieve of Eratosthenes approach.
I have written an implementation of it in Python. But it's terribly slow. For say, if I want to find all primes less than 2 million. It takes > 20 mins. (I stopped it at this point). How can I speed this up?
def primes_sieve(limit):
limitn = limit+1
primes = range(2, limitn)
for i in primes:
factors = range(i, limitn, i)
for f in factors[1:]:
if f in primes:
primes.remove(f)
return primes
print primes_sieve(2000)
UPDATE:
I ended up doing profiling on this code & found that quite a lot of time was spent on removing an element from the list. Quite understandable considering it has to traverse the entire list (worst-case) to find the element & then remove it and then readjust the list (maybe some copy goes on?). Anyway, I chucked out list for dictionary. My new implementation -
def primes_sieve1(limit):
limitn = limit+1
primes = dict()
for i in range(2, limitn): primes[i] = True
for i in primes:
factors = range(i,limitn, i)
for f in factors[1:]:
primes[f] = False
return [i for i in primes if primes[i]==True]
print primes_sieve1(2000000)

You're not quite implementing the correct algorithm:
In your first example, primes_sieve doesn't maintain a list of primality flags to strike/unset (as in the algorithm), but instead resizes a list of integers continuously, which is very expensive: removing an item from a list requires shifting all subsequent items down by one.
In the second example, primes_sieve1 maintains a dictionary of primality flags, which is a step in the right direction, but it iterates over the dictionary in undefined order, and redundantly strikes out factors of factors (instead of only factors of primes, as in the algorithm). You could fix this by sorting the keys, and skipping non-primes (which already makes it an order of magnitude faster), but it's still much more efficient to just use a list directly.
The correct algorithm (with a list instead of a dictionary) looks something like:
def primes_sieve2(limit):
a = [True] * limit # Initialize the primality list
a[0] = a[1] = False
for (i, isprime) in enumerate(a):
if isprime:
yield i
for n in range(i*i, limit, i): # Mark factors non-prime
a[n] = False
(Note that this also includes the algorithmic optimization of starting the non-prime marking at the prime's square (i*i) instead of its double.)

def eratosthenes(n):
multiples = []
for i in range(2, n+1):
if i not in multiples:
print (i)
for j in range(i*i, n+1, i):
multiples.append(j)
eratosthenes(100)

Removing from the beginning of an array (list) requires moving all of the items after it down. That means that removing every element from a list in this way starting from the front is an O(n^2) operation.
You can do this much more efficiently with sets:
def primes_sieve(limit):
limitn = limit+1
not_prime = set()
primes = []
for i in range(2, limitn):
if i in not_prime:
continue
for f in range(i*2, limitn, i):
not_prime.add(f)
primes.append(i)
return primes
print primes_sieve(1000000)
... or alternatively, avoid having to rearrange the list:
def primes_sieve(limit):
limitn = limit+1
not_prime = [False] * limitn
primes = []
for i in range(2, limitn):
if not_prime[i]:
continue
for f in xrange(i*2, limitn, i):
not_prime[f] = True
primes.append(i)
return primes

Much faster:
import time
def get_primes(n):
m = n+1
#numbers = [True for i in range(m)]
numbers = [True] * m #EDIT: faster
for i in range(2, int(n**0.5 + 1)):
if numbers[i]:
for j in range(i*i, m, i):
numbers[j] = False
primes = []
for i in range(2, m):
if numbers[i]:
primes.append(i)
return primes
start = time.time()
primes = get_primes(10000)
print(time.time() - start)
print(get_primes(100))

I realise this isn't really answering the question of how to generate primes quickly, but perhaps some will find this alternative interesting: because python provides lazy evaluation via generators, eratosthenes' sieve can be implemented exactly as stated:
def intsfrom(n):
while True:
yield n
n += 1
def sieve(ilist):
p = next(ilist)
yield p
for q in sieve(n for n in ilist if n%p != 0):
yield q
try:
for p in sieve(intsfrom(2)):
print p,
print ''
except RuntimeError as e:
print e
The try block is there because the algorithm runs until it blows the stack and without the
try block the backtrace is displayed pushing the actual output you want to see off screen.

By combining contributions from many enthusiasts (including Glenn Maynard and MrHIDEn from above comments), I came up with following piece of code in python 2:
def simpleSieve(sieveSize):
#creating Sieve.
sieve = [True] * (sieveSize+1)
# 0 and 1 are not considered prime.
sieve[0] = False
sieve[1] = False
for i in xrange(2,int(math.sqrt(sieveSize))+1):
if sieve[i] == False:
continue
for pointer in xrange(i**2, sieveSize+1, i):
sieve[pointer] = False
# Sieve is left with prime numbers == True
primes = []
for i in xrange(sieveSize+1):
if sieve[i] == True:
primes.append(i)
return primes
sieveSize = input()
primes = simpleSieve(sieveSize)
Time taken for computation on my machine for different inputs in power of 10 is:
3 : 0.3 ms
4 : 2.4 ms
5 : 23 ms
6 : 0.26 s
7 : 3.1 s
8 : 33 s

Using a bit of numpy, I could find all primes below 100 million in a little over 2 seconds.
There are two key features one should note
Cut out multiples of i only for i up to root of n
Setting multiples of i to False using x[2*i::i] = False is much faster than an explicit python for loop.
These two significantly speed up your code. For limits below one million, there is no perceptible running time.
import numpy as np
def primes(n):
x = np.ones((n+1,), dtype=np.bool)
x[0] = False
x[1] = False
for i in range(2, int(n**0.5)+1):
if x[i]:
x[2*i::i] = False
primes = np.where(x == True)[0]
return primes
print(len(primes(100_000_000)))

A simple speed hack: when you define the variable "primes," set the step to 2 to skip all even numbers automatically, and set the starting point to 1.
Then you can further optimize by instead of for i in primes, use for i in primes[:round(len(primes) ** 0.5)]. That will dramatically increase performance. In addition, you can eliminate numbers ending with 5 to further increase speed.

My implementation:
import math
n = 100
marked = {}
for i in range(2, int(math.sqrt(n))):
if not marked.get(i):
for x in range(i * i, n, i):
marked[x] = True
for i in range(2, n):
if not marked.get(i):
print i

Here's a version that's a bit more memory-efficient (and: a proper sieve, not trial divisions). Basically, instead of keeping an array of all the numbers, and crossing out those that aren't prime, this keeps an array of counters - one for each prime it's discovered - and leap-frogging them ahead of the putative prime. That way, it uses storage proportional to the number of primes, not up to to the highest prime.
import itertools
def primes():
class counter:
def __init__ (this, n): this.n, this.current, this.isVirgin = n, n*n, True
# isVirgin means it's never been incremented
def advancePast (this, n): # return true if the counter advanced
if this.current > n:
if this.isVirgin: raise StopIteration # if this is virgin, then so will be all the subsequent counters. Don't need to iterate further.
return False
this.current += this.n # pre: this.current == n; post: this.current > n.
this.isVirgin = False # when it's gone, it's gone
return True
yield 1
multiples = []
for n in itertools.count(2):
isPrime = True
for p in (m.advancePast(n) for m in multiples):
if p: isPrime = False
if isPrime:
yield n
multiples.append (counter (n))
You'll note that primes() is a generator, so you can keep the results in a list or you can use them directly. Here's the first n primes:
import itertools
for k in itertools.islice (primes(), n):
print (k)
And, for completeness, here's a timer to measure the performance:
import time
def timer ():
t, k = time.process_time(), 10
for p in primes():
if p>k:
print (time.process_time()-t, " to ", p, "\n")
k *= 10
if k>100000: return
Just in case you're wondering, I also wrote primes() as a simple iterator (using __iter__ and __next__), and it ran at almost the same speed. Surprised me too!

I prefer NumPy because of speed.
import numpy as np
# Find all prime numbers using Sieve of Eratosthenes
def get_primes1(n):
m = int(np.sqrt(n))
is_prime = np.ones(n, dtype=bool)
is_prime[:2] = False # 0 and 1 are not primes
for i in range(2, m):
if is_prime[i] == False:
continue
is_prime[i*i::i] = False
return np.nonzero(is_prime)[0]
# Find all prime numbers using brute-force.
def isprime(n):
''' Check if integer n is a prime '''
n = abs(int(n)) # n is a positive integer
if n < 2: # 0 and 1 are not primes
return False
if n == 2: # 2 is the only even prime number
return True
if not n & 1: # all other even numbers are not primes
return False
# Range starts with 3 and only needs to go up the square root
# of n for all odd numbers
for x in range(3, int(n**0.5)+1, 2):
if n % x == 0:
return False
return True
# To apply a function to a numpy array, one have to vectorize the function
def get_primes2(n):
vectorized_isprime = np.vectorize(isprime)
a = np.arange(n)
return a[vectorized_isprime(a)]
Check the output:
n = 100
print(get_primes1(n))
print(get_primes2(n))
[ 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97]
[ 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97]
Compare the speed of Sieve of Eratosthenes and brute-force on Jupyter Notebook. Sieve of Eratosthenes in 539 times faster than brute-force for million elements.
%timeit get_primes1(1000000)
%timeit get_primes2(1000000)
4.79 ms ± 90.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
2.58 s ± 31.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

I figured it must be possible to simply use the empty list as the terminating condition for the loop and came up with this:
limit = 100
ints = list(range(2, limit)) # Will end up empty
while len(ints) > 0:
prime = ints[0]
print prime
ints.remove(prime)
i = 2
multiple = prime * i
while multiple <= limit:
if multiple in ints:
ints.remove(multiple)
i += 1
multiple = prime * i

import math
def sieve(n):
primes = [True]*n
primes[0] = False
primes[1] = False
for i in range(2,int(math.sqrt(n))+1):
j = i*i
while j < n:
primes[j] = False
j = j+i
return [x for x in range(n) if primes[x] == True]

i think this is shortest code for finding primes with eratosthenes method
def prime(r):
n = range(2,r)
while len(n)>0:
yield n[0]
n = [x for x in n if x not in range(n[0],r,n[0])]
print(list(prime(r)))

The fastest implementation I could come up with:
isprime = [True]*N
isprime[0] = isprime[1] = False
for i in range(4, N, 2):
isprime[i] = False
for i in range(3, N, 2):
if isprime[i]:
for j in range(i*i, N, 2*i):
isprime[j] = False

I just came up with this. It may not be the fastest, but I'm not using anything other than straight additions and comparisons. Of course, what stops you here is the recursion limit.
def nondivsby2():
j = 1
while True:
j += 2
yield j
def nondivsbyk(k, nondivs):
j = 0
for i in nondivs:
while j < i:
j += k
if j > i:
yield i
def primes():
nd = nondivsby2()
while True:
p = next(nd)
nd = nondivsbyk(p, nd)
yield p
def main():
for p in primes():
print(p)

I made a one liner version of the Sieve of Eratosthenes
sieve = lambda j: [print(x) for x in filter(lambda n: 0 not in map(lambda i: n % i, range(2, n)) and (n!=1)&(n!=0), range(j + 1))]
In terms of performance, I am pretty sure this isn't the fastest thing by any means, and in terms of readability / following PEP8, this is pretty terrible, but it's more the novelty of the length than anything.
EDIT: Note that this simply prints the sieve & does not return (if you attempt to print it you will get a list of Nones, if you want to return, change the print(x) in the list comprehension to just "x".

not sure if my code is efficeient, anyone care to comment?
from math import isqrt
def isPrime(n):
if n >= 2: # cheating the 2, is 2 even prime?
for i in range(3, int(n / 2 + 1),2): # dont waste time with even numbers
if n % i == 0:
return False
return True
def primesTo(n):
x = [2] if n >= 2 else [] # cheat the only even prime
if n >= 2:
for i in range(3, n + 1,2): # dont waste time with even numbers
if isPrime(i):
x.append(i)
return x
def primes2(n): # trying to do this using set methods and the "Sieve of Eratosthenes"
base = {2} # again cheating the 2
base.update(set(range(3, n + 1, 2))) # build the base of odd numbers
for i in range(3, isqrt(n) + 1, 2): # apply the sieve
base.difference_update(set(range(2 * i, n + 1 , i)))
return list(base)
print(primesTo(10000)) # 2 different methods for comparison
print(primes2(10000))

Probably the quickest way to have primary numbers is the following:
import sympy
list(sympy.primerange(lower, upper+1))
In case you don't need to store them, just use the code above without conversion to the list. sympy.primerange is a generator, so it does not consume memory.

Using recursion and walrus operator:
def prime_factors(n):
for i in range(2, int(n ** 0.5) + 1):
if (q_r := divmod(n, i))[1] == 0:
return [i] + factor_list(q_r[0])
return [n]

Basic sieve
with numpy is amazing fast. May be the fastest implementation
# record: sieve 1_000_000_000 in 6.9s (core i7 - 2.6Ghz)
def sieve_22max_naive(bound):
sieve = np.ones(bound, dtype=bool) # default all prime
sieve[:2] = False # 0, 1 is not prime
sqrt_bound = math.ceil(math.sqrt(bound))
for i in range(2, sqrt_bound):
if sieve[i]:
inc = i if i == 2 else 2 * i
sieve[i * i:bound:inc] = False
return np.arange(bound)[sieve]
if __name__ == '__main__':
start = time.time()
prime_list = sieve_22max_naive(1_000_000_000)
print(f'Count: {len(prime_list):,}\n'
f'Greatest: {prime_list[-1]:,}\n'
f'Elapsed: %.3f' % (time.time() - start))
Segment sieve (use less memory)
# find prime in range [from..N), base on primes in range [2..from)
def sieve_era_part(primes, nfrom, n):
sieve_part = np.ones(n - nfrom, dtype=bool) # default all prime
limit = math.ceil(math.sqrt(n))
# [2,3,5,7,11...p] can find primes < (p+2)^2
if primes[-1] < limit - 2:
print(f'Not enough base primes to find up to {n:,}')
return
for p in primes:
if p >= limit: break
mul = p * p
inc = p * (2 if p > 2 else 1)
if mul < nfrom:
mul = math.ceil(nfrom / p) * p
(mul := mul + p) if p > 2 and (mul & 1) == 0 else ... # odd, not even
sieve_part[mul - nfrom::inc] = False
return np.arange(nfrom, n)[sieve_part]
# return np.where(sieve_part)[0] + nfrom
# return [i + nfrom for i, is_p in enumerate(sieve_part) if is_p]
# return [i for i in range(max(nfrom, 2), n) if sieve_part[i - nfrom]]
# find nth prime number, use less memory,
# extend bound to SEG_SIZE each loop
# record: 50_847_534 nth prime in 6.78s, core i7 - 9850H 2.6GHhz
def nth_prime(n):
# find prime up to bound
bound = 500_000
primes = sieve_22max_naive(bound)
SEG_SIZE = int(50e6)
while len(primes) < n:
# sieve for next segment
new_primes = sieve_era_part(primes, bound, bound + SEG_SIZE)
# extend primes
bound += SEG_SIZE
primes = np.append(primes, new_primes)
return primes[n - 1]
if __name__ == '__main__':
start = time.time()
prime = nth_prime(50_847_534)
print(f'{prime:,} Time %.6f' % (time.time() - start))

here is my solution, the same as Wikipedia
import math
def sieve_of_eratosthenes(n):
a = [i for i in range(2, n+1)]
clone_a = a[:]
b = [i for i in range(2, int(math.sqrt(n))+1)]
for i in b:
if i in a:
c = [pow(i, 2)+(j*i) for j in range(0, n+1)]
for j in c:
if j in clone_a:
clone_a.remove(j)
return clone_a
if __name__ == '__main__':
print(sieve_of_eratosthenes(23))

Related

Python program performance and memory management

I've written primality test program in python but I'm certain of big issues. One being that program execution becomes very slow if the number to test is extremely big, I've used a generator. And it could be that the program utilizes a huge memory space. Is there any way the program can be optimized?
#define function
def isprime(n):
return next((False for i in range(2,n) if n%i == 0), (n > 1))
#test it
nums = [i for i in range(101) if isprime(i)]
print(nums)
As indicated in the comments, the fastest way to find all primes below a certain number is the Sieve Of Eratosthenes method. Checked on my machine, this algorithm is about 1000 times faster than the above one.
A nice comparison between different methods can be found here: analysis-different-methods-find-prime-number-python
Just in case the external link will change in the future, I'm putting here the code:
# Python Program to find prime numbers in a range
import time
primes = []
def SieveOfEratosthenes(n):
# Create a boolean array "prime[0..n]" and
# initialize all entries it as true. A value
# in prime[i] will finally be false if i is
# Not a prime, else true.
prime = [True for i in range(n + 1)]
p = 2
while (p * p <= n):
# If prime[p] is not changed, then it is
# a prime
if (prime[p] == True):
# Update all multiples of p
for i in range(p * p, n + 1, p):
prime[i] = False
p += 1
c = 0
for p in range(2, n):
if prime[p]:
primes.append(p)
c += 1
return c
t0 = time.time()
c = SieveOfEratosthenes(100000)
print("Total prime numbers in range:", c)
print("Primes: ", primes)
t1 = time.time()
print("Time required:", t1 - t0)

Prime factorization - list

I am trying to implement a function primeFac() that takes as input a positive integer n and returns a list containing all the numbers in the prime factorization of n.
I have gotten this far but I think it would be better to use recursion here, not sure how to create a recursive code here, what would be the base case? to start with.
My code:
def primes(n):
primfac = []
d = 2
while (n > 1):
if n%d==0:
primfac.append(d)
# how do I continue from here... ?
A simple trial division:
def primes(n):
primfac = []
d = 2
while d*d <= n:
while (n % d) == 0:
primfac.append(d) # supposing you want multiple factors repeated
n //= d
d += 1
if n > 1:
primfac.append(n)
return primfac
with O(sqrt(n)) complexity (worst case). You can easily improve it by special-casing 2 and looping only over odd d (or special-casing more small primes and looping over fewer possible divisors).
The primefac module does factorizations with all the fancy techniques mathematicians have developed over the centuries:
#!python
import primefac
import sys
n = int( sys.argv[1] )
factors = list( primefac.primefac(n) )
print '\n'.join(map(str, factors))
This is a comprehension based solution, it might be the closest you can get to a recursive solution in Python while being possible to use for large numbers.
You can get proper divisors with one line:
divisors = [ d for d in xrange(2,int(math.sqrt(n))) if n % d == 0 ]
then we can test for a number in divisors to be prime:
def isprime(d): return all( d % od != 0 for od in divisors if od != d )
which tests that no other divisors divides d.
Then we can filter prime divisors:
prime_divisors = [ d for d in divisors if isprime(d) ]
Of course, it can be combined in a single function:
def primes(n):
divisors = [ d for d in range(2,n//2+1) if n % d == 0 ]
return [ d for d in divisors if \
all( d % od != 0 for od in divisors if od != d ) ]
Here, the \ is there to break the line without messing with Python indentation.
I've tweaked #user448810's answer to use iterators from itertools (and python3.4, but it should be back-portable). The solution is about 15% faster.
import itertools
def factors(n):
f = 2
increments = itertools.chain([1,2,2], itertools.cycle([4,2,4,2,4,6,2,6]))
for incr in increments:
if f*f > n:
break
while n % f == 0:
yield f
n //= f
f += incr
if n > 1:
yield n
Note that this returns an iterable, not a list. Wrap it in list() if that's what you want.
Most of the above solutions appear somewhat incomplete. A prime factorization would repeat each prime factor of the number (e.g. 9 = [3 3]).
Also, the above solutions could be written as lazy functions for implementation convenience.
The use sieve Of Eratosthenes to find primes to test is optimal, but; the above implementation used more memory than necessary.
I'm not certain if/how "wheel factorization" would be superior to applying only prime factors, for division tests of n.
While these solution are indeed helpful, I'd suggest the following two functions -
Function-1 :
def primes(n):
if n < 2: return
yield 2
plist = [2]
for i in range(3,n):
test = True
for j in plist:
if j>n**0.5:
break
if i%j==0:
test = False
break
if test:
plist.append(i)
yield i
Function-2 :
def pfactors(n):
for p in primes(n):
while n%p==0:
yield p
n=n//p
if n==1: return
list(pfactors(99999))
[3, 3, 41, 271]
3*3*41*271
99999
list(pfactors(13290059))
[3119, 4261]
3119*4261
13290059
Here is my version of factorization by trial division, which incorporates the optimization of dividing only by two and the odd integers proposed by Daniel Fischer:
def factors(n):
f, fs = 3, []
while n % 2 == 0:
fs.append(2)
n /= 2
while f * f <= n:
while n % f == 0:
fs.append(f)
n /= f
f += 2
if n > 1: fs.append(n)
return fs
An improvement on trial division by two and the odd numbers is wheel factorization, which uses a cyclic set of gaps between potential primes to greatly reduce the number of trial divisions. Here we use a 2,3,5-wheel:
def factors(n):
gaps = [1,2,2,4,2,4,2,4,6,2,6]
length, cycle = 11, 3
f, fs, nxt = 2, [], 0
while f * f <= n:
while n % f == 0:
fs.append(f)
n /= f
f += gaps[nxt]
nxt += 1
if nxt == length:
nxt = cycle
if n > 1: fs.append(n)
return fs
Thus, print factors(13290059) will output [3119, 4261]. Factoring wheels have the same O(sqrt(n)) time complexity as normal trial division, but will be two or three times faster in practice.
I've done a lot of work with prime numbers at my blog. Please feel free to visit and study.
def get_prime_factors(number):
"""
Return prime factor list for a given number
number - an integer number
Example: get_prime_factors(8) --> [2, 2, 2].
"""
if number == 1:
return []
# We have to begin with 2 instead of 1 or 0
# to avoid the calls infinite or the division by 0
for i in xrange(2, number):
# Get remainder and quotient
rd, qt = divmod(number, i)
if not qt: # if equal to zero
return [i] + get_prime_factors(rd)
return [number]
Most of the answer are making things too complex. We can do this
def prime_factors(n):
num = []
#add 2 to list or prime factors and remove all even numbers(like sieve of ertosthenes)
while(n%2 == 0):
num.append(2)
n /= 2
#divide by odd numbers and remove all of their multiples increment by 2 if no perfectlly devides add it
for i in xrange(3, int(sqrt(n))+1, 2):
while (n%i == 0):
num.append(i)
n /= i
#if no is > 2 i.e no is a prime number that is only divisible by itself add it
if n>2:
num.append(n)
print (num)
Algorithm from GeeksforGeeks
prime factors of a number:
def primefactors(x):
factorlist=[]
loop=2
while loop<=x:
if x%loop==0:
x//=loop
factorlist.append(loop)
else:
loop+=1
return factorlist
x = int(input())
alist=primefactors(x)
print(alist)
You'll get the list.
If you want to get the pairs of prime factors of a number try this:
http://pythonplanet.blogspot.in/2015/09/list-of-all-unique-pairs-of-prime.html
def factorize(n):
for f in range(2,n//2+1):
while n%f == 0:
n //= f
yield f
It's slow but dead simple. If you want to create a command-line utility, you could do:
import sys
[print(i) for i in factorize(int(sys.argv[1]))]
Here is an efficient way to accomplish what you need:
def prime_factors(n):
l = []
if n < 2: return l
if n&1==0:
l.append(2)
while n&1==0: n>>=1
i = 3
m = int(math.sqrt(n))+1
while i < m:
if n%i==0:
l.append(i)
while n%i==0: n//=i
i+= 2
m = int(math.sqrt(n))+1
if n>2: l.append(n)
return l
prime_factors(198765430488765430290) = [2, 3, 5, 7, 11, 13, 19, 23, 3607, 3803, 52579]
You can use sieve Of Eratosthenes to generate all the primes up to (n/2) + 1 and then use a list comprehension to get all the prime factors:
def rwh_primes2(n):
# http://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
""" Input n>=6, Returns a list of primes, 2 <= p < n """
correction = (n%6>1)
n = {0:n,1:n-1,2:n+4,3:n+3,4:n+2,5:n+1}[n%6]
sieve = [True] * (n/3)
sieve[0] = False
for i in xrange(int(n**0.5)/3+1):
if sieve[i]:
k=3*i+1|1
sieve[ ((k*k)/3) ::2*k]=[False]*((n/6-(k*k)/6-1)/k+1)
sieve[(k*k+4*k-2*k*(i&1))/3::2*k]=[False]*((n/6-(k*k+4*k-2*k*(i&1))/6-1)/k+1)
return [2,3] + [3*i+1|1 for i in xrange(1,n/3-correction) if sieve[i]]
def primeFacs(n):
primes = rwh_primes2((n/2)+1)
return [x for x in primes if n%x == 0]
print primeFacs(99999)
#[3, 41, 271]
from sets import Set
# this function generates all the possible factors of a required number x
def factors_mult(X):
L = []
[L.append(i) for i in range(2,X) if X % i == 0]
return L
# this function generates list containing prime numbers upto the required number x
def prime_range(X):
l = [2]
for i in range(3,X+1):
for j in range(2,i):
if i % j == 0:
break
else:
l.append(i)
return l
# This function computes the intersection of the two lists by invoking Set from the sets module
def prime_factors(X):
y = Set(prime_range(X))
z = Set(factors_mult(X))
k = list(y & z)
k = sorted(k)
print "The prime factors of " + str(X) + " is ", k
# for eg
prime_factors(356)
Simple way to get the desired solution
def Factor(n):
d = 2
factors = []
while n >= d*d:
if n % d == 0:
n//=d
# print(d,end = " ")
factors.append(d)
else:
d = d+1
if n>1:
# print(int(n))
factors.append(n)
return factors
This is the code I made. It works fine for numbers with small primes, but it takes a while for numbers with primes in the millions.
def pfactor(num):
div = 2
pflist = []
while div <= num:
if num % div == 0:
pflist.append(div)
num /= div
else:
div += 1
# The stuff afterwards is just to convert the list of primes into an expression
pfex = ''
for item in list(set(pflist)):
pfex += str(item) + '^' + str(pflist.count(item)) + ' * '
pfex = pfex[0:-3]
return pfex
I would like to share my code for finding the prime factors of number given input by the user:
a = int(input("Enter a number: "))
def prime(a):
b = list()
i = 1
while i<=a:
if a%i ==0 and i!=1 and i!=a:
b.append(i)
i+=1
return b
c = list()
for x in prime(a):
if len(prime(x)) == 0:
c.append(x)
print(c)
def prime_factors(num, dd=2):
while dd <= num and num>1:
if num % dd == 0:
num //= dd
yield dd
dd +=1
Lot of answers above fail on small primes, e.g. 3, 5 and 7. The above is succinct and fast enough for ordinary use.
print list(prime_factors(3))
[3]

Python Eratosthenes Sieve Algorithm Optimization

I'm attempting to implement the Sieve of Eratosthenes. The output seems to be correct (minus "2" that needs to be added) but if the input to the function is larger than 100k or so it seems to take an inordinate amount of time. What are ways that I can optimize this function?
def sieveErato(n):
numberList = range(3,n,2)
for item in range(int(math.sqrt(len(numberList)))):
divisor = numberList[item]
for thing in numberList:
if(thing % divisor == 0) and thing != divisor:
numberList.remove(thing)
return numberList
Your algorithm is not the Sieve of Eratosthenes. You perform trial division (the modulus operator) instead of crossing-off multiples, as Eratosthenes did over two thousand years ago. Here is an explanation of the true sieving algorithm, and shown below is my simple, straight forward implementation, which returns a list of primes not exceeding n:
def sieve(n):
m = (n-1) // 2
b = [True]*m
i,p,ps = 0,3,[2]
while p*p < n:
if b[i]:
ps.append(p)
j = 2*i*i + 6*i + 3
while j < m:
b[j] = False
j = j + 2*i + 3
i+=1; p+=2
while i < m:
if b[i]:
ps.append(p)
i+=1; p+=2
return ps
We sieve only on the odd numbers, stopping at the square root of n. The odd-looking calculations on j map between the integers being sieved 3, 5, 7, 9, ... and indexes 0, 1, 2, 3, ... in the b array of bits.
You can see this function in action at http://ideone.com/YTaMB, where it computes the primes to a million in less than a second.
You can try the same way Eratosthenes did. Take an array with all numbers you need to check order ascending, go to number 2 and mark it. Now scratch every second number till the end of the array. Then go to 3 and mark it. After that scratch every third number . Then go to 4 - it is already scratched, so skip it. Repeat this for every n+1 which is not already scratched.
In the end, the marked numbers are the prime one. This algorithm is faster, but sometimes need lots of memory. You can optimize it a little by drop all even numbers (cause they are not prime) and add 2 manually to the list. This will twist the logic a little, but will take half the memory.
Here is an illustration of what I'm talking about: http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes
Warning: removing elements from an iterator while iterating on it can be dengerous...
You could make the
if(thing % divisor == 0) and thing != divisor:
test lighter by splitting it in the loop that breaks when you arrive to the index of 'divisor' and then the test:
for thing in numberList_fromDivisorOn:
if(thing % divisor == 0):
numberList.remove(thing)
This code takes 2 seconds to generate primes less than 10M
(it is not mine, i found it somewer on google)
def erat_sieve(bound):
if bound < 2:
return []
max_ndx = (bound - 1) // 2
sieve = [True] * (max_ndx + 1)
#loop up to square root
for ndx in range(int(bound ** 0.5) // 2):
# check for prime
if sieve[ndx]:
# unmark all odd multiples of the prime
num = ndx * 2 + 3
sieve[ndx+num:max_ndx:num] = [False] * ((max_ndx-ndx-num-1)//num + 1)
# translate into numbers
return [2] + [ndx * 2 + 3 for ndx in range(max_ndx) if sieve[ndx]]
I followed this link: Sieve of Eratosthenes - Finding Primes Python as suggested by #MAK and I've found that the accepted answer could be improved with an idea I've found in your code:
def primes_sieve2(limit):
a = [True] * limit # Initialize the primality list
a[0] = a[1] = False
sqrt = int(math.sqrt(limit))+1
for i in xrange(sqrt):
isprime = a[i]
if isprime:
yield i
for n in xrange(i*i, limit, i): # Mark factors non-prime
a[n] = False
for (i, isprime) in enumerate(a[sqrt:]):
if isprime:
yield i+sqrt
if given unlimited memory and time, the following code will print all the prime numbers. and it'll do it without using trial division. it is based on the haskell code in the paper: The Genuine Sieve of Eratosthenes by Melissa E. O'Neill
from heapq import heappush, heappop, heapreplace
def sieve():
w = [2,4,2,4,6,2,6,4,2,4,6,6,2,6,4,2,6,4,6,8,4,2,4,2,4,8,6,4,6,2,4,6,2,6,6,4,2,4,6,2,6,4,2,4,2,10,2,10]
for p in [2,3,5,7]: print p
n,o = 11,0
t = []
l = len(w)
p = n
heappush(t, (p*p, n,o,p))
print p
while True:
n,o = n+w[o],(o+1)%l
p = n
if not t[0][0] <= p:
heappush(t, (p*p, n,o,p))
print p
continue
while t[0][0] <= p:
_, b,c,d = t[0]
b,c = b+w[c],(c+1)%l
heapreplace(t, (b*d, b,c,d))
sieve()

How can I get this Python code to run more quickly? [Project Euler Problem #7]

I'm trying to complete this Project Euler challenge:
By listing the first six prime numbers: 2, 3, 5, 7, 11, and 13, we can
see that the 6th prime is 13.
What is the 10 001st prime number?
My code seem to be right because it works with small numbers, e.g 6th prime is 13.
How can i improve it so that the code will run much more quickly for larger numbers such as 10 001.
Code is below:
#Checks if a number is a prime
def is_prime(n):
count = 0
for i in range(2, n):
if n%i == 0:
return False
break
else:
count += 1
if count == n-2:
return True
#Finds the value for the given nth term
def term(n):
x = 0
count = 0
while count != n:
x += 1
if is_prime(x) == True:
count += 1
print x
term(10001)
UPDATE:
Thanks for your responses. I should have been more clear, I am not looking to speed up the interpreter or finding a faster interpreter, because i know my code isn't great, so i was looking for ways of make my code more efficient.
A few questions to ponder:
Do you really need to check the division until n-1? How earlier can you stop?
Apart from 2, do you really need to check the division by all the multiples of two ?
What about the multiples of 3? 5? Is there a way to extend this idea to all the multiples of previously tested primes?
The purpose of Project Euler is not really to think learn programming, but to think about algorithms. On problem #10, your algorithm will need to be even faster than on #7, etc. etc. So you need to come up with a better way to find prime numbers, not a faster way to run Python code. People solve these problems under the time limit with far slower computers that you're using now by thinking about the math.
On that note, maybe ask about your prime number algorithm on https://math.stackexchange.com/ if you really need help thinking about the problem.
A faster interpreter won't cut it. Even an implementation written in C or assembly language won't be fast enough (to be in the "about one second" timeframe of project Euler). To put it bluntly, your algorithm is pathetic. Some research and thinking will help you write an algorithm that runs faster in a dog-slow interpreter than your current algorithm implemented in native code (I won't name any specifics, partly because that's your job and partly because I can't tell offhand how much optimization will be needed).
Many of the Euler problems (including this one) are designed to have a solution that computes in acceptable time on pretty much any given hardware and compiler (well, not INTERCAL on a PDP-11 maybe).
You algorithm works, but it has quadratic complexity. Using a faster interpreter will give you a linear performance boost, but the quadratic complexity will dwarf it long before you calculate 10,000 primes. There are algorithms with much lower complexity; find them (or google them, no shame in that and you'll still learn a lot) and implement them.
Without discussing your algorithm, the PyPy interpreter can be ridiculously faster than the normal CPython one for tight numerical computation like this. You might want to try it out.
to check the prime number you dont have to run till n-1 or n/2....
To run it more faster,you can check only until square root of n
And this is the fastest algorithm I know
def isprime(number):
if number<=1:
return False
if number==2:
return True
if number%2==0:
return False
for i in range(3,int(sqrt(number))+1):
if number%i==0:
return False
return True
As most people have said, it's all about coming up with the correct algorithm. Have you considered looking at a
Sieve of Eratosthenes
import time
t=time.time()
def n_th_prime(n):
b=[]
b.append(2)
while len(b)<n :
for num in range(3,n*11,2):
if all(num%i!=0 for i in range(2,int((num)**0.5)+1)):
b.append(num)
print list(sorted(b))[n-1]
n_th_prime(10001)
print time.time()-t
prints
104743
0.569000005722 second
A pythonic Answer
import time
t=time.time()
def prime_bellow(n):
b=[]
num=2
j=0
b.append(2)
while len(b)-1<n:
if all(num%i!=0 for i in range(2,int((num)**0.5)+1)):
b.append(num)
num += 1
print b[n]
prime_bellow(10001)
print time.time()-t
Prints
104743
0.702000141144 second
import math
count = 0 <br/> def is_prime(n):
if n % 2 == 0 and n > 2:
return False
for i in range(3, int(math.sqrt(n)) + 1, 2):
if n % i == 0:
return False
return True
for i in range(2,2000000):
if is_prime(i):
count += 1
if count == 10001:
print i
break
I approached it a different way. We know that all multiples of 2 are not going to be prime (except 2) we also know that all non-prime numbers can be broken down to prime constituents.
i.e.
12 = 3 x 4 = 3 x 2 x 2
30 = 5 x 6 = 5 x 3 x 2
Therefore I iterated through a list of odd numbers, accumulating a list of primes, and only attempting to find the modulus of the odd numbers with primes in this list.
#First I create a helper method to determine if it's a prime that
#iterates through the list of primes I already have
def is_prime(number, list):
for prime in list:
if number % prime == 0:
return False
return True
EDIT: Originally I wrote this recursively, but I think the iterative case is much simpler
def find_10001st_iteratively():
number_of_primes = 0
current_number = 3
list_of_primes = [2]
while number_of_primes <= 10001:
if is_prime(current_number, list_of_primes):
list_of_primes.append(current_number)
number_of_primes += 1
current_number += 2
return current_number
A different quick Python solution:
import math
prime_number = 4 # Because 2 and 3 are already prime numbers
k = 3 # It is the 3rd try after 2 and 3 prime numbers
milestone = 10001
while k <= milestone:
divisible = 0
for i in range(2, int(math.sqrt(prime_number)) + 1):
remainder = prime_number % i
if remainder == 0: #Check if the number is evenly divisible (not prime) by i
divisible += 1
if divisible == 0:
k += 1
prime_number += 1
print(prime_number-1)
import time
t = time.time()
def is_prime(n): #check primes
prime = True
for i in range(2, int(n**0.5)+1):
if n % i == 0:
prime = False
break
return prime
def number_of_primes(n):
prime_list = []
counter = 0
num = 2
prime_list.append(2)
while counter != n:
if is_prime(num):
prime_list.append(num)
counter += 1
num += 1
return prime_list[n]
print(number_of_primes(10001))
print(time.time()-t)
104743
0.6159017086029053
based on the haskell code in the paper: The Genuine Sieve of Eratosthenes by Melissa E. O'Neill
from itertools import cycle, chain, tee, islice
wheel2357 = [2,4,2,4,6,2,6,4,2,4,6,6,2,6,4,2,6,4,6,8,4,2,4,2,4,8,6,4,6,2,4,6,2,6,6,4,2,4,6,2,6,4,2,4,2,10,2,10]
def spin(wheel, n):
for x in wheel:
yield n
n = n + x
import heapq
def insertprime(p,xs,t):
heapq.heappush(t,(p*p,(p*v for v in xs)))
def adjust(t,x):
while True:
n, ns = t[0]
if n <= x:
n, ns = heapq.heappop(t)
heapq.heappush(t, (ns.next(), ns))
else:
break
def sieve(it):
t = []
x = it.next()
yield x
xs0, xs1 = tee(it)
insertprime(x,xs1,t)
it = xs0
while True:
x = it.next()
if t[0][0] <= x:
adjust(t,x)
continue
yield x
xs0, xs1 = tee(it)
insertprime(x,xs1,t)
it = xs0
primes = chain([2,3,5,7], sieve(spin(cycle(wheel2357), 11)))
from time import time
s = time()
print list(islice(primes, 10000, 10001))
e = time()
print "%.8f seconds" % (e-s)
prints:
[104743]
0.18839407 seconds
from itertools import islice
from heapq import heappush, heappop
wheel2357 = [2,4,2,4,6,2,6,4,2,4,6,6,2,6,4,2,6,4,6,8,4,2,4,2,
4,8,6,4,6,2,4,6,2,6,6,4,2,4,6,2,6,4,2,4,2,10,2,10]
class spin(object):
__slots__ = ('wheel','o','n','m')
def __init__(self, wheel, n, o=0, m=1):
self.wheel = wheel
self.o = o
self.n = n
self.m = m
def __iter__(self):
return self
def next(self):
v = self.m*self.n
self.n += self.wheel[self.o]
self.o = (self.o + 1) % len(self.wheel)
return v
def copy(self):
return spin(self.wheel, self.n, self.o, self.m)
def times(self, x):
return spin(self.wheel, self.n, self.o, self.m*x)
def adjust(t,x):
while t[0][0] <= x:
n, ns = heappop(t)
heappush(t, (ns.next(), ns))
def sieve_primes():
for p in [2,3,5,7]:
yield p
it = spin(wheel2357, 11)
t = []
p = it.next()
yield p
heappush(t, (p*p, it.times(p)))
while True:
p = it.next()
if t[0][0] <= p:
adjust(t,p)
continue
yield p
heappush(t, (p*p, it.times(p)))
from time import time
s = time()
print list(islice(sieve_primes(), 10000, 10001))[-1]
e = time()
print "%.8f seconds" % (e-s)
prints:
104743
0.22022200 seconds
import time
from math import sqrt
wheel2357 = [2,4,2,4,6,2,6,4,2,4,6,6,2,6,4,2,6,4,6,8,4,2,4,2,4,8,6,4,6,2,4,6,2,6,6,4,2,4,6,2,6,4,2,4,2,10,2,10]
list_prime = [2,3,5,7]
def isprime(num):
limit = sqrt(num)
for prime in list_prime:
if num % prime == 0: return 0
if prime > limit: break
return 1
def generate_primes(no_of_primes):
o = 0
n = 11
w = wheel2357
l = len(w)
while len(list_prime) < no_of_primes:
i = n
n = n + w[o]
o = (o + 1) % l
if isprime(i):
list_prime.append(i)
t0 = time.time()
generate_primes(10001)
print list_prime[-1] # 104743
t1 = time.time()
print t1-t0 # 0.18 seconds
prints:
104743
0.307313919067

Can this be made more pythonic?

I came across this (really) simple program a while ago. It just outputs the first x primes. I'm embarrassed to ask, is there any way to make it more "pythonic" ie condense it while making it (more) readable? Switching functions is fine; I'm only interested in readability.
Thanks
from math import sqrt
def isprime(n):
if n ==2:
return True
if n % 2 ==0 : # evens
return False
max = int(sqrt(n))+1 #only need to search up to sqrt n
i=3
while i <= max: # range starts with 3 and for odd i
if n % i == 0:
return False
i+=2
return True
reqprimes = int(input('how many primes: '))
primessofar = 0
currentnumber = 2
while primessofar < reqprimes:
result = isprime(currentnumber)
if result:
primessofar+=1
print currentnumber
#print '\n'
currentnumber += 1
Your algorithm itself may be implemented pythonically, but it's often useful to re-write algorithms in a functional way - You might end up with a completely different but more readable solution at all (which is even more pythonic).
def primes(upper):
n = 2; found = []
while n < upper:
# If a number is not divisble through all preceding primes, it's prime
if all(n % div != 0 for div in found):
yield n
found.append( n )
n += 1
Usage:
for pr in primes(1000):
print pr
Or, with Alasdair's comment taken into account, a more efficient version:
from math import sqrt
from itertools import takewhile
def primes(upper):
n = 2; foundPrimes = []
while n < upper:
sqrtN = int(sqrt(n))
# If a number n is not divisble through all preceding primes up to sqrt(n), it's prime
if all(n % div != 0 for div in takewhile(lambda div: div <= sqrtN, foundPrimes)):
yield n
foundPrimes.append(n)
n += 1
The given code is not very efficient. Alternative solution (just as inefficient):†
>>> from math import sqrt
>>> def is_prime(n):
... return all(n % d for d in range(2, int(sqrt(n)) + 1))
...
>>> def primes_up_to(n):
... return filter(is_prime, range(2, n))
...
>>> list(primes_up_to(20))
[2, 3, 5, 7, 11, 13, 17, 19]
This code uses all, range, int, math.sqrt, filter and list. It is not completely identical to your code, as it prints primes up to a certain number, not exactly n primes. For that, you can do:
>>> from itertools import count, islice
>>> def n_primes(n):
... return islice(filter(is_prime, count(2)), n)
...
>>> list(n_primes(10))
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
That introduces another two functions, namely itertools.count and itertools.islice. (That last piece of code works only in Python 3.x; in Python 2.x, use itertools.ifilter instead of filter.)
†: A more efficient method is to use the Sieve of Eratosthenes.
A few minor things from the style guide.
Uses four spaces, not two. (Personally I prefer tabs, but that's not the Pythonic way.)
Fewer blank lines.
Consistent whitespace: n ==2: => n == 2:
Use underscores in your variables names: currentnumber => current_number
Firstly, you should not assign max to a variable as it is an inbuilt function used to find the maximum value from an iterable. Also, that entire section of code can instead be written as
for i in xrange(3, int(sqrt(n))+1, 2):
if n%i==0: return False
Also, instead of defining a new variable result and putting the value returned by isprime into it, you can just directly do
if isprime(currentnumber):
I recently found Project Euler solutions in functional python and it has some really nice examples of working with primes like this. Number 7 is pretty close to your problem:
def isprime(n):
"""Return True if n is a prime number"""
if n < 3:
return (n == 2)
elif n % 2 == 0:
return False
elif any(((n % x) == 0) for x in xrange(3, int(sqrt(n))+1, 2)):
return False
return True
def primes(start=2):
"""Generate prime numbers from 'start'"""
return ifilter(isprime, count(start))
Usually you don't use while loops for simple things like this. You rather create a range object and get the elements from there. So you could rewrite the first loop to this for example:
for i in range( 3, int( sqrt( n ) ) + 1, 2 ):
if n % i == 0:
return False
And it would be a lot better if you would cache your prime numbers and only check the previous prime numbers when checking a new number. You can save a lot time by that (and easily calculate larger prime numbers this way). Here is some code I wrote before to get all prime numbers up to n easily:
def primeNumbers ( end ):
primes = []
primes.append( 2 )
for i in range( 3, end, 2 ):
isPrime = True
for j in primes:
if i % j == 0:
isPrime = False
break
if isPrime:
primes.append( i )
return primes
print primeNumbers( 20 )
Translated from the brilliant guys at stacktrace.it (Daniele Varrazzo, specifically), this version takes advantage of a binary min-heap to solve this problem:
from heapq import heappush, heapreplace
def yield_primes():
"""Endless prime number generator."""
# Yield 2, so we don't have to handle the empty heap special case
yield 2
# Heap of (non-prime, prime factor) tuples.
todel = [ (4, 2) ]
n = 3
while True:
if todel[0][0] != n:
# This number is not on the head of the heap: prime!
yield n
heappush(todel, (n*n, n)) # add to heap
else:
# Not prime: add to heap
while todel[0][0] == n:
p = todel[0][1]
heapreplace(todel, (n+p, p))
# heapreplace pops the minimum value then pushes:
# heap size is unchanged
n += 1
This code isn't mine and I don't understand it fully (but the explaination is here :) ), so I'm marking this answer as community wiki.
You can make it more pythonic with sieve algorithm (all primes small than 100):
def primes(n):
sieved = set()
for i in range(2, n):
if not(i in sieved):
for j in range(i + i, n, i):
sieved.add(j)
return set(range(2, n)) - sieved
print primes(100)
A very small trick will turn it to your goal.

Categories

Resources