I have spent several days struggling with this Prime Generator algorithm for a SPOJ problem. The problem asks you to print 100000 primes from a pair m, n with n <= 1000000000, within 6 seconds. My implementation prints 100000 primes in 11.701067686080933 seconds. Is it possible to beat the time restriction (6 s) in Python?
I feel that I am missing something in my segmented sieve function, because I just implemented it the way I understood the algorithm to work; maybe a change can make it better.
Some help would be appreciated here.
def sieveOfErosthen(m):
    limit = m + 1
    prime = [True] * limit
    for i in range(2, int(m**0.5)):
        if prime[i]:
            for x in range(i*i, limit, i):
                prime[x] = False
    return prime
def segmentedSieve(m, n):
    limit = n + 1
    segment = [True] * limit
    for j in range(2, int(n**0.5)):
        if sieveOfErosthen(j):
            for b in range(j*(m//j), limit, j):
                if b > j:
                    segment[b] = False
    for v in range(m, limit):
        if segment[v]:
            print(v)
    return True
This code is a disaster. Let's begin with the most glaring error:
if sieveOfErosthen(j):
(This is particularly confusing, as your original code didn't define this function but instead defined EratosthenesSieve() -- later editors of your post mapped one onto the other, which I'm assuming is correct.) What does sieveOfErosthen(j) return? It returns an array, so in the boolean context of if this test is always True: the array always contains at least one element if j is positive!
Change this to if True: and see that your output doesn't change. What's left is a very inefficient sieve algorithm, which we can speed up in various ways:
def segmentedSieve(m, n):
    primes = []
    limit = n + 1
    segment = [True] * limit
    if limit > 0:
        segment[0] = False
    if limit > 1:
        segment[1] = False
    for j in range(2, int(limit**0.5) + 1):
        if segment[j]:
            for b in range(j * j, limit, j):
                segment[b] = False
    for v in range(m, limit):
        if segment[v]:
            primes.append(v)
    return primes
This code can easily find the first 100,000 primes in a fraction of a second. But ultimately, if n <= 1000000000 (a billion), then we have to assume the worst case, i.e. the last 100,000 primes in 6 seconds for some suitable m in segmentedSieve(m, 1000000000), which will take this code minutes, not seconds.
Finally, you didn't implement a segmented sieve -- you implemented a regular sieve and just skimmed off the requested range. I recommend you read about segmented sieves on Wikipedia, or elsewhere, and start over if you need a segmented sieve.
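To give a feel for the structure, here is a minimal sketch of a segmented sieve (my own illustration, not the poster's code). It assumes 1 <= m <= n and that the window n - m is small enough to hold in memory; only the base primes up to sqrt(n) are sieved in full, and then only the window [m, n] is marked:

def segmented_sieve(m, n):
    # Base primes up to sqrt(n), via a small classic sieve.
    root = int(n ** 0.5) + 1
    base = [True] * (root + 1)
    base[0:2] = [False, False]
    for i in range(2, int(root ** 0.5) + 1):
        if base[i]:
            for j in range(i * i, root + 1, i):
                base[j] = False
    base_primes = [i for i, is_p in enumerate(base) if is_p]

    # Sieve only the window [m, n]; index k stands for the number m + k.
    window = [True] * (n - m + 1)
    for p in base_primes:
        # First multiple of p inside the window, but never p itself.
        start = max(p * p, ((m + p - 1) // p) * p)
        for multiple in range(start, n + 1, p):
            window[multiple - m] = False
    if m == 1:
        window[0] = False  # 1 is not prime
    return [m + k for k, is_p in enumerate(window) if is_p]

The memory used is O(sqrt(n) + (n - m)) instead of O(n), which is what makes n = 1000000000 feasible.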
To solve this problem you have to use a segmented sieve.
There are some good resources; please check these:
geeksforgeeks
quora
https://discuss.codechef.com/questions/54416/segmented-sieve
https://github.com/calmhandtitan/algorepo/blob/master/numberTheory/sieve_fast.cpp
I am working on a "Perfect Power" algorithm and seem to be running into memory issues. I have a solution outlined as follows which should give me the right answer, but it's possible I am looping through too many iterations. I'm not sure how to get around this; the code will not even finish running when I try to run it.
Here is the problem text:
"A perfect power is a classification of positive integers:
In mathematics, a perfect power is a positive integer that can be expressed as an integer power of another positive integer. More formally, n is a perfect power if there exist natural numbers m > 1 and k > 1 such that m^k = n.
Your task is to check whether a given integer is a perfect power. If it is a perfect power, return a pair m and k with m^k = n as a proof. Otherwise return Nothing, Nil, null, NULL, None or your language's equivalent.
Note: For a perfect power, there might be several pairs. For example 81 = 3^4 = 9^2, so (3, 4) and (9, 2) are valid solutions. However, the tests take care of this, so if a number is a perfect power, return any pair that proves it."
def isPP(n):
    for k in range(2, n):
        for m in range(2, n):
            if m**k == n:
                return (m, k)
    return None
This can benefit from a lot of optimizations, but the simplest one is the following:
def isPP(n):
    for k in range(2, n):
        for m in range(2, n):
            result = m ** k
            if result > n:
                break  # break the inner loop only
            elif result == n:
                return (m, k)
    return None
But there are more optimizations that you can add to make it run faster, which I will leave for you to find.
There is no memory issue, and there isn't a reason for one: you're not really allocating any more space for variables inside the loop. (I've added one allocation for result, which is overwritten each time m increases.)
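One such optimization, as a sketch of my own (one possible direction, not the only one): the exponent k can never exceed the bit length of n, and for a fixed k you can binary-search the base m instead of scanning every value:

def isPP(n):
    # For m >= 2, m**k >= 2**k, so any valid exponent k satisfies
    # 2**k <= n, i.e. k is at most n.bit_length().
    for k in range(2, n.bit_length() + 1):
        # Binary search for an integer m with m**k == n.
        lo, hi = 2, n
        while lo <= hi:
            mid = (lo + hi) // 2
            p = mid ** k
            if p == n:
                return (mid, k)
            elif p < n:
                lo = mid + 1
            else:
                hi = mid - 1
    return None

For example, isPP(81) returns (9, 2) and isPP(7) returns None; any valid pair is accepted by the tests.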
So, I know that if I either
A) let this run all night with an upper bound of 10,000,000,
or
B) change the range of n in increments of 500,
I can get the answer to this Euler problem. But both of those feel like cheating/laziness. I've seen other solutions to this posted in C, but not in Python.
My main areas of optimization here would presumably be a better way of summing all integers up to a certain value, and/or not having Python constantly create and dump the list of factors for every single number it checks. Having waited 20 minutes with an upper bound of one million without the program completing, I'm not happy with the way I've implemented this.
factors = []
SomeUpperBound = 10000000  # e.g. the upper bound mentioned above

def factorize(n):
    for i in range(1, n + 1):
        if n > 1 and n % i == 0 and i not in factors:
            factors.append(i)
    if len(factors) == 500:
        print(n)
        print(factors)
    else:
        factors.clear()

for x in range(1, SomeUpperBound):
    n = sum(range(1, x))
    factorize(n)
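For what it's worth, both suspicions above are right. A hedged sketch of my own along those lines (assuming the goal is the Project Euler form, i.e. the first triangular number with over 500 divisors): the x-th triangular number is simply x*(x+1)//2, no re-summing needed, and divisors can be counted in pairs up to sqrt(n) instead of testing every i up to n:

def count_divisors(n):
    # Each divisor d <= sqrt(n) pairs with the divisor n // d.
    count = 0
    d = 1
    while d * d <= n:
        if n % d == 0:
            count += 2           # d and n // d
            if d * d == n:
                count -= 1       # perfect square: don't count sqrt(n) twice
        d += 1
    return count

x = 1
while True:
    n = x * (x + 1) // 2         # x-th triangular number, no list needed
    if count_divisors(n) > 500:
        print(n)
        break
    x += 1

Each check now costs O(sqrt(n)) instead of O(n), which makes the search tractable.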
The question is available here. My Python code is:

def solution(A, B):
    if len(A) == 1:
        return [1]
    ways = [0] * (len(A) + 1)
    ways[1], ways[2] = 1, 2
    for i in xrange(3, len(ways)):
        ways[i] = ways[i-1] + ways[i-2]
    result = [1] * len(A)
    for i in xrange(len(A)):
        result[i] = ways[A[i]] & ((1<<B[i]) - 1)
    return result
The time complexity detected by the system is O(L^2), and I can't see why. Thank you in advance.
First, let's show that the runtime genuinely is O(L^2). I copied a section of your code and ran it with increasing values of L:
import time
import matplotlib.pyplot as plt

def solution(L):
    if L == 0:
        return
    ways = [0] * (L+5)
    ways[1], ways[2] = 1, 2
    for i in xrange(3, len(ways)):
        ways[i] = ways[i-1] + ways[i-2]

points = []
for L in xrange(0, 100001, 10000):
    start = time.time()
    solution(L)
    points.append(time.time() - start)

plt.plot(points)
plt.show()
The resulting graph (image not reproduced here) shows the quadratic growth.
To understand why this is O(L^2) when the obvious "time complexity" calculation suggests O(L), note that "time complexity" is not a well-defined concept on its own, since it depends on which basic operations you're counting. Normally the basic operations are taken for granted, but in some cases you need to be more careful. Here, if you count additions as a basic operation, then the code is O(L). However, if you count bit (or byte) operations, then the code is O(L^2). Here's the reason:
You're building an array of the first L Fibonacci numbers. The length (in digits) of the i'th Fibonacci number is Theta(i). So ways[i] = ways[i-1] + ways[i-2] adds two numbers with approximately i digits, which takes O(i) time if you count bit or byte operations.
This observation gives you an O(L^2) bit operation count for this loop:
for i in xrange(3, len(ways)):
    ways[i] = ways[i-1] + ways[i-2]
In the case of this program, it's quite reasonable to count bit operations: your numbers are unboundedly huge as L increases and addition of huge numbers is linear in clock time rather than O(1).
You can fix the complexity of your code by computing the Fibonacci numbers mod 2^32 -- since 2^32 is a multiple of 2^B[i]. That will keep a finite bound on the numbers you're dealing with:
for i in xrange(3, len(ways)):
    ways[i] = (ways[i-1] + ways[i-2]) & ((1<<32) - 1)
There are some other issues with the code, but this will fix the slowness.
I've taken the relevant parts of the function:
def solution(A, B):
    for i in xrange(3, len(A) + 1):  # replaced ways for clarity
        # ...
    for i in xrange(len(A)):
        # ...
    return result
Observations:
1. A is an iterable object (e.g. a list).
2. You're iterating over the elements of A in sequence.
3. The behavior of your function depends on the number of elements in A, making it O(A).
4. You're iterating over A twice, meaning 2 O(A) -> O(A).
On point 4: since 2 is a constant factor, 2 O(A) is still in O(A).
I think the page is not correct in its measurement. Had the loops been nested, then it would've been O(A²), but the loops are not nested.
This short sample is O(N²):
def process_list(my_list):
    for i in range(0, len(my_list)):
        for j in range(0, len(my_list)):
            pass  # do something with my_list[i] and my_list[j]
I've not seen the code the page uses to 'detect' the time complexity, but my guess is that it counts the number of loops without understanding much of the actual structure of the code.
EDIT1:
Note that, based on this answer, the time complexity of the len function is actually O(1), not O(N), so the page is not mis-counting that function. If it were treating len as O(N), it would have claimed an even larger order of growth, because len is used 4 separate times.
EDIT2:
As #PaulHankin notes, asymptotic analysis also depends on what's considered a "basic operation". In my analysis I counted additions and assignments as basic operations, i.e. I used the uniform cost method rather than the logarithmic cost method, which I did not mention at first.
Simple arithmetic operations are usually treated as basic operations; this is what I see most commonly done, unless the algorithm being analysed is itself a basic operation (e.g. the time complexity of a multiplication function), which is not the case here.
The only reason we have different results appears to be this distinction. I think we're both correct.
EDIT3:
While an algorithm in O(N) is also in O(N²), I think it's reasonable to state that the code is still in O(N), because at the level of abstraction we're using, the computational steps that seem most relevant (i.e. most influential) are those in the loop, as a function of the size of the input iterable A, not of the number of bits used to represent each value.
Consider the following algorithm to compute a^n:
def function(a, n):
    r = 1
    for i in range(0, n):
        r *= a
    return r
Under the uniform cost method this is in O(N), because the loop is executed n times. Under the logarithmic cost method, however, the algorithm turns out to be in O(N²), because the multiplication at r *= a is itself in O(N): the number of bits needed to represent each number depends on the size of the number itself.
Codility's Ladder problem is best solved like this.
It is super tricky.
We first compute the Fibonacci sequence for the first L+2 numbers. The first two numbers are used only as fillers, so we have to index the sequence as A[idx]+1 instead of A[idx]-1. The second step is to replace the modulo operation by keeping only the n lowest bits, as sketched below.
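Here is a sketch of that idea (my own reading of the description above; it assumes B[i] <= 30, per the problem's constraints, so masking with 2^30 - 1 preserves every residue that can be asked for):

def solution(A, B):
    # Fibonacci-like ways with two filler entries at the front, so the
    # answer for A[i] rungs lives at index A[i] + 1 (hence "+1 not -1").
    L = len(A)
    mask = (1 << 30) - 1  # assumes B[i] <= 30
    fib = [0] * (L + 2)
    fib[1] = 1
    for i in range(2, L + 2):
        # Keeping only the low 30 bits replaces the modulo operation and
        # keeps every number small, avoiding the O(L^2) big-int blow-up.
        fib[i] = (fib[i - 1] + fib[i - 2]) & mask
    return [fib[A[i] + 1] & ((1 << B[i]) - 1) for i in range(L)]

Since (x mod 2^30) mod 2^B[i] equals x mod 2^B[i] whenever B[i] <= 30, the early masking never changes the reported answers.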
The 10th problem in Project Euler:
The sum of the primes below 10 is 2 + 3 + 5 + 7 = 17.
Find the sum of all the primes below two million.
I found this snippet:

sieve = [True] * 2000000  # Sieve is faster for 2M primes

def mark(sieve, x):
    for i in xrange(x+x, len(sieve), x):
        sieve[i] = False

for x in xrange(2, int(len(sieve) ** 0.5) + 1):
    if sieve[x]: mark(sieve, x)

print sum(i for i in xrange(2, len(sieve)) if sieve[i])
published here, which runs in 3 seconds.
I wrote this code:
def isprime(n):
    for x in xrange(3, int(n**0.5)+1):
        if n % x == 0:
            return False
    return True

sum = 0
for i in xrange(1, int(2e6), 2):
    if isprime(i):
        sum += i
I don't understand why my code (the second one) is so much slower.
Your algorithm is checking every number individually from 2 to N (where N=2000000) for primality.
Snippet-1 uses the sieve of Eratosthenes algorithm, discovered about 2200 years ago.
It does not check every number but:
Makes a "sieve" of all numbers from 2 to 2000000.
Finds the first number (2), marks it as prime, then deletes all its multiples from the sieve.
Then finds the next undeleted number (3), marks it as prime and deletes all its multiples from the sieve.
Then finds the next undeleted number (5), marks it as prime and deletes all its multiples from the sieve.
...
Until it finds the prime 1409 and deletes all its multiples from the sieve.
Then all primes up to 1414 ~= sqrt(2000000) have been found, and it stops.
The numbers from 1415 up to 2000000 don't need to be checked individually: all of those that have not been deleted are primes, too.
So the algorithm produces all primes up to N.
Notice that it does not do any division, only additions (not even multiplications -- not that it matters with numbers this small, but it might with bigger ones). The time complexity is O(n log log n), while your algorithm has something near O(n^(3/2)) (or O(n^(3/2) / log n), as @Daniel Fischer commented), assuming divisions cost the same as multiplications.
From the Wikipedia (linked above) article:
Time complexity in the random access machine model is O(n log log n) operations, a direct consequence of the fact that the prime harmonic series asymptotically approaches log log n.
(with n = 2e6 in this case)
The first version pre-computes all the primes in the range and stores them in the sieve array, then finding the solution is a simple matter of adding the primes in the array. It can be seen as a form of memoization.
The second version tests for each number in the range to see if it is prime, repeating a lot of work already made by previous calculations.
In conclusion, the first version avoids re-computing values, whereas the second version performs the same operations again and again.
To easily understand the difference, try thinking about how many times each number will be used as a potential divisor:
In your solution, the number 2 will be tested against EACH number when that number is tested for being a prime. Every number you pass along the way will then be used as a potential divisor for every subsequent number.
In the first solution, once you step over a number you never look back -- you always move forward from the place you reached. By the way, a possible and common optimization is to go over odd numbers only, after you have marked 2:
mark(sieve, 2)
for x in xrange(3, int(len(sieve) ** 0.5) + 1, 2):
    if sieve[x]: mark(sieve, x)
This way you only look at each number once and clear out all of its multiples going forward, rather than repeatedly checking each number against all of its predecessors as potential divisors, and the if statement prevents you from redoing work for a number you have already encountered.
As Óscar's answer indicates, your algorithm repeats a lot of work. To see just how much processing the other algorithm saves, consider the following modified version of the mark() and isprime() functions, which keep track of how many times the function has been called and the total number of for loop iterations:
calls, count = 0, 0

def mark(sieve, x):
    global calls, count
    calls += 1
    for i in xrange(x+x, len(sieve), x):
        count += 1
        sieve[i] = False
After running the first code with this new function we can see that mark() is called 223 times with a total of 4,489,006 (~4.5 million) iterations in the for loop.
calls, count = 0, 0

def isprime(n):
    global calls, count
    calls += 1
    for x in xrange(3, int(n**0.5)+1):
        count += 1
        if n % x == 0:
            return False
    return True
If we make a similar change to your code, we can see that isprime() is called 1,000,000 (1 million) times with 177,492,735 (~177.5 million) iterations of the for loop.
Counting function calls and loop iterations isn't always a conclusive way to determine why an algorithm is faster, but generally fewer steps means less time, and clearly your code could use some optimization to reduce the number of steps.
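If you want to reproduce those numbers, a hypothetical driver (not part of the original answer) could look like this, reusing the instrumented functions above:

# First experiment: the sieve version with the instrumented mark().
sieve = [True] * 2000000
for x in xrange(2, int(len(sieve) ** 0.5) + 1):
    if sieve[x]: mark(sieve, x)
print "mark():", calls, "calls,", count, "loop iterations"

# Reset the shared counters, then run the trial-division version.
calls, count = 0, 0
total = 0
for i in xrange(1, int(2e6), 2):
    if isprime(i):
        total += i
print "isprime():", calls, "calls,", count, "loop iterations"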
I'm a 17 year old getting started with programming with the help of the Python programming language.
I've been seeking to optimize this algorithm, perhaps by eliminating one of the loops, or with a better test to check for prime numbers.
Trying to calculate and display 100000 prime numbers has the script pausing for about 6 seconds as it populates the list with primes before the primes list is returned to the console as output.
I've been experimenting with using
print odd,
to simply print every found prime number, which is faster for smaller inputs like n = 1000, but for n = 1000000 the list itself prints much faster (both in the python shell and in the console).
Perhaps the entire code/algorithm should be revamped, but the script should remain essentially the same: The user types in the number of prime numbers to be printed (n) and the script returns all prime numbers up to the nth prime number.
from time import time

odd = 1
primes = [2]
n = input("Number of prime numbers to print: ")
clock = time()

def isPrime(number):
    global primes
    for i in primes:
        if i*i > number:
            return True
        if number % i == 0:
            return False

while len(primes) < n:
    odd += 2
    if isPrime(odd):
        primes += [odd]

print primes
clock -= time()
print "\n", -clock
raw_input()
I might want to rewrite the whole script to use a sieve like the Sieve of Atkin: http://en.wikipedia.org/wiki/Sieve_of_Atkin
However, I am simply a beginner at Python (or even at programming: I started writing code only 2 weeks ago), and it would be quite a challenge for me to figure out how to code a Sieve of Atkin algorithm in Python.
I wish a Google hacker out there would hand-hold me through stuff like this :(
You could use a prime sieve with a simple twist:
1. Define the first prime 2 as you do, and set the largest number reached (max) to 2.
2. Generate a list of n consecutive numbers from max+1 to max+n.
3. Sieve this list with the primes found so far. When sieving, set the starting number for each prime to the smallest number in the list divisible by that prime.
4. If the required amount is not yet reached, go to step 2.
This way you can control the length of the list, and as the length grows larger, the speed gets faster. However, this is a total rework of the algorithm, and it is harder to program.
Here's some sample code. It is quite crude, but it takes less than 70% of the original's time:
from math import sqrt
from time import time

primes = [2]
max = 3
n = input("Number of prime numbers to print: ")
r = 2
clock = time()

def sieve(r):
    global primes
    global max
    s = set(range(max, max+r))
    for i in primes:
        b = max // i
        if b*i < max:
            b = b + 1
        b = b*i
        while b <= max+r-1:
            if b in s:
                s.remove(b)
            b = b + i
    for i in sorted(s):  # sets are unordered, so sort the surviving primes
        primes.append(i)

while len(primes) < n:
    r = primes[-1]
    sieve(r)
    max = max + r

primes = primes[0:n]
print primes
clock -= time()
print "\n", -clock
raw_input()
There are many ways to improve this; it just shows the idea of the approach.
Also, this can blow up the memory when the numbers get large. I used the dynamic limit to somewhat relieve this.
And if you are really curious (and fearless), you could look at the more complicated implementations in various open-source projects. One example is Pari/GP, which is written in C and is blazing fast (I tested 1 to 50000000 in less than a minute, if I remember correctly). Translating them to Python may be hard, but would be helpful, perhaps not just for yourself ;-)
One simple optimization could be applied without reworking the code completely.
The i*i on every prime gets very wasteful as the list gets longer. Instead, calculate the square root of number once, outside the loop, and test each prime against this value inside the loop.
However, the square root is itself an expensive calculation, and the majority of candidate numbers will be rejected as divisible by one of the lower primes (3, 5, 7), so this turns out to be not such a good optimization (pessimization?). But we don't actually need to be that precise: a simple check that the prime is less than one third of the value has a similar effect without the computational cost of the square root, at the expense of relatively few unnecessary tests.
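Concretely, the first idea might look like this (a sketch of my own against the question's isPrime(), with primes being the global list from the original script):

def isPrime(number):
    limit = number ** 0.5  # one square root per candidate, outside the loop
    for i in primes:
        if i > limit:
            return True
        if number % i == 0:
            return False
    return True  # only reached if the primes list is exhausted

The one-third variant simply replaces the first test with 3 * i > number, trading the square root for a few extra divisibility tests on the rare candidates that survive the small primes.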
As Ziyao Wei already said, I'd also try a sieve implementation. The only thing I'd improve is to use the prime number theorem as a starting point for the sieve size.
Computing the inverse function isn't straightforward in pure Python, but an iterative approach should be good enough, and that way you could get a pretty good idea how large the sieve has to be. Since I don't remember the proofs of the theorem in detail (and it's 6 am here), someone else will have to chip in on whether the theorem guarantees an upper bound that would let you size a simple sieve once, without worrying about growing it; iirc that's sadly not the case.
As already mentioned, the presented algorithm cannot be improved significantly. If a fast solution is required, then the sieve of Eratosthenes is appropriate. The size x of the sieve can be estimated from n >= x/(ln x + 2) for x >= 55, and this equation can be solved using Newton's iteration. The algorithm presented below is about 10 times faster than the original:
import math

def sieveSize(n):
    # computes x such that pi(x) >= n (assumes x >= 55)
    x = 1.5 * n  # starting value
    y = x - n * math.log(x) - 2 * n
    while abs(y) > 0.1:
        derivative = 1 - n / x
        x = x - y / derivative
        y = x - n * math.log(x) - 2 * n
    return int(x) + 1

def eratosthenes(n):
    # create a list of flags: flags[i] == '1' iff i is prime
    size = sieveSize(n)
    flags = ['1'] * size  # start with: all numbers are prime
    flags[0] = flags[1] = '0'  # 0 and 1 are not primes
    i = 0
    while i * i < size:
        if flags[i] == '1':
            for j in range(i * i, size, i):
                flags[j] = '0'
        i += 1
    return flags

def primes(n):
    flags = eratosthenes(n)
    prims = []
    for i in range(0, len(flags)):
        if flags[i] == '1':
            prims.append(i)
    return prims

prims = primes(100000)
Any number that ends in 5, other than 5 itself, is not prime. So you can add a statement that skips any number ending in 5 that is greater than 5.
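In the trial-division script above, that might look like the following sketch (my own rewrite of the question's main loop, reusing its isPrime() and primes list):

while len(primes) < n:
    odd += 2
    if odd > 5 and odd % 5 == 0:  # odd multiples of 5 end in 5: never prime
        continue
    if isPrime(odd):
        primes += [odd]

Since the loop only visits odd numbers, odd % 5 == 0 is exactly the "ends in 5" test.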