I am trying to write a python script for Sieve of Eratosthenes. I have everything done, but when I tell the program to test each number if it is a multiple of at least one of the prime numbers below the √input. It goes well but when I try to remove a multiple it works until it tries to delete 6 twice... I've been looking over my code and can't seem to find out why. I'm rather new to Python so help would be much appreciated!
--- Note my code's spacing is off due to SOF's way of making code blocks:
import math, os, threading
global iswhole
global tdprime
def iswhole(x):
if x%1 == 0:
return True
else:
return False
def tdprime(x):
prime = True
n = 2
while n < x:
if iswhole(x/n) == True:
return False
prime = False
n = x
else:
n += 1
if prime == True:
return True
number = 0
while number != 1:
number = int(input("Enter number to calculate all primes up to: "))
number_range = range(1,number+1)
sqrt = math.sqrt(number)
rsqrt = round(sqrt)
thelist = []
tsl = []
for num in range(1,rsqrt):
if tdprime(num) == True:
tsl.append(num)
tsl.remove(1)
for x in number_range: ## Making the List
thelist.append(x)
## Note it's " Key: Value " in dictionaries
print(tsl)
print("thelist: ",thelist)
print("Square root of input = ",sqrt)
print("Rounded Square root of input = ",rsqrt)
print()
for x in thelist: ## Testing each number in thelist
multiple = False
for y in tsl: ## Each prime number under √number
if iswhole(x/y):
print(x) ## If the current number in thelist divided by one of the prime numbers under √number is a whole
thelist.remove(x)## Remove the current number which isn't a prime
thelist.sort()
multiple = True ## Make multiple true so it doesn't print (could remove the multiple stuff and other if function, kind of pointless function now)
if multiple == False:
print("Prime! ",x)
print(thelist)
Your code can be much simplified by the following realisations:
The numbers up to a number n can easily be generated as needed, they don't have to be calculated berforehand and stored in a list.
2 is the only even prime, so you only need to test odd numbers.
All numbers can be fully factorised into prime factors, uniquely (this is called the fundamental theorem of arithmetic). This means that the only numbers which you need to try dividing by are the primes you have previously calculated --- all other numbers will be divisible by at least one of those primes anyway (or they would be primes themselves).
Try something along these lines:
stop = int(input("Enter number to calculate all primes up to: "))
primes = []
for n in range(3, stop, 2): # Step by two each time
was_prime = True # Needed to break out of nested loop
for p in primes:
if n % p == 0: # A prime divides n
was_prime = False
break # No need to continue
if was_prime:
primes.append(n)
print(primes) # You can insert the number 2 here if you like
Also, please pay attention to spacing and indentation. In Python, when you get your spacing wrong it is not only difficult to read, but a runtime error.
I'm a little worried about your iswhole() function:
def iswhole(x):
if x%1 == 0:
return True
else:
return False
(Incidentally, this could be re-written as simply return (x%1 == 0) and skip the return True and return False pieces.)
Floating point numbers are odd creatures:
>>> 10.0%1
0.0
>>> (10.1 * 10)%1
0.0
>>> (10.01 * 100)%1
0.0
>>> (10.001 * 1000)%1
0.0
>>> (10.0001 * 10000)%1
0.0
>>> (10.00001 * 100000)%1
0.0
>>> (10.000001 * 1000000)%1
0.0
>>> (10.0000001 * 10000000)%1
0.0
>>> (10.00000001 * 100000000)%1
1.1920928955078125e-07
For an input such as 6, this doesn't seem likely to be the problem, but it might become a problem for larger inputs. Comparing floating point numbers is troublesome.
For the 6 problem, I'd like to suggest that instead of a mutable list to keep track of your data, you might wish to use an array instead. The array never needs to be sorted (the fact that you have a sort() in your program smells a bit like trying to paper over some kind of bug) and deleting elements from the middles of lists is often extremely expensive. (That sort pretty much means your performance is going to be poor anyhow. Not a big deal for small inputs, but try 1000000 as a starting point.) Setting elements in an array to 1 or 0 is almost always far cheaper than mutable lists.
I see a problem with this bit:
for x in thelist: ## Testing each number in thelist
multiple = False
for y in tsl: ## Each prime number under √number
if iswhole(x/y):
print(x) ## If the current number in thelist divided by one of the prime numbers under √number is a whole
thelist.remove(x)## Remove the current number which isn't a prime
thelist.sort()
You call thelist.remove(x) and thelist.sort() while iterating over thelist. You generally don't want to modify a structure you're iterating over in Python. It usually confuses the internal machinery doing the iteration; you get symptoms like mysteriously skipping elements or processing them twice, or other weirdness.
Related
This is for a school assignment.
I have been tasked to define a function determining the largest square pyramidal number up to a given integer(argument). For some background, these are square pyramidal numbers:
1 = 1^2
5 = 1^2+2^2
14 = 1^2+2^2+3^2
So for a function and parameter largest_square_pyramidal_num(15), the function should return 14, because that's the largest number within the domain of the argument.
I get the idea. And here's my code:
def largest_square_pyramidal_num(n):
sum = 0
i = 0
while sum < n:
sum += i**2
i += 1
return sum
Logically to me, it seemed nice and rosy until I realised it doesn't stop when it's supposed to. When n = 15, sum = 14, sum < n, so the code adds one more round of i**2, and n is exceeded. I've been cracking my head over how to stop the iteration before the condition sum < n turns false, including an attempt at break and continue:
def largest_square_pyramidal_num(n):
sum = 0
for i in range(n+1):
sum += i**2
if sum >= n:
break
else:
continue
return sum
Only to realise it doesn't make any difference.
Can someone give me any advice? Where is my logical lapse? Greatly appreciated!
You can do the following:
def largest_pyr(x):
pyr=[sum([i**2 for i in range(1,k+1)]) for k in range(int(x**0.5)+1)]
pyr=[i for i in pyr if i<=x]
return pyr[-1]
>>>largest_pyr(15)
14
>>> largest_pyr(150)
140
>>> largest_pyr(1500)
1496
>>> largest_pyr(15000)
14910
>>> largest_pyr(150000)
149226
Let me start by saying that continue in the second code piece is redundant. This instruction is used for scenario when you don't want the code in for loop to continue but rather to start a new iteration (in your case there are not more instructions in the loop body).
For example, let's print every number from 1 to 100, but skip those ending with 0:
for i in range(1, 100 + 1):
if i % 10 != 0:
print(i)
for i in range(1, 100 + 1):
if i % 10 == 0:
# i don't want to continue executing the body of for loop,
# get me to the next iteration
continue
print(i)
The first example is to accept all "good" numbers while the second is rather to exclude the "bad" numbers. IMHO, continue is a good way to get rid of some "unnecessary" elements in the container rather than writing an if (your code inside if becomes extra-indented, which worsens readability for bigger functions).
As for your first piece, let's think about it for a while. You while loop terminates when the piramid number is greater or equal than n. And that is not what you really want (yes, you may end up with a piramid number which is equal to n, but it is not always the case).
What I like to suggest is to generate a pyramid number until in exceedes n and then take a step back by removing an extra term:
def largest_square_pyramidal_num(n):
result = 0
i = 0
while result <= n:
i += 1
result += i**2
result -= i ** 2
return result
2 things to note:
don't use sum as a name for the variable (it might confuse people with built-in sum() function)
I swapped increment and result updating in the loop body (such that i is up-to-date when the while loop terminates)
So the function reads like this: keep adding terms until we take too much and go 1 step back.
Hope that makes some sense.
Cheers :)
I was looking at the code below and I can't seem to get my head around line 6 with the if all statement.
Could someone explain what it is doing and what happens in the first iteration when the list is empty. This is from Euler 7
def main():
number_prime_to_find = 1001
x = 2
list_of_primes = []
while (len(list_of_primes) < number_prime_to_find):
if all(x % prime for prime in list_of_primes):
list_of_primes.append(x)
The block:
if all(x % prime for prime in list_of_primes):
list_of_primes.append(x)
means "check that, for all primes in the list, x is not a multiple of that prime". It's effectively the same as (pseudo-code):
xIsAPrime = True
for each prime in list_of_primes:
if x % prime == 0:
xIsAPrime = False
break
if xIsAPrime:
list_of_primes.append(x)
If a candidate prime is not a multiple of any primes in the list, then that candidate prime is an actual prime, so is added to the list.
If the list of primes is empty (initial state), then the first number you check will be considered a prime (you're starting at 2 so that's correct).
The reason for this is that all(x) simply means "true if there are no falsey values in the iterable x, false otherwise". Since an empty iterable cannot have falsey values by definition, performing all() on it will always give true.
Following that (the addition of 2 to the list):
The next prime will then be the next number that's not a multiple of two, so we add 3;
The next prime after that will be the next number that's not a multiple of two or three, so we add 5;
The next prime after that will be the next number that's not a multiple of any in the set { 2, 3, 5 }, hence we add 7;
The next prime after that will be the next number that's not a multiple of any in the set { 2, 3, 5, 7 }, add 11;
And so on, until the list size gets big enought that the final item is the prime we want.
Unfortunately for you, that bit of code doesn't seem to ever change x (unless you've just left it off in the question) so it's unlikely to give you what you want, especially since you also don't seem to print or return it. You can rectify that with:
def main():
number_prime_to_find = 1001
list_of_primes = []
x = 2
while len(list_of_primes) < number_prime_to_find:
if all(x % prime for prime in list_of_primes):
list_of_primes.append(x)
x += 1 # Need to modify x
return list_of_primes[-1] # Probably also want to output it.
print(main())
As an aside, you could further optimise this by realising that, other then two and three, all primes are of the form 6n ± 1, for n > 0 (see here for a more detailed explanation as to why this is the case):
def main():
number_prime_to_find = 1001
list_of_primes = [2, 3]
(x, xadd) = (5, 2)
while len(list_of_primes) < number_prime_to_find:
if all(x % prime for prime in list_of_primes):
list_of_primes.append(x)
x += xadd # Skip impossible primes.
xadd = 6 - xadd # Alternate adding 2, 4: gives 7, 11, 13, 17, ...
return list_of_primes[-1]
print(main())
This is likely to be about (very roughly) three times faster, since you only need to check two candidates in each group of six numbers, rather than the original six candidates.
You may also want to consider making it a more generic function. While finding the 1001st prime is something we all want to do three or four times a day, you may want to avoid having to write a whole other function should some new starter on the team decide they want to get the 1002nd :-)
Something like this, I would imagine:
def getNthPrime(n):
# Cater for smart-alecs :-)
if n < 1: return None
# Initial list holds those that aren't 6n +/- 1.
primeList = [2, 3]
# Next candidate and candidate increment.
(candidate, delta) = (5, 2)
# Until we have enough primes in list.
while len(primeList) < n:
# Add to list if it's prime, move to next candidate.
if all(candidate % prime for prime in primeList):
primeList.append(candidate)
candidate += delta
delta = 6 - delta
# List big enough, return correct one. Note we use n-1 here
# rather than -1, since you may want prime #1 and the list
# STARTS with two primes. We subtract one because first prime
# is actually at index zero.
return primeList[n-1]
print(getNthPrime(1001))
I want to find the largest prime number within range(old_number + 1 , 2*old_number)
This is my code so far:
def get_nearest_prime(self, old_number):
for num in range(old_number + 1, 2 * old_number) :
for i in range(2,num):
if num % i == 0:
break
return num
when I call the get_nearest_prime(13)
the correct output should be 23, while my result was 25.
Anyone can help me to solve this problem? Help will be appreciated!
There are lots of changes you could make, but which ones you should make depend on what you want to accomplish. The biggest problem with your code as it stands is that you're successfully identifying primes with the break and then not doing anything with that information. Here's a minimal change that does roughly the same thing.
def get_nearest_prime(old_number):
largest_prime = 0
for num in range(old_number + 1, 2 * old_number) :
for i in range(2,num):
if num % i == 0:
break
else:
largest_prime = num
return largest_prime
We're using the largest_prime local variable to keep track of all the primes you find (since you iterate through them in increasing order). The else clause is triggered any time you exit the inner for loop "normally" (i.e., without hitting the break clause). In other words, any time you've found a prime.
Here's a slightly faster solution.
import numpy as np
def seive(n):
mask = np.ones(n+1)
mask[:2] = 0
for i in range(2, int(n**.5)+1):
if not mask[i]:
continue
mask[i*i::i] = 0
return np.argwhere(mask)
def get_nearest_prime(old_number):
try:
n = np.max(seive(2*old_number-1))
if n < old_number+1:
return None
return n
except ValueError:
return None
It does roughly the same thing, but it uses an algorithm called the "Sieve of Eratosthenes" to speed up the finding of primes (as opposed to the "trial division" you're using). It isn't the fastest Sieve in the world, but it's reasonably understandable without too many tweaks.
In either case, if you're calling this a bunch of times you'll probably want to keep track of all the primes you've found since computing them is expensive. Caching is easy and flexible in Python, and there are dozens of ways to make that happen if you do need the speed boost.
Note that I'm not positive the range you've specified always contains a prime. It very well might, and if it does you can get away with a lot shorter code. Something like the following.
def get_nearest_prime(old_number):
return np.max(seive(2*old_number-1))
I don't completely agree with the name you've chosen since the largest prime in that interval is usually not the closest prime to old_number, but I think this is what you're looking for anyway.
You can use a sublist to check if the number is prime, if all(i % n for n in range(2, i)) means that number is prime due to the fact that all values returned from modulo were True, not 0. From there you can append those values to a list called primes and then take the max of that list.
List comprehension:
num = 13
l = [*range(num, (2*num)+1)]
print(max([i for i in l if all([i % n for n in range(2,i)])]))
Expanded:
num = 13
l = [*range(num, (2*num)+1)]
primes = []
for i in l:
if all([i % n for n in range(2, i)]):
primes.append(i)
print(max(primes))
23
Search for the nearest prime number from above using the seive function
def get_nearest_prime(old_number):
return old_number+min(seive(2*old_number-1)-old_number, key=lambda a:a<0)
I'm trying to create a python program to check if a given number "n" is prime or not. I've first created a program that lists the divisors of n:
import math
def factors(n):
i = 2
factlist = []
while i <= n:
if n% i == 0:
factlist.append(i)
i = i + 1
return factlist
factors(100)
Next, I'm trying to use the "for i in" function to say that if p1 (the list of factors of n) only includes n itself, then print TRUE, and if else, print FALSE. This seems really easy, but I cannot get it to work. This is what I've done so far:
def isPrime(n):
p1 = factors(n)
for i in p1:
if factors(n) == int(i):
return True
return False
Any help is appreciated! This is a hw assignment, so it's required to use the list of factors in our prime test. Thanks in advance!
p1 will only have n if if has length 1. It may be worthwhile to add if len(p1)==1 as the conditional instead.
What I did (I actually managed to get it in 1 line) was use [number for number in range (2, num // 2) if num % number == 0]. This gets all the numbers that are factors of num. Then you can check if the length of that list is > 0. (Wouldn't be prime).
I usually do something like this:
primes = [2,3]
def is_prime(num):
if num>primes[-1]:
for i in range(primes[-1]+2, num+1,2):
if all(i%p for p in primes):
primes.append(i)
return num in primes
This method generates a global list of prime numbers as it goes. It uses that list to find other prime numbers. The advantage of this method is fewer calculations the program has to make, especially when there are many repeated calls to the is_prime function for possible prime numbers that are less than the largest number you have previously sent to it.
You can adapt some of the concepts in this function for your HW.
I am trying to do some calculations with python, where I ran out of memory. Therefore, I want to read/write a file in order to free memory. I need a something like a very big list object, so I thought writing a line for each object in the file and read/write to that lines instead of to memory. Line ordering is important for me since I will use line numbers as index. So I was wondering how I can replace lines in python, without moving around other lines (Actually, it is fine to move lines, as long as they return back to where I expect them to be).
Edit
I am trying to help a friend, which is worse than or equal to me in python. This code supposed to find biggest prime number, that divides given non-prime number. This code works for numbers until the numbers like 1 million, but after dead, my memory gets exhausted while trying to make numbers list.
# a comes from a user input
primes_upper_limit = (a+1) / 2
counter = 3L
numbers = list()
while counter <= primes_upper_limit:
numbers.append(counter)
counter += 2L
counter=3
i=0
half = (primes_upper_limit + 1) / 2 - 1
root = primes_upper_limit ** 0.5
while counter < root:
if numbers[i]:
j = int((counter*counter - 3) / 2)
numbers[j] = 0
while j < half:
numbers[j] = 0
j += counter
i += 1
counter = 2*i + 3
primes = [2] + [num for num in numbers if num]
for numb in reversed(primes):
if a % numb == 0:
print numb
break
Another Edit
What about wrinting different files for each index? for example a billion of files with long integer filenames, and just a number inside of the file?
You want to find the largest prime divisor of a. (Project Euler Question 3)
Your current choice of algorithm and implementation do this by:
Generate a list numbers of all candidate primes in range (3 <= n <= sqrt(a), or (a+1)/2 as you currently do)
Sieve the numbers list to get a list of primes {p} <= sqrt(a)
Trial Division: test the divisibility of a by each p. Store all prime divisors {q} of a.
Print all divisors {q}; we only want the largest.
My comments on this algorithm are below. Sieving and trial division are seriously not scalable algorithms, as Owen and I comment. For large a (billion, or trillion) you really should use NumPy. Anyway some comments on implementing this algorithm:
Did you know you only need to test up to √a, int(math.sqrt(a)), not (a+1)/2 as you do?
There is no need to build a huge list of candidates numbers, then sieve it for primeness - the numbers list is not scalable. Just construct the list primes directly. You can use while/for-loops and xrange(3,sqrt(a)+2,2) (which gives you an iterator). As you mention xrange() overflows at 2**31L, but combined with the sqrt observation, you can still successfully factor up to 2**62
In general this is inferior to getting the prime decomposition of a, i.e. every time you find a prime divisor p | a, you only need to continue to sieve the remaining factor a/p or a/p² or a/p³ or whatever). Except for the rare case of very large primes (or pseudoprimes), this will greatly reduce the magnitude of the numbers you are working with.
Also, you only ever need to generate the list of primes {p} once; thereafter store it and do lookups, not regenerate it.
So I would separate out generate_primes(a) from find_largest_prime_divisor(a). Decomposition helps greatly.
Here is my rewrite of your code, but performance still falls off in the billions (a > 10**11 +1) due to keeping the sieved list. We can use collections.deque instead of list for primes, to get a faster O(1) append() operation, but that's a minor optimization.
# Prime Factorization by trial division
from math import ceil,sqrt
from collections import deque
# Global list of primes (strictly we should use a class variable not a global)
#primes = deque()
primes = []
def is_prime(n):
"""Test whether n is divisible by any prime known so far"""
global primes
for p in primes:
if n%p == 0:
return False # n was divisible by p
return True # either n is prime, or divisible by some p larger than our list
def generate_primes(a):
"""Generate sieved list of primes (up to sqrt(a)) as we go"""
global primes
primes_upper_limit = int(sqrt(a))
# We get huge speedup by using xrange() instead of range(), so we have to seed the list with 2
primes.append(2)
print "Generating sieved list of primes up to", primes_upper_limit, "...",
# Consider prime candidates 2,3,5,7... in increasing increments of 2
#for number in [2] + range(3,primes_upper_limit+2,2):
for number in xrange(3,primes_upper_limit+2,2):
if is_prime(number): # use global 'primes'
#print "Found new prime", number
primes.append(number) # Found a new prime larger than our list
print "done"
def find_largest_prime_factor(x, debug=False):
"""Find all prime factors of x, and return the largest."""
global primes
# First we need the list of all primes <= sqrt(x)
generate_primes(x)
to_factor = x # running value of the remaining quantity we need to factor
largest_prime_factor = None
for p in primes:
if debug: print "Testing divisibility by", p
if to_factor%p != 0:
continue
if debug: print "...yes it is"
largest_prime_factor = p
# Divide out all factors of p in x (may have multiplicity)
while to_factor%p == 0:
to_factor /= p
# Stop when all factors have been found
if to_factor==1:
break
else:
print "Tested all primes up to sqrt(a), remaining factor must be a single prime > sqrt(a) :", to_factor
print "\nLargest prime factor of x is", largest_prime_factor
return largest_prime_factor
If I'm understanding you correctly, this is not an easy task. They way I interpreted it, you want to keep a file handle open, and use the file as a place to store character data.
Say you had a file like,
a
b
c
and you wanted to replace 'b' with 'bb'. That's going to be a pain, because the file actually looks like a\nb\nc -- you can't just overwrite the b, you need another byte.
My advice would be to try and find a way to make your algorithm work without using a file for extra storage. If you got a stack overflow, chances are you didn't really run out of memory, you overran the call stack, which is much smaller.
You could try reworking your algorithm to not be recursive. Sometimes you can use a list to substitute for the call stack -- but there are many things you could do and I don't think I could give much general advice not seeing your algorithm.
edit
Ah I see what you mean... when the list
while counter <= primes_upper_limit:
numbers.append(counter)
counter += 2L
grows really big, you could run out of memory. So I guess you're basically doing a sieve, and that's why you have the big list numbers? It makes sense. If you want to keep doing it this way, you could try a numpy bool array, because it will use substantially less memory per cell:
import numpy as np
numbers = np.repeat(True, a/2)
Or (and maybe this is not appealing) you could go with an entirely different approach that doesn't use a big list, such as factoring the number entirely and picking the biggest factor.
Something like:
factors = [ ]
tail = a
while tail > 1:
j = 2
while 1:
if tail % j == 0:
factors.append(j)
tail = tail / j
print('%s %s' % (factors, tail))
break
else:
j += 1
ie say you were factoring 20: tail starts out as 20, then you find 2 tail becomes 10, then it becomes 5.
This is not terrible efficient and will become way too slow for a large (billions) prime number, but it's ok for numbers with small factors.
I mean your sieve is good too, until you start running out of memory ;). You could give numpy a shot.
pytables is excellent for working with and storing huge amounts of data. But first start with implementing the comments in smci's answer to minimize the amount of numbers you need to store.
For a number with only twelve digits, as in Project Euler #3, no fancy integer factorization method is needed, and there is no need to store intermediate results on disk. Use this algorithm to find the factors of n:
Set f = 2.
If n = 1, stop.
If f * f > n, print n and stop.
Divide n by f, keeping both the quotient q and the remainder r.
If r = 0, print q, divide n by q, and go to Step 2.
Otherwise, increase f by 1 and go to Step 3.
This just does trial division by every integer until it reaches the square root, which indicates that the remaining cofactor is prime. Each factor is printed as it is found.