Prime factorization using list comprehension in Python

How can I write a function that returns a list of tuples (c_i, k_i) for n such that n = c_1^k_1 * c_2^k_2 * ..., where each c_i is a prime number?
For example: 12600 = 2^3 * 3^2 * 5^2 * 7^1
Desired output: [(2, 3), (3, 2), (5, 2), (7, 1)]
I know how to do it with while but is it possible to do it using list comprehension? Efficiency is not required in this task.
# naive function
def is_prime(n):
    return n > 1 and all(n % i != 0 for i in range(2, n))
# while version
def factorization_while(n):
    res = []
    for i in range(1, n + 1):
        if is_prime(i):
            j = 0
            while n % i == 0:
                n = n // i
                j += 1
            if j != 0:
                res.append((i, j))
    return res
# list comprehension version
def factorization(n):
    return [
        (i, j) for i in range(1, n + 1) if is_prime(i) \
        and n % i == 0 ... # add smth
    ]

I don't think this should be too hard. You don't actually need to modify n to find its prime factors, they're all completely independent of each other. So just iterate through the appropriate primes, and find the maximum power that works!
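from math import log
def prime_factors(n):
    # is_prime is the helper defined in the question.
    # The "+ 2" keeps one spare candidate power in case log() rounds just below an integer.
    return [(prime, max(power for power in range(1, int(log(n, prime)) + 2)
                        if n % prime**power == 0))
            for prime in range(2, n+1) if n % prime == 0 and is_prime(prime)]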
There are a few ways you might improve this further. You could use itertools.takewhile on an infinite generator of powers (e.g. itertools.count), as once you find the first power such that prime**power is not a factor of n, none of the later ones will be either. That would allow you to avoid the log call.
And there are a whole bunch of ways to efficiently iterate over the primes (see for instance, the very simple generator suggested here, or higher performance versions that you can find in the answers to a few different questions here on SO).
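The takewhile idea could look like the following minimal sketch. It reuses the naive is_prime from the question; the lambda and the names p and k are illustrative choices, not part of the original answer.

```python
from itertools import count, takewhile

def is_prime(n):
    # Naive primality test, as in the question.
    return n > 1 and all(n % i != 0 for i in range(2, n))

def prime_factors(n):
    # For each prime p dividing n, walk the powers 1, 2, 3, ... and stop at the
    # first k for which p**k no longer divides n; the last k kept is the exponent.
    return [
        (p, max(takewhile(lambda k: n % p**k == 0, count(1))))
        for p in range(2, n + 1)
        if n % p == 0 and is_prime(p)
    ]

print(prime_factors(12600))
```

Because the condition n % p == 0 is checked first, the takewhile sequence always contains at least the power 1, so max never sees an empty sequence.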

Related

I am having trouble returning prime factors of a number

I have the following method and when I run it I want to return a list of the prime factors for a number. For example, if I input 144, I want it to return [2,3]. I have the following method, but when I run it, there is infinite recursion.
def primeFactors(n):
    for i in range(2, math.ceil(math.sqrt(n)) + 1):
        if n % i == 0:
            return list(set([i] + primeFactors(int(i))))
    return [n]

You're passing the wrong value to the recursive call. If i divides n, then you want to add i to the prime factors of n // i (since i * (n // i) == n when i divides n evenly).
import math
def primeFactors(n):
    # int(...) rather than math.ceil(...) keeps i from reaching n itself when n == 2, which would append a spurious 1.
    for i in range(2, int(math.sqrt(n)) + 1):
        if n % i == 0:
            return [i] + primeFactors(n // i)
    return [n]
Iterating up to n (rather than to the square root of n) may be faster in most cases: n shrinks with each recursive call, and only the last call actually iterates all the way to n, so the plain loop will probably beat computing square roots.
If you only want the unique prime factors, use sets instead.
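def primeFactors(n):
    # int(...) instead of math.ceil(...) so that n == 2 is returned as {2} rather than {1, 2}.
    for i in range(2, int(math.sqrt(n)) + 1):
        if n % i == 0:
            return {i} | primeFactors(n // i)
    return {n}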
If you really want to speed things up, pass the last found factor as an argument and use it as your starting point:
def primefactors(n, x=2):
    for i in range(x, n):
        q, d = divmod(n, i)
        if d == 0:
            return {i} | primefactors(q, i)
    return {n}
As an exercise, try writing a version that extracts as many factors of 2 as possible; then you can iterate over only the odd factors 3, 5, ... instead of all integers greater than 2.
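One possible take on that exercise, as a rough sketch only: it is written iteratively rather than recursively, and the function name unique_prime_factors is made up for this illustration. It assumes n >= 1.

```python
def unique_prime_factors(n):
    """Return the set of distinct prime factors of n (assumes n >= 1)."""
    factors = set()
    # Strip out every factor of 2 first.
    if n % 2 == 0:
        factors.add(2)
        while n % 2 == 0:
            n //= 2
    # Only odd candidates 3, 5, 7, ... remain possible.
    i = 3
    while i * i <= n:
        if n % i == 0:
            factors.add(i)
            while n % i == 0:
                n //= i
        i += 2
    # Whatever is left above 1 is a single prime factor.
    if n > 1:
        factors.add(n)
    return factors

print(sorted(unique_prime_factors(144)))
```

Dividing each factor out completely means a composite candidate such as 9 can never divide what remains, so no separate primality test is needed.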

I can't find where I did wrong :(

I was working on Project Euler question 23 with Python. For this question, I have to find the sum of all numbers < 28124 that cannot be written as the sum of two abundant numbers. Abundant numbers are numbers that are smaller than the sum of their own proper divisors.
My approach was: https://gist.github.com/anonymous/373f23098aeb5fea3b12fdc45142e8f7
from math import sqrt
def dSum(n): #find sum of proper divisors
    lst = set([])
    if n%2 == 0:
        step = 1
    else:
        step = 2
    for i in range(1, int(sqrt(n))+1, step):
        if n % i == 0:
            lst.add(i)
            lst.add(int(n/i))
    llst = list(lst)
    lst.remove(n)
    sum = 0
    for j in lst:
        sum += j
    return sum
#any numbers greater than 28123 can be written as the sum of two abundant numbers.
#thus, only have to find abundant numbers up to 28124 / 2 = 14062
abnum = [] #list of abundant numbers
sum = 0
can = set([])
for i in range(1,14062):
    if i < dSum(i):
        abnum.append(i)
for i in abnum:
    for j in abnum:
        can.add(i + j)
print (abnum)
print (can)
cannot = set(range(1,28124))
cannot = cannot - can
cannot = list(cannot)
cannot.sort ()
result = 0
print (cannot)
for i in cannot:
    result += i
print (result)
which gave me an answer of 31531501, which is wrong.
I googled the answer, and it should be 4179871.
The two answers are millions apart, so it should mean that I'm removing numbers that cannot be written as the sum of two abundant numbers. But when I re-read the code, it looks logically fine...
Please save me from this despair

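For the experience, you really should look at comprehensions and at leveraging the builtins (rather than hiding them, as the sum variable does).
Note that the abundant numbers have to be collected over the whole range below 28124, not just up to 14062: a qualifying sum can pair a small abundant number such as 12 with one well above 14062. Your loops outside of dSum() (which can also be simplified) could look like: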
import itertools as it
abnum = [i for i in range(1,28124) if i < dSum(i)]
can = {i+j for i, j in it.product(abnum, repeat=2)}
cannot = set(range(1,28124)) - can
print(sum(cannot)) # 4179871

There are a few ways to improve your code.
Firstly, here's a more compact version of dSum that's fairly close to your code. Operators are generally faster than function calls, so I use ** .5 instead of calling math.sqrt. I use a conditional expression instead of an if...else block to compute the step size. I use the built-in sum function instead of a for loop to add up the divisors; also, I use integer subtraction to remove n from the total because that's more efficient than calling the set.remove method.
def dSum(n):
    lst = set()
    for i in range(1, int(n ** .5) + 1, 2 if n % 2 else 1):
        if n % i == 0:
            lst.add(i)
            lst.add(n // i)
    return sum(lst) - n
However, we don't really need to use a set here. We can just add the divisor pairs as we find them, if we're careful not to add any divisor twice.
def dSum(n):
    total = 0
    for i in range(1, int(n ** .5) + 1, 2 if n % 2 else 1):
        if n % i == 0:
            j = n // i
            if i < j:
                total += i + j
            else:
                if i == j:
                    total += i
                break
    return total - n
This is slightly faster, and uses less RAM, at the expense of added code complexity. However, there's a more efficient approach to this problem.
Instead of finding the divisors (and hence the divisor sum) of each number individually, it's better to use a sieving approach that finds the divisors of all the numbers in the required range. Here's a simple example.
num = 28124
# Build a table of divisor sums.
table = [1] * num
for i in range(2, num):
    for j in range(2 * i, num, i):
        table[j] += i
# Collect abundant numbers
abnum = [i for i in range(2, num) if i < table[i]]
print(len(abnum), abnum[0], abnum[-1])
output
6965 12 28122
If we need to find divisor sums for a very large num a good approach is to find the prime power factors of each number, since there's an efficient way to compute the sum of the divisors from the prime power factorization. However, for numbers this small the minor time saving doesn't warrant the extra code complexity. (But I can add some prime power sieve code if you're curious; for finding divisor sums for all numbers < 28124, the prime power sieve technique is about twice as fast as the above code).
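For reference, the divisor sum follows from the prime power factorization n = p1^k1 * p2^k2 * ... as the product of the terms (1 + p + p^2 + ... + p^k). Below is a minimal sketch for a single number. It uses plain trial division rather than the prime power sieve mentioned above, and the function names are illustrative only.

```python
def divisor_sum(n):
    """Sum of all divisors of n, including n itself (assumes n >= 1)."""
    total = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            # Build 1 + p + p^2 + ... + p^k for the full power of p in n.
            term = 1
            power = 1
            while n % p == 0:
                n //= p
                power *= p
                term += power
            total *= term
        p += 1
    if n > 1:
        # One prime factor larger than the square root may remain.
        total *= 1 + n
    return total

def proper_divisor_sum(n):
    """Sum of the divisors of n that are smaller than n."""
    return divisor_sum(n) - n

print(proper_divisor_sum(12))
```

A number n is abundant exactly when proper_divisor_sum(n) > n, which matches the i < table[i] test used with the sieve table.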
AChampion's answer shows a very compact way to find the sum of the numbers that cannot be written as the sum of two abundant numbers. However, it's a bit slow, mostly because it loops over all pairs of abundant numbers in abnum. Here's a faster way.
def main():
    num = 28124
    # Build a table of divisor sums. table[0] should be 0, but we ignore it.
    table = [1] * num
    for i in range(2, num):
        for j in range(2 * i, num, i):
            table[j] += i
    # Collect abundant numbers
    abnum = [i for i in range(2, num) if i < table[i]]
    del table
    # Create a set for fast searching
    abset = set(abnum)
    print(len(abset), abnum[0], abnum[-1])
    total = 0
    for i in range(1, num):
        # Search for pairs of abundant numbers j <= d: j + d == i
        for j in abnum:
            d = i - j
            if d < j:
                # No pairs were found
                total += i
                break
            if d in abset:
                break
    print(total)
if __name__ == "__main__":
    main()
output
6965 12 28122
4179871
This code runs in around 2.7 seconds on my old 32-bit single-core 2GHz machine running Python 3.6.0. On Python 2, it's about 10% faster; I think that's because list comprehensions have less overhead in Python 2 (they run in the current scope rather than creating a new scope).

Finding the greatest prime divisor (the fastest program possible)

I was checking the problems on http://projecteuler.net/
The third problem is as follows:
The prime factors of 13195 are 5, 7, 13 and 29. What is the largest
prime factor of the number 600851475143 ?
My solution code is below. But it is so slow that I think it will take weeks to complete.
How can I improve it? Or is Python itself too slow to solve this problem?
def IsPrime(num):
    if num < 2:
        return False
    if num == 2:
        return True
    else:
        for div in range(2,num):
            if num % div == 0:
                return False
        return True
GetInput = int (input ("Enter the number: "))
PrimeDivisors = []
for i in range(GetInput, 1, -1):
    print(i)
    if GetInput % i == 0 and IsPrime(i) is True:
        PrimeDivisors.append(i)
        break
    else:
        continue
print(PrimeDivisors)
print("The greatest prime divisor is:", max(PrimeDivisors))

The problem with your solution is that you don't take your found prime factors into account, so you're needlessly checking factors after you've actually found the largest one. Here's my solution:
def largest_prime_factor(n):
    largest = None
    for i in range(2, n):
        while n % i == 0:
            largest = i
            n //= i
        if n == 1:
            return largest
    if n > 1:
        return n
Project Euler problems are more about mathematics than programming, so if your solution is too slow, it's probably not your language that's at fault.
Note that my solution works quickly for this specific number by chance, so it's definitely not a general solution. Faster solutions are complicated and overkill in this specific case.
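For comparison, here is a hedged sketch of one simple way to make the same trial-division idea less dependent on luck: stop as soon as the candidate exceeds the square root of what is left of n. The function name is invented for this illustration, and it is still plain trial division, not one of the faster algorithms alluded to above.

```python
def largest_prime_factor_bounded(n):
    """Largest prime factor of n (assumes n >= 2)."""
    largest = None
    i = 2
    # Once i*i exceeds the remaining n, no further divisor can exist.
    while i * i <= n:
        while n % i == 0:
            largest = i
            n //= i
        i += 1
    # Any remainder above 1 is itself prime and larger than every factor found.
    return n if n > 1 else largest

print(largest_prime_factor_bounded(13195))
```

For a large prime input this tests only about sqrt(n) candidates instead of n of them.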

This might not be the fastest algorithm but it's quite efficient:
def prime(x):
    if x in [0, 1]:
        return False
    for n in range(2, int(x ** 0.5 + 1)):
        if x % n == 0:
            return False
    return True
def primes():
    """Prime Number Generator
    Generate an infinite sequence of primes
    http://stackoverflow.com/questions/567222/simple-prime-generator-in-python
    """
    # Maps composites to primes witnessing their compositeness.
    # This is memory efficient, as the sieve is not "run forward"
    # indefinitely, but only as long as required by the current
    # number being tested.
    #
    D = {}
    # The running integer that's checked for primeness
    q = 2
    while True:
        if q not in D:
            # q is a new prime.
            # Yield it and mark its first multiple that isn't
            # already marked in previous iterations
            #
            yield q
            D[q * q] = [q]
        else:
            # q is composite. D[q] is the list of primes that
            # divide it. Since we've reached q, we no longer
            # need it in the map, but we'll mark the next
            # multiples of its witnesses to prepare for larger
            # numbers
            #
            for p in D[q]:
                D.setdefault(p + q, []).append(p)
            del D[q]
        q += 1
def primefactors(x):
    if x in [0, 1]:
        yield x
    elif prime(x):
        yield x
    else:
        for n in primes():
            if x % n == 0:
                yield n
                break
        for factor in primefactors(x // n):
            yield factor
Usage:
>>> list(primefactors(100))
[2, 2, 5, 5]

My code, which seems fast enough to me. Using collections.defaultdict() would make the code of primes() a bit cleaner, but I guess the code would lose some speed due to importing it.
def primes():
    """Prime number generator."""
    n, skip = 2, {}
    while True:
        primes = skip.get(n)
        if primes:
            for p in primes:
                skip.setdefault(n + p, set()).add(p)
            del skip[n]
        else:
            yield n
            skip[n * n] = {n}
        n += 1
def un_factor(n):
    """Does a unique prime factorization on n.
    Returns an ordered tuple of (prime, prime_powers)."""
    if n == 1:
        return ()
    result = []
    for p in primes():
        (div, mod), power = divmod(n, p), 1
        while mod == 0:
            if div == 1:
                result.append((p, power))
                return tuple(result)
            n = div
            div, mod = divmod(n, p)
            if mod != 0:
                result.append((p, power))
            power += 1
Test run:
>>> un_factor(13195)
((5, 1), (7, 1), (13, 1), (29, 1))
>>> un_factor(600851475143)
((71, 1), (839, 1), (1471, 1), (6857, 1))
>>> un_factor(20)
((2, 2), (5, 1))
EDIT: Minor edits for primes() generator based on this recipe.
EDIT2: Fixed for 20.
EDIT3: Replaced greatest_prime_divisor() with un_factor().

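from math import sqrt
def getLargestFactor(n):
    maxFactor = sqrt(n)
    lastFactor = n
    while n % 2 == 0:
        n //= 2
        lastFactor = 2
    # The upper bound is inclusive so that a factor equal to sqrt(n) is not skipped.
    for i in range(3, int(maxFactor) + 1, 2):
        if sqrt(n) < i:
            break
        while n % i == 0 and n > 1:
            n //= i
            lastFactor = i
    # Whatever remains above 1 is a prime larger than every factor found so far.
    return n if n > 1 else lastFactor
This should be fairly efficient. Each factor is divided out completely, so only prime factors are ever found, and the code uses the fact that a number can have at most one prime factor larger than sqrt(n).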

Python Eratosthenes Sieve Algorithm Optimization

I'm attempting to implement the Sieve of Eratosthenes. The output seems to be correct (minus "2" that needs to be added) but if the input to the function is larger than 100k or so it seems to take an inordinate amount of time. What are ways that I can optimize this function?
def sieveErato(n):
    numberList = range(3,n,2)
    for item in range(int(math.sqrt(len(numberList)))):
        divisor = numberList[item]
        for thing in numberList:
            if(thing % divisor == 0) and thing != divisor:
                numberList.remove(thing)
    return numberList

Your algorithm is not the Sieve of Eratosthenes. You perform trial division (the modulus operator) instead of crossing off multiples, as Eratosthenes did over two thousand years ago. Here is an explanation of the true sieving algorithm, and shown below is my simple, straightforward implementation, which returns a list of primes not exceeding n:
def sieve(n):
    m = (n-1) // 2
    b = [True]*m
    i,p,ps = 0,3,[2]
    # <= (not <) so that n itself is crossed off when it is an odd square such as 9
    while p*p <= n:
        if b[i]:
            ps.append(p)
            j = 2*i*i + 6*i + 3
            while j < m:
                b[j] = False
                j = j + 2*i + 3
        i+=1; p+=2
    while i < m:
        if b[i]:
            ps.append(p)
        i+=1; p+=2
    return ps
We sieve only on the odd numbers, stopping at the square root of n. The odd-looking calculations on j map between the integers being sieved 3, 5, 7, 9, ... and indexes 0, 1, 2, 3, ... in the b array of bits.
You can see this function in action at http://ideone.com/YTaMB, where it computes the primes to a million in less than a second.
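The index arithmetic on j can be checked directly. The sketch below is only an illustration (the helper names value_at and index_of are made up here): index i stands for the odd number p = 2*i + 3, so the first multiple worth crossing off, p*p, sits at index (p*p - 3) / 2 = 2*i*i + 6*i + 3, and each further odd multiple is p index positions later.

```python
def value_at(index):
    """Odd number represented by a position in b: 0 -> 3, 1 -> 5, 2 -> 7, ..."""
    return 2 * index + 3

def index_of(value):
    """Inverse mapping, valid for odd values >= 3."""
    return (value - 3) // 2

for i in range(5):
    p = value_at(i)
    # Crossing off starts at p*p, whose index is 2*i*i + 6*i + 3.
    assert index_of(p * p) == 2 * i * i + 6 * i + 3
    # The next odd multiple is p*p + 2*p, which is p = 2*i + 3 positions further on.
    assert index_of(p * p + 2 * p) - index_of(p * p) == 2 * i + 3
```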

You can do it the same way Eratosthenes did. Take an array of all the numbers you need to check, in ascending order. Go to 2 and mark it, then scratch every second number after it until the end of the array. Go to 3 and mark it, then scratch every third number. Go to 4: it is already scratched, so skip it. Repeat this for each following number that is not already scratched.
In the end, the marked numbers are the primes. This algorithm is faster, but it can need a lot of memory. You can optimize it a little by dropping all even numbers (they are not prime) and adding 2 to the list manually. This complicates the logic a little but halves the memory use.
Here is an illustration of what I'm talking about: http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes
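The steps described above can be sketched roughly as follows. This is a plain version without the even-number optimization; the function name primes_up_to and the flag list are illustrative choices, and "scratched" numbers are represented by False entries.

```python
def primes_up_to(limit):
    """Return all primes <= limit using a basic Sieve of Eratosthenes."""
    if limit < 2:
        return []
    # not_scratched[k] stays True until k is crossed off as a multiple.
    not_scratched = [True] * (limit + 1)
    not_scratched[0] = not_scratched[1] = False
    # Numbers above sqrt(limit) have no unscratched multiples left to cross off.
    for n in range(2, int(limit ** 0.5) + 1):
        if not_scratched[n]:
            # n is prime: scratch every n-th number after it.
            for multiple in range(2 * n, limit + 1, n):
                not_scratched[multiple] = False
    return [n for n, keep in enumerate(not_scratched) if keep]

print(primes_up_to(30))
```

Unlike the code in the question, nothing is removed from a list and no modulus is computed; multiples are reached purely by stepping through the array.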

Warning: removing elements from a list while iterating over it can be dangerous...
You could make the
if(thing % divisor == 0) and thing != divisor:
test lighter by splitting the loop: skip ahead until you reach the index of 'divisor', and from there on run only the test:
for thing in numberList_fromDivisorOn:
    if(thing % divisor == 0):
        numberList.remove(thing)

This code takes 2 seconds to generate the primes less than 10M
(it is not mine; I found it somewhere on Google)
def erat_sieve(bound):
    if bound < 2:
        return []
    max_ndx = (bound - 1) // 2
    sieve = [True] * (max_ndx + 1)
    #loop up to square root
    for ndx in range(int(bound ** 0.5) // 2):
        # check for prime
        if sieve[ndx]:
            # unmark all odd multiples of the prime
            num = ndx * 2 + 3
            sieve[ndx+num:max_ndx:num] = [False] * ((max_ndx-ndx-num-1)//num + 1)
    # translate into numbers
    return [2] + [ndx * 2 + 3 for ndx in range(max_ndx) if sieve[ndx]]

I followed this link: Sieve of Eratosthenes - Finding Primes Python, as suggested by @MAK, and found that the accepted answer could be improved with an idea from your code:
import math
def primes_sieve2(limit):
    a = [True] * limit # Initialize the primality list
    a[0] = a[1] = False
    sqrt = int(math.sqrt(limit))+1
    for i in xrange(sqrt):
        isprime = a[i]
        if isprime:
            yield i
            for n in xrange(i*i, limit, i): # Mark factors non-prime
                a[n] = False
    for (i, isprime) in enumerate(a[sqrt:]):
        if isprime:
            yield i+sqrt

Given unlimited memory and time, the following code will print all the prime numbers, and it does so without using trial division. It is based on the Haskell code in the paper The Genuine Sieve of Eratosthenes by Melissa E. O'Neill.
from heapq import heappush, heappop, heapreplace
def sieve():
    w = [2,4,2,4,6,2,6,4,2,4,6,6,2,6,4,2,6,4,6,8,4,2,4,2,4,8,6,4,6,2,4,6,2,6,6,4,2,4,6,2,6,4,2,4,2,10,2,10]
    for p in [2,3,5,7]: print(p)
    n,o = 11,0
    t = []
    l = len(w)
    p = n
    heappush(t, (p*p, n,o,p))
    print(p)
    while True:
        n,o = n+w[o],(o+1)%l
        p = n
        if not t[0][0] <= p:
            heappush(t, (p*p, n,o,p))
            print(p)
            continue
        while t[0][0] <= p:
            _, b,c,d = t[0]
            b,c = b+w[c],(c+1)%l
            heapreplace(t, (b*d, b,c,d))
sieve()

Can this be made more pythonic?

I came across this (really) simple program a while ago. It just outputs the first x primes. I'm embarrassed to ask, is there any way to make it more "pythonic" ie condense it while making it (more) readable? Switching functions is fine; I'm only interested in readability.
Thanks
from math import sqrt
def isprime(n):
  if n ==2:
    return True
  if n % 2 ==0 : # evens
    return False
  max = int(sqrt(n))+1 #only need to search up to sqrt n
  i=3
  while i <= max: # range starts with 3 and for odd i
    if n % i == 0:
      return False
    i+=2
  return True
reqprimes = int(input('how many primes: '))
primessofar = 0
currentnumber = 2
while primessofar < reqprimes:
  result = isprime(currentnumber)
  if result:
    primessofar+=1
    print currentnumber
    #print '\n'
  currentnumber += 1

Your algorithm itself may be implemented pythonically, but it's often useful to rewrite algorithms in a functional way. You might end up with a completely different but more readable solution (which is even more pythonic).
def primes(upper):
    n = 2; found = []
    while n < upper:
        # If a number is not divisible by any of the preceding primes, it's prime
        if all(n % div != 0 for div in found):
            yield n
            found.append( n )
        n += 1
Usage:
for pr in primes(1000):
    print(pr)
Or, with Alasdair's comment taken into account, a more efficient version:
from math import sqrt
from itertools import takewhile
def primes(upper):
    n = 2; foundPrimes = []
    while n < upper:
        sqrtN = int(sqrt(n))
        # If a number n is not divisible by any of the preceding primes up to sqrt(n), it's prime
        if all(n % div != 0 for div in takewhile(lambda div: div <= sqrtN, foundPrimes)):
            yield n
            foundPrimes.append(n)
        n += 1

The given code is not very efficient. Alternative solution (just as inefficient):†
>>> from math import sqrt
>>> def is_prime(n):
...     return all(n % d for d in range(2, int(sqrt(n)) + 1))
...
>>> def primes_up_to(n):
...     return filter(is_prime, range(2, n))
...
>>> list(primes_up_to(20))
[2, 3, 5, 7, 11, 13, 17, 19]
This code uses all, range, int, math.sqrt, filter and list. It is not completely identical to your code, as it prints primes up to a certain number, not exactly n primes. For that, you can do:
>>> from itertools import count, islice
>>> def n_primes(n):
...     return islice(filter(is_prime, count(2)), n)
...
>>> list(n_primes(10))
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
That introduces another two functions, namely itertools.count and itertools.islice. (That last piece of code works only in Python 3.x; in Python 2.x, use itertools.ifilter instead of filter.)
†: A more efficient method is to use the Sieve of Eratosthenes.

A few minor things from the style guide.
Use four spaces, not two. (Personally I prefer tabs, but that's not the Pythonic way.)
Fewer blank lines.
Consistent whitespace: n ==2: => n == 2:
Use underscores in your variables names: currentnumber => current_number

Firstly, you should not use max as a variable name, as it is a built-in function used to find the maximum value of an iterable. Also, that entire section of code can instead be written as
for i in xrange(3, int(sqrt(n))+1, 2):
    if n%i==0: return False
Also, instead of defining a new variable result and putting the value returned by isprime into it, you can just directly do
if isprime(currentnumber):

I recently found Project Euler solutions in functional python and it has some really nice examples of working with primes like this. Number 7 is pretty close to your problem:
from itertools import count, ifilter  # Python 2; in Python 3 use the built-in filter
from math import sqrt
def isprime(n):
    """Return True if n is a prime number"""
    if n < 3:
        return (n == 2)
    elif n % 2 == 0:
        return False
    elif any(((n % x) == 0) for x in xrange(3, int(sqrt(n))+1, 2)):
        return False
    return True
def primes(start=2):
    """Generate prime numbers from 'start'"""
    return ifilter(isprime, count(start))

Usually you don't use while loops for simple things like this. Instead, you create a range object and take the elements from there. So you could rewrite the first loop like this, for example:
for i in range( 3, int( sqrt( n ) ) + 1, 2 ):
    if n % i == 0:
        return False
It would also be a lot better to cache your prime numbers and test each new number only against the primes found so far. You can save a lot of time that way (and easily calculate larger prime numbers). Here is some code I wrote before to get all prime numbers up to n easily:
def primeNumbers ( end ):
    primes = []
    primes.append( 2 )
    for i in range( 3, end, 2 ):
        isPrime = True
        for j in primes:
            if i % j == 0:
                isPrime = False
                break
        if isPrime:
            primes.append( i )
    return primes
print(primeNumbers( 20 ))

Translated from the brilliant guys at stacktrace.it (Daniele Varrazzo, specifically), this version takes advantage of a binary min-heap to solve this problem:
from heapq import heappush, heapreplace
def yield_primes():
    """Endless prime number generator."""
    # Yield 2, so we don't have to handle the empty heap special case
    yield 2
    # Heap of (non-prime, prime factor) tuples.
    todel = [ (4, 2) ]
    n = 3
    while True:
        if todel[0][0] != n:
            # This number is not on the head of the heap: prime!
            yield n
            heappush(todel, (n*n, n)) # add to heap
        else:
            # Not prime: add to heap
            while todel[0][0] == n:
                p = todel[0][1]
                heapreplace(todel, (n+p, p))
                # heapreplace pops the minimum value then pushes:
                # heap size is unchanged
        n += 1
This code isn't mine and I don't understand it fully (but the explanation is here :) ), so I'm marking this answer as community wiki.

You can make it more pythonic with a sieve algorithm (all primes smaller than 100):
def primes(n):
    sieved = set()
    for i in range(2, n):
        if not(i in sieved):
            for j in range(i + i, n, i):
                sieved.add(j)
    return set(range(2, n)) - sieved
print(primes(100))
A very small trick will turn it into what you want.
