Python: Performance when looping and manipulating a large array

My question is two-fold:
Is there a way to efficiently loop over an array (using enumerate, for
example) while manipulating both the array and the loop index at the
same time?
Are there any memory-optimized versions of arrays in Python?
(like NumPy creating smaller arrays with a specified type)
I have made an algorithm finding prime numbers in range (2 - rng) with the Sieve of Eratosthenes.
Note: The problem is nonexistent if searching for primes in 2 - 1,000,000 (under 1 sec total runtime too). In the tens and hundreds of millions this starts to hurt. So far changing the table from including all natural numbers to just odd ones, the rough maximum range I was able to search was 400 million (200 million in odd numbers).
Using while loops instead of for loops decreases performance, at least with the current algorithm.
NumPy, while able to create smaller arrays via type conversion, actually takes roughly double the time to process with the same code, except
oddTable = np.int8(np.zeros(size))
in place of
oddTable = [0] * size
and using integers to assign values "prime" and "not prime" to keep the array type.
Using pseudo-code, the algorithm would look like this:
oddTable = [0] * size  # Array representing odd numbers excluding 1 up to rng
for item in oddTable:
    if item == 0:  # Prime, since not a product of any previous prime
        set item to "prime"
        set every multiple of item in oddTable to "not prime"
Python is a neat language particularly when looping over every item in a list, but as the index in, say
for i in range(1000)
can't be manipulated while in the loop, I had to convert the range a few times to produce a suitable iterable to use. In the code: "P" marks prime numbers, "_" marks non-primes, and 0 marks not yet checked.
num = 1                        # Primes found (2 is prime)
size = int(rng / 2) - 1        # Size of table required to represent odd numbers
oddTable = [0] * size          # Array with odd numbers \ 1: [3, 5, 7, 9...]

new_rng = int((size - 1) / 3)  # To go through every 3rd item
for i in range(new_rng):       # Eliminate multiples of 3
    oddTable[i * 3] = "_"
oddTable[0] = "P"              # Set 3 to prime
num += 1

def act(x):                    # The actual integer that index x in the table refers to
    x = (x + 1) * 2 + 1
    return x

# Multiples of 2 and 3 eliminated, so all primes are 6k + 1 or 6k + 5
# In the oddTable: remaining primes are either 3*i + 1 or 3*i + 2
# new_rng to loop exactly 1/3 of the table length -> touch every item once
for i in range(new_rng):
    j = 3*i + 1                # 3*i + 1
    if oddTable[j] == 0:
        num += 1
        oddTable[j] = "P"
        k = act(j)
        multiple = j + k       # The odd multiple indexes of act(j)
        while multiple < size:
            oddTable[multiple] = "_"
            multiple += k
    j += 1                     # 3*i + 2
    if oddTable[j] == 0:
        num += 1
        oddTable[j] = "P"
        k = act(j)
        multiple = j + k
        while multiple < size:
            oddTable[multiple] = "_"
            multiple += k
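Aside: the NumPy slowdown described above usually comes from assigning array elements one at a time in a Python-level loop; marking composites with vectorized slice assignment typically makes NumPy the faster option, and a bool array costs one byte per entry. Below is a minimal, illustrative sketch of an odd-only sieve along those lines — not the original code, and count_primes is a name introduced here:

import numpy as np

def count_primes(rng):
    """Count primes in [2, rng] (rng >= 2) with an odd-only sieve.
    Index i of is_prime represents the odd number 2*i + 3."""
    size = (rng - 1) // 2                 # how many odd numbers 3, 5, 7, ... <= rng
    is_prime = np.ones(size, dtype=bool)  # one byte per entry
    for i in range((int(rng ** 0.5) - 1) // 2):
        if is_prime[i]:
            p = 2 * i + 3
            # vectorized: strike out p*p, p*p + 2p, p*p + 4p, ... in one step
            is_prime[(p * p - 3) // 2 :: p] = False
    return int(is_prime.sum()) + 1        # + 1 accounts for the prime 2

print(count_primes(10_000_000))  # 664579

A plain bytearray gives the same one-byte-per-entry footprint without NumPy, though without the vectorized marking.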

To make your code more pythonic, split your algorithm into smaller chunks (functions), so that each chunk can be grasped easily.
My second comment might astound you: Python comes with "batteries included". In order to program your Eratosthenes' Sieve, why do you need to manipulate arrays explicitly and pollute your code with it? Why not create a function (e.g. is_prime) and use the standard memoize decorator that was provided for that purpose? (If you insist on using 2.7, see also the memoization library for Python 2.7.)
The result of the two pieces of advice above might not be the "most efficient", but it will (as I experienced with that exact problem) work well enough, while allowing you to quickly create sleek code that will save your programmer's time (both for creation and maintenance).
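For illustration, a minimal sketch of that suggestion, assuming Python 3, where functools.lru_cache is the standard memoize decorator (the trial-division body of is_prime is just an example, not the only choice):

from functools import lru_cache

@lru_cache(maxsize=None)  # the standard memoize decorator
def is_prime(n):
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2  # 2 is the only even prime
    # trial division by odd numbers up to sqrt(n)
    return all(n % d for d in range(3, int(n ** 0.5) + 1, 2))

# Repeated queries are answered from the cache:
print(sum(is_prime(n) for n in range(2, 1_000_000)))  # 78498

Note that memoization pays off mainly when the same numbers are tested repeatedly; for a dense range like this, a sieve remains the faster tool.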

Related

How to improve efficiency and time-complexity of AIO Cloud Cover solution

Sorry for the noob question, but is there a less time-expensive method to iterate through the input list? Upon submission I receive timeout errors. I tried changing the method of checking for the minimum answer by appending to a list and using the min function, but as expected that didn't help at all.
Input:
6 3
3
6
4
2
5
Solution:
with open("cloudin.txt", "r") as input_file:
n, covered = map(int, input_file.readline().split())
ls = [None for i in range(100005)]
for i in range(n-1):
ls[i] = int(input_file.readline().strip())
ans = 1000000001
file = open("cloudout.txt", "w")
for i in range(n-covered):
a = 0
for j in range(covered):
a += ls[i+j]
if a < ans:
ans = a
file.write(str(ans))
output:
11
https://orac2.info/problem/aio18cloud/
The core logic of your code is contained in these lines:
ans = 1000000001
for i in range(n-covered):
    a = 0
    for j in range(covered):
        a += ls[i+j]
    if a < ans:
        ans = a
Let's break down what this code actually does. For each closed interval (i.e. including the endpoints) [left, right] from the list [0, covered-1], [1, covered], [2, covered+1], ..., [n-covered-1, n-2] (that is, all closed intervals containing exactly covered elements and that are subintervals of [0, n-2]), you are computing the range sum ls[left] + ls[left+1] + ... + ls[right]. Then you set ans to the minimum such range sum.
Currently, that nested loop takes O((n-covered)*covered) steps, which is O(n^2) if covered is n/2, for example. You want a way to compute that range sum in constant time, eliminating the nested loop, to make the runtime O(n).
The easiest way to do this is with a prefix sum array. In Python, itertools.accumulate() is the standard/simplest way to generate those. To see how this helps:
Original Sum: ls[left] + ls[left+1] + ... + ls[right]
can be rewritten as the difference of prefix sums
(ls[0] + ls[1] + ... + ls[right])
- (ls[0] + ls[1] + ... + ls[left-1])
which is prefix_sum(0, right) - prefix_sum(0, left-1)
where all intervals are written in inclusive notation.
Pulling this into a separate range_sum() function, you can rewrite the original core logic block as:
prefix_sums = list(itertools.accumulate(ls, initial=0))

def range_sum(left: int, right: int) -> int:
    """Given indices left and right, returns the sum of values of
    ls in the inclusive interval [left, right].
    Equivalent to sum(ls[left : right+1])"""
    return prefix_sums[right+1] - prefix_sums[left]

ans = 1000000001
for i in range(n - covered):
    a = range_sum(left=i, right=i+covered-1)
    if a < ans:
        ans = a
The trickiest part of prefix sum arrays is just avoiding off-by-one errors in indexes. Notice that our prefix sum array of the length-n array ls has n+1 elements, since it starts with the empty initial prefix sum of 0, and so we add 1 to array accesses to prefix_sums compared to our formula.
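As a quick sanity check against the sample input above (ls = [3, 6, 4, 2, 5], covered = 3):

import itertools

ls = [3, 6, 4, 2, 5]                                     # the sample input (n = 6, covered = 3)
prefix_sums = list(itertools.accumulate(ls, initial=0))  # [0, 3, 9, 13, 15, 20]

# the three windows of length 3: [0,2] -> 13, [1,3] -> 12, [2,4] -> 11
print(prefix_sums[3] - prefix_sums[0])  # 13
print(prefix_sums[4] - prefix_sums[1])  # 12
print(prefix_sums[5] - prefix_sums[2])  # 11, the minimum, matching the expected output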
Also, it's possible there's an off-by-one error in your original code, as the loop reads only n-1 values, so ls[n-1] is never set or used?

How do i optimize this code to run for larger values? [duplicate]

This question already has answers here:
Elegant Python code for Integer Partitioning [closed]
(11 answers)
Closed 1 year ago.
I'm writing a python function that takes an integer value between 3 and 200 as input, calculates the number of sums using unique nonzero numbers that will equal the number and prints the output.
For example: with 3 as input, 1 will be printed because only 1+2 gives 3; with 6 as input, 3 will be printed because 1+2+3, 1+5 and 2+4 equal 6.
My code works well only for numbers less than 30 after which it starts getting slow. How do I optimize my code to run efficiently for all input between 3 and 200.
from itertools import combinations

def solution(n):
    count = 0
    max_terms = 0
    num = 0
    for i in range(1, n):
        if num + i <= n:
            max_terms += 1
            num = num + i
    for terms in range(2, max_terms + 1):
        for sample in list(combinations(list(range(1, n)), terms)):
            if sum(sample) == n:
                count += 1
    print(count)
Generating all combinations is indeed not very efficient as most will not add up to n.
Instead, you could use a recursive function, which can be called after taking away one partition (i.e. one term of the sum), and will solve the problem for the remaining amount, given an extra indication that future partitions should be greater than the one just taken.
To further improve the efficiency, you can use memoization (dynamic programming) to avoid solving the same sub problem multiple times.
Here is the code for that:
def solution(n, least=1, memo={}):
    if n < least:
        return 0
    key = (n, least)
    if key in memo:  # Use memoization
        return memo[key]
    # Counting the case where n is not partitioned
    # (But do not count it when it is the original number itself)
    count = int(least > 1)
    # Counting the cases where n is partitioned
    for i in range(least, (n + 1) // 2):
        count += solution(n - i, i + 1)
    memo[key] = count
    return count
Tested the code with these arguments. The comments list the sums that are counted:
print(solution(1)) # none
print(solution(2)) # none
print(solution(3)) # 1+2
print(solution(4)) # 1+3
print(solution(5)) # 1+4, 2+3
print(solution(6)) # 1+2+3, 1+5, 2+4
print(solution(7)) # 1+2+4, 1+6, 2+5, 3+4
print(solution(8)) # 1+2+5, 1+3+4, 1+7, 2+6, 3+5
print(solution(9)) # 1+2+6, 1+3+5, 2+3+4, 1+8, 2+7, 3+6, 4+5
print(solution(10)) # 1+2+3+4, 1+2+7, 1+3+6, 1+4+5, 2+3+5, 1+9, 2+8, 3+7, 4+6
Your question isn't clear enough, so I'm making some assumptions...
So, what you want is to enter a number, say 4, and then figure out the total combinations where two different digits add up to that number. If that is what you want, then the answer is quite simple.
For 4, let's take that as 'n'. 'n' has the combinations 1+3, 2+2.
For n as 6, the combos are 1+5, 2+4, 3+3.
You might have caught a pattern (4 and 6 have half their combinations). Also, odd numbers have combinations that are half their previous even number, i.e. 5 has (4/2) = 2 combos: 1+4, 2+3. So...
The formulas to get the number of combinations are:
(n)/2 - this is if you want to include same-number combos like 2+2 for 4, but exclude combos with 0, i.e. 0+4 for 4.
(n+1)/2 - this works if you want to exclude either the combos with 0 (i.e. 0+4 for 4) or the ones with same numbers (i.e. 2+2 for 4).
(n-1)/2 - here, same-number combos are excluded (i.e. 2+2 won't be counted as a combo for n as 4), and 0-combos (i.e. 0+4 for 4) are excluded too.
n is the main number (in these examples, it is 4). This will work only if n is an integer and these values are computed with integer division.
3-number combos are totally different. I'm sure there's a formula for that too.
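For what it's worth, the last formula, (n-1)/2, is the one matching the two-term case of the original question (distinct, nonzero terms); a quick brute-force check of it:

# Compare (n - 1) // 2 against a brute-force count of pairs
# a + b == n with 0 < a < b (distinct, nonzero terms).
for n in range(3, 50):
    brute = sum(1 for a in range(1, n) for b in range(a + 1, n) if a + b == n)
    assert brute == (n - 1) // 2, n
print("(n - 1) // 2 matches brute force for n in 3..49")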

Guidance on removing a nested for loop from function

I'm trying to write the fastest algorithm possible to return the number of "magic triples" (i.e. x, y, z where z is a multiple of y and y is a multiple of x) in a list of 3-2000 integers.
(Note: I believe the list was expected to be sorted and unique but one of the test examples given was [1,1,1] with the expected result of 1 - that is a mistake in the challenge itself though because the definition of a magic triple was explicitly noted as x < y < z, which [1,1,1] isn't. In any case, I was trying to optimise an algorithm for sorted lists of unique integers.)
I haven't been able to work out a solution that doesn't involve three nested loops, and is therefore O(n^3). I've seen one online that is O(n^2), but I can't get my head around what it's doing, so it doesn't feel right to submit it.
My code is:
def solution(l):
    if len(l) < 3:
        return 0
    elif l == [1,1,1]:
        return 1
    else:
        halfway = int(l[-1]/2)
        quarterway = int(halfway/2)
        quarterIndex = 0
        halfIndex = 0
        for i in range(len(l)):
            if l[i] >= quarterway:
                quarterIndex = i
                break
        for i in range(len(l)):
            if l[i] >= halfway:
                halfIndex = i
                break
        triples = 0
        for i in l[:quarterIndex+1]:
            for j in l[:halfIndex+1]:
                if j != i and j % i == 0:
                    multiple = 2
                    while (j * multiple) <= l[-1]:
                        if j * multiple in l:
                            triples += 1
                        multiple += 1
        return triples
I've spent quite a lot of time going through examples manually and removing loops through unnecessary sections of the lists, but this still completes a list of 2,000 integers in about a second, where the O(n^2) solution I found completes the same list in 0.6 seconds. It seems like a small difference, but it means mine takes roughly 60% longer.
Am I missing a really obvious way of removing one of the loops?
Also, I saw mention of making a directed graph and I see the promise in that. I can make the list of first nodes from the original list with a built-in function, so in principle I presume that means I can make the overall graph with two for loops and then return the length of the third node list, but I hit a wall with that too. I just can't seem to make progress without that third loop!!
from array import array

def num_triples(l):
    n = len(l)
    pairs = set()
    lower_counts = array("I", (0 for _ in range(n)))
    upper_counts = lower_counts[:]
    for i in range(n - 1):
        lower = l[i]
        for j in range(i + 1, n):
            upper = l[j]
            if upper % lower == 0:
                lower_counts[i] += 1
                upper_counts[j] += 1
    return sum(nx * nz for nz, nx in zip(lower_counts, upper_counts))
Here, lower_counts[i] is the number of pairs of which the ith number is the y, and z is the other number in the pair (i.e. the number of different z values for this y).
Similarly, upper_counts[i] is the number of pairs of which the ith number is the y, and x is the other number in the pair (i.e. the number of different x values for this y).
So the number of triples in which the ith number is the y value is just the product of those two numbers.
The use of an array here for storing the counts is for scalability of access time. Tests show that up to n=2000 it makes negligible difference in practice, and even up to n=20000 it only made about a 1% difference to the run time (compared to using a list), but it could in principle be the fastest growing term for very large n.
How about using itertools.combinations instead of nested for loops? Combined with list comprehension, it's cleaner and much faster. Let's say l = [your list of integers] and let's assume it's already sorted.
from itertools import combinations

def div(i, j, k):  # this function has the logic
    return l[k] % l[j] == l[j] % l[i] == 0

# combinations() already yields index triples with i < j < k
r = sum(div(i, j, k) for i, j, k in combinations(range(len(l)), 3))
@alaniwi provided a very smart iterative solution.
Here is a recursive solution.
def find_magicals(lst, nplet):
    """Find the number of magical n-plets in a given lst"""
    res = 0
    for i, base in enumerate(lst):
        # find all the multiples of current base
        multiples = [num for num in lst[i + 1:] if not num % base]
        res += len(multiples) if nplet <= 2 else find_magicals(multiples, nplet - 1)
    return res

def solution(lst):
    return find_magicals(lst, 3)
The problem can be divided into selecting any number in the original list as the base (i.e. x), then counting how many duplets we can find among the numbers bigger than the base. Since the method of finding all duplets is the same as that of finding triplets, we can solve the problem recursively.
From my testing, this recursive solution is comparable to, if not more performant than, the iterative solution.
This answer was the first suggestion by @alaniwi and is the one I've found to be the fastest (at 0.59 seconds for a 2,000-integer list).
def solution(l):
    n = len(l)
    lower_counts = dict((val, 0) for val in l)
    upper_counts = lower_counts.copy()
    for i in range(n - 1):
        lower = l[i]
        for j in range(i + 1, n):
            upper = l[j]
            if upper % lower == 0:
                lower_counts[lower] += 1
                upper_counts[upper] += 1
    return sum((lower_counts[y] * upper_counts[y] for y in l))
I think I've managed to get my head around it. What it is essentially doing is comparing each number in the list with every other number to see if the larger is divisible by the smaller, and building two dictionaries:
One with, for each number, how many larger numbers it divides,
One with, for each number, how many smaller numbers divide it.
You then multiply the two values for each key, because a key having a 0 in either dictionary essentially means it cannot be the second number (y) in a triple.
Example:
l = [1,2,3,4,5,6]
lower_counts = {1:5, 2:2, 3:1, 4:0, 5:0, 6:0}
upper_counts = {1:0, 2:1, 3:1, 4:2, 5:1, 6:3}
triple_tuple = ([1,2,4], [1,2,6], [1,3,6])
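Running the dict-based solution above on that example confirms the count:

print(solution([1, 2, 3, 4, 5, 6]))  # prints 3: (1,2,4), (1,2,6), (1,3,6)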

Improving runtime on Euler #10

So I was attacking a Project Euler problem that seemed pretty simple at a small scale, but as soon as I bump it up to the number that I'm supposed to do, the code takes forever to run. This is the question:
The sum of the primes below 10 is 2 + 3 + 5 + 7 = 17.
Find the sum of all the primes below two million.
I did it in Python. I could wait a few hours for the code to run, but I'd rather find a more efficient way to go about this. Here's my code in Python:
x = 1
total = 0
while x <= 2000000:
    y = 1
    z = 0
    while x >= y:
        if x % y == 0:
            z += 1
        y += 1
    if z == 2:
        total += x
    x += 1
print total
As mentioned in the comments, implementing the Sieve of Eratosthenes would be a far better choice. It takes O(n) extra space, which is an array of length ~2 million, in this case. It also runs in O(n log log n), which is astronomically faster than your implementation's O(n²).
I originally wrote this in JavaScript, so bear with my python:
max = 2000000  # we only need to check the first 2 million numbers
numbers = []
sum = 0

for i in range(2, max):  # 0 and 1 are not primes
    numbers.append(i)    # fill our blank list

for p in range(2, max):
    if numbers[p - 2] != -1:  # if p (our array starts at 2, not 0) is not -1
        # it is prime, so add it to our sum
        sum += numbers[p - 2]
        # now, we need to mark every multiple of p as composite, starting at 2p
        c = 2 * p
        while c < max:
            # we'll mark composite numbers as -1
            numbers[c - 2] = -1
            # increment the count to 3p, 4p, 5p, ... np
            c += p

print(sum)
The only confusing part here might be why I used numbers[p - 2]. That's because I skipped 0 and 1, meaning 2 is at index 0. In other words, everything's shifted to the side by 2 indices.
Clearly the long pole in this tent is computing the list of primes in the first place. For an artificial situation like this you could get someone else's list (say, this one), parse it and add up the numbers in seconds.
But that's unsporting, in my view. In which case, try the Sieve of Atkin as noted in this SO answer.

Python 2 lists of positive integers finding prime number

Given 2 lists of positive integers, find how many ways you can select a number from each of the lists such that their sum is a prime number.
My code is too slow, as both list1 and list2 contain 50,000 numbers each. Is there any way to make it faster, so it solves the problem in minutes instead of days? :)
def isprime(n):
    # 2 is the only even prime number
    if n == 2: return True
    # all other even numbers are not primes
    if not n & 1: return False
    # range starts with 3 and only needs to go
    # up to the square root of n, for all odd numbers
    for x in range(3, int(n**0.5)+1, 2):
        if n % x == 0: return False
    return True

n = 0  # number of ways
for i2 in l2:
    for i1 in l1:
        if isprime(i1 + i2):
            n = n + 1  # increasing number of ways
            s = "{0:02d}: {1:d}".format(n, i1 + i2)
            print(s)  # printing out
Sketch:
Following @Steve's advice, first figure out all the primes <= max(l1) + max(l2). Let's call that list primes. Note: primes doesn't really need to be a list; you could instead generate the primes up to the max one at a time.
Swap your lists (if necessary) so that l2 is the longer list. Then turn that into a set: l2 = set(l2).
Sort l1 (l1.sort()).
Then:
for p in primes:
    for i in l1:
        diff = p - i
        if diff < 0:
            # assuming there are no negative numbers in l2;
            # since l1 is sorted, all diffs at and beyond this
            # point will be negative
            break
        if diff in l2:
            pass  # print whatever you like: at this point, p is a prime,
                  # and is the sum of diff (from l2) and i (from l1)
Alas, if l2 is, for example:
l2 = [2, 3, 100000000000000000000000000000000000000000000000000]
this is impractical. It relies on max(max(l1), max(l2)) being "reasonably small", as it is in your example.
Fleshed out
Hmm! You said in a comment that the numbers in the lists are up to 5 digits long. So they're less than 100,000. And you said at the start that the lists have 50,000 elements each. So they each contain about half of all possible integers under 100,000, and you're going to have a very large number of sums that are primes. That's all important if you want to micro-optimize ;-)
Anyway, since the maximum possible sum is less than 200,000, any way of sieving will be fast enough - it will be a trivial part of the runtime. Here's the rest of the code:
def primesum(xs, ys):
    if len(xs) > len(ys):
        xs, ys = ys, xs
    # Now xs is the shorter list.
    xs = sorted(xs)  # don't mutate the input list
    sum_limit = xs[-1] + max(ys)  # largest possible sum
    ys = set(ys)  # make lookups fast
    count = 0
    for p in gen_primes_through(sum_limit):
        for x in xs:
            diff = p - x
            if diff < 0:
                # Since xs is sorted, all diffs at and
                # beyond this point are negative too.
                # Since ys contains no negative integers,
                # no point continuing with this p.
                break
            if diff in ys:
                # print("%s + %s = prime %s" % (x, diff, p))
                count += 1
    return count
I'm not going to supply my gen_primes_through(), because it's irrelevant. Pick one from the other answers, or write your own.
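For completeness, here is one possible gen_primes_through(), a plain Sieve of Eratosthenes sketch (the details are only an assumption to make the code above runnable; any sieve from the other answers works just as well):

def gen_primes_through(limit):
    """Yield every prime p with 2 <= p <= limit."""
    sieve = bytearray([1]) * (limit + 1)  # sieve[i] == 1 means i is still a candidate
    sieve[:2] = b"\x00\x00"               # 0 and 1 are not prime
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            # strike out p*p, p*p + p, p*p + 2p, ...
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return (p for p in range(2, limit + 1) if sieve[p])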
Here's a convenient way to supply test cases:
from random import sample
xs = sample(range(100000), 50000)
ys = sample(range(100000), 50000)
print(primesum(xs, ys))
Note: I'm using Python 3. If you're using Python 2, use xrange() instead of range().
Across two runs, they each took about 3.5 minutes. That's what you asked for at the start ("minutes instead of days"). Python 2 would probably be faster. The counts returned were:
219,334,097
and
219,457,533
The total number of possible sums is, of course, 50000**2 == 2,500,000,000.
About timing
All the methods discussed here, including your original one, take time proportional to the product of the two lists' lengths. All the fiddling is to reduce the constant factor. Here's a huge improvement over your original:
def primesum2(xs, ys):
    sum_limit = max(xs) + max(ys)  # largest possible sum
    count = 0
    primes = set(gen_primes_through(sum_limit))
    for i in xs:
        for j in ys:
            if i+j in primes:
                # print("%s + %s = prime %s" % (i, j, i+j))
                count += 1
    return count
Perhaps you'll understand that one better. Why is it a huge improvement? Because it replaces your expensive isprime(n) function with a blazing fast set lookup. It still takes time proportional to len(xs) * len(ys), but the "constant of proportionality" is slashed by replacing a very expensive inner-loop operation with a very cheap operation.
And, in fact, primesum2() is faster than my primesum() in many cases too. What makes primesum() faster in your specific case is that there are only around 18,000 primes less than 200,000. So iterating over the primes (as primesum() does) goes a lot faster than iterating over a list with 50,000 elements.
A "fast" general-purpose function for this problem would need to pick different methods depending on the inputs.
You should use the Sieve of Eratosthenes to calculate prime numbers.
You are also calculating the prime numbers for each possible combination of sums. Instead, consider finding the maximum value you can achieve with the sum from the lists. Generate a list of all the prime numbers up to that maximum value.
Whilst you are adding up the numbers, you can see if the number appears in your prime number list or not.
I would find the highest number in each range. The range of primes is the sum of the highest numbers.
Here is code to sieve out primes:
def eras(n):
    last = n + 1
    sieve = [0, 0] + list(range(2, last))
    sqn = int(round(n ** 0.5))
    it = (i for i in xrange(2, sqn + 1) if sieve[i])
    for i in it:
        sieve[i * i:last:i] = [0] * (n // i - i + 1)
    return filter(None, sieve)
It takes around 3 seconds to find the primes up to 10,000,000. Then I would use the same n^2 algorithm you are using for generating sums. I think there is an n log n algorithm, but I can't come up with it.
It would look something like this:
from collections import defaultdict

possible = defaultdict(int)
for x in range1:
    for y in range2:
        possible[x + y] += 1

def eras(n):
    last = n + 1
    sieve = [0, 0] + list(range(2, last))
    sqn = int(round(n ** 0.5))
    it = (i for i in xrange(2, sqn + 1) if sieve[i])
    for i in it:
        sieve[i * i:last:i] = [0] * (n // i - i + 1)
    return filter(None, sieve)

n = max(possible.keys())
primes = eras(n)
possible_primes = set(possible.keys()).intersection(set(primes))

for p in possible_primes:
    print "{0}: {1} possible ways".format(p, possible[p])
