I wrote the following python script to count the number of occurrences of a character (a) in the first n characters of an infinite string.
from itertools import cycle
def count_a(str_, n):
    count = 0
    str_ = cycle(str_)
    for i in range(n):
        if next(str_) == 'a':
            count += 1
    return count
My understanding of iterators is that they are supposed to be efficient, but this approach is super slow for very large n. Why is this so?
The cycle iterator might not be as efficient as you think; the documentation says:
Make an iterator returning elements from the iterable and saving a
copy of each.
When the iterable is exhausted, return elements from the saved copy.
Repeats indefinitely
...Note, this member of the toolkit may require significant auxiliary
storage (depending on the length of the iterable).
Why not simplify and just not use the iterator at all? It adds unnecessary overhead and gives you no benefit. You can easily count the occurrences with a simple str_[:n].count('a')
The first problem here is that, despite using itertools, you're still running an explicit Python-level for loop. To gain the C-level speed boost from itertools, you want to keep all of the iteration inside the high-speed itertools machinery.
So let's do this step by step. First we want to reduce the infinite string to just its first n characters. To do this, you can use itertools.islice to take the first n characters of the cycled string:
str_first_n_chars = islice(cycle(str_), n)
Next you want to count the number of occurrences of the letter (a); to do this you can use some variation of either of these (you may want to experiment to see which variant is faster):
count_a = sum(1 for c in str_first_n_chars if c == 'a')
count_a = len(tuple(filter('a'.__eq__, str_first_n_chars)))
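Assembled into a single function, the itertools-based version might look like this (a sketch; it is still O(n), as discussed next):
from itertools import cycle, islice

def count_a_itertools(str_, n):
    # Slice the first n characters off the cycled string, then count the (a)s.
    return sum(1 for c in islice(cycle(str_), n) if c == 'a')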
This is all well and good, but it is still slow for really large n, because you need to iterate through str_ many, many times, for example when n = 10**10000. In other words, this algorithm is O(n).
There's one last improvement we could make. Notice that the number of (a) in str_ never changes from one cycle to the next. Rather than iterating through str_ multiple times for large n, we can be a little smarter with a bit of math so that we only need to iterate through str_ twice. First we count the number of (a) in a single stretch of str_:
count_a_single = str_.count('a')
Then we find out how many times we would have to iterate through str_ to reach length n, using the divmod function:
iter_count, remainder = divmod(n, len(str_))
Then we can just multiply iter_count by count_a_single and add the number of (a) in the remaining stretch. We don't need cycle, islice, or the like here, because remainder < len(str_):
count_a = iter_count * count_a_single + str_[:remainder].count('a')
With this method, the runtime of the algorithm grows only with the length of a single cycle of str_ rather than with n. In other words, this algorithm is O(len(str_)).
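Putting those pieces together, the whole function might look like this (a sketch following the names used above):
def count_a(str_, n):
    # Count (a) in the first n characters of str_ repeated indefinitely,
    # scanning str_ itself only a constant number of times.
    iter_count, remainder = divmod(n, len(str_))
    return iter_count * str_.count('a') + str_[:remainder].count('a')
Even n = 10**10000 is handled essentially instantly, since the work depends only on len(str_).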
Related
I have a massive quantity of combinations (86 choose 10, which yields 3.5 trillion results) and I have written an algorithm which is capable of processing 500,000 combinations per second. I would not like to wait 81 days to see the final results, so naturally I am inclined to separate this into many processes to be handled by my many cores.
Consider this naive approach:
import itertools
from concurrent.futures import ProcessPoolExecutor

def algorithm(combination):
    # returns a boolean in roughly 1/500000th of a second on average
    ...

def process(combinations):
    for combination in combinations:
        if algorithm(combination):
            # will be very rare (a few hundred times out of trillions) if that matters
            print("Found matching combination!", combination)

combination_generator = itertools.combinations(eighty_six_elements, 10)

# My system will have 64 cores and 128 GiB of memory
with ProcessPoolExecutor(max_workers=63) as executor:
    # assign 1,000,000 combinations to each process
    # it may be more performant to use larger batches (to avoid process startup overhead)
    # but eventually I need to start worrying about running out of memory
    group = []
    for combination in combination_generator:
        group.append(combination)
        if len(group) >= 1_000_000:
            executor.submit(process, group)
            group = []
This code "works", but it has virtually no performance gain over a single-threaded approach, since it is bottlenecked by the generation of the combinations for combination in combination_generator.
How can I pass this computation off to the child-processes so that it can be parallelized? How can each process generate a specific subset of itertools.combinations?
p.s. I found this answer, but it only deals with generating single specified elements, whereas I need to efficiently generate millions of specified elements.
I'm the author of one answer to the question you already found for generating the combination at a given index. I'd start with that: Compute the total number of combinations, divide that by the number of equally sized subsets you want, then compute the cut-over for each of them. Then do your subprocess tasks with these combinations as bounds. Within each subprocess you'd do the iteration yourself, not using itertools. It's not hard:
def next_combination(n: int, c: list[int]):
    """Compute next combination, in lexicographical order.

    Args:
        n: the number of items to choose from.
        c: a list of integers in strictly ascending order,
            each of them between 0 (inclusive) and n (exclusive).
            It will get modified by the call.
    Returns: the list c after modification,
        or None if this was the last combination.
    """
    i = len(c)
    while i > 0:
        i -= 1
        n -= 1
        if c[i] == n:
            continue
        c[i] += 1
        for j in range(i + 1, len(c)):
            c[j] = c[j - 1] + 1
        return c
    return None
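For illustration, a per-subprocess worker built on top of this might look like the sketch below; start would be the combination computed for the batch's lower bound (via the linked index-to-combination answer), and algorithm and batch_size stand in for the question's own function and chosen batch size:
def process_range(n: int, start: list[int], batch_size: int, algorithm):
    # Hypothetical worker: step through batch_size consecutive combinations
    # of range(n), beginning at start, using next_combination above.
    c = start
    for _ in range(batch_size):
        if c is None:
            return  # ran past the very last combination
        if algorithm(tuple(c)):
            print("Found matching combination!", tuple(c))
        c = next_combination(n, c)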
Note that both the code above and the one from my other answer assume that you are looking for combinations of elements from range(n). If you want to combine other elements, do combinations for this range then use the elements of the found combinations as indices into your sequence of actual things you want to combine.
The main advantage of the approach above is that it ensures equal batch size, which might be useful if processing time is expected to be mostly determined by batch size. If processing time still varies greatly even for batches of the same size, that might be too much effort. I'll post an alternative answer addressing that.
You can do a recursive divide-and-conquer approach, where you make a decision based on the expected number of combinations. If it is small, use itertools. If it is large, handle the case of the first element being included and of it being excluded both in recursive calls.
The result does not ensure batches of equal size, but it does give you an upper bound on the size of each batch. If processing time of each batch is somewhat varied anyway, that might be good enough.
import collections.abc
import itertools
import math
import typing

T = typing.TypeVar('T')

def combination_batches(
        seq: collections.abc.Sequence[T],
        r: int,
        max_batch_size: int,
        prefix: tuple[T, ...] = ()
) -> collections.abc.Iterator[collections.abc.Iterator[tuple[T, ...]]]:
    """Compute batches of combinations.

    Each yielded value is itself a generator over some of the combinations.
    Taken together they produce all the combinations.

    Args:
        seq: The sequence of elements to choose from.
        r: The number of elements to include in each combination.
        max_batch_size: How many elements each returned iterator
            is allowed to iterate over.
        prefix: Used during recursive calls, prepended to each returned tuple.
    Yields: generators which each generate a subset of all the combinations,
        in a way that the generators together yield every combination exactly once.
    """
    if math.comb(len(seq), r) > max_batch_size:
        # One option: first element taken.
        yield from combination_batches(
            seq[1:], r - 1, max_batch_size, prefix + (seq[0],))
        # Other option: first element not taken.
        yield from combination_batches(
            seq[1:], r, max_batch_size, prefix)
        return
    yield (prefix + i for i in itertools.combinations(seq, r))
See https://ideone.com/GD6WYl for a more complete demonstration.
Note that I don't know how well the process pool executor deals with generators as arguments, whether it is able to just forward a short description of each. There is a chance that in order to ship the generator to a subprocess it will actually generate all the values. So instead of yielding the generator expression the way I did, you might want to yield some object which pickles more nicely but still offers iteration over the same values.
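One way to sidestep that (just a sketch, reusing the imports above; the helper names are my own) is to yield plain tuples describing each batch instead of live generators, and let each worker expand its own batch:
def combination_batch_specs(seq, r, max_batch_size, prefix=()):
    # Same recursion as combination_batches, but each batch is described by a
    # picklable (prefix, seq, r) tuple instead of a generator object.
    if math.comb(len(seq), r) > max_batch_size:
        yield from combination_batch_specs(
            seq[1:], r - 1, max_batch_size, prefix + (seq[0],))
        yield from combination_batch_specs(
            seq[1:], r, max_batch_size, prefix)
        return
    yield (prefix, tuple(seq), r)

def expand_batch(spec):
    # Reconstruct, inside the worker, the combinations described by one spec.
    prefix, seq, r = spec
    return (prefix + c for c in itertools.combinations(seq, r))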
Below I have provided the function to calculate the LCP (longest common prefix). I want to know the big-O time complexity and space complexity. Can I say it is O(n)? Or do zip() and join() affect the time complexity? I am also wondering whether the space complexity is O(1). Please correct me if I am wrong. The input to the function is a list containing strings, e.g. ["flower", "flow", "flight"].
def longestCommonPrefix(self, strs):
    res = []
    for x in zip(*strs):
        if len(set(x)) == 1:
            res.append(x[0])
        else:
            break
    return "".join(res)
Iterating to get a single tuple value from zip(*strs) takes O(len(strs)) time and space. That's just the time it takes to allocate and fill a tuple of that length.
Iterating to consume the whole iterator takes O(len(strs) * min(len(s) for s in strs)) time, but shouldn't take any additional space over a single iteration.
Your iteration code is a bit trickier, because you may stop iterating early, when you find the first place within your strings where some characters don't match. In the worst case, all the strings are identical (up to the length of the shortest one) and so you'd use the time complexity above. And in the best case there is no common prefix, so you can use the single-value iteration as your best case.
But there's no good way to describe "average case" performance because it depends a lot on the distributions of the different inputs. If your inputs were random strings, you could do some statistics and predict an average number of iterations, but if your input strings are words, or even more likely, specific words expected to have common prefixes, then it's very likely that all bets are off.
Perhaps the best way to describe that part of the function's performance is actually in terms of its own output: it takes O(len(strs) * len(self.longestCommonPrefix(strs))) time to run.
As for str.join, running "".join(res) if we know nothing about res takes O(len(res) + len("".join(res))) for both time and space. Because your code only joins individual characters, the two lengths are going to be the same, so we can say that the join in your function takes O(len(self.longestCommonPrefix(strs))) time and space.
Putting things together, we can see that the main loop takes a multiple of the time taken by the join call, so we can ignore the latter and say that the function's time complexity is just O(len(strs) * len(self.longestCommonPrefix(strs))). However, the memory usage complexities for the two parts are independent and we can't easily predict whether the number of strings or the length of the output will grow faster. So we need to combine them and say that you need O(len(strs) + len(self.longestCommonPrefix(strs))) space.
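To make the output-sensitive bound concrete, here is a small instrumented variant (illustrative only; the helper name and the counter are my own additions):
def lcp_with_counter(strs):
    # Same algorithm as above, but counts how many characters are examined.
    examined = 0
    res = []
    for x in zip(*strs):
        examined += len(x)
        if len(set(x)) == 1:
            res.append(x[0])
        else:
            break
    return "".join(res), examined

print(lcp_with_counter(["flower", "flow", "flight"]))  # ('fl', 9) -> 3 strings * (2 + 1) positions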
Time:
Your code is O(n * m), where n is the length of the list and m is the length of the shortest string in the list (zip(*strs) stops as soon as the shortest string is exhausted).
zip() itself is O(1) in Python 3.x: the call just allocates a special iterable (the zip object) and stores the argument iterables in an internal field. In the case of zip(*x) (as pointed out by @juanpa.arrivillaga), the unpacking first builds a tuple of the n iterables, so the call is O(n), and each step of the iteration also produces an n-element tuple, i.e. O(n) per step.
join() is linear in the total length of the strings being joined, here the length of the common prefix.
set(x) is linear in len(x); here x holds one character per string, so each call is O(n).
Space:
It is O(n + m): the zip(*x) call builds an n-element tuple up front, and in the worst case res ends up holding one character for every position of the shortest string (m characters).
I want to generate all possible sequences of alternating digits and small letters. For example
5j1c6l2d4p9a9h9q
6d5m7w4c8h7z4s0i
3z0v5w1f3r6b2b1z
NumberSmallletterNumberSmallletter
NumberSmallletterNumberSmallletter
NumberSmallletterNumberSmallletter
NumberSmallletterNumberSmallletter
I can do it using 16 nested loops, but that would take 30+ hours (rough estimate). Is there any efficient way? I hope there is one in Python.
You can use itertools.product to generate all of the 16-character cases:
import string, itertools
i = itertools.product(string.digits, string.ascii_lowercase, repeat=8)
j = (''.join(p) for p in i)
As i is an iterator of tuples, we need to convert these all to strings (so they are in the format that you want). This is relatively straightforward to do, as we can just pass each tuple into a generator and join the elements together into one string.
We can see that the iterator (j) is working by calling next() on it a couple of times:
>>> next(j)
'0a0a0a0a0a0a0a0a'
>>> next(j)
'0a0a0a0a0a0a0a0b'
>>> next(j)
'0a0a0a0a0a0a0a0c'
>>> next(j)
'0a0a0a0a0a0a0a0d'
>>> next(j)
'0a0a0a0a0a0a0a0e'
>>> next(j)
'0a0a0a0a0a0a0a0f'
>>> next(j)
'0a0a0a0a0a0a0a0g'
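Since j is lazy, you only pay for what you actually consume. For example (an illustrative sketch; the filename is my own choice), writing just the first million strings to a file:
from itertools import islice

with open("sequences.txt", "w") as f:
    for s in islice(j, 1_000_000):  # take only the first million strings
        f.write(s + "\n")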
There is no "efficient" way to do this. There are 2.8242954e+19 different possible combonations, or 28,242,954,000,000,000,000. If each combination is 16 characters long, storing this all in a raw text file would take up 451,887,264,000 gigabytes, 441,296,156.25 terabytes, 430,953.2775878906 petabytes, or 420.8528101444 exabytes. The largest hard drive available to the average consumer is 16TB (Samsung PM1633a). They cost 12 thousand US dollars. This puts the total cost of storing all of this data to 330,972,117,600 US dollars (3677.46797 times Bill Gates' net worth). Even ignoring the amount of space all of these drives would take up, and ignoring the cost of the hardware you would need to connect them to, and assuming that they could all be running at highest performance all together in a lossless RAID array, this would make the write speed 330,972,118 gigabytes a second. Sounds like a lot, doesn't it? Even with that write speed, the file would take 22 minutes to write, assuming that there were no bottlenecks from CPU power, RAM speed, or the RAID controller itself.
Sources - a calculator.
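For reference, the figures above can be reproduced with a few lines of Python (decimal units throughout):
total = 10**8 * 26**8                # 8 digits interleaved with 8 lowercase letters
print(f"{total:.3e} strings")        # ~2.088e+19
size_bytes = total * 16              # 16 bytes per string, no separators
print(f"{size_bytes / 1e18:.0f} exabytes")
drives = size_bytes / 16e12          # 16 TB drives
print(f"{drives / 1e6:.1f} million drives, ~${drives * 12_000 / 1e9:.0f} billion")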
import sys

n = 5
ans = [0] * n
all = ['a', 'b', 'A', 'Z', '0', '1']

def rec(pos, prev):
    if pos == n:
        for i in range(n):
            sys.stdout.write(str(ans[i]))
        sys.stdout.flush()
        print("")
        return
    for i in all:
        if i != prev:
            ans[pos] = i
            rec(pos + 1, i)
    return

for i in all:
    ans[0] = i
    rec(1, i)
The basic idea is backtracking. It is too slow, but the code is short. You can modify the characters in 'all' and the length n of the sequences.
If the code isn't clear, try tracing it through some small cases by hand.
I was going to generate some combinations using itertools when I realized that, as the number of elements increases, the time taken will increase exponentially. Can I limit or indicate the maximum number of permutations to be produced, so that itertools stops after that limit is reached?
What I mean to say is:
Currently I have
#big_list is a list of lists
permutation_list = list(itertools.product(*big_list))
Currently this permutation list has over 6 million permutations. I am pretty sure that if I add another list, this number would hit the billion mark.
What I really need is a significant number of permutations (let's say 5000). Is there a way to limit the size of the permutation_list that is produced?
You need to use itertools.islice, like this
itertools.islice(itertools.product(*big_list), 5000)
It doesn't create the entire list in memory, but it returns an iterator which consumes the actual iterable lazily. You can convert that to a list like this
list(itertools.islice(itertools.product(*big_list), 5000))
itertools.islice has many benefits, such as the ability to set start and step. The solutions below aren't that flexible, and you should use them only if start is 0 and step is 1. On the other hand, they don't require any imports.
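For instance (an illustrative sketch, with big_list as in the question), taking every 10th result starting at index 100 and stopping before index 5100:
import itertools

# Items at indices 100, 110, ..., 5090 of the product -- 500 results in total.
sample = list(itertools.islice(itertools.product(*big_list), 100, 5100, 10))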
You could create a tiny wrapper around itertools.product
it = itertools.product(*big_list)
pg = (next(it) for _ in range(5000)) # generator expression
(next(it) for _ in range(5000)) returns a generator not capable of producing more than 5000 values. Convert it to list by using the list constructor
pl = list(pg)
or by wrapping the generator expression with square brackets (instead of round ones)
pl = [next(it) for _ in range(5000)] # list comprehension
Another solution, which is just as efficient as the first one, is
pg = (p for p, _ in zip(itertools.product(*big_list), range(5000)))
Works in Python 3+, where zip returns an iterator that stops when the shortest iterable is exhausted. Conversion to list is done as in the first solution.
You can try this method to get a particular number of permutations. The number of results a permutation produces is n!, where n stands for the number of elements in the list. For example, if you want only 2 results, you can try the following:
Use any temporary variable and limit it
from itertools import permutations

m = ['a', 'b', 'c', 'd']
per = permutations(m)
temp = 1
for i in per:  # iterate lazily; wrapping per in list() would build all n! results first
    if temp <= 2:  # 2 is the limit set
        print(i)
        temp = temp + 1
    else:
        break
The code that I have written seems to have poor asymptotic running time and space usage.
I am getting
T(N) = T(N-1)*N + O((N-1)! * N), where N is the size of the input. I need advice on optimizing it.
Since it is an algorithm-based interview question, we are required to implement the logic in the most efficient way without using any libraries.
Here is my code
def str_permutations(str_input, i):
    if len(str_input) == 1:
        return [str_input]
    comb_list = []
    while i < len(str_input):
        key = str_input[i]
        if i + 1 != len(str_input):
            remaining_str = "".join((str_input[0:i], str_input[i+1:]))
        else:
            remaining_str = str_input[0:i]
        all_combinations = str_permutations(remaining_str, 0)
        for index, value in enumerate(all_combinations):
            all_combinations[index] = "".join((key, value))
        comb_list.extend(all_combinations)
        i = i + 1
    return comb_list
As I mentioned in a comment to the question, in the general case you won't get below exponential complexity, since for n distinct characters there are n! permutations of the input string, and O(2^n) is a subset of O(n!).
Now the following won't improve the asymptotic complexity for the general case, but you can optimize the brute-force approach of producing all permutations for strings that have some characters with multiple occurrences. Take for example the string daedoid; if you blindly produce all permutations of it, you'll get every permutation 6 = 3! times, since you have three occurrences of d. You can avoid that by first eliminating multiple occurrences of the same letter and instead remembering how often to use each letter. So if there is a letter c that has k_c occurrences, you save a factor of k_c!. In total, this saves you a factor of the product of k_c! over all letters c, i.e. you produce n! divided by that product distinct permutations instead of n!.
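A sketch of that idea, with no library imports (the helper name and structure are my own, not from the question):
def distinct_permutations(s):
    # Generate each distinct permutation of s exactly once by recursing over
    # remaining character counts instead of positions.
    counts = {}
    for ch in s:
        counts[ch] = counts.get(ch, 0) + 1

    def rec(prefix, remaining):
        if remaining == 0:
            yield prefix
            return
        for ch in counts:
            if counts[ch]:
                counts[ch] -= 1
                yield from rec(prefix + ch, remaining - 1)
                counts[ch] += 1

    yield from rec("", len(s))

print(list(distinct_permutations("aab")))  # ['aab', 'aba', 'baa']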
If you don't need to write your own, see itertools.permutations and combinations.