So, let's say I have an arbitrarily long list of numbers. I'd like to get a list of every number in that list multiplied by every number in that list. I'd do that by nesting for loops like this:
for x in numbers:
for y in numbers:
print(x*y)
Now if I'd like to multiply every number in that list times every number in that list times every number in that list again, I'd do this:
for x in numbers:
for y in numbers:
for z in numbers:
print(x*y*z)
My issue is that I'm searching a graph for a subgraph, and I need to allow for arbitrarily large subgraphs. To do this I have to construct every subgraph with n edges from the edges in the main graph - and I have to allow for arbitrary values of n. How?
itertools.product with an iterative product computing function (I favor reduce(mul, ...)). If you need n-way products (in both senses of the word "product"):
from functools import reduce
from operator import mul
for numset in itertools.product(numbers, repeat=n):
print(reduce(mul, numset))
The above is simple, but it will needlessly recompute partial products when the set of values is large and n >= 3. A recursive function could be used to avoid that:
def compute_products(numbers, repeat):
if repeat == 1:
yield from numbers
return
numbers = tuple(numbers) # Only needed if you want to handle iterator/generator inputs
for prod in compute_products(numbers, repeat-1):
yield from (prod * x for x in numbers)
for prod in compute_products(numbers, n):
print(prod)
I think a recursive function of some kind would probably help you. This is just untested pseudocode typed into the Stack Overflow editor, so beware of using this verbatim, but something like this:
def one_dimension(numbers, n):
if (n < 1):
return
for num in numbers:
for next_num in one_dimension(numbers, n-1):
yield num*next_num
The basic idea is that each "level" of the function calls the next one, and then deals with what the next one produces. I don't know how familiar you are with Python's yield keyword, but if you know what it does, then you should be able to adapt the code above.
Update: Or just use itertools.product like ShadowRanger suggested in his answer; that's probably a better solution to your problem.
Related
How to write a function that takes n (where n > 0) and returns the list of all combinations of positive integers that sum to n?
This is a common question on the web. And there are different answers provided such as 1, 2 and 3. However, in the answers provided, they use two functions to solve the problem. I want to do it with only one single function. Therefore, I coded as follows:
def all_combinations_sum_to_n(n):
from itertools import combinations_with_replacement
combinations_list = []
if n < 1:
return combinations_list
l = [i for i in range(1, n + 1)]
for i in range(1, n + 1):
combinations_list = combinations_list + (list(combinations_with_replacement(l, i)))
result = [list(i) for i in combinations_list if sum(i) == n]
result.sort()
return result
If I pass 20 to my function which is all_combinations_sum_to_n(20), the OS of my machine kills the process as it is very costly. I think the space complexity of my function is O(n*n!). How do I modify my code so that I don't have to create any other function and yet my single function has an improved time or space complexity? I don't think it is possible by using itertools.combinations_with_replacement.
UPDATE
All answers provided by Barmar, ShadowRanger and pts are great. As I was looking for an efficient answer in terms of both memory and runtime, I used https://perfpy.com and selected python 3.8 to compare the answers. I used six different values of n and in all cases, ShadowRanger's solution had the highest score. Therefore, I chose ShadowRanger's answer as the best one. The scores were as follows:
You've got two main problems, one causing your current problem (out of memory) and one that will continue the problem even if you solve that one:
You're accumulating all combinations before filtering, so your memory requirements are immense. You don't even need a single list if your function can be a generator (that is iterated to produce a value at a time) rather than returning a fully realized list, and even if you must return a list, you needn't generate such huge intermediate lists. You might think you need at least one list for sorting purposes, but combinations_with_replacement is already guaranteed to produce a predictable order based on the input ordering, and since range is ordered, the values produced will be ordered as well.
Even if you solve the memory problem, the computational cost of just generating that many combinations is prohibitive, due to poor scaling; for the memory, but not CPU, optimized version of the code below, it handles an input of 11 in 0.2 seconds, 12 in ~2.6 seconds, and 13 in ~11 seconds; at that scaling rate, 20 is going to approach heat death of the universe timeframes.
Barmar's answer is one solution to both problems, but it's still doing work eagerly and storing the complete results when the complete work might not even be needed, and it involves sorting and deduplication, which aren't strictly necessary if you're sufficiently careful about how you generate the results.
This answer will fix both problems, first the memory issue, then the speed issue, without ever needing memory requirements above linear in n.
Solving the memory issue alone actually makes for simpler code, that still uses your same basic strategy, but without consuming all your RAM needlessly. The trick is to write a generator function, that avoids storing more than one results at a time (the caller can listify if they know the output is small enough and they actually need it all at once, but typically, just looping over the generator is better):
from collections import deque # Cheap way to just print the last few elements
from itertools import combinations_with_replacement # Imports should be at top of file,
# not repeated per call
def all_combinations_sum_to_n(n):
for i in range(1, n + 1): # For each possible length of combinations...
# For each combination of that length...
for comb in combinations_with_replacement(range(1, n + 1), i):
if sum(comb) == n: # If the sum matches...
yield list(comb) # yield the combination
# 13 is the largest input that will complete in some vaguely reasonable time, ~10 seconds on TIO
print(*deque(all_combinations_sum_to_n(13), maxlen=10), sep='\n')
Try it online!
Again, to be clear, this will not complete in any reasonable amount of time for an input of 20; there's just too much redundant work being done, and the growth pattern for combinations scales with the factorial of the input; you must be more clever algorithmically. But for less intense problems, this pattern is simpler, faster, and dramatically more memory-efficient than a solution that builds up enormous lists and concatenates them.
To solve in a reasonable period of time, using the same generator-based approach (but without itertools, which isn't practical here because you can't tell it to skip over combinations when you know they're useless), here's an adaptation of Barmar's answer that requires no sorting, produces no duplicates, and as a result, can produce the solution set in less than a 20th of a second, even for n = 20:
def all_combinations_sum_to_n(n, *, max_seen=1):
for i in range(max_seen, n // 2 + 1):
for subtup in all_combinations_sum_to_n(n - i, max_seen=i):
yield (i,) + subtup
yield (n,)
for x in all_combinations_sum_to_n(20):
print(x)
Try it online!
That not only produces the individual tuples with internally sorted order (1 is always before 2), but produces the sequence of tuples in sorted order (so looping over sorted(all_combinations_sum_to_n(20)) is equivalent to looping over all_combinations_sum_to_n(20) directly, the latter just avoids the temporary list and a no-op sorting pass).
Use recursion instead of generating all combinations and then filtering them.
def all_combinations_sum_to_n(n):
combinations_set = set()
for i in range(1, n):
for sublist in all_combinations_sum_to_n(n-i):
combinations_set.add(tuple(sorted((i,) + sublist)))
combinations_set.add((n,))
return sorted(combinations_set)
I had a simpler solution that didn't use sorted() and put the results in a list, but it would produce duplicates that just differed in order, e.g. [1, 1, 2] and [1, 2, 1] when n == 4. I added those to get rid of duplicates.
On my MacBook M1 all_combinations_sum_to_n(20) completes in about 0.5 seconds.
Here is a fast iterative solution:
def csum(n):
s = [[None] * (k + 1) for k in range(n + 1)]
for k in range(1, n + 1):
for m in range(k, 0, -1):
s[k][m] = [[f] + terms
for f in range(m, (k >> 1) + 1) for terms in s[k - f][f]]
s[k][m].append([k])
return s[n][1]
import sys
n = 5
if len(sys.argv) > 1: n = int(sys.argv[1])
for terms in csum(n):
print(' '.join(map(str, terms)))
Explanation:
Let's define terms as a non-empty, increasing (can contain the same value multiple times) list of postitive integers.
The solution for n is a list of all terms in increasing lexicographical order, where the sum of each terms is n.
s[k][m] is a list of all terms in increasing lexicographical order, where the sum of each terms in n, and the first (smallest) integer in each terms is at least m.
The solution is s[n][1]. Before returning this solution, the csum function populates the s array using iterative dynamic programming.
In the inner loop, the following recursion is used: each term in s[k][m] either has at least 2 elements (f and the rest) or it has 1 element (k). In the former case, the rest is a terms, where the sum is k - f and the smallest integer is f, thus it is from s[k - f][f].
This solution is a lot faster than #Barmar's solution if n is at least 20. For example, on my Linux laptop for n = 25, it's about 400 times faster, and for n = 28, it's about 3700 times faster. For larger values of n, it gets much faster really quickly.
This solution uses more memory than #ShadowRanger's solution, because this solution creates lots of temporary lists, and it uses all of them until the very end.
How to come up with such a fast solution?
Try to find a recursive formula. (Don't write code yet!)
Sometimes recursion works only with recursive functions with multiple variables. (In our case, s[k][m] is the recursive function, k is the obvious variable implied by the problem, and m is the extra variable we had to invent.) So try to add a few variables to the recursive formula: add the minimum number of variables to make it work.
Write your code so that it computes each recursive function value exactly once (not more). For that, you may add a cache (memoization) to your recursive function, or you may use dynamic programming, i.e. populating a (multidimensional) array in the correct order, so that what is needed is already populated.
I have just started with Python programming language. I tried to write a function which takes input either a list or multiple integers to find their product. I am trying to find the product of first million natural numbers but its displaying an MemoryError.
def product(*arg):
answer=1
if type(arg) == tuple:
arg=str(arg)
arg=arg.lstrip('[(')
arg=arg.rstrip('],)')
arg=arg.split(',')
for i in arg:
answer*=int(i)
return answer
else:
for i in arg:
answer*=int(i)
return answer
j=range(1,1000000,1)
j=list(j)
print(product(j))
Steps:
I convert the range object into list object if i am to pass a list as
argument
Now, within the function, i try to split the tuple by converting it
string.
I convert the resultant string into a list and then loop over the
elements to find the product
Q1: How to avoid the memory error as i try to find the product of first Million natural numbers?
Q2 How to improve this code?
You can use a Generator in Python.
def generate_product():
r = 1
for i in range(1,1000000):
r *= i + 1
yield r
list(generate_product())[0]
It is more memory efficient and better in terms of performance.
To calculate the product of all numbers from 1 to 1 million use a simple loop:
r = 1
for l in range(1,1000000):
r*=(i+1)
print(res)
But keep in mind that the result will be a pretty big number.
That means that your calculation might take long and the resulting number will need a lot memory.
EDIT Then i missread your question a little. This is a function that multiplies the elements in a list:
def multiply_list_elements(_list):
result = 1
for element in _list:
result*=element
return result
multiply_list_elements([1,2,3,4])
>>> 24
The memory error probably came from the huge number as #ZabirAlNazi calculated so nicely.
All of the solution is fine, but one point to make - your question is equivalent to find the factorial of 1 million.
The number of digits of n! = log10(1) + log10(2) + ... log10(n)
import math
num_dig = 1
for i in range(1,1000000):
num_dig += math.log10(i)
print(num_dig)
So, the number of digits in your answer is 5565703 (approx.).
That's only the final n, if you also want the intermediate results it will require squared memory O(m^2).
import math
ans = 1
for i in range(2,1000001):
ans *= i
print(ans)
N.B: You can approximate with logarithms and Stirling numbers with faster run-time.
A very simple solution would be:
def prod_of():
p=1
for i in range(1,1000000):
p* = i
print(p)
I want to find consecutive digits in a string that sum to a given number.
Example:
a="23410212" number is=5 — output 23,41,410,0212,212.
This code is not working. What do I need to fix?
def find_ten_sstrsum():
num1="2825302"
n=0;
total=0;
alist=[];
ten_str="";
nxt=1;
for n in range(len(num1)):
for n1 in range(nxt,len(num1)):
print(total)
if(total==0):
total=int(num1[n])+int(num1[n1])
ten_str=num1[n]+num1[n1]
else:
total+=int(num1[n1])
ten_str+=num1[n1]
if(total==10):
alist.append(ten_str)
ten_str=""
total=0
nxt+=1
break
elif(total<10):
nxt+=1
return alist
This (sort-of) one-liner will work:
def find_ten_sstrsum(s, n):
return list( # list call only in Python 3 if you don't want an iterator
filter(
lambda y: sum(map(int, y))==n,
(s[i:j] for i in range(len(s)) for j in range(i+1, len(s)+1)))
)
>>> find_ten_sstrsum('23410212', 5)
['23', '41', '410', '0212', '212']
This uses a nested generator expression over all possible slices and filters out the ones with the correct digit-sum.
This is, of course, far from optimal (especially for long strings) because the inner loop should be stopped as soon as the digit-sum exceeds n, but should give you an idea.
A more performant and readable solution would be a generator function:
def find_ten_sstrsum(s, n):
for start in range(len(s)):
for end in range(start+1, len(s)+1):
val = sum(map(int, s[start:end]))
if val > n:
break
if val == n:
yield s[start:end]
>>> list(find_ten_sstrsum('23410212', 5))
['23', '41', '410', '0212', '212']
Definitely read up on sum and map, as well.
Your function has several problems. Most of all, you have no way to give it different data to work on. You have it hard-coded to handle one particular string and a total of 10. You've even written variable names with "ten" in them, even though your example uses n=5. Your function should have two input parameters: the given string and the target sum.
NOTE: schwobaseggl just posted a lovely, Pythonic solution. However, I'll keep writing this, in case you need a function closer to your present learning level.
You have several logic paths, making it hard to follow how you handle your data. I recommend a slightly different approach, so that you can treat each partial sum cleanly:
for start in range(len(num1)):
total = 0 # Start at 0 each time you get a new starting number.
sum_str = ""
for last in num1[start:]:
print(total)
# Don't create separate cases for the first and other additions.
this_digit = num1[last]
total += int(this_digit)
ten_str += this_digit
# If we hit the target, save the solution and
# start at the next digit
if(total == target):
alist.append(ten_str)
break
# If we passed the target, just
# start at the next digit
elif(total > target):
break
return alist
Now, this doesn't solve quite all of your problems, and I haven't done some of the accounting work for you (variable initializations, def line, etc.). However, I think it moves you in the right direction and preserves the spirit of your code.
My question is quite similar to this one here:
Function with varying number of For Loops (python)
However, what I really want is for example:
def loop_rec(n):
for i in range(n):
for j in range(n):
for k in range(n):
#... n loops
#Do something with i, j and k
#Such as i+j+k
The example in the link does not allow the index x to vary.
Something like the answer suggested in that question but to use the indices instead of just x.
def loop_rec(y, n):
if n >= 1:
for x in range(y): # Not just x
loop_rec(y, n - 1)
else:
whatever()
Thanks
For problems where you have to deal with multiple nested loops, python standard library provides a convenient tool called itertools.product
In your particular case, all you have to do is to wrap the range with itertools.product, and specify how many nested loops via the repeat parameter. itertools.product, necessarily performs Cartesian product.
def loop_rec(y, n):
from itertools import product
for elems in product(range(y),repeat=n):
# sum(args)
# x,y,z ....(n) terms = elems
# index elems as elems[0], elems[1], ....
based on your requirement, you might want to use the entire tuple of each Cartesian product, or might want to index the tuple individually, or if you know the loop depth, you can assign it to the variables.
Thanks, but what if I wanted to change the range for each loop, ie. i
in range(n), j in range(n-1), k in range(n-2)
Assuming, you want to vary the range from m to n i.e. range(n), range(n-1), range(n-2), .... range(m). you can rewrite the product as
product(*map(range, range(m,n)))
.
I have this, and it works:
# E. Given two lists sorted in increasing order, create and return a merged
# list of all the elements in sorted order. You may modify the passed in lists.
# Ideally, the solution should work in "linear" time, making a single
# pass of both lists.
def linear_merge(list1, list2):
finalList = []
for item in list1:
finalList.append(item)
for item in list2:
finalList.append(item)
finalList.sort()
return finalList
# +++your code here+++
return
But, I'd really like to learn this stuff well. :) What does 'linear' time mean?
Linear means O(n) in Big O notation, while your code uses a sort() which is most likely O(nlogn).
The question is asking for the standard merge algorithm. A simple Python implementation would be:
def merge(l, m):
result = []
i = j = 0
total = len(l) + len(m)
while len(result) != total:
if len(l) == i:
result += m[j:]
break
elif len(m) == j:
result += l[i:]
break
elif l[i] < m[j]:
result.append(l[i])
i += 1
else:
result.append(m[j])
j += 1
return result
>>> merge([1,2,6,7], [1,3,5,9])
[1, 1, 2, 3, 5, 6, 7, 9]
Linear time means that the time taken is bounded by some undefined constant times (in this context) the number of items in the two lists you want to merge. Your approach doesn't achieve this - it takes O(n log n) time.
When specifying how long an algorithm takes in terms of the problem size, we ignore details like how fast the machine is, which basically means we ignore all the constant terms. We use "asymptotic notation" for that. These basically describe the shape of the curve you would plot in a graph of problem size in x against time taken in y. The logic is that a bad curve (one that gets steeper quickly) will always lead to a slower execution time if the problem is big enough. It may be faster on a very small problem (depending on the constants, which probably depends on the machine) but for small problems the execution time isn't generally a big issue anyway.
The "big O" specifies an upper bound on execution time. There are related notations for average execution time and lower bounds, but "big O" is the one that gets all the attention.
O(1) is constant time - the problem size doesn't matter.
O(log n) is a quite shallow curve - the time increases a bit as the problem gets bigger.
O(n) is linear time - each unit increase means it takes a roughly constant amount of extra time. The graph is (roughly) a straight line.
O(n log n) curves upwards more steeply as the problem gets more complex, but not by very much. This is the best that a general-purpose sorting algorithm can do.
O(n squared) curves upwards a lot more steeply as the problem gets more complex. This is typical for slower sorting algorithms like bubble sort.
The nastiest algorithms are classified as "np-hard" or "np-complete" where the "np" means "non-polynomial" - the curve gets steeper quicker than any polynomial. Exponential time is bad, but some are even worse. These kinds of things are still done, but only for very small problems.
EDIT the last paragraph is wrong, as indicated by the comment. I do have some holes in my algorithm theory, and clearly it's time I checked the things I thought I had figured out. In the mean time, I'm not quite sure how to correct that paragraph, so just be warned.
For your merging problem, consider that your two input lists are already sorted. The smallest item from your output must be the smallest item from one of your inputs. Get the first item from both and compare the two, and put the smallest in your output. Put the largest back where it came from. You have done a constant amount of work and you have handled one item. Repeat until both lists are exhausted.
Some details... First, putting the item back in the list just to pull it back out again is obviously silly, but it makes the explanation easier. Next - one input list will be exhausted before the other, so you need to cope with that (basically just empty out the rest of the other list and add it to the output). Finally - you don't actually have to remove items from the input lists - again, that's just the explanation. You can just step through them.
Linear time means that the runtime of the program is proportional to the length of the input. In this case the input consists of two lists. If the lists are twice as long, then the program will run approximately twice as long. Technically, we say that the algorithm should be O(n), where n is the size of the input (in this case the length of the two input lists combined).
This appears to be homework, so I will no supply you with an answer. Even though this is not homework, I am of the opinion that you will be best served by taking a pen and a piece of paper, construct two smallish example lists which are sorted, and figure out how you would merge those two lists, by hand. Once you figured that out, implementing the algorithm is a piece of cake.
(If all goes well, you will notice that you need to iterate over each list only once, in a single direction. That means that the algorithm is indeed linear. Good luck!)
If you build the result in reverse sorted order, you can use pop() and still be O(N)
pop() from the right end of the list does not require shifting the elements, so is O(1)
Reversing the list before we return it is O(N)
>>> def merge(l, r):
... result = []
... while l and r:
... if l[-1] > r[-1]:
... result.append(l.pop())
... else:
... result.append(r.pop())
... result+=(l+r)[::-1]
... result.reverse()
... return result
...
>>> merge([1,2,6,7], [1,3,5,9])
[1, 1, 2, 3, 5, 6, 7, 9]
This thread contains various implementations of a linear-time merge algorithm. Note that for practical purposes, you would use heapq.merge.
Linear time means O(n) complexity. You can read something about algorithmn comlexity and big-O notation here: http://en.wikipedia.org/wiki/Big_O_notation .
You should try to combine those lists not after getting them in the finalList, try to merge them gradually - adding an element, assuring the result is sorted, then add next element... this should give you some ideas.
A simpler version which will require equal sized lists:
def merge_sort(L1, L2):
res = []
for i in range(len(L1)):
if(L1[i]<L2[i]):
first = L1[i]
secound = L2[i]
else:
first = L2[i]
secound = L1[i]
res.extend([first,secound])
return res
itertoolz provides an efficient implementation to merge two sorted lists
https://toolz.readthedocs.io/en/latest/_modules/toolz/itertoolz.html#merge_sorted
'Linear time' means that time is an O(n) function, where n - the number of items input (items in the lists).
f(n) = O(n) means that that there exist constants x and y such that x * n <= f(n) <= y * n.
def linear_merge(list1, list2):
finalList = []
i = 0
j = 0
while i < len(list1):
if j < len(list2):
if list1[i] < list2[j]:
finalList.append(list1[i])
i += 1
else:
finalList.append(list2[j])
j += 1
else:
finalList.append(list1[i])
i += 1
while j < len(list2):
finalList.append(list2[j])
j += 1
return finalList