Related
Given a number n and a list of divisors A, how can I efficiently find all the combinations of divisors that, when multiplied, yield the number?
e.g.
n = 12
A = [2, 3, 4]
Output:
[[3, 2, 2],
[2, 3, 2],
[2, 2, 3],
[4, 3],
[3, 4]]
This is what I managed to do so far (code that I re-adapted from one of the many find-prime-factorization questions on stackoverflow):
def products(n, A):
    if n == 1:
        yield []
    for each_divisor in A:
        if n % each_divisor == 0:
            for new_product in products(n // each_divisor, A):
                yield new_product + [each_divisor]
This code seems to work properly but it's very slow, and if I try to use memoization (passing A as a tuple to the function to avoid unhashable type error) the code doesn't provide the correct result.
Any suggestions on how to improve the efficiency of this code?
The memoized code I tried is the following:
class Memoize:
    def __init__(self, fun):
        self.fun = fun
        self.memo = {}

    def __call__(self, *args):
        if args not in self.memo:
            self.memo[args] = self.fun(*args)
        return self.memo[args]

@Memoize
def products(n, A): [as above]
When calling the function with the above defined parameters n, A:
>>> list(products(12, (2, 3, 4)))
[[3, 2, 2]]
Without memoization, the output of the same code is:
[[3, 2, 2], [2, 3, 2], [2, 2, 3], [4, 3], [3, 4]]
Note that other memoization functions (e.g. @functools.lru_cache(maxsize=128) from the functools module) lead to the same problem.
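The likely culprit (my reading, not something stated in the question) is that memoization caches the generator object itself rather than the values it produces, so after the first full iteration every cache hit returns an already-exhausted generator. A minimal sketch of the effect, using a hypothetical numbers() helper:

from functools import lru_cache

@lru_cache(maxsize=None)
def numbers(n):
    yield from range(n)

print(list(numbers(3)))  # [0, 1, 2]
print(list(numbers(3)))  # [] -- the same, now-exhausted generator comes back from the cache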
Rather than using memoization, you can split the problem into a recursive portion to find all the unique combinations, and a portion to find the combinations of each arrangement. That should cut down your search space considerably and only permute the options that will actually work.
To accomplish this, A should be sorted.
Part 1:
Do a DFS on the graph of possible factorizations. Prune redundant branches of the search by only selecting orderings in which each factor is greater than or equal to its predecessor. For example:
12
/ | \
/ | \
/ | \
2(x6) 3(x4) 4(x3)
/ | | \
2(x3) 3(x2) 3 4(x1)
/ |
2 3(x1)
Bold nodes are the paths that lead to a successful factorization. Struck nodes are ones that lead to a redundant branch because the remaining n after dividing by the factor is less than the factor. Nodes that don't show a remaining value in parentheses do not lead to a factorization at all. No branch is attempted for the factors lower than the current one: when we try 3, 2 is never revisited, only 3 and 4, etc.
In code:
A.sort()

def products(n, A):
    def inner(n, A, L):
        for i in range(len(A)):
            factor = A[i]
            if n % factor: continue
            k = n // factor
            if k < factor:
                if k == 1:
                    yield L + [factor]
                elif n in A:
                    yield L + [n]
                break  # Following k guaranteed to be even smaller
                       # until k == 1, which elif shortcuts
            yield from inner(k, A[i:], L + [factor])
    yield from inner(n, A, [])
This is pretty fast. In your particular case, it only inspects 4 nodes instead of ~30. In fact, you can prove that it inspects the absolute minimum number of nodes possible. The only improvement you might get is by using iteration instead of recursion, and I doubt that will help much.
Part 2:
Now, you just generate a permutation of each element of the result. Python provides the tools to do this directly in the standard library:
from itertools import chain, permutations
chain.from_iterable(map(permutations, products(n, A)))
You can put this into the last line of products as
yield from chain.from_iterable(map(permutations, inner(n, A, [])))
Running list(products(12, A)) shows a 20-30% improvement on my machine this way (5.2µs vs 4.0µs). Running with a more complicated example like list(products(2 * 3 * 4 * 5 * 5 * 7 * 11, [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 22])) shows an even more dramatic improvement: 7ms vs 42ms.
Part 2b:
You can filter out duplicate permutations that occur because of duplicate factors using an approach similar to the one shown here (shameless plug). Adapting for the fact that we always deal with an initial list of sorted integers, it can be written something like this:
def perm_dedup(tup):
    maximum = (-1,) * len(tup)
    for perm in permutations(tup):
        if perm <= maximum: continue
        maximum = perm
        yield perm
Now you can use the following in the last line:
yield from chain.from_iterable(map(perm_dedup, inner(n, A, [])))
The timings still favor this complete approach very much: 5.2µs vs 4.9µs for the question and 6.5ms vs 42ms for the long example. In fact, if anything, avoiding duplicate permutations seems to reduce the timing even more.
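As a quick illustration of what perm_dedup does with a repeated factor (my own example, output checked by hand):

>>> list(perm_dedup((2, 2, 3)))
[(2, 2, 3), (2, 3, 2), (3, 2, 2)]
>>> len(list(permutations((2, 2, 3))))  # plain permutations would give 6, including duplicates
6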
TL;DR
A much more efficient implementation that only uses standard libraries and searches only for unique permutations of unique factorizations:
from itertools import chain, permutations

def perm_dedup(tup):
    maximum = (-1,) * len(tup)
    for perm in permutations(tup):
        if perm <= maximum: continue
        maximum = perm
        yield perm

def products(n, A):
    A = sorted(set(A))
    def inner(n, A, L):
        for i in range(len(A)):
            factor = A[i]
            if n % factor: continue
            k = n // factor
            if k < factor:
                if k == 1:
                    yield L + [factor]
                elif n in A:
                    yield L + [n]
                break  # Following k guaranteed to be even smaller
                       # until k == 1, which elif shortcuts
            yield from inner(k, A[i:], L + [factor])
    yield from chain.from_iterable(map(perm_dedup, inner(n, A, [])))
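As a quick sanity check (my own run, not from the original answer), the TL;DR version reproduces the five arrangements from the question, though as tuples rather than lists, because itertools.permutations yields tuples:

>>> list(products(12, [2, 3, 4]))
[(2, 2, 3), (2, 3, 2), (3, 2, 2), (3, 4), (4, 3)]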
I'm writing code that progressively generates a list of sublists based on two numerical lists (a and b) that are both in ascending order. Each sublist contains two elements and can be seen as a combination of elements from those two lists. The second element (from list b) is required to be larger than the first (from list a). In particular, the second element may not always be numerical: a sublist can be [elem, None], meaning that there is no match in list b for "elem" in list a. There should not be any duplicates in the final output. If you imagine the output as a table, each sublist would be one row, and within each of the two columns the elements are in ascending order, apart from "None" in the second column.
Thanks to the kind responses to my last question (How to generate combinations with none values in a progressive manner), I was inspired and wrote code that achieves the objective. The code is shown here.
import itertools as it
import time
start=time.time()
a=[1,5,6,7,8,10,11,13,15,16,20,24,25,27]
b=[2,8,9,10,11,12,13,14,17,18,21,26]
def create_combos(lst1, lst2): #a is the base list; l is the adjacent detector list
    n = len(lst1)
    x_ref = [None,None]
    for i in range(1,n+1):
        choices_index = it.combinations(range(n),i)
        choices_value = list(it.combinations(lst2,i))
        for choice in choices_index:
            for values in choices_value:
                x = [[elem,None] for elem in lst1]
                for index,value in zip(choice,values): #Iterate over two lists in parallel
                    if value <= x[index][0]:
                        x[index][0] = None
                        break
                    else:
                        x[index][1] = value #over-write in appropriate location
                if x_ref not in x:
                    yield x
count=0
combos=create_combos(a,b)
for combo in combos:
# print(combo)
count+=1
print('The number of combos is ',count)
end=time.time()
print('Run time is ',end-start)
This code is about the best I can get in terms of speed with my limited Python knowledge. However, it still takes too long to run once the number of elements in lists a and b grows beyond 15. I understand this is probably because of the drastic increase in the number of combinations. However, I wonder if any improvement can be made to increase its efficiency, perhaps regarding the way the combinations are generated. Moreover, I was generating all possible combinations and dropping the inappropriate ones afterwards, which I assume may also be inefficient.
The desired result would be to handle about 30 elements in each list within a reasonable amount of time.
EDIT: Once the number of elements in each list grows larger, the number of combos increases drastically. Thus, I would like to keep the generator in order to allow only one combo to occupy memory at a time.
Please feel free to ask questions if I am unclear on any of the above statements. Thank you :)
EDIT 2:
Okay, so you can do this much faster if you just do things a bit smarter. I'm going to use NumPy and Numba now to really accelerate things. If you don't want to use Numba, it should still work if you comment out the parts where it's used, only slower. If you don't want NumPy, it could be replaced with lists or nested lists, but again probably with a significant performance difference.
So let's see. The two key things to change are:
Preallocating the space for the output (instead of having a generator we produce the whole output at once).
Reusing computed combinations.
To preallocate, we first need to count how many combinations there will be in total. The algorithm is similar, but it just counts, and with a cache for the partial counts it is actually quite fast. Numba makes a huge difference here.
import numpy as np
import numba as nb

def count_combos(a, b):
    cache = np.zeros([len(a), len(b)], dtype=np.int32)
    total = count_combos_rec(a, b, 0, 0, cache)
    return total

@nb.njit
def count_combos_rec(a, b, i, j, cache):
    if i >= len(a) or j >= len(b):
        return 1
    if cache[i][j] > 0:
        return cache[i][j]
    while j < len(b) and a[i] >= b[j]:
        j += 1
    count = 0
    for j2 in range(j, len(b)):
        count += count_combos_rec(a, b, i + 1, j2 + 1, cache)
    count += count_combos_rec(a, b, i + 1, j, cache)
    cache[i][j] = count
    return count
Now we can preallocate a big array for all the combinations. Instead of storing the combinations directly in there, I will have an array of integers representing the position of the element in b (the element in a is implicit by the position, and the None matches are represented by -1).
In order to reuse combinations, we do as follows. Every time we need to find the combinations for a certain pair i/j, if it has not been computed before, we do it, and then we save the position in the output array where these combinations have been stored for the first time. Next time we come across the same i/j pair, we just need to copy the corresponding part we made before.
All in all, the algorithm ends up as follows (the result in this case is a NumPy object array, with the first column being the element from a and the second the element from b, but you can use .tolist() to convert it to a regular Python list).
import numpy as np
import numba as nb

def generate_combos(a, b):
    a = np.asarray(a)
    b = np.asarray(b)
    # Count combos
    total = count_combos(a, b)
    count_table = np.zeros([len(a), len(b)], np.int32)
    # Table telling first position of a i/j match
    ref_table = -np.ones([len(a), len(b)], dtype=np.int32)
    # Preallocate result
    result_idx = np.empty([total, len(a)], dtype=np.int32)
    # Make combos
    generate_combos_rec(a, b, 0, 0, result_idx, 0, count_table, ref_table)
    # Produce matchings array
    seconds = np.where(result_idx >= 0, b[result_idx], None)
    firsts = np.tile(a[np.newaxis], [len(seconds), 1])
    return np.stack([firsts, seconds], axis=-1)

@nb.njit
def generate_combos_rec(a, b, i, j, result, idx, count_table, ref_table):
    if i >= len(a):
        return idx + 1
    if j >= len(b):
        result[idx, i:] = -1
        return idx + 1
    elif ref_table[i, j] >= 0:
        r = ref_table[i, j]
        c = count_table[i, j]
        result[idx:idx + c, i:] = result[r:r + c, i:]
        return idx + c
    else:
        idx_ref = idx
        j_ref = j
        while j < len(b) and a[i] >= b[j]:
            j += 1
        for j2 in range(j, len(b)):
            idx_next = generate_combos_rec(a, b, i + 1, j2 + 1, result, idx, count_table, ref_table)
            result[idx:idx_next, i] = j2
            idx = idx_next
        idx_next = generate_combos_rec(a, b, i + 1, j, result, idx, count_table, ref_table)
        result[idx:idx_next, i] = -1
        idx = idx_next
        ref_table[i, j_ref] = idx_ref
        count_table[i, j_ref] = idx - idx_ref
        return idx
Let's check the result is still correct:
a = [1, 5, 6, 7, 8, 10, 11, 13, 15, 16, 20, 24, 25, 27]
b = [2, 8, 9, 10, 11, 12, 13, 14, 17, 18, 21, 26]
# generate_combos_prev is the previous recursive method
combos1 = list(generate_combos_prev(a, b))
# Note we do not need list(...) here because it is not a generator
combos = generate_combos(a, b)
print((combos1 == combos).all())
# True
Okay good, now let's see about the performance.
%timeit list(generate_combos_prev(a, b))
# 3.7 s ± 17.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit generate_combos(a, b)
# 593 ms ± 2.66 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Nice! That is like 6x faster! The only possible downsides, besides the additional dependencies, is that we are making all the combinations at once instead of iteratively (so you will have them all at once in memory) and that we need a table for the partial counts with size O(len(a) * len(b)).
This is a faster way to do what you are doing:
def generate_combos(a, b):
    # Assumes a and b are already sorted
    yield from generate_combos_rec(a, b, 0, 0, [])

def generate_combos_rec(a, b, i, j, current):
    # i and j are the current indices for a and b respectively
    # current is the current combo
    if i >= len(a):
        # Here a copy of current combo is yielded
        # If you are going to use only one combo at a time you may skip the copy
        yield list(current)
    else:
        # Advance j until we get to a value bigger than a[i]
        while j < len(b) and a[i] >= b[j]:
            j += 1
        # Match a[i] with every possible value from b
        for j2 in range(j, len(b)):
            current.append((a[i], b[j2]))
            yield from generate_combos_rec(a, b, i + 1, j2 + 1, current)
            current.pop()
        # Match a[i] with None
        current.append((a[i], None))
        yield from generate_combos_rec(a, b, i + 1, j, current)
        current.pop()
a = [1, 5, 6, 7, 8, 10, 11, 13, 15, 16, 20, 24, 25, 27]
b = [2, 8, 9, 10, 11, 12, 13, 14, 17, 18, 21, 26]
count = 0
combos = generate_combos(a, b)
for combo in combos:
count += 1
print('The number of combos is', count)
# 1262170
The only difference with this algorithm is that it generates one more combination than yours (in your code the final count is 1262169), namely one where every element in a is matched with None. This is always the last combination to be generated, so you can just ignore that one if you want.
EDIT: If you prefer, you can move the # Match a[i] with None block in generate_combos_rec to just between the while loop and the for loop, and then the extra combination with every value in a matched to None will be the first one instead of the last one. That may make it easier to skip it. Alternatively, you can replace yield list(current) with:
if any(m is not None for _, m in current):
yield list(current)
To avoid generating the extra combination (at the expense of an additional inspection of every generated combination).
EDIT 2:
Here is a slightly modified version that avoids the extra combination by just carrying a boolean indicator in the recursion.
def generate_combos(a, b):
    yield from generate_combos_rec(a, b, 0, 0, [], True)

def generate_combos_rec(a, b, i, j, current, all_none):
    if i >= len(a):
        if not all_none:
            yield list(current)
    else:
        while j < len(b) and a[i] >= b[j]:
            j += 1
        for j2 in range(j, len(b)):
            current.append((a[i], b[j2]))
            yield from generate_combos_rec(a, b, i + 1, j2 + 1, current, False)
            current.pop()
        current.append((a[i], None))
        yield from generate_combos_rec(a, b, i + 1, j, current, all_none)
        current.pop()
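With the same a and b as in the question, this version should report 1262169 combos, matching the original code, since the only combination it drops is the all-None one (my own reasoning from the counts above, not a measured figure):

count = sum(1 for _ in generate_combos(a, b))
print('The number of combos is', count)  # expected: 1262169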
I wrote this code, which finds two integers in a given list (in this case [2,4,5,1,6,40,-1]) that multiply to twenty. I got a little stuck in the beginning, but adding a function solved my problem. I showed this code to a friend of mine who's a programmer, and he said I could make it more "pythonic", but I have no clue how.
Here's the code:
num_list = [2,4,5,1,6,40,-1]
def get_mult_num(given_list):
    for i in given_list:
        for j in range(i+1, len(given_list)): #for j not to be == i and to be in the list
            mult_two_numbers = i * j
            if mult_two_numbers == 20:
                return i,j
print(get_mult_num(num_list))
I don't necessarily think it is 'unpythonic'; you are using standard Python idioms to loop over your data and produce a single result or None. The term Pythonic is nebulous, a subject mired in "I know it when I see it" parameters.
Note, though, that you have not produced a correct implementation. While i loops over given_list, j loops over an integer from i + 1 through to len(given_list) - 1, mixing values from given_list with indices. For your sample input, you are taking j from the half-open ranges [3, 7), [5, 7), [6, 7), [2, 7), [7, 7) (empty), [41, 7) (empty) and [0, 7), respectively. That it produces the correct answer at all is luck, not correctness; if you give your function the list [2, 10], it'll not find a solution! You want to loop over given_list again, limited with slicing, or generate indices starting at the current index of i, but then your outer loop needs to add an enumerate() call too:
for ii, i in enumerate(given_list):
    for j in given_list[ii + 1:]:
        # ...
or
for ii, i in enumerate(given_list):
    for jj in range(ii + 1, len(given_list)):
        j = given_list[jj]
        # ...
All this is not nearly as efficient as it can be; the Python standard library offers you the tools to generate your i, j pairs without a nested for loop or slicing or other forms of filtering.
Your double loop should generate combinations of the integer inputs, so use the itertools.combinations() object to generate unique i, j pairs:
from itertools import combinations
def get_mult_num(given_list):
    return [(i, j) for i, j in combinations(given_list, 2) if i * j == 20]
This assumes there can be zero or more such solutions, not just a single solution.
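For the question's num_list, a quick check (my own run) gives:

>>> get_mult_num([2, 4, 5, 1, 6, 40, -1])
[(4, 5)]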
If you only ever need the first result or None, you can use the next() function:
def get_mult_num(given_list):
    multiplies_to_20 = (
        (i, j) for i, j in combinations(given_list, 2)
        if i * j == 20)
    return next(multiplies_to_20, None)
Next, rather than produce all possible combinations, you may want to invert the problem. If you turn given_list into a set, you can trivially check if the target number 20 can be divided cleanly without remainder by any of your given numbers and where the result of the division is larger and is also an integer in the set of numbers. That gives you an answer in linear time.
You can further limit the search by dividing with numbers smaller than the square root of the target value, because you won't find a larger value to match in your input numbers (given a number n and its square root s, by definition s * (s + 1) is going to be larger than n).
If we add an argument for the target number to the function and make it a generator function, then you get:
def gen_factors_for(target, numbers):
    possible_j = set(numbers)
    limit = abs(target) ** 0.5
    for i in numbers:
        if abs(i) < limit and target % i == 0:
            j = target // i
            if j in possible_j and abs(j) > abs(i):
                yield i, j
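For the question's numbers, a quick check (my own run) gives:

>>> list(gen_factors_for(20, [2, 4, 5, 1, 6, 40, -1]))
[(4, 5)]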
This approach is a lot faster than testing all combinations, especially if you need to find all possible factors. Note that I made both functions generators here to even out the comparisons:
>>> import random, operator
>>> from timeit import Timer
>>> def gen_factors_for_division(target, numbers):
... possible_j = set(numbers)
... limit = abs(target) ** 0.5
... for i in numbers:
... if abs(i) < limit and target % i == 0:
... j = target // i
... if j in possible_j and abs(j) > abs(i):
... yield i, j
...
>>> def gen_factors_for_combinations(target, given_list):
... return ((i, j) for i, j in combinations(given_list, 2) if i * j == target)
...
>>> numbers = [random.randint(-10000, 10000) for _ in range(100)]
>>> targets = [operator.mul(*random.sample(set(numbers), 2)) for _ in range(5)]
>>> targets += [t + random.randint(1, 100) for t in targets] # add likely-to-be-unsolvable numbers
>>> for (label, t) in (('first match:', 'next({}, None)'), ('all matches:', 'list({})')):
... print(label)
... for f in (gen_factors_for_division, gen_factors_for_combinations):
... test = t.format('f(t, n)')
... timer = Timer(
... f"[{test} for t in ts]",
... 'from __main__ import targets as ts, numbers as n, f')
... count, total = timer.autorange()
... print(f"{f.__name__:>30}: {total / count * 1000:8.3f}ms")
...
first match:
gen_factors_for_division: 0.219ms
gen_factors_for_combinations: 4.664ms
all matches:
gen_factors_for_division: 0.259ms
gen_factors_for_combinations: 3.326ms
Note that I generate 10 different random targets, to try to avoid a lucky best-case-scenario hit for either approach.
[(i,j) for i in num_list for j in num_list if i<j and i*j==20]
This is my take on it, which uses enumerate:
def get_mult_num(given_list):
    return [
        (item1, item2)
        for i, item1 in enumerate(given_list)
        for item2 in given_list[:i]
        if item1*item2 == 20
    ]
I think your friend may be hinting towards using comprehensions when it makes the code cleaner (sometimes it doesn't).
I can think of using a list comprehension. This also helps to find multiple such pairs if they exist in the given list.
num_list = [2,4,5,1,6,40,-1]
mult_num = [(num_list[i],num_list[j]) for i in range(len(num_list)) for j in range(i+1, len(num_list)) if num_list[i]*num_list[j] == 20]
print(mult_num)
Output:
[(4, 5)]
I came up with this. It reverses the approach a little bit: it searches num_list for the pair partner that the iteration value val would multiply to 20 with. This makes the code simpler and needs no imports, even if it's not the most efficient way.
for val in num_list:
    if 20 / val in num_list:
        print(val, int(20/val))
You could make it more pythonic by using itertools.combinations, instead of nested loops, to find all pairs of numbers. Not always, but often iterating over indices as in for i in range(len(L)): is less pythonic than directly iterating over values as in for v in L:.
Python also allows you to make your function into a generator via the yield keyword so that instead of just returning the first pair that multiplies to 20, you get every pair that does by iterating over the function call.
import itertools
def factors(x, numbers):
    """Generate all pairs in list of numbers that multiply to x."""
    for a, b in itertools.combinations(numbers, 2):
        if a * b == x:
            yield (a, b)
numbers = [2, 4, 5, 1, 6, 40, -1]
for pair in factors(20, numbers):
print(pair)
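For the numbers above, this loop should print the single pair (4, 5) (my own check, not part of the original answer).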
I am new to python, and I was wondering if I could generate the fibonacci series using python's list comprehension feature. I don't know how list comprehensions are implemented.
I tried the following (the intention was to generate the first five fibonacci numbers):
series=[]
series.append(1)
series.append(1)
series += [series[k-1]+series[k-2] for k in range(2,5)]
This piece of code throws the error: IndexError: list index out of range.
Let me know if it is even possible to generate such a series using a list comprehension.
You cannot do it like that: the list comprehension is evaluated first, and then that list is added to series. So basically it would be like you would have written:
series=[]
series.append(1)
series.append(1)
temp = [series[k-1]+series[k-2] for k in range(2,5)]
series += temp
You can however solve this by using list comprehension as a way to force side effects, like for instance:
series=[]
series.append(1)
series.append(1)
[series.append(series[k-1]+series[k-2]) for k in range(2,5)]
Note that we here do not add the result to series. The list comprehension is only used such that .append is called on series. However some consider list comprehensions with side effects rather error prone: it is not very declarative and tends to introduce bugs if not done carefully.
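For the parameters above, a quick check of the side-effect version (my own run): the comprehension itself evaluates to a list of None values, while series ends up holding the sequence:

>>> series = [1, 1]
>>> [series.append(series[k-1] + series[k-2]) for k in range(2, 5)]
[None, None, None]
>>> series
[1, 1, 2, 3, 5]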
We could write it as a clean Python list comprehension (or generator) using its relationship to the golden ratio:
>>> series = [int((((1 + 5**0.5) / 2)**n - ((1 - 5**0.5) / 2)**n) / 5**0.5) for n in range(1, 21)]
>>> series
[1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765]
>>>
or a little more nicely as:
>>> square_root_of_five = 5**0.5
>>> Phi = (1 + square_root_of_five) / 2
>>> phi = (1 - square_root_of_five) / 2
>>>
>>> series = [int((Phi**n - phi**n) / square_root_of_five) for n in range(1, 21)]
>>> series
[1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765]
If you know how many terms of the series you will need then you can write the code compactly without a list comprehension like this.
def Fibonacci(n):
    f0, f1 = 1, 1
    for _ in range(n):
        yield f0
        f0, f1 = f1, f0+f1
fibs = list(Fibonacci(10))
print (fibs)
If you want some indefinite number of terms then you could use this, which is very similar.
def Fibonacci():
    f0, f1 = 1, 1
    while True:
        yield f0
        f0, f1 = f1, f0+f1

fibs = []
for f in Fibonacci():
    fibs.append(f)
    if f>100:
        break
print (fibs)
When you need a potentially infinite collection of items you should perhaps consider either a function with one or more yield statements or a generator expression. I'd love to be able to make Fibonacci numbers with a generator expression but apparently one can't.
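As a side note (my own addition, not part of the original answer), the manual break loop can also be written with itertools.takewhile, if you are happy to stop before the first value above 100 rather than just after it:

from itertools import takewhile

fibs = list(takewhile(lambda f: f <= 100, Fibonacci()))
print(fibs)
# [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]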
Using Assignment Expression (python >= 3.8):
s = [0, 1]
s += [(s := [s[1], s[0] + s[1]]) and s[1] for k in range(10)]
print (s)
# [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
To build on what Willem van Onsem said:
The conventional way to calculate the nth term of the fibonacci sequence is to sum the n-1 and n-2 terms, as you're aware. A list comprehension is designed to create a list with no side effects during the comprehension (apart from the creation of the single list). Storing the last 2 terms of the sequence during calculation of the sequence is a side-effect, therefore a list comprehension is ill-suited to the task on its own.
A safe way around this would be to make a closure generator (essentially a generator with some associated private state) that can be passed to the list comprehension such that the list comprehension does not have to worry about the details of what's being stored:
def fib_generator(n):
    def fib_n_generator():
        last = 1
        curr = 1
        if n == 0:
            return
        yield last
        if n == 1:
            return
        yield curr
        if n == 2:
            return
        ii = 2
        while ii < n:
            next = curr + last
            yield next
            last = curr
            curr = next
            ii += 1
    return fib_n_generator()
fib = [xx for xx in fib_generator(10)]
print(fib)
Here's a one-line list comprehension solution that avoids the separate initialization step with nested ternary operators and the walrus operator (so needs Python 3.8), and also avoids the rapid onset of overflow problems that the explicit form can give you (with its **n component):
[
0 if not i else
(x := [0, 1]) and 1 if i == 1 else
not x.append(x[-2] + x[-1]) and x[-1]
for i in range(10)
]
Gives:
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
This is faster than the explicit form for generating all of the values up to N. If, however, you don't want all of the values then the explicit form could be much faster, but it does suffer from overflow for some N between 1000 and 2000:
n = 2000
int((((1 + 5**0.5) / 2)**n - ((1 - 5**0.5) / 2)**n) / 5**0.5)
gives for me:
OverflowError: (34, 'Numerical result out of range')
whereas the "adding the last two values" approach can generate higher values for larger N. On my machine, I can keep going until some N between 300000 and 400000 before I run out of memory.
Thanks to Jonathan Gregory for leading me most of the way to this approach.
List comprehension of the Fibonacci series, based on the explicit formula:
[int((0.5+5**0.5/2)**n/5**0.5+0.5) for n in range(21)]
From Python One-Liners by Christian Mayer.
n = 10
x = [0,1]
fibs = x[0:2] + [x.append(x[-1] + x[-2]) or x[-1] for i in range(n-2)]
print(fibs)
# [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
The answer is that you can do this with a list comprehension without the assignment expression (walrus) operator; it works even in Python 2.
I did it this way:
def Phi(number: int):
    n = [1, 1]
    [n.append(n[i-2] + n[i-1]) for i in range(2, number)]
    return n
Simplification of @dhassel's version (requires Python 3.8 or later)
series = [i0 := 0, i1 := 1]+[i1 := i0 + (i0 := i1) for j in range(2, 5)]
It can also be written as a generator expression, but it's a bit tricky because, for some reason, the obvious answer, fibo = (v for g in ((i0 := 0, i1 := 1), (i1 := i0 + (i0 := i1) for j in range(2,10))) for v in g), doesn't work (I do not exclude a bug). However, it is OK if you pull the sub-generators list outside:
glist = ((i0 := 0, i1 := 1), (i1 := i0 + (i0 := i1) for j in range(2, 5)))
fibo = (v for g in glist for v in g)
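Consuming that generator (my own quick check) gives the expected prefix of the series:

>>> list(fibo)
[0, 1, 1, 2, 3]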
# Get a number from the user.
number = int(input("enter a number"))
# Create an empty list
mylist=[]
# Create a list comprehension following the Fibonacci series
[mylist.append(0) if n==0 else mylist.append(1) if n==1 else mylist.append(mylist[-2]+mylist[-1]) for n in range(number+1)]
print(mylist)
Using list comprehension:
n = int(input())
fibonacci_list = [0,1]
[fibonacci_list.append(fibonacci_list[k-1]+fibonacci_list[k-2]) for k in range(2,n)]
if n<=0:
    print('+ve numbers only')
elif n == 1:
    fibonacci_list = [fibonacci_list[0]]
    print(fibonacci_list)
else:
    print(fibonacci_list)
maybe it's a feasible solution for this problem...
I want to get a running total from a list of numbers.
For demo purposes, I start with a sequential list of numbers using range
a = range(20)
runningTotal = []
for n in range(len(a)):
    new = runningTotal[n-1] + a[n] if n > 0 else a[n]
    runningTotal.append(new)
# This one is a syntax error
# runningTotal = [a[n] for n in range(len(a)) if n == 0 else runningTotal[n-1] + a[n]]
for i in zip(a, runningTotal):
    print "{0:>3}{1:>5}".format(*i)
yields
0 0
1 1
2 3
3 6
4 10
5 15
6 21
7 28
8 36
9 45
10 55
11 66
12 78
13 91
14 105
15 120
16 136
17 153
18 171
19 190
As you can see, I initialize an empty list [], then append() in each loop iteration. Is there a more elegant way to do this, like a list comprehension?
A list comprehension has no good (clean, portable) way to refer to the very list it's building. One good and elegant approach might be to do the job in a generator:
def running_sum(a):
    tot = 0
    for item in a:
        tot += item
        yield tot
to get this as a list instead, of course, use list(running_sum(a)).
If you can use numpy, it has a built-in function named cumsum that does this.
import numpy as np
tot = np.cumsum(a) # returns a np.ndarray
tot = list(tot) # if you prefer a list
I'm not sure about 'elegant', but I think the following is much simpler and more intuitive (at the cost of an extra variable):
a = range(20)
runningTotal = []
total = 0
for n in a:
    total += n
    runningTotal.append(total)
The functional way to do the same thing is:
a = range(20)
runningTotal = reduce(lambda x, y: x+[x[-1]+y], a, [0])[1:]
...but that's much less readable/maintainable, etc.
@Omnifarous suggests this should be improved to:
a = range(20)
runningTotal = reduce(lambda l, v: (l.append(l[-1] + v) or l), a, [0])
...but I still find that less immediately comprehensible than my initial suggestion.
Remember the words of Kernighan: "Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it."
This can be implemented in 2 lines in Python.
Using a default parameter eliminates the need to maintain an aux variable outside, and then we just do a map to the list.
def accumulate(x, l=[0]): l[0] += x; return l[0];
map(accumulate, range(20))
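Two caveats worth noting (my own addition): on Python 3, map is lazy, so you need to wrap it in list(), and because the default argument l=[0] is created only once, the running total persists across calls to accumulate:

print(list(map(accumulate, range(20)))[:5])  # [0, 1, 3, 6, 10] on Python 3 (wrap the lazy map in list)
print(list(map(accumulate, range(3))))       # [190, 191, 193], not [0, 1, 3]: l=[0] is shared between calls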
Use itertools.accumulate(). Here is an example:
from itertools import accumulate
a = range(20)
runningTotals = list(accumulate(a))
for i in zip(a, runningTotals):
    print("{0:>3}{1:>5}".format(*i))
This only works on Python 3. On Python 2 you can use the backport in the more-itertools package.
When we take the sum of a list, we designate an accumulator (memo) and then walk through the list, applying the binary function "x+y" to each element and the accumulator. Procedurally, this looks like:
def mySum(list):
    memo = 0
    for e in list:
        memo = memo + e
    return memo
This is a common pattern, and useful for things other than taking sums — we can generalize it to any binary function, which we'll supply as a parameter, and also let the caller specify an initial value. This gives us a function known as reduce, foldl, or inject[1]:
def myReduce(function, list, initial):
    memo = initial
    for e in list:
        memo = function(memo, e)
    return memo

def mySum(list):
    return myReduce(lambda memo, e: memo + e, list, 0)
In Python 2, reduce was a built-in function, but in Python 3 it's been moved to the functools module:
from functools import reduce
We can do all kinds of cool stuff with reduce depending on the function we supply as its first argument. If we replace "sum" with "list concatenation", and "zero" with "empty list", we get the (shallow) copy function:
def myCopy(list):
    return reduce(lambda memo, e: memo + [e], list, [])
myCopy(range(10))
> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
If we add a transform function as another parameter to copy, and apply it before concatenating, we get map:
def myMap(transform, list):
    return reduce(lambda memo, e: memo + [transform(e)], list, [])
myMap(lambda x: x*2, range(10))
> [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
If we add a predicate function that takes e as a parameter and returns a boolean, and use it to decide whether or not to concatenate, we get filter:
def myFilter(predicate, list):
    return reduce(lambda memo, e: memo + [e] if predicate(e) else memo, list, [])
myFilter(lambda x: x%2==0, range(10))
> [0, 2, 4, 6, 8]
map and filter are sort of unfancy ways of writing list comprehensions — we could also have said [x*2 for x in range(10)] or [x for x in range(10) if x%2==0]. There's no corresponding list comprehension syntax for reduce, because reduce isn't required to return a list at all (as we saw with sum, earlier, which Python also happens to offer as a built-in function).
It turns out that for computing a running sum, the list-building abilities of reduce are exactly what we want, and probably the most elegant way to solve this problem, despite its reputation (along with lambda) as something of an un-pythonic shibboleth. The version of reduce that leaves behind copies of its old values as it runs is called reductions or scanl[1], and it looks like this:
def reductions(function, list, initial):
    return reduce(lambda memo, e: memo + [function(memo[-1], e)], list, [initial])
So equipped, we can now define:
def running_sum(list):
    first, rest = list[0], list[1:]
    return reductions(lambda memo, e: memo + e, rest, first)
running_sum(range(10))
> [0, 1, 3, 6, 10, 15, 21, 28, 36, 45]
While conceptually elegant, this precise approach fares poorly in practice with Python. Because Python's list.append() mutates a list in place but doesn't return it, we can't use it effectively in a lambda, and have to use the + operator instead. This constructs a whole new list, which takes time proportional to the length of the accumulated list so far (that is, an O(n) operation). Since we're already inside the O(n) for loop of reduce when we do this, the overall time complexity compounds to O(n²).
In a language like Ruby[2], where array.push e returns the mutated array, the equivalent runs in O(n) time:
class Array
  def reductions(initial, &proc)
    self.reduce [initial] do |memo, e|
      memo.push proc.call(memo.last, e)
    end
  end
end

def running_sum(enumerable)
  first, rest = enumerable.first, enumerable.drop(1)
  rest.reductions(first, &:+)
end
running_sum (0...10)
> [0, 1, 3, 6, 10, 15, 21, 28, 36, 45]
same in JavaScript[2], whose array.push(e) returns the new length (not the array), but whose anonymous functions allow us to include multiple statements, which we can use to separately specify a return value:
function reductions(array, callback, initial) {
    return array.reduce(function(memo, e) {
        memo.push(callback(memo[memo.length - 1], e));
        return memo;
    }, [initial]);
}

function runningSum(array) {
    var first = array[0], rest = array.slice(1);
    return reductions(rest, function(memo, e) {
        return memo + e;
    }, first);
}

function range(start, end) {
    return Array.apply(null, Array(end - start)).map(function(e, i) {
        return start + i;
    });
}
runningSum(range(0, 10));
> [0, 1, 3, 6, 10, 15, 21, 28, 36, 45]
So, how can we solve this while retaining the conceptual simplicity of a reductions function that we just pass lambda x, y: x + y to in order to create the running sum function? Let's rewrite reductions procedurally. We can fix the accidentally quadratic problem, and while we're at it, pre-allocate the result list to avoid heap thrashing[3]:
def reductions(function, list, initial):
    result = [None] * (len(list) + 1)
    result[0] = initial
    for i in range(len(list)):
        result[i + 1] = function(result[i], list[i])
    return result

def running_sum(list):
    first, rest = list[0], list[1:]
    return reductions(lambda memo, e: memo + e, rest, first)
running_sum(range(0,10))
> [0, 1, 3, 6, 10, 15, 21, 28, 36, 45]
This is the sweet spot for me: O(n) performance, and the optimized procedural code is tucked away under a meaningful name where it can be re-used the next time you need to write a function that accumulates intermediate values into a list.
The names reduce/reductions come from the LISP tradition, foldl/scanl from the ML tradition, and inject from the Smalltalk tradition.
Python's List and Ruby's Array are both implementations of an automatically resizing data structure known as a "dynamic array" (or std::vector in C++). JavaScript's Array is a little more baroque, but behaves identically provided you don't assign to out of bounds indices or mutate Array.length.
The dynamic array that forms the backing store of the list in the Python runtime will resize itself every time the list's length crosses a power of two. Resizing a list means allocating a new list on the heap of twice the size of the old one, copying the contents of the old list into the new one, and returning the old list's memory to the system. This is an O(n) operation, but because it happens less and less frequently as the list grows larger and larger, the time complexity of appending to a list works out to O(1) in the average case. However, the "hole" left by the old list can sometimes be difficult to recycle, depending on its position in the heap. Even with garbage collection and a robust memory allocator, pre-allocating an array of known size can save the underlying systems some work. In an embedded environment without the benefit of an OS, this kind of micro-management becomes very important.
Starting Python 3.8, and the introduction of assignment expressions (PEP 572) (:= operator), we can use and increment a variable within a list comprehension:
# items = range(7)
total = 0
[(x, total := total + x) for x in items]
# [(0, 0), (1, 1), (2, 3), (3, 6), (4, 10), (5, 15), (6, 21)]
This:
Initializes a variable total to 0 which symbolizes the running sum
For each item, this both:
increments total by the current looped item (total := total + x) via an assignment expression
and at the same time returns the new value of total as part of the produced mapped tuple
I wanted to do the same thing to generate cumulative frequencies that I could use bisect_left over; this is the way I've generated the list:
[ sum( a[:x] ) for x in range( 1, len(a)+1 ) ]
Here's a linear-time one-liner (Python 2 only, since it relies on tuple parameter unpacking in the lambda; it also needs chain from itertools):
list(reduce(lambda (c,s), a: (chain(c,[s+a]), s+a), l,(iter([]),0))[0])
Example:
l = range(10)
list(reduce(lambda (c,s), a: (chain(c,[s+a]), s+a), l,(iter([]),0))[0])
>>> [0, 1, 3, 6, 10, 15, 21, 28, 36, 45]
In short, the reduce goes over the list, accumulating the sum and constructing a list. The final [0] returns the list; [1] would be the final running-total value.
Another one-liner, in linear time and space.
def runningSum(a):
    return reduce(lambda l, x: l.append(l[-1] + x) or l if l else [x], a, None)
I'm stressing linear space here, because most of the one-liners I saw in the other proposed answers --- those based on the pattern list + [sum] or using chain iterators --- generate O(n) lists or generators and stress the garbage collector so much that they perform very poorly, in comparison to this.
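A quick usage check (my own run; on Python 3 you first need from functools import reduce):

>>> from functools import reduce
>>> runningSum(range(10))
[0, 1, 3, 6, 10, 15, 21, 28, 36, 45]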
I would use a coroutine for this:
def runningTotal():
    accum = 0
    yield None
    while True:
        accum += yield accum
tot = runningTotal()
next(tot)
running_total = [tot.send(i) for i in xrange(N)]
You are looking for two things: fold (reduce) and a funny function that keeps a list of the results of another function, which I have called running. I made versions both with and without an initial parameter; either way these need to go to reduce with an initial [].
def last_or_default(list, default):
    if len(list) > 0:
        return list[-1]
    return default

def initial_or_apply(list, f, y):
    if list == []:
        return [y]
    return list + [f(list[-1], y)]

def running_initial(f, initial):
    return (lambda x, y: x + [f(last_or_default(x, initial), y)])

def running(f):
    return (lambda x, y: initial_or_apply(x, f, y))
totaler = lambda x, y: x + y
running_totaler = running(totaler)
running_running_totaler = running_initial(running_totaler, [])
data = range(0,20)
running_total = reduce(running_totaler, data, [])
running_running_total = reduce(running_running_totaler, data, [])
for i in zip(data, running_total, running_running_total):
    print "{0:>3}{1:>4}{2:>83}".format(*i)
These will take a long time on really large lists due to the + operator. In a functional language, if done correctly, this list construction would be O(n).
Here are the first few lines of output:
0 0 [0]
1 1 [0, 1]
2 3 [0, 1, 3]
3 6 [0, 1, 3, 6]
4 10 [0, 1, 3, 6, 10]
5 15 [0, 1, 3, 6, 10, 15]
6 21 [0, 1, 3, 6, 10, 15, 21]
This is inefficient, as it recomputes the sum from the beginning every time, but it is possible:
a = range(20)
runtot=[sum(a[:i+1]) for i,item in enumerate(a)]
for line in zip(a,runtot):
    print line
With Python 3.8 and above, you can now use the walrus operator:
xs = range(20)
total = 0
run = [(total := total + d) for d in xs]
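For the question's range(20), printing run should reproduce the running totals from the table above (my own check):

print(run)
# [0, 1, 3, 6, 10, 15, 21, 28, 36, 45, 55, 66, 78, 91, 105, 120, 136, 153, 171, 190]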