Reducing the time complexity of stair-step question (Amazon interview question) - python

def step(n):
if (n==0) or (n==1):
return 1
elif n==2:
return 2
else:
return step(n-1) + step(n-2) + step(n-3)
n = int(input())
print(step(n))
For input 53798080 it is taking 1 second. It should take a lot lesser than that to satisfy the test case.

This type of problem - evaluating a recurrence relation - has had a lot of smart people study it over the years, and that means that there's a ton of cool insights and ideas you can use to speed things up.
The comments have done a great job identifying why your code slows down on large inputs - it's because you're generating lots of duplicate recursive calls. The question, then, is how to address this.
If you want to keep your same basic strategy, I would recommend using memoization. If you haven't seen this technique before, the basic idea is to have the recursion keep track of calls that have already been made and to cache the results of those calls. Then, if you try solving the same problem twice, you can just hand back the cached result.
The general template for memoization looks something like this. (It's in pseudocode, but shouldn't be too hard to adapt.)
def memoized_recursion(original_args, memoization_table):
if memoization_table contains original_args):
return memoization_table[original_args]
else
# Put the rest of your recursive code here.
# Before returning a result, store it in memoization_table.
This reduces the number of recursive calls pretty dramatically, speeding up your code.
This, of course, isn't the only solution to making your code faster. If you have to keep things recursive, there's a different insight you can use that fundamentally changes the strategy. The basic idea is this. You're generating a series of numbers that looks like this:
1, 1, 2, 4, 7, 13, 24, ...
The idea is that
the first three terms are 1, 1, 2;
every term past this one is the sum of the three previous numbers; and
you want the nth term of the series.
If you need terms 0, 1, or 2, you can just read the answer off because you know the first three numbers.
If not, here's another technique you can use. Rather than getting the three previous values and adding them together, use this useful fact: asking for the nth term of the series starting with 1, 1, 2 is equivalent to asking for the (n-1)st term of the series starting with 1, 2, 4. (Do you see why?)
More generally, if the first three terms of the series are a, b, and c and you want the nth term, you can ask for the (n-1)st term of the series starting with the sequence b, c, a + b + c. This gives a different recursive strategy where the recursion doesn't branch, meaning that you don't need memoization.
And now, one final strategy. The type of problem you're solving involves something called a homogeneous linear recurrence relation. That is, you have a recurrence of the form
a0, a1, ..., and ak-1 are fixed constants, and
an+k = c0 an + c1 an+1 + ... + ck-1an+k-1.
This recurrence includes things like the Fibonacci sequence, the Pell numbers, the Padovan sequence, etc.
It turns out that in any case where you're solving a recurrence like this, you can solve the problem by raising specifically-chosen matrices to specific powers. In your case, the basic idea is related to the one for the second recursive strategy. The idea is that if the last three terms of the sequence are a, b, and c, then you know that the the next term is a + b + c, and the two terms before this are b and c. In other words, you can think of a mapping that turns (a, b, c) into (b, c, a + b + c). This can be thought of as this matrix equation:
| 0 1 0 | |a| | b |
| 0 0 1 | |b| = | c |
| 1 1 1 | |c| | a + b + c |
If you let M be the matrix on the far left, then computing Mn and multiplying it by the column vector (a, b, c) will give you the nth, (n+1)st, and (n+2)nd terms of the recurrence relation. This gives a totally different strategy for solving the problem: build a matrix, then raise it to a large power!
You can do this very efficiently, in fact. There's a (recursive) technique called exponentiation by squaring that can compute the nth power of a matrix using only O(log n) multiplications. (The entries of the matrix will start to get pretty big, and multiplying them will start to be your bottleneck, unfortunately). It might be worth checking out this strategy, though, since it's a pretty cool technique!
And, finally, one last option. If you do some Googling, you'll find that your problem is closely related to finding the nth tribonacci number. There are some cool formulas you can use to compute this directly, also involving powers of numbers, though they might introduce some rounding errors that slow things down a bit too much for your purposes.

For input 53798080 it is taking 1 second.
I highly doubt this. Your code stack overflows on this input. What I believe is going on here is that it takes 1 second for input 30 to produce the output 53798080. By input 31, we're up to nearly half a minute.
If we memoize your code:
from functools import lru_cache
#lru_cache
def step(n):
if n == 0 or n == 1:
return 1
if n == 2:
return 2
return step(n-1) + step(n-2) + step(n-3)
It fixes the speed problem, as #templatetypedef explains. But, it blows up with stack overflow (assuming you don't allocate more stack) above input 500. We can double that range, and deal with the speed problem sans memoization, using a more efficient algorithm with fewer recursions:
def step(n, prev1=2, prev2=1, prev3=1):
if 0 <= n <= 1:
return 1
if n == 2:
return prev1
return step(n - 1, prev3 + prev2 + prev1, prev1, prev2)
This will handle input up to 999 and produce a result in a fraction of a second:
> time python3 test.py
1499952522327196729941271196334368245775697491582778125787566254148069690528296568742385996324542810615783529390195412125034236407070760756549390960727215226685972723347839892057807887049540341540394345570010550821354375819311674972209464069786275283520364029575324
0.032u 0.011s 0:00.04 100.0% 0+0k 0+0io 0pf+0w
>
(Adding #lru_cache to this code will reduce the input range back to the original and make no difference speed-wise.)

Related

Find runtime (number of operations of function) and calculate Big O

For the python function given below, I have to find the number of Operations and Big O.
def no_odd_number(list_nums):
i = 0
while i < len(list_nums):
num = list_nums[i]
if num % 2 != 0:
return False
i += 1
return True
From my calculation, the number of operations is 4 + 3n but I'm not sure as I don't know how to deal with if...else statements.
I am also given options to choose the correct Big O from, from my calculation, I think it should be d. O(n) but I'm not sure. Help please!
a. O(n^2)
b. O(1)
c. O(log n)
d. O(n)
e. None of these
Big O notation typically considers the worst case scenario. The function you have is pretty simple, but the early return seems to complicate things. However, since we care about the worst case you can ignore the if block. The worst case will be one where you don't return early. It would be a list like [2,4,6,8], which would run the loop four times.
Now, look at the things inside the while loop, with the above in mind. It doesn't matter how big list_nums is: inside the loop you just increment i and lookup something in a list. Both of those are constant time operations that are the same regardless of how large list_nums is.
The number of times you do this loop is the length of list_nums. This means as list_nums grows, the number of operations grows at the same rate. That makes this O(n) as you suspect.

how to calculate the minimum unfairness sum of a list

I have tried to summarize the problem statement something like this::
Given n, k and an array(a list) arr where n = len(arr) and k is an integer in set (1, n) inclusive.
For an array (or list) myList, The Unfairness Sum is defined as the sum of the absolute differences between all possible pairs (combinations with 2 elements each) in myList.
To explain: if mylist = [1, 2, 5, 5, 6] then Minimum unfairness sum or MUS. Please note that elements are considered unique by their index in list not their values
MUS = |1-2| + |1-5| + |1-5| + |1-6| + |2-5| + |2-5| + |2-6| + |5-5| + |5-6| + |5-6|
If you actually need to look at the problem statement, It's HERE
My Objective
given n, k, arr(as described above), find the Minimum Unfairness Sum out of all of the unfairness sums of sub arrays possible with a constraint that each len(sub array) = k [which is a good thing to make our lives easy, I believe :) ]
what I have tried
well, there is a lot to be added in here, so I'll try to be as short as I can.
My First approach was this where i used itertools.combinations to get all the possible combinations and statistics.variance to check its spread of data (yeah, I know I'm a mess).
Before you see the code below, Do you think these variance and unfairness sum are perfectly related (i know they are strongly related) i.e. the sub array with minimum variance has to be the sub array with MUS??
You only have to check the LetMeDoIt(n, k, arr) function. If you need MCVE, check the second code snippet below.
from itertools import combinations as cmb
from statistics import variance as varn
def LetMeDoIt(n, k, arr):
v = []
s = []
subs = [list(x) for x in list(cmb(arr, k))] # getting all sub arrays from arr in a list
i = 0
for sub in subs:
if i != 0:
var = varn(sub) # the variance thingy
if float(var) < float(min(v)):
v.remove(v[0])
v.append(var)
s.remove(s[0])
s.append(sub)
else:
pass
elif i == 0:
var = varn(sub)
v.append(var)
s.append(sub)
i = 1
final = []
f = list(cmb(s[0], 2)) # getting list of all pairs (after determining sub array with least MUS)
for r in f:
final.append(abs(r[0]-r[1])) # calculating the MUS in my messy way
return sum(final)
The above code works fine for n<30 but raised a MemoryError beyond that.
In Python chat, Kevin suggested me to try generator which is memory efficient (it really is), but as generator also generates those combination on the fly as we iterate over them, it was supposed to take over 140 hours (:/) for n=50, k=8 as estimated.
I posted the same as a question on SO HERE (you might wanna have a look to understand me properly - it has discussions and an answer by fusion which takes me to my second approach - a better one(i should say fusion's approach xD)).
Second Approach
from itertools import combinations as cmb
def myvar(arr): # a function to calculate variance
l = len(arr)
m = sum(arr)/l
return sum((i-m)**2 for i in arr)/l
def LetMeDoIt(n, k, arr):
sorted_list = sorted(arr) # i think sorting the array makes it easy to get the sub array with MUS quickly
variance = None
min_variance_sub = None
for i in range(n - k + 1):
sub = sorted_list[i:i+k]
var = myvar(sub)
if variance is None or var<variance:
variance = var
min_variance_sub=sub
final = []
f = list(cmb(min_variance_sub, 2)) # again getting all possible pairs in my messy way
for r in f:
final.append(abs(r[0] - r[1]))
return sum(final)
def MainApp():
n = int(input())
k = int(input())
arr = list(int(input()) for _ in range(n))
result = LetMeDoIt(n, k, arr)
print(result)
if __name__ == '__main__':
MainApp()
This code works perfect for n up to 1000 (maybe more), but terminates due to time out (5 seconds is the limit on online judge :/ ) for n beyond 10000 (the biggest test case has n=100000).
=====
How would you approach this problem to take care of all the test cases in given time limits (5 sec) ? (problem was listed under algorithm & dynamic programming)
(for your references you can have a look on
successful submissions(py3, py2, C++, java) on this problem by other candidates - so that you can
explain that approach for me and future visitors)
an editorial by the problem setter explaining how to approach the question
a solution code by problem setter himself (py2, C++).
Input data (test cases) and expected output
Edit1 ::
For future visitors of this question, the conclusions I have till now are,
that variance and unfairness sum are not perfectly related (they are strongly related) which implies that among a lots of lists of integers, a list with minimum variance doesn't always have to be the list with minimum unfairness sum. If you want to know why, I actually asked that as a separate question on math stack exchange HERE where one of the mathematicians proved it for me xD (and it's worth taking a look, 'cause it was unexpected)
As far as the question is concerned overall, you can read answers by archer & Attersson below (still trying to figure out a naive approach to carry this out - it shouldn't be far by now though)
Thank you for any help or suggestions :)
You must work on your list SORTED and check only sublists with consecutive elements. This is because BY DEFAULT, any sublist that includes at least one element that is not consecutive, will have higher unfairness sum.
For example if the list is
[1,3,7,10,20,35,100,250,2000,5000] and you want to check for sublists with length 3, then solution must be one of [1,3,7] [3,7,10] [7,10,20] etc
Any other sublist eg [1,3,10] will have higher unfairness sum because 10>7 therefore all its differences with rest of elements will be larger than 7
The same for [1,7,10] (non consecutive on the left side) as 1<3
Given that, you only have to check for consecutive sublists of length k which reduces the execution time significantly
Regarding coding, something like this should work:
def myvar(array):
return sum([abs(i[0]-i[1]) for i in itertools.combinations(array,2)])
def minsum(n, k, arr):
res=1000000000000000000000 #alternatively make it equal with first subarray
for i in range(n-k):
res=min(res, myvar(l[i:i+k]))
return res
I see this question still has no complete answer. I will write a track of a correct algorithm which will pass the judge. I will not write the code in order to respect the purpose of the Hackerrank challenge. Since we have working solutions.
The original array must be sorted. This has a complexity of O(NlogN)
At this point you can check consecutive sub arrays as non-consecutive ones will result in a worse (or equal, but not better) "unfairness sum". This is also explained in archer's answer
The last check passage, to find the minimum "unfairness sum" can be done in O(N). You need to calculate the US for every consecutive k-long subarray. The mistake is recalculating this for every step, done in O(k), which brings the complexity of this passage to O(k*N). It can be done in O(1) as the editorial you posted shows, including mathematic formulae. It requires a previous initialization of a cumulative array after step 1 (done in O(N) with space complexity O(N) too).
It works but terminates due to time out for n<=10000.
(from comments on archer's question)
To explain step 3, think about k = 100. You are scrolling the N-long array and the first iteration, you must calculate the US for the sub array from element 0 to 99 as usual, requiring 100 passages. The next step needs you to calculate the same for a sub array that only differs from the previous by 1 element 1 to 100. Then 2 to 101, etc.
If it helps, think of it like a snake. One block is removed and one is added.
There is no need to perform the whole O(k) scrolling. Just figure the maths as explained in the editorial and you will do it in O(1).
So the final complexity will asymptotically be O(NlogN) due to the first sort.

Combinations of expressions with 4 elementary operations

I could not come up with a better title, for an adequate one might require the whole explanation. Also, combinations could be misleading since the problem will involve permutations.
What I want to accomplish is to outperform a brute force approach in Python at the following problem: Given the 4 elementary operations [+,-,*,/] and the digits from 1 to 9, and given all the possible combinations of 5 digits and the 4 operations without repetition (permutations) that result in a given number (treated as an integer), as in 1+5*9-3/7=45, 1-2/3+9*5=45,... obtain all the integers from the lowest possible value to the highest possible value and find out wether all the integers in the space expanse exist.
My preliminary attempt with brute force is the following:
def brute_force(target):
temp = 0
x = [i for i in range(1,10)]
numbers = [str(i) for i in x]
operators = ["+","-","*","/"]
for values in permutations(numbers,5):
for oper in permutations(operators):
formula = "".join(o + v for o,v in zip([""]+list(oper),values))
if round(eval(formula)) == int(target):
temp += 1
if temp > 0:
return True
else:
return False
for i in range(-100,100):
total = brute_force(i)
if total:
print(i)
else:
print(str(i) + 'No')
It just prints 'No' besides the integers that have not been found. As may seem obvious, all integer values can be found in the space expanse, ranging between -71 to 79.
I am sort of a newcommer both with Python and with algorithmic implementation, but I think that the algorithm has complexity O(n!), judging by the fact that permutations are involved. But if that is not the case I nevertheless want an algorithm that performs better (such as recursion or dynamic programming).
Let's compute the set of possible results only once (and in a bit simpler and faster way):
expression = [None] * 9
results = {eval(''.join(expression))
for expression[::2] in permutations('123456789', 5)
for expression[1::2] in permutations('+-*/')}
It computes all possible results in about 4.5 seconds on my laptop. Yours rewritten like this takes about 5.5 seconds. Both of which are much faster than your way of redoing all calculations for every target integer.
Using that results set, we can then answer questions instantaneously, confirming your range and showing that only -70 and 78 are missing:
>>> min(results), max(results)
(-70.71428571428571, 78.83333333333333)
>>> set(range(-70, 79)) - results
{-70, 78}
First of all, let's look at the expression analytically. You have three terms: a product P (A*B), a quotient Q (A/B), and a scalar S. You combine these with an addition and a subtraction.
Two of the terms are positive; the other is negative, so we can simply negate one of the three terms (P, Q, S) and take the sum. This cuts down the combinatorics.
Multiplication is commutative; w.l.o.g, we can assume A>B, which cuts the permutations in half.
Here are my suggestion for first efficiency:
First choose the terms of the product with A>B; 36 combinations
Then choose S from the remaining 7 digits; 7*36=252 combinations
From the last 6 digits, the possible quotients range from less-than-1 through max_digit / min_digit. Group these into equivalence classes (one set for addition, one for subtraction), rather than running through all 30 permutations. This gives us roughly 6 values per case; we now have ~1500 combinations of three terms.
For each of these combinations, we have 3 possible choices for which one to negate; total is ~4500 sums.
Is that enough improvement for a start?
Thanks to Heap Overflow for pointing out the data flow case I missed (this is professionally embarrassing :-) ).
The case A*B/C+D-E is not covered above. The approach is comparable.
First choose the terms of the product with A>B; 36 combinations
Then choose C from the remaining 7 digits; 7*36=252 combinations
There are only 38 total possible quotients; you can generate these as you wish, but with so few combinations, brute-force is also reasonable.
From the last 6 digits, you have 30 combinations, but half of them are negations of the other half. Choose D>E to start and merely make a second pass for the negative ones. Don't bother to check for duplicated differences; it's not worth the time.
You now have less than 38 quotients to combine with a quantity of differences (min 5, max 8, mean almost 7).
As it happens, a bit of examination of the larger cases (quotients with divisor of 1) and the remaining variety of digits will demonstrate that this method will cover all of the integers in the range -8 through 77, inclusive. You cannot remove 3 large numbers from the original 9 digits without leaving numbers whose difference omits needed intervals.
If you're allowed to include that analysis in your coding, you can shorten this part by reversing the search. You demonstrate the coverage for the large cases {48, 54, 56, 63, 72}, demonstrate the gap-filling for smaller quotients, and then you can search with less complication for the cases in my original posting, enjoying the knowledge that you need only 78, 79, and numbers less than -8.
I think you just need to find the permutations ONCE. Create a set out of all the possible sums. And then just do a lookup. Still sort of brute force but saves you a lot of repeated calculations.
def find_all_combinations():
x = [i for i in range(1,10)]
output_set = set()
numbers = [str(i) for i in x]
operators = ["+","-","*","/"]
print("Starting Calculations", end="")
for values in permutations(numbers,5):
for oper in permutations(operators):
formula = "".join(o + v for o,v in zip([""]+list(oper),values))
# Add all the possible outputs to a set
output_set.add(round(eval(formula)))
print(".", end="")
return output_set
output = find_all_combinations()
for i in range(-100,100):
if i in output:
print(i)
else:
print(str(i) + 'No')

my code is giving me the wrong output sometimes, how to solve it?

I am trying to solve this problem: 'Your task is to construct a building which will be a pile of n cubes. The cube at the bottom will have a volume of n^3, the cube above will have the volume of (n-1)^3 and so on until the top which will have a volume of 1^3.
You are given the total volume m of the building. Being given m can you find the number n of cubes you will have to build?
The parameter of the function findNb (find_nb, find-nb, findNb) will be an integer m and you have to return the integer n such as n^3 + (n-1)^3 + ... + 1^3 = m if such a n exists or -1 if there is no such n.'
I tried to first create an arithmetic sequence then transform it into a sigma sum with the nth term of the arithmetic sequence, the get a formula which I can compare its value with m.
I used this code and work 70 - 80% fine, most of the calculations that it does are correct, but some don't.
import math
def find_nb(m):
n = 0
while n < 100000:
if (math.pow(((math.pow(n, 2))+n), 2)) == 4*m:
return n
break
n += 1
return -1
print(find_nb(4183059834009))
>>> output 2022, which is correct
print(find_nb(24723578342962))
>>> output -1, which is also correct
print(find_nb(4837083252765022010))
>>> real output -1, which is incorrect
>>> expected output 57323
As mentioned, this is a math problem, which is mainly what I am better at :).
Sorry for the in-line mathematical formula as I cannot do any math formula rendering (in SO).
I do not see any problem with your code and I believe your sample test case is wrong. However, I'll still give optimisation "tricks" below for your code to run quicker
Firstly as you know, sum of the cubes between 1^3 and n^3 is n^2(n+1)^2/4. Therefore we want to find integer solutions for the equation
n^2(n+1)^2/4 == m i.e. n^4+2n^3+n^2 - 4m=0
Running a loop for n from 1 (or in your case, 2021) to 100000 is inefficient. Firstly, if m is a large number (1e100+) the complexity of your code is O(n^0.25). Considering Python's runtime, you can run your code in time only if m is less than around 1e32.
To optimise your code, you have two approaches.
1) Use Binary Search. I will not get into the details here, but basically, you can halve the search range for a simple comparison. For the initial bounds you can use lower = 0 & upper = k. A better bound for k will be given below, but let's use k = m for now.
Complexity: O(log(k)) = O(log(m))
Feasible range for m: m < 10^(3e7)
2) Use the almighty Newton-Raphson!! Using the iteration formula x_(n+1) = x_n - f(x_n) / f'(x_n), where f'(x) can be calculated explicitly, and a reasonable initial guess, let's say k = m again, the complexity is (I believe) O(log(k)) + O(1) = O(log(m)).
Complexity: O(log(k)) = O(log(m))
Feasible range for m: m < 10^(3e7)
Finally, I'll give a better initial guess for k in the above methods, also given in Ian's answer to this question. Since n^4+2n^3+n^2 = O(n^4), we can actually take k ~ m^0.25 = (m^0.5)^0.5. To calculate this, We can take k = 2^(log(k)/4) where log is base 2. The log should be O(1), but I'm not sure for big numbers/dynamic size (int in Python). Not a theorist. Using this better guess and Newton-Raphson, since the guess is in a constant range from the result, the algorithm is nearly O(1). Again, check out the links for better understanding.
Finally
Since your goal is to find whether n exists such that the equation is "exactly satisfied", use Newton-Raphson and iterate until the next guess is less than 0.5 from the current guess. If your implementation is "floppy", you can also do a range +/- 10 from the guess to ensure that you find the solution.
I think this is a Math question rather than a programming question.
Firstly, I would advise you to start iterating from a function of your input m. Right now you are initialising your n value arbitrarily (though of course it might be a requirement of the question) but I think there are ways to optimise it. Maybe, just maybe you can iterate from the cube root, so if n reaches zero or if at any point the sum becomes smaller than m you can safely assume there is no possible building that can be built.
Secondly, the equation you derived from your summation doesn't seem to be correct. I substituted your expected n and input m into the condition in your if clause and they don't match. So either 1) your equation is wrong or 2) the expected output is wrong. I suggest that you relook at your derivation of the condition. Are you using the sum of cubes factorisation? There might be some edge cases that you neglected (maybe odd n) but my Math is rusty so I can't help much.
Of course, as mentioned, the break is unnecessary and will never be executed.

Is Haskell's laziness an elegant alternative to Python's generators?

In a programming exercise, it was first asked to program the factorial function and then calculate the sum: 1! + 2! + 3! +... n! in O(n) multiplications (so we can't use the factorial directly). I am not searching the solution to this specific (trivial) problem, I'm trying to explore Haskell abilities and this problem is a toy I would like to play with.
I thought Python's generators could be a nice solution to this problem. For example :
from itertools import islice
def ifact():
i , f = 1, 1
yield 1
while True:
f *= i
i += 1
yield f
def sum_fact(n):
return sum(islice(ifact(),5))
Then I've tried to figure out if there was something in Haskell having a similar behavior than this generator and I thought that laziness do all the staff without any additional concept.
For example, we could replace my Python ifact with
fact = scan1 (*) [1..]
And then solve the exercise with the following :
sum n = foldl1 (+) (take n fact)
I wonder if this solution is really "equivalent" to Python's one regarding time complexity and memory usage. I would say that Haskell's solution never store all the list fact since their elements are used only once.
Am I right or totally wrong ?
EDIT :
I should have check more precisely:
Prelude> foldl1 (+) (take 4 fact)
33
Prelude> :sprint fact
fact = 1 : 2 : 6 : 24 : _
So (my implementation of) Haskell store the result, even if it's no longer used.
Indeed, lazy lists can be used this way. There are some subtle differences though:
Lists are data structures. So you can keep them after evaluating them, which can be both good and bad (you can avoid recomputation of values and to recursive tricks as #ChrisDrost described, at the cost of keeping memory unreleased).
Lists are pure. In generators you can have computations with side effects, you can't do that with lists (which is often desirable).
Since Haskell is a lazy language, laziness is everywhere and if you just convert a program from an imperative language to Haskell, the memory requirements can change considerably (as #RomanL describes in his answer).
But Haskell offers more advanced tools to accomplish the generator/consumer pattern. Currently there are three libraries that focus on this problem: pipes, conduit and iteratees. My favorite is conduit, it's easy to use and the complexity of its types is kept low.
They have several advantages, in particular that you can create complex pipelines and you can base them on a chosen monad, which allows you to say what side effects are allowed in a pipeline.
Using conduit, your example could be expressed as follows:
import Data.Functor.Identity
import Data.Conduit
import qualified Data.Conduit.List as C
ifactC :: (Num a, Monad m) => Producer m a
ifactC = loop 1 1
where
loop r n = let r' = r * n
in yield r' >> loop r' (n + 1)
sumC :: (Num a, Monad m) => Consumer a m a
sumC = C.fold (+) 0
main :: IO ()
main = (print . runIdentity) (ifactC $= C.isolate 5 $$ sumC)
-- alternatively running the pipeline in IO monad directly:
-- main = (ifactC $= C.isolate 5 $$ sumC) >>= print
Here we create a Producer (a conduit that consumes no input) that yields factorials indefinitely. Then we compose it with isolate, which ensures that no more than a given number of values are propagated through it, and then we compose it with a Consumer that just sums values and returns the result.
Your examples are not equivalent in memory usage. It is easy to see if you replace * with a + (so that the numbers don't get big too quickly) and then run both examples on a big n such as 10^7. Your Haskell version will consume a lot of memory and python will keep it low.
Python generator will not generate a list of values then sum it up. Instead, the sum function will get values one-by-one from the generator and accumulate them. Thus, the memory usage will remain constant.
Haskell will evaluate functions lazily, but in order to calculate say foldl1 (+) (take n fact) it will have to evaluate the complete expression. For large n this will unfold into a huge expression the same way as (foldl (+) 0 [0..n]) does. For more details on evaluation and reduction have a look here: https://www.haskell.org/haskellwiki/Foldr_Foldl_Foldl%27.
You can fix your sum n by using foldl1' instead of foldl1 as described on the link above. As #user2407038 explained in his comment, you'd also need to keep fact local. The following works in GHC with a constant memory use:
let notfact = scanl1 (+) [1..]
let n = 20000000
let res = foldl' (+) 0 (take n notfact)
Note that in case of the actual factorial in place of notfact memory considerations are less of a concern. The numbers will get big quickly, arbitrary-precision arithmetic will slow things down so you won't be able get to big values of n in order to actually see the difference.
Basically, yes: Haskell's lazy-lists are a lot like Python's generators, if those generators were effortlessly cloneable, cacheable, and composeable. Instead of raising StopIteration you return [] from your recursive function, which can thread state into the generator.
They do some cooler stuff due to self-recursion. For example, your factorial generator is more idiomatically generated like:
facts = 1 : zipWith (*) facts [1..]
or the Fibonaccis as:
fibs = 1 : 1 : zipWith (+) fibs (tail fibs)
In general any iterative loop can be converted to a recursive algorithm by promoting the loop-state to arguments of a function and then calling it recursively to get the next loop cycle. Generators are just like that, but we prepend some elements each iteration of the recursive function, `go ____ = (stuff) : go ____.
The perfect equivalent is therefore:
ifact :: [Integer]
ifact = go 1 1
where go f i = f : go (f * i) (i + 1)
sum_fact n = sum (take n ifact)
In terms of what's fastest, the absolute fastest in Haskell will probably be the "for loop":
sum_fact n = go 1 1 1
where go acc fact i
| i <= n = go (acc + fact) (fact * i) (i + 1)
| otherwise = acc
The fact that this is "tail-recursive" (a call of go does not pipe any sub-calls to go to another function like (+) or (*)) means that the compiler can package it into a really tight loop, and that's why I'm comparing it with "for loops" even though that's not really a native idea to Haskell.
The above sum_fact n = sum (take n ifact) is a little slower than this but faster than sum (take n facts) where facts is defined with zipWith. The speed differences are not very large and I think mostly just come down to memory allocations that don't get used again.

Categories

Resources