This is the problem:
Write a recursive function f that generates the sequence 0, 1, 0.5, 0.75, 0.625, 0.6875, 0.65625, 0.671875, 0.6640625, 0.66796875. The first two terms are 0 and 1; each subsequent term is the average of the two previous terms.
>>> f(0)
0
>>> f(1)
1
>>> f(2)
0.5
>>> [(i,f(i)) for i in range(10)]
[(0, 0), (1, 1), (2, 0.5), (3, 0.75), (4, 0.625), (5, 0.6875), (6, 0.65625), (7, 0.671875), (8, 0.6640625), (9, 0.66796875)]
This is my code so far; I can't seem to figure it out. Any help/suggestions would be appreciated.
def f(n):
    if n == 0:
        return 0
    if n == 1:
        return 1
    else:
        return f(n-2)//f(n-1)
The recursive case is wrong:
return f(n-2)//f(n-1)  # not the average of the two previous terms; also, // is floor division, not true division
The average of two numbers a and b is (a+b)/2. So you can define your function as:
def f(n):
    if n == 0:
        return 0
    if n == 1:
        return 1
    else:
        return (f(n-1) + f(n-2)) / 2
Or we can make it Python version-independent (in Python 2, / between two integers is floor division, so we multiply by 0.5 instead of dividing by 2) like:
def f(n):
    if n == 0:
        return 0
    if n == 1:
        return 1
    else:
        return 0.5 * (f(n-1) + f(n-2))
You can then generate the sequence with a list comprehension like:
>>> [f(i) for i in range(10)]
[0, 1, 0.5, 0.75, 0.625, 0.6875, 0.65625, 0.671875, 0.6640625, 0.66796875]
or by using a map:
>>> list(map(f,range(10)))
[0, 1, 0.5, 0.75, 0.625, 0.6875, 0.65625, 0.671875, 0.6640625, 0.66796875]
So you have U0 = 0, U1 = 1, and Un = (U(n-1) + U(n-2)) / 2 for n > 1.
You just have to literally translate this as a function:
def f(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return (f(n-1) + f(n-2)) / 2
Now for generating the sequence of the Ui from 0 to n:
def generate_sequence(n):
    return [f(i) for i in range(n)]
This could (and really should) be optimized with memoization. Basically, you just have to store the previously computed results in a dictionary (though you could directly use a list instead in this case).
results = dict()

def memoized_f(n):
    if n in results:
        return results[n]
    if n < 2:
        results[n] = n  # f(0) = 0 and f(1) = 1
    else:
        # recurse through the memoized version, so each value is computed once
        results[n] = 0.5 * (memoized_f(n - 1) + memoized_f(n - 2))
    return results[n]
This way, each value is computed only once per n (note that the recursive calls must go through memoized_f; otherwise the cache only helps across top-level calls).
As a bonus, when memoized_f(n) is called, the results dictionary holds the values of f(i) from 0 to at least n.
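For reference, the standard library can also do this for you: functools.lru_cache is a built-in memoization decorator. A minimal sketch, equivalent in effect to the dictionary above (not part of the original answer):
from functools import lru_cache

@lru_cache(maxsize=None)  # cache every computed value of f
def f(n):
    if n < 2:
        return n
    return 0.5 * (f(n - 1) + f(n - 2))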
Recursion à la auxiliary function
You can define this using a simple auxiliary procedure and a couple of state variables – the following f implementation evolves a linear iterative process.
def f(n):
    def aux(n, a, b):
        if n == 0:
            return a
        else:
            return aux(n - 1, b, 0.5 * (a + b))
    return aux(n, 0, 1)
print([f(x) for x in range(10)])
# [0, 1, 0.5, 0.75, 0.625, 0.6875, 0.65625, 0.671875, 0.6640625, 0.66796875]
Going generic
Or you can genericize the entire process in what I'll call fibx
from functools import reduce
def fibx(op, seed, n):
    [x, *xs] = seed
    if n == 0:
        return x
    else:
        return fibx(op, xs + [reduce(op, xs, x)], n - 1)
Now we could implement (eg) fib using fibx
from operator import add
def fib(n):
    return fibx(add, [0, 1], n)
print([fib(x) for x in range(10)])
# [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
Or we can implement your f using fibx and a custom operator
def f(n):
    return fibx(lambda a, b: 0.5 * (a + b), [0, 1], n)
print([f(x) for x in range(10)])
# [0, 1, 0.5, 0.75, 0.625, 0.6875, 0.65625, 0.671875, 0.6640625, 0.66796875]
Wasted computations
Some answers here are recursing with (eg) 0.5 * (f(n-1) + f(n-2)) which duplicates heaps of work. n values around 40 take astronomically longer (minutes compared to milliseconds) to compute than the methods I've described here.
Look at the tree recursion fib(5) in this example: see how fib(3) and fib(2) are repeated several times? This is owed to a naïve implementation of the fib program. In this particular case, we can easily avoid this duplicated work using the auxiliary looping function (as demonstrated in my answer) or using memoisation (described in another answer)
Tree recursion like this results in exponential running time, whereas the linear iterative recursion in my answer is O(n)
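If you want to see the gap yourself, here is a rough timing sketch (illustrative only; absolute numbers depend on your machine):
import timeit

def f_tree(n):  # naive tree recursion: exponential time
    return n if n < 2 else 0.5 * (f_tree(n - 1) + f_tree(n - 2))

def f_linear(n):  # auxiliary-function recursion: linear time
    def aux(n, a, b):
        return a if n == 0 else aux(n - 1, b, 0.5 * (a + b))
    return aux(n, 0, 1)

print(timeit.timeit(lambda: f_tree(25), number=10))    # on the order of seconds
print(timeit.timeit(lambda: f_linear(25), number=10))  # orders of magnitude smaller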
Generating a sequence for n
Another answer provided by @MadPhysicist generates a sequence for a single n input value – i.e., f(9) will generate a list of the first 10 values. However, the implementation is simultaneously complex and naïve, and wastes heaps of computations due to the same f(n-1) and f(n-2) calls.
A tiny variation on our initial approach can generate the same sequence in a fraction of the time – f(40) using my code will take a fraction of a second, whereas these bad tree-recursion answers would take upwards of 2 minutes.
(The changes from the earlier aux version: an acc parameter accumulates each term as it is produced.)
def f(n):
    def aux(n, acc, a, b):
        if n == 0:
            return acc + [a]
        else:
            return aux(n - 1, acc + [a], b, 0.5 * (a + b))
    return aux(n, [], 0, 1)
print(f(9))
# [0, 1, 0.5, 0.75, 0.625, 0.6875, 0.65625, 0.671875, 0.6640625, 0.66796875]
If you want a function that generates your sequence in a single call, without having to call the function for each element of the list, you can store the values you compute as you unwind the stack:
def f(n, _sequence=None):
    if _sequence is None:
        _sequence = [0] * (n + 1)
    if n == 0 or n == 1:
        val = n
    else:
        f(n - 1, _sequence)
        f(n - 2, _sequence)
        val = 0.5 * (_sequence[n - 1] + _sequence[n - 2])
    _sequence[n] = val
    return _sequence
This has the advantage of not requiring multiple recursions over the same values as you would end up doing with [f(n) for n in range(...)] if f returned a single value.
You can use a more global form of memoization, as suggested by @RightLeg, to record the knowledge between multiple calls.
Unlike the other solutions, this function will actually generate the complete sequence as it goes. E.g., your original example would be:
>>> f(9)
[0, 1, 0.5, 0.75, 0.625, 0.6875, 0.65625, 0.671875, 0.6640625, 0.66796875]
Another simple solution might look like this:
a = 0.0
b = 1.0
count = 0

def f(newavg, second, count):
    avg = (second + newavg) / 2
    print(avg)
    count = count + 1
    if count < 8:
        f(second, avg, count)

f(a, b, count)
Granted, this code just prints to the console; if you want the output in a list, collect the values inside the recursion.
Also be careful to properly indent where required.
Related
In my research I'm trying to tackle the Kolmogorov backward equation, i.e. I'm interested in
$$Af = b(x)f'(x)+\sigma(x)f''(x)$$
With the specific b(x) and \sigma(x), I'm trying to see how fast the coefficients of the expression grow when calculating higher powers of Af. I'm struggling to derive this analytically, and thus tried to see the trend empirically.
First, I used SymPy:
from sympy import *
import matplotlib.pyplot as plt
import re
import math
import numpy as np
import time

np.set_printoptions(suppress=True)
x = Symbol('x')
b = Function('b')(x)
g = Function('g')(x)

def new_coef(gamma, beta, coef_minus2, coef_minus1, coef):
    return expand(simplify(gamma*coef_minus2 + beta*coef_minus1 + 2*gamma*coef_minus1.diff(x)
                           + beta*coef.diff(x) + gamma*coef.diff(x, 2)))

def new_coef_first(gamma, beta, coef):
    return expand(simplify(beta*coef.diff(x) + gamma*coef.diff(x, 2)))

def new_coef_second(gamma, beta, coef_minus1, coef):
    return expand(simplify(beta*coef_minus1 + 2*gamma*coef_minus1.diff(x)
                           + beta*coef.diff(x) + gamma*coef.diff(x, 2)))

def new_coef_last(gamma, beta, coef_minus2):
    return expand(simplify(gamma*coef_minus2))

def new_coef_second_to_last(gamma, beta, coef_minus2, coef_minus1):
    return expand(simplify(gamma*coef_minus2 + beta*coef_minus1 + 2*gamma*coef_minus1.diff(x)))

def set_to_zero(expression):
    expression = expression.subs(Derivative(b, x, x, x), 0)
    expression = expression.subs(Derivative(b, x, x), 0)
    expression = expression.subs(Derivative(g, x, x, x, x), 0)
    expression = expression.subs(Derivative(g, x, x, x), 0)
    return expression
def sum_of_coef(expression):
    sum_of_coef = 0
    for i in str(expression).split(' + '):
        if i[0:1] == '(':
            i = i[1:]
        integers = re.findall(r'\b\d+\b', i)
        if len(integers) > 0:
            length_int = len(integers[0])
            if i[0:length_int] == integers[0]:
                sum_of_coef += int(integers[0])
            else:
                sum_of_coef += 1
        else:
            sum_of_coef += 1
    return sum_of_coef
power = 6
charar = np.zeros((power, power*2), dtype=Symbol)
coef_sum_array = np.zeros((power, power*2))
charar[0, 0] = b
charar[0, 1] = g
coef_sum_array[0, 0] = 1
coef_sum_array[0, 1] = 1
for i in range(1, power):
    # print(i)
    for j in range(0, (i+1)*2):
        # print(j, ':')
        # start_time = time.time()
        if j == 0:
            charar[i, j] = set_to_zero(new_coef_first(g, b, charar[i-1, j]))
        elif j == 1:
            charar[i, j] = set_to_zero(new_coef_second(g, b, charar[i-1, j-1], charar[i-1, j]))
        elif j == (i+1)*2 - 2:
            charar[i, j] = set_to_zero(new_coef_second_to_last(g, b, charar[i-1, j-2], charar[i-1, j-1]))
        elif j == (i+1)*2 - 1:
            charar[i, j] = set_to_zero(new_coef_last(g, b, charar[i-1, j-2]))
        else:
            charar[i, j] = set_to_zero(new_coef(g, b, charar[i-1, j-2], charar[i-1, j-1], charar[i-1, j]))
        # print("--- %s seconds for expression ---" % (time.time() - start_time))
        # start_time = time.time()
        coef_sum_array[i, j] = sum_of_coef(charar[i, j])
        # print("--- %s seconds for coefficients ---" % (time.time() - start_time))
coef_sum_array
Then I looked into automatic differentiation and used autograd:
import autograd.numpy as np
from autograd import grad
import time

np.set_printoptions(suppress=True)

b = lambda x: 1 + x
g = lambda x: 1 + x + x**2

def new_coef(gamma, beta, coef_minus2, coef_minus1, coef):
    return lambda x: gamma(x)*coef_minus2(x) + beta(x)*coef_minus1(x) + 2*gamma(x)*grad(coef_minus1)(x) \
                     + beta(x)*grad(coef)(x) + gamma(x)*grad(grad(coef))(x)

def new_coef_first(gamma, beta, coef):
    return lambda x: beta(x)*grad(coef)(x) + gamma(x)*grad(grad(coef))(x)

def new_coef_second(gamma, beta, coef_minus1, coef):
    return lambda x: beta(x)*coef_minus1(x) + 2*gamma(x)*grad(coef_minus1)(x) \
                     + beta(x)*grad(coef)(x) + gamma(x)*grad(grad(coef))(x)

def new_coef_last(gamma, beta, coef_minus2):
    return lambda x: gamma(x)*coef_minus2(x)

def new_coef_second_to_last(gamma, beta, coef_minus2, coef_minus1):
    return lambda x: gamma(x)*coef_minus2(x) + beta(x)*coef_minus1(x) + 2*gamma(x)*grad(coef_minus1)(x)

power = 6
coef_sum_array = np.zeros((power, power*2))
coef_sum_array[0, 0] = b(1.0)
coef_sum_array[0, 1] = g(1.0)
charar = [b, g]
for i in range(1, power):
    print(i)
    charar_new = []
    for j in range(0, (i+1)*2):
        if j == 0:
            new_funct = new_coef_first(g, b, charar[j])
        elif j == 1:
            new_funct = new_coef_second(g, b, charar[j-1], charar[j])
        elif j == (i+1)*2 - 2:
            new_funct = new_coef_second_to_last(g, b, charar[j-2], charar[j-1])
        elif j == (i+1)*2 - 1:
            new_funct = new_coef_last(g, b, charar[j-2])
        else:
            new_funct = new_coef(g, b, charar[j-2], charar[j-1], charar[j])
        coef_sum_array[i, j] = new_funct(1.0)
        charar_new.append(new_funct)
    charar = charar_new
coef_sum_array
However, I'm not happy with the speed of either of them. I would like to do at least a thousand iterations, but after 3 days of running the SymPy method, I got 30 :/
I expect that the second (numerical) method could be optimized to avoid recalculating expressions every time. Unfortunately, I cannot see that solution myself. I have also tried Maple, but again without luck.
Overview
So, there are two formulas about derivatives that are interesting here:
Faà di Bruno's formula, which is a way to quickly find the n-th derivative of f(g(x)) and looks a lot like the multinomial theorem
The general Leibniz rule, which is a way to quickly find the n-th derivative of f(x)*g(x) and looks a lot like the binomial theorem
Both of these were discussed in pull request #13892, where the n-th derivative was sped up using the general Leibniz rule.
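As a quick illustration of the Leibniz rule, here is a sketch that verifies the identity for a small n with SymPy (a throwaway check, not part of the optimizations below):
from sympy import Function, Symbol, binomial, diff, simplify

x = Symbol('x')
f = Function('f')(x)
g = Function('g')(x)
n = 3
# d^n(f*g)/dx^n == sum_{k=0}^{n} C(n,k) * f^(k) * g^(n-k)
leibniz = sum(binomial(n, k) * f.diff(x, k) * g.diff(x, n - k) for k in range(n + 1))
assert simplify(diff(f * g, x, n) - leibniz) == 0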
I'm trying to see how fast the coefficients of the expression are growing
In your code, the general formula for computing c[i][j] is this:
c[i][j] = g * c[i-1][j-2] + b * c[i-1][j-1] + 2 * g * c'[i-1][j-1] + g * c''[i-1][j]
(where c'[i][j] and c''[i][j] denote the 1st and 2nd derivatives of c[i][j])
Because of this, and by the Leibniz rule mentioned above, I think intuitively, the coefficients computed should be related to Pascal's triangle (or at the very least they should have some combinatorial relation).
Optimization #1
In the original code, the function sum_of_coef(f) serializes the expression f to a string, discards everything that doesn't look like a number, and then sums the remaining numbers.
We can avoid serialization here by just traversing the expression tree and collecting what we need:
def sum_of_coef(f):
    s = 0
    if f.func == Add:
        for sum_term in f.args:
            res = sum_term if sum_term.is_Number else 1
            if len(sum_term.args) == 0:
                s += res
                continue
            first = sum_term.args[0]
            if first.is_Number == True:
                res = first
            else:
                res = 1
            s += res
    elif f.func == Mul:
        first = f.args[0]
        if first.is_Number == True:
            s = first
        else:
            s = 1
    elif f.func == Pow:
        s = 1
    return s
Optimization #2
In the function set_to_zero(expr) all the 2nd and 3rd derivatives of b, and the 3rd and 4th derivatives of g are replaced by zero.
We can collapse all those substitutions into one statement like so:
b3, b2 = b.diff(x, 3), b.diff(x, 2)
g4, g3 = g.diff(x, 4), g.diff(x, 3)

def set_to_zero(expression):
    return expression.subs({b3: 0, b2: 0, g4: 0, g3: 0})
Optimization #3
In the original code, for every cell c[i][j] we're calling simplify. This turns out to have a big impact on performance but actually we can skip this call, because fortunately our expressions are just sums of products of derivatives or unknown functions.
So the line
charar[i,j] = set_to_zero(expand(simplify(expr)))
becomes
charar[i,j] = set_to_zero(expand(expr))
Optimization #4
The following was also tried but turned out to have very little impact.
For two consecutive values of j, we're computing c'[i-1][j-1] twice.
j-1:  c[i-1][j-3]  c[i-1][j-2]  c[i-1][j-1]
j:                 c[i-1][j-2]  c[i-1][j-1]  c[i-1][j]
If you look at the loop formula in the else branch, you see that c'[i-1][j-1] has already been computed. It can be cached, but this optimization has little effect in the SymPy version of the code.
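For completeness, such a cache could look like this (a hypothetical sketch, not code from the original; d1 is a helper name I made up):
first_derivatives = {}

def d1(expr):
    # compute and remember the first derivative of expr,
    # so c'[i-1][j-1] is derived only once per pair of j values
    if expr not in first_derivatives:
        first_derivatives[expr] = expr.diff(x)
    return first_derivatives[expr]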
Here it's also important to mention that it's possible to visualize the call tree of SymPy involved in computing these derivatives. It's actually larger, but here is part of it (call-tree image not reproduced here).
We can also generate a flamegraph using the py-spy module just to see where time is being spent (flamegraph image not reproduced here).
As far as I could tell, 34% of the time is spent in _eval_derivative_n_times, 10% in the function getit from assumptions.py, 12% in subs(..), and 12% in expand(..).
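For reference, a typical py-spy invocation looks like this (the script name here is just a placeholder for wherever you saved the code):
py-spy record -o flamegraph.svg -- python3 compute_coefficients.py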
Optimization #5
Apparently when pull request #13892 was merged into SymPy, it also introduced a performance regression.
In one of the comments regarding that regression, Ondrej Certik recommends using SymEngine to improve performance of code that makes heavy use of derivatives.
So I've ported the code mentioned to SymEngine.py and noticed that it runs 98 times faster than the SymPy version for power=8 (and 4320 times faster for power=30).
The required module can be installed via pip3 install --user symengine.
#!/usr/bin/python3
from symengine import *
import pprint

x = var("x")
b = Function("b")(x)
g = Function("g")(x)
b3, b2 = b.diff(x, 3), b.diff(x, 2)
g4, g3 = g.diff(x, 4), g.diff(x, 3)

def set_to_zero(e):
    e = e.subs({b3: 0, b2: 0, g4: 0, g3: 0})
    return e

def sum_of_coef(f):
    s = 0
    if f.func == Add:
        for sum_term in f.args:
            res = 1
            if len(sum_term.args) == 0:
                s += res
                continue
            first = sum_term.args[0]
            if first.is_Number == True:
                res = first
            else:
                res = 1
            s += res
    elif f.func == Mul:
        first = f.args[0]
        if first.is_Number == True:
            s = first
        else:
            s = 1
    elif f.func == Pow:
        s = 1
    return s

def main():
    power = 8
    charar = [[0] * (power*2) for x in range(power)]
    coef_sum_array = [[0] * (power*2) for x in range(power)]
    charar[0][0] = b
    charar[0][1] = g
    init_printing()
    for i in range(1, power):
        jmax = (i+1)*2
        for j in range(0, jmax):
            c2, c1, c0 = charar[i-1][j-2], charar[i-1][j-1], charar[i-1][j]
            # print(c2, c1, c0)
            if j == 0:
                expr = b*c0.diff(x) + g*c0.diff(x, 2)
            elif j == 1:
                expr = b*c1 + 2*g*c1.diff(x) + b*c0.diff(x) + g*c0.diff(x, 2)
            elif j == jmax - 2:
                expr = g*c2 + b*c1 + 2*g*c1.diff(x)
            elif j == jmax - 1:
                expr = g*c2
            else:
                expr = g*c2 + b*c1 + 2*g*c1.diff(x) + b*c0.diff(x) + g*c0.diff(x, 2)
            charar[i][j] = set_to_zero(expand(expr))
            coef_sum_array[i][j] = sum_of_coef(charar[i][j])
    pprint.pprint(Matrix(coef_sum_array))

main()
Performance after optimization #5
I think it would be very interesting to look at the number of terms in c[i][j] to determine how quickly the expressions are growing. That would definitely help in estimating the complexity of the current code.
But for practical purposes I've plotted the current time and memory consumption of the SymEngine code above and managed to get the following chart:
Both the time and the memory seem to be growing polynomially with the input (the power parameter in the original code).
The same chart but as a log-log plot can be viewed here:
Like the wiki page says, a straight line on a log-log plot corresponds to a monomial. This offers a way to recover the exponent of the monomial.
So if we consider two points N=16 and N=32 between which the log-log plot looks like a straight line
import pandas as pd
from math import log

df = pd.read_csv("modif6_bench.txt", sep=',', header=0)

def find_slope(col1, col2, i1, i2):
    xData = df[col1].to_numpy()
    yData = df[col2].to_numpy()
    x0, x1 = xData[i1], xData[i2]
    y0, y1 = yData[i1], yData[i2]
    m = log(y1/y0) / log(x1/x0)
    return m

print("time slope = {0:0.2f}".format(find_slope("N", "time", 16, 32)))
print("memory slope = {0:0.2f}".format(find_slope("N", "memory", 16, 32)))
Output:
time slope = 5.69
memory slope = 2.62
So a very rough approximation of the time complexity would be O(n^5.69), and an approximation of the space complexity would be O(n^2.62).
There are more details about deciding whether the growth rate is polynomial or exponential here (it involves drawing a semi-log and a log-log plot, and seeing where the data shows up as a straight line).
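A minimal sketch of that check, reusing the benchmark file from above (polynomial growth shows up as a straight line on the log-log axes, exponential growth as a straight line on the semi-log axes):
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("modif6_bench.txt", sep=',', header=0)
fig, (ax1, ax2) = plt.subplots(1, 2)
ax1.semilogy(df["N"], df["time"])  # straight line here => exponential growth
ax1.set_title("semi-log")
ax2.loglog(df["N"], df["time"])    # straight line here => polynomial growth
ax2.set_title("log-log")
plt.show()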
Performance with defined b and g functions
In the first original code block, the functions b and g were undefined functions. This means SymPy and SymEngine didn't know anything about them.
The 2nd original code block defines b=1+x and g=1+x+x**2. If we run all of this again with known b and g, the code runs much faster, and both the running-time curve and the memory-usage curve are better than with unknown functions:
time slope = 2.95
memory slope = 1.35
Recorded data fitting onto known growth-rates
I wanted to look a bit more into matching the observed resource consumption (time and memory), so I wrote the following Python module that fits each growth rate (from a catalog of common growth rates) to the recorded data, and then shows the plot to the user.
It can be installed via pip3 install --user matchgrowth
When run like this:
match-growth.py --infile ./tests/modif7_bench.txt --outfile time.png --col1 N --col2 time --top 1
It produces graphs of the resource usage, as well as the closest growth rates it matches to. In this case, it finds the polynomial growth to be closest:
Other notes
If you run this for power=8 (in the symengine code mentioned above) the coefficients will look like this:
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[1, 5, 4, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[1, 17, 40, 31, 9, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[1, 53, 292, 487, 330, 106, 16, 1, 0, 0, 0, 0, 0, 0, 0, 0]
[1, 161, 1912, 6091, 7677, 4693, 1520, 270, 25, 1, 0, 0, 0, 0, 0, 0]
[1, 485, 11956, 68719, 147522, 150706, 83088, 26573, 5075, 575, 36, 1, 0, 0, 0, 0]
[1, 1457, 73192, 735499, 2568381, 4118677, 3528928, 1772038, 550620, 108948, 13776, 1085, 49, 1, 0, 0]
[1, 4373, 443524, 7649215, 42276402, 102638002, 130209104, 96143469, 44255170, 13270378, 2658264, 358890, 32340, 1876, 64, 1]
So as it turns out, the 2nd column coincides with A048473 which according to OEIS is "The number of triangles (of all sizes, including holes) in Sierpiński's triangle after n inscriptions".
All the code for this is also available in this repo.
Relations between polynomial coefficients from the i-th line with coefficients from the (i-1)-th line
In the previous post c[i][j] was calculated. It's possible to check that deg(c[i][j]) = j+1.
This can be checked by initializing a separate 2d array, and computing the degree like so:
deg[i][j] = degree(poly(parse_expr(str(charar[i][j]))))
Vertical formulas:
Then if we denote by u(i,j,k) the coefficient of x^k in c[i][j], we can try to find formulas for u(i,j,k) in terms of u(i-1,_,_). Formulas for u(i,j,_) will be the same as formulas for u(i+1,j,_) (and all following rows), so there's some opportunity there for caching.
Horizontal formulas:
It's also interesting that, when we fix i, the formulas for u(i,j,_) look the same as they do for u(i,j+1,_), except for the last 3 values of k. But I'm not sure if this can be leveraged.
The caching steps mentioned above might help to skip unnecessary computations.
See more about this here.
Some notes about analytical, closed-form solutions and asymptotics
I'm struggling to derive this analytically
Yes, this seems to be hard. The closest class of recursive sequences related to the one mentioned here are called Holonomic sequences (also called D-finite or P-recursive). The sequence c[i][j] is not C-finite because it has polynomial coefficients (in the general case even the asymptotics of recurrences with polynomial coefficients is an open problem).
However, the recurrence relation for c[i][j] does not qualify for this because of the derivatives. If we were to leave out the derivatives in the formula of c[i][j] then it would qualify as a Holonomic sequence. Here are some places where I found solutions for these:
"The Concrete Tetrahedron: Symbolic Sums, Recurrence Equations, Generating Functions, Asymptotic Estimates" by Kauers and Paule - Chapter 7 Holonomic Sequences and Power Series
Analytic Combinatorics by Flajolet and Sedgewick - Appendix B.4 Holonomic Functions
But also c[i][j] is a several variable recurrence, so that's another reason why it doesn't fit into that theory mentioned above.
There is however another book called Analytic Combinatorics in Several Variables by Robin Pemantle and Mark C. Wilson which does handle several variable recurrences.
All the books mentioned above require a lot of complex analysis, and they go much beyond the little math that I currently know, so hopefully someone with a more solid understanding of that kind of math can try this out.
The most advanced CAS that has generating-function-related operations and can operate on this kind of sequence is Maple with the gfun package (see the gfun repo; for now it only handles the univariate case).
Lets say I have a list of numbers: [0.13,0.53,2.83]
I want these numbers to round UP to the nearest x=0.5 (or any other value for x)
This list would then become, given x=0.5: [0.5,1,3].
I tried some things with % but nothing I tried seemed to work.
Any ideas?
Harry
EDIT: the other posts want to know the nearest value, so 1.6 would become 1.5, but in my case it should become 2
You need math.ceil:
import math

numbers = [0.13, 0.53, 2.83]
x = 0.5

def roundup(numbers, x):
    return [math.ceil(number / x) * x for number in numbers]

roundup(numbers, x)
# returns [0.5, 1.0, 3.0]
If a function suits your need, then the function below works for positive numbers. "x" is the number you want to round, and thresh is the value (between 0 and 1) used to round up.
def rounding_function(x, thresh):
    if x == int(x):
        return x
    else:
        float_part = x - int(x)
        if float_part <= thresh:
            return int(x) + thresh
        else:
            return int(x) + 1
Which gives the following result:
l = [0, 0.13,0.53, 0.61, 2.83, 3]
print([rounding_function(x, 0.5) for x in l]) # [0, 0.5, 1, 1, 3, 3]
print([rounding_function(x, 0.6) for x in l]) # [0, 0.6, 0.6, 1, 3, 3]
Here's a general solution. (Don't forget to handle all the "weird input"; e.g. negative incr, negative x, etc.)
import math

def increment_ceil(x, incr=1):
    """
    Return the smallest float greater than or equal to x
    that is divisible by incr.
    """
    if incr == 0:
        return float(x)
    else:
        return float(math.ceil(x / incr) * incr)
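For example:
>>> increment_ceil(2.83, 0.5)
3.0
>>> increment_ceil(1.6, 0.5)
2.0
>>> increment_ceil(-1.3, 0.5)
-1.0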
I want to write a bottom-up Fibonacci using O(1) space. My problem is that Python's recursion stack is limiting me from testing large numbers. Could someone provide an alternative or an optimization to what I have? This is my code:
def fib_in_place(n):
    def fibo(f2, f1, i):
        if i < 1:
            return f2
        else:
            return fibo(f1, f2 + f1, i - 1)
    return fibo(0, 1, n)
Using recursion this way means you're using O(N) space, not O(1) - the O(N) is in the stack.
Why use recursion at all?
def fib(n):
    a, b = 0, 1
    for i in range(n):
        a, b = b, a + b
    return a
You can memoize the Fibonacci function for efficiency, but if you require a recursive function, it's still going to take at least O(n):
def mem_fib(n, _cache={}):
    '''efficiently memoized recursive function, returns a Fibonacci number'''
    if n in _cache:
        return _cache[n]
    elif n > 1:
        return _cache.setdefault(n, mem_fib(n-1) + mem_fib(n-2))
    return n
This is from my answer on the main Fibonacci in Python question: How to write the Fibonacci Sequence in Python
If you're allowed to use iteration instead of recursion, you should do this:
def fib():
    a, b = 0, 1
    while True:          # First iteration:
        yield a          # yield 0 to start with and then
        a, b = b, a + b  # a will now be 1, and b will also be 1, (0 + 1)
usage:
>>> list(zip(range(10), fib()))
[(0, 0), (1, 1), (2, 1), (3, 2), (4, 3), (5, 5), (6, 8), (7, 13), (8, 21), (9, 34)]
If you just want to get the nth number:
def get_fib(n):
    fib_gen = fib()
    for _ in range(n):
        next(fib_gen)
    return next(fib_gen)
and usage
>>> get_fib(10)
55
Why use iteration at all?
import math

def fib(n):
    phi_1 = (1 + math.sqrt(5)) / 2
    phi_2 = (1 - math.sqrt(5)) / 2  # note the sign: using (sqrt(5) - 1) / 2 here would misround odd n
    f = (phi_1**n - phi_2**n) / math.sqrt(5)
    return round(f)
The algebraic result is exact; the round operation is only to allow for digital representation inaccuracy.
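If you want to see where that inaccuracy first bites, here is a quick sketch comparing the closed form against an exact iterative version (with IEEE doubles, the first mismatch typically shows up around n = 70):
import math

def fib_closed(n):
    sqrt5 = math.sqrt(5)
    # |psi^n| / sqrt(5) < 0.5, so in exact arithmetic rounding phi^n / sqrt(5) suffices
    return round(((1 + sqrt5) / 2) ** n / sqrt5)

a, b, n = 0, 1, 0
while fib_closed(n) == a:
    a, b, n = b, a + b, n + 1
print("first float mismatch at n =", n)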
Tail-recursive definitions are easily turned into iterative definitions. If necessary, flip the condition so that the tail-recursive call is in the 'if' branch.
def fibo(f2, f1, i):
    if i > 0:
        return fibo(f1, f2 + f1, i - 1)
    else:
        return f2
Then turn 'if' into 'while', replace return with unpacking assignment of the new arguments, and (optionally) drop 'else'.
def fibo(f2, f1, i):
    while i > 0:
        f2, f1, i = f1, f2 + f1, i - 1
    return f2
With iteration, you do not need the nested definition.
def fib_efficient(n):
    if n < 0:
        raise ValueError('fib argument n cannot be negative')
    new, old = 0, 1
    while n:
        new, old = old, old + new
        n -= 1
    return new
Local names 'new' and 'old' refer to Fibonacci's use of biological reproduction to motivate the sequence. However, the story works better with yeast cells instead of rabbits. Old, mature yeast cells reproduce by budding off new, immature cells. (The original source of the function in India appears to be Virahanka counting the number of ways to make a Sanskrit poetic line with n beats from an ordered sequence of 1- and 2-beat syllables.)
I'm trying to solve a problem in TalentBuddy using Python
The problem is :
Given a number N. Print to the standard output the total number of
subsets that can be formed using the {1,2..N} set, but making sure
that none of the subsets contain any two consecutive integers. The
final count might be very large, this is why you must print the result
modulo 524287.
I've written the code. All of the tests are OK except Test 6: I get an OverflowError when the test submits 10000000 as the argument of my function. I don't know what I should do to resolve this error.
My code:
import math

def count_subsets(n):
    step1 = (1 / math.sqrt(5)) * (((1 + math.sqrt(5)) / 2) ** (n + 2))
    step2 = (1 / math.sqrt(5)) * (((1 - math.sqrt(5)) / 2) ** (n + 2))
    res = step1 - step2
    print int(res) % 524287
I guess this is taking up a lot of memory. I wrote this after I found a mathematical formula for the same problem on the Internet.
I guess my code isn't Pythonic at all.
How do I do this the "Pythonic" way, and how do I resolve the OverflowError?
EDIT: In the problem, I've given the example input 3, and the result (output) is 5.
Explanation: The 5 sets are, {}, {1}, {2}, {3}, {1,3}.
However, this is what Test 6 gives me:
Summary for test #6
Input test:
[10000000]
Expected output:
165366
Your output:
Traceback (most recent call last):
On line 4, in function count_subsets:
step1 = (1 / math.sqrt(5)) * (((1 + math.sqrt(5)) / 2) ** (n + 2))
OverflowError:
Let F(N) be the number of subsets of {1,..,N} that contain no consecutive numbers. There are F(N-2) subsets that contain N, and F(N-1) subsets that don't contain N. This gives:
F(N) = F(N-1) + F(N-2).
F(0) = 1 (there's 1 subset of {}, namely {}).
F(1) = 2 (there's 2 subsets of {1}, namely {} and {1}).
This is the Fibonacci sequence, albeit with non-standard starting conditions.
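You can sanity-check this recurrence by brute force for small N (a throwaway sketch using itertools, not needed for the real solution):
from itertools import combinations

def count_brute(n):
    total = 0
    for r in range(n + 1):
        for subset in combinations(range(1, n + 1), r):
            # keep only subsets with no two consecutive integers
            if all(b - a > 1 for a, b in zip(subset, subset[1:])):
                total += 1
    return total

print([count_brute(n) for n in range(1, 8)])
# [2, 3, 5, 8, 13, 21, 34]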
There is, as you've found, a formula using the golden ratio to calculate this. The problem is that for large N, you need more and more accuracy in your floating-point calculation.
An exact way to do the calculation is to use iteration:
a_0 = 1
b_0 = 2
a_{n+1} = b_n
b_{n+1} = a_n + b_n
The naive version of this is easy but slow.
def subsets(n, modulo):
    a, b = 1, 2
    for _ in xrange(n):
        a, b = b, (a + b) % modulo
    return a
Instead, a standard trick is to write the repeated application of the recurrences as a matrix power:
( a_N )   | 0 1 |^N ( 1 )
( b_N ) = | 1 1 |   ( 2 )
You can compute the matrix power (using modulo-524287 arithmetic) by repeated squaring. See Exponentiation by squaring. Here's complete code:
def mul2x2(a, b, modulo):
    result = [[0, 0], [0, 0]]
    for i in xrange(2):
        for j in xrange(2):
            for k in xrange(2):
                result[i][j] += a[i][k] * b[k][j]
            result[i][j] %= modulo
    return result

def pow(m, n, modulo):
    result = [[1, 0], [0, 1]]
    while n:
        if n % 2: result = mul2x2(result, m, modulo)
        m = mul2x2(m, m, modulo)
        n //= 2
    return result

def subsets(n):
    m = pow([[0, 1], [1, 1]], n, 524287)
    return (m[0][0] + 2 * m[0][1]) % 524287

for i in xrange(1, 10):
    print i, subsets(i)
for i in xrange(1, 20):
    print i, subsets(10 ** i)
This prints solutions for every power of 10 up to 10^19, and it's effectively instant (0.041sec real on my laptop).
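As an aside, the same O(log N) behavior can be had without explicit matrices via the fast-doubling identities F(2k) = F(k)*(2*F(k+1) - F(k)) and F(2k+1) = F(k)^2 + F(k+1)^2. A sketch in Python 3 (here a_n = F(n+2) in standard Fibonacci indexing, hence the n + 2 below):
def fib_pair(n, modulo):
    # returns (F(n), F(n+1)) mod modulo using fast doubling
    if n == 0:
        return (0, 1)
    a, b = fib_pair(n // 2, modulo)
    c = a * (2 * b - a) % modulo
    d = (a * a + b * b) % modulo
    if n % 2:
        return (d, (c + d) % modulo)
    return (c, d)

def subsets_fd(n):
    return fib_pair(n + 2, 524287)[0]

print(subsets_fd(3))         # 5, matching the worked example
print(subsets_fd(10000000))  # expected: 165366, the Test 6 answer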
I know there is nothing wrong with writing with proper function structure, but I would like to know how I can find the nth Fibonacci number in the most Pythonic way, with a one-liner.
I wrote this code, but it didn't seem like the best way to me:
>>> fib = lambda n:reduce(lambda x, y: (x[0]+x[1], x[0]), [(1,1)]*(n-2))[0]
>>> fib(8)
13
How could it be better and simpler?
fib = lambda n:reduce(lambda x,n:[x[1],x[0]+x[1]], range(n),[0,1])[0]
(this maintains a tuple mapped from [a,b] to [b,a+b], initialized to [0,1], iterated N times, then takes the first tuple element)
>>> fib(1000)
43466557686937456435688527675040625802564660517371780402481729089536555417949051
89040387984007925516929592259308032263477520968962323987332247116164299644090653
3187938298969649928516003704476137795166849228875L
(note that in this numbering, fib(0) = 0, fib(1) = 1, fib(2) = 1, fib(3) = 2, etc.)
(also note: reduce is a builtin in Python 2.7 but not in Python 3; you'd need to execute from functools import reduce in Python 3.)
A rarely seen trick is that a lambda function can refer to itself recursively:
fib = lambda n: n if n < 2 else fib(n-1) + fib(n-2)
By the way, it's rarely seen because it's confusing, and in this case it is also inefficient. It's much better to write it on multiple lines:
def fibs():
    a = 0
    b = 1
    while True:
        yield a
        a, b = b, a + b
I recently learned about using matrix multiplication to generate Fibonacci numbers, which was pretty cool. You take a base matrix:
[1, 1]
[1, 0]
and multiply it by itself N times to get:
[F(N+1), F(N)]
[F(N), F(N-1)]
This morning, doodling in the steam on the shower wall, I realized that you could cut the running time in half by starting with the second matrix, and multiplying it by itself N/2 times, then using N to pick an index from the first row/column.
With a little squeezing, I got it down to one line:
import numpy

def mm_fib(n):
    return (numpy.matrix([[2, 1], [1, 1]]) ** (n // 2))[0, (n + 1) % 2]
>>> [mm_fib(i) for i in range(20)]
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181]
This is a closed expression for the Fibonacci series that uses integer arithmetic, and is quite efficient.
fib = lambda n:pow(2<<n,n+1,(4<<2*n)-(2<<n)-1)%(2<<n)
>> fib(1000)
4346655768693745643568852767504062580256466051737178
0402481729089536555417949051890403879840079255169295
9225930803226347752096896232398733224711616429964409
06533187938298969649928516003704476137795166849228875L
It computes the result in O(log n) arithmetic operations, each acting on integers with O(n) bits. Given that the result (the nth Fibonacci number) is O(n) bits, the method is quite reasonable.
It's based on genefib4 from http://fare.tunes.org/files/fun/fibonacci.lisp , which in turn was based on a less efficient closed-form integer expression of mine (see: http://paulhankin.github.io/Fibonacci/)
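A quick way to convince yourself the one-liner is right is to compare it against a plain iterative version (a throwaway check, not part of the original answer):
fib = lambda n: pow(2 << n, n + 1, (4 << 2 * n) - (2 << n) - 1) % (2 << n)

def fib_iter(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

assert all(fib(n) == fib_iter(n) for n in range(500))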
If we consider the "most Pythonic way" to be elegant and effective, then:
import math

def fib(nr):
    return int(((1 + math.sqrt(5)) / 2) ** nr / math.sqrt(5) + 0.5)
wins hands down. Why use an inefficient algorithm (and if you start using memoization we can forget about the one-liner) when you can solve the problem just fine in O(1) by approximating the result with the golden ratio? Though in reality I'd obviously write it in this form:
def fib(nr):
    ratio = (1 + math.sqrt(5)) / 2
    return int(ratio ** nr / math.sqrt(5) + 0.5)
More efficient and much easier to understand.
This is a non-recursive (anonymous) memoizing one-liner:
fib = lambda x,y=[1,1]:([(y.append(y[-1]+y[-2]),y[-1])[1] for i in range(1+x-len(y))],y[x])[1]
fib = lambda n, x=0, y=1 : x if not n else fib(n-1, y, x+y)
run time O(n), fib(0) = 0, fib(1) = 1, fib(2) = 1 ...
I'm a Python newcomer, but I did some measuring for learning purposes. I've collected some Fibonacci algorithms and took some measurements.
from datetime import datetime
import matplotlib.pyplot as plt
from functools import wraps
from functools import reduce
from functools import lru_cache
import numpy


def time_it(f):
    @wraps(f)
    def wrapper(*args, **kwargs):
        start_time = datetime.now()
        f(*args, **kwargs)
        end_time = datetime.now()
        elapsed = end_time - start_time
        elapsed = elapsed.microseconds
        return elapsed
    return wrapper


@time_it
def fibslow(n):
    if n <= 1:
        return n
    else:
        return fibslow(n-1) + fibslow(n-2)


@time_it
@lru_cache(maxsize=10)
def fibslow_2(n):
    if n <= 1:
        return n
    else:
        return fibslow_2(n-1) + fibslow_2(n-2)


@time_it
def fibfast(n):
    if n <= 1:
        return n
    a, b = 0, 1
    for i in range(1, n+1):
        a, b = b, a + b
    return a


@time_it
def fib_reduce(n):
    return reduce(lambda x, n: [x[1], x[0]+x[1]], range(n), [0, 1])[0]


@time_it
def mm_fib(n):
    return (numpy.matrix([[2, 1], [1, 1]])**(n//2))[0, (n+1) % 2]


@time_it
def fib_ia(n):
    return pow(2 << n, n+1, (4 << 2 * n) - (2 << n) - 1) % (2 << n)


if __name__ == '__main__':
    X = range(1, 200)
    # fibslow_times = [fibslow(i) for i in X]
    fibslow_2_times = [fibslow_2(i) for i in X]
    fibfast_times = [fibfast(i) for i in X]
    fib_reduce_times = [fib_reduce(i) for i in X]
    fib_mm_times = [mm_fib(i) for i in X]
    fib_ia_times = [fib_ia(i) for i in X]
    # print(fibslow_times)
    # print(fibfast_times)
    # print(fib_reduce_times)
    plt.figure()
    # plt.plot(X, fibslow_times, label='Slow Fib')
    plt.plot(X, fibslow_2_times, label='Slow Fib w cache')
    plt.plot(X, fibfast_times, label='Fast Fib')
    plt.plot(X, fib_reduce_times, label='Reduce Fib')
    plt.plot(X, fib_mm_times, label='Numpy Fib')
    plt.plot(X, fib_ia_times, label='Fib ia')
    plt.xlabel('n')
    plt.ylabel('time (microseconds)')
    plt.legend()
    plt.show()
The results are usually the same. fibslow_2 (recursion with a cache), fib_ia (integer arithmetic), and fibfast seem to be the best ones. Maybe my decorator isn't the best way to measure performance, but for an overview it seemed good.
Another example, taking the cue from Mark Byers's answer:
fib = lambda n,a=0,b=1: a if n<=0 else fib(n-1,b,a+b)
I wanted to see if I could create an entire sequence, not just the final value.
The following will generate a list of length 100. It excludes the leading [0, 1] and works for both Python2 and Python3. No other lines besides the one!
(lambda i, x=[0,1]: [(x.append(x[y+1]+x[y]), x[y+1]+x[y])[1] for y in range(i)])(100)
Output
[1,
2,
3,
...
218922995834555169026,
354224848179261915075,
573147844013817084101]
Here's an implementation that doesn't use recursion, and only memoizes the last two values instead of the whole sequence history.
nthfib() below is the direct solution to the original problem (as long as imports are allowed)
It's less elegant than using the reduce methods above but, although slightly different from what was asked for, it gains the ability to be used more efficiently as an infinite generator if one needs to output the sequence up to the nth number as well (rewritten slightly as fibgen() below).
from itertools import imap, islice, repeat
nthfib = lambda n: next(islice((lambda x=[0, 1]: imap((lambda x: (lambda setx=x.__setitem__, x0_temp=x[0]: (x[1], setx(0, x[1]), setx(1, x0_temp+x[1]))[0])()), repeat(x)))(), n-1, None))
>>> nthfib(1000)
43466557686937456435688527675040625802564660517371780402481729089536555417949051
89040387984007925516929592259308032263477520968962323987332247116164299644090653
3187938298969649928516003704476137795166849228875L
from itertools import imap, islice, repeat
fibgen = lambda:(lambda x=[0,1]: imap((lambda x: (lambda setx=x.__setitem__, x0_temp=x[0]: (x[1], setx(0, x[1]), setx(1, x0_temp+x[1]))[0])()), repeat(x)))()
>>> list(islice(fibgen(),12))
[1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144]
def fib(n):
    x = [0, 1]
    for i in range(n):
        x = [x[1], x[0] + x[1]]
    return x[0]
Taking the cue from Jason S, I think my version is easier to understand.
Starting Python 3.8, and the introduction of assignment expressions (PEP 572) (:= operator), we can use and update a variable within a list comprehension:
fib = lambda n,x=(0,1):[x := (x[1], sum(x)) for i in range(n+1)][-1][0]
This:
Initiates the duo n-1 and n-2 as a tuple x=(0, 1)
As part of a list comprehension looping n times, x is updated via an assignment expression (x := (x[1], sum(x))) to the new n-1 and n-2 values
Finally, we return the first element of x from the last iteration
To solve this problem I got inspired by a similar question here on Stack Overflow, Single Statement Fibonacci, and came up with this single-line function that can output a list of the Fibonacci sequence. Though this is a Python 2 script, not tested on Python 3:
(lambda n, fib=[0,1]: fib[:n]+[fib.append(fib[-1] + fib[-2]) or fib[-1] for i in range(n-len(fib))])(10)
assign this lambda function to a variable to reuse it:
fib = (lambda n, fib=[0,1]: fib[:n]+[fib.append(fib[-1] + fib[-2]) or fib[-1] for i in range(n-len(fib))])
fib(10)
The output is a list of the Fibonacci sequence:
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
I don't know if this is the most Pythonic method, but this is the best I could come up with:
Fibonacci = lambda x,y=[1,1]:[1]*x if (x<2) else ([y.append(y[q-1] + y[q-2]) for q in range(2,x)],y)[1]
The above code doesn't use recursion, just a list to store the values.
My 2 cents
# One-liner
def nthfibonacci(n):
    return long(((((1+5**.5)/2)**n) - (((1-5**.5)/2)**n)) / 5**.5)
OR
# Steps
def nthfibonacci(nth):
    sq5 = 5**.5
    phi1 = (1+sq5)/2
    phi2 = -1 * (phi1 - 1)
    n1 = phi1**(nth+1)
    n2 = phi2**(nth+1)
    return long((n1 - n2)/sq5)
Why not use a list comprehension?
from math import sqrt, floor
[floor(((1+sqrt(5))**n-(1-sqrt(5))**n)/(2**n*sqrt(5))) for n in range(100)]
Without math imports, but less pretty:
[int(((1+(5**0.5))**n-(1-(5**0.5))**n)/(2**n*(5**0.5))) for n in range(100)]
import math
sqrt_five = math.sqrt(5)
phi = (1 + sqrt_five) / 2
fib = lambda n : int(round(pow(phi, n) / sqrt_five))
print([fib(i) for i in range(1, 26)])
A single-line lambda Fibonacci, but with some extra variables.
Similar:
def fibonacci(n):
    f = [1] + [0]
    for i in range(n):
        f = [sum(f)] + f[:-1]
    print f[1]
A simple Fibonacci number generator using recursion
fib = lambda x: 1-x if x < 2 else fib(x-1)+fib(x-2)
print fib(100)
This takes forever to calculate fib(100) in my computer.
There is also a closed form for Fibonacci numbers.
from math import sqrt
fib = lambda n: int(1/sqrt(5)*((1+sqrt(5))**n-(1-sqrt(5))**n)/2**n)
print fib(50)
This works correctly only up to about the 72nd number, due to floating-point precision.
Lambda with logical operators
fibonacci_oneline = lambda n = 10, out = []: [ out.append(i) or i if i <= 1 else out.append(out[-1] + out[-2]) or out[-1] for i in range(n)]
Here is how I do it; however, the function returns None for the list-comprehension part, which allows me to insert a loop inside.
So basically what it does is append new elements of the Fibonacci sequence to a list that already holds two or more elements.
>>> f = lambda lst, x: print('The list must be of 2 or more') if len(lst) < 2 else [lst.append(lst[-1] + lst[-2]) for i in range(x)]
>>> a = [1, 2]
>>> f(a, 7)
You can generate once a list with some values and use as needed:
fib_fix = []
fib = lambda x: 1 if x <=2 else fib_fix[x-3] if x-2 <= len(fib_fix) else (fib_fix.append(fib(x-2) + fib(x-1)) or fib_fix[-1])
fib_x = lambda x: [fib(n) for n in range(1,x+1)]
fib_100 = fib_x(100)
Then, for example:
a = fib_fix[76]