I have a NumPy array with the following properties:
shape: (9986080, 2)
dtype: np.float32
I have a method that loops over the range of the array, performs an operation and then inputs result to new array:
def foo(arr):
new_arr = np.empty(arr.size, dtype=np.uint64)
for i in range(arr.size):
x, y = arr[i]
e, n = ''
if x < 0:
e = '1'
else:
w = '2'
if y > 0:
n = '3'
else:
s = '4'
new_arr[i] = int(f'{abs(x)}{e}{abs(y){n}'.replace('.', ''))
I agree with Iguananaut's comment that this data structure seems a bit odd. My biggest problem with it is that it is really tricky to try and vectorize the putting together of integers in a string and then re-converting that to an integer. Still, this will certainly help speed up the function:
def foo(arr):
x_values = arr[:,0]
y_values = arr[:,1]
ones = np.ones(arr.shape[0], dtype=np.uint64)
e = np.char.array(np.where(x_values < 0, ones, ones * 2))
n = np.char.array(np.where(y_values < 0, ones * 3, ones * 4))
x_values = np.char.array(np.absolute(x_values))
y_values = np.char.array(np.absolute(y_values))
x_values = np.char.replace(x_values, '.', '')
y_values = np.char.replace(y_values, '.', '')
new_arr = np.char.add(np.char.add(x_values, e), np.char.add(y_values, n))
return new_arr.astype(np.uint64)
Here, the x and y values of the input array are first split up. Then we use a vectorized computation to determine where e and n should be 1 or 2, 3 or 4. The last line uses a standard list comprehension to do the string merging bit, which is still undesirably slow for super large arrays but faster than a regular for loop. Also vectorizing the previous computations should speed the function up hugely.
Edit:
I was mistaken before. Numpy does have a nice way of handling string concatenation using the np.char.add() method. This requires converting x_values and y_values to Numpy character arrays using np.char.array(). Also for some reason, the np.char.add() method only takes two arrays as inputs, so it is necessary to first concatenate x_values and e and y_values and n and then concatenate these results. Still, this vectorizes the computations and should be pretty fast. The code is still a bit clunky because of the rather odd operation you are after, but I think this will help you speed up the function greatly.
You may use np.apply_along_axis. When you feed this function with another function that takes row (or column) as an argument, it does what you want to do.
For you case, You may rewrite the function as below:
def foo(row):
x, y = row
e, n = ''
if x < 0:
e = '1'
else:
w = '2'
if y > 0:
n = '3'
else:
s = '4'
return int(f'{abs(x)}{e}{abs(y){n}'.replace('.', ''))
# Where you want to you use it.
new_arr = np.apply_along_axis(foo, 1, n)
I programmed class which looks something like this:
import numpy as np
class blank():
def __init__(self,a,b,c):
self.a=a
self.b=b
self.c=c
n=5
c=a/b*8
if (a>b):
y=c+a*b
else:
y=c-a*b
p = np.empty([1,1])
k = np.empty([1,1])
l = np.empty([1,1])
p[0]=b
k[0]=b*(c-1)
l[0]=p+k
for i in range(1, n, 1):
p=np.append(p,l[i-1])
k=np.append(k,(p[i]*(c+1)))
l=np.append(l,p[i]+k[i])
komp = np.zeros(shape=(n, 1))
for i in range(0, n):
pl_avg = (p[i] + l[i]) / 2
h=pl_avg*3
komp[i]=pl_avg*h/4
self.tot=komp+l
And when I call it like this:
from ex1 import blank
import numpy as np
res=blank(1,2,3)
print(res.tot)
everything works well.
BUT I want to call it like this:
res = blank(np.array([1,2,3]), np.array([3,4,5]), 3)
Is there an easy way to call it for each i element of this two arrays without editing class code?
You won't be able to instantiate a class with NumPy arrays as inputs without changing the class code. #PabloAlvarez and #NagaKiran already provided alternative: iterate with zip over arrays and instantiate class for each pair of elements. While this is pretty simple solution, it defeats the purpose of using NumPy with its efficient vectorized operations.
Here is how I suggest you to rewrite the code:
from typing import Union
import numpy as np
def total(a: Union[float, np.ndarray],
b: Union[float, np.ndarray],
n: int = 5) -> np.array:
"""Calculates what your self.tot was"""
bc = 8 * a
c = bc / b
vectorized_geometric_progression = np.vectorize(geometric_progression,
otypes=[np.ndarray])
l = np.stack(vectorized_geometric_progression(bc, c, n))
l = np.atleast_2d(l)
p = np.insert(l[:, :-1], 0, b, axis=1)
l = np.squeeze(l)
p = np.squeeze(p)
pl_avg = (p + l) / 2
komp = np.array([0.75 * pl_avg ** 2]).T
return komp + l
def geometric_progression(bc, c, n):
"""Calculates array l"""
return bc * np.logspace(start=0,
stop=n - 1,
num=n,
base=c + 2)
And you can call it both for sole numbers and NumPy arrays like that:
>>> print(total(1, 2))
[[2.6750000e+01 6.6750000e+01 3.0675000e+02 1.7467500e+03 1.0386750e+04]
[5.9600000e+02 6.3600000e+02 8.7600000e+02 2.3160000e+03 1.0956000e+04]
[2.1176000e+04 2.1216000e+04 2.1456000e+04 2.2896000e+04 3.1536000e+04]
[7.6205600e+05 7.6209600e+05 7.6233600e+05 7.6377600e+05 7.7241600e+05]
[2.7433736e+07 2.7433776e+07 2.7434016e+07 2.7435456e+07 2.7444096e+07]]
>>> print(total(3, 4))
[[1.71000000e+02 3.39000000e+02 1.68300000e+03 1.24350000e+04 9.84510000e+04]
[8.77200000e+03 8.94000000e+03 1.02840000e+04 2.10360000e+04 1.07052000e+05]
[5.59896000e+05 5.60064000e+05 5.61408000e+05 5.72160000e+05 6.58176000e+05]
[3.58318320e+07 3.58320000e+07 3.58333440e+07 3.58440960e+07 3.59301120e+07]
[2.29323574e+09 2.29323590e+09 2.29323725e+09 2.29324800e+09 2.29333402e+09]]
>>> print(total(np.array([1, 3]), np.array([2, 4])))
[[[2.67500000e+01 6.67500000e+01 3.06750000e+02 1.74675000e+03 1.03867500e+04]
[1.71000000e+02 3.39000000e+02 1.68300000e+03 1.24350000e+04 9.84510000e+04]]
[[5.96000000e+02 6.36000000e+02 8.76000000e+02 2.31600000e+03 1.09560000e+04]
[8.77200000e+03 8.94000000e+03 1.02840000e+04 2.10360000e+04 1.07052000e+05]]
[[2.11760000e+04 2.12160000e+04 2.14560000e+04 2.28960000e+04 3.15360000e+04]
[5.59896000e+05 5.60064000e+05 5.61408000e+05 5.72160000e+05 6.58176000e+05]]
[[7.62056000e+05 7.62096000e+05 7.62336000e+05 7.63776000e+05 7.72416000e+05]
[3.58318320e+07 3.58320000e+07 3.58333440e+07 3.58440960e+07 3.59301120e+07]]
[[2.74337360e+07 2.74337760e+07 2.74340160e+07 2.74354560e+07 2.74440960e+07]
[2.29323574e+09 2.29323590e+09 2.29323725e+09 2.29324800e+09 2.29333402e+09]]]
You can see that results are in compliance.
Explanation:
First of all I'd like to note that your calculation of p, k, and l doesn't have to be in the loop. Moreover, calculating k is unnecessary. If you see carefully, how elements of p and l are calculated, they are just geometric progressions (except the 1st element of p):
p = [b, b*c, b*c*(c+2), b*c*(c+2)**2, b*c*(c+2)**3, b*c*(c+2)**4, ...]
l = [b*c, b*c*(c+2), b*c*(c+2)**2, b*c*(c+2)**3, b*c*(c+2)**4, b*c*(c+2)**5, ...]
So, instead of that loop, you can use np.logspace. Unfortunately, np.logspace doesn't support base parameter as an array, so we have no other choice but to use np.vectorize which is just a loop under the hood...
Calculating of komp though is easily vectorized. You can see it in my example. No need for loops there.
Also, as I already noted in a comment, your class doesn't have to be a class, so I took a liberty of changing it to a function.
Next, note that input parameter c is overwritten, so I got rid of it. Variable y is never used. (Also, you could calculate it just as y = c + a * b * np.sign(a - b))
And finally, I'd like to remark that creating NumPy arrays with np.append is very inefficient (as it was pointed out by #kabanus), so you should always try to create them at once - no loops, no appending.
P.S.: I used np.atleast_2d and np.squeeze in my code and it could be unclear why I did it. They are necessary to avoid if-else clauses where we would check dimensions of array l. You can print intermediate results to see what is really going on there. Nothing difficult.
if it is just calling class with two different list elements, loop can satisfies well
res = [blank(i,j,3) for i,j in zip(np.array([1,2,3]),np.array([3,4,5]))]
You can see list of values for res variable
The only way I can think of iterating lists of arrays is by using a function on the main program for iteration and then do the operations you need to do inside the loop.
This solution works for each element of both arrays (note to use zip function for making the iteration in both lists if they have a small size as listed in this answer here):
for n,x in zip(np.array([1,2,3]),np.array([3,4,5])):
res=blank(n,x,3)
print(res.tot)
Hope it is what you need!
I know there is nothing wrong with writing with proper function structure, but I would like to know how can I find nth fibonacci number with most Pythonic way with a one-line.
I wrote that code, but It didn't seem to me best way:
>>> fib = lambda n:reduce(lambda x, y: (x[0]+x[1], x[0]), [(1,1)]*(n-2))[0]
>>> fib(8)
13
How could it be better and simplier?
fib = lambda n:reduce(lambda x,n:[x[1],x[0]+x[1]], range(n),[0,1])[0]
(this maintains a tuple mapped from [a,b] to [b,a+b], initialized to [0,1], iterated N times, then takes the first tuple element)
>>> fib(1000)
43466557686937456435688527675040625802564660517371780402481729089536555417949051
89040387984007925516929592259308032263477520968962323987332247116164299644090653
3187938298969649928516003704476137795166849228875L
(note that in this numbering, fib(0) = 0, fib(1) = 1, fib(2) = 1, fib(3) = 2, etc.)
(also note: reduce is a builtin in Python 2.7 but not in Python 3; you'd need to execute from functools import reduce in Python 3.)
A rarely seen trick is that a lambda function can refer to itself recursively:
fib = lambda n: n if n < 2 else fib(n-1) + fib(n-2)
By the way, it's rarely seen because it's confusing, and in this case it is also inefficient. It's much better to write it on multiple lines:
def fibs():
a = 0
b = 1
while True:
yield a
a, b = b, a + b
I recently learned about using matrix multiplication to generate Fibonacci numbers, which was pretty cool. You take a base matrix:
[1, 1]
[1, 0]
and multiply it by itself N times to get:
[F(N+1), F(N)]
[F(N), F(N-1)]
This morning, doodling in the steam on the shower wall, I realized that you could cut the running time in half by starting with the second matrix, and multiplying it by itself N/2 times, then using N to pick an index from the first row/column.
With a little squeezing, I got it down to one line:
import numpy
def mm_fib(n):
return (numpy.matrix([[2,1],[1,1]])**(n//2))[0,(n+1)%2]
>>> [mm_fib(i) for i in range(20)]
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181]
This is a closed expression for the Fibonacci series that uses integer arithmetic, and is quite efficient.
fib = lambda n:pow(2<<n,n+1,(4<<2*n)-(2<<n)-1)%(2<<n)
>> fib(1000)
4346655768693745643568852767504062580256466051737178
0402481729089536555417949051890403879840079255169295
9225930803226347752096896232398733224711616429964409
06533187938298969649928516003704476137795166849228875L
It computes the result in O(log n) arithmetic operations, each acting on integers with O(n) bits. Given that the result (the nth Fibonacci number) is O(n) bits, the method is quite reasonable.
It's based on genefib4 from http://fare.tunes.org/files/fun/fibonacci.lisp , which in turn was based on an a less efficient closed-form integer expression of mine (see: http://paulhankin.github.io/Fibonacci/)
If we consider the "most Pythonic way" to be elegant and effective then:
def fib(nr):
return int(((1 + math.sqrt(5)) / 2) ** nr / math.sqrt(5) + 0.5)
wins hands down. Why use a inefficient algorithm (and if you start using memoization we can forget about the oneliner) when you can solve the problem just fine in O(1) by approximation the result with the golden ratio? Though in reality I'd obviously write it in this form:
def fib(nr):
ratio = (1 + math.sqrt(5)) / 2
return int(ratio ** nr / math.sqrt(5) + 0.5)
More efficient and much easier to understand.
This is a non-recursive (anonymous) memoizing one liner
fib = lambda x,y=[1,1]:([(y.append(y[-1]+y[-2]),y[-1])[1] for i in range(1+x-len(y))],y[x])[1]
fib = lambda n, x=0, y=1 : x if not n else fib(n-1, y, x+y)
run time O(n), fib(0) = 0, fib(1) = 1, fib(2) = 1 ...
I'm Python newcomer, but did some measure for learning purposes. I've collected some fibo algorithm and took some measure.
from datetime import datetime
import matplotlib.pyplot as plt
from functools import wraps
from functools import reduce
from functools import lru_cache
import numpy
def time_it(f):
#wraps(f)
def wrapper(*args, **kwargs):
start_time = datetime.now()
f(*args, **kwargs)
end_time = datetime.now()
elapsed = end_time - start_time
elapsed = elapsed.microseconds
return elapsed
return wrapper
#time_it
def fibslow(n):
if n <= 1:
return n
else:
return fibslow(n-1) + fibslow(n-2)
#time_it
#lru_cache(maxsize=10)
def fibslow_2(n):
if n <= 1:
return n
else:
return fibslow_2(n-1) + fibslow_2(n-2)
#time_it
def fibfast(n):
if n <= 1:
return n
a, b = 0, 1
for i in range(1, n+1):
a, b = b, a + b
return a
#time_it
def fib_reduce(n):
return reduce(lambda x, n: [x[1], x[0]+x[1]], range(n), [0, 1])[0]
#time_it
def mm_fib(n):
return (numpy.matrix([[2, 1], [1, 1]])**(n//2))[0, (n+1) % 2]
#time_it
def fib_ia(n):
return pow(2 << n, n+1, (4 << 2 * n) - (2 << n)-1) % (2 << n)
if __name__ == '__main__':
X = range(1, 200)
# fibslow_times = [fibslow(i) for i in X]
fibslow_2_times = [fibslow_2(i) for i in X]
fibfast_times = [fibfast(i) for i in X]
fib_reduce_times = [fib_reduce(i) for i in X]
fib_mm_times = [mm_fib(i) for i in X]
fib_ia_times = [fib_ia(i) for i in X]
# print(fibslow_times)
# print(fibfast_times)
# print(fib_reduce_times)
plt.figure()
# plt.plot(X, fibslow_times, label='Slow Fib')
plt.plot(X, fibslow_2_times, label='Slow Fib w cache')
plt.plot(X, fibfast_times, label='Fast Fib')
plt.plot(X, fib_reduce_times, label='Reduce Fib')
plt.plot(X, fib_mm_times, label='Numpy Fib')
plt.plot(X, fib_ia_times, label='Fib ia')
plt.xlabel('n')
plt.ylabel('time (microseconds)')
plt.legend()
plt.show()
The result is usually the same.
Fiboslow_2 with recursion and cache, Fib integer arithmetic and Fibfast algorithms seems the best ones. Maybe my decorator not the best thing to measure performance, but for an overview it seemed good.
Another example, taking the cue from Mark Byers's answer:
fib = lambda n,a=0,b=1: a if n<=0 else fib(n-1,b,a+b)
I wanted to see if I could create an entire sequence, not just the final value.
The following will generate a list of length 100. It excludes the leading [0, 1] and works for both Python2 and Python3. No other lines besides the one!
(lambda i, x=[0,1]: [(x.append(x[y+1]+x[y]), x[y+1]+x[y])[1] for y in range(i)])(100)
Output
[1,
2,
3,
...
218922995834555169026,
354224848179261915075,
573147844013817084101]
Here's an implementation that doesn't use recursion, and only memoizes the last two values instead of the whole sequence history.
nthfib() below is the direct solution to the original problem (as long as imports are allowed)
It's less elegant than using the Reduce methods above, but, although slightly different that what was asked for, it gains the ability to to be used more efficiently as an infinite generator if one needs to output the sequence up to the nth number as well (re-writing slightly as fibgen() below).
from itertools import imap, islice, repeat
nthfib = lambda n: next(islice((lambda x=[0, 1]: imap((lambda x: (lambda setx=x.__setitem__, x0_temp=x[0]: (x[1], setx(0, x[1]), setx(1, x0_temp+x[1]))[0])()), repeat(x)))(), n-1, None))
>>> nthfib(1000)
43466557686937456435688527675040625802564660517371780402481729089536555417949051
89040387984007925516929592259308032263477520968962323987332247116164299644090653
3187938298969649928516003704476137795166849228875L
from itertools import imap, islice, repeat
fibgen = lambda:(lambda x=[0,1]: imap((lambda x: (lambda setx=x.__setitem__, x0_temp=x[0]: (x[1], setx(0, x[1]), setx(1, x0_temp+x[1]))[0])()), repeat(x)))()
>>> list(islice(fibgen(),12))
[1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144]
def fib(n):
x =[0,1]
for i in range(n):
x=[x[1],x[0]+x[1]]
return x[0]
take the cue from Jason S, i think my version have a better understanding.
Starting Python 3.8, and the introduction of assignment expressions (PEP 572) (:= operator), we can use and update a variable within a list comprehension:
fib = lambda n,x=(0,1):[x := (x[1], sum(x)) for i in range(n+1)][-1][0]
This:
Initiates the duo n-1 and n-2 as a tuple x=(0, 1)
As part of a list comprehension looping n times, x is updated via an assignment expression (x := (x[1], sum(x))) to the new n-1 and n-2 values
Finally, we return from the last iteration, the first part of the x
To solve this problem I got inspired by a similar question here in Stackoverflow Single Statement Fibonacci, and I got this single line function that can output a list of fibonacci sequence. Though, this is a Python 2 script, not tested on Python 3:
(lambda n, fib=[0,1]: fib[:n]+[fib.append(fib[-1] + fib[-2]) or fib[-1] for i in range(n-len(fib))])(10)
assign this lambda function to a variable to reuse it:
fib = (lambda n, fib=[0,1]: fib[:n]+[fib.append(fib[-1] + fib[-2]) or fib[-1] for i in range(n-len(fib))])
fib(10)
output is a list of fibonacci sequence:
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
I don't know if this is the most pythonic method but this is the best i could come up with:->
Fibonacci = lambda x,y=[1,1]:[1]*x if (x<2) else ([y.append(y[q-1] + y[q-2]) for q in range(2,x)],y)[1]
The above code doesn't use recursion, just a list to store the values.
My 2 cents
# One Liner
def nthfibonacci(n):
return long(((((1+5**.5)/2)**n)-(((1-5**.5)/2)**n))/5**.5)
OR
# Steps
def nthfibonacci(nth):
sq5 = 5**.5
phi1 = (1+sq5)/2
phi2 = -1 * (phi1 -1)
n1 = phi1**(nth+1)
n2 = phi2**(nth+1)
return long((n1 - n2)/sq5)
Why not use a list comprehension?
from math import sqrt, floor
[floor(((1+sqrt(5))**n-(1-sqrt(5))**n)/(2**n*sqrt(5))) for n in range(100)]
Without math imports, but less pretty:
[int(((1+(5**0.5))**n-(1-(5**0.5))**n)/(2**n*(5**0.5))) for n in range(100)]
import math
sqrt_five = math.sqrt(5)
phi = (1 + sqrt_five) / 2
fib = lambda n : int(round(pow(phi, n) / sqrt_five))
print([fib(i) for i in range(1, 26)])
single line lambda fibonacci but with some extra variables
Similar:
def fibonacci(n):
f=[1]+[0]
for i in range(n):
f=[sum(f)] + f[:-1]
print f[1]
A simple Fibonacci number generator using recursion
fib = lambda x: 1-x if x < 2 else fib(x-1)+fib(x-2)
print fib(100)
This takes forever to calculate fib(100) in my computer.
There is also closed form of Fibonacci numbers.
fib = lambda n: int(1/sqrt(5)*((1+sqrt(5))**n-(1-sqrt(5))**n)/2**n)
print fib(50)
This works nearly up to 72 numbers due to precision problem.
Lambda with logical operators
fibonacci_oneline = lambda n = 10, out = []: [ out.append(i) or i if i <= 1 else out.append(out[-1] + out[-2]) or out[-1] for i in range(n)]
here is how i do it ,however the function returns None for the list comprehension line part to allow me to insert a loop inside ..
so basically what it does is appending new elements of the fib seq inside of a list which is over two elements
>>f=lambda list,x :print('The list must be of 2 or more') if len(list)<2 else [list.append(list[-1]+list[-2]) for i in range(x)]
>>a=[1,2]
>>f(a,7)
You can generate once a list with some values and use as needed:
fib_fix = []
fib = lambda x: 1 if x <=2 else fib_fix[x-3] if x-2 <= len(fib_fix) else (fib_fix.append(fib(x-2) + fib(x-1)) or fib_fix[-1])
fib_x = lambda x: [fib(n) for n in range(1,x+1)]
fib_100 = fib_x(100)
than for example:
a = fib_fix[76]