How to make dictionary multiplication faster? - python

Is there a way to make function f() faster? I run it hundreds of millions of times, so any speed increase would be appreciated. The keys of dictionaries a and b are the same across runs, and w is a constant. The keys are integers and are not evenly distributed in general.
Also, the function is actually defined in a class, so f is f(self) and the variables are self.w, self.ID, self.a, and self.b.
import random
import time

w = 10.25
ID = range(10)
a = {}
b = {}
for i in ID:
    a[i] = random.uniform(0, 1)
    b[i] = random.uniform(0, 1)

def f():
    for i in ID:
        a[i] = b[i] * w

t0 = time.time()
for i in xrange(1000000):
    f()
t1 = time.time()
print t1 - t0

You can get a nice speed-up by localizing the variables in f():
def f(ID=ID, a=a, b=b, w=w):
    for i in ID:
        a[i] = b[i] * w
Some folks don't like localizing with default arguments in that way, so you can instead build a closure, which also speeds up access to what would otherwise be global variables:
def make_f(a, b, w, ID):
    def f():
        for i in ID:
            a[i] = b[i] * w
    return f

f = make_f(a, b, w, ID)
See this analysis of which types of variable access are fastest.
Algorithmically, there is not much else that can be done. The looping over ID is fast because it just increments reference counts for existing integers. The hashing of the integers is practically instant because the hash of an int is the int itself. Dict lookups themselves are already highly optimized. Likewise, there is no short-cut for multiplying by a constant.
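For reference, a minimal timing sketch along these lines could be used to compare the variants; the names f_global, f_default and f_closure are hypothetical labels for the original function, the default-argument version and the closure version, and are assumed to be defined in the same script:
import timeit

# f_global, f_default and f_closure are assumed names for the three variants above
setup = "from __main__ import f_global, f_default, f_closure"
for name in ("f_global", "f_default", "f_closure"):
    t = timeit.timeit("%s()" % name, setup=setup, number=1000000)
    print("%s: %.3f s" % (name, t))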

Related

Similar subroutine statement

I have some code in Python.
My main code calls a function matrixD:
def matrixD(n,x):
    C=[]
    for i in range(n):
        C += [1.0]
    C[0]=2.0
    C[n-1]=2.0
    D=[[0] * n for i in range(n)]
    for i in range(n):
        for j in range(n):
            if i==j and i!=0 and i!=(n-1):
                D[i][i] = -x[i]/(2.0*(1.0-x[i]**2))
            else:
                if i!=j:
                    D[i][j] = (C[i]/C[j])*(-1.0)**(i+1+j+1)/(x[i]-x[j])
                    # D[i][j] = 2.0
    D[0][0] = (2.0*float(n-1)**2 + 1.0)/6.0
    D[n-1][n-1] = -(2.0*float(n-1)**2 + 1.0)/6.0
    return D
This function returns the value of D (a matrix).
But in this way D is not a global variable.
How can I make D a global variable?
Every time I want to use D I have to call the function again, which is not ideal.
I want something similar to a subroutine in Fortran: you call it just once and then you have a global variable.
*I am very new to Python.
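A minimal sketch of the usual Python approach, assuming n and x are already defined somewhere: call matrixD once at module level and reuse the result, or rebind a module-level name from inside a function with the global statement.
# call once at module level; D is then a module-level ("global") name
D = matrixD(n, x)

# or, if the call has to happen inside a function, rebind the global name:
def build_D(n, x):
    global D            # rebinds the module-level name D
    D = matrixD(n, x)

build_D(n, x)
print(D[0][0])          # D can now be used anywhere in this module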

I am passing a set of N values to a loop but can't get it to print an output

I cannot seem to get an output when I pass numbers to the function. I need to get the computed value and subtract it from the exact value. Is there something I am not getting right?
def f1(x):
    f1 = np.exp(x)
    return f1;

def trapezoid(f,a,b,n):
    '''Computes the integral of functions using the trapezoid rule
    f = function of x
    a = upper limit of the function
    b = lower limit of the function
    N = number of divisions'''
    h = (b-a)/N
    xi = np.linspace(a,b,N+1)
    fi = f(xi)
    s = 0.0
    for i in range(1,N):
        s = s + fi[i]
    s = np.array((h/2)*(fi[0] + fi[N]) + h*s)
    print(s)
    return s

exactValue = np.full((20),math.exp(1)-1)
a = 0.0;b = 1.0 # integration interval [a,b]
computed = np.empty(20)
E=np.zeros(20)
exact=np.zeros(20)
N=20

def convergence_tests(f, a, b, N):
    n = np.zeros(N, 1);
    E = np.zeros(N, 1);
    Exact = math.exp(1)-1
    for i in range(N):
        n[i] = 2^i
        computed[i] = trapezoid(f, a, b, n[i])
    E = abs(Exact - computed)
    print(E, computed)
    return E, computed
You have defined several functions, but your main program never calls any of them. In fact, your "parent" function convergence_tests is never invoked at all, because nothing after its definition at the bottom of the program calls it.
I suggest that you use incremental programming: write a few lines and test them before you proceed to the next mini-task in your code. In the posting you have written about 30 lines of active code without realizing that virtually none of it actually executes. There may well be several other errors in there; you will likely have a difficult time fixing all of them to get the expected output.
Start small and grow incrementally.
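As a rough sketch of that advice (hypothetical driver code, not a fix of the original post): define each piece, test it, and make sure the script actually calls the functions at the bottom.
import math
import numpy as np

def f1(x):
    return np.exp(x)

def trapezoid(f, a, b, n):
    # composite trapezoid rule on n subintervals of [a, b]
    h = (b - a) / n
    xi = np.linspace(a, b, n + 1)
    fi = f(xi)
    return (h / 2) * (fi[0] + fi[-1]) + h * fi[1:-1].sum()

# the part that was missing: actually calling the functions
exact = math.exp(1) - 1
for k in range(1, 6):
    n = 2 ** k
    approx = trapezoid(f1, 0.0, 1.0, n)
    print(n, approx, abs(exact - approx))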

Nested Numba function performance

Currently I am trying to improve the performance of my Python code, and I am using Numba successfully for that. In order to improve the structure of my code I created separate functions. Now I have noticed, to my surprise, that if I split the code into different Numba functions, the code is significantly slower than if I put the whole thing in one function with a Numba decorator.
An example would be:
import numba as nb

@nb.njit
def fct_4(a, b):
    x = a ^ b
    setBits = 0
    while x > 0:
        setBits += x & 1
        x >>= 1
    return setBits

@nb.njit
def fct_3(c, set_1, set_2):
    h = 2
    if c not in set_1 and c not in set_2:
        if fct_4(0, c) <= h:
            set_1.add(c)
        else:
            set_2.add(c)

@nb.njit
def fct_2(c, set_1, set_2):
    fct_3(c, set_1, set_2)

@nb.njit
def fct_1(set_1, set_2):
    for x1 in range(1000):
        c = 2
        fct_2(c, set_1, set_2)
is slower than
@nb.njit
def fct_1(set_1, set_2):
    for x1 in range(1000):
        c = 2
        h = 2
        if c not in set_1 and c not in set_2:
            if fct_4(0, c) <= h:
                set_1.add(c)
            else:
                set_2.add(c)
with
import timeit

@nb.njit
def main_fct(set_1, set_2):
    for i in range(50):
        for x in range(1000):
            fct_1(set_1, set_2)

set_1 = {0}
set_2 = {47}

start = timeit.default_timer()
main_fct(set_1, set_2)
stop = timeit.default_timer()
(2.70 seconds vs 0.46 seconds). I thought this shouldn't make a difference. Could you enlighten me?
Since Python is a dynamically typed language, its function call overhead is quite high.
On top of that, you are making the function calls inside a loop, so the time spent entering the function and checking its arguments is multiplied by the 1000 loop iterations.
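One thing that may help, if the installed Numba version supports it (an assumption on my part, not something tested against this code), is asking Numba to inline the small helpers at compile time via the inline option of njit, so that splitting the code into functions costs nothing at run time:
import numba as nb

# sketch only: inline='always' asks Numba to inline this helper into its
# callers during compilation, which can remove the per-call overhead
@nb.njit(inline='always')
def fct_4(a, b):
    x = a ^ b
    set_bits = 0
    while x > 0:
        set_bits += x & 1
        x >>= 1
    return set_bits

@nb.njit(inline='always')
def fct_3(c, set_1, set_2):
    h = 2
    if c not in set_1 and c not in set_2:
        if fct_4(0, c) <= h:
            set_1.add(c)
        else:
            set_2.add(c)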

multiple values of a variable in same equation

A = [18.0,10.0]; B = [13.0,15.0]; C = [10.5,12.0];
These are the variables, and the function looks like this:
def hlf(A,B,C):
    return A**(-1.0/2.0)-0.2*B-43+C

print "T:"
hlf(A,B,C)
First I want to use the first values of A, B and C in the equation, and then the second values. How can I do this?
map + list
Note map can take multiple iterable arguments:
res = map(hlf, A, B, C)
[-34.86429773960448, -33.68377223398316]
In Python 2.7, map returns a list. In Python 3.x, map returns an iterator, so you can either iterate lazily or exhaust it via list, i.e. list(map(hlf, A, B, C)).
Reference:
map(function, iterable, ...)
    ...If additional iterable arguments are passed, function must take that many arguments and is applied to the items from all iterables in parallel.
zip + list comprehension
You can use zip within a list comprehension. For clarity, you should avoid naming your arguments the same as your variables.
A = [18.0,10.0]; B = [13.0,15.0]; C = [10.5,12.0];

def hlf(x, y, z):
    return x**(-1.0/2.0) - 0.2*y - 43 + z

res = [hlf(*vars) for vars in zip(A, B, C)]
[-34.86429773960448, -33.68377223398316]
Vectorize with NumPy (best performance)
It is usually much better to vectorize this kind of operation with numpy, because that gives the best performance: instead of looping in Python, the whole computation runs on arrays in compiled code. Something like this:
import numpy as np

A = [18.0,10.0]; B = [13.0,15.0]; C = [10.5,12.0];
a = np.array(A)
b = np.array(B)
c = np.array(C)
And now your function, taking the new vectors as arguments:
def hlf(a_vector,b_vector,c_vector):
    return a_vector**(-1.0/2.0)-0.2*b_vector-43+c_vector
Finally, call your new vectorized function:
print (hlf(a_vector = a,b_vector = b,c_vector = c))
Output:
>>> array([-34.86429774, -33.68377223])
If you want to keep your function as is, you can call it once per element with:
for i in range(len(A)):
    result = hlf(A[i], B[i], C[i])
    print(result)
Another interesting method is to make a generator with your function:
A = [18.0,10.0]
B = [13.0,15.0]
C = [10.5,12.0]

def hlf(*args):
    i = 0
    while i < len(args[0]):
        yield args[0][i]**(-1.0/2.0) - 0.2*args[1][i] - 43 + args[2][i]
        i += 1

results = hlf(A, B, C)
for r in results:
    print(r)
Output:
-34.86429773960448
-33.68377223398316
The last one is mainly educational, if you want to practice Python generators.

Fastest way to add/multiply two floating point scalar numbers in python

I'm using Python, and apparently the slowest part of my program is doing simple additions on float variables.
It takes about 35 seconds to do around 400,000,000 additions/multiplications.
I'm trying to figure out the fastest way I can do this math.
This is what the structure of my code looks like.
Example (dummy) code:
def func(x, y, z):
    loop_count = 30
    a = [0,1,2,3,4,5,6,7,8,9,10,11,12,...35 elements]
    b = [0,11,22,33,44,55,66,77,88,99,1010,1111,1212,...35 elements]
    p = [0,0,0,0,0,0,0,0,0,0,0,0,0,...35 elements]
    for i in range(loop_count - 1):
        c = p[i-1]
        d = a[i] + c * a[i+1]
        e = min(2, a[i]) + c * b[i]
        f = e * x
        g = y + d * c
        .... and so on
        p[i] = d + e + f + s + g5 + f4 + h7 * t5 + y8
    return sum(p)
func() is called about 200k times, loop_count is about 30, and I have ~20 multiplications, ~45 additions and ~10 uses of min/max.
I was wondering if there is a way for me to declare all of these as ctypes.c_float and do the addition in C using stdlib or something similar?
Note that the p[i] calculated at the end of the loop is used as c in the next loop iteration. For iteration 0, it just uses p[-1], which is 0 in this case.
My constraints:
I need to use Python. While I understand plain math would be faster in C/Java/etc., I cannot use them because of a bunch of other things I do in Python that cannot be done in C in this same program.
I tried writing this with Cython, but it caused a bunch of issues with the environment I need to run this in. So, again, not an option.
I think you should consider using numpy; none of your constraints rules it out.
Example: a simple dot product (x·y).
import datetime
import numpy as np

x = range(0,10000000,1)
y = range(0,20000000,2)
for i in range(0, len(x)):
    x[i] = x[i] * 0.00001
    y[i] = y[i] * 0.00001

now = datetime.datetime.now()
z = 0
for i in range(0, len(x)):
    z = z+x[i]*y[i]
print "handmade dot=", datetime.datetime.now()-now
print z

x = np.arange(0.0, 10000000.0*0.00001, 0.00001)
y = np.arange(0.0, 10000000.0*0.00002, 0.00002)
now = datetime.datetime.now()
z = np.dot(x,y)
print 'numpy dot =',datetime.datetime.now()-now
print z
outputs
handmade dot= 0:00:02.559000
66666656666.7
numpy dot = 0:00:00.019000
66666656666.7
numpy is more than 100x faster here.
The reason is that numpy wraps a C library that performs the dot product in compiled code, whereas in pure Python you have a list of potentially generic objects, type dispatch, casting, and so on.
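A hedged sketch of how the same idea could be applied to the dummy func above (the tables and the loop body are placeholders, since the real expression was elided): build the constant a and b arrays once with numpy instead of recreating the lists on each of the ~200k calls, and precompute anything that does not depend on the loop state; the serial dependence on p[i-1] still forces a Python loop.
import numpy as np

# hypothetical constant tables, built once instead of on every call
A = np.arange(35, dtype=np.float64)
B = np.arange(35, dtype=np.float64) * 11.0
MIN_A = np.minimum(2.0, A)            # min(2, a[i]) precomputed for every i

def func(x, y, z, loop_count=30):
    p = np.zeros(35)
    c = 0.0                           # plays the role of p[-1] == 0
    for i in range(loop_count - 1):   # the recurrence keeps this loop serial
        d = A[i] + c * A[i + 1]
        e = MIN_A[i] + c * B[i]
        f = e * x
        g = y + d * c
        p[i] = d + e + f + g          # stand-in for the longer real expression
        c = p[i]
    return p.sum()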
