I have written the following code:
test.py
import numpy

nc = 1; nb = 20; ni = 6; nc = 2; ia = 20; ib = 20; ic = 0
U1 = numpy.array((1,2,0,0,0,3))
U2 = numpy.array((2,2,1,0,0,1))
U3 = numpy.array((2,1,1,0,0,2))
U4 = numpy.array((2,1,0,1,0,2))
U5 = numpy.array((2,1,1,1,0,3))
for n in range(ni):
    a = nc*(nb*nc*ia+nc*ib+ic)+U1[n]
    a2 = ia + U1[n]
    b2 = ib + U3[n]
    c2 = ic + U4[n]
    b = nc*(nb*nc*a2+nc*b2+c2)+U2[n]
    A = str(numpy.array((a,b,U5[n])))
    print(A)
    with open("test.txt", 'w') as out:
        for o in A:
            out.write(o)
test.txt gives me the following:
[1683 1933 3]
But if I print test.py by using print(A), I get this:
[1681 1774 2]
[1682 1848 1]
[1680 1685 1]
[1680 1682 1]
[1680 1680 0]
[1683 1933 3]
How can I write the whole printed output to test.txt? I assume I need to do something like this:
ol = []
ol.append(o)
The basic problem is that you are opening the same file again and again, overwriting it on every iteration of the for loop.
Use:
with open("test.txt", 'w') as out:
for n in range(ni):
a = nc*(nb*nc*ia+nc*ib+ic)+U1[n]
a2 = ia + U1[n]
b2 = ib + U3[n]
c2 = ic + U4[n]
b = nc*(nb*nc*a2+nc*b2+c2)+U2[n]
A = str(numpy.array((a,b,U5[n])))
out.write(f"{A}\n")
Now the test.txt file will contain:
[1681 1774 2]
[1682 1848 1]
[1680 1685 1]
[1680 1682 1]
[1680 1680 0]
[1683 1933 3]
The best solution - Fast and efficient
import numpy
nc = 1; nb = 20; ni = 6; nc = 2; ia = 20; ib = 20; ic = 0
U1 = numpy.array((1,2,0,0,0,3))
U2 = numpy.array((2,2,1,0,0,1))
U3 = numpy.array((2,1,1,0,0,2))
U4 = numpy.array((2,1,0,1,0,2))
U5 = numpy.array((2,1,1,1,0,3))
# No for loop here, we are using NumPy broadcasting features
a = nc*(nb*nc*ia+nc*ib+ic)+U1
a2 = ia + U1
b2 = ib + U3
c2 = ic + U4
b = nc * (nb * nc * a2 + nc * b2 + c2) + U2
# Transpose the matrix to get the result wanted in your case
A = numpy.array((a, b, U5)).T
with open(file="res.txt", mode="w") as b:
b.write(numpy.array2string(A))
Remarks about your code written in the question
In most cases, using NumPy broadcasting (removing loop over arrays) makes the code faster. It can also be easier to read.
Writing to a file from inside a loop is bad practice. Even if you keep the file open with the with open context manager, performance is poor.
Better to build your array, convert it to a string, and then write the whole thing to the file in one call.
Other solution using numpy builtin functions
Disclaimer: use this only if the number of rows in your array is small (<500).
Dumping a NumPy array into a text file is covered by a built-in NumPy function.
Look at this: API doc | numpy.savetxt
However, if you look at the source code of this function, you will see that it iterates over the array's rows, which hurts performance a lot as the number of rows increases (thanks to #hpaulj for the remark).
In your case, you could replace the last two lines of the snippet above with:
numpy.savetxt("a.txt", A) # just see the doc to add some formatting options
In each iteration of the outer loop, you are asking the file system for a fresh, empty copy of "test.txt", so of course the final file only contains the output of the last iteration.
Either open the file with mode "a" ((write-and-)append), or, as in the other answer and more efficiently, open it once outside the loop.
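A minimal sketch of the append-mode variant (with a stand-in computation, since only the file handling matters here):
import numpy

ni = 6
open("test.txt", "w").close()             # truncate once so old runs don't linger
for n in range(ni):
    row = numpy.array((n, n + 1, n + 2))  # stand-in for the real a, b, U5[n]
    with open("test.txt", "a") as out:    # "a" appends instead of overwriting
        out.write(str(row) + "\n")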
Related
I have written a code that I want to use to reduce measurement data. For this, I iterate through 30 sets of measurement data. In each iteration, I use fsolve to solve a set of three non-linear equations. This gives me an array containing three values that are then further processed (in the example below lbda, alp, bta, dlt, q, N). I can print the results but would need to collect the data for all 30 cycles in a 30 by 6 array to do some statistics on them (i.e. np.mean on each of the 6 variables).
I've tried the (for me) most obvious functions np.append, np.vstack and np.concatenate at the end of each iteration, but these only give me a 1 by 6 array containing the last iteration step rather than the desired array containing all 30 iteration steps.
# loading data above
m1 = data_arr_blkcorr [:,4] / data_arr_blkcorr [:,2]
m2 = data_arr_blkcorr [:,5] / data_arr_blkcorr [:,2]
m3 = data_arr_blkcorr [:,7] / data_arr_blkcorr [:,2]
N = -1
while (N < 29):
    N = N + 1
    T1 = 79.744440299369400
    T2 = 4.756431967877120
    T3 = 195.146815878103000
    T4 = 1.333609171398
    T5 = 0.540566631391
    T6 = 1
    T7 = 1.731261585620
    T_all = np.array([T4, T5, T6, T7, T1, T2, T3])
    n1 = 0.598169735
    n2 = 1.509919737
    n3 = 0.600477235
    n4 = 0.9364071191658
    n5 = 0.5815716133216
    n6 = 1
    n7 = 1.0455228260642
    n_all = np.array([n4, n5, n6, n7, n1, n2, n3])
    I1 = 94.905838
    I2 = 96.906018
    I3 = 97.905405
    I4 = 99.907473
    I5 = 91.90681
    I6 = 93.90509
    I7 = 95.90468
    # some definition of variables here
    A11 = T1 - n1
    A12 = T2 - n2
    A13 = T3 - n3
    A21 = -n1*P1
    A22 = -n2*P2
    A23 = -n3*P3
    A31 = m1[N] * P1
    A32 = m2[N] * P2
    A33 = m3[N] * P3
    b11 = m1[N] - n1
    b12 = m2[N] - n2
    b13 = m3[N] - n3
    # some definition of variables here
    T = np.array([T1, T2, T3])
    n = np.array([n1, n2, n3])
    m = np.array([m1[N], m2[N], m3[N]])
    P = np.array([P1, P2, P3])
    def F(x):
        return x[0]*T + (1-x[0])*n*np.exp(-x[1]/(1-x[0])*P) - m*np.exp(-x[2]*P)
    y = fsolve(F, guess)
    lbda = y[0]
    alp = y[1]/(1-y[0])
    bta = y[2]
    dlt = (np.exp(-alp*P2)-1)*1000
    N_all = n_all * np.exp(-alp*P_all)
    q = (1 + (1 - lbda) / lbda * np.sum(N_all) / np.sum(T_all))**(-1)
    print(lbda, alp, bta, dlt, q, N)
Going through the posts I have also used this (after a suggestion provided by Koke Cacao):
data_sum = None
new_data = [lbda, alp, bta, dlt, q, N]
data_sum = np.append([data_sum], new_data) if data_sum is not None else new_data
print(data_sum)
But this yields a list of 30 isolated 1 by 6 arrays, which I cannot access as a whole (i.e. to calculate np.mean for the individual values over all 30 iteration steps).
0.0209809690838 0.00142553246898 1.61537217874 -0.0443566490317 0.492710128581 26
(0.020980969083774538, 0.0014255324689812997, 1.6153721787428821, -0.044356649031684903, 0.4927101285811698, 26)
0.0209791772348 0.00272489389093 1.61486845411 -0.0847856651612 0.492691689834 27
(0.020979177234773643, 0.0027248938909269797, 1.6148684541135419, -0.084785665161235535, 0.49269168983354455, 27)
0.0209792771323 0.004884280445 1.61191395635 -0.151970341101 0.49269849879 28
(0.020979277132325381, 0.0048842804449965851, 1.6119139563515672, -0.15197034110059349, 0.4926984987899769, 28)
0.0209799414614 0.00256323393277 1.61366560195 -0.0797557810515 0.492700571038 29
(0.020979941461433328, 0.0025632339327746521, 1.6136656019498359, -0.079755781051460417, 0.49270057103806092, 29)
Also, it concatenates results over multiple runs (even after shutting down and restarting Python), and I cannot clear this (sort of) memory.
Try creating an empty list outside your while loop and then appending the array:
solution = []
while N < 29:
    # your code here
    solution.append([lbda, alp, bta, dlt, q, N])
You should declare an empty list outside the while loop scope and then append to it on each iteration:
result = []
while (N < 29):
    # calculate something
    result.append(your_data)
print(result)  # that will give you all the data that you got from each iteration
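Either way, once the loop has finished you can turn the collected list into the 30 by 6 array you wanted and run your statistics on it. A small sketch with two made-up rows standing in for the real 30:
import numpy as np

result = [[0.0210, 0.0014, 1.615, -0.044, 0.4927, 26],
          [0.0210, 0.0027, 1.615, -0.085, 0.4927, 27]]  # stand-in rows
result_arr = np.array(result)       # shape (30, 6) with the real data
print(result_arr.mean(axis=0))      # np.mean of each of the 6 variables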
I'm using Python, and apparently the slowest part of my program is doing simple additions on float variables.
It takes about 35 seconds to do around 400,000,000 additions/multiplications.
I'm trying to figure out what is the fastest way I can do this math.
This is what the structure of my code looks like.
Example (dummy) code:
def func(x, y, z):
    loop_count = 30
    a = [0,1,2,3,4,5,6,7,8,9,10,11,12, ... 35 elements]
    b = [0,11,22,33,44,55,66,77,88,99,1010,1111,1212, ... 35 elements]
    p = [0,0,0,0,0,0,0,0,0,0,0,0,0, ... 35 elements]
    for i in range(loop_count - 1):
        c = p[i-1]
        d = a[i] + c * a[i+1]
        e = min(2, a[i]) + c * b[i]
        f = e * x
        g = y + d * c
        # ... and so on
        p[i] = d + e + f + s + g5 + f4 + h7 * t5 + y8
    return sum(p)
func() is called about 200k times. The loop_count is about 30, and I have ~20 multiplications, ~45 additions and ~10 uses of min/max.
I was wondering if there is a method for me to declare all these as ctypes.c_float and do the addition in C using stdlib or something similar?
Note that the p[i] calculated at the end of the loop is used as c in the next loop iteration. For iteration 0, it just uses p[-1] which is 0 in this case.
My constraints:
I need to use Python. While I understand plain math would be faster in C/Java/etc., I cannot use them, due to a bunch of other things I do in Python in this same program that cannot be done in C.
I tried writing this with cython, but it caused a bunch of issues with the environment I need to run this in. So, again - not an option.
I think you should consider using NumPy.
Example case of a simple dot operation (x.y)
import datetime
import numpy as np

x = list(range(0, 10000000, 1))
y = list(range(0, 20000000, 2))
for i in range(len(x)):
    x[i] = x[i] * 0.00001
    y[i] = y[i] * 0.00001

now = datetime.datetime.now()
z = 0
for i in range(len(x)):
    z = z + x[i] * y[i]
print("handmade dot=", datetime.datetime.now() - now)
print(z)

x = np.arange(0.0, 10000000.0 * 0.00001, 0.00001)
y = np.arange(0.0, 10000000.0 * 0.00002, 0.00002)
now = datetime.datetime.now()
z = np.dot(x, y)
print('numpy dot =', datetime.datetime.now() - now)
print(z)
outputs
handmade dot= 0:00:02.559000
66666656666.7
numpy dot = 0:00:00.019000
66666656666.7
NumPy is more than 100x faster here.
The reason is that NumPy wraps a compiled C library that does the dot operation on contiguous typed data. In pure Python you have a list of potentially generic objects, with type checks and casting at every step.
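The same idea can be applied to the dummy func above. The sequential dependency is only across the 30 loop iterations, so you can keep that short loop and batch the ~200k calls into arrays instead. A sketch under that assumption (func_batched is my name, and only the terms shown in the question are included):
import numpy as np

def func_batched(x, y, a, b, loop_count=30):
    # x, y are 1-D arrays holding the inputs of all ~200k calls at once;
    # a, b are the 35-element coefficient arrays from the question.
    p = np.zeros((loop_count, x.size))
    c = np.zeros(x.size)                 # plays the role of p[i-1], starts at 0
    for i in range(loop_count - 1):
        d = a[i] + c * a[i + 1]
        e = np.minimum(2, a[i]) + c * b[i]
        f = e * x
        g = y + d * c
        p[i] = d + e + f + g             # ... and so on, as in the question
        c = p[i]
    return p.sum(axis=0)                 # one sum(p) per original call

a = np.arange(35.0)
b = np.arange(35.0) * 11
x = np.random.rand(200000)
y = np.random.rand(200000)
print(func_batched(x, y, a, b)[:3])
Each of the 30 iterations now does array arithmetic over all calls in compiled code instead of Python-level float operations.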
There is a new edit at the end of the question.
I'm new to Python and I would like to know how I could make a simple tridiagonal NxN matrix.
I have three vectors that will be updated over a loop.
I'm working with something like this:
Note: I just want to know how to use zeros and which Python parameters I could use to adjust this.
Well, I have two pieces of code here: the first I wrote in Fortran, and it works fine; the second is my attempt to write the same thing in Python.
Fortran:
do i=2,n-1
   do j=2,n-1
      if (i.eq.j) then
         D(i,j+1)=-u_med(i+1)/(delta_r(i)*delta_r(i+1))
         t1 = u_med(i+1)/(delta_r(i)*delta_r(i))
         t2 = u_med(i)/(delta_r(i)*delta_r(i))
         D(i,j)= t1 + t2 + V(i)
         D(i+1,j)=-u_med(i+1)/(delta_r(i)*delta_r(i+1))
      end if
   end do
end do
Python:
for i in range(2,n):
    for j in range(2,n):
        if i == j:
            D[i][j+1] = -u_med[i+1]/(delta_r[i]*delta_r[i+1])
            t1 = u_med[i+1]/(delta_r[i]*delta_r[i])
            t2 = u_med[i]/(delta_r[i]*delta_r[i])
            D[i][j] = t1 + t2 + V[i]
            D[i+1][j] = -u_med[i+1]/(delta_r[i]*delta_r[i+1])

t1 = u_med[2]/(delta_r[1]*delta_r[1])
t2 = 0
D[1][1] = t1 + t2 + V[1]
D[1][2] = -u_med[2]/(delta_r[1]*delta_r[2])
D[2][1] = -u_med[2]/(delta_r[2]*delta_r[1])
t1 = 0
t2 = u_med[n]/(delta_r[n]*delta_r[n])
D[n][n] = t1 + t2 + V[n]
Which gives the error:
D[i][j+1] = - u_med[i+1]/(delta_r[i]*delta_r[i+1])
ValueError: setting an array element with a sequence.
Example based on the above image (linked example image omitted).
Comments:
For u_med:
u_med = np.zeros((n,2))
for i in range(2,n):
    tta1 = r[i]*u[i]
    tta2 = r[i-1]*u[i-1]
    u_med[i] = 0.5*(tta1 + tta2)/(r[i] - r[i-1])
u_med[1] = u_med[2]
For delta_r:
delta_r = np.zeros((n-1,2))
for i in range(2,n-1):
    ft1 = r[i+1]*r[i+1]
    ft2 = r[i-1]*r[i-1]
    ft3 = 2*r[i]*(r[i+1] - r[i-1])
    delta_r[i] = math.sqrt(0.125*abs(ft1 - ft2 + ft3))
For r and n:
ri = 0
n1 = 51
r1 = ri
r2 = 250
hr1 = (r2-r1)/(n1-1)
r = np.zeros((n1,1))
for i in range(n1):
    r[i] = r1 + i*hr1
u = np.zeros((n+1,1))
for i in range(1,n+1):
    u[i] = 1
And for D:
D = npm.zeros((n,n))
Edit: It seems to be because u_med and delta_r are two-dimensional and I'm trying to assign them into D with an incompatible shape. It works in Fortran, but how can I approach it differently in Python?
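A direct fix for that, assuming each entry is really meant to be a scalar, is to allocate those arrays as one-dimensional vectors in the first place:
import numpy as np

n = 51
u_med = np.zeros(n + 1)     # 1-D, sized to cover the 1-based indexing
delta_r = np.zeros(n + 1)   # likewise, instead of shape (n-1, 2)
# each element is now a scalar, so an assignment like
# D[i][j+1] = -u_med[i+1]/(delta_r[i]*delta_r[i+1])
# stores a single float instead of a 2-element sequence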
If you create the diagonals as just arrays or lists, then you could use numpy.diag() to create diagonal matrices and add them together.
D = np.diag(x) + np.diag(x1, 1) + np.diag(x1, -1)
to create your matrix. Perhaps something like
A = mu[1:]/h[1:]**2 + mu[:-1]/h[1:]**2 + U[1:]
B = mu[1:]/ (h[:-1]*h[1:])
D = np.diag(A) + np.diag(B, 1) + np.diag(B, -1)
going off of the image you linked.
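As a self-contained illustration of numpy.diag with made-up values (a 4x4 case):
import numpy as np

x  = np.array([4.0, 4.0, 4.0, 4.0])   # main diagonal, length n
x1 = np.array([1.0, 1.0, 1.0])        # off-diagonals, length n-1
D = np.diag(x) + np.diag(x1, 1) + np.diag(x1, -1)
print(D)
# [[4. 1. 0. 0.]
#  [1. 4. 1. 0.]
#  [0. 1. 4. 1.]
#  [0. 0. 1. 4.]]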
I am trying to combine different parts of arrays and tuples to generate a series of products. Here is the tuple 'i':
i=(2,5)
Here is the first matrix 'w':
w=[array([[-1.95446441, 1.53904854, -0.3461807 ],
[-0.19153855, -1.63290931, -1.76897156]]),
array([[ 0.25648535],
[ 0.20186475],
[ 0.78002102]])]
here is the second matrix 'b':
[array([[-0.02676943],
[ 0.25294377],
[-0.43625132]]),
array([[ 0.07763943]])]
I am trying to make a series of products from various parts of these data structures, in a list of lists or matrix called 'a'.
The list of these products should be equivalent to:
a[0][0] = (w[0][0][0]*i[0]) + (w[0][1][0]*i[1]) + b[0][0]
a[0][1] = (w[0][0][1]*i[0]) + (w[0][1][1]*i[1]) + b[0][1]
a[0][2] = (w[0][0][2]*i[0]) + (w[0][1][2]*i[1]) + b[0][2]
a[1][0] = (w[1][0] * a[0][0]) + (w[1][1] * a[0][1]) + (w[1][2] * a[0][2]) + b[1][0]
I am trying to use this as part of a neural network and have written a version that works perfectly well using iteration. However, I am new to numpy and would like to build a matrix-based version of this. The problem I am having is more to do with understanding the numpy syntax to perform the operation above. I tried adapting this from an online tutorial but am not sure where to go from here.
for b, w in zip(b, w):
    layer = sigmoid(np.dot(w, layer) + b.T)
    a.append(layer)
This throws an error:
ValueError: shapes (2,3) and (1,3) not aligned: 3 (dim 1) != 1 (dim 0)
Any pointers would be very helpful.
For a start, let's split your two variables, w and b. They aren't really arrays; they are lists of arrays with different shapes:
w0 = array([[-1.95446441, 1.53904854, -0.3461807 ],
[-0.19153855, -1.63290931, -1.76897156]])
w1 = array([[ 0.25648535],
[ 0.20186475],
[ 0.78002102]])
b0 = array([[-0.02676943],
[ 0.25294377],
[-0.43625132]])
b1 = array([[ 0.07763943]])
Maybe later you can iterate over them as 2 element lists, but for now that just complicates things.
Now your a calculation simplifies to:
a0[0] = w0[0,0]*i[0] + w0[1,0]*i[1] + b0[0]
a0[1] = w0[0,1]*i[0] + w0[1,1]*i[1] + b0[1]
a0[2] = w0[0,2]*i[0] + w0[1,2]*i[1] + b0[2]
a1[0] = w1[0]* a0[0] + w1[1]*a0[1] + w1[2]*a0[2] + b1[0]
which further simplifies to:
a0 = w0[0,:]*i[0] + w0[1,:]*i[1] + b0
a1 = np.sum(w1*a0) + b1
or
I0 = np.array([i]).T
a0 = np.sum(w0*I0, axis=0) + b0
Those sums could be turned into dots; I think this works:
a0 = np.dot(w0.T,i) + b0
But I doubt if it's much of an improvement.
You can't calculate a0 and a1 together, since a1 uses a0. But you could cast it as an iteration like this (not tested):
I0 = ...
w = [w0, w1]
b = [b0, b1]
a = [None, None]
for i in range(...):
    a[i] = np.sum(w[i]*I0, axis=0) + b[i]
    I0 = a[i]
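For reference, here is a runnable version of that sketch with the arrays from the question, using the dot form (no sigmoid, so it matches the a equations above exactly):
import numpy as np

w0 = np.array([[-1.95446441,  1.53904854, -0.3461807 ],
               [-0.19153855, -1.63290931, -1.76897156]])
w1 = np.array([[0.25648535], [0.20186475], [0.78002102]])
b0 = np.array([[-0.02676943], [0.25294377], [-0.43625132]])
b1 = np.array([[0.07763943]])
i = (2, 5)

layer = np.array(i, dtype=float).reshape(-1, 1)   # (2,1) column vector
a = []
for w, b in zip([w0, w1], [b0, b1]):
    layer = np.dot(w.T, layer) + b                # (3,1) first, then (1,1)
    a.append(layer)
print(a[0])   # matches a[0][0], a[0][1], a[0][2] from the question
print(a[1])   # matches a[1][0]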
I have the following Python code:
import numpy as np

H1 = [[0.04,0.03,0.01,0.002],[0.02,0.04,0.001,0.5]]
H2 = [[0.06,0.02,0.02,0.004],[0.8,0.09,0.6,0.1]]
D1 = [0.01,0.02,0.1,0.01]
D2 = [0.1,0.3,0.01,0.4]
Tp = np.sum(D1)
Tn = np.sum(D2)
T = []
append2 = T.append
E = []
append3 = E.append
for h1,h2 in zip(H1,H2):
    Err = []
    append1 = Err.append
    for v in h1:
        L1 = [1 if i>=v else 0 for i in h1]
        L2 = [1 if i>=v else 0 for i in h2]
        Sp = np.dot(D1,L1)
        Sn = np.dot(D2,L2)
        err = min(Sp+Tn-Sn, Sn+Tp-Sp)
        append1(err)
    b = np.argmin(Err)
    append2(h1[b])
    append3(Err[b])
This is just example code. I need to run the inner for loop about 20,000 times (here it runs just twice), but it takes so much time that the code is impractical to use.
The line profiler shows that the lines Sp = np.dot(D1,L1), Sn = np.dot(D2,L2) and b = np.argmin(Err) are the most time consuming.
How can I reduce the time taken by the above code?
Any help will be much appreciated.
Thanks!
You can get a pretty big speed boost if you use numpy functions with numpy arrays instead of lists. Most numpy functions will convert lists to arrays internally and that adds a lot of overhead to the run time. Here is a simple example:
In [16]: a = range(10)
In [17]: b = range(10)
In [18]: aa = np.array(a)
In [19]: bb = np.array(b)
In [20]: %timeit np.dot(a, b)
10000 loops, best of 3: 54 us per loop
In [21]: %timeit np.dot(aa, bb)
100000 loops, best of 3: 3.4 us per loop
numpy.dot ran 16x faster when called with arrays in this case. Also, when you use NumPy arrays you'll be able to simplify some of your code, which should also help it run faster. For example, if h1 is an array, L1 = [1 if i>=v else 0 for i in h1] can be written as h1 >= v, which returns an array and should also run faster. Below I've gone ahead and replaced your lists with arrays so you can see what it would look like.
import numpy as np

H1 = np.array([[0.04,0.03,0.01,0.002],[0.02,0.04,0.001,0.5]])
H2 = np.array([[0.06,0.02,0.02,0.004],[0.8,0.09,0.6,0.1]])
D1 = np.array([0.01,0.02,0.1,0.01])
D2 = np.array([0.1,0.3,0.01,0.4])
Tp = np.sum(D1)
Tn = np.sum(D2)
T = np.zeros(H1.shape[0])
E = np.zeros(H1.shape[0])
for i in range(len(H1)):
    h1 = H1[i]
    h2 = H2[i]
    Err = np.zeros(len(h1))
    for j in range(len(h1)):
        v = h1[j]
        L1 = h1 >= v
        L2 = h2 >= v
        Sp = np.dot(D1, L1)
        Sn = np.dot(D2, L2)
        err = min(Sp+Tn-Sn, Sn+Tp-Sp)
        Err[j] = err
    b = np.argmin(Err)
    T[i] = h1[b]
    E[i] = Err[b]
Once you're more comfortable with numpy arrays you might want to look into expressing at least your inner loop using broadcasting. For some applications, using broadcasting can be much more efficient than python loops. Good luck, hope that helps.
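For example, a sketch of what broadcasting the inner loop might look like, with h1, h2, D1, D2 as in the code above (untested against your real data):
# h1[:, None] has shape (4, 1), so each comparison yields a (4, 4)
# boolean matrix whose row j is h1 >= h1[j] (one row per threshold v)
L1 = h1 >= h1[:, None]
L2 = h2 >= h1[:, None]
Sp = L1 @ D1                 # (4, 4) @ (4,) -> all Sp values at once
Sn = L2 @ D2
Err = np.minimum(Sp + Tn - Sn, Sn + Tp - Sp)
b = np.argmin(Err)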
You need to keep the data in ndarray types. When you do a NumPy operation on a list, it has to construct a new array each time. I modified your code to run a variable number of times and found it took ~1 s for 10,000 iterations. Changing the data types to ndarrays reduced that by about a factor of two, and I think there is still some improvement to be made (the first version of this had a bug that made it execute too fast).
import numpy as np

N = 10000
H1 = [np.array([0.04,0.03,0.01,0.002])] * N
H2 = [np.array([0.06,0.02,0.02,0.004])] * N
D1 = np.array([0.01,0.02,0.1,0.01])
D2 = np.array([0.1,0.3,0.01,0.4])
Tp = np.sum(D1)
Tn = np.sum(D2)
T = []
append2 = T.append
E = []
append3 = E.append
for h1,h2 in zip(H1,H2):
    Err = []
    append1 = Err.append
    for v in h1:
        #L1 = [1 if i>=v else 0 for i in h1]
        #L2 = [1 if i>=v else 0 for i in h2]
        L1 = h1 >= v
        L2 = h2 >= v
        Sp = np.dot(D1,L1)
        Sn = np.dot(D2,L2)
        err = min(Sp+Tn-Sn, Sn+Tp-Sp)
        append1(err)
    b = np.argmin(Err)
    append2(h1[b])
    append3(Err[b])
There's some low-hanging fruit in your list comprehensions:
L1 = [1 if i>=v else 0 for i in h1]
L2 = [1 if i>=v else 0 for i in h2]
The above could be written as:
L1 = [i>=v for i in h1]
L2 = [i>=v for i in h2]
Because Booleans are a subclass of integers, True and False are already 1 and 0, just wearing fancy clothes.
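A quick demonstration:
>>> isinstance(True, int)
True
>>> True + True
2
>>> sum([True, False, True])
2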
err = min(Sp+Tn-Sn, Sn+Tp-Sp)
append1(err)
You could combine the above two lines to avoid the variable assignment and access.
If you put the code in a function, all local variable usage will be slightly faster. Also, any global functions or methods you use (e.g. min, np.dot) can be converted to locals in the function signature using default arguments. np.dot is an especially slow call to make (outside of how long the operation itself takes) because it involves an attribute lookup. This would be similar to the optimization you already make with the list append methods.
Now I imagine none of this will really affect performance much, since your question really seems to be "how can I make NumPy faster?" (which others are on top of for you) but they might have some impact and be worth doing.
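A sketch of the default-argument trick applied to the inner loop, assuming h1 and h2 are arrays as in the other answers (inner_err is my name for the extracted function):
import numpy as np

def inner_err(h1, h2, D1, D2, Tp, Tn, min=min, dot=np.dot):
    # min and dot are local names now: no global or attribute lookup per call
    Err = []
    for v in h1:
        Sp = dot(D1, h1 >= v)
        Sn = dot(D2, h2 >= v)
        Err.append(min(Sp + Tn - Sn, Sn + Tp - Sp))
    return Err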
If I have correctly understood what np.dot() does on two one-dimensional lists, it seems to me that the following code should do the same as yours.
Could you test its speed, please?
The principle is to work with indices instead of the list elements, and to use the peculiarity that a function's default argument values are bound once, at definition time.
import numpy as np

H1 = [[0.04,0.03,0.01,0.002],[0.02,0.04,0.001,0.5]]
H2 = [[0.06,0.02,0.02,0.004],[0.8,0.09,0.6,0.1]]
D1 = [0.01,0.02,0.1,0.01]
D2 = [0.1,0.3,0.01,0.4]
Tp = np.sum(D1)
Tn = np.sum(D2)
T, E = [], []
append2 = T.append
append3 = E.append
ONE, TWO = [], []

def zoui(v, ONE=ONE, TWO=TWO,
         D1=D1, D2=D2, Tp=Tp, Tn=Tn, tu0123=(0,1,2,3)):
    diff = sum(D1[i] if ONE[i]>=v else 0 for i in tu0123)\
           - sum(D2[i] if TWO[i]>=v else 0 for i in tu0123)
    # or maybe
    # diff = sum(D1[i] * (ONE[i]>=v) for i in tu0123)\
    #        - sum(D2[i] * (TWO[i]>=v) for i in tu0123)
    return min(Tn+diff, Tp-diff)

for n in range(len(H1)):
    ONE[:] = H1[n]
    TWO[:] = H2[n]
    Err = list(map(zoui, ONE))
    b = np.argmin(Err)
    append2(ONE[b])
    append3(Err[b])