Out of curiosity, is there a specific numpy function to do the following (which would supposedly be faster):
a = np.array((0,2,4))
b = np.zeros(len(a) - 1)
for i in range(len(b)):
b[i] = a[i:i+2].mean()
print(b)
#prints [1,3]
Cheers
You could use
b = (a[1:] + a[:-1]) / 2.
to avoid the Python loop.
Related
I have a function that takes three arrays and runs a few lines of code to interpolate the data.
It looks something like this:
def interpolate(a, b, c):
# Avoid crashing if a > max(b)
xm_ = b[-1]
a[a > xm_] = xm_
i = np.arange(a.size)
j = np.searchsorted(b, a) - 1
d = (a - b[j]) / (b[j + 1] - b[j])
return (1 - d) * c[i, j] + c[i, j + 1] * d
Here are the shapes of the arrays:
a is (163080,)
b is (71,)
c is (162080, 71)
I keep running into the IndexError. I have tried to extend the b with higher values to try to not hit max values. I have also tried to use up to fourth last index when assigning xm_.
Any suggestions to what might be happening here? And potentially how to solve it?
Cheers!
I have a code in Matlab and I want to convert it to Python.
Matlab code:
...
...
for i=1:Z
index=0;
N_no=0;
clear A
clear B
for j=1:Z
Dist=distance(X(:,i),X(:,j));
if (all(Dist<=r) && all(Dist~=0))
index=index+1;
N_no=N_no+1;
A(:,index)=DeltaX(:,j);
B(:,index)=X(:,j);
end
end
...
...
end
Python code:
for i in range(0, Z):
index = -1
N_no = -1
A = np.zeros((Z, dim))
B = np.zeros((Z, dim))
for j in range(0, Z):
Dist = Distance(X[i, :], X[j, :])
if np.all(Dist <= r) and np.all(Dist != 0):
index = index + 1
N_no = N_no + 1
A[index, :] = DeltaX[j, :]
B[index, :] = X[j, :]
...
This code is working, but I am looking for an efficient way to convert it. I cannot use del A, del B in the Python code, instead of A = np.zeros((Z, dim)), B = np.zeros((Z, dim)), because I will get this error: UnboundLocalError: local variable 'A'/'B' referenced before assignment. Any suggestion?
This is the standard approach to assign an empty Numpy array. I would see no need of deleting the variable at all.
I assume, that you don't use the variable 'A' and 'B' in the code above and thus the error message is also valid.
You can not delete a variable that does not exist.
I have two numpy arrays like below
a=np.array([11,12])
b=np.array([9])
#a-b should be [2,12]
I want to subtract both a & b such that result should [2,12]. How can I achieve this result?
You can zero-pad one of the array.
import numpy as np
n = max(len(a), len(b))
a_pad = np.pad(a, (0, n - len(a)), 'constant')
b_pad = np.pad(b, (0, n - len(b)), 'constant')
ans = a_pad - b_pad
Here np.pad's second argument is (#of left pads, #of right pads)
A similar method to #BlownhitherMa, would be to create an array of zeros the size of a (we can call it c), then put in b's values where appropriate:
c = np.zeros_like(a)
c[np.indices(b.shape)] = b
>>> c
array([9, 0])
>>> a-c
array([ 2, 12])
You could use zip_longest from itertools:
import numpy as np
from itertools import zip_longest
a = np.array([11, 12])
b = np.array([9])
result = np.array([ai - bi for ai, bi in zip_longest(a, b, fillvalue=0)])
print(result)
Output
[ 2 12]
Here is a very long laid out solution.
diff =[]
n = min(len(a), len(b))
for i in range (n):
diff.append(a[i] - b[i])
if len(a) > n:
for i in range(n,len(a)):
diff.append(a[i])
elif len(b) > n:
for i in range(n,len(b)):
diff.append(b[i])
diff=np.array(diff)
print(diff)
We can avoid unnecessary padding / temporaries by copying a and then subtracting b in-place:
# let numpy determine appropriate dtype
dtp = (a[:0]-b[:0]).dtype
# copy a
d = a.astype(dtp)
# subtract b
d[:b.size] -= b
I'm using python and apparently the slowest part of my program is doing simple additions on float variables.
It takes about 35seconds to do around 400,000,000 additions/multiplications.
I'm trying to figure out what is the fastest way I can do this math.
This is how the structure of my code looks like.
Example (dummy) code:
def func(x, y, z):
loop_count = 30
a = [0,1,2,3,4,5,6,7,8,9,10,11,12,...35 elements]
b = [0,11,22,33,44,55,66,77,88,99,1010,1111,1212,...35 elements]
p = [0,0,0,0,0,0,0,0,0,0,0,0,0,...35 elements]
for i in range(loop_count - 1):
c = p[i-1]
d = a[i] + c * a[i+1]
e = min(2, a[i]) + c * b[i]
f = e * x
g = y + d * c
.... and so on
p[i] = d + e + f + s + g5 + f4 + h7 * t5 + y8
return sum(p)
func() is called about 200k times. The loop_count is about 30. And I have ~20 multiplications and ~45 additions and ~10 uses of min/max
I was wondering if there is a method for me to declare all these as ctypes.c_float and do addition in C using stdlib or something similar ?
Note that the p[i] calculated at the end of the loop is used as c in the next loop iteration. For iteration 0, it just uses p[-1] which is 0 in this case.
My constraints:
I need to use python. While I understand plain math would be faster in C/Java/etc. I cannot use it due to a bunch of other things I do in python which cannot be done in C in this same program.
I tried writing this with cython, but it caused a bunch of issues with the environment I need to run this in. So, again - not an option.
I think you should consider using numpy. You did not mention any constraint.
Example case of a simple dot operation (x.y)
import datetime
import numpy as np
x = range(0,10000000,1)
y = range(0,20000000,2)
for i in range(0, len(x)):
x[i] = x[i] * 0.00001
y[i] = y[i] * 0.00001
now = datetime.datetime.now()
z = 0
for i in range(0, len(x)):
z = z+x[i]*y[i]
print "handmade dot=", datetime.datetime.now()-now
print z
x = np.arange(0.0, 10000000.0*0.00001, 0.00001)
y = np.arange(0.0, 10000000.0*0.00002, 0.00002)
now = datetime.datetime.now()
z = np.dot(x,y)
print 'numpy dot =',datetime.datetime.now()-now
print z
outputs
handmade dot= 0:00:02.559000
66666656666.7
numpy dot = 0:00:00.019000
66666656666.7
numpy is more than 100x times faster.
The reason is that numpy encapsulates a C library that does the dot operation with compiled code. In the full python you have a list of potentially generic objects, casting, ...
I have an array of numbers, a. I have a second array, b, specifying how many times I want to retrieve the corresponding element in a. How can this be achieved? The ordering of the output is not important in this case.
import numpy as np
a = np.arange(5)
b = np.array([1,0,3,2,0])
# desired output = [0,2,2,2,3,3]
# i.e. [a[0], a[2], a[2], a[2], a[3], a[3] ]
Thats exactly what np.arange(5).repeat([1,0,3,2,0]) does.
A really inefficient way to do that is this one :
import numpy as np
a = np.arange(5)
b = np.array([1,0,3,2,0])
res = []
i = 0
for val in b:
for aa in range(val):
res.append(a[i])
i += 1
print res
here's one way to do it:
res = []
for i in xrange(len(b)):
for j in xrange(b[i]):
out.append(a[i])
res = np.array(res) # optional