I'm working with a 2D array, basically just trying to do an element-wise addition of a constant value.
I need to speed the code up, so I attempted to use a NumPy array instead of a list of lists, but I'm finding NumPy to be slower. Any idea what I'm doing wrong? Thanks.
For example:
import time
import numpy as np
my_array_list = [[1,2,3],[4,5,6],[7,8,9]]
my_array_np = np.array(my_array_list)
n = 100000
s_np = time.time()
for a in range(n):
    for i in range(3):
        for j in range(3):
            my_array_np[i, j] = my_array_np[i, j] + 5
end_np = time.time() - s_np
s_list = time.time()
for a in range(n):
    for i in range(3):
        for j in range(3):
            my_array_list[i][j] = my_array_list[i][j] + 5
end_list = time.time() - s_list
print('my_array_np:', '\n', my_array_np, '\n')
print('my_array_list:', '\n',my_array_list, '\n')
print('time to complete with numpy:', end_np)
print('time to complete with list:', end_list)
Output:
my_array_np:
[[500001 500002 500003]
[500004 500005 500006]
[500007 500008 500009]]
my_array_list:
[[500001, 500002, 500003], [500004, 500005, 500006], [500007, 500008, 500009]]
time to complete with numpy: 0.7831366062164307
time to complete with list: 0.45527076721191406
You can see that with this test, using lists the time to complete is significantly faster, i.e., 0.45 vs 0.78 seconds. Shouldn't numpy be significantly faster here?
Let's say you want to add something to all elements that are multiples of 3. Instead of iterating on all elements of the array, we would normally use a mask
In [355]: x = np.arange(12).reshape(3,4)
In [356]: mask = (x%3)==0
In [357]: mask
Out[357]:
array([[ True, False, False, True],
[False, False, True, False],
[False, True, False, False]])
In [358]: x[mask] += 100
In [359]: x
Out[359]:
array([[100, 1, 2, 103],
[ 4, 5, 106, 7],
[ 8, 109, 10, 11]])
Many operations are ufuncs, which have a where parameter:
In [360]: x = np.arange(12).reshape(3,4)
In [361]: np.add(x,100, where=mask, out=x)
Out[361]:
array([[100, 1, 2, 103],
[ 4, 5, 106, 7],
[ 8, 109, 10, 11]])
Fast numpy requires that we think in terms of the whole array. The fast compiled code operates on arrays, or blocks of arrays. Python-level iteration over an array is slow, slower, as you found out, than iteration over lists, because accessing individual values of an array is more expensive.
For this small example, these whole-array methods are faster than the array iteration, though they are still slower than the list iteration. But the array methods scale much better.
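For instance, here is a rough sketch of the benchmark from the question rewritten the whole-array way (only the inner element loops are replaced; names follow the question):
import time
import numpy as np

my_array_np = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
n = 100000

s_np = time.time()
for a in range(n):
    my_array_np += 5              # one whole-array operation per pass
end_np = time.time() - s_np

print(my_array_np)                # same final values as the element-by-element loop
print('time to complete with numpy (whole-array):', end_np)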
Hmm... it seems that the list comprehension is faster in this particular case, but NumPy is faster when I add numba.
import dis
import time
import numpy as np
from numba import jit
my_array_list = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
my_array_np = np.array(my_array_list)
n = 1000000
# @jit   # uncomment to run fun1 under numba
def fun1(my_array_np):
    # this is an in-place, whole-array operation
    for a in range(n):
        my_array_np += 5
s_np = time.time()
fun1(my_array_np)
end_np = time.time() - s_np
def fuc2(my_array_list):
    for a in range(n):
        my_array_list = [[i + 5 for i in j] for j in my_array_list]
    return my_array_list
s_list = time.time()
my_array_list = fuc2(my_array_list)
end_list = time.time() - s_list
print('my_array_np:', '\n', my_array_np, '\n')
print('my_array_list:', '\n', my_array_list, '\n')
print('time to complete with numpy:', end_np)
print('time to complete with list:', end_list)
my_array_np:
[[500001 500002 500003]
[500004 500005 500006]
[500007 500008 500009]]
my_array_list:
[[500001, 500002, 500003], [500004, 500005, 500006], [500007, 500008, 500009]]
# with numba
time to complete with numpy: 0.27802205085754395
time to complete with list: 1.9161949157714844
# without numba
time to complete with numpy: 3.4962515830993652
time to complete with list: 1.9761543273925781
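For reference, here is a minimal sketch of what the numba-accelerated variant might look like with the decorator actually applied and an explicit element-wise loop (this assumes numba is installed; njit and the function name are illustrative choices, not code from the answer above):
import numpy as np
from numba import njit

@njit
def add_constant_loop(arr, n):
    # explicit loops are fine here: numba compiles them to machine code
    for _ in range(n):
        for i in range(arr.shape[0]):
            for j in range(arr.shape[1]):
                arr[i, j] += 5

my_array_np = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
add_constant_loop(my_array_np, 1000000)   # the first call also pays the compilation cost
print(my_array_np)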
I want to do a nonzero cumsum with a numpy array: simply skip zeros in the array and apply cumsum. Suppose I have the np.array
a = np.array([1,2,1,2,5,0,9,6,0,2,3,0])
my result should be
[1,3,4,6,11,0,20,26,0,28,31,0]
I have tried this
a = np.cumsum(a[a!=0])
but result is
[1,3,4,6,11,20,26,28,31]
Any ideas?
You need to mask the original array so only the non-zero elements are overwritten:
In [9]:
a = np.array([1,2,1,2,5,0,9,6,0,2,3,0])
a[a!=0] = np.cumsum(a[a!=0])
a
Out[9]:
array([ 1, 3, 4, 6, 11, 0, 20, 26, 0, 28, 31, 0])
Another method is to use np.where; this works because the zeros contribute nothing to the cumulative sum, so np.cumsum(a) already holds the correct values at the non-zero positions:
In [93]:
a = np.array([1,2,1,2,5,0,9,6,0,2,3,0])
a = np.where(a!=0,np.cumsum(a),a)
a
Out[93]:
array([ 1, 3, 4, 6, 11, 0, 20, 26, 0, 28, 31, 0])
timings
In [91]:
%%timeit
a = np.array([1,2,1,2,5,0,9,6,0,2,3,0])
a[a!=0] = np.cumsum(a[a!=0])
a
The slowest run took 4.93 times longer than the fastest. This could mean that an intermediate result is being cached
100000 loops, best of 3: 12.6 µs per loop
In [94]:
%%timeit
a = np.array([1,2,1,2,5,0,9,6,0,2,3,0])
a = np.where(a!=0,np.cumsum(a),a)
a
The slowest run took 6.00 times longer than the fastest. This could mean that an intermediate result is being cached
100000 loops, best of 3: 10.5 µs per loop
The above shows that np.where is marginally quicker than the first method.
To my mind, jotasi's suggestion in a comment to the OP is the most idiomatic. Here are some timings, though note that Shawn. L's answer returns a Python list, not a NumPy array, so they are not strictly comparable.
import numpy as np
def jotasi(a):
    b = np.cumsum(a)
    b[a==0] = 0
    return b

def EdChum(a):
    a[a!=0] = np.cumsum(a[a!=0])
    return a

def ShawnL(a):
    b = np.cumsum(a)
    b = [b[i] if ((i > 0 and b[i] != b[i-1]) or i==0) else 0 for i in range(len(b))]
    return b

def Ed2(a):
    return np.where(a!=0, np.cumsum(a), a)
To test, I generated a NumPy array of 1E5 integers in [0,100]. Therefore about 1% are 0. These results are from NumPy 1.9.2, Python 2.7.12, and are presented from slowest to fastest:
import timeit
a = np.random.random_integers(0,100,100000)
len(a[a==0]) #verify there are some 0's
1003
timeit.timeit("ShawnL(a)", "from __main__ import a,EdChum,ShawnL,jotasi,Ed2", number=250)
11.743098020553589
timeit.timeit("EdChum(a)", "from __main__ import a,EdChum,ShawnL,jotasi,Ed2", number=250)
0.1794271469116211
timeit.timeit("Ed2(a)", "from __main__ import a,EdChum,ShawnL,jotasi,Ed2", number=250)
0.1282949447631836
timeit.timeit("jotasi(a)", "from __main__ import a,EdChum,ShawnL,jotasi,Ed2", number=250)
0.09286999702453613
I'm a little surprised there's such a big difference between jotasi's and Ed Chum's answers; minimizing the boolean-indexing work evidently matters. No surprise that a list comprehension is slow.
Just trying to simplify it :)
b=np.cumsum(a)
[b[i] if ((i > 0 and b[i] != b[i-1]) or i==0) else 0 for i in range(len(b))]
I have written a script that evaluates whether some entry of arr is in check_elements. My approach does not compare single entries, but whole vectors inside of arr. Thus, the script checks whether [8, 3], [4, 5], ... are in check_elements.
Here's an example:
import numpy as np
# arr.shape -> (2, 3, 2)
arr = np.array([[[8, 3],
[4, 5],
[6, 2]],
[[9, 0],
[1, 10],
[7, 11]]])
# check_elements.shape -> (3, 2)
# generally: (n, 2)
check_elements = np.array([[4, 5], [9, 0], [7, 11]])
# rslt.shape -> (2, 3)
rslt = np.zeros((arr.shape[0], arr.shape[1]), dtype=np.bool)
for i, j in np.ndindex((arr.shape[0], arr.shape[1])):
    if arr[i, j] in check_elements:  # <-- condition is checked against
                                     #     the whole last dimension
        rslt[i, j] = True
    else:
        rslt[i, j] = False
Now:
print(rslt)
...would print:
[[False True False]
[ True False True]]
For getting the indices, I use:
print(np.transpose(np.nonzero(rslt)))
...which prints the following:
[[0 1] # arr[0, 1] -> [4, 5] -> is in check_elements
[1 0] # arr[1, 0] -> [9, 0] -> is in check_elements
[1 2]] # arr[1, 2] -> [7, 11] -> is in check_elements
This task would be easy and performant if I were checking a condition on single values, like arr > 3 or np.where(...), but I am not interested in single values. I want to check a condition against the whole last dimension (or slices of it).
My question is: is there a faster way to achieve the same result? Am I right that vectorized attempts and things like np.where cannot be used for my problem, because they always operate on single values and not on a whole dimension or slices of that dimension?
Here is a Numpythonic approach using broadcasting:
>>> (check_elements == arr[:,:,None]).reshape(2, 3, 6).any(axis=2)
array([[False, True, False],
[ True, False, True]], dtype=bool)
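A closely related variant (a sketch) compares whole pairs explicitly, so a hit requires both components of a pair to match rather than any single component:
import numpy as np

arr = np.array([[[8, 3], [4, 5], [6, 2]],
                [[9, 0], [1, 10], [7, 11]]])
check_elements = np.array([[4, 5], [9, 0], [7, 11]])

# broadcast to shape (2, 3, 3, 2), require full pair equality, then any match per cell
rslt = (check_elements == arr[:, :, None]).all(axis=-1).any(axis=-1)
print(rslt)
# [[False  True False]
#  [ True False  True]]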
The numpy_indexed package (disclaimer: I am its author) contains functionality to perform these kind of queries; specifically, containment relations for nd (sub)arrays:
import numpy_indexed as npi
flatidx = npi.indices(arr.reshape(-1, 2), check_elements)
idx = np.unravel_index(flatidx, arr.shape[:-1])
Note that the implementation is fully vectorized under the hood.
Also, note that with this approach the order of the indices in idx matches the order of check_elements; the first item in idx gives the row and column of the first item in check_elements. This information is lost with an approach along the lines you posted above, or with one of the alternative suggested answers, which instead give you the indices sorted by their order of appearance in arr, which is often undesirable.
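If you also want the boolean rslt mask from the question, one possible follow-up (a sketch reusing idx from the snippet above) is to scatter True into an all-False array:
rslt = np.zeros(arr.shape[:-1], dtype=bool)
rslt[idx] = True   # idx is the (rows, cols) tuple returned by np.unravel_index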
You can use np.in1d even though it is meant for 1D arrays by giving it a 1D view of your array, containing one element per last axis:
arr_view = arr.view((np.void, arr.dtype.itemsize*arr.shape[-1])).ravel()
check_view = check_elements.view((np.void, check_elements.dtype.itemsize*check_elements.shape[-1])).ravel()
This will give you two 1D arrays, which contain a void-type version of your 2-element arrays along the last axis. Now you can check which of the elements in arr_view are also in check_view by doing:
flatResult = np.in1d(arr_view, check_view)
This will give a flattened array, which you can then reshape to the shape of arr, dropping the last axis:
print(flatResult.reshape(arr.shape[:-1]))
which will give you the desired result:
array([[False, True, False],
[ True, False, True]], dtype=bool)
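One practical caveat worth flagging (my addition, not from the answer above): the void view requires the data to be contiguous along the last axis, so if arr came from slicing or transposing you may need np.ascontiguousarray first, roughly:
arr_c = np.ascontiguousarray(arr)
arr_view = arr_c.view((np.void, arr_c.dtype.itemsize * arr_c.shape[-1])).ravel()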
I have a (numpy) array representing a measurement curve. I am looking for the first index i following which the subsequent N elements satisfy some condition, e.g. lie within specific bounds. In pseudo-code terms, I am looking for the minimal i such that
lower_bound < measurement[i:i+N] < higher_bound
is satisfied for all elements in the range.
Of course I could do the following:
for i in xrange(len(measurement) - N):
    test_vals = measurement[i:i + N]
    if all([True if lower_bound < x < higher_bound else False for x in test_vals]):
        return i
This is extremely inefficient, as I am always comparing N values for every i.
What is the most pythonic way to achieve this? Has Numpy some built-in functionalities to find this?
EDIT:
As per request I provide some example input data
a = [1,2,3,4,5,5,6,7,8,5,4,5]
lower_bound = 3.5
upper_bound = 5.5
N = 3
should return 3 as starting at a[3] the elements are within the bounds for at least 3 values.
One NumPythonic vectorized solution would be to create sliding-window indices across the entire length of the input array measurement, stacked as a 2D array, then index into measurement with those indices to form a 2D windowed version of measurement. Next, look for bound successes in one go with np.all(..., axis=1) after the bound checks. Finally, choose the first success index as the output. The implementation would go something along these lines -
m2D = measurement[np.arange(N) + np.arange(len(measurement)-N+1)[:,None]]
np.nonzero(np.all((lower_bound < m2D) & (higher_bound > m2D),axis=1))[0][0]
Sample run -
In [1]: measurement = np.array([1,2,3,4,5,5,6,7,8,5,4,5])
...: lower_bound = 3.5
...: higher_bound = 5.5
...: N = 3
...:
In [2]: m2D = measurement[np.arange(N) + np.arange(len(measurement)-N+1)[:,None]]
In [3]: m2D # Notice that is a 2D array (shifted) version of input
Out[3]:
array([[1, 2, 3],
[2, 3, 4],
[3, 4, 5],
[4, 5, 5],
[5, 5, 6],
[5, 6, 7],
[6, 7, 8],
[7, 8, 5],
[8, 5, 4],
[5, 4, 5]])
In [4]: np.nonzero(np.all((lower_bound < m2D) & (higher_bound > m2D),axis=1))[0][0]
Out[4]: 3
If M is the length of a, here is an O(M) solution.
locations = (lower_bound < a) & (a < upper_bound)
cum = locations.cumsum()
lengths = np.roll(cum, -N) - cum == N
result = lengths.nonzero()[0][0] + 1
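A quick check of this approach on the sample data from the question (a sketch; it assumes a is a NumPy array, so the list from the question is converted first):
import numpy as np

a = np.asarray([1, 2, 3, 4, 5, 5, 6, 7, 8, 5, 4, 5])
lower_bound, upper_bound, N = 3.5, 5.5, 3

locations = (lower_bound < a) & (a < upper_bound)
cum = locations.cumsum()
lengths = np.roll(cum, -N) - cum == N   # lengths[i] is True when elements i+1 .. i+N all lie within the bounds
result = lengths.nonzero()[0][0] + 1
print(result)   # 3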
This answer could be helpful to you, although it is not specifically for numpy:
What is the best way to get the first item from an iterable matching a condition?
I have n matrices of the same size and want to see how many cells are equal to each other across all matrices. Code:
import numpy as np
a = np.array([[1,2,3],[4,5,6],[7,8,9]])
b = np.array([[5,6,7], [4,2,6], [7, 8, 9]])
c = np.array([[2,3,4],[4,5,6],[1,2,5]])
#Intuition is below but is wrong
a == b == c
How do I get Python to return a value of 2 (cells 2,1 and 2,3 match in all 3 matrices) or an array of [[False, False, False], [True, False, True], [False, False, False]]?
You can do:
(a == b) & (b==c)
[[False False False]
[ True False True]
[False False False]]
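To get the single number the question asks for (2 in this example), you could simply sum the boolean mask, e.g.:
count = ((a == b) & (b == c)).sum()
print(count)   # 2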
For n items in, say, a list like x=[a, b, c, a, b, c], one could do:
r = x[0] == x[1]
for temp in x[2:]:
    r &= x[0] == temp
The result in now in r.
If the structure is already in a 3D numpy array, one could also use:
np.amax(x,axis=2)==np.amin(x,axis=2)
The idea for the above line is that although it would be ideal to have an equality function with an axis argument, there isn't one, so this line relies on the fact that if amin == amax along the axis, then all elements along that axis are equal.
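For example, with the three arrays from the question stacked along a third axis, this reproduces the expected mask (a small sketch):
x3 = np.dstack((a, b, c))                         # shape (3, 3, 3)
print(np.amax(x3, axis=2) == np.amin(x3, axis=2))
# [[False False False]
#  [ True False  True]
#  [False False False]]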
If the different arrays to be compared aren't already in a 3D numpy array (or won't be in the future), looping the list is a fast and easy approach. Although I generally agree with avoiding Python loops for Numpy arrays, this seems like a case where it's easier and faster (see below) to use a Python loop since the loop is only along a single axis and it's easy to accumulate the comparisons in place. Here's a timing test:
from timeit import timeit

def f0(x):
    r = x[0] == x[1]
    for y in x[2:]:
        r &= x[0] == y

def f1(x):  # from @Divakar
    r = ~np.any(np.diff(np.dstack(x), axis=2), axis=2)

def f2(x):
    x = np.dstack(x)
    r = np.amax(x, axis=2) == np.amin(x, axis=2)

# speed test
for n, size, reps in ((1000, 3, 1000), (10, 1000, 100)):
    x = [np.ones((size, size)) for i in range(n)]
    print n, size, reps
    print "f0: ",
    print timeit("f0(x)", "from __main__ import x, f0, f1, f2", number=reps)
    print "f1: ",
    print timeit("f1(x)", "from __main__ import x, f0, f1, f2", number=reps)
    print "f2: ",
    print timeit("f2(x)", "from __main__ import x, f0, f1, f2", number=reps)
    print
1000 3 1000
f0: 1.14673900604 # loop
f1: 3.93413209915 # diff
f2: 3.93126702309 # min max
10 1000 100
f0: 2.42633581161 # loop
f1: 27.1066679955 # diff
f2: 25.9518558979 # min max
If arrays are already in a single 3D numpy array (eg, from using x = np.dstack(x) in the above) then modifying the above function defs appropriately and with the addition of the min==max approach gives:
def g0(x):
    r = x[:,:,0] == x[:,:,1]
    for iy in range(2, x.shape[2]):
        r &= x[:,:,0] == x[:,:,iy]

def g1(x):  # from @Divakar
    r = ~np.any(np.diff(x, axis=2), axis=2)

def g2(x):
    r = np.amax(x, axis=2) == np.amin(x, axis=2)
which yields:
1000 3 1000
g0: 3.9761030674 # loop
g1: 0.0599548816681 # diff
g2: 0.0313589572906 # min max
10 1000 100
g0: 10.7617051601 # loop
g1: 10.881870985 # diff
g2: 9.66712999344 # min max
Note also that for a list of large arrays f0 = 2.4, while for a pre-built 3D array g0, g1, g2 ~= 10, so if the input arrays are large, the fastest approach, by about 4x, is to store them separately in a list. I find this a bit surprising and guess that it might be due to cache swapping (or bad code?), but I'm not sure anyone really cares, so I'll stop here.
Concatenate along the third axis with np.dstack and take differences with np.diff, so that identical cells show up as zeros. Then check for cases where all differences are zero with ~np.any. Thus, you would have a one-liner solution like so -
~np.any(np.diff(np.dstack((a,b,c)),axis=2),axis=2)
Sample run -
In [39]: a
Out[39]:
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
In [40]: b
Out[40]:
array([[5, 6, 7],
[4, 2, 6],
[7, 8, 9]])
In [41]: c
Out[41]:
array([[2, 3, 4],
[4, 5, 6],
[1, 2, 5]])
In [42]: ~np.any(np.diff(np.dstack((a,b,c)),axis=2),axis=2)
Out[42]:
array([[False, False, False],
[ True, False, True],
[False, False, False]], dtype=bool)
Try this:
z1 = a == b
z2 = a == c
z = np.logical_and(z1,z2)
print "count:", np.sum(z)
You can do this in a single statement:
count = np.sum( np.logical_and(a == b, a == c) )
The title might be ambiguous; I didn't know how else to word it.
I have gotten fairly far with my particle simulator in Python using numpy and matplotlib. I have managed to implement Coulomb, gravity and wind, and now I just want to add temperature and pressure, but I have a pre-optimization question (root of all evil). I want to see when particles crash:
Q: Is it in numpy possible to take the difference of an array with each of its own element based on a bool condition? I want to avoid looping.
Eg: (x - any element in x) < a
Should return something like
[True, True, False, True]
If elements 0, 1 and 3 in x meet the condition.
Edit:
The loop equivalent would be:
for i in range(len(x)):
    for j in range(len(x)):
        # != not so important
        ## an earlier question I asked lets me figure that one out
        if i != j:
            if x[j] - x[i] < a:
                True
I notice numpy operations are far faster than if-tests, and this has helped me speed things up a lot.
Here is a sample code if anyone wants to play with it.
#Simple circular box simulator, part of part_sim
#Restructure to import into gravity() or coloumb () or wind() or pressure()
#Or to use all forces: sim_full()
#Note: Implement crashing as backbone to all forces
import numpy as np
import matplotlib.pyplot as plt
N = 1000 #Number of particles
R = 8000 #Radius of box
r = np.random.randint(0,R/2,2*N).reshape(N,2)
v = np.random.randint(-200,200,r.shape)
v_limit = 10000 #Speedlimit
plt.ion()
line, = plt.plot([],'o')
plt.axis([-10000,10000,-10000,10000])
while True:
    r_hit = np.sqrt(np.sum(r**2, axis=1)) > R   #Who let the dogs out, who, who?
    r_nhit = ~r_hit
    N_rhit = r_hit[r_hit].shape[0]
    r[r_hit] = r[r_hit] - 0.1*v[r_hit]           #Get the dogs back inside
    r[r_nhit] = r[r_nhit] + 0.1*v[r_nhit]

    #Dogs should turn tail before they crash!
    #---
    #---crash code here....
    #---crash end
    #---

    vmin, vmax = np.min(v), np.max(v)
    #Give the particles a random kick when they hit the wall
    v[r_hit] = -v[r_hit] + np.random.randint(vmin, vmax, (N_rhit, 2))
    #Slow down honey
    v_abs = np.abs(v) > v_limit
    #Hit the wall at too high v honey? You are getting a speed reduction
    v[v_abs] *= 0.5

    line.set_ydata(r[:,1])
    line.set_xdata(r[:,0])
    plt.draw()
I plan to add colors to the datapoints above once I figure out how...such that high velocity particles can easily be distinguished in larger boxes.
Eg: (x - any element in x) < a should return something like
[True, True, False, True]
if elements 0, 1 and 3 in x meet the condition. I notice numpy operations are far faster than if-tests and this has helped me speed things up a lot.
Yes, it's just m < a. For example:
>>> m = np.array((1, 3, 10, 5))
>>> a = 6
>>> m2 = m < a
>>> m2
array([ True, True, False, True], dtype=bool)
Now, to the question:
Q: Is it in numpy possible to take the difference of an array with each of its own element based on a bool condition? I want to avoid looping.
I'm not sure what you're asking for here, but it doesn't seem to match the example directly below it. Are you trying to, e.g., subtract 1 from each element that satisfies the predicate? In that case, you can rely on the fact that False==0 and True==1 and just subtract the boolean array:
>>> m3 = m - m2
>>> m3
array([ 0,  2, 10,  4])
From your clarification, you want the equivalent of this pseudocode loop:
for i in range(len(x)):
    for j in range(len(x)):
        # != not so important
        ## an earlier question I asked lets me figure that one out
        if i != j:
            if x[j] - x[i] < a:
                True
I think the confusion here is that this is the exact opposite of what you said: you don't want "the difference of an array with each of its own element based on a bool condition", but "a bool condition based on the difference of an array with each of its own elements". And even that only really gets you to a square matrix of len(m)*len(m) bools; the part still missing is the "any" reduction.
At any rate, you're asking for an implicit cartesian product, comparing each element of m to each element of m.
You can easily reduce this from two loops to one (or, rather, implicitly vectorize one of them, gaining the usual numpy performance benefits). For each value, create a new array by subtracting that value from each element and comparing the result with a, and then join those up:
>>> a = -2
>>> comparisons = np.array([m - x < a for x in m])
>>> flattened = np.any(comparisons, 0)
>>> flattened
array([ True, True, False, True], dtype=bool)
But you can also turn this into a simple matrix operation pretty easily. Subtracting every element of m from every other element of m is just m - m.T. (You can make the product more explicit, but the way numpy handles adding row and column vectors, it isn't necessary.) And then you just compare every element of that to the scalar a, and reduce with any, and you're done:
>>> a = -2
>>> m = np.matrix((1, 3, 10, 5))
>>> subtractions = m - m.T
>>> subtractions
matrix([[ 0, 2, 9, 4],
[-2, 0, 7, 2],
[-9, -7, 0, -5],
[-4, -2, 5, 0]])
>>> comparisons = subtractions < a
>>> comparisons
matrix([[False, False, False, False],
[False, False, False, False],
[ True, True, False, True],
[ True, False, False, False]], dtype=bool)
>>> np.any(comparisons, 0)
matrix([[ True, True, False, True]], dtype=bool)
Or, putting it all together in one line:
>>> np.any((m - m.T) < a, 0)
matrix([[ True,  True, False,  True]], dtype=bool)
If you need m to be an array rather than a matrix, you can replace the subtraction line with m - np.matrix(m).T.
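Alternatively, with plain arrays and no np.matrix at all, the same pairwise subtraction can be written by adding an axis; a small sketch:
import numpy as np

m = np.array([1, 3, 10, 5])
a = -2

subtractions = m - m[:, None]              # pairwise differences, shape (4, 4)
result = np.any(subtractions < a, axis=0)
print(result)                              # [ True  True False  True]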
For higher dimensions, you actually do need to work in arrays, because you're trying to cartesian-product a 2D array with itself to get a 4D array, and numpy doesn't do 4D matrices. So, you can't use the simple "row vector - column vector = matrix" trick. But you can do it manually:
>>> m = np.array([[1,2], [3,4]]) # 2x2
>>> m4d = m.reshape(1, 1, 2, 2) # 1x1x2x2
>>> m4d
array([[[[1, 2],
[3, 4]]]])
>>> mt4d = m4d.T # 2x2x1x1
>>> mt4d
array([[[[1]],
[[3]]],
[[[2]],
[[4]]]])
>>> subtractions = m - mt4d # 2x2x2x2
>>> subtractions
array([[[[ 0, 1],
[ 2, 3]],
[[-2, -1],
[ 0, 1]]],
[[[-1, 0],
[ 1, 2]],
[[-3, -2],
[-1, 0]]]])
And from there, the remainder is the same as before. Putting it together into one line:
>>> np.any((m - m.reshape(1, 1, 2, 2).T) < a, 0)
(If you remember my original answer, I'd somehow blanked on reshape and was doing the same thing by multiplying m by a column vector of 1s, which obviously is a much stupider way to proceed.)
One last quick thought: If your algorithm really is "the bool result of (for any element y of m, x - y < a) for each element x of m", you don't actually need "for any element y", you can just use "for the maximal element y". So you can simplify from O(N^2) to O(N):
>>> (m - m.max()) < a
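As a quick sanity check with the same m and a = -2 as above, the O(N) form reproduces the earlier mask:
>>> (m - m.max()) < a
matrix([[ True,  True, False,  True]], dtype=bool)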
Or, if a is positive, that's always true (since m - m.max() is never greater than zero), so you can simplify to O(1):
>>> np.ones(m.shape, dtype=bool)
But I'm guessing your real algorithm is actually using abs(x - y), or something more complicated, which can't be simplified in this way.